DISCLOSURE AND PREDICTIVE CODING: PYRRHO EXPLAINED FOR THE TYRO

Much has already been written about the decision of Master Matthews in Pyrrho Investments Ltd -v- MWB Property Ltd [2016] EWHC 256 (Ch) (see the links below). Here, however, I want to concentrate on the fact that the judgment itself provides a readily accessible guide to predictive coding.

WHAT IS PREDICTIVE CODING?

The Master explains it in some detail:
  1. Mr Spencer explains in his statement that the term ‘predictive coding’ is used interchangeably with ‘technology assisted review’, ‘computer assisted review’, or ‘assisted review’. It means that the review of the documents concerned is being undertaken by proprietary computer software rather than human beings. The software analyses documents and ‘scores’ them for relevance to the issues in the case. This technology saves time and reduces costs. Moreover, unlike with human review, the cost does not increase at the same rate as the number of documents to be reviewed increases. So doubling the number of documents does not double the cost.
  2. I should say, by way of footnote, that the ideas underpinning this process are not completely new. Primitive versions of this kind of process were being demonstrated to (sometimes sceptical) litigation lawyers in the mid-1980s. I was one of them. But this was before the advent of personal computers, let alone of tablets and smartphones. There was no everyday or home computer culture then, and especially not amongst English lawyers. Now computers and computer technology are much more accepted as the norm, and, crucially, the technology is vastly better, for example in terms of storage size, portability of hardware and storage media, processor speed and programming, amongst other matters. A number of computer software companies now offer predictive coding software for use by lawyers.
  3. In modern times, as I understand it, the predictive coding process runs more or less like this. First of all, the parties will settle a predictive coding protocol, setting out the process in more detail, including definition of the data set, sample size, batches, control set, reviewers, confidence level and margin of error. Then criteria (perhaps agreed, perhaps unilateral) must be decided upon for inclusion of documents in the process. Those criteria will include who had the documents (“custodians”) and the date range, but perhaps also whether the documents contained any of the keywords chosen. Certain types of documents, not having any or any sufficient text, will be excluded (they will have to be considered manually). The resulting documents are ‘cleaned up’, by removing repeated content (eg email headers or disclaimers) and words that will not be indexed (eg because not useful in assessing relevance).
  4. Then a representative sample of the ‘included’ documents is used to ‘train’ the software. In the present case, Mr Spencer suggests that it will comprise 1600-1800 documents (a size set by the size and variety of the entire document set). A person who would otherwise be making the decisions as to relevance for the whole document set (ie a lawyer involved in the litigation) considers and makes a decision for each of the documents in the sample, and each such document is categorised accordingly. It is essential that the criteria for relevance be consistently applied at this stage. So the best practice would be for a single, senior lawyer who has mastered the issues in the case to consider the whole sample. Where documents would for some reason not be good examples, they should be deselected so that the software does not use them to learn from. The software analyses all of the documents for common concepts and language used. Based on the training that the software has received, it then reviews and categorises each individual document in the whole document set as either relevant or not.
  5. The results of this categorisation exercise are then validated through a number of quality assurance exercises. These are based on statistical sampling. The sampling size will be fixed in advance depending on what confidence level and what margin of error are desired. The higher the level of confidence, and the lower the margin of error, the greater the sample must be, the longer it will take and the more it will cost. (These quality assurance exercises are clearly “additional techniques” contemplated by paragraph 27 of Practice Direction B to Part 31.)
  6. The samples selected are (blind) reviewed by a human for relevance. The software creates a report of software decisions overturned by humans. The overturns are themselves reviewed by a senior reviewer. Where the human decision is adjudged correct, it is fed back into the system for further learning. (It analyses the correctly overturned documents just as the originals were analysed.) Where not correct, the document is removed from the overturns. Where the relevance of the original document was incorrectly assessed at the first stage, that is changed and all the documents depending on it will have to be re-assessed.
  7. The process of sampling is repeated as many times as required to bring the overturns to a level within agreed tolerances, and so as to achieve a stability pattern. This is usually not less than 3, making 4 rounds in total. In his statement, Mr Spencer says that he understands that in fact it should involve review of some 8 to 12 batches of documents. The trend of overturns should be lower from round to round. Ultimately there will be a final overturn report within the agreed tolerance, so that the expense of further rounds of review will not be justified by the reduced chance of finding further errors, and the list of relevant documents can be produced.
  8. Although the number of documents that have to be manually reviewed in a predictive coding process may be high in absolute numbers, it will be only a small proportion of the total that need to be reviewed in the present case. Thus – whatever the cost per document of manual review – provided that the exercise is large enough to absorb the up-front costs of engaging a suitable technology partner, the costs overall of a predictive coding review should be considerably lower. It will be seen that, because the software has to be trained for every case, each use of the predictive coding process is bespoke for that case.
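The mechanics described in paragraphs 3 to 7 above can be made concrete with a few short sketches. Nothing in them comes from the judgment or from any particular vendor's product: the function names, field names and figures are illustrative assumptions of mine. First, the culling stage of paragraph 3, filtering by custodian, date range and keywords, and stripping repeated boilerplate such as email headers and disclaimers:

    import re

    # The field names and patterns are assumptions for illustration only.
    DISCLAIMER = re.compile(r"(?im)^this e-?mail is confidential.*$")
    EMAIL_HEADER = re.compile(r"(?im)^(from|to|cc|sent|subject):.*$")

    def cull(docs, custodians, start, end, keywords):
        """Keep documents held by the agreed custodians, within the date
        range and containing at least one keyword; strip boilerplate."""
        kept = []
        for doc in docs:  # each doc: {"custodian": ..., "date": ..., "text": ...}
            if doc["custodian"] not in custodians:
                continue
            if not (start <= doc["date"] <= end):
                continue
            text = EMAIL_HEADER.sub("", DISCLAIMER.sub("", doc["text"]))
            if any(k.lower() in text.lower() for k in keywords):
                kept.append({**doc, "text": text})
        return kept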
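The train-then-classify step in paragraph 4 corresponds, in broad outline, to fitting a text classifier on the lawyer-reviewed sample and scoring everything else. The proprietary systems the judgment contemplates are more sophisticated than this; the sketch below uses scikit-learn, which is my choice of tool, not anything named in the case:

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression

    def train_and_score(sample_texts, sample_labels, all_texts):
        """Fit on the lawyer-labelled sample (the judgment's 1,600-1,800
        documents), then score every document in the set for relevance."""
        vectoriser = TfidfVectorizer(stop_words="english")  # drops non-indexed words
        model = LogisticRegression(max_iter=1000)
        model.fit(vectoriser.fit_transform(sample_texts), sample_labels)
        # One probability of relevance per document: the Master's "score".
        return model.predict_proba(vectoriser.transform(all_texts))[:, 1]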
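The trade-off in paragraph 5 between confidence level, margin of error and sample size is the standard formula for estimating a proportion, n = z²p(1−p)/e², with p conservatively set at 0.5. The worked figures below are mine, not the court's: at 95% confidence and a 2% margin of error the sample is 2,401 documents; at 99% and 1% it rises to 16,588, which is exactly why higher confidence and a lower margin mean a longer and costlier exercise.

    import math
    from statistics import NormalDist

    def sample_size(confidence=0.95, margin=0.02, p=0.5):
        """Minimum sample size to estimate a proportion to within +/- margin
        at the given confidence level: n = z^2 * p * (1 - p) / e^2."""
        z = NormalDist().inv_cdf((1 + confidence) / 2)  # e.g. 1.96 at 95%
        return math.ceil(z * z * p * (1 - p) / margin ** 2)

    print(sample_size())            # 2401
    print(sample_size(0.99, 0.01))  # 16588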
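Finally, paragraphs 6 and 7 describe an iterative quality-control loop: sample, blind human review, measure the "overturn" rate, feed vetted corrections back, and stop once the rate falls within the agreed tolerance. In this skeleton, review_by_human, senior_confirms and retrain are hypothetical placeholders for the human and proprietary steps:

    import random

    def qa_rounds(docs, predictions, tolerance=0.05, max_rounds=12, sample_n=500):
        """Repeat sampling rounds until the overturn rate is within tolerance
        (the judgment suggests 8 to 12 batches may be needed in practice)."""
        for _ in range(max_rounds):
            sample = random.sample(range(len(docs)), sample_n)
            overturns = []
            for i in sample:
                human = review_by_human(docs[i])  # hypothetical: blind human review
                if human != predictions[i] and senior_confirms(docs[i], human):
                    overturns.append((i, human))  # vetted overturn, fed back in
            if len(overturns) / sample_n <= tolerance:
                return predictions  # stability within the agreed tolerance
            predictions = retrain(docs, overturns, predictions)  # hypothetical
        return predictions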

WHY DID PREDICTIVE CODING MATTER IN THIS CASE?

The original number of electronic files held by the second claimant alone was 17.6 million. This was reduced by a process of electronic de-duplication to 3.1 million. However, as the Master observed:

"But it is still a large and costly number to search."

WHY THE USE OF PREDICTIVE CODING WAS ALLOWED

Unsurprisingly, in this day and age, the Master was concerned with costs and proportionality. The claim was said to run into “tens of millions of pounds”. However, that does not mean that the court should disregard the overriding objective; indeed, the overriding objective plays a major part in the Master’s judgment:

  1. "In the present case, the factors in favour of approving the use of predictive coding technology in the disclosure process seemed to me to be these:
(1) Experience in other jurisdictions, whilst so far limited, has been that predictive coding software can be useful in appropriate cases.
(2) There is no evidence to show that the use of predictive coding software leads to less accurate disclosure being given than, say, manual review alone or keyword searches and manual review combined, and indeed there is some evidence (referred to in the US and Irish cases to which I referred above) to the contrary.
(3) Moreover, there will be greater consistency in using the computer to apply the approach of a senior lawyer towards the initial sample (as refined) to the whole document set, than in using dozens, perhaps hundreds, of lower-grade fee-earners, each seeking independently to apply the relevant criteria in relation to individual documents.
(4) There is nothing in the CPR or Practice Directions to prohibit the use of such software.
(5) The number of electronic documents which must be considered for relevance and possible disclosure in the present case is huge, over 3 million.
(6) The cost of manually searching these documents would be enormous, amounting to several million pounds at least. In my judgment, therefore, a full manual review of each document would be “unreasonable” within paragraph 25 of Practice Direction B to Part 31, at least where a suitable automated alternative exists at lower cost.
(7) The costs of using predictive coding software would depend on various factors, including importantly whether the number of documents is reduced by keyword searches, but the estimates given in this case vary between £181,988 plus monthly hosting costs of £15,717, to £469,049 plus monthly hosting costs of £20,820. This is obviously far less expensive than the full manual alternative, though of course there may be additional costs if manual reviews still need to be carried out when the software has done its best.
(8) The ‘value’ of the claims made in this litigation is in the tens of millions of pounds. In my judgment the estimated costs of using the software are proportionate.
(9) The trial in the present case is not until June 2017, so there would be plenty of time to consider other disclosure methods if for any reason the predictive software route turned out to be unsatisfactory.
(10) The parties have agreed on the use of the software, and also how to use it, subject only to the approval of the Court.
There were no factors of any weight pointing in the opposite direction.
  2. Accordingly, I considered that the present was a suitable case in which to use, and that it would promote the overriding objective set out in Part 1 of the CPR if I approved the use of, predictive coding software, and I therefore did so. Whether it would be right for approval to be given in other cases will, of course, depend upon the particular circumstances obtaining in them.”
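Factor (7)'s proportionality arithmetic is easy to make concrete. Only the £181,988 to £469,049 software estimates below come from the judgment; the assumed manual cost per document is my illustration of how "several million pounds at least" might arise across 3.1 million documents.

    docs = 3_100_000                  # documents remaining after de-duplication
    assumed_cost_per_doc = 2.00       # assumption: manual review cost per document (£)
    manual = docs * assumed_cost_per_doc

    software_low, software_high = 181_988, 469_049  # estimates quoted in the judgment
    print(f"manual: about £{manual:,.0f}")          # about £6,200,000
    print(f"software: £{software_low:,} to £{software_high:,} plus hosting")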

RELATED POSTS

On this case

Predictive coding generally