Computer algorithm predicts most Strasbourg judgments

Artificial intelligence … it’s no longer in the future. It’s with us now.

I posted a review of a book about artificial intelligence in autumn last year. The author’s argument was not that we might find ourselves, some time in the future, subservient to or even enslaved by cool-looking androids from Westworld. His thesis is more disturbing: it’s happening now, and it’s not robots. We are handing over our autonomy to a set of computer instructions called algorithms.

If you remember from my post on that book, I picked out a paragraph that should give pause to any parent urging their offspring to run the gamut of law school, training contract, pupillage and the never-never land of equity partnership or tenancy in today’s competitive legal industry. Yuval Noah Harari suggests that everything lawyers do now – from the management of company mergers and acquisitions, to deciding on intentionality in negligence or criminal cases – can and will be performed a hundred times more efficiently by computers.

Now here is proof of concept. University College London has just announced the results of the project it gave to its AI researchers, working with a team from the universities of Sheffield and Pennsylvania. Its news website announces that a machine learning algorithm has just analysed, and predicted, “the outcomes of a major international court”:

The judicial decisions of the European Court of Human Rights (ECtHR) have been predicted to 79% accuracy using an artificial intelligence (AI) method.

Nicolas Aletras, the computer scientist who led the project, reassures us that they “don’t see AI replacing judges or lawyers”. This study, he suggests, will help the legal industry that has grown up around the Strasbourg Court to identify “patterns in cases that lead to certain outcomes”. Indeed the result of the study bears out a prediction made over fifty years ago that computers would one day become able to analyse and predict the outcomes of judicial decisions (Lawlor, 1963). According to Lawlor,

reliable prediction of the activity of judges would depend on a scientific understanding of the ways that the law and the facts impact on the relevant decision-makers, i.e., the judges.

Now we have significant advances in two types of processing, Natural Language Processing (NLP) and Machine Learning (ML), and the authors argue that these provide us with the tools to automatically analyse legal materials, so as to build successful predictive models of judicial outcomes.

What is there not to like? Well, one thing is that if an algorithm can identify these patterns, why do we need people to do it when an AI machine can manage the same task a hundred times faster at a fraction of the cost?

But the question goes deeper. The researchers looked closely at applications under Articles 3 (prohibition of torture and inhuman or degrading treatment), 6 (right to a fair trial) and 8 (right to respect for private and family life). They then instructed the machine to apply a pattern to the text of the judgments. Here’s a quick reminder of how the ECtHR judgments present themselves:

  • circumstances (which is the factual/legal matrix of the individual case)
  • relevant law
  • “topics” covered in the discussion
  • Court’s assessment of the law and facts
  • judgment of the Court

As the co-author of this study explains, the machine learned how to combine the abstract “topics” (such as the right to privacy, or the incidence of negligence) with the “circumstances” section across the 584 cases it was given to chew over.

According to the project’s co-author, Dr Vasileios Lampos of UCL Computer Science, the most reliable factors for predicting the court’s decision were found to be the language used as well as the topics and circumstances mentioned in the case text. The ‘circumstances’ section of the text includes information about the factual background to the case.
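To make the idea of predicting an outcome from the wording of a case text concrete, here is a minimal sketch in Python. It is not the researchers' model: the four training snippets are invented for illustration, the names `ngram_features`, `train_perceptron` and `predict_violation` are hypothetical, and a simple perceptron stands in for whatever classifier the study actually used. It counts single words and word pairs in each text and learns a weight for each of them.

```python
import re
from collections import Counter

# Invented toy snippets, NOT real ECtHR text. True = violation found.
TRAINING_CASES = [
    ("applicant detained without judicial review for months", True),
    ("applicant beaten in police custody no investigation", True),
    ("applicant received prompt hearing before independent court", False),
    ("complaint examined promptly by domestic courts with reasons", False),
]

def ngram_features(text, max_n=2):
    """Count word n-grams (single words and word pairs by default)."""
    words = re.findall(r"[a-z]+", text.lower())
    feats = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(words) - n + 1):
            feats[" ".join(words[i:i + n])] += 1
    return feats

def score(weights, bias, feats):
    """Linear score: bias plus the weighted sum of n-gram counts."""
    return bias + sum(weights.get(f, 0.0) * c for f, c in feats.items())

def train_perceptron(examples, epochs=10):
    """Learn one weight per n-gram, updating only on misclassified texts."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for text, label in examples:
            feats = ngram_features(text)
            y = 1 if label else -1
            if y * score(weights, bias, feats) <= 0:
                for f, c in feats.items():
                    weights[f] = weights.get(f, 0.0) + y * c
                bias += y
    return weights, bias

def predict_violation(model, text):
    """Return True if the learned linear score is positive."""
    weights, bias = model
    return score(weights, bias, ngram_features(text)) > 0

model = train_perceptron(TRAINING_CASES)
print(predict_violation(model, "applicant detained for months without investigation"))
```

With only four invented examples this proves nothing about real cases; it simply shows how the words and word pairs in a "circumstances" text can be turned into numbers that a program weighs for or against a finding of violation.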

Previous studies have predicted outcomes based on the nature of the crime, or the policy position of each judge, so this is the first time judgements have been predicted using analysis of text prepared by the court.

In fact, in this instance the research team were hobbled by data protection and privacy laws: they were not allowed to look at the applications that were actually submitted to the court. All they had to go on were the published judgments.

In other words, the AI machine achieved a staggeringly high prediction level of judicial outcomes, with hardly any data to go on. The text in the judgments could be seen as a proxy for the applications actually lodged at the Court by individuals. The authors point out that at the very least, their work could be approached on the following hypothetical basis:

if there is enough similarity between the chunks of text of published judgments that we analyzed and that of lodged applications and briefs, then our approach can be fruitfully used to predict outcomes with these other kinds of texts.

There could be sufficient similarity, simply because in the vast majority of cases, parties do not tend to dispute the facts themselves, as contained in the ‘Circumstances’ subsection, but only their legal significance (i.e., whether a violation took place or not, given those facts). If the research team had been given access to the actual complaints submitted to the court, the calculations as to the outcome might well have been closer to perfect.
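One rough way to put a number on "enough similarity" between two texts is cosine similarity over word counts. The sketch below is only an assumption about how such a comparison could be made, not something the paper describes; the two sample sentences and the helper names `word_counts` and `cosine_similarity` are invented for illustration.

```python
import math
import re
from collections import Counter

def word_counts(text):
    """Lower-case the text and count each word."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine_similarity(text_a, text_b):
    """Cosine of the angle between two word-count vectors (0.0 to 1.0)."""
    a, b = word_counts(text_a), word_counts(text_b)
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text shares nothing with anything
    return dot / (norm_a * norm_b)

# Invented examples: a judgment-style sentence and an application-style one.
judgment_text = "the applicant was detained for three months without review"
application_text = "I was detained for three months without any review"
print(round(cosine_similarity(judgment_text, application_text), 2))
```

A score near 1 means the two texts use much the same vocabulary; a score near 0 means they share almost none. Whether judgments and lodged applications really score highly against each other is exactly what the researchers could not test.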

In the abstract to the full paper, the authors reflect that their empirical analysis

indicates that the formal facts of a case are the most important predictive factor. This is consistent with the theory of legal realism suggesting that judicial decision-making is significantly affected by the stimulus of the facts.

and in the body of the paper, they observe that

The consistently more robust predictive accuracy of the ‘Circumstances’ subsection suggests a strong correlation between the facts of a case, as these are formulated by the Court in this subsection, and the decisions made by judges. The relatively lower predictive accuracy of the ‘Law’ subsection could also be an indicator of the fact that legal reasons and arguments of a case have a weaker correlation with decisions made by the Court.

In other words, it is facts, rather than the law, that are predictive of the judicial outcome. If a computer can work this out, one might be forgiven for wondering whether cases should wind their labyrinthine way through lawyers’ wet brains to a panel of judges at the other end.

But that is a place beyond the dark horizon. All the UCL team was doing was running a controlled experiment on one court, in a familiar legal environment, using tools that could readily latch on to listed arguments (all those associated with Articles 2–14 of the ECHR and its relevant protocols).

Soon we might expect this sort of tool to provide every service from in-house legal advice to final adjudication. Why not?


17 thoughts on “Computer algorithm predicts most Strasbourg judgments”

  1. Marching 79% implies 21% error or injustice one time in five. Whether it was the algorithm or the Court was in error is another matter, One hopes the results were not terminal.

  2. My post should have read “Matching 79% implies 21% error or injustice one time in five. Whether it was the algorithm or the Court that was in error is another matter. One hopes the results were not terminal.”

  3. I’m not sure those who use courts will really want to have computers sitting in judgment over them. Humans are flawed, emotion-driven; they get tired, they make mistakes, they have biases both conscious and unconscious. But surely that is why the vast majority of people want human judges and juries? Surely it would have been quite easy years ago to implement an automatic sentencing calculus in criminal cases, but it hasn’t been done yet because computers don’t make exceptions or have mercy or take the zeitgeist into account or correct wrong turnings in the law. A lot of people, when asked what “British” rights are, cite the right to jury trial. I’m not sure if what people want will have a huge bearing, but surely it must factor in somewhere. The law courts are such a human institution.
    Also, surely the same thing applies to the stock market and mergers and acquisitions and the capital markets. When you think about it, it is a huge game with complicated rules. What would be the point of it if machines were doing it?

  4. I really wonder why 79% accuracy is seen as a success. You already get 50% by flipping a coin, and 79% also means that more than one case in five was predicted wrongly.

  5. This is interesting although slightly mad.

    The biggest caveat, identified in the paper, is that the program has not been able to “predict” decisions at all. It has simply managed to extrapolate the outcome from the way in which the case is presented in judgments. Most judgments (although more so in the UK than in the ECHR I expect) do give clues in the factual summary as to whether the judge thinks the case has any merit.

    The challenge will be seeing whether the programme has any predictive power when looking at actual applications.

  6. This suggests that 21 per cent of ECtHR decisions are wrong, does it not?
    Perhaps we should dismiss the judges NOW before they commit any more mistakes?
    It might also mean that prior assessment of outcomes could mean reduced numbers of cases?
    Willkommen to the Brave New World.

  7. It’s worth reading this paper *very* closely. There are a lot of caveats in it, and even reasons to be wary about the methodology.

  8. It would be interesting to break the 79% accuracy down into more detail, e.g. the proportion of false positives and the proportion of false negatives. The importance of each of these varies according to where an algorithm is used. A surgeon could afford very few false positives (using a method predicted to work that does not work when he or she applies it). But a share trader could afford a very high percentage of false positives, because they can expect to recoup their losses in future trades.

  9. Early in the article: Nicolas Aletras, the computer scientist who led the project, reassures us that they “don’t see AI replacing judges or lawyers”.
    Conclusion at the end: Soon we might expect this sort of tool to provide every service from in-house legal advice to final adjudication. Why not?

    If my students ever wrote such an article, where their conclusion contradicts the one real expert quoted, they would not get a passing grade for that assignment.

    • When there is a mismatch of one in five, how do we know which is ‘correct’? It may be that both algorithm and the Court are wrong – perhaps in more than one-fifth of cases. If both Court and algorithm are equally likely to be wrong, why not use the cheaper option? On the other hand, in arbitration, one chooses the tribunal and has only oneself to blame!

  10. The good news is that Strasbourg is consistent, especially, as you say, because content is more important than law in predicting the algorithm’s conclusions. Would the algorithm notice the hearsay inconsistency that UKHRB noted (30/09/2016) when Price v. UK, ECtHR 15602/07 followed Seton v. UK 55287/10, both contrasting with Al Khawaja ECHR 2127?

  11. OK – you want to walk with your head in the clouds or exercise the grey matter with speculation and more on AI. How, though, if I may ask, is that helping me, as an unrepresented whistleblower who has to bring a case to Strasbourg, after one judge in the UK usefully ignored that the main witness for the defendant NHS Trust had no answers while their barrister said he had no defence? And another judge (Court of Appeal) seems to me to have taken all of 10 minutes to find my appeal had no merit? I’d be interested to hear, too, how you think AI would predict the outcome if I get to Strasbourg, btw.

  12. In a legal system based on precedent, surely a high proportion of judgements should be predictable? So it is unsurprising that AI should score highly. But human input is needed in the exceptional case, where the need is to perceive the injustice of following precedent and to distinguish a subset of circumstances where a new principle is needed. It would be very surprising if AI were ever able to do that.

