AI & Law: Open Court Records And AI

More open access to court records is needed, including for boosting AI in the law

by Dr. Lance B. Eliot

For a free podcast of this article, visit this link https://ai-law.libsyn.com/website or find our AI & Law podcast series on Spotify, iTunes, iHeartRadio, plus on other audio services. For the latest trends about AI & Law, visit our website www.ai-law.legal

Key briefing points about this article:

  • There are strident calls for greater open access to court records
  • Doing so will provide vital data needed for holistic policy analyses of justice
  • Another important use would aid in bolstering the advent of AI in the law
  • AI that utilizes Machine Learning requires extensive data to be suitably trained
  • Via plentiful access to court data the efforts to devise AI Legal Reasoning would be aided

Introduction

Court records in the United States are not as readily accessible as one might assume. In a sense, an abundance of friction stands in the way of genuinely open court records. As a result, opportunities to better understand our system of justice and to proffer modifications or enhancements are presumably being stymied.

There has been an increasingly widespread call for enabling greater open access to court records of the US justice system. Despite court records generally being considered within the public domain, it is a bridge too far to presume that those archives are sufficiently available. The rub, according to those making such a clamor, is that excessive financial and technical obstacles stand in the way of readily getting access to the vast corpus (see Science, July 10, 2020, on “How to build a more open justice system”).

Those expressing these concerns point to the onerous fact that accessing federal court records online typically costs about ten cents per page. Though ten cents seems like a minuscule amount, if you were to multiply that dime by the likely hundreds or thousands of pages in a protracted case, and then multiply that by hundreds of thousands of cases nationwide, the final tab to obtain any large-scale set of court records is undoubtedly sky-high.
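
To make that multiplication concrete, here is a minimal sketch in Python. Only the roughly ten-cents-per-page rate comes from the discussion above; the case counts and page counts are purely illustrative assumptions rather than actual figures, and the function names are hypothetical.

```python
def bulk_access_cost_cents(cases: int, pages_per_case: int, cents_per_page: int = 10) -> int:
    """Return the total fee, in cents, for retrieving every page of every case."""
    return cases * pages_per_case * cents_per_page


def format_dollars(cents: int) -> str:
    """Render a whole number of cents as a dollar string such as $1,234.50."""
    return f"${cents // 100:,}.{cents % 100:02d}"


# Hypothetical scenarios: (number of cases, assumed average pages per case).
scenarios = {
    "one protracted case": (1, 2_000),
    "modest research sample": (10_000, 200),
    "large nationwide sample": (200_000, 500),
}

for label, (cases, pages) in scenarios.items():
    total = bulk_access_cost_cents(cases, pages)
    print(f"{label}: {format_dollars(total)}")
```

Under these assumed inputs, a single long case costs a few hundred dollars, while the largest scenario runs to ten million dollars, which is the kind of tab that puts large-scale research out of reach.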

Yet another significant hurdle is that court records often come in varying formats and are structured differently, meaning that even if a sizable set is purchased, trying to rationalize and align the obtained data can be burdensome and add to the cost of sifting through them.

Critics of this status quo are apt to vehemently argue that the barriers to accessing court records woefully undercut the availability of vital data needed for holistic policy analyses of justice in America. Rectifying this bleak situation, they contend, would improve the practice of law and further enhance the overall administration of justice.

Attorneys using any of the numerous commercial legal services that have already procured and downloaded various court records might contend that there is no need to seek out the records directly from the federal government, since you can instead get those records from the paid services. Though this path might to some degree ameliorate the formatting issues, it typically does not resolve the cost-related barriers.

There are encouraging signs of open source alternatives that tend to use crowdsourcing as a means of establishing databases of court records that are then made essentially free to access, supported at times by generous donors or sometimes via ad-based sponsors. Those datasets, however, are not necessarily considered to be at a large enough scale, and their sustainability is far from assured as those entities continually struggle to keep their aspirational projects afloat.

The oft-cited solution to the entire matter involves getting Congress to repeal the laws that authorize the judiciary to charge fees for access to court records.

Be forewarned, a tide of controversy muddies this straightforward proposal and centers on the root of most issues, the money involved. It turns out that those court record access fees are relatively substantial when totaled up. Estimates suggest that on the order of $145 million of the federal judiciary budget was derived from these fees in fiscal year 2019 alone, and so the question immediately arises as to what new funding would plug the gap once those court records are made available for free.

Usefulness of Court Records

Shifting for the moment beyond the matter of how to assure that court records are more openly accessible, you might be wondering what would be done with those court records if they were indeed freely available.

It is assumed and hoped that a vast corpus of court records that is frictionless to access would give researchers and scholars an increased opportunity to analyze aspects of the justice system and the nature of our courts in ways that heretofore have been prohibitively costly to undertake. In addition, journalists would potentially be bolstered in their reporting on court trends throughout the United States, and the general public would potentially become more readily engaged in what our courts are doing.

There is another beneficial possibility that does not yet meet the eye but will gradually have an increasing bearing on this topic, namely the advent of AI LegalTech.

Data is considered the lifeblood of AI, and without it, AI can neither function nor flourish.

A brief explanation might help in showcasing why data is so crucial for enabling AI LegalTech.

One of the most visible and commonly applied uses of AI consists of Natural Language Processing (NLP), which we all generally experience daily via uses of Alexa, Siri, and other NLP-enabled apps. Perhaps you’ve recently used an e-Discovery software package that had NLP added to it or performed an online query of a contracts database using a modern NLP capability.

AI insiders know that a significant booster to NLP has been the infusion of Machine Learning (ML). Machine Learning consists of computer-based pattern matching and underlies many of the latest advances in AI. There is no magic involved, which regrettably sometimes is implied by those hyping ML, and it is wise instead to think of Machine Learning as a statistical method on steroids (kind of like multiple regression that you might have learned while in college, but more advanced). Machine Learning is a hidden element that bolsters NLP and sits inside many other AI applications too.

Leveraging Data For AI Legal Reasoning

This brings us now to how hindered access to court records relates to AI LegalTech.

When formulating an AI Machine Learning application, there is customarily a need to have data, lots of data, since the ML must be trained, and a crucial means of doing so consists of feeding in immense masses of relevant data. Consider again the nature of multiple regression as an illustration: if you have too little data to feed into a regression analysis, you are on the thinnest of ice when making any bold assertions about what you have discovered. There is just not enough data there to reach valid conclusions.
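
The point about thin data can be illustrated with a small simulation. This is a minimal sketch that uses a simple one-variable regression as a deliberately simpler stand-in for multiple regression, run on synthetic data rather than court records. The true slope, noise level, sample sizes, and function names are all assumptions chosen for illustration.

```python
import random
import statistics


def fit_slope(xs, ys):
    """Ordinary least-squares slope for a one-variable regression of ys on xs."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def slope_spread(sample_size, trials=200, true_slope=2.0, noise_sd=5.0, seed=0):
    """Standard deviation of the fitted slope across repeated synthetic samples.

    A large spread means any single fit at this sample size is unreliable.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    slopes = []
    for _ in range(trials):
        xs = [rng.uniform(0, 10) for _ in range(sample_size)]
        ys = [true_slope * x + rng.gauss(0, noise_sd) for x in xs]
        slopes.append(fit_slope(xs, ys))
    return statistics.stdev(slopes)


for n in (10, 100, 1000):
    print(f"sample size {n}: spread of fitted slope = {slope_spread(n):.3f}")
```

The spread of the fitted slope should shrink as the sample size grows, which is the statistical version of the claim that more data supports firmer conclusions. Modern Machine Learning models have far more parameters than a single slope, so their appetite for data is correspondingly larger.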

If we are going to have AI LegalTech gradually become utilized as a kind of sophisticated sidearm legal advisor, the odds are that Machine Learning is going to be a crucial path to that aim, and, in turn, Machine Learning needs mega-scale sets of legal data for training purposes. Thus, besides the other overall reasons to seek a vast and freely available corpus of court records, another notable justification involves the ongoing ambitions of applying AI to the law.

Conclusion

Admittedly, this matter is a double-edged sword: if such data is improperly used for enabling untoward AI (or misused in other ways that have nothing to do with AI), the resultant legal outputs would be dubious and suspect.

Nonetheless, it would seem ill-advised to suggest that keeping such vital data at bay is warranted, and the upside argument for the hoped-for benefits seems to convincingly outweigh the possibility of any misuse.

To follow Dr. Lance Eliot (@LanceEliot) on Twitter use: https://twitter.com/LanceEliot

Copyright © 2020 Dr. Lance Eliot. All Rights Reserved.

Written by

Dr. Lance B. Eliot is a renowned global expert on AI, Stanford Fellow at Stanford University, was a professor at USC, headed an AI Lab, top exec at a major VC.