What Does It Really Mean to “Free the Law”? Part 1

A fantastic development out of the United States last week – Harvard Law School and Ravel Law plan to make the school’s entire library of reported U.S. case law available for free on Ravel’s website. In a multi-year effort and at a cost said to be in the millions (exact details not known), some “40,000 books containing approximately forty million pages of court decisions” are being digitized and uploaded to Ravel’s platform, where anybody will be able to search, read and use the material at no cost. This is an incredible advance in open access to law and one that is being rightly praised in the U.S. and around the world.

Ravel Law, as a private entity (in which Harvard holds a 4% equity interest), has an eight-year exclusive license to commercially exploit the files, but the deal provides for early expiration on a jurisdiction-by-jurisdiction basis as each “publishes its future court decisions online in an acceptable format.” Harvard reports that Illinois and Arkansas already satisfy that condition. (If you’re curious where the other States currently stand, Berkman Center Research Fellow and Slaw contributor Sarah Glassmeyer has you covered.)

A little more background on the Harvard/Ravel deal as reported on Bob Ambrogi’s Law Sites blog:

  • It is literally every published opinion from every U.S. jurisdiction from all time
  • Text of cases will include links to images showing the original cases as they appeared in print
  • [F]or any state that makes its case law available online in an authoritative, machine-readable format, […] the historical collection for that state will become immediately available in the public domain. They are doing this, Lewis said, as a call to action for the states to put their case law online in an authoritative format.
  • […] Ravel Law will develop an API to let other developers work with the information. Ravel will license the data to commercial publishers and others who want to purchase it.
  • Researchers who are doing empirical research will be able to apply for access to the bulk data prior to the expiration of the eight years.

U.S. case law is considered public domain, but headnotes and editorial treatments are subject to copyright. Consequently, anyone hoping to scan old casebooks needs to strip out the copyrightable elements before putting the content to use. Several players have taken up the task and today at least five “retail” legal information services offer extensive fee-based case law access (e.g., Westlaw, Lexis, Fastcase, Casemaker, Bloomberg BNA) and I’m aware of at least three that offer fee-based bulk access to “wholesale” feeds of case law with collections going back decades. Throw in groups like Casetext, Justia and the Free Law Project that offer free retail or wholesale access to more current case law and you have the makings of a very competitive, open and innovative market. Moreover, you have numerous players with an incentive to support third-party development of tools and services that make new and exciting use of that content.

The Harvard/Ravel deal will push things even further by truly liberating access to the content that supports the business models of all players. When bulk access to case law is free for everybody, it gets harder for anyone to make a buck selling simple access. As the commercial value of U.S. case law begins to diminish, the incentives to build something on top of that data increase.

Exciting times ahead, but it has me thinking: whither Canada?

I’ll explore that question with a Part 2 post next week.

Comments

  1. Colin,

    Great news from Harvard, no doubt. But allow me to steer away for a second from my Canadian modesty to say that when it comes to free access to law, what makes headlines in the US has been a commodity for quite some time in Canada. It’s kind-of-like with public healthcare.

    On a more serious note, let’s give well deserved credit to Canadian law societies, Canadian courts and many involved individuals, like you, for having built, supported and evolved the most sustainable free access to law ecosystem in the world.

    Cheers,

    Ivan

  2. “It is literally every published opinion from every U.S. jurisdiction from all time.”

    That’s a sweeping statement, and I’m not sure it can withstand the harsh light of fact-checking. As we all know, for many decades West was the exclusive publisher in print of cases for many states and inferior federal courts. The print versions of these cases are all littered with West copyrighted headnotes and editorial treatments. There can be no doubt that “[s]everal players have taken up the task” of “strip[ping] out the copyrightable elements” of all these cases.

    But, do we have any conclusive evidence that these players have actually caught them all? If so, I’d like to see that evidence.

    Not that I have any axe to grind against Ravel or Harvard. I admire their efforts, but as a long-time consumer of legal information I want to be sure that their product actually comes as advertised.

  3. Hi Ivan and Bill.

    Thanks for your comments.

    Ivan – without a doubt, when Canadian access to law is stacked up against just about any nation on earth, we have more to celebrate than lament. I agree completely that when it comes to putting law on the internet, the Canadian legal establishment, along with visionary organizations like Lexum, have been true leaders. I look forward to getting into some of those details in Part 2, when I will also offer thoughts on opportunities to go further.

    Bill – it’s fair to cast a skeptical eye on the claim “It is literally every published opinion from every U.S. jurisdiction from all time”. It’s a pretty broad claim. That said, it seems to be pretty much bang-on.

    Here’s the full extract from the Law Sites article:

    “When I asked Lewis about the scope of this collection, his answer was, “Everything.” It is literally every published opinion from every U.S. jurisdiction from all time. It does not include opinions that were never published, such as many state-level trial court opinions.”

    So the primary qualifier is “published” – i.e., in a book. As you note, West Publishing had that business pretty much to itself for decades.

    The first link in my post takes you to Harvard’s description of the arrangement, which includes a discussion of the source material. Here’s how they describe it:

    Scope:
    *All official reported decisions of the federal courts
    *All official reported decisions of the courts of every state
    *All territorial and pre-statehood decisions in HLSL’s collection
    *Estimated 43,000 volumes and 40MM pages

    Whether that’s truly “everything” is for the historians to judge, but it’s probably safe to assume that it will result in a deeper digital collection than currently exists anywhere else and as close to complete as possible.

    A final thought on the idea of “everything” with a tie-in back to the pioneering work of Lexum.

    Earlier this year, the Supreme Court of Canada announced that every decision that had ever appeared in the Supreme Court Reports was now available on its site and on CanLII. Lexum, the online source of Supreme Court decisions since the pre-web days (anybody remember Gopher?), did the work and, though rightly proud of its accomplishment, noted that perhaps as many as 500 unreported SCC decisions might still find their way online someday. [See this Canadian Lawyer Magazine article for more details on the multi-year efforts of Lexum, CanLII and so many others to bring these cases online: http://www.canadianlawyermag.com/legalfeeds/2633/historical-scc-decisions-now-available-online.html]

    Colin

  4. Colin —

    Thanks for your elaboration, but I remain a Doubting Thomas. A big part of West’s success from the late 19th into the mid-20th centuries derived directly from their publishing opinions that were not officially reported by their respective jurisdictions. And then there were the jurisdictions that abandoned official reporting and publication, leaving the job entirely to West. Undoubtedly, there are many thousands of published opinions that were never officially reported, but that have nonetheless entered the canon of precedential case law through citation and adoption by courts in their jurisdictions. The only way that judges and lawyers ever had access to these cases was by finding them in the West editions.

    Now it’s entirely likely that Ravel and other parties have gone back and stripped a very large number of those cases of their West annotations, thus freeing them from copyright constraints. But every single one of them? I don’t think so. And, so far, I don’t see Ravel sharing with us the methodology they’re using to incorporate such cases into their database. Their video scrupulously shows only official reporter volumes being chopped and scanned. So where are they getting the West cases from, and how are they getting them? So far, I’ve not heard Ravel say.

    None of this should be interpreted to mean that I have any sort of axe to grind against Ravel. On the contrary, I very much admire what they’re trying to do. But if they’re going to promote their database as including every published opinion, from every U.S. jurisdiction, from all time, then they owe it to their users to give us more information about exactly how they’re going to accomplish this.

  5. Hi – I’m part of the Library Innovation Lab at the Harvard Law Library, and I manage this project. Colin’s post nails it in terms of the impact we hope this project will have, and we’re thrilled to be partnering with Ravel to make it happen.

    Here’s what we’re doing to fulfill our goals in terms of scope:

    First, our amazing librarians conducted a great deal of painstaking research to identify all known sources of reported U.S. state and federal court case law. Second, our librarians teamed up with our developer to wrangle a large amount of bibliographic data to help us identify all the sources in our collection and to assess the completeness of our collection. Third, our librarians are resolving any gaps or anomalies in our data to ensure that we aren’t missing anything. Fourth, we are systematically retrieving and digitizing everything we have that is in scope.

    I hope that’s helpful. We feel very good about the comprehensiveness of our efforts, but ultimately the data will be available for folks to test and measure, and that will be a major improvement on where things stand right now.

  6. Bill, I agree. It is impossible, practically and statistically, to claim “all case law”. Nevertheless, for quite a few years this claim has been made by many. Perhaps the claim should be: selective and the most important? This used to be the claim in the past, when it was impossible to report exhaustively instead of selectively and comprehensively. Sounds like a better solution?

  7. I think the Harvard-Ravel announcement should prove to be a positive development for ensuring that the entirety of U.S. case law is eventually freely available to the public online. I would have preferred that we got there eight years ago rather than eight years from now, but the goal needs to be achieved, however and whenever we get there.

    While I don’t like the eight-year exclusivity for one private vendor any more than anyone else, I understand it, because I’ve been trying to figure out how to get someone to care enough about this problem to fund it for at least the last five years. The very nature of the deal illustrates how difficult it is, because Harvard of all places had to find outside funding for this. The University with the largest endowment in the world needed $8M, a mere .02% (note the decimal!) of that endowment, and the powers that be weren’t willing to self-fund this effort. That tells you something.

    We at Free Law Project have been asking large foundations to take an interest in funding this effort for some time, with little to no success. Most recently we failed to advance to the semi-final round of the Knight News Challenge on Data: https://www.newschallenge.org/challenge/data/entries/the-entirety-of-united-states-case-law-online-for-the-public-for-free-at-last

    In that proposal we proposed to do what the commenter above, Bill, appears to want done: an audit. We don’t want to just say that we have all the cases, we want to prove it. But it does sound like Bill has a different view of what “all” entails. We’ve thought about this a lot, and decided that we will declare we have “all” of U.S. case law online once we can confirm that we have every case that appears in the bound volumes of a great law school library, such as Berkeley’s or Harvard’s. Bill sounds like he is interested in including non-precedential cases that have only ever been published online by West or Lexis. We’d certainly include those if anyone offered them to us, but I do not know of a case that was only ever published online that subsequently became an important precedent, as Bill suggests. Instead, my impression is that all such online-only cases are always deemed non-precedential.

    But, in another way, this is a minor quibble. Because the fact is, no legal publisher that I am aware of has ever asserted that their collection was complete, even where complete simply means “bound-volume-equivalent.” I’ve never seen such a statement from West, Lexis, or Bloomberg, and yet practicing lawyers proceed blithely on, safe in the knowledge that “No one was ever fired for buying IBM.” It is the stature of the big three (or two) that makes people deem them the “reliable” “complete” choices, even with absolutely no such assurance that this is the case. But since Free Law Project is a small, scrappy non-profit, people DO want to know how complete our collection is and so we propose to conduct the audit described in the link above (whether funded or not) and then we’ll be able to show people that we have every case that appears in the bound volumes. This will then be a first in two ways: the first time it will ever all be freely available online and the first time anyone has ever asserted that they have it “all.”

    If through such an audit, we learn of gaps in our coverage, then we hope this Harvard-Ravel effort assists us. They say that they are going to allow non-profits free access to their proposed API, which will provide us with up to 500 cases per day. If Harvard & Ravel keep this promise, then we think we will be able to complete our collection and prove its completeness.

    Even then, there will still be much to do. Having everything that ever appeared in West’s National Reporter Series will be a monumental achievement in itself. Finding more obscure specialty reporters and adding all of their unique cases will be a nice addition. Finding administrative decisions that have never been widely available outside their field will also be nice. This sort of filling in on the edges might never end, but we will more likely turn our attention to bigger fish: all the federal and state statutes. All the rules, regulations, etc. These things too should be freely available to the public online, not just in their present form, but all the historical versions. When will I have a web interface that will show me not just what Georgia law is today, but also what it was in 1932? or 1832? There is plenty more work to do.

  8. Great post Colin! I’m excited to read part 2 next week. I’m a student at Thompson Rivers and I’m currently taking a class that is exploring the question, what is Lawyering in the 21st century. In class we are constantly discussing issues that pertain to information technology and how it’s changing the profession. “Freeing Law” through comprehensive and free databases is critical and I’m so glad to hear that Harvard and Ravel Law are taking on this project. I’m curious to read part 2 and to see what your thoughts are on how Canada is stacking up in comparison to these developments. I think CanLII and CanLII Connects are great Canadian resources that bring both primary and secondary law (the latter in the form of CanLII Connects) to the internet. As a student I love these resources and would like to know what’s next in Canada.

    If you’re interested in learning more about the class I’m in here is a link to our blog: http://l21c.trubox.ca/

    Thanks!

  9. This might be not only a dumb question, but also, given this audience, an inflammatory one, but here goes: What difference does it make if 100.0% of all American case law is not included in this (really tremendous) project? If a 1793 decision by a Delaware judge applying the rule against perpetuities or an 1811 New Hampshire ruling on replevin doesn’t make it into the database, does that really affect the database’s jurisprudential utility, in practical terms?

    I suppose I can see an argument for utter perfection from an historical perspective, that we want a record of absolutely everything because we want a record of absolutely everything. But from the practical perspective of the needs of lawyers and clients, isn’t it safe to say that the Ravel/Harvard project will give us everything we should ever actually need? Flame away.

  10. Jordan – not a dumb question at all! Actually, it’s one I am currently struggling with in my current research on the publication of law by U.S. State governments.

    Pay law will get you the entire run of cases from a state (or at least we think, since one of the questions that Harvard’s project [which, full disclosure, I’m currently acting as a research fellow in the lab running the scanning project] will answer is a census of state law since we can’t estimate _within a million_ how many cases actually exist.) Most free law sites in the U.S. (Google Scholar, Justia, etc.) stop at 1950 for their state case holdings. Most state case law publication starts somewhere in the mid-1990s. I think that the latter is obviously not sufficient, but what about the former?

    Much more research needs to be done on what the actual research needs of the public are and what types of law free and open law (as well as governments) should make available. My librarian instinct tells me that nothing less than complete is sufficient, and I acknowledge that this is probably not possible.

  11. Thanks, Sarah!

    Within a million? Holy cow…