Precydent – a Better Case Law Search Engine or Puffery?

Precydent (URL given in full at the bottom of this post) is a beta site for what its developers claim is a better search engine algorithm.

There’s a study posted on the page in which the developers set out the results of tests comparing their engine to Westlaw and Lexis. According to their tests, their engine has a better “recall” value. Recall here means the share of previously identified and tagged significant decisions in the database that each search engine returns when processing the same question.
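To make the “recall” measure concrete, here is a minimal sketch of how it is usually computed. The case labels and the two engines’ result lists are made up for illustration; nothing here comes from the developers’ study.

```python
def recall(retrieved, relevant):
    """Share of the tagged significant decisions that a search returned."""
    relevant = set(relevant)
    if not relevant:
        raise ValueError("need at least one tagged significant decision")
    return len(relevant & set(retrieved)) / len(relevant)


# Hypothetical data: four decisions tagged in advance as significant for a query.
significant = {"Case A", "Case B", "Case C", "Case D"}

# Hypothetical result lists from two engines answering the same question.
engine_one = ["Case A", "Case D", "Case X"]
engine_two = ["Case A", "Case B", "Case D", "Case Y"]

recall_one = recall(engine_one, significant)  # found 2 of the 4 tagged cases
recall_two = recall(engine_two, significant)  # found 3 of the 4 tagged cases
print(recall_one, recall_two)
```

Note that recall says nothing about how much irrelevant material comes back alongside the tagged cases; an engine that returned the whole database would score perfectly.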

The developers claim they’re bringing the ranking procedures that web search engines use to legal database searches – on the assumption that the more times something is referred to (downloaded) the more significant it is.

Transposing this to law, the assumption is that the more times a case or article is referred to, the more significant it is.
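As a rough illustration of that assumption, the simplest possible version just counts how often each case is cited and sorts by the count. This is only a sketch of the idea as described here, not the developers’ actual algorithm, and the citation pairs are invented.

```python
from collections import Counter


def rank_by_citations(citations):
    """Rank cases by how many times they are cited.

    citations: iterable of (citing_case, cited_case) pairs.
    Returns a list of (case, count), most cited first, ties broken by name.
    """
    counts = Counter(cited for _citing, cited in citations)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical citation pairs: (the case doing the citing, the case cited).
citations = [
    ("Case D", "Case A"),
    ("Case C", "Case A"),
    ("Case B", "Case A"),
    ("Case D", "Case B"),
    ("Case C", "Case B"),
    ("Case D", "Case C"),
]

ranking = rank_by_citations(citations)
print(ranking)
```

A case nobody cites (Case D in this toy data) never appears in the ranking at all, which is the cynic’s point: the method surfaces what is already popular.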

Putting it more simply, the developers seem to be saying that their algorithm is more likely to produce more of the cases that judges refer to more often. Inasmuch as, in litigation, the results more often than not depend on what the judge(s) with the last word say the law is rather than what the law actually is …

Or, GIGO, but, then, I’m cynical. (An even crasser way to put it, which I wouldn’t, of course, given my occupation, is “monkey see, monkey do”.)

Maybe it’s a typo but the home page – which explains that the current test database is limited to SCOTUS cases – says it contains “a sample of about 20,000 important U.S. Supreme Court cases” which isn’t all the important SCOTUS cases. “Only” 20,000?

Does Canada have at least 2,000 important SCC decisions? While I suppose that depends on how one defines important, I’m inclined to doubt it. If we assumed 20 discrete areas of legal inquiry, that would mean, on the average, at least 100 important SCC cases per category. We can’t claim the disadvantage of complexity due to the legal system from our other “ahem” nation. After all, the US has its own civil-law jurisdiction: Louisiana. So, even allowing for the fact the US is about 10 times bigger than Canada and the legal system has had about 100 more years to amass final resort decisions, what does more than “20,000” tell us about US society? Or, us?

http://www.precydent.com/precydent_search/sv1.do

Comments

  1. It’s not just puffery. Here is an excerpt of an email that was forwarded to me from the head litigation KM attorney at a top 100 US law firm:

    Legal technology guru Robert Ambrogi posted yesterday on law.com’s legal technology blog about what he terms an “online legal revolution.” He had earlier posted on the subject on his own blog. He discussed four new web sites that are providing either search or direct access to databases of federal appellate caselaw. What’s happened is that PublicResource.org in January 2008 obtained this caselaw through companies such as Justia.com and has made it available in a fairly easy-to-work-with database format. Three other outfits have taken this data and have turned their search engines loose on it. I initially thought that none of these would be worth a legal researcher’s time, but as I explain in my detailed reviews below, one of them definitely is, for some types of work.

    . . .

    I didn’t get a clear sense from the article of which of the searches he liked better, so I decided to figure out which one I liked best.

    My Test

    I decided to challenge all of the engines with the same search term—“in personam jurisdiction”, a legal concept that arises on occasion in the case law. The term refers to a particular court’s power to judge a controversy involving a particular person, usually a corporation or person from out of state. IPJ is restricted by statute and by federal constitutional guarantees of procedural due process, the theory being that it’s not fair to haul someone before a court in a state with which they have no connection.

    I chose IPJ because, while not a jurisdiction expert, I had studied the subject hard under U. of Mich. Law Prof. Mathias Reimann, and had applied that learning in several instances in my practice as a litigator.

    The Winner—PreCYdent.

    I was stunned by the results of my search for IPJ on PreCYdent. The top six cases were the leading U.S. Supreme Court cases I studied in Prof. Reimann’s jurisdiction class. Each of them is fundamental to an understanding of the application of personal jurisdiction in federal courts. I have never seen such a highly relevant set of search results on any electronic case search engine. Not in Westlaw. Not in Lexis. Not anywhere.

    Drilling into the cases, each clearly indicated where the page breaks were in the text. Federal appellate cases in the text of the case were hyperlinks to those cases.

    There also appear to be collaborative social features such as tagging, ranking of how relevant the case was to your search, and display of others’ search terms that also generated this result. Based on these results, those features have not been fully implemented, or used, but as a dedicated tagger I like the idea very much.

    This stellar result was neither an accident nor the product of preprogrammed “best bets.” I tested a few other terms (e.g., “abortion” “sodomy”) and had comparably excellent results. The statute search also worked quite well.

    And this is the “Alpha” release!

    Although Ambrogi talked to the leader of PreCYdent, San Diego Law Prof. Thomas Smith, and apparently discussed “proprietary algorithms,” in my opinion he didn’t properly convey either the effectiveness of this search or how the tool might really work. (In an earlier post on Ambrogi’s own blog he did point to a Youtube video from PreCYdent CTO Antonio Tomarchio.)

    A look at the PreCYdent team list and firm description page offers a clue.

    “PreCYdent search technology is able to mine the information latent in the “Web of Law”, the network of citations among legal authorities. This means it is also able to retrieve legally relevant authorities, even if the search terms do not actually occur or occur frequently in the retrieved document.”

    What this search engine must do is to track how often a particular case is cited to by other decisions. Perhaps there is some weighting given to cases and case citations from higher courts such as the U.S. Supreme Court. I would hope that more recent citations might also be given more weight. I imagine Westlaw and Lexis, which also have figured out how to automatically identify a case citation in a document and to link it with the original authority (see this post on West KM), could have done something like this, but thus far they have yet to do so.
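
The commenter’s guess about weighting can be sketched as follows. Everything in it is an assumption made for illustration: the court weights, the 20-year half-life for recency, and the sample citations are invented, and nothing here is known to match what PreCYdent actually does.

```python
from collections import defaultdict

# Assumed weights: a citation from a higher court counts for more.
COURT_WEIGHT = {"supreme": 3.0, "appellate": 2.0, "trial": 1.0}

# Assumed decay: a citation loses half its weight every 20 years.
HALF_LIFE_YEARS = 20.0


def weighted_scores(citations, current_year):
    """Score each cited case by court-weighted, recency-discounted citations.

    citations: iterable of (cited_case, citing_court_level, citing_year).
    """
    scores = defaultdict(float)
    for cited, court, year in citations:
        age = max(current_year - year, 0)
        recency = 0.5 ** (age / HALF_LIFE_YEARS)
        scores[cited] += COURT_WEIGHT[court] * recency
    return dict(scores)


# Hypothetical citations to two cases.
citations = [
    ("Case A", "supreme", 2008),
    ("Case A", "trial", 1988),
    ("Case B", "trial", 2008),
    ("Case B", "trial", 2008),
    ("Case B", "trial", 1968),
]

scores = weighted_scores(citations, current_year=2008)
print(scores)
```

In this toy data Case B is cited three times and Case A only twice, yet Case A scores higher because one of its citations is recent and comes from the top court. That is the difference between a raw citation count and the weighted approach the commenter hopes for.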