Should Search Engines Index Court Decisions?

by John Gregory

Comments

Colin Lachance

March 25th, 2015 at 12:04 pm

Hi John,

CanLII employs the robots.txt protocol to shield some – not all – of our databases from search engine crawling. Excluded from the protection are legislative collections and Supreme Court of Canada decisions. You can see the full list of treatment here: http://www.canlii.org/robots.txt
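For readers unfamiliar with the protocol, here is a minimal sketch of how a well-behaved crawler interprets robots.txt rules, using Python's standard `urllib.robotparser`. The rules, paths, and the `example.org` host below are hypothetical illustrations, not CanLII's actual file, and the helper name `is_crawlable` is an assumption made for this example. Compliance is voluntary: the file only asks crawlers to stay away and does not enforce anything.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; NOT the contents of any real site's robots.txt.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /en/on/
Allow: /en/ca/scc/
"""


def is_crawlable(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if a compliant crawler identifying as `agent` may fetch `url`."""
    parser = RobotFileParser()
    # parse() accepts the file's lines, so no network request is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in (
        "https://example.org/en/on/onca/doc/2015/1.html",  # under a Disallow rule
        "https://example.org/en/ca/scc/doc/2015/1.html",   # under an Allow rule
    ):
        print(url, is_crawlable(SAMPLE_ROBOTS_TXT, url))
```

A page matched by a `Disallow` rule is skipped only by crawlers that choose to honour the file, which is why the protocol limits indexing but cannot prevent copying.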

To the extent that Canadian courts are putting their decisions directly online, I believe that most, if not all, rely on the same protocol and/or use other means to limit deep indexing of content accessible through their sites.

Reliance on any efforts short of password-protected access can still fall short because once a page is copied and reposted on a different site, the search engines make the information available. So what do we do? I and others have written on Slaw and elsewhere about this. I’ll spare readers all the links, but offer this one because it provides a good round-up of these discussions and pathways to even more discussion: http://www.slaw.ca/2014/05/26/google-gonzalez-and-globe24h/

While I’m tempted to go on and on referencing more of my own past statements on this topic, I’ll limit myself to just one more: “When it comes to material originating from the courts, we have to start thinking of “the internet” as beginning the moment a judge shares a final draft of her ruling with her clerk.”

Colin Lachance, CanLII CEO

David Whelan

March 25th, 2015 at 12:15 pm

I think it’s a bad idea.

Courts that post their own opinions are often part of the normal Google results (for example, you can paste this into Google to see US 6th Circuit criminal cases site:www.ca6.uscourts.gov/opinions criminal ). Same for Canadian courts. Ontario, like CanLII, uses a robots.txt file to block searching but Manitoba doesn’t (hyra “criminal matters” manitoba).

Privacy through obscurity prevents people who do not regularly do legal research, and so do not know where to start, from finding relevant documents. The 2013 ABA Legal Technology Survey (vol. V, p. 38) found that Google was the preferred free tool for legal research for 36% of respondents. If that’s what lawyers are choosing, it seems hard to justify blocking access to opinions for members of the public, who are just as likely to use it for their legal research.

If the opinions contain sensitive information, they can be fixed by the court or the publisher (CanLII does this, I believe). If they’re documents that people, especially those who aren’t legal professionals, might need to access, it would be better for them to be easy to access rather than having to know where they are stored.

Karen Sawatzky

March 25th, 2015 at 12:36 pm

There’s still too much information publicly available in court decisions. I’m still finding full names and full birthdates, even of children, in family decisions. While it’s good that search engines can’t search those, I’m sure identity thieves have figured out which sites to troll.

Matt Earle

March 26th, 2015 at 10:56 am

My company, Reputation.ca, deals with the impacts of this on a daily basis. We help people protect their privacy online and remove damaging information. We have helped many people remove documents (which originated on CanLII) from Caselaw.Globe24h.com.

Robots.txt is not a realistic or effective solution for keeping the information off the internet. A lot of CanLII’s database was duplicated by the Romanian scraper site Globe24h. A database of court decisions can be mirrored and then exploited for advertising traffic fairly easily. It can actually be done with one Linux command and then waiting about ten minutes.

The courts, the OSC, the BCSC, the professional self-regulatory colleges and other organizations that publicize private information need to rethink their whole process. They need to understand that the publication of private information is effectively an additional punishment for the person, and they should assume it will end up in Google, ranking for the person’s name. I think if they recognized this fully, they could decide whether it was fair to publish it all and what information should be included in their decisions.

The situation right now is not fair and not right and something needs to change.

David Collier-Brown

March 29th, 2015 at 11:25 am

An interesting balancing question: should courts set publication rules for anyone employing their decisions? If so, what should they allow?

I might suggest that personal names that are part of the case name remain, but that publishers be required to exclude others. This could range from replacing the name with a black blob, simulating on-paper practice, to substituting a non-personal identifier like “party 2” if the publisher wants the material to be easily readable.

At that point, Google-style indexing becomes much less of a problem.
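The substitution described above, swapping each excluded name for a stable non-personal identifier such as “party 2”, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `pseudonymize` is hypothetical, the list of names to hide is supplied by hand, and names that form the case name are kept simply by leaving them off that list. Reliably finding every name in a real decision is a much harder problem that this sketch does not attempt.

```python
import re


def pseudonymize(text: str, names_to_hide: list[str]) -> str:
    """Replace each listed name with a stable label such as "party 1".

    Assumption: the caller supplies the names; nothing is detected automatically.
    """
    if not names_to_hide:
        return text
    # The same name always maps to the same label, so the text stays readable.
    labels = {name: f"party {i}" for i, name in enumerate(names_to_hide, start=1)}
    # Try longer names first so a short name never pre-empts a longer one.
    ordered = sorted(labels, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(name) for name in ordered) + r")\b"
    )
    return pattern.sub(lambda match: labels[match.group(0)], text)


if __name__ == "__main__":
    # Fictional names and case, for illustration only.
    sample = "Smith v. Jones: the witness Alice Roe said Bob Poe was present."
    print(pseudonymize(sample, ["Alice Roe", "Bob Poe"]))
```

Because the parties in the case name are not on the list, “Smith v. Jones” survives unchanged while the other individuals become “party 1” and “party 2”.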
