Should Search Engines Index Court Decisions?

by John Gregory

Comments

Colin Lachance

March 25th, 2015 at 12:04 pm

Hi John,

CanLII employs the robots.txt protocol to shield some – not all – of our databases from search engine crawling. Excluded from the protection are legislative collections and Supreme Court of Canada decisions. You can see how each collection is treated here: http://www.canlii.org/robots.txt
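As a rough illustration of how this protocol works from a crawler's side, here is a minimal Python sketch using the standard library's `urllib.robotparser`. The rules, the `example.org` URLs and the `ExampleBot` user agent are hypothetical placeholders, not CanLII's actual configuration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; not the contents of any real site's file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /en/on/
Allow: /en/ca/scc/
"""


def build_parser(robots_text: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A well-behaved crawler asks before fetching each URL.
blocked = parser.can_fetch("ExampleBot", "https://example.org/en/on/decision-1.html")
allowed = parser.can_fetch("ExampleBot", "https://example.org/en/ca/scc/decision-2.html")
print(blocked)  # False: the path falls under a Disallow rule
print(allowed)  # True: the path falls under an Allow rule
```

The check is voluntary: a crawler that never consults the file is not technically prevented from fetching the shielded pages.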

To the extent that Canadian courts are putting their decisions directly online, I believe that most, if not all, rely on the same protocol and/or use other means to limit deep indexing of content accessible through their sites.

Any measure short of password-protected access can still fail, because once a page is copied and reposted on a different site, the search engines make the information available. So what do we do? I and others have written on Slaw and elsewhere about this. I’ll spare readers all the links, but offer this one because it provides a good round-up of these discussions and pathways to even more discussion: http://www.slaw.ca/2014/05/26/google-gonzalez-and-globe24h/

While I’m tempted to go on and on referencing more of my own past statements on this topic, I’ll limit myself to just one more: “When it comes to material originating from the courts, we have to start thinking of ‘the internet’ as beginning the moment a judge shares a final draft of her ruling with her clerk.”

Colin Lachance, CanLII CEO

David Whelan

March 25th, 2015 at 12:15 pm

I think it’s a bad idea.

Courts that post their own opinions are often part of the normal Google results (for example, you can paste this into Google to see US 6th Circuit criminal cases site:www.ca6.uscourts.gov/opinions criminal ). Same for Canadian courts. Ontario, like CanLII, uses a robots.txt file to block searching but Manitoba doesn’t (hyra “criminal matters” manitoba).

Privacy through obscurity prevents people who do not regularly do legal research – and know where to start – from finding relevant documents. The 2013 ABA Legal Technology Survey (vol. V, p. 38) found that Google was the preferred free tool for legal research for 36% of respondents. If that’s what lawyers are choosing, it seems hard to justify blocking access to opinions for the public, who are just as likely to use it for their legal research.

If the opinions contain sensitive information, they can be fixed by the court or the publisher (CanLII does this, I believe). If they’re documents that people, especially those who aren’t legal professionals, might need to access, it would be better for them to be easy to access rather than having to know where they are stored.

Karen Sawatzky

March 25th, 2015 at 12:36 pm

There’s still too much information publicly available in court decisions. I’m still finding full names and full birthdates, even of children, in family decisions. While it’s good that search engines can’t search those, I’m sure identity thieves have figured out which sites to troll.

Matt Earle

March 26th, 2015 at 10:56 am

My company, Reputation.ca, deals with the impacts of this on a daily basis. We help people protect their privacy online and remove damaging information. We have helped many people remove documents (which originated on CanLII) from Caselaw.Globe24h.com.

Robots.txt is not a realistic or effective solution for keeping the information off the internet. A lot of CanLII’s database was duplicated by the Romanian scraper site Globe24h. A database of court decisions can be mirrored and then exploited for advertising traffic fairly easily. It can actually be done with one Linux command and then waiting about ten minutes.

The courts, the OSC, BCSC, the professional self-regulatory colleges and other organizations that publicize private information need to rethink their whole process. They need to understand that the publication of the private information is effectively an additional punishment for the person, and they should assume it will end up in Google, ranking for the person’s name. I think if they recognized this fully they could decide whether it was fair to publish it all and what information should be included in their decisions.

The situation right now is not fair and not right and something needs to change.

David Collier-Brown

March 29th, 2015 at 11:25 am

An interesting balancing question: should courts set publication rules for anyone employing their decisions? If so, what should they allow?

I might suggest that personal names that are part of the case name remain, but that publishers be required to exclude others. This could range from replacing the name with a black blob, simulating on-paper practice, to substituting a non-personal identifier like “party 2” if the publisher wants the material to be easily readable.

At that point, Google-style indexing becomes much less of a problem.
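The name-substitution idea above could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the function `pseudonymize`, the sample sentence and every name in it are hypothetical, and it assumes the publisher already has a list of the personal names appearing in a decision, which in practice is the hard part.

```python
import re


def pseudonymize(text: str, names: list[str], keep: set[str]) -> str:
    """Replace each listed name not in `keep` with a stable 'party N' label."""
    labels: dict[str, str] = {}
    for name in names:
        if name not in keep and name not in labels:
            labels[name] = f"party {len(labels) + 1}"
    # Substitute longer names first so a short name cannot clobber part of a longer one.
    for name in sorted(labels, key=len, reverse=True):
        text = re.sub(rf"\b{re.escape(name)}\b", labels[name], text)
    return text


# Hypothetical decision text; the names in the case name are kept, others are replaced.
decision = "Smith v. Jones: the witness Alex Roe testified that Pat Poe saw Smith."
redacted = pseudonymize(
    decision,
    names=["Smith", "Jones", "Alex Roe", "Pat Poe"],
    keep={"Smith", "Jones"},
)
print(redacted)
# Smith v. Jones: the witness party 1 testified that party 2 saw Smith.
```

Using the same label for every occurrence of a name keeps the text readable, since a reader can still follow who did what without learning who the person is.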
