Good Old Hyperlinks

By the time I was figuring out my stance on artificial intelligence, the legal tech talk had already moved on to blockchain. So I decided to write about something even more outdated – hyperlinks.

Links are the backbone of (legal) information systems

In research information systems we use citation information heavily. In the legal field, the cited-citing connections allow us to group documents into smaller universes. This feature of legal information is being exploited by virtually all providers for use in hyperlinking, to create citators, to provide the ability to note-up documents, and to rank search results. Citations can be used to create a visual presentation of legal content (Fastcase, Justis), as well as to offer some really impressive predictive tools (Casetext’s CARA). In the scientific context more broadly, Google Scholar uses citations to rank and link content, while PageRank determines how we find the world’s information based on a similar principle.

How we (Lexum) make it work

The linking function is essential to our work at Lexum. In CanLII, you find and use hyperlinks to statutes, regulations, sections, subsections and cases. You can note-up all these elements. As you may know or assume, software reads the text, and when it sees “Extradition Act, SC 1999, c 18” or “[2000] B.C.J. No. 1012 (QL)”, it inserts a hyperlink to that act or case.
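To make the idea concrete, here is a minimal sketch of pattern-based linking. It is not Lexum's implementation: the two patterns cover only federal "SC" statute citations and neutral case citations, and the URL scheme (`/laws/...`, `/cases/...`) and function names are assumptions made up for illustration.

```python
import re

# Illustrative patterns only; real citation formats are far more varied.
CITATION_RE = re.compile(
    # Statute: capitalised title ending in "Act", then "SC 1999, c 18" or "S.C. 1999, c. 18".
    r"(?P<statute>(?:[A-Z][\w']* )+Act, S\.?C\.? (?P<year>\d{4}), c\.? (?P<chapter>\d+))"
    # Neutral case citation: year, court code, sequence number, e.g. "2006 QCCA 413".
    r"|(?P<case>\b(?P<cyear>\d{4}) (?P<court>[A-Z]{2,6}) (?P<num>\d+)\b)"
)


def target(match):
    """Build a link target; the URL layout is hypothetical."""
    if match.group("statute"):
        return f"/laws/sc-{match.group('year')}-c-{match.group('chapter')}"
    return f"/cases/{match.group('court').lower()}/{match.group('cyear')}/{match.group('num')}"


def link_citations(text):
    """Wrap every recognised citation in an HTML anchor, in a single pass."""
    return CITATION_RE.sub(
        lambda m: f'<a href="{target(m)}">{m.group(0)}</a>', text
    )


print(link_citations("See the Extradition Act, S.C. 1999, c. 18 and 2006 QCCA 413."))
```

A single combined pattern is used so that a later substitution can never match inside a link inserted by an earlier one. Proprietary citations such as “[2000] B.C.J. No. 1012 (QL)” cannot be resolved by pattern alone; they need a lookup table or inference.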

In statutes

In practice, things are a little fuzzier. Imagine a case with the following three paragraphs:

[12] … charge of violating a municipal fencing by-law on the basis that he was constitutionally exempt, under the Canadian Charter of Rights and …

[24] … facts found by the trial judge on his appreciation of the evidence, was sec. 7 of the Charter applicable …

[55] … in the by-law violate the security rights guaranteed under sec. 7 by reason …

The software needs to make the following assumptions:

In [12]: Canadian Charter of Rights = Canadian Charter of Rights and Freedoms

In [24]: Charter = Canadian Charter of Rights and Freedoms

In [55]: sec. 7 = section 7 of the Canadian Charter of Rights and Freedoms
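These three assumptions can be sketched as a small resolver. This is an illustration rather than the production logic: the alias table, the regular expressions and the rule "a section reference belongs to the statute mentioned last" are all simplifying assumptions.

```python
import re

CHARTER = "Canadian Charter of Rights and Freedoms"

# Hypothetical alias table: surface form -> canonical title.
ALIASES = {
    "Canadian Charter of Rights and Freedoms": CHARTER,
    "Canadian Charter of Rights": CHARTER,
    "Charter": CHARTER,
}

# Longest alias first, so "Canadian Charter of Rights" wins over plain "Charter".
ALIAS_RE = re.compile(
    "|".join(re.escape(a) for a in sorted(ALIASES, key=len, reverse=True))
)
# Matches "s. 7", "sec. 7", "Sec. 7" and "section 7".
SECTION_RE = re.compile(r"\b[Ss](?:ec(?:tion)?)?\.? (\d+)")


def resolve(paragraphs):
    """Return (paragraph number, section, statute) for each section reference.

    A section is attributed to the last statute named in the same paragraph,
    or failing that, to the last statute named in an earlier paragraph.
    """
    current = None
    resolved = []
    for number, text in paragraphs:
        mentions = [ALIASES[m.group(0)] for m in ALIAS_RE.finditer(text)]
        if mentions:
            current = mentions[-1]
        for m in SECTION_RE.finditer(text):
            resolved.append((number, m.group(1), current))
    return resolved


paragraphs = [
    (12, "constitutionally exempt, under the Canadian Charter of Rights and"),
    (24, "was sec. 7 of the Charter applicable"),
    (55, "the security rights guaranteed under sec. 7 by reason"),
]
for row in resolve(paragraphs):
    print(row)
```

The "last statute mentioned" rule is exactly what breaks down once a decision discusses several statutes, which is why context alone is not enough.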

Even the Charter example gives a fairly simplistic view of the different ways we refer to statutes: in many cases, multiple statutes are referenced and the references are more obscure.

It is often the case that multiple acts are mentioned in a case and then, further down in the text, a reference is made to one of them simply as “the Act.” Another common situation is a comparative analysis of provincial acts that all deal with the same matter. Take for example the following statutes: British Columbia’s Workers Compensation Act, RSBC 1996, c 492; Alberta’s Workers’ Compensation Act, RSA 2000, c W-15; and Nova Scotia’s Workers’ Compensation Act, SNS 1994-95, c 10.

If the decision considers all three statutes but then further down the text contains “Workers Compensation Act” or “the Act,” which one is the judge referring to? How can we infer the relevant act for hyperlinking purposes? To sort these out, and many more, we use heuristics based on various contextual and metadata reference points.
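One such heuristic might look like the sketch below. It assumes two reference points that are not spelled out above: the jurisdiction of the deciding court (metadata) and the position of each full citation in the text (context). The `Mention` type, the function name and the offsets are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    citation: str      # full citation as found in the decision
    jurisdiction: str  # e.g. "BC", "AB", "NS"
    position: int      # character offset of the full citation


def resolve_the_act(mentions, reference_position, court_jurisdiction):
    """Guess which statute a bare "the Act" refers to.

    Rule 1 (metadata): prefer statutes from the court's own jurisdiction.
    Rule 2 (context): among the remaining candidates, take the one cited
    most recently before the reference. Returns None if nothing precedes it.
    """
    earlier = [m for m in mentions if m.position < reference_position]
    if not earlier:
        return None
    local = [m for m in earlier if m.jurisdiction == court_jurisdiction]
    return max(local or earlier, key=lambda m: m.position)


mentions = [
    Mention("Workers Compensation Act, RSBC 1996, c 492", "BC", 100),
    Mention("Workers' Compensation Act, RSA 2000, c W-15", "AB", 200),
    Mention("Workers' Compensation Act, SNS 1994-95, c 10", "NS", 300),
]
# A British Columbia court writing "the Act" further down the decision:
print(resolve_the_act(mentions, 900, "BC").citation)
```

Putting the jurisdiction rule ahead of the recency rule is a design choice, not a given: a comparative passage devoted to the Alberta statute would defeat it, which is why several reference points usually have to be weighed together.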

In cases

For cases, of course, we don’t have access to other publishers’ citators, so we have to infer that “J.E. 2006‑716”, “SOQUIJ AZ‑50363026”, and “[2006] Q.J. No. 2519 (QL)” are actually all references to 2006 QCCA 413 (CanLII).

Learning programs used on CanLII infer a connection between the citations that appear together in a block of parallel citations, and heuristics are required to inhibit conflicting associations between parallel citations.
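As a rough illustration of the idea, and not of the learning programs themselves, the sketch below groups citations that co-occur in a block using a union-find structure. The conflict rule is an assumption: a merge is refused if it would join two groups already tied to different known decisions. The identifier "OTHER-CASE-ID" is a made-up placeholder.

```python
def group_parallel(blocks, known):
    """Map each citation to a decision ID, or None if it cannot be resolved.

    blocks: lists of citation strings observed together in one block.
    known:  citation -> decision ID for citations already resolved.
    """
    parent = {}
    target = {}  # group root -> decision ID

    def find(c):
        parent.setdefault(c, c)
        while parent[c] != c:
            parent[c] = parent[parent[c]]  # path halving
            c = parent[c]
        return c

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra == rb:
            return True
        ta, tb = target.get(ra), target.get(rb)
        if ta and tb and ta != tb:
            return False  # conflicting association: keep the groups apart
        parent[rb] = ra
        if ta or tb:
            target[ra] = ta or tb
        return True

    for citation, decision_id in known.items():
        target[find(citation)] = decision_id
    for block in blocks:
        for citation in block[1:]:
            union(block[0], citation)
    return {c: target.get(find(c)) for c in parent}


links = group_parallel(
    blocks=[
        ["2006 QCCA 413", "J.E. 2006-716", "SOQUIJ AZ-50363026"],
        ["J.E. 2006-716", "[2006] Q.J. No. 2519 (QL)"],
        ["[2006] Q.J. No. 2519 (QL)", "OTHER-CASE-ID"],  # noisy block
    ],
    known={"2006 QCCA 413": "2006 QCCA 413", "OTHER-CASE-ID": "OTHER-CASE-ID"},
)
print(links["[2006] Q.J. No. 2519 (QL)"])
```

The QL citation never appears next to the neutral citation, yet it is linked to it through the shared “J.E.” citation, while the noisy third block is ignored instead of collapsing two decisions into one.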

In practice, we have been using heuristic programming for a while and it works pretty well.

One step further

But what if there is no mention whatsoever of the applicable act or case?

Consider the following situation:

Case 1: While the accused was incarcerated, he had a telephone conversation with his ex‑girlfriend during which he repeatedly told her that he would kill her upon his release if she proceeded with her planned abortion of their child. The accused was charged with uttering threats.

Case 2: The accused was charged with three counts of threatening to cause serious bodily harm. He had written anonymous letters to three football cheerleaders graphically detailing various sexual acts which he wished to perform upon them and concluded each with a threat that he would have sexual intercourse with them “even if I have to rape you”.

Case 3: The respondent [… ] was tried on a single charge of uttering a death threat […]. Essentially, the Crown alleged that, on November 7, 2012, while he was incarcerated at the Toronto Don Jail, the respondent threatened to kill a correctional officer named Jason Groeneveld, who was employed in that facility.

In the absence of an explicit reference to s. 264.1(1) of the Criminal Code concerning “uttering threats,” can we predict that the cases above actually deal with this section of the Criminal Code? Such a prediction would be helpful where the relevant documents have not been cited, for whatever reason.

And another step further

Taking the possibilities for linked content in legal information systems further, what if there is no legal qualification of the facts? That is to say, what if the text omits altogether the key legal terms that are essential to the issue?

If the texts above didn’t contain some reference to the words “uttering threats” – Case 1 “The accused was charged with uttering threats”; Case 2 “The accused was charged with three counts of threatening to cause serious bodily harm”; and Case 3 “The respondent was tried on a single charge of uttering a death threat” – could we still predict s. 264.1(1) of the Criminal Code from purely factual descriptions? Achieving this level of analysis would be even more interesting, since it could apply to descriptions of facts provided by a layperson.

Providing further legal qualification and a map of the relevant sources of law from the expression of a set of facts, legally qualified or not, is one of our current research and development goals in relation to AI.

Lexum is well placed to develop these next steps in legal information, analysis, linking and AI. It can combine the knowledge of computer engineers with a deep understanding of the specifics of legal texts, a large set of legal decisions already available in parsable format (CanLII holds over 2 million decisions), and direct access to the knowledge and experience of one of the leading AI teams worldwide. With these, Lexum intends to improve once again the way legal information is made accessible in Canada.

This is essentially what Lexum is currently working on in an alliance with MILA (Montreal Institute for Learning Algorithms). We will keep you posted.


  1. Interesting stuff sir! And short of being 100% sure as to which legal document the judge was referring to, one can always insert a hyperlink to the most likely choices, and then keep track of which one the end user generally picks and clicks on. Even better if you store that info to leverage it later so as to make the service smarter.

  2. Merci Éric! Very interesting idea, kind of like a learning-to-link approach similar to the well-known learning to rank.