Law is a unique and important dataset: to a large degree it is a record of governance. It also tends to be conservative, so people can know what is likely to happen in the future based on what has happened in the past. Structurally, it has elements in common with other large text-based collections, such as aggregations of literary works. However, socially it has more in common with other high-stakes bodies of information like medical research, with concerns like privacy and direct impact on people’s lives being necessary considerations. These attributes combine to make law as data a strange and wonderful problem to work on, and the complexity is compounded by the way the law interacts with time.
Unlike literary works, which can be considered on a relatively linear timeline, laws change constantly, and they don’t follow predictable patterns or change in sync with one another. I remember a request I got years ago to find an 18th-century case in the English Reports about a cart wheel, which was only ever reported as a footnote to another decision. But it was being requested as something relevant to the contemporary adjudication of law. It is unlikely that anything happened in that case that was so different from any number of other similar matters that happened that day or that year. Human intervention was involved in selecting the case for inclusion in the reporter series, and some large component of randomness means that this decision is still considered relevant while the many others are not.
In contrast to this ongoing relevance, many decisions that seem important at the time quickly become irrelevant. For example, network analysis carried out in 2013 showed that the cited half-life of Ontario Court of Appeal decisions was less than five years. However, there is evidence that historical decisions are being used more over time, so this number has almost certainly lengthened since then. This may be attributed to the increased availability of older decisions in online systems as these systems continue to mature.
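To make the idea of a cited half-life concrete, here is a minimal sketch in Python. The 2013 analysis is not described in enough detail here to say how it defined the measure, so this assumes one common reading: the median age of the citations a set of decisions has received. The function name and the sample figures are invented for illustration and are not Ontario Court of Appeal data.

```python
from statistics import median


def cited_half_life(citation_pairs):
    """Return the median age, in years, of a set of citations.

    citation_pairs: iterable of (decision_year, citing_year) tuples.
    Assumption: "half-life" means the age by which half of all
    citations have occurred, i.e. the median citation age.
    """
    # Ignore impossible pairs where the citing year precedes the decision.
    ages = [citing - decided
            for decided, citing in citation_pairs
            if citing >= decided]
    if not ages:
        raise ValueError("no valid citations supplied")
    return median(ages)


# Made-up sample: (year decided, year cited).
sample = [
    (2005, 2006), (2005, 2007), (2005, 2009),
    (2008, 2009), (2008, 2012), (2001, 2013),
]

print(cited_half_life(sample))  # prints 3.0
```

If older decisions are cited more often as they become easier to find online, more large ages enter the list and the median moves upward, which is the lengthening described above.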
Not only do the laws change over time, but considerations of the time laws were passed continue into the present in ways that may, or may not, make sense. In December, I spent a few days in London, and I took the opportunity to walk around Temple Church. (In case anyone is wondering, I was informed that the number of tourists wanting to see the church because of The Da Vinci Code has dropped off significantly in recent years.)
In the church there was a display about the Magna Carta that detailed the circumstances of its creation and how it became law in 1215. One of the facts noted is that it was sealed at Runnymede. This has since entered into lore, giving rise to ongoing mention of the symbolism of the place, such as in Kipling’s “The Reeds of Runnymede.”
However, the important thing about Runnymede at the time was that it is a water meadow and the ground is too soft for cavalry. This made it safer for everyone to be there as opposed to somewhere else. If a similar agreement were signed now, it wouldn’t make sense to choose a place like Runnymede because cavalry hasn’t been important militarily since the 19th century. It’s possible that a similar event would be held there now, but it wouldn’t be chosen for the same reason.
Context is important to understanding the law because the text may not perfectly convey the detail of how it should be applied. This means that for contentious issues, research may need to be done into the context and intention associated with how laws were passed, which creates a need for books like Susan Barker and Erica Anderson’s Researching Legislative Intent: A Practical Guide.
Sophisticated legal researchers understand this, and they make sure they go to the historical documentation needed to ensure relevant context is considered when warranted. How computing applications will ever do the same thing is unclear: situations like this are unlikely to appear in training datasets because researchers are more likely to resolve them offline.
In complex legal situations that have important time elements, it is likely that human experts will continue to be necessary for full elucidation of relevant information. But that does not mean there won’t be a need to integrate considerations relating to the passage of time into data-driven systems. Law is multi-dimensional, and time is one of those dimensions.
Without fully integrating time, systems will always face limitations in how far they can be extended. This is not to say there aren’t significant and immediate opportunities to use data-based applications in simpler areas of law. However, it is likely this will continue to be an issue until governments start issuing laws in more machine-readable formats, as the ambiguity in the ways time is handled in the existing legal corpus is confusing for human readers and will likely be insurmountable for computing applications.
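As a rough illustration of what treating time as a dimension could look like in a data-driven system, the sketch below stores each version of a provision with the dates it was in force and looks up the wording that applied on a given day. It assumes clean, unambiguous commencement and repeal dates, which is exactly what the existing corpus often lacks. The class, field, and function names, as well as the sample provision, are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class ProvisionVersion:
    """One wording of a provision and the period it was in force."""
    text: str
    in_force_from: date
    # Exclusive end date; None means the version is still in force.
    in_force_until: Optional[date] = None


def version_in_force(versions, as_of):
    """Return the version in force on `as_of`, or None if there is none."""
    for version in versions:
        started = version.in_force_from <= as_of
        not_ended = (version.in_force_until is None
                     or as_of < version.in_force_until)
        if started and not_ended:
            return version
    return None


# Hypothetical history of a single provision.
history = [
    ProvisionVersion("Original wording", date(1990, 1, 1), date(2005, 7, 1)),
    ProvisionVersion("Amended wording", date(2005, 7, 1)),
]

found = version_in_force(history, date(2000, 3, 15))
print(found.text if found else "not in force")  # prints Original wording
```

Even this simple lookup depends on someone having already decided when each version started and stopped applying. Transitional provisions, retroactive amendments, and context like the reason Runnymede was chosen are not captured by it.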
Thank you to Paul Magrath at ICLR for the conversation over lunch about the history of legal London, which inspired this column.