Rethinking the Way a Court Formats and Publishes Its Judgments
If you could change the way a court formatted or published its judgments, what changes would you recommend? XML? Typography? Are there any courts whose judgments you think are better (looking) than the rest? Or are there any ongoing initiatives or helpful products/sources in this area you’d like to point out? I would be grateful for your comments, tips, etc. Thanks!
As a former web developer turned articled student, this issue really interests me. When I was going through law school and reading judgment after judgment, I ached for an easy way to store, organize, normalize, search, and abstract all of these wildly different pieces of writing. I dabbled with CanLII’s API to fetch judgments and metadata, but since the layout and formatting of every judgment is different, there’s no easy, systematic way to parse the data.
XML would be a good first step. It’s easy to learn and use. I worked for a time as a web developer at the University of Victoria, where I worked closely with faculty members doing web-based research projects. Many of them employed research assistants to read through old texts and put the information into XML files. Very few of the assistants knew what XML was before they stepped into the lab, but they quickly picked it up.
The great thing about XML is that, as long as the schema design is solid, it can easily be translated into whatever other format you need it to be in. So, for example, if the Supreme Court of Canada were to publish their judgments in XML, I could take the feed and throw it into a MySQL database, etc.
But the key to success with XML – or with any similar markup language – is to have a solid schema that’s actually used properly. It would do no good to publish judgments in XML if the schema were just BIGWALLOFTEXT.
Ideally we’d get XML judgments that look something like this (this is obviously off the cuff and incomplete):
Smith v. Jones
Smith
Jones
2014 SCC 55
2014-03-01
2014-08-01
Mr. Justice Reasonable
Family
Will we ever get there? Who knows.
Oops, it looks like the comment box removed all of my XML formatting. Let me try that again (using square brackets):
[judgment]
[styleOfCause]Smith v. Jones[/styleOfCause]
[plaintiff]Smith[/plaintiff]
[defendant]Jones[/defendant]
[citation]2014 SCC 55[/citation]
[hearingDate]2014-03-01[/hearingDate]
[judgmentDate]2014-08-01[/judgmentDate]
[judge]Mr. Justice Reasonable[/judge]
[areaOfLaw]Family[/areaOfLaw]
[statutes]
[statute name="Family Law Act" jurisdiction="British Columbia" sections="46,47,91"/]
[/statutes]
[/judgment]
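To make the "throw it into a database" idea concrete, here is a minimal sketch in Python. It assumes the square-bracket example above is written as real XML with angle brackets, and it uses the standard library's sqlite3 as a stand-in for MySQL. The table and column names are invented for illustration and are not part of any court's schema.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical judgment record using the element names proposed above.
JUDGMENT_XML = """
<judgment>
  <styleOfCause>Smith v. Jones</styleOfCause>
  <plaintiff>Smith</plaintiff>
  <defendant>Jones</defendant>
  <citation>2014 SCC 55</citation>
  <hearingDate>2014-03-01</hearingDate>
  <judgmentDate>2014-08-01</judgmentDate>
  <judge>Mr. Justice Reasonable</judge>
  <areaOfLaw>Family</areaOfLaw>
  <statutes>
    <statute name="Family Law Act" jurisdiction="British Columbia" sections="46,47,91"/>
  </statutes>
</judgment>
"""


def load_judgment(conn, xml_text):
    """Parse one <judgment> document and store it in two tables."""
    root = ET.fromstring(xml_text.strip())
    # Table layouts are assumptions made for this sketch.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS judgments ("
        "citation TEXT PRIMARY KEY, style_of_cause TEXT, judge TEXT, "
        "judgment_date TEXT, area_of_law TEXT)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS statutes_cited ("
        "citation TEXT, name TEXT, jurisdiction TEXT, section TEXT)"
    )
    citation = root.findtext("citation")
    conn.execute(
        "INSERT INTO judgments VALUES (?, ?, ?, ?, ?)",
        (
            citation,
            root.findtext("styleOfCause"),
            root.findtext("judge"),
            root.findtext("judgmentDate"),
            root.findtext("areaOfLaw"),
        ),
    )
    # One row per cited section, so sections can be searched individually.
    for statute in root.iterfind("statutes/statute"):
        for section in statute.get("sections", "").split(","):
            if section.strip():
                conn.execute(
                    "INSERT INTO statutes_cited VALUES (?, ?, ?, ?)",
                    (citation, statute.get("name"),
                     statute.get("jurisdiction"), section.strip()),
                )
    return citation


conn = sqlite3.connect(":memory:")
load_judgment(conn, JUDGMENT_XML)
rows = conn.execute(
    "SELECT section FROM statutes_cited ORDER BY section"
).fetchall()
```

Splitting the sections attribute into one row per section is a design choice for the sketch: it would let a query such as "every judgment citing section 91" run as a plain lookup rather than a text search.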
See: http://typographyforlawyers.com/introduction.html
A controlled vocabulary for the various sections of the document, like statute name="Family Law Act" jurisdiction="British Columbia" sections="46,47,91" above
A form with all of the various lines or blocks of text marked with the terms above, that can be filled in in any word processor
A program, callable from inside most word processors, that makes the form into a word-processor document with the terms converted into the names of paragraph types.
Another that takes the word-processor document and creates the annoying-to-write but oh-so-computer-friendly XML that I need in order to
– put it in a database
– typeset it
– convert it to a web page.
This should help, too, or perhaps this can be updated or superseded: “The Preparation, Citation and Distribution of Canadian Decisions” from the Canadian Citation Committee: https://lexum.com/ccc-ccr/preparation/en/
The first step would be to have the courts publish the judgements in the first place. In Ontario the only judgements that are published are at the Court of Appeal level. The rest are only accessible through CanLII, QuickLaw and WestLaw, and all three have restrictive terms of use that would stop someone from doing the kind of large scale processing necessary to infer many of the tags that are mentioned above (http://www.cameronhuff.com/blog/ontario-case-law-private/index.html).
Clever parsers and machine learning could do a lot of the “tagging” work but only if the cases are available for use. One area where that might not be enough is with references to laws (addressed above by David with the solution of using XML tags). If judges were forced to use a standard format for citations then you could probably avoid the need for special tagging of references and avoid the huge expense of having someone specially tag each reference.
The first step is unrestricted access to judgements. I think programmers could probably fill in almost all of the rest without extra effort by the courts. Crowdsourcing might be able to address a few of the shortfalls. The remaining parts could be handled through the use of consistent formats like McGill Guide for citations (which would enable proper parsing).
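As a rough illustration of why a consistent citation format helps parsing, here is a small Python sketch that pulls neutral-style citations out of running text. The pattern is an assumption covering only the simple "year, court abbreviation, number" shape (as in 2014 SCC 53); it is not a full implementation of the McGill Guide, and the sample strings are illustrative formats rather than references to particular cases.

```python
import re

# Assumed pattern for citations shaped like "<year> <COURT> <number>".
# Real citation styles have many more variants than this covers.
NEUTRAL_CITATION = re.compile(r"\b((?:19|20)\d{2}) ([A-Z]{2,8}) (\d+)\b")


def find_citations(text):
    """Return a (year, court, number) tuple for each citation found."""
    return [
        (int(year), court, int(number))
        for year, court, number in NEUTRAL_CITATION.findall(text)
    ]


sample = "The rule in 2014 SCC 53 was applied in 2015 ONCA 120 and 2016 BCSC 7."
citations = find_citations(sample)
```

When every court writes citations in one predictable shape, a few lines like these can replace hand tagging. Each format variant that judges are free to use needs its own pattern, which is where the cost of inconsistency shows up.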
A court case publishing system needs to be both viable and useful. In other words, judgment publishing needs to be inexpensive for the court to operate and also cheap for those who want to reuse the information by, say, loading it into another tool. In my opinion, there is one piece of software that is a good compromise between the two objectives, and it is called Word, by the beloved Microsoft Corporation.
The use of a properly styled Word template can bring us very close to obtaining structured information at the end-user level at a marginal cost to the court producing the judgements (they use Word anyway to write them).
Word styles are easily converted into HTML or XML structure.
There is no need for an XML input to detect paragraph numbers in judgements, as in this example – https://www.canlii.org/en/ca/scc/doc/2014/2014scc53/2014scc53.html#par45. Even in highly hierarchical documents such as this one – http://greybook.seylii.org/se/CAP40 – all the structure can be derived directly from Word as long as a clean Word template is used for the document preparation.
Virtually all case metadata – case citation, date, docket, on appeal from, judges, parties, cases cited, and so on – can be captured directly from the Word file if the data elements are marked with distinct MS-Word styles according to a pre-defined template (see here for example – http://scc-csc.lexum.com/scc-csc/scc-csc/en/item/14302/index.do).
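The style-to-structure conversion described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the style names (StyleOfCause, Citation, Judge) and the output element names are hypothetical, not taken from any court's actual template, and the input is a hand-written stand-in for the word/document.xml part that a .docx file contains.

```python
import xml.etree.ElementTree as ET

# Namespace used by WordprocessingML, the XML inside .docx files.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

# Hypothetical paragraph style names mapped to hypothetical output tags.
STYLE_TO_TAG = {
    "StyleOfCause": "styleOfCause",
    "Citation": "citation",
    "Judge": "judge",
}

# Minimal stand-in for word/document.xml; a real file has far more markup.
DOCUMENT_XML = f"""<w:document xmlns:w="{W}"><w:body>
<w:p><w:pPr><w:pStyle w:val="StyleOfCause"/></w:pPr><w:r><w:t>Smith v. Jones</w:t></w:r></w:p>
<w:p><w:pPr><w:pStyle w:val="Citation"/></w:pPr><w:r><w:t>2014 SCC 55</w:t></w:r></w:p>
<w:p><w:pPr><w:pStyle w:val="Judge"/></w:pPr><w:r><w:t>Mr. Justice </w:t></w:r><w:r><w:t>Reasonable</w:t></w:r></w:p>
<w:p><w:r><w:t>Unstyled body text.</w:t></w:r></w:p>
</w:body></w:document>"""


def styled_paragraphs_to_xml(document_xml):
    """Turn each Word paragraph into an element named after its style."""
    root = ET.fromstring(document_xml)
    judgment = ET.Element("judgment")
    for para in root.iterfind(".//w:p", NS):
        style = para.find("w:pPr/w:pStyle", NS)
        style_name = style.get(f"{{{W}}}val") if style is not None else None
        # Paragraphs with no recognised style fall back to <paragraph>.
        tag = STYLE_TO_TAG.get(style_name, "paragraph")
        # A paragraph's text may be split across several runs; join them.
        text = "".join(t.text or "" for t in para.iterfind(".//w:t", NS))
        ET.SubElement(judgment, tag).text = text
    return judgment


judgment = styled_paragraphs_to_xml(DOCUMENT_XML)
xml_out = ET.tostring(judgment, encoding="unicode")
```

For a real file, the document part could be read with the standard library, for example zipfile.ZipFile(path).read("word/document.xml"), before being passed to the function. How cleanly that works depends on authors actually applying the template's styles rather than formatting by hand.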
I hate the way that CanLII-published judgments are double spaced – printing cases becomes very wasteful.
Save the html file, open it in your word processing program, and modify the line spacing, font, etc. as you prefer.
As a gripe specific to CanLII, how about the lower-quality PDFs the site offers now, compared to years past?
Not sure what exactly is the problem, but there is some substitution of typefaces, or some corruption of kerning, or some spasm of spacing that crams glyphs together within words while also increasing word spacing.
Times New Roman might not be the best ‘face in town, but if using a lowest common denominator, at least try not to make it even lower! The documents from the Courts are not the source of the problem.
Structured text (XML) is the single most important thing, as suggested by Jamie Nay. That will address typography, PDFs, consistent formatting across the country, database searching, paragraph numbering, quick inter-document linking, consistent noteup/citing judgments, etc etc etc
Even if judges used Multimarkdown to write judgments in plain text rather than Word, it would be a simple, consistent conversion process to HTML for CanLII and PDF, with professional typesetters (or even just CSS designers) worrying about double-spacing online or single-spacing in PDF, font choices, whitespace, etc.
Since the SCC has affirmed that litigants are entitled to reasons for judgement that explain the reasoning process, why not adopt some form of hypertextual Toulmin diagram? Each finding would have to be tagged with metadata making explicit what each finding actually is (credibility-related? judicial notice?) and where it fits in the overall inference network.
In my opinion, as a non-lawyer, we as taxpayers and beneficiaries of the system are entitled to nothing less.
Law is a language. Where the reasons are adequate as written, then the words DO adequately explain, to any person with adequate fluency, the basis of each finding (conclusion) and how each finding fits into the framework of the portion of legal system involved in the decision. You want to understand where “each finding … fits in the overall inference network” – whatever that phrase means – take the time to learn the language. You DON’T have to go to law school to do that.
I attempted to construct a diagram to dynamically illustrate relationships between cases using D3.js: http://davidlanger.ca/
IMHO a more polished and complete version of the diagram, developed by an expert, ought to be available for readers of publicly available case law.
Why? Because words do not adequately explain [express], to each and every reader [‘any person’], the content of precedent.