Law and the Semantic Web

One of the things that surprised me when I started working with law firms is that most firms and most tech people repeatedly ask one question that seems to stifle innovation and the development of new concepts and ideas. When presented with something new, most ask: “Which other law firm is doing this?” While this makes some sense and provides a way of weeding out wacky ideas with no traction, it also limits innovation and creativity. What about ideas emanating from other professional service firms? Other service firms? From industry in general?

Take for example the semantic web:

  • “… a project that intends to create a universal medium for information exchange by putting documents with computer-processable meaning (semantics) on the World Wide Web”
  • “… an evolving extension of the World Wide Web in which web content can be expressed not only in natural language, but also in a form that can be understood, interpreted and used by software agents, thus permitting them to find, share and integrate information more easily”

The original vision for this is credited to Sir Tim Berners-Lee as an extension to his original invention (the World Wide Web). You can find an outline of the concept penned by Tim himself here. Much has been done to establish this framework, which is aimed at making web content more accessible and usable, especially by machines.

Originally, I asked a number of software vendors to elaborate on their plans for the semantic web. Once I got past the blank stares and the half-baked answers and came to the realization that most vendors have done little to nothing in this area, I went looking for more in-depth thinking. Surely something as important as this warranted some serious consideration by the best minds in the business. I was surprised to find that some excellent resources are available to those who want to be on the leading edge of this thinking.

After all, this is more than software vendors using XML as a basis for word processing documents. What about using the semantic web constructs to represent and structure documents — from contracts to KM artifacts such as precedents, research memos and opinions?
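To make that question a little more concrete, here is a minimal sketch of what "semantic" structure for legal documents could look like. It is an illustration only: the identifiers (`doc:lease-001`, `ex:governingLaw`, and so on) are made-up names rather than terms from any published legal vocabulary, and plain Python tuples stand in for a real RDF store. The idea it shows is the subject-predicate-object statement, which lets software follow relationships between documents instead of just searching their text.

```python
from collections import namedtuple

# A statement in the semantic-web style: subject, predicate, object.
Triple = namedtuple("Triple", ["subject", "predicate", "obj"])

# Hypothetical identifiers and vocabulary, invented for this example.
triples = [
    Triple("doc:lease-001", "rdf:type", "ex:Contract"),
    Triple("doc:lease-001", "ex:governingLaw", "ex:Ontario"),
    Triple("doc:lease-001", "ex:hasParty", "party:acme"),
    Triple("doc:memo-017", "rdf:type", "ex:ResearchMemo"),
    Triple("doc:memo-017", "ex:discusses", "doc:lease-001"),
]


def match(statements, subject=None, predicate=None, obj=None):
    """Return the statements that agree with every field that is not None."""
    return [
        t for t in statements
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


def related_documents(statements, doc_id):
    """List the subjects of any statement whose object is doc_id."""
    return sorted({t.subject for t in match(statements, obj=doc_id)})


if __name__ == "__main__":
    # Which items in the collection point at the lease?
    print(related_documents(triples, "doc:lease-001"))
    # Which items are typed as contracts?
    print([t.subject for t in match(triples, predicate="rdf:type", obj="ex:Contract")])
```

A real system would presumably use an RDF library and an agreed ontology rather than hand-typed tuples, but the shape of the data is the point: once a research memo is recorded as discussing a particular contract, that link can be queried directly.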

There are a number of resources available for those interested in tracking and understanding the possibilities in this area. These include:

  • – although this group seems to be inactive, it has produced a number of schemas and document frameworks. Much of this work has been deemed “completed” and includes:

    • electronic court filing documents – “using XML to create and transmit legal documents among attorneys, courts, litigants, and others”;
    • eContracts – “enabling the efficient creation, maintenance, management, exchange, and publication of contract documents and terms”;
    • eNotarization – “technical requirements to govern self-proving electronic legal information”;
    • integrated justice frameworks and documents – to facilitate “the exchange of data among justice system branches and agencies for criminal and civil cases”;
    • lawful intercept documents – “production of a structured, end-to-end lawful interception process framework consisting of XML standards and authentication mechanisms, including identifiable related XML standards and XML translations of ASN.1 modules”;
    • legislative documents, citations, and messaging – “standardizing markup for legislative documents and simple citation for non-legislative documents”;
    • online dispute resolution – “using XML to allow public access to justice through private- and government-sponsored dispute resolution systems”.
  • leXML – an initiative “established to serve the growing interest in automated exchange of legal data”.
  • Legal XML Conferences – several of which have taken place over the last five to ten years. Proceedings from these make a good read.
  • Legal RDF – a not-for-profit initiative set up to move the concept of a legal semantic web along. Note: this web site will take you to other resources and is well worth the visit.

If your head isn’t hurting yet from all this technical detail, I would also like to point you to a book: Law and The Semantic Web (Benjamins et al. eds; Springer, Berlin, N.Y.: 2005). In this book, the editors assemble a number of primary research papers dealing with legal ontologies, methodologies, legal information retrieval, and applications. The work was prompted in part by the EU Lisbon Summit in March 2000, where EU heads of state made a public commitment to “become the most competitive knowledge-based society in the world by 2010.” Several projects were funded as part of the EU’s “Semantic-Based Knowledge and Content Systems” Strategic Objective. The book provides leading-edge thinking on the application of semantic web frameworks to the legal domain.

So why then does this not have the traction one would expect for such an important initiative? First, no major software vendor has embraced this in an aggressive way. Some of the document automation vendors have taken baby steps into the world of XML. But in my view, the efforts to use XML formats for word processing documents are not enough. Second, Nathan Simpson, a colleague and collaborator, makes the point that we are waiting for artificial intelligence to automate the tagging and classification of these documents. Given the volume of documents most firms and individuals are dealing with, manual tagging of content is a losing battle.

Nevertheless, I do think we should be working on ways to move this along. Only through our collective efforts will we be able to make a difference and lay the groundwork for these frameworks to enable better solutions for the legal industry.


  1. Joel, one vendor that you may want to check is Siderean

  2. Thank you for this fantastic post! This is an area in which my company (Legal Data Services) is actively working, but taking it slowly to find out where agility and speed can make a difference (in other words, small systems, rather than large ones). I was working in XML back in 1998, but didn’t have anything to do with it! 10 years later, XML is almost passe! It’s a method, not The Answer.

    One thing that I think lawyers haven’t yet grasped is that the “semantic web” does NOT refer to the World Wide Web, but to ANY network. It could be something as simple as the files on your local computers. Hook the computers together and you have a “network.” Hook many computers together and you have a “web.” So, the key is the word “semantic,” which means building relationships between concepts that can be extended to finding relationships between documents.

    To add some resources, I came across several fantastic ones through the Earley & Associates JumpStart series. I would flag Keith Hawker of the UK, whose company Metatomix has already created a system to use disparate data across court systems in Florida, Georgia and, I think, Ohio. In his presentation, he really flagged the idea that the objective is to build relationships that lead to action, not just tagging for social purposes.

  3. Metatomix, as a semantic integration software provider, has taken a simple approach to the use of the “semantic web.” Rather than recreating enterprise data as RDF, and then running it through application logic, we take the approach of combining real-time RDF transformation with a real-time business rules engine. Metatomix Policies (ontology-described rules), then, enable business event driven applications to quickly and easily integrate data through its context, for the purpose of creating dynamic insights and alerts. This is in fact how we’ve applied semantic technologies in the Justice and Public Safety space in states like Florida, Georgia, Ohio, as mentioned in the comment above.

    While this doesn’t achieve the objectives Joel writes about above, with respect to contextual tagging of legal documents, it does deliver real results in the criminal justice environment.

    Great insights, however, into where the broader legal industry can go with more powerful tools leveraging ontologies and RDF.