“Legal search algorithm” … now there’s a phrase to make your head spin. I’ve been thinking about legal search for years, but I confess that I hadn’t given the algorithm much thought until recently. Type it into Google, and you come up with an excellent post by Aaron Kirschenfeld in the Cornell LII blog: “Everything is Editorial: Why Algorithms are Hand-Made, Human, and Not Just for Search Anymore”.
For legal publishers, ensuring that our users can find what they are looking for is one of the biggest challenges we face. I’ve never encountered legal information that isn’t incredibly dense. We know that we need to provide tools for users to locate the answers to their research questions. For print publications, the traditional tools are pretty straightforward: tables of cases, legislation, and references, and especially the index. But delivering findability for online resources is another matter altogether.
This is where the search engine is key. But there’s another important aspect as well: the search algorithm. We know that legal research is much different from simple keyword searching, and that it requires much more than Google. We also know that the big legal publishers pour significant resources into the development of their search engines.
Kirschenfeld notes that although we assume a search engine has magic properties, it doesn’t, or at least it shouldn’t. The algorithm and the results interface are made by human beings, after all. One of his most illuminating points is that we should consider the algorithm just another secondary source. Through this lens, we can evaluate it the way we evaluate any other secondary source: who worked on it, how skilled they were, how informed they were, and so on. In other words, if your developers have a deep and comprehensive understanding of how to approach legal information, you’re going to get much better, more relevant search results.
This is fascinating stuff for a small but ambitious legal publisher. We don’t have anything like the resources or expertise to develop our own algorithm. Even so, we need to pay close attention to how well our search engine is performing.
After a dismal experience with our first search engine (its chief problem was that, as our online library grew, it simply didn’t have the horsepower to search the entire collection; in other words, it couldn’t scale), we connected with our friends at Lexum, the software developer behind CanLII. They have developed LexFind, a search engine built on open source technology and designed specifically for the needs of legal research.
Are we meeting the needs of our users? Stickhandling the expectations of librarians and legal researchers is a constant challenge. We get valuable feedback from them as we develop and define our search function.
We sometimes hear that we have “too many results”. Does this mean the algorithm needs further crafting? Or do we need to improve our instructions for advanced search, refine the facets for federated search, or enhance and deepen the taxonomy? These are ongoing questions for a legal publisher. I’m resigned to the fact that we may never consider our search engine perfected, and that improving it may be one of those projects with no end.
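For readers curious what facet refinement actually does for a researcher facing too many results, here is a minimal sketch in Python. To be clear, this is an invented illustration, not how LexFind or any real engine is built: the document fields (“jurisdiction”, “doc_type”) and the sample records are hypothetical.

```python
# Hypothetical sketch of faceted narrowing of an over-broad result set.
# The field names and sample documents are invented for illustration only.
from collections import Counter

results = [
    {"title": "Smith v. Jones", "jurisdiction": "ON", "doc_type": "case"},
    {"title": "Employment Standards Act", "jurisdiction": "ON", "doc_type": "legislation"},
    {"title": "R. v. Brown", "jurisdiction": "BC", "doc_type": "case"},
    {"title": "Annotated ESA", "jurisdiction": "ON", "doc_type": "commentary"},
]

def facet_counts(docs, field):
    """Count how many results fall under each value of a facet field."""
    return Counter(d[field] for d in docs)

def apply_facet(docs, field, value):
    """Narrow the result set to documents matching one facet value."""
    return [d for d in docs if d[field] == value]

# Show the researcher the breakdown, then narrow by one facet.
print(facet_counts(results, "doc_type"))
narrowed = apply_facet(results, "jurisdiction", "ON")
print(len(narrowed))  # 3
```

The point of the sketch is that facets don’t change the algorithm at all; they give the researcher a way to carve a large result set down to a manageable one, which is one answer to the “too many results” complaint.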