Librarians Will Always Be Needed

The attached URL points to a post on the Google Library project in which the blogger makes the interesting comment that:

In spite of their inherent slowness, organizing information is a job that’s still best done by people, and in most places those people are called librarians. I admit librarians can’t begin to sort all the available information, but at least for them preserving, categorizing, and creating access to the information that people need is a higher priority than content–targeted advertising. The insistence of librarians on continuing to use what might seem like arcane and antiquated systems — such as the Dewey Decimal or Library of Congress systems — is as much for the benefit of patrons as it is for the librarians who shelve the books. These systems were designed to keep the materials on platypuses in one place and the materials on stuffed toys in another.

In a world of full-text engines, should we look to cataloguing systems based on human intelligence to do something different than a machine-based catalogue would?

I remember Jon Bing telling me 15 years ago that, for all the legal publishers’ vaunted emphasis on the value added by headnotes written by trained lawyers, when one actually did the vocabulary analysis most headnotes simply reused the same terms the judge had used: they added no terms to attract a search engine that weren’t already in the judgment.

I’ve always argued that as open source services like CanLII and the other LIIs expand their corpus of caselaw, the publishers must significantly enhance the value of their editorial additions in order to justify the expense of subscriptions.

Of course, this argument is undercut by the paranoia of lawyers that they may have missed something of value and the relative price insensitivity of purchasers of legal materials.

Simon C


  1. What about synonyms, homonyms, misspellings, and such? What happens when ‘juvenile delinquency’ is used rather than ‘youth crime’? How much is to be allowed to fall ‘between the cracks’ before a controlled vocabulary becomes a necessity?

    I know how much lawyers enjoy missing a case or two during their research… :-)

    Controlled vocabularies aren’t cheap. The legal publishers know this and are slowly moving away from them. I’m predicting a future of higher prices, less subject access, and very expensive current awareness services.

  2. Don’t know if this responds to Simon Chester’s post, but it does illustrate in a very small way the need for some “value added” material or metadata before searching becomes useful:

    I’m working on an archive of dissertations for Osgoode Hall Law School’s research web pages: a subset, if you like, of what ProQuest offers, but done in a table and able to be ordered by year, title, author, degree, area of law, etc. I’d like visitors to be able to search as well, because there are a few hundred records.

    Thing is, though, do I take the position that visitors to the site will understand that because it’s a Canadian law school, “labour” is spelled with a ‘u’, or do I tell them of that oddity and any others that I can think of? Or do I change all “labour” to “labor”? If I’m putting in keywords or tags, I could include both spellings every time, if I feel like some extra work, but when I’m hoping to make searching titles sensible, then…?

    Here is where an electronic dictionary of correspondences might help, such that every search for “labor” searches as well for “labour” unless I tell it not to.

  3. I totally agree with Simon Chester – “publishers must significantly enhance the value of their editorial additions in order to justify the expense of subscriptions.” Free text searching is making the world way too flat. Sometimes it is important to see the hierarchical organization and the contextualization of concepts. I just came back from the annual meeting of the American Association of Law Libraries. One of the programs I attended was titled “Indexes, Taxonomies and the Google Generation: What You Don’t Know Will Hurt You.” Obviously, the panel was preaching to the converted. (One of the panelists was Dan Dabney, West’s Senior Director of Research and Development. He is brilliant and an engaging speaker with a wicked dry sense of humor.)

    It’s almost a chicken-and-egg situation. If users are interested only in free text searching, there is no incentive for the publishers to employ indexers to do a good indexing job and make indexes easier to use. To be honest with you, if West’s Key Numbering system were not online and so well integrated within their system, I would have a harder time convincing students to use it (even in conjunction with free text searching). The Google generation knows basic keyword searching and is used to finding “something” on Google. When they cannot find their cases or statute sections on Lexis and Westlaw, they panic (or even worse, they think they don’t exist). Try convincing them to use a digest or an index, in print!

    All is not lost though. Some publishers, especially those operating in a highly competitive environment, are trying to highlight their labour-intensive value-added features – such as Westlaw’s highly sophisticated Key Numbering system, and BNA’s indexes to their various current awareness newsletters (US-focused). They are putting their indexes online with cross-reference links and other more sophisticated linkages (in the case of Westlaw). They are striving to make their indexes as easy to use as possible (I think online indexes are still not user-friendly enough).

    I am hoping that someone will start pulling together interesting examples that illustrate how an index is a much more effective and efficient finding tool than the “search” button. Then we can all share them to convince (no, scare) our students and attorneys.

    As to Simon Fodden’s question, from a user’s point of view, I’d like to see the cross-referencing as transparent as possible. So if I enter “labor,” the search should automatically pick up both “labor” and “labour.” I don’t know how easy it is to build that into the database.

    Sorry, this comment is way too long.

  4. Oh gosh how the world turns.

    Does anyone remember the definitive piece in the LLJ by Dabney, The Curse of Cadmus?

    It’s such an unusual name that he must be the same person, gone over to the dark side and now living in Eagan, MN.

    Seriously, the arguments of Professor Dabney are still worth considering.

  5. Trust me to get the title wrong:

    Dabney, The Curse of Thamus: An Analysis of Full-Text Legal Document Retrieval, 78 Law Libr. J. 5 (1986), but it is posted full text at

  6. Posts and comments often need amending or emending. Commenters can edit their comments by clicking on the “e” that will appear after the date of a comment if they are logged in.

    If comments are made when a member is logged in, the member’s full name and website are produced automatically.

  7. One of the best indexing systems around is the one used for PubMed (for the biomedical literature, but the indexing concepts would apply to legal specialty terminology). Terms input into the search box “map” to both text words (in all searchable fields) and subject headings (single words and phrases); also the underlying dictionary “translates” many concepts into keywords for searching.

  8. The problem with Mr. Bing’s observation is that, while it is true that the majority of headnotes contain most of the words used by the judge, that means nothing to searchers. The value of a headnote is that it co-locates those words with the statute, rule, regulation or ordinance to which they relate. Judges don’t do that.

    The second value, which is invaluable, is that the headnote fits the case into an elaborate, elegant taxonomy which every practitioner understands even if he doesn’t worship it. Through this “controlled” classification system every relevant item can be found by the advocate or his opponent. Find cases about “motions to dismiss” in cases of “breach of contract” or “failure to state a cause of action” and you will quickly become fogged out of business.

    And another thing: What makes it “paranoia” for a lawyer to fully expect that his opponent will closely read and double-check everything he says or writes with a view toward shooting him down for the slightest error or misstep? For anyone who is not a lawyer and labels that normal, understandable and widely prevalent attitude as “paranoia,” there is a category: Illiterate.
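
The “dictionary of correspondences” raised in comment 2, and the transparent cross-referencing asked for in comment 3, could be sketched roughly as follows. This is only an illustration, not how any particular archive or vendor does it: the variant groups, the function names (`expand_term`, `search_titles`) and the `use_variants` switch are all assumptions made up for the example, and a real table of correspondences would need to be much longer and curated by hand.

```python
import re

# Hypothetical correspondence table: each group lists spellings that a
# search should treat as equivalent. A real table would be far larger.
VARIANT_GROUPS = [
    {"labor", "labour"},
    {"defense", "defence"},
    {"judgment", "judgement"},
]

# Look-up from any one spelling to the whole group it belongs to.
VARIANTS = {word: group for group in VARIANT_GROUPS for word in group}


def expand_term(term, use_variants=True):
    """Return the set of spellings to search for a single query term."""
    term = term.lower()
    if not use_variants:
        return {term}
    return set(VARIANTS.get(term, {term}))


def search_titles(titles, query, use_variants=True):
    """Return the titles that contain every query term.

    With use_variants=True (the default), a term also matches any of
    its listed spelling variants; passing False is the "unless I tell
    it not to" case.
    """
    term_sets = [expand_term(t, use_variants) for t in query.split()]
    hits = []
    for title in titles:
        words = set(re.findall(r"[a-z]+", title.lower()))
        # Each query term must be present in at least one spelling.
        if all(words & alternatives for alternatives in term_sets):
            hits.append(title)
    return hits
```

With a few made-up titles, `search_titles(titles, "labor")` would return records spelled either way, while `search_titles(titles, "labor", use_variants=False)` would return only the American spelling. The searcher never has to know which spelling the cataloguer chose, which is a small piece of the work a controlled vocabulary does.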