Open Text Mining Interface

by John N. Davis

A couple of days ago, I came across a note by Tim O’Reilly concerning the Open Text Mining Interface (OTMI), which he described as a “copyright hack.” The initiative was started by Timo Hannay, who has also blogged about it on the website of his employer, Nature magazine. It tries to meet requests from indexers and data miners for the full text of articles without also making human-readable versions freely available to non-subscribers. OTMI, an XML format, consists of “word vectors” plus “snippets,” which amount, more or less, to all of the sentences in the article arranged alphabetically instead of in their original order. Links to samples are available in Hannay’s posting.
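To make the idea concrete, here is a rough Python sketch of the two ingredients described above: a word-frequency “vector” and the article’s sentences sorted alphabetically. It is only an illustration under my own assumptions. The function names `word_vector` and `scrambled_snippets` are hypothetical, the sentence splitting is deliberately naive, and nothing here reproduces the actual OTMI XML schema.

```python
import re
from collections import Counter


def word_vector(text):
    """Count how often each lowercased word occurs in the text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return Counter(words)


def scrambled_snippets(text):
    """Split the text into sentences and sort them alphabetically.

    The split is naive (it breaks after '.', '!' or '?' followed by
    whitespace) and is an assumption of this sketch, not OTMI's rule.
    """
    parts = re.split(r"(?<=[.!?])\s+", text)
    sentences = [s.strip() for s in parts if s.strip()]
    return sorted(sentences, key=str.lower)


# A made-up three-sentence "article" for illustration.
article = "Zebras graze at dawn. Ants march in lines. Zebras rest at noon."
vector = word_vector(article)
snippets = scrambled_snippets(article)
```

An indexer can still count terms and match phrases within a sentence from `vector` and `snippets`, but a human reader gets the sentences out of order, which is the trade-off the format appears to be aiming for.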

Comment

Simon Fodden

April 27th, 2006 at 4:39 pm

It’s funny, John: one of the ideas that came to mind when I was fussing over how to deal with the refusal of Big Law Publishers to let Slaw put up the tables of contents of their books was to “unarrange” them in some way that would free them from whatever copyright there might be. The notion was that the search mechanism on the TOC site would do the work of getting you where you wanted to go within the mess of data that was once a table of contents. I don’t think it would work legally, though, and might not work within the aim of the venture.
