Data Mining

There’s an interesting article in a recent issue of D-Lib Magazine: “From Babel to Knowledge: Data Mining Large Digital Collections,” by Daniel J. Cohen. He discusses how search tools can be refined in a couple of specific ways so that a researcher can extract the kind of document needed from the welter of uncatalogued documents on the internet (or in offline collections). It’s not hard to see the potential for getting better access to law-related documents, or for otherwise making better use of full-text-indexed, law-related databases.

You might take a particular look at two of the tools his research has led to: Syllabus Finder, which, as the name suggests, turns up academics’ syllabi on the topics searched for, and H-Bot, a natural language search engine that can answer (some) questions about history.

[via this month's Current Cites]
