Internet Archive and Copyright
Too late for our Theme Week on copyright but still interesting:
Michael Shamos, a computer science professor at Carnegie Mellon University, said archiving like that done by the Internet Archive is “the biggest copyright infringement in the world,” but said it is done in a way “that almost nobody cares about.”
CNEWS (via AP): Internet Archive faces copyright suit
A couple of weeks ago there was an item in the news.
To a researcher, this is one of those instances where you want to say, “Yes, sure, but…” The “but” is about free access, of course; however, in this case it would seem that the “yes, sure” must be correct. Alas. If it was copyrighted to begin with, it’s surely still copyrighted when someone else copies it, even though my current version has changed. I think the Internet Archive follows something like the infamous Rogers Cable negative option: there are ways, evidently, to prevent their bots and spiders from taking your site, and ways to get them to remove your old material — but you do have to take steps, and I’m not sure that’s right. In the instant case, though, it would seem that Healthcare Advocates did take steps to protect their data but it got out via the Archive nonetheless.
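For the curious, the usual mechanism behind the "steps" a site owner has to take is a robots.txt file at the root of the site. What follows is only a minimal sketch, not a description of how the Archive actually behaves: it assumes the Archive's crawler identifies itself as ia_archiver (the user-agent name historically associated with it), and the example.com URL and the is_allowed helper are hypothetical names made up for illustration. It uses Python's standard robots.txt parser to show what such an opt-out asks of a well-behaved crawler.

```python
from urllib.robotparser import RobotFileParser

# A sample robots.txt: asks the crawler calling itself "ia_archiver"
# (assumed here to be the Archive's bot) to stay out entirely,
# while leaving the site open to every other crawler.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow:
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text permits user_agent to fetch url.

    Hypothetical helper for illustration only.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/old-page.html"  # placeholder URL
    print("ia_archiver allowed:", is_allowed(ROBOTS_TXT, "ia_archiver", page))
    print("other bots allowed:", is_allowed(ROBOTS_TXT, "SomeOtherBot", page))
```

Note that a robots.txt file is a request, not a lock: it works only if the crawler chooses to honour it, which is exactly why the negative-option arrangement leaves the burden on the site owner.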
Or is this another case of the Google cache brouhaha? In Parker v. Google [pdf] the U.S. District Court decided in favour of the search giant. Perhaps the copyright doesn’t kick in, as it were, until someone further downstream from the Internet Archive makes an impermissible use of the data.
On a more general level, archiving material on the internet is a serious issue. A great deal of data now exists in digital form only, on the net rather than in print, with the consequence that it’s evanescent. It would make sense for individuals and organizations to take steps to archive their own contributions to the internet; and in that regard people might be interested in a monograph by Niels Brügger, “Archiving Websites: General Considerations and Strategies” [pdf], published by the Centre for Internet Research in Århus, Denmark.