- Slaw - https://www.slaw.ca -

Google Cache & Copyright – I Don’t Get It

I’m wondering what everyone’s opinion is on this:
http://www.eff.org/deeplinks/archives/004344.php [1]

I don’t get it. Google’s cache is an almost complete reproduction of a webpage and, in my mind, goes way beyond legitimate copying. This decision seems to open the door for every scraping program on the web today. They add a couple of highlighted terms, and that’s ‘transformative’? What’s next, ads next to the cached page?

And why is it incumbent on webmasters to add a ‘noarchive’ meta tag to their pages, or to block the crawler in their robots.txt file? It’s not like the old days, when you submitted your site to a search engine; Google now indexes without asking. Truth be told, if the option were available to add a ‘yes-cache’ tag, I would do so (and I would definitely do so, and submit, for the Internet Archive [2]); BUT letting for-profit Google build a database of the web without publisher permission smacks of ‘negative option billing’.

We’ve got a couple of IP gurus here at Slaw, so help me, what am I missing?