The Chesapeake Digital Preservation Group has just published its 5th annual study of link rot among the original URLs for online law- and policy-related materials it has been archiving since 2007.
“Every year, the Chesapeake Group investigates whether or not the documents in the archive can still be found at the original web addresses from which they were captured. The group analyzes two samples of web addresses, or URLs, pulled from the archive’s records.”
“The first sample includes 579 original URLs for content captured from 2007-2008. This sample is revisited every year to document link rot and explore how it changes over time (…)”
“In 2012, 218 out of 579 URLs in the sample no longer provide access to the content that was originally selected, captured, and archived by the Chesapeake Group. In other words, link rot has increased to 37.7 percent within five years.”
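The percentage quoted above can be checked with a couple of lines of arithmetic. The following is only an illustrative sketch: the function name `link_rot_rate` is hypothetical and does not come from the Chesapeake Group's report; the two counts are the ones given in the quotation.

```python
def link_rot_rate(rotted: int, total: int) -> float:
    """Return the share of sampled URLs affected by link rot, as a percentage."""
    if total <= 0:
        raise ValueError("total must be a positive number of URLs")
    return 100.0 * rotted / total


# Figures from the 2012 report: 218 of the 579 URLs in the 2007-2008 sample.
rate_2012 = link_rot_rate(218, 579)
print(f"{rate_2012:.1f}%")  # prints "37.7%"
```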
Link rot describes “a URL that no longer provides direct access to files matching the content originally harvested from the URL and currently preserved in the Chesapeake Group’s digital archive. In some instances, a 404 or ‘not found’ message indicates link rot at a URL. In other cases, the URL may direct to a site hosted by the original publishing organization or entity, but the specific resource has been removed or relocated from the original or previous URL” (from the 2011 link rot report).
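For readers who want to see how this definition might translate into an automated check, here is a minimal sketch. It is not the Chesapeake Group's method, which the report excerpts here do not describe. It assumes that a URL counts as rotted when the server does not answer with HTTP status 200, or when the bytes served no longer match the archived copy exactly. The function name `is_link_rot` and the exact-hash comparison are assumptions made for illustration; a real review would probably need human judgment, since a document can be reissued with trivial changes.

```python
import hashlib
from typing import Optional


def is_link_rot(status_code: int, live_body: Optional[bytes], archived_body: bytes) -> bool:
    """Hypothetical check: has the original URL stopped serving the archived file?"""
    # Case 1: a 404 or any other non-200 response, or no body at all.
    if status_code != 200 or live_body is None:
        return True
    # Case 2: the URL still resolves (for example to the publisher's home page),
    # but the content differs from the archived copy. Exact matching is an assumption.
    live_digest = hashlib.sha256(live_body).hexdigest()
    archived_digest = hashlib.sha256(archived_body).hexdigest()
    return live_digest != archived_digest


# Usage with made-up sample data (no network access involved):
print(is_link_rot(404, None, b"report"))          # True
print(is_link_rot(200, b"report", b"report"))     # False
print(is_link_rot(200, b"home page", b"report"))  # True
```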
More than 90% of the sample URLs came from state government (state.[state code].us), organization (.org), and U.S. government (.gov) domains.
The Project has built a digital archive collection comprising more than 8,600 digital items. Most of the material archived is American. The Project is an initiative of the Georgetown Law School and Harvard Law School Libraries, and of the State Law Libraries of Maryland and Virginia.
This issue is also of major concern to Canadian legal researchers, as illustrated by the following posts here on Slaw: