Palimpsests

A palimpsest — one of my favourite words — is, according to Wikipedia, “a manuscript page, scroll, or book that has been written on, scraped off, and used again,” dating from a time when wax tablets and parchments were bases for writing. Stuff from the past, right? Don’t be so sure.

The U.S. National Security Agency, the “puzzle palace” of recent spying scandals, has released an advisory on “Redacting with Confidence: How to Safely Publish Sanitized Reports Converted From Word to PDF” [pdf]. (You’ve got to love the bureaucratese that styles the name of the releasing body, which is the Information Assurance Directorate of the Architectures and Applications Division of the Systems and Network Attack Center (SNAC). Talk about the need for redaction!) The idea is that Word documents contain a whole lot of metadata — earlier versions, comments, author details, hidden text, etc. — that isn’t visible to the drafter, but that, when the document is sent in electronic form, might be retrieved by the recipient. Some of this metadata might be carried over into a PDF version unless some precautionary steps are taken.
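To make the point concrete, here is a minimal sketch of how a recipient might look for that kind of hidden material. It rests on an assumption the advisory does not make: that the file is a modern .docx (a zip archive of XML parts), not the binary .doc format of Word 2002. The function name hidden_data_report is my own, and the counts are rough indicators, not a complete audit.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespaces used by the Office Open XML parts inspected below.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "w": "http://schemas.openxmlformats.org/wordprocessingml/2006/main",
}


def hidden_data_report(docx):
    """Summarise data in a .docx that the drafter may not see on screen.

    `docx` may be a filesystem path or a binary file-like object.
    Hypothetical helper for illustration; it checks only a few known parts.
    """
    report = {
        "author": None,
        "last_modified_by": None,
        "comments": 0,
        "tracked_insertions": 0,
        "tracked_deletions": 0,
        "hidden_runs": 0,
    }
    with zipfile.ZipFile(docx) as zf:
        names = set(zf.namelist())

        # Author details live in the core properties part.
        if "docProps/core.xml" in names:
            core = ET.fromstring(zf.read("docProps/core.xml"))
            report["author"] = core.findtext("dc:creator", namespaces=NS)
            report["last_modified_by"] = core.findtext(
                "cp:lastModifiedBy", namespaces=NS
            )

        # Reviewer comments are stored in their own part.
        if "word/comments.xml" in names:
            comments = ET.fromstring(zf.read("word/comments.xml"))
            report["comments"] = len(comments.findall("w:comment", NS))

        # Tracked changes and hidden text sit in the main document part.
        if "word/document.xml" in names:
            body = ET.fromstring(zf.read("word/document.xml"))
            report["tracked_insertions"] = len(body.findall(".//w:ins", NS))
            report["tracked_deletions"] = len(body.findall(".//w:del", NS))
            # Rough count: any <w:vanish> element, whatever its value.
            report["hidden_runs"] = len(body.findall(".//w:rPr/w:vanish", NS))
    return report
```

Calling hidden_data_report("draft.docx") on a file of your own (the filename is a placeholder) returns a dictionary; a non-empty author or a non-zero count is a sign that the document carries more than its visible text.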

The problem of metadata in electronic palimpsests is not new. See, for example, this piece called “Confidentiality and Metadata in Word Documents”, on a site called Addbalance. Indeed, Microsoft itself has an article on its site on how to minimize metadata in Word 2002.

The NSA advisory contains fairly clear instructions as to how to minimize the chances of unwanted data being transmitted. In so doing, it makes clear that:

“One reason to convert a Word document to PDF is that the conversion redacts some information or hidden data from the document that is intrinsic to the Word format. However, some PDF software has the ability to automatically copy document meta-data and properties from Word to PDF. This feature, among others, must be disabled when downgrading or sanitizing documents.”

Incidentally, one of the features of the newly released WordPerfect from Corel is its ability to strip away some (most?) metadata on command.

These are things that lawyers should be aware of and about which a firm should have a policy and a clear practice.
