Data, Metadata, De-Identification and Re-Identification

Data about individuals is very valuable. It can be used to discern trends, popular opinion, individual buying habits, and customer behaviour, to conduct medical research, and for many other purposes. But it is important that those who collect and use that data do so in a privacy-friendly manner.

One of the NSA’s deflections is that it doesn’t record conversations, only metadata about phone calls and other communications. Metadata is information about information, and it can be just as personal and invasive as the content itself.

The Ontario Privacy Commissioner, Ann Cavoukian, recently published a paper entitled A Primer on Metadata: Separating Fact from Fiction that uses the NSA revelations to discuss why metadata is a threat to privacy – that privacy is about control, not secrecy – and that we don’t have to give up privacy for security.

Related to this issue is that of de-identification and re-identification. The metadata issue tells us that we can’t just scrub names off a list and call it de-identified or anonymous. It can be very easy to re-identify people based on other information in that data, or by combining it with other data. So if we want to have a database of anonymous or aggregate data, it is important to consider how to best accomplish that.
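To make the re-identification risk concrete, here is a minimal sketch of what is sometimes called a linkage attack. Everything in it is hypothetical: the records, the field names, and the choice of postal prefix, birth year and sex as the linking fields are assumptions made for illustration, not data or methods from the papers discussed here.

```python
# Hypothetical "de-identified" records: names removed, other fields kept.
deidentified = [
    {"postal": "M5V", "birth_year": 1975, "sex": "F", "diagnosis": "asthma"},
    {"postal": "M5V", "birth_year": 1982, "sex": "M", "diagnosis": "diabetes"},
    {"postal": "K1A", "birth_year": 1975, "sex": "F", "diagnosis": "migraine"},
]

# Hypothetical second dataset that still carries names (e.g. a public list).
public = [
    {"name": "Alice Example", "postal": "M5V", "birth_year": 1975, "sex": "F"},
    {"name": "Bob Example", "postal": "M5V", "birth_year": 1982, "sex": "M"},
    {"name": "Carol Example", "postal": "K1A", "birth_year": 1990, "sex": "F"},
]

# Fields present in both datasets that, combined, can single someone out.
LINK_FIELDS = ("postal", "birth_year", "sex")


def link_key(record):
    """Build the combination of linking fields for one record."""
    return tuple(record[field] for field in LINK_FIELDS)


def reidentify(deid_records, named_records):
    """Return (name, diagnosis) pairs where the linking fields match exactly one name."""
    names_by_key = {}
    for person in named_records:
        names_by_key.setdefault(link_key(person), []).append(person["name"])

    matches = []
    for record in deid_records:
        candidates = names_by_key.get(link_key(record), [])
        if len(candidates) == 1:  # an unambiguous match re-identifies the record
            matches.append((candidates[0], record["diagnosis"]))
    return matches


if __name__ == "__main__":
    for name, diagnosis in reidentify(deidentified, public):
        print(name, "->", diagnosis)
```

In this toy example two of the three "anonymous" records are tied back to a name without any name ever appearing in the scrubbed list.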

As the Ontario Privacy Commissioner points out in a paper entitled Looking Forward: De-identification Developments – New Tools, New Challenges, de-identification can be done in ways that make re-identification difficult, despite musings by some to the contrary.
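One commonly discussed way of making re-identification harder is to coarsen identifying fields until every combination is shared by several people (often described as k-anonymity). The sketch below is only an illustration under that assumption; it is not taken from the Commissioner's paper, and the records, field names and generalization rules are hypothetical.

```python
from collections import Counter

# Hypothetical records with names already removed.
records = [
    {"postal": "M5V", "birth_year": 1975, "sex": "F"},
    {"postal": "M4C", "birth_year": 1978, "sex": "F"},
    {"postal": "M5V", "birth_year": 1982, "sex": "M"},
    {"postal": "M6G", "birth_year": 1986, "sex": "M"},
]


def generalize(record):
    """Coarsen fields: keep only the first postal character and the birth decade."""
    return {
        "postal": record["postal"][:1] + "**",
        "birth_decade": record["birth_year"] // 10 * 10,
        "sex": record["sex"],
    }


def smallest_group(rows, fields):
    """Size of the smallest group of rows sharing the same values for `fields`."""
    counts = Counter(tuple(row[field] for field in fields) for row in rows)
    return min(counts.values())


if __name__ == "__main__":
    raw_k = smallest_group(records, ("postal", "birth_year", "sex"))
    coarse = [generalize(r) for r in records]
    coarse_k = smallest_group(coarse, ("postal", "birth_decade", "sex"))
    print("smallest group before generalizing:", raw_k)
    print("smallest group after generalizing:", coarse_k)
```

A smallest group of 1 means at least one person is unique in the data; after generalizing, each combination here is shared by at least two records, at the cost of less precise data.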


  1. David Collier-Brown

    There’s a good discussion of what the data is used for in “A Robust Social Graph of the United States”, in the middle of

    Dr Moglen cites the U.S. Director of National Intelligence as saying “we’ve come to realize that we need a robust social graph of the United States. That’s how we’re going to connect new information to old information. I said let’s just talk about the constitutional implications of this for a moment. You’re talking about taking us from the society we have always known, which we quaintly refer to as a free society, to a society in which the United States government keeps a list of everybody every American knows. So if you’re going to take us from what we used to call a free society to a society in which the US government keeps a list of everybody every American knows, what should be the constitutional procedure for doing this? Should we have, for example, a law? He just laughed. Because of course they didn’t need a law. They did it with a press release on a rainy Wednesday night after everybody went home, and you live there now.”

  2. David Collier-Brown

    In breaking news, the site “Groklaw” just shut down, in my understanding as a direct result of the widespread traffic analysis and forced disclosure (of data and metadata) authorized by secret courts in the U.S.

    Groklaw was a site for techies interested in computer law, administered by a paralegal and a lawyer. It and Slaw used to be on my home-page tabs.

    The announcement is at

    A noisy debate on the subject is at