In the past two weeks, the New York Times reported that Microsoft has made a minor concession to European privacy authorities about how long it retains its log files. A committee of European privacy regulators had asked that these logs be kept for only six months. Microsoft's response? Eighteen months. Yahoo used to keep them for thirteen months and just announced it will cut retention to 90 days. Google keeps them for nine months.

The privacy implications of these seemingly innocuous log files have been underestimated, particularly when you think about the full picture of your private life that companies like Google may be assembling about you. A single entry in an ordinary web-server log usually contains just a tidbit of information. One "hit" on a website may look like this (but all on one line):

127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700]
"GET /apache_pb.gif HTTP/1.0" 200 2326
"http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)" 

The first bundle of numbers is the IP address of the computer that requested a particular web page. "Frank" refers to a userid, which is usually not enabled. The next field is the date and time of the request. Following that, and usually beginning with "GET", is the command your web browser sent to the server. The next bits are the status code returned by the server and then the size of the entity requested. Next is something called a "referer" (a misspelling of "referrer" that is built into the HTTP standard), which is the address of the page that led your browser to make the request, followed by details about your browser.
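As a rough illustration of how little effort it takes to pull these fields apart, here is a minimal Python sketch that splits the sample hit above into named parts. The field names and the `parse_hit` helper are my own labels for illustration, not part of any standard tool, and the pattern assumes the log uses exactly the layout shown above. The lone hyphen after the IP address is simply a field the server left empty.

```python
import re
from datetime import datetime

# Pattern for the log layout shown in the sample hit above.
# The group names are illustrative labels, not official field names.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) (?P<ident>\S+) (?P<userid>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)"'
)


def parse_hit(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    hit = match.groupdict()
    # Convert the text fields that have an obvious richer type.
    hit["timestamp"] = datetime.strptime(hit["timestamp"], "%d/%b/%Y:%H:%M:%S %z")
    hit["status"] = int(hit["status"])
    hit["size"] = None if hit["size"] == "-" else int(hit["size"])
    return hit


# The sample hit from above, joined onto one line.
sample = (
    '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
    '"GET /apache_pb.gif HTTP/1.0" 200 2326 '
    '"http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"'
)

hit = parse_hit(sample)
for field, value in hit.items():
    print(f"{field}: {value}")
```

Each hit on its own says very little. The privacy question arises once millions of such records can be sorted and grouped by IP address, userid or time.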

Since many people often share the same IP address (it could be one IP for an entire company or just a group of people in a house using the same internet connection), some have argued that an IP address is not personal information and that a log file therefore doesn't contain any. The problem is that even if an IP address is not directly connected to one individual, one can do some easy analysis to make the connections. After AOL released supposedly de-identified search logs to researchers, an intrepid reporter was able to track down at least one of the users who had some very personal health-related searches in the logs (see: Users identifiable by AOL search data).

What's additionally troubling from a privacy point of view is that the large internet companies, like Google, Yahoo and Microsoft, don't just have your search queries. Increasingly, they have a huge trove of data sources in their logs.

Take Google, for example. Google has their famous Google search. They also have GMail, Google Analytics, Google AdSense, Google Documents, Google Toolbar and more. Each time you "hit" one of their sites, you're in their logs. Most internet users hit Google's logs dozens of times a day and on many of those occasions aren't even aware that they're using a Google service. Google has what is probably the most popular and widely used network of online advertising: AdSense. Each time you go to a website that features Google's ads, your computer sends a request to Google's servers and that "hit" goes into their logs, along with the information about what site you were visiting, when you visited and what ad was served. If you click on the ad, even more information is collected and logged. But even if you don't visit a site with Google's ads, there's a very good chance that the webmaster is using Google Analytics to find out about usage of his or her site. (Full disclosure: I use Google Analytics for my site at www.privacylawyer.ca.) I should also note that Yahoo! and MSN also have advertising networks, which collect the same sort of information. What this means is that Google, Yahoo and Microsoft register in their logs a significant portion of your usage of the internet.

And if you have a Google, Yahoo! or MSN account, that hit can be connected to your account details, including your name.
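To make the point concrete, here is a small Python sketch of how hits on an advertising network's servers could be turned into a per-address browsing history. Everything in it is hypothetical: the IP addresses, times and sites are invented, and `sites_by_ip` is my own illustrative helper, not a description of how any of these companies actually process their logs.

```python
from collections import defaultdict
from urllib.parse import urlparse

# Hypothetical hits on an ad server: (IP address, time, referer).
# All values are invented for illustration.
ad_hits = [
    ("203.0.113.7", "2009-01-12 08:02", "http://news.example.com/politics"),
    ("203.0.113.7", "2009-01-12 08:15", "http://health.example.org/symptoms"),
    ("198.51.100.4", "2009-01-12 09:30", "http://sports.example.net/scores"),
    ("203.0.113.7", "2009-01-12 21:40", "http://jobs.example.com/listings"),
]


def sites_by_ip(hits):
    """Group hits by IP address, in time order, keeping only the site visited."""
    history = defaultdict(list)
    for ip, when, referer in sorted(hits, key=lambda h: h[1]):
        # The referer names the page that carried the ad, so its host
        # is the site the person was visiting.
        history[ip].append((when, urlparse(referer).netloc))
    return dict(history)


history = sites_by_ip(ad_hits)
for ip, visits in history.items():
    print(ip)
    for when, site in visits:
        print(f"  {when}  {site}")
```

A dozen lines are enough to go from raw hits to a timeline of the sites one address visited in a day. Joining that timeline to an account name would only require one more lookup.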

I don't think it's too far-fetched to think of a day when it will become standard for all investigations involving the internet to include a warrant served on Google or Yahoo! or Microsoft for all logs related to a particular user or IP address or both.

Next week, I'll discuss efforts being made by governments and law enforcement to make log retention mandatory.

David Fraser is a technology and privacy lawyer with McInnes Cooper in Halifax. He is the author of the Canadian Privacy Law Blog and is on Twitter as @privacylawyer.

One Comment on “Privacy and Internet Log Files”

  1. I've read a few blog posts and articles on this, and none have addressed the question of mirrored sites or backups. I can't be certain that log data sets are mirrored (simultaneously stored on another computer, perhaps in another country) or if the logs are backed up — but these practices need to be spelled out.

    The other area of interest is user control over identifying data. As users get more sophisticated and aware of their rights, companies are going to have to start offering services to meet requests for this information. Some form of self-reporting capability seems inevitable in the future. Taken a bit further, one should also be able to request a purge of all identifying data held by companies. Are these not, in some way, our data?

    Mike
    http://codetechnology.ca
