The Federal Court recently analyzed what portions of postal codes were personal information and how the data could be made suitably anonymous. Anonymizing data will become increasingly important under Canada’s proposed Consumer Privacy Protection Act and Artificial Intelligence and Data Act, currently at second reading as Bill C-27.
In Cain v. Canada (Health), 2023 FC 55, the Federal Court considered an application under the Access to Information Act for disclosure of the postal codes and cities of licensees entitled to grow medical marijuana. The applicant sought access to the ‘Forward Sortation Area’, namely the first three characters of each postal code. The Court upheld Health Canada’s decision to release only the first character of the postal codes.
The Court reached this conclusion because, in some regions, relatively few people live within a single Forward Sortation Area. There was therefore a risk that the first three characters, when combined with other information that was publicly available or had previously been released by Health Canada, could be used to identify a particular licensee.
Whether data was truly anonymous, or could instead be used to identify specific individuals, depended on what information the ‘adversary’ (i.e. the person who may have access to the purportedly anonymous information; see paragraph 74 of Cain) already had. In this case, the Court heard that previously released information on licensees of medical marijuana had been organized into a publicly available map.
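The group-size concern can be illustrated with a short sketch. It is not the method used by Health Canada or the Court; the postal codes, the function names and the threshold of three are all assumptions invented for illustration. It simply counts how many records share each truncated postal code and flags prefixes shared by only a few records.

```python
from collections import Counter

# Hypothetical postal codes, invented for illustration only.
POSTAL_CODES = [
    "K1A 0B1", "K1A 0B2", "K1A 1A1", "K0J 1P0",
    "M5V 2T6", "M5V 3L9", "M4C 1A1",
]


def generalize(postal_code: str, keep: int) -> str:
    """Keep only the first `keep` characters of a postal code."""
    return postal_code.replace(" ", "").upper()[:keep]


def small_groups(codes, keep, threshold):
    """Return truncated prefixes shared by fewer than `threshold` records.

    The threshold is an assumed policy choice, not a legal standard.
    """
    counts = Counter(generalize(code, keep) for code in codes)
    return {prefix: n for prefix, n in counts.items() if n < threshold}


# Three characters (the Forward Sortation Area) leave some very small groups.
print(small_groups(POSTAL_CODES, keep=3, threshold=3))
# One character produces larger groups in this toy data set.
print(small_groups(POSTAL_CODES, keep=1, threshold=3))
```

A count of records per prefix is only a rough proxy. The analysis in Cain also turned on how many people live in an area and on what other information an adversary may already hold.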
The proposed Consumer Privacy Protection Act would define the process of anonymizing data as:
“irreversibly and permanently modify personal information, in accordance with generally accepted best practices, to ensure that no individual can be identified from the information, whether directly or indirectly, by any means.”
The proposed legislation on the handling of personal information would not apply to “personal information that has been anonymized”. Therefore, organizations that may rely on anonymizing data to avoid the requirements of the Act, such as obtaining appropriate consent, should consider carefully the process used to anonymize the data.
The proposed Artificial Intelligence and Data Act includes provisions that would require organizations that process anonymized data for use with artificial intelligence systems to establish measures with respect to the manner in which the data is anonymized.
Related to anonymizing data is “de-identifying” personal data. De-identifying data means modifying it so that an individual can no longer be directly or indirectly identified without the use of additional information. For example, associating individual information with an account number rather than a name may be a form of de-identified personal data, but would still allow the proprietor to cross-reference the account number with the individual as needed. Anonymizing data, on the other hand, involves altering the data in some way so that it is no longer possible to identify a specific individual. The proposed Consumer Privacy Protection Act would impose requirements on organizations using de-identified personal information, including ensuring “that any technical and administrative measures applied to the information are proportionate to the purpose for which the information is de-identified and the sensitivity of the personal information”.
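The account-number example can be sketched as follows. This is a minimal illustration under stated assumptions: the record fields, the function names and the use of random tokens as account numbers are all hypothetical and are not drawn from the proposed legislation. The point is that whoever holds the key table can reverse the de-identification.

```python
import secrets


def de_identify(records):
    """Replace each name with a random account number.

    Returns the de-identified rows and the key table that links
    account numbers back to names (the "additional information").
    """
    key_table = {}
    rows = []
    for record in records:
        account = secrets.token_hex(4)
        while account in key_table:  # regenerate on the rare collision
            account = secrets.token_hex(4)
        key_table[account] = record["name"]
        rows.append({"account": account, "region": record["region"]})
    return rows, key_table


def re_identify(row, key_table):
    """Look up the individual behind a de-identified row."""
    return key_table[row["account"]]


# Hypothetical records, invented for illustration only.
records = [
    {"name": "A. Example", "region": "K"},
    {"name": "B. Sample", "region": "M"},
]
rows, key_table = de_identify(records)
print(rows)                            # no names appear in the rows
print(re_identify(rows[0], key_table)) # the key holder can still recover the name
```

Discarding the key table would not by itself make the data anonymous, since the remaining fields may still identify someone when combined with other information.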
The Ontario Government’s White Paper on Modernizing Privacy in Ontario discusses de-identifying and anonymizing personal data and recommends “clear definitions, requirements and standards” to guide the use of de-identified data. The Information and Privacy Commissioner of Ontario has published guidelines on de-identification. While these resources focus on de-identification, similar criteria and guidance will be useful for organizations relying on anonymized data. Supporting regulations for the Consumer Privacy Protection Act and Artificial Intelligence and Data Act, which have not yet been published, may also provide guidance on the type of anonymization or de-identification that would be appropriate.
The decision in Cain focuses on the practical aspects of anonymizing personal data, in particular the importance of considering what other information is available and how it may be combined with the purportedly anonymous information. For example, if two organizations (or the same organization at different times) anonymize the same data set in different ways, a motivated third party may be able to combine the two data sets to de-anonymize the data and expose personal information.
Similar issues about the disclosure of personal information could arise if anonymous or de-identified data is obtained from multiple sources and combined to identify specific individuals. This appears to be the primary concern considered in Cain, namely whether the information already available to a recipient, including potential personal knowledge of medical marijuana licenses, could be used to identify individuals in the purportedly anonymous information if a more specific location were provided.
Organizations anonymizing personal information will have to be careful about the process. They will need to balance the intended use of the anonymous data against its audience, and to consider that the use or audience of the anonymized data may change over time and that the ability of adversaries to analyze the data may improve in the future. Guidelines or best practices will need to be developed and adopted to ensure that purportedly anonymous data is not released haphazardly and that its release does not put an organization in violation of privacy legislation. As found in Cain, even releasing more than one character of a postal code could amount to disclosing protected personal information.
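The risk of combining releases can be illustrated with a toy linkage sketch. Everything in it is an assumption made for illustration: the two releases, the field names (`issue_date`, `fsa`, `city`) and the values are invented and do not describe the records at issue in Cain. Each release looks harmless alone, but joining them on a shared field singles out one record.

```python
from collections import defaultdict

# Two hypothetical releases of the same underlying data set.
RELEASE_A = [
    {"issue_date": "2020-01-15", "fsa": "K0J"},
    {"issue_date": "2020-03-02", "fsa": "M5V"},
    {"issue_date": "2020-03-02", "fsa": "M4C"},
]
RELEASE_B = [
    {"issue_date": "2020-01-15", "city": "Exampleville"},
    {"issue_date": "2020-03-02", "city": "Toronto"},
    {"issue_date": "2020-03-02", "city": "Toronto"},
]


def link(release_a, release_b, key):
    """Join two releases on `key`, keeping only unambiguous matches.

    A value of `key` that appears exactly once in each release lets an
    adversary merge the two rows into one more detailed record.
    """
    by_key_a = defaultdict(list)
    by_key_b = defaultdict(list)
    for row in release_a:
        by_key_a[row[key]].append(row)
    for row in release_b:
        by_key_b[row[key]].append(row)

    linked = []
    for value, rows_a in by_key_a.items():
        rows_b = by_key_b.get(value, [])
        if len(rows_a) == 1 and len(rows_b) == 1:
            linked.append({**rows_a[0], **rows_b[0]})
    return linked


print(link(RELEASE_A, RELEASE_B, "issue_date"))
```

In this toy data only the record dated 2020-01-15 is unique in both releases, so it is the only one that can be linked with certainty. Real linkage attacks may also draw on outside knowledge that appears in neither release.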