The privacy "problem" of Open Data
(This page is part of my 2011 report on “Open Data: Emerging trends, issues and best practices”. Please follow that link to reach the Introduction and Table of Contents, but don’t forget to also check the notes for readers! of the initial report of the same project, “Open Data, Open Society”.)
Being perceived as a lethal attack on privacy remains one of the biggest misunderstandings preventing the adoption of Open Data. On one hand, there is no doubt that in an increasingly digital world it becomes harder and harder to protect privacy. But, exactly because the whole world is going digital, attacks on privacy and on civil rights in general can come, and are coming, from so many other directions that those from (properly done) Open Data are a really tiny percentage of the total.
This is a consequence of the fact that data about us end up online from the most diverse sources (including ourselves and our acquaintances), and that it would often be very hard to discover, never mind prove, that they have been used against our interests. There have been concerns, for example, that insurance companies may charge higher life insurance premiums to those among their customers who… put online a family tree showing that they come from families with a lower than average life expectancy.
Assuming such concerns were real, would it always be possible to spot and prove such abuses of data that weren’t even published by any Public Administration? Of course, publishing online the complete, official Census data of several generations, in a way that would make such automatic analyses possible, would be a totally different matter.
Getting rid of all the unjustified concerns about privacy is very simple, at least in theory. All that is needed to dismiss for good the idea that Open Data is a generalized attack on privacy is to always remember, and explain, that:
Most Open Data have nothing personal in them to begin with (examples: digital maps, budgets, air pollution measurements...)
The majority of data that are directly related to individuals (e.g. the names and addresses of people with specific diseases, or of people who were victims of some crime) have no reason to be published, nor is there any actual demand for them from Open Data advocates
Exceptions that limit privacy for specific cases and categories of people (e.g. candidates for public office, Government and Parliament members etc.) already exist in many countries
Very often, in practice, struggles over Open Data are only about when and how to make information that was already recognized as public available in the way that is most effective for society. What to declare public, hence open, is indeed a serious issue (more on this in the next paragraph), but it is a separate one.