The privacy "problem" of Open Data

(Paywall-free popularization like this is what I do for a living. To support me, see the end of this post)

(This page is part of my 2011 report, “Open Data: Emerging trends, issues and best practices”. Please follow that link to reach the Introduction and Table of Contents, but don’t forget to also check the notes for readers of the initial report of the same project, “Open Data, Open Society”.)

Being perceived as a lethal attack on privacy remains one of the biggest misunderstandings preventing the adoption of Open Data. On one hand, there is no doubt that in an increasingly digital world it becomes harder and harder to protect privacy. But exactly because the whole world is going digital, attacks on privacy, and on civil rights in general, can come and are coming from so many other directions that those from (properly done) Open Data are a really tiny percentage of the total.

This is a consequence of the fact that data about us end up online from the most diverse sources (including ourselves and our acquaintances), and that it would often be very hard to discover, never mind prove, that they have been used against our interests. There have been concerns, for example, that insurance companies may charge higher life insurance premiums to those customers who… put online a family tree showing that they come from families with a lower than average life expectancy.

Assuming such concerns were real, would it always be possible to spot and prove such abuses of data that weren’t even published by any Public Administration? Of course, publishing online complete, official Census data spanning several generations, in a way that would make such automatic analyses possible, would be a totally different matter.

Getting rid of all the unjustified concerns about privacy is very simple, at least in theory. All that is needed to dismiss for good the idea that Open Data is a generalized attack on privacy is to always remember and explain that:

Most Open Data have nothing personal to begin with (examples: digital maps, budgets, air pollution measurements…)
The majority of data that are directly related to individuals (e.g. names and addresses of people with specific diseases, or who were victims of some crime) have no reason to be published, nor is there any actual demand for them from Open Data advocates
Exceptions that limit privacy for specific cases and categories of people (e.g. candidates for public office, Government and Parliament members, etc.) already exist in many countries
Very often, in practice, Open Data struggles are only about when and how to make available, in the most effective way for society, information that was already recognized as public. What to declare public, hence open, is indeed a serious issue (more on this in the next paragraph), but it is a separate one.

Who writes this, why, and how to help

I am Marco Fioretti, tech writer and aspiring polymath doing human-digital research and popularization.
I do it because YOUR civil rights and the quality of YOUR life depend every year more on how software is used AROUND you.

To this end, I have already shared more than a million words on this blog, without any paywall or user tracking, and am sharing the next million through a newsletter, also without any paywall.

The more direct support I get, the more I can continue to inform, for free, parents, teachers, decision makers, and everybody else who should know more stuff like this. You can support me with paid subscriptions to my newsletter, donations via PayPal (mfioretti@nexaima.net) or LiberaPay, or in any of the other ways listed here. THANKS for your support!

