If Open Data are so good, why aren't all Public Data already open?

(Paywall-free popularization like this is what I do for a living. To support me, see the end of this post)

(This page is part of my Open Data, Open Society report. Please follow that link to reach the introduction and Table of Contents, but don’t forget to check the notes to readers!)

Much of the current civic activity around Open Data still happens in the conditions described in a blog post from Mash the State: "the great independent civic websites using public data are mostly having to scrape and steal it. Very few councils will even acknowledge them, let alone co-operate with them."

Sometimes this happens because of issues which are much more general than PSI availability, from limits to freedom of speech to lack of affordable Internet connections and other physical infrastructures. Very often, however, at least in the EU, PSI data aren’t available for a combination of much less serious reasons. The Danish addresses study, for example, also indicates that in the Central Business Register (CVR) and the utilities sector, usage of the official addresses is still limited due to technical, traditional and legislative barriers. Here’s a summary of the most common reasons why PSI data aren’t open yet:

Pure and simple lack of real awareness about the importance and benefits of Open Data is still the norm in many government organizations (even if they should digitize all their procedures and documents anyway, for their own good or to comply with some local e-Government directive, if they haven’t done it yet). Side by side with ignorance, lack of explicit guidelines on data reuse from upper levels and fear of losing control are powerful motivators to do nothing, and therefore to keep data locked.
Legal barriers, or (even worse) serious confusion about the legal status of data. This happens when data come under restrictive or unclear terms of use, or simply without any terms of use at all, which is even worse. Under current legislation and international treaties, the default status of any creative work, including PSI data, is “All rights reserved” for many decades, so no re-use is possible without explicit authorization. But when datasets were produced by assembling data from many different public and private bodies without a clear single policy (not an unusual case), even figuring out who is entitled to authorize reuse can become a costly legal procedure.
Fear of embarrassment deriving from publishing low-quality material: “we can’t publish this data, because there are errors in it” (Zijlstra, Business case for PSI). Torkington reports the same issue from New Zealand: “serious problems exist in some datasets. Sometimes corners were cut in gathering the data, or there’s a poor chain of provenance for the data so it’s impossible to figure out what’s trustworthy and what’s not.”
Last but not least, money. We explained that raw data are like soil: a generic foundation upon which wealth is created in many different ways which are basically impossible to predict. The “dark side” of this power is that the administrations that first see the extra money generated by Open Data are almost never the same ones that created the data and should have opened them in the first place. This makes it quite difficult for a public body without other sources of external funding, and with no policy imposed from the top, to see anything beyond its own real or perceived short-term benefits from selling data, even if much more public money will be spent, or not gained, in the big picture. Even when data are already available at no charge to the public, really opening them (that is, deciding the proper license, getting approval for it and reformatting everything for online publication in the right formats) is an extra expense that is very often perceived as not easily affordable or justifiable.

Who writes this, why, and how to help

I am Marco Fioretti, tech writer and aspiring polymath doing human-digital research and popularization.
I do it because YOUR civil rights and the quality of YOUR life depend more every year on how software is used AROUND you.

To this end, I have already shared more than a million words on this blog, without any paywall or user tracking, and am sharing the next million through a newsletter, also without any paywall.

The more direct support I get, the more I can continue to inform, for free, parents, teachers, decision makers, and everybody else who should know more stuff like this. You can support me with paid subscriptions to my newsletter, donations via PayPal (mfioretti@nexaima.net) or LiberaPay, or in any of the other ways listed here. THANKS for your support!
