Practical advice for Public Officers promoting Open Data

(This page is part of my Open Data, Open Society report. Please follow that link to reach the introduction and Table of Contents, but don’t forget to check the notes to readers!)

Don’t worry (initially) about data quality

Do not worry, at least initially, about data quality. Graves explicitly recommends that public sector bodies make PSI available at the earliest point at which it is useful to businesses and citizens. In practice, this means as soon as possible, and “quality” shouldn’t be an issue. First of all, data should be open because they are like soil: not even their creators can possibly know all the ways to use them. A corollary is that those same creators cannot always be sure that they have all the elements needed to judge whether the quality of their data is good or poor.

Even if quality were not sufficient for the Public Administration (which would then be a problem to solve regardless of openness!), it may already be good enough for third parties. In the second place, quoting the Business case for PSI, “quality of published datasets actually increases, both because of much more feedback from end users and because of more attention being given to generating the data in the first place”. When the Greater London Authority asked the developer community how it should release its data, the response was clear:

“Go ugly early – don’t worry about formats – just get the data out there and we will help you to clear it up… The sooner you can get datasets up and build sample applications to demonstrate the purpose and benefits of open data, the more likely you are to encourage other people to give you their data”. Data quality is a case where the slogans of the Open Source Software movement, “release early, release often” and “given enough eyeballs, all bugs are shallow”, really do apply and can give positive results.

Beware of looking too hard for a “business case for open data”

Speaking of how to justify opening PSI, Zijlstra rightly says: “The business case has a long history of being abused to stop change: business cases are fine for investments that are one all or nothing decision about something of which all possible returns are known in advance and will happen in the same department that is considering the business case” (something we have already said is impossible). When it comes to Open Data, instead, according to many experts, “it is very difficult, if not impossible in some cases, to quantify in advance the economic value of opening PSI”. Even the already quoted MEPSIR report says: “It turns out to be impossible to draw conclusions… at the level of the domains of PSI (e.g. legal information, social data, meteorological information, geographical information, and business information)… Generic business cases for open PSI cannot be made in a way that is relevant for a specific public body”. Claims of huge savings at EU level mean very little to a local manager who has already finished her budget.

Another reason not to put too much faith in “business case” evaluations is that some costs become apparent because of an open PSI project without being caused by it. Usage of, or conversion to, open file formats is already required by law in several EU countries, so it is a cost that must be paid sooner or later, regardless of openness. In practice, it may make much more sense not to look for a traditional business case but simply to start gradually: locally, as we already recommended, and/or by opening first, as soon as possible, the data sets that are easily available and non-controversial.

Another valid criterion for deciding which datasets should be opened first is to start with those for which community-generated alternatives already exist (e.g. OpenStreetMap). The existence of such alternatives proves the public need for, and interest in, those data. Releasing the official ones will therefore allow those communities to concentrate on adding value to the existing raw data and improving their quality, rather than regenerating everything from scratch.

Other practical advice about Open Data for public officers

Other dos and don’ts for Public Administrations, partly derived from the list first issued by the London Datastore, are:

  • Don’t waste scarce resources on expensive consultancy firms unless you are already sure, and have verified, that no cheaper help will come from the community
  • Avoid building new, possibly very expensive official public websites unless it is absolutely necessary. Just put the raw data online instead: “Councils need stand-alone open data projects with their own resources and budgets. Lumping it in with general website work has demonstrably failed to give open data the priority it deserves”. The only work that really needs to be done on many public websites, and which is only partially related to PSI openness, is to make them search-engine friendly
  • Publishing PSI online is not enough. It is also necessary to establish and maintain connections with end users, in order to get (often for free) their help in improving the data, to measure reuse, and to publish those measures