Data alterations and financial sustainability

(This page is part of my 2011 report on “Open Data: Emerging trends, issues and best practices”. Please follow that link to reach the Introduction and Table of Contents, but don’t forget to also check the notes for readers of the initial report of the same project, “Open Data, Open Society”.)

Some concerns about the limits of Open Data are about what may happen, or stop happening, to data before they are published online. The most common concerns of this type are (from Open Public Data: Then What? - Part 1):

  1. Opening up Public Sector Information (PSI) causes those data to no longer be produced, or to be produced only as private property by private corporations, because the public agencies whose job was to produce those data can’t sell them anymore.

  2. Total accessibility of data provides more incentives to tinker with them, at the risk of reducing trust in institutions and inhibiting decision-making even more than today.

Data manipulation is the topic of the next paragraph. Speaking of costs, a point to take into account is that, once data are open and routinely used and monitored by as many independent users as possible, even the cost of keeping them up to date may be significantly reduced. In other words, in the medium to long term, Open Data may reduce the need to periodically perform complete, and therefore very expensive, studies and surveys to update a whole corpus of data in one run.

Besides, and above all, even if opening data always destroyed every source of income for the public office that used to create and maintain them, this problem would only exist for the PSI datasets that are already sold today. Such data, even when of strategic importance, as is the case with digital cartography, are only a minimal fraction of all the PSI that could and should be opened to increase transparency, reduce the costs of Government and stimulate the economy. In all these other cases:

  • the money to generate the data already comes from some source other than sales and licensing (but even with those data it may be possible to generate them by crowdsourcing, thereby reducing those costs!)

  • the only extra expense caused by publishing those data online (assuming they’re already available in some digital format, of course!) would be the hosting and bandwidth costs, which may be greatly reduced by mirroring and other technical solutions like torrents, already widely used to distribute Free/Open Source Software (FOSS) through the Internet.