Open data vs optimistic assumptions about citizens’ data literacy

(This page is part of my Open Data, Open Society report. Please follow that link to reach the introduction and Table of Contents, but don’t forget to check the notes to readers!)

Lies, damn lies and statistics. Leonard Henry, Baron Courtney of Penwith, 1895

Internet access greatly increases opportunities for access to information. However, it does not magically give all people the skills they need to interpret what they find. Even the so-called digital natives are simply citizens born into a world where digital technology was already commonplace. That’s all the term really means and it has nothing to do with how digitally savvy they actually are. Assuming otherwise would be like assuming that all the people born after FM radio or analog TV became mass media are surely fully aware of all the ways those other technologies can influence their judgment.

Information is power and, as such, can be used to disempower or manipulate people, especially when it’s used as a tool of fear. This is particularly evident with data tied to public security, like sex offender registries or other crime mapping tools, but it is an absolutely general problem. Even simple lists of “risky” locations, like the Control of Major Accident Hazards (COMAH) directory in the UK, can generate panic, or at least confusion, if not released with context.

[Image: /img/asborometer.png]

An Anti-Social Behavior Order (ASBO) is a civil order made against a person who has been shown, on the balance of evidence, to have engaged in anti-social behavior in the United Kingdom and in the Republic of Ireland. In February 2010 the most popular free download in the UK was the ASBOrometer, a mobile application that measures levels of anti-social behavior at one’s current location by looking at the number of ASBOs issued to residents of that area. Its release prompted comments like “a developer has seen the future, and it’s anti-social networking”.

In and of itself, information doesn’t necessarily lead people toward proactive solutions. In the worst cases, the information that reaches the public first may simply be the information that strengthens the position of the power group already in charge.

A big part of the reason for this problem is that transparency is not enough without real interest and literacy among the general public. In this context, “literacy” means the combination of computer, digital media and traditional math skills needed to put sources, numbers and other information in context and to interpret everything as objectively as possible. For example, very often the age or release date of some data is at least as important as its actual value or its source. The consequence is that the largest class of PSI end users, that is, responsible citizens, should get used to the idea of data versions and version dependencies, just as they have already done, or should have done, with versions of software programs. This kind of literacy is far from widespread these days, is not evenly distributed across all segments of the population, and isn’t something that people develop just because broadband comes to town or information is available online. It would be naive to assume otherwise.
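To make the idea of “data versions” a little more concrete, here is a minimal sketch, in Python, of what it could mean to keep a figure together with its source, version and release date, and to flag it when it gets old. Everything in it is an assumption made for illustration: the names PublishedFigure, age_in_days, is_stale and cite, the one-year threshold and the sample numbers are all invented, and no real dataset or publisher works exactly this way.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PublishedFigure:
    """A number together with the context needed to interpret it."""
    label: str
    value: float
    source: str
    released: date  # when this version of the data was published
    version: str    # hypothetical version label chosen by the publisher


def age_in_days(figure: PublishedFigure, today: date) -> int:
    """Days elapsed between the release of the figure and `today`."""
    return (today - figure.released).days


def is_stale(figure: PublishedFigure, today: date, max_age_days: int = 365) -> bool:
    """True if the figure is older than an (arbitrary) age threshold."""
    return age_in_days(figure, today) > max_age_days


def cite(figure: PublishedFigure) -> str:
    """Render the figure so that source, version and date travel with the value."""
    return (f"{figure.label}: {figure.value} "
            f"(source: {figure.source}, version {figure.version}, "
            f"released {figure.released.isoformat()})")


# Invented example data, for illustration only.
old = PublishedFigure("Reported incidents", 120.0, "Example City Council",
                      date(2009, 1, 1), "v1")
new = PublishedFigure("Reported incidents", 95.0, "Example City Council",
                      date(2010, 3, 1), "v2")
today = date(2010, 6, 1)

for figure in (old, new):
    flag = "STALE" if is_stale(figure, today) else "current"
    print(f"[{flag}] {cite(figure)}")
```

The point of the sketch is only that the value never travels alone: whoever republishes the number also republishes where it came from, which version it is and when it was released, so that a reader can tell an outdated figure from a current one.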

If literacy is absent, data taken out of context, or taken at face value by people without the skills to interpret them, can have unintended consequences, like generating fear or loss of interest instead of engagement (this is another reason for linked data: they provide at least some context by themselves). As an example of these risks, Danah Boyd quotes, in a talk on which part of this paragraph is based, “the statistic from 2006 that 1 in 7 minors are sexually solicited online. Most people interpret this statistic as suggesting that 1 in 7 minors are sexually solicited by older sketchy adults seeking to meet minors offline for sex. But over 90% of sexual solicitations are from other minors or young adults, 69% of solicitations involve no attempt at offline contact and the term ‘solicitation’ refers to any communication of a sexual nature, including sexual harassment and flirtation”.

These are not theoretical concerns. The author has personally seen several bloggers happily republish obviously absurd assertions, such as “in 2003 Microsoft got from the Italian state more money than the state deficit in that year”, even after being told that they made no sense. In the USA, a 2010 study concluded that about 70% of students in Grade 6 “exhibit misconceptions” about the equal sign. Tests performed in Italy in the same year on 125,389 primary and junior high school students showed math skills decreasing with age: correct answers to math tests were 61.3% among 10-year-old students, but only 50.9% among 11-year-old ones. Also in 2010, a report on “Trust Online: Young Adults’ Evaluation of Web Content” concluded that students rely greatly on search engine brands to guide them to what they then perceive as credible material: over a quarter of respondents mentioned that they chose a Web site only because their preferred search engine had returned that site as the first result.

Obviously, this report and many other sources still show that it is necessary to open as much PSI as possible, if nothing else to give private entrepreneurs more opportunities to start new businesses. Our point here is simply to remind readers that opening PSI can be enough in that sphere, but is far from enough when it comes to transparency in government, at any level. That can only happen if there is mass interest in, use of and understanding of Open Data.