It's time for Open Data in and from (not "ABOUT"!) schools


This essay expands a proposal on Open Data in schools that I made in 2011, which requires very little, if any, funding and central authorization/coordination to be implemented. As of this writing, I know of no other proposal of the same kind, with the exception of this 2012 presentation from New Zealand. Also, I have not heard of any large-scale implementation, or had occasion to do any real work on this topic. However, I am even more convinced now than in 2011 that the idea has great potential.

Here I describe the proposal in detail, providing some anecdotes and examples to show how it may work (or is already working), and then suggest one way to implement it in a scalable way, taking into account some obstacles (both objective and perceived ones). While this is not explicitly declared in the rest of the essay, many points of this proposal apply not only to Open Data in the strictest sense, but also to Open Access and (production of) Open Educational Resources.

The common problems in Open Data and Education, and part of their solution

In order to better explain the proposal, it is useful to start from two classes of projects that are, on one hand, great Open Data activities for schools and, on the other, examples of what this proposal is not. One is At School with OpenCoesione (ASOC), which is described in another chapter of this ebook. The ASOC project promotes, inside Italian schools, the study, reuse and improvement of Open Data about the Cohesion Policy of the European Union.

OpenStreetMap (OSM) is an Open Data/Open Knowledge project and community that, in my opinion, every teenager should (at least!) know about. Initiatives to bring OSM to students range from occasional Cartoparties to multi-year projects like the one on Open Culture and Free Maps in a High School of Rovereto, Italy (also described here and here).

The goal of that project, which came from a proposal made in 2008 by Maurizio Napolitano, was to promote in students “a conscious and critical use of the ICT technologies behind acquisition and publication of Free (as in Freedom) geographic data”. In practice, students learned why and how to contribute to OpenStreetMap by using Wikisource, Wikimedia and the OSM SlippyMapGenerator to build two thematic maps that include notions and concepts they were studying in history, arts, literature and other courses.

Proposals like those in the previous paragraphs are excellent and necessary: they provide extra skills and educate students in civic service and active citizenship. The more schools follow them, the better. The attractiveness of these proposals, however, is also their natural, built-in limit: they all consist, at least in great part, of extracurricular activities. They are “extra” (almost surely unpaid) work, in a world in which the motivation and empowerment of students and teachers alike are not exactly at satisfying levels. The practical consequence is that only students and (above all) teachers with “extra” time, resources, support, energy, motivation and so on can “afford” the addition of Open Data to their work.

The issues above are the main reasons why, at the 2011 Open Government Data Camp (OGD), I proposed what I still consider a necessary and complementary way to bring Open Data in and out of schools. Since 2011, I have held a workshop on this proposal, discussed it with as many teachers and Open Data activists as I could, and included it in the official “OPEN DATA ACTION PLAN 2014-2020” of the Italian Regione Veneto. All these activities confirm, for me, the value of the original proposal, as described here with many more details than the original presentation.

Two general Open Data issues I try to address are well expressed by these two things I heard at the 2011 OGD Camp:

  • Rufus Pollock: “Open Data has no value if it isn’t used. We need now open tools and communities that utilize Open Data”

  • Ton Zijlstra: “[to make Open Data in your local community] get started with nothing, offer free beer, do things immediately within your power”

In 2011 we had, and in my opinion still have today, two big problems with Open Data: one is getting enough data in the right formats, as soon as possible. The other, which is much more serious, is getting enough citizens really interested in Open Data, and using them on a regular basis.

Then there is a problem related to these, but at another level, which is what I described in “When Open Data meets Show and Tell”: young people have always criticized politics, and public institutions in general, for many things, not least their complexity, and their distance from their world. This, of course, is absolutely natural and even necessary, to a degree. Today, however, too many young people just give up that “fight” altogether. They are convinced that no form of participation or dialogue with “institutions” could ever be remotely interesting or relevant for their lives and future. This is a terrible waste of energies for society.

Finally, in the education world, we hear plenty of cries worldwide that students (and teachers!) are not motivated enough by school curricula deeply disconnected from real life.

I believe that one natural, necessary part of the solution to all these problems is:

  • systematic use and production of Open Data in High School, and possibly even before that.

  • but inside everyday, traditional school activities, that is without adding extra “load”, or turning existing curricula and teaching practices upside down

We must work to make the “marriage” of ordinary curricula and homework with usage and production of Open Data the rule, not the exception, in grade school (age 6-18). This must obviously happen in different ways for each combination of country, age range and school type, but the journey can be much less expensive and complex than it may seem, and the potential results too good to not try.

On the education side, Open Data may help students to learn better, by making lessons and homework more satisfying (for both students and teachers!). Open Data may facilitate understanding of subjects otherwise perceived as deadly boring, from calculus to taxonomy or literature. Eventually, it may also stimulate more active participation, from young people, in public administration and services. It may help them to “touch” with their hands both that government can’t really be as simple as they sometimes imagine, and that they too can contribute a little to make it better.

On the Open Data side, if usage and production of Open Data happens as often as possible, inside ordinary school activities, and if it makes them more engaging, and helps achieve better grades, there is no more need to “promote Open Data” as such. I’ll dare say that Open Data needs students, in order to both get data and stay relevant.

Before looking at how all this may actually happen in practice, it is necessary to make two things clear. One is that this way of working is accessible, if on a reduced scale, to all schools, students and teachers, not just those with plenty of computers and broadband Internet connections. Usage and contribution to OpenStreetMap, for example, is possible even with paper maps, thanks to the Field Papers service. The same applies to many other kinds of data. When the goal is to prepare exercises for class tests or homework, it is often enough to download real world datasets (or upload them) from a public library, and then use them on disconnected computers, or in many cases to just print them out.

The other thing to keep in mind is that production of Open Data is as important as their usage, regardless of their initial quality. The whole point of Open Data is continuous reuse and improvement, by whoever can do it, isn’t it? Linked, Five-Stars Open Data are a goal, to reach in many, incremental little steps. Not a prerequisite to release data. There are thousands and thousands of real world cases in which, if we waited for Five Star data to publish something, we’d never get anything at all.

Grade school students already continuously collect huge amounts of data of all sorts. Primary school kids collect the locations of city toilets and similar info for their “know your town” Show and Tell days. High School students find numbers in newspaper articles and copy them into digital files. If all this stuff (which, often, is digitized anyway!) were systematically put online, even if only as CSV files with meaningful column names and the right license (a task well within the reach of most computer or smartphone users!), data hackers would get for free, every semester, gigabytes of stuff ready to be processed, analysed and refined.
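As a minimal sketch of what "a CSV file with meaningful column names and the right license" could look like in practice, the following Python snippet builds the CSV text and a short license notice to publish next to it. Everything specific here is an assumption made up for illustration: the sample rows, the column names, the helper names `to_open_csv` and `license_notice`, and the choice of CC0 1.0 as the license. A real class would pick its own columns and whatever open license its school prefers.

```python
import csv
import io

def to_open_csv(rows, fieldnames):
    """Return CSV text whose first row holds descriptive column names."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

def license_notice(title, author, license_name):
    """Return a plain-text notice to publish alongside the CSV file."""
    return f"Dataset: {title}\nAuthor: {author}\nLicense: {license_name}\n"

# Hypothetical observations from a "know your town" project.
FIELDS = ["name", "latitude", "longitude", "wheelchair_accessible"]
toilets = [
    {"name": "Station square", "latitude": 45.1234,
     "longitude": 11.5678, "wheelchair_accessible": "yes"},
    {"name": "Public garden", "latitude": 45.1301,
     "longitude": 11.5712, "wheelchair_accessible": "no"},
]

csv_text = to_open_csv(toilets, FIELDS)
notice = license_notice("City toilets survey", "Class 3B", "CC0 1.0")

print(csv_text)
print(notice)
```

The two strings can then be saved as, for example, a `.csv` file and a `LICENSE.txt` file in the same folder and uploaded anywhere the school already publishes material.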

Another important characteristic of this way to produce Open Data is that there is no real need to bother about its cost effectiveness, how many times each specific dataset will be reused and similar metrics (without denying their relevance in other contexts, of course). We are talking of “data” that must be produced and published, often already in some digital format, anyway. At that point, the mere act of adding a license declaration to such files and putting them online somewhere costs practically nothing. Even if the “only” result of such activities were to embed, in some students, the idea that all the data that should be public should be automatically published in this way, it would be a good result, and one achieved at really small cost.

For these reasons, all schools should:

  • Ask students to scrape data using existing tools

  • Routinely put online as Open Data everything produced during normal school activity (wandering around, observing and reporting about something is already part of many school projects)

  • Make an official school policy of it

Some examples and suggestions

This chapter describes some practical ways (including some cases in which it already happened) in which Open Data may be used and produced during school activities. Some of the data sources mentioned, or produced, are not Open Data today, but that doesn’t change the value of the example (students one year may produce the Open Data used by their successors…).

The easiest type of Open Data to use as I suggest is budgets. We promote Open Data by saying things like “If the budget of your City Council were Raw Open Data, you could check it yourself”. Then we wonder why almost nobody stops whatever else they are doing with their life to run those numbers, and forget that, every week, there are millions of people who must run some numbers as homework, or assign that homework, anyway.

How many of those people know that there are plenty of real, local budget data that can be used as courseware, ready to take (and improve) from websites like Openly Local?

Real world Open Data like those may, and should be, standard homework for accounting classes. Homework that would be more interesting (“Dad? I’ve found why the City didn’t fix our street last year, and got an A for it!"), and professionally useful later on, than the content of certain exercise books, which are already old when they are printed.
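To make the idea of budget data as homework more concrete, here is a small Python sketch of the kind of exercise a student could run. The figures, department names and column names are entirely made up for illustration, and so are the helper names `spending` and `yearly_change`. A real exercise would start from a budget file downloaded from an actual Open Data source, whose layout will almost certainly differ.

```python
import csv
import io

# Hypothetical excerpt of a city budget, in euros.
SAMPLE_BUDGET_CSV = """department,year,amount_eur
Roads,2012,120000
Roads,2013,45000
Schools,2012,300000
Schools,2013,310000
"""

def spending(csv_text):
    """Map (department, year) to the amount found in the CSV text."""
    table = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        table[(row["department"], int(row["year"]))] = int(row["amount_eur"])
    return table

def yearly_change(table, department, year):
    """Return how much spending changed compared with the previous year."""
    return table[(department, year)] - table[(department, year - 1)]

budget = spending(SAMPLE_BUDGET_CSV)
roads_change = yearly_change(budget, "Roads", 2013)
schools_change = yearly_change(budget, "Schools", 2013)

print("Roads:", roads_change)
print("Schools:", schools_change)
```

With these invented numbers, road spending drops by 75000 euros from 2012 to 2013 while school spending grows by 10000. That is exactly the kind of finding a student could then discuss in class.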

The same approach can be applied to almost any topic taught in high schools. This also applies to transforming into Open Data the results of the many forms of “Show and Tell” that are constantly happening in classes of all grades, for most subjects.

It is possible, for example, to prepare interactive, real-world based lessons and reports of history and art, by attaching pictures or all sorts of data to OpenStreetMap. At the same time, all that is needed to make Open Data of the findings of those or many other school projects is to publish the results online with the right license. In that way, next year’s class (or any other class interested in the same data, anywhere) will surely be able to reuse those data as a basis for other projects, instead of starting from scratch.

A few cases from the field: Brazil, Laos and Vancouver

Let’s now list some practical examples of how this may happen in practice, starting from some real world use cases. In 2013 I talked with several Brazilian teachers who already used or produced Open Data. As an example, Alexandre Gomes told me that:

“Usually, I ask students to create apps using Open (Gov) Data. Each class produces about 20 small apps. And, as long as most of the students are public officials, they use to spread their open data ideas in their work environment (public departments, ministries and agencies), fostering the conversation about how to publish public data and how to build new services with them.”

In 2010, Gomes' students also created some simple apps (not online anymore at the time of this writing) to analyze and display SELIC rates, Brazilian Senate salaries and the Census of education.

Roberto Pinho, who teaches Data Mining at an MBA, reported that he usually “points students towards open data sources” including, for Brazil:

I have read of a similar case in Laos: Students Make Use of Open Data. Back in 2012, students of the National University of Laos were introduced to, a portal said to host “the world’s most comprehensive collection of data on developing economies”, from energy to health, trade, poverty, and more. One of the participants said “Now I know how to use Open Data- it’s very useful indeed. It will help me do my research study”.

In 2010 David Eaves reported, as an “example of the long tail of public policy at work”, about “Victor Ngo, a student at the University of British Columbia who just completed his 2nd year in the Human Geography program with an Urban Studies focus” and other “University students using real Vancouver data to work on projects that could provide some insights, all while learning”.

Some suggestions

The following examples, not tested in class as far as I know, are some of the suggestions I collected either in my 2013 seminar, or while preparing it.

Arts/cultural data

It is possible to fetch (or produce, of course) many Open Data sets from Museums, even automatically, for example with the APIs in this list. The list of “Cool stuff made with cultural heritage APIs” gives plenty of real-world, reusable examples of how “linked open cultural data” may make class activities more productive and interesting, besides teaching valuable and reusable data-related skills… while NOT studying math, statistics or coding. Even Open Data aggregators like Europeana or the Digital Public Library of America can be used in this way.

Math, Statistics, Social Studies

At the “top” level, statistical portals like OECD and Eurostat allow everybody to download individual datasets or (at least in the second case) the complete database by using the bulk download facility.

Working in bottom-up fashion, instead, teachers and students may emulate Runeman, a reader of my blog. He kindly provided these suggestions and materials as bases for exercises in which students practice several skills to “collect automobile brand data, and analise brand popularity in different regions”.
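An exercise of this kind needs very little code once the observations are collected. The following Python sketch is not Runeman's material. It is only one possible way to turn a list of (region, brand) observations into popularity shares per region. The regions, the tallies and the function name `brand_shares` are assumptions invented for the example.

```python
from collections import Counter, defaultdict

# Hypothetical observations: one (region, brand) pair per car counted.
observations = [
    ("North", "Fiat"), ("North", "Fiat"), ("North", "Toyota"), ("North", "Ford"),
    ("South", "Toyota"), ("South", "Toyota"), ("South", "Toyota"), ("South", "Fiat"),
]

def brand_shares(pairs):
    """Return, for each region, the fraction of counted cars per brand."""
    counts = defaultdict(Counter)
    for region, brand in pairs:
        counts[region][brand] += 1
    shares = {}
    for region, counter in counts.items():
        total = sum(counter.values())
        shares[region] = {brand: n / total for brand, n in counter.items()}
    return shares

shares = brand_shares(observations)
for region, by_brand in shares.items():
    print(region, by_brand)
```

Students can collect the pairs on paper during a walk, type them in later, and then compare the shares between regions or with data gathered by other classes.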

Biology, Earth Sciences, etc

The Gulf Stream Voyage website “utilizes both real time data and primary source materials to help guide students to discover the science and history of the Gulf Stream”. The following paragraphs, copied straight from its home page, show how this Project, or raw data of the same kind from other sources, may be already used in High Schools as I suggest here:

Students will investigate this great ocean current [with] activities for marine science, earth science, chemistry, physics, biology, math, history and language arts… The activities are presented in a manner so that each may be used individually to supplement traditional classroom lessons… students access real time ocean data, atmospheric data and historical primary source materials.

Put Open Data into Show and Tell. One teacher at a time

If Open Data are to become familiar tools for students worldwide, the sooner they start using them (that is not later than High School, possibly earlier) the better. But nothing of this sort will happen, on a meaningful scale that is, without active participation of teachers.

Today’s teachers are often disenchanted, underpaid and, quite often, not really up to date on the possibilities of ICT for education. They also work, all too often, in schools with no money at all available for any “extra” activity.

For these reasons, I’ve often met the objection that “for such a project to take place, there need to be strong incentives and/or collaboration with the local Ministry of Education.”

But… is this true? I don’t think so. There is no doubt that any formal COMMITMENT to make all schools of a nation, or district, work in this way would need strong collaboration with at least the national Ministry of Education, PLUS a big budget. I cannot even exclude that it may require, in some countries, dedicated lawmaking and/or regulatory adjustments. If and when this is both wanted by the local authorities and feasible (as in “properly funded”) that’s great, of course. However, nothing of that kind is mandatory to make this change start.

This may also be a perfect case of distributed “innovation without permission”, with very little budget to just get it started along a gradual and much less bureaucratic and risky path.

First, and again: the proposal is to use and produce Open Data WHILE doing normal, already “due” school activities which are part of the official curricula. All these activities are easier if all students and teachers have personal computers and broadband, but many are possible even in offline classes.

Therefore, by definition, there is no need at all for anybody to change curricula, or for single teachers to ask for any permission or money, once they know what to do, how and why. This is true even for publication of school projects findings as Open Data, which as we already saw can be as simple as putting online, anywhere, some CSV file with the right license notice. The only exception may be if some school had some official policy that forbids publication of data that are not private, sensitive etc, which would be an interesting thing to see.

There is also no need at all for this change to be mandatory for everybody, at the same time, ever. As a completely made-up example, if in some country the double entry accounting system is only studied during Year 4 of “Accounting Professional School”, I say “OK, let’s suggest to THOSE classes, and only those, to use Open Data budgets as courseware”.

Above all, there is no need at all of anything that even remotely resembles “let’s train/pay/force ALL teachers to do this, starting next month”. We can (and should) simply start very small, from some realistic assumptions, and still get a great payoff.

The first assumption is that today, out of 100 teachers in any given country, only a small part of them would ALREADY like to work in this way. Teachers that just happen to be naturally inclined to it, and are already ready to engage and stimulate their students with different, more meaningful homework and tests. Let’s assume that, in any given country, such teachers are no more than 10% of the total.

Next, if my own, strictly personal experience of almost all the teachers in Italy and abroad with whom I’ve discussed this topic since 2011 is representative (and if it isn’t, it’s even better!) we can be pretty sure of another thing: of all those few teachers naturally “ready” for Open Data, only a SMALL percentage knows that the concept, practices and communities of Open Data EXIST, and how easy it could be to use them during normal class activities. I’d estimate that percentage to be not more than 5%.

If all this makes sense, the consequence is that today, in any given nation, there are nine thousand five hundred teachers out of every 100 thousand (100,000 × 0.10 × 0.95 = 9,500) who would already be happily using Open Data in their classes, if they only knew that it exists, and how to get started. Another way to put this is that, if the right information were available, we might have several thousand classes in each country using Open Data on a more or less regular basis, instead of… very, very, very few. Over time, each of those teachers may “convert” others… by just doing their daily job. And each school may move at its own pace, without any artificial deadline from above.
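The estimate above can be checked, and replayed with different guesses, in a few lines of Python. This is only a sketch of the arithmetic: the 10% and 5% figures are the assumptions stated in the preceding paragraphs, while the function and parameter names are made up for illustration. Percentages are kept as whole numbers so that the result is an exact integer.

```python
def teachers_waiting_for_open_data(total_teachers, ready_percent, aware_percent):
    """Estimate teachers who are ready for Open Data but do not know about it.

    ready_percent: share of all teachers naturally inclined to work this way.
    aware_percent: share of those ready teachers who already know Open Data.
    """
    ready = total_teachers * ready_percent // 100
    unaware = ready * (100 - aware_percent) // 100
    return unaware

# Assumptions from the text: 10% of teachers are ready, 5% of those are aware.
estimate = teachers_waiting_for_open_data(100_000, 10, 5)
print(estimate)
```

Changing the two percentages shows how sensitive the estimate is to those guesses. Even halving the share of "ready" teachers still leaves several thousand per 100 thousand.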

The conclusion is that it is perfectly OK, and sufficient, to make the right tools available to get a good result, both for schools and for the Open Data movement. A temporary list of these tools may include:

  • curated collection of the best Web scraping tools for use in schools, and of the related documentation

  • library of ready to use Open Data based courseware

  • contests, hackathons, etc. reserved to students and teachers, that create such courseware

  • “Open Data in Schools” success stories

  • contacts of “Open Data friendly” teachers, available to support colleagues and exchange experiences with them

  • An “Open Data for Schools” Manual, that is a synthesis of already existing works of the same type, but rewritten and simplified specifically for teachers

  • free online training for teachers

  • translations in several languages of all of the above

As far as I know, at the moment none of the items in that list already exists, or is already “packaged” as I suggest. If this is the case, the most likely explanation is that, while being several orders of magnitude less expensive than certain flagship “Education 2.0” programs, preparing and maintaining those resources is no pet project, doable in one’s spare time. Otherwise somebody, not necessarily me of course, would have already taken care of it. What do you think?