The thinking is that if governments publish open data, users will come to use it.
According to researchers at the University at Albany, that isn’t really happening.
“Open Government Data (OGD) impacts are limited, in part because of lack of user capabilities to re-use and analyze data,” said Mila Gasco-Hernandez, associate research director at the Center for Technology in Government at the University at Albany (CTG UAlbany). “Simply being aware of open data doesn’t promote its use. End users also need analysis skills which, when combined with the data itself, encourage the use of open data.”
Gasco-Hernandez is the lead author of “Promoting the use of open government data: Cases of training and engagement,” which was published on April 12 in the journal Government Information Quarterly. It is currently available online.
Beyond her work with CTG UAlbany, Gasco-Hernandez is an associate research professor in the Department of Public Administration and Policy at Rockefeller College of Public Affairs and Policy. Co-authors of the paper include Rockefeller College Associate Professors Erika Martin and Luis F. Luna-Reyes, along with UAlbany PhD students Luigi Reggi and Sunyoung Pyo.
To back up a step, “open data” is the idea that data sets should be freely available to use and republish, without copyright or patent restrictions. Through the lens of the team’s research, governments (local, state, or federal) offer data – for example, daily temperatures in downtown metro areas, traffic accident information, unemployment trends, anything really – and other public agencies, private companies, or citizens themselves can create analytical tools that lead to new conclusions about what that data means. These conclusions could then guide public policy decisions.
The idea is good in theory, but as Gasco-Hernandez and her team point out, it hasn’t really come to fruition in practice. Government data is available, but most people in both the public and private sectors do not know how to analyze it effectively.
“Everyone wants improved government transparency, citizen collaboration and participation, and innovation. The basic assumption is that once data is discoverable, accessible, available in alternative formats, and with licensing schemes that allow free re-use, diverse stakeholders will develop innovative data applications,” said Gasco-Hernandez. “The trouble is that there is a lack of technical skills and user training.”
In order to solve that problem, the team scrutinized three training programs – one right here at UAlbany, one in Italy and one in Spain – that effectively trained users to collect, present, understand and analyze data publicly for everyone’s benefit.
The first case studied was the newly redesigned core course, “Data, Models, and Decisions,” for first-year graduate students in UAlbany’s Public Administration Master’s program. The course includes six weeks of training on basic data literacy, how to manipulate data in Microsoft Excel, advanced data analysis and visualization in Excel, and basic data management in Microsoft Access. The curriculum focuses on Excel and Access because they are the most commonly used software packages in the public and nonprofit sectors.
In-class exercises primarily focused on the State of New York's hospital discharge data, while homework used restaurant inspection data. Both datasets are produced by the state's Department of Health and available through Health Data NY, the state's open health data portal.
In the second case, researchers looked at a civil society initiative created in 2013 called “Monithon” – a combination of “monitoring” and “marathon” – in which Italian open data activists used the open data portal OpenCoesione.gov.it to verify how the country spends Structural and Investment Funds from the European Union. These funds are one of the main sources of public investment in Southern Italy to support new businesses and infrastructure development such as broadband, local transportation, or water supplies.
Monithon created a curriculum for high-school students to participate in a six-month course focused on open data, data journalism, and civic monitoring of public spending. Monithon's training toolkits are also used to create local “civic monitoring schools.”
For the third case, researchers looked into the “Barcelona Open Data Initiative,” which has three main training focuses.
First, there is an introductory training program to educate citizens in the basic skills and knowledge needed to use and make sense of open data. Second, public employees working with open data, developers, and data journalists can take several four-hour classes on different topics, such as legal aspects of open data, reuse, linked data, and the social impact of open data. Finally, there is a tailored program where agencies and organizations can request specific training from the Initiative in specific areas for employees.
“These cases are valuable examples because they represent three geographical contexts, target different audiences, have alternative instructional designs and fee structures for participants, and use different datasets for instruction,” Gasco-Hernandez said. “Although a comparison of these three cases cannot answer a question about which format is most effective, these cases do illustrate how training programs can be easily adapted to different contexts and participant interests.”
In other words, the tools are there, while the talent is developing. There is a lot of room to grow.
The team pointed out that further studies can evaluate the short- and long-term impacts of these training programs in more detail. Other studies could focus on the types of topics and datasets that are most successful at “fostering curiosity and a lasting interest in using OGD, and how to make these trainings sustainable.”
“The main objective was to start the conversation on open data training and its contribution to raising awareness about open data and enhancing users' skills,” Gasco-Hernandez said. “As platforms and data offerings evolve, engaging new users and breaking down barriers to use are critical to unlocking the full value of open data.”