Johannes Kast on open data and big data… and the data revolution.
Big Data is shaping the way we look at the world and offers an alternative way of predicting what is going to happen next. And the amount of data is increasing exponentially. While 2.8 billion terabytes of data were saved in 2012, IDC predicts that this number will increase to 40 billion terabytes by the year 2020. Data is changing how we make sense of the world. It is drastically changing classic business models, and it has the potential to revolutionise the social sciences and the development sector.
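To put the two figures above in perspective, here is a minimal sketch that works out the yearly growth rate they imply. It assumes steady compound growth between 2012 and 2020, which is a simplification and not something the IDC forecast itself states. The function name `implied_annual_growth` is made up for this illustration.

```python
import math


def implied_annual_growth(start_value, end_value, years):
    """Constant yearly growth rate that turns start_value into end_value."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted in the text, in billions of terabytes.
stored_2012 = 2.8
forecast_2020 = 40.0

# Assumption: growth compounds at the same rate every year.
rate = implied_annual_growth(stored_2012, forecast_2020, 2020 - 2012)

# Years needed for the data volume to double at that rate.
doubling_years = math.log(2) / math.log(1 + rate)

print(f"Implied growth per year: {rate:.1%}")
print(f"Doubling time in years: {doubling_years:.1f}")
```

Under that assumption the numbers work out to growth of roughly 39 percent a year, meaning the stored volume would double about every two years.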
There is an obvious benefit for companies in using the user data they collect to analyse their markets and consumers, a practice that social media has monetised for a while now. And the tendency of government agencies to collect massive amounts of data has been demonstrated by the scope of the recent NSA scandal. However, the Open Data and Open Government trend, in which essentially unstructured data is made publicly available to everyone, is growing as well and could open up new possibilities for non-profits (or other third parties) to play a more active and creative role in shaping our world.
It can be argued that the data currently being released is supply driven when it should be demand driven, but several access points are already available. With more than 150,000 data sets and tools to use them, the US Open Data initiative is a step in the right direction, offering raw information on over twenty topics, such as agriculture, climate and education.
Many other nations and governmental agencies offer portions of their collected data as well. Data Catalogs is the most comprehensive list, currently with 390 catalogs of Open Data provided by local, regional and national governments, as well as by international organisations like the United Nations, the World Bank and NGOs.
Other places to find and make use of data are the centres of knowledge and research. Several universities have taken part in Dataverse Network projects, which originated at Harvard University in 2006. The Dataverse Network is an open source application for “sharing, citing, analysing and preserving research data.” Several other Dataverse Networks are being launched by universities in the US and in other countries, such as Holland and Denmark.
Some of the largest collections of data, and arguably the most insightful when it comes to mirroring our societies, are those created by the collective users of social media sites like Twitter and Facebook. It is estimated that Facebook ingests 500 times more data than the New York Stock Exchange. In cooperation with the most popular social media sites, GNIP is trying to make this ever-expanding wealth of information available to everyone. However, it comes at a price that depends on the size and focus of the project the data is being used for.
Another example of a public Open Data map is Natural Earth, which flexibly interprets data through visually appealing maps. Freebase is a community-driven, open repository of structured data (organised as graphs) covering over 39 million topics about real-world entities like people, places and things. Individuals, too, come up with innovative ways to collect and interpret data. The reddit user ieeamo came up with this interactive visualisation of a number of data sets on different topics.
And here is a collection of 20 Big Data repositories to check out, shared by Bernard Marr and posted on Data Science Central.
Big Data, Open Data and Open Government are on the rise and are being described by many as an impending Data Revolution. How will these massive amounts of data at our fingertips change our world, and are they potentially impactful enough to eradicate poverty? With the increasing speed at which information is created, shared and stored, and increasingly made accessible, we might soon find out.