Many development goals, policies and programs are based on numbers and statistics. How accurate are these numbers on the African continent, and can big data help improve their accuracy?
African statistics today
In his book, Jerven offers a devastating critique of the state of statistics on the African continent. He notes that the numbers being produced and published are neither reliable nor valid, and are often based on estimates, guesswork or assumptions. Often these assumptions are in turn based on older data, sometimes dating back decades. These old baseline numbers have very little relevance to how things look today.
Several consequences follow. Firstly, different actors may look at the same old data from different angles and thus produce very different numbers (in Jerven’s case, mostly GDP per capita). Secondly, these poor numbers feed into a larger picture of how African countries are depicted: which problems they have, where these can be found, and which solutions might remedy them and develop the nations.
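To make the first consequence concrete, here is a minimal sketch of how two analysts extrapolating from the same old baseline can end up with very different figures. All numbers are hypothetical and chosen only for illustration; none of them come from Jerven, and the function name `extrapolate` is my own.

```python
def extrapolate(baseline: float, annual_growth: float, years: int) -> float:
    """Compound a baseline value forward at a fixed annual growth rate."""
    return baseline * (1 + annual_growth) ** years


# Hypothetical inputs, not real data: a GDP per capita baseline that is
# 20 years old, and two equally defensible growth assumptions.
BASELINE_GDP_PER_CAPITA = 500.0
YEARS_SINCE_BASELINE = 20

cautious = extrapolate(BASELINE_GDP_PER_CAPITA, 0.02, YEARS_SINCE_BASELINE)
optimistic = extrapolate(BASELINE_GDP_PER_CAPITA, 0.04, YEARS_SINCE_BASELINE)

print(f"cautious estimate:   {cautious:.0f}")
print(f"optimistic estimate: {optimistic:.0f}")
print(f"ratio:               {optimistic / cautious:.2f}")
```

With these made-up inputs, a difference of two percentage points in the assumed growth rate leaves the two estimates almost 50 percent apart, although both start from the same baseline.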
Jerven laments the poor state of the countries’ statistical offices and argues that they are basically there to serve actors from the international aid, donor and development communities (Jerven 2013:105). “International institutions are the main providers and disseminators”, as he notes (Jerven 2013:8f).
Jerven calls for new baseline estimates from which fresh statistics can be extrapolated. However, he stresses that “these must be based on local applicability, not solely on theoretical or political preference” (Jerven 2013:xiii) and highlights the necessity of local knowledge and input. Data and statistics ought to serve the needs of the people on the ground, not help some faraway aid organization reach its targets.
Big data replacing statistics?
Can big data make up for the poor state of statistics on the African continent and help improve public policy and development goals? Before answering the question, let us quickly go through what big data is and how it works.
In their book, Mayer-Schönberger & Cukier provide a clear overview of what big data is. They note that “at its core, big data is about predictions” (Mayer-Schönberger & Cukier 2013:11), about inferring probabilities. Furthermore, big data is about finding the general direction, about a trade-off between being accurate at the micro level and gaining insights at the macro level (Mayer-Schönberger & Cukier 2013:12f). So far, big data seems like a useful tool. In fact, big data can be viewed as pure statistics. But which data can be, and currently are, collected in big data sets?
A lot of the data comes from people using communication tools, such as cell phones and computers, while connected to the Internet. Taylor & Schroeder warn that far from everyone in developing countries uses a cell phone or is connected to the Internet. This results in user bias and a situation where vulnerable or ‘hidden’ populations, such as children, the elderly and the poorest in society, are left out of the data collection (Taylor & Schroeder 2015:510f). They argue that “mobile phone use is highly differentiated by gender and income level” in India (Taylor & Schroeder 2015:506), and an educated guess is that many African countries exhibit the same patterns.
Meier concurs, saying that “not everyone is on social media. In fact, social media users tend to represent a very distinct demographic, one that is younger, more urban, and more affluent than the norm” (Meier 2015:37). So inferring national probabilities from a rather narrow subset of the population is perhaps a fairly poor idea: it will not deliver the reliable big picture that is supposed to be one of big data’s strengths.
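The user bias described above can be illustrated with a small calculation. This is only a sketch: the two groups, their population shares, their phone coverage and their incomes are invented for illustration and are not taken from Taylor & Schroeder or Meier. It compares the true national average with the average one would compute from phone users alone.

```python
from dataclasses import dataclass


@dataclass
class Group:
    name: str
    population_share: float  # share of the national population
    phone_coverage: float    # share of the group that shows up in the data
    mean_income: float       # average monthly income in the group


def true_mean(groups: list[Group]) -> float:
    """National average, weighting each group by its population share."""
    return sum(g.population_share * g.mean_income for g in groups)


def observed_mean(groups: list[Group]) -> float:
    """Average among phone users only, which is all the data set can see."""
    weights = [g.population_share * g.phone_coverage for g in groups]
    weighted_sum = sum(w * g.mean_income for w, g in zip(weights, groups))
    return weighted_sum / sum(weights)


# Hypothetical population, not real data.
GROUPS = [
    Group("urban, more affluent", 0.3, 0.9, 300.0),
    Group("rural, poorer", 0.7, 0.2, 100.0),
]

print(f"true national mean:     {true_mean(GROUPS):.0f}")
print(f"mean seen via phones:   {observed_mean(GROUPS):.0f}")
```

In this invented example the phone-based figure overstates the national average by roughly 45 percent, simply because the poorer group is mostly invisible in the data. Collecting more records from the same users would not fix this.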
Quality of analysis
If the previous section discussed the quality of data, this one delves deeper into the quality of analysis regarding big data. In the previous post I briefly mentioned that big data actors are mostly big corporations and governments. What they have in common is that the majority of them are based in the global North, far away from the realities of Africa.
Jerven writes: “In order to employ the evidence usefully, one must know the conditions under which the data were produced. This is readily recognized in qualitative analysis, but somehow these principles have not been applied to quantitative evidence” (Jerven 2013:7).
Read, Taithe & MacGinty are even more pessimistic and question the quality, reliability and validity of data when “field level information may be sent to headquarters in a different country, collated with other data and then sent back to the country of operation” (Read, Taithe & MacGinty 2016:7). They go on to say that there is a risk that the people analyzing the data are cut off from local knowledge and context, looking only at numbers (Read, Taithe & MacGinty 2016:12).
Mayer-Schönberger & Cukier in turn touch upon the very real possibility of a situation where “data-driven decisions are poised to augment or overrule human judgment” (Mayer-Schönberger & Cukier 2013:141). Let us hand over everything to the machines!
Big data the statistical saviour?
Based on the literature reviewed here, this question can only be answered with a resounding no. Jerven complained about the dominance of outsiders in producing statistics, and I cannot see how things would be any different if big data actors were to run the show instead of today’s powerhouses within the statistical field. The same objections, such as the democratic deficit and being out of touch with local circumstances, can be raised, and new ones, such as the gender and income gap among users, may even be added.
Big data proponents argue that big data “offers new and higher knowledge ‘with the aura of truth, objectivity, and accuracy’” (Read, Taithe & MacGinty 2016:10). But statistics, whether presented as big data or as traditional surveys carried out on the ground, are always subject to human bias. This is something that Meier, himself a big proponent of big data, confirms when he says that everything is biased (Meier 2015:39).
Jerven, M. 2013: Poor Numbers: How We Are Misled by African Development Statistics and What to Do About It. Ithaca, NY: Cornell University Press.
Mayer-Schönberger, V., Cukier, K. 2013: Big Data: A Revolution That Will Transform How We Live, Work, and Think. London: John Murray Publishers.
Meier, P. 2015: Digital Humanitarians: How BIG DATA Is Changing the Face of Humanitarian Response. Boca Raton, FL: CRC Press.
Read, R., Taithe, B., MacGinty, R. 2016: Data hubris? Humanitarian information systems and the mirage of technology, Third World Quarterly, forthcoming.
Taylor, L., Schroeder, R. 2015: Is bigger better? The emergence of big data as tool for international development policy, GeoJournal 80: 503-528.