Big Data: Get on Board or Get Left Behind

Big Data startups in the West received $6.64B in venture capital investment in 2015, 11% of total tech VC. To put that in perspective, it is greater than the GDP of Malawi and four times the GDP of Guinea-Bissau.

“The explosion of big data has far-outpaced our ability to make sense of it in poorer nations that already lack human and technical capacity.”

This is the position articulated by Claire Melamed from the Overseas Development Institute, quoted in a SciDevNet article, “Big Obstacles Ahead for Big Data for Development.” She adds, “We are all running to catch up with the technology.”

This is a good starting point for this post, which looks at the burden on the Global South to catch up with Big Data developments in the Global North. I will consider the following issues:

  • Is the Global South becoming a playground for academia and the private sector to test theories and Big Data tech on data sets with fewer privacy issues?
  • Is Big Data still a thing?
  • Does Big Data create an additional burden on Global South government finances and capacity?
  • Is Big Data for Development contributing to a shift towards the “corporatization of development”?
  • Do the sample bias and technical limitations inherent in Big Data risk leaving those most in need of development assistance behind?

Is the Global South becoming a playground for academia and the private sector to test theories and Big Data tech on data sets with fewer privacy issues?

As a Data2X report, “The Landscape of Big Data for Development” (2014), acknowledges, the use of big data in development is largely driven by “opportunistic partnerships between private companies and academics.” A key concern is that the data exhaust is often owned by the private sector, especially mobile phone operators. In this nascent phase of applying big data to development, academics and private companies have a high degree of influence over how big data is used and which approach is taken to harness it for development.

In Big Data Hubris, the author worries that many developments are “driven by what is possible rather than what is needed.”

Pilot projects and feasibility studies abound (a good searchable map of current big data for development projects can be found in the report, The Landscape of Big Data for Development). Are these driven by academic and corporate interests in the field, with a blank slate of less troublesome data sets in the Global South, or by a genuine belief that this can eventually lead to tangible development outcomes and complement and/or replace existing structures? There is an understandable reluctance to turn over control of statistics and data to academics and private companies who could be criticized for using the Global South as their playground: the next subject of study, in which the consequences of success or failure have more to do with academic credentials or better marketing for products than with a genuine engagement with development issues.

Is Big Data still a thing?

“The barriers facing big data becoming a useful tool for development are numerous and great. But they are not insurmountable. With the development community rallying around the UN’s data revolution call, there is reason to believe that big data can fulfil its promise in the years to come.”

So writes Jan Piotrowski in the same SciDevNet article quoted above. Recalling Easterly’s notion of the Legend of the Big Push in The White Man’s Burden, a note of caution is clear in Big Data Hubris, which worries that development partners, institutions and donors are being urged “to get on board, or get left behind.”

Meanwhile, many tech commentators in the West are realizing that the era of Big Data has already reached its zenith and that the focus has shifted from Big Data to AI, machine intelligence and deep learning, just as the Global South is being heavily lobbied to jump on the Big Data bandwagon. The two are obviously inextricably linked, but as the tech giants focus more on the application of AI to Big Data sets, the Global South is still struggling to come to terms with the volume, variety and velocity of the data itself.

Matt Turck’s blog post “Is Big Data Still a Thing” from February 2016 acknowledges that the Big Data fever may have plateaued sometime in 2015. As he explains, the interest in Big Data came primarily from the big social media networks (Google, Yahoo, Facebook, Twitter, LinkedIn), who realized that they were both heavy users and creators of Big Data. Creating technologies such as Hadoop for Big Data analysis was wrapped up in the entrepreneurial spirit of the social media conglomerates.

But it is not just a question of rolling out Hadoop software into the world. Private companies, from small to multinational, are not starting from scratch and have a functioning technology infrastructure. Even so, the commitment required for companies to transition to a Big Data-driven corporate culture is massive. As Turck acknowledges, “You need to capture data, store data, clean data, query data, analyze data, visualize data. Some of this will be done by products, and some of it will be done by humans. Everything needs to be integrated seamlessly. Ultimately, for all of this to work, the entire company, starting from senior management, needs to commit to building a data-driven culture, where Big Data is not ‘a’ thing, but ‘the’ thing.”

How much more so for governments and development actors in the Global South? As the academic rush to embrace Big Data creates numerous pilot projects to test what is possible with the tech available, there is a danger that newer technologies and concepts will overshadow Big Data before the Global South has had a chance to catch up.

Does Big Data create an additional burden on Global South government finances and capacity?

In the wild-west rush of data for development, competing and overlapping data sets are required and requested from non-governmental partners. Nicolas de Cordes, vice-president of Marketing Vision at telecom firm Orange, said “their experience in trying to engage the national statistics office in Côte d’Ivoire during their Data for Development challenge was that the stats office was not really interested in ‘your big data’; indeed such offices earn money by selling their services to international organisations and may see similar initiatives as a threat.”

This is an increasingly familiar story. Moreover, financial support for statistics offices is being cut. “The proportion of overseas aid dedicated to statistical programmes was slashed in half between 2011 and 2012, to 0.16 per cent, according to a 2013 report from the Partnership in Statistics for Development in the 21st Century (PARIS21).”

In Poor Numbers: How We Are Misled by African Development Statistics and What to Do about It, Morten Jerven looks at the burden on national statistics offices and acknowledges that the paucity of accurate statistics is not merely a technical problem; it has a massive impact on the welfare of citizens in developing countries. Many African national statistical offices are understaffed and underfunded, frequently bypassed by international donor organisations that collect and produce their own data, and occasionally subject to political pressure from governments, which puts them in a weak position to inform policy-making with updated and accurate information. Statistics are not neutral data sets, but often serve a political and financial agenda.

One of the policy implications of Jerven’s work is that development resources need to be directed toward supporting national statistics offices across Africa so that they can produce better data about their economies. A potential counterargument is that, instead of supporting the statistics offices, the development community should focus on the citizens themselves and devote as much development assistance directly to them as possible.

A recent UN Economic and Social Council report looking at 107 national statistical offices showed that they see big data projects as a complement to, not a replacement for, traditional collection methods such as surveys. Does Big Data then represent an additional burden on statistics offices which themselves are subject to political interference? And what about the inherent danger that, in diverting scant financial resources from the provision of public services and political engagement to getting better data, the belief that better data will lead to better-informed policy won’t be realised? In the stand-off that occurs, will poorer nations find themselves without control or ownership of data on their own citizens as academics, multilaterals and private companies bypass government statistical offices rather than enter a problematic relationship with them?

Is Big Data for Development contributing to a shift towards the “corporatization of development”?

With big corporate players encroaching on the development sphere, it is perhaps not surprising that the language used to reflect this new private-public partnership has also shifted. These “opportunistic” partnerships, created in the rush to get on board or get left behind, are often with large corporations, telecoms companies and social media conglomerates. The language around development has shifted to a focus on efficiency, value chains, monitoring and evaluation. This is not just a question of semantics, but a very real reassessment of what development means.

Do the sample bias and technical limitations inherent in Big Data risk leaving those most in need of development assistance behind?

Drawing conclusions from unrepresentative data has implications for harnessing Big Data for Development. In the enthusiastic mining of data sets from social media, who is left out? This is something which Data2X is actively challenging with regard to gender, as I pointed out in a previous post on BigData4Gender.

The following infographics show, for example, the disparity in Facebook usage between the UK and India, between men and women, and across age quintiles.

 

When other factors such as access to the internet, electricity and language are taken into account (the majority of sentiment analysis is still based on English usage), Big Data as a tool for development has a worrying propensity to miss those most in need.
