To address gaps in data gathering and generation in the digital age, the origins of datafication might provide an evergreen reminder that not all data is digital and that referring back to analog data might be key to addressing older, systemic problems.
The concept of datafication arguably sprang from the work of a naval officer named Matthew Fontaine Maury in the mid-1800s. Appointed head of the Depot of Charts and Instruments, Maury took this new posting as an opportunity to extract, record and study patterns in sea voyages in order to optimise and expedite sea routes (Mayer-Schonberger and Cukier, 2013).
His goal was to harness a greater form of shared knowledge, tabulated from an array of sources including the deteriorating logbooks in his care and conversations with old sea captains at various ports, and ultimately to provide value and convenience for merchants, traders and other types of maritime voyagers (ibid, 2013).
To this day, many training and educational resources in higher income countries are allocated to programs that view data (and the analysis thereof) as pivotal to business insight and the creation of entrepreneurial value.
Development potential is frequently relegated to something of an afterthought in training for scientific data analysis. It’s an underlying worry in our increasingly data-hinged and tech-hung world. However, what the story of Maury and his naval charts can remind us is this:
Datafication and digitization are not one and the same. The two terms are commonly conflated now that big data has become something of a hot-button phenomenon in the post-Web 2.0 digital era.
With this in mind, anyone transposing data extracted for value and insight into a development context could acknowledge the roots of datafication and recognize that a lack of technology doesn’t have to be a dealbreaker in creating useful development knowledge. Analog data (often older; not created digitally) can indeed be pivotal in addressing long-term or systemic development problems.
In lower-middle income countries, analog data such as censuses can provide a backbone of contextual information that might not carry some of the bias and exclusion that born-digital data can reflect. Because of disparities in access to technology and other forms of privilege, it is important to remember that data emitted from technological devices and digital services can be limited in how well they represent a population (Gonzalez-Bailon et al. 2012). Aside from issues of gender, big data collection can be inaccurate or incomplete for a plethora of other reasons.
Taylor and Broeders (2015) examine a power shift in lower-middle income countries, in which the state goes from being the primary collector and user of statistics to a model where power is distributed, less predictably and far more messily, among those who hold the most data.
Absent Voices, Incomplete Pictures
However, in Kenya some censuses have been repeatedly rejected by sectors of the population who feel underrepresented, highlighting Jerven’s (2013) documentation of how data are politicized and must be legitimized by the greater population in order to be useful in policy formation (Taylor and Schroeder, 2014).
In the D4D project in the Ivory Coast, mobile operator Orange released 2.5 billion call records to data scientists. The aim was to ‘help address the questions regarding development in novel ways’, but the winning project ultimately provided a contextually incomplete analysis of the city of Abidjan’s transport system (ibid, 2014).
Without local or contextual knowledge on Abidjan’s city transport dynamics, the D4D transport project only managed to reference data on roughly 10-30% of the transport system, neglecting the remaining majority of transport, which is handled by small-scale informal providers (ibid, 2014).
And perhaps it’s these local transport operators that could have provided the D4D project with a more complete story of the city’s transport quirks, much as Maury recognized the value of the information held by the old sea captains in ports around the world and employed a multi-dimensional datafication approach, rather than stopping short at the resources that had conveniently been put in his charge.
Additional texts used:
Gonzalez-Bailon, S., Wang, N., Rivero, A., Borge-Holthoefer, J., & Moreno, Y. (2012). Assessing the bias in communication networks sampled from Twitter. Available at SSRN 2185134.
Jerven, M. (2013). Poor numbers: How we are misled by African development statistics and what to do about it. Ithaca: Cornell University Press.
Mayer-Schonberger, V. and Cukier, K. (2013). Big Data: A Revolution That Will Transform How We Live, Work and Think. London: John Murray.
Taylor, L. and Broeders, D. (2015). In the name of Development: power, profit and the datafication of the global south. Geoforum, 2015.
Taylor, L. and Schroeder, R. (2014). Is bigger better? The emergence of big data as a tool for international development policy. GeoJournal (2015) 80:503-518. DOI 10.1007/s10708-014-9603-5