Spatial Big Data, Impact, Sharing and Ethics

The term ‘born digital data’ was coined by Taylor and Schroeder in 2015 to denote “data that are digital from the start rather than starting out in non-digital form”. ‘Born digital data’ can be ‘consciously volunteered data’ or ‘data in the wild’ (pp. 504-505).

In my post published on February 22, 2017, I already wrote about ‘consciously volunteered data’ that are ‘born digital data’, namely crowdsourced data. Crowdsourcing has proved particularly useful for humanitarian response (Meier, 2015). One of the first, and most emblematic, examples of the power of crowdsourcing is the digital humanitarian response to the 2010 Haiti earthquake, in which the Ushahidi crisis-mapping platform played a critical role.

The Haitian humanitarian crisis that followed the 2010 earthquake also highlighted the fact that real-time data could now feature in humanitarian responses. In their 2011 Haitian study, Bengtsson and his colleagues demonstrated that data could, in principle, be obtained for continuous and extended periods and in near real time, and that such data were readily available.

Big data: accurate/fake, actionable/inefficient but fast

‘Real-time situational data’ is considered the area where much of the optimism around technological advances in humanitarian information systems has been focused. If data science is to be brought together successfully with humanitarian response, timely data access is critical (Read, Taithe & Mac Ginty, 2016, p. 6; Taylor & Schroeder, 2015, p. 510). As Spratt and Baker recall, velocity is one of the three ‘V’s of early definitions of big data, along with variety, which I examined in my previous posts, and volume (2015, p. 8).

Volume is an important feature to consider. Indeed, the sheer size of big data makes it near impossible to make sense of it in the first place. Additionally, verifying the relevance and the credibility of such data becomes a significant issue in a context where accurate but also fake and misleading information can spread fast (Meier, 2015, p. 27; Read et al., 2016, p. 9). In this regard, let’s watch part of the video called Digital Humanitarians: Overview of Chapters that I mentioned in my latest post.

Echoing Meier’s words, the first representative of the International Committee of the Red Cross (ICRC) emphasises the overwhelming nature of large amounts of data, as well as the difficulties related to verifying such data and the credibility of the people who provide them. Data enthusiasts have claimed that technology will render data more accurately and more quickly, and thus help address the information deficit often found in conflict or disaster-affected areas with an impact on the timeliness of humanitarian responses. However, the evidence measured against the record of delivery in mapping and visualisation suggests a profound inefficiency due to resources being wasted on gathering information that cannot be fully processed (Read et al., 2016).

Additionally, Read and her colleagues argue that instead of the ‘actionable data’ required by the humanitarian sector, ‘inactionable data’ is often produced. While an increasing amount of money is generally invested in advancing information technology, its use is often limited because many of the developments in digital humanitarianism seem to be driven by what is possible rather than what is needed. With regard to the data revolution in geography, including maps, Barnes poses the following question: “are we generating useful knowledge or are we collecting ‘data for data’s sake’?” (in Read et al., 2016, p. 2).

Data sharing: open data, social media and digital security

Returning to the video above, you may have noticed that the second representative of the ICRC sheds light on a significant problem related to data sharing: people’s very limited awareness and knowledge of digital security. While open government data is considered to have huge potential to improve the accountability and effectiveness of governments across the world, the issue of open personal data is different and raises serious questions with respect to data privacy and protection. One of these questions is: should the content of text messages be made public on crisis maps, along with personal information? (Meier, 2015, p. 11; Spratt & Baker, 2015, p. 30). As Meier puts it (p. 27):

[…] just because a few sources of Big (Crisis) Data are open and publicly available doesn’t mean that using this information is either ethical or safe.

On November 13, 2015, a series of attacks shook Paris. In this particularly chaotic context, the hashtag #PorteOuverte (‘Open Door’) started spreading as a way to offer shelter to those seeking safety from the attacks.

At first, Twitter users were encouraged to geo-locate their messages. Some even shared their address on the social platform. However, many of them quickly started asking for requests to be made through private messaging in order not to expose themselves to potential danger. While some Twitter users were opening their doors to host disoriented people, others were searching for lost loved ones. In this respect, the hashtag #RechercheParis (‘Search Paris’) started spreading as a way to try to locate missing persons. Both hashtags have been referred to as examples of how social media can help during a crisis.

Meanwhile, another social platform, namely Facebook, was also helping to locate potentially affected people. Its Safety Check tool issued an alert asking Facebook users located in the risk area to mark themselves as safe. I myself was waiting to see the safety check of my brother, who was living within 500 meters of the Bataclan.

 

Source: Beatrice Verhoeven/TheWrap (13 November 2015)

Data exhaust: privacy, protection and ethics

Since its launch in 2014, Facebook Safety Check has been activated about 30 times in several crisis contexts around the world. Safety Check alerts are now generated when a spike in user statuses tells the algorithm that there’s a crisis underway. By comparison, other tools such as Google Person Finder, which was initially created by volunteers in response to the 2010 Haiti earthquake and has recently been deployed in Nepal, are searchable databases.

On a different note, Facebook data scientists recently discussed how data can be used to produce high-quality maps to support development, humanitarian action, and government planning. While Facebook’s initiatives are laudable, they also raise questions with respect to the type of data emitted.

Since my first post on mapping, I have written about ‘consciously volunteered data’, that is to say data that are submitted for a particular purpose by users who are aware that they are making the submission. However, data may come in a different form, for example by being created passively as a result of other activities (i.e. ‘data exhaust’) (Spratt & Baker, 2015, p. 7). These ‘data in the wild’, generally emitted under corporate auspices, are increasingly being used to inform development policies and interventions (Taylor & Schroeder, 2015, pp. 505 & 513). While the issue of ethics and data security is crucial regarding open personal data submitted by people for a particular purpose, it is even more problematic in relation to data exhaust. As Taylor and Schroeder state (2015, p. 504):

[…] in the absence of a clear ethical framework or set of rules for handling and sharing ‘born-digital’ data [, there are risks to privacy]: anonymisation techniques are unreliable […]; there is less awareness in [Low and Middle Income Countries (LMICs)] of the implications of making personal data public, and digital data protection is not yet a concern for a majority of LMIC governments […]

The idea that privacy concerns brought about by big data may be even more acute in developing countries than in developed countries is also emphasised by other experts, Spratt and Baker (2015), who point out the lack of legal and institutional frameworks to ensure data privacy in those countries. Privacy concerns, and sometimes protection concerns as well, are raised regarding both governments and firms. In developing countries, where privacy, liberties and data rights are protected to a lesser extent than in developed countries, risks around government surveillance are greater, and firms from more heavily regulated markets may exploit people’s personal data in ways they could not in their home country.

From this perspective, Spratt and Baker propose ensuring that all individuals have the right to control their own personal data. Following the same logic, Meier proposes the use of the hashtag #NoShare as a way for individuals to decide whether or not to be sensed in an environment where big data means big sensing (2015, p. 186).

To conclude and come back to the final quotation of my post published on February 22, 2017, I will refer to Meier’s chapter on “the future of data privacy, protection, and ethics”. Just because new technologies are new doesn’t mean that established data protection and privacy protocols don’t apply. In the humanitarian sector at least, the protocols that guide how information can be collected, shared and used have recently been extended to include the role of digital humanitarian volunteers and social media (2015, pp. 183-184).

———————————————————————

  • Bengtsson, L., Lu, X., Thorson, A., Garfield, R., & von Schreeb, J. (2011). Improved response to disasters and outbreaks by tracking population movements with mobile phone network data: A post-earthquake geospatial study in Haiti. PLoS Medicine, 8(8): e1001083. Doi: 10.1371/journal.pmed.1001083
  • Meier, P. (2015). Digital Humanitarians: How BIG DATA Is Changing the Face of Humanitarian Response. Boca Raton, FL: CRC Press.
  • Read, R., Taithe, B. & Mac Ginty, R. (2016). Data hubris? Humanitarian information systems and the mirage of technology. Third World Quarterly, 37(8): 1314-1331. Doi: 10.1080/01436597.2015.1136208
  • Spratt, S. & Baker, J. (2015). Big Data and International Development: Impacts, Scenarios and Policy Options. Brighton: IDS.
  • Taylor, L. & Schroeder, R. (2015). Is bigger better? The emergence of big data as tool for international development policy. GeoJournal, 80: 503-528.

2 thoughts on “Spatial Big Data, Impact, Sharing and Ethics”

  1. The three V’s (velocity, variety, volume) that encapsulated the early days of big data are no longer the only V’s: we are now also dealing with the challenges of veracity, validity, and volatility of big data. As the article discusses in illuminating detail, all these elements of rapid technological change mean that, at the individual level, it is becoming harder and harder to maintain privacy, which is not helped by the fact that 1) regulatory requirements keep changing and 2) regulations vary from country to country and company to company. In the quest to find answers to the challenge of quality and ethics of big data, I found that this article did a great job of condensing Spratt and Baker’s report “Big Data and International Development: Impacts, Scenarios and Policy Options”. It would be interesting to see whether, as they suggest, we can truly strike a balance between individual rights and using data to make effective decisions in the development field.
