Big Data and Data Justice

Thanks to leaps in technology, generating, storing and sharing information is easier and cheaper than ever before. Our daily interaction with the internet across various platforms and mobile phones has created huge amounts of data: by some estimates, 90% of the data in the world was created in the last two years, and the volume is projected to grow by 40% annually. Governments and the private sector alike have taken advantage of this data to gain insights into areas that were previously difficult to examine, while commercial enterprises now have new ways of interacting with their customers. While the UN describes this as the “Data Revolution”, we risk forgetting the downsides of Big Data.

The knowledge retrieved from Big Data is based on correlations in the data, which are used to predict what will happen. It does not tell the future, but it gives close enough predictions of what might happen. Algorithms control our online experience and feed into the decisions people make about us. Enterprises use these predictions to decide who qualifies for a loan and who doesn’t, schools use them to decide who gets or misses a university placement, and insurance companies use them to decide how much premium a client will pay based on various elements outside one’s health condition. Mayer-Schönberger and Cukier discuss the dark side of Big Data with regard to privacy and penalties based on propensities. They describe the latter as “the possibility of using Big-Data predictions about people to judge and punish them even before they’ve acted” (p. 151).

It is these decisions that sort people into categories, further heightening inequalities such as discrimination and segregation. In her conversation on the TED Radio Hour episode “Can We Trust The Numbers?”, data scientist Cathy O’Neil shares her insights on algorithms, stating that they work to separate people by class and race, creating “winners and losers.”

The Swedish police presented a report showing how they classify Swedish suburbs as “vulnerable areas” and “especially vulnerable areas”. These areas (utsatt område in Swedish) are characterised by various social problems, low education levels and high unemployment, and they report high rates of crime and gang violence. It is noted that the majority of residents in these neighbourhoods were not born in Sweden.

The state and the police can easily collect and receive information from the public and other sources, mainly because of the authority they hold. This information, combined with information mined from other networked sources such as our internet connections, mobile phone communication and the location detected by our smartphones, along with background details such as education and migration status, exposes us to unfair levels of scrutiny and victimisation. This is a form of bias enabled by big data: it categorises locations and people along certain lines, thus downplaying their agency. Big data also gives power to those who hold it, leading them to create control mechanisms that suit them to the disadvantage of others, who are often marginalised groups.

While the police may insist that these divisions allow them to focus their efforts and thus improve efficiency, we cannot overlook their problematic nature. Safeguarding people’s right to privacy and treating them as individuals, not as homogeneous groups defined by race, social status and other biases, is the essence of basic human rights.

How then do we work to minimise these risks in a datafied world?

Policies and laws that look into issues of privacy, representation and equality must be developed, moving from the national level to the transnational level.

Taylor proposes a data justice framework that charts ethical paths in a “datafying world”. She submits an approach based on three pillars. The first is visibility, which deals with both privacy and representation and raises the question of how sorting and grouping the public for the purpose of intervention should function in a democratic context. The second concerns our engagement with digital technologies, including the freedom not to use particular technologies, thus taking back the power to understand and determine one’s visibility. This leads to reconfiguring data value chains so that value is not just in the hands of the technology giants, and data’s returns can be captured and processed at the local level. The third pillar is countering data-driven discrimination, which addresses the power to challenge bias and the freedom not to be discriminated against. It covers how data becomes biased through the way it is gathered, processed and stored, and how these processes are influenced by other factors such as politics, economics and infrastructure (Taylor, p. 11).

This framework adds to the bank of knowledge that will go towards creating lasting guidelines on data justice before biases become harder to comprehend due to fast-changing technology.


Mayer-Schönberger, V., Cukier, K. (2013) Big Data: A Revolution That Will Transform How We Live, Work, and Think. London: John Murray Publishers.

Taylor, L. (2017) What is data justice? The case for connecting digital rights and freedoms on the global level. Draft paper.

TED Radio Hour (2018) Can We Trust the Numbers?

The Local SE (2017) 5,000 criminals in Sweden’s vulnerable areas: police.

United Nations (2018) Big Data for Sustainable Development.



  1. Kijo Esther


    Interesting article on big data and data justice. There’s a lot of room for the right policies and even legislation on usage in the current environment, and it begins when the issues are highlighted and discussed as in your article.

    These policies can then be customised to different countries depending on their current context and situations.

  2. Yahneake

    Interesting article, Diana! There are pros and cons to Big Data, but in the grand scheme of things it is mostly corporations that benefit. On the one hand, one could argue that big data benefits businesses because they use it as a marketing tool to target consumers based on shopping trends, for example. On the other hand, consumers can benefit too: in banking, for example, when someone applies for a loan, available data saves time for both parties and allows for faster processing. We live in a data-driven world and, like Kijo, I agree that policies and legislation could go a long way in regulating how the information is used and stored.
