Big data, privacy and ownership

Shahin Madjidian


Studying big data critically leads to several interesting topics which can be examined and developed. In a string of four blog posts this is exactly what I will do. The first post is perhaps the heaviest as it deals with ownership issues, democracy and whether or not big data can be seen as something revolutionary that will lead to social change.

I will start with a short analysis of the individual’s right to his/her data and then move on to the macro level – who owns the data, who can store it, analyze it or draw conclusions from it, and later act on them?


Today, most people in the world leave digital fingerprints as they go about their business, whether they like it or not. This data gets stored, and is often analyzed and acted upon by the actors who picked it up in the first place. The data can be anonymized and used in a big data set where it is very difficult, or even outright impossible, to identify individuals, or it can be used to target commercial ads at a specific individual.

Privacy issues have been raised, especially as individuals have very few opportunities to refuse data collection. Spratt & Baker argue that algorithm transparency is important, and that people should be allowed to know which data is stored about them and where. They suggest that “all individuals have the right to control their own personal data, and can choose to sell as much or as little of this as they like” (Spratt & Baker 2016:30). While this sounds laudable, it ignores several problems. How will an individual get access to this data, and where can s/he store it? What about situations in which an individual needs money, or is otherwise desperate, and decides to sell data, trading short-term gain for possible long-term exposure? Which population sectors in which countries may be most prone to do this?

As Spratt & Baker note, consumers may object to their personal data being bought and sold, but in reality they have very little control over it once it has been collected (Spratt & Baker 2016:12). This leads us to the next section, namely who these holders of data are.

Ownership and “usership”

The main holders and owners of data today are big corporations, especially those in the social media and communication sectors, and governments. Often, data collected from actions and events is used to create new forms of value in innovative ways, as “the system takes information generated for one purpose and re-uses it for another” (Mayer & Schönberger 2013:97, 103). The trick lies in finding secondary uses of the data and, as a result, hidden correlations which may turn out to be highly valuable. This ties back to the privacy issue: it is very difficult to prohibit something that has not yet happened, that is, to prohibit uses of data which the data owners themselves have not yet thought of.

Read et al. argue that “if the power of initiative, design, funding and analysis still resides with the tech-savvy individuals and organisations based in the global North, then it is difficult to concur with the view that technology is empowering or liberating” (Read, Taithe & MacGinty 2016:12). This notion is amplified in the global South considering “the growth of private-sector involvement in public infrastructure projects across the globe” (Lovink & Zehle 2005:10), with infrastructure here broadly meaning Internet and cellphone development.

A few huge corporations have taken the lead in the use of big data. To remedy this, Spratt & Baker propose state support for startup companies in the field, so that they can learn and become more competitive (Spratt & Baker 2016:26).

Is there any possibility of individuals becoming owners, analyzers and users of big data? Meier certainly believes so, and I will return to his book “Digital humanitarians” in a later post. For now, I will use his own words against him, as he writes that big data can easily turn into information overload and that the data coming in during one of his humanitarian efforts was simply too much for him and his hundreds of volunteers to handle (Meier 2015:4, 50, 52).

It is not only the vast amount of data that makes it difficult or impossible for individuals to use, but also its messiness and complexity. The data comes from different sources, in a wide variety of shapes and forms, and is often unclear and fragmented. The technologies required mean that big data use today is limited to a few actors. Individuals, or groups of individuals, are usually not among the lucky ones.

Mayer & Schönberger have a somewhat romantic view of future development, believing that just as everyone with a cell phone has the potential to be a “journalist” in the broad sense, everyone may be able to extract and analyze big data as “tools get better and easier to use” (Mayer & Schönberger 2013:134). It may not be necessary to be a statistician, engineer or software developer working for a government agency or Facebook.


While development may allow more people to become big data users, today’s actors will have a huge head start. Furthermore, Mayer & Schönberger predict that data owners will increasingly be in the most lucrative position in the future (Mayer & Schönberger 2013:134) and as long as the privacy laws are not changed, the data owners will be social media and communication corporations, not individual citizens.

I agree with the somewhat glum view that, “although cloaked in the language of empowerment, data technology may be based on an ersatz participative logic in which local communities feed data into the machine /…/ but have little leverage on the design or deployment of the technology” (Read, Taithe & MacGinty 2016:11).

In many ways, big data is revolutionary and holds great possibilities for humankind, but used within today’s societal and economic logic, it merely furthers and strengthens the status quo, with little to no possibility of empowering individuals or inciting social change.


  1. Thank you for this post, Shahin. I think it sets the right tone for what has followed later in this blog.
    I will focus on the part of your post regarding privacy, where I believe you ask the right questions to challenge statements like the one you cite from Spratt & Baker on individuals’ right to control, and to choose to sell, their own personal data.
    Statements like this seem to me a bit like “building castles in the air” in terms of what an individual can do to protect their privacy. I say this because, in my view, the international regulatory framework for personal data protection is too complex for one individual to manage. Moreover, the differences between regulatory frameworks at the international level are more than significant: what is protected in one country might not be regulated in another.
    And if we add the cultural differences, where what is considered very private, embarrassing, stigmatising, or grounds for discrimination varies among individuals and groups, and also differs between cultural backgrounds, places and time periods (see Kaplan), we understand that it is even more difficult for individuals to justify remedy requests regarding their privacy. There are numerous comparative law studies on this topic, which provide very interesting examples. As Jost analyses in his book, the rights of, for example, patients to their medical data vary considerably between countries. At one extreme is the absolute protection of confidentiality in France, where a physician commits a criminal offence by breaching confidentiality, even in court testimony and even if the patient allows it; this contrasts with the approaches taken in, for example, India or the US (as also cited by Kaplan).
    A possible solution for protecting confidentiality and privacy in big data could be to de-identify the data when it is stored in databases for further analysis, so that it would not be possible to trace it back and identify the person behind it. However, scholars like Lee et al. note that “there are many technical and legal problems related to de-identification but the biggest problem will be re-identification. This is due to the fact that even information that at first cannot be used to identify an individual, when it has sufficiently increased in quantity and kind, can eventually be used to identify the main subject of the information”. This means that it will always be possible to trace data back to an individual, and there is no guarantee of confidentiality for private data.
    But even if the individual seems very unprotected in the online data market, I would not necessarily agree with the “glum view” of Read, Taithe & MacGinty which you also adopt towards the end of your post. In my view, the power of the big databases lies with the individual user, and we still have a choice (with perhaps not absolute, but still considerable, freedom) about which parts of our private lives we upload, share and make public online.

