Oct 17

Who are they? The citizens behind Big Data?

Diana, October 26.

big data citizens

In this final post, I'd like to review the topic of Big Data one last time. I have written about the challenges of Big Data, as well as the hidden bias within it. One of the points I raised in earlier posts was that it is citizens, people, who create Big Data. By sharing, tagging, commenting, and writing posts, people create all the information that gets selected from the big sea of data. As this group has mentioned many times before, data can be very useful and helpful on the one hand, but damaging and risky on the other.

Let's then look at the beginning, namely at those people who write, share, like, and tag. Who are they?

I can see two types of people who create information on the internet: "development professionals" and "readers". I have already described "development professionals" – professional organisations such as NGOs. Thomas Tufte states that technological frameworks have opened a pathway to spread information more effectively to locals, governments, and donors in order to provide improved aid, for example in health and education.

"Development professionals" expect transparent, legitimate data in order to collect the required information and, in turn, to know where and what kind of help is needed. I won't go into details here; my group mates and I have already discussed both the risks and opportunities of Big Data usage by organisations, and the group has also highlighted some positive and negative examples. Instead, I'll go directly to the second type of people responsible for creating Big Data.

On the other side, Michael Mandiberg talks about "readers": people who used to just read information on the internet but have become writers themselves and now have control of the information on social media.

Give the people control of media, they will use it. The corollary: Don’t give the people control of media, and you will lose. Whenever citizens can exercise control,
they will. – Jeff Jarvis

Recent developments in social media gave people the freedom not just to communicate with each other but also to express their opinions and thoughts online. New technological frameworks have emerged that focus on enabling media creation, like blogging, where people can speak openly about any issue, matter, or problem. These new frameworks gave way to so-called amateur media, which risks growing bigger than any of the professional organisations.

Furthermore, "amateurs" can be a source of biased data: they don't always double-check the legitimacy of the information they spread. I won't go into further detail here; I already provided some examples of biased data in my previous posts.

Big data citizens

In conclusion, I have to say that this post is ironic, because we are newly created bloggers (well, some of us) writing on the subject of #BigData. Who is to say that we are legitimate? We might just be those so-called "amateurs". Are we contributing transparent data or biased data? Yes, we use academic references. Yes, we do our research before publishing our blog posts. No, we are not professionals. Who are we, then? Here, at the end of my discussion, I have discovered a third group of people who create information on the internet: us 🙂

Thank you for the time you spent on our blog – reading it, commenting, participating in discussions, and sharing it! It was a fun time for all of us! I hope you enjoyed it as much as we in this group did!



Have you #HeForShe’d yet?

Eraptis, October 24.

Tonight, Studentafton hosted an event with the head of #HeForShe, Elizabeth Nyamayaro, at Lund University, Sweden.


Here are some of the things tweeted under #HeForShe during the event!

Elizabeth Nyamayaro emphasizes the need for all (both men and women) to engage in order to create lasting change:

To this day, no country has yet achieved gender equality. Thus, it’s not only a “Global South” problem – change needs to happen everywhere:

The need to act when required, to not be passive in the face of inequality and injustice:

But what about data?

DIAL* asked the same question – is all data created equal?

* DIAL (Digital Impact Alliance) is a “partnership amongst some of the world’s most active digital development champions” including: UN Foundation; Bill & Melinda Gates Foundation; SIDA (Swedish International Development Cooperation Agency); and USAID 




Can Big Data Help Feed The World?

Aymen, October 20.

Big Data goes beyond the mere existence of data. The ability of Big Data techniques to generate insights by synthesizing data from a range of sources may hold the greatest potential, and carry the greatest risks, of all. On one hand, Spratt and Baker, in their report “Big Data and International Development: Impacts, Scenarios and Policy Options”, explain that Big Data can be manipulated to promote a certain political agenda or to increase the possibilities for governments and large corporations to discriminate against certain groups or individuals.

big data feeding world

On the other hand, Big Data may have a positive environmental impact as well as great potential in agriculture and rural development. It can bring new insights and decision points that lead to product and service innovations. This potential touches on, for example, precision farming with very efficient water and fertilizer use; food security coordination through tracking, tracing, and transparency; and personalized health and nutrition advice. The availability of easily accessible data plays a major role in documenting the quality standards of agricultural products, saving time and improving productivity.

Several projects launched by development organizations rely on Big Data to optimize agriculture. For example, FAO launched the Virtual Extension and Research Communication Network (VERCON) in more than 10 countries in Africa, Asia, Eastern Europe, Latin America, and the Near East. According to FAO, VERCON is a conceptual model that employs internet-based technologies and Communication for Development methodologies to facilitate networking, knowledge sharing, and interaction among agricultural institutions, producer organizations, and other actors of the agricultural innovation system.

In Egypt, for instance, where the first project was launched in early 2000, 100 VERCON access points were installed in various places, such as extension units, agricultural directorates, research institutions and stations, and Development Support Communication Centers. They were connected to the internet to allow farmers to access an agricultural economic database as well as news and bulletins that helped them solve their problems. In addition, the platform was useful for sharing the ideas and experience of local farmers and for monitoring the whole project.

The VERCON project was successful because it relied on existing organizational structures and links. The platform also ensured rapid response to user feedback thanks to regular monitoring and access to monitoring results. It used rural and agricultural appraisals at the field level to ensure that the virtual network would be accurately focused on the information and knowledge needs of the larger agricultural community.

Building on this success, the Rural and Agricultural Development Communication Network (RADCON) was set up to engage with a wider range of rural and agricultural development issues and to extend the VERCON network to a wider range of stakeholders, including farmer organizations, youth centres, universities, and NGOs.

However, the challenge the project still faces is the use of ICTs by farmers themselves. Despite the success of projects that apply Big Data to rural development, farmers in the developing world often face difficulties in meeting the quality and safety standards set by the developed world. The conditions that stimulated the growth of Big Data in the farming industry of the global north – such as the widespread adoption of mechanized tractors, genetically modified seeds, and computers and tablets for farming activities – are less prevalent in developing countries. While large growers can afford specialized machinery, small farmers do not have this opportunity. As a result, they can neither access the data nor interpret it.

Big Data for rural development can help analyze large amounts of information: rainfall or pest-vector data can give valuable insights into important issues such as climate change, weather patterns, and disease and pest infestation patterns. However, this valuable information largely benefits the Big Data industry in the Global North. It can have a positive impact on big farmers in the global south, but rural communities might be excluded as they still have little or no access to ICTs.

Nowadays, as noted by Spratt and Baker, those in favour of Big Data tend to adopt an evangelical tone when arguing for its benefits, while those against it tend to stress its dystopian nature. It is important to remember that Big Data is a very recent phenomenon; according to sciencedaily.com, a full 90 percent of all the data in the world has been generated since 2011. In practice, we don’t yet have the necessary distance to evaluate its real impact.

When it comes to agriculture, farmers all over the world must produce more to feed the world’s rapidly expanding population in the coming decades. Will Big Data help feed nine billion people by 2050? Time will tell…



Is BIG DATA against or for low-income countries?

Goda, October 18.

The development of new ICTs has brought many changes into our day-to-day life. These technologies are often seen as undoubtedly good, with a recognised capacity to make the world better, and big data is one of their key elements. According to Spratt & Baker, big data carries the belief that it offers new and higher knowledge “with the aura of truth, objectivity, and accuracy” (Spratt & Baker, 2015).

However, my last post will focus on Unwin’s argument that even if such technology is introduced with the potential to do good, that potential quite often turns into negative outcomes, especially for poor and marginalized communities. Moreover, although big data is seen as offering new solutions for development issues (Spratt & Baker, 2015), it is mainly focused on “what is”, rather than on “what should be” (Unwin, 2017).

The benefits and risks of big data have been discussed in all our previous posts from many different perspectives and illustrated with different examples. As mentioned earlier, it can be used in various decision-making models; it can create added value or be used for manipulative purposes. So, as data becomes all-pervasive in our lives, it is indeed getting more difficult to strike the right balance between its possibilities and its dangers.

big data low-income countries

According to Unwin, big data is designed “with particular interests in mind, and unless poor people are prioritized in such design they will not be the net beneficiaries“ (Unwin, 2017). In other words, big data primarily serves the interests of governments or shareholders and is much less interested in people, especially low-income populations. Despite such issues, it was clearly stated in previous posts that governments still play an inevitably important role in creating the legislative and policy framework.

This concluding post highlights the most important aspect of big data to be taken into account: technologies need to focus on empowering people, especially people from less developed countries, rather than controlling them.

There is no doubt that big data has been used for reasonable purposes; however, it is difficult to decide whether all of them are positive. The use of social media to provoke a certain political change can be seen as good and bad at the same time. Big data can be an opportunity in various contexts as well as a problem that needs to be solved. Everything depends on the context, the particular situation, and the particular human intervention (Unwin, 2017).

Moreover, in terms of less developed countries, as the world becomes even more digitally connected, there is a real need for richer countries and international organizations to share knowledge and technical capacity in order to improve global digital security. While this can raise privacy issues, it needs to be discussed openly and transparently within countries, especially where it relates to equal decision-making aimed at reducing inequality.

Additionally, even if Unwin declares that much more attention needs to be paid to the balance of interests between the rich and the poor than to the ways in which data are used, I agree with Read, Taithe and Mac Ginty that for data to become explicit, it requires careful analysis of how it is gathered and used – especially as the technology itself becomes cheaper and social networking platforms such as Facebook become mainstream forms of communication (Read, Taithe & Mac Ginty, 2016).

However, it is not just access to technology that matters. The data revolution risks strengthening specialists in headquarters. Thus, not only does access to connectivity need to be provided, but governments should also be innovative and open to new ideas. Appropriate content should likewise be integrated to empower and include less developed countries and to help them use big data in their own interests (Unwin, 2017).

Nevertheless, despite all the risks to poor communities, there are also many potential benefits of big data analysis. Among other things, such information can open up employment opportunities, transform health care delivery, and do much more besides (slate.com, 2016).

Therefore, the capacity to meaningfully analyse big data remains as important as the balance between developed and developing countries (Rettberg, 2016).



Big (biased) Data. What can go wrong?

Diana, October 16.

Today the question I want to ask is: what can go wrong when using Big Data? With that, I move from theoretical posts to a more explanatory one, showing with a particular example how Big Data can pose a risk for “development professionals” trying to provide aid.

big data wrongs

In my previous posts, I talked about risks and hidden bias in Big Data. I mentioned that people are the ones responsible for constructing data.

Google, and the Internet generally, are seen as groundbreaking innovations that have changed the way millions of people live their lives, and yet researchers and practitioners in the field of ICT and development often struggle to demonstrate the technology’s explicit influence to “development professionals”. There are definite reasons why certain projects fail, and there are even some generalisable patterns of failure.

One example is Google – the most used search engine in the world, where millions of people can find all kinds of information affecting their daily lives. In 2008, Google came up with what it thought was a brilliant application – Google Flu Trends – to track the flu and its spread around the world. This was done to help “development professionals” provide aid to affected areas. Google claimed it could see the advance of the flu based on people’s searches. The essential idea was that when people are sick with the flu, they search for flu-related information on Google, providing almost instant signals of overall flu prevalence.

But this concept didn’t work. Why? Let’s examine.

David Lazer and Ryan Kennedy write in SCIENCE that Google relied too much on simple search data, which led to the spectacular failure of Google Flu Trends: at the peak of the 2013 flu season, the application missed by 140 percent.

As I mentioned in my previous post, it is hard to know what is really happening in an affected area if you are not actually present there.

That is what happened here: Google didn’t take into account that many people with the flu don’t actually use the search engine to look for flu-related information. Furthermore, Google didn’t research how many people rely on the internet to find information about the flu. Nor did it account for all the people who use Yahoo or Bing instead of Google.

David Lazer and Ryan Kennedy – professors at Northeastern University and the University of Houston, respectively – continue that Google’s algorithm was particularly vulnerable to overfitting to seasonal terms unrelated to the flu. With millions of search terms being fit to the data, some searches were strongly correlated with flu prevalence by pure chance.

These terms were unlikely to be driven by actual flu cases or to be predictive of future trends. Moreover, Google did not take into account variations in search activity over time. These errors are not randomly distributed: an old error predicts a new one, and the scale of error varies with the time of year (seasonality). These patterns mean that Google Flu Trends overlooked significant information that could have been extracted by traditional statistical methods.
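The overfitting problem is easy to reproduce for yourself. Generate enough random "search term" series and some will correlate strongly with a seasonal flu curve by pure chance – exactly the trap Lazer and Kennedy describe. A minimal sketch (all data here is simulated, nothing below is real Google search data):

```python
import math
import random

random.seed(0)

weeks = 156  # three simulated years of weekly data
# A seasonal "true flu" curve peaking once per 52-week year.
flu = [math.sin(2 * math.pi * w / 52) for w in range(weeks)]

def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

# Thousands of unrelated "search terms", simulated as random walks.
best = 0.0
for _ in range(5000):
    walk, level = [], 0.0
    for _ in range(weeks):
        level += random.gauss(0, 1)
        walk.append(level)
    best = max(best, abs(correlation(walk, flu)))

# With enough candidate terms, some random series correlates
# strongly with the flu curve by chance alone.
print(round(best, 2))
```

A model built by screening millions of terms for correlation will happily pick up such coincidental matches, which then fail as soon as the season shifts.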

big data wrong

Google, like the whole Internet, is continuously changing because of the activities of millions of engineers and consumers. Researchers need a better understanding of how these changes transpire over time. Scientists need to replicate findings using these data sources across time, and using other data sources, to guarantee that they are observing robust patterns and not temporary trends. For instance, it is entirely feasible to run controlled experiments with Google, e.g., observing how search results differ based on location and past searches.

More generally, studying the evolution of the socio-technical systems embedded in our societies is fundamentally important and worthy of study. The algorithms underlying Google help regulate what we find out about our health, politics, and friends.

It’s not just about the size of the data. There is a tendency for big data research and more traditional applied statistics to live in two different realms – aware of each other’s existence but generally not very trusting of each other (SCIENCE).

Big data offers massive potential for understanding human connections at a societal scale, with rich spatial and temporal dynamics, and for spotting complex interactions and nonlinearities among variables. Those are the most thrilling frontiers in studying human behaviour.

Instead of focusing on a “big data revolution,” perhaps it is time to concentrate on an “all data revolution,” recognising that the critical change in the world has been innovative analytics – using data from all sources, traditional and new, to provide a deeper, clearer understanding of our world.



(Big) Data for women’s empowerment? – How does it work?

Eraptis, October 12.

In my last post, I asked how data could be used to measure the impact of the #HeForShe movement on women’s empowerment, and argued that theory could guide us in interpreting such data. But, by simple logic, data must first be generated before it can be analyzed – so how does data generation work in practice?

According to Morten Jerven, a basic point of departure when we want to know something about a population is first to establish what the population is. Only after we have established this can we know something about other properties affecting that population. From a development perspective, factors such as economic growth, agricultural production, education, and health measurements are all predicated upon population data to be meaningful. Many times, however, the actual population figure is not known but estimated through a population counting process commonly referred to as a census. Jerven illustrates the possible implications of census-taking in a development context through a case study from Nigeria, saying that:

“Today, we can only guess at the size of the total Nigerian population. In particular, very little is known about the population growth rate. The history of census-taking in Nigeria is an instructive example of the measurement problems that can arise in sub-Saharan Africa. It is also a powerful lens through which the legitimacy of the Nigerian colonial and postcolonial state can be observed” – Morten Jerven

Without getting into the particulars of the Nigerian census-taking case, Jerven points out an interesting aspect of this quote, which he elaborates further elsewhere in his book: the involvement of the state in the production of data and official statistics. Thus, argues Jerven, if a particular state is interested in achieving development, we should expect that it also has an interest in measuring that development. If that’s the case, then the availability of state-generated data should reflect its statistical priorities, which are likely to mirror its political priorities. We therefore arrive, once again, at the question asked in my first blog post – is all data created equal?


Directing that question to Data2x seems to yield the simple answer “No” – at least not yet. Data2x is a joint initiative of the UN Foundation, the Bill & Melinda Gates Foundation, and the William & Flora Hewlett Foundation dedicated to “improving the quality, availability, and use of gender data in order to make a practical difference in the lives of women and girls worldwide”. According to this report produced by Data2x, approximately eighty percent of countries produce sex-disaggregated data on education and training, labour force participation, and mortality, but only one third do the same for informal employment, unpaid work, and violence against women. Mapping the gender data gap across five development and women’s empowerment domains using 28 indicators identified several types of gaps for each indicator, as shown in this table:

big data womens empowerment

Source: Data2x

To close these gaps, Data2x argues that existing data sources should be mined for sex-disaggregated data, and that new data collection should be designed as a tool for social change, taking gender disaggregation into account already at the planning stage. But useful as they are, conventional data forms generated by household surveys, institutional records, and national economic accounts are not well suited to capturing a detailed account of the lives, experiences, and expressions of women and girls.

Can big data help close this gap? In this new report, Data2x shows it might, by profiling a set of innovative approaches to harnessing big data to close the gender data gap even further. For example, have you ever wondered how data generated by 500 million daily tweets, across 25,000 development keywords from 50 million Twitter users, could be disaggregated by sex and location for analysis? I admit that before reading the report, I cannot recall ever being struck by that thought. But, apparently, open data generated from social media platforms may not be sex-disaggregated from the outset.

To solve this, UN Global Pulse and the University of Leiden collaborated with Data2x to develop and test an algorithm for inferring the sex of Twitter users. The tool takes into account a number of classifiers, such as the name and profile picture of the user producing a tweet, and was developed so it could be applied on a global scale across a variety of languages. Its accuracy was assessed by comparing its classifications to those of a crowdsourced panel, whose answers were assumed to be correct. In 74% of cases the algorithm indicated the correct sex, a number that UN Global Pulse deems could be improved through further system development. The results of the project show great potential for generating new insights on development concerns, disaggregated by both sex and location, using data from social media channels, as shown in this screenshot of the online dashboard (go and explore it for yourself!):

big data womens empowerment

Source: UN Global Pulse
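The idea behind such a classifier can be sketched very simply. The snippet below is a toy illustration, not the UN Global Pulse tool: the name lists are invented for the example, and a real system would combine many more signals (profile picture, bio text) across many languages.

```python
# Toy sketch of inferring a user's sex from profile signals.
# The name lists below are invented for illustration only.

FEMALE_NAMES = {"maria", "fatima", "elena", "aisha"}
MALE_NAMES = {"mohammed", "juan", "ivan", "david"}

def infer_sex(profile):
    """Return 'female', 'male', or 'unknown' for a Twitter-like profile."""
    name = profile.get("name", "")
    first = name.split()[0].lower() if name.split() else ""
    if first in FEMALE_NAMES:
        return "female"
    if first in MALE_NAMES:
        return "male"
    # A real system would fall back on other classifiers here,
    # e.g. a model applied to the profile picture or bio keywords.
    return "unknown"

print(infer_sex({"name": "Maria Silva"}))  # female
print(infer_sex({"name": "David Jones"}))  # male
print(infer_sex({"name": "Alex"}))         # unknown
```

The "unknown" fallback is the important design point: an aggregate dashboard can drop ambiguous profiles, trading coverage for accuracy, which is one way to push past the 74% figure mentioned above.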

Before ending this post, I’d like to highlight three other relevant posts from our blog that take up the related questions of bias in data generated from social media, the issue of privacy, and the geo-mapping and visualization of big data. Read, reflect, and tell us what you think in the comments field below!



Big Data Visualization: A Big Asset for Decision Making

Aymen, October 10.

As defined in a previous article by Feinleib, Big Data is the ability to capture, store, and analyze vast amounts of human and machine data, and then make predictions from it. For their part, Beyer and Laney, in their definition of Big Data, stress that it is useful for enhanced insight and decision-making.

Indeed, storing large amounts of information is not useful by itself; the main goal is to use and combine this stock of data to facilitate decision-making. For example, financial markets increasingly rely on Big Data to trace stock prices and refine predictions for computer-based trading. From a development perspective, Big Data can be useful for following the progress of projects and better understanding the needs and expectations of beneficiaries.

Still, when we think of Big Data, large SQL or Excel tables, or algorithms, usually come to mind. These tools require a certain expertise to “read between the lines” and remain incomprehensible to the ordinary person. In his book “The Promise and Peril of Big Data,” David Bollier explains that Big Data usually relies on powerful algorithms that can detect patterns, trends, and correlations over various time horizons in the data, but also on advanced visualization techniques as “sense-making tools”.

In this sense, it is equally important to present eye-catching visualizations of the results extracted from Big Data. They first of all help make a large amount of information understandable and accessible, in addition to supporting the dissemination of findings through academic publications, reports, presentations, and social media. In his book “Data Visualization with JavaScript”, Stephen A. Thomas defines data visualization as the presentation of large amounts of data in a format that is easily understood by viewers. The simpler and more straightforward the presentation, the more likely the viewer is to understand the message.

Indeed, data visualization is an important feature of Big Data analytics, as it can provide new perspectives on findings that would otherwise be difficult to grasp. “Word clouds” – sets of words that have appeared in a certain body of text, such as blogs, news articles, or speeches – are a simple and common example of using visualization, or “information design,” to unearth trends and convey information contained in a dataset. Many alternatives to word clouds exist, such as geographic representation.
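Under the hood, a word cloud is little more than a word-frequency table rendered with font sizes proportional to counts. A minimal sketch of the counting step (the text sample and stop-word list are invented for the example):

```python
import re
from collections import Counter

# Invented sample text standing in for a blog post or speech.
text = """Big data can help development, but big data can also
carry risks. Data without context is rarely useful."""

# Tokenize, lowercase, and drop short words and common stop words.
stop = {"can", "but", "also", "is", "the", "a", "and", "without"}
words = [w for w in re.findall(r"[a-z]+", text.lower())
         if w not in stop and len(w) > 2]

counts = Counter(words)
# In a rendered cloud, each word's font size scales with its count.
for word, n in counts.most_common(3):
    print(word, n)
```

The counting is trivial; the visualization value comes entirely from mapping those counts to size and layout, which is why the same frequency table can also drive a bar chart or a geographic heat map.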

For instance, the infographic below, based on large amounts of information, explains how mobile technology is used worldwide as a tool to improve health care, education, public safety, entrepreneurship, and the environment. These worldwide initiatives are part of the 9th Sustainable Development Goal (SDG), which aims to “Build resilient infrastructure, promote inclusive and sustainable industrialization, and foster innovation.”

big data visualization

In this example, Big Data is doubly useful, as it underlies the projects launched in the various countries represented on the map. Concretely, it contributes to reducing maternal deaths from placenta praevia in rural Morocco. Furthermore, the simple visualization of this large volume of information facilitates analysis and can consequently help decision makers track the progress of projects; it can also serve as benchmark data for reproducing successful initiatives in other countries.

Geographic representation of Big Data is used in various fields, especially in monitoring migration movements; almost 200 academic studies involving big data and migration were published between 2007 and 2016. The Syrian refugee crisis is a significant example of the use of Big Data to visualize migration flows. The infographic below from the New York Times explains clearly how nearly half of Syria’s entire population was displaced by the civil war.

big data visualization

In a wider context, the geographic representation of migration flows between 2000 and 2016, based on Big Data gathered by the UN Refugee Agency (UNHCR), summarizes in a clear and synthetic way migrants’ countries of origin and of residence/asylum.

These are two examples of how geographic visualization of Big Data can – through the historical track record – predict more accurately how many more refugees can be expected in the coming years, and at which points of entry. As a result, military, police, and humanitarian efforts can be more coordinated and pre-emptive based on this information.
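In its simplest form, the prediction step hinted at here is just a trend fitted to historical arrival counts. A toy sketch with invented yearly figures (not UNHCR data; real forecasts use far richer models):

```python
# Toy least-squares trend fit on invented yearly arrival counts.
years = [2012, 2013, 2014, 2015, 2016]
arrivals = [10_000, 14_000, 21_000, 30_000, 38_000]  # invented figures

n = len(years)
mean_x = sum(years) / n
mean_y = sum(arrivals) / n

# Ordinary least squares: slope and intercept of the trend line.
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(years, arrivals))
         / sum((x - mean_x) ** 2 for x in years))
intercept = mean_y - slope * mean_x

# Extrapolate one year ahead to plan capacity at entry points.
forecast_2017 = slope * 2017 + intercept
print(round(forecast_2017))  # 44200
```

Plotted on a map, per-entry-point forecasts like this are what let agencies position resources before arrivals peak, rather than after.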

It is important to remember that storing large amounts of information is not useful by itself; its main goal is to use and combine this stock of data to facilitate decision-making and create added value. The analysis of Big Data is a big asset, as it can facilitate tracking the progress of projects, understanding migration trends, or better-coordinated mapping of conflict or adversity and delivery of aid to people in dire circumstances.

However, we must remember that Big Data also enables strategies of surveillance, control, and population management. Big Data involves the quantification, classification, and construction of individuals, populations, and categories that are never impartial or objective but embedded in socio-political contexts. It is the researcher’s ethical role to keep these crucial points in mind when deciding what data to use and how to obtain, treat, store, and share it. In the UK, for instance, the manipulation of data and statistics has played a major role in bolstering anti-immigration narratives and xenophobic political agendas. This is to say that Big Data, raw or visualized, is not harmful in itself, but the way it is used can indeed be dangerous.



Goda, October 8.

In previous posts, big data has been characterised as a fuel driving the next industrial revolution into every aspect of economic and social life. Moreover, it was highlighted that the handling of data is a central component of creating trust online (Spratt & Baker, 2016).

As social, economic, and financial activities in developing countries have moved into virtual space, huge amounts of information, including personal data, have been transmitted, stored, and collected globally.

Thus, one of the main issues with moving activities online is that the present regulatory environment for data protection is far from ideal. Many social and cultural norms around the world include respect for privacy – some countries protect privacy as a fundamental right, while others embed individual privacy in constitutional doctrines or similar documents. Nevertheless, certain countries are still in the process of adopting such rules (UNCTAD, 2016).

Today, personal data is the fuel that drives much commercial activity online. The relevance of data protection and the need to control privacy are therefore increasingly important, not only in the global economy and international trade but also in social media (UNCTAD, 2016). From publicly available data on social media platforms, it is easy to find anyone’s interests, political or religious views, shopping habits, and so on. I believe most people would feel really uncomfortable knowing that someone knows that much about them.

So, everything can be tracked and controlled through the information generated by online activities, and this has become a concern for global data protection, privacy, security and trust.

How is Facebook using Big Data?

Facebook, the world’s most popular social media network, is sometimes called a massive data wonderland. It has been estimated that there will be more than 169 million Facebook users in the United States alone by 2018. “Facebook is the fifth most valuable public company in the world, with a market value of approximately $321 billion” (Monnappa, 2017).

Every second, countless photos and comments are uploaded, posted, liked and shared on Facebook. At first this information doesn’t seem very meaningful, but considering that this giant social networking platform knows who people’s friends are, what they look like, where they are and what they are doing, some researchers say Facebook has enough data to know people better than their therapists do. Moreover, as mentioned before, for the same reason it has been widely used for political activities as well.

“Apart from Google, Facebook is probably the only company that possesses this high level of detailed customer information” (Monnappa, 2017). Facebook has always assured its users that their details are shared only with their permission. Nevertheless, there have always been serious privacy concerns among those users. For example, many complain that Facebook’s privacy settings are not clearly explained or are too complex, and it is easy for people to share things unintentionally.


Furthermore, there have been several cases in the United States and the UK, such as Schrems v Facebook, initiated by consumers and civil liberties organisations to challenge the extent of the surveillance. One important result of the case was the renegotiation of the Safe Harbour agreement (now known as the EU-U.S. Privacy Shield), which includes a commitment to stronger enforcement and monitoring of privacy and data protection (UNCTAD, 2016).

Moreover, a couple of years ago the Belgian privacy commission took Facebook to court over alleged privacy breaches and online tracking of users. According to the report on which the commission acted, Facebook was tracking, on a long-term basis, users who visited any of its pages (Gibbs, 2017). As an outcome of the case, the 28 EU Member States prepared a draft European privacy law that would strengthen national regulators’ powers over companies like Facebook (Schechner and Drozdiak, 2015).

It’s no secret that data privacy is a huge concern for companies that deal with big data. With the help of new technologies, someone may know more about people than they know about themselves, which is frightening. One consequence is that many people have become slaves to data and are terrified of social media.

Therefore, not only countries, societies and companies but also people themselves need to take responsibility for their actions. Computers are amazing tools, but many people have forgotten that they should be used just as tools. We should not forget that the best computer ever created is our brain.


Oct 17

Biased #data

Diana, October 5.


Continuing from my previous post, in this one I’ll talk more about biased data.

As I mentioned before, Big Data is unfortunately not objective but a human creation. Taylor and Schroeder stress that even complete information on a matter can be difficult to understand, and those who hold it may be unwilling to share it. Also, if we are not critical enough towards the data we receive, we can accept false information at face value, without evidence.

Big Data is everywhere. Large organisations and “development professionals” such as the United Nations (UN) or the Organisation for Economic Co-operation and Development (OECD) use these kinds of data for research and exploration. They meet many technical concerns along the way, and risks and issues of bias have tended to dominate the discussion so far.

Taylor and Schroeder point out the role of biased data in development politics. One example is how data is politicised: even correct data may not be accepted, since all information has to be agreed upon in order to be useful to country authorities as support for policy decisions. Many developing countries have this problem, where reliable information is hard to acquire and officials censor information coming from sectors of the population who feel underrepresented.


Kate Crawford, a Principal Researcher at Microsoft Research New York City, a Visiting Professor at MIT’s Center for Civic Media and a Senior Fellow at NYU’s Information Law Institute, studies the social impacts of big data and is currently writing a book on data and power for Yale University Press. She published an article in Harvard Business Review: “The Hidden Biases in Big Data”.

Hidden biases in both the collection and analysis stages present considerable risks and are as important to the big-data equation as the numbers themselves. — Kate Crawford.

Crawford uses an example to explain hidden bias in data. There were more than 20 million tweets about Hurricane Sandy between October 27 and November 1. A study shows that these data do not represent the whole picture. The highest number of tweets about Sandy came from Manhattan, a borough with high levels of smartphone ownership and Twitter use, which creates the illusion that Manhattan was the hub of the disaster. Far fewer messages originated from affected locations such as Breezy Point, Coney Island and Rockaway, and even fewer tweets came from the worst-hit areas.

Here we can ask ourselves: how do the people outside of affected areas know about what is really happening there?
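The sampling bias Crawford describes can be sketched with made-up numbers (all figures below are hypothetical, not from the Sandy study): raw tweet counts favour the area with the most Twitter users, while normalising by the size of each area’s user base can point somewhere else entirely.

```python
# Hypothetical figures, for illustration only: storm tweets and
# estimated active Twitter users per area.
areas = {
    # area: (storm_tweets, active_twitter_users)
    "Manhattan":    (900_000, 1_500_000),
    "Breezy Point": (24_000,     30_000),
}

# Raw counts: the area with the biggest user base "shouts loudest".
loudest = max(areas, key=lambda a: areas[a][0])
print(loudest)  # Manhattan

# Normalising by user base (tweets per active user) changes the picture.
rate = {a: tweets / users for a, (tweets, users) in areas.items()}
print(max(rate, key=rate.get))  # Breezy Point (0.8 vs 0.6 tweets per user)
```

The point is not the particular numbers but the habit: before letting a big data set "speak for itself", ask who is over- and under-represented in it.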

We rely more and more on Big Data’s numbers to speak for themselves, but the risks of misunderstanding the results, and in turn misdirecting important public resources, are as big as the data itself. “Development professionals” make this mistake too: they rely on information without questioning it. Such misinformation can send the wrong kind of help to the wrong place, or become an obstacle to aid relief.

Taylor and Schroeder give a similar example of biased data: “development professionals” using mobile phone data to track population movement in disaster relief. The problem with collecting this data is that it is incomplete: not everyone uses a mobile phone, and usage is particularly low among vulnerable and ‘hidden’ populations such as children, the elderly, the poorest and women.

As we move into an era in which personal devices are seen as proxies for public needs, we run the risk that already existing inequities will be further entrenched. Thus, with every big data set, we need to ask which people are excluded. Which places are less visible? What happens if you live in the shadow of big data sets? — Kate Crawford.


Sep 17

Big Data: a Tool to Improve Local Governments

Aymen, September 30.

Nowadays, data is everywhere around us, from our smartphones and tablets to the laptops and PCs on our desks; data is pervasive and plentiful. Indeed, as the Economist reported, “The world’s most valuable resource is no longer oil, but data”. Big Data may create risks with respect to rights, as surveillance opportunities increase and the growth in e-waste creates environmental risk, but it also generates a wealth of opportunities.
