26
Oct 17

Who are they? Citizens who are creating the big data?

Diana, October 26.


In this, my final post, I would like to review the topic of Big Data one last time. I have been writing about the challenges of Big Data, as well as about the hidden bias in it. One of the points I raised in my earlier posts was that it is citizens – people – who create Big Data. By sharing, tagging, commenting and writing posts, people create all the information that is then selected from the big sea of data. As this group has mentioned many times before, data can, on one hand, be very useful and helpful, but, on the other hand, be damaging and carry risks.

Let’s, then, look at the beginning, namely at those people who write, share, like and tag. Who are they?

I can see two types of people who create information on the internet: “development professionals” and “readers”. I have already introduced “development professionals” – professional organisations such as NGOs. Thomas Tufte states that technological frameworks have opened up a pathway to spread information more effectively to locals, governments and donors in order to provide improved aid, for example in health and education.

“Development professionals” expect transparent, legitimate data so that they can collect the information they require and, in turn, know where they are needed and what kind of help is needed from them. I’m not going into details here; my group mates and I have already discussed both the risks and the opportunities of big data usage by organisations. The group has also highlighted some positive and negative examples of big data usage. Instead, I’ll go directly to the second type of people responsible for creating big data.

On the other side, Michael Mandiberg talks about “readers”. Readers are people who used to just read information on the internet but who have become writers themselves and now have control over the information on social media.

Give the people control of media, they will use it. The corollary: Don’t give the people control of media, and you will lose. Whenever citizens can exercise control,
they will. – Jeff Jarvis

Recent developments in social media have given people the freedom not just to communicate with each other but also to express their opinions and thoughts online. New technological frameworks have emerged that enable media creation, such as blogging, where people can speak openly about any issue, any matter, any problem. These new frameworks gave way to so-called amateur media, which runs the risk of becoming bigger than any of the professional organizations.

Furthermore, “amateurs” can be a source of biased data, as they don’t always double-check the legitimacy of the information they are spreading. I’m not going to go into more detail; I already provided some examples of biased data in my previous posts.


In conclusion, I have to say that this post is ironic, because we are newly created bloggers (well, some of us) writing on the subject of #BigData. Who is to say that we are legitimate? We might just be those so-called “amateurs”. Are we contributing transparent data or biased data? Yes, we are using academic references. Yes, we are doing our research before publishing our blog posts. No, we are not professionals. Who are we then? Here, at the end of my discussion, I have discovered a third group of people who create information on the internet: us 🙂

Thank you for the time you spent on our blog – reading it, commenting, participating in discussions and sharing it! It was a fun time for all of us! I hope that you enjoyed it as much as we in this group did!

 


24
Oct 17

Have you #HeForShe’d yet?

Eraptis, October 24.

Tonight, Studentafton hosted an event with the head of #HeForShe, Elizabeth Nyamayaro, at Lund University, Sweden.

 

Here are some of the things being tweeted by #HeForShe during the event!

Elizabeth Nyamayaro emphasizes the need for all (both men and women) to engage in order to create lasting change:

To this day, no country has yet achieved gender equality. Thus, it’s not only a “Global South” problem – change needs to happen everywhere:

The need to act when required, to not be passive in the face of inequality and injustice:

But what about data?

DIAL* asked the same question – is all data created equal?

* DIAL (Digital Impact Alliance) is a “partnership amongst some of the world’s most active digital development champions” including: UN Foundation; Bill & Melinda Gates Foundation; SIDA (Swedish International Development Cooperation Agency); and USAID 

Ping: Have you #HeForShe’d yet? Data for women’s empowerment?

 


23
Oct 17

Event: WEF India Summit 2017. Tracing the ICT4D discourse

Eraptis, October 23.

I’d like to use my concluding blog post to zoom out a bit, take a bird’s-eye view of our vibrant discussions about the interconnectedness of social media, data, and development, and place them within a wider discussion about development discourse. The question I’d like to ask is really: how much space does the ICT4D discourse occupy in mainstream development narratives? To do so, I have chosen to focus on a single session, coupled with a number of tweets, from the recent World Economic Forum India Summit 2017, which took place in New Delhi on 4-6 October 2017.

The World Economic Forum (WEF) is an independent not-for-profit foundation based in Geneva, Switzerland. It was established in 1971 with the mission to “improv(e) the state of the world” through public-private cooperation, and as such engages the “foremost political, business and other leaders of society to shape global, regional, and industry agendas”. Since 1985, WEF has annually organized the “Indian Economic Summit”, an event specifically aimed at shaping the political, economic and industrial agendas of India in partnership with multiple stakeholders as outlined above. The topic of this year’s event was perhaps specifically indicative of this aim: Creating Indian Narratives on Global Challenges.

Apart from broadcasting and recording the live sessions of these events, WEF also shares additional content through blogs and reports, which are distributed through its website and shared on various social media channels such as Twitter, YouTube, and Facebook. Participation in the events is, however, restricted to specific stakeholders.

The session I chose to cover directly addressed the overall topic of the event, “Creating Indian Narratives”, and took the form of a panel discussion comprising the following participants, representing government, private, and non-governmental stakeholders.


From top left: Ajay S. Banga, President and CEO of Mastercard; Dipali Goenka, CEO of Welspun India Ltd; Piyush Goyal, Minister of Railways and Coal; Malvika Iyer, Member of the UN Inter-Agency Network on Youth Development’s Working Group on Youth and Gender Equality; Karan Johar, Head of Dharma Productions; Sunil Bharti Mittal, Chairman of Bharti Enterprises

My first impression of the session was that several of the participants did indeed mention various ICT solutions as part of a new Indian narrative numerous times. This was not wholly unexpected, judging both by the composition of the panel and by the mission of the WEF. In fact, Murphy & Carmody go as far as characterizing organizations like the WEF and the international financial institutions as forms of social movements in their own right (albeit top-down ones), with the aim of advancing corporate globalization and the neoliberal agenda. In this sense, the ICT4D discourse seems to occupy a significant amount of space in mainstream development discourse, at least in the more traditional branches focusing on economic growth and structural transformation as drivers of development.

But in what way does ICT play a role in this kind of development? In order to unpack some of the statements made by the panellists, I found the following conceptualization provided by Murphy & Carmody very useful. By distinguishing forms of ICT integration into “thin” and “thick” categories, the importance and strength of ICT integration can be meaningfully discussed. Thin, or imminent, forms of ICT integration often lead to cumulative gains in productivity and efficiency at the level of the individual or firm. Thick, or immanent, ICT integration is more transformative in character, often leading to new forms of industrial organization and practices at the industry and market level – what Schumpeter would call “creative destruction”. So, with this in mind, what kind of ICT integration did the panellists advocate for?

One of the most interesting exchanges from this perspective occurred in the first round of addresses, where Ajay S. Banga, CEO and President of Mastercard, began by speaking about India’s “productivity challenge” as a barrier to growth, resulting in large part from a large portion of the Indian labor force being employed in the informal sector. For him, transforming the economy to provide formal job opportunities must be part of the new Indian narrative. However, one of the challenges facing this transition is a relatively low-skilled labour force, which furthermore is not incentivized to gain new skills in an informal environment, coupled with a low incentive for firms to invest in such an environment. Dipali Goenka continued Ajay’s line of reasoning from the perspective of her own industry (textiles), arguing on the one hand that India is still an agricultural economy, with these sectors employing the largest share of the workforce (mainly women), and on the other that digitization could be a way of educating women to increase their skills and productivity in that sector. Dipali also argued for smartphone usage among cotton farmers to increase their agricultural productivity through better access to information on weather and crop conditions.

Furthermore, Minister Goyal in the second round of addresses talked about the role of entrepreneurship and provided an example of ICT solutions as an enabling force empowering individuals along the railroad nodes to franchise ticket sales through mobile devices by small-scale entrepreneurs. Ajay then added that his own corporation (Mastercard) has played a significant role in providing financial services to a large number of people in the world through electronic and mobile payment solutions and digitization, which he argued has increased efficiency and safety in the overall payments systems. This kind of transformation in the way of doing things, Ajay argued, is part of changing the Indian narrative. Moreover, a large part of this kind of business and system development within Mastercard is developed in India, by Indian developers, and is then exported to the rest of world.

Applying Murphy & Carmody’s conceptual framework of imminent and immanent ICT integration to these arguments generates some interesting insights. Firstly, it could be argued that Ajay’s opening statement regarding the “productivity challenge” is, to some extent, a call for an immanent, deep transformation of both the structure and the organization of the productive formal sectors in order to absorb and employ labour from the unproductive informal sectors. Dipali’s argument, by contrast, can be viewed more as an imminent form of ICT integration, where efficiency and productivity gains could be achieved both in the textile industry, by empowering and educating female workers through ICT solutions, and in the downstream (garment) and upstream (cotton farming) sectors of the value chain. Likewise, Minister Goyal’s suggestion of franchising train ticket sales through a mobile application to small-scale local entrepreneurs, instead of opening local sales offices, is an example of imminent ICT integration where increased efficiency may occur at the level of the individual or the firm. But what about Ajay’s last argument regarding Indian-driven innovation in electronic and mobile payment system development? Surely this must be an example of thick, immanent ICT integration fundamentally transforming industry practices? Yes, but not necessarily in India, as this depends on a multitude of things. On the one hand, and as Ajay himself expressed it, new innovations are being developed “for the world”. Thus, the benefits from both the uptake and the development of these new technologies may occur elsewhere as a form of economic extraversion, and profits generated from these new ICT solutions may be accumulated where the company’s head office is situated (Murphy & Carmody). Whether the skills and capabilities developed eventually spill over to local entrepreneurs and spur innovation at the national level, countering some of the dependency on multinational corporations, is not certain.

To answer my initial question – “how much space does the ICT4D discourse occupy in mainstream development narratives?” – it seems, from this example, that it occupies quite a significant amount of space, both directly and indirectly. What is more interesting, however, is trying to unpack what kind of development it is, and how deep it goes. Murphy & Carmody’s framework has been very helpful in thinking about the ICT4D discourse from this critical perspective. What are your thoughts? And what are other potential threats and opportunities of digitization and ICT4D?

 


20
Oct 17

Can Big Data Help Feed The World?

Aymen, October 20.

Big Data goes beyond the mere existence of data. The ability of Big Data techniques to generate insights by synthesizing data from a range of sources may hold the greatest potential and carry the greatest risks of all. On one hand, Spratt and Baker, in their report “Big Data and International Development: Impacts, Scenarios and Policy Options”, explain that Big Data can be manipulated to promote certain political agendas or to increase the possibilities for governments and large corporations to discriminate against certain groups or individuals.


On the other hand, Big Data may have a positive environmental impact as well as great potential in agriculture and rural development. It can bring new insights and decision points that lead to product and service innovations. This potential touches on, for example, precision farming with very efficient water and fertilizer use, food security coordination through tracking, tracing and transparency, and personalized health and nutrition advice. The availability of easily accessible data plays a major role in documenting the quality standards of agricultural products, saving time and improving productivity.

Several projects launched by development organizations rely on Big Data to optimize agriculture. For example, FAO launched the Virtual Extension and Research Communication Network (VERCON) in more than 10 countries in Africa, Asia, Eastern Europe, Latin America and the Near East. According to FAO, VERCON is a conceptual model that employs internet-based technologies and Communication for Development methodologies to facilitate networking, knowledge sharing and interaction among agricultural institutions, producer organizations and other actors of the agricultural innovation system.

In Egypt, for instance, where the first project was launched in early 2000, 100 VERCON access points were installed in various places, such as extension units, agricultural directorates, research institutions and stations, and Development Support Communication Centers. They were connected to the internet to give farmers access to an agricultural economic database as well as news and bulletins to help them solve their problems. In addition, the platform was useful for sharing the ideas and experience of local farmers and for monitoring the whole project.

The VERCON project was successful since it relied on existing organizational structures and links. Also, the platform ensured rapid response to user feedback thanks to regular monitoring and access to monitoring results. It used rural and agricultural appraisals at the field level to ensure that the virtual network would be accurately focused on the information and knowledge needs of the larger agricultural community.

The project was successful and the Rural and Agricultural Development Communication Network (RADCON) was set up to engage with a wider range of rural and agricultural development issues and to extend the VERCON network to a wider range of stakeholders, including farmer organizations, youth centres, universities, and NGOs.

However, the remaining challenge for the project is the use of ICTs by the farmers themselves. Despite the success of projects that employ Big Data for rural development, farmers in the developing world often face difficulties in meeting the quality and safety standards set by the developed world. The conditions that stimulated the growth of Big Data in the farming industry of the global north – such as the widespread adoption of mechanized tractors, genetically modified seeds, and computers and tablets for farming activities – are less prevalent in developing countries. While large growers can afford specialized machinery, small farmers do not have this opportunity. As a result, they can neither access the data nor interpret it.

Big Data for rural development can help analyze large amounts of information – related, for example, to rainfall or pest vectors – and give valuable insights into important issues such as climate change, weather patterns, and disease and pest infestation patterns. However, this valuable information largely benefits the Big Data industry in the Global North. It can have a positive impact on big farmers in the global south, but rural communities might be excluded as they still have little or no access to ICTs.

Nowadays, as noted by Spratt and Baker, those in favour of Big Data adopt an evangelical tone to argue for its benefits, while those against it tend to stress its dystopian nature. It is important to remember that Big Data is a very recent phenomenon; according to sciencedaily.com, a full 90 percent of all the data in the world has been generated since 2011. In practice, we don’t yet have the necessary distance to evaluate its real impact.
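To get a feel for what a figure like that implies, here is a back-of-the-envelope calculation. The two-year window is an assumption taken from the widely cited 2013 version of the claim (“90 percent of the world’s data was generated over the last two years”); the arithmetic, not the statistic, is the point:

```python
# Back-of-the-envelope: if 90% of all data existing today was generated
# within the last `window_years` years, then the total stock grew by a
# factor of 1 / 0.10 = 10 over that window.
# The two-year window is an assumption, not a figure from this post.
old_share = 0.10      # share of today's data that already existed at window start
window_years = 2

growth_over_window = 1 / old_share                       # 10x over the window
annual_growth = growth_over_window ** (1 / window_years)  # geometric mean per year

print(f"implied growth: {annual_growth:.2f}x per year")
```

Under these assumptions the stock of data roughly triples every year, which helps explain why we lack the distance to evaluate its real impact.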

When it comes to agriculture, farmers all over the world must produce more to feed the world’s rapidly expanding population in the coming decades. Will big data help feed nine billion people by 2050? Time will tell…

 


18
Oct 17

Is BIG DATA against or for low-income countries?

Goda, October 18.

The development of new ICTs has brought many changes into our day-to-day lives. These technologies are often seen as undoubtedly good, with a recognised capacity to make the world better. Big data is one of their key elements. According to Spratt & Baker, big data is believed to offer new and higher knowledge “with the aura of truth, objectivity, and accuracy” (Spratt & Baker, 2015).

However, my last post will focus on Unwin’s argument that even if the purpose and introduction of such technology has the potential to do good, quite often this potential has negative outcomes, especially for poor and marginalized communities. Moreover, although big data is seen as offering new solutions for development issues (Spratt & Baker, 2015), it is mainly focused on “what is”, rather than on “what should be” (Unwin, 2017).

The benefits and risks of big data have been discussed in all our previous posts from many different perspectives and illustrated using different examples. As mentioned earlier, it can be used in various decision-making models. It can create added value or be used for manipulative purposes. So, as data becomes all-pervasive in our lives, it is indeed getting more difficult to achieve the right balance between its possibilities and dangers.


According to Unwin, big data is designed “with particular interests in mind, and unless poor people are prioritized in such design they will not be the net beneficiaries“ (Unwin, 2017). In other words, big data primarily serves the interests of governments or shareholders and is much less interested in people, especially those from low-income populations. Despite such issues, as clearly stated in the previous posts, governments still play an inevitably important role in creating the legislative and policy framework.

This concluding post highlights the most important aspect of big data to be taken into account: technologies need to focus on empowering people, especially people from less developed countries, rather than controlling them.

Therefore, there is no doubt that big data has been used for reasonable purposes. However, it is difficult to decide whether all of them are positive. The use of social media to provoke a certain political change can be seen as good and bad at the same time. Big data can be an opportunity in various contexts as well as a problem that needs to be solved. Everything depends on the context, the particular situation and the particular human intervention (Unwin, 2017).

Moreover, in terms of the less developed countries, as the world becomes even more digitally connected, there is a real need for richer countries and international organizations to share knowledge and technical capacity in order to improve global digital security. While this can raise privacy issues, it needs to be discussed openly and transparently within countries, especially where it relates to equal decision-making towards the reduction of inequality.

Additionally, even if Unwin declares that much more attention needs to be paid to the balance of interests between the rich and the poor than to the ways in which data are used, I agree with Read, Taithe and Mac Ginty that for data to become explicit, it requires careful analysis of how it is being gathered and used – especially as the technology itself becomes cheaper and social networking platforms such as Facebook become mainstream forms of communication (Read, Taithe & Mac Ginty, 2016).

However, it is not just access to technology that matters. The data revolution risks strengthening specialists at headquarters. Thus, not only does access to connectivity need to be provided, but governments should also be innovative and open to new ideas. Appropriate content should also be integrated – content that empowers and includes less developed countries and helps them use big data in their own interests (Unwin, 2017).

Nevertheless, despite all the risks for poor communities, there are also many potential benefits of big data analysis. Among other things, such information can offer more employment opportunities, transform health care delivery, and do even much more than that (slate.com, 2016).

Therefore, the capacity to meaningfully analyse big data remains just as important as the balance between developed and developing countries (Rettberg, 2016).

 


16
Oct 17

Big (biased) Data. What can go wrong?

Diana, October 16.

The question I want to ask today is: what can go wrong when using Big Data? With this, I move from theoretical posts to a more explanatory one, showing with a particular example how Big Data can be a risk for “development professionals” seeking to provide aid.


In my previous posts, I talked about the risks and hidden bias in Big Data. I mentioned that people are the ones responsible for data construction.

Google, and the Internet more generally, are widely regarded as groundbreaking innovations that have changed the way millions of people live their lives. Yet researchers and practitioners in the field of ICT and development often struggle to demonstrate the technology’s explicit influence to “development professionals”. There are definite reasons why certain projects fail, and there are even some generalisable patterns of failure.

One example is Google – the most used search engine in the world, where millions of people can find all kinds of information that affects their daily lives. In 2008, Google came up with what it thought was a brilliant application – Google Flu Trends – to track the flu and its spread across the world. This was done in order to help “development professionals” provide aid to affected areas. Google claimed that it could see the advance of flu based on people’s searches. The essential idea was that when people are sick with the flu, they search for flu-related information on Google, providing almost instant signals of overall flu prevalence.

But this concept didn’t work. Why? Let’s examine.

David Lazer and Ryan Kennedy write in SCIENCE that Google relied too much on simple search data, which led to the spectacular failure of Google Flu Trends: at the peak of the 2013 flu season, the application overestimated flu prevalence by 140 percent.

As I mentioned in my previous post, it is hard to know what is really happening in an affected area if you are not actually present there.

That is what happened here – Google didn’t take into account that many people with the flu don’t actually use the search engine to look for flu-related information. Furthermore, Google didn’t research how many people rely on the internet to find information about the flu. Google also didn’t take into account all those people who use Yahoo or Bing instead of Google.

David Lazer and Ryan Kennedy – professors of political science at Northeastern University and the University of Houston, respectively – continue that Google’s algorithm was particularly vulnerable to overfitting to seasonal terms unrelated to the flu. With millions of search terms being fit to the data, there were bound to be searches that were strongly correlated by pure chance.

These terms were unlikely to be driven by actual flu cases or to be predictive of future trends. Moreover, Google did not take into account variations in search activity over time. These errors are not randomly distributed: an old error predicts a new error, and the scale of the error varies with the time of year (seasonality). These patterns mean that Google Flu Trends overlooked significant information that could have been extracted by traditional statistical methods.
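A toy simulation can make this multiple-comparisons problem concrete. The sketch below is my own illustration, not Google’s actual pipeline: it correlates a synthetic seasonal “flu” signal with tens of thousands of random “search term” series that have no relation to it, and shows that with enough candidates some of them correlate noticeably by pure chance:

```python
import numpy as np

rng = np.random.default_rng(0)

# A toy "flu" signal: a yearly seasonal cycle plus noise, weekly data over 3 years.
weeks = np.arange(156)
flu = np.sin(2 * np.pi * weeks / 52) + rng.normal(0, 0.3, size=weeks.size)

# 50,000 random "search term" series with no real relation to the flu at all.
terms = rng.normal(size=(50_000, weeks.size))

# Pearson correlation of every term with the flu signal, via z-scores.
flu_z = (flu - flu.mean()) / flu.std()
terms_z = (terms - terms.mean(axis=1, keepdims=True)) / terms.std(axis=1, keepdims=True)
corrs = terms_z @ flu_z / weeks.size

# For a single random series this correlation hovers near zero, but the
# best of 50,000 candidates looks spuriously "predictive".
print(f"strongest correlation found by pure chance: {corrs.max():.2f}")
```

Screening millions of real search terms against a short flu time series picks up exactly this kind of chance correlate unless it is explicitly guarded against, which is the weakness Lazer and Kennedy describe.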


Google, like the Internet as a whole, is continuously changing because of the actions of millions of engineers and consumers. Researchers need a better understanding of how these changes transpire over time. Scientists need to replicate findings using these data sources across time, and using other data sources, to guarantee that they are observing robust patterns and not temporary trends. For instance, it is entirely feasible to do controlled experiments with Google, e.g., observing how Google search results differ based on location and past searches.

More generally, studying the evolution of the socio-technical systems embedded in our societies is fundamentally important and worthy of study. The algorithms underlying Google help regulate what we find out about our health, politics, and friends.

It’s not just about the size of the data. There is a tendency for big data research and more traditional applied statistics to live in two different realms – aware of each other’s existence but generally not very trusting of each other (SCIENCE).

Big data offers massive potential for understanding human connections at a societal scale, with rich spatial and temporal dynamics, and for spotting complex interactions and nonlinearities among variables. These are the most thrilling frontiers in the study of human behaviour.

Instead of focusing on a “big data revolution”, perhaps it is time to concentrate on an “all data revolution” – one that recognises that the critical change in the world has been innovative analytics, using data from all traditional and new sources, and providing a deeper, clearer understanding of our world.

 


14
Oct 17

#BigData – Big Threat to Human Rights?

Louise, October 14.

I have recently explored how big data can be an opportunity in the context of human rights. Among other things, I have taken a closer look at the phenomenon called citizen-generated data – data in the hands of the people who are fighting to bring about change. But what happens when big data is controlled by governments or large corporations whose purpose is not to promote respect for human rights but instead to advance their self-interest or power status?

“Those who argue for the benefits of big data often adopt an evangelical tone, while opponents tend to stress the dystopian nature of big data future”

Spratt & Baker (2016:5)

It is no secret that we live in what could be described as a digital age, where technology is advancing and becoming more powerful by the day. The possibilities for governments and other actors, such as large companies, to compile big data, including personal data, are steadily growing. It has been stressed that these new technologies are important for issues such as national security. But at the same time, the very same technologies increase the possibilities for governments and large corporations to discriminate against certain groups or individuals (Spratt & Baker, 2016:14).

Even in long-term, stable democracies such as Sweden, these advanced technologies can potentially be used to infringe on rights such as the right to privacy, the right to not be discriminated against, and the right to effective remedy. There are numerous examples of how both companies and governments use various technologies relating to big data to interfere with people’s private lives. This has been demonstrated not least in relation to the increased efforts to combat terrorism.

In Sweden, which is generally considered a country where human rights are widely respected, there are several notable examples of when companies or state authorities have collected big data in ways that violate or may violate people’s right to privacy.

For example, in 2014, the website Lexbase was launched. The website provides access to people’s criminal records so that others can find out, in a user-friendly way, whether their neighbour, colleague or new date is in the system. The service has been deemed legal as it provides official records, but it has been widely criticised for interfering with people’s personal data. What was not legal, however, was when the Swedish police compiled personal data on close to five thousand Swedish Roma citizens with no criminal records. In April 2017, the Swedish state was subsequently found guilty of ethnic registration.

“No one shall be subjected to arbitrary interference with his privacy, family, home or correspondence, nor to attacks upon his honour and reputation. Everyone has the right to the protection of the law against such interference or attacks.”

Universal Declaration of Human Rights, Article 12

As stated by Spratt & Baker (2016:14), not all governments use these new technologies that allow for the collection of big data in a negative way, but the potential for them to do so is expanding. For example, authoritarian regimes now have the possibility to monitor opposition members, human rights defenders, independent journalists and other critical voices. Seeing this in relation to other forms of state repression, the challenges for those working to promote and protect human rights could be immense.

Let us take Russia as an example. Since 2015, internet companies are required by law to store personal information – big data – about Russian citizens on servers in Russia. This enables the authoritarian government to mass-surveil its population while at the same time strictly interfering with the citizens’ internet freedoms. This form of control over big data has a severe impact also on other human rights and freedoms such as the rights to freedom of expression, association and assembly.

“Everyone has the right to freedom of opinion and expression; this right includes freedom to hold opinions without interference and to seek, receive and impart information and ideas through any media regardless of frontiers.”

Universal Declaration of Human Rights, Article 19

A related issue that should not be forgotten is self-censorship as a means of adapting to state-controlled data collection. Opposition members, independent media, and human rights defenders are forced to adjust their behaviour and communication patterns to avoid state repression, a phenomenon that often suffocates political discourse by crowding out critical voices.


In the title of this post, I asked whether big data is a threat to human rights. Having thought about this for quite a while now, I would say both yes and no. Big data is not the perpetrator, but it is the weapon. In my next post, I will return to that thought when I discuss what more can be done to ensure that big data is handled correctly.

 


12
Oct 17

(Big) Data for women’s empowerment? – How does it work?

Eraptis, October 12

In my last post, I asked how data could be used to measure the impact of the #HeForShe movement on women’s empowerment and argued that theory could guide us in interpreting such data. But data must first be generated before it can be analyzed. How does data generation work in practice?

According to Morten Jerven, a basic point of departure when we want to know something about a population is to first establish what that population is. Only then can we know something about the other properties affecting it. From a development perspective, factors such as economic growth, agricultural production, education and health measurements are all predicated on population data to be meaningful. Often, however, the real population number is not actually known but is estimated through a counting process commonly referred to as a census. Jerven illustrates the possible implications of census-taking in a development context through a case study from Nigeria:

“Today, we can only guess at the size of the total Nigerian population. In particular, very little is known about the population growth rate. The history of census-taking in Nigeria is an instructive example of the measurement problems that can arise in sub-Saharan Africa. It is also a powerful lens through which the legitimacy of the Nigerian colonial and postcolonial state can be observed” – Morten Jerven

Without getting into the particulars of the Nigerian case, Jerven points out an interesting aspect of this quote, which he elaborates elsewhere in his book: the involvement of the state in the production of data and official statistics. Thus, argues Jerven, if a particular state is interested in achieving development, we should expect it also to have an interest in measuring that development. If so, the availability of state-generated data should reflect its statistical priorities, which in turn are likely to mirror its political priorities. We therefore once again arrive at the question asked in my first blog post – is all data created equal?

Directing that question to Data2x seems to yield the simple answer “No”, at least not yet. Data2x is a joint initiative of the UN Foundation, the Bill & Melinda Gates Foundation, and the William & Flora Hewlett Foundation dedicated to “improving the quality, availability, and use of gender data in order to make a practical difference in the lives of women and girls worldwide”. According to this report produced by Data2x, approximately eighty percent of countries produce sex-disaggregated data on education and training, labour force participation, and mortality, but only one third do the same for informal employment, unpaid work, and violence against women. Mapping the gender data gap across five development and women’s empowerment domains using 28 indicators, the report identified several types of gaps for each indicator, as shown in this table:


Source: Data2x

To close these gaps, Data2x argues that existing data sources should be mined for sex-disaggregated data, and that new data collection should be designed as a tool for social change that builds in gender disaggregation from the planning stage. But useful as they are, conventional data sources such as household surveys, institutional records, and national economic accounts are not well suited to capturing a detailed account of the lives, experiences, and expressions of women and girls.
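What “mining existing sources for sex-disaggregated data” means in practice is essentially grouping records by a recorded sex field. Here is a minimal sketch, using invented survey records and an illustrative indicator name – nothing in it comes from the Data2x report:

```python
from collections import defaultdict

# Hypothetical household-survey records; the field names are
# illustrative, not taken from any real survey instrument.
records = [
    {"sex": "female", "employed_informally": True},
    {"sex": "male",   "employed_informally": False},
    {"sex": "female", "employed_informally": True},
    {"sex": "male",   "employed_informally": True},
]

def disaggregate(records, indicator):
    """Compute the rate of `indicator` separately for each sex group."""
    counts = defaultdict(lambda: [0, 0])  # sex -> [positives, total]
    for record in records:
        counts[record["sex"]][0] += int(record[indicator])
        counts[record["sex"]][1] += 1
    return {sex: pos / total for sex, (pos, total) in counts.items()}

rates = disaggregate(records, "employed_informally")
# For the records above: {"female": 1.0, "male": 0.5}
```

The hard part, of course, is not the grouping but ensuring the sex field was collected at all – which is precisely the planning-stage point Data2x makes.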

Can big data help close this gap? In this new report, Data2x shows that it might, by profiling a set of innovative approaches to harnessing big data to close the gender data gap even further. For example, have you ever wondered how data generated by 500 million daily tweets across 25,000 development keywords from 50 million Twitter users could be disaggregated by sex and location for analysis? I admit that, before reading the report, I cannot recall ever being struck by that thought. But open data generated from social media platforms is typically not sex-disaggregated from the outset.

To solve this, UN Global Pulse and the University of Leiden collaborated with Data2x to develop and test an algorithm for inferring the sex of Twitter users. The tool takes into account a number of classifiers, such as name and profile picture, to determine the sex of the user producing a tweet, and was developed so it could be applied on a global scale across a variety of languages. The accuracy of the algorithm was assessed by comparing its classifications to those of a crowdsourced panel, whose results were assumed to be correct. In 74% of cases the algorithm indicated the correct sex, a figure that UN Global Pulse believes could be improved through further development. The results of the project show great potential for generating new insights on development concerns, disaggregated by both sex and location, from user-generated social media data, as shown in this screenshot of the online dashboard (go and explore it for yourself!):


Source: UN Global Pulse
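To make the idea of sex inference concrete, here is a deliberately simplified sketch of what a name-based classifier might look like. The name sets and matching rule are invented placeholders; the actual UN Global Pulse tool uses trained classifiers over multiple signals, including profile pictures:

```python
# Toy name-based sex inference, loosely in the spirit of the Twitter
# tool described above. The name sets are illustrative assumptions.
FEMALE_NAMES = {"maria", "fatima", "olga", "amina", "sophie"}
MALE_NAMES = {"mohammed", "ivan", "juan", "david", "peter"}

def infer_sex(profile_name: str) -> str:
    """Return 'female', 'male', or 'unknown' from a profile name."""
    tokens = profile_name.strip().lower().split()
    first = tokens[0] if tokens else ""
    if first in FEMALE_NAMES:
        return "female"
    if first in MALE_NAMES:
        return "male"
    return "unknown"

def accuracy(predictions, panel_labels):
    """Agreement rate against a crowdsourced panel whose labels are
    assumed correct -- the evaluation setup described for the real tool."""
    hits = sum(p == t for p, t in zip(predictions, panel_labels))
    return hits / len(panel_labels)
```

A real system would back off to other signals, such as the profile picture, when the name is uninformative – which is how the published tool reaches usable accuracy across languages.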

Before ending this post, I’d like to highlight three other relevant posts from our blog that take up the related questions of bias in data generated from social media, the issue of privacy, and the geo-mapping and visualization of big data. Read, reflect, and tell us what you think in the comments field below!

 


10
Oct 17

Big Data Visualization: A Big Asset for decision making

Aymen, October 10.

As defined in a previous article by Feinleib, Big Data is the ability to capture, store, and analyze vast amounts of human and machine data, and then make predictions from it. Beyer and Laney, on the other hand, stress in their definition that Big Data is useful for enhanced insight and decision-making.

Indeed, storing large amounts of information is not useful in itself; the main goal is to use and combine this stock of data to facilitate decision making. For example, financial markets increasingly rely on Big Data to trace stock prices and refine predictions for computer-based trading. From a development perspective, Big Data can be useful for following the progress of projects and better understanding the needs and expectations of beneficiaries.

Still, when thinking of Big Data, large SQL or Excel tables, or algorithms, usually come to mind. These tools require a certain expertise to “read between the lines” and remain incomprehensible to the ordinary person. In his book “The Promise and Peril of Big Data,” David Bollier explains that Big Data usually relies on powerful algorithms that can detect patterns, trends, and correlations over various time horizons, but also on advanced visualization techniques as “sense-making tools”.

In this sense, it is equally important to present eye-catching visualizations of the results extracted from Big Data. Such visualizations make large amounts of information understandable and accessible, and support the dissemination of findings through academic publications, reports, presentations, and social media. In his book “Data Visualization with JavaScript”, Stephen A. Thomas defines Data Visualization as the way to present large amounts of data in a format that is easily understood by viewers. The simpler and more straightforward the presentation, the more likely the viewer is to understand the message.

Indeed, Data Visualization is an important feature of Big Data analytics, as it can provide new perspectives on findings that would otherwise be difficult to grasp. “Word clouds” – sets of words drawn from a certain body of text, such as blogs, news articles or speeches – are a simple and common example of the use of visualization, or “information design,” to unearth trends and convey the information contained in a dataset. But there are many alternatives to word clouds, such as geographic representation.
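The computation behind a word cloud is simply frequency counting; rendering aside, the word weights could be derived from a body of text along these lines (the stop-word list here is an arbitrary assumption):

```python
import re
from collections import Counter

# Illustrative stop-word list; real implementations use larger,
# language-specific lists.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for"}

def word_cloud_weights(text, top_n=10):
    """Count word frequencies in a text; the counts determine each
    word's display size in a word cloud."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return counts.most_common(top_n)
```

Given a speech or a set of blog posts as `text`, the returned counts are exactly the sizes a word-cloud renderer would map onto font sizes.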

For instance, the infographic here below, based on large amounts of information, explains how mobile technology is used worldwide as a tool to improve health care, education, public safety, entrepreneurship, or the environment. These worldwide initiatives are part of the 9th Sustainable Development Goal (SDG), which aims to “Build resilient infrastructure, promote inclusive and sustainable industrialization, and foster innovation.”


In this example, Big Data is doubly useful, as it forms the basis of the projects launched in the various countries represented on the map. Concretely, it contributes to reducing maternal deaths from placenta praevia in rural Morocco. Furthermore, the simple visualization of this large volume of information facilitates its analysis, helping decision makers track the progress of projects, and can serve as benchmark data for reproducing successful initiatives in other countries.

Geographic representation of Big Data is used in various fields, especially in monitoring migration movements. Almost 200 academic studies involving big data and migration were published between 2007 and 2016. The Syrian refugee crisis is a significant example of the use of Big Data to visualize migration flows. The infographic below, from the New York Times, explains clearly how nearly half of Syria’s entire population was displaced by the civil war.


In a wider context, the geographic representation of migration flows between 2000 and 2016, based on Big Data gathered by the UN Refugee Agency (UNHCR), summarizes clearly and synthetically migrants’ countries of origin and of residence/asylum.

These two examples show how geographic visualization of Big Data can, through the historical track record, predict more accurately how many more refugees can be expected over the coming years and at which points of entry. As a result, military, police, and humanitarian efforts can be better coordinated and pre-emptive based on this information.
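The “predict from the historical track record” step can be illustrated with the simplest possible model: a least-squares trend line over yearly arrival counts. The figures below are invented, and real forecasting would use far richer models and covariates:

```python
def forecast_next_year(arrivals):
    """Fit a least-squares line through yearly arrival counts and
    extrapolate one year ahead."""
    n = len(arrivals)
    xs = list(range(n))
    mean_x = sum(xs) / n
    mean_y = sum(arrivals) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, arrivals))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return intercept + slope * n  # predicted arrivals for the next year

# Hypothetical arrivals at one point of entry over four years;
# the linear trend continues to 26,000 for year five.
history = [10_000, 14_000, 18_000, 22_000]
```

Even this toy model makes the planning logic visible: an upward trend at a given entry point argues for pre-positioning resources there before arrivals peak.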

It is important to remember that storing large amounts of information is not useful in itself; the main goal is to use and combine this stock of data to facilitate decision making and create added value. The analysis of Big Data is a big asset, as it can facilitate tracking the progress of projects, understanding migration trends, and mapping conflict or adversity for a better-coordinated delivery of aid to people in dire circumstances.

However, we must remember that Big Data also enables strategies of surveillance, control, and population management. Big Data involves the quantification, classification, and construction of individuals and populations, and these categories are never impartial or objective but embedded in socio-political contexts. It is the researcher’s ethical responsibility to keep these points in mind when deciding what data to use and how to obtain, process, store, and share it. In the UK, for instance, the manipulation of data and statistics has played a major role in bolstering anti-immigration narratives and xenophobic political agendas. This is to say that Big Data, raw or visualized, is not harmful in itself, but the way it is used can indeed be dangerous.