Big Data 4 Gender

A criticism of Big Data for development is that the data is not representative and can therefore effectively disregard the already marginalised. The recognition that big data sets tend to marginalise women and girls is at the heart of Data2X's mission, captured in their tagline: “If data isn’t from all of us, then data isn’t for all of us.”

Their mission is to improve the quality, availability and use of gender data in order to make a practical difference in the lives of women and girls worldwide. You can see more about this here:

From their site:

“Gender data” is data disaggregated by sex, such as primary school enrollment rates for girls and boys, as well as data that affects women and girls exclusively or primarily, such as maternal mortality rates.

Today, we have only a partial snapshot of the lives of women and girls and the constraints they face because gender data are limited, especially in developing countries. We have no data or bad data on issues that disproportionately affect women and girls but that society does not highly value. Gender biases both impede and distort data collection.

This data would make it possible to determine the size and nature of social and economic problems and opportunities as well as the efficacy and cost-effectiveness of alternative policies. For example: How many girls are married before the age of 18? What explains gender wage gaps? How can extension services reach more women farmers? Are mobile payments to poor mothers more cost-effective than traditional cash transfers? Good data can provide valid answers to these questions.

On 21 March, they will be releasing their report on “Big Data and the Well-Being of Women and Girls.” Follow @Data2X and #BigData4Gender on Tuesday, March 21 at 11AM ET/4PM UTC to join the conversation!

The report will look at the exciting applications of big data sources to fill gender data gaps, track indicators of girls’ and women’s well-being, and inform policymaking.

They have mapped the gaps in their infographic here:

Feasibility Study: Identifying Trends in Discrimination Against Women in the Workplace in Social Media

In 2014, UN Global Pulse conducted a feasibility study with the International Labour Organisation (ILO) on workplace gender discrimination in Indonesia. Run through their Pulse Lab in Jakarta, the study aimed to identify trends in discrimination against women in the workplace by mining 100,000 tweets, posted over three years, for keywords. A data set was available, and the lab recognised its potential for monitoring gender-based discrimination in the workplace. They also had a funding partner, the ILO, which knew the issue was real and wanted to prove it through data analytics.
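To make the idea of keyword mining concrete, here is a minimal sketch in Python of counting keyword matches per year. It is an illustration only, not the method Pulse Lab Jakarta used: the keyword list, the `matches` and `yearly_counts` helpers, and the sample tweets are all hypothetical, and a real study would work in Bahasa Indonesia with a far richer topic taxonomy.

```python
from collections import Counter

# Hypothetical keyword list (assumption): the study's real taxonomy is not given here.
KEYWORDS = {"discrimination", "harassment", "maternity leave"}


def matches(text, keywords=KEYWORDS):
    """Return True if any keyword appears in the text, ignoring case."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in keywords)


def yearly_counts(tweets):
    """Count matching tweets per year.

    `tweets` is an iterable of (year, text) pairs.
    """
    counts = Counter()
    for year, text in tweets:
        if matches(text):
            counts[year] += 1
    return counts


# Invented sample data, for illustration only.
sample = [
    (2011, "Great lunch today"),
    (2012, "Job ad says women must be single. Discrimination?"),
    (2013, "Denied maternity leave again"),
    (2013, "Harassment at the office goes unreported"),
]

counts = yearly_counts(sample)
for year in sorted(counts):
    print(year, counts[year])
```

Even this toy version hints at the weakness discussed next: a rising count may reflect more people tweeting rather than more discrimination, and silence on a topic produces no signal at all.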

As their report noted, however, the signals were not strong enough to lead to conclusive results. Was this a problem with the methodology, or was it a consequence of women in Indonesia keeping silent about their experiences of discrimination and violence in the workplace?

The report made the following recommendations:


  • Four of the analysed topics about gender discrimination should be investigated further, in particular messages related to discriminatory job requirements and to the conciliation of work and family for women. A more detailed subtopic classification and an extension to other social media platforms might be appropriate.
  • The study showed increasing volumes of relevant online conversations over the three years of data analysed, which reveals opportunity for further research as greater volumes of social media data become available for exploration.
  • Appropriate mechanisms should be established in the workplace to address gender discrimination and violence there. Monitoring of cases might be complemented by digital tools.


This one example shows both the potential and the limitations of data analysis on social media. Perhaps as importantly, it demonstrates that although the pilot study proved inconclusive, there was an urge to expand it, redesign it, and repeat it. There was also a realisation that it needed to be complemented by “appropriate mechanisms” anchored in the real world of the workplace. What these appropriate mechanisms are is not clear. The report’s focus (understandably) is on the potential for data-based development.

There are five issues that this one case study highlights:

  • The data analysis aimed to prove statistically what was already known anecdotally.
  • There is enthusiasm among data analysts to scale up from pilot projects and feasibility studies to reap the digital dividends, on the argument that the benefits will be seen only with more funding and better analytics. However, the proportion of overseas aid dedicated to statistical programmes was slashed in half between 2011 and 2012, to 0.16 per cent, according to a 2013 report from the Partnership in Statistics for Development in the 21st Century (PARIS21).
  • The need to reinforce the link between digital data analytics and the reality of the issue being explored – the “appropriate mechanisms.”
  • The need to “ground-truth” data with other information.
  • A concern that a focus on data analytics diverts attention and funding from direct interventions.

I will be returning to these questions in subsequent posts.





