What is a digital footprint?
Every step we take online, or that is taken on our behalf, leaves a digital trace (also known as a ‘digital footprint’). These traces are usually captured automatically and provide a detailed record of a person’s online activity. The resulting data makes it possible to analyse individuals thoroughly and gain insights into the circles of people around them, their behaviour patterns, demographic attributes and personalities. Digital footprints also enable people to be categorised and clustered. Read et al. (2016) have noted that, with this increasing access to data, reliance on it and the desire for it have been growing steadily in the humanitarian sector.
Screenshot produced by and used with permission from NRD Cyber Security
Just how much information can a digital footprint provide about an individual?
Quite a lot, it appears. One might think that all you need to do is use a fake identity or avoid revealing everything about yourself. But data can tell more about us than we think. Research shows that even snippets of information can be enough to come close to working out a person’s identity. For instance, personality researchers have suggested that individuals leave behavioural residue (unconscious traces of actions that may objectively depict their identity). Behavioural residue such as language patterns, smartphone metrics and metadata (e.g. posts, followers, browsing history, commenting, etc.) is too complex for humans to process unaided, but computational techniques (e.g. natural language processing, machine learning) can use it to infer demographic attributes.
It matters how we communicate
AI-based open-source intelligence (OSINT) tools make data collection, analysis and interpretation highly targeted and user-friendly (GreyCampus). It is surprising how much language patterns (the way we write, construct thoughts and sentences, the kind of words we choose, etc.) can tell about our gender, moods or social status. Studies of online behaviour have highlighted trends similar to those found in offline studies of language, in that women and men tend to communicate differently. Women are more likely to use pronouns, emotion words (e.g. happy, bored, love), interjections (e.g. urgh, hmm), emojis (e.g. <3, ☺) and abbreviations (e.g. lol, omg). Men, on the other hand, tend to use more practical, dictionary-based words, post videos and links, and share funny images. In isolation this information may not reveal much about a person (gender alone says little about our identity), but when all the digital traces are combined, they form a web of information that can quite accurately depict the person in the middle of it. This has serious implications for privacy and for control over the information about one’s identity and how it is used.
Image from IQmatrix.com
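To make the idea of language markers more concrete, here is a minimal Python sketch that counts the kinds of features mentioned above (pronouns, emotion words, interjections, abbreviations, emoticons and links) in a piece of text. It is an illustration only: the word lists are tiny, hypothetical samples built from the examples in this article, the function name `language_markers` is made up for this sketch, and the code only counts markers. It does not infer gender, mood or anything else about the author; real tools would feed features like these into trained models.

```python
import re

# Hypothetical, deliberately tiny word lists based on the examples in the
# article. They are assumptions for illustration, not a validated lexicon.
PRONOUNS = {"i", "me", "my", "you", "your", "we", "our", "she", "he", "they", "them"}
EMOTION_WORDS = {"happy", "bored", "love"}
INTERJECTIONS = {"urgh", "hmm"}
ABBREVIATIONS = {"lol", "omg"}
EMOTICONS = ("<3", "☺")

URL_PATTERN = re.compile(r"https?://\S+")
WORD_PATTERN = re.compile(r"[a-z']+")


def language_markers(text):
    """Count simple stylistic markers in a piece of text."""
    lowered = text.lower()
    links = URL_PATTERN.findall(lowered)
    # Remove links first so that their parts are not counted as words.
    without_links = URL_PATTERN.sub(" ", lowered)
    words = WORD_PATTERN.findall(without_links)
    return {
        "words": len(words),
        "pronouns": sum(word in PRONOUNS for word in words),
        "emotion_words": sum(word in EMOTION_WORDS for word in words),
        "interjections": sum(word in INTERJECTIONS for word in words),
        "abbreviations": sum(word in ABBREVIATIONS for word in words),
        "emoticons": sum(without_links.count(symbol) for symbol in EMOTICONS),
        "links": len(links),
    }


if __name__ == "__main__":
    post = "OMG I love my new phone <3 lol, see https://example.com/pic"
    for marker, count in language_markers(post).items():
        print(f"{marker}: {count}")
```

Even a crude counter like this shows how a single short post already yields several measurable signals; collected over hundreds of posts, such counts become the kind of pattern that automated tools can work with.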
Control over information
The recent case of Cambridge Analytica shows just how thin the line is between using digital data and tools to get to know your target audience and manipulating that audience. Also, according to Taylor (2017), increasingly available digital data is driving a shift in policy making worldwide. Data is used to make calculated decisions about where to invest, how to allocate funds and where to set priorities. The ‘Data-driven development’ report, published by the World Bank, gives plenty of examples of how data has been used to tackle crises all around the world.
Example of an AI- and data-based solution, published in the Data-driven development report by the World Bank
However, some researchers suggest that there is a clear over-reliance on data. Read et al. (2016) state that there is a tendency to reuse the same digital data, generated by our digital footprints, because it is usually cheaper and quicker to obtain. This leads to limited and poorly informed decisions. So, what can people do to make their data useful for the benefit of society while keeping control over the information they give away? As more and more people from the Global South connect to the digital world, it is important to raise and discuss such questions.
Below are two areas that could be at the centre of those discussions:
- Critical thinking. Evaluating what information to share and what subjects to discuss. I took part in the ‘Cyber Defence East Africa 2019’ conference in Uganda last year, and it was really interesting to hear the discussions about the different digital environments we operate in (work, personal, home, etc.) and how we often talk about one of them (usually work) but do not pay enough attention to the others (e.g. we don’t discuss information sharing at home). It is easy to forget that all our digital environments are interconnected and that the digital footprints we leave in each of them can be put together. Discussions like these should be encouraged.
- Regulation. 2018 saw one of the most significant events in the digital world: the General Data Protection Regulation (GDPR) took effect, putting data protection on a stricter and more regulated footing. It generated so much buzz that people started questioning why their data is collected, how it is stored and how it is later used. In general, it drew attention to the importance of data. Although the regulation is specific to the EU, it worked like an earthquake, sending shockwaves to other countries. There have been, and still are, concerns about how capable other countries are of adapting the regulation, but many, in both the Global South and the Global North, have at least put some stepping stones in place. For instance, Bahrain introduced GDPR-like regulation on 1st August 2019. There are still questions about how prepared the state has been, but the move has surely triggered discussions amongst the general public.