Part II: the bad Mr Hyde?
By Athanasia K.
In a previous post, I tried to show that Big Data applications can offer innovative and effective solutions for better healthcare services.
Despite their many promises, however, Big Data applications in healthcare are not a panacea against all evils; they can also have negative impacts and challenging aspects. And these challenges remain unresolved after years, despite the field's rapid technological development.
In developed countries, one of the biggest concerns seems to be the protection of personal data, which is all the more sensitive when the data are medical. Kaplan very rightly notes that “data can be sold and replicated anywhere and, once sold, may be used for good or ill”. Furthermore, as Lunshof et al have shown, with current IT technology, privacy and confidentiality can no longer be guaranteed. On the contrary, when we are dealing with the analysis of genetic samples, re-identifying data samples back to their donor is more than possible, as Malin et al showed some years ago. But even if there were a way to securely and totally anonymise the samples in a genetic database, this would limit the usefulness of the data, as shown by Budimir et al.
One more point of concern regarding the impact of Big Data in healthcare is that not all data are reliable. In fact, “people change their behavior and withhold information in order to protect their health information privacy” and “according to a 1999 survey, nearly one in six patients withheld information, provided inaccurate information, doctor-hopped, paid out of pocket instead of using insurance, or even avoided care”, as Kaplan notes. This has led experts to fear a GIGO effect (i.e., garbage in, garbage out), and to question the reliability of this methodology for vulnerable groups and poorer regions, as also analysed previously by Shahin. However, other scholars, such as Alemayehu, argue that “although much of real world data is sparse and a lot of the data is “dirty”, with proper analytical, computational and data management tools, it is still useful and can support health policy decision-making”.
Adding to this conundrum of confidentiality vs usefulness, the lack of transparency in the acquisition and ownership of data raises further questions in the field. It is common practice for Big Data vendor companies not to disclose their contracts for the acquisition of these data. As Kaplan notes, the legal framework in the United States and abroad “does not address health data ownership clearly; it is not clear who the owner should be … Furthermore, it is also not clear where those who sell data analytics services obtain the data, or how they might use them.” Moreover, as Kaplan continues, “vendors often consider their contracts intellectual property and do not reveal these and other contract provisions”.
But who benefits from this?
One could very logically assume that the companies involved in Big Data gain some sort of profit from this business. But what about the rest? As Kaplan notes, “the cost [of data gathering] is passed on to patients and payers, whether private or confidential. These individuals gain little benefit from the aggregation and sale of data about them, and they may even be harmed by it”. Indeed, Kaplan continues, “patients can be harmed when data about them are violated: to deny employment, credit, insurance”.
This imbalance in the distribution of benefits is even more evident when we look at developing countries. As Rudan et al note, nearly all biobanks (at least as of 2011) “have been developed to address the health problems relevant to the minority of people living in wealthy countries”. This has made developing countries reluctant to share their national data or to permit foreign researchers to access them, for fear of exploitation. An illustrative example, cited by Staunton and Moodley, is that in “2007, Indonesia refused to share its H5N1 samples without a legally binding agreement which addressed among others, benefit arrangements and intellectual property rights”.
Apart from the imbalance of benefits, one more real concern regarding data collection in healthcare is the possible stigmatisation of patients if the confidentiality of their data is breached. This has been reflected even in court cases: as cited by Staunton and Moodley, in April 2010 Arizona State University paid $700,000 to the Havasupai Indian tribe as a settlement of claims that improper use of blood samples had stigmatised the tribe. This fear of stigmatisation is also reported in African studies, where research participants fear discrimination and possible stigmatisation of themselves and their families (see again the Staunton and Moodley paper). This aspect is harder to tackle because of cultural differences. As Kaplan notes, “what is considered as very private, embarrassing, stigmatising, or posing grounds for discrimination varies among individuals and groups, and also differs between cultural backgrounds, places or time periods”.
But is it really such a black-and-white, Dr Jekyll vs Mr Hyde situation when we speak about Big Data for healthcare? In a forthcoming post, I’ll try to find a third way of looking at this.