Standing on the shoulders of giants, and returning the favor

Big data are produced at an accelerating pace and put to use by governments, businesses and researchers, for better or worse. At the same time, “small data” are trapped on individual researchers’ hard drives around the world, for no good reason at all.

“If I have seen further, it is by standing on the shoulders of giants”  – Isaac Newton

This quote sums up the nature of scientific collaboration. Only by building on the work of our predecessors can we make scientific advances, and only by sharing our own discoveries can they be built upon by others. Most researchers understand this, yet the research community has not fully embraced open data or data-sharing practices, according to the 2017 report Open Data: The Researcher Perspective.

Why care to share?

Open research data bring societal benefits: they enable reproduction and verification of results, make the results of publicly funded research available, allow others to ask new questions of the data, and advance the state of research and innovation (Borgman, 2012). Add to that a couple of less altruistic reasons, such as greater research impact in the form of more citations, stronger scientific integrity, and safeguarding your own future use of the data. By preparing your data for sharing with others, you also make sure that you can identify, retrieve, and understand the data yourself after you have lost familiarity with it, perhaps several years later.
On this note, there are many strong voices for openness in the research community, such as Robin Rice, who is spurred on by the fact that 80 percent of the original scientific data obtained through publicly funded research is lost within two decades of publication.
On the donor and policy side, pressure is growing for research data to be submitted and made openly accessible when research results are published. For this reason, the European Commission has launched and funded initiatives such as OpenAIRE to support its Open Access policy.

What prevents researchers from sharing then?

As many as one third of the respondents in the Open Data report do not publish their data at all, and fewer than 15% of researchers share their data in repositories. The report, which is based on a global online survey of 1,200 researchers and three Dutch case studies from different disciplines, also reveals that many researchers perceive data as personally owned.

The more hesitant voices and sceptics won't share their data because of privacy issues, proprietary concerns and ethical barriers. Financial and legal issues can also hamper sharing, according to the Open Data report.
On the personal side, researchers tend to worry about publishing first. They fear giving away their data before their own results are published and thereby being scooped by someone who reaches more novel conclusions using the same data. Collecting data is hard work, and researchers in general want a good return on that investment.

The road ahead towards greater openness

A one-size-fits-all approach to data sharing is not feasible, given the disciplinary, cultural, and local differences in data privacy and licensing. Some general policies for open data are in place, as are standards such as the FAIR Guiding Principles for scientific data. However, the reality for researchers is not necessarily aligned with those principles, and incentives for sharing are often missing. To get there, better policies to incentivize the production and use of open data are still needed. For this to happen, research institutions and departments will be crucial: they must encourage researchers to share data and give substantial support in the form of guidance, infrastructure and data management training. And, perhaps most importantly, they must encourage collaboration and nurture an environment of trust.

A personal reflection

To practice what they preach, the Open Data report team has published all the raw data on which the report builds. As I flick through the files, new questions come to mind that these data could possibly help to answer. For example, do attitudes towards sharing data differ between men and women? Or between different parts of the world? I can see from the data-set that all continents are represented, but the lion’s share of the respondents are from Europe and the US. What would the outcome of this survey look like if more of the respondents were from the Global South? If I ever undertake a study driven by this curiosity and based on this generously provided data-set, I will give full credit to the providers, whose shoulders I will be climbing onto.


Referenced work:

Borgman, C. L. (2012), The conundrum of sharing research data. J Am Soc Inf Sci Tec, 63: 1059–1078. doi:10.1002/asi.22634

Fecher, B. et al. (2015) What Drives Academic Data Sharing? PLoS ONE. DOI: https://doi.org/10.1371/journal.pone.0118053

Piwowar HA, Vision TJ. (2013) Data reuse and the open data citation advantage. PeerJ 1:e175 DOI: https://doi.org/10.7717/peerj.175

Vines, T. H. et al. (2013) The Availability of Research Data Declines Rapidly with Article Age. Current Biology, Volume 24, Issue 1, 94–97. DOI: https://doi.org/10.1016/j.cub.2013.11.014

Image credits:

Free Code Camp and Research Data Alliance (Plenary Cartoons by Auke Herrema)

4 Comments

  1. Christina

    Very interesting read! Personally, I could imagine that a lot of researchers, and let’s say also students who do research for their final thesis etc., do not know about a good ‘place’ to make their findings available. Do you think that the language of the collected data could be a barrier to publishing findings as well?

    • Karin

      Hi Christina. Thanks for your comment. My reading on this matter and my conversations with researchers suggest exactly that: even though more donors are requesting increased openness and accessible data-sets, there is still limited infrastructure available for sharing one's own data and finding other researchers' data. I think that a lot of investment is needed for this to happen. Surely there could also be language barriers.

  2. Jassir de Windt (Group 2)

    Hi Karin,

    I enjoyed reading your post. Particularly, because in the Netherlands there is currently much ado about the new ‘European General Data Protection Regulation’ (to be implemented this year). As such, ‘new’ job titles such as ‘Data Protection Officer’ (DPO) are being called into existence.

    A while ago, I read an article on Wired in which the author emphasised that the full potential of big data is held back because the everyday user is not a data scientist, which results in a great amount of information remaining in the hands of a happy few. The author advocated gradually ‘consumerising’ big data and argued furthermore that big data should be made mobile (smartphone, tablet etc).

    From a research perspective, I once read an equally interesting article in the Atlantic Daily in which the author pointed out that, due to competition, many scientists seem to have what she described as a ‘sharing problem’…

    All in all, I think your post opens the debate on both angles — well done!

  3. Karin

    Hi Jassir, thanks for an encouraging comment. I am glad you liked the text. I think there will be a lot of new jobs in this area in the near future, like the positions you see emerging in the Netherlands. I am also fascinated and a bit scared by the concentration of Big Data analytics that you mentioned. I really recommend listening to the TED Radio Hour episode Big Data Revolution for some more food for thought on the unlocked potential of Big Data, if you are interested in that. https://www.npr.org/programs/ted-radio-hour/492296605/big-data-revolution
