Big Data Soc. 2014 Jun 1;1(1):10.1177/2053951714534395.
doi: 10.1177/2053951714534395.

What Difference Does Quantity Make? On the Epistemology of Big Data in Biology

Sabina Leonelli. Big Data Soc.

Abstract

Is big data science a whole new way of doing research? And what difference does data quantity make to knowledge production strategies and their outputs? I argue that the novelty of big data science does not lie in the sheer quantity of data involved, but rather in (1) the prominence and status acquired by data as commodity and recognised output, both within and outside of the scientific community; and (2) the methods, infrastructures, technologies, skills and knowledge developed to handle data. These developments generate the impression that data-intensive research is a new mode of doing science, with its own epistemology and norms. To assess this claim, one needs to consider the ways in which data are actually disseminated and used to generate knowledge. Accordingly, this paper reviews the development of sophisticated ways to disseminate, integrate and re-use data acquired on model organisms over the last three decades of work in experimental biology. I focus on online databases as prominent infrastructures set up to organise and interpret such data; and examine the wealth and diversity of expertise, resources and conceptual scaffolding that such databases draw upon. This illuminates some of the conditions under which big data need to be curated to support processes of discovery across biological subfields, which in turn highlights the difficulties caused by the lack of adequate curation for the vast majority of data in the life sciences. In closing, I reflect on the difference that data quantity is making to contemporary biology, the methodological and epistemic challenges of identifying and analysing data given these developments, and the opportunities and worries associated with big data discourse and methods.

Keywords: big data epistemology; biology; data curation; data infrastructures; data-intensive science; databases; model organisms.
