Curating for FAIR and Reproducible Research
The goal of the working group is to establish standards-based guidelines for curating data and code so that they are reproducible and FAIR (findable, accessible, interoperable, and reusable; Wilkinson et al., 2016). Informed by an examination of current curation practices and their alignment with FAIR principles, these guidelines will offer a framework for implementing effective curation workflows for publishing FAIR data and code that support scientific reproducibility. The ultimate objective is to improve the FAIRness and long-term usability of “reproducible file bundles” across domains.
When we think of specific research outputs, we might think of data, software, codebooks, etc. These individual outputs may have inherent value: for example, a set of observations that is very costly to produce or that cannot be repeated, or a script that others can use for computation. Traditional curation has treated these outputs as its core objects. But in the context of empirical research, these outputs interact with each other, often to produce specific findings or results. Today, the process by which results are generated is captured in computation. Our approach to curation takes this process into account and focuses on computational reproducibility.
Computational reproducibility is the ability to repeat the analysis and arrive at the same results (National Academies of Sciences, Engineering, and Medicine, 2019; Stodden, 2015). It requires the data and code used in the original analysis, along with additional information about the study methods and the computational environment. The reasons to pursue computational reproducibility are to preserve a complete scientific record, to verify scientific claims, to do science and build upon the findings, and to teach (Elman, Kapiszewski, & Lupia, 2018; Resnik & Shamoo, 2017; Stodden, Bailey, & Borwein, 2013).
In this framework, the object of curation is a “reproducible file bundle” and its component parts, including the files and their elements (e.g., variables), with the goal of enabling continued access to the bundle and its independent reuse over the long term. The CURE-FAIR WG is focused on the curation practices that support computational reproducibility and FAIR principles.
By curation, we mean the activities designed for “maintaining, preserving and adding value to digital research data throughout its lifecycle” (Digital Curation Centre, n.d.).
The WG will deliver:
- A snapshot of the current state of CURE-FAIR practices drawing upon community surveys and reviews of practice;
- A synthesis of practices relating to curating for computational reproducibility and FAIR principles;
- A final document outlining standards-based guidelines for CURE-FAIR best practices in publishing and archiving computationally reproducible studies, including the associated computational methods and materials (target date: January 2022).
See more in our case statement.