Curating for FAIR and reproducible data and code
Submitted by Limor Peer
- Introduce the topic of curating for FAIR and reproducible data and code
- Review environmental scan of existing relevant training and identify gaps
- Introduce the CURE Consortium and elicit input on proposed standards, practices, and tools for curating for FAIR and reproducible data and code
- Review CURE goals in the context of the Reproducibility IG at RDA and potential synergies with other RDA IGs and WGs
Collaborative session notes: https://docs.google.com/document/d/1IenuTqrcIK-ktBkohr1ypvGkTjXmhCcSOLHL...
10 minutes: Introductions and goals of the BoF
20 minutes: Background on CURE project, aims, activities
40 minutes: Group activity on CURE practices, gaps, opportunities for collaboration and integration
20 minutes: Open discussion of key issues, potential synergies, and the role RDA can play
Scientific reproducibility provides a common purpose and language for data professionals and researchers. For data professionals, reproducibility can be a framework to hone and justify curation actions and decisions; for researchers, it offers a rationale for adopting best practices early in the research lifecycle. Curating for reproducibility (CURE) includes activities that ensure that statistical and analytic claims about given data can be reproduced with that data.
Academic libraries and data archives have been stepping up to provide systems and standards for making research materials publicly accessible, but the datasets housed in repositories rarely meet the quality standards required by the scientific community. Even as data sharing becomes normative practice in the research community, there is growing awareness that access to data alone, even well-curated data, is not sufficient to guarantee the reproducibility of published research findings. Computational reproducibility, the ability to recreate computational results from the data and code used by the original researcher, is a key requirement for researchers to reap the benefits of data sharing (Stodden et al., 2013), but one that recent reports suggest is not being met.
Verifying findings to confirm the integrity of the scientific record, and building on previous work to develop innovations, also requires access to the analysis code used to produce the reported results. Even the long list of tasks that characterize the traditional data curation workflow that enables data access (file review and normalization, metadata generation, assignment of persistent IDs, data cleaning, and assembly of contextual documentation) falls short when research reproducibility is the ultimate goal (Peer et al., 2014).
In order to curate for reproducibility, activities must include a review of the computer code used to produce the analysis to verify that the code is executable and generates results identical to those presented in associated publications. CURE has been implementing practices and developing workflows and tools that support curating for reproducibility.
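As a rough illustration of the code review described above, the following Python sketch runs an analysis and checks that its output is identical to the values reported in a publication. It is only a minimal sketch under stated assumptions, not a CURE tool or workflow: the function names `run_analysis` and `compare_results`, the stand-in analysis code, the reported values, and the convention that the analysis prints its results as JSON are all hypothetical.

```python
import json
import subprocess
import sys


def run_analysis(command):
    """Run the deposited analysis code and return its parsed JSON output.

    Raises subprocess.CalledProcessError if the code does not execute cleanly.
    """
    completed = subprocess.run(command, capture_output=True, text=True, check=True)
    return json.loads(completed.stdout)


def compare_results(reported, reproduced):
    """Return the names of reported values that were not reproduced exactly."""
    return sorted(
        name for name, value in reported.items()
        if name not in reproduced or reproduced[name] != value
    )


# Hypothetical stand-in for a researcher's analysis script; it prints results as JSON.
analysis_code = "import json; print(json.dumps({'n': 4, 'mean': sum([2, 4, 6, 8]) / 4}))"

# Hypothetical values, as a curator might transcribe them from the publication.
reported_in_paper = {"n": 4, "mean": 5.0}

# Step 1: confirm the code is executable. Step 2: confirm the results match.
reproduced = run_analysis([sys.executable, "-c", analysis_code])
mismatches = compare_results(reported_in_paper, reproduced)
print("Reproduced" if not mismatches else f"Not reproduced: {mismatches}")
```

The comparison here is exact, following the "identical results" criterion above. In practice a curator would likely need to decide how to handle rounding to the precision reported in the publication; that choice is outside this sketch.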
Publications
- Peer, L., et al. (2014). Committing to Data Quality Review. International Journal of Digital Curation. 9(1): 263-291. (https://doi.org/10.2218/ijdc.v9i1.317)
- Stodden, V., et al. (2013). Toward Reproducible Computational Research: An Empirical Analysis of Data and Code Policy Adoption by Journals. PLoS ONE. 8(6): e67111. (https://doi.org/10.1371/journal.pone.0067111)
RDA
- Reproducibility IG (https://rd-alliance.org/groups/reproducibility-ig.html)
Additional Co-Chairs:
- Thu-Mai Christian
- Florio Arguillas
CURE Consortium
- Website (http://cure.web.unc.edu/)
- Slides (https://osf.io/3wkex/)
- Recent workshops using CURE standards, workflows, and tools
- Open Repositories 2019 (https://www.conftool.net/or2019/index.php?page=browseSessions&form_date=2019-06-10)
- FORCE11 2018 (https://force2018.sched.com/event/FYrH/engaging-researchers-in-data-management-by-focusing-on-reproducibility-of-results)
- IDCC 2018 (http://www.dcc.ac.uk/events/idcc18/workshops#workshop1)
- IASSIST 2017 (https://iassistdata.org/conferences/archive/2017)
REMOTE ACCESS INSTRUCTIONS
Please join my meeting from your computer, tablet or smartphone.
https://global.gotomeeting.com/join/634210829
You can also dial in using your phone.
United States: +1 (224) 501-3216
Access Code: 634-210-829
More phone numbers
Australia: +61 2 9087 3604
Austria: +43 7 2081 5427
Belgium: +32 28 93 7018
Canada: +1 (647) 497-9410
Denmark: +45 32 72 03 82
Finland: +358 923 17 0568
France: +33 170 950 594
Germany: +49 692 5736 7317
Ireland: +353 15 360 728
Italy: +39 0 230 57 81 42
Netherlands: +31 207 941 377
New Zealand: +64 9 280 6302
Norway: +47 21 93 37 51
Spain: +34 932 75 2004
Sweden: +46 853 527 836
Switzerland: +41 225 4599 78
United Kingdom: +44 330 221 0088