WG Data Versioning - RDA 13th Plenary Meeting

10 Jan 2019

Meeting title 

Data Versioning WG Working Meeting (Remote Access Instructions)


Room Location: Commonwealth A2


Collaborative session notes: https://docs.google.com/document/d/1p-VVKhZ6NV-feon-68FMu_dEDO9e3H7Tca5mNxpLRdY/edit


Short introduction describing the scope of the group and if any previous activities 

The demand for reproducibility of research results is growing. It will therefore become increasingly important for researchers to be able to cite the exact extract of the data set that was used to underpin their research publication. However, systematic data versioning practices are currently not available.
Versioning procedures and best practices are well established for scientific software and can be used to enable reproducibility of scientific results. The codebase of a large software project bears some resemblance to a large dynamic dataset. Are versioning practices for code therefore also suitable for data sets, or do we need a separate suite of practices for data versioning? How can we apply our knowledge of versioning code to improve data versioning practices?
Over the past year, we have collected use cases of data versioning practices and extracted data versioning patterns. A draft of the Working Group’s report and recommendations for data versioning practices will be presented in this session. We invite data scientists, operators of data repositories, and anyone who is interested in moving data versioning forward, to attend.


Additional links to informative material related to the group 


Notes and presentations from past plenaries:


Meeting objectives 

The objective of this session is to establish a work plan for finalising the outcomes and recommendations of this RDA Working Group, which is developing agreed practices for data versioning. This includes:

  1. Identifying areas where versioning is required and/or other use cases:
    1. Identifying groups in RDA and planning how to engage with them
    2. Identifying external groups
    3. Overview of collected use cases
  2. Presenting the outline of a white paper on recommendations for data versioning:
    1. Spectrum of data types to be included (files, databases, unstructured data, model runs, etc.)
    2. How to align these with the practices for the assignment of persistent identifiers
    3. Identifying other topics that should be included


Meeting agenda

  • Introduction
  • Data Versioning WG retrospective
  • Presentation of report draft and recommendations on data versioning practices
  • Adoption of WG recommendations through W3C
  • Engagement with other RDA and external groups
  • Work plan for final six months of RDA Data Versioning WG
  • Scheduling of online meetings up to Plenary 14


Group chair serving as contact person

Jens Klump


Type of meeting

Working meeting


Target Audience

  • Members of the Working Group
  • Data scientists and operators of data repositories
  • Data producers and users
  • Publishers who want to be sure that the correct version of a data set is cited in a publication
  • Anyone who is interested in moving data versioning forward