The Data Versioning WG is closing down. In this session, we will present the final report of the Working Group, discuss how to promote adoption of the data versioning principles, and consider the need for follow-up activities such as an Interest Group.
- Presentation of the Data Versioning WG report and data versioning principles (20 minutes)
- Use case presentations and pathways for the adoption of the data versioning principles (40 minutes)
- Discussion on the next steps: how can we promote adoption of the data versioning principles? Should this work be continued as a new WG or IG? (30 minutes)
Data curators, data facility managers, data policy managers, data users, data developers
The group has worked on principles for data versioning practice. It has identified the methods that have been adopted (or have become ad hoc practice) and has developed the theoretical background needed for versioning considerations. The group sought to answer questions such as “What is a new version?” and “How should it be documented?”
The group has prepared a report, based on documented use cases, and developed a set of basic considerations for data versioning:
- Management: Recognise identification and tracking of data revisions and data releases as an important component of data management. Establish a procedure and policy for consistent management of data revisions and releases.
- Identification: Be clear about which dataset is to be identified. Identify data revisions, and consider issuing a new persistent identifier for each revision and release.
- Communication: Communicate the significance of the change to the designated user community of this dataset. Concepts such as Semantic Versioning describe the significance of a version change.
- Provenance: Track changes and record provenance information between revisions. Provenance information describes the changes that have been made in each newer revision. Display provenance information, attribution and credit on the landing page of each publicly released dataset.
- Citation: Cite a specific data release. For each released dataset, have a clear recommendation, including a release number, on how to cite a dataset.
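As an illustration of how the Identification, Communication, Provenance and Citation considerations might fit together, the following is a minimal sketch and not part of the Working Group's report. It assumes a Semantic Versioning style MAJOR.MINOR.PATCH release number; the names `bump`, `DataRelease` and `new_release`, the citation layout, and the identifiers are hypothetical placeholders.

```python
from dataclasses import dataclass


def bump(version: str, significance: str) -> str:
    """Return the next MAJOR.MINOR.PATCH number for a change of the given significance."""
    major, minor, patch = (int(part) for part in version.split("."))
    if significance == "major":   # e.g. a change that breaks existing uses of the data
        return f"{major + 1}.0.0"
    if significance == "minor":   # e.g. a compatible addition such as new records
        return f"{major}.{minor + 1}.0"
    if significance == "patch":   # e.g. a small correction
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown significance: {significance!r}")


@dataclass(frozen=True)
class DataRelease:
    """One released state of a dataset (hypothetical structure for illustration)."""
    title: str
    creator: str
    year: int
    version: str           # release number communicated to users
    identifier: str        # placeholder for a persistent identifier of this release
    change_note: str = ""  # provenance: what changed since the previous release

    def citation(self) -> str:
        # Assumed citation layout; a real repository would define its own.
        return (f"{self.creator} ({self.year}). {self.title}, "
                f"release {self.version}. {self.identifier}")


def new_release(previous: DataRelease, significance: str, change_note: str,
                identifier: str, year: int) -> DataRelease:
    """Create the next release with a new number, identifier and change note."""
    return DataRelease(
        title=previous.title,
        creator=previous.creator,
        year=year,
        version=bump(previous.version, significance),
        identifier=identifier,
        change_note=change_note,
    )


first = DataRelease("Example Dataset", "Example Facility", 2019,
                    "1.2.3", "example-id/1.2.3")
second = new_release(first, "minor", "Added 2019 observations",
                     "example-id/1.3.0", 2020)
print(second.citation())
```

Each release here carries its own identifier and change note, so a citation always points at one specific release rather than at the dataset in general.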
The Working Group is ending now and will present its Final Report at RDA P15 Melbourne.
- P8 Denver (Sept 2016): BoF on Data Versioning
- P9 Barcelona (April 2017): Constituting the Data Versioning IG
- P10 Montreal (Sept 2017): Data Versioning IG
- P11 Berlin (March 2018): Data Versioning WG first meeting
- P12 Gaborone (Nov 2018): Data Versioning WG working meeting
- P13 Philadelphia (April 2019): Data Versioning WG draft report and recommendations
- P14 Helsinki (October 2019): Data Versioning WG final report and recommendations, preparation for TAB adoption.
- P15 Melbourne (March 2020): Report on TAB adoption, closure of WG, discussion of future activities.