Adoption Grant Introductions - Template for Reproducible, Shareable And Archivable Data Analysis
In December 2018, RDA Europe issued an open call for projects adopting outputs from the RDA’s various Working and Interest Groups. Following recommendations from external evaluators, eight funding grants were awarded in April 2019. This blog series will introduce the eight Adoption Grant cases, giving an overview of their project remits and demonstrating the practical approaches organisations can take when looking to implement the RDA’s Recommendations & Outputs.
According to the European Space Agency's Planck space mission, the observable universe is approximately 13.8 billion years old and at least 28 billion light-years in diameter. In layman’s terms, this could be better described as ‘huge’. With such a vast amount of space (and time!) to study, large datasets have become par for the course for astronomical sciences; projects such as the Large Synoptic Survey Telescope and the Square Kilometre Array are set to produce petabytes’ worth of information. Alongside the question of our place in the universe comes the issue of just what to do with the surfeit of data being produced by large-scale telescopes worldwide, and how the resulting analyses can be documented in a more procedural fashion.
Surveying the scene
Dr Mohammad Akhlaghi and his colleagues have sought to address this. Focusing on the reusability and interoperability of research data, the team at Instituto de Astrofísica de Canarias (IAC) has developed a reproducible research paper template. Following a successful application to the RDA’s Adoption Grants call, they have applied the findings of the Workflows for Research Data Publishing: Models and Key Components Recommendation to their project and will look to improve their implementation. This Recommendation was developed by the RDA/WDS Publishing Data Workflows Working Group and focuses on identifying the commonalities that exist in disciplines across the data-publishing landscape with a view to establishing a generic reference model for all data workflows.
An approach to the solution
As the name suggests, the reproducible paper template set out by Dr Akhlaghi and the team tackles concerns over the replication of research findings. Previous versions of their template have been designed to capture not only the data used as part of each publication, but also the software and analysis procedures that led to the results. Looking to build on the findings of the Working Group and their own work so far, the key features of the IAC output will include:
- a fully customisable paper template;
- linked input and output datasets;
- a version-controlled history of the project’s analysis;
- Persistent Identifiers (PIDs) for each specific version.
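To make the idea of linking datasets to a specific version more concrete, here is a minimal Python sketch. It is not the IAC template's implementation: the function names (`sha256_of`, `build_manifest`, `manifest_id`) and the manifest layout are hypothetical, chosen only for illustration, and the identifier it derives is a plain content hash rather than a registered PID such as a DOI.

```python
import hashlib
import json


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 checksum of some bytes."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(version: str, inputs: dict, outputs: dict) -> dict:
    """Link one project version to checksums of its input and output datasets.

    `inputs` and `outputs` map a (hypothetical) file name to its raw bytes.
    """
    return {
        "version": version,
        "inputs": {name: sha256_of(content) for name, content in inputs.items()},
        "outputs": {name: sha256_of(content) for name, content in outputs.items()},
    }


def manifest_id(manifest: dict) -> str:
    """Derive a stable identifier from the manifest's canonical JSON form."""
    canonical = json.dumps(manifest, sort_keys=True).encode("utf-8")
    return sha256_of(canonical)


if __name__ == "__main__":
    manifest = build_manifest(
        "v1",  # assumed version label, e.g. a version-control tag
        {"catalogue.csv": b"id,flux\n1,0.5\n"},
        {"figure1.pdf": b"placeholder figure bytes"},
    )
    print(json.dumps(manifest, indent=2, sort_keys=True))
    print("identifier:", manifest_id(manifest))
```

Because the identifier is computed from the version label and every checksum, changing any input or output dataset yields a different identifier, which is the property a per-version identifier is meant to provide.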
The reproducible paper will not be geared toward research in astronomical sciences alone. Recognising that issues surrounding the reproducibility of scientific results are affecting a growing number of disciplines, Dr Akhlaghi’s team will enlist the help of researchers in other fields to provide input in the early stages of their template redevelopment. The goal is to develop a robust template which can be implemented by researchers and data curators in any discipline. In addition, it is envisaged that the template will have positive knock-on effects when it comes to pre- and post-publication, with data procedures being made readily available for referees, critical peer-reviewers, readers, and other stakeholders.
Steps so far and for the future
Work has already begun on the dissemination of the team’s approach. Dr Akhlaghi recently presented the reproducible template in a talk at the Indian Institute of Astrophysics in Bangalore and discussed it in a session titled ‘Reproducible Science’ at the European Week of Astronomy and Astrophysics. He has also presented it at the International Astronomical Union's Symposium 355 and the 5th Indo-French Astronomy School in Pune, India. Further presentations are planned at the ParCo2019 Symposium in Prague in September and the 14th RDA Plenary meeting in Helsinki in October.
Development of the template is ongoing, and it is available as free and open-source software. Feedback and contributions are welcome, as is any input on technical development.
As with all RDA Europe activities, researchers and data curators can take advantage of the invaluable links to a network of data practitioners. In the case of the IAC reproducible template in particular, the RDA will provide an avenue for project feedback as well as access to a pool of discipline-specific experts. Due to the cross-disciplinary nature of the project, it is hoped that the value of the Working Group’s Recommendation in establishing protocols for research data publishing will be demonstrated in action to encourage future uptake and adoption by other organisations.
Find out more about the other RDA Europe 4.0 Adoption Projects here.