This interest group will provide a forum to discuss issues around the management, sharing, discovery, archival and provenance of software source code. The group will pay special attention to source code that generates research data and plays an important role in scientific publications. The Research Data Alliance (RDA) mission is to build the social and technical bridges that enable open sharing of data. Software (as source code and executables) and data are intrinsically linked, both to ensure the continued creation, analysis and reuse of data and to preserve knowledge of how the software was developed, its relationships with other assets and the context in which it was created.
This IG adds value to the RDA community by channeling expertise in software development, sharing, management, versioning, reproducibility and preservation into RDA, and into the RDA groups which could benefit from this expertise.
User scenario(s) or use case(s) the IG wishes to address
Software source code plays a critical role in all fields of modern research, where it is written and developed to address a variety of needs, such as cleaning, processing and visualising data. It is a necessary component of research reproducibility and reusability. Software source code should therefore be curated as carefully as other research inputs and outputs, such as research data and publications. The developers who write software source code, and the organisations that sponsor its development, should also be properly credited and attributed.
This interest group focuses on software source code as a first class citizen in the landscape of scientific research, related to but distinct from research data. The group’s objective is to bring together entities and individuals with complementary expertise and different use cases in order to address the following:
- Develop a consistent metadata profile for discovery of software, source code, algorithms and other software artefacts
- Review existing metadata for describing source code where it is already in place, especially metadata that links source code to data and research publications
- Investigate if there is a need for additional specific metadata for software in order to make it citable, findable and accessible
- Review existing schemas for identifying software artefacts
- Identify and promote an identification schema specifically adapted to track software artefacts
- Collect and publish use cases of current examples and practices
- Develop guidelines for managing, describing and publishing software source code
- Liaise with other RDA groups that express interest in issues specifically related to software source code
Participation in this group is open to all RDA members.
This group will interact with the following relevant RDA IGs/WGs:
- Research data provenance IG&WG
- PID kernel information WG
- Reproducibility IG
- Metadata IG
- Preservation Tools, Techniques, and Policies IG
- Virtual Research Environment IG (VRE-IG)
- Data versioning IG
- Data Citation WG
It will also interact with other IGs/WGs if they become relevant to this group.
The group will also liaise with external software initiatives whose expertise will benefit RDA, such as WSSSPE, FORCE11 (the software citation work in particular), the Software Sustainability Institute, the Software Heritage initiative, journals that publish software, and relevant national and international initiatives.
The group will provide RDA members with extensive background on software source code development, sharing, management, versioning, reproducibility and preservation, in order to foster the emergence of shared standards across the research community on how to describe, identify, find and attribute software source code.
This group will coordinate activities and communicate through the following means:
- Monthly teleconferences to discuss specific issues
- Asynchronous collaboration through Google Docs, the RDA mailing list and wikis
- Updates to other relevant RDA IGs/WGs on the group’s ongoing activities through RDA group mailing lists
- Face-to-face interactions within and across groups at RDA plenaries
In the first year, we plan to set up an active discussion in three key areas: metadata, identifiers, and use cases.
Potential Group Members
- Benoit Baudry
- Daniel S. Katz
- Fernando Rios
- Gribonval Rémi
- Ian Bruno
- Jen Martin
- Jonathan Tedds
- Julia Collins
- Lesley Wyborn
- Martin Hammitzsch
- Martin Monperrus
- Michelle Barker
- Mingfang Wu
- Morane Gruenpeter
- Neil Chue Hong (co-chair)
- Roberto Di Cosmo (co-chair)
- Sandra Gesing
- Stefanie Kethers
- Victoria Stodden