RDA-OfR Creating a Multi-omics Metadata Schema Standard Reporting Matrix WG
Note: The WG is actively seeking co-chairs and members from different countries and continents who cover various Omics domains, including expertise in mass spectrometry and sequencing. A dedicated communications campaign via the RDA’s communications and social media channels will take place to recruit co-chairs and members from geographically diverse regions.
Multi-omics data integration merges multiple Omics data types (e.g., genomics, proteomics, metabolomics, phenomics), leveraging a wide range of high-throughput technologies as a holistic approach to the quantification and characterisation of large, complex pools of biological molecules. Multi-omics data integration and analysis provide a closer look at the complex structural and functional interactions at the molecular level, improving our understanding of the biological dynamics of living organisms across the life science landscape.
High-throughput Omics technologies (sequencing, mass spectrometry, imaging, etc.) continue to advance significantly, bridging a variety of subject matter expert methodologies and applications. As a result, there has been an unprecedented increase in the volume of multi-omics data generated and stored over the past decade. However, many challenges remain in managing and sharing complex Omics datasets for reuse (Figure 1).
Figure 1. Challenges of effectively managing and integrating multi-omics data identified during the RDA-OfR WG brainstorming workshop in September 2023.
A persistent challenge is the vast array of fragmented experimental metadata and data formats generated across the different Omics domains, making it difficult to manage and integrate multi-omics data prior to downstream analyses.
This RDA working group (WG), supported by Oracle for Research (OfR), aims to address a few of these challenges by creating a matrix of identified reporting guidelines and standards essential for integration of multiple Omics metadata elements across the different domain technologies (Figure 2).
Landscape review and collection of existing Omics community standards (Deliverable)
The WG will research and consult existing work in the area by undertaking an in-depth landscape review to identify current Omics domain data standards (common data formats, controlled vocabularies and ontologies, metadata reporting guidelines, and identifier schemas) outlined by community-accepted data management and sharing best practices within and across the different Omics domains. This work will build upon existing resource records at FAIRsharing (an RDA WG and a curated, informative and educational resource on data and metadata standards, inter-related to databases and data policies). The WG will evaluate existing community Omics standard records and contribute them to an iterative and open Omics Domain Collection at FAIRsharing as part of the landscape review recommendation output. Using FAIRsharing educational materials to guide its curation activities, the group will collect standalone Omics domain community standards and reporting guidelines to inform the downstream crosswalk activities. The landscape review and resulting collection are intended to encourage continuous (machine-actionable) community-level curation of standards beyond the lifecycle of the WG, for existing and future research community stakeholder groups focused on data standards in the life sciences.
Omics community standard and reporting guideline crosswalk (Deliverable)
There are currently many well-established and well-developed Omics standards, but it is not immediately obvious which of them are complementary across domains. Based on the results of the Omics landscape review, this WG will provide a crosswalk that identifies common data standards and metadata reporting guidelines implemented across Omics domains in order to link complementary integration points. This activity aims to support the common metadata elements outlined in the resulting standard reporting matrix, as well as downstream data interpretation endeavours. The crosswalk will also highlight domain metadata reporting gaps and areas where current implementations are not aligned across the various Omics standards and where supplementation may be useful.
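As an illustration of what such a crosswalk might look like in machine-actionable form, the following is a minimal Python sketch. It is not the WG's method: the domain guidelines, field names, and shared concepts in `GUIDELINES` are hypothetical placeholders, and a real crosswalk would be built from the curated standards identified in the landscape review.

```python
from collections import defaultdict

# Hypothetical input: for each Omics domain, the guideline's local field
# names mapped to a shared concept. All names are illustrative only.
GUIDELINES = {
    "genomics": {
        "sample_id": "sample identifier",
        "organism": "organism",
        "seq_platform": "instrument",
    },
    "proteomics": {
        "sample_name": "sample identifier",
        "species": "organism",
        "ms_instrument": "instrument",
        "digestion_enzyme": "protease",
    },
    "metabolomics": {
        "sample_ref": "sample identifier",
        "ms_instrument": "instrument",
        "extraction_solvent": "extraction solvent",
    },
}


def build_crosswalk(guidelines):
    """Return {shared concept: {domain: local field name}}."""
    crosswalk = defaultdict(dict)
    for domain, fields in guidelines.items():
        for local_name, concept in fields.items():
            crosswalk[concept][domain] = local_name
    return dict(crosswalk)


def integration_points(crosswalk, domains):
    """Concepts reported by every domain: candidate integration points."""
    wanted = set(domains)
    return sorted(c for c, mapped in crosswalk.items() if set(mapped) == wanted)


def reporting_gaps(crosswalk, domains):
    """Concepts missing from at least one domain, with the missing domains."""
    wanted = set(domains)
    return {
        concept: sorted(wanted - set(mapped))
        for concept, mapped in crosswalk.items()
        if set(mapped) != wanted
    }


crosswalk = build_crosswalk(GUIDELINES)
shared = integration_points(crosswalk, GUIDELINES)
gaps = reporting_gaps(crosswalk, GUIDELINES)
```

With these placeholder inputs, `shared` lists the concepts every domain reports, while `gaps` shows, for each remaining concept, the domains where a corresponding element is missing.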
Multi-omics metadata schema standards and reporting matrix plus use case collection (Deliverable)
The WG will produce a multi-omics metadata schema standard reporting matrix detailing the essential domain standard metadata elements required to accommodate multi-omics integration in the areas of genomics, transcriptomics, proteomics, and metabolomics (including imaging mass spectrometry). This guideline will be supported by multi-omics community example use cases and a curated Multi-omics Database Collection at FAIRsharing capturing use case records, if applicable/available. The documented use cases will provide diverse subject domain community examples of multi-omics standard integrations from existing cross-disciplinary group activities (societies, alliances, standards consortia, and research projects focused on data standards harmonisation). The use case database collection will outline specific developments where multiple Omics standards have been successfully integrated to support multi-omics data platforms and/or metadata frameworks (e.g., National Microbiome Data Collaborative (NMDC), Multi-Omics Research Factory (MORF)). Multi-omics data integration use cases that have a database record at FAIRsharing and were identified during the landscape review will be included in this new collection, which will consist of use-case multi-omics data repositories and/or knowledge bases implementing at least 2 different domain Omics data standards (reporting guidelines, controlled vocabularies and ontologies, etc.) that reflect key points of integration captured in the standard reporting matrix.
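The inclusion criterion for the use case collection (at least 2 different domain Omics data standards) could be checked along the following lines. This is a sketch under stated assumptions: the `DatabaseRecord` type, the record names, and the standard names are hypothetical, and the criterion is interpreted here as "standards drawn from at least two distinct Omics domains", which is one possible reading of the text above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DatabaseRecord:
    """Hypothetical stand-in for a database record in a curated collection."""
    name: str
    # Maps each implemented standard to the Omics domain it belongs to.
    standards: Dict[str, str] = field(default_factory=dict)


def qualifies(record: DatabaseRecord, min_domains: int = 2) -> bool:
    """True if the record implements standards from enough distinct domains."""
    return len(set(record.standards.values())) >= min_domains


def select_use_cases(records: List[DatabaseRecord]) -> List[str]:
    """Names of the records that meet the inclusion criterion."""
    return [r.name for r in records if qualifies(r)]


# Illustrative records only; none of these describe a real resource.
records = [
    DatabaseRecord("repo_a", {"guideline_g1": "genomics",
                              "guideline_p1": "proteomics"}),
    DatabaseRecord("repo_b", {"guideline_g1": "genomics",
                              "ontology_g2": "genomics"}),
    DatabaseRecord("repo_c", {}),
]

selected = select_use_cases(records)
```

Here `repo_b` is excluded even though it implements two standards, because both belong to the same domain; a different reading of the criterion would count standards rather than domains.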
Figure 2. Omics domain technologies and corresponding core Omics working group deliverable outline.
Meetings and Deliverables (see WG meeting folder)
Documentation and Deliverables
7 September 2023 | Brainstorming workshops: Definition of WG scope
29 November 2023 | 1st WG meeting
13 December 2023 | 2nd WG meeting: Landscape review and collection of existing Omics community standards (Deliverable)
25 January 2024 | 3rd WG meeting: Landscape review and collection of existing Omics community standards (Deliverable), continued
21 February 2024 | 4th WG meeting: Landscape review and collection of existing Omics community standards (Deliverable), continued
20 March 2024 | 5th WG meeting: TBD