
Describing diverse chemistry datasets across distributed data resources


    Ian Bruno
    Member

     
    Collaborative session notes: https://docs.google.com/document/d/1_X2e1qCEedP4IaKr1uwsgB5Eh7_7Unl4KeUn…
     
    We intend to spend half of the session sharing perspectives and half discussing what the wider community needs, what may already exist to help address those needs, and what further action is required.
     
    Contributions and participation will be drawn from the groups listed below, all of which have indicated some interest. A more detailed agenda will be established nearer the time. We do not expect everyone listed to give a presentation, and we aim to hold some discussions among stakeholders in advance so that we can efficiently lay out the challenges to be considered during the session.

    NFDI4Chem

    OneGeochemistry 

    UK Catalysis Hub

    UK Physical Sciences Data Infrastructure

    IUPAC standards projects (e.g., Gold Book, FAIRSpec, ThermoML, Adsorption Information Framework, Solubility metadata, etc.) 

    IUPAC WorldFAIR Working Groups

    The InChI Trust

    Chemistry domain repositories

    European Materials & Modelling Ontology (EMMO)

    The RDA PDINST Working Group

    DataCite representatives and/or DataCite Metadata Working Group

     

    Additional links to informative material

    CRDIG: https://www.rd-alliance.org/groups/chemistry-research-data-interest-group.html

    OneGeochemistry: https://www.earthchem.org/communities/onegeochemistry/

    NFDI4Chem: https://www.nfdi4chem.de

    PSDI: https://www.psdi.ac.uk/

    UK Catalysis Hub: https://ukcatalysishub.co.uk/

    IUPAC Digital Standards: https://iupac.org/what-we-do/digital-standards/ 

    IUPAC WorldFAIR: https://iupac.org/worldfair-global-cooperation-on-fair-data-policy-and-practice/

    IUPAC FAIR Chemistry: https://zenodo.org/communities/fairchemistry/

    FAIRSpec: https://doi.org/10.26434/chemrxiv-2022-t783k

    The InChI Trust: https://www.inchi-trust.org/

    Chemistry Domain Repositories: https://www.nfdi4chem.de/index.php/repos/

    EMMO: https://github.com/emmo-repo/EMMO

    Avoid conflict with the following group (1)
    Data Usage Metrics WG

    Brief introduction describing the activities and scope of the group
    The Chemistry Research Data Interest Group exists to provide a forum for discussion of matters relating to the sharing and reuse of research data generated by and relevant to the chemical sciences and aligned disciplines. We periodically convene sessions to address timely topics that may be of broad community interest and can benefit from input by experts from across RDA communities. The activities of the group complement data-related activities being undertaken by other national and international chemistry initiatives, in particular the International Union of Pure and Applied Chemistry (IUPAC), the standards body for chemistry.

    Group chair serving as contact person
    Ian Bruno

    Meeting objectives
    Increasing numbers of chemistry-related datasets are becoming available for re-use through data repositories that range from generic to very specialised. These repositories may specialise in one particular data type, or focus on one particular sub-discipline; some may capture just the raw data produced by an experiment, or results derived from those datasets; others may be less structured and accept whatever files are deposited.
     
    Chemistry data can be quite complex, with a broad variety of data types and associated information needed for analysis. Datasets may contain collections of files relating to different analytical methods that together describe an overall result but are also relevant individually. Much chemical data is rendered meaningless without adequate contextual description (e.g., samples, conditions, measurement parameters), and there are many approaches to documenting this information.
     
    Repositories will necessarily vary in the depth to which they describe datasets, and a repository's scope of coverage can affect which metadata are exposed. Additionally, it is important to ensure that domain metadata are registered alongside DOIs at an appropriate level of granularity. Implementing domain data standards will be critical for maximising interoperability and sustainability.
     
    This session aims to explore the metadata descriptions needed to ensure that deposited chemical data can be readily discovered and robustly re-used across disciplines and use cases.
     
    We will explore specific needs and challenges relating to the description of chemistry datasets including:

    The vocabularies needed to characterise disciplines and sub-domains relevant to a particular chemistry dataset

    How to accurately represent the specific type of a chemistry dataset

    How to describe chemical substances relevant to a chemistry dataset

    Deficiencies in general-purpose metadata schema that prevent us from describing chemistry datasets to the level of detail required

    For each of these, we will address the following questions:

    To what extent do existing scientific standards and motifs provide a basis for addressing our needs and how might these need to evolve?

    How do we ensure that the technical community is aware of existing standards that are applicable to the development of new infrastructure?

    Are new standards or recommendations, chemistry-related or otherwise, required to ensure the consistency and coherence of description across diverse data resources?

    How do we engage subject experts in helping to fill scientific gaps in both general and domain-specific repositories?

    We will also consider more general concerns such as:

    The level of description appropriate to include in disciplinary vs general data and metadata repositories and registries

    How to effectively group together different data objects required to reproduce published results when these may be stored in different repositories

    Whether new services are required to enable the discovery, reuse and interoperability of chemistry datasets across resources
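
    To make the discussion points above more concrete, here is a minimal sketch of how discipline, dataset type, substance identity and cross-repository grouping might sit together in a dataset record. It is only an illustration under stated assumptions: the field names are loosely modelled on DataCite-style metadata, the repository names and DOIs are made-up placeholders, and the two helper functions are hypothetical rather than part of any existing service. The InChI strings for water and methane are the only real identifiers used.

```python
from collections import defaultdict

# Hypothetical records: every DOI, repository name and subject term is a
# placeholder, and the field names are an assumption for illustration only.
WATER = "InChI=1S/H2O/h1H2"
METHANE = "InChI=1S/CH4/h1H4"

records = [
    {
        "doi": "10.1234/example-nmr",
        "repository": "repo-a",
        "subjects": ["analytical chemistry"],
        "dataset_type": "NMR spectrum",
        "substances": [WATER],
        "related": [
            {"relationType": "IsPartOf", "identifier": "10.1234/example-study"}
        ],
    },
    {
        "doi": "10.5678/example-ir",
        "repository": "repo-b",
        "subjects": ["analytical chemistry"],
        "dataset_type": "IR spectrum",
        "substances": [METHANE],
        "related": [
            {"relationType": "IsPartOf", "identifier": "10.1234/example-study"}
        ],
    },
    {
        "doi": "10.9999/example-xrd",
        "repository": "repo-b",
        "subjects": ["crystallography"],
        "dataset_type": "diffraction data",
        "substances": [WATER],
        "related": [],
    },
]


def group_by_parent(recs):
    """Collect the DOIs of records that declare themselves part of the same parent."""
    groups = defaultdict(list)
    for rec in recs:
        for rel in rec.get("related", []):
            if rel["relationType"] == "IsPartOf":
                groups[rel["identifier"]].append(rec["doi"])
    return dict(groups)


def find_by_substance(recs, inchi):
    """Return the DOIs of records that list the given InChI string."""
    return [rec["doi"] for rec in recs if inchi in rec.get("substances", [])]


if __name__ == "__main__":
    print(group_by_parent(records))
    print(find_by_substance(records, WATER))
```

    In this sketch the two spectra held in different repositories are tied to one study through a shared parent identifier, and a substance search works only because both repositories record the same identifier string. Whether such links and identifiers belong in general-purpose metadata, in domain repositories, or in a separate service is exactly the kind of question the session is meant to consider.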
