Harmonizing FAIR descriptions of observational data
New Title of the planned WG:
Interoperability of Observable Property Descriptions WG
Collaborative session notes: https://docs.google.com/document/d/1PVSUDcglbZmFrRYpcFUmgxSWHvilQdLBALzRI1Ilqxc/edit
Meeting Location: Commonwealth A2
The main purpose of this meeting is to create an official RDA working group under the umbrella of VSSIG (Vocabulary and Semantic Services Interest Group).
The objective of this WG is:
Improve research data interoperability through the harmonization of observable property descriptions with a community-agreed semantic model and existing terminologies
Collect research observation use cases suitable to demonstrate the value of a common approach
Review existing semantic models for observable properties
Evaluate strengths and weaknesses of the different semantic models using the selected use cases
Based on this review and analysis, develop a community consensus for the semantic modelling of observable properties
(Analyse the applicability in other languages)
Evaluate suitable terminologies with atomic terms needed to describe observable properties according to the semantic model
Test the semantic model with the selected terminologies on the collected use cases
Compile best practices for semantic model usage
Develop alignments of the semantic model with alternative related models
Inform and collaborate with other relevant efforts (e.g. schema.org, W3C)
Potential case studies of implementations of the common model
For the envisaged working group, the identification of terminologies and the alignment effort (steps 6 and 7) are limited to observable properties. A follow-up working group could continue work on topics such as observation and measurement methods, devices and units, if desired.
Specific objectives for the meeting are:
- Presentation, discussion and general agreement on the objective and tasks
- Identification of additional partners, especially from beyond the European Union, and of task leads
G. Moncoiffe (10 min):
- Motivation - Why we want to harmonize descriptions of observational data
- Summary of activities so far in the informal group
- Presentation of outputs of the WG, as proposed in the draft case statement
Lightning talks from other RDA groups 3 min each (30 min):
- Dimitris Koureas: Biodiversity Data Integration IG
- Jane Wyngaard: Small Unmanned Aircraft Systems’ Data IG
- Helen Glaves: Marine Data Harmonization IG
- Kerstin Lehnert: Physical Samples IG
- Tobias Weigel: Data Type Registries WG/Data Fabric IG
- Amy Nurnberger: Education and Training on handling of research data IG
- Thomas Jejkal: Research Data Repository Interoperability WG
- Fotis Psomopoulos: Using Schema.org and enriching metadata to enable/boost FAIRness on research resource BoF
Each lightning talk should address the following questions:
- Parameters: Which sort of parameters/properties do you deal with?
- Vocabularies: Do you use controlled vocabularies to describe these?
- Strategies: What, if any, is your strategy to reconcile redundancies and synonymy/near-synonymy across vocabularies? How do you deal with complex properties such as monthly mean dissolved lead (ppb) in water?
- Expectations: What do you expect from a harmonizing parameter description working group?
Discussion (facilitated by Mike Brown & John Watkins, wrapped up by Mark Schildhauer, John Graybeal taking notes – 50 min)
Topics: tasks, and identification of additional partners and task leads
This session will be relevant for:
- Data providers/publishers
- Individual researchers
- Ontology engineers
- Research infrastructures
- Digital libraries
If you want to join the group, check this link: https://rd-alliance.org/groups/harmonizing-fair-descriptions-observation...
The scientific community produces large amounts of data about the attributes and behaviours of observed entities collected in research studies, particularly in the life and earth sciences. This includes measurements taken by sensors and observations of living individuals.
In the real world of scientific monitoring and research, the conceptualization of observable variables can be complex. They are often reported using simple terms (such as “temperature”) or a combination of such terms (e.g. “air temperature”, “body temperature”) that are syntactically concise. However, such compact labelling is invariably ambiguous to users who are not familiar with the observation context, and it often obscures the meaning for machines. To be described without ambiguity, any given measurement value must be associated with a term from a FAIR terminology that precisely links meaning with syntax. Clarity is further enhanced if terms adopt a syntax that is itself consistent and descriptive of the phenomena measured; for example, “Temperature of the air in the room”, “Temperature of the air in the atmosphere”, “Temperature of the human body”, “Concentration of total petroleum hydrocarbons per gram dry weight of sediment <63µm”, or “Abundance of Cyanobacteria per milliliter of ocean water”.
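The descriptive syntax in these examples can be pictured as a label assembled from atomic parts. The following is a minimal sketch only: the class and field names (`ObservableProperty`, `property_name`, `object_of_interest`, `context`) are assumptions made for illustration and are not taken from any of the semantic models under discussion.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ObservableProperty:
    """Hypothetical decomposition of an observable property into atomic parts."""
    property_name: str              # the characteristic observed, e.g. "temperature"
    object_of_interest: str         # the entity it is observed on, e.g. "air"
    context: Optional[str] = None   # where that entity is found, e.g. "room"

    def label(self) -> str:
        """Assemble a descriptive label from the atomic parts."""
        text = f"{self.property_name.capitalize()} of the {self.object_of_interest}"
        if self.context:
            text += f" in the {self.context}"
        return text


room_air = ObservableProperty("temperature", "air", "room")
atmosphere_air = ObservableProperty("temperature", "air", "atmosphere")
body = ObservableProperty("temperature", "human body")

for prop in (room_air, atmosphere_air, body):
    print(prop.label())
```

Keeping the parts separate means that two properties sharing the word “temperature” remain distinguishable to a machine, because their object of interest or context differs.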
A number of observation- or sensor-centric formal models have been developed that can be used to describe this kind of data and its semantics in machine-readable form and with less ambiguity (e.g. the observed properties and the entities of interest, the methods used to observe them, the units of the data about them, etc.). These include the Semantic Sensor Network Ontology (https://www.w3.org/TR/vocab-ssn/), the Observations and Measurements (O&M) conceptual model (https://www.iso.org/standard/32574.html), OBOE: The Extensible Observation Ontology (https://github.com/NCEAS/oboe), the Biological Collections Ontology (http://www.obofoundry.org/ontology/bco.html) and the Community Surface Dynamics Modeling System (CSDMS). In parallel to those efforts, practitioners (e.g. data managers of research projects or Research Infrastructures) have established their own controlled vocabularies (e.g. EnvThes - http://vocabs.ceh.ac.uk/evn/tbl/envthes.evn, AnaeeThes - https://fairsharing.org/FAIRsharing.49bmk, LifeWatch Thesauri - http://thesauri.lifewatchitaly.eu/PhytoTraits/index.php, http://thesauri.lifewatchitaly.eu/FishTraits/index.php, http://thesauri.lifewatchitaly.eu/alienspecies/index.php). However, a lack of interoperability between the terminologies used in different disciplinary and regional communities prevents the global integration of data and its efficient exploitation.
To harmonize terminology resources and improve the interoperable description of observational data, curators and creators of scientific terminology need to agree on a common conceptual model from which a set of patterns can be derived, allowing alignment of smaller components. This effort will considerably improve data interoperability and re-use by accelerating discovery and integration.
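One way to picture the alignment of smaller components is the sketch below. It assumes that terms from two vocabularies have already been decomposed into the same atomic slots. The vocabulary prefixes, term identifiers and slot names are all hypothetical, and matching on identical components is only one simple rule; the model eventually agreed by the WG may work differently.

```python
from typing import Dict, List, Tuple

# A term decomposed into atomic slots: (property, object of interest, matrix).
Components = Tuple[str, str, str]

# Hypothetical terms from two vocabularies; all identifiers are invented.
vocab_a: Dict[str, Components] = {
    "A:0001": ("concentration", "lead", "water"),
    "A:0002": ("temperature", "air", "atmosphere"),
}
vocab_b: Dict[str, Components] = {
    "B:pb_conc_water": ("concentration", "lead", "water"),
    "B:temp_body": ("temperature", "human body", ""),
}


def align(a: Dict[str, Components], b: Dict[str, Components]) -> List[Tuple[str, str]]:
    """Pair terms from `a` and `b` whose atomic components are identical."""
    # Index vocabulary b by its components for direct lookup.
    by_components: Dict[Components, List[str]] = {}
    for term_id, parts in b.items():
        by_components.setdefault(parts, []).append(term_id)

    pairs: List[Tuple[str, str]] = []
    for term_id, parts in a.items():
        for match in by_components.get(parts, []):
            pairs.append((term_id, match))
    return pairs


matches = align(vocab_a, vocab_b)
print(matches)
```

In practice the slots would more plausibly hold identifiers from shared terminologies rather than plain strings, so that synonyms in different vocabularies resolve to the same component.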
The formation of a working group to address this challenge was discussed at the “Ontology and Semantic Web for Research Workshop” in Lecce, July 2017, hosted by LifeWatch Italy and the EU project EUDAT. During this meeting, preliminary work was undertaken to describe data from a common use case using distinct, but well-used, terminologies in order to align them and develop a common model. Interest in this effort has been sustained through regular calls in 2018, with the informal participation of a group of scientists from European Research Infrastructures (eLTER - http://www.lter-europe.net/, LifeWatch Italy - www.lifewatchitaly.eu, AnaEE - https://www.anaee.com/, ICOS - https://www.icos-ri.eu/), data publishers (PANGAEA - https://www.pangaea.de/), terminology developers (ENVO - https://github.com/EnvironmentOntology/envo), terminology services (BioPortal - https://bioportal.bioontology.org/, AgroPortal - http://agroportal.lirmm.fr/ontologies, EcoPortal - http://ecoportal.lifewatchitaly.eu/, NERC Vocabulary Server - https://www.bodc.ac.uk/resources/products/web_services/vocab/), and other institutions and initiatives (TIB - https://www.tib.eu, AquaDiva - http://www.aquadiva.uni-jena.de/) under the RDA Vocabulary and Semantic Services Interest Group (VSSIG). The current progress of this coordination was presented at the 11th RDA Plenary Meeting in Berlin.
Additional links to informative material
* Ontology and Semantic Web for Research Workshop: http://www.servicecentrelifewatch.eu/ontology-semantic-web-for-biodiversity-ecosystem-research
* LifeWatch use case: https://drive.google.com/open?id=1soeYDyYuES-XZ_VTvdczixwqZS0vL0yN
* VSSIG task group “Harmonize conceptualization of observation types” Google Drive: https://drive.google.com/open?id=1eZ2ypn2Q1SRqSZYOBiVMob3ZgkKG667e
* M. Diepenbroek et al. (2017): Terminology supported archiving and publication of environmental science data in PANGAEA, Journal of Biotechnology 261 (2017) 177-186, https://doi.org/10.1016/j.jbiotec.2017.07.016
* D. Osumi-Sutherland et al. (2017): Dead simple OWL design patterns, Journal of Biomedical Semantics (2017) 8: 18, https://doi.org/10.1186/s13326-017-0126-0