Samples are taken in every field of study, but they vary widely in type: single crystals, powders, complex structures, proteins and other biological (macro)molecules, cells, tissues, organisms, archeological artifacts, fossils, artwork, and so on. Different fields may categorize the same sample from several perspectives at once (e.g., nanomaterials are considered both physical particles and molecular entities; proteins are molecular entities of biological origin). A sample may consist of multiple components in multiple phases, and it may represent a single entity or a collection of entities.
The sampling scheme is a critical aspect of designing any experiment to yield informative and reproducible results. Many factors in sample collection, storage and processing are relevant to interpreting the measurement data derived from those samples. Different kinds of samples may be collected for different purposes: for example, biological specimens (or parts of specimens, such as a leaf from a plant or tissue from an animal), soil, and even air. Samples may be sensitive to the conditions of handling and storage (e.g., humidity, temperature), and may also undergo further processing (e.g., dispersion, mixing, plating, staining). Samples may have spatial, temporal or other relationships that need to be articulated. A macro sample may be collected and then subsampled at increasing granularity down to the nanoscale, in which case multiple series of parent-child relationships need to be documented.
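As an illustration of the parent-child documentation described above, the following is a minimal sketch only, not a proposed data model. The class `Sample`, its field names, the helper functions and the identifiers such as `core-1/a` are all hypothetical and chosen purely for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Sample:
    """Hypothetical minimal sample record; field names are illustrative only."""
    sample_id: str
    description: str
    parent_id: Optional[str] = None                        # None marks a top-level (macro) sample
    storage: Dict[str, str] = field(default_factory=dict)  # e.g. temperature, humidity
    processing: List[str] = field(default_factory=list)    # ordered processing steps


def lineage(sample_id: str, registry: Dict[str, Sample]) -> List[str]:
    """Return identifiers from the given sample up to its top-level ancestor."""
    chain: List[str] = []
    seen = set()
    current: Optional[str] = sample_id
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle in parent links at {current!r}")
        seen.add(current)
        chain.append(current)
        current = registry[current].parent_id
    return chain


def children(sample_id: str, registry: Dict[str, Sample]) -> List[str]:
    """Return the identifiers of the direct subsamples of a sample, sorted."""
    return sorted(s.sample_id for s in registry.values() if s.parent_id == sample_id)


# Made-up example: a macro sample with two subsamples and one sub-subsample.
samples = [
    Sample("core-1", "sediment core", storage={"temperature": "4 C"}),
    Sample("core-1/a", "upper slice of the core", parent_id="core-1"),
    Sample("core-1/b", "lower slice of the core", parent_id="core-1"),
    Sample("core-1/a/i", "dispersed fine fraction", parent_id="core-1/a",
           processing=["dispersion"]),
]
registry = {s.sample_id: s for s in samples}

print(lineage("core-1/a/i", registry))
print(children("core-1", registry))
```

Storing only a parent reference on each record keeps the records simple while still allowing several independent subsampling series to hang off the same macro sample; whether that is sufficient for a given discipline is exactly the kind of question the session is meant to explore.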
There are many well-developed identifiers and semantic descriptions for different facets of sample provenance. Cross-domain, community-endorsed examples include ISO 19156:2023 Observations, Measurements and Samples and the W3C/OGC Semantic Sensor Network Ontology, which includes the core SOSA (Sensor, Observation, Sample, and Actuator) ontology for its elementary classes and properties. More domain-specific approaches include iSamples and the Global Biodiversity Information Facility (GBIF). How widely known and used are these existing cross-domain ontologies and models? New international cross-domain ontologies are being published, as are community-driven ontologies in fields such as earth sciences and biodiversity; how can we adapt these to suit additional disciplines?
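To make the SOSA reference concrete, here is a rough sketch of how a subsample might be described with the SOSA terms `sosa:Sample` and `sosa:isSampleOf`, serialized as JSON-LD using only the Python standard library. The function name `sample_to_jsonld` and the `https://example.org/...` identifiers are assumptions made for illustration; a real description would likely need further terms from SOSA or from domain vocabularies.

```python
import json

# Namespace IRIs for the SOSA ontology and RDF Schema.
SOSA = "http://www.w3.org/ns/sosa/"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"


def sample_to_jsonld(sample_iri: str, feature_iri: str, label: str) -> dict:
    """Build a small JSON-LD document for one sample (hypothetical helper).

    sosa:isSampleOf links the sample to the feature it is meant to represent,
    which may itself be another (parent) sample.
    """
    return {
        "@context": {"sosa": SOSA, "rdfs": RDFS},
        "@id": sample_iri,
        "@type": "sosa:Sample",
        "rdfs:label": label,
        "sosa:isSampleOf": {"@id": feature_iri},
    }


# Made-up identifiers, for illustration only.
doc = sample_to_jsonld(
    "https://example.org/sample/core-1/a",
    "https://example.org/sample/core-1",
    "upper slice of sediment core 1",
)
print(json.dumps(doc, indent=2))
```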
The ability to compile data from disparate disciplines will make it much easier to address broader, global challenges. Harmonized sample descriptions will also ease the workflow of instrument facilities, which apply physical measurement techniques to extremely diverse sample types and must meet a broad range of user needs for documentation. This session brings together disciplines including geochemistry, biodiversity, nanomaterials, analytical chemistry, and crystallography, among others, to explore approaches to harmonizing sample description and provenance.
The expected outcomes of this discussion are to:
Compile a list of needs for describing sample types, origin, processing workflows and other requirements across disciplines
Identify existing identifiers, classifications, ontologies and terminologies that support these descriptions
Determine whether there is enough interest to propose an RDA Working Group to develop best practices for sample data model specifications