
Secretariat Liaison: Stefanie Kethers
TAB Liaison: Rainer Stotzka
 

Introduction:

 

The wide use of schema.org to add structured metadata to web pages for commercial search engines has attracted the attention of the data management community as a possible way to leverage robust commercial search engines such as Google, Yahoo and Bing to facilitate discovery of and access to scientific data. Various projects have been exploring this approach, including the US NSF EarthCube p418 project, Google's Dataset Recommendations, BioSchemas, Force11 DCIP, Research Data Australia, DataCite, Harvard Dataverse, NASA’s Distributed Active Archive Center (DAAC) Infrastructure and EOSCpilot. However, because schema.org has largely been driven by commercial business use cases, with a loosely governed process for adding and defining resource types, properties and vocabularies for research domains, there are gaps and deficiencies that make its application to research data problematic.
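
As an illustration of the kind of markup these projects embed in dataset landing pages, the following is a minimal sketch in Python. It builds a schema.org `Dataset` description as JSON-LD and wraps it in a `<script type="application/ld+json">` element. The helper names (`dataset_jsonld`, `embed_in_html`) and all example values (the title, the `example.org` URL, the keywords) are assumptions made for illustration only; they are not taken from any of the projects listed above.

```python
import json


def dataset_jsonld(name, description, url, keywords, license_url):
    """Build a schema.org Dataset description as a JSON-LD dictionary."""
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "keywords": list(keywords),
        "license": license_url,
    }


def embed_in_html(jsonld):
    """Wrap a JSON-LD dictionary in a script element for a landing page."""
    payload = json.dumps(jsonld, indent=2)
    # Escape "</" so a value containing "</script>" cannot close the element
    # early; "\/" is a valid JSON escape for "/".
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Hypothetical example record; every value here is illustrative.
example = dataset_jsonld(
    name="Soil moisture observations 2010-2015",
    description="Gridded daily soil moisture observations for a sample region.",
    url="https://example.org/datasets/soil-moisture",
    keywords=["soil moisture", "hydrology"],
    license_url="https://creativecommons.org/licenses/by/4.0/",
)
snippet = embed_in_html(example)
print(snippet)
```

A repository would typically place the resulting element in the HTML of each dataset landing page so that a web crawler can read the metadata alongside the human-readable page.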

 

 

Since P11, the RDA Data Discovery Paradigms IG has run the task force "Using schema.org for research data discovery". The group has organised sessions at RDA plenaries and online calls to discuss how we, as a research community, can come together to embrace the advantages of discovering data via web search engines while addressing the gaps and deficiencies. There is a proposal to form an RDA Working Group with a focused scope and a set of well-defined priorities/objectives.

 

The objectives of this working group are twofold:

  1. to identify and bridge gaps in existing schemas commonly used for research data, by bringing together communities who are working with such vocabularies to document research data and related resources;
     
  2. to provide guidelines for those communities whose needs are not addressed by existing metadata schemas such as schema.org, including guidance on proposing extensions.

To align with the above objectives, we conducted a survey on current practices in using schemas to describe research datasets. The survey is still open, and your participation is more than welcome. (The survey was developed by the DDP IG TF, which led to the formation of this WG.)

 

The planned outputs will include:

  1. A generic ‘conceptual data model’ with essential types and properties for research data discovery over the web. The model will be built on bioschemas.org, science-on-schema.org, schema.org, DCAT, DDI-DISCO and SSN schemas from some representative research domains, and data discovery use cases. A research domain can then map its own schema to the conceptual model when publishing data to the web or exchanging metadata between data portals/repositories.
     

  2. A guideline, illustrated with common patterns, for publishing metadata landing pages with structured data markup; and a guideline on how to customise the research schemas for target domains, with examples.
     

  3. Tooling to make implementation easier, if resources are available. This could include collecting and cataloguing tools that generate, validate and parse schema.org and DCAT markup.
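
The mapping described in output 1 and the kind of validation tool mentioned in output 3 could look roughly like the sketch below. It assumes a hypothetical domain record with the fields `title`, `abstract`, `landing_page` and `tags`, and a hypothetical crosswalk (`FIELD_MAP`) from those fields to schema.org `Dataset` properties. The set of required properties (`REQUIRED`) is likewise an assumption for illustration; the working group's conceptual model is not defined here.

```python
# Hypothetical crosswalk from a domain schema to schema.org Dataset properties.
FIELD_MAP = {
    "title": "name",
    "abstract": "description",
    "landing_page": "url",
    "tags": "keywords",
}

# Assumed minimum set of properties for a discoverable record.
REQUIRED = ("name", "description")


def to_schema_org(record, field_map=FIELD_MAP):
    """Map a domain record onto schema.org Dataset properties."""
    doc = {"@context": "https://schema.org", "@type": "Dataset"}
    for source_field, target_prop in field_map.items():
        value = record.get(source_field)
        # Skip fields that are absent or empty in the source record.
        if value not in (None, "", []):
            doc[target_prop] = value
    return doc


def missing_properties(doc, required=REQUIRED):
    """Return the required properties that the mapped document lacks."""
    return [prop for prop in required if prop not in doc]


# Illustrative record with an empty abstract.
record = {
    "title": "Soil moisture grid",
    "abstract": "",
    "landing_page": "https://example.org/d/1",
    "tags": ["soil"],
}
mapped = to_schema_org(record)
print(missing_properties(mapped))  # ['description']
```

Keeping the crosswalk as plain data rather than code means a domain community could publish and revise its mapping without changing the tool that applies it.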

 

The WG's Wiki Index

 

Meeting schedule:

 

This group has a regular meeting on the second Thursday of each month, starting at 8pm UTC. A meeting reminder with the Zoom ID will be emailed to this group ahead of each meeting.