Version 3 of the RDA Data Concept Definitions

A key ingredient of RDA meetings is the pursuit of a shared understanding of research, data and their foundations. Agreement on the meaning of research terms remains central to these goals and is part of Plenary discussions on topics such as the RDA data landscape and how it is communicated. A particularly interesting development has been the discussion of RDA progress at recent Plenaries and at collaborative meetings on RDA directions. Over the last few Plenaries, agreement has emerged that RDA's direction and current status should be made more explicit and understandable to outside groups. This is useful for internal RDA communication and group work as well as for outreach and branding. There is also working agreement that this might be tested through a periodic road-mapping exercise, developed in coordination between TAB and the Chairs, with feedback solicited from OAB and Council. A start on this roadmap vision was made at the bi-annual Chairs cooperative meeting. It promises to become important as additional RDA groups mature and as common interests emerge and are understood, all of which can help accelerate RDA messaging.

Analysis of word and group clusters as part of DFT vocabulary development provides some insight into RDA's overall direction and into the issue of characterizing a single RDA core versus a multi-core of many different areas. This topic is planned for discussion at the next Plenary. A key aim of DFT-IG sessions is to support broader model and vocabulary agreements within and across RDA groups (and their representative communities and stakeholders) on such core ideas as "Useful data" or "Interoperable data." Other areas of discussion include distinctions about infrastructure, such as “middleware infrastructure”, and items related to the FAIR principles, such as “Fairsharing” and “23 Things: Libraries for Research Data” [1], which includes a Data Thesaurus [2]. A comparison between the NLM Data Thesaurus, which provides more extensive references and explicit relations between terms, and the DFT catalog is planned for the Plenary.

Discussion of DFT vocabulary refinement and expansion continues at each Plenary session, based on pre-Plenary discussion as well as on audience participation.

The DFT IG also maintains contact and coordination with groups outside RDA that are working on data vocabularies. Earlier Plenaries included special discussions of extant data vocabulary efforts, with a particular focus on the International Research Data Management glossary (IRiDiuM) supported by RDC, CASRAI, and CODATA. An update on IRiDiuM and its relation to DFT can be expected at P12. The Digital Curation Centre (DCC) is another organization whose efforts are of interest to DFT.

In addition, P12 will allow the DFT IG to continue the discussion of an understandable data vision initiated by various RDA DFT WGs, and to support continuing RDA efforts to elaborate the basic data concepts within a useful framework while documenting data vocabularies.

To support this, virtual meetings are planned for 2 or 3 months before Plenaries, along with contact with RDA groups about candidate terms to populate the DFT term tool, TeD-T.

Footnotes: 1. https://rd-alliance.org/system/files/documents/23Things_Libraries_For_Da...
2. https://nnlm.gov/data


Case Statement at: https://rd-alliance.org/groups/data-foundations-and-terminology-ig.html

A page summary of DFT is available at: https://www.rd-alliance.org/data-foundations-and-terminology-ig-overview...
Some slides are available; see the DFT site for P11 overviews and updates by Gary Berg-Cross.

This session will present a stable version 3.0 of the vocabulary. It will include updates from P11 and from two Chairs collaborative discussions, which included discussion of the landscape of core RDA topics and possible roadmaps. The intent is to support continued synchronization of RDA conceptualization and to enable better understanding within and between RDA groups. In addition, it will provide updates on the term tool's operation, functionality and use by groups, as well as on the use of term definitions to support group cooperation.

The session will also allow newer groups to present their vocabulary issues and to discuss relations to other groups and their definitions. We expect, for example, that as in previous years several terms around open data will be completed, along with progress on metadata profiles. Some additional ideas for topics will have been developed as part of DFT-IG virtual sessions since P11.

Improvements in the contextual depth of definitions will be discussed, both to support synchronized conceptualization and to enable better understanding within and between communities. Such discussion is often facilitated by constructing conceptual maps that show key relations between and among termed concepts.

One potential area of interest is terminological services, where the corresponding group has again become active. DFT remains interested in various vocabulary services, including mapping between vocabularies and finding similar terms, and will therefore interact with the renamed Vocabulary and Semantic Services Interest Group (VSIG). Topics of interest include:
  • The role of conceptual and knowledge graphs for vocabulary representation.
  • Update on defining additional relations and useful links between and among terms.
  • How to handle data definition mutability as concepts change over time, including marking definitions or terms as deprecated and versioning stable snapshots of the DFT vocabulary set. We have implemented an initial versioning approach around Plenaries.
  • Interest in handling similarity between terms, including terms that are part of other data vocabulary development efforts. A shared repository has been proposed.
  • How to satisfy the need to add richer, contextual semantics to metadata, as discussed at a prior Chairs meeting.
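
The mutability item above (deprecating definitions and versioning stable snapshots) can be illustrated with a minimal sketch. This is not how TeD-T is implemented; the class names, fields and snapshot labels below are hypothetical and only show one possible way to keep old definitions available while publishing stable versions.

```python
from dataclasses import dataclass, field, replace
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Definition:
    """One revision of a term's definition (hypothetical structure)."""
    term: str
    text: str
    revision: int = 1
    deprecated: bool = False


@dataclass
class Vocabulary:
    # Every revision of every term is kept, so older references still resolve.
    history: Dict[str, List[Definition]] = field(default_factory=dict)
    # Named, immutable snapshots, for example one per Plenary.
    snapshots: Dict[str, Tuple[Definition, ...]] = field(default_factory=dict)

    def define(self, term: str, text: str) -> Definition:
        """Add a definition; the previous revision, if any, is marked deprecated."""
        revisions = self.history.setdefault(term, [])
        if revisions:
            revisions[-1] = replace(revisions[-1], deprecated=True)
        new = Definition(term, text, revision=len(revisions) + 1)
        revisions.append(new)
        return new

    def deprecate(self, term: str) -> None:
        """Retire a term without supplying a replacement definition."""
        revisions = self.history[term]
        revisions[-1] = replace(revisions[-1], deprecated=True)

    def current(self) -> List[Definition]:
        """Latest non-deprecated definition of each term, sorted by term."""
        latest = (revs[-1] for revs in self.history.values())
        return sorted((d for d in latest if not d.deprecated), key=lambda d: d.term)

    def snapshot(self, label: str) -> Tuple[Definition, ...]:
        """Freeze the current definitions under a version label."""
        if label in self.snapshots:
            raise ValueError(f"snapshot {label!r} already exists")
        self.snapshots[label] = tuple(self.current())
        return self.snapshots[label]


# Illustrative use with placeholder definition texts (not real DFT entries).
vocab = Vocabulary()
vocab.define("FAIR-compliant", "placeholder definition, first wording")
vocab.snapshot("2.0")
vocab.define("FAIR-compliant", "placeholder definition, revised wording")
vocab.snapshot("3.0")
```

Because each definition is immutable and a revision replaces rather than alters the earlier record, snapshot "2.0" in this sketch still holds the first wording after the term has been revised for "3.0".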

The following is the working agenda for the DFT Breakout session at P12:

  • An overview of DFT IG history and the Breakout Session agenda (updated 2-page handout as background)
  • Explaining the latest additions to the DFT vocabulary for core RDA areas, such as “publisher-facing policy”, “Data Packaging”, “FAIR-compliant”, “Repository certification” and the like.
  • A brief update on the TeD-T tool used to capture vocabularies, on progress in sharing vocabularies, and on the status of version 3.0. We will explain how we assign PIDs to every individual definition so that linking between different vocabularies is possible and definitions can be cited.
  • Group discussion of mapping to and leveraging other vocabularies, such as the NLM Data Thesaurus, and of recent discussions of an RDA core or multi-cores as part of roadmap discussions.
  • Solicitation from other RDA IGs and WGs about current/near term candidate vocabulary items. These may include:
    • MIG and related RDA work
    • Vocabulary service IG
    • L4RD IG
  • The session will conclude with a summary of the results and next steps.


Group chair serving as contact person: Gary Berg-Cross

Working meeting