Draft Notes from the DFT Session at P11

16 Apr 2018

The Notes from the DFT Session at P11 can be viewed on Google Docs:

https://docs.google.com/document/d/1P_VE3AWgE2jukmEOPwSXxqMkTc0tPhKdgVaM...

 

 

Session page: https://rd-alliance.org/ig-data-foundations-and-terminology-rda-11th-plenary-meeting

 

Attendee List:

   

Berg-Cross

 

Goudie

Gary

 

Simon

Addison

Aaron

Van den Berghe

Steven

Ritz

Raphael

Zastrow

Tom

Weigel

Tobias

   

Wolf

Eike

Alfred

Debbie

Neher

Gunther

Kurakawa

Kei

PIGNOL

Cecile

Peukert

Hagen

Eva-Maria

Rieß

Enke

Harry

Thoben

Stella

Glaves

Helen

Sekiya

Takayuki


 

  1. Overview & Update (Gary Berg-Cross). A one-page handout was provided, so some basics and history were not covered in the session, although the objective of supporting discussion of new data concepts developed by RDA groups was noted.

 

Vocabulary Updates - We now have almost 300 terms in our term repository. We continue to add terms as groups conclude or develop materials that expand upon our core. A recent example is the expansion of the guiding principles for PID Kernel Information.

 

Some of the new terms reflect a request at the January 2018 chairs meeting for foundational concepts. These include:

  1. Data Lifecycle/Lifecycle is the sequence of processing that data undergoes, from its creation and documentation through its storage in a repository to its eventual disposal.

  2. Data Science is the scientific study of the generalizable extraction, organization and interpretation of information and knowledge from data. It works across all the steps of a data science lifecycle.

In addition, some effort was made to keep up with the dozen or so types of metadata mentioned by RDA groups. These ranged from Administrative Metadata (a type of metadata that provides information to help manage a resource, such as when and how data was created, its file type and other technical information, and who can access it) to System Metadata and Topical Metadata (see https://smw-rda.esc.rzg.mpg.de/index.php?search=metadata&title=Special%3ASearch&fulltext=1).

 

2.  Tool Update (Raphael Ritz and Thomas) – handling IDs for term concepts

 

3. DFT support for RDA self-analysis and working relation with DDRI

Gary Berg-Cross explained the RDA analysis, which uses definitions and the 6 Words describing groups. An example is provided below:

 

One analysis, which used all of the 6 word terms, showed a Data Organization View of DFT Terms. This is depicted below. However, there were no real semantics showing relations between terms or relations to groups. It would be good to be able to show this.

 

As part of further analysis, we might like to show interactions between groups (current, past, potential) and some idea of trends in these, such as which group is the lineage parent of a WG.

 

In discussions with RDA’s DDRI, it seemed that we could use their Research Graph with the DFT vocabularies and terms as part of group analysis. By expanding to other RDA and public data, more questions could be asked. These include:

Study the relation of:

(rd-alliance:members) -- (publications) -- (FOR:Field of Research)

(rd-alliance:members) -- (publications) -- (Grants)

(rd-alliance:members) -- (publications) -- [use Scholix] -- (data)

(rd-alliance:members) -- (rd-alliance:groups) -- (DFT:Vocabs) -- (FOR:Field of Research)

(rd-alliance:members) -- (org:affiliation) -- [:AugmentAPI] -- (Research Object)
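The relation chains above can be read as typed paths through a graph. The following is a minimal sketch of that idea in plain Python, not the Research Graph implementation or its API. The node identifiers, relation names, and sample edges are all hypothetical and exist only to illustrate how a chain such as members -- publications -- Field of Research could be matched.

```python
from collections import defaultdict

# Hypothetical toy data: (source, relation, target) edges.
# Node ids are written "type:name"; none of these come from real RDA records.
EDGES = [
    ("member:alice", "authored", "publication:p1"),
    ("member:bob", "authored", "publication:p2"),
    ("publication:p1", "classified_as", "for:information-systems"),
    ("publication:p2", "funded_by", "grant:g7"),
    ("member:alice", "belongs_to", "group:dft"),
    ("group:dft", "uses", "vocab:data-lifecycle"),
]


def node_type(node):
    """Return the type prefix of a node id, e.g. 'member' for 'member:alice'."""
    return node.split(":", 1)[0]


def build_adjacency(edges):
    """Map each source node to the list of nodes it points to."""
    adjacency = defaultdict(list)
    for source, _relation, target in edges:
        adjacency[source].append(target)
    return adjacency


def match_paths(edges, type_pattern):
    """Return every directed path whose node types follow type_pattern in order."""
    adjacency = build_adjacency(edges)
    nodes = {n for source, _rel, target in edges for n in (source, target)}
    # Start with one single-node path per node of the first requested type.
    paths = [[n] for n in sorted(nodes) if node_type(n) == type_pattern[0]]
    # Extend each path one hop at a time, keeping only hops of the wanted type.
    for wanted in type_pattern[1:]:
        paths = [
            path + [nxt]
            for path in paths
            for nxt in adjacency[path[-1]]
            if node_type(nxt) == wanted
        ]
    return paths


if __name__ == "__main__":
    for pattern in (["member", "publication", "for"],
                    ["member", "publication", "grant"],
                    ["member", "group", "vocab"]):
        print(pattern, match_paths(EDGES, pattern))
```

A production graph store would express the same chains in its own query language; the sketch only shows that each question reduces to following typed hops from members outward.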

 

Amir Aryani provided more information about research graphs.

 

Group discussion centered on possible abuses of such analysis. Issues were noted in light of German law as well as recent abuses of “open” data. It was agreed that unintended use was potentially a problem, even though some of this information was routinely added as part of registration, and institutional affiliation allowed linkage to much more information. This may be an issue for the community to discuss, and indeed RDA might usefully help strike a balance between data openness and protection.

 

4. A topic that there was not sufficient time to explore: should RDA and DFT develop a Common Data Vocabulary Registry?

 

This makes sense because we have liaison relations with other data vocabulary efforts, and there might be interest in developing a common Registry for these data vocabularies. It is also possible that the RDA VSIG might provide some services to support such a registry. RDA groups may also provide some advice on standards and infrastructure for the effort.

 

The registry should support concept-term search as well as browsing. It should also allow machine-to-machine interfacing from the Registry to individual vocabulary tools.
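As a rough illustration of these requirements, the sketch below models a registry entry and the two access modes (search and browse), plus a JSON export as one possible machine-readable hand-off. It is an assumption-laden sketch, not a proposed design: the class names, the `source_url` field, the example.org addresses, and the "OtherVocab" entry are hypothetical, and the DFT definitions are abbreviated from the notes above.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class TermEntry:
    vocabulary: str   # name of the vocabulary the term comes from
    term: str
    definition: str
    source_url: str   # hypothetical link back to the owning vocabulary tool


class VocabularyRegistry:
    """In-memory registry supporting term search, browsing, and JSON export."""

    def __init__(self):
        self._entries = []

    def register(self, entry):
        self._entries.append(entry)

    def search(self, text):
        """Case-insensitive substring match on term names and definitions."""
        needle = text.casefold()
        return [
            e for e in self._entries
            if needle in e.term.casefold() or needle in e.definition.casefold()
        ]

    def browse(self, vocabulary):
        """List the terms of one vocabulary in alphabetical order."""
        return sorted(e.term for e in self._entries if e.vocabulary == vocabulary)

    def export_json(self):
        """Serialize all entries so that another tool could consume them."""
        return json.dumps([asdict(e) for e in self._entries])


registry = VocabularyRegistry()
registry.register(TermEntry(
    "DFT", "Data Lifecycle",
    "The sequence of processing that data undergoes from creation to eventual disposal.",
    "https://example.org/dft/data-lifecycle"))
registry.register(TermEntry(
    "DFT", "Data Science",
    "The scientific study of the extraction and interpretation of knowledge from data.",
    "https://example.org/dft/data-science"))
registry.register(TermEntry(
    "DFT", "Administrative Metadata",
    "Metadata that provides information to help manage a resource.",
    "https://example.org/dft/administrative-metadata"))
registry.register(TermEntry(
    "OtherVocab", "Dataset",
    "Placeholder definition for a term from a second, hypothetical vocabulary.",
    "https://example.org/other/dataset"))

if __name__ == "__main__":
    print([e.term for e in registry.search("metadata")])
    print(registry.browse("DFT"))
```

Keeping a link back to the owning vocabulary tool in each entry is one way the registry could stay a thin index rather than a copy of each vocabulary.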

 

Keeping track of vocabulary revisions may be a challenge, although some vocabulary services may be available for this.

 

Notes by Gary Berg-Cross

 
