Data Type Model and Registry - Data Type Registries (DTR) WG Recommendations
By Larry Lannom
Data Type Registries Working Group
Larry Lannom - Corporation for National Research Initiatives, Virginia USA
Daan Broeder - Max Planck Institute for Psycholinguistics, Netherlands
Recommendation Title: Data Type Model and Registry
Authors: Larry Lannom; Daan Broeder; Giridhar Manepalli
Impact: Ensures data producers classify their data sets according to standard data types, allowing data users to automatically identify instruments to process and visualise the data
Recommendation package DOI: doi.org/10.15497/A5BCD108-ECC4-41BE-91A7-20112FF77458
Citation: Larry Lannom; Daan Broeder; Giridhar Manepalli (2015): Data Type Registries working group output. DOI:10.15497/A5BCD108-ECC4-41BE-91A7-20112FF77458
The RDA Data Type Registries (DTR) Working Group (WG) was approved at the first RDA Plenary (March 2013, Gothenburg, Sweden). The basic goal of this group was to aid data sharing efforts through improved data typing, specifically to make clear the details and assumptions buried in other people’s data. This was seen primarily as a problem of defining a data model appropriate to a wide potential collection of data types, prototyping that model in a registry, and developing a federation strategy across multiple registry instances, all following an analysis of use cases and related efforts. Larry Lannom of CNRI and Daan Broeder of MPI took on the co-chair tasks.
The WG attracted considerable interest, both at the conceptual level and in the details of the prototype, which the co-chairs took as confirmation of the relevance of the issue. The prototype was successfully deployed and a number of use cases were implemented, allowing us to gain experience with DTR issues and to discuss the community’s reactions and comments. The scope of the issues involved, however, proved too broad for a single WG, especially with respect to community-specific typing needs, and a follow-on WG, provisionally named Data Typing, will therefore be proposed. The follow-on WG will primarily try to identify the data model that will allow people to specify and represent data types from select communities. In the end, the outcomes of the DTR WG can be summarized as:
Confirmation that detailed and precise data typing is a key consideration in data sharing and reuse, and that a federated registry system for such types is highly desirable and needs to accommodate each community’s own requirements
Deployment of a prototype registry implementing one potential data model, against which various use cases can be tested
Involvement of multiple ongoing scientific data management efforts, across a variety of domains, in actively planning for and testing the use of data types and associated registries in their data management efforts
Integration with one additional RDA WG (Persistent Identifier Types) and at least one Interest Group (RDA/CODATA Materials Data, Infrastructure & Interoperability IG)
Development of a set of questions that require further consideration before a detailed recommendation on data typing can be issued.
Finally, we believe that the DTR WG served as an excellent example of the benefits that RDA can and will bring to solving the problems of data sharing, by bringing together what would otherwise be disparate domain-specific groups to focus on common problems at the data level as opposed to the domain level. The remainder of this report will provide details on the outcomes summarized above.
The output is now available for public comment; please have a look.