RDA Plenary 3: Data Type Registries WG
Dublin was my first time at an RDA Plenary Meeting.
I was not sure what to expect. I went to the conference curious about a subject that I've dealt with only marginally in my work, but that I think is becoming more and more crucial for many aspects of research, in many different fields. I have to say that one thing that really struck me, right from the very beginning of the conference, was the passion and participation. Everybody who spoke seemed to firmly believe in this project, and the multidisciplinarity of the place was just amazing. At your typical conference, you don't see people from the humanities and the sciences all together: engineers and biologists, computer scientists and law students, and so on.
Seeing such a large and varied crowd was very inspiring to me. I believe in the importance of keeping an eye on what happens outside your usual field, because it enriches you as a person and thus as a professional too, and it can give you new perspectives by exposing you to new approaches. And it's always good to be aware of what's going on outside your lab.
Besides, the organization of the conference as a whole was just great.
I was assigned to work with the Data Type Registries WG, and it was a really nice experience, especially because in this group, which has been very active, you could already see many ideas taking shape as results.
Dr. Larry Lannom and Dr. Daan Broeder co-chaired the meeting. Dr. Lannom gave a presentation summing up the work done so far and what to do next. The audience was very active too, offering proposals and showing interest in the subject, and discussing use cases and related efforts. We also had a contribution from Dr. David Giaretta of the Alliance for Permanent Access, who presented some slides about the work his organization is doing.
This WG has already been doing some tangible work. The idea revolves around the concept of a data type, that is, the characterization of data structure at multiple levels of granularity, from individual data points up to and including large data sets. The aim is to build a common data model and a common way of expressing types: if types are standardized, it becomes easy to add them to registries, which gives us standard ways to discover and handle data. That in turn would support interoperability and, as an additional step, make it possible to offer a common API for machine consumption.
For example, we can imagine users who have some datasets, and tools that easily discover the standardized types associated with each dataset, so that the user knows how to handle the data. We can also imagine services that process the data directly according to its type.
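To make the idea of type-driven discovery and processing a bit more concrete, here is a minimal sketch in Python. Everything in it is my own assumption for illustration: the type identifier `example/temperature-series`, the two dictionaries, and the `describe` and `process` functions are hypothetical and are not part of the WG prototype or of any real API.

```python
from typing import Callable, Dict, List

# Hypothetical registry: type identifier -> human-readable description.
TYPE_REGISTRY: Dict[str, str] = {
    "example/temperature-series": "Time series of temperature readings in Celsius",
}

# Hypothetical services, selected by the type identifier of the dataset.
HANDLERS: Dict[str, Callable[[List[float]], float]] = {
    "example/temperature-series": lambda values: sum(values) / len(values),
}


def describe(type_id: str) -> str:
    """Tell the user what a type identifier means, if it is registered."""
    return TYPE_REGISTRY.get(type_id, "unknown type")


def process(dataset: dict) -> float:
    """Pick a service based on the dataset's declared type and run it."""
    handler = HANDLERS.get(dataset["type_id"])
    if handler is None:
        raise LookupError(f"no service registered for {dataset['type_id']}")
    return handler(dataset["values"])


# A dataset that carries its type identifier alongside its values.
dataset = {"type_id": "example/temperature-series", "values": [10.0, 20.0, 30.0]}
print(describe(dataset["type_id"]))
print(process(dataset))
```

The point of the sketch is only that the dataset carries an identifier, and both humans and services resolve that identifier instead of guessing the structure of the data.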
These types can also find application in certification and access control (that is: for this type, these rules apply), or in data acquisition and experiments. The possibilities offered by such an approach also raise some related sets of problems that will need to be discussed further (for example about metadata and about replication of information), while keeping in mind the use cases and the needs of those who work with data.
During the meeting, a prototype for data type registries was shown: each type should have an ID, a human-readable description, provenance, properties, etc. There can be some "primitive types" that can be used to define new types.
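As a rough illustration of what such a type record could look like, here is a small Python sketch. It is not the prototype shown at the meeting: the class names `DataType` and `TypeRegistry`, the field names, and the `example/...` identifiers are all assumptions of mine, chosen only to mirror the ID, description, provenance and properties listed above, and to show primitive types being combined into a new type.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class DataType:
    """Hypothetical type record: ID, description, provenance, properties."""
    type_id: str
    description: str
    provenance: str
    # Property name -> the type of that property. Empty for primitive types.
    properties: Dict[str, "DataType"] = field(default_factory=dict)

    @property
    def is_primitive(self) -> bool:
        # In this sketch, a type with no properties counts as primitive.
        return not self.properties


class TypeRegistry:
    """Hypothetical in-memory registry, keyed by type ID."""

    def __init__(self) -> None:
        self._types: Dict[str, DataType] = {}

    def register(self, data_type: DataType) -> DataType:
        if data_type.type_id in self._types:
            raise ValueError(f"type already registered: {data_type.type_id}")
        self._types[data_type.type_id] = data_type
        return data_type

    def lookup(self, type_id: str) -> DataType:
        return self._types[type_id]


registry = TypeRegistry()

# Two primitive types (made-up identifiers).
number = registry.register(
    DataType("example/number", "A floating-point number", "defined for this sketch"))
text = registry.register(
    DataType("example/text", "A Unicode string", "defined for this sketch"))

# A new type defined in terms of the primitive ones.
measurement = registry.register(
    DataType("example/measurement", "A numeric value with a unit",
             "defined for this sketch", {"value": number, "unit": text}))
```

A real registry would of course need persistent identifiers and a shared data model; the sketch only shows how primitive types can serve as building blocks.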
Now the challenge is to go from the prototype to real usage and to evolve the data model.
I'm looking forward to the Fourth Plenary Meeting. I really hope that I will be able to take part, and I am curious to see how the many ideas I've seen over these days will develop in the coming months (and years).
Let's keep up the good work!