Collections call VC 2016/07/26
Attending: Bridget, Tom, Frederik, Tobias, Maggie, Ana, Dimitris
Action items for the group prior to the next meeting:
- Bridget to continue working on the API, hopefully to have an updated
version by the end of next week
- Tobias wants to take a look at the API with a view towards the latest
definitions
- Frederik hopes to have a vertical prototype ready for us to look at
- Margareta is working on a use case description that she will circulate
Agenda items for next meeting (Aug 09, 13:00 UTC):
- Tobias wants to circle back to the definition discussion
- We can look at either the updated API or the vertical prototype, or both
Discussion notes - focus topic DiSSCo:
- The main thing the group is hoping for from the Collections WG is less
about the API and more about best practices/lessons learned for managing
digital collections and enabling interoperability
-- but interested in following the API development to see where it leads
- They have needs relevant to several RDA groups in addition to
Collections, including PID Types, DTR, Provenance
-- The most likely way to show value to their community would be a
combination of these
-- We will try to get together at P8 to discuss if/how we could
prototype something
- Biggest challenge in this community: digitisation and the accounting
of all objects - have to learn how many objects there are and where they are
-- There are two kinds of collections: there is always a collection of
physical objects, and then there are digital objects derived from the
physical ones
-- The physical objects are naturally preserved for eternity and not
changed. The digital objects are subject to all kinds of changes as part
of curation, fixing errors, and also deriving objects through analysis.
- Buy-in from users/contributors (particularly for digitisation,
metadata provisioning) is required to achieve a coherent infrastructure
- Added-value services are key, for example a citation service for
giving better credit to all actors in the chain
-- Citation service is a good example for a first service
- There are several challenges to address in the community and for
building the infrastructure:
-- Quality control: This has been discussed for many years, but not
implemented in a satisfactory way yet. Metadata get harvested into a
common catalog and then there are quality/sanity checks, but there is no
good way to feed the results back to the original repositories
-- Provenance challenge: trying to trace digital objects through
curation or analysis actions and being able to make this transparent to
users
-- Identifiers for all the samples: This is not a challenge of
determining what identifier solution to use (like IGSN), but a challenge
of the diverse policies/procedures at the institutions. The policy
harmonization challenge is that it is not possible to have the same
procedures at all facilities because there are too many institutional
differences
- When looking at the scope of the collection API, it will be important
to talk about the granularity of the fields exposed through it. A
potentially big benefit would be to create semantic links between data
types - based on some of the properties that the data types have. The
goal would be to harvest data across institutions and also provide
interoperability to external communities - to link derived data products
with those from completely different domains.
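
As a rough illustration of linking data types through shared properties,
a minimal sketch follows. It was not discussed on the call and is not part
of the Collections API; the type names, property names, and the
shared_property_links helper are all hypothetical placeholders.

```python
from itertools import combinations

# Hypothetical data types and the properties each one exposes.
# None of these names come from the Collections API; they are placeholders.
DATA_TYPES = {
    "specimen_image": {"specimen_id", "institution", "capture_date"},
    "dna_sequence": {"specimen_id", "institution", "sequencing_method"},
    "climate_record": {"location", "capture_date"},
}


def shared_property_links(data_types):
    """Map each pair of data types to the set of properties they share.

    Pairs with no property in common are left out.
    """
    links = {}
    # Sort by type name so the pair keys are produced in a stable order.
    for (name_a, props_a), (name_b, props_b) in combinations(
        sorted(data_types.items()), 2
    ):
        shared = props_a & props_b
        if shared:
            links[(name_a, name_b)] = shared
    return links


links = shared_property_links(DATA_TYPES)
```

Here a shared property such as capture_date would be the hook for linking
a derived product to data from another domain; a real design would also
need agreed property definitions (for example via a type registry) rather
than matching on names alone.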
--
Dr. Tobias Weigel
Abteilung Datenmanagement
Deutsches Klimarechenzentrum GmbH (DKRZ)
Bundesstraße 45 a • 20146 Hamburg • Germany
Phone: +49 40 460094-104
URL: http://www.dkrz.de
ORCID: orcid.org/0000-0002-4040-0215