Recent Activity: Research Data Collections WG

08 Aug 2016

Collection definitions and the mapping function

After looking at the possible alternatives and thinking about what they
would mean in practice, I have now made some decisions on how to clarify
the still ambiguous points in the collection definitions, most
particularly the issue that Ulrich raised on the mapping function.
I have uploaded a new document version and a new diagram to the workspace:

27 Jul 2016

Minutes from call 2016-07-26 - next meeting on August 09

Collections call VC 2016/07/26
Attending: Bridget, Tom, Frederik, Tobias, Maggie, Ana, Dimitris
Action items for group prior to next meeting:
- Bridget to continue working on the API, hopefully to have an updated
version by the end of next week
- Tobias wants to take a look at the API with a view towards the latest
- Frederik hopes to have a vertical prototype ready for us to look at
- Magareta is working on a use case description that she will circulate
Agenda items for next meeting (Aug 09, 13:00 UTC):

25 Jul 2016

Reminder/Info for tomorrow's call

Hi all,
Just a reminder that we have a call tomorrow at 13:00 GMT (9:00 Eastern,
15:00 Central European). Call-in info is as usual here:
We decided to extend the call to 90 minutes as during the first portion
we will be hearing from Dimitris Koureas about the use case from the

05 Jul 2016

Diagram on collections and PIT

Hello Frederik,
I've had a look at your diagram; two comments:
- As already noted, the object should not be part of the PID record, but
it should be pointed to. This may simply be described by replacing
"Object" in the top box with "Object pointer" or similar.
- You list a "PIT" entry in the technical metadata section along "size"
as another example. I would see this as a special (and very useful)
interpretation of a PIT: This PIT describes the object as a whole.
During the PIT WG, we interpreted a PIT more widely, and particularly

05 Jul 2016

Definitions, mapping function discussion

Hello Ulrich,
I have gone through your revised definitions and I think they are an
improvement over what I originally wrote. I agree that your point about
the rather indirect recursion is right; to build collections of
collections is a core feature and should not be obfuscated in the
I've made some small changes to the document (uploaded to the shared
folder and attached). I got a bit confused around the collection
metadata and membership metadata. I've now further clarified that the
latter is part of the former.
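The containment described above (membership metadata as a part of collection metadata) could be sketched roughly as follows. This is only an illustration; the names `CollectionMetadata`, `MembershipMetadata`, `description`, `by_member` and the example values are assumptions and are not taken from the document.

```python
from dataclasses import dataclass, field


@dataclass
class MembershipMetadata:
    # Hypothetical per-member information, keyed by member identifier.
    by_member: dict = field(default_factory=dict)


@dataclass
class CollectionMetadata:
    # Hypothetical collection-level field plus the nested membership metadata.
    description: str = ""
    membership: MembershipMetadata = field(default_factory=MembershipMetadata)


meta = CollectionMetadata(description="example collection")
# Membership metadata is reached through the collection metadata it belongs to.
meta.membership.by_member["item-1"] = {"role": "primary"}
```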

28 Jun 2016

Updated specification document

Dear all,
As promised, I updated our specifications document so that it now
contains the "collections" graphic from Tobias in version 3 and the
definitions from Ulrich's mail.
The up-to-date version of the document is:
Dr. Thomas Zastrow
Max Planck Computing and Data Facility (MPCDF)

15 Jun 2016

Minutes from call 2016-06-15 - next meeting on June 28

!!! Next meeting: June 28, 15:00 CEST / 09:00 EDT !!!
Minutes from call 2016-06-15:
Attending: Thomas, Frederik, Bridget, Javier, Maggie, Ulrich, Tobias
Most of the call was spent discussing the new definitions Tobias sent
over the list and the corresponding diagram uploaded to the workspace.
On the diagrams/definitions:
* Frederik suggests we could think of the combination of
+collection_state as the PID record, allowing us to
think about a collection record as a PID record in the PIT API terms

13 Jun 2016

Meeting & Agenda

Dear all,
Just a quick reminder that tomorrow is the second Tuesday of the month and we will have an online meeting at 3pm CEST / 9am EDT / 6 am PDT.
Our preliminary agenda is:
1. Feedback to Tobias’ definitions for Collections, State, Capabilities and Metadata
2. Discussion on interfacing with PIT API and DTR
3. Preparation for Dimitris’ Use Case presentation on 07/26 (if he can join the meeting)
Hope to see you tomorrow,

13 Jun 2016

Further definitions for collections

Dear all,
I've taken another shot at extending the definitions - in view of the
discussion about collection metadata and capabilities. Let me know what
you think - I want to use these to extend the current definitions in the
draft doc later on.
Every individual *collection* is a 2-tuple of an identifier and
collection state.
--> The identifier is part of a collection to allow users to define
multiple collections that are identical with regards to their membership
and general behaviour (operations, metadata etc.).
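A minimal sketch of this definition, in Python. The names `Collection`, `CollectionState` and `members` are assumptions for illustration only, and the state is reduced to membership. It shows why the identifier belongs to the tuple: two collections with equal state remain distinct.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CollectionState:
    # Hypothetical stand-in for the collection state; only membership is modelled.
    members: tuple = ()


@dataclass(frozen=True)
class Collection:
    # A collection as a 2-tuple of (identifier, collection state).
    identifier: str
    state: CollectionState


shared_state = CollectionState(members=("item-1", "item-2"))
a = Collection("coll-a", shared_state)
b = Collection("coll-b", shared_state)

same_state = a.state == b.state   # identical membership
same_collection = a == b          # still two different collections
```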

06 Jun 2016

Metadata for collections

Dear all,
Another point: which metadata do we want to add to a collection? Some
The easiest case: a collection has only its own identifier and no
other metadata. The disadvantage would be that a potential user already
needs to know about the collection. Displaying or exploring a
bunch of collections would be difficult.
We leave that topic to other APIs like the PIT API or DTR.
The collection API itself is able to store / handle metadata about the

01 Jun 2016

Minutes from yesterday's call

Dear all,
here are the merged minutes from yesterday's group call. The next call
will take place in 2 weeks at the usual timeslot: Tuesday June 14, 13:00
Best, Tobias
Attendees: Frederik, Ulrich, Christopher, Tobias
* Formal definitions - set theory:
o potentially use ADT as intermediate step between model and
o Look into Haskell docs, use that to get to a small core def for
sorted/unsorted, multimembership/unique
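One possible reading of a "small core def" for these variants is two independent flags on a single member list, sketched here in Python. The class `MemberList`, its fields and the `add` method are hypothetical, and "sorted" is interpreted here as keeping members in sort order, which is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class MemberList:
    # Hypothetical core definition: two independent flags describe the variant.
    ordered: bool = False   # sorted vs. unsorted (assumed meaning: keep sort order)
    unique: bool = True     # unique vs. multi-membership
    items: list = field(default_factory=list)

    def add(self, item):
        # Reject duplicates only when membership must be unique.
        if self.unique and item in self.items:
            return False
        self.items.append(item)
        if self.ordered:
            self.items.sort()
        return True


# Unsorted, multi-membership: insertion order and duplicates are kept.
bag = MemberList(ordered=False, unique=False)
for x in ("b", "a", "b"):
    bag.add(x)

# Sorted, unique: duplicates are rejected and members stay in sort order.
sorted_set = MemberList(ordered=True, unique=True)
for x in ("b", "a", "b"):
    sorted_set.add(x)
```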

31 May 2016

RDA Collection WG call details

Dear all,
apologies for not sending a reminder earlier - here are the details for
our regular call, which will start in just a few minutes. We will try to
use gotomeeting again:
Best, Tobias
Tobias Weigel
Abteilung Datenmanagement
Deutsches Klimarechenzentrum GmbH (DKRZ)
Bundesstraße 45 a • 20146 Hamburg • Germany
Phone: +49 40 460094-104
Email: ***@***.***

13 May 2016

New version of the specification

Dear all,
Thanks Ulrich for pointing out the duplicated graphic of Steam Concept A; I
have corrected this now.
And I added a new chapter ... we haven't talked about authentication and
authorization so far. Since this issue is always problematic and a
source of trouble, we may ignore it when implementing the prototype.
Nevertheless, the specification itself should say at least a few words
about A&A. So I wrote down a simple A&A model for our collections;
please feel free to tear it apart.