Minutes from call 2016-06-15 - next meeting on June 28

15 Jun 2016

!!! Next meeting: June 28, 15:00 CEST / 09:00 EDT !!!
Minutes from call 2016-06-15:
Attending: Thomas, Frederik, Bridget, Javier, Maggie, Ulrich, Tobias
Most of the call was spent discussing the new definitions Tobias sent
over the list and the corresponding diagram uploaded to the workspace.
On the diagrams/definitions:
* Frederik suggests we could think of the combination of
+collection_state as the PID record, allowing us to
think about a collection record as a PID record in PIT API terms
o Tom: disagrees - a collection record can't be itself a PID
record because it has no content (but again, disagreement on
whether collection contents, in form of a list e.g., are or
aren't content)
* Frederik: started using the term capabilities as a property of a
service, not the collection
o Bridget: A collection needs to advertise its capabilities to the
service that offers the collection API
o so that the API correctly communicates the actions possible on
the collection to potential clients
* Tom: diagram should better refer to definitions in the wiki - put
collection as such on top of diagram, reference PID record
o several versions of diagram may be useful to illustrate the
various perspectives to look at the concepts
o also add our own definition of, e.g., PID record as opposed to
what's in the DFT wiki
o Tom: made a quick new diagram - it shows the
combination of a PID record and collection state as the
Collection object
+ not all agree on this point; Bridget thinks it's too soon
to say whether/how a Collection is represented as a PID Record
+ Tobias: diagram still inconsistent - PID record in this new
diagram as distinct element, but will contain parts of the
bottom elements in actual implementations
+ Frederik will produce an alternate diagram as well, to
demonstrate his perspective
o Bridget suggests that Tobias' diagram and definitions give us a
good abstract starting point for modeling a collection, in the
context of the Collections API, and that a good next step would
be to see if we can apply this model to existing collection
implementations, such as the LDP (Fedora 4) and others
* Possible next step: look at the proposed collection implementations,
match them against the definitions/diagram and see how good the
match is and how the API would work
On capabilities discussion:
* Frederik: distinguish technical from descriptive metadata
o What part should be part of the API / the datatypes the API returns?
o there are 2 types of capabilities -- capabilities of the service
itself (e.g. does it provide PID minting or not) and
capabilities of the collection in the context of the service
(collections API)
+ Bridget thinks Tobias' diagram accurately captures the
representation of the latter
o technical metadata indicates the actions that are possible on
collections
+ tells an engineer how to describe to a client the way to
interact
o descriptive metadata describes to the domain expert using the
collections what its usefulness is
* The collection service may also have descriptive metadata (who is
providing it etc.)
o the service itself may also have a PID record...
* Javier: Suggestion to include at least a small use case description
as an attachment to the diagram that illustrates what this may look like
in practice; also, things like the distinction between technical and
descriptive metadata
o Tobias: agreed - hard to understand otherwise for an outsider
not involved in all of these discussions and looking at it from
a pragmatic viewpoint; use cases to illustrate our finer points
o Probably have more than one example - there are some nuances and
disagreement about the implementation
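A minimal sketch of the two kinds of capabilities discussed above, assuming a hypothetical in-memory `CollectionService`; the class, its method names and the action names are illustrative assumptions, not part of any agreed API. It only shows service-level capabilities (e.g. PID minting) kept apart from the capabilities a collection advertises to the service:

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class CollectionService:
    # Capabilities of the service itself, e.g. whether it can mint PIDs.
    service_capabilities: Set[str] = field(default_factory=set)
    # Capabilities each collection has advertised to this service (hypothetical store).
    advertised: Dict[str, Set[str]] = field(default_factory=dict)

    def advertise(self, collection_id: str, actions: Set[str]) -> None:
        """A collection tells the service which actions are possible on it."""
        self.advertised[collection_id] = set(actions)

    def actions_for(self, collection_id: str) -> Set[str]:
        """What the API would report to a client for one collection."""
        return set(self.advertised.get(collection_id, set()))


service = CollectionService(service_capabilities={"pid_minting"})
service.advertise("col:example", {"list_items", "add_item"})
```

Keeping the two sets separate means a client can ask what the service offers without that answer being mixed up with what is possible on one particular collection.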
Actions for July meeting:
* Bridget will contact Dimitris and confirm that we'd like him to
present his use case at the July 26 meeting.
* Bridget will also update the wiki page with a permanent link to the
goto meeting info
* All: Upload variations of definitions/diagram to be discussed again
* Tobias: Continue putting things in the document that seem consistent
and agreed at this point
--
Tobias Weigel
Abteilung Datenmanagement
Deutsches Klimarechenzentrum GmbH (DKRZ)
Bundesstraße 45 a • 20146 Hamburg • Germany
URL: http://www.dkrz.de
ORCID: orcid.org/0000-0002-4040-0215

    Author: Ulrich Schwardmann

    Date: 16 Jun, 2016

    Dear Tobias, Bridget, all,
    thanks a lot for the fast preparation of the minutes.
    Tobias' diagram really is a good thing to start with and discuss the
    concepts and I have provided another view of it and uploaded it to the
    data-share
    (see collection-definitions-v03.pptx on
    https://datashare.rzg.mpg.de/index.php/s/YWUxkSd7zqplCDM).
    What I was missing (and I think Frederik as well?) is the immediate
    recursive nature of a collection in the diagram. I first had to think
    about what the problem here is, and it seems that in Tobias' diagram
    recursion is a bit hidden and difficult to see, and will also be
    difficult to implement.
    This is because the membership there is itself a structure, consisting
    of three components, of which only one, the set of identifiers, bears
    the recursion information. This would mean that an implementation of
    this conceptual view needs to cross another layer, the membership,
    because it is a structure, and needs to do so in each recursive step.
    I would very much prefer immediate access to the recursion data,
    where membership as a concept is nothing but a set of identifiers.
    The membership metadata is then another kind of metadata of the
    collection state of the collection itself and/or of the collection
    state of the subcollections referred to in the set of item identifiers.
    Whether it can be here or there or on both levels seems to depend on
    the use case. This question is strongly related to the parent/child
    discussion we had before.
    The mapping function has a view on the whole set of item identifiers and
    therefore belongs to the collection metadata, and not to the subcollections.
    And, for completeness, the recursiveness is explicitly described in the
    diagram by indicating that each identifier, if it is a collection
    identifier, comes with a collection state.
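
    A minimal sketch of what immediate access to the recursion data could
    look like, assuming membership is a plain set of identifiers. The
    `registry` dictionary, the identifier strings and the `leaves` helper
    are hypothetical and only serve the illustration:

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class CollectionState:
    # Membership is nothing but a set of item identifiers.
    membership: Set[str] = field(default_factory=set)
    # Membership metadata, mapping function etc. sit beside it as metadata.
    metadata: Dict[str, object] = field(default_factory=dict)


# Hypothetical lookup: collection identifier -> collection state.
registry: Dict[str, CollectionState] = {
    "col:root": CollectionState({"col:a", "obj:1"}),
    "col:a": CollectionState({"obj:2", "obj:3"}),
}


def leaves(identifier: str, seen: Optional[Set[str]] = None) -> Set[str]:
    """Collect all non-collection items reachable from an identifier."""
    seen = set() if seen is None else seen
    if identifier in seen:  # guard against cyclic membership
        return set()
    seen.add(identifier)
    state = registry.get(identifier)
    if state is None:  # no collection state: a plain digital object
        return {identifier}
    result: Set[str] = set()
    for item in state.membership:  # one step per level, no extra layer to cross
        result |= leaves(item, seen)
    return result
```

    Each recursive step reads the set of identifiers directly; with
    membership as a structure of its own, every step would first have to
    unpack that structure.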

    Author: Tobias Weigel

    Date: 17 Jun, 2016

    Hello Ulrich,
    thank you for uploading your diagram. I've had a look at it and at what you
    wrote, and indeed, the recursive nature was a bit hidden and is more
    visible in your diagram. I think it is a question of what to emphasize
    when deciding whether - as in my definitions and diagram - one defines
    the "membership" as a unit with parts or dissolves it by putting the
    actual membership, mapping function and metadata on the same level as
    capabilities and description. In my original definitions, I did this to
    emphasize the importance of capabilities; a traditional viewpoint would
    rather emphasize the membership, but I see the essential value of this
    WG as going beyond that by talking about CRUD actions.
    Anyway: I only did the diagram to better explain my definitions, so
    these are likely more important. Can you provide such definitions as
    well? I guess you only have to change some parts in what I defined. I
    also think that, for implementation, one can merge these concepts as
    needed, so I would not worry too much about the cost of additional
    conceptual layers.
    Best, Tobias

    Author: Ulrich Schwardmann

    Date: 27 Jun, 2016

    Hello Tobias, all,
    my suggestion for a refinement of our definitions would be:
    A *collection* is a 2-tuple of an identifier and the collection state.
    The *collection state* is a 2-tuple of collection membership and
    collection metadata.
    The *collection membership* is a finite set of collection item identifiers.
    A *collection item identifier* points either to a collection or to some
    digital object, called a collection leaf, which is not a collection.
    The *collection metadata* consists of the collection description and of
    procedural (or functional or technical) metadata, that is machine
    interpretable and used for automated processes on collections.
    Procedural (or functional/technical) metadata describes the Membership,
    Collection Capabilities and the Mapping Function.
    *Membership Metadata* is metadata of the collection structure itself.
    It describes well-defined collection properties.
    *Examples*: properties of the collection state such as its size,
    whether it is an ordered list or an unordered set, and whether it is
    mutable or fixed; also possible relations to other collections, such as
    parent collections, or properties common to all its collection items,
    such as mutability or belonging to one repository.
    *Collection Capabilities* comprise the full set of actions that the
    collection supports. Actions may or may not affect collection state.
    *Remark*: (1) An external agent may provide more actions than are in a
    collection's capabilities, e.g. more sophisticated composite actions or
    actions across multiple collections.
    (2) An agent submits a capability request to a collection to retrieve
    the action set.
    The *Mapping Function* is a function F mapping from the collection
    membership to item metadata elements and collection membership metadata.
    *Remark*: (1) This is not well defined: it is unclear whether the target
    of the function is the union of item metadata elements and collection
    membership metadata or the cross product or something else. TODO:
    Clarify what the use of the mapping function is.
    (2) The function F need not be injective: Multiple items can be related
    to the same metadata.
    (3) Example: the MIME types of the items can be given as a list or as
    the mapping of each item to its MIME type.
    The *Collection Description* is all the metadata that is not procedural.
    Among other things, it describes to the domain expert using the
    collection what its usefulness is.
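
    The definitions above could be written down as data types roughly as
    follows. This is only a sketch under assumptions: the class and field
    names, the example identifiers and the action names are hypothetical,
    and the mapping function is left as an optional callable because its
    target is not yet well defined:

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Set


@dataclass
class ProceduralMetadata:
    """Machine-interpretable metadata used for automated processes."""
    membership_metadata: Dict[str, object] = field(default_factory=dict)
    capabilities: Set[str] = field(default_factory=set)
    mapping_function: Optional[Callable[[str], object]] = None


@dataclass
class CollectionMetadata:
    """Collection description plus procedural metadata."""
    description: Dict[str, str] = field(default_factory=dict)
    procedural: ProceduralMetadata = field(default_factory=ProceduralMetadata)


@dataclass
class CollectionState:
    """2-tuple of collection membership and collection metadata."""
    membership: Set[str] = field(default_factory=set)  # item identifiers
    metadata: CollectionMetadata = field(default_factory=CollectionMetadata)


@dataclass
class Collection:
    """2-tuple of an identifier and the collection state."""
    identifier: str
    state: CollectionState = field(default_factory=CollectionState)


def capability_request(collection: Collection) -> Set[str]:
    """Remark (2): an agent asks a collection for its action set."""
    return set(collection.state.metadata.procedural.capabilities)


example = Collection(
    identifier="col:example",
    state=CollectionState(
        membership={"obj:1", "col:sub"},  # a leaf and a subcollection
        metadata=CollectionMetadata(
            description={"title": "Example collection"},
            procedural=ProceduralMetadata(
                membership_metadata={"size": 2, "ordered": False, "mutable": True},
                capabilities={"list_items", "add_item", "remove_item"},
            ),
        ),
    ),
)
```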
    Additional remarks from my side:
    The mapping function needs further refinement: currently it is not well
    defined. Perhaps it is sufficient to change it to: The *Mapping
    Function* is a function F mapping from the collection membership to the
    union of item metadata elements and collection membership metadata. But
    the use cases we have in mind could also point in another direction.
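
    To make the MIME type example from remark (3) concrete, here is a small
    sketch of a mapping function F that is not injective. The identifiers
    and the `mime_of` table are made up for the illustration, and the
    sketch only covers item metadata, i.e. one part of the proposed union:

```python
from typing import Dict, List, Set

# Hypothetical collection membership: a finite set of item identifiers.
membership: Set[str] = {"obj:1", "obj:2", "obj:3"}

# Hypothetical item metadata backing the mapping function.
mime_of: Dict[str, str] = {
    "obj:1": "text/csv",
    "obj:2": "text/csv",
    "obj:3": "image/png",
}


def F(item_id: str) -> str:
    """Map a member of the collection to one metadata element (its MIME type)."""
    if item_id not in membership:
        raise KeyError(f"{item_id} is not a member of the collection")
    return mime_of[item_id]


# The same information given as a list: the distinct MIME types in the collection.
mime_types: List[str] = sorted({F(item) for item in membership})
```

    Here obj:1 and obj:2 map to the same element, so F is not injective,
    which matches remark (2).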
    As I said before, my emphasis is on seeing the recursive structure
    immediately in the definition and on organizing the additional
    properties as metadata of different kinds at the same level.
    Actually, the current version also suggests using a different
    coloring in the diagram I provided, so that the grouping of procedural
    and descriptive metadata becomes more obvious.
    We can perhaps discuss all this tomorrow, or via email of course.
    --
    With kind regards
    Ulrich Schwardmann
    Gesellschaft fuer wissenschaftliche Datenverarbeitung mbH Goettingen (GWDG)
    Am Fassberg 11, D-37077 Goettingen, Germany
    URL: http://www.gwdg.de
