Re: [rda-datafabric-ig][rda-collection-wg] Some thoughts on "Data Aggregations" terminology & concepts

11 Apr 2016

Hi Tobias, Gary and others,
In principle, any function that generates (new) collections could be
used. For example, from a given collection one could build a new
collection by imposing restrictions, such as time constraints on when
the DOs it contains were generated. Or one can build a kind of power
collection, the collection of all subcollections.
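A rough sketch of these two rules, assuming a toy model in which a
collection is just a set of PIDs and a hypothetical `created` table
holds the generation time of each DO (neither is taken from the WG
definition):

```python
from datetime import datetime
from itertools import chain, combinations

# Hypothetical toy model: a collection is a set of PIDs, and `created`
# maps each PID to the generation time of the DO it identifies.
created = {
    "pid:a": datetime(2015, 3, 1),
    "pid:b": datetime(2016, 1, 15),
    "pid:c": datetime(2016, 4, 2),
}
collection = frozenset(created)


def restrict_by_time(coll, created, start, end):
    """New collection: members whose DO was generated in [start, end)."""
    return frozenset(p for p in coll if start <= created[p] < end)


def power_collection(coll):
    """New collection: all subcollections of coll (2**n of them)."""
    members = sorted(coll)
    subsets = chain.from_iterable(
        combinations(members, k) for k in range(len(members) + 1)
    )
    return {frozenset(s) for s in subsets}


from_2016 = restrict_by_time(
    collection, created, datetime(2016, 1, 1), datetime(2017, 1, 1)
)
all_subcollections = power_collection(collection)
```

Note that the power collection grows as 2**n, so in practice it would
probably be described by its rule rather than materialised.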
Particularly interesting generation rules come with the possibility of
following the links given in the collection, either via the PIDs in the
collection itself or via the additional pointers/links given in the
definition. For example, if one has a set of collections, each
consisting of, let's say, two PIDs pointing to other collections in this
set, then one can view this simply as such a set, but one can also build
the subcollections formed by the connected components of the graph with
PIDs as vertices and edges defined by the relation 'PID in a collection'.
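To make the graph view concrete, here is a minimal sketch. The
`collections` table and its PIDs are made up for illustration, and I
treat 'PID in a collection' as an undirected edge between the
collection's own PID and the member PID, which is my reading of the
relation rather than anything fixed by the definition:

```python
from collections import defaultdict, deque

# Hypothetical data: each collection (keyed by its own PID) holds two
# PIDs that point to other collections in the same set.
collections = {
    "c1": {"c2", "c3"},
    "c2": {"c1", "c3"},
    "c3": {"c1", "c2"},
    "c4": {"c5", "c6"},
    "c5": {"c4", "c6"},
    "c6": {"c4", "c5"},
}


def connected_components(collections):
    """Group PIDs by component of the 'PID in a collection' graph."""
    neighbours = defaultdict(set)
    for coll_pid, members in collections.items():
        neighbours[coll_pid]  # register empty collections as vertices too
        for pid in members:
            neighbours[coll_pid].add(pid)
            neighbours[pid].add(coll_pid)

    seen, components = set(), []
    for start in list(neighbours):
        if start in seen:
            continue
        seen.add(start)
        component, queue = set(), deque([start])
        while queue:  # breadth-first walk over one component
            pid = queue.popleft()
            component.add(pid)
            for nxt in neighbours[pid] - seen:
                seen.add(nxt)
                queue.append(nxt)
        components.append(frozenset(component))
    return components


components = connected_components(collections)
```

Each resulting component is then a new subcollection in the sense
described above.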
A real-world example would be 'references in publications': each
publication (collection) contains only a small number of references
(PIDs), but for a given publication there is a whole tree of all
publications that it relies on, and this tree is itself a new collection.
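As a sketch, this rule is just a traversal that keeps following
references. The `references` table and the publication PIDs are
invented for the example:

```python
# Hypothetical citation data: publication PID -> PIDs it references.
references = {
    "paper:A": ["paper:B", "paper:C"],
    "paper:B": ["paper:D"],
    "paper:C": ["paper:D"],
    "paper:D": [],
}


def relies_on(pid, references):
    """All publications reachable from pid by following references."""
    found, stack = set(), [pid]
    while stack:
        current = stack.pop()
        for ref in references.get(current, []):
            if ref not in found:  # visit each publication only once
                found.add(ref)
                stack.append(ref)
    return found


tree_of_a = relies_on("paper:A", references)
```

Only the references of publications actually reached are needed here,
so the rule can be evaluated by resolving PIDs one at a time.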
Even more interesting is the reverse generation rule: give me all
publications that rely on a given publication. It is a valid rule too,
but it is much harder to implement, because for each publication one
needs to know all publications relying on it, or else all publications
that exist.
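A small sketch of why the reverse direction is harder: the inverted
index below can only be built by scanning every known publication. The
`references` table is again hypothetical, and a real system would more
likely maintain such an index incrementally:

```python
from collections import defaultdict

# Hypothetical citation data: publication PID -> PIDs it references.
references = {
    "paper:A": ["paper:B", "paper:C"],
    "paper:B": ["paper:D"],
    "paper:C": ["paper:D"],
    "paper:D": [],
}


def invert(references):
    """Build 'cited by' from 'references'; needs ALL publications."""
    cited_by = defaultdict(set)
    for pub, refs in references.items():
        for ref in refs:
            cited_by[ref].add(pub)
    return cited_by


def relied_on_by(pid, cited_by):
    """All publications that directly or indirectly rely on pid."""
    found, stack = set(), [pid]
    while stack:
        current = stack.pop()
        for pub in cited_by.get(current, ()):
            if pub not in found:
                found.add(pub)
                stack.append(pub)
    return found


cited_by = invert(references)
dependents_of_d = relied_on_by("paper:D", cited_by)
```

Any publication missing from `references` silently drops out of the
result, which is exactly the completeness problem mentioned above.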
Similarly, new collections can be built from the additional pointers
that are possible for a collection according to the definition below. A
typical example of such a pointer is the previous version of a
collection: one can easily build the collection of all previous versions
of a collection by the rule of always following the previous-version
pointer.
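A minimal sketch of that last rule, assuming a hypothetical
`previous_version` table that maps each collection PID to the PID of
its predecessor (how the pointer is actually stored is not specified
here):

```python
# Hypothetical pointer table: collection PID -> PID of its previous
# version (None for the first version).
previous_version = {
    "coll:v3": "coll:v2",
    "coll:v2": "coll:v1",
    "coll:v1": None,
}


def all_previous_versions(pid, previous_version):
    """Follow the previous-version pointer until it runs out."""
    versions, seen = [], {pid}
    current = previous_version.get(pid)
    # The `seen` check guards against a (broken) cyclic version chain.
    while current is not None and current not in seen:
        versions.append(current)
        seen.add(current)
        current = previous_version.get(current)
    return versions


history = all_previous_versions("coll:v3", previous_version)
```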