Looking for stakeholders for a Fedora implementation of the RDA Collections API

21 Jul 2017

Dear list members,
We are approaching the close of our 18-month Research Data Alliance
working group effort to develop a general purpose API for working with
research data collections [1]. This work stemmed from a need
expressed by several communities across disciplines to leverage
aggregations of objects with particular focus on building such
aggregations through PIDs and providing identifiers for aggregation
objects. Up until now there has been no unified, machine-actionable
cross-community approach to building and managing such collections and
no common model for understanding them.
The RDA Research Data Collections Working Group has defined a general
model for a collection as a digital object which bears a unique
identifier and consists of a finite number of digital object identifiers
and metadata associated with each referenced identifier. We have also
developed a common abstract HTTP-based API [2] for data management of
collections, which we hope will facilitate data interoperability and
reuse by (1) making solutions for managing collections more sustainable
and widely available, thus (2) encouraging better data management
practices and (3) allowing data objects in collections to be shared and
reused across projects and domains.
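As an illustration only, the model described above can be sketched in a
few lines of Python. The class and method names (Member, Collection,
add, get, remove) and the example identifiers are assumptions made for
this sketch; they are not taken from the API specification [2], which
defines the actual HTTP operations.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Member:
    """One referenced digital object: its identifier plus per-item metadata."""
    pid: str
    metadata: Dict[str, str] = field(default_factory=dict)


@dataclass
class Collection:
    """A collection is itself a digital object with its own identifier."""
    pid: str
    members: List[Member] = field(default_factory=list)

    def add(self, member: Member) -> None:
        # Reject duplicates so each identifier appears at most once.
        if any(m.pid == member.pid for m in self.members):
            raise ValueError(f"{member.pid} is already a member")
        self.members.append(member)

    def get(self, pid: str) -> Member:
        # Look a member up by its identifier.
        for m in self.members:
            if m.pid == pid:
                return m
        raise KeyError(pid)

    def remove(self, pid: str) -> None:
        # Raises KeyError if the identifier is not in the collection.
        self.members.remove(self.get(pid))


# Hypothetical identifiers, used only to exercise the sketch.
col = Collection(pid="hdl:example/collection-1")
col.add(Member("hdl:example/item-1", {"role": "primary"}))
col.add(Member("hdl:example/item-2"))
col.remove("hdl:example/item-2")
```

The sketch keeps everything in memory; a real service would expose
equivalent create, read, update and delete operations over HTTP.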
It was not the intent of the working group to propose an alternative to
existing, well-established standards for describing and archiving
collections but rather to propose an API and implementation for
creation, consumption, distribution and citation of collections and
their items that could serve as a unifying layer /on top of/ the
existing models and which could enable producers and consumers of
collections to operate on data items managed in diverse collection
models and repositories. Existing solutions focus on describing
collections and their semantics with metadata, but do not offer a full
set of generic, machine-actionable CRUD operations on them. (An
in-progress version of our final documentation on our group effort can
be found at [3].)
While a number of individual projects have already begun implementing
and using this API, in order to achieve our goals of enabling widespread
data sharing, we believe it must be implemented by the infrastructures
researchers are already using for managing their data and collections.
Repositories like Fedora are obvious candidates for this, and we
believe the work the API-X community [4] has done to implement an API
Framework for adding services to Fedora should provide the hooks needed
to fairly easily implement the RDA Collections API as an added-value
service. The implementation we have done for the Perseids project [5]
has already confirmed that it is possible to use the API to manage
collections of data which are expressed according to the Linked Data
Protocol model used by Fedora.
This posting is a call for interest and possible stakeholders in such an
effort across both the RDA and Fedora communities. Please respond and
let us know if you are interested!
Sincerely,
Bridget Almas
Frederik Baumgardt
Tobias Weigel
Thomas Zastrow
co-chairs of the RDA Collections Working Group
[1] https://www.rd-alliance.org/groups/pid-collections-wg.html
[2] http://rdacollectionswg.github.io/apidocs/#
[3] https://github.com/RDACollectionsWG/specification/blob/master/readme.md
[4]
https://wiki.duraspace.org/display/FF/Design+-+API+Extension+Architecture
[5] http://collections.perseids.org and
https://github.com/RDACollectionsWG/perseids-manifold