Policies demystified: The RDA Practical Policies WG
We are happy to announce that at the end of August 2013 the RDA Practical Policies Working Group was endorsed by the RDA Council. The group's main goals are to identify the most important policies used by data centers that manage research data collections and to provide these policies as a “starter kit”. At the second RDA plenary meeting in Washington, DC, the group discussed the policies that have been submitted and the next steps in creating generic versions of them. A summary of the results and all slides are available in the RDA file depot.
A policy is defined as an assertion or assurance that is enforced about a collection or a dataset. In his presentation, Reagan Moore provided a concept graph for policy-based data management. Starting with the purpose of a data collection, he defined the properties that lead to the policy itself. The policy controls the execution of a procedure, which is invoked either manually or periodically by an automated process. The procedure can be implemented as a workflow that chains together basic functions. As an important side effect, the attributes of the objects within the collection are updated to record the execution results of the procedure. For example, to ensure the integrity of a collection, an integrity policy might start a procedure every month that calculates the MD5 checksum of the files included in a digital data object. The results can be stored as status flags in the attributes of the objects.
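The integrity example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the working group's implementation: the function names `md5_checksum` and `run_integrity_procedure`, the attribute keys `md5` and `integrity_status`, and the representation of a collection as a dictionary of file paths are all hypothetical. The monthly trigger is left out; a scheduler such as cron could invoke the procedure.

```python
import hashlib
from pathlib import Path


def md5_checksum(path, chunk_size=65536):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def run_integrity_procedure(collection):
    """Check every object in `collection` and record the result.

    `collection` maps a file path to an attribute dict. The expected
    checksum is read from the hypothetical key "md5", and the outcome
    is written back as a status flag under "integrity_status".
    """
    for path, attributes in collection.items():
        if not Path(path).is_file():
            attributes["integrity_status"] = "missing"
            continue
        actual = md5_checksum(path)
        if actual == attributes["md5"]:
            attributes["integrity_status"] = "ok"
        else:
            attributes["integrity_status"] = "corrupt"
    return collection
```

Here the policy is the assertion that every stored checksum still matches, the procedure is `run_integrity_procedure`, and the status flag written to each attribute dict is the side effect that lets the collection be monitored afterwards.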
A collection of policies for the categories of data lifecycle management, replication, description and trustworthiness has been posted in the RDA wiki and will be continuously extended. The policies will be analyzed and reviewed step by step, starting with integrity, access control, replication, provenance and publication.
We invite all RDA members to join the active discussions on the mailing list and in the RDA forum and to contribute their experience and views.
The Practical Policy group provides five testbeds using the data management systems dCache, DataVerse, E-iRODS and iRODS. The policies will be deployed, automated and evaluated on these testbeds. In principle, the testbeds are also open to other RDA working groups that wish to deploy and make use of their own results, where applicable.
If you are interested in using the testbeds, please contact Reagan Moore or Rainer Stotzka.