RDA P3: PID Information Types WG session summary
2 session slots: Wed, 13:30-15:00 and 15:30-17:00
The working session of the PID Information Types WG (PIT WG) at the 3rd RDA Plenary focused on in-depth conceptual and technical discussions of the scope and functionality of its main deliverable, an Application Programming Interface (API) for interaction with typed information closely associated with Persistent Identifiers (PIDs). The session also addressed critical issues of typing for PID information in a larger context, as well as the implementation and finalization plans for the remaining WG lifetime. Towards the end of the session, some motivating type examples were gathered that illustrate the intended use of the PIT mechanisms well.
A first important scope-related discussion item was the recurring debate on the relationship between PID records and object metadata, and on an overall governance process deemed to be necessary. It was emphasized that the PIT API and the larger framework around it will not establish strict rules on what belongs in a PID record and what is better kept as a separate (and, preferably, persistently identified) object. The debate in the Terminology WG on internal and external properties could help, but the view within PIT is that the API is agnostic towards the information stored using it. There should not and will not be a “one size fits all” approach; individual communities and disciplines will have to establish their own understanding of what their essential properties are and possibly enforce adequate policies. The PIT view is to motivate and sustainably support such processes by offering the necessary functionality to structure PID information, so that cross-community properties can eventually emerge. This is, however, a larger cultural and distinctly non-technical process. The issue of longer-term governance of types was briefly discussed, and it was agreed that the PIT WG will not govern the future creation of types, particularly given its limited lifetime.
A possible conceptual model, to be determined as part of an interesting area of future research, revolves around the question of where to store information, and thus which properties to establish at the PID level. Core research questions, pointed out by Bob Kahn in particular, are what information to store in a rapid-resolution part and what information is better kept in external metadata storage. Can useful policies for this be formulated independently of particular disciplines?
An important terminological inconsistency detected during the session was the use of the term “type”, and of the idea of typing, at more than one central point in the PIT API. The term was found to be highly ambiguous, and the WG must work towards a more precise description of its concepts. A possible alternative term for at least one of the typing concepts was “cluster”, which does not seem to conflict with other terms established in the communities of the WG participants. Thus, Properties will be defined as triples of name, value and value type; Types as clusters of one or more Properties; and Profiles as clusters of Types. Concerns were also raised as to whether the third conceptual layer, Profiles, is really required or could be dropped for the sake of simplicity. It could not be decided during the session precisely which arguments favor keeping the Profiles layer, and this discussion is expected to continue across WGs after the plenary. On a related topic, it was agreed that there are practical scenarios calling for both global and local Profiles, and that the API should support both alternatives.
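The three layers described above can be sketched as a small data model. This is only an illustrative sketch, not the PIT API: the class names, the `conforms` method, the example property names and the choice to let a Type refer to its properties by name are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Property:
    """A (name, value, value type) triple, as in the session's definition."""
    name: str
    value: object
    value_type: str  # hypothetical label, e.g. "string" or "datetime"


@dataclass(frozen=True)
class Type:
    """A cluster of one or more properties (assumed here: referenced by name)."""
    name: str
    property_names: frozenset

    def __post_init__(self):
        if not self.property_names:
            raise ValueError("a Type must cluster at least one property")

    def conforms(self, record):
        # A record conforms if it carries every property the Type clusters.
        present = {p.name for p in record}
        return self.property_names <= present


@dataclass(frozen=True)
class Profile:
    """A cluster of Types; a record conforms if it conforms to all of them."""
    name: str
    types: tuple

    def conforms(self, record):
        return all(t.conforms(record) for t in self.types)


# Hypothetical PID record with two properties (values are made up).
record = [
    Property("checksum", "abc123", "string"),
    Property("created", "2014-01-01T00:00:00Z", "datetime"),
]

fixity = Type("fixity", frozenset({"checksum"}))
provenance = Type("provenance", frozenset({"created"}))
basic_profile = Profile("basic", (fixity, provenance))
```

In this sketch a Profile adds nothing beyond grouping Types, which mirrors the concern raised in the session about whether the third layer is needed.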
The role of the type registry was also explained. One of its core features, the capability to search for types, was pointed out as a major functional dependency for the PIT API, which will not offer comparable capabilities.
The second session included a productive and creative brainstorming session on possible type examples that should be included in the final WG deliverables to better illustrate the use of the PIT mechanisms. The examples gathered show that the essential mechanics had been understood once the first session had cleared up important terminological ambiguities. They also show that a core set of value types can be determined, which notably includes many uses of temporal and non-atomic (tuple) information. It was pointed out that the issue of encodings should be set aside for the moment, leaving responsibility for serialization and deserialization to clients.
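To illustrate what leaving serialization to clients could mean in practice, the following sketch shows a client encoding and decoding temporal and tuple values on its own. The function names, the `value_type` labels and the JSON layout are assumptions made for this example; the session did not define any encoding.

```python
import json
from datetime import datetime, timezone


def encode_value(value):
    """Client-side encoding of a value into a JSON-compatible dict (hypothetical layout)."""
    if isinstance(value, datetime):
        return {"value_type": "datetime", "value": value.isoformat()}
    if isinstance(value, tuple):
        return {"value_type": "tuple", "value": [encode_value(v) for v in value]}
    if isinstance(value, (str, int, float, bool)):
        return {"value_type": type(value).__name__, "value": value}
    raise TypeError(f"unsupported value: {value!r}")


def decode_value(encoded):
    """Inverse of encode_value: rebuild the Python value from its dict form."""
    value_type = encoded["value_type"]
    raw = encoded["value"]
    if value_type == "datetime":
        return datetime.fromisoformat(raw)
    if value_type == "tuple":
        return tuple(decode_value(v) for v in raw)
    return raw  # atomic values are stored as-is


# A non-atomic (tuple) value containing a temporal component.
original = (datetime(2014, 1, 1, tzinfo=timezone.utc), "draft", 3)
wire = json.dumps(encode_value(original))
restored = decode_value(json.loads(wire))
```

Here the stored string is opaque to the API side; only the client knows how to turn it back into a timestamp or a tuple.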
The session concluded with a short discussion of next steps. The Java prototype will be implemented over the summer, and the design and precise documentation of the RESTful API take high priority, since the API represents a protocol specification highly desired by future adopters.