Building infrastructure through strategies of interconnection (not an architecture)
RDA seeks to create a data infrastructure that enables researchers and innovators to easily share data across cultures, scales, and technologies to address the grand challenges of society. We envision this infrastructure as a vast interconnection of computing and data systems, people, and organisations. It is well understood that infrastructure must consider both technical and social issues and solutions and how they interact, but research also shows that one cannot pre-define how all those interactions will occur.
“Boundaries between technical and social solutions are mobile, in both directions: the path between the technological and the social is not static and there is no one correct mapping. Robust cyberinfrastructure will develop only when social, organizational, and cultural issues are resolved in tandem with the creation of technology-based services.”*
RDA addresses this issue by focusing on creating and implementing the “social and technical bridges” that connect different systems, communities, tools, and so on. We consciously avoid defining a specific a priori architecture or roadmap for how these bridges should interconnect. Instead, we encourage RDA members to propose topics as Working Groups and Interest Groups and to participate in existing Groups in their domains of interest and expertise. We try to identify how Recommendations and other group outputs are adopted and where consensus emerges, and then work to increase interconnection and adoption.
To date, RDA Recommendations have generally taken the form of technical specifications, data and reference models, recommended practice, and harmonization of existing standards. The Recommendations by themselves are not transformative; it is how they are adopted and how they come together that will create a data infrastructure. And while we cannot fully define how that will happen or what else is specifically required, we can develop agile strategies that codify core principles, define specific systems, and further interconnection across systems. The Technical Advisory Board (TAB) and multiple other groups in RDA are doing just that.
At a first level, it is important to recognize that there are already some important consensus agreements across RDA. The community has clearly shown that it respects the principles of RDA and values the RDA forum and human network as a way toward solving problems. On a more technical level, there is strong agreement on the need to apply persistent identifiers to data and other entities when constructing and interconnecting data systems. These are important first-principle agreements, and they illustrate emergent social interconnection.
Furthermore, TAB continually reviews the current set of RDA Recommendations and Working and Interest Groups to identify points of interconnection and overlap. The goal is to identify consensus, foster collaboration across groups, connect better to specific disciplines, and help the community understand the overall landscape and its current gaps and overlaps. TAB does this by carefully reviewing proposed groups, following group activities, recommending coordination between specific groups, and assessing the RDA landscape on an ongoing basis.
TAB has recently established a “Landscape Overview Group” (TAB LOG), which is assessing how to map interconnections across the groups. In particular, it is trying to interconnect the domain-specific groups with the cross-domain groups, and the groups that argue from their infrastructure orientation (e.g. libraries, archives, IT services) with those that argue from functional situations (e.g. publishing, DMP, VRE, reproducibility) [See early prototype of link mapping].
Governments and other stakeholders also need guidance on how to facilitate a data infrastructure. In response, some groups have identified principles or guidelines that suggest a landscape for further system and infrastructure development. The Data Fabric Interest Group has been reviewing existing statements of principle and high-level guidelines, working to identify where there is clear consensus across them and where RDA Recommendations may be applied. The Legal Interoperability Interest Group has conducted a thorough review and an extensive community feedback process to define a set of legal interoperability guidelines that better enable data sharing across jurisdictions and data types [link pending]. The Libraries Interest Group specifically addresses its community with its guidelines, “23 things: Libraries for Research Data”. Other groups, including the Domain Repositories IG, are developing similar guidelines.
Finally, there is an effort to demonstrate interconnection at the systems level. When developing specific systems with defined data and user communities, it becomes appropriate to define more specific architectures. The Organisational Assembly has been reviewing RDA Recommendations from an organisational perspective and assessing how they might augment existing workflows or help in the construction of new repositories or related systems. For example, Griffith University is assessing all the RDA Recommendations as part of its investigation of what its next research data repository system might look like. The WDS/DSA Common Certification Recommendations can also be used as guidelines for repositories to establish and assess their processes.
Similarly, the Data Fabric IG and the Data Publishing IG have been developing test beds and demonstrations of how specific Recommendations can be adopted in conjunction with each other within specific contexts. The Data Fabric IG has been collaborating with national data services, while the Data Publishing IG has been collaborating with publishers and repositories. There are also regional efforts to foster adoption and interconnection at the systems level. Both RDA/US and RDA/Europe have funded adoption programs. Again, TAB is watching all these activities with an eye to further interconnection.
Overall, TAB and RDA in general continually seek interconnections, but we do not necessarily define how those interconnections will happen. We want to approach the problem at multiple scales and from different directions, reflecting the huge diversity of topics and points of view involved in the construction of the scientific data infrastructure. We welcome feedback on this document, especially ideas on how we can continue to foster consensus and interconnection within RDA.
— Mark Parsons, RDA Secretary General, and Francoise Genova, RDA TAB Co-Chair, with the endorsement of the RDA Council Strategy Subcommittee.
* The quote above is from Edwards et al. (2007), Understanding Infrastructure: Dynamics, Tensions, and Design (http://hdl.handle.net/2027.42/49353), which provides a nice summary of the research on how infrastructures evolve and mature. Other relevant works include:
Bowker, Geoffrey C. 1997. Social Science, Technical Systems, and Cooperative Work: Beyond the Great Divide. Mahwah, N.J.: Lawrence Erlbaum Associates.
Bowker, Geoffrey C., Karen Baker, Florence Millerand, and David Ribes. 2010. Toward information infrastructure studies: Ways of knowing in a networked environment. In International Handbook of Internet Research. Springer Science+Business Media.
Edwards P, MS Mayernik, A Batcheller, G Bowker, and C Borgman. 2011. Science friction: Data, metadata, and collaboration. Social Studies of Science. http://dx.doi.org/10.1177/0306312711413314
Lampland, M. and SL Star. 2009. Standards and their stories: how quantifying, classifying, and formalizing practices shape everyday life. Ithaca, NY: Cornell University Press.
Star SL and K Ruhleder. 1996. Steps toward an ecology of infrastructure: Design and access for large information spaces. Information Systems Research 7 (1): 111.