Building infrastructure through strategies of interconnection

RDA Council Strategy Subcommittee - (27 August 2016)

Building infrastructure through strategies of interconnection (not an architecture)

RDA seeks to create a data infrastructure that enables researchers and innovators to easily share data across cultures, scales, and technologies to address the grand challenges of society. We envision this infrastructure as a vast interconnection of computing and data systems, people, and organisations. Some suggest that RDA needs a specific, a priori architecture or clear roadmap describing how all those interconnections should work. We submit that a more organic approach will be more effective at enabling those interconnections to grow and be useful. This document lays out some of our strategies to develop, foster, and understand those interconnections.

It is well understood that infrastructure must consider both technical and social issues and solutions and how they interact, but research also shows that one cannot pre-define how all those interactions will occur. “Boundaries between technical and social solutions are mobile, in both directions: the path between the technological and the social is not static and there is no one correct mapping. Robust cyberinfrastructure will develop only when social, organizational, and cultural issues are resolved in tandem with the creation of technology-based services.”*

RDA addresses this issue by focusing on creating and implementing the “social and technical bridges” that connect different systems, communities, tools, etc. And we consciously avoid defining a specific a priori architecture or roadmap as to how these bridges should interconnect. Instead, we encourage RDA members to propose topics as Working Groups and Interest Groups and to participate in existing Groups in their domains of interest and expertise. We try to identify how Recommendations and other group outputs are adopted and where consensus emerges and then work to increase interconnection and adoption.

To date, RDA Recommendations have generally taken the form of technical specifications, data and reference models, recommended practice, and harmonization of existing standards. The Recommendations by themselves are not transformative; it is how they are adopted and how they come together that will create a data infrastructure. And while we cannot fully define how that will happen or what else is specifically required, we can develop agile strategies that codify core principles, define specific systems, and further interconnection across systems. The Technical Advisory Board (TAB) and multiple other groups in RDA are doing just that.

At a first level, it is important to recognize that there are already some important consensus agreements across RDA. The community has clearly shown that it respects the principles of RDA and values the RDA forum and human network as a way toward solving problems. On a more technical level, there is strong agreement on the need to apply persistent identifiers to data and other entities when constructing and interconnecting data systems. These are important agreements of first principle and illustrate emergent social interconnection.

Furthermore, TAB continually reviews the current set of RDA Recommendations and Working and Interest Groups to identify points of interconnection and overlap. The goal is to identify consensus, foster collaboration across groups, connect better to specific disciplines, and help the community understand the overall landscape and the current gaps and overlaps. TAB does this through careful review of proposed groups, by following group activities and recommending coordination between specific groups, and through ongoing assessment of the RDA landscape.

TAB has recently established a “Landscape Overview Group” (TAB LOG), which is assessing how to map interconnections across the groups. Specifically, it is trying to interconnect the domain-specific groups and the cross-domain groups, as well as the groups that argue from their infrastructure orientation (e.g. libraries, archives, IT services) and those that argue from functional situations (e.g. publishing, DMP, VRE, reproducibility) [See early prototype of link mapping].

Governments and other stakeholders also need guidance on how to facilitate a data infrastructure. In response, some groups have identified principles or guidelines that suggest a landscape for further system and infrastructure development. The Data Fabric Interest Group has been reviewing existing statements of principle and high-level guidelines and working to identify where there is clear consensus across them and where RDA Recommendations may be applied. The Legal Interoperability Interest Group has conducted a thorough review and extensive community feedback process to define a set of legal interoperability guidelines to better enable data sharing across jurisdictions and data types [link pending]. The Libraries Interest Group specifically addresses its community with the guidelines on “23 things: Libraries for Research Data”. Other groups, including the Domain Repositories IG, are developing similar guidelines.

Finally, there is an effort to demonstrate interconnection more at the systems level. When developing specific systems with defined data and user communities, it becomes more appropriate to define more specific architectures. The Organisational Assembly has been reviewing RDA Recommendations from an organisational perspective and assessing how they might augment existing workflows or help in the construction of new repositories or related systems. For example, Griffith University is assessing all the RDA Recommendations as part of its investigation of what its next research data repository system might look like. The WDS/DSA Common Certification Recommendations can also be used as guidelines for repositories to establish and assess their processes.

Similarly, the Data Fabric IG and the Data Publishing IG have been developing specific test beds and demonstrations of how specific Recommendations can be adopted in conjunction with each other within specific contexts. The Data Fabric IG has been collaborating with national data services, while the Data Publishing IG has been collaborating with publishers and repositories. There are also regional efforts to foster adoption and interconnection at a systems level. Both RDA/US and RDA/Europe have funded adoption programs. Again, TAB is watching all these activities with an eye to further interconnection.

Overall, TAB and RDA in general continually seek interconnections, but we do not necessarily define how those interconnections will happen. We want to approach the problem at multiple scales and from different directions, reflecting the huge diversity of topics and points of view involved in the construction of the scientific data infrastructure. We welcome feedback on this document, especially ideas on how we can continue to foster consensus and interconnection within RDA.

Drafted by Francoise Genova, RDA TAB Co-Chair and RDA Council Subcommittee Member & Mark Parsons, RDA Secretary General


* The quote above is from Edwards et al. (2007), Understanding Infrastructure: Dynamics, Tensions, and Design, which provides a nice summary of the research around how infrastructures evolve and mature. Other relevant works include:

  • Author: Malcolm Wolski

    Date: 12 Jan, 2017

    You ask for feedback and I notice no one has replied. Upon reviewing the document (with which I have no issues), I am wondering if other readers had the same thought as I did. What is the specific problem in RDA that the document is trying to address? Are you implying that there is an issue about what the focus of RDA should be in relation to infrastructure, or whether it should be more focussed on facilitation and on being a promoter of interconnectivity?


  • Author: Mark Parsons

    Date: 08 Feb, 2017

    Good point, Malcolm. I have revised the first paragraph to indicate that the issue is how to understand, develop, and promote the interconnections that become 'infrastructure' and to explain what RDA is doing in that regard and why we are not laying out an a priori architecture. 

