GORC International Benchmarking WG Case Statement

08 Jan 2021

RDA Case Statement

 

  1. Charter

The Global Open Research Commons (GORC) is an ambitious vision of a global set of interoperable resources necessary to enable researchers to address societal grand challenges, including climate change, pandemics, and poverty. The realized vision of the GORC will provide everyone, everywhere, at all times with frictionless access to all research artifacts, including but not limited to data, publications, software, and compute resources, together with metadata, vocabulary, and identification services.

The GORC is being built by a set of national, pan-national, and domain-specific organizations such as the European Open Science Cloud, the African Open Science Platform, and the International Virtual Observatory Alliance (see Appendix A for a fuller list). The GORC IG is working on a set of deliverables to support coordination amongst these organizations, including a roadmap for global alignment to help set priorities for Commons development and integration. In support of this roadmap, this WG will develop and collect a set of benchmarks that GORC organizations can use to measure user engagement and development internally, gauge their maturity, and compare features across commons.

First, the WG will collect information about how existing commons measure success, adoption, or use of their services within their organization, for example data downloads, contributed software, and similar key performance indicators (KPIs) and access statistics.

Second, we will develop, validate, collect, and curate a set of benchmarks that allow Commons developers to compare features across science clouds. For this second set we would consider, for example, benchmarks such as evidence of the existence of:

  1. A well-defined decision-making process
  2. A consistent and openly available data privacy policy
  3. Federated Authentication and Authorization infrastructure
  4. Community-supported and well-documented metadata standard(s)
  5. A workflow for adding and maintaining PIDs for managed assets
  6. A mechanism for utilizing vocabulary services or publishing to the semantic web
  7. A process to inventory research artefacts and services
  8. An open catalogue of these artefacts and services
  9. A proven workflow to connect multiple different research artefact types (e.g. data and publications; data and electronic laboratory notebooks; data and related datasets)
  10. A mechanism to capture provenance for research artefacts
  11. Mechanisms for community engagement and input; an element or scale for inclusion

We anticipate that the first set of metrics will be quantitative measures used within an organization, while the second set of benchmarks will be comparable across organizations.
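To make the distinction concrete, the following minimal Python sketch shows one way the two kinds of measures might be recorded during collection. This is an illustrative assumption only: the class names, fields, and example values are hypothetical and are not outputs or recommendations of the WG.

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class InternalKPI:
        """A quantitative measure used within a single commons (first set)."""
        commons: str   # organization reporting the KPI, e.g. "ExampleCommons"
        name: str      # e.g. "monthly dataset downloads"
        value: float   # the measured quantity
        period: str    # reporting window, e.g. "2020-Q4"

    @dataclass
    class FeatureBenchmark:
        """An observable feature comparable across commons (second set)."""
        feature: str                        # e.g. "federated AAI"
        present: bool                       # evidence that the feature exists
        evidence_url: Optional[str] = None  # public documentation, if any

    # A feature benchmark can be evaluated for every commons in the same way,
    # while a KPI is only meaningful inside the organization that defines it.
    downloads = InternalKPI("ExampleCommons", "monthly dataset downloads",
                            1250, "2020-Q4")
    aai = FeatureBenchmark("Federated Authentication and Authorization", True,
                           "https://example.org/aai-policy")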

  2. Value Proposition

This WG is motivated by the broader goal of openly sharing data and related services across technologies, disciplines, and countries to address the grand challenges of society. The deliverables of the WG will inform roadmaps for developing the infrastructure necessary to meet that goal, while the engagements and relationships formed during the work period will help forge strong partnerships across national, regional, and domain-focused members, which are crucial to its success. Identifying observable and measurable benchmarks in pursuit of the global open science commons will create a tangible path for development and support strategic planning within and across science commons infrastructures. In the future, best practices for commons development will emerge from the experience of which actions led to successful outcomes. This work will provide a forum for discussion that will allow members to identify the most important features and the minimal elements required to guide their own development and build a commons that is globally interoperable. Finally, it will support developers as they seek resources to build the global commons by helping them respond to funding agencies' requirements for measurable deliverables.

The proposed WG was discussed at the RDA Plenary 16 virtual meeting.[1] Participants discussed the initial work packages and agreed during the meeting that this was a worthy goal and an appropriate approach.

 

  3. Engagement with Existing Work

The GORC IG builds on, and incorporates, the previous National Data Services IG. The Commons investigated in this WG are likely to have considered or implemented outputs from other RDA groups, such as the Domain Repositories IG, the Data Fabric IG, and the Virtual Research Environment IG, to name a few. These groups, and many others outside of RDA, have recommendations that speak to the functionality and features of various components of Commons: for example, the re3data.org schema for collecting information on research data repositories for registration, and the EOSC FAIR WG and Sustainability WG that seek to define the EOSC as a Minimum Viable Product (MVP). We will review these and other related outputs to see if they have identified benchmarks that support our goals. This review period will ensure that we do not duplicate existing efforts. Appendix B of this case statement identifies a few of these existing efforts, both within and outside RDA; this list will be expanded and reviewed by the WG members.

 

  4. Work Plan

To create these deliverables, members of the group will:

  1. Create a target list of Commons (Appendix A)
  2. Review public-facing documentation of each Commons to extract benchmarking information (both KPIs and feature lists).
  3. Review public-facing documentation of recommendations and roadmaps from related communities to extract benchmarking information (Appendix B). This evaluation phase will include an examination of the outputs from other RDA WGs and position papers available in the wider science infrastructure community, along with experiences gathered by the WG's members.
  4. Because benchmarking information may not be easily found in public documents, we will conduct outreach to Commons representatives and related organizations to ask for additional feedback and information about benchmarks used by their community. This may include benchmarks already in use, as well as benchmarks that organizations feel would be useful but which are not yet implemented.
  5. Synthesize and document the benchmarks into two deliverables, described below (see the sketch after this list).
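As an illustration of the synthesis in step 5, the following minimal Python sketch collapses KPI names gathered from several commons into a non-redundant list of the kind targeted by deliverable D1 (section 5). The normalization rule and the sample entries are hypothetical assumptions for illustration only.

    from collections import defaultdict

    def normalize(name: str) -> str:
        """Crude canonical form so that near-duplicate KPI names merge."""
        return " ".join(name.lower().replace("-", " ").split())

    # (commons, KPI name) pairs as they might be extracted from public documents
    collected = [
        ("CommonsA", "Dataset downloads"),
        ("CommonsB", "dataset-downloads"),
        ("CommonsC", "Contributed software packages"),
    ]

    by_kpi = defaultdict(set)
    for commons, kpi in collected:
        by_kpi[normalize(kpi)].add(commons)

    # Each canonical KPI appears once, with the commons that report it.
    for kpi, users in sorted(by_kpi.items()):
        print(f"{kpi}: used by {', '.join(sorted(users))}")

In practice the WG's judgement, not a string rule, would decide when two metrics are the same; the sketch only shows why a canonical form is needed before the deliverables can claim non-redundancy.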

We anticipate that the WG will create sub-working groups or task groups (TGs). The WG will decide whether to define the task groups according to the deliverables, creating a Commons Internal Benchmarking TG and a Commons External Benchmarking TG, or to subdivide according to a typology of the commons, for example with some members looking at pan-national, national, or domain-specific commons, or by some other division of labor.

The WG will proceed according to the following schedule:

Jan-Mar 2021: Group formation
  1. Agreement on the scope of work and deliverables (broad scope)
  2. Case statement community review
  3. Creation of sub-working groups

Apr-Sept 2021: Begin literature review of public-facing documents from science commons and related organizations. Refine scope: meet to consolidate the list of topics to be addressed in the deliverables and assess the level of resources available to achieve them.

Oct-Dec 2021: Begin outreach to science commons and related organizations. Update at RDA 17.

Jan-Mar 2022: First draft of internal benchmarks distributed for community review.

Mar-Jun 2022: First draft of external benchmarks distributed for community review.

July 2022: Final deliverables.

 

  5. Deliverables

This group will create Supporting Outputs in furtherance of the goals of the GORC IG.

Specifically, two documents and an adoption plan:

D1: A non-redundant set of KPIs and success metrics currently utilized, planned, or desired by existing science commons.

D2: A list of observable international benchmarks of features, structures, and functionality that can help define a Commons and that will feed into a roadmap of Commons interoperability.

D3: An adoption plan, described in section 9 below.

  6. Mode and Frequency of Operation

The WG will meet monthly over Zoom, at a time to be determined by the membership. The WG will also communicate asynchronously online, using the mailing list functionality provided by RDA and shared online documents. If post-COVID international travel resumes during the 18-month work period of this WG, we will propose and schedule meetings during RDA plenaries and at other conferences where a sufficient number of group members are in attendance.

  7. Addressing Consensus and Conflicts

The WG will adhere to the stated RDA Code of Conduct and will work towards consensus, which will be achieved primarily through mailing list discussions and online meetings, where opposing views will be openly discussed and debated amongst members of the group. If consensus cannot be achieved in this manner, the group co-chairs will make the final decision on how to proceed.

The co-chairs will keep the working group on track by reviewing progress relative to the deliverables. Any new ideas about deliverables or work that the co-chairs deem to be outside the scope of the WG defined here will be referred back to the GORC IG to determine if a new WG should be formed.

  8. Community Engagement

The working group case statement will be disseminated to RDA mailing lists and communities of practice related to Commons development that are identified by the GORC IG in an effort to cast a wide net and attract a diverse, multi-disciplinary membership. Similarly, when appropriate, draft outputs will also be published to relevant stakeholders and mailing lists to encourage broad community feedback.

  9. Adoption Plan

The WG will create an adoption plan for distributing and maintaining the deliverables. A specific plan will be developed to facilitate adoption or implementation of the WG recommendations and other outcomes within the organizations and institutions represented by WG members. This will include possible strategies for adoption more broadly within the global community, in such a way as to facilitate interoperability of global infrastructures. Pilot adoptions or implementations would ideally start within the 18-month timeframe, before the WG is complete. We envision implementation occurring when developers of commons compare themselves with similar organizations. We also envision that the adoption plan will speak to how we include the benchmarks in the larger GORC roadmap being created by the parent IG.

 

 

  10. Initial Membership

Co-chairs: 

  1. Karen Payne <ito-director@oceannetworks.ca>
  2. Mark Leggott <mark.leggott@rdc-drc.ca>
  3. Andrew Treloar <andrew.treloar@ardc.edu.au>

 

 Appendix A: List of Commons

 

Pan-National Commons

  1. European Open Science Cloud
  2. African Open Science Platform
  3. Nordic e-Infrastructure Collaboration
  4. Arab States Research and Education Network (ASREN)
  5. LIBSENSE (a community of practice, not an infrastructure; the infrastructure will be built by the RENs, NRENs, and universities)
  6. WACREN  
  7. LA Referencia 

 

National Commons

European Roadmaps - The European Commission and the European Strategy Forum on Research Infrastructures (ESFRI) encourage Member States and Associated Countries to develop national roadmaps for research infrastructures.

  1. German National Research Data Infrastructure (NFDI)
  2. DANS
  3. ATT (Finland) 
  4. GAIA-X (non-member state?; see also) (focused on data sharing in the commercial sector, without excluding research)
  5. UK
    1. UK Research and Innovation
    2. JISC
    3. Digital Curation Centre

Non-European 

  1. QNL (Qatar) 
  2. China Science and Technology Cloud (CSTCloud); see also
  3. Australian Research Data Commons
  4. Canadian National Data Services Framework (in development)
  5. National Research Cloud (US; AI-focused)
  6. NII Research Data Cloud (Japan) 
  7. KISTI (South Korea)

 

Domain Commons

  1. International Virtual Observatory Alliance (IVOA)
  2. NIH Data Commons; Office of Data Science Strategy (USA)
  3. NIST RDaF (USA)
  4. Earth Sciences
    1. DataONE Federation
    2. Federation of Earth Science Information Partners (ESIP)
    3. EarthCube
    4. GEO / GEOSS
    5. Near-Earth Space Data Infrastructure for e-Science (ESPAS, prototype)
    6. Polar
      1. The Arctic Data Committee landscape map of the Polar Community
      2. Polar View - The Canadian Polar Data Ecosystem (includes international initiatives, infrastructure and platforms)
      3. Polar Commons / Polar International Circle (PIC) [not sure if this is active]
      4. PolarTEP
    7. Infrastructure for the European Network for Earth System Modelling (IS-ENES)
  5. Global Ocean Observing System (composed of Regional Alliances)
  6. CGIAR Platform for Big Data in Agriculture
  7. Social Sciences & Humanities Open Cloud (SSHOC)
  8. DiSSCo (https://www.dissco.eu/): research infrastructure for natural collections (a commons for specimens and their digital twins)
  9. ELIXIR Bridging Force IG (in the process of being redefined as “Life Science Data Infrastructures IG”)
  10. Global Alliance for Genomics and Health (GA4GH) 
  11. Datacommons.org - primarily statistics for humanitarian work

 

Gateway/Virtual Research Environment/Virtual Laboratory communities and other Services

  1. International Coalition on Science Gateways
  2. Data Curation Network
  3. CURE Consortium
  4. OpenAIRE
  5. RDA VRE IG

 

 Appendix B: Draft List of WG/IG, documents, recommendations, frameworks and roadmaps from related and relevant communities

 

  1. RDA Outputs and Recommendations Catalogue
  2. RDA Data publishing workflows (Zenodo)
  3. RDA FAIR Data Maturity Model
  4. RDA 9 functional requirements for data discovery
  5. Repository Platforms for Research Data IG
  6. Metadata Standards Catalog WG  
  7. Metadata IG
  8. Brokering IG
  9. Data Fabric IG
  10. Repository Platform IG
  11. International Materials Resource Registries WG
  12. RDA Collection of Use Cases (see also)
  13. Existing service catalogues (for example the eInfra service description template used in the EOSC)
  14. the Open Science Framework
  15. Matrix of use cases and functional requirements for research data repository platforms.
  16. Activities and recommendations arising from the interdisciplinary EOSC Enhance program
  17. Scoping the Open Science Infrastructure Landscape in Europe
  18. Docs from https://investinopen.org/about/who-we-are/  
  19. Monitoring Open Science Implementation in Federal Science-based Departments and Agencies: Metrics and Indicators
  20. Next-generation metrics: Responsible metrics and evaluation for open science. Report of the European Commission Expert Group on Altmetrics (see also)
  21. Guidance and recommendations arising from the EOSC FAIR WG and Sustainability WG
  22. Outputs from the International FAIR Convergence Symposium (Dec 2020) (particularly the session Mobilizing the Global Open Science Cloud (GOSC) Initiative: Priority, Progress and Partnership)
  23. The European Strategy Forum on Research Infrastructures (ESFRI) Landscape Analysis "provides the current context of the most relevant Research Infrastructures that are available to European scientists and to technology developers"
  24. NIH Workshop on Data Metrics (Feb 2020)

 

 

 
Review period: Friday, 8 January 2021 to Monday, 8 February 2021

    Author: Francoise Genova

    Date: 03 Feb, 2021

    I used to be the vice-chair of the FAIR Working Group of the EOSC Executive Board, which completed its task at the end of 2020. I would like to strongly support the proposal to create this GORC International Benchmarking WG. The EOSC FAIR WG recommended in particular that its proposal for FAIR Metrics in the EOSC, inspired by the FAIR Data Maturity Model WG, be reviewed in an international context, and we suggested the GORC IG. The point was discussed during the P16 pre-WG BoF and it seems to fit well as one of the points which could be addressed in the International Benchmarking WG. I am also pleased to see that domain-specific organisations are considered, and that the subdivision of tasks will be defined in a flexible way.

    A few minor comments:

    - The long list of possible topics in Section 1 may be intimidating to organisations when they are contacted to participate. It would be useful to say explicitly that not all commons-developing organisations are expected to develop all these features, depending on their mission and community requirements.

    - I suggest adding the FAIR Data Maturity Model WG to the list of relevant RDA groups in Section 3. It is cited elsewhere, but FAIR plays an essential role in enabling seamless access to data and other digital objects.

    Very minor comment: in Section 9, last sentence, howwe > how we

    I will disseminate information about this WG proposal in the IVOA, which is cited as one of the relevant organisations among the domain commons.

    Best wishes

    Francoise Genova


    Author: Ville Tenhunen

    Date: 08 Feb, 2021

    Hi all,

    We (Yin Chen and Ville Tenhunen) discussed the RDA GORC Benchmarking WG charter proposal in the EGI office (nowadays a virtual one).

    Generally, the idea of the WG is good and easy to support. This is an activity which supports the GORC IG's work and global research data collaboration.

    We hope you will consider a few points of view to clarify or sharpen the charter text:

    • The value proposition describes the value of having a set of benchmarks. It might be better if it defined who the target readers are and what benefits they can gain from this work.

    • At some phase of the WG lifecycle it will be necessary to provide more information about how such a set of benchmarks will be developed, so that it can be evaluated whether the approach is appropriate and whether the result is valid.

    • How will the benchmarks be used? The charter should provide some use scenarios to justify their usefulness.

    A couple of detailed comments about the charter and appendices:

    1. Charter

    There is also a list of some benchmarks to consider. It is understandable that these are examples, but here are a couple of proposals for further consideration:

    • Is it possible to benchmark the budgeting of commons relative to GDP or research funding in general (the OECD has plenty of data on this)? Budgeting structure is also an important aspect: some commons are funded internationally, some nationally, and some have institution-based funding, etc.
    • Training and support infrastructures and methods are an important part of success in many places; for example, the EOSC had a Skills and Training WG (https://www.eoscsecretariat.eu/working-groups/skills-training-working-group). Perhaps this should also be considered as part of the benchmarks list.
    • Regarding point "9. A proven workflow to connect multiple different research artefact types (e.g. data and publications; data and electronic laboratory notebooks; data and related datasets)": software also has an increasing role as a research artefact. Please see: Report from the EOSC Executive Board Working Group (WG) Architecture Task Force (TF) SIRS (https://op.europa.eu/s/oK7d)

    2. Appendix A

    • “ATT (Finland)” is mentioned. This initiative has ended; these activities are currently organised by the national open science coordination: https://www.avointiede.fi/en
    • The appendix contains a mix of projects, initiatives, recommendations, etc. It is not clear what ‘commons’ refers to: standards, recommendations, or practice?

    3. Appendix B; Draft List of WG/IG, documents, recommendations, frameworks and roadmaps from related and relevant communities

     

    With the best regards,

    Yin & Ville
