Working Group Recommendation July 8, 2024

The Global Open Research Commons International Model, Version 1.1

  • Primary Domain: Domain Agnostic
  • RDA Pathways: Data Infrastructures and Environments - Institutional, Data Infrastructures and Environments - International, Data Infrastructures and Environments - Regional or Disciplinary
  • Group Technology Focus: Grant Preparation, Policy-Related, Virtual Research Environments (VREs)
  • Stakeholders: Funders & Policy makers, Industry, Infrastructures, Libraries, Research Performing Organisations
  • Sustainable Development Goals: Affordable and Clean Energy, Climate Action, Good Health and Well-being, Life Below Water, Life on Land, Partnerships to achieve the Goal, Quality Education
  • Language: English

Abstract

In response to the global movement to implement national and cross-national or global commons, a Research Data Alliance (RDA) Interest Group was formed to work towards a community-developed typology for describing research commons. This Interest Group created a Working Group to develop an International Model describing the attributes of Global Open Research Commons. The RDA Global Open Research Commons (GORC) International Model (IM) v. 1.0 (https://doi.org/10.15497/RDA00099) and accompanying report (https://doi.org/10.15497/RDA00097) were endorsed as supporting outputs in October 2023. The GORC IM v1.1 is presented here as a spreadsheet and is accompanied by a short introduction document. The v1.0 accompanying report is still relevant to v1.1 of the model, as it is a narrative document that provides background information about the initiative, describes its intent and intended audience, the method used to create it, and its structure and content. It also provides brief descriptions of communities and activities that have proposed to, or are currently, utilising the model in different contexts, as well as next steps for work in this area. It is important to recognise that the model is aspirational in nature and not prescriptive, drawing on existing good practice and promoting inclusive approaches.


The GORC IM Working Group (WG) consolidated a large range of resources and expert feedback to generate the model, which consists of a number of elements, with associated categories, subcategories, attributes and features, to be considered when undertaking the development of a commons of any kind, at any stage. Although the categories, subcategories, attributes and features are marked as core, desirable or optional, the model does not mandate what should be implemented, or in what way; the decisions on what is relevant, and where resources should be invested, will vary depending on the environment and priorities of the implementer. The model is already being adapted and tested in several real-world contexts. In some cases, the work is being used in the development of commons, while in other cases it is being utilised in other research infrastructure projects.


While the work supports the development of individual commons, it also supports the work necessary to make the commons interoperable. The GORC IM WG outputs provide a firm, yet flexible, foundation for the GORC IG as it seeks to create a set of recommendations and a roadmap for commons integration and for building the GORC. The realised vision of GORC will provide frictionless access to all research artefacts including, but not limited to: data, publications, software and compute resources; and metadata, vocabulary, and identification services to everyone, everywhere, at all times with the appropriate protocols and procedures in place. This is the environment that will allow the research community to focus on their enquiries and respond accordingly. It is an audacious goal and we believe that this model will advance our collective efforts in that direction.

Community comments may also be left in the online Google Sheet document: model v1.1

This Recommendation, which is V1.1 of our model, is the next version of the supporting output endorsed in October 2023: https://doi.org/10.15497/RDA00099

Impact Statement

The need for coordination of data infrastructure on various levels (country, continent, discipline, sector) arises from the emergence of so-called “Open Science Commons” or “Data commons”, which provide a shared virtual space or platform that presents the researcher with a marketplace for data and services. Examples include the European Open Science Cloud, the Australian Research Data Commons, the African Open Science Platform, open government portals and a range of initiatives outside traditional research contexts. Coordinating across these initiatives to enable a global network of interoperable data commons is the goal. This output is intended as a guideline with suggestions for commons on how to be better prepared to join the GORC, or to become a GORC. It is not a prescriptive list and not all elements and features will be applicable to all. All items in the Model should be given careful consideration by those undertaking its implementation, deciding which elements are applicable and feasible for them; all decisions should be intentional, based on the individual circumstances.

Explanation of Sustainable Development Goals

This output contributes indirectly to all of the UN SDGs by potentially further enabling interoperability between research commons to address these very goals. More direct contributions can be identified for: “Partnerships for the Goals”, where the model is intended to increase understanding and interactions between research commons as well as other stakeholders in the research community to enable global research; “Affordable and Clean Energy”, “Good Health and Well-being”, “Climate Action”, “Life Below Water”, and “Life on Land”, as these areas all directly benefit from increased interoperability between research teams internationally and often depend on efficient globally-oriented research and access to research artefacts; and “Quality Education” through the model’s call for sustainability of knowledge, engagement with the research community, and specifically the development, maintenance, and growth of human capacity in the research community through training and education.

Citations

Woodford, C. J., Treloar, A., Leggott, M., Payne, K., Jones, S., Lopez Albacete, J., Madalli, D., Genova, F., Dharmawardena, K., Chibhira, N., Åkerström, W., Macneil, R., Nurnberger, A., Pfeiffenberger, H., Tanifuji, M., Zhang, Q., Jones, N., Sesink, L., & Wood-Charlson, E. (2023). The Global Open Research Commons International Model, Version 1.1 (Version 1.1). Research Data Alliance.

Comments

  • October 3, 2024 at 1:26 pm

    CJ Woodford says:

    We have reviewed and considered the comments made here and by our WG members in meetings, and have come to the following decisions for revisions for the commons model V1.1:

    - We are providing the model spreadsheet in CSV and XLSX formats. This is to ensure that familiar and accessible versions of the model V1.1 are readily available. It will also continue to live as a Google Spreadsheet with comments enabled (https://docs.google.com/spreadsheets/d/1tyFpCEbLvHRE2BKy0EDyPc1Gz5w6jm9Q5RVYx2XETkM/edit?usp=sharing).
    - We did not remove “stakeholder” from this version. Our reasons for this are multifaceted, but the main consideration was the fact that “stakeholder” is currently widely used across nations and languages, and we would likely be making the model harder to understand for non-native English speakers by exchanging this word with an unfamiliar or uncommon synonym (specifically stated as a barrier by multiple WG members). The other roadblock is that the suggested way to remove “stakeholder” is to specifically state the groups, individuals, etc. that are relevant in context, which would potentially make the model prescriptive (something we VERY MUCH do not want) and would take more time to evaluate, assess, and implement. In the spirit of the model being global and wanting to not wait too long to complete the recommendation process, we will instead consider how to address using “stakeholder” fully in future versions. However, we have updated our definition of “stakeholder” to better reflect our inclusive intentions with this word – i.e. changing “Any human individual or entity that is associated with, is a member of, participates in, provides to, or uses the commons* past, present, and future.” to “Any human individual or entity that represent themself or a non-human (animal, plant, land, resource, or technology) entity or group that is associated with, is a member of, participates in, provides to, or uses the commons* past, present, and future.”
    - We did not change the layout of the model in the spreadsheet, and therefore did not introduce identifiers at this time. Introducing identifiers was an attempt to make the model more accessible for those using screen readers and those with small screens, by removing the merged cells and colour-coding. We have not converged on the format of identifiers, whether using identifiers is the correct way to tackle these changes, or how to appropriately change the layout of the spreadsheet. Considering other formats for the model is consistently part of our on-going and future work plans, and introducing identifiers will be revisited in future discussions; we will keep the model structure as-is for this version.
    - WG-specific acronyms have been spelled out and contextualized. This includes expanding “TG” into “Task Group”, specifying the timing of “Phase 1 review” to be “Phase 1 Review (Literature Analysis, August – October 2022)” and the same for other timed and abbreviated phrases, specifically in the “Primary Sources” columns.
    - Intentionally empty cells now have “-” to indicate they’re purposefully left empty. In the Google Sheet and XLSX versions, this applies to empty cells in columns E-H but not to merged cells in columns A-D. In the CSV versions, all empty cells have "-" to indicate they've been purposefully left empty.
    - The conceptual model is now mentioned in the accompanying model documentation, as well as future work.

    I have submitted all the updated documentation to RDA to move forward in the recommendation process.

    Thank you to everyone who took part in leaving/making comments and helping us work through how to improve this specific version in reasonable ways. Please feel free to reach out to me at c.joseph.woodford@gmail.com if you'd like to discuss any part of our decisions or process for V1.1, or have suggestions for future versions. Sincerely, CJ
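
As an illustration of the CSV convention described in the comment above, the following minimal sketch shows how a small script might turn one sheet's CSV export into a plain-text outline, treating "-" as an intentionally empty cell. The column names (`Category`, `Subcategory`, `Attribute`, `Feature`), the sample rows, and the function name `csv_to_outline` are assumptions made for illustration; the actual headers and content of the V1.1 CSV files may differ.

```python
import csv
import io

# Hypothetical column names; the real CSV headers may differ.
LEVELS = ["Category", "Subcategory", "Attribute", "Feature"]
# Marker that the V1.1 CSV files use for intentionally empty cells.
EMPTY_MARKER = "-"

# Placeholder rows for illustration only; not taken from the model.
SAMPLE = "\n".join([
    "Category,Subcategory,Attribute,Feature",
    "Example Category,-,-,-",
    "Example Category,Example Subcategory,Example Attribute,Feature one",
    "Example Category,Example Subcategory,Example Attribute,Feature two",
])


def csv_to_outline(csv_text: str) -> str:
    """Render CSV rows as an indented plain-text outline.

    Each level is printed once, when its value first appears or changes;
    cells that are blank or hold the empty marker are skipped.
    """
    lines = []
    previous = [None] * len(LEVELS)
    for row in csv.DictReader(io.StringIO(csv_text)):
        for depth, level in enumerate(LEVELS):
            value = (row.get(level) or "").strip()
            if value in ("", EMPTY_MARKER):
                continue  # intentionally empty: nothing to print here
            if value != previous[depth]:
                lines.append("  " * depth + f"{level}: {value}")
                previous[depth] = value
                # A new value at this level starts a fresh subtree.
                for deeper in range(depth + 1, len(LEVELS)):
                    previous[deeper] = None
    return "\n".join(lines)


if __name__ == "__main__":
    print(csv_to_outline(SAMPLE))
```

Because the outline relies only on cell text and indentation, it does not depend on merged cells or colour-coding, which is one possible way to approach the accessibility concerns raised in this thread.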

  • August 7, 2024 at 1:26 pm

    CJ Woodford says:

    Hi Julien, Thank you for your very detailed comment. Please reply here if we miss anything or email CJ directly at c.joseph.woodford@gmail.com if the commenting window closes beforehand.

    You raise a good point on the accessibility of the document. In the development of the model, we had to prioritize the development of the content and push the refinement of its container to later work; however, accessibility for all users is a key priority. Looking into how spreadsheets are accessed by screen readers, we didn't necessarily follow best practices for spreadsheets (e.g. https://support.microsoft.com/en-us/office/accessibility-best-practices-with-excel-spreadsheets-6cc05fc5-1314-48b5-8eb3-683e49b3e593). This is something we can absolutely try to fix. The co-chairs can discuss how to approach this for this proposed recommendation. Related to John's comment, we've introduced a model structure as well as content, which the spreadsheet tries to explain all at once, and having an ontology separate from the model content would help reduce the complexity of any one document.

    Regarding an overview, there is an accompanying PDF file that describes the structure of the model, and the first sheet in the spreadsheet is an introduction to the structure and intent of the model. Were these not sufficient as overviews, and if not, what could be added or replaced? We appreciate your suggestion of a mindmap, and it's an active avenue of work to try and represent the model in something other than a spreadsheet. We currently have a preliminary mindmap on Kumu (incomplete and hence not part of this version; https://embed.kumu.io/dcedb828c2be6fd9e854ebf6ab8bb330), and we have a conceptual model linked in the introduction tab of the spreadsheet in UML notation and in text (part of this version: https://docs.google.com/presentation/d/1EzM1wfRzsDhbwzbSwF0vlmy0cOR9yGvoww9vRoN5q_Q/edit?usp=sharing). If the conceptual model is helpful (we've heard from others that it is), we can try and promote it more as an introduction to the model structure.

    The colours of the cells are intended to guide the eye and do not hold any importance to the model content itself. We take your point that encoded information, as well as what we attempted to do regarding visual aesthetics (i.e. the merged and sometimes empty cells), would cause confusion for anyone using a screen reader. We can absolutely follow the accessibility recommendations linked above and those you've stated here, most notably by not using merged cells. Then the empty cells would truly mean there is nothing there - e.g. a category without subcategories, or an attribute without features. We're surprised that there are cells with duplicate content; we will absolutely ensure that the model items (categories, subcategories, attributes, features) are unique. Thank you for raising this point. One way we can better enable accessibility and perhaps help with inadvertent redundancy is by introducing a numbering system to the items in the model, so it's not reliant on encoded information or visuals.

    We also appreciate your point on the abbreviations that we used; we can make sure to spell out uncommon abbreviations throughout all the cells. The Task Group (TG) evaluations in particular were iterative reviews, phase 1 occurring between late 2022 and early 2023 and phase 2 occurring in mid-2023. Items often came from our landscape analysis and speaker series, but sometimes emerged instead from the phase 1 or phase 2 discussions and hence those are cited as sources.

    Regarding having one (or multiple) .csv files instead of .xlsx, this is something we can attempt and compare before final submission. Ideally, I (CJ) would like for a Google Sheets version to remain active online (with the recommendations you've suggested) and then each sheet could be downloaded as a .csv for static submission. We will trial options and see what makes the most sense.

    An ongoing challenge with our work is that it is difficult to determine when to release an output and recommendation, as there is always more work to do and more to investigate on optimal representations. Finding better model representations, both visuals and interactive web options, is an avenue we're currently pursuing. There's more information about future work for the model in the related supporting output, which is a full narrative report and linked directly from the introduction page of the spreadsheet and in the model overview PDF that's part of this recommendation (https://zenodo.org/records/10032913). We did not submit the report again as part of this recommendation as it is a supplement to the model; however, we can include more information about our future work in the model overview PDF explicitly.

    A video is an interesting suggestion. We'll review how the model looks after implementing your recommendations, and see if a video would be a helpful addition.

    Thank you for providing this article on the use of "stakeholder". We had many discussions on whether and how to use "stakeholder" in the model and its affiliate documents, and opted to use it because it encompassed all parties. We'll review this choice again before final submission. Sincerely, CJ & Mikiko, on behalf of the GORC IM WG co-chairs

  • August 5, 2024 at 1:26 pm

    Julien Colomb says:

    Hello, The document seems to be rich in information and may be very helpful in developing new projects; however, it is not very accessible. It seems one needs to invest a lot of time to understand how the sheets are structured, and I doubt that visually impaired people can make any use of it. I do not have access to a big screen at the moment and this prevents me from looking more into the content, so my review here is restricted to the form. I am not sure whether this is the problem that John Graybeal's solution was looking for, and whether making an ontology would help.

    First, it lacks a bit of an overview; maybe a mindmap of the relations between the sheets would help? Another way might be to make one sheet with the information of the categories (I think that is the information one can read in the first line in each category, where no subcategory is given). This latter solution would also take out these strange first lines with no subcategory from the sheets.

    Second, I think a spreadsheet by the RDA should be tidy: one should avoid merged cells completely, and not have any information encoded in cell format (colour). Also, empty cells should be avoided: is something missing, is it not applicable or not relevant, is the information already in another cell...? One could also try to make subcategory names unique. For example, in the Governance and leadership sheet, Attribute and Features are sometimes a single cell, sometimes different cells: what can that mean? Why is the subcategory in cells B 9-15 empty? There are still some abbreviations (TG1 phase 1 evaluation?) in H8 versus H9: what are these evaluations, and what does the primary source for a category mean? (It seems this relates to the category as a whole, but I am not sure. If it does, can the source of a category be eval. 1 when one subcategory is from eval. 2?)

    My advice would be to replace the .xlsx file by a series of .csv files, and see how the csv files need to be completed to have all the information available in the initial file. One could also write a small script to create a text version of the information (once it is in .csv form). (Also, creating an ontology from there would be easier.) If this would be too much work at this point, one could provide an explanatory video as part of the output. A plan for a more accessible output should be provided.

    As a last comment, I do have a small content wish: I would plead for the removal of the word "stakeholder" from this (and any) RDA output (see Reed, M.S., Merkle, B.G., Cook, E.J. et al. Reimagining the language of engagement in a post-stakeholder world. Sustain Sci 19, 1481–1490 (2024). https://doi.org/10.1007/s11625-024-01496-4).

    Thanks for the enormous work. I think the information collected here is very useful and important for the field, and that it will be worth spending time on making it more straightforward to use and comprehend.

  • July 29, 2024 at 1:26 pm

    CJ Woodford says:

    Hi John, Thank you for taking the time to review the model and for your very kind words about it! We have been thinking about alternative ways to represent the model in the background, including as an ontology, and we welcome ideas whole-heartedly. Right now, we have a demo of the glossary as a machine-actionable vocabulary through the ARDC vocabulary service (https://demo.vocabs.ardc.edu.au/viewById/1041), and we hope eventually its URI will be hosted by RDA. We did try to identify ontologies that the GORC IM could be described through (landing on the SPAR ontologies [http://www.sparontologies.net/] and specifically DoCO [http://www.sparontologies.net/ontologies/doco]), but it might indeed make more sense to create an ontology from the model itself. If you have the ability and interest to help push forward making GORC semantic objects, that would be fantastic! I would be keen to help where I can, and some of the other co-chairs will likely feel the same. I have no qualms with you pushing these forward on your own as well; however, I would appreciate seeing the draft objects before they're publicly posted if possible. Sincerely, CJ

  • July 19, 2024 at 1:26 pm

    John Graybeal says:

    This is an extraordinary compendium, very intriguing. I haven't fully absorbed this, but between the glossary and the diagram of your content model, I had the thought: Have you considered encoding some or all of your model as an ontology? At a minimum, having the glossary in a SKOS vocabulary would make those terms/definitions useful/discoverable to any number of other users. Having Categories, Subcategories, Attributes, and Features in an ontology, with each of your tabs, and other components like the Consideration Levels, represented with their precise definitions, might create a whole different way of working with the content. I realize this may sound like a solution looking for a problem, but I'd be interested in creating at least a set of controlled vocabularies from this document to put into the ESIP COR (Community Ontology Repository). Would you have any objections?
