Trusted Research Environments for Sensitive Data: FAIRness for "Closed" Data and Processes

30 Oct 2022

Submitted by Andreas Rauber


Meeting objectives: 

The conflicting goals of protecting and maintaining control over sensitive data while also striving to give third parties access to the data pose a significant challenge. Trusted Research Environments (TREs) have been established over the last decade that, if properly set up and operated, help ease this problem by providing a trusted, highly controlled and monitored environment with strong security guarantees.

In many settings, both academia and industry need to safeguard access to highly sensitive data, commonly referred to as closed data. This sensitivity can arise from, e.g., commercial value or privacy requirements, and it warrants careful data management supported by technical, organizational and legal measures intertwined within a TRE.

The FAIR Guiding Principles for scientific data management and stewardship address infrastructures supporting the reuse of scholarly data, specifically targeting machine-readability and machine-actionability of this data. While the guiding principles cover the management of all research data and aid in identifying necessary steps towards FAIR research data, they do not per se provide best-practice guidelines, templates or design decisions for making closed data FAIR.

We found that many TREs are similar in architecture design and technical implementation. There is, however, a lack of openly available guidelines and explanations of design decisions for setting up and running TREs.

We thus aim to establish a WG that will

  • identify and publish a blueprint/reference architecture for the technical architecture, roles and processes commonly found in such trusted research environments, based on the evaluation of existing solutions.
  • make it easier for institutions to set up data infrastructures that allow researchers to gain access to sensitive data (irrespective of whether that sensitivity stems from privacy/GDPR reasons or from the commercial/IPR sensitivity of the data). It will demonstrate that such data, in spite of not being freely shareable, can still be FAIR and made available for research.
  • demonstrate how results obtained on such closed data can still be made reproducible and transparent to the degree permitted by data sensitivity, establishing a clear public metadata record on the research performed as well as supporting findability of the data, linked to clear access request/permission processes and public verification of access by specific trusted parties.
  • increase interoperability between such environments on a technical, legal and organizational level, hopefully enabling easier set-up of ad-hoc joint TREs in settings where specific data sources need to be joined but may not be passed on to a third party for hosting.
  • make it easier to set up data visiting platforms where trusted code can be executed, with monitoring and result inspection processes clearing results for return to a researcher so that, in some cases, even a completely shielded interaction with sensitive data may be possible.

 

The goal is to establish best practices for balancing these differing requirements for access limitations and flexibility of interaction and analysis, while understanding the associated risks, so that data becomes accessible and usable for research that would otherwise not be possible.

Meeting agenda: 

 

Collaborative session notes: https://docs.google.com/document/d/1pQheT3ZIp1SUU4iYc0Z4IO7PVOT6_-N7QjdZ...

 

- Round of introductions

- Short presentation of TREs: core architectural decisions, roles, processes

- Identify further TREs within RDA community

- Present a draft proposal for a WG Charter

- Discussion, comments, cooperation with other WGs/IGs

- Planning next steps for revising / finalizing Charter

 

Type of Meeting: 
Working meeting
Short introduction describing any previous activities: 

A number of bilateral meetings and video calls with several TREs have been held over the past year, discussing similarities and differences in set-up. However, these discoveries were isolated. What has been missing is bringing them together on a more formal basis, clarifying the advantages and disadvantages of specific design decisions and, most importantly, sharing them with the community of institutions interested in setting up such an infrastructure.

We would like to bring together the groups operating or planning to operate such infrastructures in a more structured way, documenting the experience gained so far, explaining design decisions and options, and pointing out best practices.

 

We will reach out, particularly, to the following RDA groups to launch this BOF session:

  • Virtual Research Environments IG (VRE-IG), to have input on common policies and best practices, as well as specifications for underlying architectures and components and interfaces.
  • Data Security and Trust WG (WG DST), for outputs regarding global data sharing minimum data protection standards and establishing mutual trust between research data infrastructures. Important mechanisms that can be adopted from WG DST are authentication and authorization protocols for data access.
  • FAIR for Virtual Research Environments WG (FAIR VRE WG), to address data steward requirements for human and machine-actionable interfaces to find, access, interoperate and reuse data stored in virtual research environments and output on making VREs themselves FAIR.
  • Sensitive Data IG (SD-IG): the activities can be aligned with and feed into the IG Virtual Research Environments. The WG will further have strong links to many other domain-specific WGs/IGs (e.g. health data), building on earlier WGs such as the Working Group for Data Security and Trust (WGDST). It will also build upon numerous national initiatives and regional or domain-specific infrastructures.

 

BoF chair serving as contact person: 
Meeting presenters: 
Andreas Rauber; to be confirmed (based on discussions, availability still to be checked): Jan-Willem Boiten, Carole Goble, Susheel Varma, Gard Thomassen, Oddgeir Lingaas Holmen, Tore A. Linde, Tommi Nyronen, Ali Syed, Peter Løngren, Jonas Hagberg, Martin Weise
If "Other," please specify:
Privacy, Security, Sensitive Data
Driven by RDA Organisational Member: 
No
Applicable Pathways: 
FAIR, CARE, TRUST - Adoption, Implementation, and Deployment
Data Infrastructures and Environments - Institutional
Other