Achieving anonymity and correcting bias with synthetic data through generative AI

26 Jan 2021
Group(s) submitting the application: 
Meeting objectives: 

Previous HDIG discussions have highlighted the hurdles, such as fragmentation into silos and privacy concerns, that are usually encountered when trying to access significant amounts of health data, and how this comparative shortage of Big Data in health is hindering artificial intelligence and knowledge discovery in medicine.

The forthcoming HDIG session in Edinburgh aims to address the issue of effective anonymisation through synthetic data generation in combination with advanced privacy-enhancing technologies. Federated approaches based on Secure Multiparty Computation and Differential Privacy can enable the creation of artificial data through Generative Adversarial Networks, allowing experimentation with non-re-identifiable health data to be scaled up and clinical decision-support tools to be trained effectively. At the same time, large-scale generation and use of synthetic data sets is raising renewed interest in their legal and ethical applications, as well as in their validation.

This session proposes to highlight several critical issues related to synthetic data: 1) their still insufficiently acknowledged legal status as anonymous data; 2) the deep learning methodologies currently used to generate synthetic data and synthetic imaging; 3) their capacity to correct biased databases; 4) their potential for augmenting specific patient cohorts and for creating virtual cohorts; 5) the trust synthetic data inspire and their validation tests; 6) their combined usage with other privacy-enhancing technologies.

Meeting agenda: 

Collaborative meeting notes:

  • Brief introduction to the HDIG
  • Presentations on synthetic data critical issues and perspectives
  • Q&A and discussion on topics presented
  • Next steps
Target Audience: 

For this open session, we invite Policymakers for Healthcare; Clinicians wanting to use data technology to improve their practice; Biomedical researchers using data-driven analytical techniques in their research life-cycle; Healthcare Data Scientists dealing with data mining, machine learning, physiological modelling and image processing technologies and the data these produce; Health bioinformatics legal experts; Healthcare and Health Maintenance Organisation administrators; Pharmaceutical industry researchers and manufacturers; Medical equipment researchers and manufacturers, in silico modelling, testing and clinical trial experts; and participants from other related WG/IG.


Group chair serving as contact person: 
Brief introduction describing the activities and scope of the group: 


When the Health Data IG was officially recognised and endorsed in 2016, it was the only RDA group focusing on the intricacies of Health Data, especially, but not only, as they relate to privacy and security issues in Healthcare. Since then, the group has met regularly at all RDA Plenary events, and its sessions are attended by several researchers and professionals from diverse backgrounds. The Health Data IG sessions are becoming fora where new topics gaining interest in scientific research communities – such as the May 2018 entry into force of the EU General Data Protection Regulation (GDPR) or Artificial Intelligence applied to Health – can be debated by a broad and competent audience with a worldwide perspective.

Currently, several groups are starting to explore different aspects of Health Data; in particular, two working groups (WG) have spun off from the HDIG, one on “Blockchain Applications in Health” and another on “Reproducible Health Data Services”, both endorsed by the RDA.

Short Group Status: 

The Health Data Interest Group (HDIG) was officially instituted in 2016, following successful BoF Sessions during the 6th RDA Plenary Meeting in Paris and the 7th RDA Plenary Meeting in Tokyo. It is now a mature RDA component, actively involved in P8, the 8th RDA Plenary Meeting, in Denver, with a session titled “Health Data Privacy & Security issues”; in P9 in Barcelona, with a session focused on “Meaningful health data for research and for industry”; in P10 in Montreal, with a session on “Health data mapping and diverging trends in health data protection”; in P11 in Berlin, on “First results on RDA Adoption and Training Guide for Reproducible Data Service Workflows and diverging trends in Health data protection”; and in P12 in Gaborone, Botswana, where the topic of “genomic data in the light of privacy rules” was addressed. The last two meetings attended in person, in Philadelphia (P13) and Helsinki (P14), were the occasion for HDIG members to discuss in more detail, and on a use-case basis, the theme of Artificial Intelligence, with special regard to “AI medicine: preconditions to apply AI to medicine and privacy concerns (after the EU GDPR)” and “Hospitals’ experiences towards a large-scale data sharing ecosystem for AI”, while during the RDA Virtual Plenary 16 (Costa Rica) we focused on “Transparency and Trust in Health Data”.

Type of Meeting: 
Informative meeting
Avoid conflict with the following group (1): 
Avoid conflict with the following group (2): 
Meeting presenters: 
TBD (Yannis Ioannidis, Edwin Morley-Fletcher, Minos Garofalakis, Leslie McIntosh ….)