

Jorge Clarke

Hello everyone,
Regarding the objectives of the WG, I think we should aim to lay the
foundations of an open-access health/social data repository.
Something like this is happening spontaneously with all the datasets that
are being uploaded to GitHub; my personal concern is that GitHub is owned
by Microsoft. In that sense, I think that we should aim for something more
like “arXiv” (https:// ), which is hosted/owned by Cornell University
(a private institution, but an educational one). (“DataRxiv”?? The
.org domain is currently owned by Amazon.)
As pointed out in the case statement, data uploaded to this
repository should, at least, adhere to the FAIR
principles. Besides that, here
are some complementary ideas that have come to mind so far:
1.- Data should be anonymised before being uploaded.
2.- A chain of trust identifying those responsible for data security
(individuals and institutions) should be indicated (as pointed out in
Christopher Harrison’s e-mail).
3.- Extremely sensitive data must be flagged, and access to it must be
granted only with permission from the owners. A protocol to speed up the
permission procedure should be available in emergency situations like
4.- Every time a dataset is uploaded, it must be curated by at least two
experts (like a peer-review system). The curators should also be named in
the dataset information.
5.- Potential bias in the dataset (gender, race, etc.) should be
mentioned in the dataset information. We don’t want to repeat Amazon-type
issues with hiring only men…
Regarding point 3, as this kind of data is usually very sensitive, maybe
this repository could host dataset descriptions and information, but
not necessarily the data itself. This would at least help us know who owns
the data and whom we need to talk to in order to get it and/or work with
it, and help us judge whether the data could help our current research or
not. This could even be helpful to encourage collaborations between
Regarding Christopher’s concerns about IRBs ( “*…An IRB’s goal is to
protect patients privacy and rights and providing a method to review
ethical use a study…*” ), we may even consider ways to ensure that the
datasets uploaded to or pointed to in this repository state, in their
description/information, whether they have already passed through an IRB
(or something similar), or are about to do so, are prepared to do so, etc.
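To make the ideas above a bit more concrete, here is a minimal sketch of
what a dataset entry in such a repository could look like. Everything in it
(the DatasetRecord class, its field names, the IRBStatus values and the
checks) is a hypothetical illustration of points 1 to 5 and the IRB
information, not an existing schema or an agreed WG proposal.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class IRBStatus(Enum):
    """Hypothetical states for IRB (or similar) review of a dataset."""
    APPROVED = "approved"
    IN_REVIEW = "in_review"
    PLANNED = "planned"
    NOT_APPLICABLE = "not_applicable"


@dataclass
class DatasetRecord:
    """Illustrative description of one dataset; all field names are assumptions."""
    title: str
    owner_contact: str                  # whom to talk to about the data
    anonymised: bool                    # point 1
    security_responsible: List[str]     # point 2: chain of trust
    curators: List[str]                 # point 4: at least two experts
    known_biases: List[str] = field(default_factory=list)  # point 5
    sensitive: bool = False             # point 3: flag for very sensitive data
    irb_status: IRBStatus = IRBStatus.PLANNED
    data_url: Optional[str] = None      # may stay empty: description only

    def requires_owner_permission(self) -> bool:
        """Point 3: flagged data is only accessible with the owners' permission."""
        return self.sensitive

    def problems(self) -> List[str]:
        """Return the points listed above that this record does not yet meet."""
        issues = []
        if not self.anonymised:
            issues.append("data should be anonymised before upload")
        if not self.security_responsible:
            issues.append("no one is named as responsible for data security")
        if len(set(self.curators)) < 2:
            issues.append("at least two curators are needed")
        return issues


# Example entry that only describes the data and points to its owner.
record = DatasetRecord(
    title="Example survey (hypothetical)",
    owner_contact="owner@example.org",
    anonymised=True,
    security_responsible=["Example Institute"],
    curators=["Curator A", "Curator B"],
    known_biases=["gender"],
    sensitive=True,
)
```

With a structure like this, a record with an empty data_url would still tell
us who owns the data and whether it has gone through an IRB, even when the
data itself is not hosted.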
I know all this may sound obvious to several of you; I just wanted to
contribute a bit, and I think it is worth outlining.
I wish you all an enjoyable confinement, if you are confined like me.
Otherwise, keep enjoying life and take care.
Jorge Andrés Clarke De la Cerda
PhD in Applied Sciences
On Thu, Apr 2, 2020 at 7:57 AM paoladimaio via RDA-COVID19 <
***@***.***> wrote: