Hello Jamie,
there is so much going on in RDA that I did not have the time to look through the wiki of the preservation group beforehand.
What I have seen and read includes, for example, interesting documents about preservation in the HEP community, and there are various pointers to interesting workshops. I also saw that you refer to ISO-based certification, which is obviously suitable for CERN.
Was there any attempt to arrive at guidelines for preservation that would work across many centres active in various disciplines? It would be great to have such guidelines. For me it would be interesting, for example, to see how to formulate certification requirements. Most communities or community centres I have interacted with rely on DSA/WDS certification for practical reasons. During the time I was responsible in CLARIN, for example, we had long debates as well and finally decided to act according to DSA and WDS (at that time separate). This email is not the place to discuss this in detail.
So what I read is certainly interesting, but I would like to look at
· various case studies from a range of centres and
· generalised guidelines or similar.
They may be accessible via the wiki, but I could not find them.
Putting all you did into a convincing case story would certainly be good as well. Of course I would find it great if there were also references to what the other RDA groups are doing. As I indicated, for example, the PP group collected a cookbook that is partly relevant for DM. Linking between the various groups and looking at each other’s work seems to be so important.
best
peter
------------------------------------------------------------------------------------------------
Peter Wittenburg Tel: +49 2821 49180
***@***.*** ; ***@***.***
RDA Europe Director, RDA TAB Member, EUDAT Scientific Advisor
Senior Advisor Data Systems, Max Planck Computing and Data Facility
Gießenbachstraße 2, 85748 Garching, Germany
http://www.mpcdf.de, http://www.mpcdf.de/~pewi
former affiliation: MPI for Psycholinguistics, Nijmegen, The Netherlands
From: Jamie.Shiers=***@***.***-groups.org [mailto:***@***.***-groups.org] On behalf of Jamie Shiers
Sent: Sunday, 1 May 2016 12:26
To: Wittenburg, Peter; RDA Organisational Assembly / Organisational Advisory Board (OAB); RDA Europe Synchronisation Assembly (SyA)
Cc: Hugh Shanahan; ***@***.***; Mark Parsons; Berman, Fran; Simon CODATA; Harrison, Andrew
Subject: Re: [rda-oab][synchronisation-assembly] [rda-oab][synchronisation-assembly] US future thoughts
Dear Peter,
I will let Harry, Hugh and others talk about training as they have concrete things in place.
As far as Collaboration is concerned, I think that the key point is to make the 4000 RDA members feel empowered.
This is a big topic that needs wider discussion and probably some “test cases”.
Let’s take the case of data preservation and see how this fits.
Hilary is working on a quote for a new version of the RDA Recommendations & Outcomes booklet. I don’t know when this will appear. There are also some slides I sent to Mark in May 2015 and Fran uses some material from me in one of her courses.
The benefits of adoption are multiple and can be hard to measure precisely. One of many examples comes from Jamie Shiers, Information Technology Department, CERN & Manager of the Data Preservation for Long-Term Analysis in High Energy Physics (DPHEP), who says that “In DPHEP (Data Preservation in High Energy Physics) we have saved person-years through the knowledge gained through the RDA including the Preservation IG and many others. The most conservative estimate I can think of is that we have saved 5 person-years and got something much better and more sustainable than we would have done otherwise. This could well be an under-estimate but 5 person-years at the EU project rate of ~EUR100K/year is quite substantial.”
There is material that I have sent to the Preservation e-Infrastructure IG mailing list, including a status report covering the last 3 years (more or less the time of my involvement with the RDA).
It is referenced from the attachment which will appear in an upcoming version of the CERN Courier. (This is still a draft - some small changes are likely). There is also an iPRES paper currently in review.
What does this say? At a minimum we want to go through certification according to ISO 16363 for the WLCG Tier0. Then we will decide whether we want a formal audit or not. I intend to do the self-audit while looking at how to extend it to all CERN experiments as well as to all CERN activities (the latter also including photos, videos, memos etc., the organisation’s “digital memory” as the article calls it).
We will certainly learn a lot from this exercise: it should be completed by 2018 so that it can feed into the next round of the European Strategy for Particle Physics update (2018/2019; the exact dates are not yet fixed).
Coupled to this are other activities: the preparation of Active Data Management plans for WLCG and all CERN experiments (I would like to see this become part of the formal approval / review process for experiments), Sharing and Re-use in practice (CMS have already made available 3 releases, some 200TB in total), Open Access Policies, Reproducibility, etc.
Through projects that we are involved in, and also through the EIROForum IT WG, we are trying to spread this to other disciplines.
The timelines are strict: whilst not technically difficult, getting all the necessary formal approval (e.g. the 100+ metrics of ISO 16363) is not going to happen overnight.
How can “the RDA” empower me / us so that the whole community benefits as much as possible?
(Not by telling me that there haven’t been many posts on the PeIG mailing list for example).
If we achieve the above I am immodest enough to think that it will be a pretty major achievement.
I have said many times that we could not have come up with this plan, nor advanced so far in its implementation, without expert input from many people at RDA meetings, including WGs and IGs.
If I had never come to Garching in September 2012, then to Gothenburg in 2013 etc., we would be in a very different situation, probably unable to see the wood for the trees.
So I think my estimate of the effort (= money) that we have saved is almost certainly an underestimate.
To mis-quote George Bernard Shaw, we are waiting to help, we are willing to help, we are wanting to help!
I hope that I haven’t gone too much off track - to come back to concrete steps:
1. We will continue to participate in WGs, IGs, BoFs etc that are relevant to our work and goals;
2. We would like to see how the latter could be leveraged to benefit “all of RDA”, e.g. the workshops we are organising, or are likely to organise, in the coming 1 / 2 / 3 years (data management, data sharing, reproducibility: “bit preservation” has been a bit overdone in our past DPHEP workshops but we still believe we have some valuable experience to share: at the 100+PB scale today and planning for up to 3 orders of magnitude more, including cost model and business case).
3. We are happy to participate in the production of “success stories” now and in the future.
How to take this further and generalise it?
I would suggest taking 2-3 examples (data preservation, training, and one other) and talking about them, presumably at a plenary, as shining examples of the “power of the RDA”. End by calling for other examples to be submitted for show-casing in the future.
Possibly follow with a 30-minute discussion, explicitly trying to get “the silent majority” to speak up.
Not easy, but it could well be a snowball effect, starting slowly and rapidly gaining size and momentum.
Cheers, Jamie
On 01 May 2016, at 11:47, Wittenburg, Peter <***@***.***> wrote:
Thanks Jamie,
interesting point. How do you want to organise things?
I would like to understand how to do things practically. Currently, when organising an event in response to a request, we find good experts from our RDA member base whom we can ask to run a course etc. They have shown in the groups where they are heading and have indicated their use cases or adoption stories, etc. It's all on the wiki.
The 500,000 data scientists you are talking about are a huge mass indeed. Some of them we know as individuals by chance, and if we are lucky we know in detail what they are doing, where they have deep knowledge, what they could contribute, etc. So if we were to organise a meeting on preservation, I know, for example, that you have done a lot, but I don't know the details. As yet there is no adoption story or WG/IG output about that.
So what is your suggestion?
best
peter
-----Original Message-----
From: Jamie Shiers [mailto:***@***.***]
Sent: Sunday, 1 May 2016 08:56
To: Wittenburg, Peter
Cc: Hugh Shanahan; RDA Europe Synchronisation Assembly (SyA);
***@***.***; Mark Parsons; Berman, Fran; RDA Organisational
Assembly / Organisational Advisory Board (OAB); Simon CODATA; Harrison,
Andrew
Subject: Re: [synchronisation-assembly] [rda-oab][synchronisation-assembly]
US future thoughts
Dear Peter and all,
It is very good that so many of us see training as a key output / deliverable.
And it is clear that there is only so much that a small team can do.
Hence the proposal to leverage the ~4000 strong “RDA Collaboration”.
But equally important IMHO is to balance “push” with “pull” (from the
organisations / projects that the “RDA Collaboration” represents and bridges
to). (The bi-directional engagement as I call it).
Then we could truly say: “The Worldwide RDA Collaboration, which represents
science at all scales, mobilises to target the ‘missing 500,000’ data scientists.
It not only offers training in core data principles and values but also addresses
the specific needs of the various communities and projects concerned. This
allows the RDA to implement its vision of ‘researchers and innovators
openly sharing data across technologies and disciplines and countries to
address the grand challenges of society’.”
Scalable. Sustainable. Implementable. Workable.
Cheers, Jamie
On 30 Apr 2016, at 16:16, Peter Wittenburg <***@***.***>
wrote:
Dear Hugh, all,
did you ever see the training page – please look here:
http://europe.rd-alliance.org/training-programme
Our primary focus for our training efforts as RDA EU is to disseminate the
RDA global results, but we also do slightly more. And also the other regions
will have some activities in this respect. Since all material is open we can all
share our efforts.
So we are already doing training, organising it with a very small team, and it
is a huge challenge to organise up to 3 or 4 events per month. In all this we
act as a broker between interested people and experts whom we can
draw on from among the 4000 RDA experts. If you find the focus of
the training courses wrong, then please fill in the request for ideas and
wishes.
Let me add here that we are pleased that EDISON volunteered to create a
framework which allows us to synchronise training efforts across Europe.
And I should also mention (and that is perhaps as important as what is
said above) that various countries are doing training courses
organised by active RDA groups and often these national courses are
supported by RDA Europe. As an example just look
here: https://www.dkrz.de/Nutzerportal/veranstaltungen-1/de-rda-de-trainingsworkshop-2016
So Hugh – what is missing and can you do more with a small team?
best
Peter
From: Hugh Shanahan [mailto:***@***.***]
Sent: Saturday, 30 April 2016 11:27
To: Jamie Shiers
Cc: Wittenburg, Peter; ***@***.***; Mark Parsons; Berman,
Fran; RDA Organisational Assembly / Organisational Advisory Board
(OAB); RDA Europe Synchronisation Assembly (SyA); RDA Europe
Synchronisation Assembly (SyA)
(***@***.***-groups-europe.org); Simon CODATA;
Harrison, Andrew
Subject: Re: [rda-oab][synchronisation-assembly] US future thoughts
Dear all
I wanted to follow up on the email from Jamie, namely his wish for the
RDA to engage in training. I agree with him when he refers to it as the
under-utilised “killer app” of the RDA.
It hardly needs stating that there is a huge demand for Data
Science skills in Research. What is important to note is the depth and
breadth of training that is required. The skills required differ subtly between
those working in a domain dominated by Volume and/or Velocity issues, such
as High Energy Physics and Bioinformatics, and those mostly facing the Variety
issue (in, for example, the Long Tail of Research).
It’s also important to point out that training for specialists in Data Science
is key, but training for the large numbers of researchers who will need to
have a moderate understanding of a variety of different topics within Data
Science is also essential. The recent estimated figure of 500K researchers
who need Data Science skills in Europe alone jumps out here.
There is an analogy here with Engineering – an Engineer will typically
understand some Calculus, Mechanics, Thermodynamics and Linear Algebra
and a variety of other topics. Obviously there are experts in all those fields
but that doesn’t mean an Engineer should simply hand off any matrix
inversion to an Applied Mathematician simply because she’s not done a PhD
in the topic. As it is, many researchers are wasting much of their time and
effort re-inventing the wheel and view Open Research with suspicion
because they don’t see the bigger picture.
The number of Masters programmes in Data Science around the world is
growing rapidly, but they appear to address only parts of the problem. As
noted by one study from the EDISON project, these programmes are often
focussed on particular sub-sets of Data Science rather than giving an
overview. Hence the danger is that, without some leadership, Data Science
will become a fractured discipline.
There is a need for an organisation to take the lead and propose, not
dictate, best practices in Data Science from introductory to advanced levels;
to point out that it’s necessary to have some understanding of all of it and
the fact that being open with data and its analysis makes for more effective
and efficient research; to set up the mechanisms to accredit individuals,
courses and degree programmes to ensure quality and maximise impact.
The RDA is ideally placed to do this. It has the authority based on its
extensive grassroots community of experts and its array of funders.
I cannot think of a more effective way of achieving the goals of the RDA.
All the best
Hugh
__________________________
Hugh Shanahan
Senior Lecturer in Bioinformatics
***@***.***
http://www.shanahanlab.org
@hughshanahan
Skype hugh_shanahan
Tel +44 (0)1784 443433
orcid.org/0000-0003-1374-6015
On 28 Apr 2016, at 11:00, Jamie Shiers <***@***.***> wrote:
Dear all,
Stimulated by these various discussions I have written a short very informal
note (aka brain dump) that is attached.
This lists 3 main points (as per Leif’s mail) although I confess to running out
of steam, time and page limit (1 double side of A4) on the 3rd (and possibly
most important) point.
However, as the note says, I am sure others will weigh in on this point.
Cheers, Jamie
On 27 Apr 2016, at 18:14, Peter Wittenburg <***@***.***>
wrote:
Dear SyA members,
Fran from RDA US allowed us to distribute this brainstorming note about
the future of RDA. Please take it as what it is: first ideas on how RDA could
move from the viewpoint of our US colleagues.
I think it is a great resource to stimulate our discussions in Europe as well.
best
peter
Attached files:
RDA_US_2.0_Proposal.pdf
--
Full post:
https://rd-alliance.org/group/rda-europe-synchronisation-assembly-sya/post/us-future-thoughts
Manage my subscriptions:
https://rd-alliance.org/mailinglist
Stop emails for this post:
https://rd-alliance.org/mailinglist/unsubscribe/52156
Attached files:
RDA_Sustainability_____Thoughts.docx
--
Full post:
https://rd-alliance.org/group/rda-europe-synchronisation-assembly-sya/post/aw-rda-oabsynchronisation-assembly-us-future
Manage my subscriptions:
https://rd-alliance.org/mailinglist
Stop emails for this post:
https://rd-alliance.org/mailinglist/unsubscribe/52208
Author: Jamie Shiers
Date: 02 May, 2016
Hi Peter,
On 02 May 2016, at 14:10, Peter Wittenburg <***@***.***> wrote:
Hello Jamie,
there is so much going on in RDA that I did not have the time to look through the wiki of the preservation group beforehand.
What I have seen and read includes, for example, interesting documents about preservation in the HEP community, and there are various pointers to interesting workshops. I also saw that you refer to ISO-based certification, which is obviously suitable for CERN.
Certification is one of the many issues I learned about through the RDA. We did start looking at DSA and it was useful to get “management approval” to go further. (You can summarise DSA in a few PPT slides - I am not sure that this would be so easy with ISO 16363).
Having studied ISO 16363 a bit further (through an on-site course and also through filling out most of a first pass at self-certification), I find that it matches our existing practices quite well.
As I may have mentioned, we plan to share our (Tier0+Tier1) experiences later (maybe 2017) and would also welcome stories about other types of certification. In fact there is a DPC event in London (at the “Worshipful Company of Information Technologists”) in this vein that I will attend. (I want to understand whether what I think are suitable justifying documents, policies and procedures would be regarded as such by others before going too far.)
Was there any attempt to arrive at guidelines for preservation that would work across many centres active in various disciplines? It would be great to have such guidelines. For me it would be interesting, for example, to see how to formulate certification requirements. Most communities or community centres I have interacted with rely on DSA/WDS certification for practical reasons. During the time I was responsible in CLARIN, for example, we had long debates as well and finally decided to act according to DSA and WDS (at that time separate). This email is not the place to discuss this in detail.
Yes, firstly across all HEP sites and also with other disciplines through multi-disciplinary workshops.
After 3 years, we finally agreed in HEP that we could use a common template to describe our activities. I think that it would work for others too, but we have not tried this yet.
So what I read is certainly interesting, but I would like to look at
• various case studies from a range of centres and
• generalised guidelines or similar.
They may be accessible via the wiki, but I could not find them.
No, you won’t find them there but I agree it would be a nice goal to extract from presentations and workshops.
I don’t think that this would be possible in e.g. a 90-minute IG session at a plenary. It would probably need a dedicated workshop and I don’t see this happening in 2016 given what I have already committed to. Probably not in the first half of next year either...
Putting all you did into a convincing case story would certainly be good as well. Of course I would find it great if there were also references to what the other RDA groups are doing. As I indicated, for example, the PP group collected a cookbook that is partly relevant for DM. Linking between the various groups and looking at each other’s work seems to be so important.
I would be more than happy to work with someone on such a story, citing the groups that have helped us.
There are also projects, e.g. 4C.
At our last DPHEP workshop in Lisbon (where the 3 main themes were: 1. presenting the status report reflecting the last 3 years of achievements; 2. DMPs; 3. Certification) it was more or less agreed that the need for a “conventional DPHEP” workshop was over, and that we should move on to other related things. This is why we foresee a workshop on multi-disciplinary experiences with data sharing. (If you don’t preserve data you can’t share it, but there are many other issues involved. For example, in our experience the infrastructure for long-term preservation is not necessarily the one that you use for sharing the data: you probably need to restrict access to the multi-decade archive and it is probably not configured for “open access” (firewall, authentication and other issues).)
Should be fun.
Cheers, Jamie
best
peter
------------------------------------------------------------------------------------------------
Peter Wittenburg Tel: +49 2821 49180
***@***.*** ; ***@***.***
RDA Europe Director, RDA TAB Member, EUDAT Scientific Advisor
Senior Advisor Data Systems, Max Planck Computing and Data Facility
Gießenbachstraße 2, 85748 Garching, Germany
http://www.mpcdf.de, http://www.mpcdf.de/~pewi
former affiliation: MPI for Psycholinguistics, Nijmegen, The Netherlands
Von: Jamie.Shiers=***@***.***-groups.org [mailto:***@***.***-groups.org] Im Auftrag von Jamie Shiers
Gesendet: Sonntag, 1. Mai 2016 12:26
An: Wittenburg, Peter; RDA Organisational Assembly / Organisational Advisory Board (OAB); RDA Europe Synchronisation Assembly (SyA)
Cc: Hugh Shanahan; ***@***.***; Mark Parsons; Berman, Fran; Simon CODATA; Harrison, Andrew
Betreff: Re: [rda-oab][synchronisation-assembly] [rda-oab][synchronisation-assembly] US future thoughts
Dear Peter,
I will let Harry, Hugh and others talk about training as they have concrete things in place.
As far as Collaboration is concerned, I think that the key point is to make the 4000 RDA members feel empowered.
This is a big topic that needs wider discussion and probably some “test cases”.
Let’s take the case of data preservation and see how this fits.
Hilary is working on a quote for a new version of the RDA Recommendations & Outcomes booklet. I don’t know when this will appear. There are also some slides I sent to Mark in May 2015 and Fran uses some material from me in one of her courses.
The benefits of adoption are multiple and can be hard to measure precisely. One, of many examples comes from Jamie Shiers, Information Technology Department, CERN & Manager of the Data Preservation for Long-Term Analysis in High Energy Physics (DPHEP) who says that “In DPHEP (Data Preservation in High Energy Physics) we have saved person-years through the knowledge gained through the RDA including the Preservation IG and many others. The most conservative estimate I can think of is that we have saved 5 person years and got something much better and more sustainable than we would have done otherwise. This could well be an under-estimate but 5 person-years at the EU project rate of ~EUR100K/year is quite substantial.”
There is material that I have sent to the Preservation e-Infrastructure IG mailing list, including a status report covering the last 3 years (more or less the time of my involvement with the RDA.
It is referenced from the attachment which will appear in an upcoming version of the CERN Courier. (This is still a draft - some small changes are likely). There is also an iPRES paper currently in review.
What does this say? At a minimum we want to go through certification according to ISO 16363 for the WLCG Tier0. Then we will decide whether we want a formal audit or not. I intend to do the self-audit looking how to extend to all CERN experiments as well as to all CERN activities (the latter including also photos, videos, memos etc - the organisations “digital memory” as the article calls it).
We will certainly learn a lot from this exercise: it should be completed by 2018 to allow it to input to the next round of the European Strategy for Particle Physics update (2018/2019 - the exact dates are not yet fixed).
Coupled to this are other activities: the preparation of Active Data Management plans for WLCG, all CERN experiments (I would like to see this part of the formal approval / review process for experiments), Sharing and Re-use in practice (CMS have already made available 3 releases - some 200TB in total), Open Access Policies, Reproducibility, etc etc
Through projects that we are involved in, we are trying to spread this to other disciplines. Also through the EIROForum IT WG.
The time lines are strict: whilst not technically difficult getting all the necessary formal approval (e.g. the 100+ metrics of ISO 16363) is not going to happen overnight.
How can “the RDA” empower me / us so that the whole community benefits as much as possible?
(Not by telling me that there haven’t been many posts on the PeIG mailing list for example).
If we achieve the above I am immodest enough to think that it will be a pretty major achievement.
I have said many times that we could not have come up with this plan, nor advanced so far in its implementation, without expert input from many people at RDA meetings, including WGs and IGs.
If I had never come to Garching in September 2012, then to Gothenburg in 2013 etc we would be in a very different situation.
Probably unable to see the wood for the trees, so I think my estimate of the effort (= money) that we have saved is almost certainly an underestimate.
To mis-quote George Bernard Shaw, we are waiting to help, we are willing to help, we are wanting to help!
I hope that I haven’t gone too much off track - to come back to concrete steps:
1. We will continue to participate in WGs, IGs, BoFs etc that are relevant to our work and goals;
2. We would like to see how the latter could be leverage to benefit “all of RDA’, e.g. the workshops we are organising, or are likely to organise in the coming 1 / 2 / 3 years (data management, data sharing, reproducibility: “bit preservation” has been a bit overdone in our past DPHEP workshops but we still believe we have some valuable experience to share: at the 100+PB scale today and planning for up to 3 orders of magnitude more, including cost model and business case).
3. We are happy to participate in the production of “success stories” now and in the future.
How to take this further and generalise it?
I would suggest to take 2-3 examples (data preservation, training, and one other) and talk about them, presumably at a plenary, as shining examples of the “power of the RDA”. End by calling for other examples to be submitted for show-casing in the future.
Possibly follow with a 30’ discussion, explicitly trying to get “the silent majority” to speak up.
Not easy, but it could well be a snowball effect, starting slowly and rapidly gaining size and momentum.
Cheers, Jamie
On 01 May 2016, at 11:47, Wittenburg, Peter <***@***.***> wrote:
Thanks Jamie,
interesting point. How do you want to organise things?
I would like to understand how to do things practically. Currently I see that when organising an event based on some request we will find good experts from our RDA members base whom we could ask to run a course etc. They have shown in the groups where they are heading to, they have indicated their use cases or adoption stories, etc. It's all on the wiki.
The 500.000 data scientists you are talking about are a huge mass indeed - some of them we as individuals know per accident and if we are lucky we know in detail what they are doing, where they have deep knowledge, what they could contribute, etc. So when we would organise a meeting on preservation I know that for example you have done a lot, but don't know details. Yet there is no adoption story or WG/IG output about that.
So what is your suggestion?
best
peter
-----Ursprüngliche Nachricht-----
Von: Jamie Shiers [mailto:***@***.***]
Gesendet: Sonntag, 1. Mai 2016 08:56
An: Wittenburg, Peter
Cc: Hugh Shanahan; RDA Europe Synchronisation Assembly (SyA);
***@***.***; Mark Parsons; Berman, Fran; RDA Organisational
Assembly / Organisational Advisory Board (OAB); Simon CODATA; Harrison,
Andrew
Betreff: Re: [synchronisation-assembly] [rda-oab][synchronisation-assembly]
US future thoughts
Dear Peter and all,
It is very good that so many of us see training as a key output / deliverable.
And it is clear that there is only so much that a small team can do.
Hence the proposal to leverage the ~4000 strong “RDA Collaboration”.
But equally important IMHO is to balance “push” with “pull” (from the
organisations / projects that the “RDA Collaboration” represents and bridges
to). (The bi-directional engagement as I call it).
Then we could truly say: “The Worldwide RDA Collaboration, that represents
science at all scales, mobilises to target the “missing 500,000” data scientists.
It offers training not only core data principles and values but also addresses
the specific needs of the various communities and projects concerned. This
allows the RDA to implements its vision of "researchers and innovators
openly sharing data across technologies and disciplines and countries to
address the grand challenges of society”.
Scalable. Sustainable. Implementable. Workable.
Cheers, Jamie
On 30 Apr 2016, at 16:16, Peter Wittenburg <***@***.***>
wrote:
Dear Hugh, all,
did you ever see the training page – please look here:
http://europe.rd-alliance.org/training-programme
Our primary focus for out training efforts as RDA EU is to disseminate the
RDA global results, but we also do slightly more. And also the other regions
will have some activities in this respect. Since all material is open we can all
share our efforts.
So we are already doing training, organise this with a very small team and it
is a huge challenge to organise up to 3 or 4 events per month. In all this we
act as a broker between interested people and some experts which we can
make use of from the 4000 RDA experts. It may be that you find the focus of
the training courses wrong, but then please fill in the request for ideas and
wishes.
Let me add here that we are pleased that EDISON volunteered to create a
framework which allows us to synchronise training efforts across Europe.
And I should also mention (and that is perhaps as important as what is
said above) that various countries are doing training courses
organised by active RDA groups and often these national courses are
supported by RDA Europe. As an example just look
here:https://www.dkrz.de/Nutzerportal/veranstaltungen-1/de-rda-de-trai
ningsworkshop-2016
So Hugh – what is missing and can you do more with a small team?
best
Peter
------------------------------------------------------------------------------------------------
Peter Wittenburg Tel: +49 2821 49180
***@***.*** ; ***@***.***
RDA Europe Director, RDA TAB Member, EUDAT Scientific Advisor
Senior Advisor Data Systems, Max Planck Computing and Data Facility
Gießenbachstraße 2, 85748 Garching, Germany http://www.mpcdf.de,
http://www.mpcdf.de/~pewi
former affiliation: MPI for Psycholinguistics, Nijmegen, The
Netherlands
Von: Hugh Shanahan [mailto:***@***.***]
Gesendet: Samstag, 30. April 2016 11:27
An: Jamie Shiers
Cc: Wittenburg, Peter; ***@***.***; Mark Parsons; Berman,
Fran; RDA Organisational Assembly / Organisational Advisory Board
(OAB); RDA Europe Synchronisation Assembly (SyA); RDA Europe
Synchronisation Assembly (SyA)
(***@***.***-groups-europe.org); Simon CODATA;
Harrison, Andrew
Betreff: Re: [rda-oab][synchronisation-assembly] US future thoughts
Dear all
I wanted to follow up on the email from Jamie, namely his wish for the
RDA to engage in training. I agree with him when refers to it as the under-
utilised “killer app” of the RDA.
I need hardly state that there is a huge demand for Data Science skills in
Research. What is important to note is the depth and breadth of training
that is required. The skills required differ subtly between those working in
a domain dominated by Volume and/or Velocity issues, such as High Energy
Physics and Bioinformatics, and those mostly facing the Variety issue (in,
for example, the Long Tail of Research).
It’s also important to point out that training for specialists in Data Science
is key, but training for the large numbers of researchers who will need to
have a moderate understanding of a variety of different topics within Data
Science is also essential. The recent estimated figure of 500K researchers
who need Data Science skills in Europe alone jumps out here.
There is an analogy here with Engineering – an Engineer will typically
understand some Calculus, Mechanics, Thermodynamics and Linear Algebra
and a variety of other topics. Obviously there are experts in all those fields
but that doesn’t mean an Engineer should simply hand off any matrix
inversion to an Applied Mathematician simply because she’s not done a PhD
in the topic. As it is, many researchers are wasting much of their time and
effort re-inventing the wheel, and they view Open Research with suspicion
because they don’t see the bigger picture.
The number of Masters programmes in Data Science around the world is
growing rapidly, but they appear to address only parts of the problem. As
noted by one study from the EDISON project, these programmes are often
focussed on particular sub-sets of Data Science rather than giving an
overview. Hence the danger is that without some leadership Data Science
will become a fractured discipline.
There is a need for an organisation to take the lead and propose, not
dictate, best practices in Data Science from introductory to advanced levels;
to point out that it’s necessary to have some understanding of all of it and
that being open with data and its analysis makes for more effective and
efficient research; and to set up the mechanisms to accredit individuals,
courses and degree programmes to ensure quality and maximise impact.
The RDA is ideally placed to do this. It has the authority based on its
extensive grassroots community of experts and its array of funders.
I cannot think of a more effective way of achieving the goals of the RDA.
All the best
Hugh
__________________________
Hugh Shanahan
Senior Lecturer in Bioinformatics
***@***.***
http://www.shanahanlab.org
@hughshanahan
Skype hugh_shanahan
Tel +44 (0)1784 443433
orcid.org/0000-0003-1374-6015
On 28 Apr 2016, at 11:00, Jamie Shiers <***@***.***> wrote:
Dear all,
Stimulated by these various discussions, I have written a short, very
informal note (aka brain dump) that is attached.
This lists 3 main points (as per Leif’s mail) although I confess to running out
of steam, time and page limit (1 double side of A4) on the 3rd (and possibly
most important) point.
However, as the note says, I am sure others will weigh in on this point.
Cheers, Jamie
On 27 Apr 2016, at 18:14, Peter Wittenburg <***@***.***> wrote:
Dear SyA members,
Fran from RDA US has allowed us to distribute this brainstorming note about
the future of RDA. Please take it as what it is: first ideas, from the
viewpoint of our US colleagues, on how RDA could move forward.
I think it is a great resource to stimulate our discussions in Europe as well.
best
peter
------------------------------------------------------------------------------------------------
Peter Wittenburg Tel: +49 2821 49180
***@***.*** ; ***@***.***
Attached files:
RDA_US_2.0_Proposal.pdf
--
Full post: https://rd-alliance.org/group/rda-europe-synchronisation-assembly-sya/post/us-future-thoughts
Manage my subscriptions: https://rd-alliance.org/mailinglist
Stop emails for this post: https://rd-alliance.org/mailinglist/unsubscribe/52156
Attached files:
RDA_Sustainability_____Thoughts.docx