
AW: [rda-oab][synchronisation-assembly] [rda-oab][synchronisation-assembly] US future thoughts

    Hello Jamie,
    there is so much going on in RDA that I did not have the time to look through the wiki of the preservation group beforehand.
    What I have seen and read are for example interesting documents about preservation in the HEP community and there are various pointers to interesting workshops. I also saw that you refer to ISO based certification which obviously is suitable for CERN.
    Was there any attempt to come to guidelines for preservation that would work across many centres active in various disciplines? It would be great to have such guidelines. For me it would be interesting, for example, to see how to formulate certification requirements. Most communities or community centres I interacted with rely on DSA/WDS certification for practical reasons, and during the time I was responsible in CLARIN, for example, we had long debates as well and finally decided to act according to DSA and WDS (at that time separate). This email is not the place to discuss this in detail.
    So what I read is certainly interesting, but I would like to look at
    · various case studies from a range of centres and
    · generalised guidelines or so?
    They may be accessible via the wiki, but I could not find them.
    Putting all you did in a convincing case story would certainly be good as well. Of course I would find it great if there were also references to what the other RDA groups are doing. As I indicated, for example, the PP group collected a cookbook partly relevant for DM. Linking between the various groups and looking at each other’s stuff seems to be so important.
    best
    peter
    ————————————————————————————————
    Peter Wittenburg Tel: +49 2821 49180
    ***@***.*** ; ***@***.***
    RDA Europe Director, RDA TAB Member, EUDAT Scientific Advisor
    Senior Advisor Data Systems, Max Planck Computing and Data Facility
    Gießenbachstraße 2, 85748 Garching, Germany
    http://www.mpcdf.de, http://www.mpcdf.de/~pewi
    former affiliation: MPI for Psycholinguistics, Nijmegen, The Netherlands
    From: Jamie.Shiers=***@***.***-groups.org [mailto:***@***.***-groups.org] On behalf of Jamie Shiers
    Sent: Sunday, 1 May 2016 12:26
    To: Wittenburg, Peter; RDA Organisational Assembly / Organisational Advisory Board (OAB); RDA Europe Synchronisation Assembly (SyA)
    Cc: Hugh Shanahan; ***@***.***; Mark Parsons; Berman, Fran; Simon CODATA; Harrison, Andrew
    Subject: Re: [rda-oab][synchronisation-assembly] [rda-oab][synchronisation-assembly] US future thoughts
    Dear Peter,
    I will let Harry, Hugh and others talk about training as they have concrete things in place.
    As far as Collaboration is concerned, I think that the key point is to make the 4000 RDA members feel empowered.
    This is a big topic that needs wider discussion and probably some “test cases”.
    Let’s take the case of data preservation and see how this fits.
    Hilary is working on a quote for a new version of the RDA Recommendations & Outcomes booklet. I don’t know when this will appear. There are also some slides I sent to Mark in May 2015 and Fran uses some material from me in one of her courses.
    The benefits of adoption are multiple and can be hard to measure precisely. One of many examples comes from Jamie Shiers, Information Technology Department, CERN & Manager of the Data Preservation for Long-Term Analysis in High Energy Physics (DPHEP), who says that “In DPHEP (Data Preservation in High Energy Physics) we have saved person-years through the knowledge gained through the RDA including the Preservation IG and many others. The most conservative estimate I can think of is that we have saved 5 person-years and got something much better and more sustainable than we would have done otherwise. This could well be an under-estimate but 5 person-years at the EU project rate of ~EUR100K/year is quite substantial.”
    There is material that I have sent to the Preservation e-Infrastructure IG mailing list, including a status report covering the last 3 years (more or less the time of my involvement with the RDA).
    It is referenced from the attachment which will appear in an upcoming version of the CERN Courier. (This is still a draft – some small changes are likely). There is also an iPRES paper currently in review.
    What does this say? At a minimum we want to go through certification according to ISO 16363 for the WLCG Tier0. Then we will decide whether we want a formal audit or not. I intend to do the self-audit looking at how to extend it to all CERN experiments as well as to all CERN activities (the latter including also photos, videos, memos etc – the organisation’s “digital memory” as the article calls it).
    We will certainly learn a lot from this exercise: it should be completed by 2018 to allow it to input to the next round of the European Strategy for Particle Physics update (2018/2019 – the exact dates are not yet fixed).
    Coupled to this are other activities: the preparation of Active Data Management plans for WLCG, all CERN experiments (I would like to see this part of the formal approval / review process for experiments), Sharing and Re-use in practice (CMS have already made available 3 releases – some 200TB in total), Open Access Policies, Reproducibility, etc etc
    Through projects that we are involved in, we are trying to spread this to other disciplines. Also through the EIROForum IT WG.
    The timelines are strict: whilst not technically difficult, getting all the necessary formal approval (e.g. the 100+ metrics of ISO 16363) is not going to happen overnight.
    How can “the RDA” empower me / us so that the whole community benefits as much as possible?
    (Not by telling me that there haven’t been many posts on the PeIG mailing list for example).
    If we achieve the above I am immodest enough to think that it will be a pretty major achievement.
    I have said many times that we could not have come up with this plan, nor advanced so far in its implementation, without expert input from many people at RDA meetings, including WGs and IGs.
    If I had never come to Garching in September 2012, then to Gothenburg in 2013 etc we would be in a very different situation.
    Probably unable to see the wood for the trees, so I think my estimate of the effort (= money) that we have saved is almost certainly an underestimate.
    To mis-quote George Bernard Shaw, we are waiting to help, we are willing to help, we are wanting to help!
    I hope that I haven’t gone too much off track – to come back to concrete steps:
    1. We will continue to participate in WGs, IGs, BoFs etc that are relevant to our work and goals;
    2. We would like to see how the latter could be leveraged to benefit “all of RDA”, e.g. the workshops we are organising, or are likely to organise in the coming 1 / 2 / 3 years (data management, data sharing, reproducibility: “bit preservation” has been a bit overdone in our past DPHEP workshops but we still believe we have some valuable experience to share: at the 100+PB scale today and planning for up to 3 orders of magnitude more, including cost model and business case).
    3. We are happy to participate in the production of “success stories” now and in the future.
    How to take this further and generalise it?
    I would suggest taking 2-3 examples (data preservation, training, and one other) and talking about them, presumably at a plenary, as shining examples of the “power of the RDA”. End by calling for other examples to be submitted for show-casing in the future.
    Possibly follow with a 30’ discussion, explicitly trying to get “the silent majority” to speak up.
    Not easy, but it could well be a snowball effect, starting slowly and rapidly gaining size and momentum.
    Cheers, Jamie
    On 01 May 2016, at 11:47, Wittenburg, Peter wrote:
    Thanks Jamie,
    interesting point. How do you want to organise things?
    I would like to understand how to do things practically. Currently I see that when organising an event based on some request we will find good experts from our RDA member base whom we could ask to run a course etc. They have shown in the groups where they are heading, they have indicated their use cases or adoption stories, etc. It’s all on the wiki.
    The 500,000 data scientists you are talking about are a huge mass indeed – some of them we as individuals know by accident and if we are lucky we know in detail what they are doing, where they have deep knowledge, what they could contribute, etc. So if we were to organise a meeting on preservation, I know that for example you have done a lot, but I don’t know the details. Yet there is no adoption story or WG/IG output about that.
    So what is your suggestion?
    best
    peter
    —–Original Message—–
    From: Jamie Shiers [mailto:***@***.***]
    Sent: Sunday, 1 May 2016 08:56
    To: Wittenburg, Peter
    Cc: Hugh Shanahan; RDA Europe Synchronisation Assembly (SyA);
    ***@***.***; Mark Parsons; Berman, Fran; RDA Organisational
    Assembly / Organisational Advisory Board (OAB); Simon CODATA; Harrison,
    Andrew
    Subject: Re: [synchronisation-assembly] [rda-oab][synchronisation-assembly]
    US future thoughts
    Dear Peter and all,
    It is very good that so many of us see training as a key output / deliverable.
    And it is clear that there is only so much that a small team can do.
    Hence the proposal to leverage the ~4000 strong “RDA Collaboration”.
    But equally important IMHO is to balance “push” with “pull” (from the
    organisations / projects that the “RDA Collaboration” represents and bridges
    to). (The bi-directional engagement as I call it).
    Then we could truly say: “The Worldwide RDA Collaboration, that represents
    science at all scales, mobilises to target the ‘missing 500,000’ data scientists.
    It offers training not only in core data principles and values but also addresses
    the specific needs of the various communities and projects concerned. This
    allows the RDA to implement its vision of ‘researchers and innovators
    openly sharing data across technologies and disciplines and countries to
    address the grand challenges of society’.”
    Scalable. Sustainable. Implementable. Workable.
    Cheers, Jamie
    On 30 Apr 2016, at 16:16, Peter Wittenburg
    wrote:
    Dear Hugh, all,
    did you ever see the training page – please look here:
    http://europe.rd-alliance.org/training-programme
    Our primary focus for our training efforts as RDA EU is to disseminate the
    RDA global results, but we also do slightly more. And also the other regions
    will have some activities in this respect. Since all material is open we can all
    share our efforts.
    So we are already doing training, organise this with a very small team and it
    is a huge challenge to organise up to 3 or 4 events per month. In all this we
    act as a broker between interested people and some experts which we can
    make use of from the 4000 RDA experts. It may be that you find the focus of
    the training courses wrong, but then please fill in the request for ideas and
    wishes.
    Let me add here that we are pleased that EDISON volunteered to create a
    framework which allows us to synchronise training efforts across Europe.
    And I should also mention (and that is perhaps as important as what is
    said above) that various countries are doing training courses
    organised by active RDA groups and often these national courses are
    supported by RDA Europe. As an example just look here:
    https://www.dkrz.de/Nutzerportal/veranstaltungen-1/de-rda-de-trainingsworkshop-2016
    So Hugh – what is missing and can you do more with a small team?
    best
    Peter
    ————————————————————————————————
    Peter Wittenburg Tel: +49 2821 49180
    ***@***.*** ; ***@***.***
    RDA Europe Director, RDA TAB Member, EUDAT Scientific Advisor
    Senior Advisor Data Systems, Max Planck Computing and Data Facility
    Gießenbachstraße 2, 85748 Garching, Germany http://www.mpcdf.de,
    http://www.mpcdf.de/~pewi
    former affiliation: MPI for Psycholinguistics, Nijmegen, The
    Netherlands
    From: Hugh Shanahan [mailto:***@***.***]
    Sent: Saturday, 30 April 2016 11:27
    To: Jamie Shiers
    Cc: Wittenburg, Peter; ***@***.***; Mark Parsons; Berman,
    Fran; RDA Organisational Assembly / Organisational Advisory Board
    (OAB); RDA Europe Synchronisation Assembly (SyA); RDA Europe
    Synchronisation Assembly (SyA)
    (***@***.***-groups-europe.org); Simon CODATA;
    Harrison, Andrew
    Subject: Re: [rda-oab][synchronisation-assembly] US future thoughts
    Dear all
    I wanted to follow up on the email from Jamie, namely his wish for the
    RDA to engage in training. I agree with him when he refers to it as the
    under-utilised “killer app” of the RDA.
    I don’t have to state the obvious that there is a huge requirement of Data
    Science skills in Research. What is important to note is the depth and
    breadth of training that is required. The skills required change subtly from
    those working in a domain dominated by Volume and/or Velocity issues such
    as High Energy Physics and Bioinformatics to those mostly facing the Variety
    issue (in, for example, the Long Tail of Research).
    It’s also important to point out that training for specialists in Data Science
    is key, but training for the large numbers of researchers who will need to
    have a moderate understanding of a variety of different topics within Data
    Science is also essential. The recent estimated figure of 500K researchers
    who need Data Science skills in Europe alone jumps out here.
    There is an analogy here with Engineering – an Engineer will typically
    understand some Calculus, Mechanics, Thermodynamics and Linear Algebra
    and a variety of other topics. Obviously there are experts in all those fields
    but that doesn’t mean an Engineer should simply hand off any matrix
    inversion to an Applied Mathematician simply because she’s not done a PhD
    in the topic. As it is many researchers are wasting much of their time and
    effort re-inventing the wheel and view Open Research with suspicion
    because they don’t see the bigger picture.
    The number of Masters programmes in Data Science around the world is
    growing rapidly, but they appear to be addressing only parts of the problem. As
    noted by one study from the EDISON project, these programmes are often
    focussed on particular sub-sets of Data Science rather than giving an
    overview. Hence the danger is that without some leadership Data Science
    will become a fractured discipline.
    There is a need for an organisation to take the lead and propose, not
    dictate, best practices in Data Science from introductory to advanced levels;
    to point out that it’s necessary to have some understanding of all of it and
    the fact that being open with data and its analysis makes for more effective
    and efficient research; to set up the mechanisms to accredit individuals,
    courses and degree programmes to ensure quality and maximise impact.
    The RDA is ideally placed to do this. It has the authority based on its
    extensive grassroots community of experts and its array of funders.
    I cannot think of a more effective way of achieving the goals of the RDA.
    All the best
    Hugh
    __________________________
    Hugh Shanahan
    Senior Lecturer in Bioinformatics
    ***@***.***
    http://www.shanahanlab.org
    @hughshanahan
    Skype hugh_shanahan
    Tel +44 (0)1784 443433
    orcid.org/0000-0003-1374-6015
    On 28 Apr 2016, at 11:00, Jamie Shiers wrote:
    Dear all,
    Stimulated by these various discussions I have written a short very informal
    note (aka brain dump) that is attached.
    This lists 3 main points (as per Leif’s mail) although I confess to running out
    of steam, time and page limit (1 double side of A4) on the 3rd (and possibly
    most important) point.
    However, as the note says, I am sure others will weigh in on this point.
    Cheers, Jamie
    On 27 Apr 2016, at 18:14, Peter Wittenburg
    wrote:
    Dear SyA members,
    Fran from RDA US allowed us to distribute this brainstorming note about
    the future of RDA. Please take it as what it is: first ideas on how RDA could
    move from the viewpoint of our US colleagues.
    I think it is a great resource to stimulate our discussions in Europe as well.
    best
    peter
    ————————————————————————————————
    Peter Wittenburg Tel: +49 2821 49180
    ***@***.*** ; ***@***.***
    Attached files:
    RDA_US_2.0_Proposal.pdf

    Full post:
    https://rd-alliance.org/group/rda-europe-synchronisation-assembly-sya/post/us-future-thoughts
    Manage my subscriptions: https://rd-alliance.org/mailinglist
    Stop emails for this post:
    https://rd-alliance.org/mailinglist/unsubscribe/52156
    Attached files:
    RDA_Sustainability_____Thoughts.docx

    Full post:
    https://rd-alliance.org/group/rda-europe-synchronisation-assembly-sya/post/aw-rda-oabsynchronisation-assembly-us-future
    Manage my subscriptions: https://rd-alliance.org/mailinglist
    Stop emails for this post:
    https://rd-alliance.org/mailinglist/unsubscribe/52208
