Summer Schools for Data Science in the Low and Medium Income Countries WG - TAB Review

Working Group Title: RDA/CODATA Summer Schools in Data Science and Cloud Computing in the Developing World

Proposers: Hugh Shanahan, Andrew Harrison, Simon Hodson

Date Case Statement Received by TAB (yyyy-mm-dd): 2014-07-22

Date Review Submitted by TAB (yyyy-mm-dd): 2014-07-30

TAB Reviewers: Peter Fox, Rainer Stotzka


Rainer Stotzka appears on the list of initial members. He is not a co-author and was not involved in preparing the case statement.


Completeness of Case Statement (Does it include the six requisite components: (1) WG Charter; (2) Value Proposition; (3) Engagement with Existing Work in the Area; (4) Work Plan; (5) Adoption Plan; (6) Initial Membership?): Yes _X_; No __; Comments:

The Case Statement contains all components.


Please clarify: is the adoption plan the set of “early on agreements” by partners to deliver content and by institutions to act as hosts?


The term “Summer School” suggests education and opening opportunities rather than training and “building skills”. This matters especially if cloud-based infrastructure is emphasized: it is changing very rapidly, and unless underlying principles and foundations are taught, the training could be out of date very quickly.


Focus and Fit:  (Are the Working Group objectives and deliverables aligned with the RDA mission[1]?  Is the scope too large for effective progress, too small for an RDA effort, or not appropriate for the RDA?  Overall, is this a worthwhile effort for the RDA to take on?  Is this an effort that adds value over and above what is currently being done within the community?)

The WG plans to set up the framework to run a series of Summer Schools in the Developing World focusing on Cloud Computing and Data Science to create a cadre of individuals who can analyze, maintain and curate data sets. 


This would be valuable, but the case statement does not say how this cadre would be identified, developed, and retained over time, especially over 5 years.


It is not clearly stated how data sharing, exchange, or interoperability will be fostered by the summer schools. These topics should feature more prominently in an RDA WG, e.g. teaching how to discover openly shared research data, how to use and analyse shared data effectively, and how to share one’s own research data. Cloud infrastructures play an important role here, but the case statement creates the impression that the focus lies on cloud computing itself.


The objectives are two-fold: 

• Preparation of a proposal for funding including the work packages “Identify set of potential funders for school” and “Funding”.

This is difficult: on the one hand, this is a goal without long-term impact, and funding for a specific group is not well aligned with RDA’s mission. On the other hand, it is important to foster data sharing knowledge in the developing world.

• Outline of a typical summer school including the work packages “Agreements from partners on delivery of online resources”, “Agreements from institutions to act as host”, and “Outline for School”.

If the summer school contents are focused on “data sharing” this objective will be well aligned with RDA.


Work Plan, Deliverables, and Outcomes:  (Are there measurable, practical deliverables and outcomes?  Can the proposed work, outcomes/deliverables, and Work Plan described in the Case Statement be accomplished in 12-18 months?)

The outcomes will be a set of documents including a proposal for funding. The work plan is sound and realistic.

It will be important that the identification of funding avenues is done in concert with partners and host institutions to ensure that viable resources are pursued.


Capacity:  (Does the initial membership list include sufficient expertise, and disciplinary and international representation?  Are the right people involved in the Working Group to adopt and implement?  What individuals or organizations are missing?)

The membership seems balanced.

Impact and Engagement:  (Is it likely that the outcome(s) of the Working Group will be taken up by the intended community?  Is there evidence that the research community wants this?  Will the outcome(s) of the Working Group foster data sharing and/or exchange?)

We would recommend concentrating more on the topic “data sharing” instead of “cloud” and formulating “funding” as a secondary goal.

The case statement also assumes that existing programs either fall short or have gaps that this effort would fill. We would also like some indication of which institutions in the developing world (DW) would send participants, and would like to see them involved as stakeholders.


Recommendation:  Case Statement is Sufficient _; Case Statement Requires Revision _X_; Comments:



1. Switch priorities to align with RDA strengths and mission: concentrate more on the topic “data sharing” instead of “cloud”, and formulate “funding” as a secondary goal.

2. Strengthen statements of how all stakeholders work together, toward each of the goals.

3. Expand the discussion of the cadre aspect of the statement.

4. Consider adding more emphasis on education and clarify how “skills” will be developed.

5. Add a statement (perhaps via other RDA IG/WG interactions) on how the training may be kept current as technologies change (e.g. the curriculum could contain an evolution/revision component beyond assessments).



[1] RDA Working Groups should tangibly accelerate progress in concrete ways for specific communities with the overarching goal of increasing data-driven innovation.  Outcomes/deliverables may come in a variety of forms including: new data standards or harmonization of existing standards; greater data sharing, exchange, interoperability, usability and re-usability; greater discovery of research data sets; and better management, stewardship, and preservation of research data.


Review of the revised version

Rainer Stotzka, November 6:


The chairs have incorporated most of the recommendations and now concentrate the summer schools on data science and data sharing. The outcome will be a curriculum for the schools, including learning outcomes and forms of assessment.


From my side it looks OK.


Peter Fox, November 11:

Removing cloud computing and adding data sharing provides a much better focus and alignment with RDA. The funding constraint has also been removed. I consider this case statement approved and ready to move to the next step.