Computational Notebooks

27 Nov 2019

Submitted by Hugh Shanahan

Meeting objectives: 

Notebooks, specifically Jupyter notebooks (but also notebooks from platforms such as RStudio), are generating a great deal of excitement and are potentially a significant step forward in terms of reproducibility, education, code documentation and academic publishing. A recent paper found nearly 1.2 million unique notebooks based on searches through GitHub; an even more recent blog post gives evidence of nearly 5 million notebooks on GitHub.

This represents colossal growth in use by researchers. The RDA absolutely needs to engage with this, as notebooks enable the sharing of both data and software.

More recently, the deployment of resources such as JupyterHub, myBinder and EGI Notebooks means that notebooks can be deployed on a cloud, which obviates issues with installed libraries. One potential result is that notebooks would become the easiest pathway for users to access cloud computing resources for research, which could transform the use of such resources. It is clear that frameworks such as EOSC and the US National Data Service will make extensive use of notebooks and that their offerings will be affected by them.

An Interest Group will be proposed to match this BoF session, building on the BoF held at RDA Helsinki. A set of topics (discussed in detail in the objectives below) has been arrived at that a) would make a tangible contribution to the notebook community and b) would make use of the configuration of expertise within the RDA community that would not be available elsewhere. Briefly, these are:

  • Publishing notebooks

  • Long term preservation of notebooks

  • Notebooks as FAIR digital objects

  • Notebooks for Big data and compute

The RDA community benefits from this as it provides an opportunity to shape how this rapidly expanding technology is deployed and used.

Those who have followed up on these topics since the BoF in Helsinki will report back and, as an output of the meeting, one or more WGs will be proposed on the basis of those topics.

Martin Fenner of the software source code identification WG has agreed to coordinate between the two groups.

Meeting agenda: 
  1. Introduction. (5 minutes, Hugh Shanahan)

  2. Report back on Publishing notebooks. (5 minutes, Martin Fenner, member of the software source code identification WG)

  3. Report back on Long term preservation of notebooks. (5 minutes, Patricia Herterich)

  4. Report back on Notebooks as FAIR digital objects. (5 minutes, Rob Quick)

  5. Report back on Notebooks for Big data and compute. (5 minutes, Gergely Sipos)

  6. Break-out groups to discuss proposals for WGs. (45 minutes)

  7. Report back and recommendations for WGs. (20 minutes)

Type of Meeting: 
Working meeting
Short introduction describing any previous activities: 

A successful BoF was run on this topic at RDA Helsinki where the above topics were initially discussed.

This session proposal is based on a wide variety of activities; names included in parentheses indicate individuals who have expressed an interest in this proposal. Software citation (Martin Fenner, Neil Chue Hong) has an active IG and WG in the RDA, and providing guidelines for the citation of notebooks would be an excellent use case for these groups. The research library community is extensively exploring the potential of notebooks to facilitate data and software management and reproducibility (Rosie Higman, Jez Cope and Patricia Herterich). Notebooks provide an interesting use case for PIDs in data (Rob Quick) and likewise for computational services (Gergely Sipos, Enol Fernández and Christine Kirkpatrick). This meeting also provides an opportunity for deployment services such as myBinder (Tim Head) to engage with the RDA. Finally, the Neutron and Photon Science community (Brian Matthews and Frank Schluenzen) and the relevant IG make very extensive use of notebooks and hence complement these activities.
