Management of Computational Notebooks
Submitted by Hugh Shanahan
Notebooks, specifically Jupyter notebooks (but also notebooks from platforms such as RStudio), are generating a great deal of excitement and are potentially a significant step forward for reproducibility, education, code documentation and academic publishing. A recent paper found nearly 1.2 million unique notebooks through searches of GitHub.
More recently, the deployment of resources such as JupyterHub, myBinder and EGI notebooks means that notebooks can be deployed on a cloud, which obviates issues with installed libraries. One potential result is that notebooks would become the easiest pathway for users to apply cloud computing resources to research, which could transform the use of such resources. It is clear that frameworks such as EOSC and the US National Data Service will make extensive use of notebooks and that their offerings will be affected by them.
There is an extensive body of research on notebooks (for example, the JupyterCon conference series), but the expertise of RDA members can make important contributions in this area. The overall objectives are to initially consider the following:
- Citation of notebooks that conform to Software Citation standards.
- Integration of notebooks with data sources (e.g. EGI DataHub or, more abstractly, data sources with a PID).
- Deploying notebook services on large scientific computational platforms.
- Long-term preservation of notebooks without losing functionality.
This will be carried out through relatively short introductions to the above topics, followed by break-out groups on those topics to consider what concrete steps can be taken within the RDA, e.g. through the formation of an IG and/or WGs.
Collaborative session notes: https://docs.google.com/document/d/1ore_8jtw1SR-B6N6eBy0GUfkM1LkHIrUYO0m2SM3Svc/edit?usp=sharing
Please use the Twitter hashtag #RDACompNotebooks if you are referencing this session.
1. A brief introduction to notebooks.
2. Publishing notebooks (Martin Fenner, DataCite)
3. Notebooks and long-term preservation (Patricia Herterich, DCC)
4. Notebooks and FAIR digital objects (Christine Kirkpatrick, UCSD/US National Data Service)
5. Deploying notebooks on large scientific computational platforms (Gergely Sipos, EGI)
6. Break-out groups to discuss topics 2-5.
7. Consideration of next steps.
This session proposal is based on a wide variety of activities; names in parentheses indicate individuals who have expressed an interest in this proposal. Software citation (Martin Fenner, Neil Chue Hong) is the subject of an active IG and WG in the RDA, and providing guidelines for the citation of notebooks would be an excellent use case for these groups. The research library community is extensively exploring the potential of notebooks to facilitate data and software management and reproducibility (Rosie Higman, Jez Cope and Patricia Herterich). Notebooks provide an interesting use case for PIDs in data (Rob Quick) and likewise for computational services (Gergely Sipos, Enol Fernández and Christine Kirkpatrick). This meeting also provides an opportunity for deployment services such as myBinder (Tim Head) to engage with the RDA. Finally, the Neutron and Photon Science community (Brian Matthews and Frank Schluenzen) and the relevant IG make very extensive use of notebooks and hence complement these activities.
A short motivation for this proposal can be found at
https://github.com/rdanotebooksbof/outline/blob/master/motivation.md
The following is a set of relevant links:
Introduction to Jupyter notebooks - https://jupyter.org/about
Introduction to RStudio notebooks - https://bookdown.org/yihui/rmarkdown/notebook.html
myBinder - https://gke.mybinder.org/
EGI notebooks - https://notebooks.egi.eu/hub/login
REMOTE ACCESS INSTRUCTIONS
Please join my meeting from your computer, tablet or smartphone.
https://global.gotomeeting.com/join/277853989
You can also dial in using your phone.
United States: +1 (224) 501-3217
Access Code: 277-853-989
More phone numbers
Australia: +61 2 9087 3604
Austria: +43 7 2081 5427
Belgium: +32 28 93 7018
Canada: +1 (647) 497-9350
Denmark: +45 32 72 03 82
Finland: +358 923 17 0568
France: +33 170 950 594
Germany: +49 692 5736 7317
Ireland: +353 15 360 728
Italy: +39 0 230 57 81 42
Netherlands: +31 207 941 377
New Zealand: +64 9 280 6302
Norway: +47 21 93 37 51
Spain: +34 932 75 2004
Sweden: +46 853 527 827
Switzerland: +41 225 4599 78
United Kingdom: +44 330 221 0088
New to GoToMeeting? Get the app now and be ready when your first meeting starts:
https://global.gotomeeting.com/install/277853989