High Performance Computing (HPC), High Throughput Computing (HTC) and High-Performance Data Analytics (HPDA) are central to modern research across a range of fields, and more and more of the world’s universities and research institutions provide their research communities with the necessary supercomputing facilities. However, these facilities are seldom closely coupled with institutional data management services and workflows. This session will explore what barriers, whether technical or social, keep the worlds of data and computing separate, what can be done to overcome these barriers, and what might be achieved by doing so. We seek to address the following questions:
- What could or should be gained by bringing together supercomputing ecosystems with institutional research data management systems?
- What can be done in practical terms to better integrate supercomputing facilities with data management?
- What metadata could or should be collected related to data outputs from HPC/HTC/HPDA environments?
- Who needs to be involved to improve the situation? What are the roles of the CIO, academic researchers, IT Services, and the Library?
- What are people or institutions already doing to bring compute and data management together? What can others learn from them?
Collaborative meeting notes: https://docs.google.com/document/d/1HOrHHTF-yQytbKNNCsxT4dj-6qXE9qoIxl-P...
Introduction to RDARI (5 mins)
Introduction to the session - problem statements (5 mins)
Presentation - Institutional case study 1 (10 mins) - Wind Cowles & Kurt Hillegas (Princeton)
Presentation - Institutional case study 3 (10 mins) - Kenjiro Taura (Tokyo University)
Presentation - Institutional case study 2 (10 mins) - Melissa Craigin (SDSC)
Guided discussion (35/45 mins)
Discussion of RDARI survey - should we repeat the 2019 Survey of Institutional Research Data Services? (10 mins)
Wrap up (5 mins)
This session will be of interest to anyone working for a university or research institution who is interested in better integrating its compute and data management facilities. This may include research data infrastructure managers, ICT architects, senior managers with responsibility for research IT services, developers, data managers, data engineers, and data scientists. Representatives of service or technology vendors are of course also welcome.
The Research Data Architectures in Research Institutions Interest Group (RDARI) is primarily concerned with technical architectures for managing research data within universities and other multi-disciplinary research institutions. It provides insight into the approaches being taken to the development and operation of such architectures and their success or otherwise in enabling good practice.
Recognized and endorsed. Active since 2018.