The meeting will focus on Data Fabric governance and implementation, and will also introduce new areas of innovation in automating the Data Fabric with Machine Learning / Artificial Intelligence.
1) International Data Fabric Governance
The Data Fabric is built from infrastructures and applications that rely on a diverse set of components. The interplay between components is described at the technical level via recommended interfaces and protocols. However, to fully realize the Data Fabric vision, governance must also be addressed as part of implementation. Focus topics for the session, at the heart of Data Fabric governance and cross-disciplinary implementation, include:
- Governance of different registries and other components and determining the need for overarching governance framework(s)
- What would a win-win governance model for cross-continental data fabric/infrastructure cooperation look like?
- What is a good mechanism for governing data fabric/infrastructure at the institutional, national and international levels?
- How does data fabric governance relate to practical implementation, e.g. at cloud storage level?
2) Towards the Intelligent Data Fabric
Further automating the data-centric processes in the labs is a key concern within the Data Fabric scope. Machine Learning (ML) and Artificial Intelligence (AI) now offer the means to automate these processes on a wide scale, across disciplines and existing services or e-infrastructures. The objective of the meeting is therefore to discuss how to apply AI and ML concepts in practical implementation examples such as:
- automating metadata capture and transformation from structured and unstructured data
- supporting service chaining for data processing
- automatically categorizing data products for metadata management, search and service discovery
- reducing and summarizing data
- enhancing data discovery through intelligent user assistance and recommender systems
Collaborative session notes: https://docs.google.com/document/d/1f1vUe_M76dnhGnRP_IQ9Zye3CLpAAxnM2AHTvOSKQvs/edit?usp=sharing
- Introduction to Data Fabric and governance challenge (10 min)
- International Data Fabric Governance and cross-continental implementation updates (max 30 min)
  - DONA, GO-FAIR perspective (tbd)
  - CNIC (Liu JIA)
  - cs3mesh4eosc (Guido Aben)
  - Implementation updates, new projects (tbd)
- Discussion (35 min)
- Towards the Intelligent Data Fabric: Innovation discussion (15 min)
Data infrastructure builders, data repository managers, research software engineers, data and computing centers
Participants should familiarize themselves with the basic ideas and concepts of the Data Fabric; see informative material.
The Data Fabric group is concerned with improving the efficiency and effectiveness of the data generation and consumption cycle as it exists in the labs today. Across disciplines, a lot of manual effort goes into finding, preparing, managing and curating data, effort that could otherwise be spent on research. The Data Fabric IG (DFIG) is therefore looking at the data creation and consumption cycle to identify opportunities to optimize the work with data, to place current RDA activities in the overall landscape, to look at what other communities are doing in this area, and to foster testing and adoption of RDA outputs.
The group has been active since the earliest days of the RDA and holds frequent meetings, both in person at plenaries and virtually.