The Data Fabric IG (DFIG) identified that working with data in many scientific labs, and most probably also in other areas such as industry and government, is highly inefficient and too costly. Excellent scientists working on data-intensive science tasks are forced to spend about 75% of their time managing, finding, combining and curating data. What a waste of time and capacity. The DFIG is therefore looking at the data creation and consumption cycle to identify opportunities to optimize the work with data, to place current RDA activities in the overall landscape, to look at what other communities are doing in this area, and to foster testing and adoption of RDA outputs. The ultimate goal of DFIG is to identify so-called Common Components and define their characteristics and services so that they can be used across boundaries and combined to solve a variety of data scenarios, such as replicating data in federations, developing virtual research environments, etc. Much important work is being done on data publishing and citation, but DFIG believes that we need to start at early stages of the "Data Fabrics" in the labs to organize, document and manage data professionally if we want to meet the requirements of the coming decades.
DFIG is focusing on the data creation and consumption cycle as it happens daily in scientific and industrial labs, and on identifying ways to make this work more efficient and thus more cost-effective.
DFIG's discussions have produced various spin-offs, such as work on a Repository Registry, the acceleration of testing activities, the substantial terminology problems we face, self-registration of Common Components (CoCos), etc. These will appear partly on this site but will also fork into new RDA groups or be dealt with elsewhere.
Current Core Group Activities
- Use Cases (descriptions of concrete "data fabrics" in various labs)
- Composition Building - Finding Minimal Metadata for PIDs
- Composition Building - Towards the Global Digital Object Cloud
- Recommendations for Implementing a Virtual Layer for Management of the Complete Life Cycle of Scientific Data
- Broker-Driven Core Component Workflows