The Data Fabric IG (DFIG) identified that working with data in the many scientific labs and most probably also in other areas such as industry and governance is highly inefficient and too costly. Excellent scientists working on date intensive science tasks are forced to spend about 75% of their time to manage, find, combine and curate data. What a waste of time and capacity. The DFIG is therefore looking at the data creation and consumption cycle to identify opportunities to optimize the work with data, to place current RDA activities in the overall landscape, to look what other rcommunities are doing in this area and to foster testing and adoption of RDA outputs. The goal of DFIG finally is to identify common components and define their characteristics and services that can be used across boundaries in such a way that they can be combined to solve a variety of data scenarios such as replicating data in federations, developing virtual research environments, and automating regular data management tasks. Much important work is being done on data publishing and citation, but DFIG believes that we need to start at early moments in the "Data Fabrics" in the labs to organize, document and manage data professionally if we want to meet the requirements of the coming decades.
DFIG is focusing on the data creation and consumption cycle as it happens daily in the scientific and industrial labs and on the identification of ways to make this work more efficiently and thus more cost-effective.
DFIG's goal is to identify common components and define their characteristics and services that can be used across boundaries in such a way that they can be combined to solve a variety of data scenarios.
Throughout its existence, DFIG has shepherded multiple spin-off groups into existence, dealing with specific aspects of the cycle and components involved, particularly regarding Persistent Idenfiers (PIDs), their relevance and applicability to address data referencing and management issues. These efforts have brought forth a new understanding which is summarized in an overview document here.
The group is currently reassessing the overall landscape in trying to identify the next challenges, components or other work areas of interest. An overview is contained in The Future Trends for the Data Fabric.