status: Recognised & Endorsed

Chair(s): Jianhui Li, Rainer Stotzka, Robert Quick

Group Email: [group_email]

Secretariat Liaison:


The Data Fabric IG (DFIG) identified that working with data in many scientific labs, and most probably also in other areas such as industry and governance, is highly inefficient and too costly. Excellent scientists working on data-intensive science tasks are forced to spend about 75% of their time managing, finding, combining and curating data, which is a waste of time and capacity. The DFIG is therefore looking at the data creation and consumption cycle to identify opportunities to optimize the work with data, to place current RDA activities in the overall landscape, to see what other communities are doing in this area, and to foster testing and adoption of RDA outputs. Ultimately, the goal of DFIG is to identify common components and define their characteristics and services so that they can be used across boundaries and combined to solve a variety of data scenarios, such as replicating data in federations, developing virtual research environments, and automating regular data management tasks. Much important work is being done on data publishing and citation, but DFIG believes that we need to start early, in the "Data Fabrics" in the labs, to organize, document and manage data professionally if we want to meet the requirements of the coming decades.

  

DFIG is focusing on the data creation and consumption cycle as it happens daily in scientific and industrial labs, and on identifying ways to make this work more efficient and thus more cost-effective.

DFIG's goal is to identify common components and define their characteristics and services so that they can be used across boundaries and combined to solve a variety of data scenarios.

Throughout its existence, DFIG has shepherded multiple spin-off groups into existence, dealing with specific aspects of the cycle and the components involved, particularly Persistent Identifiers (PIDs) and their relevance and applicability to data referencing and management issues. These efforts have brought forth a new understanding, which is summarized in an overview document here.

The group is currently reassessing the overall landscape to identify the next challenges, components, or other work areas of interest. An overview is contained in The Future Trends for the Data Fabric.

 

An essential topic of the IG is FAIR Digital Objects; see the wiki page, which includes regular meetings and information material, here:
RDA IG Data Fabric: FAIR Digital Objects 

Outputs

20 July 2018

Persistent identifiers: Consolidated assertions

by Tobias Weigel

Experts from 47 European research infrastructure initiatives and ERICs have agreed on a set of assertions about the nature, the creation and the usage of Persistent Identifiers (PIDs).


19 July 2018

Summary of Virtual Layer Recommendations

by Tobias Weigel

This output describes a concept for establishing a comprehensive network of components for the management of digital objects. The concept follows principles of layering and modularity and describes a set of key components required to realize a network for e-infrastructures in practice.

