Advancing data-driven research through the Data Commons - P6 BOF session

25 September 2015 - BREAKOUT 7 - 09:30


Research is increasingly, and in many cases exclusively, data driven. Knowledge of how to use tools to manipulate research data, and the availability of e-infrastructures to support them, are foundational.

New communities of practice are forming around interests in digital tools, computing facilities and data repositories. By making infrastructure services, community engagement and training inseparable, existing communities can be empowered by new ways of doing research, and can grow around interdisciplinary data that can be easily accessed and manipulated via innovative tools as never done before.

Key enablers of this process are openness, participation, collaboration, sharing and reuse, together with the definition of standards and the enforcement of their adoption within the community. In some disciplinary areas, communities have succeeded in defining and adopting community governance of the production and sustainable sharing of data and applications. The resource management and governance principles for data and tools are called the Data Commons, and include aspects such as the enforcement of standards, community data release policies, rules governing data protection, protective patenting strategies and legal frameworks for data reuse.

One example of a commons is the Genome Commons, which refers to “the massive accumulation of data from large-scale study of the genetic makeup of humans and other organisms that began with the Human Genome Project in the late 1980s and continues today in many government and privately funded projects. These projects have made a vast quantity of genetic information available in public databases across the globe”. As a result, “today the free availability of genomic data is a fundamental feature of the research landscape”.

In astronomy, initiatives like IVOA (the International Virtual Observatory Alliance) have helped define data interoperability standards that allow astronomical datasets and other resources to work as a seamless whole, in collaboration with many projects and data centres worldwide. The vast majority of the technology standards being developed in IVOA are not tied to a specific discipline, but are fully generic and could serve a wide range of research communities in the quest for a multidisciplinary Data Commons.

In the BoF we will review the state of play in the Data Commons and the best practices and success stories available so far. We will discuss how these could be applied to other research domains that are still lagging behind, and the opportunities arising from the many large-scale scientific research infrastructures now in their construction stage. Adopting the Data Commons could avoid the high costs of establishing the needed ICT infrastructures, avoid duplicated effort and support multidisciplinary research more effectively.

The BoF will address topics such as:

  • The actions needed to foster research data integration through a federated approach.
  • The problems currently faced in providing scalable access to, and analysis of, research data for efficient reuse.
  • The role of communities developing around research data, computing and tools in fostering data integration and a culture of open data: how can we foster the flourishing of these communities?
  • The issues in implementing commons-oriented governance, and the actions needed to develop a culture of sharing and participation that breaks today’s barriers between traditional e-Infrastructures and research communities.

The BoF complements existing RDA WGs and IGs by focusing on the governance and policy aspects of community sharing in the Data Commons, rather than the technical enablers.

Remote participation in the BoF is supported on a best-effort basis via WebEx (passwd: RDA-BOF).

GO TO Slides

 

AGENDA

1. Welcome and introduction to the objectives of the BoF (10')

2. Research data sharing: best practices, needs and challenges. The point of view of research communities (40')

  • Genomics data commons, J. Korbel, European Molecular Biology Laboratory (EMBL)
  • Access to open research data for environmental science: the LifeWatch experience in biodiversity. Jesus Marco de Lucas, Instituto de Fisica de Cantabria, ES
  • Adopting standards to enable sharing of open research data, the experience of the International Virtual Observatory Alliance (IVOA) for Astronomy. David Schade, Canadian Astronomy Data Centre, NRC Herzberg Institute of Astrophysics (NRC-HIA), CA
  • Distributed data repository integration for brain research. Sean Hill, HBP/SP5 Neuroinformatics co-director

Convenors: Matthew Viljoen and Tiziana Ferrari, EGI.eu

 



    Author: Tiziana Ferrari

    Date: 29 Sep, 2015

    An excellent session, from my point of view, excellent contributions from all disciplines showing how the availability of integrated computing and data infrastructures together with standardization are necessary enablers of open science. Thanks to all speakers!
