Data Discovery Paradigms IG


Group details

Secretariat Liaison: 
Kathy Fontaine
TAB Liaison: 
Andrew Treloar
Case Statement: 
IG Established

RDA Interest Group Charter

Name of Interest Group:  Data Discovery Paradigms IG


An emerging statement on research data is that it should be FAIR: “Findable, Accessible, Interoperable and Reusable”. To comply with the first of these criteria, being Findable, we need a data infrastructure that supports users in discovering research data regardless of its location or the manner in which it is stored, described and exposed. This is a significant and growing challenge, as both the number of research data repositories and the need for cross-disciplinary data discovery increase. This interest group aims to explore the common elements and issues shared by those who search for data and those who build systems that enable data search.



The objective of this interest group is to provide a forum where representatives from across the spectrum of stakeholders and roles pertaining to data search can discuss issues related to improving data discovery. The goal is to identify concrete deliverables such as a registry of data search engines, common test datasets, usage metrics, and a collection of data search use cases and competency questions.


Key questions the IG wishes to address:

At RDA P8, we identified a long list of topics pertaining to data discovery, which group members then narrowed by vote to a shortlist of 10 topics. The top 5 of these were selected as the key Task Forces on which the group is focusing (linked to the wiki page for each Task Force).

Two of the Task Forces are scheduled to be officially closed at RDA P11 after producing their corresponding outputs; specifically TFs:

The following two Task Forces are active and continue their work as planned: 

Finally, we have four additional topics for potential Task Forces that will be initiated in the near future, i.e.

  • Cataloging Common APIs for Data Query

  • Using to improve dataset description and discoverability by search engines

  • Granularity and domain-specific / cross-domain issues

  • Data Discovery for Institutional Repositories

For more details on the full list of potential task forces and the process followed in selecting them, please see the page on Task Forces.


Related Activities:

  • NASA’s WG on Search Relevancy – focus is on improving search result relevance for EOSDIS data
  • ESIP’s Information Quality Cluster and NASA’s WG on Data Quality are both addressing ways of capturing and conveying quality information
  • W3C’s Best Practices for Spatial Data on the Web aims to improve discoverability and accessibility of geodata

Other RDA IGs whose activities are of interest and with whom we will interact:

  • Metadata
  • Registries
  • Brokering
  • PIDs
  • Research data collections



The Data Discovery Paradigms Interest Group is open to all members and encourages active participation through the Task Force mechanism. Task Forces hold phone conferences on a regular basis. To become active in the Task Forces or to propose other activities for the IG, please contact the Chairs.



Recent Activity

25 Jan 2018

Methods for auto-generating structured metadata

Hello DDPIG Members,

The Metadata Enhancement Task Force, a sub-committee of the
Research Data Alliance Data Discovery Paradigms Interest Group, is
investigating methods for auto-generating structured metadata.
Enhancing the metadata records associated with research data
improves search and supports automation services such as data
processing workflows.

20 Dec 2017

RE: [datadiscovery] Re: [datadiscovery] The RDA Data Discovery interest Group:...

That is something that NOAA has implemented (and found it to be startlingly effective), with their oceanic and climate data.
Jeff DLB could tell you if what they did is actually relevant to this topic and whether there might be someone at NOAA interested in this proposed task force.
From: sjskhalsa [mailto:***@***.***]
Sent: Tuesday, December 19, 2017 10:22 AM
To: Dewaard, Anita (ELS-HBE) <***@***.***>; Pete McQuilton

19 Dec 2017

Re: [datadiscovery] The RDA Data Discovery interest Group: Looking back and...

Hi all,
Thanks SiriJodha for the Earthcube link, the documentation looks very
I'd like to add another link if I may, for the bioschemas initiative
(, which aims to improve data interoperability in
the life sciences, through and 'bioschema' extensions, where

12 Dec 2017

RE: [datadiscovery] The RDA Data Discovery interest Group: Looking back and...

Dear Kerstin,
Thank you very much for your interest and enthusiasm; indeed, discussing granularity and domain-specific / cross-domain issues is an important aspect of Data Discovery and a great concept for a Task Force.
Would you be interested in leading such a Task Force? If so, what would be your general plan of action (i.e. the particular objectives and expected outcomes) for the TF?

01 Dec 2017

The RDA Data Discovery interest Group: Looking back and looking ahead

Greetings, members of the Data Discovery Paradigms Interest Group!
We want to make sure we get this message out to you as the end of 2017 comes into view, to let you know of the developments we’ve been working on, as well as new opportunities for 2018.
In particular, we are interested in your thoughts on forming and joining a new set of Task Forces (see point 3), before the end of the year. But first, here’s what we’ve been up to!
1. Meetings.
1. RDA P10:

03 Nov 2017

Re: [datadiscovery] UTC 8pm, Tuesday 2 Nov.: A kickoff discussion with Dr....

Dear Ming,
Thanks so much for recording and posting the presentation.
Helping repositories make their datasets more discoverable and
useful has been a primary goal of the DDPIG since its inception,
and use of markup is mentioned in Recommendation 9 of
our Best Practices for Data Repositories: "Make records easily
indexed and searchable by major web search engines".
I interpreted Natasha's suggestion regarding the focus of a new
DDPIG group to be that it should work with the community on

27 Oct 2017

Re: [datadiscovery] UTC 8pm, Tuesday 2 Nov.: A kickoff discussion with Dr....

Hi Jennie,
Please go ahead and share the meeting ID with your colleagues.
To All: Sorry, I didn't make the meeting agenda clear in my last
email. Here it is:
* Natasha: Data search at Google and Google's guideline for data
repositories (~15-20 mins)
* All: Q&A and group discussion (~30 mins)
* All: Discussion of starting a new task force within the RDA Data
Discovery Paradigm IG (~10 mins)

26 Oct 2017

UTC 8pm, Tuesday 2 Nov.: A kickoff discussion with Dr. Natasha Noy from Google on making data discoverable by web search engines

Dear All,
I am glad to inform you that Dr. Natasha Noy from Research at Google has
accepted our invitation to introduce us to the effort Google has put into data
discovery and Google's guidelines for data repositories to make data more
easily discoverable by web search engines.
You may recall that our Best Practices Task Force has drafted a white paper

04 Sep 2017

Reminder: Relevancy Ranking TF meeting: 5 Sept. at 11am UTC

Dear All,
The next relevancy ranking TF meeting is on 5th Sept. at 11am UTC.
We sent a survey to collect information about current practices of data
repositories in setting up data search services. As of 1st Sept., we
got 114 responses, 20 of which are incomplete. A preliminary summary
of 94 complete responses is available here
More analyses on possible correlations will be discussed.
Here is the information for joining the meeting.

16 Aug 2017

Re: [datadiscovery] Interesting Article on Data Discovery

This might be an interesting read for this group:
International Journal on Digital Libraries, pp 1–16
Extracting discourse elements and annotating scientific documents using the SciAnnotDoc model: a use case in gender documents
Authors Hélène de Ribaupierre, Gilles Falquet