Generalist Repositories - The challenges of well-documented, reusable data

04 Aug 2020

Submitted by Shelley Stall

Meeting objectives: 

Generalist repositories offer an important option to preserve data when no domain or specialized repository is available.  This BoF focuses on how researchers working in disciplines with no domain repository can make their data as well-documented as possible to increase research transparency and optimize dataset reuse. 

Research discipline communities with no domain repository for their types of data have an important role to play in guiding their researchers on how best to document their data. Can partnerships with data librarians or organizations like the Data Curation Network be part of the solution? For data to be well understood by others, what other elements must accompany the dataset? Can links to the paper help support transparency? In this BoF we will use research data supporting the biological sciences, geosciences, and social sciences that has no domain or specialist repository as use cases to explore these questions and to develop a better understanding of the steps needed to make data in a generalist repository better understood and more reusable.


Meeting agenda: 

Collaborative Notes Link:



  1. Welcome and Introductions (10 min) - Set the stage for the session and invite participants to introduce themselves and share their interest in the session.
  2. Keynote #1 (15 min):  Ixchel Faniel, OCLC - What researchers need when deciding whether to reuse data: Experiences from three disciplines
  3. Keynote #2 (15 min): Lisa Johnston, University of Minnesota Libraries - Twin Cities, Data Curation Network - The value of curation and the importance for data reuse.
  4. Q&A (10 min)
  5. Breakout Sessions (20 min): Divide the participants into two or three groups to consider different aspects necessary to optimize the possibility of data reuse when a generalist repository is being used. Assign a facilitator, a timekeeper, and a scribe to each group. Establish three main points to share with the larger group.
  6. Breakout Reports (15 min): Each team shares its three main points. This will give 6-9 possible ideas to pursue in some combination. Use Menti (or similar) to have participants rank which ideas are most worth pursuing as a pilot with one or more communities.
  7. Solidify Next Steps (10 min) - Discuss recommended next steps and how they will be accomplished.
Type of Meeting: 
Working meeting
Short introduction describing any previous activities: 

Prior to this BoF, the National Institutes of Health (NIH) hosted a workshop Feb. 11-12, 2020, with more than 750 in-person and videocast attendees to explore the roles of generalist and institutional data repositories in the biomedical data repository landscape. The workshop had five key goals and supported NIH’s ongoing efforts to provide researchers with appropriate solutions to make their data findable, accessible, interoperable, and reusable (FAIR).

  • Learn how generalist repositories see themselves in the larger biomedical data repository landscape.

  • Understand how institutional data repositories are creating suites of solutions for their researchers and how they see generalist repositories fitting into this landscape.

  • Consider desired characteristics of data repositories and how they relate to institutional expectations of data storage and preservation solutions.

  • Explore adoption of common infrastructure, standards, and federated search solutions to enable greater discoverability of NIH research data across federated data repositories.

  • Address the role of data curators in ensuring that data and metadata are sufficiently well curated to enhance discovery and enable reuse.

The meeting was co-chaired by Maryann Martone, professor emerita of neuroscience at the University of California, San Diego and chief scientific officer of SciCrunch, Inc., and Shelley Stall, the senior director for data leadership at the American Geophysical Union working with the Earth, space, and environmental community to improve data management practices, most recently with the Enabling FAIR Data Project.

The full meeting summary is available

This BoF grew out of the workshop organizers' wish to continue the discussion and the effort to improve data reuse in generalist repositories.

One important outcome of the workshop is an updated comparison chart of generalist repositories. We are grateful to the repositories that supported this effort to build content that will inform a dynamic version of this comparison chart hosted as a collection in

We set up a new RDA coordination group, the Generalist Repository Comparison Chart Management Group, to help coordinate anyone interested in this BoF and related activities.

BoF chair serving as contact person: 
Meeting presenters: 
Shelley Stall, Maryann Martone, Lisa Federer, Jennie Larkin, Ixchel Faniel, Lisa Johnston