BoF - Content Mining with Proprietary and IP-Protected Data: Addressing Scientific, Technical, Legal, and Institutional Barriers to Research - RDA 13th Plenary meeting

10 Jan 2019

Copyright law, contractual agreements, and related statutes complicate access to, and research with, various forms of data. The challenges begin with the transactional costs of navigating initial access permissions and continue throughout the research data lifecycle. Significant barriers to access currently dissuade many researchers from pursuing valuable projects; among researchers who do obtain sufficient access to conduct analysis, new questions arise regarding storage, long-term preservation, dissemination, reproducibility, and reuse. These challenges are compounded for scholars engaged in collaborative work across institutional boundaries and legal jurisdictions.

An interest group arising from this Birds of a Feather session would focus on multidisciplinary, multi-pronged, and multinational solutions to alleviate the barriers that complicate researchers’ ability to engage in content mining research. Such a group would utilize the diverse perspectives of the RDA community to propose solutions that address the scientific, technical, legal, and institutional ramifications of content mining research. These ramifications include, but are not limited to, the way that proprietary and intellectual property restrictions on digital content complicate trends toward open data, data sharing, and reproducibility; the need for new technical workflows, documentation and provenance strategies, data transfer protocols, access management techniques, and shared standards; guidance on navigating international copyright regimes, efforts toward harmonization, and choice of law considerations; and the lack of sufficient institutional approaches to managing risk, coordinating dispersed responsibility for data access, management, and compliance, and accommodating the costs associated with proprietary data.


Meeting Objective

This Birds of a Feather session aims to foster a community of international stakeholders committed to pursuing a research and implementation agenda that addresses the scientific, technical, legal, and institutional barriers that stymie content mining initiatives for researchers interested in working with proprietary and IP-protected data. Together we will work to organize an RDA Interest Group and set an agenda for ongoing activities.


Collaborative Notes: We will also be using Poll Everywhere to gather attendee feedback during the session.


Meeting Location: Regency C1



Target Audience

  • Researchers
  • Librarians
  • Data managers
  • Legal experts
  • Content providers


Meeting Agenda

  • Introduction (Eleanor Dickson Koehl, 10 minutes)
    • Outlining current challenges and opportunities
    • Presenting an overview of recommendations and outcomes from related publications and initiatives (e.g., the Hargreaves Report, the FutureTDM project, the Hague Declaration, and the IMLS National Forum on Data Mining with In-Copyright and Limited-access Text Datasets)
  • Presentation of Proposed Interest Group Activities (Megan Senseney, 20 minutes)
  • Discussion (Facilitated by Megan Senseney and Eleanor Dickson Koehl, 60 minutes)
    • Identify potential activities that weren’t included in the proposal, based on participants’ own experiences and expertise
    • Rank the highest-priority actions as short-, medium-, or long-term priorities
    • Close with a commitment to a revised set of interest group activities, a rough outline for an IG charter, and a call for volunteers to join as co-chairs


Short description of previous activities along with additional links

Megan Senseney and Eleanor Dickson Koehl recently completed an IMLS-funded National Forum on Data Mining with In-Copyright and Limited-access Text Datasets, and they are coauthors of a forthcoming ACRL white paper that provides recommendations to academic libraries interested in supporting TDM scholarship. The forum, held 5-6 April 2018, brought together twenty-five leading stakeholders, whose initial position papers and SWOT analyses are publicly available. This BoF session is partially inspired by the outcomes of the IMLS National Forum, which has a public project page. The goal of the session is to both extend and sustain the work of the forum by building on initial outcomes and situating new initiatives within the broader mission of the Research Data Alliance.