Basic funding of data infrastructure may not keep pace with increasing costs. Therefore, there is a need to consider alternative cost recovery options and a diversification of revenue streams. In short: who will pay for public access to research data?
This Group proposes to make a contribution to strategic thinking on cost recovery by conducting research to understand current and possible cost recovery strategies for data centres. The Group will pay particular attention to data centres’ involvement in data publishing activities and examine such initiatives as a potential source of alternative revenue.
The Group will produce a report with conclusions and recommendations on how appropriate different cost recovery models are to different situations, and on how data publication initiatives might fit into a cost recovery strategy. It will also contribute its findings to the combined testing of the various models/scenarios/mechanisms developed in the four Data Publishing Working Groups.
These deliverables will build on five areas of work:
1. A summary of current work on cost models;
2. A survey of funding policies specifically relating to how the costs of data availability/publication may be recovered;
3. A survey, by means of a questionnaire and case studies, of various existing approaches to cost recovery/business models;
4. A survey of other stakeholders (publishers, researchers) to understand their position and policy in relation to charging models and their role in the publishing process;
5. The outcomes of the Working Group on Workflows.
Value proposition: Cost recovery for Data Centres
The challenge: cost recovery and sustainability of public data products.
A number of initiatives examining data citation and data publication aim to integrate data more effectively with the process of scholarly communication and the ‘record of science’. A vision of the future in which ‘data papers’ have scholarly currency, and in which data can be accessed and visualized directly from the online literature, requires partnerships between publishing platforms and data centres. This vision also demands that data centres providing access to published datasets should have sustainable business models.
Considerable work is under way to understand the costs of maintaining long-term accessibility to digital resources, to identify the different cost components and, on that basis, to develop cost models. However, in a broader context that considers data as part of research communication, the identification of costs and development of cost models addresses only part of the problem. In times of tightening budgets, it is important to address the challenge of ensuring the sustainability of data centres, and to consider this in the context of the broader processes for data publication. Many established national and international data centres have reliable sources of income from research funders. However, these sources of income are generally inelastic and may be vulnerable. There is concern that basic funding of data infrastructure may not keep pace with increasing costs. Therefore, there is a need to consider alternative cost recovery options and a diversification of revenue streams.
This Group proposes to make a significant—but achievable—contribution to strategic thinking in this area by conducting research to understand current and possible cost recovery strategies for data centres. We will pay particular—but not exclusive—attention to data centres’ involvement in data publishing activities and examine such initiatives as a potential source of alternative revenue.
By means of a questionnaire and a set of case studies, this working group will shed light on data centres’ current practice of cost recovery and identify possible opportunities for data centres looking to diversify income streams. A number of important questions will be considered, including:
● What cost recovery models are currently being employed by data centres?
● What trends are perceived by data centres with regard to the vulnerability of funding and what are the possible responses to diversify income streams?
● What cost recovery models are available within current, largely grant-based funding of research?
● What cost recovery strategies are available while maintaining a commitment to open access to research data?
The principal activity of the WG will be to survey a set of data centres and provide a group of case studies addressing these questions in detail, for use within the test environment developed by other RDA Working Groups. This work will build upon existing work on cost models and funders’ policies, as they relate to the recovery of data curation costs from research projects. These aspects of the work will help the WG analyse how the costs of particular activities may be covered, and to what extent it is possible to hypothecate charges and encourage clarity around who pays for what. However, the principal focus will be on understanding the alternative options available for cost recovery and diversification of revenue streams for data centres. There are various options available, and the involvement of data centres with ‘data publication’ initiatives is a significant new development that will be considered in this study.
Summary of the cost components of a data publication:
● Ensuring a publishable data product (annotation, metadata, codebooks, etc.) is largely the responsibility of the researcher.
● The quality assurance and review process is conducted in some instances by the publisher but also, largely, by the data repository.
● Long-term preservation (archive and services for access) is largely the responsibility of the data repository.