Courageous adherents of the research data cause will have signed up for not one, not two, but three research data conferences held at the Sheraton in Denver this month: SciDataCon, the International Data Forum (IDF) and the Research Data Alliance’s (RDA) 8th Plenary meeting. Dom Fripp and I joined the proceedings at the mid-point for the IDF, which provided a full day of inspirational presentations. In a cast of stellar speakers, two particularly stood out for me. The first was Philip Bourne (National Institutes of Health), presenting the Big Data to Knowledge (BD2K) platform, which brings together the services and tools medical researchers need in one platform underpinned by the FAIR Principles. This links closely to Jisc’s research strategy and what we would like to achieve for the UK research sector. The second was Edit Herczog, a Member of the European Parliament and an advocate of the research data (RD) cause, who presented some of the issues politicians have when dealing with the data problem which, for them, comes with a ‘toxic’ warning label.
At the opening plenary of the RDA meeting I sat on the opening panel to offer a UK funder’s perspective on the transition to sustainability. The panel was expertly chaired by Jenny Larkin (NIH) and Josh Greenberg (Sloan Foundation) and also comprised Meredith Morovati, Executive Director of Dryad, and Dan Lynn, CEO of AgilData. While there may be differences between countries, one thing connected our responses: the importance of community-building in the transition to sustainability. This community-building needs to happen early in the process so that service providers co-create their value and define their business models on the basis of their users’ needs. The importance of community continued as a theme throughout the meeting, as every session we attended identified the need to connect to other groups to ensure that RDA fulfils its objective to build bridges between the social and the technical.
Standardisation of Journal Research Data Policies
Continuing on the theme of community building, I was delighted to attend a guerrilla, off-programme Birds of a Feather (BOF) session organised by Iain Hrynaszkiewicz, Head of Data Publishing at Springer Nature. Jisc have been working in this area for some time and have been surfacing the complexities involved in standardising RD policies. The BOF was a grouping of RD stars from across the globe, covering funders, data centres, publishers, journal editors, infrastructure service providers, disciplinary groupings and institutions. The debate was lively, with consensus rapidly reached on the importance of, and need for, the standardisation of journal research data policies. Springer Nature have started the ball rolling by publishing four standard policies, and these have already been adopted by 350 of their journals. The RDA provides a perfect forum for this work to be extended to the wider community, which can provide the essential input to standardise policies across the various disciplines and different publisher/journal types. Jisc will work with institutions to provide the essential feedback from researchers and intermediaries on how these policies work in practice.
National Data Services - what’s happening outside the UK?
I finished my RDA plenary at the National Data Services Interest Group, where I gave a lightning talk on Jisc’s pilot Research Data Shared Service. This relatively new group has set out to produce an analysis of national data services, propose a model of what constitutes a national data service and collate material which countries have successfully used to present the case for national data services. So far, the group has gathered pen pictures of national services from Australia, Canada, Finland, Korea, the United States and the UK, with others on their way. The RDA provides a unique platform to connect those who are at the start of their RD journey with those who are already grappling with the complexity of implementing services, and allows them all to share their experiences, offer support and answer questions that would not be asked in a more formal setting.
Further observations from the RDA Plenary by Dom Fripp
This was my first experience of the RDA environment and it did not disappoint. The structure of the plenary meant there was a lot to see and hear on many different topics. Attending with Linda meant we were able to follow different groups and breakout sessions. There was plenty on offer that was relevant to my work at Jisc.
New discovery paradigms
My work on developing a metadata profile for the UK Research Data Discovery Service has resulted in a considerable amount of thought being given to which metadata fields contribute to discovery and how these can be collected and aggregated between heterogeneous sources.
This led me to the New paradigms for data discovery working group. Its mission statement focuses on the need for a data infrastructure that supports users in discovering research data regardless of its location or the manner in which it is stored, described and exposed. Given that this is a central concern for the Jisc Discovery project, it was important to attend and find out what was new in this space.
From a sustainability perspective, suggestions made at the meeting raised the interesting possibility of using the schema.org vocabulary to annotate research data records to facilitate discovery. The appeal of schema.org in this context is that it offers a seat at the search engine table. Schema.org is sponsored by Google, Microsoft, Yahoo and Yandex, and many of the applications created by these major players (such as Google’s Knowledge Graph) already use the vocabularies to deliver services.
The vocabularies are developed by an open community process, which opens the door to the creation of a research data vocabulary built by consensus amongst the international community. It might lead to the award of a full extension which covers this domain. This would take the use of the vocabulary beyond the Findability element of the community-constructed and adopted FAIR principles and aid the development of standards (and potentially applications) around the Accessibility, Interoperability and Reusability components. For now, schema.org represents a chance for the discovery of research data to be made at the search engine level and opens up a prospective audience for consumption.
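To give a flavour of what annotating a research data record with schema.org might look like, here is a minimal sketch in Python that builds a JSON-LD description using the existing schema.org Dataset and DataDownload types. It is an illustration only and not the approach discussed by the working group: the function name, the choice of fields and every value (titles, names, example.org URLs) are hypothetical.

```python
import json


def dataset_jsonld(name, description, landing_page, creator_name,
                   license_url, download_url, encoding_format):
    """Return a schema.org Dataset description as a JSON-LD-ready dict.

    The selection of properties is an assumption made for illustration;
    a real metadata profile would decide which fields are required.
    """
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": landing_page,
        "creator": {"@type": "Person", "name": creator_name},
        "license": license_url,
        "distribution": {
            "@type": "DataDownload",
            "contentUrl": download_url,
            "encodingFormat": encoding_format,
        },
    }


# Hypothetical record: none of these values refer to a real dataset.
record = dataset_jsonld(
    name="Example river temperature readings",
    description="Hourly water temperature readings from a fictional survey.",
    landing_page="https://example.org/datasets/river-temperature",
    creator_name="A. Researcher",
    license_url="https://creativecommons.org/licenses/by/4.0/",
    download_url="https://example.org/datasets/river-temperature.csv",
    encoding_format="text/csv",
)

# Serialise to JSON-LD text ready to be published alongside the record.
markup = json.dumps(record, indent=2)
print(markup)
```

The resulting text would typically be embedded in the dataset’s landing page inside a script element of type application/ld+json, which is where search engine crawlers look for this kind of markup.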
UK Research Data Discovery Service (alpha): http://ckan.data.alpha.jisc.ac.uk/dataset
FAIR Principles (and more): http://www.datafairport.org/
Community built data models
Community building is a key part of my work on developing a data model for Jisc’s research data shared service. The RDA Plenary offered a fantastic opportunity to see how such community processes around metadata play out, and permitted access to expertise and experience from around the world.
Beyond the involvement of the project pilot institutions in the data model work, it is vital to align with standards, workflows and best practice, and the plenary was a platform to discuss plans and receive guidance and feedback from domain experts. Attending joint sessions run by the Elixir Bridging Force Interest Group, Biosharing Registry Working Group, Data Type Registries Working Group and Metadata Standards Catalog Working Group allowed an effective horizon scan of work being done and suggested alignments and link-ups, not just between these RDA groups but with the community as a whole.
By placing the Research Data Shared Service data model on GitHub and offering a transparent governance structure, I hope that the model can tap some of this expertise and develop under the watch and through the influence of this extended community.
Refreshing the SWORD protocol
The RDA plenary is the place where the research data community gathers and discusses open problems that are often shared by many of those present. To this end, I was invited to give a presentation at the Research Data Repository Interoperability Working Group about Jisc’s planned refresh of the SWORD standard. First developed and implemented in 2006, the data transmission protocol has enjoyed success in many areas, from Open Access repositories to preservation and repository pipelines for research data. The refresh (five years after the last) meant first re-engaging with the community and asking them what was required of SWORD in the modern-day repository setting. I also set out various sustainability models for the standard. This gained traction in the room but, to reach the whole community, there is a Google document where SWORD users can leave information about their implementations and requirements for the protocol.
This document is public and open for comment. If you would like to add a use case or implementation, or make a suggestion for how SWORD can be improved for your needs, please add your information. This content will form part of a revamped web presence for the SWORD community.
Assante, M. et al. (2016). Are Scientific Data Repositories Coping with Research Data Publishing? Data Science Journal, 15, p. 6. DOI: http://doi.org/10.5334/dsj-2016-006
Forthcoming RDA events
If you are UK based, interested in the work of the RDA and would like to know more, the inaugural RDA UK workshop takes place in Birmingham on Wednesday 2nd November.
This one-day workshop will provide attendees with information about how the RDA, as a community-driven organisation, is working to achieve its vision. It will include representatives from specific working and interest groups highlighting their work and discussing how this work can be used practically in universities and data centres in the UK.
More details (and registration) can be found at:
The 9th RDA Plenary is in Barcelona on 5th-7th April 2017. More details about this event can be found at: https://rd-alliance.org/plenaries/rda-ninth-plenary-meeting-barcelona
Finally, it has also been announced that in September 2017, the 10th Plenary will be held in Montreal, Canada. Coinciding with the 150th anniversary of the Confederation, this is not to be missed.
See what was said on social media during the week by searching for the #IDW2016 and #RDAPlenary hashtags on Twitter.
Slides and materials from SciDataCon 2016: http://www.scidatacon.org/2016/programme/
Slides and materials from International Data Forum: http://www.internationaldataweek.org/International-data-forum
RDA 8th Plenary programme: https://rd-alliance.org/plenaries/rda-eighth-plenary-meeting-denver-co/programme