Community Capability Model (CCM) - RDA CWG Case Statement Draft March 2013

CCM Working Group Charter

The purpose of the RDA CCM Working Group (WG) is to collect, validate and publish a range of data-centric “capability profiles” to enhance inter- and intra-domain interoperability and catalyse RDA data-sharing goals. Various barriers (technical, social, legal, ethical) currently make this difficult to achieve beyond established domain community boundaries, limiting the penetration and impact of data-intensive research. The CCM-WG also has the potential to adopt a foundational role in synthesising outputs from other RDA WGs.

UKOLN Informatics has worked with Microsoft Research Connections to develop a Community Capability Model Framework (CCMF) for data-intensive research [1]. The aim was to create a common framework for describing how well placed different communities are to carry out data-intensive research effectively, and for identifying gaps in capability. The CCMF White Paper describes eight capability factors which together facilitate an assessment of the data-intensive capability of a particular community (e.g. discipline, sub-discipline, research unit or research funding agency). The CCMF capability factors are categorised into Human, Environmental and Technical elements. They encompass Openness, Collaboration, Skills and Training, Academic Culture, Legal & Ethical, Economic & Business, Common Practices and Technical Infrastructure.
Essential elements of the Community Capability Model have been adopted within the US NSF-funded EarthCube Cross-Domain Interoperability Roadmap led by SDSC [2], and the work was presented at a CCMF Workshop at the International Digital Curation Conference in Amsterdam in January 2013 [3]. This exemplar illustrates the potential of the Capability Model to inform data interoperability decision-making and practice.

The CCM WG is proposed to extend and further validate the Model across dozens of domains and sub-domains, to collect evidence of its value and impact, and to demonstrate subsequent enhancements to data sharing, exchange and analysis. The CCM-WG will review the EarthCube instantiation as a proof-of-concept demonstrator, and build on this foundation to develop a CCM Domain Interoperability Profile Template for community use. The Template will encompass the full list of eight Capability Factors which comprise the Capability Model. Using this Template, we will apply the Capability Model to selected domains in an open, consensus-building process. This will include a consultative Workshop Programme with synthesis of community commentaries, and widespread dissemination of the Template so that it can be completed across many disciplines and sub-disciplines for broad coverage. The development of further Domain Interoperability Profiles will begin to build a referential “registry” to enhance cross-domain data-intensive research capability. The efficacy of the approach will be documented through an RDA-supported CCM Value and Impact Case Study. The results from applying the CCMF will be published on the web, so that they can be used as a reference and to evaluate continuously the progress made in data-intensive research over time.

Value Proposition
Beneficiaries:
• Global research communities in different domains will acquire an authoritative benchmark point of reference through the “registry” of Domain Interoperability Profiles, which inform the adoption of common technical formats, protocols, standards, metadata schema, ontologies, identifiers, workflows, licences, repository infrastructure, preservation actions, collaboration platforms, data sharing behaviours and skills. Once a reference is established, the profiles will enable consistent longitudinal tracking of progress over time.
• Principal Investigators (PIs), new-entrant researchers and postgraduates will be able to check and validate technical specifications, standards and other operational parameters described in grant proposals and Data Management Plans for implementation in funded research projects, in the secure knowledge that the particular standards, identifiers, metadata schema or behaviours in the Domain Profiles are recommended for wider use and embedded within or beyond the domain community.
• Professional support services (e.g. liaison librarians, data librarians, IT storage services) will be able to use the CCM Profile “registry” as an authoritative source of the essential information and knowledge needed to support data-intensive research.
• Research funding agencies will benefit from the Value and Impact Case Study, which will provide contextual evidence from cross-domain applications of the CCM Capability Model approach. The CCM Profile “registry” will support gap analysis, benchmarking and ongoing evaluation of research programmes.

Impacts / Outcomes:
• Common framework for describing and evaluating data-intensive research across domains. This will help RDA stakeholders to communicate, plan, execute and evaluate more effectively across RDA activities.
• Accelerated adoption and embedding of good data practices across research communities, fuelled by raised researcher awareness and active participation in the Workshop Programme.
• Informed advocacy from professional support services to directly assist front-line researchers and to more effectively manage, preserve and sustain data-centric research outputs in the longer term.
• Better quality Data Management Plans providing information and intelligence to funding agencies, facilitating enhanced forward planning for future data-intensive programmes and projects.
• Research productivity and Return-on-Investment (RoI) enhancements enabled by more efficient data exchange. RoI advantage for particular research grants and programmes will be achieved through streamlined computational and analytical processes. These potential economic gains will mean that more data-intensive research can be funded by each funding agency.

Engagement with Existing Work in the Area
The proposed CCM WG includes representatives from UKOLN Informatics, University of Bath, and Microsoft Research Connections, with additional colleagues from relevant initiatives and domains. Bringing expertise from the EarthCube instantiation, Ilya Zaslavsky, SDSC, will be a member of the Working Group. Prior work on capability and maturity models is described in the literature review in the CCMF White Paper. One readiness assessment tool cited is of particular relevance, since it addresses research data management capability from an orthogonal but relevant perspective within institutions or universities: this is the UK Digital Curation Centre CARDIO tool. The DCC has also begun to construct a series of Web pages describing disciplinary metadata [4]; DCC representatives are members of the CCM WG.
The concept of data profiles has been explored by colleagues at Purdue University Libraries in a project funded by the Institute of Museum and Library Services and aimed at Library support staff. The Data Curation Profiles Toolkit [5] describes sample data curation requirements in a range of disciplines, recording practice and first-hand researcher experience. This resource offers a useful snapshot which will inform the structured approach offered by the CCM. Scott Brandt, Purdue University Libraries, is also a WG member. Dr Andrew Treloar is proposed as the RDA representative, given his prior engagement with CCM work.
Three domains will be selected for CCM “deep-modelling”, each with an identified domain champion to support community and stakeholder engagement. “Lite-modelling” of further candidate domains will be carried out, with roll-out across the wider community, encompassing dozens of domains and sub-domains.
A provisional start-up list is below:
CCM “deep-modelling”: Astronomy, Prof Alyssa Goodman, Harvard University; Environment, Prof Bill Michener, Univ New Mexico & DataONE; Social Sciences, Prof Dave De Roure, Univ Oxford. “Lite-modelling”: Bioinformatics; Chemistry; High-energy Physics; Health & Medicine; Visual Arts & Humanities.

Workplan
Summary List of Final Deliverables:
• D1 Profile Template: developed as an online and offline tool building on prototype CCM tool development.
• D2 Domain Interoperability Profiles (3): baseline content created and exposed to the community for collective revision, amendment, discussion and agreement, facilitated by a community workshop.
• D3 CCM Value and Impact Case Study: synthesised from evidence collected from domain exemplars.
Community Engagement: The consultative Workshop Programme will include three community workshops with each event designed to provide a forum for:
• Presenting the CCM and draft Domain Interoperability Profile based on the Template
• Gathering Profile content and synthesising opinions and views
• Identifying strengths, weaknesses, gaps and inconsistencies in the information
• Seeking research community consensus on the Profile through structured discussion.

Each Workshop will target a domain and will be associated with key international domain meetings or conferences, to maximise community participation. The CCM programme of work will select three domains for “deep-modelling” within the 12-month timeframe January to December 2014. “Lite-modelling” of domains and sub-domains will be carried out throughout 2014.
Consensus-building: The WG aims to achieve community self-validation through the workshop discussions and subsequent open communications. We will explore additional validation of Profiles by relevant professional bodies where appropriate, possibly through joint publication and/or branding, e.g. Royal Society of Chemistry or IUCr. Whilst majority views and supporting evidence will be the primary arbiter, conflicting views will be addressed by the WG and unresolved issues will be taken to the RDA Council. To a large degree, the scope of the work has already been determined through the Community Capability Model structure; however, standard project management techniques will be applied to WG activities.

Milestones:
M1. Start-up meeting at RDA Conference Summer 2013
M2. Completion of CCM Profile Template December 2013
M3. Completion of 1st CCM Workshop & Synthesis (Q1 2014)
M4. Completion of final CCM Workshop & Synthesis (Q4 2014)
M5. Completion of CCM Value & Impact Case Study December 2014
Intermediate documents: Public drafts of Profile Template (Q3-4 2013), candidate Domain Interoperability Profiles (Q1-4 2014), D3 Value & Impact Case Study (Q3-4 2014).
Mode of Operation: Many elements of CCM WG participation are currently expected to be voluntary. The WG will operate primarily via regular teleconferences (every two to three weeks), with additional face-to-face sessions aligned with key RDA events and conferences. These sessions will be supported by an email discussion list and a collaborative Web site, where shared documents will be stored.

Adoption Plan

Across the nominated domains, it is envisaged that the “champions” will help to facilitate dissemination and embedding within communities of practice through tweets, blog posts, presentations and articles. Affiliated organisations such as the Digital Curation Centre will be instrumental in promoting the outputs of the CCM WG, primarily through the Web site, international conferences and institutional engagements. We also envisage that the outputs will be promoted by professional services staff, e.g. liaison librarians supporting particular departments, and by data centre staff providing disciplinary archival services, e.g. British Atmospheric Data Centre.

Furthermore, up to three “generic” CCM-RDA Profile workshops will be held at key multi-disciplinary events, e.g. International Digital Curation Conference, International Conference on Research Infrastructures (ICRI), eResearch Australasia, IEEE eScience and the Microsoft Faculty Summit, to promote wider adoption and take-up.

Initial WG Membership (Confirmed)
• Dr Liz Lyon, UKOLN Informatics Director and Associate Director, DCC (UK)
• Dr Kenji Takeda, Microsoft Research Connections (Global)
• Dr Manjula Patel, Research Officer, UKOLN Informatics (UK)
• Dr Ilya Zaslavsky, SDSC (USA)
• Joy Davidson, DCC (UK)
• Alex Ball, DCC (UK)
• Scott Brandt, Purdue University Libraries (USA)
• Prof Alyssa Goodman, Harvard University (USA)
• Prof Dave De Roure, University of Oxford (UK)
• Prof Bill Michener, University of New Mexico (USA)
• RDA representative (TBC): Dr Andrew Treloar, ANDS (Australia).

References

  1.   Community Capability Model Framework (CCMF) http://communitymodel.sharepoint.com/
  2.   EarthCube Roadmap. Cross-Domain Interoperability Test-Bed Group, 2012. https://www.dropbox.com/s/0oqk5ostahfokbg/interop_roadmap_master8_Aug16.pdf
  3.   http://www.dcc.ac.uk/events/idcc13/workshops
  4.   http://www.dcc.ac.uk/news/new-dcc-resource-disciplinary-metadata
  5.   Data Curation Profiles Toolkit datacurationprofiles.org