Machine-actionable DMPs - a disengaging administrative burden?

24 Sep 2018

There are many advantages of machine-actionable DMPs (maDMPs); see e.g. Miksa et al. (2018) [1]. However, there is one aspect of maDMPs that I'd like to hear your opinion on. In their guide on research data management (RDM), Corti et al. (2014:26) [2] say: "[A] data management plan should not be treated as a simple administrative task for which standardized text can be pasted from model templates, with little intention to implement the planned measures early on […]." One of the prerequisites for enabling maDMPs is "avoiding free text and providing structured information whenever possible" (Miksa et al. 2018:10) [1]. Therefore, implementing maDMPs might risk making DMPs even more of a disengaging administrative burden for researchers. This was at least the feeling I was left with after selecting lots of predefined values in the easyDMP tool, which already has some of the characteristics of an maDMP [3].

 

Any thoughts on this? What is the DMP Roadmap approach to this kind of question?

 

Best,
Philipp

 

References:

[1] Miksa, Tomasz, Simms, Stephanie, Mietchen, Daniel, & Jones, Sarah. (2018). Ten simple rules for machine-actionable data management plans (preprint) (Version preprint). http://doi.org/10.5281/zenodo.1172673

[2] Louise Corti, Veerle Van den Eynden, Libby Bishop and Matthew Woollard (2014): Managing and Sharing Research Data: a Guide to Good Practice. Los Angeles: Sage.

[3] https://easydmp.sigma2.no/
 

  • Rob Hooft's picture

    Author: Rob Hooft

    Date: 24 Sep, 2018

    Philipp,

    I think one of the purposes of the maDMP is exactly the opposite of what you fear: the trivial administrative questions are supposed to be filled in by machine, and only the questions that require thought, where considering the options can actually add value to the research, need to be filled in by the researcher. Closed questions rather than open text fields do not conflict with this. Rather than expecting a single piece of text from a researcher who may not know all the issues that could play a role in the decision, nor all the options available, a complicated question can be split up into more easily answered closed questions that together provide the information after a proper consideration of all the options and issues.

    To find out how I see that, you could check out the Data Stewardship Wizard tool which I am working on with my ELIXIR colleagues from Prague: https://dsw.fairdata.solutions/
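
    As a rough illustration of splitting one open question into closed ones, here is a minimal Python sketch. It is not the Data Stewardship Wizard's data model: the class `ClosedQuestion`, the question keys, and the option values are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class ClosedQuestion:
    key: str                  # machine-readable field name (hypothetical)
    text: str                 # wording shown to the researcher
    options: Tuple[str, ...]  # the only answers that are accepted


# Hypothetical closed questions standing in for one free-text question
# such as "Where and how will you store your data?"
STORAGE_QUESTIONS = (
    ClosedQuestion("personal_data", "Does the data contain personal data?", ("yes", "no")),
    ClosedQuestion("expected_size", "How large will the data be?", ("<1GB", "1GB-1TB", ">1TB")),
    ClosedQuestion("embargo", "Is an embargo period needed?", ("yes", "no")),
)


def collect_answers(answers: Dict[str, str]) -> Dict[str, str]:
    """Check each answer against its allowed options and return a structured record."""
    record = {}
    for question in STORAGE_QUESTIONS:
        answer = answers.get(question.key)
        if answer not in question.options:
            raise ValueError(
                f"{question.key}: expected one of {question.options}, got {answer!r}"
            )
        record[question.key] = answer
    return record
```

    Because every answer comes from a fixed set of options, the resulting record can be processed by software without parsing free text, while the researcher still has to make each decision.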

    Regards,

    Rob


    Rob W.W. Hooft || Skype: robhooft || Phone (+WhatsApp/Signal): +31 6 27034319
    Manager ELIXIR-NL at Dutch Techcentre for Life Sciences (DTL) || https://www.dtls.nl

  • Tomasz Miksa's picture

    Author: Tomasz Miksa

    Date: 24 Sep, 2018

    Dear Philipp,

    Thanks for reaching out to us. Less administrative burden and more automation are the main goals for what we do!
    Recently, we held a workshop in which we presented use cases for automating DMP creation and performing the actions described in DMPs. Furthermore, we have developed some prototypes to showcase that. We will show more of this at the plenary in Gaborone. For now, please take a look at the workshop website: http://rda-ws-tpdl2018.idsswh.sysresearch.org/#
    There you will find useful information that should help you better understand our vision. We will also likely have a call in mid-October in which we can discuss this further.

    Tomasz

    P.S. Blog post by Stephanie Simms (DMP Tool – DMP Roadmap) might also be worth looking at https://blog.dmptool.org/tag/machine-actionable-dmps/

  • Philipp Conzett's picture

    Author: Philipp Conzett

    Date: 26 Sep, 2018

    Thanks, Rob and Tomasz, for valuable feedback!

    I fully agree with the idea and rationale behind making DMPs machine-actionable. If this is implemented in a meaningful way, it will surely reduce the administrative burden for researchers. As you say, Rob, this way "a complicated question can be split up into more easily answered closed questions that together provide the information after a proper consideration of all the options and issues". I had a look at the Questionnaire demo of the Data Stewardship Wizard from ELIXIR. Here, researchers are guided through the whole lifecycle of research data. But without such guidance and sensible grouping of questions, filling in a DMP form may turn into a disengaging and frustrating duty. If you read my feedback on the easyDMP tool, you'll probably understand what I mean.

    Best,
    Philipp

     

  • João Rocha da Silva's picture

    Author: João Rocha da Silva

    Date: 27 Sep, 2018

    Hi all,

    I did not have the opportunity to attend the TPDL 2018 workshop, so at the risk of being outdated, here are my two cents on this until we can meet in Gaborone and discuss this in person.

    1. maDMPs need an ecosystem of “automatable” software (Rules 1 and 2 of the 10 Simple Rules for maDMPs)

    One important goal is to have the maDMP automate repetitive work, but people should be able to keep their own repositories and other software platforms in the maDMP workflow, or, at most, install an upgrade to these existing solutions. However, automation across different systems requires interoperability.
    Thus, interoperability with existing data management tools is a must, because a maDMP is basically worthless if no software knows how to execute what it prescribes or enforces. An off-the-shelf workflow engine should be able to fire off the proper events to the data management software, with the necessary payload, which would itself comply with a set interoperability standard (this is where we could come in, I think). In a sense, we could work towards an API specification covering a core set of operations that should be supported by any repository or data staging platform that wants to be “maDMP Ready” in the RDA sense.
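
    To make the idea of a core set of operations more concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the interface name `MaDMPReadyRepository`, the two operations, the event names, and the in-memory stand-in are hypothetical and do not come from any RDA specification or existing repository API.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict


class MaDMPReadyRepository(ABC):
    """Hypothetical core operations a repository would expose to a workflow engine."""

    @abstractmethod
    def reserve_identifier(self, dataset_title: str) -> str:
        """Reserve an identifier for a dataset described in the maDMP."""

    @abstractmethod
    def deposit_metadata(self, identifier: str, metadata: Dict[str, Any]) -> None:
        """Attach metadata taken from the maDMP to a reserved dataset."""


class InMemoryRepository(MaDMPReadyRepository):
    """Toy implementation; a real vendor would call its own repository API here."""

    def __init__(self) -> None:
        self.records: Dict[str, Dict[str, Any]] = {}

    def reserve_identifier(self, dataset_title: str) -> str:
        identifier = f"local:{len(self.records) + 1}"
        self.records[identifier] = {"title": dataset_title}
        return identifier

    def deposit_metadata(self, identifier: str, metadata: Dict[str, Any]) -> None:
        self.records[identifier].update(metadata)


def dispatch(repo: MaDMPReadyRepository, event: str, payload: Dict[str, Any]) -> Any:
    """Route an event fired by a workflow engine to the matching repository operation."""
    handlers = {
        "reserve_identifier": repo.reserve_identifier,
        "deposit_metadata": repo.deposit_metadata,
    }
    if event not in handlers:
        raise ValueError(f"unsupported event: {event!r}")
    return handlers[event](**payload)
```

    The workflow engine only needs to know the event names and payload shapes; each platform keeps its own software behind the interface.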

    2. maDMPs should be modular

    Every research project has its own way of handling data, but some needs are common, as shown by the recent tools that aid in building a DMP. A major attraction of maDMPs thus lies in their possible reuse, either as a whole or as building blocks. The maDMP should expose the subprocesses inside it in a modular way, with certain elementary validation and processing workflows being modeled and shareable using existing modeling languages, such as BPMN, as shown at the workshop. Other modeling languages could be used as long as they have both a standard visual representation for inclusion in the “printed” DMP, much like UML, and a machine-processable representation in XML. Such maDMP “building blocks”, so to speak, could then be reused in a project's maDMP and published in a maDMP “directory” for others to access.

    People interested in a maDMP for their project could then:

    a) Download and reuse an existing maDMP for a project that they know went well
    b) Reuse only the maDMP “building blocks” for metadata validation, dataset availability, repository compliance, etc. (modeled using BPMN)
    c) Build their own from scratch and share it to this “directory” of maDMPs for others to reuse
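
    The reuse options above can be sketched in a few lines of Python. Note that this swaps BPMN for plain Python functions to keep the example short, and that the `DIRECTORY` registry, the block names, and the dataset fields are hypothetical.

```python
from typing import Callable, Dict, List

# A building block is modeled here as a function that checks one aspect of a dataset.
Block = Callable[[dict], bool]

# Hypothetical shared "directory" of published building blocks, keyed by name.
DIRECTORY: Dict[str, Block] = {}


def publish(name: str) -> Callable[[Block], Block]:
    """Register a building block in the directory so other plans can reuse it."""
    def register(block: Block) -> Block:
        DIRECTORY[name] = block
        return block
    return register


@publish("metadata-validation")
def has_required_metadata(dataset: dict) -> bool:
    # Assumed minimal metadata fields, for illustration only.
    return all(dataset.get(field) for field in ("title", "creator", "license"))


@publish("dataset-availability")
def has_location(dataset: dict) -> bool:
    return bool(dataset.get("url"))


def run_plan(block_names: List[str], dataset: dict) -> Dict[str, bool]:
    """Run the selected building blocks against a dataset and report each result."""
    return {name: DIRECTORY[name](dataset) for name in block_names}
```

    A project can reuse a whole list of block names from a plan that went well, pick individual blocks, or publish new ones for others.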

    3. The code that actually executes the actions in the maDMP’s processes should travel with them (Rules 6, 7, 9)

    If possible, the small pieces of code behind every step of the modeled workflows should be open-source and retrievable as needed by the workflow engine as it runs the BPMN processes. That way, vendors could fork and adjust them to the existing APIs of their repository software, or of other software that the maDMP needs to “remote-control” to execute the automation steps specified in the modeled process. This is somewhat similar to ETL tools, but with platform-specific code fetched as needed. A commercial example of a graphical ETL tool is Pentaho Data Integration (by the way, I am not affiliated with them; I have only used the tool myself).

    Best,

    João Rocha da Silva
    Invited Assistant Professor — Dendro Lead Developer — Research Data Management
    Faculty of Engineering of the University of Porto, Portugal
    ORCID: https://orcid.org/0000-0001-9659-6256 GitHub: https://github.com/silvae86

  • Paul Walk's picture

    Author: Paul Walk

    Date: 27 Sep, 2018

    I’m not sure that I agree with expanding the scope of this activity to include developing a standard for describing business processes relating to DMPs. We can make a real difference if we focus on our mission to establish a standard data interchange.

    I think you are right to emphasise the importance of understanding the business processes which will use mDMPs, but I think we should stop short of trying to model these formally with something like BPMN. It is too soon to be able to predict with any confidence precisely how mDMPs will be used and we should, in my opinion, avoid over-engineering a solution at this stage. Rather, we should be aiming for improving broad interoperability, and for maintaining flexibility & future extensibility.

    Cheers,

    Paul

    -------------------------------------------
    Paul Walk
    http://www.paulwalk.net

    Founder and Director, Antleaf Ltd
    http://www.antleaf.com

    Antleaf provides Management Services to DCMI
    http://www.dublincore.org
    -------------------------------------------

  • Tomasz Miksa's picture

    Author: Tomasz Miksa

    Date: 04 Oct, 2018

    Hi,

    The idea to model processes using BPMN was to showcase how certain activities that relate to information contained in DMPs can be automated. In other words, it is a way to demonstrate machine-actionability of DMPs.

    When our model is in place, then institutions rolling out maDMPs will have to think about the process in which maDMPs are used – that’s also why we had the first consultation to find out who uses which information and when.

    Depending on the research field and on existing infrastructure and practices, the processes will be implemented differently. However, most of the processes will very likely be similar to some extent, e.g. getting recommendations on repositories, getting information on costs, etc. For this reason, a set of BPMNs is just a useful set of ideas that enable us now to have a more focused discussion on requirements for the model.

    Yes, it is not the ambition of this working group to define processes – we do the model.

    Tomasz

  • Paul Walk's picture

    Author: Paul Walk

    Date: 04 Oct, 2018

    I'm comfortable with "For this reason, a set of BPMNs is just a useful set of ideas that enable us now to have a more focused discussion on requirements for the model."

    We just need to be careful to remember that the processes which will produce and/or consume our mDMPs will be heterogeneous and unpredictable!

    Cheers,

    Paul

    > On 4 Oct 2018, at 11:12, tmiksa wrote:
    >
    > Hi,
    >
    > The idea to model processes using BPMN was to showcase how certain activities that relate to information contained in DMPs can be automated. In other words, it is a way to demonstrate machine-actionability of DMPs.
    > When our model is in place, then institutions rolling out maDMPs will have to think about the process in which maDMPs are used – that’s also why we had the first consultation to find out who uses which information and when.
    > Depending on the research field and existing infrastructure and practices, the processes will be implemented differently. However, very likely most of the processes will be to some extenet similar, e.g. getting recommendation on repositories, getting information on costs, etc. For this reason, a set of BPMNs is just a useful set of ideas that enable us now to have a more focused discussion on requirements for the model.
    >
    > Yes, it is not the ambition of this working group to define processes – we do the model.
    >
    > Tomasz
    >
    > From: paul=paulwalk.net@rda-groups.org [mailto:paul=paulwalk.net@rda-groups.org] On Behalf Of paulwalk
    > Sent: Thursday, September 27, 2018 9:00 AM
    > To: jrocha; DMP Common Standards WG
    > Cc: philipp.conzett@uit.no
    > Subject: Re: [dmp-common] Machine-actionable DMPs - a disengaging administrative burden?
    >
    > I’m not sure that i agree with expanding the scope of this activity to include developing a standard for describing business processes relating to DMPs. We can make a real difference if we focus on our mission to establish a standard data interchange.
    >
    > I think you are right to emphasise the importance of understanding the business processes which will use mDMPs, but I think we should stop short of trying to model these formally with something like BPMN. It is too soon to be able to predict with any confidence precisely how mDMPs will be used and we should, in my opinion, avoid over-engineering a solution at this stage. Rather, we should be aiming for improving broad interoperability, and for maintaining flexibility & future extensibility.
    >
    > Cheers,
    >
    > Paul
    >
    > -------------------------------------------
    > Paul Walk
    > http://www.paulwalk.net
    >
    > Founder and Director, Antleaf Ltd
    > http://www.antleaf.com
    >
    > Antleaf provides Management Services to DCMI
    > http://www.dublincore.org
    > -------------------------------------------
    >
    > On 27 Sep 2018, at 01:48, jrocha wrote:
    >
    > Hi all,
    >
    > I did not have the opportunity to attend the TPDL 2018 workshop, so at the risk of being outdated, here are my two cents on this until we can meet in Gaborone and discuss this in person.
    >
    > 1. maDMPs need an ecosystem of “automatable” software (Rules 1 and 2 of the 10 Simple Rules for maDMPs)
    >
    > One important goal is to have the maDMP automate repetitive work, but people should be able to keep their own repositories and other software platforms in the maDMP workflow, or, at most, install an upgrade to these existing solutions. However, automation across different systems requires interoperability.
    > Thus, interoperability with existing data management tools is a must, because a maDMP is basically worthless if no software knows how to execute what it prescribes or enforces. An off-the-shelf workflow engine should be able to fire off the proper events to the data management software, with the necessary payload, which would itself comply with a set interoperability standard (this is where we could come in, I think). In a sense, we could work towards an API specification covering a core set of operations that should be supported by any repository or data staging platform that wants to be “maDMP Ready” in the RDA sense.
    >
    > 2. maDMPs should be modular
    >
    > Every research project has its own way of handling data, but some needs are common, as shown by the recent tools that aid in building a DMP. A major benefit of maDMPs thus lies in their possible reuse, either as a whole or as building blocks. The maDMP should expose the subprocesses inside it in a modular way, with certain elementary validation and processing workflows being modeled and shareable using existing modeling languages, such as BPMN, as shown at the workshop. Other modeling languages could be used as long as they have both a standard visual representation for inclusion in the “printed” DMP, much like UML, and a machine-processable representation in XML. Such maDMP “building blocks”, so to speak, could then be reused in a project's maDMP, and published in a maDMP “directory” for others to access.
    >
    > People interested in a maDMP for their project could then:
    >
    > a) Download and reuse an existing maDMP for a project that they know went well
    > b) Reuse only the maDMP “building blocks” for metadata validation, dataset availability, repository compliance, etc. (modeled using BPMN)
    > c) Build their own from scratch and share it to this “directory” of maDMPs for others to reuse
    >
    >
    > 3. The code that actually executes the actions in the maDMP’s processes should travel with them (Rules 6, 7, 9)
    >
    >
    > If possible, the small pieces of code behind every step of the modeled workflows should be open-source and retrievable as needed by the workflow engine as it runs the BPMN processes. This way, vendors could fork and adjust them to the existing APIs of their repository software or other software that the maDMP needs to “remote-control” to execute the automation steps specified in the modeled process. This is somewhat similar to ETL tools, but fetching platform-specific code as needed. A commercial example of a graphical ETL tool is Pentaho Data Integration (by the way, I am not affiliated with them; I have only used the tool myself).
    >
    >
    > Best,
    >
    > João Rocha da Silva
    > Invited Assistant Professor — Dendro Lead Developer — Research Data Management
    > Faculty of Engineering of the University of Porto, Portugal
    > ORCID: https://orcid.org/0000-0001-9659-6256 GitHub: https://github.com/silvae86
    >
    >
    > On 26/09/2018, at 12:35, philipp.conzett@uit.no wrote:
    >
    > Thanks, Rob and Tomasz, for valuable feedback!
    > I fully agree with the idea and rationale behind making DMPs machine actionable. If this is implemented in a meaningful way, it surely will reduce the administrative burden for researchers. As you say, Rob, this way "a complicated question can be split up into more easily answered closed questions that together provide the information after a proper consideration of all the options and issues". I had a look at the Questionnaire demo of the Data Stewardship Wizard from ELIXIR. Here, researchers are guided through the whole lifecycle of research data. But without such guidance and sensible grouping of questions, filling in a DMP form may turn into a disengaging and frustrating duty. If you read my feedback on the easyDMP tool, you'll probably understand what I mean.
    > Best,
    > Philipp
    >
    > --
    > Full post: https://rd-alliance.org/group/dmp-common-standards-wg/post/machine-actio...
    > Manage my subscriptions: https://rd-alliance.org/mailinglist
    > Stop emails for this post: https://rd-alliance.org/mailinglist/unsubscribe/60738
    >

    -------------------------------------------
    Paul Walk
    http://www.paulwalk.net

    Founder and Director, Antleaf Ltd
    http://www.antleaf.com

    Antleaf provides Management Services to DCMI
    http://www.dublincore.org
    -------------------------------------------

  • Kevin Ashley's picture

    Author: Kevin Ashley

    Date: 04 Oct, 2018

    From a number of perspectives, I am in strong agreement with Paul that the
    group's scope should not try to take this task on. If this is a useful area to
    explore, I would rather see a new working group take it on - I speak now as
    co-chair of the overall interest group looking after the DMP working groups.

    The Common Standards Group has an admirably narrow focus and task to complete
    and I would want to see that achieved first before branching out. Lots else
    depends on getting this standard agreed.

    I'll return to the original criticisms made by Philipp Conzett in a separate
    reply as I think we've ended up with two quite distinct issues in this email thread.

    On 04/10/18 11:12, tmiksa wrote:
    > Hi,
    >
    > The idea to model processes using BPMN was to showcase how certain activities
    > that relate to information contained in DMPs can be automated. In other words,
    > it is a way to demonstrate machine-actionability of DMPs.
    >
    > When our model is in place, then institutions rolling out maDMPs will have to
    > think about the process in which maDMPs are used – that’s also why we had the
    > first consultation to find out who uses which information and when.
    >
    > Depending on the research field and existing infrastructure and practices, the
    > processes will be implemented differently. However, very likely most of the
    > processes will be to some extent similar, e.g. getting recommendations on
    > repositories, getting information on costs, etc. For this reason, a set of BPMNs
    > is just a useful set of ideas that enable us now to have a more focused
    > discussion on requirements for the model.
    >
    > Yes, it is not the ambition of this working group to define processes – we do
    > the model.
    >
    > Tomasz

    --
    The University of Edinburgh is a charitable body, registered in
    Scotland, with registration number SC005336.

  • João Rocha da Silva's picture

    Author: João Rocha da Silva

    Date: 04 Oct, 2018

    Dear Paul, Kevin, Tomasz, and all others,

    I see your point when you say that the WG should focus on the deployment of the maDMP overall model. Perhaps it should limit itself to laying the foundation for “plugging in” future BPMN workflows, maybe by including a few related Classes and Properties (in case this model is formalised as an ontology) to reference these models, but without actually embedding them in the model itself?
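
    One way to picture this "reference, do not embed" idea is the minimal Python sketch below. It is only an illustration under stated assumptions: the names WorkflowReference, describes_process, notation and uri are hypothetical and not taken from any agreed model, and the URL is a placeholder. The plan carries a pointer to an externally hosted process model, never the model itself.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class WorkflowReference:
    """Pointer to an externally hosted process model (hypothetical class)."""
    describes_process: str  # e.g. "metadata-validation"
    notation: str           # e.g. "BPMN 2.0"
    uri: str                # where the model file lives


@dataclass
class Plan:
    """A toy plan that references workflows instead of embedding them."""
    title: str
    workflows: List[WorkflowReference] = field(default_factory=list)

    def workflows_by_process(self) -> Dict[str, str]:
        """Map each process name to the URI of its external model."""
        return {w.describes_process: w.uri for w in self.workflows}


plan = Plan(title="Example project")
plan.workflows.append(
    WorkflowReference(
        describes_process="metadata-validation",
        notation="BPMN 2.0",
        uri="https://example.org/workflows/metadata-validation.bpmn",  # placeholder URL
    )
)
print(plan.workflows_by_process())
```

    Keeping only a reference means the model stays small, and a workflow engine that understands the notation can fetch the process definition when it needs it.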

    A future WG could then be created with the goal of creating some sort of “minimum viable product” for a maDMP-compliant repository or workflow software; afterwards, perhaps we can present a revised model with any lessons learned from the engineering side of things if need be.

    Best,

    João Rocha da Silva
    Invited Assistant Professor — Dendro Lead Developer — Research Data Management
    Faculty of Engineering of the University of Porto, Portugal
    ORCID: https://orcid.org/0000-0001-9659-6256 GitHub: https://github.com/silvae86

  • Tomasz Miksa's picture

    Author: Tomasz Miksa

    Date: 05 Oct, 2018

    Dear João,

    “been there, done that” – I was considering that at some point, but we concluded that such an approach limits interoperability of the model. That’s one of the lessons learned in the past months, and that’s why we put some effort into developing use cases – to really narrow the scope of the model.

    Machine-actionability means that using information stored in the model, external tools/services can take action in an automated way. These tools/services can be modelled using BPMN (or in any other way, or not modelled at all…). The maDMP just needs to provide inputs to these models. maDMPs should not prescribe how certain actions are performed.

    The simplest example is information on costs. A maDMP should contain a field that states the costs of data management. A maDMP should not contain an algorithm (BPMN process) for calculating those costs. Calculating the cost is the job of a service that provides this value.
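
    To make the cost example concrete, here is a minimal Python sketch. It is not part of any agreed model: the field names (title, value, currency_code, expected_volume_gb), the estimate_storage_cost function and its pricing rule are all hypothetical, chosen only for illustration. The plan stores the resulting value; the calculation lives in the service.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Cost:
    """A single cost entry stored in the plan (hypothetical field names)."""
    title: str
    value: float
    currency_code: str


@dataclass
class MaDMP:
    """A toy plan: it records facts and results, not algorithms."""
    title: str
    expected_volume_gb: float
    costs: List[Cost] = field(default_factory=list)


def estimate_storage_cost(volume_gb: float, price_per_gb: float = 0.5) -> Cost:
    """Stand-in for an external service; the pricing rule is an assumption."""
    return Cost(title="Storage", value=volume_gb * price_per_gb, currency_code="EUR")


plan = MaDMP(title="Example project", expected_volume_gb=200.0)
# The service reads an input from the plan and writes its result back.
plan.costs.append(estimate_storage_cost(plan.expected_volume_gb))
print(plan.costs[0].value)  # 100.0
```

    If the pricing rule changes, only the service changes; plans that already hold a cost value stay valid and exchangeable.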

    For this reason in our slides and also in the ten principles for maDMPs we stress that 3 things are needed to put maDMPs in place:

    - model to persist and exchange information

    - infrastructure (services) that actually do something with the information from the maDMP

    - processes that define how stakeholders interact with each other (using services and model) to achieve their objectives, e.g. researcher to deliver a DMP, funder to validate DMP, research support to provide cost estimation, etc.

    In my opinion, we will be able to reap benefits from the model only when we have the ecosystem of services that actually do something with the model. For this reason, I am really looking forward to our pilot projects – I am sure they will be very diverse, e.g. will come up with different ways of using/providing information from/to maDMPs.

    For now, our discussion should focus on how to convert collected requirements into specific fields of the model – we will provide more details soon on how to contribute. Later we (or a different WG) can focus on how to do magic with this information.

    Cheers

    Tomasz

    On 27 Sep 2018, at 01:48, jrocha wrote:

    Hi all,

    I did not have the opportunity to attend the TPDL 2018 workshop, so at the
    risk of being outdated, here are my two cents on this until we can meet in
    Gaborone and discuss this in person.

    *1. maDMPs need an ecosystem of “automatable” software (Rules 1 and 2 of the
    10 Simple Rules for maDMPs)*

    One important goal is to have the maDMP automate repetitive work, but people
    should be able to keep their own repositories and other software platforms
    in the maDMP workflow, or, at most, install an upgrade to these existing
    solutions. However, automation across different systems requires
    interoperability.

    Thus, interoperability with existing data management tools is a must,
    because a maDMP is basically worthless if no software knows how to execute
    what it prescribes or enforces. An off-the-shelf workflow engine should be
    able to fire off the proper events to the data management software, with the
    necessary payload, which would itself comply with a set interoperability
    standard (this is where we could come in, I think). In a sense, we could
    work towards an API specification covering a core set of operations that
    should be supported by any repository or data staging platform that wants to
    be “maDMP Ready” in the RDA sense.
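
As an illustration of the idea above, here is a minimal sketch of a workflow engine dispatching maDMP events to a platform that declares which core operations it supports. Everything in it is an assumption made for the example: the operation names (`deposit_dataset`, `set_license`, `mint_doi`), the `MaDMPReadyPlatform` interface and the payload fields are hypothetical and are not part of any RDA specification.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Protocol


@dataclass
class Event:
    """One action prescribed by the maDMP, with its payload."""
    operation: str  # hypothetical operation name, e.g. "deposit_dataset"
    payload: dict = field(default_factory=dict)


class MaDMPReadyPlatform(Protocol):
    """Hypothetical core operations a platform would have to expose."""

    def supported_operations(self) -> set[str]: ...

    def handle(self, event: Event) -> dict: ...


class InMemoryRepository:
    """Toy repository standing in for real data management software."""

    def __init__(self) -> None:
        self.datasets: dict[str, dict] = {}

    def supported_operations(self) -> set[str]:
        return {"deposit_dataset", "set_license"}

    def handle(self, event: Event) -> dict:
        dataset_id = event.payload["dataset_id"]
        record = self.datasets.setdefault(dataset_id, {})
        if event.operation == "deposit_dataset":
            record["title"] = event.payload["title"]
        elif event.operation == "set_license":
            record["license"] = event.payload["license"]
        return {"status": "ok", "dataset_id": dataset_id}


def dispatch(events: list[Event], platform: MaDMPReadyPlatform) -> list[dict]:
    """Send each event to the platform and flag operations it cannot execute."""
    supported = platform.supported_operations()
    results = []
    for event in events:
        if event.operation not in supported:
            results.append({"status": "unsupported", "operation": event.operation})
            continue
        results.append(platform.handle(event))
    return results
```

The point of the sketch is the failure case: an event the platform does not support is reported rather than silently dropped, which is the situation in which a maDMP prescribes something no software can execute.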

    *2. maDMPs should be modular*

    Every research project has its own way of handling data, but some needs are
    common, as shown by the recent tools that aid in building a DMP. A very
    strong interest of maDMPs thus lies in possible reuse either as a whole or
    as building blocks. The maDMP should expose the subprocesses inside it in a
    modular way, with certain elementary validation and processing workflows
    being modeled and shareable using existing modeling languages, such as BPMN, as
    shown at the workshop. Other modeling languages could be used as long as
    they have both a standard visual representation for being included in the
    “printed” DMP, much like UML, and a machine-processable representation in
    XML. Such maDMP “building blocks”, so to speak, could then be reused in a
    project's maDMP and published in a maDMP “directory” for others to access.

    People interested in a maDMP for their project could then:

    a) Download and reuse an existing maDMP for a project that they know went well

    b) Reuse only the maDMP “building blocks” for metadata validation, dataset
    availability, repository compliance, etc. (modeled using BPMN)

    c) Build their own from scratch and share it to this “directory” of maDMPs
    for others to reuse
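
The reuse options above can be sketched roughly as follows, with plain Python functions standing in for the BPMN-modeled building blocks. This is only a toy under stated assumptions: the `DIRECTORY` registry, the block names and the required metadata fields are all hypothetical, and a real implementation would exchange process models rather than functions.

```python
from typing import Callable

# A building block takes a dataset description and returns a list of problems.
Block = Callable[[dict], list[str]]

# Hypothetical shared "directory" of reusable maDMP building blocks.
DIRECTORY: dict[str, Block] = {}


def publish(name: str) -> Callable[[Block], Block]:
    """Register a building block in the directory under the given name."""
    def register(block: Block) -> Block:
        DIRECTORY[name] = block
        return block
    return register


@publish("metadata-validation")
def validate_metadata(dataset: dict) -> list[str]:
    # The required fields are an assumption made for this example.
    required = ("title", "creator", "license")
    return [f"missing metadata field: {key}" for key in required if not dataset.get(key)]


@publish("dataset-availability")
def check_availability(dataset: dict) -> list[str]:
    return [] if dataset.get("url") else ["dataset has no access URL"]


def run_plan(block_names: list[str], dataset: dict) -> list[str]:
    """Compose a project's checks from published blocks and collect all problems."""
    problems: list[str] = []
    for name in block_names:
        problems.extend(DIRECTORY[name](dataset))
    return problems
```

A project that only wants the metadata check would call `run_plan(["metadata-validation"], dataset)`, which corresponds to option b); publishing a new block with `publish` corresponds to option c).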

    *3. The code that actually executes the actions in the maDMP’s processes
    should travel with them (Rules 6, 7, 9)*

    If possible, the small pieces of code behind every step of the modeled
    workflows should be open-source and retrievable as needed by the workflow
    engine as it runs the BPMN processes. Like this, vendors could fork and
    adjust them to the existing APIs of their repository software or other
    software that the maDMP needs to “remote-control” to execute the automation
    steps specified in the modeled process. This is somewhat similar to ETL
    tools, but fetching platform-specific code as needed. A commercial example
    of a graphical ETL tool is Pentaho Data Integration (by the way, I am not
    affiliated with them; I have only used the tool myself).

    Best,

    *João Rocha da Silva *
    Invited Assistant Professor, Dendro Lead Developer, Research Data Management
    Faculty of Engineering of the University of Porto, Portugal
    *ORCID*: https://orcid.org/0000-0001-9659-6256
    *GitHub*: https://github.com/silvae86

    On 26/09/2018, at 12:35, philipp.conzett@uit.no wrote:

    Thanks, Rob and Tomasz, for valuable feedback!

    I perfectly agree with the idea and rationale behind making DMPs machine
    actionable. If this is implemented in a meaningful way, it surely will
    reduce the administrative burden for researchers. As you say, Rob, this way
    "a complicated question can be split up into more easily answered closed
    questions that together provide the information after a proper consideration
    of all the options and issues". I had a look at the Questionnaire demo of
    the Data Stewardship Wizard from ELIXIR. Here, researchers are guided
    through the whole lifecycle of research data. But without such guidance
    and sensible grouping of questions, filling in a DMP form may turn into a
    disengaging and frustrating duty. If you read my feedback on the easyDMP
    tool, you'll probably understand what I mean.

    Best,
    Philipp

    --
    Full post:
    https://rd-alliance.org/group/dmp-common-standards-wg/post/machine-actio...
    Manage my subscriptions: https://rd-alliance.org/mailinglist
    Stop emails for this post: https://rd-alliance.org/mailinglist/unsubscribe/60738
