Virtual Research Environment IG (VRE-IG)

Virtual Research Environments and Reference Architecture(s) for them

March 29, 2017 at 10:48 am #119470

Leonardo Candela
Participant

I just found this paper with a very promising and inspiring title:
K. Jeffery, C. Meghini, C. Concordia, T. Patkos, V. Brasse, J. van Ossenbruck, Y. Marketakis, N. Minadakis & E. Marchetti, “A Reference Architecture for Virtual Research Environments”, 15th International Symposium of Information Science, 13-15 March 2017, Humboldt-Universität zu Berlin
You can read it in the online version of the proceedings: http://isi2017.ib.hu-berlin.de/proceedings.html
The paper results from the VRE4EIC project and I think it is of interest to this group.
I’m a member of the BlueBRIDGE project and I have been involved in VRE development for years. I would like to complement the content of the paper with our experience and stimulate a fruitful discussion. I focus on two points: (i) the approach BlueBRIDGE actually takes, and (ii) whether a single Reference Architecture for VREs is suitable or not.
The paper characterises the BlueBRIDGE approach as follows: “BlueBridge produces a VRE that is tightly coupled to the underlying e-RIs;”. I would say that this is not correct, for several reasons:

* BlueBRIDGE is developing and operating a series of Virtual Research Environments, each tailored to serve the needs of a dedicated community. VREs are dynamically created. The currently deployed ones are available at https://bluebridge.d4science.org/explore (some of them are freely available);
* Although BlueBRIDGE is oriented to serve the “marine” domain, its underlying “primary” infrastructure (D4Science.org) and enabling technology (gCube) are generic. In fact, they have been and will be exploited to serve the needs of diverse communities and application contexts, including social mining scientists, environmental scientists, agriculture scientists, cultural heritage practitioners, geothermal scientists, the multidisciplinary community dealing with scholarly communication and open science, and data science educators;
* D4Science is an infrastructure built with the “system of systems” approach; in reality it is an ecosystem of ICT infrastructures. Each VRE built on D4Science facilities actually relies on services coming from many infrastructures and providers;
* An overview of the enabling technology is included in the following paper: Assante M, Candela L, Castelli D, Coro G, Lelii L, Pagano P. (2016) Virtual research environments as-a-service by gCube. PeerJ Preprints 4:e2511v1 https://doi.org/10.7287/peerj.preprints.2511v1

On Reference Architectures (RA) for VREs. As far as I know, an RA is a software system architecture expected to serve as a sort of template for software systems in a particular domain. The paper proposes one consisting of three tiers (Application, Interoperability, Resource Access) and six conceptual components (System Manager, Workflow Manager, Linked Data Manager, Metadata Manager, Interoperability Manager, AAAI). The lack of a real definition of what a VRE is or is expected to be, together with the level of detail, makes it really challenging to comment on this proposal, e.g.:

* What is VRE-specific in this set of components? What makes them a must-have for a system to be called a VRE? It seems to me that the same set of components could be envisaged for any system that aims to integrate resources from existing systems;
* Is a VRE expected to support collaboration and cooperation among the members of its designated community? If yes, what are the envisaged components?
* What is the development and deployment model suggested by this reference architecture? Is each community willing to build its own VRE expected to take care of implementing what is needed (by relying on existing services)? Gluing the “components” together might be time consuming. Is this something to be done per VRE?
* Is a workflow-driven approach the “one-size-fits-all” solution for VRE members who want to define and execute their processing tasks? Do WFMSs exist that are suitable for any need? Is there no need to explicitly develop code or a script for part of a processing task?

These are some of the questions that come to my mind. There might be a gap to fill between “VRE users’ expectations” and “VRE promises”; a Reference Architecture should carefully describe the promises. Is it actually possible to define “the” reference architecture for VREs? Should we look at defining a series of Reference Architectures, each tailored to serve a specific domain or to describe a specific class of VRE?

March 29, 2017 at 1:04 pm #131946

Keith Jeffery
Member

Leo –
Thanks for this. I agree there is plenty of room for discussion on VREs and I look forward to a lively discussion at RDA P9 Barcelona.
One important area is definitions: VRE4EIC takes the view that the VRE (e-VRE in VRE4EIC) is the user-facing component of the overall ICT-supported research environment and – in the research domain – the other components of the environment (outside the VRE but connected to it) are one or more RIs (research infrastructures representing their assets or resources digitally as e-RIs) and one or more e-Infrastructures (e-Is, usually computing resources, sensors/detectors and related hardware). Of course the e-RIs also include computing capability, and possibly sensors, which can accommodate some or all of the required processing (and this may be mandatory for certain kinds of data or certain processes). The e-Is are seen as the additional capacity required for certain workflow executions (e.g. scale-out for performance). In this sense the additional capacity could be provided by public commercial cloud platforms or by any other cloud platform supplier, subject to the usual non-functional requirements (NFRs).
On the other hand, if I understand correctly from your publications, BlueBridge uses VRE to mean ‘the whole package’ including for each instance a RI and appropriate e-Is such as EGI and EUDAT. This led to the remark in the paper characterising BlueBridge.
Concerning reference architecture, VRE4EIC considers it to be a specification of components and interfaces – a kind of template. This means that any implementation (as a whole or of component software) should follow the template and thus be interoperable. VRE4EIC is developing a set of components for the defined reference architecture but – of course – following the philosophy adopted with regard to e-RIs and e-Is each component can be replaced if/when a better implementation of that function (processes) or dataset is produced. A key component is the metadata catalog because this provides to the end-user of the VRE the ‘view’ over the e-RIs and e-Is utilised for any particular research environment.
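To make the "template of components and interfaces" idea concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the names `MetadataCatalog`, `InMemoryCatalog`, `PrefixedCatalog`, and `VirtualResearchEnvironment` are hypothetical and are not taken from VRE4EIC or any other project. The point is only that a VRE written against an interface keeps working when one component implementation is swapped for another.

```python
from typing import Protocol


class MetadataCatalog(Protocol):
    """Hypothetical interface: the 'template' every catalog implementation follows."""

    def search(self, keyword: str) -> list[str]:
        """Return identifiers of resources whose description matches the keyword."""
        ...


class InMemoryCatalog:
    """Illustrative implementation backed by a dict of identifier -> description."""

    def __init__(self, records: dict[str, str]) -> None:
        self._records = records

    def search(self, keyword: str) -> list[str]:
        needle = keyword.lower()
        return sorted(
            rid for rid, text in self._records.items() if needle in text.lower()
        )


class PrefixedCatalog:
    """Illustrative replacement: wraps another catalog and tags ids with a source name."""

    def __init__(self, source: str, inner: MetadataCatalog) -> None:
        self._source = source
        self._inner = inner

    def search(self, keyword: str) -> list[str]:
        return [f"{self._source}:{rid}" for rid in self._inner.search(keyword)]


class VirtualResearchEnvironment:
    """User-facing layer; it depends on the interface, not on a concrete catalog."""

    def __init__(self, catalog: MetadataCatalog) -> None:
        self.catalog = catalog

    def find_resources(self, keyword: str) -> list[str]:
        return self.catalog.search(keyword)


if __name__ == "__main__":
    records = {
        "ds1": "Marine species occurrence data",
        "ds2": "Geothermal well logs",
    }
    vre = VirtualResearchEnvironment(InMemoryCatalog(records))
    print(vre.find_resources("marine"))
    # Swap the component without touching the VRE class.
    vre = VirtualResearchEnvironment(PrefixedCatalog("eri-a", InMemoryCatalog(records)))
    print(vre.find_resources("marine"))
```

A real reference architecture would of course specify far richer interfaces (authentication, workflows, provenance); the sketch covers only the substitution property.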
BTW I hope to get to the BlueBridge event on Monday; it all depends on when my flight arrives (early afternoon)
With best wishes
Keith
——————————————————————————–
Keith G Jeffery Consultants
Prof Keith G Jeffery
E: ***@***.***
T: +44 7768 446088
S: keithgjeffery
Past President ERCIM http://www.ercim.eu (***@***.***)
Past President euroCRIS http://www.eurocris.org
Past Vice President VLDB http://www.vldb.org
Fellow (CITP, CEng) BCS http://www.bcs.org
Co-chair RDA MIG https://rd-alliance.org/internal-groups/metadata-ig.html
Co-chair RDA MSDWG https://rd-alliance.org/working-groups/metadata-standards-directory-work…
Co-chair RDA DICIG https://rd-alliance.org/internal-groups/data-context-ig.html
———————————————————————————————————————————-
April 5, 2017 at 7:40 am #131912

Kheeran Dharmawardena
Member

Hi all,

Unfortunately I am not going to be in Barcelona, but I hope my contributions here help the discussions there.

Expanding on and responding to Leo’s comments, I too am of the view that in the research space a single reference architecture is unlikely to have much utility. The diversity across research domains is so vast that we either end up with a reference architecture that is too abstract and vague to be of practical use, or with one that is very useful for some domains but not others. My suggestion is that we look at a set of reference architectures that are quite diverse yet well suited to non-overlapping research domains. I feel this way we will have a set of reference architectures that collectively have a broad reach and good utility.

Another thing I would like to comment on is that I feel we do not yet have a clear understanding of what a VRE is for the purpose of creating Reference Architectures. From my reading, we seem to define VRE as defined by VRE4EIC, and then include Virtual Laboratories and Science Gateways as also being VREs. VREs (as defined by VRE4EIC), VLs and Science Gateways, while closely related, are also subtly different. I think we would be well served if we developed a clear definition of what we mean by a VRE and then compared and contrasted VRE4EIC, VLs and Science Gateways against this definition.

I would also like to see this IG not only develop reference architectures but also develop a set of implementation patterns, derived from current (successful) implementations of VREs, VLs and SGs, that map back to the reference architectures. I believe this would provide practical guidance to those who are developing VREs for research.

One final comment I’d like to make is that when we consider interoperability we should do so at two levels: interoperability between e-RIs across domains, and interoperability between e-RIs within a domain. It is important for us to continue to recognise that specialised coupling of e-RIs and e-Is within a particular research domain is crucial to advancing that field of research and pushing the boundaries of knowledge. I would assert that this should be the primary concern, which is why current investment and effort have been focused on specific domains. However, there are also many advantages to interoperability across domains, and it is in many respects far more challenging to achieve in practice. That is where a reference architecture can be of great help.

Kind regards

Kheeran Dharmawardena
April 5, 2017 at 7:55 am #131911

Keith Jeffery
Member

Kheeran –
Many thanks for your comments.
I certainly agree that there is room for specialised research environments. However, for interoperability (recall RDA is about data access and interoperability) life is very complex if there are many architectures (or at least many heterogeneous specifications and implementations). Hence the idea of one reference architecture for this purpose – more specifically, one specification of a conceptual (not necessarily physically realised) rich canonical metadata scheme as the conversion target from each local catalog. This has been known and discussed in the literature for more than 30 years. Implementation is of course difficult, hence progress is currently made mostly in domain-specific implementations with limited interoperability.
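The canonical-scheme argument can be sketched in a few lines of Python. This is a hedged illustration, not anyone's actual scheme: the `CanonicalRecord` fields, the two local catalog layouts, and all function names are assumptions invented for the example. It shows each local catalog needing only one converter into the canonical form, and why that scales better than converting between every pair of catalogs directly.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class CanonicalRecord:
    """Hypothetical canonical metadata record (fields chosen for illustration)."""
    identifier: str
    title: str
    domain: str


def from_marine_catalog(row: dict) -> CanonicalRecord:
    # Assumed local layout: {"id": ..., "name": ...}
    return CanonicalRecord(identifier=row["id"], title=row["name"], domain="marine")


def from_geothermal_catalog(row: dict) -> CanonicalRecord:
    # Assumed local layout: {"uuid": ..., "label": ...}
    return CanonicalRecord(
        identifier=row["uuid"], title=row["label"], domain="geothermal"
    )


# One converter per local catalog, all targeting the same canonical scheme.
CONVERTERS: dict[str, Callable[[dict], CanonicalRecord]] = {
    "marine": from_marine_catalog,
    "geothermal": from_geothermal_catalog,
}


def to_canonical(catalog: str, row: dict) -> CanonicalRecord:
    """Convert a local catalog row into the canonical form."""
    return CONVERTERS[catalog](row)


def pairwise_converters(n: int) -> int:
    """Directed converters needed if every catalog maps to every other one."""
    return n * (n - 1)


def canonical_converters(n: int) -> int:
    """Converters needed if every catalog maps once into the canonical scheme."""
    return n


if __name__ == "__main__":
    print(to_canonical("marine", {"id": "m1", "name": "Fish stocks"}))
    print(to_canonical("geothermal", {"uuid": "g7", "label": "Well logs"}))
    print(pairwise_converters(10), canonical_converters(10))
```

The hard part in practice is not the mechanics shown here but agreeing on a canonical scheme rich enough that conversion from each local catalog does not lose information.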
I also agree that if we can find agreed definitions of terms that would assist greatly in reducing confusion. I agree that implementations based on the (finally agreed) reference architecture are necessary to prove the architecture.
Thanks again for the comments
Best
Keith