Meeting 2018-02-28

Meeting on Wednesday 2018-02-28

Attendance:

  • Milan Daneček, CESNET
  • Paul Millar, DESY
  • Mikael Borg, NBIS
  • Nicholas Car, CSIRO
  • Bev Jones, University of Lincoln

Apologies:

  • Ville Tenhunen, University of Helsinki

1. Updating our case-statement

Milan reported that David has contacted the National Data Service WG, but has not yet received a response.

Mikael reported that he contacted the Research Data Architectures in Research Institutes (RDARI) IG.  Ville was very interested in the opportunity to present his IG during the P11 meeting.

Nick mentioned Adrian Burton, a (former) colleague from the Australian National Data Service.  The link would be via the National Data Infrastructure (NDI) government research cloud.

ACTION: [NC] contact Adrian Burton to see if he is interested in our work.

Nick also offered some advice, based on his experience with RDA groups: any collaboration between two RDA groups needs to be quite specific about what it aims to achieve; otherwise it is hard to garner engagement.

Our plans for collaborating with the NDI are currently quite nebulous.  Nick suggested we find some specific problem that we could address; for example, could we present a methodology that would allow them (NDI) to compare private sector offerings and perhaps in-house ones, too?

Australia is currently being strongly encouraged to adopt commercial cloud solutions, which makes the work of our WG of interest.

2. Plans for P11 break-out session

There is a page advertising our break-out session available here: https://rd-alliance.org/wg-storage-service-definitions-rda-11th-plenary-meeting

This includes a draft agenda.  The discussion focused on the items in that agenda.

Introduction round: who's who.

This is largely self-contained.

Update on our current status.

Paul asked people to send him any input that should be included here.

ACTION: [ALL] send Paul pertinent status update information.

Links with related activities -- the broader picture.

Nick suggested this section include some description of our expectations for this group; more specifically, a description of the group's scope.  This could be done quite naturally using specific examples: we show that some particular use-case is "in scope", that another use-case is "out of scope", and that, for a third use-case, we're not sure.

Ville's talk about the RDARI IG would also naturally fit in this section.  Similarly, Nick could give a talk about the use-cases coming from Australia.

Paul said he imagines this part of the session will likely involve the most organic discussion, depending on who is in the room.

Plan for entering real-world data.

Milan will give a description of their use-cases (a.k.a. the "horror stories").

Nick could give a similar talk from the Australian perspective: moving storage to commercial / offsite offerings with no criteria for comparing them or deciding if the offering satisfies minimum requirements.

SKOS vs Ontologies: where we are and where we're going

This is a fairly standard talk for Nick, and one he felt would fit quite naturally for this group.  This could lead to a discussion on what we need to satisfy our use-cases.

We could move the "SKOS vs Ontologies" section so it comes before the "Links with related activities" section, so that any discussion naturally continues into the latter.

Service discovery: what's available and what can we use?

Mikael reported he has started investigating Consul and made contact with HashiCorp.  His initial impression is that Consul seems more geared towards compute than storage.

Paul hasn't done much yet on GLUE, but will do so.

ACTION: [PM] investigate and summarise GLUE support for QoS.

Given Mikael unfortunately can't make our session, Paul will give this talk, summarising both.

Plan for adoption as EC ICT specification.

Mikael contacted Raphael... and has just heard back.  Raphael will be able to make our break-out session.

After some discussion, we decided to move the "Links with related activities" section to the end, as the other sections can be time-boxed.  This would allow the discussion to include others in the room without fear of time limits.

ACTION: [PM] to adjust our P11 agenda to move Links-with section to the last point.

Unfortunately Bev can't make it to our P11 session.

ACTION: [PM] to contact the conference organisers to ask about possible video connection.

Paul will work with Bev to get her new laptop working with our video conference system.

ACTION: [PM & BJ] coordinate time to test video conference system.

3. TAB submission

Nick reported that TAB last met on Friday February 16th.  The board meets monthly, so the next meeting will likely be around Friday 16th March.  We would like to submit our updated case-statement in that timescale.

ACTION: [PM] to update the current case-statement by next Wednesday (2018-03-07) and ask people for input, as needed.

ACTION: [PM] to contact our TAB liaison (Rainer) to confirm the deadline for submitting our updated case statement.

4. Microsoft 365

Nick described a specific use-case that might (or might not) be something this WG would be interested in.

Microsoft provides approximately 1 TiB of storage with each Office 365 account.  This is somewhat similar to what Google provides with Google Drive.

Although this isn't "high performance" and isn't intended for storing scientific data, it could be used to store important documents.

Paul suggested that, although nobody would be interested in storing scientific data with this service, it may still be of interest to describe MS Office 365 or Google Drive storage, if only so we could rule it out for storing scientific data.

Nick also noted that, combined, the provisioned storage accessible through these services can be non-trivial.  As an example, Nick has some 3 TiB of storage from various cloud offerings.

The Australian Records Keeping office accepts documents being stored in Microsoft 365, but would need an "exit plan" -- what to do if the storage becomes unavailable.  This could be simply copying all the data to some other storage provider.

So, perhaps this could be an example of what a storage system means, practically.

5. Provenance Patterns Database

See http://patterns.promsns.org/

Mikael reported he added a use-case to the Provenance Patterns WG web portal, and asked whether the submission was OK.

Nick remembered seeing the content was added, but then couldn't find it.  He will investigate.

Nick also reported that, in addition to "use case" and "pattern" concepts (which support an n-to-n mapping), they are adding "implementer" as a new concept.

The rationale is to show which of these items are actually being adopted: either currently or in the future.  This matters from a funding perspective.
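
One way to picture the n-to-n mapping and the new "implementer" concept is sketched below in Python.  This is only an illustration: all identifiers and records are hypothetical, and it does not reflect how the Provenance Patterns portal is actually built.

```python
from collections import defaultdict

# Many-to-many links between use cases and patterns (hypothetical data).
use_case_patterns = [
    ("uc-storage-qos", "pattern-a"),
    ("uc-storage-qos", "pattern-b"),
    ("uc-data-citation", "pattern-b"),
]

# Each implementer record says who adopts which item, and whether the
# adoption is "current" or "planned" (hypothetical status values).
implementations = [
    ("org-x", "pattern-b", "current"),
    ("org-y", "pattern-a", "planned"),
]

# Build lookups in both directions, since the mapping is n-to-n.
patterns_for = defaultdict(set)
use_cases_for = defaultdict(set)
for use_case, pattern in use_case_patterns:
    patterns_for[use_case].add(pattern)
    use_cases_for[pattern].add(use_case)


def adopted(status):
    """Return the sorted items that have an implementer with the given status."""
    return sorted({item for _, item, s in implementations if s == status})


print(sorted(patterns_for["uc-storage-qos"]))
print(sorted(use_cases_for["pattern-b"]))
print(adopted("current"), adopted("planned"))
```

Keeping adoption status on the implementer record, rather than on the pattern itself, lets the same item be adopted now by one organisation and planned by another.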

6. The AGRIF demo

See http://reference.data.gov.au/agrifdemo

Nick mentioned the AGRIF demo in terms of selecting cloud storage.

This demo includes some concept of the "appropriateness" of a storage system.  A dataset could be stored on a storage-system that is somehow inappropriate.  This is left vague within the demo, but one example could be a requirement for data to reside on storage located in the EU.

One interesting activity would be to cast this example in terms of match-making between what a storage service provides and what is required.  There could be an analysis that flags data stored inappropriately.
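
As a minimal sketch of this match-making idea, the following Python compares what a storage service advertises against what a dataset requires and flags any mismatch.  The attribute name (region), the example services and datasets, and the helper functions are all hypothetical; none of them is taken from the AGRIF demo.

```python
from dataclasses import dataclass, field


@dataclass
class StorageService:
    name: str
    # Properties the service advertises (hypothetical attribute names).
    provides: dict = field(default_factory=dict)


@dataclass
class Dataset:
    name: str
    stored_on: str  # name of the storage service holding the data
    # Each requirement maps an attribute to the set of acceptable values.
    requires: dict = field(default_factory=dict)


def unmet_requirements(dataset, service):
    """Return the attributes for which the service fails the dataset's requirements."""
    return sorted(
        attr
        for attr, acceptable in dataset.requires.items()
        if service.provides.get(attr) not in acceptable
    )


def find_inappropriate(datasets, services):
    """Map each inappropriately stored dataset to its unmet requirements."""
    by_name = {s.name: s for s in services}
    flagged = {}
    for ds in datasets:
        unmet = unmet_requirements(ds, by_name[ds.stored_on])
        if unmet:
            flagged[ds.name] = unmet
    return flagged


services = [
    StorageService("eu-archive", {"region": "EU"}),
    StorageService("us-cloud", {"region": "US"}),
]
datasets = [
    Dataset("survey-2017", "eu-archive", {"region": {"EU"}}),
    Dataset("patient-records", "us-cloud", {"region": {"EU"}}),
]

flagged = find_inappropriate(datasets, services)
print(flagged)
```

Here the second dataset is flagged because it requires EU-located storage but sits on a service advertising a different region.  A service that does not advertise an attribute at all is treated as failing that requirement.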

Actions summary

Created:

  • 2018-02-28:1 [NC] contact Adrian Burton to see if he is interested in our work.
  • 2018-02-28:2 [ALL] send Paul pertinent status update information.
  • 2018-02-28:3 [PM] investigate and summarise GLUE support for QoS.
  • 2018-02-28:4 [PM] to adjust our P11 agenda to move Links-with section to the last point.
  • 2018-02-28:5 [PM] to contact the conference organisers to ask about possible video connection.
  • 2018-02-28:6 [PM & BJ] coordinate time to test video conference system.
  • 2018-02-28:7 [PM] to update the current case-statement by next Wednesday (2018-03-07) and ask people for input, as needed.
  • 2018-02-28:8 [PM] to contact our TAB liaison (Rainer) to confirm the deadline for submitting our updated case statement.

DTNM

The next meeting will be part of our regular meeting series, so in two weeks: 2018-03-14 10:00 CET.