Here are my notes from today's call. Please feel free to extend.
Our next call is scheduled for Aug 17, 13:00 UTC.
* We discussed several use cases, particularly from RPID, in terms of
how the strawman fits and what fields might be missing
o Typical properties were related to time, geolocation; there are
also more 'domain properties' like temperature, wind speed - but
are these actually necessary for fundamental decision making?
o Can we form 'packages' out of this, e.g. KI for trust, KI for
geo/time, KI for environment?
* RPID: 4 exemplary scenarios; often need more information than in the
strawman; can this be broken down into parts, where parts of the
scenarios are enabled purely by the trust KI?
o Example 1: weather scenario: sensor network data, group by
dates, publish as 'daily research objects'; using all the
mandatory fields in the strawman; the second part is then the
analysis part, but RPID has not yet proceeded to the filtering
case. From what is there, it looks like creation date (already in
the strawman) and device ID (can't be included) are important.
o Example 2: rice genomics: phenotypes & genotypes data;
copyright/licensing is a big issue - who created data, who
published data; also uses derivedFrom; future properties may be:
publication date, also geoinformation
* Discussion at IU: pulling info from domains will just make the
profile bigger; this is not what we want, but no clarity on what
else to do, so the discussion stopped. But the problem remains.
o This is familiar also from the previous PIT group work. We also
got to that point and did not have any answers.
* One way to approach this: What is the value of the limited profile?
If this stands on its own, what does it enable? Does it enable
enough (cost/benefit ratio)?
o Dublin Core or DataCite must have been there. Can we learn from
them? But: This is not about metadata fields, but about the
conceptual process that leads to including or not including them
(or in what form). We want to clarify that process for our KI
* Currently, we can't see a clear limit to the profile. So we want to
structure the decisions on what field to include or not include.
o Ulrich got to something: took down some first ideas for
structuring: graphs; some sort of ordering (the typical 'date'
use case, but also geolocation; string ordering); patterns in
strings; (there were more - I did not get all of them...)
o It's all geared to give easy 'yes'/'no' answers
* Ulrich got there by thinking about RDA discussions and what currents
run in them. Example: versioning discussions in RDA have always been
based on different understandings of what versioning is, e.g. version
numbers of objects vs. graph lineage (git etc.); they are
orthogonal, but both provide some form of ordering - and therefore,
ordering in principle seems to be an interesting/relevant part;
comparability of versions seems to be important;
o Can we find more such examples?
o Guideline is always: What information is at a generic level
required to crawl through DOs? and then it all ends up at these
o TW: another example for a recurring RDA discussion could be
granularity/collection/subsetting - but what is the
generalization of this that leads to yes/no decisions?
o Another example: pattern in strings: is about searching - search
questions are always about string pattern matching; this again
leads to yes/no decisions
+ Can we also look the other way, i.e.: which processes
ultimately boil down to a pattern matching question?
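
To make the 'yes'/'no' idea above a bit more concrete, here is a minimal Python sketch of the three structuring ideas from the notes (ordering by date, graph lineage, patterns in strings), each expressed as a predicate over kernel information records. This is only an illustration, not something agreed on the call: the record layout, the PIDs, the `name` field and all function names are hypothetical; `creationDate` and `derivedFrom` are stand-ins for the creation date and derivedFrom properties mentioned above.

```python
import re
from datetime import datetime

# Hypothetical kernel information records, keyed by PID.
# Field names and values are illustrative assumptions only.
RECORDS = {
    "pid/1": {"creationDate": "2017-08-01T00:00:00",
              "derivedFrom": None,
              "name": "weather-daily-2017-08-01"},
    "pid/2": {"creationDate": "2017-08-02T00:00:00",
              "derivedFrom": "pid/1",
              "name": "weather-daily-2017-08-02"},
    "pid/3": {"creationDate": "2017-08-03T00:00:00",
              "derivedFrom": "pid/2",
              "name": "rice-genotypes-v1"},
}


def created_before(a, b, records=RECORDS):
    """Ordering: was object a created strictly before object b?"""
    date_a = datetime.fromisoformat(records[a]["creationDate"])
    date_b = datetime.fromisoformat(records[b]["creationDate"])
    return date_a < date_b


def is_ancestor(ancestor, descendant, records=RECORDS):
    """Graph lineage: is ancestor reachable from descendant via derivedFrom?"""
    seen = set()  # guards against cycles in the lineage links
    current = records.get(descendant, {}).get("derivedFrom")
    while current is not None and current not in seen:
        if current == ancestor:
            return True
        seen.add(current)
        current = records.get(current, {}).get("derivedFrom")
    return False


def name_matches(pid, pattern, records=RECORDS):
    """String pattern: does the (hypothetical) name field match the regex?"""
    return re.search(pattern, records[pid]["name"]) is not None
```

Each function only needs the record itself, not the data object, which is the point of the 'crawl through DOs' guideline: for example, `is_ancestor("pid/1", "pid/3")` follows two derivedFrom links and answers yes, while `name_matches("pid/3", r"^weather")` answers no.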
Dr. Tobias Weigel
Deutsches Klimarechenzentrum GmbH (DKRZ)
Bundesstraße 45 a • 20146 Hamburg • Germany
Phone: +49 40 460094-104
Geschäftsführer: Prof. Dr. Thomas Ludwig
Sitz der Gesellschaft: Hamburg
Amtsgericht Hamburg HRB 39784
Notes from today's PID KernelInfo call