As part of the DFT group, I'm wondering if the MIG has any definition(s) of metadata to propose that would be useful for our efforts to define things and for your interests.
As context, we have had some working definitions, a bit of which was discussed at P3.
We start very generally with "Metadata is a type of data object that contains attributes describing properties of an associated data or digital object. The association between a data object and metadata is that the content of the metadata describes the data object."
Since metadata plays different roles for different functions, we added some additional ideas, such as a role in PIDs.
"It may contain as key the persistent identifier of that associated object." But things get a little complicated, as we note that metadata (MD) may serve different purposes, such as helping people to find data of relevance (discovery; Michener 2006) or bringing data together (a federation role).
And we can note many other roles:
(aiding in) Discovery, Access, providing context, Selection, Licensing, authorization, Quality, suitability and Provenance (such as describing how data were gathered, reproducibility or summarizing terms for reuse).
Because of the different roles we may distinguish many different types of MD such as:
Administrative Metadata "Provides information about how to manage a resource, such as rights data, intellectual property info, date of creation and editing – can be very important for legal reasons." or
Structural Metadata
The structural organization of a data object, such as chapters in a book, sentences in a chapter, etc., that allows us to figure out how an object should be put together. Also refers to the underlying structural metadata of digital objects that tells computers how to assemble them.
To this one may add a list of MD attributes such as:
Quality & Provenance (such as describing how data were gathered, reproducibility or summarizing terms for reuse.)
When all this is taken into account, and it is just part of the story, we end up with less of a definition and more of an encyclopedia entry. But perhaps this is still useful.
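A minimal sketch of the working definition above may help make it concrete: a metadata object that carries the persistent identifier of the object it describes as a key, a set of descriptive attributes, and the roles it is meant to serve. The class and field names (`MetadataRecord`, `describes_pid`, `Role`) and the PID strings are illustrative assumptions, not part of any DFT or MIG definition.

```python
from dataclasses import dataclass, field
from enum import Enum


class Role(Enum):
    # A few of the roles listed in the post; the selection is illustrative.
    DISCOVERY = "discovery"
    ACCESS = "access"
    LICENSING = "licensing"
    PROVENANCE = "provenance"


@dataclass
class MetadataRecord:
    # PID of the associated data object (hypothetical identifier format).
    describes_pid: str
    # Attributes describing properties of the associated object.
    attributes: dict[str, str] = field(default_factory=dict)
    # Roles this metadata is intended to play.
    roles: set[Role] = field(default_factory=set)


def records_for(pid: str, records: list[MetadataRecord]) -> list[MetadataRecord]:
    """Return every metadata record whose key is the given PID."""
    return [r for r in records if r.describes_pid == pid]


def records_with_role(role: Role, records: list[MetadataRecord]) -> list[MetadataRecord]:
    """Return the records that declare the given role."""
    return [r for r in records if role in r.roles]
```

The same data object can have several records serving different roles, which matches the observation that metadata is better characterised by the role it plays than by a single definition.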
Author: Quentin Reul
Date: 30 Apr, 2014
Hi Gary,
I tend to make a distinction between "*descriptive*" and "*functional*"
metadata. The first type of metadata describes the digital object (e.g.
topic, creation date, etc.), while the second type encodes information that
drives functionality on a platform (e.g. sort date, etc.). I realize that
the distinction can be quite artificial.
With regards to "*structural*" metadata, are you referring to the dynamic
combination of digital objects? Or is it about how digital objects are
managed? Although I don't disagree that we want to represent the relations
between digital objects, I'm not sure that relations between them are really
metadata.
Kind regards,
Quentin
Author: Gary Berg-Cross
Date: 30 Apr, 2014
Quentin,
Hello, seems like we were just together at the Ontology Summit.....
>"*descriptive*" and "*functional*" metadata... I realize that the
distinction can be quite artificial. "
To me metadata is a role that some information, represented as data, plays.
So we can create as many roles for it as we think useful. That's why we
need to take a community view of what people find useful.
Types like "structural" make a good deal of sense from our experience and
we might be saying that this data object that we are describing is made up
of the following things or that it is part of a data collection.
I can imagine that such structural info might be useful for some functions
too, although I'm not clear if you are just talking about data management
functions or more broadly into analytic things.
I can imagine situations in which the 2 categories may not be orthogonal. (But
we might have some agreement on types of metadata and what attributes, like
a Dublin Core, should be populated.)
As an example of non-orthogonality, if we know certain structural relations
it might help search or query functions.
One other thought. There is RDA WG activity on defining Data Types, so
this would be corresponding work to define MD types.
>Although I don't disagree that we want to represent the relation between
digital object, I'm not sure that relations between them is really metadata.
Yes, but then how should we classify or think about this info that
describes relations between digital objects? If the DOs make up a
collection then it might be thought of as metadata about the collection or
some of its structural parts.
Gary Berg-Cross, Ph.D.
***@***.***
http://ontolog.cim3.net/cgi-bin/wiki.pl?GaryBergCross
NSF INTEROP Project
http://www.nsf.gov/awardsearch/showAward.do?AwardNumber=0955816
SOCoP Executive Secretary
Independent Consultant
Potomac, MD
240-426-0770
Author: Chris Taylor
Date: 30 Apr, 2014
Hi,
I think a little taxonomy of metadata would be extremely useful. As you
say, the broad categories of administrative and structural (I'd maybe
separate 'descriptive' or something as a third broad category distinct from
structural) can be broken down further, so this is certainly non-trivial,
but we need this I think. Ideally as an XSD somewhere.
Could we hack something together somewhere? Just start with your list...
Chris.
Author: Chris Taylor
Date: 30 Apr, 2014
Hi,
On metadata describing interrelationships: the ISA structure relies on that
sort of information; basically, an 'investigation' describes the
relationship between studies and assays in a body of work (
http://www.isa-tools.org/). #justsaying :)
Chris.
Author: Keith Jeffery
Date: 01 May, 2014
All –
Really good to see this discussion. Can I try to inject some more structuring into it?
1. First can we throw out the commonly used definition that metadata is ‘data about data’ ?
2. Can we agree that there is no difference between data and metadata – except the purpose for which it is used? For example, a library catalog card is metadata for the researcher finding the book on the shelf but data for the librarian counting books on ‘biochemistry’.
3. Can we agree metadata is multidimensional? Most published classifications rely on intrinsic properties or functional usage. Some (DC, DCAT) just relate to the dataset, some (e.g. ISA) provide some context.
4. Just to get the discussion rolling, how about this:
a. Dimension 1: ‘level of detail of metadata’ : descriptive | contextual | detailed/specific. Example: descriptive: DC; contextual (project, person, organisation, funding, facility, equipment, publications….) CERIF, ISA; detailed/specific: schema level connecting dataset to software;
b. Dimension 2: Purpose: re-use | interoperation. Example: re-use: using dataset for a repeat or different purpose; interoperation: using the dataset along with others for some purpose so that the user has a homogeneous view over heterogeneous (distributed) datasets;
c. Dimension 3: Intrinsic Properties: Description | Location | Contextualisation | Preservation | Provenance | Schema; Examples: Description: DC, CKAN; Location: URL/URI; Contextualisation: (used to assess relevance/quality for purpose) CERIF, ISA; Preservation: OAIS architecture but it needs populating; Provenance: versioning of datasets and relationships expressed semantically including relationships to software, persons, organisations etc.
As you can see in the above there is some overlap between the dimensions in terms of what is recorded or used but the different dimensions do have different aspects or modes of usage.
5. Can we agree that metadata needs formal syntax (structure) and declared semantics (terms, meanings, relationships) ?
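As a rough illustration of the three dimensions proposed in point 4, they could be written down as enumerations, with a standard placed along them. This is only a sketch: the class names are assumptions, and the example placements repeat only what the post itself states (DC as descriptive, CERIF and ISA as contextual). The post does not assign purposes to particular standards, so that field is left empty here.

```python
from dataclasses import dataclass
from enum import Enum


class Level(Enum):  # Dimension 1: level of detail
    DESCRIPTIVE = "descriptive"
    CONTEXTUAL = "contextual"
    DETAILED = "detailed/specific"


class Purpose(Enum):  # Dimension 2: purpose
    REUSE = "re-use"
    INTEROPERATION = "interoperation"


class IntrinsicProperty(Enum):  # Dimension 3: intrinsic properties
    DESCRIPTION = "description"
    LOCATION = "location"
    CONTEXTUALISATION = "contextualisation"
    PRESERVATION = "preservation"
    PROVENANCE = "provenance"
    SCHEMA = "schema"


@dataclass(frozen=True)
class Classification:
    standard: str
    level: Level
    properties: frozenset
    purposes: frozenset = frozenset()  # not assigned per standard in the post


# Placements taken from the examples given in the post.
EXAMPLES = [
    Classification("DC", Level.DESCRIPTIVE,
                   frozenset({IntrinsicProperty.DESCRIPTION})),
    Classification("CERIF", Level.CONTEXTUAL,
                   frozenset({IntrinsicProperty.CONTEXTUALISATION})),
    Classification("ISA", Level.CONTEXTUAL,
                   frozenset({IntrinsicProperty.CONTEXTUALISATION})),
]


def standards_at(level: Level) -> list[str]:
    """List the example standards placed at a given level of detail."""
    return [c.standard for c in EXAMPLES if c.level is level]
```

Because each dimension is a separate field, a standard can sit at one level of detail while covering several intrinsic properties, which reflects the overlap between dimensions noted above.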
And just to add to the fun here is a list of characteristics (maybe entities/objects or attributes) that I think need to be available concerning a dataset:
• Unique Identifier (for later use including citation)
• Location (URL)
• Description
• Keywords (terms)
• Temporal coordinates
• Geospatial coordinates
• Originator (organisation(s) / person(s))
• Project
• Facility / equipment
• Quality
• Availability (licence, persistence)
• Provenance
• Citations
• Related publications (white or grey)
• Related software
• Schema
• Medium / format
Some of these may be simple values, others (e.g. Quality) have a whole substructure. If we can agree such a list it forms a basis for characterising / classifying metadata standards and leads towards recommended usage for various purposes.
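One way such a list could be used to characterise metadata records or standards is as a simple coverage checklist. The sketch below assumes a record is a plain dictionary keyed by snake_case versions of the characteristics above; those key names and the sample values are illustrative assumptions, not an agreed vocabulary.

```python
# The characteristics from the list above, as hypothetical dictionary keys.
CHARACTERISTICS = (
    "unique_identifier",
    "location",
    "description",
    "keywords",
    "temporal_coordinates",
    "geospatial_coordinates",
    "originator",
    "project",
    "facility_equipment",
    "quality",
    "availability",
    "provenance",
    "citations",
    "related_publications",
    "related_software",
    "schema",
    "medium_format",
)


def coverage(record: dict) -> tuple[list[str], list[str]]:
    """Split the checklist into characteristics present in and missing from a record.

    A characteristic counts as present if its value is not empty. Values may be
    simple (a string) or have substructure (a nested dict, as for quality).
    """
    empty = (None, "", [], {})
    present = [c for c in CHARACTERISTICS if record.get(c) not in empty]
    missing = [c for c in CHARACTERISTICS if c not in present]
    return present, missing


# Hypothetical record: quality has a substructure, keywords is left empty.
example = {
    "unique_identifier": "pid:example-1",
    "location": "https://example.org/data/1",
    "description": "Hourly rainfall",
    "keywords": [],
    "quality": {"completeness": "high"},
}
present, missing = coverage(example)
```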
I hope this stimulates the discussion!
Best
Keith
Keith G Jeffery Consultants
Prof Keith G Jeffery
E: ***@***.***
T: +44 7768 446088
S: keithgjeffery
Past President ERCIM www.ercim.eu (***@***.***)
Past President euroCRIS www.eurocris.org
Past Vice President VLDB www.vldb.org
Fellow (CITP, CEng) BCS www.bcs.org
Co-chair RDA MIG https://rd-alliance.org/internal-groups/metadata-ig.html
Co-chair RDA MSDWG https://rd-alliance.org/working-groups/metadata-standards-directory-work...
Co-chair RDA DICIG https://rd-alliance.org/internal-groups/data-context-ig.html
Author: Nikos Houssos
Date: 01 May, 2014
Dear all,
Really interesting discussion. My 2 cents:
1. Indeed the distinction between metadata and data (or more accurately,
identifying which data is metadata and which is not) is fuzzy and the
"data about data" definition does not seem satisfactory in general.
Broader definitions might be used, for example "data about data and the
processes and environment involved in the generation of data" but I
doubt whether this is helpful (probably it is even more ambiguous!).
2. Metadata is normally used and designed (upfront) for exchange and
interoperability, i.e. information foreseen to be consumed only by a
particular system/application can hardly be treated as metadata.
3. An important feature of metadata, compared with "plain" data is open
structure, genericity and extensibility. This does not mean that formal
syntax and semantics are not necessary (on the contrary - both are
indispensable and some defined structure is absolutely required for any
practical use and interoperability). However, the metadata
representation method and technology should allow the continuous
definition of new data elements, albeit conforming to the original
structure and with clear semantics. For instance, the metadata structure
used in a system or a community should not require that all data element
types and related semantics (e.g. vocabularies) are rigidly defined a
priori and, consequently, that any addition of data elements would require a
modification in a community data model/standard. In other words,
metadata is data that inherently supports evolution of data and related
semantics (albeit in a structured and standard-conformant way) and,
consequently, involves less "hard-coding" at the data model level (which
BTW leads also to less "hard-coding" in technical implementations).
4. Relationships/associations are key in metadata and are certainly
first-class citizens (while this might not hold for all types of data).
In fact, this (representation of most data elements as associations -
with clear semantics - between entity instances) is an important enabler
to achieve data structure evolution (as in point 3 above).
5. Another aspect of metadata is the ability to manage different
versions of the same data element value (e.g. in different languages -
multi-linguality or different encodings) and/or different versions of
the described object/entity.
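Points 3 to 5 can be illustrated with a small sketch in which every data element is an association between entity instances, new element types are declared at run time rather than by changing the data model, and one element can hold several language versions. The `Statement` and `MetadataStore` names, the identifiers and the predicates are hypothetical; this is not the API of CERIF or of any existing system.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Statement:
    subject: str                # entity instance being described
    predicate: str              # association type with declared semantics
    obj: str                    # value or another entity instance
    lang: Optional[str] = None  # language tag for multilingual values


class MetadataStore:
    def __init__(self, vocabulary):
        # Declared predicates: structure is required, but it stays open.
        self.vocabulary = set(vocabulary)
        self.statements = []

    def declare(self, predicate: str) -> None:
        """Add a new element type without touching the Statement model."""
        self.vocabulary.add(predicate)

    def add(self, statement: Statement) -> None:
        if statement.predicate not in self.vocabulary:
            raise ValueError(f"undeclared predicate: {statement.predicate}")
        self.statements.append(statement)

    def values(self, subject: str, predicate: str, lang: Optional[str] = None):
        """Return all values of an element, optionally for one language."""
        return [
            s.obj
            for s in self.statements
            if s.subject == subject
            and s.predicate == predicate
            and (lang is None or s.lang == lang)
        ]


store = MetadataStore({"title"})
store.add(Statement("dataset:1", "title", "Rainfall", lang="en"))
store.add(Statement("dataset:1", "title", "Niederschlag", lang="de"))
store.declare("funded_by")  # evolution: a new association type
store.add(Statement("dataset:1", "funded_by", "project:P"))
```

Undeclared predicates are rejected, so the store still enforces declared semantics, while adding `funded_by` required no change to the `Statement` structure itself.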
Looking forward to further discussion on this issue!
Best regards,
Nikos
--------------------------------------------
Nikos Houssos, Ph.D.
Head, Software Development Unit
National Documentation Centre / N.H.R.F.
48, Vas. Constantinou Av.
116 35 Athens, Greece
phone: +30 210 7273949 fax: +30 210 7252223
email: ***@***.***
http://www.ekt.gr
--------------------------------------------
Author: Keith Jeffery
Date: 01 May, 2014
Nikos -
Thanks for your comments; unsurprisingly I agree broadly and especially with your points 3,4,5 which cover things I failed to mention.
Unusually, I’d like to disagree with your point 2; even if a dataset is never shared or interoperated with, I think - for example - provenance and preservation metadata for one dataset is valid metadata. I would also argue that a database schema is valid metadata.
I am sure your comments will stimulate further discussion, thanks again.
Best
Keith
Author: Chris Taylor
Date: 01 May, 2014
Hi,
First a serious question: what are instrument settings (or worse,
'set-ups')? They are controlled parameters, not measurements, so are they
the last level of data, or the first level of metadata? An analogy might be
interviewer questions.
Second, less seriously (perhaps); I like 'data about data'. It allows both
that one person's metadata is another person's data and that the whole
thing is recursive (and that essentially everything is, er, are? data).
Simple but hard to sink anyway :)
Chris.
Author: Andrea Perego
Date: 06 May, 2014
My two cents...
1. IMHO, it would be important to clearly state why we need to
classify metadata, and what the purpose of such a classification is.
Personally, I find such an exercise very difficult, since the same
metadata element can be classified in different groups depending on
the context. Just to give an example, INSPIRE [1] metadata elements
are grouped into three main classes: discovery, evaluation and use.
Whether a metadata element is in one class or another does not depend
on its intrinsic characteristics, but simply on the role it
plays in the context of INSPIRE.
2. +1 from me to Keith's points (1) (metadata = "data about data") and
(2) ("there is no difference between data and metadata"). It's again
the context determining whether given data can be considered as
metadata. I also think that (1) highlights two important points. The
first is recursion (as noted by Chris) - BTW, in INSPIRE we do have
metadata about metadata (more precisely, who created the metadata,
when and in which language). The other is that, after all, metadata
are all descriptive - the difference is what they are describing, and
for which purpose. On this, it may be worth noting that the de facto
standard way on the Web to link a resource to its metadata is by using
the "describedby" relation - see:
- http://www.iana.org/assignments/link-relations/link-relations.xhtml
- http://www.w3.org/TR/powder-dr/#assoc-linking
(I must also say I have a conflict of interest here, since I
contributed to the definition of such relation)
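For illustration, one place the "describedby" relation can appear is an HTTP Link header, where a resource points at a document describing it. The sketch below pulls the describedby targets out of such a header value. It is a deliberately simplified parser (it does not handle commas or semicolons inside quoted parameter values), and the example.org URLs are made up.

```python
import re

# One link-value: <target> followed by zero or more ;param pairs.
_LINK_RE = re.compile(r'<([^>]*)>\s*((?:;\s*[^;,]+)*)')
_REL_RE = re.compile(r'rel\s*=\s*"?([^";]+)"?')


def find_describedby(link_header: str) -> list[str]:
    """Return the targets of all links whose rel includes 'describedby'."""
    targets = []
    for match in _LINK_RE.finditer(link_header):
        target, params = match.group(1), match.group(2)
        rel = _REL_RE.search(params)
        # rel may hold several space-separated relation types.
        if rel and "describedby" in rel.group(1).lower().split():
            targets.append(target)
    return targets


# Hypothetical header for a dataset that links to its metadata document.
header = (
    '<https://example.org/data/42.meta>; rel="describedby"; '
    'type="application/rdf+xml", '
    '<https://example.org/data/>; rel="collection"'
)
metadata_links = find_describedby(header)
```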
Cheers,
Andrea
----
[1] http://inspire.ec.europa.eu/
Author: Gary Berg-Cross
Date: 07 May, 2014
Andrea noted:
>it would be important to clearly state why we need to
classify metadata, and which is the purpose of such classification.
The earlier discussion includes descriptive vs. functional categories and
then some additional subclasses within each. The descriptive
classification seems useful and is related to talking about data types.
When we understand what type of data we are describing we are talking
about descriptive MD. Different processes may be applied to data of
different types in part because they have different data structures that
get described for processing purposes etc.
We need some other MD besides this. Data is often held in particular
disciplinary repositories and effective use of these collections requires
disciplinary expertise, including knowledge of assumptions and models used
in creating and interpreting the data, but also the purposes for which the
data was collected. Again it seems useful to try to come up with some way
of capturing these purposes.
On this issue of MD about MD, I agree that there is MD about MD. After all MD
is just data that plays a role of describing other data. And MD gets
stored in repositories and we have data/MD about this such as who provided
it, when it was last updated etc. After all it is just data playing a MD
role.
Author: Gary Berg-Cross
Date: 07 May, 2014
A small addition to my last posting.
When we are trying to justify typing MD, we might think of some type like
Provenance, which has
some special needs to track data over its lifetime.
At any point data may be attached to a different data set than its original
set. We want this type of information as well as still wanting to know
what particular file the instance data came
from, along with information about the organization responsible for
generating the original data.
Then there is temporal info - the range of time over which the data
applies. When we are talking about dynamic data sets that range can change
over time and this needs to be tracked.
Author: Keith Jeffery
Date: 08 May, 2014
Gary –
Perhaps this is stretching the concept of ‘type’ too far; you are describing temporally-bound role-based relationships.
Example: dataset Y was derived by summarisation from dataset X between date-time 1 and date-time 2 (i.e. a transformation – provenance). A different example: dataset A was generated by equipment E between date-time 1 and date-time 2 (some time intervals, e.g. in astronomy or neutrino science, can be long). A third example: dataset S was collected by seismic array A between date-time 1 and date-time 2, where typically the interval is some days.
You could also express that project P produced dataset D, which relates to/covers the time interval date-time 1 to date-time 2.
The representation of relationships can be done by extended entity-relationship modelling, by object-relational modelling, by LOD/RDF, by OWL/RDF etc.
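Independently of which of those representations is chosen, a temporally-bound role-based relationship boils down to two entities, a role, and a time interval. A minimal sketch, assuming hypothetical role names and made-up dates for the first two examples above:

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Relationship:
    subject: str     # e.g. a dataset
    role: str        # semantics of the link (hypothetical role names below)
    obj: str         # another dataset, equipment, project, ...
    start: datetime  # date-time 1
    end: datetime    # date-time 2


def holding_at(relationships, when: datetime):
    """Return the relationships whose interval contains the given instant."""
    return [r for r in relationships if r.start <= when <= r.end]


# Made-up intervals for illustration only.
links = [
    Relationship("dataset:Y", "derived_by_summarisation_from", "dataset:X",
                 datetime(2014, 1, 1), datetime(2014, 1, 2)),
    Relationship("dataset:A", "generated_by", "equipment:E",
                 datetime(2013, 6, 1), datetime(2014, 6, 1)),
]
in_march = holding_at(links, datetime(2014, 3, 1))
```

Here provenance is not a separate type of record: it is a relationship with a particular role and interval, which is the distinction drawn above.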
Best
Keith
Keith G Jeffery Consultants
Prof Keith G Jeffery
E: ***@***.***
T: +44 7768 446088
S: keithgjeffery
Past President ERCIM www.ercim.eu (***@***.***)
Past President euroCRIS www.eurocris.org
Past Vice President VLDB www.vldb.org
Fellow (CITP, CEng) BCS www.bcs.org
Co-chair RDA MIG https://rd-alliance.org/internal-groups/metadata-ig.html
Co-chair RDA MSDWG https://rd-alliance.org/working-groups/metadata-standards-directory-work...
Co-chair RDA DICIG https://rd-alliance.org/internal-groups/data-context-ig.html
----------------------------------------------------------------------------------------------------------------------------------
The contents of this email are sent in confidence for the use of the
intended recipient only. If you are not one of the intended
recipients do not take action on it or show it to anyone else, but
return this email to the sender and delete your copy of it.
----------------------------------------------------------------------------------------------------------------------------------
From: gbergcross=***@***.***-groups.org [mailto:***@***.***-groups.org] On Behalf Of Gary
Sent: 08 May 2014 00:11
To: ***@***.***-groups.org
Subject: Re: [rda-metadata-ig] Can we move towards a working definition of metadata
A small addition to my last posting.
When we trying to justify typing MD we might think of the some type like Provenance which has
some special needs to track data over its lifetime.
At any point data may be attached to a different data set than its original set. We want this type of information as well as still wanting to know what particular file the instance data came
from, along with information about the organization responsible for generating the original data.
Then there is temporal info - the range of time over which the data applies. When we are talking about dynamic data sets that range can change over time and this needs to be tracked.
Gary Berg-Cross, Ph.D.
***@***.***
http://ontolog.cim3.net/cgi-bin/wiki.pl?GaryBergCross
NSF INTEROP Project
http://www.nsf.gov/awardsearch/showAward.do?AwardNumber=0955816
SOCoP Executive Secretary
Independent Consultant
Potomac, MD
240-426-0770
On Wed, May 7, 2014 at 1:45 PM, Gary Berg-Cross <***@***.***> wrote:
>it would be important to clearly state why we need to
classify metadata, and which is the purpose of such classification.
The earlier discussion includes descriptive vs. functional categories and then some additional subclasses within each. The descriptive classification seems useful and is related to talking about data types. When we understand what type of data we are describing we are talking about descriptive MD. Different processes may be applied to data of different types in part because they have different data structures that get described for processing purposes etc.
We need some other MD besides this. Data is often held in particular disciplinary repositories and effective use of these collections requires disciplinary expertise, including knowledge of assumptions and models used in creating and interpreting the data, but also the purposes for which the data was collected. Again it seems useful to try to come up with some way of capturing these purposes.
On the issue of MD about MD, I agree that there is MD about MD. After all, MD is just data that plays the role of describing other data. MD gets stored in repositories, and we have data/MD about it, such as who provided it and when it was last updated. It is all just data playing an MD role.
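The "MD about MD" idea can be sketched as a record that may itself be described by another record. This is only an illustrative Python sketch; the class name, the field names and the sample values are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class Record:
    """Any piece of data; it plays an MD role when it describes another record."""
    content: Dict[str, Any]
    described_by: Optional["Record"] = None


def description_depth(record: Record) -> int:
    """Count how many layers of description sit on top of a record."""
    depth = 0
    while record.described_by is not None:
        depth += 1
        record = record.described_by
    return depth


# Hypothetical values: MD about the MD (who provided it, when it was updated).
md_about_md = Record({"provided_by": "Example Repository", "last_updated": "2014-05-07"})
md = Record({"title": "Example data set"}, described_by=md_about_md)
data = Record({"values": [1, 2, 3]}, described_by=md)
depth = description_depth(data)
```

Nothing in the structure distinguishes data from MD; only the `described_by` link gives a record its MD role.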
On Tue, May 6, 2014 at 6:22 AM, andrea.perego <***@***.***> wrote:
My two cents...
1. IMHO, it would be important to clearly state why we need to
classify metadata, and what the purpose of such a classification is.
Personally, I find such an exercise very difficult, since the same
metadata element can be classified in different groups depending on
the context. Just to give an example, INSPIRE [1] metadata elements
are grouped into three main classes: discovery, evaluation and use.
Whether a metadata element is in one class or another does not depend
on its intrinsic characteristics, but simply on the role it
plays in the context of INSPIRE.
2. +1 from me to Keith's points (1) (metadata "data about data") and
(2) ("there is no difference between data and metadata"). It's again
the context determining whether given data can be considered as
metadata. I also think that (1) highlights two important points. The
first is recursion (as noted by Chris) - BTW, in INSPIRE we do have
metadata about metadata (more precisely, who created the metadata,
when and in which language). The second is that, after all, metadata
are all descriptive - the difference is what they are describing, and
for which purpose. On this, it may be worth noting that the de facto
standard way on the Web to link a resource to its metadata is by using
the "describedby" relation - see:
- http://www.iana.org/assignments/link-relations/link-relations.xhtml
- http://www.w3.org/TR/powder-dr/#assoc-linking
(I must also say I have a conflict of interest here, since I
contributed to the definition of this relation)
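As a rough illustration of the "describedby" relation mentioned above, the following Python sketch builds the value of an HTTP Link header that points from a resource to its metadata. The function name and the example URL are hypothetical; only the relation name "describedby" comes from the registry cited above.

```python
def describedby_link_header(metadata_url: str, media_type: str = "") -> str:
    """Build the value of an HTTP Link header pointing at a metadata document."""
    value = f'<{metadata_url}>; rel="describedby"'
    if media_type:
        # The optional "type" parameter hints at the format of the metadata.
        value += f'; type="{media_type}"'
    return value


# Hypothetical URL, used for illustration only.
header_value = describedby_link_header(
    "http://example.org/dataset/42/metadata.rdf", "application/rdf+xml"
)
print("Link: " + header_value)
```

The same relation can also be expressed in an HTML page with a link element carrying rel="describedby".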
Cheers,
Andrea
----
[1]http://inspire.ec.europa.eu/
On Thu, May 1, 2014 at 2:37 PM, chrisftaylor <***@***.***> wrote:
> Hi,
Gary –
Perhaps this is stretching the concept of ‘type’ too far; you are describing temporally-bound role-based relationships.
Example: dataset Y was derived by summarisation from dataset X between date-time 1 and date-time 2 (i.e. a transformation, hence provenance). A different example: dataset A was generated by equipment E between date-time 1 and date-time 2 (some time intervals, e.g. in astronomy or neutrino science, can be long). A third example: dataset S was collected by seismic array A between date-time 1 and date-time 2, where the interval is typically some days.
You could also express that project P produced dataset D, which relates to/covers the time interval date-time 1 to date-time 2.
The representation of relationships can be done by extended entity-relationship modelling, by object-relational modelling, by LOD/RDF, by OWL/RDF etc.
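Independently of the modelling approach chosen, the examples above share one shape: a subject, a role, an object and a time interval. A minimal Python sketch of that shape follows; the class name, the role labels and the dates are hypothetical and serve only to illustrate the idea.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Relationship:
    """A role-based relationship that holds only within a time interval."""
    subject: str
    role: str
    obj: str
    start: datetime
    end: datetime

    def holds_at(self, moment: datetime) -> bool:
        """True if the relationship is valid at the given moment (inclusive)."""
        return self.start <= moment <= self.end


# Illustrative intervals only.
t1 = datetime(2014, 5, 1, 0, 0)
t2 = datetime(2014, 5, 4, 0, 0)
relationships = [
    Relationship("dataset Y", "derived_by_summarisation_from", "dataset X", t1, t2),
    Relationship("dataset A", "generated_by", "equipment E", t1, t2),
    Relationship("dataset S", "collected_by", "seismic array A", t1, t2),
]
active = [r.role for r in relationships if r.holds_at(datetime(2014, 5, 2, 12, 0))]
```

Because the role is a plain value rather than a class, the same structure covers provenance, generation and collection without introducing a separate MD type for each.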
Best
Keith
Keith G Jeffery Consultants
Prof Keith G Jeffery
E: ***@***.***
T: +44 7768 446088
S: keithgjeffery
Past President ERCIM www.ercim.eu (***@***.***)
Past President euroCRIS www.eurocris.org
Past Vice President VLDB www.vldb.org
Fellow (CITP, CEng) BCS www.bcs.org
Co-chair RDA MIG https://rd-alliance.org/internal-groups/metadata-ig.html
Co-chair RDA MSDWG https://rd-alliance.org/working-groups/metadata-standards-directory-work...
Co-chair RDA DICIG https://rd-alliance.org/internal-groups/data-context-ig.html
From: gbergcross=***@***.***-groups.org [mailto:***@***.***-groups.org] On Behalf Of Gary
Sent: 08 May 2014 00:11
To: ***@***.***-groups.org
Subject: Re: [rda-metadata-ig] Can we move towards a working definition of metadata
Author: Andrea Perego
Date: 10 Jun, 2014
Dear colleagues,
Just to mention that a similar issue is under discussion in the W3C
Data on the Web Best Practices WG. See, e.g.:
http://lists.w3.org/Archives/Public/public-dwbp-wg/2014Jun/0068.html
Cheers,
Andrea
On Thu, May 8, 2014 at 9:14 AM,
***@***.***