Re: [rda-datafabric-ig] Architecture or Not?

31 Oct 2014

This discussion thread will provide input to our DF White Paper and as
Larry noted some of us will be meeting at NIST (John Henry's turf) in a few
weeks.
I also made some remarks at the tail end of the NDS meeting in response to
the question. One thing I thought important is that we have an IG and thus
provide a venue for discussion of ideas that we hope will clarify what some
here are calling the Landscape, perhaps to avoid the loaded term
Architecture. To me, landscape, architecture and system all gesture
to the idea of some type of structure with related elements or components.
So our discussion will include system elements and relations between them.
The other point I made was via a bit of analogy to data modeling. A data
model has organization and components, but there are some useful
distinctions of levels. Some data models are at the Logical level and pick
some logical approach such as a relational one. They take the organizing
principle and relational algebra etc. You can then do a physical model of
this, but going the other way we can be more abstract than the Logical
model and use a conceptual model. You can think of any of these 3 levels as
some type of organizing system or architecture. But at the conceptual
level we are discussing "components" and abstract relations that are far
removed from a particular physical implementation.
So we can talk about services & data or metadata service concepts and
maybe even some level of service oriented architecture, but not as much
about specific services implemented by a specific technology unless this is
categorical, such as needing the category of metadata services to find
relevant data via search.
Thus I think that as an IG we will be largely talking conceptually and from
time to time looking a bit less abstractly to see if it makes sense from
that perspective. We may even talk about some things at a logical level
and find ways to talk about them more generally.
Gary Berg-Cross, Ph.D.
***@***.***
http://ontolog.cim3.net/cgi-bin/wiki.pl?GaryBergCross
SOCoP Executive Secretary
Independent Consultant
Potomac, MD
240-426-0770

  • Author: Keith Jeffery

    Date: 02 Nov, 2014

    Gary –
    The analogy with the ‘data world’ is apposite; there the word ‘model’ is used in the sense that the data model is a representation of the real world appropriate for the purpose (at conceptual level), appropriate for doing the business of the organisation (at logical level) and finally working on a computer system (at physical level).
    The advantage is that there exist decades of experience and useful tools for managing the 3 levels of models, including dealing with referential and functional integrity.
    Best
    Keith
    Keith G Jeffery Consultants
    Prof Keith G Jeffery
    E: ***@***.***
    T: +44 7768 446088
    S: keithgjeffery
    Past President ERCIM www.ercim.eu (***@***.***)
    Past President euroCRIS www.eurocris.org
    Past Vice President VLDB www.vldb.org
    Fellow (CITP, CEng) BCS www.bcs.org
    Co-chair RDA MIG https://rd-alliance.org/internal-groups/metadata-ig.html
    Co-chair RDA MSDWG https://rd-alliance.org/working-groups/metadata-standards-directory-work...
    Co-chair RDA DICIG https://rd-alliance.org/internal-groups/data-context-ig.html
    On Thu, Oct 30, 2014 at 9:35 AM, Peter Wittenburg
    <***@***.***> wrote:
    Hello John,
    great that you kicked this discussion off. I was waiting for your written input, since for me as a non-native English speaker it is better to read carefully what is being said. Let me respond to your long but good note (and of course I saw what the others said and also looked at Beth's slides). It is also excellent that you had a discussion about this at the recent meeting. I hope that you will be around when we have the next discussion in Washington, as Larry pointed out.
    Let me first clarify in which role I will answer, since I was re-elected to the TAB as well. One part of what you stated addresses RDA etc., the other part addresses DF issues, and there are certainly relationships between them. I will argue here as a bottom-up person (my preferred role) interested in making data work more efficient in future, so all I am saying is just my personal view :)
    TAB Issue
    This relates mainly to your second question. RDA needs to remain primarily a bottom-up driven organization, so if the two of us decided to start a new initiative, it should be hard for boards etc. to stop us if we can present a well-thought-through idea and fulfill the formal requirements. The role of TAB is to synchronize, harmonize etc. where possible and of course check adherence to requirements. Beyond that, as you note, the role of TAB is to describe and structure the overall "landscape" of issues that RDA should be dealing with. To me it is obvious that at this moment it is too early to describe and structure this landscape in too much detail, since we still need to learn more. Therefore DF will have an important role in driving this discussion about the "landscape", provided we do not make big errors such as narrowing down the scope too early. For me there is no doubt that the crowd organized in the IGs and WGs is the most important driving force; the TAB, if it makes use of its role in a smart way, will help to consolidate, bring people together, etc.
    It is also important for me to note that RDA covers much more than DF people, but this is already part of the answer to your DF points. When you say that systems engineering has already thought about a system where, for example, humans are also components, then I would answer that psychology, law etc. have also worked out elaborate models of "systems" that include humans. For issues such as Legal Interoperability I would much rather look to models of human behavior coming from those disciplines than from systems engineering. Since RDA covers IGs/WGs that look at "humans" in the data game from different views, "systems engineering" will offer one important view, but it will not be the only one. I repeat very frequently that RDA needs to define its own culture if it is not to fail, and part of this culture is terminology.
    DF Issue
    Probably I am a bit along Larry's view, as so often. It is very good to know that systems engineering has defined a variety of terms relevant for us and I am happy to make use of them. But let us take the term "architecture" as an example. In our software design and development experience the term "architecture" is very much related to designing and then implementing a system (a house) which will then perform some function for us. In EUDAT we had a 2-year-long discussion about when to start discussing an overall architecture. We, as people who wanted to get specific services, did not want to get into endless discussions about an overall architecture too early. I think that this comes down to a never-ending debate between those who like to start with a systematic and structured approach and those who like to start with tests, demos, etc. My attitude is very clear: when you operate in a "landscape" which you need to explore, better start with tests, demos etc. But I know, for example, that my colleagues from ENVRI followed another path: they used ODP to specify their infrastructure based on the different views and created an enormous amount of specification notes etc. Probably both ways are important, but in a bottom-up initiative such as RDA, where many people address the data issues from a large number of different aspects, only the test/demo approach will work. This may change at a certain moment when we understand the landscape. This will probably also be the moment when industry will drop in.
    So I could live with the term "architecture" as defined by systems engineering, but then we need to define for our community how we want to understand it. The way you describe it in your text would not be satisfying for me, since different interpretations are possible. A blueprint of a house already specifies where you have doors and windows, how big they are, etc. I think that in DF we want to remain at the level where we state that we will need doors and windows for certain functional reasons and what their requirements are (some doors must be bigger than others etc.). I think that Keith also argued in this direction.
    So if we do this kind of small exercise and say what we mean by architecture in the realm of RDA, I would be OK with using the term as defined by SE. At least at this moment the definition (3.150) seems to be abstract enough. I think we as data practitioners simply do not want to get into these endless architecture discussions some of our IT people like so much. The same holds for the other terms you are mentioning. So perhaps we should have augmented term definitions, making use of what we can get from SE. More important to me is to define now the components/services (there is a dualism between the two) which we need within the DF so that working with data gets much more efficient, effective, self-explanatory, etc.
    After having made my points, let me also come back to your second question to Beth: does DFIG have an overall organizing influence on the big picture? Yes, if it understands that DF is not the whole RDA and restricts itself. Yes, if it helps explore the data landscape we are operating in, in particular by pointing to components/services that are required to make progress and by making terms such as "architecture" much more concrete by doing.
    Again - thanks a lot for kicking this off and sorting out the terms. Perhaps we are not so far away from each other. At least for writing the White Paper this is excellent material and discussion.
    Peter
    --
    Full post: https://rd-alliance.org/group/data-fabric-ig/post/architecture-or-not.html
    Manage my subscriptions: https://rd-alliance.org/mailinglist
    Stop emails for this post: https://rd-alliance.org/mailinglist/unsubscribe/46202

  • Author: Peter Wittenburg

    Date: 03 Nov, 2014

    Gary, Keith,
    Yes, I guess you are both right insofar as we need to discuss at the "conceptual level", although discussing these terms from different perspectives has potential for confusion.
    As someone suggested at our P4 session: we should start arguing from data. But how does "data" appear in our DF discussion? It comes along as collections (the most useful term, and we have defined in DFT what is meant by "collection") that will be processed by a "machinery" as roughly indicated already. So I simply hope that we agree on this basic description of what a collection is. If so, then we can start discussing the components/services: what kind of data structures they need to support, what kind of functions they need to offer and how they interoperate. That obviously falls under the "logical level", to use Keith's terms (which indeed are "old term defs").
    Peter
    - Show quoted text -From: keith.jeffery=***@***.***-groups.org [mailto:***@***.***-groups.org] On Behalf Of ***@***.***
    Sent: Sunday, November 02, 2014 12:43 PM
    To: ***@***.***-groups.org
    Subject: Re: [rda-datafabric-ig] Re: [rda-datafabric-ig] Architecture or Not?
    Gary –
    The analogy with the ‘data world’ is apposite; there the word ‘model’ is used in the sense that the data model is a representation of the real world appropriate for the purpose (at conceptual level) and appropriate for doing the business of the organisation (at logical level) finally working on a computer system (at physical level).
    The advantage is that there exist decades of experienced and useful tools for managing the 3 levels of models including dealing with referential and functional integrity.
    Best
    Keith
    Keith G Jeffery Consultants
    Prof Keith G Jeffery
    E: ***@***.***
    T: +44 7768 446088
    S: keithgjeffery
    Past President ERCIM www.ercim.eu (***@***.***)
    Past President euroCRIS www.eurocris.org
    Past Vice President VLDB www.vldb.org
    Fellow (CITP, CEng) BCS www.bcs.org
    Co-chair RDA MIG https://rd-alliance.org/internal-groups/metadata-ig.html
    Co-chair RDA MSDWG https://rd-alliance.org/working-groups/metadata-standards-directory-work...
    Co-chair RDA DICIG https://rd-alliance.org/internal-groups/data-context-ig.html
    ----------------------------------------------------------------------------------------------------------------------------------
    The contents of this email are sent in confidence for the use of the
    intended recipient only. If you are not one of the intended
    recipients do not take action on it or show it to anyone else, but
    return this email to the sender and delete your copy of it.
    ----------------------------------------------------------------------------------------------------------------------------------
    From: gbergcross=***@***.***-groups.org [mailto:***@***.***-groups.org] On Behalf Of Gary
    Sent: 31 October 2014 16:18
    To: ***@***.***-groups.org
    Subject: [rda-datafabric-ig] Re: [rda-datafabric-ig] Architecture or Not?
    This discussion thread will provide input to our DF White Paper and as Larry noted some of us will be meeting at NIST (John Henry's turf) in a few weeks.
    I also made some remarks at the tail end of the NDS meeting in response to the question. One thing I thought important is that we have an IG and thus provide a venue for discussion of ideas that we hope will clarify what some here are calling the Landscape, perhaps to avoid the unloaded term Architecture. To me both landscape, architecture as well as system gesture to the idea of some type of structure with related elements or components. So are discussion will include system elements and relations between them.
    The other point I may was via a bit of analogy to data modeling. It has organization and components but there are some useful distinctions of levels. Some data models are at the Logical level and pick some logical approach such as a relational one. They take the organizing principle and relational algebra etc. You can then do a physical model of this, but going the other way we can be more abstract than the Logical model and use a conceptual model. You can think of any of these 3 levels are some type of organizing system or architecture. But if you are at the conceptual level we are discussing "components" and abstract relations that are far removed from a particular physical implementation.
    So we can talk about services & data or metadata service concepts and maybe even some level of service oriented architecture but not as much about specific services implemented by a specific technology unless this is categorical -such as we need the category of metadata services to find relevant data via search .
    Thus I think that as an IG we will be largely talking conceptually and from time to time looking a bit less abstractly to see if it makes sense from that perspective. We may even talk about some things at a logical level and find ways to talk about them more generally.
    Gary Berg-Cross, Ph.D.
    ***@***.***
    http://ontolog.cim3.net/cgi-bin/wiki.pl?GaryBergCross
    SOCoP Executive Secretary
    Independent Consultant
    Potomac, MD
    240-426-0770
    On Thu, Oct 30, 2014 at 9:35 AM, Peter Wittenburg
    <***@***.***> wrote:
    Hallo John,
    great that you kicked this discussion off, I was waiting on your written input, since for me as non-native E-speaker it is more optimal to read carefully what is being said. Let me respond to your long, but good note (and of course I saw what the other said and also looked at Beth slides). And excellent that you had a discussion about this at the recent meeting. Hope that you will be around when we have the next discussion in Washington as Larry pointed out.
    Let me first clarify in which role I will answer, since I was re-elected for the TAB as well. One part of what you stated addresses RDA etc., the other part addresses DF issues and there are certainly relationships. I will argue here as a bottom-up person (my preferred role) being interested to make data work more efficient in future, so all I am saying is just my personla view :)
    TAB Issue
    This relates mainly to your second question. RDA needs to remain primarily a bottom-up driven organization, so if we two would decide to start a new initiative, then it should be hard for boards etc to stop us if we can present a well-thought through idea and fullfill the formal requirements. The role of TAB is to synch, harmonize etc. where possible and of course check adherence to requirements. Beyond that as you note the role of TAB is to describe and structure the overall "landscape" of issues that RDA should be dealing about. To me it is obvious that at this moment it is too early to describe and structure this landscape in too detailed form, since we still need to learn more. Therefore DF will have an important role in driving this discussion about the "ladnscape" if we do not make big errors such as narrowing down the scope too early etc. For me there is no doubt that the crowd who is organized in the IGs and WGs is the most important driving force, the TAB, if it makes use of its role in a smart way, will help to consolidate, bring people together, etc.
    Important for me is also to note that RDA covers much more than DF people, but this is already part of the answer to your DF points. When you say that system engineering has already though about a system where for example also humans are components, then I would answer that also psychology, law etc. has worked out elaborative models of "systems" that include humans. For issues such as Legal Interoperability I would rather much more look to models about human behavior coming from those disciplines than from system engineering. Since RDA covers IGs/WGs that look from different views to "humans" in the data game, "systems engineering" will have one important view, but it will not be the only one. I am repeating very frequently that RDA needs to define its own culture, if it will not fail, and part of this culture is terminology.
    DF Issue
    Probably I am a bit along Larry's view as so often. It is very good to know that systems engineering defined a variety of terms relevant for us and I am happy to make use of it. But let us take the term "architecture" as an example. In our software design & developing experience the term "architecture" is very much related to design and then implement a system (A house) which then will do some function for us. In EUDAT we had a 2 year long discussion when to start discussing about an overall architecture. We as people who wanted to get specific services did not want to get in endless discussions about an overall architecture too early. I think that this comes down into a never-ending debate between those who like to start with a systematic and structured approach and those who like to start with tests, demos, etc. My attitude is very clear: when you operate in a "landscape" which you need to explore better start with tests, demos etc. But I know for example that my colleagues from ENVRI followed another path, used ODP to specify their infrastructure based on the different views and created an enormous amount of specification notes etc. Probably both ways are important, but in a bottom-up initiative such as RDA where many people address the data issues from a large number of different aspects only the test/demo approach will work. This may change at a certain moment when we understand the landscape. This will probably also be the moment when industry will drop in.
    So I could live with the term "architecture" as defined by systems engineering, but then we need to define for our community how we want to understand it. The way you describe it in your text would not be satisfying for me, since different interpretations are possible. A blueprint of a house already specifies where you have doors, windows, how big they are, etc. I think that in DF we want to remain at the level that we state that we will need doors and windows for certain functional reasons and what their requirements are etc. (some doors must be bigger than others etc.). I think that Keith also argued in this direction.
    So if we do this kind of small exercise and say what we mean with architecture in the realm of RDA I would be ok with using the term as defined by SE. At least at this moment the definition (3.150) seems to be abstract enough. I think we as data practitioners simply do not want to get in these endless architecture discussions some of our IT people like so much. Similar holds with the other terms you are mentioning. So perhaps we should have augmented term definitions by making use of what we can get from SE. More important to me is to define now components/services (there is a dualism between the two) which we need within the DF so that working with data gets much more efficient, effective, self-explanatory, etc.
    AFter having made my points let me also come back on your second questions to Beth. does DFIG have an overall organizing influence on the big picture? Yes if it understands that DF is not the whole RDA and restricts itself. Yes if it helps exploring the data landscape we are operating in, in particular by pointing to components/services that are required to make progress and making terms such as "architecture"much more concrete by doing.
    Again - thanks a lot for kicking this off and sorting out the terms. Perhaps we are not so far away from each other. At least for writing the White Paper this is excellent material and discussion.
    Peter
    --
    Full post: https://rd-alliance.org/group/data-fabric-ig/post/architecture-or-not.html
    Manage my subscriptions: https://rd-alliance.org/mailinglist
    Stop emails for this post: https://rd-alliance.org/mailinglist/unsubscribe/46202
    Gary, Keith,
    Yes I guess you are both right in so far as we need to discuss at "conceptual level" although discussing these terms from different perspectives has potential for confusion.
    As someone suggested at our P4 session: we should start arguing from data. But how does "data" appear in our DF discussion? It comes along as collections (as the most useful term and we have defined in DFT what is mentioned by "collection") that will be processed by a "machinery" as roughly indicated already. So I simply hope that we agree on this basic description what a collections is. If so then we can start discussing about the components/services, what kind of data structures they need to support, what kind of functions they need to offer and how they interoperate. That obviously falls under "logical level" to use Keith's terms (which indeed are "old term defs").
    Peter
    From: keith.jeffery=***@***.***-groups.org [mailto:***@***.***-groups.org] On Behalf Of ***@***.***
    Sent: Sunday, November 02, 2014 12:43 PM
    To: ***@***.***-groups.org
    Subject: Re: [rda-datafabric-ig] Re: [rda-datafabric-ig] Architecture or Not?
    Gary –
    The analogy with the ‘data world’ is apposite; there the word ‘model’ is used in the sense that the data model is a representation of the real world appropriate for the purpose (at conceptual level) and appropriate for doing the business of the organisation (at logical level) finally working on a computer system (at physical level).
    The advantage is that there exist decades of experienced and useful tools for managing the 3 levels of models including dealing with referential and functional integrity.
    Best
    Keith
    Keith G Jeffery Consultants
    Prof Keith G Jeffery
    E: ***@***.***
    T: +44 7768 446088
    S: keithgjeffery
    Past President ERCIM www.ercim.eu (***@***.***)
    Past President euroCRIS www.eurocris.org
    Past Vice President VLDB www.vldb.org
    Fellow (CITP, CEng) BCS www.bcs.org
    Co-chair RDA MIG https://rd-alliance.org/internal-groups/metadata-ig.html
    Co-chair RDA MSDWG https://rd-alliance.org/working-groups/metadata-standards-directory-work...
    Co-chair RDA DICIG https://rd-alliance.org/internal-groups/data-context-ig.html
    ----------------------------------------------------------------------------------------------------------------------------------
    The contents of this email are sent in confidence for the use of the
    intended recipient only. If you are not one of the intended
    recipients do not take action on it or show it to anyone else, but
    return this email to the sender and delete your copy of it.
    ----------------------------------------------------------------------------------------------------------------------------------
    - Show quoted text -From: gbergcross=***@***.***-groups.org [mailto:***@***.***-groups.org] On Behalf Of Gary
    Sent: 31 October 2014 16:18
    To: ***@***.***-groups.org
    Subject: [rda-datafabric-ig] Re: [rda-datafabric-ig] Architecture or Not?
    This discussion thread will provide input to our DF White Paper and as Larry noted some of us will be meeting at NIST (John Henry's turf) in a few weeks.
    I also made some remarks at the tail end of the NDS meeting in response to the question. One thing I thought important is that we have an IG and thus provide a venue for discussion of ideas that we hope will clarify what some here are calling the Landscape, perhaps to avoid the unloaded term Architecture. To me both landscape, architecture as well as system gesture to the idea of some type of structure with related elements or components. So are discussion will include system elements and relations between them.
    The other point I may was via a bit of analogy to data modeling. It has organization and components but there are some useful distinctions of levels. Some data models are at the Logical level and pick some logical approach such as a relational one. They take the organizing principle and relational algebra etc. You can then do a physical model of this, but going the other way we can be more abstract than the Logical model and use a conceptual model. You can think of any of these 3 levels are some type of organizing system or architecture. But if you are at the conceptual level we are discussing "components" and abstract relations that are far removed from a particular physical implementation.
    So we can talk about services & data or metadata service concepts and maybe even some level of service oriented architecture but not as much about specific services implemented by a specific technology unless this is categorical -such as we need the category of metadata services to find relevant data via search .
    Thus I think that as an IG we will be largely talking conceptually and from time to time looking a bit less abstractly to see if it makes sense from that perspective. We may even talk about some things at a logical level and find ways to talk about them more generally.
    Gary Berg-Cross, Ph.D.
    ***@***.***
    http://ontolog.cim3.net/cgi-bin/wiki.pl?GaryBergCross
    SOCoP Executive Secretary
    Independent Consultant
    Potomac, MD
    240-426-0770
    On Thu, Oct 30, 2014 at 9:35 AM, Peter Wittenburg
    <***@***.***> wrote:
    Hallo John,
    great that you kicked this discussion off, I was waiting on your written input, since for me as non-native E-speaker it is more optimal to read carefully what is being said. Let me respond to your long, but good note (and of course I saw what the other said and also looked at Beth slides). And excellent that you had a discussion about this at the recent meeting. Hope that you will be around when we have the next discussion in Washington as Larry pointed out.
    Let me first clarify in which role I will answer, since I was re-elected for the TAB as well. One part of what you stated addresses RDA etc., the other part addresses DF issues and there are certainly relationships. I will argue here as a bottom-up person (my preferred role) being interested to make data work more efficient in future, so all I am saying is just my personla view :)
    TAB Issue
    This relates mainly to your second question. RDA needs to remain primarily a bottom-up driven organization, so if the two of us decided to start a new initiative, it should be hard for boards etc. to stop us if we can present a well-thought-through idea and fulfill the formal requirements. The role of TAB is to synch, harmonize etc. where possible and of course check adherence to requirements. Beyond that, as you note, the role of TAB is to describe and structure the overall "landscape" of issues that RDA should be dealing with. To me it is obvious that at this moment it is too early to describe and structure this landscape in too much detail, since we still need to learn more. Therefore DF will have an important role in driving this discussion about the "landscape" if we do not make big errors such as narrowing down the scope too early. For me there is no doubt that the crowd organized in the IGs and WGs is the most important driving force. The TAB, if it makes use of its role in a smart way, will help to consolidate, bring people together, etc.
    Important for me is also to note that RDA covers much more than DF people, but this is already part of the answer to your DF points. When you say that systems engineering has already thought about a system where, for example, humans are also components, then I would answer that psychology, law etc. have also worked out elaborate models of "systems" that include humans. For issues such as Legal Interoperability I would much rather look to models of human behavior coming from those disciplines than from systems engineering. Since RDA covers IGs/WGs that look at "humans" in the data game from different views, "systems engineering" will have one important view, but it will not be the only one. I repeat very frequently that RDA needs to define its own culture if it is not to fail, and part of this culture is terminology.
    DF Issue
    Probably I am a bit along Larry's view, as so often. It is very good to know that systems engineering has defined a variety of terms relevant for us, and I am happy to make use of them. But let us take the term "architecture" as an example. In our software design and development experience the term "architecture" is very much related to designing and then implementing a system (a house) which then will do some function for us. In EUDAT we had a two-year-long discussion about when to start discussing an overall architecture. We, as people who wanted to get specific services, did not want to get into endless discussions about an overall architecture too early. I think that this comes down to a never-ending debate between those who like to start with a systematic and structured approach and those who like to start with tests, demos, etc. My attitude is very clear: when you operate in a "landscape" which you need to explore, it is better to start with tests, demos etc. But I know for example that my colleagues from ENVRI followed another path, used ODP to specify their infrastructure based on the different views and created an enormous amount of specification notes etc. Probably both ways are important, but in a bottom-up initiative such as RDA, where many people address the data issues from a large number of different aspects, only the test/demo approach will work. This may change at a certain moment when we understand the landscape. This will probably also be the moment when industry will drop in.
    So I could live with the term "architecture" as defined by systems engineering, but then we need to define for our community how we want to understand it. The way you describe it in your text would not be satisfying for me, since different interpretations are possible. A blueprint of a house already specifies where you have doors, windows, how big they are, etc. I think that in DF we want to remain at the level that we state that we will need doors and windows for certain functional reasons and what their requirements are etc. (some doors must be bigger than others etc.). I think that Keith also argued in this direction.
    So if we do this kind of small exercise and say what we mean by architecture in the realm of RDA, I would be ok with using the term as defined by SE. At least at this moment the definition (3.150) seems to be abstract enough. I think we as data practitioners simply do not want to get into these endless architecture discussions some of our IT people like so much. The same holds for the other terms you are mentioning. So perhaps we should have augmented term definitions by making use of what we can get from SE. More important to me is to define now the components/services (there is a dualism between the two) which we need within the DF so that working with data gets much more efficient, effective, self-explanatory, etc.
    After having made my points, let me also come back to your second question to Beth: does DFIG have an overall organizing influence on the big picture? Yes, if it understands that DF is not the whole RDA and restricts itself. Yes, if it helps exploring the data landscape we are operating in, in particular by pointing to components/services that are required to make progress and making terms such as "architecture" much more concrete by doing.
    Again - thanks a lot for kicking this off and sorting out the terms. Perhaps we are not so far away from each other. At least for writing the White Paper this is excellent material and discussion.
    Peter
    --
    Full post: https://rd-alliance.org/group/data-fabric-ig/post/architecture-or-not.html
    Manage my subscriptions: https://rd-alliance.org/mailinglist
    Stop emails for this post: https://rd-alliance.org/mailinglist/unsubscribe/46202

  • Gary Berg-Cross's picture

    Author: Gary Berg-Cross

    Date: 03 Nov, 2014

    Peter said (perhaps thinking of the concept of "data collection"):
    >we need to discuss at "conceptual level" although discussing these terms
    from different perspectives has potential for confusion.
    I would just offer the possibility that there may be different perspectives
    to a term because there may be real differences in what a data collection
    means to different groups (such as digital librarians). The usual thing is
    to declare different (logical) data types to make this clear and, to the extent
    one can, to declare what the relations between them are.
    They may require different services, for example.
    If this is a reality one can see this as part of a scenario
    which identifies real data so we are doing more than talking past one
    another with the use of one term to mean several different things.
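    As a minimal sketch of that idea, two notions of "collection" can be declared as distinct types with an explicit relation between them. All type names, fields, and service categories below are hypothetical illustrations and are not taken from DFT or any RDA document.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DigitalObject:
    """A single identified data item (hypothetical minimal form)."""
    pid: str


@dataclass
class CuratedCollection:
    """Hypothetical: a library-style collection with described, stable membership."""
    pid: str
    title: str
    members: list[DigitalObject] = field(default_factory=list)


@dataclass
class WorkingCollection:
    """Hypothetical: an ad hoc aggregation assembled for one processing run."""
    label: str
    members: list[DigitalObject] = field(default_factory=list)
    # Explicit relation between the two types: where the members were drawn from.
    derived_from: list[CuratedCollection] = field(default_factory=list)


def required_services(collection) -> set[str]:
    """Return illustrative service categories each collection type might need."""
    if isinstance(collection, CuratedCollection):
        return {"pid-resolution", "metadata-search"}
    if isinstance(collection, WorkingCollection):
        return {"workflow-execution"}
    raise TypeError(f"unknown collection type: {type(collection).__name__}")
```

    With the two meanings kept apart as types, a scenario can say which one it is about, and the differing service needs become visible rather than hidden behind one shared word.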
    Gary Berg-Cross, Ph.D.
    ***@***.***
    http://ontolog.cim3.net/cgi-bin/wiki.pl?GaryBergCross
    SOCoP Executive Secretary
    Independent Consultant
    Potomac, MD
    240-426-0770
    On Mon, Nov 3, 2014 at 6:33 AM, Peter Wittenburg
    <***@***.***>
    wrote:

  • John Henry Scott's picture

    Author: John Henry Scott

    Date: 03 Nov, 2014

    Hi DFIG,
    I hope everyone had a good weekend. Let me briefly address the discussion about the RDA "landscape" and bottom-up vs top-down.
    On Thu Oct 30, Peter posted a long and very helpful message, including a section on the TAB. He wrote:
    "Beyond that as you note the role of TAB is to describe and structure the overall "landscape" of issues that RDA should be dealing about. "
    In the ISO/IEC/IEEE standardized language, I think the term for what Peter calls "landscape" is environment.
    From ISO/IEC/IEEE 42010:2011:
    environment
    context determining the setting and circumstances of all influences upon a system
    NOTE The environment of a system includes developmental, technological, business, operational, organizational, political, economic, legal, regulatory, ecological and social influences.
    The relationship between the terms I have been using (including their cardinality) might be made more clear with a diagram from the standards documents:
    [Diagram from the standards documents; image not preserved in this archive]
    Another important point I'd like to make is that whether we acknowledge it or not, the research data world is already a system, and it already has an architecture. It might be an organic, evolved, messy, unintended architecture that was not created rationally or by design, but it exists. There are lots of components and they interact with each other.
    I agree with Peter that we don't know enough about the environment to begin prescribing a better architecture from the top down. I think we first need to better understand the existing system and start analyzing what parts are working well and what could be improved. This should be done in parallel with a more deliberate effort to understand and analyze the needs of the community. Both of these steps are a basic part of systems engineering, which, by the way, is as comfortable with bottom-up approaches as with top-down ones. In ISO jargon this would be called identifying your stakeholders, eliciting their needs, and articulating the purpose of the system and the various system concerns.
    As Peter mentioned earlier, the good news is that I think our opinions are all closer to each other than it might appear, but we are struggling a bit with communication problems, something standardized terminology is designed to address.

    --John Henry
    ________________________________
    On Mon, Nov 3, 2014 at 6:33 AM, Peter Wittenburg
    <***@***.***> wrote:
    Gary, Keith,
    Yes I guess you are both right in so far as we need to discuss at "conceptual level" although discussing these terms from different perspectives has potential for confusion.
    As someone suggested at our P4 session: we should start arguing from data. But how does "data" appear in our DF discussion? It comes along as collections (the most useful term, and we have defined in DFT what is meant by "collection") that will be processed by a "machinery" as roughly indicated already. So I simply hope that we agree on this basic description of what a collection is. If so, then we can start discussing the components/services, what kind of data structures they need to support, what kind of functions they need to offer and how they interoperate. That obviously falls under "logical level", to use Keith's terms (which indeed are "old term defs").
    Peter
    From: keith.jeffery=***@***.***-groups.org [mailto:keith.jeffery=***@***.***-groups.org] On Behalf Of ***@***.***
    Sent: Sunday, November 02, 2014 12:43 PM
    To: ***@***.***-groups.org
    Subject: Re: [rda-datafabric-ig] Re: [rda-datafabric-ig] Architecture or Not?
    Gary –
    The analogy with the ‘data world’ is apposite; there the word ‘model’ is used in the sense that the data model is a representation of the real world appropriate for the purpose (at conceptual level) and appropriate for doing the business of the organisation (at logical level) finally working on a computer system (at physical level).
    The advantage is that there exist decades of experience and useful tools for managing the 3 levels of models, including dealing with referential and functional integrity.
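    To illustrate the three levels, here is a hedged sketch covering referential integrity only. The entity, table, and column names are invented for this example and come from no RDA or DFT document. The conceptual statement "every digital object belongs to one collection" becomes a foreign key at the logical level, and an in-memory SQLite database stands in for the physical level, where the rule is enforced.

```python
import sqlite3

# Logical level: the conceptual rule "each digital object belongs to exactly
# one collection" expressed as relational tables with a foreign key.
SCHEMA = """
CREATE TABLE collection (
    pid   TEXT PRIMARY KEY,
    title TEXT NOT NULL
);
CREATE TABLE digital_object (
    pid            TEXT PRIMARY KEY,
    collection_pid TEXT NOT NULL REFERENCES collection(pid)
);
"""


def build_database() -> sqlite3.Connection:
    """Physical level: realise the logical schema in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def add_object(conn: sqlite3.Connection, pid: str, collection_pid: str) -> bool:
    """Insert an object; return False if referential integrity rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO digital_object VALUES (?, ?)", (pid, collection_pid)
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

    An object pointing at an existing collection is accepted, while one pointing at a collection that does not exist is rejected by the database itself, so the conceptual rule never has to be re-checked by each service.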
    Best
    Keith
    Keith G Jeffery Consultants
    Prof Keith G Jeffery
    E: ***@***.***
    T: +44 7768 446088
    S: keithgjeffery
    Past President ERCIM www.ercim.eu (***@***.***)
    Past President euroCRIS www.eurocris.org
    Past Vice President VLDB www.vldb.org
    Fellow (CITP, CEng) BCS www.bcs.org
    Co-chair RDA MIG https://rd-alliance.org/internal-groups/metadata-ig.html
    Co-chair RDA MSDWG https://rd-alliance.org/working-groups/metadata-standards-directory-work...
    Co-chair RDA DICIG https://rd-alliance.org/internal-groups/data-context-ig.html
