Building a common ground for a FAIR microstructure repository
Submitted by Eva Campo
Human history closely parallels the development of metallurgy. From the Stone Age through the Copper, Bronze, and Iron Ages, increasingly sophisticated tools and constructs were manufactured, ultimately making possible civilization as we know it today. Indeed, technologies developed in the late nineteenth and twentieth centuries enabled the production of steel, which would dictate the fortunes of whole countries, such as the United States, for years to come. In the twenty-first century, demands on metals require fine-tuning of their properties. To date, fine-tuning imposes considerable delays and costs on adjacent industries, such as semiconductors, photonics, construction, and fuselage, amongst others, creating global unrest.
At a scientific level, fine-tuning ultimately depends on how metals, ceramics, and semiconductors behave during fabrication and subsequent treatments. This is what experts in the field call the trinity of nucleation, growth, and processing, or microstructure.
Today, the ability to harvest data from a multiplicity of sources is inspiring the possibility of manufacturing specific microstructures by design. To this end, data aggregation is needed, along with ontologies.
However, barriers exist on both fronts. Artisans of Samurai swords in Seki or Spanish swords in Toledo were not in the habit of sharing technology, and this tradition has persisted to this very day. At best, surviving attempts at repository building worldwide are intranational, raising concerns about future interoperability. Progress on these fronts will likely usher in a whole new era of civilization.
Meeting objectives
1. Introduce existing microstructure repository efforts worldwide
2. Identify commonalities/gaps
3. Identify data expertise needed
4. Gauge interest in forming a working group at RDA
Collaborative notes:
https://docs.google.com/document/d/1It7usZbbyoZJ0q4F1-OXtvqQl8GleGf7UnQL...
- Welcome and meeting objectives (Campo) 5 mins
- Round Table I: Description of existing efforts worldwide (Eberl, Trautt, and Tanifuji). Panelists will report on efforts that have been identified and discussed prior to the meeting. 15 mins
- Identification of gaps (Eberl, Arróyave, and Rickman). Panelists will report on gaps that have been identified prior to the meeting. 15 mins
- Resource assessment (Arróyave, Greenberg, and Hanisch). Panelists will report on current resources that have been identified prior to the meeting. 15 mins
- International engagement (Campo). Speaker will describe the likelihood of "dark data" discoverability. 10 mins
- Open mic to audience (Campo moderates) 15 mins
- Closing: provide a channel to ratify engagement and future communications towards a working group. (Campo) 5 mins
Break: 10 mins
The group is just getting started. A number of repositories are being developed worldwide without communication between groups. This group aims to remedy this situation while identifying resources, minimizing duplication, and agreeing on standards towards interoperability.
Many existing efforts in this realm remain unpublished. It is worth emphasizing that this field of knowledge has been particularly late in adopting data science, and the speakers in the present group are pioneers. Some of the published efforts in this realm are authored by speakers and co-chairs in this proposal.
In Germany, many scientists from different universities and research institutions have been working in parallel on different parts, which we are trying to fuse together for materials science & engineering:
The main goal of our efforts is to represent process – microstructure – property relations in a FAIR manner and then use them to create new knowledge and novel materials.
Projects so far include:
- A pending proposal (www.NFDI-MatWerk.de) for infrastructure,
- One project (applied materials science) in action since 2019 (www.materialdigital.de), a platform project plus 13 materials projects starting in March 2021, and another 10 industrial projects coming in 2022,
- Roughly 12 EU projects, also connected to EOSC
Infrastructure-wise, groups in Germany work on different aspects in parallel:
- Decentralized concepts of a Digital Materials Environment – raw data is distributed, metadata is centralized within a common knowledge graph,
- Heterogeneous storage concepts (graph, SQL, NoSQL, … databases)
- Data formats based on Industry 4.0 standards,
- Data workflows: mostly pyiron from MPIE (pyiron · GitHub), with common metadata schemas across all involved tools
- Knowledge representation (EMMO, BFO, …) of microstructure, properties and processes,
- AI applications, from analytics to representation to knowledge generation
- Incentives and business models (integrate fundamental research and commercial interests)
- Other: AAI (OAuth, …), containerization (Docker), …
Similar efforts exist in Japan and the US, and these should ultimately conform to FAIR standards. The timing is good, since additional resources are widely expected to be awarded. We are particularly enthusiastic to enlist Professor Jane Greenberg, given her expertise in metadata.