Project Number: | RE 4004 (RE) |
Project Title: | DESIRE II - Development of a European Service for Information on Research and Education II |
Deliverable Number: | D3.5 |
Deliverable Title: | DESIRE metadata registry framework |
Deliverable Type: | PU |
Deliverable Kind: | RE |
Principal Reviewer: | Name | Tom Baker |
| Address | GMD Library Schloss Birlinghoven 53754 Sankt Augustin, Germany |
| E-Mail | Thomas.Baker@gmd.de |
| Telephone | +49-2241-14-2352 |
| Fax | +49-2241-14-2619 |
| Credentials | Member of Dublin Core Executive Committee; Member of Dublin Core Working Groups (Multiple Languages Interest Group (co-chair), Data Model, etc.); Partner in SCHEMAS project |
Summary: | Relevant | 5 (1 = poor, 5 = excellent) |
| State-of-Art | 4 |
| Meets Objectives | 5 |
| Clarity | 4 |
| Value to Users | 3 |
Specific Criticisms | 1 | Suitability of ISO/IEC 11179 as a universal conversion target could have been discussed more fully. |
| 2 | Unclear how DESIRE's subset of BSR would be maintained as the BSR standard evolves as a whole. |
| 3 | Suitability of BSR categories for fine- versus coarse-grained mapping is not evaluated; role of Super Elements is unclear. |
| 4 | Glossary lacks some fundamental terms, such as Element. |
Developer Response: | 1 | Agreed. In D3.5, ISO/IEC 11179 is only really discussed in relation to the prototype DESIRE registry. |
| 2 | Agreed. This is something that would need to be addressed in a non-prototype registry-type service. |
| 3 | Agreed. The section on metadata mappings and cross-walks (p. 31) has been amended slightly to reflect the granularity issue. |
| 4 | Definitions for the terms, 'Element', 'Vocabulary', 'Qualifier' and 'Schema' have been added to the glossary. |
Review of ``DESIRE II: Project Deliverable''
Thomas Baker <thomas.baker@gmd.de>, GMD, Sankt Augustin, Germany
9 March 2000
The DESIRE registry is a nicely scoped ``proof of concept'' for a general registry of metadata. It shows an innovative design (e.g., using BSR as an interlingua between schemas) implemented with a clear and simple interface. The practical implementation decisions are very sensible. It uses a relational database instead of the more experimental RDF schema technology. It acknowledges but side-steps advanced questions such as the versioning of individual elements. And it "registers" the elements of seven schemas internally, while pointing out that such a registry should eventually be implemented using machine-readable schemas maintained in a distributed manner by a variety of maintenance agencies and implementation-level registries over the Web.
The great value of this demonstrator lies in its presentation of related metadata entities in a form that can be easily browsed. Using such a tool to browse schemas is in itself a learning experience and promises to be one of the main benefits of registries generally, as the authors of this report point out.
Some of the interesting questions that this prototype suggests have to do with how this model might be enriched semantically and distributed over the Web. These include:
- The problem of mapping various schemas to the DESIRE data model.
By its design, the DESIRE registry looks at various schemas, each of
which has its own data model, through the lens of ISO/IEC 11179.
This raises in part a mapping question -- how well do various
schemas fit into the ISO/IEC 11179 model (in its DESIRE
adaptation)? The DESIRE model is "intended to be generic enough to
support the registration of elements from multiple namespaces and
detailed enough to support advanced functionality such as the
automatic generation of cross-walks" [p. 14]. Evaluating the
suitability of this model as further schemas are added seems like a
promising line for further investigation. What heuristics are
necessary for such mappings? In a distributed registry, can the
mapping to a uniform DESIRE data model be automated? (And more
specifically, what is the relationship of BSR representation classes
such as Name, Text, and Code to the datatypes of ISO/IEC 11179?)
- The problem of mapping the semantics of schemas to the Interlingua.
Mapping elements of various schemas to concepts of the ISO Basic
Semantic Registry standard is an interesting idea. It is worth
noting (and perhaps citing) that this is analogous to the approach
taken by the EuroWordNet project, which maps entire vocabularies of
natural languages such as Italian and Dutch to a superset of semantic
concepts (the "n-to-1" model) instead of trying to maintain a
quadratically growing number of 1-to-1 mappings (the "n-to-n"
model) as new languages are added. The EuroWordNet project calls
this an Interlingual Index, or Interlingua. Interlingua concepts
can be related to language terms not just as pure equivalents, but
also through a number of other, fuzzier types of equivalence. The
DESIRE registry currently supports only simple mapping, so
experimenting with a richer semantic framework of this kind seems
like a promising direction for further research.
- The problem of mapping the DESIRE interlingua to the BSR itself.
The DESIRE registry has imported a subset of BSR terms for use as an
interlingua, but BSR is itself undergoing continual development.
Maintaining a DESIRE subset as the BSR itself evolves through
successive versions raises longer-term issues of design. The DESIRE
II glossary defines Semantic Unit as "an informational element
described independently of a specific namespace" [p. 10], but it
would seem these do in effect get a namespace (i.e., the DESIRE
Registry namespace) when they are incorporated into the
interlingua.
- The problem of granularity, mapping, and navigation among BSR
terms within the DESIRE interlingua. Ideally, one would expect a
Semantic Unit such as bsr/1.0/2045 (Contributor.Name) to be related
to a unit such as bsr/1.0/2044 (Originator.Name) via a Super Element
(see p. 20). Any loss of semantic precision involved in using BSR
as opposed to 1-to-1 crosswalks could in principle be minimized by
using hierarchical relationships within the interlingua together
with richer semantic relationships between the interlingua and
schema elements. As the DESIRE II report points out, "if the
semantic layer is not detailed enough, the translation will begin to
suffer from information loss" [p. 28], but the problem cuts both
ways -- coarse-grained navigation and discovery could suffer if the
semantic layer were not general enough.
- The relationship of maintenance agencies, "implementation-level"
registries, and application profiles. (The glossary definition of
application profiles seems a bit vague -- they "may register
elements... that are valid for use with particular elements.") By
definition, an application profile does not introduce new data
elements; all of its data elements must derive from other
namespaces. This suggests a simple and workable way to approach the
problem of creating "hybrid" schemas for specific purposes, and it
would be a good topic for further investigation.
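
As a rough illustration of the "n-to-1" model and of the Super Element fallback raised in the points above, the following Python sketch derives a crosswalk between two schemas by joining on shared semantic units. It is only a sketch under stated assumptions, not a description of the DESIRE implementation: the schema names, the element names, the `desire/agent-name` unit, and all function names are hypothetical; only the two BSR identifiers are the ones quoted above.

```python
# Hypothetical mappings from schema elements to interlingua semantic units.
# The two BSR identifiers are those quoted in this review; the schema and
# element names, and their assignments, are illustrative assumptions only.
TO_INTERLINGUA = {
    "schemaA": {"author": "bsr/1.0/2044", "helper": "bsr/1.0/2045"},
    "schemaB": {"originator": "bsr/1.0/2044"},
    "schemaC": {"contributor": "bsr/1.0/2045"},
}

# Hypothetical Super Element links (narrower unit -> broader unit).
# "desire/agent-name" is an invented identifier, not a real BSR term.
SUPER_ELEMENT = {
    "bsr/1.0/2044": "desire/agent-name",
    "bsr/1.0/2045": "desire/agent-name",
}


def ancestors(unit):
    """Return the unit followed by its chain of broader units."""
    chain = [unit]
    while chain[-1] in SUPER_ELEMENT:
        chain.append(SUPER_ELEMENT[chain[-1]])
    return chain


def crosswalk(source, target, allow_broader=False):
    """Derive a source -> target crosswalk by joining on semantic units.

    With allow_broader=True, elements whose units differ but share a
    Super Element are also paired and flagged as "broader" (inexact).
    An exact match always takes precedence over a broader one.
    """
    result = {}
    for s_elem, s_unit in TO_INTERLINGUA[source].items():
        for t_elem, t_unit in TO_INTERLINGUA[target].items():
            if s_unit == t_unit:
                result[s_elem] = (t_elem, "exact")
            elif allow_broader and s_elem not in result:
                if set(ancestors(s_unit)) & set(ancestors(t_unit)):
                    result[s_elem] = (t_elem, "broader")
    return result


def mapping_sets_needed(n_schemas):
    """Compare pairwise crosswalks with mappings to a single interlingua."""
    pairwise = n_schemas * (n_schemas - 1) // 2  # one per pair of schemas
    via_interlingua = n_schemas                  # one per schema
    return pairwise, via_interlingua


if __name__ == "__main__":
    print(crosswalk("schemaA", "schemaB"))
    print(crosswalk("schemaA", "schemaB", allow_broader=True))
    print(mapping_sets_needed(7))
```

In this sketch an exact crosswalk pairs only elements mapped to the same semantic unit, while the optional fallback pairs elements through a shared broader unit and labels the result as inexact, which is one way the trade-off between fine- and coarse-grained mapping could be made visible to a user. For the seven schemas registered in the prototype, pairwise crosswalks would number 21, against 7 mappings to the interlingua.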
The Glossary in the DESIRE II report makes a good start at creating a
resource that is sorely needed in this area, where terminology is used
loosely and inconsistently even within specific projects and
initiatives. It is interesting to note that a few concepts that are
key to the DESIRE registry are not listed at all, such as Element,
Vocabulary, Qualifier, and Schema (in contrast to Scheme, which is
listed). This is by no means a shortcoming of the DESIRE glossary in
particular but reflects the lack of a common terminology in the
metadata field as a whole.
As a hands-on demonstrator, the DESIRE registry makes broader questions
of design and theory visible and tangible in a way that scholarly
papers alone cannot. It is a refreshingly sensible and
well-implemented first step.