
Metadata, Web caching and URI Resolution

Document Author: Dan Brickley, ILRT, UK
Javascript code by: Jan Grant, ILRT, UK
Last updated: 2000-02-17

Abstract

This DESIRE Demonstrator shows one possible application of W3C's RDF Metadata technology to the problem of URI/URN resolution, and relates this to Web-cache based deployment scenarios. This document itself constitutes an interactive demonstration of RDF / URI resolution capabilities, using a prototype RDF query engine written in Javascript.

[Note: A separate interactive demo is also available on the ILRT site which reflects the Handle resolution system into RDF. Unlike the current discussion document, this data does not include 'primary manifestation' or resource discovery metadata.]

Background

The original intention behind this demonstrator was to show the use of Web cache technology (such as Squid) for URI resolution. Since the work was originally proposed, things have moved on, making such a demonstrator redundant. The Squid cache package now ships with a built-in feature that can be used for URI resolution, and the Squid web site now points to its own demonstrator of this technology. Rather than duplicate this work, this document provides a demonstrator that explores a different angle on the same set of problems.

Web Caches and URI resolution

A Web cache or proxy serves as a bottleneck for end-user access to the network. As such, the cache system is well positioned to provide a number of value-adding features. This document presents an interactive demonstration of URI resolution and RDF metadata services that might be layered on top of a distributed Web caching mesh.

Existing work has shown that a Web cache can store 'URI resolution' information, and provide views of this data to end-user clients who have configured their browser application to use a resolution-aware cache service. The Squid cache, for example, has prototype URN support available. An article by Andy Powell in D-Lib Magazine provides an accessible overview that applies this Squid facility to the resolution of DOI identifiers (represented as URNs).

This demonstrator provides an interactive, client-side testbed for exploring the intersection of URI resolution and metadata modeling issues. The working assumption motivating this approach is that the ability to repurpose Web cache services for URI resolution has already been sufficiently demonstrated, and that other facets of the problem are worth exploring.

Terminology

Existing work on using Web cache infrastructure has typically taken as a model some database of 'URN to URL' mappings. There are (at least) two concepts of URN in circulation within the URI community. The Internet Engineering Task Force has a URN working group, and has produced a series of documents that propose a specialised type of URI called a URN, i.e. identifiers that begin with the string 'urn:'. The second, broader sense of URN encompasses identifiers within all URI schemes that are used as 'names' rather than as 'addresses', or that are managed with particular concern for persistence and longevity. The PURL initiative creates 'http://' addresses that can be considered URNs in this sense. Identifiers within alternative URI schemes such as the Handle System or the Digital Object Identifier (DOI) system can likewise be considered URNs, at least in this broader sense.

Working Scenario

This demonstration explores a simple example scenario in which an article from D-Lib Magazine has a URN-style identifier ("hdl:cnri.dlib/january98-kirriemuir"), as well as a primary http: identifier ("http://www.dlib.org/dlib/january98/01kirriemuir.html") corresponding to a page on the D-Lib web server. In addition, the D-Lib Web site is mirrored internationally, so we have two more identifiers for mirrored copies of the D-Lib page ("http://sunsite.anu.edu.au/mirrors/dlib/dlib/january98/01kirriemuir.html" and "http://mirrored.ukoln.ac.uk/lis-journals/dlib/dlib/dlib/january98/01kirriemuir.html").

There is, as yet, no consensus within the URI/URN and metadata community concerning the 'best' model for representing common scenarios such as this one. While this document does not propose a specific model, it does show how W3C's RDF metadata model can be used to represent and process such data.

Our example scenario can be modeled as a directed, labelled graph for graphical representation (pictured here) as well as for automated processing.

A graphical view of an RDF model of the example text

Note: the diagram source was rendered by Graphviz from AT&T Research. As future work, automatic generation of the .dot source files from RDF should be possible, as should SVG output to avoid image scaling problems.

Interactive Demo

The remainder of this document consists of an interactive RDF query / URI resolution demonstrator. The target audience is technical developers familiar with the RDF data model and URI resolution issues. Future refinements to this document may make it appropriate for a wider audience.

RDF query syntax: the queries shown below are expressed in terms of a representation of the RDF data model. Terms beginning with an initial capital letter are considered variable names by the query engine. The {curly braces} are used to enclose URIs, and "double quotes" to indicate literal text. Queries are matched as many times as possible against portions of the RDF graph, with tabular results displayed in a textarea in this page, and a separate query results HTML window.
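
As a rough illustration of the conventions just described, the following Javascript sketch parses the three kinds of term and matches a single triple pattern against a list of triples. It is not the query engine embedded in this page. The function names (parseTerm, matchTerm, matchPattern), the in-memory triple layout, and the use of 'demo:webLoc' as a stand-in for the property's full URI are all assumptions made for the example.

```javascript
// Hypothetical sketch, not the demo's own engine.
// Term conventions: {…} is a URI, "…" is literal text,
// and a leading capital letter marks a variable.
function parseTerm(text) {
  if (text.startsWith("{") && text.endsWith("}")) {
    return { kind: "uri", value: text.slice(1, -1) };
  }
  if (text.startsWith('"') && text.endsWith('"')) {
    return { kind: "literal", value: text.slice(1, -1) };
  }
  if (/^[A-Z]/.test(text)) {
    return { kind: "variable", name: text };
  }
  throw new Error("Unrecognised term: " + text);
}

// Match one pattern term against one graph value.
// Returns the (possibly extended) bindings, or null on failure.
function matchTerm(term, value, bindings) {
  const same = (a, b) => a.kind === b.kind && a.value === b.value;
  if (term.kind !== "variable") {
    return same(term, value) ? bindings : null;
  }
  if (Object.prototype.hasOwnProperty.call(bindings, term.name)) {
    return same(bindings[term.name], value) ? bindings : null;
  }
  return { ...bindings, [term.name]: value };
}

// Match a [subject, property, object] pattern against every triple,
// producing one row of variable bindings per match.
function matchPattern(pattern, triples) {
  const terms = pattern.map(parseTerm);
  const rows = [];
  for (const triple of triples) {
    let bindings = {};
    for (let i = 0; i < 3 && bindings !== null; i++) {
      bindings = matchTerm(terms[i], triple[i], bindings);
    }
    if (bindings !== null) rows.push(bindings);
  }
  return rows;
}

// Sample data: two of the locations from the working scenario.
const uri = (value) => ({ kind: "uri", value });
const sampleTriples = [
  [uri("hdl:cnri.dlib/january98-kirriemuir"), uri("demo:webLoc"),
   uri("http://www.dlib.org/dlib/january98/01kirriemuir.html")],
  [uri("hdl:cnri.dlib/january98-kirriemuir"), uri("demo:webLoc"),
   uri("http://sunsite.anu.edu.au/mirrors/dlib/dlib/january98/01kirriemuir.html")],
];

const locationRows = matchPattern(
  ["{hdl:cnri.dlib/january98-kirriemuir}", "{demo:webLoc}", "Location"],
  sampleTriples
);
console.log(locationRows.map((row) => row.Location.value));
```

Each row corresponds to one line of the tabular results; a pattern that matches nothing simply yields an empty list of rows.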

What does this show?

The interactive demonstrator component of this document consists of a Javascript demo embedded below, which reads in a simple RDF metadata database (expressed in a simple custom syntax) and answers URI-related queries.

This demonstrator is a simple prototype intended to outline future research and development work. It shows:

The RDF vocabulary sketched here is not a concrete proposal, and exists only for demonstration purposes. Some knowledge of the RDF data model (directed labelled graphs) is currently assumed.

Deployment context: one possible application of an RDF system similar to that prototyped here would be to generate 'URI2URI' or 'URN2URL' configuration files for use with Web cache URI resolution systems. In addition, the integration of general Web metadata shown here would be useful for cache resolvers that generate HTML menu pages describing the different Web addresses that correspond to some abstract resource.
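
To make the first of these applications concrete, here is a minimal Javascript sketch that derives 'URN2URL' mapping lines from a list of triples. The tab-separated output format is invented for this example and is not the configuration syntax of Squid or any other cache; the function names and the prefixed property names are likewise assumptions.

```javascript
// Hypothetical sketch: triples are [subject, property, object] strings.
// "demo:webLoc" stands in for the full URI of the webLoc property.
const WEB_LOC = "demo:webLoc";
const HANDLE_URI = "hdl:cnri.dlib/january98-kirriemuir";

const resolutionTriples = [
  [HANDLE_URI, WEB_LOC, "http://www.dlib.org/dlib/january98/01kirriemuir.html"],
  [HANDLE_URI, WEB_LOC, "http://sunsite.anu.edu.au/mirrors/dlib/dlib/january98/01kirriemuir.html"],
  [HANDLE_URI, WEB_LOC, "http://mirrored.ukoln.ac.uk/lis-journals/dlib/dlib/dlib/january98/01kirriemuir.html"],
  [HANDLE_URI, "dc:title", "Cross-Searching SubjectGateways"],
];

// Group every webLoc object under the identifier it belongs to,
// ignoring triples that use any other property.
function buildResolutionTable(triples) {
  const table = new Map();
  for (const [subject, property, object] of triples) {
    if (property !== WEB_LOC) continue;
    if (!table.has(subject)) table.set(subject, []);
    table.get(subject).push(object);
  }
  return table;
}

// Serialise the table as "identifier<TAB>location" lines.
// This line format is an assumption made purely for illustration.
function toMappingLines(table) {
  const lines = [];
  for (const [identifier, locations] of table) {
    for (const location of locations) {
      lines.push(identifier + "\t" + location);
    }
  }
  return lines;
}

const mappingLines = toMappingLines(buildResolutionTable(resolutionTriples));
console.log(mappingLines.join("\n"));
```

A real deployment would replace toMappingLines with a serialiser for whatever format the chosen cache resolver expects; the same table could equally drive an HTML menu page.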

An RDF Vocabulary for URI Resolution Description

This demo uses a small RDF vocabulary for describing URI resolution scenarios. In RDF contexts, this can be combined with other vocabularies such as the Dublin Core Element Set, to allow richer description of the resources being identified.

We define only two vocabulary elements: an RDF property which we call 'webLoc', and an RDF class (resource type) which we call 'PrimaryManifestation'. These are given Web identifiers as follows:

The example also uses the rdf:type property, which relates a resource to a class (category) such as 'PrimaryManifestation', as well as Dublin Core properties for 'title' and 'creator' which we apply (against DC best practice) to the abstract resource rather than its particular online manifestation(s).

The following text diagram represents the RDF graph used by the Javascript query engine:


[hdl:cnri.dlib/january98-kirriemuir] --demo:webLoc--> [http://www.dlib.org/dlib/january98/01kirriemuir.html] --rdf:type--> [demo:PrimaryManifestation] 
                              \--demo:webLoc--> [http://sunsite.anu.edu.au/mirrors/dlib/dlib/january98/01kirriemuir.html] 
                              \--demo:webLoc--> [http://mirrored.ukoln.ac.uk/lis-journals/dlib/dlib/dlib/january98/01kirriemuir.html] 
                              \--dc:title--> "Cross-Searching SubjectGateways"
                              \--dc:creator--> (etc...)

The example uses the 'webLoc' property to relate the abstract identifier to the D-Lib website URIs and mirrors. The 'primary' web manifestation of the resource we're describing is indicated using an rdf:type property.
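
The graph above can also be written down as a flat list of triples. The following Javascript sketch does so and shows how the rdf:type arc separates the primary manifestation from the other web locations. The prefixed names are copied from the text diagram in place of full URIs, URIs and literal text are not distinguished, and the helper names and the treatment of every non-primary webLoc as a mirror are assumptions made for this example.

```javascript
// Illustrative only: one [subject, property, object] entry per arc
// in the text diagram. The dc:creator arc is elided in the diagram
// and is therefore omitted here.
const ARTICLE = "hdl:cnri.dlib/january98-kirriemuir";
const DLIB_PAGE = "http://www.dlib.org/dlib/january98/01kirriemuir.html";

const exampleGraph = [
  [ARTICLE, "demo:webLoc", DLIB_PAGE],
  [ARTICLE, "demo:webLoc", "http://sunsite.anu.edu.au/mirrors/dlib/dlib/january98/01kirriemuir.html"],
  [ARTICLE, "demo:webLoc", "http://mirrored.ukoln.ac.uk/lis-journals/dlib/dlib/dlib/january98/01kirriemuir.html"],
  [DLIB_PAGE, "rdf:type", "demo:PrimaryManifestation"],
  [ARTICLE, "dc:title", "Cross-Searching SubjectGateways"],
];

// All objects reachable from `subject` along arcs labelled `property`.
function objectsOf(graph, subject, property) {
  return graph
    .filter(([s, p]) => s === subject && p === property)
    .map(([, , o]) => o);
}

// True when the location carries the PrimaryManifestation type.
function isPrimaryManifestation(graph, location) {
  return objectsOf(graph, location, "rdf:type").includes("demo:PrimaryManifestation");
}

// The first web location typed as primary, or null if there is none.
function primaryLocation(graph, resource) {
  const found = objectsOf(graph, resource, "demo:webLoc")
    .find((location) => isPrimaryManifestation(graph, location));
  return found === undefined ? null : found;
}

// Every web location that is not typed as primary.
function mirrorLocations(graph, resource) {
  return objectsOf(graph, resource, "demo:webLoc")
    .filter((location) => !isPrimaryManifestation(graph, location));
}

console.log(primaryLocation(exampleGraph, ARTICLE));
console.log(mirrorLocations(exampleGraph, ARTICLE));
```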

Demonstrator Requirements

The demonstrator consists of RDF data and query structures that run in Javascript-enabled Web browsers. It has been tested with versions 4 upwards of Netscape Navigator and MS Internet Explorer. The demo is based on an extended version of the Javascript TinyProlog engine by Jan Grant.

If the demonstrator fails to operate, check that Javascript is enabled. The demo is known to work in Netscape 4.5+ and IE5.

Usage: simply select the 'go!' buttons below to submit RDF queries to the Javascript system.

Known Issues

(1) When an RDF query produces 0 results, we do not handle the resultant Javascript error very well.

Examples

This query lists all triples in the RDF graph.

This asks for the addresses of all mirrors of the abstract resource whose URI is hdl:cnri.dlib/january98-kirriemuir.

This asks for the URIs of all mirrors of the resource available at the URI http://sunsite.anu.edu.au/mirrors/dlib/dlib/january98/01kirriemuir.html.

This asks for the primary online URI for the resource available at the URI http://sunsite.anu.edu.au/mirrors/dlib/dlib/january98/01kirriemuir.html. We also ask for the title and creator of the abstract resource that these documents represent.
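
The last example requires several triple patterns to be satisfied at once, with shared variables joining them. The Javascript sketch below shows one way such a conjunction could be solved by depth-first search. It is not the TinyProlog-based engine used by the demo, and the pattern layout, the function names and the prefixed property names are assumptions. Terms are kept as strings in the query syntax, so {…} is a URI, "…" is literal text, and a leading capital marks a variable. The creator is left out because the graph shown earlier elides it.

```javascript
// Hypothetical sketch of a conjunctive query over string triples.
const isVariable = (term) => /^[A-Z]/.test(term);

// Extend `bindings` so that `pattern` matches `triple`, or return null.
function unify(pattern, triple, bindings) {
  const next = new Map(bindings);
  for (let i = 0; i < 3; i++) {
    const term = pattern[i];
    if (!isVariable(term)) {
      if (term !== triple[i]) return null;
    } else if (next.has(term)) {
      if (next.get(term) !== triple[i]) return null;
    } else {
      next.set(term, triple[i]);
    }
  }
  return next;
}

// Solve all patterns together: match the first pattern against each
// triple, then solve the remaining patterns under the new bindings.
function solve(patterns, triples, bindings = new Map()) {
  if (patterns.length === 0) return [bindings];
  const [first, ...rest] = patterns;
  const solutions = [];
  for (const triple of triples) {
    const next = unify(first, triple, bindings);
    if (next !== null) solutions.push(...solve(rest, triples, next));
  }
  return solutions;
}

const HDL = "{hdl:cnri.dlib/january98-kirriemuir}";
const PRIMARY_PAGE = "{http://www.dlib.org/dlib/january98/01kirriemuir.html}";
const MIRROR_PAGE = "{http://sunsite.anu.edu.au/mirrors/dlib/dlib/january98/01kirriemuir.html}";

const database = [
  [HDL, "{demo:webLoc}", PRIMARY_PAGE],
  [HDL, "{demo:webLoc}", MIRROR_PAGE],
  [HDL, "{demo:webLoc}", "{http://mirrored.ukoln.ac.uk/lis-journals/dlib/dlib/dlib/january98/01kirriemuir.html}"],
  [PRIMARY_PAGE, "{rdf:type}", "{demo:PrimaryManifestation}"],
  [HDL, "{dc:title}", '"Cross-Searching SubjectGateways"'],
];

// Starting from a mirror, find the abstract resource, its primary
// manifestation, and its title.
const primaryQuery = [
  ["Resource", "{demo:webLoc}", MIRROR_PAGE],
  ["Resource", "{demo:webLoc}", "Primary"],
  ["Primary", "{rdf:type}", "{demo:PrimaryManifestation}"],
  ["Resource", "{dc:title}", "Title"],
];

const primarySolutions = solve(primaryQuery, database);
for (const solution of primarySolutions) {
  console.log(solution.get("Primary"), solution.get("Title"));
}
```

A query with no matches yields an empty array here, which a caller can test for before attempting to display results.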

RDF Database

Output

Output from the query engine is displayed in the text area below, and in a separate browser window.