The DESIRE Project – promoting and facilitating Web usage among Europe’s research community

(pre-print of an article to be published in the journal New Review of Information Networking)

Emma Worsfold and Debra Hiom, Institute for Learning and Research Technology, University of Bristol (emma.worsfold@bristol.ac.uk, d.hiom@bristol.ac.uk)

 

This paper reports on the outputs of the DESIRE project[1] - a large, international project aiming to promote and facilitate use of the World Wide Web (WWW) in the European research community. DESIRE is funded by the European Union, under the Telematics Applications Programme[2]. The project originally ran from January 1996 to March 1998 but has since received further funding until the year 2000. This paper will focus on the outputs from the first phase of DESIRE. An overview of all the DESIRE tools and reports will be given, followed by a more in-depth look at the outputs from the Resource Discovery strand of the project[3], which focused on the development of Web catalogues (subject gateways) and Web indexes. The DESIRE outputs are freely available and this report will describe and reference them all. Given the size and scope of the project, different outputs are likely to be of interest to different audiences, including network managers, librarians, Web researchers and researchers in general. This paper aims to act as a launch pad from which readers can begin their travels to the DESIRE outputs most relevant to their interests.

 

European Researchers and the WWW

Many researchers recognise that the Internet has great potential to support their work. In terms of information, they can use the Web both to publish and disseminate their own research data and to retrieve the data published by others. In terms of communication, they can use the Internet to discuss work in progress both on a one-to-one and one-to-many basis.

The Web can, however, leave researchers disappointed and frustrated, as the high expectations people develop when they see its potential are often not met. It can be difficult to locate high quality data on the Web efficiently and effectively. Access to Web resources can be slow and costly. Security concerns can make people reluctant to use the Web as much as they might.

DESIRE aimed to address some of the "Web needs" of European researchers and to enhance their experience of using it. The following needs of researchers have been identified:

The first phase of the DESIRE project aimed to address some of these needs. At the time no "off the shelf" solutions existed that could address these issues, so there was considerable scope for research and development in these areas.

 

History of DESIRE

The first phase of the DESIRE project has its roots in TERENA (The Trans-European Research and Education Network Association)[4]. TERENA is an international organisation which has members and representatives from nearly all European countries.

In 1994, members of the TERENA Working Group on Information Services and User Support came together to discuss the project proposal. The result was the DESIRE project, with 23 partner organisations based in eight European countries and with research and development plans in a number of areas.

FIG 1: The DESIRE home page

 

DESIRE 1996-1998: Work Packages and their Outputs

The first DESIRE project had eight main work packages. A brief overview will now be given, outlining the aims, results and outputs of each. All the DESIRE outputs are freely available for use. For more detailed information readers are referred to the URLs that are referenced for each output.

 

1. Resource Discovery – Web Indexing and Cataloguing

This work package[5] aimed to develop tools and methods that would enable European researchers to locate relevant high quality information resources on the Web more efficiently and effectively. The following three approaches have been investigated:

  1. Web cataloguing (subject gateways)
  2. automated indexes
  3. harvesting of high quality resources

Results

Methods and tools have been developed for:

The following services, based in the UK, The Netherlands and Sweden demonstrate the DESIRE work on subject gateways:

DESIRE has identified the potential for a European network of subject gateways. The tools, methods and guidelines for doing this are already available, but effort needs to be focused on encouraging other European countries to join the network and set up their own subject gateways. The next phase of DESIRE (1998-2000) aims to promote the gateway model to libraries and research institutions across Europe, enabling interoperability between gateways from many countries.

The potential for building Web indexes aimed at specific user-groups has also been identified. A software package called Combine Harvester[9] has been developed to gather Web documents, parse them and collect them in a database. The Combine Harvester is intended for anyone setting up some kind of Web-index. Combine has been used to build the Nordic Web Index[10], which has nodes in Sweden, Denmark and Iceland.

Work has also been done on combining the Web cataloguing and indexing approaches – offering researchers the choice of searching quality-controlled catalogues and automated indexes simultaneously.

Deliverables

The deliverables from this strand of the DESIRE project are referenced and described later in this paper.

2. Caching

The Caching work package[11] aimed to address the need for improved access and speed of retrieval of information on the Web. It developed tools and techniques for information caching – to enable more efficient use of available bandwidth.

Results

National caches have been developed in Norway and the Netherlands, which have demonstrated how Web traffic can be significantly reduced and money saved by installing caches. Guidelines for good practice in setting up caches have been made freely available, with a view to encouraging the development of a European caching mesh. This could significantly reduce the load on the network infrastructure by keeping copies of frequently requested information close to the point of access.
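
The principle behind these savings can be illustrated with a minimal sketch. It is not taken from the DESIRE caching software: the class name SimpleWebCache, the fetch_from_origin callback, the capacity of two documents and the least-recently-used eviction rule are all assumptions made for illustration.

```python
from collections import OrderedDict


class SimpleWebCache:
    """Illustrative least-recently-used cache for Web documents (hypothetical)."""

    def __init__(self, fetch_from_origin, capacity=2):
        self.fetch_from_origin = fetch_from_origin  # called only on a cache miss
        self.capacity = capacity
        self.store = OrderedDict()  # url -> document body, oldest first
        self.hits = 0
        self.misses = 0

    def get(self, url):
        if url in self.store:
            self.hits += 1
            self.store.move_to_end(url)  # mark as most recently used
            return self.store[url]
        self.misses += 1
        body = self.fetch_from_origin(url)  # the only point that uses the network
        self.store[url] = body
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)  # evict the least recently used entry
        return body


# A stand-in for the remote server, recording every request that reaches it.
origin_requests = []


def fake_origin(url):
    origin_requests.append(url)
    return "content of " + url


cache = SimpleWebCache(fake_origin, capacity=2)
for url in ["http://a.example/", "http://b.example/",
            "http://a.example/", "http://a.example/"]:
    cache.get(url)
```

In this run four requests are made by users but only two reach the origin server; the repeated requests are answered from the local copy, which is the effect a caching mesh aims to achieve at national scale.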

Deliverables

  1. Survey of caching requirements and specifications for prototype[12]
  2. Report on the costs and benefits of operating caching services[13]
  3. Practical Experiences of establishing caching meshes[14]

3. Security

The Security work package[15] aimed to explore the issue of security on the Web, with a view to protecting confidentiality and providing pricing mechanisms where necessary.

Results

A Smartcard system was built to give researchers access via the public network to confidential or licensed information sources with security similar to that of dedicated private networks. The protocol developed is extendable to a variety of Smartcards and third party authentication systems.

Deliverables

  1. Requirements and Recommendations for Firewalls[16]
  2. Security Demonstrator Project[17]

 

4. Information Tools

The Information Tools work package[18] aimed to develop tools for researchers interested in publishing information on the WWW.

Results

When DESIRE started in 1996 there were few high quality commercial software tools for authoring Web sites and managing the publishing process. WebManager was developed as a Web-site management tool, specifically aimed at supporting collaboratively authored Web sites. The situation changed, however, during the lifetime of the project and the need for research in this area declined.

Deliverables

  1. Overview of HTML Authoring Tools (including WebManager)[19]
  2. Specification for Information-Provider Tools[20]
  3. Verified Information-Provider Toolset[21]

5. Quality of Service

The Quality of Service work package[22] aimed to assist network managers and information providers by developing methods for the remote management of Web servers.

Results

A Java based system was produced that diagnosed and anticipated faults across distributed servers. The system enables the performance of the server to be charted. This work improved the quality of service a network manager could provide.

Deliverables

  1. Requirements Survey for Quality Metrics[23]
  2. Functional Specification for QoS Tools[24]
  3. Validated Toolset[25]
  4. The Tools:

6. Training

It was acknowledged that the success of some of the DESIRE work would depend on the provision of training and awareness to ensure the results actually reached the people they were designed to help. The training work package[26] aimed to work with the other work packages to disseminate information to relevant audiences.

Results

Given the wide variety of training and dissemination that would be needed under DESIRE, the training team developed a set of generic tools for the production of training materials that could be adopted by all the DESIRE partners. These included templates and guidelines for face-to-face training and also a software package called TONIC-NG[27] for the delivery of online training via an interactive tutorial. During the course of DESIRE, 377 users and subject specialists around Europe received training via face-to-face workshops. Many more have benefited from remote training via the Web, using tutorials based on TONIC-NG ("TONIC, The Online Netskills Interactive Course[28]" and "Internet Detective[29]").

Deliverables

  1. Generic Training Materials for Desire[30]
  2. Subject-Based Training Materials[31]
  3. Verified Network Training Materials[32]

 

7. Home Access

The Home Access work package[33] aimed to address the needs of researchers who wish to use the Web from their homes.

Results

A prototype system was developed to improve information access from people’s own homes.

Deliverables

  1. Functional Specification[34]
  2. Specification of Charging/Identification[35]
  3. Prototype of Home Access System[36]
  4. Validated tools to enable the construction of similar systems[37]
  5. Exploitation Plan[38]

8. Evaluation

The Evaluation work package[39] aimed to provide internal evaluation of the other DESIRE work packages, to measure the success of the deliverables in light of their objectives.

Results

A number of user groups have been involved in the evaluation process. For example, the Cataloguing and Indexing strand had one user group of end-users of subject gateways and another of the information professionals involved in setting up the subject gateways. An evaluation report was written at the end of the project describing the impact of the DESIRE project on end-users.

Deliverables

  1. Evaluation of Desire Impact on Users[40]

 

A Focus on the DESIRE Resource Discovery Work Package

 

The rest of this paper will focus on the outputs of the Resource Discovery work package of DESIRE. As mentioned earlier, this work package aimed to develop tools and methods that would enable European researchers to locate relevant high quality information resources on the Web more efficiently and effectively.

The partners involved in this work package were:

FIG 2: The DESIRE Cataloguing and Indexing Page (1996-1998)

Web Cataloguing and Web Indexing

Internet users will be familiar with public search services such as Yahoo![41] and AltaVista[42]. These services demonstrate two very different approaches to locating information on the Web: Web cataloguing and Web indexing.

AltaVista is a search engine that aims to exhaustively index the Web, by trawling every accessible online document. It relies on the work of robots, which automatically locate Web resources and automatically generate descriptions of them (using the first x number of words in the document). The database of these descriptions can then be searched by the end user.
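
The indexing approach can be sketched in a few lines. This is an illustration only, not AltaVista's implementation: the function names, the sample documents and the choice to index the full text (rather than only the generated descriptions) are assumptions.

```python
import re
from collections import defaultdict


def describe(text, max_words=10):
    """Use the first `max_words` words of a document as its description."""
    return " ".join(text.split()[:max_words])


def build_index(documents, max_words=10):
    """Return (url -> description, word -> set of URLs containing that word)."""
    descriptions = {}
    inverted = defaultdict(set)
    for url, text in documents.items():
        descriptions[url] = describe(text, max_words)
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            inverted[word].add(url)
    return descriptions, inverted


def search(inverted, query):
    """Return the URLs of documents that contain every word of the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    results = set(inverted.get(words[0], set()))
    for word in words[1:]:
        results &= inverted.get(word, set())
    return results


# Hypothetical documents standing in for pages located by a robot.
documents = {
    "http://example.org/metadata":
        "Metadata formats for describing networked resources in research libraries",
    "http://example.org/caching":
        "Caching reduces network load and speeds up access",
}
descriptions, inverted = build_index(documents, max_words=4)
```

No human judgement is involved at any point: every document the robot reaches is indexed, and its description is simply whatever text happens to come first.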

Yahoo! is a Web directory (or catalogue) that offers access to a selective collection of Web resources, chosen for their high quality. It relies on the work of humans, who hand-pick Web resources according to a set of selection criteria and then hand-write descriptions for them. All the resources are classified under subject headings and all the descriptions are entered into a database, so that users can choose either to search the database or to browse the resources under subject headings.

Without Web indexes and catalogues it would be virtually impossible to find information on the Web. However, Yahoo! and AltaVista are very much directed at serving the information needs of the general public. DESIRE has been developing services which are comparable, but which are aimed specifically at meeting the information needs of researchers.

The aim of this work package was to create tools and methods that could enable:

  1. The development of subject gateways/Web catalogues across Europe
  2. The development of Web indexes across Europe

A collection of software tools, reports and guidelines has been developed and is described below. They are all freely available and those interested in creating a strategy for Internet cataloguing and indexing are encouraged to use them.

 

DESIRE Work on Web Cataloguing / Subject Gateways

Many academic libraries and institutions are currently looking for ways to help their users discover high quality information on the Internet in a quick and effective way. DESIRE suggests that the development of subject based information gateways can provide a solution.

In the traditional information environment human intermediaries, such as publishers and librarians, filter and process information so that users can search catalogues and indexes of organised knowledge as opposed to raw data and disparate information. Subject gateways work on the same principle -- they employ subject experts and information professionals to select, classify and catalogue Internet resources to aid search and retrieval for the users. Users are offered access to a database of Internet resource descriptions, which they can search by keyword or browse by subject area. They can do this in the knowledge that they are looking at a quality-controlled collection of resources. A description of each resource is provided to help users assess very quickly its origin, content and nature, enabling them to decide if it is worth investigating further.
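
The user's view of such a gateway can be sketched as a small catalogue that supports both keyword searching and browsing by subject. The record fields, class names and sample entries below are hypothetical; they do not reproduce the ROADS templates or any DESIRE gateway's data.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceDescription:
    """A hand-written catalogue record for one Internet resource."""
    title: str
    url: str
    description: str
    subject: str  # heading taken from the gateway's classification scheme
    keywords: list = field(default_factory=list)


class GatewayCatalogue:
    def __init__(self):
        self.records = []

    def add(self, record):
        self.records.append(record)

    def search(self, keyword):
        """Return records whose title, description or keywords match."""
        needle = keyword.lower()
        return [r for r in self.records
                if needle in r.title.lower()
                or needle in r.description.lower()
                or needle in (k.lower() for k in r.keywords)]

    def browse(self):
        """Return resource titles grouped under their subject headings."""
        headings = {}
        for r in self.records:
            headings.setdefault(r.subject, []).append(r.title)
        return {h: sorted(titles) for h, titles in sorted(headings.items())}


# Hypothetical records, as a cataloguer might enter them.
catalogue = GatewayCatalogue()
catalogue.add(ResourceDescription(
    "Example Economics Data Archive", "http://example.org/econ",
    "Downloadable economic statistics.", "Economics", ["statistics", "data"]))
catalogue.add(ResourceDescription(
    "Example Sociology Journal", "http://example.org/soc",
    "Peer-reviewed articles on social research.", "Sociology", ["journal"]))
catalogue.add(ResourceDescription(
    "Example Business Teaching Pack", "http://example.org/biz",
    "Teaching material for business studies.", "Economics", ["teaching"]))
```

Every record here exists because a person chose to add it, which is what distinguishes the collection from an automated index.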

DESIRE has created methods and tools for libraries and institutions interested in setting up subject based information gateways. With the software and procedures recommended by DESIRE, there is the potential to create an international network of subject gateways which are all compatible and interoperable, providing a wider range of resources for users and avoiding the duplication of cataloguing effort which is the consequence of keeping smaller independent collections. Users would benefit from the expertise of librarians and subject specialists across the continent and be directed to high quality Internet resources rather than have to locate, evaluate, filter and organise the resources themselves.

Libraries and institutions interested in setting up subject gateways are invited to make free use of the following software, tools and methods.

Resource Description

Human-generated descriptions of Web resources are an integral part of a subject gateway. There is a recognised need for a standard format for resource descriptions to facilitate information search and retrieval on the Web. DESIRE staff have played an important role in the development of international metadata standards such as Dublin Core[43], ROADS templates[44] and RDF[45]. The project outputs in this area are listed below:

1) ROADS-based gateways[46]

DESIRE is an integrative project, building where possible on emerging and existing standards and technologies rather than developing new ones. ROADS is one of these and offers a resource discovery solution for the information gateways which form part of the DESIRE vision. There are now many ROADS-based information gateways, either full services or demonstrators, which might serve as a useful resource for anyone interested in developing or exploiting the models which DESIRE offers.

ROADS offers a selection of software tools and standards to support the infrastructural requirements of subject gateways. With its emphasis on configurability and interoperability, it allows each gateway to develop its own unique identity and content.

ROADS uses the WHOIS++ search and retrieve protocol, which enables ROADS users to make complex, transparent searches across multiple gateways as if they had a common index. ROADS-based subject gateways can work together to form a comprehensive distributed resource discovery system.

A list of services currently running either a partial or full ROADS-based subject information gateway is available from the ROADS Web site, and those interested in finding out more about the software are invited to email roads-liaison@bristol.ac.uk.

2) ROADS cross-searching demonstrator[47]

One of the objectives of the DESIRE project was to investigate the potential for cross-searching national gateways. As part of this work a cross-searching demonstrator was set up where a ROADS database in the Netherlands could be cross-searched with ROADS gateways in the UK (SOSIG and Biz/ed[48]).

Traditionally, cross-searching is carried out by querying each database in turn and combining the results for the user. The DESIRE demonstrator index server instead consults a locally held centroid (or index) of each of the three databases, which provides some "forward knowledge" about the information contained in the databases. The index server decides which of the databases contain information that matches the query, passes the query on to each database that has matching entries and collates the results for the end user.
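
The routing step can be sketched as follows. This is a simplified illustration, not the ROADS or WHOIS++ implementation: here a centroid is reduced to the set of all words occurring in a database, and the gateway names, records and class names are hypothetical.

```python
import re


def tokenize(text):
    return set(re.findall(r"[a-z0-9]+", text.lower()))


class Gateway:
    """A stand-in for one remote database (hypothetical, held in memory)."""

    def __init__(self, name, records):
        self.name = name
        self.records = records
        self.queries_received = 0

    def centroid(self):
        """Summarise the database as the set of all terms it contains."""
        terms = set()
        for record in self.records:
            terms |= tokenize(record)
        return terms

    def search(self, query):
        self.queries_received += 1
        wanted = tokenize(query)
        return [r for r in self.records if wanted <= tokenize(r)]


class IndexServer:
    def __init__(self, gateways):
        self.gateways = gateways
        # Locally held centroids give "forward knowledge" of each database.
        self.centroids = {g.name: g.centroid() for g in gateways}

    def cross_search(self, query):
        wanted = tokenize(query)
        results = []
        for g in self.gateways:
            # Forward the query only if the centroid shows a possible match.
            if wanted <= self.centroids[g.name]:
                results.extend((g.name, hit) for hit in g.search(query))
        return results


gateways = [
    Gateway("gateway_a", ["Economics statistics archive", "Social policy bulletin"]),
    Gateway("gateway_b", ["Business economics tutorial"]),
    Gateway("gateway_c", ["Dutch history collection"]),
]
server = IndexServer(gateways)
hits = server.cross_search("economics")
```

The third database is never contacted for this query, because its centroid shows in advance that it cannot contain a match. This is the saving over querying every database in turn.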

(Please note: The demonstrator is intended as a "proof of concept" and is not an operational service).

3) A Review of Metadata: a Survey of Current Resource Description Formats[49]

This is a peer-reviewed report on metadata formats. Metadata is data that describes the attributes of a resource. A more formal definition is: metadata is data associated with objects which relieves their potential users of having to have full advance knowledge of their existence or characteristics.

The review will be of use to anyone with an interest in metadata issues and electronic cataloguing. The development of metadata standards is still in a state of flux, and this report reviews the metadata formats that were current during the period of May 1996 to March 1997 (readers should follow references to the individual formats to determine the most recent developments).

The review provides background information to the DESIRE project enabling the implications of using particular metadata formats to be assessed. Part I is a brief introductory review of issues including consideration of the environment of use and the characteristics of metadata formats. A broad typology of metadata is introduced to provide a framework for analysis. Part II consists of an outline of resource description formats in directory style. This includes generic formats, but also, to give an indication of the range of development, domain-specific formats. The intention is not to be comprehensive, but to give sufficient examples to support understanding of a rapidly developing environment. The focus is on metadata for 'information resources' broadly understood rather than on the variety of other approaches which exist within particular scientific, engineering and other areas.

Stu Weibel of OCLC[50] has called this report "the single most comprehensive survey of metadata standards and issues that I am aware of."

4) Metadata Software Tools[51]

A suite of software tools for handling metadata in various formats was developed. They will be of interest to people involved in the creation of electronic resources, those involved with the creation of third party metadata (information gateways and catalogues) and software developers. They are freely available on the site for downloading.

The software tools have been developed by the UKOLN metadata group[52]. They provide a variety of functions, including the creation, maintenance and conversion of metadata formats. The tools include:

 

Quality

Subject gateways are characterised by their quality control. They only point to Internet resources that have been selected for their relevance to the target user group, in this case, researchers. DESIRE has developed tools and guidelines for those interested in creating selection criteria for subject gateways. These outputs are listed below:

1. Internet Detective[53]

Internet Detective is an interactive, Web-based tutorial which provides an introduction to the issues of information quality on the Internet and teaches the skills required to evaluate critically the quality of an Internet resource. It provides a step by step introduction to the main issues relating to information quality on the Internet but adopts an informal approach, incorporating quizzes, graphics and a "Try it yourself" section. Internet Detective can be freely used by anyone, but is targeted at the research and higher education communities and in particular at librarians and information professionals who work to raise awareness of the problem of poor quality information on the Internet.

Feedback on the Internet Detective suggests that lecturers and librarians are not only working through the tutorial themselves, but are incorporating it into their teaching and training programmes.

Internet Detective allows for self-paced working. It is estimated that the entire tutorial constitutes up to 3 or 4 hours of work; however, users can choose to work over more than one session. The tutorial will remember where individuals left off and will keep a running total of their scores on the quizzes.

Internet Detective is based on tutorial software called TONIC-NG, which was developed under the DESIRE Training work package (referenced earlier).

FIG 3: A page from the Internet Detective tutorial

2. Selection Criteria for Quality Controlled Information Gateways[54]

This report will be of interest to people wishing to see the detailed research and methodology that lay behind the development of the other DESIRE quality tools. It is a lengthy, peer-reviewed report that includes a "state of the art" review of existing quality selection criteria, including:

The review informed the development of two tools, described in the report, which can be used by gateways to improve their quality systems:

  1. A diagrammatic model of an information gateway that highlights the areas that require quality control mechanisms
  2. A list of selection criteria that can be drawn upon in creating an explicit set of criteria for a specific gateway

DESIRE suggests that "a high quality Internet resource is one that satisfies the user". Librarians, information professionals and others building collections of Internet resources will need to develop their own set of selection criteria based on the specific needs of their target users and the aims of the service.

The guidelines are not prescriptive and are designed to help an institution or service develop their own tailor-made Scope Policy, Collection Management Policy and Selection Criteria. An example of a service that has used the list to develop criteria is given in the report.

3. Selection Criteria: Examples[55]

A number of gateways and services have used DESIRE research to inform the development of their selection criteria. These examples may be used as a model by librarians, academics or information professionals interested in developing their own selection criteria, whether for a gateway or library OPAC.

Classification

Subject gateways enable users to browse Internet resources under subject headings. This offers an important complement to the search option, allowing an alternative method of accessing the data. The browsing structure depends on the use of a classification scheme and DESIRE investigated the issues surrounding the choice of a classification scheme for a subject gateway. The two research outputs in this area are described below:

1) The Role of Classification Schemes in Internet Resource Description and Discovery[56]

The report provides a summary of the major classification schemes as well as a review of their use in Internet services. It will be of interest to people wishing to use classification schemes in their services and to those interested in seeing the research and methodology that lay behind the development of the DESIRE tools.

This report includes a "state of the art" review of the use of classification schemes in Internet services. Classification schemes have a role in aiding information retrieval in a network environment, especially for providing browsing structures for subject-based information gateways on the Internet. Advantages of using classification schemes include improved subject browsing facilities, potential multi-lingual access and improved interoperability with other services. Classification schemes vary in scope and methodology, but can be divided into universal, national, general, subject specific and home-grown schemes. Which type of scheme is used, however, will depend upon the size and scope of the service being designed.

In this report a study is made of classification schemes currently used in Internet search and discovery services, with particular reference to the following schemes: Dewey Decimal Classification (DDC); Universal Decimal Classification (UDC); Library of Congress Classification (LCC); Nederlandse Basisclassificatie (BC); Sveriges Allmänna Biblioteksförening (SAB); Iconclass; National Library of Medicine (NLM); Engineering Information (Ei); Mathematics Subject Classification (MSC) and the ACM Computing Classification System (CCS). Projects which attempt to apply classification in automated services are also described, including the Nordic WAIS/WWW Project, Project GERHARD and Project Scorpion.

2) Mapping Classification Schemes[57]

This is a short case study report describing the use of classification schemes within two ROADS information gateway services: SOSIG and Biz/ed. Classification schemes are used by ROADS information gateways to provide a subject hierarchy for the browsable sections of the service. This report looks at the decision to use different classification schemes within the two services and the implications for cross-searching and browsing.

The report will be of interest to users and potential users of the ROADS system who are interested in the possibilities of cross browsing with other gateways.

 

Multilingual Issues for Subject Gateways

European researchers work in many different European languages, and subject gateways that aim to serve this audience need to address their language needs. DESIRE produced two reports that describe the main issues in this area:

1) Internationalization in the DESIRE Project[58]

This is a working paper that describes some of the technical issues involved in creating international subject gateways on the World Wide Web (WWW). It is likely to be of interest to any gateway or Internet service thinking of developing a multilingual interface or thinking of cataloguing or indexing multilingual resources.

An international gateway is one that can handle resources written in different languages and that offers a multilingual interface to those resources. A number of internationalization issues faced by cataloguing and indexing software are discussed, including:

The paper focuses on the internationalization issues for ROADS gateways and has sections on:

2) Developing Multilingual Subject Gateways[59]

This is a working paper that describes the potential for developing multilingual subject gateways. It describes a review of the current practices of subject gateways, search engines and libraries in multi-lingual service provision. Three main issues for multilingual developments are identified:

This paper focuses on the strategic issues only (a sister paper, cited above, covers the technical and interface issues). DESIRE has set up demonstrations of both the centralised and distributed gateway models so that others can see how they work, and can consider their potential for developing a multilingual gateway.

The centralised model has been demonstrated by The Social Science Information Gateway (SOSIG) where staff have set up a system whereby librarians or academics across Europe can contribute resources to the central SOSIG service. Selection criteria and cataloguing rules for multilingual resources have been developed and these are described in the report.

The distributed model has been demonstrated by the Koninklijke Bibliotheek (Netherlands) and SOSIG (UK), which have each developed their own ROADS gateway, based in their own country, and have then enabled the two distributed databases to be cross-searched simultaneously.

Distributed and Part-Automated Cataloguing[60]

Subject gateways are labour intensive to develop and maintain. They require the constant input of staff to hand-pick, classify and catalogue each Internet resource. This is both the strength and the weakness of gateways. The human input allows for semantic judgements and decisions that are the key ingredients for creating a quality controlled gateway. This ingredient is lacking in automated indexes or search engines, which cannot filter information in such a meaningful way. However, considerable time and effort is needed to make these judgements and decisions and this means that the collection of resources is often small and slow to grow.

As the number of resources available over the Internet increases, gateways need to develop ways of increasing the number of resources they can catalogue. DESIRE has identified two ways in which this might be done and these are described in the following report:

1) Distributed and Part-Automated Cataloguing

This is a working paper that describes two ways in which a subject gateway might speed up the cataloguing process:

1. By setting up distributed cataloguing - where librarians can add resources from afar
2. By part automating the creation of cataloguing records by automated metadata entry

This paper is likely to interest both gateway staff and library staff interested in creating strategies for Internet resource discovery.

There is considerable potential for distributed collaborative cataloguing of networked resources. Information gateways can be built by teams of staff who are geographically dispersed but who can add resources to a database from their desktops via the WWW. DESIRE describes a number of different strategies for distributed cataloguing, many of which involve the library profession, and looks at the issues involved.

The potential for part-automated creation of cataloguing records is growing as more Internet information providers begin to attach metadata to their resources. DESIRE has experimented with automatic metadata entry to create part of a catalogue record. This work is described in the report.

DESIRE concludes that the distributed database model holds the most promise for developing multilingual gateways. The potential is there for Europe to create an international network of subject gateways that could all be simultaneously cross-searched. The benefits would be great, as the subject and language skills of librarians and academics across Europe could be used to select and catalogue resources, creating a virtual collection of high quality Internet resources written in all European languages.
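
Part-automated creation of a catalogue record, as discussed in this section, might look like the following minimal sketch. It is not the DESIRE software: the record fields, the draft_record function and the sample page are assumptions, and the sketch reads only HTML meta tags, using Dublin Core names of the form DC.title where the provider has supplied them.

```python
from html.parser import HTMLParser


class MetaTagExtractor(HTMLParser):
    """Collect <meta name=... content=...> pairs and the page <title>."""

    def __init__(self):
        super().__init__()
        self.meta = {}
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and attrs.get("name") and attrs.get("content"):
            self.meta[attrs["name"].lower()] = attrs["content"]
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def draft_record(url, html):
    """Pre-fill a catalogue record from embedded metadata (hypothetical fields)."""
    parser = MetaTagExtractor()
    parser.feed(html)
    meta = parser.meta
    record = {
        "url": url,
        "title": meta.get("dc.title") or parser.title.strip(),
        "creator": meta.get("dc.creator") or meta.get("author", ""),
        "description": meta.get("dc.description") or meta.get("description", ""),
        "keywords": meta.get("keywords", ""),
    }
    # Fields left empty are flagged so that a cataloguer completes them by hand.
    record["needs_review"] = [k for k, v in record.items() if not v]
    return record


page = """<html><head><title> Example Survey </title>
<meta name="DC.creator" content="A. Author">
<meta name="description" content="A survey page.">
</head><body><p>Body text.</p></body></html>"""
record = draft_record("http://example.org/survey", page)
```

The cataloguer remains responsible for selection, classification and any field the provider did not supply; only the routine part of the record is filled in automatically.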

 

Users of the DESIRE Tools and Methods

A number of European countries are already creating subject gateways that use the DESIRE model and the ROADS software to enable cross-searching. These countries include the UK, Finland, Sweden and Denmark. Countries that have expressed an interest include Norway and Iceland. Libraries and academic institutions from all European countries are invited to consider adopting the DESIRE strategy and to make free use of the software tools, guidelines and materials produced within the project.

 

DESIRE Work on Web Indexing

A European Web Index (EWI) is being built that will provide a search interface to all Internet documents published in Europe which have research relevance, harvesting not only online documents from the Web, but also those available via other Internet protocols. Significant information from each document will be digested into a database that can then be searched using the standard Z39.50 protocol, supporting a variety of user interfaces and library systems.

The EWI will take advantage of metadata which may be embedded in current documents, or which may in the future be available by other means, significantly enhancing the standard of the bibliographic descriptions available in the index. Close collaboration with the subject gateways work in DESIRE will ensure a shared approach to the implementation of new technologies and, in the longer term, closer integration of the EWI and the subject gateways.

The Combine Harvester[61]

Combine Harvester is a software package for gathering Web documents, parsing them and collecting them in a database. It is intended for anyone setting up a Web-index, which, for example, might cover Web resources from a particular country, university or subject area. Users simply need to specify which servers and which URLs should be harvested. There is also potential for using Combine as an automatic classification tool.

Combine has the following features:

Combine consists of a number of relatively small parts communicating with specified protocols, allowing the user to combine these parts in a way that makes the system perform the required tasks. Combine has also been designed to make it easy to extend with, for example, a parser for a new kind of file format, or a new database format.

The different parts of Combine can be run in multiple instances on one or more computers, taking full advantage of multiple processors or networks of computers that may be available for the harvesting process.

Combine can handle metadata embedded in the Web documents, as well as fetching metadata from special metadata registries or databases.
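The extensibility described above can be pictured with a small sketch in Python. This is not Combine's code or interface: the registry, the function names and the record layout are all hypothetical, chosen only to show how a system built from small parts can accept a parser for a new file format without changes elsewhere.

```python
PARSERS = {}  # maps a content type to the function that parses it


def register_parser(content_type):
    """Return a decorator that registers a parser for one content type."""
    def decorator(func):
        PARSERS[content_type] = func
        return func
    return decorator


@register_parser("text/plain")
def parse_plain(raw):
    # Plain text: keep the first line as a title and the rest as body.
    first_line, _, rest = raw.partition("\n")
    return {"title": first_line.strip(), "body": rest.strip()}


def parse_document(content_type, raw):
    """Dispatch to the registered parser; unknown formats return None."""
    parser = PARSERS.get(content_type)
    return parser(raw) if parser else None


# Supporting a new format means adding one function; nothing else changes.
@register_parser("text/tab-separated-values")
def parse_tsv(raw):
    rows = [line.split("\t") for line in raw.splitlines() if line]
    return {"title": "", "body": rows}
```

Because each parser is independent of the others, several copies of such a component could in principle run side by side on different machines, which is the property the feature list above emphasises.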

Combine Harvester : Examples of Use

The Combine Harvester is currently being used in Sweden, Denmark and Iceland, which all have nodes in the Nordic Web Index[62], a distributed regional Web index containing all Web pages in the Nordic countries. The Swedish database, for example, currently holds some 2.5 million records.

Combine is also being used by subject gateways such as EELS[63] in Sweden and SOSIG[64] in the UK to create companion databases for the quality controlled catalogues. These databases are generated by feeding Combine a small set of starting points of quality controlled collections and then letting it follow all the links to a specified depth, thus generating a database with several thousand pages, most of which will be related to the subject covered by the gateway.
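The approach of following links from a small set of starting points to a specified depth can be sketched as a breadth-first traversal. The Python below is an illustration only and does not reproduce Combine's own code: the function name harvest, the get_links callback and the toy link graph are assumptions, and the toy graph stands in for real pages so that the example runs without network access.

```python
from collections import deque


def harvest(start_urls, get_links, max_depth):
    """Return every URL reachable from start_urls within max_depth link hops."""
    seen = set(start_urls)
    queue = deque((url, 0) for url in start_urls)
    while queue:
        url, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not follow links beyond the depth limit
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return seen


# Toy link graph standing in for the Web (hypothetical data).
TOY_WEB = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": [],
    "d": ["e"],
}

pages = harvest(["a"], lambda url: TOY_WEB.get(url, []), max_depth=2)
```

With a depth of 2, page "e" is not collected because it lies three links away from the starting point. Choosing the depth is therefore a trade-off between the size of the companion database and how closely its pages relate to the gateway's subject.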

 

Integration of Web Indexes and Web Catalogues

DESIRE is interested in combining the two approaches to resource discovery – Web indexing and Web cataloguing. A report was written towards the end of the first phase of DESIRE that described the issues involved in doing this. This work will be carried forward in the next phase of the project.

Integration of Indexing and Subject Gateways[65]

This DESIRE report addresses the problem of providing integrated access to robot-gathered and human-catalogued searchable resources on the Internet. For example, a subject-based Internet catalogue that describes Business and Economics resources might be usefully cross-searched against much larger (but lower quality) resources containing several thousand pages from Web sites in that subject area. There are several ways in which these complementary services might be presented in an integrated fashion to users. The report presents an overview of the major issues, describes a variety of practical approaches suitable for immediate deployment and highlights areas for future software-development, standards and user-interface work. The report focuses on the development of the European Web Index in this context.

Several scenarios are explored, with particular focus on cross-searching infrastructure, user interface issues and varying approaches to multi-protocol distributed searching. Examples are presented which draw upon existing robot and subject gateway resources, searchable using the Z39.50 protocol.
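One simple way of presenting the two kinds of resource together can be sketched as follows. This Python example is an assumption made for illustration, not one of the approaches taken from the report: it uses in-memory lists and plain substring matching in place of real Z39.50 searches, and the record fields url and text are hypothetical. It shows human-catalogued hits being listed ahead of robot-gathered ones, with duplicate URLs removed.

```python
def cross_search(query, catalogue, index):
    """Search both collections, catalogue hits first, without duplicate URLs."""
    term = query.lower()
    results = []
    seen = set()
    # The catalogue is searched first so that its records take precedence.
    for source, records in (("catalogue", catalogue), ("index", index)):
        for record in records:
            if term in record["text"].lower() and record["url"] not in seen:
                seen.add(record["url"])
                results.append({"url": record["url"], "source": source})
    return results
```

Labelling each hit with its source lets a user interface distinguish quality controlled records from the larger robot-generated set, which is one of the user interface issues such integration raises.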

Future of DESIRE

The first phase of DESIRE has successfully developed tools and methods for building large-scale information systems to support the European research community in their use of the World Wide Web.

Many countries and services have already benefited from these: subject gateways, Web indexes and caches have been built in many European countries using the DESIRE tools. Notable successes include:

There is great potential for these tools to be adopted by a wider community. Many of the technological solutions are already there - it is now time to develop the human networks to ensure that the DESIRE tools are widely adopted so they can benefit the end users they were designed to support.

The second phase of DESIRE runs from 1998 to 2000. It is a smaller and more focused project, which will concentrate on the following areas of Web research and development:

  1. Distributed Web Indexing
  2. Subject-based Web cataloguing
  3. Caching
  4. Directory Services

As well as taking these areas forward in terms of technology, this phase of DESIRE has a programme of dissemination and demonstration to encourage the use, development and extension of these services by the user community. There will be "how to" guides and training workshops for those wishing to build their own services exploiting the DESIRE tools.

A list of the deliverables for this second phase can be found on the DESIRE Web site, where full versions will be placed as they become available.

All the DESIRE outputs are made freely available and we encourage people to view them, use them and get in touch if they have any ideas for collaborative work.

 

 

 

References



[1] DESIRE home page. Available from:
URL: http://www.desire.org/.

[2] The Telematics Applications Programme. Available from:
URL: http://www2.echo.lu/telematics/home.html.

[3] DESIRE Cataloguing and Indexing (1996-1998). Available from:
URL: http://www.desire.org/results/discovery/.

[4] TERENA home page. Available from:
URL: http://www.terena.nl/.

[5] DESIRE Indexing and Cataloguing work package (1996-1998). Available from:
URL: http://www.surfnet.nl/surfnet/projects/desire/indexing.html.

[6] SOSIG (The Social Science Information Gateway). Available from:
URL: http://www.sosig.ac.uk/.

[7] DutchESS (The Dutch Electronic Subject Service). Available from:
URL: http://www.konbib.nl/dutchess/.

[8] EELS (Engineering Electronic Library, Sweden). Available from:
URL: http://www.ub2.lu.se/eel/.

[9] Combine Harvester. Available from:
URL: http://www.lub.lu.se/combine/.

[10] Ardö, Anders and Lundberg, Sigfrid. A regional distributed WWW search and indexing service - the DESIRE way. Available from:
URL: http://www.elsevier.nl:80/cas/tree/store/comnet/free/www7/1900/com1900.htm.

[11] DESIRE Caching work package (1996-1998). Available from:
URL: http://www.surfnet.nl/surfnet/projects/desire/caching.html.

[12] Survey of caching requirements and specifications for prototype. Available from:
URL: http://www.surfnet.nl/surfnet/projects/desire/deliver/WP4/D4-1.html.

[13] Report on the costs and benefits of operating caching services. Available from:
URL: http://www.surfnet.nl/surfnet/projects/desire/deliver/WP4/D4-2.html.

[14] Practical Experiences of establishing caching meshes. Available from:
URL: http://www.surfnet.nl/surfnet/projects/desire/deliver/WP4/D4-3.html.

[15] DESIRE Security work package (1996-1998). Available from:
URL: http://www.surfnet.nl/surfnet/projects/desire/security.html.

[16] Requirements and Recommendations for Firewalls. Available from:
URL: http://www.surfnet.nl/surfnet/projects/desire/deliver/WP5/D5-1.html.

[17] Security Demonstrator Project. Available from:
URL: http://www.surfnet.nl/surfnet/projects/desire/deliver/WP5/D5-2.html.

[18] DESIRE Information Tools work package (1996-1998). Available from:
URL: http://www.surfnet.nl/surfnet/projects/desire/infotools.html.

[19] Overview of HTML Authoring Tools (including WebManager). Available from:
URL: http://www.surfnet.nl/surfnet/projects/desire/deliver/WP6/D6-1.html.

[20] Specification for Information-Provider Tools. Available from:
URL: http://www.surfnet.nl/surfnet/projects/desire/deliver/WP6/D6-2.html.

[21] Verified Information-Provider Toolset. Available from:
URL: http://www.surfnet.nl/surfnet/projects/desire/deliver/WP6/D6.3.html.

[22] DESIRE Quality of Service work package (1996-1998). Available from:
URL: http://www.surfnet.nl/surfnet/projects/desire/quality.html.

[23] Requirements Survey for Quality Metrics. Available from:
URL: http://www.surfnet.nl/surfnet/projects/desire/deliver/WP7/D7-1.html.

[24] Functional Specification for QoS Tools. Available from:
URL: http://www.surfnet.nl/surfnet/projects/desire/deliver/WP7/D7-2.html.

[25] Validated Toolset. Available from:
URL: http://www.surfnet.nl/surfnet/projects/desire/deliver/WP7/D7-3.html.

[26] DESIRE Training work package (1996-1998). Available from:
URL: http://www.surfnet.nl/surfnet/projects/desire/training.html.

[27] Netskills (home of TONIC-NG). Available from:
URL: http://www.netskills.ac.uk/.

[28] TONIC (The Online Netskills Interactive Course). Available from:
URL: http://www.netskills.ac.uk/TonicNG/cgi/sesame?tng.

[29] Internet Detective. Available from:
URL: http://www.sosig/desire/internet-detective.html.

[30] Generic Training Materials for Desire. Available from:
URL: http://www.surfnet.nl/surfnet/projects/desire/deliver/WP8/D8-1.html.

[31] Subject-Based Training Materials. Available from:
URL: http://www.surfnet.nl/surfnet/projects/desire/deliver/WP8/D8-2.html.

[32] Verified Network Training Materials. Available from:
URL: http://www.surfnet.nl/surfnet/projects/desire/deliver/WP8/D8-3/D8-3.html.

[33] DESIRE Home Access work package (1996-1998). Available from:
URL: http://www.surfnet.nl/surfnet/projects/desire/homeoffice.html.

[34] Functional Specification. Available from:
URL: http://www.surfnet.nl/surfnet/projects/desire/deliver/WP10/D10-1.html.

[35] Specification of Charging/Identification. Available from:
URL: http://www.surfnet.nl/surfnet/projects/desire/deliver/WP10/D10-2.html.

[36] Prototype of Home Access System. Available from:
URL: http://www.surfnet.nl/surfnet/projects/desire/deliver/WP10/D10-3.html.

[37] Validated tools to enable the construction of similar systems. Available from:
URL: http://www.surfnet.nl/surfnet/projects/desire/deliver/WP10/D10-4.html.

[38] Exploitation Plan. Available from:
URL: http://www.surfnet.nl/surfnet/projects/desire/deliver/WP10/D10-5.html.

[39] DESIRE Evaluation work package (1996-1998). Available from:
URL: http://www.surfnet.nl/surfnet/projects/desire/evaluate.html.

[40] Evaluation of Desire impact on users. Available from:
URL: http://www.surfnet.nl/surfnet/projects/desire/deliver/WP9/D9-3.html.

[41] Yahoo!. Available from:
URL: http://www.yahoo.com/.

[42] AltaVista. Available from:
URL: http://www.altavista.com/.

[43] Dublin Core Metadata. Available from:
URL: http://purl.org/dc/.

[44] The ROADS Template Registry. Available from:
URL: http://www.ukoln.ac.uk/metadata/roads/templates/.

[45] Resource Description Framework (RDF) Model and Syntax Specification. W3C Working Draft 08 October 1998. Available from:
URL: http://www.w3.org/TR/WD-rdf-syntax/.

[46] ROADS. Available from:
URL: http://www.ilrt.bris.ac.uk/roads/.

[47] ROADS cross-searching demonstrator. Available from:
URL: http://www.desire.org/html/research/demonstrations/.

[48] Biz/ed. Available from:
URL: http://www.bized.ac.uk/.

[49] A Review of Metadata: a Survey of Current Resource Description Formats. Available from:
URL: http://www.ukoln.ac.uk/metadata/desire/overview.

[50] OCLC home page. Available from:
URL: http://www.oclc.org/.

[51] Metadata Software Tools. Available from:
URL: http://www.ukoln.ac.uk/metadata/software-tools/.

[52] UKOLN Metadata Group home page. Available from:
URL: http://www.ukoln.ac.uk/metadata/.

[53] Internet Detective. Available from:
URL: http://www.sosig.ac.uk/desire/internet-detective.html.

[54] Selection Criteria for Quality Controlled Information Gateways. Available from:
URL: http://www.ukoln.ac.uk/metadata/desire/quality/.

[55] Examples of Selection Criteria. Available from:
URL: http://www.desire.org/results/discovery/cat/selectex_des.htm.

[56] The Role of Classification Schemes in Internet Resource Description and Discovery. Available from:
URL: http://www.ukoln.ac.uk/metadata/desire/classification/.

[57] Mapping Classification Schemes. Available from:
URL: http://www.sosig.ac.uk/desire/class/mapping.html.

[58] Internationalization in the DESIRE Project. Available from:
URL: http://www.roads.lut.ac.uk/DESIRE/DesireI18N.html.

[59] Developing Multilingual Subject Gateways. Available from:
URL: http://www.sosig.ac.uk/desire/lang/language.html.

[60] Distributed and Part-Automated Cataloguing. Available from:
URL: http://www.sosig.ac.uk/desire/cat/cataloguing.html.

[61] The Combine Harvester. Available from:
URL: http://www.lub.lu.se/combine/.

[62] Nordic Web Index. Available from:
URL: http://nwi.lub.lu.se/.

[63] "All" Engineering resources on the Internet (EELS). Available from:
URL: http://www.ub2.lu.se/eel/ae/.

[64] The SOSIG Link harvester. Available from:
URL: http://www.sosig.ac.uk/roads/cgi/search.pl?form=harvester.

[65] Integration of Indexing and Subject Gateways. Available from:
URL: http://www.sosig.ac.uk/desire/index/integration.html.