DESIRE Information Gateways Handbook
Information Gateways Handbook (Print Version)

Section 1 : Strategic Issues (Print Version)

Target audience
 

Section 1 of this handbook is aimed at the people responsible for strategic management - funders and project managers who will initiate the set up of a gateway and who will steer its direction over time.

It aims to give an overview of the key issues involved in gateway projects, giving a rationale for these projects. It covers the important decisions that need to be made when setting up a new gateway (for example, staff effort, skills and costs) but also deals with logistics for managing an existing gateway.

Each section offers some background, practical tips and hints, key references, a glossary, case studies and examples. Watch out for the Cross reference boxes, which point to related sections elsewhere in the handbook.

Contents
Section 1 : Strategic Issues
  1. Information gateways overview
  2. Preliminary planning
  3. Staff and skills required overview
  4. System requirements overview
  5. Maintenance requirements: cost implications

Section 2 : Information Issues

Section 3 : Technical Issues


-1.1. Information gateways overview

In this chapter...
 
  • what is an information gateway
  • the rationale for developing information gateways
  • examples of leading information gateways

Introduction
 

Information gateways are now a well established feature on the Internet. There are a number of different models for setting up and running gateways, and the technology behind them can vary considerably. But quality information gateways all share key similarities that make them invaluable resources to their respective user communities.


What is an information gateway?
 

Information gateways are quality controlled information services that have the following characteristics:

  1. an online service that provides links to numerous other sites or documents on the Internet
  2. selection of resources in an intellectual process according to published quality and scope criteria (this excludes e.g. selection according to automatically measured popularity)
  3. intellectually produced content descriptions, in the spectrum between short annotation and review (this excludes automatically extracted so-called summaries). A good but not necessary criterion is the existence of intellectually assigned keywords or controlled terms.
  4. intellectually constructed browsing structure/classification (this excludes completely unstructured lists of links)
  5. at least partly, manually generated (bibliographic) metadata for the individual resources

After T. Koch: http://www.ub2.lu.se/tk/SBIG-definition.txt


The rationale behind information gateways
 

Many academic libraries and institutions are currently looking for ways to help their users discover high quality information on the Internet in a quick and effective way. The DESIRE project and others (e.g. IMesh) suggest that the development of information gateways can provide a solution.

Researchers and academics do not always have the time, inclination or skills to surf the Internet for resources that could support their work. As Internet publishing and communication become more commonplace this could disadvantage some researchers as they will miss valuable information and communication resources.

In the traditional information environment human intermediaries, such as publishers and librarians, filter and process information so that users can search catalogues and indexes of organised knowledge as opposed to raw data and disparate information. Subject gateways work on the same principle - they employ subject experts and information professionals to select, classify and catalogue Internet resources to aid search and retrieval for their users. Users are offered access to a database of Internet resource descriptions which they can search by keyword or browse by subject area. They can do this in the knowledge that they are looking at a quality controlled collection of resources. A description of each resource is provided to help users assess its origin, content and nature, enabling them to decide if it is worth investigating further.


Examples of leading information gateways
 

The following information gateways are used elsewhere in the handbook as examples of good practice and/or as having interesting development information to contribute to the wider gateway community. A full listing of information gateways can be obtained from:

E X A M P L E

Leading information gateways

Biz/ed - Business and Economics Education on the Internet

Biz/ed is a unique business and economics service for students, teachers and lecturers. The gateway contains a ROADS based Internet catalogue with over 1400 Internet resources selected and described by subject experts.

DutchESS - Dutch Electronic Subject Service

DutchESS is an Internet subject service that indexes Internet resources selected for their quality and relevance to the academic community: students and academic researchers. The resources are classified according to the Nederlandse Basisclassificatie (Dutch Basic Classification).

EEVL - The Edinburgh Engineering Virtual Library

The EEVL service is a gateway for the higher education and research community to access high quality information resources in Engineering. The EEVL gateway offers broad or focused searching capabilities, and search results provide the choice of linking to full descriptive resource records or to the resources themselves. The catalogue has descriptions of and links to thousands of quality Internet resources.

The Finnish Virtual Library Project

The Finnish Virtual Library project, launched in 1995 and funded directly by the Finnish Ministry of Education, aims to form a foundation for a Finnish field-specific subject index of subject gateways. A collection of libraries has produced individual virtual libraries in 40 subject areas; these are now being converted into a gateway format and offered as bilingual services in Finnish and English. The Kuopio University Virtual Library has mounted its Virtual Library as a ROADS-based gateway, covering the subject areas of Clinical Nutrition, Neurosciences and Pharmacy.

NMM Port

Port is the UK National Maritime Museum's online catalogue of high quality maritime related Internet resources. Every resource has been selected and described by a librarian or subject specialist. Services and materials developed by the Museum's Centre for Maritime Research are also available on the site.

OMNI - Organising Medical Networked Information

OMNI, Organising Medical Networked Information, covers the areas of medicine, biomedicine, allied health, health management and related topics. The service also provides training materials and workshops. Browsing can be done via either alphabetical topics, classified topics, or via MeSH headings. In addition, OMNI provides a range of biomedical value-added services, including a MEDLINE review section, mirrors of key NHS IT strategy documents, and the UK CME database.

SOSIG - The Social Science Information Gateway

SOSIG can help you locate high quality sites on the Internet, which are relevant to social science education and research. The Internet Catalogue offers access to thousands of high quality Internet resources, each selected and described by academic librarians and subject specialists. The SOSIG service receives funding from the ESRC, JISC and the European Union.


Glossary
 

Desire - Development of a European Service for Information on Research and Education, EU funded research project
ESRC - Economic and Social Research Council. The ESRC is the UK's largest independent funding agency for research and postgraduate training into social and economic issues.
IMesh - International Collaboration on Internet Subject Gateways
JISC - Joint Information Systems Committee. UK Higher Education organisation, with the aim to stimulate and enable the cost effective exploitation of information systems and to provide a high quality national network infrastructure for the UK higher education and research councils communities
ROADS - Resource Organisation And Discovery in Subject-based Services


References
 

Biz/ed - Business and Economics Education on the Internet, http://www.bized.ac.uk/

Desire - Development of a European Service for Information on Research and Education, http://www.desire.org/

DutchESS - Dutch Electronic Subject Service, http://www.konbib.nl/dutchess/

EEVL - The Edinburgh Engineering Virtual Library, http://www.eevl.ac.uk/

The Finnish Virtual Library Project, http://www.uku.fi/kirjasto/virtuaalikirjasto/

IMesh, http://www.desire.org/html/subjectgateways/community/imesh/

NMM Port, http://www.port.nmm.ac.uk/

OMNI - Organising Medical Networked Information, http://www.omni.ac.uk/

PINAKES - A Subject Launchpad, http://www.hw.ac.uk/libWWW/irn/pinakes/pinakes.html

SOSIG - The Social Science Information Gateway, http://www.sosig.ac.uk/


Credits
 

Chapter author: Martin Belcher

Contributors: Phil Cross


-1.2. Preliminary planning

In this chapter...
 
  • setting a gateway's objectives
  • example gateway objectives
  • scheduling achievable timescales
  • phasing of the project
Introduction
 

Information gateway projects range in size and complexity from small scale projects that an enthusiast embarks upon in their own time to full blown national services that a team of many specialists works on full time. This handbook is primarily concerned with the development of larger scale gateways, and this chapter deals with the planning of a medium to large scale gateway rather than a "one-man band" approach. That said, many of the issues that apply to a large scale gateway apply equally to a gateway set up by a single person: a well defined plan, clear aims and objectives, and a carefully thought out timetable should help any gateway project, regardless of its size.


Background
 

As with any serious project, a well thought out plan is essential for long term success of an information gateway project. The best way to plan projects efficiently is with the aid of a formal project plan document. An important section of the project plan is a clearly defined set of aims and objectives. Simply stating what a project's aims and objectives are is not enough. The objectives must be accompanied by a clear set of deliverables, against which the overall success of meeting the aims and objectives can be measured. The deliverables need to be contextualised with a clear and simple timetable to help deliver the project within a sensible time frame.


Setting a gateway's objectives
 

The fact that you are seriously considering setting up a gateway must mean that you have some aims and objectives. This might be to establish a service for a specific national user community, or perhaps it is to set up a gateway for your University Library? Each different gateway will have a different set of aims and objectives. If you are receiving funding from a third party then it is highly likely that there are some contractual aims and objectives that have to be met.

In general, aims and objectives are wide ranging and rather broad statements that require further clarification. A measurable set of scheduled deliverables can help focus them. Deliverables are an important part of a project plan and are often required as a condition of funding, because they allow the funding and supporting organisations to check that their money is being used to achieve the project's stated aims and objectives.


E X A M P L E

Early SOSIG project aims and objectives

An early SOSIG project plan (published February 1996) contained the following text:

SOSIG's overall aims fall into three broad categories:

  • To improve delivery of information and quality of service by working with and helping to pilot the latest developments in networked resource tools technology
  • To improve accessibility and usability of resources via a programme of training and awareness
  • To encourage availability of new, quality networked resources of relevance to social scientists

Social Science Information Gateway - Project Plan
(Lesly Huxley and Nicky Ferguson: 1996)

Early SOSIG deliverables

The same document also contained a set of key deliverables that translated the broad aims and objectives into easily measurable outputs. A sample of early SOSIG deliverables includes:

  • A demonstrator service providing a testbed for the latest developments in networked information retrieval technology in collaboration with other services
  • Subject-specific training documentation (in paper and online form)
  • Subject-specific training workshops
  • Subject-based user guides to selected quality networked resources
  • Promotional materials to raise awareness of the service

Social Science Information Gateway - Project Plan
(Lesly Huxley and Nicky Ferguson: 1996)


R E M E M B E R

Deliverables should be SMART:

  • Specific
  • Measurable
  • Achievable
  • Relevant
  • Time-based

Making your deliverables SMART can help everyone involved in the project, both those involved in the implementation and those involved in the funding of the project.


Scheduling achievable timescales
 

Once a detailed set of deliverables has been drawn up, the next stage is to develop a timetable for their delivery. There are a few issues to consider when committing to a timetable, the most important issue being that once you have an agreed timetable then you are bound by it. There may be some flexibility in the schedule, but generally deadlines should be kept to, in order to avoid projects running into timetabling difficulties. Therefore developing a realistic and achievable timetable is important.

There is little point in having lots of important sounding deliverables and a very detailed timetable if the schedule is impossible to meet. An unachievable schedule greatly increases the chances of the project, and hence the gateway, failing. Set realistic and achievable deliverables and deadlines. Do not agree to do something unless there are sufficient time and resources available to deliver it.


Phasing of the project
 

Many of the tasks associated with setting up an information gateway are closely related to each other. There is an overlap with some tasks whilst some can only be started once others have been completed. The key tasks and phases of an information gateway project might include:

Phase 1: Pre-project

  • Outline planning of project
  • Securing funding for project
  • Producing outline project timetable and plan

Phase 2: Project planning and set-up

  • Drawing up detailed timetable and plan
  • Hiring staff and developing skills
  • Developing policy documents (scope and selection criteria)
  • Technical planning

Phase 3: Technical implementation

  • Technical set up and system testing
  • Training of non-technical staff in system usage

Phase 4: Catalogue development

  • Cataloguing of resources and catalogue development
  • Service launch

Phase 5: Day to day running

  • Ongoing catalogue development
  • Collection management

Generally the phases above are sequential and related, i.e. phase 3 can't really be started until phase 2 has been completed, and so on. The actual launch of the gateway should usually be delayed until there are a certain number of resources in the catalogue. Many gateways have waited until 100-200 resources are available before launching, although the exact number will depend largely on the staff effort available to develop the catalogue and on the overall objectives of the gateway.
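
The sequencing rule and the launch threshold described above can be sketched in a few lines of Python. This is only an illustration: the phase names are taken from the list above, while the function names and the default threshold of 100 records (the lower end of the 100-200 range mentioned) are assumptions made for the example, not part of any gateway software.

```python
# Phases from the list above, in the order they are normally carried out.
PHASES = [
    "Pre-project",
    "Project planning and set-up",
    "Technical implementation",
    "Catalogue development",
    "Day to day running",
]


def next_phase(completed):
    """Return the first phase that is not yet completed, or None when all are done.

    Because the phases are treated as sequential, a later phase is never
    offered while an earlier one is still outstanding.
    """
    for phase in PHASES:
        if phase not in completed:
            return phase
    return None


def ready_to_launch(catalogue_size, minimum_records=100):
    """Hypothetical launch check: have enough resources been catalogued?"""
    return catalogue_size >= minimum_records


if __name__ == "__main__":
    print(next_phase({"Pre-project"}))
    print(ready_to_launch(85))
```

A project with different objectives would simply pass its own `minimum_records` value; the point is that the threshold is agreed and written down before the launch date is fixed.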


References
 

SOSIG, http://www.sosig.ac.uk/


Credits
 

Chapter author: Martin Belcher


-1.3. Staff and skills required overview

In this chapter...
 
  • setting up a gateway
  • running a gateway
  • skills and people checklist
Introduction
 

Information gateway projects have several distinct phases: planning and scoping, technical and information setup, and administration and maintenance. Each phase requires different skills and perhaps different staff. In an ideal world a gateway project would be able to call on a large pool of staff. This may be the case in some instances, but more often a few key staff will perform the majority of the tasks, with external people being brought in from time to time.


Setting up a gateway
 

Depending on the exact technology used, setting up a gateway is going to carry a relatively large up front cost in terms of time and specialist skills. The information management issues will require research and documentation. It is likely that the people involved with this side of the set-up will continue to play a part in the project, most usually in building the resources database and in the day to day running of the project. There will also be a large up front cost in the technical implementation of the infrastructure software that the gateway will run on. How large this cost will be depends on whether an existing set of gateway technology is being used (e.g. ROADS) or a new system is being developed. Either option will require people with the appropriate technical skills.

If the gateway technology is being developed from scratch, or an existing system is being used with significant modification, then significant amounts of technical research and development will be required. Staff with the appropriate technical skills will be essential. Additionally, there may be a need for an interface designer to develop the user front end to the system. These skills will only really be required for a set period and set of tasks. As such they are ideal skills to bring in from external sources.

A project manager or supervisor will also be invaluable, to help in the development of the project to time, budget and its original aims and objectives. The project manager should be able to operate on both the subject specialist level and technical level. This doesn't mean that you need a programming librarian, but someone who can understand both areas and manage their different strengths and weaknesses.


Running a gateway
 

The key staff needed for the running of a gateway are subject specialists who will be involved in the expansion and development of the resources catalogue. The exact number of these will depend on the scope of the gateway. If the gateway aims to catalogue all resources in a given field within a short period, then a larger number of cataloguers will be required. The more subject specialists and resource cataloguers there are, the faster the number of resources in the gateway can grow.

Various models of developing the catalogue of resources and distributed staffing are discussed elsewhere (resource discovery strategies, working with information providers, and distributed cataloguing and collaborative working). Each model can have a significant effect on the number and type of core staff that a gateway requires for expanding the catalogue of resources.

Cross reference
Resource discovery, Working with information providers, Distributed cataloguing, Co-operation between gateways

Depending on the technology used to set up and run a gateway, the need for continued technical support and development can vary considerably. Under some circumstances the need for technical support staff effort can be kept very low. However, it is essential for the long term survival of the gateway that a reasonable amount of staff effort is kept aside for technical support and development. Even the most robust technologies can run into problems. Simple problems can cripple a gateway if the technical staff are not there to fix them.


Skills and people checklist
 

Under ideal circumstances an information gateway will be able to draw on the skills of staff with the following roles and/or job titles. Reality may mean that a few staff cover all these roles:

Title: Project manager
Description: someone to oversee the whole project and ensure the smooth day to day running
Skill set: organisational skills, good written and oral communication, person management, subject and technical knowledge and understanding, excellent information management skills

Title: Subject specialist
Description: person or persons to develop the intellectual scope of the gateway and the expansion of the gateway catalogue of resources
Skill set: excellent subject knowledge, understanding of information management issues, ideally extensive Web experience and some understanding of technological principles behind gateway

Title: Information cataloguers
Description: person or persons directly involved in the entry of resources into the catalogue (often the same as the subject specialist)
Skill set: subject knowledge, confident Web user, some understanding of technological principles behind gateway

Title: Technical implementation officers
Description: person or persons involved in the development and implementation of the technical side of the gateway
Skill set: excellent technical understanding of the networked environment, good programming and scripting skills and good working knowledge of proposed gateway technology. If developing new gateway technologies then very high network related technical skills are essential. Ideally have some appreciation of information management issues

Title: Technical support officers
Description: person responsible for the day to day technical integrity of the gateway system
Skill set: as technical implementation officers but can be slightly less experienced if correct tools are put in place in the system development

Title: Web server administrator
Description: person responsible for the running and administration of the gateway web server
Skill set: as above plus excellent Web server administration skills

Title: User interface designer
Description: person or persons responsible for the design and implementation of the gateway user interface
Skill set: good understanding of Web site design and well versed in usability and accessibility issues

Title: Finances officer
Description: person responsible for the financial side of the project
Skill set: good understanding and experience of potentially large scale project financial management, may or may not be project manager

Title: Publicity and promotions officer
Description: person or persons responsible for the development and deployment of publicity and promotional materials/activities
Skill set: experience in publicity and promotions, good subject knowledge and user community understanding

The ideal versus the real world

Ideally we would all like to be able to draw on the specialist skills of all the people outlined above. The real world dictates that, more often than not, we will have to draw those skills from a smaller group of multi-skilled people. This means a very broad skill set is required from a small number of staff. It can also mean the development of an excellent, tight-knit, well focused team.

When skills are lacking within the core team, it can often be very effective to bring in experts from outside. These experts could be drawn from within the same organisation (e.g. other sections of the same university) or they could be commercial consultants. People involved in the technical implementation, user interface design and publicity and promotion are often brought in under such circumstances.


Glossary
 

ROADS - Resource Organisation And Discovery in Subject-based Services

Credits
 

Chapter author: Martin Belcher

-1.4. System requirements overview

In this chapter...
 
  • reliability - making sure your gateway is always available
  • responsiveness - how will your gateway perform?
  • efficiency - making the best of available resources
  • scalability - coping with more users, more data and more services

Introduction
 

Subject gateway services need to be provided in such a way that they are:

  • reliable
  • responsive
  • efficient
  • scalable

A reliable service is one that is available all (well, almost all) of the time, is secure and does not lose all your data in the event of disk failure or security breaches. A responsive service is one that can be browsed, searched and maintained in a way that does not subject the end-user and cataloguer to undue delays. An efficient service makes the best use of the available hardware and network resources. A scalable service is one that can cope with demands placed on it by growing numbers of end-users, increasing database size and new service scenarios.


Background
 

Subject gateways operate in a Web environment. This means that they must be available all the time. End-users expect reasonable response times while they browse the gateway and fast and predictable performance when they search the database. Subject gateway cataloguers expect reasonable response times as they add resource descriptions to the database. Subject gateway managers want to be able to deliver all this at a reasonable cost - both in terms of the initial cost of establishing the gateway and in terms of ongoing hardware and software support costs.

You can achieve this through the use of appropriate:

  • network connectivity
  • hardware configuration (memory, CPU speed, disk space)
  • operating system software
  • subject gateway database and associated software
  • Web server software

Hardware and software requirements; issues for managers
 

Reliability

You want your subject gateway to be reliable. You want it to be available for use for as much of the time as possible - preferably 24 hours a day, 365 days a year. In order to achieve this, there are several issues you will need to think about when you are setting up and running the gateway.

Use reliable hardware

Use reliable hardware to run your subject gateway. This probably means using hardware with which you are familiar. Get a hardware support contract for your machine with an appropriate call-out time. If you are nervous, make sure that you can offer your service from some other hardware if your main kit is seriously broken. If you are really nervous, set aside a machine specifically for this purpose. As regards cost, you are likely to get a much better price/performance ratio by choosing Intel (PC) hardware. However, remember that you are likely to be accessing your disks heavily during subject gateway operation so choose an appropriate disk configuration and connection method.

Use reliable software

Remember that a subject gateway operates in a hostile networked environment and needs to support multiple users. Choose an operating system that can reliably handle this. Again, it may be sensible to choose an operating system with which you are familiar. However, it is worth noting that UNIX-based operating systems have a much longer track record of providing Internet-based services. Think carefully before choosing anything else! Much of the software developed by the DESIRE project is aimed at (or will only run under) UNIX-based operating systems. If you've chosen Intel-based hardware, using Linux as the operating system is an obvious choice. Remember that you may need software support both for your operating system and for the subject gateway software that you are running. If you prefer to pay for such support, fine; but remember that the freely available and fairly informal support which is usually available for Open Source software through mailing lists and Web sites can often be extremely good. Remember also that your subject gateway software is likely to rely on a separate Web server; the widely deployed, well maintained and supported and freely available Apache Web server is a sensible choice.

Make sure your data is regularly backed up

What happens when something goes seriously wrong with your machine: a disk crashes or you are hacked and your data is deleted? Make sure that all your software and data is backed up in such a way that you can quickly and easily recover your service. You may choose some sort of RAID architecture for your disks. You may choose to copy your data automatically to a second disk partition. In any case, you are advised to archive your data to tape regularly. You may even do all three of these things ... but do something! And don't forget your software and configuration files; in the event of a serious problem you may need to re-install absolutely everything!
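
As a rough illustration of the "archive regularly" advice, the sketch below writes a timestamped, compressed tar archive of one or more directories using Python's standard `tarfile` module. It is a minimal example under stated assumptions, not a description of how any DESIRE software performs backups: the function name, the archive naming scheme and the directory paths in the usage note are all hypothetical.

```python
import tarfile
import time
from pathlib import Path


def make_backup(source_dirs, backup_dir):
    """Write one timestamped, gzip-compressed tar archive of source_dirs.

    Returns the path of the archive that was written.
    """
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive_path = backup_dir / f"gateway-backup-{stamp}.tar.gz"
    with tarfile.open(str(archive_path), "w:gz") as archive:
        for source in source_dirs:
            source = Path(source)
            # Store each directory under its own name inside the archive,
            # so directories passed in should have distinct names.
            archive.add(str(source), arcname=source.name)
    return archive_path
```

A call such as `make_backup(["/path/to/database", "/path/to/config"], "/path/to/backups")` (placeholder paths) could be run from a scheduled job. Note that it covers the configuration files as well as the data, and that the resulting archive still needs to be copied off the machine, for example to tape, to protect against disk failure.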

Make sure your server is secure

An insecure server is a disaster waiting to happen. Follow the advice in your operating system manuals concerning security. Apply all known security patches and get someone in your team on to the right mailing lists so that you find out about potential problems early. Only run the minimum number of network services that you have to. Position your machine behind a firewall if you can, with access to the Internet only on those ports that you actually need.
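
One simple way to check the "minimum number of network services" rule is to test which TCP ports on your own server accept connections and compare them with the services you intend to offer. The sketch below uses only Python's standard `socket` module. The function names and the idea of an "allowed" list are assumptions for illustration, and it should only be run against machines you administer.

```python
import socket


def open_ports(host, ports, timeout=0.5):
    """Return the subset of ports on host that accept a TCP connection."""
    found = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success instead of raising an error.
            if sock.connect_ex((host, port)) == 0:
                found.append(port)
    return found


def unexpected_ports(host, ports_to_check, allowed):
    """Ports that are open but not among the services you mean to run."""
    return [p for p in open_ports(host, ports_to_check) if p not in allowed]
```

For a gateway that should only offer a Web service, a call such as `unexpected_ports("127.0.0.1", range(1, 1025), allowed={80})` would list any other low-numbered ports that are listening locally. It does not replace a firewall or the security advice in your operating system manuals.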

Coping with external problems

Your subject gateway will rely on various external services. If your network connection goes down, you can't offer a service. If your DNS entry isn't available for some time, people may be unable to access you. An off-site secondary for your DNS entries is a good idea; an off-continent secondary is even better! As your subject gateway grows, you might think about mirroring your service at another location. One way of achieving this is to have a reciprocal mirroring arrangement with another subject gateway.

Staffing issues

Unless you hand over completely the running and administration of your subject gateway server to a third party, you are highly likely to need one technically competent member of staff to run a subject gateway. For DESIRE developed software solutions, this will mean someone familiar with administering UNIX machines. Familiarity with the Perl programming language would be a distinct advantage as well. Other software solutions may not require UNIX or Perl experience; however, a technical understanding of the issues related to the operation of a networked service will be very helpful.

Responsiveness and efficiency

Hardware and software issues

More details concerning hardware and software issues are given in the Systems Requirements Specifics section. The main rules of thumb are:

  • hardware requirements will be software-specific - in particular, database-specific. Check your software manual!
  • more memory is likely to mean better performance
  • faster CPU speed is likely to mean better performance
  • Linux will give better performance than NT given the same hardware
  • NT and Perl may not mix well
  • more network bandwidth means better performance
  • multiple DNS secondaries will give better performance

Cross reference
System requirements specifics, hardware and software

Network and design issues

The design of the Web interface to your subject gateway will have an effect on the efficiency with which you use the available network bandwidth. Make as many of your pages as possible suitable for caching. For example, most of your browsable interface (assuming that you have one) can probably be designed so that it can be cached by remote Web caches and at the Web browser. Your user interface will be much more responsive because of this.
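
The caching advice above can be expressed through the standard HTTP `Cache-Control` response header. The sketch below is one possible policy, written as a small Python function. The `/browse/` path prefix, the one-day lifetime and the function name are assumptions for the example, and your gateway software or Web server may offer its own way of setting these headers.

```python
def cache_headers(path, browse_max_age=86400):
    """Return HTTP response headers for a gateway page.

    Assumption for this sketch: browse pages live under /browse/ and change
    rarely, while everything else (search results, cataloguer forms) is
    generated per request and should not be stored by caches.
    """
    if path.startswith("/browse/"):
        # Cacheable by remote Web caches and by the browser for one day.
        return {"Cache-Control": f"public, max-age={browse_max_age}"}
    return {"Cache-Control": "no-store"}
```

With this policy a repeat visit to a browse page can be served from a nearby cache without touching the gateway server, while searches always reach the database.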

Cross reference
User interface implementation

Scalability

Scalability is discussed in more detail in the Scalability section. As a general point it is worth noting that:

  • supporting more users may require more memory and more network bandwidth
  • having more records in the database may require more memory and more disk space
  • introducing new service scenarios may require more memory and more disk space

Cross reference
Scalability

Costs

Unless you are very lucky, the hardware on which you run your subject gateway is going to cost money. As mentioned above, Intel-based hardware is likely to give a much better price/performance ratio than other hardware. Software may well be free - all the software developed by the DESIRE project will be made available on an Open Source basis. Hardware and software support is likely to cost money; though again it is worth noting that the support you can get for free from the Internet community may well be good enough for your needs (and may even be better than that provided commercially). Technical staff will cost money.


Future proofing
 

Software and hardware systems need to be regularly reviewed to measure how far they are meeting business requirements. The gateway will want to choose software and hardware solutions which provide sufficient flexibility to accommodate change. Such products will probably:

  • offer regular upgrades
  • comply with open standards
  • respond to customer requests
  • impose no restrictions which tie you to that product, for example by ensuring that you have access to proprietary specifications of data structures which may be needed to convert to a new supplier's format

The gateway will want to ensure that decisions regarding the choice of products are informed by strategic objectives, for example:

  • use products that have a good reputation in areas which are important for the gateway (by being innovative, reliable, flexible, customisable . . . )
  • use products that support inter-working with key collaborators
  • implement systems with potential audiences in mind (the technologies they use, the features they value)
E X A M P L E

Scout/SOSIG mirroring

SOSIG, the Social Science Information Gateway, is a ROADS database of over 5500 Internet resource descriptions operated by ILRT at the University of Bristol in the UK. In order to make the database more accessible to end-users in North America, SOSIG has been working closely with staff from the Internet Scout Project, located at the University of Wisconsin-Madison (USA) and funded by the National Science Foundation. A mutual mirroring service has been set up so that users from North America can access a mirror of SOSIG, based on the Scout server, and European users can access a mirror of Scout from the SOSIG server. The SOSIG ROADS database is mirrored weekly using some locally developed scripts that make a 'tar' copy of the complete SOSIG ROADS installation (after making some site-specific changes).
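
The SOSIG scripts themselves are not reproduced in this handbook. As a rough illustration only, a 'tar' copy of an installation directory could be made along the following lines in Python. The function name and the idea of excluding named site-specific files are assumptions made for this sketch. The real scripts make site-specific changes rather than simply leaving files out.

```python
import tarfile
from pathlib import Path


def make_mirror_archive(install_dir, archive_path, exclude_names=()):
    """Write a gzip-compressed tar copy of install_dir, skipping excluded names."""
    install_dir = Path(install_dir)

    def skip_site_specific(member):
        # Returning None from a tarfile filter drops that member from the archive.
        if Path(member.name).name in exclude_names:
            return None
        return member

    with tarfile.open(str(archive_path), "w:gz") as archive:
        archive.add(str(install_dir), arcname=install_dir.name,
                    filter=skip_site_specific)
    return archive_path
```

The resulting archive can then be transferred to the mirror site and unpacked there on whatever schedule the two gateways agree.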

Cross reference
Co-operation between gateways


Glossary
 

DNS - Domain Name System. A general-purpose distributed, replicated, data query service chiefly used on the Internet for translating hostnames into Internet addresses.
Linux - Linux is a free Unix-type operating system originally created by Linus Torvalds with the assistance of developers around the world.
RAID - Redundant Arrays of Independent Disks
ROADS - Resource Organisation And Discovery in Subject-based Services

References
 

Apache, http://www.apache.org/

Internet Scout Project - SOSIG mirror, http://scout18.cs.wisc.edu/sosig_mirror/

Linux, http://www.linux.org/

SOSIG, http://www.sosig.ac.uk/

Æ. Frisch, Essential System Administration, 2nd ed. (ISBN: 1-56592-127-5). http://www.oreilly.com/catalog/esa2/

B. Laurie & P. Laurie, Apache: The Definitive Guide, 2nd ed. (ISBN: 1-56592-528-9). http://www.oreilly.com/catalog/apache2/

M. Loukides, System Performance Tuning (ISBN: 0-937175-60-9). http://www.oreilly.com/catalog/spt/

E. Siever, et al., Linux in a Nutshell: A Desktop Quick Reference (ISBN: 1-56592-585-8). http://www.oreilly.com/catalog/linuxnut2/


Credits
 

Chapter author: Andy Powell

-1.5. Maintenance requirements

In this chapter...
 
  • the importance of maintenance
  • estimating maintenance requirements
Introduction
 

Information gateways need to be maintained in two key areas:

  • collection management
  • server integrity and functionality

Without adequate maintenance in these two areas, a gateway risks undermining its core aim: to be a quality-controlled portal to online information resources. The key strength of an information gateway lies in the quality of its data and the reliability of its service, and without adequate maintenance both are susceptible to developing weaknesses and problems.


The importance of maintenance
 

Server integrity and functionality

All Web sites and services need some degree of Web server maintenance. A competent system administrator and Webmaster can easily carry out much of this technical maintenance. Additionally, many maintenance tasks can be readily automated, thereby reducing the need for direct human intervention. However, there is still a need for someone to keep an eye on things: to monitor system performance and deal with any day-to-day maintenance issues that arise. Without this maintenance there is a real risk that problems with the Web server will not be picked up until users find them. If users experience regular problems with a Web site they are likely to lose trust in it, and loss of trust often results in lost users.

Information gateways have the additional requirement that they need regular and sometimes extensive maintenance of the resource catalogue. Because the resource catalogue is at the heart of the gateway (it is the very reason why people use the gateway), failure to maintain it can lead to serious problems in quality of service and content. Problems in this area directly affect user confidence in the gateway. Without user confidence and quality assurance, gateways can rapidly lose users and fail to attract new ones.

Collection management

Because of the dynamic nature of the Internet, a catalogue of Internet resources is going to require a certain degree of maintenance in order to keep the catalogue up to date. Online resources come and go, are available one day and not the next (the fluidity of many online documents is detailed elsewhere - Collection management). This makes collection management an important part of any gateway's maintenance requirements.
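
A simple automated link check illustrates the kind of maintenance involved. The sketch below uses only the Python standard library. The record layout (a dictionary with a `url` key) and the three result labels are assumptions made for illustration, and a real gateway would normally use the link-checking tools supplied with its own software.

```python
import urllib.error
import urllib.request


def check_link(url, opener=urllib.request.urlopen, timeout=10):
    """Return 'ok', 'gone' or 'unreachable' for one catalogued URL."""
    try:
        with opener(url, timeout=timeout) as response:
            return "ok" if response.status < 400 else "gone"
    except urllib.error.HTTPError:
        # The server answered, but with an error such as 404 Not Found.
        return "gone"
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout and similar problems.
        return "unreachable"


def records_needing_review(records, opener=urllib.request.urlopen):
    """List the URLs of catalogue records whose links no longer resolve cleanly."""
    return [r["url"] for r in records if check_link(r["url"], opener) != "ok"]
```

A link that resolves is not necessarily still the resource that was catalogued, so a check like this only flags candidates for human review.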

Cross reference
Collection management


Estimating maintenance requirements
 

Estimating maintenance requirements for an information gateway can be a difficult task. Key factors that should be considered are:

  • what is the scope of the gateway?
  • how quickly is the gateway resource catalogue scheduled to grow?
  • what is the perceived lifetime of the gateway?
  • how heavily will the gateway be used?

Generally, the larger the scope, the quicker the scheduled growth, the longer the lifetime and the more heavily used the gateway is, the more maintenance will be required.

Server integrity and functionality

Server maintenance will be largely constant regardless of the size of the gateway. If the gateway has its own dedicated server then there will be basic machine level administration tasks. If the gateway is hosted virtually (i.e. multiple Web sites on the same machine), then a large proportion of the maintenance will be shared with other sites on that machine.

For more details on hardware and software maintenance see the System requirements specifics, hardware and software chapter.

Cross reference
System requirements specifics, hardware and software

Virtual hosting maintenance can be as little as a few hours a week of staff effort, sometimes even less. Dedicated servers are going to require more maintenance but with the right planning and set-up the maintenance requirements can be kept below one day per week in staff effort.

These low levels of maintenance can be achieved only with careful planning and setting up of the gateway from the start. Obviously when problems arise (they do even for the best-planned gateway) maintenance requirements can be considerably more time consuming.

Collection management

Collection management and associated maintenance requirements are closely linked to the size of the catalogue and resources database. Validating records, link checking and updating resource descriptions will be related to the number of records that are being dealt with. As the catalogue grows expect to spend 10-15% of the overall cataloguing time on collection management maintenance and related tasks.
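
As a worked example of the 10-15% guideline, the short Python function below converts a cataloguing effort into a range of hours to reserve for collection management. The function name and the 40-hour figure in the usage note are illustrative assumptions.

```python
def collection_management_hours(cataloguing_hours, low=0.10, high=0.15):
    """Return the (low, high) hours to reserve for collection management."""
    return (cataloguing_hours * low, cataloguing_hours * high)
```

For a team that spends 40 hours a week cataloguing, this suggests setting aside roughly 4 to 6 hours a week for validating records, checking links and updating descriptions.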

R E M E M B E R

General Web sites often require an unexpectedly high level of maintenance. It has been estimated that "as a rule of thumb, the annual maintenance budget for a website should be about the same as the initial cost of building the site, with 50 percent as an absolute minimum."

Jakob Nielsen: 1997
http://www.useit.com/alertbox/9706b.html
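
Applied to a budget, the rule of thumb quoted above gives a simple range. The sketch below is only arithmetic on that quotation. The function name is hypothetical and the rule was stated for general Web sites, not specifically for gateways.

```python
def annual_maintenance_budget(initial_build_cost):
    """Return (minimum, typical) annual maintenance budgets under the rule of thumb."""
    minimum = 0.5 * initial_build_cost  # "50 percent as an absolute minimum"
    typical = 1.0 * initial_build_cost  # "about the same as the initial cost"
    return minimum, typical
```

A site that cost 20,000 to build would, on this rule, need between 10,000 and 20,000 a year for maintenance, in whatever currency the build cost was measured.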


References
 

Jakob Nielsen, Top Ten Mistakes of Web Management. Alertbox, June 15 1997. http://www.useit.com/alertbox/9706b.html


Credits
 

Chapter author: Martin Belcher


Section 2 : Information Issues (Print Version)

Target audience
 

Section 2 of this handbook is aimed at gateway staff responsible for information management - the subject specialists and information professionals who will consider the content and organisation of the information within the gateway.

It aims to cover the important decisions that need to be made when setting up a new gateway (such as choosing a metadata format, designing a user interface, writing a selection policy) but also covers issues that arise in the day-to-day running of an existing gateway (such as cataloguing, resource discovery and publicity and promotion).

Each chapter offers some background, practical tips and hints, key references, a glossary, case studies and examples. Watch out for the Cross Reference that will take you to related sections elsewhere in the handbook.

Contents
  Section 1 : Strategic Issues

Section 2 : Information Issues
  1. Quality selection
  2. Resource discovery
  3. Metadata formats
  4. Cataloguing
  5. Subject classification, browsing and searching
  6. Collection management
  7. Working with information providers
  8. Publicity and promotion
  9. User interface design
  10. Integration of robot and manual indexes
  11. Distributed cataloguing
  12. Multi-lingual issues
  13. Co-operation between gateways
Section 3 : Technical Issues

-2.1. Quality selection: ensuring the quality of your collection

In this chapter...
 
  • why develop and publish a selection policy for your gateway?
  • creating a scope policy and selection criteria for your gateway
  • guidelines for selecting and evaluating Internet resources
  • skills and training required by gateway staff in selection and evaluation
  • changing your selection criteria over time
  • quality ratings/labelling/PICS and other Internet initiatives in this area

Introduction
 

Subject gateways are sometimes called the Internet equivalent of a library, and in terms of the selection process this is certainly true.

Gateways are characterised by the focus and quality of their collections. They aim to provide their users with a quality controlled environment in which to search for information on the Internet and they do this by building selective collections where every resource that the gateway points to has been carefully selected for its quality.

The selection process involves people making value judgements about Internet resources and selecting only those resources that satisfy certain quality criteria.

But what constitutes a 'high quality' Internet resource? Information gateways need to use a service-driven definition of quality, where resources are selected for their relevance to the user group as well as their inherent features.

Selecting resources for a gateway therefore requires a clear understanding of the information needs of the end-users, as well as of the pros and cons of the design features of Internet sites.

Information gateways consciously emphasise the importance of skilled human involvement in the assessment and 'quality control' of their selected Internet resources. Selection and evaluation of resources for a gateway is typically done by a librarian or subject specialist, reflecting the fact that selection is based on an evaluation of the semantic content of the resources.

A formal selection policy can support the development of a consistent and coherent collection of high quality Internet resources.


Why develop and publish a selection policy for your gateway?
 

Many subject guides on the Internet do not explicitly state their selection policies, but there are a number of advantages in developing a formal selection policy for a gateway and publishing it on your site:

  • it helps users to appreciate that the service is selective and quality controlled
  • it helps users to understand the level of quality of information they will find when using the service
  • it helps gateway staff to be consistent in their selection and to maintain the quality of the collection
  • it can be used to train new staff
  • it ensures consistency in collections that are developed by a distributed team

By publishing your selection policy on the gateway you can help your users to conceptualise the nature of the collection they are using. On the Web, users are very often faced with a search box or an index, and it is not always easy for them to understand exactly what they are searching. An explicit selection policy can help them to understand the nature of your gateway service. The Centre for Information Quality Management (CIQM) recommends that database providers offer a 'published specification' or 'user-level agreement' to 'lessen the gap between user expectations and the reality of searching' (Armstrong, 1997). A formal selection policy can help to meet this recommendation.

The integrity of a collection will depend on there being some consistency in the type and quality of resources that your staff decide to include in the collection. A formal selection policy can help to ensure that the selection is consistent and that the quality of the collection remains high.

A selection policy can ensure that the same member of staff makes consistent judgements about what they include in the collection. It can also ensure that different members of the staff team make consistent judgements and that they are all using the same selection criteria.

The selection policy can help new staff to understand quickly both the nature of the collection and the criteria they should use when selecting new resources to add to the gateway.

A formal policy can also help to ensure consistency of selection within a distributed team. For example, if a number of gateways are working collaboratively, an agreed selection policy can help to ensure that the combined collection has a consistent level of quality.


What is a selection policy?
 

In an information environment, a selection policy defines the criteria used for selecting resources to add to a collection. It will typically outline the scope of the collection and the criteria used when new resources are selected for the collection. The scope policy relates to the needs of the target user group, while the selection criteria relate to the inherent features of the Internet resources.

Defining the scope of the collection

Subject gateways do not aim to include every resource available on the Internet. The scope of a gateway defines the boundaries of the collection. The scope policy is therefore a broad statement of the parameters of the collection.

The scope policy of a service states what is and is not to be included in the catalogue. In the selection process, the scope of the service will affect the first decisions made about the quality of the resources. Those falling outside the scope will be rejected and the rest will have the quality criteria applied to them.

The scope criteria are the first filter through which the resources pass. They will tend to involve clear decisions; either a resource falls within the scope or it does not.

A scope statement will typically outline:

  • the subject areas covered by the gateway
  • the types of resources covered by the gateway

It may also outline:

  • language parameters (e.g. whether the gateway only includes resources in a certain language)
  • geographical parameters (e.g. whether the gateway only includes resources from a particular country)
  • other parameters of relevance to the user group served
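
Because scope decisions are clear-cut, a scope statement can be sketched as a simple data structure with a yes/no test. The Python class below is a hypothetical illustration. The field names and the resource dictionary keys are assumptions, not part of any DESIRE or ROADS software.

```python
from dataclasses import dataclass, field


@dataclass
class ScopePolicy:
    """Hypothetical machine-readable form of a gateway scope statement."""
    subjects: set
    resource_types: set
    languages: set = field(default_factory=set)  # empty set means no restriction
    countries: set = field(default_factory=set)  # empty set means no restriction

    def in_scope(self, resource):
        """Return True only if the resource passes every scope parameter."""
        if resource["subject"] not in self.subjects:
            return False
        if resource["type"] not in self.resource_types:
            return False
        if self.languages and resource.get("language") not in self.languages:
            return False
        if self.countries and resource.get("country") not in self.countries:
            return False
        return True
```

Resources that pass this first filter would then go on to be judged against the quality selection criteria, which need human evaluation.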
E X A M P L E

Examples of scope policies


Defining the quality selection criteria

Subject gateways do not generally aim to point to every Internet resource that falls within their subject area and scope. They are characterised by their quality control, aiming to point only to the best resources available for their subject area and audience.

The selection criteria outline the qualities that a resource must have to be included in the collection.

E X A M P L E

Examples of quality selection criteria



Developing a selection policy for your gateway
 

How should a gateway develop its selection policy? Each gateway needs to develop its own unique set of selection criteria to take the information needs of the user group and the aims of the service into account.

The first steps are to define:

  1. your target user group
  2. the information needs of the user group
  3. the aims and objectives of the gateway (balancing what you'd like to cover with what you have the resources to cover)

Once these steps have been taken, it is a matter of defining a formal scope policy and a set of selection criteria.

The DESIRE project has created some tools for creating a scope and selection policy. The guidelines are not prescriptive and are designed to help an institution or service develop its own tailor-made policies in the light of its aims and audience. A comprehensive list of criteria is given, from which criteria relevant to the individual service can be chosen. The list has been drawn from a 'state of the art review' of current practice, library and Web literature.

Creating a scope policy

Some possible criteria for creating your scope policy are given below. For each heading you will need to outline the parameters to be used in your gateway. Not all of these will be appropriate for your audience and you may need to add additional criteria.

INFORMATION COVERAGE

Subject Matter

  • what subject matter is appropriate for the target audience?
  • are there any subjects which will be censored? (e.g. for ethical reasons, such as resources produced by hate groups or resources about bomb-making/paedophilia etc.)
  • how important is the subject matter of linked sites?

Acceptable Types of Resource

  • what types of resource are appropriate for the target audience?
  • is the information scholarly rather than popular?
  • does the resource contain more than just a list of links?
  • is the site either proven to be or expected to be durable?
  • would a resource intended for use by an individual or local group be acceptable?
  • is it innovative - does it contain breakthrough design elements?

Acceptable Sources

  • which sources of information are acceptable/appropriate for the target audience?
  • are academic, government, commercial, trade/industry, non-profit private sources all acceptable?
  • are pages maintained by individual enthusiasts (e.g. students) acceptable?
  • is biased information acceptable, and are opinions and ideologies acceptable?

Acceptable Levels of Difficulty

  • what level of resource is appropriate for the target audience? (e.g. users may be school children or may be academics)

Advertising

  • are resources that contain advertising acceptable?
  • is there a limit to the amount of advertising that is acceptable?
  • are there any forms of advertising that will be censored?

ACCESS

Cost

  • how is charging going to affect selection - is the service only going to point to resources that are free to access?
  • are there any price limits in terms of the access charge?
  • what if resources are under copyright?

Technology

  • what technologies are appropriate for the target audience? (forms, ismaps, databases, CGI scripts, Java applications, frames, etc.)
  • what connectivity does your audience have and how will this affect selection?
  • what software do your users have and how will this affect selection? (e.g. will resources that work well in graphical browsers but not in line browsers be accepted?)
  • what hardware do your users have and how will this affect selection?

Registration

  • will the service accept resources where user-registration is necessary before the resource can be accessed?
  • is online registration acceptable?
  • if users must negotiate written contracts before access is possible, is this acceptable?

Special Needs

  • do your users have any special needs that will affect the resources selected? (e.g. large print or audio options for disabled users)

METADATA AND CATALOGUING ISSUES

Granularity

  • at what level will resources be selected/catalogued?
  • will resources be considered at the Web site/Usenet group level or the Web page/Usenet article level?

Resource description

  • what is the minimum amount of information needed to create a resource description in your catalogue, i.e. what basic information MUST a resource contain to be selected? (e.g. in a WWW document, contact details, last update details, etc.)
  • is there sufficient information to create a descriptive record?
  • will the service accept resources with/without specific metadata?

GEOGRAPHICAL ISSUES

Geographical Restraints

  • are any geographical restraints appropriate for your audience?
  • will the service cover information produced locally, from particular countries, particular continents or worldwide?

Language

  • in which languages are resources acceptable/appropriate to your target audience?

Creating quality selection criteria

Once you have defined the scope of your gateway, you will need to outline the level of quality that is acceptable within each individual resource.

A list of possible quality selection criteria is given below, from which criteria relevant to the individual service can be picked.

Content criteria: evaluating the information

  • validity
  • authority and reputation of source
  • accuracy
  • comprehensiveness
  • uniqueness
  • composition and organisation
  • currency, adequacy of maintenance

Form criteria: evaluating the medium

  • ease of navigation
  • provision of user support
  • use of recognised standards
  • appropriate use of technology
  • aesthetics

Process criteria: evaluating the system

  • information integrity (work of the information provider)
  • site integrity (work of the Webmaster/site manager)
  • system integrity (work of the systems administrator)

A fuller description of each of these criteria, with examples, can be found in an online tutorial called 'Internet Detective':

Tips

Internet Detective

Internet Detective is an interactive, online tutorial which provides an introduction to the issues of information quality on the Internet and teaches the skills required to evaluate critically the quality of an Internet resource. There is no charge, it takes around two hours to complete and it has interactive quizzes and exercises to lighten the learning process.

Selection criteria for quality controlled information gateways

This is a lengthy, peer-reviewed report which describes the DESIRE research into the development of quality systems and selection criteria for subject gateways. This report will be of interest to people wishing to see the research and methodology that lay behind the development of the lists of criteria given above. The lists resulted from a 'state of the art' review of quality issues, both within subject gateways and in other sectors, notably the private sector and industry.


Guidelines for selecting and evaluating Internet resources
 

The staff responsible for selecting new resources to add to the gateway will need to be able to select resources that together create a consistent and coherent collection of high quality Internet resources.

What constitutes a 'high quality' Internet resource? The definition of quality used here has been drawn from the commercial sector, where quality is seen to be closely related to customer satisfaction and to developing systems of continuous improvement. In the context of a subject gateway, the quality of a resource will depend on the users of the service, and the nature of the service, as well as the internal features of the resource itself. We suggest that for information gateways 'a high quality Internet resource is one that meets the information needs of the user'.

This is a service-oriented definition, and so, when evaluating the quality of Internet resources, gateway staff must consider the user group that they are serving as much as the Internet resources they are evaluating.

SOSIG (The Social Science Information Gateway) has come up with five steps that describe the selection process for gateway staff:

E X A M P L E

SOSIG selection procedure: Five steps to quality control

Before you start - get to know the quality of SOSIG

  • read the SOSIG scope policy, which outlines the subjects and types of resources that are acceptable
  • become familiar with the SOSIG service, especially the coverage of the collection; browse the database to see the kinds of resources that are acceptable
  • become familiar with the SOSIG quality selection criteria outlined in these Web pages

Finding resources

You may find it easier to divide the selection process into two stages:

  1. Spend time finding resources on the Internet and bookmarking those with potential.
  2. Go back to the bookmark list later to spend time evaluating each resource in some detail.

Once you have found a resource to evaluate, there are five steps to quality control, which are summarised below.

1. Ensure that the resource falls within the scope of SOSIG

This is the most important filter through which all resources should pass - if it isn't relevant then reject it! You can use the scope policy for guidance. Most important of all is to ensure that the resource is social science related! You can look at the browsing pages to see which subject areas the service covers.

2. Search the SOSIG collection

To avoid duplication within the SOSIG collection, it is essential that you go to 'Search SOSIG' and check that the resource is not already in the database. Consider how the resource will add to the SOSIG collection (this will get easier the more you get to know SOSIG). The coverage and balance of the collection is important. Try to find resources for subject areas that are not well covered.

3. Evaluate the content of the information

Content criteria are based on the information the resources actually contain. Of the criteria relating to the resources themselves, the content criteria are the most important. Content criteria should take precedence over form criteria - SOSIG users are likely to care more about getting the information that they need than about the form it takes.

4. Evaluate the form of the information

Form criteria relate to the medium, design and presentation of the resource. Some evaluation of the form can be made by considering the ease of navigation, provision of user support, and design. Resources should rarely be rejected on design points alone, but there may be factors which should be mentioned in your description of the resource (e.g. if a resource comes in a form that some users will not be able to access).

5. Evaluate the processes set up to support the resource

Process criteria relate to the fact that Internet resources can be volatile and can lack integrity. Some evaluation of the processes set up to support a resource is necessary. These may involve personnel as well as computer systems. You need to evaluate the likelihood that a resource will be adequately maintained over time and that it will remain current and stable.

Quality resources can now be added to SOSIG via the WWW catalogue form
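
The ordering of the five steps can be sketched as a short pipeline in which the cheap, clear-cut checks run first. This is an illustrative simplification, not SOSIG's actual procedure. The three callables are hypothetical, the content, form and process verdicts are assumed to come from a human evaluator, and rejecting on process grounds is an assumption of this sketch.

```python
def select_resource(resource, in_scope, already_catalogued, evaluate):
    """Apply the five steps in order and return (accepted, reason)."""
    # Step 1: scope is the first filter.
    if not in_scope(resource):
        return False, "out of scope"
    # Step 2: avoid duplicates within the collection.
    if already_catalogued(resource["url"]):
        return False, "duplicate"
    # Steps 3-5: content, form and process judgements made by the evaluator.
    verdict = evaluate(resource)
    if not verdict["content"]:
        return False, "content"
    if not verdict["process"]:
        return False, "process"
    # Form problems are noted in the description rather than causing rejection.
    if not verdict["form"]:
        return True, "accepted; note form issues in description"
    return True, "accepted"
```

Content is checked before form here because content criteria take precedence. A resource with poor design but valuable content is still accepted in this sketch.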


Skills and training required by gateway staff in selection and evaluation
 

The choices made by the staff who select resources for a gateway will determine the nature of the collection. Recruitment and training of staff will therefore be critical for your gateway.

Recruiting staff

Subject gateways typically employ librarians or subject specialists to select Internet resources to add to the gateways. This reflects an acceptance that to build a high quality collection you need:

  • a good understanding of the information needs of your target user group
  • selection based on semantic judgements about the relevance and value of resources to your users
  • knowledge and expertise in the subject
  • knowledge and experience of information resources
  • skills in critical evaluation of information resources

Recruiting skilled and knowledgeable staff will help ensure the integrity of the gateway collection.

Training staff

Staff will need to be consistent in their selection criteria if the collection is to develop consistently. They will need to be familiar with the scope and selection criteria of your gateway, but will also need to develop skills for evaluating Internet resources. Training staff may involve:

  • 'editorial meetings' - where all the selection staff discuss the criteria to be used
  • creating a staff manual - giving staff paper or online copies of the selection policy
  • developing exercises and examples based on Web sites to evaluate
  • asking staff to complete the 'Internet Detective' online tutorial
  • monitoring the sites selected by new staff to check they comply with the selection policy
  • setting up an email list for all staff to discuss and debate any quality issues that arise

Changing your selection criteria over time
 

It may be necessary to update a selection policy, as the priorities for selection may change over time as a gateway collection matures.

Adapting scope policies

A new gateway may wish to focus on developing a core collection very quickly before broadening the parameters. The scope may be much narrower in the early stages of collection development. For example, a new gateway may set narrow parameters for things such as:

  • granularity (e.g. focus on Web sites as opposed to Web pages)
  • subjects covered (e.g. prioritise generic resources over resources for very rarely researched subjects)
  • geographic boundaries (e.g. focus on UK resources before adding those from elsewhere)
  • types of resource (e.g. focus on Web sites as opposed to mailing lists or newsgroups)

A more mature gateway, on the other hand, may broaden its scope once a core collection has been developed, to include resources beyond the narrow scope initially used. It may choose to extend its subject coverage, work at a finer level of granularity or include resources from different countries and of different types. These decisions should be reflected in the scope policy of the service.

Adapting selection criteria

The Internet offers uneven coverage of subjects, and this may affect the quality selection criteria used within different parts of a gateway collection.

For example, if a subject comes within the scope of the gateway but very few resources can be found about that subject, it may be that less stringent quality criteria should be used, to ensure that there is at least some subject coverage.

Conversely, if there are many resources available for a subject, then very stringent quality criteria may be used to ensure that the highest quality resources are selected in preference to others with the same subject coverage.
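
One way to picture this adjustment is as a quality threshold that moves with the number of candidate resources available for a subject. The sketch below assumes a hypothetical 1-5 quality score assigned by the evaluator. The cut-off values are arbitrary illustrations that each gateway would set for itself.

```python
def quality_threshold(candidates_in_subject, base=3, scarce_below=5, plentiful_above=50):
    """Return the minimum score (hypothetical 1-5 scale) a resource must reach."""
    if candidates_in_subject < scarce_below:
        return base - 1  # relax the criteria so the subject has some coverage
    if candidates_in_subject > plentiful_above:
        return base + 1  # tighten the criteria to keep only the best resources
    return base
```

With these illustrative defaults, a subject with only two candidate resources would accept a score of 2, while a subject with two hundred candidates would require a score of 4.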

These issues relate to collection management, which is discussed in the Collection Management chapter of this handbook.


Quality ratings/labelling/PICS and other initiatives in this area
 

The Web and metadata communities have been exploring the potential for automated approaches to quality-related aspects of information management on the Internet. The main aim has been to create a system where the quality of an Internet resource can be described in a machine-readable form. If this were to be achieved a number of scenarios would become possible. For example:

  • search engines could retrieve or rank resources according to aspects of their quality
  • users could search for resources using particular quality requirements (e.g. only peer reviewed journals, or resources that work with version 3.1 of Netscape, or resources that have been approved by a librarian)
  • users could recommend and rate Internet resources in a standard format and share these ratings

There have been two main challenges:

  1. Creating the technological infrastructure to support machine-readable quality ratings.
  2. Creating metadata vocabularies to describe various quality attributes of Internet resources.

PICS and RDF

PICS and RDF both aim to provide a technological infrastructure to support machine-readable quality ratings.

PICS stands for Platform for Internet Content Selection. It has been approved by the W3C (World Wide Web Consortium) as an agreed standard for associating labels (metadata) with Web sites or Web pages. Essentially, these labels refer to the information content of the sites, and therefore provide a means of recording information about aspects of their quality. PICS has most famously been used to support the development of services that aim to protect children from X-rated sites on the Internet.

RDF stands for Resource Description Framework and is a standard approved by the W3C. It has emerged as a successor to PICS, offering a broader infrastructure for assigning metadata labels to Internet sites and pages. RDF can be used with many different metadata vocabularies, and certainly there is potential for it to be used with a vocabulary that describes the quality of an Internet resource.

Metadata vocabularies for quality

The second challenge has been to create metadata vocabularies to describe various quality attributes of Internet resources. At the time of writing no such vocabulary has emerged, but work is under way, particularly within the medical community, to create metadata labels for quality that can be incorporated into Internet resource discovery services.

With the basic RDF framework in place, it is now possible for different communities to create their own quality vocabularies and apply them to their own services.
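To give a feel for what a machine-readable quality label might look like, the sketch below builds a small RDF/XML description using Python's standard library. The RDF namespace is the real one; the quality vocabulary namespace (http://example.org/quality#) and the property names passed in are invented for illustration, since no agreed quality vocabulary exists.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Hypothetical vocabulary namespace, invented for this illustration.
QUALITY_NS = "http://example.org/quality#"

ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("q", QUALITY_NS)


def quality_label(resource_url, ratings):
    """Return an RDF/XML string describing quality attributes of one resource.

    ratings: dict mapping (hypothetical) property names to string values.
    """
    root = ET.Element("{%s}RDF" % RDF_NS)
    description = ET.SubElement(
        root, "{%s}Description" % RDF_NS, {"{%s}about" % RDF_NS: resource_url}
    )
    # One child element per quality property, in alphabetical order.
    for name, value in sorted(ratings.items()):
        prop = ET.SubElement(description, "{%s}%s" % (QUALITY_NS, name))
        prop.text = value
    return ET.tostring(root, encoding="unicode")
```

A community would replace the example namespace and property names with its own published vocabulary.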

How does this work relate to information gateways?

This work has the potential to offer gateways a number of interesting possibilities, for example:

  • Internet cataloguers may use quality ratings to help them find high quality resources to add to their gateway
  • gateways may create machine-readable quality labels
  • they may incorporate user ratings into their services

The missing link, as things stand, is the development of quality vocabularies. Gateways may see it as their role to create such vocabularies and to use RDF to create machine-readable metadata about the quality of Internet resources. At present we cannot offer an example of a gateway doing this, but some key sites where new developments will appear are listed below.

E X A M P L E

Examples of recent work with PICS and quality ratings


Glossary
 

DutchESS Dutch Electronic Subject Service
EELS Engineering Electronic Library Sweden
PICS Platform for Internet Content Selection
RDF Resource Description Framework
SOSIG Social Science Information Gateway


References
 

DutchESS, http://www.konbib.nl/dutchess/

EELS, http://www.ub.lu.se/eel/

European Link Treasury, http://www.en.eun.org/news/european-link-treasury.html

Information Quality WWW Virtual Library, http://www.ciolek.com/WWWVL-InfoQuality.html

Internet Detective, http://www.sosig.ac.uk/desire/internet-detective.html

Länkskafferiet (Link Larder), http://lankskafferiet.skolverket.se/information/kvalitetskriterier.html

PICS Home Page, http://www.w3.org/PICS/

RDF Home Page, http://www.w3.org/RDF/

Scout Report, http://scout.cs.wisc.edu/index.html

SOSIG, http://www.sosig.ac.uk/

J. Alexander & M. A. Tate, Evaluating Web Resources,
http://www2.widener.edu/Wolfgram-Memorial-Library/webeval.htm

C. Armstrong, 'Metadata, PICS and Quality', Ariadne, Issue 9, 1997,
http://www.ariadne.ac.uk/issue9/pics/

N. Auer, Bibliography on Evaluating Internet Resources,
http://www.lib.vt.edu/research/libinst/evalbiblio.html

D. Brickley, T. Gardner, R. Heery & D. Hiom, Recommendations on Implementation of Quality Ratings in an RDF Environment,
http://www.desire.org/html/research/deliverables/D3.2/

A. Cooke, Finding Quality on the Internet: a guide for librarians and information professionals
(London: Library Association Publishing, 1999. ISBN: 1-85604-267-7).


Credits
 

Chapter author: Emma Place

With contributions from: Michael Day, Debra Hiom, Ann-Sofie Zettergren


-2.2. Resource discovery

In this chapter...
 
  • the resource discovery process - ensuring new Internet resources are found to add to your gateway
  • systems for gateway managers - to support efficient resource discovery within your team
  • strategies for gateway staff - to continuously locate high quality resources on the Internet
  • case studies - resource discovery tips and hints from existing gateways
  • new and mature gateways - different resource discovery issues for different gateways

Introduction
 

Subject gateways should aim to describe the best resources that the Internet has to offer in their field and for their target audience. They need to:

  • point to the highest quality networked resources currently available
  • point to new networked resources as they appear

Finding high quality resources on the Internet can be a time-consuming job, which is of course exactly why gateways exist: to save the end-user some of the time and commitment required to discover and retrieve high quality information on the Internet.

Locating resources to add to your gateway will require one of the biggest investments of staff time and effort, and so it is important to find efficient and effective methods of working at this task:

  • gateway managers need to ensure that systems to support resource discovery are in place
  • individual gateway staff need to develop their own strategies for locating as many high quality resources as possible, as efficiently as possible

Resource discovery issues for gateway managers
 

Gateway managers will need to provide the systems and strategies to support efficient resource discovery within their team.

Resource discovery is labour-intensive, and efficient strategies can help to maximise the number of resources added to the gateway. Managers can put the following systems in place to support their team:

  1. Avoid duplicated effort.
  2. Find the right people for the job.
  3. Provide training in resource discovery.
  4. Set up support systems for resource discovery staff.
  5. Set up systems to encourage your user community to suggest resources.

1. Avoiding duplicated effort

Duplicated effort can be wasted effort. Duplication can arise:

  • between gateways
  • within the team

Avoid duplication with other gateways

It is worth finding out whether other gateways already describe Internet resources in your field. If they do, you have to ask yourself whether it really makes sense to spend time and effort cataloguing the same resources twice. If existing gateways are already describing resources relevant to your users, you should consider:

  • collaboration with other gateways (to avoid cataloguing the same resources twice)
  • cross-searching your gateway with other gateways so that your users can search more than one simultaneously
  • sharing metadata records

Cross reference
Co-operation between gateways
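One practical first step is to check how much of your own collection another gateway already lists. The sketch below is an illustration only: it assumes that both gateways can export a plain list of URLs, and the normalisation rule (ignore the scheme, the case of the host name and any trailing slash) is a simplifying assumption, not an agreed standard.

```python
from urllib.parse import urlsplit


def normalise(url):
    """Reduce a URL to a simple comparison key (an assumed, simplistic rule)."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    path = parts.path.rstrip("/")
    return host + path


def already_catalogued(own_urls, other_urls):
    """Return the URLs in own_urls that the other gateway also lists."""
    other_keys = {normalise(u) for u in other_urls}
    return [u for u in own_urls if normalise(u) in other_keys]
```

A large overlap would suggest that sharing records or cross-searching is a better use of effort than cataloguing the same resources independently.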

Avoid duplication within your team

Time can be wasted if members of your team are all trawling the same sources. Consider developing a team strategy for resource discovery, for example by:

  • giving people different subject responsibilities - so they are each hunting for resources in a different discipline
  • giving people different monitoring responsibilities - so they are each monitoring different sources (email lists/URLs/current awareness services etc.)

E X A M P L E

Example of a team dividing resource discovery responsibilities

SOSIG has divided responsibilities among the team of core staff and section editors as follows:

Section Editors: each have responsibility for a particular SUBJECT area
Central staff: have responsibility for trawling generic sources and for monitoring suggestions of sites sent in by users

See: http://www.sosig.ac.uk/contact.html


2. Find the right people for the job

As with recruiting staff for cataloguing, financial and political considerations will determine whom you can take on to do the job of resource discovery.

Cross reference
Subject indexing and classification, Distributed cataloguing

Volunteers?

Pros: may be cheap and plentiful

Cons: may be inconsistent and unreliable in their contribution, and it may be difficult to find volunteers with the subject expertise to select the high quality resources you want

Subject specialists?

Pros: may know of the best sources to use to discover relevant resources for your gateway and should be able to assess resources effectively, given their subject knowledge.

Cons: may be expensive, short of time, difficult to recruit and unable or unwilling to spend time cataloguing

Librarians/information professionals?

Pros: have training in selecting resources to meet the information needs of users and also may be able to catalogue resources in addition to selecting them, since they may have training in cataloguing/information retrieval issues.

Cons: may be expensive/difficult to recruit

R E M E M B E R
  • Internet skills can be taught more easily than subject expertise!
  • Librarians may be more willing and able to catalogue resources than to discover them

3. Provide training in resource discovery

The Internet is always growing and changing, so there are always new tips and hints to be learned in Internet resource discovery - training staff can improve skills and effectiveness. Training may include:

  • offering lists of sources for staff to use
  • offering demonstrations and hands-on work with different resource discovery tools
  • brainstorming ideas within the team to share resource discovery strategies

4. Set up support systems for resource discovery staff

The following are ideas for support systems for resource discovery staff:

  • create Web documents that list resource discovery strategies appropriate to your gateway
  • set up a mailing list for resource discovery staff so that the team can share knowledge of any useful new sources or techniques they find - and so they can talk about issues that arise
  • set up meetings for resource discovery staff to share stories of the successful and unsuccessful strategies they have tried

E X A M P L E

Example of a support system for gateway staff

  1. SOSIG has created a Web page for section editors, which lists possible resource discovery strategies: 'Finding Internet resources for SOSIG: strategies and sources'
  2. A mailing list has been set up for section editors to share news of any new, effective strategies they discover.
  3. Twice a year the section editors come together and compare experiences of the most effective and the most ineffective (!) resource discovery strategies.

5. Set up systems to encourage your user community to suggest resources

Why not let the resources come to you? Encourage your users to send you details of any sites which they think should be added to the gateway. You will need:

  1. to publicise an email address or Web form for submissions
  2. to publicise your scope and selection criteria

Cross reference
Quality selection

Tips
  • Web forms are great because they encourage users to generate the appropriate metadata - and they may have good ideas about keywords and descriptions
  • make sure your selection criteria are freely available, to try to discourage inappropriate resources from being submitted and to make it clear that not all submissions will be accepted
  • a quick thank-you message to users is good PR and can encourage them to submit again. If you are getting a lot of submissions, create a standard courtesy reply
  • publicise the fact that you welcome submissions from your user community. If you run an email list associated with your gateway (see Publicity and promotion), you can send out occasional reminders to subscribers
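A submission form is only useful if the records it produces are complete enough to assess. The sketch below shows one way a form handler might check a submission before passing it to gateway staff. The field names and the list of accepted URL schemes are assumptions made for this example; a real gateway would use the fields in its own metadata format.

```python
# Assumed field names; a real form would follow the gateway's own metadata format.
REQUIRED_FIELDS = ("url", "title", "description", "keywords")


def check_submission(form):
    """Return a list of problems with a submitted record; empty means acceptable."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not form.get(field, "").strip():
            problems.append("missing " + field)
    url = form.get("url", "").strip()
    if url and not url.startswith(("http://", "https://", "ftp://")):
        problems.append("url must start with http://, https:// or ftp://")
    return problems
```

A record that passes these checks still needs to be judged against the selection criteria by a member of staff; the check only filters out incomplete submissions.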

E X A M P L E

Examples of Web forms for users to submit resources


Resource Discovery Strategies for Staff
 

Gateway staff do the 'leg work' for their users: joining the lists, monitoring the sites and doing the searches that many users do not have the time to do, and filtering out items that are of poor quality or irrelevant.

It's easy to waste time when surfing the Internet - gateway staff need to develop efficient and effective strategies for locating high quality Internet resources. Some strategies are suggested below.

Resource discovery tools and methods

  1. Browsing strategies
  2. Mailing lists and their archives
  3. Distribution lists and current awareness services
  4. Search tools
  5. Newsgroups and discussion forums
  6. URL-minders and Web agents
  7. Non-Internet sources

1. Browsing strategies

One of the richest sources of resources will be existing Web pages - especially authoritative ones in your field which list related or recommended resources. Trawling these sites is the equivalent of citation pearl-growing or snowballing, traditionally done by researchers looking for references - if they find one useful resource, they will follow the references from that resource to find others.
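The snowballing approach can be partly supported by a small script that pulls the links out of a page you already trust, leaving staff to judge which of them are worth cataloguing. The following is a minimal sketch using Python's standard library; it assumes you have already saved or fetched the page's HTML, and the class and function names are invented for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the targets of <a href="..."> tags from one HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page's own address.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html_text, base_url):
    """Return every link target found in html_text, in page order."""
    collector = LinkCollector(base_url)
    collector.feed(html_text)
    return collector.links
```

The output is only a list of candidates; each link still has to be assessed against the gateway's scope and selection criteria.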

Trawling home pages of known experts

If you know of experts in your field, do a search to see if they have their own Web page. You may find that:

  1. They have published their work on the Web.
  2. They have collected a list of links (and, given their knowledge and expertise, they will be worth checking out!)

Bookmark any that look as if they may be developed over time, so that you can check them again in the future.

Trawling organisational home pages

Many organisations now have their own Web sites. These can be useful in two ways:

  1. They may include primary resources for you to catalogue.
  2. They may have lists of links selected by people with subject knowledge which you could trawl.

Consider which organisations are relevant to your audience and try to keep in touch with developments concerning them.

Tips

Take time to do a search for the most relevant organisational sites for you and organise them in a bookmark folder, so you can take a look at them periodically. Only bookmark the best - you won't have time to trawl too many.


If you are creating a gateway for an academic audience then it can pay to monitor university Web pages. Look for:

  • library Web sites - as many librarians are now building collections of Internet links
  • academic departments' Web sites - where lecturers and researchers may publish their work or may create lists of links

E X A M P L E

Examples of some starting points useful for academic gateways:


Trawling subject-based sites

Many sites have a section of 'links' which can be mined for new resources. The higher the quality of the original site, the better the related links are likely to be:

  • find the most important sites in your field and look at all the links they recommend
  • look for 'What's New' or 'Latest News' features on trusted sites
  • bookmark these link pages or 'What's New' pages to check regularly, or consider putting the URLs into a Web agent or URL-minder (see below) so that it can let you know when anything new is added
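The basic idea behind monitoring a page for changes is simple: keep a fingerprint of the page as it was at the last check and compare it with the page as it is now. The sketch below illustrates that idea only and is not a description of how any particular URL-minder service works. It assumes the page content has already been fetched by some other means, and the function names are invented for the example.

```python
import hashlib


def fingerprint(page_bytes):
    """Return a digest representing the current content of a page."""
    return hashlib.sha256(page_bytes).hexdigest()


def changed_pages(previous, current):
    """Compare stored fingerprints with freshly fetched content.

    previous: dict of url -> fingerprint saved at the last check.
    current:  dict of url -> page content (bytes) fetched now.
    Returns the sorted URLs whose content is new or different.
    """
    return sorted(
        url for url, body in current.items()
        if previous.get(url) != fingerprint(body)
    )
```

A comparison of this kind reports any change at all, including trivial ones such as an updated date, so staff would still need to look at the page to see whether new links have been added.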
E X A M