Viser: A virtual service provider for displaying selected OLAC metadata

Date issued: 2003-07-29
Status of document: Draft Implementation Note. This is only a preliminary draft that is still under development; it has not yet been presented to the whole community for review.
This version: http://www.language-archives.org/NOTE/viser-20030729.html
Latest version: http://www.language-archives.org/NOTE/viser.html
Previous version: http://www.language-archives.org/NOTE/viser-20021102.html
Abstract:

Documents a service named Viser, hosted on the OLAC web site, that allows language resource sites to post services based on OLAC metadata without having to implement a conventional service provider. Viser works in conjunction with the query facility of the OLAC Aggregator to selectively harvest OLAC metadata and then provide HTML displays of records that match a query. The service returns an XML document with processing instructions to invoke an XSL stylesheet and set certain display parameters; the actual rendering to HTML happens in the client's browser.

Editors: Gary Simons, SIL International (mailto:gary_simons@sil.org)
Changes since previous version:

Updated to reflect changes from version 0.4 of the OLAC metadata standard to version 1.0.

Copyright © 2003 Gary Simons (SIL International). This material may be distributed and repurposed subject to the terms and conditions set forth in the Creative Commons Attribution-ShareAlike 2.5 License.

Table of contents

  1. Introduction
  2. The CGI interface
  3. The result format
  4. Using Viser to provide a service
  5. Implementation
References

1. Introduction

A key feature of the openness of the Open Archives Initiative protocol for metadata harvesting [OAI-PMH], on which OLAC is based, is that any site on the web is free to become a service provider. That is, it may harvest metadata from the participating data providers and offer a service based on the harvested metadata. In practice, however, implementing and operating a complete harvester is complicated, so few sites rise to the challenge of becoming a service provider.

The Open Language Archives Community is seeking to change this by offering services that make it easy for would-be service providers to selectively harvest and present the metadata records that are relevant for their area of interest. The central service in this respect is the query facility of the OLAC Aggregator [OLACA-Query]. It provides a CGI interface through which a would-be service provider may query the complete database of harvested OLAC metadata records. The result is an XML document containing just the metadata records that match the query.

Viser, the virtual service provider, takes this one step further. It offers a CGI interface that not only processes a query, but also returns the results in such a way that they can be rendered in HTML on the end user's browser. Viser was developed as the counterpart to Vida, OLAC's virtual data provider [Vida], which made it possible for a language resource provider to become an OLAC (and OAI) data provider without implementing the OAI protocol. In a similar way, Viser makes it possible for a language resource site to become an OLAC service provider without implementing the OAI protocol.

This document describes Viser and illustrates how it can be used.

2. The CGI interface

Viser is a process with a CGI interface. If it is invoked without any arguments, it simply returns the page of documentation you are currently reading. It is located at the following URL:

http://www.language-archives.org/viser

Viser, like [OLACA-Query] on which it is based, uses the OAI flow control mechanism [OAI-FC] to deal with queries that generate multiple pages of results. This means the arguments that are valid for the initial request to Viser are different from the arguments on follow-up requests to get the second and following pages of a multi-page result.

The interface for an initial query to Viser supports the following five arguments:

elements

A required argument that specifies the number of metadata elements that are referred to in the selection criterion.

sql

A required argument that specifies the selection criterion expressed as the content of a WHERE clause in MySQL syntax.

count

An optional argument that specifies the number of metadata records to return in a single response. If this argument is not specified, a default value of 20 is assumed. Viser enforces a limit of 500,000 bytes for the length of a response. If this limit is exceeded, one must specify a lower value for this argument.

title

An optional argument that specifies the title for the HTML page of results. If this argument is not specified, no title value will be given to the stylesheet. In the default stylesheet, this generates a title of Untitled Query Results.

xsl

An optional argument that specifies the URL of the XSL stylesheet to use for formatting the results on the end user's browser. If this argument is not specified, the following default stylesheet is used:

http://www.language-archives.org/tools/viser/basic_service.xsl

The first three arguments above are passed directly to the OLACA query facility. See [OLACA-Query] for documentation on how to use these arguments.
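As a minimal sketch of how an initial request URL could be assembled from the five arguments above, the following Python helper URL-encodes the arguments and omits any optional ones that are not supplied, so that Viser applies its defaults. The function name is an assumption made for illustration, and the WHERE clause in the usage line is a placeholder: the actual table and column names must be taken from [OLACA-Query].

```python
from urllib.parse import urlencode, urlsplit, parse_qs

VISER_URL = "http://www.language-archives.org/viser"


def initial_query_url(elements, sql, count=None, title=None, xsl=None):
    """Build the URL for an initial Viser request (hypothetical helper).

    Optional arguments left as None are omitted from the query string.
    """
    params = {"elements": elements, "sql": sql}
    optional = {"count": count, "title": title, "xsl": xsl}
    params.update({k: v for k, v in optional.items() if v is not None})
    # urlencode percent-encodes quotes, spaces, and other reserved characters
    return VISER_URL + "?" + urlencode(params)


# Placeholder selection criterion; NOT the real OLACA column names.
PLACEHOLDER_SQL = "e1.SomeColumn = 'some value'"
url = initial_query_url(1, PLACEHOLDER_SQL, title="Sample resources")
print(url)
```

Encoding the query with a library routine rather than by hand matters here because the sql argument will normally contain quotes and spaces.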

The arguments for the follow-up requests to get the second and following pages of a multi-page result are as follows:

resumptionToken

A required argument that specifies the flow control token [OAI-FC] returned in the <resumptionToken> element of the response to the previous Viser request. It instructs OLACA to pick up where it left off in returning the results of the original query.

start

A required argument that specifies the sequence number of the first record to be returned on the resulting page. For instance, if the initial request returned 20 records, then the request to retrieve the second page of results should specify a start value of 21.

title

An optional argument that specifies the title for the HTML page of results. If this argument is not specified, no title value will be given to the stylesheet. In the default stylesheet, this generates a title of Untitled Query Results.

xsl

An optional argument that specifies the URL of the XSL stylesheet to use for formatting the results on the end user's browser. If this argument is not specified, the following default stylesheet is used:

http://www.language-archives.org/tools/viser/basic_service.xsl
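The follow-up request can be sketched in the same spirit. The Python helper below is only an illustration: its name and parameters are assumptions, and the token value in the usage line is made up. It derives the start value by adding the number of records already returned to the previous start value, which reproduces the example above (20 records on the first page gives a start of 21).

```python
from urllib.parse import urlencode

VISER_URL = "http://www.language-archives.org/viser"


def next_page_url(resumption_token, previous_start, records_returned,
                  title=None, xsl=None):
    """Build the URL for a follow-up Viser request (hypothetical helper)."""
    # The next page begins right after the last record already shown.
    start = previous_start + records_returned
    params = {"resumptionToken": resumption_token, "start": start}
    optional = {"title": title, "xsl": xsl}
    params.update({k: v for k, v in optional.items() if v is not None})
    return VISER_URL + "?" + urlencode(params)


# "abc123" is an invented token; a real one comes from the previous response.
print(next_page_url("abc123", previous_start=1, records_returned=20))
```

Note that title and xsl are not remembered between requests, so a caller that wants a consistent look across pages would pass them again on each follow-up request.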

3. The result format

The result returned to the end user's browser by a Viser request is an XML document. It consists of the ListRecords response [OAI-LR] returned by the corresponding OLACA request [OLACA-Query] with XML processing instructions added at the beginning to pass needed parameters to the end user's browser. An XSL stylesheet reads the value of title, for instance, by executing <xsl:value-of select="/processing-instruction('title')"/>. The result format is thus as follows:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="value-of-xsl-or-default"?>
<?title value-of-title?>
<?start value-of-start?>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
   <responseDate> ... </responseDate>
   <request verb="Query" ...> ... </request>
   <ListRecords>
      <!-- A <record> element for each returned record -->
      <!-- A <resumptionToken> if there are more results -->
   </ListRecords>
</OAI-PMH>

The individual metadata records are returned in OLAC format [OLAC-MS]. They are in order of their OAI identifiers. If more records match the selection criterion than the number indicated by the count parameter, a resumption token is returned at the end of the response as described in [OAI-FC].
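For a client other than a browser, the same parameters can be read back out of the response. The following Python sketch, using only the standard library, collects the top-level processing instructions and the resumption token from a document shaped like the format above. The sample document is invented for illustration (its title, start value, and token are assumptions), and the function name is hypothetical.

```python
from xml.dom import minidom

OAI_NS = "http://www.openarchives.org/OAI/2.0/"

# Invented sample that follows the result format; records are omitted.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="http://www.language-archives.org/tools/viser/basic_service.xsl"?>
<?title Sample resources?>
<?start 1?>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
   <ListRecords>
      <resumptionToken>abc123</resumptionToken>
   </ListRecords>
</OAI-PMH>
"""


def read_viser_response(xml_bytes):
    """Return (processing-instruction dict, resumption token or None)."""
    doc = minidom.parseString(xml_bytes)
    params = {}
    # Processing instructions before the root are children of the document.
    for node in doc.childNodes:
        if node.nodeType == node.PROCESSING_INSTRUCTION_NODE:
            params[node.target] = node.data
    token = None
    tokens = doc.getElementsByTagNameNS(OAI_NS, "resumptionToken")
    if tokens and tokens[0].firstChild is not None:
        # An empty or whitespace-only element is treated as "no more pages".
        token = tokens[0].firstChild.data.strip() or None
    return params, token


params, token = read_viser_response(SAMPLE)
print(params["title"], params["start"], token)
```

The sketch uses minidom rather than ElementTree because ElementTree's default parser discards processing instructions that appear before the root element.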

The <?xml-stylesheet?> processing instruction invokes client-side XSL processing on the user's browser. Thus, Viser works only with browsers that implement the [XSLT] standard. It has been verified to work with current versions of Internet Explorer and Opera.

4. Using Viser to provide a service

To create a service, a language resource site needs only to place a link to Viser on its own web page. This involves following the specification in [OLACA-Query] to formulate the query and supplying a title for the page of language resources that is returned. For instance, here are examples of URLs that create two simple services:

These URLs exemplify the kind of service that can be created without any knowledge of XSL. The default stylesheet displays the title, creators, and publication date for all records that match the selection criterion. In addition, the OAI identifier for the record is formatted as a link to the lookup service at:

http://www.language-archives.org/tools/lookup.php4?identifier=

This results in a page that gives an HTML representation of all the information in the OLAC metadata record. If there are additional records that match the selection criterion, a link at the bottom of the page labeled "More resources ..." makes another call to Viser with the resumption token that will retrieve the next batch of records.
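The link to the lookup service amounts to appending the record's OAI identifier to the base URL given above. A minimal Python sketch follows; the function name and the sample identifier are assumptions made for illustration, and percent-encoding the identifier is a precaution of this sketch rather than documented behavior of the default stylesheet.

```python
from urllib.parse import quote

LOOKUP_URL = "http://www.language-archives.org/tools/lookup.php4?identifier="


def lookup_link(oai_identifier):
    """Return the lookup-service URL for one record (hypothetical helper)."""
    # Colons are left as-is; other reserved characters are percent-encoded.
    return LOOKUP_URL + quote(oai_identifier, safe=":")


# "oai:example.org:item-1" is an invented identifier.
print(lookup_link("oai:example.org:item-1"))
```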

When a site has XSL expertise, it may develop its own stylesheet and use the xsl argument to pass it to Viser. A customized stylesheet could be used to give the service a look and feel that is consistent with the rest of the site. Or a customized stylesheet could take advantage of more of the information in the metadata records and provide greater functionality with respect to the special interest of the site. One may begin developing a customized stylesheet by downloading the default stylesheet from:

http://www.language-archives.org/tools/viser/basic_service.xsl

5. Implementation

Viser is implemented as a PHP4 script [PHP]. The script may be downloaded at the following URL:

http://www.language-archives.org/tools/viser/viser.php4.txt

This version of the script returns an XML document as described in section 3, The result format, so that the rendering to HTML can be done by the end user's browser. This strategy of client-side XSL processing was required in order to prevent overloading the server that is hosting the OLAC site. There is another version of the script at the following URL that includes code to perform the XSL transformation on the server. The advantage of server-side transformation is that any browser can render the result; it does not depend on the end user having a browser that can perform the XSL transformation:

http://www.language-archives.org/tools/viser/viser_transform.php4.txt

This version is not enabled for execution on the OLAC site, but could be downloaded and configured to run on another site. It may also serve as a source of ideas for sites that want to implement a customized service provider that is based on selective harvesting through the OLAC Aggregator.


References

[OAI-FC] "Flow Control," section 3.5 of The Open Archives Initiative Protocol for Metadata Harvesting, Version 2.0 (2002-06-14).
<http://www.openarchives.org/OAI/2.0/openarchivesprotocol.htm#FlowControl>
[OAI-LR] "ListRecords," section 4.5 of The Open Archives Initiative Protocol for Metadata Harvesting, Version 2.0 (2002-06-14).
<http://www.openarchives.org/OAI/2.0/openarchivesprotocol.htm#ListRecords>
[OAI-PMH] The Open Archives Initiative Protocol for Metadata Harvesting, Version 2.0 (2002-06-14).
<http://www.openarchives.org/OAI/2.0/openarchivesprotocol.htm>
[OLACA-Query] A query facility for selective harvesting of OLAC metadata.
<http://www.language-archives.org/NOTE/query.html>
[PHP] PHP Manual.
<http://www.php.net/manual/en/>
[Vida] OLAC Virtual Data Provider.
<http://www.language-archives.org/vida>
[XSLT] XSL Transformations (XSLT) Version 1.0. W3C Recommendation 16 November 1999.
<http://www.w3.org/TR/xslt>