The Open Language Archives Community

A Symposium at the 76th Annual Meeting of the Linguistic Society of America, San Francisco, 3-6 January 2002

Convened by Steven Bird (University of Pennsylvania) and Gary Simons (SIL International)

Materials from the Launch

  1. [ pdf | ppt ] Gary Simons: The Seven Pillars of Open Language Archiving
  2. [ pdf | ppt ] Helen Dry and Anthony Aristar: OLAC and Linguist List
  3. [ pdf | ppt ] Megan Crowhurst: Web Based Archiving as a Tool for Language Preservation and Maintenance
  4. [ pdf | ppt ] Chu-Ren Huang: Language Archives and Linguistic Anchoring of Digital Archives
  5. [ pdf | ppt ] Mark Liberman: Legal, Ethical and Policy Issues Concerning the Recording and Publication of Primary Language Materials
  6. [ pdf | ppt ] Gary Holton: Creating an OLAC Data Provider at ANLC
  7. [ pdf | ppt ] Steven Bird: Getting Involved in OLAC

2-page handout


The goals of the symposium were to disseminate the OLAC vision to the linguistics community, and to encourage linguists to archive and publish primary language documentation using archival formats.

This meeting marked the public release of the OLAC specifications. Upon release, these specifications were fixed for a one-year pilot period, during which members of the wider community have been encouraged to participate by implementing the specifications for their own collections of material.

Presentations addressed the following questions:

  1. What is the Open Language Archives Community?
  2. Why is language archiving important?
  3. What does it take to participate in OLAC?

Discussion time was used to clarify the OLAC model and to identify and address any concerns raised by the audience. Substantive feedback will help to guide the future evolution of OLAC.


1. What is the Open Language Archives Community?

The Seven Pillars of Open Language Archiving (Gary Simons, SIL International). Digital archiving of language documentation and description on the World-Wide Web holds the promise of unparalleled access to language information. But if it is not done well, it also offers the specter of frustration and chaos on an unparalleled scale. This talk presents an executive summary of our vision for the kind of infrastructure that would unlock the promise. Special focus is given to the seven pillars on which such an infrastructure would be erected: DATA, TOOLS, ADVICE, GATEWAY, METADATA, REVIEW, and STANDARDS.

OLAC and Linguist List (Helen Aristar-Dry, Linguist List and Eastern Michigan University). Over the past 11 years, the Linguist List has become the primary source of information for the linguistics community, reaching out to over 15,000 subscribers worldwide, and having four complete mirror sites. The Linguist List will be augmenting its service by providing the primary entry point for OLAC, and permitting linguists to browse distributed language resources at a single place. This talk will include a demonstration of a new Linguist List "service provider", and also report progress on a new NSF-sponsored project to create a "Showroom of Best Practice" for language documentation.

2. Why is language archiving important?

Web based archiving as a tool for language preservation and maintenance (Megan Crowhurst, UT Austin and LSA Committee on Endangered Languages and their Preservation). Linguists and others working on endangered language varieties face two challenges: first, documenting them, and second, working to preserve, maintain, or revitalise them, where desired. Even when adequate documentary materials exist, limitations on the distribution of traditionally published print media restrict the availability of these materials to those who need them to prepare materials appropriate for language maintenance projects. The talk discusses some of the ways in which web based archives for language documentation materials can be used as a tool for language maintenance efforts.

Language Archives and Linguistic Anchoring of Digital Archives (Chu-Ren Huang, Academia Sinica, Taiwan). Language archiving is not purely for linguistic purposes. We believe that language archives are also crucial in the context of a broader digital archives project, since linguistic description and interpretation, like temporal and geographical description and interpretation, underlie all digital archive items. Thus, ideally, when archiving digital knowledge, each knowledge item should be temporally, geographically, and linguistically anchored. And any linguistic anchoring is impossible without comprehensive language archives. A second topic to be covered by this talk is a regional perspective on Chinese and Austronesian language archives.

Legal, Ethical, and Policy Issues Concerning the Recording and Publication of Primary Language Materials (Mark Liberman, University of Pennsylvania). Language documentation involves recording, analyzing, archiving and publishing a wide variety of materials, including audio and video recordings, transcripts, linguistic and cultural annotations and commentaries, dictionaries, grammars, and instructional works. Because of advances in digital media, in networking, and in computing technology, language documentation is becoming both easier and more useful. In addition, there has been much recent interest in documentation of endangered languages. This talk will review the legal, ethical and regulatory context of language documentation projects in the United States.

3. What does it take to participate in OLAC?

Creating an OLAC Data Provider at ANLC (Gary Holton, Alaska Native Language Center, UA Fairbanks). The Alaska Native Language Center is a major center for the study of Eskimo and Northern Athabaskan languages. The ANLC maintains an archival collection of more than 10,000 items documenting the twenty Alaska native languages, including virtually everything written in or about these languages. ANLC would like to improve the accessibility of its collection, and is making catalog data available in the OLAC metadata format. This talk will report experience with OLAC from the perspective of the ANLC archive. It will also describe the computational and archive metadata issues that a linguist/archivist faces in participating in OLAC.

Concrete Steps for Linguists, Archivists and Funding Agencies (Steven Bird, University of Pennsylvania). This talk will describe a variety of ways for the wider community to participate in OLAC. For individuals and institutions wanting to disseminate their resources, three methods for contributing metadata will be presented. One of these, the OLAC Repository Editor (ORE), will be demonstrated live, to show how easy it is for linguists to document the resources they create. The talk will also describe the OLAC Process, a method for identifying best practices for the digital storage of linguistic documentation.

Steven Bird and Gary Simons