OLAC Type Vocabulary

Date issued: 2001-03-20
Status of document: Withdrawn Recommendation.
This version: http://www.language-archives.org/REC/type-20010320.html
Latest version: http://www.language-archives.org/REC/type.html
Previous version: None.
Abstract:

This document specifies the controlled vocabulary of language resource types used by OLAC. The type vocabulary describes the nature of the content of a resource.

Editors: Steven Bird, University of Pennsylvania (mailto:sb@ldc.upenn.edu); Gary Simons, SIL International (mailto:gary_simons@sil.org)
Copyright © Steven Bird, University of Pennsylvania (mailto:sb@ldc.upenn.edu); Gary Simons, SIL International (mailto:gary_simons@sil.org). This material may be distributed and repurposed subject to the terms and conditions set forth in the Creative Commons Attribution-ShareAlike 2.5 License.

Table of contents

  1. Introduction
  2. Data Type
  3. Genre Type
References

1. Introduction

Key points: two-level systems, multiple categories for a single resource, parallelism of the transcription and annotation subcategories.

2. Data Type

Each term of the controlled vocabulary is described in one of the following subsections. The heading gives the encoded value for the term that is to be used as the value of the code attribute of the fooBar metadata element [OLAC-MS]. Under the heading, the term is described in four ways. Name gives a descriptive label for the term. Definition is a one-line summary of what the term means. Comments offers more details on what the term represents. Examples may also be given to illustrate how the term is meant to be applied.

A further label, Subterms, appears when the term permits more specific refinements. In such cases, either the generic (top-level) term or one of its more specific refinements may be chosen.
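
As a rough illustration of the two-level scheme described above, the following Python sketch splits a code value into its top-level term and optional refinement, then checks it against a small excerpt of the Data Type vocabulary. The names `DATA_TYPES`, `split_code`, and `is_valid_code` are hypothetical and not part of any OLAC specification, and only a handful of the terms defined in this document are listed.

```python
# Illustrative excerpt of the vocabulary: top-level term -> permitted subterms.
# Hypothetical structure; only some of the terms in this document are listed.
DATA_TYPES = {
    "transcription": {"orthographic", "phonetic", "prosodic"},
    "annotation": {"orthographic", "phonetic", "prosodic"},
    "description": {"grammatical", "phonological"},
    "lexicon": {"dictionary", "wordlist"},
}


def split_code(code):
    """Split 'term/subterm' into (term, subterm); subterm is None if absent."""
    term, sep, subterm = code.partition("/")
    return term, (subterm if sep else None)


def is_valid_code(code):
    """Accept a generic top-level term or one of its listed refinements."""
    term, subterm = split_code(code)
    if term not in DATA_TYPES:
        return False
    return subterm is None or subterm in DATA_TYPES[term]


# A single resource may carry several type codes at once.
resource_types = ["transcription/phonetic", "lexicon"]
all_valid = all(is_valid_code(code) for code in resource_types)
```

Keeping the generic term valid on its own reflects the rule that either the top-level term or a refinement may be chosen.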

transcription

Name: Transcription
Definition: The resource includes a transcription of a linguistic event.
Subterms

transcription/orthographic

Name: Orthographic transcription
Definition: The resource includes orthographic transcription.

transcription/phonetic

Name: Phonetic transcription
Definition: The resource includes phonetic transcription.
Comments

Phonetic transcription may be narrow or broad, and will typically use the International Phonetic Alphabet [IPA] in a standard encoding (e.g. [Unicode-IPA], [SAMPA]). Phonological transcriptions are also classified here.

transcription/prosodic

Name: Prosodic transcription
Definition: The resource includes prosodic transcription.
Comments

A prosodic transcription is a symbolic record of intonation, stress, tone or other suprasegmental features that is expressed independently of regular phonetic transcription.

transcription/morphological

Name: Morphological transcription
Definition: The resource includes morphological transcription.

transcription/gestural

Name: Gestural transcription
Definition: The resource includes gestural transcription.

transcription/part-of-speech

Name: Part-of-speech transcription
Definition: The resource includes part-of-speech tags.

transcription/syntactic

Name: Syntactic transcription
Definition: The resource includes syntactic transcription.

transcription/discourse

Name: Discourse transcription
Definition: The resource includes discourse transcription.

transcription/musical

Name: Musical transcription
Definition: The resource includes musical transcription.

annotation

Name: Annotation
Definition: The resource includes information which annotates some other linguistic record.
Comments

A linguistic annotation is defined as structured linguistic information that is explicitly aligned to some spatial and/or temporal extent of some other linguistic record.

Subterms

annotation/orthographic

Name: Orthographic annotation
Definition: The resource includes orthographic annotation.

annotation/phonetic

Name: Phonetic annotation
Definition: The resource includes phonetic annotation.
Comments

An example of a phonetic annotation is the TIMIT database, in which each element of phonetic transcription is associated with a range of samples in a digital audio file [TIMIT]. Phonological annotations are also classified here.
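
To make the idea of alignment concrete, the sketch below represents a phonetic annotation as labelled segments tied to sample ranges of an audio file. The `Segment` class, the `parse_segments` and `duration_seconds` helpers, the sample values, and the 16 kHz sampling rate are all illustrative assumptions. The "start end label" line layout is likewise assumed for the example and should be checked against the documentation of the corpus in question.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    start: int   # first sample of the aligned extent
    end: int     # sample just past the aligned extent
    label: str   # phonetic symbol for this extent


def parse_segments(text):
    """Parse lines of the assumed form 'start end label' into Segment objects."""
    segments = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        start, end, label = line.split()
        segments.append(Segment(int(start), int(end), label))
    return segments


def duration_seconds(segment, sample_rate=16000):
    """Convert a sample range to seconds; the default rate is an assumption."""
    return (segment.end - segment.start) / sample_rate


# Invented sample data, not taken from any corpus.
example = """\
0 3200 h#
3200 4800 sh
4800 8000 iy
"""
segments = parse_segments(example)
```

Each segment carries both the transcription element and the extent it is aligned to, which is what distinguishes an annotation from a plain transcription.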

annotation/prosodic

Name: Prosodic annotation
Definition: The resource includes prosodic annotation.

annotation/morphological

Name: Morphological annotation
Definition: The resource includes morphological annotation.
Comments

A morphological annotation is a morphological transcription where the component morphemes are aligned with some other linguistic record, such as an orthographic transcription or a speech signal. An example of morphological annotation is interlinear text with aligned morphemic glosses.
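
The interlinear case can be sketched as morphemes paired with glosses, word by word. This is a minimal illustration only: the `align_glosses` and `render` helpers, the English example words, and the gloss labels `PL` and `PST` are assumptions chosen for readability, not a format prescribed by OLAC.

```python
def align_glosses(morphemes, glosses):
    """Pair each morpheme with its gloss, word by word."""
    if len(morphemes) != len(glosses):
        raise ValueError("word counts differ")
    aligned = []
    for word_morphs, word_glosses in zip(morphemes, glosses):
        if len(word_morphs) != len(word_glosses):
            raise ValueError("morpheme and gloss counts differ")
        aligned.append(list(zip(word_morphs, word_glosses)))
    return aligned


def render(aligned):
    """Return the morpheme line and the gloss line, hyphen-joined per word."""
    morph_line = " ".join("-".join(m for m, _ in word) for word in aligned)
    gloss_line = " ".join("-".join(g for _, g in word) for word in aligned)
    return morph_line, gloss_line


# Invented example: one list of morphemes and one list of glosses per word.
morphemes = [["dog", "s"], ["bark", "ed"]]
glosses = [["dog", "PL"], ["bark", "PST"]]
aligned = align_glosses(morphemes, glosses)
```

The length checks enforce the alignment itself: a gloss line that does not match the morpheme line one-to-one is rejected rather than silently truncated.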

annotation/gestural

Name: Gestural annotation
Definition: The resource includes gestural annotation.

annotation/part-of-speech

Name: Part-of-speech annotation
Definition: The resource includes aligned part-of-speech tags.

annotation/syntactic

Name: Syntactic annotation
Definition: The resource includes aligned syntactic transcription.

annotation/discourse

Name: Discourse annotation
Definition: The resource includes aligned discourse transcription.

annotation/musical

Name: Musical annotation
Definition: The resource includes musical annotation.
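
The transcription and annotation terms defined above share the same nine refinements. The following sketch expresses that parallelism in Python; the `counterpart` helper and the constant names are hypothetical and serve only to show how one code maps onto the other.

```python
# The nine refinements shared by the transcription and annotation terms.
SUBTERMS = (
    "orthographic", "phonetic", "prosodic", "morphological", "gestural",
    "part-of-speech", "syntactic", "discourse", "musical",
)

TRANSCRIPTION_CODES = [f"transcription/{s}" for s in SUBTERMS]
ANNOTATION_CODES = [f"annotation/{s}" for s in SUBTERMS]


def counterpart(code):
    """Map a transcription code to its annotation twin, and vice versa."""
    term, _, subterm = code.partition("/")
    other = {"transcription": "annotation", "annotation": "transcription"}
    if term not in other:
        raise ValueError(f"no parallel term for {code!r}")
    return other[term] + ("/" + subterm if subterm else "")
```

For instance, `counterpart("transcription/phonetic")` yields `"annotation/phonetic"`, while a code such as `"lexicon/wordlist"` raises an error because it has no parallel term.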

description

Name: Description
Definition: The resource includes linguistic description.
Comments

A description is any description or analysis of a language. Unlike a transcription or an annotation, the structure of a description is independent of the structure of the linguistic events that it describes.

Subterms

description/grammatical

Name: Grammatical description
Definition: The resource includes grammatical description.

description/phonological

Name: Phonological description
Definition: The resource includes phonological description.

description/orthographic

Name: Orthographic description
Definition: The resource includes documentation of a writing system.

description/paradigms

Name: Linguistic paradigms
Definition: The resource includes linguistic paradigms.
Comments

A paradigm is a tabulation of linguistic forms designed to illustrate one or more systematic contrasts.

description/pedagogical

Name: Pedagogical description
Definition: The resource includes pedagogical description.
Comments

A pedagogical description is a style of presentation intended for use in teaching people to use the language.

description/dialectal

Name: Dialectal description
Definition: The resource includes dialectal description.

description/comparative

Name: Comparative description
Definition: The resource includes comparative or typological description.

lexicon

Name: Lexicon
Definition: The resource includes a systematic listing of lexical items.
Subterms

lexicon/dictionary

Name: Dictionary
Definition: The resource includes a dictionary.
Comments

This includes any resource that lists words or morphemes and defines them. It contrasts with a word list in that the definitions are complex (rather than being one-word equivalents) and the entries may include other information such as part of speech, related words, and illustrative sentences.

lexicon/wordlist

Name: Word list
Definition: The resource includes a word list.
Comments

A word list is a list of reference words in a major language for which the nearest equivalent word in a target language has been elicited (for instance, the Swadesh 100-word list).

lexicon/wordnet

Name: WordNet
Definition: The resource includes a semantic wordnet.
Comments

Whereas a dictionary documents the meanings of words by means of definitions, a wordnet documents meanings by building a web of semantic relationships [WordNet].
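
The contrast can be sketched with two toy data structures: a dictionary maps a word to a definition, whereas a wordnet links words to one another. The entries, the single "hypernym" relation, and the `hypernym_chain` helper below are invented for illustration and do not reflect the contents or interface of any actual wordnet.

```python
# Dictionary style: each word maps to a prose definition (invented entry).
DICTIONARY = {"dog": "a domesticated carnivorous mammal kept as a pet"}

# Wordnet style: each word points to a more general word (its hypernym).
# Hypothetical miniature network with a single relation type.
HYPERNYM = {"dog": "canine", "canine": "mammal", "mammal": "animal"}


def hypernym_chain(word):
    """Follow hypernym links upward and return the words encountered."""
    chain = []
    seen = {word}
    while word in HYPERNYM:
        word = HYPERNYM[word]
        if word in seen:  # guard against cycles in the network
            break
        seen.add(word)
        chain.append(word)
    return chain
```

Here the meaning of "dog" is conveyed by its position in the network, as the chain of more general terms returned by `hypernym_chain("dog")`, rather than by a definition string.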

lexicon/thesaurus

Name: Thesaurus
Definition: The resource includes a thesaurus.
Comments

A thesaurus is a list of words or concepts arranged according to sense.

lexicon/terminology

Name: Terminology
Definition: The resource includes a terminological lexicon.
Comments

A terminological lexicon is a glossary of domain-specific terms. Examples include technical terminology, kinship terms, color terms, and acronyms.

lexicon/proper-names

Name: Name Dictionary
Definition: The resource includes proper names.

lexicon/bilingual

Name: Bilingual Lexicon
Definition: The resource includes definitions in another language.

lexicon/etymological

Name: Etymological Lexicon
Definition: The lexicon contains etymological information.

lexicon/phonetic

Name: Phonetic Lexicon
Definition: The lexicon contains phonetic information, such as pronunciation, phonology, stress, and rhymes.

lexicon/frequency

Name: Frequency Lexicon
Definition: The lexicon contains frequency information.

lexicon/analytical

Name: Analytical Lexicon
Definition: The lexicon contains analytical information.
Comments

Analytical information includes such things as morphological derivation, grammatically related forms, and argument structure.

3. Genre Type

narrative

Name: Narrative
Definition:
Subterms

narrative/traditional

Name: Traditional Narrative
Definition:

narrative/personal

Name: Personal Narrative
Definition:

oratory/political

Name: Political Oratory
Definition:

oratory/religious

Name: Religious Oratory
Definition:

dialogue/conversation

Name: Conversation
Definition:

dialogue/counsel

Name: Counsel
Definition:

dialogue/interview

Name: Interview
Definition:

dialogue/elicitation

Name: Elicitation
Definition:

singing/individual

Name: Individual Singing
Definition:

singing/group

Name: Group Singing
Definition:

To do

Explain that typical resources will contain multiple types. Do we want to include genre?


References

[OLAC-MS] OLAC Metadata Set.
<http://www.language-archives.org/OLAC/olacms.html>
[SAMPA] Speech Assessment Methods Phonetic Alphabet.
<http://www.phon.ucl.ac.uk/home/sampa/home.htm>
[TIMIT] TIMIT Acoustic-Phonetic Continuous Speech Corpus.
<http://www.ldc.upenn.edu/Catalog/LDC93S1.html>
[Unicode-IPA] Unicode IPA Extensions.
<http://www.unicode.org/unicode/uni2book/ch07.pdf>
[WordNet] WordNet - a Lexical Database for English.
<http://www.cogsci.princeton.edu/~wn/>