|Status of document:||Withdrawn Recommendation.|
This document specifies the controlled vocabulary of language resource types used by OLAC. The linguistic data type vocabulary describes the nature of the content of a resource from a linguistic standpoint.
Helen Aristar Dry (email@example.com)
|Changes since previous version:||
Adds: transcription/phonemic, transcription/kinesic, annotation/translation, annotation/phonological, annotation/semantic, annotation/eye-gaze, annotation/facial-expression, description/phonological, description/kinesic, description/pedagogical, description/comparative, dataset/kinesic.
Deletes: transcription/eye-gaze, transcription/facial-expression, transcription/translation, transcription/phonological, transcription/semantic, description/eye-gaze, description/facial-expression, dataset/eye-gaze, dataset/facial-expression. Also deletes the Genre Type section.
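The change list above can be applied mechanically to metadata written against the previous version. The following is a minimal sketch, not part of the OLAC specification: the function names `stale_codes` and `apply_changes` are hypothetical, and the two sets contain only the codes listed under "Adds" and "Deletes" in this document.

```python
from typing import Iterable, List, Set

# Codes added in this version, copied from the "Adds" list of this document.
ADDED = frozenset({
    "transcription/phonemic", "transcription/kinesic",
    "annotation/translation", "annotation/phonological",
    "annotation/semantic", "annotation/eye-gaze",
    "annotation/facial-expression",
    "description/phonological", "description/kinesic",
    "description/pedagogical", "description/comparative",
    "dataset/kinesic",
})

# Codes removed in this version, copied from the "Deletes" list.
DELETED = frozenset({
    "transcription/eye-gaze", "transcription/facial-expression",
    "transcription/translation", "transcription/phonological",
    "transcription/semantic",
    "description/eye-gaze", "description/facial-expression",
    "dataset/eye-gaze", "dataset/facial-expression",
})


def stale_codes(codes: Iterable[str]) -> List[str]:
    """Return the codes that this version deleted (hypothetical helper)."""
    return sorted(c for c in codes if c in DELETED)


def apply_changes(previous: Set[str]) -> Set[str]:
    """Derive this version's code set from the previous one (hypothetical helper)."""
    return (set(previous) - DELETED) | ADDED
```

A record that still carries a deleted code such as transcription/eye-gaze would be reported by `stale_codes`. Choosing a replacement code is left to the metadata author, since this document does not define a mapping from deleted codes to added ones.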
Copyright © Heidi Johnson (University of Texas at Austin), Helen Aristar Dry (Eastern Michigan University). This material may be distributed and repurposed subject to the terms and conditions set forth in the Creative Commons Attribution-ShareAlike 2.5 License.
Key points: two-level systems, multiple categories for a single resource, parallelism of the transcription and annotation subcategories.
Each term of the controlled vocabulary is described in one of the following subsections. The heading gives the encoded value for the term that is to be used as the value of the code attribute of the Type.linguistic metadata element [OLAC-MS]. Under the heading, the term is described in four ways. Name gives a descriptive label for the term. Definition is a one-line summary of what the term means. Comments offers more details on what the term represents. Examples may also be given to illustrate how the term is meant to be applied.
A further label, Subterms, appears when the term permits more specific refinements. In such cases, either the generic (top-level) term or one of its more specific refinements may be chosen.
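The two-level scheme can be illustrated with a small validity check. This is a sketch under stated assumptions rather than a normative implementation: the `SUBTERMS` table is partial and holds only the refinements named in this document's change list, and the helper names `is_valid_code` and `invalid_types` are hypothetical.

```python
from typing import Dict, Iterable, List, Set

# Partial table (assumption): only the top-level terms and refinements that
# appear in the change list of this document. The full vocabulary is larger.
SUBTERMS: Dict[str, Set[str]] = {
    "transcription": {"phonemic", "kinesic"},
    "annotation": {"translation", "phonological", "semantic",
                   "eye-gaze", "facial-expression"},
    "description": {"phonological", "kinesic", "pedagogical", "comparative"},
    "dataset": {"kinesic"},
}


def is_valid_code(code: str) -> bool:
    """Accept a generic term ("annotation") or a refinement ("annotation/semantic")."""
    top, sep, sub = code.partition("/")
    if top not in SUBTERMS:
        return False          # unknown top-level term (or absent from this partial table)
    if not sep:
        return True           # the generic term on its own is allowed
    return sub in SUBTERMS[top]


def invalid_types(codes: Iterable[str]) -> List[str]:
    """A resource may carry several types; return those that fail the check."""
    return [c for c in codes if not is_valid_code(c)]
```

Because a single resource can belong to several categories, `invalid_types` takes a list of codes rather than one value. For instance, an interlinear text might be given both transcription/phonemic and annotation/translation.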
|Definition||A transcription is a written representation of an audio or visual signal.|
A resource can be identified as a transcription if it includes a type of transcription as part of the content; for example, the first line of an interlinear analysis is often some type of transcription.
|Definition||The resource includes information which annotates some other linguistic record.|
A linguistic annotation is defined as structured linguistic information that is explicitly aligned to some spatial and/or temporal extent of some other linguistic record.
|Definition||The resource is a structured set of data items.|
A dataset is a collection of items organized in a structured format for some specific research purpose. Examples of datasets are: a database of sentences illustrating deictic terms; an inflectional affix paradigm; a list of utterance tokens in a uniform context (e.g. "Say [pat] now.").
|Definition||The resource is a linguistic description.|
A description is any description or analysis of a language. Unlike a transcription or an annotation, the structure of a description is independent of the structure of the linguistic events that it describes.
|Definition||The resource includes a systematic listing of lexical items.|
|Definition||This is a primary resource: the object of study.|
A text is defined as any primary resource or research material, such as a literary work, film, or recording of natural discourse.
Write the introduction. Explain that typical resources will contain multiple types.
|[OED]||Oxford English Dictionary
|[OLAC-MS]||OLAC Metadata Set.
|[SAMPA]||Speech Assessment Methods Phonetic Alphabet
|[TIMIT]||TIMIT Acoustic-Phonetic Continuous Speech Corpus
|[Unicode-IPA]||Unicode IPA Extensions
|[WordNet]||WordNet - a Lexical Database for English