OLAC Linguistic Data Type Vocabulary

Date issued:	2002-06-28
Status of document:	Withdrawn Recommendation
This version:	http://www.language-archives.org/REC/type-20020628.html
Latest version:	http://www.language-archives.org/REC/type.html
Previous version:	http://www.language-archives.org/REC/type-20020612.html
Abstract:	This document specifies the controlled vocabulary of language resource types used by OLAC. The linguistic data type vocabulary describes the nature of the content of a resource from a linguistic standpoint.
Editors:	Heidi Johnson (mailto:ailla@ailla.org); Helen Aristar Dry (mailto:hdry@linguistlist.org)
Changes since previous version:	Adds: transcription/phonemic, transcription/kinesic, annotation/translation, annotation/phonological, annotation/semantic, annotation/eye-gaze, annotation/facial-expression, description/phonological, description/kinesic, description/pedagogical, description/comparative, dataset/kinesic. Deletes: transcription/eye-gaze, transcription/facial-expression, transcription/translation, transcription/phonological, transcription/semantic, description/eye-gaze, description/facial-expression, dataset/eye-gaze, dataset/facial-expression. Genre Type section.

Copyright © Heidi Johnson (University of Texas at Austin), Helen Aristar Dry (Eastern Michigan University). This material may be distributed and repurposed subject to the terms and conditions set forth in the Creative Commons Attribution-ShareAlike 2.5 License.

Introduction
Linguistic Data Type

References

1. Introduction

Key points: two-level systems, multiple categories for a single resource, parallelism of the transcription and annotation subcategories.

2. Linguistic Data Type

Each term of the controlled vocabulary is described in one of the following subsections. The heading gives the encoded value for the term that is to be used as the value of the code attribute of the Type.linguistic metadata element [OLAC-MS]. Under the heading, the term is described in four ways. Name gives a descriptive label for the term. Definition is a one-line summary of what the term means. Comments offers more details on what the term represents. Examples may also be given to illustrate how the term is meant to be applied.

A further label, Subterms, appears when the term permits more specific refinements. In such cases, either the generic (top-level) term or one of its more specific refinements may be chosen.
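
As a rough illustration of how the two-level encoded values described above might be handled in software, the following Python sketch splits a code into its top-level term and optional subterm, checks it against a small table, and builds a metadata element carrying the code attribute. The names VOCABULARY, split_code, is_valid, and type_element are hypothetical, the table is only a subset of the vocabulary, and serializing Type.linguistic as a bare element without a namespace is an assumption; [OLAC-MS] defines the actual schema.

```python
import xml.etree.ElementTree as ET

# Illustrative subset of the vocabulary in this document (not the full list).
VOCABULARY = {
    "transcription": {"phonetic", "phonemic", "prosodic", "orthographic",
                      "gestural", "kinesic", "musical"},
    "lexicon": {"dictionary", "wordlist", "wordnet", "thesaurus", "terminology"},
}


def split_code(code):
    """Split an encoded value into (top-level term, subterm or None)."""
    top, sep, sub = code.partition("/")
    return top, (sub if sep else None)


def is_valid(code):
    """Accept a known top-level term, alone or with one of its known subterms."""
    top, sub = split_code(code)
    if top not in VOCABULARY:
        return False
    return sub is None or sub in VOCABULARY[top]


def type_element(code):
    """Build an element with the code attribute (element form is an assumption)."""
    if not is_valid(code):
        raise ValueError(f"unknown linguistic data type code: {code!r}")
    return ET.Element("Type.linguistic", {"code": code})


# A single resource may carry several types, one element per code.
elements = [type_element(c) for c in ("transcription/orthographic", "lexicon/wordlist")]
```

Keeping the generic term valid on its own mirrors the rule that either the top-level term or a refinement may be chosen.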

transcription

Name

Transcription

Definition

A transcription is a written representation of an audio or visual signal.

Comments

A resource can be identified as a transcription if it includes a type of transcription as part of the content; for example, the first line of an interlinear analysis is often some type of transcription.

Subterms

transcription/phonetic

Name	Phonetic transcription
Definition	A phonetic transcription represents the signal at the phonetic level.
Comments	Phonetic transcription may be narrow or broad, and will typically use the International Phonetic Alphabet [IPA] in a standard encoding (e.g. [Unicode-IPA], [SAMPA]). Phonological transcriptions are also classified here.

transcription/phonemic

Name	Phonemic transcription
Definition	A phonemic transcription represents the signal at the level of the phoneme.
Comments	Phonemic transcription may use the International Phonetic Alphabet [IPA] or a practical orthography.

transcription/prosodic

Name	Prosodic transcription
Definition	The resource includes prosodic transcription.
Comments	A prosodic transcription is a symbolic record of intonation, stress, tone or other suprasegmental features that is expressed independently of regular phonetic transcription.

transcription/orthographic

Name	Orthographic transcription
Definition	An orthographic transcription uses a standard or conventional orthography.
Comments	Orthographic transcriptions differ from phonemic transcriptions that use a practical orthography in that they include orthographic conventions for punctuation, capitalization, etc.

transcription/gestural

Name	Gestural transcription
Definition	The resource includes gestural transcription.

transcription/kinesic

Name	Kinesic transcription
Definition	A kinesic transcription represents eye, face, and body movements.
Comments	Kinesic transcriptions represent systematic uses of facial expressions, body language, and eye gaze that are used to communicate meaning.

transcription/musical

Name	Musical transcription
Definition	A musical transcription represents music.

annotation

Name

Annotation

Definition

The resource includes information which annotates some other linguistic record.

Comments

A linguistic annotation is defined as structured linguistic information that is explicitly aligned to some spatial and/or temporal extent of some other linguistic record.

Subterms

annotation/phonetic

Name	Phonetic annotation
Definition	The resource includes phonetic annotation.
Comments	An example of a phonetic annotation is the TIMIT database, in which each element of phonetic transcription is associated with a range of samples in a digital audio file [TIMIT].
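
The idea of associating each phonetic symbol with a range of audio samples can be sketched as a small data structure. The Python below is a minimal illustration, assuming one segment per line in the form "start end label"; the Segment type, the function names, the sample values, and the default sample rate are all illustrative assumptions rather than data taken from [TIMIT].

```python
from typing import NamedTuple


class Segment(NamedTuple):
    start: int  # sample index where the segment begins
    end: int    # sample index where the segment ends
    label: str  # phonetic symbol aligned to that range


def parse_segments(text):
    """Parse lines of the form 'start end label' into Segment records."""
    segments = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        start, end, label = line.split()
        segments.append(Segment(int(start), int(end), label))
    return segments


def duration_seconds(segment, sample_rate=16000):
    """Length of a segment in seconds; the sample rate is an assumed default."""
    return (segment.end - segment.start) / sample_rate


# Made-up values for illustration only.
EXAMPLE = """\
0 3200 h#
3200 4800 sh
4800 6400 iy
"""

segments = parse_segments(EXAMPLE)
```

Because each label carries an explicit extent in the signal, the record is an annotation in the sense defined above, not just a transcription.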

annotation/phonological

Name	Phonological annotation
Definition	The resource includes phonological annotation.

annotation/prosodic

Name	Prosodic annotation
Definition	The resource includes prosodic annotation.

annotation/gestural

Name	Gestural annotation
Definition	The resource includes gestural annotation.

annotation/kinesic

Name	Kinesic annotation
Definition	The resource includes aligned representations of eye, face, and body movements.

annotation/morphological

Name	Morphological annotation
Definition	The resource includes morphological annotation.
Comments	A morphological annotation is a morphological transcription where the component morphemes are aligned with some other linguistic record, such as an orthographic transcription or a speech signal. An example of morphological annotation is interlinear text with aligned morphemic glosses.
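
The alignment between morphemes and glosses in interlinear text can be sketched as follows. This is a minimal Python illustration, assuming words are separated by spaces and morphemes by hyphens; the function name align_glosses and the English example are hypothetical and not drawn from any archived resource.

```python
def align_glosses(morpheme_line, gloss_line):
    """Pair each morpheme with its gloss, word by word.

    Both lines are assumed to use spaces between words and hyphens
    between morphemes, as in common interlinear conventions.
    """
    words = morpheme_line.split()
    glosses = gloss_line.split()
    if len(words) != len(glosses):
        raise ValueError("lines contain different numbers of words")

    aligned = []
    for word, gloss in zip(words, glosses):
        morphs = word.split("-")
        tags = gloss.split("-")
        if len(morphs) != len(tags):
            raise ValueError(f"cannot align {word!r} with {gloss!r}")
        aligned.append(list(zip(morphs, tags)))
    return aligned


pairs = align_glosses("cat-s walk-ed", "cat-PL walk-PST")
```

Here pairs holds one list per word, each containing (morpheme, gloss) tuples, which makes the explicit alignment that distinguishes an annotation easy to inspect.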

annotation/part-of-speech

Name	Part-of-speech annotation
Definition	The resource includes aligned part-of-speech tags.

annotation/syntactic

Name	Syntactic annotation
Definition	The resource includes aligned syntactic transcription.
Comments	A syntactic annotation might include supra-lexical features such as word order or auxiliary phrase constructions. Such annotations may thus be aligned with phrases or clauses rather than smaller segments of the annotated record.

annotation/semantic

Name	Semantic annotation
Definition	The resource includes semantic annotation.

annotation/discourse

Name	Discourse annotation
Definition	The resource includes aligned discourse transcription.

annotation/translation

Name	Translation
Definition	A translation is a version of the resource in another language.
Comments	Translations may align with different linguistic levels of the resource: morpheme-by-morpheme translation, word-by-word translation, sentence-level free translation, or discourse-level free translation.

annotation/musical

Name	Musical annotation
Definition	The resource includes musical annotation.

dataset

Name

Dataset

Definition

The resource is a structured set of data items.

Comments

A dataset is a collection of items organized in a structured format for some specific research purpose. Examples of datasets are: a database of sentences illustrating deictic terms; an inflectional affix paradigm; a list of utterance tokens in a uniform context (e.g. "Say [pat] now.").

Subterms

dataset/phonetic

Name	Phonetic dataset
Definition	The dataset is comprised of phonetic data.

dataset/phonological

Name	Phonological dataset
Definition	The dataset is comprised of phonological data.

dataset/prosodic

Name	Prosodic dataset
Definition	The dataset is comprised of prosodic data.

dataset/orthographic

Name	Orthographic dataset
Definition	The dataset is comprised of orthographic data.

dataset/gestural

Name	Gestural dataset
Definition	The dataset is comprised of gestural data.

dataset/kinesic

Name	Kinesic dataset
Definition	The dataset is comprised of kinesic data.

dataset/morphological

Name	Morphological dataset
Definition	The dataset is comprised of morphological data.

dataset/part-of-speech

Name	Part-of-speech dataset
Definition	The dataset is comprised of part-of-speech data.

dataset/syntactic

Name	Syntactic dataset
Definition	The dataset is comprised of syntactic data.

dataset/semantic

Name	Semantic dataset
Definition	The dataset is comprised of semantic data.

dataset/discourse

Name	Discourse dataset
Definition	The dataset is comprised of discourse data.

dataset/musical

Name	Musical dataset
Definition	The dataset is comprised of musical data.

description

Name

Description

Definition

The resource is a linguistic description.

Comments

A description is any description or analysis of a language. Unlike a transcription or an annotation, the structure of a description is independent of the structure of the linguistic events that it describes.

Subterms

description/phonetic

Name	Phonetic description
Definition	The resource includes description of phonetic characteristics.

description/phonological

Name	Phonological description
Definition	The resource includes description of phonological characteristics.

description/prosodic

Name	Prosodic description
Definition	The resource includes description of prosodic characteristics.

description/orthographic

Name	Orthographic description
Definition	The resource includes documentation of a writing system.

description/gestural

Name	Gestural description
Definition	The resource includes description of gestural characteristics.

description/kinesic

Name	Kinesic description
Definition	The resource includes description of kinesic characteristics.

description/morphological

Name	Morphological description
Definition	The resource includes description of morphological characteristics.

description/part-of-speech

Name	Part-of-speech description
Definition	The resource includes description of part-of-speech characteristics.

description/syntactic

Name	Syntactic description
Definition	The resource includes description of syntactic characteristics.

description/semantic

Name	Semantic description
Definition	The resource includes description of semantic characteristics.

description/discourse

Name	Discourse description
Definition	The resource includes description of discourse characteristics.

description/pedagogical

Name	Pedagogical description
Definition	The resource includes pedagogical description.
Comments	A pedagogical description is a style of presentation intended for use in teaching people to use the language.

description/comparative

Name	Comparative description
Definition	The resource includes comparative or typological description.

lexicon

Name

Lexicon

Definition

The resource includes a systematic listing of lexical items.

Subterms

lexicon/dictionary

Name	Dictionary
Definition	The resource includes a dictionary.
Comments	This includes any resource that lists words or morphemes and defines them. It contrasts with a word list in that the definitions are complex (rather than being one-word equivalents) and the entries may include other information like part of speech, related words, and illustrative sentences.

lexicon/wordlist

Name	Word list
Definition	The resource includes a word list.
Comments	A word list is a list of reference words in a major language for which the nearest equivalent word in a target language has been elicited (for instance, the Swadesh 100-word list).

lexicon/wordnet

Name	WordNet
Definition	The resource includes a semantic wordnet.
Comments	Whereas a dictionary documents the meanings of words by means of definitions, a wordnet documents meanings by building a web of semantic relationships [WordNet].
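
The contrast between definitions and a web of semantic relationships can be sketched with a tiny graph. The Python below is an illustration only: the HYPERNYMS table and the hypernym_chain function are hypothetical, the entries are made up for the example and are not taken from [WordNet], and a real wordnet would record many more relation types.

```python
# Each word points to a more general term (illustrative entries only).
HYPERNYMS = {
    "dog": "canine",
    "canine": "mammal",
    "mammal": "animal",
}


def hypernym_chain(word, relations=HYPERNYMS):
    """Follow 'is a kind of' links upward from a word, guarding against cycles."""
    chain = []
    seen = {word}
    while word in relations:
        word = relations[word]
        if word in seen:
            break  # stop if the relation table loops back on itself
        seen.add(word)
        chain.append(word)
    return chain


chain = hypernym_chain("dog")
```

The meaning of a word is expressed here by its position in the network, with no definition text stored at all.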

lexicon/thesaurus

Name	Thesaurus
Definition	The resource includes a thesaurus.
Comments	A thesaurus is a list of words or concepts arranged according to sense.

lexicon/terminology

Name	Terminology
Definition	The resource includes a terminological lexicon.
Comments	A terminological lexicon is a glossary of domain-specific terms. Examples are technical terminology, kinship terms, color terms, acronyms, ...

lexicon/proper-names

Name	Name Dictionary
Definition	The resource includes only proper names as dictionary headwords.

lexicon/bilingual

Name	Bilingual Lexicon
Definition	The resource includes definitions in another language.

lexicon/etymological

Name	Etymological Lexicon
Definition	The lexicon contains etymological information.

lexicon/phonetic

Name	Phonetic Lexicon
Definition	The lexicon contains phonetic information, such as pronunciation, phonology, stress, and rhymes.

lexicon/frequency

Name	Frequency Lexicon
Definition	The lexicon contains frequency information.

lexicon/analytical

Name	Analytical Lexicon
Definition	The lexicon contains analytical information.
Comments	Analytical information includes such things as morphological derivation, grammatically related forms, argument structure, ...

text

Name

Text

Definition

This is a primary resource: the object of study.

Comments

A text is defined as any primary resource or research material, such as a literary work, film, or recording of natural discourse.

Subterms

text/narrative

Name	Narrative
Definition	A monologic discourse which represents temporally organized events.
Comments	Types of narratives include historical, traditional, and personal narratives, myths, folktales, fables, and humorous stories.

text/oratory

Name	Oratory
Definition	"The art of public speaking, or of speaking eloquently according to rules or conventions."
Comments	Examples of oratory include sermons, lectures, political speeches, and invocations.

text/dialogue

Name	Dialogue
Definition	An interactive discourse with two or more participants.
Comments	Examples of dialogues include conversations, interviews, correspondence, consultations, greetings and leave-takings.

text/singing

Name	Singing
Definition	"Words or sounds [articulated] in succession with musical inflections or modulations of the voice" [OED].
Comments	Examples of singing include chants, songs, and choruses.

text/drama

Name	Drama
Definition	A planned, creative rendition of discourse with two or more participants.

text/formulaic

Name	Formulaic
Definition	The resource is a ritually or conventionally structured discourse.
Comments	Examples of formulaic discourse are prayers, curses, blessings, charms, curing rituals, marriage vows, and oaths.

text/procedural

Name	Procedural
Definition	An explanation or description of a method, process, or situation having ordered steps.
Comments	Examples of procedural discourses include recipes, instructions, and plans.

text/report

Name	Report
Definition	A factual account of some event or circumstance.
Comments	Examples of reports include news reports, essays, and commentaries.

text/ludic

Name	Ludic
Definition	Ludic discourse is language whose primary function is to be part of play, or a style of speech that involves a creative manipulation of the structures of the language.
Comments	Examples of ludic discourse are play languages, jokes, secret languages, and speech disguises.

text/unintelligible speech

Name	Unintelligible speech
Definition	The resource consists of utterances that are not intended to be interpretable as ordinary language.
Comments	Examples of unintelligible speech include sacred languages, speaking in tongues, and singing syllables (fa-la-la).

To do

Write the introduction. Explain that typical resources will contain multiple types.

References

[OED]	Oxford English Dictionary <http://dictionary.oed.com/entrance.dtl>
[OLAC-MS]	OLAC Metadata Set <http://www.language-archives.org/OLAC/olacms.html>
[SAMPA]	Speech Assessment Methods Phonetic Alphabet <http://www.phon.ucl.ac.uk/home/sampa/home.htm>
[TIMIT]	TIMIT Acoustic-Phonetic Continuous Speech Corpus <http://www.ldc.upenn.edu/Catalog/LDC93S1.html>
[Unicode-IPA]	Unicode IPA Extensions <http://www.unicode.org/unicode/uni2book/ch07.pdf>
[WordNet]	WordNet - a Lexical Database for English <http://www.cogsci.princeton.edu/~wn/>
