OLAC Record: Czech Audio-Visual Speech Corpus for Recognition with Impaired Conditions

OLAC Record
oai:catalogue.elra.info:ELRA-S0284

Metadata

Title: Czech Audio-Visual Speech Corpus for Recognition with Impaired Conditions

Access Rights: Rights available for: nonCommercialUse, commercialUse

Date Available (W3CDTF): 2008-11-05

Date Issued (W3CDTF): 2008-11-05

Date Modified (W3CDTF): 2008-11-05

Description: This is an audio-visual speech database for training and testing Czech audio-visual continuous speech recognition systems, collected under impaired illumination conditions. The corpus consists of about 20 hours of audio-visual recordings of 50 speakers in laboratory conditions. Recorded subjects were instructed to remain static. The illumination varied, and portions of each speaker's session were recorded under several different conditions, such as full illumination or illumination from one side only (left or right). These conditions make the database usable for training lip- and head-tracking systems under various illumination conditions, independently of the language. Speakers were asked to read 200 sentences each (50 common to all speakers and 150 specific to each speaker). The average total length of recording per speaker was 23 minutes. Acoustic data are stored in wave files using PCM format, with a sampling frequency of 44 kHz and a resolution of 16 bits. Each speaker’s acoustic data set takes about 180 MB of disk space (about 8.8 GB for the whole corpus). Visual data are stored in video files (.avi format) using the digital video (DV) codec. Visual data per speaker take about 3.7 GB of disk space (about 185 GB as a whole) and are stored on an IDE hard disk (NTFS format).

Identifier: ELRA-S0284

Identifier: ISLRN: 747-828-662-077-7

Identifier (URI): https://catalog.elra.info/en-us/repository/browse/ELRA-S0284/

Language: Czech Sign Language

Language (ISO639): cse

Medium: Not specified

Publisher: ELRA (European Language Resources Association)

Type (DCMI): Sound

Type (DCMI): MovingImage

Type (OLAC): primary_text
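
The corpus-wide totals quoted in the Description field can be cross-checked against the per-speaker figures. The following is a minimal sketch of that arithmetic. The constant and function names are illustrative, and the conversion of 1 GB = 1024 MB is an assumption (the record does not say which convention it uses).

```python
# Figures quoted in the Description field of this record.
SPEAKERS = 50
MINUTES_PER_SPEAKER = 23      # average recording length per speaker
AUDIO_MB_PER_SPEAKER = 180    # acoustic data per speaker
VIDEO_GB_PER_SPEAKER = 3.7    # visual data per speaker


def corpus_totals():
    """Return (hours of recording, audio size in GB, video size in GB)."""
    hours = SPEAKERS * MINUTES_PER_SPEAKER / 60
    # Assumption: 1 GB = 1024 MB; the record does not state the convention.
    audio_gb = SPEAKERS * AUDIO_MB_PER_SPEAKER / 1024
    video_gb = SPEAKERS * VIDEO_GB_PER_SPEAKER
    return hours, audio_gb, video_gb


hours, audio_gb, video_gb = corpus_totals()
print(f"{hours:.1f} h, {audio_gb:.1f} GB audio, {video_gb:.0f} GB video")
# prints "19.2 h, 8.8 GB audio, 185 GB video"
```

Under these assumptions the per-speaker figures agree with the stated totals: about 19 hours of recordings (quoted as "about 20 hours"), about 8.8 GB of audio, and about 185 GB of video.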

OLAC Info

Archive: ELRA Catalogue of Language Resources

Description: http://www.language-archives.org/archive/catalogue.elra.info

GetRecord: OAI-PMH request for OLAC format

GetRecord: Pre-generated XML file

OAI Info

OaiIdentifier: oai:catalogue.elra.info:ELRA-S0284

DateStamp: 2008-11-05

GetRecord: OAI-PMH request for simple DC format
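
The GetRecord entries above refer to OAI-PMH requests for this record. As a rough sketch, such a request is an HTTP GET with the `verb`, `identifier`, and `metadataPrefix` arguments defined by OAI-PMH 2.0. The base URL below is a hypothetical placeholder, since this record does not state the archive's endpoint, and `olac` is assumed to be the metadata prefix used for the OLAC format (`oai_dc` is the standard prefix for simple Dublin Core).

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the record does not give the OAI-PMH base URL.
BASE_URL = "https://example.org/oai"

OAI_IDENTIFIER = "oai:catalogue.elra.info:ELRA-S0284"


def get_record_url(identifier, metadata_prefix, base_url=BASE_URL):
    """Build an OAI-PMH GetRecord request URL for one record."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    })
    return f"{base_url}?{query}"


# Simple Dublin Core (standard OAI-PMH prefix).
dc_url = get_record_url(OAI_IDENTIFIER, "oai_dc")
# OLAC format (prefix name assumed to be "olac").
olac_url = get_record_url(OAI_IDENTIFIER, "olac")
print(dc_url)
print(olac_url)
```

The response to either request would be an XML document containing the record's header (identifier and datestamp) and its metadata in the requested format.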

Search Info
Citation: n.a. 2008. ELRA (European Language Resources Association).
Terms: area_Europe country_CZ dcmi_MovingImage dcmi_Sound iso639_cse olac_primary_text

http://www.language-archives.org/item.php/oai:catalogue.elra.info:ELRA-S0284
Up-to-date as of: Wed Jul 15 7:05:23 EDT 2026
