OLAC Record

Title:Danish SpeechDat(M) database - DB2
Abstract:Phonetically rich sentences sub-set. See ELRA-S0040
Access Rights:Rights available for: Research Use, Commercial Use
Date Available (W3CDTF):1997-06-02
Date Issued (W3CDTF):2004-09-14
Date Modified (W3CDTF):2007-08-28
Description:The (polyphone-like) Danish SpeechDat(M) database contains recordings of 1,523 Danish speakers from 11 regions. Speech samples are stored as sequences of 8-bit, 8 kHz A-law samples. Each prompted utterance is stored in a separate file, and each signal file is accompanied by an ASCII label file in SAM format containing the relevant descriptive information. The database was validated by SPEX (the Netherlands) to assess its compliance with the SpeechDat format and content specifications.

A pronunciation lexicon with phonemic transcriptions in SAMPA is also included. It is presented as a TAB-delimited ASCII file containing an alphabetically ordered list of the distinct lexical items occurring in the database. Each entry contains a frequency count and the corresponding pronunciation information. Example:

    WORD              FREQUENCY  PHONEMIC TRANSCRIPTIONS
    åbnede            104        O b n @ D | O b n @ D @
    adresseangivelse  97         a d R a s @ a n g i: u l s @

The complete Danish SpeechDat database is partitioned into 5 CD-ROMs. The first three CD-ROMs contain the application-oriented sub-set. The last two CD-ROMs contain the phonetically rich sentences.

Each speaker uttered the following items:
* 5 semi-spontaneous application word phrases
* 12 connected digit strings with 8 digits
* 24 natural numbers (3-4 digits)
* 27 application words
* 3 dates, including a spontaneous one (e.g. birthday)
* 3 spelled words
* 2 money amounts, one small and one large
* 1 spontaneous city name
* 3 spontaneous yes/no questions
* 22-25 sentences
* 2 time phrases, including a time phrase and a spontaneous time of day

The 5 age groups are the following: under 16, 16-30, 31-45, 46-60, over 60. 78% of the speakers are between 16 and 60 years old.
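
The lexicon layout described above (one TAB-delimited entry per line, holding a word, a frequency count and one or more SAMPA transcriptions) could be read with a short script along the following lines. This is only a sketch under stated assumptions: the record does not document the exact file layout, so the three-column order, the use of "|" to separate pronunciation variants, the optional header line, and the names `LexiconEntry` and `parse_lexicon` are all hypothetical and taken from, or modelled on, the two-line example only.

```python
import csv
import io
from typing import Iterator, List, NamedTuple


class LexiconEntry(NamedTuple):
    word: str
    frequency: int
    pronunciations: List[List[str]]  # one list of SAMPA symbols per variant


def parse_lexicon(stream) -> Iterator[LexiconEntry]:
    """Yield entries from a TAB-delimited lexicon stream.

    Assumed layout (not confirmed by the record): word, frequency,
    transcription field; variants separated by "|", symbols by spaces.
    """
    reader = csv.reader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 3:
            continue  # skip blank or malformed lines
        word, freq, transcription = row[0], row[1].strip(), row[2]
        if not freq.isdigit():
            continue  # skip a header line such as WORD / FREQUENCY / ...
        variants = [v.split() for v in transcription.split("|")]
        yield LexiconEntry(word, int(freq), variants)


# Sample rows modelled on the example in the description; the first
# headword is assumed to be "åbnede".
sample = (
    "WORD\tFREQUENCY\tPHONEMIC TRANSCRIPTIONS\n"
    "åbnede\t104\tO b n @ D | O b n @ D @\n"
    "adresseangivelse\t97\ta d R a s @ a n g i: u l s @\n"
)
entries = list(parse_lexicon(io.StringIO(sample)))
for entry in entries:
    print(entry.word, entry.frequency, len(entry.pronunciations))
```

For a real lexicon file, the stream would come from `open(...)` with whatever character encoding the distribution actually uses; the record does not state the encoding, so it would need to be checked against the accompanying documentation.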
Language (ISO639):dan
Publisher:ELRA (European Language Resources Association)
Type (DCMI):Sound
Type (OLAC):primary_text


