OLAC Record
oai:catalogue.elra.info:ELRA-S0497

Metadata
Title:Chinese Kids Speech database (Upper Grade)
Access Rights: Rights available for: nonCommercialUse, commercialUse
Date Available (W3CDTF):2025-07-18
Date Issued (W3CDTF):2025-07-18
Description:The Chinese Kids Speech database (Upper Grade) contains recordings of 161 Chinese child speakers (71 male and 90 female), aged 10 to 12, recorded in quiet rooms using smartphones. This database may be combined with the Chinese Kids Speech database (Lower Grade), also available in the ELRA Catalogue under reference ELRA-S0495.

Number of speakers, number of utterances, age range and duration are as follows:
Number of speakers (Male/Female): 161 (71/90)
Number of utterances (average): 234 utt/spkr
Total number of utterances: 37,806
Age: from 10 to 12
Total number of hours: 72

1,859 sentences were used. Recordings were made with smartphones, and the audio data are stored in .wav files as 16 kHz, mono, 16-bit linear PCM.

Database
・Audio data: WAV format, 16 kHz, 16-bit, mono (recorded with smartphone)
・Transcription data: TSV format (tab-delimited), UTF-8 (without BOM), line ending: LF
・Size: 7.8 GB

Age    Male    Female    Total
10     14      23        37
11     22      33        55
12     35      34        69

Structure of the database:
├─ readme.txt
├─ Chinese Kids Speech Database (Upper grade).pdf    Description document of the database
├─ transcription(Upper).tsv    Transcription
└─ High/    directory of audio data
   └─ (1st/2nd/3rd)    directory of version ID
      └─ (0/1)    directory of gender (0: male, 1: female)
         └─ (audio_file)    audio file (WAV format, 16 kHz, 16-bit, mono)

Field information of "transcription(Upper).tsv" is as follows:
Field number    Description
0               Script ID
1               Speaker ID
2               Audio file name
3               Transcription (in Chinese)

File naming conventions of audio files are as follows:
Field number    Contents                 Description                                   Remarks
0               Script ID                Four digits                                   XXXX: four digits
1               Speaker ID               Three digits                                  XXX: three digits
2               Age                      Two digits                                    From 10 to 12
3               Gender                   0: male, 1: female
4               Utterance No.            Three digits                                  Sequential numbering starting from 001 within each speaker
5               Recording date           YYYYMMDDHHMM
6               Recording device name    Recording device name                         Ex. NTH-AN00
7               OS                       Operating System info of recording device     Ex. android-11
8               Duration                 Duration in msec                              Duration of the actual spoken utterance

The field separator character is "_". For example, if the audio file name is "1190_190_11_0_001_202204291812_V2162A_android-11_3290.wav", the fields have the following meaning:
1190: script ID
190: speaker ID
11: age (eleven years old)
0: gender (male)
001: utterance number
202204291812: recording date (April 29, 2022, at 6:12 PM)
V2162A: recording device name
android-11: operating system info of recording device
3290: duration of the actual spoken utterance (3,290 msec)
Identifier:ELRA-S0497
ISLRN: 993-024-988-227-0
Identifier (URI):https://catalog.elra.info/en-us/repository/browse/ELRA-S0497/
Language:Chinese
Language (ISO639):zho
Medium:Not specified
Publisher:ELRA (European Language Resources Association)
Type (DCMI):Sound
Type (OLAC):primary_text
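
The audio file naming convention given in the Description can be decoded mechanically. The following is a minimal Python sketch, not part of the ELRA distribution: the names `UtteranceInfo` and `parse_audio_filename` are hypothetical, and the code assumes that no field (including the device name and OS info) itself contains the "_" separator, which the record does not state.

```python
from dataclasses import dataclass
from datetime import datetime
from pathlib import PurePath

# Gender codes as listed in the record: 0 = male, 1 = female.
GENDER_CODES = {"0": "male", "1": "female"}


@dataclass(frozen=True)
class UtteranceInfo:
    """Hypothetical container for the nine filename fields."""
    script_id: str        # field 0, four digits
    speaker_id: str       # field 1, three digits
    age: int              # field 2, from 10 to 12
    gender: str           # field 3, decoded to "male" / "female"
    utterance_no: int     # field 4, numbered from 001 within each speaker
    recorded_at: datetime # field 5, YYYYMMDDHHMM
    device: str           # field 6, recording device name
    os_info: str          # field 7, operating system info
    duration_ms: int      # field 8, duration of the spoken utterance in msec


def parse_audio_filename(name: str) -> UtteranceInfo:
    """Split an audio file name on "_" and decode each field.

    Assumption: exactly nine fields, none containing "_".
    """
    stem = PurePath(name).stem  # drop directories and the ".wav" suffix
    parts = stem.split("_")
    if len(parts) != 9:
        raise ValueError(f"expected 9 fields, got {len(parts)}: {name!r}")
    script, speaker, age, gender, utt, date, device, os_info, dur = parts
    if gender not in GENDER_CODES:
        raise ValueError(f"unknown gender code {gender!r} in {name!r}")
    return UtteranceInfo(
        script_id=script,
        speaker_id=speaker,
        age=int(age),
        gender=GENDER_CODES[gender],
        utterance_no=int(utt),
        recorded_at=datetime.strptime(date, "%Y%m%d%H%M"),
        device=device,
        os_info=os_info,
        duration_ms=int(dur),
    )


if __name__ == "__main__":
    example = "1190_190_11_0_001_202204291812_V2162A_android-11_3290.wav"
    print(parse_audio_filename(example))
```

Script and speaker IDs are kept as strings so that leading zeros are preserved when they are matched against the Script ID and Speaker ID columns of "transcription(Upper).tsv".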

OLAC Info

Archive:  ELRA Catalogue of Language Resources
Description:  http://www.language-archives.org/archive/catalogue.elra.info
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:catalogue.elra.info:ELRA-S0497
DateStamp:  2025-07-18
GetRecord:  OAI-PMH request for simple DC format

Search Info

Citation: n.a. 2025. ELRA (European Language Resources Association).
Terms: dcmi_Sound iso639_zho olac_primary_text


http://www.language-archives.org/item.php/oai:catalogue.elra.info:ELRA-S0497
Up-to-date as of: Thu Aug 21 1:01:03 EDT 2025