OLAC Record
oai:catalogue.elra.info:ELRA-S0345

Metadata
Title:Spoken Portuguese Corpus
Abstract:The Spoken Portuguese corpus consists of 86 recordings (8h44m), collected from sociolinguistically diverse speakers who have Portuguese as a mother tongue or as a second language. The corpus was recorded in situations of spontaneous oral communication, on different themes of everyday life, with speakers of different ages and social and professional backgrounds. It comprises audio files in .wav format, aligned transcriptions in XML Exmaralda format and transcriptions in plain text.
Access Rights:Rights available for: Research Use, Commercial Use
Coverage:1970 to 2001
Date Available (W3CDTF):2012-09-12
Date Issued (W3CDTF):2012-09-12
Date Modified (W3CDTF):2012-09-12
Description:Desktop/Microphone
The Spoken Portuguese corpus was collected from sociolinguistically diverse speakers who have Portuguese as a mother tongue or as a second language. Across a total of 86 recordings, the texts exemplify the Portuguese spoken in Portugal (30), in Brazil (20), in the African countries with Portuguese as their official language: Angola, Cape Verde, Guinea-Bissau, Mozambique and Sao Tome and Principe (5 each), in Macao (5), in Goa (3) and in East Timor (3), corresponding to a total of 8h44m of recording. The corpus was recorded in situations of spontaneous oral communication, on different themes of everyday life, with speakers of different ages and social and professional backgrounds. The recordings span the period from 1970 to 2001, and approximately 70% of them date from the 1990s. The corpus contains 153,588 tokens. It comprises audio files in .wav format, aligned transcriptions in XML Exmaralda format and transcriptions in plain text. The plain text files also carry automatically assigned POS-tag information. The transcriptions are also available in HTML format. All characters are encoded in UTF-8.
Identifier:ELRA-S0345
http://catalog.elra.info/product_info.php?products_id=1172
Language:Portuguese
Language (ISO639):por
Publisher:ELRA (European Language Resources Association)
Type (DCMI):Sound
Type (OLAC):primary_text
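
As a quick consistency check on the figures in the Description field, the per-region recording counts can be summed and compared with the stated total of 86. The sketch below also derives an average token rate from the stated duration (8h44m) and token count (153,588). The variable names are illustrative only, and the token rate is a derived figure that the record itself does not state.

```python
# Recording counts per region, copied from the Description field.
recordings = {
    "Portugal": 30,
    "Brazil": 20,
    "Angola": 5,
    "Cape Verde": 5,
    "Guinea-Bissau": 5,
    "Mozambique": 5,
    "Sao Tome and Principe": 5,
    "Macao": 5,
    "Goa": 3,
    "East Timor": 3,
}

# Sum of the per-region counts; the record states 86 in total.
total_recordings = sum(recordings.values())

# Stated duration of 8h44m, converted to minutes.
total_minutes = 8 * 60 + 44

# Stated token count; the rate is derived here, not taken from the record.
tokens = 153_588
tokens_per_minute = tokens / total_minutes

print(total_recordings, total_minutes, round(tokens_per_minute, 1))
```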

OLAC Info

Archive:  ELRA Catalogue of Language Resources
Description:  http://www.language-archives.org/archive/catalogue.elra.info
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:catalogue.elra.info:ELRA-S0345
DateStamp:  2012-09-12
GetRecord:  OAI-PMH request for simple DC format
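
A minimal sketch of how a GetRecord request for this record could be assembled. The `verb`, `identifier` and `metadataPrefix` parameters are defined by the OAI-PMH protocol, and `oai_dc` is the standard prefix for simple Dublin Core. The base URL below is a hypothetical placeholder, since this record does not state the repository's endpoint. The helper name `get_record_url` is likewise illustrative, and using `olac` as the prefix for the OLAC format is an assumption about this repository.

```python
from urllib.parse import urlencode

# Hypothetical placeholder: the real OAI-PMH base URL is not given in this record.
BASE_URL = "https://example.org/oai"


def get_record_url(identifier, metadata_prefix="oai_dc", base_url=BASE_URL):
    """Build an OAI-PMH GetRecord request URL for one record."""
    query = urlencode(
        {
            "verb": "GetRecord",
            "identifier": identifier,
            "metadataPrefix": metadata_prefix,
        }
    )
    return f"{base_url}?{query}"


# Simple DC request for the identifier listed above.
url = get_record_url("oai:catalogue.elra.info:ELRA-S0345")

# Assumed prefix for the OLAC format; check the repository's ListMetadataFormats.
olac_url = get_record_url("oai:catalogue.elra.info:ELRA-S0345", metadata_prefix="olac")

print(url)
print(olac_url)
```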

Search Info

Citation: n.a. 2012. ELRA (European Language Resources Association).
Terms: area_Europe country_PT dcmi_Sound iso639_por olac_primary_text


http://www.language-archives.org/item.php/oai:catalogue.elra.info:ELRA-S0345
Up-to-date as of: Mon Feb 27 0:31:45 EST 2017