OLAC Record: SmartWeb Video Corpus (SVC)

OLAC Record
oai:catalogue.elra.info:ELRA-S0280

Metadata

Title: SmartWeb Video Corpus (SVC)

Access Rights: Rights available for: nonCommercialUse, commercialUse

Date Available (W3CDTF): 2008-07-11

Date Issued (W3CDTF): 2008-07-11

Date Modified (W3CDTF): 2008-07-11

Description: The SMARTWEB UMTS data collection was created within the publicly funded German SmartWeb project between 2004 and 2006. It comprises user queries to a naturally spoken Web interface, with the main focus on the 2006 soccer World Cup. The recordings include field recordings using a hand-held UMTS device (one speaker, SmartWeb Handheld Corpus SHC, ref. ELRA-S0278), mobile recordings performed on a BMW motorbike (one speaker, SmartWeb Motorbike Corpus SMC, ref. ELRA-S0279), and field recordings with video capture of the primary speaker and a secondary speaker (SmartWeb Video Corpus SVC, ref. ELRA-S0280).
This multimodal corpus is the SmartWeb Video Corpus. It contains 99 recordings, each of a human-human-machine dialogue: one speaker (who is being recorded) interacts with a human partner as well as with a dialogue system via a smart phone (SmartWeb system). The speaker uses a client-server based dialogue system (SmartWeb) for spoken access to Internet content in a natural environment (office, hallway, street, park, cafe, etc.). Speech was captured over a Bluetooth headset and transferred via a UMTS cellular line to the server; a second, collar-attached microphone was recorded on a portable iRiver recorder to yield an undisturbed, high-quality reference signal. The face of the speaker was captured by the built-in face camera of the smart phone. The speech signal was segmented into queries (automatically, by the prompting system), segmented a second time manually into turns, and transcribed according to the Verbmobil transliteration standard. The video signal was labelled manually as OnView / OffView and, in part, spatially segmented for face detection.
The motivation for this corpus was to capture realistic multimodal (speech + face) data in a realistic human-machine interaction, and to capture as many OffTalk situations as possible (OffTalk being all speech uttered by the speaker that is not intended as input to the system).
The corpus contains:
- number of dialogues / recorded speakers: 99
- number of segmented turns: 2,218
- total duration: 971 minutes
- formats:
  o collar mic: WAV, 44.1 kHz, 16 bit
  o Bluetooth/UMTS channel: ALAW, 8 kHz, 8 bit
  o video: 176x144, 24 bpp, 15 fps, 3GPP + MPEG1
  o annotation: Verbmobil Transliteration (TRS), BAS Partitur Format (BPF), ATLAS Annotation Graph (XML)
  o metadata: speaker and recording protocol (XML)
- segmentation: automatic segmentation into input queries by the prompting system; manual segmentation into turns; OffTalk labelling; OffView labelling; spatial segmentation of the face (partly manual)
- distribution: 5 DVD-R
See also ELRA-S0278 and ELRA-S0279.
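The size figures in the Description can be turned into a few derived quantities, such as the average dialogue length and the uncompressed data rate of each audio channel. The following is only a sketch in Python: the function names are hypothetical, and the single-channel (mono) default is an assumption, since the record does not state the channel count.

```python
# Figures quoted in the Description field of this record.
NUM_DIALOGUES = 99
NUM_TURNS = 2218
TOTAL_MINUTES = 971


def per_dialogue_averages(dialogues, turns, minutes):
    """Return (mean minutes per dialogue, mean turns per dialogue)."""
    return minutes / dialogues, turns / dialogues


def bytes_per_second(sample_rate_hz, bits_per_sample, channels=1):
    """Raw data rate of a sampled signal.

    channels=1 is an assumption: the record does not say whether
    the recordings are mono or stereo.
    """
    return sample_rate_hz * bits_per_sample // 8 * channels


mean_minutes, mean_turns = per_dialogue_averages(
    NUM_DIALOGUES, NUM_TURNS, TOTAL_MINUTES
)
collar_mic_rate = bytes_per_second(44100, 16)  # WAV, 44.1 kHz, 16 bit
umts_rate = bytes_per_second(8000, 8)          # ALAW, 8 kHz, 8 bit

print(f"mean dialogue length: {mean_minutes:.1f} min")
print(f"mean turns per dialogue: {mean_turns:.1f}")
print(f"collar mic: {collar_mic_rate} B/s, UMTS channel: {umts_rate} B/s")
```

Under these assumptions a dialogue lasts just under ten minutes on average and contains roughly 22 turns.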

Identifier: ELRA-S0280

ISLRN: 874-872-676-146-6

Identifier (URI): https://catalog.elra.info/en-us/repository/browse/ELRA-S0280/

Language: German

Language (ISO639): deu

Medium: Not specified

Publisher: ELRA (European Language Resources Association)

Type (DCMI): Sound

Type (DCMI): MovingImage

Type (OLAC): primary_text

OLAC Info

Archive: ELRA Catalogue of Language Resources

Description: http://www.language-archives.org/archive/catalogue.elra.info

GetRecord: OAI-PMH request for OLAC format

GetRecord: Pre-generated XML file

OAI Info

OaiIdentifier: oai:catalogue.elra.info:ELRA-S0280

DateStamp: 2008-07-11

GetRecord: OAI-PMH request for simple DC format
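The two GetRecord entries refer to OAI-PMH requests for this record in different metadata formats. As a rough sketch, such a request URL can be assembled from the OaiIdentifier and a metadataPrefix. The base URL below is a placeholder, not the real endpoint (the record does not state it), and the helper name is hypothetical; `olac` and `oai_dc` are the prefixes commonly used for the OLAC and simple Dublin Core formats.

```python
from urllib.parse import urlencode

# Placeholder endpoint: an assumption, replace with the archive's real base URL.
BASE_URL = "http://example.org/oai"

# Identifier taken from the OaiIdentifier field of this record.
OAI_IDENTIFIER = "oai:catalogue.elra.info:ELRA-S0280"


def get_record_url(base_url, identifier, metadata_prefix):
    """Build an OAI-PMH GetRecord request URL (no network access)."""
    query = urlencode(
        {
            "verb": "GetRecord",
            "identifier": identifier,
            "metadataPrefix": metadata_prefix,
        }
    )
    return f"{base_url}?{query}"


olac_url = get_record_url(BASE_URL, OAI_IDENTIFIER, "olac")
dc_url = get_record_url(BASE_URL, OAI_IDENTIFIER, "oai_dc")

print(olac_url)
print(dc_url)
```

The sketch only builds the URLs; fetching and parsing the XML response is left out on purpose.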

Search Info
Citation: n.a. 2008. ELRA (European Language Resources Association).
Terms: area_Europe country_DE dcmi_MovingImage dcmi_Sound iso639_deu olac_primary_text

http://www.language-archives.org/item.php/oai:catalogue.elra.info:ELRA-S0280
Up-to-date as of: Wed Oct 1 0:55:56 EDT 2025
