Sample Metadata Record

oai:mmm.idiap.ch:1
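A record like this is retrieved from the archive's OAI-PMH interface with a GetRecord request. A minimal example follows; only the identifier comes from this record, and the base URL (assumed here to sit at the repository host) is a placeholder:

http://mmm.idiap.ch/oai?verb=GetRecord&identifier=oai:mmm.idiap.ch:1&metadataPrefix=olac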


XML format

<olac:olac xmlns:olac="http://www.language-archives.org/OLAC/1.1/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:dcterms="http://purl.org/dc/terms/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<dc:title>The AMI Meeting Corpus</dc:title>
<dc:contributor xsi:type="olac:role" olac:code="author">AMI Consortium</dc:contributor>
<dc:contributor xsi:type="olac:role" olac:code="editor">Jean Carletta</dc:contributor>
<dc:contributor xsi:type="olac:role" olac:code="recorder">Idiap Research Institute</dc:contributor>
<dc:contributor xsi:type="olac:role" olac:code="recorder">University of Edinburgh</dc:contributor>
<dc:contributor xsi:type="olac:role" olac:code="recorder">TNO (Netherlands Organization for Applied Scientific Research)</dc:contributor>
<dc:contributor xsi:type="olac:role" olac:code="annotator">University of Edinburgh</dc:contributor>
<dc:contributor xsi:type="olac:role" olac:code="sponsor">European Union - Framework Programmes 6 and 7</dc:contributor>
<dc:contributor xsi:type="olac:role" olac:code="sponsor">Swiss National Science Foundation - IM2 National Center of Competence in Research</dc:contributor>
<dc:coverage>Martigny, Switzerland</dc:coverage>
<dc:coverage>Edinburgh, Scotland</dc:coverage>
<dc:coverage>Delft, The Netherlands</dc:coverage>
<dc:date>2005-2007</dc:date>
<dc:description>The AMI Meeting Corpus is a multi-modal data set consisting of 100 hours of meeting recordings. Around two-thirds of the data has been elicited using a scenario in which the participants play different roles in a design team, taking a design project from kick-off to completion over the course of a day. The rest consists of naturally occurring meetings in a range of domains, and other scenarised meetings. The corpus has been recorded using synchronised recording devices (close-talking and far-field microphones, individual and room-view video cameras, projection, a whiteboard, individual pens), has been fully transcribed, and annotations for many different phenomena (such as dialogue acts or head movement) have been made. Although the AMI Meeting Corpus was created for the uses of a consortium that is developing meeting browsing technology, it is designed to be useful for a wide range of research areas.</dc:description>
<dc:description xsi:type="dcterms:URI">http://corpus.amiproject.org</dc:description>
<dc:description xsi:type="dcterms:URI">http://corpus.amiproject.org/documentations/</dc:description>
<dc:format>WAV</dc:format>
<dc:format xsi:type="dcterms:IMT">audio/x-wav</dc:format>
<dc:format>RealAudio</dc:format>
<dc:format xsi:type="dcterms:IMT">audio/vnd.rn-realaudio</dc:format>
<dc:format>AVI</dc:format>
<dc:format xsi:type="dcterms:IMT">video/x-msvideo</dc:format>
<dc:format>RealVideo</dc:format>
<dc:format xsi:type="dcterms:IMT">video/vnd.rn-realvideo</dc:format>
<dc:format>XML</dc:format>
<dc:format xsi:type="dcterms:IMT">text/xml</dc:format>
<dcterms:extent>100 hours</dcterms:extent>
<dcterms:extent>171 meetings</dcterms:extent>
<dc:identifier>AMI_Corpus</dc:identifier>
<dcterms:bibliographicCitation>Jean Carletta, Simone Ashby, Sebastien Bourban, Mike Flynn, Mael Guillemot, Thomas Hain, Jaroslav Kadlec, Vasilis Karaiskos, Wessel Kraaij, Melissa Kronenthal, Guillaume Lathoud, Mike Lincoln, Agnes Lisowska, Iain McCowan, Wilfried Post, Dennis Reidsma, and Pierre Wellner, "The AMI Meeting Corpus: A Pre-announcement", in Machine Learning for Multimodal Interaction II, edited by S. Renals and S. Bengio, LNCS 3869, Springer-Verlag, Berlin/Heidelberg, 2006, pages 28-39.</dcterms:bibliographicCitation>
<dcterms:bibliographicCitation>Jean Carletta, "Unleashing the killer corpus: experiences in creating the multieverything AMI Meeting Corpus", Language Resources and Evaluation, vol. 41, n. 2, 2007, pages 181-190.</dcterms:bibliographicCitation>
<dc:language xsi:type="olac:language" olac:code="eng">English</dc:language>
<dc:publisher>AMI Consortium</dc:publisher>
<dcterms:hasPart>AMI Corpus - Scenario-based Meeting Media</dcterms:hasPart>
<dcterms:hasPart>AMI Corpus - Non Scenario-based Meeting Media</dcterms:hasPart>
<dcterms:hasPart>AMI Corpus - Annotations and Metadata</dcterms:hasPart>
<dc:rights>All of the signals and transcription, and some of the annotations, have been released publicly under the AMI Meeting Corpus license, very similar to the Creative Commons Attribution NonCommercial ShareAlike 2.5 License (http://creativecommons.org/licenses/by-nc-sa/2.5).</dc:rights>
<dc:rights xsi:type="dcterms:URI">http://corpus.amiproject.org/documentations/license</dc:rights>
<dc:subject>Multi-party meetings</dc:subject>
<dc:type xsi:type="dcterms:DCMIType">Sound</dc:type>
<dc:type xsi:type="dcterms:DCMIType">Text</dc:type>
<dc:type xsi:type="dcterms:DCMIType">MovingImage</dc:type>
<dc:type xsi:type="dcterms:DCMIType">StillImage</dc:type>
<dc:type xsi:type="dcterms:DCMIType">Collection</dc:type>
<dc:type xsi:type="olac:discourse-type" olac:code="dialogue">dialogue</dc:type>
<dc:type xsi:type="olac:linguistic-type" olac:code="primary_text">primary text</dc:type>
</olac:olac>
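
The record can be processed with any namespace-aware XML library. The sketch below, using Python's standard xml.etree.ElementTree, pulls out the title and the role-coded contributors; the namespace URIs are the standard OLAC and Dublin Core ones declared on the root element above, and the file name in the usage line is a placeholder.

import xml.etree.ElementTree as ET

# Standard namespace URIs, as declared on the olac:olac root element.
NS = {
    "olac": "http://www.language-archives.org/OLAC/1.1/",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
    "xsi": "http://www.w3.org/2001/XMLSchema-instance",
}

def summarize(record_xml: str) -> None:
    """Print the title and role-coded contributors of one OLAC record."""
    root = ET.fromstring(record_xml)
    print("Title:", root.findtext("dc:title", namespaces=NS))
    for c in root.findall("dc:contributor", NS):
        # The olac:code attribute carries the role (author, editor, ...).
        role = c.get(f"{{{NS['olac']}}}code", "unspecified")
        print(f"  contributor ({role}): {c.text}")

# Usage (file name is a placeholder):
# summarize(open("ami_record.xml", encoding="utf-8").read())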

Display format

 Title  The AMI Meeting Corpus
 Contributor (author)  AMI Consortium
 Contributor (editor)  Jean Carletta
 Contributor (recorder)  Idiap Research Institute
 Contributor (recorder)  University of Edinburgh
 Contributor (recorder)  TNO (Netherlands Organization for Applied Scientific Research)
 Contributor (annotator)  University of Edinburgh
 Contributor (sponsor)  European Union - Framework Programmes 6 and 7
 Contributor (sponsor)  Swiss National Science Foundation - IM2 National Center of Competence in Research
 Coverage  Martigny, Switzerland
 Coverage  Edinburgh, Scotland
 Coverage  Delft, The Netherlands
 Date   2005-2007
 Description  The AMI Meeting Corpus is a multi-modal data set consisting of 100 hours of meeting recordings. Around two-thirds of the data has been elicited using a scenario in which the participants play different roles in a design team, taking a design project from kick-off to completion over the course of a day. The rest consists of naturally occurring meetings in a range of domains, and other scenarised meetings. The corpus has been recorded using synchronised recording devices (close-talking and far-field microphones, individual and room-view video cameras, projection, a whiteboard, individual pens), has been fully transcribed, and annotations for many different phenomena (such as dialogue acts or head movement) have been made. Although the AMI Meeting Corpus was created for the uses of a consortium that is developing meeting browsing technology, it is designed to be useful for a wide range of research areas.
 Description (URI)  http://corpus.amiproject.org
 Description (URI)  http://corpus.amiproject.org/documentations/
 Format  WAV
 Format (IMT)  audio/x-wav
 Format  RealAudio
 Format (IMT)  audio/vnd.rn-realaudio
 Format  AVI
 Format (IMT)  video/x-msvideo
 Format  RealVideo
 Format (IMT)  video/vnd.rn-realvideo
 Format  XML
 Format (IMT)  text/xml
 Extent  100 hours
 Extent  171 meetings
 Identifier  AMI_Corpus
 Bibliographic Citation  Jean Carletta, Simone Ashby, Sebastien Bourban, Mike Flynn, Mael Guillemot, Thomas Hain, Jaroslav Kadlec, Vasilis Karaiskos, Wessel Kraaij, Melissa Kronenthal, Guillaume Lathoud, Mike Lincoln, Agnes Lisowska, Iain McCowan, Wilfried Post, Dennis Reidsma, and Pierre Wellner, "The AMI Meeting Corpus: A Pre-announcement", in Machine Learning for Multimodal Interaction II, edited by S. Renals and S. Bengio, LNCS 3869, Springer-Verlag, Berlin/Heidelberg, 2006, pages 28-39.
 Bibliographic Citation  Jean Carletta, "Unleashing the killer corpus: experiences in creating the multieverything AMI Meeting Corpus", Language Resources and Evaluation, vol. 41, n. 2, 2007, pages 181-190.
 Language (ISO639-3)  English [eng], English
 Publisher  AMI Consortium
 Has Part  AMI Corpus - Scenario-based Meeting Media
 Has Part  AMI Corpus - Non Scenario-based Meeting Media
 Has Part  AMI Corpus - Annotations and Metadata
 Rights  All of the signals and transcription, and some of the annotations, have been released publicly under the AMI Meeting Corpus license, very similar to the Creative Commons Attribution NonCommercial ShareAlike 2.5 License (http://creativecommons.org/licenses/by-nc-sa/2.5).
 Rights (URI)  http://corpus.amiproject.org/documentations/license
 Subject  Multi-party meetings
 Type (DCMI)  Sound
 Type (DCMI)  Text
 Type (DCMI)  MovingImage
 Type (DCMI)  StillImage
 Type (DCMI)  Collection
 Type (OLAC)  Discourse type: Dialogue, dialogue
 Type (OLAC)  Linguistic type: Primary text, primary text
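
The display format above is a flattening of the XML: each element becomes one row whose label is the element name, with a qualifier in parentheses drawn from the xsi:type refinement or the olac:code attribute. The sketch below reproduces that mapping in rough form; it is simplified, since the real display also resolves coded values against the OLAC vocabularies (showing, e.g., "ISO639-3" for language codes rather than the raw code).

import xml.etree.ElementTree as ET

NS = {
    "olac": "http://www.language-archives.org/OLAC/1.1/",
    "xsi": "http://www.w3.org/2001/XMLSchema-instance",
}

def display_rows(record_xml: str) -> list[str]:
    """Render each element of an OLAC record as a 'Label (qualifier)  value' row."""
    rows = []
    for el in ET.fromstring(record_xml):
        name = el.tag.split("}")[-1]                 # '{ns}hasPart' -> 'hasPart'
        label = name[0].upper() + name[1:]           # -> 'HasPart'
        xsi_type = el.get(f"{{{NS['xsi']}}}type")    # e.g. 'dcterms:IMT'
        code = el.get(f"{{{NS['olac']}}}code")       # e.g. 'author'
        # Prefer the OLAC code where present; otherwise the scheme name.
        qualifier = code or (xsi_type.split(":")[-1] if xsi_type else None)
        value = (el.text or "").strip()
        rows.append(f"{label} ({qualifier})  {value}" if qualifier
                    else f"{label}  {value}")
    return rows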

Metadata quality analysis

OLAC metadata records are scored for quality on the 10-point scale explained in OLAC Metadata Metrics. The score for the record above, along with comments on changes that could improve it, is as follows:

Component           +   -   Comments
Title               1   0
Date                1   0
Agent               1   0
About               1   0
Depth               1   0
Content Language    1   0
Subject Language    1   0
OLAC Type           1   0
DCMI Type           1   0
Precision           1   0
Quality score      10
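
The ten components can be tallied mechanically. The sketch below pairs each scored component with one plausible presence check; every predicate (the Depth threshold, the Precision ratio, the treatment of Subject Language) is an assumption made for this example, not the normative definition from OLAC Metadata Metrics.

import xml.etree.ElementTree as ET

NS = {
    "olac": "http://www.language-archives.org/OLAC/1.1/",
    "dc": "http://purl.org/dc/elements/1.1/",
    "xsi": "http://www.w3.org/2001/XMLSchema-instance",
}
XSI_TYPE = f"{{{NS['xsi']}}}type"

def quality_report(record_xml: str) -> int:
    """Tally a toy ten-point score for one OLAC record.

    Every predicate below is a stand-in chosen for illustration, not
    the normative definition from OLAC Metadata Metrics.
    """
    root = ET.fromstring(record_xml)
    elements = list(root)
    typed = [e.get(XSI_TYPE) or "" for e in elements]
    has = lambda path: root.find(path, NS) is not None

    components = {
        "Title": has("dc:title"),
        "Date": has("dc:date"),
        "Agent": has("dc:creator") or has("dc:contributor") or has("dc:publisher"),
        "About": has("dc:subject") or has("dc:description"),
        "Depth": len(elements) >= 8,          # assumed threshold
        "Content Language": "olac:language" in typed,
        "Subject Language": True,             # assumed satisfied or inapplicable here
        "OLAC Type": any(t.startswith("olac:") and t.endswith("-type") for t in typed),
        "DCMI Type": "dcterms:DCMIType" in typed,
        # Precision: assumed to reward records where a substantial share of
        # elements carry an encoding scheme (xsi:type).
        "Precision": sum(bool(t) for t in typed) / max(len(elements), 1) >= 0.4,
    }
    for name, ok in components.items():
        print(f"{name:17} {1 if ok else 0}")
    score = sum(components.values())
    print(f"{'Quality score':17} {score}")
    return score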