Sample Metadata Record
<olac:olac>
<dc:title>The AMI Meeting Corpus</dc:title>
<dc:contributor xsi:type="olac:role" olac:code="author">AMI Consortium</dc:contributor>
<dc:contributor xsi:type="olac:role" olac:code="editor">Jean Carletta</dc:contributor>
<dc:contributor xsi:type="olac:role" olac:code="recorder">Idiap Research Institute</dc:contributor>
<dc:contributor xsi:type="olac:role" olac:code="recorder">University of Edinburgh</dc:contributor>
<dc:contributor xsi:type="olac:role" olac:code="recorder">TNO (Netherlands Organization for Applied Scientific Research)</dc:contributor>
<dc:contributor xsi:type="olac:role" olac:code="annotator">University of Edinburgh</dc:contributor>
<dc:contributor xsi:type="olac:role" olac:code="sponsor">European Union - Framework Programmes 6 and 7</dc:contributor>
<dc:contributor xsi:type="olac:role" olac:code="sponsor">Swiss National Science Foundation - IM2 National Center of Competence in Research</dc:contributor>
<dc:coverage>Martigny, Switzerland</dc:coverage>
<dc:coverage>Edinburgh, Scotland</dc:coverage>
<dc:coverage>Delft, The Netherlands</dc:coverage>
<dc:date>2005-2007</dc:date>
<dc:description>The AMI Meeting Corpus is a multi-modal data set consisting of 100 hours of meeting recordings. Around two-thirds of the data has been elicited using a scenario in which the participants play different roles in a design team, taking a design project from kick-off to completion over the course of a day. The rest consists of naturally occurring meetings in a range of domains, and other scenarised meetings. The corpus has been recorded using synchronised recording devices (close-talking and far-field microphones, individual and room-view video cameras, projection, a whiteboard, individual pens), has been fully transcribed, and annotations for many different phenomena (such as dialogue acts or head movement) have been made. Although the AMI Meeting Corpus was created for the uses of a consortium that is developing meeting browsing technology, it is designed to be useful for a wide range of research areas.</dc:description>
<dc:description xsi:type="dcterms:URI">http://corpus.amiproject.org</dc:description>
<dc:description xsi:type="dcterms:URI">http://corpus.amiproject.org/documentations/</dc:description>
<dc:format>WAV</dc:format>
<dc:format xsi:type="dcterms:IMT">audio/x-wav</dc:format>
<dc:format>RealAudio</dc:format>
<dc:format xsi:type="dcterms:IMT">audio/vnd.rn-realaudio</dc:format>
<dc:format>AVI</dc:format>
<dc:format xsi:type="dcterms:IMT">video/x-msvideo</dc:format>
<dc:format>RealVideo</dc:format>
<dc:format xsi:type="dcterms:IMT">video/vnd.rn-realvideo</dc:format>
<dc:format>XML</dc:format>
<dc:format xsi:type="dcterms:IMT">text/xml</dc:format>
<dcterms:extent>100 hours</dcterms:extent>
<dcterms:extent>171 meetings</dcterms:extent>
<dc:identifier>AMI_Corpus</dc:identifier>
<dcterms:bibliographicCitation>Jean Carletta, Simone Ashby, Sebastien Bourban, Mike Flynn, Mael Guillemot, Thomas Hain, Jaroslav Kadlec, Vasilis Karaiskos, Wessel Kraaij, Melissa Kronenthal, Guillaume Lathoud, Mike Lincoln, Agnes Lisowska, Iain McCowan, Wilfried Post, Dennis Reidsma, and Pierre Wellner, "The AMI Meeting Corpus: A Pre-announcement", in Machine Learning for Multimodal Interaction II, edited by S. Renals and S. Bengio, LNCS 3869, Springer-Verlag, Berlin/Heidelberg, 2006, pages 28-39.</dcterms:bibliographicCitation>
<dcterms:bibliographicCitation>Jean Carletta, "Unleashing the killer corpus: experiences in creating the multieverything AMI Meeting Corpus", Language Resources and Evaluation, vol. 41, n. 2, 2007, pages 181-190.</dcterms:bibliographicCitation>
<dc:language xsi:type="olac:language" olac:code="eng">English</dc:language>
<dc:publisher>AMI Consortium</dc:publisher>
<dcterms:hasPart>AMI Corpus - Scenario-based Meeting Media</dcterms:hasPart>
<dcterms:hasPart>AMI Corpus - Non Scenario-based Meeting Media</dcterms:hasPart>
<dcterms:hasPart>AMI Corpus - Annotations and Metadata</dcterms:hasPart>
<dc:rights>All of the signals and transcription, and some of the annotations, have been released publicly under the AMI Meeting Corpus license, very similar to the Creative Commons Attribution NonCommercial ShareAlike 2.5 License (http://creativecommons.org/licenses/by-nc-sa/2.5).</dc:rights>
<dc:rights xsi:type="dcterms:URI">http://corpus.amiproject.org/documentations/license</dc:rights>
<dc:subject>Multi-party meetings</dc:subject>
<dc:type xsi:type="dcterms:DCMIType">Sound</dc:type>
<dc:type xsi:type="dcterms:DCMIType">Text</dc:type>
<dc:type xsi:type="dcterms:DCMIType">MovingImage</dc:type>
<dc:type xsi:type="dcterms:DCMIType">StillImage</dc:type>
<dc:type xsi:type="dcterms:DCMIType">Collection</dc:type>
<dc:type xsi:type="olac:discourse-type" olac:code="dialogue">dialogue</dc:type>
<dc:type xsi:type="olac:linguistic-type" olac:code="primary_text">primary text</dc:type>
</olac:olac>
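A record like this can be read with any namespace-aware XML parser. The following is a minimal sketch in Python, not part of the OLAC tooling: it groups the `dc:contributor` elements by their `olac:code` role. It assumes the usual namespace URIs for OLAC 1.1, Dublin Core, and XML Schema instance, which the listing above omits and which are therefore added to the root element here. `RECORD` is an abbreviated copy of the record, and `contributors_by_role` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs; the record as listed above does not declare them.
NS = {
    "olac": "http://www.language-archives.org/OLAC/1.1/",
    "dc": "http://purl.org/dc/elements/1.1/",
    "xsi": "http://www.w3.org/2001/XMLSchema-instance",
}

# Abbreviated copy of the record, with namespace declarations added to the root.
RECORD = """<olac:olac
    xmlns:olac="http://www.language-archives.org/OLAC/1.1/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <dc:title>The AMI Meeting Corpus</dc:title>
  <dc:contributor xsi:type="olac:role" olac:code="author">AMI Consortium</dc:contributor>
  <dc:contributor xsi:type="olac:role" olac:code="editor">Jean Carletta</dc:contributor>
  <dc:contributor xsi:type="olac:role" olac:code="recorder">Idiap Research Institute</dc:contributor>
  <dc:contributor xsi:type="olac:role" olac:code="recorder">University of Edinburgh</dc:contributor>
  <dc:language xsi:type="olac:language" olac:code="eng">English</dc:language>
</olac:olac>"""


def contributors_by_role(xml_text):
    """Return a dict mapping each olac:code role to a list of contributor names."""
    root = ET.fromstring(xml_text)
    # ElementTree exposes namespaced attributes as "{namespace-uri}localname".
    code_attr = "{%s}code" % NS["olac"]
    roles = {}
    for element in root.findall("dc:contributor", NS):
        role = element.get(code_attr, "unspecified")
        roles.setdefault(role, []).append(element.text)
    return roles


if __name__ == "__main__":
    for role, names in contributors_by_role(RECORD).items():
        print(role, "->", ", ".join(names))
```

The same pattern should apply to the other refined elements, for example reading the `olac:code` attribute of `dc:language` to obtain the ISO 639-3 code.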
| Element | Value |
|---|---|
| Title | The AMI Meeting Corpus |
| Contributor (author) | AMI Consortium |
| Contributor (editor) | Jean Carletta |
| Contributor (recorder) | Idiap Research Institute |
| Contributor (recorder) | University of Edinburgh |
| Contributor (recorder) | TNO (Netherlands Organization for Applied Scientific Research) |
| Contributor (annotator) | University of Edinburgh |
| Contributor (sponsor) | European Union - Framework Programmes 6 and 7 |
| Contributor (sponsor) | Swiss National Science Foundation - IM2 National Center of Competence in Research |
| Coverage | Martigny, Switzerland |
| Coverage | Edinburgh, Scotland |
| Coverage | Delft, The Netherlands |
| Date | 2005-2007 |
| Description | The AMI Meeting Corpus is a multi-modal data set consisting of 100 hours of meeting recordings. Around two-thirds of the data has been elicited using a scenario in which the participants play different roles in a design team, taking a design project from kick-off to completion over the course of a day. The rest consists of naturally occurring meetings in a range of domains, and other scenarised meetings. The corpus has been recorded using synchronised recording devices (close-talking and far-field microphones, individual and room-view video cameras, projection, a whiteboard, individual pens), has been fully transcribed, and annotations for many different phenomena (such as dialogue acts or head movement) have been made. Although the AMI Meeting Corpus was created for the uses of a consortium that is developing meeting browsing technology, it is designed to be useful for a wide range of research areas. |
| Description (URI) | http://corpus.amiproject.org |
| Description (URI) | http://corpus.amiproject.org/documentations/ |
| Format | WAV |
| Format (IMT) | audio/x-wav |
| Format | RealAudio |
| Format (IMT) | audio/vnd.rn-realaudio |
| Format | AVI |
| Format (IMT) | video/x-msvideo |
| Format | RealVideo |
| Format (IMT) | video/vnd.rn-realvideo |
| Format | XML |
| Format (IMT) | text/xml |
| Extent | 100 hours |
| Extent | 171 meetings |
| Identifier | AMI_Corpus |
| Bibliographic Citation | Jean Carletta, Simone Ashby, Sebastien Bourban, Mike Flynn, Mael Guillemot, Thomas Hain, Jaroslav Kadlec, Vasilis Karaiskos, Wessel Kraaij, Melissa Kronenthal, Guillaume Lathoud, Mike Lincoln, Agnes Lisowska, Iain McCowan, Wilfried Post, Dennis Reidsma, and Pierre Wellner, "The AMI Meeting Corpus: A Pre-announcement", in Machine Learning for Multimodal Interaction II, edited by S. Renals and S. Bengio, LNCS 3869, Springer-Verlag, Berlin/Heidelberg, 2006, pages 28-39. |
| Bibliographic Citation | Jean Carletta, "Unleashing the killer corpus: experiences in creating the multieverything AMI Meeting Corpus", Language Resources and Evaluation, vol. 41, n. 2, 2007, pages 181-190. |
| Language (ISO639-3) | English [eng], English |
| Publisher | AMI Consortium |
| Has Part | AMI Corpus - Scenario-based Meeting Media |
| Has Part | AMI Corpus - Non Scenario-based Meeting Media |
| Has Part | AMI Corpus - Annotations and Metadata |
| Rights | All of the signals and transcription, and some of the annotations, have been released publicly under the AMI Meeting Corpus license, very similar to the Creative Commons Attribution NonCommercial ShareAlike 2.5 License (http://creativecommons.org/licenses/by-nc-sa/2.5). |
| Rights (URI) | http://corpus.amiproject.org/documentations/license |
| Subject | Multi-party meetings |
| Type (DCMI) | Sound |
| Type (DCMI) | Text |
| Type (DCMI) | MovingImage |
| Type (DCMI) | StillImage |
| Type (DCMI) | Collection |
| Type (OLAC) | Discourse type: Dialogue, dialogue |
| Type (OLAC) | Linguistic type: Primary text, primary text |
OLAC metadata records are scored for metadata quality on a 10-point scale explained in OLAC Metadata Metrics. The score for the above record (along with comments on changes that could improve the score) is as follows:
| Component | + | - | Comments |
|---|---|---|---|
| Title | 1 | 0 | |
| Date | 1 | 0 | |
| Agent | 1 | 0 | |
| About | 1 | 0 | |
| Depth | 1 | 0 | |
| Content Language | 1 | 0 | |
| Subject Language | 1 | 0 | |
| OLAC Type | 1 | 0 | |
| DCMI Type | 1 | 0 | |
| Precision | 1 | 0 | |
| Quality score | 10 | | |
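The arithmetic behind the final row can be sketched as follows. This is an illustration only, under the assumption that the "+" column holds points earned, the "-" column holds points not earned, and the quality score is the sum of the "+" column. OLAC Metadata Metrics remains the authoritative definition, and `quality_score` is a hypothetical helper name.

```python
# Rows copied from the score table above: (component, points earned, points not earned).
SCORE_ROWS = [
    ("Title", 1, 0),
    ("Date", 1, 0),
    ("Agent", 1, 0),
    ("About", 1, 0),
    ("Depth", 1, 0),
    ("Content Language", 1, 0),
    ("Subject Language", 1, 0),
    ("OLAC Type", 1, 0),
    ("DCMI Type", 1, 0),
    ("Precision", 1, 0),
]


def quality_score(rows):
    """Sum the points earned across all components (assumed scoring rule)."""
    return sum(earned for _component, earned, _missed in rows)


def components_to_improve(rows):
    """List the components with at least one point not earned."""
    return [component for component, _earned, missed in rows if missed > 0]


if __name__ == "__main__":
    print("Quality score:", quality_score(SCORE_ROWS))
    print("Components to improve:", components_to_improve(SCORE_ROWS))
```

For this record every component earns its point, so no component is listed for improvement and the Comments column is empty.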