Sample Metadata Record

oai:compendium.lr.sign-lang.uni-hamburg.de:dgscorpus


XML format

<olac:olac>
<dc:title>DGS Corpus</dc:title>
<dc:contributor xsi:type="olac:role" olac:code="sponsor">Universität Hamburg</dc:contributor>
<dc:description>The DGS Corpus is a collection of German Sign Language data from 330 signers from Germany. The 15-year long-term project is based at the Institute of German Sign Language and Communication of the Deaf at the Universität Hamburg and started in 2009. It is led by Thomas Hanke and Annika Herrmann. The DGS Corpus is used to build the DGS-German dictionary DW-DGS. The signers were recorded in pairs in a mobile studio travelling to 13 spots in Germany. The signers were sitting opposite each other in front of a blue background. In total seven cameras were used for the recordings, five HD cameras and two Bumblebees. The Bumblebees were later replaced by three HD stereo cameras. The cameras were set up in three different angles: one recording a total view including the moderator, one filming the signers from the front and one from above. The original resolution is 1080i50 for the videos of 2010 and 720p50 for the videos from 2011 onwards. Public data is provided in 360p50. A Deaf moderator was leading through the tasks. The DGS Corpus is available in different formats: MY DGS is a community portal which offers an easy access to the data tailored for users interested in the content of the conversations. Videos can be watched in an online viewer with subtitles. MY DGS – annotated is a research portal which offers the annotated corpus data for linguistic research. MY DGS – ANNIS is another research portal making the DGS Corpus available via the corpus tool ANNIS, a web browser-based search and visualization architecture for complex multilayer linguistic corpora.</dc:description>
<dc:description>https://www.sign-lang.uni-hamburg.de/lr/compendium/corpus/dgscorpus.html</dc:description>
<dc:format>560 hours recorded, 657000 tokens annotated</dc:format>
<dc:identifier xsi:type="dcterms:URI">https://ling.meine-dgs.de</dc:identifier>
<dcterms:bibliographicCitation>Konrad, R., Hanke, T., Langer, G., Blanck, D., Bleicken, J., Hofmann, I., Jeziorski, O., König, L., König, S., Nishio, R., Regen, A., Salden, U., Wagner, S., Worseck, S., Böse, O., Jahn, E., Schulder, M. 2020. MEINE DGS – annotiert. Öffentliches Korpus der Deutschen Gebärdensprache, 3. Release / MY DGS – annotated. Public Corpus of German Sign Language, 3rd release [Dataset]. Universität Hamburg. https://doi.org/10.25592/dgs.corpus-3.0</dcterms:bibliographicCitation>
<dc:subject xsi:type="olac:linguistic-field" olac:code="text_and_corpus_linguistics"/>
<dc:subject xsi:type="olac:discourse-type" olac:code="dialogue"/>
<dc:subject>Corpus of German Sign Language</dc:subject>
<dc:subject xsi:type="olac:language" olac:code="gsg"/>
<dc:type xsi:type="dcterms:DCMIType">Collection</dc:type>
<dc:type xsi:type="dcterms:DCMIType">MovingImage</dc:type>
<dc:type xsi:type="olac:linguistic-type" olac:code="primary_text"/>
</olac:olac>

Display format

 Title  DGS Corpus
 Contributor (sponsor)  Universität Hamburg
 Description  The DGS Corpus is a collection of German Sign Language data from 330 signers from Germany. The 15-year long-term project is based at the Institute of German Sign Language and Communication of the Deaf at the Universität Hamburg and started in 2009. It is led by Thomas Hanke and Annika Herrmann. The DGS Corpus is used to build the DGS-German dictionary DW-DGS. The signers were recorded in pairs in a mobile studio travelling to 13 spots in Germany. The signers were sitting opposite each other in front of a blue background. In total seven cameras were used for the recordings, five HD cameras and two Bumblebees. The Bumblebees were later replaced by three HD stereo cameras. The cameras were set up in three different angles: one recording a total view including the moderator, one filming the signers from the front and one from above. The original resolution is 1080i50 for the videos of 2010 and 720p50 for the videos from 2011 onwards. Public data is provided in 360p50. A Deaf moderator was leading through the tasks. The DGS Corpus is available in different formats: MY DGS is a community portal which offers an easy access to the data tailored for users interested in the content of the conversations. Videos can be watched in an online viewer with subtitles. MY DGS – annotated is a research portal which offers the annotated corpus data for linguistic research. MY DGS – ANNIS is another research portal making the DGS Corpus available via the corpus tool ANNIS, a web browser-based search and visualization architecture for complex multilayer linguistic corpora.
 Description  https://www.sign-lang.uni-hamburg.de/lr/compendium/corpus/dgscorpus.html
 Format  560 hours recorded, 657000 tokens annotated
 Identifier (URI)  https://ling.meine-dgs.de
 Bibliographic Citation  Konrad, R., Hanke, T., Langer, G., Blanck, D., Bleicken, J., Hofmann, I., Jeziorski, O., König, L., König, S., Nishio, R., Regen, A., Salden, U., Wagner, S., Worseck, S., Böse, O., Jahn, E., Schulder, M. 2020. MEINE DGS – annotiert. Öffentliches Korpus der Deutschen Gebärdensprache, 3. Release / MY DGS – annotated. Public Corpus of German Sign Language, 3rd release [Dataset]. Universität Hamburg. https://doi.org/10.25592/dgs.corpus-3.0
 Subject (OLAC)  Text and corpus linguistics
 Subject (OLAC)  Discourse type: Dialogue
 Subject  Corpus of German Sign Language
 Subject (ISO639-3)  German Sign Language [gsg]
 Type (DCMI)  Collection
 Type (DCMI)  MovingImage
 Type (OLAC)  Linguistic type: Primary text

Metadata quality analysis

OLAC metadata records are scored for metadata quality on a 10-point scale explained in OLAC Metadata Metrics. The score for the above record (along with comments on changes that could improve the score) is as follows:

Component          +      -      Comments
Title              1      0
Date               0      1      Add a dc:date element (or one of its refinements, like dcterms:created or dcterms:issued).
Agent              1      0
About              1      0
Depth              1      0
Content Language   0      1      Add a dc:language element with an ISO 639-3 code to identify the language in which the resource is written or spoken.
Subject Language   1      0
OLAC Type          1      0
DCMI Type          1      0
Precision          0.67   0.33   For the full score, make use of at least one more encoding scheme in addition to the ones counted explicitly in other components of the score. For instance,
                                   • use dcterms:W3CDTF on dc:date (or its refinements)
                                   • use dcterms:IMT on dc:format
Quality score  7.67
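
The three comments in the table can be illustrated with a short sketch of elements that might be added inside the olac:olac element of the record. All values here are assumptions for illustration and are not part of the actual record: 2020 is taken from the release year in the bibliographic citation, gsg assumes the recordings are treated as content in German Sign Language, and video/mp4 is a hypothetical media type, since the record does not state a delivery format. Namespace declarations are omitted, as in the record above.

```xml
<!-- Date component: a dc:date with the W3CDTF encoding scheme.
     The year 2020 is an assumption based on the citation's release year. -->
<dc:date xsi:type="dcterms:W3CDTF">2020</dc:date>

<!-- Content Language component: ISO 639-3 code for the language of the resource.
     gsg (German Sign Language) is an assumption about the recorded content. -->
<dc:language xsi:type="olac:language" olac:code="gsg"/>

<!-- Precision component: an additional encoding scheme on dc:format.
     video/mp4 is a hypothetical value; the record does not state the media type. -->
<dc:format xsi:type="dcterms:IMT">video/mp4</dc:format>
```

With real values in place of these placeholders, such elements would be expected to address the Date, Content Language, and Precision comments; the resulting score depends on how the OLAC metrics evaluate the revised record.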