Sample Metadata Record

oai:EarlyMandarin.sinica.edu.tw:EarlyMandarin


XML format

<olac:olac>
<dc:contributor xsi:type="olac:role" olac:code="sponsor">Academia Sinica Computing Centre</dc:contributor>
<dc:contributor xsi:type="olac:role" olac:code="editor">Chinese Knowledge Information Processing Group of Institute of Information Science of Academia Sinica</dc:contributor>
<dc:contributor xsi:type="olac:role" olac:code="editor">Corpus Research Group in Institute of Linguistics, Academia Sinica</dc:contributor>
<dc:contributor xsi:type="olac:role" olac:code="sponsor">Chu-Ren Huang</dc:contributor>
<dc:contributor>Academia Sinica</dc:contributor>
<dc:contributor>Institute of History and Philology, Academia Sinica</dc:contributor>
<dc:contributor>Chiang Ching-Kuo Foundation for International Scholarly Exchange</dc:contributor>
<dc:contributor xsi:type="olac:role" olac:code="sponsor">Cheng-hui Liu</dc:contributor>
<dc:contributor xsi:type="olac:role" olac:code="sponsor">Keh-jiann Chen</dc:contributor>
<dc:contributor xsi:type="olac:role" olac:code="sponsor">Pei-chuan Wei</dc:contributor>
<dc:contributor xsi:type="olac:role" olac:code="sponsor">P.M. Thompson</dc:contributor>
<dc:coverage>Early (After Tang and Five Dynasties)</dc:coverage>
<dc:coverage>Primitive (Early Qin to Western Han)</dc:coverage>
<dc:coverage>Medieval (Eastern Han to Wei-Jin-Southern and Northern Dynasties)</dc:coverage>
<dc:coverage>Zhonghua (nation)</dc:coverage>
<dc:coverage>Chung-hua Min-kuo (nation)</dc:coverage>
<dc:coverage>China</dc:coverage>
<dc:coverage>Taiwan</dc:coverage>
<dc:creator>Academia Sinica Computing Centre</dc:creator>
<dc:creator>Chinese Knowledge Information Processing Group of Institute of Information Science of Academia Sinica</dc:creator>
<dc:creator>Corpus Research Group in Institute of Linguistics, Academia Sinica</dc:creator>
<dcterms:created>1990</dcterms:created>
<dcterms:available>2001/12, December, 2001</dcterms:available>
<dcterms:modified>1997</dcterms:modified>
<dcterms:modified>1995</dcterms:modified>
<dc:description>Institute of Information Science, Institute of Linguistics, ASCC. All Rights Reserved.</dc:description>
<dc:description>http://www.sinica.edu.tw/ftms-bin/kiwi1/mkiwi.sh</dc:description>
<dc:description>http://www.sinica.edu.tw/Early_Mandarin/early_mandarin_chinese_c_help.html</dc:description>
<dc:description>The WWW version of the Sinica Early Chinese Corpus was opened to the research community in November 2001. The literature available for searching is "Hong-lou-mong" and "San-sui-ping-yao-zhuan". Although the search function and the criteria for segmentation and tagging are mostly the same as those of the Sinica Corpus for modern Chinese, this corpus has its own features as well. For example, search results give the part of speech along with the lemma, and also the source of the cited sentence, which helps researchers cite their references. In addition, some changes were made to the segmentation and tagging standard because its focus differs from that of the standard for analyzing the modern language. For instance, the verb-complement structure is labelled in more detail in the Sinica Early Chinese Corpus.</dc:description>
<dc:description>Scholarly Exchange and Institute of History and Philology in Academia Sinica. The objective at that time was only to collect the raw text. Since then, the collection of the raw corpus has never stopped. The collected data has extended from Primitive Chinese to Medieval Chinese and Early Chinese. The collection work is mainly managed by Prof. Pei-Chuang Wei and is funded by Academia Sinica. The tagging of Primitive Chinese data started in 1995. For Early Chinese, the tagging system was designed in 1997 and applied immediately. The Early Chinese project was led by Prof. Pei-Chuang Wei and Prof. Cheng-hui Liu (Hsing-Hua University). Financial support came from Academia Sinica and the National Science Council; technical support on the tagging system and computer science was provided by Prof. Chu-Ren Huang, Prof. Ker-Jiann Chen, and ASCC.</dc:description>
<dc:format>4.5 million words, 19044 KB, saved in text file and presented in HTML format</dc:format>
<dc:format>big5</dc:format>
<dc:format>Criteria for POS and Feature tagging in Academia Sinica Balanced Corpus of Early Chinese</dc:format>
<dc:identifier>http://www.sinica.edu.tw/Early_Mandarin/</dc:identifier>
<dc:language>C++</dc:language>
<dc:language>JAVASCRIPT</dc:language>
<dc:language xsi:type="olac:language" olac:code="cmn">x-sil-CHN</dc:language>
<dc:language xsi:type="olac:language" olac:code="cmn">x-sil-CHN</dc:language>
<dc:publisher>Academia Sinica</dc:publisher>
<dc:publisher>http://www.sinica.edu.tw/</dc:publisher>
<dc:publisher>http://www.iis.sinica.edu.tw/</dc:publisher>
<dc:publisher>Institute of Linguistics, Preparatory Office Academia Sinica</dc:publisher>
<dc:publisher>http://www.ling.sinica.edu.tw/</dc:publisher>
<dc:publisher>Institute of Information Science Academia Sinica</dc:publisher>
<dc:relation>http://www.ling.sinica.edu.tw/formosan/</dc:relation>
<dcterms:requires>At least 32M memory</dcterms:requires>
<dcterms:requires>At least 32M memory</dcterms:requires>
<dcterms:requires>At least 32M memory</dcterms:requires>
<dcterms:requires>At least 32M memory</dcterms:requires>
<dcterms:requires>At least 32M memory</dcterms:requires>
<dcterms:requires>At least 32M memory</dcterms:requires>
<dcterms:requires>Unix/Linux</dcterms:requires>
<dcterms:isPartOf>Academia Sinica Ancient Chinese Corpus</dcterms:isPartOf>
<dcterms:references>Academia Sinica Formosan Language Archive</dcterms:references>
<dcterms:references>http://www.sinica.edu.tw/SinicaCorpus/</dcterms:references>
<dcterms:references>Academia Sinica Balanced Corpus of Modern Chinese</dcterms:references>
<dc:rights>This notice regulates your usage of this web site and its associated services including interface, corpus data, segmenting and tagging standard, etc. All rights are reserved by Academia Sinica. In your research you may apply the data resulting from the searching processes of our interface systems. However, you are prohibited to abstract, alter or publish any searching results voluntarily. The copyright of corpus data is still reserved by original author or source and cannot be reproduced, copied or violate anything involving intellectual property.</dc:rights>
<dc:source>San-sui-ping-yao-zhuan</dc:source>
<dc:source>Hong-lou-mong</dc:source>
<dc:subject xsi:type="olac:language" olac:code="eng">en-us</dc:subject>
<dc:subject xsi:type="olac:language" olac:code="cmn">x-sil-CHN</dc:subject>
<dc:title>Academia Sinica Tagged Corpus of Early Mandarin Chinese</dc:title>
<dc:type xsi:type="dcterms:DCMIType">Text</dc:type>
</olac:olac>

Display format

 Contributor (sponsor)  Academia Sinica Computing Centre
 Contributor (editor)  Chinese Knowledge Information Processing Group of Institute of Information Science of Academia Sinica
 Contributor (editor)  Corpus Research Group in Institute of Linguistics, Academia Sinica
 Contributor (sponsor)  Chu-Ren Huang
 Contributor  Academia Sinica
 Contributor  Institute of History and Philology, Academia Sinica
 Contributor  Chiang Ching-Kuo Foundation for International Scholarly Exchange
 Contributor (sponsor)  Cheng-hui Liu
 Contributor (sponsor)  Keh-jiann Chen
 Contributor (sponsor)  Pei-chuan Wei
 Contributor (sponsor)  P.M. Thompson
 Coverage  Early (After Tang and Five Dynasties)
 Coverage  Primitive (Early Qin to Western Han)
 Coverage  Medieval (Eastern Han to Wei-Jin-Southern and Northern Dynasties)
 Coverage  Zhonghua (nation)
 Coverage  Chung-hua Min-kuo (nation)
 Coverage  China
 Coverage  Taiwan
 Creator  Academia Sinica Computing Centre
 Creator  Chinese Knowledge Information Processing Group of Institute of Information Science of Academia Sinica
 Creator  Corpus Research Group in Institute of Linguistics, Academia Sinica
 Created  1990
 Available  2001/12, December, 2001
 Modified  1997
 Modified  1995
 Description  Institute of Information Science, Institute of Linguistics, ASCC. All Rights Reserved.
 Description  http://www.sinica.edu.tw/ftms-bin/kiwi1/mkiwi.sh
 Description  http://www.sinica.edu.tw/Early_Mandarin/early_mandarin_chinese_c_help.html
 Description  The WWW version of the Sinica Early Chinese Corpus was opened to the research community in November 2001. The literature available for searching is "Hong-lou-mong" and "San-sui-ping-yao-zhuan". Although the search function and the criteria for segmentation and tagging are mostly the same as those of the Sinica Corpus for modern Chinese, this corpus has its own features as well. For example, search results give the part of speech along with the lemma, and also the source of the cited sentence, which helps researchers cite their references. In addition, some changes were made to the segmentation and tagging standard because its focus differs from that of the standard for analyzing the modern language. For instance, the verb-complement structure is labelled in more detail in the Sinica Early Chinese Corpus.
 Description  Scholarly Exchange and Institute of History and Philology in Academia Sinica. The objective at that time was only to collect the raw text. Since then, the collection of the raw corpus has never stopped. The collected data has extended from Primitive Chinese to Medieval Chinese and Early Chinese. The collection work is mainly managed by Prof. Pei-Chuang Wei and is funded by Academia Sinica. The tagging of Primitive Chinese data started in 1995. For Early Chinese, the tagging system was designed in 1997 and applied immediately. The Early Chinese project was led by Prof. Pei-Chuang Wei and Prof. Cheng-hui Liu (Hsing-Hua University). Financial support came from Academia Sinica and the National Science Council; technical support on the tagging system and computer science was provided by Prof. Chu-Ren Huang, Prof. Ker-Jiann Chen, and ASCC.
 Format  4.5 million words, 19044 KB, saved in text file and presented in HTML format
 Format  big5
 Format  Criteria for POS and Feature tagging in Academia Sinica Balanced Corpus of Early Chinese
 Identifier  http://www.sinica.edu.tw/Early_Mandarin/
 Language  C++
 Language  JAVASCRIPT
 Language (ISO639-3)  [cmn], x-sil-CHN
 Language (ISO639-3)  [cmn], x-sil-CHN
 Publisher  Academia Sinica
 Publisher  http://www.sinica.edu.tw/
 Publisher  http://www.iis.sinica.edu.tw/
 Publisher  Institute of Linguistics, Preparatory Office Academia Sinica
 Publisher  http://www.ling.sinica.edu.tw/
 Publisher  Institute of Information Science Academia Sinica
 Relation  http://www.ling.sinica.edu.tw/formosan/
 Requires  At least 32M memory
 Requires  At least 32M memory
 Requires  At least 32M memory
 Requires  At least 32M memory
 Requires  At least 32M memory
 Requires  At least 32M memory
 Requires  Unix/Linux
 Is Part Of  Academia Sinica Ancient Chinese Corpus
 References  Academia Sinica Formosan Language Archive
 References  http://www.sinica.edu.tw/SinicaCorpus/
 References  Academia Sinica Balanced Corpus of Modern Chinese
 Rights  This notice regulates your usage of this web site and its associated services including interface, corpus data, segmenting and tagging standard, etc. All rights are reserved by Academia Sinica. In your research you may apply the data resulting from the searching processes of our interface systems. However, you are prohibited to abstract, alter or publish any searching results voluntarily. The copyright of corpus data is still reserved by original author or source and cannot be reproduced, copied or violate anything involving intellectual property.
 Source  San-sui-ping-yao-zhuan
 Source  Hong-lou-mong
 Subject (ISO639-3)  [eng], en-us
 Subject (ISO639-3)  [cmn], x-sil-CHN
 Title  Academia Sinica Tagged Corpus of Early Mandarin Chinese
 Type (DCMI)  Text

Metadata quality analysis

OLAC metadata records are scored for metadata quality on a 10-point scale explained in OLAC Metadata Metrics. The score for the above record (along with comments on changes that could improve the score) is as follows:

Component         +     -     Comments
Title             1     0
Date              1     0
Agent             1     0
About             1     0
Depth             1     0
Content Language  1     0
Subject Language  1     0
OLAC Type         0     1     Add a dc:type element that uses the OLAC linguistic-type encoding scheme to identify the type of the resource from a linguistic point of view.
DCMI Type         1     0
Precision         0.33  0.67  For the full score, make use of at least 2 more encoding schemes in addition to the ones counted explicitly in other components of the score. For instance,
  • use dcterms:W3CDTF on dc:date (or its refinements)
  • use dcterms:URI when the value of an element is a URL
  • use dcterms:IMT on dc:format
  • use dcterms:Box or dcterms:Point or dcterms:TGN on dcterms:spatial

Quality score     8.33
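
The comments on the OLAC Type and Precision components can be illustrated with a short fragment. This is only a sketch of how the suggested encoding schemes might be applied to elements of the record above, not a corrected version of the record. The olac:code value primary_text, the W3CDTF rendering 2001-12 of the availability date, and the text/html media type are assumptions made for illustration, and the namespace declarations are added only so that the fragment is self-contained.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<olac:olac xmlns:olac="http://www.language-archives.org/OLAC/1.1/"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:dcterms="http://purl.org/dc/terms/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

  <!-- OLAC Type: linguistic-type scheme; "primary_text" is an assumed
       classification for a tagged corpus, not taken from the record -->
  <dc:type xsi:type="olac:linguistic-type" olac:code="primary_text"/>

  <!-- Precision: W3CDTF on date refinements; "2001-12" is an assumed
       normalization of the record's "2001/12, December, 2001" -->
  <dcterms:created xsi:type="dcterms:W3CDTF">1990</dcterms:created>
  <dcterms:available xsi:type="dcterms:W3CDTF">2001-12</dcterms:available>

  <!-- Precision: URI scheme on an element whose value is a URL -->
  <dc:identifier xsi:type="dcterms:URI">http://www.sinica.edu.tw/Early_Mandarin/</dc:identifier>

  <!-- Precision: IMT on dc:format; "text/html" is assumed from the
       record's statement that the corpus is presented in HTML format -->
  <dc:format xsi:type="dcterms:IMT">text/html</dc:format>

</olac:olac>
```

Whether these additions would bring the record to the full score depends on how OLAC Metadata Metrics counts the schemes. The dcterms:spatial suggestion is not shown because the record expresses its coverage with dc:coverage rather than dcterms:spatial.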