Sample Metadata Record

oai:SinicaCorpus.sinica.edu.tw:SinicaCorpus


XML format

<olac:olac>
<dc:contributor xsi:type="olac:role" olac:code="editor">Chinese Knowledge Information Processing Group of Institute of Information Science of Academia Sinica</dc:contributor>
<dc:contributor xsi:type="olac:role" olac:code="editor">Corpus Research Group in Institute of Linguistics, Academia Sinica</dc:contributor>
<dc:contributor xsi:type="olac:role" olac:code="sponsor">Academia Sinica Computing Centre</dc:contributor>
<dc:contributor>Chiang Ching-Kuo Foundation for International Scholarly Exchange</dc:contributor>
<dc:contributor>National Science Council</dc:contributor>
<dc:contributor xsi:type="olac:role" olac:code="sponsor">Chu-Ren Huang</dc:contributor>
<dc:contributor>Keh-jiann Chen</dc:contributor>
<dc:contributor>Knowledge Representation and Language Engineering for Mandarin Chinese-Core Technology and Tool Libraries for Language Processing</dc:contributor>
<dc:coverage>Modern</dc:coverage>
<dc:coverage>Chung-hua Min-kuo (nation)</dc:coverage>
<dc:coverage>Taiwan</dc:coverage>
<dc:creator>Chinese Knowledge Information Processing Group of Institute of Information Science of Academia Sinica</dc:creator>
<dc:creator>Corpus Research Group in Institute of Linguistics, Academia Sinica</dc:creator>
<dc:creator>Academia Sinica Computing Centre</dc:creator>
<dcterms:created>1990</dcterms:created>
<dcterms:available>1995, July, 1995</dcterms:available>
<dcterms:available>1996, November, 1996</dcterms:available>
<dcterms:modified>1991</dcterms:modified>
<dcterms:modified>1994</dcterms:modified>
<dcterms:modified>1997</dcterms:modified>
<dcterms:modified>2001, December, 2001</dcterms:modified>
<dc:description>tute of Information Science, Institute of Linguistics, ASCC. All Rights Reserved.</dc:description>
<dc:description>http://www.sinica.edu.tw/ftms-bin/kiwi1/mkiwi.sh</dc:description>
<dc:description>http://www.sinica.edu.tw/SinicaCorpus/modern_c_help.html</dc:description>
<dc:format>5 million words, 120868 KB, saved in text file and presented in HTML format</dc:format>
<dc:format>big5</dc:format>
<dc:format>Criteria for POS and Feature tagging in Academia Sinica Balanced Corpus of Modern Chinese</dc:format>
<dc:identifier>http://www.sinica.edu.tw/SinicaCorpus/</dc:identifier>
<dc:language/>
<dc:language/>
<dc:language xsi:type="olac:language" olac:code="cmn"/>
<dc:publisher>Academia Sinica</dc:publisher>
<dc:publisher>http://www.sinica.edu.tw/</dc:publisher>
<dc:publisher>Institute of Linguistics, Preparatory Office Academia Sinica</dc:publisher>
<dc:publisher>http://www.ling.sinica.edu.tw/</dc:publisher>
<dc:publisher>Institute of Information Science Academia Sinica</dc:publisher>
<dc:publisher>http://www.iis.sinica.edu.tw/</dc:publisher>
<dcterms:requires>At least 32M memory</dcterms:requires>
<dcterms:requires>At least 32M memory</dcterms:requires>
<dcterms:requires>At least 32M memory</dcterms:requires>
<dcterms:requires>At least 32M memory</dcterms:requires>
<dcterms:requires>At least 32M memory</dcterms:requires>
<dcterms:requires>At least 32M memory</dcterms:requires>
<dcterms:requires/>
<dcterms:references>Academia Sinica Tagged Corpus of Early Mandarin Chinese</dcterms:references>
<dcterms:references>Academia Sinica Formosan Language Archive</dcterms:references>
<dcterms:references>http://www.sinica.edu.tw/Early_Mandarin/</dcterms:references>
<dcterms:references>http://www.ling.sinica.edu.tw/formosan/</dcterms:references>
<dc:rights>This notice regulates your usage of this web site and its associated services including interface, corpus data, segmenting and tagging standard, etc. All rights are reserved by Academia Sinica. In your research you may apply the data resulting from the searching processes of our interface systems. However, you are prohibited to abstract, alter or publish any searching results voluntarily. The copyright of corpus data is still reserved by original author or source and cannot be reproduced, copied or violate anything involving intellectual property.</dc:rights>
<dc:source>General Magazine: Common Wealth-Taiwan leading's magazine, Sinorama Magazine, Travel, WorldScreen.</dc:source>
<dc:source>Newspaper: China Times, Liberty Times, Mandarin Daily News,Newsletter of Computing Centre.</dc:source>
<dc:source>Academic Journal: Institute of Ethnology Publications, Institute of Biomedical Sciences (IBMS) Newsletter.</dc:source>
<dc:source>Textbook: Elementary School Mandarin Textbook 12 volumes.</dc:source>
<dc:source>Reference: CKIP Technical Reports.</dc:source>
<dc:source>Thesis: Paper.</dc:source>
<dc:source>General Book:8 volumes of the Common Psychology Series published by Hong's Foundation for Education and Culture, Carnival in Brazil published by China Times Publishing Co.</dc:source>
<dc:source>Audio/Visual Medium: paragraphs in Bulletin Board System in Taiwan.</dc:source>
<dc:source>Conversation/Interview: Interview Record of participants in Democracy Movement, Daily conversation of mainland Chinese students in America.</dc:source>
<dc:source>Elsewhere: Texts that cannot be categorized in above media.</dc:source>
<dc:subject xsi:type="olac:language" olac:code="eng"/>
<dc:subject xsi:type="olac:language" olac:code="cmn"/>
<dc:title>Sinica Corpus</dc:title>
<dc:title>Academia Sinica Balanced Corpus of Modern Chinese</dc:title>
<dc:type xsi:type="dcterms:DCMIType">Text</dc:type>
</olac:olac>

Display format

 Contributor (editor)  Chinese Knowledge Information Processing Group of Institute of Information Science of Academia Sinica
 Contributor (editor)  Corpus Research Group in Institute of Linguistics, Academia Sinica
 Contributor (sponsor)  Academia Sinica Computing Centre
 Contributor  Chiang Ching-Kuo Foundation for International Scholarly Exchange
 Contributor  National Science Council
 Contributor (sponsor)  Chu-Ren Huang
 Contributor  Keh-jiann Chen
 Contributor  Knowledge Representation and Language Engineering for Mandarin Chinese-Core Technology and Tool Libraries for Language Processing
 Coverage  Modern
 Coverage  Chung-hua Min-kuo (nation)
 Coverage  Taiwan
 Creator  Chinese Knowledge Information Processing Group of Institute of Information Science of Academia Sinica
 Creator  Corpus Research Group in Institute of Linguistics, Academia Sinica
 Creator  Academia Sinica Computing Centre
 Created  1990
 Available  1995, July, 1995
 Available  1996, November, 1996
 Modified  1991
 Modified  1994
 Modified  1997
 Modified  2001, December, 2001
 Description  tute of Information Science, Institute of Linguistics, ASCC. All Rights Reserved.
 Description  http://www.sinica.edu.tw/ftms-bin/kiwi1/mkiwi.sh
 Description  http://www.sinica.edu.tw/SinicaCorpus/modern_c_help.html
 Format  5 million words, 120868 KB, saved in text file and presented in HTML format
 Format  big5
 Format  Criteria for POS and Feature tagging in Academia Sinica Balanced Corpus of Modern Chinese
 Identifier  http://www.sinica.edu.tw/SinicaCorpus/
 Language 
 Language 
 Language (ISO639-3)  [cmn]
 Publisher  Academia Sinica
 Publisher  http://www.sinica.edu.tw/
 Publisher  Institute of Linguistics, Preparatory Office Academia Sinica
 Publisher  http://www.ling.sinica.edu.tw/
 Publisher  Institute of Information Science Academia Sinica
 Publisher  http://www.iis.sinica.edu.tw/
 Requires  At least 32M memory
 Requires  At least 32M memory
 Requires  At least 32M memory
 Requires  At least 32M memory
 Requires  At least 32M memory
 Requires  At least 32M memory
 Requires 
 References  Academia Sinica Tagged Corpus of Early Mandarin Chinese
 References  Academia Sinica Formosan Language Archive
 References  http://www.sinica.edu.tw/Early_Mandarin/
 References  http://www.ling.sinica.edu.tw/formosan/
 Rights  This notice regulates your usage of this web site and its associated services including interface, corpus data, segmenting and tagging standard, etc. All rights are reserved by Academia Sinica. In your research you may apply the data resulting from the searching processes of our interface systems. However, you are prohibited to abstract, alter or publish any searching results voluntarily. The copyright of corpus data is still reserved by original author or source and cannot be reproduced, copied or violate anything involving intellectual property.
 Source  General Magazine: Common Wealth-Taiwan leading's magazine, Sinorama Magazine, Travel, WorldScreen.
 Source  Newspaper: China Times, Liberty Times, Mandarin Daily News,Newsletter of Computing Centre.
 Source  Academic Journal: Institute of Ethnology Publications, Institute of Biomedical Sciences (IBMS) Newsletter.
 Source  Textbook: Elementary School Mandarin Textbook 12 volumes.
 Source  Reference: CKIP Technical Reports.
 Source  Thesis: Paper.
 Source  General Book:8 volumes of the Common Psychology Series published by Hong's Foundation for Education and Culture, Carnival in Brazil published by China Times Publishing Co.
 Source  Audio/Visual Medium: paragraphs in Bulletin Board System in Taiwan.
 Source  Conversation/Interview: Interview Record of participants in Democracy Movement, Daily conversation of mainland Chinese students in America.
 Source  Elsewhere: Texts that cannot be categorized in above media.
 Subject (ISO639-3)  [eng]
 Subject (ISO639-3)  [cmn]
 Title  Sinica Corpus
 Title  Academia Sinica Balanced Corpus of Modern Chinese
 Type (DCMI)  Text

Metadata quality analysis

OLAC metadata records are scored for metadata quality on a 10-point scale explained in OLAC Metadata Metrics. The score for the above record (along with comments on changes that could improve the score) is as follows:

Component          +      -      Comments
Title              1      0
Date               1      0
Agent              1      0
About              1      0
Depth              1      0
Content Language   1      0
Subject Language   1      0
OLAC Type          0      1      Add a dc:type element that uses the OLAC linguistic-type encoding scheme to identify the type of the resource from a linguistic point of view.
DCMI Type          1      0
Precision          0.33   0.67   For the full score, make use of at least 2 more encoding schemes in addition to the ones counted explicitly in other components of the score. For instance,
  • use dcterms:W3CDTF on dc:date (or its refinements)
  • use dcterms:URI when the value of an element is a URL
  • use dcterms:IMT on dc:format
  • use dcterms:Box or dcterms:Point or dcterms:TGN on dcterms:spatial
Quality score  8.33
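
As a rough sketch of how the comments above might be applied, the fragment below rewrites a few elements of the record with additional encoding schemes. It is illustrative only. The linguistic-type code primary_text, the reading of "1995, July, 1995" as 1995-07, and the media type text/html are assumptions that the record's maintainers would need to confirm. Namespace declarations are omitted, as in the XML format section above.

```xml
<!-- Illustrative fragment only; values marked "assumed" are not taken from the record. -->
<olac:olac>
  <!-- OLAC Type: assumed that the corpus is best classified as primary text -->
  <dc:type xsi:type="olac:linguistic-type" olac:code="primary_text"/>
  <!-- W3CDTF on a date refinement; 1990 is the record's own value -->
  <dcterms:created xsi:type="dcterms:W3CDTF">1990</dcterms:created>
  <!-- Assumed reading of "1995, July, 1995" as a year and month -->
  <dcterms:available xsi:type="dcterms:W3CDTF">1995-07</dcterms:available>
  <!-- URI scheme on an element whose value is a URL -->
  <dc:identifier xsi:type="dcterms:URI">http://www.sinica.edu.tw/SinicaCorpus/</dc:identifier>
  <!-- IMT on dc:format; text/html is assumed from "presented in HTML format" -->
  <dc:format xsi:type="dcterms:IMT">text/html</dc:format>
</olac:olac>
```

If elements like these were added, the OLAC Type component and the Precision component would be the ones expected to change; the exact resulting score depends on how OLAC Metadata Metrics counts each scheme.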