OLAC Record: oai:EarlyMandarin.sinica.edu.tw:EarlyMandarin

Metadata
Title: Academia Sinica Tagged Corpus of Early Mandarin Chinese
Contributor: [role = sponsor] Academia Sinica Computing Centre
 [role = editor] Chinese Knowledge Information Processing Group of Institute of Information Science of Academia Sinica
 [role = editor] Corpus Research Group in Institute of Linguistics, Academia Sinica
 [role = sponsor] Chu-Ren Huang
 Academia Sinica
 Institute of History and Philology, Academia Sinica
 Chiang Ching-Kuo Foundation for International Scholarly Exchange
 [role = sponsor] Cheng-hui Liu
 [role = sponsor] Keh-jiann Chen
 [role = sponsor] Pei-chuan Wei
 [role = sponsor] P.M. Thompson
Coverage: Early (After Tang and Five Dynasties)
 Primitive (Early Qin to Western Han)
 Medieval (Eastern Han to Wei-Jin-Southern and Northern Dynasties)
 Zhonghua (nation)
 Chung-hua Min-kuo (nation)
 China
 Taiwan
Creator: Academia Sinica Computing Centre
 Chinese Knowledge Information Processing Group of Institute of Information Science of Academia Sinica
 Corpus Research Group in Institute of Linguistics, Academia Sinica
Date: 1990 [Created]
 2001/12, December, 2001 [Available]
 1997 [Modified]
 1995 [Modified]
Description: Institute of Information Science, Institute of Linguistics, ASCC. All Rights Reserved.
 http://www.sinica.edu.tw/ftms-bin/kiwi1/mkiwi.sh
 http://www.sinica.edu.tw/Early_Mandarin/early_mandarin_chinese_c_help.html
 The web version of the Sinica Early Chinese Corpus was opened to the research community in November 2001. The literature available for searching is "Hong-lou-mong" and "San-sui-ping-yao-zhuan". Although the search functions and the criteria for segmentation and tagging are mostly the same as those of the Sinica Corpus for modern Chinese, the corpus has features of its own. For example, search results give the source of each cited sentence in addition to the part of speech and the lemma, which helps researchers cite their references. The segmentation and tagging standard also differs in places, because its focus differs from that of the standard for analyzing the modern language. For instance, the verb-complement structure is labelled in more detail in the Sinica Early Chinese Corpus.
 Scholarly Exchange and Institute of History and Philology in Academia Sinica. The objective at that time was only to collect the raw text. The collection of raw text has continued without interruption ever since. The collected data has extended from Primitive Chinese to Medieval Chinese and Early Chinese. The collection work is mainly managed by Prof. Pei-Chuang Wei and is funded by Academia Sinica. The tagging of the Primitive Chinese data started in 1995. For Early Chinese, the tagging system was designed in 1997 and applied immediately. The Early Chinese project was led by Prof. Pei-Chuang Wei and Prof. Cheng-hui Liu (Hsing-Hua University). The financial support came from Academia Sinica and the National Science Council; technical support for the tagging system and computing was provided by Prof. Chu-Ren Huang, Prof. Ker-Jiann Chen, and ASCC.
Format: 4.5 million words, 19044 KB, saved in text file and presented in HTML format
 big5
 Criteria for POS and Feature tagging in Academia Sinica Balanced Corpus of Early Chinese
Identifier: http://www.sinica.edu.tw/Early_Mandarin/
Language: [sourcecode = C] C++
 [sourcecode = JAVASCRIPT] JAVASCRIPT
 [language = cmn] x-sil-CHN
Publisher: Academia Sinica
 http://www.sinica.edu.tw/
 http://www.iis.sinica.edu.tw/
 Institute of Linguistics, Preparatory Office Academia Sinica
 http://www.ling.sinica.edu.tw/
 Institute of Information Science Academia Sinica
Relation: http://www.ling.sinica.edu.tw/formosan/
 [cpu = Alpha] At least 32M memory [Requires]
 [cpu = x86] At least 32M memory [Requires]
 [cpu = 680x0] At least 32M memory [Requires]
 [cpu = PowerPC] At least 32M memory [Requires]
 [cpu = Sparc] At least 32M memory [Requires]
 [cpu = MIPS] At least 32M memory [Requires]
 [os = Unix/Linux] Unix/Linux [Requires]
 Academia Sinica Ancient Chinese Corpus [Is Part Of]
 Academia Sinica Formosan Language Archive [References]
 http://www.sinica.edu.tw/SinicaCorpus/ [References]
 Academia Sinica Balanced Corpus of Modern Chinese [References]
Rights: This notice governs your use of this web site and its associated services, including the interface, the corpus data, and the segmentation and tagging standard. All rights are reserved by Academia Sinica. In your research you may use the data resulting from searches made through our interface systems. However, you are prohibited from abstracting, altering, or publishing any search results on your own initiative. The copyright of the corpus data remains with the original author or source; the data may not be reproduced or copied, nor used in any way that violates intellectual property rights.
Source: San-sui-ping-yao-zhuan
 Hong-lou-mong
Subject: [language = eng] en-us
 [language = cmn] x-sil-CHN
Type: [DCMIType] Text

OLAC Info

Archive:  Academia Sinica Tagged Corpus of Early Mandarin Chinese
Description:  http://www.language-archives.org/archive/EarlyMandarin.sinica.edu.tw
GetRecord:  OAI-PMH request for OLAC format

OAI Info

OaiIdentifier:  oai:EarlyMandarin.sinica.edu.tw:EarlyMandarin
DateStamp:  2002-12-14
GetRecord:  OAI-PMH request for simple DC format
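
The two GetRecord entries in this record refer to OAI-PMH requests for the same identifier in two metadata formats. As a rough sketch only, such a request URL could be assembled as below. The base URL is a hypothetical placeholder, since this record does not state the repository endpoint, and the function name get_record_url is illustrative. The metadata prefixes olac and oai_dc are the ones conventionally used for the OLAC and simple DC formats and are assumed here.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the record does not give the repository's base URL.
BASE_URL = "http://example.org/oai"

# Identifier taken from the OaiIdentifier field of this record.
OAI_ID = "oai:EarlyMandarin.sinica.edu.tw:EarlyMandarin"


def get_record_url(identifier, metadata_prefix, base_url=BASE_URL):
    """Build an OAI-PMH GetRecord request URL (illustrative helper)."""
    query = urlencode({
        "verb": "GetRecord",                # OAI-PMH verb for fetching one record
        "identifier": identifier,           # the record's OAI identifier
        "metadataPrefix": metadata_prefix,  # requested metadata format
    })
    return f"{base_url}?{query}"


# Assumed prefixes: "olac" for OLAC format, "oai_dc" for simple DC format.
olac_url = get_record_url(OAI_ID, "olac")
dc_url = get_record_url(OAI_ID, "oai_dc")

print(olac_url)
print(dc_url)
```

The colons in the identifier are percent-encoded by urlencode, so the identifier is safe to pass as a query parameter.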

Search Info

Terms  dcmi_Text iso639_cmn iso639_eng