OLAC Record: Kacenka : parallel corpus of English and Czech texts

OLAC Record
oai:lindat.mff.cuni.cz:11372/LRT-891

Metadata

Title: Kacenka : parallel corpus of English and Czech texts

Bibliographic Citation: http://hdl.handle.net/11372/LRT-891

Contributor: Rambousek, Jiri

Date (W3CDTF): 2014-07-30T21:24:44Z

Date Available: 2014-07-30T21:24:44Z

Description: Parallel corpus, 3,297,283 words. The idea was to create a small parallel corpus that would make it possible to work with entire texts in translation analysis rather than short extracts. At the same time, it aimed at acquiring experience that could be used in creating a larger parallel corpus of English and Czech in the future. Although the main part of the work has been completed -- and the aims of the KACENKA grant met -- we keep improving and enlarging KACENKA gradually. Currently, it contains 3,297,283 words (of which 1,689,513 were acquired by scanning). Most of the English texts for KACENKA were retrieved from Internet resources. The rest -- and nearly all the Czech texts -- had to be scanned using an OCR programme. KACENKA is stored on a single CD-ROM; its use is limited by copyright restrictions.

Identifier (URI): http://hdl.handle.net/11372/LRT-891

Language: Czech

Language: English

Language (ISO639): ces

Language (ISO639): eng

Publisher: Masaryk University, Brno

Type: corpus

Type (DCMI): Text

Type (OLAC): primary_text

OLAC Info

Archive: LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics, Charles University

Description: http://www.language-archives.org/archive/lindat.mff.cuni.cz

GetRecord: OAI-PMH request for OLAC format

GetRecord: Pre-generated XML file

OAI Info

OaiIdentifier: oai:lindat.mff.cuni.cz:11372/LRT-891

DateStamp: 2021-06-29

GetRecord: OAI-PMH request for simple DC format
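
The GetRecord entries above correspond to OAI-PMH requests for this record. As a minimal sketch, such a request URL can be assembled from the OaiIdentifier and a metadata prefix. The base URL below is a placeholder, since this record does not state the repository's OAI-PMH endpoint, and the function name `get_record_url` is hypothetical. The prefix `oai_dc` is the simple Dublin Core format defined by OAI-PMH; `olac` is assumed here to be the prefix the repository uses for the OLAC format.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Placeholder endpoint (assumption): the record does not give the real base URL.
BASE_URL = "https://example.org/oai/request"

# Identifier copied from the OaiIdentifier field of this record.
OAI_ID = "oai:lindat.mff.cuni.cz:11372/LRT-891"


def get_record_url(identifier: str, metadata_prefix: str,
                   base_url: str = BASE_URL) -> str:
    """Build an OAI-PMH GetRecord request URL (hypothetical helper)."""
    query = urlencode({
        "verb": "GetRecord",               # OAI-PMH verb for fetching one record
        "identifier": identifier,          # the record's OAI identifier
        "metadataPrefix": metadata_prefix,  # requested metadata format
    })
    return f"{base_url}?{query}"


dc_url = get_record_url(OAI_ID, "oai_dc")  # simple DC format
olac_url = get_record_url(OAI_ID, "olac")  # OLAC format (assumed prefix)

# Decode the query string again to confirm the identifier survives encoding.
params = parse_qs(urlsplit(dc_url).query)
print(params["identifier"][0])
```

Swapping `BASE_URL` for the repository's actual endpoint is the only change needed to issue the request with any HTTP client.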

Search Info

Citation: Rambousek, Jiri. 2014. Masaryk University, Brno.

Terms: area_Europe country_CZ country_GB dcmi_Text iso639_ces iso639_eng olac_primary_text

http://www.language-archives.org/item.php/oai:lindat.mff.cuni.cz:11372/LRT-891
Up-to-date as of: Mon Jun 16 1:04:07 EDT 2025
