OLAC Record: Czech image captioning, machine translation, and sentiment analysis (Neural Monkey models)

OLAC Record
oai:lindat.mff.cuni.cz:11234/1-2839

Metadata

Title: Czech image captioning, machine translation, and sentiment analysis (Neural Monkey models)

Bibliographic Citation: http://hdl.handle.net/11234/1-2839

Creator: Libovický, Jindřich

Rosa, Rudolf

Helcl, Jindřich

Popel, Martin

Date (W3CDTF): 2018-09-21T08:38:37Z

Date Available: 2018-09-21T08:38:37Z

Description: This submission contains trained end-to-end models for the Neural Monkey toolkit for Czech and English, solving three NLP tasks: machine translation, image captioning, and sentiment analysis. The models are trained on standard datasets and achieve state-of-the-art or near state-of-the-art performance in the tasks. The models are described in the accompanying paper. The same models can also be invoked via the online demo: https://ufal.mff.cuni.cz/grants/lsd

There are several separate ZIP archives here, each containing one model solving one of the tasks for one language. To use a model, you first need to install Neural Monkey: https://github.com/ufal/neuralmonkey

To ensure correct functioning of the model, please use the exact version of Neural Monkey specified by the commit hash stored in the 'git_commit' file in the model directory. Each model directory contains a 'run.ini' Neural Monkey configuration file, to be used to run the model. See the Neural Monkey documentation to learn how to do that (you may need to update some paths to correspond to your filesystem organization). The 'experiment.ini' file, which was used to train the model, is also included. Then there are files containing the model itself, files containing the input and output vocabularies, etc.

For the sentiment analyzers, you should tokenize your input data using the Moses tokenizer: https://pypi.org/project/mosestokenizer/

For the machine translation, you do not need to tokenize the data, as this is done by the model.
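
The version-pinning step described above (using the commit hash from the 'git_commit' file) might be sketched as follows. This is a minimal illustration, not part of the submission: the function names are hypothetical, and it assumes that the 'git_commit' file holds the hash as its first whitespace-separated token and that Neural Monkey has already been cloned locally.

```python
import subprocess
from pathlib import Path


def read_pinned_commit(model_dir):
    """Return the commit hash stored in the model's 'git_commit' file."""
    text = Path(model_dir, "git_commit").read_text(encoding="utf-8")
    # Assumption: the hash is the first whitespace-separated token in the file.
    tokens = text.split()
    if not tokens:
        raise ValueError("git_commit file is empty")
    return tokens[0]


def checkout_command(repo_dir, commit):
    """Build the git command that checks out `commit` inside `repo_dir`."""
    return ["git", "-C", str(repo_dir), "checkout", commit]


def pin_neuralmonkey(model_dir, repo_dir):
    """Check out the commit recorded for a model in a local Neural Monkey clone."""
    commit = read_pinned_commit(model_dir)
    # check=True raises CalledProcessError if git reports a failure.
    subprocess.run(checkout_command(repo_dir, commit), check=True)
    return commit
```

Calling pin_neuralmonkey with the unpacked model directory and the path of the local clone would leave the clone at the recorded commit; both paths depend on your filesystem organization.
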
For image captioning, you need to:
- download a trained ResNet: http://download.tensorflow.org/models/resnet_v2_50_2017_04_14.tar.gz
- clone the git repository with TensorFlow models: https://github.com/tensorflow/models
- preprocess the input images with the Neural Monkey 'scripts/imagenet_features.py' script (https://github.com/ufal/neuralmonkey/blob/master/scripts/imagenet_features.py) -- you need to specify the path to ResNet and to the TensorFlow models to this script

Feel free to contact the authors of this submission in case you run into problems!
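
The first image-captioning step (obtaining the trained ResNet) could look roughly like the sketch below. It covers only downloading and unpacking the archive; the helper names are hypothetical, and the invocation of 'scripts/imagenet_features.py' is deliberately left out because its command-line options are not stated in this record.

```python
import tarfile
import urllib.request
from pathlib import Path

# URL taken from the description above.
RESNET_URL = "http://download.tensorflow.org/models/resnet_v2_50_2017_04_14.tar.gz"


def archive_name(url):
    """Return the file name component of a download URL."""
    return url.rsplit("/", 1)[-1]


def fetch(url, dest_dir):
    """Download `url` into `dest_dir` unless the file is already present."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    target = dest_dir / archive_name(url)
    if not target.exists():
        urllib.request.urlretrieve(url, str(target))
    return target


def unpack(archive, dest_dir):
    """Extract a .tar.gz archive and return the names of its members."""
    Path(dest_dir).mkdir(parents=True, exist_ok=True)
    with tarfile.open(str(archive), "r:gz") as tar:
        names = tar.getnames()
        tar.extractall(str(dest_dir))
    return names
```

A call such as unpack(fetch(RESNET_URL, "downloads"), "resnet") would place the checkpoint files under a local 'resnet' directory (both directory names are placeholders), whose path can then be passed to the preprocessing script.
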

Identifier (URI): http://hdl.handle.net/11234/1-2839

Is Replaced By (URI): http://hdl.handle.net/11234/1-3145

Language: Czech

English

Language (ISO639): ces

eng

Publisher: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)

Rights: Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)

http://creativecommons.org/licenses/by-nc-sa/4.0/

Subject: sentiment analysis

machine translation

image captioning

neural networks

transformer

Neural Monkey

Type: toolService

Type (DCMI): Software

OLAC Info

Archive: LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics, Charles University

Description: http://www.language-archives.org/archive/lindat.mff.cuni.cz

GetRecord: OAI-PMH request for OLAC format

GetRecord: Pre-generated XML file

OAI Info

OaiIdentifier: oai:lindat.mff.cuni.cz:11234/1-2839

DateStamp: 2021-06-29

GetRecord: OAI-PMH request for simple DC format

Search Info
Citation: Libovický, Jindřich; Rosa, Rudolf; Helcl, Jindřich; Popel, Martin. 2018. Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL).
Terms: area_Europe country_CZ country_GB dcmi_Software iso639_ces iso639_eng

http://www.language-archives.org/item.php/oai:lindat.mff.cuni.cz:11234/1-2839
Up-to-date as of: Mon Jun 16 1:05:20 EDT 2025
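
The GetRecord entries listed under OLAC Info and OAI Info refer to OAI-PMH requests. As a rough illustration, such a request URL can be assembled from the record's OaiIdentifier as sketched below. The base URL is a placeholder, since the repository's actual OAI-PMH endpoint is not given in this record, and the metadata prefixes 'olac' and 'oai_dc' are assumed to be the ones the repository uses for the OLAC and simple DC formats.

```python
from urllib.parse import urlencode

# Identifier copied from the OaiIdentifier field of this record.
OAI_IDENTIFIER = "oai:lindat.mff.cuni.cz:11234/1-2839"


def get_record_url(base_url, identifier, metadata_prefix):
    """Build an OAI-PMH GetRecord request URL for one record."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    })
    return f"{base_url}?{query}"


# Placeholder endpoint: replace with the repository's real OAI-PMH base URL.
PLACEHOLDER_BASE = "https://example.org/oai"
olac_url = get_record_url(PLACEHOLDER_BASE, OAI_IDENTIFIER, "olac")
dc_url = get_record_url(PLACEHOLDER_BASE, OAI_IDENTIFIER, "oai_dc")
```

The colons and the slash in the identifier are percent-encoded by urlencode, which is what a query-string parameter requires.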
