OLAC Record
oai:clarin.eurac.edu:20.500.12124/104

Metadata
Title: LegISTyr test set
Bibliographic Citation: http://hdl.handle.net/20.500.12124/104
Creator: Alber, Marlies; Chiocchetti, Elena; Ralli, Natascia; Stanizzi, Isabella
Date (W3CDTF): 2025-07-07T07:54:49Z
Date Available: 2025-07-07T07:54:49Z
Description: LegISTyr is a machine translation test set for evaluating the quality of legal terminology translation from Italian to South Tyrolean German, a minor standard variety of German. It covers specific legal subdomains or legal translation issues: 1) standardised terminology, 2) occupational health and safety, 3) subsidised housing, 4) family law, 5) criminal and criminal procedure law, 6) homonyms, 7) abbreviated forms, 8) gender-inclusive writing strategies. Each subset contains at least 250 examples, i.e. five examples for each term or twenty examples for each inclusive writing strategy. The total number of examples is 2067. The example sentences in the test set showcase single-word and multi-word terms from the Italian legal system, together with their correct, standardised or non-standardised South Tyrolean German target hypothesis. It also lists other (less) acceptable variants used in South Tyrol and, where available, equivalent terms from other German-speaking legal systems (mainly Austria, Germany, Switzerland). The legal subdomain is specified for each example in every subset, except for the last subset on gender-inclusive writing. This subset contains examples for different strategies used in Italian but no target hypotheses, as there may be several acceptable ones. LegISTyr can be used, for example, to assess the success of terminology enforcement strategies when machine translating legal and administrative texts from Italian into German as well as the influence of major varieties of legal German on translations into a minor standard variety.
Identifier (URI): http://hdl.handle.net/20.500.12124/104
Language: Italian; German
Language (ISO639): ita; deu
Publisher: Institute for Applied Linguistics, Eurac Research
Rights: Creative Commons - Attribution-NonCommercial 4.0 International (CC BY-NC 4.0); https://creativecommons.org/licenses/by-nc/4.0/
Subject: legal terminology; South Tyrol; German language variety; legal translation; language varieties; terminological variation; MT test set; machine translation; legal terminology variation
Type: corpus
Type (DCMI): Text
Type (OLAC): primary_text

OLAC Info

Archive:  Eurac Research CLARIN Centre
Description:  http://www.language-archives.org/archive/clarin.eurac.edu
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:clarin.eurac.edu:20.500.12124/104
DateStamp:  2025-07-07
GetRecord:  OAI-PMH request for simple DC format
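
The GetRecord entries above refer to OAI-PMH requests whose link targets are not preserved in this record. As a minimal sketch of how such a request is typically formed, the snippet below builds GetRecord URLs from the OaiIdentifier listed here. The base URL is a hypothetical placeholder (the record does not state the repository's OAI-PMH endpoint), and the metadata prefix `olac` is an assumption; `oai_dc` is the prefix the OAI-PMH protocol defines for simple Dublin Core.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: this record does not state the repository's OAI-PMH base URL.
BASE_URL = "https://example.org/oai/request"

# Identifier taken from the OaiIdentifier field of this record.
OAI_ID = "oai:clarin.eurac.edu:20.500.12124/104"


def get_record_url(identifier, metadata_prefix, base_url=BASE_URL):
    """Build an OAI-PMH GetRecord request URL for a single record."""
    query = urlencode({
        "verb": "GetRecord",                # OAI-PMH verb for fetching one record
        "identifier": identifier,           # the record's OAI identifier
        "metadataPrefix": metadata_prefix,  # requested metadata format
    })
    return f"{base_url}?{query}"


# "olac" is an assumed prefix for the OLAC format; "oai_dc" is simple Dublin Core.
olac_url = get_record_url(OAI_ID, "olac")
dc_url = get_record_url(OAI_ID, "oai_dc")
print(olac_url)
print(dc_url)
```

Replacing `BASE_URL` with the repository's actual endpoint (not given here) would be required before sending either request.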

Search Info

Citation: Alber, Marlies; Chiocchetti, Elena; Ralli, Natascia; Stanizzi, Isabella. 2025. Institute for Applied Linguistics, Eurac Research.
Terms: area_Europe country_DE country_IT dcmi_Text iso639_deu iso639_ita olac_primary_text


http://www.language-archives.org/item.php/oai:clarin.eurac.edu:20.500.12124/104
Up-to-date as of: Tue Jul 8 1:30:21 EDT 2025