DGT-Translation Memory

DGT-Translation Memory is a database of aligned sentences from the European Union’s legislative documents (Acquis Communautaire) in 24 EU languages. Sketch Engine offers this database as searchable parallel corpora. Detailed information and citation guidance can be found in the bibliographic references below.

The DGT-Translation Memory covers 24 European languages:

Bulgarian   German       Polish
Czech       Greek        Portuguese
Danish      Hungarian    Romanian
Dutch       Irish        Serbo-Croatian
English     Italian      Slovak
Estonian    Latvian      Slovenian
Finnish     Lithuanian   Spanish
French      Maltese      Swedish

The aligned texts come from the large DGT translation memory published by the European Commission.

The individual corpora have been processed with the latest tools available in Sketch Engine.

Tools to work with the DGT-Translation Memory

A complete set of Sketch Engine tools is available to work with DGT-Translation Memory as parallel corpora to generate:

  • word sketch – collocations categorized by grammatical relations (this function requires part-of-speech tagging)
  • thesaurus – synonyms and similar words for every word (this function requires part-of-speech tagging)
  • word lists – lists of nouns, verbs, adjectives etc. organized by frequency
  • n-grams – frequency list of multi-word units
  • concordance – examples in context
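As a rough illustration of what the n-gram and concordance tools compute, the following Python sketch counts bigrams and builds keyword-in-context lines together with their aligned translations. The sentence pairs and the function names `ngram_frequencies` and `parallel_concordance` are invented for this example. It is not Sketch Engine's implementation or API, and it assumes a simple whitespace tokenization.

```python
from collections import Counter

# Toy aligned sentence pairs, invented for illustration (not taken from DGT-TM).
ALIGNED = [
    ("the member states shall adopt the measures",
     "die Mitgliedstaaten erlassen die Maßnahmen"),
    ("the member states shall inform the commission",
     "die Mitgliedstaaten unterrichten die Kommission"),
    ("the commission shall adopt the decision",
     "die Kommission erlässt den Beschluss"),
]


def ngram_frequencies(sentences, n):
    """Count n-grams (tuples of n consecutive tokens) over whitespace-split sentences."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def parallel_concordance(pairs, keyword):
    """Return (left context, keyword, right context, aligned translation) per hit.

    The keyword is matched against tokens on the source side only.
    """
    hits = []
    for source, target in pairs:
        tokens = source.split()
        for i, token in enumerate(tokens):
            if token == keyword:
                left = " ".join(tokens[:i])
                right = " ".join(tokens[i + 1:])
                hits.append((left, token, right, target))
    return hits


english = [source for source, _ in ALIGNED]
bigrams = ngram_frequencies(english, 2)
print(bigrams.most_common(3))

for left, kw, right, translation in parallel_concordance(ALIGNED, "commission"):
    print(f"{left} [{kw}] {right}  |  {translation}")
```

In a real parallel corpus the matching would normally run on lemmas or part-of-speech tags rather than raw tokens, which is why some of the tools listed above depend on part-of-speech tagging.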

Bibliographic references

For a more detailed description of DGT-TM, including further statistics on the resource, see the following publication. Please also cite it when referring to DGT-TM in scientific publications:

Steinberger, R., Eisele, A., Klocek, S., Pilos, S., & Schlüter, P. (2013). DGT-TM: A freely available translation memory in 22 languages. arXiv preprint arXiv:1309.5226.

For a contrastive overview of DGT-TM and the other multilingual text resources offered for download on this site, you can read the following journal article:

Steinberger, R., Ebrahim, M., Poulis, A., Carrasco-Benitez, M., Schlüter, P., Przybyszewski, M., & Gilbro, S. (2014). An overview of the European Union’s highly multilingual parallel corpora. Language Resources and Evaluation, 48(4), 679-707.

Search the DGT-Translation Memory

Sketch Engine offers a range of tools to work with the DGT-Translation Memory parallel corpus.
