Corpus of judgments of the Court of Justice of the European Union

The EUR-Lex judgments parallel corpus is a multilingual corpus covering the official languages of the European Union. It contains only judgments of the Court of Justice and is therefore a subset of the whole EUR-Lex corpus. It can be used to look up translation examples from the field of justice in 23 languages (all official EU languages except Irish).

Corpus metadata

The corpus contains the same metadata as the EUR-Lex corpus (CELEX number, document title, year of document, etc.). In addition, it marks up the individual parts of each judgment:

<heading> – title of the judgment
<summary> – summary of the judgment
<parties> – involved parties
<subject> – subject of the case
<judgment> – judgment of the court
<costs> – decision on costs
<operative_part> – operative part

Not all of these parts are present in every judgment:

  • Sections in files from before 2000:
    – summary, parties, subject or grounds, costs, operative_part
  • Sections in files from after 2000:
    – summary, judgment, costs

How to get the data

Academic institutions

The EUR-Lex corpus is released under a CC-BY-NC-SA licence. Because of the file size, please email us at support@sketchengine.eu first and we will set up a temporary download link for you. The data are supplied as vertical text with an alignment file. The total size is approximately 1.5 GB. For the original documents, see the official EUR-Lex website.
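Once downloaded, the vertical text can be processed with a short script. The following is a minimal sketch, not an official tool. It assumes the usual vertical layout (one token per line with tab-separated attributes, and each structure tag such as <summary> on a line of its own). The function name `sections`, the sample data, the `celex="X"` placeholder and the column order are all hypothetical; check them against the files you receive.

```python
import re
from typing import Iterable, Iterator, List

# A line is treated as a structure tag if it looks like <name ...>, </name> or <name/>.
STRUCTURE_LINE = re.compile(r"^</?[A-Za-z_][\w-]*(\s[^>]*)?/?>$")


def sections(lines: Iterable[str], name: str) -> Iterator[List[str]]:
    """Yield the word forms found inside each <name>...</name> block.

    Assumption: the word form is the first tab-separated column of a token line.
    """
    open_tag = "<" + name
    close_tag = "</" + name + ">"
    current = None  # list of words while inside the section, otherwise None
    for raw in lines:
        line = raw.rstrip("\n")
        if STRUCTURE_LINE.match(line):
            if line == close_tag:
                if current is not None:
                    yield current
                current = None
            elif line.startswith(open_tag) and line[len(open_tag):len(open_tag) + 1] in (">", " "):
                current = []
            # Any other structure tag (e.g. sentence markup) is skipped.
        elif current is not None and line:
            current.append(line.split("\t")[0])


# Hypothetical sample in vertical format, for illustration only.
SAMPLE = (
    '<doc celex="X">\n'
    "<summary>\n"
    "<s>\n"
    "The\tthe-x\n"
    "border\tborder-n\n"
    "</s>\n"
    "</summary>\n"
    "<judgment>\n"
    "Costs\tcost-n\n"
    "</judgment>\n"
    "</doc>\n"
)

if __name__ == "__main__":
    for words in sections(SAMPLE.splitlines(), "summary"):
        print(" ".join(words))
```

Passing an open file object instead of `SAMPLE.splitlines()` lets the same function stream a large file line by line without loading it into memory.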

For commercial use

Please contact us for a quote.

Search judgment parts

Text within particular parts of judgments can be searched via the CQL form of the Concordance search in Sketch Engine.

[lempos="border-n"] within <summary/>

This CQL example finds occurrences of the lemma “border” used as a noun within the summaries of English judgments. See the search result (login required).
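The same pattern applies to the other judgment parts. As a rough sketch, the hypothetical helper below (`within_section` is not part of Sketch Engine) builds such queries as plain strings using the `within <name/>` form of CQL. The section names are the ones listed in the corpus metadata above. Which of them actually occur depends on the age of the judgment.

```python
# Section names taken from the corpus metadata description.
JUDGMENT_PARTS = (
    "heading",
    "summary",
    "parties",
    "subject",
    "judgment",
    "costs",
    "operative_part",
)


def within_section(lempos: str, section: str) -> str:
    """Return a CQL query matching `lempos` inside the given judgment part.

    Hypothetical helper for illustration; it only formats a string and
    does not contact Sketch Engine.
    """
    if section not in JUDGMENT_PARTS:
        raise ValueError("unknown judgment part: " + section)
    if '"' in lempos or "\\" in lempos:
        # Keep the sketch simple: refuse values that would need escaping.
        raise ValueError("lempos must not contain quotes or backslashes")
    return '[lempos="' + lempos + '"] within <' + section + "/>"


if __name__ == "__main__":
    # Prints: [lempos="border-n"] within <summary/>
    print(within_section("border-n", "summary"))
    print(within_section("cost-n", "operative_part"))
```

The resulting string can be pasted into the CQL field of the Concordance search.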

The full set of Sketch Engine tools can be used with the EUR-Lex judgments corpus to generate:

  • word sketch – collocations categorized by grammatical relations
  • thesaurus – synonyms and similar words for every word
  • keywords – terminology extraction of one-word and multi-word units
  • word lists – lists of nouns, verbs, adjectives etc. organized by frequency
  • n-grams – frequency list of multi-word units
  • concordance – examples in context
  • trends – diachronic analysis automatically identifies neologisms and changes in use
  • text type analysis – statistics of metadata in the corpus

Search the EUR-Lex judgments parallel corpus

Sketch Engine offers a range of tools to work with this set of parallel corpora.

Tip

Learn to work with multilingual and parallel corpora in Sketch Engine. Refer to the user guide.

More parallel corpora

DGT Translation Memory parallel corpora – European Union’s legislative documents

EUR-Lex 2/2016 parallel corpora – texts from the EUR-Lex database containing public EU documents

Europarl spoken parallel corpora – transcriptions of the European Parliament Proceedings

Open Parallel Corpus (OPUS) – translated texts from various sources, e.g. medical documents, subtitles, technical documentation, etc.

OpenSubtitles 2018 parallel corpora – movie subtitles from the OpenSubtitles database

United Nations Parallel Corpus (UNPC) – official records and other parliamentary documents of the United Nations
