The corpus was prepared by Philipp Koehn. The preparation process is described in Philipp Koehn, Europarl: A Parallel Corpus for Statistical Machine Translation (MT Summit 2005).

All material is taken from http://www.statmt.org/europarl/.

Changelog

(spring 2015)

  • Tagged with TreeTagger.

v 7.0 (May 2012)

  • A further expanded and improved version of the corpus was released on 15th May 2012.

v 5.0 (May 2010)

  • A further expanded and improved version of the earlier corpus was released on 20th January 2010.

Reference

Philipp Koehn. Europarl: A Parallel Corpus for Statistical Machine Translation, MT Summit 2005.