The Europarl parallel corpus
The Europarl corpus is a parallel corpus created from the European Parliament Proceedings in the official languages of the EU.
The corpus was prepared by Philipp Koehn. The preparation process is described in "Europarl: A Parallel Corpus for Statistical Machine Translation" (Philipp Koehn, MT Summit 2005).
All material is taken from http://www.statmt.org/europarl/.
v 7.0 (May 2012)
- A further expanded and improved version of the corpus was released on 15th May 2012.
v 5.0 (May 2010)
- A further expanded and improved version of the earlier corpus was released on 20th January 2010.
Philipp Koehn. Europarl: A Parallel Corpus for Statistical Machine Translation, MT Summit 2005.