The Oxford English Corpus

Supplied by Oxford University Press. Access policy: access restricted…

New Model Corpus (SuperSenseTagger)

The New Model Corpus tagged with SuperSenseTagger (sst-light)…

Islam – UK

A special English newspaper corpus by Costas Gabrielatos at…

Internet-ZH corpus

Internet-ZH is a Chinese web corpus collected by Serge Sharoff.…

Project Gutenberg Corpus

Downloaded with wget: getting Gutenberg cleaned with…

Fryske Akademy Parallel Corpus

Frisian and Dutch; not POS-tagged; aligned sentences; Dutch…

French Web Corpus (WaC)

This corpus (web as corpus) was gathered using a list of URLs…

Fida PLUS corpus

The corpus is a reference corpus for Slovene, as described here.…

Feed Corpus

The FeedCorpus is a corpus with about 300 million words, which…

Europarl: European Parliament Proceedings Parallel Corpus

The corpus was prepared by Philipp Koehn. The process is described…

Estonian Reference Corpus

Morphologically annotated corpus by Filosoft. The character…

English Wikipedia corpus

This corpus has been built using an English Wikipedia dump (from…

Domain Web Corpus

The corpora available here have been collected using the WebBootCat…

DGT-Translation Memory

This translation memory consists of 24 collections of texts in…

DCEP: Digital Corpus of the European Parliament

The Digital Corpus of the European Parliament (DCEP) is a collection…