Fryske Akademy Parallel Corpus

Frisian and Dutch aligned sentences, not POS-tagged. Dutch…

French Web Corpus (WaC)

This corpus (web as corpus) was gathered using a list of URLs…

Fida PLUS corpus

The corpus is a reference corpus for Slovene, as described here.…

Estonian Reference Corpus

Estonian Reference Corpus is a morphologically annotated corpus…

English Wikipedia corpus

This corpus has been built using an English Wikipedia dump (from…

A medical web corpus

A medical web corpus has been collected using WebBootCat with…

DCEP: Digital Corpus of the European Parliament

The Digital Corpus of the European Parliament (DCEP) is a collection…

ChineseTaiwanWaC corpus

The Chinese Taiwan web as corpus has almost 260 million words encoded…

Chinese Gigaword corpus

The Chinese Gigaword corpus from the Linguistic Data Consortium…

CHILDES English corpus

Childes-En is a subcorpus of the full CHILDES corpus which has…

CAJA corpus

The CAJA corpus is a corpus of Academic Journal Articles, created…

HindiWaC corpus

This corpus contains almost 60 million words crawled from the…