Fryske Akademy Parallel Corpus

Frisian and Dutch aligned sentences, not POS tagged. Dutch…

French Web Corpus (WaC)

This corpus (web as corpus) was gathered using a list of URLs…

Fida PLUS corpus

The corpus is a reference corpus for Slovene, as described here.…

Estonian Reference Corpus

Estonian Reference Corpus is a morphologically annotated corpus…

A medical web corpus

A medical web corpus has been collected using WebBootCat with…

DCEP: Digital Corpus of the European Parliament

The Digital Corpus of the European Parliament (DCEP) is a collection…

ChineseTaiwanWaC corpus

The Chinese Taiwan web as corpus (WaC) has almost 260 million words encoded…

Chinese Gigaword corpus

The Chinese Gigaword corpus from the Linguistic Data Consortium…

CHILDES English corpus

Childes-En is a subcorpus of the full CHILDES corpus which has…

HindiWaC corpus

This corpus contains almost 60 million words crawled from the…

IgboWaC corpus

The corpus was prepared using the Corpus Factory method and was crawled…