TurkishWaC corpus

The TurkishWaC corpus is a 32-million-word collection of samples…

UKWaCsst corpus

UKWaC tagged with SuperSenseTagger (sst-light) described in…

GujarathiWaC corpus

GujarathiWaC is a web corpus of the Gujarati language (Indo-Aryan…

GeorgianWaC corpus

Original file owner: bharat.

FinnishWaC corpus

Finnish web as corpus.

FrisianWaC corpus

Frisian web as corpus was crawled in August 2013. It is a corpus…

DanishWaC corpus

The corpus was prepared using the Corpus Factory method. It has 288 million…

Filipino web corpus (FilipinoWaC)

The corpus was created by Anil in October 2013. It has almost…

Arabic web corpus (WaC)

The Arabic web corpus was created by Serge Sharoff and was tagged…