Corpora prepared from specific domains, e.g. science, art, etc.

SiBol/Port corpus

The SiBol/Port (Siena-Bologna, Portsmouth) corpus is a corpus…

Scottish Gaelic Wiki corpus

Scottish Gaelic Wikipedia corpus. Downloaded in February 2015.…

CAJA corpus

The CAJA corpus is a corpus of Academic Journal Articles. Created…

Domain Specific Corpora

These corpora are prepared from specific domains, e.g. science,…

ScienceBlogs corpus

The ScienceBlogs corpus is a selection of posts and comments…

e-flux corpus

The e-flux corpus is a web corpus of English art news digests.…

Environment corpus

English environment-related web corpus. Crawled by SpiderLing…

COMPAS corpus

COMPAS is a corpus of about 100 million words, which was…