A page relevant to corpora.

Pages

Nineteenth-century corpus

Actually, the 19th century corpus is only available to Osnabrück…

Bosnian/Croatian/Serbian WaC

Bosnian, Croatian, Serbian corpora obtained from the web by Nikola…

BengaliWaC corpus

The Bengali web corpus was created with the Corpus Factory method. The…

Cantonese web corpus (WaC)

This corpus was collected using Cantonese-only seed words and…

Penn Historical Corpora

Penn Historical Corpora is a collection of historical English…

GerManC. A Historical Corpus of German Newspapers 1650–1800

GerManC is a historical corpus of written German texts. (This…

A Corpus of English Dialogues 1560–1760

‘Released in Spring 2006, A Corpus of English Dialogues 1560–1760…

COMPAS corpus

COMPAS is a corpus of about 100 million words, which was…

Corpus of Academic Journal Articles (CAJA)

This balanced corpus (abbreviated CAJA) of academic language…

BulgarianNC corpus

Bulgarian National Corpus (see the website of the Institute for Bulgarian…

BROWN Corpus

A Standard Corpus of Present-Day Edited American English, for…

Basque Web Corpus (WaC)

The Basque "Web as Corpus" corpus was created by Mr. Igor Leturia…

Persian Web Corpus (WaC)

Persian (also known as Farsi) is the main language of Iran. This…

Argamon corpus

The current Argamon corpus contains blog posts to various Farsi…