Hebrew web corpora

Hebrew General corpus. This corpus was crawled from the Internet…

Gaeilge tagset

Parole Common Morphosyntactical Tagset. The tables below…

CLAWS tagset

C8 to C7 mapping file. NS 2011-5-14. APPGE -> APPGE: possessive…

Feed Corpus Project

The FCP corpus aims to be a million-word-per-day collection of POS-tagged…

OPUS parallel corpora

The parallel corpora available here have been collected, prepared…

The New Corpus for Ireland | Nua-Chorpas na hÉireann

The New Corpus for Ireland – user’s guide Welcome…

Oxford Children's Corpus

Journal article Kate Wild, Adam Kilgarriff, and David Tugwell.…

TatarWaC corpus

The Tatar sample corpus contains ca. 200 thousand words crawled from the…

Icelandic sample corpus

This is a small corpus of Icelandic texts prepared for the Sketch…

General instructions on corpus data directory structure

The aim of these instructions is to ensure that for every corpus,…

Sketch Grammar development corpora

This page describes how to use a sketch grammar in your corpus. In…

Preloaded Configuration Templates

When you create a corpus from the Sketch Engine interface (see…