Languages in Sketch Engine

The features available for each corpus differ because corpora in Sketch Engine come from different sources. Even corpora in the same language may not have the same set of features available.

Preloaded corpora

Refer to the corpus details page for information about available features.

User corpora

Refer to the table below to check whether the following features will be available:

POS – Yes – user corpora will be tagged for parts of speech

WS – Yes – a Word Sketch grammar is available for the language

POS – No, WS – Yes – user corpora cannot be tagged for parts of speech, but a Word Sketch grammar is available for the language. The user corpus can make use of Word Sketches if it is tagged externally with the tagset used by the Word Sketch grammar.


Language POS WS
Afrikaans no yes
Albanian no no
Arabic yes yes
Armenian no yes
Azerbaijani no no
Basque no yes
Bengali no no
Bosnian no no
Bulgarian yes yes
Burmese no no
Catalan no yes
Chinese Simplified yes yes
Chinese Traditional yes yes
Croatian yes yes
Czech yes yes
Danish no yes
Dutch yes yes
English yes yes
Esperanto no no
Estonian yes yes
Filipino no no
Finnish yes no
French yes yes
Frisian no no
Galician no no
Georgian no no
German yes yes
Greek no yes
Gujarati no no
Hebrew yes no
Hindi no no
Hungarian yes no
Icelandic no no
Igbo no no
Indonesian no no
Irish no yes
Italian yes yes
Japanese yes yes
Kazakh no no
Korean yes yes
Kyrgyz no no
Latin no yes
Latvian yes yes
Limburgish no no
Lithuanian no yes
Macedonian no no
Malayalam no no
Malay no no
Maldivian no no
Maltese no no
Maori no no
Mongolian no no
Nepali no no
Norwegian no yes
Persian no yes
Polish yes yes
Portuguese yes yes
Romanian yes yes
Russian yes yes
Samoan no no
Sanskrit (romanised) no no
Scottish Gaelic no no
Serbian yes yes
Setswana no no
Slovak yes yes
Slovenian yes yes
Spanish yes yes
Swahili yes no
Swedish yes yes
Tajik no no
Talysh no no
Tamil no no
Tatar no no
Telugu no no
Thai no no
Tibetan no yes
Turkish no no
Turkmen no no
Ukrainian no no
Urdu no no
Uzbek no no
Vietnamese no yes
Welsh no no
Yiddish no no
Yoruba no no
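
The legend and table can also be read as a small lookup. The sketch below is only an illustration in Python: the names SUPPORT and describe are hypothetical and are not part of Sketch Engine or its API, and only four rows are copied from the table to keep it short.

```python
# Hypothetical helper for reading the POS/WS table; not a Sketch Engine API.
# A few rows copied from the table: language -> (POS, WS).
SUPPORT = {
    "English": (True, True),
    "Finnish": (True, False),
    "Danish": (False, True),
    "Welsh": (False, False),
}


def describe(language: str) -> str:
    """Explain what the POS and WS columns mean for a user corpus."""
    pos, ws = SUPPORT[language]
    if pos and ws:
        return "tagged for parts of speech; Word Sketches available"
    if pos:
        return "tagged for parts of speech; no Word Sketch grammar"
    if ws:
        # The "POS - No, WS - Yes" case from the legend.
        return ("not tagged; Word Sketches available only if the corpus is "
                "tagged externally with the tagset used by the Word Sketch grammar")
    return "not tagged; no Word Sketch grammar"


if __name__ == "__main__":
    for name in SUPPORT:
        print(f"{name}: {describe(name)}")
```

For a full lookup, the remaining rows of the table would be added to SUPPORT in the same form.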