A tagset is a list of part-of-speech tags (POS tags for short), i.e. labels used to indicate the part of speech and sometimes also other grammatical categories (case, tense etc.) of each token in a text corpus.

The Tatar part-of-speech tagset is available in the Tatar News corpus.

An example of a tag in the CQL concordance search box: [pos="<np?>"] finds all nouns, including proper nouns, e.g. Татарстан, кеше (note: make sure that you use straight double quotation marks).
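
The value between the quotation marks is a regular expression. The following is a minimal sketch of which tags the pattern above selects, assuming that CQL matches the pattern against the whole tag value. The tag list is a hand-picked subset of this tagset, and `matching_tags` is a hypothetical helper, not part of any corpus tool.

```python
import re

# Hand-picked subset of first-level tags from this tagset (illustrative only).
TAGS = ["<n>", "<np>", "<num>", "<adj>", "<v>", "<vaux>"]

def matching_tags(pattern, tags=TAGS):
    """Return the tags that the pattern matches in full."""
    return [tag for tag in tags if re.fullmatch(pattern, tag)]

print(matching_tags("<np?>"))  # ['<n>', '<np>']
print(matching_tags("<v.*>"))  # ['<v>', '<vaux>']
```

Because `p?` makes the letter optional, the pattern covers both `<n>` and `<np>` but not `<num>`.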


Part of speech categories = First-level tags

Tag Part of speech Example
<abbr> Abbreviation Аббревиатура
<adj> Adjective Прилагательное
<adv> Adverb Наречие
<cm> Comma ,
<cnjadv> Adverbial conjunction Наречие-союз
<cnjcoo> Coordinating conjunction Сочинительный союз
<cnjsub> Subordinating conjunction Подчинительный союз
<cop> Copula Копула
<det> Determiner Детерминатив
<ideo> Ideophone Звукоподражательное слово
<ij> Interjection Междометие
<n> Noun Существительное
<np> Proper noun Имя собственное
<num> Numeral Числительное
<post> Postposition Послелог
<postadv> Postadverb Посленаречие
<prn> Pronoun Местоимение
<sent> Sentence marker . ? !
<v> Verb Глагол
<vaux> Auxiliary verb Вспомогательный глагол
<apos> Apostrophe
<guio> Hyphen
<lpar> Left parenthetical marker (
<lquot> Left quote marker “, «
<mod_ass> Assertive modal particle бит
<mod_ind> Indefinite modal particle (expresses doubt) дыр
<qst> Modal question particle микән
<rpar> Right parenthetical marker )
<rquot> Right quote marker ”, »
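
First-level tags can be combined into longer queries. The sketch below assumes that the `pos` attribute holds the first-level tag, as in the search box example at the top of this page, and that consecutive token conditions in CQL match consecutive tokens. The helper names `token` and `sequence` are hypothetical.

```python
def token(tag):
    """Build one CQL token condition for a first-level tag such as 'adj'."""
    # Straight double quotation marks are required by the search box.
    return '[pos="<' + tag + '>"]'

def sequence(*tags):
    """Join token conditions with spaces so that they match consecutive tokens."""
    return " ".join(token(tag) for tag in tags)

print(sequence("adj", "n"))  # [pos="<adj>"] [pos="<n>"]
```

The printed query would look for an adjective immediately followed by a noun.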

Proper noun types

Tag Description
<top> Toponym
<ant> Anthroponym
<cog> Cognomen
<pat> Patronym
<org> Organization
<al> Other

Gender

<m> Masculine
<f> Feminine
<mf> Masculine/feminine; essentially cognomens without -ов/-ова, -ин/-ина endings

“Syntactic” tags. Attributive use of non-adjectives etc.

<attr> Attributive
<subst> Substantive
<advl> Adverbial


<sg> Singular
<pl> Plural
<sp> Singular/Plural


<px1sg> First person singular
<px2sg> Second person singular
<px3sp> Third person singular/plural
<px1pl> First person plural
<px2pl> Second person plural
<px3pl> Third person plural (for
<px> General possessive


<nom> Nominative
<gen> Genitive
<dat> Dative
<acc> Accusative
<abl> Ablative
<loc> Locative
<ref> Reflexive

Some additional ~cases

<sim> Similative
# DAй (-дай/-дәй, -тай/-тәй)
<abe> Abessive=Privative
# SIZ (-сыз/-сез) (not used after possessives and cases)
<reas> not used right now, just in case
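
The capital letters in the suffix notes above stand for sounds that alternate, and later sections write them in braces, e.g. -{D}{I}. The following sketch lists every surface candidate for such a template. The mapping is an assumption read off the two notes above (D as д/т, A as а/ә, I as ы/е) and is not a full inventory, and `expand` is a hypothetical helper. It does not model vowel harmony or consonant assimilation, which decide which candidate actually appears after a given stem.

```python
from itertools import product
import re

# Assumed alternatives, inferred from -дай/-дәй, -тай/-тәй and -сыз/-сез.
ARCHIPHONEMES = {"D": ["д", "т"], "A": ["а", "ә"], "I": ["ы", "е"]}

def expand(template):
    """List every surface candidate for a template such as '{D}{A}й'."""
    # With one capture group, re.split alternates literal text and symbols.
    parts = re.split(r"\{([A-Z])\}", template)
    choices = [ARCHIPHONEMES[p] if i % 2 else [p] for i, p in enumerate(parts)]
    return ["".join(combo) for combo in product(*choices)]

print(expand("{D}{A}й"))  # ['дай', 'дәй', 'тай', 'тәй']
print(expand("с{I}з"))    # ['сыз', 'сез']
```

A symbol that is missing from the mapping raises a `KeyError`, which keeps unverified alternations from being guessed silently.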

Levels of comparison of adjectives

<comp> Comparative

Pronoun types

<pers> Personal
<recip> Reciprocal

Pronoun & determiner types

<dem> Demonstrative
<ind> Indefinite
<itg> Interrogative
<qnt> Quantifier
<neg> Negative
(NOTE: also used to denote negation in verbs, i.e. for м{A})
<ref> Reflexive

Numeral types

<ord> Ordinal
<coll> Collective
<dist> Distributive

Verbal features


<imp> Imperative
<opt> Optative/jussive
<evid> Evidential, a.k.a. “indirect” / non-eyewitness / hearsay


<caus> Causative
<pass> Passive
<coop> Cooperative

Tenses / finite forms

<pres> -{E}
<past> -{G}{A}н
<ifi> -{D}{I}
<fut> -{I}р
<fut2> -{A}ч{A}к
<fut_plan> -м{A}кч{I}

Non-Finite verb forms


<prc_perf> Perfect participle
# “_Йоклап_ яткан мәче авызына тычкан үзе _килеп_ керми.”;
<prc_impf> Imperfect participle
# ул _уйный_ алмады; мин _яза_ башладым;
<prc_vol> Volition participle
# _эчәсем_ килә;
<prc_cond> Conditional participle
# “…моны _алсаң_ була…” (“ала аласың”
<prc_fplan> Future plan participle
# “Бакчага бармакчы идем.”;

Verbal adverbs

<gna_perf> -{I}п
# “…ул вакытта инде кояш _баеп_, йолдызлар күренә башлаган
# иде…” (Ф.Хөсни);
<gna_cond> -с{A}
# “…кайда икәнен _белсә_, миңа моның турында сөйләр иде…”;
<gna_until> -{G}{A}нч{I}
(the name covers only its temporal meaning; the form has other uses)
# “Авылның басу капкасына _җиткәнче_ эңгер-меңгердә карлы юлдан
# озак кайта ул.”;
<gna_after> -{G}{A}ч
(the name covers only its temporal meaning; the form has other uses)
# “Берәү, патша йортын күреп _кайткач_, үз өенә ут төрткән,

Verbal adjectives

<gpr_past> -{G}{A}н
# килгән кеше; укылмаган китап;
<gpr_impf> -{A} торган
TODO: this is the equivalent of Kazakh <gpr_impf>; compound forms should be handled in transfer, so check once more whether there is a real reason not to handle it there (it seemed so)
<gpr_pot> -{U}ч{I}
# сөйләүче кеше; үз урынын белмәүче;
<gpr_ppot> -{I}рл{I}к/-{A}рл{I}к
<gpr_fut> -{I}р/-{A}р
# барыр җир; сөйләр сүз;
<gpr_fut2> -{A}ч{A}к
# әйтеләчәк фикер; эшләнәчәк эш;
<gpr_fut3> {E}с{I}
(NOTE: ambiguous with the volition participle (see above))
# “_Үләсе_ күбәләк ут күзенә керер”;

Gerunds (verbal nouns)

<ger> -{U}
<ger_past> -{G}{A}н
<ger_perf> -{G}{A}нл{I}к
(stresses the fact that something
# “Барысының да мәсьәләне үз башында _йөрткәнлеге_, теге яктан
# да, бу яктан да үлчәп _караганлыгы_ сизелеп тора.” (Ф. Хөсни);
<ger_ppot> -{I}рл{I}к/-{A}рл{I}к
(~the ability to do the denoted
<ger_abs> -{U}ч{I}л{I}к FIXME CHECK
This form shouldn’t be that productive in Tatar; consider adding such forms as nouns if they appear in the corpus. Kazakh <ger_abs> is translated with <ger>.
<ger_fut> -{I}р/-{A}р
# “Ярым кем _булырын_ белмим, мин әле ялгыз йөрим.” (Ш.Галиев);
<ger_fut2> -{A}ч{A}к
<ger_fut3> {E}с{I}
(NOTE: ambiguous with the volition participle (see above))
# “_Күрәселәре_ алда әле”;
<ger1> -м{A}к
<inf> -{A/I}рг{A}


<tv> Transitive
<iv> Intransitive


<p1> First person
<p2> Second person
<p3> Third person
<frm> Formality

Modal particles

<qst> Modal question particle
# м{I}
<emph> Emphasizing modal particle
# -ч{I}, -с{A}н{A}
<mod_ass> Assertive modal particle
<mod_ind> Indefinite modal particle (expresses doubt)

Punctuation marks

<sent> Sentence marker
<guio> Hyphen
<cm> Comma
<apos> Apostrophe
<rquot> Quote marker (right hand side)
<lquot> Quote marker (left hand side)
<rpar> Parenthetical marker (right hand side)
<lpar> Parenthetical marker (left hand side)

Source: http://corpus.tatar/index_en.php?openinframe=manual/tags_uniq.pdf
