A tagset is a list of part-of-speech tags (POS tags for short), i.e. labels used to indicate the part of speech and sometimes also other grammatical categories (case, tense etc.) of each token in a text corpus.

The Tatar part-of-speech tagset is available in the Tatar Mixed corpus. The previous version of this tagset, in which tags are enclosed in angle brackets, is used in the Tatar News corpus.

An example of a tag query in the CQL concordance search box: [tag2="n:sg:nom"] finds all nouns (‘n’) in the singular (‘sg’) and the nominative case (‘nom’), e.g. республика, кеше. Make sure that you use straight double quotation marks. The old notation of Tatar part-of-speech tags enclosed each tag in the angle brackets “<” and “>”. These characters have been removed, and tags that consist of several components are now joined with a colon “:”. (The former notation of the tag above is [tag2="<sg><nom>"].)
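The notation change described above can be sketched in Python. This is only an illustration, not the corpus maintainers' tooling: the function names old_to_new and cql_tag_query are hypothetical, and the sketch assumes that an old-style tag is simply a run of components, each wrapped in “<” and “>”, and that the attribute is called tag2 as in the query above.

```python
import re


def old_to_new(old_tag: str) -> str:
    """Convert an old-style tag such as '<n><sg><nom>' to 'n:sg:nom'."""
    # Collect the text inside each <...> pair, in order of appearance.
    parts = re.findall(r"<([^<>]+)>", old_tag)
    # Reject input that is not made up entirely of <...> components.
    if not parts or "".join(f"<{p}>" for p in parts) != old_tag:
        raise ValueError(f"not an old-style tag: {old_tag!r}")
    return ":".join(parts)


def cql_tag_query(tag: str, attribute: str = "tag2") -> str:
    """Wrap a colon-separated tag in a CQL token query with straight quotes."""
    return f'[{attribute}="{tag}"]'


# Hypothetical old-style input; prints [tag2="n:sg:nom"]
print(cql_tag_query(old_to_new("<n><sg><nom>")))
```

The check against the re-joined components is there so that stray text outside the brackets raises an error instead of being silently dropped.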

Part of speech categories = First-level tags

Tag Part of speech Russian term or example
abbr Abbreviation Аббревиатура
adj Adjective Прилагательное
adv Adverb Наречие
cm Comma ,
cnjadv Adverbial conjunction Наречие-союз
cnjcoo Coordinating conjunction Сочинительный союз
cnjsub Subordinating conjunction Подчинительный союз
cop Copula Копула
det Determiner Детерминатив
ideo Ideophone Звукоподражательное слово
ij Interjection Междометие
n Noun Существительное
np Proper noun Имя собственное
num Numeral Числительное
post Postposition Послелог
postadv Postadverb Посленаречие
prn Pronoun Местоимение
sent Sentence marker . ? !
v Verb Глагол
vaux Auxiliary verb Вспомогательный глагол
apos apostrophe
guio hyphen
lpar Left parenthetical marker (
lquot Left quote marker “, «
mod_ass Assertive modal particle бит
mod_ind Indefinite modal particle (expresses doubt) дыр
qst Modal question particle микән
rpar Right parenthetical marker )
rquot Right quote marker ”, »
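As a rough illustration of how the first-level tags can be used programmatically, the sketch below splits a colon-separated tag into its part of speech and the remaining features. The name split_tag is hypothetical, and the sketch assumes that the first component of a tag is always one of the first-level tags listed above.

```python
# First-level tags and their names, copied from the table above.
FIRST_LEVEL = {
    "abbr": "Abbreviation", "adj": "Adjective", "adv": "Adverb",
    "cm": "Comma", "cnjadv": "Adverbial conjunction",
    "cnjcoo": "Coordinating conjunction",
    "cnjsub": "Subordinating conjunction", "cop": "Copula",
    "det": "Determiner", "ideo": "Ideophone", "ij": "Interjection",
    "n": "Noun", "np": "Proper noun", "num": "Numeral",
    "post": "Postposition", "postadv": "Postadverb", "prn": "Pronoun",
    "sent": "Sentence marker", "v": "Verb", "vaux": "Auxiliary verb",
    "apos": "Apostrophe", "guio": "Hyphen",
    "lpar": "Left parenthetical marker", "lquot": "Left quote marker",
    "mod_ass": "Assertive modal particle",
    "mod_ind": "Indefinite modal particle",
    "qst": "Modal question particle",
    "rpar": "Right parenthetical marker", "rquot": "Right quote marker",
}


def split_tag(tag: str):
    """Return (part-of-speech name, list of remaining features)."""
    # Assumption: the part of speech is the first colon-separated component.
    pos, *features = tag.split(":")
    if pos not in FIRST_LEVEL:
        raise KeyError(f"unknown first-level tag: {pos!r}")
    return FIRST_LEVEL[pos], features


print(split_tag("n:sg:nom"))  # ('Noun', ['sg', 'nom'])
```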

Proper noun types

Tag Description
top Toponym
ant Anthroponym
cog Cognomen
pat Patronym
org Organization
al Other

Gender of

m Masculine
f Feminine
mf Masculine/feminine; basically cognomens without -ов/-ова, -ин/-ина endings

“Syntactic” tags. Attributive use of non-adjectives etc.

attr Attributive
subst Substantive
advl Adverbial


Number

sg Singular
pl Plural
sp Singular/Plural


Possessive

px1sg First person singular
px2sg Second person singular
px3sp Third person singular/plural
px1pl First person plural
px2pl Second person plural
px3pl Third person plural (for
px General possessive


Case

nom Nominative
gen Genitive
dat Dative
acc Accusative
abl Ablative
loc Locative
ref Reflexive

Some additional quasi-cases

sim Similative
# DAй (-дай/-дәй, -тай/-тәй)
abe Abessive=Privative
# SIZ (-сыз/-сез) (not used after possessives and cases)
reas Not used right now; kept just in case
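The comments above spell out the surface variants behind two suffix templates: DAй gives -дай/-дәй, -тай/-тәй and SIZ gives -сыз/-сез. The sketch below shows one way to generate such variants from the curly-brace notation used later on this page. It is an illustration under stated assumptions, not the analyser's actual rules: the function name expand is hypothetical, only the three symbols attested with surface forms on this page ({D}, {A}, {I}, assumed to be Latin letters) are covered, and it assumes that all vowel symbols in one suffix share the same back or front harmony.

```python
import re
from itertools import product

# Realizations attested on this page: -дай/-дәй, -тай/-тәй and -сыз/-сез.
VOWELS = {
    "A": {"back": "а", "front": "ә"},
    "I": {"back": "ы", "front": "е"},
}
CONSONANTS = {"D": {"voiced": "д", "voiceless": "т"}}

SYMBOL = re.compile(r"\{(\w)\}")


def expand(template: str) -> list:
    """List the surface variants of a template such as '-{D}{A}й'."""
    symbols = SYMBOL.findall(template)
    unknown = [s for s in symbols if s not in VOWELS and s not in CONSONANTS]
    if unknown:
        raise ValueError(f"symbols not covered by this sketch: {unknown}")
    # Vary voicing and harmony only when the template needs them.
    voicings = ["voiced", "voiceless"] if any(s in CONSONANTS for s in symbols) else [None]
    harmonies = ["back", "front"] if any(s in VOWELS for s in symbols) else [None]
    forms = []
    for voicing, harmony in product(voicings, harmonies):
        def realize(match):
            s = match.group(1)
            return VOWELS[s][harmony] if s in VOWELS else CONSONANTS[s][voicing]
        forms.append(SYMBOL.sub(realize, template))
    return forms


print(expand("-{D}{A}й"))  # ['-дай', '-дәй', '-тай', '-тәй']
print(expand("-с{I}з"))    # ['-сыз', '-сез']
```

Symbols such as {G}, {E} or {U} are deliberately rejected, because this page does not list their surface forms.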

Degrees of comparison of adjectives

comp Comparative

Pronoun types

pers Personal
recip Reciprocal

Pronoun&Determiner types

dem Demonstrative
ind Indefinite
itg Interrogative
qnt Quantifier
neg Negative
(NOTE: also used to denote negation in verbs, i.e. for м{A})
ref Reflexive

Numeral types

ord Ordinal
coll Collective
dist Distributive

Verbal features


imp Imperative
opt Optative/jussive
evid Evidential, a.k.a. “indirect” / non-eyewitness / hearsay


caus Causative
pass Passive
coop Cooperative

Tenses / finite forms

pres -{E}
past -{G}{A}н
ifi -{D}{I}
fut -{I}р
fut2 -{A}ч{A}к
fut_plan -м{A}кч{I}

Non-Finite verb forms


prc_perf Perfect participle
# “_Йоклап_ яткан мәче авызына тычкан үзе _килеп_ керми.”;
prc_impf Imperfect participle
# ул _уйный_ алмады; мин _яза_ башладым;
prc_vol Volition participle
# _эчәсем_ килә;
prc_cond Conditional participle
# “…моны _алсаң_ була…” (“ала аласың”)
prc_fplan Future plan participle
# “Бакчага бармакчы идем.”;

Verbal adverbs

gna_perf -{I}п
# “…ул вакытта инде кояш _баеп_, йолдызлар күренә башлаган
# иде…” (Ф.Хөсни);
gna_cond -с{A}
# “…кайда икәнен _белсә_, миңа моның турында сөйләр иде…”;
gna_until -{G}{A}нч{I}
(the name covers only the temporal meaning; the form has other uses as well)
# “Авылның басу капкасына _җиткәнче_ эңгер-меңгердә карлы юлдан
# озак кайта ул.”;
gna_after -{G}{A}ч
(the name covers only the temporal meaning; the form has other uses as well)
# “Берәү, патша йортын күреп _кайткач_, үз өенә ут төрткән,

Verbal adjectives

gpr_past -{G}{A}н
# килгән кеше; укылмаган китап;
gpr_impf -{A} торган
TODO: this is the equivalent of a Kazakh form; compound forms should be handled in transfer, so check once more whether there is a real reason not to handle it there (it seemed so)
gpr_pot -{U}ч{I}
# сөйләүче кеше; үз урынын белмәүче;
gpr_ppot -{I}рл{I}к/-{A}рл{I}к
gpr_fut -{I}р/-{A}р
# барыр җир; сөйләр сүз;
gpr_fut2 -{A}ч{A}к
# әйтеләчәк фикер; эшләнәчәк эш;
gpr_fut3 -{E}с{I}
(NOTE: ambiguous with the volition participle (see above))
# “_Үләсе_ күбәләк ут күзенә керер”;

Gerunds (verbal nouns)

ger -{U}
ger_past -{G}{A}н
ger_perf -{G}{A}нл{I}к
(stresses the fact that something
# “Барысының да мәсьәләне үз башында _йөрткәнлеге_, теге яктан
# да, бу яктан да үлчәп _караганлыгы_ сизелеп тора.” (Ф. Хөсни);
ger_ppot -{I}рл{I}к/-{A}рл{I}к
(~the ability to do the denoted
ger_abs -{U}ч{I}л{I}к
(FIXME: check. This form should not be very productive in Tatar; consider adding such words as nouns if they appear in the corpus. The Kazakh counterpart is translated with ger.)
ger_fut -{I}р/-{A}р
# “Ярым кем _булырын_ белмим, мин әле ялгыз йөрим.” (Ш.Галиев);
ger_fut2 -{A}ч{A}к
ger_fut3 -{E}с{I}
(NOTE: ambiguous with the volition participle (see above))
# “_Күрәселәре_ алда әле”;
ger1 -м{A}к
inf -{A/I}рг{A}


tv Transitive
iv Intransitive


p1 First person
p2 Second person
p3 Third person
frm Formality

Modal particles

qst Modal question particle
# м{I}
emph Emphasizing modal particle
# -ч{I}, -с{A}н{A}
mod_ass Assertive modal particle
mod_ind Indefinite modal particle
(expresses doubt)

Punctuation marks

sent Sentence marker
guio Hyphen
cm Comma
apos Apostrophe
rquot Quote marker (right hand side)
lquot Quote marker (left hand side)
rpar Parenthetical marker (right hand side)
lpar Parenthetical marker (left hand side)
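A common use of the punctuation tags is to filter punctuation out of tagged text before counting words. The following is a minimal sketch of that idea. The function name drop_punctuation and the (word, tag) pair layout are assumptions made for illustration, and the sample tokens are invented apart from the two nouns taken from the example at the top of this page.

```python
# Punctuation tags as listed above.
PUNCTUATION_TAGS = {
    "sent", "guio", "cm", "apos", "rquot", "lquot", "rpar", "lpar",
}


def drop_punctuation(tokens):
    """Keep only (word, tag) pairs whose first tag component is not punctuation."""
    return [
        (word, tag)
        for word, tag in tokens
        if tag.split(":")[0] not in PUNCTUATION_TAGS
    ]


# Hypothetical tagged tokens for illustration.
sample = [
    ("республика", "n:sg:nom"),
    (",", "cm"),
    ("кеше", "n:sg:nom"),
    (".", "sent"),
]
print(drop_punctuation(sample))
# [('республика', 'n:sg:nom'), ('кеше', 'n:sg:nom')]
```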

Source: http://corpus.tatar/index_en.php?openinframe=manuals/tags_uniq.pdf
The complete Tatar POS tagset in the old format (with “<” and “>”) can be downloaded as XLS (Excel format) or TXT (plain-text format).