A tagset is a list of part-of-speech tags (POS tags for short), i.e. labels used to indicate the part of speech and sometimes also other grammatical categories (case, tense etc.) of each token in a text corpus.

The HunMorph part-of-speech tagset is available in Hungarian corpora (e.g. the Hungarian DGT-Translation Memory) annotated by the HunMorph morphological analyzer.

An example of a tag in the CQL concordance search box: [tag="NOUN"] finds all nouns, e.g. rendelet. (Note: make sure that you use straight double quotation marks.)


Tag       POS category
ADJ       adjective
ADV       adverb
ART       article
CONJ      conjunction
DET       determiner
NOUN      noun
NUM       numeral
ONO       onomatopoeic
POSTP     postposition
PREP      preposition
PREV      preverb
PUNCT     punctuation
UTT-INT   utterance/interjection
VERB      verb
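
The tags listed above can be turned into CQL expressions programmatically. The following is a minimal sketch in Python, not part of HunMorph or of any corpus tool: the names HUNMORPH_TAGS, cql_for_tag and cql_for_sequence are hypothetical helpers introduced here for illustration. It assumes only the [tag="..."] syntax shown in the example above, and that consecutive token expressions separated by a space match consecutive tokens.

```python
# Tag-to-category mapping copied from the tagset list above.
HUNMORPH_TAGS = {
    "ADJ": "adjective",
    "ADV": "adverb",
    "ART": "article",
    "CONJ": "conjunction",
    "DET": "determiner",
    "NOUN": "noun",
    "NUM": "numeral",
    "ONO": "onomatopoeic",
    "POSTP": "postposition",
    "PREP": "preposition",
    "PREV": "preverb",
    "PUNCT": "punctuation",
    "UTT-INT": "utterance/interjection",
    "VERB": "verb",
}


def cql_for_tag(tag: str) -> str:
    """Return a CQL token expression such as [tag="NOUN"] (hypothetical helper)."""
    if tag not in HUNMORPH_TAGS:
        raise ValueError(f"unknown HunMorph tag: {tag!r}")
    # CQL needs straight double quotation marks around the value.
    return f'[tag="{tag}"]'


def cql_for_sequence(*tags: str) -> str:
    """Join token expressions with spaces to describe consecutive tokens."""
    return " ".join(cql_for_tag(tag) for tag in tags)


if __name__ == "__main__":
    print(cql_for_tag("NOUN"))
    print(cql_for_sequence("ADJ", "NOUN"))
```

Checking the tag against the list before building the query catches typos such as "NOUNS" early, instead of sending a query that silently returns no results.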

For detailed information about the tagset, see "The annotation system of HunMorph" by Alexandr Rosen (2006).

Source: http://utkl.ff.cuni.cz/~rosen/public/kr_for_ldc.pdf