A tagset is a list of part-of-speech tags (POS tags for short), i.e. labels that indicate the part of speech, and sometimes other grammatical categories (case, tense, etc.), of each token in a text corpus.

The Maltese part-of-speech tagset is used in Maltese corpora annotated by the MLSS (Maltese Language Software Services) part-of-speech tagger. The tagger is based on TnT, an implementation of a statistical part-of-speech tagger created by Thorsten Brants.

This is a modified version of the default TreeTagger tagset.

An example of a tag in the CQL concordance search box: [tag="NN"] finds all common nouns, e.g. gvern, sena (note: please make sure that you use straight double quotation marks).
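The note about straight quotation marks matters in practice because text copied from a word processor often contains curly ones. The following is a minimal Python sketch of how a query string could be built and cleaned before it is pasted into the search box. The helper names straighten_quotes and tag_query are hypothetical and are not part of any corpus tool.

```python
# Hypothetical helpers for illustration; not part of any corpus tool's API.

# Map common curly double quotation marks to the straight one.
CURLY_TO_STRAIGHT = str.maketrans({"\u201c": '"', "\u201d": '"', "\u201e": '"'})


def straighten_quotes(query: str) -> str:
    """Replace curly double quotation marks with straight ones."""
    return query.translate(CURLY_TO_STRAIGHT)


def tag_query(tag: str) -> str:
    """Build a CQL query matching every token with the given POS tag."""
    return f'[tag="{tag}"]'


print(tag_query("NN"))
print(straighten_quotes("[tag=\u201cNN\u201d]"))
```

Both calls print the same query, [tag="NN"], the second one after the curly quotation marks have been replaced.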


POS tag Description
CC coordinating conjunction
CMP complementiser
CS subordinating conjunction
DD determiner
DDC definite determiner, clitic
DP determiner, plural
DQ determiner quantifier
DS specifier, singular
EX existential marker
II interjection
MJ modifier, adjective
MV modifier, adverb
NCI numeral, cardinal, intransitive
NCT numeral, cardinal, transitive
NN common noun
NO numeral, ordinal
NP proper name
NPI initial in proper name
NV verbal negator
PAC particle, aspect marker, continuous aspect
PAF particle, aspect marker, prospective aspect
PD pronoun, demonstrative
PI pronoun, indefinite
PMP preposition ma’ with bound pronoun
PP pronoun, personal
PR pronoun, reflexive
PRP preposition
PRPC fused preposition-article
PT pronoun, possessive
PUN punctuation
RA residual, acronym
RB residual, abbreviation
RD residual, date
RFR residual, formula, mathematical symbol
RFW residual, foreign word
RH residual, honorific
RO residual, other
RS residual, other symbol
UAM (unique, unassigned) multiword utterance
VA verb, auxiliary
VG pseudo verb
VP participle, active or passive
VV main verb
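
Several tags in the table share a prefix, e.g. every residual tag begins with R. Assuming the search box accepts regular expressions in attribute values and matches them against the whole tag (an assumption, not something stated above), a query such as [tag="R.*"] would select all residual tokens. The Python sketch below only emulates that matching on an excerpt of the table, to show which tags a pattern would cover; TAGSET and tags_matching are hypothetical names.

```python
import re

# Excerpt of the tag table above; extend with the remaining rows as needed.
TAGSET = {
    "NN": "common noun",
    "NP": "proper name",
    "RA": "residual, acronym",
    "RB": "residual, abbreviation",
    "RD": "residual, date",
    "RFR": "residual, formula, mathematical symbol",
    "RFW": "residual, foreign word",
    "RH": "residual, honorific",
    "RO": "residual, other",
    "RS": "residual, other symbol",
    "VA": "verb, auxiliary",
    "VV": "main verb",
}


def tags_matching(pattern: str) -> list[str]:
    """Return the tags that the pattern matches as a whole value."""
    # Assumption: the pattern must match the entire tag, hence fullmatch.
    return sorted(tag for tag in TAGSET if re.fullmatch(pattern, tag))


for tag in tags_matching("R.*"):
    print(tag, TAGSET[tag])
```

Not every prefix is this uniform: NV (verbal negator) starts with N but is not a noun or numeral tag, so it is worth checking a pattern against the table before relying on it.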

