A tagset is a list of part-of-speech tags (POS tags for short), i.e. labels used to indicate the part of speech and sometimes also other grammatical categories (case, tense etc.) of each token in a text corpus.

The SWATWOL part-of-speech tagset is used in Swahili (also known as Kiswahili) corpora.

An example of a tag in the CQL concordance search box: [tag="N"] finds all nouns, e.g. watu, mwaka. (Note: make sure that you use straight double quotation marks.)
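The note about straight quotation marks matters when a query is pasted from a word processor, which often substitutes curly quotes. The following is a minimal Python sketch, not part of any corpus tool's API: the helper names `tag_query` and `straighten_quotes` are hypothetical and serve only to illustrate building a tag query and normalising its quotes before pasting it into the search box.

```python
# Hypothetical helpers for illustration only; they are not part of any
# corpus tool's API.

# Map common curly/low double quotes to the straight double quote that CQL expects.
CURLY_TO_STRAIGHT = str.maketrans({"\u201c": '"', "\u201d": '"', "\u201e": '"'})


def straighten_quotes(query: str) -> str:
    """Replace curly double quotes in a query with straight ones."""
    return query.translate(CURLY_TO_STRAIGHT)


def tag_query(tag: str) -> str:
    """Build a CQL token query that matches a single tag from the tagset."""
    return f'[tag="{tag}"]'


print(tag_query("N"))                               # query for all nouns
print(straighten_quotes("[tag=\u201cN\u201d]"))     # curly quotes repaired
```

Both calls produce the same query string as the example above, ready to paste into the CQL search box.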


| Tag | Description | Example |
| --- | --- | --- |
| A-INFL | inflecting adjective | viti vizuri |
| A-UINFL | uninflecting adjective | viti haba |
| ABBR | abbreviation | n.k. |
| AD-ADJ | qualifier of an adjective | kabisa, sana |
| ADJ | adjective | zuri, haba |
| ADJ-INFL | inflecting adjective | zingine |
| ADV | adverb | tena |
| AG-PART | agentive particle | ililiwa na panya |
| AG-PART_PRON | agentive pronoun | nayo |
| AR | Arabic origin | Kitaalam |
| CC | coordinating conjunction | baba na mama |
| CLB | clause boundary | . ; ? ! ili, kwamba |
| COLON | colon | : |
| COMMA | comma | , |
| CONCORD1 | grammatical concord | kupi |
| CONJ | conjunction | ili |
| DEF-V:li | verb with no inflection based on the root “li” | walio, aliye |
| DEF-V:na | verb with no inflection based on the root “na” | kuna |
| DEF-V:ni | verb with no inflection based on the root “ni” | ni |
| DEM:hV | demonstrative pronoun of the type hV(C)V | hiki, haya, hii |
| DOLLAR-SIGN | dollar sign | $ |
| DOUBLE-QUOTE | double quote | " |
| EMPH | emphatic form | mimi ndimi |
| ENG | English origin | Apartheid |
| EQUAL-MARK | equal mark | = |
| EXCLAM | number | 2000 |
| EXCLAMATION | exclamation mark | ! |
| GEN-CON | genitive connector | kitabu cha mtoto |
| HYPHEN | hyphen | - |
| IDIOM | idiom | punde si punde |
| IMP | imperative | twende |
| INTERROG | interrogative | lini? mbona? |
| LEFT-PARENTHESIS | left parenthesis | ( |
| LEFT-SQUAREBRACKET | left square bracket | [ |
| N | noun | kitu |
| NA-POSS | possessive particle | alikuwa na mali |
| NEG | negative | sisomi, hasomi, hatasoma |
| NUM | numeral | ishirini |
| PL1-SP | noun class 1/2, 1st person plural, subject prefix | tunasoma |
| PL2-SP | noun class 1/2, 2nd person plural, subject prefix | mnasoma |
| PREP | preposition | katika |
| PREP_PRON | preposition + pronoun | naye |
| PROCENT-MARK | percent mark | % |
| PRON | pronoun | mimi, hiki |
| PROPNAME | proper name | Ali, Mombasa |
| REL | relative marker in verb | ninayesoma |
| RHET | rhetorical | Unakwenda, sio? |
| RIGHT-PARENTHESIS | right parenthesis | ) |
| RIGHT-SQUAREBRACKET | right square bracket | ] |
| SELFSTANDING | self-standing subject particle | yu, zi |
| SG1-SP | noun class 1/2, 1st person singular, subject prefix | ninafikiri |
| SG2-SP | noun class 1/2, 2nd person singular, subject prefix | unasoma |
| SG3-SP | noun class 1/2, 3rd person singular, subject prefix | anasoma |
| SINGLE-QUOTE | single quote | ‘ |
| V | verb | kusoma |
| VCAP_ | word starting with capital | Nawe |
| VFIN | finite verb | anasoma |
| VIMP | imperative verb | rudi, angalia |
| VINF | infinitive verb | kuwa, kufanya |
| Vkwisha | verb with the marker “kwisha” | kwisha |
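
The subject-prefix tags in the list above follow a regular naming scheme: SG or PL for number, a digit for person, then the suffix -SP. As a rough sketch, assuming one wants to decode these tags in a script, the scheme can be expressed in Python as follows. The function name `decode_subject_prefix` is hypothetical, and only the five tags listed above are accepted.

```python
import re
from typing import Optional, Tuple

# Naming scheme of the subject-prefix tags: number, person, "-SP".
SP_TAG = re.compile(r"^(SG|PL)([123])-SP$")

# Only the tags that actually appear in the list above are treated as known;
# any other tag that merely fits the pattern (e.g. a third person plural) is rejected.
LISTED_SP_TAGS = {"SG1-SP", "SG2-SP", "SG3-SP", "PL1-SP", "PL2-SP"}


def decode_subject_prefix(tag: str) -> Optional[Tuple[str, int]]:
    """Return (number, person) for a listed subject-prefix tag, else None."""
    match = SP_TAG.match(tag)
    if match is None or tag not in LISTED_SP_TAGS:
        return None
    number = "singular" if match.group(1) == "SG" else "plural"
    return number, int(match.group(2))


for tag in ("SG1-SP", "PL2-SP", "VFIN"):
    print(tag, decode_subject_prefix(tag))
```

Tags outside the subject-prefix group, such as VFIN, return None.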

Source: http://www.aakkl.helsinki.fi/cameel/corpus/swatags.pdf