HanNanum, a Korean morphological analyzer and POS tagger, is used for tagging Korean texts. Tags are assigned to words using the 22-tag preset configuration of the HanNanum demo Java class. Custom pre- and post-processing scripts were used to correct and improve lemmatization, add sentence markup, and add English tag names.

The tagset is described in chapter 3.4 of the original HanNanum manual (downloaded from the web archive). Thanks to Dr. Juyeon Kang for extensive help with translation and understanding.
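The sentence markup step mentioned above could look roughly like the following. This is only a minimal sketch, not the actual post-processing script: it assumes the tagger output is already available as (word, tag) pairs, that the SF tag marks sentence-final punctuation, and that sentences are wrapped in `<s>` ... `</s>` lines in a tab-separated, one-token-per-line layout. The function name `add_sentence_markup` is hypothetical.

```python
from typing import Iterable, Iterator, Tuple

# Assumption: tokens tagged SF close a sentence.
SENTENCE_FINAL_TAGS = {"SF"}


def add_sentence_markup(tokens: Iterable[Tuple[str, str]]) -> Iterator[str]:
    """Yield one 'word<TAB>tag' line per token, wrapped in <s> ... </s>."""
    sentence_open = False
    for word, tag in tokens:
        if not sentence_open:
            yield "<s>"
            sentence_open = True
        yield f"{word}\t{tag}"
        if tag in SENTENCE_FINAL_TAGS:
            yield "</s>"
            sentence_open = False
    # Close a trailing sentence that has no final punctuation.
    if sentence_open:
        yield "</s>"
```

A trailing run of tokens without sentence-final punctuation is still closed, so the output always has balanced `<s>` and `</s>` lines.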

Part of speech

Original  Simplified  Description
nominals
NC        N           noun
NQ        NP          proper noun
NB        N           noun
NP        Pron        pronoun
NN        Num         numeral
predicates
PV        V           verb
PA        Adj         adjective
PX        VAux        auxiliary verb
modifiers
MM        Det         determiner
MA        Adv         adverb
interjection
II        Interj      interjection
relational suffix
JC        Suff
JX        Suff
JP        Suff
ending
EP        Suff
EC        Suff
ET        Suff
EF        Suff
affixes
XP        Pref
XS        Suff
symbols
SP        Sym
SL        Sym
SD        Sym
SU        Sym
SF        Sym
SR        Sym
SE        Sym
SY        Sym
foreign words
F         X
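
The original-to-simplified mapping in the table above can be written as a plain lookup table. The sketch below is illustrative only: the names `SIMPLIFIED_TAGS` and `simplify` are hypothetical, and leaving unknown tags unchanged is an assumption rather than documented behaviour.

```python
# Original HanNanum tag -> simplified tag, copied from the table above.
SIMPLIFIED_TAGS = {
    # nominals
    "NC": "N", "NQ": "NP", "NB": "N", "NP": "Pron", "NN": "Num",
    # predicates
    "PV": "V", "PA": "Adj", "PX": "VAux",
    # modifiers
    "MM": "Det", "MA": "Adv",
    # interjection
    "II": "Interj",
    # relational suffix
    "JC": "Suff", "JX": "Suff", "JP": "Suff",
    # ending
    "EP": "Suff", "EC": "Suff", "ET": "Suff", "EF": "Suff",
    # affixes
    "XP": "Pref", "XS": "Suff",
    # symbols
    "SP": "Sym", "SL": "Sym", "SD": "Sym", "SU": "Sym",
    "SF": "Sym", "SR": "Sym", "SE": "Sym", "SY": "Sym",
    # foreign words
    "F": "X",
}


def simplify(tag: str) -> str:
    """Return the simplified tag; unknown tags pass through unchanged (assumption)."""
    return SIMPLIFIED_TAGS.get(tag, tag)
```

For example, `simplify("NQ")` gives `"NP"` and `simplify("EF")` gives `"Suff"`. Note that the original tag `NP` (pronoun) and the simplified tag `NP` (proper noun) are different labels, so the mapping should be applied only once.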