A tagset is a list of part-of-speech tags (POS tags for short), i.e. labels used to indicate the part of speech and sometimes also other grammatical categories (case, tense, etc.) of each token in a text corpus.

Danish ePOS part-of-speech tagset

The Danish ePOS part-of-speech tagset is used in Danish corpora annotated by the ePOS tagger, which was trained on the ePAROLE corpus. The PAROLE tag set is positional, which means that each subclassification of a part of speech has a fixed position within a tag. For example, in the case of nouns, the gender marker is always found at position 3, number at position 4, case at position 5, and definiteness at position 8.

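The basic structure of an ePOS tag, as seen in the tag examples in the class and subclass table, is four colon-separated fields: class and subclass (2 characters), nominal markers (4 characters), verbal markers (2 characters) and additional markers (4 characters).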

An example of a tag in the CQL concordance search box: [tag="NC:sigc:.*"] finds all common nouns that are singular, indefinite, genitive and of common gender, e.g. verdens, finnes. (Note: make sure that you use straight double quotation marks.)
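Queries of this kind can also be assembled programmatically. The following is a minimal Python sketch, not part of Sketch Engine: the function name noun_query and the dictionary layout are hypothetical, the marker codes are copied from the nominal-marker table in this document, and the slot order (number, definiteness, case, gender) is inferred from the example query above.

```python
# Marker codes copied from the nominal-marker table in this document.
NOMINAL = {
    "number": {"singular": "s", "plural": "p"},
    "definiteness": {"indefinite": "i", "definite": "d"},
    "case": {"unmarked": "u", "genitive": "g", "fossilized": "f", "nominative": "n"},
    "gender": {"common": "c", "neuter": "n"},
}
# Slot order inferred from the example tag NC:sigc (an assumption).
ORDER = ("number", "definiteness", "case", "gender")


def noun_query(subclass="C", **features):
    """Build a CQL tag query for nouns (hypothetical helper).

    Features that are not given are left as "." so they match any value.
    """
    unknown = set(features) - set(ORDER)
    if unknown:
        raise ValueError(f"unknown feature(s): {sorted(unknown)}")
    slots = "".join(
        NOMINAL[name][features[name]] if name in features else "."
        for name in ORDER
    )
    return f'[tag="N{subclass}:{slots}:.*"]'


print(noun_query(number="singular", definiteness="indefinite",
                 case="genitive", gender="common"))
# [tag="NC:sigc:.*"]
print(noun_query(number="plural"))
# [tag="NC:p...:.*"]
```

The resulting string can be pasted into the CQL search box; the f-string already produces straight double quotation marks.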

Class and subclass

% can be replaced with a marker from the inflectional part of the tag

- means not defined

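| POS | Subcategory | POS tag example |
|---|---|---|
| V Verb | I infinitive | VI:----:-%:---- |
| V Verb | F finite | VF:----:%%:---- |
| V Verb | M imperative | VM:----:--:---- |
| V Verb | G gerund | VG:%%%%:--:---- |
| V Verb | P participle | VP:%%%%:%-:---- |
| V Verb | T past part. | VT:siu#:%-:---- |
| V Verb | D adv. part. | VD:----:%-:---- |
| A Adjective | C common | AC:%%%%:--:%--- |
| A Adjective | D adverbial | AD:----:--:%--- |
| L Numeral | C cardinal | LC:--%-:--:---- |
| L Numeral | O ordinal | LO:--%%:--:---- |
| N Noun | C common | NC:%%%%:--:---- |
| N Noun | P proper | NP:%%%%:--:---- |
| P Pronoun | C reciprocal | PC:%-%-:--:---- |
| P Pronoun | M demonstrative | PM:%-%%:--:---- |
| P Pronoun | I indefinite | PI:%-%%:--:---- |
| P Pronoun | O possessive | PO:%--%:--:-%%% |
| P Pronoun | P personal | PP:%-%%:--:-%%- |
| P Pronoun | R relative | PR:%-%%:--:---- |
| D Adverb | | D-:----:--:%--- |
| I Interjection | | I-:----:--:---- |
| T Preposition | | T-:----:--:---- |
| C Conjunction | C coordinating | CC:----:--:---- |
| C Conjunction | S subordinating | CS:----:--:---- |
| U Unique | I inf.marker | UI:----:--:---- |
| U Unique | S som/der | US:----:--:---- |
| E Lexical element | W word formation | EW:----:--:---- |
| M Inflectional ending | N attached to a noun | MN:%%%%:--:---- |
| M Inflectional ending | V attached to a verb | MV:----:%%:---- |
| M Inflectional ending | A attached to an adj. | MA:%%%%:--:%--- |
| X Residual | S symbol | XS:----:--:---- |
| X Residual | F foreign | XF:----:--:---- |
| X Residual | Y tagging error | XY:----:--:---- |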
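The tag examples in the table are templates rather than literal tags, so they can be turned into search patterns by substituting a wildcard for each % placeholder. The sketch below illustrates this in Python under two assumptions: the tags use plain ASCII hyphens, and every % position holds exactly one character. The function names are hypothetical and not part of Sketch Engine.

```python
import re


def template_to_pattern(template):
    """Turn a tag template into a regular expression.

    Each "%" placeholder becomes ".", which matches any single character
    (an assumption; it also matches "-", i.e. an undefined position).
    """
    return template.replace("%", ".")


def template_to_cql(template):
    """Wrap the pattern in a CQL tag query (hypothetical helper)."""
    return f'[tag="{template_to_pattern(template)}"]'


print(template_to_cql("VF:----:%%:----"))
# [tag="VF:----:..:----"]

pattern = template_to_pattern("VF:----:%%:----")
print(bool(re.fullmatch(pattern, "VF:----:sa:----")))  # True
print(bool(re.fullmatch(pattern, "NC:sigc:--:----")))  # False
```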

Inflectional part of the tag

Nominal markers

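| Position | Marker | Category | Tag |
|---|---|---|---|
| 1. | Number (NUM) | singular | s |
| | | plural | p |
| 2. | Definiteness (DEF) | indefinite | i |
| | | definite | d |
| 3. | Case (CAS) | unmarked | u |
| | | genitive | g |
| | | fossilized | f |
| | | nominative (personal pronouns only) | n |
| | | accusative (personal pronouns only; identical with unmarked) | u |
| 4. | Gender (GEN) | common | c |
| | | neuter | n |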

Verbal markers

| Position | Marker | Category | Tag |
|---|---|---|---|
| 1. | Tense (TMP) | present | s |
| | | past | t |
| 2. | Voice (VOC) | active | a |
| | | passive | p |

Additional markers

| Position | Marker | Category | Tag |
|---|---|---|---|
| 1. | Degree (DEG, adjectives and some adverbs) | positive | p |
| | | comparative | c |
| | | superlative | s |
| | | absolute superlative | a |
| 2. | Person (PER, personal and possessive pronouns) | first | 1 |
| | | second | 2 |
| | | third | 3 |
| 3. | Reflexiveness (RFL, personal and possessive pronouns) | yes | y |
| | | no | n |
| 4. | Possessor (POS, possessive pronouns) | singular | s |
| | | plural | p |
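Taken together, the three marker tables are enough to decode a complete tag into readable features. The following Python sketch shows one way to do so. It assumes that a tag consists of four colon-separated fields (class and subclass, nominal, verbal, additional), as the tag examples in this document suggest, and that a plain ASCII hyphen marks an undefined position. The function name decode and the feature names are hypothetical.

```python
# Each list gives the markers of one field, in positional order,
# with codes copied from the marker tables in this document.
NOMINAL = [
    ("number", {"s": "singular", "p": "plural"}),
    ("definiteness", {"i": "indefinite", "d": "definite"}),
    ("case", {"u": "unmarked", "g": "genitive", "f": "fossilized", "n": "nominative"}),
    ("gender", {"c": "common", "n": "neuter"}),
]
VERBAL = [
    ("tense", {"s": "present", "t": "past"}),
    ("voice", {"a": "active", "p": "passive"}),
]
ADDITIONAL = [
    ("degree", {"p": "positive", "c": "comparative", "s": "superlative",
                "a": "absolute superlative"}),
    ("person", {"1": "first", "2": "second", "3": "third"}),
    ("reflexive", {"y": "yes", "n": "no"}),
    ("possessor", {"s": "singular", "p": "plural"}),
]


def decode(tag):
    """Decode an ePOS tag into a dict of features (hypothetical helper)."""
    head, nominal, verbal, additional = tag.split(":")
    features = {"class": head[0], "subclass": head[1]}
    for field, markers in ((nominal, NOMINAL), (verbal, VERBAL), (additional, ADDITIONAL)):
        for code, (name, values) in zip(field, markers):
            if code != "-":  # "-" means the position is not defined
                # Unknown codes are kept as-is rather than guessed.
                features[name] = values.get(code, code)
    return features


print(decode("NC:sigc:--:----"))
print(decode("VF:----:sa:----"))
```

The first call returns the class and subclass plus the four nominal features of the example query (singular, indefinite, genitive, common); the second returns a finite verb in the present tense and active voice.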

Source: http://korpus.dsl.dk/clarin/corpus-doc/pos-design.pdf

Danish text corpora in Sketch Engine

Sketch Engine offers dozens of Danish language corpora.