A tagset is a list of part-of-speech tags (POS tags for short), i.e. labels used to indicate the part of speech and sometimes also other grammatical categories (case, tense etc.) of each token in a text corpus.

Greek INTERA tagset

The Greek INTERA part-of-speech tagset is used in Greek corpora annotated with TreeTagger trained on the INTERA corpus. TreeTagger was developed by Helmut Schmid in the TC project at the Institute for Computational Linguistics of the University of Stuttgart.

Example of a tag in the CQL concordance search box: [tag="NoCmNeSgNm"] finds all common nouns of neuter gender in the singular and the nominative case, e.g. βιβλίο. (Note: make sure that you use straight double quotation marks.)
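Judging from the example tag, word-class tags appear to be concatenations of two-character codes (No + Cm + Ne + Sg + Nm). The following minimal Python sketch splits such a tag into its codes. The function name split_tag is hypothetical, and the sketch assumes the two-character scheme holds for every word-class tag; tokenizer tags such as PUNCT do not follow it.

```python
def split_tag(tag: str) -> list[str]:
    """Split a word-class tag into consecutive two-character codes.

    Assumption: every code is exactly two characters long, as in
    the example tag NoCmNeSgNm.
    """
    if len(tag) == 0 or len(tag) % 2 != 0:
        raise ValueError(f"not a sequence of two-character codes: {tag!r}")
    return [tag[i:i + 2] for i in range(0, len(tag), 2)]


print(split_tag("NoCmNeSgNm"))  # ['No', 'Cm', 'Ne', 'Sg', 'Nm']
```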

Basic notation

PoS tag Description Example
No noun βιβλίο
Aj adjective ψηλός
Nm numeral ένας
At article το
Vb verb γράφω
Pn pronoun εγώ
Ad adverb ψηλά
As adposition (= preposition) σε
Cj conjunction και
Ij interjection Αλί
Pt particles μην
Rg residual Τσόμσκυ
tokenizer tags

Detailed notation

Noun

Position Category Value PoS tag
2 Type Common Cm
Proper Pr
3 Gender Masculine Ma
Feminine Fe
Neuter Ne
4 Number Singular Sg
Plural Pl
5 Case Nominative Nm
Genitive Ge
Accusative Ac
Dative Da
Vocative Vo
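As an illustration of how the noun codes combine, here is a small Python sketch that decodes a noun tag into readable features. It assumes the order seen in the example tag NoCmNeSgNm (PoS, type, gender, number, case); the names NOUN_FEATURES and decode_noun are hypothetical and not part of any tool.

```python
# Feature order assumed from the example tag NoCmNeSgNm.
NOUN_FEATURES = [
    ("type", {"Cm": "common", "Pr": "proper"}),
    ("gender", {"Ma": "masculine", "Fe": "feminine", "Ne": "neuter"}),
    ("number", {"Sg": "singular", "Pl": "plural"}),
    ("case", {"Nm": "nominative", "Ge": "genitive", "Ac": "accusative",
              "Da": "dative", "Vo": "vocative"}),
]


def decode_noun(tag: str) -> dict[str, str]:
    """Map a noun tag to a dict of feature names and values."""
    if not tag.startswith("No"):
        raise ValueError(f"not a noun tag: {tag!r}")
    # Everything after the PoS code is read as two-character codes.
    codes = [tag[i:i + 2] for i in range(2, len(tag), 2)]
    if len(codes) != len(NOUN_FEATURES):
        raise ValueError(f"expected {len(NOUN_FEATURES)} feature codes: {tag!r}")
    decoded = {}
    for (name, values), code in zip(NOUN_FEATURES, codes):
        if code not in values:
            raise ValueError(f"unknown {name} code {code!r} in {tag!r}")
        decoded[name] = values[code]
    return decoded


print(decode_noun("NoCmNeSgNm"))
# {'type': 'common', 'gender': 'neuter', 'number': 'singular', 'case': 'nominative'}
```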

Adjective

Position Category Value PoS tag
2 Degree Basic Ba
Comparative Cp
Superlative Su
3 Gender as with nouns
4 Number
5 Case

Numeral

Position Category Value PoS tag
2 Type Cardinal Cd
Ordinal Od
Multiplicative Ml
Analog An
Collective Ct
3 Gender as with nouns
4 Number
5 Case
6 Function Adjectival Aj
Nominal No
Adverbial Ad

Article

Position Category Value PoS tag
2 Type Definite Df
Indefinite Id
3 Gender as with nouns
4 Number
5 Case

Verb

Position Category Value PoS tag
2 Type Main Mn
Impersonal Is
3 Finiteness/Mood Indicative Id
Imperative Mp
Infinitive Nf
Participle Pp
4 Tense Present Pr
Past Pa
No value Xx
5 Person 1st 01
2nd 02
3rd 03
6 Number Singular Sg
Plural Pl
No value Xx
7 Gender Masculine Ma
Feminine Fe
Neuter Ne
No value Xx
8 Aspect Imperfective Ip
Perfective Pe
9 Voice Active Av
Passive Pv
10 Case Nominative Nm
Genitive Ge
Accusative Ac
Dative Da
Vocative Vo
No value Xx
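Because CQL attribute values are regular expressions, a verb query can leave some positions open by writing two dots for each unspecified code. The Python sketch below builds such a query string from the positions in the verb table. It assumes every verb tag fills all ten positions (with Xx where no value applies); the names VERB_SLOTS and verb_cql are hypothetical.

```python
# Positions 2-10 of a verb tag, in the order given in the verb table.
VERB_SLOTS = ["type", "mood", "tense", "person", "number",
              "gender", "aspect", "voice", "case"]


def verb_cql(**codes: str) -> str:
    """Build a CQL tag query for verbs.

    Each keyword names a slot and gives its two-character code;
    unspecified slots become '..' (any two characters).
    """
    unknown = set(codes) - set(VERB_SLOTS)
    if unknown:
        raise ValueError(f"unknown slots: {sorted(unknown)}")
    body = "".join(codes.get(slot, "..") for slot in VERB_SLOTS)
    return f'[tag="Vb{body}"]'


# Past tense, 3rd person singular, everything else left open.
print(verb_cql(tense="Pa", person="03", number="Sg"))
# [tag="Vb....Pa03Sg........"]
```

If verb tags in a given corpus turn out to be shorter than ten positions, a trailing .* in place of the final dots would be the safer pattern.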

Pronoun

Position Category Value PoS tag
2 Type Personal Pe
Demonstrative Dm
Possessive Po
Indefinite Id
Interrogative Ir
Relative Re
Relative Indefinite Ri
3 Person 1st 01
2nd 02
3rd 03
4 Gender Masculine Ma
Feminine Fe
Neuter Ne
5 Number Singular Sg
Plural Pl
6 Case Nominative Nm
Genitive Ge
Accusative Ac
Dative Da
Vocative Vo
7 Inflection Strong St
Weak We
No value Xx

Adverb

Position Category Value PoS tag
2 Type Xx
3 Degree Basic Ba
Comparative Cp
Superlative Su

Adposition (= preposition)

Position Category Value PoS tag
2 Type Pp
3 Form Simple Sp
Prepart Pa
4 Gender as with nouns
5 Number
6 Case

Conjunction

Position Category Value PoS tag
2 Type Coordinative Co
Subordinative Sb

Interjection

PoS tag Description Example
Ij interjection Αλί

Particles

Position Category Value PoS tag
2 Type Future Fu
Negative Ng
Subjunctive Sj
Other Ot

Residual

Position Category Value PoS tag
2 Type Foreign Fw
Abbreviation Ab
Acronym An
Symbol Sy
3 Transliteration Transliterated Tr
Original Or
No value Xx

Tokenizer tag

Category PoS tag Example
Punctuation PUNCT , : ‘
Terminal punctuation P_TERM_P ? ; !
Open punctuation O_PUNCT ( [ «
Close punctuation C_PUNCT ) ] »
Abbreviations ABBR κλπ, κοκ, OHE
Non-breaking abbreviations (abbreviations never occurring at the end of a sentence) NBABBR κ.
Initials INIT Γερ.
Digits DIG 1, 10, 86%
Enumerations ENUM 1. 1]
Dates DATE 21.10.1998, Δεκ_1996

Source: http://nlp.ilsp.gr/nlp/tagset_examples/tagset_en

Greek text corpora in Sketch Engine

Sketch Engine offers dozens of Greek-language corpora.
