A tagset is a list of part-of-speech tags (POS tags for short), i.e. labels used to indicate the part of speech and sometimes also other grammatical categories (case, tense etc.) of each token in a text corpus.

The Historical English Penn Treebank part-of-speech tagset is available in corpora of Historical English. It is a specialized POS tagset designed to describe the grammatical categories of historical stages of the language. See more at http://www.ling.upenn.edu/histcorpora/

An example of a tag in the CQL concordance search box: [tag="ADJ.*"] finds all adjectives, including comparatives and superlatives, e.g. good, slow (note: please make sure that you use straight double quotation marks).
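How a pattern such as ADJ.* selects tags can be tried out outside the corpus tool. The following is a minimal sketch in Python, assuming that the pattern has to match the whole tag value (as a CQL attribute pattern does); the helper name tags_matching and the short sample list are illustrative only and are not part of any corpus tool.

```python
import re

# A small sample of tags from the tagset on this page (not the full list).
SAMPLE_TAGS = ["ADJ", "ADJR", "ADJS", "ADV", "ADVR", "N", "NS", "VB"]


def tags_matching(pattern, tags):
    """Return the tags that the pattern matches in full (hypothetical helper)."""
    compiled = re.compile(pattern)
    # fullmatch mimics the assumption that the pattern covers the whole tag.
    return [tag for tag in tags if compiled.fullmatch(tag)]


print(tags_matching("ADJ.*", SAMPLE_TAGS))  # ['ADJ', 'ADJR', 'ADJS']
print(tags_matching("ADJ", SAMPLE_TAGS))    # ['ADJ']
```

Under this assumption, ADJ.* covers the plain, comparative and superlative adjective tags, while ADJ alone covers only the plain adjective tag.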


. sentence-final punctuation
, sentence-internal punctuation
' single quote
" double quote
$ possessive marker
+ joins constituent morphemes in compounds Example: (N+N mankind)
ADJ adjective
ADJR adjective, comparative
ADJS adjective, superlative
ADV adverb
ADVR adverb, comparative
ADVS adverb, superlative
ALSO the words ALSO (except when = AS) and EKE (the latter only in Middle English)
BAG BE, present participle
BE BE, infinitive
BED BE, past (including past subjunctive)
BEI BE, imperative
BEN BE, perfect participle
BEP BE, present (including present subjunctive)
C complementizer
CODE non-text material (e.g., page numbers)
CONJ coordinating conjunction
D determiner
DAG DO, present participle
DAN DO, passive participle (verbal or adjectival)
DO DO, infinitive
DOD DO, past (including past subjunctive)
DOI DO, imperative
DON DO, perfect participle
DOP DO, present (including present subjunctive)
ELSE the word ELSE in the collocation OR ELSE
EX existential THERE
FOR infinitival FOR
FOR+TO cliticized FOR+TO
FP focus particle
FW foreign word
HAG HAVE, present participle
HAN HAVE, passive participle (verbal or adjectival)
HV HAVE, infinitive
HVD HAVE, past (including past subjunctive)
HVI HAVE, imperative
HVN HAVE, perfect participle
HVP HAVE, present (including present subjunctive)
ID token identification number
INTJ interjection
LB line break
LS list marker
MAN indefinite subject pronoun (ME, MAN); only in Middle English
MD modal verb
MD0 modal verb, untensed
META special text material (e.g., stage directions); generally found only in drama
N common noun, singular
N$ common noun, singular, possessive
NEG negation
NPR proper noun, singular
NPR$ proper noun, singular, possessive
NPRS proper noun, plural
NPRS$ proper noun, plural, possessive
NS common noun, plural
NS$ common noun, plural, possessive
NUM cardinal number
NUM$ cardinal number, possessive
ONE the word ONE (except as focus particle)
ONE$ ONE, possessive
OTHER the word OTHER (except as conjunction)
OTHER$ OTHER, nominal use, possessive
OTHERS OTHER, nominal use, plural
OTHERS$ OTHER, nominal use, plural possessive
P preposition or subordinating conjunction
PRO personal pronoun
PRO$ possessive pronoun
Q quantifier
Q$ quantifier, possessive
QR quantifier, comparative (MORE, LESS)
QS quantifier, superlative (MOST, LEAST)
RP adverbial particle
SUCH the word SUCH
TO infinitival TO, TIL, and AT
VAG present participle
VAN passive participle (verbal or adjectival)
VB infinitive, verbs other than BE, DO, HV
VBD past (including past subjunctive)
VBI imperative
VBN perfect participle
VBP present (including present subjunctive)
WADV wh-adverb
WARD the morpheme WARD
WD wh-determiner
WPRO wh-pronoun
WPRO$ possessive wh-pronoun
WQ WHETHER introducing indirect questions
X tag for unknown part of speech
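Two symbols in the list above combine with other tags: + joins the tags of the constituent morphemes of a compound (as in N+N), and $ marks possessive forms (as in N$ or NPRS$). A minimal sketch of how such a tag could be taken apart, assuming that a trailing $ applies to the tag as a whole; the helper name describe_tag is hypothetical.

```python
def describe_tag(tag):
    """Split a POS tag into constituent tags and a possessive flag (hypothetical helper)."""
    # A bare "$" is the possessive marker tag itself, not a possessive form.
    possessive = len(tag) > 1 and tag.endswith("$")
    core = tag[:-1] if possessive else tag
    # A bare "+" is the joining symbol itself, so only longer tags are split.
    parts = core.split("+") if len(core) > 1 else [core]
    return {"parts": parts, "possessive": possessive}


print(describe_tag("N+N"))     # {'parts': ['N', 'N'], 'possessive': False}
print(describe_tag("NPRS$"))   # {'parts': ['NPRS'], 'possessive': True}
print(describe_tag("FOR+TO"))  # {'parts': ['FOR', 'TO'], 'possessive': False}
```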

Extended syntactic tags

The basic syntactic tags listed above can be modified by the following extended tags (also referred to as dash tags).

Suffix tag, definition and example:
-LFD left-dislocated constituent. Example: ADVP-LOC-LFD (left-dislocated locative adverb phrase)
-PRN parenthetical or appositive. Example: NP-PRN (parenthetical or appositive noun phrase)
-RSP resumptive constituent. Example: NP-SBJ-RSP (resumptive subject)
-SPE direct speech (only on CP, IP). Examples: CP-REL-SPE (relative clause, direct speech), IP-MAT-SPE (matrix clause, direct speech)

“-#” (a hyphen followed by a numeric index) is used to coindex antecedents and their traces, as well as expletives (overt or empty) that are associated with a clause or noun phrase.

“=#” (an equals sign followed by a numeric index) is used to coindex gapped clauses with full clauses. See Gapping, Right-node raising.
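The structure of such labels (a base category, optional dash tags, and an optional numeric index introduced by a hyphen or an equals sign) can be illustrated with a short parser. This is only a sketch, assuming that the index, when present, is always the last element of the label; the function name parse_label and the returned field names are hypothetical.

```python
import re

# Body of the label, optionally followed by "-<digits>" or "=<digits>" at the end.
LABEL_RE = re.compile(r"^(?P<body>.*?)(?:(?P<sep>[-=])(?P<index>\d+))?$")


def parse_label(label):
    """Split a syntactic label into base category, dash tags and index (hypothetical helper)."""
    match = LABEL_RE.match(label)
    base, *dash_tags = match.group("body").split("-")
    index = int(match.group("index")) if match.group("index") else None
    # "-" coindexes antecedents, traces and expletives; "=" coindexes gapped clauses.
    index_kind = {"-": "coindex", "=": "gapping"}.get(match.group("sep"))
    return {"base": base, "dash_tags": dash_tags, "index": index, "index_kind": index_kind}


print(parse_label("NP-SBJ-RSP"))
print(parse_label("NP-SBJ-1"))
print(parse_label("IP-MAT=1"))
```

With these inputs, NP-SBJ-RSP yields the dash tags SBJ and RSP and no index, NP-SBJ-1 yields index 1 of kind "coindex", and IP-MAT=1 yields index 1 of kind "gapping".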

Empty categories

0 empty operator
*arb* arbitrary subject in ECM infinitives (as in I have heard *arb* tell)
*con* subject elided under conjunction
*exp* empty expletive subject
*pro* “small pro” subject
*ICH* mnemonic for “insert constituent here”; trace of extraposition, scrambling, or other movement that does not fit neatly into the A/A’ dichotomy
*T* trace of A’-movement
* trace of A-movement; also default empty category
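The empty categories listed above can be recognised with a simple lookup. The sketch below assumes that a coindexed trace carries its index as a trailing hyphen plus digits (for example *T*-1); that notation, the function name empty_category and the shortened descriptions are assumptions made for illustration. Note that a literal 0 in the text would also be reported as an empty operator, so in practice the check would need the surrounding annotation as context.

```python
# Symbols and short descriptions taken from the list of empty categories above.
EMPTY_CATEGORIES = {
    "0": "empty operator",
    "*arb*": "arbitrary subject in ECM infinitives",
    "*con*": "subject elided under conjunction",
    "*exp*": "empty expletive subject",
    "*pro*": "small pro subject",
    "*ICH*": "trace of extraposition, scrambling, or other movement",
    "*T*": "trace of A'-movement",
    "*": "trace of A-movement; also default empty category",
}


def empty_category(leaf):
    """Return a description if leaf is an empty category, else None (hypothetical helper)."""
    # Assumption: an index is attached as a trailing "-<digits>", e.g. "*T*-1".
    symbol, sep, index = leaf.rpartition("-")
    if sep and index.isdigit() and symbol in EMPTY_CATEGORIES:
        return EMPTY_CATEGORIES[symbol]
    return EMPTY_CATEGORIES.get(leaf)


print(empty_category("*T*-1"))    # trace of A'-movement
print(empty_category("*con*"))    # subject elided under conjunction
print(empty_category("mankind"))  # None
```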

