Attributes include Penn Treebank tags, SuperSenseTagger labels (WordNet labels; see the list of super senses), and named entity labels. The corpus was presented at Skew-2; see the presentation in PDF (along with details of the Dante Disambiguation Project).
The corpus is finished, but the Sketch Grammar is still undergoing research and development.
v1.0 (8th March 2010)
- 115 million tokens