It was collected in collaboration with David Levine and Alexander Provan as the basis of their research article, ‘International Art English’, published in the online journal Triple Canopy and discussed in the Guardian newspaper here. A follow-up debate can be viewed here.
- raw HTML data of each article parsed into metadata – year, month, author (institution), title – and textual content of the announcement.
- tokenized using unitok with the English model
- tagged by TreeTagger using the Penn Treebank tagset
- compiled in the Sketch Engine using the English sketch grammar for word sketches
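The first two steps above (HTML parsing and tokenization) can be illustrated with a minimal Python sketch. It is not the code used to build the corpus: the HTML layout, the `AnnouncementParser` class, the `to_vertical` function and the regex tokenizer are all hypothetical stand-ins (the real pipeline used unitok and TreeTagger), and the metadata values are passed in by hand rather than extracted. The output follows the one-token-per-line "vertical" layout with a `<doc>` structure carrying the metadata as attributes.

```python
import re
from html.parser import HTMLParser
from xml.sax.saxutils import quoteattr


class AnnouncementParser(HTMLParser):
    """Collects the <title> text and all other non-empty text nodes.

    Assumes a simple page layout; script/style content is not filtered out.
    """

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title_parts = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title_parts.append(data)
        elif data.strip():
            self.text_parts.append(data.strip())


def naive_tokenize(text):
    # Crude stand-in for unitok: runs of word characters, or single
    # punctuation marks, each become one token.
    return re.findall(r"\w+|[^\w\s]", text)


def to_vertical(html, year, month, author):
    """Return one document in vertical format: <doc ...>, tokens, </doc>."""
    parser = AnnouncementParser()
    parser.feed(html)
    parser.close()
    title = "".join(parser.title_parts).strip()
    attrs = {"year": year, "month": month, "author": author, "title": title}
    header = "<doc " + " ".join(
        f"{key}={quoteattr(str(value))}" for key, value in attrs.items()
    ) + ">"
    tokens = naive_tokenize(" ".join(parser.text_parts))
    return "\n".join([header, *tokens, "</doc>"])


if __name__ == "__main__":
    sample = (
        "<html><head><title>Open Call</title></head>"
        "<body><p>The show opens, finally.</p></body></html>"
    )
    print(to_vertical(sample, 2012, "05", "Example Institution"))
```

In the actual pipeline, a tagger such as TreeTagger would add part-of-speech and lemma columns to each token line before the vertical file is compiled.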
v. 1 (24 May 2012)
- created, 6.2M tokens