The corpus was prepared by Adriano Ferraresi. The process is described in Ferraresi et al. (LREC 2008).
All material is taken from the .uk domain. The text was part-of-speech tagged and lemmatised using TreeTagger, a leading part-of-speech tagger that has been trained for a number of languages. The tagging uses the Penn Treebank tagset.
The grammatical relation definitions are those prepared by David Tugwell for other English corpora.
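
For readers who want to work with tagged and lemmatised text of this kind, the following is a minimal sketch of reading TreeTagger-style output, which places one token per line as tab-separated word, tag and lemma. The sample sentence, the `Token` type and the `parse_tagged_lines` function are illustrative assumptions and are not taken from the corpus or its tooling.

```python
from typing import Iterable, List, NamedTuple


class Token(NamedTuple):
    """One token as emitted by a tagger: surface form, POS tag, lemma."""
    word: str
    tag: str
    lemma: str


def parse_tagged_lines(lines: Iterable[str]) -> List[Token]:
    """Parse tab-separated word/tag/lemma lines into Token records.

    Lines that do not have exactly three tab-separated fields
    (blank lines, structural markup) are skipped.
    """
    tokens = []
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 3:
            continue
        tokens.append(Token(*parts))
    return tokens


# Hypothetical sample, invented for illustration only.
SAMPLE = (
    "The\tDT\tthe\n"
    "cats\tNNS\tcat\n"
    "on\tIN\ton\n"
    "the\tDT\tthe\n"
    "mat\tNN\tmat\n"
)

tokens = parse_tagged_lines(SAMPLE.splitlines())
lemmas = [t.lemma for t in tokens]
nouns = [t.word for t in tokens if t.tag.startswith("NN")]
```

Keeping word, tag and lemma together in one record makes it straightforward to filter by tag (as with `nouns`) or to count by lemma rather than by surface form.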