The corpus was prepared by Steven Bird. The preparation process is described in the bibliography (below).
All material is taken from here. It was part-of-speech tagged and lemmatised using TreeTagger, a leading part-of-speech tagger which has been trained for a number of languages.
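As a rough illustration of what the tagged and lemmatised data looks like, the sketch below reads TreeTagger-style output, assuming the common one-token-per-line layout with tab-separated word, part-of-speech tag and lemma (what TreeTagger prints when run with its `-token` and `-lemma` options). The `Token` type, the `parse_tagged_lines` helper and the sample lines are hypothetical and are not taken from this corpus.

```python
from typing import Iterable, List, NamedTuple


class Token(NamedTuple):
    """One tagged token: surface form, part-of-speech tag, lemma."""
    word: str
    pos: str
    lemma: str


def parse_tagged_lines(lines: Iterable[str]) -> List[Token]:
    """Parse tab-separated word/tag/lemma lines into Token records.

    Assumption: three tab-separated fields per token line. Lines that
    do not have exactly three fields (blank lines, markup passed
    through by the tagger) are skipped.
    """
    tokens = []
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 3:
            continue
        tokens.append(Token(*parts))
    return tokens


# Illustrative sample only, not real corpus data.
sample = "The\tDT\tthe\ncats\tNNS\tcat\nsleep\tVVP\tsleep\n"
tokens = parse_tagged_lines(sample.splitlines())
lemmas = [t.lemma for t in tokens]
```

Keeping the word form, tag and lemma together in one record mirrors the three kinds of information the tagging step adds to each token.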
The grammatical relation definitions used are those prepared by David Tugwell for other English corpora.
The word sketches are a first version.