pukWaC: ukWaC English corpus parsed with MaltParser
The pukWaC is a subset of the British English corpus ukWaC, which was collected from the .uk domain using medium-frequency words from the British National Corpus as seed words. Compared with the ukWaC corpus, the pukWaC corpus additionally contains syntactic dependency annotation, which shows the dependencies between the units of a sentence, i.e. which word depends on which. The parsing was performed with MaltParser.
The pukWaC corpus was tagged with TreeTagger using the Penn Treebank tagset.
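
To illustrate what "which word depends on which" means in practice, the following Python sketch models one tagged and parsed sentence and lists each word together with the word it depends on. The `Token` structure, the example sentence, and the relation labels (`NMOD`, `SBJ`, `ROOT`) are assumptions made for illustration only; they are not taken from the corpus and do not describe its actual file format.

```python
from typing import NamedTuple


class Token(NamedTuple):
    """One word of a parsed sentence (hypothetical structure, not the corpus format)."""
    index: int      # 1-based position of the word in the sentence
    word: str       # surface form
    tag: str        # Penn Treebank part-of-speech tag
    head: int       # index of the word this token depends on; 0 marks the root
    relation: str   # dependency label (illustrative values only)


# Hand-made example sentence: "The cat sleeps"
SENTENCE = [
    Token(1, "The", "DT", 2, "NMOD"),
    Token(2, "cat", "NN", 3, "SBJ"),
    Token(3, "sleeps", "VBZ", 0, "ROOT"),
]


def dependency_pairs(tokens):
    """Return (dependent, head, relation) triples; the root's head is shown as 'ROOT'."""
    by_index = {t.index: t.word for t in tokens}
    return [(t.word, by_index.get(t.head, "ROOT"), t.relation) for t in tokens]


for dependent, head, relation in dependency_pairs(SENTENCE):
    print(f"{dependent} -> {head} ({relation})")
```

Each token stores the index of its head, so the whole dependency structure of a sentence can be recovered from a flat list of tokens.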