This corpus was gathered by Serge Sharoff at the University of Leeds using the method described here, which is designed to produce a general-language resource. The content has undergone little checking.

It was part-of-speech tagged and lemmatised using TreeTagger, a leading part-of-speech tagger. The Russian model for TreeTagger was also trained by Sharoff, as described here.
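
For readers who want to process tagged text themselves, the following is a minimal sketch of reading TreeTagger-style output. It assumes the three-column, tab-separated layout (word, tag, lemma) that TreeTagger prints when run with its -token and -lemma options. The names Token and parse_tagged are hypothetical, and the sample words and tags are illustrative only, not taken from this corpus.

```python
from typing import Iterable, List, NamedTuple


class Token(NamedTuple):
    """One tagged token: surface form, part-of-speech tag, lemma."""
    word: str
    tag: str
    lemma: str


def parse_tagged(lines: Iterable[str]) -> List[Token]:
    """Parse tab-separated word/tag/lemma lines into Token records.

    Lines that do not have exactly three tab-separated fields
    (blank lines, markup such as <s>) are skipped.
    """
    tokens = []
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 3:
            continue
        tokens.append(Token(*parts))
    return tokens


# Illustrative input only: the tags below are made up for the example.
sample = [
    "<s>",
    "мама\tNcfsnn\tмама",
    "мыла\tVmis-sfa\tмыть",
    "раму\tNcfsan\tрама",
    "</s>",
]

for tok in parse_tagged(sample):
    print(tok.word, tok.tag, tok.lemma)
```

Keeping the lemma alongside the surface form makes it straightforward to count or search by dictionary form rather than by inflected form, which matters for a highly inflected language such as Russian.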

Word sketches by Maria Khokhlova.