See the Russian tagset.
v. 1.0 (April 2012)
- initial version – 15.8 billion words
v. 1.1 (May 9, 2014)
- removed documents containing Ukrainian characters [ІіЇїЄє] or Belarusian characters [Ўў]
- removed documents from sites with a high relative frequency of the word порно (porn)
- version currently being re-processed – 14.5 billion words
- dynamic case, number, gender
- gender lemma attribute
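
The two document filters listed for v. 1.1 could be sketched roughly as follows. This is only an illustration, not the pipeline actually used for the corpus: the function names, the (site, text) input format, the simple \w+ tokenization, and the 0.01 threshold are all assumptions made for the example; the changelog does not state the real cutoff or how the relative frequency was computed.

```python
import re
from collections import defaultdict

# Characters named in the changelog: Ukrainian ІіЇїЄє and Belarusian Ўў.
NON_RUSSIAN_CHARS = re.compile("[ІіЇїЄєЎў]")


def has_non_russian_chars(text):
    """True if the text contains any of the listed Ukrainian/Belarusian letters."""
    return NON_RUSSIAN_CHARS.search(text) is not None


def sites_over_threshold(docs, word, threshold):
    """Return the sites whose relative frequency of `word` exceeds `threshold`.

    `docs` is an iterable of (site, text) pairs. Relative frequency is taken
    here as occurrences of `word` divided by all tokens seen for that site
    (an assumption; the changelog does not define it).
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for site, text in docs:
        tokens = re.findall(r"\w+", text.lower())
        totals[site] += len(tokens)
        hits[site] += tokens.count(word)
    return {s for s in totals if totals[s] and hits[s] / totals[s] > threshold}


def filter_corpus(docs, word="порно", threshold=0.01):
    """Drop documents from flagged sites and documents with non-Russian letters.

    The threshold value is hypothetical.
    """
    docs = list(docs)
    bad_sites = sites_over_threshold(docs, word, threshold)
    return [
        (site, text)
        for site, text in docs
        if site not in bad_sites and not has_non_russian_chars(text)
    ]
```

For example, given documents from three hypothetical sites, only the document from the site that passes both checks would be kept. Note that in this sketch the site-level check removes every document from a flagged site, including documents that do not contain the word themselves.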