The Tatar sample corpus contains approximately 200,000 words crawled from the web in 2015. The text in the corpus is tokenised.