The Turkish TenTen corpus was crawled by SpiderLing in December 2011 and January 2012. It was deduplicated with Onion, tokenized with unitok, and encoded in UTF-8.
The current version of the corpus has more than 3.3 million words.
initial version, obtained from the web in 2012
no tagging, no sketches
Adam Kilgarriff Prize
Adam Kilgarriff (1960-2015) was a British corpus linguist and founder of Lexical Computing, the company behind Sketch Engine. Adam devoted his whole life to research at the intersection of corpus linguistics, computational linguistics and lexicography.
To honour our brilliant and much-loved colleague, we established the Adam Kilgarriff Prize for outstanding work in the fields to which Adam contributed so much: corpus linguistics, computational linguistics, and lexicography.