Pages

TenTen corpora

TenTen is a new generation of Web corpora. These corpora are…

zhTenTen corpus

The Simplified Chinese TenTen corpus was created from the Internet…

yoTenTen corpus

Yoruba TenTen web corpus. The corpus is cleaned by jusText,…

uaTenTen corpus

Ukrainian TenTen corpus was crawled by SpiderLing in 2014.…

trTenTen corpus

Turkish TenTen corpus. Crawled by SpiderLing in December 2011…

svTenTen corpus

Swedish TenTen web corpus. The corpus is cleaned by jusText,…

skTenTen corpus

Slovak TenTen corpus. The corpus has been tagged by the Ľ.…

ruTenTen corpus

Russian TenTen corpus. Russian web corpus crawled by SpiderLing…

ptTenTen corpus

Portuguese TenTen corpus. The corpus is processed with Eckhard…

plTenTen corpus

Polish TenTen web corpus was crawled by the web spider SpiderLing…

noTenTen corpus

Norwegian TenTen corpus. The corpus is tagged with Oslo-Bergen…

nlTenTen corpus

Dutch TenTen web corpus. The corpus is cleaned by jusText,…

lvTenTen corpus

Latvian TenTen corpus was crawled by SpiderLing in April 2014.…

ltTenTen corpus

Lithuanian TenTen corpus. The corpus has not been tagged yet. Structural…

koTenTen corpus

Korean TenTen corpus crawled by SpiderLing in August & September…