A page relevant to corpora.

Pages

ptTenTen corpus

Portuguese TenTen corpus. The corpus is processed with Eckhard…

plTenTen corpus

The Polish TenTen web corpus was crawled by the web spider SpiderLing…

noTenTen corpus

Norwegian TenTen corpus. The corpus is tagged with Oslo-Bergen…

nlTenTen corpus

Dutch TenTen web corpus. The corpus is cleaned by jusText,…

lvTenTen corpus

Latvian TenTen corpus was crawled by SpiderLing in April 2014.…

ltTenTen corpus

Lithuanian TenTen corpus. The corpus has not been tagged yet. Structural…

koTenTen corpus

Korean TenTen corpus crawled by SpiderLing in August & September…

itTenTen corpus

The Italian web corpus is a corpus from the TenTen generation of corpora.…

jpTenTen11 corpus

Japanese TenTen corpus gathered from the web in December 2011.…

huTenTen corpus

Hungarian web corpus was crawled by SpiderLing in June 2012.…

heTenTen corpus

Hebrew TenTen web corpus. The corpus is cleaned by jusText,…

frTenTen corpus

French TenTen corpus. French web corpus crawled by SpiderLing…

fiTenTen corpus

Finnish TenTen web corpus crawled by SpiderLing in February…

etTenTen corpus

Estonian web corpus was crawled by SpiderLing in 2013. It…

esTenTen corpus

esTenTen is a Spanish TenTen corpus. The source data was crawled…