Also referred to as “Hebrew Comparable Corpus”, uploaded in 2010.

The corpus comprises two components: translated and non-translated texts in Hebrew. Each component contains about fifteen books (fiction and non-fiction). The two components are matched for topic and genre: for example, there is one biography in each. The corpus is best suited to studying differences between translated and non-translated language, but it can also be used to study language use more generally.

The corpus was compiled as part of a project funded by the Israel Science Foundation and carried out in the Department of Translation and Interpreting Studies at Bar-Ilan University.

See the tagset.


Related paper

Guidelines: About the Hebrew corpora imbedded in Sketch Engine