Also referred to as “Hebrew Comparable Corpus”, uploaded in 2010.

The corpus comprises two components: translated and non-translated texts in Hebrew. Each component contains about fifteen books (fiction and non-fiction). The two components are matched for topic and genre: for example, there is one biography in each. The corpus is best suited to studying differences between translated and non-translated language, but it can also be used to study language use more generally.
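A comparison between the two components usually starts from frequencies normalized by component size, since the raw counts are not directly comparable. The following is a minimal sketch of that arithmetic in Python. The function names are hypothetical and not part of any corpus tool, and the counts and sizes in the usage lines are invented placeholders, not figures from this corpus.

```python
import math


def per_million(count: int, component_size: int) -> float:
    """Relative frequency of an item, expressed per million tokens."""
    if component_size <= 0:
        raise ValueError("component_size must be positive")
    return count * 1_000_000 / component_size


def log_ratio(count_a: int, size_a: int, count_b: int, size_b: int) -> float:
    """log2 of the ratio of relative frequencies in components A and B.

    Positive values mean the item is relatively more frequent in A.
    """
    if count_a <= 0 or count_b <= 0:
        raise ValueError("both counts must be positive for a log ratio")
    return math.log2(per_million(count_a, size_a) / per_million(count_b, size_b))


# Invented placeholder numbers, for illustration only.
translated = per_million(200, 1_000_000)
non_translated = per_million(100, 1_000_000)
difference = log_ratio(200, 1_000_000, 100, 1_000_000)
```

A log ratio of 1 means the item is twice as frequent, relative to size, in the first component. Whether a difference is meaningful would still need a significance test, which this sketch does not cover.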

The corpus was compiled as part of a project funded by the Israel Science Foundation and carried out in the Department of Translation and Interpreting Studies at Bar-Ilan University.

See the part-of-speech tagset summary.

A corpus attribute overview

A list of positional attributes used in the corpus

| TAGGER OUTPUT | ABBREVIATION | VALUES PER TAG |
| --- | --- | --- |
| token | token | |
| transliteration (token) | trans | |
| lemma | lemma | |
| transliteration (lemma) | transl | |
| pos | tag | adjective adverb conjunction copula existential foreign interjection interrogative modal negation noun numberExpression numeral participle preposition pronoun properName punctuation quantifier title url verb wPrefix |
| pos-type | postype | amount and arithmetic-operation bracket-end bracket-start colon comma coordinating demonstrative determiner dot exclamation-mark gematria hyphen impersonal literal-number numeral-cardinal numeral-fractional numeral-ordinal or other partitive personal proadverb prodet pronoun question-mark quote reflexive relativizing semicolon slash subordinating yesno |
| prefix string | prestring | ב בכ ו וב ובכ וכ וכש וכשל ול ומ ומכ ומש וש ושב ושל ושמ כ ככ כש כשב כשל כשמ ל לכ לכש מ מכ מש משב משכ משל משמ ש שב שכ שכש שכשמ של שמ שמש |
| base string | basestring | |
| suffix string | sufstring | גם ה הם הן ו י ך כם כן ם ן נו |
| gender | gender | feminine masculine masculine-and-feminine |
| number | number | dual dual-and-plural plural singular singular-and-plural |
| status | status | absolute construct |
| polarity | polarity | negative positive |
| person | person | 1 2 3 any |
| tense | tense | beinoni future imperative infinitive past |
| binyan | binyan | Hifil Hitpael Hufal Nifal Paal Piel Pual |
| prefix conjunction | prefconj | conjunction |
| prefix definite article | prefdefinite | definiteArticle |
| prefix interrogative | prefinterrog | |
| prefix preposition | prefprep | preposition |
| prefix subordination conjunction / relativizer | relativizer | relativizer/subordinatingConjunction |
| prefix temporal subordinating conjunction | preftemp | temporalSubConj |
| prefix adverb | prefadv | adverb |
| suffix function | suffunction | accusative-or-nominative possessive pronomial |
| suffix number | sufnum | feminine masculine masculine-and-feminine |
| suffix gender | sufgender | plural singular |
| suffix person | sufper | 1 2 3 |
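Positional attributes like those listed above are typically combined in CQL (Corpus Query Language) token expressions of the form `[attribute="value" & attribute="value"]`. The sketch below builds such expressions in Python. It assumes that the abbreviations in the list (for example `tag`, `binyan`, `tense`) are the attribute names accepted in queries. The helper names `cql_token` and `cql_sequence` are hypothetical and not part of any corpus tool.

```python
def cql_token(**constraints: str) -> str:
    """Build one CQL token expression from attribute=value pairs.

    With no constraints, return "[]", which matches any token.
    Values are inserted as given; CQL treats them as regular expressions.
    """
    if not constraints:
        return "[]"
    parts = [f'{attr}="{value}"' for attr, value in constraints.items()]
    return "[" + " & ".join(parts) + "]"


def cql_sequence(*tokens: str) -> str:
    """Join token expressions into a query over adjacent tokens."""
    return " ".join(tokens)


# A past-tense verb in binyan Piel (attribute names assumed from the list).
verb_query = cql_token(tag="verb", binyan="Piel", tense="past")
print(verb_query)  # [tag="verb" & binyan="Piel" & tense="past"]

# A noun in construct state immediately followed by any token.
pair_query = cql_sequence(cql_token(tag="noun", status="construct"), cql_token())
print(pair_query)  # [tag="noun" & status="construct"] []
```

Keeping query construction in a small helper makes it easier to run the same query against both the translated and the non-translated component and compare the results.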