relative text type frequency

(Also called Relative density in the interface.) Relative text type frequency compares the frequency of a query result in a specific text type with its frequency in the whole corpus. It shows how typical the word(s) are of that text type, e.g. of the spoken part of the corpus or of a particular website from which the texts were downloaded.

The number is the relative frequency of the query result (the share of all hits that fall in this text type) divided by the relative size of the text type (the share of the corpus it represents). It can be interpreted as how much more or less frequent the query result is in this text type than in the whole corpus.

  • less than 100 % – the result is less frequent in this text type than in the whole corpus; it is not typical of or specific to this text type
  • 100 % – the result is as frequent in this text type as it is in the whole corpus
  • more than 100 % – the result is more frequent in this text type than in the whole corpus; it is typical of or specific to this text type

Example

The word ‘test’ has 2000 hits in the corpus. 400 of them are in the text type “Spoken”, and “Spoken” represents 10 % of the corpus. The Relative Text Type frequency is then (400 / 2000) / 0.1 = 2, which means that ‘test’ is twice as common in “Spoken” as in the whole corpus. To get the result as a percentage (as shown in the interface), multiply this number by 100, which gives 200 %.
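
The calculation in the example can be reproduced with a few lines of Python. This is only an illustrative sketch of the arithmetic described above, not Sketch Engine's own code; the function name and its parameters are hypothetical and chosen for readability.

```python
def relative_text_type_frequency(hits_in_text_type, hits_in_corpus, text_type_share):
    """Return the relative text type frequency as a percentage.

    hits_in_text_type -- number of query hits inside the text type
    hits_in_corpus    -- number of query hits in the whole corpus
    text_type_share   -- size of the text type as a fraction of the corpus (0 to 1)

    Hypothetical helper for illustration; names are not taken from Sketch Engine.
    """
    if hits_in_corpus <= 0:
        raise ValueError("hits_in_corpus must be positive")
    if not 0 < text_type_share <= 1:
        raise ValueError("text_type_share must be a fraction between 0 and 1")

    # Share of all hits that fall in this text type, e.g. 400 / 2000 = 0.2
    share_of_hits = hits_in_text_type / hits_in_corpus

    # Divide by the relative size of the text type and convert to a percentage
    return share_of_hits / text_type_share * 100


# The 'test' example: 400 of 2000 hits are in "Spoken", which is 10 % of the corpus
value = relative_text_type_frequency(400, 2000, 0.10)
print(f"{value:.0f} %")
```

With the numbers from the example, the function returns 200, i.e. 200 % in the interface. A word whose hits are spread in proportion to the size of the text type (for instance 100 of 1000 hits in a text type covering 10 % of the corpus) would give 100.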

See also Statistics in Sketch Engine