The Oxford English Corpus (OEC) consists mainly of websites chosen to represent all types of English, from literary novels to everyday newspapers, blogs, and even social media. Besides UK and US English, there are Englishes from Ireland, Australia, New Zealand, the Caribbean, Canada, India, Singapore, and South Africa. The latest version of the corpus contains nearly 2.1 billion words (almost 2.5 billion tokens).

For more information, visit the Oxford Dictionaries website.

The corpus is supplied by Oxford University Press.

Access Policy

Access is restricted unless special permission is granted.

Permission from Oxford University Press is required to access the corpus. Researchers may contact oec@oup.com. Include information about yourself and your research project, and add a note that you would like to access the corpus in Sketch Engine, stating your Sketch Engine user name. (This is a manual process that may take several days.)

Changelog

v3 (Feb 2012)

“OEC + Biwec build v2”
2.073 billion words

Updates:

  • 2012-03-08 encoded, word sketches
  • 2011-04-05 doc.wordcount

v2

2.008 billion words

Updates:

  • 2010-11-02 encoded, word sketches
  • 2011-03-05 doc.wordcount

v1

1.736 billion words

Updates:

  • 2010-03-15 encoded
  • 2010-04-01 word sketches
  • 2011-03-05 doc.wordcount