Historical collection of the Text Creation Partnership (TCP)

Early English Books Online (EEBO) – Phase I; Eighteenth Century Collections Online (ECCO); Early American Imprints, Series I: Evans

This Sketch Engine corpus collection consists of three publicly available parts of larger projects:

  • ProQuest’s EEBO-TCP Phase I – 25,364 books from the EEBO collection from the years 1473–c.1700
  • Gale Cengage’s ECCO-TCP – 2,473 titles printed in the United Kingdom between the years 1701 and 1800
  • Readex’s Evans-TCP – 5,007 books published in America during the years 1639–1800

Each of these three parts is also available as a subcorpus within the corpus. The whole collection contains more than 826 million words. It was annotated by TreeTagger using the Penn Treebank tagset.

Corpus searches can be restricted by criteria such as the year or century of publication, key terms in books, etc.

Diachronic analysis

The corpus contains time metadata, which makes the Trends feature in Sketch Engine available. Trends analyses how the use of a word changes over time by comparing its frequency across a series of comparable time periods.
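The idea of comparing frequencies across time periods can be sketched in a few lines of Python. This is only an illustration, not Sketch Engine's own implementation: the function names, the per-million normalisation, the least-squares slope and the counts in `sample` are all assumptions made for the example, and the numbers are invented rather than taken from the corpus.

```python
def per_million(hits, period_size):
    """Relative frequency: occurrences per million tokens in one period."""
    return hits * 1_000_000 / period_size


def trend_slope(periods):
    """Return per-period relative frequencies and their least-squares slope.

    `periods` is a time-ordered list of (label, hits, period_size) tuples.
    The slope is measured in "hits per million" per period step.
    """
    freqs = [per_million(hits, size) for _, hits, size in periods]
    xs = list(range(len(freqs)))
    mean_x = sum(xs) / len(xs)
    mean_y = sum(freqs) / len(freqs)
    # Ordinary least-squares slope: covariance(x, y) / variance(x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, freqs))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return freqs, sxy / sxx


# Hypothetical counts, invented purely for illustration.
sample = [
    ("16th century", 200, 100_000_000),
    ("17th century", 900, 300_000_000),
    ("18th century", 1600, 400_000_000),
]
freqs, slope = trend_slope(sample)
print(freqs, slope)
```

Normalising by the size of each period matters here because the periods differ greatly in how much text they contain; a positive slope then suggests rising use of the word and a negative slope falling use.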

Availability

The corpus is accessible to all users with a subscription plan and to site licence members. It is not available to trial users.

Corpus compilation of EEBO, ECCO and Evans

826 million words

38.8 million sentences

32,844 documents in total

  • 25,364 books from the EEBO collection
  • 2,473 titles from the ECCO collection
  • 5,007 books from the Evans collection (Readex)

Books published between the years 1473 and 1820

Available diachronic analysis of word usage

A concordance example from the corpus

concordance generated from a corpus using CQL