Brown corpus: Corpus of American English

The Brown corpus (full name: Brown University Standard Corpus of Present-Day American English) was the first text corpus of American English. The original corpus was published in 1963-1964 by W. Nelson Francis and Henry Kučera at the Department of Linguistics, Brown University, Providence, Rhode Island, USA.

The corpus consists of approximately 1 million words (500 samples of 2000+ words each) of running text of edited English prose printed in the United States during 1961. It was revised and amplified in 1979 (see more at http://www.hit.uib.no/icame/brown/bcm.html).

Part-of-speech tagset

Sketch Engine contains two POS-tagged versions of the Brown corpus:

Tools to work with the Brown corpus

A complete set of Sketch Engine tools is available to work with the Brown corpus to generate:

  • word sketch – English collocations categorized by grammatical relations
  • thesaurus – synonyms and similar words for every word
  • word lists – lists of English nouns, verbs, adjectives etc. organized by frequency
  • n-grams – frequency list of multi-word units
  • concordance – examples in context
  • trends – diachronic analysis automatically identifies neologisms and changes in use
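To give a feel for what three of these outputs contain, here is a minimal sketch in plain Python that builds a word list, an n-gram frequency list and a concordance. It is not how Sketch Engine computes them: the function names (`tokenize`, `word_list`, `ngrams`, `concordance`) and the sample text are hypothetical, made up for illustration, and the tokenizer is a deliberately naive regular expression. Word sketches, the thesaurus and trends need grammatical annotation or dated texts and are not covered here.

```python
from collections import Counter
import re


def tokenize(text):
    """Naive tokenizer (an assumption): lowercase runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def word_list(tokens, top=None):
    """Word list: (word, frequency) pairs, most frequent first."""
    return Counter(tokens).most_common(top)


def ngrams(tokens, n=2):
    """N-gram frequency list: counts of every run of n consecutive tokens."""
    return Counter(zip(*(tokens[i:] for i in range(n))))


def concordance(tokens, keyword, width=3):
    """Concordance: each hit of `keyword` with up to `width` tokens of context per side."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines


# Hypothetical sample text, not taken from the Brown corpus.
sample = (
    "The jury said the election was conducted well. "
    "The jury praised the election officials."
)
tokens = tokenize(sample)

print(word_list(tokens, top=3))
print(ngrams(tokens, n=2).most_common(2))
for left, hit, right in concordance(tokens, "jury", width=2):
    print(f"{left:>12} [{hit}] {right}")
```

On a real corpus the same ideas apply to millions of tokens, usually combined with part-of-speech tags so that word lists can be restricted to nouns, verbs or adjectives.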

Availability

Access to the corpus is free for research purposes.

Bibliographic references

FRANCIS, W. Nelson; KUČERA, Henry. Brown corpus. Brown University, 1964.

FRANCIS, W. Nelson; KUČERA, Henry. Brown corpus manual. Brown University, 1979, 15.
