CHILDES corpora are databases that include a rich variety of computerized transcripts from language learners. These databases belong to the child language component of the TalkBank system.

Most of these transcripts record spontaneous conversational interactions. Often the speakers involved are young monolingual children conversing with their parents or siblings. There are also transcripts from bilingual children, older school-aged children, adult second-language learners, children with various types of language disabilities, and aphasics who are trying to recover from language loss. The corpora currently include transcripts for 26 different languages.

The project homepage:


On the homepage in section Child Language Bibliographies (

Search the CHILDES corpora in Sketch Engine

Sketch Engine offers a range of tools to work with the CHILDES corpora.


Use Sketch Engine in minutes

Generating collocations, frequency lists, examples in context, and n-grams, or extracting terms, is easy with Sketch Engine. Use our Quick Start Guide to learn it in minutes.