The CHILDES corpora are databases that include a rich variety of computerized transcripts from language learners. These databases form the child language component of the TalkBank system.
Most of these transcripts record spontaneous conversational interactions. The speakers are often young monolingual children conversing with their parents or siblings. There are also transcripts from bilingual children, older school-aged children, adult second-language learners, children with various types of language disabilities, and aphasics who are trying to recover from language loss. Transcripts are currently available for 26 different languages.
The project homepage: http://childes.psy.cmu.edu/