Open Cambridge Learner Corpus Uncoded

The Open Cambridge Learner Corpus Uncoded is a small part of the Cambridge Learner Corpus. It contains exam scripts written by students learning English and was supplied by Cambridge University Press.

This corpus consists of scripts from over 10,000 students, from more than 60 countries, speaking 7 different first languages.

Corpus text types store detailed information about the examined students. They make it possible to restrict a search to a specific part of the corpus, e.g. by the students' first language, nationality or age. The corpus can therefore be used to find out how a specific group of students express themselves and compose answers at different levels in English exams.
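
Restricting a search by text types can be pictured as filtering scripts by their metadata. The sketch below is only an illustration of that idea, not the corpus interface or an official export format: the Script record, its field names, the select_scripts helper and the three sample scripts are all invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Script:
    # Hypothetical record; the fields loosely mirror a few text types
    # (First language, CEFR level, Exam year) and are assumptions.
    text: str
    first_language: str
    cefr_level: str
    exam_year: int


def select_scripts(scripts, **criteria):
    """Return the scripts whose metadata match every given criterion."""
    return [
        s for s in scripts
        if all(getattr(s, field) == value for field, value in criteria.items())
    ]


# Invented sample data, used only to demonstrate the filtering.
scripts = [
    Script("I am agree with this idea.", "Spanish", "B1", 2009),
    Script("She suggested me to go.", "Spanish", "B2", 2010),
    Script("I have seen him yesterday.", "German", "B1", 2009),
]

# Keep only scripts by Spanish speakers at CEFR level B1.
spanish_b1 = select_scripts(scripts, first_language="Spanish", cefr_level="B1")
print(len(spanish_b1))
```

Combining several criteria narrows the selection in the same way that choosing several text type values narrows a search to one group of students.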


The corpus is accessible to all users with a subscription plan and to site licence members, but not to trial users.


[1] NICHOLLS, Diane. The Cambridge Learner Corpus: Error coding and analysis for lexicography and ELT. In: Proceedings of the Corpus Linguistics 2003 conference. 2003. p. 572-581.

Text types in the corpus

ALTE level


CEFR level exam

CEFR level student performance

Education level

Exam description

Exam score

Exam year

First language

Full time



Other exams taken


Preparation course

Previous attempts

Question ID

Reason for EFL



Years studying English