The Corpus Query Language (CQL) is a query language used in Sketch Engine to search for complex grammatical or lexical patterns, or to apply search criteria that cannot be set through the standard user interface.
CQL
what it is: CQL sets criteria for positions (tokens) using attributes such as word, lemma, tag, lempos or lc.
what it is for: setting conditions on tokens (words), e.g. finding nouns followed by a preposition
where to use it: in the concordance search with the CQL option (for advanced users), in Word Sketch Grammar, in Term Grammar

Regular expressions (regex)
what they are: regular expressions set criteria for strings of characters and can be used inside CQL code or for filtering word lists.
what they are for: setting conditions on character strings such as words or tags, e.g. finding all words starting with the letters br- or all words whose tag starts with the letter N
where to use them: in the concordance search with these options: lemma, word, phrase, character (not in the simple query!); inside CQL code; in word lists and n-grams, to return only the items that match a required pattern

Wildcards
what they are: a simple convention in which the question mark (?) stands for any single unspecified character and the asterisk (*) stands for any number of unspecified characters.
what they are for: a simple system with limited options for searching text
where to use them: only in the simple concordance search
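The difference between token-level conditions, character-level conditions and wildcards can be illustrated with a small Python sketch. This is not Sketch Engine code and does not use any Sketch Engine API: the token list, the tag values (loosely Penn-Treebank-like) and the helper names `wildcard_to_regex`, `match_attr` and `find_sequences` are all hypothetical, and the sketch assumes that a regular expression has to match the whole attribute value.

```python
import re

# Hypothetical mini-corpus: every token is a set of attributes.
# The tags are illustrative only; real tagsets differ between corpora.
TOKENS = [
    {"word": "bread", "tag": "NN"},
    {"word": "on", "tag": "IN"},
    {"word": "the", "tag": "DT"},
    {"word": "bridge", "tag": "NN"},
    {"word": "broke", "tag": "VBD"},
    {"word": "in", "tag": "IN"},
    {"word": "brine", "tag": "NN"},
]


def wildcard_to_regex(pattern):
    """Wildcards: ? becomes one character, * becomes any number of characters."""
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(".")
        elif ch == "*":
            parts.append(".*")
        else:
            parts.append(re.escape(ch))
    return "".join(parts)


def match_attr(token, attr, regex):
    """Character-level condition: the regex must match the whole attribute value."""
    return re.fullmatch(regex, token[attr]) is not None


def find_sequences(tokens, conditions):
    """Token-level condition: consecutive tokens each satisfy one (attribute, regex) pair."""
    hits = []
    n = len(conditions)
    for i in range(len(tokens) - n + 1):
        window = tokens[i:i + n]
        if all(match_attr(t, a, r) for t, (a, r) in zip(window, conditions)):
            hits.append([t["word"] for t in window])
    return hits


# Regex on character strings: all words starting with br-
words_br = [t["word"] for t in TOKENS if match_attr(t, "word", "br.*")]

# Wildcards: br, then any two characters, then d
words_wild = [
    t["word"] for t in TOKENS
    if match_attr(t, "word", wildcard_to_regex("br??d"))
]

# Token-level pattern: a noun (tag starting with N) followed by a preposition
noun_prep = find_sequences(TOKENS, [("tag", "N.*"), ("tag", "IN")])

print(words_br)
print(words_wild)
print(noun_prep)
```

In this sketch the regular expression `br.*` and the wildcard pattern `br??d` both operate on the characters of a single value, whereas `find_sequences` operates on a sequence of tokens and applies a regular expression to one attribute of each token, which mirrors how regular expressions are used inside CQL code.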
The language was developed by the Corpora and Lexicons group at IMS, University of Stuttgart, in the early 1990s (see IMS Corpus Workbench). CQL as used in Sketch Engine is an extension of the original language and differs from it in several ways. This documentation describes CQL as implemented in manatee 2.122 (released April 2015).