What is CQL?

The Corpus Query Language (CQL) is a query language used in Sketch Engine to search for complex grammatical or lexical patterns, or to apply search criteria that cannot be set through the standard user interface.

Where is CQL used?

CQL is used in the concordance search when the CQL option is selected. Advanced users can also use it in Word Sketch Grammar and Term Grammar.

CQL? Regular expressions? Wild cards?

All three can be used in Sketch Engine.

  • CQL (Corpus Query Language) sets criteria for positions or tokens, i.e. words, lemmas, tags, lempos, lc etc.
  • Regular expressions (regex) set criteria for strings of characters and can be used inside the CQL code or for filtering word lists.
  • Wild cards are a simple convention that uses the question mark (?) for any single unspecified character and the asterisk (*) for any number of unspecified characters. Wild cards only work in the Simple concordance search.

Purpose:
  • CQL: to set conditions for tokens (words), e.g. find words which are nouns followed by a preposition
  • regular expressions: to set conditions for character strings such as words or tags, e.g. find all words starting with letters br- or find all words whose tag starts with letter N
  • wild cards: simple system with limited options to search for text

Where to use it:
  • CQL: in concordance search with the CQL option; for advanced users, also in Word Sketch Grammar and in Term Grammar
  • regular expressions: in concordance search with these options: lemma, word, phrase, character (not in simple query!); inside the CQL code; in word lists and n-grams to only find required patterns
  • wild cards: only in the simple concordance search
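
The relationship between wild cards and regular expressions can be illustrated with a short Python sketch. It is not Sketch Engine code: the helper names wildcard_to_regex and matches are hypothetical, the word list is made up for illustration, and the sketch assumes that the asterisk may also stand for zero characters and that a pattern must match the whole word.

```python
import re


def wildcard_to_regex(pattern: str) -> str:
    """Translate a wild card pattern into a regular expression.

    Assumption: '?' is exactly one character, '*' is zero or more
    characters, and every other character is taken literally.
    """
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(".")
        elif ch == "*":
            parts.append(".*")
        else:
            parts.append(re.escape(ch))
    return "".join(parts)


def matches(pattern: str, word: str) -> bool:
    """Return True if the whole word fits the wild card pattern."""
    return re.fullmatch(wildcard_to_regex(pattern), word) is not None


# Illustrative word list (not taken from any corpus).
words = ["bread", "brown", "bird", "bred", "abroad"]

# "find all words starting with letters br-"
starting_with_br = [w for w in words if matches("br*", w)]

# four-letter words starting with br-
four_letter_br = [w for w in words if matches("br??", w)]

print(wildcard_to_regex("br*"))
print(starting_with_br)
print(four_letter_br)
```

Under these assumptions, the wild card pattern br* corresponds to the regular expression br.*, which is the form a regular expression search for words starting with br- would take.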

The language was developed at the Corpora and Lexicons group, IMS, University of Stuttgart in the early 1990s, see IMS Corpus Workbench. CQL as used in Sketch Engine is an extension of the original language and differs from it in several ways. This documentation describes CQL as implemented in manatee 2.122 (released April 2015).