A summary of the Sketch Engine features aimed at terminologists, such as term extraction and related functionality.

Features for terminology and terminography

  • Term extraction finds term candidates in your own documents or in subject-specific texts, which the user can upload or which Sketch Engine can find on the web.
  • Bilingual terminology extraction can be performed on a translation memory (TM) that the user uploads. The result is a bilingual list of terms and their translations.
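
To illustrate the general idea behind finding term candidates, here is a minimal sketch that compares word frequencies in a subject-specific text against a general reference text. It is not Sketch Engine's implementation, which this page does not describe. The function names (tokenize, keyness_scores), the smoothing parameter and the scoring formula (a smoothed ratio of frequencies per million) are assumptions made for the example, and it handles single words only, whereas real term extraction also covers multi-word terms.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens (a simplification)."""
    return re.findall(r"[a-z]+", text.lower())


def keyness_scores(focus_text, reference_text, smoothing=1.0):
    """Rank words of the focus text by how much more frequent they are
    there than in the reference text.

    Hypothetical scoring: (focus_fpm + smoothing) / (ref_fpm + smoothing),
    where fpm is the frequency per million tokens.
    """
    focus = Counter(tokenize(focus_text))
    ref = Counter(tokenize(reference_text))
    focus_total = sum(focus.values()) or 1  # avoid division by zero on empty input
    ref_total = sum(ref.values()) or 1

    scores = {}
    for word, count in focus.items():
        focus_fpm = count / focus_total * 1_000_000
        ref_fpm = ref[word] / ref_total * 1_000_000  # Counter returns 0 for unseen words
        scores[word] = (focus_fpm + smoothing) / (ref_fpm + smoothing)

    # Highest score first: the strongest term candidates.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    focus = "the corpus contains a corpus query"
    reference = "the cat sat on the mat and the dog"
    for word, score in keyness_scores(focus, reference)[:3]:
        print(word, round(score, 2))
```

Words that are frequent in the subject-specific text but rare in the reference text rise to the top of the list, while common function words such as "the" sink to the bottom.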

Terminology tasks can be aided with

  • usage checking with the help of concordance searches, which find examples of a word or phrase in context, sourced from domain-specific corpora which Sketch Engine can automatically create for you.
  • Word Sketch, which highlights typical collocations and word combinations. Use general corpora for information about non-specialized language or subject-specific corpora for professional language.
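
A concordance search of the kind mentioned above can be pictured as a keyword-in-context (KWIC) listing. The following is a minimal sketch of that idea over a plain string, not Sketch Engine's concordancer. The function name concordance and the width parameter are hypothetical, and the search matches whole words only, ignoring case.

```python
import re


def concordance(text, phrase, width=30):
    """Return (left context, match, right context) tuples for every
    whole-word, case-insensitive occurrence of `phrase` in `text`.

    `width` is the number of context characters kept on each side.
    """
    pattern = r"\b" + re.escape(phrase) + r"\b"
    lines = []
    for match in re.finditer(pattern, text, flags=re.IGNORECASE):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        lines.append((left, match.group(0), right))
    return lines


if __name__ == "__main__":
    sample = "A term base stores terms. Each Term has a definition."
    for left, hit, right in concordance(sample, "term", width=12):
        # Right-align the left context so the matches line up in one column.
        print(f"{left:>12} [{hit}] {right}")
```

Aligning the hits in a single column makes it easy to scan how a word or phrase is actually used in the surrounding text.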

Automated building of a subject-specific database of texts

Sketch Engine has a built-in tool which allows the user to create a database of subject-specific texts which can be used for term extraction or for checking how the specialized language is used by real speakers of the language. There are three ways to create such a database (corpus):

  • upload any material the user has access to
  • have Sketch Engine look up and download relevant texts on the web
  • combine the two options above

It is advisable to work with small corpora (e.g. about 100,000 words) made up of terminology-rich texts because they may give more precise results for domain-specific work. Sketch Engine will automatically find and download relevant texts on the internet for you, and your specialized corpus can be ready within minutes. Typically, it takes about 10 minutes to create a 1,000,000-word corpus. All additional functionality will be available automatically with your corpus: Word Sketch, concordance, term extraction, n-grams, word lists, etc. (feature availability depends on the language).

List of domain corpora available in Sketch Engine

Here are a few examples of domain corpora that can be found in Sketch Engine.



For inspiration

In the paper below, Adam Kilgarriff offers an interesting, unusual and well-founded view of terminology.

Adam Kilgarriff (1997). I don’t believe in word senses. Computers and the Humanities, 31(2), pp. 91–113.

