Working with concordance results

Please familiarise yourself with the concordance result screen first.

Concordance result screen – left menu


The concordance lines generated from a corpus can be saved to a text or XML file. The data will be saved as they are displayed on the screen. Use View options to change the display before saving. Saving large concordances can take some time, so it is recommended to save only the first page first to check the output format before saving the whole concordance.

Make subcorpus

Creates a subcorpus from the structures (e.g. documents, paragraphs, sentences) from which the concordance hits originate.

View options

View options – access to the detailed settings, see below.

KWIC – (default setting) displays a KWIC concordance

Sentence – displays a concordance of complete sentences containing the search word.

Sketch Engine remembers the last used option and will use it for the next concordance search with this corpus.

Detailed settings

This dialogue opens after clicking View options in the left menu. Sketch Engine remembers the settings and will use them for the next search with any corpus.

concordance view options

Example from the British National Corpus. Different corpora use different Structures and References.

(1) attributes are additional pieces of information related to each token in a corpus. Normally, they are hidden. Tick the ones you want to display in the concordance. Refer to the glossary for details. At least one attribute has to be selected.

(2) select the structures you want to see in the concordance; hold down Ctrl to select multiple structures. Structures are the segments into which a corpus can be divided, for example s for sentence or p for paragraph. The list of structures can be different for each corpus and can be found on the corpus info page.

(3) References – line source information displayed to the left of each concordance line; hold down Ctrl to select multiple references. The list of available references can be different for each corpus and can be found on the corpus info page.

(4) Attributes (1) can be displayed:
for the search word or phrase only
for all words in each concordance line

In addition, they can be displayed permanently or as tooltips (=when the user hovers the mouse over the word).

(5) number of concordance lines displayed on one page

(6) number of characters displayed to the right and left of the search word. When attributes (1) are selected, they are included in this number, so selecting many attributes can mean that very few words will fit in the line because the attributes will use up most of the characters.

(7) sorts the concordance lines so that good dictionary examples (GDEX) are shown first

(8) for development purposes only

(9) sorting by GDEX is a time-consuming operation, therefore only a limited number of GDEX lines can be displayed

(10) when ticked, each concordance line can be selected by clicking the icon; the sentence will be copied to the clipboard ready for pasting

(11) when ticked, the clipboard will hold multiple sentences, not only one

(12) when the copy icon does not work, try turning Flash off by placing a tick here; copied sentences will appear in a pop-up window ready to be pasted into another application

(13) when ticked, lines will have a check box for copying rather than an icon

(14) when ticked, lines will be numbered

(15) when ticked, the references (3) displayed to the left of each line will be shortened; hover the mouse over them to display the complete text

(16) sets the format for the copied sentences if the corpus supports this feature

One-click copying

Sketch Engine offers a simple way of copying concordance lines to be inserted into a different application.

one-click copying details

  • Click on View Options in the left-hand side panel.
  • Enable these settings:


  • Click on the icons to the right of concordance lines to copy them to your clipboard.
  • Finally, copy and paste with Ctrl+C/Ctrl+V or Ctrl+Insert/Shift+Insert (on Windows) or Cmd+C/Cmd+V (on Mac).


one-click copying troubleshooting

The problem can come from the Adobe Flash player, which one-click copying uses to access your computer’s clipboard. Some versions of the player restrict access to the clipboard.

The functionality was tested with Flash player 9.

For IE and Firefox, there is another, less user-friendly option to access the clipboard without Flash. Sketch Engine uses jQuery zclip for one-click copying.

To enable the non-Flash copying in Firefox:

  1. write “about:config” in the address bar
  2. click “I will be careful, I promise”
  3. put “signed.applets.codebase_principal_support” into the ‘filter’ box (we want to change the browser property with this name)
  4. double-click the value “false” to change it to “true”
  5. when clicking a copy-icon in SkE concordance, a dialog window appears telling that a script is trying to get some abilities; tick “Remember this decision” and “Allow”
  6. restart Firefox

One-click copying should work now in Firefox in the same way as with the Flash player 9.

The other solutions are:

  1. downgrade/upgrade the Flash player to a version which is not restricted
  2. use IE or Firefox with the setting described above
  3. use the one-click copying without Flash option in View options


Concordance lines can be sorted in a simple or complex way and also shuffled. When the concordance is sorted, a new control will appear at the bottom of the page for jumping quickly to a certain value.

sorted concordance jump


Click Sort in the left menu to access detailed sort settings.

Simple sort

simple sort

select whether you want to sort by word form, lemma, tag etc.

Sort key
select whether you want to sort by the tokens to the left of the search word, by the tokens to the right, or by the search word (node) itself.

Number of tokens to sort
sorts by the given number of tokens. With the setting in the screenshot, the lines are first sorted by the first token after the node. Lines with the same first token are then sorted by the second token and then by the third token. To sort by the third token only, use Multilevel sort.
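
The behaviour described above can be pictured as sorting on a tuple of tokens. The following is only a minimal sketch: the representation of a concordance line as a (left context, node, right context) tuple and the function name `right_sort_key` are assumptions made for the example, not part of Sketch Engine.

```python
# Each concordance line is modelled as (left context, node, right context).
# This tuple layout is an assumption made for the example.
lines = [
    (["went", "to"], "work", ["on", "the", "project"]),
    (["hard"], "work", ["in", "the", "garden"]),
    (["had", "to"], "work", ["on", "a", "farm"]),
]


def right_sort_key(line, n_tokens=3, ignore_case=True):
    """Build a sort key from the first n_tokens to the right of the node."""
    _, _, right = line
    key = right[:n_tokens]
    if ignore_case:
        key = [token.lower() for token in key]
    return tuple(key)


# Tuples compare position by position: first token, then second, then third.
sorted_lines = sorted(lines, key=right_sort_key)
for left, node, right in sorted_lines:
    print(" ".join(left), f"<{node}>", " ".join(right))
```

Because the key is a tuple, ties on the first token are broken by the second token and so on, which matches the description of the Number of tokens to sort setting.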

Ignore case
the sort will not be case sensitive

the sort will be in the reverse order

Multilevel sort

multilevel concordance sorting

Here you can specify exactly which tokens should be used for sorting and in which order.

2L = second token to the left, 1R = first token to the right etc.

Lines will first be sorted by the first criterion. The lines with the same attribute in that position will then be sorted further by the second criterion etc.
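
As an illustration of how position codes such as 2L and 1R can drive a multilevel sort, here is a small sketch. The tuple layout of a line and the helper names `token_at` and `multilevel_sort` are hypothetical and chosen only for this example.

```python
def token_at(line, position):
    """Return the token at a position code such as '2L', '1R' or 'node'.

    A line is assumed to be (left context, node, right context).
    Positions outside the available context yield an empty string.
    """
    left, node, right = line
    if position == "node":
        return node
    distance, side = int(position[:-1]), position[-1]
    if side == "L":
        # 1L is the last token of the left context, 2L the one before it.
        return left[-distance] if distance <= len(left) else ""
    return right[distance - 1] if distance <= len(right) else ""


def multilevel_sort(lines, criteria):
    """Sort by the first criterion, then by the second, and so on."""
    return sorted(
        lines,
        key=lambda line: tuple(token_at(line, p).lower() for p in criteria),
    )


lines = [
    (["she", "went", "to"], "work", ["by", "bus"]),
    (["he", "had", "to"], "work", ["at", "night"]),
    (["they", "went", "to"], "work", ["at", "home"]),
]

# First level: second token to the left; second level: first token to the right.
result = multilevel_sort(lines, ["2L", "1R"])
for left, node, right in result:
    print(" ".join(left), f"<{node}>", " ".join(right))
```

In this sketch only the word form is compared. Sorting by another attribute (lemma, tag) would need lines that carry those attributes.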

Left / Right

Left / Right – click to sort the concordance lines by the first word to the left / right of the search word or phrase. The lines will be grouped, which makes it easier to observe lexical or grammatical patterns.


Click Node to sort the lines by the search word or search phrase. If the node in all lines is identical, the sort has no effect.


Sort the concordance lines by the References displayed to the left of each concordance line. The sort depends on which References are selected in View options. See also Concordance result screen.


The concordance lines will be randomly shuffled. Useful if the user wants to see different concordance lines without going to the next screen.


Click Sample to have Sketch Engine randomly select a certain number of lines. Useful when the concordance produces too many results. Random selection guarantees a representative sample of the whole concordance. Sketch Engine remembers the last sample settings and provides a shortcut to them. (Look for Last in the left menu.)

details about sampling

The random sample is generated using a random number generator which always starts from the same point. This means the random sample remains the same every time it is created for the same number of lines. This can be useful when several users want to work on the same sample or a user needs to recreate the same sample as used previously.
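
The effect of a generator that always starts from the same point can be sketched with a fixed seed. This is an illustration only: the seed value, the use of Python's `random` module and the function name `sample_lines` are assumptions, and the generator actually used by Sketch Engine is not described here.

```python
import random


def sample_lines(lines, size, seed=0):
    """Pick `size` lines at random, reproducibly.

    A fresh generator with a fixed seed (an assumed value) is created on
    every call, so the same input and size always give the same sample.
    """
    rng = random.Random(seed)
    size = min(size, len(lines))
    chosen = rng.sample(range(len(lines)), size)
    # Keep the sampled lines in their original concordance order.
    return [lines[i] for i in sorted(chosen)]


concordance = [f"line {i}" for i in range(1000)]
first = sample_lines(concordance, 10)
second = sample_lines(concordance, 10)
print(first == second)
```

Two users who run the sketch with the same concordance and the same sample size obtain identical samples, which is the property the paragraph above describes.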


Watch this short YouTube video to learn how you can create a random sample of concordance lines from your sets of results.


Concordance lines can be filtered. The filter will either exclude the lines which match the filter criteria or keep only those lines.

Filter settings

Click Filter to access these settings:

filtering the concordance

positive – only matching lines will be kept in the concordance
negative – matching lines will be excluded

selected token
select which token should be highlighted in a different colour if the token is found more than once within the search span

search span
determines how far from the node the filter should look for matching tokens

include KWIC
when ticked, the node (search word or phrase) itself will also be included in the filter

The remaining settings are identical to those in the concordance query form.

Different filters can be used on top of each other until the desired outcome is achieved.
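
The interplay of the positive/negative switch, the search span and the include KWIC option can be sketched as follows. The tuple layout of a line, the function name `filter_lines` and the restriction to a single word as the filter criterion are simplifying assumptions for this example.

```python
def filter_lines(lines, word, span=5, positive=True, include_kwic=False):
    """Keep (positive) or drop (negative) lines with `word` near the node.

    A line is assumed to be (left context, node, right context).
    `span` is the number of tokens inspected on each side of the node.
    """
    kept = []
    for line in lines:
        left, node, right = line
        window = left[max(0, len(left) - span):] + right[:span]
        if include_kwic:
            window = window + [node]
        matches = word.lower() in (token.lower() for token in window)
        if matches == positive:
            kept.append(line)
    return kept


lines = [
    (["a", "hard", "day", "at"], "work", ["in", "the", "office"]),
    (["go", "to"], "work", ["by", "bus"]),
]

only_office = filter_lines(lines, "office", span=3)
without_office = filter_lines(lines, "office", span=3, positive=False)
# Filters can be chained: the output of one is the input of the next.
chained = filter_lines(only_office, "hard", span=3)
print(len(only_office), len(without_office), len(chained))
```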


Frequencies of word forms, lemmas, tags and other attributes can be calculated from the concordance lines. You can combine up to 4 attributes for which frequencies should be calculated.


Click Frequency in the left menu to access these settings:


Multilevel Frequency Distribution

Use frequency limit to exclude low frequency items from the list.

Frequency can be calculated for the node but also for any other token up to 6 positions to the right or left of the node.


Frequencies can also be calculated for phrases or groups of words, tokens, tags or a mixture of attributes. For example, these settings will tell us the most frequent combinations of a word followed by work, such as at work, to work, the work etc.:

frequency example
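
A two-level frequency list of this kind, the token immediately to the left of the node combined with the node itself, can be sketched like this. The tuple layout of a line and the function name `combination_frequency` are assumptions for the example.

```python
from collections import Counter


def combination_frequency(lines, freq_limit=1):
    """Count (first token to the left, node) pairs in the concordance.

    A line is assumed to be (left context, node, right context).
    Pairs with a frequency below `freq_limit` are left out.
    """
    counts = Counter(
        (left[-1].lower(), node.lower())
        for left, node, _ in lines
        if left  # skip lines with no left context
    )
    return [(pair, n) for pair, n in counts.most_common() if n >= freq_limit]


lines = [
    (["she", "is", "at"], "work", ["today"]),
    (["went", "to"], "work", ["early"]),
    (["still", "at"], "work", ["now"]),
    (["finished", "the"], "work", ["yesterday"]),
]

for (word, node), n in combination_frequency(lines):
    print(word, node, n)
```

Raising `freq_limit` plays the role of the frequency limit setting and removes the rare combinations from the list.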

Text Type Frequency Distribution

The text type frequency distribution will calculate the frequency of the node in different texts. You can select text type, author, publication date and other attributes. Holding down Ctrl and selecting more than one attribute will produce a statistic for each attribute, all on one page.

Specify a frequency limit to exclude low frequency items.

What does Relative Text Type frequency on the text type frequency page mean?

The number is the relative frequency of the query result divided by the relative size of the particular text type. The number rises with higher frequency and falls with bigger size of the text type. It can be interpreted as “how much more/less often the result of the query appears in this text type in comparison to the whole corpus”.

E.g. “test” has 2000 hits in the corpus. 400 of them are in the text type “Spoken”. Text type “Spoken” represents 10 % of the corpus. Then the Relative Text Type frequency will be (400 / 2000) / 0.1 = 200 % and it means “test” is twice as common in “Spoken” as in the whole corpus.
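
The same calculation can be written as a short function. The function and parameter names are chosen for this sketch only.

```python
def relative_text_type_frequency(hits_in_type, hits_total, type_share):
    """Return the Relative Text Type frequency as a percentage.

    hits_in_type: hits of the query inside the text type
    hits_total:   hits of the query in the whole corpus
    type_share:   size of the text type as a fraction of the corpus (0 to 1)
    """
    share_of_hits = hits_in_type / hits_total
    return share_of_hits / type_share * 100


# The example from the text: 400 of 2000 hits in a text type covering 10 %.
value = relative_text_type_frequency(400, 2000, 0.10)
print(f"{value:.0f} %")
```

A value above 100 % means the query result is over-represented in the text type, and a value below 100 % means it is under-represented.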

These frequency presets are directly available from the left menu:

Node tags

Click to calculate the frequency of the node tags, i.e. tags related to the highlighted word(s) in the concordance. It is only useful if the concordance includes node words with various different tags, such as the word work sometimes used as a noun and sometimes as a verb. Then this option will show how often it is used as each part of speech.

Node forms

Click to calculate the frequency of the node forms, i.e. the different word forms of the highlighted text in the concordance. This is only useful if various forms of the highlighted word are found. If the node is a word with no inflections, such as because, using this option will not produce any sensible result.

Doc IDs

Click to calculate the frequency of Doc IDs.

Text types

Click to calculate how the node is distributed between the different text types.


The collocation tool will search the context around the node and will display the most frequent words which can be regarded as collocation candidates. This process can be slow for large concordances, so it is recommended to use the Word Sketch instead where circumstances permit.

Collocations settings

collocations from concordance

attribute
choose word form, tag, lempos, lemma, lc or lemma_lc


defines how many tokens to the right and to the left of the node will be included

minimum frequency in corpus
excludes words which appear in the corpus less frequently than the given value

minimum frequency in given range

excludes words which appear in the above defined range less frequently than the given value

show functions
defines which values should be shown on the result screen

Sort by
defines which value will be used for sorting

See Statistics used in Sketch Engine for an explanation of the values.
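
How the range and the two minimum frequency settings narrow down the candidate list can be sketched as below. This is a simplified illustration: it ranks candidates by raw co-occurrence count only, whereas the real tool offers the statistics referred to above, and the data layout, the `corpus_freq` dictionary and the function name `collocation_candidates` are assumptions.

```python
from collections import Counter


def collocation_candidates(lines, corpus_freq, left_range=2, right_range=2,
                           min_corpus_freq=5, min_range_freq=3):
    """List words found near the node, filtered by two frequency limits.

    lines:       (left context, node, right context) tuples (assumed layout)
    corpus_freq: word -> frequency in the whole corpus (assumed to be given)
    """
    in_range = Counter()
    for left, _, right in lines:
        window = left[max(0, len(left) - left_range):] + right[:right_range]
        in_range.update(token.lower() for token in window)
    return [
        (word, n)
        for word, n in in_range.most_common()
        if n >= min_range_freq and corpus_freq.get(word, 0) >= min_corpus_freq
    ]


lines = [
    (["a", "hard", "day", "at"], "work", ["in", "the", "office"]),
    (["very", "hard"], "work", ["for", "the", "team"]),
    (["such", "hard"], "work", ["in", "the", "rain"]),
]
corpus_freq = {"the": 1000, "in": 800, "hard": 40, "day": 60, "at": 700}

candidates = collocation_candidates(
    lines, corpus_freq, min_corpus_freq=50, min_range_freq=2
)
print(candidates)
```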


This tool displays a graph showing how the concordance lines are distributed across the corpus. This is useful for checking whether the search word is distributed evenly across the whole corpus or whether there are some places (documents) in the corpus where the word is concentrated, which would suggest that the word is subject specific or that the topics in the corpus are not balanced.

By default, the corpus is divided into 100 equal parts (slices); use the slider to achieve a finer division.

The column height represents the relative frequency of the search word within one part of the corpus (= column).

The columns are clickable and will display a concordance from the slice.

concordance distribution graph
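
The data behind such a graph can be sketched by assigning every hit to a slice according to its position in the corpus. The input format (token positions of the hits), the function name `distribution` and the way slice boundaries are computed are assumptions for this example.

```python
def distribution(hit_positions, corpus_size, slices=100):
    """Count concordance hits in each of `slices` equal parts of the corpus.

    hit_positions: token offsets of the hits, each in 0 .. corpus_size - 1
    """
    counts = [0] * slices
    for position in hit_positions:
        # Integer arithmetic maps a token offset to the index of its slice.
        index = min(position * slices // corpus_size, slices - 1)
        counts[index] += 1
    return counts


# A corpus of 100 tokens cut into 10 slices, with three hits.
counts = distribution([0, 5, 99], corpus_size=100, slices=10)
print(counts)
```

Since all slices have the same size, the column heights in the graph are proportional to these counts. A few very tall columns next to many empty ones indicate that the word is concentrated in a small part of the corpus.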