General instructions on corpus data directory structure

The aim of these instructions is to ensure that for every corpus,…

Sketch Grammar development corpora

This page describes how to use a sketch grammar in your corpus. In…

Preloaded Configuration Templates

When you create a corpus from the Sketch Engine interface (see…

Word Sketches definition files

The following files can be used for building word sketches in…

Corpcheck: Verifying corpus consistency, integrity and completeness

The corpcheck program should be run on each corpus; it checks…

Command-line tools for generating and viewing n-grams

There are a number of utilities available in Finlib/Manatee that…

Text Types, Headers and Subcorpora

Overview: For many kinds of language study, text type is important.…

Preparing Corpus Text

The input format is "vertical" or "word-per-line (WPL)" text,…

Dynamic Attributes

To make use of dynamic attributes, they have to be set up in …

Creating Subcorpora for Sharing with All Users

In Sketch Engine, subcorpora can be created by users in their…

Allowed language names in corpus configuration

A: Afar, Abkhazian, Adyghe, Afrikaans, Aghem, Akan, Amharic,…

Vertical file example

If your vertical text contains only words and no annotation,…
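
As a rough illustration of the words-only case, the sketch below turns plain text into one token per line. The function name `to_vertical` and the regex-based tokenizer are assumptions made for this example, not part of Sketch Engine; a real corpus would normally go through a proper, language-specific tokenizer.

```python
import re


def to_vertical(text: str) -> str:
    """Return `text` as words-only vertical text: one token per line.

    Hypothetical helper for illustration. The tokenizer is deliberately
    naive: a token is either a run of word characters or a single
    non-space symbol, so "don't" is split into three tokens.
    """
    tokens = re.findall(r"\w+|[^\w\s]", text)
    return "\n".join(tokens)


if __name__ == "__main__":
    # Prints each token of the sentence on its own line.
    print(to_vertical("The cat sat."))
```

Punctuation is separated from the preceding word so that each token occupies its own line, which is what the word-per-line layout requires.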

The Corpus Configuration File: Overview

For the software to be able to use a corpus, there are a number…

Preparing a Text Corpus for Sketch Engine: Overview

This page describes how to prepare a text corpus for indexation…

Corpus Configuration File: All Features

Corpus configuration options. NAME: name of the corpus; defaults…