Create a new corpus
To create a new corpus from uploaded files, log in to Sketch Engine (or click Home) and click Create corpus in the left menu. Complete the following details:
- Corpus name: this will be displayed in the interface. A simple, unique, alphanumeric corpus ID that can be used in CQL queries will be created automatically.
- Language: Sketch Engine will process the uploaded data using settings for the selected language. Select Other UTF-8 for a language not found in the list.
Click Create to create an empty corpus and to go to the next step.
Add a file
If you would rather add data with WebBootCaT, click Cancel and then click Add data from the web using WebBootCaT on the corpus screen.
Clicking Add a new file gives you these options:
- upload a file from your computer
- download from a URL
- upload from the Sketch Engine server
To place files on the Sketch Engine server, connect by FTP to the.sketchengine.co.uk on port 10021 and log in with the same username and password you use for the web interface. See the FTP tutorial for details.
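For scripted uploads, the FTP step can be sketched in Python with the standard ftplib module. This is a minimal sketch, not an official client: the host and port come from the text above, while the function names upload_files and connect are hypothetical, and the use of plain (unencrypted) FTP is an assumption that should be checked against the FTP tutorial.

```python
from ftplib import FTP
from pathlib import Path

# Host and port as given in the text; everything else here is illustrative.
FTP_HOST = "the.sketchengine.co.uk"
FTP_PORT = 10021


def upload_files(ftp, paths):
    """Upload each file in binary mode and return the remote names used."""
    uploaded = []
    for path in map(Path, paths):
        with path.open("rb") as handle:
            # STOR stores the file under its base name in the current remote directory.
            ftp.storbinary(f"STOR {path.name}", handle)
        uploaded.append(path.name)
    return uploaded


def connect(username, password, timeout=30):
    """Open an FTP session (assumption: the server accepts plain FTP logins)."""
    ftp = FTP()
    ftp.connect(FTP_HOST, FTP_PORT, timeout=timeout)
    ftp.login(username, password)
    return ftp

# Typical use (needs real credentials, so it is left as a comment):
#   ftp = connect("your_username", "your_password")
#   try:
#       upload_files(ftp, ["interviews.txt", "reports.txt"])
#   finally:
#       ftp.quit()
```

Keeping the connection separate from the upload loop means the loop can be tried against any object that offers a storbinary method before real credentials are involved.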
Uploading multiple files at once
You can also add multiple files at once by uploading an archive in one of these formats: .zip, .tar, .tar.gz, or .tar.bz2. If the file names should be preserved, click 'Expand this archive instead of converting it to a single plaintext'. The recommended approach, however, is to put all metadata (including the file name) in XML structures inside each file and leave the 'expand' option unused.
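The recommended approach can be sketched as a small preprocessing script: wrap each text in an XML structure that records its file name, then bundle the results into a .zip archive. This is only an illustration under stated assumptions. The structure name doc, the attribute name filename, and the helper names wrap_in_doc and build_archive are hypothetical choices, not names confirmed by Sketch Engine, and the sketch assumes UTF-8 input files with unique base names.

```python
import zipfile
from pathlib import Path
from xml.sax.saxutils import escape, quoteattr


def wrap_in_doc(text, filename):
    """Wrap plain text in a <doc> structure that carries the file name."""
    # quoteattr adds the surrounding quotes and escapes the attribute value;
    # escape protects &, < and > in the text body.
    return f"<doc filename={quoteattr(filename)}>\n{escape(text)}\n</doc>\n"


def build_archive(source_paths, archive_path):
    """Write one wrapped copy of each source file into a .zip archive."""
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in map(Path, source_paths):
            text = path.read_text(encoding="utf-8")
            # Assumption: base names are unique, so entries do not collide.
            archive.writestr(path.name, wrap_in_doc(text, path.name))
    return archive_path
```

Because the file name travels inside each document as metadata, it is still available even when the archive is converted to a single plaintext.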
You need to compile (or recompile) the corpus after adding one or more files.
You will find your corpus by clicking Home and going to My corpora.