6) Annotating your corpus
You can add your own annotations to the texts in your corpora, using either an ordinary text editor or third-party annotation software (e.g. Brat).
- Click ‘Download corpus’ in the left-side menu on the corpus building page and select the ‘vertical’ format to save the corpus in a one-token-per-line format.
- Open the saved file in the annotation editor, make annotations, save.
- Add the annotated file to the corpus and remove the old file.
Please note that Sketch Engine accepts annotations in two forms:
- XML structures with attributes (useful for annotating phrases),
- positional attributes (useful for annotating words).
The output of the annotation tool may need to be converted to a format supported by Sketch Engine before the corpus can be compiled successfully.
Example annotation – phrase level
Useful for attributes common to the whole phrase:
Example annotation – token level
Useful for attributes of individual words (e.g. token ID, word, lowercase lemma, part of speech, dependency ID):
0 Golden golden adjective 1
1 gate gate noun 2
2 bridge bridge noun -
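If the annotation tool exports whitespace-separated columns like the lines above, a small script can turn them into one-token-per-line records. The following is a minimal Python sketch, not an official Sketch Engine tool. It assumes that positional attributes are separated by tabs in the vertical file and that no token contains whitespace; the column names (`id`, `word`, `lemma`, `pos`, `head`) are illustrative labels taken from the example, not a fixed schema.

```python
# Column names mirror the example above; they are illustrative assumptions.
COLUMNS = ["id", "word", "lemma", "pos", "head"]


def parse_token_line(line):
    """Split one whitespace-separated token line into a dict keyed by column name."""
    fields = line.split()
    if len(fields) != len(COLUMNS):
        raise ValueError(
            f"expected {len(COLUMNS)} columns, got {len(fields)}: {line!r}"
        )
    return dict(zip(COLUMNS, fields))


def to_vertical_line(token):
    """Join the attribute values with tabs (assumed separator), one token per line."""
    return "\t".join(token[name] for name in COLUMNS)


sample = """0 Golden golden adjective 1
1 gate gate noun 2
2 bridge bridge noun -"""

tokens = [parse_token_line(line) for line in sample.splitlines()]
vertical = "\n".join(to_vertical_line(token) for token in tokens)
print(vertical)
```

The column check makes a malformed line fail early, rather than only when the corpus is compiled.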
Both examples combined
0 Golden adjective 1
1 gate noun 2
2 bridge noun -
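To combine both forms, the token lines can be enclosed in an XML structure whose opening and closing tags each sit on their own line. The Python sketch below illustrates the idea under stated assumptions: the structure name `phrase` and its attribute `type` are hypothetical, chosen only for illustration, and columns are assumed to be tab-separated.

```python
from xml.sax.saxutils import quoteattr


def wrap_in_structure(token_lines, name, attributes):
    """Enclose token lines in an XML structure with properly quoted attribute values."""
    attrs = "".join(
        f" {key}={quoteattr(value)}" for key, value in attributes.items()
    )
    return [f"<{name}{attrs}>", *token_lines, f"</{name}>"]


# Token-level columns from the example: token ID, word, part of speech, dependency ID.
token_lines = [
    "\t".join(["0", "Golden", "adjective", "1"]),
    "\t".join(["1", "gate", "noun", "2"]),
    "\t".join(["2", "bridge", "noun", "-"]),
]

# "phrase" and "type" are hypothetical names, not defined by Sketch Engine.
lines = wrap_in_structure(token_lines, "phrase", {"type": "name"})
print("\n".join(lines))
```

Using `quoteattr` keeps the tag well-formed even when an attribute value contains quotes or ampersands.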