Our new Spanish Word Sketches give much better coverage of Spanish-specific phenomena such as compound verb tenses, verb constructions, ser/estar or the subjunctive. Spanish collocation information has never been so rich.

Verbs with clitics, such as decirnos, descargárselo or comérselo, pose a problem when searching. Sketch Engine now handles these much better: searching for decir finds instances with and without the pronouns, e.g. decir and decirle, diciendo and diciéndole.

This is available in the European Spanish Web 2011 (eseuTenTen11) or any newly created user corpora.

New Spanish Word Sketches

The new word sketch grammar now analyses Spanish-specific phenomena, for example:

  • adjectives preceded by ser or estar
  • statistics of verb constructions (perífrasis verbales) in which a verb tends to appear
  • noun phrases using de
  • statistics of the subjunctive compared to the indicative

and many others.

See an example of the word sketch for apoyar (v) and claro (adj).

Availability

Available in the European Spanish Web 2011 (eseuTenTen11) or any newly created user corpora.

Upgrading your corpus

Previously created user corpora need to be upgraded and re-compiled to bring in the new functionality. Start the re-compilation and you will be invited to upgrade the corpus during the process – watch out for a yellow message.

New clitics handling

Using the simple search option and typing a verb (dar, poner etc.) or a pronoun in its object form (me, le, nos, se…) will find instances of the verb with and without the pronouns (dar and also darse, dárselo…) or pronouns on their own as well as attached to a verb (se and also ponerse, ponérselo…). This is the default behaviour in simple search.

In the CQL search, use:

[lemma="dar"]
[morphemes="se"]

to replicate the former and the latter example respectively.

To find verbs with attached pronouns se and lo, use:

[morphemes="se" & morphemes="lo"]

To find verbs with any attached pronouns, use:

[tags="V.*" & tags="PP.*"]

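Note the use of tags, not tag: tag holds only the single tag of the whole token, while tags lists the tags of all the morphemes within it.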
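The queries above all follow one pattern: one condition per attached pronoun, joined with & inside a single token. As a rough illustration, the small helper below assembles such query strings. The function name clitic_query is hypothetical and not part of Sketch Engine, and combining a lemma condition with morphemes conditions in the same token is an assumption based on ordinary CQL syntax rather than something shown above.

```python
def clitic_query(lemma=None, pronouns=()):
    """Build a one-token CQL query string (hypothetical helper, not a Sketch Engine API).

    lemma    -- optional verb lemma, e.g. "dar"
    pronouns -- attached pronouns that must all be present, e.g. ["se", "lo"]
    """
    conditions = []
    if lemma is not None:
        conditions.append(f'lemma="{lemma}"')
    # One morphemes condition per pronoun, as in the se + lo example.
    for pronoun in pronouns:
        conditions.append(f'morphemes="{pronoun}"')
    if not conditions:
        raise ValueError("need a lemma or at least one pronoun")
    return "[" + " & ".join(conditions) + "]"


print(clitic_query(lemma="dar"))              # [lemma="dar"]
print(clitic_query(pronouns=["se", "lo"]))    # [morphemes="se" & morphemes="lo"]
# Assumed combination: the verb dar with both se and lo attached.
print(clitic_query(lemma="dar", pronouns=["se", "lo"]))
```

The resulting string can be pasted into the CQL search box.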

New attributes for clitics handling

To enable this functionality, Sketch Engine uses two new multi-value attributes for Spanish:

morphemes – lists the morphemes which make up the token

tags – lists the tags related to the morphemes within the token

word form | lemma | tag     | morphemes     | tags                      | notes
digo      | decir | VMIP1S0 | decir         | VMIP1S0                   | 1 token, 1 morpheme, 1 tag
decirle   | decir | MN0000  | decir, le     | MN0000, PP3CSD0           | 1 token, 2 morphemes, 2 tags
diselo    | decir | VMM02S0 | decir, se, lo | VMM02S0, PP3CN00, PP3MSA0 | 1 token, 3 morphemes, 3 tags
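To make the idea of a multi-value attribute more concrete, here is a minimal sketch that models the three example tokens in Python and filters them the way the morphemes and tags conditions are described as working. It is only an illustration: the Token class and the find function are hypothetical, and the matching rule (a condition holds if any one of the token's values fully matches the pattern) is an assumption about multi-value attributes, not a description of Sketch Engine internals.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    word: str
    lemma: str
    tag: str          # single tag for the whole token
    morphemes: tuple  # multi-value: one entry per morpheme
    tags: tuple       # multi-value: one tag per morpheme


# The three example tokens, with values copied from the table.
TOKENS = [
    Token("digo", "decir", "VMIP1S0", ("decir",), ("VMIP1S0",)),
    Token("decirle", "decir", "MN0000", ("decir", "le"), ("MN0000", "PP3CSD0")),
    Token("diselo", "decir", "VMM02S0", ("decir", "se", "lo"),
          ("VMM02S0", "PP3CN00", "PP3MSA0")),
]


def has_value(values, pattern):
    """True if at least one value fully matches the regular expression."""
    return any(re.fullmatch(pattern, value) for value in values)


def find(tokens, **conditions):
    """Return word forms for which every pattern of every attribute has a matching value."""
    hits = []
    for token in tokens:
        if all(has_value(getattr(token, attribute), pattern)
               for attribute, patterns in conditions.items()
               for pattern in patterns):
            hits.append(token.word)
    return hits


print(find(TOKENS, morphemes=["se", "lo"]))  # ['diselo'], mirrors [morphemes="se" & morphemes="lo"]
print(find(TOKENS, morphemes=["le"]))        # ['decirle']
print(find(TOKENS, tags=["PP.*"]))           # ['decirle', 'diselo']
```

Because each condition is checked against the values independently, two conditions on the same attribute can be satisfied by different morphemes of the same token, which is what lets a single token match both se and lo.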

