2013
- 26 March 2013
- The Sketch Engine was presented to staff who run the Pre-Sessional English courses at the University of Strathclyde.
- 21 March 2013
- We have installed a new version of the English Word Sketch Grammar compatible with the Penn Treebank tagset. The changes are:
- TRINARY relations dualized using DUAL (requires Manatee 2.74)
- Proper nouns in sketches allowed ("NN.?.?" -> "N.*")
- The grammar has been applied to compatible preloaded corpora (English, Penn TT tagset). It is also available for use with user corpora in Corpus Architect. If you use your own corpora (in English with the Penn TT tagset) for term extraction or key collocations, please recompile the Word Sketches of the relevant corpora. Contact Support if you need help.
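The effect of the tag pattern change ("NN.?.?" -> "N.*") can be illustrated with ordinary regular expressions. The sketch below is written in Python, not in the sketch grammar formalism. It assumes the TreeTagger variant of the Penn tagset, in which proper nouns are tagged NP and NPS, and it assumes the pattern must match the whole tag. The tag list and the helper name `matching_tags` are illustrative only.

```python
import re

# Assumed tags: NN/NNS (common nouns) and NP/NPS (proper nouns, as in the
# TreeTagger variant of the Penn tagset). VV is a verb tag included as a
# negative control.
TAGS = ["NN", "NNS", "NP", "NPS", "VV"]

def matching_tags(pattern, tags=TAGS):
    """Return the tags that the pattern matches in full."""
    return [tag for tag in tags if re.fullmatch(pattern, tag)]

old_nouns = matching_tags(r"NN.?.?")  # old pattern: common nouns only
new_nouns = matching_tags(r"N.*")     # new pattern: common and proper nouns

print(old_nouns)
print(new_nouns)
```

Under these assumptions the old pattern skips NP and NPS, which is why proper nouns did not previously appear in sketches.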
- 20 March 2013
- The Sketch Engine was presented at Durham University. Staff from the Foundation Centre, Computer Science, the English Language Centre and Modern Languages all turned up to a computer workshop to see how the Sketch Engine can be used in their respective Departments.
- 14-16 March 2013
- Adam Kilgarriff presented the Sketch Engine at the V Congreso Internacional de Lingüística de Corpus at the Universidad de Alicante.
- 8 March 2013
- The Sketch Engine was presented at Aston University. Corpus Linguistics Undergraduates and Staff and Students in Modern Foreign Languages saw the Sketch Engine in action.
- 28 February 2013
- Irena Srdanovic presented the Sketch Engine in her talk entitled 'Profiling of Japanese vocabulary and grammatical information using a corpus of ten billion words', at the 3rd Workshop on Science: Japanese corpus. You can find her presentation on our 'Sketch Engine Documentation for Languages Other than English' page.
- 22 February 2013
- The Sketch Engine was exhibited at the BAAL Corpus Linguistics SIG "Building and Mining Small, Specialised Corpora" at the University of Edinburgh.
- 21 February 2013
- Today the Sketch Engine was presented at the University of Bath English Language Centre in a talk about how to use corpora in the classroom.
- This morning Vineeta Gupta, Head of Children’s Dictionaries, was interviewed by Chris Evans on his BBC Radio 2 Breakfast Show. As many of you know, OUP analysed over 74,000 entries to the BBC Radio 2 500 Words short story competition last year, which resulted in some fantastic language research and national publicity. We will also be supporting the competition with children’s language research in 2013. For information about the project, or last year’s research, please contact Vineeta or me (vineeta.gupta@oup.com or harriet.bayly@oup.com). Link to listen to the interview: http://www.bbc.co.uk/programmes/b01qqblq
- 13 February 2013
- Today from 1pm-2pm, the Sketch Engine will be presented at Swansea University Language Research Centre. Corpora for all!
- 8 February 2013
- There have been many new updates to the Sketch Engine:
- Asynchronous query processing - If your query retrieves a lot of results, you no longer have to wait for all of them to be processed before they appear on screen. The Sketch Engine now retrieves the first page of results immediately, while subsequent pages are counted in the background, saving you time. You can watch the count update at the top of the screen.
- Parallel queries - Now, you can generate and save concordances and Word Sketches from parallel corpora.
- Multiword sketches - In an instant you can create Word Sketches for multiword expressions.
- In Corpus Architect, you can now access parallel corpora on a separate page.
- We have a new first form - The 'Query types', 'Text types' and 'Context' sections are now shown as tabs underneath the 'Simple query' box.
- 28 January 2013
- The Sketch Engine in the Guardian: a very entertaining read about a language project using the Sketch Engine to look at 'art-speak', the "pompous" language used in galleries to describe artwork.
- 25 January 2013
- The Sketch Engine was presented at the LLAS e-Learning Symposium 2013
- 22 January 2013
- The Sketch Engine now has its very own YouTube page. On the Sketch Engine's channel you can watch, comment on, rate and subscribe to Sketch Engine video tutorials. We hope that watching the videos will help you learn more about the Sketch Engine and how easy it is to use.
- 21 January 2013
- What do you do when the dictionary doesn't give you enough information? Martin Svěrák has created this website that gives information about how ESL teachers can use the Sketch Engine in their classroom.
- 17-18 January 2013
- The Sketch Engine is being presented at the CIUTI Forum 2013. CIUTI is an 'association of University institutes with translation and interpretation programmes'. The Sketch Engine's 'Bilingual word sketch' is being launched at this forum.
- 10-11 January 2013
- Nicole Keng and Robyn Woodrow attended the AULC Conference 2013 at Durham University. They spoke to University Language Centre managers about how useful the Sketch Engine is to learners of foreign languages, and in ESL.
- 7 January 2013
- Luisa Bozzo from the University of Torino, Italy, has written a report on the 2012 Lexicom conference and workshop. You can read the report here.
2012
- 29-30 November 2012
- Today we attended the Aslib Translating and the Computer Conference in London. The Sketch Engine team are currently working on Bilingual Word Sketches, which will be invaluable to translators and to those teaching foreign languages.
- 28 November 2012
- Robyn Woodrow delivered a Sketch Engine presentation at Nottingham College International to an audience of English Language Teachers interested in how corpora can aid their teaching.
- 20 November 2012
- Today we attended the JISC Collections Conference and AGM on Brunel's ss Great Britain in Bristol. We spoke to librarians about the upcoming changes in JISC banding and about how members of an educational institution can access resources online through the UK Access Management Federation.
- 16 November 2012
- Macmillan Dictionaries, who have worked intensively with the Sketch Engine, have recently announced that they will soon cease publishing their dictionaries on paper. The Sketch Engine is discussed in a related article in Atlantic Wire.
- 10 November 2012
- Today we are giving an advanced Sketch Engine workshop at the Corpus Linguistics in the South conference at the University of Portsmouth. The event is being hosted by Charlotte Taylor of the University of Portsmouth.
- 22 October 2012
- Participants wanted for a German-English Machine Translation research project! Please get in touch for details: [robyn.woodrow@sketchengine.co.uk]
- 21 October 2012
- We are happy to announce new design improvements in the Sketch Engine interface.
- 20 October 2012
- First version of bilingual word sketches enabled for EUROPARL en-fr and en-de pairs of corpora.
- 19 October 2012
- Asynchronous query processing
- With our corpora getting bigger and bigger, the pressure to provide reasonable response times to user queries grows as well. Therefore we are happy to announce a new feature that we call "asynchronous query processing" that makes it possible to show partial search results (e.g. first 20 lines of the concordance) to the user as soon as they are ready without the need to wait for computing all the hits. Those are computed in the background while the user can continue working. We invite you to try this and let us know what you think.
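The idea can be sketched in a few lines of Python. This is only a toy model under stated assumptions, not the Sketch Engine's implementation: the corpus is a plain list of tokens, a query is a single word, and the names `find_hits` and `AsyncQuery` are hypothetical. The first page of hits is returned as soon as it is full, while a background thread computes the total.

```python
import itertools
import threading

def find_hits(tokens, word):
    """Lazily yield the position of every token equal to `word`."""
    return (i for i, tok in enumerate(tokens) if tok == word)

class AsyncQuery:
    """Toy query: first page available at once, total counted in background."""

    def __init__(self, tokens, word, page_size=20):
        # Stop scanning as soon as the first page is full.
        self.first_page = list(
            itertools.islice(find_hits(tokens, word), page_size)
        )
        self.total = None              # unknown until the worker finishes
        self.done = threading.Event()  # set once `total` is final
        worker = threading.Thread(
            target=self._count_all, args=(tokens, word), daemon=True
        )
        worker.start()

    def _count_all(self, tokens, word):
        # Full pass over the corpus, run outside the caller's thread.
        self.total = sum(1 for _ in find_hits(tokens, word))
        self.done.set()

tokens = ("the cat sat on the mat " * 1000).split()
query = AsyncQuery(tokens, "the")
print(query.first_page[:3])  # usable before counting has finished
query.done.wait()            # block only when the total is needed
print(query.total)
```

The caller can display `first_page` immediately and poll or wait on `done` to show the final count later.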
- 15 October 2012
- Today we attended Lancaster University's UCREL Corpus Research Seminar http://ucrel.lancs.ac.uk/crs/
- 24-28 September 2012
- The Sketch Engine team attended the Lexicom 2012 Europe Workshop in Lexicography and Lexical Computing in Galtür, Austria. http://nlp.fi.muni.cz/lexicom2012eu/ This year it was led by Adam Kilgarriff and Michael Rundell with local co-ordination by Eveline Wandl-Vogt of the Institut für Österreichische Dialekt- und Namenlexika. Please feel free to contact us if you attended Lexicom 2012 and have any comments or feedback.
- 23 September 2012
- You can now follow us on Twitter @SketchEngine
- 21 September 2012
- Dr Adam Kilgarriff attended LThist2012 First International Workshop on Language Technology for Historical Text(s) in Vienna. This workshop was part of KONVENS 2012 The 11th Conference on Natural Language Processing. The Sketch Engine poster session, entitled 'The Sketch Engine as Infrastructure for Historical Corpora' explained how useful the Sketch Engine is when examining key word lists between different historical corpora. More information about the workshop can be found here: http://corpus3.aac.ac.at/LThist2012/
- 3-7 September 2012
- The Sketch Engine team attended the 15th International Conference on Text, Speech and Dialogue in Brno, Czech Republic. Dr Adam Kilgarriff's talk entitled 'Getting to Know Your Corpus' drew in a large audience and the Sketch Engine demonstration sessions received great feedback. You can view the TSD conference information, photos and videos here: http://www.tsdconference.org/tsd2012/
- 1 August 2012
- The article 'International Art English' uses the Sketch Engine to explore the language of the art world.
- We are sorry to say that WebBootCaT is out of action for a few days while we rewrite procedures following changes in Yahoo's and Bing's authentication procedures. We shall have it working again by 6 August.
- 17 July 2012
- Anthony Shore, chief of brand naming company Operative Words, described his experience with the Sketch Engine on his blog
- 9 July 2012
- We are happy to release the new stable version (2.59-2.91.9). The new features include:
- multiword sketches
- searching in parallel corpora
- interface language switch (Chinese, Czech, English, Irish, Slovenian)
- 14 May 2012
- We are looking for a salesperson. Full advertisement here
- 7 May 2012
- We are finalizing new features in beta. You are welcome to test the new features at https://beta.sketchengine.co.uk. The changes you will see include:
- multiword sketches
- searching in parallel corpora
- shortcut for switching sort order (salience/frequencies) in Word Sketches
- interface language switch (Chinese, Czech, English, Irish)
- We have also built the following new TenTen-scale web corpora:
- arTenTen (Arabic, 5.8 G words)
- esAmTenTen (American Spanish, 7.5 G words, word sketches available, going to be merged with the European Spanish TenTen)
- czTenTen2 (Czech, 4.8 G words)
- frTenTen (French, 10.7 G words, word sketches available)
- jpTenTen (Japanese, 9.1 G words)
- ruTenTen (Russian, 15.8 G words, word sketches available)
- 29 February 2012
- Last day with the company for both Jan Pomikalek and Diana McCarthy. After all the wonderful work they have done and the pleasure it has been to work with them, we are sad to bid them farewell. We wish them both the very best for the future
- 20 February 2012
- Conference and workshop papers accepted:
- WAC-7 Web as Corpus Workshop, Lyon, April 2012
- Vit Suchomel and Jan Pomikalek: Efficient Web Crawling for Large Text Corpora
- LREC Language Resources and Evaluation Conference, Istanbul, May 2012
- Bharat Ram Ambati, Siva Reddy, Adam Kilgarriff: Word Sketches for Turkish
- EURALEX, European Lexicography Conference, Oslo, August 2012
- Milos Jakubicek, Adam Kilgarriff, Pavel Rychly, Vojtech Kovar: Finding Multiwords of More Than Two Words
- Adam Kilgarriff, Jan Pomikalek, Pete Whitelock: Setting up for Corpus Lexicography
- Diana McCarthy, Avinesh PVS, Dominic Glennon: Domain Specific Corpora from the Web
- 31 January 2012
- BBC Radio 2's Chris Evans and Oxford University Press join forces to explore children's writing - with Sketch Engine as the back-end technology: radio interview here, 19.10 minutes in
- 30 January 2012
- Digital Languages: Using corpora for your research questions, workshop led by Adam Kilgarriff at the University of Sussex
- Hindi word sketches now available
- 16-17 January 2012
- Adam Kilgarriff presented the Sketch Engine at University of Heidelberg, Departments of English and of Translation
- Featured in Juliette Scott's blog, here
- 11 January 2012
- You can now upload multiple files in an archive when uploading your own corpus to Sketch Engine. See the relevant help page
- 6 January 2012
- An enthusiastic blogger we have come across is Anth of Operative Words. Thank you to Silvia Bernardini and Juliette Scott, who both (independently) spotted it.
- A brand new, web-crawled, 2-billion word Chinese corpus, zhTenTen, is now available
2011
- 25 November 2011
- We fixed a bug in WebBootCaT which caused problems with using the service in Internet Explorer.
- 11 November 2011
- Siva Reddy and Diana McCarthy, both from Lexical Computing Ltd., are co-authors with Ioannis Klapaftis and Suresh Manandhar of a paper that won the best paper award at the 5th International Joint Conference on Natural Language Processing. The paper is listed in the Sketch Engine Bibliography and uses data from the Sketch Engine for modelling the semantics of compound nouns.
- 27 October 2011
- An RSS feed for Sketch Engine news is now available at http://www.sketchengine.co.uk/rss.cgi
- 7 July 2011
- Opening of the 2011 LSA Linguistics Institute, sponsored by LCL, in Boulder, Colorado. All participants receive a one-year Sketch Engine account.
- Six 'Brown Family' corpora available in Sketch Engine, supporting comparisons across genre, time and dialect
- original Brown (US 1961), LOB (UK 1961), BLOB (UK, 1931), FLOB (UK, 1991), FROWN (US, 1991), BrE06 (UK, 2006)
- 1 July 2011
- Diana McCarthy, LCL Director and Erasmus Mundus Fellow, is visiting Melbourne University for a month.
- 29 June 2011
- Sketch Engine workshop in Taipei, Taiwan, hosted by Bookman Books with speakers Wallace Chen, Jerome Su, Howard Chang
- 15 June 2011
- CHILDES/TalkBank data (parent-child dialogs) available in the Sketch Engine, for multiple languages. 23m words for English.
- 2 June 2011
- The UK National Ecosystem Assessment (http://uknea.unep-wcmc.org/) launches its Synthesis Report with its key findings. The Sketch Engine was used for the corpus linguistics analysis. See page 41 of the report, which can be obtained here
- 7 May 2011
- GDEX (Good Dictionary EXamples): the infrastructure is now set up so that customers can develop their own GDEX configuration (e.g. for a different language, publisher or dictionary). Documentation at https://trac.sketchengine.co.uk/wiki/GDEX
- 30 April 2011
- First version of CCBC (Comparable Corpora BootCaT) and bilingual word sketches presented at 'Research Models in Translations Theory' conference, Manchester. Powerpoint here
- 27 April 2011
- Web corpus tools developed by Jan Pomikalek, for his PhD and within PRESEMT, made available:
- jusText, for web page cleaning including removing boilerplate, http://code.google.com/p/justext/
- Onion, for deduplication, http://code.google.com/p/onion/
- 25 April 2011
- First version of bilingual word sketches prepared
- 15 April 2011
- Polish word sketches available
- 30 March 2011
- First version of Bulgarian word sketches available; SkE going into use at the Institute for the Bulgarian Language, Sofia
- 22 March 2011
- New front pages (including this news page) go live
- 17 March 2011
- New improved wordlist functionality and sketch diffs by subcorpus available on beta
- 16-17 March 2011
- SKEW-2, the 2nd International Sketch Engine Workshop, was held in Brighton
- 11 March 2011
- We now own our own servers (as well as renting some) and shall be shifting services to the owned servers
- Users can now upload their own aligned, parallel corpora within Corpus Architect
- 11 March 2011
- Paper to appear at Corpus Linguistics 2011 on our work on CLAEVIPS: A Corpus Linguistic Analysis of Ecosystems Vocabulary in the Public Sphere, commissioned by the UK National Ecosystem Assessment





