Simon Krek


Simon Krek was the editor-in-chief of a new comprehensive Oxford English-Slovenian dictionary published in two volumes between 2005 and 2006, a project that introduced modern corpus-based lexicography in Slovenia. He is currently affiliated with the Jozef Stefan Institute (Artificial Intelligence Laboratory) and the University of Ljubljana (Centre for Language Resources and Technologies). Between 2008 and 2013 he coordinated a five-year project whose results include a billion-word corpus of Slovene with a new tagger, parser and pedagogically oriented web concordancer, and a lexicon and lexical database, serving as a basis for a web-based pedagogical dictionary, grammar and manual of style. In 2013, together with two co-authors, he published a proposal for the compilation of a new dictionary of modern Slovene. The dictionary would be designed exclusively for the digital medium, freely available both online and as a data set, crowdsourced and constantly updated.

Lexicography in the Digital Age

Advances in information extraction techniques permit a thorough rethink of traditional methods of language description. They now extend to the automatic extraction of example sentences, multi-word expressions, neologisms, definitions, knowledge-rich contexts, lexical-semantic relations, word senses, grammatical patterns, and other types of data. At the end of this process one could conceive of a fully Automated Construction of Dictionary Content (ACDC), providing language data for three groups: general (human) users interested in the lexical behaviour of words; dictionary makers, as a starting point for further processing of the collected material, either by the traditional lexicographic process or through crowdsourcing platforms; and Natural Language Processing communities, by enhancing semantic lexical databases and similar resources designed for NLP tasks. The talk will explore the current state of e-lexicography in Europe from the perspective of the European Network of e-Lexicography project.