«TAMAN TODAY»: CORPUS RESEARCH ON THE RUSSIAN LANGUAGE OF THE XIXth CENTURY
This article presents «Taman today», a project of the School of Linguistics at the National Research University Higher School of Economics. The aim of the project is to identify obsolete linguistic constructions in Russian texts of the XIXth century, to compile a database of them, and to create an annotated corpus. In the long term we have a more applied goal: to help the modern reader understand texts written two centuries ago. This requires estimating the readability of classical Russian literature, so the project includes a series of experimental studies. Once we have collected enough experimental data, we will be able to release a manual for teachers, students, and anyone who has difficulty understanding the language of classical Russian literature.
We created the XIXth Century Corpus on web-corpora.net and began by annotating one of the key texts of Russian literature, the novel «A Hero of Our Time» (1840) by Mikhail Lermontov. Although our main focus is obsolete constructions, in the Corpus we have marked not only constructions but all language features of vocabulary, grammar and morphology, because studying constructions requires attention to every morphological and lexical divergence from current standards.
Our Corpus is designed for a wide range of users: researchers in the history of the language, teachers, students of philology and linguistics, and anyone interested in Russian literature of this period.