NAÏVE POETRY IN ACCENTOLOGIC CORPUS


2015. № 3 (6), 257-271

Vinogradov Russian Language Institute of the Russian Academy of Sciences, Yandex, National Research University "Higher School of Economics"

Abstract:

The article considers material that can be used to enlarge the Accentologic Corpus, the subcorpus of the Russian National Corpus (RNC) that reflects stress patterns in Russian words. Naïve poetry is the term for non-professional verse written by amateur poets. Such texts have not passed any editorial filter and have not been published in reputable periodicals or by established publishing houses. Since the majority of these texts are written in regular syllabo-tonic verse, it is possible to predict stress automatically and produce markup for the Corpus. Examples of naïve poetry were downloaded from stihi.ru, the oldest site in Russia that publishes works by amateur poets. Despite the existence of alternative publication platforms, the site remains popular and the number of publications is rising. The texts were marked up with a special program that uses machine learning to predict the position of stress. A table in the article shows how the enlargement has increased the number of occurrences of some competing forms in the Corpus.
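
As a rough illustration of why regular syllabo-tonic meter makes stress predictable, the sketch below aligns the syllables of a line with the strong positions of a given meter and lists, for each word, the syllables that fall on a strong position. It is not the machine-learning program described in the article: the function names are hypothetical, the meter is assumed to be supplied by hand, and syllables are approximated by counting vowel letters.

```python
# Illustrative sketch only; names and the rule-based approach are assumptions,
# not the program used in the article.
VOWELS = set("аеёиоуыэюяАЕЁИОУЫЭЮЯ")

# (index of first strong syllable, foot length) for the five classical meters
METERS = {
    "trochee": (0, 2),
    "iamb": (1, 2),
    "dactyl": (0, 3),
    "amphibrach": (1, 3),
    "anapest": (2, 3),
}


def count_syllables(word):
    """Approximate the syllable count by the number of vowel letters."""
    return sum(ch in VOWELS for ch in word)


def strong_positions(meter, n_syllables):
    """Return 0-based indices of metrically strong syllables in the line."""
    start, step = METERS[meter]
    return set(range(start, n_syllables, step))


def candidate_stresses(line, meter):
    """For each word, list the 1-based syllables that land on a strong position."""
    words = line.split()
    total = sum(count_syllables(w) for w in words)
    strong = strong_positions(meter, total)
    result = []
    offset = 0  # index of the word's first syllable within the line
    for word in words:
        k = count_syllables(word)
        candidates = [i - offset + 1 for i in range(offset, offset + k) if i in strong]
        result.append((word, candidates))
        offset += k
    return result


# Iambic tetrameter line from Pushkin's "Eugene Onegin"
print(candidate_stresses("Мой дядя самых честных правил", "iamb"))
```

For this line every disyllabic word receives exactly one candidate, the first syllable, which matches the actual stress. Words that cover two strong positions, or none, remain ambiguous under this simple rule, which is one reason a trained model is a more plausible choice for real markup.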