A PANCHRONIC CORPUS: INTEGRATION OF HISTORICAL AND CONTEMPORARY CORPUS RESOURCES
Abstract:
The paper describes a panchronic corpus within the Russian National Corpus that integrates several pre-existing corpora: the Old East Slavic, Middle Russian, and Birchbark Letters corpora and the Main corpus, as well as the recently added Epigraphy corpus. The result is a unified search in which a single query covers the history of the Old East Slavic / Russian language over a millennium. The main obstacles to creating such a corpus are the discrepancies in orthography, phonetic composition, and the morphological principle of lemma assignment across the different corpora, as well as the incompatible markup of grammatical phenomena. The paper describes how these formats were partially unified without losing the functionality of the separate corpora. It also provides illustrations of panchronic corpus search that are relevant not only to the study of grammatical processes in synchrony and diachrony, but also to literary / textual and historical problems.