TAGGING OF COLOUR WORDS IN THE RUSSIAN NATIONAL CORPUS


2024. № 4 (42), 15-40

St. Petersburg Institute of Culture, 
St. Petersburg State University

Abstract:

The paper describes errors in the tagging of colour terms in the Main and Poetry corpora of the Russian National Corpus, with a focus on semantic annotation. The errors fall into two basic groups: the corresponding semantic tag is missing from the description of a colour term, or the tag is wrongly assigned to words that have no colour meaning. In addition, we found an error in the grammatical tagging of adverbs and short-form adjectives that carry the colour tag. We identified several high-frequency adjectives to which the tag was wrongly assigned, which produces considerable informational noise. We also present an analysis of verbs and nouns used as colour terms. The frequency of tagged colour terms was found to be higher than that of untagged ones. Therefore, despite the tagging errors found, the Russian National Corpus can be used as an efficient instrument for scientific research. However, these errors can noticeably influence research results and their interpretation, so correcting them is advisable.