LINGUIST List 13.1207

Wed May 1 2002

Books: Corpus Ling: Frequency list for Russian

Editor for this issue: Dina Kapetangianni


  1. Serge Sharoff, Frequency list for Russian

Message 1: Frequency list for Russian

Date: Thu, 18 Apr 2002 10:13:46 +0200
From: Serge Sharoff
Subject: Frequency list for Russian

The list of most frequent Russian words is available at:

Currently, Chastotnyj slovarj russkogo jazyka (Zasorina, 1977) provides
the most widely used frequency list for Russian. However, the corpus
used by Zasorina is small by modern standards (about 1 million words).
It is also outdated: it mostly covers usage from the 1920s to the 1960s
and includes a high proportion of ideological sources, such as texts by
Lenin and Khrushchev and Soviet newspapers, so its word frequencies are
severely biased. Finally, the Zasorina (1977) list is not available
electronically.

The announced list is compiled from a corpus of modern Russian fiction
and political texts (more than 35 million words). The list includes
about 33000 words whose frequency is greater than 1 ipm (instances per
million words). A shorter selection of the 5000 most frequent words is
also available.
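
The ipm threshold can be illustrated with a short sketch. It is not the
procedure used to build the announced list; the function names and the
toy counts are assumptions made for illustration. For a corpus of 35
million words, 1 ipm corresponds to 35 occurrences.

```python
from collections import Counter


def instances_per_million(counts, corpus_size):
    """Convert raw lemma counts to ipm (instances per million words)."""
    return {lemma: n * 1_000_000 / corpus_size for lemma, n in counts.items()}


def frequent_lemmas(counts, corpus_size, min_ipm=1.0):
    """Return (lemma, ipm) pairs with ipm strictly above min_ipm,
    most frequent first (ties broken alphabetically)."""
    ipm = instances_per_million(counts, corpus_size)
    kept = [(lemma, f) for lemma, f in ipm.items() if f > min_ipm]
    return sorted(kept, key=lambda pair: (-pair[1], pair[0]))


# Toy counts, invented for illustration only (transliterated lemmas).
toy_counts = Counter({"i": 1_400_000, "dom": 21_000, "redkostj": 30})
toy_corpus_size = 35_000_000  # hypothetical corpus size in words

for lemma, f in frequent_lemmas(toy_counts, toy_corpus_size):
    print(lemma, f)
```

In this toy example the third lemma occurs 30 times, which is below
1 ipm for a corpus of this size, so it is left out.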

The structure of the lists follows the template of the lemmatised BNC
lists produced by Adam Kilgarriff, namely:
word rank, frequency (in ipm), word, part of speech.
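
A list in this four-field layout could be read along the following
lines. This is a minimal sketch: it assumes the fields are separated by
whitespace, which the announcement does not state, and the sample
entries and part-of-speech labels are invented.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    rank: int    # position in the frequency list
    ipm: float   # frequency in instances per million words
    word: str    # lemma
    pos: str     # part-of-speech label


def parse_line(line):
    """Parse one 'rank frequency word pos' line (whitespace-separated, assumed)."""
    rank, ipm, word, pos = line.split()
    return Entry(int(rank), float(ipm), word, pos)


def parse_list(text):
    """Parse a whole list, skipping blank lines."""
    return [parse_line(line) for line in text.splitlines() if line.strip()]


# Invented sample entries, not taken from the actual list.
sample = """\
1 40000.00 i conj
2 31000.50 v prep
"""

entries = parse_list(sample)
print(entries[0].word, entries[0].ipm)
```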

In addition, some analytical information about the lexical stock is
provided, such as coverage of the total language use by word bands,
e.g. the first 3000 lemmas cover 76.6824% of the total number of word
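
Coverage by word bands can be sketched as a cumulative sum over the
ranked frequencies. The sketch assumes that each frequency is given in
ipm relative to the whole corpus, so that the summed ipm of the top N
lemmas divided by one million is their share of all running words. The
function name and the toy values are hypothetical.

```python
def coverage_by_bands(ipm_values, bands):
    """Map each band size N to the percentage of running words
    covered by the N most frequent lemmas."""
    ranked = sorted(ipm_values, reverse=True)
    return {n: 100.0 * sum(ranked[:n]) / 1_000_000 for n in bands}


# Toy ipm values, invented for illustration only.
toy_ipm = [100000.0, 400000.0, 50000.0, 250000.0]

for n, share in coverage_by_bands(toy_ipm, [1, 2, 4]).items():
    print(n, share)
```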

The corpus, the tools for working with it, and an aligned parallel
English-Russian corpus are discussed in the forthcoming publication:
Sharoff, Serge (2002). Meaning as use: exploitation of aligned
corpora for the contrastive study of lexical semantics. Proc. of
Language Resources and Evaluation Conference (LREC02). May 2002, Las
Palmas, Spain.