LINGUIST List 9.258

Sun Feb 22 1998

Sum: English Word Frequency

Editor for this issue: Julie Wilson


  1. Alex Zheltuhin, Summary: English word frequency

Message 1: Summary: English word frequency

Date: Thu, 19 Feb 1998 14:29:39 -0500 (EST)
From: Alex Zheltuhin
Subject: Summary: English word frequency

Ten days ago I posted a query about recent English word frequency lists.
Contrary to my expectations, I received very few references to relevant
on-line resources.
I would like to thank the following Linguist subscribers for their 
kind responses:
Julie Vonwiller
Lynn Santelmann
Timothy Jay
Barbara Pearson
Marie C. Egan

Suggestions that I received are given below in no particular order.

Julie Vonwiller:

Most of the major newspapers have their papers on line. Word
frequencies would be available that way. Otherwise check the
comp.speech site for references. I think they list that kind of
thing. Also, most dictionary publishing firms have web sites now.

Lynn Santelmann:

If you go to the Web site for the LINGUIST List, they have links
to several on-line sources for word frequency. The LDC at UPenn
is the first that comes to mind, but there are others too.

Timothy Jay:

I have a chapter on word frequency in CURSING IN AMERICA (1992, John
Benjamins Pub Co - 1-800-562-5666). My research indicates how
frequency estimates exclude the usage of offensive words, along with
general problems of estimating word usage.

Barbara Pearson and Marie C. Egan referred to:

Francis, W. N. & Kucera, H. (1982). Frequency analysis of English
usage: Lexicon and grammar. Boston, MA: Houghton Mifflin.

I would like to extend this list of suggestions with additional 
references of interest:

Kucera, H. & Francis, W. N. (1967). Computational analysis of
present-day American English. Providence, RI: Brown University Press.

Bloom, P. A., & Fischler, I. (1980). Completion norms for
329 sentence contexts. Memory and Cognition, 8, 631-642.

On letter/bigram/trigram frequency see: Solso, R. L., & King,
J. F. (1976). Frequency and versatility of letters in the English
language. Behavior Research Methods and Instrumentation, 8, 283-286.

Solso, R. L., Barbuto, P. F. & Juel, C. L. (1979). Bigram and trigram
frequency and versatility in the English language. Behavior Research
Methods and Instrumentation, 11, 475-484.

Web sites:
There is a link to this site on the Linguist's web page.

HCRC Map Task Corpus (150,000 tokens)

ARTFL Project Word Frequency Search Form at

Statistics gathered for the most frequent words found on Usenet in

A follow-up on the Kucera & Francis study:

I am very much looking forward to further references and suggestions
on the subject. If I receive additional information, I will certainly
update the summary. And once again, many thanks to those who responded.


Alexander Zheltukhin, Ph.D.