Review of The Significance of Word Lists

Reviewer: John M Clifton
Book Title: The Significance of Word Lists
Book Author: Brett Kessler
Publisher: CSLI Publications
Linguistic Field(s): Text/Corpus Linguistics
Issue Number: 13.491

Kessler, Brett. 2001. The Significance of Word Lists. CSLI Publications,
x+277pp, hardback ISBN 1-57586-299-9, paperback ISBN 1-57586-300-6,
Dissertations in Linguistics.
Announced at

John M Clifton, Summer Institute of Linguistics and University of North

The two major issues addressed in this book can be characterized in terms of
two senses of the word 'significance' as used in the title of the book. The
first issue is how significant word lists are to determining language
relatedness. The second issue is what is involved in showing that hypotheses
made on the basis of word lists are statistically significant.

In chapter 1, 'Introduction', Kessler (K) addresses the two major positions
on the first issue. On the one side are those like Greenberg and Ruhlen
(1992) who feel that the analysis of word lists can be used to demonstrate
the links between remotely related languages. On the other side are scores
of more traditional historical linguists who claim that the similarities
used to establish these putative links are due to chance. K proposes a third
option: word lists can be used to establish linguistic relationships, but
only when following a rigid methodology designed to ensure the results will
be statistically significant.

Chapters 2, 'Statistical Methodology', and 3, 'Significance Testing', are
the heart of the book. In these chapters K discusses statistical methodology
in general, and then details the specific methodology proposed for the
analysis of word lists. K then applies this test to Swadesh 100 word lists
from eight languages: Latin, French, English, German, Albanian, Hawaiian,
Navajo, and Turkish. With a few exceptions, the results of the procedure
indicate that the first five are related, and the others are not. At the
risk of over-simplifying a complex procedure, I will attempt to summarize
the contents of the methodology. Feel free to skip the next paragraph if it
is too abstruse.

The methodology involves constructing a table of correspondences of
word-initial segments in semantically related words in two languages. This
table can then be analyzed using the chi-square test for significance. From
a statistical point of view, the problem is that the number of occurrences
of specific correspondences is too low for the chi-square test to be
meaningful. To remedy this, K proposes the use of a Monte Carlo technique.
Applying this technique, one of the word lists is randomized, a new table is
constructed, and the chi-square test is applied to the new table. This
procedure is repeated 10,000 times. Now the value of the original table is
compared with the values of these 10,000 tables generated by the Monte Carlo
technique, and a valid level of significance can be attached to the original
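
For readers who find code easier to follow than prose, the procedure
summarized above can be sketched roughly as follows. This is a minimal
illustration based only on the summary in this review, not K's actual
implementation: the function names, the fixed random seed, the small
floating-point tolerance, and the toy segment lists are all assumptions made
for the example, and the significance level is taken simply as the proportion
of randomized tables whose chi-square value is at least as large as that of
the original table.

```python
import random
from collections import Counter


def chi_square(pairs):
    """Pearson chi-square statistic for a table of (segment_a, segment_b) pairs."""
    n = len(pairs)
    observed = Counter(pairs)
    row_totals = Counter(a for a, _ in pairs)
    col_totals = Counter(b for _, b in pairs)
    stat = 0.0
    for a, row in row_totals.items():
        for b, col in col_totals.items():
            expected = row * col / n
            stat += (observed[(a, b)] - expected) ** 2 / expected
    return stat


def monte_carlo_p(initials_a, initials_b, trials=10_000, seed=0):
    """Estimate significance by shuffling one list and recomputing chi-square.

    initials_a[i] and initials_b[i] are the word-initial segments of the words
    for the same meaning in the two languages (hypothetical input format).
    """
    rng = random.Random(seed)
    actual = chi_square(list(zip(initials_a, initials_b)))
    shuffled = list(initials_b)
    at_least_as_large = 0
    for _ in range(trials):
        rng.shuffle(shuffled)  # break the pairing by meaning
        # Small tolerance guards against floating-point summation order.
        if chi_square(list(zip(initials_a, shuffled))) >= actual - 1e-9:
            at_least_as_large += 1
    return at_least_as_large / trials


# Invented toy data: every 'p' in list A lines up with 'f' in list B, etc.
toy_a = list("ptkptkptkptk")
toy_b = list("fhxfhxfhxfhx")
p_value = monte_carlo_p(toy_a, toy_b, trials=2000)
print(p_value)
```

A small value means that few random pairings of the two lists produce
correspondences as regular as those in the actual lists. Because the
comparison is made against the randomized tables rather than against the
theoretical chi-square distribution, the low counts in individual cells are
no longer a problem.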

As indicated above, the methodology as proposed does not always correctly
identify which languages are related. There are both false positives in
which a relationship is posited between apparently unrelated languages like
Latin and Navajo, and false negatives in which no relationship is posited
between related languages like Albanian and German. K points out that false
positives are unavoidable in statistics; the goal is to minimize them. False
negatives, on the other hand, should be eliminated. In addition, it would be
nice if the methodology could distinguish between closer relationships like
those between English and German, and more distant relationships like those
between English and Albanian. In chapters 4-10, K discusses various ways in
which the methodology might be improved.

In chapter 4, 'Tests in Different Environments', K concludes that
predictions are not improved by comparing features other than the
word-initial consonant, for example, the first consonant of the second
syllable, or the first vowel, or some combination of the above. Then in
chapter 5, 'Size of the Word Lists', K shows that increasing the size of the
word lists by using the Swadesh 200 word list instead of the Swadesh 100
does not improve the predictions.

Chapter 6, 'Precision and Lumping', deals with the implications of two types
of historical changes. First, phonemes can split or merge so that, for
example, /t/ in language A may correspond to /t/, /tj/, and /tw/ in language
B. Second, semantic shifts occur which result in, for example, the lexical
item for 'skin' in language A being related to the lexical item for 'bark'
in language B. K rejects attempts to incorporate such factors into the
procedures on the basis of practical considerations related to the
methodological requirement that lexical items be chosen without reference to
their similarity to forms in other languages.

Chapters 7-9 deal with what lexical items may need to be eliminated from the
analysis. In chapter 7, 'Nonarbitrary Vocabulary', K discusses forms in
which the phonetic form may be at least partially determined by sound
symbolism including, but not limited to, onomatopoeia and nursery words.
Then K discusses loan words in chapter 8, 'Historical Connection vs.
Relatedness', and language-internally related forms in chapter 9,
'Language-Internal Cognates'. Language-internally related forms include such
phenomena as one phonetic form for related meanings (for example, 'skin' and
'bark' or 'egg' and 'seed') and derivationally related forms. K argues that
if the goal of the analysis is determining whether two languages are
genetically related, the nonarbitrary aspects of such forms need to be

Then, in chapter 10, 'Recurrence Metrics', K introduces some statistical
methods that might be used in place of the chi-square test.

In the final chapter, 'Conclusions', K summarizes the actual procedures
proposed in the book, and then offers observations on what such procedures
have to offer the practice of historical linguistics.

The book concludes with an appendix that includes all eight word lists that
are used to test the methodology presented in the book, references, and an

It should be obvious by now that this book may be hard going for readers who
have an aversion to mathematics in general or statistics in particular. At
the same time, I feel K does a good job of presenting the material in a form
that should be accessible to readers who do not have a strong background in
statistics. The book is full of examples illustrating the various points.
And the fact that the same eight word lists are used throughout the book
makes it easier to follow the arguments related to variations in the

I feel K has demonstrated that it is possible to develop procedures that
yield statistically significant results (that is, issue two from above). At
the same time, I do not feel K demonstrates how the procedures will bring
together the two sides regarding the issue of how significant a role word
lists should play in determining language relatedness. The problem is that
most of the discussion regarding this issue deals with languages whose
relationship is very remote, while the methodology presented here only seems
to be applicable to languages related at the level of Indo-European. K never
shows how the methodology could be adapted to test more remote

In addition, I am not sure that K's requirement that the analysis must be
based on a pre-determined procedure, and on word lists that are chosen without
reference to any of the other languages to be analyzed, will be acceptable
to those interested in determining remote relationships.

This is not to say, however, that the methodology is without merit. In some
areas like Papua New Guinea and Africa, relationships have not been firmly
established even at the level of Indo-European. In addition, the chapters on
lexical items that should be eliminated from the analysis (7-9) discuss
issues that are important for anyone involved in the analysis of word lists.
I have seen many analyses (my own included) that fail to take into
consideration internal cognates.

A major thrust of the book is that 'more is not necessarily better'. K
demonstrates the importance of choosing carefully the words to be analyzed.
It is better to analyze a smaller set of words that have been screened in
terms of origin than to analyze a large number of words that are of
questionable status. In other words, K argues that attempts to bolster an
analysis based on word lists of questionable status by simply adding more
words actually work against the trustworthiness of the analysis. At the
same time, this will make the procedure more difficult to apply in
situations as in Papua New Guinea where it is difficult to gather the
information necessary to compile trustworthy word lists. Technical
dictionaries of the caliber used by K simply do not exist in many of the
languages there.

K also makes it clear that the procedures proposed in this book are not a
replacement for the more traditional tasks of establishing cognates.
Instead, the procedures are meant to show which languages are good
candidates for such a task.

In conclusion, while I am not sure how influential the book will be in the
debate over the use of word lists for determining remote relationships, I
feel the book has a lot to offer to those involved in more mundane analysis
of word lists.

Greenberg, Joseph H. and Merritt Ruhlen. 1992. Linguistic origins of
Native Americans. Scientific American 267:94-99.

John M Clifton was involved in sociolinguistic research involving, among
other aspects, language relationships, in Papua New Guinea from 1982 to
1994. More recently, he finished coordinating the work of a team of
researchers working on language use and attitudes among speakers of
less-commonly-spoken languages in Azerbaijan.


