Date: Tue, 1 Apr 2003 10:48:28 -0500
From: John Stevens
Subject: Analyzing Linguistic Variation: Statistical Models and Methods
Paolillo, John C. (2002) Analyzing Linguistic Variation: Statistical
Models and Methods, CSLI Publications.
John J. Stevens, University of North Carolina at Wilmington
"Analyzing Linguistic Variation: Statistical Models and Methods" by
John C. Paolillo explains the statistics of logistic regression within
the context of variationist linguistics. The book's main purpose is to
go beyond the simple instructional manuals of the popular statistical
software packages by providing a convenient resource to researchers who
seek answers to common questions as well as a more comprehensive
explanation of the principles underlying statistical models of
variation. The book examines all aspects of the most commonly used
analytical tool in sociolinguistic variationist studies, VARBRUL, a
multiple regression computer program developed by David Sankoff based
on William Labov's (1969) notion of the variable rule. The author
evaluates VARBRUL in the light of other statistical techniques used in
the social sciences and attempts to relate variationist methods to more
formal models of linguistics.
This book assumes no previous familiarity with statistics. It is
written with three different audiences in mind: graduate students and
researchers who are looking for a guide that explains how to conduct
variationist linguistic analyses; more experienced researchers seeking
answers to recurring problems not adequately addressed in the currently
available literature; and researchers from other areas of linguistics
who need to relate variationist analyses to the theoretical models
current in their own sub-fields.
The volume is organized into ten chapters and includes appendices,
references, and an index. The first chapter serves as a general
introduction and provides information about the fundamental concepts
that are key to understanding variationist linguistic analysis and its
goals such as variable rules, chance occurrence, probability,
hypothesis testing, modeling, and the nature of different types of
data. Chapter 1 also gives a partial taxonomy of statistical analysis
procedures and explains the reasons why VARBRUL is the most widely used
program for variationist analysis.
Chapters 2, 3, and 4 address most directly the first audience mentioned
above by focusing on the practical aspects of performing variationist
linguistic analysis. Chapter 2 outlines the goals of variationist
research and considers issues that may affect the interpretation of the
results of statistical analysis. Chapter 3 discusses the types of data
required for analysis by VARBRUL and other software programs (i.e., raw
data, tokens, and contingency tables) and describes procedures for
recoding data in order to achieve different research goals such as
testing for significant factor groups and identifying interaction among
factor groups. Chapter 4 explains the steps involved in running an
actual VARBRUL analysis and includes information on how to read the
results and assess the model of variation in terms of "goodness-of-
fit," or how well the statistical model fits the variation observed in
the data (see Young and Bayley 1996).
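As a rough illustration of the data forms mentioned above, the following Python sketch tabulates coded tokens into a contingency table and derives a rate of rule application for each factor. The tokens, the factor labels ("casual", "formal"), and the function names are invented for illustration; they are not taken from Paolillo's book or from VARBRUL.

```python
from collections import Counter

# Hypothetical coded tokens: (style, variant), where the variant is
# "r" (rule applied) or "0" (rule not applied). Invented data.
TOKENS = [
    ("casual", "r"), ("casual", "0"), ("casual", "0"), ("casual", "0"),
    ("formal", "r"), ("formal", "r"), ("formal", "r"), ("formal", "0"),
]

def contingency_table(tokens):
    """Count tokens in each (factor, variant) cell."""
    counts = Counter(tokens)
    factors = sorted({factor for factor, _ in tokens})
    variants = sorted({variant for _, variant in tokens})
    return {f: {v: counts[(f, v)] for v in variants} for f in factors}

def application_rates(table, applied="r"):
    """Proportion of tokens in each row where the rule applied."""
    return {f: row[applied] / sum(row.values()) for f, row in table.items()}

table = contingency_table(TOKENS)
rates = application_rates(table)
```

Recoding, in this simplified picture, amounts to relabeling the first element of each token before tabulating again.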
Chapters 5, 6, 7, and 8 present the underlying statistical principles
involved in variationist linguistic methodology. Chapter 5 describes
the fundamentals of contingency table analysis, introduces the
chi-square test of independence, and lays the foundation for logistic
regression modeling, which is further developed in subsequent chapters.
Chapter 6 examines the analytical structure of regression models and
explains how their components are used to assess competing models of a
set of data. Chapter 7 describes the measure of variance known as log-
likelihood and explains how the likelihood ratio test can be used to
achieve an optimal model of variation. Chapter 8 presents the logistic
regression model itself and discusses the procedures and software used
to assess model
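A minimal Python sketch of the likelihood ratio test may help make the idea concrete. It assumes the simplest possible case: one binary factor group, so that the model with the factor group simply gives each cell its own observed rate (a logistic regression with a single two-level predictor reproduces these rates exactly). The counts and function names are invented for illustration, and the p-value helper is valid only for one degree of freedom.

```python
import math

def log_likelihood(cells):
    """Binomial log-likelihood of (applications, total) cells, each
    evaluated at its own observed rate (the maximum-likelihood estimate).
    The constant binomial coefficient is omitted."""
    ll = 0.0
    for k, n in cells:
        p = k / n
        # Cells with a zero count contribute nothing (limit of x*log(x)).
        if k > 0:
            ll += k * math.log(p)
        if n - k > 0:
            ll += (n - k) * math.log(1 - p)
    return ll

def likelihood_ratio_test(cells):
    """Compare a single-rate model with a one-rate-per-cell model."""
    k_total = sum(k for k, _ in cells)
    n_total = sum(n for _, n in cells)
    ll_null = log_likelihood([(k_total, n_total)])  # one pooled rate
    ll_full = log_likelihood(cells)                 # one rate per cell
    g2 = 2 * (ll_full - ll_null)                    # likelihood ratio statistic
    return ll_null, ll_full, g2

def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variate with 1 df."""
    return math.erfc(math.sqrt(x / 2))

# Hypothetical counts: 10 of 40 applications in one context, 30 of 40 in another.
cells = [(10, 40), (30, 40)]
ll_null, ll_full, g2 = likelihood_ratio_test(cells)
p_value = chi2_sf_1df(g2)
```

A large statistic with a small p-value suggests that the factor group improves the model; with more than two cells the degrees of freedom change and a general chi-square routine would be needed.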
The final two chapters consider variationist linguistic methods within
the larger theoretical context of statistics and linguistics. Chapter
9 relates the logistic regression model to log-linear models and
general linear models and compares and contrasts the assumptions, uses,
and limitations of each. Chapter 10 examines the notion of the
variable rule within the larger context of formal models of language
and extends the Variable Rule model to other models of language such as
Optimality Theory and Default Inheritance.
This book is written in an engaging, straightforward style appropriate
for graduate students and professionals in the field who are interested
in conducting variationist linguistic research. Although the text
contains numerous typographical errors, context generally allows the
reader to identify them as such without any loss in meaning.
Paolillo uses many tables and figures to illustrate his explanations
and main points. Particularly useful are the figures and tables used
in Chapter 4, "Conducting Variationist Analyses," which exemplify the
actual print-outs given in a multivariate analysis as performed by the
VARBRUL software program. In addition to leading the reader step by
step through the analysis, the author provides information on how to
read and interpret the results files of the print-outs.
Unlike other books on statistical methods, this manual provides
linguistic examples from actual variationist studies, many of which may
already be familiar to the reader (e.g., Labov's (1972) study of (r)
deletion in New York City department stores). These examples may serve
as potential models for researchers and assist them in the conception
of their own investigations of linguistic variables. Each chapter ends
with a "Further Reading" section, which provides references to works
that may provide answers to questions that go beyond the scope of this
book.
In Chapter 3, "Variable Linguistic Data," Paolillo discusses the
management of different forms of data and explains procedures for
coding and recoding data. However, for most of the technical details
of preparing data for statistical analysis, the reader is referred to
the specific instructions for whatever software package is used. The
author does provide invaluable information on where to obtain
statistical analysis software in Appendix 1, which gives the addresses
(URLs) where several versions of VARBRUL may be downloaded free of
charge from the Internet. These VARBRUL software programs currently
include versions for the Macintosh and PC platforms as well as a newly-
available version re-written for Microsoft Windows called GoldVarb
The book provides complete references for all works cited. Although
the index is comprehensive, the volume would have benefited from the
inclusion of a glossary of terms. Such a glossary would be
particularly useful to those unfamiliar with the specialized vocabulary
of statistical analysis. In addition, instructions on how to write up
the results of a variationist linguistic analysis would have been
helpful, especially for novice researchers not sure about what
information, including tables, figures, and measures of fit, should be
reported in a written results section.
"Analyzing Linguistic Variation" brings together a wealth of
information from several fields to explain the principles of logistic
regression as it is applied in the analysis of linguistic variation.
It will be especially suitable as a textbook for graduate students
learning how to perform variationist linguistic analyses for the first
time. It will also prove to be an indispensable resource for more
experienced researchers who seek a deeper understanding of the
statistical bases of VARBRUL.
Labov, William. (1969) "Contraction, deletion, and inherent variability
of the English copula." Language, 45:715-762.
Labov, William. (1972) "The social stratification of (r) in New York
City department stores." In Sociolinguistic Patterns, pp. 43-69.
Philadelphia: University of Pennsylvania Press.
Young, Richard, and Robert Bayley. (1996) "VARBRUL analysis for second
language acquisition research." In Second Language Acquisition and
Linguistic Variation, ed. by R. Bayley and D. Preston, pp. 253-306.
Philadelphia: John Benjamins.