Epstein, Samuel David, and Norbert Hornstein, eds. (1999) Working
Minimalism. MIT Press.
Reviewed by: Robin Schafer, University of Canterbury
Syntacticians are created through a series of small epiphanies: the
discovery of principled patterns in a randomly selected grammar; the
identification of a seemingly idiosyncratic construction with a
cross-language phenomenon; the realization of a prediction. The beauty of
a theory like Government and Binding (GB) or Principles and Parameters
(P&P) is that it is possible to lead students to such points. Most
importantly, it is possible to ask them to apply the theory to data they
have not seen before. One of the frustrations of living in the age of
minimalism is that these experiences are unavailable to the uninitiated:
students cannot be told to apply the Minimalist Program (MP) to data of
their own choosing. It is simply not clear how to start.
The solution to this problem is not found in the book Working Minimalism,
despite its title. The work leans more to philosophy of linguistic
analysis than to linguistics itself. It is a compendium of the obsessions
of current theory, primarily answering the question, 'How does the MP
address X, previously treated as Y?' As such, it is highly readable and
informative, but imparts the attitudes more than the procedures employed
in working with the MP. The papers involve a rethinking of the operations,
relations, and constraints of GB/P&P. Thus, the work suggests that the
starting point of MP analyses is all and only the data questions raised
by GB. A reworking of what has been done before is welcome if the result
is a neater or more discerning account. But as a reader looking for such
results, I was disappointed.
Working Minimalism consists of an introduction and 12 papers. The
introduction by Epstein and Hornstein presents, in about five pages, a
clear and useful synopsis of the MP. They provide readers with an
annotated list of 10 key ingredients of the MP, which I repeat here in
two sets: five items that come in pairs and five conditions and
postulates:
Two levels: LF & PF
Recursion through generalized transformations (plus Strict Cycle Condition)
Two basic options: Merge and (Copy) Move
Two distinctions among features: Interpretability and Strength
Two checking configurations: Spec-head and Head-head
NOTE: locality is defined on these configurations: there is no government
Full Interpretation: Interpret fully at interfaces
Shortest Move Condition
Last Resort Condition (motivated by Greed)
Inclusiveness Condition: operations don't add features
Theta role postulate:
Roles are special: they are not features; they are assigned lexically.
The approach is derivational, but it is not clear whether derivations
proceed serially or in parallel, and the different authors make different
assumptions in their individual papers.
In addition, Epstein and Hornstein set out their view of the relation
between GB/P&P and the MP. They explain that the successes of GB/P&P
brought to the fore issues of theory evaluation like parsimony and
simplicity. These economy conditions constitute the MP principles. It is
the change in the role that economy plays -- from an evaluative metric in
late GB to the basis of the principles in the MP -- that accounts for the
level of abstraction that is consequently introduced into syntax. In
reading this book I was hoping to find justification for this change to a
less accessible method of linguistic analysis.
The twelve chapters that comprise the book include one by Hornstein and
one by Epstein as well as the work of 10 other authors. The chapters are
not organized into sections, but can be divided into three types: half the
chapters argue for getting rid of a GB/P&P device; four offer new
explanations for handling already defined phenomena; and two update the
reader on the operations of an MP grammatical device. I will not go
through each chapter in detail, but I include mention of all the authors
and topics below as I describe these three groups of papers.
The two papers that discuss particular MP devices are Jairo Nunes's paper
on Linearization and Juan Uriagereka's on Multiple Spell-Out. Both these
papers are elaborations of operations familiar from earlier versions of
MP. The Nunes paper is straightforward, addressing the question of why,
if we adopt the copy theory of movement, traces are silent. Nunes starts
by noting that traces cannot be primitives, since they are not in the
numeration. He sets out to account for the facts that not all links of
non-trivial chains are phonetically realized, that certain links often
are, and that sometimes more than one is. His approach is to let
linearization be a PF
convergence requirement (following Kayne 1994). Copies are subject to the
Linear Correspondence Axiom (LCA) and deletion is triggered by
linearization considerations. Basically, copies must delete because they
are non-distinct from one another. So when the phonological
component attempts to process COPY A - X - COPY B, it is forced to order
the copied element both before and after X. Since this is impossible (i.e.
since there is no output of linearization), a derivation in which deletion
occurs converges as a minimal derivation. Which copy is deleted is
determined by economy of formal feature elimination.
This account strikes me as technical: it's clever and innocuous, but it
doesn't capture intuitions or offer insight into why there are traces. It
doesn't appear to tie into the interactions between (or the convergence
of) precedence and information structure, gapping and relevance, locality
and the number of roles an expression is assigned in an event, or any
other larger view of what characterizes traces.
The Uriagereka paper has more far-reaching consequences. It is often
observed that if Spell-Out applies only once, then the MP admits an
s-structure level, despite its commitment to only two levels. This is a
matter of misconceptualization, according to Uriagereka. Spell-Out is like
any other rule; it applies as much as it can, subject to economy.
Uriagereka, like Nunes, starts with Kayne's LCA, since linearization
operates at Spell-Out. The LCA has two parts: the first states a
relationship between command and precedence (command is sufficient for
precedence), and the second cleans things up when a command relation
doesn't directly hold between the two chunks to be linearized. If we have
multiple Spell Out, the second step follows. That is, linearization
applies only to chunks in which a command relation holds, so
sub-sentential pieces are fed into the interfaces independently.
Superiority and CED effects are discussed using this machinery. The output
of linearization re-creates domains of government without reference to
what minimalism views as a redundant relation.
All sorts of questions are raised here (many of which must have been
discussed since this article was first distributed). I mention the obvious
one here: how do the various chunks spewed into PF get into the 'right'
order? Some discussion of this with respect to antecedence and the role
agreement plays is supplied. I am most disconcerted by the result that the
syntax doesn't actually derive anything that corresponds to the actual
stuff that is said --- nothing syntactic corresponds to the sentence. It's
not that the actual empirical data is lost: it corresponds to the
numeration and the PF representations. This is analogous to the change in
status of constructions from primitive operations to a collection of
independently required feature specifications, conditions and operations.
What concerns me is that a lot of language structure is actually encoded
in the string itself. We spend more time thinking about the non-evident
structural relations, but it seems wrong to lose the connection between
syntactic structure and what is said. This paper definitely leaves the
reader with a good deal to think about.
The papers that attempt to get rid of some GB/P&P device are crucial to
the claim that previously employed grammatical constructs are too rich.
Roger Martin targets uninterpretable features for elimination from the
theory, Norbert Hornstein targets Quantifier Raising, Hisatsugu Kitahara,
the '*' feature, Robert Freidin argues cyclicity is unnecessary, Howard
Lasnik argues against reconstruction in A-chains, and Sam Epstein
proposes that c-command is reducible to Merge.
Several papers stand out in this group. Martin's paper is an excellent
example of the Get Rid Of It group which doesn't succeed in eliminating
the device investigated. Martin points out that the only two levels in an
optimal language faculty are the two interfaces, LF and PF, so there
should be only features interpretable at one of these two interfaces. No
uninterpretable features should exist. Martin shows that some
uninterpretable features can be eliminated, but Case cannot be. He
concludes that Case must be conceptually necessary. The discussion is
clear and well presented, but what is also fascinating is the nature of
the argument, particularly this conclusion.
One might have thought that the conclusion should be that the language
component is not optimal. Given the relevant conception of 'optimal,' few
linguists would be surprised at such a conclusion. Formulated within the
MP, Martin's investigation reveals what is minimally necessary. The
question is: assuming syntax is minimal, what must be in it? Martin provides
a partial answer: Case. The minimal syntax required is viewed as
conceptually necessary, hence as part of an optimal system.
The Hornstein paper on Quantifier Raising is a very fine example of the
Get Rid Of It group of papers which arguably succeeds in eliminating a GB
device. QR is a syntactic operation with a semantic motivation. In an MP
world where all movement serves to check morphological features, this is
unexpected. It is also unexpected under any theory that embraces the idea
that semantics is essentially read off a structure determined by syntax.
In either case, rules that fix quantifier scope are anathema.
Hornstein argues that certain basic aspects of the MP allow us to get rid
of QR. Crucial is the fact that Case checking (in fact all feature
checking) occurs outside of VP, and the view that movement involves copy
and deletion. The proposal is remarkably simple and well supported: the
scope an expression has depends on which member of an A-Chain survives
deletion --- the copy internal to VP or the copy in an Agr specifier.
Scope, in this conceptualization, is a property of a member of a chain,
not of a syntactic category.
The beauty of this paper lies in the discussion of the separation of Case
checking from Theta properties, a move which began in late GB (Principles
and Parameters) work. Hornstein's discussion of the theoretical
ramifications of QR is not only clear with respect to the desiderata of
MP, but contrasts this with the position adopted in GB and so actually
unveils the development in thinking on QR from GB to MP.
Hornstein ends the paper with a defense of lowering which might seem odd
from the commonly held perspective that optimal grammars eschew lowering.
However, when deletion targets the higher copy or chain member, the result
is reminiscent of lowering. So Hornstein argues there is reconstruction in
A-chains (that is, he argues against arguments against reconstruction in
A-chains). He's looking for other instances where we might find deletion
of the highest member of a chain. I point this out because a trade-off,
or cost, associated with the minimizing is characteristic of this group
of papers.
Papers like Hornstein's that argue for the elimination of some device seem
to be the product of a program embracing economy, but they do not seem to
require that economy constitute the principles of the theory (rather than
serve as an evaluative metric). Papers like Martin's, on the other hand,
really do seem to be the product of a program distinct from GB/P&P. If
economy conditions were an evaluative metric, then an analysis involving
Case features would be evaluated less optimally than one without them, all
other things being equal. By making the economy principles the theory
itself, the non-optimal bits that are necessary are rendered critical, or
conceptually necessary. It is here that we observe a real difference in
operation of the MP.
The third group of papers, those concerned with how to handle specific
phenomena within the MP, includes Erich Groat on expletive there, Norvin
Richards on multiple specifiers in WH-Questions and other constructions,
Zeljko Boskovic on multiple WH-fronting and multiple head movement and Amy
Weinberg on sentence processing. Groat's paper is truly enjoyable, if for
no other reason than that it provides a scorecard of the various
incarnations of the expletive there analysis since 1991. But the question
raised is also one that reveals a lot about MP conceptualizations. Full
Interpretation is a cornerstone of the Minimalist Program. Expletives are
simply not compatible with Full Interpretation. So, how do we analyse
expletives so that they can be conceived of as compatible with the theory
(in fact, conceptually necessary)?
Groat argues, contra the analysis in Chomsky 1995, that expletives must
bear Case. Under his analysis, D-features assigned to there are
unnecessary. The expletive raises from a small clause in which it is
generated together with its associate to check nominative Case. Through
this analysis, two complications of assuming expletives do not bear Case
are eliminated. First, if the expletive just checks an EPP feature
(Chomsky 1995), feature checking occurs when the expletive is Merged into
the representation. Thus this account requires that features can be
checked either through Merge or Move. Groat points out that all other
features are checked via Move. If the expletive raises to check Case, as
Groat proposes, all checking is uniform and takes place only through Move.
This seems to me a conceptually critical prohibition: if Merge is
cheaper than Move, and both operations were available for feature
checking, but most features were checked via Move, the result would
certainly raise eyebrows. Chomsky's 1995 analysis also requires a
constraint that Move is possible only if it is the cheapest among
competing, potentially convergent, derivational "steps." Since steps that
lead to non-convergent derivations have to be excluded, this constraint
introduces a global look-ahead. Groat's analysis doesn't require this
constraint. Admittedly, however, it introduces its own non-minimal bits:
the raising of the expletive is less optimal than simply Merging it into
its checking position.
Just as in the Martin paper, the point of the Groat paper is to look at an
identifiable non-optimal part of the grammar. I wonder whether the
conceptual necessity of Case a la Martin isn't tied to the conceptual
necessity, or at least self-evident existence, of expletive there.
Interestingly Groat's discussion doesn't touch on matters of conceptual
necessity. In fact this paper seems much more concerned with the
phenomenon of expletives; here the theoretical concerns are secondary.
This is a general property of the How to Handle It group of papers. In
this same way, Norvin Richards's paper is a very neat piece of exposition
concerned with the nested dependencies evident in movement to multiple
specifiers. The theoretical matter in which Richards couches his work is
cyclicity. In Chomsky 1993, cyclicity was captured in the condition that
all operations
must expand a tree. But head movement had to be treated as a widespread
exception to this condition. Chomsky's 1995 proposal that strong features
must be checked ASAP avoids this problem, and Richards argues that it
predicts that paths to multiple specifiers ought to cross rather than
nest. Richards shows this prediction is borne out in a discussion of
multiple WH-movement
in scrambling and question formation. He then considers other instances of
crossed paths (Object Shift, Negative Fronting, etc.) and argues that
these too involve "multiple attraction by a single attractor" or head.
These How to Handle It papers are instructive instances of applied (or
working) minimalism, but do not seem to be particularly indebted to the
MP. Their focus is phenomenological; the theoretical application seems to
be secondary, in some cases almost an afterthought. The facts and
generalizations persist independent of the theory within which the data
are currently discussed, and so support it only indirectly.
In this context it is important to mention Amy Weinberg's contribution, A
Minimalist Theory of Human Sentence Processing. This paper stands out from
the others in the volume in subject matter; it is also an instance of the
How to Handle It group of papers that is intimately connected with the
MP, particularly with Uriagereka's revision of the Linear Correspondence
Axiom and his notion of multiple Spell Out. Weinberg uses the MP to
explain initial analyses of ambiguous structures and to provide a theory
of the revision that occurs when the processor encounters disconfirming
data. It is an interesting re-working of the data.
For example, it is well established that a noun phrase following a
potentially transitive verb is preferentially interpreted as a direct
object (see Frazier and Rayner 1982, 'Making and correcting errors during
sentence comprehension: Eye movements in the analysis of structurally
ambiguous sentences,' Cognitive Psychology 14, 178-210). A sentence like
(a) contains the ambiguous string the girl knows the answer.
a) The girl knows the answer to the problem was correct
The noun phrase the answer is preferentially interpreted as the object to
knows, and Frazier and Rayner showed higher reading time per character in
sentences of this sort, as well as a higher probability of regressive eye
movements, than in corresponding cases lacking the ambiguity.
Weinberg captures this preference for an argument analysis via feature
checking. She states that under a direct-object attachment, the noun
phrase will be assigned more features at the point of attachment than
under the subject-of-complement-clause analysis. Thus the object
attachment is preferred. In her discussion of these sorts of cases,
Weinberg must assume that theta-features are checked like other features.
She also makes extensive use of Larsonian shells (including for adverbial
attachment).
The idea that the processor is (economically) driven to check as many
features as possible in as few operations as possible contrasts with a
Garden Path/Construal view that deterministic, structural parsing
principles like Minimal Attachment account for this preference. How does
the system know how many features will be checked prior to making the
attachment? It is very disappointing that there is no comparison between
what Weinberg proposes and this more widely known Garden Path alternative,
in which questions of this sort could have been directly answered.
Weinberg does provide a comparison with Constraint Satisfaction models. It
is intriguing in that she concedes that a verb's frequency of occurrence
as a main verb vs. as a head of a reduced relative may well be the
determining factor underlying processing behaviors observed with classic
garden path sentences like the horse raced past the barn fell. The role of
grammatical constraints, she asserts, is to determine whether reanalysis
is possible.
Weinberg's proposal concerning reanalysis is built on the notion of
multiple Spell-Out: reanalysis is available where it is triggered before
Spell-Out has eliminated access to structure. So in the example in (a)
above, there is reanalysis. Where there is no reanalysis we find a garden
path effect, as in a traditional Late Closure example like (b).
b) Since Jay always jogs a mile seems like a short distance to him.
Here in the absence of punctuation the string since Jay always jogs a mile
is ambiguous, and people show a preference for attaching a mile as the
object of jog. Again for Weinberg, this preference is captured by feature
checking as in example (a), but she further argues that the attachment of
the disambiguating element (here, seems) triggers Spell-Out of the initial
portion of the input, thus eliminating the structure and preventing
reanalysis. It was
not clear to me what she predicts to be the difference in behavioral
indicators (reading times, eye movements) of reanalysis (as in (a)) vs.
garden path (b).
I hope this discussion has indicated that there are a number of worthwhile
papers in the collection. Overall, my disappointment with this book might
be attributed to an expectation for something more instructive than this
collection of essays or a failure to appreciate the distinction between a
theory and a program. But I don't think that my disappointment is
ultimately due to my ignorance, inasmuch as I think my expectations for
the book are fair.
The work should come together better as a collection. Most of the essays
are nice independent pieces of research, but more needs to be done to fit
them together. A discussion, for instance, of the inconsistencies in the
papers is required: Lasnik argues against reconstruction in A-chains;
Hornstein argues against this position. Hornstein deletes movement copies
freely at LF; Nunes deletes them at PF according to formal feature
content. Epstein and Hornstein state that theta roles do not check
features; Weinberg claims they do. Is everyone consistent with Epstein's
reformulation of c-command? Is (the often employed) deletion an operation in
MP? Does the system compare multiple derivations from a numeration, or (as
Weinberg asserts) work serially? And finally, does the MP require only
principles operating at the interfaces and reducible to economy conditions
as Uriagereka indicates, or is it a means for critiquing the devices of
actual theories as Freidin suggests? It isn't incumbent upon the editors
to resolve all issues, but it is important to present the points of debate
and do so coherently. In addition, a synopsis of where we are at the end
of this work is necessary: what do the writers agree is conceptually
necessary? What elements have been introduced in eliminating the old,
unnecessary devices? I think deliberation of this sort would give readers
of Working Minimalism a sense that the MP is moving forward rather than
merely reworking the past.
Robin Schafer is an assistant professor in Linguistics at the University
of Canterbury teaching syntax, morphology and psycholinguistics.