Hisatsugu Kitahara, (1997) Elementary Operations and Optimal Derivations,
MIT Press, Cambridge, Mass. 140 pages, $15.00.
Reviewed by Julie Legate, <firstname.lastname@example.org>
This book is very firmly situated within the Minimalist approach to syntactic
theory that was begun by Chomsky (1991) and perhaps most fully articulated in
Chomsky (1995). It adopts much of the basic architecture of the 1995 version
of Minimalism (henceforth MP), while deriving several of its principles and
assumptions. The first three chapters of the book propose some comparatively
minor alterations to the MP system, and demonstrate that these alterations
allow several stipulations of the MP to be dropped, while retaining or
improving the framework's empirical coverage. The final chapter retains the
proposals of the previous chapters while putting forth a new condition that is
a clearer departure from the MP approach. With this condition, Kitahara is
able to account for several notoriously problematic wh-constructions.
Kitahara's first chapter consists of a review of the Minimalist syntactic
framework. He discusses the conceptual foundation of the approach (the
unflinching application of Occam's Razor to every aspect of the
computational system), the guiding principles of global economy, as well as
the internal mechanisms of the computation (including the creation of
syntactic trees through successive operations of Merge and of morphological
His second chapter contains the core proposals of the book. Kitahara
replaces the operations of Merge, Move, and Erase by two "Elementary
Operations": "Concatenation" and "Replacement". Concatenation is the
procedure which joins two objects alpha and beta to form a new object K.
Replacement, on the other hand, substitutes an object alpha for an object
beta, where beta is contained within a larger object sigma. The MP operations
are redefined using these Elementary Operations as follows (p35):
(i) Cyclic Merge = Concatenation
Cyclic Move = Concatenation
Noncyclic Merge = Concatenation + Replacement
Noncyclic Move = Concatenation + Replacement
Erase = Replacement
He further redefines the Shortest Derivation Condition (Chomsky 1991, 1993;
Epstein 1992) in terms of these operations, as in (ii) (p26):
(ii) Shortest Derivation Condition (SDC)
Minimize the number of elementary operations necessary for
Given this technology, he proceeds to derive several of the principles and
assumptions of MP.
First, he considers cyclicity, providing a detailed summary and comparison
of the approaches taken in Chomsky 1993, 1994, and 1995. He derives the
result that, in simple cases, cyclic convergent derivations should be preferred
over noncyclic convergent ones, since cyclic operations yield a shorter
derivation. As shown in (i) above, cyclic operations involve only one
Elementary Operation: Concatenation, whereas noncyclic ones require two:
Concatenation and Replacement.
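The counting behind this argument can be made concrete with a minimal sketch.
It assumes only the decomposition in (i) as reported here; the function names
and the two toy derivations are hypothetical illustrations, not taken from the
book.

```python
# Decomposition (i), as reported in this review: each MP operation is
# mapped to the elementary operations it consists of.
DECOMPOSITION = {
    "cyclic_merge": ("Concatenation",),
    "cyclic_move": ("Concatenation",),
    "noncyclic_merge": ("Concatenation", "Replacement"),
    "noncyclic_move": ("Concatenation", "Replacement"),
    "erase": ("Replacement",),
}


def elementary_cost(derivation):
    """Count the elementary operations used by a list of MP operations."""
    return sum(len(DECOMPOSITION[step]) for step in derivation)


def shortest(derivations):
    """Return the names of the derivations with minimal cost (ties are kept)."""
    costs = {name: elementary_cost(steps) for name, steps in derivations.items()}
    best = min(costs.values())
    return sorted(name for name, cost in costs.items() if cost == best)


# Hypothetical toy derivations that differ only in how the last step applies.
candidates = {
    "cyclic": ["cyclic_merge", "cyclic_merge", "cyclic_move"],
    "noncyclic": ["cyclic_merge", "cyclic_merge", "noncyclic_move"],
}

print(elementary_cost(candidates["cyclic"]))
print(elementary_cost(candidates["noncyclic"]))
print(shortest(candidates))
```

Under these assumptions the cyclic derivation uses one elementary operation
fewer, which is all the SDC needs in order to prefer it.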
Next, he turns to the MP principle of Procrastinate: covert movement
(i.e. movement which occurs in the computation after the derivation is sent to
PF for pronunciation) is preferable to overt movement. He separates the
discussion into head movement, object shift, and expletive insertion. He
adopts the MP assumptions that (a) movement results in two instances of
identical elements, one in the merged position and the other in the moved
position (i.e. the copy theory of movement); (b) if an element is overtly
attracted, the entire category moves, whereas if an element is covertly
attracted, only the formal features move; and (c) only one of the identical
elements created by movement is interpreted at LF. Finally, he proposes a
novel interpretation of the effect of strong features in the grammar, as
in (iii) (p37):
(iii) Strong Feature Condition
Spell-Out applies to sigma only if sigma contains no category with
a strong feature.
Regarding head movement, in languages with overt verb raising, T has a strong
V feature and thus overt raising is necessary for convergence. In languages
without overt verb raising, the SDC selects derivations with covert, rather
than overt, verb raising. Although both covert and overt head movement
involve one operation of Replacement (since head movement is necessarily
non-cyclic), Kitahara claims that overt head movement, being category movement,
requires an additional instance of Replacement, thus resulting in a longer
derivation. His reasoning is as follows. If a verb moves overtly, its
semantic features are carried along. Therefore, at LF it will be necessary to
delete one of the instances of the semantic features, since elements are only
interpreted once. This requires an application of Replacement that is not
needed for covert movement, since covert movement only affects the formal
features, not the semantic features as well.
Notice that the same reasoning will not extend to phrasal movement. Overt
movement will require one instance of Replacement to delete the semantic
features of one member of the chain (just considering a simple two-membered
chain); however, covert movement will also require one instance of Replacement,
since covert movement is necessarily non-cyclic. Thus, the SDC cannot choose
between derivations with overt object shift and those without. This predicts
the optionality of object shift in languages like Icelandic, without resorting
to the MP 'optional strong feature' analysis, which is a simple restatement of
In languages without overt verb movement, object shift is predicted to be
impossible, assuming that the object shifts to the outside specifier of vP,
the inside specifier being the merged position of the subject, and that
multiple specifiers are not equidistant from a higher head, unless head
movement renders them equidistant. Thus, the shifted object would block
movement of the subject to TP, unless the verb has raised to T. Kitahara
acknowledges (p144, fn26), however, that he cannot explain languages like
French, which display overt verb raising but prohibit object shift.
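The operation counts behind the head-movement and object-shift comparisons
can be tallied in a short sketch. The step lists below are this reviewer's
reading of the argument as summarized above, and the names `cost` and
`preferred` are hypothetical; nothing here is Kitahara's own notation.

```python
C, R = "Concatenation", "Replacement"


def cost(steps):
    """Total number of elementary operations over all steps."""
    return sum(len(ops) for ops in steps)


def preferred(options):
    """Return the option(s) the SDC would select; a tie yields both."""
    best = min(cost(steps) for steps in options.values())
    return sorted(name for name, steps in options.items() if cost(steps) == best)


head_movement = {
    # Noncyclic movement of formal features only.
    "covert": [(C, R)],
    # Noncyclic category movement, plus one Replacement at LF to delete
    # the duplicated semantic features.
    "overt": [(C, R), (R,)],
}

object_shift = {
    # Noncyclic movement of formal features only.
    "covert": [(C, R)],
    # Cyclic category movement, plus one Replacement at LF to delete
    # the duplicated semantic features.
    "overt": [(C,), (R,)],
}

print(preferred(head_movement))
print(preferred(object_shift))
```

On this tally covert head movement is strictly cheaper, while overt and covert
object shift tie, which is the source of the predicted optionality.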
Finally, Kitahara considers the timing of expletive insertion. Notice that
although MP assumes that Merge is cheaper than Move, Kitahara's reanalysis
predicts that Merge is as economical as cyclic Move, since both consist
of one application of Concatenation. Since the timing of expletive insertion
is the primary empirical motivation for the MP assumption, Kitahara
demonstrates that this timing is equally captured within his system.
Consider the familiar sentences (iv) and (v).
(iv) There seems to be a man in the room.
(v) *There seems a man to be in the room.
In (iv), the expletive was inserted in the embedded [spec, T] and raised to
the matrix TP. In (v), on the other hand, "a man" was raised to the embedded
TP and the expletive was inserted directly into the specifier of the matrix
TP. MP claimed that (iv) is preferred over (v) because it is cheaper to merge
the expletive into the embedded TP than to move the associate. Kitahara
claims that these facts follow from his SDC. Since expletives are assumed to
have no semantic features, overt raising of "there" will not require an
application of Replacement to delete an instance of semantic features in (iv).
In (v), on the other hand, overt raising of "a man" will require Replacement
to delete one of the resulting two instances of the semantic features of "a
man". Therefore, the derivation in (iv) is shorter than that in (v) and thus
As a side point, notice that this analysis requires that the formal features
of "a man" raise covertly directly to the matrix T. If these features raised
to the embedded TP covertly, the non-cyclic movement would result in an
additional application of Replacement, and (iv) and (v) should be equally
economical (an equivalent situation to object shift).
Kitahara concludes Chapter Two with a note about the timing of expletive
insertion in Icelandic Transitive Expletive Constructions. He assumes the MP
analysis that the associate (i.e. the subject of the transitive) moves into
the inner specifier of TP and the expletive merges into the outer specifier of
TP, the verb appearing between the two as a verb-second phenomenon. He notes
that although this situation is the opposite of the English situation
discussed above, i.e. here category movement precedes expletive insertion,
this is predicted by the restriction against downwards movement. Assuming the
associate must move to adjoin to the expletive at LF, the associate must
appear lower than the expletive, in order for this movement to be raising
rather than lowering.
In Chapter Three, Kitahara demonstrates that Chomsky's (1995) Minimal Link
Condition can explain phenomena which had previously received disparate
analyses in the literature.
(vi) Minimal Link Condition (MLC)
H(K) attracts alpha only if there is no beta, beta closer to H(K) than
alpha, such that H(K) attracts beta.
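Read procedurally, (vi) says that a head may attract only the closest element
bearing the relevant feature. The following minimal sketch illustrates this.
It makes a loud simplifying assumption: closeness is modelled as an integer
`depth` (smaller means closer to the attracting head), whereas the actual
definition is structural. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    depth: int            # assumed stand-in for structural closeness
    features: frozenset   # features the element can be attracted for


def closest_attractee(feature, candidates):
    """Return the closest candidate bearing `feature`, or None."""
    matching = [c for c in candidates if feature in c.features]
    return min(matching, key=lambda c: c.depth, default=None)


def violates_mlc(feature, moved_name, candidates):
    """True if `moved_name` is attracted although a closer attractee exists."""
    closest = closest_attractee(feature, candidates)
    return closest is not None and closest.name != moved_name


# Toy Superiority configuration: "whom" is closer to the head than "what".
wh_phrases = [
    Candidate("whom", 1, frozenset({"wh"})),
    Candidate("what", 2, frozenset({"wh"})),
]

print(violates_mlc("wh", "what", wh_phrases))
print(violates_mlc("wh", "whom", wh_phrases))
```

Because the check is relativized to a feature, an intervening element that
lacks the attracting feature does not count as closer in this sketch.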
Kitahara begins with Relativized Minimality (Chomsky 1993) violations, as in
(vii), and Superiority Condition violations, as in (viii).
(vii) *John seems it is t(John) certain to be here.
(viii) *What did you persuade whom to buy t(what)?
The MLC accounts naturally for these facts: in (vii), "it" is closer
than "John" to the matrix T, and thus blocks attraction of "John"; in (viii),
"whom" is closer to the matrix CP than "what" and thus blocks attraction of
"what".
Next, Kitahara considers Proper Binding Condition violations, like that shown
in (x).
(ix) Proper Binding Condition
Traces must be bound.
(x) *Which picture of t(who) do you wonder who John likes t(which picture
of who)?
He argues that a Proper Binding Condition analysis of (x) is no longer
available in Minimalist approaches. This condition can no longer apply at
S-structure, since S-structure has been eliminated from the model, and LF
reconstruction of "picture of t(who)" could create a configuration in which
the trace of "who" is bound.
Instead, Kitahara offers an MLC solution. He observes that (x) involves two
violations of the Minimal Link Condition. First, "which" is closer to the
embedded CP than "who" and thus blocks the attraction of "who" (note that
"picture of who" would be necessarily carried along with "which" to the
embedded CP by an independent convergence condition). Second, given the
illegitimate attraction of "who" to the embedded CP, "who" becomes closer to
the matrix CP than "which", and thus blocks the attraction of "which".
(xi) A derivation employing a greater number of illegitimate steps
induces a greater degree of deviance (p72)
the derivation in (xii) below is preferred over (x) because (x) involves two
violations of the MLC whereas (xii) involves only one.
(xii) ??Who do you wonder which picture of t(who) John likes t(which
picture of who)?
Kitahara extends this analysis to crossing versus nesting dependency data.
(xiii) Nested Dependency Condition (Pesetsky 1987)
If two "wh"-trace dependencies overlap, one must contain the other.
The paradigm cases are those in (xiv) and (xv):
(xiv) ??What did you wonder whom John persuaded t(whom) to buy t(what)?
(xv) ?*Whom did you wonder what John persuaded t(whom) to buy t(what)?
In (xiv), Kitahara observes, the MLC is disobeyed once, in the raising of
"what" over "whom" to the matrix CP. In (xv), on the other hand, the MLC is
disobeyed twice, once in the raising of "what" over "whom" to the embedded CP,
and a second time in the raising of "whom" over "what" to the matrix CP. Thus,
following (xi), the grammar prefers (xiv) over (xv).
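The comparison amounts to counting illegitimate steps per derivation. A
minimal sketch follows, with the steps annotated by hand from the description
above; the data layout and function names are hypothetical, and the sketch
does not itself compute which steps violate the MLC.

```python
# Each step is (moved element, closer element that was skipped or None).
derivations = {
    "(xiv)": [("whom", None), ("what", "whom")],
    "(xv)": [("what", "whom"), ("whom", "what")],
}


def mlc_violations(steps):
    """Count the steps in which a closer attractee was skipped."""
    return sum(1 for _, skipped in steps if skipped is not None)


def rank_by_deviance(derivs):
    """Order derivations from least to most deviant by violation count."""
    return sorted(derivs, key=lambda name: mlc_violations(derivs[name]))


print({name: mlc_violations(steps) for name, steps in derivations.items()})
print(rank_by_deviance(derivations))
```

Note that this ranking presupposes that the grammar can compare counts across
derivations.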
Finally, Kitahara considers scrambling and topicalization in German and
Japanese, demonstrating that certain restrictions on these phenomena can also
receive an MLC treatment. He assumes that both phenomena are feature driven,
and that the scrambling/topicalization feature of the attracted element is
interpretable, and thus remains accessible to the computation after checking.
The basic pattern considered is that it is not possible to scramble an element
from a constituent and then scramble the remnant; however, it is possible to
then topicalize the remnant. German examples are provided in (xvi) and (xvii):
(xvi) scrambling + scrambling of remnant
*dass [t(das Buch) zu lesen] keiner [das Buch] t(t(das Buch) zu lesen)
that (the book) to read no one the book (the book to read)
"that no one has tried to read the book"
(xvii) scrambling + topicalization of remnant
[t(das Buch) zu lesen] hat keiner [das Buch] t(t(das Buch) zu lesen)
(the book) to read has no one the book (the book to read)
"No one has tried to read the book"
Under these assumptions, (xvi) violates the MLC twice, first by scrambling the
DP "the book" over the closer VP "the book to read", and second by
scrambling the VP "t(the book) to read" over the now-closer DP "the book".
(xvii), on the other hand, does not violate the MLC at all. Assuming that the
features that drive topicalization and scrambling are distinct, "the book"
would be the closest available element with the scrambling feature to the
attracting head, since "the book to read" would have a topicalization feature
rather than a scrambling feature. Similarly, the VP "t(the book) to read" is
the closest available element to be attracted for topicalization, since "the
book" has a scrambling feature, not a topicalization feature.
In Chapter Four, Kitahara discusses the differences in deviance between
derivations which involve one violation of the MLC by a wh-element. He
provides the generalization in (xviii), and examples in (xix)-(xxii) (p83-85):
(xviii) An MLC violation involving adjuncts, subjects, or quasi objects [i.e.
"how many" phrases] is far more severe than an MLC violation involving
(xix) *How do you wonder [whether John fixed the car t(how)]?
(xx) *What do you wonder [whether t(what) was fixed t(what)]?
(xxi) *How many pounds do you wonder [whether John weighed t(how many)]?
(xxii) ??What do you wonder [whether John fixed t(what)]?
In order to explain this phenomenon, Kitahara proposes the following condition
(xxiii) Chain Formation Condition
An application of Move forms 1 or >1 chain(s) only if it is legitimate
and assumes that traces may be attracted (at least covertly). He claims that
the illegitimate wh-movements in (xix)-(xxii) do not form a chain.
Therefore, the wh-elements will not be able to be interpreted at LF,
causing the derivation to crash. This accounts for the ungrammaticality of
(xix)-(xxi). In (xxii), however, Kitahara claims that the formal features of
the trace of "what" raise covertly to check accusative case, and that it is
this movement that saves the derivation. (Notice that covert movement of the
traces of the wh-elements in (xix)-(xxi) will not occur. Adjuncts and
quasi objects do not check case, and subjects move overtly to check case.)
According to the Chain Formation Condition, the movement of the formal
features of the object, being legitimate, may form one or more chains; in
particular, it forms a chain between the raised position of "what" in the
matrix CP and the merged position of "what". Thus, the derivation can be
interpreted at LF, and has only the status of an MLC violation.
Kitahara extends the analysis to (xxiv).
(xxiv) "where"/"when" adjuncts
?? Where/when do you wonder [whether John fixed the car t(where/when)]?
He assumes that these adjuncts are the complement of a null preposition.
Therefore, the formal features of the trace of "where"/"when" must raise
covertly to check case with the null preposition, again creating the necessary
chain between the moved position of "where"/"when" in the matrix CP and its
This chapter concludes the book.
Although this book stands solidly on the foundations of previous Minimalist
syntactic research, it remains accessible to those who are not well-versed in
Minimalist theory. It provides very clear explanations of the details of
previous Minimalist approaches, as well as Kitahara's own proposals.
Furthermore, all relevant derivations are presented step-by-step, at a pace
designed to accommodate the non-specialist. Thus, it presents a good
opportunity for those interested to learn about research and issues in
Those who are familiar with Minimalist research should find this to be an
interesting reworking and application of 1995-style Minimalism. Anyone
convinced by recent discussions of computational complexity and local economy
(see Collins 1997, Johnson & Lappin 1997, Yang 1997, among others), however,
will be dissatisfied with the approach, as it continues to rely on global
economy conditions. Since, of course, not everyone has been convinced by the
discussion, this is more a note to prospective readers than a criticism. On
a similar note, a crucial assumption for the analyses is that the grammar can
count, which is controversial, but not obviously false.
Note that, regarding cyclicity, the notoriously problematic case of head-
movement, which Chomsky (1995) managed to incorporate into "cyclicity"
requirements (forcing it to apply before introduction of another head into the
derivation), again falls outside the analysis of "cyclicity". Since all head-
movement will require an operation of Replacement, there is no longer any
clear way to force it to apply before another head is introduced.
Perhaps more serious is the reformulation of the Strong Feature Condition.
The various Strong Feature Conditions of previous Minimalist approaches are
simplified in the second chapter to (iii) above, repeated in (xxv) below.
(xxv) Strong Feature Condition
Spell-Out applies to sigma only if sigma contains no category with a
strong feature.
The difficulty with this formulation is that it renders the Strong Feature
Condition an S-structure condition and thus anti-Minimalist, since Minimalism
took great pains to eliminate all S-structure conditions. Notice that it
would be trivial to reformulate all previous S-structure conditions in a
manner parallel to (xxv)--"Spell-Out applies to sigma only if sigma contains
no traces which are not bound"--thus reducing the Minimalist claim that
S-structure is redundant to a matter of terminology only.
Furthermore, in the last chapter, a further condition involving strong
features had to be introduced in order to rule out certain noncyclic
derivations. This additional condition, given in (xxvi), is essentially a
weakened version of Chomsky's (1995) formulation of the Strong Feature
Condition.
(xxvi) alpha and beta cannot be concatenated if some sublabel of alpha and
some sublabel of beta are both strong (p95)
Thus, the proposed simplification of the Strong Feature Condition actually
results in positing two conditions, one of which is an S-structure condition.
Another seemingly anti-Minimalist proposal is the Chain Formation Condition,
given in (xxiii) above and repeated in (xxvii) below.
(xxvii) Chain Formation Condition
An application of Move forms 1 or >1 chain(s) only if it is legitimate
Minimalist theory claims that the computation of human language meets the
Inclusiveness Condition, i.e. no new elements are added during the course of
the computation. Instead, the computation arranges and rearranges items
selected from the lexicon. Therefore, under Minimalist theory, the notion of a
"chain" as an independent entity does not exist, as it would have to be added
during the course of the derivation, violating Inclusiveness. Instead,
"chain" is simply a convenient term used to refer to the identical elements in
a derivation. The Chain Formation Condition, however, crucially requires
chains to have an independent existence in the computation.
These comments aside, this book does represent a step forward in the
Minimalist research program. Kitahara is able to derive several assumptions/
principles which previously could only be stipulated. The account of
optionality in Icelandic object shift is more satisfying than the "optional
strong feature" approach, although, as was noted, it does raise some cross-
linguistic considerations (e.g. French). The systematic application of the
Minimal Link Condition to data captured by various other conditions was sorely
needed, if only to confirm the intuitions that an MLC would have equal, or
superior, empirical coverage. Finally, the analysis of "wh" extraction
asymmetries presented in the final chapter is one of the few Minimalist
treatments of this phenomenon. All in all, the reader will find this book
to be very well considered, clearly explained, and thought-provoking.
Julie Anne Legate is a PhD student in the Department of Linguistics and
Philosophy at MIT. Her research interests include syntactic theory and Irish
Chomsky, Noam. (1991) Some notes on economy of derivation and
representation. In Principles and parameters in comparative grammar, ed.
Robert Freidin, 417-454. MIT Press, Cambridge, Mass.
Chomsky, Noam. (1993) A minimalist program for linguistic theory. In The
view from building 20, eds Kenneth Hale & Samuel Jay Keyser, 1-52. MIT
Press, Cambridge, Mass.
Chomsky, Noam. (1994) Bare phrase structure. MIT Occasional Papers in
Linguistics 5. MITWPL, Cambridge, Mass.
Chomsky, Noam. (1995) The Minimalist Program. MIT Press, Cambridge, Mass.
Collins, Chris. (1997) Local Economy. MIT Press, Cambridge, Mass.
Epstein, Samuel D. (1992) Derivational constraints on A'-chain formation.
Linguistic Inquiry 23, 135-159.
Johnson, David & Shalom Lappin. (1997) A Critique of the Minimalist Program.
Linguistics and Philosophy.
Yang, Charles D. (1997) Minimal Computation. Master's Thesis, Department
of Electrical Engineering and Computer Science, MIT.