Review of Bidirectional Optimality Theory

Reviewer: Diane Frances Lesley-Neuman
Book Title: Bidirectional Optimality Theory
Book Author: Anton Benz Jason Mattausch
Publisher: John Benjamins
Linguistic Field(s): Linguistic Theories
Issue Number: 24.2938

EDITORS: Benz, Anton and Jason Mattausch
TITLE: Bidirectional Optimality Theory
PUBLISHER: John Benjamins Publishing Company
SERIES TITLE: Linguistik Aktuell/ Linguistics Today 180
YEAR: 2011

Diane Lesley-Neuman, University of the Gambia


This volume represents the state of the art in Bidirectional Optimality Theory (BiOT), a field that grew out of Optimality Theory (OT), a theory of human linguistic competence in phonology and syntax, and from the concept of bidirectionality in the perception and production of speech. It was later extended to the analysis of production and interpretation in the field of semantics, fusing with the subfield of Radical Pragmatics. The book consists of ten chapters covering various sub-disciplines in linguistics.

The introductory chapter by editors Benz and Mattausch, “Bidirectional Optimality Theory: An Introduction,” presents theoretical background and an overview of the origins and evolution of BiOT. They introduce OT through examples from Archangeli (1997), and follow with the notion of bidirectional optimization, a combination of generative and interpretive optimization that widens the applicability of the theory to semantics and pragmatics. Necessarily, the topic turns to stochastic OT and Boersma’s (1998) Gradual Learning Algorithm (GLA) to explain Jäger’s (2004) bidirectional variant, known as BiGLA. BiGLA differs from GLA in that the learning algorithm has recoverability factored in. This is made possible through the definition of asymmetric bidirectional optimization. Candidate forms can be disqualified when they are not optimally recoverable as the intended meaning and at least one other form is; the learner evaluates the candidate forms with respect to a hypothetical grammar and the meanings of the other candidates. Evaluation takes place as the observed form and meaning, f and m, are compared with those generated by the hypothetical grammar, f’ and m’. When there is a mismatch between these pairs, learning takes place as the constraints of the learner’s grammar are adjusted. They next present Jäger’s proposal to combine BiGLA with the Iterated Learning Model (ILM) of language evolution (Kirby & Hurford, 1997) to create a modeling system called Evolutionary OT. The model takes each generation of learners to be one cycle of language evolution and, by applying a learning algorithm to the output of one cycle, produces subsequent cycles. They then explain Jäger’s subsequent work with iconicity constraints, which are derived from harmony scales, and Mattausch’s own Evolutionary OT proposal using bias and markedness constraints, which show improved results. They then describe how game theory was incorporated into the field.
OT is a theory about how grammars are cognitively represented, and was not originally designed to be a theory of strategic interaction. But as communicative success is a criterion for selecting optimal form-meaning pairs, strategic interaction, and therefore game-theoretic modeling, must be incorporated into it. This has led to proposals that BiOT should be regarded as the study of emerging conventions in language (Van Rooij, 2004, 2009).
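The mismatch-driven adjustment summarized above for BiGLA can be illustrated with a minimal sketch. It is not Jäger's implementation: the constraint names FAITH and MARK, the plasticity value, and the function name are hypothetical, and candidate generation and evaluation noise are left out. It shows only a GLA-style step in which constraints that penalize the observed pair (f, m) are demoted and constraints that penalize the learner's own pair (f’, m’) are promoted.

```python
def gla_style_update(ranking, observed, hypothesized, plasticity=0.5):
    """Return adjusted ranking values after comparing two form-meaning pairs.

    ranking:      dict, constraint name -> ranking value
    observed:     dict, constraint name -> violations of the observed pair (f, m)
    hypothesized: dict, constraint name -> violations of the learner's pair (f', m')
    All names and values here are illustrative assumptions.
    """
    updated = dict(ranking)
    for name in ranking:
        obs = observed.get(name, 0)
        hyp = hypothesized.get(name, 0)
        if obs > hyp:
            # The constraint penalizes the observed pair more: demote it.
            updated[name] -= plasticity
        elif hyp > obs:
            # The constraint penalizes the learner's own pair more: promote it.
            updated[name] += plasticity
    return updated


# Hypothetical example: the learner's pair violates FAITH, the observed pair violates MARK.
start = {"FAITH": 10.0, "MARK": 10.0}
after = gla_style_update(start, observed={"MARK": 1}, hypothesized={"FAITH": 1})
```

When the two pairs have identical violation profiles, no ranking value changes, which corresponds to the case in which no learning takes place.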

The second chapter, by Paul Boersma, “A programme for bidirectional phonology and phonetics and their acquisition and evolution”, outlines a research program with the goal of achieving explanatory adequacy through whole language simulations. He presents a model of grammar and processing at three levels of representation. The level of phonetic representations consists of both the auditory and articulatory forms, under which sensorimotor and articulatory constraints are hierarchically organized. Cue constraints connect to the auditory form but are found under the second level of the hierarchy: that of the phonological representations. This level consists of the underlying and surface forms. Cue, structural and faithfulness constraints are organized under the surface form, while the faithfulness constraints also connect to the underlying form, which in turn is connected to the next higher level, that of semantic representations, through the lexical constraints. The level of the semantic representations consists of morphemes and context. Morphemes hierarchically organize the lexical and semantic constraints, the latter of which are linked to the context. Bidirectional processing takes place as the speaker changes the context through the creation of meaningful speech. In production, the process travels down the hierarchy to the lowest level: that of the articulatory form of phonetic representations. The listener begins with the auditory form and travels up through the hierarchy to arrive at the act of comprehension, in which the change in context occurs at the moment of perception. Constraints evaluate candidates at either a single level of representation or at the interfaces between two levels. The author discusses sample constraints on bidirectional processing and learning in triplets and quadruplets of form along the hierarchy of the different levels of representation.

The third chapter, by Jason Mattausch, “A note on the emergence of subject salience,” deals with the prevalence of discourse anaphor resolution from subject antecedents. The explanation he provides takes an evolutionary perspective and is spelled out and implemented in the stochastic version of BiOT. He first defines subject salience in anaphor resolution using Rule 1 of Centering Theory (CT) (Grosz et al., 1995) and the algorithm of Walker et al. (1998), and proceeds to discuss Beaver’s (2004) BiOT version. In his opinion, the deficiency of Beaver’s account and its subsequent revisions lies in their lack of explanatory adequacy, in that innate constraints mandating pronominalization or defining topicality are not warranted or plausible. The results of CT should be the result of constraint interactions over generations of learners, due to constraints on faithfulness and economy factored in with those derived from CT. Mattausch employs as the starting probability of anaphoric reference that which was derived from a corpus of sentences from a popular fairy tale. With the BiGLA and ILM, he simulates the exposure of 100 generations of learners to data that are modified with each subsequent generation. The grammar stabilizes according to Rule 1 of CT: the subject of the previous sentence is the most likely antecedent of the discourse anaphor in the present sentence.

The fourth chapter, by Petra Hendriks and Jacolien van Rij, “Language acquisition and language change in Bidirectional Optimality Theory”, compares Mattausch’s (2004) diachronic study of the development of pronominal binding from Old English (OE) to Modern English (ME) with Hendriks’ and Spenader’s synchronic account of its development in English-speaking children. The study assumes that generational transmission is a key factor in language change. Mattausch employed a computational model assuming that statistically more frequent forms were favored over time, and successfully simulated in twenty generations the Binding Principles A and B of ME through the three diachronic stages under which reflexives are presumed to develop (Levinson, 2000). Hendriks and Spenader created a BiOT constraint system to address the Delay of Principle B Effect: while children produce nouns, pronouns and reflexives in an adult-like fashion, they confuse pronoun and reflexive objects in comprehension until at least age 6;6. Mattausch’s model applied to child language produces a non-existent Delay of Principle A Effect while producing no such effect for Principle B. Likewise, Hendriks’ and Spenader’s constraint system does not model the changes from OE to ME, as it effects this change in only one generation. A revised model of Mattausch and Gülsow (2007) predicts stable states in the grammar for Principles A and B, with Principle B the stronger constraint. This contrasts with Hendriks’ and Spenader’s account that characterizes Principle A as a grammatical constraint while Principle B is a derived effect. The latter is supported by research with aphasics indicating that Principle B is most vulnerable to breakdown. The authors conclude that the two models under consideration could not be combined into a single model of grammar, suggesting that child language acquisition may not be reliant solely upon statistical language patterns, but reflects internal factors of human cognition.

Peter de Swart addresses differential case-marking in the fifth chapter, “Sense and simplicity: Bidirectionality in differential casemarking,” using Papuan and Tibeto-Burman languages. The author makes the distinction between languages that mark direct objects due to the presence of certain semantic features, which he refers to as ‘local distinguishability’, and languages in which they are marked in cases of ambiguity or comparison between subject and object, which he calls ‘global distinguishability’. The bidirectional model he proposes has the speaker monitoring his production to ensure that his message is recoverable, making a form bidirectionally optimal if it is the least marked form from which the hearer can recover the intended interpretation. The author makes clear that for bidirectional models to account for his data, interpretive optimization must constrain productive optimization. This chapter underscores the importance to linguistic theory of working with marginalized, minority and endangered languages.

“On the interaction of tense, aspect and modality in Dutch” is the sixth chapter, by Richard van Gerrevink and Helen de Hoop. It deals with the fact that the imperfective past form ‘moest betaald worden’ ‘had to be paid’, the present perfect form ‘heeft moeten betalen’ ‘had to pay’ and the past perfect ‘had moeten betalen’ ‘should have paid’ differ in actuality entailment: only the perfective form implies that someone in reality made the payment, and the past perfect implies that the payment had in fact not been made. They propose three constraints representing three factors of interpretation relevant to the forms. The first is FAITHMODAL: A modal verb leads to undetermined factuality status. The second is FAITHPERFECT: Perfective aspect means the eventuality described is completed and thus a fact. The third is FAITHPTI (Faith Past Tense Implicature): The eventuality described is not true at the moment the utterance holds.
They produce a ranking FAITHMODAL >> FAITHPERFECT >> FAITHPTI.

The seventh chapter, by Gerlof Bouma, “Production and comprehension in context: The case of word order freezing” deals with exceptions to word order variation induced by information structure, “word order freezing”. In free word order languages, instances of syncretism of case present possibilities of multiple interpretations of sentences, which in reality do not occur, as in this Russian sentence:

Mat’ ljubit doč’
mother-NOM/ACC love-3s daughter-NOM/ACC
‘Mother loves her daughter.’

This sentence is only given the SVO interpretation of the mother loving her daughter in spite of the fact that the case-marking information in this free word order language makes the OVS interpretation, that of the daughter loving her mother, a possibility. This contrasts with the ambiguity in the following Dutch example, in which there is no preferred option between the SVO and OVS interpretations:

Welk meisje zoent Peter?
which girl kisses Peter
‘Which girl is kissing Peter?’ (SVO)
‘Which girl is Peter kissing?’ (OVS)

To account for ambiguity, or the lack of it, the author utilizes a notion of grammaticality termed stratified strong bidirectionality. It is based on Anttila’s (1997) OT model of variation within languages, in which a language-specific grammar has partial rather than full rankings of constraints. The constraints are placed in strata: between strata, the order of the constraints is fixed, but within strata it is not. A language described by a partial ranking consists of the union of all of the full rankings that correspond to it. Nevertheless, word order is a complex phenomenon dependent upon a number of factors for which there should be separate hierarchies of constraints: topicality, focus, animacy, definiteness of the NP, WH-movement. The author concludes that adequately addressing the topic of word order freezing within BiOT requires more data, a more comprehensive constraint set, and investigation into the factorial typologies of existing constraint sets and those yet to be proposed.
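The idea that a stratified partial ranking stands for the union of its compatible full rankings can be made concrete with a small sketch. The constraints A, B and C, the candidates cand1 and cand2, and the function names are hypothetical placeholders rather than anything taken from the chapter; the evaluation is plain unidirectional OT with lexicographic comparison of violation profiles, not the stratified strong bidirectionality the author develops.

```python
from itertools import permutations, product


def total_rankings(strata):
    """List every full ranking compatible with a stratified partial ranking.

    strata: list of lists of constraint names, highest stratum first.
    """
    per_stratum = [list(permutations(stratum)) for stratum in strata]
    return [
        tuple(name for block in combo for name in block)
        for combo in product(*per_stratum)
    ]


def optimal_candidates(strata, violations):
    """Union of the winners under all compatible full rankings.

    violations: dict, candidate -> dict, constraint name -> violation count.
    """
    winners = set()
    for ranking in total_rankings(strata):
        # Compare violation profiles lexicographically, highest constraint first.
        profile = lambda cand: tuple(violations[cand].get(c, 0) for c in ranking)
        best = min(profile(cand) for cand in violations)
        winners.update(cand for cand in violations if profile(cand) == best)
    return winners


# Hypothetical constraints A, B, C and candidates cand1, cand2.
violations = {"cand1": {"B": 1}, "cand2": {"C": 1}}
free = optimal_candidates([["A"], ["B", "C"]], violations)     # B and C share a stratum
fixed = optimal_candidates([["A"], ["B"], ["C"]], violations)  # B outranks C
```

With B and C in the same stratum both candidates surface, which models variation; once B is fixed above C only one candidate survives, which is the kind of contrast a freezing account has to derive.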

The eighth chapter, by Reinhard Blutner and Anatoli Strigin, “Bidirectional grammar and bidirectional optimization,” presents a general architecture of the human language faculty with three subsystems: the grammar, the conceptual system and the sensorimotor system, while discussing two views of bidirectional optimization: the online processing view in which the conflict between production economy and comprehension is resolved at the moment of the utterance, and the fossilization view in which resolution takes place during language acquisition. They argue that both types of processes occur, but that online bidirectionality is asymmetric: speakers optimize bidirectionally and take the hearer into account, but hearers do not normally take the speaker into account when computing the optimal interpretation. They posit that future research into the interplay between asymmetric online processing and fossilization should be carried out in terms of cognitive economy and cognitive resources: in some cases it is more economical to store information in the long-term memory and retrieve it when required as opposed to computing it online, while in other cases the opposite is true.

The ninth chapter, by Henk Zeevat, “Bayesian interpretation and Optimality Theory,” defends the version of OT in which optimization takes place only in production. It is an asymmetric model in which interpretive optimization is constrained by productive optimization -- in other words, the hearer needs to simulate the speaker’s perspective to interpret the utterance. The advent of the mirror neuron research program provides support for this view as mirror neurons fire during both production and understanding, particularly in imitation of and reaction to the production of others. Contrary to Blutner’s symmetric OT, this model allows for ambiguity, and poses the question as to how simulating the production process can resolve it. The author argues that given an utterance with form F, the hearer tries to find a meaning M for which the conditional probability of M given F, p(M|F), is maximal. By Bayes’s theorem, that is equivalent to maximizing p(M) p(F|M). To calculate p(F|M), the hearer uses his own production grammar. This idea is then applied to phonology, syntax, semantics and pragmatics.
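The step from the conditional probability of a meaning to the product p(M) p(F|M) can be written out as follows. This is the standard Bayesian argument, given here in the review's own symbols F and M; the hat notation for the selected meaning is an added convention.

```latex
\[
\hat{M}
  = \arg\max_{M} \, p(M \mid F)
  = \arg\max_{M} \, \frac{p(M)\, p(F \mid M)}{p(F)}
  = \arg\max_{M} \, p(M)\, p(F \mid M)
\]
```

The last equality holds because p(F) does not depend on M and is positive for any form actually heard, so dividing by it does not change which meaning is maximal.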

In the final chapter, “On bidirectional Optimality Theory for dynamic contexts,” Anton Benz develops a context-sensitive BiOT model that accounts for the asymmetry in knowledge between the speaker and hearer during online communication. It addresses problems created by the fact that the information states of interlocutors are not represented in OT models. Benz does so by proposing two OT systems: one that produces a ranked group of constraints providing for speaker preferences on forms, and another that produces a second ranked group providing for hearer preferences for meanings. These constraint groups are called Blutner structures, and they constitute the combination of BiOT and Dynamic Semantics. Because of the epistemic asymmetry, it is necessary to remove the misleading form-meaning pairs that can lead to the hearer making an ungrammatical choice, a so-called ‘dead end.’


The introductory chapter effectively lays a foundation for understanding the subsequent contributions, since many readers have exposure only to certain variants of OT applied to their own sub-specialties, but lack the knowledge of the whole theory, especially of the particulars of bidirectional optimization and how game theory plays a part.

The second chapter, by proposing a model of bidirectional phonology and phonetics, successfully addresses two problems crucial to the field of phonology and the optimality-theoretic enterprise. The first is the need to organize the proliferation of constraints of different types that have emerged since the advent of Optimality Theory in 1993 into a coherent system more firmly aligned with that of the human language faculty. The second is to theoretically account for phonological phenomena discovered through instrumental measurement and experimental design.

Two questions arise from this model’s presentation. The first is whether there are expedited pathways for the production and comprehension of the exceptional structures found in sound symbolism, such as ideophones, “marked words that depict sensory imagery” (Dingemanse, 2011:3). As noted by Blench (2011), some ideophones, such as reduplicated forms, possess canonical phonological form and morphemic shape, but others do not. A description of how the latter are produced and perceived may require bypassing some of the steps proposed in the BiOT hierarchy.

The second is how this system relates to the results of the last research program that, like this one, endeavored to describe, as the author states, “‘all’ of the phonology” (p. 33) -- lexical phonology and morphology (LPM), or its counterpart in Optimality Theory -- LPM-OT (Kiparsky, 2000), and earlier rule-based analyses. Boersma’s description of phonological-phonetic production makes a contribution to LPM-OT by modeling a parallel process explaining the incorporation of phonetic effects into the grammar. His assessment of the capacity of his constraints to entirely replace other systems may be too cavalier and warrants more careful attention. Despite the author’s assertion of a minimal but comprehensive model, his explanation falls slightly short of its stated ambition, because of its insufficient coverage of the morphology-phonology interface and its failure to recognize or address the rich literature covering a variety of languages from this hybrid theoretical tradition. It is, nonetheless, a foundation for further elaboration and a product of careful research.

The successful simulation of historical data in the third chapter does not explain any preferences for pronominalization that may occur, nor does it completely explain the distribution of discourse anaphora in frequentist/functionalist terms, but it does show how effects such as minimal obliqueness come to be associated with salience. It also shows how fidelity to the basic principles of OT yields greater explanatory power than language-specific constraints or those of brute force that are sometimes marshaled to account for linguistic phenomena by researchers working within an OT framework.

In chapter four, the conclusion by the authors that internal factors of human cognition play a role in child language acquisition rather than it being a matter of statistical language patterns should have led the authors to contemplate their problem within known phenomena of child development. Since the Delay of Principle B Effect is the linguistic counterpart to children’s gradual development in tasks of conservation of volume, number and spatial area (Piaget, 1954), future attempts to model language acquisition should take a combined Vygotskian-Piagetian view: recognizing the Zone of Proximal Development of the adult-influenced linguistic environment (Vygotsky, 1962), and the constraints governing concept internalization within the individual described by Piaget.

As parent-to-child transmission may not play a major role in causing language change, the authors’ attempt to link the models appears to have been based on a faulty assumption. As seen in the work of Labov (1966, 1972) and Eckert (1989), factors of the extra-familial, adolescent and adult world can exert the greatest pressures on the evolution of a speech community. An attempt to merge two models addressing phenomena with different causes and ontologies would logically not be successful, as it was not in this study.

The significance of the findings of the fifth chapter is that they underscore the importance to linguistic theory of working with marginalized, minority and endangered languages.

The authors of the sixth chapter offer no motivation for their constraint ranking, leaving it unclear why FAITHMODAL >> FAITHPERFECT >> FAITHPTI. A more robust explanation of their factorial typology is in order, as it is unclear why “once the optimal ± fact reading has already been paired up with the imperfective modal form, it is no longer available anymore for the present perfect form” (p. 165). There is no explanation as to why the imperfective has precedence in the analysis, and the option of syncretism in meaning is not considered.

In the seventh chapter, the author concludes correctly that adequately addressing the topic of word order freezing within BiOT requires more data, a more comprehensive constraint set, and investigation into the factorial typologies of existing constraint sets and those yet to be proposed.

In the eighth chapter, the positing of asymmetric online bidirectionality by the authors ignores the extent to which the hearer takes the speaker into account. A hearer processes frequency, loudness, intonation, accent, stress and word choice to make decisions about the speaker’s sex, social origins, intentions, meaning, and point of view. Their assertion that evidence is lacking for strong bidirectionality needs to be re-examined by considering relevant sociolinguistic literature and by designing and implementing online comprehension studies that manipulate prosody and sociolinguistic variables, which can play just as much a part of the linguistic communication process as other variables.

In the ninth chapter, the advent of the mirror neuron research program provides support for the views adopted by the author in that mirror neurons fire during both production and understanding, particularly in the imitation of and reaction to the production of others.

For the final chapter, it is unclear how the proposal presented would model genuine cases of misapprehension, or online negotiations of meaning. Models must be able to describe both successful and unsuccessful negotiations of meaning given constraints on production and interpretation.

One deficiency of the book is the need for more careful editing. References given in the articles are sometimes not listed in the reference section, and publication dates for the same references differ among contributors. There are also a significant number of errors in spelling, punctuation, sentence structure and usage. Some even change the factual content of what is being explained. Among them:

p. 21 “m2” should replace “f2” to correctly read “…the unmarked form f1 when in the state m1, and f2 when in the state m2,”
p. 22 “game models also provides us” should be changed to “game models also provide us”
p. 23 “constraint” should be replaced by “constraints”
p. 24 “instable pooling equilibrium” should be replaced by “unstable pooling equilibrium”
p. 25 “...and how they can be learned. Something game theory has nothing to say about.” should be changed to “and how they can be learned, something that game theory has nothing to say about.”
p. 77 “topichood” should be replaced by “topicality”
p. 91 “due to Kirby and Hurford” should be changed to “of Kirby and Hurford”
p. 96 “Pittsburg, PA” should be changed to “Pittsburgh, PA”
p. 105 “computational models can help investigating the causes of” should be changed to “computational models can help in the investigation of the causes of”
p. 152 the word “for” should be added: “the broken window had to be paid for”
p. 178, example 13, “SOV” should be changed to “SVO”
p. 187 “ambiguity avoiding strategy” should be replaced with “ambiguity avoidance strategy”
p. 224 “utterance planer” should be replaced with “utterance planner”
p. 237 “makes it is not easy” should be “makes it not easy”
p. 243 “the proper way of explaining” should be changed to “the proper mode of explanation”
p. 244 “The idea to this article” should be changed to “The idea for this article”


Diane Lesley-Neuman is a humanities lecturer at the University of the Gambia. Her interests lie in semantic change, the evolution of person-marking and the phonetics and phonology of grammaticalization processes and their theoretical expression. Her most recent work, “Morpho-phonological Levels and Grammaticalization in Karimojong: A Review of the Evidence,” was recently published in Studies in African Linguistics. Her 2007 Master’s thesis posited a stratal OT model for the Karimojong language, and, until recently, she has focused on studying [ATR] harmony as a tool for historical reconstruction in Nilotic. She is currently conducting her Ph.D. dissertation fieldwork on grammatical features and dialectal variation in West African languages.
