Date: Sun, 22 Aug 2004 20:41:13 -0400 (EDT)
From: Smaranda Muresan
Subject: Dynamical Grammar: Minimalism, acquisition, and change
AUTHOR: Culicover, Peter W.; Nowak, Andrzej
TITLE: Dynamical Grammar
SUBTITLE: Minimalism, acquisition, and change
PUBLISHER: Oxford University Press
YEAR: 2003
Smaranda Muresan, Natural Language Processing Group, Department of Computer Science, Columbia University
OVERVIEW ''Dynamical Grammar'' is the second volume of a two-volume work on ''Foundations of Syntax'', offering a new perspective, minimalist and dynamical, on language acquisition and language change. While the main goal of the first volume (Culicover 1999) was to investigate the properties of language as bounding conditions on the learning mechanism, the current book explores the actual architecture of the learner and the linguistic theory most compatible with the facts of language.
The book is addressed mainly to linguists, psycholinguists and cognitive scientists. However, computational linguists interested in language learning can greatly benefit from the material presented in this book.
SYNOPSIS The book comprises three parts: Foundations (Chapters 1-2), Simulations (Chapters 3-6) and Grammar (Chapter 7).
Part I presents a minimalist, dynamical approach to language acquisition (Chapter 1) and discusses how the link between linguistic theory and language acquisition can be rethought in this dynamical view (Chapter 2).
In Chapter 1, ''The Dynamical Perspective'', the authors argue in favor of a minimalist, dynamical approach to language acquisition, language evolution and language processing. It is minimalist because it looks for the minimal formal machinery and prior knowledge that the learner needs in order to acquire language. It is dynamical because it considers the architecture of the language faculty to be a dynamical system. The authors discuss Concrete Minimalism (Culicover 1999) as a linguistic theory tightly linked to the dynamical approach. They present in general terms how language acquisition can be modeled by an adaptive, dynamical system: forms and meanings are represented as trajectories in the linguistic space; trajectories are acquired individually; similar trajectories are grouped into flows, and generalizations emerge from these groupings.
Chapter 2 challenges the traditional view of the link between language acquisition and linguistic theory, with respect to both ''how'' language is acquired and ''what'' is actually acquired. ''The central thesis of this chapter is that current formal grammatical descriptions of adult language do not offer the proper vocabulary for describing the course of language acquisition'' (p. 23). The authors present a critical analysis of Principles and Parameters Theory (PPT), considering several issues regarding parameters, parsing and triggers, mistakes, idioms and irregularities. They argue against an innate skeletal grammar with default values for all the parameters, as PPT assumes, and propose that ''the learner does have something structured that it draws upon in establishing correspondences between sounds and meaning, namely, the infinite inventory of meanings, the capacity to generalize and knowledge of what constitutes warranted generalization, and the capacity to extract formal regularities from the language that it is exposed to'' (p. 42).
Part II presents several computer simulations for testing various hypotheses about language acquisition (Chapters 3, 4 and 5) and language change (Chapter 6).
The simulations in Chapters 3, 4 and 5 search for the minimal mechanisms the learner needs in order to acquire language. The conclusion is that neither distributional nor grammatical approaches alone can account for language acquisition, but that each of them plays a role. All simulations used artificial data sets as well as transcripts from the CHILDES database (MacWhinney 1995). The mechanisms contained in the system were systematically varied in order to explore what features of the grammar the learner acquires under different assumptions.
Chapter 3, which focuses on the distributional approach to language learning, concludes that the ability to acquire regularities in language statistically is not sufficient for grammar learning. The authors present their system, Aqui, which is a dynamical system in which there is no meaning associated with words or sentences. The authors also analyze the amount and type of knowledge in the input data itself. They present Clagen, a genetic algorithm system that identifies optimal clustering in a corpus, based on distributional information. Quantitative and qualitative analyses show that the clustering method discovers mainly semantic co-occurrence restrictions in the data, but no syntactic structure. The results of Aqui and Clagen suggest that learning grammar requires additional information, which the authors argue is meaning.
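To make the notion of ''distributional information'' concrete, here is a minimal Python sketch. It is not Aqui or Clagen (the latter is a genetic algorithm, which this sketch does not attempt to reproduce); it only illustrates, on an invented toy corpus, how words can be compared by the neighbors they occur with. The function names and the corpus are assumptions made for illustration.

```python
from collections import Counter, defaultdict
from math import sqrt

def context_vectors(sentences):
    """Map each word to counts of its immediate left and right neighbors."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        # Pad with boundary markers so every word has two neighbors.
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for i in range(1, len(tokens) - 1):
            vectors[tokens[i]][("L", tokens[i - 1])] += 1
            vectors[tokens[i]][("R", tokens[i + 1])] += 1
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Invented toy corpus (hypothetical data, not from the book or CHILDES).
corpus = [
    "the dog eats",
    "the cat eats",
    "the dog sleeps",
    "the cat sleeps",
]
vectors = context_vectors(corpus)
sim_nouns = cosine(vectors["dog"], vectors["cat"])   # words sharing all contexts
sim_mixed = cosine(vectors["dog"], vectors["eats"])  # words sharing no contexts
```

In this toy setting ''dog'' and ''cat'' come out as maximally similar, while ''dog'' and ''eats'' share no contexts. Such groupings reflect co-occurrence patterns only; nothing in the vectors encodes phrase structure, which is in the spirit of the chapter's conclusion.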
Chapters 4 and 5 explore whether access to meaning is not only necessary but also sufficient for grammar learning, given that the system has mechanisms for finding associations between meaning and form.
Chapter 4 presents CAMiLLe (Conservative Attentive Minimalist Language Learner), a system which implements the theory of Concrete Minimalism in the form of a dynamical system. Chapter 5 presents experiments done with this system. As nicely stated in Part III, the key properties of CAMiLLe are: ''i) the learner is presented minimally with strings and their associated conceptual structures, ii) the learner is Conservative in formulating hypotheses and in generalization, iii) the learner is Attentive to detail, iv) the learner formulates a set correlations that expresses the pairings of strings and their conceptual structures'' (p. 241).
Chapter 4 starts with a presentation of CAMiLLe's properties, derived both from the theory of Concrete Minimalism and from computational considerations (p. 103): e.g., access to meaning in the form of Jackendoff's Conceptual Structures; the capacity to form categories without prior information about what these categories are; access to the notions of ''phrase'' and ''head of a phrase''; and so on.
CAMiLLe has two representation systems: one for meaning and one for sentences, corresponding to classical semantics and syntax. The authors present the dynamical view of these systems, and explain what language learning means in this approach: finding couplings between trajectories of the syntactic and the semantic system. This is equivalent to finding correspondence rules between form and meaning. Generalization is seen as a self-organization of the language space based on repeated experience.
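The idea of finding correspondences between form and meaning can be illustrated with a small Python sketch. This is not CAMiLLe's algorithm, which the review describes only in general terms; it is a simple co-occurrence learner scored with the Dice coefficient, a technique swapped in for illustration. The flat concept labels (far simpler than Conceptual Structures), the data and all names are hypothetical.

```python
from collections import Counter, defaultdict

def learn_associations(pairs):
    """Count how often each word co-occurs with each concept label."""
    cooc = defaultdict(Counter)
    word_count = Counter()
    concept_count = Counter()
    for utterance, concepts in pairs:
        words = set(utterance.split())
        for w in words:
            word_count[w] += 1
        for c in concepts:
            concept_count[c] += 1
        for w in words:
            for c in concepts:
                cooc[w][c] += 1
    return cooc, word_count, concept_count

def best_concept(word, cooc, word_count, concept_count):
    """Return the concept with the highest Dice score for the given word."""
    scores = {
        c: 2 * n / (word_count[word] + concept_count[c])
        for c, n in cooc[word].items()
    }
    return max(scores, key=scores.get)

# Hypothetical string/meaning pairs; the concept labels are invented.
data = [
    ("the dog runs", ("DOG", "RUN")),
    ("the cat runs", ("CAT", "RUN")),
    ("the dog sleeps", ("DOG", "SLEEP")),
    ("the cat sleeps", ("CAT", "SLEEP")),
]
cooc, word_count, concept_count = learn_associations(data)
lexicon = {
    w: best_concept(w, cooc, word_count, concept_count)
    for w in ("dog", "cat", "runs", "sleeps")
}
```

On this toy data each content word ends up paired with its own concept, because only that pairing holds across every utterance in which the word appears. The word ''the'' co-occurs equally with all concepts and so receives no distinctive pairing, which is why it is left out of the lexicon here.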
Parsing is what permits CAMiLLe to identify syntactic structure. The sentence is not just a string of words; it has a structure associated with it, which emerges from contracting the strings of words to their heads. The authors present the characteristics of parsing and briefly discuss the treatment of embedded clauses, the representation of phrase structure, and transformations, emphasizing that CAMiLLe is attentive to linear order. How the system deals with null arguments, transformations and word order is presented in Chapter 5.
Chapter 5 discusses some experiments with CAMiLLe, showing initial results and pointing to future research. The authors organize the chapter into three main parts corresponding to experiments regarding the lexicon, the phrase structure, and the word order.
For the lexicon, experiments are done for nouns, compound nouns, verbs and verbal inflection, and synonymy/ambiguity. The authors point out in the section ''Dummy Semantics'' that a realistic simulation would need larger data sets, which are hard to build. They present as an alternative the use of dummy semantics. For the lexicon, they present experiments with flat dummy semantics, but they acknowledge that this is not sufficient to learn structure and hint in a footnote that a different kind of dummy semantics will be needed: bracketed strings.
The experiments regarding structure look at the discovery of DP structure (determiners, modifiers, argument structure), showing several positive results, as well as problematic phenomena where further experimentation is needed. In these cases the authors discuss the possible reasons for failure: there is not enough information, the task is difficult, or the learner needs to be enhanced.
Regarding word order, they explore word order correlates of argument structure, scrambling, inversion, wh-movement and null arguments. An interesting point is the reinforcement of Elman's discovery (Elman 1993) that too much complex information at the outset may confuse the learner, leading it to spurious generalizations.
The last two sections of this chapter, ''Extending CAMiLLe'' and ''Preconditions for language acquisition by CAMiLLe'', offer a remarkable discussion of the overall results of the simulations, showing the characteristics of CAMiLLe, its performance and future directions.
Chapter 6 reports on simulations of language change and language evolution from the dynamical perspective. This chapter elaborates on the authors' previous papers (Culicover et al. 2003; Culicover and Nowak 2003) and presents them within a unified framework. The authors argue that the same dynamical system architecture and self-organization mechanism are suitable for both language acquisition and language change. Section 6.2 presents a computer simulation of language change (extending the material from Culicover et al. 2003), while Section 6.3 presents several extensions to the model (including bias and variations across lexical populations) and discusses the issue of computational complexity (a summary of the discussion in Culicover and Nowak 2003). They argue that ''the major grammatical constraint that guides the direction of change is the computational complexity of the sound-meaning correspondence'' (p. 22).
Part III (Chapter 7) presents ''Concrete Minimalism'' as a link between syntactic theory and the dynamical perspective on language acquisition and language change. This concluding chapter crowns the monumental two-volume work on the Foundations of Syntax. The chapter has two parts.
Part one (Section 7.1) presents the major formal design features of language and how they can be represented in a dynamical system that conforms to Concrete Minimalism. This part is a very detailed presentation of the dynamical system, giving a more formal definition (p. 246) and showing in greater detail how language, and in particular syntax, can be represented as a dynamical system (lexical categories, phrasal categories (endocentric, exocentric), movement and recursion).
Part two (Sections 7.2-7.8) shows how Concrete Minimalism allows for ''descriptively adequate syntactic descriptions, and following Occam's Razor, is to be preferred to other syntactic theories that invoke more abstract structure'' (p. 22). In this approach, linear order is considered a primitive in the grammar. The phenomena covered are some of the most prominent ones studied in the context of PPT: head-complement order, V raising to I, V2 and inversion, null arguments, wh-movement and scrambling. The authors present an analytic discussion of both the traditional approaches and the Concrete Minimalism account of these phenomena. The main conclusion is that Concrete Minimalism, coupled with a theory of markedness, can explain these phenomena and should be preferred as the simpler approach.
CRITICAL EVALUATION ''Dynamical Grammar'' can be recommended on several levels, being in my opinion one of those revolutionary and fundamental books that have an interdisciplinary impact, moving research on language acquisition and language change to the next level. The book provides an excellent discussion of related work in linguistics, cognitive science and computational linguistics.
The authors evaluate this work to be ''a model of qualitatively understanding of how language acquisition may be represented as a self-organization of a dynamical system'' (p. 189). However, I see two dimensions, besides the unified framework, along which the book can be appreciated and evaluated: the linguistic theory (Concrete Minimalism) and the learning theory (Dynamical Systems). I found Chapter 7 to be a very powerful chapter that ties them together.
An aspect of Concrete Minimalism that both linguists and computational linguists might find appealing is the tendency towards a simpler syntax. In this view, the grammar explicitly states correspondences between form and meaning. The simulations with CAMiLLe, and the detailed presentation of its properties (p. 103 and p. 190), give valuable insight into what might be the minimal information a learner needs in order to acquire language. The book presents consistent positive results that support the authors' general approach. What I also found very useful is the discussion of the limitations of the current implementation and how future research can address them: e.g., CAMiLLe does not currently generalize to basic categories. The questions that arise are: will additional mechanisms be needed, and would they be sufficient, or would the learner need this information to start with? The authors argue for the first solution and present a brief discussion. Another area for further research, as the authors themselves point out, is the generalization mechanism (i.e., should the learner be conservative, or should it allow overgeneralization and then supply an error-correction mechanism).
Overall, I found this book to be a thought-provoking read, rich in both theoretical arguments and experimental validation. I expect this work to open several doors not only for work in linguistic theory and cognitive science, but also for computational linguistics focusing on grammar induction and language understanding. Moreover, this book is a clear example of how valuable interdisciplinary work is for studying language acquisition and language change. In general, as computer simulations become a bigger part of the approaches taken to studying and testing theories of language acquisition and language evolution (see also Briscoe (2002)), I think closer collaboration between linguists and cognitive scientists on the one hand, and computational linguists on the other, can lead to further developments.
REFERENCES Briscoe, Ted (editor) (2002). ''Language Evolution through Language Acquisition: Formal and Computational Models''. Cambridge University Press, Cambridge, UK.
Culicover, Peter W. (1999). ''Syntactic Nuts: Hard Cases, Syntactic Theory, and Language Acquisition''. Oxford University Press. Volume One of Foundations of Syntax.
Culicover, Peter W. and Andrzej Nowak (2003). ''Markedness, Antisymmetry and Complexity of Constructions''. In Pierre Pica and Johan Rooryck, eds. Variation Yearbook. John Benjamins, Amsterdam.
Culicover, Peter W., Andrzej Nowak, and Wojciech Borkowski (2003). ''Linguistic Theory, Explanation and the Dynamics of Grammar''. In John Moore and Maria Polinsky, eds. Explanation in Linguistic Theory. CSLI Press, Stanford, CA.
Elman, J. L. (1993). ''Learning and Development in Neural Networks: The Importance of Starting Small''. Cognition, 48:71-99.
MacWhinney, B. (1995). ''The CHILDES Project: Tools for Analyzing Talk''. Lawrence Erlbaum Associates, Hillsdale, NJ.
ABOUT THE REVIEWER Smaranda Muresan is a PhD Candidate in the Natural Language Processing Group, Department of Computer Science, Columbia University. Her research interests are in the field of computational linguistics, including grammar learning and natural language understanding. In her PhD thesis, she proposes a relational learning framework for the induction of grammars able to capture both aspects of syntax and semantics. She uses a domain ontology during the learning process as a grammar semantic constraint.