Date: Mon, 09 May 2005 11:06:18 +0200
From: Michael Zock
Subject: Multidisciplinary Approaches to Language Production
EDITORS: Pechmann, Thomas; Habel, Christopher
TITLE: Multidisciplinary Approaches to Language Production
SERIES: Trends in Linguistics: Studies and Monographs 157
PUBLISHER: Mouton de Gruyter
YEAR: 2004
Michael Zock, Director of Research, LIMSI-CNRS (Orsay, France)
This book deals with natural language production, that is, the translation of a communicative intention (goal) into language. While language production may cover a wide range of phenomena (words, sentences, text), involve different tasks (sentence or paraphrase generation, translation) and take place in different modes, nearly all of the work presented in this volume is confined to the production of spoken forms (words or sentences).
Speaking is a complex process. It involves many tasks (choice of a topic, sentence frame and words; morphological and acoustic operations), draws on various knowledge sources (encyclopedia, dictionary, grammar), and is carried out under severe space (memory) and time (speed) constraints. Yet people succeed amazingly well. How is this possible? Since answering such a complex question is a gigantic enterprise, the editors of this book convinced a funding agency (DFG) to support their quest and had nearly two dozen teams spread all over Germany (17 universities) work on the same topic.
The book presented here summarizes the outcome of this effort, a six-year priority program funded by the German Research Foundation, which aimed to bring together researchers with different backgrounds (linguistics, psychology, computer science, neuroscience) but with a common goal: to understand the cognitive processes underlying language production. The focus of their work is on empirical explanation (evidence) rather than on formal representations. The unifying framework, or the big picture within which this work was carried out, is W. Levelt's book Speaking (Levelt, 1989), a true masterpiece.
To account for the information processing, Levelt conceives an architecture composed of three serially ordered, relatively autonomous main components (conceptualizer, formulator, articulator). These modules are in charge of message generation, grammatical encoding, phonological encoding, and articulation. In addition there is a feedback loop: the speaker is listening to himself.
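The serial architecture just described can be pictured as a simple pipeline. The following is only a toy sketch under strong assumptions, not Levelt's model itself: the function names mirror his component names, but the Message fields, the fixed subject-verb-object frame, the lowercasing that stands in for phonological encoding, and the monitor check are all invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Message:
    """Toy preverbal message (hypothetical fields, for illustration only)."""
    agent: str
    action: str
    patient: str


def conceptualizer(intention: dict) -> Message:
    # Message generation: decide what to say, given a communicative intention.
    return Message(intention["agent"], intention["action"], intention["patient"])


def formulator(message: Message) -> List[str]:
    # Grammatical encoding: place the concepts in a fixed subject-verb-object frame.
    words = [message.agent, message.action, message.patient]
    # Phonological encoding: lowercasing is a placeholder for computing word forms.
    return [word.lower() for word in words]


def articulator(plan: List[str]) -> str:
    # Articulation: turn the plan into overt output (here, a plain string).
    return " ".join(plan)


def monitor(utterance: str, message: Message) -> bool:
    # Feedback loop: the speaker "listens" to the output and checks that
    # every intended element was actually expressed.
    heard = utterance.split()
    intended = (message.agent, message.action, message.patient)
    return all(part.lower() in heard for part in intended)


def speak(intention: dict) -> Tuple[str, bool]:
    # The components run strictly one after the other (serial processing).
    message = conceptualizer(intention)
    utterance = articulator(formulator(message))
    return utterance, monitor(utterance, message)
```

Because each stage only consumes the output of the previous one, the sketch is strictly serial; the cascaded and interactive alternatives discussed in several papers of the volume would let later stages start, or feed back, before earlier ones have finished.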
The papers of the book (17, plus a preface by the editors and an introduction by M. Garrett, a pioneer in language generation research) can be placed within the chart (or architecture) proposed by Levelt.
The first two papers fall into the message planning component. The paper by Guhe et al. (Incremental generation of interconnected preverbal messages) deals with the conceptual preparation for describing a scene. Since the scene is composed of various objects (airplanes), and since it changes all the time, the subjects have to decide which elements to include in their message, how to combine the to-be-expressed events and how to express the message. Hence, issues of coherence and coreference (pronouns) have to be addressed. In addition, in order to address the dynamic aspect of the situation, planning is done incrementally.
The next paper by Gardent et al. (Generating definite descriptions, non-incrementality, inference, and data) also deals with incrementality, or rather, its opposite. The authors show quite convincingly some of the shortcomings of overly strict incremental processing: verbalizing some planned content too early (here, the definite descriptions) may yield hilariously complex sentences, whereas delayed verbalization would have resulted in quite natural output. One of the interesting features of this work is that it takes corpus data into account.
The paper by Harbusch and Woch (Integrated natural language generation with schema-tree adjoining grammars) deals with the problem of easing the integration of two basically very different kinds of information, conceptual and linguistic. In order to do so, they resort to a unification mechanism, tree adjoining grammars.
Klabunde and Glatz (On the production of focus) address the hairy issue of focus. While the notion is intuitively clear and without any doubt important (noun vs. pronoun, active vs. passive voice), there is a lot of disagreement about it. To be understandable, discourse is embedded in a situation; hence part of the message is old, while the rest is new, or in focus. Unfortunately, focussed and new information are not always identical.
Tappe et al. (Thematic information, argument structure, and discourse adaptation in language production) address the issue of thematic role assignment. According to them, a thematic processor is needed to interface the conceptual and linguistic components. In other words, they suggest adding a thematic processor to the existing architectures.
Kempen and Harbusch (A corpus study into word order variation in German subordinate clauses: Animacy affects linearization independently of grammatical function assignment) resort to a corpus analysis in order to account for word order preferences in case languages like German. Their findings are likely to have consequences for generation architectures or processing strategies. The authors suggest computing grammatical functions and linear order simultaneously; in the Garrett and Levelt models this was always done serially, the former preceding the latter.
The next paper by Carroll et al. (The language and thought debate: A psycholinguistic approach) addresses an age-old problem which, surprisingly, has hardly ever been addressed within the framework of language production, where it is highly relevant. Nearly all systems are based on the assumption that the message on which the formulator works is specific enough for it to do its job, yet the language itself may have its say. Indeed, cross-linguistic comparisons allow the authors to show the kind of influence a given language may have on the preverbal message.
Leuninger et al. (The impact of modality on language production: Evidence from slips of the tongue and hand) study whether sign languages obey rules similar to those of spoken languages. Their results reveal amazing similarities in terms of error typology.
The next few papers clearly fall into Levelt's second component, the formulator. Pechmann and Zerbst (Syntactic constraints on lexical access in language production) summarize their work on lexical access, using a well-known technique, the picture-word paradigm. Their results suggest changes in the generation architecture, allowing for cascaded rather than strictly serial processing.
Blanken et al. address similar problems (The dissolution of word production in aphasia: Implications for normal functions). They reach quite similar conclusions, even though their data are based on a very different population, brain-damaged people.
Schade (The benefits of local-connectionist production) also reaches a similar conclusion, though his is based on a computer simulation. He shows how an interactive model can handle several problems that the Levelt model does not address at all. This approach has two interesting features. First, the system can be tuned, little by little, to accommodate the empirical data. Second, the different models compete. Hence, instead of preferring a model on a priori grounds, the choice can be based on the result of the competition.
The next two papers report work based on brain activity measures. Jansma et al. (Electrophysiological studies of speech production) use electrophysiological methods to study language production, while Dogil et al. (Brain dynamics induced by language production) use brain imaging tools. They provide evidence for localization effects concerning syntax and semantics.
The next two papers deal with morphology. Boelte et al. (Morphology in experimental speech production research) ask whether word forms are computed or simply accessed, since they are already stored.
Unlike the authors of the preceding paper, who addressed three kinds of morphological problems (derivation, inflection and compounding), Janssen et al. (Morphological encoding and morphological structures in German) focus only on the processing of inflections. Noticing important differences between the processing of German and Dutch, they try to find explanations for these facts in the structure of the two languages.
Weingarten et al. (Morphemes, syllables, and graphemes in written word production) address a problem a bit outside of the Levelt paradigm, as it concerns written word production. Yet it seems that there are some similarities between the processing of the spoken and the written word.
Finally, Hamm and Bredenkamp (Working memory and slips of the tongue) show some of the effects that working memory has on the (mis)functioning of the production system. Memory constraints can be said to be responsible for certain kinds of breakdowns, or speech impairments, like sound exchange errors.
Launching and keeping alive a project like this (raising the funding for 20 teams for 6 years) deserves respect. Also, a book on language production from scholars with such a diversity of backgrounds is highly welcome, since there are not many books of this kind. Even though I'm not a newcomer to the field, I learned many things by reading the book, and I found nearly all the papers to be of excellent quality.
This being said, the book also has a few shortcomings. While I think that the editors have done a fantastic job in bringing this project to life, and while they've also done a great job as authors, I don't think that they've succeeded equally well as editors. Here are some of the reasons why I think so.
Lack of guidance for the reader: (a) The table of contents is not structured, yet this would have been quite easy to do. In the absence of section titles grouping papers of the same kind under the same heading, a newcomer will perceive the papers as just an unordered collection. The field is so complex and wide that without guidance the reader is likely to get lost, or to miss the big picture.
(b) The index is done by hand. I've checked quite a few entries, each of which listed only one or two occurrences, yet the terms occurred more than half a dozen times. Since the papers were submitted electronically, building an index automatically would have been quite easy.
(c) Given that all authors cast their work in Levelt's paradigm, and since not every reader can be expected to know his book, it would have been useful to resort to one of the following solutions: use one of Levelt's papers describing his framework, have Levelt contribute such a chapter, or have the editors write it. The space this takes could at least partially be recovered, since authors could then refer to the introduction rather than describing, one after another, basically the same aspects of Levelt's work. Such a chapter would serve as an advance organizer, guiding the reader. Another way of gaining back some space would be to reduce the length of some of the papers that are really very long (50-60 pages).
(d) The editors mention several workshops that took place during the 6 years of the project, yet none of these discussions (what were the problems, the achievements) is in any way visible in this volume.
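To make point (b) concrete, here is a minimal sketch of how an index could be derived automatically from electronic text. It assumes the page texts are available as a mapping from page number to string and that the list of index terms is supplied by hand; build_index and its parameters are hypothetical names, and a real indexer would also have to handle inflected forms and cross-references.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List


def build_index(pages: Dict[int, str], terms: Iterable[str]) -> Dict[str, List[int]]:
    """Map each term to the sorted list of page numbers on which it occurs."""
    index = defaultdict(set)
    for term in terms:
        # Whole-word, case-insensitive match of the literal term.
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        for page_number, text in pages.items():
            if pattern.search(text):
                index[term].add(page_number)
    # Terms that never occur are simply left out of the result.
    return {term: sorted(numbers) for term, numbers in index.items()}
```

Run over every page of the volume, such a procedure lists all pages on which a term occurs, rather than the one or two a manual pass happens to catch.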
References: To ease access to the references, it would have been better to group them all at the end of the book. This would have saved space, which could have been used profitably for a glossary. While the diversity of approaches is certainly stimulating, it can also be overwhelming. Not everyone has the background to understand all the work, techniques or terminology.
Integration of the work with other work in psychology and computational linguistics: While no other book matches Levelt's landmark work, there are a lot of books that contain useful, complementary information, some in psychology and a lot in computational linguistics (at least a dozen). For some pointers see Bateman & Zock (2003: 301) and Zock & Adorni (1996).
It is really surprising that the Reiter and Dale book, which plays a role in the ''natural language generation community'' similar to that of Levelt's book in psychology, is mentioned only once in this book. Also, mentioning T. Dijkstra & K. de Smedt's (1996) book (with a foreword by P. Levelt) would have been apt, as it tries to accommodate work from different backgrounds in a common framework.
Even more surprising is the fact that nothing is said about the other community working on language production (they call their field ''text generation''). This community is very dynamic and productive (it has produced at least a dozen books over the last ten years), has a website (http://www.siggen.org/index.html), holds an international conference every year, and integrates people from many horizons (as a matter of fact, psychologists like Kempen, Harley, Pechmann and Roelofs have presented their work in this framework).
One last point. For the newcomer it may be startling that ''natural language production'' is confined to sentence production only. Yet, again, there is a whole literature on text and discourse production, from both a computational and a psycholinguistic perspective (see Andriessen et al., 1996; de Beaugrande, 1984; Fayol, 1997; Flower & Hayes, 1980).
Despite all these criticisms, I maintain the respect that the editors of this volume deserve, for the quality of the final product and for the enormous work they put into making the project succeed.
Andriessen, J., de Smedt, K., & Zock, M. (1996). Discourse planning: Empirical research and computer models. In T. Dijkstra & K. de Smedt (Eds.), Computational psycholinguistics: AI and connectionist models of human language processing (pp. 247-278). London: Taylor & Francis.
Bateman, J., & Zock, M. (2003). Natural language generation. In R. Mitkov (Ed.), The Oxford handbook of computational linguistics (pp. 284-304). London: Oxford University Press.
de Beaugrande, R. (1984). Text production: Toward a science of composition. Norwood, New Jersey: Ablex.
de Smedt, K., Horacek, H., & Zock, M. (1996). Architectures for natural language generation: Problems and perspectives. In G. Adorni & M. Zock (Eds.), Trends in natural language generation: An artificial intelligence perspective (pp. 17-46). New York: Springer Verlag, Lecture Notes in Artificial Intelligence 1036.
Fayol, M. (1997). Des idées au texte: psychologie cognitive de la production verbale, orale et écrite. Paris: Presses Universitaires de France.
Flower, L., & Hayes, J. (1980). The dynamics of composing: Making plans and juggling constraints. In Gregg & Steinberg (Eds.), Cognitive processes in writing. Hillsdale, New Jersey: Erlbaum.
Reiter, E., & Dale, R. (2000). Building natural language generation systems. London: Cambridge University Press.
Zock, M., & Adorni, G. (1996). Introduction. In G. Adorni & M. Zock (Eds.), Trends in natural language generation: An artificial intelligence perspective (pp. 1-16). Heidelberg: Springer Verlag, Lecture Notes in Artificial Intelligence 1036.
ABOUT THE REVIEWER:
Michael Zock holds a Ph.D. in experimental psychology. He is currently research director at LIMSI-CNRS (Orsay, France). Having launched the European Workshop on Natural Language Generation (1987, Royaumont), he has edited several books on generation. His major research interests lie in building tools to support people producing, or learning to produce, language. His recent work is devoted to building extensions to electronic dictionaries that aim to facilitate the access, memorization and automation of words and syntactic structures, and to overcome the tip-of-the-tongue problem.