AUTHORS: Mani, Inderjeet; Pustejovsky, James
TITLE: Interpreting Motion
SUBTITLE: Grounded Representations for Spatial Language
PUBLISHER: Oxford University Press
YEAR: 2012
Dorothea Hoffmann, The University of Chicago, USA
The monograph ‘Interpreting Motion’ sets out to describe a new approach in computational linguistics to understanding and mapping natural language descriptions of motion. The authors aim to “offer an integrated perspective on how language structures concepts of motion, and how the world shapes the way in which motion is linguistically expressed” (5). The novelty of the approach lies in its interdisciplinary method, which combines cross-linguistic research on the semantics of motion verbs and locative constructions with qualitative spatial reasoning. The aim is to develop a model for computational linguistics that represents motion in natural language by mapping spatial and temporal relations. Within this approach, topological, orientation and distance relations are analyzed as being expressed in verb, adverbial and prepositional phrases. Ultimately, the approach is meant to support the mapping of text to data representations, with practical consequences.
The first chapter introduces the topic, outlines the technical approach and situates the publication within other works on spatial prepositions and motion verbs. In particular, some challenges of a computational approach to natural language analysis are introduced, such as considering typological characteristics of languages, making use of diverse corpora, and some issues of human and machine annotation. Additionally, consequences of combining linguistic description and theory with computational specifications are addressed. For example, Talmy’s (1985, 2000a, 2000b, 2007, 2009) typological distinction between verb-framed and satellite-framed languages, with the addition of Slobin’s (1996, 2004, 2006) equipollently-framed ones, is rendered in qualitative reasoning as a formal distinction between location-based (path) and action-based (manner) predicates, with a combination of the two for the equipollently-framed type.
Specifically, the authors proclaim essential desiderata for their approach in this chapter. These include that semantic representations need to be expressive enough for natural language. The aim is to develop a denotational semantic theory, and a compositional analysis. Additionally, all representations need to support qualitative reasoning and all systems need to be accurate and efficient enough to support practical applications.
Concerning the theoretical background of linguistic descriptions of motion, the authors provide a comprehensive summary of previous studies on spatial prepositions and motion verbs. Such studies classify spatial prepositions as, for example, locative or directional (Miller and Johnson-Laird, 1976). Within cognitive linguistics, they are characterized in terms of concepts such as ‘contact’ and ‘inclusion’ (Evans et al., 2007), while Jackendoff (1983) defines them within his theory of Lexical Conceptual Structure (LCS). Finally, spatial prepositions have been analyzed as vector representations mapping sets of points in accordance with semantic content (Zwarts and Winter, 2000, Zwarts, 2003). Mani and Pustejovsky come to the conclusion that this research is of no use for their computational approach since it lacks corpus-based evidence. Regarding motion verbs, various approaches are discussed, including Langacker’s (1987) topological view of motion verbs, Jackendoff’s LCS, WordNet’s (Fellbaum, 1998) corpus-based ranking of word senses, VerbNet’s (Kipper et al., 2006) syntactic and semantic information about verbs, FrameNet’s (Baker et al., 2003) reliance on the theory of Frame Semantics (Fillmore, 1976), various approaches to verb classification based on qualitative reasoning, and compositional semantics. Overall, the authors conclude that none of these approaches is consistent with the desiderata set out to govern their methodology.
Finally, some limitations of the study are briefly addressed. Pragmatic and psycholinguistic considerations are not part of the book’s content since it is primarily concerned with semantic theory. Nor do the authors claim to provide a thorough survey of the field; they choose instead to introduce only select relevant literature.
The second chapter discusses how motion is expressed in natural language by developing a framework for analyzing different parameters of spatial meaning. First, static spatial relations are briefly discussed, including topological (e.g. “on”, “in”), orientational (e.g. “over”, “under”), topometric (e.g. “near”, “far”), and topo-orientational relations (e.g. “on the wall”, “hang over the desk”). Particular emphasis is laid on “the domain of points, lines, regions, and the relations between them” (31). Following this, a discussion of motion covers argument structure and role selection, event structure, path and manner of motion verbs, paths and orientation, and measuring distance. Starting from the ‘Region Connection Calculus 8’ (RCC-8) (Randell et al., 1992), the model is enriched to deal with orientation and distance as well as motion. RCC-8 is a calculus of eight jointly exhaustive and pairwise disjoint relations between regions, used here to analyze static spatial descriptions involving prepositions in natural language.
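To make the idea of eight jointly exhaustive and pairwise disjoint relations concrete, the following is a minimal sketch rather than anything taken from the book. It assumes, purely for illustration, that regions are closed discs in the plane with positive radius; the names `Disc` and `rcc8` are hypothetical. Under that assumption, exactly one of the eight RCC-8 labels holds for any pair of discs.

```python
import math
from typing import NamedTuple


class Disc(NamedTuple):
    """A closed disc; an illustrative stand-in for an RCC-8 region."""
    x: float
    y: float
    r: float  # assumed to be > 0


def rcc8(a: Disc, b: Disc, tol: float = 1e-9) -> str:
    """Return the single RCC-8 relation that holds between discs a and b."""
    d = math.hypot(a.x - b.x, a.y - b.y)  # distance between centres

    def eq(u: float, v: float) -> bool:
        return math.isclose(u, v, abs_tol=tol)

    if eq(d, 0.0) and eq(a.r, b.r):
        return "EQ"      # identical regions
    if eq(d, a.r + b.r):
        return "EC"      # externally connected: boundaries touch
    if d > a.r + b.r:
        return "DC"      # disconnected
    if eq(d, b.r - a.r):
        return "TPP"     # a is a tangential proper part of b
    if d < b.r - a.r:
        return "NTPP"    # a is a non-tangential proper part of b
    if eq(d, a.r - b.r):
        return "TPPi"    # b is a tangential proper part of a
    if d < a.r - b.r:
        return "NTPPi"   # b is a non-tangential proper part of a
    return "PO"          # partial overlap


print(rcc8(Disc(0, 0, 1), Disc(0, 0, 3)))  # a cup inside a cupboard, roughly "in"
```

Real annotated regions are of course not discs; the point of the sketch is only that the eight cases partition all possible configurations.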
Path verbs are discussed in some detail and an additional path element is assumed in lexical argument structure. Motion is then represented in terms of transitions in spatial configurations, along with particular temporal constraints. In contrast, manner verbs are analyzed as indicating motion alone, without a path element within the verb. In line with Talmy’s typology, manner verbs require a path adjunct and path verbs may add a manner adjunct. The foundations for a semantics of motion laid out in the chapter then form the basis for a logic of motion, ‘Dynamic Interval Temporal Logic’ (DITL), allowing the authors to model events and states as programs.
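The notion of motion as a transition between spatial configurations can be illustrated with a small sketch. This is not the authors' DITL formalization; it assumes only that a tracked figure's relation to a ground is recorded as a time-ordered list of RCC-8 labels, and the function names are hypothetical. A path verb such as “enter” then constrains the first and last states of the trace, whereas a manner verb such as “walk” would place no such constraint on it.

```python
from typing import Sequence

# RCC-8 labels grouped by whether the figure is outside or inside the ground.
OUTSIDE = {"DC", "EC"}
INSIDE = {"TPP", "NTPP"}


def looks_like_enter(trace: Sequence[str]) -> bool:
    """True if the figure starts outside the ground and ends inside it."""
    return len(trace) >= 2 and trace[0] in OUTSIDE and trace[-1] in INSIDE


def looks_like_leave(trace: Sequence[str]) -> bool:
    """True if the figure starts inside the ground and ends outside it."""
    return len(trace) >= 2 and trace[0] in INSIDE and trace[-1] in OUTSIDE


# "She entered the room": the figure's relation to the room over time.
trace = ["DC", "EC", "PO", "TPP", "NTPP"]
print(looks_like_enter(trace), looks_like_leave(trace))
```

The check deliberately ignores the intermediate states and any temporal constraints, both of which a fuller treatment would need to model.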
In Chapter 3, spatial and temporal representations and inference methods are examined with regard to qualitative reasoning and are applied to spatial phenomena in languages involving topological and orientation relations. Particular emphasis is laid on qualitative representations of ‘Topology’ and ‘Frames of Reference’ (FoR) (Levinson, 2003, Levinson and Wilkins, 2006a, Pederson et al., 1998). Topological relations are analyzed with the already mentioned RCC-8 relations, which are mapped onto interval calculus (Allen, 1984) relations in order to combine spatial and temporal relations with one another. Furthermore, for intrinsic FoR, the ‘Oriented Point Relational Algebra’ (OPRA) (Moratz et al., 2005) is used to describe the relation between oriented points when the size and shape of the ground are of no importance. Absolute FoR is analyzed using the ‘Cardinal Direction Calculus’ (CDC) (Goyal and Egenhofer, 2000, Skiadopoulos and Koubarakis, 2005) for relations between objects when each is positioned in terms of a coordinate system. Finally, relative FoR makes use of the ‘Double Cross Calculus’ (DCC) (Freksa, 1992), describing the position of the Figure relative to the Ground as seen by an observer. In conclusion, the authors state that the individual calculi mentioned are not sufficient for a qualitative analysis of topological and FoR relations. Combining them remains a challenge for future work.
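For readers unfamiliar with the interval calculus mentioned above, the sketch below classifies two time intervals into one of the thirteen standard interval relations. It illustrates only the temporal side and does not reproduce the book's mapping from RCC-8. It assumes that intervals are given as `(start, end)` pairs with `start < end`; the function name `allen` is hypothetical.

```python
from typing import Tuple

Interval = Tuple[float, float]  # (start, end), assumed start < end


def allen(a: Interval, b: Interval) -> str:
    """Return the interval relation that holds from a to b."""
    (s1, e1), (s2, e2) = a, b
    if e1 < s2:
        return "before"
    if e2 < s1:
        return "after"
    if e1 == s2:
        return "meets"
    if e2 == s1:
        return "met-by"
    if s1 == s2 and e1 == e2:
        return "equals"
    if s1 == s2:
        return "starts" if e1 < e2 else "started-by"
    if e1 == e2:
        return "finishes" if s1 > s2 else "finished-by"
    if s2 < s1 and e1 < e2:
        return "during"
    if s1 < s2 and e2 < e1:
        return "contains"
    # Only the two overlapping cases remain at this point.
    return "overlaps" if s1 < s2 else "overlapped-by"


print(allen((0, 4), (2, 6)))
```

Like RCC-8, these relations are jointly exhaustive and pairwise disjoint, which is what makes a correspondence between the two calculi plausible in the first place.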
Chapter 4 applies the methods discussed in Chapters 2 and 3 to the concept of motion. Generally, motion expressions are analyzed within a cognitively inspired spatiotemporal model of change. In this approach, Talmy’s distinction can be modeled based on ‘Dynamic Interval Temporal Logic’ (DITL) (Pustejovsky and Moszkowicz, 2011), in which prepositional, verb and noun meanings are integrated together compositionally. This “combines mechanisms from temporal logic with the ability to update state information from dynamic logic” (90). Generally, the authors aim to fulfill two criteria, namely, discussing how each component of a basic motion frame is semantically grounded, as well as how these representations map to a compositional interpretation of the motion expression in a language. Some path and manner of motion verbs are discussed in detail and provided with a DITL definition.
Chapter 5 introduces a practical methodology for the human annotation of text corpora with information on motion, so that a computer can automatically map text to sketches. To achieve reliable human annotation, best practices for spatial annotation (including toponyms) are examined alongside motion annotation. While some of these issues are well-discussed problems, the extraction of topological and orientation relations is less developed. Following ‘ISO-Space’ (Pustejovsky et al., 2010), which has rich representations of paths and distinguishes between manner and path verbs in addition to subclasses of motion events, the authors conclude that, in general, capabilities for assembling automatic motion tracking from natural language narratives are well underway.
Finally, Chapter 6 summarizes the authors’ approach of using representations based on qualitative reasoning to describe the meaning of motion verbs and spatial expressions. Additionally, the chapter discusses potential advantages as well as practical application options. These can be found in route navigation, mapping travel narratives and multimedia tagging of static images, audio, video, question-answering, communicating with artificial agents, and rendering scenes from text. Additionally, some open issues are mentioned. These include, for example, the problem of ‘Fictive Motion’, adequately capturing cross-linguistic variation, integrating 3D representations of spatial entities, the challenges of developing methods to quickly generate training data, and data preparation methods.
The main aim of the authors is to provide an interdisciplinary approach to combining linguistic analysis of motion event descriptions with qualitative spatial reasoning to develop a computational model for the mapping of motion events in natural language. The monograph succeeds in carefully laying out essential issues in both fields of interest and in providing an excellent overview of current developments and trends. In an easy to follow, step-by-step discussion, the authors sensibly add components to examine this highly complex issue. While open questions and problems remain at the end of the book, ‘Interpreting Motion’ provides an excellent discussion of the problems of mapping motion from natural language aimed primarily at a computational linguist audience. However, because of the well-structured summaries and explanations of the many calculi and intervals used in the model, the book might also be useful to a non-expert audience of linguists interested in text-to-data mapping of static spatial and dynamic language.
Chapter 2 is particularly compelling in presenting an excellent overview of issues in spatial and motion linguistics and pointing out direct links to existing computational models of relevance to the aims of the monograph. This kind of discussion gives the reader the opportunity to directly reflect on and evaluate the current state of the art with regard to both disciplines: the semantics of spatial language and computational models for its description. Furthermore, the authors succeed in guiding the reader through the discussion without overly simplifying the complex features of either field. Especially useful in this chapter are the numerous exemplary analyses of path as well as manner of motion verbs, which illustrate the theoretical underpinnings.
Chapter 5 is likewise convincing in proposing highly valuable practical guidelines for developing and expanding procedures in semantic annotation. This discussion provides useful parameters for human as well as automated linguistic annotation that go beyond the topic of the monograph. As a result, the authors succeed in improving best practices of annotation as well as in extending the reach of their study.
The main criticism of the monograph comes from a certain inconsistency in including issues of cross-linguistic variation in the semantics of spatial and motion event descriptions without adequately discussing these in relation to a computational approach. While the authors show some detailed knowledge of the typological literature on the topic, this is not further exploited in the later stages of the book. Additionally, some terminology, such as ‘orientation’ for ‘FoR’ is not chosen well, especially when considering that a number of recent typological publications on languages distinguish between ‘orientation’ (e.g. “he is facing the tree”) and ‘FoR’ (e.g. “he is in front of the tree”) (Bohnemeyer and O'Meara, 2012, Terrill and Burenhult, 2008).
Additionally, Mani and Pustejovsky appear to leave out some important distinctions made in the motion event literature. For example, ‘change of location’ is equated with ‘motion’. However, Levinson and Wilkins (2006b) describe clear distinctions in verbal semantics between translocational movement (e.g. “he went from the garden to the house”), change of location (e.g. “he left the garden and arrived at the house”) and change of locative relation (e.g. “he ended up in the house”). Furthermore, while issues concerning the lexicalization of ‘source’ and ‘goal’ are discussed, ‘passed grounds’ (i.e. ‘via’) are left out of the examination.
All in all, the monograph is a good source for computational linguists and others interested in text-to-sketch mappings of spatial and motion events and provides some well-grounded discussions of relevant and detailed problems and solutions.
Allen, J.F. 1984. Towards a general theory of action and time. Artificial Intelligence 23:123-154.
Baker, C.F., Fillmore, Charles, and Cronin, B. 2003. The structure of the FrameNet database. International Journal of Lexicography 16:281-296.
Bohnemeyer, Juergen, and O'Meara, Carolyn. in press. Vectors and frames of reference: Evidence from Seri and Yucatec. In Space and Time across Languages and Cultures, eds. Luna Filipovic and Kasia M. Jaszczolt: John Benjamins Ltd.
Evans, V., Bergen, B.K., and Zinken, J. 2007. The cognitive linguistics enterprise: An overview. In The Cognitive Linguistics Reader, eds. V. Evans, B.K. Bergen and J. Zinken, 3-26. London: Equinox Publishers.
Fellbaum, C. ed. 1998. WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press.
Fillmore, Charles. 1976. Frame Semantics and the nature of language. Paper presented at Conference on the Origin and Development of Language and Speech, New York.
Freksa, C. 1992. Using orientation information for qualitative spatial reasoning. In Theories and methods of spatiotemporal reasoning in geographic space, eds. A.U. Frank, I. Campari and U. Formentini, 162-178. Berlin: Springer.
Goyal, R., and Egenhofer, M.J. 2000. Consistent queries over cardinal directions across different levels of detail. Paper presented at 11th International Workshop on Database and Expert Systems Applications, Greenwich, UK.
Jackendoff, R. 1983. Semantics and Cognition: Current studies in Linguistics. Cambridge, MA: MIT Press.
Kipper, K., Korhonen, A., Ryant, N., and Palmer, M. 2006. Extending VerbNet with Novel Verb Classes. Paper presented at Fifth International Conference on Language Resources and Evaluation, Genoa.
Langacker, Ronald W. 1987. Foundations of Cognitive Grammar. Stanford: Stanford University Press.
Levinson, Stephen C. 2003. Space in Language and Cognition. Explorations in Cognitive Diversity. Cambridge: Cambridge University Press.
Levinson, Stephen C., and Wilkins, David. 2006a. Grammars of Space: Explorations in Cognitive Diversity. Language, Culture and Cognition 6. Cambridge: Cambridge University Press.
Levinson, Stephen C., and Wilkins, David. 2006b. Patterns in the Data. Towards a semantic typology of spatial description. In Grammars of Space. Explorations in cognitive diversity, eds. Stephen C. Levinson and David Wilkins, 512-552. Cambridge: Cambridge University Press.
Miller, G.A., and Johnson-Laird, P.N. 1976. Language and Perception. Cambridge, MA: Belknap Press of Harvard University Press.
Moratz, R., Dylla, F., and Frommberger, L. 2005. A relative orientation algebra with adjustable granularity. Paper presented at Workshop on Agents in Real-Time and Dynamic Environments (IJCAI), Edinburgh, Scotland.
Pederson, Eric, Danziger, Eve, Wilkins, David, Levinson, Stephen C., Kita, Sotaro, and Senft, Gunter. 1998. Semantic Typology and Spatial Conceptualisation. Language 74:557-589.
Pustejovsky, J., Moszkowicz, J.L., and Verhagen, M. 2010. ISO-Space Specification: Version 1.3.
Pustejovsky, J., and Moszkowicz, J.L. 2011. The qualitative spatial dynamics of motion in language. Journal of Spatial Cognition and Computation 11:15-44.
Randell, D.A., Cui, Z., and Cohn, A.G. 1992. A spatial logic based on regions and connection. Paper presented at 3rd International Conference on Knowledge Representation and Reasoning, San Mateo, CA.
Skiadopoulos, S., and Koubarakis, M. 2005. On the consistency of cardinal direction constraints. Artificial Intelligence 163:91-135.
Slobin, Dan I. 1996. Two ways to travel: Verbs of motion in English and Spanish. In Grammatical Constructions. Their Form and Meaning, eds. M. Shibatani and S.A. Thompson, 195-219. Oxford: Clarendon Press.
Slobin, Dan I. 2004. The many ways to search for a frog: Linguistic typology and the expression of motion events. In Relating Events in Narrative: Vol. 2 Typological and contextual perspectives, eds. S. Stroemqvist and L. Verhoeven, 219-257. Mahwah, NJ: Lawrence Erlbaum Associates.
Slobin, Dan I. 2006. What makes manner of motion salient? Explorations in linguistic typology, discourse and cognition. In Space in Languages. Linguistic Systems and Cognitive Categories, eds. Maya Hickmann and Stephane Robert, 59-81. Amsterdam, Philadelphia: John Benjamins.
Talmy, Leonard. 1985. Lexicalization patterns: semantic structure in lexical forms. In Language Typology and Syntactic Description: Grammatical Categories and the Lexicon, ed. Timothy Shopen, 57-149. Cambridge: Cambridge University Press.
Talmy, Leonard. 2000a. Toward a Cognitive Semantics: Concept Structuring Systems vol. 1. Cambridge, MA: MIT Press.
Talmy, Leonard. 2000b. Toward a cognitive semantics. Typology and Process in Concept Structuring vol. 2. Cambridge, MA: MIT Press.
Talmy, Leonard. 2007. Lexical Typologies. In Language Typology and Syntactic Description, ed. Timothy Shopen, 66-168. New York: Cambridge University Press.
Talmy, Leonard. 2009. Main Verb Properties and Equipollent Framing. In Crosslinguistic Approaches to the Psychology of Language. Research in the Tradition of Dan Isaac Slobin, eds. Jiansheng Guo et al., 389-402. New York: Psychology Press.
Terrill, Angela, and Burenhult, Niclas. 2008. Orientation as a strategy of spatial reference. Studies in Language 32:93-136.
Zwarts, J., and Winter, Y. 2000. Vector Space Semantics: A model-theoretic analysis of locative prepositions. Journal of Logic, Language and Information 9:171-213.
Zwarts, J. 2003. Vectors across spatial domains: from place to size, orientation, shape and parts. In Representing Direction in Language and Space, eds. Emile Van der Zee and Jon Slack, 39-68. Oxford: Oxford University Press.
ABOUT THE REVIEWER
Dorothea Hoffmann received her PhD from the University of Manchester, Great
Britain in 2011. She is now a postdoctoral fellow at the University of
Chicago on a language documentation project of MalakMalak, an endangered
language of the Daly River Area in Australia, funded by the Endangered
Language Documentation Programme. Her research interests include typology,
lexical semantics, language contact, narrative structure, cognitive
linguistics, Australian indigenous languages and culture, as well as
discourse-based studies of space and motion.