EDITORS: Gries, Stefan Th.; Stefanowitsch, Anatol
TITLE: Corpora in Cognitive Linguistics
SUBTITLE: Corpus-based Approaches to Syntax and Lexis
PUBLISHER: Mouton de Gruyter
YEAR: 2006
Rolf Kreyer, Department of English, American and Celtic Studies, University of Bonn
The volume under review is a collection of nine papers on 344 pages, all of which aim to show how issues of cognitive linguistics can benefit from the extensive use of corpus data and from the application of objective statistical methods. Although the volume is not divided into separate 'parts', the papers can broadly be subsumed under three major groups, namely 1) four papers on the study of semantic similarity, 2) three papers on the linguistic manifestation of causation and transitivity, and 3) two papers on the role of image-schemas in cognitive linguistics and their analysis with corpus-based methods.
The following synopsis will give a summary of the key points of each of the articles. The review will conclude with a critical evaluation.
In her paper ''Ways of intending: Delineating and structuring near-synonyms'', Dagmar Divjak analyses meaning differences in ''five Russian near-synonymous verbs that, in combination with an infinitive, express the concept INTEND TO CARRY OUT AN ACTION'' (19). Her study falls into two parts, the first being based on elicitation, the second on corpus data. The aim of the former is to show that the degree of similarity between lexemes can be measured reliably with the help of ''precise syntactic and semantic data on the distribution of the potentially near-synonymous lexemes over constructions and of their collocates over the slots of those constructions'' (22). On the basis of an elicitation experiment that taps knowledge of realizations of the underlying pattern [Vfin Vinf], the author identifies five verbs that show constructional similarities, namely 'namerevat'sja', 'sobirat'sja', 'predpolagat'', 'dumat'' and 'xotet''. One verb which, on a purely semantic basis, is usually considered to fall into the same meaning group of intention, namely 'planirovat'', is excluded from this list, since it does not show the same constructional properties. Interestingly, this difference in constructional realization reflects a semantic distinction between 'planirovat'' and the other verbs, which apparently is difficult to recognize through purely meaning-based methodologies. Such methodologies ''seem to lack a precise enough measure to determine the degree of similarity; the proposed solutions may thus be influenced by the authors' opinion on what an intention should be as well as by prototype effects typical of human categorization'' (32). Accordingly, ''[a]n approach that builds on distribution parameters from argument- and event-structure offers a viable alternative'' (32).
In the second part of her study, Divjak sets out to explore the near-synonyms in more detail. More specifically, she analyses four of the above verbs, 'namerevat'sja', 'sobirat'sja', 'dumat'' and 'xotet'', with regard to 47 parameters which fall into two broad groups, the first being concerned with the formal instantiations of the slots provided by the pattern [Vfin Vinf], the second focussing on semantic paraphrases for the subject and the infinitive in the construction. A Hierarchical Agglomerative Clustering (HAC) analysis shows that with regard to the formal realizations 'dumat'' and 'xotet'' are most similar, while a focus on the semantic paraphrases reveals strong similarities between 'dumat'' and 'namerevat'sja'. Surprisingly, an HAC over all 47 parameters shows a pattern that is similar to the one that is revealed when merely taking the formal aspects into account. It follows that the overall impact of semantic variables seems to be rather unimportant. In contrast, ''the way constructional slots are formed can be decisive in determining the degree of closeness between near synonyms'' (46). Accordingly, the constructional approach to near-synonyms applied in this paper is advocated as ''a valid [sic!] verifiable and repeatable alternative to meaning-based, introspective methods'' (46).
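For readers unfamiliar with the technique, the general idea of a hierarchical agglomerative clustering can be illustrated with a minimal sketch. The choices made here (Euclidean distance, average linkage) and the four placeholder verbs with three made-up parameters each are assumptions for illustration only; they are not Divjak's data, nor necessarily the settings used in her study.

```python
from itertools import combinations
from math import sqrt


def euclidean(p, q):
    """Euclidean distance between two equally long parameter vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def average_linkage(c1, c2, profiles):
    """Mean pairwise distance between the members of two clusters."""
    dists = [euclidean(profiles[x], profiles[y]) for x in c1 for y in c2]
    return sum(dists) / len(dists)


def hac(profiles):
    """Repeatedly merge the two closest clusters; return the merge history."""
    clusters = [(name,) for name in sorted(profiles)]
    merges = []
    while len(clusters) > 1:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: average_linkage(pair[0], pair[1], profiles),
        )
        distance = average_linkage(a, b, profiles)
        clusters = [c for c in clusters if c not in (a, b)]
        clusters.append(tuple(sorted(a + b)))
        merges.append((a, b, distance))
    return merges


# Hypothetical relative frequencies of three invented parameters per verb.
TOY_PROFILES = {
    "verb_a": [0.60, 0.30, 0.10],
    "verb_b": [0.55, 0.35, 0.10],
    "verb_c": [0.20, 0.20, 0.60],
    "verb_d": [0.10, 0.30, 0.60],
}

MERGES = hac(TOY_PROFILES)
for left, right, distance in MERGES:
    print(left, "+", right, "at distance", round(distance, 3))
```

The order of the merges corresponds to the branching of a dendrogram: items that are merged early are distributionally most similar.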
In his article ''Corpus-based methods and cognitive semantics: The many senses of 'to run''' Stefan Th. Gries tries to bridge the gap between corpus linguistics and cognitive linguistics by ''demonstrating how cognitive linguistics can benefit from methodologies from corpus linguistics and computational linguistics'' (57-8). To this end, the author, by conducting a number of case studies, illustrates how a 'traditional' cognitive analysis of the individual meanings of 'to run' can be supplemented by corpus-based statistical methods. Underlying all analyses are the 815 instances of the verb in the British component of the International Corpus of English (ICE-GB) and the Brown corpus of American English. The cognitive investigations result in a radial network of a total of 56 senses of 'to run', with five central senses around which the others cluster, namely 'fast pedestrian motion', 'fast motion', 'motion', 'abstract motion', and 'to cause motion'. The author points out that traditional cognitive analysis would usually treat the sense 'motion' as prototypical since it ''is the sense from which most others can be (most economically) derived'' (75). However, general and also more sophisticated corpus linguistic evidence indicates that 'fast pedestrian motion' is more central. As for the first kind of evidence, the CHILDES corpus shows that this sense is the most frequent in early stages of acquisition. Also, ICE-GB yields 60 instances of the noun 'run', about three quarters of which refer to the sense 'fast pedestrian motion' or closely related senses. More sophisticated corpus-based methods like analyses of the behavioural profile of the verb's occurrences (based, among others, on morphological, syntactic and semantic properties) provide further evidence for the centrality of the sense 'fast pedestrian motion'. For instance, this sense seems to be ''the formally least constrained sense and can, thus, be considered unmarked and prototypical'' (76).
Similarly, the fact that ''it exhibits most variation across all formal and semantic characteristics which were coded'' points in the same direction. This case study on the prototypical sense of 'to run', as well as other case studies (for instance, on the distinctiveness of senses and on the question of how and where senses in a network should be connected), thus shows how cognitive approaches can benefit from corpus-based methods, in particular approaches that are based on behavioural profiles: ''[A] behavioural profile is [...] the most rewarding starting point that will hopefully be utilized more fully in future work'' (90).
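The notion of a behavioural profile can be made more concrete with a small sketch: every corpus token is annotated for a sense and for a set of formal and semantic features, and the profile of a sense is the relative frequency of each feature value among its tokens. The sense labels, feature names and the four annotated tokens below are invented for illustration and do not reproduce the coding scheme or the data of the paper.

```python
from collections import Counter, defaultdict


def behavioural_profiles(tokens):
    """Map each sense to the relative frequency of every (feature, value) pair.

    `tokens` is an iterable of (sense, {feature: value}) annotations.
    """
    counts = defaultdict(Counter)
    totals = Counter()
    for sense, features in tokens:
        totals[sense] += 1
        for feature, value in features.items():
            counts[sense][(feature, value)] += 1
    return {
        sense: {pair: n / totals[sense] for pair, n in counter.items()}
        for sense, counter in counts.items()
    }


# Hypothetical annotations; a real study would code hundreds of tokens.
TOY_TOKENS = [
    ("sense_1", {"tense": "past", "transitivity": "intransitive"}),
    ("sense_1", {"tense": "present", "transitivity": "intransitive"}),
    ("sense_2", {"tense": "past", "transitivity": "transitive"}),
    ("sense_2", {"tense": "past", "transitivity": "transitive"}),
]

PROFILES = behavioural_profiles(TOY_TOKENS)
for sense, profile in sorted(PROFILES.items()):
    print(sense, profile)
```

Profiles of this kind are vectors of proportions, so they can be compared across senses, for example with a cluster analysis.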
Stefanie Wulff, in her paper '''Go-V' vs. 'go-and-V' in English: A case of constructional synonymy?'', studies instances of the two superficially similar double verb patterns that are exemplified below.
(1) Go find the books and show me.
(2) Now, just keep polishing those glasses while I go and check the drinks. (101)
In contrast to generative-transformational approaches that treat the first pattern as merely being a truncated surface form of the second one, the author shows that both patterns should be regarded as constructions (in a construction-grammar sense) in their own right. This claim is substantiated by a number of smaller studies that analyse the behaviour of the constructions at issue with the help of ''statistical methods such as collocational overlap estimation, collostructional analysis, and distinctive collexeme analysis'' (102). For instance, the latter technique, which ''measures the dissimilarity of semantically similar constructions on the basis of their significant collexemes'' (119), shows that the 'go-and-V' pattern mostly occurs with stative verbs (although these sometimes may get a dynamic reading if they occur in this pattern, e.g. 'I might go and see Aunt Violet' (114)). The 'go-V' pattern, in contrast, usually chooses motion verbs or verbs which imply activity. This and similar results lead Wulff to conclude that ''while whatever action is denoted by the 'go-and-V' gains an event-like interpretation and is meant to embrace the whole sequence cascade of a typical event with a beginning and an end, the meaning of 'go-V' only denotes the initiation of an action and is inherently atelic, which invites process verbs to occupy the V2 slot'' (121).
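The logic of a distinctive collexeme analysis can be sketched as follows: for each verb, a 2x2 table crosses 'this verb vs. all other verbs' with 'construction A vs. construction B', and an association test decides whether the verb is significantly attracted to one of the two constructions. The sketch below uses a two-sided Fisher exact test, which is an assumption here, since the review does not state which test the paper applies. The frequencies in the usage example are hypothetical and are not taken from Wulff's data.

```python
from math import comb


def hypergeom_pmf(k, row1, col1, total):
    """Probability of k in the top-left cell, given fixed table margins."""
    return comb(col1, k) * comb(total - col1, row1 - k) / comb(total, row1)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the table [[a, b], [c, d]]."""
    row1, col1, total = a + b, a + c, a + b + c + d
    p_observed = hypergeom_pmf(a, row1, col1, total)
    lowest = max(0, row1 + col1 - total)
    highest = min(row1, col1)
    probabilities = [
        hypergeom_pmf(k, row1, col1, total) for k in range(lowest, highest + 1)
    ]
    # Sum over all tables that are at most as probable as the observed one.
    p_value = sum(p for p in probabilities if p <= p_observed * (1 + 1e-9))
    return min(1.0, p_value)


def distinctive_collexeme(verb_in_a, verb_in_b, total_a, total_b):
    """Return the construction a verb prefers ('A' or 'B') and the p-value."""
    p_value = fisher_exact_two_sided(
        verb_in_a, verb_in_b, total_a - verb_in_a, total_b - verb_in_b
    )
    preferred = "A" if verb_in_a / total_a >= verb_in_b / total_b else "B"
    return preferred, p_value


# Hypothetical counts: a verb occurs 30 times among 200 tokens of
# construction A and 5 times among 200 tokens of construction B.
PREFERRED, P_VALUE = distinctive_collexeme(30, 5, 200, 200)
print(PREFERRED, P_VALUE)
```

Ranking all verbs by their p-values within each preferred construction yields the lists of distinctive collexemes on which interpretations like the one above are based.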
In their article ''Syntactic leaps or lexical variation? - More on 'Creative Syntax''', the authors Beate Hampe and Doris Schönefeld analyse the creative use of verbs in untypical complementation patterns, as exemplified in clauses like 'He supported them through the entrance door' or 'She bore them stupid'. Such cases of 'caused-motion' or 'resultative' constructions have been studied at length in construction-grammar approaches such as, for instance, Goldberg (1995), where a fusion model is advocated to account for the apparent change in the complementational behaviour of the verb: ''the verb 'inherits' a syntactic slot from an argument-structure construction (ASC) it is usually not associated with. [...T]he ASC provides both a very generic meaning and a syntactic template [...] which gets fused with the semantic and syntactic frame of the verb at issue [... and thus] licenses both the semantic change incurred and the appearance of additional syntactic slots'' (129-30). The authors agree with Goldberg in attributing a central role in creative verb use to the ASC but they suggest that the ASC plays a different role, namely acting ''as a trigger to the activation of another verb [...] as input to a blending process'' (130). The creative use of 'fear' in resultative constructions (e.g. 'Hundreds of people are feared dead after a mining disaster'), for instance, could thus be explained by reference to lexical influence. 'Fear' in its creative use shows a very high collocational strength with 'dead' and only occurs in passive constructions, a situation that is similar to the use of main verb 'find', as in 'the bookseller was found dead'. In the case of 'fear', then, the creative use might have originated in ''lexically filled model collocations, such as 'X (be) found dead', from which a specific ''creative'' collocation like 'X (be) feared dead' may be formed by lexically manipulating the model pattern in only one slot'' (148).
Similarly, all other creatively used verbs studied by Hampe and Schönefeld also show strong collocational restrictions with regard to the newly acquired argument slot, a finding that is not predicted or explained by Goldberg's account. Accordingly, the authors conclude ''what is treated as a merely syntactic [...] type of creativity in Goldberg's 'fusion model' may be governed to variable extents, by lexical processes'' (150).
Gaëtanelle Gilquin's paper ''The place of prototypicality in corpus linguistics: Causation in the hot seat'' investigates the relation of cognitive prototypes and corpus-linguistic frequencies. More specifically, she explores to what extent authentic periphrastic causative constructions (for instance, 'get your father to run us out') can be interpreted as realizations of one of three cognitive models of prototypical causation (namely the notion of iconic sequencing with the order 'causer - causee - patient', the billiard-ball model (both Langacker 1991), and Lakoff's (1987) direct manipulation model). Surprisingly, ''the models of prototypical causation described in the cognitive literature account for an astonishingly small proportion of the data'' (175), i.e. only 45%. Although the author adduces some qualifications that may (at least to some extent) ''reduce the distance between cognitive salience and frequency [...] this lack of overlap nonetheless questions our deepest intuitions and calls for explanation'' (181). One such explanation may lie in the fact that the cognitive ''models proposed in the literature are not valid descriptions of prototypical causation'' (178). In particular, all three models discussed seem to be based merely on the intuition of their originators, which, as corpus-linguistic studies have shown with regard to many other contexts, may be rather unreliable. Furthermore, Gilquin, following Geeraerts (1989), claims that the concept of prototypicality itself is prototypical and, hence, may be too fuzzy to be applied satisfactorily. What is needed is ''a refined and more detailed description of this concept, which might involve multi-faceted characterisation and/or additional adjustments, such as assigning particular weight to each parameter defining the prototype'' (181). In this respect, for instance, corpus linguistic data on frequently occurring patterns of use may be helpful.
The paper ''Passivisability of English periphrastic causatives'' by Willem Hollmann is an attempt ''to account for the differences in passivisability of English periphrastic causatives'' (193), as exemplified in sentences like 'Recruits were made to hop on the spot' or 'People in their work roles are caused to respond from their unconscious world of internal objects.' Hollmann restricts himself to an empirical analysis of instances of 'to make', since semantically it is the most general causative, and, accordingly, ''results may be extended to other causatives'' (193). On the basis of Hopper and Thompson's (1980) work on transitivity the author suggests several scales that are supposed to capture the transitivity of the constructions under scrutiny. For instance, scales like 'full affectedness < partial affectedness (of the object)' or 'inducive < volitional < affective < physical (causation)' capture the causality aspect of transitivity (with scales showing a decrease of transitivity from left to right); the scale 'unity of space and time < absence of unity of space/time < absence of unity of space and time' is supposed to capture the aspect of directness of the cause, and so on. The application of the descriptive framework to 400 instances of active/passive causative uses of 'make' in the past/present (i.e. 100 tokens for each configuration) shows that causation type exhibits a strong influence on the passivisation of 'make' causatives, while the affectedness of the object does not yield significant results. Similarly, directness of the cause only seems to have a marginal influence on passivisability. The results obtained from the empirical study lead Hollmann to posit a number of ''implicational universals that may be proposed to capture the relation between the semantics of causatives and their degree of passivisability'' (213).
The influence of causation type, for instance, is described as follows: ''If a language allows passivisation of causative constructions towards the lower, less transitive end of the causation type scale then the constructions toward the higher, more transitive end of the scale will also be passivisable (all other things being equal).'' (213) These implicational universals are then tested against a small set of other causative verbs like 'get', 'force', or 'persuade' (reported more fully in Hollmann 2003). This comparison shows that the individual factors or scales applied in this study do not reliably predict the frequency of passivisation, if they are considered equally influential. Rather the data point to the fact that the different factors need to be weighted. Here, again, corpus-based methods that ''assess the relevance of the factors in question'' might prove useful.
John Newman and Sally Rice explore the ''Transitivity schemas of English EAT and DRINK in the BNC''. In particular, the authors analyse how the two verbs are used transitively and intransitively within spoken and written English and what kinds of nouns occur as subjects and objects. Also, they aim to show how usage patterns of the two verbs depend on the form of the verb that is actually used. For instance, the lexeme EAT occurs more frequently in both the written and spoken material than DRINK and usually is the first in combinations ('ate and drank' instead of 'drank and ate'). In the view of the authors this might indicate ''experiential salience: when we eat and drink, the drinking is an accompaniment to the eating, rather than the other way round'' (236). Other findings include the nature of objects that usually occur in transitive uses of the two verbs: the most frequent object with EAT, for instance, is 'food'. In addition, among the 20 most frequent words many occurrences denote particular kinds of meals, such as 'breakfast', 'lunch', or 'dinner'. In this context the authors note that while it is good practice in dictionaries ''to recognize a 'food' and 'meal' kind of understood object on intransitive EAT [... their] results show that these two categories are a feature of the 'transitive' use of EAT as well'' (246-7). The authors report similar findings with regard to DRINK: as in the case of EAT, intransitive uses are usually described with reference to an understood object denoting some kind of alcohol. Again, their corpus study on the transitive use of the verb DRINK shows that ''[t]he occurrence of names for alcoholic beverages is striking'' (248). If lexicographers leave this fact unmentioned in their description of the uses of the verbs, this might be interpreted as mirroring a difference between transitive and intransitive uses that is not actually given in authentic usage data.
A fuller integration of corpus-linguistic findings could thus help to make apparent ''the full extent of inferences and collocational properties associated with a verb [...] and the ensuing description becomes more observationally adequate'' (248). Finally, the authors stress the importance of the word forms in studies of the kind they conduct, since syntactic and/or semantic properties of the usage of a word are usually tied to particular word forms and do not necessarily hold true for the complete lemma. Accordingly the authors claim that ''the notion of a dictionary entry based on a lemma is still inadequate'' (255).
Maarten Lemmens' paper on ''Caused posture: Experiential patterns emerging from corpus research'' investigates the relation of the three Dutch cardinal posture verbs 'zitten' 'sit', 'liggen' 'lie', and 'staan' 'stand' and their causative counterparts 'zetten' 'set', 'leggen' 'lay', 'steken/stoppen' 'stick (into)' and 'doen' 'do'. On the basis of an analysis of 7550 tokens, the author finds that usually there is no ''direct link between the causatives and the non-causatives, in the sense that one can always recast one in terms of the other'' (279). While 'liggen' and its causative counterpart 'leggen' show clear correspondences, the situation is different for 'staan', which only in a few metaphorical uses is related to its apparent counterpart 'stellen' - more frequent and regular is 'zetten', the causative that corresponds to the posture verb 'zitten'. Causatives related to 'zitten', in addition to 'zetten', include 'steken', 'stoppen' and 'doen'. Lemmens further analyses the distribution of postural and locational uses of the causative verbs in those cases where the 'causee', i.e. the 'entity' that is put somewhere, is human. Surprisingly, postural readings of the causative verbs with human causee, i.e. 'bring a person into a standing/sitting/lying position', are only rarely attested in the corpus data. For example, less than 1% of all occurrences of 'leggen' and 'zetten' involve postural usage, and these seem to be restricted to two cases: 1) ''situations where people no longer control their own posture'' (283), as is the case with babies or ill people, or 2) contexts where people are manipulated or put somewhere, e.g. being expelled from a country or from a house. In addition, 'zetten' seems to have become highly productive. This, in the view of the author, is due to the fact that '''zetten' has generalized to the meaning 'put an entity in its canonical position''' (285), which naturally makes it applicable to a large number of situations.
'Zetten' thus seems to have become the default causative verb.
The final paper of this volume, ''From conceptualization to linguistic expression: Where languages diversify'' by Doris Schönefeld, analyses differences in conceptualizations of similar scenes in English, German and Russian. The paper is informed by the idea that speakers usually have choices in the way they conceptualize a particular scene and that these conceptualizations leave traces in their verbalization. It follows that ''from habitual, i.e. typical and frequent, expressions of a language we can infer a speech community's habitual ways of conceptualization'' (298). The author tries to identify such 'patterns' of conceptualization through a corpus-analysis of collocations found with the posture verbs 'sit', 'stand' and 'lie'. The analysis, for instance, shows that the three languages may use different prepositions with identical verbs to describe similar situations: While English and German students 'sit over' books (Ger. 'über den Büchern sitzen'), Russian students rather 'sit behind' books (Rus. 'sidet' za knigami'). Similarly, in England and Russia books stand 'on' the shelf while in Germany they stand 'in' the shelf (Ger. 'das Buch steht im Regal'). These and similar examples show that in their construal of the situation different languages activate different image-schemas. With regard to the book example above, for instance, English and German construe the relative position of landmark (book) and trajector (student) on the basis of the UP-DOWN schema while Russian employs the FRONT-BACK and NEAR-FAR schema. Further differences show when in the description of similar scenarios different posture verbs or even non-posture verbs are used in one or two languages. On the whole, the author finds that ''diversifications between languages [...] may be the result of diverging construals by drawing on different image-schema combinations in the conceptualizations of the phenomena to be expressed. [...] image-schemas are centrally employed [...]
in the conceptualization and verbalization of identical/comparable (posture) scene, and [...] different speech communities can construe these scenes differently by highlighting particular image schemas at the expense of others'' (330). Again, corpus-based observations may yield interesting insights into areas of cognitive linguistic research.
Stefan Th. Gries and Anatol Stefanowitsch, in my view, have edited an excellent selection of papers. The articles are generally of a very high quality and highly stimulating and show impressively how cognitive linguistics may benefit from corpus linguistic research and (advanced) statistical methods. As the title already makes clear, the volume first of all is aimed at researchers from a cognitive- and corpus-linguistic background.
The former will find articles that represent three traditional areas of cognitive linguistics, namely similarity and dissimilarity of senses and ways of describing their organisation, cognitive approaches to grammar with a special focus on aspects of transitivity, and, finally, studies on the relevance of image schemas for human conceptualization and how this is mirrored in language use. In addition to the insights presented in the individual articles, the cognitive linguist is likely to benefit enormously from seeing a vast range of corpus-linguistic and statistical methods at work. The studies presented, thus, no doubt open up new methodological perspectives for the field of cognitive linguistics.
The volume will also prove valuable for corpus linguists, as it shows a number of 'new' ways to exploit authentic data. While notions like 'mutual information', 'z-score', or 'chi-square' by now are part of received corpus-linguistic wisdom, this volume confronts the corpus linguist driven by the urge for objectivity with a large number of more advanced statistical methods, like collocational overlap estimation, collostructional analysis, or hierarchical cluster analysis, to name but a few. These new ways of analysing large amounts of authentic data should be welcome to any linguist working with corpora. To quote Jan Aarts (although on another topic): ''If you want a challenge, there it is''.
Still, while the papers clearly advocate the use of corpus-linguistic and advanced statistical methods, the reader never gets the feeling that these are regarded as ends in themselves; rather, they serve ancillary purposes. In this respect, the following quote by Gries, in my view, can be seen as representative of the attitude common to all of the papers: ''I have tried to emphasize the benefits of additional corpus-based evidence, but I should like to point out, however, that I do not advocate using corpus evidence alone. Corpus evidence can complement different research methodologies such as (psycho-)linguistic experiments, but it should not replace them'' (87).
Another group of researchers that will certainly benefit from this volume are lexicographers. The volume provides a number of case studies on identifying meaning and, most importantly, shows how meaning is tied to semantic and syntactic context. This book, like many others before, thus provides further evidence for the lack of strict boundaries between lexis and grammar, and may contribute to more accurate descriptions of meanings in dictionaries.
Finally, the proof-reading turns out to have been almost perfect. Only a very few errata remain, which is within more than reasonable limits for a book of roughly 350 pages.
On the whole, the volume makes for a highly stimulating and interesting read and shows numerous ways in which corpus-linguistic methods may help to complement cognitive approaches to linguistics. In my view, the illustration of a vast range of statistical methods is particularly appealing, and shows to what extent 'traditional' ways of analysis might benefit from the objective exploitation of usage-based data. Whether (cognitive) linguistics will really experience ''a major methodological paradigm shift in the direction of corpus work'' (14), as is the hope expressed in the introduction by Stefan Th. Gries, cannot of course be answered now - but this volume no doubt makes such a shift appear very attractive.
Geeraerts, Dirk (1989): ''Introduction: Prospects and problems of prototype theory'', Linguistics 27: 587-612.
Goldberg, Adele (1995): Constructions. A Construction-Grammar Approach to Argument Structure. Chicago: The University of Chicago Press.
Hopper, Paul and Sandra A. Thompson (1980): ''Transitivity in grammar and discourse'', Language 56: 251-299.
Lakoff, George (1987): Women, Fire, and Dangerous Things. What Categories Reveal about the Mind. Chicago: The University of Chicago Press.
Langacker, Ronald W. (1991): Foundations of Cognitive Grammar. Vol. II. Descriptive Applications. Stanford: Stanford University Press.
ABOUT THE REVIEWER:
Rolf Kreyer is an Assistant Professor of Modern English Linguistics in the department of English, American and Celtic Studies of the University of Bonn, Germany. His research interests include corpus linguistics, syntax, and text linguistics. He is the author of "Inversion in Modern Written English. Syntactic Complexity, Information Status and the Creative Writer", which was published in 2006 by Gunter Narr. At present he is working on a corpus-linguistic study that aims to analyse the interaction of language use and grammar.