AUTHOR: Wulff, Stefanie
TITLE: Rethinking Idiomaticity
SUBTITLE: A Usage-based Approach
SERIES: Corpus and Discourse
PUBLISHER: Continuum International Publishing Group Ltd
YEAR: 2008
Phoebe Ming Sum Lin, School of English Studies, University of Nottingham, United Kingdom
Wulff's 'Rethinking Idiomaticity: A Usage-based approach' presents a study which aims to isolate factors underlying native speakers' intuitive idiomaticity judgement. Adopting a corpus linguistic methodology, the study successfully develops a model which identifies not only the factors underlying intuitive idiomaticity judgement, but also the weightings of their contribution to the judgement. The study is based on 39 V NP-constructions chosen from an idiom dictionary and the British National Corpus (BNC). The constructions were first presented to native speakers of British English to obtain their idiomaticity ratings of these constructions. Then, a series of formulas were applied to calculate compositionality and flexibility of V NP-constructions based on frequency information from corpus data. In the study, flexibility is a meta-concept comprising three specific sub-parameters, namely tree-syntactic, lexico-syntactic and morphological flexibility. Therefore, the flexibility formulas were applied once for each of the sub-parameters. Finally, multiple regression analysis was carried out to relate various measures of compositionality and flexibility to the intuitive idiomaticity judgement of native speakers. The result is a statistical model that reflects the unconscious factors native speakers assess when judging the idiomaticity of the 39 V NP-constructions.
The book is divided into an introduction and 7 chapters. In the Introduction, the author provides the background of the study. She observes that when presented with idiomatic expressions, such as 'take the plunge', 'see a point' and 'write a letter', native speakers can readily rank their idiomaticity. There are a number of suggestions about the basis of this intuitive idiomaticity judgement. As Wulff points out, early studies tend to equate idiomaticity with semantic non-compositionality. The idea is that native speakers rely on the criterion of whether the meaning of an idiomatic expression is the sum of the meaning of its constituting words. Later studies, however, suggest that non-compositionality is indeed scalar and it may not be the sole factor in idiomaticity judgements. The extent to which an expression accepts adverbial or adjectival modifications and passivization may also be part of native speakers' unconscious assessment of its idiomaticity. However, there is a gap for empirical research to investigate the interplay between these factors and the extent to which each factor contributes to the intuitive idiomaticity judgements of native speakers. The author argues that the quantitative corpus linguistic approach, which is particularly suited to deal with multifactorial phenomena, will shed new light on our understanding of idiomaticity.
Chapter 1, 'Theoretical issues', reviews different approaches to idioms and idiomaticity, with the emphasis on the Constructionist approach which informs the study in the book. After an overview of the approaches of discourse analysis, phraseology and psycholinguistics, Wulff elaborates on how the concepts of idioms and idiomaticity can be integrated into the approach of Construction Grammar. She argues that idiomaticity is a property inherent in all linguistic items regardless of their size and degree of schematization. This argument is in line with Croft and Cruse's (2004) concept of 'schematic idioms' which states that any construction, be it a core idiom like 'kick the bucket' or a regular syntactic expression, can be conceived of as being idiomatic.
Chapter 2, 'Methodological issues', expounds the approach of quantitative corpus linguistics (Stefanowitsch and Gries 2003, 2005; Gries and Stefanowitsch 2004) and describes the processes involved in collecting intuitive idiomaticity judgement data from native speakers of British English. Thirty-nine idiomatic V NP-constructions were chosen on the basis of their frequency in the BNC, i.e. 90 occurrences or more. These constructions were embedded in carefully designed sentences and presented to 39 native-speaker first-year undergraduate students of English linguistics at a British university. Without being given an explicit definition of idiomaticity in the instructions, the participants were asked to give relative idiomaticity ratings to the 39 target constructions and to indicate how reasonable it was for each to be included in dictionaries or phrase books. The results are presented in a graph in which the idioms are arranged in order of their idiomaticity ratings. The author attempts to explain the ranking of some of the idioms in the graph. She suggests that the match between the grouping of the idioms on the graph and established idiom pattern typologies indicates that the participants were, as planned, judging idiomaticity. The reliability analysis shows a high correlation between the idiomaticity judgements of the native speakers. Finally, the author explains the decision to use non-linguist native speakers as judges in the study and not to provide an explicit definition of idiomaticity in the instructions.
Chapter 3, 'Compositionality', begins by summarizing previous approaches to idiom compositionality. While earlier studies tend to associate idioms with non-compositionality, the more recent view, which is supported by much empirical psycholinguistic evidence, suggests that even semantically opaque idioms like 'spill the beans' can be regarded as relatively compositional. Compositionality of idioms, as Wulff argues along with other linguists, is a continuum, and each idiom can be placed along a scale of compositionality. Moving to the calculation of compositionality, Wulff turns to work on verb-particle constructions (VPCs) for practical ideas. She reviews five approaches used in the empirical literature to calculate the compositionality of VPCs and chooses to adapt Berry-Rogghe's (1974) method. Originally applied to VPCs, Berry-Rogghe's compositionality formula divides the number of collocates that the VPC and the particle share by the total number of the VPC's collocates. The idea is that the more the particle contributes to the semantics of the VPC, the more of its collocates will be among the set of the VPC's collocates. Several adaptations were made to Berry-Rogghe's formula to suit V NP-constructions. One necessary adaptation is an overall compositionality measure that is a weighted average of the compositionality of the verb and the noun. The precise weighting is determined 'exploratively' according to the data. In other words, different combinations of weightings are calculated, and the settings that produce the best results are then chosen. The resulting compositionality formula was applied to the 39 V NP-constructions, and the numerical results are presented graphically.
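For readers who want to see the collocate-overlap idea in concrete terms, here is a minimal sketch. It is not Wulff's implementation: the function names, the invented collocate sets and the equal default weights are all assumptions made for illustration, and the book's actual weights and collocate extraction procedure are not reproduced here.

```python
def overlap_share(component_collocates, construction_collocates):
    """Share of the construction's collocates that the component also has."""
    construction = set(construction_collocates)
    if not construction:
        return 0.0
    shared = construction & set(component_collocates)
    return len(shared) / len(construction)


def weighted_compositionality(verb_colls, noun_colls, construction_colls,
                              w_verb=0.5, w_noun=0.5):
    """Weighted average of the verb and noun shares (weights are hypothetical)."""
    verb_score = overlap_share(verb_colls, construction_colls)
    noun_score = overlap_share(noun_colls, construction_colls)
    return (w_verb * verb_score + w_noun * noun_score) / (w_verb + w_noun)


# Invented collocate sets for a construction such as 'write a letter'
construction = {"post", "reply", "send", "friend"}
verb = {"book", "reply", "pen", "post"}
noun = {"post", "reply", "send", "stamp", "friend"}

score = weighted_compositionality(verb, noun, construction)
print(round(score, 3))  # 0.75: the verb shares 2 of 4 collocates, the noun 4 of 4
```

An 'explorative' choice of weights would amount to rerunning `weighted_compositionality` with different `w_verb` and `w_noun` values and keeping the setting judged to work best.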
Chapter 4, 'Flexibility measures', has the same structure as the previous chapter. It begins with a review of theories, followed by the introduction of formulas to calculate flexibility. From the review, the author identifies a general consensus in the literature on three points: 1) most idioms are flexible at least to some extent; 2) flexibility tends to correlate with token frequency; and 3) compositionality does not correlate with any kind of flexibility to an extent that licenses the assumption of a causal relationship between the two. After discussing the different kinds of flexibility (i.e. tree-syntactic, lexico-syntactic, morphological and phonetic flexibility), the author moves on to the ways of calculating idiom flexibility indicated in previous empirical work. She borrows Barkema's (1994) flexibility formula for noun phrases and the method of calculating entropy from physics and applies them to the target V NP-constructions. Here, she further distinguishes variables under the three kinds of flexibility (phonetic flexibility is excluded from the study). For the 39 V NP-constructions, these variables capture the aspects in which each kind of flexibility can vary. For example, morphological flexibility (MF) can vary in person, tense, voice and so on. Therefore, Person, Tense and Voice are variables under MF. For tree-syntactic, lexico-syntactic and morphological flexibility, a total of 27 variables are identified. The next step is a long report comparing the behaviour of the 39 V NP-constructions across each of the variables based on the aforementioned formulas.
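To illustrate how an entropy-based flexibility score might behave, the sketch below computes a Shannon-style entropy over the corpus counts of a variable's levels (e.g. the tense forms in which a construction occurs). This is an assumption about the general technique, not the book's formula: the normalisation by the maximum possible entropy, the function names and the counts are all invented for this example.

```python
import math


def entropy(counts):
    """Shannon entropy (in bits) of a distribution given as raw counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    probs = [c / total for c in counts if c > 0]
    return -sum(p * math.log2(p) for p in probs)


def normalized_entropy(counts):
    """Entropy divided by its maximum for this number of levels (0 to 1)."""
    levels = len(counts)
    if levels < 2:
        return 0.0
    return entropy(counts) / math.log2(levels)


# Invented counts of three tense forms for two hypothetical constructions
frozen_like = [98, 1, 1]      # almost always one form: low flexibility
flexible_like = [40, 30, 30]  # forms spread out: high flexibility

print(round(normalized_entropy(frozen_like), 3))
print(round(normalized_entropy(flexible_like), 3))
```

On this reading, a construction that occurs almost exclusively in one form scores near 0, and one that spreads evenly over all forms scores near 1.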
Chapter 5, 'The idiomatic variation continuum', uses Principal Component Analysis (PCA) to explore the structure of the combined results of the compositionality and flexibility measures from the two previous chapters. The compositionality and flexibility measures and corpus frequency together make up 20 parameters. In this study, PCA is used to explore whether these 20 parameters can be grouped into over-arching factors (or 'principal components') so as to reduce the complexity of the data and reveal which parameters are closely correlated. From the statistical perspective, the PCA also captures the maximum variance within the compositionality and flexibility data, which offers an opportunity to explore the idiomatic variation displayed in the 39 constructions. After the PCA, the 20 original parameters are compressed into 8 components, which account for 74 percent of the total variance in the target constructions. The component made up mainly of tree-syntactic flexibility combined with one morphological flexibility parameter (i.e. Voice) is found to explain the most variance and is therefore the most important. The second most important component comprises two morphological flexibility parameters (i.e. NumV and Mood). Contrary to the common belief in the literature, the compositionality parameter has only limited power to explain idiomatic variation. Combined with another parameter (i.e. corpus frequency), it forms only the fourth most important component in the PCA.
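As a rough illustration of what such a PCA does, the sketch below extracts components from the correlation matrix of an invented data set and reports how much variance they explain. Everything here is an assumption for demonstration: the data are random, there are 6 parameters rather than the study's 20, and the eigenvalue-above-one retention rule is a common convention that the book may or may not have used.

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented data: 39 constructions x 6 parameters driven by 2 hidden factors
latent = rng.normal(size=(39, 2))
loadings = rng.normal(size=(2, 6))
data = latent @ loadings + 0.5 * rng.normal(size=(39, 6))

# PCA on the correlation matrix (equivalent to standardizing each parameter)
corr = np.corrcoef(data, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(corr)

# eigh returns ascending eigenvalues; reorder so the largest comes first
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

explained = eigenvalues / eigenvalues.sum()
cumulative = np.cumsum(explained)

# Hypothetical retention rule: keep components with eigenvalue above 1
n_keep = int(np.sum(eigenvalues > 1.0))

print("explained variance per component:", np.round(explained, 3))
print("cumulative:", np.round(cumulative, 3))
print("components kept:", n_keep)
```

The columns of `eigenvectors` show which parameters load together on each component, which is the kind of grouping the chapter reports for tree-syntactic and morphological flexibility.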
Chapter 6, 'The idiomaticity continuum', is the centre of the book as all the work in the previous chapters is brought together to address the key research question: 'which factors do speakers rely upon when assessing the idiomaticity of a construction?' Multiple regression analysis (MRA) was used to construct a model of native speakers' intuitive idiomaticity judgement. In the MRA, the aforementioned 20 parameters from chapters 3 and 4 were the independent variables, and the idiomaticity judgement ratings from chapter 2 were the dependent variable. Amongst other findings, the MRA indicates that the 20 parameters altogether account for nearly 80 percent of the variance in the dependent variable (i.e. the idiomaticity judgement). The parameters that contribute most to the variance are again the two morphological flexibility parameters, NumV and Mood. This finding suggests that when native speakers judge the idiomaticity of the 39 V NP-constructions, NumV and Mood are probably the factors that are considered unconsciously. Echoing another finding of the PCA in the previous chapter is the limited importance of the compositionality parameter (i.e. the fifth in terms of ranking). As the author points out, the agreement in the results in Chapters 5 and 6 strengthens the argument for this statistical model of native speakers' intuitive idiomaticity judgements.
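The logic of relating corpus-derived parameters to judgement ratings can be sketched with an ordinary least-squares fit. This is only an illustration of multiple regression in general: the helper `fit_ols`, the three invented predictors and the simulated ratings are assumptions, and the sketch does not reproduce the book's 20 parameters, its model selection or its results.

```python
import numpy as np


def fit_ols(predictors, ratings):
    """Fit ratings = b0 + b1*x1 + ... by least squares; return (coefs, R^2)."""
    design = np.column_stack([np.ones(len(predictors)), predictors])
    coefs, *_ = np.linalg.lstsq(design, ratings, rcond=None)
    fitted = design @ coefs
    ss_res = np.sum((ratings - fitted) ** 2)
    ss_tot = np.sum((ratings - ratings.mean()) ** 2)
    return coefs, 1.0 - ss_res / ss_tot


rng = np.random.default_rng(1)

# Invented data: 39 constructions, 3 hypothetical parameters
predictors = rng.normal(size=(39, 3))
true_weights = np.array([0.8, -0.5, 0.1])  # made up for the simulation
ratings = 1.0 + predictors @ true_weights + 0.3 * rng.normal(size=39)

coefs, r_squared = fit_ols(predictors, ratings)
print("intercept and weights:", np.round(coefs, 2))
print("share of variance explained:", round(float(r_squared), 2))
```

In this setup, `r_squared` plays the role of the 'variance accounted for' figure quoted in the chapter, and the size of each fitted weight indicates how much a parameter contributes to the predicted rating.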
Chapter 7, 'Towards a new model of idiomaticity', concludes the book by highlighting the significance of this study for research on idioms, corpus linguistics and Construction Grammar. In the light of the findings of this study, Wulff proposes an extended model of constructions to which an idiomaticity dimension is added. In the model, constructions at or above the level of complex constructions in terms of structural complexity must also be considered for their position along the idiom-collocation continuum. This new idiomaticity dimension is not a single-layered representation, but a multi-layered one. Therefore, the profile of a complex construction can have as many as 20 layers along the idiom-collocation continuum, with each layer representing one of the aforementioned 20 parameters. This extended model of constructions can integrate the study's findings about compositionality and flexibility into the framework of Construction Grammar.
EVALUATION
Among the many recent publications in the area of idioms, phraseology and formulaic language, this book impresses with its innovative approach to idiomaticity. Despite the fact that introspection informed many of the earliest investigations in the area, Wulff's study is one of the very few that empirically examine the nature of native speakers' intuitive idiomaticity judgements. The author also makes an original point and contribution by using non-linguists as native speaker judges. She reasons 'it is plausible to assume that language experts will have had considerable exposure to theoretical approaches to idiomaticity, so their [idiomaticity] judgements will hardly be unfiltered' (p. 32). As she says, if intuitive idiomaticity judgement is real, it should not only exist in the linguists' heads. Non-linguist native speakers should also be able to discern idiomaticity.
The aim of the book is to develop a statistical model of native speakers' intuitive idiomaticity judgements using quantitative corpus linguistic methodology. This aim is achieved as the study successfully isolates the factors that native speakers may be assessing unconsciously as they judge the idiomaticity of expressions, and points out the weightings of these factors. These positive results encourage the use of quantitative corpus linguistic methodology to address other linguistic problems. Furthermore, this study provides important counter-evidence against the suggestion that intuitive idiomaticity judgements are merely random. As the results of the MRA show, native speakers unconsciously draw on their implicit knowledge of the distributional characteristics of the idiomatic expressions when making intuitive judgements. This suggestion seems to challenge previous views within corpus linguistics that human intuitions are not good at recording facts about frequency in language use (see Sinclair 1991). If Wulff's suggestion is correct, as is evidenced by the results of her study, the nature, validity and reliability of introspection warrant further investigation.
A debatable issue in the book is the 'explorative' approach to a few important decisions in the process of developing the intuitive idiomaticity judgements model. When a choice has to be made between a few viable options, the study's approach is to try all the options and then select the one which produces results closest to the author's expectation. This liberal and practical approach to decision-making has its merits for being completely data-driven, but it may also be challenged for its ambivalence. This approach may have elevated the success rate of the outcome idiomaticity judgement model, but it may also have compromised the generalizability and the applicability of the model developed in this study to other datasets and studies.
To conclude, this book is a valuable addition to the field, for it offers many innovative perspectives on issues in idiomaticity research. All in all, it is an interesting and useful read and one that is highly recommended to researchers of idiomaticity and formulaicity.
Barkema, H. (1994). Determining the syntactic flexibility of idioms. In U. Fries, G. Tottie & P. Schneider (Eds.), Creating and using English language corpora (pp. 39-52). Amsterdam: Rodopi.
Berry-Rogghe, G. L. M. (1974). Automatic identification of phrasal verbs. In J. L. Mitchell (Ed.), Computers in the humanities (pp. 16-26). Edinburgh: Edinburgh University Press.
Croft, W., & Cruse, D. A. (2004). Cognitive Linguistics. Cambridge: Cambridge University Press.
Gries, S. Th., & Stefanowitsch, A. (2004). Extending collostructional analysis: A corpus-based perspective on 'alternations'. International Journal of Corpus Linguistics, 9(1), 97-129.
Sinclair, J. M. (1991). Corpus, Concordance and Collocation. Oxford: Oxford University Press.
Stefanowitsch, A., & Gries, S. Th. (2003). Collostructions: Investigating the interaction of words and constructions. International Journal of Corpus Linguistics, 8(2), 209-243.
Stefanowitsch, A., & Gries, S. Th. (2005). Covarying collexemes. Corpus Linguistics and Linguistic Theory, 1(1), 1-43.
ABOUT THE REVIEWER
Phoebe M. S. Lin is a PhD student at the School of English Studies,
University of Nottingham. She is currently working on her thesis, which
explores the use of native speaker intuition as a method to identify
formulaic sequences and the prosodic features of formulaic sequences. Her
research interests include formulaic language, intonation, corpus
linguistics and psycholinguistics.