Review of Corpus Applications in Applied Linguistics
SUMMARY This book covers the traditional areas of applied linguistics, such as Second Language Acquisition (SLA), professional discourse, and the development of language teaching materials, but also includes new domains, such as English as a Lingua Franca (ELF). After a brief discussion of the impact of corpora on the field of applied linguistics, the introduction by the three editors, “Corpora in Applied Linguistics”, provides an overview of the four remaining sections. The book contains thirteen chapters, all by different authors (in one case by two co-authors), and focuses mainly on English, with one chapter providing data from other European languages and another describing a documentary corpus of Chinese photographs. There is also an Afterword, “The Problems of Applied Linguistics”, by Susan Hunston.
The first main section is composed of three chapters focusing on professional, or institutional, discourse. Michael Handford provides a summary of research over the past twenty years, and presents a series of specialised corpora, including those produced by the Hong Kong Polytechnic University, which can be searched online. He also describes his current project, compiling a corpus of professional international English speech data, focusing on construction industry discourse. Finally, he shows how corpus data can be used to inform pedagogical materials, with examples taken from the Cambridge Business Advantage series.
Ken Hyland discusses studies of academic discourse, which also provide information useful for teaching this type of language to students and researchers. Academic discourse is not a single unit, but a composite of many different genres, and students need to learn how to understand different types of academic discourse, just as researchers need to learn how to produce context-appropriate forms. The example presented by Hyland is a case-study of gender differences in book reviews, in the contrasting fields of philosophy and biology. In the final chapter in this section, Almut Koester reviews corpora of workplace discourse, and discusses several studies based on CANBEC, CANCODE, HKCSE and other smaller corpora, using both qualitative and quantitative analyses. She shows that lexical features, keywords, phraseology and pragmatics can all be studied by means of corpora, and that the distinctive characteristics of workplace discourse can thus be established, confirming the institutional aspects of this type of language.
The second section, “Corpora in Applied Linguistics Domains”, contains four chapters, each focusing on a different aspect of this multidisciplinary field: translation studies, forensic linguistics, gender studies and media studies.
Sara Laviosa illustrates the uses of corpora in translation studies, presenting the different types of corpora that can be used, whether multilingual, bilingual or monolingual, parallel or comparable, unidirectional or bidirectional. Translation universals such as simplification, explicitation and normalisation are also explored. Translation examples are drawn from several languages, almost always between English and another European language (Norwegian, German, and Spanish), with a more detailed presentation of a project using corpora to train students in specialised translation from English to Italian.
The field of forensic linguistics, according to John Olsson, owes a great deal to corpus linguistics and quantitative analysis. He provides a comprehensive overview of the ways in which corpus investigation can assist in identifying authorship, by describing in some detail a specific case involving an unnamed actress. He also discusses the limits of corpus analysis regarding the evolution of legal terminology and the need to build corpora of specialist text types.
Like Hyland, Paul Baker investigates gender, but in a broader discussion of representations in general corpora, and a specific study of the term “metrosexual” in British newspapers. He points out certain problems with the interpretation of corpus data and pleads for an approach combining concordance analysis with the more detailed study of longer passages of text. In the final chapter in this section, Anne O’Keeffe presents the core applications of corpus linguistics to the study of media discourse (keywords, frequency lists and concordances), providing many detailed examples of the types of analysis that can be fruitfully undertaken in this field, drawing upon a corpus of media interviews, and several reference corpora.
The third section, “Corpora in New Spheres of Study”, contains three chapters, on ELF (English as a Lingua Franca), texting, and a photograph corpus.
Barbara Seidlhofer discusses two specific ELF corpora, VOICE and ELFA, their similarities and their differences, and the implications of this type of research for Applied Linguistics, while underlining the validity of observation rather than introspection and elicitation to provide information about language in use, particularly for ELF, which by definition has no native speakers.
Texting is a relatively new form of expression, and Caroline Tagg presents many of the problems that its study involves, first in collecting the data, and then in its standardisation, with respelling and code-switching high on the list of the challenges encountered. Perhaps because such data are difficult to obtain, Tagg provides a list of freely available text corpora, and one commercially available corpus.
Gu Yueguo presents “A Conceptual Model for Segmenting and Annotating a Documentary Photograph Corpus”. The corpus contains several million photographs stored in digital form, and the author discusses in great detail how to annotate such a corpus, so that a computer can analyse it. Surprisingly, segmentation and annotation are performed manually, based on the MPEG-7 standard, and Protégé is used to construct a skeleton ontology for knowledge representation.
The final section, “Corpora, Language Learning and Pedagogy”, contains three chapters, on learner corpora and SLA, using corpora in the classroom, and for materials design.
Chau Meng Huat provides an overview of learner corpora research, illustrated by a case study of the development of L2 phraseological competence among Malaysian 13-year-olds. He also outlines ongoing learner corpus initiatives, and discusses the terminology used and its impact on research (learners vs. users).
Lynne Flowerdew discusses a range of approaches to the use of corpora in the classroom: lexical, functional and genre-based. She ends with a short case-study of a corpus-informed course in report-writing, which also systematically integrated training in corpus consultation strategies at each stage of the writing process.
The chapter by Michael McCarthy and Jeanne McCarten gives a brief history of corpus-informed language teaching, before describing the production of spoken corpus-based materials that are both user- and teacher-friendly, illustrating the discussion with excerpts from the Touchstone series of textbooks, which they co-authored with Helen Sandiford. Elements such as word frequency lists, keyword lists, collocation statistics and chunk lists were used to analyse the data in order to select the most pertinent items to be integrated into the course. The focus is on the development of conversational strategies, such as organising, topic management and listenership, to encourage learner autonomy.
Finally, Susan Hunston’s Afterword, “The Problems of Applied Linguistics”, discusses the book from the viewpoint of both Corpus and Applied Linguistics. The main difference seems to be one of perspective: in Applied Linguistics, the focus is on the individual text and corpora are used to validate the analysis, whereas in Corpus Linguistics texts are selected to form a corpus which, taken as a whole, will provide information about language. The patterns observed can then be used to interpret a specific text.
EVALUATION As Hunston points out, applied linguists undertake research into language with relevance to real-world problems, and each chapter shows how using corpora can assist in reaching that goal. Readers will find many interesting suggestions and ideas for future study and research, even though the specific focus of some chapters may be outside their usual interests.
English for international communication is frequently the focus, and most chapters have some pedagogical aspects to interest both researchers and language teachers. For students, it would have been helpful to include a section presenting the dos and don’ts of using corpora in applied linguistics; a reference list of corpora or corpus tools and software would also be useful, as only Tagg and Seidlhofer provide such details. Some information about corpora can be gleaned from the subject index, but no software is listed there. A longer, more extensive introductory chapter could have provided a better overview of what corpora can bring to applied linguistics. The corpus linguist will notice that Sinclair appears twice in the list of authors (under both J. and J. McH.), but that Firth for Applied Linguistics means Alan, not John Rupert, despite the emphasis on the importance of social context.
One of the drawbacks of the book is that the chapters are divided into artificial groups rather than introduced individually, which would help to underline their many common features. Although each separate chapter makes a valid contribution to the general theme of using corpora in applied linguistics, the book reads more like a collection of conference papers than a set of chapters specially commissioned to form a unified whole. The Afterword by Susan Hunston would make a better starting point than the opening chapter by the editors, if the reader wishes to envisage the book as a coherent whole.
Several articles stand out. Sara Laviosa’s presentation of the importance of corpora in translation studies is thorough, detailed and inspiring. It seems almost impossible to imagine the field without corpus input and she predicts that corpus linguistics, translation studies and computer science will become even closer in the future.
Similarly, the impassioned defence of ELF as a valid form of communication raises many questions for the language teacher, and more cross-references between chapters on such points would have been welcome. Seidlhofer does refer to the chapter on SLA by Chau, but neither Handford nor McCarthy and McCarten mention ELF explicitly, although their chapters discuss the role of English in international communication.
The most unusual chapter is about the annotation of a photographic corpus, which is not readily classified under Applied Linguistics or Corpus Applications, yet still makes for fascinating reading (and the author promises to compensate any reader who undertakes the task of creating such a corpus and regrets it). Although Gu makes a good case for his model, it is debatable whether the language teacher really needs such a labour-intensive image database, when Wikimedia Commons and Google Images allow almost anyone with basic computer skills to access copyright-free pictures.
This book provides a useful update on what has happened in the field since “Corpora in Applied Linguistics” (Hunston, 2002), which focused more on what corpus linguistics could bring to topics such as language teaching and lexicography. The two books complement each other, with the earlier one providing much of the information students will need in order to undertake the types of cutting-edge research described in “Corpus Applications in Applied Linguistics”.
REFERENCES Hunston, S. 2002. Corpora in Applied Linguistics. Cambridge University Press.
ABOUT THE REVIEWER:
Carmela Chateau-Smith is a lecturer in English for Specific Purposes at the University of Burgundy, Dijon, France. She works in the Earth and Environmental Sciences Department and at the University Language Centre. She has recently completed a PhD in corpus linguistics, investigating language change at a moment of paradigm shift in the domain of Earth Sciences, with a diachronic corpus of geological English, WebsTerre. She is also interested in learner corpora and CEF levels, the language of wine, and the use of English as an international language for scientific communication.