LINGUIST List 19.670
Thu Feb 28 2008
Review: Corpus Linguistics: Fellbaum (2007)
Editor for this issue: Randall Eggert
This LINGUIST List issue is a review of a book published by one of our
supporting publishers, commissioned by our book review editorial staff. We
welcome discussion of this book review on the list, and particularly invite
the author(s) or editor(s) of this book to join in.
From: Randall Eggert <randylinguistlist.org>
Announced at http://linguistlist.org/issues/18/18-738.html
EDITOR: Fellbaum, Christiane
TITLE: Idioms and Collocations
SUBTITLE: Corpus-based linguistic and lexicographic studies
SERIES: Corpus and Discourse
PUBLISHER: Continuum International Publishing Group Ltd
Esa Penttilä, Department of English, University of Joensuu, Finland
This book consists of a collection of essays reporting the results of a
wide-ranging corpus linguistic project at the Berlin Brandenburg Academy of
Sciences, Germany, which concentrates on various aspects of German multi-word
units, primarily verb phrase idioms and support verb constructions. The project
is adjacent to another project, which deals with compiling a large corpus of
20th century German language that has provided the primary basis for the
empirical studies in the volume.
The volume begins with Christiane Fellbaum's introduction, in which she explains
the main aims of the project and situates it within the research tradition.
The merits of large-scale corpora in studies on idiomatic language are
well-acknowledged (see e.g. Moon 1998, Stubbs 2002), and this book continues the
trend by showing how multi-word units that have traditionally been regarded as
fixed, idiosyncratic expressions do not in the end dramatically differ from
non-idiomatic, syntactically free expressions. While Fellbaum's text outlines
the background for the book, it also introduces the reader to the field of
corpus-linguistic idiom studies in general, explaining the basic framework and
terminology and reviewing some of the seminal studies in the field.
The actual chapters in the book are divided into two parts. The first four
chapters are called ''Corpus, extraction and workbench'' and discuss the
technical, theoretical, and, to some extent, practical background of the
project, while the final six chapters form the ''Linguistic analysis'' and present
individual empirical studies conducted during the project.
The technical part of the book begins with Alexander Geyken's description of the
corpus compilation project (''The DWDS corpus: a reference corpus for the German
language of the twentieth century'', 23-40). It provides an account of the
various questions related to creating a balanced reference corpus and thus
offers a valuable lesson for anyone who is considering compiling a corpus of
their own. The emphasis is, naturally, on the 100-million-word core corpus
called the DWDS corpus, which contains German texts from each decade of the 20th
century balanced chronologically and by text genre, but the project also
involved the collection of a 900-million-word supplementary corpus of newspaper
texts from the 1990s. Together these two corpora are large enough to be reliably
used for studies on idioms, but the development work still continues.
The following three chapters describe some of the basic aspects related to the
tools and methods used for searching and analyzing the corpora. First, Alexander
Geyken and Alexey Sokirko (''Classifying NVGs/FVGs in an interactive parsing
process'', 41-53) give a brief account of the interactive parser they have been
developing to help lexicographers find suitable data from the corpus; the
corpus, after all, is far too large for anyone to inspect manually. Their
work is based on the idea of semi-automatic linguistic analysis, in which a
shallow parser first extracts, on a syntactic basis, a suitably sized set of
relevant examples, which can then be inspected manually. The parser development
continues, but the results of the experiments that have so far been conducted on
verb-nominalization constructions and function verb constructions look promising.
In his essay, Axel Herold (''Corpus queries'', 54-63) concentrates on the problem
of extracting enough relevant data from the corpus: the queries must allow us
to detect not just those variations that we could think of but also those
that we would not expect to turn up. After all, intuition often cannot
account for everything that actually happens in the real world, and this
is especially true of idiomatic expressions.
Gerald Neumann, Fabian Körner and Christiane Fellbaum (''A lexicographic
workbench for German collocations'', 64-77) describe the lexicographic workbench
used for analyzing and representing the data. The idiom examples extracted from
a corpus form example corpora, and the idioms in these corpora are analyzed
manually and represented in annotated templates, which contain detailed
information about the syntactic and semantic nature, and even the history, of
each example idiom and link it to other relevant expressions. These templates
constitute the core of the workbench and will be freely available to the
research community, which makes the whole project a very valuable contribution
to the field.
Katerina Stathi's chapter ''A corpus-based analysis of adjectival modification in
German idioms'' (81-108) begins the empirical part of the volume and provides a
comprehensive and careful investigation into the ways in which German idioms
allow adjectival modification. Stathi takes a critical view of earlier
suggestions and develops a fine-grained classification of adjectival
modification, which contains five different levels and functions hierarchically
in a fashion somewhat similar to Fraser's (1970) classic hierarchy of idiom
transformations, i.e. each modification that is permitted at a certain higher
level of the hierarchy automatically permits modifications at lower levels.
Stathi also considers the consequences her analysis has for idiom theory.
In their article ''Types of changes in idioms – some surprising results of corpus
research'' (109-137), Elke Gehweiler, Iris Höser and Undine Kramer take a
diachronic view of idioms and discuss the semantic and structural changes that
have occurred in idiomatic expressions in German during the 20th century. The
development of idiomatic expressions follows the same paths that have been
attested for single lexemes. This diachronic development is something that
cannot be adequately presented in conventional dictionaries, but the authors
suggest that their idiom database could serve as a suitable means to this end.
Christiane Hümmer discusses the possibly motivated nature of the contextual
behavior of idiomatic expressions (''Meaning and use: a corpus-based case study
of idiomatic MWUs'', 138-151). She comes to the conclusion that idiom behavior is
at the same time both motivated and arbitrary. Motivation is important for
explaining the links between the different semantic levels of idiomatic
expressions, but arbitrariness can be seen, for example, in the way that only
one of the possible motivated links between the lexicon and language use is
In the chapter called '''You fool her' doesn't mean (that) 'you conduct her
behind the light': (Dis)agglutination of the determiner in German idioms''
(152-163), Anna Firenze concentrates on the determiner variation that can be
found in German idioms. Although various grammar books and earlier studies have
categorically claimed that determiner changes in certain idioms automatically
turn them into non-idiomatic expressions, Firenze shows that this is not true.
On the contrary, the various determiner changes that are possible for
non-idiomatic language are also possible for idioms without loss of idiomatic
meaning. For some reason, the examples in this chapter are not glossed, which
makes the chapter slightly different from the other chapters of the book.
Angelika Storrer's chapter ''Corpus-based investigations on German support verb
constructions'' (164-187) analyzes German support verb constructions, also known
as light verb or nominalization verb constructions. Storrer divides the
constructions into two types, those in which the predicative noun following the
verb forms part of a prepositional phrase (the construction type is abbreviated
as PP-SVC) and those in which the noun is the head of a direct object (DO-SVC),
and shows how these types behave slightly differently in terms of
morphosyntactic variation. She also shows that the earlier assumption that
support verb constructions can in most cases be freely replaced by the
corresponding base verb constructions is unjustified; in many
cases, contextual or semantic restrictions prohibit such substitutions. The
chapter contains a lot of interesting information. However, a few of the
quantitative claims made in it would have benefited from statistical testing,
although most of the points made in the text do not require
particular statistical verification.
The volume ends with Christiane Fellbaum's discussion of the roles of
constructional meaning and lexical meaning in the semantics of idiomatic
expressions (''Argument selection and alternations in VP idioms'', 188-202).
Following the ideas of Goldberg (1995), she argues for the importance of
lexeme-independent constructional meaning in explaining the semantics of idioms.
As a consequence, the semantic analysis of idioms requires that idiom-specific
syntactic frames that essentially contribute to the meaning be recognized.
This book is a neat and compact package of studies illuminating the phenomenon
of German multi-word units from various angles. Its object is interesting and
current in linguistics, and the fact that the studies are corpus-based makes it
even more topical. Although the articles approach the phenomenon from various
perspectives and pose fairly different research questions, they are closely
related to one another and support the claims made in the book as a whole.
An additional merit of the book is that it bridges the gap between the German
and Anglo-American research traditions of idiomatic language. After all,
phraseology plays an important role in the German linguistic tradition, part of
which is unfortunately little known to researchers who do not read
German. In addition to containing original studies, which discuss idiomatic
expressions in German and thus offer information that could be compared, for
example, with the corresponding phenomena in English, the book also draws
attention to a prominent body of idiom literature that has been published either
in German or in French and has therefore so far remained largely unrecognized in
the English-speaking world.
Since this is a work in progress, one could always ask whether it would
have been a good idea to delay the publication, for example, by a year, because
this would have allowed time for some of the work to be developed a bit further.
I, however, prefer the publication at this point. The stage the work has
reached at the moment (or had reached when the articles were written) is now
reported in the articles and offers valuable information for researchers who are
planning or have already begun to work on similar projects; had the articles
been written later, some of the questions that are included in them and can be
of help for future projects might have been left out, since they would have been
solved already. Moreover, since the authors at various points emphasize that
this is a work in progress and will be developed continuously, there would have
been no guarantee that a slight delay would have found the project at a stage
essentially different from its present one.
Unfortunately, the print quality of some of the figures in chapters 2
and 4 is fairly poor. And it escaped the eye of the editor that the text on a
few occasions refers to color codes that are used in the computer programs while
the figures in the book are black and white. Nevertheless, the book reads well
and the type editing is almost faultless. All in all, Fellbaum's _Idioms and
Collocations_ is a very welcome contribution to the field of idiom research and
offers valuable information about corpus-based study of multi-word units.
REFERENCES
Fraser, Bruce. (1970) Idioms within a transformational grammar. _Foundations of
Language_, 6, 22-42.
Goldberg, Adele. (1995) _Constructions: A Construction Grammar Approach to
Argument Structure_. Chicago: University of Chicago Press.
Moon, Rosamund. (1998) _Fixed Expressions and Idioms in English: A Corpus-Based
Approach_. Oxford: Oxford University Press.
Stubbs, Michael. (2002) _Words and Phrases: Corpus Studies of Lexical
Semantics_. Oxford: Blackwell.
ABOUT THE REVIEWER
Esa Penttilä, PhD, is currently working as senior assistant at the University of
Joensuu, Finland. His main research interests are in cognitive linguistics, in
particular idiomatic language and idiomatic constructions. He is also interested
in the philosophy of language.