This volume provides an up-to-date survey of corpus linguistics, a field
whose methodology has revolutionized much of the empirical work done in
most areas of linguistic study over the past decade.
Corpus linguistics investigates human language by starting out from large
collections of texts, whether spoken, written, or recorded. These language
corpora, which are now regularly available in electronic form, are the
basis for quantitative and qualitative research on almost any question of
linguistic interest. Many techniques that are in use in corpus linguistics
today are rooted in the tradition of the late 18th and the 19th centuries, when
linguistics began to make use of mathematical and empirical methods. Modern
corpus linguistics has used and developed these methods in close connection
with computer science and computational linguistics.
The handbook sketches the history of corpus linguistics, shows its
potential, discusses its problems, and describes various methods of
collecting, annotating, and searching corpora as well as processing corpus
data. It also reports case studies that illustrate the wide range of
linguistic research questions addressed in corpus linguistics. The more
than 60 articles included in the handbook are divided into five sections:
(1) the origins and history of corpus linguistics and surveys of its
relationship to central fields of linguistics;
(2) corpus compilation;
(3) corpus types;
(4) preprocessing of corpora;
(5) the use and exploitation of corpora.
The final section gives an overview of the results that corpus studies
have obtained in phonetics, phonology, morphology, syntax, semantics,
sociolinguistics, historical linguistics, stylometry, dialectology, and
discourse analysis. It also reports on recent advances made in human and
machine translation, contrastive studies, computer-assisted language
learning, and automatic summarization.
The contributors to the volume are internationally known experts in their
respective fields. The handbook is intended for a wide audience ranging
from teachers, university students, and scholars to anyone interested in
the use of computers in linguistic analyses and applications.