Review of Phonetic Interpretation

Reviewer: David Deterding
Book Title: Phonetic Interpretation
Book Author: John Local, Richard Ogden, Rosalind Temple
Publisher: Cambridge University Press
Linguistic Field(s): Linguistic Theories
Book Announcement: 15.3336

Date: Fri, 26 Nov 2004 13:03:28 +0800
From: David Deterding
Subject: Phonetic Interpretation: Papers in Laboratory Phonology VI

EDITORS: Local, John; Ogden, Richard; Temple, Rosalind
TITLE: Phonetic Interpretation
SUBTITLE: Papers in Laboratory Phonology VI
PUBLISHER: Cambridge University Press
YEAR: 2003

David Deterding, NIE/NTU, Singapore


This book is a selection of papers from the Laboratory Phonology VI
conference held in York in 1998. Though it has taken over five years to
come out (and so it has in fact emerged later than the comparable volume
for the 2000 Laboratory Phonology conference, Gussenhoven & Warner 2002),
it still represents a valuable and compact overview of recent work by a
number of prominent scholars from around the world into the nature of
phonological representation and its connection with phonetic realisation.

As is the norm for research under the rubric of Laboratory Phonology (e.g.
Connell & Arvaniti 1995, Broe & Pierrehumbert 2000), the focus is on the
fine details of articulation and perception and how research into these
can provide evidence about the nature of phonological representation, so
most of the papers report on meticulous measurements of data from a small
number of speakers producing carefully prepared material under laboratory
conditions, and there is almost no discussion of naturally occurring
speech.

After an introduction by the editors, the book is divided into four Parts,
each with four Chapters followed by a critical commentary on the papers in
that Part, though the first paper in each of Parts I, II and III does not
undergo critical evaluation in this way. (These three papers are the
longest, with an average of 25.7 pages compared with 16.2 pages for the
others, so one assumes they represent the keynote presentations from the
conference.) Only in Part IV are all four preceding papers discussed in
the critical commentary.


Part I is on phonological representation in the lexicon.

In Chapter 1, Mary Beckman and Janet Pierrehumbert report on a priming
experiment in which subjects gave the first word that came to mind in
response to pairs of words either with similar meaning or shared phonemes,
and they conclude that the results support a model in which the semantic
and phonological entries for a word are stored in separate but connected
compartments of the lexicon. Furthermore, they argue that an abstract
phonemic representation for lexical items assists infants in negotiating
the bottlenecks in language acquisition.

In Chapter 2, Sarah Hawkins and Noel Nguyen build on previous work showing
that the quality of an initial /l/ is affected by the voicing of the coda,
and they predict a longer reaction time (RT) for cross-spliced tokens.
Although the expected effect was not found, for real words they did find a
correlation between RT and the size of the acoustic mismatch after cross-
splicing. This was especially true for a voiceless coda, probably because
the longer vowel for a voiced coda allows the relatively weak perceptual
cues in the initial /l/ to be overridden. Finally, they argue that the
fact that cues to coda voicing extend throughout the syllable lends
support to a holistic rather than phoneme-based model of word recognition.

In Chapter 3, Jennifer Hay, Janet Pierrehumbert and Mary Beckman describe
experiments in which subjects listened to imaginary words created by cross-
splicing, with items containing common medial nasal-obstruent clusters,
e.g. /slentu/, contrasted with less likely or impossible ones,
e.g. /slemku/. The listeners reported how well-formed they judged the
words and also transcribed what they heard. It was found that the
perceived well-formedness of a word is related in a gradient fashion to
the likelihood of its occurrence, and this can broadly be predicted from
counts of attested phoneme combinations in a database such as CELEX.

In Chapter 4, Richard Wright investigates the acoustic quality of vowels
in "easy" words (like 'gave') and "hard" words (like 'mace'), where
easy/hard is determined not just by frequency of occurrence but also by
the number of words that sound similar, and he finds that there is greater
dispersion in the F1/F2 space for the vowels of hard words, especially
those with a point vowel such as /i, a, u/. Furthermore, he reports that
his speakers differed substantially, with one male and one female speaker
having greater reduction in dispersion for the easy vowels than the other
eight speakers.

In Chapter 5, John Coleman comments on Chapters 2, 3 and 4 and observes
that although Laboratory Phonology is more of a way of doing phonology
than a theory in its own right, it does present a welcome antidote to the
shortcomings of generative phonology. With regard to Chapter 2, he
suggests that if voicing is treated as a property of the rime rather than
the coda, then influences of the voicing of a final plosive on an
initial /l/ are no longer particularly surprising, and this lends support
to a syllable-based representation. And he argues that the findings of
Chapters 3 and 4 provide strong evidence for the need to include
probabilistic data in phonological representations, as deterministic rule-
based systems cannot predict patterns of well-formedness or the fine
details of articulation.

The papers in Part II investigate the influence of phrase structure on the
articulation of sounds, particularly consonants.

In Chapter 6, John Harris argues against rule-based, derivational accounts
of ambisyllabicity, and with a representation using element theory, where
each basic element such as (H) 'high source' (resulting in aspiration) and
(U) 'labial' (characterised by lowered F2) has a direct and independent
acoustic interpretation, he shows that the foot is the appropriate domain
for representing phonetic phenomena such as the lenition of consonants in
Danish and Ibibio.

In Chapter 7, Mariapaola D'Imperio and Barbara Gili Fivela investigate
Florentine Italian for the effects of a clause or phrase boundary and also
of narrow, contrastive focus on the occurrence of Raddoppiamento (Fono-)
Sintattico (RF), the lengthening of an initial consonant triggered by a
stressed final vowel in the preceding word, and they find that a clause
boundary does block RF as expected, but that a phrase boundary and narrow
focus do not always have the predicted effect.

In Chapter 8, Patricia Keating, Taehong Cho, Cecile Fougeron and Chai-
Shune Hsu use electro-palatography (EPG) to compare the effects of various
kinds of phrasal boundary on the duration and degree of contact between
the tongue and the roof of the mouth for syllable-initial /n/ and /t/ in
French, Korean and Taiwanese, and they show that although the levels of
phrasing vary for the different languages, all speakers make at least one
distinction which has a substantial effect on the articulation of initial
alveolar consonants.

In Chapter 9, Robert Ladd and James Scobbie investigate the duration of
various consonants in Sardinian to see whether postlexical geminates
(PLGs), the long consonants that occur as a result of assimilation, are
identical to geminates that originate in the lexicon. They report that,
unlike the situation for many types of assimilation in English for which
there are residual effects from the underlying sounds, Sardinian PLGs are
the same as lexical geminates with no residual effects, and so they
conclude that gestural overlap does not provide a suitable model for the
assimilatory patterns of Sardinian. However, they do concede that gestural
overlap may explain some cases of residual nasalisation in Sardinian.

In Chapter 10, Jonathan Harrington comments on Chapters 7, 8 and 9. First,
for Chapter 9, he observes that all final consonants in Sardinian are
alveolar, so place of articulation can be left unspecified, and this means
that the PLG data have no bearing on the kind of assimilation in English
that originally gave rise to accounts of gestural overlap. Next, he
reports that although a mora-based model may be suitable for describing
the Sardinian data, such a model does not work for the RF data from
Chapter 7, as there is no evidence that RF in Florentine Italian shows the
categorical shift that mora relinking would entail, and he further
suggests that, rather than a purely syntactic analysis, the RF data might
be investigated using various intonational break indices. Finally, for
Chapter 8, he proposes that the differences in articulation found for
initial consonants in various languages are perceptually based (something
the authors themselves are equivocal about), and he further notes that it
would be valuable to extend the work by investigating indigenous
Australian languages which favour VC syllables.

Part III is concerned with syllable structure, particularly the timing and
quality of initial and final consonantal gestures.

In Chapter 11, Terrance Nearey reports first on simulations to investigate
the factors that work best in speech perception of syllables in noisy
conditions, and he concludes that segment-sized units work best. Then he
investigates the combinations of acoustic factors that can best model the
previously-reported perceptual responses of listeners to stimuli spanning
the /bla, dla, bra, dra/ continuum, and he finds that a segmental model
which includes quadratic effects for F2 and F3 works best, and there is no
clear need to include any terms for diphones.

In Chapter 12, Bryan Gick measures lip aperture and the position of
various parts of the tongue for initial, ambiguous and final /w,j,l/ in
American English, for example in 'ha wadder' (initial /w/), 'how otter'
(ambiguous /w/) and 'how hotter' (final /w/), and he finds that lip
aperture for ambiguous /w/ behaves like the tongue tip for /l/ in showing
evidence of resyllabification, but none of the measurements for
ambiguous /j/ undergo such a shift. He concludes that, if lip aperture
for /w/ is treated as a consonantal gesture, his data support a model
based on gestural overlap, with resyllabification involving retiming of
the consonantal and vocalic gestures, and he finally suggests that /l/
and /w/ (and also /r/) behave like consonants while /j/ is purely a vowel.

In Chapter 13, Paul Carter measures F2 as an indication of the darkness of
[l] and [r] for one speaker of each of four dialects of British English,
representing the four combinations of rhotic/non-rhotic and clear/dark
initial [l]. For the non-rhotic varieties, he confirms earlier reports
that dark initial [l] is found with clear [r] while clear initial [l] is
paired with dark [r], but for the rhotic varieties, this pattern is not
found, as both initial [l] and initial [r] are dark for the speaker from
Fife, though this speaker does have a clear final [r]. Carter also
measures formant transitions as an indication of the timing of apical and
dorsal gestures, and he finds that, for a dark initial [l], the dorsal
gesture is timed at or before the apical gesture, which indicates that
vocalic gestures do not necessarily occur closer to the syllable peak than
consonantal gestures.

In Chapter 14, Kenneth de Jong investigates the timing of /p/ and /b/ in
onset and coda positions, with 'pea', 'eep', 'bee' and 'eeb' repeated at
varying speech rates dictated by a metronome, and he finds that, although
at fast speech rates coda /p/ tends to become perceptually similar to
onset /b/, some aspects of the coda enunciation remain, so there is not a
complete switch from coda consonant to onset consonant as previously
claimed. Furthermore, he reports that, with changing speaking rates, there
is a delay in the shifting of patterns, so that once speakers start to
produce a pattern, they tend to continue with it.

In Chapter 15, Peter Ladefoged raises a number of questions about Chapters
12, 13 and 14. For Chapter 12, he notes that treating [l] as a combination
of vocalic and consonantal gestures only works for American English, as in
his own pronunciation of British English there is no raising of the back
of the tongue for initial [l] nor any contact between the tip of the
tongue and the alveolar ridge for final [l], so for him, initial and final
[l] must be treated as separate gestures, or, in traditional terms, as
extrinsic allophones of /l/. For Chapter 13, although he praises the
elegant overall treatment, he questions if it is adequate to have one
speaker for each dialect, as a single informant may have idiosyncratic
speech patterns. And for Chapter 14, he notes that only stressed syllables
were studied, and consonants can behave differently in unstressed
syllables. He finally suggests that we might even consider
treating 'happy' and 'supper' as single syllables, and, referring to data
from Scottish Gaelic and Montana Salish, he proposes that syllables may in
fact be totally irrelevant constructs in some circumstances.

Part IV covers miscellaneous topics in speech production.

In Chapter 16, Bushra Adnan Zawaydeh uses an endoscope to investigate the
articulation of oral, pharyngeal, and guttural consonants in Jordanian
Arabic, and she reports that the gutturals are characterised by narrower
pharyngeal diameter than non-gutturals. In contrast, for Interior Salish,
she claims that lowered first formant is needed to describe the grouping
of consonants. She thus concludes that articulatory features are necessary
for classification of the guttural consonants in Arabic while acoustic
features are more appropriate for Salish languages.

In Chapter 17, Daniel Silverman manipulates recordings of words in Jalapa
Mazatec, a Mexican language which is characterised both by a range of
tones and also by breathy and modal phonation. He reports that listeners
are able to perceive pitch contrasts more clearly on modal vowels than
breathy vowels, and he suggests that this explains why, for vowels in
Mazatec which consist of a breathy portion followed by a modal portion,
the tonal contrasts only occur during the modal portion.

In Chapter 18, Katrina Hayward, Justin Watkins and Akin Oyetade
investigate the H, M and L tones of Yoruba, to see whether two of them can
be grouped together in a separate register. On the basis of various
acoustic measurements and also the closed quotient from a laryngograph
waveform, they conclude that the L tone is indeed characterised by a
distinctive voice quality, so it might be analysed as belonging to a
different register than the other two, though unexpectedly they find that
the L tone has greater spectral tilt than the M or H tones.

In Chapter 19, Keiichi Tajima and Robert Port adopt "speech-cycling"
methodology, using a metronome to guide speakers of English and Japanese
in producing utterances with a fixed number of syllables in a waltz-timed
beat, and then they see what happens to the timing when the middle
syllables are manipulated, either by switching them around or by
introducing an extra syllable. They report that the English speakers tend
to maintain a stress-timed rhythm, but the rhythmic basis of the Japanese
utterances varies when an extra syllable is introduced, with subjects
aligning their speech with the rhythmic beat in different ways, and it is
not clear if mora-timing is the best way to categorise the timing of
Japanese.

Finally, in Chapter 20, Gerard Docherty reviews the papers in Part IV and
asks three important questions. Firstly, to what extent do the results of
laboratory work describe the characteristics of natural speech? He
believes that natural data are important, and he particularly raises
concerns about the artificiality of the speech-cycling data from Chapter
19. Secondly, is it true that speakers always strive to maintain maximal
contrasts in their speech? For example, do the data from Chapters 17 and
18 on tonal realisations indicate that speakers actually make use of the
enhanced auditory options that are available? He cites data for the NURSE
vowels and final plosives in natural speech from Tyneside to show that
speakers often do not maintain contrasts, as social factors may outweigh
the desire to achieve maximal clarity. Thirdly, if a phonetic attribute is
found to co-occur with a phonological feature, can we be sure that the two
are linked? Particularly in Chapter 16, does the co-occurrence of an
articulatory or acoustic feature with a set of consonants for Arabic and
Salish really indicate that the consonants are grouped using those
features?


This volume presents a succinct and impressive overview of recent research
into phonology under laboratory conditions. While the wealth of material
that is packed into a single volume is admirable and will be highly valued
by many, others may find the brevity of some of the papers a little
frustrating. There are regular comments from the authors that "due to
space limitations" (p. 258) "we do not have the space to report further"
(p. 172), and for example we are informed that "vowel-duration
measurements followed standard procedures" (p. 134) without being told
what those procedures were. Often there is discussion of results that
are "not shown in the figure" (p. 154, p. 156), reference is made to plots
that "we do not show" (p. 66), and many of the authors acknowledge that
their chapter is a summary of some other more comprehensive account "which
may be consulted for more details" (p. 41). In many cases, one feels that
it is necessary to get hold of the full report published elsewhere to
understand the research fully, though that is not helped when we are
advised to "see Gick, forthcoming, for detailed discussion of this matter"
(p. 226) but 'Gick forthcoming' is not actually included in the References
at the back of the book despite at least five citations in the text (p.
222 twice, p. 226 twice, p. 233).

Unfortunately, there are also quite a few errors which exacerbate the
difficulties in interpreting some of these papers. Many are merely
irritating, with misspelled words in labelling the figures ('onomorphemic'
p. 69; 'vvoicing' p. 261) and erroneous cross-references (Section 2.2.1
instead of 12.2.1, p. 227; Experiment 3 rather than 2, p. 63). Sometimes,
these errors affect the detailed description of data, with 'crank'
transcribed with an initial /c/ and 'sermon' and 'syrup' listed as sharing
two segments while the transcription actually indicates three shared
segments (p. 19), and all items beginning with /str/ and /gr/ are
transcribed with a turned-r (p. 61), while those beginning with /kr/ have
a lower-case-r, with this distinction retained later in the text (p. 65)
even though there seems to be no logic behind it. Finally, spelling
of 'histeresis' with the more usual 'y' instead of the first 'i' (p. 262
ff) would be helpful to those of us who need to look the word up in a
dictionary.

There are a few problems with the data in Tables. Often this just involves
misaligned text (Table 7.1, p. 135; Table 18.1, p. 313; '' in item 5,
p. 167; an extra word 'atom' in item 9, p. 113), but occasionally
something is wrong with the numbers, so in Table 18.1 (p. 313), 0.6 is
given as the mean of 5.4, 0.1 and 3.9, and even more bizarrely the whole
of the second last line is wrong, with for example 170 given as the mean
of 24, 23 and 28.
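
The arithmetic behind these two complaints is easy to check. The following
is a minimal sketch in Python that uses only the figures quoted above; the
row labels are hypothetical placeholders, since the rows of Table 18.1 are
not named here.

```python
from statistics import mean

# Values and printed means as quoted in this review for Table 18.1 (p. 313).
# The row labels are hypothetical placeholders, not the table's own labels.
quoted_rows = {
    "first example": ([5.4, 0.1, 3.9], 0.6),
    "second example": ([24, 23, 28], 170),
}

def check_means(rows, tolerance=0.05):
    """Return {label: (recomputed_mean, printed_mean, matches)} for each row."""
    results = {}
    for label, (values, printed) in rows.items():
        recomputed = mean(values)
        matches = abs(recomputed - printed) <= tolerance
        results[label] = (recomputed, printed, matches)
    return results

for label, (recomputed, printed, ok) in check_means(quoted_rows).items():
    print(f"{label}: recomputed {recomputed:.2f}, printed {printed}, match: {ok}")
```

The recomputed means are roughly 3.13 and 25, neither of which is close to
the printed value.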

Some of the errors are not just irritating but seriously disrupt
interpretation of the material. In Nearey's paper, Table 11.2 lists Model
II as an enhancement of Model II (is it really a recursive model?) with
the addition of G x F3 (one assumes it might really be an enhancement of
Model I with the addition of G x F2). Moreover, on p. 216, three
references are made to Row 7 of Table 11.3, the first two suggesting it
compares Models III and V, and the third discussing its comparison of
Models V and VI, while the table itself shows Row 7 as comparing Models IV
and VI, so it is hard to determine which is correct: the text or the
table. The compactness of the presentation in this chapter, the admission
that the full "simulations are described in detail" in another paper (p.
200) and frequent references to "further simulations sketched" elsewhere
(p. 204), and the existence of so many errors make this paper rather
difficult to understand.

In contrast, many of the papers are very well presented, with a
comprehensive description of all the data. A model of clarity is Wright's
chapter, where even the detailed methodology of the formant measurements
is reported in full (something that is unfortunately rarely done in
research papers of this nature). One might question a couple of things,
for example if 12th order linear prediction is sufficient for formant
measurements when the sampling rate is 22,050 Hz (p. 80), as Ladefoged
(2003:125) suggests an order of between 20 and 24 would be more
appropriate; and in Figure 4.2 (p. 82) the "hard" /E/ vowel
(in 'den', 'wed' and 'pet') seems to be a bit further from the centre of
the vowel space than its "easy" counterpart, while the bar chart in Figure
4.3 on the same page shows the easy version of /E/ as more peripheral. But
these are minor quibbles in an otherwise excellent paper.
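
To put the linear-prediction point in context, one commonly cited rule of
thumb sets the order at roughly the sampling rate in kHz plus two. That rule
is an assumption introduced here for illustration; it is not taken from
Wright's chapter or from Ladefoged (2003), and the function name below is
hypothetical. The sketch applies the rule to the sampling rate reported on
p. 80.

```python
def rule_of_thumb_lpc_order(sampling_rate_hz: float) -> int:
    """Approximate LPC order as (sampling rate in kHz) + 2.

    This is a general rule of thumb, assumed here for illustration only;
    it is not a procedure described in the book under review.
    """
    return round(sampling_rate_hz / 1000) + 2

order = rule_of_thumb_lpc_order(22050)
print(order)
```

For 22,050 Hz this gives an order of 24, at the top of the range that
Ladefoged suggests and double the order of 12 used in the chapter.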

One might question the interpretation of the data in one or two other
places. For example, although Carter's paper is mostly carefully presented
and well argued, his conclusion that "[i]nitial laterals are clearer than
final laterals" (p. 245) is not supported by the plot for the speaker from
Fife, whose initial [l] appears to have a very slightly lower F2 than his
final [l] (p. 244). And the claim (p. 245) that initial [r] is darker than
initial [l] for this speaker is open to doubt, as the two values in Figure
13.3 are rather close and the error bars overlap, so one might assume that
there is no significant difference.

Even though some of the chapters are rather compact, this book does
represent an exceptionally valuable compilation of recent laboratory work
on various aspects of phonological representation. Furthermore, the four
commentaries by Coleman, Harrington, Ladefoged and Docherty offer
insightful discussion of the issues and provide critical but thoroughly
constructive evaluations of the research. In particular, the discussions
by Ladefoged and Docherty represent a real breath of fresh air, raising
some important questions about the research, particularly with regard to
the number of speakers involved in the data and also the applicability of
results obtained under laboratory conditions to the interpretation of real
speech. While it is undoubtedly true (as acknowledged by Docherty, p. 343)
that the invasive nature of Zawaydeh's work with an endoscope means that
she could only realistically study her own articulation, it is still a
genuine concern that much of the research depends on so few speakers
producing somewhat contrived utterances in artificial conditions. Not
only, as observed by Docherty, do the requirements for Tajima and Port's
speakers to rehearse the data beforehand lead to doubts about naturalness,
but one might also note that Ladd and Scobbie obtained data for Sardinian
using prompts in English for one speaker and, for the other two, an
invented phonetic script that one of them found hard to use (p. 171), and
Keating et al. recorded data for Taiwanese using prompts in Mandarin, and
furthermore for /n/ this involved repetition of the syllable /na/ (p.
158). Does this really result in genuine speech data, or do we have some
kind of artificial laboratory construct?

However, small-scale meticulous investigations using carefully designed,
innovative data are at the core of most work in laboratory phonology,
and furthermore the focus of this kind of research is generally to devise
ingenious fresh ways of investigating speech in order to tease out details
of the nature of phonological representation, so it is perhaps not
surprising if the data at times are somewhat artificial. Furthermore, if
prepared data are to be recorded for languages that are rarely written,
such as Sardinian and Taiwanese, then we have to accept that non-ideal
prompts must be used.

It is certainly true that studies such as those reported in this book
provide fascinating and invaluable evidence about the nature of speech,
and in conclusion, the collection of papers in this volume, particularly
when accompanied by the four insightful commentaries, represents a very
useful overview of some of the laboratory investigations into speech being
undertaken around the world.


Broe, Michael B & Pierrehumbert, Janet B (2000) Papers in Laboratory
Phonology V: Acquisition and the Lexicon, Cambridge: Cambridge University
Press.

Connell, Bruce & Arvaniti, Amalia (1995) Phonology and Phonological
Evidence: Papers in Laboratory Phonology IV, Cambridge: Cambridge
University Press.

Gussenhoven, Carlos & Warner, Natasha (2002) Laboratory Phonology 7,
Berlin: Mouton de Gruyter.

Ladefoged, Peter (2003) Phonetic Data Analysis: An Introduction to
Fieldwork and Instrumental Techniques, Malden MA: Blackwell.


David Deterding is an Associate Professor at NIE/NTU, Singapore, where he
teaches phonetics, phonology, syntax, and Chinese-English translation.

Format: Hardback
ISBN: 0521824028
ISBN-13: N/A
Pages: 416
Prices: U.K. £ 45.00
U.S. $ 70.00