Measuring Vowel Duration from Spectrogrammes
Regarding query: http://linguistlist.org/issues/17/17-218.html#2
A couple of weeks ago I posted a query concerning standards for measuring
vowel duration using spectrogrammes, in phonetic and phonological studies,
highlighting some of the problems this task poses.
I received 4 answers, which I try to summarise in what follows:
1. Kimary Shahin (Effat College) and Zinny Bond (Ohio University) both
referred me to the classic study of English vowel duration by Peterson &
Lehiste (1960): ''Duration of syllable nuclei in English'', JASA 32:693-703.
My own feeling about this important publication is that, given the time at
which it was written and the accumulated knowledge and experience in
acoustic-phonetic experimentation then available, it highlights problems
more than it solves them. In addition, most of today's advanced
spectrogramme-manipulation technology was unavailable, and many possible
refinements of decision making procedures regarding temporal segmentation
could not have been foreseen at the time of writing. Furthermore, while this
study is undoubtedly a landmark in the development of acoustic phonetic
research, its applicability and relevance to more abstract aspects of
linguistic (phonological) theory are somewhat limited, since many of these,
such as syllabic weight and non-linear organisation of the utterance, did
not yet exist. Such aspects, rather than being conditioned and controlled,
were simply neutralized in that study, so it is hard to infer standards
that would suit measurements relevant for less obvious phonological
conditions (and purposes). Finally, the publication itself aims to report
a particular, albeit paradigmatically comprehensive, piece of research
rather than to create a standard. Therefore, by using it as a
standard, a researcher (in particular a less experienced one) is likely to
encounter many pitfalls that this publication neither covers nor warns
against, and may arrive at inadequate solutions, for example by inaccurate
analogy to the authors' general procedures.
I have commented at length on this reference because, owing to its rightful
inclusion in the canonical reader ''Readings in Acoustic Phonetics'' (MIT
Press, 1967), it is probably the most accessible paper on measuring
vowel durations for those whose primary research domain is not acoustic
phonetics. While this paper should be considered MUST reading for
similar or related experimental research, it should be treated as an older
reference point to be read alongside a more recent standard, enabling the
reader to 'interpolate' the gradual development of methodological
standards, and to better evaluate the methodological adequacy of earlier
studies both at their time of writing and at present.
2. Correspondence initiated by Nora Wiedenmann ended up with some detailed
answers and a text of segmentation standards from Florian Schiel (both are
from the Institute of Phonetics and Speech Communication at the
Ludwig-Maximilians-University, Munich). The standard, which was designed
for and implemented in the annotation of the PhonDat database of spoken
German, includes various brief instructions regarding problematic segment
sequences. Unfortunately, due to my very limited knowledge of German, it is
impossible for me to evaluate the description of this standard.
Nevertheless, it seems that, since it was developed strictly for the
annotation of a connected-speech corpus, it does address many
segmentation difficulties, but it cannot cover the specific
contextual variables that have to be controlled in elicitation-based
experimental studies with theoretical implications in phonetics and
phonology.
This is probably true for most other connected-speech-corpus-oriented
annotation and segmentation standards, which, like the previous reference,
might be easily accessible. This is because, unlike particular,
theory-oriented experimental studies, the amount of annotation work in
corpora requires a large team of annotators, who must be given a written
standard to keep their work consistent. The public availability
of such corpora and/or their commercial use in the speech technology
industry contributes to their attractiveness and 'fame' (e.g. the TIMIT
corpus of American English), but the applicability of their annotation
standards to other research depends on, and varies with, the task and
purpose in each case.
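To make concrete what a corpus annotation does and does not provide, here is a minimal sketch of how vowel durations are read off segment labels. It assumes label lines of the form "start-sample end-sample label" with audio sampled at 16 kHz, as in TIMIT-style phone label files; the function name and the small vowel set are hypothetical and chosen only for illustration.

```python
SAMPLE_RATE_HZ = 16000  # assumption: 16 kHz audio, as in TIMIT-style corpora

# Assumption: an illustrative subset of vowel labels, not a full inventory.
VOWEL_LABELS = {"iy", "ih", "eh", "ae", "aa", "ah", "uw", "uh"}


def vowel_durations_ms(label_lines, sample_rate_hz=SAMPLE_RATE_HZ):
    """Return (label, duration in ms) for every vowel segment.

    Each input line is expected to look like "<start> <end> <label>",
    with boundaries given as sample indices.
    """
    durations = []
    for line in label_lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        start, end, label = int(parts[0]), int(parts[1]), parts[2]
        if label in VOWEL_LABELS:
            durations.append((label, (end - start) * 1000.0 / sample_rate_hz))
    return durations
```

The arithmetic is trivial; the duration is only as meaningful as the boundary criteria behind the labels, and nothing in such a file records the contextual variables that an elicitation-based experiment has to control.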
3. Alice Turk (University of Edinburgh) sent me a copy of a paper
authored by her, Satsuki Nakai, and Mariko Sugahara, titled ''Acoustic
Segment Durations in Prosodic Research: A Practical Guide'', to appear in
Sudhoff, Stefan, Denisa Lenertová, Roland Meyer, Sandra Pappert, Petra
Augurzky, Ina Mleinek, Nicole Richter & Johannes Schließer (eds): Methods in
Empirical Prosody Research. Berlin, New York: De Gruyter.
This paper happens to address precisely the topic of my query.
It not only provides principles for acoustic segmentation, distinguishing
between various segmental and prosodic contexts and highlighting potential
pitfalls in detail, but also critically evaluates the reliability and
'relative segmentability' of these contexts. It thus provides criteria for
context design, warning against evaluating incompatible contexts against
each other, or at least requiring explicit justification of the theoretical
and methodological concepts involved when such evaluations are carried out.
In addition, it provides a concise description of key aspects of the
experimental and methodological setting, such as control for speech rate,
syntactic structure, orthographic bias, and the direct influence on results
of explicit instructions given to the participant, among others. Many of
these recommendations seem to result from the authors' experience both in
actively carrying out such experiments and in reviewing many other
experimental reports in the history of the acoustic phonetic literature,
which is partially demonstrated by the list of references. Not
surprisingly, the earliest study referred to is Peterson and Lehiste's
paper mentioned above. The paper also refers to and evaluates work on
annotated speech corpora.
While most of the descriptions, recommendations and warnings in this paper
may seem natural or obvious to the most experienced and knowledgeable
acoustic phoneticians, they are invaluable for less experienced
researchers, whether younger or from a different disciplinary
background. One will not find solutions to all methodological
questions in this 20-page paper, including some of those mentioned in
my query, but one is definitely compelled to take them into consideration
in experimental design, and to explicitly defend one's own methodological
treatment of the difficulties and incompatibilities foreseen by this paper.
Implementing the lessons learned from this paper can make the difference,
when reporting a seemingly successful experiment, between ''preliminary
encouraging tentative results'' and ''substantial reinforcement of the
I would like to conclude by thanking all the contributors for their answers
to this query. All mistakes are mine.
Graduate Student, UCLA.