Review of Input-based Phonological Acquisition

Reviewer: Madalena Cruz-Ferreira
Book Title: Input-based Phonological Acquisition
Book Author: Tania S. Zamuner
Publisher: Routledge (Taylor and Francis)
Linguistic Field(s): Phonology; Text/Corpus Linguistics; Language Acquisition
Subject Language(s): English
Language Family(ies): New English
Issue Number: 16.873

Date: Wed, 23 Mar 2005 08:46:22 +0800
From: Madalena Cruz-Ferreira
Subject: Input-based Phonological Acquisition

AUTHOR: Zamuner, Tania S.
TITLE: Input-based Phonological Acquisition
SERIES: Outstanding Dissertations in Linguistics
PUBLISHER: Routledge (Taylor and Francis)
YEAR: 2003

Madalena Cruz-Ferreira, Department of English Language and Literature,
National University of Singapore


This book presents research designed to assess two alternative hypotheses
about language acquisition, one holding that child productions reflect
universal properties of language, the other predicting child productions
according to properties of the particular language to which the child is
exposed.

The book contains seven chapters, a 15-page Appendix with tabulation of
data, and three indexes, by language, author and subject. Being a graduate
research piece, the book will interest scholars in the field of child
language acquisition, particularly phonological acquisition, and those
concerned with evaluation of linguistic theories.

Chapter 1, "Accounts of acquisition", starts by discussing the claim "that
children's productions mirror cross-linguistic markedness" (p.4), unmarked
properties of language being those that are cross-linguistically frequent,
whereas marked features occur less frequently.

Nature vs. nurture accounts of language acquisition are formulated in two
contrasting hypotheses, the Universal Grammar Hypothesis (UGH), whereby
acquisition is mediated by genetically-encoded properties of language that
are universally unmarked, regardless of linguistic input, and the Specific
Language Grammar Hypothesis (SLGH), whereby the unmarked patterns of the
input language are the driving factor in acquisition, with no requirement
for "innate _linguistic_ knowledge" (p.12). The empirical verification of
the two hypotheses constitutes the focus of the book.

Zamuner highlights several of the theoretical and practical implementation
problems that plague Universal Grammar (UG) accounts of language
acquisition, such as the characteristic woolliness of definitions of the
term "Universal Grammar" itself, or the circularity of much UG
argumentation where, for example, "markedness is seen as both as [sic]
evidence for UG and as the product of UG" (p.10). Most relevant for
Zamuner's research is the systematic confounding, in UG claims about
markedness, of cross-linguistic and language-specific features of
language: if cross-linguistic markedness is derived from, and reflected
in, properties of particular languages, then the assumedly innate,
unmarked properties of language are also the unmarked properties of any
particular language. It is this observation that prompts Zamuner's
proposal of the SLGH.

The chapter goes on to review studies showing children's progressive
sensitivity to input, including gradual disregard of non-phonemic
contrasts in the surrounding language, or the earlier child production of
consonantal codas in English than in languages with lower coda
frequencies. The latter findings constitute the background to the present
study, which investigates "the place [of articulation] and sonority of
word-final codas in the domains of cross-linguistic markedness, the
distribution of codas in English, and in coda acquisition in English. The
aim is to determine whether children's productions reflect UG or the
ambient language" (p.19).

Chapter 2, "Cross-linguistic codas", gathers together data from corpus-
based cross-linguistic research on codas, in order to enable a
characterisation of preferred coda consonants and the related formulation
of the UGH. The corpora consist of published research on codas across
languages, and of Zamuner's own collection of CVC (consonant-vowel-
consonant) words across 35 languages, each from one of a variety of
language sub-families.

The set of possible codas and the number of word-final codas are tabulated
for place of articulation and sonority, on the assumption that statistical
analysis of word-final codas enables clarification of marked vs. unmarked
features in this distributional position. Since cross-linguistic
markedness is established on the strength of patterns observed from
language processes, language change, child language, aphasia and phonemic
frequency and is "interpreted here as evidence for UG" (p.22), markedness
patterns will allow predictions about patterns in child language.

Cross-linguistic word counts are then performed, using two different
frequency analyses. An Expected Frequency Analysis (EFA) establishes
whether "the words of a language contain a specific phonological element
more than expected by chance" (p.24), and an Actual Frequency Analysis
(AFA) establishes whether "the number of words containing codas with a
specific phonological feature is greater than the words containing
different features" (p.25). Overall findings are that coronal (vs. labial
and dorsal) and sonorant (vs. obstruent) are the preferred place of
articulation and sonority feature in codas, respectively.
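
The difference between the two counts can be illustrated with a minimal
Python sketch. The coda inventory and word list below are invented toy
data, not Zamuner's, and the way the expected count is derived (each
feature's share of the coda inventory, multiplied by the number of words)
is only an assumption about how "expected by chance" might be
operationalised, not a description of the book's procedure.

```python
from collections import Counter

# Hypothetical coda inventory, mapping each possible coda to its place feature.
PLACE = {"t": "coronal", "n": "coronal", "p": "labial", "m": "labial", "k": "dorsal"}

# Hypothetical word-final codas, one entry per CVC word in a toy lexicon.
WORD_CODAS = ["t", "t", "n", "n", "n", "t", "p", "m", "k", "t"]


def actual_counts(codas, feature_of):
    """AFA-style count: number of words whose coda carries each feature."""
    return Counter(feature_of[c] for c in codas)


def expected_counts(codas, feature_of):
    """EFA-style baseline (assumed): words are spread over the features
    in proportion to each feature's share of the coda inventory."""
    inventory = Counter(feature_of.values())
    return {f: len(codas) * n / len(feature_of) for f, n in inventory.items()}


actual = actual_counts(WORD_CODAS, PLACE)
expected = expected_counts(WORD_CODAS, PLACE)
for feature in expected:
    print(feature, "actual:", actual[feature], "expected:", expected[feature])
```

In this toy lexicon coronal codas are both the most numerous (the
AFA-style question) and more numerous than their share of the inventory
would predict (the EFA-style question). The two questions need not receive
the same answer.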

The chapter ends with remarks on the impasse that these results create for
UG formulations of predictions about the markedness of particular phonemic
segments: some of the properties of a segment may be marked (e.g.
obstruent) while others are unmarked (e.g. coronal). In other
words, "the unmarked features in coda position are not compatible" (p.34),
leaving open the markedness status of, say, a coronal obstruent like /t/.
A reasonable prediction for a UGH can nevertheless be formulated, that
children's first codas are preferably coronal and preferably sonorant.

Chapter 3, "English codas", characterises the distribution of codas in
English, in order to establish the features of the input to which children
acquiring English are exposed, and thus enable a formulation of the SLGH.
Data are gleaned from two online dictionaries, and from two databases
containing words familiar to children through exposure and/or children's
own use, namely, Fenson et al.'s (1993) MacArthur Communicative
Development Inventories and a corpus of child-directed speech from CHILDES
(MacWhinney 2000). Naturalistic interactions with children aged 1;7 to 2;4
were sampled from the latter.

EFA and AFA counts find consistency in the distribution of coda features
in CVC words across all four databases. Zamuner then opts to base SLGH
predictions on token CVC words from the CHILDES child-directed samples of
speech, because token frequency has been shown to enable more accurate
predictions about child productions than type frequency, and because child-
directed speech can arguably constitute the most appropriate data for
forming predictions on acquisition, in that it closely reflects the
phonological input available to English-learning children. The SLGH
predicts that children's first codas will be those that are most common in
the input, regardless of features like coronal or sonorant.
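
The type/token distinction that motivates this choice can be sketched in a
few lines of Python. The word tokens are invented for illustration and do
not come from CHILDES.

```python
from collections import Counter

# Hypothetical child-directed CVC word tokens, each paired with its final coda.
TOKENS = [
    ("that", "t"), ("that", "t"), ("that", "t"), ("that", "t"),
    ("dog", "g"), ("bag", "g"), ("pig", "g"),
]

# Token frequency: every occurrence of a word counts.
token_freq = Counter(coda for _, coda in TOKENS)

# Type frequency: each distinct word counts once, however often it is said.
type_freq = Counter(coda for _, coda in set(TOKENS))

print("token:", dict(token_freq))
print("type:", dict(type_freq))
```

Here /t/ is the most frequent coda by tokens but the least frequent by
types, so the two measures would rank the codas differently.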

The chapter then compares the predictions of both hypotheses. Given that
both EFA and AFA across the four databases revealed a preference for
coronal codas in English, children's early production of coronal codas
could reflect either universal place of articulation preferences or the
distribution of place of articulation in the ambient language.
Specifically, /t, n, r, d/ codas are predicted by both hypotheses.
However, EFA showed a larger proportion of sonorant codas than expected in
English, whereas AFA showed a significantly greater number of obstruent
than sonorant codas in the language. The latter finding is unsurprising,
given the asymmetry in the inventories of English codas that favours
obstruents in this position. Children's productions of sonorant and
obstruent codas will then provide the relevant data upon which to decide
between the two hypotheses.

The next three chapters provide data from (monolingual) English-speaking
children.

Chapter 4, "Child Language Codas", scans literature on phonological
acquisition between the ages of 0;11 and 2;11 for data on child codas, due
to unavailability of specific studies on the acquisition of codas in
English. Child productions of codas in CVC words are then analysed in such
a way as to maximise information pertinent to acquisition, using Stoel-
Gammon's (1985) Independent Analysis, which measures child productions
with no reference to target forms, and Relational Analysis, which compares
child productions to adult targets. Findings from both analyses are mixed:
although both show child preference for /t, n, k, d/ codas, Independent
Analysis has /m/ and Relational Analysis has /s/.
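
A rough Python sketch of how the two analyses differ, using invented
(target, production) pairs rather than data from the book. None stands for
a deleted coda, and the function names are hypothetical.

```python
from collections import Counter

# Hypothetical (target coda, child's coda) pairs; None marks a deleted coda.
PAIRS = [("t", "t"), ("t", None), ("s", "t"), ("n", "n"), ("k", "t"), ("n", "n")]


def independent_analysis(pairs):
    """Count the codas the child produces, ignoring the adult targets."""
    return Counter(produced for _, produced in pairs if produced is not None)


def relational_analysis(pairs):
    """Proportion of attempts at each target coda produced as the target."""
    attempts = Counter(target for target, _ in pairs)
    correct = Counter(target for target, produced in pairs if target == produced)
    return {target: correct[target] / attempts[target] for target in attempts}


produced = independent_analysis(PAIRS)
accuracy = relational_analysis(PAIRS)
print("independent:", dict(produced))
print("relational:", accuracy)
```

In this toy sample /t/ dominates the child's inventory under the
independent count, yet only half of the /t/ targets are produced
accurately, which is one way the two analyses can come to disagree about a
child's preferred codas.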

Zamuner then discusses the overall analytical difficulties raised by the
disparate data concerning child codas, which generally conflate results for
spontaneous productions and immediate imitations, word-medial and word-
final codas, or productions of content and function words. In addition,
Zamuner invokes research of her own showing the need to control for the
effects of prosodic position (stressed vs. unstressed syllable) on child
coda productions, a further variable that is consistently disregarded in
previous literature. The blurred nature of both data and findings that
emerges from the review in this chapter justifies the experimental design
described in the two following chapters.

Chapter 5, "Experiment 1" describes the first of Zamuner's own two
experiments. A total of 17 children aged between 1;8 and 2;2 were tested
for productions of codas in 70 monomorphemic CVC content words of English,
of which 12 words were usable (due to the well-known vagaries of child
behaviour in experimental settings). The age range was deemed
representative of early coda emergence in child speech because "this is
when children are both producing and deleting codas" (p.62). The vowels in
different word sets were controlled for quality (lax vs. tense), and the
words exemplify the set of possible English codas.

The children were first assessed on their knowledge of the words, and
then prompted to produce them by naming pictures displayed on a computer
screen. Only target-like coda productions in spontaneous words were
tabulated, by means of a weighted statistic correcting for inequality in
the children's productions. On the basis of the results from this
experiment, Zamuner gives a first evaluation of the UGH and the SLGH.
There was no evidence of preference for coronal or sonorant codas that
might confirm the UGH, whereas the positive correlation between children's
codas and the frequency of these codas in English confirms the SLGH.

Chapter 6, "Experiment 2", reports a second experiment, designed to test
phonotactic probability, or "the likelihood of sounds' occurrences"
(p.81). The goal is to investigate child coda productions in different
probabilistic contexts. Given previous research showing that infants,
children and adults alike are sensitive to phonotactic probability, it is
likely that child productions of the same coda will depend on its context.
Zamuner accordingly devised a set of CVC non-word and near-non-word
stimuli (the latter being actual words of English assumed to be of such
low frequency that they could safely be taken as non-words for the
children), with pairwise-matched codas. The phonotactic probability of
each word was ascertained through the large body of research addressing
this issue for the English language, and the experimental words were
accordingly divided into two sets of 11 words each, one high-probability
and one low-probability.

A group of 29 children aged 1;8 to 2;4 (of whom a subset also took part
in Experiment 1) heard the pre-recorded words as names of pictures on a
screen, and their task was to repeat them -- with experimental non-words,
there can obviously be no question of spontaneous productions. Results
show that children are significantly more likely to render the same coda
accurately in those non-words that match high phonotactic probabilities of
English than in those with a low phonotactic probability in the language.

Combined findings from the experiments in chapters 5 and 6 are that the
distributional properties of the ambient language account best for the
children's coda productions. Children will prefer to produce a coda not
only because that coda is frequent in the input, but because its
phonotactics are frequent too. The conclusion must then be that adequate
predictions about phonological acquisition are best sought in an input-
based model.

Chapter 7, "Coda acquisition", reviews methodology, findings and
conclusions in the book. It also describes a further experiment that
replicates Jusczyk et al.'s (1994) findings about 9-month-old infants'
preference for CVC non-words with a high phonotactic probability, but for
the younger age of 7 months. These results bring additional insight into
the central role played by features of the input language in early
acquisition, reflected in children's first productions.

Given that the present study found no correlation between predicted and
attested patterns according to the UGH, whereas the correlation holds
firmly for the SLGH, Zamuner concludes that "children do not necessarily
come to the acquisition task with prespecified knowledge" about features
of language, but instead organise and build that knowledge "upon
frequently occurring patterns in the ambient language" (p.99).


First, the bad news. The very bad news concerns the deplorable
proofreading, if any at all, that the book underwent before publication.
Typos, (near-)verbatim repetitiveness, awkward turns of phrase (often
draft-like, some of which can be seen in material quoted in this review),
non-sequiturs and/or nonsensical punctuation are a feature of virtually
every page in the book, at times several times over. A benevolent reader
is forced to re-read paragraphs or entire sections, in as many attempts to
locate and hopefully resolve sources of garden-paths or misdirected
reasoning. One major typo concerns a page duplicated in full, complete
with footnote (pp.65-66). A sample of other examples is (page numbers on
the left, emphasis added):

6. "voiced _stops_ are marked", for "obstruents"

14. "this is because [...], and _as_ children's initial productions"

16. "the analyses are restricted codas to final position"

22. "languages with codas in CVC words"

24. "[...] by considering expected frequencies, this controls for the fact
that [...]"

26. "Arapaho (8)", where the presumed cross-reference (8) on p.33 gives a
hypothesis, not a language sample

27. "a languages' unmarked place feature" and "Further evidence [...] are

38. "[proper names were excluded] due to children having different names
from popular culture"

61. "The _goal_ was to present children with real words containing a
variety of codas", which is clearly a "method" instead

68. "further experiments would benefit from having and need to have a
range of words types"

89. two blank spaces where a phonetic transcription should be

101. [children are better at] "producing sounds contained within words
than are unanalysed forms, such as nursery rhymes [...]"

104. "most of the these approaches"


Clarification about conventions or data is given several pages after their
introduction. For example, the use of the symbol "F", introduced on p. 53
with what looks like a superscript cross-reference to a (non-existing)
footnote, is first explained in another footnote on p.72 as
an "unspecified fricative"; and we understand that the data given in Table
3 (p.66) concern adult-like (vs. children's own) renditions of the tested
codas only a few pages later, a distinction whose statistical relevance
for discussion of child productions the previous chapter makes clear.

Repetitiveness, including summaries of summaries (e.g. pp.69-75 and the
summarised overview in chapter 7), adds to the burden of reading. The
reference to missing studies on coda acquisition in English is repeated
nearly verbatim on pp. 49 and 61, as is the sentence beginning "An
attempt..." (pp.62 and 84), and the verbatim formulation of the UGH (pp.
34, 69, 96) and the SLGH (pp.42, 70, 96).

Recurring repetition of assumptions and research goals, and/or recurring
references to these (e.g. pp.70, 71, 73 about preferences for coronal and
sonorant codas) further compound an impression that each chapter is meant
for independent reading, and that the book was put together not as a
single, cohesive piece of research, but rather as a collection of
autonomous research papers. (Chapter 6 of the book is the basis of Zamuner
et al. (2004), and chapters 2, 3 and 4 of Zamuner et al. (2005).)

Editorial sloppiness of this kind produces a cumulative effect of
exasperation that risks detracting from content issues, to which I now
turn.
The good news is plentiful and wide-ranging. The first thing that stands out
is the fine scholarship that pervades the rigorous treatment of the data,
whether concerning collection modes, statistical analysis or the
interpretation of findings. Well aware that different researchers will
favour different analytical choices for different kinds of data that serve
different purposes, Zamuner takes the hard way of dissecting away from
published research on child phonology and available databases the
scattered information about syllable codas that can legitimately ground
her own study. Her task was not made easier by the fact that most
available data on early child productions fail to provide phonetic
transcriptions (p.50). I, too, could not agree more that any understanding
of language acquisition must rely on precise information about how young
children sound. Incidentally, one very welcome feature of the book is that
transcriptions use the standard IPA (International Phonetic Alphabet)

This book joins the growing body of literature on child language (Bybee,
1998; Barlow & Kemmer, 2000; Leather & van Dam, 2002; Tomasello, 2003)
that returns to the role that Saussure (1915/1969:37) ascribed
to "parole", aptly translatable as 'language usage', in language
acquisition: "c'est en entendant les autres que nous apprenons notre
langue maternelle; elle n'arrive à se déposer dans notre cerveau qu'à la
suite d'innombrables expériences" [it is through hearing others that we
learn our mother tongue; it imprints itself on our brain only as the outcome
of countless experiences, MCF's translation]. The focus is on usage, and
on the related insight that attempting to characterise properties of
language without an understanding of the socialisation factors that shape
those properties does not make much sense. On the evidence from syllable
patterns, the "pressures to conform" (p.9) are found not in innate
constraints dictated by universal properties of language(s), but in
qualitative and quantitative properties of the specific input surrounding
the child, for which the claim of a 'degenerate' status finds a very
flimsy foothold indeed.

In this sense, Zamuner's findings are commonsensical, almost trivially so
when put into words: children will speak as they hear spoken. However, the
apparent triviality of this claim dissolves against the persistently
blurred nature of universalist claims about language acquisition, where no
necessary causality is found between cross-linguistic recurrence,
markedness and universalism, or between the latter and nativism, except
axiomatically. Even for English, the single most widely analysed
language, "it is not clear which information is relevant [...] for
determining markedness or for determining the representation of final
consonants" (p.18), which leaves undecided what the 'U' in 'UG' is meant
to represent. Whether "sensitivity to UG" (passim, pass the paradox of
sensitivity to what is assumed as an ineluctable biological development)
might be triggered by some input from the environment appears to make
little sense too. If children must generalise from the input, minimal
though its contribution is traditionally claimed to be in UG
argumentation, before some acquisitional parameter can be set, then the
relevant pattern has de facto been learned through the input alone. This,
in Tomasello's (2003:187) words, "basically leaves universal grammar with
nothing to do", which in turn leaves undecided what the 'G' in 'UG' is
meant to represent. In other words, universal grammar, or black holes, may
do useful work in the minds of particular analysts, but this does not
entail that they must inhabit the minds of human beings across the board.

One very pleasing feature of Zamuner's account is that she avoids sweeping
generalisations, insisting that her findings apply to English, and to
monolingual children acquiring it. In this connection, I wondered why the
input-based hypothesis was not labelled, straightforwardly, "Input-Based
Hypothesis". I believe that at least two reasons speak for this label.
First, it is self-explanatory in a way that "Specific Language Grammar
Hypothesis" (itself recycled from Zamuner's original "General Pattern
Learning Hypothesis" in her dissertation) never became to me over the
course of the book. Parsing [Specific] [Language Grammar] obviously makes
no sense, but
the presumably intended parallel between [Universal] [Grammar] and
[Specific Language] [Grammar] makes either the word "language" or the
word "grammar" redundant. Second, and more importantly, "input-based"
allows generalisation of the hypothesis to studies in child
multilingualism, where features of "specific languages" may be of little
help in accounting for patterns in multilingual child productions.

Another pleasing feature of the book is that Zamuner's style is never
polemic. She is more interested in understanding what may explain what
children do than in fuelling research paradigms with obedient data, or
analytical controversies with rhetorical arguments. She provides robust
empirical proof of the importance of the input in acquisition, points out
matter-of-factly that UG makes wrong predictions, but then notes that
neither the UGH nor the SLGH can claim to explain how children are
sensitive to what their productions show them to be sensitive to: in this
respect, both accounts remain equally in the dark. She also leaves open
the issue that different interpretations of UG from the one that she
adopts may give it the predictive strength that is found lacking in the
book. This is a refreshing departure from the righteous stubbornness that
often entrenches child language analysts in their own research paradigms,
by choosing to remain deaf to alternative claims and argumentation. After
all, Zamuner's findings are also that we learn to produce intelligible
things because we listen to what is going on around us.


Barlow, M. and S. Kemmer, Eds. (2000). Usage-based Models of Language.
Stanford, CA, CSLI Publications.

Bybee, J. L. (1998). Usage-based phonology. In Darnell, M., E. Moravcsik,
F. Newmeyer, M. Noonan and K. Wheatley, Eds., Functionalism and Formalism
in Linguistics, vol. 1. Amsterdam, John Benjamins, 211-242.

Fenson, L., P. S. Dale, J. S. Reznick, D. Thal, E. Bates, J. P. Hartung,
S. Pethick and J.S. Reilly (1993). The MacArthur Communicative Development
Inventories: User's Guide and Technical Manual. San Diego, Singular
Publishing Group.

Jusczyk, P. W., P. A. Luce and J. Charles-Luce (1994). Infants'
sensitivity to phonotactic patterns in the native language. Journal of
Memory and Language 33, 630-645.

Leather, J. and J. van Dam, Eds. (2002). Ecology of Language Acquisition.
Amsterdam, Kluwer.

MacWhinney, B. (2000). The CHILDES Project (2 vols.). Mahwah, NJ, Lawrence
Erlbaum Associates.

Saussure, F. de (1915/1969). Cours de Linguistique Générale. 3rd edition,
Paris, Payot.

Stoel-Gammon, C. (1985). Phonetic inventories, 15-24 months: A
longitudinal study. Journal of Speech and Hearing Research 28, 505-512.

Tomasello, M. (2003). Constructing a Language: A Usage-Based Theory of
Language Acquisition. Cambridge, MA/London, Harvard University Press.

Zamuner, T. S., L. Gerken and M. Hammond (2004). Phonotactic probabilities
in young children's speech production. Journal of Child Language 31(3),

Zamuner, T. S., L. Gerken and M. Hammond (2005). The acquisition of
phonology based on input: A closer look at the relation of cross-
linguistic and child language data. Lingua 115(10), 1403-1426.


Madalena Cruz-Ferreira teaches linguistics at the National University of
Singapore. Her research interests include prosody and child

