tongue placement during speech. It is still used by descriptive phoneticians and speech scientists to record areas of linguadental and linguapalatal contact during the production of various sounds (Hardcastle, 1974). In direct palatography, the hard palate, lingual surfaces of the teeth, and the soft palate are all dusted, by means of an atomizer, with a dark powder prior to the production of the sound in question. A mixture of charcoal and powdered sweetened chocolate is very satisfactory. It adheres to the palate very well, tastes good, and is easily rinsed away when the experiment has been completed. Once the sound has been produced, a small oval mirror is inserted into the oral cavity, and the entire roof of the mouth can be either examined directly or photographed as in Figure 4-112. The technique is limited
FIGURE 4-112. (A) Undusted palate. (B) Dusted palate. (C) Palatogram illustrating linguapalatal contact (d).
CONTRIBUTIONS OF THE ARTICULATORS
Articulation Tracking Devices
CHAPTER 4 ARTICULATION
Resonance
Natural Frequency
Forced Vibration
The swing has a "natural period or frequency," and it
takes an unreasonable amount of effort to cause it to
travel at an "unnatural period"; that is, we would have
to force it into vibration. The term for such vibration is
forced vibration. If the outside force is removed from a
system vibrating at its natural frequency, it will continue
to vibrate for some considerable length of time. The
damping forces are slight. The vibrations of something
vibrating at an unnatural frequency, or executing forced
vibration, will, when the outside driving force is removed, cease quite abruptly. Such a system is said to be
highly damped.
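The contrast between slight and heavy damping can be made concrete with a minimal numerical sketch. It assumes the usual exponentially decaying envelope A(t) = A0 * exp(-t / tau), which the text does not state; the function name `time_to_decay` and the two time constants are hypothetical values chosen only for illustration.

```python
import math

def time_to_decay(tau_s, fraction=0.01):
    """Time in seconds for an envelope A0*exp(-t/tau_s) to fall to `fraction` of A0."""
    return -tau_s * math.log(fraction)

# Hypothetical time constants (assumptions, not measured values):
lightly_damped = time_to_decay(tau_s=0.5)    # keeps ringing for seconds
highly_damped = time_to_decay(tau_s=0.005)   # ceases almost at once

print(f"lightly damped: {lightly_damped:.2f} s")
print(f"highly damped:  {highly_damped:.3f} s")
```

The larger the time constant, the longer the vibration persists after the driving force is removed, which is the sense in which a lightly damped system "continues to vibrate for some considerable length of time."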
Radiation of Energy
FIGURE 4-113. A series of damped vibrations.
Resonant Frequencies of Vibrating Air Columns
speech mechanism. A simple experiment will demonstrate how an air column may be set into vibration.
Almost everyone has blown across the top of a narrow-necked bottle to produce a deep, mellow tone, called an edge tone. No matter how intense the air stream (within certain limits), the bottle resonates at just one frequency. The air particles in the bottle may vibrate with greater excursions due to increased breath force, but they vibrate no faster. In other words, the sound may become louder, but never higher in pitch. The vibrating air column has a natural frequency, or to put it another way, the bottle will resonate at a specific frequency. If water is added, the air column is shortened and the resonant frequency increases. Thus, the resonant frequencies of vibrating air columns may be manipulated by modifying the size and configuration of the cavities.
An edge tone is one way to set an air column into vibration, but there are other ways. If the bottle is held an inch or so from the lips and a puff of air is released into it (call them bilabial puffs, for want of a better term), a short-duration note is emitted from the mouth of the bottle. The pitch of the note, although it is of short duration, is the same as when the air column is set into vibration by means of an edge tone. Adding water to the bottle raises the pitch, just as in the previous experiment.
If we could now place our bottle over the isolated vibrating vocal folds mentioned earlier, we should not be surprised to find that the air column in the bottle is set into vibration at the same rate as before, and not at the vibratory rate of the vocal folds. The implication, of course, is that although the vocal folds may vibrate and release puffs of air at some particular frequency, the rate of vibration of the air column in the bottle is determined solely by its length and configuration. The resonating cavity in the bottle absorbs energy, contained in the puffs of air, only at the natural frequency of the bottle.
The air column is driven into vibration for a short duration with each discrete puff of air that is emitted by the vocal folds. The rate at which the air column is driven into vibration determines the pitch, while the frequency or frequencies at which the air column resonates determines the quality of the tone. This is the reason, for example, that the speech mechanism is capable of producing a certain vowel sound over a large part of the pitch range while a static vocal tract configuration is maintained.
Characteristics of the Source
FIGURE 4-114. Glottal area and volume velocity plotted against time in milliseconds (Fo = 125 cps; Ps = 8 cm H2O). (From Flanagan, 1958.)
FIGURE 4-115. Glottal area wave (time in msec) and radiated spectrum (frequency in cycles per second). (From Flanagan, 1958.)
FIGURE 4-116. Labels: lungs (power supply), airstream, vocal tract (resonator), output sound.
FIGURE 4-117. Graph with x and y axes.
FIGURE 4-118. ... mass-spring vibrator that vibrates with maximum amplitude at fo. When f = fo, maximum energy transfer occurs. The resonant frequency of the mass-spring vibrator is fo, and the graph on the right represents the transfer function of the mass-spring vibrator.
a Uniform Tube
$$f = \frac{c}{\lambda} = \frac{340\ \text{meters/second}}{70\ \text{centimeters}} = 485.7\ \text{Hz}$$
FIGURE 4-119. The vocal tract represented as a tube of uniform cross-sectional area, 17.5 cm in length, and closed at one end. Its first resonant frequency has a wavelength four times the length of the tube, and successive resonant frequencies are odd-numbered multiples of the first.
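The quarter-wavelength relation in the caption can be checked with a short calculation. This is a minimal sketch that assumes an idealized uniform tube closed at one end and a speed of sound of 340 meters/second (34,000 cm/second), as in the figure; the function name `tube_resonances` is hypothetical.

```python
def tube_resonances(length_cm, n=3, speed_of_sound_cm_s=34000.0):
    """First n resonant frequencies (Hz) of a uniform tube closed at one end."""
    # The first resonance has a wavelength four times the tube length.
    first = speed_of_sound_cm_s / (4.0 * length_cm)
    # Successive resonances are odd-numbered multiples of the first.
    return [(2 * k - 1) * first for k in range(1, n + 1)]

for i, f in enumerate(tube_resonances(17.5), start=1):
    print(f"resonance {i}: {f:.1f} Hz")
```

For a 17.5-cm tube the first value agrees with the 485.7 Hz worked out in the figure, and the next two fall near three and five times that value, which is why round figures of about 500, 1500, and 2500 Hz are commonly quoted for the neutral vocal tract.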
Frequencies (Resonances)
Effects of Configurations of the Vocal Tract
The modifications of the vocal tract that are necessary to produce the speech sounds in our repertory are reasonably well documented. For example, phoneticians learned long ago that rather specific tongue positions are associated with production of certain vowel sounds. Because the tongue is so highly variable and makes contact with so many structures in the mouth, adequate descriptions of tongue positions are very difficult. In practice, the configuration of the tongue is described by specifying its gross position during the production of vowels, together with the degree of lip rounding.
Radiation Resistance
To complete our equation for the source-filter theory of speech production, the radiation characteristics at the lips |R(f)|, where volume velocity through the lips is converted to a sound pressure pattern (speech), must be considered. Air molecule displacement is greater for high intensity than it is for low intensity sounds, which means that air molecule displacement is greater for the
FIGURE 4-120. Schematic tracing of an x-ray of a person producing a neutral vowel; spectrum of glottal sound source and of the vocal tract acoustical response characteristics (transfer function). The radiated vowel spectrum is shown at the top of the figure. Labels: vocal tract resonance, glottal tone, subglottal air under pressure.
FIGURE 4-121. Partial tracing of x-rays of a subject producing the vowels in the words heed, hid, head, had, hod, hawed, hood, and who'd. The radiated vowel spectrum is also shown schematically (horizontal axes: frequency in Hz, 500 to 3500).
Vowels
Classification
Four aspects of an articulatory gesture shape the vocal
tract for vowel production. They are the point of major
constriction, degree of constriaion, degree of lip rounding,
and degree of muscle tension.
The Cardinal Vowels The position of the tongue is
defined as the highest point of the body of the tongue.
It is difficult to describe tongue positions as being high,
low, front, back, and so forth, without some sort of reference. Denes and Pinson (1963) state that tongue positions are often described by comparing them with
positions used for making the cardinal vowels, which are
a set of vowels whose perceptual quality is substantially
the same regardless of the language used. They constitute a set of standard reference sounds whose quality is defined
independently of any specific language. X-ray studies of
speakers have shown that rather predictable tongue positions can be associated with the qualities of the cardinal vowels, and so it has become common practice to
compare tongue positions of all vowels with those of the
cardinal vowels.
Within reasonable limits a vowel produced with the tongue high up and in front, as in Figure 4-122 (without the tip touching the palate), will be recognized as an [i]. On the other hand, if the tongue is moved to the opposite extreme of the oral cavity, that is, low and back, as in Figure 4-123, the vowel will probably be recognized as an [ɑ]. In all there are eight such cardinal vowels, and their relative physiologic positions are often shown in the form of a cardinal vowel diagram, as in Figure 4-124.
The cardinal vowels are useful because they describe the physiologic limits of tongue position for the production of vowel sounds; all the vowels we produce fall within the boundaries described by the cardinal vowel diagram.
FIGURE 4-122
FIGURE 4-123
FIGURE 4-124. Relative physiological positions for articulation of the cardinal vowels. Range of vowel articulation is shown in solid line. Close, back, and front tongue shapes are shown in dashed lines.
FIGURE 4-125
Vowel Articulation
In Figure 4-120, an outline of the configuration of the vocal tract during production of a neutral vowel is shown, and as shown earlier, it can be represented by an equivalent simple resonator model. A graphic representation of the amplitude of the harmonics in the glottal source, as a function of frequency (glottal spectrum), is shown to the right of the vocal tract. An acoustic response curve illustrating the transfer function of the vocal tract is also shown, and finally, at the top of the illustration is a diagrammatic representation of the sound spectrum of the radiated neutral vowel. The harmonics in the glottal tone are shown every 125 Hz (which implies a vibratory rate of the vocal folds of 125 Hz). The radiated vowel spectrum in general has the same shape as the source spectrum, with five notable exceptions: the spectral peaks at 500, 1500, 2500, 3500, and 4500 Hz. They represent the formants of the vocal tract, but in talking about the spectral peaks, we have a tendency to identify them as "formants," which is not entirely correct. Formants are the property of the vocal tract. The first formant for any vowel is identified as F1, the second formant F2, the third formant F3, and so on.
The vocal tract does not affect the frequency of the
harmonics in the glottal source, but rather it reinforces
the amplitudes of those harmonics that coincide or nearly
coincide with the natural frequencies of the vocal tract.
As a person phonates at different fundamental frequencies while maintaining a constant vocal tract configuration, the distribution of the harmonics in the glottal tone
will be altered, but the frequencies of the spectral peaks
in the vowel being produced remain the same. Changes in the source characteristics do not cause changes in the transfer function of the vocal tract.
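The independence of source and filter can be illustrated with a small sketch. It makes a deliberately crude assumption that is not in the text: the harmonic lying closest to each formant is taken as the one most strongly reinforced. The function names and the second fundamental frequency (150 Hz) are hypothetical; the formant values are the neutral-vowel figures quoted above.

```python
def harmonics(f0_hz, max_hz=3500):
    """Harmonic frequencies (Hz) of a glottal tone with fundamental f0_hz."""
    return [k * f0_hz for k in range(1, int(max_hz // f0_hz) + 1)]

def reinforced_harmonics(f0_hz, formants_hz):
    """For each formant, the harmonic closest to it (assumed most reinforced)."""
    hs = harmonics(f0_hz)
    return [min(hs, key=lambda h: abs(h - f)) for f in formants_hz]

formants = [500, 1500, 2500]  # first three neutral-vowel formants from the text

for f0 in (125, 150):
    print(f"F0 = {f0} Hz -> reinforced harmonics: {reinforced_harmonics(f0, formants)}")
```

Changing the fundamental moves every harmonic, yet the harmonics that end up emphasized still cluster around the same formant frequencies, because those frequencies belong to the vocal tract and not to the source.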
Each vowel in our language system is characterized
by its own unique energy distribution or spectrum,
which is the consequence of the cross-sectional area
properties and length of the vocal tract. Changes in the
FIGURE 4-126. Panel labels: unconstricted vocal tract (glottis to lips, 17 cm); oral, velar, and pharyngeal constrictions; shorter vocal tract (10 cm); predicted resonance frequency pattern. Horizontal axis: frequency (Hz).
Increasing Length of Vocal Tract The same can be said for the consequences of lip rounding, or depression of the larynx, either of which increases the effective length of the vocal tract, and so all formants are lowered (Lindblom and Sundberg, 1971). Lip protrusion can increase the effective length of the vocal tract by about 1 cm (Fant, 1970; Perkell, 1969), which will cause a decrease in the frequency of F1 of about 26 Hz. This small shift in frequency can be perceptually significant (Flanagan, 1955).
In addition, the larynx may be raised or lowered by as much as 2 cm during the production of contextual speech (Perkell, 1969), to increase or decrease the effective length of the vocal tract. This results in a concomitant shift in F1 by as much as 50 Hz.
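The size of these shifts can be estimated with the uniform-tube approximation. The following is a minimal sketch, assuming a 17.5-cm tube closed at one end and a speed of sound of 340 meters/second (the values of Figure 4-119); a real vocal tract is not uniform, so the numbers are only indicative, and the function names are hypothetical.

```python
def first_formant_hz(length_cm, speed_of_sound_cm_s=34000.0):
    """First resonance (Hz) of a uniform tube closed at one end."""
    return speed_of_sound_cm_s / (4.0 * length_cm)

def f1_shift_hz(base_length_cm, change_cm):
    """Change in F1 (Hz) when the effective tube length changes by change_cm."""
    return first_formant_hz(base_length_cm + change_cm) - first_formant_hz(base_length_cm)

# Lip protrusion lengthening the tract by about 1 cm.
print(f"1 cm longer: {f1_shift_hz(17.5, 1.0):+.1f} Hz")
# Larynx lowered by 2 cm.
print(f"2 cm longer: {f1_shift_hz(17.5, 2.0):+.1f} Hz")
```

Under these assumptions a 1-cm increase lowers F1 by roughly 26 Hz and a 2-cm increase by roughly 50 Hz, in line with the figures quoted above.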
These motor gestures (lip protrusion, changes in level of the larynx) may accompany "traditional" articulatory gestures of the tongue to modify the acoustical properties of the vocal tract in a way that is seemingly contradictory, or at least unpredictable. In other words, speech production is a highly personalized sequence of events, and to some extent the process is unique for each of us. We should avoid the concept that speech production is a series of invariant motor gestures (Ladefoged et al., 1972).
Spectrographic Analyses Figure 4-121 shows partial tracings of x-rays of a subject producing the vowels in the words heed, hid, head, had, hod, hawed, hood, and who'd, in addition to the spectrum for each of the vowels. Figure 4-127 contains excerpts of spectrographic analyses of the vowels in the same word series. Notice that for the words heed, hid, head, and had, the frequency of F1 is rising, while F2 is lowering. Inspections of the tracings of x-rays in Figure 4-121 reveal the changes in cross-sectional area in the region of the tongue constriction that account for these shifts in formant distribution.
FIGURE 4-127. Excerpts of spectrographic analyses of the vowels in the same word series as in Figure 4-121 (heed, hid, head, had, hod, hawed, hood, who'd). The centers of each gray bar on the right are separated by 500 Hz.
Graphic representations of the relationships between the frequency of the first formant and that of the second formant have been employed to represent certain physiological dimensions in vowel production. In 1948, Joos, as well as Potter and Peterson, demonstrated that when the frequency of the first formant is plotted against the frequency of the second formant, the graph assumes the shape of the conventional vowel diagram but rotated to the right by 45°, as shown in Figure 4-128. Note that the frequency scale is linear below 1000 Hz and logarithmic above 1000 Hz. It approximates the relationship between the frequency of a sound and judgments of pitch (Koenig, 1949). The frequency of the formants is higher for the female than for the male, while the formant frequencies for the child are substantially higher than those of either of the adults. The differences in frequencies do not follow a simple proportionality in overall size of the vocal tract, however. Fant (1973) attributes the disparity to the ratio of pharyngeal cavity length to oral cavity length, which tends to be greater in males than in females.
Vowels in General American English Before leaving the topic of vowel production, we should add that the vowels in General American English are normally produced exclusively by vocal fold excitation of the vocal tract. During normal speech the vocal tract is held in a relatively constant configuration while a vowel is being produced. During contextual speech the vowels may lead to consonants or to other vowels, as in diphthongs, so it is not surprising to see short duration transitions or
FIGURE 4-128. Curves labeled MAN, WOMAN, and CHILD plotted on frequency axes.
Consonants
Comparison
Classification of Consonants
As shown in Figure 4-129, and as can be seen in the consonant classification chart (Table 4-6), place of articulation includes use of the lips (labial or bilabial), the gums (alveolar), hard palate (palatal), the soft palate (velar), or the glottis (glottal). Manner of articulation describes the degree of constriction as the consonants initiate or terminate a syllable. For example, if closure is complete, the consonant is called a stop; if incomplete,
TABLE 4-5. Vowels
Degree of Constriction | Tongue Hump Position: Front | Central | Back
High | [i] eve, [ɪ] it | | [u] boot, [ʊ] foot
Medium | [e] hate, [ɛ] met | [ɝ] bird, [ɚ] over, [ʌ] up, [ə] alarm | [o] obey, [ɔ] raw*
Low | [æ] at | | [ɑ] father
FIGURE 4-129. A schematic sagittal section of the head showing articulators and places of articulation.
1. Lips (labial)
2. Teeth (dental)
3. Alveolar ridge (alveolar)
4. Hard palate (pre-palatal)
5. Hard palate (palatal)
6. Soft palate (velar)
7. Uvula (uvular)
8. Pharynx (pharyngeal)
TABLE 4-6. Consonant classification chart
Place of Articulation | Stops (voiceless, voiced) | Fricatives (voiceless, voiced) | Nasals (voiced) | Glides and Liquids (voiceless, voiced)
Labial | [p], [b] | | [m] | [hw], [w]
Labiodental | | [f], [v] | |
Dental | | [θ], [ð] | |
Alveolar | [t], [d] | [s], [z] | [n] | [l]
Palatal | [tʃ], [dʒ] | [ʃ], [ʒ] | | [j], [r]
Velar | [k], [g] | | [ŋ] |
Glottal | | [h] | |
Their primary excitation source is the larynx, with a secondary constriction somewhere along the vocal tract resulting in noise being generated. Radiation of the sound is from the mouth. If sufficient intraoral pressure is generated so as to result in turbulent air flow, the source is said to be a noise source, and the consonant is unvoiced or voiceless. Often a given articulatory gesture is associated with a pair of consonants that differ only in the voiced-unvoiced feature. Pairs of "related" consonants are called cognates. The voiced [b] and unvoiced [p] constitute a cognate pair and the [s] and [z], [f] and [v] are other examples.
Stops Stop consonants are dependent upon complete closure at some point along the vocal tract. With the release of the forces of exhalation, pressure is built up behind the occlusion until the pressure is released very suddenly by an impulsive sort of movement of the articulators. As shown in Table 4-6, articulation for stops normally occurs at the lips in the production of [b] and its voiceless cognate [p], with the tongue against the alveolar ridge for the [d] and [t] pair, and with the tongue against the palate for the cognates [g] and [k].
drop across the vocal folds, or in other words, the transglottal pressure differential.
Voice-Onset-Time (VOT). Contrasting stop consonants as voiced or voiceless is not without its difficulties.
Both voiced and voiceless stops are produced with a
short interval of complete silence. When stop consonants occur in the middle of a vowel-consonant-vowel
(VCV) sequence, a true distinction between the voiced
and voiceless categories may be difficult to perceive.
FIGURE 4-130. Schematic illustration of voice-onset-time. At the top, voicing begins 25 msec before burst release of the consonant and so it has a negative VOT of 25 msec. In the middle, voicing begins at the moment of consonant release, and it has a VOT of 0. At the bottom, voicing begins 20 msec after the consonant release, and it has a VOT of +20 msec.
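The sign convention in the caption amounts to a simple subtraction, sketched below. The function names and the 100-msec release time are hypothetical; only the three VOT values (-25, 0, +20 msec) come from the figure.

```python
def voice_onset_time_ms(release_ms, voicing_onset_ms):
    """VOT = time of voicing onset minus time of burst release (msec)."""
    return voicing_onset_ms - release_ms

def describe(vot_ms):
    """Label a VOT value by where voicing begins relative to the release."""
    if vot_ms < 0:
        return "voicing before release"
    if vot_ms == 0:
        return "voicing at release"
    return "voicing after release"

release = 100  # hypothetical release time on an arbitrary clock, in msec
for voicing in (75, 100, 120):
    vot = voice_onset_time_ms(release, voicing)
    print(f"VOT = {vot:+d} msec ({describe(vot)})")
```

A negative value therefore means that the vocal folds were already vibrating during the closure, and a positive value means that voicing lagged behind the burst.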
Glides and Liquids Glides and liquids are characterized by voicing, radiation from the mouth, and a lack of nasal coupling. These sounds almost always precede
vowels, and they are very vowel-like, except that they
are generated with more vocal tract constriction than
are the vowels. Place of articulation for glides and liquids is shown in Table 4-8.
Lisker and Abramson (1964) found that voice-onset-time was an adequate cue for a voiced-voiceless distinction in eleven different languages. The authors also found that voice-onset-time was sensitive to place of articulation. Velars, for example, had consistently longer VOT values than did labials and apicals.
TABLE 4-7. Fricative consonants
Place of Articulation | Voiced | Unvoiced
Labiodental | [v] vote | [f] far
Dental | [ð] then | [θ] thin
Alveolar | [z] zoo | [s] see
Palatal | [ʒ] beige | [ʃ] she
Glottal | | [h] how
TABLE 4-8. Glides and liquids
Place of Articulation | Voiced
Palatal | [j] you
Labial | [w] we
Palatal | [r] red
Alveolar | [l] let
FIGURE 4-131. Labels: oral cavity, tongue hump, pressurized air in the lungs, expiratory forces.
FIGURE 4-132
Targets
The purpose of speaking is to generate a stream of speech sounds that produce purposeful consequences. The target is the production of the correct sounds. Achievement of this target requires that the respiratory target is adequate for the laryngeal and articulatory requirements, that the laryngeal target is adequate for the articulatory target, and that the articulatory target meets the criteria for a correct sound. Traditionally, we have regarded the articulatory gestures that produce speech sounds in isolation as the gestures that set the standard for articulation during contextual speech. It would be difficult to generate a substantial argument in defense of these articulatory targets. What we hear as properly produced sounds, either in isolation or contextual speech, is really the criterion. It is possible for more than one combination of articulatory gestures to produce vocal tract configurations that have the same auditory effect. As Lindau et al. (1972) state,
What a speaker aims at in vowel production, his target, is a particular configuration in an acoustic space where the relations between formants play a crucial role. The nature of some vowel targets is much more likely to be auditory than articulatory. The particular articulatory mechanism a speaker makes use of to attain a vowel target is of secondary importance only.
And we might add, the same argument holds for consonant articulation as well.
At times the same auditory effect can be produced by articulatory compensation or be due simply to individual articulatory behavior. Singers can be very expert at compensation. The open mouth position singers often use places constraints on "traditional" articulatory postures. The larynx can be lowered to decrease formant frequencies, the lips can be pursed to accomplish the same effect, or a little of each may be effective.
During contextual speech, somewhere between [...] and 15 sounds per second are articulated. The articulatory gesture may approach the target, but time constraints do not allow the ideal target (the same sound produced in isolation) to be attained. The articulators may undershoot or overshoot the ideal target. If the auditory target is reached, however, the criteria have been
Segmental Features
Source features: 1 voice, 2 noise, 3 transient.
Resonator features: 4 occlusive, 5 fricative, 6 lateral, 7 nasal, 8 vowellike, 9 transitional.
We have explored the articulation of vowels and consonants, and in grade school we learned that a syllable is one or more speech sounds constituting an uninterrupted unit of utterance. A syllable can form a whole word (boy) or part of a word (A-mer-i-ca). Speech sounds are also called segments. Thus, vowels, consonants, and syllables are composed of the following segmental features:
B) A sequence of minimal sound segments, the boundaries of which are defined by relatively distinct changes in the speech wave structure.
FIGURE 4-133. Label: phonemes.
Suprasegmental Elements
Extending across speech segments are the suprasegmental elements which consist of the prosodic features
of pitch, loudness, and duration. They impart stress, intonation, and inflection to speech. Prosodic features are
important in conveying emotion, and even meaning to
speech. For example, you can change the emotional
content and the meaning of the sentence, "I don't want
it," by stressing different words and varying inflectional
patterns. These features are called suprasegmental because they often extend past segmental boundaries.
Transitions
When we examine sequences of sounds as they occur in contextual speech, the role of the consonants seems to be to interrupt the vowels in an utterance. That is, the consonants seem to permit vowels to be "turned on and off," and the very nature of consonant articulation will influence the vowel-shaping gestures that immediately precede and follow consonants. One of the consequences of this consonant articulation is that what we tend to think of as relatively steady-state vowel articulation is in reality characterized by formant transitions, which reflect articulation into and out of consonants. Formant transitions are also characteristic of diphthongs, as can be seen in Figure 4-134. The first and second formants, especially, reflect the movement of the articulators in the production of "Roy was a riot in leotards." The shifts of the first formant reflect the manner of articulation (where the tongue produces the vocal tract constriction) and the shifts of the second formant reflect the place of articulation, which is important in recognition of plosive consonants.
The spectrograms in Figure 4-135 illustrate the latter point. Here, a vowel-consonant (VC) is shown. As the vowel approaches the plosive consonant the second formant "bends" toward the burst frequency that is characteristic of the consonant. For the production of [b] or [p], the second formant of the vowel [ɑ] bends toward the burst frequency of those consonants, at approximately 1000 Hz. For [t] or [d], by contrast, the second formant bends toward a burst frequency of about 2000 Hz.
Formant transitions of the vowel provide a cue for the perception of the consonant. The significance of these transitions has been recognized by Fant (19-?) and others. Fant states,
The time-variation of the F-pattern across one or several adjacent sounds, which may be referred to as the F-formant transitions, are often important auditory cues for the identification of a consonant supplemen-
FIGURE 4-134. Spectrogram ... the diphthongs that occur.
FIGURE 4-135. Schematic spectrograms of a VC in which a vowel is followed by the [b] or [p], where the second formant "bends" toward the burst frequency of the consonant that is located at about 1000 Hz (top) and in which the vowel is followed by the consonants [t] or [d]. Here, the second formant bends toward the burst frequency of the consonant that is located at about 2000 Hz (bottom).
Coarticulation
Coarticulation or assimilation occurs when two or more speech sounds overlap in such a way that their articulatory gestures occur simultaneously. In the word class, the [l] of the cluster [kl] is usually completely articulated before the release of the plosive. We overlap our articulatory gestures, and while one sound is being produced, the articulators are getting "set" to produce the next sound. This, of course, results in a large number of allophonic variations that listeners may not even perceive.
Coarticulation is sometimes described as the spreading of features. This means that features such as voicing,
speech production model is so unsatisfactory. Our idealized articulation and their targets are corrupted by the production of the preceding and successive sounds. This means articulatory overlap can be anticipatory (right to left, RL) or carryover (left to right, LR), as shown in Figure 4-136. In either instance, RL or LR, the articulatory targets must be compromised in order to facilitate smooth transitions from one sound to the next, and this is the nature of human speech.
Coarticulation is, by the very nature of the rapidity of speech sound production, a necessary component of speech physiology and is one reason that human-machine communication systems have been so difficult to develop.
FIGURE 4-136. Left-to-right (LR) or carryover coarticulation, effect of A on B. Labels: left (past), present (being articulated), right (future).
CLINICAL NOTE: The complexity of coarticulation also explains why integration (or carryover) of newly acquired sounds into conversational speech outside of the therapy session is often such a stumbling block in articulation therapy. It may be that we expect too much too soon. Unless these sounds can be produced rapidly, with absolutely smooth RL and LR transitions in all phonetic environments, attempts to use them will interrupt the natural flow of contextual speech.
CLINICAL NOTE: When producing consonant clusters, particularly those beginning with stops, very young children may articulate both consonants correctly but the consonants may not be fully coarticulated; for example, the word blue may resemble the word balloon minus the [n]. Such a variance, though not unusual for a young child, should be noted because future evidence of improved coarticulation may indicate that speech is still maturing. Also, an articulation test or phonological exam should provide an exact record of what the examiner heard, whether or not it was considered significant at the time of testing.
Feedback
It is very difficult to say something in the way it is intended to be said, without hearing what is being said, while it is being said. As was shown in Figure 1-34, auditory feedback is a principal avenue by which we monitor our speech production. Control of speech is often likened to a servo-system, in which sensors sample the output of a system, and compare it with the input. The difference (error signal) is used to correct the input so that the output is what it is supposed to be. This is shown as mutual influence and feedback in Figure 1-34. Almost any interruption of auditory feedback will result in degradation of speech production. This is especially evident in the speech of children who have lost their hearing very early in life. Once speech has been well established, the role of auditory feedback may be diminished, as demonstrated by individuals who have suffered severe hearing losses later in life, but who manage to maintain adequate articulation, primarily through the use of kinesthetic feedback.
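The servo-system analogy can be sketched as a simple correction loop. This is a toy model under stated assumptions, not a description of the speech mechanism: the "plant" merely scales its command by a fixed gain, and the names `servo_loop`, `plant_gain`, and `correction` are hypothetical.

```python
def servo_loop(target, plant_gain=0.8, correction=0.5, steps=20):
    """Return the output at each step of a simple error-correcting loop."""
    command = target  # initial input, issued as if the system were perfect
    outputs = []
    for _ in range(steps):
        output = plant_gain * command     # what the system actually produces
        error = target - output           # sensor compares output with intent
        command += correction * error     # error signal corrects the input
        outputs.append(output)
    return outputs

with_feedback = servo_loop(1.0)
without_feedback = servo_loop(1.0, correction=0.0)  # feedback interrupted

print(f"with feedback:    first {with_feedback[0]:.3f}, last {with_feedback[-1]:.3f}")
print(f"without feedback: first {without_feedback[0]:.3f}, last {without_feedback[-1]:.3f}")
```

With the correction term active the output converges on the intended value; with the correction set to zero the initial error simply persists, a rough analogue of the degradation that follows interruption of feedback.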
Delayed Feedback
Motor Feedback
There is also interaction between the motor and other sensory modalities, which although mostly unconscious, control our entire speech production mechanism. Muscles, tendons, and mucous membrane have elaborate and sensitive stretch, pressure, tactile, and other receptors that deliver information about the extent of movements, degree of muscle tension, speed of movement, and much more. This information is returned to the brain and spinal cord where it is integrated into serially ordered neural commands for the muscles of speech (and locomotor) mechanisms. These receptors are for the most part very quick to adapt. That is, they send information only while movement is taking place. Once a structure has gotten to where it is supposed to go, we needn't be reminded where it is. In Figure 4-138, a lower motor (efferent) neuron transmits an impulse (N1) to a muscle, which then contracts. This muscle movement stimulates a receptor (R), and it transmits information to the comparator by way of an afferent (sensory) neuron. At the same time, information about the initial neural impulse has also been transmitted to the comparator, which weighs the difference between the afferent and efferent neural impulses. Comparator output then transmits "compensatory information" back to the lower motor neuron.
FIGURE 4-137. An example of coarticulation of voicing during the production of the word Baja [baha], which is almost completely voiced. When said slowly, the [h] in Baja is unvoiced as shown in the right spectrogram.
Facilitation of Compensatory Movement
One important role of the feedback mechanism is to facilitate compensation in the event of disease or disorder. If an anesthetic is applied to the oral cavity (in the case of bilateral mandibular block in the dentist's office, for example), there is a loss of tactile and stretch receptor feed-
FIGURE 4-138. Labels: nerve impulse; compensatory stimulus; receptor in muscle tells comparator what is happening; comparator weighs the difference between what is happening and what is supposed to be happening.
BIBLIOGRAPHY AND READING LIST
Abbs, J., and B. Gilbert, "A Strain Gauge Transduction System for Lip and Jaw Motion in Two Dimensions: Design Criteria and Calibration Data," J. Sp. Hrng. Res., 16, 1973, 248-256.
---, J. Folkins, and M. Sivarjan, "Motor Impairment Following Blockade of the Infraorbital Nerve: Implications for the Use of Anesthetization Techniques in Speech Research," J. Sp. Hrng. Res., 19, 1976, 19-35.
Abramson, A. S., and L. Lisker, "Voice Onset Time in Stop Consonants," in Haskins Laboratories, Status Report on Speech Research, SR-3. New York: Haskins Laboratories, 3, 1965, 1-17.
Amerman, J., "A Maximum-Force-Dependent Protocol for Assessing Labial Force Control," J. Sp. Hrng. Res., 36, 1993, 460-465.
Amerman, J., R. Daniloff, and K. Moll, "Lip and Jaw Coarticulation for the Phoneme /æ/," J. Sp. Hrng. Res., 13, 1970, 147-161.
Angle, E. H., "Classification of Malocclusion," Dental Cosmos, 41, 1899, 248-264, 350-357.