Speech Science
Acronyms
VT = Vocal Tract
VF = Vocal Fold
IPA = International Phonetic Alphabet (1 symbol = 1 sound)
FF = Formant Frequency/Frequencies
VC = Vowel-Consonant
CV = Consonant-Vowel
VOT = Voice Onset Time
Specific Vocabulary
English French
Airstream Courant d’air
Bounce Rebondir
Burst Éclater / éclatement
Glides Semi-voyelles
Horseshoe Fer à cheval
Impede Entraver, gêner, empêcher
Jaws Mâchoires
Outward airflow Flux d’air sortant
Pitch height Hauteur tonale (hauteur ou gravité)
Plosive Occlusive
Pressure drop Chute de pression
Schwa ‘eu’ (menu) [ə] voyelle neutre / centrale
Slack Mou, lâche
Swallowing Déglutition
Thick Épais
Tip Bout / pointe
To scold Réprimander, gronder
Utterance Énoncé / prononciation
Velum Palais mou
Vocal folds/cords Cordes vocales
Welsh Gallois
Yield Rendement / produire
Broad Large / vaste
Sloppy Négligent / bâclé
Onset Début / survenue
Vowel Diagram Triangle vocalique
Slope Pente
Tilting Basculement / inclinaison
Solely Uniquement
Cutback Diminution / réduction
Cours 1
Damped Amorti·e
Definitions
Phonetics: the science of speech sounds (language independent)
- Articulatory phonetics: how speech sounds are produced/articulated
- Acoustic phonetics: physical properties like voicing, aspiration, frication
- Auditory phonetics: perception
Phonology: the principles and patterns by which sounds are used in a language (language
dependent)
Turbulent
Transient
Fundamental Frequency
Vocal Tract
Phoneme
Minimal pair
Phones
Allophones
Syllables
Morphemes
Articulatory phonetics
Place of articulation
Phonetic transcription
Broad transcription
Narrow transcription
Source filter theory
Formant frequency rate
Voice Onset Time
ACRONYMS
SPECIFIC VOCABULARY
DEFINITIONS
SPEECH PRODUCTION
The Larynx
Airstream mechanisms
ARTICULATORY PHONETICS
Manner of articulation
Place of articulation
Consonants
Vowels
Overview
Consonants
Parametric analysis
The voicebox
SEGMENTAL INFORMATION
Vowels
Vowel duration
Consonants
Voice Onset Time (VOT)
Formant transitions
Release bursts
Summary of acoustic features for consonants
Cues to manner and place of articulation
Voicing cues
Plosives
Fricatives
Affricates
Nasal consonants
Nasal vowels
Glides
Liquids
EXERCISES
[Figure: diagram with the stages Language model, Prosodic description, Word sequence, Phonetic processing, Concatenation, Sound generation]
Speech production
Air is moved from the lungs, through the trachea via the pharynx to oral and nasal cavities.
During speech, some parts of the vocal tract are constricted. When outward airflow is impeded,
pressure rises. Variations in tracheal air pressure provide the basis of speech sounds.
Speaking alternates over time between a constricted articulation and an open
articulation: no gaps!
Average syllable rate: 2-5 syllables per second!
Vocal Tract = pharynx + oral + nasal cavities. The length, from glottis (VF) to lips, varies:
- Adult male → 17 cm
- Adult female → 14-15 cm
- Child → 8-9 cm
Stretching the VF is brought about by enlarging the distance between the thyroid cartilage and
the two arytenoid cartilages (done by the cricothyroid muscles).
The Larynx
The larynx sits on top of the trachea and controls the flow of air in and out of the lungs. The vocal
folds are inside the larynx. The glottis is the opening between the vocal folds, which are
elastic.
Thyroid, Cricoid and Arytenoid cartilages support muscles which bring about changes in
voicing.
During swallowing, the epiglottis covers the entrance to the larynx (food and liquids must
pass over the entrance to the lungs).
The tension and elasticity of the VF can be varied; they can be made thicker or thinner, shorter or longer:
- Open for voiceless consonants
- Closed for voiced ones
Myoelastic aerodynamic theory of phonation: the VF are activated by the airstream from
the lungs, rather than by nerve impulses.
Aerodynamic theory: the vocal folds are parted by subglottal air pressure pushed up from the
lungs. Hundreds of "pops" of air per second!
For voicing, the air pressure below the vocal folds must exceed the pressure above the folds.
The VF close due to a pressure drop. Bernoulli principle: an increase in velocity results in a
drop in pressure, the pressure drop being perpendicular to the direction of the flow. So, an
increase in velocity within a narrow passage → decrease in pressure against the lateral wall
(cf. VF: a sudden drop in pressure against the inner sides of each fold → the VF are sucked
together again).
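For reference, the Bernoulli principle invoked here is commonly written as below. This is a sketch under idealized assumptions (steady, incompressible, lossless flow along a streamline, height differences neglected); real glottal flow only approximates it. The symbols are not from the notes: $p$ is static pressure, $\rho$ air density, $v$ flow velocity, index 1 the wider passage below the glottis and index 2 the narrow glottal passage.

```latex
p + \tfrac{1}{2}\rho v^{2} = \text{constant}
\quad\Longrightarrow\quad
p_{2} - p_{1} = \tfrac{1}{2}\rho\,\bigl(v_{1}^{2} - v_{2}^{2}\bigr) < 0
\quad\text{when } v_{2} > v_{1}
```

Because the air speeds up in the narrow passage ($v_2 > v_1$), the pressure there is lower, which is the pressure drop that helps draw the folds back together.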
Turbulent: breath stream passes through narrow constriction (in VT or pharynx, glottis is
open). Produces an aperiodic turbulent stream (/s, f/).
Transient: sudden increase of air flow due to release of air pressure built-up behind a
constriction (/p, t, k/)
a: wide apart;
b: narrow closure;
c: opening and closing;
d: tightly closed.
4: Breathy voice
Airstream mechanisms
Production of any speech sound involves the movement of an airstream.
- Most sounds are pulmonic egressive: produced by pushing air out of the lungs, through the
mouth and sometimes also through the nose.
- Glottalic airstream mechanism: implosives and ejectives. However, with implosives,
the air is sucked in, while it is pushed out for ejectives. Instead of lung air, the air in
the mouth is moved.
- Velaric (lingual) airstream mechanism: clicks. Clicks are also ingressive.
6 possible airstream mechanisms:
- Pulmonic egressive (used in all languages)
- Pulmonic ingressive (NF)
- Velaric egressive (NF)
- Velaric ingressive (Zulu: https://www.youtube.com/watch?v=CcE-BdgCW2A )
(produces click consonants)
- Glottalic egressive (Navajo: https://www.youtube.com/watch?v=XFayFUiyv20 )
(produces ejective consonants) (p’, t’, q’…)
- Glottalic ingressive (Sindhi) (produces implosive consonants) (ɓ, ɗ, ɠ…)
Phonemes are individual sounds, the smallest meaning-distinguishing (contrastive) units in the phonology of a
language. They can be combined to produce distinct word forms. They are written between
slashes / /. They are not defined acoustically by their sound properties, but by their function
in a language system.
The difference between pot and tot (/p/ vs. /t/) is phonemic, but a difference like pie vs. pʰie (unaspirated vs. aspirated) is not.
Each language has 20-40 phonemes (e.g. the vowels in the English words heed, hid, ahead, hayed, had, hod, hawed,
hoed, hood, who’d, hide, hud, howed, heard, hoyed).
If you have a phonological knowledge of a language, you can:
- Produce sounds which form meaningful utterances
- Recognize foreign words, foreign accents
- Make up new words
- Add the appropriate segments to form plurals and past tenses
- Know when to aspirate plosives and when not
- Know whether a sound belongs to your language or not
- Know that different phonetic utterances represent the same ‘meaningful unit’.
Minimal pair: two different forms that are identical in every way except for one sound
segment (phoneme) that occurs in the same place in the word. Both phonetic form and
meaning are changed (e.g. Sink-Zink, Junk-Chunk, Boy-Buy, Teeth-Teethe…).
Counterexamples: Butter and Bu[ʔ]er have the same meaning in English; Seed and Soup differ in more than one segment.
When the phonetics is different but the phonemes are the same, we call these sounds “free
variation”. In English, the glottal stop is not a phoneme. The same holds for unreleased or released
plosives at the end of a word. The difference can be transcribed phonetically (using diacritics), but it is not
distinctive phonemically in English.
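The formal half of the minimal-pair test (same length, exactly one differing segment in the same position) can be sketched in Python. This is only an illustration: the function name and the broad transcriptions are assumptions made for the example, and the check says nothing about meaning, which still has to differ for a true minimal pair.

```python
def is_minimal_pair(a, b):
    """True if two phoneme sequences have the same length and differ in exactly one position."""
    if len(a) != len(b):
        return False
    return sum(x != y for x, y in zip(a, b)) == 1

# Illustrative broad transcriptions (assumed for this example, not taken from the notes).
sink = ["s", "ɪ", "ŋ", "k"]
zinc = ["z", "ɪ", "ŋ", "k"]
seed = ["s", "iː", "d"]
soup = ["s", "uː", "p"]

print(is_minimal_pair(sink, zinc))  # one differing segment, same position
print(is_minimal_pair(seed, soup))  # two differing segments
```

Working on lists of phoneme symbols rather than spelling avoids being misled by orthography.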
Phones are the physical sounds that are produced when a phoneme is articulated. As the
vocal tract doesn’t work discretely, each new production of the same phoneme by the same
speaker sounds different. They are written between brackets [ ].
Allophones describe a class of phones of one phoneme. [k] and [kʰ] (aspirated) are
significantly different. The variation must be systematic (a predictable phonetic variant; it is
rule-governed). Stop consonants tend to have more allophones than other phonemes,
depending on context: aspirated, unaspirated, voiced, unvoiced, short, long… As phones, they are written
between brackets too [ ].
Lack of naturalness results from too few allophones like:
- kh (aspirated): initial in stressed syllables before non-front vowels (e.g. could,
because)
- k (unaspirated): after /s/ before non-front vowels; syllable initial in unstressed
syllables before non-front vowels; syllable final position (sometimes). E.g. skull,
scoot, teacup, peak
- k= (unreleased): syllable final position (sometimes). E.g. attic
- kyh (palatalized, unaspirated): initial in stressed syllables before front vowels (e.g.
keep)
- g : syllable initial before non-front vowels; sometimes syllable final position (e.g.
sheepdog, ago)
- g= : syllable final position, sometimes (e.g. rig)
- gy (palatalized): before front vowels (e.g. geese, regain)….
Auditory system is very sensitive to unnatural prosody.
Syllables are the next larger unit of speech after the phoneme. In English a syllable may
consist of a vowel alone, a vowel preceded by one, two, or three consonants, a vowel
followed by one, two, three or four consonants, or a combination of these. The following
words contain 1 syllable:
- Owe: a vowel alone
- Me: a vowel preceded by a single consonant
- Am: a vowel followed by a single consonant
- Strew: a vowel preceded by three consonants
- Inks: a vowel followed by three consonants
- Strengths: a vowel preceded by three and followed by four consonants
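The syllable shapes listed above amount to the template (C)(C)(C)V(C)(C)(C)(C). As a minimal sketch, that template can be checked with a regular expression over consonant/vowel patterns. The C/V strings describe sounds, not spelling, and both the function name and the patterns assigned to each word are assumptions made for the illustration.

```python
import re

# Up to three consonants, exactly one vowel, up to four consonants.
SYLLABLE = re.compile(r"C{0,3}VC{0,4}")

def is_english_syllable(cv_pattern):
    """Check a string of 'C' and 'V' symbols against the (C)(C)(C)V(C)(C)(C)(C) template."""
    return SYLLABLE.fullmatch(cv_pattern) is not None

# C/V patterns for the example words (based on their sounds, not their letters).
examples = {
    "owe": "V",
    "me": "CV",
    "am": "VC",
    "strew": "CCCV",
    "inks": "VCCC",
    "strengths": "CCCVCCCC",
}

for word, pattern in examples.items():
    print(word, pattern, is_english_syllable(pattern))
```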
Morphemes are the smallest units of linguistic meaning (e.g. baseball = base + ball).
Articulatory phonetics
→ Relates linguistic features of sounds to positions and movements of the vocal tract
(articulators).
Vowel and consonant phonemes are classified in terms of:
- Manner of articulation (concerns how the vocal tract restricts airflow):
Manner of articulation
Splits phonemes into the broad categories used by most languages.
- Vowels: air flows without constriction from lungs through pharyngeal and oral
cavities to the outside world.
- Glides are like vowels, but with narrow constrictions in the vocal tract.
- Stop consonants (plosives) involve the complete closure and subsequent release of a
vocal tract obstruction. Pressure build-up is followed by a burst. The closure is in the oral
tract, and the velum must be raised to prevent nasal airflow, except for the glottal stop
(/heh/).
- Liquids are also like vowels, but tongue is used for some degree of obstruction. For
/l/, air escapes around the tip of tongue or dorsum. The /r/ has more variable
articulation. Generally voiced, but can be ‘devoiced’ in ‘please’ or ‘price’. Some
languages have voiceless ‘L’.
- Nasals involve a lowering of the velum. Air flows out of the nostrils. In English, only
nasalized consonants (oral tract completely closed). In French, also nasalized vowels
(air escapes through oral tract and nasal cavities). Vowels may be nasalized in
English, but the distinction is not phonemic (= vowel identity doesn’t change). In
French, there are pairs of vowels that differ only in the presence or absence of vowel
nasalization.
- Fricatives: narrow constriction in the oral tract (for some languages in the pharynx
and in the glottis). If the pressure behind the constriction is high enough and the
passage sufficiently narrow, airflow becomes fast enough to generate turbulence at
the end of the constriction.
o Labiodental fricatives: /f, v/ → friction created at the lips
o Alveolar fricatives: /s, z/ → friction created at alveolar ridge
o Palatal / alveopalatal fricatives: measure → friction created at alveolar ridge
o Dental fricatives: /ð/ (this), /θ/ (thin) → friction occurs between tongue and
teeth
o Velar fricatives: right, knight, enough, through, Bach
o Uvular fricatives: /r/ as in rose
o Voiceless glottal fricative: /h/
o Pharyngeal fricatives: tongue root is pulled towards pharynx (Arabic)
- Affricate (stop + fricative): gin, church
Place of articulation
This classification enables finer discrimination of phonemes. Languages differ considerably
with regard to place of articulation (within the various manner classes).
Place of articulation: point of narrowest vocal tract constriction.
Consonants :
- Labials:
o Bilabial: both lips constrict → /p/, /b/, /m/
o Labiodental: the lower lip contacts the upper teeth → /f/, /v/
- Dental, articulated with the tongue against the upper teeth → (/l/):
o Interdental → /ð/ (the)
- Alveolar, tongue tip or blade against alveolar ridge → /t/, /d/, /n/, /s/, /z/
- Palatals, front part of the tongue is raised to hard palate → measure
- Velar, tongue is raised to soft palate or velum
- Uvular, the dorsum approaches the uvula
- Pharyngeal, constriction in the pharynx
- Glottal, vocal folds close or constrict
Vowels
https://en.wikipedia.org/wiki/IPA_vowel_chart_with_audio
Tongue position | Part of the tongue | Description
High | Front | Tongue constriction at hard palate
High | Back | Tongue constriction at soft palate
Low | Back | Constriction in the upper part of the pharynx
Low | Front | Constriction in the lower part of the pharynx
The third parameter is the position of the lips (rounded or unrounded).
The IPA designates a single symbol for the same sound, regardless of the spelling. At times
the IPA symbol and the spelled or orthographic symbol
coincide, sometimes they do not. The IPA is largely based on articulatory properties.
Broad transcription is a phonemic transcription that involves representing speech using just
a unique symbol for each phoneme of the language: abstract mental constructs.
Narrow transcription captures more phonetic details of the speech sounds.
[Figure: IPA pulmonic consonant chart. Columns (place): Bilabial, Labiodental, Dental, Alveolar, Postalveolar, Retroflex, Palatal, Velar, Uvular, Pharyngeal, Glottal. Rows (manner): Plosive, Nasal, Trill, Tap or Flap, Fricative, Lateral fricative, Approximant, Lateral approximant.]
Symbols to the right in a cell are voiced, to the left are voiceless. Shaded areas denote articulations judged impossible.
Where symbols appear in pairs, the one to the right represents a rounded vowel.
Overview
Manner of articulation (consonants)
Voicing | Glides (semivowels) | Stop consonants (plosives) | Liquids | Nasals | Fricatives | Affricates
Voiced | /w/, /j/ | /b/, /d/, /g/ | /l/, (/r/) | /m/, /n/, /ŋ/ | /v/, /ð/, /z/, /ʒ/ | /dʒ/
Unvoiced | - | /p/, /t/, /k/ | - | - | /f/, /θ/, /s/, /ʃ/, /χ/, /h/ | /tʃ/
/j/ → lie; /ŋ/ → ring; /ð/ → this; /θ/ → thin; /χ/ → raspy “h”; /dʒ/ → judge; /tʃ/ → chair
Parametric analysis
Cours 2
[Figure: waveform, amplitude against time (s), 0 to 1.292 s]
Periodicity
Periodic: nasals, vowels and approximants
Aperiodic: fricatives
Quiescent: silence preceding the plosive, voice onset time
Transient: release
Duration
[Figure: waveform, amplitude against time (s), 0 to 0.03027 s]
[Figure, panel A: waveform segmented as /s/ /o/ /l/ /d/ /ier/, 0 to 0.383628 s; panel B: waveform excerpt with marked portions (a) and (b?), 0 to 0.140272 s]
The voicebox
The vocal tract (VT) can be thought of as a tube that is closed at one end (glottis) and open
at the other end (lips). This type of tube is known as a quarter-wave resonator. The lowest
frequency (formant, F1) at which a quarter-wave system resonates has a wavelength that is
4 times the length of the tube.
For an adult male: $4 \times 17\ \text{cm} = 68\ \text{cm}$

$f = \dfrac{c}{\lambda}$ where $c = 340\ \text{m}\cdot\text{s}^{-1}$, $[\lambda] = \text{m}$, $[f] = \text{Hz} = \text{s}^{-1}$
This type of tube will also resonate naturally at odd multiples of lowest frequency (500Hz,
1500Hz, 2500Hz…), odd because of the closure at one end.
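The quarter-wave arithmetic above can be sketched in a few lines of Python. This is a rough illustration that treats the vocal tract as a uniform tube; the function name and parameter names are assumptions, and the speed of sound is expressed as 34,000 cm/s (the 340 m/s used above) so that lengths can stay in centimetres.

```python
def quarter_wave_resonances(length_cm, count=3, speed_cm_per_s=34000.0):
    """Resonances (Hz) of a uniform tube closed at one end and open at the other.

    The k-th resonance is (2k - 1) * c / (4 * L), i.e. odd multiples of the lowest one.
    """
    lowest = speed_cm_per_s / (4.0 * length_cm)
    return [(2 * k - 1) * lowest for k in range(1, count + 1)]

adult_male = quarter_wave_resonances(17.0)  # 17 cm: adult male length from the notes
child = quarter_wave_resonances(8.5)        # 8.5 cm: midpoint of the 8-9 cm child range

print(adult_male)
print(child)
```

A shorter tube gives proportionally higher resonances, which is why formant frequencies depend on vocal tract length.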
Usually, only the first three formants of the VT are considered. Exact values depend on
length and shape of VT (place of constriction and degree of narrowness of constriction).
F1 = lowest resonant frequency, F2 = second formant, F3 = third formant etc.
F4, F5 and higher are relatively constant regardless of changes of the VT.
- F1 is related to volume of the pharyngeal cavity as well as how tightly the VT is
constricted
- F2 is related to the length of the oral cavity
Volumes are changed by the position of the tongue (in general: larger volumes will resonate
at lower frequencies, smaller volumes at higher frequencies).
- Raise tongue to palate for high front vowel /i/ → enlarge the pharyngeal cavity
behind the tongue constriction and decrease the volume of the oral cavity in front of
the tongue constriction. As a result, F1 will be lower (volume of the pharyngeal cavity
is large, will resonate more strongly to lower harmonics); F2 will be higher, due to the
shorter length in the oral cavity (amplification of higher harmonics).
- Retract and lower tongue for low back vowel /a/ → enlarge the oral cavity and
decrease the volume of the pharyngeal cavity. Oral cavity is even further lengthened
in case of lip rounding. As a result, F1 will be higher (volume of the pharyngeal cavity
is smaller, resonation of higher harmonics); F2 will be lower because oral cavity is
larger.
NB: formant frequencies change depending on length of vocal tract.
The source spectrum is the same for the different vowels; it is changed by the filtering of the vocal
tract → overall, independence of source and filter.
F0 varies according to the gender or age (spacing between harmonics) but the shape of
(vowel) spectrum is not affected by changes in F0.
NB: formant frequency estimation is more difficult at higher pitches (the attenuation is less
important).
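The independence of source and filter can be illustrated with a toy model in Python. Everything here is an assumption made for the sketch: the filter is a sum of simple resonance peaks at 500, 1500 and 2500 Hz, the bandwidth is arbitrary, and the source is treated as flat (a real glottal source falls off with frequency). The point is only that changing F0 changes where the envelope is sampled, not the envelope itself.

```python
def envelope(freq_hz, formants=(500.0, 1500.0, 2500.0), bandwidth_hz=100.0):
    """Toy vocal-tract filter gain: one resonance peak per formant (assumed shape)."""
    return sum(1.0 / (1.0 + ((freq_hz - f) / bandwidth_hz) ** 2) for f in formants)

def harmonic_spectrum(f0_hz, max_hz=3000.0):
    """Sample the filter at the harmonics of f0 (flat source assumed)."""
    n = int(max_hz // f0_hz)
    return [(k * f0_hz, envelope(k * f0_hz)) for k in range(1, n + 1)]

def strongest(spectrum, n=3):
    """Frequencies of the n strongest harmonics, in ascending order."""
    top = sorted(spectrum, key=lambda pair: pair[1], reverse=True)[:n]
    return sorted(freq for freq, _ in top)

low_pitch = harmonic_spectrum(100.0)   # harmonics every 100 Hz
high_pitch = harmonic_spectrum(250.0)  # harmonics every 250 Hz

print(len(low_pitch), strongest(low_pitch))
print(len(high_pitch), strongest(high_pitch))
```

Both pitches show the same three peaks, but the higher pitch samples the envelope at far fewer points, which is one way to picture why formant estimation gets harder as pitch rises.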
Speech sounds are complex; the Fourier transform is used to decompose them into their frequency components.
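As a minimal sketch of what the Fourier transform does, the following Python computes a naive discrete Fourier transform of a synthetic signal made of two sinusoids and recovers their frequencies. The sample rate, signal length and component frequencies are all assumptions chosen so that each component falls exactly on a frequency bin; real speech analysis would use windowing and a fast transform.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Magnitudes of the discrete Fourier transform for bins 0 .. N/2 (naive O(N^2) version)."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        total = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(samples))
        mags.append(abs(total))
    return mags

sample_rate = 8000  # Hz (assumed)
n = 80              # 10 ms of signal, so bins are 100 Hz apart

# Complex wave: a 500 Hz component plus a weaker 1500 Hz component.
signal = [math.sin(2 * math.pi * 500 * t / sample_rate)
          + 0.5 * math.sin(2 * math.pi * 1500 * t / sample_rate)
          for t in range(n)]

mags = dft_magnitudes(signal)
peak_bins = sorted(sorted(range(len(mags)), key=mags.__getitem__, reverse=True)[:2])
peak_freqs = [k * sample_rate / n for k in peak_bins]
print(peak_freqs)
```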
Front tongue constriction: the frequency of F2 is raised by a front tongue constriction and
the greater the constriction, the more F2 is raised.
Lip-rounding: the frequencies of all formants are lowered by lip rounding (length of VT is
increased). The more the rounding, the more the constriction, the more they are lowered.
Segmental information
Vowels
[Figure: vowel plot with second formant frequency (Hz, 0 to 3000) on the vertical axis and a horizontal axis running from 200 to 1200, showing the vowels i, I, e, y, a, o, u for speakers WD (female), MD (male), JW (male) and AG (female)]
Vowel duration
No fixed value for vowel duration (inherent duration). Duration is influenced by:
- syllable stress
- speaking rate
- voicing of preceding or following consonant
- place of articulation of preceding and following consonant
- utterance position (syntactic feature)
- word familiarity
Vowels
- The higher the tongue position, the lower the F1 value. Some high vowels have higher intrinsic F0 than others (inherent F0 is relative F0).
- The lower the tongue position, the higher the F1 frequency. These sounds vary primarily in the frequency of F1, which varies with vowel height.
- Central tongue position.
- Acoustic: large separation between F1 and F2, […]; uniform formant pattern (formant […])
Consonants
Voice Onset Time (VOT)
→ Refers to the time (in ms) between the release burst and the beginning of vocal
fold vibration for the following vowel. It is an indication of the coordination between the
laryngeal and articulatory systems:
- VOT < 0 → vocal folds are vibrating before release. Also called pre-voicing or voicing lead (voice
bar). Usually 10-20 ms, depending on the accent and on the language.
- VOT = 0 → voice onset and release occur at the same time
- VOT > 0 → short lag, aspirated: onset of VF vibration follows shortly after the release
burst.
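The three cases above can be sketched as a small Python helper. The function names, the category labels and the timestamps in the usage lines are hypothetical; in practice the two time points would be read off a waveform or spectrogram.

```python
def voice_onset_time_ms(release_ms, voicing_onset_ms):
    """VOT = time of voicing onset minus time of the release burst (both in ms)."""
    return voicing_onset_ms - release_ms

def classify_vot(vot_ms):
    """Map a VOT value onto the three cases described in the notes."""
    if vot_ms < 0:
        return "voicing lead (pre-voicing)"
    if vot_ms == 0:
        return "simultaneous voicing onset and release"
    return "voicing lag"

# Hypothetical measurements: (release time, voicing onset time) in ms.
for release, onset in [(120.0, 105.0), (120.0, 120.0), (120.0, 145.0)]:
    vot = voice_onset_time_ms(release, onset)
    print(vot, classify_vot(vot))
```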
https://www.youtube.com/watch?v=KkiuV8GGKUw
VOT varies with place of articulation. In general it increases as place of articulation moves
backward in the oral cavity:
- Bilabial: shortest VOT (often prevoicing)
- Alveolar: intermediate VOT
- Velar: longest VOT
VOT is not as important in signalling the voicing distinction in final position as it is in initial
position. In final position (of a word) the duration of the vowel preceding the stop is more
important for the voiced-voiceless contrast. Vowels are longer before voiced stops and
shorter before voiceless ones (French: attendez vs. attente).
Languages differ in VOT. VOT values for /p, t, k/ in Spanish, Italian, Dutch and French are
similar to those of /b, d, g/ in English. For English-speaking listeners it seems as though
speakers of Italian, Spanish and French produce only voiced stops (none of them are
aspirated, i.e. none have a long positive VOT).
Also, VOT has been used as a measure to document developmental changes. Young children
do not produce voiced and voiceless stops with clearly separated VOT values.
Cours 3
Formant transitions
20 - 50 ms from stop to vowel, or vice versa (very rapid articulations)
Very important cues for stop identification and place of articulation: the release burst is often
not produced in daily speech, whereas transitions are always produced but are very fast and therefore
very difficult to measure.
Release bursts
Several studies investigated whether a spectral template can be associated with each place
of articulation (Blumstein & Stevens, 1979 → 85% accuracy with 1800 tokens).
Important features of the release burst are:
- The spectrum at burst onset (falling, rising, steady)
- The spectrum at voice onset
- VOT
- Burst amplitude (in some languages)
- Temporal changes in spectral shape appear to be very important for classification:
dynamic cue
- Presence/Absence of mid-frequency peak
[Figure: Silence → Burst → Aspiration (only voiceless sounds) → Transition]
[Figure: Transition → Silence → Burst → Aspiration (only voiceless sounds) → Transition]
Bilabial:
- Burst: most of the energy in the burst spectrum is low in frequency, between 500 Hz
and 1500 Hz. Also, the acoustic energy may be spread out over a wide range of
frequencies → flat or falling spectrum;
- F1 increases from nearly zero to the frequency of the vowel (the F1 for vowels is
always higher than the F1 of a plosive);
- F2 increases from a low frequency (800 Hz) to the F2 of the vowel;
- F3 increases from a relative low-frequency (2200 Hz) to the F3 of the vowel.
[Figure: spectrograms of the onset of /aba/ and the onset of /ada/, 0 to 4000 Hz, 0.33 to 0.55 s]
Alveolar:
- Burst: the small area in front of the constriction acts as a high-pass filter
(emphasizing the higher-frequency components in the noise source). High-intensity
and high-frequency energy between 2500 Hz and 4000 Hz → a rising spectrum;
following vowel: when followed by a back vowel, a larger cavity is created (resonates
lower frequencies). When /k/ is followed by a front vowel, the smaller cavity
resonates the higher frequencies.
Voicing cues
- F0 tends to be higher for vowels following voiceless stops than those following
voiced ones.
- VOT
- F1 cutback = a delay in F1 relative to the higher formants.
Plosives
Phonemes: /p, t, k, b, d, g/
Phonetic features :
- Manner: stop (plosive)
- Place (bilabial, alveolar, velar)
Acoustic cues
- Silence (corresponds to the period of oral constriction = stop gap)
o Voiced stops: low energy, also called voice bar
- Burst: corresponds to the articulatory release of the oral constriction and to
aerodynamic release (due to build-up of pressure). Bursts occur in initial and medial
position, and are rarely found in final position. Place of articulation may be signaled by
the spectrum of the burst, but the transition is also very important.
- Transition: corresponds to the articulatory
movement from oral constriction for the stop to the more open tract for a following
sound (usually a vowel). Easier to identify for voiced than for voiceless sounds.
Fricatives
Phonemes:
- voiced: /v, ð, z, ʒ/
- voiceless: /f, θ, s, ʃ, h, χ/
Phonetic features:
- Manner: frication (turbulence)
- Place: labiodental, linguadental, alveolar, palatal, glottal
Acoustic cues:
- Voicing
- Frication noise: noise generated as air is forced through a narrow constriction. Then
filtered by the vocal tract:
o /f, v, θ, ð/ are produced most anteriorly. Not much of a front resonating
cavity. Therefore, very low intensity spectrum spread over a broad range of
frequencies. 4500 Hz - 7000 Hz
o /s, z, ʃ, ʒ/: the front cavity is like a tube open at one end (lips) and closed
at the other (the constriction behind the front cavity is very small). Therefore, a
quarter-wave resonator. Front cavity for /s/ is approx. 2.5 cm. Lowest
resonance frequency is 3400 Hz (wavelength 2.5 cm × 4 = 10 cm; 34,000 cm/s ÷ 10 cm = 3400 Hz).
Also higher formants that are odd multiples of the lowest. Lowest is most
important for perception of fricatives.
o /ʃ, ʒ/: longer resonating front cavity than alveolars. The major resonant region is
therefore somewhat lower. Also often produced with lip rounding (lengthens
the VT). Most energy: 2000 Hz
- Transitions to and from the vowels due to changes in the vocal tract
- Sibilants / stridents (/s, z, ʃ, ʒ/) have intense noise energy
- Non-sibilants (/f, v, θ, ð/) have weak noise energy
[Figure: three fricative spectra, level (-20 to 20) against frequency (0 to 10000 Hz)]
[Figure: waveforms of /asa/, /aza/ and /aha/, amplitude against time (s)]
[Figure: spectrograms of /acha/, /aga/ and /aha/, frequency against time (s)]
Affricates
Phonemes: /ʧ, ʤ/
Phonetic features
- manner: fricative and stop (two phase pattern of production)
- place: linguapalatal
Acoustic cues
- Stop gap (silence or low energy interval) due to articulatory closure or some voice
bar. The stop gap is not always well defined if it is preceded by silence or a pause.
- Frication is similar to that of other fricatives. However, affricates show a relatively rapid
increase in noise energy (short rise time). The duration of the noise interval is relatively
shorter than with fricatives. Some studies have shown that the difference between
fricative and affricate can be cued on the basis of duration of frication alone.
- Transitions to and from the preceding and following vowel
[Figure: spectrograms of /adza/ and /atza/, frequency against time (s)]
Nasal consonants
Phonemes: /m, n, ŋ /
Phonetic features:
- manner: nasal
- place: bilabial, alveolar, velar
Acoustic features:
Due to lowering of the velum the nasal cavities are coupled to the rest of the vocal tract.
This introduces anti-resonances (anti-formants, zeroes) and an extra formant (nasal murmur,
nasal formant)
- Antiformants are bands of frequencies in which the acoustic energy has been
damped. On spectrograms they look like extremely weak intensity formants (position
depends on how open the velo-pharyngeal passage is). They occur mainly because
the nasal cavities are very sound absorbent (due to soft moist lining, cilia, and
internal structure). Sound waves traveling through the nose are damped at
frequencies of the antiformants.
- Note: a VT formant amplifies frequencies in its bandwidth and attenuates those
outside the cutoff frequencies. An antiformant attenuates frequencies within its
bandwidth and amplifies those outside its bandwidth.
- Nasals have both formants (intense) and antiformants (weak)!
- Nasal murmur: caused by oral blockage and lowered velum (extra resonances
because of 2 branches leading to pharynx). Results in nasal radiation of acoustical
energy. The spectrum is dominated by high intensity, low-freq. energy (< 500 Hz).
Murmur cues of the three different nasals are not exactly alike, but are difficult to use as a
distinctive cue
- Transitions: preceding and following vowels will be nasalized. Cues to place of
articulation
- Voicing is always present (except during whispering)
Nasal vowels
Nasalize vowels by allowing the velum to remain slightly open.
Nasal-non nasal phonemic distinctions occur in many languages (not in English).
Open velum affects filter curve of VT: F1 becomes broader, less peaked (lower amplitude)
because of damping of F1 by the loss of energy through the nose. The effect of nasalization on a vowel
depends on the non-nasal formant frequencies determined by the oral tract shape and on
the frequencies of the zeroes and poles introduced by nasal tract coupling.
Example of French:
- Both vowels have well-defined peaks in amplitude
- F1 of nasalized vowel is broadened
- F2 appears as a hump from 1100-1750 Hz, while it is a relatively narrow peak for the
non-nasalized vowel
F1 & F2 of nasalized vowel are 5-8 dB weaker than of non-nasalized vowel
F3 weakened by about 11 dB
The F4-F5 region is lowered by about 20 dB, possibly due to a zero near 3125 Hz
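To get a feel for the decibel figures above, the standard conversions from a level difference in dB to a linear ratio can be sketched in Python. The function names are assumptions; the formulas are the usual ones (factor 20 for amplitude, factor 10 for power), and the notes do not say which quantity the quoted dB values were measured on.

```python
def db_to_amplitude_ratio(db):
    """Linear amplitude ratio for a level difference in dB (20 * log10 convention)."""
    return 10 ** (db / 20.0)

def db_to_power_ratio(db):
    """Linear power ratio for a level difference in dB (10 * log10 convention)."""
    return 10 ** (db / 10.0)

# Level drops quoted in the notes for the nasalized vowel.
for drop_db in (5, 8, 11, 20):
    print(drop_db, round(db_to_amplitude_ratio(-drop_db), 3), round(db_to_power_ratio(-drop_db), 3))
```

A 20 dB drop, for example, corresponds to one tenth of the amplitude and one hundredth of the power.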
[Figure: spectrograms of /ama/, /ana/ and /anga/, 0 to 6000 Hz against time (s)]
Glides
Also called ‘approximants’ and semivowels:
- Gradual articulatory movement
- VT narrowed, not closed
Phonemes: /j/ & /w/
Phonetic features
- Manner: glide or semivowel
- Place: palatal or labiovelar
Acoustic cues :
- A relatively slow transition (75-150 ms)
- No steady-state portion as with diphthongs
- F1 of both sounds starts at very low value (a little higher than for stops)
- F2 of /w/: 800 Hz (compare with /b/!), F3 of /w/: 2200 Hz. Lower frequencies for all
formants due to lengthening of the vocal tract.
- F2 of /j/: 2200 Hz (compare with /d/!), F3 is 3000 Hz
‘awa’ looks like ‘aba’ and ‘aja’ looks like ‘ada’: they share the same place of articulation, but not the
same manner.
Differences are:
- Voice bar and release burst for the stop
- For [w]: strong low-frequency energy throughout, no burst
- Duration of stop transition = 40 ms, of glide = 75 ms (note: speaking rate can play an
important role: a transition that is heard as a stop at a slow rate can be heard as a
glide at a fast rate, Miller & Baer, 1983)
- Steeper onset and offset in intensity changes for stop than for glide
Liquids
Phonemes: /l/ & /r/
Phonetic features:
- Manner: lateral or rhotic
- Place: alveolar for /l/, palatal for /r/
Acoustic cues: rather complex:
- both relatively fast formant transitions
- similarity with glides: well-defined formant structure (less constriction than stops,
fricatives, and affricates)
- /l/: energy mainly in the low frequencies. Resonances and anti-resonances due to
divided vocal tract. Resembles /n/. F1: 360 Hz, F2: 1300 Hz, F3: 2700 Hz
- /r/: similar for F1
o F2 somewhat lower than for /l/
o F3 especially lower (1650 Hz). Durations of formant transitions somewhat
longer for /r/ than for /l/
- temporal cues:
o /r/: F1 has a short steady-state + relatively long transition
o /l/: F1 has a long steady-state + relatively short transition
[Table fragment: summary of acoustic features for consonants]
- […] some back vowels. Burst has weak and flat spectrum.
- […] some back vowels. Burst has weak and flat spectrum.
- […] vowels. Noise segment has intense, high frequency spectrum (> 4 kHz).
- Noise segment has intense, high frequency spectrum (> 3 kHz).
- Burst has weak and flat spectrum.
Glide:
- [w]: F1 increases, F2 increases
- [j]: F1 increases, F2 increases
Exercises
Write the symbols for the vowels in the following words: bread, rough, foot, hymn, pull,
cough, mat, friend.
Transcribe the following words: bake, bought, bored, goat, tick, guard, doubt, bough, peak,
football, bank, threat(e)ning, gently, usual.
Transcribe the sentences: “Opening the bottle presented no difficulty.”, “All the time.”