http://ocw.mit.edu
For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
24.910
Laboratory Phonology
Basic Audition
No class next week (Tuesday is a Monday)
Readings for 2/27: Johnson chs 5 & 6
Assignments (due 2/27):
Basic acoustics.
VOT and laryngeal contrasts in Mandarin and
English.
Audition
Anatomy
[Figure: anatomy of the ear. Labels: outer ear, middle ear, inner ear, ear flap, ear canal, eardrum, hammer, anvil, stirrup, Eustachian tube, cochlea, auditory nerve.]
Audition
Loudness
Pitch
Auditory spectrograms
Loudness
The perceived loudness of a sound depends on the
amplitude of the pressure fluctuations in the sound
wave.
Amplitude is usually measured in terms of root-mean-square (rms) amplitude:
The square root of the mean of the squared amplitude
over some time window.
rms amplitude
Square each sample in the analysis window.
Calculate the mean value of the squared waveform:
Sum the squared samples and divide by the number of samples.
Take the square root of the mean.
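The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not code from the course: the function name `rms_amplitude` and the test window (one cycle of a sine wave with peak amplitude 1) are assumptions made for the example.

```python
import math

def rms_amplitude(samples):
    """Root-mean-square amplitude of one analysis window."""
    squared = [s * s for s in samples]          # 1. square each sample
    mean_square = sum(squared) / len(squared)   # 2. mean of the squared samples
    return math.sqrt(mean_square)               # 3. square root of the mean

# Hypothetical analysis window: one full cycle of a sine wave, peak amplitude 1.
n = 1000
window = [math.sin(2 * math.pi * k / n) for k in range(n)]
rms = rms_amplitude(window)
print(round(rms, 4))
```

For a sine wave the rms amplitude is the peak amplitude divided by the square root of 2, so the printed value should be close to 0.7071.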
[Figure: pressure, pressure^2, and rms amplitude plotted against time (0 to 0.2; vertical axis -1.5 to 1.5), and rms amplitude plotted against time in seconds (0 to .02).]
Figure by MIT OpenCourseWare. Adapted from Johnson, Keith. Acoustic and Auditory Phonetics. Malden, MA: Blackwell Publishers, 1997. ISBN: 9780631188483.
Intensity
log x^n = n log x
[Figure: log x plotted against x, for x from 0 to 50.]
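The identity matters for sound levels because intensity is proportional to pressure squared: a level computed as 10 times the log of an intensity ratio equals 20 times the log of the corresponding pressure ratio. The sketch below checks this numerically. It assumes the conventional reference pressure of 20 µPa for dB SPL; the function names and the constant `P_REF` are illustrative and do not come from the slides.

```python
import math

P_REF = 20e-6  # assumed reference pressure for dB SPL, in pascals (20 µPa)

def db_from_intensity_ratio(p_rms, p_ref=P_REF):
    # Intensity is proportional to pressure squared, so the intensity
    # ratio is the squared pressure ratio.
    return 10 * math.log10((p_rms / p_ref) ** 2)

def db_from_pressure_ratio(p_rms, p_ref=P_REF):
    # Same level, using log x^n = n log x with n = 2.
    return 20 * math.log10(p_rms / p_ref)

level_a = db_from_intensity_ratio(2.0)  # rms pressure of 2 Pa
level_b = db_from_pressure_ratio(2.0)
print(round(level_a, 3), round(level_b, 3))
```

Both functions give the same level, about 100 dB SPL for an rms pressure of 2 Pa.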
Loudness
The relationship between intensity and perceived loudness
is not exactly logarithmic.
[Figure: loudness in sones (0 to 20) and level in dB SPL (0 to 100) plotted against pressure (µPa, 0 to 2,000,000).]
Figure by MIT OpenCourseWare. Adapted from Johnson, Keith. Acoustic and Auditory Phonetics. Malden, MA: Blackwell Publishers, 1997. ISBN: 9780631188483.
Loudness
Loudness also depends on frequency.
equal loudness contours for pure tones:
[Figure: frequency in Bark (0 to 24) plotted against frequency in kHz (0 to 10).]
Figure by MIT OpenCourseWare. Adapted from Johnson, Keith. Acoustic and Auditory Phonetics. Malden, MA: Blackwell Publishers, 1997. ISBN: 9780631188483.
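The Bark axis in the figure above compresses high frequencies relative to a linear Hz scale. The slides do not say which Hz-to-Bark conversion underlies the figure, so the sketch below uses one common approximation, Traunmüller's (1990) formula, purely as an assumption; the function name `hz_to_bark` is also illustrative.

```python
def hz_to_bark(f_hz):
    """Approximate Bark value for a frequency in Hz.

    Uses Traunmueller's (1990) approximation; this is an assumed
    conversion, not necessarily the one behind the lecture figure.
    """
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53

# Equal 500 Hz steps cover fewer Bark at high frequencies than at low ones.
low_step = hz_to_bark(1000) - hz_to_bark(500)
high_step = hz_to_bark(9500) - hz_to_bark(9000)
print(round(low_step, 2), round(high_step, 2))
```

The same 500 Hz step spans several Bark near the bottom of the range but only a fraction of a Bark near 9 kHz, which is the non-linearity the figure illustrates.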
Pitch
The non-linear frequency response of the auditory system is related to the
physical structure of the basilar membrane.
basilar membrane uncoiled:
Figure by MIT OpenCourseWare. Adapted from Johnson, Keith. Acoustic and Auditory Phonetics. Malden, MA: Blackwell Publishers, 1997. ISBN: 9780631188483.
Masking - simultaneous
Energy at one frequency can reduce audibility of
simultaneous energy at another frequency (masking).
One sound can also mask a preceding or following sound.
[Figure: masking (1 to 10^4, logarithmic axis) plotted against frequency of masked tone (400 to 4000 Hz).]
Example of masking of a tone by a tone. The frequency of the masking tone is 1200 Hz. Each curve corresponds to a
different masker level, and gives the amount by which the threshold intensity of the masked tone is multiplied in the
presence of the masker, relative to its threshold in quiet. The dashed lines near 1200 Hz and its harmonics are estimates
of the masking functions in the absence of the effect of beats.
Figure by MIT OpenCourseWare. Adapted from Stevens, Kenneth N. Acoustic Phonetics. Cambridge, MA: MIT Press, 1999. ISBN: 9780262194044.
Time course of auditory nerve response
Response to a noise burst:
[Figure: discharge rate (SP/S, 0 to 600) plotted against time (0 to 400) in three panels labeled NO AT, AT = 13 dB SPL, and AT = 37 dB SPL, with TT = 27 dB SPL; a further time axis runs from 0 to 256 msec.]
Figure by MIT OpenCourseWare. Adapted from Stevens, Kenneth N. Acoustic Phonetics. Cambridge, MA: MIT Press, 1999. ISBN: 9780262194044,
after Delgutte, B. "Representation of Speech-like Sounds in the Discharge Patterns of Auditory-nerve Fibers."
Journal of the Acoustical Society of America 68, no. 3 (1980): 843-857.
Auditory spectrograms
The auditory system performs a running frequency analysis of
acoustic signals - cf. spectrogram.
A regular spectrogram analyzes frequency bands of equal widths,
but the peripheral auditory system analyzes frequency bands
that are wider at higher frequencies.
Further disparities are introduced by the non-linearities of the
peripheral auditory system, e.g.
loudness is non-linearly related to intensity
masking (simultaneous and nonsimultaneous)
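The widening of auditory frequency bands can be made concrete with the ERB (equivalent rectangular bandwidth) scale that appears in this lecture's excitation-pattern figures. The slides do not state which ERB formulas are used, so the sketch below assumes the widely cited Glasberg and Moore (1990) approximations; the function names are illustrative.

```python
import math

def erb_bandwidth(f_hz):
    """Approximate auditory filter bandwidth (Hz) at centre frequency f_hz.

    Assumed formula: Glasberg and Moore (1990).
    """
    return 24.7 * (4.37 * f_hz / 1000.0 + 1.0)

def erb_number(f_hz):
    """Approximate position of f_hz on the ERB-number scale (E)."""
    return 21.4 * math.log10(4.37 * f_hz / 1000.0 + 1.0)

for f in (250, 1000, 4000):
    print(f, round(erb_bandwidth(f), 1), round(erb_number(f), 1))
```

Under these assumptions the bandwidth grows from roughly 50 Hz at 250 Hz to several hundred Hz at 4000 Hz, whereas a regular spectrogram uses the same analysis bandwidth at every frequency.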
[Figure: acoustic and auditory spectra. Axes: Amplitude (dB); Auditory Frequency (Bark), 0 to 24.]
A comparison of acoustic (light line) and auditory (heavy line) spectra of a complex wave composed of sine waves at 500 and 1,500 Hz. Both spectra extend from 0 to 10 kHz, although on different frequency scales. The auditory spectrum was calculated from the acoustic spectrum using the model described in Johnson (1989).
Image by MIT OpenCourseWare. Adapted from Johnson, Keith. Acoustic and Auditory Phonetics. Malden, MA: Blackwell Publishers, 1997. ISBN: 9780631188483.
[Figure: Vowel /I/. Top panel: Level, dB against Frequency, Hz (0 to 4,000). Bottom panel: Excitation Level, dB against Number of ERBs, E (2 to 26), with curves labeled 50 and 80.]
The spectrum of a synthetic vowel /I/ (top) plotted on a linear frequency scale, and the excitation patterns for that vowel (bottom) for two overall levels, 50 and 80 dB. The excitation patterns are plotted on an ERB scale.
[Figure: vowels i, e, a, o, u plotted by F2 (Hz, 2500 to 500) against a second axis running from 200 to 800.]
ERB scales
[Figure: vowel plot on ERB scales, with F2 (E) from 25 to 15 and E(F1) from 6 to 16.]
24.910
Linguistic Phonetics
Analog-to-digital conversion of speech signals
[Figure: a waveform (amplitude -1.2 to 2.0) over 0.00 to 0.05 s, and "The Results Of Sampling" on the same axes.]
[Figure caption fragments, truncated in the source: "... intervals. The sampling rate is measured ..."; "A wave with a fundamental frequency of 100 Hz and a major ..."]
[Figure: amplitude plotted against time (ms), quantized with 20 steps and with 200 steps.]
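The two stages of analog-to-digital conversion suggested by the figures above, sampling at regular intervals and quantizing amplitude into a fixed number of steps, can be sketched as follows. Everything here is an assumption made for illustration: the 100 Hz sine wave, the 10,000 Hz sampling rate, the amplitude range of -1 to 1, and the function names. Only the step counts (20 and 200) are taken from the figure labels.

```python
import math

def sample(signal, duration_s, rate_hz):
    """Evaluate a continuous signal at regular intervals of 1/rate_hz seconds."""
    n = int(round(duration_s * rate_hz))
    return [signal(k / rate_hz) for k in range(n)]

def quantize(samples, n_steps, lo=-1.0, hi=1.0):
    """Map each sample to the midpoint of one of n_steps equal amplitude steps."""
    step = (hi - lo) / n_steps
    out = []
    for s in samples:
        idx = int((s - lo) / step)
        idx = max(0, min(idx, n_steps - 1))  # keep the extremes inside the range
        out.append(lo + (idx + 0.5) * step)
    return out

def wave(t):
    # Hypothetical input: a 100 Hz sine wave with peak amplitude 1.
    return math.sin(2 * math.pi * 100 * t)

samples = sample(wave, 0.02, 10000)   # 20 ms at an assumed 10,000 Hz rate
coarse = quantize(samples, 20)        # 20 amplitude steps
fine = quantize(samples, 200)         # 200 amplitude steps

max_err_coarse = max(abs(a - b) for a, b in zip(samples, coarse))
max_err_fine = max(abs(a - b) for a, b in zip(samples, fine))
print(len(samples), max_err_coarse, max_err_fine)
```

With ten times as many steps, the largest quantization error is about ten times smaller, since the error is bounded by half the step size.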
[Figure: waveforms and spectrograms (0 to 5000 Hz) of "die" (1.1154 to 1.27558 s) and "tie" (4.26614 to 4.42706 s).]
VOT, closure voicing
English intervocalic stops can be fully voiced
VOT is 0 ms in 2nd and 3rd stops
[Figure: waveform and spectrogram (0 to 5000 Hz) of "brigadoo(n)", 547.195 to 547.485 s.]
VOT, closure voicing
Hindi - three-way contrast
recordings from Ladefoged
http://www.phonetics.ucla.edu/vowels/chapter12/hindi.html
[Figure: three panels with time axes (s) running from 0 to 0.231127, 0 to 0.113588, and 0 to 0.13828.]