A spectrogram is a visual representation of the spectrum of frequencies in a signal
as they vary with time or some other variable. The spectrogram is a basic tool in audio spectral
analysis and other fields. It has been applied extensively in speech analysis. The instrument that
generates a spectrogram is called a spectrograph or spectrometer. A common format is a graph with
two geometric dimensions: the horizontal axis represents time and the vertical axis represents
frequency. A third dimension, the amplitude of a particular frequency at a particular time, is
represented by the intensity or colour of each point in the image.
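
The time/frequency/amplitude layout described above can be sketched in a few lines of code. The example below is only an illustration, not how any particular spectrograph works: it cuts a signal into overlapping frames, applies a Hann window, and takes a naive discrete Fourier transform of each frame. The names `spectrogram`, `frame_len` and `hop`, the 8000 Hz sample rate and the 1000 Hz test tone are all assumptions made for the example.

```python
import cmath
import math

def hann(n):
    """Hann window of length n (tapers frame edges to reduce spectral leakage)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def dft_magnitudes(frame):
    """Magnitudes of the non-negative-frequency DFT bins of one frame (naive O(n^2))."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(acc))
    return mags

def spectrogram(samples, frame_len=64, hop=32):
    """Return one list of bin magnitudes per frame: rows are time, columns are frequency."""
    window = hann(frame_len)
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [samples[start + i] * window[i] for i in range(frame_len)]
        frames.append(dft_magnitudes(frame))
    return frames

# Hypothetical test signal: a 1000 Hz tone sampled at 8000 Hz (both values assumed).
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(512)]
spec = spectrogram(tone)

# The strongest bin of the first frame, converted to Hz (bin spacing = sample_rate / frame_len).
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / 64
print(peak_hz)
```

Each inner list is one vertical slice of the picture. Drawing the magnitudes as darkness or colour, with frames running left to right, gives the familiar image. A real tool would use a fast Fourier transform and usually a logarithmic (decibel) amplitude scale.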
Vowels
Vowels usually have very clearly defined formant bars, as in the following:
In diphthongs, you can see the formants change frequency as the tongue body moves
through the mouth:
You can't always tell reliably which formant you're looking at -- F1, F2, F3, etc. -- unless
you already have a good idea of where to expect them. But the existence of formants is
usually obvious enough that you can at least be sure you're looking at a vowel.
(There are some especially common difficulties in identifying formants. In [], and
sometimes other back vowels, F1 and F2 are often so close together that they appear as a
single wide formant band. In [i], F2 and F3 also often appear merged together in a single
wide band.)
Fricatives
Fricatives are easy. The turbulent airstream of fricatives creates a chaotic mix of random
frequencies, each lasting for a very brief time. The result sounds much like static noise, and
on a spectrogram it looks like the kind of static noise you might see on a TV screen.
While each momentary burst of energy occurs at a random frequency, the bursts tend to
cluster around certain frequencies. [s] has a higher average frequency
than [] does, and both are higher than [f] or [].
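
One way to put a number on "average frequency" is the spectral centroid, the magnitude-weighted mean of the frequencies in a frame. The sketch below is an assumption-laden illustration rather than a measurement of real fricatives: it uses seeded random noise passed through two crude filters as stand-ins for a brighter and a duller noise. The function names and the 16000 Hz sample rate are hypothetical.

```python
import cmath
import math
import random

def magnitude_spectrum(frame):
    """Magnitudes of the non-negative-frequency DFT bins of one frame (naive O(n^2))."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
            for k in range(n // 2 + 1)]

def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency of one frame, in Hz."""
    mags = magnitude_spectrum(frame)
    total = sum(mags)
    if total == 0:
        return 0.0  # a silent frame has no meaningful centroid
    n = len(frame)
    return sum(k * sample_rate / n * m for k, m in enumerate(mags)) / total

sample_rate = 16000  # assumed for the example
rng = random.Random(0)  # fixed seed so the example is repeatable
noise = [rng.uniform(-1, 1) for _ in range(257)]

# Stand-ins only, not real speech: a first difference boosts high frequencies,
# a two-point average boosts low frequencies.
bright = [noise[i + 1] - noise[i] for i in range(256)]
dull = [(noise[i + 1] + noise[i]) / 2 for i in range(256)]

bright_hz = spectral_centroid(bright, sample_rate)
dull_hz = spectral_centroid(dull, sample_rate)
print(bright_hz > dull_hz)
```

On real recordings the centroid would be averaged over many frames of the fricative, since any single frame of noise is erratic.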
Voiced fricatives show aspects of both regular vocal fold vibrations and a randomly
turbulent airstream.
[h]
[h] is really a voiceless version of the preceding or following vowel. On a spectrogram, it
looks a little like a cross between a fricative and a vowel. It will have a lot of random noise
that looks like static, but through the static you can usually see the faint bands of the
voiceless vowel's formants.
Plosives
The medial phase of a voiceless plosive is complete silence. On a spectrogram, this will
appear as a white blank.
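
The blank gap can also be found numerically as a run of frames with almost no energy. The following is a minimal sketch under stated assumptions: the "vowel" is a synthetic 200 Hz sine, the closure is exact digital silence, and the frame size, threshold and function names are hypothetical choices for the example.

```python
import math

def frame_energies(samples, frame_len=80, hop=80):
    """Mean squared amplitude of each non-overlapping frame."""
    return [sum(s * s for s in samples[i:i + frame_len]) / frame_len
            for i in range(0, len(samples) - frame_len + 1, hop)]

def silent_runs(energies, threshold):
    """Return (first_frame, last_frame) for each run of frames below threshold."""
    runs, start = [], None
    for i, e in enumerate(energies):
        if e < threshold and start is None:
            start = i                     # a quiet run begins
        elif e >= threshold and start is not None:
            runs.append((start, i - 1))   # the quiet run ended on the previous frame
            start = None
    if start is not None:                 # the signal ended while still quiet
        runs.append((start, len(energies) - 1))
    return runs

sample_rate = 8000  # assumed for the example
# Synthetic stand-in for vowel + closure + vowel (100 ms, 80 ms, 100 ms).
vowel = [0.5 * math.sin(2 * math.pi * 200 * t / sample_rate) for t in range(800)]
closure = [0.0] * 640
signal = vowel + closure + vowel

energies = frame_energies(signal)
gaps = silent_runs(energies, threshold=1e-4)

# Convert the first gap from frames to milliseconds (80 samples per frame).
first, last = gaps[0]
gap_ms = (last - first + 1) * 80 / sample_rate * 1000
print(gaps, gap_ms)
```

A real recording never contains exact silence, so the threshold would have to be set relative to the background noise level.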
The quiet vocal fold vibrations in a voiced plosive will sometimes appear as a faint band
along the bottom of the spectrogram at the frequency of f0. (But very often you won't see
anything there, either because the voicing got lost in the background noise or because the
recording or computer equipment cut off frequencies that low.)
To tell the difference between plosives, listeners rely on the release burst and on formant
transitions. On a spectrogram, the release burst looks like a very, very thin fricative. The
formant transitions (if you can see them) look like the formants have been distorted away
from the frequencies they have during most of the vowel.
Aspiration will look like a period of [h] between the blank gap and the vowel -- specifically,
a voiceless version of the following vowel. (Recall that the tongue body is in position for the
following vowel and that aspiration is just a delay in the onset of voicing.)
NB: Aspiration is not the same as the release burst. The period of aspiration (which only
some voiceless plosives have) is much longer than the very short release burst (which all
released plosives have).
The above spectrogram is of the English word attack [tk].