
Pronunciations in connected speech:

A survey of weak forms in a spoken corpus of American English

Takehiko Makino
(Chuo University, Japan)

1
Background
• I have long felt that standard phonetic descriptions of
English focus mainly on a relatively formal style and do not
tell us what actually happens in connected speech. I found
the descriptions of weak forms especially unsatisfactory.
• In a previous study (Makino 2009), I surveyed the actual
pronunciations of some of the words listed as “weakeners” in
Obendorfer (1998), using the Buckeye Corpus of Conversational
Speech (Pitt et al. 2007), a phonetically transcribed spoken
corpus of American English.
– Makino (2009) is written in Japanese and is available at
http://www.scribd.com/full/25041468?access_key=key-14qctczl5mqnjkbiksuw
• I identified many forms that are listed neither by Obendorfer
nor in reference works such as LPD3 and EPD17.

2
Current list of weak forms
• Obendorfer (1998) lists some 100 words that are
weakened in the appropriate (phonetic) environments
(called “weakeners”), together with their weak-form
pronunciations. To my knowledge, it is the most
comprehensive list of its kind, and it is the starting
point of this series of studies.

• I added what I found in LPD3 (2008) and EPD17 (2006)
to Obendorfer’s list. LPD3 adds 2 words not listed by
Obendorfer.

• I have left out of Obendorfer’s list the archaic
second person singular verb forms, which are unlikely
to be used in spontaneous speech.

3
articles: a, an, the
auxiliary verbs: be, been, am, are, is, was, were, had,
has, have, did, do, does, can, could, may, must, shall,
should, will, would
prepositions: at, by, for, from, in, of, on, per, to, up, with
conjunctions: and, but, nor, or, so, as, if, than, till
relatives: what, when, who, whom, whose, that
personal pronouns: he, her, him, his, it, me, my, she,
their, them, they, us, we, you, your, one
determiners: any, no, some, such, this
adjectives/adverbs: sure, just, not, then, there
verbs: come, get, go, said, says, sit, thank
nouns: ma'am, Saint, sir, time, times
interjection: well

Only in LPD3: I, its


4
Possible additions considered in
the previous study
• The list has the following possible gaps:
– Of the relatives, “how”, “where”, “which” and
“why” are not listed.
– “our” is the only personal pronoun not listed.
– “might” is missing from the auxiliaries.
– Some inflected forms of main verbs are unlisted:
“said” and “says” are listed but not “say”, “come” is
listed but not “came”, “go” is listed but not “went” or
“gone”, and “get” is listed but not “gets”, “got” or
“gotten”.

5
Findings in the previous study:
Words which should be added to the list
where (n=567)
form    tokens  %     group %
wɛr     186     32.8  42.5
wɛ      16      2.8
weɪ     12      2.1
weɪr    9       1.6
wɪ      9       1.6
hwɛr    3       0.5
hwɛ     2       0.4
wɪr     4       0.7
wɚ      204     36.0  52.9
wər     61      10.8
wə      17      3.0
hwɚ     13      2.3
ɚ       3       0.5
wɚr     2       0.4
others  26      4.6   4.6
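
The percentage columns can be recomputed from the raw counts. The following is a minimal sketch in Python using the counts for “where” given above. The names `group_a`, `group_b` and `share` are made up for this illustration, and the split into two groups simply follows the subtotals in the last column.

```python
# Counts for "where" copied from the table (n=567).
# group_a and group_b are the two sets of forms whose subtotals
# appear in the last column of the table.
group_a = {"wɛr": 186, "wɛ": 16, "weɪ": 12, "weɪr": 9, "wɪ": 9,
           "hwɛr": 3, "hwɛ": 2, "wɪr": 4}
group_b = {"wɚ": 204, "wər": 61, "wə": 17, "hwɚ": 13, "ɚ": 3, "wɚr": 2}
others = 26

total = sum(group_a.values()) + sum(group_b.values()) + others


def share(count, n):
    """Percentage of n, rounded to one decimal place as in the table."""
    return round(100 * count / n, 1)


share_a = share(sum(group_a.values()), total)
share_b = share(sum(group_b.values()), total)
share_others = share(others, total)
```

The same calculation applies to the other tables in this section, since each pairs per-form counts with a subtotal for the group.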
6
gets (n=78)
form    tokens  %     group %
gɛts    33      42.3  53.8
gɛs     7       9.0
gɛds    1       1.3
gɛz     1       1.3
gɪts    29      37.2  46.2
gɪs     2       2.6
gɪtʃ    1       1.3
gɪds    1       1.3
gɪdz    1       1.3
gɪz     1       1.3
kɪts    1       1.3

7
our (n=349)
form    tokens  %     group %
ɑːr     257     73.6  79.7
aʊr     16      4.6
ɑː      3       0.9
ɔːr     2       0.6
ər      30      8.6   18.3
ɚ       29      8.3
ə       5       1.4
others  7       2.0   2.0

most (n=236)
form    tokens  %     group %
moʊst   111     47.0  85.2
moʊs    87      36.9
moʊʃ    3       1.3
məst    14      5.9   11.0
məs     12      5.1
others  9       3.8   3.8

8
which (n=319)
form    tokens  %     group %
wɪtʃ    257     80.6  84.6
hwɪtʃ   7       2.2
wɪʃ     6       1.9
wʊtʃ    14      4.4   9.4
wətʃ    13      4.1
tʃ      3       0.9
others  19      6.0   6.0

9
went (n=427)
form    tokens  %     group %
wɛn     123     28.8  87.6
wɛnt    109     25.5
wɛɾ̃     55      12.9
wɛnʔ    30      7.0
wɛ̃ʔ     17      4.0
wɛʔ     10      2.3
wɛ̃t     8       1.9
wɛnd    6       1.4
wɛ̃      5       1.2
wɛ      5       1.2
wɛt     4       0.9
wɛm     2       0.5
wən     17      4.0   7.5
wəɾ̃     5       1.2
wə̃ʔ     5       1.2
wənt    3       0.7
wn̩t     2       0.5
others  21      4.9   4.9

10
Other candidates for the list
• here = 3.2% weak
• why = 2.6% weak
• though = 2.6% weak
• how = 2.5% weak

• got, gotten, gone, came, might = no
tokens of “weak” forms

11
Other weak forms for the selected
words in the list
• a: ə, ɪ, l̩, ʊ, ɚ, i, u
• an: ən, ɪn, n̩, əɾ̃, ɪɾ̃
• been: bɪn, bɪɾ̃, bən, bɪ̃, bn̩, bɪ, bɪm
• was: wəz, əz, wəs, wɪz, wʊz, ɪz, ʊz, wɪs,
əs, wʊs, z, wuz, uz, wɪʒ, s, ʊs, wəʒ
• have: əv, ə, ɪv, ɪ, v, həv
• do: du, dɪ, də, ɾu, tu, d, dʊ, di, ɾɪ, ɾi
• from: frəm, fɚm, frm̩, fəm, fm̩, fɛm

12
In the present study
• This study continues the previous one: I seek
to compile a more complete list of pronunciations
of weakeners and of other words not listed in
current descriptions.
• I also attempt to account for the processes
that produce the various forms and to tidy
up the lists.

13
Data source
• Buckeye Corpus of Conversational
Speech (Pitt et al. 2007), compiled in the
Department of Psychology at the Ohio
State University.
• A phonetically transcribed corpus of
informal interviews (300,000 words in total)
with 40 speakers of English from the
Columbus, Ohio area.

14
• According to the corpus manual, the
transcription procedure was:
1. Writing down the orthographic words using
Soundscriber software.
2. Automatically generating phonetic
transcriptions and aligning the word
transcription and the phonetic transcription
to the media file, using ESPS Aligner software.
3. Manually correcting the automatic
transcriptions.

15
Limitation of the corpus
• The corpus does not transcribe word stress,
sentence accent or intonation. Thus, it is
impossible to decide whether the
segments iy, ih, uw, uh, ah (ə/ʌ) and er
(ɚ/ɝ) correspond to strong or weak vowels
just by looking at the transcription. (It would
be possible by listening to the media files, but
that has been impractical given the large
amount of data.)
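
To make this limitation concrete: a transcribed form needs to be checked by ear whenever it contains one of the segment labels named above. The following is a minimal sketch, assuming each form is available as a list of the corpus's segment labels. The function name and the two sample forms are hypothetical.

```python
# Segment labels named above as ambiguous between a strong and a weak
# vowel when stress is not transcribed.
AMBIGUOUS = {"iy", "ih", "uw", "uh", "ah", "er"}


def ambiguous_segments(segments):
    """Return the labels in a form that could be either strong or weak."""
    return [s for s in segments if s in AMBIGUOUS]


# Hypothetical forms, written as lists of segment labels.
forms = {
    "w ah z": ["w", "ah", "z"],
    "w aa z": ["w", "aa", "z"],
}

# For each form, the segments that cannot be classified without listening.
needs_listening = {name: ambiguous_segments(segs)
                   for name, segs in forms.items()}
```

An empty result means the transcription alone does not leave the vowel quality open in this sense. It says nothing about stress itself, which the corpus does not record.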
16
Procedure
• Searching for words and extracting their
phonetic forms using the SpeechSearcher
software which accompanies the corpus.
• Comparing the results with the current list.
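
The two steps can be pictured as a simple tally followed by a comparison. The sketch below is only an illustration and does not reproduce SpeechSearcher's own interface. It assumes the tokens are already available as (word, phonetic form) pairs; the sample tokens are invented, and the `listed` set holds the two weak forms of “and” attributed to Obendorfer later in this presentation.

```python
from collections import Counter, defaultdict

# Hypothetical tokens: (orthographic word, phonetic form).
tokens = [
    ("and", "ɛn"), ("and", "ɪn"), ("and", "ɛn"), ("and", "ənd"),
    ("that", "ðæʔ"), ("and", "ɪn"), ("and", "ɛn"),
]

# Step 1: collect the phonetic forms found for each word.
variants = defaultdict(Counter)
for word, form in tokens:
    variants[word][form] += 1

# Step 2: compare with the forms already on the list.
listed = {"and": {"ənd", "n"}}


def unlisted_forms(word):
    """Forms attested in the tokens but absent from the list, by frequency."""
    known = listed.get(word, set())
    return [(form, n) for form, n in variants[word].most_common()
            if form not in known]
```

Calling `unlisted_forms("and")` on the sample data returns the forms that would be candidates for addition, most frequent first.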

17
Other words in the list
• Since it is impossible to discuss all the
other words in the list here, I will give a
couple of weakeners which have been
found to occur in a very large number of
variants:
– “that”
– “and”

18
“that”
• 351 different forms in a total of 5,871 tokens (excluding
three possible mistranscriptions, namely /ennæt/,
/wɪθðæʔ/ and /UNKNOWN ɛʔ/).
• Forms occurring more than 100 times (possible weak
forms in red): ðæʔ (n=1,086), ðæt (n=727), ðɛt, ðɛʔ,
ðæɾ, ðɪt, ðæ, ðɛɾ, ðɪʔ, næʔ, næt
• Forms occurring more than 10 times: ðɛ, ðəʔ, ðət, æʔ,
ðɪɾ, ðɪ, θæʔ, ɛʔ, næɾ, ðæd, næ, æt, θæt, ðəɾ, nɛt, ðɛd,
æ, dæʔ, ðə, ɛt, ɛ, dæt, nɛʔ, ðɪd, æɾ, θɪt, ɪt, zæt, læʔ, ə,
θɛt, ɪ, ɪʔ, nɪt, əʔ, θɛʔ, ðæk, zɛt, dɛt, ɪɾ, nɛ, zæʔ, zɛɾ, ət,
ðæp, læt, θɛɾ, zɪt, əɾ, dɛʔ, ðæb, ɛɾ, næd, nət, nɛɾ, zət
• There are still 284 other forms left!
cf. ðət (Obendorfer); ðət (LPD3); ðət, ðt (EPD17)
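
The frequency bands used on this slide are easy to reproduce once per-form counts are available. The following is a minimal sketch: the counts for ðæʔ and ðæt are the ones given above, the other counts are invented, and the function name `band` is an assumption. The last line rechecks the number of forms left unnamed, given the 11 forms listed above 100 tokens and the 56 listed above 10.

```python
def band(count):
    """Assign a form to a frequency band by its token count."""
    if count > 100:
        return "more than 100"
    if count > 10:
        return "more than 10"
    return "10 or fewer"


# ðæʔ and ðæt carry the counts from the slide; the rest are invented.
counts = {"ðæʔ": 1086, "ðæt": 727, "ðəɾ": 40, "ðæb": 11, "ðæm": 3}

bands = {}
for form, n in counts.items():
    bands.setdefault(band(n), []).append(form)

# 351 forms in total, minus the 11 and 56 forms named on the slide.
remaining = 351 - 11 - 56
```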

19
“and”
• 189 different forms in a total of 10,998 tokens.
• Forms occurring more than 1,000 times: ɛn, n̩, ɪn
• Forms occurring more than 100 times: æn, ən, ɛɾ̃,
ænd, ɛnd, ɛ, æɾ̃, n, əɾ̃, ɛn
• Forms occurring more than 10 times: ɪɾ̃, æn, ə, m̩,
ɛm, ænt, ɪ, ən, ɛnt, æ, n̩d, ɪnd, ʊn, əm, ɪn, ɛŋ, ənd,
ɪm, ɛn̩, ɾ̃, ən̩, ŋ̩, æm, eɪn, ɪŋ
• Still 151 forms left!
cf. ənd, n (Obendorfer); ənd, ən, nd, n, m, n, ŋ, əm,
əŋ (LPD3); ənd, ən, nd, n, m, ŋ (EPD17)

20
The need to tidy the variants
• Obviously, many of the forms found here result from the
influence of the flanking words, to which the word in question
is, for example, assimilated. In principle, such forms should
not be listed as variants of that word; they should instead be
accounted for by more general phonological processes that have
already been identified. This is not limited to weak forms.

• However, the wide variety of different forms may reveal
processes that have not been identified and incorporated into
phonological and/or phonetic theories.

• Part of the purpose of this series of studies is to “discover”
such processes.
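
The first point above, that many variants come from flanking words, can be made concrete by checking each token against its neighbour. The sketch below is an illustration under stated assumptions and is not the procedure used in this study. It flags tokens of “and” whose final nasal shares its place of articulation with the first segment of the following word, which is what nasal place assimilation would predict. The place table is deliberately incomplete and the sample tokens are invented.

```python
# Place of articulation for a few consonants (deliberately incomplete).
PLACE = {
    "m": "labial", "p": "labial", "b": "labial",
    "n": "alveolar", "t": "alveolar", "d": "alveolar",
    "ŋ": "velar", "k": "velar", "g": "velar",
}


def assimilated_nasal(form, next_segment):
    """True if the form ends in a non-alveolar nasal whose place matches
    the first segment of the following word."""
    final = form[-1]
    if final not in ("m", "ŋ"):
        return False
    return PLACE.get(final) == PLACE.get(next_segment)


# Hypothetical tokens of "and": (phonetic form, first segment of next word).
tokens = [("ɛm", "b"), ("ɛŋ", "k"), ("ɛn", "t"), ("ɛm", "t")]

# Tokens whose final nasal is accounted for by the following segment.
explained = [t for t in tokens if assimilated_nasal(*t)]
```

Tokens that such a check does not account for, like the invented ɛm before t, are the ones that might point to processes not yet described.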

21
Conclusions
• I am not sure how far the findings from a
connected-speech corpus should be
incorporated into standard descriptions,
but it will be useful to look into this more
fully.
• It has been found that not only weak forms
but also strong forms show wide variation,
which is interesting in its own right, both
descriptively and theoretically.

22
References
• Makino, Takehiko (2009) “Revising the list of weak
forms in English using a spoken corpus.” (In
Japanese) Paper presented at the 320th Regular
Meeting of the Phonetic Society of Japan, Tokyo.
• Obendorfer, Rudolf (1998) Weak Forms in
Present-Day English. Oslo: Novus Press.
• Pitt, M.A., Dilley, L., Johnson, K., Kiesling, S.,
Raymond, W., Hume, E. and Fosler-Lussier, E.
(2007) Buckeye Corpus of Conversational Speech.
(2nd release) [www.buckeyecorpus.osu.edu]
Columbus, OH: Department of Psychology, Ohio
State University (Distributor).

23
Thank you!

24
