Voicing in Japanese
Edited by
Mouton de Gruyter
Berlin New York
ISBN-13: 978-3-11-018600-0
ISBN-10: 3-11-018600-4
ISSN 0167-4331
Copyright 2005 by Walter de Gruyter GmbH & Co. KG, D-10785 Berlin.
All rights reserved, including those of translation into foreign languages. No part of this
book may be reproduced in any form or by any means, electronic or mechanical, including
photocopy, recording, or any information storage and retrieval system, without permission
in writing from the publisher.
Cover design: Christopher Schneider, Berlin.
Printed in Germany.
Preface
Most of the work on this book was done while the first editor was a Research
Fellow at the Netherlands Institute for Advanced Study in the Humanities
and Social Sciences (NIAS) in Wassenaar in the period 2002–2003. We are
extremely grateful to NIAS for the tranquil yet productive environment in
which the ideas expressed in this book could be conceived and reflected
upon.
First versions of some of the papers in this volume were presented at
a workshop at the Linguistics and Phonetics 2002 (LP2002) conference,
held September 2–6, 2002, at Meikai University in Urayasu, Japan.
We are grateful to the organisers for giving us the opportunity to have this
workshop, and to the audience for helpful discussion and suggestions.
Jeroen van de Weijer
Kensuke Nanjo
and Tetsuo Nishihara
Contents
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Voicing in Japanese . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Jeroen van de Weijer, Kensuke Nanjo and Tetsuo Nishihara
Recognizing Japanese numeral-classifier combinations . . . . . . . . . . . . . . 191
Keiichiro Suzuki
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279
Index of authors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 307
Index of languages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 312
Index of subjects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313
Voicing in Japanese
Jeroen van de Weijer, Kensuke Nanjo
and Tetsuo Nishihara
This book presents a number of studies of the [voice] grammar of Japanese, paying particular attention to historical background, dialectal diversity, phonetic experiment, and phonological analysis. Both voicing processes in consonants (such as Sequential Voicing, henceforth Rendaku) and in vowels (such as vowel devoicing) are examined. The book offers a number of new analyses of well-known data that have been controversial in past phonological debate, but it also presents new (or rediscovered) data, drawn partly from work by Japanese scholars that has hitherto gone mostly unnoticed, partly from new database research, and partly from phonetic experiment. In this introduction, we briefly introduce the different contributions and point out their respective interests.
There are two parts to the book: (1) consonantal voicing, (2) vowel voicing. In
the consonant part, the contribution by Kubozono presents a point of departure by introducing many of the voicing phenomena in Japanese, and also
pointing out some of the relevant dialectal differences. Let us briefly review
the most important of these in very general terms. For details and refinements, we refer to the contributions that follow. Rendaku is a rule of
Japanese which voices the initial consonant of the second member of a
compound, if certain phonological and syntactic conditions are satisfied.
Consider the following examples (taken from various standard sources):
(1)  shima 'island' + kuni 'country'  →  shima-guni   'island country'
     maki 'roll' + sushi 'sushi'      →  maki-zushi   'rolled sushi'
     oo 'large' + tanuki 'badger'     →  oo-danuki    'large badger'
     doku 'poison' + tokage 'lizard'  →  doku-tokage  'poisonous lizard'
     oo 'large' + kaze 'wind'         →  oo-kaze      'big wind'
(Vance 1987)
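The pattern in (1) can be simulated with a toy function. This is an illustrative sketch only: the romanized consonant mapping (k→g, s→z, t→d, h→b) and the blocking test (voicing fails whenever the second member already contains a voiced obstruent) are simplifying assumptions, not a full statement of the phonological and syntactic conditions on Rendaku.

```python
# Toy simulation of Rendaku using the compounds in (1).
# Assumptions (not from the source): a simplified romanized voicing
# map and a crude blocking test that suspends voicing whenever the
# second member contains any voiced obstruent (b, d, g, z).

VOICING_MAP = {"k": "g", "s": "z", "t": "d", "h": "b"}
VOICED_OBSTRUENTS = set("bdgz")

def rendaku(first, second):
    """Voice the initial consonant of the second member of a compound,
    unless a voiced obstruent already occurs in that member."""
    if any(c in VOICED_OBSTRUENTS for c in second):
        return first + "-" + second              # voicing blocked
    voiced_initial = VOICING_MAP.get(second[0], second[0])
    return first + "-" + voiced_initial + second[1:]

print(rendaku("shima", "kuni"))   # shima-guni
print(rendaku("maki", "sushi"))   # maki-zushi
print(rendaku("oo", "tanuki"))    # oo-danuki
print(rendaku("doku", "tokage"))  # doku-tokage (blocked)
print(rendaku("oo", "kaze"))      # oo-kaze (blocked)
```

The last two calls return unvoiced forms, reproducing the failure of voicing in the final two compounds of (1).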
The first non-Japanese researcher who wrote about this blocking effect was
Lyman (1894), which is why the condition is commonly referred to as
Lyman's Law. In other literature, the condition is referred to as Motoori
Norinaga's Law. A point of controversy in the literature is whether this
Law has exceptions or not. In recent work, Haraguchi (2003) points out
that the exceptions can all be analysed by making reference to independently
motivated principles of grammar, such as morphological constituency.
A number of issues are distinguished with respect to Rendaku: is it an exceptionless rule? Tamamura (1989) points out that Rendaku actually occurs in only 60% of the noun-noun compounds in which it could occur. If
Rendaku is not a regular rule, how should the exceptions be accounted for?
How should the phonological and syntactic conditions be formalized? To
what part(s) of the lexicon does the rule apply? How long has it been part
of the grammar of Japanese? Do loanwords undergo it? What does the rule
tell us about the specification of voicing on obstruents and on sonorants in
the phonology of Japanese? All these issues are dealt with at length in the
contributions that follow, but let us pick out three major issues here. First, it
appears to be the case that some lexical items that would at first glance be
expected to undergo Rendaku do not undergo it, or undergo the rule in
some compounds but not in others. This variable or unpredictable behaviour can be approached from a number of viewpoints. Has the rule been
completely lexicalized? Are there still subregularities? Does analogy play a
role? How are new formations treated? These questions play a major role in
the contributions by Kubozono and Ohno, while certain syntactic conditions that were assumed up to now are scrutinized and dismissed in the contribution by Vance. The fact that a rule is variable presents obvious problems for speech recognition software. This problem is dealt with in the
paper by Suzuki.
A second issue concerns the stratification of the Japanese lexicon. As is
well known, the Japanese language incorporates a number of layers, or
6 Haruo Kubozono
looked at children with a language disorder called specific language impairment (henceforth SLI). People with SLI are linguistically normal
in every respect except that they cannot apply productive grammatical rules
to morphological/syntactic strings. For example, native English speakers
with SLI are unable to produce plural forms of countable nouns (2a) or to
add the ending /s/ to a verb to mark the third person singular (2b).
(2)
Fukuda and Fukuda (1999) examined how native Japanese speakers with the
same language impairment produce Japanese utterances. Specifically, they
looked at the way their eight- to twelve-year-old subjects produced voicing
in compound nouns. If the subjects should fail to produce voicing in words
like /hiragana/, then it would mean that voicing is a productive rule in modern Japanese, hence supporting the first hypothesis mentioned above. If, on
the other hand, the subjects should produce voicing in words like /hiragana/
just as normal native speakers do, then it would suggest that the phonological form with voicing is a lexical form of the word, namely, that voicing has
been lexicalized and is not produced by rule in the synchronic grammar.
What Fukuda and Fukuda (1999) found is something that falls between
the two predictions. On the one hand, their subjects showed voicing in some
basic compound nouns like nagagutu 'long + shoes; boots', suggesting that
voicing in these words is part of their underlying representation. On the
other hand, they also showed lack of voicing in non-frequent and novel
compounds such as those in (3a), which were pronounced with voicing by
normal native speakers of the same age group as shown in (3b).
(3)
This latter result reveals a contrast between normal speakers and speakers
with SLI, with the first but not the latter group of speakers being able to
produce voicing in non-frequent and novel compounds. This suggests that
voicing in non-frequent and novel compounds should be attributed to a
productive rule and, hence, that there exists a productive process of voicing
in normal speakers' grammars.
Fukuda and Fukuda's experimental data are interesting in that they reveal
that some instances of rendaku voicing are lexicalized while others are due
to a productive rule. Native speakers of Japanese deal with the first type of
voicing by memorizing the form with voicing as a lexical entry. In contrast,
they deal with the second type of voicing by acquiring a voicing rule, or
rendaku rule, and applying it to unfamiliar and novel compounds. What
remains unclear is the boundary between the two kinds of voicing, more
specifically, between frequent and non-frequent compounds. This will
be an intriguing empirical question for future research.
2.
(5) a. aka + huda → aka-huda, *aka-buda  'red tag'
       cf. uwa + huta → uwa-buta  'top lid'
           roten + huro → roten-buro  'outdoor, bath; outdoor bath'
    b. ai + kagi → ai-kagi, *ai-gagi  'duplicate key'
       cf. ama + kaki → ama-gaki  'sweet, persimmon; sweet persimmon'
           umi + kame → umi-game  'sea, turtle; sea turtle'
rendaku, the feature in question is [+voice, +obstruent], with the relevant
domain of OCP being the morpheme or the second member of a compound.
While this is a well-known fact in Japanese phonology, there are certain
cases where Lyman's Law requires a larger domain. This can be seen rather
clearly in the data provided by Sugito (1965), which we will consider in
detail in the next section.
2.2. Sugito's data and Lyman's Law
Sugito (1965) looked at the alternation between /ta/ and /da/ shown by the
morpheme ta 'rice field' as it is combined with a bimoraic morpheme to
form a personal name: e.g. /siba-ta/ vs. /ima-da/. This particular morpheme
exhibits a rather clear pattern of alternation, which is more or less predictable from the consonant in the immediately preceding mora.1 The results of
Sugito's analysis can be summed up as follows.
(6)
We can develop Sugito's analysis one step further and reinterpret the data
in terms of natural classes.2 This reanalysis leads to the generalization in
(8). The contrast between (8a) and (8b) is illustrated in (9).
(8)
(9)
(10) [Table: counts of three-mora personal names in -/ta/ vs. -/da/ by the type of the immediately preceding mora (/rV/-, /wV/-, /yV/-, among others); the column layout was lost in extraction.]
2.3. Summary
In the preceding section we have seen Sugito's data concerning the /ta/-/da/
alternation in personal names consisting of three moras. It should be clear
now that in this particular type of compound noun, Lyman's Law exerts
its effect in a wider domain than is usually assumed. The same effect is
found in many pairs of personal names, including /naga-sima/ ~ /naka-zima/, /naga-sawa/ ~ /naka-zawa/ and /naga-saki/ ~ /naka-zaki/, which fluctuate between these two forms. This being said, it is also important to point
out that not all morphemes or personal names exhibit the same extended
effect of Lyman's Law. Restricting ourselves to personal names, we find
that some morphemes invariably undergo rendaku even when they are preceded by a voiced obstruent. sono 'garden' and huti 'the depth, an abyss',
for example, get voiced regardless of what morpheme they are combined
with. Indeed, these morphemes invariably undergo rendaku as long as they
are in a non-initial position of a compound.
(13) a. hoka-zono, mae-zono, kubo-zono, azi-zono, naka-zono, naga-zono, sugi-zono, enoki-zono
b. naka-buti, naga-buti, sugi-buti
On the other hand, some morphemes tend to resist rendaku voicing in any
context. hara 'field' and saka 'slope' may be such morphemes; they are
realized as /hara/ and /saka/, respectively, in most cases:
(14) a. oo-hara, o-hara, naga-hara, naka-hara, saka-hara; cf. kanbara
b. e-saka, oo-saka, naka-saka, no-saka, ta-saka
Most morphemes, including ta discussed in the preceding subsection, fall
between these two extremes. A closer examination of compound nouns
may reveal a more general nature of the extended effect of Lyman's Law
sketched in (8)–(9), as well as the degree to which rendaku voicing is morpheme-dependent.
3. Branching constraint
A second major condition on rendaku voicing in Japanese is the so-called
branching constraint (Otsu 1980; Kubozono 1988). This constraint can be
defined as follows.
(15) Rendaku is blocked in the second member of a right-branching compound.
Otsu (1980) gives the following pairs to illustrate the effect of this constraint.
(16) a. Right-branching compounds
nuri + [hasi + ire] → nuri-hasi-ire
'lacquered, chopstick, case; chopstick case which is lacquered'
nise + [tanuki + siru] → nise-tanuki-ziru
'pseudo, raccoon dog, soup; raccoon dog soup that is not authentic'
b. Left-branching compounds
[nuri + hasi] + ire → nuri-basi-ire
'lacquered, chopstick, case; case for lacquered chopsticks'
[nise + tanuki] + siru → nise-danuki-ziru
'pseudo, raccoon dog, soup; soup made from a pseudo raccoon dog'
In (16a), the second member forms a constituent with the third rather than
the first member. Corresponding to this morphosyntactic structure, rendaku
voicing is blocked between the first and second members although it is not
blocked between the second and third. In contrast, rendaku is not blocked
in (16b), where the second as well as the third member can undergo the
process. Sato (1989) adds the following pair to illustrate the same effect:
(17) a. mon + [siro + tyoo] → mon-siro-tyoo, *mon-ziro-tyoo
'armorial bearing, white, butterfly; white cabbage butterfly'
b. [o + siro] + wasi → o-ziro-wasi  'tail, white, eagle; white-tailed eagle'
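The branching constraint in (15) can be sketched as a toy recursion over binary-branching compound trees. This is a minimal illustration under simplifying assumptions: Lyman's Law and all other conditions are ignored, every structurally eligible member is assumed to undergo rendaku, and compounds are represented as nested pairs.

```python
# Toy sketch of the branching constraint (15): rendaku voices the
# initial consonant of a right daughter, except when that daughter is
# itself a complex right member of a right-branching compound [A [B C]],
# in which case voicing of B is blocked.
# Simplifying assumptions: no Lyman's Law; rendaku always applies
# where it is not structurally blocked.

VOICING_MAP = {"h": "b", "t": "d", "s": "z", "k": "g"}

def voice_initial(word):
    return VOICING_MAP.get(word[0], word[0]) + word[1:]

def compound(tree):
    """tree is a morpheme (str) or a pair (left, right)."""
    if isinstance(tree, str):
        return tree
    left, right = tree
    if isinstance(right, tuple):
        # right-branching: the second member is not voiced
        return compound(left) + "-" + compound(right)
    return compound(left) + "-" + voice_initial(right)

print(compound(("nuri", ("hasi", "ire"))))     # nuri-hasi-ire
print(compound((("nuri", "hasi"), "ire")))     # nuri-basi-ire
print(compound(("nise", ("tanuki", "siru"))))  # nise-tanuki-ziru
print(compound((("nise", "tanuki"), "siru")))  # nise-danuki-ziru
```

The four calls reproduce the contrast between the right-branching and left-branching compounds of (16).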
The status of the branching constraint may be questioned, however, despite
the examples in (16) and (17). For one, it is difficult to find clear cases
showing its effect. The compound nouns in (16) are novel compounds for
many native speakers of Japanese, who do not necessarily have clear-cut
intuitions about the presence or absence of voicing in the pairs of expressions. The compounds in (17) are existing expressions, but it is difficult to
find more examples showing a similar effect. Moreover, the branching constraint may be questioned by the existence of expressions that apparently
defy its effect. Some of these counterexamples are given in (18).
(20) Branching constraint
a. Phonological unification is blocked in the right-branching structure.
b. Phonological unification is blocked between two constituents, A
and B, if B does not c-command A.
An equally interesting fact about the branching constraint thus redefined is
that it also applies to phonological processes in languages other than Japanese. In English, for example, compound nouns exhibit an asymmetry between left-branching and right-branching constructions, with the latter but
not the former failing to conform to the general strong-weak pattern of
compound stress of the language (Chomsky and Halle 1968; Liberman and
Prince 1977). Essentially the same asymmetry is observed in Chinese. In
this tone language, right-branching phrases fail to undergo the well-known
tone sandhi rule whereby a sequence of two tones 3 (falling-rising tone) is
converted into a sequence of tone 2 (rising tone) and tone 3 (falling-rising
tone) (Hirose et al. 1994). Thus, a string of 3-3-3 tones turns into 2-2-3 via
2-3-3 if it forms a left-branching structure, but the same string tends to
yield 3-2-3 in a right-branching structure. A similar effect is observed in the
tone sandhi rule in Ewe, a tone language of Africa (Clements 1978). Moreover, it is also reported that consonant lengthening in Italian is blocked in
right-branching constructions (Napoli and Nespor 1976). It is an open empirical question whether this structural constraint is observed in a wider range of
languages, but it evidently represents a rather general constraint on
phonological processes with cross-linguistic significance.
One last question that remains unanswered is why phonological processes
in Japanese and other languages are subject to the structural constraint formulated in (20) or, equivalently, why right-branching structure exhibits
such a marked phonological pattern. Kubozono (1995) proposed two hypotheses. One is that the right-branching structure displays irregular phonological behavior in languages where the right-branching structure is syntactically/morphologically marked. This interpretation is consistent with the
fact that the left-branching structure is unmarked, at least statistically, in
Japanese compounds and phrases as well as in English compounds. If this
interpretation is correct, it is expected that left-branching rather than right-branching structures will show marked, exceptional behavior in right-branching languages. A second hypothesis put forward by Kubozono
(1995) is that the right-branching constraint in (20) is universal and applies
to compound nouns irrespective of whether the left-branching or right-branching structure is syntactically/morphologically unmarked in a particular
language. These two hypotheses must be compared and evaluated by examining phonological markedness in a wider range of languages. This is certainly another interesting topic for future work that will require detailed
cross-linguistic comparisons.
4.
Mora constraint
16 Haruo Kubozono
do not pattern with bimoraic and bisyllabic morphemes like /aka/ 'red' and
/ero/ 'erotic'. Another noteworthy point is that the morphological complexity
of N1 does not matter. The monomoraic and bimoraic N1s in (22a) are all
monomorphemic while N1 in (22b) consists of more than one morpheme in
most cases. This reflects the fact that hon, a Sino-Japanese (SJ) morpheme,
tends to be combined with another SJ morpheme (or morphemes) and that
each SJ morpheme is up to two moras long. However, the morphological
structure of the N1 does not directly concern the boundary between /hon/
and /bon/. This is shown by monomorphemic N1s such as /pinku/ 'pink'
and /karaa/ 'color', which clearly pattern with bimorphemic words like
/bunko/ 'bibliotheca, papeterie' and /manga/ 'cartoon' and not with monomorphemic words like /aka/ 'red' and /ero/ 'erotic'.
Having justified the generalization in (21), it is necessary to point out
that this rule applies specifically to compound nouns with hon, and not to
other compounds. Indeed, many morphemes other than hon do not conform
to the pattern in (21). We saw in section 2 above that the morpheme ta 'rice
field' can undergo voicing even when it is combined with a bimoraic noun.
Moreover, some morphemes like ha 'tooth' and kame 'turtle' undergo
rendaku even when they are combined with bimoraic nouns, as in (23a),
while others are not subject to voicing whether they are combined with
bimoraic or trimoraic nouns, as shown in (23b).
(23) a. musi + ha → musi-ba  'a decayed tooth'
umi + kame → umi-game  'sea, turtle; sea turtle'
mayu + ke → mayu-ge  'eyebrow, hair; eyebrow'
b. migi + te → migi-te
hidari + te → hidari-te
kasegi + te → kasegi-te
SJ compounds in (27b), in contrast, readily undergo the contraction since
they do not involve a word boundary between the second and third elements.
(27) a. [dai + but] + si → dai.bu.t<u>.si, *dai.bus.si
'great, Buddha, teacher; a sculptor of big Buddhist images'
[sin + gak] + ka → sin.ga.k<u>.ka, *sin.gak.ka
'god, learning, department; department of religion'
b. dai + [but + si] → dai.bus.si, *dai.bu.t<u>.si
'great, Buddha, teacher; a great sculptor of Buddhist images'
sin + [gak + ka] → sin.gak.ka, *sin.ga.k<u>.ka
'new, learning, department; a new department'
Itô and Mester (1996) interpret the constituency effect illustrated in (27) as
a constraint on the domain of the contraction process: Contraction occurs
within a PrWd, which consists of one or two morphemes. Since every SJ
morpheme is at most two moras long, this domain constraint can be reinterpreted as in (28).
(28) Contraction occurs in the domain of up to four moras.
Contraction has taken place in (26) and (27b) since the (C)VC morphemes
in question are embedded in a word of up to four moras. In (27a), by contrast,
(C)VC morphemes are combined with the following CV(C) morphemes in
a larger word. In terms of phonological length, this fact can be reduced to a
constraint requiring that the maximal domain of contraction be a constituent
consisting of four moras. In other words, two morphemes can be combined
without undergoing vowel epenthesis if they form a four-mora or shorter
word. This is precisely the same domain constraint that we saw for hon
above, which does not undergo the compound rule of rendaku voicing if it
is embedded in a four-mora or shorter word.
(32) yura + yura → yura-yura  '(to sway) gently'
suru + suru → suru-suru  '(to climb) smoothly'
bata + bata → bata-bata  '(to fall) noisily, one after another'
(33) mura + mura → mura-mura  'village, village; villages'
kazu + kazu → kazu-kazu  'number, number; in a great number, numerous'
(34) yoru + hiru → yoru-hiru
hiru + yoru → hiru-yoru
asa + ban → asa-ban
Accent deletion of the second member does not seem to occur, however, if
the bimoraic base is reduplicated after being combined with a mimetic ending such as /ri/ and the moraic nasal /N/.9
(35) yurari + yurari → yurari yurari  '(to sway) in slow motion'
sururi + sururi → sururi sururi  '(to dodge) swiftly'
bataN + bataN → bataN bataN  'thumpety thump'
The contrast between (32) and (35) indicates that four-mora mimetics constitute a prosodic word (PrWd), or one accentual unit, whereas six-mora
mimetics form two PrWds. This provides further support for the claim that
the maximal length of a PrWd is four moras.
In (36b), the string of three numbers, 721, is realized in two PrWds, with the
first two numbers forming a four-mora unit, and the last number constituting
a separate PrWd. This clearly demonstrates that the optimal length of PrWds
is maximally four moras.
Interestingly, the same maximality constraint operates in other dialects,
too. (37) shows how the string in (36b) is pronounced in Kinki (Kyoto/
Osaka) dialects (Fukui 1990). In fact, Tokyo and Kinki dialects differ only
in the tonal pattern of four-mora PrWds: four-mora strings are pronounced
with the tonal pattern of LHHL in Tokyo, and with the pattern of LLHL in
Kinki.10 Two-mora PrWds are pronounced with the original (or lexical)
accentual pattern of the relevant morpheme in both dialects.
(37) 721-2875  {nananii} {iti} {niihati} {nanagoo}
               LLHL      HL    LLHL      LLHL
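The chunking in (36) and (37) can be illustrated with a toy grouping function. The digit readings below are the bimoraic forms cited in (37); treating each reading as exactly two moras and grouping the names pairwise is a simplifying assumption that covers only these examples.

```python
# Toy sketch of four-mora prosodic-word formation in number
# enumeration: each digit is read with a bimoraic name, and names are
# grouped two at a time (= four moras) into PrWds; a leftover digit
# forms its own PrWd. Only the digits occurring in (37) are listed.

DIGIT_NAMES = {"7": "nana", "2": "nii", "1": "iti", "8": "hati", "5": "goo"}

def chunk_prwds(digits):
    names = [DIGIT_NAMES[d] for d in digits]
    # one PrWd per pair of bimoraic digit names
    return ["{" + "".join(names[i:i + 2]) + "}" for i in range(0, len(names), 2)]

print("".join(chunk_prwds("721")))   # {nananii}{iti}
print("".join(chunk_prwds("2875")))  # {niihati}{nanagoo}
```

The two calls reproduce the PrWd groupings of the telephone number in (37); the tonal patterns are not modeled.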
The facts in (36) and (37) clearly show that the maximal size of a PrWd is
four moras in number enumerations. Compound nouns can form a longer
PrWd, as exemplified in (19), but this is due to a morphological requirement demanding correspondence between morphological and prosodic
words (or edges). The facts discussed in this and the preceding sections
reveal an 'emergence of the unmarked': an optimal phonological shape of
PrWds in Japanese.
monsutaa/ → /pokemon/ 'pocket monster', or Pokémon. This process admits three-mora outputs in some contexts, but never permits five-mora or
longer outputs.
The same maximality constraint applies to other morphological processes
such as the formation of zuzya-go (a jazz musicians' secret language).
Zuzya-go formation involves metathesis by which the final two moras in
the input are combined with the initial two moras to yield four-mora outputs. Here again, three-mora outputs are allowed in some contexts, but five-mora or longer outputs are absolutely illicit (Itô et al. 1996; Kubozono
2002b). This input-output correspondence, too, reveals a tremendous difference between structures with four and structures with five moras. All in
all, the fact that five-mora or longer outputs are never tolerated in these
morphological processes supports the idea that the optimal word form in
Japanese is up to four moras long.
There are, of course, quite a few words, mostly loanwords, that are morphologically simplex but phonologically longer than four moras: e.g.
/irasutoreesyon/ 'illustration', /animeesyon/ 'animation'. But there is
some accentual evidence suggesting that five-mora or longer loanwords are
processed as phonological compounds, i.e. that five-mora or longer words
are split into two four-mora or shorter substrings to which accent is assigned by the compound accent rule (Sato 2002; Kubozono 2002a). This,
too, lends support to the idea that PrWds in Japanese are optimally up to
four moras long.
5. Concluding remarks
In this paper I have first considered Fukuda and Fukuda's (1999) neurolinguistic data suggesting that rendaku voicing falls into two kinds: voicing in
some words is lexicalized, while voicing in other words is due to a productive synchronic process of voicing. In the rest of the paper, I have discussed
three constraints on rendaku voicing: an extended version of Lyman's Law,
the branching constraint and the mora constraint. These three constraints define the
domain in which the productive process of rendaku voicing occurs in contemporary Japanese. The extended version of Lyman's Law and the mora
constraint apply only to a specific type of compound noun, while the
branching constraint applies in a wider context. Despite this difference, all
these constraints represent quite general conditions on phonological and morphological processes in Japanese. In this sense, the constraints on rendaku
voicing should be interpreted in a wider context. These constraints, if examined in more detail, might uncover more interesting aspects and principles of (Japanese) phonology.
Notes
1. What counts here is the consonant in the preceding mora, not in the preceding
syllable. This is clearly shown by den and goo, which yield /den-da/ and /goo-da/, not /den-ta/ and /goo-ta/, respectively, even though they contain a voiced
obstruent (/d/ or /g/).
2. Sugito was mainly concerned with the relationship between the /ta/-/da/ distribution and the accentual pattern of the whole personal name. She found out
that three-mora names ending in /ta/ are usually accented on their initial mora
in Tokyo Japanese (e.g. siba-ta, kubo-ta), while those containing /da/, e.g.
/ima-da/ and /sima-da/, tend to be unaccented. This is an interesting fact that
needs to be explained.
3. We can add the word /kuro-da/ to Sugito's list of exceptions.
4. Node A c(onstituent)-commands node B if neither A nor B dominates the other
and the first branching node which dominates A dominates B (Reinhart 1976:
32). In the right-branching structure [[A][[B][C]]], [A] c-commands [B], but
[B] does not c-command [A] because [B] forms a constituent with [C] rather
than with [A]. In the left-branching structure [[[A][B]][C]], on the other hand,
both [B] and [C] c-command [A].
5. The morpheme hon 'book' should be clearly distinguished from the numeral
classifier hon which is used to count the number of objects such as fingers and
pencils (e.g. /go-hon no yubi/ 'five-hon Gen finger' = 'five fingers'). This numeral morpheme alternates between three phonemic forms, /hon/, /bon/ and
/pon/, depending on the phonetic property of the immediately preceding sound
(Tanaka and Kubozono 1999).
6. An apparent exception to the generalization illustrated in (21) is the word /bini-bon/ 'vinyl book', a book enclosed in vinyl. This particular instance does not
count as an exception since /bini-bon/ does not come directly from /bini-hon/,
but from /biniiru-bon/ via shortening: namely, /biniiru + hon/ → /biniiru-bon/
→ /bini-bon/.
7. There are some compounds which contain the morpheme hon but have lost its
original meaning 'book', e.g. /mi-hon/ 'a sample for sale', /syoo-hon/ 'an extract', /hyoo-hon/ 'a sample'. Interestingly, these lexicalized compounds conform to the generalization in (21).
8. Contraction is generally blocked if the first morpheme ends in /k/. In this case,
vowel epenthesis occurs instead of contraction, except when the second morpheme also begins with /k/. Thus, /hak+ti/ and /hak+sai/ undergo vowel epenthesis and turn into /hakuti/ 'imbecility' and /hakusai/ 'Chinese cabbage', respectively, while /hak+kyuu/ turns into /hakkyuu/ 'white ball'. The fact that
the morpheme-final /k/ blocks contraction reveals an interesting asymmetry
between /t+k/ and /k+t/, which is called 'coronal asymmetry' by Itô and
Mester (1996: 30). Thus, the former but not the latter triggers contraction: e.g.
/bet+kak/ → /bekkaku/ 'different style' vs. /hak+ti/ → /hakuti/ 'imbecility'. A
similar asymmetry is observed in the morphophonology of native verbs, where
a stem-final /k/ triggers vowel epenthesis rather than contraction when it is followed by a /t/-initial ending like the past marker /ta/. Thus, /kak + ta/ 'to write
(past)' turns into /kakita/ (and subsequently /kaita/), whereas /yor + ta/ 'to approach (past)' and /hasir + ta/ 'to run (past)' turn into /yotta/ and /hasitta/, respectively.
9. We occasionally observe reduplicative mimetic forms that are five moras long.
These five-mora forms seem to be split into two PrWds: e.g. {yura} {yurari}
'(to sway) gently'.
10. This tonal pattern is different from the typical pattern of nouns. In Tokyo
Japanese, nouns are usually accented, if accented at all, on the third mora from
the end of the word: the word Nagasaki, for example, is accented on /ga/ as
in /nagasaki/. However, the tonal pattern characteristic of numeral sequences
is also found in four-mora acronyms consisting of two alphabetic letters. Thus, the
words for 'PC' (personal computer) and 'OL' (office lady) are pronounced
with an accent on the penultimate mora: /piisii/, /ooeru/. Alphabetic acronyms are different from numeral sequences, though, in that three-letter and
longer acronyms form one unified PrWd that is longer than four moras: e.g.
PTA /piitiiee/, IBM /aibiiemu/, YMCA /waiemusiiee/.
26
Keren Rice
solution that I offer runs counter to the claim that the lexicon of Japanese is
synchronically stratified, with a constraint against post-nasal voiceless obstruents holding of the Yamato, or native, stratum of Japanese and not in
the Sino-Japanese part of the lexicon.1 In the second part of the article, I
review the problems with stratification between the Yamato and Sino-Japanese vocabularies based on the proposed constraint that the Yamato
stratum of the lexicon requires that post-nasal obstruents be voiced while
the Sino-Japanese stratum does not have this restriction, extending the discussion in Rice (1997).
1. Some background
Itô and Mester (1986) pose a conundrum in Japanese. First consider the
much-studied process of rendaku, or sequential voicing. Several examples
are given in (1).2 I use the Romanization used in the source.
(1) Rendaku
a. voicing of initial consonant when no consonants follow
take + sao → take-zao  'bamboo pole' (bamboo, pole)  (Vance 1987: 133)
kan + sya → kan-zya  'patient' (illness, person)  (Labrune 1999: 123)
b. voicing of initial consonant when voiceless obstruent follows
de + kuči → de-guči  'exit' (leave, mouth)  (Itô and Mester 1986: 52)
c. voicing of initial consonant when sonorant follows
iši + tooroo → iši-dooroo  'stone lantern' (stone, lantern)  (Vance 1987: 133)
ike + hana → ike-bana  'ikebana' (arrange, flower)  (Itô and Mester 1986: 53)
hon + tana → hon-dana  'book shelf' (book, shelf)  (Labrune 1999: 123)
d. no voicing of initial consonant when voiced obstruent follows
doku + tokage → doku-tokage  'poisonous lizard, Gila monster' (poison, lizard)  (Vance 1987: 137)
hyootan + kago → hyootan-kago  'gourd basket' (gourd, basket)  (Itô and Mester 1986: 69)
As the examples in (1) illustrate, rendaku fails to apply just in the case that
a voiced obstruent follows within the morpheme that is the target of voicing.
This blocking of rendaku by a voiced obstruent is conditioned by a process
known as Lyman's Law; see, for instance, Martin (1952), McCawley (1966),
Vance (1987), Itô and Mester (1986, 1995a, 2003), and Fukazawa and
Kitahara (2001) for discussion. Lyman's Law disallows more than one
voiced obstruent within a morpheme. This constraint holds whether both
obstruents are lexically contained within a single morpheme (see (2)) or
one would be derived through rendaku (as in (1d)).
(2) buta 'pig'    *buda    (Itô and Mester 1995a: 819)
Itô and Mester (1986) treat Lyman's Law as an OCP effect against two voiced segments within a particular domain. If the feature [voice] is contrastive for obstruents and redundant for sonorants, the exclusion of sonorants as blockers is easily explained.
The term sequential voicing is also used to describe voicing that is found in a post-nasal environment; see, for instance, Martin (1952), Itô and Mester (1986), and Vance (1987, 1996). I refer to this as post-nasal voicing. An example is given in (3), showing alternations in the form of the past tense morpheme.
(3)
Itô and Mester (1995a, 2003) and Itô, Mester and Padgett (1995) also propose that post-nasal voicing is at play within morphemes. They argue that within the Yamato, or native, vocabulary of Japanese, voicing is redundant on post-nasal obstruents, as in (4).
(4)
The facts presented in this section leave us with the conundrum mentioned at the start: the voicing illustrated in (1) is blocked by Lyman's Law and requires that voicing be absent from nasals (1c), but the process shown in (3) and (4) is triggered by a nasal.
Keren Rice
2. Rendaku
It will be useful to begin by defining some terms. I use the term sequential voicing to mean the overall effects discussed in section 1, namely the substitution of a voiced obstruent for a voiceless one, without regard for details. I use the term rendaku or rendaku element in a very narrow sense, to refer to a voicing feature that functions as a compound formative (see Itô and Mester 1986, for instance); it is this feature that causes voicing in compounds. Finally, I use the term post-nasal voicing to refer to the voicing caused by a nasal in the absence of the rendaku element, as in (4).
With respect to rendaku, then, I assume an analysis like that proposed by Itô and Mester (1986): the voicing feature, i.e., the rendaku element, is part of a segment that occurs in some compounds (to be elaborated below) and provides voicing to the initial obstruent of the second member of the compound. Following the standard analysis, the association of the rendaku element is blocked by the occurrence of a voiced obstruent later in the morpheme. I refer to the feature involved as LV; representationally, I assume that it is dominated by a root node; see also Labrune (1999). Other than calling the feature LV rather than [voice], this analysis follows Itô and Mester (1986); see Labrune (1999) and Clements (2001) for alternative analyses.
3. Post-nasal voicing
does not usually apply in compounds where both members are verbs, and semantic and phonological conditions hold such that it sometimes applies in direct object-verb compounds and sometimes does not; see also Labrune (1999), among others. Vance further recognizes a source for rendaku voicing, proposing that it comes from a reduced form of the genitive particle no (Vance 1982, 1987: 136; also Labrune 1999) that occurred between nouns; thus verb compounds would be excluded from being affected by sequential voicing due to the absence of this particle. Based on Vance's discussion, and on the examples used to illustrate rendaku, it appears that the rendaku element is not found in all compound types. Without going into detail, the rendaku element is not found in most compounds involving verbs as a second member. Assuming then that the rendaku morpheme is present in some noun compounds but that compounds with a verb as the second member (I will call these verb compounds) do not contain this element, it might be possible to sort out whether the rendaku element and post-nasal voicing have the same feature through an examination of verb compounds.
First consider cases where the rendaku element is present and the second element of the compound contains a voiced obstruent. In the rendaku environment, the rendaku LV is not licensed, as it would lead to a violation of Lyman's Law, as discussed above. The example in (5), repeated from (1d), illustrates this.
(5) hyootan + kago 'gourd basket'
In this case, the initial obstruent of the second member of the noun compound (k) fails to voice due to Lyman's Law; the final nasal of the first member of the compound cannot trigger voicing of the /k/ because it is not adjacent, as illustrated in (6).
(6)  n        rendaku     k        g
     Root     Root        Root     Root
     |        |                    |
     SV       LV                   LV
When the rendaku element is present, then, it takes precedence over a nasal in terms of being a trigger for voicing on the initial consonant of the second element of the compound. In (6), Lyman's Law blocks the association of the rendaku LV to /k/, producing the unvoiced form. Nasal-final and vowel-final first elements of a compound pattern together, as it is the rendaku LV and not the nasal SV that has the opportunity to be implemented here.
Consider now forms with verbs as the second member. The prefixed forms in (7) are verbs which contain an initial element with a final nasal and a second element with an initial obstruent. The first morpheme is identified by Itô and Mester (1999a: 68) and Vance (this volume) as fum- 'to step on' (and hence these forms are treated as compounds) and by Vance (1987: 137) as fuN-, an unproductive emphatic prefix.
(7)
In these forms, the post-nasal obstruent voices. This can be accounted for by either the single voicing mechanism or the dual voicing mechanism hypothesis: under the single mechanism hypothesis, voicing on the initial consonant of the verb would be triggered not only by the rendaku element but also by a nasal (the same is true of forms with the verb suffix in (3)); under the dual mechanism hypothesis, rendaku in (1) is triggered by the LV of the rendaku element and post-nasal voicing in (3) and (7) by the SV of the final nasal of the first morpheme. In the environment where the rendaku element is missing, nasal-final and vowel-final first elements do not pattern together: the final nasal of the first element causes post-nasal voicing, but if these same verb stems follow a vowel-final morpheme, no voicing occurs; see (11) below.
Some additional examples of verb compounds are given in (8) and (9). These items have in common that their second element is suru, a verb meaning 'do' (Martin 1952: 49). When suru follows an element ending in a nasal, its initial is generally voiced (8); when it follows an element ending in a vowel, its initial is generally voiceless (9); there are lexical exceptions to both statements.4
(8) c. kin-zuru 'forbid' (Vance 1986: 139, formal register) [kin-jiru (normal register)]
       kin 'prohibition'
    d. uton-zuru 'is indifferent, neglects' (Martin 1952: 51)
       utom 'distant, estranged'
    e. sakin-zuru 'advances' (Martin 1952: 51)
       saki 'ahead' + n- intensive

(9)
Martin (1952: 49) comments on compounds with suru, remarking that the form usually begins with the voiceless obstruent; it is voiced following some long vowels. Vance (1987: 140) identifies these long vowels as coming from vowel-nasal sequences. Martin (1952: 50) further notes that "the majority of S[ino] morphs ending in n which occur in this sort of compound are attached to the alternant -zu.ru rather than -su.ru." Vance (1987: 140) echoes this observation, pointing out that there are counterexamples to the tendency to voice following a nasal, "but it seems to reflect a very old pattern." The overall tendency with suru, then, is that its initial consonant voices after a nasal, but not after a vowel. This can be accounted for if these compounds do not involve the rendaku element, but simply show post-nasal voicing. Either hypothesis could account for these forms.
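The tendency with suru can be restated as a simple allomorphy rule. The sketch below is my illustration under toy assumptions: romanized stems, nasal-finality as the only conditioning factor, and the lexical exceptions noted by Martin and Vance ignored; the vowel-final stem ai is my added example, not data from this article.

```python
# Toy allomorph selection for suru compounds:
# -zuru after a nasal-final stem, -suru otherwise.
# Lexical exceptions (noted in the text) are deliberately ignored.
def suru_compound(stem: str) -> str:
    verb = "zuru" if stem.endswith("n") else "suru"
    return stem + "-" + verb

print(suru_compound("kin"))  # kin-zuru (formal register)
print(suru_compound("ai"))   # ai-suru: vowel-final stems keep -suru
```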
Other compounds show similar patterning. The examples in (10) should be compared with those in (7). (10a, b) illustrate initial voicing of the verb kiru 'cut' as the second element of a compound after a nasal (10a), but not after a vowel (10b); (10c) shows the verb tsukeru 'attach, add' after a vowel. Some of these forms contain the morpheme fum shown in (7), here in its continuative form fum-i (Vance, this volume). In (10a), in the second example the second part of the compound is a deverbal noun (Vance 1987: 145); this is also the case in the third example in (10c).
In this case, post-nasal voicing occurs despite the presence of the voiced
obstruent later in the word, unlike in (1), where rendaku is blocked in similar
circumstances. (Note that (11) is reported to be the only example of its type
found in Japanese.)
At this point, I have argued the following. First, the rendaku morpheme, a compound formative, occurs in compounds such as those in (1), but is not found in the verb compounds in examples like (7) through (11), nor in affixation environments as in (3). When the rendaku element appears, it surfaces so long as its realization does not lead to a violation of Lyman's Law (1d). Because this morpheme does not occur in all morphological concatenations, we can look to verb compounds and affixation structures to see what the effect of the preceding segment is on the initial consonant of the second morpheme in the absence of the rendaku element. In this environment (7, 8, 10), post-nasal voicing is found, and this voicing is not blocked by Lyman's Law (11a). These facts are difficult to account for under the single voicing mechanism hypothesis: why would realization of the voicing from the rendaku element be blocked by the presence of a voiced obstruent in the second morpheme (1d), but voicing from a nasal be allowed (11a)? The dual mechanism hypothesis renders such forms explicable: the voicing triggered by the nasal and the voicing of the morpheme-internal obstruent have different sources, and there is no violation of Lyman's Law since there is but a single laryngeally voiced obstruent present.
To summarize, I have proposed that in studying sequential voicing in Japanese, one must sort out two things that are often conflated. First, what I call rendaku is restricted to the voicing triggered by a compound formative. This compound formative has the feature LV, and the surface implementation of LV is blocked by the presence of a voiced (LV) obstruent later in the target morpheme. This compound formative is found in noun-noun compounds (with exceptions, both lexical and principled), but does not generally occur in compounds headed by a verb, nor in affixation structures. In the non-rendaku environment, nasals generally trigger voicing of the initial consonant of the second morpheme. Voicing triggered by the rendaku element interacts with Lyman's Law, but post-nasal voicing does not, creating apparent violations of Lyman's Law. My conclusion is that post-nasal voicing is triggered by SV, while rendaku voicing is LV. See also Pater (1999: 332–334), Rice (1993), Steriade (1995: 185), and Ohno (this volume), among others, for discussion.
Such words can be found in the rendaku environment, where rendaku occurs.
(13) fuufu + keNka → fuufu-geNka 'domestic quarrel' (husband & wife + quarrel) (Vance 1987: 114)
     onna + teNka → onna-deNka 'petticoat government' (woman + empire) (Itô and Mester 1999a: 70)
Itô, Mester, and Padgett (1995) set aside this class of words containing NT sequences, and argue that post-nasal voicing, in the form of a constraint *NT, holds only within the Yamato vocabulary; with Sino-Japanese (12) and other vocabulary, NT sequences are allowed. Morphemes such as those in (12) are Sino-Japanese, so the constraint *NT does not hold.
voicing, as under the single mechanism hypothesis, that voicing should not be present at the time that rendaku voicing takes place. See Itô and Mester (1986) and Itô, Mester, and Padgett (1995) for extensive discussion. Both hypotheses thus predict that post-nasal voicing should not block voicing triggered by the rendaku element. However, post-nasal tautomorphemic voiced obstruents are not transparent with respect to Lyman's Law, as one might expect, but instead are blockers of rendaku, as in (14).
(14) širooto + kaNgae → širooto-kaNgae 'layman's idea' (layman + idea) *širooto-gaNgae (Itô and Mester 1995a: 576)
     aka + tombo → aka-tombo 'red dragonfly' (red + dragonfly) *aka-dombo (Kawasaki 1996: 4)
Mester's *NT constraint. Instead, I propose that Japanese does not provide appropriate cues to stratify the lexicon into Yamato and Sino-Japanese vocabulary based on this constraint, and that voicing, namely LV, is contrastive in post-nasal position. If LV is distinctive in this position, then the surface facts are as expected: LV blocks Lyman's Law and SV is transparent with respect to Lyman's Law.7
Avery and Idsardi (2002) pursue the line of thinking that voicing is contrastive after a nasal. They further examine another constraint proposed by Itô and Mester. Itô and Mester (1995a: 821–822) identify two constraints that are relevant to the representation of nasal-obstruent clusters. First is the now familiar *NT, and second is a constraint against voiced geminates, *DD. They link these constraints with stratification as in (16).
(16) Yamato:        *NT, *DD
     Sino-Japanese: *DD
Avery and Idsardi (2002) pick up on the constraint *DD and propose that
the underlying clusters allowed in the Yamato vocabulary are the following:
(17) TT  DD  TT  TT  DD  DD  NT
As Avery and Idsardi point out, the difference between the lexical strata
concerns the distribution of N before a consonant.
I do not try to decide between the two accounts here; critically, in both cases post-nasal obstruent voicing is non-contrastive between morphemes but contrastive morpheme-internally. Instead I turn next to the issue of stratification between the Yamato and the Sino-Japanese vocabulary with respect to the constraint *NT. These two accounts converge on the point that voicing is distinctive in tautomorphemic clusters. Either treatment accounts for the range of patterns found in the language, as summarized in (19).
(19) observation: Lyman's Law blocks rendaku when a singleton voiced obstruent follows (1d)
     account: Lyman's Law [LV from the rendaku morpheme cannot be realized because of a following LV]

     observation: Lyman's Law does not block rendaku when a nasal follows (1c)
     account: nasals have SV [LV from the rendaku morpheme can be realized because no LV segment follows]

     observation: Lyman's Law does not block post-nasal voicing between morphemes [derived-environment post-nasal voicing in a non-rendaku environment] (11)
     account: derived-environment post-nasal voicing is marked by SV, and thus Lyman's Law is not violated [realization of SV from the nasal is not blocked by a later LV]

     observation: Lyman's Law blocks rendaku when tautomorphemic ND follows (14)
     account: voicing is distinctive in tautomorphemic post-nasal obstruents [LV from the rendaku morpheme cannot be realized because of a following LV]
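The pattern summarized in (19) can be stated as a toy model in which the two voicing sources stay distinct: LV (the rendaku element) is subject to Lyman's Law, while SV (from a preceding nasal) is not. The Python sketch below is my illustration only, under the same simplified romanized representation as before; the forms fun-giru and iki-kiru are schematic shapes I constructed, not data from this article.

```python
# Toy model of the dual voicing mechanism hypothesis.
# LV: contrastive laryngeal voicing (rendaku element), blocked by Lyman's Law.
# SV: redundant sonorant voicing (nasals), NOT blocked by Lyman's Law.
VOICE = {"k": "g", "s": "z", "t": "d", "h": "b"}
LV_OBSTRUENTS = set("bdgz")
NASALS = set("nmN")

def compound(first: str, second: str, rendaku_element: bool) -> str:
    initial = second[0]
    if rendaku_element:
        # LV from the rendaku element cannot be realized if an LV
        # obstruent follows in the second member (Lyman's Law).
        if initial in VOICE and not (LV_OBSTRUENTS & set(second)):
            second = VOICE[initial] + second[1:]
    elif first[-1] in NASALS and initial in VOICE:
        # SV from the final nasal applies regardless of later LV obstruents.
        second = VOICE[initial] + second[1:]
    return first + "-" + second

print(compound("hon", "tana", True))   # hon-dana: a nasal does not block LV
print(compound("aka", "tombo", True))  # aka-tombo: tautomorphemic ND blocks LV
print(compound("fun", "kiru", False))  # fun-giru: post-nasal SV voicing (schematic)
print(compound("iki", "kiru", False))  # iki-kiru: no nasal, no voicing (schematic)
```

The key design point is that the Lyman's Law check is run only on the LV branch; the SV branch never inspects the rest of the second member, which is how the model produces the apparent Lyman's Law violations discussed in the text.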
6. On stratification in the Japanese lexicon
While stratification and post-nasal voicing are logically independent of one another, it is nevertheless worth pursuing whether the stratification analysis, under which *NT holds of Yamato but not of Sino-Japanese vocabulary, is reasonable to maintain. In this section I examine why one might choose to abandon stratification with respect to the properties discussed here.

That the Japanese lexicon is stratified is a general assumption in the literature. Martin (1952) divides the lexicon into three groups, Native, Sino-Japanese and Onomatopoeia, and Foreign, as does McCawley (1968).
Martin (1952: 9), in an interesting discussion of his purpose, states that this is "the first attempt to make a systematic study of Japanese morphophonemics on a synchronic level". He argues that a study of compounds shows "a definite cleavage of morphs into two classes, here called class S (for Sino-Japanese, the historical original of the class) and class Y (for Yamato, or native Japanese, the presumed original of most members of the class). There are numerous hybrid compounds, to be sure; but on the basis of selectivity within immediate constituents which contain only two morphs, for the overwhelming majority of cases, each morph and morph group may be placed clearly in one of the two classes" (24). In a comparison between Native and Sino-Japanese morphemes, McCawley (1968: 64) states about the Sino-Japanese morphemes that they are borrowed from Chinese in medieval times and function in Japanese chiefly as elements of compounds which usually have a somewhat learned flavor; their role in Japanese is much like that of the Latin and Greek morphemes found in the learned vocabulary of English. Since Sino-Japanese morphemes are syntactically distinct from the other morphemes of Japanese in that they and only they are the bound morphemes from which such two-element compounds are formed, the syntactic information in the dictionary entry of a Japanese morpheme must indicate (directly or indirectly) whether the morpheme is Sino-Japanese or not. McCawley shows that Sino-Japanese and native morphemes have a slightly different vowel inventory (Cyu and Cyo are excluded in the native vocabulary but not in the Sino-Japanese vocabulary); Sino-Japanese items obligatorily have no fewer than two nor more than four mora, as also discussed by Itô and Mester (1995a, 2003). While McCawley differentiates Sino-Japanese and Native vocabularies, he has a number of rules marked [-foreign], covering Native and Sino-Japanese together, but only one marked [+native] (restrictions on diphthongs) and one marked [+Sino] (a deletion/epenthesis rule).
Looking at the distribution of obstruent voicing following a nasal, the following divisions into strata are then proposed by Itô and Mester (1995a, 1999a).

                                          Yamato    Sino-Japanese
    post-nasal voicing between morphemes  yes       yes
    post-nasal voicing within morphemes   ND        NT/ND
katakana. Thus a typical text contains both kanji and kana; see Kess and Miyamoto (1999) for detailed discussion.

Many words of Sino-Japanese origin are written with two kanji. This might suggest that two morphemes are actually involved, and that the use of two kanji provides orthographic evidence that *NT holds of the Yamato stratum but not of the Sino-Japanese stratum. Martin (1952) can be viewed as providing evidence for the position that the Sino-Japanese forms are morphologically complex: he points out that morphs in Japanese are limited to certain shapes, and that n.C sequences (where the dot represents a morpheme boundary) are nearly always indicative of morph boundaries (17). Vance (1996: 23), on the other hand, suggests that many Sino-Japanese words written with two kanji probably should not be analyzed as consisting of two morphemes. He further notes that kana spelling provides a clear indication in some cases that an etymological compound is no longer recognized as a compound (27). In discussion of kanji, Kess and Miyamoto (1999: 68) point out that compound kanji are often used for common vocabulary items in literary Japanese, and that many two-kanji compound words are stored and accessed as whole word units (68–69).

Based on the studies of the writing system cited above, it appears that orthography is not necessarily a useful tool to demarcate the Yamato and Sino-Japanese vocabularies. First, both strata employ kanji and hiragana. Second, the discussion in Kess and Miyamoto (1999) and Vance (1996) suggests that etymological compounds are not necessarily analyzable as compounds synchronically. While the writing system allows recent borrowings to be identified through the use of katakana, it is not necessarily helpful in sorting out the Yamato and Sino-Japanese strata.
7. Conclusion
In this article I have made three points. First, I have argued that not all compounds take the rendaku element, and that post-nasal voicing can best be studied outside of the rendaku environment. Second, I have added support to the position that two voicing mechanisms are found in Japanese, LV and SV. Post-nasal voicing is contrastive within a morpheme, marked by LV, and it is predictable between morphemes, marked by SV, in the non-rendaku environment. The contemporaneous inclusion of tautomorphemic post-nasal voiced stops in Lyman's Law in the rendaku context and the failure of derived-environment post-nasal voicing to be blocked by Lyman's Law in the non-rendaku environment provide evidence for this claim. Third, I have argued that NC sequences provide little evidence for stratifying the Japanese lexicon into Yamato and Sino-Japanese vocabulary. The kinds of alternations that one would hope to find to distinguish the two strata with respect to this criterion do not appear to be available. Stratification with respect to post-nasal voicing seems to be tangential, as no appropriate alternations exist to trigger the placement of words in different strata. One certainly does not want to deny the possibility of stratification in grammar. Within-morpheme requirements, without the benefit of alternations, are not clear evidence for stratification, however, as there is no evidence available to the learner for analyzing the morpheme as anything other than what it appears to be.
Acknowledgements
Thank you to Bill Poser for helpful discussion of the Japanese facts discussed in this article. I could not have completed this work without his assistance. Thank you also to Kazutoshi Ohno for detailed comments and
discussion on an earlier draft, to an anonymous reviewer, and to Manami
Hirayama for help with the data. Misunderstandings are my own.
Notes
1. Additional strata are argued for, mimetic and foreign. See Itô and Mester 1995a and Itô and Mester 1999a for recent work that deals explicitly with this classification, and Martin 1952 and McCawley 1968 for foundational work in English. See Vance 1987 for a discussion of some of the older literature.
2. There are many lexical exceptions to the processes discussed in this article; see, for instance, Martin 1952, Vance 1987, Labrune 1999, Ohno 2002, and Kubozono this volume for discussion. For instance, some words always undergo rendaku as the second element of a compound, some never undergo rendaku (e.g., tuti 'soil, ground', himo 'string'), and some are variable, undergoing rendaku in some but not all compounds (e.g., hune 'boat'); see Ohno 2002 and others for discussion. Some exceptions can be accounted for by phonological, syntactic, and semantic factors; see Lyman 1894 and Ogura 1910 (cited in Martin 1952: 49) as well as Otsu 1980, Itô and Mester 1986, Vance 1987, Labrune 1999 and Kubozono this volume, among others, for dif-
Sei-daku: diachronic developments in the writing system
Kazutoshi Ohno
1. Introduction
One issue regarding voicing in Japanese concerns the sei-daku (lit. 'clear-muddy'1) distinction, which correlates with the voicing opposition in contemporary Japanese. Various aspects of the sei-daku distinction are represented in the history of the writing system. This article presents the historical development of this distinction within Japanese orthography and comments on the nature of such distinctions. This article, therefore, chiefly presents the facts of sei-daku as represented in the writing system, and introduces prior proposals that attempt to account for the inconsistent representation of this phenomenon in the orthographic history. It provides neither new data nor new findings.
The contents are as follows: Section 2 illustrates the three diachronic stages of the sei-daku distinction in writing (distinguished, not distinguished, and distinguished again); Section 3 introduces hypotheses that explain the transitions between the three stages; Section 4 presents one of the important remaining issues, sei-daku and nasality; finally, Section 5 concludes with some discussion points.
2.
2.1. Issue 2
In the current usage of the kana syllabary (hiragana or katakana), sei-daku is distinguished by the addition of two dots at the top right corner of a given kana (see Appendix). These dots are called daku-ten (ten 'dot'), and they change a sei-on (on 'sound') character to a daku-on character. That is, daku-on are not represented by independent kana characters, but rather are created by adding a diacritic to a sei-on character. This convention is due to the fact that hiragana and katakana materialized as systems without a sei-daku distinction. One kana could represent either sei-on or daku-on. The source of hiragana and katakana, manyougana (see section 2.2 below for details), actually had daku-on characters, and the sei-daku distinction was quite well distinguished by different characters at some point in the past.
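The modern convention, in which a daku-on character is a sei-on base plus a diacritic rather than an independent letter, is preserved directly in Unicode: daku-ten exists as the combining character U+3099, and normalization composes it with the base kana. The short Python check below is my illustration, not material from this article:

```python
import unicodedata

# A sei-on kana followed by the combining daku-ten (U+3099) composes,
# under NFC normalization, into the corresponding daku-on kana.
ka = "\u304b"        # HIRAGANA LETTER KA
dakuten = "\u3099"   # COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK
ga = unicodedata.normalize("NFC", ka + dakuten)

print(ga == "\u304c")  # True: HIRAGANA LETTER GA
# The same holds for katakana: KA (U+30AB) + daku-ten -> GA (U+30AC).
print(unicodedata.normalize("NFC", "\u30ab" + dakuten) == "\u30ac")  # True
```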
In terms of the writing conventions, then, the diachronic transition of the sei-daku distinction can be roughly divided into the three stages given in (1) below (see sections 2.2 through 2.4 for further details and clarification).
(1) Earliest Stage: manyougana          sei-daku distinguished
    Middle Stage:   hiragana, katakana  sei-daku not distinguished
    Current Stage:  hiragana, katakana  sei-daku distinguished (by daku-ten)
The two periods in which sei-daku was distinguished within the kana systems
are interrupted by a period in which sei-daku was not distinguished in the
writing system. This is a fact of the history of kana usage.
Explanations for this fact will differ largely depending on whether we regard it as a reflection of the actual spoken language or not. We will address this issue in section 3. In the remainder of section 2, we will discuss the diachronic development of the kana systems in Japanese in more detail.
2.2. Earliest Stage: manyougana
Chinese characters in Japan, or kanji, were (and still are) read in two ways: the Chinese way (on reading) and the Japanese way (kun reading).3 For example, the Chinese character for 'four' could be read either as si (on reading) or as yo (kun reading).4 These readings were utilized to transcribe Japanese pronunciation. Such Chinese characters, which represent sound information rather than logographic information, are called manyougana (lit. 'kana used in Manyoushuu') because their use is most diversified in Manyoushuu (see below).5

The earliest written works in Japanese can be traced back to the eighth century, or the Nara Era [710–784], represented by writings such as Kojiki (712), Nihonshoki (720), and Manyoushuu (759?).6 Some parts of these official documents or collections are written in manyougana.7 For example, waka (Japanese traditional songs) were usually written in manyougana.
2.3.
this occurred below, but before moving on, two things must be kept in mind with regard to the development of hiragana and katakana. One is that the development was gradual. The other is that the sei-daku distinction was not associated with particular writing systems, i.e. it is not accurate to say that manyougana had the sei-daku distinction while hiragana and katakana didn't.
To summarize, it is clear that the systems of manyougana and hiragana/katakana represent a continuum, illustrating that the manyougana system gradually shifted to hiragana/katakana. The loss of the sei-daku distinction in writing can already be seen in the manyougana system (see 2.3.3), and yet the preservation of some daku-on characters in simplified forms was established, if only temporarily (see 2.3.2). It is more natural to assume that the tendency not to keep the orthographic sei-daku distinction became stronger and stronger, regardless of the kana system, between the Earliest Stage and the Middle Stage. The transition from manyougana to hiragana/katakana happened to overlap with this tendency. It is thus perhaps no surprise that hiragana/katakana developed without the sei-daku distinction.
Some were also used to represent nasal sounds. Placed at the (top) right of a kana, they could be distinguished from tone marks, i.e. function as daku-on markers (Komatsu 1981: 63–71).
The use of daku-on markers became fairly established and gradually spread to fields other than those directly related to Chinese in the Muromachi Era [1338–1573], but they were not yet popularized. In the first half of the Edo Era [1600–1867], the diacritic was unified to the two-dot form at the top right corner (i.e. the same as the daku-ten currently used) and the appearance of the diacritic increased greatly (Ono 1995: 80). The early Edo Era, therefore, is often considered to be the era in which the stabilization of the sei-daku distinction in writing (by daku-ten) occurred. However, even in the Edo Era, daku-on were not consistently marked by daku-ten.25 Generally speaking, daku-ten was added only when the author thought it necessary. Hence, kana with daku-ten would be daku-on, but kana without daku-ten could be sei-on or daku-on. Not as common as daku-ten, fudaku-ten (fu 'not') was sometimes added to represent sei-on in the Edo Era (Komatsu 1981: 71).26 It can safely be said that kana were still common to both sei-on and daku-on.
The use of daku-ten was finally incorporated into the modern education system in the Meiji Era [1868–1912]. Even so, some documents were still written without daku-ten (Maruyama 1967: 11–22).27 Official (governmental) documents, such as laws and the constitution, were written without using daku-ten, and this style continued until the end of World War II (Kamei 1970: 44–45). The rigid sei-daku distinction by daku-ten, then, is much more recent than people normally think.
3.
seemingly reappeared? Under this assumption, therefore, it must be explained how and why such changes occurred, including changes in the sound values of daku-on and/or sei-on.

The second approach hypothesizes that the stage transitions merely reflect facts of writing. Under this hypothesis, it becomes reasonable to claim that the recognition of the sound values of sei-daku remained the same throughout the history of Japanese. The question here is: why were such different writing conventions, in terms of sei-daku, adopted?
In the remainder of section 3, we will discuss two proposals along the
lines of the second approach (sections 3.2 and 3.3), and one along the lines
of the first approach (section 3.4). In these subsections, we focus on the
transition from the Earliest Stage to the Middle Stage, since the explanation
of this transition is the key for each proposal. Finally, we will discuss the
transition from the Middle Stage to the Current Stage (section 3.5).
3.2. Second approach 1: sei-daku has been distinctive
The second approach introduced above hypothesizes that the sei-daku distinction, or the lack of this distinction, is merely a matter of writing practice. This approach can be further divided into two positions, depending on whether or not we assume that sei-daku has remained distinctive throughout the history of Japanese. In 3.2, we will discuss the first position adopted within this second approach, i.e. that sei-daku has been phonologically distinctive but the distinction was not reflected in writing (in the Middle Stage).
This assumption must be accompanied by a satisfactory account of the question mentioned in 3.1 above: why were different writing conventions adopted? More precisely, the following question must be answered: why is there a stage in which sei-daku was not distinguished in writing if it was distinctive in the [spoken] language? A possible answer to the question is: because sei-daku was rarely contrasted for the purpose of interpretation (even if contrasted in pronunciation), the distinction was simply ignored in writing conventions, as seen in Takagi et al. (annotated) (1960: 42–46), Anonymous (1963: 375–388), etc. As long as the sei-daku distinction seldom triggered semantic confusion, it did not have to be reflected in the writing system (e.g. hasituma and hasiduma would be the same word meaning 'loving wife'; naturally, context played a role as well).
The explanation above would account for the Current Stage as well: because sei-daku began to trigger semantic confusion, the distinction was employed within the writing convention. However, it does not explain why sei-daku was relatively well distinguished in the earliest literature. Hence, a further explanation is added as follows (cf. Takagi et al. (annotated) 1960: 43–44): manyougana were used to represent precise pronunciation, while hiragana and katakana were created to represent sound conveniently (simply and quickly). That is, the appearance and the loss of the sei-daku distinction in writing resulted from the different functions of manyougana and hiragana/katakana. Manyougana would distinguish hasituma from hasiduma because they are pronounced differently, while hiragana and katakana would not because they are the same word for 'loving wife'.29 For convenience, it would have been better to represent hasituma and hasiduma together, i.e. with a character that represents both sei-on and daku-on (tu/du).

After all, the main claim of this position is that hiragana and katakana were established without the sei-daku distinction because the users felt it most convenient for their writing systems. This claim itself is quite reasonable, if we do not assume an association between the kana system and the sei-daku distinction.
There remains an important question, though it is not detrimental to the claim made above. Would people really give up the sei-daku distinction just for convenience in writing, despite being well aware of the distinction? If people hear and pronounce two sounds distinctly, would it not seem natural to write them in different ways?30 In order to answer this question, or to avoid answering it, some have sought to explain the three-stage transitions without assuming that sei-daku was distinctive.
3.3. Second approach 2: sei-daku was indistinctive
The second approach does not necessarily assume that sei-daku was distinctive throughout the history of Japanese. Some researchers take a radical position, assuming that there was no sei-daku distinction in the past at all, but many simply assume that the auditory distinction was generally extremely hard to make, and was thus confused quite often in the writing system, even though the distinction existed in speech. They assume that sei-daku was actually indistinctive in the past anyway, and gradually became distinctive later (in the Current Stage). In order to justify this assumption, it must be explained why sei-daku was relatively well-distinguished in the earliest literature.
Hamada (1960, 1971) proposes that the knowledge or skills of the
authors or editors made the distinction possible. That is, exceptional writers
56 Kazutoshi Ohno
could distinguish sei-daku both by hearing and in writing. One of the reasons for this is that they are assumed to have been familiar with Chinese (characters, language, and literature)31, so that they could utilize manyougana for distinct phonetic phenomena. As discussed in section 2.4 above, a rigid sei-daku distinction was generally required in reading Chinese. Hence, being familiar with Chinese would have resulted in being able to distinguish sei-daku.
There are several pieces of evidence to support Hamada's hypothesis. Let us discuss two of them. First, even in the eighth century, in which sei-daku was well-distinguished in official documents, sei-daku was not distinguished in private documents and/or by lower-class people (see also Kamei 1970, 1985: 228). For example, the two personal letters of Shousouin kana monjo (762?)32 do not show sei-daku distinctions at all in their manyougana use. Second, Shinsen jikyou, the oldest existing set of Chinese–Japanese dictionaries, is relatively rigid in the sei-daku distinction, though it was written in 892.33 This is easily understood if we assume that the distinction was the result of an educated editor writing for educational purposes.34
Because there were people who could distinguish sei-daku, it is natural to assume that sei-on and daku-on were pronounced differently. However, it is possible that common people did not pay attention to the sei-daku distinction. There is room to suspect that they may even have been unaware of such a distinction, similar to the nasal alternations in contemporary Japanese.35 Assuming so does not require any correction of the argument presented above; rather, it accounts for things more precisely.
Let us note two ways in which this is so. First, it explains why hiragana and katakana were established without the sei-daku distinction within their systems. This would have been because people generally had trouble distinguishing sei-daku. Originally, manyougana were used by the elite, who could distinguish sei-daku. As manyougana were popularized, those who were not educated enough to distinguish sei-daku started using manyougana or their simplified forms, confusing sei-daku. Hiragana and katakana were not established or issued by one particular person, institution, or authority at some particular moment, but by various people, over time. Second, even the writers of the earliest literature, who could distinguish sei-daku, sporadically confuse sei-daku in writing. This might have been because the pronunciation of sei-daku in that period was rather more indistinct than is currently believed.
One piece of evidence, out of several he provides, is the appearance of nasality in the inflections of the b-final verbs in this period, e.g. tob+ta 'fly+PAST' > tonda 'flew', which is actually similar to the sound alternation of yom+ta 'read+PAST' > yonda 'read (past tense)'. These are naturally understood if the sound quality of [ b ] was closer to [ m ], such as [ mb ].48 He further assumes a similar change for other daku-on as well. See Endō (1973) for other evidence and discussion.
To summarize, Endō (1989) hypothesizes that the nasalization of daku-on blurred the sei-daku distinction, which is reflected in the writing of the Middle Stage, and that this nasality appeared around 800. Before that, daku-on were not accompanied by nasality and were distinguished from sei-on, which is reflected in the writing of the Earliest Stage. Thus, according to Endō, the revival of the sei-daku distinction must be closely related to the disappearance of nasality (which will be discussed in 3.5 below).
The hardest part in supporting this hypothesis is the lack of a satisfactory explanation for why nasalized daku-on and sei-on were indistinctive, while daku-on without nasality were distinctive from sei-on. Another difficulty is generalizing the change seen in [ b ] to other daku-on. Endō (1989) discusses the change of [ b ] to [ mb ] (i.e. ba-column daku-on) extensively, but other daku-on only briefly. Yet another remaining issue is the time when nasality appeared with daku-on. The change of [ b ] to [ mb ] may have occurred around 800, the transitional span from the Earliest Stage to the Middle Stage, but it may not have been indicative of the change in daku-on in general. It is important to remember that the nasalization of daku-on is one of the hardest issues to deal with in the study of the history of Japanese (see also fn. 43 and section 4 below).
4. Remaining issues
We have illustrated the sei-daku distinctions in the history of Japanese orthography, and seen possible explanations for differences in the various stages.
In investigating the data found in the literature, we must try to reconstruct the sound values of sei-daku, or to make assumptions about how aware Japanese speakers of the past were of sei-daku. There remain many unresolved issues regarding this exploration. In this section, however, only the most important remaining issue is addressed: the sound values of sei-daku in relation to nasality.
In the major dialects of contemporary Japanese, sei-daku is opposed in voicing. Most sei-daku pairs differ not only in voicing but also in place and/or manner of articulation (see appendix), but the statement that sei-on are all voiceless while daku-on are all voiced holds. The sei-daku opposition, therefore, is usually assumed, implicitly or explicitly, to be a voicing opposition. Basically, this article has taken this position as well. However, where the sound values of sei-daku are concerned diachronically, nasality must also be taken into account. That is, at least three sounds differing in manner must be considered for sei-daku, e.g. [ t ], [ d ], and [ nd ]. Since sei-daku is a binary distinction, how to group the three into two is an important issue.
Considering that [ ŋ ] is regarded as a variant of [ g ] in contemporary Japanese54, that Rodriguez notes at the beginning of the 17th century that [ b ] is occasionally accompanied by nasality (see fn. 43), and so forth, voiced obstruents55 (e.g. [ d ]) are perceptually grouped together with (pre)nasalized obstruents (e.g. [ nd ]) and distinguished from voiceless obstruents (e.g. [ t ]). This distinction is the sei-daku distinction. This is probably the most popular view.
Endō (1989) hypothesizes that voiceless obstruents (e.g. [ t ]) can be grouped together with nasalized obstruents (e.g. [ nd ]) due to their perceptual closeness, but distinguished from voiced obstruents (e.g. [ d ]), though the motivation for this is not convincingly given. The sei-daku distinction he assumes, however, is parallel to the view just given above, i.e. it is in terms of voicing. This is obvious from statements of his such as "the sound values of daku-on changed (from [ d ] to [ nd ])", etc.
There is another view that we have not discussed yet. Some support the
idea of grouping oral obstruents (e.g. [ t ] and [ d ]) together and distinguishing them from nasalized obstruents (e.g. [ nd ]) (M. Takayama 1992a,b,
among others56). This division is based on nasality (oral vs. nasal), rather
than voicing (voiceless vs. voiced). They propose that the sei-daku distinction had been based on this distinction, i.e. non-nasal obstruents are sei-on
and [partially] nasal obstruents are daku-on. Voicing of the non-nasal obstruents (i.e. sei-on) had been allophonic, e.g. voiceless word initially and
voiced word internally. That is, the sei-daku distinction in the past is similar to the dialects currently spoken in Tohoku (north-eastern Japan) or part
of southern Kyushu.57
We did not, and will not, explore this view, mainly because it is not a hypothesis designed to explain the diachronic transitions of the sei-daku distinction in writing. It is unclear how this view convincingly accounts for the transitions in written representation. Also, this view, so far, does not explain how, why, and when the sei-daku distinction of the past (by nasality) changed into the present distinction (by voicing). According to this assumption, voiced obstruents (e.g. [ d ]) in the past were sei-on; but now they are categorized as daku-on. We would like to have a persuasive explanation of this fact.
This, of course, does not mean that this view is not worth exploring. As noted in section 3.4, for example in fn. 44, further consideration is required before concluding anything about the diachronic background of the appearance of nasality with obstruents. Until we reach a solid consensus, it is best to keep our eyes open to various possibilities.58 It is actually worth investigating various proposals, such as Takayama (1992b), who extensively discusses sei-daku in relation to nasality, in order to consider the diachronic development or change of the sound values of sei-daku. It must also be noted that the real value of Endō's proposals (section 3.4) becomes clearer as nasalized obstruents are studied in greater detail.
5. Summary
In the first half of this article (section 2), it was illustrated that in writing sei-daku was distinguished at first (Earliest Stage), then was not distinguished
(Middle Stage), and is now distinguished again (Current Stage). The transitions of the kana systems, including daku-ten usage, were also illustrated.
In the second half (section 3), three possible explanations for the sei-daku
representations were introduced. One approach assumed that sei-daku in
literature was a reflection of the actual [spoken] language, and the other
approach assumed that sei-daku in literature was merely a fact within the
writing system. The latter assumed that the sei-daku distinction existed
phonologically in a different manner from that in writing, and allowed for a
position either that sei-daku was actually distinctive or that sei-daku was
indistinctive. Finally (section 4), it was briefly noted that the correlation
between sei-daku and nasality could be a key for the further development
of sei-daku study in terms of the history of Japanese.
Appendix

Sei-daku (lit.) 'clear-muddy'

    sei                          daku
    ka  (ka)   [ka]              ga  (ga)   [ga]~[ŋa]
    ki  (ki)   [kji]             gi  (gi)   [gji]~[ŋi]
    ku  (ku)   [kɯ]              gu  (gu)   [gɯ]~[ŋɯ]
    ke  (ke)   [ke]              ge  (ge)   [ge]~[ŋe]
    ko  (ko)   [ko]              go  (go)   [go]~[ŋo]
    sa  (sa)   [sa]              za  (za)   [dza]~[za]
    si  (shi)  [ɕi]              zi  (ji)   [dʑi]~[ʑi]
    su  (su)   [sɯ]              zu  (zu)   [dzɯ]~[zɯ]
    se  (se)   [se]              ze  (ze)   [dze]~[ze]
    so  (so)   [so]              zo  (zo)   [dzo]~[zo]
    ta  (ta)   [ta]              da  (da)   [da]
    ti  (chi)  [tɕi]             di  (ji)   [dʑi]~[ʑi]
    tu  (tsu)  [tsɯ]             du  (zu)   [dzɯ]~[zɯ]
    te  (te)   [te]              de  (de)   [de]
    to  (to)   [to]              do  (do)   [do]
    ha  (ha)   [ha]              ba  (ba)   [ba]
    hi  (hi)   [çi]              bi  (bi)   [bi]
    hu  (fu)   [ɸɯ]              bu  (bu)   [bɯ]
    he  (he)   [he]              be  (be)   [be]
    ho  (ho)   [ho]              bo  (bo)   [bo]
the data), romanization faithful to pronunciation (i.e. phonetic romanization, typically seen in the spelling of Japanese proper names), and [broad] transcription of actual pronunciation. [ŋ] is observed only word-internally, if it appears (dialectal). See section 2 for discussion of manyougana.
Acknowledgements
The completion of this paper was supported by the MOE (Ministry of Education) project of the Center for Linguistics and Applied Linguistics of Guangdong University of Foreign Studies, Guangzhou, China. The revised version of this article was written while I was working at the Institute of Cognitive Science, Hunan University, Changsha, China, after considerable restructuring of its original version, entitled "Sei-daku: more than a voicing difference – toward a better understanding of the rendaku phenomenon", written in June 2002 while I was studying at the University of Arizona, USA. Reviews from two anonymous scholars were very helpful, especially the detailed comments and suggestions from the second reviewer. Takayama Tomoaki provided me not only with helpful suggestions but also with materials that I could not obtain. This article could not have been completed without continuous help from the editors of this volume. Many thanks go to those who have commented on various drafts of this article. All remaining errors are my own.
Notes (Japanese, Chinese and Korean names are given in the order last-first)
1. (lit.) = (literal translation)
2. Discussion in Section 2.1 is largely dependent on Hamada (1971: 44–45).
3. To add a little more explanation: the on reading is based on Chinese pronunciation, while the kun reading is actually a native word assigned to the kanji. In contemporary Japanese, most (though not all) of the frequently used kanji have the two readings. Moreover, there may be multiple on readings and/or multiple kun readings for one kanji. It is not surprising that one kanji has several readings.
4. cf. 'Four' in contemporary Mandarin Chinese is sì in Pinyin representation.
5. There was another way of reading manyougana, called gisho 'fun reading', which relies on association or imagination by the reader. For instance, the two kanji [bee]-[sound] represented the sound bu (onomatopoetic), the two kanji [ten]-[six] (i.e. sixteen) represented the sound sequence sisi because 4 × 4 (called si-si 'four-four' in Japanese) makes 16, and so forth.
6. Kojiki ('Record of Ancient Matters') is a history book written by Ō no Yasumaro (who recorded what Hieda no Are said) by Imperial request. The preface to this work is written in Chinese (seikaku kanbun 'regular Chinese'), while the text is written in highly Japanized Chinese (hentai kanbun 'irregular Chinese'). Nihonshoki ('Chronicle of Japan') is another history book, written by Toneri Shinnō (Shinnō 'Imperial Prince'), etc., and is the first official document compiled by Imperial command. The text of the book is basically written in Chinese. Manyoushuu ('Collection of Myriad Leaves') is an anthology of Japanese traditional songs (waka) written or edited by various anonymous authors in 759, or perhaps a little later (around 770; the complete editorial work may have been done even later than this).
7. Using Chinese characters to show Japanese pronunciation is in fact already seen in inscriptions (kinseki-bun) a few centuries earlier. However, those lexical items are limited to proper nouns, for which it is often hard to reconstruct the original pronunciations.
8. At least 35 different manyougana are used for the native sound of si in Nihonshoki (Tsuru 1977: 242).
9. Thus, reading manyougana was already difficult in the next Era (Heian Era
[794–1192]).
10. According to Tsukishima (1972: 384), this had already been recognized by
Keichū (Waji shouin, 1691). Motoori Norinaga (Kojiki-den 1: Karina no koto,
1767) studied this issue, and his work was well expanded by his student Ishizuka Tatsumaro (Kogen seidaku kou, published in 1801).
11. Although data are limited, it is generally agreed that manyougana in inscriptions (see fn. 7 above) in or around the Suiko Era [592–628] also had the sei-daku distinction by using different characters (cf. Tsuru 1977).
12. Kasuga (1941) reexamined manyougana in Kojiki which had been considered dubious regarding the sei-daku distinction at that time, and concluded that they were well distinguished by different characters, with a few exceptions. Ōno (1947–48, 1953) argued that manyougana in Nihonshoki, which made Motoori Norinaga wonder why there were many exceptions, also generally had the sei-daku distinction by characters as well. Nishimiya (1960) and Tsuru (1960) argued that the sei-daku distinction was well represented not only by on-gana (manyougana read in on reading) but also by kun-gana (manyougana read in kun reading).
13. An example of multiple ways of representing a sound is provided in section
2.2 above (kamo (admiration marker) could be represented by one kanji or
two ka-mo). Another way is the gisho use of manyougana (see fn. 5 above).
14. This difficulty was especially remarkable when taking supplemental notes on Chinese literature. As reading aids, difficult or special pronunciations, morphemes that Chinese lacked (e.g. particles such as case markers, inflectional endings, etc.), annotations, and so forth were added in the margin.
15. In fact, some simplified kana had already been used sporadically earlier. For example, a few of them are seen in Shousouin komonjo: Minokoku [Mino no kuni] Kamogun Hanifuri (Hanyuuri) koseki-chou 'Register book of Hanyuuri, Kamo County, Mino State' (currently part of Gifu Prefecture), which is the earliest existing family register book, written in 702. It has been claimed that kana simplification was already actively in progress in this period.
16. Those various variants of simplified characters were sometimes mixed with
manyougana (i.e. full kanji) in the literature.
17. Any simplified form of manyougana is thus often grouped together and called ryakutaigana 'simplified kana', in contrast to magana 'real kana', which refers to non-simplified manyougana (i.e. full kanji).
18. The umlaut shows that the sound belongs to the otsu type; it does not indicate the actual sound quality, such as lip rounding.
19. Tsuru (1977) chose this literature for discussion because it represents official
documents (history books compiled by Imperial command), which are descended from Nihonshoki. They are written in the Chinese style, and
manyougana are included.
20. They developed separately because of preferences in their respective fields: cursive characters were preferred in writing native literature, while partial characters were preferred in reading (annotating, commenting on, etc.) Chinese literature.
21. For example, Japanese pronunciations of Chinese characters are written in manyougana in Konkoumyou saishououkyou ongi, a commentary on Buddhist scriptures, copied in 1079.
22. This was actually a simplified (partially represented) kanji for daku, noted in red ink (seen in Kongouchou yugarengebu shinnenju giki, 889).
23. Komatsu (1981: 70) notes that the primitive use of daku-ten on the top right
corner is seen in the literature in the mid-thirteenth century ([Kanchiin] Ruijuu
myougishou, written [copied] in 1241).
24. Originally they were tone marks, which later started to represent sei-daku.
25. Maruyama (1967: 11–22) notes that in the Edo Era [1603–1867] people tended to think that sei-on were sophisticated and daku-on vulgar. For example, scholars of Japanese often wrote sei-on but not daku-on. This might be one of the reasons for the inconsistency.
26. This is a single small circle on the top right corner of kana, i.e. the same diacritic currently used for handaku-on ([ p ]-initial syllables).
27. e.g. Nihonshoki Tsuushaku (Commentary on Nihonshoki) by Iida Takesato
published in 1902.
28. Discussions in sections 3.1 and 3.2 are largely dependent on Hamada (1971).
29. Of course, this is not identical to the statement "manyougana are phonetic (sei-daku is distinct) while hiragana and katakana are phonemic (sei-daku is indistinct)", which contradicts the basic assumption of this position, namely that sei-daku has been distinctive.
30. But see the last paragraph in section 3.5 below for a possible answer to this
question.
31. Mori (1991) points out that some volumes of Nihonshoki were actually written
by Chinese scholars (at least two different Chinese scholars). Ide (1989: 241)
says, citing M. Inoue (1932: 223–225), that some manyougana usage in
Manyoushuu requires knowledge of the Chinese literature.
32. There are two of them: kou-monjo and otsu-monjo. They were (independently) written before 762 at the latest, according to Komatsu (1981: 57). They are generally regarded as having been written by persons of low class (among the literate). M. Tanaka (1995: 196) says that they are written in a rough style with sloppy characters, so the author was presumably not highly educated.
33. See also the second paragraph of section 2.4.
34. Wamyou ruiju-shou, which was edited a little later, in 934 (by Minamoto no Shitagō), has no sei-daku distinction, though it is also a set of Chinese–Japanese dictionaries. Hamada (1971: 44) notes that this is due to differences in the attitudes of the authors/editors toward the sei-daku distinction in writing, rather than to temporal factors.
35. The syllabic/moraic nasal in Japanese undergoes place assimilation. It becomes [ m ] before a bilabial sound, [ n ] before an alveolar sound, [ ŋ ] before a velar sound, and [ ɴ ] elsewhere (word-final, etc.). However, these are all recognized as the same sound and written with the same hiragana or katakana by native speakers of Japanese (Komatsu 1981: 58–59). Sometimes [ ɴ ] (i.e. the unassimilated form) is observed in environments of assimilation (e.g. before a bilabial sound). Native Japanese speakers will not recognize it as special or different.
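The assimilation pattern described in this note can be sketched as a simple lookup. This is an illustrative sketch only, not part of the source; the segment classes below are simplified assumptions stated over romanized consonants.

```python
# Illustrative sketch (not from the source): the place-assimilation rule for
# the Japanese moraic nasal /N/ described in fn. 35, as a simple lookup.
# The consonant classes are simplified assumptions for this example.
BILABIALS = set("pbm")
ALVEOLARS = set("tdnszr")
VELARS = set("kg")

def moraic_nasal(next_sound=None):
    """Return the surface form of the moraic nasal /N/ before next_sound.

    next_sound is the romanized consonant following /N/, or None when
    /N/ is word-final.
    """
    if next_sound in BILABIALS:
        return "m"    # e.g. before b: si[m]bun
    if next_sound in ALVEOLARS:
        return "n"    # e.g. before t: ho[n]tou
    if next_sound in VELARS:
        return "ŋ"    # e.g. before k: sa[ŋ]ka
    return "ɴ"        # elsewhere: word-finally, etc.
```

The point of the sketch is that the four surface forms are fully predictable from context, which is why native speakers treat them as one sound and write them with one kana.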
36. Endō (1989) is a collection of his papers published in 1971–1988.
37. Kakekotoba is a rhetorical device, mainly used in verse, by which two (or more) readings are available from one expression. That is, homonymic expressions are exploited, e.g. matu 'wait(ing)' / 'pine (tree)'.
38. For example, in the traditional song (waka) in Kokinwakashuu (edited around 913): wasurenanto omohukokorono tukukarani arisiyorigeni maduzokohisiki (14: 718), two readings are available from madu: madu 'first (of all)' and matu 'wait(ing)'.
39. Such nasality is not indicated in Japanese literature at all.
40. Around 1600, several books and dictionaries related to Japan or Japanese were written by missionaries of the Society of Jesus. They are called Kirishitan shiryou 'Christian literature', represented by Arte da Lingoa de Iapam [Nihon daibunten] (1604–1610) by João Rodriguez. These are the most reliable sources for discussing the sound values of sei-daku because they are relatively recent and most of them are written in an alphabet which distinguishes voicing by using different letters.
What Rodriguez actually says is that the preceding vowel is nasalized, but Hamada (1952a: 21, note 9 on p. 31) says that the nasality must be accompanied with daku-on because: (i) the nasality is also expected word-initially; (ii) the pronunciation of daku-on shares the property of nasals in youkyoku (singing Noh) and heikyoku (singing Heike monogatari 'Tale of Heike'), citing Iwabuchi (1934).
41. Helinyuli [Kakuringyokuro] edited by Luo Dajing [Ra Taikei] (13c, 1252?), Ribenjiyu [Nihonkigo] edited by Xue Jun [Setsu Shun] (1523), etc. In these works, daku-on are usually preceded by a coda nasal, e.g. fn-zh for hude '(writing) brush'. Pronunciations are given in contemporary Mandarin Chinese here for convenience.
42. Iropa [Iroha] (author unknown) (1492), Cheophaesineo [Shoukaishingo] by Gang Useong [Kō Gūsei] (1676), etc. They spelled in a similar fashion as in the Chinese literature (see fn. 41 above), i.e. they put a nasal coda before daku-on. One might think that this shows the voicing of the obstruent rather than the nasalization of daku-on, on the grounds that Hangul (the Korean syllabary) itself has no way to show voicing. In fact, Hangul had a letter for / z / and it was used for the Japanese / z / sound in the works cited above; nonetheless, it is preceded by a nasal coda (see also Hamada 1952b).
In Nosondang Ilbonhaengnok [Roushoudou Nihonkouroku] by Song Huigyeong [Kikei Sō] (1420?), Haedongjegukgi [Kaitoushokokuki] edited by Sin Sukju [Shin Shukushū] (1471), etc., Japanese place names are written with Chinese characters, using the same representation for daku-on as seen in the Chinese literature mentioned in fn. 41 above.
Special thanks go to Choi Kyung-Ae, who helped me to transliterate the authors and titles using the new Korean romanization system.
43. Precisely speaking, Rodriguez writes in Arte da Lingoa de Iapam [Nihon daibunten] (see fn. 40 above) that nasality is always observed before D, DZ, G (/ d, g /, where / d / includes [ dz ] before / u /) and occasionally before B (/ b /). However, considering examples in the Korean literature (see fn. 42 above) and k-n-zh for kaze 'wind' (Helinyuli), hung-bng for obou 'monk' (Helinyuli), yn-b-j for ibiki 'snore' (Ribenjiyu), etc. in the Chinese literature, it is reasonable to assume that daku-on were generally accompanied by nasality around the 14th century.
44. It is fair to note that an anonymous reviewer kindly pointed out to me that this is not a well-established or uncontroversial view. Rodriguez does not say that daku-on are consistently accompanied with nasality (see fn. 43 above). Some argue that the coda nasal before daku-on may not represent nasality but rather emphasize the voicing of the following obstruent (e.g. Fukushima 1959). However, it is also a fact that there is no strong evidence to refute the assumption (that daku-on were generally accompanied with nasality); and therefore, not many scholars seem to reject it. This issue, whether daku-on were consistently nasalized or not, is still debated.
45. Endō (1989) does not provide any convincing explanation or evidence for what he calls "neutralized". However, his hypothesis itself is actually very suggestive, especially when we compare the onomatopoeias that contrast in sei-on, daku-on, and nasals. It is well known that the sei-daku contrast in Japanese onomatopoeia results in contrastive meanings, such as positive and negative, clear and dirty, light and heavy, etc., respectively. For example, one's eyes are kirakira when enjoying something, while they are giragira when in hunger. Let us consider the following examples: surusuru/zuruzuru vs. nurunuru and torotoro/dorodoro vs. noronoro. (All examples here are mine.) Surusuru refers to smooth action such as sliding down a rope, while zuruzuru refers to the friction caused by moving a heavy object; and nurunuru refers to an oily and/or slippery surface, which is much closer to surusuru (lubricious) and the opposite of zuruzuru (frictional). Moreover, turuturu also refers to a slippery surface. Moving on to the other set, torotoro refers to slow action or transition, while dorodoro refers to a state or movement which is jelly-like or pulpy; and noronoro describes movement that is slow, which is closer to the meaning of torotoro.
These are interesting, but it is unknown how these impressions/feelings of modern people can serve as evidence for determining the intuitions of people hundreds of years ago. See Komatsu (1981: 107–110) for an interesting discussion of the [nasal]–[daku-on] ([ n ]–[ d ]) contrast that functions similarly to the sei-daku contrast in meaning as described above. (His example is nora vs. dora, both of which have a meaning of 'stray'.)
46. Hamada (1952a: 27f) also assumes the change (nasalization over daku-on) in the same period, but for different reasons. He assumes that the change was triggered by influence from Chinese, which had pre-nasalized obstruents around this period. However, many others take a prudent attitude to this assumption. Considering that education, or knowledge of Chinese, in that period was strictly limited to a small set of people, it is difficult to accept the assumption that the knowledge (and preference) of nasalized obstruents by such a limited group changed the sound quality of the entire Japanese sound system.
47. Its existence is confirmed by foreign literature (see fn. 40–43). However, not much is actually known about the emergence of the nasalization of daku-on (see also section 4 below).
48. The nasality then weakened to something like [ mb ] a few centuries later, to be realized as described by Rodriguez (see fn. 40 above).
49. Mabuchi (1971) also indicates the sound change of [ b ] in this period.
50. In the previous paragraph, it is stated that the sei-daku distinction was "rather" or "tentatively" stabilized. This is because daku-on had been inconsistently specified in the Japanese literature throughout the Edo Era, as mentioned in section 2.4 above.
51. Many do not deny this view. Hamada (1960: 78) notes this, and so does M. Takayama (1992b: 45). Many fix their pronunciations only after they study the spellings of those words. That is, for Japanese speakers, the voicing distinction is still hard to make just through listening.
52. In other words, it is social rather than linguistic (Vance 1987: 107).
53. It is also suggestive that most Japanese speakers cannot easily name the accent pattern of a word, e.g. HLLL, LHL, etc. (L = Low and H = High), even if they can distinguish the patterns. The sei-daku distinction in the past may have been similar: even if people could distinguish the sounds in speech, they could not write them down correctly.
54. [ g ] is supposed to have been [ ŋg ] in the past, and [ ŋ ] would have developed from [ ŋg ] because [ ŋ ] was not a phoneme. It is sometimes taught that / g / is prescriptively [ ŋ ] word-internally, but this is actually dialectal. The velar nasal is not clearly observed in the dialects of the western half of Japan. Even in some dialects of the eastern half of Japan, the younger generations do not produce the nasal consistently (e.g. Tokyo dialect).
55. Whenever nasality is unspecified, e.g. "voiced obstruents", the sounds in question are oral (non-nasal).
56. Takayama (1992a, b) says that he followed Hayata (1977a,b), although Hayata,
in fact, just mentioned it without providing any evidence in his articles.
57. Yamane Tanaka (this volume) discusses voicing and nasality of obstruents in
Tohoku dialects.
58. This is one of the reasons this article focuses on introducing different views,
rather than proposing or concluding something new.
1. Introduction
With a focus on generative restrictiveness and the significance of cross-language variation in source contrasts, this paper identifies the phonological primes responsible for creating laryngeal-source contrasts in Japanese. The arguments will be based on Element Theory (Kaye, Lowenstamm and Vergnaud 1990; Harris 1994; Harris and Lindsey 1995, 2000). Unlike theories based on SPE-type distinctive features, this theory of melodic representation recognizes primes which are monovalent and which can therefore be interpreted separately, without needing to be combined with other primes. The theory admits only two autonomous melodic categories for cross-linguistic source contrasts: one contributes aspiration and the other prevoicing. This paper claims that Japanese exploits only the prevoicing element in the representation of phonation-type contrasts, a position supported by evidence from assimilatory processes, early language acquisition and aphasia. The argument leads to the further claim that vowel devoicing is not a process triggered by the laryngeal element for aspiration, but by a manner element called 'noise'. Furthermore, in accordance with the general trend towards reducing the size of the element inventory, the paper will discuss the validity of a recent proposal to merge the prevoicing and nasal elements.
72 Kuniya Nasukawa
However, it has been acknowledged that the use of these terms is insufficient for describing the varied phonetic manifestations of the source contrasts across different languages. In articulatory terms, for example, the so-called voiced obstruent plosives b, d, g in most Slavic (such as Polish and
Russian) and Romance languages (such as Spanish and French) are typically produced in word-initial position with glottal pulsing during articulatory closure: in precise phonetic terms, they are described as voiced unaspirated. The voiceless plosives p, t, k in those languages are articulated
without glottal pulsing and are described as voiceless unaspirated. On the
other hand, the voiced plosive series of some Germanic languages such as
English and Swedish, for example, is typically produced in word-initial
contexts without glottal pulsing, and is identified as voiceless unaspirated.
The voiceless series in those languages is also articulated without glottal
pulsing, but with aspiration: it is described as voiceless aspirated, which is
thus identical to the voiced series of most Slavic and Romance languages.
The cross-linguistic differences in the phonetic realization of the two-way
contrast are often captured by voice onset time (VOT), which is the word-initial interval between the release of the stop closure and the onset of vocal-fold vibration (Lisker and Abramson 1964; Abramson and Lisker 1970).
Members of the voiced unaspirated (truly voiced) series, as found in Spanish
and French, are characterized by a relatively long lead time between the
onset of glottal pulsing and stop release. On the other hand, in the voiceless
aspirated (fortis) series found in languages such as English and Swedish,
there is a relatively long time lag between closure release and the onset of
glottal pulsing. In the voiceless unaspirated (lenis or neutral) series, which is
common to all languages of the world, there exists either a relatively short
or zero time lag between closure release and the onset of glottal pulsing.
It is possible to identify the perceptual source distinction using experimental phonetic methods. Spectrographic analysis reveals how the VOT
value in an initial CV context is reflected in the cutback of the onset of the
first formant (F1) relative to the higher formants. According to experimental
tests involving the discrimination of contrasting initial CV contexts (Lisker
and Abramson 1970), for English speakers the perceptual boundary of the
b-p distinction is observed in the VOT value from +20 to +30 msec,
whereas the b-p boundary for Spanish speakers lies in the VOT value from +10
to +20 msec.
To avoid confusion arising from the use of the cover terms voiced and
voiceless, the experimental phonetics literature often employs the terms
voiced unaspirated, voiceless aspirated and voiceless unaspirated to refer to
long voicing lead (negative VOT), long voicing lag (positive VOT) and
short or zero voicing lag (zero VOT) in the description of the interval between the stop release and the onset of voicing. With respect to these three
VOT categories, languages are classified into at least four groups (there
exists a fifth type which will be discussed in §5) as follows:
(1)
             Contrast     Short lag   Long lead   Long lag
  Type I     One-way      ✓
  Type II    Two-way      ✓           ✓
  Type III   Two-way      ✓                       ✓
  Type IV    Three-way    ✓           ✓           ✓
Types II and III are systems exhibiting a two-way source contrast: Type II, to which Spanish and French belong, displays short-lag and long-lead plosives; Type III, to which English and Swedish belong, shows short-lag and long-lag plosives. There are two other types of source contrast: Type I, found in Finnish, exhibits only the short-lag series; Type IV, on the other hand, the system observed in Thai and Burmese, employs all three source categories. It should be noted that the short-lag series is always present in every language system.
Japanese, like Spanish and French, is considered to belong to Type
II, since it exhibits a contrast between short voicing lag and long voicing
lead in word-initial plosives: in initial consonants, the language does not
show the aspiration which is a property of Type III languages. Also, experimental tests for the discrimination of source contrasts in initial CV contexts
in Japanese (Shimizu 1977) reveal that the perceptual boundary of the b-p
distinction lies in the VOT value from +15 to +20 msec, which is almost
identical to the result for Spanish.
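The perceptual-boundary figures above lend themselves to a small illustration. The following sketch is my own, not the authors': it encodes each language's perceptual boundary as the midpoint of the range reported in the text and classifies a word-initial plosive percept from its VOT.

```python
# Illustrative sketch: classifying a word-initial plosive percept as /b/
# or /p/ from its VOT, using language-specific perceptual boundaries.
# Boundary values are midpoints of the ranges cited in the text
# (Japanese: +15 to +20 msec, Shimizu 1977; Spanish: +10 to +20 msec;
# English: +20 to +30 msec, Lisker and Abramson 1970).

BOUNDARY_MS = {
    "japanese": 17.5,  # Type II: short lag vs. long lead
    "spanish": 15.0,   # Type II
    "english": 25.0,   # Type III: short lag vs. long lag
}

def perceived_category(vot_ms: float, language: str) -> str:
    """Return 'b' (voiced percept) or 'p' (voiceless percept)."""
    return "b" if vot_ms < BOUNDARY_MS[language] else "p"

# A plosive with a VOT of +22 msec is heard as /p/ by Japanese
# listeners but still as /b/ by English listeners.
print(perceived_category(22, "japanese"))  # p
print(perceived_category(22, "english"))   # b
```

The point of the sketch is simply that the same acoustic token falls on different sides of the category boundary depending on the listener's language.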
frameworks, and have been variously labeled as distinctive features (SPE,
et passim), components and gestures (in the framework of Dependency
Phonology: Anderson & Ewen 1987, van der Hulst 1989), particles
(Schane 1984, 1995) and elements (Kaye, Lowenstamm and Vergnaud 1985;
Harris 1990, 1994; Harris and Lindsey 1995, 2000).
In order to represent cross-linguistic source distinctions, various types of
phonological primes have been proposed within this range of theories. Distinctive feature theories, for example, use the bivalent features [voice] and
[tense] (see Jakobson, Fant & Halle 1952, Chomsky and Halle 1968, and
others). In addition to these, we find references to [heightened subglottal
pressure] and [glottal constriction] in Chomsky and Halle (1968), [spread
glottis], [constricted glottis], [stiff vocal cords] and [slack vocal cords]
in Halle and Stevens (1971), and [fortis] in Kohler (1984).
In frameworks employing monovalent distinctive features, the single-valued prime [voice] can be found (see Itô, Mester & Padgett (1995),
Lombardi (1995) and others). Also, [stiff vocal cords] is employed in Halle
& Stevens (1991) while [spread glottis] is used in Jessen & Ringen (2001).
In Dependency Phonology (Anderson & Ewen 1987), which employs
only monovalent primes, the addition of the |V| component in the phonatory
sub-gesture represents voicing while its absence indicates voicelessness. In
the variant of Dependency Phonology known as Radical CV phonology
(van der Hulst 1995), C under the phonatory sub-gesture is for constricted
glottis, Cv for spread glottis, V for obstruent voicing and the absence of
these components for voicelessness in obstruents.
Particle Phonology investigates mainly vocalic systems and does not
provide any significant details about the melodic representation of source
contrasts in consonants.
Like Dependency Phonology, Element Theory can be traced back to
Anderson & Jones (1974). However, more recent developments in Element
Theory have necessitated a change of name, and the label Government/
Licensing-based Phonology (Kaye, Lowenstamm & Vergnaud 1985, 1990)
has become the accepted term. In this framework the two elements [L] and
[H] are employed to express laryngeal-source contrasts: [L] for long voicing
lead (truly voiced); [H] for long voicing lag (voiceless aspirated); the absence of these elements stands for short or zero voicing lag (neutral).
Most studies investigating the phonological phenomena of Japanese have
traditionally employed the bivalent feature [voice] to represent the two-way source contrast (Itô & Mester 1986, et passim): [+voice] and [−voice]
for long voicing lead and short voicing lag respectively. In recent theories
which claim that source contrasts involve a privative prime (Rice 1992, Itô & Mester 1993; Itô, Mester & Padgett 1995), the monovalent feature [voice]
is typically employed: the existence of [voice] refers to long voicing lead
and its absence to short voicing lag.
The first Element Theory analysis of source contrasts in Japanese was
given in Shohei Yoshida (1990, 1996), where both elements [L] and [H] are
employed. Without taking into consideration any of the cross-linguistic
differences regarding source contrasts, he utilises [L] for voiced obstruents and [H] for voiceless cognates. An alternative approach such as
Nasukawa (1995), however, succeeds in incorporating the cross-linguistic
facts concerning source distinctions into the melodic analysis of Japanese.
Specifically, it is claimed that [L] is the only element required for the
source contrast in Japanese: its existence is interpreted as long voicing lead
(voiced series of obstruents) and its absence as short voicing lag (voiceless series of obstruents). Furthermore, by merging [L] and the murmur/nasal element [N], Nasukawa (1998) proposes that long voicing lead
phonetically manifests itself when [N] is the head of a given melodic
expression.
In the context of Element Theory, this paper will consider (i) how the
phonological primes involved in laryngeal-source contrasts contribute to
the internal organisation of individual sounds, and (ii) how the representation of those primes succeeds in incorporating implicational universals as
well as the phonological properties associated with source contrasts.
monovalent prime. For instance, like some melodic theories employing
gestures or particles, voice contrasts are encoded by the presence versus
the absence of the voice prime in a given expression. Some feature-based
theories such as Itô, Mester & Padgett (1995) and Lombardi (1995) take the
same theoretical stance. In contrast, the competing notion of bivalent oppositions is exploited by orthodox distinctive-feature theories (SPE, et passim),
where an opposition is derived by specifying plus and minus values to a
given prime. For instance, the voice contrast is captured by the voice prime,
to which a plus or minus value is specified. Under this view, bivalency
produces at least three possibilities: [+voice] is active in processes; [−voice] is active; and both [+voice] and [−voice] are simultaneously active.
As the literature indicates, however, the processes involving laryngeal-source contrasts in Japanese do not employ these three options in equal measure: [+voice] is traditionally said to trigger most dynamic processes such as postnasal voicing and compounding; [−voice] rarely triggers processes except for high vowel devoicing between voiceless consonants; and no simultaneous participation of both features is attested. Furthermore, in the context of a rule-based multi-stratal model, the bivalent format substantially over-generates, predicting unattested processes. In a model exploiting the notion of monovalency, on the other hand, the voice contrast
is captured by the presence or absence of the voice prime. In this case,
only the prime which is present in a given context can be active for processes such as voicing assimilation. Its absence means a failure to participate in any processes. This adequately describes the asymmetric processes
involving the voice prime, yet it does not generate processes which are
unattested.
Secondly, (2b) states that an element can be interpreted separately without needing to be combined with other elements. Indeed, this property is
common to all theories which adopt the notion of monovalency. This suggests that information from phonological representations is accessible by
the sensorimotor systems. As Harris & Lindsey (1995) discuss, this approach succeeds in eliminating redundancy rules of the kind which fill in
predictable feature values, and instead pursues a mono-stratal approach to
phonology. In orthodox feature theories, on the other hand, a single prime
cannot be interpreted without being harnessed to the signatures of other
primes. This implies that the minimal units of phonetic interpretation are
segments, not features.
The element-based approach (Harris & Lindsey 2000) is also characterised by (2c), which states that elements are not defined by properties such
as tongue height or formant height: rather, they are sound images which
comprise information-bearing patterns that humans perceive in speech
sounds. They should be detectable through the traditional method of determining the manner in which sounds are organized into systems and natural
classes. This view is based upon the assumption that speech sounds are
represented cognitively as auditory images, primary media which are neutral between speaker and hearer: speakers transmit and monitor such information and listeners receive it. This position is rarely touched upon in the
literature of other representational approaches such as orthodox feature
theory, where features are chiefly defined in terms of articulation or raw
acoustics (Chomsky & Halle 1968; Clements & Hertz 1991) or otherwise
in terms of coexisting articulatory and acoustic specifications (Flemming
1995). As a challenge to non-element-based theories, Harris & Lindsey
claim that articulation and raw acoustics are not information-bearing categories: articulation is a delivery system for linguistic information and raw
acoustics is a mere outcome delivered by articulation.
5. Laryngeal-source elements
Element Theory employs two autonomous melodic primes: the low source
element labelled [L] and the high source element labelled [H], which are
phonetically interpreted in obstruents as long voicing lead (true voicing,
prevoicing) and long voicing lag (aspiration) respectively. The auditory
images of these elements are, as the names imply, low source and high
source, the acoustic patterns of which appear in spectrograms as lowered
fundamental frequency (F0 down) and raised fundamental frequency (F0 up)
respectively. The nearest corresponding features might be taken to be [slack
vocal cords] and [stiff vocal cords] respectively (Halle & Stevens 1971,
1991), although the equivalence is found only in terms of glottal execution.
Whereas [slack vocal cords] and [stiff vocal cords] are defined in terms of
articulation, [L] and [H] may be treated as phonologically defined auditory
images. In addition, [slack vocal cords] and [stiff vocal cords] are intended
only for representing laryngeal activity (that is, phonation-type distinctions
in consonants and also tonal distinctions in syllable nuclei) (Bao 1990,
Halle & Stevens 1991, cf. Yip 1980), whereas [L], as we will see in §7, is
also relevant to the representation of nasality (Nasukawa 1995, 1998,
2005a; Ploch 1999). Harris (1998) claims that the availability of the two
monovalent elements [L] and [H] implies the possibility of the following
four combinations:
(3)
  Source elements   Phonetic manifestation
  (none)            Short or zero voicing lag (neutral)
  [L]               Long voicing lead (truly voiced)
  [H]               Long voicing lag (voiceless aspirated)
  [L, H]            Long voicing lead and lag, murmur (breathy)
Besides the specification of [L] alone and of [H] alone, there are two further possibilities: both elements are unspecified in obstruents, or both elements are simultaneously specified in a given melodic expression. The former option, where no source category is specified, manifests itself phonetically as
short or zero voicing lag in obstruents. The latter option, where both source
categories are specified together, is interpreted as long voicing lead and lag
(breathy voice), which is associated with the voiced aspirated plosives
found in, for instance, Hindi and Gujarati.
These combinatorial specifications are selected by parameter on a language-by-language basis. The typology of source-element specifications is
illustrated below:
(4)
                  E.g. Finnish   Spanish   English   Thai   Hindi
  Non-specified        ✓         ✓         ✓         ✓      ✓
  [L]                            ✓                   ✓      ✓
  [H]                                      ✓         ✓      ✓
  [L, H]                                                    ✓
In presenting the VOT typology in (4), Harris (1994, 1998) claims that its
arrangement straightforwardly captures implicational universals and allows
us to identify those segmental classes that are active in processes involving
laryngeal source.
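The typology in (4) can be read as a lookup from language type to the set of licensed source-element specifications. The following encoding is my own illustrative sketch (the type labels and example languages come from the text; the data structure is not the authors' formalism):

```python
# Sketch of the source-element typology in (4): each language type
# licenses a set of specifications on top of the universal baseline
# (frozenset() = no source element = short or zero voicing lag).

TYPOLOGY = {
    "I":   [frozenset()],                                       # e.g. Finnish
    "II":  [frozenset(), frozenset({"L"})],                     # e.g. Spanish, Japanese
    "III": [frozenset(), frozenset({"H"})],                     # e.g. English
    "IV":  [frozenset(), frozenset({"L"}), frozenset({"H"})],   # e.g. Thai
    "V":   [frozenset(), frozenset({"L"}), frozenset({"H"}),
            frozenset({"L", "H"})],                             # e.g. Hindi
}

def n_source_contrasts(lang_type: str) -> int:
    """Number of laryngeal-source series a language type distinguishes."""
    return len(TYPOLOGY[lang_type])

# Every type includes the bare baseline, capturing the implicational
# universal that the short-lag series is present in all languages.
assert all(frozenset() in specs for specs in TYPOLOGY.values())
print(n_source_contrasts("II"))  # 2: neutral vs. truly voiced
```

The markedness claim in the text then falls out as set size: more marked systems simply license more (and more complex) specifications on top of the same baseline.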
With respect to implicational universals, the relative markedness of
source contrasts is explained in terms of compositional complexity. The
unmarked setting is represented by the absence of source elements and corresponds to the short or zero voicing lag found in all languages. This non-specification of source elements is regarded as the baseline on to which source elements are superimposed: so the existence of any source elements implies the presence of the neutral baseline series.
6. Japanese as a Type II language
6.1. Phonetic evidence
There are several pieces of evidence to support the claim that Japanese belongs to the group of Type II languages in (4).1 One piece of phonetic evidence, as §2 has already mentioned, comes from the observation that, like Spanish and other languages belonging to the Type II category, aspiration (which is a characteristic of two-way source contrasts in Type III languages) cannot be detected in typical contexts for VOT measurement in Japanese. The voiceless series of the two-way source contrast in Japanese
comprises voiceless unaspirated consonants. This is supported by the results of tests for source discrimination in initial CV contexts. Shimizu (1977)
claims that the perceptual boundaries of the b-p, d-t and g-k distinctions lie
in the VOT value from +15 to +20 msec, from +20 to +30 msec, and from
+20 to +30 msec, respectively. These characteristics of VOT perception are
almost identical to those found in a Type II language like Spanish. In contrast, as Shimizu argues, Type III languages (showing another two-way
source contrast) such as English exhibit rather different values for the perceptual boundaries in the same b-p, d-t and g-k distinctions: from +20 to
+30 msec, from +30 to +40 msec, and from +30 to +40 msec, respectively.
6.3. Devoicing
Turning to the so-called voiceless members of the two-way laryngeal contrast in Japanese, these involve no source specification and are therefore
predicted to be phonologically inert. That is, the members of this neutral
series of obstruents can undergo laryngeal-source assimilation as in postnasal voicing but cannot trigger the process.
At first sight, however, phenomena such as devoicing do seem to contradict this prediction. Both this volume and also the wider literature report
that Japanese exhibits the notable exception of vowel devoicing, which most
frequently occurs when a high vowel is flanked by voiceless obstruents: e.g.
aki̥ta 'Akita' (place name). According to one view, as explained by Tsujimura (1996: 27–28), the voiceless properties of both flanking consonants affect the intervening vowel: the value of [voice] in the vowel changes from plus to minus. For this reason alone, it is often assumed that the voiceless
property in obstruents can be phonologically active in Japanese.
Within the framework of Element Theory, however, unlike distinctive
feature theories, the voiceless (neutral) obstruents in Japanese, as well as
all vowels and sonorant consonants, have no source specification which can
act as a trigger for laryngeal assimilation. Instead, I assume that vowel devoicing in Type II languages is related to the active status of the noise element [h], which manifests itself acoustically as aperiodic energy. The closest corresponding feature might be [+continuant], although unlike [+continuant], [h] is present not only in fricatives and affricates but also in plosives (Harris and Lindsey 1995: 73–74). It is always specified in the internal organisation of obstruents (except the glottal stop [ʔ]). It will be recalled
that the context which triggers high vowel devoicing makes crucial reference to obstruents: segments flanking high vowels must be not only voiceless but also obstruents. In element terms, this condition is described by
the statement that [h] (which is required in obstruents) in non-nuclear positions flanking a high vowel nucleus affects the nuclear position only if [h]
is not combined with any laryngeal-source specification (that is, no [L] in
Japanese which indicates a neutral obstruent).
Following the analyses of the interpolation of nasality in Cohn (1993)
and Nasukawa (1995, 2005a), I assume that the extension of [h]'s phonetic
signature over a flanked nucleus results from the phonetic interpolation of
the two [h]s in the flanking obstruents. This can be basically attributed to
the quality of Japanese high vowels, which are relatively centralised and
are frequently involved in various processes. For example, they are often
used as epenthetic vowels because of their less salient profile (e.g. /tas/ 'to add' + /ta/ (past tense suffix) → tasita in Yamato Japanese, and English ski → sukii in lexical borrowing), and often undergo assimilatory processes (e.g. eiga 'film' → eega and uma 'horse' → mma). The spontaneous voicing of such phonetically weak vowels tends to be overshadowed by the neighbouring [h]'s, and the result is generally perceived as devoicing.
This kind of interpolation is not achieved when high vowels are flanked
by obstruents with long-lead voicing, that is, when [h] co-exists with [L]
in a single expression. This is due to an acoustic effect whereby the characteristics of aperiodic energy are partially suppressed by the co-existence of
long-lead glottal pulsing.
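The triggering condition just described, a high vowel flanked by segments that carry [h] but no [L], can be stated as a simple check. The segment-to-element assignments below are my own simplified illustrative fragment, not the author's formalism:

```python
# Minimal sketch of the high-vowel devoicing environment: both flanking
# segments must carry the noise element [h] (i.e. be obstruents) and
# must lack the source element [L] (i.e. be laryngeally neutral).
# The element sets here are an illustrative fragment, not a full grammar.

ELEMENTS = {
    "k": {"h"}, "t": {"h"}, "s": {"h"},   # neutral obstruents: [h], no [L]
    "g": {"h", "L"}, "d": {"h", "L"},     # truly voiced obstruents
    "m": set(), "r": set(), "a": set(),   # sonorants/vowels: no [h]
}
HIGH_VOWELS = {"i", "u"}

def devoices(left: str, vowel: str, right: str) -> bool:
    """True if the vowel is high and both neighbours are neutral obstruents."""
    def neutral_obstruent(seg: str) -> bool:
        elts = ELEMENTS.get(seg, set())
        return "h" in elts and "L" not in elts
    return (vowel in HIGH_VOWELS
            and neutral_obstruent(left) and neutral_obstruent(right))

print(devoices("k", "i", "t"))  # True:  the aki̥ta case
print(devoices("g", "i", "t"))  # False: [L] blocks interpolation of [h]
print(devoices("k", "a", "t"))  # False: not a high vowel
```

Note how the blocking effect of long-lead voicing falls out directly: a segment specified for [L] fails the neutral-obstruent test even though it carries [h].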
lead and short-lag contrasts emerge. Assuming that unmarked melodic representations are acquired before marked ones, we predict that acquisition
is characterized by two separate stages: first, the non-specification of [L]
(neutral laryngeal state), then later the specification versus non-specification of [L]. That is to say, the neutral state (baseline) of laryngeal-source
specification is preferred in the earlier stages of plosive production. This unmarked status of short-lag plosives in the speech production of very young
children is also backed up by acquisition studies involving other languages
(see Harris 1998: 179, for a detailed discussion and references therein).
Reports of aphasic deficit in Japanese can also be accounted for in similar terms. For instance, in Broca's and global aphasia the laryngeal-source
contrasts are collapsed, and the production of stops tends to converge on
the short-lag region (Itoh, Tatsumi and Sasanuma 1986). This convergence
can be regarded as a loss of the categorical representation [L]. Similar patterns are attested across different languages: e.g. the loss of [H] in English
(cf. Blumstein, Cooper, Statlender, Goodglass and Gottlieb 1980) and the
loss of [L] and [H] in Thai (cf. Gandour and Dardarananda 1982, 1984).
(5)
  Element         Phonetic manifestation
  [N]             Nasality
  [N] (headed)    (Long-lead) voicing
The first formal evidence that voicing and nasality are two instantiations of
the same category is provided in Nasukawa (1995, 1998), which presents an integrated approach to the paradoxical behaviour of voice and nasality: nasals appear to be specified for voice in postnasal voicing assimilation (e.g. /šin/ + /ta/ → šinda 'died'), while they behave as if they have no voice in Lyman's Law, which allows only a single voiced obstruent in a particular domain (e.g. kanade 'a play, a dance', *ganade). By adopting the representations in (5), postnasal voicing is treated as the extension of [N] across
both positions of an NC cluster, where only the element in the second position is promoted to a headed status.7 On the other hand, the transparency of
nasal obstruents to Lymans Law follows from [N] failing to be headed.
Furthermore, the dual interpretation of [N] is supported by some robust
correlations between voicing and nasality. A typical instance of such a relation is postnasal voicing assimilation, found not only in Yamato Japanese,
but also in many languages such as Quichua and Zoque, where an obstruent
preceded by a nasal is obligatorily voiced. Another example of the relation
between voice and nasal is found in processes involving alternations between voiced obstruents and their nasal reflexes such as fully-nasalised
and prenasalised voiced cognates. This kind of process is often observed in
intervocalic contexts. For example, voiced obstruent prenasalisation is witnessed in Northern Tohoku Japanese, languages of the Reef Island-Santa
Cruz family, those in the Pacific area and several Bantu languages; in the
intervocalic context, conservative Tokyo Japanese exhibits voiced-velar-obstruent nasalisation. Furthermore, in the verbal inflexion of Yamato
Japanese, the stem-final b in a verbal stem such as tob 'to fly' is realised as
a nasal that is homorganic with the initial obstruent of a suffix such as -te
(gerundive).8
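The postnasal voicing pattern discussed above can be sketched as a simple string mapping. The transcription and the helper function are my own illustration; real Japanese morphophonology is of course richer than this:

```python
# Sketch of postnasal voicing assimilation in Yamato Japanese: a
# suffix-initial voiceless obstruent surfaces as voiced immediately
# after a stem-final nasal (e.g. šin + ta -> šinda 'died').
# Simplified romanization, for illustration only.

VOICED = {"t": "d", "k": "g", "s": "z", "f": "b", "h": "b"}
NASALS = {"n", "m"}

def postnasal_voicing(stem: str, suffix: str) -> str:
    """Voice a suffix-initial obstruent after a stem-final nasal."""
    if stem[-1] in NASALS and suffix[0] in VOICED:
        suffix = VOICED[suffix[0]] + suffix[1:]
    return stem + suffix

print(postnasal_voicing("šin", "ta"))   # šinda 'died'
print(postnasal_voicing("tasi", "ta"))  # tasita: no nasal, no voicing
```

In element terms the sketch corresponds to the [N]-extension analysis: the nasal's [N] spreads into the following position, where it is promoted to headed status and is therefore interpreted as voicing.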
According to Nasukawa (2005a), the assignment of head status to voicing
rather than to nasality can be justified as follows. First, the representations
in (5) can encode an implicational universal between voicing and nasality.
(6)
                       Voicing   Nasal
  e.g. Spanish, Thai      ✓        ✓
  (unattested)            ✓        ✗
As illustrated in (6), we never encounter a system which displays truly-voiced plosives without also having nasals. This observation allows us to
express the implication that the existence of voicing implies the existence
of nasal. This implicational universal is straightforwardly captured by the
representations in (5).
Second, the representations in (5) encode the optional status of voicing.
Almost all languages exploit contrastive nasality, whereas voicing is parametrically controlled. The optional status of voicing is reflected in the non-integral nature of headedness: some systems permit [N] to be headed, while
others disallow this as a structural possibility.
Furthermore, the representations in (5) reflect differences in complexity
between voicing and nasality. In the analysis of prenasalisation and velar
nasalisation by Nasukawa (1999), nasality must be less complex structurally
than voicing, since the latter property is often suppressed in intervocalic
contexts and instead nasality is interpreted (e.g. in some dialects of Japanese,
some Western Indonesian languages and several Bantu languages). According to Harris (1994, 1997), segmental structure is less complex in weak
positions than in strong positions, a state of affairs predicted by the proposed representations in (5).
Finally, the elimination of [L] from the melodic inventory for phonation-type contrasts complements a recent analysis of tone and intonation. Instead of [L] representing low pitch in nuclear positions, for example, the non-specification of [H] (which is interpreted as high tone) in nuclear positions
phonetically manifests itself as low pitch in the analysis of Japanese pitch
accentuation (Yuko Yoshida 1995). Also, in the analysis of intonation, as
an alternative to the specification of [L], a prosodic boundary unassociated
to [H] is interpreted as low pitch (Cabrera-Abreu 2000).
According to this revised approach, the specification of source contrasts
can be summarized as follows:
(7)
  Source elements     Phonetic manifestation
  Non-specified       Short or zero voicing lag (neutral)
  [N] (headed [N])    Long voicing lead (truly voiced)
  [H]                 Long voicing lag (voiceless aspirated)
  [N, H]              Long voicing lead and lag, murmur (breathy)
The two-way source contrast of Japanese (a Type II language) is then represented by the contrast between a non-specified neutral baseline and headed
[N].
8. Summary
The main purpose of this paper has been to present an analysis of laryngeal-source contrasts in Japanese within the scope of Element Theory. Exhibiting
cross-linguistic variation in VOT values, the contrasts are derived by the
non-specification of any source elements versus the specification of the low
source element, which correspond phonetically to the laryngeal properties
of neutral and long voicing lead, respectively. This is supported by a number
of phonologically-active phenomena involving true voicing, the preference
for the neutral laryngeal state in early language acquisition and the convergence of VOT values on the neutral region in cases of aphasia in Japanese.
Finally, following a recent proposal to merge the low source and nasal
elements, this paper has shown how long voicing lead is represented by a
headed nasal element ([N]) in a given expression. To support this position,
evidence has come from the correlations observed between voice and nasal,
as well as from implicational universals. To conclude, the Japanese laryngeal-source
the bare source baseline versus the same baseline with a headed nasal
element superimposed on to it.
Notes
1. The first Element Theory analysis of source contrasts in Japanese was given in
Shohei Yoshida (1991, 1996), where both elements [L] and [H] are employed.
Without taking into consideration any of the cross-linguistic differences regarding source contrasts, he utilises [L] for long voiced obstruents and [H]
for voiceless cognates.
2. An exception arises when a voiced obstruent is already specified in a given
lexical form. In such cases Lyman's Law requires the original voiceless
consonant to remain unchanged.
3. There have been several proposals to extend the element-reducing programme
in various ways, for instance by merging aspiration with noise and coronality
with openness (van der Hulst 1995; Marten 1996; Charette & Göksel 1998;
Kula & Marten 1998). The conceptual advantages of this approach are clear.
However, the empirical consequences have yet to be fully worked out.
4. Instead of [N], Ploch (1999) and others use [L] and eliminate [N] from the
element inventory. However, I have opted for [N] rather than [L] to represent
the correlation between voicing and nasality, since the bare element without
headship status contributes nasality.
5. In this treatment, the headedness of a given element is regarded as an intrinsic
property which enhances the acoustic image of the element (Harris 1994; Harris & Lindsey 1995; Backley 1998; Nasukawa 1998, 1999).
6. Within a geometry-based version of Element Theory (Backley 1998; Backley
& Takahashi 1998), Nasukawa (2005a) proposes that the contrast between
voicing and nasality is represented using the same idea: if the element [N] licenses its [comp], then it is interpreted as (long-lead) voicing, while the same
element without a licensed [comp] is interpreted as nasality.
7. See Nasukawa (2005a: §4.5) for a detailed discussion.
8. See also Ploch (1999) for further arguments to support the merger of long-lead
voicing and nasal elements.
1. Rendaku
The Japanese term rendaku, which Martin (1952: 48) translates as 'sequential voicing', refers to a morphophonemic phenomenon found in compounds
and in prefix+base combinations. A morpheme that shows rendaku has one
allomorph beginning with a voiceless obstruent and another allomorph beginning with a voiced obstruent. The rendaku allomorph (i.e., the allomorph beginning with a voiced obstruent) of such a morpheme appears
only when it is a non-initial morph in a word. The examples in (1) illustrate
the pairs of phonemes that can alternate.
(1)
  ALTERNATING   VOICELESS                VOICED
  PHONEMES      ALTERNANT                ALTERNANT
  /f/~/b/       /fune/    'boat'         /kawa+bune/  'river boat'
  /h/~/b/       /hako/    'case'         /haši+bako/  'chopstick case'
  /t/~/d/       /tama/    'ball'         /me+dama/    'eyeball'
  /k/~/g/       /kami/    'paper'        /kabe+gami/  'wallpaper'
  /c/~/z/       /cuka/    'mound'        /ari+zuka/   'anthill'
  /s/~/z/       /sora/    'sky'          /hoši+zora/  'starry sky'
  /č/~/j/       /či/      'blood'        /hana+ji/    'nosebleed'
  /š/~/j/       /širuši/  'symbol'       /ya+jiruši/  'arrow symbol'
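The alternations in (1) amount to a mapping from morpheme-initial voiceless obstruents to their voiced rendaku counterparts. The sketch below is my own (the transcription is simplified, and the `rendaku` helper is purely illustrative: as the text stresses, rendaku application is lexically irregular, so only the sound mapping itself is shown):

```python
# Sketch of the rendaku alternations in (1): the initial voiceless
# obstruent of a non-initial morph is replaced by its voiced partner.
# Whether a given compound actually undergoes rendaku is lexically
# irregular; this only demonstrates the phoneme pairings.

RENDAKU_PAIRS = {
    "f": "b", "h": "b", "t": "d", "k": "g",
    "c": "z", "s": "z", "č": "j", "š": "j",
}

def rendaku(first: str, second: str) -> str:
    """Join two morphs, voicing the second morph's initial obstruent."""
    initial = second[0]
    if initial in RENDAKU_PAIRS:
        second = RENDAKU_PAIRS[initial] + second[1:]
    return first + second

print(rendaku("kawa", "fune"))  # kawabune 'river boat'
print(rendaku("me", "tama"))    # medama 'eyeball'
```

A morph beginning with a sonorant or a vowel passes through unchanged, matching the fact that only the obstruent pairs in (1) alternate.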
Timothy J. Vance
2. Historical development
The oldest substantial texts in Japanese date from the 8th century, and the
language they represent presumably reflects a variety spoken by the aristocracy in the contemporary capital of Nara. There is general agreement that
word-medial voiced obstruents were prenasalized in Old Japanese: [ᵑg ⁿdz ⁿʒ ⁿd ᵐb] (Vance 1983: 335–337). As Unger (1977: 89) first pointed out, if
we make the plausible assumption that such prenasalization was present in
prehistoric Japanese as well, a satisfying explanation for the origin of sequential voicing is available. Hamada (1952: 23) cites the examples in (2)
to illustrate the historical process of interest in some items that developed
after the 8th century.4
(2)
/sumi+sur-i/ /suzuri/
/fumi+te/
/fude/
/ika ni ka/ /ikaga/
ink+scraper
inkstone5
letter+hand
writing brush
INTERROG+ADV+? how
POJ
The obvious candidate for the mystery syllable is the genitive particle
POJ
/n/, the ancestor of OJ/n/ and modern /no/. Attested Old Japanese vocabulary items like those in (4) suggest why rendaku was irregular (as it
continues to be in modern Japanese).8
(4)  a. OJ/akï+n+pa/  'autumn leaf'
     b. OJ/taka+pa/   'bamboo leaf'
     c. OJ/sasa+ba/   'bamboo-grass leaf'  (< POJ/sasa n pa/)
As expected, lexicalized phrases that retained genitive OJ/nö/ (as in 4a) did not show rendaku.⁹ Noun+noun compounds could have originated either by simple juxtaposition, in which case rendaku did not occur (as in 4b), or by contraction of a phrase, in which case rendaku did occur (as in 4c).¹⁰
3. Inflected words
Verb+verb compound verbs are abundant in Japanese, but they rarely show rendaku. An example is /kak-i+tor-u/ 'write down', which contains the roots of /kak-u/ 'write' and /tor-u/ 'take'. The first component verb in such a compound is invariable; it must appear in its continuative form.¹¹ The second component verb bears whatever inflectional ending is required for the compound as a whole; the citation form of a verb is the nonpast indicative. The account in §2 of the origin of rendaku provides a natural explanation for the rarity of rendaku in compounds of this type (Vance 1983). There is no reason to suppose that the components of a verb+verb compound verb were ever connected by a genitive particle or any other NV syllable in earlier stages of Japanese.
As noted in the previous paragraph, the first element of a verb+verb compound verb appears in its continuative form. The continuative of any verb is an inflectional form, and as a word on its own it functions to connect its clause to a following clause. The example in (5) illustrates with /hanaš-i/ (romanized hanashi), the continuative of /hanas-u/ 'speak'.
(5)  Tomodachi  to    hanashi,    sore  kara  nemashita.
     friend     with  speak-CONT  that  from  sleep-POLITE-PAST
     '(I) spoke with (my) friend, and after that (I) went to bed.'
/e/). An example of such a vowel-stem verb is /tabe-ru/ eat, with the nonpast indicative marked by /ru/ rather than by the /u/ of consonant-stem
verbs. The continuative of this verb is /tabe/, with no inflectional ending,
since the continuative of every vowel-stem verb is identical to its stem.15
Many verbs have a corresponding deverbal noun that is segmentally identical to the continuative, although it may be accented on a different syllable
(Martin 1952: 34). The examples in (6) illustrate, pressing English gerunds
into service as translations of the continuative forms.
(6)  /yasum-u/   'rest'¹⁶      /yasum-i/  'resting'        /yasum-i/  'vacation, break'
     /kikoe-ru/  'be audible'  /kikoe/    'being audible'  /kikoe/    'sound'
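The continuative-formation facts just described can be caricatured in a few lines: consonant stems mark the continuative with /i/, while vowel stems use the bare stem. This is a toy sketch under those two rules only (it ignores stem-final consonant mutations like /hanas-/ → /hanaš-i/):

```python
# Toy continuative formation; hyphens mark the morph boundary as in the
# chapter's transcriptions. Palatalization of stem-final consonants is
# deliberately ignored here.
def continuative(stem: str, vowel_stem: bool) -> str:
    """Consonant-stem verbs add the ending -i; vowel-stem verbs
    (whose stems end in /i/ or /e/) have no overt ending."""
    return stem if vowel_stem else stem + "-i"

print(continuative("yasum", vowel_stem=False))  # yasum-i 'resting'
print(continuative("tabe", vowel_stem=True))    # tabe
```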
Okumura (1955) claims that rendaku does not occur in compounds of inflected word plus inflected word.¹⁷ In fact, we do find examples of rendaku in such compounds, but as mentioned above, rendaku is rare in verb+verb compound verbs. Okumura's illustrative examples actually suggest a more interesting generalization. Two of those examples appear in (7).
(7)  a. /wakač-i+kak-u/  (V1+V2=V)
     b. /wakač-i+gak-i/  (V1+V2=N)
Both examples in (7) derive from the verbs /wakac-u/ 'divide' and /kak-u/ 'write', and the former, like all non-final verbal elements in compounds, appears in its continuative form /wakač-i/. The verb /wakač-i+kak-u/ (7a: V1+V2=V) is given in its citation form, with the second element bearing the nonpast affirmative ending /u/. The noun /wakač-i+gak-i/ (7b: V1+V2=N), on the other hand, does not inflect; the second element is fixed in form. Okumura's precise claim thus appears to be that rendaku will not occur in a compound which consists of two inflected words and is itself an inflected word. At the same time, the second example suggests that we should expect rendaku in a compound that consists of two verb stems but is itself a noun.
The examples just considered involve verbs. The other major class of
inflected words in Japanese is adjectives.18 Just as in the case of a verb, the
citation form of an adjective is the nonpast indicative. The adjectival nonpast indicative suffix has the invariant form /i/. The continuative form of an
adjective is always marked by the suffix /ku/ and is never identical to the
stem. When a compound contains an adjective as its initial element, the
adjective always appears as a bare stem. The examples in (8) illustrate.
(8)  /omo-i/   'heavy'  /omo-ku/   'being heavy'  /omo+kuruši-i/  'oppressive'    (cf. /kuruši-i/ 'strained')
     /haya-i/  'early'  /haya-ku/  'being early'  /haya+oki/      'early rising'  (cf. /oki-ru/ 'get up')
Some adjective stems can be used as nouns (Martin 1975: 399), as the examples in (9) show.

(9)  /maru-i/  'round'    /maru/  'circle'
4. Verb+verb compounds
A set of verb+verb compounds was collected to assess the notion that rendaku does not occur in a compound that consists of two inflected words and
is itself an inflected word. The first step in the collection procedure was to
make a list of all the non-compound verbs beginning with a voiceless obstruent that appear in Kazama (1979), a reverse dictionary that has a separate section for each part of speech. There is no point in considering verbs
that do not begin with a voiceless obstruent, since rendaku cannot affect a
vowel (as in /oboe-ru/ remember), a sonorant (as in /nom-u/ drink), or an
obstruent that is already voiced (as in /de-ru/ leave). In order to limit the
investigation to words in common use in modern Japanese, each verb on the
list was checked in a medium-size Japanese-English dictionary (Hasegawa
et al. 1986). Every verb on the list that does not appear as a headword in
this dictionary was eliminated from further consideration. Also eliminated
was every verb that contains a medial voiced obstruent (e.g., /sage-ru/ 'lower'). A compound containing such a second element is subject to a well-known constraint on rendaku called Lyman's Law (Vance 1987: 136–139): rendaku almost never affects an initial obstruent of an element that already contains a voiced obstruent. Consequently, it would not be appropriate to cite a verb+verb compound verb such as /hik-i+sage-ru/ 'pull down' as support for the claim that compounds of this form resist rendaku.
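The two screening criteria just described — a voiceless-obstruent-initial verb with no later voiced obstruent — can be sketched as a simple filter. This is a hypothetical reconstruction of the procedure, with an ad hoc segment inventory assumed for illustration:

```python
# Hypothetical sketch of the screening step: a verb (transcribed as a
# sequence of segment symbols) survives only if it begins with a
# voiceless obstruent and contains no later voiced obstruent
# (which would invoke Lyman's Law).
VOICELESS_OBSTRUENTS = set("ptkcsfhčš")
VOICED_OBSTRUENTS = set("bdgzj")

def rendaku_candidate(verb: str) -> bool:
    if verb[0] not in VOICELESS_OBSTRUENTS:
        return False  # vowel-, sonorant-, or voiced-obstruent-initial
    if any(seg in VOICED_OBSTRUENTS for seg in verb[1:]):
        return False  # Lyman's Law would block rendaku anyway
    return True

verbs = ["tor", "sage", "nom", "oboe", "kak"]
print([v for v in verbs if rendaku_candidate(v)])  # ['tor', 'kak']
```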
The next step in the data collection process was to find compounds of
the form V1+V2=V or V1+V2=N in which V2 is one of the verbs remaining
on the list described in the preceding paragraph. The original intent was to
(10) V1+V2=V [−rendaku] : /toor-i+kakar-u/  'pass by'
     V1+V2=N [+rendaku] : /toor-i+gakar-i/  'passing by'
But there are also pairs in which both the verb and the noun show rendaku and other pairs in which neither shows rendaku. The examples in (11) illustrate.

(11) a. V1+V2=V [+rendaku] : /kaer-i+zak-u/  'bloom again'
        V1+V2=N [+rendaku] : /kaer-i+zak-i/  'second blooming'
     b. V1+V2=V [−rendaku] : /mi+toos-u/    'foresee'
        V1+V2=N [−rendaku] : /mi+tooš-i/    'prospect'
In some pairs, the pronunciation of one or both members has the mora obstruent /Q/ following the continuative form of V1, as in (12a), or in place of the last syllable of the continuative form of V1, as in (12b).²²

(12) a. V1+V2=V [−rendaku] : /hane+kaer-u/  'rebound'
        V1+V2=N [−rendaku]~[mora obstruent] : /hane+kaer-i/~/hane-Q+kaer-i/  'rebound'
     b. V1+V2=V [mora obstruent] : /yoQ+para-u/  'get drunk'
        V1+V2=N [mora obstruent] : /yoQ+para-i/  'drunken person'
The continuative form of /yo-u/ get drunk (V1 in 12b) is /yo-i/. Since the
mora obstruent pre-empts rendaku (Vance 1987: 148), if the unabridged
dictionary gives a pronunciation with /Q/ as the only pronunciation of either
member of a pair, that pair was excluded from the statistics reported below.
In other pairs, the pronunciation of one or both members has the mora nasal
/N/ in place of the last syllable of the continuative form of V1, as in (13).
(13) V1+V2=V [mora nasal]
V1+V2=N [mora nasal]
each for the verb and noun in (15b). Maintaining the bias again, (15b) was
counted as a verb not showing rendaku and a noun showing rendaku.
The data set contains a total of 234 verb/noun pairs, and the table in (16)
shows how these pairs pattern in terms of rendaku.
(16)                        V+V=N
                      +rendaku   −rendaku
     V+V=V  +rendaku      10          0
            −rendaku      22        202
As the lower right cell in (16) shows, in the great majority of the pairs (202/234 = 86%), neither the verb nor the noun shows rendaku. In other words, pairs like /mi+toos-u/ 'foresee' and /mi+tooš-i/ 'prospect' (11b) are the norm. By comparison, despite the deliberately biased counting described above, only a small fraction of the pairs in the data set (22/234 = 9%) exhibit the behavior that Okumura (1955) suggests is typical. This means that pairs like /toor-i+kakar-u/ 'pass by' and /toor-i+gakar-i/ 'passing by' (10) are actually quite unusual.
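The proportions cited above can be rechecked from the cell counts; note that one cell is an inference from the stated total of 234 pairs:

```python
# Cross-tabulation of the 234 verb/noun pairs from table (16).
# Keys: (verb shows rendaku, noun shows rendaku).
pairs = {
    (True, True): 10,
    (True, False): 0,    # inferred: 234 - 10 - 22 - 202
    (False, True): 22,
    (False, False): 202,
}
total = sum(pairs.values())
print(total)                                       # 234
print(round(100 * pairs[(False, False)] / total))  # 86  (rendaku in neither member)
print(round(100 * pairs[(False, True)] / total))   # 9   (rendaku in the noun only)
```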
Needless to say, a data set consisting of entries in an unabridged dictionary certainly will not match the relevant portion of a representative native speaker's actual vocabulary. To get some idea of how serious this shortcoming might be, a well-educated native speaker went through the 234 verb/noun pairs in the data set, discarded those that were unfamiliar to her, and noted pronunciations (with or without rendaku) that differed from her own.²³ Applying the same counting bias as above to this revised data set, the pairs in this speaker's vocabulary pattern as in (17).
(17)                        V+V=N
                      +rendaku   −rendaku
     V+V=V  +rendaku       7          0
            −rendaku      13        188
The total number of pairs in this revised data set is 208, and their distribution
in the four cells of the table in (17) differs very little from the distribution of
98
Timothy J. Vance
the pairs in (16). Here again, in most of the pairs (188/208 = 90%), neither the verb nor the noun shows rendaku, and only a small fraction of the pairs (13/208 = 6%) show rendaku in the noun but not in the verb. In short, the revised data set suggests that simply relying on the dictionary entries does not lead us astray. Consequently, no attempt was made to go beyond dictionary entries for the counts reported below in §5.
The table in (21) summarizes the data collected for the six categories of
two-element compounds in which both elements are verbal or adjectival.
(21)           +rendaku   −rendaku   rendaku %
     V+V=V         16        716          2%
     V+V=N        211        258         45%
     A+V=V          7          0        100%
     A+V=N         43          4         91%
     A+A=A          8         10         44%
     V+A=A         17          3         85%
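The "rendaku %" column in (21) follows directly from the two count columns; recomputing it confirms the figures:

```python
# Raw counts (+rendaku, -rendaku) for the six compound categories in (21).
counts = {
    "V+V=V": (16, 716),
    "V+V=N": (211, 258),
    "A+V=V": (7, 0),
    "A+V=N": (43, 4),
    "A+A=A": (8, 10),
    "V+A=A": (17, 3),
}
for category, (plus, minus) in counts.items():
    pct = round(100 * plus / (plus + minus))
    print(f"{category}: {pct}%")
# V+V=V: 2%, V+V=N: 45%, A+V=V: 100%, A+V=N: 91%, A+A=A: 44%, V+A=A: 85%
```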
Unlike the table in (16), the table in (21) includes unpaired items in the two
V+V categories. Of the 732 (16+716) V+V=V items tabulated in (21), only
234 are those tabulated in (16). The remaining 498 are V+V=V compounds
for which no corresponding V+V=N compound is listed as a headword in
the unabridged dictionary. For example, the verb /okur-i+kaes-u/ 'send back' (cf. /okur-u/ 'send' and /kaes-u/ 'return') is listed, but there is no entry for a corresponding noun (which would be either /okur-i+kaeš-i/ or /okur-i+gaeš-i/).
Similarly, of the 469 (211+258) V+V=N items tabulated in (21), 235 are
V+V=N compounds for which no corresponding V+V=V compound is
listed as a headword in the unabridged dictionary. For example, the noun
/oboe+gak-i/ memo (cf. /oboe-ru/ remember and /kak-u/ write) is listed,
but there is no entry for a corresponding verb (which would be either
/oboe+kak-u/ or /oboe+gak-u/).
Including such unpaired items makes it clear that V+V=N compounds are
much more likely to show rendaku than V+V=V compounds. Nonetheless,
this difference is just a strong statistical tendency, not an inviolable principle.
Furthermore, the fact that compounds containing adjectival elements are so
likely to show rendaku means that only verbal elements exhibit this tendency. It is not a generalization that applies to all inflected-word elements.
6. Conclusion
As shown in §4, only a small minority of paired V+V=N and V+V=V items show rendaku in the noun but not in the verb (as in /toor-i+gakar-i/ 'passing by' and /toor-i+kakar-u/ 'pass by'). On the other hand, when unpaired items are taken into consideration (as in §5), it is clear that V+V=N compounds are much more likely to show rendaku than V+V=V compounds. Nonetheless, this difference is just a strong statistical tendency, not an inviolable principle. Furthermore, the fact that compounds containing adjectival elements are so likely to show rendaku (as demonstrated in §5) means that only verbal elements exhibit this tendency. It is not a generalization that applies to all inflected-word elements. Incidentally, if rendaku originated as described in §2 above, the behavior of adjectival elements is a mystery, since there is no reason to suppose that the two elements in a compound containing an adjectival element were linked by a syllable of the form NV at some time in the past.
Notes
1. Many linguists prefer to analyze [h], [ç], and [F] as allophones of a single phoneme except in borrowings. I am assuming a uniform phonemic inventory for all vocabulary strata and a split that has resulted in a contrast between [F] and [h]/[ç] (with [h] appearing before /e/, /a/, or /o/ and [ç] appearing before /i/ or /y/). Either way, the rendaku alternation is not simply a matter of voicing. The ancestor of modern /h/ and /f/ was pronounced [p], although there is some controversy about how long the [p] pronunciation persisted in the central dialects (Kiyose 1985). See also Ohno (this volume, §4.2).
2. In Vance (1987: 24), I said that the two allophones of /z/, [dz] and [z], are distributed as follows: [dz] word-initially or immediately following the mora nasal /N/, and [z] elsewhere. The actual distribution is certainly not this clean, but there is no contrast, and the two are unquestionably allophones of a single phoneme. The modern rendaku pairing of /z/ with /c/ and /s/ reflects the historical merger of a voiced affricate and a voiced fricative, and so does the pairing of /j/ with /č/ and /š/.
3. Old Japanese also had a corresponding series of phonemes realized as voiceless obstruents, two phonemes realized as nasals, and two phonemes realized as semivowels. The entire Old Japanese consonant inventory is typically transcribed phonemically as /p t s k b d z g m n r y w/. For a recent attempt at phonetic reconstruction of the entire Old Japanese phonological system, see Miyake (2003).
4. The etymologies in (2) are reasonably secure and are given in Nihon Daijiten Kankōkai 1972–76, although Miller (1967: 213–14) is dubious about this etymology for /fude/. The earliest attestations for the shortened forms range from ca. 900 for /ikaga/ to ca. 1000 for /fude/. Although the earliest attestation for /sumi+suri/ is from the tenth century, the other two long forms are attested from the eighth century.
5. The hyphen in /sur-i/ separates what are commonly analyzed as a verb stem and an inflectional ending. See the discussion in §3 for details.
6. The mora nasal /N/, which occurs syllable-finally in modern Japanese, is a later development (Hamada 1955; Vance 1987: 56–57).
7. All Old Japanese examples are marked with a superscript OJ. The transcription conventions follow Miller's (1986: 198) slightly modified version of the system first adopted by Mathias (1973) and endorsed by Martin (1987: 50). The transcription reflects the fact that many modern standard syllables with one of the vowels /i e o/ correspond to two distinct eighth-century syllables. (For details, see Lange 1973 and Shibatani 1990: 125–139.) For each such eighth-century pair, it is standard practice to label one syllable type A (kō-rui) and the other type B (otsu-rui), following Hashimoto (1917: 173–186). Some researchers construe the phonological differences between the type-A and type-B syllables as vowel-quality distinctions; others construe them as distinctions between syllables with and without a glide: CV vs. CGV. In any case, the transcription adopted here represents type-A syllables with a circumflex over the vowel /î ê ô/, type-B syllables with a diaeresis over the vowel /ï ë ö/, and syllables for which there was no A/B distinction with no diacritic /i e o/. A capitalized vowel /I E O/ indicates a syllable for which there was an A/B distinction but for which the category is unknown. The source for all Old Japanese forms is Jōdaigo Jiten Henshū Iinkai 1967, the definitive dictionary of Old Japanese. Hypothetical pre-Old Japanese forms are marked with a superscript POJ.
8. On the fundamental irregularity of rendaku, see Vance 1987: 146–148, Ohno 2000, and several of the papers in this volume. It is curious, to say the least, that these irregularities have not been leveled out over the course of the last millennium. This is not to say that the situation has been static. Many individual vocabulary items that used to have rendaku no longer do, and vice versa. But these changes do not seem to have any discernible direction. To give just one set of examples, Hepburn's 1867 dictionary lists the verb /ki+kae-ru/ 'change clothes' and the corresponding noun /ki+gae/ 'changing clothes', and it also lists the verb /nor-i+kae-ru/ 'change horses' and the
corresponding noun /nor-i+gae/ 'changing horses'. The descendants of these items for most modern Tokyo speakers are /ki+gae-ru/ (a gain for rendaku), /ki+gae/ (no change), /nor-i+kae-ru/ (no change), and /nor-i+kae/ (a loss for rendaku).
9. On the other hand, there are puzzling examples of rendaku in phrasal items of this form, including OJ/ama+n+gapa/ 'Milky Way' (cf. OJ/kapa/ 'river'), with genitive OJ/n/, and OJ/ma+tu+gë/ 'eyelash' (cf. OJ/kë/ 'hair'), with genitive OJ/tu/ (which has not survived into modern Japanese).
10. As explained in note 7, forms marked with a superscript POJ are hypothetical pre-Old Japanese. The genitive POJ/n/ in POJ/sasa n pa/ in (4c) is not attested; it is merely an inference. Interestingly, the form now in use in modern Japanese is /sasa no ha/, not /sasaba/. The other two items in (4) are also obsolete. I have nothing illuminating to say about why certain composite items in the pre-Old Japanese vocabulary contained a genitive particle while others did not. I assume the situation was much the same as it is in modern Japanese. For example, I have no explanation to offer for why the notion 'toe' is expressed by the phrase /aši no yubi/ 'foot's digit' whereas the notion 'ankle' is expressed by the compound /aši+kubi/ 'foot-neck'.
11. The term continuative is Kuno's (1973: 195). The traditional term in Japanese grammar is ren'yōkei 'adverbial form', and Bloch (1946: 6) calls it the infinitive form.
12. The two classes are called godan-katsuyō-dōshi 'five-row inflection verbs' and ichidan-katsuyō-dōshi 'one-row inflection verbs'. For details, see Vance (1987: 178–184).
13. This rather clumsy characterization is necessary because of verbs such as /ka-u/ 'buy', which has a consonant-final allomorph only before /a/, as in the negative /kaw-ana-i/. The Old Japanese citation form of this verb was OJ/kap-u/, and the modern forms reflect a well-known sequence of historical changes. The standard account is that, in word-medial position, [p] > [w], and then [w] > ∅ except before /a/. I use Bloch's (1946) morphological segmentations as a convenience, not as an endorsement of the analysis behind them.
14. Many linguists prefer not to analyze [s] and [š] as contrastive, treating [š] before /i/ as a realization of /s/ and [š] before any other vowel as a realization of /sy/.
15. Parallelism with consonant-stem verbs would dictate a zero morph marking the continuative of a vowel-stem verb (as in /tabe+∅/), but I will not clutter the transcriptions in this paper with zero morphs. For an argument that the very notion of a zero morph is incoherent, see Matthews (1974: 117).
16. The distinctive part of the pitch-accent pattern on a Japanese word is a fall from high pitch to low pitch. I mark the location of a fall with a downward-pointing arrow (↓). Some words are unaccented, i.e., contain no pitch fall, and no arrow appears in the transcription of an unaccented word. Standard references on Japanese accent include McCawley (1977), Haraguchi (1977), and
Pierrehumbert and Beckman (1988). Aside from these examples in (5), I have
not bothered to mark accent in this paper, since accent does not figure in the
discussion.
17. Sakurai (1966: 41) makes a similar claim about compounds of inflected word plus inflected word, but he qualifies it by saying that if the first element is used as a noun, sequential voicing can occur. However, since the first element must appear in its stem form, it is not clear how to determine whether it is being used as a noun (Vance 1987: 143).
18. The other class of inflected words in Japanese contains only a single member: the copula (Bloch 1946: 21–24). We will not consider it here, since even if it occurred in forms that could be construed as compounds, its citation form /da/ and most of its other forms begin with the voiced obstruent /d/, making rendaku inapplicable (or vacuous).
19. I have no reason to think that examples containing vowel-stem verbal elements would significantly change the overall picture that emerges. I could be wrong, of course.
20. A compound could appear either as a headword itself or as a subentry under its first element.
21. To be more precise, the continuative of a verb followed by /gači/ is either an adjectival noun (keiyōdōshi) or what Martin (1975: 179) calls a 'precopular noun'. See Martin (1975: 418–419) for discussion and examples.
22. For details on the mora obstruent in compounds like (12b), see Vance (2002).
23. I am grateful to my research assistant, Mieko Kawai, for her painstaking work.
1. Introduction
Phonological phenomena are often found in one vocabulary class but not in others within a single language. However, the basic tenet of OT, a single invariant ranking, seems incompatible with multiple vocabulary classes showing inconsistent phonological behavior. Recent OT analyses have developed useful notions to approach this problem.
First, multiple sub-lexica are posited when their phonological properties are sufficiently distinct. For instance, Japanese has at least four phonological sub-lexica: Yamato, Sino-Japanese, Mimetics, and Foreign (Itô and Mester 1995, 1999; Fukazawa et al. 1998).
Second, these sub-lexica are organized in a core-periphery structure (Itô and Mester 1995). Generally (and historically), the native vocabulary tends to form the core while non-native vocabularies tend to form the periphery. A constraint-based implementation of the core-periphery structure is to assume that the more native a sub-lexicon is, the more markedness constraints it obeys. So, for example, in the most native sub-lexicon Z, constraints *F, *G, and *H are all respected. In the least native sub-lexicon X, however, only the constraint *F is satisfied. Figure 1 shows these relations in a set of concentric ellipses: the outermost ellipse X respects only *F, the intermediate ellipse Y respects *F and *G, and the innermost ellipse Z respects *F, *G, and *H.
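The nesting in Figure 1 can be modelled as set inclusion: each more peripheral sub-lexicon respects a subset of the markedness constraints respected further in. A minimal sketch, using the placeholder constraint names *F, *G, *H from the text:

```python
# Core-periphery structure as nested sets of respected markedness
# constraints (*F, *G, *H are the placeholders used in the text).
SUBLEXICA = {
    "X": {"*F"},               # least native: periphery
    "Y": {"*F", "*G"},
    "Z": {"*F", "*G", "*H"},   # most native: core
}

def at_least_as_native(a: str, b: str) -> bool:
    """b is at least as native as a iff b respects every constraint a does."""
    return SUBLEXICA[a] <= SUBLEXICA[b]

print(at_least_as_native("X", "Z"))  # True: the core obeys everything X obeys
print(at_least_as_native("Z", "X"))  # False
```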
2.
In this ranking, *VoiObs2stem, Express Affix, *NT, and *VoiObs are markedness constraints whose definitions are given in (3).
(4)  /oyako-keNka/       *VoiObs2stem   ExpressAffix   *NT   *VoiObs
  ☞ a. oyako-geNka                                      *       *
     b. oyako-keNka                         *!          *
     c. oyako-keNga                         *!                  *
     d. oyako-geNga           *!                               **
2.2. Ranking Paradoxes
(5)  a. [howaito sokkusu]   White Socks
     b. [iNdiaNzu]          Indians
     c. [saNzu]             Suns
     d. [kabusu]            Cubs
     e. [reddo uingusu]     Red Wings
(6)  Additional Data
     a. [shuuzu]     shoes
     b. [jiiNzu]     jeans
     c. [bibusu]     bibs
     d. [daburusu]   doubles
Words in (5a–c) show that loanwords copy the voicing value of the plural morpheme in the original. However, those in (5d–e) and (6c–d) suggest that the situation is not that simple. The pronunciation of the plural morpheme of those words in English is always [z], since the stem ends in a voiced segment. However, the corresponding Japanese loanwords have [-su]. It
In contrast to Yamato, a voicing contrast after a nasal is observed, and a morpheme can contain more than one voiced obstruent in the Foreign sub-lexicon, which leads to the following constraint ranking.
(8)
It is evident that Tateishi's data in (5) contradict the ranking for Foreign words in (8), although those words are undoubtedly Foreign. It is true that there are some assimilated-foreign words⁵ in the Japanese lexicon, such as [karuta] 'card' (borrowed from Portuguese carta in the 16th century). Assimilated-foreign words are phonologically quite close to Yamato words. For example, [karuta] shows Rendaku in the compound [iroha garuta] 'cards of the Japanese syllabary' (see Takayama, this volume, for similar examples). However, we cannot say that the words in (5) are well assimilated into Japanese, because they are not widely used and are unfamiliar to people other than sports fans.
Looking closely, since [-s] must be voiced after a nasal, as in (5b–c), *NT needs to be ranked higher than the faithfulness constraint for the Foreign sub-lexicon. Thus, the data suggest that we either abandon the membership of the words in (5) in the Foreign sub-lexicon, or admit the paradoxical ranking in (9).
(9)  /kabu-zu/       *VoiObs2stem   IDENT[voice]-Foreign
     a. kabu-zu           *!
  ☞ b. kabu-su                              *
These are the first two problematic cases for the current OT analyses of
multiple sub-lexica in Japanese. The ranking paradoxes here occur between
a markedness constraint and a faithfulness constraint. The next subsection
introduces a more serious case where the ranking paradox arises between
two markedness constraints.
     /iNdiaN-zu/     *NT   *VoiObs2stem   IDENT[voice]-Foreign
  ☞ a. iNdiaN-zu             *
     b. iNdiaN-su     *!                          *
On the other hand, Itô and Mester (2001) proposed the ranking in (2) to account for consonant voicing in Japanese. The relevant partial rankings are summarized in (13).

(13) Partial rankings proposed by Itô and Mester (2001)
     a. IDENT[voice]-Foreign >> *NT            for e.g. [furaNsu] 'France'
     b. IDENT[voice]-Foreign >> *VoiObs2stem   for e.g. [gyagu] 'gag'
     c. *VoiObs2stem >> *NT                    for (4) [oyako-geNka]

Itô and Mester's rankings are thus in contradiction with the ranking for the new data introduced by Tateishi (2001). In the following section, we will give a solution to these ranking paradoxes.
3. Solution

Two of the three paradoxes in the previous section essentially come from mixing up etymological knowledge with phonological knowledge. We want to rank IDENT[voice]-Foreign higher than *NT because we know etymologically that Indians is a foreign word in Japanese. At the same time, we cannot identify Indians as a foreign word phonologically, because the obstruent after the nasal is voiced.
Fukazawa, Kitahara and Ota (2002) show the necessity of reconsidering etymology-based Japanese sub-lexica. The previous literature has claimed that sub-lexica are phonologically motivated and that etymology-oriented labelling of sub-lexica is just a convention (Itô and Mester 1995; Fukazawa et al. 1998); subscript numbers or letters are often used instead. However, merely substituting anonymous numbers or letters for the labels does not guarantee the independence of phonology from etymology. Fukazawa et al. (2002)
     /kabu-zu/     *VoiObs2   IDENT[voice]stem   IDENT[voice]affix
     a. kabu-zu        *!
  ☞ b. kabu-su                                          *
     c. kapu-zu                      *!
     /furaNsu/     IDENT[voice]stem   *NT
     a. furaNzu           *!
  ☞ b. furaNsu                         *
     /iNdiaN-zu/     *VoiObs2   IDENT[voice]stem   IDENT[voice]affix
     a. iNdiaN-zu        *!
     b. iNdiaN-su                                         *
     c. iNtiaN-zu                       *!
     d. iNtiaN-su                       *!                *
The problem here is that the highest-ranked *VoiObs2 constraint immediately kills the desired output, because there are apparently two [voice] features for obstruents in /iNdiaN-zu/. However, is this really a problem? The answer is no, since a single [voice] feature can be shared by two voiced segments. In other words, a fusion of two [voice] features is a viable representation for /iNdiaN-zu/. In Fukazawa and Kitahara (2001), we proposed that UNIFORMITY[F] can be relativized to a morpheme to regulate the fusion of features as a repair strategy for OCP violations. In the present paper, we will explore more candidates with featural fusion for /iNdiaN-zu/ and other examples.
In addition to IDENT[F] and UNIFORMITY[F], relativized MAX[F] will be
necessary in the present analysis. For the sake of brevity, the definitions are
all given in (18) and the proposed overall ranking of relevant constraints is
shown in (19).
(18) Definition of relativized faithfulness constraints relevant for the present
analysis (Abbreviations in parentheses)
a. IDENT[voice]stem (IDst): the correspondent segments in a stem in the
input and the output have identical values for the feature [voice].
b. IDENT[voice]affix (IDaff): the correspondent segments in an affix in
the input and the output have identical values for the feature [voice].
c. UNIFORMITY[voice]stem (UNIst): no feature [voice] in a stem in the
output has multiple correspondents in the input (i.e., no coalescence
regarding the feature [voice] in a stem).
d. UNIFORMITY[voice]word (UNIwd): no feature [voice] in a word in
the output has multiple correspondents in the input (i.e., no coalescence regarding the feature [voice] in a word).
e. MAX[voice]stem (MAXst): every feature [voice] linked to a segment
in a stem in the input has a correspondent in the output.
f. MAX[voice]affix (MAXaff): every feature [voice] linked to a segment
in an affix in the input has a correspondent in the output.
(20)  /iNdiaN-zu/                       *VoiObs2  MAXst  UNIst  EXPAFF  *NT  UNIwd  MAXaff  IDaff
   ☞ a. iNdiaN-zu  (one fused [voi])                                          *
      b. iNdiaN-zu  (two [voi])            *!
      c. iNdiaN-su  (stem [voi] only)                                  *!                *     *
      d. iNtiaN-zu  (stem [voi] lost)               *!
      e. iNtiaN-su  (no [voi])                      *!                                   *
In (20), candidate (b) has two [voice] features, resulting in a violation of the highest-ranked constraint *VoiObs2. The violation of MAX[voice]stem penalizes candidates (d) and (e): in those candidates, the feature [voice] attached to the segment [d] in the input is lost in the output, [iNtiaN-zu/su]. Devoicing in the stem is worse than devoicing in the affix because MAX[voice]stem is ranked far higher than MAX[voice]affix. In the optimal candidate (a), the two [voice] features are fused into one; therefore, it does not violate *VoiObs2. Coalescence of the features in the word violates the low-ranked faithfulness constraint UNIFORMITY[voice]word, but it does not violate UNIFORMITY[voice]stem, since one of the [voice] features belongs to the stem while the other belongs to the affix. That is, the coalescence takes place not within a stem but within a word. Candidate (a) wins over candidate (c) since *NT outranks UNIFORMITY[voice]word.
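The evaluation just walked through can be sketched as strict-ranking comparison of violation vectors. The ranking order and the violation profiles below are assumptions of this sketch, reconstructed from the prose around (20), not a quotation of the authors' tableau:

```python
# Minimal OT evaluation under strict domination: candidates are compared
# constraint by constraint, in ranked order. Ranking and violation
# profiles are assumptions reconstructed from the discussion of (20).
RANKING = ["*VoiObs2", "MAXst", "UNIst", "EXPAFF",
           "*NT", "UNIwd", "MAXaff", "IDaff"]

def winner(candidates: dict) -> str:
    """Return the candidate whose violation vector is best under the
    ranking (lexicographic comparison implements strict domination)."""
    def profile(violations):
        return [violations.get(c, 0) for c in RANKING]
    return min(candidates, key=lambda name: profile(candidates[name]))

# /iNdiaN-zu/: the fused-[voice] candidate violates only low-ranked UNIwd.
candidates = {
    "iNdiaN-zu (fused [voi])": {"UNIwd": 1},
    "iNdiaN-zu (two [voi])":   {"*VoiObs2": 1},
    "iNdiaN-su":               {"*NT": 1, "MAXaff": 1, "IDaff": 1},
    "iNtiaN-zu":               {"MAXst": 1},
    "iNtiaN-su":               {"MAXst": 1, "MAXaff": 1},
}
print(winner(candidates))  # iNdiaN-zu (fused [voi])
```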
(21) Tableau for oyako-geNka

     /oyako-keNka/                    *VoiObs2   UNIst   EXPAFF   *NT   IDst
      a. oyako-geNga  (fused [voi])                *!
      b. oyako-geNga  (two [voi])         *!
   ☞ c. oyako-geNka                                               *     *
      d. oyako-keNga                                      *!
      e. oyako-keNka                                      *!
Now, let us reanalyze example (4) from Itô and Mester (2001). In (21), candidate (b) loses due to its violation of *VoiObs2. Candidate (a), in which two [voice] features are fused within a stem, violates UNIFORMITY[voice]stem. Candidates (d) and (e) lose since Rendaku does not take place, resulting in a violation of EXPRESSAFFIX. Consequently, candidate (c), in which Rendaku takes place, becomes optimal. It violates both IDENT[voice]stem and *NT, but neither violation is more serious than those incurred by the other candidates.
(22)  /bibu-zu/                             *VoiObs2   MAXst   UNIst   IDaff
   ☞ a. bibu-su  (stem [voi]s fused)                             *       *
      b. bibu-zu  (separate [voi]s)             *!
      c. bibu-su  (two independent [voi]s)      *!
      d. bibu-zu  (partially fused [voi]s)      *!
      e. bibu-su  (stem [voi] lost)                      *!
      f. bibu-zu  (fused stem [voi]s)           *!
      g. bibu-zu  (fused across the word)       *!
In (22), candidates (a) and (c) have the same segmental structure, [bibu-su], but their featural structures are different. Candidate (c) has two independent [voice] features, violating the highest-ranked constraint *VoiObs2. Candidate (a), in contrast, violates UNIFORMITY[voice]stem, since its two features are fused within the stem. Similarly, we can consider three different featural structures for [bibu-zu], as shown in candidates (b), (f), and (g). All of them lose because, regardless of their featural structures, they violate the highest-ranked constraint *VoiObs2. Candidate (e) violates MAX[voice]stem because the [voice] feature in the stem in the input loses its correspondent in the output. In candidate (a), by contrast, the loss of a [voice] feature occurs in the affix, resulting in a violation of low-ranked IDENT[voice]affix. Consequently, candidate (a) becomes optimal.
(23)  /kabu-zu/                  *VoiObs2   MAXst   UNIwd   IDaff
      a. kabu-zu  (fused [voi])                       *!
   ☞ b. kabu-su                                                *
      c. kabu-zu  (two [voi]s)       *!
      d. kapu-zu                              *!
In (23), candidate (c) loses due to its violation of *VoiObs2. Candidate (d)
loses since devoicing takes place in the stem, resulting in the violation of
high-ranked MAX[voice]stem. The violation of UNIFORMITY[voice]word in
candidate (a) is more serious than that of IDENT[voice]affix in candidate (b)
although both of them are relatively low-ranked. Two [voice] features are
fused not in the stem but in the word in candidate (a). Devoicing in the affix makes candidate (b) violate IDENT[voice]affix. However, (b) becomes
optimal since other candidates commit more serious violations. 11
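The evaluation logic just described can be sketched as a toy OT evaluator. This is my own illustration, not part of the original analysis: the candidate labels and violation profiles simply encode the discussion of (23) above, and lexicographic comparison of profiles picks candidate (b).

```python
# Toy OT evaluator sketch (my encoding of the discussion of tableau (23);
# the constraint names are from the text, the data structures are hypothetical).
RANKING = ["*VoiObs2", "MAX[voice]stem", "UNIFORMITY[voice]word", "IDENT[voice]affix"]

# Violation counts per candidate, following the prose: (a) fuses [voice]
# within the word, (b) devoices the affix, (c) keeps two [voice] features,
# (d) devoices the stem.
CANDIDATES = {
    "a": {"UNIFORMITY[voice]word": 1},
    "b": {"IDENT[voice]affix": 1},
    "c": {"*VoiObs2": 1},
    "d": {"MAX[voice]stem": 1},
}

def optimal(candidates, ranking):
    # Compare violation profiles lexicographically: a single violation of a
    # high-ranked constraint is worse than any number on lower-ranked ones.
    def profile(name):
        return tuple(candidates[name].get(c, 0) for c in ranking)
    return min(candidates, key=profile)

print(optimal(CANDIDATES, RANKING))  # b
```

Because tuples compare position by position, the candidate violating only the lowest-ranked constraint wins, which is exactly the "least serious violation" reasoning of the text.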
4. Conclusion
We have seen paradoxical cases for the previously proposed system of multiple phonological sub-lexica in Japanese. Our proposal to resolve the paradoxes is simple: relativize faithfulness constraints to standard morphophonological categories. The patterns brought up in Tateishi (2001) in (5)
and the additional data of our own in (6) are all accounted for in our analysis.
The analysis so far brings up some theoretical implications. First, as we
claimed in the introduction, etymological information should
not be conflated with phonological information in setting up sub-lexica.
This position is reinforced by a simple consideration about language acquisition: children have no a priori knowledge that a certain item belongs
to a particular sub-lexicon. At the early stage of acquisition, the grammar,
Acknowledgements
We thank the editors of this volume, Jeroen van de Weijer, Kensuke Nanjo
and Tetsuo Nishihara for giving us an opportunity to contribute our paper.
We also thank Shigeto Kawahara, Linda Lombardi, Mits Ota, and Koichi
Tateishi for helpful inputs. Of course, all errors are our own.
Notes
1. Sub-lexicon-specific faithfulness constraints are derived from a general faithfulness constraint, a process called relativization, or splitting, of faithfulness (Fukazawa 1999).
2. Of course, etymology was not part of the phonological grammar in the previous analyses either. It has been repeatedly pointed out that the historical origins of
morphemes do not necessarily coincide with the identification of lexical subclasses (Itô and Mester 1999; Tateishi 2003). However, what blurs our eyes is
native speakers' intuition about lexical classification. As Takayama (this volume) argues, we do have intuitions about lexical classes, and they may interact
with the phonological grammar. Our point is that the former is theoretically
distinct from the latter.
Introduction
This paper gives an account of the regular affinity among voicing, nasality and
place of articulation, focusing on intervocalic stop consonants in Japanese.
There is agreement that voiced obstruents appeared only intervocalically and
were prenasalized [Ng, ndz, nd, mb] in prehistoric and Old Japanese (Unger
1977; Vance 1983), and that these prenasalized stops (henceforth, PNS) are
still retained only in some of the Tohoku dialects in northern Japan.1 It has
been suggested that the phonemic inventory of the Tohoku system is similar
to that of Old Japanese (F. Inoue 2000), when prenasalization alone
played a distinctive role for the intervocalic obstruents (Hamano 2000;
M. Takayama 2002; among others).
The loss of PNS spans several centuries and is still in progress cross-dialectally, but the variation is not random. It is well known in National Language Studies in Japan that the existence of [mb] implies that of
[nd] but not vice versa, and that [nd] implies [N] but not vice versa (Hashimoto
1932; Hirayama et al. 1992; Kamei et al. 1997; Kindaichi 1941; Oohashi
2002; M. Takayama 2002; T. Takayama 1993; Uwano 1989; Yanagita 1930;
among others). In the framework of Optimality Theory (henceforth OT;
Prince & Smolensky 1993; McCarthy & Prince 1995), the [N]~[g] alternation
in current Tokyo Japanese has been taken up (McCarthy & Prince 1995; Itô
& Mester 1997; Hibiya 1999; among others); however, little discussion has
been devoted to the related alternations at other places of articulation, their
dialectal variance, and their relation to the historical shift of the voice contrast.
This paper tries to shed light on these issues, adopting the FAITH reranking model developed by Itô & Mester (1995ab, 1997, 1998, 1999ab,
2000, 2003). Focusing on the interface between synchronic variation and
diachronic change emphasized by Anttila & Cho (1998), Cho (1998), and others (references cited in Yamane & Tanaka 2002), we will show that the
minimal demotion of FAITH leads to the gradual loss of PNS. It will also be
Intervocalic (Pre-)Nasalization
(i)   g ~ N:   /g/ → [N]  / V _ V;  [g] elsewhere
(ii)  d ~ nd:6 /d/ → [nd] / V _ V;  [d] elsewhere
(iii) b ~ mb:  /b/ → [mb] / V _ V;  [b] elsewhere
a. kagami ~ kaNami    'mirror'
b. kagi ~ kaNi        'key'
c. uguisu ~ uNuisu    'bush warbler'
d. toge ~ toNe        'thorn'
e. kago ~ kaNo        'basket'

a. hada ~ handa    'skin'
b. ude ~ unde      'arm'
c. mado ~ mando    'window'

a. saba ~ samba     'mackerel'
b. ebi ~ embi       'shrimp'
c. abura ~ ambura   'oil'
d. kabe ~ kambe     'wall'
a. Plain nasal
   velic aperture     |-------------|
   oral constriction  |-------------|

b. Prenasalized stop: shortening of the velic lowering gesture
   velic aperture     |------|
   oral constriction  |-------------|

c. Prenasalized stop: lengthening of the oral closing gesture
   velic aperture     |-------------|
   oral constriction  |----------------------|
In a simple nasal, the relative timing of the oral and velic gestures is closely
coordinated, as in (2a), but in a prenasalized stop the nasal passage has to
be closed before the oral articulation is released. On this view, a
prenasalized stop seems articulatorily unstable. There are several different
ways of producing PNS: one way is to shorten the duration of the velic
lowering gesture, as illustrated in (2b), and another is to extend the duration of the oral closure, as in (2c). Furthermore, the total duration of the
prenasalized stop is longer in case (2c), which would be consistent with
the idea that the durations of complex segments are greater than those of
simpler ones (Maddieson & Ladefoged 1993: 255). However, it
seems that the Japanese system chooses option (2b) rather than (2c);
otherwise the contrast between moraic nasals (e.g., kaNgo 'nursing') and
PNS would be endangered.
PNS never appear word-initially, which might erroneously lead one to assume that PNS are mere positional variants. However, voiced stops also appear
intervocalically, and in this sense the two are not in complementary distribution. Furthermore, the intervocalic voiced stops are synchronically
derived from voiceless stops, for instance saka ~ saga 'slope', kaki ~
kagi 'persimmon', doko ~ dogo 'place', geta ~ geda 'clogs', mato ~
mado 'target', and kata ~ kada 'shoulder'. As a result, minimal pairs are
easily found, as follows.
(5) Minimal pairs
  (i)  /g/ vs. /N/ (k → g, g → N):
       ageru 'open' vs. aNeru 'raise'; kagi 'oyster' vs. kaNi 'key'
  (ii) /d/ vs. /nd/ (t → d, d → nd):
       mado 'target' vs. mando 'window'; hada 'flag' vs. handa 'skin'
This fact suggests that the contrast between voiced stops and PNS is phonemic rather than allophonic.
[Table: dialect systems and regions; column alignment lost in extraction. The row-to-system mapping below is a reconstruction.]
  System A (A1/A2): Aomori, parts of Iwate, Miyagi, Akita, parts of Yamagata, parts of Fukushima, parts of Niigata, parts of Mie, parts of Ehime, parts of Nagasaki, parts of Kagoshima
  System B: parts of Nara, parts of Wakayama, Kochi
  System C: parts of Ibaragi, parts of Tochigi, parts of Chiba, Tokyo, Kanagawa, Toyama, Ishikawa, Fukui, Yamanashi, Nagano, Gifu, Shizuoka, parts of Shiga, parts of Kyoto, Osaka, parts of Hyogo, parts of Tottori, parts of Okayama, parts of Okinawa
  System D: the rest of the regions above
are diffused in gradual succession, like ripples made by a stone thrown into a pond.
This assumption is supported if the forms now found in peripheral areas can be shown to have existed in the past. As far as PNS are concerned, there is agreement that [Ng,
ndz, nd, mb] existed intervocalically in prehistoric and Old Japanese (Unger
1977; Vance 1983).13 More importantly, PNS were gradually lost, as
attested by historical materials in the central and other dialects. According to Hashimoto (1932), the loss of [mb] took place in the late Muromachi
era, and the loss of [nd] in Modern Japanese.14
It is not known when [Ng] was lost, and there is no clear phonetic evidence
to prove its existence in OJ or even in the current Tohoku dialects. Although it
would be reasonable to assume that the loss of [Ng] occurred prior to the loss
of [mb], as for the period of the loss I can only speculate that it was in OJ.
The scenario would be that [Ng] was replaced by [N] in OJ, and [N] is now
in turn being replaced by [g]. In fact, the loss of [N] is a well-known ongoing
change in present-day Japanese: Kindaichi (1941) reports that [N] started to
be lost and replaced by [g] among the younger generation in Tokyo.
Summarizing, the loss of PNS affected velar, labial, and coronal in this
order, and further affected [N]. This is shown in the rightmost column of the
table below.

(8) Loss of PNS
  System   Intervocalic PNS and nasals   Major change
  A        {mb, nd; N, m, n}             loss of Ng
  B        {nd; N, m, n}                 loss of mb
  C        {N, m, n}                     loss of nd
  D        {m, n}                        loss of N
  (the Period column of the original table was lost in extraction)
2. OT analysis

MAX(NAS) requires that every PNS of the input be realized in the
output, so changing a prenasalized stop into a simple voiced stop in the output incurs a violation of this constraint (this will be explained in
(19ii) and (20b)). Random permutation of these five constraints could generate 5! = 120 possible dominance hierarchies. However, as far as PNS from OJ
through present-day Japanese are concerned, the hierarchies are limited to only four. This can be achieved by adopting the following two
assumptions.
  *PNS:    *Ng >> *mb >> *nd
     >>
  *Nasal:  *N >> *m >> *n
Once the M hierarchy is fixed in this way, the only rerankable constraint is F. This assumption allows us to capture so-called harmonic completeness, whose definition is given below.
(12) Harmonic completeness (Prince & Smolensky 1993; Prince 1998)
     Let S be a system and α, β elements that are markedness-wise comparable, with α more marked than β. Then, if S contains α, it must also contain β.
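Under the fixed markedness scale, this definition reduces to a simple check: an inventory is harmonically complete just in case, whenever it contains a segment, it also contains every less marked comparable segment. A sketch of that check (my own illustration, using the text's ASCII segment labels; not part of the original analysis):

```python
# Harmonic completeness check, assuming the fixed scale Ng > mb > nd > N
# (more marked on the left). Illustration only.
SCALE = ["Ng", "mb", "nd", "N"]

def harmonically_complete(inventory):
    # If a system contains a segment, it must also contain every
    # less marked segment further down the scale.
    return all(less in inventory
               for i, seg in enumerate(SCALE) if seg in inventory
               for less in SCALE[i + 1:])

print(harmonically_complete({"nd", "N"}))   # True: system B
print(harmonically_complete({"mb", "N"}))   # False: contains mb but skips nd
```

This is exactly why the four attested systems form a nested chain, while gapped inventories such as {mb, N} are predicted not to occur.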
(14) a. [Constraint-domain map; the concentric diagram was lost in extraction. Recoverable content: system A lies outside every constraint domain, with inventory {mb, nd, N}; system B lies inside the domain of *mb, with {nd, N}; system C inside *nd, with {N}; system D inside the innermost circle of *N, with { }.]
     b. more marked (implicans) → less marked (implicatum)
As we go toward the periphery, more structures are allowed, while as we go
inward, the allowable structures are reduced, owing to the obligation to
observe more constraints. Based on the observations in the preceding
sections, we can state that system A falls outside the circle of any domain; system B is inside the domain of *mb; system C is inside the domain
of *nd; and system D is in the innermost circle of *N.
From a historical perspective, Japanese starts from the outer circle, allowing the full range of instantiations of PNS, and as time goes by the system moves inward, gradually eliminating marked segments. This seems
intuitively right, in the sense that the historical shift proceeds toward
less marked structure. Such a direction can be expressed as an inward
shift, or a shift from implicans to implicatum, as shown in (14b).17
Before showing how variation between grammars is accounted for,
let us clarify some differences between the constraint-domain map developed by Itô and Mester and the one used here. Itô and Mester's map concerns
lexicon-internal variation within the grammar of one speaker of current Tokyo
Japanese. What I am addressing is variation between grammars in the PNS
inventories of speakers across both time and space. But in both models, the
emphasis is placed on the observance of the hypotheses in (10ab).
(15) [Diagram lost in extraction: nested constraint domains across periods, with invariant systems OJ (= A), MJ (= B), ModJ (= C), PJ (= D) and variable systems A+B, B+C, C+D between them.]
This image shows that every system is harmonically complete, and the
sound change proceeds toward a specific direction step by step. Again, the
direction of the diachronic change here can be characterised from implicans
to implicatum.
(16) a. Implicational relation: D ⊂ C ⊂ B ⊂ A
     b. Unmarked direction of diachronic change: A → B → C → D
In terms of OT, thanks to the fixed ranking of M, harmonically incomplete
systems such as those in (13b) would never be generated. Furthermore, a
diachronic prediction can be made. That is, [mb] may be lost before [nd], but
[nd] will never be lost before [mb]. Likewise, [nd] may be lost before [N],
but [N] will never be lost before [nd]. This prediction holds true for both the
synchronic and the diachronic continuum, as seen in the previous sections.
the input as diachronically old forms (Yamane-Tanaka 2003), then candidates which have lost PNS incur violations of MAX(NAS).
In this section, I argue that the rerankability of MAX(NAS) is likewise
not random. If FAITH(NAS) could be reranked freely, then even with only
four admissible systems, the orders in which the systems emerge would number 4! = 24 (e.g., D→C→B→A, A→C→B→D,
C→A→B→D); but this is not the case. In order to derive the correct diachronic change (i.e., A→B→C→D), F has to be demoted minimally within
the relevant constraint hierarchy. The minimal demotion of F thus limits the possible permutations to only one order, as shown below. (The constraint at the top is the most highly ranked.)
(17) Possible permutations and direction

  System   Ranking                                Output
  A        *Ng >> MAX(NAS) >> *mb >> *nd >> *N    {mb, nd, N}
  B        *Ng >> *mb >> MAX(NAS) >> *nd >> *N    {nd, N}
  C        *Ng >> *mb >> *nd >> MAX(NAS) >> *N    {N}
  D        *Ng >> *mb >> *nd >> *N >> MAX(NAS)    { }
As F is ranked higher, more M constraints are invalidated (i.e., rendered inactive), so that
more PNS are allowed in the system. As F moves downward, more M constraints become active, so that more PNS are removed from the system. The minimal-F-demotion hypothesis correctly captures the progression: the less
marked structures are never eliminated until the relatively more marked
ones have been.
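The minimal-demotion mechanism can be made concrete in a few lines. The sketch below is my own illustration, not the author's formalism: it fixes the markedness hierarchy *Ng >> *mb >> *nd >> *N and slides MAX(NAS) down one slot at a time; a segment surfaces only if its markedness constraint is dominated by MAX(NAS). The four demotion steps yield exactly the inventories of systems A through D in (17).

```python
# Sketch of minimal F demotion over a fixed markedness hierarchy
# (hypothetical encoding of (17); the constraint names are from the text).
MARKEDNESS = ["*Ng", "*mb", "*nd", "*N"]        # fixed ranking, high to low
SEGMENT = {"*Ng": "Ng", "*mb": "mb", "*nd": "nd", "*N": "N"}

def inventory(max_nas_slot):
    """Segments that survive when MAX(NAS) sits just below
    MARKEDNESS[:max_nas_slot]: constraints dominated by MAX(NAS)
    are inactive, so their segments are preserved."""
    return {SEGMENT[m] for m in MARKEDNESS[max_nas_slot:]}

# Demoting MAX(NAS) one step at a time gives systems A, B, C, D in order:
for label, slot in zip("ABCD", range(1, 5)):
    print(label, sorted(inventory(slot)))
```

Each step removes exactly one segment, and always the most marked one remaining, so gapped orders such as C before B can never arise.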
Let me discuss some lexical items to see how PNS are realized in
each of the systems A-D. The items /kabe/ 'wall', /hada/ 'skin' and /toge/ 'thorn' are
synchronically attested as below.

(18) PNS in intervocalic context
        System A   System B   System C   System D
  mb    kambe      kabe       kabe       kabe
  nd    handa      handa      hada       hada
  N     toNe       toNe       toNe       toge
(19) [Gesture diagrams lost in extraction: (i) extension of the velic aperture over the oral constriction; (ii) deletion of the velic aperture.]
The first strategy is to extend the velic aperture of PNS to create a simple
nasal, as in (19i), and the other is to delete the entire velic aperture to create
a simple stop, as in (19ii).
Both strategies would satisfy *PNS, but each of them would violate one
of the following constraints.
(20) a. DEP(NAS): Velic aperture of the output has an identical correspondent in the input.
(No extension of velic aperture)
b. MAX(NAS): Velic aperture of the input has an identical correspondent in the output.
(No deletion of velic aperture)
The strategy in (19i) satisfies MAX(NAS) but violates DEP(NAS), since
the velic aperture is extended in the output (e.g., mb → m). On the other hand,
the strategy in (19ii) satisfies DEP(NAS) but violates MAX(NAS), since
the velic aperture is deleted in the output (e.g., mb → b).
As it stands, there is no predictive force to capture the idiosyncrasy of
the velar. I therefore propose the following constraint.
(21) MAX(NAS)/VEL: A velic aperture with an oral constriction at the velum in
the input has an identical correspondent in the output.
(No deletion of N)
MAX(NAS)/VEL demands that the velum remain lowered when the
oral constriction is formed in the same area (i.e., at the velum). Thus it is violated
if the velum is raised while the constriction is formed at the velum (e.g., Ng
→ g). This kind of constraint may be phonetically grounded (see note 9),
[Tableaux for Systems A-D were garbled in extraction. Recoverable content: each tableau evaluates candidates for /toge/ (toNge, toNke, toge, toNe), /kabe/ (kambe, kampe, kabe, kame) and /hada/ (handa, hanta, hada, hana) against rankings of *Ng, MAX(NAS)/VEL, DEP(NAS), MAX(NAS), *mb and *nd; the winners per system are those listed in (18).]
3.
[Table: stop inventories by period; the era headers were lost in extraction, but the surrounding discussion indicates the columns run from early OJ to PJ.]
  Voiceless stops   p, t, k     p, t, k   p, t, k   p, t, k
  PNS               mb, nd, N   nd, N     N         -
  Voiced stops      -           b         b, d      b, d, g
b, d, g
The only thing that is certain here is that PJ has the voice contrast without
prenasal, and the segmental distribution in the other eras is hypothetical.
The question is how PJ has attained such a system. Imagine when one prenasalized stop is lost, it is replaced by the plain voiced stop; say when nd is
lost, nd is replaced by d. Then, the original d may become t in order to
avoid merging with the original d. Given that such a diachronic chain shift
proceeds from labial, coronal and velar in this order, we could further assume that the voice contrast didnt emerge at all PoA simultaneously. In
other words, in early OJ the contrastive feature pertaining to stops was prenasal, but the contrastive feature may have shifted from prenasal to voice
gradually from MJ through PJ.
(T = p, t, k; D = b, d, g; A = vowel)
Recall that as far as word-initial position is concerned, OJ does not allow
voiced obstruents, while PJ does. If OJ can be categorized as type (30a)
and PJ as type (30b), then the asymmetry of the voice contrast between
OJ and PJ is already captured in (31).
The hierarchies in (30) should be compared with those given below, which
can be developed from the hierarchy in (17).
(32) a. Early J: MAX(NAS) >> *PNS
        (i.e., prenasalized obstruents are attested in the inventory)
     b. ModJ: *PNS >> MAX(NAS)
        (i.e., prenasalized obstruents are excluded from the inventory)
Since in early J it is more important to retain nasality than to avoid PNS, PNS
can surface in the output. But in PJ it is more important to avoid PNS
than to retain nasality, so underlying PNS cannot surface in the output.
It is easy to see that OJ (or early J) and PJ (or ModJ) show
different kinds of markedness. The former is marked in segmental complexity but
unmarked in inventory; the latter, on the contrary, is marked in inventory
but unmarked in segmental complexity. As is clear from the trade-off of markedness shown in (30) and (32), ModJ seems to have reduced
segmental complexity at the cost of the unmarked status of its obstruents.
4. Conclusion

We have discussed the connection between the loss of PNS and the history
of obstruent voicing. Based on the observation that the loss of PNS proceeded along the harmonic scale of place of articulation, I gave it a principled account in terms of
the minimal demotion of MAX(NAS). It was suggested that this demotion
may mark the historical emergence of the voice contrast.
The parallelism between synchronic variation and diachronic change is
attested in the form of an implicational relation. Among the 47 regional dialects surveyed here, 46 systems involving PNS fit into the factorial typology that emerges from the fixed markedness hierarchy with rerankable
MAX(NAS); the only exception is the Tokushima dialect (cf. Appendix 3;
No. 37). Historical surveys demonstrate that possible grammatical
change proceeds from implicans to implicatum, which is mirrored in the
demotion of MAX(NAS).
Acknowledgements
Part of this article was first read at the workshop 'Voicing in Japanese' at the
Linguistics and Phonetics (LP) conference held at Meikai University on September 3,
2002. I am deeply indebted to Kensuke Nanjo, Tetsuo Nishihara and
Jeroen van de Weijer, who organized this project and edited this book.
Other presentations of this research include the ones at the meeting of
the Tokyo Circle of Phonologists (TCP) at Seikei University on May 25,
2003 and the informal research meetings as well as a presentation session
in LING 507 at the University of British Columbia from September to December, 2003.
I express my profound gratitude to Sonya Bird, Atsushi Fujimori,
Shosuke Haraguchi, Ayako Hashimoto, Takeru Honma, Junko Itô, Itsue
Kawagoe, Masahiko Komatsu, Haruo Kubozono, Masao Okazaki, Ruangjaroon, Keiichiro Suzuki, Timothy Vance, Ian Wilson and my classmates,
who provided me with insightful comments and discussion.
Special thanks go to Linda Lombardi, Kan Sasaki, Michiaki Takayama,
and Tomoaki Takayama, who kindly sent me various important materials
related to the issues I'm interested in. I am also grateful to Hiroyuki Maeda,
who wrote a critique of an earlier version of this paper (Maeda 2004).
Last but not least, I would like to express my deepest thanks to Gunnar Ólafur
Hansson, Douglas Pulleyblank, Shin-ichi Tanaka, and Jeroen van de Weijer,
and anonymous reviewers, who read earlier versions of this paper and made
valuable comments and suggestions.
The research I have conducted since September 2003 is supported by
SSHRC Standard Research Grant [#410-2002-0041], which was awarded
to Douglas Pulleyblank.
I'd like to dedicate this paper to the memory of my father and my father-in-law.
Notes
1. According to the dialectal division of Tôjô (1954), the Tohoku dialects comprise
Aomori, Iwate, Akita, Miyagi, Yamagata, Fukushima and the north of Niigata.
2. For an OT analysis of the chain shift, see Yamane-Tanaka (2003).
3. /p/ did not participate in voicing: /p/ turned into [ɸ] or [h] word-initially, and
turned into [w] or was deleted word-medially (F. Inoue 2000: 421).
4. As an anonymous reviewer pointed out, some readers may wonder whether PNS in
this dialect contrast with NC clusters intervocalically. It is cross-linguistically rare for NC clusters and PNS to contrast (Maddieson & Ladefoged 1993).
Also, given that the Tohoku dialects are syllabeme dialects
(Shibata 1962: 140-141), in which light and heavy syllables show no weight
contrast, it may be hard to believe that there is a distinction. However, there are
several reasons to believe that there is one. First, this dialect has minimal pairs showing this
contrast, for example samba 'mackerel' (with a prenasal) vs. saNba 'midwife' (with a moraic nasal),
handa 'skin' vs. haNda 'solder', and kaNo 'basket' vs. kaNNo 'nursing'. Second,
the moraic nasal has longer duration than the prenasals. So far I have not obtained
phonetic measurements of the durational contrast between a
prenasal and a moraic nasal at the same place of articulation, but according to Oohashi (2002: 210-215) the duration of the nasal murmur was 130 ms for /N/ (in /teNki/ 'weather'), which
contrasted with 36 ms for the prenasal m (in /ombi/ 'kimono belt') and 100 ms for the prenasal n (in
/handa/ 'skin'). Third, the duration of the moraic nasal is longer than that of
the second element of long vowels or geminates (Oohashi 2002: 317-349).
5. Generally, PNS appear only in Yamato items, not in Sino-Japanese (SJ), mimetic or foreign items. There are also phonetic environments which prohibit
PNS: (a) V1: _ V2, where V1 is long; (b) C1V1 _ V2, where C1 = [-voice], V1 =
[+high], V2 = [-high]; (c) V1 _ V2C3V3, where V2 = [+high], C3 = [-voice], V3
= [-high]; (d) after a nasal. Nonetheless, some prenasalized items in SJ as well
as Yamato are found in these environments.
[Table of examples garbled in extraction; recoverable items include [kobodaisi], [kondo:], [doNu] 'tool' (SJ), [unde] 'arm', [miNite] 'right hand' (Yamato) and [riNNo] 'apple' (SJ), with most cells for the environments 'after a long vowel' and 'after /N/' empty ('none').]
Interestingly, [N] can appear in all the environments (a-d), [nd] only in the limited environments (a, b), and [mb] in only one environment (a). This observation also
matches the assumption that [mb] is the least likely to survive.
6. In this paper vowels are transcribed with the simplified five symbols [a, i, u, e, o] rather than in narrow transcription. In the Aomori dialect, /u/ is
unrounded and centralized, as in most dialects of Eastern Japan. The Tohoku dialects in general have no contrast between /i/ and /e/,
which merge as a raised [e].
7. /..di/ and /..du/ are left out of the column of examples, because at present they are
observed exclusively in foreign items (e.g., [torendi] 'trendy', [andutowa]
'un, deux, trois' (Fr)). Concerning /di/ and /du/ in Yamato items, there
is something worth noting. It is assumed that historically /di/ vs. /zi/ were pronounced [di] vs. [Zi], and /du/ vs. /zu/ as [du] vs. [zu].
Such a distinction is still kept only in some areas of Kochi, well known as
yotsugana dialects [dialects with four different pronunciations for the
four kana letters] (e.g., [uZi] for /fuzi/ (name of an area or person) vs. [undi] for
/fudi/ 'wisteria'; [kuzu] for /kuzu/ 'arrowroot' vs. [kundu] for /kudu/ 'trash').
In contrast, the northern Tohoku areas neutralize these four forms into [dzɨ] ([ɨ]
= centralized [i]), and are characterized as hitotsugana dialects [dialects
with only one pronunciation for the four kana letters] or 'dzii dzii' dialects. These dialects also neutralize /ti/ and /tu/ into [tsɨ], which turns up as
[dzɨ] intervocalically. Thus /tizi/ 'governor' and /tizu/ 'map' are realized identically,
as are /titi/ 'milk' and /tuti/ 'soil' (see R. Sato 2002).
8. On structurally complex segments, see van de Weijer (1996). On the
markedness ranking of features, see sec. 2.1.
9. The other way of adjustment is to turn it into a simple voiced stop (i.e., [g]), as
happened to [mb] and [nd] among the younger generation. But only [Ng] did not
show this option. This might be ascribed to Boyle's Law: voicing is difficult to
maintain when the supraglottal cavity is small (Ohala 1983; Vance 1987).
McCarthy & Prince (1995: 353) express this effect in terms of a constraint
POSTVCLS 'posterior stops (i.e., velars) are voiceless'. In their system, treating
the [g] ~ [N] alternation in Tokyo Japanese, *[N] >> POSTVCLS >> IDENT-IO(NAS)
turns [k] into [g] only where [N] cannot appear. Itô & Mester (1997) postulate *g, which
can be interpreted as a trigger constraint 'producing the impetus for nasalization of [g] into [N] to occur' (Kager 1999: 241). Such an avoidance of [g] may
be aerodynamically or physiologically grounded, since the oral constriction at the
velum could so easily lower the velum that airflow escapes through the nasal cavity. It may be worth exploring this in terms of Grounding Theory (Archangeli & Pulleyblank 1994).
10. From the viewpoint of long-term sound change, this change can be taken
as the first stage of the consonant shift [Ng] > [N] > [g] > [F]. For the labial
series, [mb] > [b] > [B] is assumed. The change in the coronal series is divided
in two: [nd] > [d] before nonhigh vowels, and [nd] > [ndz] > [dz] > [z] before
high vowels (see T. Takayama 1993). It should be noted that the velars had [N]
before reaching [g], while the labials and coronals did not have [m] or [n] respectively at any stage of the consonant shift.
11. [Ng] and [N] may not stand in phonemic distinction, so the latter is parenthesized. However, positing the two systems A1 and A2 seems to cover the observations so far. Thanks to Gunnar Ólafur Hansson for this suggestion.
12. This may be similar to the wave theory of the European tradition of dialectology. However, Yanagita's theory lays more emphasis on the aspect that the
spread of newer forms takes place in a circular pattern, like ripples, with its
center located in the cultural center (Shibatani 1990: 201).
13. This discussion is based on the following chronological division (cf.
Miller 1967: ch. 1).
    Division          Centuries
    Old J             -8c.
    Late Old J        9c.-12c.
    Middle J          13c.-16c.
    Early Modern J    17c.-1868
    Modern J          1868-
14. We focus on the prenasalization of voiced stops, and therefore do not treat
[ndz] here. However, [ndz] likewise underwent loss of PNS, and subsequently
merged with [z] around the 16th through 17th centuries. For details, see T. Takayama (1993).
15. For another OT analysis treating dialectal differences of consonant voicing, see
Nishihara (2002).
16. For references, see de Lacy (2002: 193-194). In fact, not only the ranking between *Lab and *Dor but the whole ranking has been a matter of debate;
see Hume & Tserdanelis (2002). For a restriction on M, see de Lacy (2004).
17. The relation between implicans and implicatum seems to show the basic pattern
of phonological asymmetries. In the pattern of consonant harmony in acquisition (Pater & Werle 2001), implicans and implicatum are both instantiated at
the early stage, but only implicata are observed at the later stage (e.g., coronals
are targets of harmony as frequently as or more frequently than non-coronals).
18. Anttila and Cho (1998) divide systems into invariant and variable systems.
Variable systems are expressed as combinations of two invariant systems, such
as A+B, B+C, and C+D; thus seven systems are logically possible. Since we
assume that diachronic change originates in synchronic variation, we must allow for those variable systems.
19. Thanks to Douglas Pulleyblank for raising this issue.
20. See Nasu (1999) for more data, with an emphasis on the markedness of [p].
[Map legend (the appendix map itself was lost in extraction): [1] [N, g, Ng]; [2] [nd]; [3] [mb].]
[Appendix: prefecture-by-prefecture system assignments; the column alignment of the table was lost in extraction, so the pairing of prefectures with systems cannot be reconstructed reliably.
Prefectures 1-47: Hokkaido, Aomori, Iwate, Akita, Miyagi, Yamagata, Fukushima, Ibaraki, Tochigi, Gunma, Chiba, Saitama, Tokyo, Kanagawa, Niigata, Nagano, Toyama, Ishikawa, Fukui, Gifu, Yamanashi, Shizuoka, Aichi, Shiga, Mie, Kyoto, Nara, Osaka, Wakayama, Hyogo, Tottori, Shimane, Okayama, Hiroshima, Yamaguchi, Kagawa, Tokushima, Ehime, Kochi, Fukuoka, Oita, Saga, Nagasaki, Kumamoto, Miyazaki, Kagoshima, Okinawa.
Attested system values include A1, A2, B, C and D, with Tokushima (No. 37) marked '?'.]
1. Introduction

Although Rendaku (or Sequential Voicing) has been extensively studied in
the literature, its relation to accentuation has not. Sugito (1965) is a unique
study which, after limited research on personal names ending
with the morpheme ta 'rice field', claims that words which undergo
Rendaku tend to be accentless. This correlation is intuitively appealing, and
several researchers have followed her on various occasions (cf. H. Sato (1989),
Kubozono (1998), Kubozono (this volume), Tanaka (this volume), etc.).
This paper examines the extent to which Sugito's generalization applies
in Japanese, investigating other personal names more thoroughly. It will become clear that the correlation in question is observed to some extent, but is
not overwhelming. Moreover, it will be shown that obedience or non-obedience to it is lexically determined by the rightmost head morpheme of
the name (e.g., ta in Yoko-ta), and further, that each morpheme shows quite
diverse behavior in accentuation and Rendaku.
The paper is organized as follows. In the next section we review the
generally held view of the relationship in question, giving a summary of
Sugito's (1965) investigation of names with ta. In Section 3, we investigate
other Japanese names to see to what extent the relevant observation applies
in general. It soon becomes evident that the pattern differs significantly
depending on the last head morpheme of the name; in Section 4 such
morpheme-specific patterns are illustrated. Section 5 concludes the
paper.
Sugito investigated names ending with the morpheme ta 'rice field', one
of the most productive morphemes in Japanese surnames, and examined
whether ta is subject to this generalization, referred to as Sugito's Law in this
paper. As the examples below show, names which conform to this generalization are abundant.
(2)
The names in (3a) do not undergo Rendaku although they are accentless.
On the other hand, those in (3b) are accented even though they undergo
Rendaku.
After investigating 362 names with this morpheme, Sugito found that
the generalization applies to more than half of those ending with ta. Below
are her results, counting names with respect to accentedness and Rendaku
sensitivity [slight modifications are mine].1
(4)
                non-Rendaku   Rendaku   both   total
  accented           94          64       8     166
  accentless         13          95       0     108
  both               10          56      22      88
  total             117         215      30     362
According to (4), 189 names, that is, 52.2% of all the ta-names and 71.0%
of the names without alternating patterns (i.e., all names excluding
those listed in the 'both' cells), conform to the generalization in (1), as highlighted above. It is possible to conclude from this table that Sugito's Law
applies moderately, though not strictly, to names ending in ta.
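The figures just cited follow directly from table (4); a quick recomputation (the cell values are copied from the text; the variable names are mine):

```python
# Recomputing Sugito's rates from table (4).
cells = {
    ("accented",   "nonRendaku"): 94, ("accented",   "Rendaku"): 64, ("accented",   "both"): 8,
    ("accentless", "nonRendaku"): 13, ("accentless", "Rendaku"): 95, ("accentless", "both"): 0,
    ("both",       "nonRendaku"): 10, ("both",       "Rendaku"): 56, ("both",       "both"): 22,
}
total = sum(cells.values())                                   # 362
# Names conforming to the law: accented without Rendaku, or accentless with it.
conforming = cells[("accented", "nonRendaku")] + cells[("accentless", "Rendaku")]  # 94 + 95 = 189
# Names with no alternation in either accent or voicing:
no_alternation = sum(v for (acc, ren), v in cells.items()
                     if acc != "both" and ren != "both")      # 266
print(round(100 * conforming / total, 1))           # 52.2
print(round(100 * conforming / no_alternation, 1))  # 71.1 (the text rounds to 71.0)
```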
Sugito also pointed out that both voicing and accentuation are influenced by the onset segment of the last mora of the preceding morpheme, which I will call the base. If the segment is voiced (including sonorants but not nasals), the name tends to be accented and exempt from Rendaku (as in Fuji-ta). If the segment is voiceless (including nasals), or if the mora has no onset, the name tends to be accentless and to undergo Rendaku (as in Yoshi-da).2 The table below illustrates this point, where [-v] stands for 'voiceless' and [+v] for 'voiced' in the above-mentioned classification regarding sonorants and nasals. Note also that the counting of names is slightly different from (4), mainly because Sugito excluded names with alternative voicing pronunciations.
(5)
                    accented        accentless        both          total
                  [-v]   [+v]     [-v]   [+v]      [-v]   [+v]
   non-Rendaku    23a     55        3      3        12b     14       110
   Rendaku        58c      2       87      3        56       0       206
   total          81      57       90      6        68      14       316
Below are examples of names whose base consists of one mora; they are all accentless:
(6)
I-da, U-da, E-da, O-da, Ki-da, Su-da, Ta-da, Tsu-da, To-da, No-da,
Hi-da, Ya-da, Yu-da, Yo-da, Wa-da; Se-ta, Ha-ta, Mi-ta
Moreover, except for the last three, the names in (6) all undergo Rendaku.
This might result from the fact that voiced obstruents are scarce in the last
onset of the base in this type.
Before moving on to the next section, let us summarize the characteristics of ta in the following table:
(7) characteristics of ta:

         specification   Rendaku/Accent   OCP   Peculiarity of
                         correlation            monomoraic base
   ta                        Yes          Yes   Yes: -A, +R

(8)
                 accented        accentless      both         total
   non-Rendaku   105 (30.3%)      59 (17.0%)     5 (1.4%)     169 (48.7%)
   Rendaku        89 (24.5%)      63 (18.2%)     1 (0.3%)     153 (44.1%)
   both           15 (4.3%)        6 (1.7%)      4 (1.2%)      25 (7.2%)
   total         209 (60.2%)     128 (36.9%)    10 (2.9%)     347
The areas predicted by the generalization in (1) are shaded. The percentage shows the rate of appearance among all the names in question (i.e. 347 names). The high percentage of accented non-Rendaku names is in accordance with Sugito's Law, in that it is higher than those of both accented Rendaku names and accentless non-Rendaku names. That is, if a name is accented, it is most likely to be exempt from Rendaku (i.e. 30.3% vs. 24.5%). If it does not undergo Rendaku, it is most likely to be accented (i.e. 30.3% vs. 17.0%). The percentage of accentless Rendaku names, on the other hand, does not seem to be consistent with Sugito's Law: it is only slightly higher within the Rendaku group (i.e. 18.2% vs. 17.0%), and even lower within the accentless group (i.e. 18.2% vs. 24.5%).
The tendency described above is even clearer in another calculation of the appearance rate. The percentages in (9a, b) are calculated relative to one particular group, not to the entire set of names. For example, the 105 accented non-Rendaku names in (9a) comprise 62.1% of all the non-Rendaku names (i.e. 105 out of 169). In (9b), on the other hand, they comprise 50.2% of all the accented names (i.e. 105 out of 209). In both (9a) and (9b), accented non-Rendaku names comprise more than 50%, which suggests that to some extent they conform to Sugito's Law.
(9) a.
                 accented        accentless      both        total
   non-Rendaku   105 (62.1%)     59 (34.9%)      5 (3.0%)    169
   Rendaku        89 (58.2%)     63 (41.2%)      1 (0.7%)    153

    b.
                 accented        accentless
   non-Rendaku   105 (50.2%)     59 (46.1%)
   Rendaku        89 (42.6%)     63 (49.2%)
   both           15 (7.2%)       6 (4.7%)
   total         209             128
Accentless Rendaku names, on the other hand, do not exhibit such prevalence in any group. Among Rendaku names (9a), they comprise 41.2%, which is less than accented Rendaku names (58.2%). They are prevalent among the accentless names, but exceed non-Rendaku accentless names by only 4 names (9b).
Though this is a very simple comparison, from these data we can conclude that certain characteristics found in ta (in particular, the generalization in (1)) do not apply to Japanese person names in general. As will become obvious in the following sections, accentedness and Rendaku sensitivity are actually morpheme-dependent properties. Moreover, the influence of the voicing of the base on Rendaku, which is operative in ta, also seems to be determined by each morpheme, as we will shortly see. In the next section, therefore, we discuss how such properties are specified for each major morpheme which appears in Japanese surnames.
4. Morpheme-specific tendencies
As previously mentioned, accentedness and Rendaku sensitivity are properties specified for each morpheme independently. Morphemes can be categorized into four types according to the degree to which these properties are specified: (i) those in which both accentedness and Rendaku sensitivity are specified; (ii) those in which one of these properties is specified; (iii) those in which neither is specified; and (iv) those which have a peculiar pattern. In what follows, we will consider various morphemes according to this categorization, so that we can examine to what extent Sugito's Law is upheld in them and how different they are in accentuation and Rendaku.
4.1. Names in which both accentedness and Rendaku sensitivity are specified
Examples that most clearly exemplify this category are names with hara 'field'. As shown in (10), these are most likely to be accented without Rendaku, and should be specified as such.
(10) hara: accented, non-Rendaku
Shino-hara, Kuri-hara, Taka-hara, Ue-hara, Ta-hara, Naka-hara,
Kawa-hara, Yoshi-hara, Tsuka-hara, Take-hara, Kasa-hara,
Oo-hara, Kita-hara, Nishi-hara
The table below shows more precisely the specific behavior of this morpheme. The areas predicted by Sugito's Law are shaded in gray.
(11)
                 accented       accentless    both         total
   non-Rendaku   23 (63.9%)     3 (8.3%)      3 (8.3%)     29 (80.6%)
   Rendaku        4 (11.1%)     0 (0%)        0 (0%)        4 (11.1%)
   both           2 (5.6%)      0 (0%)        1 (2.8%)      3 (8.3%)
   total         29 (80.6%)     3 (8.3%)      4 (11.1%)    36
The percentage of words which obey the lexical specification (i.e. 63.9%) might seem rather small, but it is clear that the other major pattern predicted by Sugito's Law (i.e. accentless with Rendaku) is never produced, not even as an exception. If this morpheme simply followed the generalization (i.e. if it were not specified as accented without Rendaku), the alternative pattern should contain a fairly large number of examples. Note also that the total percentages of the accented and non-Rendaku groups are both 80.6%. From these observations, we can conclude that this morpheme is most typically accented without Rendaku because it is specified as such.
Moreover, the monomoraicity of the preceding morpheme seems to play a role in accentuation, as in the case of ta. In (11), exceptions with a monomoraic base can be found in the following four categories: (i) two in accentless non-Rendaku names (Mihara and Ihara); (ii) two in non-Rendaku names which have both accented and accentless patterns (Ki()hara and No()hara); (iii) one in an accented Rendaku name (Ebara); and (iv) one in an accented name which may or may not undergo Rendaku (Ko[h/b]ara).4 Thus, it is possible to conclude that hara is specified not only as accented without Rendaku, but also as accentless or as undergoing Rendaku for names with a monomoraic base.5
On the other hand, names with hara are not influenced by the voicing of the preceding morpheme in terms of Rendaku. Names with a voiceless segment in the last onset of the base are equally exempt from Rendaku, as is obvious in (10): compare, for example, Takahara and Yoshihara with Kurihara and Kawahara. This, however, is natural, because the OCP becomes relevant only where Rendaku might occur, which does not normally happen in hara names due to their specification.
Other examples which show similar behavior to hara are:
(12) a. shita 'under' (8/8):
     b. tani 'valley' (7/8):
     c. se 'shallows' (7/7):
     d. saka 'slope' (6/7):
As the rates in parentheses show, almost all the names with these morphemes are accented and do not undergo Rendaku, which conforms to the pattern predicted by Sugito's Law. Monomoraicity does not seem to play a role in tani and saka, since names with a monomoraic base behave in the same way as the others.
On the other hand, there are many names which are specified as having a pattern that does not obey Sugito's Law; that is, they are specified either as accented and undergoing Rendaku, or as accentless and not undergoing Rendaku. Names with kuchi 'mouth' as head are the clearest examples of the former pattern.
(13) kuchi: accented, Rendaku
Yama-guchi, Tani-guchi, No-guchi, Kawa-guchi, Hi-guchi,
Ta-guchi, Seki-guchi, E-guchi, I-guchi, De-guchi, Mizo-guchi,
Hama-guchi, Hori-guchi, Hara-guchi
(16)
             specification   Rendaku/Accent   OCP   Peculiarity of
                             correlation            monomoraic base
   hara         +A, -R           Yes                Yes: -A or +R
   shita        +A, -R           Yes                No
   tani         +A, -R           Yes                No
   se           +A, -R           Yes                No
   saka         +A, -R           Yes                No
   kuchi        +A, +R           No                 No
   hayashi      +A, +R           No                 (Yes: -A)
   kura         -A, -R           No                 (Yes: +R)
                    accented        accentless        both          total
                  [-v]   [+v]     [-v]   [+v]      [-v]   [+v]
   non-Rendaku     0      0        2      6         0      1          9
   Rendaku         1      0       12      1         0      0         14
   both            0      0        3a     0         0      0          3
   total           1      0       17      7         0      1         26
(20)
                    accented        accentless        both          total
                  [-v]   [+v]     [-v]   [+v]      [-v]   [+v]
   non-Rendaku     3      0        3      6         0      0         12
   Rendaku         0      0        7      2         0      0          9
   both            0      0        1      0         0      0          1
   total           3      0       11      9         0      0         22
As in the case of sawa, /k/ is counted as voiceless in (20). The pattern of Sugito's Law is again only observed in the accentless Rendaku cell.
Names with tsuka 'mound' also show a pattern similar to sawa and shima in that names with this morpheme become accentless.9
(21) Oo-tsuka, Hira-tsuka, To-tsuka; Ii-zuka, Ishi-zuka, Te-zuka
The difference between this morpheme and the previous two is that voicing is not determined by the last onset of the preceding morpheme; cf. Totsuka vs. Tezuka. As long as there are no Rendaku names with voiced base-final onsets, it can be said that this morpheme also respects the OCP. Only seven examples with this morpheme were found, however, so this tendency remains to be examined more thoroughly.
Saki 'cape' is also similar to sawa, shima and tsuka in the sense that accentedness is fixed lexically; however, the value of the specification is the opposite, that is, it is specified as accented. Rendaku sensitivity is again determined by the last onset of the base:
(22) a. Miya-zaki, Oka-zaki, Matsu-zaki, No-zaki, Shino-zaki,
Ishi-zaki, Shima-zaki
b. Iwa-saki, Fuji-saki, Naga-saki, (Kawa-saki)
As shown in (22b), a name does not undergo Rendaku when the last onset of the base is voiced, including the exceptionally accentless Kawasaki. This is also observable in the following table:
(23)
                    accented        accentless        both          total
                  [-v]   [+v]     [-v]   [+v]      [-v]   [+v]
   non-Rendaku     1      3        0      1         1      0          6
   Rendaku         8      1        0      0         0      0          9
   both            4a     1        0      0         0      0          5
   total          13      5        0      1         1      0         20

   a: All contain a base either (i) with [m] (e.g. Yama-[s/z]aki and Hama-[s/z]aki)
      or (ii) of monomoraic shape (e.g. Ta-[s/z]aki and E-[s/z]aki).
As in the case of sawa, /m/ (counted here as [-v]) behaves differently from the other voiceless segments, as do monomoraic bases.
Another morpheme that is specified as accented is hata 'farm':
(24) Oo-hata, Taka-hata; Ta-bata, O-bata; (Kawa-bata)
The OCP effect on Rendaku seems to be absent here: note that Takahata does not undergo Rendaku even though the last onset is voiceless [k]. It may be that names with a monomoraic base undergo Rendaku, as in Tabata and Obata, although the data are limited to only these two. Kawabata is exceptional not only in that it is accentless but also in that it undergoes Rendaku even though the base-final onset is voiced, violating the OCP.
The morpheme ki 'tree' shows a unique pattern. A name with this morpheme shows one of the following three patterns: (i) accentless without Rendaku, as in (25a); (ii) accentless with Rendaku, as in (25b); and (iii) accented without Rendaku, as in (25c).
                    accented        accentless        both          total
                  [-v]   [+v]     [-v]   [+v]      [-v]   [+v]
   non-Rendaku     3      4        5      2         0      0         14
   Rendaku         0      0        3      2         0      0          5
   both            0      0        0      0         0      0          0
   total           3      4        8      4         0      0         19
Both accented and accentless names are found in the non-Rendaku and Rendaku cells. Possible explanations would be either: (i) accentedness is fixed as accentless and the names in (25c) are exceptions; or (ii) Rendaku sensitivity is fixed as negative and the names in (25b) are exceptions. If we look at the data more closely, it becomes clear that the former is the better analysis. Note that the names which belong to (25c) have a base with [r] or [m] as the last onset. Only when the base has one of these particular segments does the name get accented, overriding its [-A] specification.10
It is worthwhile to note that the exceptional names in (25c) obey Sugito's Law: when the name is exceptionally accented, it is always exempt from Rendaku. Thus, we can conclude that the generalization is partly preserved in names with this morpheme.
We summarize this section with the list in (27). As in the previous sections, each morpheme is supplied with information concerning its lexical specification, Rendaku/accent correlation, the OCP, and peculiarities regarding monomoraic bases. In addition, the idiosyncratic behavior of some morphemes is given in the rightmost column. Some morphemes either undergo or fail to undergo Rendaku when the base has [m] as the last onset, which means that [m] must be regarded as both voiced and voiceless.
(27)
           specification   Rendaku/Accent        OCP     Peculiarity of     special
                           correlation                   monomoraic base    treatment
   sawa        -A              No                Yes       (No)             [m]: [-v]
   shima       -A              No                Yes       (No)
   tsuka       -A              No                (Yes)     (No)
   saki        +A              No                Yes       (Yes: -R)        [m]: [-v]
   hata        +A              No                (No)      (Yes: +R)
   ki          -A              Yes               No        (No)             [m] and [r]: +A
                               (in subgroups)
(30)
                    accented        accentless        both          total
                  [-v]   [+v]     [-v]   [+v]      [-v]   [+v]
   non-Rendaku    11      8        1      1         0      0         21
   Rendaku         1      0        9      1         0      0         11
   both            0      0        1      1         3      0          5
   total          12      8       11      3         3      0         37
Examples with a monomoraic base are excluded in (29) and (30). It is obvious from (29) and (30) that this morpheme follows the generalization in (1) when the base is bimoraic: that is, the name is exempt from Rendaku when accented, and subject to it when accentless. This pattern is also observed in the three names categorized as having both kinds of accentuation with/without Rendaku, as they are either accented without Rendaku or accentless with Rendaku (e.g. Shimo-kawa/Shimo-gawa).
This distribution can be accounted for if we assume that kawa is not assigned any specific value for accentedness or Rendaku sensitivity. In this case, a name randomly takes a value for accentedness (but not for Rendaku, for reasons we will see shortly), and if it is accented, obedience to Sugito's Law prohibits it from undergoing Rendaku. Conversely, if it is accentless, the name undergoes Rendaku for the same reason.
Also interesting in (30) is the fact that accentless Rendaku names are restricted to names with bases having final voiceless onsets, whereas accented non-Rendaku names do not have such a restriction. This is because the OCP only prohibits two consecutive voiced consonants, not a voiceless-voiceless sequence. Suppose a name with a final voiced onset (say, paba) has a [-A] value. Sugito's Law would then predict an illegal voiced-voiced sequence: *Pabagawa. On the other hand, obedience to the OCP produces a pattern which violates Sugito's Law when it preserves the [-A] value: *Pabakawa. Satisfying both the [-A] value and the OCP is thus impossible for a name with a final voiced onset, which leads to the near nonexistence of [+v] names in the accentless Rendaku cell of (30).
On the contrary, a base with a final voiceless onset does not violate the OCP if it does not undergo Rendaku: a sequence of two voiceless segments is not ill-formed in itself with respect to the OCP. Thus, if a name with a base-final voiceless onset is given a [+A] value, Sugito's Law prohibits it from undergoing Rendaku, resulting in many [-v] as well as [+v] names in the accented non-Rendaku cell of (30).
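The interaction just described can be sketched as a toy consistency check. This is illustrative code only, not part of the original analysis, and the two constraint functions are deliberately simplified assumptions:

```python
def sugito_rendaku(accented):
    """Sugito's Law forces the Rendaku value from the accent value:
    accented -> no Rendaku, accentless -> Rendaku."""
    return not accented

def ocp_ok(base_final_onset_voiced, rendaku):
    """Simplified OCP: a voiced base-final onset must not be immediately
    followed by a Rendaku-voiced onset."""
    return not (base_final_onset_voiced and rendaku)

# An unspecified head like kawa: try both accent values over both base types.
for voiced_base in (False, True):
    for accented in (True, False):
        rendaku = sugito_rendaku(accented)
        label = "ok" if ocp_ok(voiced_base, rendaku) else "impossible"
        print(f"voiced base={voiced_base}, accented={accented}, Rendaku={rendaku}: {label}")
```

The only combination ruled out is an accentless name over a voiced-final base, which mirrors the near nonexistence of [+v] names in the accentless Rendaku cell of (30).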
When the base is monomoraic, the morpheme shows a distinct pattern: the name is accented and subject to Rendaku:
(31) Se-gawa, Ta-gawa, E-gawa, Ka-gawa, I-gawa, Sa-gawa
It is clear from these examples that kawa receives special treatment in names with this kind of base, so that they must be assigned [+A, +R] values.
Interestingly, the pattern varies slightly when the name literally refers to a river, not a person, even though the same morpheme is used as a head. River names are always subject to Rendaku, and moreover, they are typically accented, as shown by the examples of longer shape in (32b). Four-mora names are always accentless, as in (32a), due to a restriction against accented four-mora nouns (cf. Kubozono 1996; Zamma 2001, 2003).
(32) kawa in river names:
a. Kamo-gawa, Yodo-gawa, Shuku-gawa, Kako-gawa, Ibi-gawa
b. Katsura-gawa, Takase-gawa, Nagara-gawa, Temuzu-gawa
This difference suggests that the specification is determined not only by the
morpheme itself, but also by the type of noun it produces.
We summarize this section with the table in (33).
(33)
            specification            Rendaku/Accent   OCP    Peculiarity of
                                     correlation             monomoraic base
   hashi    after nasal:   +A, +R        No
            after [u]:     +A, +R
            after [(a)i]:  -A, -R
            others:        +A, -R
   kawa     (unspecified)                Yes          Yes    Yes: +A, +R
Acknowledgements
A very preliminary version of this paper was presented at the annual meeting
of PAIK (Phonological Association in Kansai) held at Kobe College on
July 13, 2002. I would like to thank PAIK participants (especially Shigeto
Kawahara, Haruo Kubozono, Kazutaka Kurisu, Michinao Matsui and Akio
Nasu) and Jeroen van de Weijer for their valuable comments and discussion. I am also grateful to Mark Campana, who suggested stylistic improvements of this paper.
Notes
1. The original list is more complicated because Sugito also made a comparison
between the dialects spoken in Tokyo and Osaka.
2. Kubozono (this volume) also discusses this issue.
3. These four names in fact follow Sugito's Law, as they undergo Rendaku when they are accentless, and do not when accented. As neither of the factors (accentedness and Rendaku sensitivity) is fixed for them, they are included in this category.
4. Tahara is the only name with a monomoraic base which satisfies the general
specification of hara: i.e. accented without Rendaku.
5. In addition, one of the accented Rendaku names has a base which ends with a moraic nasal (Kambara). It might be possible to attribute this Rendaku to the nasal segment, that is, to Post-Nasal Voicing, in which a moraic nasal acts as the trigger.
6. Only Mizu[g/k]uchi has an alternative pronunciation in which Rendaku does
not apply.
7. The only name with /m/ which does not belong to this category is Misawa,
which has a monomoraic base.
8. The pattern with shima is quite different from that found in river names, where non-Rendaku names get accented (cf. Tanaka, this volume). In this case, moreover, the voicing of the last onset of the base is not the trigger of Rendaku: e.g. Sakura-jima and Itsuku-shima.
Introduction
The aim of this article is to present a survey of rendaku (sequential voicing)
in loanwords. There are some differences in phonological behavior between
native words and loanwords. It is often said that rendaku is one of those
differences, since rendaku hardly occurs in loanwords. However, we find a
number of exceptional occurrences. In this article, we take up some problems concerning such examples. Investigation into those exceptions sheds
light on some aspects of lexical stratification in Japanese. The question of
lexical stratification is one of the central issues in recent research in generative phonology, and some of the principal studies (Itô and Mester (1999a, 2003); Fukazawa, Kitahara, and Ota (2002); among others) have paid attention to phenomena such as rendaku in the Japanese lexicon. This article,
however, intends to survey the relationship between lexical stratification
and rendaku from a different viewpoint. If we try to answer the question of
what is the loanword stratum, or what is the relationship between lexical
stratification and phonological phenomena, we need to look further into the
background behind the lexical stratification. In particular, we have to recognize the significance of stylistic and sociolinguistic aspects. Paying serious
attention to these aspects helps us to understand what the occurrence of a
phonological phenomenon depends on. In our opinion, such a consideration
is useful even for theoretical research.
The borrowed vocabulary of the Japanese language consists of two main
groups: the group of Sino-Japanese (SJ) words,1 and the group of foreign
words that are largely borrowed from European languages. These two
groups differ from each other with respect to rendaku occurrences. In the
following sections, we look at this difference by examining loanword rendaku examples, and discuss some issues in loanwords in Japanese. Without
intending to reach a conclusive argument, this article emphasizes the importance of stylistic or sociolinguistic aspects when dealing with phonological
phenomena.
(2)
Let us return to rendaku in words like karuta, kappa. What was said about ikura equally applies to karuta and kappa. Although both words were originally borrowed from Portuguese in the 16th century, they are not foreign in terms of a native speaker's intuition. First, they look like non-foreign words in terms of their forms.5 Second, neither of them has any connotation of something foreign. As for karuta, it refers to a Japanese-style card game, and Japanese people believe that playing karuta belongs to their own tradition. We can also regard kappa as a non-foreign word. In fact, it has a foreign counterpart rein kōto, which comes from the English word raincoat. Although both kappa and rein kōto are daily expressions in present-day Japanese, they are subtly different from each other in connotation. People prefer rein kōto to kappa in some contexts, because the latter suggests a cheaper or less fashionable quality.
Another example, illustrating the history of rendaku in loanwords, is karuka (4), which refers to the stick for loading a bullet into the barrel of a matchlock gun from the muzzle.
(4)
processes. This problem is worthy of further investigation in order to clarify the relationship between phonological phenomena and semantic aspects
in loanwords.
To sum up this section, rendaku in foreign loanwords takes place only in
the words that merged into the non-foreign word group (including both
native words and SJ words). Therefore, we conclude that the occurrence of
rendaku essentially depends on the difference between the foreign word
group and the non-foreign word group. In addition, there are still other
problematic rendaku cases, of which some examples will be mentioned in
(19) of section 4.
(6) kiku 'chrysanthemum'
    sira giku 'white kiku'
    no giku 'wild kiku'
Although kiku originates from the Classical Chinese kuk, this word had already joined the native word group in the Heian period (ca. 9th-12th century). One striking piece of evidence is that in those days kiku was commonly and quite often used in Japanese poetry (Yamato uta or waka), where the usable expressions were confined to the native word group except for extraordinarily licensed cases. This indicates that the rendaku in compounds with kiku in (6) reflects its membership in the native word group. We can therefore explain rendaku in kiku in the same manner as in section 1, where it was concluded that the foreign loanwords that undergo rendaku are limited to words which have merged into the non-foreign word group. In fact, the form of kiku looks like a native form in terms of phonotactics. This property must have been one of the factors that led kiku to join the native word group.
However, not all cases of rendaku in SJ can be treated in the same manner
as kiku. We also find a special type of rendaku words that does not follow the
pattern we looked at in section 1. Some examples are given in (7) below.
However, it is important also to pay attention to the fact that the great majority of SJ words are not used exclusively in formal contexts. Even among SJ words, there are differences in formality or style, and we find a great number of SJ words that are more compatible with informal contexts. The examples in (9), (10), and (11) illustrate this point.
(9) a. isya (doctor)
    b. isi (doctor)
(10) a. teppō (gun)
     b. zyū (gun)
(11) a. hyakusyō (farmer, peasant)11
     b. nōmin (farmer, peasant)
Although both isya and isi in (9) are SJ, isya is a more informal expression preferred in daily colloquial contexts; isi is a stiff expression mostly used in formal contexts.
[Diagram: the Sino-Japanese stratum divided into vulgarized SJ and formal SJ, with vulgarized SJ as the possible targets of rendaku.]
(18) Variations of sutēsyon 'station' (for details, see Sanada 1981, 1991):
     a. sutensyo (syo is a SJ morpheme, meaning 'site' or 'place'),
     b. tensyoba (ba is a native word, meaning 'site' or 'place'; cf. the non-foreign word tēsyaba 'station', composed of tēsya 'a stop' and ba 'site' or 'place'),
A reanalysis inspired by folk etymology or a blending with native or SJ elements played an important role in the formation of these variants. This displays a tendency to merge newcomers into the familiar existing vocabulary rather than separately constructing a new word group. The word ketto in (5) can be added as another exemplification of this trend, since native speakers associated ke- with the native word ke, as mentioned in section 1. This nativization trend was stronger in the past, though it never gained mainstream status. Furthermore, we cannot overlook Nakagawa's (1966) suggestion that some compounds with foreign words occasionally undergo rendaku, as shown in (19), even though these forms are unstable in comparison to non-rendaku forms.15
(19) a. indo garē (indo + karē < curry) 'Indian curry'
     b. yama gyanpu (yama 'mountain' + kyanpu < camp) 'camping in mountains'
Although it is necessary to investigate further the foreign words that older generations were using in colloquial or dialectal contexts, the examples in (19) at least demonstrate that even foreign words are subject to rendaku in the same way as vulgarized SJ words. Of course, this trend has not been observed widely. But it is noticeable that, at least in the past, some foreign words were apt to undergo rendaku when they came to be used in colloquial contexts.
However, the leading class of native speakers, especially educated people, respected the forms that are seemingly faithful to their foreign origin, and they probably thought that these faithful forms should occupy the standard positions. It is possible that this inclination became predominant over the tendency toward further nativization, such as rendaku and transformation by means of reanalysis or blending. This trend has accelerated especially since World War II, and more speakers have become sensitive to the existence of foreign languages behind foreign words. At the same time, people have become sensitive to the difference as to whether the initial consonant of a foreign word is a voiced or voiceless obstruent (daku-on or sei-on).
Finally, a few words about SJ words need to be added. A main point of
concern is the relationship between rendaku and the complexity in the SJ
Acknowledgements
This article is a revised version of Takayama (1999). I am grateful to
Jeroen van de Weijer, Kensuke Nanjo, and Tetsuo Nishihara for providing
an opportunity to contribute to this volume. I am indebted to Jeroen van de
Weijer and anonymous reviewers for many helpful comments. I would like
to thank Paul Hoornaert for useful suggestions for improving the English
expressions. Special thanks go to Yoiko Aoyama for valuable information
on rendaku words. Of course, I take ultimate responsibility for all the inadequacies and errors that remain. This work was partially supported by the
Grants-in-Aid for Scientific Research from the Japan Society for the Promotion of Science, No. 15520285, 2003-2004.
Notes
1. The Sino-Japanese vocabulary, which occupies an important part in the Japanese lexicon, originally derives from the Chinese language mainly before the
10th century through the learning of the Chinese logographic system (so-called
Chinese characters).
2. There are various kinds of karuta, and each kind is named using a compound with karuta, such as iroha garuta. However, in the written forms of these compounds, we encounter non-rendaku forms such as {iroha karuta}, {haikai karuta} ({} indicates the transliteration of kana phonograms). I think this is due to the fact that those written forms do not always directly reflect their sound forms, and/or to the fact that they are often pronounced without rendaku voicing.
3. Iroha is a series of kana syllabaries like an alphabet, which is ordered by i, ro,
ha, ni, etc.
4. There is another sonata in the native vocabulary, which is an antiquated expression referring to the second (singular) person. Nowadays, this word is used
only in historical plays or dramas.
5. Vance (1987: 141) argues that kappa is not a virtual native word but is rather
virtual Sino-Japanese. However, in some cases, it is hard for native speakers to
determine by intuition whether some word is native or SJ. Kappa is one of
those cases. The boundary between these two groups is not always clear inside
the non-foreign vocabulary. The boundary between foreign and non-foreign is
clearer, although there are also vague cases.
6. In Zōhyō Monogatari (see note 7), we find a compound like futo+karuka ('thick stick'). But we cannot determine whether its form was karuka or garuka. Although the kana phonogram system has a diacritic mark (dakuten) to indicate voiced obstruents, it was not a compulsory element in the Edo period. Without this mark, we have no decisive clue to the rendaku form.
7. Zōhyō Monogatari ('Tales of common soldiers') is a collection of stories told by experienced common soldiers. Probably its first manuscript appeared in the 17th century.
8. Some speakers use a form without rendaku, naga try.
9. When saying that rendaku can apply to a word group, we do not mean that rendaku occurs in all targets belonging to that group.
10. There is a phonotactic constraint in the native vocabulary: more than one voiced obstruent per morpheme is prohibited. This constraint explains why rendaku is blocked in words that already have a voiced obstruent (see Kubozono, this volume; Itô & Mester 1986; Yamaguchi 1988; and Haraguchi 2002). As illustrated in (7), every SJ word that undergoes rendaku has no voiced obstruent by itself. Although a SJ word generally comprises two morphemes, it behaves like one simplex word in this regard.
11. Since hyakusyō often has a pejorative connotation, a form with honorific elements, o-hyakusyō-san, is preferred.
1. Introduction
The main goal of this paper is to present the results of a study conducted to
improve the performance of Large-Vocabulary Continuous Speech Recognition (LVCSR) by modeling context-dependent pronunciation variation
(i.e. morphophonemic alternation) and context-independent pronunciation
variation (i.e. free variation). In particular, I report the results of performance tests run on numeral-classifier combinations in Japanese (e.g. ni-hon 'two stick-type objects', san-bon 'three stick-type objects'), showing how the accuracy of our Japanese LVCSR engine was improved through modeling context-dependent and context-independent pronunciation variation. On the one hand, these numeral-classifier combinations are a typical subject of phonological/morphological study, displaying
linguistically significant, regular morphophonemic voicing alternation
patterns. On the other hand, the same set of data shows linguistically insignificant free variation involving voicing. I demonstrate that these two types
of pronunciation variation are indeed captured by the same process of statistical adjustment in our LVCSR engine.
The secondary goal of this paper is to offer a glimpse of research in the area of Automatic Speech Recognition (ASR) to a phonological audience and to contribute to knowledge transfer between the two disciplines.
While much attention has been paid to the inter-disciplinary study between
phonology and cognitive science, not much discussion has been generated
between phonology and speech engineering. The practice of computational
phonology (Bird 1995) does exist; however, it studies implementations of
theoretical phonology, which is not the same as research aimed at improving ASR systems.
Note that it is not the goal of this paper to offer any particular linguistic insight. Rather, this paper presents an alternative look at Japanese numeral-classifier combinations, a typical subject for phonological analysis, from the viewpoint of a commercial LVCSR.
The term argmax in (1c) means that the formula following it (the probability of the output word sequence given the sequence of acoustic input) is maximized. The formula in (1c) straightforwardly expresses the goal of ASR: to find the string of words W that maximizes P(W | X). Rather than trying to solve the formula as is, we apply Bayes' rule to (1c) to get the mathematically equivalent formula (2):

(2)  W* = argmax_W P(X | W) P(W) / P(X) = argmax_W P(X | W) P(W)

(The denominator P(X) is constant for a given acoustic input, so it can be dropped from the maximization.)
The first part, P(X | W), can be calculated by what we call the Acoustic Model (AM), and the second part, P(W), by what we call the Language Model (LM).
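As a toy illustration of this decomposition (the probabilities and candidate word sequences below are invented for illustration, not scores from the engine described here), the argmax can be sketched as:

```python
# Toy illustration of W* = argmax_W P(X | W) * P(W).
# All probabilities are made up; a real engine scores thousands of
# hypotheses with real AM/LM scores.

acoustic_model = {"san bon": 0.20, "san hon": 0.30}    # P(X | W)
language_model = {"san bon": 0.010, "san hon": 0.001}  # P(W)

def recognize(candidates):
    # P(X) is identical for every candidate, so it drops out of the argmax.
    return max(candidates, key=lambda w: acoustic_model[w] * language_model[w])

print(recognize(["san bon", "san hon"]))  # san bon
```

Here the language model outweighs the small acoustic advantage of the ill-formed variant, which is exactly the effect the probability adjustments described later in this paper rely on.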
AM is a collection of probabilistic sound sequences for a given word. In our LVCSR, AM is based on the Hidden Markov Model (HMM), which gives transitions of observation sequences when no deterministic information about the observed input is given (hence the name 'Hidden'). Many state-of-the-art ASR systems employ some form of HMM. Our AM treats each state in the HMM as the basic subphonetic unit called a senone (Hwang and Huang 1993). Senones are the units composing a triphone (a context-dependent phone), consisting of the left context, the phone, and the right context (e.g. /to/ consists of two triphones, <sil>-t+o and t-o+<sil>, where <sil> is silence). The parameters (probability values) for our AM can be automatically estimated by going through hundreds of hours of acoustic data. Once the AM is trained, the spectral features extracted from the acoustic input get computed and matched against the probable phone sequence for the candidate word.
LM is the model for determining the probability of a word sequence w1, w2, ..., wn, namely P(w1, w2, ..., wn). This probability gets broken down into its component probabilities by the Chain Rule:

(4)  P(w1, w2, ..., wn) = P(w1) * P(w2 | w1) * ... * P(wn | w1, w2, ..., wn-1) = Π(i=1..n) P(wi | w1, ..., wi-1)

Since conditioning each word on its entire history is impractical, an N-gram model approximates each factor by the probability of the word given only the preceding N-1 words:

(5)  P(w1, ..., wn) ≈ Π(i=1..n) P(wi | wi-N+1, ..., wi-1)
If the probability of the word depends on the previous two words (N=3), we
have a trigram (6c). Similarly, it is called a unigram when N=1 (6a), a
bigram when N=2 (6b). The trigram language model is widely used in most
commercial LVCSR systems today.
(6)  a. Unigram:  P(w1, ..., wn) ≈ Π(i=1..n) P(wi)
     b. Bigram:   P(w1, ..., wn) ≈ Π(i=1..n) P(wi | wi-1)
     c. Trigram:  P(w1, ..., wn) ≈ Π(i=1..n) P(wi | wi-2, wi-1)
The current practice is to use large text corpora to calculate the n-gram probabilities. It follows that the larger the text corpora, the better the n-gram coverage. Even with large corpora, there will always be many word sequences, especially trigrams, that get zero probability. There are various discounting and smoothing techniques to circumvent this data-sparseness problem, but I will not discuss them here (see Huang, Acero, and Hon 2001 for more details).
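A minimal sketch of how such n-gram probabilities are estimated from counts (the toy corpus below is invented for illustration; real LMs are trained on millions of words, which is why unseen n-grams and smoothing matter):

```python
from collections import Counter

# Maximum-likelihood bigram estimation from a three-sentence toy corpus.
corpus = [
    "<s> ni hon kudasai </s>".split(),
    "<s> san bon kudasai </s>".split(),
    "<s> ni hon arimasu </s>".split(),
]

unigrams, bigrams = Counter(), Counter()
for sentence in corpus:
    unigrams.update(sentence)
    bigrams.update(zip(sentence, sentence[1:]))

def p(w, prev):
    # P(w | prev) = count(prev, w) / count(prev); unseen pairs get zero
    # probability -- the data-sparseness problem smoothing addresses.
    return bigrams[(prev, w)] / unigrams[prev] if unigrams[prev] else 0.0

print(p("hon", "ni"), p("bon", "ni"))  # 1.0 0.0
```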
In order for the recognizer to know the phonetic content of the word sequences stored in the LM, a module called the lexicon acts as the database storing the pronunciation(s) of each word counted in the LM. Many LVCSR systems are equipped with a lexicon containing over 100,000 words for a given language. The recognizer is only capable of recognizing words that are listed in the lexicon; thus, it is safer to have a large lexicon to avoid out-of-vocabulary errors.
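A minimal sketch of such a lexicon lookup (the words and phone strings below are simplified romanized illustrations, not the actual phone set or entries of any particular engine):

```python
# A pronunciation lexicon maps each word to one or more phone strings.
lexicon = {
    "ippai":  ["i Q p a i"],
    "nihai":  ["n i h a i"],
    "sanbai": ["s a N b a i"],
}

def pronunciations(word):
    # A word absent from the lexicon can never be recognized:
    # an out-of-vocabulary (OOV) error.
    if word not in lexicon:
        raise KeyError(f"OOV word: {word!r}")
    return lexicon[word]

print(pronunciations("nihai"))  # ['n i h a i']
```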
The above is a quick introduction to ASR and specifically to LVCSR. Besides the AM, the LM, and the lexicon, there are two more important pieces to the system: the Front End and the Decoder. I will not cover these topics, since they are not relevant to the main discussion of this paper (for more in-depth introductions to ASR, see Jurafsky and Martin 2000; Huang, Acero, and Hon 2001; Shikano et al. 2001).
2. The Problem
b. N-gram count file creation stage: no attention was paid to the pronunciation variation for numeral-classifier combinations.
c. N-gram count file creation stage: no mechanism existed for assigning appropriate probabilities to free variants.
In the next section, I discuss how we dealt with each of the above problem
areas.
3. The Fix
would have a probability that is too low to factor into the LM. Thus, we first divided the 65 classifiers into three tiers based on their frequency of occurrence in our corpora: Tier 1 classifiers included /hon/ 'stick-shaped object', /en/ 'yen', etc.; Tier 2 classifiers included /hyoo/ 'number of votes', /shoo/ 'number of wins', etc.; and Tier 3 classifiers included /seki/ 'number of ships', /kumi/ 'group of', etc. Then, during training, we explicitly increased the count for the numeral-classifier sequences that had a relatively lower frequency count within the tier. This resulted in an equal distribution for the numeral-classifier combinations within the same tier. For example, if there are 1,000,000 occurrences of the numeral-yen sequence (i.e. 'n yen', where n is 0-10), all the Tier 1 classifiers will be counted as occurring 1,000,000 times.
In addition to the probability adjustment for the numeral-classifier combinations within a tier, we also smoothed the probability distribution for the different numerals (0-10) for a given classifier. Most, if not all, classifiers have significantly larger counts for their combination with /ichi/ 'one' compared to the other numerals. This would cause the pronunciation variant for the one-classifier combination (e.g. pai in ip-pai 'one cup of') to be so strong that it would inappropriately win out for the other numeral-classifier combinations (e.g. *ni-pai instead of the correct ni-hai 'two cups of'). Thus, we took the count for the one-classifier combination to be the base count for the rest of the numerals (0-10 except 1). For example, we explicitly added ni-hai 'two cups of', san-bai 'three cups of', ... jyup-pai 'ten cups of' for each occurrence of ip-pai 'one cup of' in the data.
Some classifiers do not take zero as a numeral (*zero-choome 'street address zero'), or take it with extremely low probability (?zero-hai 'zero cups'), so zero was discounted from the count of numeral-classifier combinations for these particular classifiers.
By incorporating this explicit n-gram extrapolation, it was possible to directly model the pronunciation variants for both numerals and classifiers in the n-gram, thereby resolving the problem in (9b).
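A rough sketch of the two count adjustments described above (the counts, classifier names, and tier contents are invented for illustration):

```python
# (i) Within-tier equalization: raise every classifier in a tier to the
# count of the tier's most frequent member, so low-frequency members are
# not drowned out in the LM.
tier1 = {"en": 1_000_000, "hon": 250_000, "mai": 400_000}
ceiling = max(tier1.values())
equalized = {classifier: ceiling for classifier in tier1}

# (ii) Numeral extrapolation: take the count of the 'one'-classifier form
# as the base count for the numerals 2-10, so a frequent variant such as
# ip-pai does not crowd out ni-hai, san-bai, etc.
def extrapolate(one_count, numerals=range(2, 11)):
    return {n: one_count for n in numerals}

counts_for_hai = extrapolate(50_000)        # 50,000 invented ip-pai tokens
print(equalized["hon"], counts_for_hai[3])  # 1000000 50000
```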
4. The Test
4.2. Results
We recorded the results of the accuracy tests for Versions 1 through 4, where Version 4 was the newest incarnation of our speech recognition engine. As the versions progressed, we incrementally added fixes to improve the accuracy of numeral-classifier combinations. The test result consists of the following two numbers per system: Word Accuracy Rate (WAR) and Numeral-Classifier combination Accuracy Rate (NCAR).
(11) Accuracy result measures
a. WAR (Word Accuracy Rate) = 100 - WER
b. WER (Word Error Rate) = 100 * (#Insertion Errors + #Deletion Errors + #Substitution Errors) / (#Words)
c. NCAR (Numeral-Classifier combination Accuracy Rate) = 100 * (#correct numeral-classifier combinations) / (#numeral-classifier combinations)
WAR is the rate obtained by subtracting the Word Error Rate (WER) from 100 (%). WER is based on how much the output string returned by the recognizer differs from the correct string for a given test set, and is calculated as 100 * (#Insertion Errors + #Deletion Errors + #Substitution Errors) / (#Words). For a given test set, WAR is the indicator of how likely the recognizer is to get the correct recognition results. For example, for the correct string 'I had three cups of coffee this morning', consisting of 8 words, if the hypothetical output of the recognizer was 'I hid three cups of cold feet morning' ('hid' is substituted for 'had', 'cold' is inserted, 'feet' is substituted for 'coffee', 'this' is deleted), then the WER is 100 * (1+1+2)/8 = 50%, and the WAR is 50% (100 - 50).
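A minimal sketch of the WER computation, using the standard minimum-edit-distance formulation (the example words below are invented; note that a particular hand alignment, like the one in the text above, may count more operations than the minimum):

```python
# Word-level minimum edit distance and WER as defined in (11b).
def edit_distance(ref, hyp):
    # Standard dynamic-programming Levenshtein distance over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
    return d[-1][-1]

def wer(ref, hyp):
    return 100.0 * edit_distance(ref, hyp) / len(ref)

ref = "san bon no enpitsu".split()  # invented 4-word reference
hyp = "san hon enpitsu".split()     # one substitution, one deletion
print(wer(ref, hyp))  # 50.0
```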
NCAR represents the accuracy specific to the numeral-classifier combinations in a given test set. For each numeral-classifier combination, I gave the value 1 if the output contained the correct numeral-classifier combination; otherwise, I gave 0. Note that insertion errors were not counted in the calculation of NCAR: as long as the output contained the correct string, it was counted as correct. Thus, the formula for NCAR is 100 * (#correct numeral-classifier combinations) / (#numeral-classifier combinations). For example, the previous hypothetical output contains the one correct combination 'three cups', so its NCAR is 100 * 1/1 = 100%.
(12) Accuracy test results (%)

             h-ini          non (rest)     Avr.
             WAR   NCAR     WAR   NCAR     WAR    NCAR
Version 1    77    23       77    24       77.0   23.3
Version 2    80    59       81    38       80.4   48.7
Version 3    82    67       90    76       85.8   71.3
Version 4    88    76       92    82       90.2   78.9
Baseline     80    47       84    64       81.8   55.2
The graph in (13) below shows the improvement more clearly. As is obvious from the graph, our Version 1 engine was performing very poorly, with lower accuracy rates for both WAR and NCAR than the baseline. As we incorporated the fixes progressively, version by version, gradually completing the adjustment of the frequency counts of numeral-classifier combinations, the performance of our engine improved dramatically for both WAR and NCAR. By Version 3, our engine outperformed the baseline engine for both WAR and NCAR. At Version 4, when our implementation of the probability adjustment for the numeral-classifier combinations was completed, we obtained the best results. Note that the improvement in NCAR did not hinder WAR but helped its improvement.
4.3. Summary
The test results reveal the successful improvement in the performance of the Japanese LVCSR engine with regard to the pronunciation variability of numeral-classifier combinations, achieved by making probability adjustments using the explicit n-gram extrapolation. The three problem areas identified earlier in (9) were resolved by 1) exhaustively listing pronunciation variants for numerals as well as for classifiers in the lexicon, and 2) manually adjusting the counts of numeral-classifier combinations to model in the n-grams. Not only did the explicit n-gram extrapolation resolve the context-dependent pronunciation variation, but it resolved the issue with free variation as well.
One of the things we did not cover in this study was the testing of zero-classifier instances. As I mentioned earlier, not all classifiers may be used with the numeral zero. Future research will need to test whether these exceptional cases are handled appropriately. Another remaining issue is that of numerals larger than 10. Increasing the frequency counts of numeral-classifier combinations for the numerals 0-10 may have adverse effects on instances where the numeral is larger than 10. If so, we will need to make the appropriate modifications to handle larger numerals. Further testing is required before the engine ends up in commercial products. The goals of my future research are to expand the test cases as well as to seek other ways of improving the performance of our Japanese LVCSR engine.
1. Introduction
Introductory textbooks of phonetics and pronunciation dictionaries of Japanese often state that close vowels (/i/ and /u/) are devoiced when they are both preceded and followed by voiceless consonants. This description quickly turns out to be incorrect when we look at real data. For one thing, close vowels are not always devoiced, even in the above-mentioned environment; in addition, close vowels followed by voiced consonants can be devoiced to some extent when they are preceded by voiceless consonants. Moreover, non-close vowels like /a/ are also devoiced occasionally.
These facts, which we will examine more closely in this paper, indicate that vowel devoicing is a probabilistic event: an event whose occurrence cannot be predicted with 100% accuracy. Vowel devoicing, accordingly, should be analyzed from a statistical perspective. With this perspective, phoneticians, including the first author of this paper, have in the past conducted statistical analyses of vowel devoicing in order to find out which factors determine the probability of vowel devoicing in a given phonological context.
The reported results, however, have not always coincided. For example, there is disagreement regarding the influence of the manner of articulation of the following consonant. Han (1962) claimed that close vowels followed by an affricate or fricative were more likely to be devoiced than those followed by a plosive, but Takeda and Kuwabara (1987) obtained exactly the opposite result. The latter study also reported that one of the devoicing rules proposed in NHK (1985), namely that a low-pitched mora in pre-pause position is likely to be devoiced, was almost useless in interpreting the devoicing patterns observed in a read-speech corpus.
There may be several possible reasons for such disagreements. First,
some descriptions of devoicing were based upon introspection. Generally
speaking, introspection alone is not an appropriate analysis method for a
probabilistic event like devoicing.
2. The data

designed mainly for the study of speech recognition and phonetics/linguistics (see Maekawa, Koiso, Furui and Isahara 2000 for the blueprint of the CSJ).
The whole body of the CSJ contains about 7.5 million words spoken by native speakers of so-called Standard, or Common, Japanese. This corresponds roughly to about 660 hours of speech. The main body of the corpus is monologue taken from two sources: academic presentation speech (APS) and simulated public speaking (SPS).

The APS is the live recording of academic presentations given at meetings of nine different academic societies, covering the humanities, natural sciences, and engineering. The SPS, on the other hand, is public speech on everyday topics, performed by recruited lay subjects in front of small audiences. The sex and age of the SPS speakers are roughly balanced.
The speech data was recorded using a head-worn directional microphone and a DAT recorder at a sampling frequency of 48 kHz with 16-bit precision. The speech data was then down-sampled to 16 kHz and stored on computer.
All recorded speech was transcribed and morphologically analyzed in terms of word boundaries and part-of-speech information. In addition to this tagging of the entire corpus, we have added extensive annotation of a number of linguistic features to a subset of the corpus; we call this subset the Core. The Core contains about 500,000 words, or about 45 hours of speech, all of which have been (sub-)phonemically segmented and labeled for intonation.1 The tag set used in the segmental labeling of the Core is shown in Table 1. The tag set is a mixture of phonemic and sub-phonemic labels. This inconsistency was a deliberate choice of ours, made to enrich the value of the Core as a resource for the study of phonetic variation. When this segment label information is coupled with the X-JToBI intonation labels that we developed for the CSJ (Maekawa, Kikuchi, Igarashi and Venditti 2002), the Core becomes an excellent resource for the phonetic study of spontaneous speech.
The segment labeling of the Core was performed in three steps. First, the initial labels were generated from the transcription text and aligned automatically to the speech signal using a Hidden Markov Model based speech recognition toolkit (Young et al. 1999). The accuracy of automatic alignment in terms of phoneme boundary location, averaged over all phonemes, is currently 3.84 ms (mean) with a standard deviation of 21 ms (Kikuchi and Maekawa 2002).
Table 1. Tag set used in the segmental labeling of the Core

Vowels: a, i, u, e, o (voiced); A, I, U, E, O (devoiced)
Plain consonants: k, g, G[F], @[], s, z, t, c[ts], d, n, h, F, b, p, m, r[R], w, y
Phonetically palatalized consonants: kj, gj, Gj, @j, sj[S], zj[Z], cj[tS], nj[], hj[]
Phonologically palatalized consonants (youon): ky, gy, Gy, @y, sy, zy, cy, ny, hy, by, py, my, ry
Moraic phonemes: H (long vowel), Q (geminate, sokuon), N (moraic nasal, hatsuon)
[Table: token counts of voiced and devoiced vowels and devoicing rates [%] for /a, e, i, o, u/ in two conditions. In the first condition the devoicing rate is far higher for the close vowels /i/ (19.72%) and /u/ (31.41%) than for the non-close vowels /a/ (1.09%), /e/ (1.29%), and /o/ (1.28%); in the second condition the rate is below 0.5% for all five vowels.]
4.

[Tables: voiced and devoiced token counts of close vowels cross-classified by the voicing of C1 and C2 (Co = voiceless, Cv = voiced), and counts with devoicing rates [%] of /i/ and /u/ in the /CoVcCo/ environment classified by the identity of C1 and C2 (c, h, k, p, s, t, and the geminate Q).]
Table 6. Devoicing rate [%] of /i/ in the /CoVcCo/ environment classified by the manner of C1 and C2

                         C2
C1           Affricate   Fricative   Stop    All
Affricate    81.1        33.3        89.4    78.3
Fricative    96.3        38.1        98.4    94.6
Stop         80.2        51.5        89.3    77.3
All          91.0        47.7        43.8

Table 7. Devoicing rate [%] of /u/ in the /CoVcCo/ environment classified by the manner of C1 and C2

                         C2
C1           Affricate   Fricative   Stop    All
Affricate    77.2        48.1        94.5    83.6
Fricative    95.1        61.2        97.5    93.5
Stop         80.8        74.0        80.1    77.1
All          84.4        68.8        35.9
These tables show several interesting tendencies. First, the devoicing rate was highest when a fricative C1 was followed by a stop C2, in both tables, and the second highest devoicing rate was observed when a fricative C1 was followed by an affricate C2, in both tables. In contrast, the devoicing rate was lowest when an affricate C1 was followed by a fricative C2, and the second lowest rate was observed when a fricative C1 was followed by a fricative C2, in both tables. Also, it is worth noting that, in terms of the marginal distribution, the highest devoicing rate was observed when C2 was a stop, and the lowest when C2 was a fricative.

These facts show clearly that there is an interaction between the manners of articulation of C1 and C2. A two-way ANOVA of the manners of C1 and C2, applied to data pooled over /i/ and /u/, showed that the main effects of C1 and C2 and their interaction were all significant (for C1, DF = 2, F = 44.38, p < 0.0001; for C2, DF = 2, F = 1959.43, p < 0.0001; for C1 * C2, DF = 4, F = 263.24, p < 0.0001). The phonetic interpretation of the manner interaction will be discussed in Section 5.1 below.
In the calculation of Tables 6 and 7, samples in which C2 was a geminate /Q/ were omitted, because the manner of /Q/ per se is not specified from a phonological point of view, and because a following geminate seemed to constitute a special environment for devoicing, as shown below.
Table 8 compares devoicing rates of close vowels (pooled over /i/ and /u/) in cases where C2 was and was not a geminate. The table shows that the devoicing rate was lower when C2 was a geminate, regardless of the manner of C1 (DF = 758, t = 24.84, p < 0.0001, unequal variance). Further analysis revealed that the devoicing rate was highest for the combination of a fricative C1 and a stop geminate (namely, a geminate followed by a stop), and lowest for the combination of a fricative C1 and a fricative geminate (namely, a geminate followed by a fricative). These results show the same tendency as observed in Tables 6 and 7.
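The percentages in Table 8 are straightforward proportions of devoiced tokens; as a quick sketch, using the fricative-C1 counts quoted from the table:

```python
# Devoicing rate = devoiced tokens as a percentage of all tokens.
def devoicing_rate(voiced, devoiced):
    return 100.0 * devoiced / (voiced + devoiced)

print(round(devoicing_rate(860, 14_099), 1))  # 94.3: fricative C1, C2 not /Q/
print(round(devoicing_rate(112, 181), 1))     # 61.8: fricative C1, C2 = /Q/
```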
Table 8. Effect of the following geminate on devoicing rate: pooled data of /i/ and /u/

             C2 non-/Q/                       C2 /Q/
C1           Voiced   Devoiced   % Dev.       Voiced   Devoiced   % Dev.
Affricate    454      2,021      81.7         29       49         62.8
Fricative    860      14,099     94.3         112      181        61.8
Stop         1,464    4,954      77.2         282      87         23.6
                          SECOND VOWEL
FIRST VOWEL      VOICED        DEVOICED
VOICED           17            44
DEVOICED         171           84
Figure 1 compares the devoicing rates of the first and second vowels in the consecutive devoicing environment. Its abscissa represents the combination of the manners of C1 and C2, sorted in descending order of the observed devoicing rate of the first vowel. The letters A, F, and S stand for affricate, fricative, and stop, respectively, and are combined in the order C1/C2. This figure shows that the two devoicing rates were, by and large, inversely proportional, reflecting a 'one or the other' relationship between the two vowels.2 The graph also shows that, when a fricative was combined with an affricate or stop, it was always the vowel associated with (i.e., in the same mora as) the fricative that showed the higher devoicing rate; when both consonants were fricatives, it was the second vowel that showed the higher devoicing rate.
[Table: devoicing rates [%] of vowels before voiced consonants, classified by the manner of C1 (affricate, fricative, stop) and the manner of C2 (including nasal, stop, affricate); cell values range from about 2.6 to 65.6.]
Table 12. Devoicing rate [%] of /a/ in the /CoVncCo/ environment by the manner of C1 and C2
[Individual cell values, all between 0.0 and 2.5 (N in parentheses), not recoverable in this copy.]
Table 13. Devoicing rate [%] of /e/ in the /CoVncCo/ environment by the manner of C1 and C2 (N in parentheses)

                          C2
C1           Affricate    Fricative     Stop           Geminate     All
Affricate    (1)          (0)           (0)            (7)          0.0
Fricative    0.9 (333)    2.2 (45)      7.5 (388)      0.0 (132)    3.7
Stop         0.6 (176)    4.4 (1,083)   3.4 (2,925)    1.2 (650)    3.2
All          0.8          4.3           3.9            1.0
Table 14. Devoicing rate [%] of /o/ in the /CoVncCo/ environment by the manner of C1 and C2
[Individual cell values (N in parentheses, up to 7,208 tokens) not recoverable in this copy; as with /a/ and /e/, rates remain in the range of roughly 1-4% across manner combinations.]
In Tables 12-14, the devoicing rate stayed nearly the same regardless of the combination of consonant manners, and it is this very fact that characterizes the devoicing of non-close vowels. Devoicing in the /CoVncCo/ environment is special in that the manners of the adjacent consonants do not play a crucial role in the prediction of devoicing rates. But this does not mean that devoicing of non-close vowels was completely free from phonological conditioning. There is at least one phonological factor that influences the devoicing rate of /CoVncCo/ vowels: consecutive identical morae, or the repetition of the same mora.

Sakuma (1929) noted that in words like /kokoro/ 'mind' and /haha/ 'mother', the vowel in the first mora could be devoiced. Table 15 summarizes the devoicing rate of the first vowels of 1,260 samples that contain consecutive identical morae in the /CoVncCo/ environment. The devoicing rates of /a/ and /o/ shown in the table were higher than the overall devoicing rates shown in Tables 12 and 14.
Table 15. Devoicing of the first vowel of two identical morae in the /CoVncCo/ environment

VOWEL   VOICED   DEVOICED   % DEVOICED
a       458      54         10.5
e       112      5          4.3
o       490      141        22.3
Figure 3. Effect of speaking rate on devoicing rate in the /CoVncCo/ environment

VOWEL        N        VOICED   DEVOICED   % DEVOICED
a      0     12,184   11,940   244        2.00
       1     292      274      18         6.16
e      0     5,616    5,434    182        3.24
       1     124      116      8          6.45
o      0     12,396   11,977   419        3.38
       1     288      270      18         6.25
5. Discussion
C1/C2         F/A    F/S    A/S    S/A    A/A    S/F    F/F    S/S    A/F
V1            98.3   95.3   94.4   91.6   75.0   62.5   42.3   39.2   22.2
/CoVcCo/      95.9   98.1   92.6   80.7   79.3   68.6   48.9   84.4   43.1
        APS                   SPS
        N         % DEV       N         % DEV
        11,028    85.9        13,570    87.8
        10,943    18.4        16,816    19.9
        12,215    2.5         18,685    3.1
6. Concluding remarks
The use of a spontaneous speech corpus has proven effective in the analysis of vowel devoicing. The data presented here is one of the most reliable resources for the study of vowel devoicing, both in its quality and in its quantity. Full coverage of the many C1-C2 manner combinations would have been impossible if the amount of data had been substantially smaller than the current data set. Needless to say, however, the current data set is still not large enough for a complete analysis of statistically complex phenomena like the consecutive devoicing discussed in Section 4.1.2. More reliable conclusions will be achieved once we have access to the entire CSJ-Core, whose data size is more than twice that of the current data.

Most of the analyses done in this paper are linguistic analyses in the sense that phonological environments were used as the factors conditioning vowel devoicing. Yet, as suggested in the analysis of non-close vowel devoicing, it is obvious that extra-linguistic factors also played a certain role. Extensive analysis of extra-linguistic factors and the integration of linguistic and extra-linguistic factors are important steps towards a full understanding of the vowel devoicing phenomenon. Lastly, the intonation labeling of the CSJ-Core will make it possible to examine the effect of prosodic conditioning such as pitch accent. All of these analyses should be the focus of future study.
Acknowledgments
The authors are grateful to all the speakers in the Corpus of Spontaneous Japanese. Our gratitude also goes to Professor Hisao Kuwabara of Teikyo Science University, who sent us his paper upon our request, and to Dr. Jennifer Venditti, whose comments on an earlier version of this paper helped us greatly.
Notes
1. The Core is also labeled with other research information, such as clause boundaries, discourse segmentation, and dependency structure, but this information is not relevant to the current paper. Visit the following URL for more information about the CSJ: http://www2.kokken.go.jp/~csj/public/index.html
1. Introduction
Vowel devoicing is a common phonological process in many languages and typically involves high vowels and schwa. High vowels and schwa are inherently short (Bell 1978; Dauer 1980), and the process usually occurs when the vowels are either adjacent to, or surrounded by, voiceless consonants, during which the glottis is fully open. Vowel devoicing is thought to be a consequence of articulatory undershoot of glottal movements; this suggests that vowel devoicing processes are the result of glottal gestural overlap between voiceless consonants and short vowels. The movements of the glottal muscles for the short high vowels /i/ and /u/ blend with those of the adjacent voiceless sounds or a pause (Jun 1993; Jun and Beckman 1994). In many languages, the process is also considered to be part of the vowel neutralization and reduction processes in which vowels are first reduced in duration and centralized in quality, typically in unaccented position, and then eventually devoiced and/or deleted in fast or casual speech (Hyman 1975; Wheeler 1979; Dauer 1980; Kohler 1990).
The Japanese high vowels /i/ and /u/ also become voiceless when surrounded by voiceless consonants, or when preceded by a voiceless consonant and followed by a pause: i.e. /C̥VC̥/ or /C̥V#/ (where the Vs are [+high]). However, in Japanese the vowel devoicing processes do not involve apparent centralization of vowels. There is no obvious durational reduction of vowels in unaccented positions in Japanese, nor does vowel quality depend on accentuation. Nevertheless, the vowel devoicing process is very common in many Japanese dialects, especially in eastern dialects including Standard Japanese, and it occurs even in slow or formal speech (Kondo 1997). This suggests that Japanese high vowel devoicing is not merely an optional process in fast or casual speech, but is also a phonologically controlled process.
Syllable structure and its acoustic effects on vowels in devoicing environments
was 83.93% (SD 12.97). However, the durations of devoiced morae were significantly longer than the corresponding consonant portions of /CV/ morae [t(44) = 13.62, p < .001].
Figure 2. Average closure duration and the duration after release of stops and stop
part of affricates in CV morae and devoiced morae
The closure durations of stops and of the stop part of affricates in devoiced morae were compared with the closure durations of prevocalic stops and the stop part of prevocalic affricates in /CV/ morae. The average closure duration of stops and affricates in devoiced morae was not significantly different from that in /CV/ morae, as shown in Figure 2. However, the average duration of stops and affricates in devoiced morae excluding closure duration (i.e. after the release of the stop closure) was significantly shorter than that of /CV/ morae [t(31) = 7.12, p < .005]. This means that vowel devoicing reduces the duration of the devoiced vowel but does not affect the duration of the consonant closure.
a. /ta,i.sjo.ku. te.a.te/    [taiokteate]      'retirement allowance'
   /ka.mo.tu. se,N.pa.ku/    [kamotssempak]    'cargo boats'
   /ta.ka.sa.ki. si.mi,N/    [takasakiimii)]   'the Takasaki citizens'
b. /hu.ku.sjo.ku. ke,N.sa/   [kokkensa]        'dress inspection'
   /sjo.ku.hi.se.tu.ja.ku/   [okCisetsjak]     'a cut in food expenses'
   /ha,i.sju.tu.ki.dju,N/    [haitskid)]       'exhaust limit'
(Here dots /./ denote syllable boundaries, and commas /,/ denote mora boundaries.)
The total number of devoiceable vowels in single devoicing sites in the test words was 6 (as some of the test words contain more than one single site), and the number in consecutive devoicing sites was also 6, yielding 324 devoiceable vowels ([6 vowels + 6 vowels] x 3 rates x 3 repetitions x 3 speakers = 324 devoiceable vowels).
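The token arithmetic above can be checked directly:

```python
# (6 + 6) devoiceable vowels per reading, across 3 speaking rates,
# 3 repetitions, and 3 speakers.
tokens = (6 + 6) * 3 * 3 * 3
print(tokens)  # 324
```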
The average intensities of voiced vowels in devoicing and non-devoicing environments, excluding word-initial and word-final morae, were calculated for the three tempi and individual subjects, and were compared using a t-test. Since speaking tempo was effective only in the single devoicing sites and not in the consecutive sites, the vowel intensities were compared by their devoicing environments using a t-test. When the devoiceable vowels remained voiced in the single devoicing condition, their intensities were significantly lower than those of non-devoiceable vowels at all speaking tempi for all subjects (Table 1). One of the subjects (A) devoiced all devoiceable vowels at the normal tempo and only once voiced the underlined devoiceable vowel /u/ in /hukusjokukeNsa/ at the slow rate. Therefore, no comparison was made of subject A's data for these two tempi. This result was expected because when a vowel was voiced in a single devoicing environment, it was often only partially voiced, i.e. the duration of the vowel tended to be shorter and its intensity lower, which sometimes made it difficult to judge whether a vowel was actually voiced or voiceless.
Table 1. T-test results of intensity differences between voiced vowels in single devoicing and non-devoicing environments (one-tailed)

Subject   Tempo         Average intensity of      Average intensity of          df     p-value
                        devoiceable vowels (dB)   non-devoiceable vowels (dB)
A         fast          67.07                     77.34                                p < 0.005
          comfortable   N/A                       N/A                           N/A    N/A
          slow          (59.57)                   (76.56)                       N/A    N/A
B         fast          68.69                     75.14                                p < 0.025
          comfortable   72.63                     77.64                                p < 0.05
          slow          69.10                     72.45                         11     p < 0.001
C         fast          69.69                     76.27                                p < 0.025
          comfortable   71.10                     74.85                                p < 0.05
          slow          70.32                     72.93                                p < 0.001

[Corresponding comparisons in the consecutive devoicing environment]

Subject   Tempo         Average intensity of      Average intensity of          df     p-value
                        devoiceable vowels (dB)   non-devoiceable vowels (dB)
A         fast          70.20                     78.25                                p < 0.025
          comfortable   72.37                     75.60                                p < 0.005
          slow          70.15                     76.12                                p < 0.005
B         fast          71.57                     73.83                                n.s.
          comfortable   76.12                     78.47                         13     p < 0.001
          slow          72.03                     72.63                         11     n.s.
C         fast          74.71                     74.63                         10     n.s.
          comfortable   74.46                     74.79                         11     n.s.
          slow          71.76                     72.26                         12     n.s.
Figure 3. The average intensity ratios of three speakers between voiced devoiceable vowels and non-devoiceable vowels in single and consecutive devoicing environments at three tempi
front vowels. In Japanese, the F1 and F2 of the vowels [i], [e] and [ɯ] are relatively far apart, while [a] and [o] have relatively close F1 and F2. In other words, the intensities of [i], [e] and [ɯ] are generally lower than those of [a] and [o]. In this experiment, all devoiceable vowels were either [i] or [ɯ], with an inherently weak intensity. This may have lowered the average intensity ratios of devoiceable vowels against non-devoiceable vowels, which are inherently greater in intensity.
It was extremely difficult to find ideal test words for comparing intensities in both devoicing and non-devoicing environments, and therefore the type of vowel tested was not always identical. Although there were differences between the intensities of voiced vowels in the devoicing and non-devoicing environments, these might simply have been due to the different types of vowels in the two environments: under equal conditions, high vowels have intrinsically lower intensities than non-high vowels, and all vowels in the devoicing environment are high vowels. However, the following patterns were noted: (a) more intensity weakening at all tempi in the single devoicing environment than in the consecutive environment, (b) the greatest intensity weakening at the fast tempo and the least at the slow tempo in the single devoicing environment, and (c) no tempo effect on intensity in the consecutive devoicing environment.
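The one-tailed comparisons reported in Table 1 can be sketched numerically. Below is a minimal Welch's two-sample t statistic in Python; this is an illustration only, not the chapter's actual procedure, and the per-token dB values are invented, not the experiment's data:

```python
import math
import statistics as st

def welch_t(a, b):
    """Welch's two-sample t statistic and degrees of freedom.

    The returned t is then compared against a one-tailed critical
    value for the returned df (e.g. from a t table)."""
    va, vb = st.variance(a), st.variance(b)   # sample variances
    na, nb = len(a), len(b)
    se2 = va / na + vb / nb                   # squared standard error
    t = (st.mean(a) - st.mean(b)) / math.sqrt(se2)
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df

# Hypothetical per-token intensities (dB) for one speaker at one tempo:
devoiceable     = [67.2, 66.5, 68.0, 66.9, 67.4]
non_devoiceable = [77.0, 77.8, 76.9, 77.5, 77.2]
t, df = welch_t(devoiceable, non_devoiceable)
```

A strongly negative t here corresponds to the devoiceable vowels being significantly weaker than the non-devoiceable ones.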
(2b)–(2c): CV- and mora-tier syllable structure diagrams of a [kç ... kaN] sequence with a devoiced first vowel (garbled in extraction).

(3c) [kçta]: syllable structure diagram (garbled in extraction).
(4b) Demoraification of /kç/ and (4c) Resyllabification: CV- and mora-tier diagrams of [sjokçkaN] (garbled in extraction).

[CV- and mora-tier diagrams of /dookutu/ with [kx]: garbled in extraction.]
[Further diagrams of /dookutu/: garbled in extraction.]

(6a) [ku̥tsuɕi̥ta] vs. (6b) *[ku̥tsu̥ɕi̥ta]: syllable structure diagrams of acceptable alternate devoicing versus unacceptable triple devoicing (garbled in extraction).
In the word kutsushita /kutusita/ 'sock(s)', where all the underlined vowels are devoiceable, the consonants preceding devoiceable vowels are the stop [k], the affricate [ts] and the fricative [ɕ]. The pronunciation [ku̥tsuɕi̥ta], with the first and third vowels devoiced and the second vowel voiced, is the most common. This process can also be explained in relation to syllable structure. As shown in (6a) and (6b), when the first vowel /u/ in /ku/ becomes voiceless, the preceding /kx/ is desyllabified, becomes non-moraic, and is then syllabified to the onset of the following syllable. The third vowel /i/ also becomes voiceless, and the preceding /s/ [ɕ] is likewise syllabified to the onset of the following syllable. The process creates sequences of the less common syllables /CCV/+/CCV/, but this is still better than devoicing in three consecutive morae. Triple devoicing is not acceptable, as shown in (6b).
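The alternating pattern just described can be caricatured as a ban on devoicing in consecutive morae. The following is a toy sketch of a greedy left-to-right pass; it is my own simplification for illustration, not the chapter's syllable-structural account:

```python
def devoice_pattern(devoiceable):
    """Devoice each devoiceable mora unless the immediately preceding
    mora was just devoiced, so no two consecutive morae end up
    voiceless."""
    out, prev = [], False
    for d in devoiceable:
        fire = d and not prev
        out.append(fire)
        prev = fire
    return out

# kutsushita /ku-tu-si-ta/: the first three morae are devoiceable.
pattern = devoice_pattern([True, True, True, False])
# → first and third morae devoice; the second stays voiced
```

Applied to /kutusita/, the pass yields exactly the attested [ku̥tsuɕi̥ta] pattern, with the second devoiceable vowel remaining voiced.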
4. Conclusions
Vowel devoicing is fundamentally a phonetic process that economizes glottal
movements of a short high vowel and its surrounding voiceless consonants.
However, Japanese vowel devoicing processes are also affected by various
phonological factors, especially the syllable structure. The experimental
results showed that high vowels in the single devoicing sites were almost
always devoiced but not all devoiceable vowels became voiceless in the
consecutive devoicing sites. Vowels in typical devoicing sites became voiceless only when the consonants preceding the devoiced vowels could be syllabified to their adjacent syllables. Moreover, voiced high vowels in
typical devoicing environments were often not fully voiced and were reduced in duration. These voiced devoiceable vowels were not only shorter
but also had less intensity. This means that it is more natural for high vowels
to undergo the devoicing process between voiceless sounds. Therefore when
they did remain voiced they were acoustically shorter and weaker than when
they occurred in the non-devoicing environment.
The results also suggest that vowel devoicing is part of a vowel weakening process whose final state is a completely voiceless or, in the extreme case, deleted vowel. Vowel weakening in Japanese affects vowel intensity and duration, but the quality of the vowels remains relatively unchanged regardless of the intensity level of the vowel. Two different mechanisms, namely phonetic and phonological processes, appear to control Japanese vowel devoicing. The vowel devoicing environment must satisfy certain phonetic conditions, namely short high vowels surrounded by voiceless consonants.
Acknowledgement
The author would like to thank an anonymous reviewer for useful comments
and suggestions.
Notes
1. The presence of one devoiced vowel in a word did not affect the duration of a
whole word. Despite shorter duration of devoiced morae, the whole durations
of words did not show a significant difference from words of the same number
of morae without devoiced vowels. However, when there is more than one devoiced vowel in a word, the whole word duration becomes significantly shorter. See Kondo (2003) for details.
2. The mora tier usually represents an alternative rather than an addition to the CV tier, and onset consonants are attached directly to the syllable node as they are nonmoraic (Hayes 1989; Kenstowicz 1994). However, I use a separate CV tier in order to present clearly the formation of moraic consonants and the resulting syllable structures.
3. Pseudo-phonemic transcriptions are used to describe devoiced vowels and the resulting moraic consonants for convenience. The examples are the devoiced vowels /i̥/ and /u̥/, and the allophones of consonants: /ɕ/ instead of /s/ (in /si/ and /sju/), /ç/ and /ɸ/ instead of /h/ (in /hi/ and /hu/), /tɕ/ instead of /t/ (in /ti/) and /ts/ instead of /t/ (in /tu/). /kç/ was also used to indicate palatalization of /k/ and its release into a palatal fricative [ç] in /ki/, and [kx] for /k/ in /ku/ to indicate the backness of the /k/ and its release into a velar fricative [x].
4. For the arguments concerning vowel devoicing and the loss of syllabicity, see
Kondo (2001).
Miyoko Sugito

1. Introduction

2.

3. Experimental procedures

4.
Table 1. Durations (ms) of /kusa/ (HL) and of its second vowel /a/ in sentence frame (A), for speakers (1) MN (1930), (2) KK (1963) and (3) TA (1979), by speech rate (natural, fast, very fast), voicing of the first vowel (+V/−V) and type of accent (HL/HH*): number of tokens and mean (SD) durations. [Cell values scrambled in extraction.]
Table 2. Durations (ms) of /kusa/ (HL) and of its second vowel /a/ in sentence frame (B), with the same speakers, rates and columns as Table 1. [Cell values scrambled in extraction.]
Table 2 shows the data for sentence frame (B) Tsugi-wa (HLL) kusa
(HL). Looking at the table, we see that MN and TA devoiced and accented
/u/ in /kusa/ more often in sentence frame (B) than in sentence frame (A).
KK devoiced and accented all words. For TA, both voiced and devoiced
vowels were found in both sentence frames; however, accent changes occurred more often in (B) than in (A).
The results of the acoustic analysis may be summarized as follows: (1)
Individual differences were observed. Speaker MN tended to produce the
first mora voiced and accented. However, for the younger speakers, KK
devoiced all the first mora vowels, while TA usually, but not always, devoiced and accented them. (2) Speech rate affected vowel voicing and accentedness. In fast speech, devoiced, accented vowels were observed more
often than in natural speech. At a very fast speech rate, not only was the
first mora vowel devoiced, but also the word accent tended to change to
HH. (3) The sentence frame affected the accent patterns. Accent change
was more often observed in frame (B) (accent HLL) than in (A) (accent
HHH). A reason may be that when they spoke at a fast rate, it was more
difficult for speakers to make the necessary laryngeal adjustments to raise
the pitch for the accent HL immediately after the falling tones of Tsugi-wa
(HLL).
vowels of /kusa/ in (5) and (6) are very short and their F0 contours have
nearly level tones.
Figure 1. Speech waves and F0 contours of /kusa/ (HL) 'grass' in sentence frames (A) Kore-wa kusa. (HHH HL) 'This is grass' (1)(3)(5) and (B) Tsugi-wa kusa. (HLL HL) 'The next is grass' (2)(4)(6). (1)(2): the first vowels voiced, accented. (3)(4): with devoiced accented vowels (natural speech). (5)(6): with accent perceived as HH (very fast speech). Broken lines: the beginning time points of the word kusa (speaker: TA).
An acoustic comparison of /kusa/ (HL) and /huta/ with accent HH also provides evidence to support the shift of /kusa/ from HL to HH in very fast
speech.
Figure 2. Speech waves and F0 contours of /huta/ (HH) 'lid' in sentence frames (A) Kore-wa huta. (HHH HH) 'This is a lid' (1)(3)(5) and (B) Tsugi-wa huta. (HLL HH) 'The next is a lid' (2)(4)(6). (1)(2): the first vowels voiced. (3)(4): the first vowels devoiced (natural speech). (5)(6): the first vowels devoiced (very fast speech). Broken lines: the beginning time points of the word huta (speaker: TA).
Figure 2 shows the F0 contours of the words /huta/ with HH accent in the
sentence frames (A) and (B). The tokens in (1)(4) were spoken at a natural
rate, and those in (5)(6) at a very fast rate by speaker TA. The F0 contours
of (1) and (2) show that the first mora vowels are voiced, as indicated by
white arrows. The first mora vowels of (3) and (4) are devoiced. The F0
contours of the second mora vowels are almost level. The second vowels of
/huta/ in (5) and (6) of Figure 2 are similar to those of /kusa/ in (5) and (6) in
Figure 1. All of them had level F0 contours, short durations, and were perceived as having HH accent.
5.
vowel sequence) and /kusa/ (with a close-open vowel sequence) have different F0 contours. Figure 4 shows twelve superimposed F0 contours of
/kusi/ and /kusa/, respectively. Dotted lines were interpolated through the
period of /s/ from the end of V1 to the start of V2. Speaker YI spoke at a
natural speech rate during the physiological experiment. Here, the F0 contours of (1) /kusi/ and (2) /kusa/ are quite different from each other. In /kusi/ (1), the F0 contour begins to fall in the vicinity of the end of the first vowel, while in /kusa/ (2) it begins to fall at the beginning of the second vowel, as indicated by the black arrows. The initial vowel in /kusi/ is
voiced, while that in /kusa/ is devoiced (except in one token whose second
vowel starts a little lower compared with the other falling contours).
Figure 4. Superimposed F0 contours of (1) /kusi/ (HL) and (2) /kusa/ (HL), twelve utterances each. Dotted lines: interpolated through the period of /s/ from the end of V1 to the start of V2. Arrows: the starting time points of falling F0 contours (speaker: YI).
(1) /kusi/ with accent HL: The F0 contour of the first vowel of /kusi/ is high
while the second vowel starts with a relatively low frequency. CT activity
begins prior to the onset of the first vowel, which presumably accounts for
the high F0 of the first vowel. During the first vowel, only the CT is active
and SH activity is almost absent. SH activity begins prior to the onset of the
second vowel. The end of CT activity and the beginning of SH activity
occur at the same time at the end of the first vowel, as indicated by the broken vertical line. Activity of SH is associated with the low F0 of the second
vowel.
(2) /kusa/ with accent HL: This figure shows the F0 contour, the CT, and
SH pattern of the word with devoiced accented vowel. The F0 contour of
the vowel following the devoiced mora starts high and then drops sharply.
It is notable that the CT peak (as indicated by the white arrow) is observed
at the time it would occur if the first vowel were voiced; the same time
point as observed in /kusi/. This suggests that the command for raising F0
was input for the first vowel, even though it was devoiced. An additional
peak of CT activity (where the small black arrow points) is also observed in
/kusa/. Notice that the second CT peak begins before the initial high
starting F0 of the second vowel /a/. In /kusi/, the onset of the second vowel
/i/ has a low F0, and correspondingly, there is no second CT activity associated with the second vowel /i/. As for /kusa/, co-occurring with the second
CT activity, there is also onset of SH activity. The SH activity is associated
with the F0 fall on the second vowel. All eleven utterances of words /kusa/
showed a similar pattern.
no SH activity associated with these second vowels. The vowels that follow
devoiced accented vowels need to have adequate length in order to allow
for a falling F0 contour to occur.
6. Summary
This paper examined the acoustic and physiological characteristics of voicing
and accent changes in Osaka dialect words at different speech rates. Individual differences were observed. When speakers spoke at a relatively fast
rate, devoiced, accented vowels were produced more frequently. Moreover,
at a very fast rate, the HL accent was often changed to an HH accent. Laryngeal activities for the devoiced, accented vowels in /kusa/ were compared
with those for the voiced accented vowels in /kusi/. CT activity in devoiced
accented /kusa/ was found to occur at the time it would have occurred if the
vowels were voiced. This observation strongly suggests that the devoiced
vowels were not only perceived as accented, but were also produced as
accented.
With regard to the laryngeal activity for vowels in the second mora of the
words, the second peak of CT activity was associated with a high starting
F0, and the following SH activity with a fall in F0. These joint activities
may be involved in the resulting steep falling F0 contour following the devoiced accented vowels. The accent change found in very fast speech might
be due to the short duration of the second vowels. That is, we might conjecture that since the vowels were short, there was no time for the SH to
become active, and consequently, no F0 fall occurred on these short vowels
spoken in very fast speech. We hope that additional physiological experiments with natural, fast, and very fast speech, using MRI, will provide
further insight into how this accent change occurs.
Acknowledgements
The author would like to express her gratitude to Professors Donna Erickson,
Raymond Weitzman, and Jeroen van de Weijer who kindly provided comments on this paper.
1. Introduction
This study is devoted to rethinking the function and interaction of voicing and accent from the perspective of prominence, and to tackling phonologically significant issues concerning their interaction. Our ultimate goal is to shed new light on their interaction within a general theory of prominence that involves the
harmonic scale of accent, tone, sonority, and voicing and to solve certain
problems observed in the accentual phenomena on devoiced vowels of
Japanese. Specifically, we are concerned with the issues of what happens
when a vowel that should bear accent is exactly in the position that should
be devoiced. This situation causes various problems because accent and
devoicing are incompatible in principle but turn out to be sometimes compatible in the phonological grammar of Japanese.
Let us review the historical background and motivation of our study.
There have been many phonetic studies on vowel devoicing and its relation
to accent in Japanese, and some researchers in this field are contributing
their recent findings to the vowel voicing part of the present book (Sugito,
this volume). Compared to the abundance of phonetic literature on this topic,
little attention has been paid to a phonological account of what happens
when an accented vowel is devoiced. Yet we can find some theoretical
work in the metrical framework, such as Yamada (1990), Haraguchi
(1991), Tanaka (1992), and Yokotani (1997), which agree, on the basis of
the descriptive literature (NHK, ed. 1998; Akinaga, ed. 2001), that accent
can either remain on a devoiced vowel or shift to an adjacent vowel. However, the optionality and directionality of accent shift are so complicated
that derivational analyses such as those above are problematic in their empirical coverage and do not explain the cases of accent shift beyond metrical
constituents, as Yokotani (1997) and Tanaka (2002a) point out. Derivational accounts also pose the fundamental question as to how accent shift
2.

(1) The phonetic resources of the prominence elements (least prominent → most prominent):

    voicing:       vibration
    sonority:      vibration, aperture
    tone, length:  vibration, aperture, pitch/duration
    accent:        vibration, aperture, pitch/duration, intensity (loudness)
guages lies in whether pitch gets phonologized as tone or remains somewhat phonetic as tune in the realization of pitch contour). Third, the chart in
(1) accounts for what Hayes (1995: 7) calls the parasitic nature of stress,
which refers to the fact that stress parasitically invokes phonetic resources
that serve other phonological ends. This point is clear from (1), because in
accent or stress, all the articulatory resources are put together to realize
loudness in speech production.
The phonetic characterization of prominence elements in (1) can phonologically be represented as the Harmonic Scale of Prominence in (2),
where A ⊃ B indicates that B is a proper subset of A, and A > B means that A is more prominent than B:
(2)
(2a) shows a harmonically-complete system of prominence where an element always implies the existence of any element(s) to the left. Especially,
accent presupposes the existence of all the elements in the scale, so accent
often interacts with tone, sonority, and voicing in realizing prominence, as
will be discussed below. Note here that length is not incorporated into this
phonological system. This is because an accented vowel is indeed phonetically longer than an unaccented one but is not necessarily a long vowel in
phonology, although a long vowel tends to attract accent in quantity-sensitive languages. The aspect of quantity sensitivity is captured by the constraint of WEIGHT-TO-STRESS or more strictly speaking, PEAK-PROMINENCE
(Prince & Smolensky 1993) and syllable quantity is a different concept
from syllable prominence (Hayes 1995: 270–273). So, in what follows, we
will just consider syllable prominence in line with the scale in (2) and exclude syllable quantity (i.e., length) from our discussion.1
Now let us look at the interplay of accent with the other prominence
elements. As shown in (2), accent implies tone, sonority, and voicing, and
there is a good possibility that accent has an effect on, or is influenced by,
these elements in pitch-accent and stress-accent languages, where accent
works together with the other prominence elements in order to enhance syllable prominence or the culminativity of a word. It is a kind of conspiracy
effect of prominence elements. Such cases are classified into accent-conditioned prominence and prominence-conditioned accent (Tanaka 2005): in
the former case, the behavior of tone, sonority, and voicing is sensitive to
(3) a. Accent-conditioned tone (Japanese): kamakiri 'mantis', haburasi 'toothbrush' [tone-association diagrams garbled in extraction]
    b. Tone-conditioned accent (Lithuanian, from Halle & Vergnaud 1987): viras 'man', Vislas 'Vistula', vinas 'wine', viksmas 'course' [acute (HL) / circumflex (LH) associations garbled in extraction]
The tone patterns in Japanese are derived by linking accent to H, and then
the preceding and following moras are assigned H and L, respectively, with
the proviso that the unaccented initial mora is always L (Haraguchi 1991).
On the other hand, a long vowel in Lithuanian may either have acute (HL)
or circumflex (LH) tone, and accent falls on the first mora linked to H
(Hayes 1995). In both languages, accent and tone cooperate to highlight
prominence in a word. This conspiracy effect is also seen in (4), where accent and sonority agree in prominent position:
(4)
(4a) shows that accent loss causes vowel reduction and, conversely, accent
acquisition turns schwas into full vowels; that is, sonority is based on accent
placement.3 (4b) is the opposite case, where accent placement is based on
vowel sonority. Winnebago, a Siouan language, has the sonority hierarchy
of a > o > u > e > i; when accent is assigned to a diphthong, it falls on the
more sonorous vowel (Susman 1943). Although Hayes (1995: 15) reports a
few other cases of sonority-conditioned accent, they are relatively restricted
in number.
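The Winnebago pattern lends itself to a one-line sketch: given the sonority hierarchy a > o > u > e > i, accent picks the more sonorous vowel of a diphthong. The tie-breaking choice (first vowel wins) is my assumption for illustration:

```python
# Winnebago vowel sonority a > o > u > e > i (after Susman 1943)
SONORITY = {v: rank for rank, v in enumerate("ieuoa")}

def accented_vowel(diphthong):
    """Return the vowel of a two-vowel sequence that attracts accent:
    the more sonorous one (ties resolved to the first vowel)."""
    v1, v2 = diphthong
    return v1 if SONORITY[v1] >= SONORITY[v2] else v2

accented_vowel("ai")  # → "a"
accented_vowel("eo")  # → "o"
```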
The fundamental characteristics of such interactions as in (3) and (4) are
also seen in the case of accent and voicing. In the next section, we will present our main concern, accent and voicing, on the basis of the harmonic
scale in (2).
(5) Accent-conditioned voicing
    a. /ks/-Voicing (English): éxecute [ks] vs. exécutive [gz]; exhíbit [gz] vs. exhibítion [ks]
    b. /s/-Voicing (English): tránsit [s] vs. transítion [z]

(6) Voicing-conditioned accent (Japanese)
    *hú̥-ka / hu̥-ká 'failure'    *sí̥-ken / si̥-kén 'exam'
    *kí̥-soku / ki̥-sóku 'rule'   *sí̥-ki / si̥kí 'four seasons'
column, devoicing must apply and accent automatically shifts to the adjacent syllable. Note that the blocking effect of devoicing is also seen on the landing site of the final example, si̥kí 'four seasons'.6 This is because adjacent syllables are usually not devoiceable, as stated above.
What (5) and (6) have in common is that accent and voicing conspire to
maximize syllable prominence. Especially, the presence/absence of accent
and voicing must target the same vowel, because accent implies voicing in
the harmonically-complete system of prominence.
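The accent shift seen in (6), accent vacating a devoiced syllable for an adjacent voiced one, can be sketched as follows. The rightward-first preference mirrors the ki̥-sóku type examples but is my own simplifying assumption, not a claim of the chapter:

```python
def shift_accent(accent, devoiced):
    """If the accented syllable is devoiced, move the accent to the
    nearest syllable that is not devoiced, trying rightward first
    (cf. *kí̥-soku → ki̥-sóku). Indices are syllable positions."""
    if not devoiced[accent]:
        return accent                 # accent already on a voiced syllable
    n = len(devoiced)
    for step in range(1, n):
        for j in (accent + step, accent - step):
            if 0 <= j < n and not devoiced[j]:
                return j
    return accent                     # no voiced landing site available

shift_accent(0, [True, False])   # ki-soku type: accent moves 0 → 1
shift_accent(0, [False, True])   # accented syllable voiced: stays at 0
```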
Finally, the following are interesting cases with the interaction between
accent and voicing, where compound accent and Rendaku voicing are in
complementary distribution and the presence of one of them is necessary
and sufficient for word prominence (Zamma, this volume, also discusses
this point):
(7)
(8)
Personal names with -saburou 'the third son' (from Haraguchi 2002)8
    a. Rendaku without accent: nin-zaburou, ken-zaburou, dai-zaburou, tyou-zaburou
    b. Accent without Rendaku: yo-sáburou, ki-sáburou, tama-sáburou, tomi-sáburou
This complementary distribution of accent and voicing can be given a plausible account in our prominence theory. The domain of prominence in these
cases is the whole word, not the syllable as in the previous cases, and one prominence per word is necessary and sufficient, reflecting the basic culminativity of a word. It may be the case that either of them functions as the prominence that marks the boundary of a compound.
Here, feet are already assigned to each word for expository purposes, following the analysis in Tanaka (2001, 2002b): accent is placed by constructing
bimoraic feet from right to left without crossing morpheme boundaries, and
it basically falls on the penultimate foot of each word.
What is crucial is the very fact that the non-shifted variants allow accent to
fall on devoiced vowels, a harmonically-incomplete behavior of accent and
voicing. For that matter, vowel devoicing itself is a very strange phenomenon in the first place, since sonorants, including vowels, are preferably
voiced in phonology, which is another aspect of harmonic completeness in
prominence (see also note 1).
There might be a possibility that devoicing is a phonetic rule outside the
grammar, but we do not adopt this idea because, as we will see below, devoicing and its correlation to accent can be phonologized in a constraintbased grammar. Phonetically, pitch cannot be implemented without the
vibration of the vocal cords, and yet the fact that accent stands in the devoiced environment clearly shows that phonetics and phonology are different. In fact, accent should be an abstract entity. This incompleteness suggests
that the Harmonic Scale of Prominence in (2) and its related constraints
may be outranked by the system of compound accent. In other words, compound accent may be respected at the cost of harmonic completeness, or
otherwise accent shift applies by giving priority to harmonic prominence
over compound accent. We will put forward such an account in the next
section.
In addition to harmonic incompleteness, the devoiced accent also poses
another problem with phonological analysis, viz., opacity. More exactly,
harmonic incompleteness may stem from the opacity concerned. As illustrated in (10a), in derivational terms, accent-shifted forms are obtained in the
the feeding order of devoicing and accent shift, with compound accent preceding them. The resulting forms are transparent:
(10) Non-surface-true opacity

a. Feeding order (output transparent)
   Rules             /kí-sóku/      /bízyutu-kán/
   Compound Accent   (kí)(soku)     (bi)(zyutú)(kan)
   Devoicing         (kí̥)(soku)     (bi)(zyutú̥)(kan)
   Accent Shift      (ki̥)(sóku)     (bi)(zyútu̥)(kan)

b. Counter-feeding order (output opaque)
   Rules             /kí-sóku/      /bízyutu-kán/
   Compound Accent   (kí)(soku)     (bi)(zyutú)(kan)
   Accent Shift      (no change)    (no change)
   Devoicing         (kí̥)(soku)     (bi)(zyutú̥)(kan)
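The feeding versus counter-feeding contrast in (10) can be emulated by applying two toy rules in different orders. The word representation and rules below are drastic simplifications for illustration only, not the chapter's formalism:

```python
def derive(form, rules):
    """Apply rewrite rules in the given order (a toy serial derivation)."""
    for rule in rules:
        form = rule(form)
    return form

# A form is (accent_index, devoiced_flags), one flag per syllable.
def devoicing(form):
    """Toy rule: devoice the first syllable (as in ki- of /ki-soku/)."""
    acc, dev = form
    return acc, (True,) + dev[1:]

def accent_shift(form):
    """Toy rule: move accent rightward off a devoiced syllable."""
    acc, dev = form
    return (acc + 1, dev) if dev[acc] and acc + 1 < len(dev) else (acc, dev)

start = (0, (False, False))                        # kí-soku
feeding = derive(start, [devoicing, accent_shift])  # (1, (True, False))
counter = derive(start, [accent_shift, devoicing])  # (0, (True, False))
```

In the feeding order the shift applies after devoicing and the output is transparent; in the counter-feeding order the shift finds nothing to apply to, and the output opaquely retains accent on the devoiced syllable.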
(11) a. Leftward shift: bi zyutu kan
     b. Rightward shift: ki soku (shift out of the final, extrametrical foot not predicted: ?)
     c. Cancellation of extrametricality: ki soku → *kisoku (wrong landing site)
     d. Leftward shift across constituent boundaries: nana hikari (not predicted: ?)
     [Metrical grid diagrams garbled in extraction.]
(11a) shows how leftward shift occurs after devoicing. We can obtain the
correct form by deleting the syllable on the devoiced vowel. But a problem
occurs with the rightward shift in (11b): the head of the foot cannot go out of
the domain after syllable deletion, and even worse, the final foot is invisible,
so accent shift is not predicted. Even if extrametricality were canceled before the application of devoicing and accent shift, as in (11c), the landing
site of accent shift would be wrong and an ungrammatical form would be
derived. In the same way, leftward shift across constituent boundaries is not
accounted for, as (11d) shows.
Another fundamental question arises when we take into account the fact
that the accent-preserving vowel -tú̥- in the left column of (11a) is still
dominated by a syllable node but its accent-losing counterpart in the right
column of (11a) is not, even though they are equally devoiced. This situation is very puzzling. Moreover, it is unclear what happens to the floating
devoiced vowel that has lost its syllable node: floating segments are erased
by convention, but the devoiced vowel may not be deleted, because vowel
devoicing is distinct from vowel deletion, as in sentakki: sentáku̥ki / sentákki 'washing machine' and suizokkan: suizóku̥kan / suizókkan 'aquarium' (cf. Kondo, this volume). In short, the syllable node should be
deleted to cause accent shift but it should be preserved to make the distinction between vowel devoicing and vowel deletion, which leads to a paradoxical situation.
(14) [OT tableaux for /kí+sóku/, /nána+husigi/, /bízyutu+kán/, /sitá+kutibiru/ and /sí+kí/, evaluating candidates against MAX-O(accent), *V̥́, IDENT(voice), NONFIN and ALIGN-R; garbled in extraction.]
Here, all the non-shifted forms with devoicing are correctly evaluated as optimal. More crucially, note here that if we rerank *V̥́ over MAX-O(accent) in (14), as shown by the dotted lines, then the accent-shifted (i.e. transparent) form of each example is uniformly selected as optimal. So the two constraints must have free ranking (Anttila 1997, 2002; Anttila and
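The effect of reranking the two freely ranked constraints can be illustrated with a minimal strict-domination evaluator. The candidates and violation counts below are hypothetical simplifications of tableau (14), and the constraint name "*V-accent-devoiced" is my ASCII stand-in for *V̥́:

```python
def optimal(candidates, ranking):
    """Return the candidate whose violation profile, read in ranking
    order, is lexicographically smallest (OT strict domination)."""
    return min(candidates, key=lambda c: tuple(c["viol"][k] for k in ranking))

cands = [
    {"form": "kí̥soku (non-shifted)",
     "viol": {"*V-accent-devoiced": 1, "MAX-O(accent)": 0}},
    {"form": "ki̥sóku (shifted)",
     "viol": {"*V-accent-devoiced": 0, "MAX-O(accent)": 1}},
]
optimal(cands, ["MAX-O(accent)", "*V-accent-devoiced"])["form"]  # non-shifted wins
optimal(cands, ["*V-accent-devoiced", "MAX-O(accent)"])["form"]  # reranked: shifted wins
```

Swapping the order of the two constraints flips the winner, which is exactly what free ranking is meant to capture for the optional accent shift.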
4. Conclusion
In this article, we have developed a theory of prominence by showing evidence for the Harmonic Scale of Prominence from phonetics, phonology,
typology, and prosody. Specifically, we have argued for an implicational
relation among voicing, sonority, tone, and accent by considering their
various phonological interactions for enhancing syllable prominence or
culminativity. Then, we have seen that the well-known phenomenon of
devoiced accent poses such problems as i) incompleteness for prominence
theory, ii) optionality and directionality of accent shift for derivational theory, and iii) non-surface-true opacity for general phonological theory. The
apparent harmonic incompleteness in prominence has much to do with the
opacity of accent and devoicing. Thus, we have proposed a solution in OT
and have demonstrated that the notions of sympathy and reranking serve to
solve all of these problems.
Acknowledgements
This paper is part of my talk delivered at the Phonology Forum 2001 of the
Phonological Society of Japan, which was held on August 28 at Chiba University. I would like to thank the audience for their helpful comments. I am
also very grateful to Stuart Davis, Jeroen van de Weijer, and an anonymous
reviewer for helping me improve the content and style of this paper. Special
thanks go to Jeroen van de Weijer, Tetsuo Nishihara, and Kensuke Nanjo,
who gave me the opportunity to elaborate my idea in this project. Any remaining inadequacies or misconceptions are my responsibility alone, of
course. This study was supported by the Grant-in-Aid for Scientific Research
(Basic Sciences (C)(2), grant number 15520306) of the Japan Society for
the Promotion of Science.
Notes
1. We assume here with Hayes (1995) that syllable weight consists of syllable quantity and syllable prominence. But his theory differs from ours in that length can be reflected in both moraic structure (i.e., syllable quantity) and prominence structure. Instead, we assume that length is only a matter of syllable quantity. Incidentally, another interesting approach to prominence is seen in Anttila (1997), who proposes a constraint-hierarchy system that incorporates accent, length, and sonority by using Prince & Smolensky's (1993) Harmonic Alignment, although it does not incorporate voicing or tone and does not capture the implicational relations among prominence elements. If we incorporate voicing and tone as well in this approach and assume such binary prominence scales as V́ > V (accent), V(H) > V(L) (tone), VV > V (length), V > C (sonority), and V > V̥ (voicing), Harmonic Alignment can derive constraint hierarchies for each pair of elements, such as accent & tone, accent & length (WEIGHT-TO-STRESS or, more strictly, PEAK-PROMINENCE), accent & sonority, accent & voicing (*{V, V̥́} ≫ *{V́, V̥}), and sonority & voicing (*{C, V̥} ≫ *{C̥, V}), etc. See Tanaka (2003b) for the details of such a theory, where the overall scale in (2) and these specific binary scales are developed in a uniform fashion.
2. Conversely, length-conditioned accent (i.e., quantity-sensitivity) seems to be
more dominant cross-linguistically than accent-conditioned length (e.g. vowel
lengthening on an accented vowel, vowel shortening with accent loss, etc.).
This is another reason we argue for the division of labor between syllable
quantity and syllable prominence and exclude length from our discussion of
syllable prominence. For the interaction of accent and syllable length, see
McGarrity (2003).
3. We assume here that schwa is less sonorous than non-high vowels.
4. In what follows, morpheme boundaries are represented with +.
5. Of course, it is not true that the consonants that precede accent should always
be voiced in such a way as in (5a, b). As is well-known, stops in English may
be voiceless and aspirated when they precede accent. Generally, less sonorous
consonants are preferred to their voiced counterparts in onset position, because
of the Dispersion Principle (Clements 1990, 1992). In Pirahã, CV and CVV are
more prominent and attract accent more strongly than GV and GVV, respectively, which is a very rare case (C is a voiceless consonant and G is a voiced
consonant). This is also related to the principle: a steep rise from onset to
nucleus may sound more prominent than a gentle rise.
6. hsu in (5d) and sik in (6) are fairly contrastive in that although they are originally accented on the first syllable, only the latter undergoes devoicing and
accent movement to the right. The former example may pose a problem for
the analysis that we will present in section 3.2, but it is true that this may optionally be acceptable as well. We will leave this question for further study,
since such examples as in (5d) are limited in occurrence and the patterns in (6)
are productive. The difference between them is whether or not they have a
word boundary in their domain.
7. Examples we are concerned with here are ones in which the modifier of sima
is more than two moras. Exceptions such as nakano-sima and nakadoori-sima
without either accent or Rendaku and hatizyu-zima and kakar-zima with both
are quite rare. As for (7b), syoudo-sima, awazi-sima, and tanega-sima may lead
us to believe that the voiced obstruents immediately before the boundary cause
the blocking of Rendaku (due to Lyman's Law in a wider domain); however,
there are island names like isigaki-zima, ogi-zima, and megi-zima, which
contradict such a hypothesis.
8. The distinction in (8) can be phonologized in such a way that the specifier in
(8a) is CVC or CVV while the one in (8b) is CV or CVCV. It is surprising that
the violation of Lyman's Law is acceptable in the head of the names -zabu in
(8a).
9. We assume here, following Poser (1990) and Tanaka (1992), that compound
accent is derived by final-foot extrametricality.
10. (12a) may be a locally self-conjoined constraint banning a devoiced vowel, but
we adopt an OCP-based version for expository purposes.
11. This constraint is virtually equivalent to Prince & Smolensky's EDGEMOST
(pk; R; Word), which states that a peak of prominence (i.e., accent) lies at the
right edge of PrWd. In addition to (12e, f), there are other constraints for compound accentuation; see Tanaka (2001, 2002b) for the exact hierarchy and the
ranking relation between (12e) and (12f).
12. McCarthy (2003b) compares various approaches to opacity using comparative
markedness, local conjunction, stratal OT, sympathy, and targeted constraints,
only to conclude that each of them has its own advantages and disadvantages.
Empirically, sympathy seems to be the best candidate to account for the data
concerned, so we adopt it here.
Bibliography
Backley, Phillip
1998
Tier geometry: An explanatory model of vowel structure. Doctoral
dissertation, University College London.
Backley, Phillip and Toyomi Takahashi
1998
Element activation. In Structure and Interpretation: Studies in
Phonology (PASE Studies & Monographs 4), Eugeniusz Cyran (ed.),
13–40. Lublin: Wydawnictwo Folium.
Bao, Zhiming
1990
On the nature of tone. Doctoral dissertation, MIT.
Beckman, Jill N.
1997
Positional faithfulness, positional neutralization, and Shona vowel
harmony. Phonology 14 (1): 1–46.
1998
Positional faithfulness. Doctoral Dissertation, University of Massachusetts, Amherst.
Beckman, Mary E.
1982
Segmental duration and the mora in Japanese. Phonetica 39: 113–135.
Bell, Alan E.
1978
Syllabic consonants. In Universals of Human Language, Volume 2:
Phonology, Joseph H. Greenberg (ed.), 153–201. Stanford, California:
Stanford University Press.
Benua, Laura H.
1998
Transderivational identity: Phonological relations between words.
Doctoral dissertation, University of Massachusetts, Amherst.
Bird, Steven G.
1995
Computational phonology: A constraint-based approach. Cambridge:
Cambridge University Press.
Bloch, Bernard
1946
Studies in colloquial Japanese I: Inflection. Journal of the American
Oriental Society 66: 97–109 (References are to the version in Miller 1970: 1–24).
Blumstein, Sheila E., William E. Cooper, Harold Goodglass, Sheila Statlender and
Jonathan Gottlieb
1980
Production deficit in aphasia: a voice onset time analysis. Brain and Language 9: 153–170.
Boersma, Paul P. G.
1997
How we learn variation, optionality, and probability. Proceedings of
the Institute of Phonetic Sciences, Amsterdam 21: 43–58. (Available
on the Rutgers Optimality Archive, ROA-221.)
Cabrera-Abreu, Mercedes
2000
A phonological model for intonation without low tone. Bloomington,
Indiana: Indiana University Linguistics Club Publication.
Calabrese, Andrea
1995
A constraint-based theory of phonological markedness and simplification procedures. Linguistic Inquiry 26: 373–463.
Campbell, Nick and Yoshinori Sagisaka
1991
Moraic and syllable-level effects on speech timing. Journal of Electronic Information Communication Engineering SP 90-107: 35–40.
Charette, Monik and Aslı Göksel
1998
Licensing constraints and vowel harmony in Turkic languages. In
Structure and Interpretation: Studies in Phonology (PASE Studies &
Monographs 4), Eugeniusz Cyran (ed.), 65–88. Lublin: Wydawnictwo
Folium.
Cho, Young-mee Yu
1998
Language change as constraint reranking. Historical Linguistics 1995.
Amsterdam: John Benjamins.
Chomsky, A. Noam and Morris Halle
1968
The Sound Pattern of English. New York: Harper and Row.
Clements, George N.
1978
Tone and syntax in Ewe. In Elements of Tone, Stress and Intonation, Donna J. Napoli (ed.), 21–99. Washington, D.C.: Georgetown University Press.
1990
The role of the sonority cycle in core syllabification. In Papers in
Laboratory Phonology I: Between the Grammar and Physics of
Speech, John C. Kingston and Mary E. Beckman (eds.), 283–333.
Cambridge: Cambridge University Press.
1992
The sonority cycle and syllable organization. In Phonologica 1988,
Wolfgang U. Dressler, Hans C. Luschützky, Oscar E. Pfeiffer and
John R. Rennison (eds.), 63–76. Cambridge: Cambridge University
Press.
2001
Representational economy in constraint-based phonology. In Distinctive Feature Theory, T. Alan Hall (ed.), 71–146. Berlin/New York: Mouton de Gruyter.
Clements, George N. and Susan R. Hertz
1991
Nonlinear phonology and acoustic interpretation. In Actes du XIIème
Congrès International des Sciences Phonétiques, Aix-en-Provence,
19–24 août 1991 [Proceedings of the XIIth International Congress of
Phonetic Sciences, Aix-en-Provence, August 19–24, 1991], Vol. 1,
364–373. Aix-en-Provence: Université de Provence, Service des
Publications.
Cohn, Abigail C.
1993
Nasalisation in English: Phonology or phonetics. Phonology 10: 43–81.
Coleman, John S. and Janet B. Pierrehumbert
1997
Stochastic phonological grammars and acceptability. In Computational Phonology: Third Meeting of the ACL Special Interest Group
in Computational Phonology, 49–56. Association for Computational
Linguistics, Somerset.
Cremelie, Nick and Jean-Pierre Martens
1995
On the use of pronunciation rules for improved word recognition. In
Proceedings Eurospeech 95: 1747–1750.
1997
Automatic rule-based generation of word pronunciation networks. In
Proceedings Eurospeech 97: 2459–2462.
1999
In search of better pronunciation models for speech recognition.
Speech Communication 29: 225–246.
Dauer, Rebecca M.
1980
The reduction of unstressed high vowels in modern Greek. Journal
of the International Phonetic Association 10: 17–27.
de Lacy, Paul V.
1999
Tone and prominence. Ms., University of Massachusetts, Amherst
(Available on the Rutgers Optimality Archive, ROA-333).
2001
Prosodic markedness in prominent positions. Ms., University of
Massachusetts, Amherst (Available on the Rutgers Optimality Archive, ROA-432).
2002
The Formal Expression of Markedness. Ph.D. dissertation, University of Massachusetts, Amherst.
2004
Markedness conflation in Optimality Theory. Phonology 21: 154–200.
Endō, Kunimoto
1973
Kaiki to ruisui: Ma-gyou no dakuon-gana to sono haikei [Back formation and analogy: The kana of the [b]-column used for the [m]-column, and its background]. Gifudaigaku Kyōiku Gakubu Kenkyū
Hōkoku: Jinbun 21: 103–112.
1989
Kokugo Hyoogen to Onin Genshoo [Expression in Japanese and
Phonological Phenomena]. Tokyo: Shinten-sha.
Flemming, Edward S.
1995
Auditory representations in phonology. Doctoral dissertation, University of California, Los Angeles.
Frisch, Stefan A.
1996
Similarity and Frequency in Phonology. Doctoral dissertation, Northwestern University, Evanston, Illinois.
Fujimoto, Masako and Shigeru Kiritani
2003
Comparison of vowel devoicing for speakers of Tokyo and Kinki
dialects. Journal of the Phonetic Society of Japan 7: 58–69.
Gandour, Jackson T. and Rochana Dardarananda
1984
Voice onset time in aphasia: Thai. II. Production. Brain and Language 23: 177–205.
Grootaers, Willem A.
1976
Nihon no Gengo Chiri Gaku no Tameni [For the Sake of Dialect
Geography in Japan], 48–77. Tokyo: Heibonsha.
Halle, Morris
2003
Verner's Law. In A New Century of Phonology and Phonological
Theory: A Festschrift for Professor Shosuke Haraguchi on the
Occasion of His Sixtieth Birthday, Takeru Honma, Masao Okazaki,
Toshiyuki Tabata and Shin-ichi Tanaka (eds.). Tokyo: Kaitakusha.
Halle, Morris and Kenneth N. Stevens
1971
A note on laryngeal features. MIT Quarterly Progress Report of the Research Laboratory of Electronics 101: 198–213.
1991
Knowledge of language and the sounds of speech. In Music, Language, Speech and Brain: Proceedings of an International Symposium at the Wenner-Gren Center, Stockholm, 5–8 September 1990,
Johan Sundberg, Lennart Nord and Rolf Carlson (eds.), 1–19.
Houndmills: MacMillan Press.
Halle, Morris and Jean-Roger Vergnaud
1987
An Essay on Stress. Cambridge, Massachusetts: MIT Press.
Hamada, Atsushi
1952a
Hatsuon to dakuon to no sookansei no mondai [Issues in relativity
between moraic nasals and voiced obstruents]. Kokugo-Kokubun
21 (3): 18–32.
1952b
Kouji gonen Chousen-ban Iroha Onmon taion kou [Thoughts on
correspondence of Hangul [to Japanese kana] seen in Iroha (Iropa)
of the fifth year of Kouji [1492] printed in Korea]. Kokugo-Kokubun
21 (10): 22–32.
1955
Haneru-on [Moraic nasals]. In Kokugogaku Jiten, Kokugo Gakkai
(ed.), 750–751. Tokyo: Tokyo-do.
1960
Rendaku to renjou [Rendaku and sandhi]. Kokugo-Kokubun 29 (10):
1–16.
1971
Sei daku [Sei-daku: Clear-muddy]. Kokugo-Kokubun 40 (11): 40–51.
Hamano, Shoko
2000
Voicing of obstruents in Old Japanese: Evidence from the sound-symbolic stratum. Journal of East Asian Linguistics 9: 207–225.
Han, Mieko S.
1962a
Japanese phonology. Tokyo: Kenkyusha.
1962b
Unvoicing of vowels in Japanese. Study of Sounds 10: 81–100.
1994
Acoustic manifestations of mora timing in Japanese. Journal of the
Acoustical Society of America 96: 73–82.
Haraguchi, Shosuke
1977
The Tone Pattern of Japanese: An Autosegmental Theory of Tonology.
Tokyo: Kaitakusha.
1991
A Theory of Stress and Accent. Dordrecht: Foris.
2002
A theory of voicing. In A Comprehensive Study on the Phonological
Structure of Languages and Phonological Theory, Shosuke Haraguchi (ed.), 1–22. Technical Report of Basic Sciences (A)(1), Grant-in-Aid for Scientific Research by the Japan Society for the Promotion
of Science.
Harris, John K. M.
1990
Segmental complexity and phonological government. Phonology 7: 255–301.
1994
English Sound Structure. Oxford: Blackwell.
1997
Licensing Inheritance: An integrated theory of neutralisation. Phonology 14: 315–370.
1998
Phonological universals and phonological disorder. In Linguistic
Levels in Aphasia: Proceedings of the RuG-SAN-VKL Conference on
Aphasiology, Evy Visch-Brink and Roelien Bastiaanse (eds.), 91–117.
San Diego, CA: Singular Publishing Group.
Harris, John K. M. and Geoffrey A. Lindsey
1995
The elements of phonological representation. In Frontiers of Phonology: Atoms, Structures, Derivations, Jacques Durand and Francis
X. Katamba (eds.), 34–79. Harlow, Essex: Longman.
2000
Vowel patterns in mind and sound. In Phonological knowledge:
Conceptual and empirical issues, Noel Burton-Roberts, Philip Carr
and Gerry J. Docherty (eds.), 185–205. Oxford: Oxford University
Press.
Hasegawa, Kiyoshi, Katsuaki Horiuchi, Tsutomu Momozawa and Saburo Yamamura
(eds.)
1986
Obunsha's Comprehensive Japanese-English Dictionary. Tokyo:
Obunsha.
Hashimoto, Shinkichi
1917
Kokugo kanazukai kenkyūshijō no ichihakken. Teikoku Bungaku 23
[References are to the version in Hashimoto (1949: 123–163)].
1932
Kokugo ni okeru biboin [Nasalized vowels in Japanese]. Kokugo
Onin no Kenkyuu. Tokyo: Iwanami.
1949
Moji Oyobi Kanadzukai no Kenkyuu. Tokyo: Iwanami.
Hattori, Noriko
1989
Mechanisms of word accent change: Innovations in Standard Japanese. Doctoral dissertation, University College, London.
Hattori, Shiro
1928
On two-syllable words uttered in Kameyama-cho area in Mie Prefecture. Bulletin of The Phonetic Society of Japan 11.
1950
Phoneme, phone, and compound phone. Gengo Kenkyu 16: 92–109
(Revised version appeared in Gengogaku no Hoohoo [Methods in
Linguistics]. Tokyo: Iwanami, 1960).
1960
Gengogaku no Hoohoo [Methods in Linguistics]. Tokyo: Iwanami.
Hayata, Teruhiro
1977a
Nihongo no onin to rizumu [Sounds and rhythm of Japanese]. Dentou to Gendai 45: 41–49.
1977b
Seisei akusento ron [Generative accentuation]. In Ōno, Susumu and
Takeshi Shibata (eds.), Nihongo 5: Onin [Japanese 5: Sounds], 323–360.
Hayes, Bruce P.
1989
Compensatory lengthening in moraic phonology. Linguistic Inquiry
20: 253–306.
1995
Metrical Stress Theory: Principles and Case Studies. Chicago: The
University of Chicago Press.
Hayes, Bruce P. and Donca Steriade
2004
Introduction: the phonetic basis of phonological markedness. In
Phonetically-Based Phonology, Bruce Hayes, Robert Kirchner and
Donca Steriade (eds.), 1–33. Cambridge: Cambridge University Press.
Hepburn, James Curtis
1867
A Japanese and English Dictionary with an English and Japanese
Index. Shanghai: American Presbyterian Mission Press [Reprinted in
1983. Tokyo: Charles E. Tuttle].
Hibiya, Junko
1999
Variationist Sociolinguistics. In The Handbook of Japanese Linguistics,
Natsuko Tsujimura (ed.), 101–120. Cambridge, Mass.: Blackwell.
Hinskens, Frans and Jeroen M. van de Weijer
2003
Patterns of segmental modification in consonant inventories: A cross-linguistic study. Linguistics 41 (6): 1041–1084.
Hirayama, Teruo, Ichiro Oshima, Makio Ono, Makoto Kuno, Mariko Kuno and
Takao Sugimura
1992
Gendai Nihongo Hōgen Daijiten [Dictionary of Japanese Dialects].
Tokyo: Meiji Shoin.
Hirose, H. et al.
1994
Analysis and formulation of the prosodic features of Standard Mandarin Chinese. The Journal of the Acoustical Society of Japan 50 (3): 177–187.
Honda, Kiyoshi, Hiroyuki Hirai, Shinobu Masaki and Yasuhiro Shimada
1999
Role of Vertical Larynx Movement and Cervical Lordosis in F0
Control. Language and Speech 42: 401–411.
Itô, Junko and Ralf-Armin Mester
1986
The phonology of voicing in Japanese. Linguistic Inquiry 17: 49–73.
1993
Licensed segments and safe paths. Canadian Journal of Linguistics
38: 197–213.
1995a
Japanese phonology. In Handbook of Phonological Theory, John A.
Goldsmith (ed.), 817–838. Cambridge: Blackwell.
1995b
The core-periphery structure of the lexicon and constraints on
reranking. In University of Massachusetts Occasional Papers in Linguistics 18: Papers in Optimality Theory, Jill N. Beckman, Suzanne
C. Urbanczyk and Laura Walsh Dickey (eds.), 181–210. Amherst:
GLSA.
1996
Stem and word in Sino-Japanese. In Phonological Structure and
Language Processing: Cross-linguistic Studies, Takeshi Otake and
Anne Cutler (eds.), 13–44. Berlin/New York: Mouton de Gruyter.
1997
Correspondence and compositionality: The ga-gyō variation in Japanese phonology. In Derivations and Constraints in Phonology, I. M.
Roca (ed.), 419–462. New York: Oxford University Press.
1998
Markedness and word structure: OCP effects in Japanese. Ms. University of California, Santa Cruz (Available on the Rutgers Optimality Archive, ROA-255).
1999a
The phonological lexicon. In The Handbook of Japanese Linguistics,
N. Tsujimura (ed.), 62–100. Malden, Mass. and Oxford, U.K.: Blackwell Publishers.
1999b
The lexicon in Optimality Theory. Handout presented at University
of Tsukuba, Special Research Project for the Typological Investigation of Languages and Cultures of the East and West.
2000
Weak parallelism and modularity: Evidence from Japanese. In Report
of the Special Research Project for the Typological Investigation of
Languages and Cultures of the East and West III, Part I, Shosuke
Haraguchi (ed.), 89–105. Ibaraki: University of Tsukuba.
2001
Covert generalizations in Optimality Theory: the role of stratal faithfulness constraints. In Proceedings of 2001 International Conference
on Phonology and Morphology, 3–33, Yongin, Korea.
2003
Japanese Morphophonemics: Markedness and Word Structure.
Cambridge, Mass.: MIT Press.
It, Junko, Ralf-Armin Mester and Jaye E. Padgett
1995
Licensing and underspecification in Optimality Theory. Linguistic
Inquiry 26: 571–614.
1999
Lexical classes in Japanese: A reply to Rice. Phonology at Santa
Cruz 6: 39–46.
Itoh, Motonobu, Itaru F. Tatsumi and Sumiko Sasanuma
1986
Voice onset time perception in Japanese aphasic patients. Brain and
Language 28: 71–85.
Iwabuchi, Etsutarō
1934
Youkyoku no utai-kata ni okeru nisshou tsu ni tsuite [On the entering tone [=coda] -t in the singing of yōkyoku]. Kokugo to Kokubungaku 11: 5, 7 and 9 (98–117, 91–101 and 85–95, respectively).
Iwanami Shoten Henshūbu (ed.)
1992
Gyakubiki Kōjien. Tokyo: Iwanami.
Jakobson, Roman
1968
Child language, aphasia and phonological universals. The Hague:
Mouton.
Jakobson, Roman, C. Gunnar M. Fant and Morris Halle
1952
Preliminaries to speech analysis. Cambridge, Mass.: MIT Press.
Jessen, Michael and Catherine O. Ringen
2001
On the status of [voice] in German. In WCCFL 20: 304–317.
Jōdaigo Jiten Henshū Iinkai (ed.)
1967
Jidaibetsu kokugo daijiten: jōdaihen. Tokyo: Sanseidō.
Jun, Sun-Ah
1993
The phonetics and phonology of Korean prosody. Unpublished Ph.D.
dissertation. The Ohio State University, Columbus, Ohio.
Jun, Sun-Ah and Mary E. Beckman
1993
A gestural-overlap analysis of vowel devoicing in Japanese and
Korean. Paper presented at the 1993 Annual Meeting of the LSA,
Los Angeles, 7–10 January, 1993.
Jurafsky, Daniel and James H. Martin
2000
Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition.
Upper Saddle River, NJ: Prentice Hall.
Kager, René W. J.
1999
Optimality Theory. Cambridge: Cambridge University Press.
Kamei, Takashi
1970
Kana wa naze dakuon senyou no jitai o motanakatta ka-o megutte
kataru [Discussing why kana did not have letters only for daku-on].
Hitotsubashi Daigaku Kenkyuu Nenpou: Jinbun Kagaku Kenkyuu
12: 1–92.
1985
Dakuon [Voiced sounds]. Heibonsha Hyakka Jiten 9: 227–228.
Tokyo: Heibonsha.
Kamei, Takashi, Rokuro Kono and Eiichi Chino
1997
Gengogaku daijiten selection: Nihonrettō no gengo [Linguistics encyclopedia selection: Language in the Japanese archipelago]. Tokyo:
Sanseido.
Kasuga, Kazuo
1941
Kojiki ni okeru seidaku kakiwake ni tsuite [On the sei-daku distinctions in Kojiki]. Kokugo-Kokubun 11 (4): 39–78.
Kawakami, Shin
1969
Musee haku no tsuyosa to akusento kaku [The intensity of devoiced
moras and the accent nucleus]. Kokugakuindai Kokugo Kenkyu 27.
Kawasaki, Takako
1996
Sonority and voicing: a structural analysis. Ms., McGill University,
Montreal, Quebec.
Kaye, Jonathan D., Jean Lowenstamm and Jean-Roger Vergnaud
1985
The internal structure of phonological representations: A theory of
charm and government. Phonology Yearbook 2: 305–328.
1990
Constituent structure and government in phonology. Phonology 7: 193–231.
Kazama, Rikizō (ed.)
1979
Tsuzuriji gyakujun hairetsu gokōsei ni yoru Daigenkai bunrui goi.
Tokyo: Fuzanbō.
Kenstowicz, Michael J.
1994a
Phonology in Generative Grammar. Oxford: Blackwell.
1994b
Sonority-driven stress. Ms., Massachusetts Institute of Technology,
Cambridge (Available on the Rutgers Optimality Archive, ROA-33).
Kess, Joseph E. and Tadao Miyamoto
1999
The Japanese Mental Lexicon. Psycholinguistic Studies of Kana and
Kanji Processing. Amsterdam and Philadelphia: John Benjamins.
Kikuchi, Hideaki and Kikuo Maekawa
2002
Accuracy of automatic phoneme labeling on spontaneous speech.
Proceedings of the 2002 Spring meeting of the Acoustical Society of
Japan: 9798.
Kikuda, Norio
1971
Yōgen no rendaku no ichiyōin [One factor in the rendaku of inflected words]. Kaishaku 17 (5): 24–29.
Kindaichi, Haruhiko and Kazue Akinaga
1997
Shinmeikai Accent Dictionary of the Japanese Language. Tokyo:
Sanseidoo.
Kindaichi, Haruhiko, Ooki Hayashi and Takesi Sibata (eds.)
1988
An Encyclopaedia of the Japanese Language. Tokyo: Taishukan.
Kindaichi, Kyosuke
1941
Kokugo no hensen [Transition of Japanese Language]. Tokyo: NHK.
Kiparsky, R. Paul V.
1982
Lexical Phonology and Morphology. In Linguistics in the Morning
Calm, the Linguistic Society of Korea (ed.), 3–91. Seoul: Hanshin.
1985
Some consequences of Lexical Phonology. Phonology Yearbook 2:
85–138.
Kitahara, Mafuyu
1998
The interaction of pitch accent and vowel devoicing in Tokyo Japanese. In Japanese-Korean Linguistics 8, D. Silva (ed.), 303–315.
Stanford, CA: CSLI.
Kula, Nancy Chongo and Lutz Marten
1998
Aspects of nasality in Bemba. SOAS Working Papers in Linguistics
and Phonetics 8: 191–208.
Kuno, Susumu
1973
The Structure of the Japanese Language. Cambridge, Mass: MIT
Press.
Kuroda, Shige-Yuki
2002
Contrast in Japanese. A contribution to feature geometry. Paper presented at the Second International Conference on Contrast in Phonology. University of Toronto, Toronto, Ontario, Canada. May 3,
2002.
Kuwabara, Hisao and Kazuya Takeda
1988
Analysis and prediction of vowel-devocalization in isolated Japanese
words. ATR Technical Report TR-I-0033. Kyoto: ATR Interpreting
Telephony Research Laboratories.
Labrune, Laurence
1999
Variation intra et inter-langue: Morpho-phonologie du rendaku en
japonais et du sai-sios en coréen. In Phonologie: théorie et variation, Cahiers de grammaire 24: 117–152.
Lange, Roland A.
1973
The Phonology of Eighth-Century Japanese. Tokyo: Sophia University.
Liberman, Mark Y. and Alan S. Prince
1977
On stress and linguistic rhythm. Linguistic Inquiry 8: 249–336.
Lisker, Leigh and Arthur S. Abramson
1964
A cross-language study of voicing in initial stops: acoustical measurements. Word 20: 384–422.
1970
The voicing dimension: some experiments in comparative phonetics.
Proceedings of the Sixth International Congress of Phonetic Sciences,
Prague 1967, 563–567. Prague: Academia, Czechoslovak Academy
of Sciences.
Lombardi, Linda
1995
Laryngeal features and privativity. The Linguistic Review 12: 35–59.
2002
Why place and voice are different: Constraint-specific alternations in
Optimality Theory. In Segmental Phonology in Optimality Theory, L.
Lombardi (ed.), 13–45. Cambridge: Cambridge University Press.
Lyman, Benjamin S.
1894
Change from surd to sonant in Japanese compounds. Oriental Club
of Philadelphia.
Mabuchi, Kazuo
1971
Kokugo on-in ron [Japanese Phonology]. Tokyo: Kasama Shoin.
Maddieson, Ian
1984
Patterns of Sounds. Cambridge: Cambridge University Press.
Maruyama, Rinpei
1967
Joudaigo Jiten (Dictionary of Jōdai [710–794] vocabulary). Tokyo:
Meiji Shoin.
Mathias, Gerald B.
1973
On the modification of certain Proto-Korean-Japanese reconstructions. Papers in Japanese Linguistics 2: 31–47.
Matsui, F. Michinao
1993
Museihaku joo no akusento kaku no chikaku ni tsuite [Perceptual
study of the accent on devoiced accented mora]. Paper presented at
the 28th Kinki Onsei Gengo Kenkyuukai, Osaka, Japan.
Matsumoto, Takashi
1965
Ma-gyou on ba-gyou on koutai genshou no keikou [Tendency of
alternations between the [b]-column sounds and the [m]-column
sounds]. Kokugogaku Kenkyuu 5: 52–65.
Matsumura, Akira (ed.)
1988
Daijirin [Daijirin Japanese Dictionary]. Tokyo: Sanseidō.
Matthews, Peter H.
1974
Morphology: An Introduction to the Theory of Word Structure.
Cambridge: Cambridge University Press.
McCarthy, John J.
1999
Sympathy and phonological opacity. Phonology 16: 331–399.
2003a
Sympathy, cumulativity, and the Duke-of-York Gambit. In The
Syllable in Optimality Theory, Caroline Féry and Ruben van de Vijver (eds.).
Cambridge: Cambridge University Press.
2003b
Comparative Markedness. Ms., University of Massachusetts, Amherst.
(Available on the Rutgers Optimality Archive, ROA-489).
McCarthy, John J. and Alan S. Prince
1995
Faithfulness and Reduplicative Identity. In University of Massachusetts
Occasional Papers in Linguistics 18: Papers in Optimality Theory,
Jill N. Beckman, Suzanne C. Urbanczyk and Laura Walsh Dickey
(eds.), 249–384. Amherst: GLSA.
McCawley, James D.
1968
The Phonological Component of a Grammar of Japanese. The
Hague: Mouton.
1977
Accent in Japanese. In Studies in Stress and Accent, Southern California Occasional Papers in Linguistics 4, Larry M. Hyman (ed.),
261–302. Los Angeles: University of Southern California Department of Linguistics.
McGarrity, Laura W.
2003
Constraints on patterns of primary and secondary stress. Doctoral
Dissertation, Indiana University.
Mielke, Jeffrey
2004
The emergence of distinctive features. Ph.D. dissertation, Ohio State
University.
Miller, Roy A.
1967
The Japanese Language. Chicago: University of Chicago Press.
1986
Nihongo: In Defence of Japanese. London: Athlone.
Miller, Roy A. (ed.)
1970
Bernard Bloch on Japanese. New Haven: Yale University Press.
Miyake, Marc H.
2003
Old Japanese: A Phonetic Reconstruction. London: Routledge Curzon.
Mori, Hiromichi
1991
Kodai no onin to Nihonshoki no seiritsu [Sounds of Old Japanese
and completion of Nihonshoki]. Tokyo: Taishukan.
Murayama, Tadashige
2001
Nihon-no Myooji Besuto 10,000 [Top 10,000 Surnames in Japanese].
Tokyo: Shin-Jinbutsu-Ooraisha.
Nakagawa, Yoshio
1966
Rendaku, Rensei (Kashou) no Keifu [Compounds with Rendaku and
Compounds without Rendaku]. Kokugo Kokubun 35 (6): 302–314.
Kyoto: Kyoto University.
Nakata, Norio (ed.)
1972
Kooza kokugo-shi 2: Onin-shi Moji-shi [History of sounds and
characters]. Tokyo: Taishūkan.
Nakata, Norio and Hiroshi Tsukishima
1980
Dakuten [Daku-ten]. In Kokugogaku daijiten [Dictionary of National
Language Study], Kokugo Gakkai (ed.), 586–587. Tokyo: Tokyodo.
Napoli, Donna J. and Marina A. Nespor
1976
The syntax of raddoppiamento sintattico. Unpublished Ms.
Nasu, Akio
1999
Onomatope-ni okeru yuuseika-to [p]-no yuuhyoosei [Voicing in
onomatopoeia and the markedness of [p]]. Journal of the Phonetic
Society of Japan 3: 52–66.
2001
Heiretugo akusento no yure to keisan [Accent of Japanese dvandva
and its variations]. Paper presented at the 26th Annual Meeting of
Kansai Linguistic Society.
Nasukawa, Kuniya
1995
Melodic structure and no constraint-ranking in Japanese verbal inflexion. Paper presented at the Autumn Meeting of the Linguistic
Association of Great Britain. University of Essex.
1998
An integrated approach to nasality and voicing. In Structure and
Interpretation: Studies in Phonology (PASE Studies & Monographs
4), Eugeniusz Cyran (ed.), 205225. Lublin: Wydawnictwo Folium.
1999
Prenasalisation and melodic complexity. UCL Working Papers in
Linguistics 11: 207–224.
2005a
A Unified Approach to Nasality and Voicing. Berlin/New York:
Mouton de Gruyter.
2005b
Melodic complexity in infant language development. In Developmental Paths in Phonological Acquisition, Marina Tzakosta, Claartje
Levelt and Jeroen van de Weijer (eds.). Leiden Papers in Linguistics
2 (1): 53–70.
Nihon Daijiten Kankōkai (ed.)
1972–76 Nihon kokugo daijiten [Grand Japanese Dictionary]. Tokyo: Shōgakukan.
Nihon Hoso Kyokai [NHK]
1985
NHK Nihongo Hatsuon Akusento Jiten [NHK Pronunciation and
Accent Dictionary of Japanese]. 1st ed., Nihon Hoso Shuppan
Kyokai, Tokyo.
1998
NHK Nihongo Hatsuon Akusento Jiten [NHK Pronunciation and
Accent Dictionary of Japanese]. 2nd ed., Nihon Hoso Shuppan
Kyokai, Tokyo.
Nishihara, Tetsuo
2002
Tohoku hōgen ni okeru Shiin no Yūseika [On consonant voicing in
the Tohoku dialect]. Miyagi Kyōiku Daigaku Gaikokugo Kenkyū
Ronshū 2: 19–24. Miyagi University of Education, Sendai.
Nishimiya, Kazutami
1960
Joudai-go no seidaku: shakkun moji o chūshin to shite [Sei-daku in
Jōdai Japanese [710–794]: focusing on the characters for kun readings]. Man'yō 36: 1–19.
Ogura, Sinpei
1910
Lyman-si no rendaku-ron [Lyman's theory of sequential voicing].
Kokugakuin zassi [Journal of the National Research Institute] 16 (7):
9–23.
Ohala, John J.
1983
The origin of sound patterns in vocal tract constraints. In The Production
of Speech, P. F. MacNeilage (ed.), 189–216. New York: Springer.
Ohno, Kazutoshi
2000
The lexical nature of Rendaku in Japanese. In Japanese/Korean
Linguistics, Vol. 9, Mineharu Nakayama and Charles J. Quinn, Jr.
(eds.), 151–164. Stanford: CSLI publications and Stanford
Linguistics Association.
2002
Rules or lexicon: sticking to rules or giving them up. Presentation at
the Second Conference on Formal Linguistics. June 22–23. Hunan
University, Changsha, Hunan, China.
forthc.
Analogy: guessable rules – Towards a better understanding of the
rendaku phenomenon. In Proceedings of LP2002, Shosuke Haraguchi, Bohumial Palek and Osamu Fujimura (eds.).
Okumura, Mitsuo
1955
Rendaku. In Kokugogaku jiten, Kokugo Gakkai (ed.), 916–961.
Tokyo: Tōkyōdō.
Ono, Masahiro
1995
Kindai no moji [Characters in Kindai (in and after 1338)]. In Gaisetu
Nihongo no rekishi [Survey of the history of Japanese], Takeyoshi
Satō (ed.), 42–83. Tokyo: Asakura.
Ōno, Susumu
1947–48 Nihonshoki no jion-gana ni okeru seidaku hyouki ni tuite [On the
sei-daku representations by kana of the on reading in Nihonshoki].
Kokugo to Kokubungaku [Japanese Language and Literature]
24 (11): 49–59 and 25 (1): 43–50.
1953
Joudai Kana-dzukai no Kenkyuu. [Study of Kana Usage in Joodai
(710–794)]. Tokyo: Iwanami.
1980
Nihongo no sekai 1: Nihongo no seiritsu [World of Japanese 1: Formation of Japanese]. Tokyo: Chūōkōronsha.
Oohashi, Junichi
2002
Tohoku hogen onsei no kenkyu [Study of the Sounds of Tohoku dialects]. Tokyo: Oufuu.
Orgun, Cemil Orhan
1998
Cyclic and noncyclic phonological effects in a declarative grammar.
In Yearbook of Morphology 1997, Geert E. Booij and Jaap van Marle
(eds.), 179218. Dordrecht: Kluwer.
Otsu, Yukio
1980
Some aspects of rendaku in Japanese and related problems. In Theoretical issues in Japanese linguistics: MIT Working Papers in Linguistics 2, Yukio Otsu and Ann Farmer (eds.), 207–227.
Ōtsubo, Heiji
1977
Katakana, hiragana [Katakana and hiragana]. In Nihongo 8: Moji
[Japanese 8: Characters], Susumu Ōno and Takeshi Shibata (eds.),
249–299. Tokyo: Iwanami.
Parker, Charles K.
1939
A dictionary of Japanese compound verbs. Tokyo: Maruzen.
Pater, Joseph V.
1999
Austronesian nasal substitution and other NC̥ effects. In The prosody-morphology interface, René W. J. Kager, Harry G. van der Hulst
and Wim Zonneveld (eds.), 310–343. Cambridge: Cambridge University Press.
Pater, Joseph V. and Adam Werle
2001
Typology and variation in child consonant harmony. Proceedings of
HILP 5: 119–139.
Pierrehumbert, Janet B. and Mary E. Beckman
1988
Japanese Tone Structure. Cambridge, Mass: MIT Press.
Ploch, Stefan
1999
Nasals on my mind: the phonetic and the cognitive approach to the
phonology of nasality. Doctoral dissertation, School of Oriental and
African Studies, University of London.
Polivanov, Yevgeny D.
1928
Two kinds of musical accent of the Mie dialect in Nagasaki Prefecture. Studies on the Japanese Language (translated by S. Murayama
1976): 61.
Port, Robert F., Jonathan M. Dalby and Michael L. ODell
1987
Evidence for mora timing in Japanese. Journal of the Acoustical Society of America 81 (5): 1574–1585.
Poser, William J.
1990
Evidence for foot structure in Japanese. Language 66: 78–105.
2002
Japanese periphrastic verbs and noun incorporation. Ms., University
of Pennsylvania.
Prince, Alan S.
1998
Foundations of Optimality Theory; Current directions in Optimality
Theory. In Handouts of lecture at the Phonology Forum 1998, Kobe
University, September 1998. Phonological Studies 2, the Phonological Society of Japan (ed.). Tokyo: Kaitakusha.
Prince, Alan S. and Paul Smolensky
1993
Optimality Theory: Constraint interaction in generative grammar.
Ms., Rutgers University and University of Colorado. [Blackwell, Oxford, 2004.]
Pulleyblank, Douglas G.
1997
Optimality Theory and features. In Optimality Theory: An Overview,
D. Archangeli and T. Langendoen (eds.), 59–101. Massachusetts,
USA and Oxford, UK: Blackwell.
2003
Covert feature effects. In WCCFL 22 Proceedings, Gina Garding and
Mimu Tsujimura (eds.), 398–422. Somerville, Mass.: Cascadilla Press.
Reinhart, Tanya M.
1976
The syntactic domain of anaphora. Doctoral dissertation, MIT.
Rice, Keren D.
1993
A reexamination of the feature [sonorant]: The status of sonorant
obstruents. Language 69: 308–344.
1997
Japanese NC clusters and the redundancy of postnasal voicing. Linguistic Inquiry 28: 541–551.
2003
Featural markedness in phonology: Variation. In The Second Glot International State-of-the-Article Book, Lisa Cheng and Rint Sybesma
(eds.), 389–429. Berlin /New York: Mouton de Gruyter.
Rice, Keren D. and J. Peter Avery
1991
On the relationship between laterality and coronality. In Phonetics and
Phonology 2. The Special Status of Coronals: Internal and External
Shibata, Takeshi
1962
Onin [Phonology]. In Hoogengaku gaisetsu [General survey
of dialectology]. Tokyo: Musashino Shoin.
Shibatani, Masayoshi
1990
The Languages of Japan. Cambridge: Cambridge University Press.
Shikano, Kiyohiro, Katsuteru Ito, Tatsuya Kawahara, Kazuya Takeda and Mikio
Yamamoto (eds.)
2001
Speech Recognition Systems. Tokyo: Ohmsha.
Shimizu, Katsumasa
1977
Voicing features in the perception and production of stop consonants
by Japanese speakers. Studia Phonologica 11: 25–34.
Shinohara, Shigeko
1997
The roles of the syllable and the mora in Japanese adaptations of
French words. Cahiers de Linguistique Asie Orientale 25 (1): 87–112.
Paris: CRLAO EHESS.
Siegel, Dorothy C.
1974
Topics in English phonology. Doctoral dissertation, MIT.
Smolensky, Paul
1994
Harmony, markedness, and phonological activity. Ms., Johns Hopkins
University (Available on the Rutgers Optimality Archive, ROA-37).
1995
On the structure of the constraint component Con of UG. Handout for
talk at UCLA.
1997
Constraint interaction in generative grammar II: Local conjunction.
Paper presented at the Hopkins Optimality Theory Workshop /University of Maryland Mayfest, May 8–12, 1997.
Steriade, Donca
1995
Underspecification and markedness. In The Handbook of Phonological Theory, John A. Goldsmith (ed.), 114–174. Oxford: Blackwell.
Strik, Helmer and Catia Cucchiarini
1999
Modeling pronunciation variation for ASR: A survey of the literature.
Speech Communication 29: 225–246.
Sugito, Miyoko
1965
Shibata-san to Imada-san: Tango-no chookakuteki benbetsu ni tsuiteno ichi koosatsu [Mr. Shiba-ta and Mr. Ima-da: A study in the auditory differentiation of words], Gengo Seikatsu 165 [S40-6], 64–72
(Reproduced in Miyoko Sugito 1998, Nihongo Onsei no Kenkyu
[Studies on Japanese Sounds]. Izumi-Shoin, Vol. 6: 3–15.)
1969
Akusento no aru museika boin [A study on accented voiceless vowels].
The Bulletin of the Phonetic Society of Japan 132: 13.
1969/70 Measurements of tone movement of vowels and hearing validity in
relation to accent in Japanese. Studia Phonologica 5: 1–19. University of Kyoto.
Takeda, Kazuya and Hisao Kuwabara
1987
Boin museika no youin bunseki to yosoku syuhou no kentou [Analysis and prediction of devocalizing phenomena]. Proceedings of the
1987 Autumn Meeting of the Acoustical Society of Japan 1: 105–106.
Tamamura, Fumio
1989
Gokei [Word form]. In Nihongo no goi imi [Words and meaning in
Japanese], Fumio Tamamura (ed.), 23–51. Tokyo: Meiji shoin.
Tanaka, Makirō
1995
Kodai no buntai, bunshou [Style and writing in Kodai (before 1338)].
In Gaisetu Nihongo no Rekishi [Survey of the History of Japanese],
Takeyoshi Satō (ed.), 190–206. Tokyo: Asakura.
Tanaka, Shin-ichi
1992
Accentuation and prosodic constituenthood in Japanese. Tokyo Linguistic Forum 5: 195–216.
2001
The emergence of the unaccented: Possible patterns and variations
in Japanese compound accentuation. In Issues in Japanese Phonology and Morphology, Jeroen M. van de Weijer and Tetsuo Nishihara
(eds.), 159–192. Berlin /New York: Mouton de Gruyter.
2002a
An OT-based integrated model of accent and accent shift phenomena
in Japanese. Phonological Studies 5, the Phonological Society of
Japan (ed.), 99–104. Tokyo: Kaitakusha.
2002b
Three reasons for favoring constraint reranking over multiple faithfulness. In A Comprehensive Study on the Phonological Structure of
Languages and Phonological Theory, Shosuke Haraguchi (ed.), 121–130.
Technical Report of Basic Sciences (A)(1), Grant-in-Aid for
Scientific Research by the Japan Society for the Promotion of Science.
2003a
Review of Eric Robert Rosen, 2001, Phonological Processes Interacting with the Lexicon: Variable and Non-Regular Effects in Japanese Phonology. GLOT International.
2003b
Japanese grammar in the general theory of prominence: Its conceptual basis, diachronic change, and acquisition. In A New Century of
Phonology and Phonological Theory: A Festschrift for Professor
Shosuke Haraguchi on the Occasion of His Sixtieth Birthday, Takeru
Honma, Masao Okazaki, Toshiyuki Tabata and Shin-ichi Tanaka
(eds.). Tokyo: Kaitakusha.
2005
Accent and Rhythm: From the Basics of Phonology to Optimality
Theory. Tokyo: Kenkyusha.
Tanaka, Shinichi and Haruo Kubozono
1999
Nihongo no Hatsuon Kyooshitsu [Introduction to Japanese Pronunciation]. Tokyo: Kurosio Publishers.
Tateishi, Koichi
2001
Onin jisho kurasu seeyaku no bunpu ni tsuite [On the distribution of
constraints for phonological sub-lexica]. Paper presented at the 26th
Annual Meeting of the Kansai Linguistic Society, Ryukoku University, Kyoto.
2002
Lexical stratification theories and (un)markedness. Paper presented
at LP 2002, Meikai University, September 3, 2002.
2003
Phonological patterns and lexical strata. In Proceedings of CIL 17,
E. Hajičová, A. Kotěšovcová and J. Mírovský (eds.). Prague: Matfyzpress, MFF UK.
Tōjō, Misao (ed.)
1954
Nihon Hoogengaku [Japanese Dialectology]. Tokyo: Yoshikawa
Koubunkan.
Tsujimura, Natsuko
1996
An Introduction to Japanese Linguistics. Oxford: Blackwell.
Tsukishima, Hiroshi
1972
Kodai no moji [Characters in Kodai] (approximately 8c–11c, in this
book). In Kooza kokugo-shi 2: Onin-shi Moji-shi [History of sounds
and characters], Norio Nakata (ed.), 311–444. Tokyo: Taishūkan.
Tsuru, Hisashi
1960
Manyoushuu ni okeru shakkun-gana no seidaku hyouki [Sei-daku
notations of kun reading kana in Man'yōshū]. Man'yō 36: 20–32.
1977
Manyougana [Man'yō-gana]. In Nihongo 8: Moji [Characters], Susumu Ōno and Takeshi Shibata (eds.). Tokyo: Iwanami.
Unger, J. Marshall
1977
Studies in Early Japanese Morphophonemics. Bloomington: Indiana
University Linguistics Club [Doctoral dissertation, Yale University,
1975].
Uwano, Zendo, Masao Aizawa, Kazuo Kato and Motoei Sawaki
1989
Onin sōran [Survey of Phonology]. In Nihon hōgen dai jiten [Encyclopedia of Japanese dialects], Munakata Tokugawa (ed.), 177. Tokyo:
Shogakukan.
Vance, Timothy J.
1980
The psychological status of a constraint on Japanese consonant alternation. Linguistics 18: 245–267.
1983
On the origin of voicing alternation in Japanese consonants. Journal
of the American Oriental Society 102: 333–341.
1987
An Introduction to Japanese Phonology. Albany: State University of
New York Press.
1992
Lexical phonology and Japanese vowel devoicing. In The Joy of
Grammar, Brentari et al. (eds.). Amsterdam: John Benjamins.
Yip, Moira J. W.
1980
The tonal phonology of Chinese. Doctoral dissertation, MIT.
Yokotani, Teruo
1997
Accent shift beyond the foot boundary: Evidence from Tokyo Japanese compound nouns. Journal of the Phonetic Society of Japan 1(1):
54–62.
Yoshida, Natsuya
2002
The effect of phonetic environment on vowel devoicing in Japanese.
Kokugogaku [Japanese Linguistics] 53 (3): 34–47.
Yoshida, Natsuya and Yoshinori Sagisaka
1990
Boin museika no youin bunseki [Factor analysis of vowel devoicing].
Technical Report of ATR Interpreting Telephony Research Laboratories (TR-I-0159).
Yoshida, Shohei
1991
Some aspects of governing relations in Japanese phonology. Doctoral dissertation, School of Oriental and African Studies, University
of London.
1996
Phonological Government in Japanese. Canberra: The Australian
National University.
Yoshida, Yuko Z.
1995
On pitch accent phenomena in Standard Japanese. Doctoral dissertation, School of Oriental and African Studies, University of London.
[Published in 1999 by Holland Academic Graphics. The Hague.]
Yoshioka, Hirohide
1981
Laryngeal adjustments in the production of the fricative consonants
and devoiced vowels in Japanese. Phonetica 38: 236–251.
Young, Steve J., Joop Jansen, Julian J. Odell, Dave Ollason and Phil C. Woodland
1999
The HTK Handbook. Entropic Research Laboratories.
Zamma, Hideki
1999
Affixation and phonological phenomena: From Lexical Phonology to
Lexical Specification Theory. Onin Kenkyuu [Phonological Studies]
2, the Phonological Society of Japan (ed.), 69–76. Tokyo: Kaitakusha.
2001
Accentuation of person names in Japanese and its theoretical implications. Tsukuba English Studies 20: 1–18.
2003
Suffixes and Stress/Accent Assignment in English and Japanese:
More Than a Simple Dichotomy. On-line proceedings of Linguistics
and Phonetics 2002 (LP2002), Meikai University, Tokyo. [http://
www.adn.nu/~ad31175/lp2002/lp2002main.htm].
Index of authors
Cabrera-Abreu, Mercedes, 85
Calabrese, Andrea, 25, 28
Campbell, Nick and Yoshinori Sagisaka, 232
Jurafsky, Daniel and James H. Martin,
194
Kager, René W. J., 137, 150n9
Kamei, Takashi, 53, 56
Kamei, Takashi, Rokuro Kono and
Eiichi Chino, 123, 152n22, 152n24
Kasuga, Kazuo, 49, 64n12
Kawai, Mieko, 103n23
Kawakami, Shin, 248
Kawasaki, Takako, 36
Kaye, Jonathan D., Jean Lowenstamm
and Jean-Roger Vergnaud, 71, 74
Kazama, Rikizō, 93
Kenstowicz, Michael J., 245n2, 262
Kess, Joseph E. and Tadao Miyamoto,
42f
Kikuchi, Hideaki and Kikuo Maekawa,
207
Kikuda, Norio, 98
Kindaichi, Haruhiko, Ooki Hayashi and
Takesi Sibata, 9
Kindaichi, Kyosuke, 123, 129
Kiparsky, R. Paul V., 41, 174
Kitahara, Mafuyu, 248
Kitahara, Yasuo, 94
Kiyose, Gisaburō N., 100n1
Kohler, Klaus J., 74, 229
Komatsu, Hideo, 53, 59, 65n23, 66n32,
66n35, 68n45
Kondo, Mariko, 4, 216, 226, 229ff, 230,
241, 244, 245n1, 245n2, 273
Kubozono, Haruo, 1, 2, 5ff, 11, 13, 14,
22, 44n2, 121n3, 157, 160, 173,
175n2, 189
Kula, Nancy Chongo and Lutz Marten,
86n3
Kuno, Susumu, 102n11
Kuroda, Shige-Yuki, 5, 25, 36, 45n8
Kuwabara, Hisao and Kazuya Takeda,
230
Labrune, Laurence, 25, 26, 28, 30,
44n2
Takeda, Kazuya and Hisao Kuwabara,
205, 206, 223, 230
Tamamura, Fumio, 2
Tanaka, Makirō, 66n32
Tanaka, Shin-ichi, 4, 157, 175n8, 261ff,
268, 269, 270, 273, 274, 277n1,
278n9, 278n11
Tanaka, Shinichi and Haruo
Kubozono, 23n5
Tateishi, Koichi, 106f, 109ff, 119,
120n2, 121n4, 121n7, 121n8
Tōjō, Misao, 149n1
Tsujimura, Natsuko, 81
Tsukishima, Hiroshi, 64n10
Tsuru, Hisashi, 49, 51, 64n8, 64n11,
64n12, 65n19
Unger, J. Marshall, 90, 123, 129
Uwano, Zendo, Masao Aizawa, Kazuo
Kato, and Motoei Sawaki, 123
Vance, Timothy J., 2, 25, 26, 27, 29ff,
35, 40, 42f, 44n1, 44n2, 45n5, 45n9,
45n10, 59, 69n52, 81, 89ff, 90, 91, 93,
94f, 100n2, 101n6, 101n8, 102n12,
103n17, 103n22, 123, 129, 150n9,
189n5, 240
Varden, J. Kevin, 45n10
Index of languages
Ewe, 14
Latin, 39
Lithuanian, 266
German, northern, 79
Germanic languages, 72
Greek, 39
Gujarati, 78
Pirahã, 277n5
Polish, 72, 79
Portuguese, 57, 110, 178, 180
Hindi, 78
Indonesian languages, 85
Italian, 14
Japanese,
Aomori, 124
common, 207
eastern, 69n54, 229
Kansai, 247f
Kanto, 128
Kinki, 21, 128
Kochi, 139
Kyoto, 21, 128, 247
Kyushu, 61
literary, 42
Quiché, 80, 84
Quileute, 84
Reef Island-Santa Cruz languages, 84
Romance languages, 72
Russian, 72, 178f
Serbo-Croatian, 79
Siouan languages, 267
Slavic languages, 72
Spanish, 72f, 78, 80, 84
Swedish, 72f
Thai, 73, 78, 80, 83, 84
Winnebago, 267
Zoque, 80, 84
Zuya-go, 22
Index of subjects