Rectilinear scales of pitch can account for the similarity of tones close together in frequency but not for the heightened relations at special intervals, such as the octave or perfect fifth, that arise when the tones are interpreted musically. Increasingly adequate accounts of musical pitch are provided by increasingly generalized, geometrically regular helical structures: a simple helix, a double helix, and a double helix wound around a torus in four dimensions or around a higher order helical cylinder in five dimensions. A two-dimensional "melodic map" of these double-helical structures provides for optimally compact representations of musical scales and melodies. A two-dimensional "harmonic map," obtained by an affine transformation of the melodic map, provides for optimally compact representations of chords and harmonic relations; moreover, it is isomorphic to the toroidal structure that Krumhansl and Kessler (1982) show to represent the psychological relations among musical keys.
A piece of music, just as any other acoustic stimulus, can be physically described in terms of two time-varying pressure waves, one incident at each ear. This level of analysis has, however, little correspondence to the musical experience. Because the ear is responsive to frequencies up to 20 kHz or more, at a sampling rate of two pressure values per cycle per ear, the physical specification of a half-hour symphony requires well in excess of a hundred million numbers. Clearly, our response to the music is based on a much smaller set of psychological attributes abstracted from this physical stimulus. In this respect the perception of music is like the perception of other stimuli such as colors or speech sounds, where the vast number of physical values needed to specify the complete power spectrum of a stimulus is reduced to a small number of psychological values, corresponding, say, to locations on red-green, blue-yellow, and black-white dimensions for homogeneous colors (Hurvich & Jameson, 1957) or on high-low and front-back dimensions for steady-state vowels (Peterson & Barney, 1952; Shepard, 1972). But what, exactly, are the basic perceptual attributes of music?

Just as continuous signals of speech are perceptually mapped into discrete internal representations of phonemes or syllables,

I first described the double helical representation of pitch and its toroidal extensions in 1978 (Shepard, Note 1; also see Shepard, 1978b, p. 183, 1981a, 1981c, p. 320). The present, more complete report, originally drafted before I went on sabbatical leave in 1979, has been slightly revised to take account of a related, elegant development by my colleagues Krumhansl and Kessler (1982).
The work reported here was supported by National Science Foundation Grant BNS-75-02806. It owes much to the innovative researches of Gerald Balzano and Carol Krumhansl, with whom I have been fortunate enough to share the excitement of various attempts to bridge the gap between cognitive psychology and the perception of music. Less directly, the work has been influenced by the writings of Fred Attneave and Jay Dowling. Finally, I am indebted to Shelley Hurwitz for assistance in the collection and analysis of the data and to Michael Kubovy for his many helpful suggestions on the manuscript.
Requests for reprints should be sent to Roger N. Shepard, Department of Psychology, Jordan Hall, Building 420, Stanford University, Stanford, California 94305.
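The estimate of "well in excess of a hundred million numbers" for a half-hour symphony can be checked with a few lines of arithmetic. The sketch below uses only the figures stated in the text (20 kHz, two pressure values per cycle, two ears, thirty minutes); the constant names are illustrative and not part of the original report.

```python
# Back-of-envelope count of pressure values needed to specify a half-hour piece.
MAX_FREQ_HZ = 20_000        # upper frequency limit quoted in the text
VALUES_PER_CYCLE = 2        # two pressure values per cycle
EARS = 2                    # one pressure wave incident at each ear
DURATION_S = 30 * 60        # half an hour, in seconds


def pressure_values(max_freq_hz, values_per_cycle, ears, duration_s):
    """Total number of pressure values for the given listening conditions."""
    return max_freq_hz * values_per_cycle * ears * duration_s


total = pressure_values(MAX_FREQ_HZ, VALUES_PER_CYCLE, EARS, DURATION_S)
print(total)  # 144000000
```

Under these figures the count comes to 144 million, which agrees with the order of magnitude given in the text.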
Copyright 1982 by the American Psychological Association, Inc. 0033-295X/82/8904-0305$00.75
306 ROGER N. SHEPARD
continuous signals of music are perceptually mapped into discrete internal representations of tones and chords; just as each speech sound can be characterized by a small number of distinctive features, each musical tone can be characterized by a small number of perceptual dimensions of pitch, loudness, timbre, vibrato, tremolo, attack, decay, duration, and spatial location. In addition, much as the internal representations of speech sounds are organized into higher level internal representations of meaningful words, phrases, and sentences, the internal representations of musical tones and chords are organized into higher level internal representations of meaningful melodies, progressions, and cadences.

The Fundamental Roles of Pitch and Time in Music

The perceptually salient attributes of the tones making up the musically significant chords and melodies are not equally important for the determination of those higher order units. In fact, for the music of all human cultures, it is the relations specifically of pitch and time that appear to be crucial for the recognition of a familiar piece of music. Other attributes, for example, loudness, timbre, vibrato, attack, decay, and apparent spatial location, although contributing to audibility, comfort, and aesthetic quality, can be varied widely without disrupting recognition or even musical appreciation.

The reasons for the primacy of pitch and time in music are both musical and extramusical. From the extramusical standpoint there are compelling arguments, recently advanced by Kubovy (1981), that pitch and time alone are the attributes that are "indispensable" for the perceptual segregation of the auditory ensemble into discrete tones. From the musical standpoint a case can be made that the richness and power of music depends on the listener's interpretation of the tones in terms of a discrete structure that is endowed with particular group-theoretic properties (Balzano, 1980, in press, Note 2). In the case of the human auditory system, moreover, the requisite properties appear to be fully available only in the dimensions of pitch (Balzano, 1980; Krumhansl & Shepard, 1979; Shepard, 1982) and time (Jones, 1976; Pressing, in press), the dimensions within which higher order musical units such as melodies and chords are capable of structure-preserving transformations.

In this paper I confine myself to the case of pitch and to the question of how a single psychological attribute corresponding, in the case of pure sinusoidal tones, to a simple one-dimensional physical continuum of frequency affords the structural richness requisite for tonal music. One can perhaps readily conceive how structural complexity is achievable in the dimension of time, through overlapping patterns of rhythm and stress, but the structural properties of pitch seem to be manifested even in purely melodic sequences of purely sinusoidal tones (Krumhansl & Shepard, 1979). In the absence of physical overlap of upper harmonics of the sort considered by Helmholtz (1862/1954) and by Plomp & Levelt (1965), wherein does this structure reside?

Cognitive Versus Psychoacoustic Approaches to Pitch

Until recently, attempts to bring scientific methods to bear on the perception of musical stimuli have mostly adopted a psychoacoustic approach. The goal has been to determine the dependence of psychological attributes, such as pitch, loudness, and perceived duration, on physical variables of frequency, amplitude, and physical duration (Stevens, 1955; Stevens & Volkmann, 1940) or on more complex combinations of physical variables (de Boer, 1976; J. Goldstein, 1973; Plomp, 1976; Terhardt, 1974; Wightman, 1973).

By contrast, the cognitive psychological approach looks for structural relations within a set of perceived pitches independently of the correspondence that these structural relations may bear to physical variables. This approach is particularly appropriate when such structural relations reside not in the stimulus but in the perceiver, a circumstance that is well known to students of music theory, who recognize that an interval defined by a given physical difference in log frequency may be heard very differently in different musical contexts. To elaborate on
STRUCTURE OF PITCH 307
can be thought of as projecting all tones with the same musical name but differing by octaves (e.g., the tones C, C', C'', etc.) down into a single corresponding point in a "chroma circle" on a plane orthogonal to the axis of the helix (see Figure 1). Moreover, this projective property, unlike the property of augmented proximity, holds regardless of the slope of the helix.

Physical Realization of the Chroma Circle

It is, in fact, the projective property of the regular helix that subsequently led me to a method of physically separating the two underlying components of pitch implicit in the helical representation, namely, the rectilinear component called pitch height, corresponding to the axis of the helix (or of the cylinder in which it lies), and the circular component variously called tone quality (Revesz, 1954) or tone chroma (Bachem, 1950, 1954), corresponding to the circumference of that cylinder (see Shepard, 1964b). What was required was the specification of two physical operations corresponding to the geometrical projection of the entire helix onto the central axis, in the one case, and onto an orthogonal plane, in the other. The auditory realization of the required operations was greatly facilitated by the development, at just this time, of computer techniques for the additive synthesis of arbitrarily specified tones (Mathews, 1963).

For the first operation I proposed a broadening of the band of energy around the center frequency of each tone until the resulting narrow-band noise encompassed about an octave. Because the different sounds generated in this way have different center frequencies, they still differ over the whole range of pitch height. But, because they have all been spread alike around the chroma circle, they can no longer be discriminated with respect to chroma. This operation thus corresponds to collapsing the helix onto its central axis.

For the second operation I proposed, instead, a harmonic elaboration of each tone until it included, alike, all multiples and submultiples of the original frequency (i.e., all tones standing in octave and multiple-octave relations to that original tone), with the amplitudes of the component frequencies determined by a fixed bell-shaped spectral envelope that was at its maximum near the middle of the standard musical range and that gradually fell away in both directions to below-threshold levels for very low and very high frequencies. The different sounds generated in this way remain fully distinct in chroma but are all equivalent in height. Thus, in shifting through chroma, from C to C# to D to D# and so on to B, the next step (though still heard as a step up in pitch), instead of leaving one an octave higher at C', leaves one back at the original starting tone C (Shepard, 1964b).² Indeed, application of multidimensional scaling to measures of similarity derived from judgments of relative pitch between tones varied in this manner (Shepard, 1964b) yielded the almost perfectly circular solution displayed in Figure 2 (a) (see Shepard, 1978a). (A similarly circular representation for tones generated in this way has also been reported by Charbonneau and Risset, 1973.)

Although this circular component, chroma, emerges most compellingly when the rectilinear component, height, is artificially suppressed, as with these special, computer-generated tones, the claim is that this circular component is necessarily present in all musical tones for which tones separated by an octave are perceived as more closely related than tones separated by a somewhat smaller interval. Circular multidimensional scaling solutions have in fact been obtained for ordinary musical tones. Figure 2 (b), for example, reproduces a similarly circular pattern subsequently obtained by Balzano (1977, 1982) from a multidimensional scaling analysis of his own discriminative reaction time data for melodic intervals.³

² The illusion of circular or "endlessly ascending" tones can be heard on a commercial record ("Shepard's Tones," 1970) or on a short 16-mm film (Shepard & Zajac, 1965), which we believe to be the first film in which both the sound and the animation were generated by computer. I demonstrated the independent variation of linear pitch height and circular tone chroma at the 1978 meeting of the Western Psychological Association (Shepard, Note 1).
³ One should, however, exercise caution in basing the inference of circularity solely on a multidimensional scaling solution. The curvature evident in some obtained solutions (e.g., the one reported by Levelt, Van de Geer,
Figure 2. Chroma circles recovered by multidimensional scaling (a) for 10 computer-generated tones
especially designed to eliminate differences in pitch height and (b) for twelve ordinary musical tones.
(Panel a is from "The Circumplex and Related Topological Manifolds in the Study of Perception" by
R. N. Shepard. In S. Shye (Ed.), Theory Construction and Data Analysis in the Behavioral Sciences.
San Francisco, Calif.: Jossey-Bass, 1978. Copyright 1978 by Jossey-Bass, Inc. Reprinted by permission.
Panel b is from Chronometric Studies of the Musical Interval Sense by G. J. Balzano. Unpublished
doctoral dissertation, Stanford University, 1977. Reprinted by permission.)
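The computer-generated tones of panel (a) rest on the second operation described in the text: each tone is built from components spaced at octave intervals, weighted by a fixed bell-shaped spectral envelope. A minimal sketch of that construction follows. The frequency range, the Gaussian form of the envelope, its width, and the 261.63-Hz starting tone are all assumptions chosen for illustration; the text specifies only that the envelope is bell shaped, peaks near the middle of the musical range, and falls below threshold at the extremes.

```python
import math

F_LOW = 20.0        # assumed lower edge of the usable range, in Hz
N_OCTAVES = 10      # assumed span: 20 Hz up to roughly 20 kHz
CENTER_LOG2 = math.log2(F_LOW) + N_OCTAVES / 2   # envelope peak at mid-range (log scale)
SIGMA_OCT = 1.5     # assumed envelope width, in octaves


def envelope(freq_hz):
    """Bell-shaped (here Gaussian) amplitude envelope over log frequency."""
    d = math.log2(freq_hz) - CENTER_LOG2
    return math.exp(-0.5 * (d / SIGMA_OCT) ** 2)


def octave_components(base_hz):
    """Return (frequency, amplitude) pairs for all octave relatives of base_hz in range."""
    f = base_hz
    while f >= 2 * F_LOW:   # fold down into the lowest octave [F_LOW, 2 * F_LOW)
        f /= 2
    while f < F_LOW:        # or fold up, if the base started below the range
        f *= 2
    return [(f * 2 ** i, envelope(f * 2 ** i)) for i in range(N_OCTAVES)]


def sample(base_hz, t):
    """Waveform value at time t (seconds): sum of the weighted sinusoidal components."""
    return sum(a * math.sin(2 * math.pi * f * t) for f, a in octave_components(base_hz))


def step_up(base_hz, semitones):
    """Shift a tone by a number of equal-tempered semitones."""
    return base_hz * 2 ** (semitones / 12)


c = 261.63  # hypothetical starting tone, in Hz
# Twelve semitone steps upward yield the same set of components as the start.
print(octave_components(step_up(c, 12)) == octave_components(c))
```

Because the envelope is fixed while the components shift beneath it, a tone and its octave transposition have identical spectra in this sketch, which is the property the text describes as remaining distinct in chroma while being equivalent in height.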
By now the possibility of analyzing perceived pitch into the rectilinear and circular components of height and chroma has been accepted by a number of researchers (e.g., Bachem, 1950, 1954; Balzano, 1977; Deutsch, 1972, 1973; Jones, 1976; Kallman & Massaro, 1979; Pikler, 1955; Revesz, 1954; Risset, 1978). Even among advocates of a helical representation, though, opinions may still differ concerning the relative merits of a geometrically regular structure such as I proposed versus a more or less distorted variant such as Ruckmick (1929) advocated. Here, again, one's predilection may depend on whether one takes a more psychoacoustic or a more cognitive and musical point of view.

Argument for a Geometrically Regular Structure

From the psychoacoustic standpoint it seems natural to suggest that the spacing between points in the representational structure, quite apart from whether that structure is basically rectilinear or helical in overall shape, should be adjusted to reflect how the operating characteristics of the sensory transducers shift as we move from low to high input frequencies. Someone preoccupied with such sensory considerations might even see some significance in the resemblance of a distorted spiral or helix to the anatomical conformation of the cochlea.

By contrast, a more cognitively and musically oriented approach to pitch is likely to regard such considerations of automatic peripheral transduction (and anatomy) as largely irrelevant. Adopting something like Chomsky's (1965) competence-performance distinction, I suggest that if it is musical pitch that interests us, the representation should reflect the deeper structure that underlies a listener's competence to impose a musical interpretation on a stream of acoustic inputs under favorable conditions. Such an interpretive structure continues to exist whether or not the acoustic stimuli in a particular stream fall within the range of

& Plomp, 1966) may simply reflect an artifact that almost always arises when basically one-dimensional data are fit in a higher dimensional space (Shepard, 1974, pp. 386-388).
Limitations of the Simple Helix

The structure of the simple regular helix pictured in Figure 1 was dictated by two considerations: invariance under transposition and increased similarity at the octave. In such a regular helix, moreover, the special octave relation is represented by unique collinearity or projectability as well as by augmented spatial proximity. As it stands, however, the helical structure does not provide either augmented proximity or collinearity for tones separated by any other special musical interval. Yet, beginning with Ebbinghaus, Drobisch's proposal of a helical representation has been criticized for its failure to account for the special status of the interval of a perfect fifth (see Ruckmick, 1929).

There are, indeed, a number of converging reasons for supposing that the fifth should, like the octave, have a unique status. (a) As has been known at least since Pythagoras, after the 2-to-1 ratio in the lengths of a vibrating string that corresponds to the octave (which as we now know determines a 1-to-2 ratio in the resulting frequency of vibration), the fifth corresponds to the next simplest, 3-to-2, ratio (Helmholtz, 1862/1954). (b) In the case of musical and, therefore, harmonically rich tones, those separated by a fifth also have, within the octave, the fewest upper harmonics that deviate from coincidence by an amount expected to produce noticeable beats (Helmholtz, 1862/1954; Plomp & Levelt, 1965). (c) Moreover, such beats can be subjectively experienced even when the harmonics that most contribute to them are not physically present (Mathews, Note 4; see also Mathews & Sims, 1981). (d) Correspondingly, simultaneously sounded tones differing by a fifth (even pure, sinusoidal tones) tend to be heard as particularly smooth, harmonious, or consonant, and the fifth, together with the similarly harmonious major third, completes the uniquely stable and tonally centered chord, the major triad (Meyer, 1956; Piston, 1941; Ratner, 1962; Schenker, 1906/1954, p. 252). (e) In addition, the interval of the fifth plays a pivotal role in tonal music, being the interval that separates musical keys that share the greatest number of tones and between which modulations of key most often occur (Forte, 1979; Helmholtz, 1862/1954; Schenker, 1906/1954). (f) Finally, according to Balzano's (1980, Note 2) group-theoretic analysis, the preeminence of the fifth in tonal music has a basis in abstract structural constraints independent of the psychoacoustic facts noted under (a) and (b).

Despite these diverse indications of the importance of the perfect fifth, the fifth has largely failed to reveal its unique status in psychoacoustic investigations for the same reason, I believe, that the octave often revealed its unique status only weakly, if at all. In the absence of a musical context, tones (particularly the pure sinusoidal tones favored by psychoacousticians) tend to be interpreted primarily with respect to the single, rectilinear dimension of pitch height. Without a musical context there is insufficient support for the internal representation of more complex components of pitch, components that may underlie the recognition of special musical intervals and that (like the chroma circle) are necessarily circular because, again, the musical requirement of invariance under transposition entails that each such component repeat cyclically through successive octaves.

Recent Evidence for a Hierarchy of Tonal Relations

Motivated by the considerations just given, Carol Krumhansl and I initiated a new series of experiments on the perception of musical intervals within an explicitly presented musical context. In this way we were in fact able to obtain clear and consistent evidence that the perfect fifth is at the top of a whole hierarchy of special relations within each octave. In these experiments we established the necessary musical context, just prior to presenting the to-be-judged test tone or tones, simply by playing, for example, the sequence of tones of a major diatonic scale (the tones named do, re, mi, fa, sol, la, and ti and corresponding, in the key of C major, to the white keys of the piano keyboard). Ratings of the ensuing test tones yielded highly consistent orderings of the musical intervals across listeners having equivalent musical backgrounds. This was true both in
the initial experiments in which listeners rated, in effect, the extent to which each individual test tone out of the 13 within one complete octave was substitutable for the tonic tone (do) that would normally have completed the major scale presented as context (Krumhansl & Shepard, 1979; also see Krumhansl & Kessler, 1982) and in further experiments in which listeners rated the similarities between the two test tones in all possible pairs selected from one complete octave (Krumhansl, 1979).

Only for listeners with little musical background did the obtained orderings of the musical intervals agree with previous psychoacoustic results in which similarity was determined primarily by proximity in pitch height between the two tones making up the interval, that is, in which the ranking of the intervals with respect to similarity or mutual substitutability of their two component tones was, from greatest to least, unison, minor second, major second, minor third, major third, and so on. For the more musically oriented listeners, the results tended, instead, toward the entirely different ranking: unison and octave (nearly equivalent to each other), followed by the fifth and sometimes the major third, followed by the other tones of the diatonic scale, followed by the remaining, nondiatonic tones (those corresponding in the key of C major to the sharps and flats or black keys of the piano).

In short, data collected from listeners who invest the test tones with a musical interpretation consistently reveal a whole hierarchy of tonal relations that cannot be accommodated within previously proposed geometrical representations of pitch whether rectilinear, helical, or spiral. Accordingly, it now appears justified to present some alternative, generalized helical structures together with the steps that led to their construction and some evidence that such generalized structures are indeed capable of accommodating the musically primary tonal relations. The following is intended, therefore, as the first full account of these new representations of pitch, first briefly described in 1978 (Shepard, Note 1; also see Shepard, 1981a, or, for a description concurrent with that presented here but following a different derivation, Shepard, 1982).

New Representations for Musical Pitch

The Diatonic Scale as an Interpretive Schema

In their characteristic eschewal of musical context, psychoacoustically oriented investigators missed the essential musical aspect of pitch. By failing to elicit, within the listeners, the discrete tonal schema or "hierarchy of tonal functions" (Meyer, 1956, pp. 214-215; Piston, 1941; Ratner, 1962) associated with a particular musical key, these investigators left the listeners with no unique cognitive framework within which to interpret the test tones. Even musically sophisticated listeners therefore had little choice but to make their judgments on the basis of the simplest attribute of tones differing in frequency: pitch height.

Cognitively oriented researchers are now recognizing that interpretive schemata play an essential role in the perception of musical pitch, just as they do in perception generally. In the case of pitch, the primary schema seems to be the musical scale, usually, in the case of Western listeners, the familiar major diatonic scale (do, re, mi, etc.). As noted by Dowling (1978b, in press), even though the most commonly used musical scales differ somewhat from culture to culture, they all share certain basic properties. Regardless of the total number of tones permitted by each scale, most are organized around five to seven "focal pitches" per octave. Moreover, the steps between such pitches rather than being constant in log frequency are almost always arranged according to a particular asymmetric pattern that repeats exactly within every octave.

The cyclic repetition of the pattern from octave to octave can be explained in terms of the perceived equivalence of tones differing by an approximately 2-to-1 ratio of frequencies, which led to the proposed simple helix for pitch. The other structural universals of musical scales have been attributed to pervasive cognitive constraints on the number of absolutely identifiable categories per perceptual dimension (7 ± 2, as enunciated by Miller, 1956; cf. Dowling, 1978b) and to the requirement that the scale have a structure that affords reference points or tonal centers to which a melody can move
tonality of the diatonic scale with respect to which the listeners interpret the test tones. Finally, in the informal experiment mentioned earlier in which a major triad was alternated with the same major triad displaced in pitch height, I noticed that between the unison and the octave displacement, the greatest perceived relation was at the displacement of a perfect fourth and a perfect fifth. But these intervals, which are adjacent to the unison around the circle of fifths, do not have the greatest number of component tones in the octave or unison relation; rather, for these displacements alone, all tones in either triad are in the diatonic scale determined by the alternately presented triad.

The rectilinear and the simple helical and spiral representations of pitch bear little relationship to the diatonic or related musical scales found in human societies. It is not surprising, therefore, that these previously proposed geometrical representations fail to provide an account of the various phenomena of culture-specific, context-dependent, and apparently categorical perception.

My own approach to the representation of pitch grew out of an informal observation concerning the diatonic scale: In listening to the eight successive tones of the major scale (do, re, mi, . . . , do), I tended to hear the successive steps as equivalent, even though I knew that with respect to log frequency, some of the intervals (viz., the interval mi to fa and the interval ti back to do an octave higher) are only half as large as the others. This apparent equivalence of successive steps of the diatonic scale could not be dismissed as an inability to discriminate between major and minor seconds.

When I then used a computer to generate a series of eight tones that divided the octave into seven equal steps in log frequency (a series that does not correspond to any standard musical scale), the successive steps sounded oddly nonequivalent. Apparently, just as we cannot voluntarily override the visual system's tendency to interpret parallelograms projected on the two-dimensional retina as rectangles in three-dimensional space (Shepard, 1981c, p. 298), I could not wholly override my auditory system's tendency to interpret tones in terms of the diatonic scale. Thus, after they had been made physically equal, the steps that should have been half as large if the scale had been diatonic (viz., the steps between Tones 3 and 4 and between Tones 7 and 8) sounded too large. Moreover, in just completed, more formal experiments, a student and I have now obtained strong quantitative confirmation of this phenomenon (Jordan & Shepard, Note 7).

Our tendency to hear the successive intervals of the diatonic scale as uniform, which I take to underlie this auditory illusion, probably depends on perceptual set. That tendency may be weakened in musicians such as singers, trombonists, and string players who, unlike passive listeners or those who primarily play keyboard or other wind instruments, must learn to make vocal or motor adjustments essentially proportional to actual differences in log frequency. (Such differences in set may in part account for the departure from equality of successive scale steps implied by the results of Frances, 1958, Experiment 2, or Krumhansl, 1979). Nevertheless, our own results (Jordan & Shepard, Note 7) indicate that there is a tendency to hear scale steps as more nearly equivalent than they physically are. The work of Balzano (1980, 1982; Balzano & Liesch, in press; Balzano, Note 2) has also provided support for the notion that in addition to pitch height and tone chroma, the discrete steps or "degrees" of the musical scale are psychologically real.

Suppose, then, that listeners interpret successive tones of the major scale (e.g., C, D, E, F, G, A, B, C', in the key of C major) by assimilating each tone to a node in an internalized representation of the diatonic scale. If the steps in this internal representation are functionally equivalent (as steps), then the perception of uniformity would follow, despite the fact that the physical differences are only half as great for the steps E to F and B to C' as for the other steps.

Derivation of a Double Helix

I propose to represent musical tones by points in space and, as in the case of the simple regular helix, to represent the underlying relations between their pitches by geometrical relations of distance and collinear-
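The contrast described above, between the major scale and a series of eight tones dividing the octave into seven equal steps in log frequency, can be made concrete with a short calculation. The sketch below is an illustration only: it assumes equal temperament for the diatonic reference (whole steps of 200 cents, half steps of 100 cents) and a hypothetical starting frequency of 261.63 Hz, neither of which is specified in the text.

```python
# Step sizes are expressed in cents (1200 cents = one octave in log frequency).
MAJOR_STEPS_SEMITONES = [2, 2, 1, 2, 2, 2, 1]   # do-re-mi-fa-sol-la-ti-do


def diatonic_step_cents():
    """Step sizes of the major scale, assuming equal-tempered semitones."""
    return [100 * s for s in MAJOR_STEPS_SEMITONES]


def equal_seven_step_cents():
    """Seven physically equal steps spanning one octave."""
    return [1200 / 7] * 7


def frequencies(start_hz, step_cents):
    """Frequencies of the eight tones obtained by stacking the given steps."""
    freqs = [start_hz]
    for cents in step_cents:
        freqs.append(freqs[-1] * 2 ** (cents / 1200))
    return freqs


START_HZ = 261.63  # hypothetical starting tone
diatonic = diatonic_step_cents()
equal = equal_seven_step_cents()

for i, (d, e) in enumerate(zip(diatonic, equal), start=1):
    # A positive difference means the equal step is physically larger than the diatonic one.
    print(f"Tones {i}-{i + 1}: diatonic {d} cents, equal {e:.1f} cents, difference {e - d:+.1f}")
```

In this sketch each equal step is about 171 cents. The steps between Tones 3 and 4 and between Tones 7 and 8 are therefore about 71 cents larger than the corresponding diatonic half steps, whereas the remaining steps are only about 29 cents smaller than diatonic whole steps, which is in line with the report that the former sounded too large.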
most stable tone, the dominant, a fifth above, cause the resulting structure then curves
is always the next-to-lowest tone in the set back into itself, merging points correspond-
of four adjacent scale tones along the other ing to distinct tones and eliminating the rec-
edge of the strip. tilinear component of pitch height. The case
Thus, by going to a more complex rep- in which alternate folds are made in opposite
resentation than a simple unidimensional directions does not lead to these undesirable
scale of pitch height, we avoid an objection consequences, however. As might be ex-
to the subjective equality of the intervals of pected from the fact that the most general
the diatonic scale, namely, that such unifor- rigid motion of three-dimensional space into
mity would "deny a major source of melodic itself is a combination of a rotation and a
variety" (Dowling, 1978b, p. 350). The translation along the axis of rotation, folding
structural uniqueness of the diatonic scale in alternating directions produces an endless
that underlies the desired variation of mel- helical structure with the amount of twist
ody and modulation of key can be embodied per octave determined by the uniform angle
in a qualitative asymmetry rather than in a of folding. Because there are two edges to
purely quantitative one. No such structural the originally flat strip, each corresponding
uniqueness is possible within scales that are to one of the two distinct whole-tone scales,
both quantitatively and qualitatively sym- such folding leads to a double helix.
metric, such as the whole-tone scale, rep- In order to achieve the collinearity of tones
resented by just one edge of the strip of tri- that are equivalent except for height, in ac-
angles, or the chromatic scale, represented cordance with Requirement 2, there must be
by the symmetrical zigzag path that alter- an integer number of full twists of the struc-
nates between the two edges of the strip. ture per octave. The flat version displayed
Still, although the strip of triangles shown in Figure 3 (a) corresponds to the trivial 0°
in Figure 3 (a) is consistent with Require- twist and, as I noted, must be excluded be-
ments 1 and 3, it is not consistent with Re- cause it does not segregate the chroma lines:
quirement 2, according to which any two They collapse into the two whole-tone scales.
tones standing in an octave relation (such At the other extreme, two full twists per
as C-C, C'-C", etc.) must fall on their own octave lead to a different kind of degeneracy
unique chroma line. For, in this flattened in which all the triangles collapse into a sin-
form, the line passing through C and C also gle triangle. This different kind of flat con-
passes through D, E, F1, G*, and A*, which figuration must also be excluded because in
are not octave or chroma equivalents to C it not only octaves but also minor thirds map
and C. Likewise, the line passing through onto each other and, again, we lose the com-
C* and C* also passes through D#, F, G, A, ponent of pitch height. The single remaining
and B. However, there is no requirement that case of just one full twist per octave is the
this structure remain flat. In fact, any fold- unique solution we seek. It is the nondegen-
ing of the strip of triangles along the sides erate double-helical structure of which one
of the triangles will preserve the equilater- octave is portrayed in Figure 3 (b). In three-
ality of the triangles imposed by Require- dimensional space it alone satisfies Require-
ments 1 and 3. But in order to ensure the ments 1-3.
full satisfaction of Requirement 1, the fold-
ing must be done in a uniform manner Emergent Properties of the Double Helix
throughout the strip. Only then will the
transformation of the strip into itself induced Musically significant properties emerge
by transposition into a different key consist from the double-helical representation that
only of rigid translations, rotations, and re- were not explicitly used in its derivation.
flections of the structure as a whole and, First, successive tones of the chromatic scale
thus, preserve all distances within it. Spe- project onto the axis of the helix in equal
cifically, all folds must be made at the same steps of pitch height. More remarkably, as
angle. is illustrated in Figure 3 (c), the same tones
The case in which all folds are also made project down onto the plane orthogonal to
in the same direction can be ruled out be- that axis to form a circle, the "cycle of
because the unwrapped surface of the cylinder is viewed in Figure 4 as if from the inside of the cylinder.) In order to represent the unbounded character of the cylindrical surface, the rectangle can be endlessly repeated in the plane as indicated in the figure.

The chromatic scale is represented in this two-dimensional map by the sequence of notes on any of the straight lines directed upward and to the right. The two whole-tone scales are represented by the two distinct sequences of notes on the somewhat steeper straight lines directed upward and to the left. The diatonic scale, designated by the stippled band, exhibits a 3-4-3-4- . . . zigzag pattern in which strings of whole-tone steps are asymmetrically broken by single half-tone steps.

Corresponding to the division of the tones into those that are and those that are not in a particular key by a plane pivoted about the axis of the helix, the tones belonging to a particular key fall, in the flattened representation, within a particular vertical band demarcated in the figure for the key of C by the lighter dashed lines. Modulation to another key corresponds, here, to a horizontal shift of the vertical band with, again, more closely related keys obtainable by smaller shifts. Alternatively, transpositions from any major key to any other can be thought of as translations of the stippled zigzag pattern of the diatonic scale from one location to another within this two-dimensional plane. That the two keys most closely related to a given key (e.g., C) are the two obtained by a shift of a fifth up (to G) or down (to F) is reflected in the geometrical fact that this zigzag pattern overlaps most with itself when the straight group of three adjacent points is superimposed on the straight group of four, slipped into either of the two alternative positions within that group.

Other commonly used scales take similar zigzag patterns within this space. In one of its versions, the relative minor scale is identical to its associated major scale except that the sequence is started and ended on a different tone in the sequence (e.g., on A rather than on C in the example illustrated). Indeed, the seven so-called authentic church modes (which derive rather directly from the earlier Greek modes) all correspond exactly to the diatonic scale, differing only in which tone is taken as the principal (beginning, final, or tonic) tone in the scale. Moreover, the most common pentatonic scale is given by the complement of the diatonic scale (e.g., by C#, D#, F#, G#, and A# in the figure or by the black keys on the piano). Notice that apart from the always permissible key-changing translation, such a pentatonic scale is equivalent to a diatonic scale in which the two most outlying tones within the vertical band have simply been deleted (e.g., B and F in the diatonic key of C). The resulting 2-3-2-3- . . . pattern preserves many of the desired structural properties of the 3-4-3-4- . . . diatonic scale and, again, changes of key correspond to horizontal shifts of the (now narrower) vertical band.

The adjacency of tones differing by half- and whole-tone steps in the embedded two-dimensional lattice preserves proximity in pitch height. Thus, this two-dimensional structure is particularly suited for the representation of melodies as well as scales. For, as might be expected on the basis of Gestalt principles of good continuation and grouping by proximity (e.g., see Deutsch & Feroe, 1981), transitions in pitch between successive tones of a melody are most commonly a single step in the diatonic scale (Dowling, 1978b; Fucks, 1962; Merriam, 1964; Philippot, 1970, p. 86; Piston, 1941, p. 23). For example, in an analysis of nearly 3,000 melodic intervals in 80 English folk songs, Dowling (1978b, p. 352) found, even after omitting the unison, that 68% of the transitions were no larger than one step on the diatonic scale, and 91% no larger than two steps.6 On the basis of these considerations,

6 There may be more than one reason for this striking predilection for small melodic steps. As Dowling (1978b) notes, it probably stems, in part, from basic limitations of human memory capacity. It seems to be related to Deutsch's (1978) finding that accuracy of recognition of a repeated tone falls off inversely with the average size of the intervals in an interpolated sequence of tones. As I have suggested, however, its close connection to Gestalt principles of visual perception (Koffka, 1935; Köhler, 1947), to phenomena of "melodic fission" or "auditory stream segregation" (Bregman, 1978; Bregman & Campbell, 1971; Dowling, 1973b; McAdams & Bregman, 1979; van Noorden, 1975), and to the closely allied phenomena of the "trill threshold" (Miller & Heise, 1950) and apparent movement in pitch (Shep-
justable through variation of three parameters: a weight for the circular component for fifths, a weight for the circular component for chroma, and a weight for the rectilinear component for height. Thus, we can accommodate the relations between musical pitches as they are perceived by different listeners who may vary, for example, in the extent to which they represent the cognitive structural component of the circle of fifths versus the purely psychoacoustic component of pitch height.

From this standpoint the original double helix (Figure 3) was perhaps too rigid. In present terms it can be seen to be the Euclidean sum of just the circle of fifths and the dimension of pitch height. But we now know that listeners vary widely in their responsiveness to these two attributes (Krumhansl & Shepard, 1979; Shepard, 1981a). So, in fitting the helix to data, we should perhaps make what might be regarded as a concession to performance, as opposed to competence, and allow a differential stretching or shrinking of the vertical extent of an octave of the helix relative to its diameter. This implies, of course, a departure from the constraint imposed by my third requirement (uniformity of scale steps), but I noted at the time that such a requirement may be appropriate only under a certain "perceptual set." If listeners differ in their judgments of the relations between tones, they are not all operating under identical perceptual sets.

In terms of the two-dimensional map of the double helix, differences in relative salience of pitch height and the circle of fifths would be accommodated by a certain class of linear transformations of the rectangle, namely, those restricted to relative stretching or shrinking of the rectangle along its vertical and horizontal axes only. In the limiting cases in which there is a degenerate collapse of the rectangle in the horizontal or vertical direction, we obtain a rectilinear (and logarithmic) dimension of pitch height or a simple circle of fifths, respectively.

Recovery of the Geometrical Representation From Empirical Data

The probe technique introduced by Krumhansl and Shepard (1979) is continuing to yield robust, orderly, and informative data concerning the effects of musical contexts on the interpretive schema induced within the listener (Krumhansl & Kessler, 1982; Jordan & Shepard, Note 7). In this technique the context (e.g., a musical scale, a melody, a chord, a sequence of chords, or some richer musical passage) is immediately followed by a probe tone (selected, for example, from the 13 chromatic tones inclusively spanning a one-octave range), and the listener is asked to rate (on a 7-point scale) how well the probe tone "fits in" with the preceding context.

For any context, the average ratings from trials using different probe tones form a profile over the octave that, in the case of musical listeners, reveals the hierarchy of tonal functions induced by that context, with the ratings highest for a tone interpreted as the tonic, next highest for a tone interpreted as the dominant, and so on (see Krumhansl & Shepard, 1979; and, especially, Krumhansl & Kessler, 1982). Indeed, Krumhansl and Kessler demonstrate that the circularly shifted position (or phase) of the profile permits one to infer which of the 24 possible major or minor diatonic keys is instantiated as the listener's momentary interpretive framework.

The rating of how well a given probe tone fits in with the preceding context can be interpreted as a measure of the spatial proximity of that probe to the ideal tonal center or tonality implied by that context. When applied to a suitably complete set of such proximity measures, techniques of multidimensional scaling (see Shepard, 1980) should therefore enable one to reconstruct the underlying spatial structure.

In the original experiment by Krumhansl and Shepard (1979), however, only the diatonic scale of a single key (C major) was presented as context. As a result, the obtained rating profiles directly provided information about the spatial proximities of the 13 probe tones to a single tonal center, C. However, there is every reason to believe that under transposition into any other key, the profile would be essentially invariant except for random fluctuations in the data, and this assumption already has some empirical support (Krumhansl & Kessler, 1982;
Jordan & Shepard, Note 7; also see Krumhansl, Bharucha, & Kessler, 1982). Accordingly, it seemed reasonable to approximate the complete matrix of proximity measures needed for the application of multidimensional scaling by simply duplicating the profile of ratings in each row of a square matrix, after circularly shifting each succeeding row by one cell so that the highest number (corresponding to the functional identity of the tonic tone and the ideal tonal center) fell on the principal diagonal of the matrix. Then, as is customary in multidimensional scaling, each entry in the matrix was averaged with its diagonally opposite counterpart to yield a symmetric matrix of proximity measures. Because large individual differences, which are related to extents of musical background, characteristically emerge in these experiments (Krumhansl & Shepard, 1979; Shepard, 1981a), a symmetric matrix of the required sort was obtained separately from the average rating profile obtained from each of the 23 subjects in the experiment by Krumhansl and Shepard (1979). Application of individual-difference multidimensional scaling (INDSCAL; Carroll & Chang, 1970) to the entire resulting set of 23 individual matrices then yielded the four-dimensional solution presented in Figure 7.

Panel a shows the projection of the solution onto the plane of Dimensions 1 and 2, whereas Panel b shows its projection onto the plane of Dimensions 3 and 4. As suggested by the circular dashed line, the first projection (a) is essentially the chroma circle, going clockwise from C (through C#, D, D#, etc.) around to C' an octave above. The configuration departs from the chroma circle, however, in that the spacing is wider near C and C' and, particularly, in that C and C' do not coincide as they should if oc-
[Figure (image not recovered); surviving axis labels: Dimension 1, Dimension 3, Height]
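The matrix construction described for the Krumhansl and Shepard (1979) rating profiles (duplicate the profile in every row, shift each succeeding row by one cell, then average each entry with its diagonally opposite counterpart) can be sketched as follows. This is only an illustrative sketch: the ratings in `example_profile` are invented placeholders, not data from the experiment, the function names are hypothetical, and the profile is reduced to 12 pitch classes rather than the 13 probe tones actually used.

```python
def shifted_profile_matrix(profile):
    """Row i is the profile rotated right by i cells, so that the tonic
    rating profile[0] always falls on the principal diagonal."""
    n = len(profile)
    return [[profile[(j - i) % n] for j in range(n)] for i in range(n)]


def symmetrize(matrix):
    """Average each entry with its diagonally opposite counterpart."""
    n = len(matrix)
    return [[(matrix[i][j] + matrix[j][i]) / 2 for j in range(n)]
            for i in range(n)]


# Invented ratings on a 7-point scale, indexed by semitones above the tonic.
example_profile = [6.5, 2.0, 3.5, 2.5, 4.5, 4.0, 2.0, 5.5, 2.5, 3.5, 2.0, 3.0]

raw = shifted_profile_matrix(example_profile)
proximities = symmetrize(raw)
```

In this sketch the entry relating the tonal center C to G averages the rating of the fifth above with the rating of the fourth above, which is what symmetrization does to an asymmetric profile. A matrix of this kind, one per listener, is the sort of input an individual-differences scaling program would expect.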
tave equivalence had been complete for all listeners. Dimension 1 thus seems to combine one dimension of the chroma circle with the dimension of pitch height. The second projection (b), however, is an almost perfect circle of fifths, with the points representing C and C' nearly superimposed, indicating complete octave equivalence. Apart from the stretching and separation between the points representing C and C' on Dimension 1, the four-dimensional configuration is, in fact, the double helix on the torus depicted in Figure 5, which corresponds to the Euclidean product of the chroma circle and the circle of fifths.

Panels c and d display the INDSCAL weights for each of the listeners on each of the four dimensions. Panel c shows that the listeners with the least extensive musical backgrounds (represented by the triangles) had the heaviest weights on Dimension 1, which separated the tones with respect to pitch height, whereas Panel d shows that the listeners with the most extensive musical backgrounds (represented by the circles) had the heaviest weights on Dimensions 3 and 4, which contained the circle of fifths and implied complete equivalence between octaves. Moreover, the fact that the points for all listeners fall on a 45° line in Panel d means that the circle of fifths emerges, to whatever extent that it does for any one listener, as an integrated whole and never one dimension at a time. That the points for Group 1 listeners also fall close to the (broken) 45° line in Panel c indicates that under the complete octave-equivalence characteristic of the most musical listeners, the chroma circle, too, comes and goes as an integrated unit.

I interpret the obtained four-dimensional structure in Figure 7 as a one-octave piece of the endless five-dimensional theoretical structure portrayed in Figure 6. But because it includes only one octave, the gap between the two ends, which should have been represented by a displacement in a separate fifth dimension, has (with a small distortion) been accommodated in the four dimensions of the embedding space of the torus. In other words, the separation in pitch height between C and C' an octave above has been achieved by cutting through the torus in Figure 5 at C and springing it slightly apart (with respect to chroma) in that same four-dimensional space rather than, as in Figure 6, in an orthogonal fifth dimension. Presumably, if similar data were collected and analyzed for tones spanning two or three octaves, the data could no longer be fit by a small distortion of this sort in the four-dimensional space and, thus, the truer, five-dimensional structure would emerge.

Further support for these conclusions comes from a linear regression used to assess the importance of the various proposed geometrical components of pitch, including (in addition to the one-dimensional component for height and the two circular components for chroma and for perfect fifths already discussed here) a two-dimensional component for major thirds. The use of linear regression for this purpose is made possible by the principle of "Euclidean composition," according to which the squared distance between any two points in the final, higher dimensional configuration is a weighted sum of the squared distances between the corresponding points in each of the component configurations. What the regression analysis thus yields is the set of weights, one for each component configuration, that provides the best fit to the data. (See Shepard, 1982, for a fuller explanation of the analysis and the results.) The results indicated that the circles of chroma and of fifths did indeed account for a significant portion of the variance but that the factors of height and of major thirds were also significant for the least and the most musical listeners, respectively. More specifically, the principal factors (with fractions of variance accounted for in the symmetrized data) were for the most musical listeners, fifths (.43), chroma (.21), and thirds (.19); for the intermediate listeners, chroma (.36), fifths (.21), and thirds (.16); and for the least musical listeners, chroma (.39) and height (.31) only (Shepard, 1982, Table 1).

Clearly, then, pitch is multidimensional, and the different dimensions differ in salience for different listeners (as well as in different musical contexts). Moreover, because the final configuration (whether fitted by multidimensional scaling or Euclidean composition) contained circular compo-
nents, the obtained structures provide further support for the claim that some of the dimensions of pitch are circular. The structure as a whole thus appears to be consistent with the theoretical expectation of a helical or, under complete octave equivalence, toroidal character.

The Problem of Harmonic Relations

The generalizations of the double helix for pitch presented in the two preceding sections depended on an implicit weakening of Requirement 3 underlying the original derivation of that double helix, namely, the requirement that steps within a diatonic scale correspond to equal distances within the model. Only by accepting a weakening of this strong constraint can we provide for the quantitative variations in the relative weights of underlying components (height, chroma, fifths, etc.) necessary to fit the data of different listeners. In terms of the two-dimensional melodic map of the manifold in which the double helix is wound, changes in the relative weights correspond to linear stretchings or shrinkings of the rectangular melodic map along either or both of its two principal axes. One of these two axes corresponds to the circle of fifths, whereas the other corresponds to pitch height, chroma, or a mixture of these in the case of the straight cylindrical model (Figure 3 [c]), the toroidal model (Figure 5), or the cylindrical helical model (Figure 6), respectively.

The coefficients of this linear transformation determine the relative importance of certain musical intervals, namely, the perfect fifth, the octave, and (through emphasis of either pitch height or chroma) the minor second or chromatic step. Within this restricted class of linear transformations of the melodic map there is, however, no way to emphasize the intervals of the major or minor third. Yet these two intervals, though of limited importance for melodic structure, are fundamental to harmonic structure (Piston, 1941, p. 10). Thus, whereas the most common intervals between successively sounded tones (melodic intervals) are, as we noted, the minor or major seconds, which are adjacent in the melodic map (Figure 4), such intervals are dissonant when sounded simultaneously (that is, as harmonic intervals). Harmony, which governs the selection of tones to be sounded simultaneously in chords, is therefore based on the consonant intervals of the major and minor third, along with the consonant perfect fifth, which together make up the particularly harmonious, stable, and tonality-defining major triad (e.g., C-E-G, in the key of C).

There are two directions in which the geometrical models proposed here might be generalized in order to accommodate harmonic relations. One possibility, already suggested at the end of the preceding section, is to add further components to the structural representation. In addition to the one-dimensional component of pitch height, the two-dimensional component of chroma, and the two-dimensional component of the circle of fifths, we could include another two-dimensional component of major thirds. As I noted, such an extended model will indeed permit a somewhat better fit to the data (for the most musical listeners, an increase from 63% to 83% of the variance explained; see Shepard, 1982, Table 1). However, the prospect of increasing the number of dimensions of the embedding space from five to seven is not terribly attractive.

A second possibility is to remove the restriction that the linear expansions or compressions of the melodic map are to be permitted only along the two orthogonal axes of that map. If we allow elongations and contractions along arbitrarily oriented directions in the plane of that map, we can bring tones separated by other musical intervals into closer proximity without increasing dimensionality. Linear transformations of this more general type are called affine transformations (Coxeter, 1961); they preserve straight lines and parallelism, which is desirable if we are to continue to use collinearity and parallel projection to represent musical relations. Even so, we are not free to choose any affine transformation of the plane but must confine our choice to one that is compatible with the toroidal interpretation of the plane. Expansions or contractions along the orthogonal axes of the rectangular map can be of any magnitudes, corresponding to changes in the relative sizes (or weights) of the two circular components
(chroma and fifths) that generate the torus. But expansions or contractions along other directions can take on only certain discrete values, corresponding to operations of cutting through the torus in a plane parallel to one of the two generating circles, giving one free end of the resulting cylindrical tube an integer number of 360° twists relative to the other end, and then reattaching the two ends. Twists that are not multiples of 360° would fail to rejoin the two ends of each line on the plane (or helical path on the torus) and, hence, would disrupt continuity required for an affine transformation and collinearity desired for our musical interpretation.

The Harmonic Map and Its Affinity to the Melodic Map

In the present case we seek an affine transformation (of the admissible sort) that will bring tones separated by the harmonically important major and minor thirds and perfect fifth into close mutual proximity. In Figure 4 we see that for any given tone, say C at the lower left corner of the rectangle, the major third (E), the minor third (D#), and the perfect fifth (G), upward and to the right, form the vertexes of an elongated parallelogram in which the lower triangular half corresponds to the major triad built on C (viz., C-E-G) and the upper, complementary triangular half corresponds to the minor triad built on C (viz., C-D#-G). Because no other tones fall within such parallelograms, an appropriate affine transformation should bring all such sets of four tones into the desired, more compact form. Fortunately, this can be achieved in a way that is compatible with the toroidal interpretation.

First, we cut through the original torus (Figure 5) at some point on the circle of fifths. Then we give the chroma circle at one end of the resulting cylindrical tube two full 360° twists relative to the chroma circle at the other end. Finally, we reattach the two ends, forming a new torus. These operations are most adequately visualized in terms of the two-dimensional map of the torus, as shown in Figure 8. The first twist takes the rectangle of the original melodic map (a, which is a simplified version of the earlier Figure 4) into the sheared parallelogram (b). However, because the transformation is on the torus, the lower triangular half of the resulting parallelogram wraps around to fill in the vacated upper triangular half of the rectangular map (as shown in b). The second 360° twist operates in the same way (to take b into c). Finally, a relative expansion on the horizontal axis and a circular shift of the entire pattern around the torus to bring the tone C into the center of the map yields the final, transformed map (d) shown at the bottom of Figure 8.

Because the shearing transformation induced by the double twist was confined to the vertical axis corresponding to the chroma circle, the horizontal axis was unaffected and still corresponds to the circle of fifths. As a consequence the tones falling within any particular key continue to form a vertical band (illustrated for the key of C major in the figure), and modulations between keys still correspond to horizontal shifts of this band and, hence, to rotations around the torus. However, owing to the twist in the torus entailed by the affine transformation, the harmonic map is not best described as a double helix wrapped around a torus. In it the series of perfect fifths now forms a single helix wrapped three times around the torus; the three series of major thirds form a triple helix wound once around the torus crosswise to the series of fifths; and the four series of minor thirds form four separate circular rings around the torus crosswise to both of the first two types of series. (See the horizontal and diagonal rows of letter names for the tones in Figure 8 [d].)

The important result is that tones related to any tone by major and minor thirds, as well as by perfect fifths, have now been brought into spatial proximity to that tone, whereas the formerly proximal tones related by major and minor seconds have been displaced to greater distances. This is illustrated for the tone C in Figure 8 (d), where, as can be seen, all the tones forming major or minor triads with C constitute a compact hexagonal cluster around C. Moreover, the tones traditionally considered to be consonant when sounded simultaneously with C now fall in a compact circular region around C. These same relations also hold for any other tone chosen as a reference point. Ac-
[Figure 8 panels: (a) Melodic Map; (b) after the 1st 360° twist; (c) after the 2nd 360° twist; (d) Harmonic Map (circularly shifted around torus to bring C into center of map). Axis label: Fifths.]
Figure 8. The harmonic map (d) as obtained from the melodic map (a) by an affine transformation consisting of two 360° twists of the torus (b and c).
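The restriction to whole numbers of 360° twists can be illustrated with a small sketch. It assumes, purely for illustration, that the torus is given unit coordinates: `x` is the position around the circle of fifths and `y` the position around the chroma circle, each taken modulo 1. A twist of `turns` full rotations is then modeled as the shear `y -> y + turns * x`. The function names and the coordinate convention are assumptions of this sketch, not notation from the text.

```python
from fractions import Fraction


def twist(x, y, turns):
    """Rotate the chroma coordinate y by `turns` full turns per circuit
    of the fifths coordinate x, on a torus with unit periods."""
    x = Fraction(x)
    y = Fraction(y)
    return (x % 1, (y + turns * x) % 1)


def rejoins(turns, samples=12):
    """Check that the cut ends x = 0 and x = 1 (the same place on the
    torus) still meet after the twist, for a sample of chroma values."""
    for i in range(samples):
        y = Fraction(i, samples)
        if twist(0, y, turns) != twist(1, y, turns):
            return False
    return True


# Two successive single twists give the same result as one double twist.
point = (Fraction(7, 12), Fraction(1, 12))
twice = twist(*twist(*point, 1), 1)
double = twist(*point, 2)
```

Under these assumptions `rejoins(2)` holds while `rejoins(Fraction(1, 2))` does not: a half twist leaves the two cut ends of each line displaced by half a turn, which is the loss of continuity that rules out non-integer twists.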
cordingly, I propose that this transformed map be called the harmonic map. Whereas the melodic map provided for compact representations of musical scales and, hence, the most common melodies, the harmonic map provides for compact representation of consonant intervals and, hence, the most common chords. Although not explicitly illustrated, chords consisting of more than three tones, including augmented and diminished chords, though more complex, also have compact representations in this map.7

7 As an amusing instance of "converging evidence," in the version of this space that Attneave presented at the symposium (Note 8), which was reflected with respect to the version shown here, the pattern for the seventh chord took, as he observed, the unmistakable shape of the numeral 7.
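The contrast between the two maps can be checked numerically with a rough sketch. Several choices here are assumptions made only for illustration: each pitch class `p` (0 = C) is placed at a horizontal position given by its index on the circle of fifths and a vertical position given by its chroma, both in twelfths of a circuit; the double twist is modeled as adding twice the fifths index to the chroma coordinate; distances are measured on a flat torus; and the horizontal scale `H` is set so that half-tone and whole-tone steps are equally long in the melodic map. The direction of the twist and the value of `H` are not taken from the text.

```python
import math

NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
H = 1 / math.sqrt(7)  # assumed horizontal scale (see lead-in)


def fifths_index(p):
    """Position of pitch class p on the circle of fifths (7 * 7 = 1 mod 12)."""
    return (7 * p) % 12


def melodic(p):
    """Melodic map: (fifths index, chroma)."""
    return (fifths_index(p), p % 12)


def harmonic(p):
    """Harmonic map: chroma sheared by two full twists per circuit of fifths."""
    k = fifths_index(p)
    return (k, (p + 2 * k) % 12)


def torus_distance(a, b, h=H):
    """Shortest distance on a flat torus with period 12 on both axes."""
    dx = min((a[0] - b[0]) % 12, (b[0] - a[0]) % 12) * h
    dy = min((a[1] - b[1]) % 12, (b[1] - a[1]) % 12)
    return math.hypot(dx, dy)


def nearest(layout, p, count):
    """Names of the `count` tones closest to pitch class p in a layout."""
    others = [q for q in range(12) if q != p % 12]
    others.sort(key=lambda q: torus_distance(layout(p), layout(q)))
    return {NAMES[q] for q in others[:count]}


melodic_neighbors = nearest(melodic, 0, 4)
harmonic_neighbors = nearest(harmonic, 0, 6)
```

With these assumptions the four tones nearest C in the melodic map are the minor and major seconds above and below it, whereas the six nearest in the harmonic map are the tones a third or a fifth away, that is, the tones that form major or minor triads with C. Because both layouts are additive in `p`, the same pattern holds around any other reference tone.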
such a way that musical transposition corresponds to rigid motions of the structure into itself. However, if we cut off the apex of the cone and then cut through the resulting closed band, we obtain a strip that, with a slight twist, can be continued upward and downward as a helical structure (much as the torus of Figure 5 was cut through, twisted, and continued to form the higher order helix of Figure 6). A further consideration is raised by Tversky's (1977) demonstration that perceptually more salient stimuli are judged to be both more similar to each other when the judgments are of similarity and more different from each other when the judgments are of dissimilarity. It is possible that the tones that are perceived as closely related to the tonal center are rated as more similar when similarity is judged because they are more salient and not because they are closer in the underlying representational space. If so, the underlying structure might be more similar, though evidently not isomorphic, to one of the generalized helical structures proposed here. Alternatively, it may be that the departures from helical regularity in Krumhansl's conical solution reflect real differences in the relative stabilities of the tones within the diatonic context. (These possibilities are further considered in Shepard, 1982, pp. 383-384.)

The toroidal manifold of possible tonalities obtained more recently by Krumhansl and Kessler (1982) is, apart from its inclusion of the minor tonalities, isomorphic to the toroidal manifold whose two-dimensional harmonic map is shown in Figure 8. However, whereas the map in Figure 8 allows the inclusion of pitch height and, hence, the extension of the structure into higher and lower octaves in the manner illustrated in Figure 6, Krumhansl and Kessler's solution is a closed torus with complete octave equivalence and, hence, no component of pitch height.

This closure of the torus was ensured in Krumhansl and Kessler's experiment by generating the tones presented to the listeners according to a scheme that achieves complete equivalence of octaves and suppression of pitch height (after Shepard, 1964b). Moreover, such a closure is appropriate in their case because pitch height is not relevant to tonality. To say that a piece is in F major is not to say whether it is to be played by a piccolo or a tuba. Krumhansl and Kessler's work elegantly uses the tone-probe technique to show how a listener's internally represented tonal center shifts in this closed toroidal surface under the influence of an evolving musical context.

As I mentioned earlier, there may be fundamental parallels between the cognitive structural constraints governing visual-spatial and auditory-musical representation. Pitch is the "morphophoric" medium that is most analogous to physical space in the case of vision (Attneave, 1972; Attneave & Olson, 1971; Kubovy, 1981; Shepard, 1981c, 1982). Musically meaningful objects, such as melodies and chords, are the closest auditory analogs of visual shapes because such objects preserve their structure under rigid transformations within their respective morphophoric media. Moreover, because the medium for the rigid transformation of melodies and chords has circular components of chroma and fifths, these rigid transformations include rotations as well as translations. Just as in the case of visual cognition, the time needed to carry out such transformations mentally is expected to depend on the angular extent of these spatial transformations (Shepard, 1978a, 1981c; Shepard & Cooper, 1982). Likewise, modulations from one key to another correspond to rotations (Balzano, 1980; Shepard, 1982; Balzano, Note 2; Shepard, Note 1). The toroidal space in which these latter rotations occur has now been fully elaborated by Krumhansl and Kessler (1982), who have given all major and minor keys angular coordinates within this same two-dimensional manifold.

Finally, although I have in this paper focused exclusively on just one of the two indispensable musical attributes, namely, pitch, there are reasons to believe that the structure described here may carry over to the other indispensable musical attribute, namely, time. Relations of beat and rhythm have the same structural properties of augmented correspondence at the simplest ratios of tempo, such as the 2-to-1 or 3-to-2 ratios, which in the case of frequency of tones cor-
differences in multidimensional scaling via an N-way generalization of Eckart-Young decomposition. Psychometrika, 1970, 35, 283-319.
Charbonneau, G., & Risset, J.-C. Circularité de jugements de hauteur sonore. Comptes Rendus Hebdomadaires des Séances de l'Académie des Sciences Paris, 1973, 277B, 623-626.
Chomsky, N. Aspects of the theory of syntax. Cambridge, Mass.: M.I.T. Press, 1965.
Chomsky, N. Language and mind. New York: Harcourt, Brace & World, 1968.
Cohen, A. Perception of tone sequences from the Western-European chromatic scale: Tonality, transposition and the pitch set. Unpublished doctoral dissertation, Queen's University at Kingston, Ontario, Canada, 1975.
Coxeter, H. S. M. Introduction to geometry. New York: Wiley, 1961.
de Boer, E. On the "residue" and auditory pitch perception. In W. D. Keidel & W. D. Neff (Eds.), Handbook of sensory physiology (Vol. 5, Pt. 3: Clinical and special topics). New York: Springer-Verlag, 1976, pp. 479-583.
Deutsch, D. Octave generalization and tune recognition. Perception & Psychophysics, 1972, 11, 411-412.
Deutsch, D. Octave generalization of specific interference effects in memory for tonal pitch. Perception & Psychophysics, 1973, 13, 271-275.
Deutsch, D. Delayed pitch comparison and the principle of proximity. Perception & Psychophysics, 1978, 23, 227-230.
Deutsch, D., & Feroe, J. The internal representation of pitch sequences in tonal music. Psychological Review, 1981, 88, 503-522.
Dewar, K. M. Context effects in recognition memory for tones. Unpublished doctoral dissertation, Queen's University at Kingston, Ontario, Canada, 1974.
Dewar, K. M., Cuddy, L. L., & Mewhort, D. J. K. Recognition memory for single tones with and without context. Journal of Experimental Psychology: Human Learning and Memory, 1977, 3, 60-67.
Dowling, W. J. The 1215-cent octave: Convergence of Western and Nonwestern data on pitch-scaling. Journal of the Acoustical Society of America, 1973, 53, 373A. (Abstract) (a)
Dowling, W. J. The perception of interleaved melodies. Cognitive Psychology, 1973, 5, 322-337. (b)
Dowling, W. J. Listeners' successful search for melodies scrambled into several octaves. Journal of the Acoustical Society of America, 1978, 64, S146. (Abstract) (a)
Dowling, W. J. Scale and contour: Two components of a theory of memory for melodies. Psychological Review, 1978, 85, 341-354. (b)
Dowling, W. J. Musical scales and psychophysical scales: Their psychological reality. In T. Rice & R. Falck (Eds.), Cross-cultural approaches to music. Toronto, Ontario, Canada: University of Toronto Press, in press.
Dowling, W. J., & Hollombe, A. W. The perception of melodies distorted by splitting into several octaves: Effects of increasing proximity and melodic contour. Perception & Psychophysics, 1977, 21, 60-64.
Forte, A. Tonal harmony in concept and practice (3rd ed.). New York: Holt, Rinehart & Winston, 1979.
Francès, R. La perception de la musique. Paris: Vrin, 1958.
Fucks, W. Mathematical analysis of formal structure of music. IRE Transactions on Information Theory, 1962, 8, 225-228.
Goldstein, H. Classical mechanics. Reading, Mass.: Addison-Wesley, 1950.
Goldstein, J. L. An optimum processor theory for the central formation of the pitch of complex tones. Journal of the Acoustical Society of America, 1973, 54, 1496-1516.
Greenwood, G. D. Principles of dynamics. Englewood Cliffs, N.J.: Prentice-Hall, 1965.
Hahn, J., & Jones, M. R. Invariants in auditory frequency relations. Scandinavian Journal of Psychology, 1981, 22, 129-144.
Hall, D. E. Quantitative evaluation of musical scale tunings. American Journal of Physics, 1974, 42, 543-552.
Helmholtz, H. von. On the sensations of tone as a physiological basis for the theory of music. New York: Dover, 1954. (Originally published, 1862.)
Hilbert, D., & Cohn-Vossen, S. Geometry and the imagination. New York: Chelsea, 1952. (Originally published, 1932.)
Humphreys, L. F. Generalization as a function of method of reinforcement. Journal of Experimental Psychology, 1939, 25, 361-372.
Hurvich, L. M., & Jameson, D. An opponent-process theory of color vision. Psychological Review, 1957, 64, 384-404.
Idson, W. L., & Massaro, D. W. A bidimensional model of pitch in the recognition of melodies. Perception & Psychophysics, 1978, 24, 551-565.
Imberty, M. L'acquisition des structures tonales chez l'enfant. Paris: Klincksieck, 1969.
Jones, M. R. Time, our lost dimension: Toward a new theory of perception, attention, and memory. Psychological Review, 1976, 83, 323-355.
Kallman, H. J., & Massaro, D. W. Tone chroma is functional in melody recognition. Perception & Psychophysics, 1979, 26, 32-36.
Kilmer, A. D., Crocker, R. L., & Brown, R. R. Sounds from silence: Recent discoveries in ancient Near Eastern music. Berkeley, Calif.: Bit Enki Publications, 1976.
Koffka, K. Principles of Gestalt psychology. New York: Harcourt Brace, 1935.
Köhler, W. Gestalt psychology. New York: Liveright, 1947.
Krumhansl, C. L. The psychological representation of musical pitch in a tonal context. Cognitive Psychology, 1979, 11, 346-374.
Krumhansl, C. L., Bharucha, J. J., & Kessler, E. J. Perceived harmonic structure of chords in three related musical keys. Journal of Experimental Psychology: Human Perception and Performance, 1982, 8, 24-36.
Krumhansl, C. L., & Kessler, E. J. Tracing the dynamic changes in perceived tonal organization in a spatial representation of musical keys. Psychological Review, 1982, 89, 334-368.
Krumhansl, C. L., & Shepard, R. N. Quantification of
the hierarchy of tonal functions within a diatonic context. Journal of Experimental Psychology: Human Perception and Performance, 1979, 5, 579-594.
Kubovy, M. Concurrent pitch-segregation and the theory of indispensable attributes. In M. Kubovy & J. R. Pomerantz (Eds.), Perceptual organization. Hillsdale, N.J.: Erlbaum, 1981.
Lakner, Y. A new method of representing tonal relations. Journal of Music Theory, 1960, 4, 194-209.
Levelt, W. J. M., Van de Geer, J. P., & Plomp, R. Triadic comparisons of musical intervals. British Journal of Mathematical and Statistical Psychology, 1966, 19, 163-179.
Liberman, A. M., Cooper, F. S., Shankweiler, D., & Studdert-Kennedy, M. Perception of the speech code. Psychological Review, 1967, 74, 431-461.
Licklider, J. C. R. Basic correlates of the auditory stimulus. In S. S. Stevens (Ed.), Handbook of experimental psychology. New York: Wiley, 1951.
Locke, S., & Kellar, L. Categorical perception in a non-linguistic mode. Cortex, 1973, 9, 355-369.
Longuet-Higgins, H. C. Letter to a musical friend. Music Review, 1962, 23, 244-248.
Longuet-Higgins, H. C. The perception of music (Review Lecture). Proceedings of the Royal Society, London, 1979, 205B, 307-332.
Mathews, M. V. The digital computer as a musical instrument. Science, 1963, 142, 553-557.
Mathews, M. V., & Sims, G. Perceptual discrimination of just and equal tempered tunings. Journal of the Acoustical Society of America, Suppl. 1, 1981, 69, S38. (Abstract)
McAdams, S., & Bregman, A. Hearing musical streams. Computer Music Journal, 1979, 3, 26-43.
Merriam, A. P. The anthropology of music. Evanston, Ill.: Northwestern University Press, 1964.
Meyer, L. B. Emotion and meaning in music. Chicago: University of Chicago Press, 1956.
Miller, G. A. The magic number seven, plus or minus two. Psychological Review, 1956, 63, 81-97.
Miller, G. A., & Heise, G. A. The trill threshold. Journal of the Acoustical Society of America, 1950, 64, 637-638.
Null, C. Symmetry in judgments of musical pitch. Unpublished doctoral dissertation, Michigan State University, 1974.
O'Connell, W. Tone spaces. Die Reihe, 1962, 8, 34-67.
Peterson, G. E., & Barney, H. L. Control methods used in a study of the vowels. Journal of the Acoustical Society of America, 1952, 24, 175-184.
Philippot, M. L'Arc, Beethoven, No. 40, 1970.
Pikler, A. G. The diatonic foundations of hearing. Acta Psychologica, 1955, 11, 432-445.
Pikler, A. G. Logarithmic frequency systems. Journal of the Acoustical Society of America, 1966, 39, 1102-1110.
Piston, W. Harmony. New York: Norton, 1941.
Plomp, R. Aspects of tone sensation: A psychophysical study. New York: Academic Press, 1976.
Plomp, R., & Levelt, W. J. M. Tonal consonance and critical band width. Journal of the Acoustical Society of America, 1965, 38, 548-560.
Pressing, J. Cognitive isomorphisms in pitch and rhythm in world musics: West African, the Balkans, Thailand and Western tonality. Ethnomusicology, in press.
Ratner, L. G. Harmony: Structure and style. New York: McGraw-Hill, 1962.
Revesz, G. Introduction to the psychology of music. Norman: University of Oklahoma Press, 1954.
Risset, J.-C. Musical acoustics. In E. C. Carterette & M. P. Friedman (Eds.), Handbook of perception (Vol. 4). New York: Academic Press, 1978, pp. 521-564.
Ruckmick, C. A. A new classification of tonal qualities. Psychological Review, 1929, 36, 172-180.
Schenker, H. Harmony (O. Jones, Ed. and E. M. Borgese, trans.). Cambridge, Mass.: M.I.T. Press, 1954. (Originally published, 1906.)
Shepard, R. N. Attention and the metric structure of the stimulus space. Journal of Mathematical Psychology, 1964, 1, 54-87. (a)
Shepard, R. N. Circularity in judgments of relative pitch. Journal of the Acoustical Society of America, 1964, 36, 2346-2353. (b)
Shepard, R. N. Approximation to uniform gradients of generalization by monotone transformations of scale. In D. I. Mostofsky (Ed.), Stimulus generalization. Stanford, Calif.: Stanford University Press, 1965, pp. 94-110.
Shepard, R. N. Psychological representation of speech sounds. In E. E. David & P. B. Denes (Eds.), Human communication: A unified view. New York: McGraw-Hill, 1972, pp. 67-113.
Shepard, R. N. Representation of structure in similarity data: Problems and prospects. Psychometrika, 1974, 39, 373-421.
Shepard, R. N. The circumplex and related topological manifolds in the study of perception. In S. Shye (Ed.), Theory construction and data analysis in the behavioral sciences. San Francisco: Jossey-Bass, 1978, 29-80. (a)
Shepard, R. N. Externalization of mental images and the act of creation. In B. S. Randhawa & W. E. Coffman (Eds.), Visual learning, thinking, and communication. New York: Academic Press, 1978. (b)
Shepard, R. N. On the status of "direct" psychophysical measurement. In C. W. Savage (Ed.), Minnesota studies in the philosophy of science (Vol. 9). Minneapolis: University of Minnesota Press, 1978. (c)
Shepard, R. N. Multidimensional scaling, tree-fitting, and clustering. Science, 1980, 210, 390-398.
Shepard, R. N. Individual differences in the perception of musical pitch. In Documentary report of the Ann Arbor symposium: Applications of psychology to the teaching and learning of music. Reston, Va.: Music Educators National Conference, 1981. (a)
Shepard, R. N. Psychological relations and psychophysical scales: On the status of "direct" psychophysical measurement. Journal of Mathematical Psychology, 1981, 24, 21-57. (b)
Shepard, R. N. Psychophysical complementarity. In M. Kubovy & J. R. Pomerantz (Eds.), Perceptual organization. Hillsdale, N.J.: Erlbaum, 1981. (c)
Shepard, R. N. Structural representations of musical pitch. In D. Deutsch (Ed.), Psychology of music. New York: Academic Press, 1982.
Shepard, R. N., & Cooper, L. A. Mental images and their transformations. Cambridge, Mass.: MIT Press/Bradford Books, 1982.
Shepard, R. N., & Zajac, E. (Producers). A pair of paradoxes. Murray Hill, N.J.: Bell Telephone Laboratories, Technical Information Library, 1965. (Film)
"Shepard's Tones." On M. V. Mathews, J.-C. Risset, et al., The voice of the computer (Decca Record DL 710180). Universal City, Calif.: MCA Records, Inc., 1970.
Siegel, J. A. The nature of absolute pitch. In E. Gordon (Ed.), Research in the psychology of music (Vol. 8). Iowa City: University of Iowa Press, 1972.
Siegel, J. A., & Siegel, W. Absolute identification of notes and intervals by musicians. Perception & Psychophysics, 1977, 21, 143-152. (a)
Siegel, J. A., & Siegel, W. Categorical perception of tonal intervals: Musicians can't tell sharp from flat. Perception & Psychophysics, 1977, 21, 399-407. (b)
Stevens, S. S. The measurement of loudness. Journal of the Acoustical Society of America, 1955, 27, 815-829.
Stevens, S. S., & Volkmann, J. The relation of pitch to frequency: A revised scale. American Journal of Psychology, 1940, 53, 329-353.
Stevens, S. S., Volkmann, J., & Newman, E. B. A scale for the measurement of the psychological magnitude of pitch. Journal of the Acoustical Society of America, 1937, 8, 185-190.
Sundberg, J. E. F., & Lindqvist, J. Musical octaves and pitch. Journal of the Acoustical Society of America, 1973, 54, 922-929.
Terhardt, E. Pitch, consonance, and harmony. Journal of the Acoustical Society of America, 1974, 55, 1061-1069.
Thurlow, W. R., & Erchul, W. P. Judged similarity in pitch of octave multiples. Perception & Psychophysics, 1977, 22, 177-182.
Tversky, A. Features of similarity. Psychological Review, 1977, 84, 327-352.
van Noorden, L. P. A. S. Temporal coherence in the perception of tone sequences. Unpublished doctoral dissertation, Technische Hogeschool, Eindhoven, The Netherlands, 1975.
Ward, W. D. Subjective musical pitch. Journal of the Acoustical Society of America, 1954, 26, 369-380.
Ward, W. D. Absolute pitch. Pt. I. Sound, 1963, 2, 14-21. (a)
Ward, W. D. Absolute pitch. Pt. II. Sound, 1963, 2, 33-41. (b)
Ward, W. D. Musical perception. In J. V. Tobias & E. D. Hubert (Eds.), Foundations of modern auditory theory (Vol. 1). New York: Academic Press, 1970, pp. 407-447.
Wightman, F. L. The pattern-transformation model of pitch. Journal of the Acoustical Society of America, 1973, 54, 407-416.
Zatorre, R. S., & Halpern, A. R. Identification, discrimination, and selective adaptation of simultaneous musical intervals. Perception & Psychophysics, 1979, 26, 384-395.
Zenatti, A. Le développement génétique de la perception musicale (Monographies Françaises de Psychologie, No. 17). Paris: Centre National de la Recherche Scientifique, 1969.
Zuckerkandl, V. Sound and symbol. Princeton, N.J.: Princeton University Press, 1956.
Zuckerkandl, V. Man the musician. Princeton, N.J.: Princeton University Press, 1972.

Received October 21, 1981