
Epilepsy & Behavior 7 (2005) 578–601

www.elsevier.com/locate/yebeh

Review

Absolute pitch: Music and beyond


David A. Ross a,*, John C. Gore b, Lawrence E. Marks c
a Department of Diagnostic Radiology, Yale School of Medicine, Box 208043, New Haven, CT 06520, USA
b Department of Radiology and Radiological Sciences, Vanderbilt University Medical Center, 1161 21st Avenue South, R-1032 MCN, Nashville, TN 37232, USA
c John B. Pierce Laboratory and Departments of Epidemiology/Public Health and Psychology, Yale School of Medicine, 290 Congress Avenue, New Haven, CT 06519, USA

Received 26 May 2005; accepted 27 May 2005


Available online 15 August 2005

Abstract

“Perfect pitch,” known in the scientific literature as “absolute pitch” (AP), is a rare phenomenon that has fascinated musicians and scientists alike for over a century. There has been a great deal of conflict in the literature between advocates of the two main theories on the etiology of AP: some believe that AP is learned early in life through intensive musical training, whereas others believe AP to be largely innate. Both theories are alike, however, in considering AP to be exclusively a musical phenomenon. We propose a paradigm shift by presenting here a new model of AP, one that is predicated on two principles: (1) that AP may be relatively independent of musical experience; and (2) that there are different types of AP, each of which can be ascribed to discrete neurobiological mechanisms. We also review data from a diverse series of experiments that were designed to test explicitly both the predictions of our model and a series of historical myths about AP. In each case, the data strongly support our model. We conclude with a general discussion on the nature of AP, the relevance of these findings for other areas of research, and future directions of study.
© 2005 Elsevier Inc. All rights reserved.

Keywords: Absolute pitch; Perfect pitch; Music; Preattentive processing

* Corresponding author. E-mail address: david.a.ross@yale.edu (D.A. Ross).

1525-5050/$ - see front matter © 2005 Elsevier Inc. All rights reserved.
doi:10.1016/j.yebeh.2005.05.019

1. Introduction

The ability to identify the musical pitch of auditory stimuli without the use of a reference pitch is a rare skill, commonly known as “perfect pitch” or, in the scientific literature, as “absolute pitch” (AP). Because of the potential musical advantages it may endow, this skill has enormous sociopolitical significance. Within the scientific community the etiology of this trait has been vociferously debated for over a century: on one side of the argument are those who believe that it is learned early in life, and on the other, those who believe it is innate. However, advocates of both these theories share the underlying assumption that AP is strictly a musical phenomenon.

We propose a paradigm shift: rather than assuming that AP is strictly a musical phenomenon we explore a broader view of pitch processing, which sees AP as reflecting capacities that can far exceed the perception of music per se. To see AP in this way requires abandoning the traditional definition of AP (i.e., as the ability to name notes) in favor of a more inclusive criterion. Our rationale for doing this was originally based on observations of heterogeneity among subjects who claimed to have “perfect pitch”: we observed some individuals whose performance in naming notes appears to be specific to music, and who apparently developed this skill through particular kinds of early experience; other individuals with AP, however, evince a capacity to encode frequency information automatically, not only for music but for a variety of other acoustical signals. These latter individuals may name notes with great facility if or when they are exposed to musical training, but their ability to

encode the frequency of auditory stimuli transcends music per se. Because we believe these groups differ mechanistically, we eschew the ambiguous label of “AP” in favor of names that directly reflect the relevant neurobiological processes: we refer to the former group as possessing heightened tonal memory (HTM) and the latter as possessing the ability to perceptually encode (APE) the frequency of auditory stimuli.

In this article we review historical perspectives on AP, discuss their relative strengths and weaknesses, and then present a new theory that may reconcile previous points of conflict in the literature. In doing this, we take a phenomenon that has been considered predominantly from the cognitive and behavioral domains and connect it to the burgeoning neuroscientific literature on low-level mechanisms of auditory perception. We then review data from a series of experiments that were designed to test both the explicit predictions of our model and a series of historical myths and dogmas in the field. Foremost among these is the assumption that AP is fundamentally a musical phenomenon.

2. Cultural and historical perspectives on absolute pitch

Most individuals perceive melodic sequences without being able to identify absolutely the notes involved. For example, most people recognize the opening bar of Beethoven’s Fifth Symphony but are unaware that the first note is a ‘G.’ Thus, in most people the dominant mode of pitch perception entails recognizing the relations among notes, rather than the exact frequency of each individual note. When refined further, this skill, the ability to identify or produce musical intervals accurately, is referred to as relative pitch (RP).

Nevertheless, a small subset of the population is also capable of quickly and accurately labeling the absolute frequency of tonal stimuli (without the use of a reference pitch). In musical settings, this skill is generally referred to as perfect pitch, while in the scientific community it is more frequently referred to as absolute pitch (AP). Given how rare and how distinct this phenotype is relative to the “normal” mode of auditory perception (the prevalence of AP is generally estimated to be between 1/5000 and 1/10,000 [1–3]), it is not surprising that AP has fascinated musicians (and audiences) for years and has been explored in the scientific literature for well over a century (cf. [4]). This interest has been further piqued by frequent suggestions (of varying degrees of legitimacy) that possessing AP may confer superior ability in some musical domains. One obvious advantage is that it may be relatively easy for AP possessors to transcribe or remember a piece of music because they can identify directly each of the notes that are played. Another is that it may be easier for AP possessors to sight-sing difficult passages by producing each note individually rather than trying to tune challenging intervals. More speculative is the claim that AP may be a prerequisite of musical genius and that possessors have an increased ability to know whether a particular note is properly tuned.

Conversely, AP may also be limiting in several important ways. For one, strict adherence to the absolute frequencies of tones as found in the equal tempered scale could hinder a possessor’s ability to remain in tune with a choir that drifts flat or to maintain proper intonation on intervals that differ from equal temperament. More dramatically, there are reports that a significant percentage of AP musicians find that their sense of pitch shifts as they grow older [1,5], a change that can be completely debilitating for a professional musician.

2.1. The etiology of absolute pitch

Although numerous studies have explored potential anatomical and physiological bases that could account for differences between AP possessors and nonpossessors (e.g., possessors tend to have increased asymmetry between right and left planum temporales [6–8] and may have decreased P300 responses to tones [9–12]), the etiology of the trait remains unclear. Early models of AP suggested that “true” AP is a genetically determined trait [13,14], and several recent articles have provided convincing evidence that AP has a genetic component [1,15–17]. Other models of absolute pitch advocate the position that AP is the result of early learning experiences [18–21]. While this debate may seem largely academic, it is also of great social significance; a multimillion dollar industry exists solely to train musicians to develop AP.

2.1.1. The early learning theory

According to the early learning theory (ELT), AP possessors were exposed to a form of training that caused them, through repeated presentation, to form strong associative links between specific pitches and their appropriate musical labels. Thus, an individual’s AP should depend on the nature and extent of his or her musical training (i.e., the age at which the person began training and the extent to which training emphasized the association of pitch names with absolute pitches), and individuals with AP should be more facile at naming more familiar stimuli [e.g., notes played on their primary instrument, more commonly encountered notes (such as white vs black), or within the range of their instrument]. Critically, according to this theory, the ability to recognize tones is indistinguishable from the ability to label them. As one advocate wrote: “An important and related point is that AP is not an unusual ability in the domain of ‘pitch perception,’ despite the fact that Baharloo et al. repeatedly refer to the ability this way. AP is a skill in labeling (a form of long-term memory

and categorization/classification behavior, involving self-referencing) and has nothing to do with pitch perception per se” [22, p. 257].

2.1.2. The innate model

In contrast, advocates of the innate model have argued that AP may be better defined as an ability to transfer the initial sensory trace of a stimulus into a more stable, long-term representation. This representation may, but need not, involve a musical label [23–25]. Thus, the ability to recognize and to encode a long-term representation of a tone may be independent of a subject’s ability to label it. Importantly, this skill may not depend on subjects’ musical experience, though their ability to apply musical labels to tones would obviously require knowledge of musical nomenclature. This subtle point is elegantly captured by Corliss [26, p. 1738], herself an AP possessor: “There is certainly some element of memory in absolute pitch, even if it is just the association of the gamut with certain arbitrary pitch standards. In that sense, it cannot be said to be entirely innate. It seems to me that some of the perception processes that allow the gamut to be recognized in detail may be innate abilities.” Given this nuance, one might reasonably argue that a more perspicuous approach would be to identify AP possessors based on the underlying capacity to recognize pitch chroma¹ rather than on the culturally acquired, if not arbitrary, ability to name notes.

2.2. Weaknesses of historical models

Historically, the ability to name musical notes has always been regarded as the sine qua non of AP, and we have followed the tradition in so describing it above (see the beginning of Section 2). Unfortunately, several major methodological flaws arise from this definition that may inhibit researchers’ ability to meaningfully study the phenomenon. First, the use of a note naming paradigm creates a screening bias such that only musicians of Western training can take the test; thus, neither nonmusicians nor musicians of non-Western training would know the names of the notes and could not be included in a standard protocol. More significantly, the exclusive use of these tests has entrenched in the community the assumption that AP is fundamentally a musical phenomenon. Accordingly, the possibility that AP may be better explained by basic mechanistic differences in auditory perception has largely been ignored (if not outright rejected). Finally, by considering note naming as an end in itself, one may be led to believe that AP is a simple, binary, trait. To the contrary, note naming is a complex behavioral task that may be performed using different strategies, strategies that in turn may reflect different underlying neurobiological mechanisms. Consistent with this possibility are several reports of heterogeneity within cohorts of individuals who name musical notes with facility [1,3,4,19,20,27,28]. Few investigations, however, have attempted to explicitly address this heterogeneity, and none has attempted to distinguish subtypes of AP mechanistically.

2.3. Reconciling different models of AP

In our own laboratory, we have observed different subtypes of AP. There are some individuals who name notes effortlessly and automatically. In our mind, these are the individuals who possess “true” or “genuine” AP as originally described by Bachem [27]. Elsewhere, we have argued that this group may be defined by the unique ability to encode meaningful representations of stimulus frequency, and we developed a paradigm independent of music to test this hypothesis. With this paradigm, we showed not only that AP possessors differ from a group of nonpossessors with comparable musical experience, but that “true AP” may exist in: (a) musicians without early musical training [29], (b) adult nonmusicians [30], and (c) children who have not yet had sufficient musical experience to learn the names of all the musical notes [31].

In contrast, for other AP musicians the ability to name notes appears to involve active calculation based on comparison of target tones to a memorized template. As such, performance tends to improve the more closely the target stimuli match items in a subject’s memorized template (where key parameters typically include timbre, height, and chroma). For reasons that become evident below, we believe this qualitative difference between the two classes of AP possessors is one of type rather than degree. Accordingly, we consider the latter type of subjects as possessors of ersatz AP, distinct from the group described above. Rather than being specific to note naming tasks, the difference between true and ersatz AP possessors was apparent when subjects were tested using a range of musically independent paradigms [29].

To some extent, then, the argument over which model (early learning vs innate) is “correct” may be moot: each model may accurately describe some possessors of “AP.” It is plausible that the ELT accurately describes possessors of ersatz AP, whereas the innate model describes the more limited group of true AP possessors.

2.3.1. A useful analogy

The difference between these subtypes of AP resembles the distinction that has been made between “preattentive” and “automatic” processing of stimuli in the visual field [32]. Treisman’s description of preattentive processing closely matches the precategorical definition

¹ Chroma is typically defined as the frequency of a stimulus independent of its octave, an attribute particularly important for music. For example, despite the fact that “C5” has an overall frequency twice that of “C4,” they both have the same chroma and perform identical functions in most musical contexts.
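The chroma/height split described in footnote 1 can be made concrete with a small sketch. It assumes twelve-tone equal temperament with A4 = 440 Hz and MIDI-style note numbering (C4 = 60); neither convention is specified in the article, and the function name `chroma_and_octave` is hypothetical.

```python
import math

# Pitch-class (chroma) names; sharps only, for simplicity.
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def chroma_and_octave(freq_hz, a4_hz=440.0):
    """Return (chroma, octave) of the nearest equal-tempered note.

    Assumption: A4 = 440 Hz corresponds to note number 69 (MIDI convention).
    """
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    # One unit per semitone; doubling the frequency adds exactly 12.
    note_number = round(69 + 12 * math.log2(freq_hz / a4_hz))
    chroma = NOTE_NAMES[note_number % 12]   # independent of octave
    octave = note_number // 12 - 1          # the "height" component
    return chroma, octave


c4 = 440.0 * 2 ** (-9 / 12)  # nine semitones below A4, about 261.63 Hz
c5 = 2 * c4                  # twice the frequency of C4

print(chroma_and_octave(c4))
print(chroma_and_octave(c5))
```

Under these assumptions, C4 and C5 map to the same chroma and differ only in octave number, which is the property the footnote describes.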

of AP: “Preattentive processing does not require extended practice. It depends on mechanisms that are either innate or acquired early and outside the laboratory. Its functions are also different: not to make skilled performance autonomous, but rather to meet the needs of the preliminary stage of visual coding.”

In contrast, Treisman describes “automatic” processing [33] in the following manner: “automatization results from the accumulation across repeated trials of specific memory traces, increasing the probability of direct retrieval of the correct response to the evoking stimulus and bypassing any rules or decision processes.” She later summarizes: “We tentatively conclude that automatization ... speeds up processing at a later stage, after feature integration has been achieved, rather than creating new, preattentively detectable features. The effects seem to depend on the formation of new and very specific associations between features, their locations, and the required responses.” These are almost verbatim descriptions of the early learning theory for absolute pitch recognition.

Qualitatively, we have observed that pitch/chroma is a salient aspect of auditory stimuli only in true AP possessors. For everyone else the primary mechanism of pitch perception is interpreting the intervalic relation between pitches. By analogy, we propose that pitch chroma may be a feature of auditory perception only in true AP possessors. In the visual domain “feature” processing is thought to reflect low-level mechanisms used to encode stimuli. Similarly, we suggest that true AP may result from a fundamental property of the way the auditory system encodes stimulus frequency.

In contrast, possessors of ersatz AP may have an increased ability to memorize and retrieve specific conjunctions of basic auditory features (e.g., timbre, height, and intensity). However, as in non-AP-possessors (NAP individuals), chroma is not a basic feature of perception and the initial mechanisms used to encode and translate auditory stimuli are similar (though it is plausible that subjects have differences at higher points in the auditory pathway). Importantly, the ability to recognize auditory “objects” will depend strongly on previous exposure to similar objects.

While this model represents a reasonable qualitative description of AP, we now turn to the critical question of whether it is consistent with known mechanisms of auditory perception.

2.3.2. Models attributing pitch perception to low-level processes

It was originally thought that musical pitch is encoded primarily as a spectral (or tonotopic) signal.² By this model, an incoming frequency stimulus creates a traveling wave on the basilar membrane and, because of the tension gradient along the organ, the place of maximum amplitude corresponds to the fundamental frequency. This spatial analysis, in turn, leads to increased firing within a particular characteristic frequency (CF) band in the auditory nerve, such that different pitches may be encoded via different CF bands. These tonotopic representations are carried up to the cortex, though they do not convey precise absolute pitch information in most individuals. To a certain extent, this pathway undoubtedly affects pitch perception. However, there are a number of limitations in the descriptive capacity of this model [35].

More recently, it has been suggested that musical pitch may be encoded not by CF bands per se, but by the temporal firing pattern of the inner hair cells. Across the entire length of the basilar membrane, all of these cells fire in a pattern that is phase locked to the incoming stimulus. For example, for a stimulus of 100 Hz, while a place code may exist leading to increased firing within a particular CF band, the entire basilar membrane vibrates at 100 Hz and all hair cells will fire in phase with this signal at exactly 100 Hz. Thus, the same timing information exists in all CF bands. The pitch of a stimulus is then extracted via an autocorrelative process that measures the time intervals between successive spikes. This process may be conceptualized by looking at a histogram of all-order intervals between spikes in the auditory nerve. The highest peak on the plot corresponds to the inverse of the stimulus pitch and the relative height of this peak (compared to the rest of the distribution) may indicate pitch salience [35].³

Strong evidence has accrued in support of this model, which has excellent predictive power for a wide range of perceptual phenomena [35]. However, there is also a central paradox: if the model proposes that there exists an exact low-level representation of the pitch of all frequency stimuli (as in the autocorrelogram), then why are most people unable to access this representation? Unfortunately, few researchers have addressed this dilemma. One notable exception is a theory proposed by Cariani [36] in which a series of operations are performed on the output of the autocorrelogram, including: echoic memory (via recurrent timing nets with delay loops of various lengths); relative pitch processing of vertical intervals⁴ (via feedforward timing nets that use

² For detailed descriptions of spectral and temporal models see [34].

³ To be clear, it is unlikely that either model is exclusively correct: current debate is focused on establishing the relative roles of tonotopic and periodotopic signals across a wide range of perceptual processes and stimulus types. It should also be noted that most “temporal” models incorporate spectral information by including CF bands as an additional input factor. The details of this debate are tangential to our purposes here. In the present discussion, we are interested simply in the extent to which musical pitch processing makes use of temporal codes (with or without spectral information).

⁴ A vertical interval is the difference in pitch between two simultaneously presented tones, as opposed to a horizontal interval, when the tones are presented sequentially.

coincidence detectors to compare interval distributions of two inputs); and relative pitch processing of horizontal intervals (via a combination of the two previous mechanisms). Of note, however, all of these operations take place exclusively in the temporal domain. Thus, the representation of absolute pitch would exist solely as an intermediate point in the overall pathway.

2.4. A new theory of absolute pitch perception

The idea of a central subsystem with an exact representation of the absolute pitch of stimuli is extremely compelling. Indeed, it could provide a parsimonious explanation of the differences between AP and NAP individuals: What if true AP possessors, possessors of APE, are defined by the unique ability to access this low-level representation? In a simple version of such a model, some individuals would have an additional synapse that translates the output of the autocorrelogram (a temporal signal) into a place code for pitch (e.g., if a peak at 10 ms synapses onto a cell corresponding to 100 Hz) that is then carried up to the cortex as an additional stream of auditory information.⁵

Consequently, we propose the following theory to account for differences between true absolute pitch and NAP individuals (as schematically depicted in Fig. 1). In normal individuals the basic features of auditory perception include: loudness, height (via the tonotopic pathway), relative pitch, and timbre processing⁶ (the last two via the periodotopic pathway). In AP possessors, the auditory system also contains a mechanism that translates the most frequently occurring spike interval into a specific place code for pitch. Thus, in this group, the absolute pitch of stimuli is coded as an additional basic feature of auditory perception. Performance by subjects in this group should match the predictions of the innate theory: recognition of stimulus frequency would be immediate and independent of prior experience or any other stimulus attributes. It is interesting to note that this simple difference nicely accounts for Bachem’s suggestion that NAP individuals perform pitch memory tasks using height only (thus suggesting that the maximum resolution of the tonotopic pathway is five to nine half-steps), while AP possessors encode both height and chroma.

We have described another subset of AP possessors as having an increased ability to memorize and recall conjoined auditory “features.” These individuals would recognize a target stimulus by comparing the incoming signal with the evoked memory of previous, similar signals. This comparison would depend on the differential use of the secondary processing systems that are present in normal individuals. One system likely to be particularly important is that used for interpreting timbre. As illustrated in Fig. 1, this second group’s ability to process stimulus frequency would depend largely on periodotopic mechanisms. Other researchers, however [38], have argued that timbre may be processed primarily as a spectral representation. If this were true, then pitch perception by this subset of AP possessors would be based primarily on tonotopic information and relatively independent of periodotopic coding. A third possibility is that the extraction of pitch information takes place at a higher level in the pathway: given that stimulus recognition derives from the processing of conjunctions of features, it is entirely possible that the critical mechanism occurs at a point where temporal and spectral streams have already been reintegrated. Regardless of the precise location in the pathway, though, all of these possibilities hold in common the essential aspect of our theory: that ersatz AP possessors have an increased ability to extract AP information from systems that are normally used to process other basic auditory features. Furthermore, because repeated exposure to specific conjunctions of features will increase the probability of recognizing the same conjunction on future presentation, performance by this group should be closely tied to the intensity and nature of training.

Given this scheme, it is reasonable to believe that such a learned skill may be present to a varying degree throughout the population. At its simplest, this type of process could explain why normal individuals tend to recall with surprising accuracy the frequency of the first note of popular songs [39–41]. Similarly, it is well known that instrumentalists may form a strong tonal memory of the frequency to which they tune their instrument (e.g., a violinist may recall the frequency of A440). In a more extreme case, this could also explain why a highly trained pianist may learn to recognize any note played on a piano but have greater difficulty recognizing notes with unfamiliar timbre. Much of the literature in support of the early learning model of AP could be interpreted easily within this conceptual framework. It remains to be seen whether some individuals are predisposed, by virtue of innate neurophysiological mechanisms, to being better at such feature integration or whether it is purely the result of practice.

2.4.1. New terminology

For lay people, the term perfect pitch has come to have different meanings for virtually everyone who uses it. Similarly, in the scientific literature, many researchers have acknowledged that “AP” possessors do not form a single homogeneous group. This ambiguity poses a major obstacle to research in the field: it may be virtually impossible to know what types of subjects were involved in any particular study of “AP.” Ultimately,

⁵ N.B. This is not the first model to suggest that AP possessors may differ from nonpossessors in the character or function of cells in a brainstem nucleus [37].

⁶ Which may be viewed in large part as a derivative of vertical RP processing.

Fig. 1. Schematic representation illustrating proposed AP pathway (dotted lines).
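As a rough computational reading of the proposed AP pathway, the sketch below builds an all-order interval histogram from spike times, picks the most frequently occurring interval, and converts it to a frequency "place" (so a 10 ms interval maps to 100 Hz). It is only an illustration under strong assumptions: an idealized spike train with exactly one spike per stimulus cycle and no jitter, an arbitrary 0.5 ms bin width and 50 ms search window, and hypothetical function names. It is not the authors' model or Cariani's implementation.

```python
from collections import Counter


def all_order_intervals(spike_times_s, max_interval_s=0.05):
    """Intervals between every pair of spikes (not only neighbours), up to a limit."""
    times = sorted(spike_times_s)
    intervals = []
    for i, t_i in enumerate(times):
        for t_j in times[i + 1:]:
            dt = t_j - t_i
            if dt > max_interval_s:
                break  # times are sorted, so later spikes are even further away
            intervals.append(dt)
    return intervals


def most_frequent_interval(intervals, bin_width_s=0.0005):
    """Centre of the fullest histogram bin; ties go to the shortest interval."""
    counts = Counter(round(dt / bin_width_s) for dt in intervals)
    best_bin = min(counts, key=lambda b: (-counts[b], b))
    return best_bin * bin_width_s


def interval_to_place_hz(interval_s):
    """Hypothetical 'extra synapse': dominant interval -> frequency label."""
    return 1.0 / interval_s


# Idealized fibre phase-locked to a 100 Hz stimulus: one spike every 10 ms.
spikes = [k * 0.010 for k in range(50)]
period = most_frequent_interval(all_order_intervals(spikes))
print(round(period * 1000, 3), "ms ->", round(interval_to_place_hz(period), 1), "Hz")
```

With real spike trains, fibres skip cycles and fire with jitter, so the histogram would need pooling across many fibres before its tallest peak is reliable; the sketch sidesteps this by construction.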

the cause of this ambiguity has been the use of a diagnostic criterion that has both poor specificity and poor sensitivity. The model that we present suggests the existence of two discrete mechanisms, either of which could endow an individual with the ability to name notes and, thereby, to pass a “classic” AP test.

For the sake of clarity, we prefer to identify groups on the basis of the proposed underlying mechanism. Thus, we define APE as the unique ability to perceptually encode durable representations of stimulus frequency at a precategorical level, independent of a target’s timbre, chroma, or height. This definition corresponds closely to other groups’ definition of “genuine” AP [27]. In contrast, we define individuals who recognize target stimuli by comparing them with their memory of specific auditory events as possessors of heightened tonal memory (HTM). This definition corresponds (at least broadly) to that used by advocates of the early learning theory (e.g., Miyazaki, Levitin, and Takeuchi and Hulse). By these definitions, APE possessors are characterized by the way they encode stimulus frequency, whereas HTM possessors are characterized primarily by an increased ability to retrieve specific complex auditory memories. While the ability to name musical tones

may be an easy index for identifying “genuine” AP possessors, we do not believe that this skill is fundamental to the condition. Rather, we argue that APE is a basic, psychoacoustic capacity that, in conjunction with musical training, would make it relatively easy to associate labels (either from the Western musical scale or otherwise) with frequency stimuli. Thus, a double dissociation may be possible between the ability to name notes and “genuine” AP: as illustrated by subject R.M. [28], APE possessors without musical training may be incapable of naming notes; conversely, expert musicians may develop a refined sense of HTM that makes it easy for them to recognize notes in a conventional paradigm.

Importantly, the distinction that we propose between APE and HTM is easily testable. Foremost, in groups matched for musical experience, our theory predicts that performance by HTM musicians should be affected by the many parameters that define the target tone. Specifically, they should be more accurate at recognizing or identifying tones with more familiar timbre (e.g., piano vs sine), chroma (e.g., white keys vs black keys), height (e.g., within the range of their primary instrument), or intonation. All of these effects have been reported in support of the early learning theory [20,42–45]; by our model, they reflect the limitations of HTM. We predict that a small subset of the conventionally defined population of “perfect pitch” possessors should be minimally affected by these factors. We regard this subset as the possessors of APE.

Critically, we may also formulate an explicit null hypothesis for our model: namely, that the distinction we have drawn between APE and HTM is artificial. According to this alternate hypothesis, APE and HTM possessors are drawn from opposite ends of the same continuous spectrum, and both result from the same underlying mechanism (which could be consistent with either the early learning or innate model). The most obvious feature that one might use to define such a spectrum would be the ability to name notes.

The following experiments were designed to test our new theory using a series of paradigms that would be relatively independent of subjects’ musical experience and note naming ability.⁷ In doing so, we hoped: (1) to clarify the extent to which AP may be considered a musical phenomenon versus a generalized perceptual phenomenon; (2) to identify the preattentive “features” that define “normal” auditory perception; and (3) to clarify the relative importance of tonotopic and periodotopic mechanisms in normal and absolute pitch processing.

The second major goal was to address directly some of the vast folklore about AP that has accumulated over time. Not all myths are mythical, but all are untested. The present experiments were designed to determine which of the common claims about AP are fictional and which have a legitimate scientific basis.

3. Encoding and reproducing Huggins’ pitch stimuli

As described above, early models of pitch perception emphasize a place code on the basilar membrane, whereas more recent models emphasize temporal codes. Oddly, this shift in thinking has not been incorporated into models of AP. With one notable exception [28], virtually all theorists on the topic maintain that AP possessors encode the pitch of target stimuli via a simple place code (generally in the context of discussing why AP possessors may experience shifts in pitch perception with increasing age). Ironically, though an advocate of a place code model [27], Bachem himself originally noted the correspondence between the ability of hair cells to phase lock to stimuli and the ability of AP possessors to identify musical pitch consistently [46]; however, to the best of our knowledge, this observation has largely been ignored.

The following study [47] was designed to test the extent to which AP depends on place versus temporal codes. Data from an earlier study [31] suggest that spectral cues may be important in HTM perception, as evidenced by a poorer ability to encode and reproduce iterated rippled noises (IRNs) and missing fundamental stimuli relative to spectral stimuli, such as pure tones. In contrast, APE subjects performed comparably with all three kinds of stimuli. While performance on IRN stimuli poses a reasonable first test of the relative importance of spectral versus periodotopic encoding, it is not ideal: though the stimuli are designed to minimize the amount of spectral information present, nevertheless, with an increased number of iterations, a mild place code and overtone series do exist.

For this reason, we sought a paradigm that could test more rigorously the importance of tonotopic representations in AP perception. To do so, we adapted our original paradigm [29] to use target stimuli that would preclude absolutely the possibility of subjects using a place code: Huggins’ pitches (HPs) [48]. HP is generated by an interaural phase transition at a particular fre-

⁷ Subjects were classified as APE, HTM, or NAP possessors on the basis of their performance on a series of separate paradigms, with the
most important skill being the ability to encode and reproduce quency within a broadband noise. At this frequency,
stimulus frequency accurately after a prohibitive amount of tonal the interaural phase relationship changes sharply with
interference across a diverse set of timbres (including MF, iterated
ripped noises, sine, and piano tones). Full details of this method may
increasing frequency, shifting through 360° over a nar-
be found in [31]. In total, 17 subjects qualified as APE possessors and row bandwidth (typically 3–6%). Above and below the
30 qualified as HTM possessors. transition frequency, the noise is identical in each ear.
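The phase transition described above can be illustrated with a minimal sketch. It assumes a linear phase ramp centered on the transition frequency and a default relative bandwidth of 6%; the function name, the linear shape, and the default value are illustrative assumptions, not details taken from the study.

```python
def interaural_phase_deg(freq_hz, center_hz, rel_bandwidth=0.06):
    """Interaural phase difference (degrees) for one noise component.

    Assumption: the phase difference rises linearly from 0 to 360 degrees
    across a band of width rel_bandwidth * center_hz around center_hz.
    Outside that band both ears receive identical noise (0 degrees).
    """
    half_width = 0.5 * rel_bandwidth * center_hz
    low, high = center_hz - half_width, center_hz + half_width
    if freq_hz <= low or freq_hz >= high:
        return 0.0  # a 360-degree shift is equivalent to no shift
    return 360.0 * (freq_hz - low) / (high - low)
```

Under these assumptions the component at the transition frequency itself is shifted by half a cycle, while components well above or below it are left unshifted.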
D.A. Ross et al. / Epilepsy & Behavior 7 (2005) 578–601 585

Because each ear is presented with a white noise, the pitch is dichotic (i.e., it cannot be heard in either ear alone), so no place code can exist. Listeners generally hear “a warbling tone standing out from the noise, whose pitch can be matched to the center frequency of the phase transition” [49]. Previous data have shown that HP may be processed as musical pitch [50]. However, to the best of our knowledge, neither HP stimuli nor any other dichotic pitch stimulus has ever been used to test AP.

To summarize, traditional theories assume that AP depends on place (spectral) rather than temporal information; consequently, these theories would predict that AP possessors would have difficulty processing HP stimuli. Based on our theory, we predict that individuals with APE encode pitch through temporal processes (periodicity) and thus will readily process the pitch of HP stimuli, whereas individuals with HTM, to the extent that their ability to encode stimulus frequency depends on spectral processing, may not.

3.1. Summary of findings

Subjects were presented with a target stimulus followed by either a silent interval of varying length (2, 8, or 16 seconds) or an interval of equal duration filled with a series of interfering tones (1, 31, or 71 interfering tones). Previously, we showed this to be a powerful means of distinguishing among APE, HTM, and NAP possessors [29,30]. The target tones were HP stimuli from within the range of C5 (523.3 Hz) to B5 (987.8 Hz); to maximize interference, sine tones were used from across a wider range (from C2, 65.4 Hz, to B7, 1975.5 Hz). Following this interval, subjects were asked to reproduce the original target stimulus by adjusting the knob of a sine function generator. The generator was always initially set to a frequency of 1000 Hz, and subjects were blindfolded to prevent visual monitoring of their performance.

Data were analyzed by calculating the distance away in half-steps (hs, semitones) of each response from the target. Because we were interested only in subjects' ability to encode chroma, responses were corrected to the nearest octave. Thus, the possible range of responses was from −6 to +6 hs. Data were quantified using standard deviation (σ) as a measure of variability and constant error (CE) as a measure of bias. Perfect performance would be expected to yield a normal distribution centered at zero (low CE, low σ), whereas chance performance would yield a uniform distribution from −6 to +6 hs (low CE, high σ).

Data from representative APE and HTM subjects are shown in Fig. 2 (without interfering tones) and Fig. 3 (with interfering tones). Data from a cohort of 6 APE and 13 HTM subjects are presented in [47]. There were no major effects by group or interval length without interference. With interference, the APE group performed more accurately than the HTM group. There was also a significant interaction between group and trial type (i.e., with and without interference). These data indicate that APE subjects were less affected than HTM subjects by the presence of interfering tones. Interestingly, when compared with performance on identically structured paradigms with different types of target tones (i.e., sine, piano, missing fundamental, and IRN targets), APE performance was identical, whereas HTM subjects performed significantly better with HP stimuli than they had with the other types of targets.

These data show that APE and HTM subjects can encode durable representations of HP stimuli, at least as readily as they do for more conventional types of target stimuli.

3.2. Discussion

Despite recent research emphasizing the importance of periodotopic information in low-level frequency processing, virtually every model of perfect pitch has maintained the dogma that AP processing depends directly on the interpretation of a place code on the basilar membrane. Were this the case, one would expect AP possessors to be significantly hindered in their ability to encode pitch stimuli that cannot be defined by a spectral code. The present data refute this hypothesis: both APE and HTM subjects performed at least as well with HP stimuli as with conventional spectral stimuli. Previous data have shown that HP stimuli may be used for relative pitch processing [48]. The present data corroborate the overall importance of periodotopic signals in musical processing.

These data are consistent with the mechanism we have proposed for APE subjects, namely, that subjects may have an increased ability to extract AP information from the periodotopic pathway. Of greater interest, they help refine our perspective on pitch processing in the HTM group. Previously, we suggested that the ability to extract AP information from other stimulus features (e.g., timbre) might take place in the tonotopic pathway, the periodotopic pathway, or at a relatively high stage, after integration of these streams. The present data appear to contradict the first of these possibilities. Further investigation is required to determine the relative importance of low-level periodotopic processing and higher-order integration in HTM pitch recognition.

4. Distinguishing between properly and improperly tuned stimuli

Part of the mythology of “perfect pitch” is that musicians with this trait may possess a superior ability to reproduce specific frequencies accurately (e.g., to sing A4 = 440 Hz) or to recognize tones that deviate from

Fig. 2. Data from representative APE and HTM subjects for HP target stimuli with a silent ISI of 2, 8, or 16 seconds.

Fig. 3. Data from representative APE and HTM subjects for HP target stimuli for ISIs with 1, 31, and 71 interfering tones.
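The error measures behind Figs. 2 and 3 (distance in half-steps, corrected to the nearest octave, then summarized as constant error and standard deviation) can be sketched as follows. The function names are hypothetical, and the use of the population standard deviation is an assumption; the text does not say which estimator was used.

```python
import math
from statistics import mean, pstdev

def chroma_error_hs(response_hz, target_hz):
    """Signed error in half-steps, folded to the nearest octave.

    The result lies in [-6, +6): an error of exactly +6 hs maps to -6.
    """
    half_steps = 12.0 * math.log2(response_hz / target_hz)
    return (half_steps + 6.0) % 12.0 - 6.0

def summarize(errors_hs):
    """Return (constant error, standard deviation) for a list of errors.

    Assumption: population standard deviation (pstdev).
    """
    return mean(errors_hs), pstdev(errors_hs)
```

A response one octave above the target counts as zero error under this folding, which matches the stated interest in chroma rather than height.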

proper tuning (e.g., to recognize if an instrument is slightly out of tune). In a sense, these purported skills are mirror images of each other in that both effectively ask how finely AP possessors internally represent pitch, the former in the context of reproduction, the latter in the context of encoding and recognition.

Different models of AP yield different predictions as to how well possessors should be able to perform on these tasks. According to the early learning theory (or our own definition of HTM), certain individuals, through repeated exposure to musical stimuli, form a template of specific tones that they consider to be “in tune.” In this case, performance on the task should correlate with both the extent and the specific nature of musical experience; i.e., the more experience one has with an equal tempered instrument the better that individual should be. Thus, one would expect to see a spectrum in performance, with highly trained keyboardists doing best and less trained vocalists doing relatively worse. Alternatively (and consistent with our definition of APE), different pitches may possess an inherent salience, independent of a person's musical experience. By this rationale, a minimum amount of training should confer to these individuals a sense of which specific frequencies are “correct,” and the ability to distinguish in-tune and out-of-tune stimuli should be relatively independent of training (at least beyond a certain threshold).

Although these issues have been studied indirectly in several ways, e.g., how small a difference can AP possessors remember after a delay with interference [23] and how do AP possessors categorize notes that are mistuned across a spectrum [20,22,51], to the best of our knowledge, no study has explicitly tested the ability to distinguish between properly and improperly tuned stimulus frequencies. In fact, the only explicit reference of which we are aware is by Levitin, who wrote:

Recent articles . . . have demonstrated confusion about the nature of AP by using the term “Perfect Pitch” interchangeably with the term “Absolute Pitch.” The unfortunate implication of the term “Perfect Pitch” is that possessors of the ability have some sort of super resolution in their pitch perception, and that they can tell whether a tone is perfectly in tune or not. In fact, there is nothing “perfect” about absolute pitch. APers are no better at tone discrimination than other individuals, and they are no more accurate at noticing deviations from perfect intonation. What they are better at is labeling tones along the unidimensional continuum of frequency, but there is some “slop” or “hysteresis” in their category boundaries. [22, pp. 256–257]

The goal of the following study [52] was to compare the ability of APE, HTM, and NAP musicians to distinguish between notes tuned to the equal temperament (ET) scale with A4 = 440 Hz and notes that deviated either 25 or 50 cents from this scale. Effectively, we asked: Can musicians with “perfect pitch” distinguish accurately between in-tune and out-of-tune notes?

The experiment was conducted using a confidence rating paradigm from signal detection theory. This paradigm offers an important benefit: it ameliorates at least some of the arbitrary nature of the task (i.e., knowledge of A4 = 440 Hz). Because of the way it is structured, the paradigm allows the possibility of doing either significantly better or significantly worse than chance. For example, if a given AP musician usually played an instrument that was chronically flat, she or he might perceive the stimuli that were 50 cents off as being correct and the others as mistuned. Rather than performing at chance, data from such an individual would yield a receiver operating characteristic (ROC, in signal detection theory) that fell below chance performance (thus measures of sensitivity, such as d′, would be negative, and the area under the ROC, p(A), <0.5). Thus, rather than asking how well subjects recognize the precise frequency values of the ET scale, the paradigm tests how well subjects can distinguish among three different sets of stimuli, each of which is internally consistent and slightly different from the others.

4.1. Summary of findings

Subjects were presented with 100 pure sine tones ranging from C3 (130.8 Hz) to B4 (493.9 Hz). Thirty-four of the tones were tuned to the ET scale with A4 = 440 Hz, 33 differed by 25 cents, and 33 differed by 50 cents. For each tone, subjects were required to rate their confidence that the tone came from the ET scale, on a scale ranging from 1 = completely confident that it did not come from the ET scale to 6 = completely certain that it did.

Mistuned (MT) stimuli were defined as the targets, and separate ROC curves were generated for the 25- and 50-cent stimuli. Data were quantified by calculating p(A), the area under the curve, for each subject, and, to account for the possibility of “mistuned” subjects (who would have negative ROC curves), the average of the positive distribution (APD = the absolute difference between p(A) and 0.5 for each subject) was also calculated for each group.

Fig. 4 shows p(A) values for representative APE, HTM, and NAP subjects. Data from 17 APE, 29 HTM, and 8 NAP control musicians are presented in detail in [52]. APE performance was significantly better than chance for both conditions. HTM performance exceeded chance for the 25-cent stimuli and there was a trend toward above-chance performance for the 50-cent stimuli. The NAP group did not differ from chance for either set of stimuli. Statistical comparison showed that APE subjects were better able than HTM subjects to distinguish between ET and MT tones and that they were more sensitive to the difference between 25- and 50-cent stimuli.
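To make the size of the 25- and 50-cent deviations concrete, the sketch below converts a note position and a deviation in cents into a frequency, using the standard equal-temperament relation with A4 = 440 Hz. The function name is hypothetical, and the example shifts the tone upward only for illustration; the text does not state the direction of mistuning.

```python
A4_HZ = 440.0  # reference tuning used in the study

def tone_hz(semitones_from_a4, cents_off=0.0):
    """Frequency of an ET note, optionally shifted by a number of cents.

    One semitone is 100 cents, and 12 semitones double the frequency.
    """
    return A4_HZ * 2.0 ** ((semitones_from_a4 + cents_off / 100.0) / 12.0)

c3 = tone_hz(-21)           # lower end of the stimulus range, about 130.8 Hz
b4 = tone_hz(2)             # upper end of the stimulus range, about 493.9 Hz
a4_plus_25 = tone_hz(0, 25)  # A4 raised by 25 cents, about 446.4 Hz
a4_plus_50 = tone_hz(0, 50)  # A4 raised by 50 cents, about 452.9 Hz
```

At A4 a 25-cent deviation therefore corresponds to a shift of roughly 6 Hz, and a 50-cent deviation places the tone midway (on a logarithmic scale) between two adjacent notes.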

Fig. 4. p(A) data for representative APE, HTM, and NAP subjects for target stimuli mistuned by 25 or 50 cents.
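The p(A) and APD measures plotted in Fig. 4 can be sketched as below. This is a minimal illustration that assumes a trapezoidal area under the rating ROC; the text does not say how the area was estimated. The function names are hypothetical. Ratings passed in must be oriented so that a higher value means more confidence that the stimulus is a target; because the study defined mistuned tones as targets while a rating of 6 meant "in tune", the usage lines reverse the scale with `7 - r`.

```python
def roc_area(target_ratings, nontarget_ratings, levels=range(1, 7)):
    """Trapezoidal area under the ROC built from confidence ratings."""
    points = [(0.0, 0.0)]
    # Sweep the criterion from strictest to most lenient rating level.
    for criterion in sorted(levels, reverse=True):
        hits = sum(r >= criterion for r in target_ratings)
        false_alarms = sum(r >= criterion for r in nontarget_ratings)
        points.append((false_alarms / len(nontarget_ratings),
                       hits / len(target_ratings)))
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

def apd(p_a):
    """Absolute difference between p(A) and chance (0.5)."""
    return abs(p_a - 0.5)

# Hypothetical ratings on the study's 1-6 "came from the ET scale" scale.
mistuned = [7 - r for r in (1, 2, 2, 3)]  # targets, scale reversed
in_tune = [7 - r for r in (6, 5, 6, 4)]   # non-targets, scale reversed
p_a = roc_area(mistuned, in_tune)
```

A listener whose internal reference is consistently shifted would produce p(A) below 0.5, and APD treats that listener as equally sensitive to one with the mirror-image p(A) above 0.5.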

Separate analyses were conducted to test a number of alternate explanations of our data. These analyses showed that performance on this task did not correlate with type of instrument, years of experience, or the age of onset of training. Further, neither APE nor HTM subjects were significantly affected by target chroma (i.e., they did not perform more accurately for common notes such as C, E, and G than for uncommon notes such as D#, F#, and G#), and neither group's performance correlated with either the ability to name notes or the ability to reproduce A4 = 440 Hz accurately.8

8 These data were collected after completion of one of the previous reproduction tasks. After the last response, the sine function generator was reset to 1000 Hz and subjects were asked to tune it to match A440. They were not given any feedback on the task and there was no immediate musical context to assist them.

4.2. Discussion

According to popular mythology, musicians with “perfect pitch” are highly adept at distinguishing between properly tuned and mistuned stimuli. This belief, however, has been largely anecdotal. The present study is the first to assess this skill explicitly and the data show a clear empirical basis for the myth.

While at face value these results appear to reflect differences in musical skill, this is not the case. All of the subjects who participated had sufficient musical experience to familiarize themselves with the “correct” value of each note. Yet despite their extensive training, the NAP controls were incapable of distinguishing between properly and improperly tuned stimuli and the HTM subjects were only marginally better. The lack of correlation with any other musical parameters (note naming ability, memory for A4 = 440 Hz, or overall experience) similarly confirms the nonmusical nature of the task.

Instead, these data support the fundamental assertion of our model: Stimulus chroma is a salient feature of perception for APE subjects only. In this group, differences in pitch of as little as 25 cents are meaningfully encoded. For these individuals our task entailed identifying the pitch of each stimulus within a continuous spectrum and judging the distance to the nearest known reference point. Because the 50-cent stimuli lie further from such references, they were recognized with greater confidence and accuracy.

While the HTM subjects performed mildly better than chance, they were relatively insensitive to subtle differences in pitch. These data suggest that HTM subjects performed the task using a more basic strategy: each target was compared directly to the memory of a “correct” tone and the judgment was made as a simple same/different response. Thus, while these individuals have been able to establish in their minds several fixed points of reference, these references may be relatively broad and their utility is limited. While they may enable subjects to assign labels to musical tones, at a perceptual level, fine-grained differences in pitch are not meaningfully encoded. These data support the assertion that HTM is fundamentally a skill of retrieval and comparison rather than a basic property of perceptual encoding.

5. Does AP confer superior auditory memory?

Another facet of the mythology of “perfect pitch” is that possessors of this trait may have an increased memory for musical stimuli. This belief may be fueled, in large part, by the performance of certain savant musicians [53–57]. One famous example appears in the story of a trip by Mozart to the Vatican during which he reputedly heard a single performance of Allegri's Miserere and transcribed the entire piece from memory. While the historical accuracy of this anecdote may be difficult to verify, the myth remains popular and with it the question: Are AP possessors predisposed to having better musical memories?

As discussed above, there is an appreciable body of literature demonstrating that AP possessors have decreased decay rates for individual tones. In a situation where one group has the ability to name the target item and the other does not, such an effect is not particularly surprising. Perhaps for this reason, few investigators have explored differences between AP and NAP individuals in remembering sequences of tones or other extended musical stimuli. The myth remains untested.

We have presented a model of AP in which APE possessors differ from “normal” individuals, and from possessors of HTM, by the presence of an additional stream of auditory information that may parallel the ordinary RP processing pathway. In this scenario, APE individuals possess an extra channel carrying information about stimulus frequency to the cortex. As such, one might reasonably expect that they should have an increased capacity to remember musical stimuli; a greater overall amount of information about auditory stimuli is transmitted to the cortex. This might also account for subjective perceptual differences described by AP and NAP musicians. In our own informal surveys, the vast majority of NAP musicians reported that when listening to a complex piece of music, they can attend to only a single aspect of the piece (e.g., in a symphony they could focus on the flute, the violins, the chord progression, or some other harmonic element, but not more than one of these items at a time). In contrast, most of our APE musicians report that it is relatively easy for them to keep track of several streams at once. This ability is closely related to the concept of preattentive processing (see below). Unfortunately, we are unaware of any systematic efforts to study differences in musical memory between AP and NAP musicians. The goal of the present study was to compare APE, HTM, and NAP musicians on a series of memory tasks and to determine the extent to which a subject's memory depends on his or her ability to name the stimulus in a conventional manner.

5.1. Memory for sequences of individual piano tones

Subjects were presented with sequences of one, four, or seven individual piano tones followed by a probe tone. Tones were taken from the range of G3 (196 Hz) to F#4 (370 Hz). Subjects rated their confidence on a scale from 1 to 6 as to whether the probe had been part of the preceding sequence. Approximately half of the time, the probe tone matched a tone from the sequence; when it did, the target was always presented in either the second or third position of the sequence. When the probe did not match a tone from the sequence, it always differed by a half-step from a tone in one of the target positions. For each subject and each condition, an ROC was again generated from the confidence ratings, the area under each ROC, p(A), was calculated, and each group's performance was then compared with chance (p(A) = 0.5) using means and SE.

Data from representative APE, HTM, and NAP subjects are shown in Fig. 5. Group data for 12 APE, 17 HTM, and 9 NAP subjects are presented in [58]. The data show a clear ordering: APE subjects performed best, followed by HTM and then NAP subjects.

One might speculate that performance on the memory task depends directly either on subjects' musical experience or on their ability to name notes. However, extended analyses showed no correlation between performance on this task and either of these parameters, either for within-group analysis or for the combined cohort. Thus, the ability to perform this task accurately appears to reflect inherent group differences in the perceived salience of pitch stimuli.

5.2. Memory for sequences of complex piano chords

In the first part of the experiment, stimuli were individual piano notes. Consequently, APE and HTM subjects might have had an advantage in that they could label each tone and remember the appropriate label (rather than having to rely on difficult relative pitch calculations). The goal of the present experiment was to present subjects with stimuli that would elude conventional notation and thus eliminate any bias acting in favor of individuals who can name notes. To this end, piano chords were carefully designed so that they could not be named according to any standard musical conventions. Each chord consisted of six notes: the root, the major third and fifth one octave above, and a doubling of the root three octaves above the bass. The other two notes were “blue” notes and consisted of either the sixth or seventh below and either the ninth or eleventh above the third/fifth cluster. Chords were then constructed using every permutation of flatting and sharping the accidentals (i.e., flat or natural sixth; flat or natural seventh; flat, natural, or sharp ninth; natural or sharp eleventh). Thus, there were 20 possible permutations for each chord root, giving a total of 240 possible chords. A sample chord would be: G2–E3–B3–D4–A#4–G5 (root–sixth–third–fifth–sharp ninth–root). Approximately half of the time, the probe chord matched a chord from the sequence. When the probe differed, it differed only by a half-step in one of the two “blue” notes

Fig. 5. p(A) data for representative APE, HTM, and NAP subjects for sequences of 1, 4, and 7 stimuli.
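The chord counts given in Section 5.2 (20 permutations per root and 240 chords in total) can be checked by enumeration. The sketch below assumes that one lower "blue" note is chosen from the four sixth/seventh variants and one upper "blue" note from the five ninth/eleventh variants, and that the total covers the 12 chromatic roots; the labels are illustrative, and the 12-root assumption is inferred from 240 / 20 rather than stated in the text.

```python
from itertools import product

# One lower "blue" note: flat or natural sixth, flat or natural seventh.
LOWER_BLUE = ["b6", "6", "b7", "7"]
# One upper "blue" note: flat, natural, or sharp ninth; natural or sharp eleventh.
UPPER_BLUE = ["b9", "9", "#9", "11", "#11"]
# Assumption: all 12 chromatic roots were used.
ROOTS = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

per_root = list(product(LOWER_BLUE, UPPER_BLUE))
all_chords = [(root, low, high) for root in ROOTS for low, high in per_root]

print(len(per_root), len(all_chords))
```

Under this reading, the sample chord in the text corresponds to the combination of a natural sixth with a sharp ninth on the root G.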

of the chord; the root and the harmonically important notes were always identical. Thus, an incorrect probe for the sample above might be G2–Eb3–B3–D4–A#4–G5 (root–flat sixth–third–fifth–sharp ninth–root). Sequences were randomly generated such that no chord root repeated within a sequence, and for each chord the specific configuration of “blue” notes was randomly determined. All other parameters were the same as described in the previous section.

Data for representative APE, NAP, and HTM subjects are shown in Fig. 5. Data from the full cohort of subjects are presented in [58]. Performance for all three

groups differed from chance for the one-chord condition; for the four-chord condition, only the APE and HTM performances were significantly better than chance; for the seven-chord condition, none of the groups differed significantly from chance.

As above, there were no significant correlations between musical experience or note naming ability and p(A) for any group at any sequence length.

5.3. Discussion

In the first experiment there were clear differences among APE, HTM, and NAP groups. One might speculate that APE and HTM subjects performed better because they could label the component tones of each sequence. This undoubtedly played some role in their improved performance compared with the NAP controls. However, data from the regression analysis of p(A) and note naming ability should temper this simple conclusion. Within the group of individuals with “perfect pitch,” there was no correlation between these two variables. This is somewhat less surprising when one considers the speed of the sequences: the interstimulus interval of 800 ms is approximately as fast as our best note namers (cf. [31]) and considerably faster than the majority of HTM subjects [20,44]. Instead, these data suggest that performance on the paradigm may reflect fundamental differences in the perceived salience of stimulus frequency. They also support the widespread belief that individuals with “perfect pitch” have better memories for simple musical stimuli.

The second part of the experiment was designed to test whether APE and HTM possessors have a superior memory for musical stimuli even when it is not possible to label them using any conventional nomenclature. To this end, the chords were designed to be foreign: it is highly unlikely that a subject had ever heard any chord like them and conventional names for them do not exist. In an isolated context perhaps it would have been possible for highly trained musicians to determine the “quality” of the chord. However, in a random context and with a total of only 800 ms per chord, this is implausible. As a qualitative check, subjects were asked, after the completion of the last task, if they had “figured out” how we had designed the stimuli; while many subjects appreciated that the changes always took place in the inside voices, none was able to describe accurately the formula we had used.

Despite the imposing difficulty of the task, both the APE and HTM groups performed significantly better than chance in the four-chord condition. As before, performance on this task did not correlate with either years of experience or the ability to name notes. These data support the hypothesis that AP possessors are disposed to having an increased ability to remember musical stimuli. For the APE group, this effect closely matches our initial prediction. For the HTM group, we were surprised to find that they performed almost as well as the APE musicians. Perhaps subjects with HTM performed well because the stimuli comprised chords with many notes presented simultaneously in what might be called a “timbral mass.” If HTM possessors are defined by their ability to extract pitch from complex timbral cues, particularly given that the stimuli were all played on a piano, it is plausible that they were able to transfer some degree of absolute identification skills to the identification of the pitch complexes.

In conclusion, these data show clear differences in the ability of APE, HTM, and NAP individuals to remember short musical sequences. Performance on these tasks did not correlate with subjects' ability to name notes. Impressively, both APE and HTM subjects performed significantly better than chance with sequences of four chords that were specifically designed to be unnamable. These data are the first test of overall musical memory skills in AP possessors and give credence to the popular belief that people with “perfect pitch” have an inherently greater ability to remember musical stimuli. These findings are consistent with the assertion of our model that stimulus frequency may be a more salient feature of perception for APE and, to a lesser extent, HTM possessors.

6. Preattentive processing of pitch and timbre

As discussed above (Section 2.3.1), a significant body of research has explored the concept of preattentive processing in the visual domain. As initially described, “features” of visual perception are thought to reflect the basic stimulus attributes encoded by the visual system [32]. Despite major progress in this field, surprisingly little research has been dedicated to exploring this phenomenon in other domains, such as the auditory. From this perspective, the question remains open: What are the basic features of audition? Obvious candidates include: loudness, height, timbre, and, of greatest interest to the present research, pitch.

For pitch, several studies have shown preattentive processing in tasks requiring response to “oddball” stimuli, which differ saliently from others in a sequence [59–61]. While these data suggest that pitch differences may be processed preattentively, they do not address the question of whether pitch, per se, may be considered a basic feature of perception. Given that the absolute frequency of a stimulus has little (if any) salience to most individuals, it would be counterintuitive to qualify it as such. Perhaps, as discussed above (and originally hinted at by Bachem), height may be a feature of perception in normal individuals,

whereas some basic difference in signal processing makes chroma an additional feature of perception in AP possessors.

One reason that research into preattentive processing in audition has proceeded so slowly may be the difficulty of adapting classic paradigms to the auditory domain. For example, one standard test of preattentive processing is to measure search time for a target in a visual field: if the target is preattentively processed, RT should be largely independent of the number of distractors. The obvious auditory analog of this task would be to translate visual space into time: thus, one might present a target and then ask subjects to signal when that target was presented during a sequence of interfering tones. However, several problems arise. Foremost, a memory component is necessarily introduced into any task spread out over time. Second, a paradox emerges from the way the target is initially defined: if subjects are asked to search for an item by name (e.g., respond to each ‘A’), then the task could be performed only by experienced musicians with AP; if, however, the desired target is presented at the start of each sequence, then it becomes impossible to prevent the task from being performed on the basis of relative pitch. Thus, with this type of paradigm, there is no obvious way of testing the extent to which normal individuals preattentively process the absolute pitch of stimuli.

An alternative paradigm of preattentive visual processing requires subjects to focus on a central image and then evaluates their ability to process peripheral stimuli. According to the theory, only preattentive features (e.g., color, orientation) can be perceived in this manner. For example, a peripherally presented red vertical line could be perceived either as red or as vertically oriented, but not both. Perceiving both would require the assembly of features into a complex object, which can only happen with overt attention. This type of paradigm is more easily adaptable to the auditory domain and allows us to ask the more explicit question: To what extent do individuals encode representations of stimulus frequency (as opposed to differences in pitch) when they are not attending to the target?

The following experiments were designed to test the extent to which APE, HTM, and NAP individuals preattentively process the pitch and timbre of auditory stimuli. Our model predicts that APE possessors should be very good at this task, whereas NAP individuals should be relatively poor. To the extent that HTM subjects “automatically” process the targets (i.e., recognize them because of their similarity to items in their memorized templates) they could outperform the NAP controls; however, this form of processing should be both inefficient and inaccurate as compared with true preattentive processing. For timbre, we did not expect to find any differences between groups.

6.1. Preattentive processing of pitch or timbre sequences

Eleven APE, 21 HTM, and 13 NAP controls participated in the following experiment.⁹ On each trial, subjects were instructed to remember a series of six images.¹⁰ Coincident with each image a tone was presented. The tonal sequences varied in either pitch or timbre (e.g., six different pitches played on a piano or six different instruments playing middle ‘C’). Following the sequence two separate probe images were presented. For each one, subjects were required to respond yes/no as to whether that image had been part of the sequence. Following the probe images a probe tone was presented and subjects were required to rate their confidence (1–6 scale) as to whether the tone had been part of the preceding sequence. Probe tones were part of the sequence approximately half of the time; when they were, the target was always presented in either the second or third position of the sequence. Subjects were explicitly instructed that their first priority was to remember the visual stimuli accurately and that their data would be thrown out if they failed to do so.

Data were first analyzed for visual accuracy and any trial in which a subject responded incorrectly to a probe image was excluded.¹¹ Visual accuracy (VA) was defined as the proportion of trials in which a subject responded correctly to both probe images (i.e., chance VA = 0.25). Data were then separately analyzed for timbre and pitch sequences, and ROC curves were generated from the confidence rating data. Data were quantified by calculating p(A)_pitch and p(A)_timbre for each subject. ANOVAs were conducted with these data as the dependent variable.

APE, HTM, and NAP subjects had VAs (mean ± SE) of 0.64 ± 0.03, 0.65 ± 0.02, and 0.67 ± 0.03. ANOVA with VA as the dependent variable showed no effect of group [F(2, 42) = 0.58; ns].

⁹ APE: 6 male; average age of 20.6 ± 0.7 years; average of 14.5 ± 0.7 years of experience playing a musical instrument. HTM: 5 male; average age of 20.6 ± 0.6 years; average of 14.3 ± 1.0 years of musical experience. NAP: 7 male; average age of 19.4 ± 0.3 years; average of 10.1 ± 0.7 years of experience on their primary instrument.

¹⁰ Images were presented for a duration of 1500 ms with an interstimulus interval (ISI) of 250 ms. Stimulus tones were presented for 600 ms and were generated using EasyBeat MIDI software. Tones were selected from a set that consisted of each note of the ET scale from C4 (261.6 Hz) to B5 (987.8 Hz) played on each of 20 diverse timbres. Six blocks consisted of 20 trials each and for each block a different category of images was used (including beetles, green minerals, trees, geometric shapes/colors, warblers, and stamps). Subjects were encouraged to rest as necessary between blocks. Completion of all six blocks typically required between 45 minutes and an hour and subjects performed two complete cycles. Identical Psyscope scripts were used for all subjects.

¹¹ One AP subject was colorblind and had a particularly difficult time performing the visual tasks. For this subject, data were included for all trials, regardless of his visual accuracy (though he was given the same instructions as the other subjects).
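As an illustration of the stimulus set described in footnote 10, the equal-tempered (ET) frequencies from C4 to B5 can be sketched as below. This is a minimal sketch, not the authors' procedure: it assumes A4 = 440 Hz tuning and MIDI note numbering (C4 = 60), neither of which is stated in the text, and the function names are hypothetical.

```python
# Hypothetical sketch: equal-tempered frequencies for the 24 notes C4..B5.
# Assumptions (not from the source): A4 = 440 Hz, MIDI numbering with C4 = 60.

A4_HZ = 440.0
A4_MIDI = 69
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def midi_to_hz(midi_note: int) -> float:
    """Frequency of a MIDI note number in 12-tone equal temperament."""
    return A4_HZ * 2.0 ** ((midi_note - A4_MIDI) / 12.0)


def note_name(midi_note: int) -> str:
    """Scientific pitch name for a MIDI note number, e.g. 60 -> 'C4'."""
    return f"{NOTE_NAMES[midi_note % 12]}{midi_note // 12 - 1}"


# C4 (MIDI 60) through B5 (MIDI 83): two octaves, 24 notes.
tone_set = {note_name(m): round(midi_to_hz(m), 1) for m in range(60, 84)}

n_timbres = 20                          # number of timbres given in footnote 10
n_stimuli = len(tone_set) * n_timbres   # every note on every timbre
```

Under these assumptions the endpoints come out at 261.6 Hz (C4) and 987.8 Hz (B5), matching the values quoted in the footnote, and the full note-by-timbre set contains 480 tones.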

Fig. 6. Average ROC curves for APE, HTM, and NAP subjects for preattentive processing of pitch.
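The text does not state how p(A), the area under the ROC curve, was computed from the 1–6 confidence ratings. One common approach, sketched here purely as an assumption, sweeps a criterion across the rating scale to obtain hit and false-alarm rates and then integrates with the trapezoidal rule. The function names are hypothetical, and a rating of 6 is assumed to mean "certain the probe tone was in the sequence".

```python
# Hypothetical sketch: area under an empirical ROC curve, p(A), from
# 1-6 confidence ratings. Assumes 6 = "sure the probe was in the sequence".
# The trapezoidal rule is one common estimator; the source does not say
# which estimator was actually used.

from typing import List, Sequence, Tuple


def roc_points(target_ratings: Sequence[int],
               lure_ratings: Sequence[int],
               n_levels: int = 6) -> List[Tuple[float, float]]:
    """(false-alarm rate, hit rate) pairs, from strictest to laxest criterion."""
    points = [(0.0, 0.0)]
    for criterion in range(n_levels, 0, -1):
        hits = sum(r >= criterion for r in target_ratings) / len(target_ratings)
        fas = sum(r >= criterion for r in lure_ratings) / len(lure_ratings)
        points.append((fas, hits))
    return points  # the laxest criterion always yields (1.0, 1.0)


def p_a(target_ratings: Sequence[int],
        lure_ratings: Sequence[int],
        n_levels: int = 6) -> float:
    """Trapezoidal area under the empirical ROC curve."""
    pts = roc_points(target_ratings, lure_ratings, n_levels)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area
```

With this estimator, a value of 0.5 corresponds to chance discrimination and 1.0 to perfect discrimination between trials where the probe tone was and was not part of the sequence.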

For the pitch sequences, APE, HTM, and NAP groups performed at (p(A) ± SE) 0.85 ± 0.02, 0.81 ± 0.01, and 0.75 ± 0.02, respectively (see Fig. 6). All of these values were significantly better than chance (P < 0.01). For the timbre sequences,¹² APE, HTM, and NAP groups performed at 0.74 ± 0.03, 0.74 ± 0.02, and 0.76 ± 0.02, respectively (see Fig. 7).

ANOVA showed a significant effect of task [F(1, 39) = 18.14, P < 0.0001] but not group [F(2, 39) = 1.30, P > 0.20]. The interaction between task and group was highly significant [F(2, 39) = 5.17, P = 0.01]. These data show that APE, HTM, and NAP subjects performed differently on the pitch and timbre tasks: as predicted, there was a clear order effect (APE > HTM > NAP) for the pitch task but no group differences for timbre.¹³

¹² Three subjects (two AP, one HTM) misunderstood the task and thought that targets need only correctly match the pitch of an item; as such, they responded “6” to all timbre sequences. Accordingly, data from these subjects have been excluded from the timbre analysis.

¹³ Despite the magnitude of this effect, we believe these data may actually give a falsely low impression of APE performance. In point of fact, there were two outlier individuals who performed significantly worse than the rest of the group. One was an individual who performed poorly in the seven note condition of the previous experiment and clearly had lower overall memory abilities. At a later date, we questioned the other outlier about his subpar performance (relative to his performance on the rest of our tasks) and he reported that he had not felt well the morning of the experiment. He agreed to retake the test and was rerun using a new script. On the retest he had a p(A) of 0.97 for pitch, which was the highest of any subject (compared with 0.74 during his first session). Another interesting overall indication of APE performance is that eight of the top nine scores were by APE possessors.

How might we explain the results? The most obvious explanation is that the APE subjects have an increased ability to preattentively process pitch. In fact, the data closely correspond to the two major predictions of our theory: increased performance on the pitch task for the APE group and equal performance across groups for the timbre task. However, several alternate explanations exist.

First, it is possible that the visual task did not fully maintain subjects' attention; in this case the paradigm would become a divided attention task rather than one of preattentive processing, per se, and a trade-off would be expected between visual and auditory performance. Regression analysis of p(A) as a function of visual accuracy showed this not to be the case for any group at either task. This explanation also seems less likely given the nearly identical performance of all three groups on the visual task.

A second possibility is that performance on the auditory tasks depends directly on subjects' ability to label the auditory stimuli. Regression analyses of APE and HTM data showed no correlation, however, between note naming ability and either p(A)_pitch or p(A)_timbre.

A third explanation is that APE possessors are somehow endowed with an enhanced ability to attend to multiple tasks: thus, rather than reflecting a difference in preattentive processing, the data could reflect a more generalized increase in attentional resources for auditory stimuli. To some extent, this hypothesis is refuted by the failure to find group differences in the timbre task. Nevertheless, a more rigorous test would be to show that pitch and timbre perception in the present tasks truly qualify as preattentive. As described above, a hallmark

Fig. 7. Average ROC curves for APE, HTM, and NAP subjects for preattentive processing of timbre.

of preattentive processing is that subjects may perceive features individually but are incapable of fusing multiple features into an object without overt attention. Thus, a control for the above experiment is to test how well subjects can preattentively process stimuli that vary based on both pitch and timbre. If the APE group maintained elevated performance for such an “object” task it would suggest either: (1) that the paradigm does not test preattentive processing (i.e., that accuracy is proportional to subjects' ability to divide their attention between visual and auditory tasks) or (2) that pitch and timbre may not rightfully be considered as discrete preattentive features of auditory perception.

6.2. Preattentive processing of combined pitch/timbre sequences

Five APE, seven HTM, and five NAP musicians were recruited from the previous study to take part in the following experiment. One additional NAP subject was recruited.¹⁴

¹⁴ Performance of these subjects was generally comparable to that of the larger groups described in Section 2. APE, HTM, and NAP subjects had average VAs of 0.69 ± 0.05, 0.58 ± 0.05, and 0.66 ± 0.04; average p(A)_pitch of 0.89 ± 0.02, 0.85 ± 0.03, and 0.76 ± 0.03; and average p(A)_timbre of 0.81 ± 0.02, 0.78 ± 0.02, and 0.80 ± 0.02.

In the previous experiment tonal sequences were designed to vary in either pitch or timbre. To test how well subjects can perceive conjoined features, sequences in the present experiment were constructed so as to vary in both pitch and timbre simultaneously. Thus, six different pitches were presented, each of which was played on a different instrument. For trials where the probe was present, the target was either in the second or third position. When it was not, the second and third tones each matched one of the parameters. Thus, in all sequences subjects heard both the pitch and timbre of the probe during the main sequence; the question they were being asked was whether they heard the two as a conjoined object.

In this experiment, subjects performed six blocks of 15 trials each. All other parameters were the same as described above.

APE, HTM, and NAP subjects had average VAs of 0.71 ± 0.05, 0.58 ± 0.04, and 0.74 ± 0.07. These values were similar to their performance in the previous task.

Average ROC curves for the object task are shown for each group in Fig. 8. As predicted, subjects performed significantly worse in the present condition: the APE, HTM, and NAP subjects performed at average p(A) of 0.62 ± 0.03, 0.67 ± 0.04, and 0.66 ± 0.04. An ANOVA with task (pitch vs object) as a within group factor showed a significant effect of task [F(1, 15) = 100.35, P < 0.0001] but not of group [F(2, 15) = 1.88, P > 0.15]. The interaction between group and task was significant [F(2, 15) = 6.53, P < 0.01], reflecting a greater effect of task in the APE group.

6.3. Discussion

Data from these experiments show that APE subjects are better than either HTM or NAP subjects at preattentively processing pitch. This finding cannot be accounted for by differences in note naming ability or musical

Fig. 8. Average ROC curves for APE, HTM, and NAP subjects for combined pitch/timbre processing.

experience. Further, poorer performance for the object task rules out the possibility of baseline differences in either memory or attention. It remains unclear whether the APE group's poor performance in the object task results from a difference in strategy (i.e., they tried to use pitch and timbre information) or a limited ability to make use of height cues (perhaps secondary to lack of use in ordinary life). Nevertheless, taken as a whole, these data strongly support the hypothesis that the dominant mode of perception in APE subjects is to process pitch and timbre as separate features that cannot be fused without overt attention.

Data from the NAP group provide a fascinating counterpoint. One surprising aspect of their baseline data was that NAP subjects performed comparably for pitch and timbre tasks and that performance was significantly better than chance for both. If only APE possessors are able to preattentively process absolute pitch, why did a group of control musicians perform so well on this task?

The NAP group's performance on the pitch task likely reflects limitations of the paradigm itself. To a certain extent, it was possible to perform the task on the basis of height only. For example, if a sequence fell mostly within the fourth octave and the probe was in the fifth octave, no knowledge of absolute frequency would be required to respond correctly. A second factor to be considered is that the timbre of an instrument is known to vary considerably across its natural range, even within intervals as small as a major third [45]. This effect is likely to be particularly prominent given the irregular nature of the timbres we used. Thus, the combination of cues from the height and timbre of the stimuli provides a considerable amount of information about each note: subjects may have been able to identify many of the tones with sufficient precision as to be able to perform the task without needing to recognize the pitch of the target, per se. From this perspective, the equivalence of NAP performance for “pitch” and “timbre” tasks is not as surprising; in point of fact, performance for both tasks may depend strongly on timbral cues. Similarly, performance on the object task suggests that NAP subjects may have used some sort of fused height/timbre representation. Suffice it to say, future research elaborating the basic features of auditory perception in “normal” individuals would be extremely useful.

Of note, an important historical difficulty with these types of experiments (including our own) is that many “pitch” tasks are susceptible to subjects' exploitation of height cues. One possible way to circumvent this confound would be to present sequences across a wider frequency range and define “correct trials” as those in which the chroma of a target was presented in a different octave (e.g., for a sequence that included C3, a probe of C4 would be considered a “hit”). Such a manipulation would eliminate height cues and allow one to test more precisely the extent to which stimulus chroma is processed as a feature. Based on our theory, we would predict even larger effects than seen in the present article: NAP performance should fall to chance whereas APE performance should be unaffected.

Performance by the HTM subjects is more difficult to interpret. Data from the first part of the experiment confirm that HTM subjects performed better on the pitch task than normal controls, though not as well as the APE subjects. As in the APE group, performance was

markedly diminished in the object task. These data suggest that the difference in HTM processing may be connected to the ability to exploit height/timbre cues. Unfortunately, the present experiment was not designed to test the nuance of our model's distinction between APE and HTM. It would be interesting to explore this question at greater length in future experiments.

One test would be to compare performance of APE and HTM subjects for stimuli with common or rare timbres. To the extent that the recognition of pitches by HTM possessors depends on a target's similarity to a memorized template, we would expect the group to perform better for familiar timbres. Another follow-up would be to replicate the object experiment with similar manipulations in timbre. Again, to the extent that HTM subjects have memorized specific tones (e.g., an A4 on a violin) they should be better able to recognize these fused objects. In the interim, however, these data continue to support the distinction we have drawn between APE and HTM individuals.

7. Discussion

For more than 100 years, musicians and scientists alike have been fascinated by the enigmatic phenomenon of “perfect pitch.” Yet for all of the attention it has received, many aspects of this condition have remained largely unresolved; conflict persists regarding its etiology and its musicosocial significance, and, at the most fundamental level, there is no consensus on how the condition should be defined, let alone how best to account for heterogeneity among its possessors.

In this article, we propose a new model of AP that may help reconcile conflicting historical perspectives. In doing so, we attempt to reframe a phenomenon that has been explored almost exclusively from a behavioral perspective within the context of modern research on low-level mechanisms of pitch perception. Our new model arose from the qualitative observation that individuals who can name notes with great facility constitute a heterogeneous population. As such, the model subdivides AP possessors into two groups based on proposed differences in the mechanisms used to encode pitch. We define APE as the ability to perceptually encode stimulus frequency. We argue that this skill may result from fundamental differences in the processing of periodotopic pitch. This skill is automatic and independent of any musical experience. Further, subjects' ability to encode stimulus frequency is independent of the specific nature of the target.

In contrast, we define HTM (heightened tonal memory) possessors as those individuals with an increased ability to form and evoke memories of specific complex frequency stimuli. These individuals may be able to identify the AP of targets by comparing them to a template of memorized items. The more familiar the target stimulus, the easier it should be for an HTM possessor to recognize (and encode a long-term memory of) it. Broadly speaking, our definition of APE corresponds to the definition of AP described by the innate model and our definition of HTM corresponds to that of the ELT. The present experiments were designed to test both the explicit predictions of our new model and a series of historical myths and dogmas in the field. Foremost among these is the assumption, held by advocates of both the early learning and innate models, that AP is fundamentally a musical phenomenon.

Data from these experiments showed that: (1) APE and HTM subjects were equally adept at encoding long-term representations of HP stimuli (relative to their own baseline); (2) APE subjects were not only more accurate than HTM subjects at distinguishing between properly and improperly tuned stimuli, but they were more sensitive to fine-grained differences in frequency; (3) there were clear group differences in subjects' ability to remember simple musical sequences; more impressively, both APE and HTM subjects performed significantly better than chance at remembering sequences of chords that could not be identified using any conventional nomenclature; and (4) APE and, to a lesser extent, HTM possessors have an increased ability to preattentively process pitch but do not differ from NAP controls in preattentive processing of timbre; follow-up data from an “object” task support the idea that APE subjects perceive pitch and timbre as distinct features of perception, NAP subjects perceive height and timbre (but not pitch, per se) as basic auditory features, and HTM subjects may extract information about AP from height and/or timbral cues.

7.1. Reevaluating our model: Are there two different types of “perfect pitch”?

One might object that our distinction between APE and HTM is arbitrary, i.e., that the groups we defined are simply part of the same continuous spectrum and result from the same underlying mechanism. In contrast to this hypothesis, not only did we observe major differences between the groups, but for each experiment there was a significant interaction between the main experimental manipulation and AP status (i.e., APE vs HTM). Additionally, while the ability to name notes would be an obvious way to define a continuous spectrum of AP ability (as has been used by virtually every other research group), there was no correlation between performance on our gold standard note naming task and any of the other output variables measured, either within or across groups. Instead, the only factor that accurately predicted performance was the one that we used to define the groups in the first place, namely, the ability to encode accurate long-term representations of

stimulus frequency. These data provide overwhelming support to the legitimacy of our distinction between different types of “perfect pitch” with different underlying mechanisms.

7.1.1. (Re)defining APE

According to our theory, APE subjects are distinguished by a low-level difference in the pathway used to process periodotopic pitch. Specifically, we argue that they may be uniquely able to access the low-level representation of pitch that is thought to emerge via an autocorrelative process but is inaccessible in “normal” individuals. This hypothesis may be difficult to prove in any conventional, noninvasive manner; nevertheless, the present data support it in several important ways. First, data from the HP experiment show that APE subjects are not limited in their ability to encode representations of purely periodotopic pitches; these data confirm that APE subjects are, at the very least, able to encode pitch via some mechanism in the temporal pathway. More significantly, data from the other experiments strongly corroborate our qualitative description of APE. Possessors of APE appear to encode the pitch of stimuli immediately and absolutely. This process is independent of either attention (see Section 6) or the ability to name notes (as evidenced by lack of correlation between performance on the gold standard note naming test and any other tasks). Further, the representations of pitch formed by these subjects fall along a continuous spectrum within which subjects can resolve differences a fraction of the size of the smallest meaningful unit in Western music (see Section 4).

Further experiments should be aimed at measuring neurophysiological signals so as to test the proposed mechanism more directly. Useful methods may include MEG, EEG, or brainstem imaging with functional MRI. If our model is correct, one might expect to see differences between APE and NAP individuals as low in the auditory pathway as the cochlear nucleus or superior olive.

7.1.2. (Re)defining HTM

Our model defined HTM as an enhanced ability to memorize and retrieve specific configurations of auditory features from which subjects could infer AP information. The system used to process timbre was hypothesized to play a prominent role. By this definition, an underlying difference distinguishing HTM possessors from NAP controls, if it exists, could lie in either the spectral or temporal pathway or at a higher level, after the two streams have been reintegrated.

The present data help refine this original definition. At a qualitative level, performance by the HTM group in the intonation task supports the idea that subjects may encode stimulus frequency by comparing targets to a memorized template and that this template may be relatively crude and idiosyncratic across subjects. Second, as discussed with the APE subjects, accurate performance with the HP stimuli contradicts the primary importance of any spectral processing mechanism. Other data (including those obtained in what is dubbed a “Garner interference paradigm” [31]) lend further support to a mechanistic distinction between HTM and NAP individuals. Finally, the most revealing finding for this group may be subjects' performance on the preattentive pitch, timbre, and object tasks: while subjects demonstrated a moderately increased ability to process pitch preattentively, they were not as inhibited as the APE subjects in their ability to process conjoined pitch/timbre objects. This finding supports the idea that HTM subjects do not process pitch as an independent feature of audition; rather, as argued above, their ability to process pitch appears closely tied to the height and/or timbre of the stimulus.

As a whole, these data support the idea that HTM subjects differ from APE and NAP possessors at a relatively high level in the auditory pathway: specifically, they seem better able to memorize and retrieve complex stimuli that are integrated across multiple auditory streams. The greater the number of familiar features that define a stimulus, the easier it seems to be for them to recognize it. An interesting quirk of the behavior of many of our HTM subjects was the importance of motor memory in trying to recall a specific pitch. For example, many of the string players would “finger” a note that they were trying to either name or remember. Other subjects reported being able to name notes only while they were holding their instrument in their hands. These types of motor cues seemed to facilitate subjects' auditory imagery for tones. These observations not only support the qualitative description of HTM as dependent primarily on retrieval mechanisms, but they also suggest that the critical pathways involved may be quite high in the auditory pathway, perhaps involving cortical memory systems that can allow for integration both within and across modalities.

7.1.3. Two groups or three?

It should be emphasized that these data point to a fundamentally different process used by APE and HTM possessors to encode pitch. While the former group is distinguished by a difference in the ability to encode periodicity pitch, the HTM group is distinguished by an increased ability to form and retrieve memories of specific complex stimuli. As such, a major unanswered question is whether, in the spirit of John Watson, we can take any child and provide sufficient training to induce HTM, or whether some kind of innate predisposition is necessary. This question is equivalent to asking whether the difference between HTM and NAP individuals is one of type or of degree.

It seems likely that the appropriate type of training, particularly at a young age, may lead to the formation of a template in virtually any individual. However, some individuals may be predisposed to either: (1) being more susceptible to forming templates from specific auditory events; or (2) being better able to retrieve memories from their templates. Either of these possibilities would lead to greater fluidity in naming notes: the former by creating a more thorough template and the latter by allowing easier recognition of tones that are similar to, but do not exactly match, items in a subject's template. At this point, while it is clear that differences exist between NAP and HTM groups, the relative roles of innate and epigenetic factors remain ambiguous.

While our definition of HTM resembles the ELT in many ways, we must emphasize that it differs in the most critical one: though the ability of this group to recognize pitches may be based on a memorized musical template, we believe that the ability to name notes is a symptom rather than a cause of the condition. A salient feature of HTM, considered broadly, is an increased ability to memorize and retrieve specific complex frequency stimuli, a skill that may facilitate note naming if items in the template are paired with musical labels, but may also be relatively independent of labeling. Two major factors have led researchers to overemphasize the importance of attaching labels to items in a subject's template. First, classic AP tests screen musicians only, a group in which musical labels are likely to be more important. More significantly, these types of paradigms explicitly ask subjects to apply musical labels to targets; thus, the extent to which unlabeled representations exist in subjects' templates has rarely been examined. Given this context, however, the recent data showing that “normal” individuals remember the starting pitches of their favorite songs with disproportionately great accuracy are of particular interest in that they provide clear evidence for a dissociation between the ability to memorize/retrieve specific complex auditory stimuli and the need to overtly label these representations.

Because most of the present experiments sought to test differences between APE and NAP subjects, further experiments that are specifically aimed at characterizing the HTM group could be illuminating. As discussed above, MEG, EEG, and fMRI paradigms would be helpful in clarifying the relative importance of different neural systems, which could occur anywhere between brainstem nuclei and parietal association cortex. At a behavioral level, expanded testing could elucidate the extent to which note naming ability in HTM subjects is a direct outgrowth of the tonal memory that has been observed in NAP controls.

7.1.4. Qualitative observations on APE and HTM

Although subjects were classified as APE or HTM based on their performance on our reproduction tasks, ultimately, our model was formulated based on qualitative observations and discussion with subjects. During the course of testing, all subjects were surveyed with a standardized set of questions including: When did you first realize you had “perfect pitch”? and Are there any features which make it easier or harder for you to recognize tones? For virtually all of our subjects, responses to these questions gave a clear indication of whether they possessed APE or HTM.

APE subjects typically could not recall a time when they did not have AP: frequently, they would respond to this question by recounting the epiphany in which they realized not that they had AP, but that everyone else did not. Familiarity effects in this group were generally minor; subjects reported being equally able to name musical or nonmusical stimuli, and few reported any changes in pitch perception over time. One of the most remarkable features of these subjects was their qualitative assessment of our paradigm: the majority of them reported that there was no difference in difficulty between reproducing a tone following interstimulus intervals with or without interference.

In contrast, many of the HTM subjects recounted stories of having learned or taught themselves “perfect pitch” or how it had slowly evolved over time. Virtually all of them described a set of parameters that facilitated note identification and many claimed to have “perfect pitch” only for their primary instrument or only for some notes. A typical story would be that after having played piano for 10 years, the subject spontaneously realized that he or she recognized any note played on a piano. Then, gradually, he or she gained the ability to name notes on other familiar instruments. These subjects typically found it quite difficult to remember a tone after extended interference.

7.2. On the mythology of “perfect pitch”

7.2.1. Myth 1: AP is a musical phenomenon

Perhaps because it is such a striking and valuable skill in a musical setting, AP has historically been defined strictly as the ability to name notes. As such, advocates of both early learning and innate models have taken for granted that the skill is limited to the musical domain. In fact, to the best of our knowledge, no one has previously attempted to define AP within the broader context of auditory perception, nor has anyone devised a paradigm that could test for AP independent of a subject's musical experience. Recently, we showed not only that the latter of these goals was feasible, but that there exist individuals who cannot name notes accurately but who nevertheless possess AP. These data show that AP may be independent of a subject's musical experience. The present data extend these findings.

We show that APE and HTM possessors differ from NAP musicians across a wide range of auditory tasks,
some of which may be considered “musical,” others of which cannot. None, however, relates directly to the classic AP skill of note naming. The data show that APE subjects have an increased sensitivity to fine-grained pitch differences, an increased memory for tonal stimuli, and an increased ability to preattentively process pitch. These differences cannot be explained by note naming ability. They also do not correlate with musical experience. Rather, they demonstrate that APE possessors perceive stimulus frequency in a fundamentally different manner from “normal” individuals.

A paradigm shift is in order: it is time to stop defining AP based on a single, arbitrary skill and to start exploring the full range of perceptual attributes encompassed by this phenomenon.

7.2.2. Myth 2: AP possessors identify stimulus frequency by using a place code on the basilar membrane

Despite a significant body of recent research demonstrating the relative importance of periodotopic signals in pitch perception, researchers have persisted in asserting that a simple place code on the basilar membrane is the critical mechanism used to encode absolute pitch. This discussion has been carried out largely within the context of age-related changes in pitch perception in AP possessors.

The present data show conclusively that AP perception can be independent of a cochlear place code. While it is plausible that redundant mechanisms exist (one spectral and one temporal), it is more parsimonious to conclude that AP recognition derives from a periodotopic mechanism. A worthwhile line of inquiry might therefore be to examine how age-related changes affect the rate of firing in the cochlear hair cells.

7.2.3. Myth 3: AP may predispose individuals to special musical skills

Within musical communities, it may often be accepted, either implicitly or explicitly, that having AP may help make someone a better musician. Obviously, musicianship is multifaceted, and the dominant component (at least in tonal music) is developing a strong sense of relative pitch. Nevertheless, AP possessors are frequently rumored to have increased musical memory or an increased ability to discern in-tune from out-of-tune stimuli. While commonplace, these suggestions have been deemed politically incorrect and vehemently rejected by some members of the scientific community (generally, the same individuals who reject the similarly politically incorrect idea that AP is innate).

The present data show an empirical basis for both of these myths. APE possessors were not only able to accurately discern in-tune from out-of-tune stimuli; they also demonstrated a vastly increased memory for simple musical sequences. Neither of these skills correlates with either the ability to name notes or subjects' musical experience. Rather, they appear to reflect a difference in the perceptual salience of stimulus frequency (as similarly shown in the preattentive processing experiment).

These findings call for an expanded inquiry into other potential differences between APE and NAP individuals. Are there other group differences within the auditory domain? Perhaps of greater interest, do differences between APE and NAP individuals extend to other cognitive realms (e.g., is the increased memory generalizable?)? A more provocative topic to explore would be the potential connection between AP and the nebulous social construct of “genius.” At least among musical “geniuses” there is clearly a disproportionate incidence of AP, from classical composers (e.g., Mozart and Beethoven), to modern musicians (e.g., Yo-Yo Ma, Wynton Marsalis, and Bobby McFerrin), to special populations such as autistic musical savants.

A noteworthy aspect of the present studies is that the research was conducted at Yale University and the vast majority of subjects were undergraduate students of Yale College. An interesting aspect of this population was that while almost all of them were excellent musicians, few intended to pursue music professionally. In fact, the subjects were quite diverse in their primary areas of study, including journalism, psychology, and biomedical engineering. Suffice it to say, while musically rich, the environment was far from the typical conservatory atmosphere from which most researchers recruit their subjects.

Nevertheless, within this population the prevalence of APE was extraordinarily elevated compared with any conventional estimates (which generally range from 1/5000 to 1/10,000). Currently at Yale College, there are no fewer than seven individuals with this trait (out of a student body of approximately 4500), a prevalence five to ten times greater than the most generous estimates of AP. If we include HTM possessors in this calculation (as published reports almost certainly do), there are another 16 current undergraduates who were included in this study plus at least half a dozen more “perfect pitch” possessors of whom we are aware but have not yet tested. Thus, in total, no less than 0.5%, and possibly up to 1%, of Yale undergraduates possess some form of “perfect pitch.” While the admissions committee may look favorably on applicants with musical experience, this proportion is still surprising and is worthy of further consideration in its own right.

7.3. General implications

7.3.1. Models of auditory processing

The cochlea encodes stimulus frequency along two orthogonal domains: spectrally, based on the location of the point of maximum amplitude along the basilar membrane, and periodotopically, based on the spike intervals from outer hair cells. Researchers studying
low-level auditory processing tend to emphasize the importance of periodotopic pathways in pitch perception. In contrast, researchers studying pitch processing at the cortical level remain relatively fixated on the importance of tonotopic representations. At first glance, it is puzzling why researchers from these two fields have been unable to reconcile their differing perspectives. On closer examination, though, the conflict may be seen to result from the major irony of “pitch processing” research: for the vast majority of individuals, “pitch” is a meaningless concept. While precise low-level representations of pitch may exist, at a cognitive level they are inaccessible: a ‘Bb’ does not sound any different than a ‘C’.

From this perspective, debate about the relative importance of tonotopic and periodotopic processing streams, like the debate between the early learning and innate models of AP, may be moot: each model appropriately describes a subset of the whole. Simple “pitch processing” tasks (e.g., high/low tasks for two fixed tones) are likely performed on the basis of height and processed via tonotopic pathways. More complex “pitch processing” tasks (e.g., roving high/low paradigms or interval recognition) must be performed on the basis of relative pitch information, which is likely processed via periodotopic streams. On one level, this distinction is supported by the respective methods used by previous researchers to map tonotopic versus periodotopic representations in the cortex. From a theoretical perspective, the functional significance of this distinction is supported by the fact that tonotopic representations of pitch follow the Mel Scale (which is consistent with height but not musical pitch), whereas periodotopic information is naturally organized in musically meaningful ways (where the degree of overlap between autocorrelograms correlates with the musical overtone series). These distinctions should reinforce the importance of eschewing generic descriptors (such as “pitch processing”) in favor of more precise terminology and in carefully specifying the exact cognitive demands of each behavioral task.

7.3.2. Importance of preattentive processing

The present research was inspired by the observation that pitch was a salient feature of perception for only a small subset of the population. The preattentive processing model provides a valuable framework for interpreting this type of observation. The crux of the model is that behaviorally identifiable “features” of perception may reflect the mechanisms of low-level stimulus processing. Thus, the presence of a unique behavioral “feature” in one subset of the population suggests that a basic neurobiological difference may define this group.

Unfortunately, preattentive paradigms have been largely unexplored outside the visual domain. Yet this type of framework may be ideally suited to audition, where the dominant features of perception are ill-defined. Intuitively, one might speculate that auditory features include height, timbre, and relative pitch intervals. The present data confirm the importance of height and timbre in “normal” auditory perception. More significantly, they confirm the key tenet of our model, namely, that one subset of the population may be distinguished by the perception of a distinct preattentive feature. These data point to an innovative application of Treisman's methods. While originally intended to elaborate mechanisms of “normal” perception, they may be equally valuable for identifying or better characterizing other special populations.

Acknowledgments

This article reports results from a dissertation by David Ross presented in partial fulfillment for the Ph.D. at Yale University. The authors thank John Culling, Ingrid Olson, and Bob Schultz for their collaboration on these experiments, and Thomas C. Duffy and Tom Cantey for important musical insights.

References

[1] Baharloo S, Johnston PA, Service SK, Gitschier J, Freimer NB. Absolute pitch: an approach for identification of genetic and nongenetic components. Am J Hum Genet 1998;62:224–31.
[2] Gregersen PK, Kowalsky E, Kohn N, Marvin EW. Absolute pitch: prevalence, ethnic variation, and estimation of the genetic component. Am J Hum Genet 1999;65:911–3.
[3] Profita J, Bidder TG. Perfect pitch. Am J Med Genet 1988;29:763–71.
[4] Ward WD. In: Deutsch D, editor. The psychology of music. San Diego: Academic Press; 1999. p. 265–98.
[5] Vernon PE. Absolute pitch: a case study. Br J Psychol 1977;68:485–9.
[6] Schlaug G, Jancke L, Huang Y, Steinmetz H. In vivo evidence of structural brain asymmetry in musicians. Science 1995;267:699–701.
[7] Zatorre RJ. How do our brains analyze temporal structure in sound? Nat Neurosci 1998;1:343–5.
[8] Keenan JP, Thangaraj V, Halpern AR, Schlaug G. Absolute pitch and planum temporale. NeuroImage 2001;14:1402–8.
[9] Klein M, Coles M, Donchin E. People with absolute pitch process tones without producing a P300. Science 1984;223:1306–9.
[10] Hantz E, Crummer GC, Wayman JW, Walton P, Frisina RD. Effects of musical training and absolute pitch on the neural processing of melodic intervals: a P3 event-related potentials study. Music Percept 1992;10:25–42.
[11] Wayman JW, Frisina RD, Walton JP, Hantz EC, Crummer GC. Effects of musical training and absolute pitch ability on event-related activity in response to sine tones. J Acoust Soc Am 1992;91:3527–31.
[12] Crummer GC, Walton JP, Wayman JW, Hantz EC, Frisina RD. Neural processing of musical timbre by musicians, nonmusicians, and musicians possessing absolute pitch. J Acoust Soc Am 1994;95:2720–7.
[13] Bachem A. Absolute pitch. J Acoust Soc Am 1955;27:1180–5.
[14] Seashore CE. Psychology of music. New York: McGraw–Hill; 1938.
[15] Baharloo S, Service SK, Risch N, Gitschier J, Freimer NB. Familial aggregation of absolute pitch. Am J Hum Genet 2000;67:755–8.
[16] Gregersen PK. Instant recognition: the genetics of pitch perception. Am J Hum Genet 1998;62:221–3.
[17] Gregersen PK, Kowalsky E, Kohn N, Marvin EW. Early childhood music education and predisposition to absolute pitch: teasing apart genes and environment. Am J Med Genet 2001;98:280–2.
[18] Watt HJ. The psychology of sound. Cambridge: Univ. Press; 1917.
[19] Takeuchi AH, Hulse SH. Absolute pitch. Psychol Bull 1993;113:345–61.
[20] Miyazaki K. Musical pitch identification by absolute pitch possessors. Percept Psychophys 1988;44:501–12.
[21] Cuddy LL. Training the absolute identification of pitch. Percept Psychophys 1970;8:265–9.
[22] Levitin DJ. Absolute pitch: self-reference and human memory. Int J Computing Anticipatory Syst 1999;4:255–66.
[23] Siegel JA. Sensory and verbal coding strategies in subjects with absolute pitch. J Exp Psychol 1974;103:37–44.
[24] Zatorre RJ, Beckett C. Multiple coding strategies in the retention of musical tones by possessors of absolute pitch. Mem Cognit 1989;17:582–9.
[25] Zakay D, Roziner I, Ben-Arzi S. On the nature of absolute pitch. Arch Psychol 1984;136:163–6.
[26] Corliss ELR. Fixed-scale mechanism of absolute pitch. J Acoust Soc Am 1973;59:1737–9.
[27] Bachem A. Various types of absolute pitch. J Acoust Soc Am 1937;9:146–57.
[28] Wynn VT. Absolute pitch in humans, its variations and possible connections with other known rhythmic phenomena. Prog Neurobiol 1973;1:111–49.
[29] Ross DA, Olson IR, Marks LE, Gore JC. A non-musical paradigm for identifying absolute pitch possessors. J Acoust Soc Am 2004;116:1793–9.
[30] Ross DA, Olson IR, Gore JC. Absolute pitch does not depend on early musical training. Ann NY Acad Sci 2003;999:522–6.
[31] Ross DA. Doctoral dissertation. Yale University; 2004.
[32] Treisman A, Vieira A, Hayes A. Automaticity and preattentive processing. Am J Psychol 1992;105:341–62.
[33] Logan GD. Toward an instance theory of automatization. Psychol Rev 1988;95:492–527.
[34] Moore BCJ. An introduction to the psychology of hearing. London: Academic Press; 1989.
[35] Cariani PA, Delgutte B. Neural correlates of the pitch of complex tones: I. Pitch and pitch salience. J Neurophys 1996;76:1698–716.
[36] Cariani PA. Temporal codes, timing nets, and music perception. New Music Percept 2001;30.
[37] Langner G. Neural processing and representation of periodicity pitch. Acta Otolaryngol Suppl 1997;532:68–76.
[38] Langner G, Sams M, Heil P, Schulze H. Frequency and periodicity are represented in orthogonal maps in the human auditory cortex: evidence from magnetoencephalography. J Comp Physiol [A] 1997;181:665–76.
[39] Levitin DJ. Absolute memory for musical pitch: evidence from the production of learned melodies. Percept Psychophys 1994;56:414–23.
[40] Halpern AR. Memory for the absolute pitch of familiar songs. Mem Cognit 1989;17:572–81.
[41] Schellenberg EG, Trehub SE. Good pitch memory is widespread. Psychol Sci 2003;14:262–6.
[42] Lockhead GR, Byrd R. Practically perfect pitch. J Acoust Soc Am 1981;70:387–9.
[43] Miyazaki K. Absolute pitch identification: effects of timbre and pitch region. Music Percept 1989;7:1–14.
[44] Takeuchi AH, Hulse SH. Absolute-pitch judgments of black- and white-key pitches. Music Percept 1991;9:27–46.
[45] Sergeant D. Experimental investigation of absolute pitch. J Res Music Educ 1969;17:135–43.
[46] Bachem A. Chroma fixation at the ends of the musical frequency scale. J Acoust Soc Am 1948;20:704–5.
[47] Ross DA, Marks LE. Subjects with AP accurately encode Huggin's pitch stimuli. Submitted for publication.
[48] Cramer EM, Huggins WH. Creation of pitch through binaural interaction. J Acoust Soc Am 1958;30:413–7.
[49] Culling JF. The existence region of Huggins' pitch. Hearing Res 1999;127:143–8.
[50] Akeroyd MA, Moore BCJ, Moore GA. Melody recognition using three types of dichotic-pitch stimulus. J Acoust Soc Am 2001;110:1498–504.
[51] Siegel JA, Siegel W. Absolute identification of notes and intervals by musicians. Percept Psychophys 1977;21:143–52.
[52] Ross DA, Marks LE. Absolute pitch confers an enhanced ability to distinguish between properly and improperly tuned stimuli. In: The neurosciences and music II: From perception to performance, May 5–8, 2005, Leipzig.
[53] Tredgold AF. A text-book of mental deficiency. Baltimore: William Wood; 1937.
[54] Treffert DA. The idiot savant: a review of the syndrome. Am J Psychiatry 1988;145:563–72.
[55] Young RL, Nettelbeck T. The abilities of a musical savant and his family. J Autism Dev Disord 1995;25:231–47.
[56] Viscott DS. A musical idiot savant: a psychodynamic study, and some speculations on the creative process. Psychiatry 1970;33:494–515.
[57] Sloboda J, Hermelin B, O'Connor N. An exceptional musical memory. Music Percept 1985;3:155–70.
[58] Ross DA, Marks LE. Absolute pitch confers enhanced memory for musical sequences. In: The neurosciences and music II: From perception to performance, May 5–8, 2005, Leipzig.
[59] Paavilainen P, Jaramillo M, Naatanen R, Winkler I. Neuronal populations in the human brain extracting invariant relationships from acoustic variance. Neurosci Lett 1999;265:179–82.
[60] Paavilainen P, Simola J, Jaramillo M, Naatanen R, Winkler I. Preattentive extraction of abstract feature conjunctions from auditory stimulation as reflected by the mismatch negativity (MMN). Psychophysics 2001;38:359–65.
[61] Takegata R, Paavilainen P, Naatanen R, Winkler I. Preattentive processing of spectral, temporal, and structural characteristics of acoustic regularities: a mismatch negativity study. Psychophysics 2001;38:92–8.
