Zbikowski - Conceptualizing Music, Chapter Two

Lawrence M. Zbikowski: Conceptualizing Music: Cognitive Structure, Theory, and Analysis

CHAPTER TWO
CROSS‐DOMAIN MAPPING
About a quarter of the way through the Credo of his Pope Marcellus Mass (printed 1567),
Giovanni Pierluigi da Palestrina indulges in a marvelous bit of text painting with telling
effect. The text Palestrina sets here is “Qui propter nos homines et propter nostram
salutem descendit de cælis” (“Who for us men, and for our salvation, came down from
heaven”). As shown in example 2.1, with the first statement of the word “descendit,” each
voice begins a scalar descent. Christ's descent from heaven is thus represented with a
cascading fall through musical space, a series of overlapping movements “down” the
musical scale.
Palestrina's text painting is a striking embodiment of the conventional construal of
pitch as “high” and “low.” This way of thinking about pitch relations has a venerable
tradition. Its basic elements can be seen in Aristoxenus's description of pitch relations in
terms of two‐dimensional space, which was discussed in the introduction to this volume.
Pitches are understood as points in space, and musical intervals are reckoned in terms of
distances between these points. However, the spatial orientation crucial to Palestrina's text
painting—the correlation of the “up” and “down” of physical space with specific pitch
relations, such that a musical scale can “descend”— was seldom used in Greek music
theory. As Andrew Barker has noted, the standard Greek for what would now be called
“high‐pitched” is oxys, which meant “sharp,” “pointed,” or “keen‐edged”; its musical
opposite was barys, which meant “heavy” (but not, in opposition to oxys, “blunt”).[ 1 ] Just
when “up” and “down” consistently came to be correlated with musical pitch is unclear, but
the linkage was in place at least by the beginning of the tenth century, a good six hundred
years before Palestrina wrote his Mass. From around the tenth century, thus, musicians in
the West began writing about and depicting pitch in terms of “high” and “low,” mapping
structural relations from the domain of vertically oriented, two‐dimensional space onto the
domain of music.
EXAMPLE 2.1 Giovanni Pierluigi da Palestrina, Credo of the Pope Marcellus Mass, mm. 53–58
Perhaps more remarkable than the long tradition of construing pitch relations in
terms of “up” and “down” are the ready reminders of how arbitrary a construal it is.
Consider, for instance, “up” and “down” on the piano: how can D4 be “above” C4 on the
piano when they are both on the same horizontal plane? Think of playing the two notes on
the cello—to play the “higher” D4, we have to move our left hand “down,” so that it is closer
to the ground. In fact, there are countless reminders of the artificiality of describing pitch in
terms of “high” and “low.” And yet Palestrina's text painting seems anything but arbitrary—
there seems to be an aptness to his portrayal of the descent from heaven that goes beyond
mere traditions of depiction.
In this chapter, I explore the function of cross‐domain mappings of the sort that
underlie this small bit of text painting in the Credo of the Pope Marcellus Mass. Cross‐
domain mapping plays two important roles in musical understanding: first, it provides a
way to connect musical concepts with concepts from other domains, including those
associated with language; second, it provides a way to ground our descriptions of elusive
musical phenomena in concepts derived from everyday experience. Both of these
contribute to the establishment of relationships between concepts, relationships that are
fundamental to the prospect of theorizing about music. In the first section that follows, I
provide an introduction to the theory of cross‐domain mapping as it has been developed in
recent work by cognitive linguists and others. This introduction provides a framework for
explaining what makes mappings possible and why some mappings are more effective than
others.
Cross‐domain mapping also makes it possible to correlate the musical domain with
others, such as the domain of physical space or of gesture. Under certain circumstances,
such correlations provide the basis for rich worlds of the imagination.
Correlating musical pitches with vertically oriented, two‐dimensional space, for instance,
leads quite naturally to an imaginary world in which pitches become things that move
through space: the successive notes of a scale gradually descend and ascend; in other
passages, some notes leap, while still others fall. Within this imaginary world, each
traversal of space has a specific and unmistakable sound— that is, descent sounds one way,
ascent another. And this is not something limited to text painting of the sort demonstrated
by Palestrina, as any number of cartoon soundtracks confirm. Never mind that actual
traversals of space sound nothing like those of the hyperkinetic Roadrunner or the hapless
Wile E. Coyote; if the correlation between the domains is properly established, elements
from each will blend together to create novel relationships and elements. In the second
section that follows, I describe the process that leads to this sort of blending and show the
role conceptual blending plays in text painting and program music.
AN INTRODUCTION TO CROSS‐DOMAIN MAPPING
Cross‐Domain Mapping and Metaphor
The theory of cross‐domain mapping is a product of a generalized approach to linguistic
metaphor first taken by George Lakoff and Mark Johnson in 1980. Perhaps the most
common conception of metaphor is of a literary device, a manifestation of the figural use of
language to create colorful if imprecise images. Lakoff and Johnson accumulated a
substantial body of evidence demonstrating that metaphor was not simply a manifestation
of literary creativity but was, in fact, pervasive in everyday discourse.[ 2 ] As an example,
consider the way up and down are used to characterize emotions, consciousness, and
health:
EMOTIONS
I'm feeling up. My spirits rose. I'm feeling down. I fell into a depression. My spirits sank.
CONSCIOUSNESS
Get up. I'm up already. He rises early in the morning. He fell asleep.
HEALTH
He's at the peak of health. She's in top shape. He came down with the flu.
Each characterization suggests not a literal representation of the spatial domain
implied by the orientation up‐down but, instead, uses our knowledge of physical space to
structure our understanding of emotions, consciousness, or health.
Based on evidence provided by a large number of similar examples of the
appearance of metaphorical constructions in everyday discourse, Lakoff and Johnson
proposed that metaphor was a basic structure of understanding through which we
conceptualize one domain (the target domain, which is typically unfamiliar or abstract) in
terms of another (the source domain, which is most often familiar and concrete). Further
study has provided a wealth of empirical evidence for this proposal and contributed to the
development of the field of cognitive linguistics.[ 3 ]
Fundamental to the theory of metaphor that sprang from Lakoff and Johnson's work is
a distinction between conceptual metaphors and linguistic metaphors. A conceptual
metaphor is a cognitive mapping between two different domains; a linguistic metaphor is
an expression of such a mapping through language. For instance, the conceptual
metaphor STATE OF BEING IS ORIENTATION IN VERTICAL SPACE maps relationships in physical space
onto mental and physical states.[ 4 ] The cross‐domain mapping wrought by this conceptual
metaphor then gives rise to innumerable linguistic expressions. Some of these expressions
are commonplace, such as “Maxwell seems a bit down today.” Others summon a rich
imagistic world, such as that of John Keats's “Ode to a Nightingale”:
My heart aches, and a drowsy numbness pains
My sense, as though of hemlock I had drunk,
Or emptied some dull opiate to the drains
One minute past, and Lethe‐wards had sunk.
Here the descent to the mythical river gives a physical correlate to the narcotic state of
the narrator: the act of sinking is mapped onto a melancholy emotional state. Thus the
same conceptual metaphor (STATE OF BEING IS ORIENTATION IN VERTICAL SPACE) is behind both
linguistic metaphors, one commonplace (“Maxwell seems a bit down today”), the other
poetic.
With respect to music, the “high” and “low” used to describe pitches reflect the
conceptual metaphor PITCH RELATIONSHIPS ARE RELATIONSHIPS IN VERTICAL SPACE. This metaphor
maps spatial orientations such as up‐down onto the pitch continuum. The mapping yields a
system of metaphors replete with possibilities for describing musical pitch. We can speak
in terms of pitch contour (meaning successions of pitches, which are located at different
places in pitch‐space), gesture (meaning
successions of pitches with a specific directionality and contour), and musical space
(meaning a three‐ or four‐dimensional extension of the basic two‐dimensional
mapping).[ 5 ] This system is given graphic representation in traditional musical notation:
notes that are the result of more rapid vibrations of the sounding medium are placed
higher on the page than notes that result from less rapid vibrations (with the exception of
sharps and flats). The two‐dimensional space of the musical page thus correlates with the
spatial orientation ascribed to pitch.[ 6 ] The systematic quality that results from mapping
spatial orientations onto the pitch continuum thus leads to an entire vocabulary for
describing relationships among pitches that provides a rich set of possibilities for
furthering our conceptualization of music.
As common as conceiving of pitches as “high” or “low” seems, not all cultures describe
pitch relationships in purely spatial terms. As I noted earlier, Greek theorists of antiquity
used oxys (“sharp” or “pointed”) and barys (“heavy”) to characterize pitches. And traversing
historical distance is not the only way to discover alternative conceptualizations of pitch
relations. Consider three examples in which it is culture, rather than time, that creates
distance:
1. Steven Feld's research has shown that the Kaluli of Papua New Guinea
describe melodic intervals with the same terms they use to characterize
features of waterfalls. For instance, in Kaluli sa means “waterfall,” and
a mogan is a still or lightly swirling waterpool; sa‐mogan is the flow of a
waterfall into a level waterpool beneath it. Sa‐mogan is also used to describe a
melodic line that descends to a repeated note, the contour of which replicates
that of a waterfall flowing into a pool. In contrast, there are no specific names
for ascending intervals, which nonetheless do occur in Kaluli song.[ 7 ] Behind this
account of musical intervals is the conceptual metaphor PITCH RELATIONSHIPS
ARE WATERFALL CHARACTERISTICS, which provides the basis for a rich set of
descriptive terms that capture some aspects of melody but not others.
2. In Bali and Java pitches are conceived not as “high” and “low” but as “small”
and “large.”[ 8 ] Here the conceptual metaphor is PITCH RELATIONSHIPS ARE
RELATIONSHIPS OF PHYSICAL SIZE, a mapping that accurately
reflects the norms of acoustic production: small things typically vibrate more
rapidly than large things. This acoustic fact is represented throughout the
numerous parts of the gamelan, the collection of instruments central to the
musical practice of Bali and Java.
3. The Suyá of the Amazon basin do not have an extensive vocabulary for
describing pitch relationships. When they are described, however, it is in terms
of age: pitches are conceived not as “high” and “low” but as “young” and “old.”
The conceptual metaphor that guides this mapping is PITCH RELATIONSHIPS ARE AGE
RELATIONSHIPS, which accurately reflects the Suyá's belief that the pitch of the
voice becomes deeper with age.[ 9 ]
With each of these conceptions of pitch relationships it is easy to focus on what they
lack rather than what they offer. That is, it is natural (if not quite excusable) to reckon
difference in somewhat chauvinistic terms: that the Kaluli do not have terms to describe
ascending intervals, where we do; that the way pitch relationships are characterized in Bali
and Java does not transfer into graphic representations with the same ease as do “high” and
“low.” It has to be remembered, however, that mapping “high” and “low” onto music has its
own limitations: “high” and “low” cannot reflect the subtle play of flowing water, nor do
they provide much of an explanation for how acoustic features correlate with pitch
relationships. Such differences show that each mapping between domains makes some
conceptualizations possible, while it disables others.
Image Schema Theory

The variety of conceptual metaphors used to characterize pitch relations leads to the
question of the ultimate grounding of the process of cross‐domain mapping. Even if we
grant that we understand a target domain (such as pitch relationships) in terms of a source
domain (such as orientation in vertical space), how is it that we understand the source
domain in the first place? Mark Johnson answered this question by proposing that meaning
was grounded in repeated patterns of bodily experience.[ 10 ] These patterns give rise to
what Johnson called image schemata, which provide the basis for the concepts and
relationships essential to metaphor. An image schema is a dynamic cognitive construct that
functions somewhat like the abstract structure of an image and thereby connects together a
vast range of different experiences that manifest this same recurring structure.[ 11 ] Image
schemata are by no means exclusively visual—the idea of an image is simply a way of
capturing the organization inferred from patterns in behavior and concept formation.
As one example of an image schema, consider the verticality schema, which might be
summarized by a diagram of the sort given in figure 2.1. We grasp this structure repeatedly
in thousands of perceptions and activities that we experience every day. Typical of these
are the experiences of perceiving a tree, our felt sense of standing upright, the activity of
climbing stairs, forming a mental image of a flagpole, and
watching the level of water rise in the bathtub. The verticality schema is the abstract
structure of the verticality experiences, images, and perceptions. Our concept of verticality
is based on this schema, and this concept is in turn invoked by the various conceptual
metaphors that use vertical space as a source domain through which to structure such
target domains as emotions, consciousness, health, and musical pitch.
FIGURE 2.1 Diagram of the verticality schema
By definition, image schemata are preconceptual: they are not concepts, but they
provide the fundamental structure upon which concepts are based. In consequence, it is
important to emphasize that any diagram used to illustrate an image schema is intended to
represent the key structural features and internal relations of the schema; it is not meant to
summon a rich image or mental picture that we somehow have “in mind” and use actively
to structure our thought. More directly, whatever actually occupies our thoughts is not, by
definition, an image schema. We can conceive of image schemata, just as we can conceive of
any of a number of nonconceptual or preconceptual cognitive processes. We can also note
general patterns in the way concepts are structured, which can be attributed to image
schemata. However, there are, by definition, no “image‐schema concepts.”
The relationship between the verticality schema and our characterization of musical
pitch with reference to the spatial orientation up‐down is fairly immediate: when we
make low sounds, our chest resonates; when we make high sounds, our chest no longer
resonates in the same way, and the source of the sound seems located nearer our head. The
“up” and “down” of musical pitch thus correlate with the spatial “up” and “down”—the
vertical orientation—of our bodies. The verticality schema offers a straightforward way to
explain why we characterize musical pitch in terms of high and low even when the actual
spatial orientation of the means through which we produce pitches either does not
reinforce the characterization or runs directly counter to it.
At present, the image schema remains largely a theoretical construct. Work across a
variety of fields, however, has made strong arguments for the importance of such a
construct, including that by Leonard Talmy in linguistics, Gerald Edelman in neuroscience,
David McNeill in psychology, and Raymond W. Gibbs Jr. and Herbert L. Colston in
psycholinguistics.[ 12 ] Recently, Lawrence Barsalou and his associates have
developed a comprehensive theory of mental representation based on perceptual symbols
(which are analogical structures similar to image schemata) and have begun to provide
empirical work to support the theory.[ 13 ] Antonio Damasio, working from a
neurophysiological perspective, has made compelling arguments for the importance of the
body to consciousness and thought.[ 14 ] Although research on image schemata and similar
structures is still preliminary, it is also highly promising and offers some of the best
prospects for solving some of the problems of the relationship between mind and body that
have dogged cognitive research throughout this century.
The Invariance Principle
The theory of image schemata provides a way to explain how conceptual metaphors are
grounded. It does not, by itself, explain why some conceptual metaphors seem intuitively
better than others. For instance, the conceptual metaphor PITCHES ARE FRUIT could provide
the grounding for such expressions as “You must play the first note more like an apple, the
second more like a banana.” Although such mappings are possible, they are certainly not
common. Pitches and fruits just do not seem to be a good match.
To account for why some metaphorical mappings are more effective than others,
George Lakoff and Mark Turner proposed that such mappings are not about
the imposition of the structure of the source domain on the target domain, but are instead
about the establishment of correspondences between the two domains. These
correspondences are not haphazard, but instead preserve the image‐schematic structure
latent in each domain. Lakoff and Turner formalized this perspective with the Invariance
Principle, which Turner states as follows: “In metaphoric mapping, for those components of
the source and target domains determined to be involved in the mapping, preserve the
image‐schematic structure of the target, and import as much image‐schematic structure
from the source as is consistent with that preservation.”[ 15 ] According to the Invariance
Principle, mapping the spatial orientation
up‐down onto pitch works because of correspondences between the image‐schematic
structure of components of the spatial and acoustical domains. Both space and the
frequency spectrum are continua that can be divided into discontinuous elements. In the
spatial domain, division of the continuum results in points; in the acoustic domain, it
results in pitches. Mapping up‐down onto pitch allows us to import the concrete
relationships through which we understand physical space into the domain of music and
thereby provide a coherent account of relationships between musical pitches. Mapping
various fruits onto musical pitches works less well because fruit does not (in any ordinary
way) constitute a continuum. To employ this mapping is to highlight instead both the
discontinuity among musical pitches and how they are unlike one another (an emphasis on
difference suggested by the idiomatic phrase “like apples and oranges”).
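The contrast between the two mappings can be made concrete with a small sketch. It is only an illustration under stated assumptions, not part of the theory itself: the frequencies are approximate equal-tempered values, the `height` function (semitones above a reference pitch) is one assumed way of modelling "vertical position," and the fruit assignments are hypothetical. The point is that a continuum as source domain carries over both order and distance, whereas a set of unordered categories carries over only sameness and difference.

```python
import math

# Approximate equal-tempered frequencies in Hz (A4 = 440); assumed for illustration.
PITCHES = {"C4": 261.63, "D4": 293.66, "E4": 329.63, "G4": 392.00}

# Hypothetical PITCHES ARE FRUIT mapping: labels with no meaningful order or distance.
FRUIT = {"C4": "apple", "D4": "banana", "E4": "cherry", "G4": "grape"}


def height(freq_hz, ref_hz=261.63):
    """Map a frequency to a 'vertical position': semitones above the reference pitch."""
    return 12 * math.log2(freq_hz / ref_hz)


def order_preserved(names):
    """True if ordering by frequency and ordering by height give the same sequence."""
    by_freq = sorted(names, key=lambda n: PITCHES[n])
    by_height = sorted(names, key=lambda n: height(PITCHES[n]))
    return by_freq == by_height


def distance(a, b):
    """Spatial distance between two pitches, rounded to the nearest semitone."""
    return round(height(PITCHES[b]) - height(PITCHES[a]))


def fruit_relation(a, b):
    """The only relation the fruit mapping supports: same or different."""
    return "same" if FRUIT[a] == FRUIT[b] else "different"
```

With the spatial mapping one can ask whether D4 lies above C4 and by how much (`distance("C4", "D4")`); with the fruit mapping one can ask only whether two notes are alike (`fruit_relation("C4", "D4")`).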
Cross‐Domain Mapping and Conceptual Models
According to current theory, cross‐domain mappings are grounded in repeated patterns of
embodied experience called image schemata. These schemata provide the basic structure
employed in the mappings: the verticality schema is thus fundamental to our
understanding of two‐ or three‐dimensional spaces as having directionality and of musical
pitch as “high” and “low.” Image schemata also constrain the possibilities for mapping
between two domains, a constraint reflected in the Invariance Principle. Because
the verticality schema can be applied to both the spatial and the musical domains, we can
use our understanding of the former to structure our understanding of the latter. It
remains to be explained why one mapping would be preferred over another—why, for
instance, we tend to describe pitch relations in terms of “high” and “low” rather than
“small” and “large.”
On the face of it, both of these mappings are equally viable. Both draw on aspects of
our embodied experience: on the one hand, countless experiences with the seeming origin
of our own voices; on the other hand, countless experiences with the sounds given out by
physical objects in the world around us. Both mappings allow us to describe musical
pitches as elements within a continuum. And each mapping can be easily understood from
the perspective provided by the other. For instance, in Camille Saint‐Saëns's Le Carnaval
des animaux, the music for the elephant is played by the contrabass (large is low), that for
the birds by the flute (small is high).[ 16 ] Although musicians educated in the West sense the
novelty of the mapping, they can nonetheless understand it perfectly well. In a similar
fashion, musicians from Bali and Java, when confronted with Western conventions for
notation, have few if any problems translating “small” and “large” into “high” and “low.”[ 17 ]
The reason we prefer one mapping over another has to do with the global conceptual
models we absorb from culture and that supply crucial support for the preferred
mapping.[ 18 ] In the West, the description of pitch relations in terms of “up” and “down”
arose around the same time musicians began to develop ways of notating polyphonic
compositions. These notational systems often relied, either directly or indirectly, on the
physical placement of symbols on the page.[ 19 ] The attribution of “high” and “low” to musical
pitches is thus correlated with a system of notation that permitted both the visualization
and the preservation of musical works. In turn, this notation relied on a global model that
made three important assumptions: first, pitches could be regarded as objects that were
independent of the sound source that produced them; second, graphical symbols could be
used to represent these pitch‐objects; and third, the surface on which these symbols
appeared was analogous to physical space. In Bali and Java, the performance of music was
associated with the rich palette of instruments through which the music was effected. Thus
the attribution of “small” and “large” to pitches correlated with characteristics of the
musical instruments intrinsic to musical performance. The conception of musical pitches as
physical objects relies on a global model that does not, at some fundamental level,
disassociate a pitch from the sound source that produces it.[ 20 ]
As noted earlier, the characterization of pitch relations can be informed by mappings
other than “high” and “low” and “small” and “large.” In each case, the basic mapping relies
on embodied knowledge and on the correlation of the musical domain with a more
concrete domain. The specific mapping chosen within a tradition of discourse about music
reflects not so much absolute musical structure as it does the broader cultural practice
within which music and its understanding are embedded: mappings reflect the conceptual
models that are important to culture. The cross‐domain mappings employed by any theory
of music are thus more than simple curiosities—they are actually key to understanding
music as a rich cultural product that both constructs and is constructed by cultural
experience.
Cross‐Domain Mapping, Music, and Music Theory
Palestrina's Text Painting
The theory of cross‐domain mapping outlined in the preceding
discussion provides a relatively simple way to account for the effectiveness of Palestrina's
text painting. According to this theory, we use the verticality schema, the product of
countless bodily experiences, to give physical space an “up” and a “down” analogous to the
“up” and “down” of our bodies. This orientation is then mapped onto a metaphorical
musical space made up of pitches and the relations (or distances) between them. Pitches
that result from more rapid vibrations of the sounding medium are regarded as “higher”
than pitches that result from less rapid vibrations of the sounding medium. As each of the
voices in Palestrina's six‐part texture takes up the word “descendit,” it begins to “descend”
through musical space. The notion of descent summoned by the text is thus given sounding
image by a specific series of musical events.
Of course, we could also hear (through the ears of a Javanese musician) the pitches
growing larger, or hear (through the ears of the Suyá) the pitches growing older, or hear
(through the ears of a Greek theorist) the pitches growing heavier. Distinct from these
ways of hearing, Palestrina's text painting relies on a conceptual model that characterizes
pitches as objects in two‐dimensional, vertically oriented space, in which “up” and “down”
describe specific pitch relations. This model is strongly associated with the conventions of
musical notation as developed in the West: as pitches get “lower” in sound, they are also
written “lower” on the musical staff.
There is an additional reason why Palestrina's text painting is as convincing as it is,
however. The image of descent created by Palestrina relies on more than just the general
notion of descent, which could have been evoked in a variety of ways: by a downward leap
of a single interval; by a short sequence that alternated descending thirds with ascending
seconds; or through having each voice enter in turn, beginning with the highest voice and
ending with the lowest. The scalar descent chosen by Palestrina, however, provides a
striking analogue for the descent of our bodies through physical space (when this descent
is unaided by artificial means). Such a descent involves a lessening of potential energy and
a continuous action in one direction, articulated by the regular transfer of weight from one
leg to another.[ 21 ]
The act of singing a descending scale correlates well with the basic structure of this event:
the relaxation many singers feel as they sing a descending scale matches the lessening of
potential energy; the temporary pauses on each note of the scale match the regular transfer
of weight, which articulates a physical descent. The text painting of “descendit” is thus
supported by our embodied knowledge of descent, as well as by the conventions of
ascribing “high” and “low” to musical pitches.
In its exploitation and enrichment of a mapping between the domains of physical
space and music, Palestrina's text painting also gives us a glimpse into a process of meaning
construction. Conventionally, text painting is understood to operate through a crude sort of
mimesis: physical descents are represented by musical descents. By contrast, I have argued
that hearing a succession of musical events as a descent is an act that is thoroughly
mediated.[ 22 ] What mimesis there is is highly conditioned by the choice of cross‐domain
mappings through which discourse about music is structured; in turn, these mappings
reflect the global models of a given cultural perspective and historical moment. Palestrina's
text painting is not just woven into this web of meaning construction; it also spins its own
threads. Some result from specific features of physical descents summoned by the passage,
some from the point within the larger musical and dramatic discourse at which this striking
moment occurs, and some from the sonic attributes that get mapped back onto the notion
of physical descent. The meaning constructed is not, in the final analysis, simple or direct
but multivalent and contingent, and it reflects the rich set of correspondences activated by
mapping between the two domains.
Cross‐Domain Mapping and Music Theory
Because it provides a way to bring an
integrated system of terms and structural relations to bear on problems of musical
understanding, cross‐domain mapping plays an important part in theories of music. Indeed,
every theory of musical organization employs cross‐domain mappings of one sort or
another. Often, the appeal of such mappings is strong, and the mappings seem intuitively
correct (much as “high” and “low” seem intuitively correct for the characterization of pitch
relations). Further investigation, however, reveals that these mappings are every bit as
mediated as those that are less systematically developed.
Rudolf Louis and Ludwig Thuille's characterization of tonality in
their Harmonielehre provides a case in point. The relevant passage, which appears at the
beginning of the first chapter, runs as follows:
The unity within the diversity of all harmony is ensured through the law of tonality. This
asserts that any succession of harmonies can become musically understandable only when
each (independently appearing) chord is perceived to be in a specific relationship of
dependence with respect to the principal chord that underlies the entire harmonic context.
Both melodic and harmonic succession require a center, a point of repose, around which
everything twists and turns. For melodic relationships, this stationary middle point is
the tonic note, the Tone, whose central place in the scheme of the scale is made manifest by
the melodic motions that we direct away from tonic and
then back toward it again. The same role of a fixed middle point and resting place that,
within melodic unity, falls to the tonic note, is played, with respect to harmony, by the tonic
chord: the consonant triad that has the tonic note as its root.[ 23 ]
The view set forth by this characterization is that musical organization is analogous to
that of the physical world. Music consequently obeys laws that are independent of human
beings—“tonality” is, as it were, a natural force.[ 24 ] Musical understanding relies on
apprehending the harmonic dependencies that reflect these immutable laws. The focal
point of these dependencies is the center of the system—either the tonic note or the tonic
chord—which acts as both a center of gravity and an axis of symmetry.[ 25 ] To support this
perspective, Louis and Thuille appeal to the empirical evidence provided by actual melodic
and harmonic progressions, which they believe show that every properly constructed
melody turns around a single tonic note, and that every properly constructed harmonic
progression turns around a single tonic chord.
Louis and Thuille's characterization of tonality does a marvelous job of capturing the
compelling coherence wrought by well‐composed tonal music. Given the evidence of
ethnographic and historical studies of musical cultures, however, the idea that Western
European tonality is a naturally occurring force seems doubtful, as would be the notion of a
science aimed at discovering the laws behind this force. Also implicit in Louis and Thuille's
theory is the concept that tonality exists apart from musical syntax: successions of musical
events do not give rise to the law of tonality but serve only to provide evidence of its
existence. According to this perspective, successions of musical events that do not provide
evidence of tonal relations cannot be understood as music. Of course, such a view places
rather profound restrictions on what counts as music.
Theorizing about music requires that we bring order, even if of a tenuous sort, to an
ephemeral and often intangible domain. Cross‐domain mapping aids this process by
bringing systems of relationships to bear on the musical domain and by
correlating rich, if at times complicated, images with essential musical concepts. Such
mappings are a way to both structure our understanding and extend it—indeed, much
work in music theory is devoted to exploring the ramifications of these mappings.
Nonetheless, each mapping reflects the cultural values and imperatives relative to which it
is framed, not the verities of absolute musical structure. Tonality is not, in a simple way, a
centric and symmetrical system, any more than it is a reflection of logic (as Hugo Riemann
would have it), psychological energetics (as Ernst Kurth would have it), or gravity (as
Arnold Schoenberg would have it).[ 26 ] Instead, tonality is a way of understanding musical
organization. The cross‐domain mappings through which notions such as tonality are
articulated are not simply essential to theories of music; in fact, they constitute what counts
as music in the first place.
Summary
Cross‐domain mapping is a general cognitive process through which we structure an
unfamiliar or abstract domain in terms of one more familiar or concrete. Cross‐domain
mapping plays two important roles in musical understanding. First, it provides a way to
connect musical concepts with concepts from other domains. As we saw here, pitch
relations within the domain of music have been connected with concepts associated with
vertical space, waterfalls, physical size, and human aging. Each such mapping made
possible systematic accounts of the ways pitches related to one another. Second, cross‐
domain mapping allows us to ground our descriptions of elusive musical phenomena in
concepts derived from everyday experience, since the structural relations basic to cross‐
domain mapping have their source in repeated patterns of bodily experience—that is, in
image schemata.
As we have seen, the mappings we use to structure our discourse about music are not
accidental but reflect two constraints. One constraint is the Invariance Principle, which
proposes that the best cross‐domain mappings are those that preserve as much of the
image‐schematic structure of both target and source domains as possible. The other
constraint is provided by the global conceptual models relative to which cross‐domain
mappings are framed. Taken together, these constraints suggest that cross‐domain
mappings not only provide a way to structure our understanding of music but also shape
our ideas about what we include under the rubric “music.” For an element to count as
musical, it must be able to serve as a target for the cross‐domain mappings that guide our
discourse about music.
Because cross‐domain mapping offers a way to connect what are often elusive musical
concepts with concepts from more concrete domains, and because these connections give
rise to integrated systems of terms and relations, cross‐domain mapping is essential to our
theorizing about music. Indeed, as we shall see in the
next chapter, the concepts derived from processes of categorization and the relations
created by cross‐domain mapping provide the basic materials for music theory.
CONCEPTUAL BLENDING
Palestrina's text painting relies on our understanding of pitches as “high” and “low” and on
our experiences with physical descents down staircases, slopes, and hillsides to create a
vivid aural representation of the text. However, the imaginary world summoned by
Palestrina also extends beyond the immediate bounds of text and music. Using the basic
correlation between text and music as a point of departure, we can enter an imaginary
domain in which the pitches to which “descendit” is sung become objects descending
through musical space. Within this domain, every physical descent is accompanied by
sounds that result from a smooth transition from very rapid to less rapid vibrations of the
sounding medium. This extension of Palestrina's imaginarium cannot be predicted simply
from linking the domain of text with the domain of music. It results instead from blending
elements and events from these two domains to create a new one with its own structures
and relations, a domain populated by such things as pitch‐objects and the sound of descent.
It must be admitted that an imaginary domain populated with pitch‐objects moving
through musical space is a somewhat rarefied one. Even so, the process of conceptual
blending through which it comes about is itself exceedingly common. For instance,
concepts about humans and concepts about animals are often brought together in
children's stories to produce talking animals. Such creatures are powerful devices in
storytelling, for they offer narrative possibilities beyond those offered by characters with
only human or animal attributes. In a like fashion, the combination of musical concepts
with those from other domains creates possibilities for meaning construction that reach far
beyond those of music alone. As an introduction to a methodology for exploring such
conceptual blends in greater detail, let us turn to one of the talking animals from recent
literature and discover what he can tell us about the process behind the imaginary world
summoned by Palestrina's text painting.
Talking Animals and Conceptual Integration Networks

The Old Grey Donkey, Eeyore, stood by himself in a thistly corner of the forest, his front feet
well apart, his head on one side, and thought about things. Sometimes he thought sadly to
himself, “Why?” and sometimes he thought, “Wherefore?” and sometimes he thought,
“Inasmuch as which?”—and sometimes he didn't quite know what he was thinking about.
So when Winnie‐the‐Pooh came stumping along, Eeyore was very glad to be able to stop
thinking for a while, in order to say “How do you do?” in a gloomy manner to him.[ 27 ]
Thus A. A. Milne introduces a new character into the stories he crafted for his son, in
this case a character built around a stuffed toy donkey Christopher Robin had
received as a Christmas present. Eeyore is, unquestionably, a donkey, fond of thistles and
content to stay outside in all sorts of weather. But Eeyore is also endowed with human
characteristics: he is able to talk, to be perversely gloomy, to work his own convoluted
chains of logic, and, on occasion, to be capable of exquisite irony. In fact, Eeyore is a blend
of some of the concepts associated with donkeys and with humans.
In order to study conceptual blends such as that represented by Eeyore, the
rhetorician Mark Turner and the linguist Gilles Fauconnier developed the notion
of conceptual integration networks (CINs).[ 28 ] Each CIN consists of at least four
circumscribed and transitory domains called mental spaces. Mental spaces temporarily
recruit structure from more‐generic conceptual domains in response to immediate
circumstances and are constantly modified as our thought unfolds.[ 29 ] For instance, Milne's
sketch of Eeyore sets up two closely related mental spaces. The first is that of the Old Grey
Donkey—solitary, graceless, and phlegmatic. The second is that of a somewhat morose and
plodding intellect, tangled in its own thoughts and happy enough to leave them behind at
the first opportunity for social interaction. Aspects of these two spaces are combined in a
third space, producing the character of Eeyore. Turner and Fauconnier use CINs to
formalize the relationships between the mental spaces involved in a conceptual blend, to
specify what aspects of the input spaces are imported into the blend, and to describe the
emergent structure that results from the process of conceptual blending.
The CIN for the conceptual blend used by Milne is diagrammed in figure 2.2. Its
network involves four interconnected mental spaces, which are shown as circles. Central to
the network are two correlated input spaces, the “donkey” space and the “human” space.
The solid double‐headed arrow linking these two spaces indicates
FIGURE 2.2 Conceptual integration network (CIN) for the anthropomorphic blend used for
A. A. Milne's Eeyore
that elements within them serve as structural correlates: person is correlated with donkey,
and gloomy disposition is correlated with slow‐moving. Guiding the process of mapping
between these domains is the generic space, which defines the core cross‐space mapping
and basic topography for the CIN.[ 30 ] Throughout this network, beings are mapped onto
beings, and character traits onto character traits. Guided by the conceptual framework
provided by the generic space, structure from each of the input spaces is projected into the
fourth space, called the blend, which results in the anthropomorphic character of Eeyore.
The mapping is only partial, however, reflecting the limitations imposed by the generic
space. Since the generic space does not map between physical characteristics, we do not
expect Eeyore to be of human appearance or to be human‐sized.[ 31 ]
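The partial character of this mapping can be made concrete with a small sketch. What follows is only an illustration, not Turner and Fauconnier's formalism: the class `MentalSpace`, the functions `cross_space_mapping` and `unmapped_roles`, and the role labels are all hypothetical names introduced here, and the contents of the two input spaces are an assumed encoding of figure 2.2. The generic space is reduced to a list of roles it licenses; only elements filling those roles are paired across the input spaces and carried into the blend.

```python
from dataclasses import dataclass


@dataclass
class MentalSpace:
    """A small, temporary bundle of role -> element assignments (hypothetical model)."""
    name: str
    elements: dict


def cross_space_mapping(generic_roles, space_a, space_b):
    """Pair elements of the two inputs, but only for roles the generic space licenses."""
    return {
        role: (space_a.elements[role], space_b.elements[role])
        for role in generic_roles
        if role in space_a.elements and role in space_b.elements
    }


def unmapped_roles(generic_roles, space):
    """Roles present in an input space that the generic space leaves uncorrelated."""
    return sorted(set(space.elements) - set(generic_roles))


# Assumed encoding of figure 2.2; the wording of roles and elements is illustrative.
generic_roles = ("being", "character trait")
donkey = MentalSpace("donkey", {
    "being": "donkey",
    "character trait": "slow-moving",
    "physical appearance": "four legs, hooves",
})
human = MentalSpace("human", {
    "being": "person",
    "character trait": "gloomy disposition",
    "physical appearance": "human form and size",
})

mapping = cross_space_mapping(generic_roles, donkey, human)
# The blend fuses each pair of counterparts into a single element.
blend = MentalSpace("Eeyore", {role: " + ".join(pair) for role, pair in mapping.items()})

print(mapping)
print(unmapped_roles(generic_roles, human))
print(blend.elements)
```

Because "physical appearance" is not among the roles supplied by the generic space, it is never paired across the inputs, which is the sketch's way of registering that nothing in the network leads us to expect a human‐sized or human‐looking Eeyore.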
The dashed arrows linking the generic space to the input spaces, and the input spaces
to the blended space, indicate the directions in which structure is projected:
from the generic space to the input spaces, and from the input spaces to the blended space.
The arrows are double‐headed because, under certain circumstances, structure may also be
projected from the blended space back into the input spaces, and from the input spaces
back into the generic space. The idea of anthropomorphic animals that emerges in the
blended space may thus influence the way we think about “regular” animals, leading us to
talk to them and act toward them as though they had human characteristics. Similarly, our
experiences with actual animals and people give life to the abstractions of the generic
space. The double‐headed arrows also serve as a reminder of the limitation of all of the
diagrams of CINs I shall use: mental spaces are dynamic structures, as are the CINs that are
built from them. Thus figure 2.2 represents a sort of analytical snapshot of this particular
network, framed with the intent of capturing its essential features, but making no claim to
exhausting the possibilities for description. Hints about how the CIN and its spaces may
develop can be gleaned from the diagram, but a full account would require a series of such
snapshots.
Although the diagram given in figure 2.2 is standard in the literature on mental spaces,
it can lead to two misunderstandings. First, since the blend is at the bottom of the diagram,
it gives the impression that concepts precipitate down into the blend. Second, the function
of the generic space can be a bit confusing, since it does not seem to be directly involved in
the blend. In the interests of clarifying these points I offer figure 2.3, which represents the
essential components of a four‐space CIN in a slightly different format. Here the generic
space is properly represented as both the background and the foundation for the entire
network. The two input spaces are concrete representations of the abstract structure
represented by the generic space, and the conceptual blend is a further projection from
these.
An important aspect of the topography of CINs is the basic logic established by the
generic space. For the CIN of figure 2.2, the assumption is that beings visibly manifest their
character traits. Old donkeys are slow‐moving and balky; morose humans are given to
gloomy pronouncements on the state of the world; and Eeyore talks and acts like the
clumsy, gloomy character he is. The topography of the network also guides three
operations—composition, completion, and elaboration—that produce emergent structure
unique to the blend.
Composition puts together elements from the input spaces to create new entities in the
blended space, yielding the character of Eeyore: although donkeys cannot actually think
and talk, and humans do not have four legs or hooves, Eeyore has all these
traits. Completion extends the image suggested by the initial mapping from the input
spaces, drawing on our background knowledge of the circumstances summoned by the CIN.
For instance, we know that gloomy characters expect the worst of situations. Eeyore can
thus be relied on for solemn pronouncements of impending disaster on even the sunniest
of days, to greet each calamity as confirmation of his estimation of the world, and to eat his
thistles with little sign of relish. Elaboration is a more extensive operation than completion
is; it develops the structure of the blended space by building on the principles and logic
evinced by the blend. In effect, the input spaces decrease in importance and the focus is
directed toward the rich imaginary possibilities of the blended space. At this point, we can
start to write
FIGURE 2.3 Alternative representation of a four‐space CIN
our own stories about the Old Grey Donkey. Were we to place Eeyore aloft in an airplane,
we could be sure he would profess no enjoyment of the view but would instead focus on
the various aeronautical catastrophes that, from his perspective, were almost certainly
imminent. Given Eeyore's clumsiness and the awkwardness of his hooves, we would be
much less likely to imagine him flying the plane, leaving this to the more dexterous animals
or to Christopher Robin.
Although the conceptual blend that yields anthropomorphic animals is important to
Milne's story, it is not the only blend in evidence. Because Eeyore originated as a stuffed
animal, he retains some of the characteristics of the species. Accordingly, in the story in
which he is introduced, he loses his tail only to have it recovered by Winnie‐the‐Pooh and
nailed back on by Christopher Robin, a sequence of events unlikely were he based solely on
real donkeys. This points to yet another conceptual blend, in which one of the input spaces
is structured around stuffed, rather than real, animals (and one that I shall not pursue in
the present discussion).
A. A. Milne's Eeyore and most of the other characters from the Winnie‐the‐Pooh
stories provide clear examples of conceptual blending. As indicated, however, the process
of blending is not restricted to children's stories. On the one hand, conceptual blending is
also common in everyday discourse. Witness statements such as “The car is being stubborn
today—I just can't get her to start,” which blends concepts related to the physical
properties of inanimate objects and those related to the behavior of humans. On the other
hand, conceptual blending also has its place in literature. Consider Proust's description of
one feature of the springtime walks along the “Méséglise way” in Combray:
We would leave town by the road which ran along the white fence of M. Swann's park.
Before reaching it we would be met on our way by the scent of his lilac‐trees, come out to
welcome strangers. From amid the fresh little green hearts of their foliage they raised
inquisitively over the fence of the park their plumes of white or mauve
blossoms, which glowed, even in the shade, with the sunlight in which they had bathed.[ 32 ]
Here Proust blends features of the lilac trees with actions of which only animals or
humans are capable, thus creating a memorable image that perfectly captures the intense
sensual engagement prompted by spring.
Conceptual blending is a pervasive and often transparent cognitive process. A given
situation or story may involve any number of conceptual blends, as the character of
Eeyore—constructed from the attributes of both stuffed and real animals—shows. Each
such blend can be described through a conceptual integration network, which permits a
systematic description of specific features of the blend. Finally, the CIN associated with the
character of Eeyore can be used to explain why the Old Grey Donkey has some, but not all,
of the features of humans, and why we can easily imagine him taking part in some activities
but not in others.
Conceptual Blends and Text Painting

In the conceptual blends just discussed, the input spaces for each CIN were summoned by
language. As Fauconnier has noted, however, mental spaces are very general and are
constructed for many cognitive purposes; language is but one way to prompt the
construction of a mental space.[ 33 ] Under certain circumstances, music can also prompt
space construction. In the case of Palestrina's text painting, the mental space is relatively
circumscribed and focuses on an orderly progression of pitches that lead resolutely toward
a cadence. This space is in correspondence with that set up by the text, which focuses on
Christ's movement from the heavenly to the mundane and the lessening of potential energy
associated with physical descents.
Correlations between the musical and textual spaces involved in Palestrina's text
painting set up the CIN shown in figure 2.4. The generic space for the CIN is structured
around the notion of elements that are in directed relations conceived with respect to a
teleological framework. Within the network, motions through physical and musical space
are directed motions. As with the conceptual blend associated with the character of Eeyore,
the emergent structure of the blended space can be described in terms of the operations of
composition, completion, and elaboration.
Composition puts musical pitches together with the act of descent to yield pitch‐
objects that descend. Composition also associates descent with a specific sound, which
actual descents may or may not produce. Completing this image, we might infer that the
lower the pitch, the closer it is to the ground and to the mundane. We could also infer that,
since descent can be given a sound, ascent can also be given a sound. Palestrina in fact
confirms the latter inference in mm. 92–93 of the
FIGURE 2.4 CIN for Palestrina's text painting
Credo, when, as shown in example 2.2, he sets “ascendit” with steadily rising pitches.
Elaborating the blend, we might imagine pitch‐objects doing all sorts of things, not just
ascending and descending. Something like this is behind Louis and Thuille's
characterization of tonality as a system of forces operating on pitches distributed
throughout the field of musical space.
EXAMPLE 2.2 Giovanni Pierluigi da Palestrina, Credo of the Pope Marcellus Mass, mm. 92–94
The extension of the conceptual blend suggested by mm. 92–93 of the Credo
notwithstanding, Palestrina's text painting is a relatively isolated instance of meaning
construction involving music and language. Indeed, text painting is usually regarded as a
clearly circumscribed compositional technique, most often restricted to a single word or
image. Under certain circumstances, however, music and language can combine to create
rather more extended possibilities for meaning construction, as is shown by an instance of
text painting in Giaches de Wert's seven‐part madrigal “Tirsi morir volea,” first printed in
1581.
The text for Wert's madrigal is from a poem by Giovanni Battista Guarini that is
probably the most popular madrigal text of the late sixteenth century. It recounts a sexual
encounter between a shepherd and a nymph.[ 34 ] The passage relevant to discussion here
appears in example 2.3; the text for the entire poem, with the verses that appear
in example 2.3 underlined, is as follows:
TIRSI MORIR VOLEA                       THYRSIS WISHED TO DIE
Tirsi morir volea,                      Thyrsis wished to die,
Gli occhi mirando di colei ch' adora,   gazing at the eyes of his beloved,
Quand'ella, che di lui non men ardea,   when she, whose ardor equaled his,
Li disse: “Ahimè, ben mio,              said: “Alas, my love,
Deh non morir ancora                    do not die yet,
Che teco bramo di morir anch' io.”      since I long to die with thee.”
Frenò Tirsi il desio,                   Thyrsis curbed the desire,
Ch'ebbe di pur sua vita allor finire,   which by then had almost ended his life:
E sentea morte, e non potea morire;     he felt death near, yet could not die;
E mentre il guardo suo fisso tenea      and while he kept his gaze
Ne' begli occhi divini,                 fixed upon those eyes divine,
E 'l nettar amoroso indi bevea,         and drank from thence the nectar of love,
La bella Ninfa sua, che già vicini      his pretty Nymph, who felt
Sentea i messi d'Amore,                 Love's heralds near,
Disse con occhi languid' e tremanti:    said with languishing and trembling looks:
“Mori, cor mio, ch'io moro.”            “Die my heart, for I die.”
Cui rispose il Pastore:                 At which the Shepherd replied:
“Ed io, mia vita, moro.”                “And I, my life, die.”
Così moriro i fortunati amanti,         Thus the happy lovers died,
Di morte sì soave e sì gradita,         a death so sweet and pleasant,
Che per anco morir tornaro in vita.     that in order to die again, they returned to life.
The text painting begins in mm. 31–32 when the word “tremanti” is set to a written‐
out ornament (a trill or gruppo) in all parts. “Tremanti” (“trembling”) sets up a mental
space focused on an intense and partially involuntary physical reaction to stress that
produces repeated oscillating motions. The written‐out ornament sets up a mental space
focused on the rapid alternation of a “minor chord” with E♭3 in the lowest voice, and a
“major chord” with F♯4 in the highest (which creates an
EXAMPLE 2.3 Giaches de Wert, “Tirsi morir volea,” mm. 28–39
FIGURE 2.5 CIN for mm. 31–32 of Wert's “Tirsi morir volea”
augmented‐second cross‐relation).[ 35 ] Similarities between these spaces suggest
correlations between their elements: each chord in the musical space correlates with an
end‐point of the physical motion summoned by the textual space, and the intensity of the
physical reaction correlates with the rapid alternation of major and minor chords and the
cross‐relation between the outer voices.
It is the correlations between these spaces that set up the CIN shown in figure 2.5. The
generic space for the CIN is structured around the idea that changes of disposition
represent motion.[ 36 ] Physical trembling of the sort summoned by “tremanti” is one
instantiation of this notion, and the rapid alternation of musical materials is another. Once
again, the emergent structure of the blend can be described in terms of the operations of
composition, completion, and elaboration. Composition combines musical pitches with
specific features of trembling, so that the pitches summon a sense of physical motion even
though the sound‐source producing them remains relatively fixed: the singers of Wert's
madrigal need not actually tremble when they sing “tremanti.” Composition also suggests
that trembling has a sound,
even though most actual trembling is done in relative silence. If we complete the pattern
suggested by the basic mapping, we can infer that once the physical action stops the sound
will stop. Finally, through elaboration we can imagine that successions of restricted groups
of pitch materials could depict intense physical actions more complex than trembling,
perhaps even extending to the interaction of physical bodies.
Although the elaboration of a blended space is often left to our imagination, at times it
may be explicitly developed. This is indeed what Wert does in the measures immediately
following the setting of “tremanti.” Here are some of the salient features of the music of
mm. 33–38:
• Starting with the pickup to m. 33, the women's and men's voices engage in a
rapid alternation of entrances.
• The beginnings and endings of the entrances are elided, so that the singing is
seamless.
• The spacing between entrances is gradually compressed, culminating in the
joining of men's and women's voices in mm. 37–38; in fact, this is the first time
in the madrigal that the two vocal groups sing together.
• The setting is almost entirely homophonic within each group of voices.
• The harmonic material is highly restricted, consisting almost entirely of triadic
sonorities built on G, D, and F.
• In m. 38, the music breaks off, and there is a half measure of silence.
The alternation of restricted groups of pitch material, the charged interaction of the
vocal groups, the sudden breaking off of the voices in m. 38, and the context provided by the
blend produced by the text painting of “tremanti”—all summon an image of intense
physical activity followed by a sudden suspension of that activity. It is an image that gives
sounding representation to the sort of death—sexual orgasm—about which the poem
obsesses. This image also offers a way to ground the double entendre exploited by Guarini,
built on the paradoxical connection between life at its most intense and death. Both sexual
climax and death are marked by sharp, nearly immediate contrasts: orgasm followed by
quiescence; the clamor of life followed by the silence of death. Wert's music, building from
the common ground shared by sexual climax and death, breathes life into a play of words
that, by this point in the sixteenth century, had become a commonplace.[ 37 ]
It is important to emphasize that not all combinations of text and music will yield
rewarding conceptual blends. For a blend to occur, there not only needs to be some
correlation between two domains, but the domains must also have a uniform topography.
In some cases—certain strophic songs, for instance—the correlation between text and
music is simply too general to generate a compelling blend. In
other cases—formulaic songs composed for opera or reflecting the more commercial side
of popular music—the correlation between text and music may be so tenuous as to be
virtually nonexistent. Also important are the conceptual models that inform the
interpretation of the correlation between the input spaces for a blend. A careful listener
who is nonetheless unaware of the context of Palestrina's music, for example, will almost
certainly realize that something is going on in the passage cited in example 2.1 but may
interpret the music as “sad” or “losing energy.” If conceptual models associated with the act
of descent, or with this particular portion of the Mass, are not activated, the listener may
get no farther than this. Similarly, Wert's climactic moment, for a listener who does not
understand the text (either for lack of knowledge of Italian or out of innocence of the
double entendre), may simply be heard as “exciting.” In both cases, the conceptual blends
will remain latent: potential but not actualized opportunities for the construction of
meaning.
As Wert's madrigal suggests, conceptual blends involving text and music may be
relatively extended. Chapter 6 will consider such blends in greater depth, showing how
they develop over the course of entire songs, how the same text set to new music can give
rise to two very different songs, and how music alone can elaborate a blend first set up by
text and music.
Conceptual Blending and Program Music

In theory, conceptual blends may involve mental spaces associated with any domain of
thought. Musical domains could thus be correlated not only with domains associated with
language but also with physical gesture or color.[ 38 ] In the following, I would like to
consider the conceptual blends that occur when instrumental music is associated with an
extra‐musical program. Although such programs are almost always invoked through
language, the connection between the musical and linguistic domains is much looser, and
the concepts that emerge from the blend are somewhat more variable. The generality of
such blends also allows them to cover greater expanses of music—in the example with
which I am concerned here, an entire symphony.
Ludwig van Beethoven's Symphony No. 6 in F major, Op. 68 (“Pastorale”) is among the
best‐known instrumental works with which extra‐musical programs have been
associated. In the case of the Sixth Symphony, this association was aided by the descriptive
titles Beethoven gave individual movements (such as
“Scene by the Brook”). As Richard Will has pointed out, the programmatic aspects of the
Sixth Symphony created difficulties for those who sought to give an account of the work, for
the simplistic representation of natural sounds seemed in conflict with the received
wisdom that Beethoven's music referred to nothing outside itself.[ 39 ]
Symptomatic of this uneasiness with Beethoven's program was Donald Francis
Tovey's outright rejection of any link between such natural sounds and the music of the
Sixth Symphony. Tovey writes, “In the whole symphony there is not a note of which the
musical value would be altered if cuckoos and nightingales, and country folk, and thunder
and lightning, and the howling and whistling of the wind, were things that had never been
named by any man, either in connection with music or with anything else.”[ 40 ] With the
correlation between the domain of nature and the domain of music sundered, Tovey paves
the way to approach the work as “a perfect classical symphony” and to claim for it a place
in the pantheon of Beethoven's works.
Nevertheless, Tovey's analysis of the symphony seems to speak of something other
than absolute music. Describing an important moment in the slow movement, he writes,
The deep shadow of this remote key of G flat becomes still deeper as C flat, which, changing
enharmonically to B natural, swoops round to our original key B flat. At the outset of this
wonderful passage the theme was that of the first subject with the murmur of the brook
becoming articulately melodious in the clarinet and the bassoon. At the moment when the
melody gathers itself up into a sustained phrase and makes its enharmonic modulation,
there comes a phenomenon full of deep meaning. From this point nothing is left of the
melody but sustained notes and bird‐song trills; the whole of the rest of the return to the
main key is harmonic and rhythmic. In this as everywhere else the movement remains true
to type, a perfect expression of the happiness in relaxation.[ 41 ]
Here we have key areas that swoop, clarinets and bassoons that are articulate, and
melodies capable of independent motion, all within a movement that expresses the
happiness there is to be found in relaxation.
At first glance, Tovey's prose seems to retain the very correlation between the
domains of music and nature that he had earlier rejected, and to develop a conceptual
blend based on this correlation. Still present are the murmur of the brook, the bird‐song
trills of melody, and the general atmosphere of the pastoral, all wedded to Beethoven's
music. Closer consideration, however, shows that the correlation of music and nature
cannot explain the structural attributes of the imaginative domain summoned by Tovey.
Although nature includes living beings, music does not; at best, music comprises sounds
that occur within a temporal framework. The independent agency implied by Tovey,
evident in the articulate and goal‐directed entities he imagines, also seems somewhat out
of step with the commonplace view of
nature as a domain in which events are guided not so much by individual volition as by
larger forces working from without. In short, it is one thing to link elements of nature and
musical sounds to create a domain in which the cuckoo sings with the timbre of the clarinet
and the sound of a brook is summoned by undulating patterns in the strings. It is rather
another to animate musical sounds and to endow them with their own volition.
As it turns out, Tovey's interpretation does not rely on a simple correlation between
music and nature. Instead, Tovey activates a much larger network of mental spaces to
produce his interpretation. Central to this network is one of the most common of
conceptual blends, that of anthropomorphism (which played an important role in the blend
that produced A. A. Milne's Eeyore). The basic correlation that underlies this blend can be
seen in statements like “The car is being stubborn today—I just can't get her to start.” In
this mapping, the domain of inanimate objects is structured by the human domain; more
specifically, human being is mapped onto car, and a volitional state proper to humans
(stubbornness) is mapped onto the mechanical state of the car (not‐starting).[ 42 ] A blend
that exploits this mapping might then attribute additional volitional or emotive states to
the vehicle (“I think the car wants to stay home,” “She's mad because I haven't changed her
oil”), or extend into the nonvehicular domain (“I don't think the toaster likes me, it keeps
burning the bread”). In its most abstract form, this blend involves two input spaces, as
shown in the CIN of figure 2.6. One recruits structure from the human domain, the other
from some non‐human domain. The correlation of these two spaces gives rise to the
blended space of anthropomorphism, whose emergent structure reflects the counterpart
correlations of the input spaces. Within this space, non‐human entities become endowed
with the characteristics and intellects of humans while retaining many of the features that
make them distinct from humans. An anthropomorphic car may be stubborn and sullen,
but it will also be able to stay outside year round (even though it may not “like” to do so).
The generic space for the CIN connects entities with entities and states with states.
According to the logic proper to the network, the events that occur within each source
domain have causes that follow from the rules that govern the behavior of entities that
populate the domain. Stubbornness in the human domain reflects personal idiosyncrasies
and may be abetted by external circumstances that render the current situation
unsatisfactory. According to conventional wisdom, people are stubborn either because they
were born that way or because they are unhappy with the way things are and will not take
any action (other than being stubborn) until things change. Not‐starting in the domain of
automobile mechanics is caused by electromechanical conditions that are insufficient to
initiate or produce sustained combustion. The car will not start because there is something
wrong with the engine. It is important to note that, although the generic space may have a
structure that is strongly image‐schematic, generic spaces are often rich with detail.
Generic spaces are not necessarily primitive or schematic; they only define the basic
topography common to all of the spaces in the blend.
FIGURE 2.6 CIN for anthropomorphic blend
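The four-space network for the stubborn car can be sketched in code, which may make the role of the generic space easier to see. What follows is a minimal illustration and not a formalization drawn from the blending literature: the class `MentalSpace`, the function `blend`, and the role names `entity` and `state` are all assumptions introduced here. The sketch covers only composition; completion and elaboration are not modeled.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class MentalSpace:
    """A small conceptual packet: a name plus role -> element bindings."""
    name: str
    elements: Dict[str, str] = field(default_factory=dict)


def blend(generic: MentalSpace, source: MentalSpace, target: MentalSpace) -> MentalSpace:
    """Compose a blended space from two input spaces (hypothetical helper).

    The generic space names the roles shared by both inputs. The blend keeps
    the target's entity and lets the source's state stand in for the
    target's state, recording the counterpart it replaces.
    """
    blended = MentalSpace(name=f"{source.name}+{target.name}")
    for role in generic.elements:
        if role == "entity":
            # The car stays a car in the blend.
            blended.elements[role] = target.elements[role]
        else:
            # Counterpart mapping: a human state is projected onto the target's state.
            blended.elements[role] = (
                f"{source.elements[role]} (for {target.elements[role]})"
            )
    return blended


generic = MentalSpace("generic", {"entity": "entity", "state": "state"})
human = MentalSpace("human", {"entity": "human being", "state": "stubbornness"})
car = MentalSpace("automobile", {"entity": "car", "state": "not-starting"})

result = blend(generic, human, car)
print(result.elements)
```

The design choice worth noting is that the generic space contributes no content of its own. It only fixes which roles must have counterparts in both inputs, which is the sense in which it defines the topography common to the network.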
Tovey activates the mental space of anthropomorphism early in his essay when he
writes, “The Pastoral Symphony has the enormous strength of some one who knows how to
relax.”[ 43 ] Within the blend that results from correlating the human and musical domains,
musical events become actors endowed with their own characteristics and capable of their
own acts of volition. Thus the transition that follows the first subject of the opening
movement of the symphony “leads in three indolent strides to a second subject which
slowly stretches itself out over tonic and dominant as a sort of three‐part round.”[ 44 ] Tovey
develops an anthropomorphic perspective on nature more gradually, bringing it to fruition
only in his account of the end of the slow movement. “Suddenly for a moment all is silent;
we have no ears even for the untiring brook, and through the silence comes the voice of the
nightingale, the quaint rhythmic pipe of the quail, and the syllabic yet impersonal signal of
the cuckoo.”[ 45 ] Here elements of nature (brooks and birds) become actors, each endowed
with individual characteristics and capable of individual acts of volition. In both cases, the
correlation of human and non‐human domains gives rise to imaginative conceptual blends:
symphonies that know how to relax, musical passages that stride, brooks that do not tire,
and cuckoos capable of impersonal signals.
FIGURE 2.7 CIN for Tovey's analysis of the Beethoven Pastoral Symphony
Construing both music and nature in anthropomorphic terms makes it possible for
Tovey to connect the two in a further blend, shown in figure 2.7. The generic space for this
network is the blended space of figure 2.6. One of the input spaces is that of
anthropomorphized nature, which includes natural agents who act within and respond to
their native environment (“nature”). The other input space is that of anthropomorphized
music, which includes musical agents who act within and respond to their native
environment (“music”). In the blend, aspects of the input spaces are combined to create
unique entities and relations. Here the murmur of a brook becomes an articulate melody
for clarinets and bassoons; here Beethoven's music can be conceived as natural without
having been reduced to nature.
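The way one network supplies the generic space of another can also be laid out schematically. The sketch below is illustrative only: the `Network` record and the `active_spaces` helper are assumptions made for this example, and the label "naturalized music" for the final blend is a convenience of the sketch, not a term used in the figures.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Network:
    """One conceptual integration network: a generic space, two inputs, a blend."""
    generic: str
    inputs: Tuple[str, str]
    blend: str


def active_spaces(network: Network) -> List[str]:
    """List the four spaces of a basic network (hypothetical helper)."""
    return [network.generic, *network.inputs, network.blend]


# The abstract anthropomorphic network: human and non-human inputs.
anthropomorphism = Network(
    generic="entities and states",
    inputs=("human domain", "non-human domain"),
    blend="anthropomorphism",
)

# The further blend: the anthropomorphic blend now serves as the generic space.
tovey = Network(
    generic=anthropomorphism.blend,
    inputs=("anthropomorphized nature", "anthropomorphized music"),
    blend="naturalized music",  # label assumed for this sketch
)

print(active_spaces(tovey))
```

Reusing `anthropomorphism.blend` as the generic space of the second network mirrors the claim that anthropomorphism is what nature and music have in common once both are construed as domains of agents.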
The three operations that produce emergent structure in the blended space —
composition, completion, and elaboration—are all evident in Tovey's account of the end of
the fourth movement: “The storm moves in grand steps to its climax. This is marked by the
entry of the trombones.… Then the storm dies away, until with the last distant mutterings
of the thunder the oboes give a long slow fragment of bright sustained melody on the
dominant of F. This has been aptly compared with a rainbow.”[ 46 ] Composition unites the
lives and interests of natural agents and musical agents. In the final twenty measures of the
fourth movement, the thunder mutters with the voices of the timpani and double basses,
who also prepare the tonality of the closing fifth movement precipitated by the retreat of
the storm.
Completion supplies additional structure. Beethoven's terse title for the fourth movement
is “Gewitter, Sturm” (“Thunderstorm, Tempest”), which says nothing of the aftermath of a
storm. Nonetheless, we know that rainbows often follow storms, and so the “bright
sustained melody” of the oboes can summon the image of a rainbow, a completion
suggested but not necessitated by the input spaces.[ 47 ] Elaboration develops the blend by
pursuing its logic. The storm, as a magnificent entity, moves with grand steps. The climax of
its terror and wrath (enacted through musical agents) is marked by the entry of new
musical agents (the trombones). As these passions abate, musical agents gradually
disappear (a change only implicit in Tovey's prose) until only a bare few remain. Perhaps
most striking about this elaboration is that a single natural agent (the storm) appears to be
brought fully to life only through a multiplicity of musical agents. Such a genesis, while
inexplicable in the natural domain, is fully consonant with the idiosyncrasies of the musical
domain. There, individual musical entities (such as movements or entire symphonies) are
often understood to emerge from the actions of the subentities (such as themes or tonal
areas) that they comprise.
Tovey thus mobilizes a number of mental spaces to provide an account of Beethoven's
symphony. The most active spaces are those that contribute to a blend of
anthropomorphized natural and musical elements. This blend gives Tovey a way to
describe music that seems naturalized but does not reduce to a simple correlation of
natural and musical sounds. Less active, but still important, are spaces involved in blends
fundamental to this primary network. These include spaces built up from the domain of
human entities and events, along with the domains of nature and music. Tovey's
description makes only sparing use of these subsidiary blends, since they cannot provide
the naturalized account of music that is the goal of his essay. Far in the background is the
generic space common to anthropomorphic blends, which defines the core cross‐space
mapping and logic that underlie the entire network.
Summary
Conceptual blending is a dynamic process of meaning construction that involves small,
interconnected conceptual packets called mental spaces, which temporarily recruit
structure from conceptual domains in response to local conditions. When blending occurs,
a portion of the structure from two correlated input spaces is projected into a third,
blended space. As part of this process, the operations of composition, completion, and
elaboration produce structure within the blend that is not found within either of the input
spaces. This structure only becomes possible through the concepts and relations produced
by conceptual blending.
The structure common to the mental spaces within a conceptual integration network
is reflected in the generic space. This space defines the core cross‐space mapping and is
organized according to a basic logic that remains consistent throughout the network.
Generic spaces are not necessarily primitive or image‐schematic; they only define the
basic topography common to all of the spaces in the blend.
In a basic blend, four spaces will be active (although not necessarily to the same
degree): the generic space, the input spaces, and the blend. It often will be the case,
however, that additional spaces will be activated, to a greater or lesser extent, as the
construction of meaning proceeds. In certain passages within Tovey's account of
Beethoven's Sixth Symphony, as many as six spaces may be active. In addition to the blend,
these include mental spaces built up from the domains of music and nature, from the
domains of anthropomorphized music and nature, and from the generic domain of
anthropomorphism. This multiplicity of spaces explains the complexity of Tovey's account,
as well as its interpretive richness. It also serves to explain the seeming contradiction
entailed by Tovey's desire to deny the programmatic aspect of Beethoven's symphony
while responding to the powerful impressions to which it gives rise.
More generally, evidence from conceptual blending lends credence to the idea that
music is an independent conceptual domain. As I have shown in my analyses of text
painting and program music, musical concepts combine with concepts from other domains
to create blended concepts that suggest striking possibilities for the imagination. More to
the point, musical syntax (from the perspective developed in chapter 1) can be seen
contributing to the blended concepts associated with Wert's text painting and Beethoven's
program music. In the former, Wert's deployment of musical materials sets the stage for an
enactment—both by the voices of the singers and by our imaginations—of the sort of death
with which Guarini's poem is concerned. As but one example from the latter, Beethoven's
conclusion of the fourth movement and preparation for the fifth (the distant mutterings of
the timpani notwithstanding) provide a moment of contrast and articulation that clearly
supports our imagining one process ending (the storm) and another beginning (the
shepherd's song of joyful thanksgiving). And musical spaces can inform our understanding
of generic spaces: to hear Wert's musical climax is to get new insight into climax as a
general phenomenon, be that climax sexual, cerebral, or even visual; to hear Beethoven's
transition is to get a glimpse of how natural events like storms can be thought of not simply
as personages but also as processes.
Each of these aspects of conceptualizing music will be taken up in more detail
in chapter 6, within the context of the analysis of the nineteenth‐century Lied. There it will
be possible to see, to a greater extent, how processes of cross‐domain mapping and
conceptual blending construct and contribute to our understanding of both music and the
world as a whole.