
Spectral Analysis as a Resource for Contemporary Orchestration Technique
François Rose, Conservatory of Music, University of the Pacific, USA
frose@pacific.edu www.uop.edu/conservatory/frose.html

James Hetrick, Department of Physics, University of the Pacific, USA
jhetrick@pacific.edu www.uop.edu/cop/physics/Hetrick.html

Proceedings of the Conference on Interdisciplinary Musicology (CIM05)
Actes du Colloque interdisciplinaire de musicologie (CIM05)
Montréal (Québec) Canada, 10-12/03/2005

Abstract
Over the last four centuries the technique of orchestration has evolved empirically. Based on
aural perception, intuition, and the taste of the epoch, composers developed the skill of
combining the sounds of instruments. But it is now possible to use the computer to analyze
sound mixtures and view them as if under a microscope. This method is referred to as
spectral analysis because it shows, among other things, the energetic pattern of a sound,
that is, its spectral strengths and weaknesses.

Hermann von Helmholtz (1821-1894) established that complex sounds that evoke a
sensation of pitch have periodic waveforms comprised of a unique set of harmonic partials,
represented in the Fourier Transform. Since then, researchers have demonstrated how the
partial content of a sound, its spectral power distribution, and the location of its formant(s)
all play an important role in our perception of timbre. Spectral analysis is the standard
method used to access information on this energetic pattern of a sound.

We introduce a computerized aid to orchestration that greatly extends the use of spectral
analysis in orchestration. It is made of two parts: a bank of Fast Fourier Transforms (FFTs)
accessed by a group of sub-routines designed to either perform sound analysis or propose
different orchestrations that imitate the energetic pattern of a reference sound.

Introduction
Although our Computerized Orchestration Tool is designed as an aid for composers, it is not our intent
to replace traditional orchestration technique with a blind use of new technology.
After all, the sophistication reached through the empirical development of orchestration technique is
proof that imagination and intuition are a composer's most valuable and irreplaceable assets.
Rather, we propose to combine the two, because we firmly believe that integrating
spectral analysis into orchestration technique can expand the boundaries of the composer's imagination
and intuitive flair. Because the keyword is integration, our tool is equipped with two sub-routines
designed to assist the composer in his or her orchestral decisions:

Sub-routine #1 - Sound mixtures analysis: The tool's potential to analyze and compare
different orchestrations is illustrated with two excerpts from the literature: the two
orchestrations of the opening chord of Schoenberg's Five Pieces for Orchestra, op. 16, III, and
the counterpoint of timbres in Ravel's Daphnis et Chloé, La Danse des Jeunes Filles.
Sub-routine #2 - Orchestral propositions: The tool's ability to analyze a reference sound
and, based on a specific group of instruments, to provide orchestral propositions that imitate
the resonance of the reference sound is illustrated with an example from Rose's L'identité
voilée.

But first, we begin with an introduction to the structure of the tool itself.

Structure of the Computerized Orchestration Tool


The Tool has two components: a bank of FFTs accessed by a computer program that manages the two
sub-routines as illustrated in Figure 1.

CIM05, Montréal, 10-12/03/2005 1 www.oicm.umontreal.ca/cim05


François ROSE
James HETRICK

Figure 1. Structure of the tool

To be useful as a composition aid, the tool has to perform quasi-instantaneous
spectral analysis on different sound mixtures. That is why we decided that the most practical solution
was to work with a bank of FFTs. To ensure an acceptable degree of reliability, the bank
was built on the following premises:
- The sounds were all recorded under the exact same conditions.
- The sustain part of each sound was analyzed with a window of 4096 samples advanced in
  increments of 512 samples.
- The hundreds of FFTs generated by this process were then averaged into a single one;
  consequently, each analyzed sound was summarized by a single averaged FFT.
- Ordered like a chromatic scale, each averaged FFT was then compared with its adjacent ones;
  if the energetic pattern of an FFT did not logically follow that of its neighbors, the FFT was
  rejected and another sample was analyzed.
- All pitches and performance techniques playable on an instrument were recorded at three
  different dynamic levels.
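
The averaging step in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, no window function is applied because the text does not specify one, only the non-negative frequency bins are kept, and a naive DFT from the standard library stands in for a real FFT so that the sketch has no dependencies. The demo uses a 64-sample window so that the naive transform stays fast; the defaults mirror the 4096/512 values given in the text.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Naive DFT magnitudes of one frame (O(n^2), only meant for small demos)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def averaged_spectrum(samples, window=4096, hop=512):
    """Average the magnitude spectra of every full window, stepping by `hop` samples."""
    starts = range(0, len(samples) - window + 1, hop)
    spectra = [magnitude_spectrum(samples[s:s + window]) for s in starts]
    if not spectra:
        raise ValueError("signal is shorter than one analysis window")
    # zip(*spectra) groups the values of each frequency bin across all frames.
    return [sum(bin_values) / len(spectra) for bin_values in zip(*spectra)]


# Demo with small, hypothetical parameters: a sine that completes exactly
# four cycles per 64-sample window, so its energy falls in bin 4.
window, hop = 64, 16
signal = [math.sin(2 * math.pi * 4 * t / window) for t in range(256)]
avg = averaged_spectrum(signal, window, hop)
peak_bin = max(range(len(avg)), key=avg.__getitem__)
```

In practice one would replace `magnitude_spectrum` with a library FFT; the averaging logic stays the same.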

Sub-routine #1: Sound mixtures analysis

The tool uses the standard linear combination of FFTs to analyze sound mixtures. To illustrate the
potential of this sub-routine, the two orchestrations of the opening chord in Schoenberg's Five Pieces
for Orchestra, opus 16, are analyzed and compared. We then analyze the opening of La Danse des
Jeunes Filles in Ravel's Daphnis et Chloé to determine which trumpet mute is best suited to realize
the intended timbral imitation with the woodwinds.

Combining FFTs: The Method


The fundamental frequency and resolution used for the calculation of the Fourier Transform were the
same for all the sounds: 44100 Hz / 4096 samples ≈ 10.77 Hz. Consequently, each averaged FFT
(hereafter simply called an FFT) is in fact an array of 4096 values, where the first value of the array
corresponds to the averaged amplitude of the first partial of the fundamental, the second value
corresponds to the averaged amplitude of the second partial of the fundamental, and so on. Therefore,
combining FFTs is quite simple: we add together the corresponding ordinal amplitudes of each array.
Mathematically, one can view the FFT as a vector in a 4096-dimensional space; combining FFTs is
simply vector addition in this space. We found, though, that to take into consideration the effect of
phase, which is neglected in our calculation of the averaged FFTs, the added amplitudes must be
regularized by a factor of […], summarized in equation 1, below.

$$\mathrm{FFT}_{\text{Sound Mixture}}(f) = \sum_i \mathrm{FFT}_i(f) \qquad \text{(Eq. 1)}$$

Here the sum over i runs over the FFTs chosen for combination, and we display explicitly the
dependence of each FFT on frequency, f.
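
As a rough illustration of the combination step, the following Python sketch adds averaged spectra bin by bin and computes the bin spacing quoted above. The function name `combine` and the `scale` argument are hypothetical: `scale` is only a placeholder for the regularization factor mentioned in the text, whose value is not reproduced here, and it defaults to 1.0 (plain vector addition).

```python
SAMPLE_RATE_HZ = 44100
WINDOW_SAMPLES = 4096

# Spacing between adjacent FFT bins, as stated in the text.
bin_width_hz = SAMPLE_RATE_HZ / WINDOW_SAMPLES


def combine(spectra, scale=1.0):
    """Add spectra element-wise; `scale` is a hypothetical placeholder factor."""
    if len({len(s) for s in spectra}) != 1:
        raise ValueError("all spectra must have the same number of bins")
    return [scale * sum(bin_values) for bin_values in zip(*spectra)]


# Two toy four-bin spectra, invented purely for illustration.
mixture = combine([[1.0, 2.0, 0.0, 0.5], [3.0, 4.0, 1.0, 0.5]])
```

Because every spectrum in the bank shares the same bin layout, the mixture is a single pass over the arrays, which is what makes near-instant analysis of arbitrary instrument combinations feasible.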

Schoenberg's Five Pieces for Orchestra, opus 16, III


We illustrate the analytical potential of the tool by using two examples from the standard repertoire
with which we hope most readers will be familiar.
Arnold Schoenberg wrote the third movement of his Five Pieces for Orchestra, opus 16,
entitled Farben (German for "colors"), in 1909. The movement begins with a single chord, C3-G#3-B3-E4-A4,
articulated with two different groups of instruments, metaphorically referred to as two colors, as
shown on the left-hand side of Figure 2 (the score is in concert pitch, including the contrabass). The
two string instruments, the viola and contrabass, which alternately play the C3, belong to both
groups. It is in fact the wind instruments that clearly divide the ensemble into two groups. The wind
instruments of the first group are, from the low G#3 to the high A4, the bassoon, the clarinet, and the
two flutes, whereas those of the second group are the French horn, the bassoon, the trumpet, and the
English horn.
We used the tool to analyze these two colors and explain why the second orchestration sounds one
octave higher than the first one. The middle and right-hand sides of Figure 2 show, from the highest
to the lowest pitch, the average spectra of the four wind instruments, while the last spectrum shows
the result of their mixture. (Note that in order to illustrate the differences between the two groups,
the spectra of the contrabass and viola have been left out since they are common to both sounds, but
they have been accounted for in the mixtures.) Note also that the average spectrum of the mixture is
displayed over its lowest pitch, in this case a C3 (131 Hz). Since the chord's pitches are not overtones
of that C, peaks of amplitude appear at non-integer partial numbers. For example, in the second group, the
strong second partial of the English horn is an A5 (880 Hz), and it appears in the mixture as partial 6.7
over the C3, because 880 divided by 131 equals approximately 6.7. An overall comparison
between the two groups' spectra reveals that the main difference between them comes from the
instrumentation of the B3 and A4. More specifically, the bassoon and English horn of the second
group have a much stronger second partial than their respective counterparts, the clarinet and the
flute. Considering that the combined effect of all the instruments of a group is summarized in its
mixture, a comparison between the two mixtures shows that the main energetic area of the first group is
located on partial 3.3, and that of the second on partial 6.7. This means that the orchestration of the
second group is not only brighter than the first, it also sounds one octave higher, since 3.3 × 2 ≈ 6.7
and any two frequencies related by a factor of two are an octave apart. We have noticed this subtle
octave effect on all the recordings we have listened to, an amazing effect considering that it is
generated by the exact same chord.
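
The arithmetic in this paragraph can be checked with a short Python sketch. The helper names are hypothetical; the numbers (131 Hz, 880 Hz, partials 3.3 and 6.7) are the ones quoted in the text.

```python
import math


def partial_number(freq_hz, reference_hz):
    """Position of a frequency expressed as a (possibly fractional) partial of a reference."""
    return freq_hz / reference_hz


def interval_in_semitones(low, high):
    """Equal-tempered size of the interval between two frequencies or partial numbers."""
    return 12 * math.log2(high / low)


# Second partial of the English horn's A4, read against the C3 of the mixture.
english_horn_peak = partial_number(880, 131)

# Distance between the main energetic areas of the two mixtures.
shift = interval_in_semitones(3.3, 6.7)
```

The resulting shift is a little over twelve semitones, which is consistent with the octave displacement described above.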

Figure 2. Schoenberg's Five Pieces for Orchestra, III (Farben)


Ravel's Daphnis et Chloé, La Danse des Jeunes Filles


In Daphnis et Chloé, Ravel uses a fascinating counterpoint of timbre at the beginning of La Danse des
Jeunes Filles, rehearsal number 17. In the foreground, a solo muted trumpet is answered one
measure later by the oboes in octaves, with the English horn doubling the lower oboe in unison and the
Eb clarinet doubling the higher one, as shown in Figure 3 (the transcription is in concert pitch).
Ravel's intention is clear: he is using the woodwinds to imitate the sound of the muted trumpet.
We have used the tool to analyze the sound of the woodwind instruments and to determine which
trumpet mute would be best suited to imitate it. The indication 'con sordino' in a trumpet part is
almost always read as 'straight mute on'. The right-hand side of Figure 3 shows three average spectra.
The top is the spectrum of a trumpet playing a Db5 (554 Hz) mezzo-forte, muted with a straight mute;
we see that the mute behaves like a high-pass filter. The middle spectrum shows the mixture of the four
woodwind instruments. Note that the main energy is located in its lower part, more specifically around
its second partial (Db6, 1109 Hz), and that there is a cut-off around the 16th partial. This spectrum
does not correspond very well with that of the straight-muted trumpet, which has its main energy around
the 5th partial (F7, 2794 Hz) and an extremely rich structure, including more than thirty partials.
On the other hand, the last spectrum in Figure 3 displays the sound of a trumpet muted with a cup
mute, playing a Db5 mezzo-forte. Note that in this case the main energy is around the second partial,
and that there is a cut-off around the 17th partial. Therefore, the resemblance between the trumpet's
sound and that of the four woodwind instruments would be enhanced if the trumpet were muted with a
cup mute instead of a straight mute.
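
The pitch and partial frequencies discussed above can be cross-checked with a small Python sketch. It assumes twelve-tone equal temperament with A4 = 440 Hz and the common MIDI numbering in which C4 = 60 (so Db5 = 73, Db6 = 85, F7 = 101); the helper names are hypothetical.

```python
def midi_to_hz(midi_note, a4_hz=440.0):
    """Equal-tempered frequency of a MIDI note number (A4 = note 69)."""
    return a4_hz * 2 ** ((midi_note - 69) / 12)


def partial_hz(fundamental_hz, n):
    """Frequency of the n-th harmonic partial of a fundamental."""
    return n * fundamental_hz


DB5, DB6, F7 = 73, 85, 101

db5_hz = midi_to_hz(DB5)
second_partial_hz = partial_hz(db5_hz, 2)  # coincides with Db6
fifth_partial_hz = partial_hz(db5_hz, 5)   # lies close to F7
f7_hz = midi_to_hz(F7)
```

Under these assumptions Db5 is about 554 Hz, its second partial is Db6 at about 1109 Hz, and its fifth partial (about 2772 Hz) sits slightly below the equal-tempered F7 (about 2794 Hz), which is why the text can label that region F7.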

Figure 3. Left-hand side: Foreground instruments of the opening two measures of Ravel's La Danse des Jeunes
Filles, from Daphnis et Chloé. Right-hand side: Top: Average spectrum of a trumpet muted with a straight mute
playing a Db5, mf. Middle: Mix of an oboe and an English horn playing a Db5, f, and an oboe and an Eb clarinet
playing a Db6, f. Bottom: Average spectrum of a trumpet muted with a cup mute playing a Db5, mf.

Sub-routine #2: Orchestral propositions

Because the sounds are represented in the bank as arrays of amplitudes, or vectors, it is possible to
submit the FFT of an arbitrary sound for pattern matching and thereby use the tool to suggest an
orchestration that best reproduces a reference sound, given a set of
instruments/notes chosen from the bank. More specifically, the tool uses an advanced method of
spectral decomposition to analyze an arbitrary sound and then, based on a specific group of instruments,
provides suggestions on how to orchestrate that sound. Thus we can explore the potential of new
sound mixtures. For example, after analyzing a multiphonic for clarinet, the tool supplies two different
ways to orchestrate that sound for piano, violin, and clarinet.


Spectral Decomposition: The Method

Our method of finding the best orchestration for a given reference sound is very similar in spirit to
the common practice of Least Squares Fitting (LSF) in statistical analysis. In LSF, the equation of
the "best fit" line passing through a number of data points is found. In our tool, we find the
combination of sounds from the bank whose combined (weighted) FFT gives the "best fit" to the FFT
of the reference sound.

As an example, recalling that we can view an FFT as a vector, suppose we have a 3-dimensional
vector with x, y, and z components a_x, a_y, and a_z. We can ask what 2-dimensional vector comes closest
to the 3-dimensional reference vector. This 2-d vector is the projection of the 3-d vector onto the 2-d
sub-space spanned by x and y. The projection has x and y components (weights) a_x and a_y, which
give the closest 2-d vector to the 3-d reference. Our case in Fourier space is more complicated,
however, since the space has 4096 dimensions and the basis vectors (instruments/notes in our bank)
are not orthogonal to each other in Fourier space as x, y, and z are in Euclidean geometry.
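
The 3-d example can be made concrete with a minimal Python sketch. It covers only the orthogonal case described in this paragraph; the vector values and the function name are invented for illustration.

```python
import math


def project_onto_xy(v):
    """Closest vector in the x-y plane: keep a_x and a_y, drop a_z."""
    x, y, _z = v
    return (x, y, 0.0)


reference = (3.0, 4.0, 5.0)
closest = project_onto_xy(reference)
error = math.dist(reference, closest)  # what the plane cannot capture (a_z)

# Any other vector in the plane is farther from the reference.
nudged = (3.1, 4.0, 0.0)
nudged_error = math.dist(reference, nudged)
```

With orthogonal axes the weights can be read off directly. When the basis vectors overlap, as instrument spectra do, the weights have to be solved for jointly.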

As an outline of our method, we decompose our reference vector into a sum of vectors chosen from
the bank, described mathematically as

$$\mathrm{FFT}_{\text{reference}}(f) \approx \sum_i \alpha_i \, \mathrm{FFT}_i(f) \qquad \text{(Eq. 2)}$$

This is much like equation 1; however, each FFT_i from the bank is now weighted by a constant α_i. While
this decomposition is not exact (since the bank is not a complete basis), there is a unique set {α_i} for
any given reference sound which makes the sum on the right closest to the reference FFT. Finding the
{α_i} is the same as finding the best projection of the arbitrary sound onto the basis of instruments
chosen for the orchestration. Thus, if composing for a trio, the index i would run over the FFTs of all
pitches and intensities of the three instruments chosen.

The details of the mathematical solution for finding {α_i} are beyond the scope of these Proceedings;
however, our method uses Singular Value Decomposition to solve the (4096-dimensional) linear
system in equation 3 for the {α_i} that make the (square of the) difference between the reference FFT
and the sum smallest. See Press et al.

$$\frac{\partial}{\partial \alpha_j} \sum_f \Big( \mathrm{FFT}_{\text{reference}}(f) - \sum_i \alpha_i \, \mathrm{FFT}_i(f) \Big)^2 = 0 \quad \text{for every } j \qquad \text{(Eq. 3)}$$

The resulting solution {α_i} generally contains several large values and many small ones. The large
values represent the main contributions to the reference sound from the basis, and hence the suggested
orchestration. Since the {α_i} must be examined by the composer, there is still ample room for
intuition and experience; however, the distribution of the {α_i} often reveals new and evocative information
for the composer.
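
To give a feel for the fitting step, here is a minimal Python sketch of a least-squares decomposition. It is not the authors' implementation: the text states that the tool uses Singular Value Decomposition, whereas this dependency-free sketch swaps in the normal equations solved by Gaussian elimination, which is less robust when bank spectra are nearly collinear. The function names, the toy four-bin spectra, and the 0.1 threshold are all hypothetical, and the weights are not constrained to be non-negative.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def solve(matrix, rhs):
    """Solve a square linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("bank spectra are linearly dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_weights(reference, bank):
    """Weights minimizing the squared difference between reference and weighted sum."""
    gram = [[dot(u, v) for v in bank] for u in bank]  # overlaps between bank spectra
    rhs = [dot(u, reference) for u in bank]           # overlap of each with the reference
    return solve(gram, rhs)


# Toy bank of three non-orthogonal spectra, invented for illustration.
bank = [[1.0, 0.5, 0.0, 0.0], [0.0, 1.0, 0.5, 0.0], [0.0, 0.0, 1.0, 0.5]]
reference = [2.0, 1.0, 0.5, 0.25]  # equals 2 * bank[0] + 0.5 * bank[2]

weights = fit_weights(reference, bank)
main_contributors = [i for i, w in enumerate(weights) if abs(w) > 0.1]
```

Here the fit recovers weights close to 2, 0, and 0.5, and the thresholding step mirrors the idea of keeping only the large values as the suggested orchestration.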

Rose's L'identité voilée

In the transition between the first and second sections of L'identité voilée for clarinet, violin, and piano,
a clarinet multiphonic is presented and, as it fades out, its resonance is imitated by the entire trio. In
order to determine the specific pitches, dynamics, and performance practices that would best lead to
this timbral imitation, the clarinet's multiphonic C4-B5 was analyzed by our tool. Based on the
three specified instruments, the tool supplied two solutions. Figure 4a shows one solution proposed by
the tool. On the left-hand side the solution is presented in score notation (note that the score is in
concert pitch). On the right-hand side, the averaged spectrum of the multiphonic on the top is
compared with the averaged spectrum of the sound mixture shown in the second measure. To
facilitate the comparison, both spectra are displayed over a low C#3.
The averaged spectrum of the multiphonic shows that it has strong energy around the 7th,
14th, and 2nd partials of the C#3. The averaged spectrum of the sound mixture matches the strong
energy of these partials, but it also contains several other strong partials. Experience has
demonstrated that the level of imitation is fairly good.

Figure 4a. Left-hand side: The clarinet's concert-pitch multiphonic C4-B5, and the pitches, dynamics, and
performance practices suggested by the tool to imitate its resonance. Right-hand side: Top: Averaged spectrum
of the multiphonic. Bottom: Averaged spectrum of the sound mixture shown in the second measure.

Figure 4b shows another solution proposed by the tool. This time the strong 7th, 14th, and 2nd partials
are nicely matched, and the addition of new partials is rather limited. Experience has demonstrated
that the imitation is very convincing.

Figure 4b. Left-hand side: The clarinet's concert-pitch multiphonic C4-B5, and the pitches, dynamics, and
performance practices suggested by the tool to imitate its resonance. Right-hand side: Top: Averaged spectrum
of the multiphonic. Bottom: Averaged spectrum of the sound mixture shown in the second measure.

Conclusion
Orchestration techniques and triadic harmony are related to the overtone series. Consequently, for
almost four hundred years the development of orchestration techniques has been linked to tonal music.
While our imagination for pitch systems is unlimited, our inventiveness for sound mixtures remains
bound by laws of physics, and the instruments at our disposal. Since orchestration is the vehicle that
carries a musical idea from imagination to reality, it is clear that a composer's orchestration technique
has a major impact on his or her musical expression. Spectral analysis, by providing essential
information about sound mixtures which are new, or that are subject to constraints, can profoundly
and positively influence that technique.


References
von Helmholtz, H. L. F. (1877). On the Sensations of Tone as the Physiological Basis for the Theory of
Music. 2nd. Ed. trans. A. J. Ellis (1885), from German 4th Ed., Dover, New York (1954).

Press, W.H., Teukolsky, S.A., Vetterling, W.T., Flannery, B.P. Numerical Recipes in C: The Art of
Scientific Computing. 2nd Ed., Cambridge University Press, Cambridge (2002).
