
UNIT 1
The Primes
TEXTBOOK

UNIT OBJECTIVES

• Primes are the fundamental building blocks of arithmetic.

• There are infinitely many prime numbers.

• The fundamental theorem of arithmetic says that each whole number can be
uniquely decomposed into a product of primes.

• The answer to whether or not there is a pattern behind the primes has eluded
mathematicians for millennia.

• One can find arbitrarily long “prime deserts” on the number line.

• One can find finite arithmetic sequences of primes of any length.

• Clock math is a way of generalizing arithmetic.

• Primes and clock math can be used together to create strong encryption schemes.

Reason is immortal, all else mortal.

Pythagoras

Mathematicians have tried in vain to this day to
discover some order in the sequence of prime
numbers, and we have reason to believe that
it is a mystery into which the human mind will
never penetrate.

Leonhard Euler
SECTION 1.1
INTRODUCTION

In 1974, astronomers sent a message into space from the Arecibo radio
telescope in Puerto Rico in an attempt to broadcast the presence of our
intelligent life to any other potentially sentient beings that happened to be
listening. The message contained information about our planet’s location, the
basis of our chemistry, and rudimentary information about our biological form.
In sending this message, astronomers had to confront a deep problem: how
does one communicate with another intelligent species that one knows nothing
about?

At the time, astronomers thought that the best way to ensure the
comprehensibility of the message was to use a concept fundamental to our own
logical understanding of the world, namely, prime numbers. No known natural
process creates prime numbers, yet they are at the very center of mathematical
thought. The astronomers assumed that any beings intelligent enough to be
able to listen to radio broadcasts would know about prime numbers. With this
in mind, the message senders encoded a pictogram
consisting of 1,679 bits; although the number 1,679
is not itself prime, it is the product of two prime
numbers. The general idea was that any alien listener
who receives a random number of blips might think
to factor that number, just to get a sense of what it
is. Choosing a number whose only factors are two
primes might prompt the receiver to think “two-
dimensional,” a rectangular array, in other words,
with one factor representing the number of rows and
the other representing the number of columns. Using
this information to arrange the pixels into a rectangle
would reveal the picture. This, of course, assumes
that an intelligent alien would recognize prime
numbers to be special in some way.
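The receiver's reasoning can be sketched in a few lines of Python. This is only an illustration: the helper name `rectangle_layouts` is our own, and the sketch simply lists every way to arrange a given number of bits into a rectangle with both sides greater than one.

```python
def rectangle_layouts(n: int) -> list[tuple[int, int]]:
    """Return every (rows, columns) pair with rows * columns == n and both sides > 1."""
    return [(rows, n // rows) for rows in range(2, n) if n % rows == 0]


# A product of two distinct primes admits exactly two layouts,
# one being the other turned on its side.
print(rectangle_layouts(1679))  # [(23, 73), (73, 23)]

# A number with many divisors is far more ambiguous.
print(rectangle_layouts(12))    # [(2, 6), (3, 4), (4, 3), (6, 2)]
```

Because 1,679 = 23 × 73, a listener who tries to build a picture has only two rectangles to test.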

This assumption is actually quite reasonable. Prime numbers, known to
humankind for at least 20,000 years, represent a fundamental concept of
mathematics that has provided rich grounds for study. Primes have been
described as the atoms of arithmetic, the indivisible parts from which all other
numbers can be constructed. Seemingly simple, they have provided some of the
most challenging problems for those willing to explore their world. Long studied
for their mysteries, many of which remain unsolved, primes were, until
relatively recently, solely the concern of mathematicians. With the explosion of
digital communications and our dependence on Internet transactions, however,
primes now play a pivotal role in other areas, as well.

[Figure: Item 1671 / Frank Drake, ARECIBO MESSAGE OF 1974 (1974). Courtesy of
Frank Drake.]

The advent and growth of the Internet and the “information age” have given
unprecedented convenience to millions of technology users worldwide.
Technological advances and their application in numerous fields have helped
to shrink perceived distances and have done much to break down barriers that
divide us. Peril often accompanies progress, however, and our modern Internet
economy is not immune to risks. For example, when we make a purchase, pay
bills, or check our bank balance online, how do we know that someone will not
intercept the transmission and steal our personal information? One of the ideas
we will see in this unit is how the properties of prime numbers, in combination
with modular arithmetic, or “clock math,” are used to help keep our information
secure.

In this unit we will see how primes are fundamental to mathematical thought.
We will explore the seemingly simple concept that underlies their existence, and
we’ll catch a glimpse of the mystery inherent in their distribution on the number
line. Finally, we’ll take a brief look at how the modern standard of data security,
RSA encryption, uses fundamental properties of primes to create virtually
impenetrable codes.

SECTION 1.2
Math at the Dawn of Time
• Tally Sticks
• Cultural Math

TALLY STICKS
• The earliest evidence of humans using numbers comes in the form of tally
sticks.
• The Ishango bone represents a level of early mathematics more
sophisticated than simple counting.

Evidence suggests that the exact beginning of mathematical thought, even the
origin of the concept of a “number,” predates the advent of written language.
The earliest example of recorded mathematical symbols is a sequence of tally
marks on the leg bone of a baboon found in Swaziland, dating to around 35,000
years ago. By contrast, the earliest known language writings date to around
6,500 years ago. It is not known for sure what the tally marks on the bone
represent, but it is plausible that they represent a record of an early hunter’s
kills. These tally marks may represent numbers in application, but there may
also be evidence that early humans were interested in properties of numbers
themselves.

Possible evidence of mathematics more sophisticated than counting comes from
another bone dated ten thousand years younger than the Swaziland counting
bone. Around 25,000 years ago, by the shores of Lake Edward (which today lies
on the border between Uganda and the Democratic Republic of the Congo,
formerly Zaire), the Ishango people lived in a small fishing, hunting, and farming
community. This settlement lasted for a few centuries before being buried in a
volcanic eruption. Excavations at this site turned up a bone tool handle with a
series of interesting marks. The Ishango bone, as it is now called, has groups of
markings, some of which represent primes. Although the exact meaning of these
markings is still being debated, the current thought is that they represent some
sort of lunar calendar. Regardless of the precise meaning of the markings, the
artifact demonstrates that humans were thinking about mathematical concepts,
perhaps even the concept of prime numbers, 25,000 years ago, well before the
emergence of cities.

[Figure: Item 3060 / Oregon Public Broadcasting, created for Mathematics
Illuminated, ISHANGO BONE (2008). Courtesy of Oregon Public Broadcasting.]

CULTURAL MATH
• Mathematics arose independently in different forms across many cultures
throughout history.

Jump 10,000 years forward and a bit to the east from the makers of the Ishango
bone, and you’re in the emerging Egyptian and Fertile Crescent civilizations.
These civilizations had a deep understanding of mathematics and used it to
achieve unequalled engineering feats. Babylonian clay tablets show an
understanding of Pythagorean triples, centuries before the cult of Pythagoras
appeared in Greece. However, these were not the only ancient civilizations to
develop, presumably independently, familiarity with numbers and number
relationships. Mathematical concepts may have spread naturally throughout
Africa, the Middle East, and Asia, but they also appeared early on in Central and
South America.

[Figure: Item 2251 / Inca, QUIPU. USED FOR COUNTING (fifteenth-early sixteenth
century). Courtesy of Kathleen Cohen.]

[Figure: Item 2250 / Babylonian, MATHEMATICS TABLET SHOWING CALCULATIONS OF
VOLUME (ca. 1635 BCE). Courtesy of Kathleen Cohen.]

Evidence suggests that the development of mathematics in early cultures was
tied to specific purposes. Much of early mathematical thought was focused on
representing and understanding the movements of the heavens, as is evident in
the Mayan long count calendar and the constellations of the zodiac. Elsewhere,
such as in ancient China, mathematics was put to use in bookkeeping and other
business activities. Math’s development generally served practical purposes
until about 600 BC, when the Greek philosophers began to explore the world of
numbers itself.

[Figure: Item 2253 / Mayan, CALENDAR RELIEF (fifth-ninth century). Courtesy of
Kathleen Cohen.]

SECTION 1.3
Number for Number’s Sake: The Greeks
• Playing with Numbers
• The Primacy of Proof
• Figurate Numbers

Playing with Numbers
• The Greeks studied numbers independent of their application to uses in the
real world.

Prime numbers may have been of interest to the people of Ishango, but it was
the Greeks who began to ask deep questions about them. The Greeks held high
esteem for the pursuit of knowledge and, in particular, mathematical truth. One
would not have expected such a high-level fascination from the illiterate and
innumerate people who conquered the Aegean peninsula. Their conquests and
the knowledge that flowed along their trade routes, however, enabled them to
catch up quickly with the rest of the mathematical world.

While the rest of humanity was seemingly occupied with the more practical uses
of mathematics, the Greeks were among the first to develop a mathematical
world that was not necessarily tied to real-world applications. It was during
this time that the concept of axiomatic structure emerged—mathematical
proof, in other words. Thales, a mathematician who is said to have astonished
his countrymen by correctly predicting a solar eclipse in the year 585 BC,
is generally credited as taking the first steps toward focusing on the logical
structure and principles behind mathematics. Described as the first philosopher
and the first mathematician, he is definitely the first person to whom a specific
mathematical “discovery” is ascribed, namely that an angle inscribed in a
semicircle is a right angle. As is often the case with the emergence of new ideas,
there is some dispute on this, however. Thales is generally given the credit
for this discovery, although some claim that he simply re-packaged a previous
Babylonian finding.

There is some debate as to whether it was Thales or the Pythagoreans who
presided over the shift in mathematics from practical concerns to the
development of general principles and ideas. The supposed motto of the
Pythagorean group, “All Is Number,” encapsulates their preoccupation with both
mathematical and numerological concepts. For example, they ascribed a gender
to numbers, odd numbers being male and even numbers being female. Much of
the mathematical tradition of ancient Greece, and, thus, of the civilizations that
followed, stemmed from the obsessions of the Pythagoreans.

The Primacy of Proof
• Proof has long been one of the central ideas in mathematics.
• Mathematical theorems, unlike scientific theories, last forever.

Chief among the Pythagorean concerns was the notion of proof. In philosophy
it was possible to argue, as the Sophists did, both sides of a scenario and see
that neither was a clear winner. Math, however, is different in that “truth” can
be proved through a system of assumptions and allowed actions that show
that a given statement must follow from initial postulates. In other words, in
mathematics at least, there is indisputably a right answer, although it may
not always be obvious. This clarity, and the comfort that it often brings, was
of central importance to the Pythagoreans, and it represents a distinguishing
feature of the field of mathematics.

Theorems proved by, or at least attributed to, early Greeks, such as the
Pythagoreans, remain as true today as they were in ancient times. The same
cannot be said for a host of other Greek beliefs from non-mathematical
disciplines. One of the alluring features of mathematics is that it enables one to
say definite things about reality. This aspect was, and continues to be, a major
reason why people choose to study mathematics.

Figurate Numbers
• Playing with the geometric structure of numbers led to early insights in
number theory.

Playing with numbers, the exploration of numbers for their own sake, is
perhaps the first step towards mathematical sophistication. The Greeks
were fascinated by the different properties that certain numbers exhibited
geometrically. If we represent each whole number by a collection of pebbles
equal in count to that number, then many interesting relations and properties
of numbers can be found by looking at the shapes that one is able to make with
different arrangements of the collections.

[Figure: Item 1094]

Square numbers are whole numbers that, when represented as a collection of
pebbles, can form a square array. The first square number is 1; the second is 4,
which can be portrayed as a 2 × 2 array; the third square number, 9, forms a
3 × 3 array; and so on. The triangular numbers can also be represented by an
interesting sequence of dot patterns, which is, in fact, the basis of their
classification as “triangular” numbers.

[Figure: Item 1095]

The squares and triangular numbers, in this context, are examples of “figurate”
numbers—numbers that have “shapes.” Figurate numbers hold many
interesting properties; for example, consider the 5 × 5 square.

[Figure: Item 2294]

Is it evident that the fifth square number, 25, represents the sum of the first five
odd numbers?

[Figure: Item 1031]

We can decompose the square into nested L-shapes, also called gnomons, as
shown above. It is, hopefully, straightforward to state that any square can be
broken down in this way. It should also be clear that every gnomon represents
an odd number, and that nested gnomons represent consecutive odds. Adding a
gnomon of 2n + 1 pebbles to an n × n square increases the dimensions to
(n+1) × (n+1), which is still a square. It is reasonable, then, to conclude that the
sum of the first n odd numbers is n².
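The gnomon argument can be checked numerically. The short Python sketch below is an illustration only, with function names of our own choosing: it builds each square by adding gnomons one at a time and compares the running total with n × n.

```python
def gnomon_size(k: int) -> int:
    """Pebbles in the k-th gnomon (k = 1 is the single corner pebble): the k-th odd number."""
    return 2 * k - 1


def sum_of_first_odds(n: int) -> int:
    """Add up the first n gnomons, i.e. the first n odd numbers."""
    return sum(gnomon_size(k) for k in range(1, n + 1))


for n in range(1, 6):
    # Each row shows n, the sum of the first n odd numbers, and n squared.
    print(n, sum_of_first_odds(n), n * n)
```

For n = 5 the gnomons contribute 1 + 3 + 5 + 7 + 9 = 25, matching the 5 × 5 square discussed above.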

In the online interactive, you will have the opportunity to play with figurate
numbers, such as those we have discussed, and to discover interesting
relationships between them. The study of figurate numbers leads quite
naturally to primes, which we will investigate further in the next section.

SECTION 1.4
Primes
• Rectangles
• Factor Trees
• Fundamental Theorem of Arithmetic

Rectangles
• The rectangle model of multiplication links a number’s geometric structure
to its divisors.
• Primes are numbers that cannot be represented by rectangles with both
dimensions whole numbers greater than one.

Exploring a number’s geometric shape leads quite naturally to the notion
of prime numbers. Just as before, we can consider a whole number to be a
collection of pebbles. We could then address the question of whether it is
possible to form a rectangle with the pebbles.

Let’s look at a collection of 12 pebbles:

[Figure: Item 2295. A collection of 12 pebbles]

We can arrange these 12 pebbles into various rectangular arrays.

[Figure: Item 2296. Rectangular arrays of 12 pebbles: 1 × 12, 2 × 6, 3 × 4,
12 × 1, 6 × 2, 4 × 3]

Note that the dimensions of each rectangle (the height and width) multiply to
equal 12. We call each of these numbers representing a possible dimension a
“divisor” of 12; so 12 has six divisors: 1, 2, 3, 4, 6, and 12. In general, we say that
a is a divisor of N if for some whole number b, a × b = N.
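The definition of a divisor translates directly into code. The following Python sketch (the name `divisors` is ours, chosen for illustration) tests every candidate a from 1 to N and keeps those that divide N with no remainder.

```python
def divisors(n: int) -> list[int]:
    """Return every whole number a for which some whole number b gives a * b == n."""
    return [a for a in range(1, n + 1) if n % a == 0]


print(divisors(12))  # [1, 2, 3, 4, 6, 12]
```

Each divisor a pairs with b = N / a, which is why the divisors of 12 match the side lengths of the rectangles above.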

Note that any whole number, N, can be represented by a 1 × N rectangle: a
single row of pebbles.

[Figure: Item 2297. A single row of N pebbles: 1 × N]

Some numbers, such as 12, 15, 20, and 100, can be represented by rectangles
that are more interesting than a single row of pebbles.

[Figure: Item 2298. Rectangular arrays: 3 × 4 = 12, 5 × 3 = 15, 5 × 4 = 20,
10 × 10 = 100]

Other numbers, however, such as 5, 11, 17, and 101, can be represented only by
the single-row type of rectangle.

[Figure: Item 2761. Single rows: 1 × 5 = 5, 1 × 11 = 11, 1 × 17 = 17,
1 × 101 = 101]

In arithmetic, we call a number “prime” if it has precisely two divisors.
These primes are the numbers, such as 2, 3, 5, 7, and 11, whose pebble
representations can be arranged only into the single-row type of rectangle.
Numbers with more than two divisors are called “composite.” Geometrically,
these are numbers that can be represented in dot formations by more than one
type of rectangle. Note that the number one, which has precisely one divisor, is
considered to be neither prime nor composite.
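As a small illustration of these definitions, the Python sketch below labels a whole number by counting its divisors. The function name `classify` and the returned labels are our own choices, and the divisor count is done in the most naive way possible.

```python
def classify(n: int) -> str:
    """Label a whole number n >= 1 as 'prime', 'composite', or 'neither' by counting divisors."""
    divisor_count = sum(1 for a in range(1, n + 1) if n % a == 0)
    if divisor_count == 2:
        return "prime"      # only the single-row rectangle 1 x n exists
    if divisor_count > 2:
        return "composite"  # at least one other rectangle exists
    return "neither"        # n == 1 has exactly one divisor


for n in (1, 2, 5, 12, 101):
    print(n, classify(n))
```

Counting divisors this way is slow for large numbers, but it mirrors the definition exactly.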

Factor Trees
• Factor trees reveal the prime decompositions of composite numbers.

A number that is prime has exactly two divisors, itself and one. If a number
is not prime—composite, in other words—we can “factor” it to find all of its
constituent prime “factors.” For example, the number 30 can be written as
6 × 5, which can in turn be written as a product of all-prime factors: 2 × 3 × 5.

[Figure: Item 1099. Factor tree for 30: 30 → 5 × 6, then 6 → 2 × 3]

Some numbers have multiple possible factor trees. Let’s consider the number
300, for example:

[Figure: Item 2299. Two factor trees for 300.
First tree: 300 → 30 × 10, 30 → 5 × 6, 6 → 3 × 2, 10 → 5 × 2,
giving 300 = 5 × 3 × 2 × 5 × 2.
Second tree: 300 → 150 × 2, 150 → 15 × 10, 15 → 3 × 5, 10 → 5 × 2,
giving 300 = 3 × 5 × 5 × 2 × 2.]

Note, however, that although these factor trees for 300 are different, the set of
prime factors generated is the same: 3, 5, 5, 2, and 2.
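A factor tree can be mimicked in code by repeatedly splitting off the smallest factor. The Python sketch below is one possible way to do this (the name `prime_factors` is ours); it also confirms that the two factor trees for 300 described above end in the same collection of primes.

```python
def prime_factors(n: int) -> list[int]:
    """Return the prime factors of n (n >= 2) in non-decreasing order, with repeats."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:   # split off d as many times as it divides n
            factors.append(d)
            n //= d
        d += 1
    if n > 1:               # whatever remains has no smaller factor, so it is prime
        factors.append(n)
    return factors


print(prime_factors(30))   # [2, 3, 5]
print(prime_factors(300))  # [2, 2, 3, 5, 5]

# The leaves of the two trees for 300, read in the order given in the text.
first_tree = [5, 3, 2, 5, 2]
second_tree = [3, 5, 5, 2, 2]
print(sorted(first_tree) == sorted(second_tree) == prime_factors(300))  # True
```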

Any composite number can be decomposed this way into a product of primes.
In doing this, we see that primes can indeed be thought of as the “atoms,” or
fundamental building blocks, of all numbers. Real atoms are the smallest
individual pieces of an element, such as gold, that still retain all the properties
of that element. In this analogy, a composite number is like a molecule.
Breaking apart a molecule generates a collection of atoms of different elements,
each of which cannot be broken down further. We perform an analogous
breakdown when we decompose a composite number and express its prime
decomposition via a factor tree. It is interesting to note that every composite
number breaks down into a unique product of primes. How do we know this?

Fundamental Theorem of Arithmetic
• The Fundamental Theorem of Arithmetic states that every number has only
one prime decomposition.
• Primes are the “atoms” of arithmetic.

The fact that every composite number has a unique prime decomposition, a
concept known as the fundamental theorem of arithmetic, is often taken for
granted. In mathematics we should always be careful to question both our own
assumptions and the assumptions of those who would tell us something. Such
a questioning attitude derives from the previously mentioned importance of
proof, pioneered by the Greek philosophers, logicians, and mathematicians. The
tools for proving the fundamental theorem of arithmetic were first laid down by
the great mathematician Euclid, who lived in Alexandria in the third century BC.
These core concepts were not rigorously expressed, however, until Carl
Friedrich Gauss put them on a solid foundation in his Disquisitiones
Arithmeticae, first published in 1801. Expressed in modern language, the
fundamental theorem of arithmetic states that:

Every natural number greater than 1 can be written as a product of prime numbers
in essentially just one way.

This may seem intuitive, even obvious, but it doesn’t have to be the case. We can
imagine a number system in which the fundamental theorem of arithmetic does
not hold. Take, for example, a set, S, consisting of the numbers {1, 4, 7, 10, 13,
16, 19,..., 3n+1,…}. Each number in this system is one more than a multiple of
three. Suppose that these numbers are all we have to work with in performing
arithmetic operations in this system. As with the natural numbers, we can have
a notion of prime and composite in this system—let’s call them “S-prime” and
“S-composite” respectively.

[Figure: Item 300. S = {1, 4, 7, 10, 13, 16, ..., 3n+1, ...}.
The S-composite 28 splits into the S-primes 4 and 7: the “S-Prime”
factorization of 28.
The S-composite 16 splits into the S-primes 4 and 4: the “S-Prime”
factorization of 16.]

The number 10 has only two divisors in this system, one and itself. The number
16, on the other hand, has three: 1, 4, and 16. Numbers such as 4, 7, and 10 can
be called “S-prime,” because they have exactly two divisors within the number
system, S. Numbers such as 28 and 16 can be called “S-composite,” because
they have more than two divisors in S. If S obeys the fundamental theorem of
arithmetic, then no matter how we draw a factor tree for an S-composite, we
should end up with the same set of S-primes. Is this the case? Let’s look at two
factor trees of the S-composite number 100:

[Figure: Item 2301. Two factor trees for the S-composite 100.
100 = 4 × 25, with 4 and 25 both S-prime: an “S-Prime” factorization of 100.
OR
100 = 10 × 10, with 10 S-prime: a different “S-Prime” factorization of 100.]

Notice that 100 can be written as a product of S-primes in two different ways,
4 × 25 and 10 × 10. So, this demonstrates that the fundamental theorem of
arithmetic does not hold for our number system S. This suggests to us that we
cannot take this seemingly obvious property of natural numbers for granted—
hence, Gauss’s proof that every natural number greater than one has a unique
prime factorization is of great importance in the field of number theory.
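The failure of unique factorization in S can be verified by brute force. The Python sketch below is an illustration under our own naming (`in_S`, `s_divisors`, `is_s_prime`, and `s_factorizations` are not from the text); it lists every way to write a member of S as a product of S-primes.

```python
def in_S(n: int) -> bool:
    """Membership test for S = {1, 4, 7, 10, ...}: one more than a multiple of three."""
    return n >= 1 and n % 3 == 1


def s_divisors(n: int) -> list[int]:
    """Divisors of n that themselves belong to S."""
    return [d for d in range(1, n + 1) if in_S(d) and n % d == 0]


def is_s_prime(n: int) -> bool:
    """An S-prime is a member of S with exactly two divisors inside S."""
    return in_S(n) and len(s_divisors(n)) == 2


def s_factorizations(n: int, smallest: int = 4) -> list[list[int]]:
    """All ways to write n as a product of S-primes listed in non-decreasing order."""
    if n == 1:
        return [[]]
    found = []
    for p in range(smallest, n + 1, 3):  # stepping by 3 from 4 stays inside S
        if n % p == 0 and is_s_prime(p):
            for rest in s_factorizations(n // p, p):
                found.append([p] + rest)
    return found


print(s_divisors(10))         # [1, 10]
print(s_divisors(16))         # [1, 4, 16]
print(s_factorizations(28))   # [[4, 7]]
print(s_factorizations(100))  # [[4, 25], [10, 10]]
```

The last line shows two genuinely different S-prime factorizations of 100, which is exactly what the fundamental theorem of arithmetic rules out for the natural numbers.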

Actually finding a number’s prime decomposition (also known as the prime
factorization) is relatively straightforward, provided we are aware that our
number is divisible by some factor and we know what that factor is. Suppose,
however, that we come across some large number and wish to find its prime
factors. How would we do this? How would we know whether it even had prime
factors other than itself?

If the situation were different, and we wished to multiply two large numbers, our
task would be easy—we have good, efficient algorithms for multiplying numbers
of any size. Factoring a large number, however, is exceedingly difficult. Most
factoring methods involve some version of the “trial and error” strategy. Even
for a relatively small number, such as 527, our only real choice is to try dividing
it by potential factors until one of them divides it evenly. In this case, it wouldn’t
take us too long to find that 527 = 17 × 31. For a very large number, however,
going through all the possibilities could take an intractably long time.
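To get a feel for the asymmetry, the following Python sketch counts how many trial divisions are needed before a factor turns up. It is a deliberately naive illustration (the name `first_factor_with_count` is ours), not a description of how serious factoring software works.

```python
def first_factor_with_count(n: int) -> tuple[int, int]:
    """Try 2, 3, 4, ... as divisors of n; return (first factor found, divisions tried).

    If no divisor up to the square root of n works, n is prime and is returned itself.
    """
    trials = 0
    d = 2
    while d * d <= n:
        trials += 1
        if n % d == 0:
            return d, trials
        d += 1
    return n, trials


factor, trials = first_factor_with_count(527)
print(factor, 527 // factor, trials)  # 17 31 16
```

Multiplying 17 by 31 takes a single step, while recovering 17 from 527 took sixteen trial divisions; the gap widens dramatically as the numbers grow.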

If someone took two large prime numbers and multiplied them together and
then asked us to find the prime factorization of that composite number, we
would be in trouble. The number we are given has only two prime factors, and if
we don’t know either of them, there is no known efficient algorithm for finding
them. This “one-way-street” aspect of multiplication and factoring will play a
key role in our upcoming discussion of encryption. Before we tackle encryption,
however, we should take a closer look at prime numbers, for they themselves
hold a great deal of mystery.

SECTION 1.5
Questions about Primes
• An Infinitude of Primes
• Is It Prime?
• Structure of the Primes
• Riemann’s Hypothesis

An Infinitude of Primes
• Euclid proved that, given a finite list of primes, one can always create a new
prime not on that list, which implies that there are an infinite number of
primes.

Let’s take a look at a partial list of primes, with spaces left to represent the
composite numbers that occur between them:

_ 2 3 _ 5 _ 7 _ _ _ 11 _ 13 _ _ _ 17 _ 19 _ _ _ 23 _ _ _ _ _ 29 _ 31 _ _ _ _ _ 37 _ _ _
41 _ 43 _ _ _ 47 _ _ _ _ _ 53…

The primes seem to be haphazardly distributed; that is, there is not a clear
pattern to them. It seems also, with the exception of the occasional pair of
“twin primes” (primes separated by only one number), that the gaps between
adjacent primes generally tend to become larger as the numbers themselves
get larger. From this evidence, one could plausibly think that at some point the
stream of primes dries up and that there might be no primes above some given
number. The question of how many primes there are is an interesting one that
arises quite naturally. Is the list finite or infinite?

Euclid thought about this problem and found a way to show that, in fact, there
are an infinite number of primes to be found. He did this by demonstrating that
from any finite collection of primes it is always possible to find one more. For
example, take the finite set {2, 3, 5}. Euclid suggested multiplying all members
of the set together and adding one. For our set, this gives us 31 (2 × 3 × 5 + 1 =
31), which cannot be evenly divided by any of the primes on our list, for it always
gives a remainder of one. In this case 31 is itself prime, so it is a “new” prime.

Remember that in math it’s good to keep a skeptical, questioning attitude
concerning pronouncements. Perhaps that example was just an anomaly,
and maybe 2, 3, 5, and 31 are the only primes that exist. Applying Euclid’s test
strategy to this expanded set, we would multiply them together and get 930, to
which we add one to end up with 931. Dividing 931 by 2, 3, 5, or 31 always leaves
a remainder of one. Does this mean that 931 is prime? No, not necessarily, but
it does mean that none of the numbers 2, 3, 5, and 31 is a factor of 931. It might
not be obvious, but 931 is actually the product of 7, 7, and 19. So, even though
931 is not itself prime, it is a composite number whose prime factorization
reveals new prime numbers (7 and 19) not in our original list. Again, our list
was incomplete.

The point is that Euclid showed that multiplying together a finite list of primes
and adding one will always produce a new number which, if not itself prime,
factors into new primes not on the list. We can always find a new prime using
this method. This implies that any finite list will never include all of the prime
numbers; therefore, the list of primes must be infinite!
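Euclid's construction is easy to try out. The Python sketch below is an illustration with names of our own (`prime_factors`, `euclid_step`): it multiplies a list of primes, adds one, and factors the result to see which new primes appear.

```python
from math import prod


def prime_factors(n: int) -> list[int]:
    """Prime factors of n (n >= 2) in non-decreasing order, found by trial division."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


def euclid_step(primes: list[int]) -> tuple[int, list[int]]:
    """Return (product of the primes plus one, the distinct primes dividing that number)."""
    candidate = prod(primes) + 1
    return candidate, sorted(set(prime_factors(candidate)))


print(euclid_step([2, 3, 5]))      # (31, [31])
print(euclid_step([2, 3, 5, 31]))  # (931, [7, 19])
```

In both runs, none of the primes returned appears in the starting list, as the argument above predicts.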

The method we employed above will always uncover at least one new prime,
but it could very well miss some along the way. In our first example above, we
missed the primes in between 5 and 31, namely 7, 11, 13, 17, 19, 23, and 29. This
method obviously will not produce an exhaustive list of primes; it serves only to
generate a new prime or two, given a list of starting primes.

In the second example above, we found that this method produces a number that
may not necessarily be prime. This suggests to us that it would be nice to have
an efficient, fail-safe method for determining whether a number of any size is
prime or not.

Is It Prime?
• Determining whether or not a number is prime is a non-trivial task.

The most straightforward way to know for sure whether or not a number is
prime is to test, systematically, its divisibility by all the numbers greater than
one up to its square root. For example, if 1001 were composite, then it would
have to be the product of two numbers, a and b, neither of which is one. One of
these two numbers would have to be no larger than √1001 (they can’t both be
larger, or their product would exceed 1001), so we need only check for divisors
up to 31, because √1001 lies between 31 and 32 (32² = 1024 is the closest square
number larger than 1001). Because every number is composed essentially of
prime factors, we need only check the prime numbers less than 32; these are
2, 3, 5, 7, 11, 13, 17, 19, 23, 29, and 31. This is a feasible number of potential
divisors to check.
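The square-root shortcut can be written out as a short Python sketch. The function name `is_prime` is ours, and for simplicity the sketch tries every whole number up to the square root rather than only the primes, which does some redundant work but gives the same answer.

```python
from math import isqrt


def is_prime(n: int) -> bool:
    """Brute-force primality test: look for a divisor between 2 and the square root of n."""
    if n < 2:
        return False
    for d in range(2, isqrt(n) + 1):
        if n % d == 0:
            return False  # found a divisor, so n is composite
    return True


print(isqrt(1001))     # 31
print(is_prime(1001))  # False
print(is_prime(101))   # True
```

Here 1001 fails the test almost immediately, since it is divisible by 7.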

There are a number of familiar divisibility tests that can be used to determine if
a number is divisible by most of the single-digit numbers. One rule that is widely
known concerns divisibility by 3: a number is a multiple of 3 if its digits sum to a
multiple of 3. For example, the digits of 78 sum to 15, and 78 factors to 3 × 26.
What about a number such as 221, which doesn’t fall in line with any of
the short-cut divisibility tests? To determine whether or not it is prime, using
the brute force method described above, we would have to divide it by all of the
primes less than its square root, which is somewhere between 14 and 15. Were
we to do this, we would find that our first five tries (checking 2, 3, 5, 7, and 11)
would be unsuccessful, all yielding remainders. Only on the sixth try (dividing by
13) would we find that 221 is indeed factorable into 13 × 17.

The brute force method of testing whether or not a number is prime can prove
to be quite a headache for large numbers. For example, using a computer that
can perform tens of billions of operations per second, the test for a 100-digit
number would take about 10⁴⁰ seconds. This is vastly longer than the
current estimated age of the universe, about 13.8 billion years.

There are various tests, other than the brute force method, that can tell us
about the primeness of larger numbers. One such test uses Wilson’s theorem,
which states that a number, p, greater than one is prime exactly when
(p-1)! + 1 is a multiple of p. For example, we know that the number 7 is prime.
Then, according to Wilson’s theorem, (7-1)! + 1 = 6! + 1 = 721 should be a
multiple of 7. Indeed it is: 721 = 7 × 103. Unfortunately, as we will see in
Combinatorics Counts, computing factorials is practically infeasible for large
numbers; for example, 1000! is a number 2,568 digits long, and 1,000,000! is
more than 5.5 million digits long. Wilson’s theorem is impractical for the sizes
of numbers about which the question of primeness has not already been settled.
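Wilson's theorem can be turned into a working, if impractical, primality test. The Python sketch below is an illustration (the name `wilson_is_prime` is ours) and relies on the standard library's `math.factorial`; it is usable only for small numbers, for the reason just given.

```python
from math import factorial


def wilson_is_prime(p: int) -> bool:
    """Wilson's criterion: p > 1 is prime exactly when (p - 1)! + 1 is a multiple of p."""
    return p > 1 and (factorial(p - 1) + 1) % p == 0


print(factorial(6) + 1)    # 721
print(wilson_is_prime(7))  # True
print(wilson_is_prime(8))  # False
print([p for p in range(2, 30) if wilson_is_prime(p)])

# The size of the factorial is what makes the test impractical.
print(len(str(factorial(1000))))  # 2568
```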

The Structure of the Primes
• The answer to whether or not there is a pattern behind the primes has
eluded mathematicians for millennia.
• One can find interesting examples of both structure and randomness in the
primes.

Because directly testing whether or not an arbitrary large number is prime is
either very time-consuming or not very reliable, mathematicians have sought
to determine the primeness of a number in a totally different way, that is,
by looking for a pattern to the primes. If a pattern can be established, then it
should be possible to determine where primes do and do not occur. Describing
exactly where primes should occur on the number line would not only provide an
elegant way to know whether or not a number is prime; it would be a beautiful
result in itself.


The first person generally credited with attempting to describe a pattern behind
the primes was the Greek philosopher and mathematician Eratosthenes. He
is credited with creating a method, now called the Sieve of Eratosthenes, for
systematically identifying all prime numbers. His method is simple: begin with
a finite list of whole numbers and eliminate all the multiples of 2 greater than 2,
then all the multiples of 3 greater than 3, then all the multiples of 5 greater than
5, and so on until the only numbers left are the ones that are not multiples of any
other numbers: these will be the primes.
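The sieve described above can be sketched in Python as follows. This is a minimal version under our own naming (`sieve`), and it also crosses out 0 and 1, which are not prime, before the crossing-out of multiples begins.

```python
def sieve(limit: int) -> list[int]:
    """Return all primes up to limit (limit >= 2) using the Sieve of Eratosthenes."""
    is_candidate = [True] * (limit + 1)
    is_candidate[0] = is_candidate[1] = False  # 0 and 1 are not prime
    for p in range(2, limit + 1):
        if is_candidate[p]:
            # p survived every earlier pass, so it is prime: cross out its larger multiples.
            for multiple in range(2 * p, limit + 1, p):
                is_candidate[multiple] = False
    return [n for n, keep in enumerate(is_candidate) if keep]


primes = sieve(100)
print(primes)
print(len(primes))  # 25
```

Running it on the numbers 1 through 100 leaves the 25 primes below 100.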

[Figure: Item 1098. The Sieve of Eratosthenes applied to the whole numbers 1
through 100, arranged in a 10 × 10 grid. Primes are white and composites are
orange.]

This method, however, is not terribly different or better for larger numbers
than the brute force method we discussed earlier. It does, however, provide
some possible clues to the structure of the primes available to us if, rather than
looking at each number individually, we look at the structure as a whole. For
example, can we find a pattern behind the primes by focusing on the spacing
between them? This strategy is similar to looking at the negative space around
a picture instead of looking directly at the picture itself.

[Figure: A cup or two faces?]

We can think of the space between primes as “prime deserts,” strings of
consecutive numbers, none of which are prime. There is a trick, however, for
finding prime deserts of whatever length we can think of. For example,
7! = 7 × 6 × 5 × 4 × 3 × 2 × 1 = 5,040. This number is, of course, not prime,
but neither is 5,040 + 2, nor 5,040 + 3, nor 5,040 + 4, ..., nor 5,040 + 7,
because 5,040 + k is divisible by k for each k from 2 to 7. In general, for
any number N, the string N! + 2, N! + 3, ..., N! + N identifies a string of
N − 1 consecutive composite numbers. Setting N = 1,000,000 shows that it is
possible to find 999,999 consecutive whole numbers, none of which are prime.
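
The factorial trick can be checked directly for small N. The following is a minimal sketch; the helper names `smallest_divisor` and `prime_desert` are our own and are used only for illustration.

```python
from math import factorial

def smallest_divisor(n):
    """Return the smallest divisor of n greater than 1 (n itself if n is prime)."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n

def prime_desert(n):
    """Return the n - 1 consecutive numbers n! + 2, n! + 3, ..., n! + n."""
    base = factorial(n)
    return [base + k for k in range(2, n + 1)]

# Each number in the desert has a divisor smaller than itself, so none is prime.
for value in prime_desert(7):
    print(value, "is divisible by", smallest_divisor(value))
```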

Paradoxically, just as we can find prime deserts of whatever length we choose,
we can also find strings of primes (sequences of primes that are evenly spaced
on the number line) of whatever length we choose. A good example of such
strings is the pairs called “twin primes.” These are pairs of primes that
are spaced two numbers apart, such as 11 and 13, or 17 and 19. The set of 3,
5, and 7 forms another such string; again, each number is equally spaced from
the others. In yet another example, the numbers 5, 17, 29, 41, and 53 are all
prime and are all spaced evenly at 12 numbers apart. Such strings are called
arithmetic progressions. In recently completed research by Terence Tao at
UCLA and Ben Green at the University of Bristol, arithmetic progressions of
primes of any finite length were proven to exist. Not only are there prime
deserts of arbitrary length, but there are also strings of equally spaced
primes of whatever finite length we choose. If there is a pattern to the
primes, it would have to account for these paradoxical features.
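
The small progressions mentioned above are easy to verify by machine. Here is a brief sketch; `is_prime` and `is_prime_progression` are illustrative names of our own, and trial division is used only because the numbers are tiny.

```python
def is_prime(n):
    """Trial division: adequate for the small numbers used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def is_prime_progression(start, step, length):
    """True if start, start + step, ... (length terms in all) are all prime."""
    return all(is_prime(start + i * step) for i in range(length))

print(is_prime_progression(11, 2, 2))   # the twin primes 11 and 13
print(is_prime_progression(3, 2, 3))    # 3, 5, 7
print(is_prime_progression(5, 12, 5))   # 5, 17, 29, 41, 53
print(is_prime_progression(5, 12, 6))   # the next term, 65, is 5 x 13
```

The last call shows that this particular progression stops after five terms; longer progressions exist, but they must be found elsewhere on the number line.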

A tantalizing clue as to the fundamental structure of the primes was discovered
by Marin Mersenne, a French music theoretician and mathematician of the
16th and 17th centuries. Specifically, in his study of number theory he became
interested in the powers of 2 and found an odd coincidence.

CHART OF POWERS OF TWO

Power of 2                   2^0  2^1  2^2  2^3  2^4  2^5  2^6  2^7  2^8
Multiplied out                 1    2    4    8   16   32   64  128  256
One less than a power of 2     0    1    3    7   15   31   63  127  255

Notice that, in the above chart, one less than a power of 2 with a prime
exponent yields a prime number: $2^2 - 1 = 3$, $2^3 - 1 = 7$, $2^5 - 1 = 31$,
and $2^7 - 1 = 127$. This seems to give us a potential way to predict where
prime numbers fall, and it also shows a startling relationship. If the zero
power is disregarded, these so-called Mersenne primes appear in the 2nd, 3rd,
5th, 7th, etc. positions in the sequence; in other words, in the prime
positions. They are prime numbers in prime positions! Alas, this is but a
coincidental occurrence in the smaller numbers, as the pattern does not hold
for all prime powers of 2. For example, $2^{11} - 1$ is equal to 2,047, which
is the product of 89 and 23.
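
To see where the pattern holds and where it fails, we can test one less than 2 raised to each small prime exponent. This is a minimal sketch with an illustrative helper, `is_prime`, that uses plain trial division.

```python
def is_prime(n):
    """Trial division: fine for the small numbers tested here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

# For each prime exponent p, check whether 2^p - 1 is itself prime.
for p in [2, 3, 5, 7, 11, 13]:
    candidate = 2 ** p - 1
    print(p, candidate, is_prime(candidate))
```

The exponent 11 is the first prime for which the candidate fails to be prime, since 2,047 = 23 × 89.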

Mersenne primes, nonetheless, are still very important in the modern study of
numbers. In fact, the largest prime number that we know about is a Mersenne
prime. As of 2006, this number was $2^{32582657} - 1$, a number that is 9,808,358
digits long. It would take an average person more than 100 days just to read it!
We should note that in addition to generating non-prime numbers, Mersenne’s
method also is not guaranteed to predict the positions of all of the primes.

We have so far looked at a few different approaches to predicting where
primes should be found on the number line. From the brute force method of
Eratosthenes to the elegant ideas of Tao, Green, and Mersenne, we have seen
that the primes do not give up their secrets easily. Perhaps, in asking where
exactly each prime falls on the number line, we are asking the wrong question.

Gauss, of fundamental theorem of arithmetic fame, took a different approach.
Finding a straightforward attack daunting, he examined matters from a different
perspective. He examined the density of primes below a certain number.

[Figure: Graph of prime density. Horizontal axis: N, from 0 to 200. Vertical
axis: number of primes less than N.]

He found that the density of primes (that is, a measure of the number of primes
in relation to all the numbers below any given number) follows a fairly regular
trend. By looking at tables of the number of primes below a given number, n,
Gauss observed that this count always seems to be approximately the ratio
$n / \log n$, where log is the natural logarithm, which implies that the nth
prime is approximately $n \log n$. This finding, while not
exactly what we were looking for in terms of predicting where primes appear on
the number line, represents a significant result in the search for a “law of the
primes.”
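
Gauss worked from printed tables, but the same comparison can be sketched numerically. In this illustrative snippet, `count_primes_up_to` is our own helper (a simple sieve), and the loop prints the actual count next to the estimate n / log n.

```python
from math import log

def count_primes_up_to(n):
    """Count the primes less than or equal to n with a simple sieve."""
    if n < 2:
        return 0
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, int(n ** 0.5) + 1):
        if is_prime[p]:
            for multiple in range(p * p, n + 1, p):
                is_prime[multiple] = False
    return sum(is_prime)

for n in (100, 1_000, 10_000, 100_000):
    actual = count_primes_up_to(n)
    estimate = n / log(n)  # log is the natural logarithm
    print(n, actual, round(estimate, 1), round(actual / estimate, 3))
```

The last column is the ratio of the true count to the estimate; it stays close to 1 over this range, which is the regularity Gauss noticed.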

Riemann’s Hypothesis
• Riemann, following in the footsteps of Euler, thought that the pattern of
the primes was tied to the zeta function, which is a generalization of the
harmonic series.

Gauss’s conjecture about the distribution of the primes was an important first
step. By looking at the problem from a different perspective, that of prime
density, and by examining functions that would give not a prime number directly,
but rather the density of primes below a specified number, Gauss opened the
exploration of the primes to new disciplines of mathematics. He was not the
only one working on this problem, however. The Swiss mathematician Leonhard
Euler, working in a similar vein, introduced yet another approach to discovering
a pattern behind the primes.

Euler was fascinated by a relative of the harmonic series known as the zeta
function. The harmonic series is simply the sum of the reciprocals of all the
whole numbers.


$$\sum_{k=1}^{\infty} \frac{1}{k} = 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8} + \cdots$$

What tipped Euler off that the harmonic series’ cousin, the zeta function, might
be useful in exploring the structure of the primes was that it contains every
natural number.

$$\zeta(x) = 1 + \frac{1}{2^x} + \frac{1}{3^x} + \frac{1}{4^x} + \cdots = \sum_{n=1}^{\infty} \frac{1}{n^x}$$

Notice that the zeta function expresses the sum of the reciprocals of the xth
power of every number. We can input any value we choose for x and study the
behavior of the series. For any value of x less than or equal to one, the series
diverges—grows infinitely large. For values of x greater than one, the series
converges to a finite value. However, just because we know that the series
converges does not mean that we can always find the value upon which it settles.
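
The difference between divergence and convergence can be glimpsed by adding up more and more terms. The sketch below uses an illustrative helper, `zeta_partial_sum`, of our own naming. A finite sum cannot prove anything about an infinite series, so the output is only suggestive.

```python
def zeta_partial_sum(x, terms):
    """Add the first `terms` terms of 1 + 1/2^x + 1/3^x + ..."""
    return sum(1 / n ** x for n in range(1, terms + 1))

for terms in (10, 1_000, 100_000):
    harmonic = zeta_partial_sum(1, terms)   # x = 1: keeps growing slowly
    squares = zeta_partial_sum(2, terms)    # x = 2: settles toward a limit
    print(terms, round(harmonic, 4), round(squares, 6))
```

For x = 2 the sums settle near 1.644934, the value pi squared over 6 that Euler famously identified, while for x = 1 they continue to creep upward.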

Euler played with the zeta function and found that he was able to express every
element of the infinite sum as a product of the reciprocals of prime numbers.
This insight tied the study of prime numbers to the study of infinite series and
paved the way for one of the most elusive hypotheses in mathematics.

EULER’S FACTORIZATION

$$\zeta(x) = \frac{1}{1^x} + \frac{1}{2^x} + \frac{1}{3^x} + \cdots + \frac{1}{n^x} + \cdots$$

$$= \left(1 + \frac{1}{2^x} + \frac{1}{4^x} + \cdots\right) \times \left(1 + \frac{1}{3^x} + \frac{1}{9^x} + \cdots\right) \times \cdots \times \left(1 + \frac{1}{p^x} + \frac{1}{(p^2)^x} + \cdots\right) \times \cdots$$
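
Euler’s factorization can be checked numerically at x = 2 by comparing a truncated product over primes with a truncated sum over all whole numbers. This is a sketch under stated assumptions: the function names are our own, both the sum and the product are cut off at finite limits, and each factor 1 + 1/p^x + 1/(p^2)^x + ... is replaced by its closed form as a geometric series, 1 / (1 − 1/p^x).

```python
def primes_up_to(limit):
    """Return the primes up to limit using a simple sieve."""
    is_prime = [True] * (limit + 1)
    is_prime[0:2] = [False, False]
    for p in range(2, int(limit ** 0.5) + 1):
        if is_prime[p]:
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
    return [n for n in range(2, limit + 1) if is_prime[n]]

def euler_product(x, prime_limit):
    """Multiply the factors 1 / (1 - 1/p^x) for every prime p up to prime_limit."""
    product = 1.0
    for p in primes_up_to(prime_limit):
        product *= 1 / (1 - p ** -x)
    return product

def zeta_partial_sum(x, terms):
    """Add the first `terms` terms of 1 + 1/2^x + 1/3^x + ..."""
    return sum(1 / n ** x for n in range(1, terms + 1))

print(euler_product(2, 1000))
print(zeta_partial_sum(2, 100_000))
```

The two printed values agree to about three decimal places, even though one is built only from primes and the other from every whole number.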

Bernhard Riemann, one of the most influential mathematicians of the 19th
century, built upon Euler’s discovery, specifically exploring the consequences
of using complex numbers (numbers with both a real and an imaginary part) as
the inputs to Euler’s version of the zeta function. This effectively
constructed a larger landscape of numbers to explore, one that gave a better
perspective on how the elusive primes might be distributed along the number
line. Riemann proposed that the distribution of the primes was tied to the
zeroes of the zeta function when considered over the complex numbers.

If you imagine the entire collection of complex numbers to be a landscape with
hills and valleys, the zeta function is like a road that traverses it. We can think
of the zeroes of the zeta function as being the points at which the elevation of
the road is at sea level. Riemann proposed that the locations of these points
govern how the primes are distributed.

Riemann’s hypothesis was groundbreaking, but it remains a hypothesis,
unproven, to this day. Given the history of the problem it purportedly solves, and
the centuries of effort that have led to it, a proof of Riemann’s hypothesis would
be quite valuable. As of this writing, there is a $1,000,000 reward available to
any person who shows definitively whether it is true or false.

The preceding discussion should have given you a sense that primes are
intrinsically fascinating. They are the fundamental basis of arithmetic—yet, they
are extremely mysterious. They surprisingly represent both the most basic and
the most cutting-edge aspects of mathematics. On a daily basis, however, we
are not typically conscious of them. Because of this, it is tempting to believe that
playing with primes is the purview of pure mathematics, with little relevance to
our daily lives. This couldn’t be further from the truth, especially in our modern
age when numbers are used to transmit all types of information, the security of
which is of great importance. Primes provide the building blocks of encryption
schemes that protect our most sensitive and valuable data.


SECTION 1.6

Encryption
• History
• Caesar Ciphers
• Modular Arithmetic
• Prime Moduli

History
• The need to send private messages has ancient roots.
• Most historical encryption schemes used private keys.

When sending sensitive information, such as logging onto a bank account via
the World Wide Web, we want the intended receiver to be able to “read” the
message, and we also want any unintended recipient (or thief) of the message to
be thwarted. To accomplish this, we need a system of encryption, one that will
be impervious to even the cleverest of hackers. Online transactions are only the
most recent example of situations in which information must be protected while
in transit. There have been many different systems of encoding throughout the
years, many of them mathematical in nature.

Mathematical encryption relies on concepts such as modular arithmetic and the
fundamental properties of prime numbers to make messages incomprehensible
to unintended recipients. Obviously, encryption requires operations that are
easy to perform one way (i.e., encoding), yet nearly impossible to perform the
other way (i.e., decoding) without a key. Prime numbers can help us with this.
Recall that multiplying two primes is easy, whereas factoring a product of two
unknown primes can be extremely difficult and time-consuming.

The problem of sending a vital message to someone without it being discovered,
or deciphered, is an ancient one. This need crops up in a variety of situations,
but an all-too-common need for encryption has arisen time after time on
the battlefield. When attacking an enemy, it is critical to be able to send
coordinating information to your involved units without the enemy discovering
your plans. By encrypting messages in some fashion, you hope that, even if a
message is intercepted, its contents will be inaccessible to unintended readers.

Encryption schemes have come in many forms throughout the ages. Many early
schemes could hardly be called “encryption”—“concealment” would be a more
appropriate term. A particularly striking example is the story of Histiaeus, a
5th-century-BC Greek provincial ruler, or tyrant, who wanted to encourage a
fellow tyrant of a neighboring state to revolt against the Persian king, Darius. To
convey his instructions securely, Histiaeus shaved the head of his messenger,
wrote the message on his scalp, and then waited for the hair to re-grow. The
messenger, apparently carrying nothing of interest, traveled without being
harassed. Upon arriving at his destination, the messenger shaved his head and
pointed it toward the intended recipient. Although this was perhaps clever for
its time, such a system would quickly be compromised as word of its use spread.

Caesar Ciphers
• The simplest encryption scheme is to jumble the letters of the alphabet
according to some rule.

Clearly, a system of simply concealing one’s message is effective only up to
a point: a thorough search is all that is needed to expose the scheme and
intercept the message. To ensure message security, one would want an
interloper, were he or she to find the communication, to be baffled by what it
actually says. A simple scheme, whose first use is attributed to Julius Caesar,
is to replace each letter of the alphabet with a different letter according to some
rule. For example:

CAESAR CIPHER USING RULE OF “ADD THREE”

Plain:   A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Cipher:  D E F G H I J K L M N O P Q R S T U V W X Y Z A B C

This scheme replaces each letter with the letter that comes three letters later
in the sequence of the alphabet. In effect, this encoding scheme just shifts the
alphabet three letters to the left, with the substitutions for the last three letters
coming from the beginning of the sequence. Our key, let’s call it “k,” is 3.
Let’s now shift our thinking from letters to numbers; we can easily assign each
letter of the alphabet to a corresponding number as follows:

CAESAR CIPHER: CORRESPONDING LETTERS TO NUMBERS

Letter:  A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z
Number:  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26

Applying the same “shift by 3” that was just used in the preceding example to
this new system creates the following encryption scheme:

CAESAR CIPHER WITH NUMBERS SHIFTED THREE PLACES

Letter:  A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z
Number:  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26  1  2  3

To encrypt a letter, we simply add three to the number with which it was
originally paired. For example:

A, originally paired with “1,” gets shifted, or encrypted, to “4,”

and

M, originally paired with “13,” gets shifted to “16.”

When we get to X, however, we have a problem. Following the current
encryption scheme, X, originally associated with “24,” would get shifted to “27.”
However, 27 isn’t anywhere in our list of possible numbers. So, we can wrap
around and start over at 1 after hitting 26 instead of using more numbers. To
express this adjustment mathematically, we could write:

24 + 3 = 1

This strange sort of arithmetic might seem somewhat familiar because it is
related to how we apply arithmetic to cyclical time. For example, if it is four
o’clock and we add twelve hours, it is again four o’clock, ignoring the A.M./P.M.
difference. This “clock math” is known as “modular arithmetic.”
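
The wrap-around rule for the shift cipher can be written out as a short program. The following is a minimal sketch: the names `shift_number`, `caesar_encrypt`, and `caesar_decrypt` are our own, letters are numbered 1 through 26 as in the tables above, and anything that is not a letter is passed through unchanged.

```python
import string

ALPHABET = string.ascii_uppercase  # "A" is paired with 1, ..., "Z" with 26

def shift_number(number, key):
    """Shift a letter number in 1..26 by key, wrapping around after 26."""
    return (number - 1 + key) % 26 + 1

def caesar_encrypt(message, key):
    """Replace each letter with the letter `key` places later in the alphabet."""
    encrypted = []
    for ch in message.upper():
        if ch in ALPHABET:
            number = ALPHABET.index(ch) + 1
            encrypted.append(ALPHABET[shift_number(number, key) - 1])
        else:
            encrypted.append(ch)  # leave spaces and punctuation alone
    return "".join(encrypted)

def caesar_decrypt(message, key):
    """Undo the shift by shifting the same distance in the other direction."""
    return caesar_encrypt(message, -key)

print(shift_number(24, 3))          # X wraps around to 1
print(caesar_encrypt("PRIMES", 3))
print(caesar_decrypt(caesar_encrypt("PRIMES", 3), 3))
```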

Modular Arithmetic
• Modular arithmetic is math that incorporates “wrap-around” effects.

Modular arithmetic is a key to understanding modern forms of encryption, and
it also demonstrates interesting properties of prime numbers. It incorporates
“wrap around” effects by having some number other than zero play the role of
zero in addition. For example, on an analog clock:

[Figure: An analog clock face numbered 1 through 12.]

The number 12 behaves like zero, because adding 12 hours to any time (again,
ignoring A.M. or P.M. differences) doesn’t change anything. A typical addition
problem in this scheme would be:

3 + 12 = 3

To write a number in this system, we have to be conscious of how many
multiples of 12 it contains. A number such as 5 has no multiples of 12 in it, so
we would simply write 5. The number 17, however, is 5 + 1(12), which is also
equal to 5 in this system. Similarly, the number 29 is 5 + 2(12), which is also
equal to 5, again because the number twelve is acting like zero in this system. In
such a system, 12 is called the “modulus.” To describe the number 29, we would
say that it is “congruent to 5 modulo 12,” or 29 ≡ 5 mod 12.
Note that:
5 ≡ 5 mod 12
17 ≡ 5 mod 12
29 ≡ 5 mod 12
etc.
Any number of the form 5 + n(12) will be congruent to 5 in this system.
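
Most programming languages have a remainder operator that performs exactly this reduction. Here is a short sketch in Python, where the helper names `clock_value` and `congruent` are our own choices for illustration.

```python
MODULUS = 12

def clock_value(n, modulus=MODULUS):
    """Strip out every multiple of the modulus, leaving a value in 0..modulus-1."""
    return n % modulus

def congruent(a, b, modulus=MODULUS):
    """True if a and b differ by a whole multiple of the modulus."""
    return (a - b) % modulus == 0

for n in (5, 17, 29):
    print(n, "is congruent to", clock_value(n), "mod", MODULUS)

print(clock_value(3 + 12))   # adding 12 changes nothing
print(congruent(12, 0))      # 12 plays the role of zero
```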

With this new foundation in modular arithmetic, let’s return to the Caesar
cipher from before. Previously, the integrity of the code was dependent on the
security of our key (i.e., only we and our intended correspondent can know it).
The integrity of our code is vulnerable, however, to anyone who suspects that we
are simply shifting letters. If our message is sufficiently long, one could rather
easily figure out our key by examining letter frequencies. That is, it is highly
likely that the letter that appears most frequently in our code represents the
letter “E”, which is the most commonly used letter in English. After the code
is “cracked” in this way, the whole scheme crumbles because every letter is
shifted by the same amount!
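
The frequency attack can be sketched as follows. This is an illustration only: the function names and the sample message are our own, and the guess works here because the sample was chosen so that “E” really is its most common letter. Short or unusual messages can fool this simple heuristic.

```python
from collections import Counter
import string

ALPHABET = string.ascii_uppercase

def shift_text(text, key):
    """Shift every letter by key places, wrapping around the alphabet."""
    result = []
    for ch in text.upper():
        if ch in ALPHABET:
            result.append(ALPHABET[(ALPHABET.index(ch) + key) % 26])
        else:
            result.append(ch)
    return "".join(result)

def guess_key(ciphertext):
    """Assume the most frequent letter in the ciphertext stands for E."""
    letters = [ch for ch in ciphertext.upper() if ch in ALPHABET]
    most_common_letter, _ = Counter(letters).most_common(1)[0]
    return (ALPHABET.index(most_common_letter) - ALPHABET.index("E")) % 26

secret = shift_text("MEET ME HERE BEFORE THE EVENING ENDS", 3)
key = guess_key(secret)
print(secret)
print(key)
print(shift_text(secret, -key))
```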

Let’s shift our thinking back to a number code again. To increase security
and make our encryption “uncrackable” by a simple shift, we could multiply
numbers instead. In this case, the code key would be the number by which we
choose to multiply the original numbers. Multiplying in modular arithmetic is
just like multiplying in normal arithmetic, except that whenever a product is
greater than or equal to our modulus, we use the wrap-around effect and keep
only what remains after removing multiples of the modulus. For example, using
our clock from before (mod 12),
4 × 2 ≡ 8 mod 12 (no surprise), but 4 × 3 ≡ 0 mod 12, and 4 × 4 ≡ 4 mod 12.
A multiplication table for mod 12 looks like this:

MOD 12 MULTIPLICATION CHART

 ×    1   2   3   4   5   6   7   8   9  10  11   0
 1    1   2   3   4   5   6   7   8   9  10  11   0
 2    2   4   6   8  10   0   2   4   6   8  10   0
 3    3   6   9   0   3   6   9   0   3   6   9   0
 4    4   8   0   4   8   0   4   8   0   4   8   0
 5    5  10   3   8   1   6  11   4   9   2   7   0
 6    6   0   6   0   6   0   6   0   6   0   6   0
 7    7   2   9   4  11   6   1   8   3  10   5   0
 8    8   4   0   8   4   0   8   4   0   8   4   0
 9    9   6   3   0   9   6   3   0   9   6   3   0
10   10   8   6   4   2   0  10   8   6   4   2   0
11   11  10   9   8   7   6   5   4   3   2   1   0
 0    0   0   0   0   0   0   0   0   0   0   0   0

Notice first that we are using zero in place of the number twelve. Also, notice
that some of the columns do not contain all of the numbers in our system,
namely all of the numbers between 0 and 11 inclusive. As you can see, this is
true for the rows, as well. If we were to choose these numbers as our key for
encrypting a message, we would be setting our recipient up for confusion. For
example, let’s say that our encryption scheme involves multiplying the numbers
that make up our message by 4 in this mod 12 system. For simplicity’s sake,
let’s also say we are using a 12-letter alphabet.

CORRESPONDING 12-LETTER ALPHABET

Letter:  A  B  C  D  E  F  G  H  I  J  K  L
Number:  1  2  3  4  5  6  7  8  9 10 11  0

This would serve as the top row of our mod 12 table. Because we are using k = 4
as our key, we are going to look to the 4th row of the table.

TABLE MOD 12: ROW 4 KEY

      A   B   C   D   E   F   G   H   I   J   K   L
 ×    1   2   3   4   5   6   7   8   9  10  11   0
 1    1   2   3   4   5   6   7   8   9  10  11   0
 2    2   4   6   8  10   0   2   4   6   8  10   0
 3    3   6   9   0   3   6   9   0   3   6   9   0
 4    4   8   0   4   8   0   4   8   0   4   8   0   (key row, k = 4)
 5    5  10   3   8   1   6  11   4   9   2   7   0
 6    6   0   6   0   6   0   6   0   6   0   6   0
 7    7   2   9   4  11   6   1   8   3  10   5   0
 8    8   4   0   8   4   0   8   4   0   8   4   0
 9    9   6   3   0   9   6   3   0   9   6   3   0
10   10   8   6   4   2   0  10   8   6   4   2   0
11   11  10   9   8   7   6   5   4   3   2   1   0
 0    0   0   0   0   0   0   0   0   0   0   0   0

The letter D, which is 4, would be encoded as:

4 (letter) × 4 (key) ≡ 4 (encrypted letter) mod 12

G would be:

7 (letter) × 4 (key) ≡ 4 (encrypted letter) mod 12

J would be:

10 (letter) × 4 (key) ≡ 4 (encrypted letter) mod 12

The same goes for the letter A.

So, suppose that the message we want to encrypt is “GDJA.” Following this
encryption scheme, the message would be encoded as “4444.” How would the
recipient know which 4’s represent D’s and which are supposed to be J’s, G’s, or
A’s? It would be impossible for our recipient to decode our message. For this
reason, choosing 4, or any other key for which the row does not contain every
number, is a bad idea!
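
The collision is easy to reproduce. This sketch assumes the 12-letter alphabet above (A = 1 through K = 11, with L = 0); the names `letter_to_number` and `encrypt` are our own.

```python
LETTERS = "ABCDEFGHIJKL"  # A=1, B=2, ..., K=11, L=0 (12 acts as zero)

def letter_to_number(letter):
    """Number a letter 1..11, with the twelfth letter wrapping around to 0."""
    return (LETTERS.index(letter) + 1) % 12

def encrypt(message, key, modulus=12):
    """Multiply each letter's number by the key and reduce by the modulus."""
    return [(letter_to_number(ch) * key) % modulus for ch in message]

print(encrypt("GDJA", 4))   # every letter collapses to the same number
print(encrypt("GDJA", 7))   # a different key keeps the letters distinct
print(len(set(encrypt(LETTERS, 4))), "distinct codes with key 4")
```

With key 4, the whole alphabet is squeezed into just three code numbers, so the recipient has no way to tell the letters apart.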

Note, however, that 7 contains all the numbers in its corresponding row and
column. This means that using 7 as a key would give each letter a unique
number in our encryption scheme.

Seven is not the only number we could choose, of course. A quick check of the
table shows that both 5 and 11 would work, as well. Notice also that 5, 7, and
11, the three numbers (other than 1, which leaves every letter unchanged) that
have all the potential code numbers present in their rows and columns, share no
common factors with the modulus, 12. This is because they are prime but are
not factors of 12. The numbers 2 and 3 are also prime, but they are factors of
12, so their rows and columns include repetitions.
This suggests that prime numbers have some interesting properties in modular
arithmetic, especially as it relates to making sure that every letter has a unique
encryption. The power of prime numbers in codes, however, does not stop
there.
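
The connection between usable keys and shared factors can be checked by brute force. In this sketch, `row` and `usable_keys` are illustrative names of our own; the last lines compare the result with the keys whose greatest common divisor with the modulus is 1.

```python
from math import gcd

def row(key, modulus):
    """One row of the multiplication table: key times each number, reduced."""
    return [(key * n) % modulus for n in range(modulus)]

def usable_keys(modulus):
    """Keys whose row contains every number from 0 to modulus - 1 exactly once."""
    return [k for k in range(1, modulus)
            if sorted(row(k, modulus)) == list(range(modulus))]

print(usable_keys(12))
print([k for k in range(1, 12) if gcd(k, 12) == 1])
print(usable_keys(7))
```

For the modulus 12, both lists come out the same, and for the prime modulus 7, every nonzero key is usable.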

Prime Moduli
• Using a prime modulus ensures that all reciprocals exist.
• Arithmetic with prime moduli forms the theoretical underpinnings of RSA
encryption.

To maximize the potential number of encoding keys, we would ideally like to
have a table that includes every possible value in every column and every row.
With such a table, any key we choose will accurately transmit our message;
that is, our recipient will be able to decode the message with no ambiguity. We
noticed that keys that share factors with the modulus, or “size of our clock,” are
no good. So, what if we choose a modulus that is unlikely to share factors with
other numbers? A prime number has no factors other than 1 and itself, so it
seems that any prime would be a good choice of modulus in this situation.

For example, the mod 7 multiplication table looks like this:

MOD 7 MULTIPLICATION TABLE

 ×    1   2   3   4   5   6   0
 1    1   2   3   4   5   6   0
 2    2   4   6   1   3   5   0
 3    3   6   2   5   1   4   0
 4    4   1   5   2   6   3   0
 5    5   3   1   6   4   2   0
 6    6   5   4   3   2   1   0
 0    0   0   0   0   0   0   0

This modulus would correspond to a seven-letter alphabet:

MOD 7 ALPHABET

Letter:  A  B  C  D  E  F  G
Number:  1  2  3  4  5  6  0

Would a system such as this be harder to crack? Looking at our mod 7 table, we
can see that for a key of 5, which again means that we look at row 5, the letter
D gets mapped, or encoded, to 6. If someone were to figure out, perhaps by
analyzing letter frequencies, that D is represented by 6, and they knew that our
modulus, the size of our alphabet, was 7, then they could set up the following
congruence: 4 × k ≡ 6 mod 7. With a little more thought, they could figure out
that k must be 5. So, it seems that this encryption system is
somewhat more difficult to crack than the additive Caesar cipher, but it is by no
means impenetrable!
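
With so few possible keys, the “little more thought” amounts to trying each one. Here is a minimal sketch; the function name `recover_keys` is our own.

```python
def recover_keys(plain_number, cipher_number, modulus):
    """Try every nonzero key and keep those that send plain_number to cipher_number."""
    return [k for k in range(1, modulus)
            if (plain_number * k) % modulus == cipher_number]

# D is 4 and was seen encoded as 6 in the mod 7 system.
print(recover_keys(4, 6, 7))

# For comparison, one known pair in the mod 12 system fits several keys.
print(recover_keys(4, 4, 12))
```

With the prime modulus 7, a single known letter pins down the key exactly.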

An advantage of having a prime modulus is that every “fraction,” mod p, will
exist. Think for a minute back to mod 12.

MOD 12 MULTIPLICATION

 ×    1   2   3   4   5   6   7   8   9  10  11   0
 1    1   2   3   4   5   6   7   8   9  10  11   0
 2    2   4   6   8  10   0   2   4   6   8  10   0
 3    3   6   9   0   3   6   9   0   3   6   9   0
 4    4   8   0   4   8   0   4   8   0   4   8   0
 5    5  10   3   8   1   6  11   4   9   2   7   0
 6    6   0   6   0   6   0   6   0   6   0   6   0
 7    7   2   9   4  11   6   1   8   3  10   5   0
 8    8   4   0   8   4   0   8   4   0   8   4   0
 9    9   6   3   0   9   6   3   0   9   6   3   0
10   10   8   6   4   2   0  10   8   6   4   2   0
11   11  10   9   8   7   6   5   4   3   2   1   0
 0    0   0   0   0   0   0   0   0   0   0   0   0

Can we find a number that, when multiplied by 5, for example, yields 1? In
other words, does 5 have a reciprocal? This number would have to be our
equivalent of “1/5.” Looking at our mod 12 table, we can see that this number
is 5. So, 5 plays the role of “1/5” because 5 × 5 ≡ 1 mod 12. Does every number
in mod 12 have such a reciprocal fraction? Consider the number 2, and notice that
nowhere in the rows or columns marked by 2 does the number 1 appear. This
means that in the mod 12 system, there is no number that you can multiply 2 by
and get an answer of 1; the 2 has no reciprocal. In other words, in the mod 12
system, not all fractions exist!
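
Searching for reciprocals is a one-line loop. The following sketch uses an illustrative helper, `reciprocal`, that returns `None` when no reciprocal exists; it also runs the same search with the prime modulus 7 for comparison.

```python
def reciprocal(a, modulus):
    """Return b with a * b congruent to 1 mod modulus, or None if there is none."""
    for b in range(1, modulus):
        if (a * b) % modulus == 1:
            return b
    return None

print({a: reciprocal(a, 12) for a in range(1, 12)})
print({a: reciprocal(a, 7) for a in range(1, 7)})
```

In mod 12, only 1, 5, 7, and 11 have reciprocals; in mod 7, every nonzero number does.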

Now let’s look at our mod 7 table again.

MOD 7 MULTIPLICATION

 ×    1   2   3   4   5   6   0
 1    1   2   3   4   5   6   0
 2    2   4   6   1   3   5   0
 3    3   6   2   5   1   4   0
 4    4   1   5   2   6   3   0
 5    5   3   1   6   4   2   0
 6    6   5   4   3   2   1   0
 0    0   0   0   0   0   0   0

We figured out that because 7 is prime, every number appears in every row of
the table (apart from the row of zeros). Therefore, the number 1 appears in
every nonzero row, which implies that every nonzero number has an associated
reciprocal. In addition to a maximal choice of keys, the existence of every
reciprocal is necessary in order to create an encryption scheme that is much
harder to break than anything we have seen so far. Reciprocals are critical
because instead of adding or multiplying in a given modulus, we will need to
use exponents to create a strong encryption scheme. Specifically, we will be
raising powers to powers, such as $(2^3)^4$, which is equivalent to
$2^{3 \times 4}$, or $2^{12}$.

Suppose that our message is some number, M. In this strong encryption
scheme, we could raise M to a power, e, to encrypt it. If another number, d, is
the reciprocal of e, then it can be used to decrypt $M^e$ as follows:

$$(M^e)^d = M^{e \times d} = M^1 = M$$

In order for this strategy to work in the modulus that we are using, every
number, e, must have a reciprocal, d. This will be true only if our modulus
is prime. We’ve now caught a glimpse of the unique properties of primes in
modular arithmetic. This understanding is an important base from which we
can launch our examination of the modern standard for data encryption, RSA, in
the final section.


SECTION 1.7

RSA Encryption
• Public Key Encryption
• A Worked Example of RSA Encryption

Public Key Encryption
• RSA encryption is the modern standard for sending secure information.
• RSA is based on the modular arithmetic of primes.

In the mid 1970s, three MIT researchers, Ron Rivest, Adi Shamir, and Leonard
Adleman, discovered a new method of encryption that relies on the properties of
primes and modular arithmetic. This system, RSA (initials of the discoverers),
has remained secure for over 20 years, although countless people have
attempted to breach it. In 1982, Rivest, Shamir, and Adleman founded RSA
Security, a company that would go on to provide the standard in data encryption
used worldwide on the Internet.

RSA encryption relies not on one key, as in our previous Caesar cipher
examples, but on two. One of these keys is made public, and the other is kept
private. If you wish someone to send you an encrypted message, you simply tell
them your public key, and they can then encrypt their message so that only you
can read it. They do not need to know your private key for this system to work.
Here is a brief, and somewhat simplified, description of the process.

As we have seen throughout this unit, multiplying two primes together is easy
to do, but factoring a large number into primes is a nightmarish trial-and-error
scenario if the number has only two large primes as factors. This property of
primes provides the basis of the RSA encryption scheme.

In general, to encrypt a message using the RSA method, we choose two large
primes, p and q, and multiply them to produce a number N. The number N will
be the modulus of our system and will be made public. This is fine; because p
and q are both large and prime, their product, N, will be practically impossible to
factor and can be confidently made public knowledge. We will now use p and q
to make our public and private keys.

To make our keys, we first subtract 1 from both p and q and then multiply the
results: (p − 1)(q − 1) = T. At this point, we will choose our public key, which
can be any number greater than 1 and less than T that shares no factors with it
(other than 1). This public key is what we referred to in the previous section
as e. To construct our private key, we need to identify a number, d, such that
when it is multiplied by e, the public key, the product is congruent to
1 mod T. In other words, the following congruence must hold:

d × e ≡ 1 mod T

Now, if we want someone to send us a message that only we can read, we
tell them the modulus, N, and the encryption key, e. They then convert their
message from letters to a series of number strings, or “words,” just as we have
been doing in previous sections of this text. We must be careful that none of the
words are larger than the modulus, which would result in some words being
indecipherable. Let’s say one of these words is the number M. The encrypter
raises each word to the “eth” power, mod N. This new word, let’s call it C, is
now encrypted and can be sent to us. In mathematical terms, we have:

$$C \equiv M^e \bmod N$$

We receive the coded number C. To decrypt it, all we have to do is raise it to the
“dth” power, mod N. This works because $C^d \equiv (M^e)^d \bmod N$, and Rivest,
Shamir, and Adleman were able to show that $M^{ed} \equiv M \bmod N$. Recall
that we chose e and d such that d × e ≡ 1 mod T, so there
is some mathematics hidden in this statement! Thus, our private key, known
only to us, undoes the encryption of the public key. Anyone who knows e and N
can send us a message that only we, because we know d, can decrypt. To get a
better idea, let’s look at an example.

A Worked Example of RSA Encryption


Real RSA encryption uses very large numbers, which would be very difficult—
not to mention less than illuminating—to use here. In this example, we’ll use
smaller numbers that are not at all realistic but that illustrate how the method
works.

First, let’s choose the primes that form the foundation of our scheme, p and q:

p = 17 and q = 19

By multiplying p and q, we get our modulus, N:

N = 17 × 19 = 323

Now, we find T by subtracting 1 from both p and q and multiplying:

T = (p − 1) × (q − 1) = (16) × (18) = 288

Next, we can select the encryption key, the public key, e, so that it has no
common factors with 288. Let’s let e = 11.

To find the decryption key, d, we need a number that, multiplied by e, gives a
product that is congruent to 1 mod 288. Expressed mathematically, we need to
find a number d such that, for some whole number n:

11 × d = 1 + n(288)

Using trial and error, we find that 131 works because 11 × 131 = 1,441 = 1 + 5 × 288.

So, our public key is N = 323 and e = 11.

The private key is d = 131.

If someone wants to send us the message “ABC” securely, using this scheme,
we tell them N and e. They first convert “ABC” to “123” and then do the
following arithmetic:

$$123^e = 123^{11} \bmod 323 = 81$$

The sender could then send the message, “81,” over a public line of
communication with confidence.

To decrypt the code, “81,” we can use N and d as follows:

$$81^d = 81^{131} \bmod 323 = 123$$

The message “81” becomes “123” after decryption, which we can then easily
convert to “ABC,” which was the intended message.
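
The whole worked example fits in a few lines of Python. This is a toy sketch for the small numbers used above, not a secure implementation: the function name `make_keys` is our own, the private key is found by the same trial and error as in the text, and Python’s built-in three-argument `pow` does the modular exponentiation.

```python
from math import gcd

def make_keys(p, q, e):
    """Return (N, e, d) for primes p and q and a chosen public exponent e."""
    n = p * q
    t = (p - 1) * (q - 1)
    if gcd(e, t) != 1:
        raise ValueError("e must share no common factors with T")
    # Trial and error: find d with d * e congruent to 1 mod T.
    d = next(d for d in range(1, t) if (d * e) % t == 1)
    return n, e, d

n, e, d = make_keys(17, 19, 11)
message = 123                    # "ABC" converted to numbers

cipher = pow(message, e, n)      # encrypt: M^e mod N
recovered = pow(cipher, d, n)    # decrypt: C^d mod N

print(n, e, d)
print(cipher)
print(recovered)
```

The three printed lines reproduce the keys, the encrypted word 81, and the recovered word 123 from the example.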

Notice that in order to break this scheme, a hacker would have to find the two
numbers that when multiplied together yield 323. The square root of 323 is a
little less than 18, so they would have to try a maximum of 7 divisors before they
would be guaranteed to break the modulus into the original primes that were
used to find the public and private keys. Using a computer, this would not be
difficult, so real RSA encryption uses numbers that are sufficiently large so that
even the fastest computers would take longer than a human lifespan to factor
them. It is upon this foundation, the difficulty of factoring products of two large
prime numbers, that modern data encryption rests.
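
To see how little protection a small modulus gives, here is a sketch of the attack just described: factor N by trial division, then rebuild the private key from the public one. The function names are our own, and the approach is practical only because the numbers are tiny.

```python
def factor_by_trial_division(n):
    """Return (p, q) with p * q == n, trying divisors up to the square root of n."""
    divisor = 2
    while divisor * divisor <= n:
        if n % divisor == 0:
            return divisor, n // divisor
        divisor += 1
    raise ValueError("n has no divisor other than 1 and itself")

def recover_private_key(n, e):
    """Break a toy RSA key: factor N, rebuild T, then search for d."""
    p, q = factor_by_trial_division(n)
    t = (p - 1) * (q - 1)
    return next(d for d in range(1, t) if (d * e) % t == 1)

print(factor_by_trial_division(323))
print(recover_private_key(323, 11))
```

This loop tries every whole number from 2 upward rather than only primes, which is slightly wasteful but makes no practical difference at this size.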

UNIT 1 at a glance

SECTION 1.2

Math at the Dawn of Time
• The earliest evidence of humans using numbers comes in the form of tally
sticks.
• The Ishango bone represents a level of early mathematics more
sophisticated than simple counting.
• Mathematics arose independently in different forms across many cultures
throughout history.

SECTION 1.3

Number for Number’s Sake: The Greeks
• The Greeks studied numbers independent of their application to uses in the
real world.
• Proof has long been one of the central ideas in mathematics.
• Mathematical theorems, unlike scientific theories, last forever.
• Playing with the geometric structure of numbers led to early insights in
number theory.

SECTION 1.4

Primes
• The rectangle model of multiplication links a number’s geometric structure
to its divisors.
• Primes are numbers that cannot be represented by rectangles with both
dimensions whole numbers greater than one.
• Factor trees reveal the prime decompositions of composite numbers.
• The Fundamental Theorem of Arithmetic states that every whole number
greater than one has exactly one prime decomposition.
• Primes are the “atoms” of arithmetic.

SECTION 1.5

Questions about Primes
• Euclid proved that, given a finite list of primes, one can always create a new
prime not on that list, which implies that there are an infinite number of
primes.
• Determining whether or not a number is prime is a non-trivial task.
• The answer to whether or not there is a pattern behind the primes has
eluded mathematicians for millennia.
• One can find interesting examples of both structure and randomness in the
primes.
• Riemann, following in the footsteps of Euler, thought that the pattern of
the primes was tied to the zeta function, which is a generalization of the
harmonic series.

SECTION 1.6

Encryption
• The need to send private messages has ancient roots.
• Most historical encryption schemes used private keys.
• The simplest encryption scheme is to jumble the letters of the alphabet
according to some rule.
• Modular arithmetic is math that incorporates “wrap-around” effects.
• Using a prime modulus ensures that all reciprocals exist.
• Arithmetic with prime moduli forms the theoretical underpinnings of RSA
encryption.

SECTION 1.7

RSA Encryption
• RSA encryption is the modern standard for sending secure information.
• RSA is based on the modular arithmetic of primes.


TEXTBOOK
Unit 2
UNIT 02
Combinatorics Counts
TEXTBOOK

UNIT OBJECTIVES

• Combinatorics is about organization.

• Many combinatorial problems involve ways to enumerate, or count, various things
in an efficient manner.

• The counting function, C(n,k), is a powerful tool used to count subsets of a larger
set or to give coefficients in binomial expansions.

• Bijection, the identification of a “one-to-one” correspondence, enables us to
enumerate a set that may be difficult to count in terms of another set that is more
easily counted.

• Pascal’s Triangle is an elegant illustration of the counting function C(n,k).

• Techniques from graph theory can help with combinatorial challenges such as
finding circular permutations.

• The pigeonhole principle (the idea that if you have more pigeons than holes, some
holes must have more than one pigeon) is a deceptively simple idea that can be used
to prove startling results.

• Ramsey Theory explains why we sometimes find order in supposed randomness.

• Ideas from combinatorics are at play in modern methods of DNA sequencing.

• The question of whether or not P = NP (whether certain types of seemingly
computationally intractable combinatorial problems can be solved in reasonable
amounts of time) is at the forefront of current research in both combinatorics and
computer science.
Mathematics may be defined as the economy
of counting. There is no problem in the whole
of mathematics which cannot be solved by
direct counting.

E. Mach
UNIT 2 Combinatorics Counts
textbook

SECTION 2.1
INTRODUCTION

DNA is the genetic information that encodes the proteins that make up living
things. Human DNA is a large molecular chain consisting of sequences of four
different building blocks known as nucleotides. These nucleotides, adenine,
cytosine, guanine, and thymine, combine to form the different genes that make
us who we are. Human DNA consists of about 3 billion pairs of these building
blocks.

In 1990, the U.S. Department of Energy, in conjunction with the National
Institutes of Health and an international consortium of geneticists from China,
France, Germany, Japan, and the UK, set out to map the entire sequence of
base pairs in human DNA. The Human Genome Project, as it was called, was
projected to take fifteen years to complete. By the year 2000, just ten years
later, a rough draft was announced and by 2003, the sequence was declared to
be essentially complete, two years ahead of schedule.

What enabled this huge project to be completed more quickly than expected?
There were many factors, most significantly improvements in technology
and faster computers, which made it possible to complete time-consuming
calculations within more reasonable time frames. This new generation of
computers made it realistic to run powerful algorithms from the mathematics of
organization, combinatorics. Combinatorics is, simply put, the mathematics of
counting things—things that are generally collections of mathematically defined
or encoded objects. As such, combinatorics is the branch of mathematics that is
central to some basic problems inherent in our data-rich age: the organization
of large sets of data and the quest to uncover relational meaning among the
members of those sets. For example, when faced with a task of, say, combining
“puzzle pieces” of DNA to make a complete model, combinatorics can be used to
enumerate the possibilities. Not only does this tell us whether our ordering is
feasible, it also provides the tools that actually accomplish this ordering.

As we already noted, the effort to determine the human genome is a modern
context for applying combinatorics. A more classic problem is the infamous
“traveling salesperson problem”: Suppose that you are a traveling salesperson
and you wish to find the shortest route connecting a group of designated cities.
A simple combinatorics problem will help you establish the number of possible
itineraries. However, it turns out that finding the shortest possible route, for
even a relatively small number of cities, is much more difficult; in fact, it may
even be computationally intractable. These are all problems of combinatorics.

Not only can combinatorics help to organize complicated sets, but it can also
reveal whether or not any organization inherently exists in large, seemingly
“random” sets. This idea, known as Ramsey theory, gives some quantitative
rationale as to why we see constellations in the night sky. It also explains, and
debunks, some claims of the existence of hidden messages in the Bible. Ramsey
theory shows mathematically that structure must exist in randomness, although
it does not provide any guidance or formula for finding such structure.

In this unit, we will look at some of the uses of combinatorics, such as finding
combinations and permutations and sequencing DNA. We will also learn
about the general techniques of the combinatorialist, from bijection, to the
“pigeonhole principle,” to uses of Hamiltonian cycles in connected graphs.
We will also explore a bit of the history of this incredibly useful field, from the
counting problems of ancient Egypt, to the mysterious triangle of Pascal, to
questions at the forefront of modern-day computing.

SECTION 2.2

Egypt and India
• The Rhind Papyrus
• Flavors in India
• Functions
• Bijective Proof

The Rhind Papyrus
• The Rhind Papyrus, also known as the Ahmes Scroll, contains one of the
earliest known combinatorial problems.
• The solution to the problem requires using the sum of a geometric series.

[Figure: Item 1040 / Egyptian, RHIND MATHEMATICAL PAPYRUS COPIED BY THE SCRIBE
AHMES (ca. 1650 BCE). Courtesy of Art Resources, Inc.]

The problem of keeping track of large numbers of possibilities is by no means a
new one. The mathematicians of the Middle Kingdom in Egypt were quite aware of
how quickly such problems can grow. An early piece of evidence of this comes
from the Rhind Papyrus. This scroll, transcribed by Ahmes from Egyptian 12th
Dynasty mathematical texts, contains problems illustrating many different
mathematical concepts, usually presented in applied form. One such problem,
“Number 79,” sometimes referred to as “the Inventory Problem,” lays out a
not-so-straightforward counting scenario:

There are seven houses; in each house there are seven cats; each cat kills seven
mice; each mouse has eaten seven grains of barley; and each grain would have
produced seven hekats (an old unit of measure equivalent to about 5 liters).
What is the sum of all the enumerated things?

[Figure: DIAGRAM OF A BRANCHING TREE]

We can approach this problem, as the Egyptians did, in a so-called “brute force”
fashion, by multiplying and adding four consecutive times. If there are seven
houses, each of which has seven cats, then there is a total of forty-nine cats.
The fact that there were seven mice munched by each cat means that a total of
343 mice met their demise. Continuing in this manner, we calculate that 2,401
grains of barley were eaten along with the mice, thereby keeping 16,807 hekats
of barley out of production. Adding together all the quantities involved (i.e.,
houses, cats, mice, barley grains, and hekats of barley), we find that there were
19,607 things in total.
RHIND PAPYRUS PROBLEM

Item              Quantity    Subtotal
Houses                   7           7
Cats                    49          56
Mice                   343         399
Barley (spelt)       2,401       2,800
Hekats              16,807      19,607
Total               19,607      19,607
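The brute-force tally can be reproduced with a short script. This is an illustrative sketch only: the list names are our own, and `itertools.accumulate` supplies the running subtotals in the right-hand column.

```python
from itertools import accumulate

items = ["Houses", "Cats", "Mice", "Barley grains", "Hekats"]
quantities = [7 ** k for k in range(1, 6)]   # 7, 49, 343, 2401, 16807
subtotals = list(accumulate(quantities))     # running sums of the quantities

for name, qty, sub in zip(items, quantities, subtotals):
    print(f"{name:<14}{qty:>8,}{sub:>8,}")

print(subtotals[-1])  # 19607
```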

Notice that this problem involves finding the sum of a sequence of terms that
increase geometrically; that is, each term is a constant multiple of the previous
one. Starting with seven, and multiplying by seven each time, we get to almost
17,000 in four steps. This is an example of a geometric series, and it shows
how quickly such a series can grow. We can find the desired sum in this case by
adding the first five powers of seven:

$7^1 + 7^2 + 7^3 + 7^4 + 7^5 = 7 + 49 + 343 + 2{,}401 + 16{,}807 = 19{,}607$

More generally, a geometric series is the sum of a sequence of terms in which
each new term is generated by multiplying the preceding term by some fixed
common factor. For example, the finite geometric series $1 + 2 + 4 + 8 + \cdots + 2^n$
(for some value of n) is such that each term is two times the term that precedes it.
A famous illustration of the speed at which such a series grows is the one that
asks how much money it would take to place one penny in the first square of a
chessboard, two pennies in the second square, four in the third square, eight
in the fourth square, and so on until there is a stack of pennies in each of the
board’s sixty-four squares. Fortunately, we do not have to use brute force, as
the Egyptians would have, to solve this. The clever solution goes like this:

In general, we can express a geometric series in this form:

$a + ar + ar^2 + ar^3 + ar^4 + \cdots + ar^n$

where a is some initial value and r is the constant factor or ratio. The general
solution, then, of the sum of this geometric series is:

$S = \frac{a\,(r^{n+1} - 1)}{r - 1}$
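The text states the formula without deriving it. One standard way to see where it comes from (offered here as a sketch, assuming r is not equal to 1) is to multiply the sum by r and subtract the original sum, so that all the middle terms cancel:

```latex
\begin{aligned}
S      &= a + ar + ar^2 + \cdots + ar^n \\
rS     &= ar + ar^2 + \cdots + ar^n + ar^{n+1} \\
rS - S &= ar^{n+1} - a \\
S      &= \frac{a\,(r^{n+1} - 1)}{r - 1} \qquad (r \neq 1)
\end{aligned}
```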

In the case of the pennies piling up on the chessboard, a = 1 and r = 2. Because
there are sixty-four squares on a chessboard, n = 63 (the first square has one
penny, represented by $2^0$). The sum is, therefore,
$\frac{1\,(2^{63+1} - 1)}{2 - 1} = 2^{64} - 1$, which is equal to about
$1.8 \times 10^{19}$ pennies, or $1.8 \times 10^{17}$ dollars (in non-scientific
language, that’s more than 100 million billion dollars)! This powerful example
of how quickly a geometric series expands gives us a glimpse of the magnitude of
combinatorial explosions.
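As a quick check, the closed form can be compared against a direct sum. The helper `geometric_sum` below is our own name, and the sketch assumes whole-number values of a and r (with r not equal to 1) so that integer division is exact.

```python
def geometric_sum(a, r, n):
    """Sum a + a*r + ... + a*r**n via the closed form (integers, r != 1)."""
    return a * (r ** (n + 1) - 1) // (r - 1)

pennies = geometric_sum(1, 2, 63)
print(pennies)                                    # 18446744073709551615
print(pennies == sum(2 ** k for k in range(64)))  # True

# The same formula handles the Egyptian inventory problem (a = 7, r = 7, n = 4).
print(geometric_sum(7, 7, 4))                     # 19607
```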

Using a formula to find the sum of the geometric series underlying the Egyptian
“inventory problem” and the pennies on the chessboard example demonstrates
an important idea underlying combinatorial mathematics: problems in which
the work grows very rapidly can often be reduced in clever ways to problems
that are more easily controlled. This idea popped up again in India in the 6th
century BC, this time having to do with combinations of flavors.

Flavors in India
• The problem of counting subsets of a larger set was explored by thinkers in
India as early as the 6th century BC.

The Indian medical text, Sushruta Samhita, written by Sushruta in the 6th
century BC, examines the ways in which six fundamental flavors, bitter, sour,
salty, sweet, astringent, and hot, could be combined. (Note: It is important to
realize that for the purposes of this discussion, by “combinations,” we mean
subsets of a larger set in which order doesn’t matter; salty-sweet is the same
as sweet-salty.) This ancient text showed that there were sixty-three such
combinations, categorized as follows: six single tastes, fifteen pairs, twenty
triples, fifteen quadruples, six quintuples, and, of course, one combination of all
six tastes. There is, incidentally, one way to have zero flavors, generally called
the “empty set,” but we will disregard this because “flavorless” doesn’t count as
a flavor. Adding all of these possible groupings together, we can easily see that
their sum is sixty-three, but is there a more clever and basic way to look at this?

One way to approach this problem would be to make an organized list. We
could represent the six flavors with the letters A, B, C, D, E, and F and begin by
listing the possible “combinations” of one: A, B, C, D, E, F. Then we can list the
possible pairs: AB, AC, AD, AE, AF, BC, BD, BE, BF, CD, and so on. There seems
to be a more general idea at work here. Can we get to it?
ORDERED LIST

A B C D E F

AB AC AD AE AF
BC BD BE BF CD
CE CF DE DF EF

ABC ABD ABE ABF ACD
ACE ACF ADE ADF AEF
BCD BCE BCF BDE BDF
BEF CDE CDF CEF DEF

ABCD ABCE ABCF ABDE ABDF
ABEF ACDE ACDF ACEF ADEF
BCDE BCDF BCEF BDEF CDEF

ABCDE ABCDF ABCEF ABDEF ACDEF BCDEF

ABCDEF
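An organized list like this one can be generated mechanically. The sketch below uses Python's `itertools.combinations`, which yields subsets of a given size in the same alphabetical order. The dictionary name `by_size` is our own.

```python
from itertools import combinations

flavors = "ABCDEF"

# Group the non-empty subsets by size: singles, pairs, triples, and so on.
by_size = {k: ["".join(c) for c in combinations(flavors, k)]
           for k in range(1, len(flavors) + 1)}

for k, subsets in by_size.items():
    print(k, len(subsets), " ".join(subsets))

total = sum(len(s) for s in by_size.values())
print(total)  # 63
```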
Functions
• Functions map members of one set to members of another set.

To help us reach an efficient solution to “the problem of the flavors,” we can look
for some function that will enable us to count the number of subsets of a given
set of size n quickly and conveniently. Generally, we think of functions as math
machines into which we put numbers and which spit out correlated numbers,
but we can also think of a function as a way to describe how one set relates
to another. In this set-based concept, a function is a rule that assigns to each
member of a set of input values one, and only one, output value. For example,
the absolute value function, |n|, takes all real numbers as inputs and maps each
of them to their distance from the origin. Because distance is measured as a
non-negative value, the function |n| maps the set of all real numbers to the set
of non-negative real numbers. The inputs “5” and “-5” get assigned the same
output of “5.”

[Figure: TWO VIEWS OF THE ABSOLUTE VALUE FUNCTION]

Bijective Proof
• Bijection can be used to enumerate the members of a difficult-to-enumerate
set by establishing a one-to-one correspondence with a set that is easier to
enumerate.
• Using the concept of bijection, we can solve the Indian flavors problem in a
very elegant way.

The concept that no single input gives more than one output is common to all
simple functions. Some functions, however, are more restrictive. In addition
to restricting each input to only one output, these functions require that each
output is matched with exactly one input. Such functions, which are called
“bijections” or “one-to-one correspondences,” can be quite useful to us as
we attempt to find clever solutions to combinatorial problems. To do so, we
seek to show that two sets (the one that we are trying to find and another that
we can directly relate to it to form the bijection) can be put into one-to-one
correspondence with each other.

[Figure: A BIJECTION IS A ONE-TO-ONE CORRESPONDENCE BETWEEN SETS]

Imagine two sets, one containing a number of right shoes and the other
containing a number of left shoes. Would there be a way to determine whether
or not both sets are the same size (i.e., contain the same number of shoes)
without counting them? We could pair up each right shoe with a left shoe and
see if there are any leftovers in either set. If every right shoe pairs up with a left
shoe, with no leftovers in either set, then we are guaranteed that the two sets
are the same size. Given that assurance, we could simply count the right shoes
and know that the number of left shoes is the same. In math, it is often possible
to quantify a set of things that may be difficult to count by using a set that is
easier to count and then showing that there is a one-to-one correspondence
between the two sets.

Armed with the power of bijection, we can efficiently tackle the flavors problem.
Remember that we want to determine how many combinations, or subsets, of
six flavors there are if order doesn’t matter. We know that any given flavor will
be either present in a subset or not. This means that we can represent each
possible combination as a six-digit binary string, using only the digits 0 and 1.
The first digit in the string indicates the status of flavor A; a 1 means “present”
and a 0 means “absent.” Likewise for the second digit, representing flavor B,
and so on. In this system, the set of all flavors, {A,B,C,D,E,F}, would be written
as 111111. The subset {A,B,D} would be indicated by the binary code 110100,
whereas the subset {C,F} would be 001001. We can see that because each flavor
can be only present or absent, each subset will be uniquely represented by a
binary string. This defines a bijection between all subsets of six flavors and
binary strings of length six.
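The correspondence described above can be written as a pair of functions, one in each direction. This is a minimal sketch; the names `subset_to_bits` and `bits_to_subset` are our own. Because each function undoes the other, every subset matches exactly one binary string and vice versa.

```python
FLAVORS = "ABCDEF"

def subset_to_bits(subset):
    """Encode a set of flavors as a six-character string of 0s and 1s."""
    return "".join("1" if f in subset else "0" for f in FLAVORS)

def bits_to_subset(bits):
    """Decode a binary string back into the set of flavors it represents."""
    return {f for f, b in zip(FLAVORS, bits) if b == "1"}

print(subset_to_bits({"A", "B", "D"}))   # 110100
print(subset_to_bits({"C", "F"}))        # 001001
print(sorted(bits_to_subset("111111")))  # ['A', 'B', 'C', 'D', 'E', 'F']
```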

Fortunately, figuring out how many six-digit binary strings there are is fairly
straightforward and much easier than counting subsets of flavors. Each digit
has only two options; it must be a 0 or a 1. We can simply multiply the number of
options for each digit to figure out how many possible strings there can be.

$2 \times 2 \times 2 \times 2 \times 2 \times 2 = 2^6 = 64$ strings

One of those strings, 000000, corresponds to “no flavor,” however, and we have
already decided to disregard that option, so we end up with a grand total of
sixty-three subsets. In general, we have found that the number of non-empty
subsets of n elements is $2^n - 1$. This method is significantly faster than listing all
the possible combinations. The drawback of this method is that it does not tell
us how many subsets of a given size there are.
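For a set this small, the count can be confirmed by generating every binary string outright. The sketch below uses `itertools.product` for that purpose. It is a check on the formula, not a replacement for it, since the list doubles in length with every added element.

```python
from itertools import product

n = 6
strings = ["".join(bits) for bits in product("01", repeat=n)]

print(len(strings))       # 64
print(len(strings) - 1)   # 63, after discarding 000000 ("no flavor")
print(2 ** n - 1)         # 63
```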

Recall that, according to the Indian text, there are six single flavors, fifteen
pairs, twenty triples, fifteen quadruples, six quintuples, and one way to combine
all six flavors. Is there a way to find these numbers—to enumerate subsets
according to their size—without listing and sorting all possible combinations?
Our method of finding a bijection between the total number of subsets and
binary strings doesn’t immediately give us this level of detail. In the next section
we will see how to count subsets of a particular size by using a function that has
many uses in both combinatorics and beyond, C(n,k).

SECTION 2.3

Flavors Revisited
• Permutations
• From Permutations to Combinations

Permutations
• Counting combinations, in which order does not matter, is different from
counting permutations, in which order does matter.
• The factorial operation is very important in counting permutations.

We can solve the problem of the flavors by a different method that will give us
a broader understanding of the subsets than the bijection method provided.
Specifically, we need a strategy that not only reveals the total number of
subsets, but that is also capable of categorizing the subsets by size. This is a
common theme in mathematics; solving problems in different ways deepens
our understanding of what is really happening. To re-phrase our problem, we
are seeking a formula that will tell us how many subsets of a given size can be
made from the original set of six flavors. We would then like to generalize this
formula to tell us how many subsets of size k, called k-subsets, can be made of a
set of n elements. In doing this, we will have to use the important combinatorial
concepts of permutation and combination.

We can start our thought process by considering how many ways the six flavors
can be arranged if we count each unique ordering separately. (Remember that
previously we gave no significance to order and considered, for example, the
subsets AB and BA to be the same.) Arrangements such as these, in which
order matters, are known as permutations. We can imagine the possible
permutations of six flavors as a sequence of six empty slots.

[Figure: POSSIBLE CHOICES FOR SLOTS]

Notice that there are six possible flavors that can occupy the first slot, five that
can occupy the second, four for the third, and so on. This is because once a
specific flavor is used, we don’t want it to appear again in the same permutation.
The total number of permutations of six flavors is then 6 × 5 × 4 × 3 × 2 × 1 =
720, which we denote as 6!, called “six factorial.” In general, the number of
permutations of n objects will be n!. As a shorthand, we can write P(n,n), or “the
permutations of n objects taken n at a time.”
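The slot-by-slot count can be verified by listing every ordering. This sketch relies on `itertools.permutations` to generate the orderings and `math.factorial` to compute 6!; the variable names are our own.

```python
from itertools import permutations
from math import factorial

flavors = "ABCDEF"

# Count every distinct ordering of the six flavors.
count = sum(1 for _ in permutations(flavors))

print(count)          # 720
print(factorial(6))   # 720
```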

So, permutations have a simple formula in terms of the factorial, but there
is more to consider. Remember, we also want to be able to find the number
of arrangements involving fewer than all six of the elements—what we call
subsets. Furthermore, in the final analysis we are not concerned with the
order of flavors; we really don’t care if a subset has salty before sweet or sweet
before salty. Such arrangements, in which order does not matter, are known as
combinations. To find a formula for counting combinations of a given size, we
will have to deal with both of these considerations.

First, let’s figure out how to deal with finding smaller arrangements selected
from a pool of six objects in which order still matters. For example, to find the
number of ways to order two flavors out of the set of six, we can imagine two
slots, the first of which has six possible flavors, the second of which has only
five possible flavors, once a flavor has “filled” the first slot. After multiplying,
as we did before, we see that there are thirty possible ways to order two out of
the six flavors. In the language of combinatorics, we say that we have found
the number of permutations of six objects taken two at a time. We can write
P(6,2) to express this; the general form for this expression is P(n,k), or “the
permutations of n objects taken k at a time.”

Notice that P(6,2) is less than P(6,6). Why is this? P(6,6) gives the total number
of unique orderings of all six flavors, but to find P(6,2), we are concerned with
only two flavors. Going back to the six-slot concept from before, we can let the
two flavors we care about occupy the first two slots. For example:

AB____

The remaining four slots can be ordered in 4! (24) ways, all of which have the
same first two flavors. So, P(6,6) over-counts P(6,2) by a factor of twenty-four.
Therefore, to find P(6,2), we should divide P(6,6) by 4!. Recognizing that 4 =
(6-2), we can write the following expression for the value of P(6,2):
$P(6,2) = \frac{6!}{(6-2)!} = \frac{720}{24} = 30$

We can then generalize this for P(n,k):

$P(n,k) = \frac{n!}{(n-k)!}$

This is the formula for the number of permutations of n objects taken k at a
time.
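The formula translates directly into code. In this sketch the function name `P` mirrors the text's notation, and `math.perm` (Python 3.8 or later) is shown only as a cross-check.

```python
from math import factorial, perm

def P(n, k):
    """Number of permutations of n objects taken k at a time."""
    return factorial(n) // factorial(n - k)

print(P(6, 2))                   # 30
print(P(6, 6))                   # 720
print(P(6, 6) // factorial(4))   # 30, the over-count removed by dividing by 4!
print(perm(6, 2))                # 30
```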

From Permutations to Combinations


• The formula for combinations of n objects taken k at a time can be found
by first looking at the permutations of n objects taken k at a time and then
dividing by the number of permutations of k objects taken k at a time.

Having addressed the first of our concerns (counting smaller permutations),
we can move on to the question of what to do about order. Permutations, recall,
count each unique ordering of objects separately, but in the problem of the
flavors, we don’t really care about the order of flavors in a subset. Knowing that
P(n,k) gives the number of permutations of n objects taken k at a time, can we
use this to determine the number of combinations of n objects taken k at a time?
If so, we could then find the number of combinations of six, five, four, three, two,
and single flavors. Then, after adding these together, we will have found the
total number of subsets of six flavors.

We can start by realizing that the number of permutations will always be
at least as large as the number of combinations. For example, P(n,k) treats the
arrangements ABC, ACB, BCA, BAC, CAB, and CBA as different. Viewing them
as combinations of three flavors, however, we would consider them all to be the
same combination. So, P(n,k) must be over-counting if we are interested only in
combinations. By what factor does P(n,k) over-count?

P(n,k) over-counts by the number of ways to arrange k objects. This is evident
in the example above of six permutations of three objects taken three at a time.
If we divide the six permutations by 3!, or six, we get one, which is the number of
combinations of three objects, taken three at a time. In general,
$\frac{P(n,k)}{P(k,k)}$ will tell us the number of combinations of n objects
taken k at a time. We call this C(n,k). Using the formula for P(n,k) from above
and recognizing that P(k,k) = k!, we can write:

$C(n,k) = \frac{n!}{k!\,(n-k)!}$
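Here is a small sketch of the counting function, written straight from the formula. The name `C` follows the text, and `math.comb` (Python 3.8 or later) is used only to confirm the results.

```python
from math import comb, factorial

def C(n, k):
    """Number of combinations of n objects taken k at a time."""
    return factorial(n) // (factorial(k) * factorial(n - k))

print(C(3, 3))                 # 1, the single combination behind ABC, ACB, ...
print(C(6, 2))                 # 15
print(C(6, 2) == comb(6, 2))   # True
```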

For example, C(10,3) represents the number of possible three-topping pizzas
that we could choose given a total of ten possible toppings.

Using this notation, we can complete the following short chart to solve the
original problem of the flavors by enumerating the subsets according to their
size.

VALUES OF THE COUNTING FUNCTION

C(n,k) = n! / (k!(n − k)!) = number of unordered subsets

C(6,1) = 6! / (1!(5!)) = 6 sets of 1
C(6,2) = 6! / (2!(4!)) = 15 sets of 2
C(6,3) = 6! / (3!(3!)) = 20 sets of 3
C(6,4) = 6! / (4!(2!)) = 15 sets of 4
C(6,5) = 6! / (5!(1!)) = 6 sets of 5
C(6,6) = 6! / 6! = 1 set of 6

Adding the results for the number of subsets of size one, two, three, four, five,
and six elements, we get:

6 + 15 + 20 + 15 + 6 + 1 = 63
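The chart and its total can be reproduced in a few lines. This is a sketch that uses the standard-library `math.comb` for C(n,k); the last line evaluates the pizza example mentioned earlier in the section.

```python
from math import comb

counts = [comb(6, k) for k in range(1, 7)]

print(counts)        # [6, 15, 20, 15, 6, 1]
print(sum(counts))   # 63
print(comb(10, 3))   # 120 possible three-topping pizzas from ten toppings
```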

Although it took us a while to derive the formula for C(n,k), using it to count
k-subsets in this manner is much faster than listing them all.

We have now efficiently answered the problem of the flavors in two different
ways, and we can see that adding together all the possible values of C(6,k), as
k ranges from 1 through 6, yields the same value that we found in our previous
solution using bijection. Furthermore, we can imagine that each of the specific
k-subsets corresponds to a unique binary string. In our next section, we will
use a similar method to derive what is going on at the heart of one of the most
famous and fascinating number patterns in mathematics: Pascal’s Triangle.

SECTION 2.4

Pascal’s Triangle
• Find the Weirdo
• The Triangle Takes Shape

Find the Weirdo
• Pascal’s Triangle is an important and widely useful mathematical concept.
• At its heart, Pascal’s Triangle is a recursive relationship by which we can,
given previous elements, find subsequent ones.

The counting function C(n,k) and the concept of bijection coalesce in one of the
most studied mathematical concepts, Pascal’s Triangle. At its heart, Pascal’s
Triangle represents a recursive way to compute all the C(n,k), the numbers
of k-subsets of an n-element set for any n and any k. As a recursive pattern,
Pascal’s Triangle incorporates previously known values in the creation of new
ones. To portray the relationship at the heart of the triangle, we will again solve
a particular problem in two different ways.

Once again let’s address the question of how many k-subsets there are of an
n-element set. Solved one way, we know the answer is n! / (k!(n − k)!). Now as we
explore the question again, we will also consider whether or not a k-subset
contains the element “n”. Using the flavors example, we would sort all of our
combinations of flavors into two sets, those that have “salty” as one of their
components and those that do not. In this strategy, sometimes known as
“weirdo” analysis, we call “n” or “salty” the “weirdo” and make deductions by
counting the sets that either contain it or don’t contain it.

To start, let’s focus on just the k-subsets. We can separate these subsets into
two piles. Pile A will have all the k-subsets that contain the element n. Pile B
will have all the k-subsets that do not contain n. In terms of flavors, A has all
of the combinations containing “salty,” and B has all those that don’t. Note that
both pile A and pile B are sets of subsets.

All of the subsets in pile A have to contain n; therefore, to figure out how many of
them there are, we can just pretend that they are only of size k-1 (because one of
the slots is always filled by n).

____…n

(In other words, k spaces, one of which is always filled by n, means that there
are actually only k-1 spaces in play.)

Likewise, because n is not allowed to move around, it is in some sense “out of
play” in our larger n-sized set. This means that each of the subsets in pile A has
k-1 spaces to fill using only the elements {1, 2, …, n-1}. The number of these
subsets in pile A is the same as the number of k-1 subsets of an (n-1)-sized set.
We thus have a bijection between the set of k-sized subsets containing n and the
set of (k-1)-sized subsets of an (n-1)-sized set.

We can count the (k-1)-sized subsets of an (n-1)-sized set with our C(n,k)
formula. For simplicity’s sake we’ll just write C(n-1,k-1).

Now, let’s look at pile B containing the k-sized subsets that do not contain n. For
each subset we have to fill k spaces using only the elements {1, 2, …, n-1}. This
is the same as asking how many k-sized subsets there are of an (n-1)-sized set.
We again can use our handy formula, written as C(n-1,k).

Finally, we know that if we combine pile A and pile B, we should have the total
number of k-sized subsets of an n-sized set, which can be expressed as C(n,k).

PILE A + PILE B = SUBSETS OF N

A + B = C(n,k)

The total number of subsets of size k, C(n,k), is equal to the total of those that
include n, the weirdo, C(n-1, k-1), plus those that don’t, C(n-1, k), or:

C(n,k) = C(n-1, k-1) + C(n-1,k)

This relationship, known as Pascal’s equation, gives us a recursive relationship
that enables us to compute the number of k-subsets of an n-element set using
the results we already have for smaller subsets of smaller sets. Organizing the
results of a few iterations of this rule into a chart yields an interesting structure.
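
To make the recursion concrete, here is a small sketch that computes C(n,k) using only Pascal’s equation and the two boundary facts C(n,0) = C(n,n) = 1. The function name pascal and the use of a cache are our own illustrative choices, not part of the text’s derivation.

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def pascal(n, k):
    """C(n,k) from Pascal's equation: subsets with the weirdo plus subsets without."""
    if k < 0 or k > n:
        return 0          # no subsets of an impossible size
    if k == 0 or k == n:
        return 1          # the empty subset, or the whole set
    return pascal(n - 1, k - 1) + pascal(n - 1, k)

# Rows 0 through 7, each built from the row before it.
rows = [[pascal(n, k) for k in range(n + 1)] for n in range(8)]
for row in rows:
    print(row)

print(pascal(6, 3) == comb(6, 3))  # True: the recursion agrees with the factorial formula
```

Caching previously computed values mirrors the idea in the text: each new entry is assembled from results we already have for smaller sets.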

The Triangle Takes Shape

• Pascal’s equation can be used to create his famous triangle, which can in
turn be used in a variety of ways to count different types of subsets.
• There are many interesting mathematical relationships, or identities, hidden
within Pascal’s Triangle.

Looking at a few iterations of Pascal’s equation gives us the following result in
tabular form:
PASCAL’S TRIANGLE

         k
n    0            1            2    3    4    5    6    7
0    C(0,0) = 1
1    C(1,0) = 1   C(1,1) = 1
2    C(2,0) = 1   C(2,1) = 2   1
3    1            3            3    1
4    1            4            6    4    1
5    1            5            10   10   5    1
6    1            6            15   20   15   6    1
7    1            7            21   35   35   21   7    1

This information is probably more familiar to you presented in the following
form:

STANDARD DIAGRAM OF PASCAL’S TRIANGLE THROUGH SEVENTH ROW

                     1
                   1   1
                 1   2   1
               1   3   3   1
             1   4   6   4   1
           1   5  10  10   5   1
         1   6  15  20  15   6   1
       1   7  21  35  35  21   7   1

In this arrangement, each number is denoted by C(row, column). Note that the
first row is considered row zero, as is the first column. So, returning to our
taste example, we can find the number of combinations of three out of six by
looking at row 6 and finding column 3. The value found in that position, 20, is
in complete agreement with everything we’ve done before. To find how many
subsets of any size there are in a group of six, we simply add all the numbers in
the sixth row, taking care not to add the 1 that represents the empty set.

Notice that C(0,0) and C(n,n) are both equal to one, reminding us that there
is only one way to choose zero items out of a set of zero, and only one way to
choose n items out of a set of n items when the order does not matter.

Pascal’s Triangle is a mathematical paradise, with many interesting
relationships hidden in its structure. First, note that the sum of entries of any
row n is equal to 2^n, in agreement with our binary strings bijection from before.

4th ROW
1 + 4 + 6 + 4 + 1 = 16
2^n = 2^4 = 16

Also, the entries in the nth row of the triangle give the coefficients of the terms
in the expansion of a simple binomial raised to the power n, such as (x+y)^n. For
example:

(x+y)^3 = 1x^3 + 3x^2y + 3xy^2 + 1y^3

The coefficients of this polynomial can be found in the third row of Pascal’s
Triangle. Because they are useful in expanding binomials, the various sets of
C(n,k)s are also known as binomial coefficients. Note that this isn’t magic; it’s
simply the result of counting the ways to choose which k of the n factors (x+y)
contribute an x to a given term.
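
One way to check both observations is to expand (x+y)^n mechanically, one factor at a time, and compare the result with C(n,k). The sketch below is illustrative only; the function name expand_binomial and the list representation of the polynomial are our own assumptions.

```python
from math import comb

def expand_binomial(n):
    """Coefficients of (x+y)^n, listed from the x^n term down to the y^n term."""
    coeffs = [1]  # (x+y)^0 = 1
    for _ in range(n):
        # Multiplying by x keeps each term's power of y; multiplying by y raises it by one.
        times_x = coeffs + [0]
        times_y = [0] + coeffs
        coeffs = [a + b for a, b in zip(times_x, times_y)]
    return coeffs

print(expand_binomial(3))       # [1, 3, 3, 1]
print(sum(expand_binomial(4)))  # 16, which is 2^4
```

Notice that each pass through the loop adds two shifted copies of the previous row, which is Pascal’s equation in action.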

Another interesting phenomenon in Pascal’s Triangle can be found by looking
at so-called “hockey-sticks.” A hockey-stick is a pattern within the triangle
composed of a diagonal string of numbers and a terminating offset number,
such as those shown here:

HOCKEY STICKS IN PASCAL’S TRIANGLE

1
1 1
1 2 1
1 3 3 1
1 4 6 4 1
1 5 10 10 5 1
1 6 15 20 15 6 1
1 7 21 35 35 21 7 1
1 8 28 56 70 56 28 8 1
1 9 36 84 126 126 84 36 9 1
1 10 45 120 210 252 210 120 45 10 1
1 11 55 165 330 462 462 330 165 55 11 1
1 12 66 220 495 792 924 792 495 220 66 12 1
1 13 78 286 715 1287 1716 1716 1287 715 286 78 13 1

Highlighted hockey sticks: blue, 1 + 6 + 21 + 56 = 84; orange, 1 + 3 + 6 = 10.

What is fascinating in a hockey-stick pattern is that the linear string of numbers,
when added together, totals the value of the number that is offset. For example,
the sum of the numbers 1, 6, 21, and 56 is 84 (the blue pattern in the figure
above). This works no matter where in the triangle we draw a hockey stick, as
long as it starts with a “1.”

To get a sense for why this holds true, let’s look at the orange hockey stick
above, 1 + 3 + 6 = 10. Recognizing the “10” in our pattern as the second entry of
the 5th row, we can write it as 10 = C(5,2). Let’s plug this into Pascal’s equation:

C(n,k) = C(n-1, k-1) + C(n-1, k)

C(5,2) = C(4, 1) + C(4, 2)

Note that C(4,2) is the second entry in row four, “6,” which is part of our hockey
stick. However, C(4,1), the first entry in row four, is not part of our pattern. If
we use Pascal’s equation again, we find:

C(4,1) = C(3, 0) + C(3, 1)

C(3,0) = 1, the same value as the “1” that begins our hockey stick, and C(3,1) is
the first entry of row three, which is “3.” Plugging these results back into the
equation for C(5,2), we get:

C(5,2) = C(4,2) + C(3,1) + C(3,0) => 10 = 6 + 3 + 1

This is the hockey stick identity that we set out to prove!
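
The same check can be run for any hockey stick that starts at the left edge of the triangle. In the sketch below, the function name hockey_stick and its parameters r (the row where the stick starts) and m (the number of diagonal steps) are our own labels, chosen only for illustration.

```python
from math import comb

def hockey_stick(r, m):
    """Handle C(r,0), C(r+1,1), ..., C(r+m,m) and the offset entry C(r+m+1,m)."""
    handle = [comb(r + i, i) for i in range(m + 1)]
    blade = comb(r + m + 1, m)
    return handle, blade

print(hockey_stick(2, 2))  # ([1, 3, 6], 10)
print(hockey_stick(5, 3))  # ([1, 6, 21, 56], 84)
```

In both examples from the figure, the handle entries add up to the offset entry, as repeated use of Pascal’s equation predicts.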

There are many other fascinating mathematical series and relationships to be
found in the triangle, such as triangular numbers, primes and their multiples,
and Fibonacci numbers to name but a few.

By the way, Pascal did not invent this triangle. It was known centuries earlier to
both the Indians and Chinese as having particular use in finding combinations,
as we have just seen. The Chinese mathematicians Yang Hui and Chu Shih Chieh
knew about it at least 350 years before Pascal’s work.

SECTION 2.5

The Order of the Garter

• Circular Ordered Selections
• A Royal Problem
• Enter Graph Theory

Circular Ordered Selections

• Counting permutations is different than simply counting combinations
because order must be taken into account.

So far we have learned how to consider both ordered and unordered subsets.
How might our results change if we require that the arrangements be circular?
To put this into context, let’s phrase all our previous problems in terms of dinner
parties. In this scenario n! is the number of ways of putting n people along one
side of a banquet table. C(n,k) is the number of ways of choosing k people out
of n to sit together at a table. What if we have circular tables, however, and
we want to count the number of ways that a given number of people can be
arranged around one of these?

Suppose we are expecting ten people for dinner; how many ways can we seat
them around a circular table? First, let’s think about how many ways we can line
them up. As we indicated above, there will be 10! ways to line up ten guests: ten
for the first position, nine for the second, eight for the third, and so on.

[Figure: 10 PEOPLE LINED UP]

How does this change if they are seated around a circular table? Arrangements such
as this are called circular permutations, and they are not exactly like linear
permutations.

[Figure: THESE CIRCULAR PERMUTATIONS ARE EQUIVALENT. The same ten guests, numbered 1
through 10, sit in the same circular order at two tables, with the second seating
rotated relative to the first.]

Notice that every circular arrangement corresponds to ten different linear
arrangements.

ONE CIRCULAR PERMUTATION EQUIVALENT TO TEN LINEAR ONES

1 2 3 4 5 6 7 8 9 10
2 3 4 5 6 7 8 9 10 1
3 4 5 6 7 8 9 10 1 2
4 5 6 7 8 9 10 1 2 3
5 6 7 8 9 10 1 2 3 4
6 7 8 9 10 1 2 3 4 5
7 8 9 10 1 2 3 4 5 6
8 9 10 1 2 3 4 5 6 7
9 10 1 2 3 4 5 6 7 8
10 1 2 3 4 5 6 7 8 9

Using our reasoning from before, we can see that the number of circular
arrangements is equal to the number of linear arrangements, 10!, divided by
ten to compensate for the fact that each circular permutation corresponds to
ten different linear ones. This gives 10!/10 = 9! as the number of ways to arrange
ten guests around a table. We can generalize this to say that n elements can be
arranged in (n-1)! ways around a circle.
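
For small tables, the (n-1)! rule can be confirmed by brute force. The sketch below lists every linear seating, rotates each one so that guest 0 comes first, and counts how many distinct results remain. The function name and the idea of a “canonical” rotation are our own illustrative choices.

```python
from itertools import permutations
from math import factorial

def circular_arrangements(n):
    """Count seatings of n guests around a table, treating rotations as the same seating."""
    seen = set()
    for seating in permutations(range(n)):
        # Rotate so that guest 0 comes first; all n rotations share this form.
        start = seating.index(0)
        seen.add(seating[start:] + seating[:start])
    return len(seen)

for n in range(1, 8):
    print(n, circular_arrangements(n), factorial(n - 1))
```

Ten guests would mean checking 10! = 3,628,800 linear seatings this way, so the loop stops at seven; the formula does the rest.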

A Royal Problem

• The seating arrangement at the annual luncheon of the Order of the Garter in
England is an example of circular permutations in action.

It may seem to you that problems such as these are just more examples of
mathematicians’ hypothetical word problems, but this problem of circular
arrangement pops up yearly in England. Every year in June, a procession and
service take place at Windsor Castle for the Order of the Garter—England’s
oldest order of chivalry (founded by Edward III in 1348). Following the
installation of new members in the Throne Room, the Queen of England and
Duke of Edinburgh host a luncheon for members and officers of the Order in
Waterloo Chamber. Tradition holds that seating charts for the luncheon are
rotated so that no two guests will have sat next to one another in the last ten
years.

With forty-five people to consider, this problem could pose quite a headache
if we attacked it with the brute force method. If order matters, there are
44! possible arrangements, which is about 10^54. Checking just one of these
arrangements per second would take on the order of 10^46 years, which is about
thirty-six orders of magnitude longer than the universe has been in existence.
Recall, however, that we need ten consecutive years in which nobody has sat next
to the same person twice. This means that we need to check all of the subsets of
ten arrangements out of a possible 44!, which is C(44!,10). This number is much
larger, yet another combinatorial explosion.
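
These magnitudes are easy to check. The sketch below is a rough estimate only: the age of the universe is taken to be about 13.8 billion years, a round figure we assume for illustration, and the variable names are our own.

```python
from math import factorial, log10

SECONDS_PER_YEAR = 365.25 * 24 * 3600  # about 3.16e7 seconds
AGE_OF_UNIVERSE_YEARS = 1.38e10        # assumed round figure, for scale only

arrangements = factorial(44)                 # circular seatings of 45 people
years = arrangements / SECONDS_PER_YEAR      # checking one seating per second
orders_beyond_universe = log10(years / AGE_OF_UNIVERSE_YEARS)

print(f"44! is about 10^{log10(arrangements):.1f}")
print(f"checking time is about 10^{log10(years):.1f} years")
print(f"about {orders_beyond_universe:.0f} to {orders_beyond_universe + 1:.0f} orders of magnitude past the age of the universe")
```

The exact exponents depend on how one rounds, but the conclusion does not: brute force is hopeless here.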

Enter Graph Theory

• We can envision circular permutations as Hamilton cycles on complete
graphs.

Of course, using the organizing principles of combinatorics, there is a better
way. We could represent the forty-five members and their connections to each
other as a diagram of forty-five points, all of which are connected to one another.

Such a diagram is called a graph, with each point being a node, and each
connection an edge. A graph in which every node is connected to every other
node is called a complete graph. The standard notation for a complete graph
with n nodes is Kn. So, a complete, forty-five-node graph would be referred to as
K45.

[Figure: K45]

The actual graph of K45 is quite large, so it may be helpful to examine a smaller
version to see the idea.

[Figure: Complete K5 on nodes A, B, C, D, and E; a Hamilton cycle on K5; two mutually
disjoint Hamilton cycles, AEDBCA and ADCEBA]

If we had just five people, A, B, C, D, and E, our complete K5 graph would look
like that in the diagram. We can come up with circular table arrangements
of these five people by looking at paths that visit all the nodes exactly once
and return to the start. Such a configuration is known as a Hamilton cycle.
Remember that a connection on the graph represents two people sitting next
to each other. In our current problem related to seating the Order of the Garter
luncheon guests we are concerned with Hamilton cycles that share no common
edges. Such cycles are said to be mutually disjoint. Two of these cycles for a
five-person arrangement are shown in the diagram.

[Figure: OVERLAY OF TWO CYCLES]

Notice that with these two cycles, every edge is accounted for. So, although
we may be able to construct other cycles, they will always include at least one
of the edges that we’ve already used. This means that there are two, and only
two, mutually disjoint Hamilton cycles for an arrangement of five elements.
Consequently, we could have only two annual luncheons, at most, before two of
the five people sat next to each other again.
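
The claim that the two cycles named in the diagram, AEDBCA and ADCEBA, use up every edge of K5 can be verified directly. This is a small sketch; the helper name cycle_edges is our own.

```python
from itertools import combinations

def cycle_edges(cycle):
    """Undirected edges of a closed walk written like 'AEDBCA'."""
    return {frozenset(pair) for pair in zip(cycle, cycle[1:])}

first = cycle_edges("AEDBCA")
second = cycle_edges("ADCEBA")
all_edges = {frozenset(pair) for pair in combinations("ABCDE", 2)}

print(first.isdisjoint(second))     # True: no pair of neighbors is repeated
print(first | second == all_edges)  # True: all ten edges of K5 are used
```

Because the ten edges are exhausted, any third seating of the same five people must repeat at least one pair of neighbors.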

We can see why there can be no more than two mutually disjoint arrangements
in this situation by thinking about it from the perspective of one of the people
seated at the table, let’s say it’s the queen. The queen will always sit next to two
people, one on her right and one on her left. In the K5 case, there are only four
other “non-queen” people to sit by, so the queen will have sat next to everyone
after two years.

We can use similar lines of reasoning with arrangements of more people. For
example, to find mutually disjoint, circular arrangements of seven or nine
people, we can look at possibilities within the K7 and K9 graphs, respectively.

[Figure: K7 AND K9]

[Figure: HAMILTON CYCLES ON K7]

[Figure: HAMILTON CYCLES ON K9]

Notice that K7 has three mutually disjoint Hamilton cycles within it and K9 has
four. Applying the queen’s perspective and reasoning as we did before, we can
deduce that there cannot be more than three years of non-duplicate seating
for seven people and not more than four years for nine people. Extending this
reasoning to the original problem, that of forty-five people, we see that the
queen has forty-four possible luncheon neighbors. Taken two at a time, one on
her right and one on her left, it would take her twenty-two years to sit by each
one.

This means that in our banquet group of forty-five members of the Order of the
Garter, there can be at most twenty-two arrangements in which no two people
sit next to each other more than once. In fact, twenty-two is always attainable,
which is more than enough for ten years’ worth of banquets. In general, the graph
K2n+1 (the complete graph on 2n+1 nodes) will always have n disjoint Hamilton
cycles incorporated within it.
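
The text does not say how such cycles are found. One classical approach, often credited to Walecki, places 2n of the nodes on a circle, draws a zigzag path through them, rotates that path n times, and closes each copy through the one remaining node. The sketch below uses that swapped-in construction purely as an illustration; the function names and the numbering of the nodes are our own assumptions.

```python
def disjoint_hamilton_cycles(n):
    """Return n edge-disjoint Hamilton cycles on the 2n+1 nodes 0, 1, ..., 2n."""
    m = 2 * n  # nodes 0..m-1 sit on a circle; node m is the extra "hub"
    zigzag = [0]
    for j in range(1, n):
        zigzag += [j, -j]  # step alternately clockwise and counterclockwise
    zigzag.append(n)
    cycles = []
    for shift in range(n):
        path = [(v + shift) % m for v in zigzag]  # rotate the zigzag path
        cycles.append([m] + path)                 # the hub joins the two ends
    return cycles

def edges(cycle):
    """Undirected edges of a cycle given as a list of nodes."""
    closed = cycle + cycle[:1]
    return {frozenset(pair) for pair in zip(closed, closed[1:])}

seatings = disjoint_hamilton_cycles(22)  # forty-five guests
used = set().union(*(edges(c) for c in seatings))
print(len(seatings))  # 22
print(len(used))      # 990, every edge of K45, so no pair of neighbors repeats
```

Since 22 cycles of 45 edges each give exactly 990 distinct edges, which is C(45,2), no edge is used twice. Reading node 44 as the queen, each cycle gives one year’s seating chart.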

[Figure: THREE MUTUALLY DISJOINT HAMILTON CYCLES ON K7]

Finding possible orderings of dinner guests efficiently turns out to require
some quite interesting math involving graphs and circuits. These concepts are
applicable in other areas as well, and they can be used to show why certain
relationship structures, such as mutual friendship or mutual unfamiliarity,
must exist in randomly selected groups of people. Next, we will look at Ramsey
Theory and how it can be used to find organization in a number of situations.

SECTION 2.6

Ramsey Theory

• More Pigeons than Holes
• The Party Problem
• Ramsey Numbers

More Pigeons than Holes

• Dirichlet’s box, better known as the pigeonhole principle, is a deceptively
powerful concept that can be used to prove combinatorial results.

Of central importance in Ramsey Theory, and in combinatorics in general, is
the “pigeonhole principle,” also known as Dirichlet’s box. This principle simply
states that we cannot fit n+1 pigeons into n pigeonholes in such a way that only
one pigeon is placed in each hole, with no pigeons left over.

[Figure: THE PIGEONHOLE PRINCIPLE]

The pigeonhole principle may seem to be too obvious and simple to be useful. It
can, however, be used to demonstrate possibly surprising results. For example,
in any big city, Los Angeles let’s say, there must be at least two people with
the same number of hairs on their heads. To see why this is a certainty, let’s
assume that a typical person has about 150,000 hairs on his head. Let’s also
assume that no one has more than a million head hairs.

There are significantly more than one million people in Los Angeles. If we
consider each specific number of hairs on a head to be a pigeonhole and each
person to be a pigeon, then we can assign the pigeons to the holes representing
the number of hairs on their heads. To summarize, there are no more than a
million pigeonholes, a million distinct possible numbers of hairs on a head, and
more than a million people (“pigeons”). Consequently, at least two people must
share exactly the same number of hairs on their heads.
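
The argument can be acted out on a toy scale. In the sketch below, the hair counts are made-up data (eleven people, only ten possible counts), and the function name find_shared_hole is our own; the point is only that a shared pigeonhole must turn up.

```python
from collections import defaultdict

def find_shared_hole(pigeons, hole_of):
    """Return the first two pigeons found in the same hole, or None if there are none."""
    holes = defaultdict(list)
    for pigeon in pigeons:
        occupants = holes[hole_of(pigeon)]
        occupants.append(pigeon)
        if len(occupants) == 2:
            return occupants
    return None

# Hypothetical toy data: person i has hair_counts[i] hairs, each between 0 and 9.
hair_counts = [3, 7, 0, 9, 4, 1, 8, 2, 6, 5, 7]
pair = find_shared_hole(range(len(hair_counts)), lambda person: hair_counts[person])
print(pair)  # [1, 10]: persons 1 and 10 both have 7 hairs
```

With more pigeons than holes the function can never return None, whatever the data.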

We’ll see how this deceptively powerful concept plays out next in the field of
Ramsey Theory.

The Party Problem


• “The Party Problem” states that in any group of six people, either three
people will all know each other, or three people will not know each other.
• “The Party Problem” is an example of Ramsey Theory.

Phrased another way, the Party Problem reveals that in any group of six people,
we are mathematically guaranteed to find either three mutual friends or three
mutual strangers. This is not true in a group of five people, so why is six the
magic number?

Let’s say you attend a party and become engaged in a discussion with five other
people. The six of you could be represented graphically by K6, the complete
graph on six nodes. In this discussion, the relationships between people will be
represented by colored edges on the graph, with a blue edge indicating that the
connected nodes are mutual friends and a red edge denoting mutual strangers.

[Figure: K6]

Notice that each vertex of the K6 graph has five connections. In the following
analysis, we’ll focus on vertex A.

[Figure: A’S FIVE CONNECTIONS, ALSO KNOWN AS A’S NEIGHBORS]

Each of A’s five neighbors is either a friend or a stranger. Notice that, because
there are five neighbors, the pigeonhole principle tells us that at least three of
them must be friends or at least three must be strangers. Let’s focus on the case
in which there are at least three blue edges, and call the three friends B, C,
and D.

[Figure: VERTICES B, C AND D WILL BE CONNECTED BY EITHER A RED EDGE OR A BLUE EDGE]

Vertices B, C, and D will be connected by three edges, each of which will be
either red or blue. Because we are attempting to disprove that at least one of
the triangles formed has to have all edges of the same color, we can ignore the
option in which the remaining edges would all be blue. We need to consider only
the following cases for the colors of the edges connecting B, C, and D:

1. 2 blue, 1 red
2. 2 red, 1 blue
3. 3 red

[Figure: IF ANY OF THE EDGES CONNECTING B, C AND D ARE BLUE, A BLUE TRIANGLE IS FORMED]

Note that if any one of the edges connecting B, C, and D is blue, a blue triangle
is formed, signifying three people who are all mutual friends. Conversely, a red
triangle represents three mutual strangers (i.e., three people, none of whom
knows either of the others).

[Figure: CASE 1: 2 BLUE, 1 RED]

[Figure: CASE 2: 2 RED, 1 BLUE]

[Figure: CASE 3: 3 RED]

All three cases lead to the formation of either a blue triangle or a red triangle.
Note that if edges AB, AC, and AD had been red instead of blue, a similar
argument and similar demonstrations would have led to the same conclusion—it
doesn’t really matter which coloring situations we look at.

This proves that among the six party goers there will be at least a group of three
friends or a group of three strangers. This Party Problem is a classic example of
Ramsey Theory.
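
Because K6 has only fifteen edges, the Party Problem is small enough to settle by exhaustive search as well. The sketch below tries every red and blue coloring of the edges among n people and looks for a triangle of a single color; the function names are our own.

```python
from itertools import combinations, product

def has_one_color_triangle(n, coloring):
    """coloring maps each pair (i, j) with i < j to 'blue' or 'red'."""
    for a, b, c in combinations(range(n), 3):
        if coloring[(a, b)] == coloring[(a, c)] == coloring[(b, c)]:
            return True
    return False

def every_coloring_has_triangle(n):
    """True if all 2^C(n,2) colorings of Kn contain a one-color triangle."""
    pairs = list(combinations(range(n), 2))
    for colors in product(("blue", "red"), repeat=len(pairs)):
        if not has_one_color_triangle(n, dict(zip(pairs, colors))):
            return False
    return True

print(every_coloring_has_triangle(5))  # False: five people are not enough
print(every_coloring_has_triangle(6))  # True: six people always suffice
```

The search over K6 covers 2^15 = 32,768 colorings, which a computer handles instantly. The argument in the text reaches the same conclusion with far less work.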

Ramsey Numbers

• Ramsey Theory reveals why we tend to find structure in seemingly random
sets.
• Ramsey numbers indicate how big a set must be to guarantee the existence
of certain minimal structures.

Ramsey Theory is all about finding structure or organization in sets of data. The
solution to the Party Problem is an example of this kind of structure, and the
size of the party group, six, is known as a Ramsey number. Ramsey numbers
tell you how large a group must be before you are guaranteed to see certain
structures. For instance, the party problem is formally expressed as R(3,3) =
6. This means that six is the smallest number of people that guarantees that
either three of them will be mutual friends or three will be mutual strangers.
R(4,5) designates the smallest number of people that guarantees that either
four of them are mutual friends or five are mutual strangers. It takes a group of
twenty-five to guarantee this, so R(4,5) = 25.

The two examples of Ramsey numbers that we have discussed so far refer to
situations in which there are only two types of relationship between people,
friend or stranger. The application of Ramsey Theory is not limited to binary
situations, however. For example, R(3,3,3) refers to a group in which three types
of relationship are possible. These three relationship types might be friend,
enemy, and neutral. In this case, R(3,3,3) represents the smallest number of
people that guarantees that either three will be mutual friends, three will be
mutual enemies, or three will be mutually neutral. In fact, it takes a group of
seventeen people to ensure this, so R(3,3,3) = 17.

The ideas of Ramsey Theory apply to more than groups of people, however. For
example, similar lines of reasoning can be used to show that if a certain number
of dots are placed in a plane randomly, with no three dots collinear, a certain
subset of the dots will form the vertices of a convex polygon. In fact, placing
five dots randomly in a plane (no three dots collinear) ensures that at least
four of them can be connected to make a convex quadrilateral. This partially
explains why, when we see a star-filled sky, we see recognizable shapes that we
call constellations.

Another interesting application of Ramsey Theory can be found in text analysis.
Any sufficiently long string of letters will have unavoidable regularities, such as
certain combinations or strings of letters that must appear. This can somewhat
explain why people can find hidden messages in large bodies of text, such as the
Bible.

Computing Ramsey numbers, as we saw when we analyzed the Party Problem,
takes a fair amount of cleverness. To find the value of a Ramsey number, you
have to show not only that the size of the collection is large enough to guarantee
that the pattern of interest exists, but also that no smaller group provides the
guarantee. The larger or more significant the pattern or structure, the more
difficult it is to find the minimum group size that guarantees its existence.
Finding an upper limit tends to be fairly easy; what is exceedingly difficult is
showing that no smaller number suffices.

An example of a difficult Ramsey number is the value of R(5,5), the smallest
number of people that guarantees that either five will be mutual friends or
five will be mutual strangers. The value of R(5,5) is known to be somewhere
between forty-three and forty-nine. After years of investigation, this is our
best answer so far. To see why computing Ramsey numbers is so difficult, let’s
just say that we believe that R(5,5) is forty-nine exactly. This would mean that
any collection of forty-nine people has either five mutual friends or five mutual
strangers. To prove that forty-nine is actually the right number, we have to
show that any group of forty-eight will not necessarily have the five strangers or
five friends. A complete graph with forty-eight nodes has 1,128 edges; we can
figure this out by computing C(48,2). Using two colors, one for edges between
“friend” nodes and one for edges between “stranger” nodes, there are then
2^1128 (~ 10^339) possible colorings of the 48-node complete graph. This is the
largest combinatorial explosion we have seen yet! In the worst case, a brute-force
search would have to examine each of these colorings, looking for one that
contains neither five mutual friends nor five mutual strangers, in order for
forty-eight to be ruled out as a candidate value for R(5,5). The difficulty of
computing Ramsey numbers was summed up quite nicely by the great Hungarian graph
theorist, Paul Erdős, when he said:

[…] imagine an alien force, vastly more powerful than us, landing on Earth and
demanding the value of R(5,5) or they will destroy our planet. In that case, […],
we should marshal all our computers and our mathematicians and attempt to find
the value. Suppose, instead, that they ask for R(6,6). In that case, […], we should
attempt to destroy the aliens.
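
The size of the search space mentioned above for the 48-node graph is easy to confirm with exact integer arithmetic. This is only a quick check, and the variable names are our own.

```python
from math import comb, log10

edges = comb(48, 2)     # one edge for each pair of people
colorings = 2 ** edges  # each edge is independently "friend" or "stranger"

print(edges)                                 # 1128
print(f"about 10^{log10(colorings):.1f} colorings")
print(len(str(colorings)), "decimal digits")
```

Compare this with the 2^15 = 32,768 colorings of K6 in the Party Problem: adding forty-two people multiplies the exponent, not just the count.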

SECTION 2.7

DNA Sequencing

• de Bruijn Sequences
• Shotgun Sequencing

de Bruijn Sequences

• A de Bruijn sequence is the shortest string that contains all possible
permutations (order matters) of a particular length from a given set.
• We can construct de Bruijn sequences from a given set by finding a Hamilton
cycle on a directed graph.

Ramsey Theory says that patterns must exist in certain sets of data, whether
they be the connections between people, points of light in the sky, or sequences
of numbers. Remember, however, that Ramsey Theory does not specify what
that pattern is, just that it exists. If we need more-specific information, we will
need more-specific tools.

Consider, for instance, a hypothetical keyless-entry keypad on a car that
requires a 5-digit access code for entry. If you forgot your code, how could you
get into your car? One approach would be to try every possible combination
in succession, starting with 11111 and continuing on to 99999. How many
combinations would you have to try in a worst-case scenario (i.e., if the correct
combination is the very last option you try)? There are nine choices for the first
digit, nine for the second one also, and so on. The total number of possible
sequences would be 9^5 = 59,049, which is about 60,000, a daunting task! Perhaps
we can refine our strategy to speed things up a bit.

An interesting feature of these keypads is that they do not require an “enter”
key. This means that they take an unbroken stream of numbers until the correct
five digits are entered in sequence. So, we could arrange all 60,000 possible
codes into one long string, 300,000 digits in length, which would look like this:
11111 11112 11113…99998 99999. Is this the best strategy to apply? Of course
not. We can see that there are many overlapping sections of the different codes,
the sequence 1111, for example. Entering this pattern more times than we need
to would be redundant and would be quite a waste of time. Might we, instead,
look for a shorter sequence that takes advantage of these overlaps and still
contains all the possible combinations?

Such a sequence, called a de Bruijn sequence, is the shortest sequence that
contains every given k-length ordering of an n-sized set of elements. To see
how one is constructed, let’s look at a somewhat simpler example than our car
keypad above. Let’s pretend that our keypad requires only a two-digit code and
accepts only 1, 2, or 3 as values for those digits. If we were simply to try every
possible combination, we would be trying nine (3 x 3) two-digit orderings, or a
compiled sequence of eighteen digits. We know there are overlaps, so can we
find a de Bruijn sequence for two-digit strings in a set of three elements?

Our combinatorial tool of choice will be a directed graph, that is, a graph in
which the edges can be traversed in only one direction.

[Figure: A SIMPLE DIRECTED GRAPH]

The graph we will use to construct our de Bruijn sequence will have as its nodes
all the possible two-digit orderings:

11
12
13
21
22
23
31
32
33

We will connect these nodes to each other in such a way that a directed edge
from an initial node to a terminal node exists (and is included in the graph) only
if the last digit of the initial node is the same as the first digit of the terminal
node. So, node 11 could connect to nodes 12 and 13 only; node 13 could connect
to nodes 31, 32, and 33 only. The entire web of allowable connections is shown
below.

[Figure: DE BRUIJN GRAPH FOR K = 2, N = 3, with nodes 11, 12, 13, 21, 22, 23, 31, 32,
and 33]

We can define a de Bruijn sequence by finding a path on this graph that connects
all the nodes, returning to where we started. This is a Hamilton cycle, similar to
the one we used in the circular permutation example discussed earlier.

[Figure: HAMILTON CYCLE ON DE BRUIJN GRAPH]

A Hamilton cycle on our de Bruijn graph is defined by this nodal path:

11 > 12 > 21 > 13 > 32 > 22 > 23 > 33 > 31 > 11

This gives us the de Bruijn sequence: 1121322331

So, we can see that instead of entering an 18-digit sequence, we could enter the
10-digit sequence shown above, thereby saving us 44% of our effort. To find the
effort saved, by the way, we just compare the amount of change, 8 digits, to the
original amount, 18 digits: 8/18 ~ 44%.

The results are even more remarkable for our original example. Recall that
our brute-force sequence was 300,000 digits long. A de Bruijn sequence would
shrink this string to around 60,000 digits, saving us 80% of our time and effort.
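
The construction for the three-symbol, two-digit case can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, self-loops such as 11 > 11 are left out to match the connections described in the text, and the cycle is found by simple backtracking rather than by any method the text prescribes.

```python
from itertools import product

def two_digit_graph(symbols):
    """Nodes are all two-digit orderings; an edge u -> v exists when the last
    digit of u equals the first digit of v (self-loops are omitted)."""
    nodes = ["".join(p) for p in product(symbols, repeat=2)]
    return {u: [v for v in nodes if v != u and u[-1] == v[0]] for u in nodes}

def hamilton_cycle(graph, start):
    """Backtracking search for a cycle that visits every node exactly once."""
    path, seen = [start], {start}

    def extend():
        if len(path) == len(graph):
            return start in graph[path[-1]]  # can the cycle be closed?
        for nxt in graph[path[-1]]:
            if nxt not in seen:
                path.append(nxt)
                seen.add(nxt)
                if extend():
                    return True
                seen.discard(path.pop())
        return False

    return path + [start] if extend() else None

def sequence_from_path(path):
    """Glue the nodes together, overlapping each with the previous one by one digit."""
    return path[0] + "".join(node[1:] for node in path[1:])

graph = two_digit_graph("123")
text_cycle = ["11", "12", "21", "13", "32", "22", "23", "33", "31", "11"]
# Drop the closing node before gluing, as the text does.
print(sequence_from_path(text_cycle[:-1]))
print(hamilton_cycle(graph, "11"))
```

The search may return a different Hamilton cycle from the one printed in the text; any such cycle yields a valid 10-digit sequence.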

Shotgun Sequencing
• Modern DNA sequencing involves breaking up a large DNA molecule
into many pieces that can be quickly sequenced simultaneously, and
reassembling the parts based on overlaps in a manner similar to
constructing a de Bruijn sequence.
• Shotgun sequencing is a faster, though less-reliable, method of sequencing
DNA.

This idea of finding the shortest possible string that contains all given
sequences has broader application in the field of genomics. Here, geneticists
wish to find the specific sequence of nucleotides that make up human DNA.
Each strand of our DNA is basically a string of billions of occurrences of the
nucleotides adenine (A), cytosine (C), guanine (G), and thymine (T) in some
specific sequence. Current techniques of reading this sequence cannot handle
such immense lengths. The standard method of reading strands of DNA, the so-
called Chain-Termination method, requires much shorter lengths.

Biologists are faced with the task of taking a given DNA molecule, breaking
it into manageable chunks, reading each chunk, and putting these chunks
back together to construct the original sequence. This is done by randomly
fragmenting the original strand into numerous small segments of many
nucleotides, sequencing these segments via Chain Termination to obtain
“reads,” and then looking at the overlaps in the “reads” to find the shortest
sequence that contains all of the reads.

In doing this, scientists have to assume that nature seeks efficiency. This means
that the chunks should be reassembled in such a way as to minimize the length
of the resultant DNA strand.

Let’s look at a simplified example. Suppose that a DNA strand gave the
following fragments, or “reads,” after multiple rounds:

GGA ATT GAT TGC TTG

From what we learned before, there will be 120 (5!) possible chains that can be
constructed from these reads. Furthermore, because of overlap, not all will be
the same length.

For example, taking the ordering GGA ATT GAT TGC TTG and removing the overlaps
gives GGATTGATGCTTG, which is thirteen nucleotides long.

A different sequence, GGA GAT TGC ATT TTG, reduces to GGATGCATTG, which is
ten nucleotides long.
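
The two chains above can be reproduced with a short Python sketch. The helper names `overlap` and `merge_in_order` are hypothetical, and the sketch assumes that each read is joined to the chain using the longest overlap available.

```python
def overlap(left, right):
    """Length of the longest suffix of `left` that is also a prefix of `right`."""
    for size in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:size]):
            return size
    return 0

def merge_in_order(reads):
    """Chain the reads from left to right, writing each overlap only once."""
    chain = reads[0]
    for read in reads[1:]:
        chain += read[overlap(chain, read):]
    return chain

first = merge_in_order(["GGA", "ATT", "GAT", "TGC", "TTG"])
second = merge_in_order(["GGA", "GAT", "TGC", "ATT", "TTG"])
print(first, len(first))
print(second, len(second))
```

Running the same function over all 120 orderings would be one brute-force way to look for the shortest chain.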

We want to find the shortest possible segment. To do this, we can construct a
directed graph, as we did with our de Bruijn sequence, using the rule that a node
is connected to another node only if the first can be turned into the second by
dropping the initial nucleotide and adding one to the end. In real life, overlaps
are much longer than one nucleotide, and reads are not all of uniform length.
We are examining an ideal, standardized case here to get a sense for the method
that is used.

Applying the chosen rule, we end up with the following graph:

SEQUENCING READS

Our directed graph, connecting each read only if one can be turned into the
other by cutting off the first nucleotide and adding the last.

The Hamilton path of our graph yields:

GGA > GAT > ATT > TTG > TGC

Getting rid of overlaps: GGATTGC

We are lucky in this case because there is only one possible sequence.
Normally, there are multiple viable candidates. Determining which is the actual
sequence requires different types of lab work unrelated to our purposes here.
Nevertheless, using this method greatly reduces the number of candidate
sequences.
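
As a rough sketch of the graph method for these five reads, the Python below builds the directed graph from the stated rule and searches every ordering for Hamilton paths. The function names are hypothetical, and the exhaustive search is only workable because the example is tiny; real assemblers do not work this way.

```python
from itertools import permutations

READS = ["GGA", "ATT", "GAT", "TGC", "TTG"]

def overlap_graph(reads):
    """Edge u -> v when dropping u's first letter and appending one letter gives v."""
    return {u: [v for v in reads if v != u and u[1:] == v[:-1]] for u in reads}

def hamilton_paths(graph):
    """Every ordering that visits each read once while following only edges."""
    return [list(p) for p in permutations(graph)
            if all(b in graph[a] for a, b in zip(p, p[1:]))]

def assemble(path):
    """Keep the first read whole, then add only the last letter of each later read."""
    return path[0] + "".join(read[-1] for read in path[1:])

paths = hamilton_paths(overlap_graph(READS))
for path in paths:
    print(" > ".join(path), "gives", assemble(path))
```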

In reality, reads and overlaps are much longer. Consequently, sequencing them
requires fast computers running efficient, clever algorithms. Combinatorics has
many connections and applications to computing in general, and it is this realm
that we will now explore.

SECTION 2.8

P = NP
• The Traveling Salesperson
• Different Types of Time
• Does P = NP?

The Traveling Salesperson


• The problem of how to find the shortest Hamilton cycle on a weighted graph
has many variations, and the task gets very difficult very quickly as the
graph gets bigger.

Imagine that you are a traveling salesperson and you must visit multiple cities
to make your calls. Because you are responsible for covering the costs of travel,
you are probably quite interested in planning a route that takes you to each city
once with the minimum amount of travel.

This problem is similar to the sequencing problems of the previous section,
except that now not all connections are equal. Such graphs are known as
“weighted graphs” and are somewhat more difficult to deal with than the
unweighted graphs we have seen before.

DIAGRAM OF FOUR CITIES AND ALL THE ROUTES BETWEEN THEM, SHOWN IN WEIGHTED
GRAPH FORM

[Figure: cities A, B, C, and D joined by six routes labeled with their lengths in miles]

With a small number of cities, this problem is not difficult to figure out. Let’s say
that you can start at any city you choose, but you have to return to the same city
to complete the cycle. It should be evident by now that the number of possible
routes will be (n - 1)!. So, for four cities, you will have six possible routes to check.

TABLE SHOWING THE DISTANCES BETWEEN CITIES

Pair of cities    Distance between (miles)
A-B               60
A-C               130
A-D               35
B-C               40
B-D               90
C-D               70

So, this problem quickly becomes a fairly simple exercise in finding the distance
for each route and choosing the shortest. However, suppose you decide to
add another city to your route. Now you have twenty-four possible routes to
investigate: a bigger problem, but still doable. If you were to add yet another
city, you would have 120 possible routes to consider. This is quickly becoming
time-consuming! At this point, it would make sense to use a computer. We
could program the computer to enumerate every route, find their sums (total
travel distances), and then sort the routes by length. As we keep adding cities
to our sales itinerary, we could use our computer to check each route, but even
with as few as ten cities we would have to check more than 360,000 routes
(9! = 362,880). Twenty cities would involve checking approximately 10^17 routes.
Even using our simple algorithm on a fast computer will not enable us to find
such a solution in any realistic amount of time. This is an example of factorial time.
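
The enumerate-and-sort program described above might look like the following Python sketch for the four-city example. The names are hypothetical, the distances are copied from the table in the text, and city A is assumed as the starting point (the choice of start does not change the set of round trips).

```python
from itertools import permutations
from math import factorial

# Distances in miles, copied from the table in the text.
MILES = {("A", "B"): 60, ("A", "C"): 130, ("A", "D"): 35,
         ("B", "C"): 40, ("B", "D"): 90, ("C", "D"): 70}

def distance(a, b):
    """Look up a distance regardless of the direction of travel."""
    return MILES[(a, b)] if (a, b) in MILES else MILES[(b, a)]

def all_routes(cities):
    """Fix the first city as the start, then try every order of the rest."""
    start, rest = cities[0], cities[1:]
    for order in permutations(rest):
        yield (start, *order, start)

def route_length(route):
    return sum(distance(a, b) for a, b in zip(route, route[1:]))

routes = sorted(all_routes(["A", "B", "C", "D"]), key=route_length)
best = routes[0]
print(len(routes), "routes; shortest:", best, route_length(best), "miles")

# How quickly the number of routes, (n - 1)!, grows with the number of cities n.
for n in (4, 5, 6, 10, 20):
    print(n, "cities:", factorial(n - 1), "routes")
```

Each round trip appears twice in this count, once in each direction, which is why two of the six routes tie for shortest.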

Different Types of Time


• How a problem scales, that is, how it changes as it involves more elements,
can be measured by how much time it takes an algorithm to solve it.
• Feasible problems can be solved in polynomial time.

Some problems can be solved in what is known as “polynomial time.”
Multiplying two numbers is an example of this. If you multiply two six-digit
numbers, it will not take appreciably longer than multiplying two five-digit
numbers. For example, long multiplication of two three-digit numbers requires
approximately nine operations. Long multiplication of two five-digit numbers
requires approximately twenty-five operations. In general, multiplying two
n-digit numbers commonly requires n^2 operations. An algorithm in which the
number of steps is a polynomial in the size of the input, n (such as n^2 or
37n^4 - 3n^3 + n - 1), is called a P-method. Problems that have a P-method are
the ones that can be solved in polynomial time.
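
To see where the n^2 comes from, here is a small Python sketch that carries out schoolbook multiplication and counts the digit-by-digit multiplications along the way. The function name is hypothetical, and additions and carries are deliberately not counted, which is an assumption about what the text means by an "operation."

```python
def long_multiply(x, y):
    """Schoolbook multiplication of two non-negative whole numbers.

    Returns the product and the number of digit-by-digit multiplications used.
    """
    xs = [int(d) for d in str(x)][::-1]  # least significant digit first
    ys = [int(d) for d in str(y)][::-1]
    total, steps = 0, 0
    for i, a in enumerate(xs):
        for j, b in enumerate(ys):
            total += a * b * 10 ** (i + j)  # place the partial product
            steps += 1
    return total, steps

print(long_multiply(123, 456))
print(long_multiply(12345, 67890))
```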

The brute-force approach to the traveling salesperson problem grows more quickly
than this: it takes factorial time. There are various methods for solving such
problems. Some involve heuristic algorithms, which, although they may be
quick some of the time, are not dependably quick. Other techniques can achieve
approximate solutions quickly within a specified tolerance of the optimal
solution. Another, theoretical way to solve this type of problem would be to
use a computer that is a “lucky guesser.” Such a computer would, by making
lucky guesses, find the ideal answer in polynomial time. Problems that can
theoretically be solved in polynomial time by such a “lucky” computer are
known as NP. Note that the “lucky computer” method doesn’t really exist as a
way of solving problems. It’s a theoretical construct used to distinguish different
types of computing problems, namely to define the NP class of problems.
Technically, the lucky computer isn’t solving the problem as stated—it is merely
verifying that its guess is correct, which presents a slightly different problem.
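
The difference between finding a route and checking a guessed one can be illustrated with a short Python sketch. It assumes the question is posed as "is there a round trip no longer than a given budget?", and the names `MILES` and `verify_route` are hypothetical. Checking a single guess takes one pass over the route, however many routes exist in total.

```python
# Distances in miles, copied from the four-city table in the text.
MILES = {frozenset("AB"): 60, frozenset("AC"): 130, frozenset("AD"): 35,
         frozenset("BC"): 40, frozenset("BD"): 90, frozenset("CD"): 70}

def verify_route(route, cities, budget):
    """Check a guessed round trip: it must visit every city exactly once,
    return to its starting city, and be no longer than the budget."""
    if route[0] != route[-1] or sorted(route[:-1]) != sorted(cities):
        return False
    total = sum(MILES[frozenset(pair)] for pair in zip(route, route[1:]))
    return total <= budget

print(verify_route(list("ABCDA"), list("ABCD"), 205))
print(verify_route(list("ABDCA"), list("ABCD"), 205))
```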

Does P = NP?
• The question of whether or not NP problems are really P problems in
disguise is still outstanding.

There are many problems similar to that of the traveling salesperson. Packing
boxes of different sizes into a confined space, such as when you pack to move or
go on vacation, is an example. The situations encountered when playing Tetris
can be transformed into the equivalent of the traveling salesperson problem.
All of these problems can be turned into one another, and all of these could be
theoretically solved in polynomial time by a “lucky” computer. Such problems
are known as NP-complete problems.

Because every NP-complete problem can be turned into every other NP-
complete problem, if someone were to find a P-method to solve one of them,
then there would be a P-method to solve all of them. This leads to the question:
Are all NP problems really just P problems in disguise?

This question is one of the major outstanding issues in mathematics, computing,
and complexity theory. It is also one of the Clay Mathematics Institute’s
Millennium Problems. Any person who either shows that P = NP, perhaps by
finding a P-method to solve the traveling salesperson problem, or proves that P
does not equal NP, will win $1,000,000.

UNIT 2 at a glance

SECTION 2.2

Egypt and India
• The Rhind Papyrus, also known as the Ahmes Scroll, contains the earliest known
combinatorial problem.
• The solution to the problem requires using the sum of a geometric series.
• The problem of counting subsets of a larger set was explored by thinkers in
India as early as the 6th century BC.
• Functions map members of one set to members of another set.
• Bijection can be used to enumerate the members of a difficult-to-enumerate
set by establishing a one-to-one correspondence with a set that is easier to
enumerate.
• Using the concept of bijection, we can solve the Indian flavors problem in a
very elegant way.

SECTION 2.3

Flavors Revisited
• Counting combinations, in which order does not matter, is different than
counting permutations, in which order does matter.
• The factorial operation is very important in counting permutations.
• The formula for combinations of n objects taken k at a time can be found
by first looking at the permutations of n objects taken k at a time and then
dividing by the number of permutations of k objects taken k at a time.

SECTION 2.4

Pascal’s Triangle
• Pascal’s Triangle is an important and widely useful mathematical concept.
• At its heart, Pascal’s Triangle is a recursive relationship by which we can,
given previous elements, find subsequent ones.
• Pascal’s equation can be used to create his famous triangle, which can in
turn be used in a variety of ways to count different types of subsets.
• There are many interesting mathematical relationships, or identities, hidden
within Pascal’s Triangle.

SECTION 2.5

The Order of the Garter
• Counting permutations is different than simply counting combinations
because order must be taken into account.
• The seating arrangement at the annual brunch of the Order of the Garter in
England is an example of circular permutations in action.
• We can envision circular permutations as Hamilton cycles on complete
graphs.

SECTION 2.6

Ramsey Theory
• Dirichlet’s box, better known as the pigeonhole principle, is a deceptively
powerful concept that can be used to prove combinatorial results.
• “The Party Problem” states that in any group of six people, either three
people will all know each other, or three people will not know each other.
• “The Party Problem” is an example of Ramsey Theory.
• Ramsey Theory reveals why we tend to find structure in seemingly random
sets.
• Ramsey numbers indicate how big a set must be to guarantee the existence
of certain minimal structures.

SECTION 2.7

DNA Sequencing
• A de Bruijn sequence is the shortest string that contains all possible
permutations (order matters) of a particular length from a given set.
• We can construct de Bruijn sequences from a given set by finding a Hamilton
cycle on a directed graph.
• Modern DNA sequencing involves breaking up a large DNA molecule
into many pieces that can be quickly sequenced simultaneously, and
reassembling the parts based on overlaps in a manner similar to
constructing a de Bruijn sequence.
• Shotgun sequencing is a faster, though less-reliable, method of sequencing
DNA.

SECTION 2.8

P = NP
• The problem of how to find the shortest Hamilton cycle on a weighted graph
has many variations, and the task gets very difficult very quickly as the
graph gets bigger.
• How a problem scales, that is, how it changes as it involves more elements,
can be measured by how much time it takes an algorithm to solve it.
• Feasible problems can be solved in polynomial time.
• The question of whether or not NP problems are really P problems in
disguise is still outstanding.

TEXTBOOK
Unit 3
UNIT 03
How Big Is Infinity?
TEXTBOOK

UNIT OBJECTIVES

• Ideas of infinity come to light when considering number and geometry, the worlds
of the discrete and the continuous.

• Incommensurability is the idea that there is no measurement unit that fits into
some two quantities a whole number of times.

• Incommensurability led to the discovery of irrational numbers.

• Irrational numbers have decimal expansions that never end and never repeat.

• Two sets are the same size if their elements can be put into one-to-one
correspondence with one another.

• The size of a set is its cardinality.

• There is more than one type of infinity.

• The sets of rational and real numbers are examples of two different sizes of infinity.

• To properly describe the different sizes of infinity, a new definition of number is
required.

• Given a set of any size, one can create a larger set by taking the subsets of the
original set.
It is well known that the man who first made
public the theory of irrationals perished in a
shipwreck in order that the inexpressible and
unimaginable should ever remain veiled. And
so the guilty man, who fortuitously touched on
and revealed this aspect of living things, was
taken to the place where he began and there is
forever beaten by the waves.

Proclus Diadochus (412 - 485)


If you disregard the very simplest cases,
there is in all of mathematics not a single
infinite series whose sum has been
rigorously determined. In other words,
the most important parts of mathematics
stand without a foundation.

Niels H. Abel (1802 - 1829)


SECTION 3.1

INTRODUCTION

From an early age, we have an intuitive sense that there can be no biggest
number. As soon as we learn how to add two numbers together, we have at our
disposal a mechanism by which we can make any number bigger—just add one!
We have a sense of both a process and a set—the set of all numbers—that are
infinite, larger than anything in our daily experience. We also learn a hierarchy
of numbers: a billion conquers a million, a googol beats a billion, and infinity is
the sovereign value, untouchable in its perfection.

What exactly is infinity? Does it really exist? It certainly doesn’t play any
obvious role in our everyday lives. We are finite, and we live in a finite world.
Our lives have definite beginnings and endings, and we measure the time
between these two points using discrete, finite units such as years, minutes,
and seconds. Similarly, the physical space in which we live our lives and enact
our everyday pursuits is bounded and separated into fundamental units, such
as miles and millimeters. Our best bet for grasping some sensory experience
of infinity is to gaze toward the heavens on a starry night. It remains an open
question, however, whether or not the universe actually extends forever.

The process of adding the number one to another number to make it greater
does not make the result infinite—it merely makes another, greater, finite
number. The Greeks called a quantity or a collection “potentially infinite” if,
given any finite example of that quantity, a “larger” example could always be
found. In this respect, a line segment (a “collection” of points) is potentially
infinite, because it can always be made longer, and the set of counting numbers
is potentially infinite, because from one counting number, we can always
construct a greater one. To conceive a quantity that is actually infinite, however,
is mind-bending and, in many ways, perplexing. What happens if we add the
number one to an actually infinite number? It’s already infinite—does adding to
it make it greater? How could one possibly have something “more” than infinity?

Such a concept is often called “actual infinity,” and it is much more problematic
than “potential infinity.” It defies intuition, forcing us to rely on logic to explore
the defining aspects of the concept. The idea of actual infinity has been
disturbing to mathematicians since at least the time of the Greeks. At times,
it seems to be more an invention of the human imagination than anything real,
and, ideally, mathematics should be the language that describes reality, not
fantasy.

Despite its nebulous reality, the concept of infinity has long teased at
mathematicians’ minds. Around 500 BC it manifested in the form of
incommensurable quantities, a concept akin to heresy in the view of many,
particularly the followers of Pythagoras. At almost the same time, paradoxes
posed by the philosopher Zeno showed that infinity was a difficult concept for
the human mind to comprehend. It was at this point that the Greeks reluctantly
accepted the use of infinity in mathematics, but they left the challenge of
understanding it to philosophers and priests.

This view persisted for centuries. Infinity was a tool that could be used in
mathematics, even if it was not well understood. As it turned out, infinity proved
to be an indispensable tool for 16th- and 17th-century mathematicians seeking
to use mathematical concepts to describe real-world physical phenomena. This
is most evident in the field of calculus. In order for calculus to work (and we
assume that it does work, because it describes the physical world superbly),
we have to believe that actual infinity exists—that is, we have to believe at least
that an infinite process can have a finite result. So, the concept of infinity proved
useful then in much the same way that a modern cell phone does now; we
certainly don’t have to know how it works in order to make use of it.

Mathematics, however, is supposed to be based on rock-solid, well-understood
principles. As the tower of mathematics grew larger and more intricate,
with each new idea depending on the validity of those that came before it,
mathematicians began to double-check the foundational principles. They were
concerned that it might be a bad idea to base large parts of our understanding
of the world on a concept, infinity, that we fundamentally do not understand.
Enter Georg Cantor. Cantor sought to understand mathematically the infinite
and the consequences of believing in an actual infinity. He did this by creating
the language of sets, which are just collections of objects, such as numbers.
In doing so, he had to redefine what a number really is. Through some of the
most creative and ingenious mathematics ever done, Cantor showed, contrary
to intuition, that there can be different sizes of infinity. His polarizing results
generated much controversy that, to this day, is not completely resolved.

In this unit we will explore infinity by first looking at rational numbers and
some of their properties. We will then see how incommensurable quantities
and irrational numbers suggest that infinity is at work in the number system.
Through Zeno’s paradoxes, we will catch a glimpse of how difficult infinity can
be to understand. From there we will look at sets of numbers and re-learn how
to count in a way that will enable us to approach the concept of infinity. With
these tools in hand, we will get a sense of the universe of infinities that Cantor
discovered, culminating in Cantor’s Theorem, one of the most counter-intuitive
ideas in mathematics.

SECTION 3.2

Rational Numbers
• Common Denominator
• The Ever-Expanding Decimal
• Number vs. Magnitude

COMMON DENOMINATOR
• Rational numbers arise from the attempt to measure all quantities with a
common unit of measure.

The pursuit of infinity begins with an examination of the idea of number, or
quantity. Numbers originally were tools used to quantify groups of objects or
to measure real things, such as the length of a pole or the weight of a piece
of cheese. Measuring something requires some fundamental unit that can
be used as a basis of comparison. Some things, such as rope or time, can be
measured, or quantified, using a variety of units. In measuring a length of rope,
for example, we might express the result as either “5 feet” or “60 inches.”
The length of time from one Monday to the next Monday is commonly called a
“week,” but we could just as easily—and correctly—call it seven “days,” 168
“hours,” 10,080 “minutes,” or 604,800 “seconds.” These number expressions all
represent the same length of time and, thus, are interchangeable. Converting
any one of these equivalent values into another simply requires multiplying
or dividing by some whole number. For example, 10,080 minutes is 168 hours
times 60. In fact, every one of the above measurements could be converted into
seconds by multiplying by appropriate whole number values.

Now, we can take any two quantities and ask a similar question: can we find a
common unit of measurement that fits a whole number of times into both? Take
8 and 6, for example. This is straightforward; both 8 and 6 are whole numbers,
so we can use whole units to measure them both. What about 1/2 and 3/8? Here,
each of the quantities uses different base units, namely halves and eighths, and
comparing them would make little sense, because they are different things. We
can find a common unit of comparison, however, by recognizing that 1/2 is the
same as 4/8.

So, we found a common denominator of 8, which implies that both of the
fractions could be expressed as multiples of the same unit, “one eighth.” In
some sense, we have redefined our basic unit of measurement, or fundamental
piece, to be 1/8 of the original piece, thereby transforming the original fractions
into easily-compared multiples of the same fundamental unit.

[Figure: 1/2 redrawn as four pieces of size 1/8, beside 3/8 drawn as three pieces of size 1/8]

If two numbers can be expressed as whole-number multiples of some common
unit, of whatever size, they are in some sense “co-measurable” in that we can
measure both using the same ruler. The proper mathematical term for “co-
measurable” is “commensurable.” One way to think about this is that two
lengths are commensurable if there is a basic unit of measure that fits into both
of them a whole number of times. If we were to cut some length of rope into two
commensurable pieces of lengths a and b, there would be some third length, c,
and whole numbers m and n, such that a = mc and b = nc. In other words, these
two numbers could be expressed as multiples of some common unit.
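
For two positive rational lengths, a common unit can be computed directly. The Python sketch below uses a hypothetical helper, `common_unit`, and returns the largest unit c together with the whole numbers m and n; any whole-number fraction of that c would serve equally well as a common unit.

```python
from fractions import Fraction
from math import gcd

def common_unit(a, b):
    """Largest unit c with a = m*c and b = n*c for whole numbers m and n.

    Assumes a and b are positive rational numbers.
    """
    a, b = Fraction(a), Fraction(b)
    # Put both lengths over a shared denominator, then take the gcd of the numerators.
    shared = a.denominator * b.denominator
    c = Fraction(gcd(a.numerator * b.denominator, b.numerator * a.denominator), shared)
    return c, a / c, b / c

c, m, n = common_unit(Fraction(1, 2), Fraction(3, 8))
print("unit:", c, "m:", m, "n:", n)
```

For 1/2 and 3/8 this recovers the unit of one eighth, with 1/2 being four of them and 3/8 being three.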

[Figure: two lengths of rope, each marked off in whole copies of the same unit c]

Using a little algebra, we can confirm that the ratio of magnitudes of our two
commensurable quantities is equal to a ratio of whole numbers:

\[ \frac{a}{b} = \frac{mc}{nc} \]

Canceling the common factor of c yields the equation:

\[ \frac{mc}{nc} = \frac{m}{n} \]

Numbers that can be expressed as ratios of two whole numbers are called
rational numbers. This idea of “creating” numbers that may not relate to any
observable value in the real world is fairly modern. Although today we are
comfortable speaking of a number such as 13/25 as a concept that “exists” in its
own right, the ancient Greeks generally took care to phrase things only in terms
of geometric quantities, those that exist in the physical world. For instance,
they might have spoken of two lengths of rope, one that could be described as
13 measures of a certain unit, and the other 25 units of that same measure, but
they would not necessarily speak of the shorter length being 13/25 of the other.

[Figure: a length of 13 units beside a length of 25 units, versus one length expressed as 13/25 of the other]

On the other hand, today we are comfortable saying, for example, that a
length of string is 13/25 of an inch long. In doing so, we are saying that it is
commensurable with a piece of string one inch long, the fundamental unit of
comparison being 1/25 of an inch.

The modern and ancient views of rational numbers are intimately linked, but
it is important to remember that the Greeks thought of commensurability in
terms of whole units. The Pythagoreans, the followers of Pythagoras of Samos
in 6th century BC Greece, held sacred the idea that the first principle underlying
everything is “arithmos,” the intrinsic properties of whole numbers and their
ratios. It is certainly a tidy idea that whole numbers, or ratios of them, are all
that is required to describe the world mathematically. It is thought that this
belief had origins in both the study of figurate numbers and the recognition
that strings or hammers of commensurable length sounded harmonious when
played or struck together.

THE EVER-EXPANDING DECIMAL


• Rational numbers can be expressed as decimals that repeat to infinity.

In any ratio of two whole numbers, expressed as a fraction, we can interpret
the first (top) number to be the “counter,” or numerator (that which indicates
how many pieces), and the second (bottom) number to be the “namer,” or
denominator (that which indicates the size of each piece).

In modern arithmetic, we use a base-10 system to count, or evaluate, things.
Large quantities are generally represented in terms of ones, tens, hundreds
and the like, whereas small quantities are more easily represented in terms of
tenths, hundredths, thousandths, and so on. Although the Greeks did not use
a base-10, or decimal, number system, it is illuminating to see how rational
numbers behave when expressed as decimals.

For example, we can interpret the number 423 as four 100s, two 10s, and three
units (or 1s), and the value 0.423 as four (1/10)s, two (1/100)s, and three
(1/1000)s. In such a decimal system it is necessary to think of all quantities in
terms of units of tens, tenths, and their powers. Thus, 1/2, for instance, must be
interpreted as 5/10, to be written as 0.5.

The question of whether or not 0.5, or 5/10, represents the same quantity as 1/2
deserves a bit of thought, however, because it highlights a subtle difficulty with
our understanding of rational quantities (and maybe with our understanding
of “number” itself). To explain it, let’s return to the Greek point of view of
commensurability. Recall that two lengths, a and b, are commensurable if there
exists a common unit of measure, u, such that each length can be generated
by taking u a whole number of times: a = mu and b = nu. Applying algebra, it is
easy to confirm that the ratio a/b equals the ratio of whole numbers m/n. What
would happen, though, if we worked with a smaller unit, v, that fits five times
into u (that is, u = 5v)? Then, we would have: a = 5mv and b = 5nv, and the ratio
a/b still would be equal to the ratio of whole numbers (5m)/(5n), or m/n. This
shows that a rational number is not simply a ratio of any two specific whole
numbers, but rather, represents a collection of “equivalent” whole-number
ratios. A consequence of all of this is that our modern notion of a rational number
is, in itself, somewhat abstract and troublesome to comprehend. Putting
philosophical woes aside for the present, we have at least seen that 1/2 and 5/10
are different representations of the same ratio.

Many find it useful to view rational quantities as answers to division problems.
For example, sharing one apple equally between two students results in each
student receiving half of an apple. Dividing two apples equally among three bins
yields $\frac{2}{3}$ of an apple per bin. Thankfully, each equivalent representation of a
rational number, interpreted as a division problem, yields the same physical
result: dividing four apples among six bins, and 10 apples among 15 bins, and
200 apples among 300 bins, all yield the same result as dividing two apples
among three bins.
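
The equivalence of these ratios can be checked mechanically. The following is a minimal sketch, assuming Python and its standard `fractions` module (neither appears in the original text); the helper name `same_ratio` is our own, chosen for illustration.

```python
from fractions import Fraction


def same_ratio(a, b, c, d):
    """Return True when a/b and c/d name the same rational quantity.

    Cross-multiplication avoids division: a/b = c/d exactly when a*d = b*c
    (b and d are assumed to be nonzero whole numbers).
    """
    return a * d == b * c


# Fraction stores each ratio in lowest terms, so equivalent ratios compare equal.
print(Fraction(5, 10) == Fraction(1, 2))   # True

# The apple examples: 4 among 6, 10 among 15, and 200 among 300 all match 2 among 3.
for apples, bins in [(4, 6), (10, 15), (200, 300)]:
    print(apples, bins, same_ratio(apples, bins, 2, 3))
```

Cross-multiplying is used here because it stays entirely within whole numbers, which is in keeping with the Greek view of ratio described above.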

We will use this division model to our advantage as we convert fractions into
decimal representations. For example, to write $\frac{4}{7}$ as a decimal number, we
can think of the process of dividing four things, such as apples, among seven
bins. Our decimal representation is then the number of apples in each bin, with
one whole apple being our fundamental unit. Because we are dividing only four
apples equally into seven bins, we realize that each bin must receive less than
one whole apple, so the value in the 1’s place of our decimal-expansion number
must be 0.

[Figure: four apples are each sliced into tenths, giving 40 slices, each $\frac{1}{10}$ of an
apple. 7 bins × 5 slices = 35 slices accounted for, with 5 slices left over. Each
bin has 5 slices = 0.5 of an apple.]

How do we actually envision this process, though? What would we actually do to
the four apples to achieve the equal distribution into the bins? We could begin
by cutting each apple up into ten equal pieces, which would give us 40 slices,
each being a tenth the size of a whole apple. If we were then to apportion these
slices equally into the seven bins, each bin would receive five slices (or five-
tenths of an apple) with five slices left over. Note that the content of each bin
after this initial distribution is represented by the decimal 0.5.

[Figure: each of the 5 leftover slices is cut into 10 pieces again, giving 50 slices,
each $\frac{1}{100}$ of an apple. 7 bins × 7 slices = 49 slices accounted for, with 1 slice
left over. Each bin now has 0.57 of an apple.]

If we repeat this process, dividing the five leftover tenths each into ten equal
pieces, we would have 50 slender slices, each being a hundredth of a whole
apple. Apportioning these slices equally among the bins would mean that
each bin receives seven slices (or $\frac{7}{100}$ of an apple), with one slice left over. The
accumulated total in each bin can now be represented by the decimal 0.57.

[Figure: the 1 leftover slice is cut into 10 pieces, giving 10 slices, each $\frac{1}{1000}$ of
an apple. 7 bins × 1 slice = 7 slices accounted for, with 3 slices left over. Each
bin now has 0.571 of an apple.]

If we take the one leftover slice and cut it into ten equal pieces, we will create
slices that are each just a thousandth of a whole apple. With equal distribution,
each bin receives just one of these slices, and three are left over. The total
amount of whole apple now in each bin can be represented by the decimal 0.571.

ETC.

We can continue this process of dividing each leftover slice into ten pieces,
placing an equal number of slices into each of the seven bins, and then dividing
the leftovers again, indefinitely. In this particular example, we would soon find
that the number sequence repeats itself after six decimal places so that the
decimal representation of $\frac{4}{7}$ is 0.571428571428…. Why must the decimal repeat?
In our example, there are only six choices for non-terminating remainders
(i.e., 1, 2, 3, 4, 5, and 6). Note that a remainder of zero would end the division
process and create a terminating decimal. In the absence of termination, one of
the remainder values must reoccur, thereby beginning a repeating sequence of
numbers.
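
The slicing procedure and the remainder argument can be sketched in a few lines of code. This is only an illustration, assuming Python; the function name `decimal_digits` and its `max_digits` cutoff are our own inventions, not part of the text. Each pass through the loop cuts the leftovers into ten pieces, deals them out to the bins, and records the remainder so that a repeat can be detected.

```python
def decimal_digits(numerator, denominator, max_digits=30):
    """Long division of numerator/denominator, with 0 <= numerator < denominator.

    Returns (digits, repeat_start): digits is the list of decimal digits
    produced, and repeat_start is the index at which the repeating block
    begins (None if the decimal terminates or max_digits is reached first).
    """
    digits = []
    seen = {}                  # remainder -> position at which it first appeared
    remainder = numerator
    while remainder != 0 and len(digits) < max_digits:
        if remainder in seen:
            # A remainder has come back, so the digits from here on repeat.
            return digits, seen[remainder]
        seen[remainder] = len(digits)
        remainder *= 10                           # cut each leftover into ten slices
        digits.append(remainder // denominator)   # slices dealt to each bin
        remainder %= denominator                  # slices left over
    return digits, None


print(decimal_digits(4, 7))   # ([5, 7, 1, 4, 2, 8], 0)
print(decimal_digits(1, 2))   # ([5], None)
```

For 4/7 the remainders run 4, 5, 1, 3, 2, 6 and then return to 4, which is why the block 571428 has exactly six digits.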

Note that this expansion never ceases, continuing for as long as we care to
continue the division process. This is somewhat reminiscent of the potential
infinity we talked about in the introduction to this unit. We can always determine
another decimal place value, but after recognizing the repeating pattern, we
don’t need to.

There is nothing special about the ratio values 4 and 7 in this example. The
logic of breaking up leftovers into ten equal pieces and distributing those
pieces equally holds for whichever two numbers we choose. In this way, any
rational number can be written as a repeating decimal. Even fractions that
can be represented by “terminating” decimals, such as $\frac{1}{2}$, can be thought of as
repeating, if we recognize that $\frac{1}{2}$ = 0.5 = 0.5000000….

Conversely, any repeating decimal can be shown to be a ratio of whole numbers.
Consider, for example, the decimal 0.4444…. If we let x = 0.4444…, we are saying
that x consists of four $\frac{1}{10}$s, four $\frac{1}{100}$s, four $\frac{1}{1,000}$s, and so on. Ten times
this value (10x) would then be four units (1s), four $\frac{1}{10}$s, four $\frac{1}{100}$s, and so on;
more concisely, 10x = 4.4444….

With the decimal values of both x and 10x established, we can construct this
calculation:

10x = 4.4444…
- x = -0.4444…
9x = 4

Solving the resulting equation for x, we get:

$x = \frac{4}{9}$

Notice that this works because every 4 to the right of the decimal point in
the number 4.4444… matches up with a 4 to the right of the decimal point in
the number 0.4444…. When the two numbers are subtracted, all these 4s
completely cancel out.

This method of converting a repeating decimal into a fraction also works for
decimals that have longer repetition sequences, such as 0.325325325….

Let x = 0.325…
Then 1,000x = 325.325…

1,000x = 325.325…
- x = -0.325…
999x = 325

$x = \frac{325}{999}$

In the above example, both 0.325… and 325.325… exhibit an infinite decimal
expansion, yet we can cancel all the digits to the right of the decimal point
because each decimal digit in 325.325… matches up with an equivalent decimal
digit in 0.325…, leaving only the whole number 325 after subtracting the two
quantities. This idea of establishing a one-to-one correspondence among the
decimal digits provides a glimpse of how we might think mathematically about
infinity, an approach that will be of supreme importance later on in this unit.
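
The subtraction trick generalizes: for a repeating block of k digits, multiplying by 10 to the power k shifts exactly one block to the left of the decimal point. The sketch below assumes Python's standard `fractions` module, and the function name `repeating_to_fraction` is hypothetical. It handles only decimals of the form 0.(block)(block)…, with the repetition starting immediately after the decimal point.

```python
from fractions import Fraction


def repeating_to_fraction(block):
    """Convert 0.(block)(block)... into a fraction, e.g. "325" -> 325/999.

    With k digits in the block, 10**k * x - x = int(block), so
    x = int(block) / (10**k - 1).
    """
    k = len(block)
    return Fraction(int(block), 10**k - 1)


print(repeating_to_fraction("4"))        # 4/9
print(repeating_to_fraction("325"))      # 325/999
print(repeating_to_fraction("571428"))   # 4/7
```

Note that `Fraction` reduces its result to lowest terms, which is how 571428/999999 is reported as 4/7.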

Number vs. Magnitude


• In the mathematics of early Greece, there was a strong distinction between
discrete and continuous measurement.
• Number refers to a discrete collection of atom-like units.
• Magnitude refers to something that is continuous and that can be infinitely
subdivided.
• Rational numbers can be expressed as decimals that repeat to infinity.

Early Greek mathematicians divided mathematics into the study of number, or
multitude, and the study of geometry, or magnitude. The multitude concept
presented numbers as collections of discrete units, rather like indivisible atoms.
Magnitudes, on the other hand, are continuous and infinitely divisible. Because
length is a magnitude, a line segment can be divided as many times as one likes.

The Pythagoreans believed that magnitudes could always be measured using
whole numbers, which would imply that lengths are not infinitely divisible.
Other schools, such as the followers of Parmenides, known as the Eleatics,
believed in the infinite divisibility of magnitudes.

Parmenides taught that true “being” is unity, static, and unchangeable. This
is similar to the idea that “all is one,” which implies that concepts such as
multiplicity and motion are illusions. If everything is part of the same thing,
then there are no “multiple” things and, consequently, no motion, which is the
change in position of one thing relative to another. Pythagoreans believed in
multitude and motion perhaps because these concepts are intuitive, part of
collective common experience. A consequence of the Pythagorean notion of
multiplicity is that magnitudes should be commensurable. To the Pythagoreans,
the idea that between any two quantities in nature there exists a common unit
of measure, a common denominator, may have been comforting. It perhaps
suggested that the rational mind can always find a solid basis for comparison,
and does not have to rely on guesswork to say definite things about reality.

It would be easy to dismiss the Eleatic view, if it were not for the arguments of
one of Parmenides’ most famous pupils, Zeno. As we shall see, Zeno argued
against the Pythagorean notions of multiplicity and motion, using infinity to
show contradictions in this view. Prior to Zeno, however, problems with the
Pythagorean viewpoint arose from within their own ranks in the form of an
independent thinker by the name of Hippasus of Metapontum. Hippasus showed
that magnitudes are not always commensurable, an idea that upset his peers to
such a degree that, as the legend goes, he was drowned for his heresy. In the
next section, we shall examine the idea and consequences of incommensurable
magnitudes.

SECTION 3.3
Incommensurability and irrationality

• Odds and Evens
• The Infinite Chase

Odds and Evens


• The side and the diagonal of a square are incommensurable.

To recap, the Greek concept of magnitude was somewhat tied to what a person
could measure in the real world. The Pythagoreans believed that all magnitudes
in nature could be represented through arithmos, the intrinsic properties of
whole numbers. This means that for any two magnitudes, one should always be
able to find a fundamental unit that fits some whole number of times into each
of them (i.e., a unit whose magnitude is a whole number factor of each of the
original magnitudes), an idea known as commensurability. Hippasus argued
against this idea by demonstrating that for some magnitudes this simply isn’t
the case: they are incommensurable. Although his original argument is lost to
the ages, the following proof, which uses algebraic notions that would have been
unfamiliar to the Greeks, gives a sense of the discovery that changed Greek
mathematics forever.

Let’s imagine a square with a side of length a and diagonal of length b.

[Figure: a square with side length a and diagonal length b.]

If these lengths are commensurable, as Pythagoras and his followers believed
(without proof), then there is a common unit u such that a = mu and b = nu for
some whole numbers m and n. We can assume that m and n are not both even
(for if they were, it would indicate that the common unit could instead be 2u, and
we would simply make that adjustment). So, we can safely assume that at least
one of these numbers is odd.

Applying Pythagoras’ theorem to the triangle formed in the square, we have:

$a^2 + a^2 = b^2$

That is,

$2a^2 = b^2$

or, substituting our common unit expressions for the two lengths,

$2m^2u^2 = n^2u^2$

We know that our common unit, u, can’t be zero, so we can cancel the $u^2$ term
from both sides of the equation, leaving:

$2m^2 = n^2$

Obviously, $n^2$ is even, because it is equal to some number, $m^2$, multiplied by two.
If $n^2$ is even, then n must be even also (if n were an odd number, then $n^2$ would be
odd). We can express the even number n as two times some number.

n = 2w

Substituting this expression for n into the preceding equation gives us:

$(2w)^2 = 2m^2$

$4w^2 = 2m^2$

$m^2 = 2w^2$

This reveals that $m^2$ is a multiple of two, that is, an even number. Consequently,
as we reasoned before, m must also be even, and we can write:

m = 2h

Now we have found a contradiction! Remember, we assumed at the beginning
that either m or n was odd, yet we have just shown that both have to be even.
This logical contradiction proves that there is no common unit, u, that fits a
whole number of times into both a and b. Therefore, a and b, the lengths of the
side and diagonal of a square, are incommensurable.
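
A computer search cannot replace the proof, but it can make the claim concrete. The following sketch, assuming Python and its standard `math.isqrt` function, looks for whole numbers m and n satisfying $2m^2 = n^2$ over a finite range; the function name `find_square_double` and the search limit are our own choices for illustration.

```python
from math import isqrt


def find_square_double(limit):
    """Search 1 <= m <= limit for whole numbers with 2*m*m == n*n.

    Returns the first pair (m, n) found, or None if there is none in range.
    """
    for m in range(1, limit + 1):
        n = isqrt(2 * m * m)          # largest whole n with n*n <= 2*m*m
        if n * n == 2 * m * m:
            return (m, n)
    return None


print(find_square_double(100_000))    # None
```

The search comes back empty, as the parity argument says it must for every range, not merely this one.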

The Infinite Chase


• Incommensurable quantities are not rationally related, because this
logically leads to an infinite regress.

What does incommensurability have to do with infinity? A contemporary of
Hippasus, Theodorus of Cyrene, proved the incommensurability of the side
and diagonal of a square by showing that no matter how small a unit one
uses to measure the side and the diagonal, it will never fit a whole number of
times into both. In fact, selecting smaller and smaller units merely leads to an
infinite regression of triangles. Theodorus’ approach is illuminating in that it is
more in line with how the Greeks thought about mathematics than the previous
demonstration of incommensurability.

To get a sense of Theodorus’ proof, let’s again focus on the isosceles right
(also commonly called a “45°-45°-90°”) triangle formed by two sides and the
connecting diagonal of a square. Designating this as triangle ABC, with legs
of length a and hypotenuse length b, let’s once more assume that there is a
fundamental unit of measurement capable of representing the lengths of both a
side and the hypotenuse in whole number multiples; that is:

a = mu and b = nu

Along the hypotenuse of the right triangle, we can measure a length equal to the
side’s length and construct a new 45°-45°-90° triangle CDE as shown:

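[Figure: triangle ABC with legs of length a, shown beside the same triangle with the
smaller triangle CDE constructed inside it, D on the hypotenuse AC and E on the leg BC.]
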
Without too much difficulty, we can show that all three segments, BE, ED, and
DC, are congruent and that each of these lengths is b − a, again a whole number of
copies of u. (We won’t go through the proof, but you would begin by constructing a
line from A to E and showing that the two triangles ABE and ADE are congruent.)

Thus, from any 45°-45°-90° triangle with sides whose measure is a multiple of
u, we can construct a smaller 45°-45°-90° triangle with sides whose measure is
also a multiple of u. We can keep doing this for a number of iterations.

Eventually, however, we will obtain a 45°-45°-90° triangle so small that the length
of each of its sides is less than u, which can’t be: u was supposed to be the
fundamental unit! We might be tempted to think that perhaps u was too big to be
the fundamental unit. Using a smaller unit, however, would only delay the
inevitable fact that at some point we will reach a triangle with sides whose
lengths are shorter than our fundamental unit. Choosing ever smaller units leads
to ever smaller “terminal” triangles for as long as we care to continue the
process, another example of potential infinity. Our beginning assumption that
there was a common unit of measure leads to an absurdity.
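
To see the chase run out of room, we can carry out the construction on whole-number multiples of u. The sketch below assumes Python, and the function name `chase` is hypothetical. The legs of the smaller triangle are b − a, as stated above; its hypotenuse, which works out to 2a − b (the leg a minus the segment b − a), is our own addition and is not given in the text. The starting pair 985 and 1393 is simply a convenient near miss, since 1393² differs from 2 × 985² by only 1.

```python
def chase(m, n):
    """Repeat the construction on whole-number multiples of the unit u.

    A triangle with legs m*u and hypotenuse n*u yields a smaller one with
    legs (n - m)*u and hypotenuse (2*m - n)*u. Returns every
    (legs, hypotenuse) pair visited while the legs stay positive.
    """
    pairs = []
    while m > 0:
        pairs.append((m, n))
        m, n = n - m, 2 * m - n
    return pairs


for legs, hypotenuse in chase(985, 1393):
    print(legs, hypotenuse, 2 * legs**2 - hypotenuse**2)
```

The third column stays at +1 or −1 and never reaches 0, so none of these triangles is a true isosceles right triangle. The chain stops at (1, 1) because whole numbers cannot shrink forever, which is exactly the absurdity the argument exploits.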

We have seen two different ways of demonstrating that the diagonal of a square
is incommensurable with its side length. It is not uncommon today to calculate
that if the side length of the square is 1 unit long, then its diagonal is $\sqrt{2}$ units
long. The Greeks, themselves, may not have agreed that something such as
this is a number. Recall that the Pythagoreans viewed numbers as discrete
collections of atom-like units. This view of numbers requires that we have a
whole number “counter” to determine the size of the collection and a whole
number “namer” to sit in the denominator of the ratio and designate the size of
the unit. However, $\sqrt{2}$ poses a problem because it is not useful in this method;
it does not allow us to use whole numbers to serve as “counters” and “namers.”
This concept put the Pythagoreans in a bind, because it demonstrates that the
length of the diagonal of a unit square cannot be a number; consequently, “all
is not number.” If we insist that such a number must exist because it measures
a magnitude that actually does exist, then it is clear that we do not know what a
number really is. We shall return to this problem a bit later in the text.

The incommensurability argument essentially shows that there are no whole
numbers m and n such that $\sqrt{2} = \frac{m}{n}$. We call quantities like these “irrational,”
and we have seen that their existence is fundamentally linked to a manifestation
of infinity (the infinite regress of Theodorus’ proof, for instance). In the previous
section, we saw that any rational number can be written as a repeating decimal
and vice versa. However, it doesn’t take much thought to conceive of a decimal
that does not repeat any finite digit sequence and does not end, such as:

0.101001000100001…

Putting aside for a moment the question of whether or not something like this
actually exists, we can say at least that this thing cannot be rational, because
if it were it would repeat itself, which it is clearly not going to do. Its decimal
expansion extends to infinity with no repetitive elements. This brings us to the
point that any non-repeating decimal is non-rational, or irrational. It can also
be shown that, like $\sqrt{2}$, any square root of a whole number that is not a square
number will also be irrational. Values such as $\sqrt{5}$, $\sqrt{7}$, $\sqrt{103}$, etc., are all
irrational.
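
The square-root criterion is easy to apply by machine. As a small sketch, assuming Python's standard `math.isqrt` (the function name `sqrt_is_rational` is our own), we can test whether a whole number is a perfect square and so decide whether its square root is rational.

```python
from math import isqrt


def sqrt_is_rational(n):
    """For a whole number n >= 0, sqrt(n) is rational exactly when n is a perfect square."""
    r = isqrt(n)          # whole-number part of the square root
    return r * r == n


# Whole numbers below 20 whose square roots are irrational.
print([n for n in range(1, 20) if not sqrt_is_rational(n)])
```

The code takes the criterion on trust: it relies on the fact quoted above rather than proving it.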

Shortly after Hippasus made his arguments for incommensurability, which
would lead to the discovery of irrational quantities, an Eleatic philosopher, Zeno,
would also show the absurdity of a world in which there were fundamental
smallest units of space and time. Recall that the Eleatics held beliefs
diametrically opposed to those of the Pythagoreans: to them, multiplicity, the
idea that the universe is composed of fundamental parts, was ridiculous. They
believed in continuous magnitudes in which any perceived boundaries were
illusions. This idea is somewhat similar to the concept that “all is one.”
Similarly to Hippasus’ argument for incommensurable magnitudes, Zeno would
show that treating a line as a multitude of individual points was philosophically
contradictory. These ideas would force thinkers to confront notions of actual
infinity (an infinity contained in a limited space), which would prove to be both a
powerful concept and a troublesome idea in mathematics.

SECTION 3.4
Zeno’s paradoxes

• Zeno vs. the Multitude
• You Can’t Catch Up. You Can’t Move. You Can’t Even Start.
• Limits

Zeno vs. the Multitude


• The first of Zeno’s arguments shows that considering a line segment to be a
collection of points is contradictory.

Many early Greeks, particularly the followers of Pythagoras, were fascinated by
the idea that whole numbers and their properties provided the first principles
upon which all else could be built. Numbers, to the Pythagoreans, were discrete
building blocks, like atoms. One of their basic assumptions was that there was
always some indivisible unit that could be used to compare any two quantities in
nature.

The Eleatic philosopher Zeno proposed a series of philosophical challenges to
the notions of multiplicity and motion that demolished the idea of fundamental
units of both space and time. That these arguments are paradoxical is due, in
large part, to the role of infinity.

Let’s first look at Zeno’s arguments against multiplicity. In the Pythagorean
view, all things in nature could be measured as multiples of a standard unit. For
instance, they viewed a line as a collection of discrete points. An infinite line
would be an infinite collection of points, but only in the sense of potential infinity,
because it was impossible for any person to create a real infinity. A bounded
line, a line segment, would, therefore, be constructed of a finite number of
points. Zeno, representing the views of the Eleatic school, argued against
this view by pointing out that a line segment of any given length can always be
bisected, or cut in half. Such a division creates two line segments, each of which
can be bisected again, and again, ad infinitum.

[Figure: a line segment bisected repeatedly, with pieces labeled 1/2, 1/4, 1/8, 1/16,
1/32, and 1/64.]

To a Pythagorean, it was perfectly acceptable to think of an indivisible unit, an
“atom,” with which magnitude could be “built.” Hippasus’ argument against
commensurability complicated this view somewhat. Hippasus showed that
there could be no fundamental common unit between the side and diagonal
of a square. As we saw in Theodorus’ proof using triangles, this idea of
incommensurability implies that a magnitude can be divided as many times as
one wishes. Accepting the notion that a magnitude, such as a line segment,
can be infinitely divided, or bisected, leads ultimately to the conclusion that any
fundamental, atom-like unit must have zero length. This creates a paradox: how
can one construct a line segment out of pieces that have no length? One can add
zero to zero as many times as one likes and the result will always be zero.

The Eleatic view that a line segment can be infinitely bisected requires that the
segment be a continuum with no firm boundaries between one location and the
next. The Pythagorean view is based on the concept of discrete parts. Hippasus
and Theodorus argued against the Pythagorean view, but Zeno presented a
series of four situations that undercut both views. Zeno’s paradoxes, although
primarily constructed to refute the idea that motion is real, simultaneously
manage to argue for and against continuous space (and time), invoking infinity
and the absurdities that so often accompany it.

You Can’t Get There from Here


• Zeno’s paradoxes of motion are perhaps his most famous and extend his
arguments to consider the absurdities of both discrete and continuous space
and time.

Zeno’s most famous arguments have to do with both time and space. He showed
that viewing space as a multitude of points and time as a multitude of discrete
“moments” forces us to believe that motion is an illusion. Common sense
argues against this view, but common sense is informed by our senses, which
could, in the view of a philosopher in love with rational deduction, be deceiving
us. Zeno presented four arguments against motion: The Dichotomy, The Arrow,
Achilles and the Tortoise, and the Stade. Let’s look at two of these, the first an
argument against continuous space, the second an argument against discrete
space and time.

The Dichotomy: Space Cannot Be Continuous

[Figure: the path from A to B, with marks at 1/64, 1/32, 1/16, 1/8, 1/4, and 1/2 of the
distance.]

The Dichotomy is very similar to the bisecting line argument we saw in the
prior section. In Zeno’s example, a horse is trying to traverse the distance from
point A to point B. Before it can reach point B, it obviously must first cover half
the distance. Before it can cover half the distance, it just as surely must cover
a quarter of the distance, and so on. If space is composed of a multitude of
points, the horse must cover an infinite number of these points in a finite time, which is
contradictory. Hence, by this line of reasoning, the horse can never make it from
point A to point B.

The Stade: Space and Time Cannot Be Discrete

[Figure: three trains, A, B, and C, each drawn as a sequence of four boxes (cars).]


This last of Zeno’s arguments is more easily understood in a modern example.
Suppose that there are three trains, each composed of cars of equal size. Train
A is at rest; train B is moving to the left relative to train A; and train C is moving
to the right relative to both of the other trains and is traveling at the same speed
as train B.

Let’s say it takes a time, T, for one car of train B to pass completely by one car of
train A.

[Figure: relative to train A, trains B and C each move 1 unit of displacement, u. T is
how long it takes B to move this distance; T is also how long it takes C to move this
distance.]

Because train C is moving at the same absolute speed as train B, it also takes
time T for one car of train C to pass one car of train A.

[Figure: two panels, one showing trains A, B, and C and the other showing trains B, C,
and D. The captions read “2 units of displacement in time T” and “2 units of
displacement in time T/2.”]

How far do trains B and C move relative to each other in the given time, T?
Because train B moves one car to the left and train C moves one car to the right,
they move two whole cars relative to each other. On the basis of this reasoning,
we would be perfectly justified in defining a new smallest unit of time, $\frac{T}{2}$, as the
time it takes for train C to move one car relative to train B. This effectively treats
train B as being at rest, and we could imagine a new train, train D, and repeat
the argument ad infinitum.

The point here is that it is contradictory to imagine time as a series of discrete
moments, because those moments can be infinitely subdivided.

Limits
• There are multiple ways to resolve Zeno’s paradoxes, although many have
shortcomings.
• The standard mathematical resolution uses the idea that a sum of infinitely
many decreasing quantities can be finite.
• The idea of a limiting value to an infinite process is at the heart of calculus.

Philosophers and scientists throughout the centuries attempted to resolve
Zeno’s paradoxes by a variety of arguments. Some denied that space and time
exist in any meaningful sense. Some asserted that space and time are not, in
fact, infinitely divisible, and moved on. Others used the paradoxes as evidence
that our ability to reason is itself contradictory. Still others regarded the
distinction between the many and the one to be false, a concept reminiscent of
the Eleatic world view that helped spawn the paradoxes in the first place.

Whatever the putative resolutions, it would be a stretch to call any of them
mathematical. Mathematicians after Zeno had to accept the existence of
actual infinity, even though it does not make intuitive sense. For example, to
resolve the paradox of The Dichotomy, we can look to the convergence of a
geometric series, $1 + x + x^2 + x^3 + \dots$. It is not hard to show that a general
geometric series converges to $\frac{1}{1-x}$ as long as |x| is less than 1. To do this
requires that we examine the behavior of the partial sums as the number of terms
approaches infinity. Note that a general geometric series begins with 1, so if
$x = \frac{1}{2}$, the sum of the series is then 2.

$\frac{1}{1 - \frac{1}{2}} = \frac{1}{\frac{1}{2}} = 2$

The Dichotomy paradox essentially presents an infinite sum of terms of
decreasing size, $\frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \dots$, which we can recognize to be
$\left(\frac{1}{2}\right)^1 + \left(\frac{1}{2}\right)^2 + \left(\frac{1}{2}\right)^3 + \dots$.
However, unlike a general geometric series, the series implied by The
Dichotomy does not start with 1. Consequently, the sum of The Dichotomy
series is actually $\frac{1}{1-x} - 1$, which, with $x = \frac{1}{2}$, equals 1. In other words, the horse
makes it from point A to point B.
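
We can watch this convergence happen by adding up finitely many terms exactly. The sketch below assumes Python's standard `fractions` module, and the function name `dichotomy_partial_sum` is our own. It computes the partial sums of the Dichotomy series (the geometric series without its leading 1) and reports how far each falls short of 1.

```python
from fractions import Fraction


def dichotomy_partial_sum(terms, x=Fraction(1, 2)):
    """Exact value of x + x**2 + ... + x**terms (the series with no leading 1)."""
    return sum(x**k for k in range(1, terms + 1))


for terms in (1, 2, 3, 10, 20):
    s = dichotomy_partial_sum(terms)
    # The gap to 1 is exactly (1/2)**terms, so it halves with each added term.
    print(terms, s, 1 - s)
```

No finite partial sum ever equals 1, yet the shortfall can be made smaller than any positive amount we name, and that is precisely what it means to say the infinite sum is 1.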

So, infinity became a tool that could be used, as long as one didn’t look too
closely at exactly how it worked. Mathematicians came to accept that one could
indeed have a finite limit to an infinite sum. This concept made it possible
to arrive at a finite magnitude by summing an infinite number of infinitely
small pieces. Such pieces, which became known as infinitesimals, have, in
some disturbingly vague sense, arbitrarily small but non-zero magnitudes.
The great Newton, one of the fathers of the calculus, the revolutionary new
theory of the 1600s that described motion both in the heavens and on Earth, at
first based his ideas on these troublesome infinitesimals. It wasn’t until the
1800s that Augustin Cauchy turned matters around and developed a sound
base for the subject by speaking of limits. This had a profound effect on the
concept of “number,” for Cauchy also found a consistent way to give meaning
to irrational quantities, essentially defining them as limits of sequences
of rational quantities. For example, let’s return to the mysterious quantity
0.101001000100001000…. Cauchy would define this as the limit of the sequence
of rationals, 0.1, 0.101, 0.101001, 0.1010010001, …. This shift of perspective
represented a marrying, of sorts, of the potential and the actual infinite, and it
brought some logic to the concepts of the infinity of irrationals and the infinite
sums that arise in calculus.
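
The sequence of rationals mentioned here can be generated exactly. The following sketch assumes Python's standard `fractions` module; the function name `truncations` is hypothetical, and the observation that the k-th 1 falls in decimal place k(k + 1)/2 is our own reading of the digit pattern rather than something stated in the text.

```python
from fractions import Fraction


def truncations(count):
    """First `count` rational truncations of 0.101001000100001..., each ending in a 1.

    The k-th 1 sits in decimal place k*(k+1)/2 (places 1, 3, 6, 10, ...).
    """
    total = Fraction(0)
    result = []
    for k in range(1, count + 1):
        place = k * (k + 1) // 2
        total += Fraction(1, 10**place)
        result.append(total)
    return result


terms = truncations(5)
for earlier, later in zip(terms, terms[1:]):
    # Successive terms differ by a single 1 ever farther to the right.
    print(float(later), later - earlier)
```

Each term is an ordinary ratio of whole numbers, and the gaps between successive terms shrink rapidly; it is this tightening that lets the sequence as a whole pin down a single quantity.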

As calculus began to assume a larger and larger role in both math and science,
the need to understand infinity became greater. This quest for understanding
ultimately required a shift in thinking, away from looking at whole numbers and
magnitudes, toward thinking about sets. In the next section, we will see some of
the fundamental ideas in this new way of thinking.

SECTION 3.5
RE-learning to Count

• What Is a Number Anyway?
• Counting to Infinity

What Is a Number Anyway?


• To understand infinity, we need a new way to think about what a number is.

After the dual assault by Hippasus and Zeno, mathematicians were forced to
accept a world that admits both the discrete and the continuous, the rational
and the irrational. We consider rational numbers to be discrete quantities of
fundamental units, such as “three”-“sixths” (understood as the quantity three
of the fundamental unit one-sixth) or “twenty-five”-“hundredths.” Irrational
numbers are trickier, requiring an infinite number of non-repeating digits to be
expressed in decimal form. The fact that both of these types of things count as
“numbers” can be somewhat puzzling. Nevertheless, Cauchy and some of his
contemporaries had shown that irrationals, as represented by non-repeating,
non-terminating decimals, were essential building blocks for calculus and
associated areas of mathematics. Moreover, these innovative mathematicians
extended the traditional rules of arithmetic to these number newcomers in
such a seamless way that it became clear that the irrationals deserved to be
considered numbers every bit as much as their rational predecessors.

At this stage of the development of mathematical thought, the idea of number
had been extended from the counting numbers (the naturals) to the rationals
(by way of ratios of counting numbers) and on to irrationals (by way of infinite
sequences of rationals). All of these numbers, rational and irrational together,
formed a large set that came to be called the “real numbers.” At each step of
this categorizing process, the set of “acceptable” numbers had been enlarged—
or had it? Were these new sets really any bigger than their predecessors? Is
the set of rationals really bigger than the set of counting numbers? Is the set of
real numbers really bigger than the set of rationals? Is it possible that they are
all simply instances of the mysterious “size” called infinity?

The man who first tackled these questions was Georg Cantor. Cantor was a
German mathematician working in the second half of the 19th century and the
first two decades of the 20th century. He was a contemporary of luminaries
such as Poincaré, Kronecker, and Hilbert. The first two of these men refused
to acknowledge his great contributions to mathematics; the third was an ally of
tremendous standing. Cantor’s work was controversial and his life was one of
much struggle and little recognition. Denied employment at the more-respected
universities in Germany, he was forced to work at smaller, less-prestigious
institutions. Despite this, he helped set mathematics on firmer footing by fully
examining the implications of using actually infinite sets. His first breakthrough
was to re-define the concept of a number.

If you were asked to define the number 3, you could very well say “1, 2, 3”,
or “the number of things in the set {a, b, c}.” Both of these responses are
instances of the number three—both of them enumerate sets having three
members, but neither of them defines the concept without referring to either “3”
or “number.” In general, we should be wary of definitions that must reference
themselves. A better line of reasoning is required in order to come up with a
true definition of a particular number.

Imagine that you are a ballroom dance teacher. As you begin a lesson, you want
to make sure that you have the same number of girls and boys, so that each will
have a dance partner of the opposite sex. You could take the time to count the
boys, and then count the girls, and then compare the two numbers. A faster
way would be simply to pair them off, one boy with one girl, until everyone has
a partner. If there is no one left over, you have demonstrated that there are the
same number of boys as girls. In mathematician’s terms, you have shown a
one-to-one correspondence between the set of boys and the set of girls in your
dance class.

Going back to our troublesome definition, the most that we can say about the
number three is that it is the property shared by the sets {1, 2, 3} and {a, b, c},
and all other sets that can be put into one-to-one correspondence with these
sets. Hence, any set that can be put into one-to-one correspondence with these
sets also shares the property of “three-ness”. This is what we really mean when
we say “three.” Three is the common property of the group of sets containing
three members. This idea is called “cardinality,” which is a synonym for “size.”
The set {a,b,c} is a representative set of the cardinal number 3.

This all sounds like a bunch of semantics, but it is necessary to think of
numbers in this way to gain a firm hold on the concept of infinity. We can use
the technique of setting up one-to-one correspondence to compare the sizes
of different sets without having to “count” all the members of the sets. This is
indeed handy when it comes to infinity.

Counting to Infinity

• The rational numbers can be put into one-to-one correspondence with the
counting (natural) numbers.
• The irrational numbers cannot be put into one-to-one correspondence with
the natural numbers.
• A “countable” infinite set is one that can be put into one-to-one
correspondence with the set of natural numbers; an “uncountable” infinite
set is one that cannot.

To get a sense of the tools we’ll need to answer tough questions about infinity,
we can start with a relatively straightforward example. Note that only 4 of the
first 16 whole numbers are squares:

1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, … (the squares are 1, 4, 9, and 16)

It would be tempting to use this as evidence that there isn’t one-to-one
correspondence between the sets, as there seem to be more whole numbers
than square numbers. However, Galileo, who lived in the 16th and 17th centuries
and is famous for his work in astronomy and physics, demonstrated that there
are, in fact, the same number of whole numbers and square numbers. To do so,
he pointed out that every whole number can be made into a square number, after
which it is possible to line up the two sets of numbers like so:

Whole Numbers:    1   2   3   4    5    6
Square Numbers:   1   4   9   16   25   36

Galileo’s simple exercise made it clear that the set of whole numbers can be put
into one-to-one correspondence with the set of square numbers. According to
our new definition of number, this means that there must be the same number
of each. What Galileo did was essentially to put the square numbers in a list and
use the natural numbers to count them. Does this strategy also work for other
types of numbers?
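
Galileo's pairing can be sketched in a few lines of code. This is only an illustration, not part of Galileo's argument: the function names `galileo_pairs` and `whole_number_for` are made up for this example, and a program can only ever display a finite stretch of the infinite list.

```python
from itertools import count, islice
from math import isqrt


def galileo_pairs():
    """Yield (n, n * n): each whole number paired with its square."""
    for n in count(1):
        yield n, n * n


def whole_number_for(square):
    """Go the other way: recover the whole number paired with a square."""
    root = isqrt(square)
    if root * root != square:
        raise ValueError(f"{square} is not a square number")
    return root


pairs = list(islice(galileo_pairs(), 6))
print(pairs)  # [(1, 1), (2, 4), (3, 9), (4, 16), (5, 25), (6, 36)]
```

Because every whole number leads to exactly one square and every square leads back to exactly one whole number, nothing on either side is left without a partner.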

Let’s consider the rational numbers for a moment. Given any fraction, we can
always find a smaller one by taking half of it. So, if we want to list all of the
rational numbers in sequence from least to greatest, which one should be first?
We could say that 1/1000 is pretty small and could potentially be first, but 1/2000 is
smaller. Should it be first? This line of thinking obviously will not get us very
far, as we can easily generate smaller and smaller fractions. Perhaps, because
rational numbers are all expressible in terms of two quantities, a “counter” (the
numerator) and a “namer” (the denominator), a single, linear list is not sufficient
for the task at hand. It might be useful to organize fractions not in a list but,
rather, in a two-dimensional array.

To accomplish this, imagine putting all of the fractions that have a 1 as their
denominator in the first column of a table. Then let’s put all the fractions that
have a denominator of 2 in the second column, all those with a denominator of 3
in the third column, and so on. We will generate a table like this:

1/1   1/2   1/3   1/4   ...
2/1   2/2   2/3   2/4   ...
3/1   3/2   3/3   3/4   ...
4/1   4/2   4/3   4/4   ...
...   ...   ...   ...   ...

We could list every positive rational number if we were to continue this grid.
Does this mean that the rational numbers cannot be put into a list, as is possible
with the square numbers? Cantor, remarkably, showed that it is indeed possible
to put rationals into a list format. This concept is now known as Cantor’s “first
diagonal” argument.

To compose such a list, we can trace out a weaving path through the table
above, skipping over fractions that really are the same as ones we’ve already
encountered (such as 2/2, 2/4, 4/4, 1,000/3,000, etc.).

1/1   1/2   1/3   1/4   ...
2/1   2/2   2/3   2/4   ...
3/1   3/2   3/3   3/4   ...
4/1   4/2   4/3   4/4   ...
5/1   5/2   5/3   5/4   ...
...   ...   ...   ...   ...

This strategy creates a list that looks like this (2/2 is skipped because it
equals 1/1):

1/1, 2/1, 1/2, 1/3, 3/1, 4/1, 3/2, 2/3, 1/4, ...

It should be clear that writing the rationals in this fashion will account for every
possible rational for as long as we care to continue.
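
The weaving path can be sketched as a short program. This is one possible reading of the path, assuming it sweeps the diagonals of the table back and forth; the function name `weave_rationals` and the use of a greatest-common-divisor test to skip repeated values are choices made for this illustration.

```python
from itertools import islice
from math import gcd


def weave_rationals():
    """Yield (numerator, denominator) for each positive rational exactly once.

    Every fraction on one diagonal of the table has the same total
    numerator + denominator. The walk reverses direction on each new diagonal,
    and a fraction is skipped when it is not in lowest terms, because its
    value has already appeared earlier on the path.
    """
    total = 2
    while True:
        numerators = range(1, total)
        if total % 2 == 1:
            numerators = reversed(numerators)
        for n in numerators:
            d = total - n
            if gcd(n, d) == 1:
                yield n, d
        total += 1


first_nine = [f"{n}/{d}" for n, d in islice(weave_rationals(), 9)]
print(", ".join(first_nine))
# 1/1, 2/1, 1/2, 1/3, 3/1, 4/1, 3/2, 2/3, 1/4
```

Any particular fraction n/d in lowest terms sits on the diagonal numbered n + d, so the walk reaches it after finitely many steps.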

What’s the advantage of having this list? It’s easy to start counting the
rationals on it. We can assign the number 1 to 1/1, the number 2 to 2/1, the
number 3 to 1/2, and so forth for as long as we like. We see that there is a
one-to-one correspondence between the natural numbers and the rational numbers.
As with the boys and girls in the dance class example, the one-to-one
correspondence indicates that the two sets are the same size.

So, the answer to the question, “How many rationals are there?” is “an infinite
number,” but it is a “countable” infinity. That is, it would be possible, in theory,
to list all the rationals and to number them using the natural numbers. Any set
that can be put into one-to-one correspondence with the set of natural numbers
is considered to be countably infinite. Note that we have not mentioned negative
rational numbers yet. However, through the same strategy they too can be put
into a list that can be shown to have one-to-one correspondence with the set of
natural numbers.

It might seem that we can put anything into a list that can then be matched up
with the set of natural numbers. How about the set of all real numbers?

Again, we can look for a one-to-one correspondence with the natural numbers.
Suppose we could list all of the real numbers, rational and irrational, between
0 and 1. Such a list, expressing both rationals and irrationals in decimal form,
might look like this:

0.36264934…
0.11192737…
0.33333333…
0.66736270…
0.98800034…

Even though we are limiting ourselves to looking only between 0 and 1, Zeno
made it clear that this list would be infinite. Furthermore, because we have put
the numbers in a list, there should be a “first”, “second”, etc., on up to the “nth”
number. So, it would appear that this list, as we have imagined it, is in one-to-
one correspondence with the natural numbers and is, therefore, countable.

However, at this point, Cantor had a second epiphany. He asked, “What if we
create a new decimal by the following method?”: In the columnar list of real
numbers in decimal form, consider an array of diagonal digits, that is, the first
digit of the first decimal, the second digit of the second decimal, the third digit of
the third decimal, and so on.

0.[3]6264934...
0.1[1]192737...
0.66[7]36270...
0.988[0]0034...
...
(diagonal digits shown in brackets)

In each case, if the digit is anything other than 1, put a 1 in the corresponding
place of the new decimal number. If the digit on the diagonal is 1, put a 2 in the
corresponding decimal place of the new number. The new decimal number
formed in this way differs from every number on the list in at least one digit:
it differs from the nth number in the nth decimal place.

0.[3]6264934...   new digit: 1
0.1[1]192737...   new digit: 2
0.66[7]36270...   new digit: 1
0.988[0]0034...   new digit: 1
...
New decimal not on the list: 0.1211...

How can that be? The list we imagined was supposed to be complete, but we can
clearly create a number that was not on that list! This reasoning is similar to the
reasoning we employed with Euclid’s proof of the infinitude of primes in the unit
on prime numbers; namely, we start with a list that is assumed to be complete
and then show that it isn’t actually complete.

We now call this line of reasoning Cantor’s “second diagonal” argument; it
involves constructing a new number by following a diagonal path through the
digits of a “complete” set.
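
The digit-swapping rule can be tried out on the five decimals listed earlier. The sketch below is only a finite illustration: the real argument works on an infinite list, whereas this code sees just the digits it is given, and the function name `diagonal_number` is invented for the example.

```python
def diagonal_number(listed):
    """Build a decimal that differs from the n-th listed decimal in its n-th digit.

    Each entry is a string such as "0.36264934". The rule is the one described
    in the text: a diagonal digit of 1 becomes 2, and any other digit becomes 1.
    """
    new_digits = []
    for n, decimal in enumerate(listed):
        diagonal_digit = decimal[2 + n]  # skip the leading "0."
        new_digits.append("2" if diagonal_digit == "1" else "1")
    return "0." + "".join(new_digits)


listed = [
    "0.36264934",
    "0.11192737",
    "0.33333333",
    "0.66736270",
    "0.98800034",
]

new_decimal = diagonal_number(listed)
print(new_decimal)  # 0.12111
```

However the list is rearranged or extended, running the same rule again produces yet another decimal that the list misses.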

Following this process proves that the original list was incomplete, because
we were able to construct a number not in the list. This new number cannot
be paired up with any of the natural numbers, because each of them is already
paired up with a number from the original list. One could argue that we should
simply make room for the new number on our original list and re-assign the
pairings, but this could be done ad infinitum, and at every step we could still
create a new number not in the list.

The only alternative explanation is that there must in some sense be more real
numbers than natural numbers! In other words, the reals are not countable.
Such a set is considered to be uncountably infinite.

We now have two distinct types of infinity, countable and uncountable. If we
consider something that we cannot count to be larger than something that we
can count (which seems logical), then it makes sense to say that the uncountable
type of infinity is larger than the countable type.

Cantor called the cardinality of all the sets that can be put into one-to-one
correspondence with the counting numbers ℵ0, or “Aleph Null.” The cardinality
of the set of real numbers, which cannot be put into one-to-one correspondence
with the counting numbers, is referred to as c. The designations
ℵ0 and c are known as “transfinite” cardinalities. The cardinality c is also known
as the “cardinality of the continuum,” denoting that these sets are best thought
of as a continuous, unbroken line, as opposed to a discretely enumerated line.
Such a line is akin to the Greek idea of a magnitude: infinitely divisible, with no
discrete points. Both cardinalities are complete, actual infinities, rather than
potential infinities, but they are not equal in size to one another.

The idea that there are different types of infinity might seem strange, but
Cantor pushed his exploration even further into the realm of novel ideas. To
understand more completely what Cantor contributed, we should ask at least
two more questions.

First, we said that uncountable infinities, such as those with a cardinality of c,
are “bigger” than countable infinities, such as those with a cardinality of ℵ0, but
we didn’t prove it. Which is bigger, ℵ0 or c? What does it mean for one number,
finite or not, to be larger than another?

Second, we must wonder whether there are any more types of infinity. That
is, are there any more transfinite cardinalities? If not, why are there only two
types?

These two questions are actually related. In the next section, we will see how
their resolution leads to even more bizarre conclusions about the nature of the
infinite.

SECTION 3.6
Cantor’s Theorem
• Which Is Bigger?
• Beyond Infinity

Which Is Bigger?
• Set A is larger than set B if the elements of B can be put into one-to-one
correspondence with a subset of the elements of A, but cannot be put into
one-to-one correspondence with all of the elements of A.

What, in fact, is meant by the statement that one number is bigger than another
number? In terms of cardinalities, this question basically means, “When is one
set bigger than another?” Looking at sets of cardinality 5 and 3, for example, we
see that:

[Figure: SET A and SET B, one of cardinality 5 and one of cardinality 3.]

For simplicity’s sake, “3” means “a set of cardinality 3,” and “5” means “a set of
cardinality 5.” So, “3” can be put into one-to-one correspondence with a subset
of “5.” Also, “3” cannot be put into one-to-one correspondence with all of “5.”
In other words, there is a one-to-one correspondence between “3” and part of
“5,” but not between “3” and all of “5.” Intuitively, we can see that “5” must be
bigger. This idea can be generalized. A set N is bigger than another set M, if:

There is a one-to-one correspondence between M and a subset of N.

AND

There is not a one-to-one correspondence between M and all of N.
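
The two conditions can also be written compactly in symbols. The notation here is an assumption added for illustration and does not appear in the text: |M| stands for the cardinality of M, f is a one-to-one map from M into N (which pairs M with a subset of N), and g would be a pairing of M with all of N.

```latex
\[
|M| < |N|
\iff
\bigl(\exists\, f : M \to N \ \text{one-to-one}\bigr)
\ \wedge\
\neg \bigl(\exists\, g : M \to N \ \text{one-to-one and onto}\bigr)
\]
```

For the sets “3” and “5,” the first condition holds because three elements can be paired with three of the five, and the second holds because any such pairing leaves two elements of “5” without partners.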

We can use these conditions to consider the relationship between ℵ0 and c. Let’s
choose a set of size ℵ0, such as the natural numbers. The real numbers will be
our set of size c.

Checking for compliance with the first condition given above, we ask if there is
a one-to-one correspondence between the natural numbers and a subset of the
reals. Well, the set of real numbers includes the set of natural numbers, so the
answer is “yes.”

To verify that the second condition is also met, that there is not a one-to-one
correspondence between the natural numbers and the reals, we can simply
appeal to the diagonal argument we made in the last section. Recall that
the natural numbers were not numerous enough to be put into one-to-one
correspondence with the reals between 0 and 1, so they certainly cannot be
matched with all of the reals. So, we have established firmly now that c is
indeed larger than ℵ0.

This exercise answers the first question posed at the end of the last section, and
in doing so, it gives us a way to determine whether a certain set is bigger than
another set. Let’s now turn to the second question: could there be a transfinite
cardinality larger than c? In other words, is there an infinity bigger than that
exhibited by the set of real numbers?

Beyond Infinity
• Given a set of any non-zero size, it is possible to create a larger set by taking
the set of subsets of the original.

Suppose we have a set N, consisting of {A,B,C,D}. We can then identify set S
as the set of all subsets of N. We can conduct a little mental experiment by
asking a friend to write the members of N in a row and then the members of S in
another row below the first, like so:

N: A B C D

S: {} {A} {B} {C} {D} {AB} {AC} {AD} {BC} {BD} {CD} {ABC} {ABD} {ACD} {BCD} {ABCD}

Next, we ask the friend to circle four elements of S and to match them up with
the elements of N.

[Figure: the rows N and S as above, with four elements of S circled and matched with A, B, C, and D.]

We can find an element of S that is not matched up by asking our friend whether
each element of set N is matched with a subset that contains it. For example, say
that our friend tells us “yes” for A, “no” for B, “no” for C, and “yes” for D. With
this information, we can be sure that the subset {BC} has not been matched
with any member of N: if A or D were matched with {BC}, the answer for that
element would have been “no,” and if B or C were matched with {BC}, the answer
would have been “yes.” Because every element of N has been matched with an
element of S, and there is at least one “leftover” element of S, we can say that
S is definitely larger than N. Note that we did not have to count the members of
either set to figure this out.
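
The “yes or no” game can be played by a short program. The matching below is hypothetical, chosen only so that the answers come out as in the example (“yes” for A, “no” for B, “no” for C, “yes” for D); the function names are likewise invented for this sketch.

```python
from itertools import combinations, product

N = ["A", "B", "C", "D"]


def all_subsets(elements):
    """Return every subset of `elements` as a frozenset (the set S in the text)."""
    return [
        frozenset(combo)
        for size in range(len(elements) + 1)
        for combo in combinations(elements, size)
    ]


def unmatched_subset(matching):
    """Collect the elements whose matched subset does not contain them."""
    return frozenset(x for x, subset in matching.items() if x not in subset)


def every_matching_misses_a_subset(elements):
    """Brute-force check: try every way of assigning subsets to elements."""
    subsets = all_subsets(elements)
    for choice in product(subsets, repeat=len(elements)):
        matching = dict(zip(elements, choice))
        if unmatched_subset(matching) in choice:
            return False
    return True


# Hypothetical matching: A and D are matched with subsets that contain them,
# B and C are not.
matching = {
    "A": frozenset("AB"),
    "B": frozenset("C"),
    "C": frozenset(),
    "D": frozenset("AD"),
}

leftover = unmatched_subset(matching)
print(sorted(leftover))               # ['B', 'C']
print(leftover in matching.values())  # False
print(every_matching_misses_a_subset(["A", "B", "C"]))  # True
```

The last line tries all 512 ways of assigning subsets to a three-element set and confirms that the leftover subset is never among the matched ones, which is the same conclusion the text reaches without any counting.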

This strategy works for a finite set, but does it also work for an infinite set? The
fact that we did not have to count anything to prove that S is bigger than N is a
good sign. Let’s see if anything changes if we let N be an infinite set and S be
the set of subsets of N.

N={A,B,C,D, ...}
S={ {}, {A}, {AB}, {ABC}, ...}

Let’s again attempt to match up every member of N with a member of S. We
can ask the “yes or no” questions from before to find out which members of N
are paired up with subsets that contain them. Let’s say that any members of
N for which the answer is “no” go into a new set called W. The set W is then
the subset of N containing members that are not paired up with a subset that
contains them.

Because W is a subset of N, it must be in S, which is the set of all subsets of N.
Can W be matched up with some element of N? Let’s say that w, a member of N,
is matched up with W.

[Figure: the sets N and S, with the element w of N matched to the subset W in S.]

If w is in W, then it is matched up with a subset that contains it, which
by definition means that it cannot be in W; therefore, we have a logical
contradiction.

[Figure: the sets N and S again, with the element w of N matched to the subset W in S.]

If, on the other hand, w is not in W, then it is not matched up with a subset that
contains it, which is the requirement for being a member of W. However, we
assumed that w is not in W, so again we have a logical contradiction.

Our only viable conclusion is that no member w of N can be matched up with
W. This means that there are members of S that cannot be matched with any
members of N; consequently, S must be bigger than N. We have now seen that
in both the finite and infinite cases, the set of subsets is always larger than the
original set.
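
The whole argument fits in one line of symbols. The letter f is an assumption introduced here for illustration: it stands for any attempted matching that sends each member x of N to a subset f(x) in S.

```latex
\[
W = \{\, x \in N : x \notin f(x) \,\},
\qquad
f(w) = W \;\Longrightarrow\; \bigl( w \in W \iff w \notin f(w) = W \bigr)
\]
```

The statement in parentheses says that w is in W exactly when it is not in W, which is impossible, so no member w of N can satisfy f(w) = W.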

To return to the cardinalities of infinity, we can create a set larger than one of
size ℵ0 or c by taking the set of its subsets. We can clearly keep doing this as long as we
please, so we have to conclude that there truly is an infinite number of different-
sized infinities! This mind-boggling thought is one of the pure beauties of
mathematics. It would be as if, having climbed the highest mountain in the
world, one could see that there were peaks of heights previously unimaginable.
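
A computer cannot build infinite sets, but the same climb can be watched in miniature with finite ones. The sketch below starts from a one-element set and repeatedly takes the set of subsets; the function name `power_set` and the starting set are choices made for this illustration.

```python
from itertools import combinations


def power_set(s):
    """Return the set of all subsets of `s`, each subset as a frozenset."""
    items = list(s)
    return frozenset(
        frozenset(combo)
        for size in range(len(items) + 1)
        for combo in combinations(items, size)
    )


current = frozenset({"A"})
sizes = [len(current)]
for _ in range(3):
    current = power_set(current)
    sizes.append(len(current))

print(sizes)  # [1, 2, 4, 16]
```

Each step produces a strictly larger set, and one more step would jump from 16 members to 65,536. Cantor's argument shows that the same strict growth continues when the starting set is infinite.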

With his observations, Cantor spelled out the logical consequences of believing
in actual infinity. Such an idea is “beyond human,” which is partly why Cantor’s
ideas received so much criticism. In mathematics, however, there are many
occasions to believe in concepts that are “beyond human,” such as infinitely long
lines in geometry (can anyone check?) or our modern acceptance of infinitely
long decimals as numbers. Cantor showed that one can make logical sense
out of infinity by thinking in terms of the sizes, or cardinalities, of sets. In
doing this, he shored up the foundations of all the mathematical concepts that
relied upon infinity. While there is still debate about some of the philosophical
underpinnings of infinity in mathematics, most mathematicians do not have to
concern themselves with it. This is due, in large part, to the work of Cantor.
David Hilbert, one of the truly great mathematicians of Cantor’s time, recognized
the magnitude of his contribution in the immortal words:
“No one shall expel us from the paradise which Cantor has created.”
UNIT 3 at a glance
textbook

SECTION 3.2

Rational Numbers
• Rational numbers arise from the attempt to measure all quantities with a
common unit of measure.
• Rational numbers can be expressed as decimals that repeat to infinity.
• In the mathematics of early Greece, there was a strong distinction between
discrete and continuous measurement.
• Number refers to a discrete collection of atom-like units.
• Magnitude refers to something that is continuous and that can be infinitely
subdivided.

SECTION 3.3

Incommensurability and Irrationality
• The side and the diagonal of a square are incommensurable.
• Incommensurable quantities are not rationally related, because assuming that
they are leads logically to an infinite regress.

SECTION 3.4

Zeno’s Paradoxes
• The first of Zeno’s arguments shows that considering a line segment to be a
collection of points is contradictory.
• Zeno’s paradoxes of motion are perhaps his most famous and extend his
arguments to consider the absurdities of both discrete and continuous space
and time.
• There are multiple ways to resolve Zeno’s paradoxes, although many have
shortcomings.
• The standard mathematical resolution uses the idea that a sum of infinitely
many decreasing quantities can be finite.
• The idea of a limiting value to an infinite process is at the heart of calculus.

SECTION 3.5

Re-Learning to Count
• To understand infinity, we need a new way to think about what a number is.
• The rational numbers can be put into one-to-one correspondence with the
counting (natural) numbers.
• The irrational numbers cannot be put into one-to-one correspondence with
the natural numbers.
• A “countable” infinite set is one that can be put into one-to-one
correspondence with the set of natural numbers; an “uncountable” infinite
set is one that cannot.

SECTION 3.6

Cantor’s Theorem
• Set A is larger than set B if the elements of B can be put into one-to-one
correspondence with a subset of the elements of A, but cannot be put into
one-to-one correspondence with all of the elements of A.
• Given a set of any non-zero size, it is possible to create a larger set by taking
the set of subsets of the original.

UNIT 04
Topology’s Twists and Turns
TEXTBOOK

UNIT OBJECTIVES

• Topology is the study of fundamental shape.

• Objects are topologically equivalent if they can be continuously deformed into one
another. Properties that are preserved during this process are called topological
invariants.

• Intrinsic topology is the study of a surface or manifold from the perspective of being
on or in it.

• Extrinsic topology is concerned with properties of a surface or manifold seen from
an external viewpoint. This requires some kind of embedding.

• The Euler characteristic is a topological invariant.

• Orientability is a topological invariant.

• A configuration space is a topological object that can be used to study the allowable
states of a given system.

• The question of the shape of our universe is a question of intrinsic topology.


Abstractness, sometimes hurled as a reproach
at mathematics, is its chief glory and its
surest title to practical usefulness. It is also
the source of such beauty as may spring from
mathematics.

Eric Temple Bell


UNIT 4 Topology’s Twists and Turns
textbook

SECTION 4.1
INTRODUCTION

Does the universe go on forever? If not, what happens when we get to the edge?
What are the possible shapes that our space can take? What makes these
possible shapes different from each other? These questions are fundamental
to the mathematical study of topology. Topology, originally known as analysis
situs (roughly, “geometry of position”), seeks to describe what is fundamental
about shape in general.

To envision what we mean, imagine a subway map. A subway map shows the
connections between stops and which train lines transfer to others, but it
does not give any indication of the geography of the ground. Neither does it
accurately portray the distance between stops. It basically shows you only how
many stops are in between others and which connections you must make in
order to get to your destination stop. This emphasis on connections at the
expense of relatively superficial characteristics, such as distance, is the key
idea behind topology.

Pretend that you are in an unfamiliar
city and, unfortunately, you are without a
subway map. You know which stop you wish to get to, but without a map, you are
hopelessly lost as to how to get there. You ask a kind-looking stranger for help,
and she tells you to get on the blue line towards Flatsburgh, go three stops, then
transfer to the red line towards Square City, and get off at the fifth stop. You can
follow these directions and get to your desired destination without ever having to
look at a map.

In following these directions, you are experiencing the subway system firsthand;
your mental image of your journey would not necessarily be that of a map, but
rather that of your first-person perspective. This is an important perspective
known as an “intrinsic” view. In topology, this correlates to the study of a
surface or spatial shape from the perspective of someone who is in it. Looking
at a map of the subway system, on the other hand, is an example of taking an
“extrinsic” view, because you can see the system from the point of view of an
outside observer. In this unit we will look at topology from both the intrinsic and
extrinsic views.

Now that we understand how we will be looking at things, we can ask, “what are
these things that we wish to study?” In short, they are topological spaces, such
as graphs and manifolds. Our understanding of the full meaning of this term,
“manifold,” will develop over the course of the unit, but for now we can think of
manifolds as surfaces that, when viewed up close, appear to be flat. Our system
of subway tunnels could be thought of as a 1-manifold, as it is essentially a
system in which one can go only forward or backward. A 2-manifold is the
surface of something like a sphere. A 3-manifold is like our universe and can
be thought of analogously to the 2-manifold being a surface. This may not be
intuitive; one of our goals in this unit is to develop a better understanding of the
concept of 3-manifolds.

Topology, the study of position without regard to distance, is an area of
mathematics that deals with highly abstract, idealized notions of shape,
connectedness, and other properties. It is a true exercise for the mind,
and as such is best appreciated for its intellectual and aesthetic value.
Although most topology is studied for its own sake, some ideas can be
applied to problems in the real world. Configuration space, for example, is
a way to view all possible physical arrangements of a system, such as the
equipment on a factory’s manufacturing floor, as a topological space. This can
aid in high-level design processes.

[Figure caption: This shows the subway in 3-D.]

In this unit we will look at what is essential about shapes from both the intrinsic
and extrinsic views. Examining concepts such as connectedness, embedding,
and orientability, we will see how surfaces are classified and learn a bit about
the recent classification of 3-manifolds. Finally, we will see how concepts such
as the Euler characteristic apply to the manufacturing floor, and we will close
with an exploration of what our universe might be like on the largest of scales.


SECTION 4.2
What Is Essential about Shape?
• Euler’s Bridges
• Rubber Sheet Geometry

Euler’s Bridges
• Topology is generally believed to have started with Euler’s solution to the
Bridges of Königsberg problem.
• Euler saw that the essential nature of the problem had nothing to do with
distance or other geographical features, but only with connections. He
expressed this in the Euler characteristic.

To get an idea of how a topologist views the world, let’s look at a famous problem
considered by many to be the inspiration for the birth of topology. In the
mid-1700s residents of the city of Königsberg, Prussia (now called Kaliningrad,
Russia), tried to find a route that traversed each of the city’s seven bridges
exactly once.

Item 3100 / Oregon Public Broadcasting, created for Mathematics Illuminated, BRIDGES OF KÖNIGSBERG (2008).
Courtesy of Oregon Public Broadcasting.

Leonhard Euler, an influential Swiss mathematician who was working in St.
Petersburg at the time, took an interest in this problem. His solution provided
the basis not only for the study of topology, but also for graph theory, a topic that
we will take up in another unit.


Euler recognized that the distances overland and the lengths of the bridges
had no bearing whatsoever on the issue of the possible existence of a path that
traversed each bridge only once. He was able to condense, or simplify, the map
of Königsberg much in the same way that we simplify a city’s geography when
creating a subway map. His drawing looked like this:

Item 3095 / Oregon Public Broadcasting, created for Mathematics Illuminated, ABSTRACT BRIDGES OF KÖNIGSBERG (2008).
Courtesy of Oregon Public Broadcasting.

Gone were any geographical or man-made features such as the river, streets,
buildings, parks, etc. Euler reduced the entire arrangement to a diagram of
edges and nodes (points), in which the distances between points and the angles
between edges were not at all important. In fact, from a topological viewpoint,
all of the following diagrams would be equivalent to the above drawing.


GRAPHS THAT ARE HOMEOMORPHIC TO THE BRIDGES OF KÖNIGSBERG


Euler’s graph of the Königsberg bridges and the different versions shown here
have the same fundamental connections. No matter how we stretch or bend
the graph, the connections remain the same, or invariant. It is as though the
edges and nodes are made of rubber, and we are allowed to do anything we want
to them as long as we don’t cut or glue the rubber. For this reason, topology
is often known as “rubber sheet geometry.” We haven’t seen the “sheet” part
of this yet, but it is coming up very soon when we extend our discussion from
graphs to surfaces and manifolds.

Let’s take a closer look at the connections shared by the graphs above. In
all of these drawings there is one node of degree 5 (i.e., a point at which five
edges meet), and there are three nodes of degree 3. Now, as it turns out, the
degrees of the nodes of this graph determine whether or not the sought-after
path exists. In our case, which, remember, is analogous to Euler’s Königsberg
bridges problem, no path exists because there are more than two nodes with
an odd degree. We will examine this idea in more depth in another unit; what is
important to the development of topology is that the geography of the city was
simplified to this representational collection of edges and nodes.
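
The degree test described above can be checked mechanically. The sketch below is only an illustration: the node labels A through D and the edge list are assumptions chosen to match the degrees stated in the text (one node of degree 5, three of degree 3), and the function names are hypothetical. It relies on the standard criterion that a connected graph has a path using every edge exactly once only when it has zero or two nodes of odd degree.

```python
from collections import Counter

# Assumed labeling: A is the land mass of degree 5; B, C, and D have degree 3.
# Each pair stands for one bridge.
KONIGSBERG_BRIDGES = [
    ("A", "B"), ("A", "B"),
    ("A", "C"), ("A", "C"),
    ("A", "D"), ("B", "D"), ("C", "D"),
]

def node_degrees(edges):
    """Count how many edge ends meet at each node."""
    degrees = Counter()
    for u, v in edges:
        degrees[u] += 1
        degrees[v] += 1
    return degrees

def is_connected(edges):
    """Return True if every node can be reached from every other node."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    if not neighbors:
        return True
    start = next(iter(neighbors))
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in neighbors[node] - seen:
            seen.add(nxt)
            stack.append(nxt)
    return len(seen) == len(neighbors)

def has_euler_path(edges):
    """A path crossing every edge exactly once needs 0 or 2 odd-degree nodes."""
    odd_nodes = [n for n, d in node_degrees(edges).items() if d % 2 == 1]
    return is_connected(edges) and len(odd_nodes) in (0, 2)

print(dict(node_degrees(KONIGSBERG_BRIDGES)))
print(has_euler_path(KONIGSBERG_BRIDGES))
```

With this edge list all four nodes have odd degree, so the check reports that no such path exists. Removing any single bridge leaves exactly two odd nodes, and the same check then succeeds.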

Euler found another property of graphs that remains invariant under stretching
and bending. He noticed that graphs in the plane have not only nodes and edges,
but also faces. A face is basically the area defined by an associated set of edges
and nodes. Faces are topologically the same as disks.

Figure labels: EDGE, NODE, FACE


If one takes the number of vertices, subtracts the number of edges, and adds the
number of faces, including the face that surrounds the graph, the result is two.
This formula holds true for any connected graph that we can draw on a piece of
paper, or a piece of rubber, without edges crossing. No matter how much we
stretch, twist, or bend such a graph, this number will always be two. This
number is known as the Euler characteristic, or Euler number.
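
The count of vertices minus edges plus faces is simple enough to verify by hand, but a few lines of code make the pattern easy to see. In this minimal sketch the function name and the three example graphs are assumptions chosen for illustration; each face count includes the outer face that surrounds the graph, as the text requires.

```python
def euler_number(vertices, edges, faces):
    """Vertices minus edges plus faces, with the outer face included."""
    return vertices - edges + faces

# (vertices, edges, faces) for graphs drawn in the plane without crossings.
planar_examples = {
    "single triangle": (3, 3, 2),            # inside + outside
    "square with one diagonal": (4, 5, 3),   # two triangles + outside
    # Bridge graph: two two-sided faces between doubled bridges,
    # two triangles, and the outside.
    "Königsberg bridge graph": (4, 7, 5),
}

for name, (v, e, f) in planar_examples.items():
    print(name, euler_number(v, e, f))
```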

We must be careful here and note that all of the graphs we have considered so
far are flat; that is, they exist on a flat plane, which is only one possible type of
surface. A sphere is a different type of surface, as is a torus, or donut shape. As
you might imagine, graphs on such surfaces as these do not “behave” the same
as graphs on a flat plane.

Rubber Sheet Geometry


• The Euler characteristic of a graph tells you the kind of surface upon which
that graph can exist.
• Two surfaces are considered to be equivalent if one can be continuously
deformed into the other without cutting or gluing.

The above examination of basic graphs has prepared us to think about
topological surfaces. This is the “sheet” part of “rubber sheet geometry.”
In our study of topology, we will be concerned with many different types of
surfaces. What’s fascinating is that the Euler characteristic is specific to the
type of surface upon which a graph is drawn. We can use it to help us determine
what kind of a surface we have.


For example, let’s look at the surfaces of a sphere and a torus.

EIGHT TRIANGLES COVERING A SPHERE

Notice that the graph shown on the sphere, corresponding to a horizontal
“equator” and two vertical “equators,” has 6 vertices, 12 edges, and 8 faces. A
configuration such as this, in which the surface is broken up into cells that
completely cover it, is called a cell division. A cell can be thought of as a face,
because both are topologically equivalent to a disk. Plugging the known values
into Euler’s equation, we see that it does indeed yield a result of two as its Euler
number. How about for a torus?

CELL DIVISION OF THE TORUS

Notice that the cell division of a torus shows that it has one node, two edges,
and one face (if we unwrap it). This gives us an Euler characteristic of zero.
The Euler characteristic is an incredibly powerful concept, and we will see its
usefulness demonstrated at several points in our discussion. For now, all we
need to remember is that the Euler characteristic is an invariant of the surface
with which we are working. That is, we can stretch, twist, or bend a surface as
much as we want and the Euler characteristic of graphs on the surface will not
change. In other words, the Euler characteristic is considered topologically
invariant. The objects studied in topology are malleable, and their true, basic
nature can sometimes be obscured by contortions and deformities, so it is quite
helpful to have some measures, such as the Euler characteristic, that we can
use to identify what kind of things we are dealing with.
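
One way to see why this number belongs to the surface rather than to a particular drawing is to refine a cell division and watch the count. The sketch below starts from the two cell divisions given in the text (6, 12, 8 for the sphere and 1, 2, 1 for the torus) and applies two refinement steps. The function names and the choice of refinements are assumptions made for illustration, not part of the text.

```python
def euler_characteristic(cells):
    """cells is a (nodes, edges, faces) triple for a cell division."""
    nodes, edges, faces = cells
    return nodes - edges + faces

def subdivide_edge(cells):
    """Place a new node in the middle of an edge: one more node, one more edge."""
    nodes, edges, faces = cells
    return (nodes + 1, edges + 1, faces)

def split_face(cells):
    """Draw a new edge across a face between existing nodes: one more edge, one more face."""
    nodes, edges, faces = cells
    return (nodes, edges + 1, faces + 1)

sphere = (6, 12, 8)   # eight triangles covering a sphere
torus = (1, 2, 1)     # cell division of the torus

for name, cells in (("sphere", sphere), ("torus", torus)):
    refined = subdivide_edge(split_face(cells))
    print(name, euler_characteristic(cells), euler_characteristic(refined))
```

Each refinement adds the same amount to a term that is added as to a term that is subtracted, so the total cannot change, however finely the surface is divided.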

In topology, two shapes or surfaces are considered the same if we can
continuously deform one into the other. Cutting and pasting are forbidden, but
we can bend and squeeze all we want. Consider the example of the “linked chain”
you can form by interlocking the index finger and thumb of each hand.

OBJECTS TOPOLOGICALLY EQUIVALENT TO A TORUS

In our normal way of thinking, there would be no way for a person whose hands
and fingers are in this position to “unlock” or separate the “chain links”
without parting the index finger and thumb of one hand. In the world of
topology, however, it’s possible to become unlinked without “breaking” either
link if the person is sufficiently flexible!

A HUMAN TURNING INTO A GENUS 2 TORUS

Objects in topology that can be transformed into one another are called
homeomorphic.

For the remainder of this unit, we will be concerned primarily with surfaces and
their generalized cousins, manifolds. We will envision twisting and bending
these objects according to the ideas presented in this section in order to learn
what fundamental properties they have. Before we do that, however, it would
make sense to focus for a moment on what exactly we mean when we speak of
surfaces and manifolds.


SECTION 4.3
Surfaces and Manifolds
• Local vs Global
• Genus
• 3-Manifolds

Local vs Global
• A surface, a two-dimensional manifold, looks flat in a local view, but it can
have a more-interesting global structure.

In 1884, the English mathematician and writer Edwin Abbott wrote a novel in
which almost all of the characters are two-dimensional beings. They live on a
surface called Flatland because everywhere it seems to be—well, flat. When we
refer to a “surface,” we generally mean something that appears to be nice and
flat when we look at it closely, as the Flatlander does.

A LOCAL FLAT REGION


However, just because something is flat in a given region does not mean that it is
an infinite plane that extends this flatness in all directions forever.


The global structure of our object that appears to be so nice and flat on the
local level might be very complicated, having hills, valleys, holes, and strange,
reversing regions (we’ll come to those a little later).


In topology, there is an important distinction between the local and global view
of surfaces. We generally regard a two-dimensional surface as an object that
appears locally as a flat plane, regardless of its global behavior. Some common
two-dimensional surfaces are the sphere, the torus, and the double torus.


Remember, though, that we are concerned only with what is essential about
shape, so there are many surfaces with a variety of looks that are actually the
same topologically.


Genus
• Topological objects are categorized by their genus (number of holes).

What separates each topological shape from all other types is the number of
holes. No matter what is done to a shape, as long as it is topologically allowed,
the number of holes will remain constant (although, as we shall see, a hole may
not always look like a hole). Hence, the number of holes is another topological
invariant, just like the Euler number.

In fact, the Euler characteristic is related to the number of holes a surface has.
Notice that a sphere, whose Euler characteristic is two, has no holes; a torus,
whose Euler characteristic is zero, has one hole; and a double torus, whose Euler
characteristic is -2, has two holes. An examination of this pattern reveals that
for every hole, the Euler characteristic decreases by two. This implies that the
relationship is linear and follows this formula: Euler characteristic = something
– twice the number of holes. The number of holes is also known as a surface’s
genus, so we now have a rough idea of how the Euler characteristic of a surface
relates to its genus.
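
The “something” in this formula can be pinned down from the three values already given. In the worked step below, the symbols χ for the Euler characteristic, g for the genus, and c for the unknown constant are notation introduced here for illustration; the result applies to closed surfaces of the kind discussed so far (sphere, torus, double torus). The `align*` environment assumes the amsmath package.

```latex
\begin{align*}
\chi &= c - 2g
  && \text{linear relationship with unknown constant } c \\
2 &= c - 2 \cdot 0 \;\Longrightarrow\; c = 2
  && \text{sphere: } \chi = 2,\ g = 0 \\
\chi &= 2 - 2g \\
\chi_{\text{torus}} &= 2 - 2 \cdot 1 = 0
  && \text{agrees with the torus cell division} \\
\chi_{\text{double torus}} &= 2 - 2 \cdot 2 = -2
  && \text{agrees with the value quoted above}
\end{align*}
```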

The genus of a surface is a feature of its global topology. The local topology,
remember, is always that of a flat plane. The fact that the local topology is flat,
however, doesn’t mean that the geometry has to be. In unit 8, we will discuss
different types of geometry in detail. For the moment we are concerned only
with the difference between geometry and topology. In geometry, the primary
concern is the measurement of things such as lengths and angles. In topology,
shapes may be stretched and bent freely (though not torn or glued), so lengths
and angles are not preserved and carry little meaning.

We have been discussing two-dimensional surfaces up until this point, but there
is no reason that our ideas need to be limited to such objects. We can generalize
the idea of a surface into that of a manifold. A 2-manifold is an object that has
the local topology of a plane, just like a two-dimensional surface. A 1-manifold
is an object that has the local topology of a line segment, regardless of how
twisted and knotted it is globally. These descriptions reveal the key property of
a manifold: in the local view, it looks straight, or flat, and featureless, but when
viewed globally, it may present a more-interesting structure.

ZOOM IN

3-Manifolds
• A 3-manifold is the three-dimensional analog of a surface; it appears to be
like normal space in a local view, but it can have a more-complicated global
structure.

A 3-manifold is the generalization of a surface in three dimensions. It is an
object that has the local topology of what we normally think of as “space.” It
too, like the 1- and 2-manifolds, can have rather convoluted global topology.
It’s a bit hard for us to visualize what topologies might be possible on a global
scale, because we are stuck inside such a manifold; consequently, we cannot
gain an external view as we can with the 1- and 2-manifolds. Nonetheless, there
are some things that we can observe to acquire some ideas about the global
topology of a 3-manifold. One of these meaningful observations is of what
happens as we leave a particular point and head off on a straight-line path. If,
after traveling a sufficiently long distance without turning, we find ourselves
back where we started, we might have a clue as to the global topology of the
3-manifold we inhabit.

Perhaps this is a bit hard to visualize, so let’s return to a subway example. Let’s
pretend that this subway system is very large, but very simple, consisting of
a single, large oval. It is so large, in fact, that at any given moment, it feels
as if we are traveling in a straight line. Furthermore, let’s assume that our
movement along the track is restricted to only forward or backward motion.
Basically, we are treating this subway as a 1-manifold. If we were newcomers
to the subway system, and we didn’t have a map, we might be able to deduce the
global topology of this system by observing the sequence of stops.

Figure labels: STOP 1, STOP 2, STOP 3, STOP 4, STOP 5

If we board the train at Stop 1, stay on the train for a long time, and eventually
find ourselves at Stop 1 again, we could safely assume that we are traveling
in some kind of loop, even though it doesn’t feel as if we’re turning anywhere.
This experience in one dimension gives us some idea of what it is like to be inside
a manifold. At any given point or moment, it seems like a straight line, flat
plane, or normal space, but as we attain a greater perspective on the system’s
structure, we find that it is not as simple as a line, plane, or space that extends
forever in all directions. As we shall see in the next section, we can go a step
further and actually use this interior, or intrinsic, view to understand topology in
a completely different light.
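
The rider’s reasoning can be written as a small procedure: log the stops as they go by and look for the shortest repeating pattern. This is a minimal sketch under assumptions: the function name is hypothetical, and the order of the stops around the loop (1 through 5) is assumed, since the text does not state it.

```python
def smallest_period(observed_stops):
    """Return the length of the shortest repeating cycle consistent with the
    whole log, or None if no stop has come around again yet."""
    n = len(observed_stops)
    for period in range(1, n):
        if all(observed_stops[k] == observed_stops[k - period]
               for k in range(period, n)):
            return period
    return None

# A long ride, logged from the inside with no map (stop order assumed).
ride = ["Stop 1", "Stop 2", "Stop 3", "Stop 4", "Stop 5", "Stop 1", "Stop 2"]
print(smallest_period(ride))       # the loop closes after five stops
print(smallest_period(ride[:4]))   # too short a ride to tell
```

Nothing in the log records distance or direction of curvature; the only evidence of the loop is that the same stop comes around again.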

SECTION 4.4
Intrinsic Topology
• The Intrinsic View
• Adventures in Flatland
• Intrinsic View of a 3-Manifold

The Intrinsic View


• The extrinsic view of topology is like looking at a subway map; the intrinsic
view is like being on the subway.

In our subway example above, we saw that there are two ways to view a
manifold. The first, and probably most intuitive, way is to look at the manifold
as a whole as it sits in space. This kind of view is called an extrinsic view
and it is the kind of view that we get when we, for instance, look at a subway
map. Although this view is the most intuitive, it is, in some sense, not the
most fundamental way to view a surface or manifold. This is because a single
topological object can be represented extrinsically in many different ways. This
idea, known as “embedding,” will be covered in more detail in the next section,
but for now what we care about is that the extrinsic view is in some ways not as
fundamental as the intrinsic view.

The intrinsic view, remember, is the view from inside a surface or manifold. For
a surface, or 2-manifold, this view can be thought of as what a bug would see if
it landed on the surface. For a line, or 1-manifold, this is what a bug would see
if it landed on a wire. For a 3-manifold, the intrinsic view is what we see in our
daily lives as we look out into outer space. The intrinsic view is a way of viewing
a manifold without regard to how it is embedded. This enables us to distinguish
properties that are inherent in the manifold from properties that are
the results of the way the manifold is represented.

Adventures in Flatland
• A Flatland explorer can experience topological shape as what happens as
she ventures further and further away from home in different directions.
• Box diagrams, also known as gluing diagrams, are a convenient way to
examine intrinsic topology.

To get a better sense of the intrinsic perspective, let’s consider the
donut-shaped torus that was introduced earlier.

Figure labels: HOME, N

Let’s think about what a person living on this surface would experience. Let’s
say that our person is completely two-dimensional, a Flatlander, and she is
curious to find out what her world is like. Remember that because this is
a manifold, it always appears to her to be a flat plane. She, being naturally
curious, sets out to prove it. To do this, she leaves the front of her house and
begins walking “south,” leaving a trail of blue thread behind her to mark her
path.

Figure labels: HOME, N

After traveling for a while without turning, she spots a building in the distance.
As she approaches, she recognizes the building as her own house, except now
she is facing the back of it. She correctly deduces that her world is not, in fact,
an infinite plane but, rather, curves back in on itself. This indicates
to her that her world could be a closed manifold. A closed manifold does not
have to go on forever and yet has no boundary. An open manifold, on the other
hand, extends forever in all directions.

As our traveler approaches the backside of her house, she decides to tie the
end of the blue string that she is carrying to the end of the string at the front of
her house, which marks the beginning of her journey. She surmises that this
effectively creates some sort of loop around her world.

Figure labels: HOME, N

Having realized that her world has some sort of global topology, she resolves
to discover exactly which kind of “shape” she lives in. To do this, she sets out
heading west, this time leaving a trail of red thread to mark her path. After
walking for quite a while without turning, she begins to wonder why she hasn’t
seen her blue thread anywhere. She had thought that she would cross it at
some point and that that would imply that her world is some sort of
“hyper-circle.” Flatlanders know about circles, so our explorer had thought that her
world was some sort of two-dimensional analog to the circle, sort of an “inflated
circle.” We three-dimensional beings call such a structure a “sphere.”

FLATLAND AS A SPHERE WITH RED AND BLUE THREADS

To our explorer’s surprise, after continuing on for a considerably longer time
than the duration of her first journey, she arrives at the east face of her house.
Furthermore, she has managed to return to her house without seeing her blue
thread. This disturbs her greatly, because with her two-dimensional mindset,
she has trouble envisioning the donut surface that we can see as the perfect
explanation for what she has experienced.


DONUT SURFACE WITH RED AND BLUE THREADS

We can clearly envision the donut surface that makes the traveler’s experience
possible, but let’s try to get a feel for how she sees the situation. We need
some sort of device or mechanism for drawing the donut surface from an
insider’s perspective. To do this, we will represent both
a torus and a sphere intrinsically with what are known as box diagrams, or
gluing diagrams. Gluing diagrams are simply flat shapes, squares in this case,
that have a set of rules governing what happens when an object crosses one of
the sides, or boundaries. We can imagine the boundaries being glued to one
another according to the specific markings in the diagram.

Flat torus          Flat sphere
TO GET A SENSE FOR HOW THESE MANIFOLDS BEHAVE, ATTACH LIKE SIDES.

When an object crosses a single-arrow line, it re-enters at the analogous
position on the other single-arrow line. The same holds true when the
double-arrow boundaries are crossed. With our advantage of seeing in three
dimensions, we can easily imagine these box diagrams being curled up with
their edges glued together to make the familiar surfaces of a sphere and a torus
(with some help from our topologically allowed deformations of course).
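
For the flat torus, the gluing rule amounts to wrap-around arithmetic, which makes the explorer’s two journeys easy to replay. The sketch below covers only the flat torus, not the flat sphere, and everything in it is an assumption made for illustration: the grid, the side lengths (chosen unequal so that the westward trip is the longer one, as in the story), and the function names.

```python
WIDTH = 12   # assumed east-west extent of the box diagram, in steps
HEIGHT = 5   # assumed north-south extent, in steps
HOME = (0, 0)

SOUTH = (0, -1)
WEST = (-1, 0)

def step(position, direction):
    """Move one step; crossing a side re-enters at the matching spot on the
    side it is glued to."""
    x, y = position
    dx, dy = direction
    return ((x + dx) % WIDTH, (y + dy) % HEIGHT)

def lay_thread(direction, start=HOME):
    """Walk straight without turning until home reappears; return the points
    the thread passes through."""
    path = [start]
    position = step(start, direction)
    while position != start:
        path.append(position)
        position = step(position, direction)
    return path

blue_thread = lay_thread(SOUTH)
red_thread = lay_thread(WEST)

print(len(blue_thread), len(red_thread))
print(set(blue_thread) & set(red_thread))  # points where the threads meet
```

Both walks return home, and the only point the two threads share is the house itself, which is what the explorer found so puzzling.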

“ROLLING UP” THE FLAT TORUS AND FLAT SPHERE
Panels: Make a Torus, Make a Sphere

To our explorer, however, this view makes no sense; she would probably think of
her experience like this:

Figure labels: H, H

These diagrams represent an intrinsic view of the surfaces of a torus and a
sphere. We could perform any topologically allowed operations to either surface
in our external view, and these diagrams would not change.

The diagram on the right demonstrates what our explorer expected to happen;
it represents a sphere and shows that the two threads would have crossed.
Notice that the paths on this diagram are not straight. This is a result of the
fundamental difference between the local geometry of a torus and a sphere.
The local geometry of a sphere is one of positive curvature, whereas that of a
torus is flat. We will explore these ideas of geometry in more depth in unit 8.

Let’s return to our earlier example of a single-loop subway system. Remember
that we don’t have a map, and that it never really feels as if we’re turning when
we ride it, and that we return to our initial stop after a while. This is our intrinsic
experience, but the extrinsic view of our subway does not have to be a large oval,
or even a circle, for that matter. It can be any convoluted shape, even crossing
over itself, and we would have no idea.

THE DIFFERENT POSSIBILITIES OF THE SUBWAY
Figure labels: three layouts, each showing STOP 1 through STOP 5

Intrinsic View of a 3-Manifold
• Our experience of 3-manifolds is confined to an intrinsic view.
• We can represent a 3-manifold with a cube diagram, the three-dimensional
analog of a box diagram.


Our final point about intrinsic topology is that it is the only choice we have when
it comes to experiencing and attempting to understand a 3-manifold. Our
Flatlander from before had no choice but to explore the intrinsic topology of her
two-dimensional world. Similarly, we have no choice but to explore the intrinsic
topology of our own 3-manifold world. This is a topic to which we will return a
bit later, but for now, let’s look at some possible ways to think of the intrinsic
topology of a 3-manifold.


The above diagram represents a “flat” 3-torus. If we were inside such a
manifold, we would find that as we “exited” one face, we would “enter” at the
analogous spot on the other face having the same marking. Notice that this is
similar to the situation of the flat 2-torus from before, except that in this 3-torus
we can travel up or down and experience the same behavior.

If this were the shape of our universe and we decided to carry out the
Flatlander’s experiment, the first thing we should notice is that we are going
to need another color of thread. If we leave out of the front of the box carrying
a blue thread, we will find that we eventually return through the back of the
box; if we leave out of the side of the box carrying a red thread, we will find that
we return through the opposite side of the box, having never crossed the blue
thread; and if we depart from the top of the box carrying a green thread, we will
find ourselves returning through the bottom of the box, having seen neither the
red nor the blue threads! This may seem very strange to us, as our Flatlander’s
experiment must seem to her. Of course, it’s impossible to carry out such an
experiment in our universe, so let’s consider what a person inside this manifold
must see.
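
The three-thread experiment can be replayed with the same wrap-around rule as a box diagram, extended to three coordinates. This is a sketch under stated assumptions: the box side lengths, the grid, the assignment of axes to “side,” “front,” and “top,” and the function names are all chosen here for illustration.

```python
BOX = (4, 5, 6)    # assumed side lengths of the cube diagram, in steps
HOME = (0, 0, 0)

def step(position, direction):
    """Move one step; leaving through a face re-enters at the analogous spot
    on the opposite face."""
    return tuple((p + d) % side
                 for p, d, side in zip(position, direction, BOX))

def lay_thread(direction, start=HOME):
    """Travel straight until the starting point reappears; return the points
    visited along the way."""
    path = [start]
    position = step(start, direction)
    while position != start:
        path.append(position)
        position = step(position, direction)
    return path

red = lay_thread((1, 0, 0))     # out through a side (axis choice assumed)
blue = lay_thread((0, 1, 0))    # out through the front
green = lay_thread((0, 0, 1))   # out through the top

for a, b in ((red, blue), (red, green), (blue, green)):
    print(set(a) & set(b))      # each pair of threads meets only at home
```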


This person looks forward and sees his back, looks to his right side and sees
his left, and looks up and sees his own feet. This gives the experiment some
reference points that we can have some hope of duplicating in our own universe.
For instance, we can use telescopes to map the night sky and look for regions
that seem to repeat themselves. This, of course, is very complicated, but it is
considerably less complicated than traveling to the edge of the universe in all
directions.

Now that we have an idea of how manifolds look when we view them
intrinsically, let’s turn our attention to the different ways these surfaces can be
viewed from outside. By gaining an understanding of how 1- and 2-manifolds
behave when viewed extrinsically, we’ll gain insight into how a 3-manifold, such
as our universe, might behave.


SECTION 4.5
Embedding and the Extrinsic View
• Embedding
• Subway Maps
• Knots in Nature: DNA

Embedding
• Topological objects can be examined extrinsically by embedding them in
higher-dimensional spaces.
• Some objects require a certain minimum number of dimensions in which
they can be embedded without self-intersection.

Let’s reconsider the graphs that we viewed at the beginning of this unit. We
saw that by counting the number of faces, vertices, and edges, we could find the
Euler characteristic of a particular graph. Furthermore, we found that for any
graph that we can draw on a plane, the Euler characteristic is 2. What about the
following graph, though?

These are intersections, not nodes

This graph obviously can be drawn on a flat piece of paper, and yet we are going
to have a tough time finding its Euler number. Counting edges and vertices is
easy, but counting the faces presents a challenge. This difficulty is due to the
fact that there are edges that intersect one another. On the flat piece of paper,
this simply looks like one edge overlaying another edge. We can’t actually find
the Euler characteristic of this graph, because there is no way to draw it on the
plane without edges intersecting each other. This graph is actually non-planar;
that is, it can’t be embedded in the plane.
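
Euler’s formula also supplies a quick arithmetic test that can rule out planarity. For a connected graph with at least three vertices and no repeated edges, a crossing-free drawing in the plane has every face bordered by at least three edges, which together with V - E + F = 2 forces E ≤ 3V - 6 (and E ≤ 2V - 4 when the graph contains no triangles). The text does not say which graph the figure shows, so the sketch below uses two standard non-planar graphs purely as assumed examples; the function name is hypothetical. The test works in one direction only: exceeding the bound proves a graph is non-planar, while staying under it proves nothing.

```python
def exceeds_planar_edge_bound(vertices, edges, triangle_free=False):
    """Return True if the graph has too many edges for any crossing-free
    drawing in the plane. Assumes a connected graph with at least three
    vertices, no repeated edges, and no edge from a node to itself."""
    limit = 2 * vertices - 4 if triangle_free else 3 * vertices - 6
    return edges > limit

# Assumed examples, not necessarily the graph in the figure:
# five nodes, every pair joined (10 edges).
print(exceeds_planar_edge_bound(5, 10))
# three nodes each joined to three others, with no triangles (9 edges).
print(exceeds_planar_edge_bound(6, 9, triangle_free=True))
# the corners and edges of a cube, which can be drawn flat.
print(exceeds_planar_edge_bound(8, 12))
```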

Remember the Flatland explorer? After completing her explorations, she
discovered that the red thread and the blue thread never crossed each other.
The reason for this was that her surface had a hole in it. We can use this
property of the torus to embed our non-planar graph without any intersections:

The problem we encountered before was one of embedding. On the plane, this
graph can’t exist without edges crossing one another, but on the surface of a
torus, it can. Notice in the image above that the connections that make up the
graph have not changed; the only difference is the surface upon which the graph
is drawn.

Embedding refers to how a topological object (a graph, surface, or manifold) is
positioned in space. The concept of embedding is central to the idea of an
extrinsic view of topology simply because we cannot view something from the
outside unless it is somehow situated in some larger, or higher-dimensional,
space. Otherwise, from where would we be viewing it? Furthermore, there can
be many different ways to embed an object in that larger space.

As with our graph above, we occasionally encounter objects that cannot be
embedded in our space without a self-intersection, a fact that technically means
they can’t be embedded in our space at all. Closed “non-orientable surfaces”
are structures of this kind, and we will learn more about them in the next
section. For now, let’s look more closely at how an object with the same
intrinsic topology (i.e., the same connections) can have different embeddings.

Subway Maps
• Knots are different embeddings of a circle, a one-dimensional torus.
• The same object can be embedded in different ways. Some of these
embeddings, such as a trefoil knot, cannot be smoothly deformed into the
others.
• Reidemeister moves are a set of techniques with which one can tell which
knots are equivalent to each other.

Let’s take another look at the subway loop from our earlier example.

THE DIFFERENT POSSIBILITIES OF THE SUBWAY
Figure labels: three layouts, each showing STOP 1 through STOP 5


We are treating our subway as a one-dimensional manifold. Remember, this
means that when we ride the train, we perceive only forward or backward
motion, even though we know that our subway is a loop because we keep coming
back to the same stop. From our intrinsic perspective, the actual subway map
could be any of the embeddings shown above.

The subway map represents an embedding of our 1-manifold in a
two-dimensional plane. By looking at the map, we view the manifold extrinsically.
The designer of the map has many choices as to how to draw it, provided that the
order of the stops remains the same. Intrinsically, all of these possibilities are
the same, but each version of the map is different extrinsically. Some of these
maps can be turned into one another by bending and stretching, but some of
them can’t.
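
The condition that “the order of the stops remains the same” can be made concrete: two maps describe the same loop from the rider’s point of view when one list of stops is a rotation of the other, read in either direction. The sketch below is illustrative only; the function name and the sample stop orders are assumptions, since the text does not give the order of stops on each map.

```python
def same_loop(order_a, order_b):
    """True if two lists of stops describe the same loop, ignoring which stop
    is listed first and which way around the loop is read."""
    if len(order_a) != len(order_b):
        return False
    n = len(order_a)
    if n == 0:
        return True
    doubled = order_a + order_a
    rotations = [doubled[i:i + n] for i in range(n)]
    return order_b in rotations or order_b[::-1] in rotations

map_one = [1, 2, 3, 4, 5]
map_two = [3, 4, 5, 1, 2]      # same loop, listed from a different stop
map_three = [5, 4, 3, 2, 1]    # same loop, read in the other direction
map_four = [1, 3, 2, 4, 5]     # two stops swapped: a different subway

print(same_loop(map_one, map_two))
print(same_loop(map_one, map_three))
print(same_loop(map_one, map_four))
```

The check says nothing about how the loop sits on the page, which is exactly the information an intrinsic view lacks.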

Item 1719 / Jos Leys, TREFOIL KNOT (2004). Courtesy of Jos Leys.

THE TREFOIL KNOT MAP OF THE SUBWAY
Figure labels: STOP 1, STOP 2, STOP 3, STOP 4, STOP 5

This version of the map is unlike the others. It is plain to see that no
matter how much we manipulate it, we cannot transform it into a circle without
making a cut and re-gluing the ends. However, recall that, experienced
intrinsically, this configuration is no different than a circle. Obviously, from
an extrinsic view, this equivalence no longer holds.

This configuration is an example of a knot. A knot, to a topologist, is simply
a particular embedding of a circle in three-dimensional space, also known as
3-space. It may appear that these knots are embedded in the plane, but recall
that in the plane there is no such notion as “above” or “below.” Clearly, we
need these directional concepts in order to have knots. All knots, when viewed
intrinsically, are the same; they become interesting, really, only when we look at
them extrinsically.

[Figure: KNOTS (three knot diagrams labeled A, B, and C)]

Some knots are easily undone, such as the one shown in image A. Sliding the
overlying side to either the right or the left creates what can, topologically,
be considered a circle. Other knots, such as that shown in image B, are a bit
more difficult, though not impossible, to undo. In this case, sliding the bottom
overlying half-loop down a bit, then sliding the middle overlying part to the
right creates what looks like a circle within a circle. Finally, a mere twist of the
remaining overlying part again creates a topological circle.

Unfortunately, if we try to perform the same types of manipulations, called
“Reidemeister moves,” on knot C, we will be out of luck. A little mental
projection should convince you that clearing up one part of the knot will only
make things worse in other parts. This type of knot, known as a “trefoil knot,”
cannot be undone in the extrinsic view of topology. However, as we saw before,
in the intrinsic view, this is really no different topologically than a circle. The
only way to undo this knot would be to un-embed it, that is, take it out of our
space, untangle it, and then re-embed it in our space. A four-dimensional
being would have little trouble doing this, but we’ll save our examination of
the exploits of four-dimensional beings for a later unit. For now, all that is
important is that we cannot undo it in 3-space.

Central to this study of knots is the concept of isotopy. Isotopy is a form of
equivalence in which one topological object can be transformed into another
while maintaining the property of being an embedding. Although one needs to
be careful in defining it, it is a precise way to capture the notion of deforming
without crossing. This is what we are doing when we use our Reidemeister
moves to undo knots. Hence, we would say that knot A is isotopic to knot B and
that both are isotopic to a circle. Knot C, however, is isotopic to none of these
things, because we would have to un-embed it to undo it.

Knots in Nature: DNA
• Some ideas from knot theory have proven to be useful in the study of the
interplay between DNA and enzymes.
• Central to this study are the concepts of double points and writhe.

The mathematical study of knots has applications to the scientific study of DNA.
DNA is the genetic material that encodes the information that is the blueprint for
living things. DNA is basically a very long strand of alternating pieces of genetic
material called “nucleotides.” Information is encoded in the DNA molecule by
the specific ordering of these nucleotides.

When biologists and geneticists are trying to define the specific sequence of
nucleotides in a strand of DNA, they first must break the molecule up into
smaller pieces. These pieces often form loops and knots similar to what you see
here:

[Figure: KNOTTED DNA]

This structure resembles the kinds of knots that we were studying earlier. This
DNA knot is considered to be “packed” in a form unsuitable for replication.
Before the DNA can be copied, it must be “unpacked” by helper molecules
known as enzymes. This process proceeds in a fashion similar to the undoing
of mathematical knots. In fact, we can use concepts from knot theory to
understand and make predictions about DNA packing and unpacking. This, in
turn, enables us to make predictions about how certain enzymes will function.

[Figure: DNA KNOTS (three examples labeled A, B, and C)]

The picture above shows various DNA knots. Any place where a knot crosses
over itself is called a double point. The number of double points is known as a
knot diagram’s “crossing number.” What’s more, each double point is classified
as either positive or negative, depending on which way the overlying strand must
be turned so that it lines up with the underlying strand. If a clockwise turn of
less than 180° will bring about an alignment, then the double point is considered
“positive”; conversely, if a counterclockwise turn of less than 180° is sufficient to
bring the strands into alignment, then the double point is considered “negative.”

[Figure: DOUBLE POINTS (example crossings marked + or -)]

With each double point “worth 1” (either +1 or -1, as just discussed), the sum of
all the values of a knot’s double points is called its “writhe.” Certain enzymes
are able to reverse the sign of particular double points, thereby allowing the
knot to be undone, that is, the DNA to be unpacked (as shown in part B of the
following diagram).
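
The bookkeeping just described can be sketched in a few lines of Python. This is only an illustration: the function names `crossing_number`, `writhe`, and `flip`, and the choice to record a knot diagram as a list of +1 and -1 values (one per double point), are assumptions made here and do not come from the text. The sample diagram is hypothetical.

```python
def crossing_number(signs):
    """Number of double points in a knot diagram."""
    return len(signs)


def writhe(signs):
    """Sum of the double-point values, each of which must be +1 or -1."""
    for s in signs:
        if s not in (1, -1):
            raise ValueError("each double point must be +1 or -1")
    return sum(signs)


def flip(signs, index):
    """Return a new diagram with the sign of one double point reversed."""
    flipped = list(signs)
    flipped[index] = -flipped[index]
    return flipped


diagram = [+1, +1, +1]      # hypothetical diagram with three positive double points
after = flip(diagram, 0)    # the kind of change a sign-reversing enzyme would make

print(crossing_number(diagram), writhe(diagram))  # 3 3
print(crossing_number(after), writhe(after))      # 3 1
```

Reversing a single sign leaves the crossing number alone and changes the writhe by 2.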

[Figure: UNDOING DNA (steps labeled “move smoothly” and “change double point sign”)]

By comparing the crossing numbers and writhes of the same DNA knot after
successive applications of the enzyme gyrase, genetic researchers were able to
conclude that gyrase systematically reverses the signs of double points in a DNA
molecule.

The application of principles of knot theory, itself a subset of extrinsic topology,
to DNA enzyme analysis represents an interesting example of a branch of
mathematics that was originally studied for its own sake, as topology mostly
is, having unexpected applications in another field. We will explore some
other applications of topology, specifically intrinsic topology, a little later in
this unit. Before we proceed, however, we must take a look at an entire class
of topological objects that we have not yet discussed. These strange objects,
in which the concepts of left and right are meaningless, are the non-orientable
surfaces.

SECTION 4.6 Non-Orientability
• The Möbius Strip
• The Klein Bottle
• The Projective Plane

The Möbius Strip
• A non-orientable surface is one on which there are regions that reverse an
explorer’s sense of right and left.
• If a surface has any reversing paths, it is considered non-orientable.
• Non-orientability is a topological invariant.
• A Möbius strip is an object with only one side. It is the classic example of a
non-orientable surface.

Let’s go back and check in with our adventurous Flatland explorer. Having
completed her experiment with the red and blue threads, she decides to set
out once more, this time in a southeasterly direction. She travels a fair
distance and realizes that she has not seen either her blue or red thread
anywhere. Making a mental note of this, she treks on until she sees a building
in the distance. As she approaches it, she notices something strange. Although
the building appears to be her house, it has some odd features. The address
numbers are reversed, as if they were written in a mirror, and upon further
observation, she realizes that her entire house has been reversed. The tree that
used to be to the left as she approached her front door now is on the right. Her
bedroom, which used to be the last door on the right of her hallway, is now the
last door on the left.

Suspecting some kind of practical joke, she seeks out her neighbors to help
get to the bottom of this. When her neighbors see her, they are shocked at
her unusual appearance. All Flatlanders have their eyes to the north of their
mouths when facing west and their mouths to the north of their eyes when
facing east. The orientation on our explorer’s face is the opposite. Her mouth
is above her eye when she faces west, and her eye is above her mouth when she
faces east.

[Figure: REVERSED EXPLORER (a typical Flatlander beside our explorer)]

Something happened to our explorer on her latest journey that reversed her
orientation, relative to how she started out. This is why everything appeared as
a mirror image to her. This strange part of Flatland was hitherto unknown, and
it is hard for the average Flatlander to figure out what happened.

[Figure: AN INTRINSIC VIEW OF WHAT HAPPENED TO THE EXPLORER]

As three-dimensional observers with the advantage of an extrinsic view, we have
the perspective to find a somewhat more satisfactory explanation. The region
that our explorer experienced is a place where orientation is meaningless.
This means that if a Flatlander takes a trip through this region, they will return
“mirrored,” as our explorer did. A surface with this mirroring characteristic is
known as a Möbius strip.


Item 3067 / Oregon Public Broadcasting, created for Mathematics Illuminated, MÖBIUS
STRIP (2008). Courtesy of Oregon Public Broadcasting.

We can see that this surface has only one edge, and that while it appears to
have two sides, it really has only one. To create a model of this surface, simply
take a strip of paper, put one twist in it, and then attach the ends together.
Tracing the surface with your finger will convince you that both sides of this
object actually are one and the same. When our Flatlander explorer took a trip
through “Möbius land,” completing one cycle of a Möbius strip, she returned
to the point where she began, reversed in orientation. (Be careful: remember
that a Flatlander lives “in” the surface and not “on” the surface.) This kind of
surface, in which paths exist that can reverse one’s orientation, is known as a
non-orientable surface.

[Figure: MAKING A MÖBIUS STRIP (the edges of the twisted strip form one continuous edge)]

The Klein Bottle
• The Klein bottle is another non-orientable surface.
• The Klein bottle cannot be embedded in three dimensions without
intersecting itself.

The Möbius strip is not the only kind of non-orientable surface. Another well-
known example is the Klein bottle, shown here intrinsically.

Notice that following some paths on the Klein bottle will reverse one’s
orientation and following others will not. For instance, in the following
diagram, east-west paths are reversing, whereas north-south paths are not.

[Figure: EXPLORING THE KLEIN BOTTLE]
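
The difference between reversing and non-reversing paths can be imitated with a small Python sketch. It assumes a particular intrinsic picture that the text does not spell out: a unit square whose east and west edges are glued with a flip and whose north and south edges are glued straight across. The function name `move` and the use of +1 and -1 to record the traveler's handedness are likewise assumptions made for illustration.

```python
def move(x, y, handedness, dx, dy):
    """Move a Flatlander by (dx, dy) on a unit square glued into a Klein bottle.

    Assumed gluing: leaving through the east or west edge re-enters on the
    opposite edge with north and south swapped (a flip); leaving through the
    north or south edge re-enters straight across.
    """
    x += dx
    y += dy
    # Each crossing of the east/west seam flips the traveler.
    while x >= 1:
        x, y, handedness = x - 1, 1 - y, -handedness
    while x < 0:
        x, y, handedness = x + 1, 1 - y, -handedness
    # Crossing the north/south seam changes nothing but the coordinate.
    y %= 1
    return x, y, handedness


start = (0.5, 0.5, +1)                 # +1 means "same handedness as at departure"
east_lap = move(*start, dx=1, dy=0)    # one full trip east
north_lap = move(*start, dx=0, dy=1)   # one full trip north

print(east_lap)   # (0.5, 0.5, -1)
print(north_lap)  # (0.5, 0.5, 1)
```

Under this assumed gluing, one lap east brings the traveler home mirrored, a second lap east restores her, and a lap north never changes her handedness.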

This surface is a bit stranger than a Möbius strip. We can think of a Klein bottle
as a surface whose inner face and outer face are the same. To take an extrinsic
view of this surface would require that we somehow embed it into 3-space.

Unfortunately, just as we found that certain graphs cannot be embedded in the
plane without intersecting themselves, the Klein bottle cannot be embedded in
3-space without a self-intersection. We can, however, create what is called an
“immersion,” and one possible immersion looks like this:

Two more representations of a Klein bottle.

Item 1718 / Jos Leys, KLEIN BOTTLE (2004). Courtesy of Jos Leys.

Item 3063 / Oregon Public Broadcasting, created for Mathematics Illuminated,
KLEIN BOTTLE (2008). Courtesy of Oregon Public Broadcasting.

Mentally walking along the surface of this object should convince you that its
“inside” is the same as its “outside.”

The Projective Plane
• The projective plane is another common, non-orientable surface.
• A projective plane can be attached to orientable surfaces to make them
non-orientable.

Both the Möbius strip and the Klein bottle are relatively easy to picture
extrinsically, but the third non-orientable surface that we shall investigate is
really best understood intrinsically. It is known as the “real projective plane” or,
more commonly, just the “projective plane.” Intrinsically it looks like this:

[Figure: PROJECTIVE PLANE (two equivalent intrinsic diagrams)]

Notice that a Flatlander traveling across this surface would be reversed no
matter which path she chose to follow.

An interesting aspect of the projective plane is that it can be used to construct
other surfaces. In fact, by cleverly attaching two projective planes together,
we get a Klein bottle. To perform this “operation,” we first take two projective
planes and unhinge them at one connection:

[Figure: CREATING A KLEIN BOTTLE OUT OF TWO PROJECTIVE PLANES]

We then connect them together to form the square, with a diagonal representing
the seam. Now, if we rotate the planes with respect to each other, we end up
with a diagram that resembles a Klein bottle with a diagonal. Because the
diagonal is interior to the shape, we can disregard it, and, voilà, we have a
Klein bottle.

This process of combining two surfaces to create a third surface, possibly of
another type, is a powerful idea. It helps to explain the structure that our
Flatland explorer found. Although she found her world to be like a torus in
most respects, it included a region that reverses people. We can think of this as
equivalent to the surface of a torus glued to either a Möbius strip, Klein bottle,
or projective plane. In fact, it is possible to add together all types of surfaces to
create new ones. In the next section we shall see how this concept leads to a
new type of algebra, in which we use surfaces instead of numbers.

SECTION 4.7 Connected Sums and the Classification of Surfaces
• Adding Shapes
• Classification of Surfaces
• Poincaré’s Conjecture

Adding Shapes
• We can add two or more surfaces together via a connected sum.
• The sphere serves as the identity under connected sums.

Connected sums provide a means for combining topological surfaces to create
other surfaces through strategic cutting and re-gluing. Now let’s be clear
about what we intend to do here, because previously we stated that cutting
and gluing are not allowed in topology. To be precise, two objects are said to
be topologically equivalent if they can be deformed into one another smoothly
without cutting or gluing.

These two objects are considered to be the same in topology because they both
have only one hole, even though they look radically different. This is somewhat
similar to having three bananas and three oranges; to be sure, bananas are
different than oranges, but both groups are examples of the number “3.”
Similarly, the two tori depicted are different from each other, but both are
examples of a genus 1 surface. (Remember, an object’s genus is the number of
holes that it has.)

Carrying on with the oranges and bananas example, we can take a set of three
oranges and combine it with a set of five oranges to get a new set of eight
oranges. This act of combining is what we think of as addition, one of the four
basic mathematical operations (subtraction, multiplication, and division being
the other three). All of these operations take objects, numbers in this case, and
do something to them, most of the time giving us a new, different number.

We can think of “cutting and gluing” as an operation in topology. It is a way of
taking two topological objects and combining them to make a new and, most
of the time, different topological object. We call the result of this operation a
“connected sum.” Our exercise in the preceding section, in which we turned two
projective planes into a Klein bottle, serves as an example.

Before we explore this further, let’s establish some notation for convenience.
We’ll refer to a torus as T2, a sphere as S2, the Klein bottle as K2, the projective
plane as P2, a disk as D2, and the Euclidean plane as E2. The number 2s in
these designations indicate that they are all two-dimensional surfaces. The
symbol we’ll use for a connected sum is the pound sign, or number sign, #.

So, for example, if we take two T2s, cut out a disk from each, and then glue them
together, we will have T2 # T2, as shown here:

This is a double-holed torus. We won’t give it its own symbol; instead, we’ll just
remember what the operation means. T2 # T2 # T2 would symbolize a three-
holed torus.

The example from the previous section, in which two projective planes were
joined to create a Klein bottle, would have this notation: P2 # P2 = K2.

Let’s return to our fruit example for a simple review of the concept of identity.
We can take a set of three bananas and add a set of zero bananas and end up
with just three bananas. Because adding zero doesn’t change anything, we refer
to “0” as the “additive identity.”

In topology there is an identity as well: it is the sphere. If we take the
connected sum of any object with a sphere, we end up with the original object.
For example, T2 # S2 = T2.

Also, K2 # S2 = K2.

To explore connected sums fully, we need one more relationship:
K2 # P2 = T2 # P2. We can get a sense of why this is true by thinking of a
projective plane as a reversing region; in other words, anything that passes
through it has its orientation reversed.

If we attach a projective plane to a Klein bottle and then maneuver the Klein
bottle so that it passes through the projective plane, the Klein bottle turns into
a torus and the projective plane remains unchanged. This suggests to us that
the connected sum of a Klein bottle and a projective plane is equivalent to the
connected sum of a torus and a projective plane.

We also find that the commutative property applies to connected sums. In
our fruit example, adding a set of three oranges to a set of five oranges is no
different than adding a set of five oranges to a set of three oranges; the order
does not matter. Similarly, taking the connected sum of two objects gives the
same result no matter what order we do it in. Connected sums also adhere to
the associative property, so the grouping of the summands does not matter
either. Together, these two properties let us rearrange a sum freely, so that
T2 # S2 # K2 = K2 # T2 # S2.

So, in topology, we have an identity and both the commutative and associative
properties. This suggests that we can use surfaces to do algebra! These
basic properties, along with the fact that P2 # P2 = K2 and K2 # P2 = T2 # P2,
enable us to deal with really complicated surfaces. Suppose that we have some
arbitrary surface, M2. Furthermore, suppose we know that:

M2 = P2 # T2 # P2 # S2 # P2 # K2

Let’s first use the commutative property to rearrange the sequence of this sum:

M2 = P2 # P2 # P2 # K2 # T2 # S2

Now, because K2 # P2 = T2 # P2, we can make a substitution and write:

M2 = P2 # P2 # P2 # T2 # T2 # S2

We know that adding a sphere is the identity and changes nothing, so let’s
drop it:

M2 = P2 # P2 # P2 # T2 # T2

Now, remembering that P2 # P2 = K2, we can write:

M2 = P2 # K2 # T2 # T2

Again making use of the fact that P2 # K2 = P2 # T2, we can write:

M2 = P2 # T2 # T2 # T2

We should recognize this configuration as a three-holed torus with a projective
plane attached. Alternatively, because adding a sphere changes nothing, we
could view this as a sphere with three “handles,” representing the tori, and a
projective plane attached.
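
The simplification of M2 can be mechanized. The Python sketch below is an illustration, not a standard routine: the function name `normal_form` and the one-letter codes S, T, K, and P for the sphere, torus, Klein bottle, and projective plane are assumptions made here. It uses only the rules stated in the text (the sphere is the identity, P2 # P2 = K2, and K2 # P2 = T2 # P2). The case with an even number of projective planes is not worked out in the text; it follows from the same rules, because K2 # K2 = K2 # P2 # P2 = T2 # P2 # P2 = T2 # K2.

```python
from collections import Counter


def normal_form(summands):
    """Reduce a connected sum, given as codes S, T, K, P, to a short form."""
    counts = Counter(summands)
    unknown = set(counts) - set("STKP")
    if unknown:
        raise ValueError(f"unknown surface codes: {sorted(unknown)}")

    tori = counts["T"]
    # Spheres are dropped (identity); each Klein bottle counts as P2 # P2.
    planes = counts["P"] + 2 * counts["K"]

    if planes == 0:
        return ["T"] * tori or ["S"]
    if planes % 2 == 1:
        # One plane is kept; each remaining pair of planes becomes a torus.
        return ["P"] + ["T"] * (tori + (planes - 1) // 2)
    # An even number of planes leaves one Klein bottle; the rest become tori.
    return ["K"] + ["T"] * (tori + (planes - 2) // 2)


# M2 = P2 # T2 # P2 # S2 # P2 # K2, the example worked through above.
print(normal_form("PTPSPK"))  # ['P', 'T', 'T', 'T']
```

The printed result matches the hand computation: a projective plane attached to a three-holed torus.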

[Figure: 3 TORI WITH A CROSS CAP ATTACHED (the cross cap is labeled)]

The Classification of Surfaces
• Every surface is reducible to a sphere with either handles or projective
planes—or both—attached.

It is fascinating to realize that any 2-manifold, or two-dimensional surface,
that we can envision will always be reducible, using the rules stated above,
to a sphere with some number of handles and/or some number of projective
planes attached. This very important theorem in topology is known as the
“classification of surfaces.” It was first proven for orientable surfaces by August
Möbius, a German mathematician, physicist, and astronomer, who was a student
of Gauss.

We have been examining surfaces and their topological representations,
but what about 3-manifolds? The possibilities with these structures are not
as straightforward as those involving two-dimensional surfaces, but it is a
fascinating story that started in the 19th century and was only resolved in the first
decade of the 21st century.

Before we can understand the 3-manifold case, we need one more concept as
a tool. Let’s return once again to our Flatland explorer. Recall that on her very
first journey, she carried with her a length of blue thread. Had she been on the
surface of a sphere, she could have, while still holding both ends and without
cutting the thread, spooled it all back up, effectively shrinking her loop of thread
until it was entirely on her original spool.

[Figure: LOOP SHRINKING ON A SPHERE]

However, she is not on the surface of a sphere, but is rather on a torus with
some sort of projective plane attached. If she tried to re-spool her blue thread,
she would quickly find it to be impossible, because the thread passes through
the hole of the torus.

[Figure: CAN’T SHRINK THE LOOP ON A TORUS]

This property, commonly referred to as the “loop-shrinking property,” states
that, on a sphere, or any surface that is topologically equivalent to a sphere,
every loop that is drawn on the surface can be shrunk continuously to a point.
If, on the other hand, you are not on a sphere, then it will always be possible
to draw a loop that cannot be shrunk to a point, as our intrepid Flatlander
discovered with her blue thread.

Poincaré’s Conjecture
• The Poincaré Conjecture is the equivalent of the classification of surfaces for
3-manifolds that have the loop-shrinking property.
• The Poincaré Conjecture was proven to be correct only in the first decade of
the 21st century.

The great French polymath Henri Poincaré sought a similar property governing
3-manifolds in 1904. He was curious to know whether a 3-manifold could
exhibit the loop-shrinking property and not be the three-dimensional equivalent
of a sphere (often referred to as a “3-sphere”). The assertion that this is not
possible, that is, that every closed 3-manifold with the loop-shrinking property
is a 3-sphere, became known as Poincaré’s Conjecture. This loop-shrinking
conjecture has much to do with how 3-manifolds are classified, in much the
same way that the two-dimensional loop-shrinking property helps to classify
surfaces: it can tell us if we are on a 3-sphere or not.

The proof of Poincaré’s Conjecture eluded mathematicians for nearly 100 years
and became one of the most-sought-after results in all of mathematics. In the
intervening time, a great body of mathematics was developed and explored
by many brilliant thinkers, such as Thurston and Hamilton. Thurston, in
particular, established a conjecture that allowed all 3-manifolds to be classified
in a similar way to the 2-manifolds. Now referred to as the Geometrization
Theorem, it, along with Poincaré’s original conjecture, was proven by the
reclusive Russian mathematician, Grigory Perelman, at the start of the 21st
century.

This great contribution to mathematics, representing the culmination of a
century of international efforts, earned Perelman a Fields Medal, which is the
mathematical equivalent of the Nobel Prize. In an odd twist, Perelman refused
the honor of the Fields Medal in an act that brought a fair amount of controversy
to the mathematics community.

Regardless of the dramatic personalities involved, the classification of
3-manifolds has far-reaching consequences for mathematics. While it may have
some repercussions for the physical sciences, its primary value is in its beauty
as a mathematical construction. This is true for most topological exercises,
which generally are done not so much for their practical value as for their
mathematical and aesthetic value. Mathematics can indeed be as wondrous
and beautiful as a great work of art or music or any other achievement of the
human mind. Be that as it may, topology is not studied completely for its own
sake. In the remainder of this unit, we will examine two practical applications of
topological thinking. The first has to do with a rather mundane manufacturing
task with a startling topological explanation. The second is an exploration of the
shape of our universe.

SECTION 4.8 Robots
• A configuration space is a topological surface that corresponds to the
different states allowed to a given system.
• The concept of a configuration space can be used to plan things such as
manufacturing processes that minimize the risk of damage to expensive
machinery.
• We first make an intrinsic model of the configuration space; then we use the
Euler characteristic to find the genus and, with it, a sensible extrinsic view.
• This strategy is not in actual use in factories; this is merely an example of
how the ideas of topology might be used.

Imagine that you are the manager of an automated manufacturing facility.
You have invested large sums of money in a pair of state-of-the-art robots
to assist in the production of widgets. The production process is a five-step
process, requiring each of your two robots to visit five separate locations on
your manufacturing floor. The possible paths connecting the five stations are as
shown:

[Figure: graph of the possible paths connecting the five stations]

Now, the above graph displays all possible routes between the stations, but
some routes are actually preferable to others. In particular, you probably
don’t want routes that could lead to the robots colliding with one another,
resulting in costly and time-consuming repairs. So, you would like to restrict
the movement of the robots somewhat, so that they can accomplish their tasks
with the least chance of collision. In short, you wish to consider only those route
configurations that are safe for the robots.

One way to ensure that the robots never collide with one another is to insist that
they always maintain at least a one-full-edge distance between themselves. In
such a system, if one robot (robot 1) were at station A, then robot 2 could not be
on any of the edges that connect to station A.

[Figure: robot 1 at a station; panels show “Robot 2 can be at any of these vertices” OR “on any of these edges”]

What’s more, if robot 1 were on the edge between A and E, then robot 2 could not
be at either A, or E, or on any edge that connects to either of them.

[Figure: “Robot 1 is somewhere on this edge” and “Robot 2 can be anywhere on these edges”]

With these rules in place, we can organize all the possible safe configurations
into a topological object called a “configuration space.” A configuration
space is a topological surface that represents all the possible arrangements
or configurations of a physical system, such as that of our robots and their
work stations. We can construct this surface by systematically cataloging
all allowable positions of robots in real space and correlating them with the
intrinsic cell decomposition of the surface.

The first thing to consider in describing this space is what happens when each
robot is at a station. We can think of these configurations as discrete, in that
a robot is either at a particular station or it is not. This suggests to us that the
representation of these configurations in configuration space should be vertices.
A vertex has no degrees of freedom, and this corresponds well to the idea that if
both robots are at stations, neither is free to change position.

So, for every possible way that two robots can be at two different stations, we
will have a unique vertex in our configuration space. If robot 1 is at station
A, then robot 2 has four possible locations. Applying this thinking around the
stations leads us to conclude that there are 5 × (5-1) possible ways for the two
robots both to be at stations. This means that our configuration space will have
20 vertices.

Let’s consider now what happens when robot 1 is at a station and robot 2 is on an
edge. This is no longer a nice, discrete situation, for although robot 1’s position
is fixed nicely, robot 2’s precise position on any particular edge can vary. This
suggests to us an object with one degree of freedom, which is a line segment,
or because we are thinking in terms of graphs, an edge. So, how many ways
are there for one robot to be fixed at a station as the other robot is traversing an
edge? If robot 1 is fixed at a station, then robot 2 can be on any of six possible
edges.

[Figure: CONFIGURATION SPACE OF THE TWO ROBOTS ON K5 (three panels: a vertex in configuration space, an edge in configuration space, a face in configuration space)]

Following this reasoning around the graph, we find that there are 30 possible
arrangements in which robot 1 is in a fixed position and robot 2 is “moving.”
Applying the same logic, there must also be 30 ways for robot 2 to be fixed while
robot 1 changes position. This means that there are 60 possible ways for one
robot to be fixed at a station while the other robot is moving on an edge. We said
earlier that each of these ways corresponds to an edge in configuration space,
so in addition to the 20 vertices, our space has 60 edges.

The final situation to consider is when both robots are in transit; that is, neither
is fixed, and both are allowed to move. This arrangement suggests to us an object
with two degrees of freedom, which is a face. So, each possible way that both
robots can be in transit corresponds to a face in configuration space. We must
still follow the “one-full-edge-apart” rule, however, to ensure that there are no
collisions, so if we confine robot 1 to a particular edge, as in the third image in
the diagram above, then robot 2 is restricted to three possible edges.

Going around the graph and applying this reasoning, we find that there are 15
ways to choose a pair of edges that share no station. For each such pair, either
robot can take either edge, so reversing the roles of the robots gives another 15
possible scenarios. This translates into 30 total ways for both robots to be in
transit, and each of these ways corresponds to a face in configuration space. So,
in addition to 20 vertices and 60 edges, our configuration space has 30 faces.
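
The three counts can be double-checked by brute force. The Python sketch below assumes that every pair of the five stations is joined by a path, that is, that the floor plan is the complete graph K5, as the figure title and the count of six available edges indicate. The variable names are invented for this illustration.

```python
from itertools import combinations, permutations

STATIONS = "ABCDE"
# Assumption: every pair of stations is joined by a path (the graph K5).
PATHS = [frozenset(pair) for pair in combinations(STATIONS, 2)]

# Both robots parked at different stations: (robot 1's station, robot 2's station).
vertices = list(permutations(STATIONS, 2))

# One robot parked, the other on a path that does not touch that station.
parked_and_moving = [(s, p) for s in STATIONS for p in PATHS if s not in p]
edges = 2 * len(parked_and_moving)   # either robot can be the parked one

# Both robots in transit on paths that share no station (ordered pairs,
# so swapping the robots counts as a different configuration).
faces = [(p, q) for p in PATHS for q in PATHS if not (p & q)]

print(len(vertices), edges, len(faces))  # 20 60 30
```

Because the face list is built from ordered pairs, it already includes the role reversal of the two robots.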

To construct this space, it is necessary to label every vertex, edge, and face
meticulously, and then to put these pieces together in some consistent manner.
There are many ways to do this; here is a portrayal of one such way:

[Figure: intrinsic diagram of the configuration space; each cell is labeled with the two robots’ positions, such as AC, DB, and EA]

Now that we have constructed this space, we can plot paths through it and be
confident that those paths will correspond to sequences of safe movements
for the robots on the manufacturing floor. One thing that we must recognize
is that, when we mark a path in configuration space, we will inevitably come to
a boundary of a face (as indicated by the blue point in the diagram). Note that
a path that leaves a face will return (i.e., re-enter the configuration space) on
some other face, just as with our intrinsic box diagrams that we explored earlier.
The possible connections portrayed on this intrinsic representation of this
striking shape are very hard to grasp intuitively. Here is where Euler’s formula
can be of help.

Because our configuration space has 20 vertices, 60 edges, and 30 faces, we can
substitute these values into the V – E + F formula; doing so gives us an Euler
characteristic of -10. Using this Euler number, we can find the genus of this
object by using the formula Χ = 2 - 2g, where g is the genus and Χ is the Euler
number. Substituting -10 for Χ and solving for g, we come up with a genus of
six. Recall that a surface’s genus is simply the number of holes that it has, so
we have discovered that our configuration space is actually a six-holed torus! [1]

[1] We neglected to mention earlier that this configuration space is an orientable surface.
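
The arithmetic above can be written out as a short sketch. The function names are illustrative, not from the text, and the genus formula is applied only under the stated assumption that the surface is orientable.

```python
def euler_characteristic(vertices, edges, faces):
    """Return V - E + F."""
    return vertices - edges + faces


def genus_of_orientable_surface(chi):
    """Solve chi = 2 - 2g for g, assuming a closed orientable surface."""
    if chi > 2 or chi % 2 != 0:
        raise ValueError("no closed orientable surface has this Euler characteristic")
    return (2 - chi) // 2


chi = euler_characteristic(20, 60, 30)
print(chi)                               # -10
print(genus_of_orientable_surface(chi))  # 6
```

As a sanity check, a sphere (Euler characteristic 2) comes out with genus 0 and an ordinary torus (Euler characteristic 0) with genus 1.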

It would have been difficult to say at the beginning that the possible
configurations that enable two robots to visit five manufacturing stations safely
would end up forming a space that is topologically equivalent to a six-holed
donut. Nevertheless, the problem works out beautifully, and this is the reality of
the situation.

In this problem, we reduced a physical situation to an intrinsic topological
model. We then analyzed this model to find out what kind of a 2-manifold it
was. In our final section, we will turn our attention to the larger question of
3-manifolds. We will come face to face with the challenge of understanding an
object so large that even catching a glimpse of its intrinsic topology would be a
great breakthrough. This object is our own universe.


SECTION 4.9

The Shape of Space
• Life in a Manifold
• Heavenly Clues

Life in a Manifold
• The shape of the universe is a question that we can explore only intrinsically.
• There are many possible shapes to describe our universe.

Let us return, finally, to our initial question: does the universe go on forever? To
the thinker in ancient times, the size of the Earth must have been unfathomable.
Determining the shape of the Earth by a brute-force exploration approach
would have been a distinct impossibility. Like any manifold, the surface of the
Earth appears to be flat everywhere to those of us on its surface. Still, early
philosophers and thinkers were able to gauge, more-or-less correctly, the
size and shape of the Earth. The most famous of these efforts was made by
the Greek philosopher and mathematician, Eratosthenes, who very cleverly
calculated the circumference of the Earth by comparing shadows at different
latitudes. This ingenious exercise established facts that were empirically
verified many centuries later by the first round-the-world explorers.

In trying to comprehend the size and shape of our universe, we are faced with
a similar dilemma. As far as we can tell with local measurements, space
appears to be the three-dimensional analog of an infinite flat plane. To verify
this empirically, we would have to set out in a theoretically-impossible-to-build,
faster-than-light spaceship to explore the furthest reaches of the
visible universe. This is even less of an option to us than sailing around the
world would have been to the Greeks. We can search for other evidence and an
alternative verification method, however, just as Eratosthenes studied shadows
rather than attempting to sail around the world.

When we inquire into the shape of space, we are seeking to know whether the
universe has some sort of interesting topological structure. That is, if we were
somehow able to explore the furthest reaches, as the Flatlander did in her
world, would we find that certain routes lead us back to where we started? At
the heart of this question is the idea of connectivity. If the universe is simply
connected, then it would be analogous to the surface of a sphere in three
dimensions. This would mean that any loop of string in space could be reeled in
to a point with no problems (at least no theoretical problems). This is the way
that Riemann and Einstein imagined space to be.

Another possibility, however, is that the universe is multi-connected. The
simplest shape for a multi-connected universe would be a 3-torus.

What would it be like to exist in this kind of universe? Well, if you traveled
forward far enough, you would end up where you started. Actually, the same
thing would happen no matter which direction you traveled. What if this space
were so large that it was impossible to travel far enough to return to your
starting point? If you simply looked around, you would find some clues as to the
nature of your space. If you looked forward, you would see your back; if you
looked to the left, you would see your right side; and if you looked up, you
would see the soles of your feet.

Heavenly Clues
• To determine the shape of the space we live in, we can search the night sky.

By making visual observations, such as those just noted, and analyzing how
the copies of yourself that you see are arranged, you could begin to deduce the
large-scale topological structure of your universe. So, in terms of galaxies, we
could scan the night sky, looking for copies of our own galaxy. A problem arises,
however: we don’t know what our galaxy looks like from the outside, so how
would we know if we were looking at some distant image of it? Another problem
is that space is so large that light takes a very long time to reach us from other
galaxies. Consequently, the images that we see are not of the galaxies as
they are now, but rather as they were when the light that reaches us left them
hundreds, thousands, or even millions of years ago. Relating this to our earlier
3-torus example, this situation is equivalent to seeing your back as it appeared
when you were ten years younger, or perhaps your side when you were five
years younger, or, possibly, the soles of your feet as they looked twenty years
ago. It would not be evident that you are even seeing an image of yourself.

[Figure: The Microwave Sky, WMAP (2006). Courtesy of NASA/WMAP Science Team.]

In order to hypothesize about the shape of space, astronomers have to study
something more basic than images of galaxies. So, they study things such
as the average distance between galaxies. These average distances can
then be collated into a distribution, and that distribution can be matched up
against theoretical ones corresponding to different shapes of the universe.
Space scientists also study the cosmic background radiation, which is a form
of radiation that is left over from the Big Bang. As it turns out, this type of
radiation is not uniformly distributed in space, so it provides a reference base
for exploring the night sky. Astronomers can also search the night sky for spots
that have the same temperature features and, possibly, other similarities. The
discovery of such regions that share characteristics would be a significant step
toward reaching a better understanding of the shape of our universe.

UNIT 4 at a glance

SECTION 4.2

What is Essential about Shape?
• Topology is generally believed to have started with Euler’s solution to the
Bridges of Königsberg problem.
• Euler saw that the essential nature of the problem had nothing to do with
distance or other geographical features, but only with connections. He
expressed this in the Euler characteristic.
• The Euler characteristic of a graph tells you the kind of surface upon which
that graph can exist.
• Two surfaces are considered to be equivalent if one can be continuously
deformed into the other without cutting or gluing.

SECTION 4.3

Surfaces and Manifolds
• A surface, a two-dimensional manifold, looks flat in a local view, but it can
have a more-interesting global structure.
• Topological objects are categorized by their genus (number of holes).
• A 3-manifold is the three-dimensional analog of a surface; it appears to be
like normal space in a local view, but it can have a more-complicated global
structure.

SECTION 4.4

Intrinsic Topology
• The extrinsic view of topology is like looking at a subway map; the intrinsic
view is like being on the subway.
• A Flatland explorer can experience topological shape as what happens as
she ventures further and further away from home in different directions.
• Box diagrams, also known as gluing diagrams, are a convenient way to
examine intrinsic topology.
• Our experience of 3-manifolds is confined to an intrinsic view.
• We can represent a 3-manifold with a cube diagram, the three-dimensional
analog of a box diagram.

SECTION 4.5

Embedding and the Extrinsic View
• Topological objects can be examined extrinsically by embedding them in
higher-dimensional spaces.
• Some objects require a certain minimum number of dimensions in which
they can be embedded without self-intersection.
• Knots are different embeddings of a circle, a one-dimensional torus.
• The same object can be embedded in different ways. Some of these
embeddings, such as a trefoil knot, cannot be smoothly deformed into the
others.
• Reidemeister moves are a set of techniques with which one can tell which
knots are isomorphic to each other.
• Some ideas from knot theory have proven to be useful in the study of the
interplay between DNA and enzymes.
• Central to this study are the concepts of double points and writhe.

SECTION 4.6

Non-Orientability
• A non-orientable surface is one on which there are regions that reverse an
explorer’s sense of right and left.
• If a surface has any reversing paths, it is considered non-orientable.
• Non-orientability is a topological invariant.
• A Möbius strip is an object with only one side. It is the classic example of a
non-orientable surface.
• The Klein bottle is another non-orientable surface.
• The Klein bottle cannot be embedded in three dimensions without
intersecting itself.
• The projective plane is another common, non-orientable surface.
• A projective plane can be attached to orientable surfaces to make them
non-orientable.

SECTION 4.7

Connected Sum and the Classification of Surfaces
• We can add two or more surfaces together via a connected sum.
• The sphere serves as the identity under connected sums.
• Every surface is reducible to a sphere with either handles or projective
planes, or both, attached.
• The Poincaré Conjecture is the equivalent of the classification of surfaces for
3-manifolds that have the loop-shrinking property.
• The Poincaré Conjecture was proven to be correct only in the first decade of
the 21st century.

SECTION 4.8

Robots
• A configuration space is a topological surface that corresponds to the
different states allowed to a given system.
• The concept of a configuration space can be used to plan things such as
manufacturing processes that minimize the risk of damage to expensive
machinery.
• We first make an intrinsic model of the configuration space; then we use the
Euler characteristic to find the genus and, with it, a sensible extrinsic view.
• This strategy is not in actual use in factories; this is merely an example of
how the ideas of topology might be used.

SECTION 4.9

The Shape of Space
• The shape of the universe is a question that we can explore only intrinsically.
• There are many possible shapes to describe our universe.
• To determine the shape of the space we live in, we can search the night sky.

TEXTBOOK
Unit 5
UNIT 05
Other Dimensions
TEXTBOOK

UNIT OBJECTIVES

• Dimension is how mathematicians express the idea of degrees of freedom.

• Distance and angle are measurements that exist in many types of spaces.

• Lower-dimensional analogies extend qualitative understanding to spaces of four
dimensions and higher.

• The techniques of projection and slicing help us to understand high-dimensional
objects.

• High-dimensional space is one way to compare two people mathematically.

• Hausdorff dimension is a re-envisioning of our normal thinking of dimension due to
behavior of objects under scaling.

• Fractal dimensions describe many real-world objects that exhibit statistical
self-similarity.

Yet I exist in the hope that these memoirs,
in some manner, I know not how, may find
their way into the minds of humanity in
Some Dimension, and may stir up a race
of rebels who shall refuse to be confined
to limited Dimensionality.

A Square in Edwin Abbott’s Flatland


UNIT 5 Other Dimensions
textbook

SECTION 5.1

INTRODUCTION

When we measure something, such as the length of a wooden beam, we are
focusing on one particular characteristic of that object and assigning a number
to it. Many objects, however, in both our everyday experience and the realm of
mathematics, cannot be adequately described by a single number. For instance,
if you were to build a house, you would need beams and boards that are cut
precisely in three different directions: length, width, and thickness. In other
words, a 2 × 6 that is three feet long will not do if you need one that is eight
feet long. All three measurements are independent and important. The more
aspects that we can measure about a single object, the more precisely we can
describe and work with it.

This way of thinking leads us quite naturally to the idea of “dimension.”
The word itself comes from the Latin dimensus, which means “to measure
separately.” So, quite literally, dimensions are aspects of a particular object
that we measure separately from one another.

In this unit, we will explore the idea of dimension in a few ways. At first we will
define it simply as quantities that can be manipulated independently of one
another. We will describe the fairly common concepts of one, two, and three
dimensions—most of us can easily grasp these—and then we’ll explore the
trickier 4th dimension and discuss how to conceive of higher dimensions. Then
we will introduce two concepts, scalability and self-similarity, and explain how
these give rise to a different idea of dimension, the “fractal” dimension.

Dimension is a tangible part of our everyday experience; we are accustomed
to “navigating the grid” in most cities and towns by moving in two directions,
north-south and east-west. Dimension is often referenced in popular culture,
too. Think of the “one-dimensional” character in a movie—the person who is
concerned with only one thing, to varying degrees, such as the hero of an action
movie, or the villain of a crime thriller. Artists such as Marcel Duchamp and
Pablo Picasso attempted to present the concept of “higher dimensions” in their
works by portraying objects from different angles simultaneously. In many
works of science fiction, people use extra dimensions to travel around the galaxy
via cosmic wormholes and other fanciful conjectures.

In modern mathematics the concept of dimension, utilized in a number of
practical applications, encompasses much more than just the three spatial
degrees of freedom (length, width, and height) to which we are accustomed.
For example, marketers and matchmakers design computer programs capable
of constructing “30-dimensional” profiles of individuals based on their multiple
interests and inclinations, hoping to pair these people with products or romantic
partners.

Many scientists believe that the very fabric of our universe, of reality itself, can be
understood only by going beyond the traditional three dimensions and studying
the mathematics of higher dimensions. Whether it is the four dimensions of
spacetime associated with the theory of general relativity or the ten or more
dimensions involved in string theory, we live in a reality that allows for many
degrees of freedom.

We can find exciting phenomena in fractional dimensions as well. This entirely
new and different way to view the concept of dimension has been applied to the
simulation of realistic plants in computer programs and to the authentication of
works of art, such as those of Jackson Pollock.

In this unit, we will learn how to leverage our intuitive understanding of the
world of three dimensions to enable us to think meaningfully about worlds of
many degrees of freedom. Mathematics often is applied to the study of things
and worlds that exist only in our minds—that is, the realm of the logically
possible. One of the basic tools mathematicians use to get a handle on these
mental worlds is the notion of dimension. We’ll develop a mathematical
understanding of dimension and gain some familiarity with associated tools,
such as slices and projections, which mathematicians use to conceive of and
understand our world and other multi-dimensional frontiers.

SECTION 5.2

Degrees of Freedom
• Fundamental Notions
• Lineland
• Flatland
• Spaceland

FUNDAMENTAL NOTIONS
• The most basic conception of dimension is as a degree of freedom.
• A point is an object with no properties other than location.
• A space is a collection of locations.
• Spaces can be characterized by their degrees of freedom.

The concept of dimension is, in its most basic and intuitive form, the concept
of measuring certain aspects of an object independently from all of its other
aspects. This idea of dimension is also known as “degrees of freedom.” If an
object has three degrees of freedom—height, width, and length, let’s say—that
means that it is able to “change” in any one of those three ways, and a change
in one has no effect on the other two. So, if we are navigating the streets of a
city laid out on a grid system, for instance, we are free to change our east-west
position or our north-south position, depending on whether we’re moving along
an avenue or a street. These are our two degrees of freedom. In a city whose
grid system is perfectly oriented to the four cardinal directions, going north on
an avenue does not affect your east-west position.

In order to examine the basic nature of spaces of different dimensions, we will
look at how many numbers it takes to specify the location of a point. For our
purposes, a point is an object with no properties other than its location.
A point, by itself, has no degrees of freedom; it is effectively a space of zero
dimensions.

We consider a space to be a collection of locations. The zero-dimensional space
has only one location and, thus, allows for only one point. A space with more
than one possible location allows for at least one degree of freedom for a point
in that space. It also allows for the existence of multiple points, which then can
be grouped to form line segments, polygons, solids, and so on, depending on the
exact dimension of the space.

Not all spaces are created equal. Their differences can be characterized in
various ways, such as how one defines distance, whether or not angles exist,
and how many degrees of freedom are afforded the objects in that space. We
will concern ourselves only with the last of these properties. To help you get a
handle on this concept of degrees of freedom, here’s another way to look at it:
a space of locations in which a point has only one degree of freedom is a space
in which points can differ from one another in only one way. A number line is a
model of this type of space.

Furthermore, two distinct points in this space of one degree of freedom can never
share a coordinate. If they did, they would be the same point!

A space of two degrees of freedom allows for points to differ from one another in
more than one way. For instance, (0, 1) is different from (0, 2), even though both
have a zero in common. The points (1, 0) and (2, 0) are distinct from both each
other and from (0, 1) and (0, 2), even though all four points incorporate a zero
value somewhere. A space of two degrees of freedom, thus, allows for a greater
variety of locations than are possible with only one degree of freedom.

In this section we will look at a few familiar spaces in terms of their dimension.
We will also give passing consideration to other properties, such as distance and
area, but our primary concern will be with dimensionality and its consequences.

Lineland
• A point in one dimension requires only one number to define it.
• The number line is a good example of a one-dimensional space.
• Line segments are objects that connect two points.
• Distance in a one-dimensional space is found by taking the difference of the
positions of two distinct points.

Let us start by examining a one-dimensional space with which we are all
familiar, the number line.

Life in a one-dimensional (1-D) space is, well, just not that interesting. If you
were a point in 1-D space, all that we would need to pin down your exact position
is one number. That number would simply be how far you were, in whatever
units we’re using, from some agreed-upon reference point. The units could be
whatever we choose, as long as they are uniform. For our present discussion,
we’ll simply use the term “units.” The reference point is assigned the value of
zero and is more commonly known as “the origin.”

If we take two points in 1-D space and connect them, we form a line segment.
This line segment has a property that no single point has, length. The length of
a line segment in 1-D space can be found from the positions of the two endpoints
via subtraction.
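
As a minimal sketch of this subtraction (the function name `length_1d` is illustrative, not from the text), the absolute value is taken so that the length does not depend on which endpoint is listed first:

```python
def length_1d(a, b):
    """Length of the segment joining positions a and b on the number line."""
    return abs(b - a)


print(length_1d(2, 7))   # 5
print(length_1d(7, 2))   # 5
print(length_1d(-3, 4))  # 7
```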

That’s about all the “news” from one-dimensional space. Forwards or
backwards, this side of the origin or that side, long or short line segments:
these are pretty much the only things we could possibly care about if our
world were one-dimensional. So, let’s move on to explore a significantly more
interesting place, two-dimensional space.

Flatland
• Points in two-dimensional space require two numbers to specify them
completely.
• The Cartesian plane is a good way to envision two-dimensional space.
• Distance in the Euclidean version of two-dimensional space can be
calculated using the Pythagorean Theorem. One way that different spaces
are distinguished from one another is by the way that distance is defined.

In a two-dimensional (2-D) world, we have an added degree of freedom over a
one-dimensional world. One number is no longer enough to specify a unique
location. For instance, on the Cartesian plane a value of 3 on the horizontal
axis can be paired with many different vertical values, and each pairing
defines a different, unique location in the space. Because the
horizontal and vertical directions are “measured” completely independently of
each other, we need two numbers to pin down a location in 2-D space.


Also, the origin now is not only the reference point for the horizontal axis, as
with the number line, but also for the vertical axis. It, too, requires two numbers
to define its location, so we define the origin as the point (0, 0). Notice now that
the question of direction is much more interesting than in 1-D space. In one
dimension, you can only go back and forth, but in two dimensions you can go
back and forth, up and down, or any combination of these.

Imagine that we have a line segment that starts at the origin and goes to (3, 4).

It’s obvious that this line segment has neither a strictly vertical nor a
strictly horizontal orientation, but rather some hybrid of the two directions.
Furthermore, finding the length of this segment is now not a simple subtraction
problem, as before. We can, however, still determine a length by examining the
line segment’s directional components.

The components of the line segment can be thought of as its “shadows”
on the horizontal and vertical axes. This idea of finding a shadow will help
us in understanding how objects with components in multiple independent
dimensions can be visualized, but we’ll get to that a little later.

Notice that the line segment forms the hypotenuse of a right triangle whose legs
are the horizontal and vertical components. This means that we can determine
the length of the line segment—or, in other words, the distance from the origin
to (3, 4)—by using the Pythagorean Theorem.

$$(\text{horizontal component})^2 + (\text{vertical component})^2 = (\text{hypotenuse})^2$$

If we rewrite this, taking the square root of both sides, we get:

$$\text{hypotenuse} = \sqrt{(\text{horizontal component})^2 + (\text{vertical component})^2}$$

So, plugging in the horizontal value, 3, and the vertical value, 4, we get the
familiar $5 = \sqrt{3^2 + 4^2}$ for the length of our line segment.

The fact that we can use the Pythagorean Theorem to calculate the distance
between two points means that the version of 2-D space that we have been
studying is Euclidean. There are other ways to define distance, and this turns
out to be a good way to distinguish between spaces that, although they have the
same dimension, exhibit different behaviors.
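
The calculation for the segment from the origin to (3, 4) can be sketched in a few lines. The function names are illustrative. The second distance, often called taxicab distance, is not discussed in the text; it is included only as one assumed example of "another way to define distance" on the same set of points.

```python
import math


def components(start, end):
    """Return the horizontal and vertical "shadows" of a segment."""
    return end[0] - start[0], end[1] - start[1]


def euclidean_distance(start, end):
    """Length of the segment via the Pythagorean Theorem."""
    horizontal, vertical = components(start, end)
    return math.sqrt(horizontal**2 + vertical**2)


def taxicab_distance(start, end):
    """An alternative distance (an illustration not taken from the text):
    the sum of the absolute horizontal and vertical components."""
    horizontal, vertical = components(start, end)
    return abs(horizontal) + abs(vertical)


print(euclidean_distance((0, 0), (3, 4)))  # 5.0
print(taxicab_distance((0, 0), (3, 4)))    # 7
```

The two functions agree on which points exist but disagree on how far apart they are, which is the sense in which spaces of the same dimension can behave differently.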

Spaceland
• The concepts of distance and angle extend naturally into three dimensions.
• The way in which we extend our thinking from two to three dimensions
provides us with a template for thinking about higher dimensions.
• Each time we consider a new degree of freedom, we introduce a new
property that cannot exist in lower dimensions. Area (for 2-D) and volume
(for 3-D) are examples.

We have seen that in the 2-D world, horizontal and vertical directions are
independent dimensions. To think about a 3-D world, we need one more
direction that can change independently of horizontal and vertical changes. We
know this direction as movement “toward” or “away.” For simplicity’s sake,
from here on out we will follow convention and represent horizontal distance by
the letter x, vertical distance by the letter y, and distance toward (the “positive”
direction) or away (“negative”) by the letter z.

Notice that using just two numbers won’t uniquely specify a point in this
space. For instance, the designation (3, 4) pins down a location only in
the xy-plane; it tells us nothing about location in the z-direction, or in other
words, how near to us or how far from us the point is. In fact, in 3-D space
(3, 4) defines a line, one that is parallel to the z-axis. In other words, because
no z value is specified, the assumption is that z can take on any value, from
positive infinity to negative infinity. By contrast, (3, 4, 12) does indeed
designate a uniquely defined point in three dimensions.

In the 2-D world, we saw that we could use the Pythagorean Theorem to find
the distance from one point to another. Does it also work in the 3-D world?
Let’s see.

To find the distance from the origin to (3, 4, 12), we can imagine two right
triangles:

The first triangle is formed in the xy-plane, with its hypotenuse being the
line segment that extends from the origin to (3, 4). We saw earlier that the
length of this hypotenuse can be calculated directly from the Pythagorean
Theorem:

$3^2 + 4^2 = 5^2$

Thus, the hypotenuse of the first triangle measures 5 units. This line segment
now becomes a base of the second triangle, with vertices at the origin, (3, 4, 0),
and (3, 4, 12).

Again, we can use the Pythagorean Theorem to find the length of the
hypotenuse.

$5^2 + 12^2 = 13^2$

So, the length of the line segment from the origin to (3, 4, 12) is 13 units.
Notice that if we plug in the expression for the square of the first hypotenuse
into the expression for the second hypotenuse, we get:

$3^2 + 4^2 + 12^2 = 13^2$

More generally:
$(\text{the } x \text{ distance})^2 + (\text{the } y \text{ distance})^2 + (\text{the } z \text{ distance})^2 = (\text{total distance})^2$

This shows us that the Pythagorean Theorem generalizes quite nicely from the
2-D world to a 3-D world. In fact, we could continue this development into 4-D,
as we will soon see.
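The generalization described above can be sketched in a few lines of Python. This is only an illustration: the function name `euclidean_distance` is our own choice, not something from the text, and it simply sums the squared coordinate differences for however many coordinates the points have.

```python
import math

def euclidean_distance(p, q):
    """Distance between two points that have the same number of coordinates."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of coordinates")
    # Sum the squared differences coordinate by coordinate, then take the root.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

print(euclidean_distance((0, 0), (3, 4)))         # 5.0
print(euclidean_distance((0, 0, 0), (3, 4, 12)))  # 13.0
```

Because the function never refers to a specific number of coordinates, the same code would accept points with four or more coordinates as well.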

As we stated earlier, the addition of each new dimension to a space introduces
a new property that lower-dimensional spaces don’t have. For instance, in
2-D space we can have not only line segments but also planar shapes, such
as squares and discs, which exhibit the new property of “area.” Similarly, 3-D
space introduces the property of volume. Shapes with the property of volume,
called solids, are not possible in any space with fewer than three dimensions.

Also, note that we have been referring to dimension primarily as a spatial
measure, but it doesn’t have to be. Any quantity that can be measured
independently of others qualifies as a dimension. So, imagine that we have
a particle at a particular location in 3-D space. We might be concerned with
other properties of this particle besides its three spatial coordinates, such
as its mass, charge, or color. If we included each of these three independent
measures as basic attributes in our description of the particle, we would have
a six-dimensional object; that is, it would be uniquely determined in a space
of six dimensions. Such a space is not very easy to visualize, but it presents no
problems mathematically. We simply realize that it is the space that contains
all sets of six numbers. Only three of those numbers are spatial coordinates,
but we don’t necessarily need to limit ourselves to these. We have seen that
ideas from lower-dimensional spaces generalize quite nicely as we step up
to higher-dimensional realms. We can use this idea to leverage our intuitive
understanding of lower-dimensional spaces to spaces of four dimensions
and higher.

SECTION 5.3
Journey into the Fourth Dimension
• Is Time the Fourth Dimension?
• Hyperland
• The Hypercube
• Ways To Envision Four Spatial Dimensions

The idea that there are levels of reality that are normally inaccessible in our
daily lives is an ancient one. Mathematicians of the mid-nineteenth century
brought this ancient fascination into the modern age with their study of spaces
of four dimensions and higher. There are a few ways to interpret what we
mean by “the fourth dimension,” but they all boil down to considering another
degree of freedom that is independent of the three spatial dimensions that
we have defined. After just a few years of running and jumping around, we all
develop a pretty good intuitive sense of three dimensions, but imagining a fourth
independent “direction” can pose somewhat of a challenge. Perhaps the most
intuitive way to conceive of this dimension is to think about it as time.

Is Time the Fourth Dimension?


• Time is often thought of as the fourth dimension.
• Time plays a key role as a dimension in mathematical formulations of
physical laws and theories such as general relativity and string theory.
• The qualitative behavior of time as the fourth dimension is debatable.

Viewing time as the fourth dimension is appealing for a number of reasons.
The first is that we naturally have experience with time coordinates. When we
tell someone we will meet them for coffee at 3 P.M., we are specifying a point
in time. However, to increase the odds that the meeting actually occurs, we
also need to specify a place. So, establishing the meeting uniquely requires
three spatial coordinates and one time coordinate. You might say, “Meet me at
3 P.M. on the fifth-floor terrace of the building on the northwest corner of 3rd
Street and 4th Avenue,” for example. Of course, it is possible for time to change
independently of the spatial coordinates—all you have to do is sit relatively still
and your time coordinate will change while your position will not. So, if your
friend is late, you can maximize your chances of still meeting the person by
waiting at the correct spatial coordinates as the time coordinate continues to
change.

There are a couple of problems with considering time the fourth dimension,
however. The first is that you aren’t entirely free to “move around” in the time
dimension. In fact, you are pretty much stuck moving forward at a rate that
you cannot control (but that, according to Einstein, is not necessarily the same
for everybody). So, time allows only a partial degree of freedom. The second
problem is that, while you can change your time coordinate without changing
your spatial coordinates, the reverse is not true: how could you move from point
A to point B without a passage (i.e., change in “position”) of time?

So, time’s role as a fourth dimension may be debatable on some philosophical
level, but for practical purposes, it works quite well. In fact, Einstein treated
time as inseparable from the three dimensions of space and gave us the
concept of “spacetime,” which is the four-dimensional equivalent of a surface,
something that we discuss in some depth in other units. This spacetime,
however, is curved by massive objects, which suggests that there might be a fifth
dimension that allows this curvature to take place. While this may seem
mind-boggling, string theory, one attempt by physicists to unify the fundamental
laws of the universe, is even more of a stretch. Depending on which version of
string theory you adopt, you will be asked to envision a space with between 10
and 26 dimensions. At some point, this just seems like the stuff of science
fiction, and a perfectly rational question would be: what are these higher
dimensions? Are they spatial?

Hyperland
• A point in four-space, also known as 4-D space, requires four numbers to fix
its position.
• Four-space has a fourth independent direction, described by “ana” and
“kata.”
• In Euclidean four-space, our standard notions of Pythagorean distance and
angle via the inner product extend quite nicely from three-space.

Before we get carried away by trying to comprehend a world of many
dimensions, we can start by considering what a fourth spatial dimension would
be like. Let’s back up and think about how we expanded our thinking through
the lower-dimension worlds that we introduced previously. Remember that
we used familiar concepts from the 2-D world to understand the 3-D world, so
perhaps we can use concepts from the 3-D world to understand the 4-D world.

First off, to specify a point in four-space, we need four numbers.

Consequently, a designation such as (1, 2, 3) does not pin down a unique point
in four-space; it would, in fact, designate a line parallel to the fourth axis,
which we’ll call the w-axis. In four-space, the w-axis is perpendicular to the
x-, y-, and z-axes.

Now we’ve created a visualization problem. Most people are not accustomed to
thinking about a fourth axis in the space around us, and representing it poses
a challenge. To produce a visual model, we have to rely upon an illusion. This
should not overly concern us, however—we already do this when we depict a 3-D
object on a 2-D piece of paper or computer screen. For example, to represent
the third dimension, the z-axis, on a flat piece of paper (or a screen), the
convention is to draw a diagonal, dashed line in the xy-plane—we then use our
imaginations to view this line as “coming out of” the page.

To draw the fourth dimension, the w-axis, on a flat page also requires an illusion
and our imaginations. Let’s draw another line in the xy-plane and imagine that
it is “coming out of” the 3-D space that we already have in mind. In some ways,
we’re creating an illusion within an illusion.

Before you are tempted to dismiss this as hocus-pocus, consider that the
mathematics is rock solid; it is only our habitual perception that is troubling us.
This is an interesting case of how techniques from mathematics can help us to
think about things that are difficult for our natural faculties of perception.

Remember that our conception of movement in the third dimension is “toward”
and “away.” If it helps you, think of this new, fourth degree of freedom as “in”
and “out.” Some mathematicians, however, prefer the terms “ana” and “kata,”
the Greek words for “up” and “down,” respectively, to represent the directions
one can move on the fourth axis.

Four-space has the capacity for all the configurations associated with lower
dimensions—lines, angles, planar shapes, and solids. Also, in the Euclidean
view of four-space, it’s possible to find the distance between two points by using
a straightforward extension of the Pythagorean Theorem.
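As a worked illustration of that extension, the block below writes out the four-dimensional distance formula in the same form as the 3-D version used earlier. The sample point (1, 2, 2, 4) is our own choice, picked only because the arithmetic comes out to a whole number.

```latex
% Distance between (x_1, y_1, z_1, w_1) and (x_2, y_2, z_2, w_2) in Euclidean four-space
d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2 + (w_2 - w_1)^2}

% Example: from the origin to (1, 2, 2, 4)
d = \sqrt{1^2 + 2^2 + 2^2 + 4^2} = \sqrt{25} = 5
```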

The Hypercube
• The hypercube is the four-dimensional analog of the cube, square, and line
segment.
• A hypercube is formed by taking a 3-D cube, pushing a copy of it into the
fourth dimension, and connecting it with cubes.
• Envisioning this object in lower dimensions requires that we distort certain
aspects.
• A 3-D net of eight cubes can be “folded up,” using the fourth dimension, to
create a hypercube (also known as a tesseract).

You may recall that our “new” fourth dimension must introduce a quantifiable
property that has not yet existed in any of the lower dimensions; this is simply
a prerequisite of a degree of freedom. Objects in four-space have a property,
analogous to area and volume, that we call “hyper-volume.” Possibly the most
famous object with this property is the hypercube. To prepare to understand it,
let’s first look at how we formally construct “normal” squares and cubes.

First, to create a square in two dimensions, or a cube in three dimensions, we
start with the analogous object from the dimension that is one lower. That is, we
use parallel line segments, joined by perpendicular line segments, to create the
square. To create the cube, we use parallel squares connected by perpendicular
squares.

So, to create the hypercube, we start with a cube in 3-D space; then we create
another cube at a distance equal to the side-length of the original cube along
the w-axis. These two cubes can be thought of as being parallel in the same way
that the opposite sides of a square or the opposite faces of a cube are parallel.

Creating a hypercube by pushing.

Think back: to make a square, we connected the endpoints of two parallel
line segments using line segments of equal length; and to make a cube, we
connected the edges of two parallel squares with squares of equal shape. So,
to construct a hypercube, we will connect the faces of our parallel cubes with
cubes of equal size. It should be clear that connecting all the faces of our two
parallel cubes requires six “connector” cubes. Consequently, the hypercube is
made up of eight regular cubes that are “glued together” such that all of their
faces are attached to one another. Trying to visualize this can truly turn one’s
brain inside out, but here’s a progression of images that might help:

If it helps, imagine constructing a cube from this 2-D plan, or pattern, which is
called a “net”:

To build the 3-D object from the 2-D net, you simply fold and glue the
appropriate edges together.

We can think of the following shape as a 3-D net that can be folded up to make a
hypercube:

To create the hypercube, we need to fold the net and glue the appropriate faces
to one another. In our 3-D space, this requires that we “smush” and stretch the
cubes, but were we doing this in 4-D space, no deformation would be necessary.
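The count of eight cubes in a hypercube can be checked with a short calculation. The sketch below relies on a standard counting formula that the text does not state: an n-dimensional cube contains C(n, k) × 2^(n − k) pieces of dimension k. The function name `cube_elements` and the labels in `names` are ours, chosen for illustration.

```python
from math import comb

def cube_elements(n):
    """Count the k-dimensional pieces of an n-dimensional cube, for k = 0..n."""
    # Choose which k directions a piece extends in, then fix each of the
    # remaining n - k coordinates at one of its two extreme values.
    return [comb(n, k) * 2 ** (n - k) for k in range(n + 1)]

names = ["vertices", "edges", "square faces", "cubes", "hypercubes"]
for n in (2, 3, 4):
    print(n, dict(zip(names, cube_elements(n))))
```

For n = 4 the list is 16 vertices, 32 edges, 24 square faces, 8 cubes, and the single hypercube itself, which agrees with the two parallel cubes plus six connector cubes described earlier.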

Ways To Envision Four Dimensions
• A viewer from the fourth dimension would see both our insides and our
outsides simultaneously.
• Higher-dimensional viewing allows all sides of an object to be seen
simultaneously.
• Artists such as Picasso and Duchamp have used the concept of higher-
dimensional viewing in their works.

Four-dimensional space has some rather strange properties. To imagine what
some of these might be like, let’s again use a lower-dimensional analogy. Let’s
say that a square in 2-D space has both a defined front and a defined back. If we
were in the plane with the square, we would not be able to see its back if we
were looking at its front.

However, if we raise ourselves up off of the plane, we can simultaneously see
both the front and the back, as well as the interior, of the square. We may think
this is no big deal, but the higher-dimensional extension of this thinking can
be quite unnerving.

If a four-dimensional being were to look at us, they could see all sides of us
simultaneously. Plus, they would be able to see our “interiors.” Now, the
interior part is a bit hard to visualize, but we can imagine seeing something
from all angles simultaneously. Anyone who has constructed a 360-degree
photo landscape has some idea of what a four-dimensional being would see in
our 3-D world.

ITEM 2967 / Oregon Public Broadcasting, created for Mathematics Illuminated, VIEW OF A 4-D BEING; NOTE THAT
THE TREE IS THE SAME ON THE RIGHT AND LEFT EDGE (2008). Courtesy of Oregon Public Broadcasting.

This idea of seeing something from multiple angles simultaneously can
be found in much of the art from the early twentieth century. The cubists,
including Pablo Picasso and Marcel Duchamp, were very much influenced by the
mathematical exploration of higher dimensions.

We have now seen how a fourth spatial dimension can exist in the mental realms
of both mathematics and art. Whether or not it exists in the real world is a
matter for science to settle. To prove it, we would have to observe phenomena
that cannot be explained in the absence of a fourth spatial dimension.
Regardless of whether a fourth spatial dimension is physically real, however,
mathematical reasoning has shown that it is at least logically possible.

Mathematics provides tools with which we can explore and understand not
only the world of our senses, but also worlds we can conceive of only in our
minds. Higher-dimensional worlds are indeed possible for us to think about,
but we need certain tools in order to be able to say anything meaningful about
them. Analogies with lower-dimensional spaces represent one tool, the value
of which we have already seen in our earlier discussions. In the next section we
will learn about other mathematical techniques that we can use in our quest to
achieve a broader comprehension of dimension.

SECTION 5.4
Slices, Projections, and Shadows
• The Hypersphere
• Slicing the Hypercube
• Shadows in the Cave

In 1884 Edwin A. Abbott published a novel about the concept of higher
dimensions entitled Flatland: A Romance of Many Dimensions. His novel
chronicled the adventures of A Square (a play on the author’s own name),
who resides in a two-dimensional world called “Flatland.” A Square is a
plane figure, and as such has only two degrees of freedom. He recognizes the
directions “left,” “right,” “forward,” and “backward,” but he has no concept of
“up” or “down.”

One day, A Square receives a visit from a being from the third dimension, A
Sphere. A Sphere “lifts” A Square out of Flatland so that he can experience a
three-dimensional world that was, up until that point, unthinkable. Abbott’s
book is a classic and is well worth reading, as its descriptions of how to think
about higher dimensions are still quite useful.

Let’s focus on one particular incident in the book, the part in which A Sphere
first makes contact with A Square. A Sphere introduces himself in this way:

I am not a plane Figure, but a Solid. You can call me a circle; but in reality
I am not a Circle, but an infinite number of Circles, of size varying from a
Point to a Circle of thirteen inches in diameter, one placed on the top of
the other. When I cut through your plane as I am now doing, I make in your
plane a section which you, very rightly, call a Circle. For even a Sphere—
which is my proper name in my own country—if he manifest himself at all
to an inhabitant of Flatland—must needs manifest himself as a Circle.1

A Sphere’s appearance in Flatland is an example of how we can use
lower-dimensional slices to get an idea of the structure of higher-dimensional
objects. If you’ve ever seen a topographical map, you have some idea of how such
“slices” are used to represent a 3-D landscape on a 2-D page.

The lines represent what are known as “level curves.” They are what we would
see were we to slice the landscape at different elevations. We can use a similar
slicing process to get a sense of the structure of objects in four dimensions.

Item 3085 / Brandon Laufenberg, TOPOGRAPHY [VECTOR] (2006). Courtesy of
iStockphoto.com/Brandon Laufenberg. A topographical map shows contour lines
that correspond to lines of constant altitude.

The Hypersphere
• A sphere can be thought of as a stack of circular discs of increasing, then
decreasing, radii.
• The process of slicing is one way to visualize higher-dimensional objects via
level curves and surfaces.
• A hypersphere can be thought of as a “stack” of spheres of increasing, then
decreasing, radii.

A sphere is a three-dimensional object, so it cannot be represented in two
dimensions in the same way that it is in three dimensions. We could try to use
an illusion, as we did when portraying the w-axis, or we could consider a series
of slices taken at different positions on the sphere, as A Square encountered A
Sphere in Flatland.

Note that any 2-D slice of a sphere is a circle. Let’s take a moment to look at
what this entails mathematically.

The equation for a sphere in three dimensions comes from its definition:
all the points in space that are a given distance from the center. Remember that
distance in this space is calculated by using the 3-D version of the
Pythagorean Theorem,

$$d = \sqrt{(\text{difference of } x \text{ coordinates})^2 + (\text{difference of } y \text{ coordinates})^2 + (\text{difference of } z \text{ coordinates})^2}$$

If we designate a point on the sphere as (x, y, z), and if we set the center at the
origin, this equation simplifies to:

$d^2 = x^2 + y^2 + z^2$

So, to our friend A Square, who has no notion of “z,” this will look like
$d^2 = x^2 + y^2$, which is the equation for a circle in the 2-D world. What
actually happened to the “z” dimension? Well, if we imagine that the size of the
circle in the plane depends on where exactly the plane is slicing the sphere,
then z must have something to do with the size of the circle.

Mathematically, we can see this by rearranging our sphere equation a bit to get:

$d^2 - z^2 = x^2 + y^2$

So, if z represents where the plane is slicing the sphere, the act of slicing
equates to holding z constant. We can readily see that smaller absolute values
of z will yield larger circles, assuming, of course, that z = 0 represents the slice
that passes through the exact center of the sphere.

These slices, also called “level curves,” equivalent to the lines on a
topographical map, are a useful way of thinking about how lower-dimensional
slices “stack up” to make a higher-dimensional object.
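To make the stacking concrete, the sketch below computes the radius of the circular slice at several heights z, using the rearranged sphere equation with z held constant. The function name `slice_radius` and the sample radius of 13 are our own choices for illustration.

```python
import math

def slice_radius(d, z):
    """Radius of the circle where the plane at height z cuts a sphere of radius d."""
    if abs(z) > d:
        return None  # the plane misses the sphere entirely
    # From d^2 - z^2 = x^2 + y^2: the right-hand side is the slice's radius squared.
    return math.sqrt(d ** 2 - z ** 2)

for z in (-13, -12, -5, 0, 5, 12, 13):
    print(z, slice_radius(13, z))
```

The radii grow from 0 at z = -13 to 13 at z = 0 and then shrink back to 0, which is the “increasing, then decreasing” stack of discs that A Square would observe.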

Let’s look at the case of the hypersphere, whose equation is just like that of the
sphere, with an added variable:

$D^2 = x^2 + y^2 + z^2 + w^2$

We can think of the hypersphere as a 4-D version of a sphere, just as a
hypercube is a 4-D version of a cube. Before taking a slice of the hypersphere,
let’s just rearrange the equation, as before, to get:

$D^2 - w^2 = x^2 + y^2 + z^2$

So, if we hold w constant, we will get a slice of the hypersphere.

$C = x^2 + y^2 + z^2$, where $C = D^2 - w^2$

Notice that this is just the equation for a sphere in three dimensions. So, our
“slice” is actually a three-dimensional object. To be precise, what we normally
think of as a three-dimensional sphere is really a two-dimensional surface; we
are not concerned with points on the interior.

To create a hypersphere, we would glue together all the slices from w = -D to
w = +D. This gluing and the resulting form are a bit hard to imagine, but looking
at the slices gives you some sense of the features of a hypersphere, such as the
observation that the volume of its slices decreases as you approach extreme
values of w.
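A short worked example may help here. Holding w constant in the hypersphere equation gives a sphere whose radius depends on w; the value D = 5 and the symbols r(w) and V(w) below are our own choices for illustration, and V(w) uses the standard formula for the volume enclosed by a sphere.

```latex
% Radius of the spherical slice at a fixed value of w
r(w) = \sqrt{D^2 - w^2}

% With D = 5:
r(0) = 5, \qquad r(3) = 4, \qquad r(4) = 3, \qquad r(5) = 0

% Volume enclosed by each slice, which shrinks to zero as |w| approaches D
V(w) = \tfrac{4}{3}\pi\, r(w)^3 = \tfrac{4}{3}\pi \left(D^2 - w^2\right)^{3/2}
```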

Taking slices of a hypersphere is relatively straightforward. We don’t need to
worry about how it is situated in relation to the slicing plane because it appears
the same from all angles; it exhibits radial symmetry. Might the same be true of
the hypercube? To find out, let’s first consider a regular cube.

Slicing the Hypercube
• Slicing a cube yields different types of polygons, depending on the angle at
which you slice. This is in contrast to the slicing of a sphere, which always
produces circles, regardless of the angle.
• Slices of a hypercube are various polyhedra, not just a series of cubes.
• Slices can miss crucial information about an object, such as whether or not
it is connected.

Just as a plane can be used to slice a sphere, we can also use a plane
to slice a cube. This time, however, the shape of the slice depends on the
orientation of the cube as it passes through the plane.

All three of the cubes shown are the same z-distance from the plane, but notice
that the slices are different! This is because the cube is positioned differently
in each example. Imagine slicing a block of cheese; the shape of your slice
depends on whether you are slicing a corner or a face and at what angle.

Imagine now a cube that is sliced perfectly through the middle by the xy-plane,
thus creating a square in the plane. Rotations in the xy-plane still give a square
and, were we to keep all other rotation angles constant, we could change the z
value from $-\frac{d}{2}$ to $+\frac{d}{2}$ (taking d to be the side length of the
cube) while rotating the cube, and we would always have the same-sized square,
albeit a rotated one. This kind of rotation would be fathomable for a Flatlander.

However, if we rotate the cube in the xz- or yz-planes, the shape of the slice
changes. The most extreme example of this would be to imagine what the slices
of a cube would look like if it were to enter the plane vertex first. It might look
like this:

These are some of the two-dimensional slices of a three-dimensional cube.
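The vertex-first case can be explored numerically. The sketch below makes several assumptions of its own that are not in the text: the cube is the unit cube with corners at coordinates 0 and 1, the slicing plane is x + y + z = t (perpendicular to the cube's long diagonal), and the function name `slice_unit_cube` is hypothetical. It finds where the plane crosses the cube's edges; the number of crossing points is the number of corners of the slice.

```python
from itertools import product

def slice_unit_cube(t):
    """Points where the plane x + y + z = t crosses the edges of the unit cube."""
    vertices = list(product((0, 1), repeat=3))
    points = set()
    for a in vertices:
        for b in vertices:
            # An edge joins two vertices that differ in exactly one coordinate.
            if a < b and sum(p != q for p, q in zip(a, b)) == 1:
                sa, sb = sum(a), sum(b)
                if min(sa, sb) <= t <= max(sa, sb):
                    s = (t - sa) / (sb - sa)  # fraction of the way from a to b
                    point = tuple(round(p + s * (q - p), 9) for p, q in zip(a, b))
                    points.add(point)
    return points

for t in (0.5, 1.5, 2.5):
    print(t, len(slice_unit_cube(t)))
```

Under these assumptions the slice has three corners (a triangle) near either end of the diagonal and six corners (a hexagon) in the middle, so the polygon itself changes as the cube passes through the plane.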

In a similar way, the slice of a hypercube will depend on its orientation in the
xw-, yw-, and zw-planes. Here is a sequence of images representing 3-D slices
of a hypercube entering our space, vertex first:

These are some of the three-dimensional slices of a four-dimensional hypercube.

So, we have seen that taking slices can help give us some idea of how four-
dimensional objects behave. Because slices are often incomplete pictures,
however, they necessarily miss many features of an object, depending on how
the slice is taken.

If we extend this thinking to a four-dimensional being intersecting our 3-D
world, we would perceive something like this:

This 4-D creature does indeed have a continuous body, but the connections are
all situated outside of 3-D space, as with the preceding hand example. An
extra dimension can provide connections and paths that are not available in
lower dimensions. An interesting side note is that going into this fourth
dimension does not somehow shrink the distance in 3-D space; it simply
allows a being to circumvent 3-D barriers. So, although going into “hyperspace”
to travel among the stars, as many a sci-fi character has done, does not
necessarily mean you can get anywhere more quickly, it does mean you won’t
have to worry about running into any objects along the way.

Shadows in the Cave


• Projections are like shadows.
• Projections are related to the inner product.
• Projections preserve more information than slices, but they necessarily
distort the picture in some way.

An alternative way to view a higher-dimensional object in lower dimensions is
through a projection. There are many different techniques of projecting, but the
one that we will examine is probably the most intuitive: we’ll simply ignore a
dimension.

To project a square, a fundamentally 2-D object, onto a lower-dimensional
space, the number line, we imagine a sort of transparent shadow that it casts on
the line.

A similar process can be used to project a 3-D cube onto a 2-D plane.

We could also, if we wanted to, project a 3-D cube onto a 1-D line. To do this,
we would first project the cube onto the plane, then project the resulting planar
shape onto the line, as we did with the square.

Double projection

Representing a hypercube on a flat page requires a similar double projection.
First, we project the original 4-D object onto a 3-D object; then we project
the 3-D object onto the 2-D page. The result is quite different from what we
would see were we somehow able to view the hypercube in four dimensions,
but it does convey important information about its structure.

A double projection of a hypercube

We can think of a projection as the flattening of an object. Consider how you
can flatten a flower or leaf by placing it between the pages of a thick book. The
result captures much about the essential shape of the object while, at the same
time, distorting it in some fashion.
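The “ignore a dimension” style of projection is simple to sketch in code. In the illustration below, the function name `project` and the choice of a unit hypercube with corners at coordinates 0 and 1 are our own assumptions; the function just discards one coordinate from every point.

```python
from itertools import product

def project(points, axis):
    """Project points by discarding the coordinate at index `axis`."""
    return {p[:axis] + p[axis + 1:] for p in points}

hypercube = set(product((0, 1), repeat=4))  # 16 corners, written (x, y, z, w)
cube = project(hypercube, 3)                # ignore w: 4-D down to 3-D
square = project(cube, 2)                   # ignore z: 3-D down to 2-D
print(len(hypercube), len(cube), len(square))  # 16 8 4
```

Each projection makes pairs of corners land on the same spot, which is one concrete form of the distortion described above. Printed pictures of hypercubes typically project at an angle instead, so that all 16 corners remain distinguishable.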

These techniques, slices and projections, can come in handy when trying to
understand what higher-dimensional spatial objects are like. We said earlier,
however, that dimensions need not necessarily be spatial. We will now turn our
attention to some, possibly surprising, uses of dimension in our own, normal,
three- (or four-, or five-, or more) dimensional experience.

SECTION 5.5
Many Dimensions in Everyday Life
• Dimensions of Personality
• Love in 30 Dimensions

Dimensions of Personality
• Dimension can be used as a rough way to quantify certain aspects of human
nature.

You’ve probably heard the expression “one-dimensional” used to describe
someone or something that lacks a certain “depth” of character or complexity.
For example, a puppy could be described, more or less, as a creature that only
wants to play, sometimes more, sometimes less.

Stereotypes such as the husband who cares only about sports, or the daughter
whose only concern is her shoes, offer human examples of this conception
of one-dimensionality. One would hope that most people are not so simply
described, however.

Taking a broader view of our puppy, we could say that she is also concerned with
her hunger level. Given those two primary interests, to describe the puppy at
any point in time, we would need two numbers, one representing the desire to
play and the other representing the desire to eat.

These two axes serve as the basis for a two-dimensional plane. The points in
the plane correspond to different states of our puppy. So, (1, 9), for instance,
would represent a puppy that doesn’t want to play much but that is extremely
hungry. On the other hand, (9, 1) would represent a puppy that, perhaps, has just
eaten and now is full of energy!

Now that we have the general idea, let’s look at a three-dimensional case
involving very simple humans. Let’s say that these humans have three
measurable characteristics: affinity for low-budget movies, truthfulness, and
energy level.

Every human can be classified somewhere in this space, depending on one’s
respective values for the three characteristics. Now, we could ask, “what does
it mean for two people to be close to one another in this space?” (Remember
that this is not space-space, but rather “characteristic-space.”) The best way to
think about this is to think about the points corresponding to each person’s
profile.

Let’s say that person A is represented at (1, 9, 9) and person B is represented
at (0, 9, 8). This means that person A doesn’t like low-budget movies much, is
very honest, and has very high energy. Person B can’t stand low-budget movies,
is very honest, and has high energy. Judging by these characteristics, these two
people might get along pretty well. As a rough approximation of their
“compatibility,” we can find the distance between their profile points in
characteristic space by using the 3-D version of the Pythagorean Theorem.

$$\text{Distance} = \sqrt{(\text{difference in } x)^2 + (\text{difference in } y)^2 + (\text{difference in } z)^2} = \sqrt{(1-0)^2 + (9-9)^2 + (9-8)^2}$$

This equals a distance of $\sqrt{2}$, or approximately 1.41, which is very close.

Unit 5 | 33
UNIT 5 Other Dimensions
textbook

SECTION 5.5 What would we expect of two people who were far apart in this characteristic
space? For example, let’s consider person C, represented at (1, 0, 0): this
Many Dimensions person hates low-budget movies, lies like a rug, and spends all day on the
in Everyday Life couch. Person D, represented at (9, 9, 9), loves low-budget movies, always tells
CONTINUED
the truth, and works out every day. We can intuitively guess right away that
these two probably won’t get along; let’s see what the distance between them
would be:

( ) ( ) ( )
 1-9 + 0-9 + 0-9 
2 2 2

Distance =
 

This expression corresponds to a distance of about 15.5, quite a bit larger than
that of the first couple. Of course, in this case, we are looking at only three
aspects of a person’s life. It’s hard to imagine that this would be enough degrees
of freedom to come anywhere close to capturing an accurate description of
somebody mathematically.
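
The two distance calculations above can be reproduced in a few lines of Python. This is only a minimal sketch: the function name `profile_distance` and the variable names are illustrative choices, not part of the text, and the coordinates are assumed to be ordered as (movie affinity, truthfulness, energy level).

```python
import math

def profile_distance(p, q):
    """Euclidean distance between two profile points of equal dimension."""
    if len(p) != len(q):
        raise ValueError("profiles must have the same number of characteristics")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Profiles are (movie affinity, truthfulness, energy level), as in the text.
person_a = (1, 9, 9)
person_b = (0, 9, 8)
person_c = (1, 0, 0)
person_d = (9, 9, 9)

distance_ab = profile_distance(person_a, person_b)  # sqrt(2)
distance_cd = profile_distance(person_c, person_d)  # sqrt(226)

print(f"A-B: {distance_ab:.2f}")
print(f"C-D: {distance_cd:.2f}")
```

Because the function sums over however many coordinates it is given, the same code works unchanged for profiles with more than three characteristics.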

Love in 30 Dimensions
• A 30-question survey can be used to create a 30-dimensional profile of a
person.
• People can be matched according to their distance from each other in 30-
dimensional space.

One of the great things about the Internet is its capacity to connect people with
the things that they want or need. Many websites collect information about
people and then make recommendations as to what book they should read, what
music they should listen to, and even whom they should date. Services such as
these, however, use many more than just three measurements or dimensions
to quantify a person. They typically construct a many-dimensional profile of a
person and put it into what is called a “feature vector.” This process basically
uses information that a person provides to assign that person to a point in a
multi-dimensional space.

Let’s examine the case of an online dating service. As of this writing, one
popular service uses 30 dimensions to quantify a person. The person is then
assigned a point in 30-dimensional space. Users then answer questions about
their ideal match, thereby creating a virtual 30-dimensional profile. Individuals
who are “close” to this person’s ideal match profile in 30-dimensional space are
considered to be potential romantic matches.

Now, the efficacy of this method could be debated: real humans are not
necessarily well-described by only 30 characteristics. Furthermore, not all
traits are as important as others; smoking might be a deal breaker, whereas
snoring might not be so bad. Nuances such as these are missed by the rough,
all-characteristics-are-equal, 30-D distance model. Nonetheless, this system is
an example of how many-dimensional objects are at play in our daily lives.
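
Both the matching idea and the objection that all characteristics are treated equally can be sketched in code. The example below is a hypothetical illustration, not the method of any actual dating service: the names `weighted_distance` and `closest_match`, the randomly generated profiles, and the weights are all assumptions made for the sake of the example. Setting every weight to 1 gives the plain 30-D distance; raising one weight makes differences in that trait count for more.

```python
import math
import random

def weighted_distance(p, q, weights=None):
    """Distance between two profiles; weights default to 1 for every trait."""
    if weights is None:
        weights = [1.0] * len(p)
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, p, q)))

def closest_match(ideal, candidates, weights=None):
    """Return the name of the candidate whose profile is nearest to `ideal`."""
    return min(
        candidates,
        key=lambda name: weighted_distance(ideal, candidates[name], weights),
    )

DIMENSIONS = 30
rng = random.Random(0)  # fixed seed so the made-up data is reproducible

# Hypothetical survey answers on a 0-9 scale for three made-up users.
candidates = {
    name: [rng.randint(0, 9) for _ in range(DIMENSIONS)]
    for name in ("user_1", "user_2", "user_3")
}
ideal = [rng.randint(0, 9) for _ in range(DIMENSIONS)]

print("Equal weights:", closest_match(ideal, candidates))

# Treat trait 0 as a potential deal breaker by weighting it 25 times as heavily.
weights = [25.0] + [1.0] * (DIMENSIONS - 1)
print("Trait 0 emphasized:", closest_match(ideal, candidates, weights))
```

Weighting is only one possible refinement; it is shown here because it addresses the deal-breaker objection raised above while leaving the underlying notion of distance intact.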

In this example, we used the idea that distance between points is a concept that
generalizes no matter what dimension of space we are in. We saw in a previous
section that this works for two- and three-dimensional spaces, and we can use
the same method to show that it works in four dimensions as well. Of course,
we can’t empirically verify a distance in four or more dimensions, but the math
works. This exemplifies an important idea in mathematics: concepts from
spaces or things that we do understand can be expanded to help us grasp spaces
and things that we have no hope of experiencing first hand. This boils down to
the belief that once we have a good idea, we can “trust the math” in carrying its
application to new contexts. This lights the way forward, as we now turn to a
completely different, and equivalent, way to think about dimension.

SECTION 5.6
Scaling and the Hausdorff Dimension

• Rethinking Dimension
• The Koch Curve
• Fractal Snowflakes

Up until this point, we have been thinking of dimension as the number of
independent measurements that are required to define a particular object
in a particular space. We will now, through the application of mathematical
concepts, see how the dimension of an object can be defined without regard
to numbers that we measure independently. This new capacity will enable us
to examine and describe new and fascinating objects that would otherwise
baffle us.

Rethinking Dimension
• One-dimensional, two-dimensional, and three-dimensional objects behave
differently as they scale—that is, as they expand or shrink.
• We can write an expression for dimension based on scale factor and the
number of self-similar copies.

Let’s return to the one-, two-, and three-dimensional worlds that we explored
earlier. Recall the basic object in each dimension: the line segment, the square,
and the cube, respectively. Now we’re going to observe these objects as they
undergo a process known as “scaling”; basically, we’ll explore how each object
changes as we shrink or enlarge it by a constant factor.

First up, the line segment—let’s look at a segment of length one unit.

If we were to triple the size of this object, we would have a line segment of
length three units. We could view this result as three copies of our original
line segment. So, we see that if we scale the line segment by a factor of three,
we end up with three copies of the original. Each of these copies is said to be
“self-similar” to the original segment.

Now, let’s do the same thing with a square whose sides are each one unit in
length.

To increase the size of this object by a factor of three, we have to lengthen
both the horizontal and vertical elements (or else it won’t be a square
anymore). When we do this, “scaling up” each segment by three, we get an
entirely different relationship than we got with the scaling of the line
segment.

Notice that our new shape is not made up of three copies of the original, but
rather nine! This is an important property of area: it does not scale linearly with
the side length. When we double the side length of a square from 3 units to 6
units, the area does not just double—it quadruples!

$$\text{Initial area} = 3 \times 3 = 9 \text{ units}^2$$

$$\text{Final area} = 6 \times 6 = 36 \text{ units}^2$$

$$\text{Ratio of final area to initial area} = \frac{36}{9} = 4$$

Returning to our example square, notice that if we scale the side length by three,
the resulting object is made up of nine copies of the original. Note that $9 = 3^2$.
In words, when a square is scaled, the number of self-similar squares in the
resulting square is equal to the scale factor to the second power.

Now, let’s look at the basic three-dimensional object, the cube.


This time, as we scale the side length by a factor of three, we
have to take three perpendicular directions into account.

So, if we increase the side length of a cube systematically by a factor of three,
the volume increases by a factor of 3 × 3 × 3, or 27. This means that volume
scales not linearly, and not as the square of side length (as does area), but,
rather, as the cube of side length. Furthermore, notice that each of the new
cubes generated is self-similar to the original cube. So, we have $27 = 3^3$,
verifying that the number of self-similar copies is equal to the scale factor to the
third power.

This last point is important for any budding sculptors. If you wish to make
a large version of a small figurine, you would do well to make sure that the
figure’s legs are strong enough to hold up its disproportionately heavier mass!

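Let’s organize our results from the scaling of these three objects:

Object         Dimension   Scale factor   Self-similar copies
Line segment   1           3              3 = 3^1
Square         2           3              9 = 3^2
Cube           3           3              27 = 3^3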

Notice that the exponent in each case is equal to the dimension of the object
being scaled. Let’s generalize this.

N = number of self-similar copies
S = scale factor
D = dimension

$$N = S^D$$

So, if we want to develop an equation that yields the dimension of an object
when we know how many self-similar copies it has as it scales, we should solve
the equation above for D. To bring D out of the exponent position, we can use
the natural logarithm, which comes in quite handy whenever we need to deal
with exponents or convert powers to multiplication, or convert multiplication to
addition. So, taking the natural logarithm of both sides, we get:

$$\ln N = D \ln S$$

Dividing both sides by ln S, we get:

$$D = \frac{\ln N}{\ln S}$$

This equation can be used to determine the dimension of an object based
solely on its properties of scaling and self-similarity. Something similar to
this definition of dimension was first identified by Felix Hausdorff, a German
astronomer and mathematician working in the first quarter of the twentieth
century. The value he identified is commonly known as an object’s Hausdorff
dimension.²
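
As a quick check, the formula can be applied to the three objects scaled earlier. The sketch below assumes a helper named `similarity_dimension` (a name chosen here for illustration); it simply evaluates ln N / ln S.

```python
import math

def similarity_dimension(copies, scale_factor):
    """Return D = ln(N) / ln(S) for N self-similar copies at scale factor S."""
    if copies < 1 or scale_factor <= 1:
        raise ValueError("need at least one copy and a scale factor greater than 1")
    return math.log(copies) / math.log(scale_factor)

# (object, number of self-similar copies N, scale factor S) from the text.
examples = [
    ("line segment", 3, 3),
    ("square", 9, 3),
    ("cube", 27, 3),
]

for name, copies, scale in examples:
    print(f"{name}: D = {similarity_dimension(copies, scale):.2f}")
```

The three results agree with the familiar dimensions 1, 2, and 3, which is what we would want from any formula that claims to generalize dimension.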

The Koch Curve


• The Koch curve has infinite perimeter in a finite space; this incongruity
indicates that it is not simply a 1-D object.
• The Koch curve has an area of zero, which indicates that it is not a
2-D object.

Now that we have a completely new way to look at dimension, let’s consider
some strange objects that defy traditional explanation. The first is the famous
Koch curve, or “Koch Snowflake.”

This shape can be created by beginning with a line segment and then iteratively
replacing each line segment with the following curve, in which the middle third
of the segment is replaced by two sides of an equilateral triangle:

Let’s first look at this curve as if it were a 1-D line. At the outset, its length
would be one unit. After the first iteration, its length would be $\frac{4}{3}$ of a unit.

In the second iteration, each line segment is replaced with a curve that is
$\frac{4}{3}$ as long. So, we can multiply the length from the first iteration by
the factor of $\frac{4}{3}$ to obtain a length of $\left(\frac{4}{3}\right)^2$ units
for the second iteration of the Koch curve.

Now, as we repeat the same steps for the third iteration, it should be evident
that the new length will be
$\frac{4}{3} \times \frac{4}{3} \times \frac{4}{3} = \left(\frac{4}{3}\right)^3$ units.
We can generalize this by saying that the curve will increase in length by a
factor of $\frac{4}{3}$ with each iteration. Thus, we are led to conclude that the
length of the total curve continually gets larger without bound! This curve is
infinite in length and yet stays within the confines of the page, which is very
strange indeed! Perhaps this is not a 1-D line but rather a 2-D plane figure.

As we can see in this progression of images, squares, no matter how small we
make them, will “over count” the measurement of the curve. They will never
have the resolution that we need to cover only the curve and no extra space.

Let’s see what happens if we treat each line segment as a square. The area of
the square each time will be equal to the length of the straight segment times
itself.

For the three cases depicted here (plus one thrown in to help show the trend) we
have the following information:

It should be evident that the total area of this curve depends on the area of
the squares we are using to measure it. In fact, the smaller the squares, the
smaller the area. Notice that after the first iteration the area of the curve has
gone from 1 square unit to $\frac{4}{9}$, less than half of a square unit. After the
third iteration, the area has diminished to $\left(\frac{4}{9}\right)^3 = \frac{64}{729}$,
less than a tenth of a square unit. It’s clear to see that
following this trend, the total area of the curve is headed towards zero!
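
The two trends, length growing and area shrinking, can be tabulated side by side. The sketch below is an illustration under the assumptions stated in the text: the starting segment has length one, each iteration replaces every segment with four segments one-third as long, and the “area” is the total area of one square drawn on each straight segment. The function name `koch_measurements` is a hypothetical choice.

```python
from fractions import Fraction

def koch_measurements(iteration):
    """Return (segment count, segment length, total length, total square area)."""
    segments = 4 ** iteration                      # each segment becomes four
    segment_length = Fraction(1, 3) ** iteration   # each new segment is 1/3 as long
    total_length = segments * segment_length       # equals (4/3) ** iteration
    total_area = segments * segment_length ** 2    # one square per segment: (4/9) ** iteration
    return segments, segment_length, total_length, total_area

for n in range(5):
    segments, seg_len, length, area = koch_measurements(n)
    print(f"iteration {n}: {segments} segments of length {seg_len}, "
          f"length = {float(length):.3f}, area = {float(area):.4f}")
```

Exact fractions are used so that the values can be compared directly with the powers of 4/3 and 4/9 discussed above.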

In summary, measuring the curve as a 1-D object fails miserably, as it generates
an infinite length, and measuring the curve as a 2-D object gives us an area of
zero, which also classifies as a miserable failure. Let’s return to our equation
for the Hausdorff dimension to see if we can get to the root of this conundrum.

Fractal Snowflakes
• Using the Hausdorff definition of dimension, we find that the dimension of
the Koch curve is some decimal value between 1 and 2.

To find the Hausdorff dimension, we need to know how the self-similarity of
this object relates to how it scales. We see that after one iteration, each line
segment is replaced with four copies of itself. Furthermore, we see that each
self-similar copy is $\frac{1}{3}$ the length of the original. This means that our
scale factor is 3 and our number of self-similar objects is 4.

Substituting these values for S and N in the dimension equation that we derived
earlier, we get:

$$D = \frac{\ln 4}{\ln 3} \approx 1.26$$

Hence, this object is somewhere between one-dimensional and two-dimensional!
Results like this are fractional, or fractal, dimensions, and the
objects themselves are simply called “fractals.”
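
One way to see why a dimension of about 1.26 is a natural fit is to “measure” the curve with segment lengths raised to the power D, instead of to the power 1 (length) or 2 (area). The sketch below is an illustration under that assumption, not the formal Hausdorff construction, and the function name `koch_measure` is hypothetical. With D = ln 4 / ln 3, the 4^n segments of length (1/3)^n contribute a total that neither blows up nor vanishes.

```python
import math

KOCH_DIMENSION = math.log(4) / math.log(3)  # approximately 1.26

def koch_measure(iteration, exponent):
    """Sum of (segment length) ** exponent over all segments at an iteration."""
    segments = 4 ** iteration
    segment_length = (1 / 3) ** iteration
    return segments * segment_length ** exponent

for n in (1, 5, 10, 20):
    as_length = koch_measure(n, 1)                # grows without bound
    as_area = koch_measure(n, 2)                  # shrinks toward zero
    as_fractal = koch_measure(n, KOCH_DIMENSION)  # stays at about 1
    print(f"n={n}: length={as_length:.3g}, area={as_area:.3g}, "
          f"D-measure={as_fractal:.3g}")
```

The middle column of behavior is the point: only the exponent ln 4 / ln 3 gives a measurement that settles on a finite, nonzero value.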

So, our path through the story of dimension has just taken another turn. Not
only have we glimpsed the behavior of dimensions higher than the three to
which we are accustomed, but now we have also seen that objects can be
described by non-integer dimensions. Put another way, some objects seem to
exist in spaces between intuitive dimensions.

Fractals were popularized by Benoit Mandelbrot in the 1970s when it was
found that many objects in nature resemble fractal designs to some degree
or another. Indeed, the vast numbers of intricate shapes found in nature are
rarely as conveniently geometric as simple lines, squares, and planes. In fact,
natural shapes tend to exhibit intriguing behavior at different scales, and while
not always exactly self-similar in the way that the Koch curve is, many natural
objects exhibit statistical self-similarity. As it turns out, this property can come
in quite handy, as we shall see in the next section.

SECTION 5.7
Fractal by Nature

• How Long Is the Coastline of Britain?
• Will the Real Jackson Pollock Please Stand Up?

In our analysis of the Koch curve, we were fortunate that it behaves so nicely;
that is, it lends itself to being measured. Many objects in nature are not so
“nice.” They may exhibit properties of self-similarity only at limited
scales (e.g., a fern leaf), only in a rough, approximate manner, or both.

Nevertheless, the concept of fractal dimension can generally be used to help
describe and analyze naturally occurring phenomena and objects. In order to
use this tool, however, we must replace our requirement of strict self-similarity
with a notion of approximate, or statistical, self-similarity. Let’s look at an
example.

How Long is the Coastline of Britain?


• Real objects are not exactly self-similar; rather, they are statistically
self-similar.
• The length of a curvy object, such as a coastline, depends on the size of the
ruler you use to measure it.

A famous application of fractals was posed as the question, “How long is the
coastline of Britain?” This question embodies the fact that the value obtained
when measuring the length of a complicated shoreline, such as that of Britain,
depends on the length of the “ruler” that is used. Indeed, as with the Koch
curve, we can convince ourselves that the length can be as long as we choose.


Alexandre Van de Sande, HOW LONG IS THE COAST OF BRITAIN? STATISTICAL SELF-SIMILARITY
AND FRACTIONAL DIMENSION (2004). Courtesy of Alexandre Van de Sande at wanderingabout.com

Benoit Mandelbrot saw that, if we view the coastline as a fractal, we can start
to make some sense of its measurement. The problem is that the curve does
not repeat its exact shape at different scales, as the Koch curve does. Rather,
statistical features repeat at different length scales. This might include the
number of bays or peninsulas of a certain scale that one finds when measuring
with a specific ruler.

One might find that one quadrant of the entire curve contains three bays and
four peninsulas of length one unit (here we’re letting a unit equal the length
of one quadrant). If we then look at one-eighth of the curve, our unit becomes
smaller, and the larger bays and peninsulas that showed up in the first view
become more-or-less flat. New bays and peninsulas become evident, however,
now that we have a more detailed view. We might find that the number of
smaller bays and peninsulas (of length $\frac{1}{8}$) is similar to before: say, three bays
and five peninsulas. So, although the exact shape is not the same at both scales,
the number of significant features is about the same. This gives us the idea that
the coastline is approximately self-similar.

We can use these properties to find the dimension of our coastline, but we need
a new technique. The strategy we used previously to find the dimension of the
Koch curve won’t work in this case, because we do not have exact
self-similarity, but, rather, only statistical self-similarity. To find out more
about a method that might work, let’s look again at the Koch curve and use
rulers of different sizes to measure its length.

[Figure: Same structure at different scales.]

Recall that the first time we tried this, we found that the length of the curve
approaches infinity as we take closer and closer looks. This time, however,
instead of being concerned with the absolute length, we’ll focus on how the
length changes with the size of the ruler with which we choose to measure.
We start with a ruler of length one and find that the length of the curve is 4 units.
Now, if we measure with a ruler $\frac{1}{3}$ as long (what might be considered a
“more sensitive” ruler), we find that the length is $16 \times \frac{1}{3}$ units. As
we use smaller and smaller rulers, the following table begins to take shape:

Ruler length   Number of rulers   Measured length
1              4                  4
1/3            16                 16/3 ≈ 5.33
1/9            64                 64/9 ≈ 7.11
1/27           256                256/27 ≈ 9.48

Notice that nowhere so far are we concerned with finding copies that look
exactly like the entire curve; we care only about how the measured length of
the curve changes with the ruler size. Hopefully, it is becoming apparent that
this technique will work on curves that are not as uniform as the Koch curve.
To find the relationship between these quantities, we can plot them on a graph.


Notice that the scales with which we are dealing suggest that we should look at
a logarithmic (log-log) graph of these data. This kind of plot is often useful when
dealing with quantities (like these) that change by a constant factor at each step.
To make the log-log graph, we simply take the logarithm of all the quantities and
re-plot the data.


Now, to find out how these two values are correlated, we can look at the slope
of the best-fitting line. For simplicity’s sake, we’ll just choose the start and end
points:

$$\text{Slope} = \frac{\text{rise}}{\text{run}} = \frac{(4\log 4 - 3\log 3) - \log 4}{-3\log 3 - 0} = \frac{\log 4 - \log 3}{-\log 3}$$

Subtracting this from 1 yields $\frac{\log 4}{\log 3}$, which is the same expression for dimension
that we obtained earlier by looking at self-similar copies.
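
The slope calculation can also be checked numerically. The sketch below assumes the Koch measurements described above (a ruler of length (1/3)^k fits along the curve 4^(k+1) times) and fits a least-squares line to the log-log data instead of using only the end points; the function name `best_fit_slope` is an illustrative choice. For real coastline data the points would not fall exactly on a line, which is why a best-fit slope is used here.

```python
import math

def best_fit_slope(xs, ys):
    """Least-squares slope of the line through the points (xs[i], ys[i])."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator

# Koch curve measurements: a ruler of length (1/3)**k
# fits along the curve 4**(k + 1) times.
rulers = [(1 / 3) ** k for k in range(4)]
lengths = [4 ** (k + 1) * ruler for k, ruler in enumerate(rulers)]

log_rulers = [math.log(r) for r in rulers]
log_lengths = [math.log(length) for length in lengths]

slope = best_fit_slope(log_rulers, log_lengths)
dimension = 1 - slope  # the dimension is 1 minus the slope of the log-log line

print(f"slope = {slope:.4f}")
print(f"dimension = {dimension:.4f}")
```

Replacing `rulers` and `lengths` with measurements taken from a map would give an estimate for a coastline in the same way, subject to how well a straight line fits the data.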

So, to find the dimension of our original coastline, which will allow us to come
up with some sort of meaningful measurement, we can take a set of data that
includes both the length of the ruler we use and the total length that we find. If
we then plot the data on a log-log graph, we can find the relationship between
the choice of ruler and the total length. This will generate a line (or we can
choose a line of best fit), the slope of which is related to the dimension of the
coastline.


Note that the slope of this line is equivalent to 1 minus the dimension of the
coastline—or, alternatively, the dimension of the curve is equal to 1 minus the
slope of our line. With this knowledge of the approximate dimension, we can
select a unit of an appropriate size with which to make our measurements. This
unit is not a length and not an area, but something in between—call it “larea” for
now. Furthermore, it is specific to the coastline with which we are concerned,
so it doesn’t provide a means of determining whether a certain coastline is
“longer” than another. However, it does enable us to talk about the relative
curviness of shorelines. For instance, we would expect a coastline with a fractal
dimension close to 1 to be much more featureless than a coastline whose
dimension is closer to 2.

Statistical self-similarity abounds in nature. The surface of a dry landscape
has the same features at many different scales. The branching of trees follows
similar rules. One of Mandelbrot’s great contributions was seeing how fractals
relate to the natural phenomena and rhythms of our world.

Will the Real Jackson Pollock Please Stand Up?
• The works of Jackson Pollock exhibit statistical self-similarity at different
scales and have a fractal nature to them.
• Measuring the fractal dimension of a Pollock-style painting is one tool that
can help in verifying its origin.

Another person who was fascinated by natural rhythms was the American painter
Jackson Pollock. Pollock was born in 1912 in Wyoming, and he relocated to New
York at the age of 18. Through developing his craft as a painter, he changed his
technique dramatically in 1947. The drip paintings he began to create, eschewing
all traditionally accepted concepts of form and rigidity in favor of pure emotion
and crazily strewn lines, brought him fame.

Item 3217/Hans Namuth, JACKSON POLLOCK PAINTING AUTUMN RHYTHM: NUMBER 30, 1950
(1950). Courtesy: (c) Hans Namuth Ltd., courtesy Pollock-Krasner House and
Study Center, East Hampton, NY.

On the surface, it seems that his technique could be easily replicated by anyone
with a bucket of paint, a canvas, a garage, and a penchant for extreme moods.
Recent mathematical analysis of his paintings has shown, however, that copying
a Pollock is not as easy as it may at first appear.

Richard Taylor, a physicist who pursued his analytical interest in Pollock’s work
while earning a master’s degree in art theory from the University of New South
Wales, studied the statistical self-similarity of Pollock’s paintings. His method
was to take a digital scan of a Pollock painting and section it into squares of
different sizes for analysis, much as we sectioned off the coastline of Britain
previously. For each square size, computers are used to identify certain physical
traits of the paintings, somewhat analogous to the bays and peninsulas from the
coastline example. Researchers found that Pollock’s paintings exhibit statistical
self-similarity, and are, therefore, fractals.
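
The idea of sectioning an image into squares of different sizes can be illustrated with box counting, a standard way of estimating a fractal dimension from a grid. The text does not give the details of Taylor’s procedure, so the sketch below is an illustration of the general idea and not a reconstruction of his analysis; the function names and the two test patterns are assumptions chosen because their dimensions (1 and 2) are known in advance.

```python
import math

def box_count(points, boxes_per_side):
    """Count grid squares (of side 1/boxes_per_side) containing at least one point."""
    occupied = {
        (min(int(x * boxes_per_side), boxes_per_side - 1),
         min(int(y * boxes_per_side), boxes_per_side - 1))
        for x, y in points
    }
    return len(occupied)

def box_counting_dimension(points, grid_sizes):
    """Slope of log(box count) against log(boxes per side), by least squares."""
    xs = [math.log(n) for n in grid_sizes]
    ys = [math.log(box_count(points, n)) for n in grid_sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator

# Two test patterns in the unit square whose dimensions are known in advance.
SAMPLES = 1024
diagonal_line = [(i / SAMPLES, i / SAMPLES) for i in range(SAMPLES)]
filled_square = [(i / 64, j / 64) for i in range(64) for j in range(64)]

grid_sizes = [2, 4, 8, 16, 32]
line_dimension = box_counting_dimension(diagonal_line, grid_sizes)
square_dimension = box_counting_dimension(filled_square, grid_sizes)

print(f"diagonal line: {line_dimension:.2f}")
print(f"filled square: {square_dimension:.2f}")
```

A scanned painting would take the place of the test patterns, with each painted pixel contributing one point; a pattern more intricate than a line but sparser than a filled square would be expected to give a value between 1 and 2.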

Fractals were not widely known until the ‘60s, and Pollock died in 1956, so it is
highly unlikely that he was intentionally trying to paint mathematical objects.

Nevertheless, the fractal nature of his art is striking, and unique. In fact, it is
used, in conjunction with other methods, to authenticate paintings purported to
be Pollock originals. This “fractal fingerprint” method involves computing the
fractal dimension of such a work and comparing it to the range of dimensions
known to be exhibited in Pollock’s paintings.

Taylor claims that his technique “shouldn’t be regarded as a final word on
Pollock authenticity, [although] it’s a pretty nifty use of fractal math.”³

It is clear that fractals, and fractal dimensions, initially discovered as abstract
mathematical objects, have a fascinating connection to the natural world.
Indeed, many of the objects that we encounter on a daily basis cannot be
measured within the traditional confines of one, two, and three dimensions as
independent parameters. Rather, they must be evaluated on the basis of their
scaling and self-similarity to be truly understood.

UNIT 5 at a glance

SECTION 5.2
Degrees of Freedom

• The most basic conception of dimension is as a degree of freedom.
• A point is an object with no properties other than location.
• A space is a collection of locations.
• Spaces can be characterized by their degrees of freedom.
• A point in one dimension requires only one number to define it.
• The number line is a good example of a one-dimensional space.
• Line segments are objects that connect two points.
• Distance in a one-dimensional space is found by taking the difference of two
distinct points.
• Points in two-dimensional space require two numbers to specify them
completely.
• The Cartesian plane is a good way to envision two-dimensional space.
• Distance in the Euclidean version of two-dimensional space can be
calculated using the Pythagorean Theorem. One way that different spaces
are distinguished from one another is by the way that distance is defined.
• The concepts of distance and angle extend naturally into three dimensions.
• The way in which we extend our thinking from two to three dimensions
provides us with a template for thinking about higher dimensions.
• Each time we consider a new degree of freedom, we introduce a new
property that cannot exist in lower dimensions. Area (for 2-D) and volume
(for 3-D) are examples.

SECTION 5.3
Journey into the Fourth Dimension

• Time is often thought of as the fourth dimension.
• Time plays a key role as a dimension in mathematical formulations
of physical laws such as general relativity and string theory.
• The qualitative behavior of time as the fourth dimension is debatable.
• A point in four-space, also known as 4-D space, requires four numbers
to fix its position.
• Four-space has a fourth independent direction, described by
“ana” and “kata”.
• In Euclidean four-space, our standard notions of Pythagorean distance
and angle via the inner product extend quite nicely from three-space.

• The hypercube is the four-dimensional analog of the cube, square,
and line segment.
• A hypercube is formed by taking a 3-D cube, pushing a copy of it into the
fourth dimension, and connecting it with cubes.
• Envisioning this object in lower dimensions requires that we distort
certain aspects.
• The tesseract is a 3-D object that can be “folded up,” using the fourth
dimension, to create a hypercube.
• A viewer from the fourth dimension would see both our insides and our
outsides simultaneously.
• Higher-dimensional viewing allows all sides of an object to be
seen simultaneously.
• Artists such as Picasso and Duchamp have used the concept of higher-
dimensional viewing in their works.

SECTION 5.4
Slices, Projections, and Shadows

• A sphere can be thought of as a stack of circular discs of increasing, then
decreasing, radii.
• The process of slicing is one way to visualize higher-dimensional objects via
level curves and surfaces.
• A hypersphere can be thought of as a “stack” of spheres of increasing, then
decreasing, radii.
• Slicing a cube yields different types of polygons, depending on the angle at
which you slice. This is in contrast to the slicing of a sphere, which always
produces circles, regardless of the angle.
• Slices of a hypercube are various polyhedra, not just a series of cubes.
• Slices can miss crucial information about an object, such as whether or not
it is connected.
• Projections are like shadows.
• Projections are related to the inner product.
• Projections preserve more information than slices, but they necessarily
distort the picture in some way.

SECTION 5.5
Many Dimensions in Everyday Life

• Dimension can be used as a rough way to quantify certain aspects of human
nature.
• A 30-question survey can be used to create a 30-dimensional profile of a
person.
• People can be matched according to their distance from each other in 30-
dimensional space.

SECTION 5.6
Scaling and the Hausdorff Dimension

• One-dimensional, two-dimensional, and three-dimensional objects behave
differently as they scale; that is, as they expand or shrink.
• We can write an expression for dimension based on scale factor and the
number of self-similar copies.
• The Koch curve has infinite perimeter in a finite space; this incongruity
indicates that it is not simply a 1-D object.
• The Koch curve has an area of zero, which indicates that it is not a 2-D
object.
• Using the Hausdorff definition of dimension, we find that the dimension of
the Koch curve is some decimal value between 1 and 2.

SECTION 5.7
Fractal by Nature

• Real objects are not exactly self-similar; rather, they are statistically
self-similar.
• The length of a curvy object, such as a coastline, depends on the size of the
ruler you use to measure it.
• The works of Jackson Pollock exhibit statistical self-similarity at different
scales and have a fractal nature to them.
• Measuring the fractal dimension of a Pollock-style painting is one tool that
can help verify its origin.


BIBLIOGRAPHY

WEBSITES
http://www.moma.org/exhibitions/1998/pollock/website100/index.html
http://www.collisiondetection.net/mt/archives/2006/02/_there_were_two.html
http://www.fam-bundgaard.dk/SOMA/NEWS/N010920.HTM
http://www.math.union.edu/~dpvc/math/4D/

PRINT
Abbott, Edward A. Flatland: A Romance of Many Dimensions (unabridged) Dover
Thrift Editions. Mineola, NY: Dover Publications, Inc., 1992.

Aguilera, Julieta. “Virtual Reality and the Unfolding of Higher Dimensions
(proceedings paper),” SPIE Conference Proceedings, vol. 6055 (January 2006).

Banchoff, Thomas F. Beyond the Third Dimension: Geometry, Computer Graphics
and Higher Dimensions. New York: Scientific American Library, division of
HPHLP, distributed by W. H. Freeman and Co., 1990.

Berlinghoff, William P. and Kerry E. Grant. A Mathematics Sampler: Topics for the
Liberal Arts, 3rd ed. New York: Ardsley House Publishers, Inc., 1992.

Burton, David M. History of Mathematics: An Introduction, 4th ed. USA: WCB/
McGraw-Hill, 1999.

Carter, Steve and Chadwick Snow. “Helping Singles Enter Better Marriages
Using Predictive Models of Marital Success,” presented at the 16th Annual
Convention of the American Psychological Society, (May 2004).

Dantzig, Tobias. Number: The Language of Science, The Masterpiece Science
Edition. New York: Pi Press, an imprint of Pearson Education, Inc., 2005.

Edgar, Gerald A. Measure, Topology, and Fractal Geometry. New York: Springer-
Verlag, 1990.

Eves, Howard. An Introduction to the History of Mathematics, 5th ed. (The Saunders
Series) Philadelphia, PA: Saunders College Publishing, 1983.

Falk, Ruma and Arnold D. Well, “Many Faces of the Correlation Coefficient,”
Journal of Statistics Education, vol. 5, no. 3 (1997).

Fiore, A.T. and Judith S. Donath. “Homophily in Online Dating: When Do You Like
Someone Like Yourself?” Paper presented at the Conference on Human Factors
in Computing Systems, Portland, Oregon, April 2-7, 2005.
http://portal.acm.org/citation.cfm?id=1056808.1056919&coll=&dl=acm&type=series&idx=1056808&part=Proceedings&WantType=Proceedings&title=Conference%20on%20Human%20Factors%20in%20Computing%20Systems&CFID=15151515&CFTOKEN=6184618
(accessed April 20, 2007).

Greene, Brian. The Elegant Universe: Superstrings, Hidden Dimensions, and the
Quest for the Ultimate Theory. New York: W.W. Norton and Co., 1999.

Kennedy, Randy. “Black, White and Read All Over Over.” NYTimes.com
(December 15, 2006),
http://www.nytimes.com/2006/12/15/arts/design/15serk.html?ex=1323838800&en=8388f00a8250bff2&ei=5088&partner=rssnyt&emc=rss
(accessed 2007).

Mandelbrot, Benoit B. The Fractal Geometry of Nature, Updated and Augmented.


New York: W.H. Freeman and Company, 1983.

Mandelbrot, Benoit B. “How Long Is the Coast of Britain? Statistical Self-


Similarity and Fractional Dimension,” Science, 156 (1967).

McCallum, William G., Deborah Hughes-Hallett, Andrew M. Gleason, et al.


Multivariable Calculus. New York: John Wiley and Sons, 1997.

Mureika, J.R. “Fractal Dimensions in Perceptual Color Space: A Comparison


Study Using Jackson Pollock’s Art.” Cornell University Library. http://arxiv.org/
abs/physics/0509152 (accessed 2007).

Mureika, J. R. et al. “Multifractal Structure in Nonrepresentational Art,”


Physical Review E, vol. 72, issue 4 (2005).

Ouellette, Jennifer. “Pollock’s Fractals: That Isn’t Just a Lot of Splattered Paint
on Those Canvases: It’s Good Mathematics” Discover, vol. 22, no. 11 (November
2001).

Randall, Lisa. Warped Passages: Unraveling Mysteries of the Universe’s Hidden
Dimensions. New York: Harper Perennial, 2005.

Rucker, Rudy. The Fourth Dimension: A Guided Tour of the Higher Universes.
Boston, MA: Houghton-Mifflin Co., 1984.

Salas, S.L. and Einar Hille. Calculus: One and Several Variables, 6th ed. New York:
John Wiley and Sons, 1990.

Tannenbaum, Peter. Excursions in Modern Mathematics, 5th ed. Upper Saddle
River, NJ: Pearson Education, Inc., 2004.

Taylor, Richard, Adam Micolich, and David Jonas. “Fractal Expressionism,”
PhysicsWorld, vol. 12, no. 10 (October 1999).

Taylor, R.P. et al. “From Science to Art and Back Again!” American Association
for the Advancement of Science. http://sciencecareers.sciencemag.org/career_
development/previous_issues/articles/0980/from_science_to_art_and_back_
again/(parent)/158 (accessed 2007).

Taylor, Richard P. “Order in Pollock’s Chaos,” Scientific American, 116
(December 2002).

Weeks, Jeffrey R. The Shape of Space, 2nd ed. (Pure and Applied Mathematics).
New York: Marcel Dekker Inc., 2002.

Weisstein, Eric W. “Hypercube.” Wolfram Research.
http://mathworld.wolfram.com/Hypercube.html (accessed 2007).

Weisstein, Eric W. “Fractal.” Wolfram Research.
http://mathworld.wolfram.com/Fractal.html (accessed 2007).

LECTURE

Haines, Eric and Tomas Möller. “Real-Time Shadows.” Lecture presented at
Game Developers Conference, San Jose, California, March 20-24, 2001.
http://www.acm.org/tog/editors/erich/ (accessed 2007).

1 Abbott, Edward A. Flatland: A Romance of Many Dimensions (unabridged) Dover Thrift Editions. Mineola, NY:
Dover Publications, Inc., 1992; 82.
2 The actual definition and derivation of the Hausdorff dimension is quite complicated and is out of our scope.
The definition given here will do fine for our purposes; the point is that it is a completely different way to view
dimension.
3 http://www.collisiondetection.net/mt/archives/2006/02/_there_were_two.html
TEXTBOOK
Unit 6
UNIT 06
The Beauty of Symmetry
TEXTBOOK

UNIT OBJECTIVES

• Symmetry, in a mathematical sense, is a transformation that leaves an object
invariant.

• Some symmetries are understood as geometric motions.

• Some symmetries are understood as algebraic operations.

• When symmetries are combined, another symmetry is the result.

• Symmetries form what is known as a group, which allows mathematicians to
perform a sort of “arithmetic” with things that are not numbers.

• A “frieze group” is the set of symmetries of an infinite frieze (a frieze is a pattern that
repeats along a line, i.e., in one dimension).

• All frieze groups fall into one of seven general types.

• A “wallpaper group” is the set of symmetries of an infinite wallpaper pattern,
that is, a pattern that “repeats” in two dimensions.

• Wallpaper groups fall into one of 17 general categories.

• All groups of symmetries can be expressed as permutations, which also
form groups.

• The ways in which a polynomial’s roots behave under a certain collection of
permutations determine whether or not the roots have a simple algebraic formula
(such as the quadratic formula) in terms of the coefficients of the polynomial.

• Every conserved physical quantity is based on a continuous symmetry.


The Theory of Groups is a branch of
mathematics in which one does something to
something and then compares the result with
the result obtained from doing the same thing
to something else, or something else to the
same thing. . .

James R. Newman
UNIT 6 The Beauty of Symmetry
textbook

SECTION 6.1
INTRODUCTION

Why do we find butterflies appealing? What is it about a snowflake that can hold
our attention? Why do we find some designs beautiful and others unattractive?
Such subjective observations can hardly be thought to be within the realm of the
objective explanatory power of mathematics, and yet the concept of “symmetry,”
an idea that underlies what humans consider to be beautiful, drives right to the
heart of mathematical thinking.

Item 2875 / Uzbek, EMBROIDERED SUSANI (nineteenth-twentieth century). Courtesy of Kathleen Cohen.

Item 2925 / Tony Link Design, FIERY ORNAMENTAL PATTERN (2007). Courtesy of iStockphoto.com/Tony Link Design.

Item 2926 / Adam Mandoki, CUPOLA (2007). Courtesy of iStockphoto.com/Adam Mandoki.

The mathematical study of symmetry is the rigorous study of the commonalities
between objects or situations. For example, a plain equilateral triangle with no
distinguishing markings to differentiate one side or angle from another has six
geometric symmetries. These result from actions, or transformations, such as
flipping or rotating the triangle, that leave it looking the same as when we started.
Furthermore, we can compose symmetries to make other symmetries.

Interestingly, the ways in which we can move an equilateral triangle and have it
appear as it did when we started have much in common with the ways in which
we can rearrange a stack of three objects, such as pancakes. This similarity is not
accidental; rather, it is an indication of some deeper concept in action, one
that transcends both equilateral triangles and stacks of pancakes. Mathematicians
call objects that adhere to this deeper concept a group.

As we will see, the study of groups is one way mathematicians extend the notion of
arithmetic to objects that are “beyond numbers.” Group theory unites the spatial
thinking of geometry with the symbolic realm of algebra. It is beautiful in its
generalization and abstraction, providing deep insights into a wide array of
phenomena. In addition to its theoretical power and usefulness, the study of groups
also has many applications to the real world. The techniques of group theory can be
used to study the solvability of certain kinds of equations, explain the existence of
elementary physical particles, verify the fact that physical laws are the same
everywhere on the planet, or determine when a deck of cards is sufficiently shuffled.
These applications are, of course, in addition to the fascinating use of group
concepts and techniques in the traditional art forms of various cultures.

Apart from its applications, the study of symmetry and groups reveals deep and
surprising connections between different areas of mathematics itself. Group
theory is part of a larger discipline known as abstract algebra. Abstract algebra
takes the skills and techniques learned in high school algebra (related to how
the system of numbers works in a general sense through the use of variables)
and extends these tools into the realm of geometry and beyond. Abstract
algebra shows the way in which incredibly interesting and complex structures
can be created simply by putting a few rules in place as to how a small collection of
symbols can be moved around and related to one another. In this symbolic and
abstract setting we find a unity among things that, on the surface, may seem
quite different. Just as the concept of bilateral symmetry unites the forms
of butterflies, airplanes, and humans, group theory and abstract algebra show
a relationship between the fundamental structures of logic and mathematical
reality.

In this unit we will start by examining different types of visual symmetry, and
we will see how the concept of groups enables us to classify different design
motifs in both one and two dimensions. From there we will examine the
symmetries inherent in permutations, such as the ways in which one can
stack pancakes or shuffle cards. With these concepts in hand, we will catch a
glimpse of one of the crown jewels of abstract algebra, Galois theory. We will
then shift gears to study the role that symmetry plays in our understanding of
the physical universe.

Item 3078 / Andrey Prokhorov, 3D SCHEME (2007). Courtesy of iStockphoto.com/Andrey Prokhorov.


SECTION 6.2
Types of Symmetry Leading to Groups
• Foundations of Beauty
• The Equilateral Triangle
• Stay with the Group

FOUNDATIONS OF BEAUTY
• Symmetry is a basic notion in the visual arts.
• Rotation, reflection, and translation are the most common types of visual
symmetry.

Symmetry is perhaps most familiar as an artistic or aesthetic concept. Designs
are said to be symmetric if they exhibit specific kinds of balance, repetition,
and/or harmony. In mathematics, symmetry is more akin to something like
“constancy,” or how something can be manipulated without changing its form.
In other words, the mathematical notion of symmetry relates to “objects” that
appear unchanged when certain transformations are applied.

Think of the form of a butterfly; its right and left halves mirror each other. If you
knew what the right half of a butterfly looked like, you could construct the left
half by reflecting the right half over a line that bisects the butterfly.

Butterflies exhibit a type of symmetry called “bilateral symmetry” or “mirror
symmetry” (either half of the butterfly is the mirror image of the other), one
that is very common among living things. Perhaps most familiar to us is our own
bilateral symmetry, the symmetry of our left and right arms and hands, or
our left and right legs and feet, or the approximate symmetry of our bodies
if bisected vertically into left and right halves. In general, bilateral symmetry is
present whenever an object or design can be broken down into two parts, one of
which is the reflection of the other. Given any motif, one can generate a design
with bilateral symmetry by choosing a line and reflecting the motif over it.
Conversely, if a motif already possesses bilateral symmetry, it can be reflected
over its line of symmetry and we would notice no difference between the original
and the reflected versions. This action, reflection, leaves the original design
apparently unchanged, or invariant.

[Figure: Flatworm, Ant, Human. Note that reflecting any of these forms about the
axis leaves the design unchanged.]

Bilateral symmetry is quite common in nature, but it is by no means the only
form of visual symmetry that we see in the world around us. Another common
form is rotational symmetry, such as that seen in sea stars and daisies.

Item 1576 / Corel Corporation, BEAUTIFUL STARFISH (2008). Courtesy of Corel Corporation.

Item 1253 / NPS Photo by William S. Keller, OXEYE-DAISY (CHRYSANTHEMUM LEUCANTHEMUM) (1962).
Courtesy of National Park Service.

Recall that to be symmetric an object must appear unchanged after some action
has been taken on it. An object that exhibits rotational symmetry will appear
unchanged if it is rotated through some angle. A circle can be rotated any
amount and still look like a circle, but most objects can be rotated only by some
specific amount, depending on the exact design. For example, an ideal sea star,
having five arms, is not symmetric under all rotations, but only those equivalent
to $\frac{1}{5}$ of a full rotation, or 72° $\left(\frac{360^\circ}{5}\right)$.

[Figure: Rotating an ideal sea star 72° leaves its appearance unchanged.]

A daisy, on the other hand, is rotationally symmetric under smaller rotational
increments. Let’s say it has 30 petals, all of which are the same in appearance.
No such daisy exists in the real world, of course; this is an ideal mathematical
daisy. The flower will be symmetric under a rotation of 12°
$\left(\frac{360^\circ}{30}\right)$ or any multiple thereof.

[Figure: The ideal daisy is symmetric under rotations of 12°.]
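
The arithmetic behind the sea star and daisy examples can be sketched in a few lines of Python. This is only an illustration: the function name `rotation_angles` is made up for this sketch, and it assumes an idealized figure with n identical, evenly spaced arms or petals.

```python
from fractions import Fraction

def rotation_angles(n_fold):
    """Angles in degrees (up to a full turn) that map an ideal n-fold figure onto itself."""
    step = Fraction(360, n_fold)            # smallest rotation that works
    return [step * k for k in range(1, n_fold + 1)]

sea_star = rotation_angles(5)               # ideal five-armed sea star
daisy = rotation_angles(30)                 # ideal 30-petal daisy

print([int(a) for a in sea_star])           # [72, 144, 216, 288, 360]
print(daisy[0], len(daisy))                 # 12 30
```

The last angle in each list is a full turn of 360°, which returns every point to where it started.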

You might have observed that the sea star and the daisy are not limited to
rotational symmetry. Depending on how you choose an axis of reflection, they
can each display bilateral (reflection) symmetries as well. Notice, however, that
only certain dividing lines can serve as axes of reflection.

[Figure: a line that is an axis of reflection symmetry, and a line that is not an
axis of reflection symmetry]

This brings us to an important point: an object may have more than one type of
symmetry. The specific symmetries that an object exhibits help to characterize
its shape. Remember, the motions associated with symmetries always leave the
object invariant. This means that combinations of these motions, which are how
mathematicians tend to think of symmetries, will also leave the original object
invariant. Let’s explore this idea a bit further by looking at the symmetries of an
equilateral triangle.

The Equilateral Triangle
• The symmetries of the equilateral triangle can be thought of as the
transformations (i.e., motions or actions) that leave the triangle invariant,
looking the same as before the motion.
• Combinations of symmetries are also symmetries.

[Figure: an equilateral triangle with its axes of reflection symmetry, and a line
that is not an axis of reflection symmetry]

Notice that there are three lines over which the triangle can be reflected and
maintain its original appearance.

Considering rotations, we find that there are only three that will return the
triangle to its original appearance. We can rotate it through $\frac{1}{3}$
of a complete revolution (120°), $\frac{2}{3}$ of a complete revolution (240°),
or one full revolution (360°).

[Figure: an equilateral triangle with vertices labeled A, B, and C]

As we have seen, an equilateral triangle has three distinct reflections and three
rotations under which it remains invariant. Furthermore, since all of these
symmetries leave the triangle invariant, the combination of any two of them is
also a symmetry. For example, a rotation through 120°, followed by a reflection
over a vertical line passing through its top vertex, leaves the triangle occupying
exactly the same space as at the start, so the combined motion is itself one of the
six symmetries. Let’s look at all possible combinations of symmetries in an
equilateral triangle a little more closely. To do this, it will be helpful to label
each vertex so that we can keep track of what we have done.

[Figure: the triangle, with A at the top, B at bottom left, and C at bottom right,
before and after I]

If we do nothing to the triangle, this is called the identity transformation, I.

[Figure: the triangle before and after R1]

This symmetry is simply a rotation of 120° counterclockwise; let’s call it R1.

[Figure: the triangle before and after R2]

A rotation of 240° counterclockwise is another symmetry of the
equilateral triangle; let’s call it R2.

[Figure: the triangle before and after L]

This diagram represents a reflection over the vertical axis (notice how vertices B
and C have switched sides); let’s call it L.

[Figure: the triangle before and after M]

This symmetry is a reflection over the line extending from B through the
midpoint of AC; let’s call this motion M.

[Figure: the triangle before and after N]

In this diagram the triangle has been reflected over the line extending from C
through the midpoint of AB; let’s call this action N.
Now that we have identified all the possible motions that leave the triangle
invariant, we can organize their combinations in a chart. In these combined
movements, the motions in the left column of the chart are done first, then
the motions across the top row. For instance, the rotation R1, followed by the
Identity, I, yields the same result as simply performing R1 by itself. Performing
the reflection M, followed by N, gives us the same result as simply performing
the rotation R2.

        I    R1   R2   L    M    N
I       I    R1   R2   L    M    N
R1      R1   R2   I    M    N    L
R2      R2   I    R1   N    L    M
L       L    N    M    I    R2   R1
M       M    L    N    R1   I    R2
N       N    M    L    R2   R1   I

Notice that as we complete this chart of all possible combinations of
two motions, every result is one of our original symmetries. This is an
indication that we have found some sort of underlying relational structure.
Mathematicians call sets of objects that express this type of structure a group.
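
The chart can also be generated mechanically. The following Python sketch is one possible way to do it, under assumptions that are not in the text: each motion is modeled as a rearrangement of the labels sitting at the (top, left, right) positions, the reflection axes are taken to be fixed in space, and the names `MOTIONS`, `START`, and `followed_by` are invented for this illustration.

```python
START = ("A", "B", "C")  # labels at the (top, left, right) positions

# Each motion takes the labels currently at (top, left, right)
# and returns the labels at those positions after the motion.
MOTIONS = {
    "I":  lambda top, left, right: (top, left, right),
    "R1": lambda top, left, right: (right, top, left),   # 120 degrees counterclockwise
    "R2": lambda top, left, right: (left, right, top),   # 240 degrees counterclockwise
    "L":  lambda top, left, right: (top, right, left),   # mirror in the vertical axis
    "M":  lambda top, left, right: (right, left, top),   # mirror in the axis through bottom left
    "N":  lambda top, left, right: (left, top, right),   # mirror in the axis through bottom right
}

# The arrangement reached from START identifies each motion uniquely.
NAME_OF = {move(*START): name for name, move in MOTIONS.items()}

def followed_by(first, second):
    """Name of the single motion equal to `first` followed by `second`."""
    arrangement = MOTIONS[second](*MOTIONS[first](*START))
    return NAME_OF[arrangement]

names = list(MOTIONS)
print("    " + " ".join(f"{n:>3}" for n in names))
for first in names:
    row = " ".join(f"{followed_by(first, second):>3}" for second in names)
    print(f"{first:>3} {row}")
```

Each printed row gives the motion in the left column followed by the motion named at the top of the column, matching the convention used for the chart.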

Stay With the Group
• A group is a collection of objects that obey a strict set of rules when acted
upon by an operation.

A group is just a collection of objects (i.e., elements in a set) that obey a
few rules when combined or composed by an operation that we often call
“multiplication.” This may seem like a vague, even unhelpful, description, but
it is precisely this generality that gives the study of groups, or group theory, its
power. It is also amazing and a bit mysterious that out of just a few simple rules
we can create mathematical structures of great beauty and intricacy.

The symmetries of an equilateral triangle form a group. Remember, these
symmetries are all the rigid motions that leave the triangle invariant. One of
the powers of group theory is that it allows us to perform operations that are
“sort of” arithmetic with things that are not numbers. Notice that the operation
we used in the triangle example above was simply the notion of “followed by.”
This is completely analogous to the idea of combining two integers
by addition and getting another integer, or multiplying together two nonzero
fractions and getting another nonzero fraction!

Group theorists study objects that don’t have to be numbers as well as
operations that don’t have to be the standard arithmetical operations. Now we
can be a little more precise about what we mean by a group and how groups
function. For example, we would like to be able to use the members of a group
to do arithmetic and even to solve simple equations, such as 3x = 5. To solve this
equation, we need the operation of multiplication, and we need the number 3
to have an inverse. An inverse is simply a group member that, when combined
with another group member under the group operation, gives the Identity.

In the case of 3x = 5, the inverse of 3 is $\frac{1}{3}$, which when combined with 3
under the operation of multiplication, gives 1, the multiplicative identity.
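
As a small illustration of how the inverse does the work, here is a sketch in Python that solves 3x = 5 with exact fractions. The variable names are chosen for this example only.

```python
from fractions import Fraction

three = Fraction(3)
inverse_of_three = 1 / three          # the multiplicative inverse, 1/3

# Combining 3 with its inverse gives the multiplicative identity, 1.
print(inverse_of_three * three)       # 1

# Multiply both sides of 3x = 5 by the inverse of 3 to isolate x.
x = inverse_of_three * 5
print(x)                              # 5/3
print(3 * x == 5)                     # True
```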

This scenario has pointed out the first two rules of a group. First, the group
must have an element that serves as the Identity. The characteristic feature of
the Identity is that when it is combined with any other member under the group
operation, it leaves that member unchanged.

Second, each member or element of the group must have an inverse. When a
member is combined with its inverse under the group operation, the result is the
Identity.

In addition to these two basic rules of group theory, there are two more concepts
that characterize groups. The third property, or requirement, of a group is
that it is closed under the group operation. This means that whenever two
group members are combined under the group operation, the result is another
member of the group. We saw this as we looked at all possible combinations of
symmetries of the equilateral triangle above. No matter which symmetry was
“followed by” which, the result was always another symmetry. For simplification
as we go forward in our exploration of groups, we might as well use the term
“multiplication” to express the operation of “followed by.”

The fourth and final requirement of a group is that it is associative. In other
words, if we take a list of three or more group members and combine them two
at a time, keeping them in the same order, it doesn’t matter which end of the list
we start with. Arithmetic with numbers is governed by the associative property,
so if we want to do arithmetic with members of a group, we need them to be
associative as well.

A group is a set of objects, together with an operation, that conforms to the above
four rules. It is worth noting that although groups obey the associative property,
the commutative property generally does not apply; that is, the order in which we
combine motions usually matters. For example, in the table above for the equilateral
triangle symmetries, notice that the rotation R1 followed by the reflection L gives
the reflection M as a result, whereas L followed by R1 gives the reflection N as a
result.
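
The four rules, and the failure of commutativity, can be checked by brute force for the six triangle symmetries. The Python sketch below is one way to do so; its encoding is an assumption made for illustration (each symmetry is stored as a tuple saying which old position, 0 = top, 1 = left, 2 = right, supplies the label for each new position), and the helper names are invented.

```python
from itertools import product

# Entry i of each tuple names the old position whose label ends up at position i.
SYMMETRIES = {
    "I":  (0, 1, 2),
    "R1": (2, 0, 1),
    "R2": (1, 2, 0),
    "L":  (0, 2, 1),
    "M":  (2, 1, 0),
    "N":  (1, 0, 2),
}
NAME_OF = {perm: name for name, perm in SYMMETRIES.items()}

def followed_by(first, second):
    """Name of `first` followed by `second`, or None if the result is not in the set."""
    p, q = SYMMETRIES[first], SYMMETRIES[second]
    return NAME_OF.get(tuple(p[q[i]] for i in range(3)))

names = list(SYMMETRIES)
pairs = list(product(names, repeat=2))

closed = all(followed_by(a, b) is not None for a, b in pairs)
has_identity = all(followed_by("I", a) == a == followed_by(a, "I") for a in names)
inverses = {a: next(b for b in names if followed_by(a, b) == "I") for a in names}
associative = all(
    followed_by(followed_by(a, b), c) == followed_by(a, followed_by(b, c))
    for a, b, c in product(names, repeat=3)
)
commutative = all(followed_by(a, b) == followed_by(b, a) for a, b in pairs)

print(closed, has_identity, associative, commutative)  # True True True False
print(inverses["R1"], inverses["M"])                   # R2 M
```

Each reflection is its own inverse, while the two rotations undo each other.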

As a side note, specialized types of groups that do conform to the commutative
property are called Abelian groups. For our purposes the current discussion
will focus primarily on more-general, non-commutative groups.

In examining the equilateral triangle, we saw that its symmetries formed a
group. Another example of a group would be the set of integers under the
operation of addition. If you add any two integers together, you get another
integer; this demonstrates that this set is closed. There is an identity element,
zero, that you can add to any integer without changing its value. Every integer
also has an inverse. For instance, if you take positive 3 and add to it negative
3, you get the Identity, zero. (Zero, just in case you were wondering, serves as
its own inverse, which is perfectly acceptable!) Finally, we know intuitively that
adding three or more numbers gives the same result no matter how we choose to
group them. For example:

(3+2) + 6 = 3 + (2 +6)

This demonstrates the associativity of the group of integers.
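
A quick spot check of these four properties can be written in Python. This is only a sanity check on a small window of integers (the window from -5 to 5 is an arbitrary choice for this sketch), not a proof about the infinite set.

```python
from itertools import product

sample = range(-5, 6)  # a small finite window; the full set of integers is infinite

closure_ok = all(isinstance(a + b, int) for a, b in product(sample, repeat=2))
identity_ok = all(n + 0 == n == 0 + n for n in sample)
inverse_ok = all(n + (-n) == 0 for n in sample)
associative_ok = all(
    (a + b) + c == a + (b + c) for a, b, c in product(sample, repeat=3)
)

print(closure_ok, identity_ok, inverse_ok, associative_ok)  # True True True True
print((3 + 2) + 6, 3 + (2 + 6))                              # 11 11
```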

Group theory is very useful in that it finds commonalities among disparate
things through the power of abstraction. We will explore this idea in more depth
soon, but first let’s return to the concept we introduced at the beginning of this
section. With all of this focus on rules and axioms, it’s easy to forget that we
are chiefly concerned with understanding and characterizing symmetry in a
mathematical fashion. Now that we have introduced the basic requirements of
groups, we can start to characterize a wide variety of designs using groups. In
the next section, we will focus on one- and two-dimensional patterns and the
groups that describe them.

SECTION 6.3
Frieze and Wallpaper Groups
• Groups in Visual Form
• Friezes and Wallpapers

GROUPS IN VISUAL FORM
• The symmetries of an infinite sine wave form a group.

In the preceding section, we examined two specific types of visual symmetry,
reflection and rotation. We saw that objects, such as a sea star, a daisy, or an
equilateral triangle, can possess one or both of these symmetries. It might be
tempting to think that these two kinds of motion are the only symmetries that
an object can possess. To see whether or not this hunch is correct, we need to
revisit our understanding of what a symmetry is.

We often think of symmetry as a descriptor or quality of an object. We use the
term “symmetric” as an adjective to describe beautiful, balanced objects such
as the ones we studied in the preceding section. We’ve now begun to think of
symmetry in terms of the motions that leave an object appearing the same as
before the motion took place. Once we have freed our thinking from the idea
of symmetry as a property of an object and shifted to considering symmetry to
be a motion, we open ourselves up to more options as to what we consider a
symmetry to be.

For instance, consider a sine wave.

[Figure: graph of a sine wave, with the x-axis marked from -7.5 to 7.5 and the
y-axis marked from -2.5 to 2.5]

Looking at this design, and assuming that it continues its pattern forever both
to the left and to the right, what motions could we apply to it that would leave
it invariant? In other words, imagine that we can pick this design up off of the
page, move it in some way, and replace it on the page. In what ways can we
move it so that when we replace it, we can’t tell that we did anything?
If we reflect the image over the y-axis, we will not end up with the same design.
The same is true if we reflect it over a horizontal line such as the x-axis.

[Figure: Reflect about the y-axis. After reflection about the y-axis the sine wave
appears different.]

[Figure: Reflect about the x-axis. After reflection about the x-axis the sine wave
appears different.]

If we reflect the sine curve over the vertical line $x = \frac{\pi}{2}$, however, the
original look of the design is retained. So, this design remains invariant when
reflected over vertical lines defined by $x = \frac{\pi}{2} + n\pi$, where n is any
integer. The $n\pi$ term of this expression represents the fact that the sine curve
extends forever in both directions and, thus, has an infinite number of possible
vertical reflection axes.

[Figure: the sine wave after a 90° counterclockwise rotation about the origin]

Now let’s consider rotation. It is not hard to see that no rotation of less
than 180° will leave the sine wave invariant. If, however, we rotate the
entire design exactly 180° about the origin, or about any point on the x-axis with
an x-value of the form $n\pi$ (where n is an integer), we get a result that coincides
with the original configuration.

[Figure: Translation of less than 2π. Translation of 2π. Note that translations of
whole number multiples of 2π will leave the sine wave invariant.]

We have established that the sine wave has both rotational symmetry and
reflection symmetry under certain conditions. What if we simply shifted
the curve horizontally along the x-axis? Would this motion leave the design
invariant?

A little thought will reveal that as long as we move it in increments of 2π, which
is the period of a standard sine wave, the design will be unchanged. This type
of motion is called a translation, and it, along with reflection and rotation, is a
symmetry of the sine wave. We must consider shifts of different magnitudes
to be separate motions, however. Consequently, because we can shift the sine
curve by any integer multiple of 2π that we choose, the number of translations
that leave this design invariant is infinite.

[Figure: First stage of glide reflection, “The Glide.” Second stage of glide
reflection, “The Reflection.”]

What if we shifted the sine wave horizontally by a value of π, and then reflected
it over the x-axis? Neither of these actions alone is a symmetry, because
neither action alone leaves the design invariant. Together, however, these
two motions actually do leave the design invariant. This type of symmetry is
called a glide reflection, and because it involves a translation, it also comes in
infinitely many varieties. Note that translations and glide reflections can be
symmetries only of designs that are infinite in extent.

So, we’ve seen that our relatively innocent-looking sine wave has many
symmetries: reflection over certain vertical lines, rotation by 180°, translation
by multiples of 2π, and glide reflections. And let’s not forget the Identity! As
before, with our analysis of the equilateral triangle group, we can check to see
if these transformations of the sine wave form a group under the operation of
“followed by.”
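
Before checking the group rules, the claimed symmetries can be tested numerically. The Python sketch below treats each motion as a map on points (x, y) of the plane and checks, at sampled points, that the curve y = sin x lands back on itself. The function names, the sampling range, and the tolerance are all choices made for this illustration; a sampled check is a sanity test, not a proof.

```python
import math

def reflect_vertical(x, y):      # mirror in the vertical line x = pi/2
    return math.pi - x, y

def rotate_half_turn(x, y):      # rotate 180 degrees about the origin
    return -x, -y

def translate(x, y):             # shift right by one period, 2*pi
    return x + 2 * math.pi, y

def glide_reflect(x, y):         # shift right by pi, then mirror in the x-axis
    return x + math.pi, -y

def shift_by_pi(x, y):           # the glide alone (not a symmetry)
    return x + math.pi, y

def reflect_y_axis(x, y):        # mirror in the y-axis (not a symmetry)
    return -x, y

def is_symmetry(motion, samples=200, tol=1e-9):
    """True if every sampled point of y = sin(x) is moved to another point of the curve."""
    for k in range(samples):
        x = -10 + 20 * k / (samples - 1)
        new_x, new_y = motion(x, math.sin(x))
        if abs(math.sin(new_x) - new_y) > tol:
            return False
    return True

for motion in (reflect_vertical, rotate_half_turn, translate,
               glide_reflect, shift_by_pi, reflect_y_axis):
    print(motion.__name__, is_symmetry(motion))
```

The first four motions should report True and the last two False, in line with the discussion above.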

First, we have an Identity, but does each motion have an inverse? A bit of
examination should convince you that both the vertical reflections and the 180°
rotations can be undone by themselves. The translations can be undone by
shifting the curve in the opposite direction of the initial shift, and a glide
reflection can be undone by a glide reflection that shifts in the opposite direction.
If the first motion was to translate the curve to the right by 2π, the inverse is
to translate it left by 2π. This shows that the symmetries of a sine wave have
inverses.

With the first two requirements of a group, identity and inverse, confirmed,
let’s turn our focus to closure. Since the sine wave remains invariant under an
infinite number of translations and glide reflections, we cannot simply construct
a table and check to see that every combination of symmetries gives us another
symmetry. We should, however, be able to convince ourselves fairly easily that,
as long as we are careful about how we perform our reflections, rotations,
translations, and glide reflections, the result will still be a symmetry of the
design.

For example, say that S represents the sine wave. G, T, and R represent
symmetries of S, that is, motions that leave S invariant. We’ll use * to represent
the operation of applying G, T, or R to S.

G*S means “apply motion G to sine wave S.” Because G is a symmetry, we
expect that G*S will result in an unchanged S, which is the same result obtained
with the Identity motion. Therefore, G*S = S, as do R*S and T*S.

We can compose G, T, and R together to demonstrate both closure (that any
combination of group elements results in an element within the group) and
associativity (that multiple motions can be grouped however we choose without
affecting the result).

Let’s consider G*T*R*S. We can do this as follows:

G*(T*(R*S))

G*(T*S)

G*S

Notice that we end up with an unchanged S, so the combined motion G*T*R is
itself a symmetry of S. This shows that combinations of G, T, and R are still
symmetries, demonstrating closure. Associativity holds because performing
motions one after another does not depend on how we group them: whether we
first combine G with T and then apply the result after R, or apply G after the
combination of T and R, the same three motions are carried out in the same
sequence (R, then T, then G).
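
The same composition can be tried numerically. In this Python sketch, G, T, and R are given concrete forms that are assumptions for illustration only (a glide reflection by π, a translation by 2π, and a half turn about the origin), and `compose` and `preserves_sine` are invented helper names. It checks that the combined motion still maps the sine curve onto itself, that regrouping does not change the result, and that swapping the order of two motions can produce a different symmetry.

```python
import math

def G(p):                        # glide reflection: shift by pi, mirror in the x-axis
    return (p[0] + math.pi, -p[1])

def T(p):                        # translation by one period
    return (p[0] + 2 * math.pi, p[1])

def R(p):                        # 180-degree rotation about the origin
    return (-p[0], -p[1])

def compose(*motions):
    """Combine motions so the rightmost acts first, mirroring G*(T*(R*S))."""
    def combined(point):
        for motion in reversed(motions):
            point = motion(point)
        return point
    return combined

def preserves_sine(motion, samples=100, tol=1e-9):
    """Sampled check that points of y = sin(x) are moved to points of the same curve."""
    for k in range(samples):
        x = -8 + 16 * k / (samples - 1)
        new_x, new_y = motion((x, math.sin(x)))
        if abs(math.sin(new_x) - new_y) > tol:
            return False
    return True

probe = (1.0, math.sin(1.0))
left_grouping = compose(compose(G, T), R)    # (G*T)*R
right_grouping = compose(G, compose(T, R))   # G*(T*R)

print(preserves_sine(compose(G, T, R)))                  # True: closure
print(left_grouping(probe) == right_grouping(probe))     # True: grouping is irrelevant
print(compose(T, R)(probe) == compose(R, T)(probe))      # False: order can matter
```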

Friezes
• Friezes are patterns along a line, which are commonly used in art.
• All friezes fall into one of seven general types, each of which is a
mathematical group.

Item 2876 / Greek Hellenistic, ALEXANDER SARCOPHAGUS. DETAIL: DECORATIVE FRIEZES
(fourth century BCE). Courtesy of Kathleen Cohen.

We’ve now seen that the symmetries of this infinite sine wave do indeed form a
group of the type known as frieze groups. A frieze is simply a repetitive design
on a linear strip.

The number of designs possible with a frieze is limited only by the imagination
of the artist creating it. However, every frieze will have at least one of the
aforementioned symmetries or a symmetry of reflection over a horizontal line
(a symmetry the sine wave does not have). Moreover, every frieze, if sufficiently
“stripped” of its ornamental elements, falls into one of the seven symmetry groups
described here:

1. T (translation only)
2. TG (translation and glide reflection)
3. THG (translation, horizontal line reflection, and glide reflection)
4. TV (translation and vertical line reflection)
5. TR (translation and 180° rotation)
6. TRVG (translation, 180° rotation, vertical line reflection, and glide reflection)
7. TRHVG (translation, 180° rotation, horizontal line reflection, vertical line
reflection, and glide reflection)

Friezes extend in only one dimension, but if we consider patterns that extend
in two dimensions, patterns that cover the Euclidean plane for instance, we find
a similar result. These planar patterns, constructed of the same basic motions as
the linear patterns (except that rotations of less than 180° are possible), also form
symmetry groups. Every one of these so-called “wallpaper patterns” can be
classified as one of seventeen planar groups.

[Figure: an example of a wallpaper pattern whose group consists of reflections over
two axes and rotations of 180°, followed by a simplified diagram of that pattern]

The blue lines represent the axes over which this design can be reflected, and
the pink diamonds show the centers of rotation.

Friezes and wallpaper patterns are not the only geometric designs that can be
classified into groups. Similar results hold for three-dimensional patterns, such
as those found in crystals, and even for patterns in higher dimensions!

[Figure: a three-dimensional crystal cell with angles α, β, γ ≠ 90°]

We’ve now seen how symmetry in patterns can be captured mathematically in the
concepts of frieze and wallpaper groups. Let’s now turn our attention away from
design toward an area that may at first seem wholly unrelated: permutations.
Through this exploration we’ll see the power of abstraction (and in this case, the
abstract concept of a group) to tie together ideas and situations that seem to have
little in common.

SECTION 6.4

CARD SHUFFLING

• Permutations (or Stacking Pancakes)
• The Perfect Shuffle

PERMUTATIONS
• The symmetries of a geometric object can be expressed as permutations
of elements.
• Permutations can form a group, as symmetries do.

Let’s return to the equilateral triangle group that we studied earlier. We have
been saying all along that the power of group theory lies in its capacity to reveal
deep connections between seemingly unrelated things. We can catch a small
glimpse of this by re-examining the motions that make up the equilateral
triangle group.

Remember, to keep track of the reflections and rotations that make up the
equilateral triangle group, we labeled the vertices A, B, and C. We can express
all of the symmetries shown earlier by using just these labels. As a convention,
we’ll express each configuration of the triangle as some sequence of A, B, and
C, with the first letter of each sequence corresponding to the vertex at the top
of the triangle, the second letter representing the right vertex, and the third
letter representing the left vertex. Basically, this is just a clockwise sequence,
starting from the top. Applying this convention yields the following sequences,
shown corresponding to the motions that produced them.

Motion                             Label     Result
Reflection (top vertex fixed)      P1        ABC > ACB
Reflection (left vertex fixed)     P2        ABC > BAC
240° rotation                      P3        ABC > BCA
120° rotation                      P4        ABC > CAB
Reflection (right vertex fixed)    P5        ABC > CBA
Identity                           P6 = I    ABC > ABC

Each of these motions has the effect of permuting the labels of the vertices.
For example, a reflection over a vertical line through the top vertex of the
equilateral triangle keeps A in the same (first) position in the label sequence,
while switching the positions of vertices B and C. This motion takes ABC and
gives us ACB.

Perhaps we don’t really even need an actual triangle to have a group. In fact,
let’s forget about the triangle for a moment and simply look at the ways in which
we can arrange three objects. Just for fun, let’s think of this as the ways in
which we can stack three different-flavored pancakes, one apple (A), one banana
(B), and one chocolate (C). We can use combinatorics to figure out how many
different arrangements there are. Because these are ordered permutations of
three objects, we know there will be 3! (3 × 2 × 1) or six arrangements. They are:

ABC
ACB
BAC
BCA
CAB
CBA
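
The count above can be checked mechanically. The following is a minimal sketch in Python (the language and the variable names `pancakes` and `stacks` are choices made for this illustration, not part of the original text) that lists every stacking order and compares the count with 3!.

```python
from itertools import permutations
from math import factorial

# The three pancake flavors: apple, banana, chocolate.
pancakes = "ABC"

# Every ordered arrangement of the three pancakes.
stacks = ["".join(order) for order in permutations(pancakes)]

print(stacks)
# The number of arrangements should equal 3! = 3 * 2 * 1.
print(len(stacks), factorial(len(pancakes)))
```

Changing `pancakes` to a longer string shows how quickly the count grows with the number of objects.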

If we look at these sequences in terms of how such permutations are created,
instead of simply enumerating the specific arrangements, we will see an
interesting result. For the sake of simplicity, let’s say that each permutation
action starts on ABC. Here are our actions:

P1      ABC > ACB    A>A, B>C, C>B
P2      ABC > BAC    A>B, B>A, C>C
P3      ABC > BCA    A>B, B>C, C>A
P4      ABC > CAB    A>C, B>A, C>B
P5      ABC > CBA    A>C, B>B, C>A
P6=I    ABC > ABC    A>A, B>B, C>C

Let’s see if this set of permutations forms a group under the operation of
“followed by.” It’s clear that we have an Identity motion (P6). The easiest way to
check for closure, inverses, and associativity will be to form a table, as we did
with the equilateral triangle group. Note that as we perform these combined
movements, the motions represented across the top row of the table are done
first, followed by the motions in the left column.

For instance, P5 switches A with C and C with A while leaving B untouched.
Doing this twice in succession yields the same result as doing nothing, so P5
serves as its own inverse. The same can be said for P1 and P2. What happens
when we perform the P3 motion twice in succession? P3 replaces A with B,
B with C, and C with A. Starting from ABC, P3 creates BCA, so applying this
transformation to BCA yields CAB, the same result obtained from performing P4
once. In other words, P3 followed by P3 is equivalent to P4, which demonstrates
that P3 is not its own inverse. However, a little further examination reveals
that P4 undoes the changes that P3 creates, and, thus, serves as its inverse.

        P1   P2   P3   P4   P5   I
P1      I    P3   P2   P5   P4   P1
P2      P4   I    P5   P1   P3   P2
P3      P5   P1   P4   I    P2   P3
P4      P2   P5   I    P3   P1   P4
P5      P3   P4   P1   P2   I    P5
I       P1   P2   P3   P4   P5   I

As we fill in this table, we find that every combination of two permutations
gives another permutation, thereby confirming that the set of permutations of
three objects is closed. Furthermore, we can see that each permutation has an
inverse. Finally, by using the reasoning of the previous examples, namely that
we should be able to combine motions in any way we choose without affecting
the result, we can convince ourselves that the elements are associative under
the operation “followed by.”
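
The checks described above can also be carried out by brute force. The sketch below is one possible way to do so, under an assumption that is not spelled out in the text: each permutation is treated as a rearrangement of positions (P1 swaps whatever sits in the second and third positions, and so on). The helper names `apply` and `followed_by` are invented for this illustration.

```python
from itertools import product

# Each permutation is recorded by the arrangement it produces from "ABC".
PERMS = {"P1": "ACB", "P2": "BAC", "P3": "BCA",
         "P4": "CAB", "P5": "CBA", "I": "ABC"}
NAMES = {arrangement: name for name, arrangement in PERMS.items()}
ORDER = ["P1", "P2", "P3", "P4", "P5", "I"]

def apply(name, stack):
    """Rearrange the positions of `stack` the way `name` rearranges "ABC"."""
    return "".join(stack["ABC".index(letter)] for letter in PERMS[name])

def followed_by(first, second):
    """Name of the single permutation equal to `first` followed by `second`."""
    return NAMES[apply(second, apply(first, "ABC"))]

# Rows are the motion done second, columns the motion done first.
table = {row: [followed_by(col, row) for col in ORDER] for row in ORDER}
for row in ORDER:
    print(row.ljust(4), " ".join(name.ljust(3) for name in table[row]))

# Inverses: every element combines with some element to give the identity.
inverses = {a: b for a, b in product(ORDER, ORDER) if followed_by(a, b) == "I"}

# Associativity: grouping does not matter for any triple of motions.
associative = all(
    followed_by(followed_by(a, b), c) == followed_by(a, followed_by(b, c))
    for a, b, c in product(ORDER, repeat=3)
)
print(inverses, associative)
```

Closure is implicit here: `followed_by` looks the result up among the six named permutations, so it would raise an error if a combination ever fell outside the set.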

Having confirmed that the four requirements are met, we see that the set
of permutations of three objects indeed forms a group. In other words, the
possible ways in which we can stack three distinguishable pancakes form a
group. It’s interesting to note that the number of elements in this group is the
same as the number of elements in the group of symmetries of the equilateral
triangle: six. Furthermore, if we compare the table we obtained for the triangle
with the table we obtained for the permutations, we find that they have identical
structure. This suggests to us that these two situations are both manifestations
of the same fundamental structure. In other words, what do equilateral
triangles have to do with stacks of three pancakes? They both have the same
group structure!

This is, of course, no coincidence and is merely an example of a much
broader observation, first documented by Arthur Cayley, an influential British
mathematician working in the late 1800s. Cayley proved that every finite group
of symmetries can be represented by a group of permutations. When two
groups, such as a group of symmetries and a group of permutations, have the
same structure, we say that they are isomorphic to each other.

Now, to be clear, this does not mean that the group of symmetries of a square,
for instance, will have the same structure as the set of all permutations of
four objects. This is patently obvious from the fact that a square has eight
symmetries, whereas there are twenty-four possible permutations of four
objects.

Symmetry                             Corner labels (clockwise from top left)
Identity                             ABCD
90° clockwise rotation               DABC
180° rotation                        CDAB
270° clockwise rotation              BCDA
Reflection across vertical axis      BADC
Reflection across horizontal axis    DCBA
Reflection across one diagonal       CBAD
Reflection across other diagonal     ADCB

Cayley’s Theorem does assure us, however, that there is some group of
permutations that contains a subgroup (that is, a subset of the group that does
itself satisfy all the group axioms) that corresponds with the eight symmetries
of a square. In this case, the group of twenty-four permutations of four objects
suffices as the broader group that contains the corresponding subgroup. This
connection between geometric symmetries and permutations illustrates but one
example of the connecting power of group theory.
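
To make the subgroup claim concrete, here is a small sketch that builds the symmetries of the square from two generating motions and checks that they sit inside the twenty-four permutations of four objects. The encoding is an assumption of this example: corners are numbered 0 to 3 clockwise from the top left, and a motion is a tuple saying which old corner lands in each position. The names `compose`, `generate`, `ROTATE_90`, and `FLIP_VERTICAL` are illustrative only.

```python
from itertools import permutations, product

def compose(first, second):
    """Apply `first`, then `second` (tuples of source positions)."""
    return tuple(first[i] for i in second)

def generate(generators):
    """Collect everything reachable by repeatedly combining the generators."""
    identity = tuple(range(4))
    group, frontier = {identity}, [identity]
    while frontier:
        current = frontier.pop()
        for g in generators:
            candidate = compose(current, g)
            if candidate not in group:
                group.add(candidate)
                frontier.append(candidate)
    return group

ROTATE_90 = (3, 0, 1, 2)      # 90 degree clockwise turn
FLIP_VERTICAL = (1, 0, 3, 2)  # reflection across the vertical axis

square = generate([ROTATE_90, FLIP_VERTICAL])
all_permutations = set(permutations(range(4)))

# Closure: combining two symmetries of the square gives another one.
closed = all(compose(a, b) in square for a, b in product(square, repeat=2))

labels = sorted("".join("ABCD"[i] for i in motion) for motion in square)
print(len(square), len(all_permutations), square <= all_permutations, closed)
print(labels)
```

The set comparison `square <= all_permutations` is the subset test; together with closure it is what makes the eight motions a subgroup rather than an arbitrary subset.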

THE PERFECT SHUFFLE
• The symmetries of permutations are evident in card shuffling.
• Eight perfect “riffle” shuffles will return a standard deck of cards to the
order in which it started.

Let’s consider the connection between permutations and group theory in a little
more detail. In the example above, we considered permutations of three
objects, modeled by a stack of pancakes. Increase the number of pancakes to
52 and we can (and probably should!) shift our thinking from food to cards. We
may not normally consider a deck of cards to have symmetry, but if we view
symmetries as permutations, and permutations as shuffles, we see yet another
fundamental connection made possible by group theory.

As it turns out, we can use group theory to determine when a deck of cards has
been sufficiently shuffled. Let’s take a step back, though—generally, when
we think of a group, we’re thinking of some kind of symmetry, and to have a
symmetry, something must remain invariant. So, what remains invariant when
a deck of cards is shuffled? Isn’t the point of shuffling to mess everything up?
When we shuffle a deck of cards, what remains invariant is the deck itself.
Although it will probably end up in a different order than when we started, it
remains a deck of cards.


While there are only six permutations of three pancakes, you might well imagine
that there are a staggering number of permutations of a deck of 52 cards. In
fact there are 52!, or roughly 8 × 10^67, possible deck orderings. A good
shuffle would make each of these orderings equally likely. Is there a
technique that can accomplish this?

If you were able to perform a perfect shuffle, one in which the deck is cut
into two stacks of 26 cards which are then riffled together one card at a time
alternately from each stack, the result would be anything but random. The
cards would be interleaved in some mathematically predictable way. The easiest
way to analyze this is to assume that the deck you start with is in ascending
order (2345678910JQKA) for each suit. After one perfect shuffle, this sequence
will be messed up, but in a very particular way. To get a better idea of what
happens, let’s simplify our example to just the lowest ten cards of a suit, with the
ace designated as “1.”

After a perfect cut, we have two stacks, 1 2 3 4 5 and 6 7 8 9 10. Riffling these
together perfectly, starting with the left stack, gives 1 6 2 7 3 8 4 9 5 10.
If we were to perform a second perfect riffle shuffle, the cut would give us:

1 6 2 7 3 and 8 4 9 5 10

. . . and riffling these together would give us: 1 8 6 4 2 9 7 5 3 10.

Cutting again gives us:

1 8 6 4 2 and 9 7 5 3 10

. . . which gives us 1 9 8 7 6 5 4 3 2 10 when riffled perfectly back together.


Note that all of the sequences of ten cards that we have seen so far are quite
predictable provided we know the starting order and perform perfect riffle
shuffles every time.

Cutting once again gives us 1 9 8 7 6 and 5 4 3 2 10, which, combined, yield the
sequence 1 5 9 4 8 3 7 2 6 10.

In yet another perfect shuffle: 1 5 9 4 8 and 3 7 2 6 10 come back together in the
sequence 1 3 5 7 9 2 4 6 8 10.

Okay, just once more. When riffled perfectly, this last action produces the
sequence 1 2 3 4 5 6 7 8 9 10, which was our original ordering! We have found
that six perfect riffle shuffles return our ten-card deck to its original ordering.
In addition to our previous observation that perfect riffle shuffles do not
randomize a deck, we now see that if you perform six of them, the deck actually
isn’t shuffled at all! You might think that a full deck of 52 cards would take many
more shuffles than this to achieve the same result, but actually it only takes
eight!
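
The ten-card walk-through can be replayed in code. The sketch below assumes the same kind of shuffle used above (cut exactly in half, then interleave starting with the top card of the first half) and an even number of cards; the function names `perfect_shuffle` and `shuffles_to_restore` are made up for this illustration.

```python
def perfect_shuffle(deck):
    """Cut the deck in half and interleave, first half's card first."""
    half = len(deck) // 2
    left, right = deck[:half], deck[half:]
    shuffled = []
    for left_card, right_card in zip(left, right):
        shuffled.extend([left_card, right_card])
    return shuffled

def shuffles_to_restore(size):
    """Count perfect shuffles needed to return an even-sized deck to its start."""
    start = list(range(1, size + 1))
    deck = perfect_shuffle(start)
    count = 1
    while deck != start:
        deck = perfect_shuffle(deck)
        count += 1
    return count

print(perfect_shuffle(list(range(1, 11))))  # first shuffle of the ten-card deck
print(shuffles_to_restore(10))              # ten-card deck
print(shuffles_to_restore(52))              # full deck
```

Trying other even deck sizes shows that the number of shuffles needed does not simply grow with the size of the deck.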

This means that, if we want to shuffle a deck of cards in a way that is
mathematically unpredictable (in other words, truly random), then we must
be imperfect shufflers. We should make a certain number of errors, but not too
many, in our shuffling process. Each of these errors will propagate through
other shuffles until the deck is truly randomized. The number of shuffles
required to randomize a full deck of 52 cards, depending on our measure of
randomness (and that’s a whole other story!), is then either six or seven.

The unexpected connection between symmetries, permutations, and card
shuffling is just a small sampling of the broad explanatory power of group
theory. Symmetry and group theory not only tell us about patterns and shuffles,
they can also be used to tackle some of the more abstract challenges that
arise in mathematics itself. One of the classic examples of this is the story
of Evariste Galois and the remarkable connections he made in an attempt to
solve a problem that had vexed some of the greatest mathematical thinkers for
centuries.

SECTION 6.5

GALOIS THEORY

• Equational Symmetry
• Galois
• Permuting Roots

EQUATIONAL SYMMETRY
• The study of permutation groups is related to the roots of algebraic
equations.

The study of groups is part of the larger discipline called abstract algebra. We
have seen how group structure represents an abstraction of two seemingly
different situations: an equilateral triangle and a stack of three pancakes. When
we think of algebra, we normally think of typical algebraic problems such as
“solve 3x + 5 = 7.” We use these types of algebraic statements to make general
observations about how our number system works. To do this, we use variables
to represent unknown numbers, thus freeing us from the constraints of specific
numerical values so that we can see commonalities in different types of
mathematical expressions and equations. For instance, we know that equations
of the form y = mx + b all have something in common: they give us straight lines
that are completely characterized by their slope and y-intercept.

Certain equations have symmetry in the form of invariance under permutation.
In other words, we can tell something about an equation in two variables by
seeing what happens when we swap the variables. For example, the equation
y = -x + 5 has the following graph:

[Graph of y = -x + 5]

If we permute x and y (i.e., swap their positions), we get x = -y + 5, which has the
following graph:

[Graph of x = -y + 5]

The two graphs are the same! This is because the equation y = -x + 5 is invariant
under permutation of x and y. By contrast, the equation $y = x^2 + 1$ has the
following graph:

[Graph of $y = x^2 + 1$]

The graph of its permutation, $x = y^2 + 1$, looks like this:

[Graph of $x = y^2 + 1$]

This altered look shows that $y = x^2 + 1$ is not invariant under permutation of x and
y. This notion of permuting parts of equations will play a role as we attempt to
answer a question similar to “what makes an equation solvable?”
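
One informal way to test this kind of invariance is to check whether swapping the coordinates of a point ever changes whether the point satisfies the equation. The sketch below does this over a grid of integer points; the grid, the function names, and the sampling approach are all assumptions of this example. Agreement on a finite grid is only evidence of invariance, while a single disagreement is a genuine counterexample.

```python
def on_line(x, y):
    """True when the point (x, y) satisfies y = -x + 5."""
    return y == -x + 5

def on_parabola(x, y):
    """True when the point (x, y) satisfies y = x**2 + 1."""
    return y == x**2 + 1

def invariant_under_swap(equation, points):
    """Check that swapping x and y never changes the truth of the equation."""
    return all(equation(x, y) == equation(y, x) for x, y in points)

# A small grid of integer sample points.
points = [(x, y) for x in range(-10, 11) for y in range(-10, 11)]

print(invariant_under_swap(on_line, points))      # swapping changes nothing
print(invariant_under_swap(on_parabola, points))  # swapping breaks the equation
print(on_parabola(0, 1), on_parabola(1, 0))       # a specific counterexample
```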

GALOIS
• Many mathematicians, including Evariste Galois, tackled the question of
whether or not a polynomial has a general solution by radicals.

If we combine the notions of abstraction seen in groups with the arithmetical
power of algebra, we get a system of thought that brings the explanatory
power of mathematics to things that are not numbers, such as the motions of
symmetries and permutations. In short, abstract algebra is the study of how
collections of objects behave under various operations. Its power lies in its
generality.

Surprisingly, things such as symmetries and permutations can be used to
understand problems from the realm of what we normally consider algebra.
One of the first people to realize the vital link between symmetry and algebra
was a 19th century French mathematician named Evariste Galois.

Galois is a legendary figure, known for both his extraordinary insight and his
dramatic life. A revolutionary, Galois was as much obsessed with politics as
with mathematics, to the point that he was expelled from school. He spent
time in prison, led protests, and all the while continued to do groundbreaking
mathematics. Shot fatally in a duel at the age of 20, the young mathematician is
said to have written down all of his mathematical knowledge in a letter the night
before his demise. Whether or not this tale is true, the creative insight that
Galois brought to mathematics is difficult to overstate.

Galois made an astonishing breakthrough while attempting to resolve one of
the great questions of his age. Mathematicians had for centuries been trying to
find general formulas that could give the roots to any polynomial using only the
coefficients in the polynomial. Recall that a polynomial is a simple function that
is a linear combination of powers of the input. For example, a quadratic equation
is a second-degree polynomial such as $p(x) = 3x^2 + 7x - 2$. The “roots” of p(x) are
precisely the variable inputs that result in p(x) being 0. In this case, the famous
quadratic formula tells you what the roots are in terms of the coefficients. Given
a general quadratic equation:

$$p(x) = ax^2 + bx + c$$

. . . the roots are:

$$x = \frac{-b + \sqrt{b^2 - 4ac}}{2a} \quad \text{and} \quad x = \frac{-b - \sqrt{b^2 - 4ac}}{2a}$$

Similar formulas exist for third-degree polynomials (cubics) and fourth-degree
polynomials (quartics). For some time it had been a question of whether or not
a fifth degree polynomial, or “quintic,” was solvable, in the sense of having a
similarly simple expression for the roots in terms of coefficients, square roots,
cube roots, etc. and simple arithmetic. After years of work, mathematicians
were able to find solutions only for specific cases, the simplest of these being
quintic equations of the form $ax^5 + b = 0$, for which the solution would be the fifth
root of $-b/a$. Solutions that hold only for specific cases, however, are a far cry from
the complete mastery that a general solution implies.
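
As a quick numerical illustration of a "solution by radicals," the sketch below applies the quadratic formula to the example polynomial 3x² + 7x − 2 and then solves one special quintic of the form ax⁵ + b = 0. The function name `quadratic_roots` and the sample coefficients for the quintic (a = 2, b = −64) are choices made for this example only.

```python
import cmath

def quadratic_roots(a, b, c):
    """Both roots of a*x**2 + b*x + c = 0 via the quadratic formula (a != 0)."""
    root_of_discriminant = cmath.sqrt(b * b - 4 * a * c)
    return ((-b + root_of_discriminant) / (2 * a),
            (-b - root_of_discriminant) / (2 * a))

def p(x):
    """The example quadratic from the text."""
    return 3 * x**2 + 7 * x - 2

# Each root should make p(x) vanish, up to floating-point rounding.
for root in quadratic_roots(3, 7, -2):
    print(root, abs(p(root)))

# Special quintic a*x**5 + b = 0: one solution is the fifth root of -b/a.
a, b = 2, -64
x = (-b / a) ** (1 / 5)   # fifth root of 32
print(x, abs(a * x**5 + b))
```

`cmath.sqrt` is used so that the same function also handles quadratics whose discriminant is negative, in which case the roots are complex.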

PERMUTING ROOTS
• The general quintic equation cannot be solved by a formula.
• Galois theory can determine the solvability by formula of an equation by
examining the ways that permutation groups of its roots behave.

In 1824, the Norwegian algebraist Niels Henrik Abel published his
“impossibility” theorem in which he proved that there is no general solution by
radicals (no nice formula, in other words) for polynomials of degree five or
higher. Earlier, in 1799, the Italian mathematician Paolo Ruffini had published a
similar finding, though his proof was somewhat flawed. The fact that no general
solution exists for quintic or higher-degree polynomials is now known as the
“Abel-Ruffini Theorem” or, alternatively, “Abel’s Impossibility Theorem.”

The methods that Galois used to reach a similar conclusion were more general,
opening new doors to mathematical exploration. Where Abel and Ruffini
showed simply that quintic and higher-degree polynomials could have no
general solution by radicals, Galois showed why. Furthermore, his contribution
was general enough to explain why polynomials of degree four and lower have
general solutions and why those solutions take the form that they do. His ideas
form the basis for what we now call Galois theory, the basis of modern group
theory.

Galois’ epiphany was to consider the symmetries of the roots of a polynomial.
He discovered a set of conditions in terms of these symmetries that would
determine if that polynomial had a solution by radicals. To do this, he worked
backwards. Instead of starting with a polynomial and trying to find its roots,
he started with a set of roots and looked to find the polynomial that they would
form.

For example, we can use the roots -1, -2, -3, -5, and -7 to construct a polynomial
and find its coefficients. To do this, we can recognize that these roots
correspond to the following binomial factors:

(x+1)(x+2)(x+3)(x+5)(x+7)

When multiplied together, these factors produce the quintic polynomial
expression:

$$x^5 + 18x^4 + 118x^3 + 348x^2 + 457x + 210$$
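
The expansion can be reproduced by multiplying in one factor at a time. The sketch below (the function name `polynomial_from_roots` and the list-of-coefficients representation are assumptions of this example) also checks a fact that foreshadows Galois' approach: the coefficients come out the same no matter what order the roots are taken in, so they are unchanged by any permutation of the roots.

```python
from itertools import permutations

def polynomial_from_roots(roots):
    """Expand the product of (x - r) over the roots.

    Coefficients are listed from the highest power of x down to the constant.
    """
    coefficients = [1]
    for r in roots:
        # Multiply the current polynomial by (x - r).
        times_x = coefficients + [0]
        times_minus_r = [0] + [-r * c for c in coefficients]
        coefficients = [s + t for s, t in zip(times_x, times_minus_r)]
    return coefficients

roots = [-1, -2, -3, -5, -7]
quintic = polynomial_from_roots(roots)
print(quintic)

# Shuffling the roots never changes the resulting coefficients.
order_independent = all(
    polynomial_from_roots(list(order)) == quintic
    for order in permutations(roots)
)
print(order_independent)
```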

Galois studied the conditions under which the coefficients and roots were
related in such a way as to permit a solution by radicals. To do this, he started
with a set of roots and combined them with the rational numbers to serve as
basic building blocks for creating other, somewhat arbitrary, numbers using
multiplication and addition. Using these blocks, he identified a series of key
equations relating the roots and examined how those equations behaved when
the roots were permuted, i.e., shuffled in a way similar to the equations and
graphs we saw in the previous section.

Galois discovered that the ability to solve a polynomial built out of a given set
of roots depends on the invariance under permutation of the roots of those key
equations, created from the roots. In essence, he saw that certain symmetries
in a polynomial’s roots determined whether or not that polynomial has a nice
solution.

Galois not only resolved one of the great challenges of his day, he discovered an
important connection between symmetry, permutation groups, and solvability of
equations. Galois’ discovery is perhaps one of the more unexpected results in
mathematics and shows yet again how group theory provides a way to see past
the superficial in order to find underlying connections. It should come as no
surprise then that group theory has played an important role in understanding
not only the artistic, gambling, and mathematical worlds, but also many
underlying connections inherent in the physical world.

SECTION 6.6

PHYSICS

• Please Be Discrete
• Continuity and Conservation

PLEASE BE DISCRETE
• Discrete symmetries are responsible for physical invariance.
• CPT-symmetry is a fundamental prediction of the standard model of particle
physics and is due to the interaction of three kinds of discrete symmetries.

The idea of symmetry plays an important role in physics. It primarily manifests
as the concept of invariance, the idea that certain quantities or properties do
not change under certain actions. A famous invariant is energy; the energy of
a closed system does not change. This is more commonly understood to mean
that energy is neither created nor destroyed. Conservation laws abound in
physics and have an important connection to symmetry. To get an idea of the
ubiquitous role of symmetry in physics, we have to consider both discrete and
continuous symmetries.

Up until this point, we have really only considered discrete symmetries, such
as those of the equilateral triangle. An object with discrete symmetries has
motions that leave it invariant that cannot be smoothly turned into one another.
In other words, with our triangle, both 120° and 240° rotations leave it invariant,
but none of the rotations between 120° and 240° is a symmetry, so we cannot
smoothly change one symmetry into another. Continuous symmetry, on the
other hand, can be seen in the rotation of a circle; every rotation can be smoothly
turned into every other rotation while the circle remains invariant. Both discrete
and continuous symmetries are important in physics. Let’s first look at the
discrete type.

An important discrete symmetry in physics is actually made up of three
situations that are thought to remain invariant. The first of these is the
conjecture that the universe would behave the same if every particle were
interchanged with its anti-particle. For instance, if we were made of anti-
protons and positrons instead of protons and electrons, we would not be
able to observe any difference. This swapping is called a charge-conjugation
transformation, and like permuting pancakes, can be thought of as a motion that
leaves the initial system invariant. It is referred to as C-symmetry.

The second idea is that the universe would behave the same if left and right
were interchanged. This is known as a parity inversion and is basically
what we observe in a mirror. The idea that the mirror universe behaves no
differently than our own is known as P-symmetry. Unfortunately, neither C-
nor P-symmetry always holds. There are certain situations in which
inverting charge, orientation, or both, leads to a different outcome than if the
inversions had not occurred.

The third discrete symmetry is that of time reversal. Now, over long scales,
this idea is nonsense; of course the future is distinctly different from the past.
However, if one considers two billiard balls colliding (ignoring friction), the
incoming speeds and angles of the balls are the same as the outgoing, so if this
collision were run backwards in time, like rewinding a videotape, we would not
be able to tell. This is called T-symmetry, and it is less general than both C- and
P-symmetries. In fact, T-symmetry is, by itself, not really true, but something
fascinating happens when all three symmetries, C, P, and T, are considered
together.

None of the C-, P-, and T-symmetries acting alone, nor any two of them
acting as a pair, always leaves a physical system invariant. However, these
“broken” symmetries tend to cancel each other out when all three are taken
together in what is known as CPT-symmetry. Basically, what this means is that
if every particle were swapped out with its anti-particle, and all coordinates
were inverted, and time were run backwards, the universe would behave no
differently than it does now.

CPT-symmetry is a fundamental prediction of the Standard Model of Particle
Physics. The standard model predicts what kind of particles should and do exist,
as well as their properties such as mass, charge, and spin. This model has
been remarkably accurate in its ability to explain the interactions of all particles
observed so far, not only for protons, neutrons, and electrons, but also more-
fundamental particles such as quarks and neutrinos.

CONTINUITY AND CONSERVATION
• Noether’s theorem implies that every conserved physical quantity is based
on a continuous symmetry.

Let’s now turn our attention to continuous symmetries. Recall that something
possessing continuous symmetries can remain invariant while one symmetry is
turned into another. Space itself, or more precisely, spacetime (the combination
of both space and time) possesses such continuous symmetries. For instance, if
two billiard balls collide in one location, I expect the result would be no different
than if they had collided a foot to the left, or a centimeter to the left, or a micron
to the left. The collision remains invariant under translations of any magnitude
in spacetime. This is a continuous symmetry.

A remarkable theorem, proved at the beginning of the 20th century by Emmy
Noether, has the physical consequence that for every continuous symmetry in
spacetime, a quantity is conserved. Noether was a German mathematician and
theoretical physicist who made fundamental contributions in both physics and
algebra and greatly expanded the role of women mathematicians in Germany.
In 1933, despite her accomplishments, she was forced to flee Nazi Germany
because she was Jewish. Once safely in the United States, she taught at Bryn
Mawr and also lectured at the Institute for Advanced Study in Princeton until her
death in 1935.

According to Noether’s theorem, every quantity that we consider to be conserved
corresponds to an underlying symmetry of spacetime. For instance, a common
notion in physics is that momentum is conserved, such as in the collision of two
billiard balls. This conserved quantity stems from the fact that spacetime has
continuous spatial translational symmetry, or put in other terms, “every location
is just as good as every other location.”

Conservation of energy, the law that helps roller coasters to function, stems
from the idea that if we conduct an experiment at one particular time and then
conduct the same experiment under the same conditions ten minutes later,
we should not expect to find a different result. Conducting an experiment ten
minutes later is what we call a translation in time. It’s basically like when
we imagined picking up our sine wave frieze motif and shifting it to the right,
except that now we are shifting an event forward in time. Noether’s theorem
implies that this time-translational invariance is what gives rise to the law of
conservation of energy.

Finally, the fact that spacetime is invariant under continuous rotations gives
rise to another important conserved physical quantity. If we ignore things such
as stars, planets, people, and dust and simply focus on perfectly featureless
spacetime, we would find that every direction is just as good as (i.e., equivalent
to) every other direction. This is rotational invariance and it, like the space
and time translations, is both continuous and gives rise to its own conserved
quantity, angular momentum in this case. Conservation of angular momentum,
for example, is why a figure skater can spin faster if she pulls in her arms.
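
The skater example can be put into numbers with a rough sketch. It relies on a modeling assumption that goes beyond the text: the skater is treated as a rigid body whose angular momentum is the product of moment of inertia and spin rate, so that I1 × w1 = I2 × w2. The function name and the sample values are purely hypothetical.

```python
def final_spin_rate(initial_inertia, initial_rate, final_inertia):
    """Spin rate after the moment of inertia changes, assuming I1*w1 == I2*w2."""
    return initial_inertia * initial_rate / final_inertia

# Hypothetical values: arms out, then arms pulled in (inertia halves).
arms_out_inertia = 3.0   # kg·m², illustrative only
arms_in_inertia = 1.5    # kg·m², illustrative only
arms_out_rate = 2.0      # revolutions per second

arms_in_rate = final_spin_rate(arms_out_inertia, arms_out_rate, arms_in_inertia)
print(arms_in_rate)

# Angular momentum before and after should agree.
print(arms_out_inertia * arms_out_rate, arms_in_inertia * arms_in_rate)
```

Halving the moment of inertia doubles the spin rate in this model, which is the effect described above.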

Noether’s theorem is yet another example of how symmetries show profound
connections among seemingly disparate ideas. It is this mysterious power to
bring some rhyme and reason to our world that compels mathematicians to
study the structure of symmetries and groups. There are undoubtedly many
surprising connections left to uncover via the power of group theory.

UNIT 6 at a glance

SECTION 6.2

Types of Symmetry Leading to Groups
• Symmetry is a basic notion in the visual arts.
• Rotation, reflection, and translation are the most common types of visual
symmetry.
• The symmetries of the equilateral triangle can be thought of as the
transformations (i.e., motions or actions) that leave the triangle invariant,
looking the same as before the motion.
• Combinations of symmetries are also symmetries.
• A group is a collection of objects that obey a strict set of rules when acted
upon by an operation.

SECTION 6.3

Frieze and Wallpaper Groups
• The symmetries of an infinite sine wave form a group.
• Friezes are patterns along a line, which are commonly used in art.
• All friezes fall into one of seven general types, each of which is a
mathematical group.

SECTION 6.4

Card Shuffling
• The symmetries of a geometric object can be expressed as permutations
of elements.
• Permutations can form a group, as symmetries do.
• The symmetries of permutations are evident in card shuffling.
• Eight perfect “riffle” shuffles will return a standard deck of cards to the
order in which it started.

SECTION 6.5

Galois Theory
• The study of permutation groups is related to the roots of algebraic
equations.
• Many mathematicians, including Evariste Galois, tackled the question of
whether or not a polynomial has a general solution by radicals.
• The general quintic equation cannot be solved by a formula.
• Galois theory can determine the solvability by formula of an equation by
examining the ways that permutation groups of its roots behave.

SECTION 6.6

Physics
• Discrete symmetries are responsible for physical invariance.
• CPT-symmetry is a fundamental prediction of the standard model of particle
physics and is due to the interaction of three kinds of discrete symmetries.
• Noether’s theorem implies that every conserved physical quantity is based
on a continuous symmetry.

BIBLIOGRAPHY

PRINT

Ash, Avner and Robert Gross. Fearless Symmetry: Exposing the Hidden Patterns
of Numbers. Princeton, NJ: Princeton University Press, 2006.

Bashmakova, Isabella and Galina Smirnova. The Beginnings and Evolution of
Algebra, trans. Abe Shenitzer. USA: Dolciani Mathematical Expositions, Number
23, Mathematical Association of America, 2000.

Bayer, Dave and Persi Diaconis. “Trailing the Dovetail Shuffle to its Lair,”
The Annals of Applied Probability, vol. 2, no. 2 (May 1992).

Berlinghoff, William P. and Kerry E. Grant. A Mathematics Sampler: Topics for the
Liberal Arts, 3rd ed. New York: Ardsley House Publishers, Inc., 1992.

Byers, Nina. “E. Noether’s Discovery of the Deep Connection Between
Symmetries and Conservation Laws.” Cornell University Library.
http://arxiv.org/abs/physics/9807044 (accessed 2007).

Dolan, L. “The Beacon of Kac-Moody Symmetry for Physics,” Notices of the AMS,
vol. 42, no. 12 (December 1995).

Fraleigh, John B. A First Course in Abstract Algebra, 6th ed. Reading, MA: Addison-
Wesley Publishing Company, 1997.

Gribbin, John. The Search for Superstrings, Symmetry, and the Theory of
Everything. Boston, MA: Little, Brown and Company, 1998.

Gross, David J. “The Role of Symmetry in Fundamental Physics,” Proceedings of
the National Academy of Sciences of the United States of America, vol. 93, no. 25
(December 10, 1996).

Kostant, B. “The Graph of the Truncated Icosahedron and the Last Letter of
Galois,” Notices of the AMS, vol. 42, no. 9 (September 1995).

Johnston, Bernard L. and Fred Richman. Numbers and Symmetry: An Introduction
to Algebra. Boca Raton, FL: CRC Press, 1997.

Lederman, Leon M. and Christopher T. Hill. Symmetry and the Beautiful Universe.
Amherst, NY: Prometheus Books, 2004.

Livio, Mario. The Equation That Couldn’t Be Solved: How Mathematical Genius
Discovered the Language of Symmetry. New York: Simon & Schuster, 2005.

Miller, Gerald A. “Big Break for Charge Symmetry.” IOP Publishing.
http://physicsweb.org/articles/world/16/6/3 (accessed 2007).

Morris, S. Brent. Magic Tricks, Card Shuffling, and Dynamic Computer Memories.
Washington, DC: Mathematical Association of America, 1998.

Rockmore, Dan. Stalking the Riemann Hypothesis. New York: Vintage Books
(division of Randomhouse), 2005.

Tannenbaum, Peter. Excursions in Modern Mathematics, 5th ed. Upper Saddle


River, NJ: Pearson Education, Inc.,2004.

Weyl, Hermann. Symmetry (Princeton Science Library) Princeton, NJ: Princeton


University Press, 1989.

TEXTBOOK
Unit 7
UNIT 07
Making Sense of Randomness
TEXTBOOK

UNIT OBJECTIVES

• Mathematical consideration and understanding of chance took a curiously
long time to arise.

• The outcome of any particular event can be unpredictable, but the distribution of
outcomes of many independent events that are trials of a single experiment can be
predicted with great accuracy.

• The simple probability of a particular (“favorable”) outcome is the ratio of outcomes
that are favorable to the total number of outcomes.

• The Galton board is a useful model for understanding many concepts in probability.

• The Law of Large Numbers says that theoretical and experimental probabilities
agree with increasing precision as one examines the results of repeated
independent events.

• The distribution that results from repeated binary events is known as a
binomial distribution.

• A standard deviation is a measurement of the spread of the data points around
the average (mean).

• A normal distribution is determined solely by its mean and standard deviation.

• The Central Limit Theorem says that the average of repeated independent events is,
in the long run, normally distributed.

• Conditional probability and Markov chains provide a way to deal with events that are
not independent of one another.

• The BML Traffic model represents the frontier of modern
probabilistic understanding.

The huger the mob, and the greater the
apparent anarchy, the more perfect is its sway.
It is the supreme law of unreason. Whenever
a large sample of chaotic elements are taken
in hand and marshaled in the order of their
magnitude, an unsuspected and most beautiful
form of regularity proves to have been latent
all along.

Sir Francis Galton


UNIT 7 Making Sense of Randomness
textbook

SECTION 7.1

INTRODUCTION

Mathematics is often thought of as an exact discipline. In fact, many people who
practice math are drawn to it because it tackles situations in which there are
clear and predictable answers. There is a certain comfort in the idea that we can
use mathematics to make exact predictions about what happens in the future.
For example, we could use the mathematical formulation of physical laws to
predict the outcome of a coin flip if we knew enough about its size, weight,
shape, initial velocity, initial angle, and its other initial conditions. In practice,
however, we have a very hard time knowing all of the conditions that contribute
to the outcome of a coin flip. In the face of such complexity, we call the flip a
“random” event, one in which the outcome is based solely on chance and not
on any immediately knowable cause. Nonetheless, mathematics has a broad
set of tools to explain and describe events that appear, like the coin toss, to be
random. This set of tools makes up the mathematics of probability.

Does the past determine the future? If an event is truly random, the answer
must be “no.” There would be no way to predict the outcome of a specific
event given knowledge about its previous outcomes. Although it might seem
that situations like this are beyond the reach of mathematics, the truth is that
random events behave quite predictably, as long as one has no interest in the
outcome of any single event. Taken on average, random events are highly
predictable.

Probability theory manifests itself in many ways in our daily lives. Most of
us have insurance of some form or another (house, car, life, etc.). These are
products that we purchase to help mitigate risk in our lives. We often associate
risk with unpredictable outcomes. This could be in the context of a small
business opening in an up-and-coming neighborhood, a commodities trader
making decisions based on how global political situations affect prices, or a
teenager getting behind the wheel for the first time. All of these situations
involve a certain amount of complexity that is functionally unpredictable
on a case-by-case basis. Probability theory, however, shows that there is
paradoxically a large amount of structure and predictability when these
individual situations are examined on a larger scale.

Probability theory shows that we can indeed make useful analyses and
predictions of events that are unpredictable on a case-by-case basis, provided
we look at the bigger picture of what happens when these events are repeated
many times. Concepts such as the Law of Large Numbers and the Central
Limit Theorem provide the machinery to make predictions about these types of
situations with confidence.

One of the most ubiquitous, and familiar, uses of probability is in gambling.
Casinos are the ultimate “players” in using mathematics to foresee the results
of a series of events that, taken individually, are functionally random. Indeed,
the mathematics of probability ensures that while an individual gambler may
have a good night or a lucky streak, in the long run, “the house always wins.”
Have you ever wondered how Las Vegas seems to have vast amounts of money
to spend on glitzy hotels and golf courses in the middle of the desert? Gambling
is a large, lucrative business, and its success is due, in part, to the laws of
probability.

In this unit we will see how probability, the mathematical study of the seemingly
unpredictable, has developed over a period of time to become an extremely
valuable tool in our modern world. We will see its relatively late origins in
European games of chance and its most recent applications in modeling and
understanding our increasingly complex and unpredictable world. We will
ponder how it is that news networks are able to predict the winners of elections
before all the votes have been counted. By the end of this unit, we will have
a sense of how mathematics can be used to make accurate predictions about
unpredictable events.


SECTION 7.2

HISTORY
• Roll Dem Bones
• A Rather Late Development

ROLL DEM BONES
• The mathematical study of probability was probably delayed for centuries
because of mysticism.

Throughout the ages, people have responded to the problem of what to do about
the future in different ways. For many ancient societies, the unknown future
was considered to be the province of the gods. Understanding and making
predictions about this future was left to religious figures and oracles. These
people employed a number of methods and devices with which they supposedly
divined the will of the gods.

Some of the most common tools of the ancient religious diviner were
astragali. These were bones, taken from the ankles of sheep, that would be
cast and interpreted. Astragali commonly had six sides, but they were very
asymmetrical. Often they were cast in groups, with the specific combinations of
values revealing the name of the god who could be expected to affect the future
affairs of the people. For example, if the bones said that Zeus was at work, there
would be reason for hope. If the bones said that Cronos was in charge, then the
people knew to prepare for the worst.

Item 2242 / Kathleen Cohen, KNUCKLE-BONES (2008). Courtesy of Kathleen Cohen.

Item 2241 / Kathleen Cohen, DICE; CHECKERS; AND ASTRAGALUS (FOR KNUCKLE-BONES). (2008).
Courtesy of Kathleen Cohen.

Item 2240 / Kathleen Cohen, DICE (2008). Courtesy of Kathleen Cohen.

Gradually, technology enabled the development of more regularly shaped
“prediction” devices. The first dice, made of pottery, are thought to have
appeared in ancient Egypt. By the time of the flowering of Greek culture, dice
were quite common, both for fortune-telling and for gaming. Dice have always
been popular tools in recreational gaming, or gambling, precisely because they
are thought to be random-event generators. “The roll of the dice” is thought to
be the ultimate unknown, so dice are thought to be somewhat fair arbiters. This
assumes of course that the dice are perfectly symmetrical and evenly weighted,
which early dice often were not. Discoveries of ancient loaded dice reveal that,
even though ancient people did not have a mathematical understanding of
probability, they knew how to weight games in their favor.

One might think that the Greeks, who embraced a central role for mathematics
in the world of the mind, would have discovered the features of probabilistic
thinking. Evidence shows that they did not. It is thought that the Greeks deemed
matters of chance to be the explicit purview of the gods. According to this view,
they believed that any attempt to understand what happens and what should
happen was a trespass into the territory of the gods. It was not of human concern.

Additionally, the Greeks favored understanding based on logical reasoning over
understanding based on empirical observations. One of the concepts at the
heart of our modern understanding of probability is concerned with how actual
results compare with theoretical predictions. This type of empirical thinking
often took a back seat to logical axiomatic arguments in the mathematics of
ancient Greece.

The mathematics of probability went undiscovered for centuries, but gambling,
especially with dice, flourished. It seems that dice, in some form or another,
have been a constant feature of civilization from the time of the Greeks onward.
The Romans were fond of them, as were the knights of the Middle Ages, who
played a game called Hazard, an early forerunner of the modern game of craps,
thought to have been brought back from the Crusades.

A RATHER LATE DEVELOPMENT


• Renaissance mathematicians took the first strides toward understanding
chance in an abstract way.
• Pascal’s and Fermat’s solutions to the “Problem of the Points” provided
an early glimpse of how to use mathematics to say definite things about
unknown future events.
• Tree diagrams are useful for keeping track of possible outcomes.

It was not until the Renaissance that fascination with dice as an instrument
of gambling led to the first recorded abstract ideas about probability. The man
most responsible for this new way of thinking was a quintessential Renaissance
man, an accomplished doctor and mathematician by the name of Girolamo Cardano.

Cardano was famous in the mathematical world for many things, most notably
his general solutions to cubic equations. As a doctor, he was among the best
of his day. His passion, however, was to be found at the dice table. He was
a fanatic and compulsive gambler, once selling all of his wife’s possessions
for gambling stakes. Out of his obsession grew an interest in understanding
analytically, and, thus, mathematically, the odds of rolling certain numbers
with dice. In particular, he figured out how to express the chances of something
happening as a ratio of the number of ways in which the event could happen to
the total number of outcomes.

For example, what’s the probability of rolling a total of 4 with two regular dice?
There are thirty-six possible equally likely outcomes when a pair of dice is
rolled, and of these only three combinations (1 and 3, 2 and 2, and 3 and 1)
produce a total value of 4. So, the probability of rolling a 4 is 3/36, or 1/12. This
seemingly straightforward observation was the first step toward a robust
mathematical understanding of the laws of chance. Dr. Cardano penned his
thoughts on the matter around 1525, but his discoveries were to go unpublished
until 1663. By that time, two Frenchmen had already made significant progress
of their own.
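
Cardano's counting rule for the two-dice example can be checked by brute force. The following is a minimal sketch in Python (the variable names are illustrative and not part of the original text): it lists all thirty-six equally likely rolls and counts those that total 4.

```python
from fractions import Fraction
from itertools import product

# Every ordered roll of two six-sided dice: (1, 1), (1, 2), ..., (6, 6).
rolls = list(product(range(1, 7), repeat=2))

# The favorable outcomes are the rolls whose faces add up to 4.
favorable = [roll for roll in rolls if sum(roll) == 4]

# Probability = favorable outcomes / total outcomes (Cardano's ratio).
probability = Fraction(len(favorable), len(rolls))

print(len(rolls))    # 36
print(favorable)     # [(1, 3), (2, 2), (3, 1)]
print(probability)   # 1/12
```

Using `Fraction` keeps the result exact, so the ratio 3/36 reduces to 1/12 rather than appearing as a rounded decimal.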

In the mid-1600s, the Chevalier de Méré, a wealthy French nobleman and avid
gambler, wrote a letter to one of the most prominent French mathematicians
of the day, Blaise Pascal. In his letter to Pascal, he asked how to divide the
stakes of an unfinished game. This so-called “problem of the points” was
framed as follows:

Suppose that two men are playing a game in which the first to win six points
takes all the money. How should the stakes be divided if the game is interrupted
when one man has five points and the other three?

Pascal consulted with Pierre de Fermat, another very prominent mathematician
of the day, in a series of letters that would become the basis for much of modern
probability theory. Fermat and Pascal approached the problem in different
ways. Fermat tended to use algebraic methods, while Pascal favored geometric
arguments. Both were concerned basically with counting. They figured that in
order to divide the stakes properly, they could not simply divide them in half,
because that would be unfair to the man who was in the lead at the time of the
game’s cessation. A proper division of the stakes would be based on how likely
it was that each player would have won had the game continued to completion.

The player in the lead could win in one more round. The trailing player would
need at least three more rounds to win. Furthermore, he must win in each
of those three rounds. Therefore, the two Frenchmen reasoned, the proper
division of the stakes should be based on how likely it was that the trailing
player would win three rounds in a row.

The trailing player has a one-in-two chance of winning the next round. Provided
he wins, he then has another one-in-two chance of winning the following round.
So, after two rounds, there are four possible outcomes, only one of which is
favorable to the trailing player. If he should happen to win the first two rounds,
he then again has a one-in-two chance of winning the third round. Let’s take a
look at how all of this information can be represented in a tree diagram.

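[Figure: Tree diagram of the three remaining rounds. P1 has 5 wins; P2 (orange)
has 3 wins. Each branch has probability 1/2, so each of the four two-round
outcomes has probability 1/4 and each of the eight three-round outcomes has
probability 1/8. Result: P1 wins with probability 7/8, P2 with probability 1/8.]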

As we see in the tree diagram above, only one of the eight possible outcomes
results in the trailing player winning the stakes. Therefore, the trailing player
should be awarded 1/8 of the pot, with the remaining 7/8 going to the player who
was winning at the time of the interruption. This method of enumerating and
examining the possible outcomes of random events was a crucial link in the
mathematical conquest of the unpredictable.
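
The same division of the stakes can be reproduced by listing the eight hypothetical three-round sequences. This is only a sketch under the assumptions stated in the text (each round is an even chance, the leader needs one more point, the trailing player needs three); the function name `trailing_player_share` is hypothetical.

```python
from fractions import Fraction
from itertools import product

def trailing_player_share(rounds_needed=3):
    """Share of the pot owed to a trailing player who must win every one of
    `rounds_needed` even-chance rounds, while the leader needs just one."""
    # All equally likely sequences of round winners, e.g. ("P1", "P2", "P2").
    sequences = list(product(("P1", "P2"), repeat=rounds_needed))
    # The trailing player (P2) takes the pot only if he wins every round.
    winning = [s for s in sequences if all(winner == "P2" for winner in s)]
    return Fraction(len(winning), len(sequences))

share_p2 = trailing_player_share()
share_p1 = 1 - share_p2
print(share_p2, share_p1)  # 1/8 7/8
```

In a real game, play would stop as soon as the leader won a round; extending every branch to three rounds, as the tree diagram does, simply makes all eight outcomes equally likely so that they can be counted.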

SECTION 7.3

SIMPLE PROBABILITY AND COUNTING
• A Coin Toss
• The Galton Board
• Enter Pascal’s Triangle

A COIN TOSS
• Simple probability is the ratio of favorable outcomes to the total number of
possible outcomes.

Probability theory enables us to use mathematics to characterize and predict
the behavior of random events. By “random” we mean “unpredictable” in the
sense that in a given specific situation, our knowledge of current conditions
gives us no way to say what will happen next. It may seem pointless to try
to predict the behavior of something that we fundamentally characterize as
unpredictable, but this is exactly what makes the mathematics of probability so
powerful. Let’s think about a coin toss. There is no way to predict the outcome
of a single coin toss. In this sense it is a random event.¹ But we can say some
definite things about it.

The first thing that we can say is that the outcome will definitely be either
heads or tails. Putting this in mathematical terms, we say that the probability
of the coin landing heads up or tails up is 1, or absolutely certain. An event of
probability zero is effectively impossible. In the case of equally likely outcomes,
such as in the dice example above, determining the probability that a particular
outcome occurs basically involves counting. So, we do as Cardano did and
compare the number of ways a specific outcome can happen to how many total
outcomes are possible. The probability of the coin landing heads up would then
be 1 out of 2, or 1/2. There is, of course, the same probability that it will land tails
up. To see this we could start with the probability that the coin will be heads or
tails, 1, and subtract the probability that it will be heads, 1/2. This leaves us with
1/2 as the probability that the coin will not land heads up, in other words, the
probability that it will land tails up.

¹ Now, we have to be a bit careful here because a coin toss would not be random
if we were able to know all of the initial conditions of the toss, but since we can’t
know all of the conditions that affect the outcome, we can treat it as random.

Determining the probability of the possible outcomes of a single coin flip may
not seem that interesting, but it provides a good starting point for understanding
the probabilities associated with any event or series of events that have only
binary outcomes (e.g., heads or tails, win or lose, on or off, left or right).
We can abstract any such situation in the form of a simple machine known as
a Galton board.

THE GALTON BOARD


• The Galton board is a model of a sequence of random events.
• Each marble that passes through the system represents a trial consisting of
as many random events as there are rows in the system.

Imagine a peg fixed in the middle of an inclined board, with the base of the board
divided into two bins of equal size. If we drop a marble towards the peg, it will hit
it and deflect either into the right bin or the left bin. In terms of probability, this
is just like our coin toss from before.

[Figure: A Galton board with one peg and two bins, labeled L and R.]

The two bins represent the possible outcomes for the dropped marble, and
because there is only one way to get into either bin, the probability associated
with a marble ending up in a particular bin is 1/2. The advantage of viewing
binary systems in this way is that it is very easy to build complexity into the
experiments by adding rows of pegs. Let’s add a row of two pegs below the
initial peg, one to the right and one to the left.

[Figure: A Galton board with two rows of pegs and three bins. Path LL leads to
the left bin, paths LR and RL lead to the middle bin, and path RR leads to the
right bin.]

Notice that there are now three bins at the base. We can, as before, figure
out the probability of the marble ending up in any particular bin. We may be
tempted to think that all three bins are equally likely destinations, which would
make the probability for any individual bin 1/3. This ignores the fact that if we
look at the paths a marble can take through the machine, we find that there are
two possible paths to the middle bin, whereas there is only one path leading to
each of the side bins. This suggests to us that we need to count the paths to bins
instead of the bins themselves. With such a simple system, enumerating the
paths is straightforward: LL, LR, RL, RR. With four possible paths, two of which
end up in the middle bin, the probability of the marble ending up in the middle
bin is 2 out of 4, or 1/2. Because each side bin has only one path associated with
it, the probability of the marble ending up in one particular side bin is 1 in 4,
or 1/4. If we add all the probabilities together, we get the probability that the
marble will end up in one of the three bins: 1/2 + 1/4 + 1/4 = 1.
This is what we would expect, because the marble cannot disappear and must,
therefore, end up in one of the three bins. To represent more-involved binary
outcome systems, we can continue adding rows to our machine:

[Figure: A Galton board with three rows of pegs and four bins. Paths by bin,
from left to right: LLL; LLR, LRL, RLL; LRR, RLR, RRL; RRR.]

Here we have shown all possible ways that the marble can traverse three rows
of pegs. In each row the marble hits one peg and deflects either right or left.
This is a good model for any collection of three binary decisions, such as our
problem of the points from the last section. If instead of left and right, you imagine
each peg represents win or lose, you have a nice model of the three rounds that
the two players might hypothetically play to finish their game. Let’s call each
deflection to the left a win for Player 1 and each deflection to the right a win for
Player 2. As we said earlier, Player 2 needs three consecutive wins, so the only
path that would lead him to victory would be the RRR path. Because this is just
one of eight possible paths, his chances of winning are 1 in 8. This means that if
the game is interrupted at this point, Player 2, the one who is behind, should get
1/8 of the pot.
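
Counting paths rather than bins is easy to mechanize for small boards. The sketch below is an illustration, not part of the original text: it assumes, as the text does, that every left/right deflection is equally likely, and it numbers bins from the left by how many R deflections a path contains. The helper name `bin_probabilities` is hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def bin_probabilities(rows):
    """Probability of each bin (left to right) on a Galton board with `rows` rows."""
    # Every path is a string of L/R deflections, one per row, e.g. "LRL".
    paths = ["".join(p) for p in product("LR", repeat=rows)]
    # A path with b deflections to the right ends in bin number b.
    paths_per_bin = Counter(path.count("R") for path in paths)
    return [Fraction(paths_per_bin[b], len(paths)) for b in range(rows + 1)]

print([str(p) for p in bin_probabilities(2)])  # ['1/4', '1/2', '1/4']
print([str(p) for p in bin_probabilities(3)])  # ['1/8', '3/8', '3/8', '1/8']
```

The rightmost entry for three rows, 1/8, is the RRR path, which is Player 2's only route to winning the pot.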

To see why, let’s imagine that someone wishes to take Player 2’s place in the
game, even though he needs an unlikely three consecutive points to win. How
much should this newcomer pay to get into the game (which is the same as
asking how much Player 2 must accept to get out of the game)?

Using the language of the Galton board, the newcomer would need the sequence
RRR to win the entire pot; any other sequence results in a loss of however much
she paid Player 2 to get into the game. Let’s say that this newcomer is rather
cautious and shrewd and, knowing that she is at a disadvantage, wishes to
hedge her main wager (her payment to Player 2) with a series of side bets with
onlookers at the contest. Because there are eight possible outcomes, and she
can win on only one of them, she should place seven side bets to cover every
possible outcome. Each side bet is a 50/50 bet on whether a certain sequence
of events will happen. If the pot is $8, the newcomer should make the following
side bets:

Onlooker A pays $1 if LLL happens and gets $1 if RRR happens.
Onlooker B pays $1 if LLR happens and gets $1 if RRR happens.
Onlooker C pays $1 if LRL happens and gets $1 if RRR happens.
Onlooker D pays $1 if RLL happens and gets $1 if RRR happens.
Onlooker E pays $1 if LRR happens and gets $1 if RRR happens.
Onlooker F pays $1 if RLR happens and gets $1 if RRR happens.
Onlooker G pays $1 if RRL happens and gets $1 if RRR happens.

In the event that RRR happens, the newcomer would win the $8 pot and owe a
total of $7 on all of her side bets, resulting in a gain of $1. If any sequence other
than RRR happens, the newcomer would get $1 from one of her side bets (the
rest would be ties) and the pot would go to Player 1. No matter what happens,
the newcomer ends up with $1, so to enter the contest she should pay Player 2
no more than $1.
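
The claim that the hedged newcomer ends up with $1 in every case can be verified outcome by outcome. The sketch below is an illustration only; it leaves out the entry fee paid to Player 2, and the names `POT`, `side_bets`, and `newcomer_net` are hypothetical.

```python
from itertools import product

POT = 8  # size of the pot in dollars, as in the example above

# The eight possible sequences for the three remaining rounds.
outcomes = ["".join(p) for p in product("LR", repeat=3)]

# One side bet per losing sequence: the onlooker pays the newcomer $1 if that
# sequence occurs and collects $1 from her if RRR occurs.
side_bets = [sequence for sequence in outcomes if sequence != "RRR"]

def newcomer_net(outcome):
    """Newcomer's total from the pot and the side bets, before her entry fee."""
    pot_winnings = POT if outcome == "RRR" else 0
    side_total = 0
    for sequence in side_bets:
        if outcome == sequence:
            side_total += 1   # this onlooker pays her $1
        elif outcome == "RRR":
            side_total -= 1   # she pays this onlooker $1
        # any other outcome leaves this particular bet a tie
    return pot_winnings + side_total

for outcome in outcomes:
    print(outcome, newcomer_net(outcome))  # every line ends in 1
```

Because the result is $1 regardless of the sequence, $1 is the price at which neither the newcomer nor Player 2 gains an edge.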

The newcomer, of course, does not actually have to make all of the side bets. In
fact, if she hopes to gain anything from a fair gamble, she shouldn’t, because
she would then be guaranteed to break even. However, considering these bets, also
known as hedges, helps in figuring out what the fair price is for entrance to the
game. Besides, if someone offered her the opportunity to play for less than a
dollar, then using these side bets she is guaranteed to make a profit. Such a
guarantee of profit is called an arbitrage opportunity, and this view of probability
as the hedge-able fair price plays a fundamental role in applying probability
theory to finance. It also happens to be the way that early thinkers such as
Fermat and Pascal viewed probability.

The newcomer should pay $1 to Player 2, which means that Player 2 could walk
away from the game at this point with $1, or 1/8 of the total pot. So, this is also
the amount Player 2 should get if the game is interrupted, because in both cases
he is leaving the unresolved game.

ENTER PASCAL’S TRIANGLE


• For Galton boards with many rows, the task of enumerating paths is greatly
facilitated by using Pascal’s Triangle.

Let’s now return to the Galton board and see what happens as we continue to
add rows.

[Figure: A Galton board with four rows of pegs and five bins. Paths by bin, from
left to right: LLLL; LLLR, LLRL, LRLL, RLLL; LLRR, LRLR, LRRL, RLLR, RLRL, RRLL;
LRRR, RLRR, RRLR, RRRL; RRRR.]

We notice that it quickly becomes unwieldy to enumerate every path. It would be
nice to have some easy way to find the number of paths to each bin, given how
many rows, or rounds, or decisions there are. Fortunately we can model this
situation, as we did in our discussion of combinatorics in Unit 2, using Pascal’s Triangle:

                1
              1   1
            1   2   1
          1   3   3   1
        1   4   6   4   1

In Unit 2, we found the generalization that the number of paths from the top of
Pascal’s Triangle to the kth “bin” in the nth row (counting the top row as n = 0
and the leftmost bin as k = 0) is given by:

\[ \frac{n!}{k!\,(n-k)!} \]

Let’s verify that there are indeed six paths to the middle bin (k = 2) of the fifth
row (n = 4):

\[ \frac{4!}{2!\,(4-2)!} = \frac{4 \cdot 3 \cdot 2 \cdot 1}{(2 \cdot 1)(2 \cdot 1)} = \frac{24}{4} = 6 \]

Adding the path totals for each bin in this row gives the total number of paths
available to the marble, 1 + 4 + 6 + 4 + 1 = 16. The probability that the marble
will end up in the middle bin is, therefore, 6/16, or 3/8.
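
For larger boards, the formula replaces path-by-path enumeration. The following sketch uses Python's standard `math.comb` function (available in Python 3.8 and later) to evaluate n!/(k!(n–k)!); the helper names `paths_to_bin` and `bin_probability` are illustrative, and equal left/right chances are assumed as in the text.

```python
from fractions import Fraction
from math import comb

def paths_to_bin(n, k):
    """Number of paths to bin k (counted from 0) after n rows: n! / (k! (n-k)!)."""
    return comb(n, k)

def bin_probability(n, k):
    """Probability of bin k when each of the 2**n paths is equally likely."""
    return Fraction(paths_to_bin(n, k), 2 ** n)

row = [paths_to_bin(4, k) for k in range(5)]
print(row)                    # [1, 4, 6, 4, 1]
print(sum(row))               # 16
print(bin_probability(4, 2))  # 3/8
```

The total of 16 is the same as 2 to the 4th power, because each of the four rows doubles the number of possible paths.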

Our discussion up until now has been quite theoretical. We have used the
power of combinatorics to enumerate all the paths available to a marble
traveling through the Galton board, and we have calculated probabilities
associated with those paths, assuming each individual path has an equal
probability. If we were actually to perform such an experiment, however, we
would have no way of knowing in which bin a single marble will end up. We can
speak only generally. However, even this general view has great power, as we
shall see in the next section.

SECTION 7.4

LAW OF LARGE NUMBERS
• A Change in Perspective
• What Did You Expect?

A CHANGE IN PERSPECTIVE
• Bernoulli’s Law of Large Numbers shifted the thinking about probability
from determining short-term payoffs to predicting long-term behavior.

In the preceding section, with the use of the Galton board, we found the
theoretical probability that a marble will end up in any specific bin. Now let’s
turn our attention to what actually happens when we let a marble go through the
board; furthermore, let’s see what happens when many marbles go through it!

Each path is equally likely, and we have to assume that marbles dropped
randomly into our machine are not predestined to follow any particular path.
Because the number of paths to each of the bins varies, we should expect that
over time, bins that have more paths leading to them will end up with more
marbles than bins that have fewer paths leading to them. Thus, the distribution
of a large number of marbles through the machine will not be even. The Law
of Large Numbers will help us to predict roughly the distribution that we would
find were we to run such an experiment.

The Law of Large Numbers says that when a random process, such as dropping
marbles through a Galton board, is repeated many times, the frequencies of
the observed outcomes get increasingly closer to the theoretical probabilities.
Jacob Bernoulli, the man who is credited with discovering this law around the
beginning of the 18th century, is said to have claimed that this observation
was so simple that even the dullest person knows it to be true. Despite this
pronouncement, it took him over 20 years to develop a rigorous mathematical
proof of the concept.

Let’s look at the Law of Large Numbers in terms of the Galton board:

[Figure: Many marbles passing through the two-row Galton board.]

Recall that we found the probability of a marble ending up in the middle bin to be
1/2. According to the Law of Large Numbers, if we ran 100 marbles through this
setup, about 50 of them would end up in the middle bin. If we ran 1000 marbles,
about 500 would end up in the middle. Furthermore, as we run more marbles
through the board, the proportion in the middle bin will get closer and closer to 1/2.

Bernoulli may have thought that this concept is self-evident, but it nevertheless
is striking. Recall that we can’t say with any certainty at all where one particular
marble will end up. Still, we can say with very high accuracy how 1000 marbles
will end up. Better yet, the more marbles we run, the better our prediction will
be. The Law of Large Numbers is a powerful tool for taming randomness.
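
A small simulation can make this behavior concrete. The sketch below is an assumption-laden illustration, not an experiment from the text: it models each deflection on the two-row board as a fair pseudo-random choice and reports the fraction of marbles that land in the middle bin. The function name `middle_bin_proportion` and the seed value are arbitrary.

```python
import random

def middle_bin_proportion(marbles, rng):
    """Fraction of simulated marbles that land in the middle bin of a two-row board."""
    middle = 0
    for _ in range(marbles):
        first = rng.choice("LR")    # deflection at the top peg
        second = rng.choice("LR")   # deflection at the second-row peg
        # Paths LR and RL (one left, one right) end in the middle bin.
        if first != second:
            middle += 1
    return middle / marbles

rng = random.Random(7)  # fixed seed so that repeated runs give the same numbers
for marbles in (100, 1_000, 10_000, 100_000):
    print(marbles, middle_bin_proportion(marbles, rng))
```

The printed proportions need not approach 1/2 steadily at every step, but large deviations from 1/2 become increasingly unlikely as the number of marbles grows.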

WHAT DID YOU EXPECT?
• Expected value is the average result of a probabilistic event, computed by
taking into account possible results and their respective probabilities.
• Expected value is a key concept in both gambling and insurance.

The notion of expected value, or expectation, codifies the “average behavior”
of a random event and is a key concept in the application of probability. For
example, imagine that you are a door-to-door salesperson. Your experience
tells you that the probability of making a sale, and thus a commission, on each
try is as follows: 8/10 that you make no sale and make no commission, 3/20 that
you make a small sale that leads to a $100 commission, and 1/20 that you make
a large sale that leads to a $500 commission.² How much can you expect to
make, on average, per appointment? That is, what do you “expect” to be the
value of the total sales divided by the number of appointments? This will be an
expected value.

The expectation or expected value of a random process in which each outcome
has a particular payoff is simply the sum of the individual probabilities
multiplied by their corresponding payoffs. If P_1 is the probability of outcome
#1 and V_1 is the payoff value of that outcome, and so on (P_j and V_j for the jth
outcome and jth payoff value respectively), then the expected value can be
represented by the expression:

\[ P_1 V_1 + P_2 V_2 + \cdots + P_n V_n \]

where n is the number of possible outcomes.

In our sales example, the individual terms are as follows: for a no-sale,
8/10 × $0 = $0; for the small commission, 3/20 × $100 = $15; and for the large
commission, 1/20 × $500 = $25. Thus, the expected value of the sales call
payoff, per appointment, is $0 + $15 + $25 = $40. This, of course, does not mean
that you will make $40 for every appointment, but it is what you can “expect” to
make on average over a period of time (assuming that your probabilities are
correct!).
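
The same sum of probability-times-payoff terms can be written as a short
program. This is a minimal sketch; the helper name `expected_value` and the
list `sales_call` are illustrative, and exact fractions are used so that no
rounding creeps in.

```python
from fractions import Fraction

def expected_value(outcomes):
    """Sum of probability * payoff over (probability, payoff) pairs."""
    total_probability = sum(p for p, _ in outcomes)
    if total_probability != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(p * v for p, v in outcomes)

# (probability, commission in dollars) for each outcome of a sales call
sales_call = [
    (Fraction(8, 10), 0),    # no sale
    (Fraction(3, 20), 100),  # small sale
    (Fraction(1, 20), 500),  # large sale
]
print(expected_value(sales_call))  # 40
```

The check that the probabilities sum to 1 guards against leaving out an
outcome, which would silently distort the result.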

² Note that, colloquially, it might seem more natural to speak in terms of “odds,”
as in the “odds of making a sale.” Odds are computed as ratios of alternative
probabilities. If the probability of not making a sale is 8/10, then the
probability of making a sale is 2/10, and the odds of not making a sale are
(8/10)/(2/10), more commonly expressed as “4 to 1.”

The Law of Large Numbers ensures that the more sales calls you make, the
closer your average payoff, per appointment, will be to $40.

The concept of expected value, in conjunction with the Law of Large Numbers,
helps form the operating principle of businesses that are based on risk. Two
prominent examples are casinos and insurance companies. Let’s look a little
more closely at each.

Any single casino game carries a certain risk for both the player and the
house. A player’s loss is the house’s gain and vice versa. It would seem that
no business could thrive in such a zero-sum situation, yet generally, the casino
business is quite lucrative. This is possible because while the individual player’s
risk is concentrated in a small number of hands or rounds of a game, the
casino’s risk is spread out among all the games and all the bets going on. In
short, casinos have the Law of Large Numbers working in their favor. Owners
and managers of casinos know that while the outcome of any single game
is unpredictable, the outcome of many rounds of that same game is entirely
predictable. For instance, they know that the probability of rolling a seven at
the craps table is 1/6. Averaging this over many rolls means that a player will
roll a seven 1/6 of the time. In other words, in a group of six players, only
one, on average, will be a winner. The casinos then structure their payoffs or
“odds” slightly in their favor so that the money paid out to any player who wins
will be more than offset by the money taken in from the five players who, on
average, don’t win. Note that this does not require any sort of rigging or
cheating as far as actual game play. Casinos don’t need to cheat the individual
gambler; as long as they keep their doors open, the odds settle in their favor.
They’ve structured
their payoffs to guarantee it in the long run and because they generally have
more working capital than any of the players, they can take advantage of the
long-term “guarantees.”
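
To see how a payoff structured “slightly in their favor” shows up as an
expected value, consider a hypothetical one-roll bet on seven. The 4-to-1
payout used below is an assumption chosen for illustration and is not taken
from the text; only the 1/6 probability comes from the discussion above.

```python
from fractions import Fraction

P_SEVEN = Fraction(1, 6)  # probability of rolling a seven, from the text
ASSUMED_PAYOUT = 4        # hypothetical payout of 4 to 1 per $1 bet

def player_expected_value(p_win, payout):
    """Expected gain per $1 bet: win `payout` with probability p_win, else lose $1."""
    return p_win * payout - (1 - p_win) * 1

print(player_expected_value(P_SEVEN, ASSUMED_PAYOUT))  # -1/6
print(player_expected_value(P_SEVEN, 5))               # 0
```

Under these assumptions the player loses about 17 cents per dollar bet on
average, while a payout of 5 to 1 would make the bet break even. Over many
bets, the Law of Large Numbers turns that small average into steady income for
the house.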

Insurance companies use similar principles to set premiums. They spend a
great deal of effort and resources calculating the odds of certain catastrophes,
such as a house fire, then multiply this value by the payoff they would give in
such an event. This amount is how much the company can expect to have to pay,
on average, for each person that they cover. They then set their rates at levels
that cover this “expense” in addition to providing their profit. The policyholder
gets peace of mind because the insurance company has effectively mitigated the
risk of potential loss in a given catastrophe. The insurance company gets a flow
of regular payments in exchange for a massive payoff in the unlikely event of a
big claim.

The Law of Large Numbers is a powerful tool that enables us to say definite
things about the real-world results of accumulated instances of unpredictable
events. This useful tool represents just one example of how mathematics can
be used to deal with randomness. The Law of Large Numbers applies to specific
outcomes and their probabilities, but what about the entire range of possible
outcomes and their associated probabilities? Just as the frequency of a specific
event will tend toward its probability over the long run, the full set of possible
outcomes will each tend to their own probabilities. Studying the distribution of
possible outcomes and probabilities will give us even more powerful tools with
which to predict long-term average behavior.

SECTION 7.5
THE GALTON BOARD REVISITED
• Distribution
• Tilting the Board
• 68-95-99.7

DISTRIBUTION
• Each bin of the Galton board has an associated probability, and looking at all
of the bins simultaneously gives a distribution.
• Because each peg represents a right-or-left, or binary, decision/option, the
distribution of probabilities is called a binomial distribution.

Let’s return to our Galton board for further exploration. In the previous section,
we used the Law of Large Numbers to see that, for each bin, the theoretical and
experimental probabilities get closer and closer to one another as we put more
and more marbles through the process. We’re going to return to theoretical
probabilities now and examine how the probabilities are distributed across all of
the bins.

Recall that in our earlier examples the bins in the middle had higher
probabilities than did the bins at the sides. This was because there are more
paths that terminate in the middle bins than terminate in the side bins. Let’s
make a histogram that correlates to the bins and their probabilities.

First, a simple two-row machine:

[Figure: two-row Galton board (left/right deflections) with a histogram of the
bin probabilities. Bins, left to right: 1/4, 1/2, 1/4. Axes: Bins (horizontal),
Probabilities (vertical).]

Notice that the distribution is symmetric; the probabilities for both the far
right and far left bins are equal. Let’s look at the histogram for a Galton board
with four rows.

[Figure: four-row Galton board showing every path from LLLL to RRRR, with a
histogram of the bin probabilities. Bins, left to right: 0.0625, 0.25, 0.375,
0.25, 0.0625. Axes: Bins (horizontal), Probabilities (vertical).]

Notice that the probabilities for each version of the system are distributed
across all of the bins and that even though the individual probabilities change
as we add more bins, they always sum to 1. This is in keeping with our intuition
that any marble must indeed end up in exactly one of the bins. At this point,
we’re going to need to label the bins so that we can discuss results in more
detail. To do this, let’s say that each marble, as it progresses through the
system, gets 1 point for each movement to the right and 0 points for each
movement to the left. Each bin can then be represented by the summative “score”
of a marble that ends up there.

[Figure: a marble’s path, scoring +0, +1, +1, +0, +1 at successive pegs. This
ball has a score of 3.]

For example, a marble passing through a four-row system would have a
maximum possible score of 4, corresponding to the right-most bin, and a
minimum possible score of zero, corresponding to the left-most bin. The
remaining bins would have scores as follows:

[Figure: the five bins of a four-row board labeled, left to right, with the
scores 0, 1, 2, 3, 4.]

The average result, which would be the expected score for one average marble,
can be found as before by multiplying the value of each bin by the probability of
landing there and summing the results. This would give us

\[ \left(\tfrac{1}{16} \times 0\right) + \left(\tfrac{4}{16} \times 1\right) + \left(\tfrac{6}{16} \times 2\right) + \left(\tfrac{4}{16} \times 3\right) + \left(\tfrac{1}{16} \times 4\right) = \tfrac{32}{16} = 2. \]

Interestingly, we can also find this by looking at the number of rows and
multiplying by the probability of going right. This would be 4 rows times
1/2 = 2. Recall that the number of rows corresponds to the maximum score that a
marble can get. Multiplying this maximum score by the probability of going right
at each peg gives the expected value. We’ll need this method in just a bit.
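
Both routes to the expected score can be checked in a few lines. This is a
minimal sketch assuming a fair board (every peg a 50-50 split); the function
name `fair_board_distribution` is illustrative.

```python
from fractions import Fraction
from math import comb

def fair_board_distribution(rows):
    """Probability of each score 0..rows when every peg is a 50-50 split."""
    total_paths = 2 ** rows
    # comb(rows, k) counts the paths with exactly k moves to the right.
    return [Fraction(comb(rows, k), total_paths) for k in range(rows + 1)]

dist = fair_board_distribution(4)
print([str(p) for p in dist])   # ['1/16', '1/4', '3/8', '1/4', '1/16']
print(sum(dist))                # 1

mean = sum(score * p for score, p in enumerate(dist))
print(mean, 4 * Fraction(1, 2))  # 2 2
```

The fractions print in lowest terms, so 4/16 appears as 1/4 and 6/16 as 3/8.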

Furthermore, we want to mathematically describe how the values of the scores
of each marble are spread out around the mean. In other words, we need a way
to describe how random results vary from their expected value. This would give
us a sort of sensible ruler that we can use. To do this, it makes sense to look at
the difference between the score of each marble and the mean, so we
subtract the two quantities.

\[ x - m \]

where x = a marble’s score (the value of its bin) and m = the mean

Because this quantity is something like a notion of distance, we square it to
ensure that the value won’t be negative.

\[ (x - m)^2 \]

We then multiply this squared difference by the probability of ending up in that
bin.

\[ P(x - m)^2 \]

Finally, if we add all of these terms together, we will get a number that
describes how the scores of the bins are distributed around the mean.
This is known as the variance.

\[ \sum_{i=1}^{n} (x_i - m)^2 P_i \]

Here \(x_i\) and \(P_i\) are the score and probability of the ith bin, and the
sum runs over all n bins.

Because the variance is based on the square of the difference between a result
and its expected value, it scales somewhat awkwardly. For example, if the
difference between a score and the mean changes by a factor of three, the
variance would change by a factor of \(3^2\), or 9. To mitigate this so that our
ruler scales in a more sensible manner, we can take the square root of our final
result. Taking the square root of the variance gives us a measure of the typical
difference between a marble’s score and the mean. This is known as the standard
deviation.
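
As a worked instance of these steps, the sketch below computes the variance and
standard deviation for the fair four-row board. It assumes the bin
probabilities 1/16, 4/16, 6/16, 4/16, 1/16 found earlier; the variable names
are illustrative.

```python
from fractions import Fraction
from math import comb, sqrt

rows = 4
# Bin probabilities for a fair board: (number of paths to bin k) / (all paths).
dist = [Fraction(comb(rows, k), 2 ** rows) for k in range(rows + 1)]

mean = sum(x * p for x, p in enumerate(dist))
# Squared distance from the mean, weighted by the probability of each bin.
variance = sum((x - mean) ** 2 * p for x, p in enumerate(dist))
std_dev = sqrt(variance)

print(mean, variance, std_dev)  # 2 1 1.0
```

For this board the variance and the standard deviation both happen to equal 1,
so a typical marble lands about one bin away from the middle.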

The number of bins is determined by the number of rows in the system. Let’s
call the number of rows n; the bins then carry the scores 0 through n, so there
are n + 1 of them. Notice that the maximum score is also n. Remember that
this setup can represent many different situations, such as the result of n coin
tosses, or any other sequence of two-outcome events, regardless of whether or
not the odds of each individual event are 1 to 1, sometimes referred to as a
“50-50 chance.”

TILTING THE BOARD
• If we assign a probability other than 1/2 to each peg of the Galton board, the
mean of the distribution will shift.

What would happen in a situation in which each individual event has a probability
other than 1/2? Let’s return to dice for a moment, and then see how we can model
this on our Galton board. For instance, let’s say we want to roll a 5 with one die.
We either roll a 5 or we don’t, so this situation is binary, but unlike the coin toss,
the odds are not 1 to 1. The probability of rolling a 5 with one die is 1/6: there is
only one way to roll a 5, whereas there are five possible results other than a 5.
We can model this using a Galton board by equating right deflections with rolling
a 5 and left deflections with rolling anything else. If we then tilt the board such
that each marble has a 1/6 chance of going to the right at each peg and a 5/6
chance of going to the left, we have a great model for our problem. It’s like
having a “biased coin,” one in which the probability of getting a head is only
1/6 and the probability of getting a tail is 1 - 1/6, or 5/6.

With this model, it is easy to answer questions such as, what is the probability of
rolling a 5 exactly once in four rolls? In terms of our modified, tilted system, this
correlates to a marble going through four rows and deflecting to the right only
once, ending up in bin 1. To find the probability that a marble will end up in bin
1, which is the same probability of rolling one 5 in four rolls, we can no longer
simply count paths as we did before, because not all paths are equally likely.
Nonetheless, we can use our path count as a starting point.

[Figure: a regular Galton board vs. a tilted Galton board.]

In our four-row system, we know that there are four possible paths to bin 1.
Now, instead of looking at the ratio of the number of paths ending in bin 1 to the
total number of paths to find the probability of ending up in bin 1, we can think
about the probability of each specific path occurring. Each path is a sequence of
four events, and each event is either a left (L) or right (R) shift in direction. The
four paths to bin 1 are thus LLLR, LLRL, LRLL, and RLLL. The probability for each
of these paths is the product of the probabilities of the individual events in the
sequence. For example, the path LLLR has a probability of (5/6)(5/6)(5/6)(1/6).
The path RLLL has the probability (1/6)(5/6)(5/6)(5/6). Notice that all the paths
to bin 1 have the same probability. Therefore, to find the probability of ending
up in bin 1, we can just add the probabilities of taking the specific paths that
end in bin 1. Since all of these probabilities are the same, we can simply
multiply the probability for one path, 125/1296, times the number of paths (4) to
get 500/1296, or about 0.386.
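
The path-by-path reasoning can be reproduced by listing every four-step path
and keeping those with exactly one right turn. This is a minimal sketch; the
names `path_probability` and `paths_to_bin_1` are illustrative.

```python
from fractions import Fraction
from itertools import product

P_RIGHT = Fraction(1, 6)  # chance of rolling a 5 (a right deflection)
P_LEFT = 1 - P_RIGHT      # chance of anything else (a left deflection)

def path_probability(path):
    """Multiply the probabilities of the individual turns along one path."""
    prob = Fraction(1)
    for step in path:
        prob *= P_RIGHT if step == "R" else P_LEFT
    return prob

# All four-step paths with exactly one move to the right end in bin 1.
paths_to_bin_1 = ["".join(p) for p in product("LR", repeat=4) if p.count("R") == 1]
print(paths_to_bin_1)  # ['LLLR', 'LLRL', 'LRLL', 'RLLL']

total = sum(path_probability(p) for p in paths_to_bin_1)
print(total, float(total))
```

The total prints as 125/324, which is 500/1296 in lowest terms.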

We can generalize this thinking to arrive at an expression that will tell us
the probability of landing in the kth bin of a system with n rows, in which the
probability of going to the right at each peg is p and the probability of going to
the left is 1-p. We multiply the number of paths, \(\frac{n!}{k!(n-k)!}\), times
the probability of going right, p, to the kth power, times the probability of
going left, (1-p), to the (n-k)th power (because if you go right k times, you
necessarily go left the rest of the time). The probability of landing in the kth
bin is then:

\[ \frac{n!}{k!(n-k)!}\, p^k (1-p)^{n-k} \]

[Figure: histogram of the bin probabilities for n = 4 on the tilted board.]

Bin (k)        0      1      2      3      4
Probability   .482   .386   .116   .015   .001

Using p = 1/6 and (1-p) = 5/6, from our dice example, we see that the distribution
of probabilities after four rows on the Galton board has shifted to the left
somewhat from what it was for the p = (1-p) = 1/2 situation of the fair coin toss.
Intuitively it makes sense that, if a marble has a greater chance of going left
than right at each peg, then there is a greater chance that it will end up in the
left bins.
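
The general expression translates directly into code, which makes it easy to
compare the tilted and fair boards side by side. This is a minimal sketch; the
function name `bin_probability` is illustrative.

```python
from math import comb

def bin_probability(n, k, p):
    """Probability of landing in bin k of an n-row board with right-turn chance p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

tilted = [bin_probability(4, k, 1 / 6) for k in range(5)]
fair = [bin_probability(4, k, 1 / 2) for k in range(5)]

print([round(x, 3) for x in tilted])  # [0.482, 0.386, 0.116, 0.015, 0.001]
print([round(x, 4) for x in fair])    # [0.0625, 0.25, 0.375, 0.25, 0.0625]
```

The tilted values pile up in bins 0 and 1, while the fair values are symmetric
about bin 2.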

Let’s look at how this affects the average marble’s score. We’ll need to find the
mean again, and we can do this, as we did before, by multiplying the number of
rows by the probability of deflecting to the right (4 rows × 1/6 = 2/3).

So, shifting the probability at each peg from 1/2 to 1/6 both moves the entire
distribution of probabilities to the left and shifts the mean value from
2 to 2/3. We now see how the probability of each event (turn) determines the
overall distribution of outcomes of repeated events (sequences of turns).

Not only does the mean shift, but the variance and standard deviation shift as
well. Recall that these values have to do with how the outcomes are distributed
around the mean. This distribution of probabilities, or outcomes, is called the
binomial distribution, and it is a commonly occurring distribution in sequences
of repeated events in which there are only two possible outcomes for each event.
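
To see the shift in numbers, the sketch below computes the mean and variance of
the four-row board directly from its bin probabilities, once for the fair board
and once for the tilted one. The function name `board_stats` is illustrative.
The results agree with the standard shortcuts n × p for the mean and
n × p × (1-p) for the variance of a binomial distribution; the variance
shortcut is not derived in the text and is mentioned here only as a check.

```python
from fractions import Fraction
from math import comb

def board_stats(n, p):
    """Mean and variance of the score on an n-row board with right-turn chance p."""
    dist = [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]
    mean = sum(k * q for k, q in enumerate(dist))
    variance = sum((k - mean) ** 2 * q for k, q in enumerate(dist))
    return mean, variance

print(board_stats(4, Fraction(1, 2)))  # (Fraction(2, 1), Fraction(1, 1))
print(board_stats(4, Fraction(1, 6)))  # (Fraction(2, 3), Fraction(5, 9))
```

Tilting the board lowers the mean from 2 to 2/3 and the variance from 1 to 5/9,
so the scores are both smaller and more tightly bunched.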

68-95-99.7
• The normal distribution is an ideal distribution that is determined by only its
mean and standard deviation.

The binomial distribution is useful, but it can take a long time to calculate,
especially in situations in which n, the number of events, or the number of rows
in the system, is large. There is an approximation to this distribution, however,
that is much more easily calculated and that provides a reasonably good model
for the probability distribution. It can be found using only the mean and the
standard deviation, and it is known as the normal distribution, familiar to many
of us as the “bell curve.”

[Figure: the normal curve, with the mean, the standard deviation σ on either
side of it, and the points of inflection marked. Axes: Number of bin
(horizontal), Frequency (vertical).]

The normal distribution is related to a model of the distribution of the
probabilities of outcomes of repeated independent events, also called “Bernoulli
trials.” As we can see, it is a bell-shaped curve, and it turns out that it is
characterized by two properties. One distinguishing characteristic is its mean,
which correlates to the central position of the bell around which it is symmetric.
The other characteristic is the standard deviation. In the graph above, this
corresponds to the position where we see a point of inflection on the graph
(there is one on either side of the mean, indicated in the figure above). One
standard deviation is roughly the typical difference between an outcome and the
mean. In terms of percentages, the standard deviation, marked on either side of
the mean, defines the range within which about 68% of the results fall (on
average). In other words, if scores on a test were normally distributed, about
68% of the students would fall within one standard deviation of the mean. For
example, if the mean were 65 and the standard deviation 7, then about 68% of
students would score between 58 and 72. What’s more, about 95% of students would
have scores within two standard deviations of the mean (between 51 and 79), and
about 99.7% of students would have scores within three standard deviations of
the mean. For this example, only about 0.3% of students would have scores higher
than 86 or lower than 44. This is commonly known as the 68-95-99.7 rule for
normal distributions.
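
The percentages in this rule can be recovered numerically. The sketch below
relies on a standard fact that the text does not derive: for a normal
distribution, the share of results within k standard deviations of the mean is
erf(k/√2), where erf is the error function available in Python’s math module.
The function name `share_within` is illustrative.

```python
from math import erf, sqrt

def share_within(k):
    """Share of a normal distribution lying within k standard deviations of the mean."""
    return erf(k / sqrt(2))

mean, sd = 65, 7  # the test-score example from the text
for k in (1, 2, 3):
    print(f"{mean - k * sd} to {mean + k * sd}: {share_within(k):.1%}")
# 58 to 72: 68.3%
# 51 to 79: 95.4%
# 44 to 86: 99.7%
```

The familiar figures of 68, 95, and 99.7 are these values rounded.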

The normal distribution approximation provides a powerful tool for predicting
how the results of repeated independent experiments will be distributed.
Furthermore, the more events in sequence that we look at, the better the
normal distribution is at describing our results. Of course, there can always be
outliers, such as a string of all heads or tails, that momentarily will skew the
distribution one way or the other. However, on average, the normal distribution
is fairly representative of the real world. In terms of our 50-50 Galton board,
which can model a variety of binary situations, this means that the more rows
we have, the closer our distribution will be to the normal distribution. The
underlying reason for this involves the Central Limit Theorem, and it is to this
concept that we will now turn.
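
The claim that more rows bring the board closer to the bell curve can be
checked numerically. This is a minimal sketch: it compares each bin’s binomial
probability with the height of a normal curve that has the same mean and
standard deviation, and reports the largest gap. The function name `max_gap`
is illustrative, and the standard deviation is taken from the standard formula
√(n × p × (1-p)), which the text does not state explicitly.

```python
from math import comb, exp, pi, sqrt

def max_gap(n, p=0.5):
    """Largest difference between binomial bin probabilities and the normal curve."""
    mean = n * p
    sd = sqrt(n * p * (1 - p))
    gap = 0.0
    for k in range(n + 1):
        binom = comb(n, k) * p ** k * (1 - p) ** (n - k)
        normal = exp(-((k - mean) ** 2) / (2 * sd ** 2)) / (sd * sqrt(2 * pi))
        gap = max(gap, abs(binom - normal))
    return gap

for rows in (4, 16, 100):
    print(rows, round(max_gap(rows), 4))
```

The gap shrinks steadily as the number of rows grows, which is the behavior the
paragraph above describes.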

SECTION 7.6
CENTRAL LIMIT THEOREM
• An Average of Averages
• An Example from Politics

AN AVERAGE OF AVERAGES
• According to the Central Limit Theorem, the distribution of averages of
many trials is approximately normal, even if the distribution of each trial is
not.

Let’s return to thinking about coin flips. If our coin is fair, the probability that
the result will be heads is 1/2, and the probability that the result will be tails is
the same. If we flip the coin 100 times, the Law of Large Numbers says that we
should have about 50 heads and about 50 tails. Furthermore, the more times we
flip the coin, the closer we get to this ratio.

[Figure: result of 100 coin flips (1 ball traversing a 100-row machine): 49
heads, 51 tails. The Law of Large Numbers says that this ratio goes to 50/50 as
the number of flips increases.]

Let’s now shift our thinking to consider sets of 100 coin flips. Flipping a coin
100 times is like running one marble through a 100-row version of our Galton
board. Running many marbles through this system is like doing the 100-coin
flip experiment many times, one for each marble. Instead of being concerned
with each flip, or each left or right deflection of a marble, we are only concerned
with the total result of 100 such individual events. According to the Law of Large
Numbers, the more times we flip the coin, the closer our overall results will
come to a 1-to-1 ratio of heads to tails.

However, if we cap our number of events at 100, and do multiple sets of 100
events, we will find that not all of the sets end up being an exact 50-50 split
between heads and tails. Some will have more heads than tails and vice versa.
It’s also possible that a very few sets might come out all heads or all tails. To
explain these results, we are going to need something a bit more powerful than
the Law of Large Numbers.

What is amazing is that we can predict with a fair level of accuracy how many
of these 100-flip tests should come out all heads, or all tails, or any mixture in
between. In fact, the distribution of outcomes of our 100-flip tests will follow a
normal distribution very closely. The guiding principle behind this reality is the
Central Limit Theorem.
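
A simulation gives a feel for how the 100-flip sets spread out. This is a
minimal sketch with an arbitrary seed and an illustrative function name; it
draws 100 random bits to stand in for 100 fair flips and repeats that 10,000
times.

```python
import random
from statistics import mean, pstdev

def heads_in_100_flips(rng):
    # 100 random bits stand in for 100 fair coin flips; count the 1s as heads.
    return bin(rng.getrandbits(100)).count("1")

rng = random.Random(2024)  # arbitrary fixed seed for repeatability
counts = [heads_in_100_flips(rng) for _ in range(10_000)]

print("average heads per set:", mean(counts))
print("standard deviation:   ", pstdev(counts))
print("fewest and most heads:", min(counts), max(counts))
```

The average should sit very close to 50 and the standard deviation close to 5,
with the counts thinning out quickly away from 50. Sets of all heads or all
tails are possible in principle but are far too rare to show up in a run of
this size.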

[Figure: four histograms of score vs. frequency, each compared with the normal
curve. Few rows, few trials: not very good. Few rows, many trials: closer. Many
rows, few trials: not too good. Many rows, many trials: good fit.]

The Central Limit Theorem was developed shortly after Bernoulli’s work on
the Law of Large Numbers, first by Abraham De Moivre. De Moivre’s work sat
relatively unnoticed until Pierre-Simon Laplace continued its development
decades later. Still, the Central Limit Theorem did not receive much recognition
until the beginning of the 20th century. It is one of the jewels of probability theory.

The Central Limit Theorem can be quite useful in making predictions about a
large group of results from a small sampling. For instance, in our sets of 100
coin flips, we don’t actually have to do numerous rounds of 100 flips in order
to be able to say with a fair amount of confidence what would happen were we
to do so. We can, for instance, complete just one round of 100 flips, look at the
outcome, say perhaps 75 heads and 25 tails, and ask, “how closely does this
one experiment represent the whole?” This is essentially what happens during
elections when television networks conduct exit polling.

AN EXAMPLE FROM POLITICS


• Exit poll results can be compared with a normal distribution to make
predictions about the results of an election based on a relatively small
sample of voters.

In an exit polling situation, voters are asked if they voted for a particular
candidate or not. If you ask 100 voters and you find that 75 voted for Candidate
A and 25 voted for Candidate B, how representative of the overall tally is this?
The mean of this sample is 75% for Candidate A. This is calculated by assigning
a score of 1 to a vote for Candidate A and a score of 0 to a vote for Candidate B,
multiplying the votes by the scores, adding these results, and dividing by the
total number of votes.
\[ \text{Mean} = \frac{(75 \times 1) + (25 \times 0)}{100} = 0.75 \]

Intuition tells us that it would be unwise to assume that the final tally of all the
votes will exhibit exactly the same ratio as this one sampling. That would be
akin to flipping a coin 100 times, getting 75 heads, and assuming that this is
what would happen more or less every time. In other words, we can’t assume
that the mean value of this one sample of 100 voters is the same as the true
mean value of the election at large. Even so, we can say something about how
the mean we found in the exit poll relates to the true mean.

We can use the Central Limit Theorem to realize that the distribution of all
possible 100-voter samples will be approximately normal, and, therefore,
the 68-95-99.7 rule applies. Recall that this rule says that about 68% of sample
means will fall within one standard deviation of the true mean (the actual vote
breakdown of the whole election). However, this rule is useful only if we know
the standard deviation and the true mean, and if we knew the true mean, why
would we need to conduct an exit poll in the first place?

To find an approximation of the standard deviation, we must first find the
variance. Recall from the previous section that the variance is related to the
difference between how each person voted and the mean. Because the possible
votes are only A or B, and A is assigned a score of 1 whereas B gets a score of
0, the possible differences are “1 minus the mean,” which corresponds to
the people who voted for A, and just the mean, which corresponds to the people
who voted for B. The total number of voters multiplied by the mean is the total
number of voters who voted for A. The total number of voters multiplied by
“one minus the mean” is the total number of voters who voted for B. To find the
variance, we square the differences, multiply each by the number of voters in
that group, add, and divide by the total number of votes. If the total number of
votes is V, then the variance is:

\[ \text{Var} = \frac{(1-\text{mean})^2 \times V \times \text{mean} + \text{mean}^2 \times V \times (1-\text{mean})}{V} \]

\[ = \frac{V \times \text{mean} - 2 \times V \times \text{mean}^2 + V \times \text{mean}^3 + V \times \text{mean}^2 - V \times \text{mean}^3}{V} \]

The Vs cancel out and with a bit of algebra, we find:

\[ \text{Var} = \text{mean}\,(1-\text{mean}) \]

The standard deviation is thus \(\sqrt{\text{mean}\,(1-\text{mean})}\).
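
For the exit poll of 100 voters, the simplified formula can be checked against
the definition of variance. This is a minimal sketch using the 75-to-25 split
from the example; the variable names are illustrative.

```python
from math import sqrt

votes = [1] * 75 + [0] * 25  # 1 = Candidate A, 0 = Candidate B
V = len(votes)
sample_mean = sum(votes) / V

# Variance straight from the definition: average squared difference from the mean.
variance_direct = sum((v - sample_mean) ** 2 for v in votes) / V
# Variance from the simplified formula mean * (1 - mean).
variance_formula = sample_mean * (1 - sample_mean)

print(sample_mean, variance_direct, variance_formula)
print(sqrt(variance_formula))
```

Both routes give a variance of 0.1875, so the standard deviation for this
sample is about 0.43.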

The mean in which we are interested here is the true mean, but as yet we have
only a sample mean. Luckily, sample means and true means usually give
standard deviations that are pretty close to one another, so we can use the
standard deviation given by the sample mean to help us find approximately
where the true mean lies.

We have now seen how probability theory can be used to make powerful
predictions about certain situations. Up until this point, however, we have
been chiefly concerned with simple, idealized examples such as coin tosses,
the rolling of dice, and quincunx machines. Let’s now turn our attention to
probabilities that are more in line with what happens in the real world.

SECTION 7.7
OTHER TYPES OF PROBABILITY
• Let’s Make a Deal
• Off the Chain

LET’S MAKE A DEAL
• Conditional probability applies to events that are not independent of one
another.

Can you use what you know about the past to predict the future? When does
past performance tell you about future returns? In roulette, the fact that the
wheel lands on a red space eight times in succession has no bearing on the
next spin of the wheel, even though we might be tempted to think that it does!
Each spin of the wheel is an independent event. Many other situations in life
do not exhibit such perfect independence, however. For instance, your chances
of winning the lottery are greatly increased by purchasing a ticket, and your
chances of being eaten by a shark are greatly reduced by staying on the beach.
More realistically, what the weather will be doing in an hour depends to a large
degree on what it is doing now. These examples and others like them come
from the world of conditional probability.

A classic example of conditional probability is what is often referred to as the
“Monty Hall Problem.” This is a situation in which a game show contestant is
faced with three doors, one of which conceals a new car, and the other two of
which conceal less desirable prizes, such as a donkey or a pile of sand. The
contestant chooses a door, door number 2 let’s say. Suppose that the host
then opens door number 1 to reveal a pile of sand. Now, with two closed doors
remaining, the host offers the contestant a chance to switch his/her selection to
door number 3. Should the person switch?

The probability that switching one’s selection will result in winning the car
depends on the probability that one’s initial selection was either correct or
incorrect. The probability that your initial guess is correct is 1/3. After the host
narrows the choice, the probability that you were initially correct is still the
same, 1/3, which means that your probability of being initially incorrect, and thus
the probability that switching your choice will prove fruitful, is 2/3. After the
host reveals one of the klunkers, we are now considering a conditional probability:
the probability that the remaining door has the grand prize, given that one
klunker has been revealed, is 2/3.

This is a much different result than if the host revealed one of the non-winning
doors prior to your first choice. In this scenario, your first choice would have
a 1/2 probability of being correct. If then given the option to switch, the
probability that switching will be advantageous is only 1/2. The fact that our
original situation leads to the switch strategy presenting a higher probability of
success may seem counter-intuitive, but one of the great strengths of probability
theory is that it allows us to quantify the randomness that we are facing and
gives us a rational and logical way to make decisions, one that is helpful in
situations in which our intuition is often wrong.
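
When intuition resists, a simulation can help. This is a minimal sketch of the
game under the usual assumptions, which the text implies but does not spell
out: the host always opens a door that is neither the contestant’s pick nor the
car, and chooses at random when two such doors are available. The function
names and seed are illustrative.

```python
import random

def play(switch, rng):
    """Play one round; return True if the contestant ends up with the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # The host opens a door that is neither the contestant's pick nor the car.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the one door that is neither the first pick nor the opened one.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car

def win_rate(switch, trials, rng):
    return sum(play(switch, rng) for _ in range(trials)) / trials

rng = random.Random(3)  # arbitrary fixed seed for repeatability
print("stay:  ", win_rate(False, 100_000, rng))
print("switch:", win_rate(True, 100_000, rng))
```

The staying strategy should win about one time in three and the switching
strategy about two times in three.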

OFF THE CHAIN


• Markov Chains provide a way to talk about sequences of events in which the
probability of each event is dependent on the results of prior events.

A concept from probability that is similar to conditional probability, yet different
in some important ways, is the Markov Chain. In a Markov Chain, the probability
of a future event depends on what is happening now. The probability of the next
event depends on what happened in the previous event. The outcome of a given
experiment can affect the outcome of the next experiment.

Let’s say it is raining in our current location. There is a certain probability that
in ten minutes it will still be raining. There is also a certain probability that in
ten minutes it will be sunny. These two probabilities, the rain-rain transitional
probability and the rain-sun transitional probability, depend on many factors. If
we want to project what the weather will be like in an hour, we can model this
as a succession of six 10-minute steps. Each state along the way will affect
the probabilities for transitioning to another state. The rain-sun transition’s
probability will be different than the rain-rain transition’s, and both will be
different than the sun-sun transition. So, if it is raining right now, in order to
use our model to figure out the likelihood that it will still be raining in an hour,
we need to map out the various sequences of transitions and their probabilities.

For example, let’s say that the rain-rain transition has probability 2/3. This leaves
the rain-sun transition with a probability of 1/3. Suppose the sun-sun transition
has a probability of 4/5, which makes the probability of the sun-rain transition
1/5. We can organize these probabilities into a matrix to help us think through
this exercise in weather forecasting.

                     NEXT STATE
                     R       S
CURRENT STATE   R    2/3     1/3
                S    1/5     4/5

We can also construct a branching diagram to show the possible ways that this
model can develop over six steps:

[Figure: branching diagram that starts at R and shows, for each of six steps,
the possible states R and S. R = Rain, S = Sun. Orange lines represent the
sun-sun transition, blue lines the rain-rain transition, light orange lines the
rain-sun transition, and light blue lines the sun-rain transition.]

To find the probability that we end up either with rain or sun after six steps, we
need an efficient way to consider all of the ways and probabilities that, after the
sixth step, the weather will be sunny. For instance, after two steps the possible
ways for it to be sunny, assuming we begin with rain, are: rain–rain–sun or
rain–sun–sun. Each of these combinations has two transitions, and each
transition has an associated probability. We can multiply the probabilities of
each transition to find the overall probability of events developing according to
each specific sequence. Because both sequences end up with the same result,
we can add the probabilities of each sequence happening to obtain the overall
probability of ending with sun after two steps.
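
The two-step calculation just described can be checked directly. The following Python sketch uses the transition probabilities from the example (2/3, 1/3, 1/5, and 4/5); the names P and path_probability are illustrative choices and do not come from the text.

```python
from fractions import Fraction

# Transition probabilities from the example: (current state, next state) -> probability
P = {
    ("R", "R"): Fraction(2, 3), ("R", "S"): Fraction(1, 3),
    ("S", "R"): Fraction(1, 5), ("S", "S"): Fraction(4, 5),
}

def path_probability(path):
    """Multiply the transition probabilities along a sequence of states such as 'RRS'."""
    prob = Fraction(1)
    for current, nxt in zip(path, path[1:]):
        prob *= P[(current, nxt)]
    return prob

rain_rain_sun = path_probability("RRS")   # 2/3 * 1/3 = 2/9
rain_sun_sun = path_probability("RSS")    # 1/3 * 4/5 = 4/15

# Both sequences end in sun, so their probabilities are added.
sun_after_two_steps = rain_rain_sun + rain_sun_sun

print(sun_after_two_steps)  # 22/45, a little under 49%
```

Starting from rain, then, the model gives slightly less than an even chance of sun after twenty minutes.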

Multiplying and adding is okay for two steps, but for greater numbers of steps,
this process can be quite unwieldy because as we consider more steps, we have
to consider more specific sequences. Fortunately, we can find the probability of
either rain or sun at any step by multiplying the entire matrix of probabilities by
itself for however many steps we wish to consider. This is the same as raising
the probability matrix to a power. While the details of why this works would be
distractingly beyond the level of this discussion, it suffices to say that multiplying
two matrices together accounts for all of the various ways in which we can go
from a particular initial state to a particular final state in two steps.

Therefore, to find the probability that it will be sunny after six steps (i.e., in one
hour), we take our original probability matrix and raise it to the sixth power,
which gives us:

                         STATE AFTER 6 STEPS
                         R       S
CURRENT STATE    R       0.38    0.62
                 S       0.37    0.63

From this probability matrix, we can see that if it is currently raining, there is
a 38% chance that it will be raining in an hour and a 62% chance that it will be
sunny. This prediction of course is only as valid as the assumptions that went
into our model. Often, these assumptions are quite reasonable and powerful.
Because of this, Markov Chains form the heart of solving problems ranging from
how we can have a computer recognize human speech to how we can identify a
region of the human genome responsible for a genetic disease.
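
The sixth-power calculation for the weather example can be reproduced with a few lines of Python. This is a minimal sketch that assumes the 2/3, 1/3, 1/5, 4/5 transition probabilities used in this section; the helper names mat_mul and mat_pow are illustrative, and the matrix product is written out by hand so that no external library is needed.

```python
from fractions import Fraction

# Rows are the current state, columns the next state, in the order (rain, sun).
P = [[Fraction(2, 3), Fraction(1, 3)],
     [Fraction(1, 5), Fraction(4, 5)]]

def mat_mul(A, B):
    """Ordinary matrix product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(M, steps):
    """Multiply M by itself so that the result describes `steps` transitions."""
    result = M
    for _ in range(steps - 1):
        result = mat_mul(result, M)
    return result

P2 = mat_pow(P, 2)
P6 = mat_pow(P, 6)
rounded = [[round(float(x), 2) for x in row] for row in P6]

print(P2[0][1])  # 22/45: rain now, sun after two steps
print(rounded)   # [[0.38, 0.62], [0.37, 0.63]]
```

The rain-to-sun entry of the squared matrix, 22/45, is exactly the sum (2/3)(1/3) + (1/3)(4/5) over the two sequences rain–rain–sun and rain–sun–sun, which is the sense in which matrix multiplication accounts for every route between two states.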

In this section we were introduced to two of the many ways in which probability
is used in a modern context. We have also seen the important connection
between probability and modeling. Our next section will bring us right to the
forefront of both probability and mathematical modeling.

SECTION 7.8

Modern Probability
• The BML Traffic Model

THE BML TRAFFIC MODEL
• Although many of the principles of probability were established in centuries
past, it is still a vibrant field of mathematical study.
• The BML Traffic Model represents the frontier of probabilistic understanding
of complex systems.
• The BML Traffic Model applies not only to traffic congestion, but also to
physical or chemical processes such as phase changes.

Most of the ideas that we have discussed so far in this unit were first developed
in the 17th and 18th centuries. As in all of mathematics, there has been
continuous further development of these ideas since then. (In fact, we’ve
concentrated here on discrete probability and have not really said much
regarding continuous probability, the situation in which there is a continuous
possible range of outcomes, as with the height of an individual.) An exciting
case in point is the modeling of theoretical traffic flows.

The Biham-Middleton-Levine (BML) Traffic Model, first proposed in 1992,
provides a useful model to study how probability affects traffic flow and phase
transitions, such as the transformation of liquid water into ice. To get an idea of
how this model works, let’s imagine an ideal grid of city blocks.

[Figure: an idealized grid of city blocks with the compass directions N, W, and E marked.]

To make things easier, let’s assume that the grid extends to infinity in all
directions. That way we don’t have to worry about any kind of boundary
conditions or effects. Let’s fill our grid with commuter cars, red ones trying to
go east and blue ones trying to go north.

[Figure: the grid populated with cars; annotation: “Possible jam in next time step.”]

To simplify things further, assume that cars move only one space at a time and
are allowed to move only as follows: Every odd second, red cars get to move if
the space immediately east of them is vacant, whereas blue cars get to move
every even second only if the space immediately north of them is vacant.
This process goes on indefinitely.

To determine the starting configuration of cars, we can select a probability, p,
that assigns whether or not a space is occupied by a car. We will be interested
in the different behaviors that are associated with different values of p. If
a particular space ends up being populated with a car, the car’s color, and
therefore its directional orientation, is determined by a method equivalent to
flipping a coin.
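
The rules above are simple enough to simulate. The following Python sketch is one possible reading of them, not the original authors’ code: it assumes a finite n-by-n grid whose edges wrap around (a stand-in for the infinite grid), treats “east” as the next column and “north” as the previous row, and uses illustrative names (random_grid, step, simulate) that do not come from the text.

```python
import random

EMPTY, RED, BLUE = ".", "R", "B"

def random_grid(n, p, rng):
    """Each space holds a car with probability p; a fair coin flip picks its color."""
    return [[rng.choice((RED, BLUE)) if rng.random() < p else EMPTY
             for _ in range(n)] for _ in range(n)]

def step(grid, color):
    """Move every car of one color one space if the space ahead is vacant."""
    n = len(grid)
    d_row, d_col = (0, 1) if color == RED else (-1, 0)  # red: east, blue: north
    new = [row[:] for row in grid]
    moved = 0
    for r in range(n):
        for c in range(n):
            if grid[r][c] != color:
                continue
            tr, tc = (r + d_row) % n, (c + d_col) % n    # wrap around the edges
            if grid[tr][tc] == EMPTY:  # checked on the old grid: moves are simultaneous
                new[r][c] = EMPTY
                new[tr][tc] = color
                moved += 1
    return new, moved

def simulate(n, p, seconds, seed=0):
    """Run the model; red cars move on odd seconds, blue cars on even seconds."""
    rng = random.Random(seed)
    grid = random_grid(n, p, rng)
    moved_history = []
    for t in range(1, seconds + 1):
        grid, moved = step(grid, RED if t % 2 == 1 else BLUE)
        moved_history.append(moved)
    return grid, moved_history

final_grid, history = simulate(n=20, p=0.3, seconds=100)
print(history[-10:])  # number of cars that moved in each of the last ten seconds
```

Running simulate for several values of p and watching whether the counts in history stay positive or drop to zero is one rough way to explore the behaviors described in this section.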

After the grid is populated, the simulation runs. After a period of time, patterns
and structure begin to emerge.

Item 1733 / Raissa D’Souza, FREE FLOW R< R-C (1/2005). Courtesy of Raissa D’Souza.

Some initial probabilities lead to continuous flow. Cars can move freely forever.
In this picture, both red and blue cars are able to move throughout the grid. If
the cars were water molecules, these results would correspond to the liquid state.

Item 1734 / Raissa D’Souza, FULLY JAMMED R> R-C (1/2005). Courtesy of Raissa D’Souza.

Other initial probabilities lead to traffic gridlock. Movement becomes
impossible. Notice how the red and blue cars are stalemated along the center
diagonal of the grid. In the water analogy, this would be ice. Note that the parts
in the corners are due to the boundary conditions of the grid: if a car
leaves the left part of the screen, it returns on the right side and vice versa. The
same boundary conditions apply to the top and the bottom of the grid. Recalling
a concept from our previous unit on topology, this is a flat torus.

The BML model is perplexing because, while at low initial densities traffic
flows freely forever and at high initial densities traffic jams up rather quickly,
the density at which this transition occurs is not known. Also, there are
intermediate states of free flow mixed with periodic jams, depending on the
initial population density. As of this writing, there is no detailed mathematical
explanation for these behaviors, making this an area for continued exploration.

Rigorous attempts to address the issues involved in the BML Traffic Model and
similar models play a huge role in modern probability. Mathematicians are
truly just beginning to find ways of dealing with models that correspond to our
physical world in meaningful ways. These sorts of results correspond to some
of the deepest and most beautiful work in modern mathematics.

UNIT 7 at a glance
textbook

SECTION 7.2

History
• The mathematical study of probability probably was delayed for centuries
because of mysticism.
• Renaissance mathematicians took the first strides toward understanding
chance in an abstract way.
• Pascal’s and Fermat’s solutions to the “Problem of the Points” provided
an early glimpse of how to use mathematics to say definite things about
unknown future events.
• Tree diagrams are useful for keeping track of possible outcomes.

SECTION 7.3

Simple Probability and Counting
• Simple probability is the ratio of favorable outcomes to the total number of
possible outcomes.
• The Galton Board is a model of a sequence of random events.
• Each marble that passes through the system represents a trial consisting of
as many random events as there are rows in the system.
• For Galton boards with many rows, the task of enumerating paths is greatly
facilitated by using Pascal’s Triangle.

SECTION 7.4

Law of Large Numbers
• Bernoulli’s Law of Large Numbers shifted the thinking about probability
from determining short-term payoffs to predicting long-term behavior.
• Expected value is the average result of a probabilistic event computed by
taking into account possible results and their respective probabilities.
• Expected value is a key concept in both gambling and insurance.

SECTION 7.5

The Galton Board Revisited
• Each bin of the Galton board has an associated probability, and looking at all
of the bins simultaneously gives a distribution.
• Because each peg represents a right-or-left, or binary, decision/option, the
distribution of probabilities is called a binomial distribution.
• If we assign a probability other than 1/2 to each peg of the Galton board, the
mean of the distribution will shift.
• The normal distribution is an ideal distribution that is determined by only its
mean and standard deviation.

SECTION 7.6

Central Limit Theorem
• According to the Central Limit Theorem, the distribution of averages of
many trials is approximately normal, even if the distribution of each trial is not.
• Exit poll results can be compared with a normal distribution to make
predictions about the results of an election based on a relatively small
sample of voters.

SECTION 7.7

Other Types of Probability
• Conditional probability applies to events that are not independent of one
another.
• Markov Chains provide a way to talk about sequences of events in which the
probability of each event is dependent on the results of prior events.

SECTION 7.8

Modern Probability
• Although many of the principles of probability were established in centuries
past, it is still a vibrant field of mathematical study.
• The BML Traffic Model represents the frontier of probabilistic understanding
of complex systems.
• The BML Traffic Model applies not only to traffic congestion, but also to
physical or chemical processes such as phase changes.

BIBLIOGRAPHY

WEBSITES

http://www.dartmouth.edu/~chance/

PRINT

Barth, Mike. “Industry Loss and Expense Ratio Comparisons Between HMOs
and Life/Health Insurers,” NAIC Research Quarterly, vol. 3, no. 1 (January 1997).

Berlinghoff, William P. and Fernando Q. Gouvea. Math Through the Ages:
A Gentle History for Teachers and Others. Farmington, ME: Oxton House
Publishers, 2002.

Berlinghoff, William P. and Kerry E. Grant. A Mathematics Sampler: Topics
for the Liberal Arts, 3rd ed. New York: Ardsley House Publishers, Inc., 1992.

Bernstein, Peter L. Against the Gods: The Remarkable Story of Risk.
New York: John Wiley and Sons, 1996.

Bogart, Kenneth, Clifford Stein, and Robert L. Drysdale. Discrete Mathematics
for Computer Science (Mathematics Across the Curriculum). Emeryville, CA:
Key College Press, 2006.

Burton, David M. History of Mathematics: An Introduction, 4th ed.
New York: WCB/McGraw-Hill, 1999.

Casti, John L. Five More Golden Rules: Knots, Codes, Chaos, and Other Great
Theories of 20th-Century Mathematics. New York: John Wiley and Sons,
Inc., 2000.

David, F.N. Games, Gods and Gambling; The Origins and History of Probability
and Statistical Ideas from the Earliest Times to the Newtonian Era. New York:
Hafner Publishing Company, 1962.

De Fermat, Pierre. “Fermat and Pascal on Probability” from Oeuvres de Fermat,
Tannery and Henry, eds. University of York. http://www.york.ac.uk/depts/
maths/histstat/pascal.pdf (accessed October 26, 2005).

Desrosieres, Alain. The Politics of Large Numbers (translated by Camille Naish).
Cambridge, MA: Harvard University Press, 1998.

D’Souza, R. M. “Coexisting phases and lattice dependence of a cellular automata
model for traffic flow,” Physical Review E, vol. 71 (2005).

Epstein, Richard A. The Theory of Gambling and Statistical Logic.
New York: Academic Press, 1977.

Gigerenzer, Gerd et al. The Empire of Chance: How Probability Changed Science
and Everyday Life. New York: Cambridge University Press, 1989.

Ghahramani, Saeed. Fundamentals of Probability, 2nd ed. Upper Saddle River,
NJ: Prentice Hall, Inc., 2000.

Gordon, Hugh. Discrete Probability. New York: Springer, 1997.

Grinstead, Charles M. and J. Laurie Snell. Introduction to Probability: 2nd rev. ed.
Providence, RI: American Mathematical Society, 1997.

Griffiths, T.L. and Tenenbaum, J.B. “Probability, Algorithmic Complexity, and
Subjective Randomness,” Proceedings of the Twenty-Fifth Annual Conference of
the Cognitive Science Society, (2003).

Gross, Benedict and Joe Harris. The Magic of Numbers. Upper Saddle River,
NJ: Pearson Education, Inc./Prentice Hall, 2004.

Kelly, J.L. Jr. “A New Interpretation of Information Rate,” Bell System Technical
Journal, vol. 35 (September 1956).

Larsen, Richard J. and Morris L. Marx. An Introduction to Probability and its
Applications. Englewood Cliffs, NJ: Prentice Hall, Inc., 1985.

Martin, Bruce. “Fate… or Blind Chance? What Seems Like Eerie Predestination
Is Merely Coincidence,” The Washington Post, September 9, 1998, final edition.

Relf, Simon and Dennis Almeida. “Exploring the ‘Birthdays Problem’ and
Some of its Variants Through Computer Simulation,” International Journal
of Mathematical Education in Science & Technology, vol. 30, no. 1 (January/
February 1999).

Smith, David Eugene. A Source Book in Mathematics. New York: McGraw-Hill
Book Company, Inc., 1929.

Tannenbaum, Peter. Excursions in Modern Mathematics, 5th ed. Upper Saddle
River, NJ: Pearson Education, Inc., 2004.

LECTURES

D’Souza, R.M. “The Science of Complex Networks.” CSE Seminar at University of
California - Davis, February 2006. http://mae.ucdavis.edu/dsouza/talks.html
(accessed 2007).

TEXTBOOK
Unit 8
UNIT 08
Geometries Beyond Euclid
TEXTBOOK

UNIT OBJECTIVES

• Geometry is the mathematical study of space.

• Euclid’s postulates form the basis of the geometry we learn in high school.

• Euclid’s fifth postulate, also known as the parallel postulate, stood for over
two thousand years before it was shown to be unnecessary in creating a self-
consistent geometry.

• There are three broad categories of geometry: flat (zero curvature), spherical
(positive curvature), and hyperbolic (negative curvature).

• The geometry of a space goes hand in hand with how one defines the shortest
distance between two points in that space.

• Stereographic projection and other mappings allow us to visualize spaces that
might be conceptually difficult.

• Einstein showed that curved geometry is a way to model gravitational attraction.

• The recently proven Geometrization Theorem states that if we live in a randomly
selected universe with a uniform geometry, then it is probably a hyperbolic
universe.

It should be known that geometry enlightens
the intellect and sets one’s mind right. All
of its proofs are very clear and orderly. It
is hardly possible for errors to enter into
geometrical reasoning, because it is well
arranged and orderly. Thus, the mind that
constantly applies itself to geometry is not
likely to fall into error. In this convenient way,
the person who knows geometry acquires
intelligence.

Ibn Khaldun (1332-1406)


SECTION 8.1

INTRODUCTION

Math encompasses far more than the study of numbers. At its heart, it is
the application of logic in the search for order in the world around us. A
fundamental question in this search is how to divide up, or describe, space. The
study of this problem, geometry, has been of importance for thousands of years.
The geometers of Ancient Egypt established geometric concepts and rules that
form the basis of a discussion that has continued into the modern age.

The Egyptians were concerned with a variety of everyday geometric challenges,
from how to divide up lands that had been flooded, to the construction of the
pyramids. In fact, the need to measure and divide up land helped bring the word
“geometry” into existence—“geo” meaning “earth,” and “meter” meaning “to
measure.” The meanings of both of these roots have been expanded throughout
the centuries so that now, the “earth” aspect can be thought of as encompassing
all of space in general, and the “measure” element can be thought of as “divide
into regular sections.” Thus, a more useful, modern-day definition of geometry
is “the study of how to break space up into regular sections.”

As with other mathematical ideas, the geometric concepts of the Egyptians did
not stay confined to North Africa, but rather spread across the Mediterranean.
Points, lines, circles, and planes formed the vocabulary of a new kind of
thinking, one that was tied to empirical observations, and yet could exist without
them. The Greeks latched onto this notion of conceptual mathematics, and soon
complicated geometric ideas were being constructed with only the most basic
of theoretical tools. Much of this knowledge, accumulated over centuries, was
collected and expanded upon by the great mathematician Euclid of Alexandria
around 300 BC. His comprehensive collection of geometric knowledge, entitled
The Elements, went on to become the authoritative math book throughout the
world, with over a thousand editions since its initial printing in 1482.

Of central importance to Euclid were his postulates. These were statements
that could not be proven and had to be agreed upon as a starting point. His five
postulates described a world of straight lines and flat planes. The shapes he
focused on were idealized versions of shapes found in nature. The geometric
world Euclid described was, and still is, a wondrous achievement of logical
construction. It is a world that behaves self-consistently, lending credence to
the idea that it is a model of the “real” world. This idea, that statements about
the real world can be made on the basis of reason alone, has guided much of
western thought for centuries.

Euclid saw only part of the picture, however. Still, his geometry (which,
throughout the remainder of this discussion, will be referred to as “Euclidean
geometry”) withstood centuries of scrutiny by the best minds of the day. It was
not until the 1800s that Euclid’s view of the world was shown to be inadequate
as a model of the real world. The insights that have come to form the basis of
the modern study of geometry do not conform to Euclid’s postulates—they do,
however, lead to logical ways to describe the world as we know it, and space
in general. We are no longer challenged with questions of how to divide plots
of land; instead, our new tools enable us to ask, and answer, bigger questions.
In fact, we can use the techniques of modern, non-Euclidean, geometry to
understand the very fabric of reality.

In this unit we will see how Euclid elegantly combined the mathematical
knowledge of his day into a logically self-consistent system. We will then
examine how the close scrutiny of one of his fundamental assumptions led to an
entirely new kind of geometric thinking. From there we will explore this modern
view of geometry to see how one can replace Euclid’s straight lines with curves
and what that means for our understanding of the universe.

SECTION 8.2

Euclidean Geometry
• Euclid of Alexandria
• Axiomatic Systems
• Foundations of Geometry

Euclid of Alexandria
• Not much is known of Euclid’s life.
• Although he was not responsible for all of the content in The Elements,
Euclid broke new ground in his organization of the foundational
mathematical knowledge of the day.

Euclid is perhaps the most influential figure in the history of mathematics, so it
is somewhat surprising that almost nothing is known about his life. The little
that is known is mainly about his work as a teacher in Alexandria during the
reign of Ptolemy I, which dates to around 300 BC. This was some while after the
creation of Euclid’s most famous work, The Elements.

Euclid himself was known primarily for his skills as a teacher rather than for
his theorizing and contributions to research. Indeed, much of the content of the
thirteen volumes that make up The Elements is not original, nor is it a complete
overview of the mathematics of Euclid’s time. Rather, this text was intended
to serve as an introduction to the mathematical concepts of the day. Its great
triumph was in presenting concepts in logical order, beginning with the most
basic of assumptions and using them to build a series of propositions and
conclusions of increasing complexity.

Axiomatic Systems
• Axiomatic systems are a way of creating logical order.
• Axioms are agreed-upon first principles, which are then used to generate
other statements, known as “theorems,” using logical principles.
• Systems can be internally consistent or not, depending on whether or not
their axioms admit contradictions.

The system that Euclid used in The Elements—beginning with the most basic
assumptions and making only logically allowed steps in order to come up with
propositions or theorems—is what is known today as an axiomatic system. Here
is a very simple example of such a system:

Given the things: squirrels, trees, and climbing,

1. There are exactly three squirrels.
2. Every squirrel climbs at least two trees.
3. No tree is climbed by more than two squirrels.

A logical theorem could be the statement: there must be more than two trees.

A simple picture would prove this theorem:

[Figure: diagram with items numbered 1, 2, 3.]
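
The theorem can also be checked by counting: three squirrels that each climb at least two trees account for at least six climbs, and since each tree can absorb at most two of them, at least three trees are needed. As a sanity check, the short Python sketch below searches every possible assignment of trees to squirrels; the function name consistent_assignment_exists is an illustrative choice, and the search is a brute-force check rather than a proof technique from the text.

```python
from itertools import combinations, product

def consistent_assignment_exists(num_trees, num_squirrels=3):
    """Return True if the three axioms can all hold with this many trees."""
    trees = range(num_trees)
    # Axiom 2: each squirrel's set of climbed trees has at least two members.
    choices = [set(c) for k in range(2, num_trees + 1)
               for c in combinations(trees, k)]
    # Axiom 1 fixes the number of squirrels; try every way to give each one a set.
    for assignment in product(choices, repeat=num_squirrels):
        # Axiom 3: no tree is climbed by more than two squirrels.
        if all(sum(t in climbed for climbed in assignment) <= 2 for t in trees):
            return True
    return False

for n in (1, 2, 3):
    print(n, consistent_assignment_exists(n))
# 1 False
# 2 False
# 3 True
```

With one or two trees no arrangement satisfies all three axioms, while with three trees one does (for example, each squirrel climbing a different pair of trees).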

So, a theorem is something that can be shown to be true, given a set of basic
assumptions and a series of logical steps with no contradictions introduced.
Now, consider the following axiomatic system:

Given the things: cat, dog

1. A cat is not a dog.
2. A cat is a dog.

It is clear that both statements 1 and 2 cannot be true simultaneously. However,
these are the basic axioms of our system, and axioms have to be assumed
to be true—so, this system is clearly worthless, because it contains a logical
contradiction from the start. In other words, it is not self-consistent. In this
example, the contradiction presents itself directly in the axioms, but most
contradictory systems are not so easy to identify.

Foundations of Geometry
• Euclid used five common notions and five postulates in The Elements.
• The fifth postulate, also known as the “parallel postulate,” is somehow not
like the others.

When Euclid laid the foundation for The Elements, he had to be careful to start
with statements that would be both self-consistent and basic enough to be
assumed true. He divided his initial assumptions into five postulates¹ and five
common notions. They are as follows:

Common Notions:
1. Things that are equal to the same thing are also equal to one another.
2. If equals be added to equals, the wholes are equal.
3. If equals be subtracted from equals, the remainders are equal.
4. Things that coincide with one another are equal to one another.
5. The whole is greater than the part.

Postulates:
1. Any two points can be joined by a straight line.
2. Any straight line segment can be extended indefinitely in a straight line.
3. Given any straight line segment, a circle can be drawn having the segment
as radius and one endpoint as center.
4. All right angles are congruent.
5. If two lines intersect a third in such a way that the sum of the inner angles
on one side is less than two right angles, then the two lines inevitably must
intersect each other on that side if extended far enough.

That fifth postulate is a mouthful; fortunately, it can be rephrased. In the
fifth century, the philosopher Proclus re-stated Euclid’s fifth postulate in the
following form, which has become known as the parallel postulate:

Exactly one line parallel to a given line can be drawn through any point not on
the given line.

¹ Note: A postulate is not quite the same as an axiom. Axioms are general
statements that can apply to different contexts, whereas postulates are
applicable only in one context, geometry in this case.

This postulate is somehow not like the other four. The first four seem to be
simple and self-evident in that it seems things could be no other way, but the
fifth is more complicated. Euclid himself likely noticed this discrepancy, as
he did not use the parallel postulate until the 29th proposition (theorem) of The
Elements.

Euclid’s system has been incredibly long-lasting, and it is still standard fare
in high school geometry classes to this day. It represents an achievement in
organization and logical thought that remains as relevant today as it was 2000
years ago. That bothersome fifth postulate, however, showed a small crack
in the foundation of the system. This crack was ignored for centuries until
mathematicians of the 1800s, with further exploration, found it to be a doorway
into a world of broader understanding.

SECTION 8.3

Non-Euclidean Geometry
• Of Questioned Necessity
• Saccheri’s Quadrilaterals

Of Questioned Necessity
• There were multiple attempts to show that the parallel postulate was not
necessary to form an internally consistent geometry.
• Girolamo Saccheri built upon the work of Nasir Eddin in an attempt to “clear
Euclid of every flaw.”

Euclid’s parallel postulate bothered mathematicians for many years. Everyone
agreed that the first four postulates were completely obvious, but it seemed
to be asking too much to say that the fifth was equally obvious. It was more
complicated than the other postulates, and, in fact, many thought that it could
actually be proved from them. This idea led to numerous attempts to show that
the fifth postulate was not independent of the other four and that, therefore, all
of geometry could be built upon only four fundamental ideas. One of the most
famous of these attempts was undertaken by an Italian Jesuit named Girolamo
Saccheri in the early eighteenth century.

Saccheri wanted to show that the parallel postulate was not necessary (i.e., that
it was a derivative of the other four), and if the title of his book, “Euclid Cleared
of Every Flaw,” is any indication, he truly believed that he had accomplished this.
He, like many others before, believed that the parallel postulate could be proved
from the other four. After numerous unsuccessful attempts to find a direct
proof of his claim, he tried a different tack. His renewed efforts were somewhat
influenced by the work of Nasir Eddin (Nasir al-Din al-Tusi), a Persian scholar
who served Hulagu Khan, one of Genghis Khan’s grandsons. Eddin had attempted
to prove the fifth postulate almost 500 years earlier
by looking at quadrilaterals and making assumptions that he hoped would lead
to contradictions. We’ve seen examples of similar “proofs by contradiction”
before, such as Euclid’s proof of the infinitude of primes back in unit one.

Saccheri’s Quadrilaterals
• Saccheri looked at three cases of a quadrilateral constructed without the aid
of the fifth postulate.
• Saccheri’s results, though intriguing, were misinterpreted.

Saccheri’s approach was similar to that of Eddin. He began by considering a
quadrilateral whose base angles are both 90 degrees and whose two upright
sides are equal in length:

[Figure: a quadrilateral with its summit angles labeled; the base angles start at 90°.]

He then showed that the summit angles must be equal to each other without
using the fifth postulate. In renouncing the parallel postulate in the construction
of this quadrilateral, Saccheri had to consider two possibilities—either:

1. There are no lines parallel to a given line, or
2. There are at least two lines parallel to a given line

These cases presented three optional scenarios: 1) that the summit angles are
acute, (which would allow for more than one parallel line); 2) that the summit
angles are right angles (indicating that there is only one parallel line); and 3)
that the summit angles are obtuse (and, therefore, there are no parallel lines).
He hoped to show that the only possible arrangement would be the second
scenario, because it can play the role of the parallel postulate. Consequently, if
it turned out to be the only case that works, then he would have shown that the
parallel postulate is not necessary (recall that he arrived at this arrangement
without its help).

[Figure: three quadrilaterals, labeled SUMMIT ANGLES = 90°, SUMMIT ANGLES < 90°,
and SUMMIT ANGLES > 90°.]

Saccheri was able to eliminate the obtuse angle scenario by assuming
(implicitly) that straight lines can be extended forever, which is Euclid’s
second postulate. The obtuse case turns out to violate that postulate, so it
cannot serve as an indicator of the necessity of the parallel postulate. Ruling
out obtuse summit angles leaves just two options: they must be either acute
or right.

To show that the acute case was incorrect, Saccheri set about trying to derive
the propositions found in The Elements, assuming that the summit angles were
less than 90 degrees. He was hoping that this path would consistently lead
to contradictions. To his chagrin, he found that he was able to derive many of
Euclid’s propositions using this assumption and yet still avoid contradictions.
He was onto something, but at the time he was unaware that he was indeed
building a logically consistent universe in which the summit angles of such a
quadrilateral could each be acute.

He was so sure that Euclid had to be correct, however, and that the acute
angle case could not stand, that he twisted his own logic to accommodate what
he had hoped to find, namely that the summit angles must be 90 degrees.
Saccheri basically invented a contradiction where none existed in order to fit
his preconception. Not surprisingly, his arguments were not convincing, even
to the mathematicians of his day. He published another work after “Euclid
Cleared of Every Flaw,” attempting to clarify his so-called proof, but to no
avail. Mathematicians of the time believed that Saccheri had neither proven nor
disproven the necessity of Euclid’s fifth postulate.

Saccheri’s work was not in vain; he simply did not recognize what he had found.
In his dogged effort to prove his preconception, he missed out on his claim to
one of the great discoveries in geometry: that Euclid’s system of geometry is not
the only possible self-consistent geometry.

SECTION 8.4

Spherical and Hyperbolic Geometry
• Gauss
• Lobachevsky and Bolyai
• Spherical Geometry
• A World of Worlds

Gauss
• Gauss realized that it is possible to construct a self-consistent geometry
with the “many parallels” version of the fifth postulate.

Almost 100 years later, in the early 1800s, the great German mathematician
Carl Friedrich Gauss attacked the same issue of Euclid’s fifth postulate. He
recognized that although Saccheri had simply dismissed the possibility of
having more than one parallel line through a given point, one could construct a
completely self-consistent geometry that differed from that of Euclid by using
this postulate instead of the original. This new geometry describes a world in
which the summit angles of Saccheri’s quadrilaterals are acute.

There is a mathematical legend that Gauss, as much experimental scientist as
mathematician, sought to determine whether or not this new type of geometry
was actually the geometry of the real world. To do this, he supposedly
constructed a great triangle using signal fires and mirrors set on mountaintops.

1809

Unit 8 | 11
UNIT 8 Geometries Beyond Euclid
textbook

SECTION 8.4 He then measured the angles between these points of fire and compared his
measurements to the expected finding of 180 degrees total. Why would he use
Spherical and mountain tops? Why not just draw a large triangle on a flat space of land? As
Hyperbolic Geometry we will see soon, a shape drawn on the surface of a sphere (or near-sphere,
CONTINUED
such as Earth) is different than a shape drawn in space. By connecting the
tops of mountains with rays of light, Gauss was creating a triangle using the
minimal-length connections made possible in space. These lines were free to
be as straight as possible, without having to bend or conform to the shape of
the surface of Earth. The veracity of this story is questionable; nevertheless,
it illustrates the difference between lines on a curved surface and lines in a
potentially curved space.

A more radical mathematical idea for its time would be hard to find, but Gauss
did not publish his findings in this arena.2 Some suggest that this was to avoid
confrontation with the great philosopher Kant, who held that the human
perception of reality is Euclidean. Others suggest that Gauss was afraid that he
would lose face with his contemporary mathematicians. For whatever reason, it
was not Gauss who first brought these ideas to light.

Lobachevsky and Bolyai


• Lobachevsky and Bolyai independently came to the same conclusion as
Gauss.

Nikolai Lobachevsky was a Russian who, in 1829, published a version of a
geometry in which, instead of just one parallel line, multiple parallel lines were
possible through any given point.

2 It is interesting to note that Gauss did not publish many of his ideas. It is
commonly thought that this was because he was a perfectionist and would only
make his views known if they were above criticism. To that end, he would not
provide the intuitions behind his proofs, preferring instead to give the
impression that they came “out of thin air.” Eric Temple Bell estimated in 1937
that, were Gauss to have been more forthcoming, mathematics would have been
advanced by at least 50 years! (Here is yet another example of why students
should be encouraged to show their work!)


If we take Saccheri’s quadrilaterals and modify them a bit, we can show that
Lobachevsky’s idea is equivalent to saying that the three angles of a triangle can
add up to less than a Euclidean 180 degrees.

[Figure: triangle angle sums. Euclidean: 180°. Hyperbolic: < 180°.]

Lobachevsky showed that this assumption would not lead to any logical
contradictions and was, thus, just as valid as Euclid’s geometry. Almost at the
same time, in 1832, a Hungarian mathematician named János Bolyai published a
similar finding after studying what he called “absolute,” or neutral, geometry—
that is, geometry that uses only the first four of Euclid’s postulates.

So, Gauss, Lobachevsky, and Bolyai all independently found that a new and
completely self-consistent geometry could be established by letting more than
one line through a given point be parallel to a given line that does not include the
point. Evidently this was an idea whose time had come. What of the other case,
the case in which no parallel lines are possible?

Spherical Geometry
• The “no parallels” flavor of Euclid’s fifth postulate yields the geometry of the
surface of a sphere.

Oddly enough, a geometry that allows no parallel lines is not as strange as the
type that Gauss, Lobachevsky, and Bolyai described. We can indeed imagine
a world in which lines always cross other lines. Think about the surface of a
sphere, such as Earth (more or less). Lines that follow the surface will continue
around the sphere and back to their beginning points to create circles. Lines
such as these that are the maximum length for a given sphere (that is, lines whose
length is equal to the circumference of the sphere) are called “great circles”;
the equator is an example. The shortest distance between any two points
on a sphere will be a part of one of these great circles. Any two such lines will
always intersect in two places; hence, there can be no parallel lines in this
system! People as far back as the Greeks understood this, and they understood
that geometry on a sphere is different from that on a plane.

[Figure: great circles. Triangle angle sum: > 180°.]

In this type of geometry, also known as “spherical geometry,” Saccheri’s
quadrilaterals would have obtuse summit angles, and the angles of a triangle
would add up to more than 180 degrees. In such a system, one has to replace
the parallel postulate with a version that admits no parallel lines as well
as modify Euclid’s first two postulates. The first postulate’s restriction that
“through any two points, there is only one possible straight line” does not hold
true on a sphere.
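
A quick numerical check can make the angle claim concrete. The following Python
sketch is an illustration, not part of the original text. It assumes the sphere
is the unit sphere in three-dimensional space and that “lines” are arcs of great
circles; the helper names (tangent_toward, vertex_angle, angle_sum_degrees) are
our own. Each angle of a spherical triangle is measured between the directions
in which the two sides leave that corner.

```python
import math

def normalize(v):
    """Scale a 3-D vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def tangent_toward(p, q):
    """Unit direction in which the great-circle arc from p to q leaves p."""
    along_p = dot(p, q)
    # Remove the part of q that points along p; the rest is tangent to the sphere at p.
    return normalize(tuple(qc - along_p * pc for pc, qc in zip(p, q)))

def vertex_angle(p, q, r):
    """Angle (radians) at corner p between the sides p-q and p-r."""
    cosine = dot(tangent_toward(p, q), tangent_toward(p, r))
    return math.acos(max(-1.0, min(1.0, cosine)))

def angle_sum_degrees(a, b, c):
    """Sum of the three corner angles of the spherical triangle a, b, c."""
    total = vertex_angle(a, b, c) + vertex_angle(b, c, a) + vertex_angle(c, a, b)
    return math.degrees(total)

# One corner at the north pole, two on the equator a quarter turn apart.
octant = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
# A very small triangle close to the north pole.
tiny = tuple(normalize(v) for v in ((0.01, 0.0, 1.0), (0.0, 0.01, 1.0), (-0.01, -0.01, 1.0)))

print(angle_sum_degrees(*octant))
print(angle_sum_degrees(*tiny))
```

The octant triangle has three right angles, so its angle sum is about 270
degrees. The tiny triangle comes out only slightly above 180 degrees, which is
why a small patch of a sphere looks Euclidean.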

[Figure: great circles through two points, P1 and P2]

On the surface of a sphere, we must allow for more than one line through any
two points. Likewise, we must modify Euclid’s second postulate, because
“lines” on the surface of a sphere are really circles, which have no end and no
beginning—they cannot extend to infinity. We can circumvent this by requiring
simply that lines be unbounded.

A World of Worlds
• Modern geometry comes in three varieties, and each can be visualized via a
model.
• A geodesic is, in the local view, the shortest distance between two points in
whichever geometry you choose to use. In a global view, a geodesic is the
path a particle would follow were it free of the influence of all forces.
• The surface of a sphere models a geometry that admits no parallel lines.
• The surface of a pseudosphere models a geometry that admits many
parallel lines through a given point.

With the inclusion of spherical geometry, mathematicians now had three
broad types of geometry with which to study and measure shapes and space:
hyperbolic (Lobachevsky), spherical (Riemann), and flat (Euclid).

[Figure: the three geometries. Euclidean, spherical, hyperbolic.]

Obviously, the type of world you “live in” and its geometry depend on which
“flavor” of the fifth postulate you choose. If you choose to allow only one line
parallel to a given line through a given point, you are choosing to inhabit Euclid’s
world of flat geometry. If you choose to allow many parallel lines through the
same point, you are in Lobachevsky’s world of hyperbolic geometry. If, on
the other hand, you choose to allow no parallel lines, you are in the world of
spherical geometry.

We can envision Euclid’s universe as a flat plane; this is the universe that we
learn about in high school, and its concepts make intuitive sense. To get from
one point to another point on the plane, we all know that a straight line will be
the shortest distance. This connection of minimal length can be generalized into
the notion of a “geodesic.” A geodesic, in the local view, is simply the shortest
distance between two points in whichever geometry you choose to use. When looking
at the geometry of an object on a global level, a geodesic is the path that a
particle would take were it free from the influence of any forces.

[Figure: Euclidean plane, with points A and B]

To envision the spherical universe, we earlier used a sphere as a model. We said
that the shortest distance between two points on the sphere, a “geodesic” by our
new terminology, is a part of a great circle. In other words, a straight line in
spherical geometry is actually a curve, when viewed from the Euclidean
perspective.

A geodesic, the shortest distance from A to B, is curved in spherical geometry.

Lobachevsky’s universe is a bit harder to visualize. Eugenio Beltrami, an Italian
mathematician working in Bologna, Pisa, and Rome, found a shape, analogous to a
sphere, but with a surface that obeys Lobachevsky’s geometry. Imagine a tractrix
(the path something takes when you drag it by a leash horizontally) rotated around
its long axis to generate a shape not unlike two trumpets glued bell to bell.


This is called a “pseudosphere,” and it can be thought of as the opposite of a
sphere or, if you are feeling adventurous, as a sphere of imaginary radius.
The surface of a pseudosphere behaves according to the rules of hyperbolic
geometry. The geodesic of a pseudosphere, the minimal-length connection
between two points, is again a curve, but not a section of a great circle, as it was
in the world of spherical geometry.

In this discussion, we have seen how basic axioms can define a mental world
and that, by changing the axioms, we change the characteristic behavior, or
reality, of this mental world. Axioms are verbal statements; to visualize the
worlds that they create, we need visual models. We saw earlier that one way to
create such models is to embed them in some sort of space. This is what we are
doing when we look at a sphere or a pseudosphere. This method has proven to
be handy, because it preserves all the geometric properties, such as length and
angle, that are determined by our axioms.

However, we don’t always have the option of looking at spatial models of the
sphere and pseudosphere. Consequently, we have developed another method of
visualizing these spaces: maps. Maps are handy because they can be drawn on
a flat piece of paper. Unlike our spatial embeddings, however, maps necessarily
distort the picture in some way. They can, nevertheless, be of great help as we
try to understand all of these strange geometries, and so it is to maps that we
next turn our attention.


SECTION 8.5

Shapes on a Plane: Making Maps
• Distortion
• Projecting the Sphere
• Poincaré’s Disk

Distortion
• Representing curved geometries on a flat surface requires distortion.
• The type of distortion depends on the technique used to project a surface
onto a map.

If you were to take a globe and compare it to a flat map of Earth, you would find
some great discrepancies. Greenland, for instance, appears much larger on a
map than on a globe. The reason for this is that whoever designed the map was
faced with the tricky task of using a flat surface to portray a piece of the surface
of a sphere. Anyone who has ever tried to wrap an unboxed basketball or soccer
ball as a gift will have encountered a similar problem—it’s difficult to take a flat
surface and attempt to form it into a sphere.

Item 2877 / Giraldon, MAPPE-MONDES SUR DIVERSES PROJECTIONS (WORLD MAP) (Ca.
1815). Courtesy of Kathleen Cohen.

For our purposes, a map is a flat representation of a curved space.


Likewise, if you’ve ever peeled an orange, especially if you’ve accomplished this
with just one strand of peel, you’ve found that you can’t lay it on a flat surface
without tearing it. This is actually just the opposite problem of gift-wrapping the
ball: here we’re trying to take a spherical surface and turn it into something
flat.

[Figure: an orange and its flattened peel]

This is also exactly the problem that the mapmaker faces. Fortunately, there
are a variety of techniques for translating an image of a curved surface onto a
flat piece of paper. What’s even better is that, once we have a general technique,
we can use it to make maps not only of spheres, but also of pseudospheres and
other curved surfaces that defy our intuition. The technique that we will focus
on in this discussion is called a “stereographic projection.”

Projecting the Sphere
• The stereographic projection of a sphere yields a map with increasing
distortion of features near its north pole.
• The stereographic projection preserves angles, but not length.
• Geodesics are portions of a circle in stereographic projections of the sphere.

Let’s first see how stereographic projection works for a sphere.


Take a sphere and place it on a plane. Let’s call the exact point where the
sphere touches the plane the south pole. The plane is tangent to the sphere at
this point. The north pole would then be the point antipodal to the south pole,
or in other words, the point directly across the interior of the sphere.


To perform a stereographic projection, we draw straight lines from the north
pole to the plane, rather like staking down a tarp or a tent.


Now, each of these lines will intersect the sphere and continue on to the plane.
This is how we will map each unique point on the sphere to a unique point on the
plane.

Notice that every point on the sphere will be uniquely mapped onto the plane
except for the north pole. Where should it go? If we notice that points arbitrarily
close to the north pole get mapped further and further out on the plane, it
makes sense to define the north pole to be mapped to infinity.
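
The construction just described can be sketched in a few lines of Python. This
is an illustration under stated assumptions rather than anything given in the
text: we place the plane at z = 0, rest a sphere of the given radius on it so
that the south pole is the origin and the north pole is (0, 0, 2 * radius), and
use function names (stereographic, sphere_point) of our own choosing. The image
of a point is where the straight line from the north pole through that point
meets the plane.

```python
import math

def stereographic(point, radius=1.0):
    """Project a point of the sphere onto the plane z = 0 from the north pole.

    Assumed setup: the sphere rests on the plane at the south pole (0, 0, 0)
    and its north pole is (0, 0, 2 * radius).
    """
    x, y, z = point
    if math.isclose(z, 2.0 * radius):
        raise ValueError("the north pole has no image in the plane")
    # Parameter at which the line from the north pole through the point reaches z = 0.
    t = 2.0 * radius / (2.0 * radius - z)
    return (t * x, t * y)

def sphere_point(colatitude, longitude, radius=1.0):
    """Point on the sphere at angle `colatitude` (radians) away from the south pole."""
    return (
        radius * math.sin(colatitude) * math.cos(longitude),
        radius * math.sin(colatitude) * math.sin(longitude),
        radius * (1.0 - math.cos(colatitude)),
    )

# The south pole stays put, the equator lands on a circle, and points close
# to the north pole are thrown far out on the plane.
for colatitude in (0.0, math.pi / 2, math.pi - 0.1, math.pi - 0.001):
    image = stereographic(sphere_point(colatitude, 0.0))
    print(colatitude, math.hypot(*image))
```

In this setup the equator lands at distance 2 * radius from the south pole,
and the printed distances grow without limit as the colatitude approaches the
north pole, which is the sense in which the north pole is “mapped to infinity.”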

What happens to geodesics in this mapping? To answer this question, we can
start by looking at the equator, which is a special case of geodesic. Notice that
the equator gets mapped to a circle on the plane.


We call this the “equatorial circle.” All other geodesics on the sphere get
mapped to circles and lines in the plane that intersect the equatorial circle at
two opposite (called “antipodal”) points.

What determines whether or not a geodesic on the sphere gets mapped to a
circle or a line on the plane? Recall that the north pole gets mapped to infinity,
so any geodesic on the sphere that passes through the north pole will, when
mapped to the plane, extend to infinity, forming what we normally think of as a
line.

[Figure: geodesics on the sphere and their images on the plane]

A great thing about the stereographic projection is that it is conformal, which
means that it preserves the angles between geodesics. In other words, the
shape of an object, such as a triangle, is preserved because its angles are, but
its size is not preserved: in order to preserve angles, the projection must
distort lengths.
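
Both halves of this statement can be checked numerically. The Python sketch
below is an illustration under assumptions of our own: a unit sphere resting on
the plane at its south pole and projected from its north pole, in which case a
point at angle `colat` from the south pole lands at distance 2 * tan(colat / 2)
from the south pole. The function names are hypothetical. It takes two short
steps on the sphere that meet at a known angle and compares that angle, and the
length of one step, with what appears on the map.

```python
import math

def project(colat, lon):
    """Map image of the unit-sphere point at `colat` radians from the south pole."""
    r = 2.0 * math.tan(colat / 2.0)
    return (r * math.cos(lon), r * math.sin(lon))

def image_angle(colat, lon, alpha, h=1e-6):
    """Angle on the map between a meridian and a curve that crosses it at
    angle `alpha` (radians) on the sphere, estimated from two short steps."""
    base = project(colat, lon)
    # A step of length about h along the meridian.
    along = project(colat + h, lon)
    # A step of length about h at angle alpha to the meridian. The longitude
    # change is divided by sin(colat) because parallels shrink toward the poles.
    tilted = project(colat + h * math.cos(alpha),
                     lon + h * math.sin(alpha) / math.sin(colat))
    u = (along[0] - base[0], along[1] - base[1])
    v = (tilted[0] - base[0], tilted[1] - base[1])
    cross = u[0] * v[1] - u[1] * v[0]
    dot = u[0] * v[0] + u[1] * v[1]
    return abs(math.atan2(cross, dot))

def stretch(colat, h=1e-6):
    """How many times longer a short meridian step becomes on the map."""
    a = project(colat, 0.0)
    b = project(colat + h, 0.0)
    return math.hypot(b[0] - a[0], b[1] - a[1]) / h

alpha = math.radians(40.0)
for colat in (0.3, math.pi / 2, 2.8):
    print(colat, math.degrees(image_angle(colat, 0.5, alpha)), stretch(colat))
```

The angle printed stays at about 40 degrees wherever the steps are taken, while
the stretch factor starts near 1 by the south pole, reaches about 2 at the
equator, and keeps growing toward the north pole.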

[Figure: triangles A, B, and C on the sphere and their projections]

Notice that triangle A, in the southern hemisphere, looks more or less the same
size in its projection, whereas triangles B and C, each located progressively
further north on the sphere, get larger and larger on the plane. The triangles
are still three-sided figures, but their sizes have changed. Also, notice that they
appear “fat” on this map. This is a consequence of the fact that the three angles
of a triangle can add up to more than 180 degrees in spherical geometry. The
only way to represent this on the map is to replace the straight geodesics of the
plane with the curved geodesics of the sphere.

The Poincaré Disk


• Hyperbolic space can be modeled using the Poincaré disk.
• The boundary of the Poincaré disk represents infinity.
• Most geodesics map as semi-circles that form right angles with the
boundary of the disk.
• Triangles on the disk are “skinny.”

Now that we have seen a map of spherical space, let’s look at a map of
hyperbolic space. We saw earlier that the pseudosphere is a good model of
hyperbolic space. In a process similar to the one we used with the sphere, we
can make a map of the pseudosphere.

This map is called a Poincaré disk, in honor of Henri Poincaré, the great
French mathematician, who was its initial creator. The boundary of the disk is
mapped to infinity. Most geodesics are represented on this map as semicircles
that make right angles with the boundary, signifying that lines in hyperbolic
space both “begin” and “end” at infinity. Geodesics that pass through the center
of the disk, however, are represented as straight lines, connecting antipodal
points.

Multiple parallel lines through a given point. Note that the edges of the disk
represent infinity.
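
To see in what sense the boundary of the disk “is” infinity, it helps to measure
distances the way an inhabitant of the disk would. The Python sketch below is an
illustration, not part of the original text. It assumes the standard model on
the unit disk with curvature -1, in which the hyperbolic distance between points
u and v is arcosh(1 + 2|u - v|^2 / ((1 - |u|^2)(1 - |v|^2))); the function name
poincare_distance is our own.

```python
import math

def poincare_distance(u, v):
    """Hyperbolic distance between two points strictly inside the unit disk
    (assumed model: Poincaré disk of curvature -1)."""
    uu = u[0] ** 2 + u[1] ** 2
    vv = v[0] ** 2 + v[1] ** 2
    if uu >= 1.0 or vv >= 1.0:
        raise ValueError("points must lie strictly inside the unit disk")
    gap = (u[0] - v[0]) ** 2 + (u[1] - v[1]) ** 2
    return math.acosh(1.0 + 2.0 * gap / ((1.0 - uu) * (1.0 - vv)))

# Walk from the center toward the edge: equal-looking steps on paper
# correspond to ever longer hyperbolic distances.
center = (0.0, 0.0)
for r in (0.5, 0.9, 0.99, 0.999, 0.9999):
    print(r, poincare_distance(center, (r, 0.0)))
```

Each time the remaining gap to the edge shrinks by a factor of ten, the distance
from the center grows by roughly the same additional amount, so a traveler in
the disk never reaches the boundary.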

Like the spherical map we just saw, this map is conformal—it preserves
the angles between geodesics. We can see quite easily how Saccheri’s
quadrilateral, if mapped on a Poincaré disk, would have acute summit angles.


Triangles in hyperbolic space appear, on this map, to be “skinny” or “cuspy,”
showing that their angles add up to less than 180 degrees.

Item 1725 / Jos Leys, HYP023 (2002). Courtesy of Jos Leys.

A flat map of hyperbolic space tiled with quadrilaterals.

We can see that the “many-parallels” version of Euclid’s fifth postulate is
obeyed. If we draw a geodesic, we will get a rainbow shape. If we then choose
a point not on that line, we will be able to draw as many parallel lines as we
choose. In other words, this proves that we do indeed have a map of hyperbolic
space.


We see from these examples that it is possible to make maps of the different
geometries discussed so far. It should be evident from exploring the nature of
these geometries and their maps that, to be comfortable “getting around” in
these new multi-dimensional realms, we are going to have to understand curves
as well as we understand straight lines. To further that understanding, we now
turn to the subject of how to measure curvature.


SECTION 8.6

Curvature
• The Kissing Circle
• Curved Surfaces
• Pi Doesn’t Have To Be 3.14159...

The Kissing Circle


• We can measure the curvature of a curve in the plane extrinsically via an
osculating circle.
• Intrinsic measurements of curvature are impossible in one dimension.

In looking at both the sphere and the pseudosphere, we see that they are unlike
the plane in that they are both curved surfaces. Furthermore, we saw that
“straight” lines (lines of the shortest length, in other words) are not straight
at all on these surfaces; rather, they are curves. In order to explore these
surfaces, and others that do not obey Euclid’s fifth postulate, we need to be able
to discuss curves meaningfully. Let’s start with a simple curve in a plane:


How can we describe this curve’s curviness? We could compare it to a circle,
a “perfect curve” in some respects, but it is evident that this curve is not really
even close to being a circle. It has regions that seem more tightly curved than
others, and it even has regions that curve in opposite directions. When we
look at a curve in this way, in the broader context of the plane, we are viewing
it extrinsically. By contrast, viewing a curve intrinsically, that is from the point
of view of someone on the curve, yields a different perspective and different
possible measurements.


So, perhaps a good thing to do would be, instead of talking about the curvature
of the whole thing right away, to talk about the curvature at each point along the
curve. As theoretical travelers along the curve, we could stop at each point and
ask, “What size circle would define the curve in the immediate vicinity of this
point?”

[Figure: osculating circle]

First, let’s think of the tangent line at this point. This is a line that intersects
our curve only at this one point (locally, that is—it’s possible that the line might
also intersect the curve at other more-distant points). We can also think of
the tangent as the one straight line that best approximates our curve at this
particular point. Let’s then draw a line from this tangent point, perpendicular to
the tangent line.

Let this line, called a “normal line” or just a “normal,” be of a length that, if it
were the radius of a circle, that circle would be the biggest possible circle that
still touches the curve in only one place. In other words, the normal line should
be the radius of the circle that best approximates the curve at this particular
point. Such a circle is called an “osculating circle,” which literally means
“kissing” circle, because it just barely touches, or “kisses,” the curve at this one
point.

[Figure: osculating circles]

We can define the curvature of a curve at any particular point to be the
reciprocal of the radius of the osculating circle that fits the curve at that point.
Let’s refer to this curvature as “k” from here on out.

What should we do, though, about the fact that some parts of the curve open
upward, whereas other parts of the curve open down? If we designate that the
normal always points to the same side of the curve—let’s choose upward for our
case—then when the normal happens to be on the same side as the osculating
circle, we’ll call this negative curvature, and when the normal happens to be on
the opposite side from the osculating circle, we’ll call this positive curvature.
At any point where the line is flat (i.e., straight), we don’t need an osculating
circle, and we’ll call this zero curvature. The choice of defining which curvature
to consider positive and which to consider negative is completely arbitrary. The
method chosen for this example is nice because, if we think of our planar curve
as a landscape, then the positively curved areas are the hills and the negatively
curved areas are the valleys.
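
For a curve given as the graph of a function, k can be estimated without
drawing any circles. The Python sketch below is an illustration under
assumptions of our own: the curve is written as y = f(x), the derivatives are
approximated by finite differences with a step h that we chose arbitrarily, and
the function name signed_curvature is hypothetical. It uses the calculus formula
for the curvature of a graph, |y''| / (1 + y'^2)^(3/2), and attaches the sign so
that hills come out positive and valleys negative, as in the convention above.

```python
import math

def signed_curvature(f, x, h=1e-3):
    """Estimate k for the graph y = f(x) at x.

    Sign convention (as in the surrounding text): hills positive, valleys negative.
    """
    slope = (f(x + h) - f(x - h)) / (2.0 * h)          # approximates y'
    bend = (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)  # approximates y''
    return -bend / (1.0 + slope * slope) ** 1.5

def valley(x):
    return x * x                      # a parabola opening upward

def hill(x):
    return math.sqrt(4.0 - x * x)     # top half of a circle of radius 2

for name, curve in (("valley", valley), ("hill", hill)):
    k = signed_curvature(curve, 0.0)
    print(name, k, 1.0 / abs(k))      # curvature and osculating-circle radius
```

At the top of the hill the estimate is about 1/2, the reciprocal of the circle’s
radius of 2, as the definition of k requires. At the bottom of the parabola it
is about -2, so the osculating circle there has a radius of about 1/2.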

[Figure: a curve in the x-y plane with regions of positive, zero, and negative curvature]

An interesting feature of looking at a curve in this way is that, were we
one-dimensional beings living on the curve, we would not notice that it is curved
at all. This is called an intrinsic view. The only thing that can be measured
intrinsically on a curve is its length, and length alone tells us nothing about how
curvy a one-dimensional object is.

Remember that the way we quantified the curvature of this curve was to
compare it to a circle in the plane. For one-dimensional beings, this
would require envisioning one more dimension than is available to their
perception. The curvature becomes apparent only when the curve is viewed by
an observer not on the curve itself, that is, one who can see it extrinsically in
two dimensions.

Using this system, we can meaningfully talk about any curve in a plane, and we
know from previous discussions that once we understand something in a
lower-dimensional setting, we can generalize our thinking to a higher-dimensional
setting. In this case, instead of talking about plane curves, we will return to our
curved surfaces, such as the sphere and the pseudosphere.

Curved Surfaces
• One extrinsic way to measure the curvature of a two-dimensional surface is
through principal curvatures.
• Principal curvatures cannot be used intrinsically to measure curvature.

Let’s take a moment to compare and contrast our plane curve and our curved
surface. Our plane curve, though drawn in a two-dimensional plane, is actually
only a one-dimensional object. This is because, if you were an ant living on
this curve, you would only have the option of traveling forward or backward.
Because of this, you wouldn’t even really know that your line was curved.

[Figure: a plane curve, on which an ant can go only forward or backward, and a
curved surface, on which an ant can go forward, backward, right, or left]

A curved surface, on the other hand, is two-dimensional. If you were an
ant living on it, you could move forward, backward, right, or left. As you
can see, however, a curved surface requires a third dimension to represent
it extrinsically, just as a one-dimensional planar curve requires a second
dimension for its extrinsic representation. We said that an ant on a plane curve
cannot experience this second dimension and, thus, has no idea that its world is
curved. Is the same true for an ant on the surface, however? It can’t experience
the third dimension, but might it still be able to find out if its world is curved?

To resolve this, we need to find a way to apply our concept of the osculating
circle to a curved 2-D surface. Actually, we can begin the same way as before.


Normal plane slicing a curved surface.

Let’s pick a point on our surface and define the normal (remember, that’s a line
that is exactly perpendicular to the surface at this point). If it helps, imagine
the plane that is tangent at this one point as a flat meadow, and envision the
normal as a tree growing straight up in the middle of the meadow. Now that we
have both our point and our normal set, we can look at slices of the surface that
contain both the point and the normal.

It is clear that each of these slices through the surface will show a slightly
different curve, yet all of them contain our point of interest. So, which of these
slices is “the” curvature at this point? We have so many possibilities to choose
from!

One path toward a solution involves considering the extreme values—in other
words, the curve that is most positively curved and the curve that is most
negatively curved. We call these the “principal curvatures.” If we were then to
take the average of these two quantities, we would have a mean curvature for
this point.

Would an ant on this surface be able to find, or develop an awareness of, these
principal curvatures? To do so, it would have to have some idea of a plane that
is perpendicular to the plane of its current existence. The complicating factor
here is that the ant has no idea that another perpendicular direction can even
exist! It will have a great deal of difficulty trying to figure out curves that can
only be seen with the aid of a perspective that it can’t have.

All hope is not lost, however. Again, we can turn to Gauss. His Theorema
Egregium says that there is a type of curvature that is intrinsic to a surface.
That is, it can be perceived by one who lives on the surface. Usually, this
curvature, called the Gaussian curvature, is simply defined as the product of the
two principal curvatures. For our example here, however, that will not be good
enough, because our ant can’t even know the principal curvatures!
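
From the outside, where the slices are available, these quantities are easy to
compute. The Python sketch below is an illustration under assumptions of our
own: near the chosen point the surface is written as a height z = f(x, y) over
its tangent plane (the “meadow”), so that the curvature of each normal slice at
that point is the second derivative of f along the slice, and the inputs fxx,
fxy, and fyy are the second partial derivatives of f there. The extreme slice
curvatures are then the eigenvalues of the 2-by-2 matrix of those derivatives.
The function names are hypothetical, and upward-opening slices are counted as
positive here, which is a choice of sign only.

```python
import math

def principal_curvatures(fxx, fxy, fyy):
    """Largest and smallest normal-slice curvatures at a point where the
    tangent plane is horizontal: the eigenvalues of [[fxx, fxy], [fxy, fyy]]."""
    middle = (fxx + fyy) / 2.0
    spread = math.hypot((fxx - fyy) / 2.0, fxy)
    return middle + spread, middle - spread

def mean_curvature(k1, k2):
    """Average of the two principal curvatures."""
    return (k1 + k2) / 2.0

def gaussian_curvature(k1, k2):
    """Product of the two principal curvatures."""
    return k1 * k2

# Second derivatives at the origin for three sample surfaces (our examples):
samples = {
    "bowl    z = (x^2 + y^2) / 2": (1.0, 0.0, 1.0),
    "saddle  z = x^2 - y^2": (2.0, 0.0, -2.0),
    "trough  z = x^2 / 2": (1.0, 0.0, 0.0),
}
for name, (fxx, fxy, fyy) in samples.items():
    k1, k2 = principal_curvatures(fxx, fxy, fyy)
    print(name, k1, k2, mean_curvature(k1, k2), gaussian_curvature(k1, k2))
```

The bowl has positive Gaussian curvature, the saddle negative, and the trough
zero. Flipping the sign convention would flip both principal curvatures and the
mean curvature, but not their product, which is one hint that the Gaussian
curvature is the more robust quantity.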

Pi Doesn’t Have To Be 3.14159…


• One can measure the intrinsic curvature of a surface by drawing circles and
comparing their circumferences to their radii.
• Positive curvature yields a smaller circumference than we would expect for
a given radius.
• Negative curvature yields a larger circumference than we would expect for a
given radius.

Instead of trying to find the principal curvatures, the ant can draw a circle on
his surface and look at the ratio of the circumference to the diameter. This ratio
is often known as π, and in flat space it is about 3.14159. We usually consider
π to be a universal constant, and it can be, but that depends on which universe
we are talking about. In a Euclidean universe, π is indeed constant. In
non-Euclidean universes, however, the value of π depends on where exactly the
circle is drawn; it’s not a constant at all!

[Figure: trampoline, labeled “1”]

Consider a circular trampoline. The circumference of this trampoline is fixed,
but the webbing in the center is flexible and can be thought of as a surface.
When no one is standing on the trampoline, the ratio of its circumference to its
diameter is indeed π. Now, consider what happens when someone stands in the
middle of the trampoline: the fabric stretches and the diameter, as measured on
the surface, increases.

[Figure: trampoline, labeled “1” and “<2π”]

The circumference, however, remains unchanged as the surface is stretched.
This causes the ratio of the circumference to the diameter, π, to decrease. Our
ant could indeed detect such a distortion! This would be positive curvature.

[Figure: circles of radius r in three geometries. Spherical: circumference < 2πr.
Euclidean: circumference = 2πr. Hyperbolic: circumference > 2πr.]

3 circles, 3 geometries, 3 different ratios of circumference to diameter

For an example of how an ant could detect the curvature of a surface, consider
a globe. If we draw a small circle near the north pole, it will be more or less
indistinguishable from circles drawn on a flat plane. Now draw a circle that is
a bit bigger, say at the 45th parallel. This line is halfway between the north pole
and the equator. Its radius, as measured on the surface, will be considerably
longer in proportion to the circumference of the circle than was the case for the
small polar circle. Therefore, the ratio of circumference to diameter will be
smaller (in other words, a larger diameter for a smaller circumference).

[Figure: an ant on a globe. The surface seems flat locally; the circles at the
45th parallel and at the equator are each labeled “<2π”.]

Now consider the circle represented by the equator. The diameter of this circle,
as measured on the surface, will be half the length of the circle! This would
mean that for the equator, π is equal to 2. This is quite a discrepancy from the
customary 3.14… value, and it indicates that we must be on a curved surface.
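
The ant’s measurements on the globe can be reproduced with a little
trigonometry. The Python sketch below is an illustration under assumptions of
our own: the globe is a perfect sphere, every circle is centered on the north
pole, and the circle’s radius is measured along the surface. A circle at angle
`colatitude` (in radians) from the pole then has surface radius R * colatitude
and circumference 2 * pi * R * sin(colatitude). The function names are
hypothetical.

```python
import math

def circle_on_sphere(colatitude, sphere_radius=1.0):
    """Surface radius and circumference of the circle of points at the given
    angular distance (radians) from the north pole."""
    surface_radius = sphere_radius * colatitude
    circumference = 2.0 * math.pi * sphere_radius * math.sin(colatitude)
    return surface_radius, circumference

def measured_pi(colatitude, sphere_radius=1.0):
    """Circumference divided by diameter, as the ant would measure it."""
    surface_radius, circumference = circle_on_sphere(colatitude, sphere_radius)
    return circumference / (2.0 * surface_radius)

for label, degrees_from_pole in (("small polar circle", 1.0),
                                 ("45th parallel", 45.0),
                                 ("equator", 90.0)):
    print(label, measured_pi(math.radians(degrees_from_pole)))
```

The small polar circle gives a value very close to the familiar 3.14159, the
45th parallel gives about 2.83, and the equator gives 2, matching the
discussion above. The sphere’s radius cancels out of the ratio, so a larger or
smaller globe gives the same three numbers.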

Negative curvature can be visualized as a saddle. Such a surface has more
circumference for a given radius (and, hence, diameter) than we would expect
with either flat or positive curvature.

Gaussian curvature is not as concerned with determining specific values of π as
it is with measuring how π changes as the radius changes. The more curved a
surface is, the faster π will change for circles of increasing radius.
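
One way to make this precise, offered here as a supplement rather than as the
text’s own method, is a limit formula often credited to Bertrand, Diguet, and
Puiseux: for a small circle of surface radius r and circumference C, the
Gaussian curvature is approximately 3 * (2 * pi * r - C) / (pi * r^3), and the
approximation improves as r shrinks. The Python sketch below applies it to
three test surfaces whose circumference formulas we assume: a sphere of radius
2, a flat plane, and a hyperbolic plane of curvature -1. The function name
curvature_from_circle is hypothetical.

```python
import math

def curvature_from_circle(radius, circumference):
    """Estimate Gaussian curvature from one small circle measured on the
    surface: 3 * (2*pi*r - C) / (pi * r**3)."""
    deficit = 2.0 * math.pi * radius - circumference
    return 3.0 * deficit / (math.pi * radius ** 3)

r = 0.01  # a small circle, measured along the surface

# Circumference of a circle of surface radius r on each test surface.
sphere_radius = 2.0
on_sphere = 2.0 * math.pi * sphere_radius * math.sin(r / sphere_radius)
on_plane = 2.0 * math.pi * r
on_hyperbolic_plane = 2.0 * math.pi * math.sinh(r)

print("sphere of radius 2:", curvature_from_circle(r, on_sphere))
print("flat plane:", curvature_from_circle(r, on_plane))
print("hyperbolic plane:", curvature_from_circle(r, on_hyperbolic_plane))
```

The estimates come out close to 0.25 (that is, 1 divided by the square of the
sphere’s radius), 0, and -1. Everything the function needs, a radius and a
circumference, can be measured by an ant that never leaves the surface.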

This idea, that there are certain properties that can be measured regardless
of how our curve sits in space, was important in our topology unit and, as we
have just seen, it plays a significant role in our discussion of curvature as well.
These intrinsic properties of a surface (or of the generalization of a surface, a
manifold) are definable and measurable without regard to any external frame
of reference. The Gaussian curvature is such a property, but the principal
curvatures are not.

Recall that to find the principal curvatures, one must take perpendicular
slices, which requires that our surface sit in some higher-dimensional space.
This is an extrinsic view. The fact that the Gaussian curvature of a surface,
as computed by the principal curvatures, yields an intrinsic quantity is quite
remarkable. In fact, it is known to this day as “Gauss’s Theorema Egregium,”
meaning “Gauss’s Remarkable Theorem.”

So, a natural question to ask might be: what kind of surface do we live on? We
must have an intrinsic view of whatever space we inhabit—indeed, we have no
way to get outside of it! A bit of thought, though, will lead to the realization that,
unlike ants, we perceive a third dimension, so whatever this is that we are all
living on, it is not a 2-D surface, but rather a 3-D manifold. A manifold can be
thought of as a higher-dimensional surface, or it might help to think of it as a
collection of points that sits in some larger collection of points.

Furthermore, our everyday experience includes a fourth degree of freedom,
time. If we consider time to be part and parcel of our reality, then we are
really living in a 4-D manifold called “spacetime.” So, is our spacetime the 4-D
equivalent of a flat, Euclidean plane, or is our reality curved in some spherical
or hyperbolic way? For help in exploring this question, we’ll turn to the ideas
of a certain former patent clerk whose theories permanently altered the way in
which we view our universe. First, however, we need to consider what happens
when our surface is not as “nice” as a simple, smooth sphere or pseudosphere.


SECTION 8.7
General Relativity
• Riemann’s Hills and Valleys
• Einstein

Riemann’s Hills and Valleys


• Riemann described how to deal with geometries that contain regions of
positive and negative curvature.
• The concept of the curvature of 3-D manifolds paved the way for Einstein’s
work.

We saw earlier that an ant on a globe can find the curvature of his world by
drawing circles of differing sizes and seeing how the value of π changes. The
globe is a surface of constant positive curvature, which simply means that no
matter where our ant decides to begin drawing circles, he’ll find that π changes
in the same way.

However, what if the surface that the ant draws on were not so uniform? Let’s
say that, just like the real surface of Earth, the ant’s surface has hills and
valleys. Geometrically, this would mean that some parts of his world are more
curved than others. Riemann studied surfaces like this and came up with a
generalized way to deal with such surfaces and manifolds, ones with so-called
non-constant curvature.


While the details of this method are beyond our scope, there are a couple of key
features worth noting. The first condition is that all the hills and valleys must be
smooth; in other words, there is a way to get from point to point without
encountering any cliffs or walls. As long as this condition is met, Riemann said
that it is possible to use the local geometry to find the lengths of curves. So, our
ant would basically draw circles, as it did before, but it would see that π changes
in different ways, depending on the specific placements of the circles.

Figure: π changes in different ways depending on where the ant starts drawing
circles.

Using this method, the ant can measure the intrinsic curvature of the different
regions of its world. We find this idea of a non-constantly-curved surface easy to
visualize because we have a vantage point from the third dimension.
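
Riemann's idea of using local geometry to find the lengths of curves can be
sketched as follows. This is only an illustration under stated assumptions: the
"local geometry" is supplied as a function returning the coefficients E and G of
a line element ds² = E du² + G dv², the example surface is a unit sphere rather
than one with hills and valleys, and all names are hypothetical. A bumpier
surface would simply supply a different metric function.

```python
import math

def curve_length(metric, path, t0, t1, steps=10_000):
    """Approximate the length of path(t) = (u, v) under ds^2 = E du^2 + G dv^2.

    metric(u, v) returns (E, G); the path is cut into `steps` short segments
    and each segment is measured with the metric at its midpoint.
    """
    total = 0.0
    dt = (t1 - t0) / steps
    for i in range(steps):
        ua, va = path(t0 + i * dt)
        ub, vb = path(t0 + (i + 1) * dt)
        E, G = metric((ua + ub) / 2, (va + vb) / 2)
        total += math.sqrt(E * (ub - ua) ** 2 + G * (vb - va) ** 2)
    return total

def unit_sphere_metric(theta, phi):
    # theta: angle measured from the north pole; phi: longitude
    return 1.0, math.sin(theta) ** 2

def circle_about_pole(r):
    # All points at geodesic distance r from the north pole, traced by longitude t
    return lambda t: (r, t)

r = 1.0
circumference = curve_length(unit_sphere_metric, circle_about_pole(r), 0.0, 2 * math.pi)
print(circumference / (2 * r))   # the ant's "π" for this circle
```

Nothing in `curve_length` looks at the surface from outside; it uses only
quantities that the ant could, in principle, measure on the surface itself.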

There is no reason, however, to think that only 2-D manifolds can be curved. In
fact, it would be surprising if that were true, because we have seen that math
concepts often generalize to higher dimensions. What would curvature in a
3-D manifold look like? This question puts us in a bit of a bind, for, just like the
ant, we have no higher-dimensional viewpoint from which we can observe this
curved space. We are stuck in it! Also, we must consider that our space is not
of uniform curvature. That is, the ant’s surface can have mountains and valleys,
so our space might have the 3-D equivalent. Thus, there might be pockets of
our space that are more curved than others, and there might be places that are
more or less flat.


Einstein

• Einstein said that space and time curve around massive objects, creating
gravity.
• Events follow geodesics in this curved spacetime.
• The General Theory of Relativity has been experimentally verified.

Einstein shed light on these questions with his General Theory of Relativity,
which he developed in 1915-1916. He used the notion that some areas of a
surface are more curved than others. All that is required is that, if we look
closely enough at the surface, it appears to be flat.

Figure: Locally, the surface appears flat.

Likewise, if you look at a planar curve closely enough, it will appear to be a
straight line. A straight line is, of course, the geodesic of a Euclidean plane. So,
there are regions, however small, of any line that will appear as a geodesic of
the flat plane. A geodesic in a 3-D manifold will also be a line. If that manifold
has curvature, then the geodesic line will be curved also. Nonetheless, just
as with the planar curve, we can look at this curve in our 3-D manifold closely
enough for it to seem straight.


Figure: Paths from A to B in a flat 3-D manifold and in a curved 3-D manifold.

What would a geodesic in spacetime look like? Following the definition, it would
be the minimal-length connection between one time-and-place and another
time-and-place. Another way to think about a minimal-length connection is to
think of the path of least resistance. A geodesic can then be considered to be
the path that requires the least amount of energy. If we place an object at an
arbitrary point in spacetime, whatever it does naturally can be considered to be
the geodesic of that spacetime. If there happens to be a massive object, such
as a planet, nearby, we would expect our test object to “fall” toward the planet.
This suggests that the state of free fall is a minimal-length connection “in
action” in spacetime. In other words, falling is like following a geodesic.

Einstein noticed that an object in free fall “feels” no force of gravity.³ This is
analogous to an ant looking very closely at a curved surface and seeing it to
be flat—it doesn’t see the curvature. Likewise, if one thinks of gravity as the
curvature of our 4-D manifold of spacetime, then being in free fall, in which the
effects of gravity are not noticed, is like looking closely at the curved surface. In
other words, it is a perspective from which geodesics appear to be flat (straight
lines). Free fall is just a straight-line geodesic through spacetime.

Using reasoning such as this as a starting point, Einstein re-interpreted
gravity to be a result of the geometry of a curved spacetime. He said that it is
not a force in the Newtonian sense; rather, it is the effect of living in a curved
manifold. So, in other words, whereas the ant must draw circles to experience
the curvature of his world, we experience the curvature of our world through
gravity.

³ Supposedly, this was after interviewing a painter who had fallen off a house.


This interpretation of gravity has been experimentally verified in numerous
ways. The first such verification occurred during a solar eclipse in 1919. Arthur
Eddington found that certain stars, which, according to calculations, were behind
the sun at the time, were visible during the eclipse.

Figure: The sun and the moon during the eclipse.

The only way to interpret this phenomenon is to say that the light of the hidden
stars “bent” around the sun, like so:

Figure: Starlight bending around the sun during the eclipse.


This was an early verification that light, which should normally travel in straight
lines, actually travels in a curved path in the presence of a massive object.⁴ This
implied that the geodesics in the vicinity of the sun are curved, the cause being
the mass of the sun.

Since this initial verification, other experiments also have shown the predictions
of general relativity to be accurate. Einstein’s theory that massive objects
cause the spacetime in their vicinity to warp and that we experience this effect
as gravity was a breakthrough in our understanding of physics. It was also
a beautiful example of a mathematical idea that at first had little real-world
application (i.e., the Riemannian geometry of non-constant curvature) turning
out to be at the heart of one of the most fundamental phenomena in our human
experience, gravity.
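
To get a feel for the size of the effect measured during the eclipse, here is a
rough calculation. It relies on a formula that is not derived in this unit:
general relativity's standard prediction that light passing a mass M at closest
distance b is deflected by an angle of 4GM/(c²b). The constants are rounded
reference values, and the function name is hypothetical.

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8        # speed of light, m/s
M_SUN = 1.989e30   # mass of the sun, kg
R_SUN = 6.957e8    # radius of the sun, m

def deflection_arcsec(mass, closest_distance):
    """Bending angle, in arcseconds, of light passing a mass at the given distance."""
    radians = 4 * G * mass / (C ** 2 * closest_distance)
    return math.degrees(radians) * 3600

# Light that just grazes the edge of the sun
print(f"{deflection_arcsec(M_SUN, R_SUN):.2f} arcseconds")
```

The result is a little under two arcseconds, a tiny angle, which is why the
measurement had to wait for an eclipse to dim the sun's glare.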

The General Theory of Relativity describes how mass causes the geometry of
spacetime to curve locally. One can extend this thinking from considering the
masses and motions of planets around a star to stars around a galaxy, galaxies
around each other, clusters of galaxies around other clusters, and eventually
to the large-scale curvature of the universe itself. All the mass in the universe
must surely curve spacetime into some shape, and most probably that shape
exhibits non-constant curvature.

We know how mass causes local curvature of spacetime, but would spacetime
be flat were the mass not present? If so, would all of spacetime be flat without
mass, or would it just appear to be flat because we are like the ant looking too
closely at his surface to see any curvature? One of the most intriguing questions
in both mathematics and physics is the question of the underlying geometry of
reality.

⁴ Light paths are also geodesics of spacetime; we cannot conceive of something
that would find a more efficient path from point A to point B.


SECTION 8.8
Geometrization Conjecture
• Many Ways To Be Hyperbolic
• Hyperbolic Space

Many Ways To Be Hyperbolic


• We can examine different topological surfaces via the types of geometry they
admit.
• The polygons that can completely tile a given surface help to determine the
type of global geometry in which that surface exists.
• Most 2-D surfaces are hyperbolic.
Recall that in our unit on topology, we learned about the classification of 2-D
surfaces. This classification was supported by looking at different 2-D surfaces
under various deformations and seeing that some surfaces really are just like
other surfaces. For example, both inflated and deflated beach balls are really
just spheres. The theorem that we explored in that earlier discussion was that
every 2-D surface can be turned into either a sphere or a torus with varying
numbers of holes. We will be concerned only with orientable surfaces in this
present discussion.

Inflated and deflated beach balls are topologically equivalent.
A donut and a coffee cup are topologically equivalent.

We can characterize the global curvature of these surfaces by examining what
sort of shapes would be necessary to cover the surfaces completely. These tiles
should have 90-degree angles at all vertices so that whenever four vertices
meet, the angular sum is 360 degrees. This requirement ensures that every
surface appears flat when viewed up close, one of the key criteria of a manifold.
For a sphere, the only tiles that will satisfy this condition are equilateral
triangles in which each angle is 90 degrees.


Eight triangles can cover a sphere.
A torus covered by four rectangles.

Notice that the angles of each of these triangles sum to 270 degrees, a bit more
than the Euclidean 180 degrees that we would expect. This reminds us that the
surface of a sphere has positive curvature.

Notice that a single-holed torus can be completely tiled with four quadrilaterals
(rectangles). The angle sum, as Saccheri would no doubt recognize, is 360
degrees, which is consistent with flat, Euclidean geometry.

Notice that the two-holed torus is tiled by hexagons whose angles are all
equal to 90 degrees. A hexagon in flat space has an angle sum of 720
degrees. Looking at our hexagons, we can see that this is not true on the
surface of our two-holed torus. The angle sum of one of the hexagons that tile
the two-holed torus is 540 degrees, decidedly less than the sum expected from
Euclidean geometry. This means that our surface admits only hyperbolic
geometry.
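
The three comparisons above follow one pattern, sketched below: a tile with n
corners of 90 degrees each has an angle sum of 90n, while a flat n-sided polygon
has an angle sum of (n - 2) × 180. The function name and the wording of the
labels are chosen here for illustration only.

```python
def right_angled_tile(sides):
    """Compare a tile whose corners are all 90 degrees with its flat-plane counterpart."""
    actual = 90 * sides
    euclidean = (sides - 2) * 180
    if actual > euclidean:
        curvature = "positive (spherical)"
    elif actual == euclidean:
        curvature = "zero (flat)"
    else:
        curvature = "negative (hyperbolic)"
    return actual, euclidean, curvature

for name, sides in (("triangle", 3), ("rectangle", 4), ("hexagon", 6)):
    print(name, right_angled_tile(sides))
```

Any right-angled tile with more than four sides falls short of its Euclidean
angle sum, so it can live only on a negatively curved surface.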

We can extend this thinking to tori having any number of holes. Hexagons can
be used to tile a torus of any higher genus. This means that although there
are many ways to construct a hyperbolic 2-D surface, there is only one way to
construct a flat surface, and only one way to construct a spherical surface. If
we want to identify the large-scale geometry of a surface without any clues,
we would do well to guess “hyperbolic”; the odds of being right would be in
our favor.
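
The "one, one, and many" count can be sketched with the Euler characteristic.
The snippet assumes two standard facts that are not derived here: an orientable
closed surface with g holes has Euler characteristic 2 - 2g, and the sign of
that number matches the sign of the curvature the surface can carry uniformly.
The function names are hypothetical.

```python
def euler_characteristic(genus):
    """Euler characteristic of an orientable closed surface with `genus` holes."""
    return 2 - 2 * genus

def geometry_for_genus(genus):
    chi = euler_characteristic(genus)
    if chi > 0:
        return "spherical"
    if chi == 0:
        return "flat"
    return "hyperbolic"

for genus in range(6):
    print(genus, euler_characteristic(genus), geometry_for_genus(genus))
```

Only the sphere (no holes) is spherical and only the one-holed torus is flat;
every surface with two or more holes lands in the hyperbolic case.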

We have seen that the vast majority of surfaces that our ant can inhabit, the 2-D
universes so to speak, are hyperbolic. What about 3-D manifolds? What are
the possible geometries of the space that we inhabit? This is related to a
long-standing question, asked first by Poincaré at the turn of the twentieth
century and resolved only in the first few years of the twenty-first century.

Hyperbolic Space
• If we were to remove all the mass of the universe, the fabric of space would
most likely be hyperbolic.

As you might expect, 3-D manifolds can be curved in analogous ways to the
2-D surfaces we have seen—that is, they can be spherical, flat, or hyperbolic.
Spherical space behaves like a spherical surface in that if you travel in a straight
line far enough away from your starting point, you will always return to where
you started, without having to turn around. This implies that the space is
bounded, like the surface of a sphere. This obeys the “no-parallels” version
of Euclid’s fifth postulate. Furthermore, in analogy to the ant’s circles, if we
create a sphere and compare its surface area to the square of its radius, we will
get a smaller number than we would expect in flat space. As we create larger and
larger spheres, this ratio shrinks.


Flat space behaves in a nice, Euclidean way. It obeys all five postulates; there
is only one parallel line through a given point; lines extend to infinity. This is
probably how most of us envision space. Since it is unbounded, we can think of
it as larger than spherical space. Spheres of all sizes exhibit the same ratio of
surface area to the square of the radius.

Hyperbolic space is a strange place, indeed. It can be thought of as much larger
than both spherical and flat space. If we were to make spheres of a given radius,
we would find that they have much larger surface areas than we would expect.
Furthermore, the larger the sphere, the greater the discrepancy we would find.
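
The comparison among the three kinds of space can be sketched numerically. The
snippet assumes the standard area formulas for a sphere of radius r in a 3-D
space of constant curvature: 4πR² sin²(r/R) in spherical space, 4πr² in flat
space, and 4πR² sinh²(r/R) in hyperbolic space, where R sets the curvature
scale. The function name is hypothetical. Each area is compared with the square
of the radius.

```python
import math

def sphere_area(r, geometry, R=1.0):
    """Surface area of a sphere of radius r in a 3-D space of constant curvature.

    R sets the curvature scale; it is ignored for flat space.
    """
    if geometry == "spherical":
        return 4 * math.pi * (R * math.sin(r / R)) ** 2
    if geometry == "flat":
        return 4 * math.pi * r ** 2
    if geometry == "hyperbolic":
        return 4 * math.pi * (R * math.sinh(r / R)) ** 2
    raise ValueError(f"unknown geometry: {geometry}")

for r in (0.5, 1.0, 2.0):
    ratios = {g: sphere_area(r, g) / r ** 2
              for g in ("spherical", "flat", "hyperbolic")}
    print(r, {g: round(x, 3) for g, x in ratios.items()})
```

In flat space the ratio stays at 4π for every radius; in spherical space it
shrinks as the spheres grow, and in hyperbolic space it grows without limit.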

Note that we said earlier that we actually live in a 4-D manifold called
spacetime. So, to be clear about what we are talking about when we ask about
the geometry of space, start by imagining all of space and all of time. Now,
imagine taking what’s called a “time-slice” of this spacetime; in other words,
let’s take a snapshot by looking at just one particular moment in time.

Figure: Planets curving spacetime.

By taking this time slice, we have basically frozen the entire universe in time.
People stop walking mid-step without falling; planets stop in their orbits
around stars; galaxies cease their rotation. Geometrically, we now have a pure
3-D manifold with pockets of more or less curvature, depending on the mass
present. Now, let’s get rid of all the mass.

Figure: A flat universe, a positively curved universe, and a negatively curved
universe. Remove planets and spacetime returns to its “natural” shape.

With no mass, we are left with “pure” space. This is a 3-D manifold that
has some sort of geometry. With our now homogeneous 3-D manifold (the
inhomogeneous curving effects of mass have been mitigated), the possible
geometries are analogous to the geometries of the surfaces of the 2-D objects
that we examined before. Now, however, instead of being polygons, the tiles
will be polyhedra, and the type or types of polyhedra that will tile or “pack” a
particular space are determined by the specific geometry of that space.


So, the question remains: what are the possible geometries? In 1982 William
Thurston, one of the most influential modern geometers and topologists,
proposed that there are eight possible geometries: Euclidean, spherical,
hyperbolic, and five other systems. In the early 2000s, Grigori Perelman proved
Thurston’s claims to be correct. This result also proved the Poincaré conjecture,
which considered only spherical 3-D manifolds.

Now that Thurston’s geometrization conjecture has been proven to be correct,
and has earned the title of “theorem,” we have essentially a complete list of
possibilities for the fundamental geometry of our space. The task now is to
determine which geometry actually governs the real world we live in. This is
essentially what Gauss tried to do on his mountaintops so many years ago. The
problem, as we have seen, is that to reach a definitive answer, we need to be
able to look at extremely large shapes, much larger than anything on Earth or
even in our galaxy, perhaps.

So we are, indeed, much like the ant on its surface: we know what is happening
with the local curvature, but we are looking too closely to be able to discern
much about the large-scale geometry of our system. If we had to guess
the specific geometry of our space, we, like the ant, would do well to guess
hyperbolic. Indeed, Thurston’s Geometrization Theorem confirms that most
spaces obey the “many parallels” version of Euclid’s fifth postulate.

UNIT 8 at a glance

SECTION 8.2
Euclidean Geometry
• Not much is known of Euclid’s life.
• Although he was not responsible for all of the content in The Elements,
Euclid broke new ground in his organization of the foundational
mathematical knowledge of the day.
• Axiomatic systems are a way of creating logical order.
• Axioms are agreed-upon first principles, which are then used to generate
other statements, known as “theorems,” using logical principles.
• Systems can be internally consistent or not, depending on whether or not
their axioms admit contradictions.
• Euclid used five common notions and five postulates in The Elements.
• The fifth postulate, also known as the “parallel postulate,” is somehow not
like the others.

SECTION 8.3
Non-Euclidean Geometry
• There were multiple attempts to show that the parallel postulate was not
necessary to form an internally consistent geometry.
• Girolamo Saccheri built upon the work of Nasir Eddin in an attempt to “clear
Euclid of every flaw.”
• Saccheri looked at three cases of a quadrilateral constructed without the aid
of the fifth postulate.
• Saccheri’s results, though intriguing, were misinterpreted.


SECTION 8.4
Spherical and Hyperbolic Geometry
• Gauss realized that it is possible to construct a self-consistent geometry
with the “many parallels” version of the fifth postulate.
• He did not publish his findings.
• Lobachevsky and Bolyai independently came to the same conclusion as
Gauss.
• The “no parallels” flavor of Euclid’s fifth postulate yields the geometry of the
surface of a sphere.
• Modern geometry comes in three varieties, and each can be visualized via a
model.
• A geodesic is, in the local view, the shortest distance between two points in
whichever geometry you choose to use. In a global view, a geodesic is the
path a particle would follow were it free of the influence of all forces.
• The surface of a sphere models a geometry that admits no parallel lines.
• The surface of a pseudosphere models a geometry that admits many
parallel lines through a given point.

SECTION 8.5
Shapes on a Plane: Making Maps
• Representing curved geometries on a flat surface requires distortion.
• The type of distortion depends on the technique used to project a surface
onto a map.
• The stereographic projection of a sphere yields a map with increasing
distortion of features near its north pole.
• The stereographic projection preserves angles, but not length.
• Geodesics are circles in stereographic projections of the sphere.
• Hyperbolic space can be modeled using the Poincaré disk.
• The boundary of the Poincaré disk represents infinity.
• Most geodesics map as semi-circles that form right angles with the
boundary of the disk.
• Triangles on the disk are “skinny.”


SECTION 8.6
Curvature
• We can measure the curvature of a curve in the plane extrinsically via an
osculating circle.
• Intrinsic measurements of curvature are impossible in one dimension.
• One extrinsic way to measure the curvature of a two-dimensional surface is
through principal curvatures.
• Principal curvatures cannot be used intrinsically to measure curvature.
• One can measure the intrinsic curvature of a surface by drawing circles and
comparing their circumferences to their radii.
• Positive curvature yields a smaller circumference than we would expect for
a given radius.
• Negative curvature yields a larger circumference than we would expect for a
given radius.

SECTION 8.7
General Relativity
• Riemann described how to deal with geometries that contain regions of
positive and negative curvature.
• The concept of the curvature of 3-D manifolds paved the way for Einstein’s
work.
• Einstein said that space and time curve around massive objects, creating
gravity.
• Events follow geodesics in this curved spacetime.
• The General Theory of Relativity has been experimentally verified.

SECTION 8.8
Geometrization Conjecture
• We can examine different topological surfaces via the types of geometry they
admit.
• The polygons that can completely tile a given surface help to determine the
type of global geometry in which that surface exists.
• Most 2-D surfaces are hyperbolic.
• If we were to remove all the mass of the universe, the fabric of space would
most likely be hyperbolic.

UNIT 09
Game Theory

UNIT OBJECTIVES

• Game theory is the mathematical study of social interactions as games that have
payoffs for the players.

• Traditional game theory assumes that players are rational actors, always acting in
ways that maximize their benefits. Real people are not necessarily like this.

• The payoff matrix is the essential way to express a game mathematically.

• In zero-sum games, a winner’s gains come at a loser’s expense.

• Non-zero-sum games can include both win-win and lose-lose situations.

• Strategies are the actions that players take in a game.

• Payoffs are often frequency dependent; that is, they depend on how many people
are playing a particular strategy as well as how often a game is played.

• Equilibrium is reached when each player has no incentive to play differently.

• Game analysis changes when games are played repeatedly. This gives rise to
mixed strategies.

• Game theory can provide insight into many situations, phenomena, and subjects,
including biology, sociology, and linguistics.

What accounts for TIT FOR TAT’s robust success is its combination of being nice,
retaliatory, forgiving, and clear. Its niceness prevents it from getting into
unnecessary trouble. Its retaliation discourages the other side from persisting
when defection is tried. Its forgiveness helps restore mutual cooperation. And
its clarity makes it intelligible to the other player, thereby eliciting
long-term cooperation.

Robert Axelrod
UNIT 9 game theory
TExtbook

SECTION 9.1
INTRODUCTION

Mathematics has an on-again-off-again relationship with the real world.
There are fields of mathematics that exist, more or less, solely “for themselves.”
Researchers in these fields are primarily motivated by the most abstract of
“what-if” conjectures. The field of topology is a prime example of this, as it is
known primarily for the beauty of the thinking behind its results rather than for
its connections to reality. Much of mathematics, however, has its foundation in
the happenings of the world around us. The field of game theory, in some sense,
represents the pinnacle of this type of mathematics. It can be thought of as the
mathematical study of our human interactions.

Mathematics started off as a way to apply strict, rigorous thinking to the wildly
complicated world around us. Throughout its evolution, mathematical thought
has oscillated between two alternative paths of thinking. On the one hand, it has
often been at the vanguard of quantitative science, helping to shed new light on
previously incomprehensible phenomena through the power of logical thinking.
On the other hand, it has sometimes been taken to levels of abstraction
far beyond what any pragmatic scientist would be interested in. Even in its
more applied incarnations, mathematics involves a great many simplifying
assumptions. To get anywhere, we must reduce incredibly complicated
situations from our everyday reality into problems that can be strictly defined
and rigorously analyzed. The subject of game theory provides a great example
of this mathematical reduction in action.

Games, to a mathematician, are simplified versions of situations that arise in the
course of interactions among people, or any kinds of agents. A game represents
an idealized situation that can be analyzed mathematically to shed light on how
or why certain outcomes are reached. For example, the Ultimatum Game is a
two-player situation in which one player is given an amount of money to share
with another in any proportion desired. The receiving player then can choose
either to accept or to decline the giver’s offer. If the offer is accepted, both
players get their share of the money, as determined by the giver. If the receiver
declines, no one gets any money. This represents a vastly simplified model of
a business negotiation. As we will see later in this unit, this simple game has
far-reaching and sometimes surprising implications for how humans judge the
importance of justice and equality.
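
The rules of the Ultimatum Game are simple enough to write down directly. The
sketch below is only an illustration: the function names, the amount of 100,
and the receiver's minimum-offer rule are hypothetical choices, not part of the
game's definition.

```python
def ultimatum_payoffs(total, offer, accepted):
    """Return (giver_payoff, receiver_payoff) for one round of the Ultimatum Game."""
    if not 0 <= offer <= total:
        raise ValueError("offer must be between 0 and the total")
    if accepted:
        return total - offer, offer
    return 0, 0   # a declined offer leaves both players with nothing

def accepts(offer, minimum_offer=0):
    """Hypothetical receiver rule: accept any offer of at least minimum_offer."""
    return offer >= minimum_offer

total = 100
for offer in (1, 30, 50):
    profit_maximizer = ultimatum_payoffs(total, offer, accepts(offer))
    fairness_minded = ultimatum_payoffs(total, offer, accepts(offer, minimum_offer=30))
    print(offer, profit_maximizer, fairness_minded)
```

A receiver who only maximizes profit accepts even an offer of 1, because 1 is
better than nothing; a receiver with a minimum acceptable offer gives up money
in order to reject a split that seems unfair.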

Most games have many simplifying assumptions, as is evident throughout
this discussion, but the one that provides the foundation for all others is the
assumption that players act rationally. The classic assumption is that a rational
player will always act in a way that seems to maximize personal benefit.
This assumption of rationality is what allows the mathematical analysis of
games to work. If the people playing a game do not behave rationally, then
the results of any game-theory-supported analysis may be less relevant.
Nonetheless, the conclusions that are obtained, even under these idealized
assumptions, can be useful. Interestingly, an important use of game theory
has been to probe the limits of rationality. As we shall see in our study of the
Ultimatum Game, there are multiple concepts of what is rational, in addition to
the one based on maximizing profit.

One of the most interesting conclusions reached in game theory is that rational
actions by both players can result in situations in which both players are worse
off. The primary example of this is the Prisoner’s Dilemma, which we will see in
more detail later on in this unit.
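
A small sketch shows how this can happen. The payoff numbers below are
hypothetical, chosen only to have the shape usually associated with the
Prisoner’s Dilemma (years in prison, written as negative numbers so that larger
is better); the names are likewise illustrative.

```python
# Hypothetical payoffs: (row player's result, column player's result).
PAYOFFS = {
    ("cooperate", "cooperate"): (-1, -1),
    ("cooperate", "defect"):    (-3, 0),
    ("defect", "cooperate"):    (0, -3),
    ("defect", "defect"):       (-2, -2),
}
MOVES = ("cooperate", "defect")

def best_response(opponent_move):
    """The row player's payoff-maximizing move against a fixed opponent move."""
    return max(MOVES, key=lambda move: PAYOFFS[(move, opponent_move)][0])

for other in MOVES:
    print(f"If the other player chooses {other}, the best reply is {best_response(other)}")

print("Both follow their best reply:", PAYOFFS[("defect", "defect")])
print("Both cooperate:              ", PAYOFFS[("cooperate", "cooperate")])
```

With these numbers, defecting is the better reply no matter what the other
player does, yet when both players reason this way each ends up worse off than
if both had cooperated.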

In this unit we will examine the mathematical analysis of games, beginning with
a bit of the history and some of the motivations behind the field’s development.
From there we will examine a few simple games in order to illustrate the basic
terminology and concepts. We will then move on to more substantial games,
such as the Prisoner’s Dilemma and the Hawks and Doves game. We will
examine the various types of solutions and equilibria that exist for these games,
and we will see how these elements change, depending on whether a game is
played once or more than once. In iterated and multi-player games, we will see
how the payoff a player can expect from using a particular strategy depends on
the strategies that others employ and how frequently they do so. Along the way,
we will see how game theory can help us understand the processes of evolution,
including the evolution of language. Finally, we will see how analyzing abstract
games applies to real-life situations, such as business transactions, language
development, and avoiding nuclear war.


SECTION 9.2
ORIGINS OF GAME THEORY
• Games People Play
• Zero-Sum
• Minimize the Worst-Case Scenario
GAMES PEOPLE PLAY


• Mathematicians attempt to analyze types of interactions by treating them as
games in which players use strategies to obtain payoffs.
• Games like checkers and chess are games of perfect information; the board
shows both players all the information needed to make the right decision.
• Games like poker are games of imperfect information; no player has enough
information (the other players’ cards are hidden) to make a guaranteed right
decision.

Games have appeared throughout history, and in many different cultures, as a
form of entertainment and education. The first mathematical analyses of certain
games occurred in the 1700s, centered around the card game Le Her, but it was
not until the mid-1920s that the field of game theory was properly founded by
the multi-talented Hungarian mathematician and physicist, John von Neumann.

Von Neumann was an exceptional character and ranks among the brightest
minds of the first half of the twentieth century. He was involved in many fields,
including quantum physics, topology, computer science, and economics. He also
worked on the Manhattan Project, the top-secret effort that built the first atomic
bomb. In his leisure time, he was an avid poker player, and his interest in games
and how they “work” is said to have stemmed partly from this fascination. Poker
is a game in which mathematics and human nature square off. A player must
not only be good at calculating odds, but must also be able to read the other
players to tell if they are bluffing or not.

Poker players are able to bluff one another because poker is a game of
imperfect information. As a player, you know your own cards and you know
what the other players have wagered, but you do not know what cards they hold.
It is this missing information that brings excitement to the game and makes
it challenging. Many of the games we will explore in this discussion present
situations similar to those that arise in poker, in that players do not know all
of the information available and yet still must make decisions.
Certain games of imperfect information can lead to interesting scenarios in
which players unwittingly act against their own best interests.

By contrast, chess is a game with perfect information. In a game of chess,
knowledge of the positions (i.e., the pieces and board set-up) and all historical
information regarding the moves are available to both players. Players must
use this information to formulate and modify strategies and tactics during a
match. The challenge of chess lies not in the ambiguity of not knowing what
your opponent has to work with, but rather in the extreme complexity of possible
scenarios, each requiring a good deal of analysis. Despite their differences,
poker and chess have one very important characteristic in common: both are
zero-sum games.

ZERO-SUM
• Zero-sum games are games in which a winner’s payoff is equal to a
loser’s loss.
• Utility is an imprecise concept, but biological payoffs are measurable.

Zero-sum games are games in which one player’s loss is exactly equal to
another player’s gain. If you win a hand at poker, your winnings will add up to
the sum of your opponents’ losses (unless you play at a casino, where the house
takes a cut). At the end of the day, poker players (as a group) are no wealthier
or poorer than when they started. Likewise, in chess, one person’s victory is
balanced by the opponent’s loss. When a chess match ends in a stalemate,
both players are no better off than when they started; neither has gained or
lost anything, except perhaps time, but that is not taken into account.

Not all situations in life, or game theory, are zero-sum situations, however.
Two notable examples are business transactions and arms races. In an arms
race, two or more nations compete to build the most, and the most destructive,
weapons possible. This is a situation in which there are many outcomes that
leave all “players” worse off than when they started. If they invest heavily in
weapons, but then never use them, they have squandered a good deal of their
resources and are poorer for it. Also, if the nations use their weapons and go to
war, there is much squandering of resources, and all are worse off than before.

In business transactions, both parties exchange something for something that
they find more valuable. If you trade ten dollars for a hat, you have obviously
decided that the hat is more valuable to you than your ten dollars
(or, at least, equally valuable). Likewise, your ten dollars is more valuable
than the hat to the hat-seller. This idea of relative value is what economists
call “utility.” It is through increased or decreased utility that non-zero-sum
arrangements are possible.

Utility, however, has proved to be a rather controversial concept. When game
theory is applied to biology, or evolution, the concept of utility comes through in
the term “biological fitness.” In this context, game theory measures payoffs in
terms of reproductive success: the objective number of offspring that a “player”
leaves behind to carry on a genetic legacy.

MINIMIZE THE WORST-CASE SCENARIO


• The minimax theorem, attributable to Von Neumann, states that in a two-player,
zero-sum game each player has a strategy that minimizes his or her maximum loss.

Von Neumann focused the bulk of his research on zero-sum games. He and
economist Oskar Morgenstern established the field of game theory with the
publication of their book “Theory of Games and Economic Behavior,”
which was primarily an analysis of the zero-sum situation. Von Neumann and
Morgenstern worked from the fundamental assumption that players always
act in a way that increases their utility; that is, they always implement the
strategy that they perceive will bring them the greatest reward. This is the
classic definition of “rational,” in the context of game theory. Furthermore, Von
Neumann showed that in any two-player, zero-sum game, there will always
be a best, or optimal, strategy for each player. This will be the strategy that
maximizes the minimum possible gain, or utility, whatever the other
player does.

This theorem, known as the “minimax theorem,” set the foundation for the
mathematical study of games. It states that there is always a strategy that a
player can choose that will lead to their most-favorable worst possible outcome.
Depending on the specific rules of the game, that outcome might not be better
than that of another player, but it will be better than the alternatives, provided
the other player is playing in a similar fashion. For instance, even in a game
as complex as chess, there exists an outcome that should happen every time,
provided both players play perfectly. This means that, theoretically, one of the
three possible outcomes of chess (a white win, a black win, or a draw) should
be the “right” outcome if both players play perfectly. The game is so complex,
however, that the strategies that each person should employ to produce this
ideal outcome are unknown.

An easier example to consider is the game of tic-tac-toe (TTT). A good TTT
player can never be beaten. A good player, given the first move, will always
take a corner square, and any rational opponent will then take the center
square. Play continues from this point, with the players making the moves that
ensure that they will not lose, yet that also leave their best winning options
open. Because of the layout and the rules of TTT, this means that when two
good players play each other, the result will always be a tie. This is the “right”
outcome, provided no one makes a mistake.

One way to analyze the game progression is through the use of a “game tree.”
A game tree provides a systematic way to lay out all sequences of moves in a
game in visual format. Each move is represented by a node, whose branches
represent all the possible countermoves. Even for a game as simple as TTT, the
game tree gets very large, as there are up to 9! = 362,880 possible sequences of
moves and countermoves that must be taken into account.
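
The game tree for TTT is small enough to walk through completely by computer. The following Python sketch is one possible way to do so, not a method taken from the text: the board encoding (a nine-character string) and the function names are illustrative assumptions. It counts every complete sequence of moves and also applies the minimax idea at each node, scoring a win for X as +1, a draw as 0, and a win for O as -1.

```python
from functools import lru_cache

# Winning lines on a 3x3 board whose squares are indexed 0..8, row by row.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


@lru_cache(maxsize=None)
def explore(board, player):
    """Return (value, games) for the subtree rooted at this position.

    value is the minimax outcome from X's point of view (+1, 0, or -1);
    games is the number of complete move sequences below this node.
    """
    won = winner(board)
    if won is not None:
        return (1 if won == "X" else -1), 1
    if " " not in board:
        return 0, 1  # full board and nobody won: a draw

    other = "O" if player == "X" else "X"
    values, games = [], 0
    for i, square in enumerate(board):
        if square == " ":
            child = board[:i] + player + board[i + 1:]
            value, count = explore(child, other)
            values.append(value)
            games += count
    # X steers toward the highest value, O toward the lowest.
    best = max(values) if player == "X" else min(values)
    return best, games


value, games = explore(" " * 9, "X")
print(value, games)  # prints: 0 255168
```

The value 0 agrees with the claim that two good players always tie. The count of 255,168 complete games is smaller than 9! = 362,880 because many games end as soon as one player completes a line, before the board is full.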


Now that we’ve been introduced to some of the general concepts and terms that
will be used in our discussion of game theory, let’s turn our attention to a few
specific games and analyses.


SECTION 9.3
SIMPLE GAMES
• Piece of Cake
• A Penny Saved...

PIECE OF CAKE
• Cake division is a very simple zero-sum game, modeled with a 2 x 2
payoff matrix.
• If each player is greedy, we assume that each will choose the strategy with
the best worst-case scenario.
• Equilibrium is reached when both players have no incentive to change their
strategies; these can be considered the “best” strategies.

Let’s start by looking at a simple situation that can be modeled as a game.
Suppose that two children at a birthday party both want to have the last piece
of cake. If one child gets it, the other will be resentful, and even if an adult
intervenes to split the cake in two, one child will inevitably complain that the
other’s share is larger. This conundrum can be avoided by letting one child cut
the cake and letting the other child have first choice of the pieces. This seems
to be an intuitively fair way to solve the problem. We can put this intuition on
firmer footing, however, using the techniques of game theory.

To model this situation as a game, we have to make a few simplifying
assumptions. First, the child who is to cut the cake, whom we’ll call the “cutter,”
has a variety of choices of how to make the cut, but we can simplify things by
recognizing that the real decision is simply whether or not to attempt to cut the
cake fairly. In this model, we reduce the cutter’s choices to just two: namely, cut
evenly or cut unevenly. The chooser has only two possible actions, of course:
choose the piece perceived to be larger or the one that seems smaller. Finally,
we must assume that both children are completely selfish, or rational. That is,
they always act in a way that gives them as much cake as possible.

We can organize this information into a matrix that enables us to see and
analyze the various possible outcomes.

NOTE: The first value in each cell is the cutter’s payoff, and the second is the
chooser’s payoff. We will follow this convention of listing the row player’s payoff
first throughout the unit.

                Chooser
                Choose larger                             Choose smaller
Cutter
Cut evenly      (half minus a crumb, half plus a crumb)   (half plus a crumb, half minus a crumb)
Cut unevenly    (smaller piece, larger piece)             (larger piece, smaller piece)

The last piece of cake

In this scheme, the cutter chooses the row of the outcome and the chooser
chooses the column. For example, if the cutter chooses to cut evenly and the
chooser chooses the larger piece, then the cutter will get “half minus a crumb”
and the chooser will get “half plus a crumb,” which is the outcome represented
in the upper left cell of the table. Allowing for the difference of a “crumb” is
simply a way to acknowledge that actually cutting a cake evenly is extremely
difficult.

Now, being selfish, the cutter will choose the action that promises to bring
him the most cake regardless of what the chooser chooses. Cutting the cake
unevenly creates the possibility of getting the larger piece, but it also opens the
door for the chooser to thwart this effort. In other words, the cutter’s maximum
payoff in choosing to cut unevenly is the large piece, and his minimum payoff in
this case is the smaller piece.

If the cutter chooses to cut evenly, however, his maximum payoff is about half
of the cake, and his minimum payoff is also about half of the cake. So, of the
cutter’s two choices, the one that has the least downside (in other words,
the “maximum minimum”) is the one he should choose. Consequently, in this
situation, he should choose to cut the piece of cake evenly.

The chooser seeks to do the same thing: make the choice that maximizes her
benefit. In this case, if the cutter cuts evenly, the chooser’s best option is to
pick the “half plus a crumb.” Notice that even though both children implement
their best strategy, one still comes out slightly advantaged over the other. This
common feature of games is summed up in the following statement: “You know,
the best you can expect is to avoid the worst.”1 Games do not have to be fair.
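
The “maximum minimum” reasoning can be written out as a short Python sketch. The numbers are assumptions made only for illustration and do not come from the text: a “crumb” is taken to be 1/100 of the cake, and an uneven cut is taken to give pieces of 3/5 and 2/5.

```python
from fractions import Fraction

# Hypothetical cake fractions, chosen only for illustration.
CRUMB = Fraction(1, 100)
HALF = Fraction(1, 2)
BIG, SMALL = Fraction(3, 5), Fraction(2, 5)

# payoffs[cutter_move][chooser_move] = (cutter's share, chooser's share)
payoffs = {
    "cut evenly": {
        "choose larger": (HALF - CRUMB, HALF + CRUMB),
        "choose smaller": (HALF + CRUMB, HALF - CRUMB),
    },
    "cut unevenly": {
        "choose larger": (SMALL, BIG),
        "choose smaller": (BIG, SMALL),
    },
}

# Cutter: find the worst case in each row, then take the best of those.
worst_case = {row: min(cell[0] for cell in cols.values())
              for row, cols in payoffs.items()}
cutter_choice = max(worst_case, key=worst_case.get)

# Chooser: pick the best reply within the row the cutter selected.
replies = payoffs[cutter_choice]
chooser_choice = max(replies, key=lambda col: replies[col][1])

print(cutter_choice, "/", chooser_choice)  # prints: cut evenly / choose larger
```

Any other values would lead to the same choices, as long as the uneven cut differs by more than a crumb.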

The choices that the cutter and chooser had to make in the above example
are known as “pure strategies.” This simply means that the players play the
game using the same strategy every time; deviating from the strategy gains
them nothing and could potentially end up harming them. This idea of a “best”
strategy, in the sense that deviation from it increases the chance of reaching a
less-desirable result, is known as an “equilibrium.”

A nice example of equilibrium is the case of a three-way duel (sometimes called
a “truel”). Let’s say that person A is an excellent shot, able to hit the target
100% of the time; person B is a great shot, with a 90% success rate; and person
C is a terrible shot, striking the target a mere 20% of the time.

[Figure: the three duelists and their accuracy: A, 100%; B, 90%; C, 20%.]

Assuming that each person can shoot only once, the best strategy for each in
this case is basically: “shoot the person most dangerous to you.” If each player
adopts this strategy, then A should shoot at B and B should shoot at A, as each of
them is the other’s most imminent threat. Person C should shoot at A, because
there is a tiny chance that B will miss, but there is no chance that A will miss.
The outcome, if all “players” implement their equilibrium strategies, is that
Person C will be the winner of the contest, the one left standing, because
nobody shoots at C. This example demonstrates how the conclusions of game
theory can sometimes be counterintuitive.
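
The survival chances in the three-way duel can be worked out in a few lines of Python. This is only a sketch under an added assumption that the text does not state: all three shots are fired at the same moment and independently of one another. The dictionary and function names are illustrative.

```python
from fractions import Fraction

# Hit probabilities from the text.
accuracy = {"A": Fraction(1), "B": Fraction(9, 10), "C": Fraction(1, 5)}

# "Shoot the person most dangerous to you."
target = {"A": "B", "B": "A", "C": "A"}


def survival_probability(person):
    """Chance that every shot aimed at `person` misses."""
    p = Fraction(1)
    for shooter, victim in target.items():
        if victim == person:
            p *= 1 - accuracy[shooter]
    return p


survival = {person: survival_probability(person) for person in accuracy}
for person, p in survival.items():
    print(person, p)
```

Under this assumption, A survives only if both B and C miss (probability 1/10 × 4/5 = 2/25), B never survives, and C always survives because no one aims at C.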

A PENNY SAVED...
• Matching pennies, in its one-off form, is an example of a game in which
there is no clear “best” strategy.
• Playing multiple rounds of matching pennies does have a clear equilibrium.

To get a better sense of how equilibrium works, let’s look at a game that has
no clear “best” strategy: matching pennies. In this game, two players, called
“Mixed” and “Matched,” simultaneously place one penny each on a table, either
heads up or tails up. If the two coins are matching, then Matched gets to
keep them both; if the two coins are not matching, then Mixed gets them both.
We can summarize the situation with the following payoff matrix:

                Matched
                Play heads    Play tails
Mixed
Play heads      (-1, 1)       (1, -1)
Play tails      (1, -1)       (-1, 1)

We can see from this table that if Mixed plays heads and Matched also plays
heads, then Mixed loses a penny and Matched gains a penny. Remember, in
this game both players put their coins down at the same time, so neither has an
advantage in knowing what to pick. Notice also that this is indeed a zero-sum
situation: whatever Mixed loses is gained by Matched, and vice versa. If this
game were to be played just once, neither Mixed nor Matched would have any
clue as to what the other was going to play, so the choice between heads and
tails would be completely random. Therefore, unlike the cake game discussed
earlier in which each player had a definite “best” strategy, there is no one
strategy that beats all others for a single round of matching pennies.

The plot thickens, so to speak, when multiple rounds are played; this is what
game theorists call an “iterated game.” If the two players were to play multiple
rounds, then playing a pure strategy of heads every time or of tails every time
would definitely put a player at a disadvantage. For instance, if Matched noticed
that Mixed always plays heads, then she should play heads as well and win
every round.

A pure strategy is not the best bet for either side in this situation. Ideally, each
player would like to keep the other player guessing as to the next play. The
most intuitive, and best, way to accomplish this is to play randomly. Random
play is an example of a mixed strategy. Let’s look at a modified table that
includes this new strategy option. Note that the payoff values in the table also
must change a bit in meaning. Whereas previously we were concerned with the
payoff of just a single round of play, we are now considering iterated games and
mixed strategies, and the payoffs must represent averages per round. Playing
randomly results in an average payoff of “zero” per round. Note that, because
this is still a zero-sum situation, if one player gets zero, so must the other.

                Matched
                Play heads    Play tails    Play randomly
Mixed
Play heads      (-1, 1)       (1, -1)       (0, 0)
Play tails      (1, -1)       (-1, 1)       (0, 0)
Play randomly   (0, 0)        (0, 0)        (0, 0)

Now, each player has a choice of how to play this iterated game: pure heads,
pure tails, or randomly. Using the logic developed in the preceding section,
Mixed should choose the strategy that ensures the maximum minimum, and
Matched, from her perspective, should do the same. This means that Mixed
should choose to play randomly, and so should Matched, and both of them
should expect to make nothing from the game. Playing randomly in this case
means playing heads and tails with equal probability. Doing this means that the
game is at equilibrium: neither player has anything to gain by deviating from the
chosen strategy if the other player does not deviate. Not every equilibrium in an
iterated game must be composed of equal probabilities, however. The precise
probabilities depend on the specific payoffs of the game.
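
The same “maximum minimum” search can be run over the three-strategy table. The Python sketch below is illustrative only; it stores Mixed’s average payoff per round and relies on the zero-sum property to obtain Matched’s payoff as its negative. The variable names are assumptions, not notation from the text.

```python
STRATEGIES = ["heads", "tails", "random"]

# mixed_payoff[mixed_move][matched_move] = average payoff per round to Mixed.
mixed_payoff = {
    "heads":  {"heads": -1, "tails": 1,  "random": 0},
    "tails":  {"heads": 1,  "tails": -1, "random": 0},
    "random": {"heads": 0,  "tails": 0,  "random": 0},
}

# Mixed looks at the worst entry in each row and picks the best of those.
mixed_worst = {m: min(mixed_payoff[m].values()) for m in STRATEGIES}
mixed_choice = max(mixed_worst, key=mixed_worst.get)

# Matched's payoff is the negative of Mixed's, so she scans the columns.
matched_worst = {c: min(-mixed_payoff[m][c] for m in STRATEGIES)
                 for c in STRATEGIES}
matched_choice = max(matched_worst, key=matched_worst.get)

# Equilibrium check: no unilateral switch improves either player's payoff.
stay = mixed_payoff[mixed_choice][matched_choice]
mixed_gains = any(mixed_payoff[m][matched_choice] > stay for m in STRATEGIES)
matched_gains = any(-mixed_payoff[mixed_choice][c] > -stay for c in STRATEGIES)

print(mixed_choice, matched_choice, mixed_gains, matched_gains)
```

Both players land on the random strategy, and neither can do better by switching alone, which is what the equilibrium condition requires.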

In this analysis, we assume that playing randomly means that the odds of a
player playing heads or playing tails are 50/50. If this were not true, then the
opposing player could statistically recognize a bias towards either heads or tails
and adjust her play accordingly to take advantage of this. In a scenario such as
this, the payoffs for playing randomly would no longer be (0, 0), but rather the
average of the pure-strategy payoffs, (-1, 1) for example, weighted by the
proportion of heads and tails played. For example, if out of 100 games, Mixed
plays 60% heads, then Matched should play heads every time and expect to have
an average payoff of 0.6(1) + 0.4(-1) = 0.2 per round, as opposed to the zero that
would be expected if both players play heads and tails with equal probability.
Conversely, Mixed should expect to lose 0.2 per round, on average.
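
The effect of a bias can be checked with a direct expected-value calculation. The Python sketch below is illustrative, and the function name is hypothetical. It computes Matched’s average payoff per round against a Mixed player who shows heads 60% of the time, for three possible responses by Matched: an even mix, a 60% heads mix, and heads every time.

```python
from fractions import Fraction


def matched_expected_payoff(p_mixed_heads, p_matched_heads):
    """Average payoff per round to Matched when both players randomize.

    Matched wins a penny (+1) when the coins match and loses one (-1)
    when they differ.
    """
    p_match = (p_mixed_heads * p_matched_heads
               + (1 - p_mixed_heads) * (1 - p_matched_heads))
    return p_match * 1 + (1 - p_match) * (-1)


bias = Fraction(3, 5)  # Mixed plays heads 60% of the time
even_mix = matched_expected_payoff(bias, Fraction(1, 2))
same_bias = matched_expected_payoff(bias, Fraction(3, 5))
always_heads = matched_expected_payoff(bias, Fraction(1))

print(even_mix, same_bias, always_heads)  # prints: 0 1/25 1/5
```

Against this bias, an even mix still earns Matched nothing, copying the 60% bias earns 1/25 (0.04) per round, and playing heads every time earns 1/5 (0.2) per round, which is the most she can get. Mixed loses the same amounts, since the game is zero-sum.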

We have until now been concerned with zero-sum games; whatever the winner
wins, the loser has to lose. However, many situations in life, and, hence, the
games that model these situations, are not zero-sum. These are situations in
which the combined outcome can be greater than or less than zero. In other
words, some situations are win-win, and some situations are lose-lose. One of
the most famous non-zero-sum games is the Prisoner’s Dilemma.


SECTION 9.4
PRISONER’S DILEMMA
• Trust No One
• Déjà Vu

TRUST NO ONE
• The Prisoner’s Dilemma is a classic example of a non-zero-sum game.
• The equilibrium in Prisoner’s Dilemma is not the optimum solution.

The RAND Corporation, located in Santa Monica, California, is the original
“think-tank.” It was founded after World War II to be a center of national
security and global policy ideas and analysis. Whereas today it advises many
nations on a variety of issues, its initial focus was national defense. Game
theory was one of the early pursuits of RAND thinkers, and in 1950 two RAND
scientists, Merrill Flood and Melvin Dresher, framed what would become one of
the most fascinating games of all time, the Prisoner’s Dilemma.

The basic game is set up like this: imagine that you and your friend are caught
robbing a bank. Upon being apprehended you are immediately separated so
that you do not have time to communicate with each other. Each of you is taken
to a separate cell for interrogation. If you and your buddy cooperate (C) with
each other (that is, say nothing to the cops), each of you will get only a year in jail,
known as the “reward” payoff, R.

                Your Buddy
                C           D
You
C               (R, R)
D

If you both rat on each other, or “defect” (D), to use the game theorist’s
terminology, you will both get three years in prison, known as the “punishment”
payoff, P.

                Your Buddy
                C           D
You
C               (R, R)
D                           (P, P)

If one of you cooperates and the other defects, the cooperator will get five years,
known as the “sucker’s” payoff, S, and the defector will get off with no jail time,
known as the “temptation to defect” payoff, T.

                Your Buddy
                C           D
You
C               (R, R)      (S, T)
D               (T, S)      (P, P)

This matrix concisely expresses the game as we have described it, where T = 0,
R = 1, P = 3, and S = 5. Note that T > R > P > S. (It might be useful here to interpret
the “is greater than” sign to mean “is better than,” because the values actually
represent negatives: years spent in jail.)

First let’s consider why this is not a zero-sum game. Looking at each cell, we
can tell that none of the payoffs for you and your buddy sum to zero. In fact,
all of them result in some net jail time for one or both of you, although some
outcomes are more favorable than others. For instance, if both of you cooperate,
the total time served by the two of you will be two years, which is as close to
win-win as this situation can get (after all, you did just rob a bank). If both of
you defect, then the total jail time for the two of you will be six years, a lose-lose
scenario that is a good deal worse than the best-case scenario. The other two
scenarios result in a total of five years of jail time served between the two of
you. So, if you could only agree with your buddy that both of you will keep quiet,
as a team you’ll be better off. The dilemma comes from the fact that neither you
nor your buddy has any incentive to do this.

You have no idea whether or not your buddy is going to cooperate. Even if you
have discussed a situation like this with him beforehand, you cannot be sure that
he won’t betray you. As a rational being, you are going to make the decision that
minimizes your potential downside, or your personal maximum penalty. If you
choose to cooperate with your buddy, the maximum penalty you could receive is
five years, and your best-case scenario is a one-year prison sentence. However,
if you choose to defect, your maximum penalty would be three years, and there
is a chance that you could get off with no jail time. Your rational buddy is faced
with the same set of options and the same reasoning. As a rational being, you
will choose to defect and so will your buddy, and these actions result in the
lose-lose scenario.
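
The worst-case comparison that drives this decision can be tabulated with a few lines of Python. This is a minimal sketch using the sentences given in the text (T = 0, R = 1, P = 3, S = 5 years); the variable names are illustrative.

```python
# Years in jail from the text (lower is better).
T, R, P, S = 0, 1, 3, 5

# years[(my_move, buddy_move)] = my own sentence
years = {("C", "C"): R, ("C", "D"): S, ("D", "C"): T, ("D", "D"): P}

# For each of my moves, the longest and shortest sentence I could receive.
worst = {me: max(years[(me, buddy)] for buddy in "CD") for me in "CD"}
best = {me: min(years[(me, buddy)] for buddy in "CD") for me in "CD"}

# The cautious choice is the move whose worst case is least bad.
choice = min(worst, key=worst.get)

print(worst, best, choice)
```

Cooperating risks five years and can do no better than one; defecting risks three years and may yield none, so the cautious choice is to defect.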

What is so interesting in the Prisoner’s Dilemma is that it is an example in
which the equilibrium solution is not the same as the optimal solution. The
equilibrium solution, remember, is the state in which neither player has
anything to gain by switching strategies as long as the other player also doesn’t
switch. The optimal solution is the scenario in which the greatest good, or
utility, is realized. In the Prisoner’s Dilemma the greatest good, on the whole,
comes about when both players cooperate. This scenario is unstable, however,
because both players have an incentive to switch strategy. On the other hand,
if both players defect, neither has anything to gain by changing strategy if the
other doesn’t, so the defect-defect solution is stable. Game theorists would say
that the defect strategy is strictly dominant over the cooperate strategy as long
as T>R>P>S.
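
The gap between the equilibrium and the optimum can be verified by checking every cell of the matrix. The Python sketch below is illustrative: it uses the sentences from the text (T = 0, R = 1, P = 3, S = 5 years), and the helper name is_equilibrium is an assumption rather than standard notation.

```python
# Years in jail from the text (lower is better).
T, R, P, S = 0, 1, 3, 5
MOVES = ("C", "D")

# my_years[(my_move, other_move)] is the sentence of the player making my_move.
my_years = {("C", "C"): R, ("C", "D"): S, ("D", "C"): T, ("D", "D"): P}


def is_equilibrium(you, buddy):
    """True if neither player can shorten their own sentence by switching alone."""
    you_stay = my_years[(you, buddy)]
    buddy_stay = my_years[(buddy, you)]
    you_can_improve = any(my_years[(m, buddy)] < you_stay for m in MOVES)
    buddy_can_improve = any(my_years[(m, you)] < buddy_stay for m in MOVES)
    return not (you_can_improve or buddy_can_improve)


cells = [(you, buddy) for you in MOVES for buddy in MOVES]
total_years = {c: my_years[c] + my_years[(c[1], c[0])] for c in cells}

equilibria = [c for c in cells if is_equilibrium(*c)]
optimum = min(cells, key=total_years.get)

# Defect strictly dominates if it is better whatever the other player does.
defect_dominates = all(my_years[("D", b)] < my_years[("C", b)] for b in MOVES)

print(equilibria, optimum, defect_dominates)
```

The only stable cell is mutual defection (six years in total), while the smallest total (two years) comes from mutual cooperation, and defecting is better for each player whatever the other does.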

When versions of the Prisoner’s Dilemma are posed to actual people, the results
do not always match the mathematical predictions. Real people do not always
act rationally, and even if they did, it is very rare that a game is ever played just
once in real life. As an example, let’s say that you decide to cooperate, but your
buddy decides to defect. After you serve your sentence, your buddy offers to
rob another bank with you to help you get back on your feet (with friends like
this, who needs enemies?!). You agree, and both of you get caught again. This
situation is not exactly like the first time you got caught, however, because
now each of you has a reputation, a track record. Your buddy might realize
that you have already cooperated once and that if you cooperate again, and he
chooses to cooperate this time also, then both of you will be better off. On the
other hand, you might have revenge on your mind and decide that because your
buddy burned you the last time, you will retaliate this time. These kinds of
considerations make the Iterated Prisoner’s Dilemma more complicated than
the one-shot version.

DÉJÀ VU
• The Iterated Prisoner’s Dilemma admits a wide variety of equilibrium
outcomes, depending on the mix of strategies adopted by the players.
• In computer tournaments, strategies that are neither always generous nor
always punitive tend to fare the best.

If the Prisoner’s Dilemma is to be played over and over again, it is best that
the number of times that it is to be played is not pre-determined; otherwise,
everyone should just defect, as in the one-round version. The reasoning goes
like this: you should always defect in your last game because there is no
chance for retaliation. Knowing that your buddy will also think of this strategy,
you should always defect in your second-to-last game as well. This thinking
naturally extends all the way back to the first move, so everyone should just
always defect. If, however, players play without knowledge of when the game
will end, strategies other than “always defect” become viable, even dominant.
One such alternative strategy is the random strategy, in which a player randomly
cooperates or defects, with no consideration given to what has happened in
previous rounds. Another strategy is retaliation: always do to your opponent
what she did to you the last time.

There are many strategies, some of which are clearly better than others, and
others whose efficacy is harder to judge. To put all of these strategies
to the test, Robert Axelrod of the University of Michigan organized a tournament
around 1980 in which different Iterated Prisoner’s Dilemma strategies
competed against each other over the course of many rounds. The winning
strategy was to be the one with the lowest accumulated jail time in the end.

One might suspect that “always-defecting” would still be the best strategy in
such a tournament. If two players played five rounds of the always-defecting
strategy, their individual scores at the end of five rounds would be 15 years.
(In our score-keeping, lower scores are better.)

Results of “Pure Defect” vs. “Pure Defect”

P1 STRATEGY DDDDD
P2 STRATEGY DDDDD

P1 PAYOFF = PPPPP = 3 + 3 + 3 + 3 + 3 = 15 years


P2 PAYOFF = PPPPP = 3 + 3 + 3 + 3 + 3 = 15 years

On the other hand, if two players who were “always-cooperating” played each
other for five rounds, each player’s accumulated score would be five years.

Results of “Pure Cooperate” vs. “Pure Cooperate”

P1 STRATEGY CCCCC
P2 STRATEGY CCCCC

P1 PAYOFF = RRRRR = 1 + 1 + 1 + 1 + 1 = 5 years


P2 PAYOFF = RRRRR = 1 + 1 + 1 + 1 + 1 = 5 years

So, there is clearly something to be gained by not defecting all the time if you
can get into a repeated mutual cooperation situation, such as that shown above.
The question is: how can you get into such a situation, especially when a “Pure
Defect” strategy will dominate a “Pure Cooperate” strategy?

Results of “Pure Defect” vs. “Pure Cooperate”

P1 STRATEGY DDDDD
P2 STRATEGY CCCCC

P1 PAYOFF = TTTTT = 0 + 0 + 0 + 0 + 0 = 0 years


P2 PAYOFF = SSSSS = 5 + 5 + 5 + 5 + 5 = 25 years
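
The score-keeping used in these matches is easy to automate. The Python sketch below is illustrative: the function name score is an assumption, and the sentences per round are the ones given earlier in the text (R = 1, P = 3, S = 5, T = 0 years).

```python
# YEARS[(p1_move, p2_move)] = (years for P1, years for P2) in one round.
YEARS = {
    ("C", "C"): (1, 1),  # both get the reward payoff R
    ("C", "D"): (5, 0),  # P1 gets the sucker's payoff S, P2 the temptation T
    ("D", "C"): (0, 5),  # P1 gets T, P2 gets S
    ("D", "D"): (3, 3),  # both get the punishment payoff P
}


def score(p1_moves, p2_moves):
    """Total years for each player over a sequence of rounds (lower is better)."""
    total1 = total2 = 0
    for a, b in zip(p1_moves, p2_moves):
        y1, y2 = YEARS[(a, b)]
        total1 += y1
        total2 += y2
    return total1, total2


print(score("DDDDD", "DDDDD"))  # prints: (15, 15)
print(score("CCCCC", "CCCCC"))  # prints: (5, 5)
print(score("DDDDD", "CCCCC"))  # prints: (0, 25)
```

The three calls reproduce the totals of the three matches shown above.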

Analysis of the strategies that fared best in Axelrod’s tournament indeed
provided answers to this question. Some were very complicated, based on
analyzing specific sequences of prior moves to prescribe the next sequence of
moves. Others were very simple, such as the Tit-For-Tat (TFT) strategy. As its
name implies, TFT relies simply on doing to your opponent what he last did to
you. So, if your opponent cooperates on the first turn, then you should cooperate
on the second turn. This can lead to the nice “always-cooperating” cycle if two
TFT players start off cooperating, while protecting the player from getting too
many sucker’s payoffs. However, TFT can also lead to the “always-defecting”
situation, if two TFT players start off by defecting.

The best strategies were variants of Tit-For-Tat with Forgiveness (TFTWF). This
strategy is basically the same as regular TFT except that some small percentage
of the time, you forgive your opponent’s prior defection and do not mimic it. This
provides a mechanism for breaking the “always-defecting” trap.

Results of “Tit-For-Tat-with-Forgiveness” vs. “Tit-For-Tat”:

P1 (TFTWF) DDCCC
P2 (TFT) DDDCC

P1 PAYOFF = PPSRR = 3 + 3 + 5 + 1 + 1 = 13 years


P2 PAYOFF = PPTRR = 3 + 3 + 0 + 1 + 1 = 8 years

Note that in this particular match, TFTWF loses to TFT. Remember, however,
that the tournament consists of many matches against many different
strategies. TFT can get caught in “always defecting” cycles,
whereas TFTWF will be able to escape these, providing an advantage over TFT in
the long run.
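
A small simulation shows how forgiveness breaks a cycle of defection. The Python sketch below is a simplified model, not Axelrod’s tournament code. As an assumption made for reproducibility, forgiveness is triggered in fixed rounds passed in by hand (here rounds 3 and 4) instead of at random, and all function names are hypothetical.

```python
# YEARS[(p1_move, p2_move)] = (years for P1, years for P2) in one round.
YEARS = {("C", "C"): (1, 1), ("C", "D"): (5, 0),
         ("D", "C"): (0, 5), ("D", "D"): (3, 3)}


def tit_for_tat(first_move, forgive_rounds=()):
    """Build a strategy that copies the opponent's previous move.

    In the (1-based) rounds listed in forgive_rounds the strategy
    cooperates no matter what the opponent did; this is a fixed
    stand-in for occasional random forgiveness.
    """
    def strategy(round_number, opponent_history):
        if round_number in forgive_rounds:
            return "C"
        if not opponent_history:
            return first_move
        return opponent_history[-1]
    return strategy


def play(strategy1, strategy2, rounds):
    """Play the iterated game; return both move strings and both totals."""
    moves1, moves2 = [], []
    years1 = years2 = 0
    for n in range(1, rounds + 1):
        # Both moves are chosen before either history is updated.
        a = strategy1(n, moves2)
        b = strategy2(n, moves1)
        moves1.append(a)
        moves2.append(b)
        y1, y2 = YEARS[(a, b)]
        years1 += y1
        years2 += y2
    return "".join(moves1), "".join(moves2), years1, years2


tft = tit_for_tat("D")
tftwf = tit_for_tat("D", forgive_rounds={3, 4})

print(play(tft, tft, 5))    # prints: ('DDDDD', 'DDDDD', 15, 15)
print(play(tftwf, tft, 5))  # prints: ('DDCCC', 'DDDCC', 13, 8)
```

Two plain TFT players who open with a defection never escape it, whereas the forgiving player pays the sucker’s payoff once and then settles into mutual cooperation, reproducing the 13-year and 8-year totals of the match above.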

Most of the successful strategies in Axelrod’s tournament were based on some
amount of altruistic behavior. It was a stunning mathematical indication that
aggression and vindictiveness do not always prevail. It seems that a truly selfish
strategy, in the sense that it is designed to maximize one’s own benefit, must
include some element of forgiveness. In fact, Axelrod found that successful
strategies had four common traits, which he described anthropomorphically in
this way:

• First, the strategy should be “nice.” This means that it will not defect unless
its opponent defects first.

• Second, the strategy should retaliate against defectors to avoid being
exploited by “always defectors.”

• Third, the strategy should be forgiving. After retaliating against a defection,
it should begin to cooperate again as soon as its opponent cooperates.

• Fourth, the strategy should not try to score more than its opponent; it should
be non-envious. This stems from the fact that the strength of cooperation
lies in the reality that both parties benefit equally from it.

These traits are fascinating if not heartwarming, showing us that cooperation
and altruism really do have a place in a world as starkly defined as that of the
Iterated Prisoner’s Dilemma. This suggests that studying games can help us to
understand some of the behavioral aspects of our natural world, such as why
certain types of animals live in cooperative societies and others live as solitary
aggressors. This world of conflicting living strategies is characterized by the
game called “Hawks and Doves,” and it is to this game that we will next turn
our attention.

SECTION 9.5

HAWKS AND DOVES
• A Lover, Not a Fighter
• Might Might Not Always Make Right
• The Nitty Gritty

A LOVER, NOT A FIGHTER


• Hawks and Doves is a rudimentary model of game theory in a biological
context in which creatures compete for resources by being either aggressive
or passive.
• Aggressive players beat passive players, but they incur costs when they
compete against other aggressive players.

In the Prisoner’s Dilemma, the players have a choice of whether to cooperate
with one another or to defect. We saw that what the players should do (that is,
their best strategies) depends on whether they will be playing just
once or many times. If they are to play only once, they should both defect, even
though it would be better overall for them to cooperate. If they are to play many
times, however, it behooves them to try different mixes of cooperation and
defection. Axelrod’s tournament study demonstrated that, over the course of
numerous games, players who play pure strategies can be beaten by players
who choose mixed strategies.

We can extend this thinking to the natural world if we imagine the competition
for survival to be a tournament. The many rounds of this tournament
correspond to the daily struggles that certain species face for survival. In the
natural world, all species compete for resources using wildly varying strategies.
The strategies that are most successful in the long run are the ones that survive
to be passed along to offspring.

Strategies for survival in the natural world do vary, but broad trends are
discernible. For our purposes, we will simplify things greatly by limiting the
options to only two types of behavior, aggressive and passive, and we’ll call
the actors of these behaviors “Hawks” and “Doves” respectively. Aggressive
animals will always fight over resources, whereas passive animals will not. This
is the basis for a famous game often known as “Hawks and Doves,” which was
first proposed by John Maynard Smith and George Price in a 1973 paper.

The assumptions behind the game are pretty straightforward. Imagine a field
strewn with piles of food. This field is populated with animals that can behave
either passively or aggressively toward one another. The animals, or players,
compete with one another for the resource piles. For simplicity’s sake, we
determine that all competitions occur between individuals, i.e., one-on-one.
The standard scenario is that one animal approaches a pile of food, and then
another animal presents a challenge for it. Furthermore, neither animal knows
the other’s behavioral identity until the challenge has begun. The possible
interactions are then Hawk-Hawk, Hawk-Dove, Dove-Hawk, and Dove-Dove.

                        Player 2
                        Hawk                    Dove
Player 1    Hawk
            Dove

Whenever a Hawk fights another Hawk, one of them wins the entire food pile,
thus getting a benefit, B. The loser gets injured, incurring a cost, C. Both B
and C can be thought of as food calories. The resource calories gained by the
winner are counted as positive, but the calories that the loser must devote to
healing are counted as negative. For the purposes of this game, we assume that
all Hawks are equal and win half of all their battles with other Hawks. Also, as
with the Prisoner’s Dilemma game, we are concerned only with the tendencies
established through iterative scenarios. Consequently, on average, a Hawk will
gain $B/2$ calories and lose $C/2$ calories in a Hawk-Hawk interaction. This
average energy accounting can be simplified to $(B - C)/2$.

                        Player 2
                        Hawk                    Dove
Player 1    Hawk        ((B–C)/2, (B–C)/2)
            Dove

When a Hawk challenges a Dove, the Dove does not fight but simply walks
away. This means that the Hawk gets the entire benefit, with no cost of fighting.
The Dove gets nothing, but also loses nothing. In Hawk-Dove and Dove-Hawk
interactions, the Hawk always gets the entire benefit, B, and the Dove always
gets nothing and loses nothing. Therefore, the Hawk’s average payoff is B, and
the Dove’s is zero.

                        Player 2
                        Hawk                    Dove
Player 1    Hawk        ((B–C)/2, (B–C)/2)      (B, 0)
            Dove        (0, B)

Finally, when a Dove challenges a Dove, they do not fight but, rather, split the
resource evenly. Each player gets $B/2$ without any cost to anyone.

                        Player 2
                        Hawk                    Dove
Player 1    Hawk        ((B–C)/2, (B–C)/2)      (B, 0)
            Dove        (0, B)                  (B/2, B/2)

Notice that this is not a zero-sum situation, because not all of the cells add up to
the same value; the Hawk-Hawk interaction yields less in total benefit than the
other three scenarios.
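
The completed matrix can be written out as a small data structure to make the non-zero-sum observation concrete. This is only a sketch: the function name `hawk_dove_matrix` and the values B = 2 and C = 4 are assumptions chosen for illustration.

```python
from fractions import Fraction


def hawk_dove_matrix(B, C):
    """Payoff pairs (Player 1, Player 2) for each pairing of strategies."""
    B, C = Fraction(B), Fraction(C)
    return {
        ("Hawk", "Hawk"): ((B - C) / 2, (B - C) / 2),
        ("Hawk", "Dove"): (B, Fraction(0)),
        ("Dove", "Hawk"): (Fraction(0), B),
        ("Dove", "Dove"): (B / 2, B / 2),
    }


matrix = hawk_dove_matrix(B=2, C=4)  # illustrative values only
# Total payoff handed out in each cell: B - C for Hawk-Hawk, B elsewhere.
cell_totals = {cell: sum(payoffs) for cell, payoffs in matrix.items()}
for cell, total in cell_totals.items():
    print(cell, total)
```

With these values the Hawk-Hawk cell totals B - C = -2, while each of the other three cells totals B = 2.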

MIGHT MIGHT NOT ALWAYS MAKE RIGHT


• Pure strategies can be a bad idea.

Now that we have a grasp of the basic circumstances of the game, let’s think
about whether it’s better to be a Hawk or a Dove. It might seem, at first glance,
that being a Hawk is always the best idea. If we imagine that the population of
the field is nearly 100% Hawk, it’s hard to see how a Dove could ever survive
for very long, as it would get to eat only upon encountering another Dove. On
the other hand, if the field is nearly 100% Dove, then a single Hawk is going to
have it incredibly easy. This would lead us to think, if we had to choose between
playing Hawk and playing Dove, that we should always choose Hawk. After all,
a Hawk in an all-Dove world is going to do well, whereas a Dove in an all-Hawk
world is going to starve.

That’s the standard intuition, but let’s consider the situation of the lone Dove a
little more carefully. He never loses calories, and while the Hawks are gaining
calories, they are also losing them in their fights. If these costs end up being
more than the resource benefits, then each Hawk will experience an overall
calorie loss as time goes on, while the Dove holds steady (this assumes, of
course, that there is no cost for simply waiting around while everybody else
fights amongst themselves). After a while, the lone Dove will be doing much
better than the always-fighting Hawks. This suggests that if costs are more than
benefits, one might do well to be a Dove in an all-Hawk world.

If Doves do better in an all-Hawk environment when costs outweigh benefits,
then over time the population should shift toward all Doves. This is based on the
assumption that the most fit, the ones with the highest net calories, survive to
reproduce more often than the less fit.

One might then think that whenever costs outweigh benefits, the population will
tend to evolve into all Doves. However, a Hawk in an all-Dove environment will
do extremely well relative to the Doves, even if costs outweigh benefits. This
is because the cost becomes irrelevant if there are no other Hawks around to
inflict injuries. We would then be led to believe that the all-Dove scenario is not
stable, even when costs outweigh benefits.
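
Both halves of this argument can be checked numerically. The sketch below assumes illustrative values B = 2 and C = 4 (costs outweigh benefits) and a hypothetical helper named `payoff`; it compares a lone intruder with the residents of each pure population.

```python
def payoff(me, other, B, C):
    """Average payoff to `me` when meeting `other`, per the payoff matrix."""
    if me == "Hawk":
        return (B - C) / 2 if other == "Hawk" else B
    return 0 if other == "Hawk" else B / 2


B, C = 2, 4  # illustrative values with costs outweighing benefits

# In an all-Hawk field, every opponent is a Hawk.
hawk_among_hawks = payoff("Hawk", "Hawk", B, C)
dove_among_hawks = payoff("Dove", "Hawk", B, C)

# In an all-Dove field, every opponent is a Dove.
hawk_among_doves = payoff("Hawk", "Dove", B, C)
dove_among_doves = payoff("Dove", "Dove", B, C)

print(dove_among_hawks > hawk_among_hawks)  # True: a Dove can invade Hawks
print(hawk_among_doves > dove_among_doves)  # True: a Hawk can invade Doves
```

Each pure population can be infiltrated by the opposite strategy, which is exactly the instability described above.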

The idea of a pure strategy’s stability is an important one. In our analysis, we
saw that neither the all-Hawk nor the all-Dove strategy is stable when the costs
outweigh the benefits. This means that either situation can be infiltrated by
the opposing strategy. Note that this is not true when the benefits outweigh
the costs. Such a world would be driven towards the all-Hawk state, as a
lone Dove would gain nothing while the Hawks gained something from each
fight. This suggests to us that the relationship between costs and benefits has
something to do with which state will be stable. Furthermore, we can conclude
that because neither the all-Hawk nor the all-Dove state is stable, if there is to
be a stable state, it must lie somewhere between the pure states. This means
that if one has a choice as to whether to be a Hawk or a Dove, it would be best
to adopt a mixture of the strategies. But what mixture? Remember that on the
level of each individual confrontation, you have to choose your identity, whether
to be a Hawk or a Dove, before you know the identity of your opponent. What
percentage of the time should you be a Hawk and what percentage of the time
should you be a Dove? With just a little algebra, we can find these percentages:

THE NITTY GRITTY


• The optimum mix of passive and aggressive behavior depends on the exact
values of the costs and benefits.

To start, let’s represent the pure Hawk strategy as H, the pure Dove strategy as D,
and the Mixed strategy as S. The payoffs for these would be as follows:

E(H,S) = payoff of pure Hawk versus the Mixed strategy


E(D,S) = payoff of pure Dove versus the Mixed strategy

Let’s define p as the probability that the Mixed player plays Hawk in a given
interaction; then the expression 1-p represents the probability that the Mixed
player will play Dove. The expected average payoff of H vs. S, E(H,S), will be
composed of part of the Hawk-Hawk and part of the Hawk-Dove payoffs.

E(H,S) = (probability that S plays Hawk) x (payoff of Hawk-Hawk) + (probability
that S plays Dove) x (payoff of Hawk-Dove)

$$E(H,S) = \frac{p(B - C)}{2} + (1 - p)B$$

The expected average payoff for D vs. S, E(D,S), can be found in a similar manner:

E(D,S) = (probability that S plays Hawk) x (payoff of Dove-Hawk) + (probability that
S plays Dove) x (payoff of Dove-Dove)

$$E(D,S) = p(0) + (1 - p)\frac{B}{2}$$

S’s optimum mix will be when both H and D do equally well against it. This means
that S has nothing to gain by skewing the mix towards more Hawk or more Dove
than prescribed by p and (1-p) respectively. In other words, the optimal mix will
be the value of p when E(H,S) = E(D,S).

E(H,S) = E(D,S)

$$\frac{p(B - C)}{2} + (1 - p)B = p(0) + (1 - p)\frac{B}{2}$$

Solving this for p yields the percentage of time that S should play Hawk, which
turns out to be $B/C$. Note that this percentage is entirely dependent on the
benefit-to-cost ratio.
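
The algebra is short: multiplying both sides of the equation by 2 and cancelling the terms in pB leaves B = pC, so p = B/C. This is a valid probability only when B is no larger than C. The sketch below checks the result with exact fractions; the function names and the values B = 2, C = 4 are assumptions chosen for illustration.

```python
from fractions import Fraction


def expected_hawk(p, B, C):
    """E(H,S): payoff of pure Hawk against a mix that plays Hawk with probability p."""
    return p * (B - C) / 2 + (1 - p) * B


def expected_dove(p, B, C):
    """E(D,S): payoff of pure Dove against the same mix."""
    return p * 0 + (1 - p) * B / 2


B, C = Fraction(2), Fraction(4)  # illustrative values with B < C
p = B / C                        # the claimed optimum mix

print(p, expected_hawk(p, B, C), expected_dove(p, B, C))
```

At p = B/C the two expected payoffs coincide, so neither pure strategy gains an edge against the mix.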

All of this means that were we to study the population of our field for a long
time, we would find that the ratio of benefits given by food piles to costs incurred
by fighting would determine the percentage of time that a Mixed animal should
play Hawk or Dove. If, for some reason, the system falls out of balance, as
when a group of players decides to play Hawk more often than they should, then
there will be a clear advantage for others to play Dove more than they should.
These counteracting forces would then drive the system back to the appropriate
average ratio of Hawks to Doves.
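
One way to picture these counteracting forces is a toy simulation in which the share of Hawk play grows whenever Hawks out-earn Doves and shrinks otherwise. The update rule below is an assumption made for this sketch, not a model stated in the text, and the names `step` and `long_run_hawk_share`, the `rate` parameter, and the values B = 2, C = 4 are all hypothetical.

```python
def step(x, B, C, rate=0.1):
    """Nudge the Hawk share x toward whichever behavior currently earns more."""
    hawk_payoff = x * (B - C) / 2 + (1 - x) * B
    dove_payoff = (1 - x) * B / 2
    # The factor x * (1 - x) keeps the share between 0 and 1.
    return x + rate * x * (1 - x) * (hawk_payoff - dove_payoff)


def long_run_hawk_share(x, B, C, steps=2000):
    """Apply the update repeatedly and return the final Hawk share."""
    for _ in range(steps):
        x = step(x, B, C)
    return x


# Start with too few Hawks and with too many Hawks.
for start in (0.1, 0.9):
    print(start, round(long_run_hawk_share(start, B=2, C=4), 4))
```

Under these assumptions, both starting points settle at a Hawk share of B/C = 0.5.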

The evolutionary progress of our field, like the process of Axelrod’s tournament,
shows that pure strategies are neither always stable, nor always optimal. The
most successful strategies are usually mixed strategies. In terms of human
behavior, this suggests that to be successful, we should not be too quarrelsome,
nor should we be pushovers. Additionally, we should be forgiving at times, and
at other times we should not hesitate to retaliate against wrongdoers. These
conclusions are all well and good in theory, but how do they play out in real life
with actual human beings? In our next section, we will examine what happens
when game theory’s predictions are put to the test in different human cultures.

SECTION 9.6

FAIRNESS IN DIFFERENT CULTURES
• No One Said Life Was Fair
• Get Real

NO ONE EVER SAID LIFE WAS FAIR


• Games can be used as a sociologist’s measuring stick to quantify notions of
fairness in human cultures.
• The Ultimatum Game gives one player a sum of resources to be shared with
another player, who can accept or reject the offer.

What does “fair” really mean? Does it mean the same thing to everybody?
Sociologists have been able to explore these questions using the techniques of
game theory. Games can serve as one of the essential tools of the sociologist,
much as litmus paper serves as a tool for the chemist or a telescope serves as a
tool for the astronomer.

First, let’s clarify the difference between the terms “rational” and “fair.”
A rational action, as we have defined it, is one in which a player chooses the
strategy with the best chance of producing the most personal benefit, without
regard to what happens to the other player. Being fair, on the other hand, takes
into account a whole host of other factors, including cultural norms, experience
in market transactions, and experience with cooperation. In work done at the
turn of the twenty-first century, researchers found that the concept of what is
“fair” ranges widely, depending on who’s playing. They reached this conclusion
after watching how people from 15 different small-scale societies, ranging from
hunter-gatherers to nomadic herders to sedentary farmers, played in a variety
of cooperative games, such as the Ultimatum Game and the Public Goods Game.

In the Public Goods Game, players are asked to contribute some amount of
money to a communal pot, which will be subsequently increased, based on
how much everyone gives. In the Ultimatum Game, one player is given a sum
of money or other valuable resource and is instructed to share it with another
player. The first player decides how much to offer and the second player decides
whether or not to accept the offer. If the second player rejects the offer, neither
player gets any reward, or benefit.

Let’s examine the Ultimatum Game in a bit more detail. Player 1, the Offerer,
can offer any amount that he or she chooses. For the sake of simplicity, let’s say
that the Offerer can choose to offer a high amount (H) or a low amount (L). If he
offers H, then he will be left with L if Player 2 accepts the offer, and vice versa.
Player 2, the Receiver, always has the choice of accepting or rejecting the offer.
With these simplified assumptions we can create a matrix:

                          Receiver
                          Accept Offer      Reject Offer
Offerer     Offer High    (L, H)            (0, 0)
            Offer Low     (H, L)            (0, 0)

It should be evident from this matrix that a rational Receiver will never reject
an offer. From the rational Receiver’s point of view, receiving L, even if L is of
very low value, is better than getting nothing. A rational Offerer will pick the
strategy corresponding to the row with the largest minimum payoff. Both rows
in this case have the same minimum, 0, so the Offerer should then choose the
strategy with the best potential payoff, which will be to Offer Low. In fact, the
rational Offerer should offer the smallest amount possible, because the rational
Receiver accepts any offer.
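
The reasoning in this paragraph can be traced step by step in a short sketch. The amounts H = 8 and L = 2 are hypothetical (the argument needs only H > L > 0), and the function names are chosen for this illustration.

```python
H, L = 8, 2  # hypothetical amounts; the argument needs only H > L > 0

# Payoffs are (Offerer, Receiver), indexed by offer and then by response.
payoffs = {
    "Offer High": {"Accept": (L, H), "Reject": (0, 0)},
    "Offer Low": {"Accept": (H, L), "Reject": (0, 0)},
}


def receiver_best_response(offer):
    """The response that gives the Receiver the larger payoff."""
    return max(payoffs[offer], key=lambda response: payoffs[offer][response][1])


def offerer_choice():
    """Rank rows by their worst case, breaking ties by their best case."""
    return max(
        payoffs,
        key=lambda offer: (
            min(pair[0] for pair in payoffs[offer].values()),
            max(pair[0] for pair in payoffs[offer].values()),
        ),
    )


print(receiver_best_response("Offer Low"))  # Accept
print(offerer_choice())                     # Offer Low
```

Both rows have a worst case of 0 for the Offerer, so the tie-break on the best case decides in favor of the low offer.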

GET REAL
• The notion of what is fair depends on cultural norms.

When actual people play this game, however, the results vary widely and are
never in line with the rational model. The study found that average offers across
all societies range from 25% of the total to more than 50%. Furthermore, many
real players will reject offers, even offers of more than 50%. What is perhaps
more illuminating is how offers and acceptances depend on the society in which
the players live.

Certain groups of people who are very economically independent, at least at the
family level, had the lowest average offers. Other groups of people who depend
on communal cooperation to gain food, such as in a whale hunt, had mean
offers very close to 50%. Still others, in societies in which gift-giving is an act of
status, had average offers above 50%. Quite surprisingly, some of these high-
offer societies exhibited high rejection rates as well.

Why would someone reject an offer? The answer relates to the psychology
inherent in reiterative games. The researchers surmised that people reject
offers that are too low because if they accepted such offers, they would develop
a reputation for accepting low offers and, consequently, no one would give them
higher offers in the future. Also, rejecting an offer turns the tables of power in
the Receiver’s favor. The Receiver can punish the low Offerer, who has much
more to lose in a rejection than the Receiver does. From the Receiver’s point of
view, it might be worth incurring the cost of losing the low offer if it discourages
the Offerer from being so stingy with future offers.

Why would anyone reject a high offer? In certain cultures, gift giving obligates
the receiver to return the favor; receivers who do not wish to be obligated to
someone else would then reject any offer that seemed to be too big a burden to
pay back. These cultural norms were thought to manifest themselves in how
people played the Ultimatum Game, as the participants sought to contextualize
their experience of the game. In other words, they often asked themselves,
“What does this game remind me of?” and then they adjusted their strategy to
align with their perception of the situation.

The Public Goods Game and the Ultimatum Game show that what people
perceive as being fair depends heavily on their cultural context. In these cases,
games served as tools for measuring and quantifying cultural values in the
real world. We see that the concept of fairness develops in human societies in
relation to their specific needs and values. Game theory can also be used to
examine another very human concept, that of language. We will now turn our
attention to how ideas from game theory can contribute to the explanation of
how language can arise and develop within a group.

SECTION 9.7

LANGUAGE
• The Name Game
• Pass It On

THE NAME GAME


• Language development can be modeled as a game in which players receive
a payoff when they are in agreement about a particular word referring to a
particular object.
• The payoff depends on how many players are in agreement.

Human language seems to be a large part of what makes us unique beings.
We said in the introduction to this unit that game theory can be thought of as
the mathematical study of human interactions. Language is arguably the most
fundamental of these interactions. It is, in fact, hard to imagine interactions
without language in the first place. Yet, how does a group of people agree on
which words to use for particular objects? How is this set of agreements passed
along to offspring? In 1999, in a paper written by Nowak, Plotkin, and Krakauer
at the Institute for Advanced Study in Princeton, principles of game theory were
used to illuminate how language can develop.

Imagine a group of human ancestors, a troop of hominids, if you like. Suppose
that this group is just starting to communicate about specific objects in their
environment. Perhaps they have become concerned with communicating
specific threats or other important information more efficiently; knowing the
location of a hungry leopard or of a grove of fruit-laden trees is often critical to
survival. Each individual in the group develops an internal list of verbal signals,
or words, that are associated with objects such as leopards or fruit trees.

In order for communication to take place, a speaker makes an association
between an object and a word. A listener either has the same association or
they do not. Suppose that the speaker, upon seeing a leopard, says “leopard”
to the listener. If the listener has the same association, then they will
think “leopard” and act accordingly. If the listener does not have the same
association, then they will not understand, and there could be some negative
consequence.

This implies that if the speaker and the listener have the same association,
then there is some sort of payoff for both of them. That payoff could be that the
listener avoids danger, or perhaps learns the location of some food. The payoff
for the speaker will be the same, if we imagine that both individuals use the
same word in the future. In this discussion, we will assume that, as with the
Hawks and Doves game, this language game is played more than once.

When the speaker wishes to alert the listener to the presence of the leopard,
there is a certain probability that the speaker will use a given word. Likewise,
there is a certain probability that the listener will associate the speaker’s word
with the concept “leopard.” The maximum payoff for these two will increase
as the probability that each player uses the same word for “leopard” increases.
Payoffs also increase as more players adopt the same vocabulary and
associations. With this kind of payoff structure in place, there is an incentive for
players to understand each other, which can lead, over time, to the development
of a common language.
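
A rough sketch of this payoff structure follows. It assumes, for illustration only, that the payoff for one object is the chance that the listener decodes the speaker's word back to that object; the word lists, the probabilities, and the name `understanding_probability` are hypothetical and are not taken from the paper.

```python
def understanding_probability(speaker, listener, obj):
    """Chance that a word spoken for `obj` is decoded back to `obj`."""
    return sum(
        prob * listener.get(word, {}).get(obj, 0.0)
        for word, prob in speaker[obj].items()
    )


# speaker[obj][word]: probability of saying `word` on seeing `obj`
speaker = {"leopard": {"grr": 0.5, "zap": 0.5}}

# listener[word][obj]: probability of thinking `obj` on hearing `word`
good_listener = {"grr": {"leopard": 1.0}, "zap": {"leopard": 0.5, "fruit": 0.5}}
poor_listener = {"grr": {"fruit": 1.0}, "zap": {"fruit": 1.0}}

print(understanding_probability(speaker, good_listener, "leopard"))  # 0.75
print(understanding_probability(speaker, poor_listener, "leopard"))  # 0.0
```

The closer the listener's associations are to the speaker's, the higher this value climbs, which is the incentive toward a shared vocabulary described above.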

PASS IT ON
• Languages are passed on to younger generations in a variety of ways.

Long-term development of language requires agreed-upon, object-signal
associations to be passed down to new generations of speakers and listeners.
How is this language transmitted to new generations?

Nowak, Plotkin, and Krakauer identified three main methods of language
transmission. The first, and perhaps most intuitive, is parental transmission.
Children tend to acquire the language of their parents and in this mode of
transmission, greater language “fitness” (average payoff of one’s list of
associated signals and objects) would correlate with greater biological “fitness.”
In other words, the successful use of language can affect one’s chances of
passing on genes successfully to the next generation.

The second mode of transmission was identified to be through a role model
outside of the family. In this mode, a high-status member of the group gains
many young imitators. High-ranking role models illustrate the connection
between language and status. So, if a child imitates the language profile of a
high-status individual, that child will, on average, out-compete the children who
do not imitate high-status individuals.

The final mode of transmission is simply random learning. In this scenario,
there are no clear incentives for learning language from any particular
individual; instead, children imitate a random mixture of adults without regard
to status or payoff. This tends to maximize confusion and, thus, minimize the
payoffs that can accrue from mutual understanding. Groups who transmit
language via this method tend to take a significantly greater amount of time
to develop a common language, as opposed to groups that use the other two
transmission methods.

SECTION 9.2

origins of game theory
• Mathematicians attempt to analyze types of interactions by treating them as
games in which players use strategies to obtain payoffs.
• Games like checkers and chess are games of perfect information; the board
shows both players all the information needed to make the right decision.
• Games like poker are games of imperfect information; no player has enough
information (the other players’ cards are hidden) to make a guaranteed right
decision.
• Zero-sum games are games in which a winner’s payoff is equal to a loser’s
loss.
• Utility is an imprecise concept, but biological payoffs are measurable.
• The minimax theorem, attributable to Von Neumann and Morgenstern,
states that players seek the strategy that minimizes their maximum loss.

SECTION 9.3

simple games
• Cake division is a very simple zero-sum game, modeled with a 2 x 2
payoff matrix.
• If each player is greedy, we assume that each will choose the strategy with
the best worst-case scenario.
• Equilibrium is reached when both players have no incentive to change their
strategies; these can be considered the “best” strategies.
• Matching pennies, in its one-off form, is an example of a game in which
there is no clear “best” strategy.
• Playing multiple rounds of matching pennies does have a clear equilibrium.

SECTION 9.4

prisoner’s dilemma
• The Prisoner’s Dilemma is a classic example of a non-zero-sum game.
• The equilibrium in Prisoner’s Dilemma is not the optimum solution.
• The Iterated Prisoner’s Dilemma admits a wide variety of equilibrium
outcomes, depending on the mix of strategies adopted by the players.
• In computer tournaments, strategies that are neither always generous nor
always punitive tend to fare the best.

SECTION 9.5

hawks and doves
• Hawks and Doves is a rudimentary model of game theory in a biological
context in which creatures compete for resources by being either aggressive
or passive.
• Aggressive players beat passive players, but they incur costs when they
compete against other aggressive players.
• Pure strategies can be a bad idea.
• The optimum mix of passive and aggressive behavior depends on the exact
values of the costs and benefits.

SECTION 9.6

fairness in different cultures
• Games can be used as a sociologist’s measuring stick to quantify notions of
fairness in human cultures.
• The Ultimatum Game gives one player a sum of resources to be shared with
another player, who can accept or reject the offer.
• The notion of what is fair depends on cultural norms.

SECTION 9.7

language
• Language development can be modeled as a game in which players receive
a payoff when they are in agreement about a particular word referring to a
particular object.
• The payoff depends on how many players are in agreement.
• Languages are passed on to younger generations in a variety of ways.


BIBLIOGRAPHY

PRINT

Axelrod, Robert. The Evolution of Cooperation. USA: Basic Books (Perseus Books
Group), 1984.

Benjamin, Arthur T. and A.J. Goldman. “Analysis of N-Card le Her,” Journal of
Optimization Theory and Applications, vol. 114, no. 3 (September 2002).

Barash, David P. The Survival Game: How Game Theory Explains the Biology of
Cooperation and Competition. New York: Times Books: Henry Holt and Co., 2003.

Bernstein, Peter L. Against the Gods: The Remarkable Story of Risk. New York:
John Wiley and Sons, 1996.

Dutta, Prajit K. Strategies and Games: Theory and Practice. Cambridge, MA: MIT
Press, 1999.

Eshel, Ilan and L.L. Cavalli-Sforza. “Assortment of Encounters and Evolution of
Cooperativeness,” Proceedings of the National Academy of Sciences, USA, vol. 79,
no. 4 (1982).

Ficici, S.G., O. Melnik, and J.B. Pollack. “A Game-Theoretic Investigation of
Selection Methods Used in Evolutionary Algorithms,” Proceedings of the 2000
Congress on Evolutionary Computation CEC00, (2000).

Fisher, Sir Ronald Aylmer. “Randomisation and an Old Enigma of Card Play,”
Mathematical Gazette, vol. 18 (1934).

Grassly, N.C., A. von Haeseler, and D. Krakauer. “Error, Population Structure,
and the Origin of Diverse Sign Systems,” Journal of Theoretical Biology, vol. 206,
no. 3 (2000).

Henrich, Joseph, Robert Boyd, Samuel Bowles, Colin Camerer, Ernst Fehr,
Herbert Gintis, and Richard McElreath. “In Search of Homo Economicus:
Behavioral Experiments in 15 Small-Scale Societies,” The American Economic
Review, vol. 91, no. 2, Papers and Proceedings of the 113th Annual Meeting of the
American Economic Association (May 2001).

Henrich, Joseph. “Does Culture Matter in Economic Behavior? Ultimatum Game
Bargaining Among the Machiguenga of the Peruvian Amazon,” The American
Economic Review, vol. 90, no. 4 (September 2000).

Jager, G. “Evolutionary Game Theory and Linguistic Typology: A Case Study,”
in P. Dekker, editor, Proceedings of the 14th Amsterdam Colloquium. ILLC,
University of Amsterdam, (2003).

Mero, Laszlo. [Translated by Anna C. Gosi-Greguss. English version edited by
David Kramer.] Moral Calculations: Game Theory, Logic, and Human Frailty. New
York: Copernicus Springer-Verlag New York, Inc., 1998.

Nowak M.A. and R.M. May. “Evolutionary Games and Spatial Chaos,” Nature 359
(1992).

Nowak M.A., S. Bonhoeffer, and R.M. May. “More Spatial Games,” International
Journal of Bifurcation and Chaos, vol. 4, issue 1 (February 1994).

Nowak, M.A., J.B. Plotkin, and D. Krakauer. “The Evolutionary Language Game,”
Journal of Theoretical Biology, vol. 200, Issue 2 (21 September 1999).

Nowak, M.A., Karen M. Page, and Karl Sigmund. “Fairness Versus Reason in the
Ultimatum Game,” Science, vol. 289, issue 5485 (2000).

Poundstone, William. Prisoner’s Dilemma: John Von Neumann, Game Theory, and
the Puzzle of the Bomb. New York: Doubleday, 1992.

Siegfried, Tom. A Beautiful Math: John Nash, Game Theory, and the Modern Quest
for a Code of Nature. Washington, D.C.: Joseph Henry Press, 2006.

Von Neumann, John and Oskar Morgenstern. Theory of Games and Economic
Behavior (6th paperback edition). Princeton, NJ: Princeton University Press, 1990.

NOTES

SECTION 9.1

1 Calvino, Italo. (Translated by William Weaver.) If on a winter’s night a traveler.
San Diego, New York, London: Harvest Books/Harcourt Inc., 1981; p. 4.
UNIT 10
Harmonious Math
TEXTBOOK

UNIT OBJECTIVES

• The connection between music and math goes back to the ancient Greek notion of
music as the math of time.

• Strings of rationally related lengths tend to sound harmonious when played
together.

• Sound waves can be expressed mathematically as the sum of periodic functions.

• Trigonometric functions can be used as the building blocks of more complicated
periodic functions.

• Frequency and amplitude are two important attributes of waves.

• A mathematical series either converges to a specific value or diverges.

• Any wave can be constructed out of simple sine waves using the techniques of
Fourier analysis and synthesis.

• The ability to manipulate directly functions or signals in the frequency domain has
been largely responsible for the great advances made in sound engineering and,
more generally, in all of digital technology.
Music is the pleasure the human mind
experiences from counting without being
aware that it is counting.

Leibniz
SECTION 10.1

INTRODUCTION

You may have heard the notion that music and mathematics are connected.
Perhaps you’ve heard one of those stories about the violin prodigy who also
excels at calculus, or the composer whose works are based on prime numbers.
Indeed, musical and mathematical talent seem to go hand in hand at times. Why
should this be so?

The answer to this complex question touches on more than just music and
mathematics. There are undoubtedly societal, psychological, and perhaps
even biological factors involved that can lead to the coincidence of musical and
mathematical talent in the same individual. Nonetheless, it is safe to say that
talent in both areas has something to do with the fact that the two
disciplines are related in many fundamental, even abstract, ways. It is the
connections between music and mathematics, some of which are surprising
indeed, that will be the focus of this unit. Our discussion will be concerned not
with explaining the connections between abilities in these two disciplines, but,
rather, with how the two disciplines relate to one another on a conceptual level.

One of the most fundamental ways that music and math are connected is in
the understanding of sound specifically, and wave phenomena in general.
Understanding sound as an instance of wave phenomena provides a nice forum
for the interaction of ideas from music, physics, and mathematics. Tools that
have been developed to help us understand the nature of sound, such as Fourier
analysis, can be generalized to shed light on many areas of mathematics. In return,
the mathematical understanding of sound has helped foster the development of
new technologies that extend the possibilities for musical exploration.

To the mathematician, wave structure and theory open the door to the
examination of periodic functions, some of the most basic forms of patterns in
mathematics. In this unit, we will examine how music and math have influenced
each other throughout the ages. In particular, we will view both music and
sound as “the math of time,” an idea that can be traced back to the Greeks. From
there we will look at our current understanding of sound and the mathematical
tools that have helped us reach that understanding. We will look at waves and
periodic functions in one dimension, see how Fourier analysis can break these
into combinations of simple sine and cosine functions, and then move on to see
how these ideas can generalize to more complicated phenomena. Finally, an
exploration of the question, “can you hear the shape of a drum?” will introduce
us to the ways in which the mathematical study of periodicity and patterns can be
applied to interesting and challenging problems.

SECTION 10.2 The Math of Time

• Mystical Connections
• A Rational Approach

Mystical Connections

• Music played a central role in Greek thought.
Throughout history, music has played, and continues to play, an important
role in many cultures. In some cultures music is a participatory experience,
an active art form in which all are encouraged to partake. In other cultures,
music is a form of worship or entertainment, to be practiced by relatively few
but appreciated by many. Much of the formal western music tradition falls into
the latter category. This relationship with music has its roots in the music of the
ancient Greeks.

Music served a number of purposes in ancient Greek society. It was an element
of religious ceremonies, sporting events, and feasts, and it was part of Greek
theatre. In making their music, the Greeks used techniques that are still
commonly used today, employing strings, reeds, and resonant chambers to
create and control tones and melodies.

One group, the Pythagoreans, took a particular interest in exactly how
instruments could be controlled to make pleasing sounds. In Unit 3, we saw
how the Pythagorean obsession with all things involving number led to the
development of the idea of irrational numbers. Central to this concept was
the notion of incommensurability, which holds that certain quantities cannot
be related through whole number ratios. Hippasus, a Pythagorean who is
traditionally credited with developing this idea, is said to have been drowned for
his heretical ideas.

Heresy is an apt term to describe Hippasus’s ideas, because to the
Pythagoreans, the synchrony of numbers and music gave rise to a harmony
that was considered among the first guiding principles of the universe. They
believed in the “harmony of the spheres,” the idea that the motions of the
heavenly bodies created mathematically harmonious “tones.” This
numero-musical mysticism was centered on the idea of whole number ratios,
so Hippasus’s claim was not just an intellectual insult, but also a violation of a
fundamental philosophy, even a spirituality.

A Rational Approach

• The Greeks recognized that strings of rationally related lengths sound
harmonious when vibrating together.
• Rational relationships are the foundation of Western music.

Why was the idea of whole number ratios so appealing? Pythagoras himself was
said to have noticed that plucked strings of different lengths sound harmonious
when those lengths are ratios of simple whole numbers. For example, if we
pluck a string of length 1 meter, and then we pluck a string of half a meter,
we will notice that the tones seem harmonious. The half-meter string sounds
higher in pitch, but the tone is “the same.” The two tones that come from
strings whose lengths are in the ratio of 1:2 represent an interval called the
“octave.” Other ratios also give aesthetically pleasing results. For example, two
strings with a length ratio of 3:2, when plucked, create a harmonious interval
called a “fifth.”
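
The relationship between string-length ratios and intervals can be sketched numerically. The example below assumes that, for strings of equal thickness and tension, frequency is inversely proportional to length; the 220 Hz starting pitch and the function name are hypothetical choices made only for illustration.

```python
from fractions import Fraction

def frequency_for_length(base_frequency_hz, base_length, length):
    """Assumed model: frequency is inversely proportional to string length
    (equal tension and thickness)."""
    return base_frequency_hz * Fraction(base_length) / Fraction(length)

base = 220  # hypothetical frequency of the 1-meter string, in Hz

# A string half as long (length ratio 1:2) sounds an octave higher.
octave = frequency_for_length(base, 1, Fraction(1, 2))

# A string two-thirds as long (length ratio 3:2) sounds a fifth higher.
fifth = frequency_for_length(base, 1, Fraction(2, 3))

print(octave, fifth)  # 440 330
```

Exact fractions are used here so that the whole number ratios stay visible instead of being blurred by rounding.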

The Greeks were the first to arrange the individual tones that make up these
intervals into sequences, or scales. They named these scales, also known
as modes, after local geographic regions: Ionian, Dorian, Phrygian, Lydian,
Aeolian, etc. These modes were associated with different mental states. For
example, the Dorian mode was said to be relaxing, whereas the Phrygian mode
was supposed to inspire enthusiasm. The Greek modes are still important in
modern music, though many other basic note sequences have been created
throughout the centuries.

The idea of what is considered “musical” has expanded over the years, but
western music (i.e., music associated with the western hemisphere—as opposed
to eastern music) is still built upon the fundamental idea that tones associated
with whole number ratios sound good together. It is indeed a mystery as to why
our aesthetic sensibilities should favor this system of organizing musical tones.
In any case, this early connection between harmony and math set the stage for
centuries of fruitful collaboration. Music, as an academic subject, ascended to
a special place in the classical education of both Greek citizens and the learned
classes of those cultures that would carry on their intellectual traditions.

For example, the “Quadrivium,” composed of music, arithmetic, geometry, and
astronomy, represented the curriculum of classical education for centuries.
Such was the perceived value of musical education in the classical world that
it was made one of the four core subjects. However, the musical studies of
the Quadrivium focused mainly on the Pythagorean notion of ratios and scales
rather than on the performance of musical compositions. Students learned
about harmonics and the proportions that would yield pleasing scales and
melodies. This focus on the structure of music is closer to what, in the modern
age, would be called “music theory.”

The Greeks were some of the first people to apply mathematical thought to the
study of music.

[Figure: the overtones of a string with fundamental C. Columns: note sounded (C, c, g, c’, e’, g’, c’’), division of strings, and intervals created (octave, fifth, fourth, major third, minor third, and larger compound intervals).]

This was to be a mere prelude to the understandings that future mathematicians
would bring to music. One of the most powerful connections to be discovered
was that music, and sound in general, travels in waves. The mathematics of
sound, of waves, to which we will now turn our attention, will lead us to powerful
ways of thinking not only about music, but about many other phenomena.

SECTION 10.3 Sound and Waves

• Something in the Air
• The Sound of Music

Something in the Air

• Sound is caused by compression and rarefaction of air molecules.
• We perceive the amplitude of a sound wave as its loudness, or volume.
• We perceive the frequency of a sound wave as its pitch.

As we have seen, the Greeks recognized connections between harmonic
intervals and rational numbers. As it turns out, they also had a rudimentary
understanding of the most basic musical concept of all: sound. Thinkers
such as Aristotle suspected that sound was some sort of “disturbance” that is
propagated through the air.

[Figure: a vibrating string creating regions of compression and rarefaction in the surrounding air.]

We are all familiar with waves of one sort or another. You may have seen them
at the beach, or felt them in an earthquake, or heard about them, perhaps when
someone has spoken of “airwaves” in relation to TV or radio broadcasts. Each of
these waves is different, but they all share some unifying characteristics. Let’s
look at some of the characteristics of ideal, simple waves, waves that we will
later use as “atoms” to construct more-realistic, complicated waves.

Imagine the smooth surface of a pond on a still day. If you throw a pebble into
the pond, you will see ripples emanating from the point at which the pebble
enters the water. These ripples consist of areas where the surface of the
water is heightened, called crests, followed by areas that are depressed, called
troughs. A cross-section of a few of these ripples might look like this:

[Figure: cross-section of a ripple, with crest, trough, and amplitude labeled.]

Notice that both the crests and the troughs reach equally above and below,
respectively, the still surface line. This shows us that waves travel by some sort
of displacement in a medium. The amount of displacement, as measured from
the still surface line, is called a wave’s amplitude.

To be precise, a rock hitting a pond creates an impulse, a temporary
disturbance. Over time, the effects of the disturbance dissipate and the surface
of the pond becomes smooth again. To explore the concept of waves further,
let’s instead imagine some sort of regular disturbance, or ongoing pulsation,
such as a child slapping the surface of the water in a rhythmic fashion.

If the child’s mother is fishing in the same pond, her bob will move up and down
with the crests and troughs of the ripples. The bob will not move horizontally,
only vertically. This is an important point concerning waves: the medium
through which a wave travels has no net movement when the wave passes
through it. That is, there is no net horizontal displacement.

[Figure: the bob stays in the same place as the wave passes under it.]

To carry this concept from our pond example back to sound waves traveling
through the air, this means that the air molecules that transmit the disturbance
that we interpret as sound do not, on average, travel any net distance. For
instance, a loudspeaker does not push a stream of air towards me. Rather, it
compresses air molecules to form a region of high pressure that travels away
from the source. Assuming that the air is of uniform density and pressure
to begin with, this region of high pressure will be balanced by a region of
low pressure, called rarefaction, immediately following the compression.
Remember, air molecules do move back and forth, but after the wave has
passed, they are, on average, in the same place they were before the wave came
through.

[Figure: a longitudinal wave, with regions of compression and rarefaction, compared with a transverse wave, with peaks and troughs.]

As these groups of molecules alternately experience compressions and
rarefactions, a pulse is created, and this is what “reaches” our ears. Whether
or not we hear the waves as sound has everything to do with their frequency,
or how many times every second the molecules switch from compression to
rarefaction and back to compression again, and their intensity, or how much the
air is compressed.

In our graph above, the vertical axis represents air pressure and the horizontal
axis represents time. The crests correspond to times of high pressure
(compression) and the troughs represent times of low pressure (rarefaction).
The height of a crest corresponds to the degree of compression of the air, which,
when measured from the baseline, is another way to think about amplitude. We
perceive amplitude as a sound wave’s loudness.

To determine the frequency of the wave from our graph, we first look at how
much time elapses between successive crests or successive troughs. This
peak-to-peak or trough-to-trough time, which is called the period of the wave, is
usually measured in seconds. If we take the inverse of the period, we get a value
expressed in units of inverse seconds (i.e., “per second”). This is the frequency
of the wave. Frequency is most often measured in cycles per second, also called
“hertz” (Hz). If the frequency of a wave is greater than approximately 20 Hz (20
wave crests, or pulses, pass a given point in one second), then humans generally
perceive this phenomenon as a sound. The frequencies that an average human
being perceives as sound range from 20 Hz on the low end to 20,000 Hz on the
high end. Frequency in the music world is known as “pitch.” The greater the
frequency, the higher the pitch.
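
As a small illustration of the period-to-frequency conversion described above, the sketch below inverts a period and checks the result against the 20 Hz to 20,000 Hz range quoted in the text. The function names and the sample period are hypothetical, chosen only for this example.

```python
def frequency_from_period(period_seconds):
    """Frequency in hertz is the inverse of the period in seconds."""
    return 1.0 / period_seconds

def is_audible(frequency_hz, low_hz=20.0, high_hz=20_000.0):
    """Rough check against the range of human hearing quoted in the text."""
    return low_hz <= frequency_hz <= high_hz

period = 0.0025  # hypothetical crest-to-crest time, in seconds
freq = frequency_from_period(period)

print(freq, is_audible(freq))
print(is_audible(frequency_from_period(0.5)))  # a 2 Hz pulsation is too slow to hear
```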

Frequency and amplitude are two of the mathematical concepts necessary
for understanding a “pure” sound wave. The third, and last, basic concept
related to waves is phase. Phase has to do with the position in the cycle of
compressions or rarefactions at which a wave starts. For example, if the cone
of a loudspeaker—the part that vibrates back and forth—starts out moving
away from you, the sound wave that eventually reaches you will begin with a
rarefaction. If, on the other hand, the cone starts by moving towards you, the
wave will first hit you with a compression. The speaker doesn’t have to start at
one of these extremes, however; it can start at any point in the cycle. Different
starting points mean different phases.
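
Phase can be made concrete with the sine-wave model that this unit develops in Section 10.4. The sketch below is only an illustration under that assumption: the function name, the 440 Hz frequency, and the three phase values are hypothetical choices.

```python
import math

def pressure(t, frequency_hz, phase_radians=0.0, amplitude=1.0):
    """Idealized pure tone: amplitude * sin(2*pi*f*t + phase)."""
    return amplitude * math.sin(2 * math.pi * frequency_hz * t + phase_radians)

# At t = 0, three waves of the same frequency start at different points in the cycle.
start_at_baseline = pressure(0.0, 440.0, phase_radians=0.0)
start_at_compression = pressure(0.0, 440.0, phase_radians=math.pi / 2)
start_at_rarefaction = pressure(0.0, 440.0, phase_radians=-math.pi / 2)

print(start_at_baseline, start_at_compression, start_at_rarefaction)
```

The three waves have identical frequency and amplitude; only the starting point differs.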

Item 1858 / Oregon Public Broadcasting, created for Mathematics Illuminated, SINE WAVES IN DIFFERENT PHASES
(2008). Courtesy of Oregon Public Broadcasting.

The Sound of Music

• An instrument’s tone, the sound it produces, is a complex mixture of waves
of different frequencies.
• Instruments produce notes that have a fundamental frequency in
combination with multiples of that frequency known as partials or
overtones.

Now that we have some understanding of how a wave can be thought of strictly
on a physical basis, let’s return to the Greek idea of intervals. Recall that the
Greeks considered harmonious the sounds of plucked strings whose lengths
were in ratios of whole numbers. In general, strings of different lengths
produce sound of different frequencies. Without considering such things as
string thickness or tension, longer strings tend to produce lower frequencies
than do shorter strings. So, when two strings of different lengths are plucked
together, the resulting sound is a combination of frequencies. Surprisingly
though, even the sound produced by a single string is not made up entirely of
one frequency.

A string vibrates with some fundamental frequency, 440 Hz for an “A” note, for
example, but there are other frequencies present as well. These are known as
either partials or overtones, and they give each instrument its characteristic
sound, or timbre. Timbre helps explain why a tuba sounds different than a cello,
even though you can play a “middle C” on both instruments.

Item 2044 / Oregon Public Broadcasting, created for Mathematics Illuminated, TWO DIFFERENT SOUNDWAVES,
EACH PLAYING THE NOTE “A” (2008). Courtesy of Oregon Public Broadcasting.

Item 3076 / Oregon Public Broadcasting, created for Mathematics Illuminated, TWO DIFFERENT SOUNDWAVES,
EACH PLAYING THE NOTE “A” (2008). Courtesy of Oregon Public Broadcasting.

For a single plucked string, the overtones occur at frequencies that are whole
number multiples of the fundamental frequency. So, a string vibrating at 440
Hz (an “A”) will also have some vibration at 880 Hz (440 × 2), 1320 Hz (440 × 3),
and so on. These additional frequencies have smaller amplitudes than does
the fundamental frequency and are, thus, more noticeable as added texture in a
sound rather than as altered pitch.
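
The arithmetic of overtones is simple enough to write down directly. This short sketch lists a fundamental and its first few overtones as whole number multiples; the function name is a hypothetical one used for illustration.

```python
def overtone_frequencies(fundamental_hz, count):
    """Return the fundamental followed by its first `count` overtones."""
    return [fundamental_hz * n for n in range(1, count + 2)]

print(overtone_frequencies(440, 2))  # [440, 880, 1320]
```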

Every instrument has its own timbre. If you play the A above middle C, corresponding to
440 Hz, on a piano, the note will have a much different sound than the same
note played on a trumpet. This is due to the fact that, although both notes are
based on the fundamental frequency of 440 Hz, they have different combinations
of overtones attributable to the unique makeup of each instrument. If you’ve
ever heard “harmonics” played on a guitar, you have some sense of how a tone
can be made of different parts. When a guitarist plays “harmonics,” he or she
dampens a string at a very precise spot corresponding to some fraction of the
string’s length, thereby effectively muting the fundamental frequency of the
vibrating string. The only sounds remaining are the overtones, which sound
“thinner” than the fundamental tones and almost ethereal.

Up until this point, the connections we have drawn between music and math
have been mainly physical, with a few somewhat philosophical ideas thrown
in as well. There is much more to the story, however. In order to take our
discussion to a deeper level, we first need to understand how waves can be
combined mathematically. Before we can combine waves mathematically,
however, we need a universal way to describe them. In the next section, we will
see how a simple wave can be expressed mathematically using the power of
triangles and trigonometry.

SECTION 10.4 Mathematics of Waves

• Periodic Functions
• Wheel in the Sky
• The Wave Equation

Periodic Functions

• Trigonometric functions, such as sine and cosine, are useful for modeling
sound waves, because they oscillate between values.

In the previous section, we looked at how to quantify different aspects of a sound
wave mathematically. We saw that frequency, phase, and amplitude are the
key quantifiable attributes that distinguish one wave from another. What of the
actual wave itself? What is the mathematical function that represents a wave?

Obviously, we need a relationship that exhibits periodic behavior, returning
to the same position or value with regularity. Remember that a sound wave
causes air molecules to “vibrate” back and forth from their at-rest positions.
Any function used to model waves should display the same output value for
regularly repeated input values. If the function models air pressure, the input
value is time, and we, therefore, would want a function that periodically returns
to the same pressure value as time progresses. One such function is that old
trigonometry favorite, the sine function.

[Figure: a right triangle with angle θ, hypotenuse h, opposite side o, and adjacent side a.]

NOTE: Throughout this discussion, we measure angles in units of radians.
Recall that 2π radians are equivalent to 360°, a complete circle. Half that value,
π radians, therefore corresponds to 180° (half a circle), π/2 radians to 90°, and so
on.

Suppose that we have a right triangle. We can define a few quantities that relate
the angles of such a triangle to the lengths of its sides. The most familiar of
these relationships are the sine, cosine, and tangent of an angle. The sine of
an angle is the ratio of the lengths of the opposite side and the hypotenuse.
Similarly, the cosine of an angle is the ratio of the lengths of the adjacent side
and the hypotenuse. The tangent, then, is the ratio of the length of the opposite
side to the length of the adjacent side, or equivalently, the ratio of sine to cosine.

[Figure: three right triangles highlighting the sides used for sine (o and h), cosine (a and h), and tangent (o and a).]
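
The side-ratio definitions of sine, cosine, and tangent can be checked on a concrete triangle. The sketch below uses a 3-4-5 right triangle, a hypothetical choice made for convenience, and compares the ratios with the values Python's standard `math` module gives for the same angle.

```python
import math

# A 3-4-5 right triangle: opposite = 3, adjacent = 4, hypotenuse = 5.
opposite, adjacent = 3.0, 4.0
hypotenuse = math.hypot(opposite, adjacent)

sine = opposite / hypotenuse      # opposite over hypotenuse
cosine = adjacent / hypotenuse    # adjacent over hypotenuse
tangent = opposite / adjacent     # opposite over adjacent

theta = math.atan2(opposite, adjacent)  # the angle itself, in radians

print(sine, math.sin(theta))
print(cosine, math.cos(theta))
print(tangent, sine / cosine)
```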

Wheel in the Sky

• We can connect the idea of the sine function of an angle to sine waves
dependent on time by analyzing the “spoke” of a unit circle as it rotates,
forming the hypotenuse of various right triangles.
• A sine wave can represent a sound wave theoretically, but not pictorially.
The shape of a sine wave is altogether different than the “shape” of a sound
wave found in nature.

For simplicity’s sake, let’s focus on the sine function. Notice that in a triangle,
the larger the angle, the longer the opposite side becomes. This fact is a
natural property of triangles: a larger angle always sits opposite a longer
side, although the growth is not strictly proportional.

[Figure: note how the vertical component increases as θ increases.]

In right triangles, the longest side will always be the hypotenuse. To find the
maximum value of sine, we can investigate a series of right triangles and see
exactly how large the side opposite our angle of interest can get. If we let
the angle get close to π/2 radians, we see that the length of the opposite side
approaches the length of the hypotenuse. If we let the angle equal π/2 (note
that this is purely a mental exercise; the triangle we have been imagining
disappears at this point), we interpret the opposite side and hypotenuse to
have the same length and, thus, their ratio, the sine of the angle, is 1.

[Figure: note that as θ approaches a right angle, the vertical component approaches the length of the hypotenuse.]

As the angle increases further, beyond π/2 radians, and we shift our perspective to
look at the triangle formed by the angle’s supplement, the opposite side begins
to shrink.

[Figure: when θ > π/2 radians, we compute sine by looking at the triangle formed by the angle 180° − θ.]

The length of the opposite side diminishes toward zero as the angle approaches
π radians. Notice that an angle of π radians and an angle of zero radians have
the same sine value: 0.
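
The rise and fall of the sine between 0 and π radians can be tabulated directly. The sketch below, with a hypothetical helper name, computes the sine of angles past π/2 by way of the triangle formed by π − θ, as described above, and compares the result with the library value.

```python
import math

def sine_via_supplement(theta):
    """For pi/2 < theta <= pi, use the triangle formed by the angle pi - theta."""
    if theta > math.pi / 2:
        return math.sin(math.pi - theta)
    return math.sin(theta)

for theta in (0.0, math.pi / 6, math.pi / 2, 5 * math.pi / 6, math.pi):
    print(f"theta={theta:.4f}  sin={math.sin(theta):.4f}  "
          f"via supplement={sine_via_supplement(theta):.4f}")
```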

[Figure: at π radians and at zero radians there is no vertical component.]

So far, we’ve seen that the sine function starts at zero, increases to 1, then
decreases back to zero as the angle steadily gets larger. This is somewhat
reminiscent of how the waves we studied in the previous section behave. If we
were to look at the air pressure of a particular region as a sound wave passed
through it, we would observe the sequence of events depicted in the following
images:

[Figure sequence: the air pressure in a region of interest as a sound wave passes through it: baseline pressure (sin θ = 0), compression (sin θ = 1), return to baseline (sin θ = 0), and rarefaction (sin θ = −1).]

In our investigation of the sine function so far, we have modeled the first two of
these steps, the compression and return to baseline. In the following diagram
we see that the sine also models the rarefaction of a sound wave by diminishing
to the value of -1 and then returning to zero, where we started.

[Figure: note that as vertical components dip below zero, sine becomes negative.]

We have now seen that the sine of an angle oscillates smoothly between -1 and 1,
passing through 0. Because the sine function exhibits this periodic behavior, it can serve
as a rough model of a simple sound wave. Although there are really no natural
sounds that are exactly modeled by a sine wave, we can create such an ideal,
pure tone using a synthesizer. A synthesizer can produce such a sound through
the exact control of the voltage that drives a loudspeaker.

There is an important difference, however, between the function that we use to
model the air pressure changes brought about by the passing of a sound wave
and the sine function, as we just described it. The sound wave pressure function
is a function of time. The standard sine wave is a function of angle. We can
reconcile this by establishing a relationship between angle and time.

[Figure: how triangles relate to sine waves. The spoke rotates with speed ω, tracing the sine curve through the angles π/2, π, 3π/2, and 2π.]

If we imagine the hypotenuse of the triangle that we just examined to have a
fixed length of 1 unit, and we allow this line segment to rotate freely around the
origin of the coordinate plane like the spoke of a suspended wheel, we can begin
to reconcile the angle vs. time problem. The rotational speed with which the
spoke rotates can be thought of as a frequency, because the spoke periodically
returns to the same position. As the spoke rotates, the angle it makes with the
positive horizontal axis at any given point in time can be found by looking at how
fast the spoke is rotating and how long it has been rotating. Multiplying these
two quantities results in an angle: writing ω for the rotational speed and t for the
elapsed time, the angle is θ = ωt. So, instead of the sine of an angle, we can
now consider the sine of the rotational speed times time. Graphing the value
of sin(ωt) on the vertical axis and time on the horizontal axis produces the familiar sine
curve.
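
The rotating-spoke picture translates into a few lines of code. In this sketch the function name and the rotational speed (one full turn per second) are hypothetical choices; the point is only that the angle is the product of speed and time, and the height of the spoke's tip is the sine of that angle.

```python
import math

def spoke_height(omega, t):
    """Vertical component of a unit-length spoke after rotating for time t."""
    theta = omega * t  # angle in radians = rotational speed * elapsed time
    return math.sin(theta)

omega = 2 * math.pi  # hypothetical speed: one full turn (2*pi radians) per second

for t in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"t={t:.2f} s  height={spoke_height(omega, t):.3f}")
```

Sampling at quarter-second steps visits the baseline, the peak, the baseline again, the trough, and then the start of the next cycle.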

This is how we connect triangles and unit circles with time-dependent sine
waves.

[Figure: the spoke of the unit circle at angle θ = ωt.]

Now we have a good mathematical model of a simple sound wave:
F(t) = sin(ωt), where t is time and ω is related to the frequency of the wave.

Strictly speaking, because ω is a rotational speed, it is measured in units of
radians per second. If we can somehow get rid of the radians in this expression,
we will be left with a quantity that has units of inverse seconds, the same as
frequency! If we divide ω by 2π radians (the angle swept in one full cycle), we
will have corrected for the radians and, thus, we will have found the frequency
of the wave.

The amplitude of the wave will correspond to the maximum value that our
function can output. Because the sine function normally oscillates between -1
and 1, any coefficient attached to the function will directly affect the amplitude of
the wave. For example, the amplitude of the sine wave 4 sin(ωt) is 4.
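
Both ideas, dividing ω by 2π to obtain a frequency and reading the amplitude off the coefficient, can be checked numerically. The sketch below uses hypothetical function names and a 440 Hz tone as the example.

```python
import math

def frequency_hz(omega):
    """Convert rotational speed (radians per second) to cycles per second."""
    return omega / (2 * math.pi)

def wave(t, amplitude, omega):
    """Simple model: amplitude * sin(omega * t)."""
    return amplitude * math.sin(omega * t)

omega = 2 * math.pi * 440  # hypothetical example: an "A" at 440 Hz
print(frequency_hz(omega))

# Sample one full period finely; the largest sample is close to the amplitude, 4.
period = 1 / frequency_hz(omega)
samples = [wave(k * period / 1000, 4, omega) for k in range(1000)]
print(max(samples), min(samples))
```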

The Wave Equation

• The mathematics of wave motion is expressed most generally in the wave
equation.
• The wave equation uses second derivatives to relate acceleration in space to
acceleration in time.

Finally, it is important to realize that sound waves are not solely functions of
time; as we have seen, they are actually pressure distributions in space that
vary with time. In order to model this situation mathematically and completely,
we need some formal expression of a wave’s behavior in both time and space.
We can think of its temporal behavior as related to frequency, but its spatial
behavior is better thought of in terms of how the amplitude at a given time
varies with the wave’s position. To be clear, the spatial dependence we are
talking about is not the height above or below baseline but rather is concerned
with the distance perpendicular to that—the direction in which the wave travels.

We can illustrate this space/time dependence by imagining first what a wave
would look like, were we to somehow stop time. If you’ve ever seen ripples
frozen in a pond in winter, “frozen in time,” so to speak, you have some idea of
what this would look like.

Looking at a cross-section of the frozen surface, we can visualize the spatial
dependence in one dimension, namely x, the horizontal dimension. We can see
that the height of a wave depends on position. Measuring at a trough produces a
negative height, whereas measuring at a crest produces a positive height.

Item 1871 / Oregon Public Broadcasting, created for Mathematics Illuminated, TWO DIFFERENT HEIGHTS AT TWO
DIFFERENT X-VALUES (2008). Courtesy of Oregon Public Broadcasting.

This tells us that any function that we wish to use to model the height of a
wave must somehow depend on position. We’ll use x to represent this spatial
coordinate. We’ll see a little bit later that waves in the real world are rarely one-
dimensional, in which case it becomes necessary to use additional coordinates
to represent spatial distribution in more dimensions.

We saw in the previous section how a wave depends on time. We used the
analogy of a steadily rotating spoke to express this dependence. With both
spatial and temporal dependence in hand, we can create a function, u, that
represents the height of a wave at any given point and time. We express this
dependence by making u a function of both position, x, and time, t, written u(x,t).

To express how u changes with both position, x, and time, t, we are going to need
calculus, the mathematics of change. The calculus concept of a derivative, a
generalized notion of slope, represents the instantaneous rate of change at a
given point in time (or space). In this case, since u depends on both x and t, we
will have to use partial derivatives to express how u changes. Partial derivatives
enable us to talk about how u changes in regard to each of the quantities, x and t,
separately.

∂u/∂x represents how u changes with respect to x.

∂u/∂t represents how u changes with respect to t.

It’s also important to notice that the height of a wave changes at a non-constant
rate. We can see this in the fact that a particle at a particular x, such as the
fishing bob from a few sections back, moves more slowly at the top of a crest or
bottom of a trough than it does when in transit between the two.

[Figure: the bob slows down at the extreme values.]

To account for this changing speed, or changing rate of change, we must use
second derivatives. The expression ∂²u/∂t² then represents the acceleration
(positive or negative) of u and ∂²u/∂x² represents the spatial analogue of
acceleration. By relating these two quantities, we derive the one-dimensional
wave equation.

$$\frac{\partial^2 u}{\partial t^2} = c^2 \frac{\partial^2 u}{\partial x^2}$$

Here, c is a constant of proportionality. In some cases, c is the speed of the
wave. Our wave function, u(x,t), is the function that solves this wave equation, a
second-order partial differential equation. We can see that the sine and cosine
functions from the previous sections will indeed satisfy this equation. For
example, if the reader is curious and familiar with calculus, let u(x,t) = sin(x + ct),
differentiate with respect to x twice, with respect to t twice, and verify that the
second derivative in t equals c² times the second derivative in x.
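
Readers who prefer a numerical check to a pencil-and-paper one can approximate both second derivatives with finite differences. The sketch below assumes the traveling-wave form u(x,t) = sin(x + ct), which satisfies the wave equation as written above; the wave speed, the sample point, and the helper names are hypothetical choices.

```python
import math

C = 3.0  # hypothetical wave speed

def u(x, t):
    """Traveling sine wave u(x, t) = sin(x + C*t)."""
    return math.sin(x + C * t)

def second_derivative(f, point, h=1e-4):
    """Central finite-difference estimate of the second derivative of f."""
    return (f(point + h) - 2 * f(point) + f(point - h)) / (h * h)

x0, t0 = 0.7, 0.2  # arbitrary sample point

u_tt = second_derivative(lambda t: u(x0, t), t0)  # acceleration in time
u_xx = second_derivative(lambda x: u(x, t0), x0)  # its spatial analogue

print(u_tt, C**2 * u_xx)  # the two values should agree closely
```

The agreement is only approximate, because finite differences estimate derivatives rather than compute them exactly.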

Continuing any further in this direction of discussion will take us too far away
from our main objective, the exposition of the music-mathematics relationship.
The wave equation is important, however, in that it demonstrates how it is
possible to represent a physical phenomenon, sound, using the language of
mathematics. That is, we have now seen how we can express wave behavior
mathematically. Specifically, we’ve seen that sines and cosines of angles are
periodic functions that can model the compression and rarefaction of groups of
air molecules. We shall now return to our quest to understand exactly how it
is that sounds become combined. With a solid mathematical understanding of
sound waves in hand, we will be able to combine multiple waves mathematically
using the power of Fourier analysis and synthesis.

SECTION 10.5 Fourier

• Adding Waves
• Building the Sawtooth
• The Frequency Domain

Adding Waves

• We can think of the combination of sine waves in a Fourier series as a
cooking recipe in which the full wave is a combination of varying amounts
(amplitudes) of waves of various frequencies.

Previously in our discussion, we have seen how the tones generated by different
instruments are really mixtures of some fundamental vibration, or oscillation,
and whole number multiples of that frequency, called overtones. The various
combinations of fundamental tones and overtones are what give instruments
their characteristic sounds. This understanding began with the Pythagorean
observation that strings with commensurable lengths sound harmonious when
plucked together. We’ve progressed from understanding the relations of string
lengths to understanding how waves work and how the frequencies of waves are
what we perceive as pitch. We’ve also seen how we can express simple sine and
cosine waves as periodic functions of time via a connection to trigonometry. In
essence, we have learned that musical tones are complicated mixtures of waves,
and we now know how to express simple waves mathematically. We are now
ready to use our mathematical tools to tackle complicated waves, such as the
tones that real instruments make. To do this, we need some concepts and tools
from an area of study that, when it began, had nothing to do with music, but
rather heat: Fourier analysis.

Joseph Fourier was an associate of Napoleon, accompanying the great general
on his conquest of Egypt. In return for his loyalty, Fourier was made governor
of Lower Egypt, where he became obsessed with the properties of heat.
He studied heat flow and, in particular, the temporal and spatial variation in
temperature on the earth. He realized that the rotation of the earth about its
axis meant that its surface was heated in some uneven, but periodic way. In
reconciling the different cycles involved in the heating of our planet, Fourier hit
upon the idea that combinations of cycles could be used to describe all kinds of
phenomena.

Item 1873 / Oregon Public Broadcasting, created for Mathematics Illuminated,
COMBOS OF CYCLES (2008). Courtesy of Oregon Public Broadcasting.

Fourier said that any function can be represented mathematically as a
combination of basic periodic functions, sine waves and cosine waves. To create
any complicated function, one need only add together basic waves of differing
frequency, amplitude, and phase. In music, this means that we can theoretically
make any tone of any timbre if we know which waves to use and in which relative
amounts to use them. It’s not unlike making a meal from a recipe: you need a
list of ingredients, you need to know how much of each ingredient to use, and
you need to know how and in what order to combine them.
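
The recipe idea can be sketched in a few lines of code. This is only an illustration under our own assumptions: the function name synthesize and the particular recipe below are hypothetical, and each ingredient is taken to be a sine wave described by an amplitude, a frequency, and a phase.

```python
import math

def synthesize(recipe, t):
    """Sum the sine-wave ingredients of a recipe at time t.

    recipe is a list of (amplitude, frequency, phase) triples; each
    ingredient contributes amplitude * sin(frequency * t + phase).
    """
    return sum(a * math.sin(w * t + p) for a, w, p in recipe)

# Hypothetical recipe: a fundamental plus two quieter overtones.
recipe = [
    (1.0, 1.0, 0.0),           # fundamental
    (0.5, 2.0, 0.0),           # first overtone, half the amplitude
    (0.25, 3.0, math.pi / 2),  # second overtone, shifted in phase
]

if __name__ == "__main__":
    for k in range(9):
        t = k * math.pi / 4
        print(f"t = {t:5.3f}  f(t) = {synthesize(recipe, t):+.4f}")
```

Changing any number in the recipe changes the resulting wave, just as changing a quantity in a kitchen recipe changes the dish.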

The ingredients used in Fourier analysis are simply sine and cosine waves. Of
course, these simple waves can come in different frequencies. For sounds that
we consider pleasing and musical, the sine waves mostly come in frequencies
that are whole-number multiples of a fundamental frequency. For sounds that
are “noisy,” such as white noise, the sine-wave ingredient frequencies can be
anything.

NOTE: In the following discussion, we’ll be using the shorthand terms “sin” and
“cos” to represent “sine” and “cosine,” respectively.

To begin, let’s look at a simple example, sin t:

Item 1874 / Oregon Public Broadcasting, created for Mathematics Illuminated,
SIN t (2008). Courtesy of Oregon Public Broadcasting.

Now, consider a modified sine function, sin 2t:

Item 1875 / Oregon Public Broadcasting, created for Mathematics Illuminated,
SIN 2t (2008). Courtesy of Oregon Public Broadcasting.

Combining these two functions gives us a new waveform, f(t) = sin t + sin 2t.

Item 1876 / Oregon Public Broadcasting, created for Mathematics Illuminated,
SIN t + SIN 2t (2008). Courtesy of Oregon Public Broadcasting.

This waveform is composed of equal parts sin t and sin 2t. It has features of
both but is a new waveform. We don’t have to combine the two simple waves
in equal parts, however. Let’s look at what happens when we use only “half as
much” sin 2t:

Item 1877 / Oregon Public Broadcasting, created for Mathematics Illuminated,
SIN t + 0.5SIN 2t (2008). Courtesy of Oregon Public Broadcasting.

Just as we find when cooking, using different proportions of the same
ingredients yields a different result. This waveform is different from the one
we obtained previously, illustrating the effect that altering the coefficient of a
function can have on the graph, or wave. The coefficient corresponds to the
amplitude of a wave, and, in our combined function, essentially determines how
much each sine term contributes to the final waveform.
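
One way to put a number on this difference is to compare the highest point reached by each mixture. The following sketch is an illustration only; the names blend and peak are ours, and the peak is estimated by sampling one full period rather than computed exactly.

```python
import math

def blend(t, weight):
    """Return sin t + weight * sin 2t."""
    return math.sin(t) + weight * math.sin(2 * t)

def peak(weight, samples=10000):
    """Approximate the largest value of the blend over one period."""
    return max(blend(2 * math.pi * k / samples, weight) for k in range(samples))

if __name__ == "__main__":
    print("equal parts:        ", round(peak(1.0), 4))
    print("half as much sin 2t:", round(peak(0.5), 4))
```

Halving the coefficient of sin 2t lowers the peak of the combined wave, which is one visible consequence of giving that ingredient a smaller share of the recipe.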

Now let’s see what happens when one of the terms is offset in phase.

Item 3088 / Oregon Public Broadcasting, created for Mathematics Illuminated,
SIN t vs. SIN (π/2 − t) vs. COS t (2008). Courtesy of Oregon Public Broadcasting.

This produces yet another waveform, illustrating the effect of each component
wave’s phase. Notice that the graphs of sin (t + π/2) and cos t are identical. This
shows us the natural phase relation between sine and cosine functions. Now
that we’ve seen how simple sine waves can be combined to create somewhat
more complex waves, let’s see how to make a more complicated wave, such as a
sawtooth wave.
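
The agreement between sin (t + π/2) and cos t noted above can also be confirmed algebraically. The short derivation below assumes the standard angle-addition formula for sine, which is not developed in this text.

```latex
\[
\begin{aligned}
\sin\left(t + \tfrac{\pi}{2}\right)
  &= \sin t \,\cos\tfrac{\pi}{2} + \cos t \,\sin\tfrac{\pi}{2} \\
  &= (\sin t)(0) + (\cos t)(1) \\
  &= \cos t
\end{aligned}
\]
```

In other words, a cosine wave is simply a sine wave shifted a quarter of a period earlier.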

Building the Sawtooth


• To build a sawtooth wave out of sine waves, we need to know which
frequencies and amplitudes to use.
• Fourier’s chief contribution was a method for determining which
amplitudes, frequencies, and phases of the trigonometric functions are
needed to model any function.
• The Fourier series representation of the sawtooth wave is an infinite sum of
sine waves.

First, let’s just look at the sawtooth waveform.

Item 1894 / Oregon Public Broadcasting, created for Mathematics Illuminated,
SAWTOOTH WAVE (2008). Courtesy of Oregon Public Broadcasting.

Notice that the graph has a series of “ramps” that indicate that the function
increases at some constant rate, then instantaneously drops to its minimum
value as soon as it reaches its maximum value. Each of the ramps looks like the
function y = x, which we can express as f(t) = t, given that we have been talking
about values relative to time. So, this sawtooth wave can be made by some sort
of function that periodically looks like f(t) = t. It has a period of 2π, so we can
say that this function is f(t) = t for t between –π and π, repeated every 2π.
According to Fourier, even a function such as this can be written as the sum of
sines and cosines.
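
Before building the sawtooth from sine waves, it may help to see the target function written directly. This is a minimal sketch under our own conventions: the name sawtooth is ours, and at the jump itself we arbitrarily assign the minimum value.

```python
import math

def sawtooth(t):
    """Periodic extension of f(t) = t from [-pi, pi) to every real t."""
    # Shift by pi, wrap into [0, 2*pi), then shift back.
    return (t + math.pi) % (2 * math.pi) - math.pi

if __name__ == "__main__":
    for t in [-4.0, -1.0, 0.0, 1.0, 4.0, 1.0 + 2 * math.pi]:
        print(f"t = {t:+.4f}  sawtooth(t) = {sawtooth(t):+.4f}")
```

Inside one ramp the function simply returns t; outside it, the wrap-around arithmetic reproduces the same ramp every 2π.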

To see this, let’s start with a sine wave of period 2π, a period equivalent to that
of the sawtooth wave above.

Item 2063 / Oregon Public Broadcasting, created for Mathematics Illuminated,
BUILDING THE SAWTOOTH 1 (2008). Courtesy of Oregon Public Broadcasting.

Now, let’s subtract another sine wave of twice the original frequency.

Item 1181 / Oregon Public Broadcasting, created for Mathematics Illuminated,
BUILDING THE SAWTOOTH 2 (2008). Courtesy of Oregon Public Broadcasting.

The equation that represents the function we’ve built so far is:
f(t) = 2sin t – sin 2t

Let’s add a third sine wave of three times the original frequency.

Item 1182 / Oregon Public Broadcasting, created for Mathematics Illuminated,
BUILDING THE SAWTOOTH 3 (2008). Courtesy of Oregon Public Broadcasting.

With the addition of the third term, our Fourier expansion is now:

$$f(t) = 2\sin t - \sin 2t + \frac{2}{3}\sin 3t$$

At this point we are just guessing which frequencies and amplitudes, or
coefficients, to use. Fourier’s great contribution was in establishing a general
method, using the techniques of integral calculus, to find the coefficients
and, by extension, the component frequencies of the expansion of any function.
This, as we shall soon see, has given mathematicians a greater range of
manipulative capabilities with functions that are difficult to deal with in their
standard form. Fourier’s specific method is beyond our scope in this text, but
the idea that certain functions can be represented as specific mixtures of sine
and cosine waves is an important one.

Returning to our sawtooth exercise, we can see that as we add more terms, the
resultant wave begins to take on the sawtooth shape.

Four terms:

$$f(t) = 2\sin t - \sin 2t + \frac{2}{3}\sin 3t - \frac{1}{2}\sin 4t$$

Item 1183 / Oregon Public Broadcasting, created for Mathematics Illuminated,
BUILDING THE SAWTOOTH 4 (2008). Courtesy of Oregon Public Broadcasting.

Five terms:

$$f(t) = 2\sin t - \sin 2t + \frac{2}{3}\sin 3t - \frac{1}{2}\sin 4t + \frac{2}{5}\sin 5t$$

Item 1184 / Oregon Public Broadcasting, created for Mathematics Illuminated,
BUILDING THE SAWTOOTH 5 (2008). Courtesy of Oregon Public Broadcasting.

Six terms:

$$f(t) = 2\sin t - \sin 2t + \frac{2}{3}\sin 3t - \frac{1}{2}\sin 4t + \frac{2}{5}\sin 5t - \frac{1}{3}\sin 6t$$

Item 1185 / Oregon Public Broadcasting, created for Mathematics Illuminated,
BUILDING THE SAWTOOTH 6 (2008). Courtesy of Oregon Public Broadcasting.

Seven terms:

$$f(t) = 2\sin t - \sin 2t + \frac{2}{3}\sin 3t - \frac{1}{2}\sin 4t + \frac{2}{5}\sin 5t - \frac{1}{3}\sin 6t + \frac{2}{7}\sin 7t$$

Item 1186 / Oregon Public Broadcasting, created for Mathematics Illuminated,
BUILDING THE SAWTOOTH 7 (2008). Courtesy of Oregon Public Broadcasting.

As you can see, the sum of the sine series is starting to look like a sawtooth
wave. Making it look exactly like one, however, requires an infinite
number of terms. To suggest an infinite sum, we often use the “dots”
convention, as in this equation:

$$f(t) = 2\sin t - \sin 2t + \frac{2}{3}\sin 3t - \cdots + b_n \sin nt + \cdots$$

The dots indicate that the established pattern goes on and on. However, there is
a more precise way to represent this sum (or more confusing, depending on your
point of view!). This is called “summation notation”:

$$f(t) = \sum_{n=1}^{\infty} b_n \sin nt$$

This representation encodes the fact that the index “n” starts at 1 and keeps
on going, and that for every index n there is a coefficient b_n that is the “weight”
on the mode sin nt (of frequency n/2π). So, the b_n’s are the amplitudes of the
component frequencies, and in the case of the sawtooth wave, we can express
them by the formula

$$b_n = \frac{2(-1)^{n+1}}{n}$$

We find this by using Fourier’s technique for finding expansion coefficients
(i.e., by computing an integral). The details of this, although outside the scope
of this text, can be found in most standard calculus textbooks.
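
For readers who would like to see where the sawtooth coefficients come from, here is a brief sketch of the calculation. It assumes the standard coefficient formula for a function of period 2π, which this text does not derive, and uses integration by parts.

```latex
\[
\begin{aligned}
b_n &= \frac{1}{\pi}\int_{-\pi}^{\pi} t \,\sin nt \; dt \\
    &= \frac{1}{\pi}\left[-\frac{t\cos nt}{n}\right]_{-\pi}^{\pi}
       + \frac{1}{n\pi}\int_{-\pi}^{\pi} \cos nt \; dt \\
    &= \frac{1}{\pi}\left(-\frac{2\pi\cos n\pi}{n}\right) + 0 \\
    &= -\frac{2(-1)^{n}}{n} = \frac{2(-1)^{n+1}}{n}
\end{aligned}
\]
```

Setting n = 1, 2, 3 gives 2, –1, and 2/3, matching the first three terms used in building the sawtooth.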

The final Fourier expansion of the sawtooth wave is then:

$$f(t) = \sum_{n=1}^{\infty} \frac{2(-1)^{n+1}}{n} \sin nt$$

In the Fourier series for this sawtooth wave, note that there are no cosine terms.
That’s because all of the coefficients that would correspond to cosines are zero.
In general, a Fourier series expansion is composed of contributions from sine
terms, sin nt (with amplitudes b_n), cosine terms, cos nt (with amplitudes a_n),
and a constant offset, or bias, a_0/2. So, in summation notation the general formula
for a Fourier expansion of a function, f(t), is:

$$f(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty} \left[ a_n \cos nt + b_n \sin nt \right]$$
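
Although Fourier’s method is beyond our scope, a computer can estimate the coefficients for us. The sketch below is only an illustration: it assumes the standard integral formulas stated in its comments (they are not given in this text), approximates the integrals with a simple midpoint rule, and uses a function name of our own choosing.

```python
import math

def fourier_coefficients(f, n, samples=20000):
    """Estimate a_n and b_n for a function f of period 2*pi.

    Assumes the standard formulas
        a_n = (1/pi) * integral of f(t) * cos(n t) over [-pi, pi]
        b_n = (1/pi) * integral of f(t) * sin(n t) over [-pi, pi]
    and approximates each integral with the midpoint rule.
    """
    dt = 2 * math.pi / samples
    a_n = b_n = 0.0
    for k in range(samples):
        t = -math.pi + (k + 0.5) * dt
        a_n += f(t) * math.cos(n * t) * dt
        b_n += f(t) * math.sin(n * t) * dt
    return a_n / math.pi, b_n / math.pi

if __name__ == "__main__":
    for n in range(1, 6):
        a_n, b_n = fourier_coefficients(lambda t: t, n)
        print(n, round(a_n, 6), round(b_n, 6), round(2 * (-1) ** (n + 1) / n, 6))
```

For the sawtooth ramp f(t) = t, the estimated cosine coefficients come out essentially zero and the sine coefficients agree closely with the formula for b_n given earlier.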

Notice in the progression that we constructed earlier that as the number of
component waves increases, the overall waveform increasingly approaches the
look of the ideal sawtooth. Each additional term has a higher frequency than
the preceding term and, thus, provides more detail than the term before it. We
can get as close as we want to the form of the ideal sawtooth by adding as many
high-frequency components as we choose. This is analogous to a sculptor
roughing out a general shape and then refining details after multiple passes.
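
This steady improvement can be measured. The following sketch, with function names of our own invention, adds up the first several terms of the sawtooth series and reports the typical (root-mean-square) gap between that partial sum and the ideal ramp f(t) = t over one period.

```python
import math

def sawtooth_partial_sum(t, terms):
    """Sum of the first `terms` terms of 2*(-1)**(n+1)/n * sin(n*t)."""
    return sum(2 * (-1) ** (n + 1) / n * math.sin(n * t)
               for n in range(1, terms + 1))

def rms_error(terms, samples=2000):
    """Root-mean-square gap between the partial sum and f(t) = t on (-pi, pi)."""
    total = 0.0
    for k in range(samples):
        t = -math.pi + (k + 0.5) * 2 * math.pi / samples
        total += (sawtooth_partial_sum(t, terms) - t) ** 2
    return math.sqrt(total / samples)

if __name__ == "__main__":
    for terms in [1, 3, 7, 20, 50]:
        print(terms, "terms -> rms error", round(rms_error(terms), 4))
```

The error shrinks as terms are added, though slowly, because the sharp drop at the end of each ramp is the hardest feature for smooth sine waves to imitate.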

Being able to take any function and express it in terms of these fundamental
pieces is an extremely useful tool. In mathematics, functions that may
otherwise seem impenetrable may give up their secrets when transformed into
a Fourier series. In the realm of music, Fourier analysis gives musicians and
sound engineers extraordinary control over sound. They can choose to augment
or attenuate specific frequencies in order to make their instruments sound
perfect. Also, with today’s synthesizers, musicians can build up fantastic sounds
from scratch by playing with different combinations of sines and cosines.

The Frequency Domain


• After a function has been converted into a Fourier series representation, it
can be viewed in the frequency domain as opposed to the time domain.
• Frequency domain views provide a different, and sometimes more
enlightening, perspective on the behavior of signals.

As we have seen, Fourier analysis can be used to represent a sound, or any
signal, in the frequency domain. This view of a wave in terms of the specific
mixture of fundamental frequencies that are present is often called a signal’s
spectrum. Analyzing the spectra of different signals can yield some surprising
information about the source of the signals. For example, by looking at the
light from stars and identifying the presence or absence of specific frequencies,
astronomers can make extremely detailed predictions about the chemical
composition of the visible layers of the star. In audio engineering, technicians
can monitor the frequencies present in a sound and then amplify or attenuate
specific frequency bands in order to control the makeup and quality of the output
sound.

Item 3234 / Mantegazza et al., FIGURE 3 FROM MANTEGAZZA et al. “SIMULTANEOUS
INTENSIVE PHOTOMETRY AND HIGH RESOLUTION SPECTROSCOPY OF δ SCUTI
STARS” 366, 547-557 (2001). Courtesy of Astronomy and Astrophysics.

We can tell the chemical composition of distant stars by analyzing the component
frequencies in the light that they give off.

Each sine or cosine term in a Fourier expansion represents a specific frequency
component. We can graph these frequencies in a histogram in which each
band represents a range of frequencies. The height of each band corresponds
to the amplitude of the contribution of those frequencies to the overall signal.
This visual representation of sound may be familiar to you if you’ve ever used a
graphic equalizer.
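
To make the frequency-domain picture concrete, the sketch below computes a small spectrum and prints it as a crude text histogram. It is an illustration only: it uses a plain discrete Fourier transform on samples of one period, a technique this text does not cover, and the function name and test signal are our own.

```python
import cmath
import math

def amplitude_spectrum(samples):
    """Amplitudes of the frequency components in one period of samples.

    Entry k is the amplitude of the component that completes k cycles
    over the sampled period, found with a discrete Fourier transform.
    """
    n = len(samples)
    amplitudes = []
    for k in range(n // 2):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        scale = 1 if k == 0 else 2  # the constant term is not doubled
        amplitudes.append(scale * abs(coeff) / n)
    return amplitudes

if __name__ == "__main__":
    n = 64
    # Samples of sin t + 0.5 sin 2t over one period.
    signal = [math.sin(2 * math.pi * i / n) + 0.5 * math.sin(4 * math.pi * i / n)
              for i in range(n)]
    for k, amp in enumerate(amplitude_spectrum(signal)[:5]):
        print(k, "#" * round(20 * amp), round(amp, 3))
```

The two bars that appear correspond to the two sine ingredients, with heights in the same two-to-one ratio as their amplitudes.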

Item 3201 / Dave Fulton, created for Mathematics Illuminated, GRAPHIC EQUALIZER (2008). Courtesy of Dave Fulton.

Item 2929 / Viktor Gmyria, SOUND LAB (2007). Courtesy of iStockphoto.com/Viktor Gmyria.
Graphic equalizer output.

Using the “sliders” of a graphic equalizer, one can adjust the amplitude of
the contribution of each frequency range to the overall sound. This makes it
possible to change the “color” of the sound coming out of the system. Boosting
low frequencies increases the bass tones and “richness” but can make the
sound “muddy.” Boosting higher frequencies improves the clarity but can make
the sound seem “thin.” The more sliders you have, the more precisely you can
sculpt the sound produced.
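
In code, a graphic equalizer amounts to multiplying each frequency component’s amplitude by the setting of the slider that covers it. The sketch below is a simplified illustration; the function name, the tone, and the slider settings are all hypothetical.

```python
def equalize(components, band_gains):
    """Scale each component's amplitude by the gain of its frequency band.

    components: list of (frequency_hz, amplitude) pairs.
    band_gains: list of (low_hz, high_hz, gain) sliders; a component that
    falls in no listed band is left unchanged.
    """
    adjusted = []
    for freq, amp in components:
        gain = 1.0
        for low, high, g in band_gains:
            if low <= freq < high:
                gain = g
                break
        adjusted.append((freq, amp * gain))
    return adjusted

# Hypothetical tone: a 110 Hz fundamental with three overtones.
tone = [(110, 1.0), (220, 0.5), (330, 0.33), (440, 0.25)]
# Hypothetical sliders: boost the bass band, cut the upper band.
sliders = [(20, 200, 2.0), (300, 2000, 0.5)]

if __name__ == "__main__":
    for freq, amp in equalize(tone, sliders):
        print(freq, "Hz ->", round(amp, 3))
```

With more, narrower bands in the slider list, the same function gives finer control over the sound, which is the point made above about having more sliders.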

Taking a natural sound and breaking it up into its component frequencies may
seem like a daunting task. Computers are quite good at it, but they are by no
means the only way of accomplishing the feat. In fact, the human ear does
something like this to help us distinguish one kind of sound from another.

The basilar membrane in your ear is formed in such a way that sounds of
different frequencies cause different areas to vibrate, more or less going from
low to high as you progress from one end of the membrane to the other. Tiny
hairs on this membrane, corresponding roughly to frequency bands, “pick up”
the relative amplitudes of the components of the tones you hear and relay this
information to the brain. The auditory processing part of your brain translates
the information into what we perceive as tones. Our ears and brains naturally
do a Fourier analysis of all incoming sounds!

[Figure: note names (A B C D E F G A B C D E) with 440 Hz marked, and the
basilar membrane’s response running from high frequency (20,000 Hz) through
440 Hz to low frequency (20 Hz).]

In addition to helping us to distinguish the sounds of music, Fourier analysis
has broad application in many other fields. Its signal-processing
capabilities are of use to scientists studying earthquakes, electronics, wireless
communication, and a whole host of other applications. Any field that involves
looking at or using signals to convey information, which covers a pretty broad
swath of modern endeavors in science and business, uses Fourier analysis in
some way or another.

Up until this point, we have been concerned with simple, one-dimensional
waves, such as those evident in a cross-section of the ripples on a pond.
However, a more realistic, complete analysis would have to involve the
vibrations of the entire surface of the water, in three dimensions. In the realm
of sound, we’re now moving from the vibration of a string to a musical surface,
such as a drum!

SECTION 10.6

Can You Hear the Shape of a Drum?

• What One Can Do for a String…
• Hearing Shapes

What One Can Do for a String…


• Fourier decompositions and frequency domain representations are not
limited to one-dimensional waves.
• A mathematical drum is a polygonal shape that resonates, given some
impulse.

As we saw in the last section, scientists and mathematicians can analyze the
frequency content of a given signal to discover important information about the
origins and nature of the signal. We have, so far, concentrated on the case of
one-dimensional waves, but there is no reason that the technique of analyzing
frequency spectra should be limited to this domain. What we can do for a string
can also be done for a higher-dimensional object, such as a membrane. All
sorts of objects can, and do, create sounds; the interesting question to consider
is whether, solely on the basis of knowing the frequency content of a sound, you
can deduce what object made it.

To be more specific, let’s think about drums. There are many factors that affect
the sound of a real drum, such as the tautness, or tension, of the drum head
and the shape of the resonant cavity, or body, of the drum. To understand the
acoustics of a drum completely, we would have to consider a broad array of
physical and phenomenological aspects, including the material with which the
drum is constructed. Obviously, we will first have to make some simplifying
assumptions about the situation if we ever hope to develop a quantitative
understanding of how a drum “works.”

[Figure: a real drum, its drum head, and an abstract drum head.]

Because we are focused on mathematics, we will take this idea of simplifying
assumptions to the extreme and examine abstract drums. With an abstract
drum, we are concerned with what is knowable in an ideal mathematical
sense. For our purposes, a drum is basically a two-dimensional, flat shape that
vibrates with some combination of frequencies when struck. Our analysis will
have nothing to do with materials or size and shape of resonant cavities. We will
be concerned solely with the frequency content of the signal produced by the
various vibratory modes of our abstract drum.

Hearing Shapes
• The question of whether or not an object’s shape can be uniquely
determined by the spectrum of frequencies it emits when resonating
depends on the dimension of the object.
• A two-dimensional object is not uniquely determined by its frequency
spectrum.

The question of concern can be phrased in this way: “Can we hear the shape
of a drum?” More specifically, if we determine the frequency spectrum of the
sound given by a drum after it is struck, can we work backward to figure out the
geometric shape that produced that spectrum? For this to be possible, every
conceivable shape must have a unique frequency spectrum. If two different
shapes shared the same frequency spectrum, then it would be impossible to
“hear” the shape of either one—you would never know exactly which shape
produced the sound.

This question was first posed by mathematician Mark Kac in a 1966 paper.
Mathematicians quickly took up the challenge and soon determined that one
could “hear” the area of the shape. The problem of “hearing” the exact shape,
however, remained unsolved until 1991 when mathematicians Carolyn Gordon,
David Webb, and Scott Wolpert determined that the shape of a drum cannot
be categorically determined by its frequency spectrum. They confirmed this
by finding two different shapes (drumheads) that have the same frequency
spectrum.

Item 3102 / Oregon Public Broadcasting, created for Mathematics Illuminated, MATHEMATICAL DRUMS (2008).
Courtesy of Carolyn Gordon. These two shapes “sound” the same.

Nevertheless, it is possible to distinguish between some shapes by using the
frequency spectrum alone. For example, you can “hear” the difference between
a rectangular drum and a circular drum. A more meaningful question, then,
might be, “What can you tell about a drum from its frequency spectrum?” In
a nutshell, some features are evident, others are not. Although the frequency
domain representation of a signal may not tell us everything about its source, it
can indeed provide us with some information. Using the techniques of Fourier
analysis and frequency domain representation, we can find out information
about the shapes of drums that we cannot see. As we’ve seen, these techniques
have also helped us determine the chemical composition of stars millions of
light years away. Clearly, breaking up a sound or other signal into its frequency
components can help uncover fundamental information about origin and
structure that is not otherwise evident.
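
As a small illustration of “hearing” a difference in shape, we can compare the spectra of two rectangular drums that have the same area. This sketch rests on an assumption not developed in this text: for an ideal rectangular membrane with sides a and b, held fixed along its edge, the frequency of mode (m, n) is proportional to the square root of (m/a)² + (n/b)². The function name and the two drums are hypothetical.

```python
import math

def rectangle_spectrum(a, b, count=6, max_mode=10):
    """Lowest `count` relative frequencies of an ideal a-by-b rectangular drum.

    Mode (m, n) has frequency proportional to sqrt((m/a)**2 + (n/b)**2);
    the common physical constant is left out, so only ratios matter.
    """
    freqs = sorted(math.sqrt((m / a) ** 2 + (n / b) ** 2)
                   for m in range(1, max_mode + 1)
                   for n in range(1, max_mode + 1))
    return [round(f, 4) for f in freqs[:count]]

if __name__ == "__main__":
    # Two hypothetical drums with the same area but different shapes.
    print("1 x 1 square:     ", rectangle_spectrum(1.0, 1.0))
    print("2 x 0.5 rectangle:", rectangle_spectrum(2.0, 0.5))
```

The two lists differ from the very first entry, so these two shapes can be told apart by their frequencies alone, even though their areas are equal.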

UNIT 10 at a glance
textbook

SECTION 10.2

The Math of Time

• Music played a central role in Greek thought.
• The Greeks recognized that strings of rationally related lengths sound
harmonious when vibrating together.
• Rational relations are the foundation of Western music.

SECTION 10.3

Sound and Waves

• Sound is caused by compression and rarefaction of air molecules.
• We perceive the amplitude of a sound wave as its loudness, or volume.
• We perceive the frequency of a sound wave as its pitch.
• An instrument’s tone, the sound it produces, is a complex mixture of waves
of different frequencies.
• Instruments produce notes that have a fundamental frequency in
combination with multiples of that frequency known as partials or
overtones.

SECTION 10.4

Mathematics of Waves

• Trigonometric functions, such as sine and cosine, are useful for modeling
sound waves, because they oscillate between values.
• We can connect the idea of the sine function of an angle to sine waves
dependent on time by analyzing the “spoke” of a unit circle as it rotates,
forming the hypotenuse of various right triangles.
• A sine wave can represent a sound wave theoretically, but not pictorially.
The shape of a sine wave is altogether different from the “shape” of a sound
wave found in nature.
• The mathematics of wave motion is expressed most generally in the wave
equation.
• The wave equation uses second derivatives to relate acceleration in space to
acceleration in time.

SECTION 10.5

Fourier

• We can think of the combination of sine waves in a Fourier series as a
cooking recipe in which the full wave is a combination of varying amounts
(amplitudes) of waves of various frequencies.
• To build a sawtooth wave out of sine waves, we need to know which
frequencies and amplitudes to use.
• Fourier’s chief contribution was a method for determining which
amplitudes, frequencies, and phases of the trigonometric functions are
needed to model any function.
• The Fourier series representation of the sawtooth wave is an infinite sum of
sine waves.
• After a function has been converted into a Fourier series representation, it
can be viewed in the frequency domain as opposed to the time domain.
• Frequency domain views provide a different, and sometimes more
enlightening, perspective on the behavior of signals.

SECTION 10.6

Can You Hear the Shape of a Drum?

• Fourier decompositions and frequency domain representations are not
limited to one-dimensional waves.
• A mathematical drum is a polygonal shape that resonates, given some
impulse.
• The question of whether or not an object’s shape can be uniquely
determined by the spectrum of frequencies it emits when resonating
depends on the dimension of the object.
• A two-dimensional object is not uniquely determined by its frequency
spectrum.


BIBLIOGRAPHY

WEBSITES

http://falstad.com
http://www.ams.org/featurecolumn/archive/199706.html

PRINT

Boyer, Carl B. (revised by Uta C. Merzbach). A History of Mathematics, 2nd ed.
New York: John Wiley and Sons, 1991.

Burk, Phil, Larry Polansky, Douglas Repetto, Mary Roberts, and Dan Rockmore.
Music and Computers: A Theoretical and Historical Approach. Emeryville, CA: Key
College Publishing, 2004.

Du Sautoy, Marcus. The Music of the Primes: Searching To Solve the Greatest
Mystery in Mathematics. New York: Harper Collins, 2003.

Eves, Howard. An Introduction to the History of Mathematics, 5th ed. (The Saunders
Series) Philadelphia, PA: Saunders College Publishing, 1983.

Harkleroad, Leon. The Math Behind the Music. New York: Cambridge University
Press and Washington, DC: Mathematical Association of America, 2006.

Kac, Mark. “Can One Hear the Shape of a Drum?” The American Mathematical
Monthly, vol. 73, no. 4, part 2: Papers in Analysis (April 1966).

Lazzaro, John and John Wawrzynek. “Subtractive Synthesis Without Filters,” Audio
Anecdotes II. (2004).

Nahin, Paul J. Dr. Euler’s Fabulous Formula: Cures Many Mathematical Ills.
Princeton, NJ: Princeton University Press, 2006.

Rockmore, Dan. Stalking the Riemann Hypothesis: The Quest To Find the Hidden Law
of Prime Numbers. New York: Vintage Books (division of Random House), 2005.

Rothstein, Edward. Emblems of Mind: The Inner Life of Music and Mathematics.
USA: Times Books, 1995.

Transnational College of LEX. Translated by Alan Gleason. Who is Fourier? A
Mathematical Adventure. Belmont, MA: Language Research Foundation, 1995.

TEXTBOOK
Unit 11
UNIT 11
Connecting with Networks
TEXTBOOK

UNIT OBJECTIVES

• Networks can be represented by graphs, which can be analyzed mathematically.

• A graph is a set of elements along with another set that defines how the elements
are connected.

• The degree of a node is how many connections it has.

• A path is a sequence of edges connecting two nodes.

• A connected component of a graph is a maximal collection of nodes and edges that
are mutually connected.

• Random graphs can undergo “connectivity avalanches” during construction.

• Distance on a graph is a measure of the fewest number of edges needed to travel
between two given nodes.

• The clustering coefficient is a measure of how many of a node’s neighbors are
connected to each other (e.g., the fraction of a given individual’s friends who are
also friends with each other).

• Small-world networks have higher-than-expected clustering coefficients and short
mean distances.

• Scale-free networks follow a power law when describing the distribution of
degrees.
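
Several of the terms in these objectives (degree, distance, clustering coefficient, connected component) can be tried out on a tiny example. The sketch below is only an illustration: the friendship network and all function names are hypothetical, and the graph is stored as a dictionary that maps each node to the set of its neighbors.

```python
from collections import deque

# Hypothetical friendship network: node -> set of neighbors.
graph = {
    "Ann": {"Bob", "Cy", "Dee"},
    "Bob": {"Ann", "Cy"},
    "Cy": {"Ann", "Bob"},
    "Dee": {"Ann", "Eve"},
    "Eve": {"Dee"},
    "Flo": set(),  # an isolated node forms its own connected component
}

def degree(g, node):
    """Number of connections a node has."""
    return len(g[node])

def distance(g, start, goal):
    """Fewest edges on a path from start to goal, or None if unreachable."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return seen[node]
        for nbr in g[node]:
            if nbr not in seen:
                seen[nbr] = seen[node] + 1
                queue.append(nbr)
    return None

def clustering(g, node):
    """Fraction of pairs of the node's neighbors that are themselves linked."""
    nbrs = sorted(g[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for i in range(k) for j in range(i + 1, k)
                if nbrs[j] in g[nbrs[i]])
    return links / (k * (k - 1) / 2)

def components(g):
    """Split the nodes into connected components."""
    remaining, parts = set(g), []
    while remaining:
        start = remaining.pop()
        part, queue = {start}, deque([start])
        while queue:
            for nbr in g[queue.popleft()]:
                if nbr not in part:
                    part.add(nbr)
                    queue.append(nbr)
        remaining -= part
        parts.append(part)
    return parts

if __name__ == "__main__":
    print("degree of Ann:", degree(graph, "Ann"))
    print("distance Bob to Eve:", distance(graph, "Bob", "Eve"))
    print("clustering of Ann:", round(clustering(graph, "Ann"), 3))
    print("number of components:", len(components(graph)))
```

In this small network, only one of the three possible pairs of Ann’s friends know each other, so her clustering coefficient is one third.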
Network thinking is poised to invade all
domains of human activity and most fields
of human inquiry. It is more than another
useful perspective or tool. Networks are by
their very nature the fabric of most complex
systems, and nodes and links deeply infuse
all strategies aimed at approaching our
interlocked universe.

Albert-László Barabási
UNIT 11 connecting with networks
textbook

SECTION 11.1

INTRODUCTION

It is a cliché to say that we live in a connected age. Improvements in
communication and transportation technologies, starting with the telegraph
and locomotive and continuing through the Internet, jumbo jet, and beyond,
have brought us increasingly closer together. These technologies enable us to
maintain our relationships to one another more easily, and they encourage us to
make new connections.

Underlying these connecting technologies is an infrastructure of roads,
air routes, power lines, telephone cables, and a variety of electromagnetic
wave transmitters and receivers. These systems allow people, electricity,
and information to reach even the most remote areas of our country and our
world with relative ease. They are vastly complex collections of elements and
their connections. Because the elements and connections within a network
interact in complicated ways, they exhibit system characteristics that are often
unforeseeable when they are viewed simply as a large group of independent
network components. Obviously, the way they interrelate makes a huge
difference in the overall nature and capacity of the network.

Mathematicians view networks as fundamental objects of study. Networks, as a
whole, exhibit behavior that is very difficult, if not impossible, to understand by
studying the elements individually. Examples of this abound in the history of our
nation’s power grid. Small events, such as a single power line coming in contact
with an overgrown tree, can set in motion a cascade of events that leads to large-
scale power outages many miles away. That such events occur, despite multiple,
built-in safety features that are designed to prevent these types of outcomes on
a local scale, is a testament to the need to understand network behavior on a
broader scale.

Networks are all around us. We are connected to each other, not only through
physical links such as power lines, phone lines, and roads, but also through
the less-tangible relationships of friendship, family, and business ties. We use
a global information network, in the form of the Internet and World Wide Web,
almost without thinking. Our connections give us access to information and
opportunity.

We can use our understanding of networks to study life itself, on multiple scales.
From networks of genes and proteins, to cellular structures, to ecosystems of
predators and prey, we realize that living beings are not in any way solitary; they
depend heavily on their interactions. Detailed understanding of the functioning of
the web of life can help us make better decisions about the future of our planet,
and of our species.

If one of the benefits of connectedness is that we are better able to work
together, a drawback is that we are more susceptible to small disturbances. As
is evident in the example of the failure of a solitary power line causing a huge
blackout, small disturbances can rapidly, and unpredictably, grow into real
dangers. One computer virus can quickly cripple a business, or an entire nation.
A biological virus can spread so rapidly in today’s era of broadly affordable airfare
that a global pandemic can envelop us before we know it is happening. Terrorists
of both the real and cyber-worlds can use the de-centralizing properties of
networks not only to mount an attack, but also to evade detection.

Analyzing networks mathematically is a way to understand the complicated world
around us. In this unit we will learn a bit about the history and fundamental
ideas of the subject. We will start with Euler’s early study and his approach
to the problem of the Königsberg bridges. Then we will travel through the
random networks of Paul Erdös and the small worlds of Duncan Watts and Steve
Strogatz. We will then explore the “rich-get-richer” world of the scale-free
network. Finally, we will take a look at the emerging study of dynamic networks.
Throughout this exploration, we will study both basic ideas and examples of
networks in action. By the end, we will have caught a glimpse of some of the
networks that are such pervasive influences in our daily lives.

SECTION 11.2
The Study of Connections

• Euler’s Bridges
• Examining Networks and Graphs

Euler’s Bridges
• Euler’s solution to the Bridges of Königsberg problem showed how to
analyze a real-life situation in terms of connections.
• The existence of an Eulerian path or cycle on a graph depends on the degree
(number of connections) of each node.

The set of ideas we now call “network theory” can be traced back to the work
of the great Swiss mathematician, Leonhard Euler. In the 1730s, Euler took up
a puzzle about the town of Königsberg in the kingdom of Prussia, now known as
Kaliningrad, Russia. Through the town ran the river Pregel, and within the river
were two small islands. These islands and the mainland were connected by
seven bridges, as shown below.

Item 3100 / Oregon Public Broadcasting, created for Mathematics Illuminated, BRIDGES OF KÖNIGSBERG (2008).
Courtesy of Oregon Public Broadcasting.

A popular pastime among the city’s residents was to look for a path through
town that traversed all seven bridges without crossing the same bridge twice.
Euler became intrigued by this problem. He recognized that the solution had
nothing to do with any of the distances involved, but rather with the way in which
the landmasses were connected to each other. He assigned each destination
a letter and used pairs of letters to denote bridges. In modern mathematical
language, each destination is called a “node” or “vertex” and each bridge is
called an “edge.” The problem can be simply represented in an image of four
nodes and seven edges such as this:

Item 3095 / Oregon Public Broadcasting, created for Mathematics Illuminated, NODE MAP OF THE BRIDGES OF
KÖNIGSBERG PROBLEM (2008). Courtesy of Oregon Public Broadcasting.

By abstracting the Königsberg bridges problem, Euler was able to prove that
there is no possible path that crosses each bridge exactly once. To do this, he
looked at how many connections each node has; mathematicians now call this
quantity a node’s degree.

Euler realized that for such a theoretical ideal path to exist, it would have to be
the case that at any “interior” (neither starting nor finishing) node of the walk,
upon reaching the node by one bridge, there would have to be a way to depart
the node by another bridge that had not been used yet. That is, if one was able to
arrive at a node via one edge, one would have to be able to leave that same node
via a different edge. Thus, as long as each interior node has an even number of
connections, a path that contains every edge, now known as an Eulerian path, is
potentially possible. Euler also realized that if we assume that the theoretical
journey ends at a different node than the one at which it begins, then both the
starting and finishing nodes must be of odd degree.

Figure panels: at an interior node, every arrival must be matched by a departure. With more than two nodes of odd degree, no cycles or paths are possible. With exactly two nodes of odd degree, Eulerian paths are possible, but not cycles. If all nodes have even degree, both Eulerian paths and Eulerian cycles are possible.

A node having odd degree has to be either the initial or terminal node of an Eulerian path. If a graph has more
than 2 nodes of odd degree, no Eulerian cycles or paths are possible.

Changing the problem slightly, Euler also knew that if one is required to start
and finish at the same node and walk a path that covers every edge only once, all
nodes must be of even degree. We now call such a route an Eulerian cycle.
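
Euler's degree test can be sketched in a few lines of Python. The bridge list below is an assumption about how the landmasses are labeled (A is taken to be the island touched by five bridges), and the helper names `degrees` and `euler_classification` are hypothetical. The test also assumes the graph is connected.

```python
from collections import Counter

# The seven bridges as pairs of landmasses. The labels A-D are our own
# assumption: A is the island touched by five of the bridges.
BRIDGES = [("A", "B"), ("A", "B"), ("A", "C"), ("A", "C"),
           ("A", "D"), ("B", "D"), ("C", "D")]


def degrees(edges):
    """Count how many edge endpoints touch each node."""
    count = Counter()
    for u, v in edges:
        count[u] += 1
        count[v] += 1
    return dict(count)


def euler_classification(edges):
    """Classify a connected graph by its number of odd-degree nodes."""
    odd = sum(1 for d in degrees(edges).values() if d % 2 == 1)
    if odd == 0:
        return "cycle"  # every node even: an Eulerian cycle exists
    if odd == 2:
        return "path"   # exactly two odd nodes: a path, but no cycle
    return "none"       # more than two odd nodes: neither exists


print(degrees(BRIDGES))               # {'A': 5, 'B': 3, 'C': 3, 'D': 3}
print(euler_classification(BRIDGES))  # none
```

All four landmasses have odd degree, which is why no walk of the kind the townspeople sought can exist.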

Euler’s observation is now regarded as the first theorem in graph theory. It
is also regarded as the first observation in topology, the study of fundamental
properties of shape—those properties, such as connectivity, that don’t change
under stretching or squashing. As with many new fields of study, it took a while
for others to join the endeavor. It wasn’t until nearly a century later that other
mathematicians began to expand on this work begun by Euler.

The Irish mathematician William Hamilton picked up the torch in the middle of
the 19th century. His focus, like Euler’s, was on whether or not certain networks
admitted cycles. Hamilton is credited with defining a new type of cycle, one that,
rather than covering every edge of a network, visits every node exactly once.
This type of path is now commonly known as a Hamilton cycle, an example of
which we saw briefly in Section 2.5 of Combinatorics Counts.

Questions about cycles in networks continued to provide fertile ground for post-
Euler thinkers concerned with networks. This search led to the identification
and classification of different types of networks. One of the simplest types
of networks that have been identified is the tree. A German physicist, Gustav
Kirchhoff, known primarily for his laws concerning electrical circuits, was
the first to record studies of something like network trees in the mid-1800s.
These organizational structures will be familiar to anyone who has filled out a
tournament bracket.

A tree graph; there is only one route that connects each pair of nodes

In a tree, every node is connected to every other node by exactly one path. This
is different than the network of the Königsberg bridges, in which some nodes
are connected via multiple paths.

Figure panels: only one path from node A to node B; multiple paths from node A to node B, where the distance from A to B is the length of the shortest path.

If two nodes are connected by multiple paths, the length of the shortest of those
paths defines the distance between the two nodes. The average distance in a
network is the sum of the distances between all pairs of nodes divided by the
number of pairs. Cycles, paths, distance, and average distance are but a few of
the characteristics of networks that can be mathematically studied. As the body
of network theory grew, mathematicians developed more tools that enabled them
to study and classify different networks and their properties.
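
As a rough illustration of distance and average distance, the sketch below uses breadth-first search, a standard technique that is our choice here rather than anything prescribed by the text. The five-node ring `RING` and the function names are hypothetical, and the code assumes the graph is connected.

```python
from collections import deque
from itertools import combinations


def bfs_distances(adj, start):
    """Return the distance (fewest edges) from start to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adj[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def average_distance(adj):
    """Mean distance over all pairs of nodes (assumes a connected graph)."""
    pairs = list(combinations(adj, 2))
    total = sum(bfs_distances(adj, u)[v] for u, v in pairs)
    return total / len(pairs)


# Hypothetical example: five nodes arranged in a ring, 0-1-2-3-4-0.
RING = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0]}

print(bfs_distances(RING, 0)[2])  # 2: the route 0-1-2 beats the route 0-4-3-2
print(average_distance(RING))     # 1.5
```

Nodes 0 and 2 are joined by two routes, of lengths two and three; only the shorter one counts as the distance.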

Examining Networks and Graphs


• A graph is a mathematical structure consisting of a set of elements and a set
that defines the connections between them.
• Graph theorists are concerned with a number of graph properties, such as
connectedness, connected components, and diameter.
• Graphs can be directed or undirected, weighted or unweighted.

A network is generally a real-world system of elements and their connections.
There are two main ways that mathematicians abstract networks so that they
can be more easily studied. The first, and most fundamental, way was pioneered
by Euler; a network can be represented abstractly as a set of elements (the
vertices or nodes) as well as a set of pairs (subsets of size two) of those
elements, representing edges. For example, one way to represent a certain
graph might be the set {A, B, C, D} (the set of vertices) together with the set
of pairs {AB, BC, AD, AC, CD} that indicate the edges. We can tell from this
representation that the graph has four nodes and five edges connecting them. It
might be easier, however, to visualize this network as below:

Figure: nodes A, B, C, and D, joined by the edges AB, BC, AD, AC, and CD.

Note that the edge BD is not included in the set of node pairs and thus does not
appear in the visual representation of the network.
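
The same set-based description translates almost directly into code. This is a minimal sketch; storing each edge as a `frozenset` (so that AB and BA are the same edge) and the helper names `has_edge` and `degree` are our own assumptions.

```python
# The vertex set and edge set from the example above.
vertices = {"A", "B", "C", "D"}
edges = {frozenset(pair) for pair in ["AB", "BC", "AD", "AC", "CD"]}


def has_edge(u, v):
    """True if the unordered pair {u, v} is in the edge set."""
    return frozenset((u, v)) in edges


def degree(v):
    """Number of edges that touch node v."""
    return sum(1 for edge in edges if v in edge)


print(len(vertices), len(edges))  # 4 5
print(has_edge("B", "D"))         # False: BD is not in the set of pairs
print(degree("A"))                # 3
```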

Mathematicians refer to this sort of diagram that relates nodes and edges
as a “graph.” The connections are just as important as the things they are
connecting. As you can see, these graphs are slightly different than the ones
composed of points on the coordinate plane that are commonly studied in
school; in other words, in network theory we’re not concerned with graphs
of functions! In the most basic notion of a graph, all nodes are considered
to be indistinguishable from each other, as are all edges. This is the first
big abstraction in graph theory. Real networks are not made up of identical
elements that all connect to each other via the same relationship. Making these
assumptions, however, serves as a starting point for analysis.

By looking at a graph, we are better able to visualize and interpret the
connections between elements. A key question is that of connectivity. We say
that a graph is connected if, starting at any node, there exists a path to every
other node in the network, no matter how circuitous.


A disconnected network: no route from A to B. A connected network: a circuitous path from A to B.

Connectedness is important in many different real-world networks. With
the power grid, if a house, block, or neighborhood is disconnected, it has no
electricity. If a social group is connected, then every person is linked to
every other member by a chain of acquaintances, although the chain may run
through intermediaries (“friends of friends” and such). It’s not clear whether
or not all the people on earth form a connected network, for there may not be
a chain of acquaintances linking the most remote Mongolian nomad to a native
living in the Amazon rainforest. We’ll explore this idea in more detail a bit later.

Even when a network is not connected, there may be a sub-network that is.
The Internet connects a large number of computers around the planet. Not all
computers, however, connect to the Internet. Hence, the Internet represents
what is known as a “connected component” of the network of all computers.
There are other connected components, such as the secure computer networks
run by the CIA and the Department of Defense. These connected components
are isolated from the Internet and from each other.
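
One way to find connected components is to start at a node, collect everything reachable from it, and repeat from any node not yet collected. The sketch below follows that idea; the machine names are hypothetical stand-ins, not real systems.

```python
def connected_components(adj):
    """Split a graph (dict: node -> list of neighbors) into its components."""
    seen = set()
    components = []
    for start in adj:
        if start in seen:
            continue
        component = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adj[node])
        seen |= component
        components.append(component)
    return components


def is_connected(adj):
    """A graph is connected when it has exactly one component."""
    return len(connected_components(adj)) == 1


# Hypothetical network: three machines on the Internet, two on an isolated
# secure network.
NETWORK = {
    "laptop": ["server"],
    "server": ["laptop", "phone"],
    "phone": ["server"],
    "secure_a": ["secure_b"],
    "secure_b": ["secure_a"],
}

print(len(connected_components(NETWORK)))  # 2
print(is_connected(NETWORK))               # False
```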

While we’re on the subject of computer networks, it’s worth pointing out that
the World Wide Web is a “directed network.” This means that connections in
cyberspace are not necessarily “two-way streets.” For example, a blogger can
post a link to a site, but that site doesn’t necessarily have to link back to the
referring blog.

The system of phone lines and other physical (including wireless) connections
that make up the Internet, however, is an undirected network. These physical
connections are two-way streets, although not all sites use this capability.

Let’s return for a minute to the network of all people on earth. If it turns out that
this network is indeed connected, then the length of the shortest chain of
acquaintances that connects the two most remote people, say the Mongolian nomad
and the Amazon native, is another quantity of interest known as the “diameter.”
The diameter of a graph or network is the longest possible distance between two
nodes. Recall that we specifically defined distance as the length of the shortest
path between two nodes, so the diameter of a network is actually the “longest
shortest path.”
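
The “longest shortest path” idea can be sketched by computing the shortest-path distances from every node and keeping the largest. The example graph `CHAIN` and the function names are hypothetical, and the code assumes a connected graph.

```python
from collections import deque


def distances_from(adj, start):
    """Shortest-path distances (in edges) from start, by breadth-first search."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adj[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def diameter(adj):
    """Largest distance between any two nodes (assumes a connected graph)."""
    return max(max(distances_from(adj, node).values()) for node in adj)


# Hypothetical example: a chain 0-1-2-3-4 with one shortcut between 0 and 2.
CHAIN = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4], 4: [3]}

print(diameter(CHAIN))  # 3: for example, from node 0 to node 4 via 2 and 3
```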

The diameter of this graph is seven.

Finally, assuming that all nodes and edges are of equal value facilitates
observations about networks and the graphs that represent them. This
assumption can make things too simple sometimes, and important features
may be missed. Graphs that assign different values to the edges are known as
“weighted graphs.” We explored weighted graphs somewhat in our discussion
of the problem concerning the traveling salesperson in the unit: Combinatorics
Counts.

Figure: a weighted graph whose edges are labeled 52, 40, 70, 45, 50, and 50 miles.

The discoveries of Euler, Hamilton, Kirchhoff, and others formed a foundation
for future mathematicians to continue the study and classification of graphs
and their properties. Euler’s theorem was the first such observation, but
it was far from the last. Properties such as average distance, diameter,
and connectedness became important tools for studying networks. As
mathematicians learned to see networks as structures worthy of study in their
own right, they began to identify and understand a range of different types of
networks and the graphs that represent them. One of these types, random
networks, is the subject of our next section.

SECTION 11.3
RANDOM NETWORKS

• My Brain Is Open
• Around the World

My Brain Is Open
• There are multiple ways to define a random network.
• As edges are added randomly to a collection of nodes, groups of connected
components become larger, resulting in a “connectivity avalanche.”

Paul Erdös was a Hungarian mathematician famous for both his exceptional
mind and his rather extensive list of collaborators. After receiving his doctorate
in the 1930s, he proceeded to work diligently throughout much of the 20th
century until his death in 1996. He was famous for his habit of showing up
on colleagues’ doorsteps, with a suitcase that contained all of his worldly
possessions, and greeting his future collaborator by proclaiming, “My brain is
open.” This was his way of letting colleagues know that he was interested in
collaborating with them on some difficult problem of the day. Erdös was sort
of an itinerant mathematician, hopping from one collaboration to the next,
connecting to many in the math world. Because of his ability to work with
people and forge numerous connections, it seems fitting that some of his most
influential work was in the study of networks and their graphs.

Erdös was one of the most prolific mathematicians in history, authoring or co-
authoring roughly 1,500 papers, an output rivaled only by Euler’s. One of those
collaborations, with Alfréd Rényi, resulted in one of the key ideas in modern
graph theory, the random graph.

Figure panels: random graph; lattice.

As mathematicians work to model real-world networks, an issue that arises
is that of determining a general taxonomy of networks. Perhaps there is a
hierarchy of structure, and if so, where do real-life networks fit in the hierarchy?
Is there some attribute that characterizes real networks? Some real-life
networks, such as those that make up crystals (physical structures in which
atoms are connected by chemical bonds), are extremely ordered. Such regular
networks can be modeled by graphs known as lattices.

Other networks exhibit very little regularity, their connections seeming to be
haphazard and unplanned. When we call such groupings “random networks,”
what exactly do we mean by that term?

Erdös and Rényi gave two different definitions of a random network. An
action-oriented description of their first definition is: for a given number of
elements, N, imagine the set of all the possible ways in which they could be
connected and select one of these at random. To figure out how many graphs
there are to choose from, we can use the C(n,k) function from our previous unit
on combinatorics. Because an edge is a connection between two nodes, the
number of possible edges between N nodes is equal to the number of ways to
select two out of N things, C(N, 2).

To figure out how many possible graphs can be created involving $C(N,2)$ or
fewer edges, we can treat each of the possible edges as either present or not
present. This is the exact same logic we applied in the unit on combinatorics
when defining a bijection between the subsets of a set of N elements and
the binary strings (000101, 011110, etc.) of length N. In the case of
the binary strings, we found that there are $2^N$ strings. Because we have $C(N,2)$
possible edges, the number of possible graphs is $2^{C(N,2)}$. Each of these graphs
has a $\frac{1}{2^{C(N,2)}}$ chance of being randomly selected via this method.
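
For small N the count can be checked by brute force. The sketch below treats each possible edge as one bit of a binary string, as described above; `count_graphs` and `all_graphs` are hypothetical helper names.

```python
from itertools import combinations
from math import comb


def count_graphs(n):
    """Number of graphs on n labeled nodes: 2 to the power C(n, 2)."""
    return 2 ** comb(n, 2)


def all_graphs(nodes):
    """Yield every graph on the given nodes as a list of edges."""
    possible = list(combinations(nodes, 2))
    for mask in range(2 ** len(possible)):
        # Bit i of mask says whether possible edge i is present.
        yield [edge for i, edge in enumerate(possible) if (mask >> i) & 1]


print(comb(4, 2), count_graphs(4))  # 6 64
print(len(list(all_graphs("ABC"))))  # 8
```

With four nodes there are six possible edges and therefore 64 graphs, so each graph would be selected with probability 1/64.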

The second method that Erdös and Rényi described for constructing a random
graph is an incremental process. We consider each of the potential edges
between N nodes in turn. For each edge, flip a coin. If the coin lands heads
up, we make the connection; if it lands tails up, we leave the pair of nodes
unconnected and move on to the next pair.
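
The coin-flip construction can be sketched as follows. The parameter `p` is our own generalization: a fair coin corresponds to `p = 0.5`, and the `seed` argument exists only to make runs repeatable.

```python
import random
from itertools import combinations


def coin_flip_graph(n, p=0.5, seed=None):
    """Visit every pair of nodes once and keep the edge with probability p."""
    rng = random.Random(seed)
    return [(u, v) for u, v in combinations(range(n), 2) if rng.random() < p]


edges = coin_flip_graph(6, seed=1)
print(len(edges), "of", 6 * 5 // 2, "possible edges were kept")
```

Because every pair gets its own independent flip, each of the possible graphs on six nodes is equally likely when the coin is fair.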

This second method of construction provides a good way to glimpse what
happens as a random network is constructed. A useful question to consider
is: When does the network become connected? Let’s explore this process by
imagining a bunch of buttons strewn about on the floor.

We can use strands of thread to connect buttons, and we can use the coin flip
method of determining whether or not to connect a pair of buttons. Early on in
this process, we will likely have a bunch of pairs of buttons, mostly disconnected
from each other. Gradually, as the process continues, many of these connected
pairs will become connected to each other, forming connected components.
One can think of the connected component as all of the buttons that would be
attached to a certain button if you were to pick it up. Usually, before we have
attached too many threads, each button will be a part of a connected component,
and there might be several connected components among the whole system of
buttons and threads.

At this stage, the network of buttons as a whole cannot be said to be connected.
Their grouping into multiple connected components represents an intermediate
stage between utter isolation and complete connectedness. The size of the
largest among the connected components depends on how many threads have
been attached thus far. The nature of this correspondence is quite interesting.

When we first add a thread, the largest connected component consists of just
two buttons. As a fraction of all the buttons in the system, this is close to zero.
As we add a few more threads, any system of connected components that arises
will most likely be a tree, and there will still be a fair number of isolated buttons.
This type of structure arises due to the high probability that, in the early stages
of network evolution, each new connection is either with a previously isolated
button or with a button that has, at most, one other connection. Eventually, as
the number of connecting threads increases and the number of isolated buttons
decreases, the odds shift so that we are more likely to connect two buttons that
already have connections to others. When we reach this stage of growth, the
addition of a new thread is likely to join connected components, thereby creating
ever larger components, the largest of which is sometimes called the giant
component. As we approach the situation in which the average button has at
least one connection, the giant component grows quickly to incorporate nearly
the whole system.
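
The growth of the giant component can be watched in a small simulation. This is a sketch under simplifying assumptions: threads are tied between randomly chosen pairs of buttons (the same pair may be picked twice), and components are tracked with a union-find structure, a bookkeeping technique we have chosen for convenience. The function name is hypothetical.

```python
import random


def largest_component_sizes(n, threads, seed=0):
    """Add random threads one at a time; record the largest component size."""
    rng = random.Random(seed)
    parent = list(range(n))  # each button starts as its own component
    size = [1] * n

    def find(x):
        """Follow parent links to the representative of x's component."""
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # shorten the chain as we go
            x = parent[x]
        return x

    largest = 1
    history = []
    for _ in range(threads):
        a, b = rng.sample(range(n), 2)  # pick two distinct buttons
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            if size[root_a] < size[root_b]:
                root_a, root_b = root_b, root_a
            parent[root_b] = root_a  # merge the smaller component into the larger
            size[root_a] += size[root_b]
            largest = max(largest, size[root_a])
        history.append(largest)
    return history


history = largest_component_sizes(1000, 2000)
for threads in (100, 250, 500, 750, 1000, 2000):
    print(threads, "threads -> largest component:", history[threads - 1])
```

With 1,000 buttons, 500 threads correspond to an average of one connection per button, which is roughly where the largest component should begin to grow rapidly.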

The rapid transformation from a few separate connected components to the
giant component is sometimes called a “connectivity avalanche,” and it is an
example of a phase transition. Phase transitions occur all the time in nature,
such as when water turns to ice, or when a material becomes magnetized: any
time the condition of a system changes almost instantaneously.

Around the World


• The average distance is one way to classify different types of graphs.

Recall from earlier in this unit that distance on a graph is a measure of the least
number of edges needed to get from one node to another. Average distance is
the mean of all the individual distances. In a random graph, we can assume
that, given a certain number of average links per node, each node is just as
likely to be directly connected (i.e., connected by only one edge) to one node
as any other. Therefore, we should be able to come up with a relationship that
represents the average distance between nodes in a random graph.

Suppose we have a graph with N nodes, each of which has k links, on average, to
other nodes. This means that from any starting node, we can, again on average,
get to k other nodes within one step. It also means that we could get to k(k−1)
nodes within two steps.

Nodes A1, A2, A3, A4 and B are 1 edge away from A. B1, B2, B3 and C are 2 edges away from A.

Continuing this thinking, we could get to $k(k-1)^2$ nodes within three steps, $k(k-1)^3$
within four steps, and so on until we have $k(k-1)^{d-1}$ nodes at a distance of d
steps. In a connected random graph, the maximum number of accessible nodes,
$k(k-1)^{d-1}$, at a distance d must be equal to the total number of nodes, N. We
therefore get:

$$N = k(k-1)^{d-1}$$

Solving this for d, the average distance between nodes, we get:

$$\ln N = \ln k + (d-1)\ln(k-1)$$

$$d - 1 = \frac{\ln N - \ln k}{\ln(k-1)}$$

$$d = 1 + \frac{\ln N - \ln k}{\ln(k-1)}$$

This formula gives us the average distance between nodes on a random graph
in which each node has k connections. We are able to do this only with random
graphs because we require any two nodes to be equally likely to be directly
connected. This makes for convenient mathematics, but how applicable is it?
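
As a quick numerical check, the relation $N = k(k-1)^{d-1}$ can be solved for d by taking logarithms and then substituted back. The function name and the sample values below are hypothetical.

```python
from math import log


def average_distance_estimate(n, k):
    """Solve n = k * (k - 1) ** (d - 1) for d by taking logarithms."""
    return 1 + (log(n) - log(k)) / log(k - 1)


# Hypothetical example: 48 nodes with 3 links each.
d = average_distance_estimate(48, 3)
print(d)                     # very close to 5
print(3 * (3 - 1) ** (d - 1))  # substituting back recovers roughly 48
```

Here 48 = 3 · 2⁴, so the estimate should come out to five steps.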

Let’s say that the six billion or so people of our world were randomly connected,
with each person having 1,000 acquaintances. Using these figures, each of a
person’s acquaintances would have a $\frac{1,000}{6,000,000,000} = \frac{1}{6,000,000}$ chance of knowing any
of the other acquaintances of that person. This might seem odd, because
most people have friends who are friends with one another. This suggests a
level of structure in human connections that is more than random. Obviously,
our connections are not as regular as a lattice; no one is assigned a given
number of acquaintances from birth. The random meeting on the street, or the
friendships that develop out of any number of unforeseen difficulties, suggest
that the networks that we experience as humans are not overly-structured and
yet not completely random either; they fall somewhere in between. This type
of network is significantly more difficult for mathematicians to explain, but
meaningful progress has been made. What mathematicians have found, which
we might intuitively guess were we to run into a classmate from kindergarten
while on vacation in Antarctica, is that we live in a small world.

SECTION 11.4
SMALL WORLD NETWORKS

• Six Degrees
• It’s a Small World
• There and Back Again

Six Degrees
• The idea that there are, at most, six degrees of separation between any two
people has its roots in an experiment by Stanley Milgram.

When we meet someone for the first time, we often search for some sort
of common ground upon which we can build a conversation and, possibly, a
friendship. Often, this common ground is a place, or a type of music, or a friend.
If you’ve ever played the name game with a new acquaintance and found that you
have a friend in common, you’ve experienced what both romantics and network
theorists call a “small world.” This concept engenders a feeling that our human
world is not as cold and random as it might seem on the surface.

The small-world concept implies that we are all connected through chains
of acquaintances. This is often expressed in the famous “six degrees of
separation” theory: the idea that we are, at most, six handshakes away
from anybody on the planet. This is actually a variant of the classic gangster
expression, “I know people who know people.” The six degrees of separation
idea has been popularized through a play, a movie,
and a game based on connecting movie actors to Kevin Bacon. This popular
concept suggests that all of us are more interconnected than it may seem.

Where did this idea come from? How true is it? How can it be expressed
mathematically? The first person to study small worlds in any sort of scientific
way was the Harvard social psychologist Stanley Milgram. Milgram became
fairly well known in the field of social theory in 1963 for a series of experiments
in which he measured how likely people were to obey an authority figure, even
if it meant inflicting pain on another person. He found that the more degrees of
separation there were between the victim and the person inflicting the pain, the
more likely it was that the inflictor would follow orders resulting in harm—even
death—to the victim. This sets up the natural question of how many degrees of
separation exist between people in the real world.

To study this question, Milgram sent letters to random people in Omaha and
Wichita and asked them to forward the letters to a certain person in Boston,
whom they did not know. However, they were given specific direction in how to
go about this. The instructions were to send the letter to a person with whom
they were on a first-name basis, a friend who they thought would have a better
chance of knowing the intended recipient. Most of the letters never arrived at
their destinations, but of the ones that did, it took an average of six forwards to
get there. This was the origin of the “six degrees of separation” theory.

The accuracy of the six-degrees story is debatable, but the small world that
it implies is very real. A small-world network is one in which most nodes are
not directly connected to each other and yet the average path between most
nodes is relatively short. It is a sort of middle ground between highly ordered
lattice-type networks and the random networks of Erdös and Rényi. Let’s look at
this idea of a small world a bit more closely.

It’s a Small World


• Average distance is relatively easy to compute for well-understood graphs,
such as ring lattices and random graphs.
• Small-world graphs have average distances that generally fall somewhere
between those found in a ring-lattice and those found in a random graph.

A ring lattice.

Imagine that the six billion people of the world are arranged in a giant circle.
Furthermore, let’s say that each has 1,000 acquaintances, specifically, the 500
people to the left and the 500 people to the right. This idea presents a highly
ordered network known as a ring lattice. We can perform a version of Milgram’s
experiment in this world by selecting one person in the ring and asking that
person to send a letter to the person directly opposite them.

Figure labels: B is the 500th person to the right of A; C is the 500th person to the right of B.

To do this, the sender should give the letter to the 500th person on the right, and
this person, in turn, should then give it to the 500th person on the right (we must
assume that everyone in the circle is facing inward) and so on. Traveling in this
manner, in chunks of 500 people, how many connections will it take for the letter
to arrive in the hands of the person opposite the sender?

The intended recipient is approximately 3 billion people away from the original
sender. The letter traverses 500 people per connection, so it should take
$\frac{3,000,000,000}{500}$, or 6,000,000, connections for the letter to arrive. This sort of world
obviously has significantly more than six degrees of separation.
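
The same arithmetic can be checked on a ring small enough to search directly. In this sketch, `ring_hops` is the division argument above (rounded up), and `ring_lattice` and `hops` build and search a miniature ring; all three names, and the 20-person example, are hypothetical.

```python
from collections import deque


def ring_hops(n, reach):
    """Hand-offs needed to reach the opposite person on a ring of n people
    when each hand-off can cover at most `reach` places (rounded up)."""
    return -(-(n // 2) // reach)


def ring_lattice(n, reach):
    """Each person knows the `reach` people on either side."""
    return {i: [(i + step) % n for step in range(-reach, reach + 1) if step != 0]
            for i in range(n)}


def hops(adj, start, goal):
    """Fewest hand-offs from start to goal, by breadth-first search."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return dist[node]
        for neighbor in adj[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return None  # goal not reachable


print(ring_hops(6_000_000_000, 500))    # 6000000
print(hops(ring_lattice(20, 3), 0, 10))  # 4, matching ring_hops(20, 3)
```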

Of course our world is not as structured as this ring-lattice world. We are
certainly free, for the most part, to associate with whomever we like. Let’s look
at the opposite extreme, the completely random world. Now, in the last section,
we already reasoned that the world is not completely randomly connected.
However, looking at this case in a little more detail will help us come to a better
understanding of the small-world idea.

A randomly connected world.

Once again assuming a world population of six billion and that each person has
1,000 acquaintances, we can find out how many steps it would take for a letter to
travel from any sender to any other randomly selected recipient. Recall from the
last section that, in a random graph, the number of nodes at a distance of d steps
satisfies $N = k(k-1)^{d-1}$. Solving for the average distance, d, between nodes gives
the formula:

$$d = 1 + \frac{\ln N - \ln k}{\ln(k-1)}$$

where N is the number of nodes, and k is the number of links per node.
Substituting our values for N (6,000,000,000) and k (1,000), we have:

$$d = 1 + \frac{\ln(6 \times 10^9) - \ln(10^3)}{\ln(10^3 - 1)}$$

which computes to approximately 3.3 connections.

This implies that it would take a little more than three connections on average
to pass a letter from one person to any other person if our world were randomly
connected.

We have seen that an orderly, ring-lattice world would have six million degrees
of separation whereas a totally random world would have a little over three. The
six degrees of separation that Milgram found suggests that the real world is
far closer to the random world than to the ring lattice, though it is not
entirely random.

There and Back Again
• The clustering coefficient is a measure of how likely it is that a node’s
neighbors are connected to each other.
• Networks in nature tend to exhibit a high degree of clustering.

This idea of degrees of separation is simply another way to talk about the
average path length, also known as the characteristic path length, of a graph.
In general, random networks have short characteristic path lengths; ordered
networks have relatively long characteristic path lengths (the greater the order,
the longer the average path); and the characteristic path lengths of small-world
networks tend to fall somewhere in between. The other chief measure that
becomes important in studying and classifying these types of networks and their
graphs is the clustering coefficient.

The clustering coefficient is a measure of how many nodes share common
connections to other nodes. For a particular node, it is defined as the fraction
of the possible connections among that node’s neighbors (its “neighborhood”)
that actually exist. In other words, it quantifies how many of one’s friends are
also friends with each other. The clustering coefficient can be found for a
particular node, or an average value can be calculated to give a clustering
coefficient for an entire network.

For example:

[Figure: an example graph in which v’s neighbors are marked.]

The clustering coefficient of vertex v is 2/3, because there are three possible
connections among v’s neighbors, but only two of these are realized. Following
the same method, the clustering coefficients of vertices w, x, and y are 2/3,
1/3, and 1/3, respectively. The average clustering coefficient for this graph
would then be:

\[
\frac{\frac{2}{3} + \frac{2}{3} + \frac{1}{3} + \frac{1}{3}}{4} = \frac{2}{4} = \frac{1}{2}
\]
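
The same counting procedure can be sketched in a few lines of Python. The four-node graph below is a hypothetical example (it is not the graph in the figure), and treating a node with fewer than two neighbors as having coefficient 0 is a convention assumed here.

```python
from itertools import combinations

def clustering_coefficient(adj, node):
    """Fraction of possible links among node's neighbors that actually exist."""
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        return 0.0  # assumed convention: no pairs of neighbors to compare
    possible = k * (k - 1) // 2
    realized = sum(1 for a, b in combinations(neighbors, 2) if b in adj[a])
    return realized / possible

def average_clustering(adj):
    """Mean of the individual clustering coefficients over all nodes."""
    return sum(clustering_coefficient(adj, v) for v in adj) / len(adj)

# Hypothetical graph with edges a-b, a-c, a-d, b-c, c-d.
adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"a", "c"},
}

for v in sorted(adj):
    print(v, round(clustering_coefficient(adj, v), 3))
print("average:", round(average_clustering(adj), 3))
```

Here node a has three neighbors with two of the three possible links present, so its coefficient is 2/3, just as for vertex v in the text.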

A clustering coefficient of 1 indicates that all of a node’s neighbors are
connected to each other. A clustering coefficient of 0 indicates that none of the
nodes in the neighborhood share common connections. The ring-lattice world
has a large degree of clustering. If we lived in this world, we would share 499 of
the same friends as our neighbor. This puts our individual clustering coefficient
at close to 1 (499/500). Because all nodes are identical in this world, the average
clustering coefficient would be equal to the individual value.

[Figure caption: Note that most nodes share connections with neighbors.]

Going back to our random world situation, each of your connections would
have a one-in-six-million chance of being connected to another one of your
connections. This means that the individual clustering coefficient is virtually
zero. Consequently, the clustering coefficient of the network (the average
clustering coefficient over all nodes) is also close to zero.

A small-world network has both a short characteristic path length and a
clustering coefficient somewhat greater than that of a random network. We can
imagine creating a small-world network by starting with our ring-lattice world
and randomly disconnecting and reconnecting people.

Each random connection connects local clusters, thereby reducing the
characteristic path length. The more random connections we make, the shorter
the average path length becomes, because, using our mailed-letter example,
a letter could take shortcuts and leap far more than the 500 people it was
constrained to in the ring-lattice world.
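
The disconnect-and-reconnect idea can be tried out on a small scale. The following is a rough sketch only: the lattice size (200 nodes, 3 neighbors on each side), the rewiring probabilities, and the rule that one end of a chosen edge is moved to a uniformly random new partner are all assumptions made for illustration, not details given in the text.

```python
import random
from collections import deque
from itertools import combinations

def ring_lattice(n, k):
    """Each node is linked to its k nearest neighbors on each side."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, k + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj

def rewire(adj, p, rng):
    """Copy adj, moving one end of each original edge with probability p."""
    nodes = list(adj)
    new = {v: set(nbrs) for v, nbrs in adj.items()}
    edges = sorted((a, b) for a in adj for b in adj[a] if a < b)
    for a, b in edges:
        if rng.random() < p:
            candidates = [v for v in nodes if v != a and v not in new[a]]
            if candidates:
                c = rng.choice(candidates)
                new[a].discard(b)
                new[b].discard(a)
                new[a].add(c)
                new[c].add(a)
    return new

def average_path_length(adj):
    """Mean shortest-path length over all reachable ordered pairs of nodes."""
    total = pairs = 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

def average_clustering(adj):
    """Mean clustering coefficient (nodes with fewer than 2 neighbors count as 0)."""
    total = 0.0
    for v, nbrs in adj.items():
        k = len(nbrs)
        if k >= 2:
            links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
            total += links / (k * (k - 1) / 2)
    return total / len(adj)

rng = random.Random(0)
lattice = ring_lattice(200, 3)
for p in (0.0, 0.05, 1.0):
    g = rewire(lattice, p, rng)
    print(p, round(average_path_length(g), 2), round(average_clustering(g), 2))
```

Running the loop for a few values of p gives a feel for how the two measures respond as more of the lattice is randomized.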

Using the measures of characteristic path length and clustering coefficient as
guideposts enables mathematicians to begin to classify and understand the vast
range of network structures that lie between the relatively well-understood
random and lattice networks. These “middle” types of networks are more
representative of the organizational systems found in nature, which tend to
be more randomly organized than lattices and more structured than random
networks.

Networks in the natural world tend to have a fair amount of clustering,
combined with a bit of randomness in their connections. One hypothesis as to
why this is so is that random networks are susceptible to adverse consequences
caused by random interruptions, such as when a node or edge is removed. This
is what happens when a gene mutates or a single power line fails because it
comes into contact with an overgrown tree.

In a random network, such interruptions, also called “deletions,” tend to
increase the characteristic path length, thereby making the network less
effective at transmitting signals. This occurrence is related to the fact that
random networks transition very quickly from a group of separate connected
components, in which the characteristic path length is infinite because the graph
is not connected, to a fully connected graph. Remember, this phenomenon
was demonstrated in the button example in the previous section. If path length
decreases rapidly as we add edges, it makes sense to assume that it will
increase just as rapidly as we reverse direction and begin to remove edges.

In a highly clustered network, most nodes are connected in groups, so removing
one node does little to change the characteristic path length of the entire
network. Random deletions are more likely to take out inconsequential nodes
than ones that are critically connected.

[Figure: a clustered network with a crucial link labeled. Note that removing a
random edge is not likely to disconnect the clustered components.]

Now that we have been introduced to the basic concepts of random networks,
ring-lattices, and small-world networks, we have some idea of the ways in
which mathematicians can analyze and say meaningful things about network
structures. The story does not end here, however. There are many possible
network structures that, organizationally, fall somewhere between order and
randomness. To sort these out further, we will have to increase the resolution
of the tools that we use to classify them. In the next section we will see how
analyzing the distribution of connections among nodes can lead to greater
mathematical understanding of networks.


SECTION 11.5

SCALE-FREE NETWORKS
• Power Laws
• Airline Maps

Power Laws
• The distribution of connections per node of a random graph follows a bell
curve.
• Scale-free networks exhibit a power-law, or “fat-tail,” distribution.

The Internet is one of the most important and influential man-made networks
to arise in modern times. Like the phone networks that preceded it, it has
connected people across vast distances and has done much to make our world
seem smaller. By connecting libraries, universities, and schools with more and
more people, the World Wide Web has greatly facilitated the flow of information
around the globe.

Because the Web is open to anybody, it consists of hundreds of billions of pages
all connected via differing numbers of hyperlinks. In 1999, physicist
Albert-László Barabási and his colleagues at the University of Notre Dame in
Indiana set out to map the connectedness of the Web. They constructed a program,
called a crawler, to traverse the Web, collecting linkage data from the sites that
it came across, operating much like modern search engines. They expected to
find that most pages had about the same number of links, as would be the case
in a randomly constructed network. What they found was somewhat surprising.

Item 3227 / Tamara Munzner, Eric Hoffman, K. Claffy, and Bill Fenner, VISUALIZING THE GLOBAL TOPOLOGY OF
THE MBONE (1996). Courtesy of Munzner, Hoffman, Claffy, and Fenner.


Random networks have a certain, predictable, distribution of connections among
their nodes. Because the process that creates them is indiscriminate, the
majority of nodes tend to end up with about the same number of connections.
There are, of course, always a few nodes that end up with significantly more
connections than the majority, as well as a few nodes that end up with
significantly fewer connections than the majority. Consequently, the distribution
creates a bell curve when graphed with the number of connections represented
on the horizontal axis and the number of nodes with that number of connections
represented on the vertical axis.

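[Figure: Binomial Distribution. A bell curve with the mean and one standard
deviation (σ) on either side marked. Horizontal axis: degree of nodes; vertical
axis: fraction of nodes.]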

The peak of this curve is the mean number of connections per node in the
random network. The exact value of the mean is the total number of connections,
counted once at each node they touch, divided by the total number of nodes.
Barabási expected the results of his web-crawler search to demonstrate a similar
distribution, with a mean value determined by the overall number of pages and
links.
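
To see the bell-shaped degree distribution emerge, one can scatter edges at random and tally how many connections each node receives. This is a sketch under assumed sizes (2,000 nodes and 10,000 undirected edges, both hypothetical); because each undirected edge touches two nodes, the mean degree here is twice the number of edges divided by the number of nodes.

```python
import random
from collections import Counter

def random_graph_degrees(n, m, rng):
    """Place m distinct edges uniformly at random among n nodes; return each node's degree."""
    degree = [0] * n
    edges = set()
    while len(edges) < m:
        a, b = rng.sample(range(n), 2)      # two different nodes
        edge = (min(a, b), max(a, b))
        if edge not in edges:               # skip duplicate edges
            edges.add(edge)
            degree[a] += 1
            degree[b] += 1
    return degree

rng = random.Random(42)
n, m = 2000, 10000                          # hypothetical sizes
degree = random_graph_degrees(n, m, rng)
mean = sum(degree) / n                      # equals 2 * m / n by construction
histogram = Counter(degree)

print("mean degree:", mean)
for k in sorted(histogram):
    # one '#' per 10 nodes with exactly k connections
    print(f"{k:3d} {'#' * (histogram[k] // 10)}")
```

The text histogram printed at the end is a crude stand-in for the bell curve described above: most nodes land near the mean, with few far above or below it.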

What Barabási found was that the vast majority of web pages in his sample had
very few links, while a few pages had the majority of the links. When graphed,
the degree distribution looked like this:

[Figure: degree distribution of the sampled web pages. Horizontal axis: degree
of nodes; vertical axis: fraction of nodes.]

This distribution pattern is quite different from the bell curve that arises
in random networks. It roughly follows what is known as a “power law.”
In a power-law distribution, the number of nodes with a given number of
connections is proportional to the number of connections, raised to a negative
exponent.

\[
P(k) \sim k^{-\gamma}
\]

where P(k) is the fraction of nodes of degree k and γ (gamma) is an exponent that
determines the “fatness” of the tail of the distribution curve. Barabási found a
value of γ of about 2.2, that is, an exponent of about -2.2, in his 1999 Internet
survey.
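
One standard way to recognize a power law, offered here as a supplementary observation rather than as part of the 1999 survey, is to take logarithms of both sides. Writing the proportionality with an assumed constant C:

```latex
\[
P(k) = C\,k^{-\gamma}
\quad\Longrightarrow\quad
\ln P(k) = \ln C - \gamma \ln k
\]
```

On a plot with logarithmic axes, such a distribution appears as a straight line with slope −γ; with the exponent quoted above, the slope would be about −2.2. A bell curve, by contrast, bends sharply downward on the same axes.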

What are the qualitative features of networks that follow power-law
distributions? Recall that random networks have very little structure and
small-world networks have a fair amount of clustering. Power-law-type networks
are characterized by a few highly connected nodes that serve as hubs and many
nodes with only a few connections. This explains the shape of the graph.

To consider a specific example, a power-law network might have one node with
1,000 connections, two nodes with 250 connections each, three nodes with 111
connections each, . . ., and k nodes with $1000\left(\frac{1}{k}\right)^{2}$
connections each.

A convenient feature of graphs related to power-law distributions is that, for
a given distribution, they look the same no matter what scale one chooses to
examine. So, if we looked at only 1/10 of the nodes in this network, thus shifting
the scale of our observations, we would find that one node has 100 connections;
two nodes have 25 connections each; three nodes have 11 connections each;
and k nodes have $100\left(\frac{1}{k}\right)^{2}$ connections each. The
distribution graph of this view would take the same shape as that of the larger
network. The same exact structure appears, regardless of our chosen scale. This
phenomenon is similar to what we observed with fractals in the unit on dimension.
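
The arithmetic behind this example is easy to check. The short sketch below applies the rule from the text (k nodes, each with the top value times (1/k)²) at both scales; the function name and the choice to list only the first five groups are assumptions made for illustration.

```python
def connections(top, k):
    """Connections held by each of the k nodes in group k: top * (1/k)**2."""
    return top / k**2

full = [connections(1000, k) for k in range(1, 6)]    # the whole network
tenth = [connections(100, k) for k in range(1, 6)]    # the one-tenth view

for k, (f, t) in enumerate(zip(full, tenth), start=1):
    print(f"k={k}: {f:8.1f} {t:7.1f}  ratio {f / t:.1f}")
```

Every group shrinks by the same factor when the scale changes, which is why the two distribution curves have the same shape.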

Airline Maps
• Scale-free networks are identifiable by the existence of a small number of
well-connected hubs.
• “Rich get richer”-type processes often lead to scale-free networks.

To get a sense of what a scale-free network looks like, imagine a map of airline
routes.

Item 2812 / Muh-Tian Lee. IMAGE (2001). Courtesy of NASA/Virtual Skies at http://virtualskies.arc.nasa.gov.
This is a route map for a major airline. Notice that it has major hubs in the cities of Newark, NJ and Houston, TX
and also a minor hub in Cleveland, OH.

Most major airlines have a few busy hubs through which most of their routes
pass. There are a greater number of medium-sized airports, each with fewer
flights to and from them. Then there are the small airports, of which there are
substantially more, but which have substantially less air traffic. Finally, there
are a great number of tiny, municipal airports, which provide almost no major
carrier service. This is a classic example of a scale-free network.


Item 3128 / United States Department of Transportation – Federal Highway Administration. NATIONAL HIGHWAY
SYSTEM MAP (2002). Courtesy of United States Department of Transportation – Federal Highway Administration.

The airline route map can be contrasted with a standard road map. The
distribution of connections on the road map follows a bell-shaped curve. That is,
most cities have one major highway that connects them to the network, whereas
a few cities have more than one major connection, and a few cities lie well off
the beaten path, at some distance from a major highway.

Scale-free networks exhibit interesting distributions of clustering coefficients.
The well-connected hubs tend to have lower clustering coefficients than those
of the less-well-connected nodes. This situation arises because each node
that connects to a hub creates as many potential neighborly connections as
there are nodes that are already connected. The more neighbors, the more
potential connections, which tends to lower the clustering coefficient. In simple
mathematical terms, as the denominator of the fraction increases, the value of
the fraction decreases.

By contrast, the nodes with fewer connections have fewer potential neighborly
connections, so the ones that do exist contribute strongly to the clustering
coefficient. By examining both the exponent of the power-law distribution and
the shape of the clustering coefficient distribution, one can separate and classify
scale-free networks in new ways.
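
To make the denominator argument concrete, the clustering coefficient of a node can be written as a formula. The symbols k (the number of neighbors), e (the number of links that actually exist among those neighbors), and C (the coefficient) are labels introduced here for illustration.

```latex
\[
C = \frac{e}{\binom{k}{2}} = \frac{2e}{k(k-1)}
\]
```

For k = 3 the denominator is 3, matching the three possible connections among v’s neighbors in the earlier example. For a hub with k = 1,000 the denominator is 499,500, so even several thousand links among the hub’s neighbors would leave C close to zero.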

How scale-free networks arise in the real world is somewhat interesting as well.
Recall that Barabási assumed that most web pages had about the same number
of links. In the absence of any contradicting evidence, this hypothesis was as
good as any. When he found, however, that some pages served as extremely
well-connected hubs, he searched for a reason that this might be the case.
He hypothesized that hubs with more connections were more desirable links
because they provided access to a greater number of other nodes. This became
known as the “rich get richer” phenomenon, which applies not only to the
Internet but also to human social networks. People with more acquaintances
tend to meet more people than do those with fewer acquaintances. Hence, those
with bigger clusters of friends tend to grow bigger clusters of friends. Barabási
called this “preferential attachment” and showed that it tends to generate
scale-free networks.
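
A “rich get richer” process can be simulated in a few lines. The growth rule below is a simplified sketch in the spirit of preferential attachment, not necessarily the exact model Barabási studied: it assumes that nodes arrive one at a time and that each newcomer makes a single link to an existing node chosen with probability proportional to that node’s current number of links.

```python
import random
from collections import Counter

def preferential_attachment(n, rng):
    """Grow a network to n nodes; return the degree of every node.

    Each new node links to one existing node, picked with probability
    proportional to the existing node's current degree.
    """
    degree = [1, 1]          # start from two nodes joined by one link
    endpoints = [0, 1]       # every node appears once per link it holds
    for new in range(2, n):
        target = rng.choice(endpoints)   # degree-proportional choice
        degree[target] += 1
        degree.append(1)
        endpoints.extend([target, new])
    return degree

rng = random.Random(7)
degree = preferential_attachment(5000, rng)
counts = Counter(degree)

print("largest hub degree:", max(degree))
print("nodes with a single link:", counts[1])
print("mean degree:", sum(degree) / len(degree))
```

The trick of keeping a list with one entry per link endpoint makes the degree-proportional choice a plain uniform pick, at the cost of some extra memory.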

Discussing the mechanisms by which scale-free networks arise suggests an
interesting question: What is to be done about networks that change with
time? Up until this point, we have paid lip service to some of the processes
by which networks can be created, but our analyses have tended to measure
aspects of networks only after they have settled into a static state. This, of
course, is a limited view of how real networks evolve. We are always making
new acquaintances and losing touch with old ones. Web pages pop in and
out of existence all the time. In assuming that networks are static, we are
missing a significant portion of the picture. The study of networks in nature, of
ecosystems, sheds some light on how and why we should think about networks
that change with time.


SECTION 11.6

ECOSYSTEMS
• Links in the Food Chain
• Unintended Consequences

Links in the Food Chain
• A food chain or food web is a graphic way of representing predator-prey and
symbiotic relationships that exist in ecosystems.

For most of this unit we have been focusing mainly on physical, human-made
networks, such as our power grid, the Internet, and the nation’s highway
system. We have also looked briefly at intangible networks, such as webs of
social connections. Until now, we have neglected a particular group of networks
that are more fundamental and important than any of those created by people:
ecosystems.

One common aspect of ecosystems is the food chain. A food chain describes
how energy gets transferred through a chain of organisms, beginning with
photosynthetic microorganisms such as algae, to consolidate in apex predators,
such as a great white shark, and then to be dispersed by scavengers, only to
re-enter the system at the bottom again.

[Figure: a marine food chain linking the sun, phytoplankton, zooplankton, crab,
small fish, otter, and shark.]


A food chain provides a convenient way of obtaining a rough approximation of
what happens in an ecosystem. A better approximation is available through
the food web. Food webs take into account that most members of ecosystems
interact with more than just one other member, or neighbor. In a food web,
nodes represent species, and edges represent predator-prey relationships, or
alternatively, mutually beneficial, or symbiotic, relationships.

Item 2919 /Neo Martinez, UNITED KINGDOM TROPHIC WEB (2005). Courtesy of Neo Martinez
A food web for a British forest.

Item 2920 / Neo Martinez, CARIBBEAN REEF TROPHIC WEB: IMAGE 1 (2005). Courtesy of Neo Martinez.
A food web for a Caribbean reef.

Food webs are examples of directed graphs, because certain relationships
are “one-way streets.” Sharks, for example, may eat otters, but otters do not
usually eat sharks. Such a relationship would be represented by an edge that
has some directionality.


Alternatively, remoras are fish that tend to attach themselves to sharks and feed
off of scraps, bacteria, and feces. This is a mutually beneficial, or symbiotic,
relationship: the shark gets a good cleaning and the remora gets a free ride and
free food. Species that live in symbiosis such as this would be represented in a
graph by nodes that are connected by two edges, one traveling each way.
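
One possible way to store such a directed graph is as a set of ordered pairs. In the sketch below the species list is illustrative only, and the convention that a predator-prey edge points from the eater to the eaten is an assumption; the opposite convention (following the flow of energy) would work equally well.

```python
# Each pair (source, target) is one directed edge.
# Assumed convention: predator -> prey; a symbiotic pair gets an edge each way.
edges = {
    ("shark", "otter"),     # one-way street: sharks may eat otters
    ("otter", "urchin"),    # otters prey on sea urchins
    ("shark", "remora"),    # symbiosis, first direction
    ("remora", "shark"),    # symbiosis, second direction
}

def is_mutual(edges, a, b):
    """True when the graph has edges in both directions between a and b."""
    return (a, b) in edges and (b, a) in edges

def out_neighbors(edges, node):
    """All species that node has an edge pointing to."""
    return {target for source, target in edges if source == node}

print(is_mutual(edges, "shark", "remora"))
print(is_mutual(edges, "shark", "otter"))
print(sorted(out_neighbors(edges, "shark")))
```

Storing direction explicitly is what lets the same structure hold both the one-way shark-otter relationship and the two-way shark-remora relationship.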

Unintended Consequences
• Networks in nature are constantly changing.
• Understanding how networks respond to disruption requires that we view
them as dynamic structures, rather than as static structures.

Ecosystems in nature exhibit dynamic equilibrium; predator and prey
populations are constantly changing in response to one another. For this reason,
any realistic model has to incorporate some sort of dynamics. It is critical to
study what happens when certain nodes become diminished in their influence
or are removed entirely from a network. Because ecosystems are typically
made up of many different species that interact in complicated ways, the
consequences of removing one or more nodes can be hard to predict.

A famous example of the unpredictable consequences of removing a key node
from an ecosystem occurred on the West Coast of North America in the 19th
century. Throughout the 1800s, Russia controlled what is now Alaska and had
considerable influence along the entire west coast of Canada and the northwest
coast of what is now the United States.


Russian traders were especially interested in the pelts of both river and sea
otters to be used in making warm clothing for withstanding the cold Russian
winters. They paid trappers very handsomely for any and all otter pelts. As a
result, the trappers scoured the rivers, streams, and coastlines for otters. By
the year 1900, the otters had been hunted to the brink of extinction, effectively
removing them from the ecosystem of which they were a well-connected
member.

Whenever a species is removed or disappears from a network, its prey tend
to benefit, and its predators tend to suffer. This causes ripple effects that can
rapidly spread to affect other nodes (species) in different ways. In the case at
hand, otters prey heavily on sea urchins. With the otters out of the picture from
the over-hunting, the sea urchin population began to boom up and down the
coast.

[Figure: an urchin eating kelp.]


As it turns out, a favorite food of the urchin is kelp, a form of algae that grows
into large stalks, creating underwater forests that serve to hide and protect all
manner of other organisms, especially juvenile fish. The exploding population
of sea urchins feasted voraciously on the kelp, especially upon the vulnerable
spots where the stalks anchor to rocks. Under pressure from the increased
urchin predation, the kelp forests very rapidly began to disappear, and along
with them the precious juvenile fish habitat.

With diminishing cover, the young fish were especially vulnerable to predation.
This eventually led to the collapse of certain fisheries along the coast. These
consequences were ultimately attributable to the removal of the otters from the
ecosystem. When governing authorities realized what had happened, otters
became a protected species. They have since slowly regained some of their
numbers, which has in turn resulted in the rejuvenation and expansion of some
of the kelp forests along the coast.

The difficult task of understanding the many different interactions in an
ecosystem is made even more difficult by variations in complexity. Some
ecosystems are quite simple, such as those found at high elevations, where only
a few of the hardiest, best-adapted species can survive. Other ecosystems, such
as those found in tropical rain forests, may have millions of member species
and are extraordinarily complex. A major question in ecology is whether or not
complexity in an ecosystem increases its stability.

It might seem obvious that, the more nodes and edges a network has, the
less likely it will be that the entire network or a large portion of it falls into
dysfunction at the removal of a random node. However, as we saw in our
discussion of random, small-world, and scale-free networks, different
structures behave differently when randomly disrupted. Recall that removing
nodes from a randomly connected network tends to lead rapidly toward
disconnection.

On the other hand, removing a few nodes from a scale-free network usually has
little effect, due to the presence of its highly connected hubs. Removing a hub,
however, can be catastrophic.


Do real ecosystems behave as random graphs, small worlds, or scale-free
networks? Real-world food webs tend to combine qualities of all of these
types of structure. For example, the idea of a keystone species, a species whose
presence or absence directly and strongly affects the stability of the entire
system, is closely related to the highly connected hubs of scale-free networks.

One final note: because species play very different roles in their ecological
networks, their form and behavior are often closely related to their
connectedness. This is why, for example, when snorkelling you will commonly
see many small and medium-sized fish, less commonly a few large fish, and
very rarely a shark. The same goes for terrestrial creatures. Deer sightings
are a quite common occurrence all over the country, but visual reports of bears,
wolves, and mountain lions are relatively rare. A chief reason for this is that
being a large predator requires expending a large amount of energy hunting
herbivores and growing the teeth and claws required to kill and eat them.

At each step in a food chain or web, a certain amount of energy is lost. Sunlight
falls on autotrophs, which convert it to sugar with a certain efficiency through
the process of photosynthesis. Nonetheless, not all of the sun’s energy gets
converted. The creatures that consume these primary producers convert their
sun-made sugars into body mass via enzymatic processes that have a certain
efficiency. However, not all of the “sun energy” stored in the autotrophs is
captured. Consequently, after passing through just two levels of the food web,
the energy that started with the sun is only a fraction of what it was when it
arrived on the surface of the earth. The larger an animal’s mass, the more
energy it has consumed. Because the amount of energy that strikes the earth
is fixed, this means that there should be fewer large animals than small ones.
Furthermore, because large predators are a step above large herbivores in the
hierarchy, it stands to reason that there should be still fewer of them.
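
A quick calculation shows how fast the available energy shrinks from level to level. The numbers below are purely illustrative assumptions: the text gives no efficiencies, real efficiencies vary widely from step to step, and the single 10 percent figure used here is only a convenient round value.

```python
def energy_by_level(incoming, efficiency, levels):
    """Energy available at each successive level when a fixed fraction is passed on."""
    energies = [incoming]
    for _ in range(levels):
        energies.append(energies[-1] * efficiency)
    return energies

# Hypothetical numbers: 1,000,000 units of sunlight, 10% passed on at each step.
names = ["sunlight", "autotrophs", "herbivores", "predators"]
for name, energy in zip(names, energy_by_level(1_000_000, 0.10, 3)):
    print(f"{name:>10}: {energy:>12,.0f}")
```

Under these assumed figures, only a thousandth of the original energy is left by the time it reaches the predators, which is consistent with the argument that large predators should be rare.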

Understanding how the different species with which we share our planet
interact requires an understanding of how the structure of networks affects the
roles and importance of the network members or elements. Networks such
as ecosystems are constantly changing, putting pressures on the species that
comprise them to adapt or die. In this sense, dynamic networks can be thought
of as one of the fundamental engines of evolutionary change.

UNIT 11 at a glance
student textbook

SECTION 11.2

The Study of Connections
• Euler’s solution to the Bridges of Königsberg problem showed how to
analyze a real-life situation in terms of connections.
• The existence of an Eulerian path or cycle on a graph depends on the degree
(number of connections) of each node.
• A graph is a mathematical structure consisting of a set of elements and a set
that defines the connections between them.
• Graph theorists are concerned with a number of graph properties, such as
connectedness, connected components, and diameter.
• Graphs can be directed or undirected, weighted or unweighted.

SECTION 11.3

RANDOM NETWORKS
• There are multiple ways to define a random network.
• As edges are added randomly to a collection of nodes, groups of connected
components become larger, resulting in a “connectivity avalanche.”
• The average distance is one way to classify different types of graphs.

SECTION 11.4

SMALL WORLD NETWORKS
• The idea that there are, at most, six degrees of separation between any two
people has its roots in an experiment by Stanley Milgram.
• Average distance is relatively easy to compute for well-understood graphs,
such as ring lattices and random graphs.
• Small-world graphs have average distances that generally fall somewhere
between those found in a ring lattice and those found in a random graph.
• The clustering coefficient is a measure of how likely it is that a node’s
neighbors are connected to each other.
• Networks in nature tend to exhibit a high degree of clustering.


SECTION 11.5

SCALE-FREE NETWORKS
• The distribution of connections per node of a random graph follows a bell
curve.
• Scale-free networks exhibit a power-law, or “fat-tail,” distribution.
• Scale-free networks are identifiable by the existence of a small number of
well-connected hubs.
• “Rich get richer”-type processes often lead to scale-free networks.

SECTION 11.6

ECOSYSTEMS
• A food chain or food web is a graphic way of representing predator-prey and
symbiotic relationships that exist in ecosystems.
• Networks in nature are constantly changing.
• Understanding how networks respond to disruption requires that we view
them as dynamic structures, rather than as static structures.


BIBLIOGRAPHY

WEBSITES
http://www.foodwebs.org/

PRINT
Barabási, Albert-László. Linked: The New Science of Networks. Cambridge, MA:
Perseus Publishing, 2002.

Brose, U., E.L. Berlow, and N.D. Martinez. “Scaling Up Keystone Effects
From Simple to Complex Ecological Networks,” Ecology Letters, vol. 8, no. 12
(December 2005).

Brose, U., R.J. Williams, and N.D. Martinez. “Comment on Foraging Adaptation
and the Relationship Between Food-Web Complexity and Stability,” Science, vol.
301, no. 5635 (August 2003).

Buchanan, Mark. Nexus: Small Worlds and the Groundbreaking Science of
Networks. New York: W.W. Norton and Company, 2002.

Colinvaux, Paul. Why Big Fierce Animals Are Rare: An Ecologist’s Perspective.
(Princeton Science Library) Princeton, NJ: Princeton University Press, 1978.

Erdős, Paul and Alfréd Rényi. “On the Evolution of Random Graphs,” Publications
of the Mathematical Institute of the Hungarian Academy of Sciences, Series A 5
(1960).

Goh, Kwang-Il, Eulsik Oh, Hawoong Jeong, Byungnam Kahng, and Doochul Kim.
“Classification of Scale-Free Networks,” Proceedings of the National Academy of
Sciences, USA, vol. 99, no. 20 (October 2002).

Hartsfield, Nora and Gerhard Ringel. Pearls in Graph Theory: A Comprehensive
Approach. San Diego, CA: Academic Press, 1990.

Hill, S., D. Agarwal, R. Bell, and C. Volinsky. “Building an Effective
Representation for Dynamic Networks,” Journal of Computational and Graphical
Statistics, vol. 25 (2006).

Latora, V. and M. Marchiori. “Is the Boston Subway a Small-World Network?”
Physica A, vol. 314, no. 1 (1 November 2002).

Liben-Nowell, D., J. Novak, R. Kumar, P. Raghavan, and A. Tomkins.
“Geographic Routing in Social Networks,” Proceedings of the National Academy of
Sciences, USA, vol. 102, no. 33 (2005).

Martinez, N.D. “Effects of Resolution on Food Web Structure,” Oikos, 66 (1993).

Martinez, N.D. “Scale-Dependent Constraints on Food-Web Structure,” American
Naturalist, 144 (1994).

Milgram, S. “The Small-World Problem,” Psychology Today, vol. 1, no. 1 (1967).

Montoya, José M., Stuart L. Pimm, and Ricard V. Solé. “Ecological Networks and
Their Fragility,” Nature, 442 (20 July 2006).

Newman, James R. Volume 1 of The World of Mathematics: A Small Library of the
Literature of Mathematics from A’h-mose the Scribe to Albert Einstein, Presented
with Commentaries and Notes. New York: Simon and Schuster, 1956.

Newman, M.E.J., D.J. Watts, and S.H. Strogatz. “Random Graph Models of
Social Networks,” Proceedings of the National Academy of Sciences, USA, vol. 99,
Supplement 1 (2002).

Proulx, Stephen R., Daniel E.L. Promislow, and Patrick C. Phillips. “Network
Thinking in Ecology and Evolution,” Trends in Ecology and Evolution, vol. 20, no. 6
(2005).

Schechter, Bruce. My Brain Is Open: The Mathematical Journeys of Paul Erdős.
New York: Touchstone (Simon and Schuster, Inc.), 2000.

Skyrms, Brian and Robin Pemantle. “A Dynamic Model of Social Network
Formation,” Proceedings of the National Academy of Sciences, USA, vol. 97, no. 16
(August 1, 2000).

Springer, A.M., J.A. Estes, G.B. van Vliet, T.M. Williams, D.F. Doak, E.M. Danner,
K.A. Forney, and B. Pfister. “Sequential Megafaunal Collapse in the North Pacific
Ocean: An Ongoing Legacy of Industrial Whaling?” Proceedings of the National
Academy of Sciences, USA, vol. 100, no. 21 (October 2003).

Steinberg, P.D., J.A. Estes, and F.C. Winter. “Evolutionary Consequences of Food
Chain Length in Kelp Forest Communities,” Proceedings of the National Academy
of Sciences, USA, vol. 92 (1995).

Tannenbaum, Peter. Excursions in Modern Mathematics, 5th ed. Upper Saddle
River, NJ: Pearson Education, Inc., 2004.

Wallis, W.D. A Beginner’s Guide to Graph Theory. New York: Birkhäuser Boston,
2000.

Watts, Duncan. Six Degrees: The Science of the Connected Age. New York: W.W.
Norton and Co., 2003.

Watts, Duncan. Small Worlds: The Dynamics of Networks Between Order and
Randomness. (Princeton Studies in Complexity). Princeton, NJ: Princeton
University Press, 1999.

Watts, D.J. and S.H. Strogatz. “Collective Dynamics of Small-World Networks,”
Nature 393 (1998).

Williams, R.J., E.L. Berlow, J.A. Dunne, A.-L. Barabási, and N.D. Martinez.
“Two Degrees of Separation in Complex Food Webs,” Proceedings of the National
Academy of Sciences, USA, vol. 99, no. 20 (2002).

LECTURE
D’Souza, R.M. “The Science of Complex Networks.” CSE Seminar at University
of California - Davis, February 2006. http://mae.ucdavis.edu/dsouza/talks.html
(accessed 2007).

TEXTBOOK
Unit 12
UNIT 12
IN SYNC
TEXTBOOK

UNIT OBJECTIVES

• The mathematical study of spontaneous synchronization is an emerging field in the
study of nonlinear dynamics.

• Spontaneous synchronization occurs in a wide variety of human, biological, and
mechanical systems.

• Mathematical descriptions of synchronization require systems of coupled
differential equations.

• Differential equations relate quantities and their rates of change.

• Differential calculus makes it possible to deal mathematically with non-constant
rates of change.

• Entities that synchronize can be modeled as coupled oscillators.

• The Kuramoto model is a solvable system of coupled differential equations that can
represent multiple related oscillators.

• The Millennium Bridge incident represented the interaction of the worlds of both
biological and mechanical synchronization.
As far as the laws of mathematics refer to
reality, they are not certain; and as far as they
are certain, they do not refer to reality.

Albert Einstein

SECTION 12.1
INTRODUCTION

The interplay between the abstract world of mathematics and the real
world is not as straightforward as it might seem at first. While it is true that
mathematics can be used to make sense of and make testable predictions
about certain real-life situations, such as solar eclipses, there are many types
of natural phenomena, such as the turbulence of fluids in motion, for which
current mathematical models are inadequate. Situations such as turbulence
represent the frontier of how mathematics can be used to help us understand
reality. Coming to an understanding of turbulence is challenging because of the
complexity and dynamism of moving fluids. Turbulence involves the combined
behavior of trillions of particles of fluid, each of which is subject to many types of
forces and interactions. While mathematics can be used to describe the behavior
of a single particle relatively comprehensively, the behavior of a group of
associated particles is well understood only under certain, sometimes contrived,
conditions.

How can we make progress in understanding large, complex, dynamic systems?
It helps to start with certain special cases that lend themselves more readily than
others to analysis and quantification with the currently available mathematical
tools. An understanding of the behavior of a system in these special cases
can then provide hints regarding the behavior of the system in more general
situations. This is a common strategy in applied mathematics: First, find
intriguing special cases that lend themselves readily to study and explanation,
then explore how the results can be generalized. Spontaneous synchronization
is one such special case of complicated dynamic phenomena. Understanding
the mathematics of how, and under what circumstances, entities can come into
synchronization with one another provides a starting point for exploring the vast
world of nonlinear dynamics.

Our world is filled with all sorts of phenomena that amaze us with their regularity
and baffle us with their complexity. For example, how is it that a school of fish
can, seemingly simultaneously, all turn on a dime at a mere hint of a nearby
predator? How is it that very large groups of East Asian fireflies, and some other
varieties as well, when left to their own devices, spontaneously synchronize
their flashes? How do the individual cells that make up your heart contract
in a coordinated rhythmic fashion to keep your blood flowing? Even a system
as simple and seemingly unrelated as an inanimate pair of grandfather clocks
can exhibit a kind of synchronous behavior. It is clear that synchronization is a
phenomenon that can be found in many different contexts.

The art of mathematical modeling involves identifying a few simple and
quantifiable assumptions about a given system (or systems) of study that
actually give rise to a good approximation of the phenomenon of interest.
Mathematically capturing the complex, dynamic phenomena of the real world
is a gargantuan task and is an area in which there is much opportunity for the
advancement of our understanding. The study of synchronization represents
one of the outposts on the frontier of this vast, unexplored territory.

In this chapter, we will begin by looking at some examples of natural
phenomena that exhibit fascinating coordinated and synchronous behavior.
Then we will learn a bit about the available mathematical tools that are useful
in our quest to understand these phenomena, namely differential equations and
calculus, the mathematics of change. From there we will investigate how one
particular mathematical model of a system of coupled oscillators can be used
to help us understand complex coordinated behavior. We will then be prepared
to take a more in-depth look at a couple of examples from the realms of biology
and physics to see how the study of synchronization is an example of using
mathematics to describe the real world.

SECTION 12.2
UNIT OVERVIEW

• The phenomenon of synchronous behavior occurs in many different
situations, ranging from the intentional synchronization of a symphony to
the inevitable synchrony of orbiting planets.

When we listen to an orchestra, we are often impressed by how well the
musicians can play together, each individual contributing to a whole that is
almost always something very different from the individual parts. From a
group of musicians playing individual parts, a complex, coordinated piece
emerges. The mechanism for this particular synchronization is not hard to
understand: The conductor keeps time and cues the musicians to play “in sync”
with each other.

Marching bands are another example of synchronous behavior. Their synchrony
is somewhat more complicated than that of an orchestra in that the marching
musicians move together in addition to playing music together. To play in sync
with each other, they take their cues from a conductor, as the orchestra does.
To move in sync with one another, however, they must take their cues from each
other. The marchers judge their position and velocity relative to their neighbors.
This may seem like a lot to think about for the band members, and it is.
Consequently, it would be tempting to conclude that synchronous, coordinated
behavior requires a conscious mind, but humans are definitely not the only ones
who exhibit synchronous behavior.

Item 2923/Chris Clark, COLLEGE MARCHING BAND IN FORMATION (2006).
Courtesy of iStockphoto.com/Chris Clark.

Flocking behavior in birds and schooling behavior in fish are two examples of
synchrony in the animal world. Watch a flock of pigeons flying and you are
likely to see them make remarkably sharp turns, all at the same time. The entire
flock can change direction seemingly simultaneously and without running into
each other. The same is true for a school of fish, darting, turning, splitting, and
re-uniting to evade a predator. Both flocks of birds and schools of fish exhibit
this sort of coordinated motion, what we have been calling synchronous
behavior, without a leader or “conductor” whose actions tell the group what to
do. Rather, each individual pays attention to its immediate neighbors
and makes small adjustments in speed and spacing to maintain the cohesion
of the group. This phenomenon of groups of individuals who each follow local
relationship rules results in the whole group seemingly acting as one. It would
then be reasonable to surmise that coordinated, synchronous behavior requires
some higher level of brain function, at least at a level that enables an individual
subconsciously to follow an innate set of rules about distance and speed.

Item 3098/Tammy Peluso, SWIRL OF FISH COLLEGE (2007).
Courtesy of iStockphoto.com/ Tammy Peluso.

Item 3099/Tammy Peluso, SCHOOL OF TREVALLIES (2007).
Courtesy of iStockphoto.com/ Tammy Peluso.

Even this conjecture, however, falls apart when we consider another example
from the animal world. Certain species of fireflies in Southeast Asia exhibit
extraordinary synchronous behavior. By the thousands they are able to
synchronize the rhythmic flashing of their abdomens so that they all flash at the
same time. They seem to accomplish this naturally and spontaneously without
any leader showing the way. They accomplish this synchronization despite the
fact that each individual firefly’s brain can’t hold a candle to the processing
power of a bird’s or fish’s brain.

Item 2429/Fletcher & Baylis / Photo Researchers, Inc., SYNCHRONOUS FIREFLIES (PTEROPTYX TENER) FLASHING
ON THE MANGROVE TREES (SONNERATIA CASEOLARIS) AT KUALA SELANGOR, MALAYSIA (2008). Courtesy of
Photo Researchers, Inc.

Synchronous behavior obviously occurs among simpler animals, but what about
at a sub-organism level? How about between cells? An individual cell has no
brain, and yet our bodies are made up of trillions of individual cells, each of
which functions—during states of health—in life-sustaining harmony with the
others. A great example of this is heart pacemaker cells. Pacemaker cells are
the key rhythm keepers that govern how and when the heart contracts. These
cells display a great degree of spontaneous synchronous behavior; indeed, if
they didn’t, none of us would be here to observe it! Each pacemaker cell has
an innate cycle of building and releasing electrical charges that ultimately
stimulate the cells of the heart to contract or relax. In isolation, pacemaker
cells keep their own rhythm. When one pacemaker cell is placed in proximity to
another pacemaker cell, however, something remarkable occurs. They maintain
their separate rhythms for a brief period and then naturally fall into sync
with one another, both building and releasing charges at the same time. This
phenomenon has no leader guiding it and no processor, such as a brain, to make
judgments about what the neighbors are doing.

At this point we could still argue that the phenomenon of synchronous behavior
requires some sort of living thing. Although it doesn’t need a leader, or even
brains, perhaps it results from some basic principle of biology.

Of course, by now we should not be surprised that this is not the case. All
sorts of non-biological systems can spontaneously synchronize, creating order
where we might expect to see chaos. We can see this in the heavens, in the
tidal locking of our moon, a case of two cycles (an orbit and a rotation)
becoming synchronized so that we always see the same side of the moon when
we look from Earth. Even something as simple and mundane as a system of
two pendula, little more than weights attached to the ends of sticks, will exhibit
spontaneous synchronization when both are connected to a movable platform.

Synchronization is at the heart of the study of how order emerges from disorder
and the rules that guide this process. Mathematics is the perfect tool to use to
study this, because it provides methods that are general enough to encompass
the commonalities in the seemingly disparate phenomena that we have looked
at so far. Using the tools of mathematics, we can start to clarify a complex
situation by making simplifying assumptions, seeing how these simple cases
behave, and then trying to generalize our findings to cases that are not so
simple. This is a common theme in mathematics, but to use this method to
understand synchrony, we will need some specific mathematical tools, namely
those that can quantify and describe things that are continuously changing.

SECTION 12.3
CALCULUS

• Get in Line
• Non-Constant Slope

Get in Line
• The slope-intercept form of a linear equation is a common way to represent
the mathematics of change.

One of the key features of algebraic mathematics is the use of symbols instead
of numbers. In algebra, we learn how to generalize and explore the rules
of arithmetic by using variables that can stand for any number. We become
less concerned with answers to specific problems and more concerned with
the relationships between the entities and values under investigation. The
advantage of this is that our analyses can be applied to a wider variety of
situations than would be possible if we restricted ourselves to using specific
numbers that apply only to a particular situation.

An example of this concern with relationships is the familiar slope-intercept
form of the equation of a line: y = mx + b. Typical high school algebra courses
reveal how this relationship can be applied to any number of situations. We
can apply the equation to the cost of painting a house, for example, by letting
y represent the total cost; x, the number of gallons of paint purchased; m, the
price per gallon of paint; and b, the fixed cost of supplies such as brushes and
buckets. The total cost can then be found by substituting real-world values for
the variables and performing the indicated operations.

In general, a linear equation expresses a relationship between the two variables,
x and y. These variables represent two values that are related in some way. In
other words, changing one leads to a change in the other. The constants of the
linear equation, m and b, help show specifically how x and y are related. These
constants are determined by the conditions of the situation that we wish to
understand and model with the equation.

[Figure: the line y = mx + b passing through the points (x1, y1) and (x2, y2). The line crosses the y-axis at b, and its slope is m = rise/run = (y2 - y1)/(x2 - x1).]

In the house-painting example mentioned above, we saw that b represents a
fixed, up-front cost. Graphically this value determines the placement of the
line on the coordinate plane. Specifically, it identifies the point at which the line
intersects the y-axis.

[Figure: three lines, y = mx + b1, y = mx + b2, and y = mx + b3, that share the slope m but cross the y-axis at the different points b1, b2, and b3.]

Many times in mathematics we have to choose what it is we care most about.
In other words, in a given situation we must decide which quantities to
de-emphasize and which to give our full focus. In our present discussion, we are
going to ignore b for the time being, because what we are really interested in is
how changes in one variable affect the other. In our painting example, the
up-front cost becomes increasingly less important as more paint is purchased, so
we should probably pay more attention to the price of paint than to those fixed,
up-front costs. Knowing the price per gallon will enable us to determine exactly
how our total cost changes as we use more or less paint. In the general case
then, m is more interesting to us right now because it lets us calculate how a
change in x will affect the value of y.

The number m compares the change in y to the change in x for a given line.
We call this ratio of changes the “slope.” If we know two points on the line,
(x1, y1) and (x2, y2), we can find the slope by taking the difference in y values
and dividing it by the difference in x values. This slope ratio is commonly
referred to as “rise over run.”

$$m = \frac{y_2 - y_1}{x_2 - x_1}$$
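
As a small illustration of “rise over run,” the sketch below computes the slope between two points on the house-painting line. The price per gallon and the cost of supplies are made-up values chosen only for the example, and the function names are hypothetical.

```python
def slope(p1, p2):
    """Return the rise-over-run slope of the line through two points."""
    (x1, y1), (x2, y2) = p1, p2
    if x2 == x1:
        raise ValueError("slope is undefined for a vertical line")
    return (y2 - y1) / (x2 - x1)


# Hypothetical values for the painting example: m = 30 dollars per gallon,
# b = 45 dollars of brushes and buckets.
PRICE_PER_GALLON = 30.0
FIXED_SUPPLIES = 45.0


def total_cost(gallons):
    """y = mx + b for the painting example."""
    return PRICE_PER_GALLON * gallons + FIXED_SUPPLIES


# Any two points on a line give the same slope.
slope_a = slope((2, total_cost(2)), (5, total_cost(5)))
slope_b = slope((1, total_cost(1)), (10, total_cost(10)))
print(slope_a, slope_b)
```

Both calls should recover the price per gallon, whichever pair of points is used, which is what it means for a line to have a constant slope.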

Slope is a useful concept because it describes how two quantities change in
relation to one another. The slope of a linear equation is constant; it never
changes. While many real-world situations can be modeled with a linear
equation, most cannot. For one thing, most real-world situations can’t be
modeled using just multiplication and addition. Equations involving powers
of variables, such as the equation for the distance covered by a falling object, don’t
lend themselves to the simple notion of a constant slope implied in the linear
equation model. Let’s look at how we can generalize the concept of slope to talk
about such non-constant rates of change.

Non-Constant Slope
• To capture the notion of rates of change that can themselves change, we
need the concept of a derivative.

In our painting example, we might arrange a deal with the paint store that the
more paint we buy, the less we pay per gallon. This means that while the total
cost increases as we buy more paint, the rate at which the total cost changes
actually decreases. Our slope is no longer constant; it, like y, depends on which
x (amount of paint) we choose to consider. To better understand how real-life
situations change, we need a more comprehensive concept of slope.
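
To make the idea of a changing slope concrete, here is a minimal sketch that assumes one particular, entirely hypothetical discount schedule (the text does not specify one). It measures the slope of the total cost between neighboring purchase sizes.

```python
def total_cost(gallons):
    """Hypothetical bulk-discount cost, intended only for 0 to 40 gallons.

    The formula is an assumption made for illustration; it is not taken
    from the text.
    """
    return 30.0 * gallons - 0.25 * gallons ** 2


def slope_between(x1, x2):
    """Rise over run of the cost curve between two purchase sizes."""
    return (total_cost(x2) - total_cost(x1)) / (x2 - x1)


early = slope_between(1, 2)    # cost of one more gallon when buying little
late = slope_between(20, 21)   # cost of one more gallon when buying a lot
print(early, late)
```

Under this assumed schedule the total cost keeps rising, but each additional gallon adds less than the one before, so the slope depends on which x we consider.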

[Figure: a curve with a straight line drawn between two of its points, (x1, y1) and (x2, y2).]

Notice that if we attempt to find the slope between two points on a curve, we
end up with a straight line that doesn’t correspond with the curve very well.
Furthermore, notice how the slope between two points on a curve changes
depending on which two points are selected.

[Figure: a curve with two such lines, one of slope m1 between (x1, y1) and (x2, y2) and one of slope m2 between (x3, y3) and (x4, y4). Note that m1 is positive while m2 is negative.]

Considering just a few examples also makes it clear that generally the farther
the two selected points are away from each other, the worse the correspondence
between the slope of the line and what is actually happening to the curve over
the chosen interval.

[Figure: a curve with a straight line drawn between two widely separated points, (x1, y1) and (x2, y2). This slope line is a poor fit!]

If we could somehow have a notion of slope between two points on a curve that
are not very far apart at all, we could practically eliminate the discrepancy
between the line determined by those points and the path of the curve. Such a
conceptual tool could help us understand mathematically all sorts of curves and
the situations they represent. To do this, we can shrink our view as far as we
wish and consider the slope between two points that are extremely close to one
another on the curve.

[Figure: a curve through the nearby points (x, y(x)) and (x+Δx, y(x+Δx)). The run is Δx and the rise is y(x+Δx) - y(x), so slope = rise/run = (y(x+Δx) - y(x))/Δx.]

On a curve, imagine a point whose horizontal position is x. Now imagine a
second point on the curve that is some very small horizontal distance, Δx, from
x. This point’s horizontal position is x+Δx. The slope between these two points
is represented by this expression:

$$\frac{y(x + \Delta x) - y(x)}{\Delta x}$$

This is the familiar “rise over run” expression indicating the rate of
change between these two very slightly separated points. If we let their
horizontal separation, Δx, approach zero, we will have an expression for the
“instantaneous” rate of change at that point on the curve. Note that we
cannot make the separation equal to zero, because division by zero is undefined.
We can, however, talk about the slope as Δx “gets arbitrarily close” to zero. This
quantity, called a derivative, is the generalized notion of slope that we need to
deal with many complicated (i.e., “curvy”) real-world models.
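
The shrinking-Δx idea can be tried out numerically. The sketch below uses y(x) = x², evaluated at x = 3; both the curve and the point are arbitrary choices made for illustration. It computes the rise-over-run expression for smaller and smaller separations.

```python
def difference_quotient(y, x, dx):
    """Slope between the points at x and x + dx on the curve y."""
    return (y(x + dx) - y(x)) / dx


def square(x):
    return x * x


x0 = 3.0
steps = [1.0, 0.1, 0.01, 0.001, 0.0001]
slopes = [difference_quotient(square, x0, dx) for dx in steps]

for dx, m in zip(steps, slopes):
    print(f"dx = {dx:<7} slope = {m:.6f}")
```

As Δx shrinks, the printed slopes settle toward a single value, which is the slope of the curve at that point. Setting dx to exactly zero would raise a division-by-zero error, mirroring the remark above.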

The derivative is a powerful mathematical tool because it allows us to describe
in great detail not only how quantities change in relation to each other, but also
how their changes change. We can now account for the vast number of
real-world phenomena that do not conform to the simple, linear notion of a
constant slope.

We don’t have the space in this text to explore how to find derivatives of
specific functions, but we’ll need to use some of them later as we attempt to
mathematically model synchronization. The following table gives a few basic
functions and their derivatives.

ORIGINAL FUNCTION        DERIVATIVE (TELLS THE SLOPE AT ANY POINT)
$5x + 6$                 $5$
$x^2$                    $2x$
$\sin 3x$                $3\cos 3x$
$\cos 7x$                $-7\sin 7x$
$e^{9x}$                 $9e^{9x}$
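
The table entries can be spot-checked numerically. This is only a sanity check under assumed sample points and step size, not a derivation: it compares each listed derivative with a symmetric rise-over-run estimate taken over a very small run.

```python
import math


def central_difference(f, x, h=1e-6):
    """Estimate the slope of f at x from two points a tiny distance apart."""
    return (f(x + h) - f(x - h)) / (2 * h)


# (label, function, derivative claimed in the table)
TABLE = [
    ("5x + 6", lambda x: 5 * x + 6, lambda x: 5.0),
    ("x^2", lambda x: x ** 2, lambda x: 2 * x),
    ("sin 3x", lambda x: math.sin(3 * x), lambda x: 3 * math.cos(3 * x)),
    ("cos 7x", lambda x: math.cos(7 * x), lambda x: -7 * math.sin(7 * x)),
    ("e^(9x)", lambda x: math.exp(9 * x), lambda x: 9 * math.exp(9 * x)),
]

SAMPLE_POINTS = [-1.0, 0.0, 0.3, 1.0]  # arbitrary test positions


def max_error(f, df):
    """Largest gap between the estimated and the tabulated slope."""
    return max(abs(central_difference(f, x) - df(x)) for x in SAMPLE_POINTS)


errors = {name: max_error(f, df) for name, f, df in TABLE}
for name, err in errors.items():
    print(f"{name:8s} largest mismatch: {err:.2e}")
```

The mismatches should all be tiny compared with the slopes themselves, which is consistent with the table.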

The derivative is one of the key ideas in differential calculus, which can
be thought of as the mathematics of change. Calculus uses the concepts of
infinite processes and infinitesimal steps to describe how changing quantities
(e.g., those that grow, shrink, move, or proliferate) vary.

Ancient Egyptian thinkers, trying to compute the volumes of various solids,
made the first strides toward this understanding. Greek mathematicians,
such as Eudoxus and Archimedes, carried on this legacy by developing the
“method of exhaustion,” which involved dealing with infinite processes. As
the West descended into the so-called Dark Ages, Indian, Arab, and Persian
mathematicians flourished, making great strides toward an understanding of
derivatives. By the late 1600s, European mathematicians were building upon
the techniques of past thinkers, using calculus-like methods to understand
physical processes. It was at this point that the traditionally held “fathers of
calculus,” Isaac Newton and Gottfried Leibniz, simultaneously put centuries’
worth of pieces together, and added many significant contributions of their own,
to form a coherent whole called “the calculus.”

Calculus provides us with the mathematical tools to deal with rates of change in
a sensible manner. But as with any discipline, the tools are only as effective as
the skill of the one who wields them. Using the tools of calculus to model real-
world situations requires the ability to see a dynamic situation and recognize
the relevant quantities and rates of change involved, and how they relate to each
other. With a grasp of the elements and relationships in play, we are better
prepared to express what is happening using equations that we can analyze to
make predictions about the future and to find new understandings of our world.

SECTION 12.4
DIFFERENTIAL EQUATIONS

• Free-Falling
• Bacterial Growth
• Solving Differential Equations

Free-Falling
• A differential equation is an expression that relates quantities and their
rates of change.
• The solution to a differential equation is not simply a number; it is a
function.

With a solid mathematical tool, calculus, in hand, we can set out to try to
understand the phenomena of the world mathematically. Let’s start with a
simple example. Imagine an object in free-fall. At any given time during its fall,
it will have some specific velocity, v. Furthermore, we intuitively know that the
longer something falls, the faster it goes. This suggests that the velocity of the
object should be expressed as a function of elapsed time, t.

To write the specific expression that will tell us the object’s velocity at any point
in time, let’s first assume that the object begins from a state of rest. This gives
us an “initial condition,” of v(0) = 0, or “the velocity at time zero equals zero.”
The velocity of the object as it falls will then be due solely to the influence of
gravity. If we multiply the time spent falling t by the acceleration due to gravity
g, which is the experimentally observed rate at which the velocity of a freely
falling object changes, we can determine the speed at which our object is falling
at any point in time:

v(t) = gt

Notice here that what interests us is not a specific value for velocity or time, but
rather the exact relationship between the two. In this example, we have a non-
constant velocity. If we take the derivative of this, we should get an expression
that tells us how fast velocity is changing. Doing this, we get:

$$\frac{dv(t)}{dt} = g$$

(This is the derivative of a linear function, like the first example in the
table of derivatives in Section 12.3. Note that dv/dt is shorthand for “the
derivative of v with respect to t.”)

This is a very simple example of what is known as a differential equation. A
differential equation is simply an equation that relates quantities with their rates
of change. In this example, we see that the ratio of the amount by which v
changes, dv, to the small amount of time, dt, in which that change occurs is
equal to a constant, g.

To solve this equation, we are looking for a function whose derivative is the
constant g. Notice that solving a differential equation does not give us a simple
number, as we would expect were we to solve the equation 10 = 4x - 2 for the
variable x. Rather, our solution to a differential equation is a function v(t). This
example is somewhat contrived because we already know that the answer will
be v(t) = gt. After all, that’s what we started with. But if we didn’t already know,
how could we figure it out?

There are a variety of methods that one can use to solve different types of
differential equations. No one method can solve every differential equation,
and there are many differential equations that can’t be solved at all. In the next
example, we’ll get a sense of the methods and thinking that go into solving
differential equations.

Bacterial Growth
• Exponential growth is a classic example of a real-world situation that lends
itself to a solvable differential equation.

Let’s look at another example, one that gives us an equation involving both
a quantity and its derivative. Imagine a single bacterium surrounded by
nutrients (perhaps it’s in a bottle of milk). Bacteria divide asexually by binary
fission, their population basically doubling at set intervals. The more bacteria
there are, the more that are “born.” This implies a rate of change, or growth,
that is not steady, unlike the constant rate at which velocity changed in the
previous example of a falling object. Furthermore, the rate of increase in the
bacteria population depends on how many there are to begin with. If there are
two bacteria initially, the first increase is by two, the second increase is by
four, the third increase is by eight, etc.

Let’s designate P(t) as the number of bacteria at any given time, t. The rate of
change in this population is then dP/dt, some small change in population over a
small change in time. The rate, dP/dt, depends on how many bacteria there are, P.
Therefore:

$$\frac{dP}{dt} = aP(t)$$

The a is just a constant that is related to the specifics of the situation: what type
of bacteria, how long it takes them to reproduce, etc. In this situation, we have
a rate of change that is directly proportional to the quantity that is changing; in
other words, we have an equation that relates a certain quantity to its derivative.
This is a classic differential equation that describes exponential growth.

We could use a process known as integration to solve this by separating the
variables, putting the parts having to do with P on one side of the equation and
the parts having to do with t on the other side. Integration and differentiation
are two of the most important concepts of calculus. Whereas differentiation
seeks to explain rates of change, integration makes sense of the accumulation
of an infinite number of tiny changes. Integration is in a very real sense the
“opposite” of differentiation, but it can be very complicated for anything but the
simplest of equations. A faster way, for our purposes, might be simply to try a
few possible solutions and see if they work.

First let’s try P(t) = at. According to our table of derivatives, dP/dt would then be
just a. Substituting both into our differential equation we would get:

$$a = a(at)$$

Since this is true only at the single instant t = 1/a, let’s try something else.

How about P(t) = sin at? dP/dt would then be a cos at and we would have:

$$a \cos at = a \sin at$$

Again, this is true only sometimes, in much the same way that a stopped clock
is right twice a day. We need something that is always true regardless of what
value of t we consider. Let’s try something else.

How about $P(t) = e^{at}$? dP/dt would then be $ae^{at}$, which is just aP(t)! This gives
us $ae^{at} = ae^{at}$, which is always true, no matter what t is. So the solution to our
differential equation is $P(t) = e^{at}$.
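
The three guesses above can also be checked by machine. In this rough sketch the value of a, the sample times, and the helper names are all assumptions made for the example. It estimates dP/dt for each candidate and measures how far it strays from aP(t).

```python
import math

A = 0.5  # hypothetical growth constant a
SAMPLE_TIMES = [0.0, 0.5, 1.0, 2.0, 3.0]  # arbitrary moments to test


def derivative(f, t, h=1e-6):
    """Estimate dP/dt from two points a tiny time apart."""
    return (f(t + h) - f(t - h)) / (2 * h)


def worst_mismatch(candidate):
    """Largest |dP/dt - a*P(t)| over the sample times."""
    return max(
        abs(derivative(candidate, t) - A * candidate(t)) for t in SAMPLE_TIMES
    )


candidates = {
    "P(t) = at": lambda t: A * t,
    "P(t) = sin(at)": lambda t: math.sin(A * t),
    "P(t) = e^(at)": lambda t: math.exp(A * t),
}

mismatch = {name: worst_mismatch(P) for name, P in candidates.items()}
for name, gap in mismatch.items():
    print(f"{name:16s} worst mismatch: {gap:.6f}")
```

Only the exponential candidate keeps the two sides of the equation in agreement at every sampled time. The other two match, at best, at isolated instants.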

In this example, we see again how the solution to a differential equation is a
function, not a number. In our example here, this function describes how to find
the population of bacteria at any point in time, even though the rate of increase
is changing. It’s a nice, simple expression that encompasses the complexity of
the situation under examination.

Solving Differential Equations
• Many differential equations are not solvable, but they can, upon analysis,
yield information about the system they represent.

In addition to integration and the “guess and check” method we just used, there
are other ways of solving differential equations (sometimes nicknamed “diff
EQs”), and they generally fall into two categories: exact and numerical methods.
Exact methods yield exact solutions, as did the function in our example above.
Numerical methods give approximations based on different algorithms. Often,
however, we can discover interesting behavior regarding our situation without
having to solve any equation. We can look at its qualitative behavior via what is
called a phase portrait, a picture that shows a system’s “phase space.”

[Figure: two phase portraits, each plotting dx/dt against x.]

Phase space is handy because it provides a way to represent all the possible
states of a system with one picture. It is a graph of the variables, such as
position and velocity, that determine the state of a system. We will talk about
phase space in more depth in Unit 13. For our purposes here, it suffices to say
that examining graphical representations of systems of differential equations
can yield a wealth of qualitative information about the system, such as whether
or not it will display cyclical or synchronous behavior.
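
To give a flavor of the numerical methods mentioned above, here is a minimal sketch of one of the simplest, commonly known as Euler’s method. It is applied to the bacterial growth equation dP/dt = aP with P(0) = 1. The growth constant, time span, and step counts are assumptions chosen for illustration, and the text does not single out this particular algorithm.

```python
import math


def euler(rate, p0, t_end, steps):
    """Approximate P(t_end) by repeatedly adding (rate of change) * (small time step)."""
    dt = t_end / steps
    p = p0
    for _ in range(steps):
        p += rate(p) * dt
    return p


A = 0.5  # hypothetical growth constant a


def growth(p):
    """Right-hand side of dP/dt = aP."""
    return A * p


T_END = 4.0
exact = math.exp(A * T_END)               # the exact solution e^(at) found earlier
coarse = euler(growth, 1.0, T_END, 10)    # a few large steps
fine = euler(growth, 1.0, T_END, 10_000)  # many small steps

print(f"exact  = {exact:.4f}")
print(f"coarse = {coarse:.4f}")
print(f"fine   = {fine:.4f}")
```

With only a few large steps the approximation falls noticeably short of the exact value. Taking many small steps brings it much closer, at the price of more arithmetic.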

Now that we have an idea how to model certain real-life situations using
equations that use both quantities and rates of change, we can tackle the issue
of how synchronization arises in nature. We are going to look at one of the most
basic and accessible types of synchronization, that of cyclical behavior.

SECTION 12.5
CYCLES

• Fireflies

Fireflies
• One of the first breakthroughs in the study of spontaneous synchronization
was in the modeling of how two oscillators that are initially out of phase with
each other can come into phase with one another.
• Two oscillators influencing one another can be modeled by a system of
coupled differential equations.
• Certain species of fireflies exhibit this synchronization property in the wild.

How is it that two fireflies, each blinking to its own rhythm, can come into sync
with each other, flashing at the same time? How do we even begin to represent
this situation mathematically?

A single firefly, if left to its own devices, will flash with some regularity. To
model this situation mathematically requires a function that has periodicity,
which simply means that it returns to the same value at regular intervals. As
we saw in our unit on the connections between music and mathematics, a
good mathematical function that models periodicity is a sinusoid. A sine wave
oscillates smoothly between one value and another. For the firefly, these two
values would be the states “on” and “off.”

It would be reasonable to model the flashing of a single firefly by looking at the
sine of theta, where theta represents where the firefly is in its flashing cycle.
The firefly flashes when θ equals zero.

Another way to think about this is to imagine a runner on a circular track.
Picture the runner traveling at a constant speed, corresponding to how quickly
the firefly charges up its flash. The flash itself corresponds to the runner
crossing the start/finish line. The angle theta then represents where the runner
is on the track in relation to the start/finish line.

So, if theta represents where the firefly or the runner is in its cycle, the
derivative of this will indicate how fast that position is changing.

$$\frac{d\theta}{dt} = \text{the rate at which } \theta \text{ changes}$$

This value is intuitively related to the frequency of oscillation: the more quickly
θ changes, the more cycles the runner, or the firefly, will complete in a given
time. Let’s call the frequency that the runner or firefly would have alone,
without any influence from others, the natural frequency, denoted by ω.
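
A lone oscillator of this kind is easy to simulate. The sketch below makes several assumptions for illustration: θ is measured in radians, ω is a rate in radians per second, and the firefly “flashes” each time θ completes a full turn and returns to zero. It advances dθ/dt = ω in small time steps and counts the flashes.

```python
import math


def count_flashes(omega, duration, dt=0.001):
    """Advance d(theta)/dt = omega in small steps and count completed cycles."""
    theta = 0.0
    flashes = 0
    for _ in range(int(round(duration / dt))):
        theta += omega * dt
        if theta >= 2 * math.pi:    # the runner crosses the start/finish line
            theta -= 2 * math.pi    # start the next lap from zero
            flashes += 1
    return flashes


# Two hypothetical fireflies, each left to its own devices for 10.25 seconds.
slow = count_flashes(omega=2 * math.pi * 1.0, duration=10.25)  # one cycle per second
fast = count_flashes(omega=2 * math.pi * 2.0, duration=10.25)  # two cycles per second
print(slow, fast)
```

Doubling ω doubles the number of flashes counted over the same stretch of time, which is the sense in which the rate of change of θ acts as a frequency.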

Things get interesting when we introduce another oscillator and consider two
fireflies, or two runners, that interact with one another. We can model each one
as an oscillator, just as we did in the single case, but because they interact with
each other, the expression is somewhat more complicated.

Because we now have two oscillators, we will have two phases (θ1 and θ2) and
two natural frequencies (ω1 and ω2) to account for. If we assume that the natural
frequencies are fixed, then we will need two equations for the two unknowns θ1
and θ2.

The first firefly has phase θ1 and frequency ω1. The second firefly has phase θ2
and frequency ω2. For both fireflies to flash in sync with one another, the two
thetas must be equal to one another. Mathematically, θ1 – θ2 must equal zero.

The phase difference, θ1 – θ2, determines the extent of “correction” each firefly
needs to make to synchronize with the other one. The necessary adjustment
varies depending on how far apart the two fireflies are in their cycles. If the
two fireflies are very far apart in their cycles, a large correction is needed. If
they are only slightly out of sync, only a slight nudge is required. However, the
situation is a bit more complex than this.

The adjustment each firefly makes can be either to slow down or to speed up
its flashes. How does it determine which to do? Consider the case of perfect
alternation, with one firefly flashing and then the other flashing at perfectly
spaced intervals. Should the one slow down or speed up to match the other? It
can speed up, basically doubling its frequency temporarily so that its next flash
coincides with the other, or it can slow down, halving its frequency, skipping the
next flash in the attempt to synchronize with the other.

If the flash of the first firefly occurs at a point in time that is less than half the
firefly’s cycle time from the second firefly’s next flash, it makes sense to speed
up. On the other hand, if it is more than half way through its cycle, it is better to
slow down and wait for the other firefly to catch up. The difference in θ is what
influences the firefly as to what to do. A function capable of modeling either a
speed up or a slow down must be able to periodically take on positive or negative
values, depending on the difference in θ. Once again this is ideally a sinusoid.
So, our mathematical model of how a firefly adjusts its flashing cycle to achieve
synchronization with another should look something like this:

First firefly:  \[ \frac{d\theta_1}{dt} \approx \sin(\theta_2 - \theta_1) \]

Second firefly:  \[ \frac{d\theta_2}{dt} \approx \sin(\theta_1 - \theta_2) \]

The sine terms should be mediated by a constant that represents how strongly
the two fireflies interact with each other. This constant can take into account
things such as distance and ambient light levels that affect a firefly’s perception.
Let’s designate this constant K1 for the first firefly and K2 for the second firefly.
Incorporating these factors yields these modified expressions:

First firefly:  \[ \frac{d\theta_1}{dt} \approx K_1 \sin(\theta_2 - \theta_1) \]

Second firefly:  \[ \frac{d\theta_2}{dt} \approx K_2 \sin(\theta_1 - \theta_2) \]

Finally, we shouldn’t forget the influence of each firefly’s natural rhythm, ω1 and
ω2 respectively.

First firefly:  \[ \frac{d\theta_1}{dt} = \omega_1 + K_1 \sin(\theta_2 - \theta_1) \]

Second firefly:  \[ \frac{d\theta_2}{dt} = \omega_2 + K_2 \sin(\theta_1 - \theta_2) \]

These two equations represent the changes that each firefly should make, based
on what the other is doing, in order to achieve synchronization. Mathematically,
these are the equations of coupled oscillators. In our study of sync, we need to
analyze the behavior of these equations to find out the various conditions under
which spontaneous synchronization can occur. This is a simple, standard model
that can be applied to many different situations in which synchronization is
observed.
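
The two coupled equations can also be explored numerically. The following is a
minimal sketch rather than the authors' own method: it steps both phases forward
with simple Euler integration, and the function name, the step size dt, the number
of steps, and the starting phases are all illustrative assumptions.

```python
import math


def simulate_two_fireflies(w1, w2, k1, k2, theta1=0.0, theta2=2.0,
                           dt=0.001, steps=50_000):
    """Euler-integrate the two coupled phase equations.

    Returns the final phase difference theta1 - theta2,
    wrapped into the interval [-pi, pi].
    """
    for _ in range(steps):
        # Each rate is the natural frequency plus the coupling correction.
        rate1 = w1 + k1 * math.sin(theta2 - theta1)
        rate2 = w2 + k2 * math.sin(theta1 - theta2)
        theta1 += rate1 * dt
        theta2 += rate2 * dt
    difference = theta1 - theta2
    # atan2 of (sin, cos) maps any angle back into [-pi, pi].
    return math.atan2(math.sin(difference), math.cos(difference))
```

For example, simulate_two_fireflies(1.0, 1.0, 0.5, 0.5) starts two identical
fireflies 2 radians apart and returns a phase difference very close to zero. If the
natural frequencies differ by much more than k1 + k2, the difference keeps drifting
instead of settling.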

Recall that synchronization is defined to be the condition in which both
oscillators are in phase. Mathematically, this occurs when:

\[ \theta_1 = \theta_2 \quad\text{or}\quad \theta_1 - \theta_2 = 0 \]

We can let φ = θ1 – θ2 to introduce a single, convenient variable to represent
the phase difference. The change in φ, representing how the phase difference
changes, would then be:

\[ \frac{d\varphi}{dt} = \frac{d\theta_1}{dt} - \frac{d\theta_2}{dt} \]

Substituting the two fireflies’ equations from above, and noting that
sin (θ2 - θ1) = –sin φ while sin (θ1 - θ2) = sin φ, we get:

\[ \frac{d\varphi}{dt} = \omega_1 - \omega_2 - (K_1 + K_2)\sin\varphi \]

What this equation tells us, via φ and dφ/dt, is that the fireflies’ synchronization
with one another is based on the difference in their natural frequencies,
ω1 - ω2, and how that difference compares to the strength of the signals they
send and receive from each other, K1 + K2, also called the coupling strength. If
the magnitude of the difference in frequency is less than the coupling strength,
the fireflies will spontaneously synchronize. If the difference is too great, they
will go on flashing at their individual rates.
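
This condition can be made concrete with a short worked step (a sketch, assuming
K1 + K2 is positive and writing φ* for a phase difference that no longer changes, a
symbol that does not appear in the original). The phase difference holds steady
when its rate of change is zero:

```latex
\[
0 = \omega_1 - \omega_2 - (K_1 + K_2)\sin\varphi^{*}
\quad\Longrightarrow\quad
\sin\varphi^{*} = \frac{\omega_1 - \omega_2}{K_1 + K_2}
\]
\[
\text{A solution exists only if}\quad
\lvert \omega_1 - \omega_2 \rvert \le K_1 + K_2 .
\]
```

Because the sine never exceeds 1 in magnitude, no steady phase difference exists
once the frequency gap is larger than the coupling strength. Note also that unless
ω1 = ω2, the steady difference φ* is not exactly zero: in this simple model the
fireflies settle on a common rate with a constant offset between their flashes.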

This is a relatively straightforward model of potentially synchronous behavior
with two oscillators. Real-world systems, however, are often made up of many
oscillators. In the next section, we will explore how to expand our model to deal
with more-complicated systems such as these.

SECTION 12.6
Many Oscillators and Biological Sync
• Kuramoto
• Beyond Bugs

KURAMOTO
• The Kuramoto model mathematically captures the behavior of systems of
many coupled oscillators.
• Unlike other models of this type, the Kuramoto model is solvable.

We’ve now seen one possible way to model the rather complicated process of
two individual fireflies coming into sync with each other. The mechanism by
which this happens is based on each firefly being aware of the other’s cycle and
making modifications in its own cycle to match it. Synchronization between
these fireflies would not be possible were it not for this visual communication
taking place.

It’s interesting to think of this from the firefly’s perspective. At some level,
the firefly is aware of what its neighbor is doing and can, intentionally or not,
adapt its own cycle to match. With only one neighbor, this may not seem like a
big deal, but what about when there are two neighbors? How does our model
change if there are more than just two oscillators? In reality, synchronous
flashing has been observed in groups of many thousands of fireflies. If we
want our model to be as accurate and useful as possible, we must find a way
to generalize our model of coupled oscillators to account for synchronization
within groups of many oscillators.

One such model was developed by Yoshiki Kuramoto at Kyoto University in the
1970s. In considering large groups of oscillators, it makes things significantly
easier to assume that every oscillator affects each of the others equally. In
the context of a group of biological oscillators, such as fireflies, one could
reasonably expect that fireflies that are further away will actually have less
influence than fireflies that are closer. This geographical/spatial factor is
ignored in the Kuramoto model. This provides an example of how it is often
necessary to make simplifying assumptions about a situation in order to create
an understandable, workable model. Doing so provides a foothold from which
we can then explore what happens as that model is modified.

What is remarkable about the Kuramoto model is that it is a potentially infinite
set of nonlinear, coupled differential equations, and yet it can be solved exactly.
The general model itself resembles our system of two equations from the
previous section:

\[ \frac{d\theta_i}{dt} = \omega_i + \frac{K}{N} \sum_{j=1}^{N} \sin(\theta_j - \theta_i), \qquad i = 1 \ldots N \]

This form uses summation notation to compactly state a system of N differential
equations, one for each oscillator. What it says is that the change in phase for
a specific oscillator (the ith oscillator) depends on both its natural frequency,
ωi, and the sum of the influences of the other oscillators. These influences are
each related to the difference in phase between the ith oscillator and each other
oscillator taken individually, which is why the sum is over j oscillators, even
though the equation gives the behavior of the ith oscillator. Furthermore, the
amount of influence that each other oscillator has on the ith one, K, is divided
evenly by the total number of oscillators, N.
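
The summation can be written out directly in code. The following is a minimal
sketch, not Kuramoto's analytical solution: it integrates the N equations with
simple Euler steps. The function names, the step size dt, and the number of steps
are illustrative assumptions, and order_parameter is an added helper (not part of
the text above) that summarizes how tightly the phases are bunched.

```python
import cmath
import math


def order_parameter(thetas):
    """Return r between 0 and 1; r = 1 means every phase is identical."""
    return abs(sum(cmath.exp(1j * t) for t in thetas)) / len(thetas)


def simulate_kuramoto(omegas, thetas, K, dt=0.01, steps=2000):
    """Euler-integrate d(theta_i)/dt = omega_i + (K/N) * sum_j sin(theta_j - theta_i)."""
    n = len(thetas)
    thetas = list(thetas)  # work on a copy so the caller's list is untouched
    for _ in range(steps):
        # Rate of change for oscillator i: natural frequency plus the
        # average pull from every oscillator j (the j = i term is zero).
        rates = [
            omegas[i] + (K / n) * sum(math.sin(tj - thetas[i]) for tj in thetas)
            for i in range(n)
        ]
        thetas = [t + rate * dt for t, rate in zip(thetas, rates)]
    return thetas
```

As a usage example, ten oscillators with natural frequencies
[1.0 + 0.01 * i for i in range(10)] and starting phases [0.5 * i for i in range(10)]
end up tightly bunched when K is 2.0, while with K set to 0.0 each simply runs at
its own rate and the phases stay spread out.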

The Kuramoto model can be used to explain many different biological
phenomena because of its simplicity and the fact that it can be solved. Systems
of nonlinear, coupled differential equations can only rarely be solved exactly.
Solutions to the Kuramoto model resemble somewhat our conclusions from the
two-oscillator model, most notably the finding that spontaneous synchronization
occurs depending on the relationships between differences in natural frequency
and the strength of the interaction between oscillators.

In the realm of biology, there are many examples of situations in which the
Kuramoto model is applicable. We’ve already seen how it applies to fireflies,
and there are a couple of other fairly common yet fascinating examples from the
biological world.

Crickets and frogs communicate with cyclic sound much as fireflies do with
cyclic light. In some parts of the country, the night-time soundscape is full
of the chirps of crickets and the chorus of frogs croaking. Sometimes these
sounds can spontaneously synchronize within a species in a process that is
similar to how fireflies synchronize their flashes.

Beyond Bugs
• Heart pacemaker cells exhibit spontaneous synchronization in their firing of
electrical impulses.

Biological synchronization is by no means limited to insects and amphibians,
however. The cells that make up the human heart’s natural pacemaker, the
rhythm keeper that controls the electrical signals that cause the heart to pump,
display a propensity for spontaneous synchronization. Each cell can be thought
of as an individual oscillator, in much the same way that a firefly can, but with a
few key differences.

Recall that with the firefly, we modeled the cycle of its flashes as a smooth
sinusoidally varying function. A heart cell’s electrical firing is better modeled as
a pulse. The voltage across a cell builds slowly until it reaches some threshold;
at that point the cell discharges most of its voltage rapidly.

[Figure: SINE WAVE and PULSE WAVE, each plotted as height h against time t]
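
The build-and-discharge cycle of a single cell can be sketched in a few lines. This
is an illustration under stated assumptions, not the model used by any particular
researcher: the charging law dv/dt = charge_rate - leak * v, the reset to exactly
zero, and all parameter names and values are hypothetical choices made for the
example.

```python
def integrate_and_fire(charge_rate=2.0, leak=1.0, threshold=1.0,
                       dt=0.001, t_end=5.0):
    """Return the list of times at which a single model cell fires.

    The voltage v rises according to dv/dt = charge_rate - leak * v
    and is reset to zero the moment it reaches the threshold.
    """
    v, t, firing_times = 0.0, 0.0, []
    for _ in range(int(round(t_end / dt))):
        v += (charge_rate - leak * v) * dt  # slow build-up of voltage
        t += dt
        if v >= threshold:                  # threshold reached: the cell fires
            firing_times.append(t)
            v = 0.0                         # rapid discharge, modeled as a reset
    return firing_times
```

With the default values the cell fires at evenly spaced times, and plotting v
against t would give a repeating pulse shape rather than a smooth sine wave.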

Each cell has a form of communication with its neighbors via the voltages that
discharge. When one cell fires, it kicks up the voltages of its neighbors so that if
they are close to their firing threshold, they fire. This has a synchronizing effect
on all the nearby cells that were approaching their firing threshold when the
first one fired. Cells that were not close to firing get knocked further out of sync
with the others.

At first glance, it might seem that this would lead to disorganized behavior
among some cells and organized behavior among others. What actually
happens is that as certain cells near their firing threshold, voltage begins to leak
out in small amounts, to be absorbed by the neighboring cells. This leakage
would have little effect if there were only one or two cells, but in a group of
thousands, the leakage has a homogenizing effect on the average voltage across
each cell. In time, this leads to synchronization of the entire system, not just
particular groups of cells.

Cells that build up charge and then discharge precipitously are not modeled well
by the Kuramoto model. Math that involves sharp changes often gets tricky.
These issues were successfully tackled, however, by Charlie Peskin at New York
University in 1975. He was able to show mathematically how synchronization is
possible for the entire cardiac firing system.

We have been talking mainly about cyclical synchronization up to this point, but
there are other forms of spontaneous order that arise in nature, such as flocking
and schooling. Believe it or not, even traffic congestion/flow often results in
spontaneous order. The models for these phenomena are not as simple as the
Kuramoto model, but the basic mechanism is the same. Spontaneous order
emerges naturally in systems in which the individuals communicate with each
other in some fashion and make small group-adaptive changes based on those
signals. What’s fascinating is that these individuals need not be organisms,
and the signals exchanged can be much simpler than a cricket’s chirp or the
voltage spikes of the heart’s pacemaker cells. Let us now turn our attention to
synchronization of inanimate objects.

SECTION 12.7
Mechanical Sync
• Metronomes
• The Millennium Bridge

Metronomes
• Non-biological oscillators can spontaneously synchronize provided they
have a mechanism for exchanging signals (i.e., transferring kinetic energy).

At the beginning of this unit, we caught a glimpse of the variety of situations in
which synchronization can occur. Up until now, we have focused primarily on
sync as it occurs with living things that are able to send, receive, and interpret
signals. We hinted, however, at the fact that spontaneous synchronization is not
limited to living beings. It seems to be a fundamental phenomenon in nature,
occurring not only in the realm of biology, but also in chemistry and physics.

In fact, the first documented observations of a system coming into spontaneous
order were solidly in the realm of physics. In the 1660s, the Dutch physicist
Christiaan Huygens, known primarily for his contributions to probability,
astronomy, and optics, found himself sick in bed, as the legend goes, observing
two pendulum clocks. He noticed that no matter what configuration each
started in, they would eventually begin swinging in sync with each other.
Technically it was anti-phase sync:

[Figure: IN PHASE SYNC and ANTI PHASE SYNC]

Huygens examined the situation and found that the two clocks were both resting
on a loose, wobbly floorboard. He also noted that if the two clocks were placed
at opposite ends of the room, no such synchronization occurred. He surmised
that the motions of the two pendula transmitted tiny forces to each other via the
loose plank, subtly slowing down or speeding up the frequency of each until they
swung in anti-phase synchrony.

We can observe a similar phenomenon using a couple of metronomes. Imagine
that we have two metronomes, both set to oscillate at the same frequency.
If we place these two metronomes on a solid, fixed surface, out of phase
with each other, they will continue to oscillate out of phase with each other for
as long as we care to watch. If we place the same two metronomes on a board
that is allowed to move in a particular way, however, the situation is quite
different.

Item 1756/Oregon Public Broadcasting, created for Mathematics Illuminated, METRONOMES
(2008). Courtesy of Oregon Public Broadcasting.

Item 1757/Oregon Public Broadcasting, created for Mathematics Illuminated, METRONOMES ON A WOBBLY PLANK
(2008). Courtesy of Oregon Public Broadcasting.

If the board connecting the metronomes sits atop two cans, so that it is free
to move laterally, parallel to the motion of the metronome arms, it becomes a
connection between the two metronomes that is capable of transmitting subtle
shifts in momentum.

Imagine that the arm of metronome 1 is moving towards the left, while the arm
of metronome 2 is moving towards the right. Let’s say that metronome 2 is
closer to the right-most point in its cycle than metronome 1 is to the left-most
point of its cycle.

[Figure: metronomes 1 and 2 on the board, with LEFT and RIGHT marked]

Item 1781/Oregon Public Broadcasting, created for Mathematics Illuminated, METRONOME 2 IS ABOUT TO HIT
ITS RIGHTMOST POSITION AND 1 ABOUT TO BE PULLED SLIGHTLY TO THE LEFT (2008). Courtesy of Oregon
Public Broadcasting.

When metronome 2 reaches its right-most point and its arm reverses to start
moving to the left, it imparts to the board a small change in momentum that is
equal and opposite to the change that drives the metronome arm to the left. In
other words, it will shift the board ever so slightly to the right. This is a
consequence of Newton’s third law of motion, which states that for every action,
there is an equal and opposite reaction.

The effect of the board moving to the right is to accelerate, ever so slightly, the
arm of the left metronome towards its left-most point.

This is similar to the forces involved when you try to pull the tablecloth out from
under a setting of tall glasses. Unless you are extremely gifted and/or lucky,
you are likely to cause at least a few glasses to fall. When they fall, they will fall
in the direction opposite the movement of the tablecloth.

This is how the board allows the two metronomes to influence each other. The
net effect of the small changes transmitted from metronome 1 to metronome 2,
and vice versa, will be that the metronomes eventually will come to oscillate in
sync with each other.

Item 1804/Oregon Public Broadcasting, created for Mathematics Illuminated, TWO METRONOMES
ON A WOBBLY PLANK (2008). Courtesy of Oregon Public Broadcasting.

The Millennium Bridge
• The worlds of biological and mechanical synchronization came together
in the shaking of the Millennium Bridge in London at the turn of the 21st
century.

This concept of oscillators, connected by some medium that can transmit signals
between them, seems to be at the heart of the phenomenon of synchronization.
We’ve seen how sync arises in a variety of contexts, both biological and
mechanical. In our final example, we will see how sync occurred in a system
composed of both biological and mechanical elements.

The Millennium Bridge was constructed across the River Thames in London in
the late 1990s to commemorate the beginning of a new millennium in the year
2000.

Item 3228/Alexander Hafemann, LONDON THAMES MILLENNIUM BRIDGE (2007). Courtesy of iStockphoto.com/
Alexander Hafemann.

On the day the bridge was opened to the public, crowds of people assembled to
walk across the newest landmark in the city. As the bridge filled with people,
something remarkable, and somewhat frightening, began to take place. The bridge
started swaying, with no observable cause. The winds were calm, and yet the
bridge began to sway with more and more severity.

Item 3229/S. Greg Panosian, LONDON ARCHITECTURE AT NIGHT (2007). Courtesy of
iStockphoto.com/S. Greg Panosian.

Video from that day shows that, as the bridge swayed, the pedestrians began to
compensate by adopting a staggering, side-to-side gait. Moreover, groups of
them began to stagger in sync with one another, completely unintentionally.

The synchronized staggering of the people, begun as a response to the initially
slight swaying movements of the bridge, served to amplify the oscillations until
the bridge swayed quite violently. In this case, the walking surface of the bridge
served the same function as the plank in the metronome example that we just
examined; it transmitted small changes in lateral momentum between the people
and the bridge structure, reinforcing the oscillations that had already begun. The
more the bridge shook, the more people compensated in their walking motion,
and as more people began to stagger in sync with each other, the bridge shook
more violently, creating a sort of feedback loop.

After a few days, the bridge was closed due to safety concerns and construction
crews reinforced it to prevent so much lateral flexibility. No one was injured in
the event, and it might have been written off as just an odd coincidence were it
not for mathematicians taking an interest in the phenomenon and seeing it as a
startling example of the mathematics of synchronization.

The following is an interview with Roger Ridsdill Smith, Director, Ove Arup and
Partners Ltd. and Project Director for the London Millennium Footbridge.

What was Arup's role in the design and construction of the Millennium Bridge?

Arup have been the Engineer for the bridge, from its inception to completion of the
modification works.

Arup won the international competition (over 200 entrants) in 1996 as the Engineer
in a team with Foster and Partners (Architect) and Sir Anthony Caro (Artist).

Describe what happened to the bridge on 10 June 2000.

It is estimated that between 80 000 and 100 000 people crossed the bridge during
the first day. Analysis of video footage showed a maximum of 2000 people on the
deck at any one time, resulting in a maximum density of between 1.3 and 1.5 people
per square metre.

Unexpected excessive lateral vibrations of the bridge occurred. The movements
took place mainly on the south span, at a frequency of around 0.8 Hz (the first south
lateral mode), and on the central span, at frequencies of just under 0.5 Hz and 1.0 Hz
(the first and second lateral modes respectively). More rarely movement occurred
on the north span at a frequency of just over 1.0 Hz (the first north lateral mode).

Excessive vibration did not occur continuously, but built up when a large number
of pedestrians were on the affected spans of bridge and died down if the number
of people on the bridge reduced, or if the people stopped walking. From visual
estimation of the amplitude of the movements on the south and central span, the
maximum lateral acceleration experienced on the bridge was between 200 and 250
milli-g. At this level of acceleration a significant number of pedestrians began to
have difficulty in walking and held onto the balustrades for support.

No excessive vertical vibration was observed.

The number of pedestrians allowed onto the bridge was reduced on Sunday 11th
June, and the movements occurred far more rarely. On the 12th June it was decided
to close the bridge in order to fully investigate the cause of the movements.

What is Synchronous Lateral Excitation? Briefly, how did you model
it mathematically?

The movement of the Millennium Bridge has been found to be due to the
synchronisation of lateral footfall forces within a large crowd of pedestrians on
the bridge. This arises because it is more comfortable for pedestrians to walk in
synchronisation with the natural swaying of the bridge, even if the degree of swaying
is initially very small. The pedestrians find this makes their interaction with the
movement of the bridge more predictable and helps them maintain their lateral
balance. This instinctive behaviour ensures that footfall forces are applied at the
resonant frequency of the bridge, and with a phase such as to increase the motion of
the bridge. As the amplitude of the motion increases, the lateral force imparted by
individuals increases, as does the degree of correlation between individuals. It was
subsequently determined, as described below, that for potentially susceptible spans
there is a critical number of pedestrians that will cause the vibrations to increase to
unacceptable levels.

How was the Millennium Bridge's swaying different from the swaying that
brought down the Tacoma Narrows Bridge in 1940?

The movements that occurred on the Tacoma Narrows Bridge were a resonant
response to forces exerted by wind rather than pedestrians. The pedestrian induced
forces that cause Synchronous Lateral Excitation are self-limiting because above a
certain level of movement, pedestrians stop walking.

How did Arup fix the issue with the Millennium Bridge?

Although a few previous reports of this phenomenon were found in the literature,
none of them gave any reliable quantification of the lateral force due to the
pedestrians, or any relationship between the force exerted and the movement of the
deck surface.

Arup therefore carried out tests in 3 universities, as well as crowd walking tests
on the bridge itself, in order to quantify the force exerted on the structure. Arup
then designed a system of passive dampers which are mobilized by the lateral
movements of the bridge. These dampers are arranged beneath the deck over the
full length of the bridge, as well as at the piers and at the south abutment.

In order to demonstrate that the solution performed satisfactorily, Arup carried out
a crowd test with 2000 pedestrians – the most extreme dynamic test ever carried
out on a bridge. The bridge movements were less than a sixth of the allowable
movements.

The bridge reopened in February 2002.

UNIT 12 at a glance

SECTION 12.2
UNIT OVERVIEW
• The phenomenon of synchronous behavior occurs in many different
situations ranging from the intentional synchronization of a symphony to the
inevitable synchrony of orbiting planets.

SECTION 12.3
CALCULUS
• The slope-intercept form of a linear equation is a common way to represent
the mathematics of change.
• To capture the notion of rates of change that can themselves change, we
need the concept of a derivative.

SECTION 12.4
DIFFERENTIAL EQUATIONS
• A differential equation is an expression that relates quantities and their
rates of change.
• The solution to a differential equation is not simply a number; it is a
function.
• Exponential growth is a classic example of a real-world situation that lends
itself to a solvable differential equation.
• Many differential equations are not solvable, but they can, upon analysis,
yield information about the system they represent.

SECTION 12.5
CYCLES
• One of the first breakthroughs in the study of spontaneous synchronization
was in the modeling of how two oscillators that initially are out of phase with
each other can come into phase with one another.
• Two oscillators influencing one another can be modeled by a system of
coupled differential equations.
• Certain species of fireflies exhibit this synchronization property in the wild.

SECTION 12.6
MANY OSCILLATORS AND BIOLOGICAL SYNC
• The Kuramoto model mathematically captures the behavior of systems of
many coupled oscillators.
• Unlike other models of this type, the Kuramoto model is solvable.
• Heart pacemaker cells exhibit spontaneous synchronization in their firing of
electrical impulses.

SECTION 12.7
MECHANICAL SYNC
• Non-biological oscillators can spontaneously synchronize provided they
have a mechanism for exchanging signals (i.e., transferring kinetic energy).
• The worlds of biological and mechanical synchronization came together
in the shaking of the Millennium Bridge in London at the turn of the 21st
century.


UNIT 13
THE CONCEPTS OF CHAOS

UNIT OBJECTIVES

• Chaos is a type of nonlinear behavior characterized by sensitive dependence on
initial conditions.

• Chaotic systems can be deterministic, yet unpredictable.

• Linear systems are solvable because of superposition.

• Nonlinear systems are often not solvable in an exact sense.

• Phase space is a way to find the qualitative behavior of nonlinear systems.

• Equilibrium points can be stable or unstable.

• Sensitive dependence is easily seen in certain types of iteration procedures.

• The logistic map provides an example of a system that bifurcates into chaos in a
relatively well-understood way.

• Space scientists use sensitive dependence to plan minimal-fuel routes through the
solar system.

Physicists like to think that all you
have to do is say, these are the conditions,
now what happens next?

RICHARD FEYNMAN

SECTION 13.1
INTRODUCTION

We live in a world in which seemingly insignificant details can have a great
impact. Very tiny changes in the starting conditions of a process or procedure
can have substantial, sometimes even dramatic, effects on subsequent behavior
and results. Some examples presented as evidence of this are purely anecdotal
or even theatrical—for example, the train you miss boarding by ten seconds
that ends up in a terrible crash—but others are more precise, more scientific,
even mathematical. This kind of indeterminacy may seem at odds with the usual
mathematical notion of a predictable world. Indeed, for centuries the prevailing
view of our universe was that it “runs like clockwork,” and its workings can be
mathematically and even numerically predicted from a given set of starting or
“initial” conditions. This predictability was possible, supposedly, because we can
write equations that tell us exactly (in a perfect world) what to expect, given a
set of starting circumstances. However, because we can never know anything
exactly—there is always some “error” in perception or measurement—this
earlier view of our world carried an implicit assumption that minor discrepancies
in the measurement of those beginning circumstances are of little consequence
because they should lead to only correspondingly small differences in the
predicted results. As it turns out, this view is naive. The real world is one in
which small differences in the initial circumstances of a sequence of events can
indeed have a significant effect on the final outcome. The mathematical tools that
we need to understand this sort of real-world phenomenology come from the
realm of chaos theory.

Imagine that two leaves, identical in every way (size, shape, mass, texture, etc.)
and attached as closely as possible to each other on the same tree branch, fall
at the same time. As the leaves fall, they encounter resistance from the air, with
its various eddies and small pockets of higher and lower pressure. These effects
cause the two leaves to “dance” in the air as they fall. At times they are close to
each other, and at other times they seem to be heading in opposite directions.
They finally land in two different locations, each much farther away from the
other than when they started.

How can we explain this behavior? The leaves started their descent from
virtually the same location and yet ended up far apart. How could such a small
difference in starting position lead to such a dramatic difference in final location?


In a linear world, this sort of behavior shouldn’t happen. Had the two falling
objects been apples rather than leaves, we would likely see little, perhaps no,
such disparity between their initial and final separation. The density and form
of the apples are such that the small shifting wind currents would have virtually
no displacement effect. In linear systems such as this, outcomes are always
fairly predictable if the initial conditions are known. Small differences in initial
conditions, such as the spacing between the apples on the branch, result in only
small differences in the eventual outcome, their spacing on the ground.

Leaves, however, are nothing like apples, and their behavior as they fall is
anything but easy to explain. Their flight paths are extremely sensitive to small
changes in their initial conditions. If the starting point is altered by just the tiniest
amount, the path taken by a falling leaf can be entirely different. This is
the hallmark of the mathematical concept of chaos.

The mathematics of chaos represents one prong of our endeavor to understand
the complicated world around us. This is no small task, given the diverse
complexity of our natural world—falling leaves, roiling streams, the rise and
fall of species, and of course, that most unpredictable element of nature,
the capricious weather. It is not hard to understand why the weather is so
unpredictable; it is an extensive and vastly complicated system with many
variables, all interacting in subtle ways. What’s startling to realize when studying
chaos theory is that even seemingly simple systems can behave in ways that are
difficult to predict.

In this chapter we will learn about the mathematics of chaos and how it fits into
the broader topic of nonlinear dynamics. Nonlinear dynamics can be thought of as
the study of complicated things and complicated behavior. In our previous study of
synchronization, we saw how individually complicated things, such as fireflies and
heart cells, can behave collectively in strikingly simple ways, such as oscillating
in unison. In this unit, we will see how a seemingly simple system, such as
that involving a leaf falling from a tree, can exhibit extraordinarily complicated
(i.e., difficult to predict) behavior. The broad field of nonlinear dynamics holds
much promise for the mathematical understanding of our world. Chaos theory
represents some of the first steps toward that understanding.


First, we will examine the distinction between linear and nonlinear systems.
Then, we will explore the notion of predictability. From there, we will examine
the fundamental trait of chaotic systems, namely, sensitive dependence on initial
conditions. With these notions in hand, we will consider some examples of chaos
in action.


SECTION 13.2

LINEAR VS. NONLINEAR SYSTEMS
• Linear vs. Nonlinear
• Springs and Things
• Good Behavior
• Over the Top

LINEAR VS. NONLINEAR
• Chaos is one of many behaviors that a nonlinear system can display.

Chaos theory is an often-misunderstood field of mathematics. Many people
associate chaos mathematics with the famous “butterfly flapping its wings
in China and causing a tornado in Texas” metaphor. This example is
well-meaning in that it shows the dependence of large, complicated systems on small
changes in initial circumstances. This metaphor is not terribly illuminating
regarding chaos theory, however, because the earth’s atmosphere is immensely
complicated, with many variables, and it is not too surprising that it behaves
in strange ways. Mathematical chaos is most remarkable not because it
arises in huge, complicated systems, such as that connected with our planet’s
weather, but rather because it appears to be a governing factor even in simple
systems, systems that one would think should be fairly predictable but that
instead turn out to be chaotic. So in order to observe and study chaos, we do not
need a large, complicated system; our only requirement is that our system be
nonlinear.

In high school, we learned that a linear equation is any equation of the form
y = mx + b, with m and b representing constants (such as 3 and -7) and x and
y representing variables, generally called the independent and dependent
variables, respectively. The equation is “linear” because its graph (all the (x, y)
points on the coordinate plane that satisfy the equation) is a straight line, and
also because a given change in the value of x always produces the same,
proportional change in y. A nonlinear equation is one that involves more than just
the first power of the independent variable and consequently can’t be graphed as a
simple straight line. One such example is a quadratic equation, $ax^2 + bx + c = 0$.
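
The difference between the two kinds of equation can be checked numerically. The following is a minimal sketch, not part of the original text: it uses the constants 3 and -7 mentioned above for the line, and assumes the illustrative quadratic y = x^2 (the function names and the step size 0.5 are likewise hypothetical). It takes the same step in x from several starting points and records the resulting change in y.

```python
def linear(x, m=3.0, b=-7.0):
    """y = m*x + b, with the example constants 3 and -7 from the text."""
    return m * x + b


def quadratic(x, a=1.0, b=0.0, c=0.0):
    """y = a*x**2 + b*x + c; these coefficients are illustrative assumptions."""
    return a * x**2 + b * x + c


def change_in_y(f, x, dx=0.5):
    """Change in f when x is increased by the fixed step dx."""
    return f(x + dx) - f(x)


starts = [0.0, 1.0, 2.0, 10.0]
linear_changes = [change_in_y(linear, x) for x in starts]
quadratic_changes = [change_in_y(quadratic, x) for x in starts]

print(linear_changes)     # the same change at every starting point
print(quadratic_changes)  # the change depends on where we start
```

For the line, the change in y is the same wherever we start; for the quadratic, it grows with x, which is why the graph curves.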

In our study of chaos, we will need to expand the definitions of linear and
nonlinear to include differential equations. Recall from our discussion in the
preceding chapter on spontaneous synchronization that a differential equation
is an equation that contains both variables and derivatives, or instantaneous
rates of change. A linear differential equation is an equation in which dependent
variables and their derivatives appear only to the first power. For example:

$$5\frac{dy}{dt} + 4y = 0$$

This equation is linear because y, the dependent variable (it depends on t),
occurs only to the first power, as does its derivative. This differential equation is
also linear:

$$7\frac{d^2y}{dt^2} - 3\frac{dy}{dt} + 5y = 0$$

Note that this equation contains a second derivative of the dependent variable,
but only to the first power. The following equation also is linear:

$$3yt^2 - 5t^5 + 2\frac{dy}{dt} = 0$$

Although this equation involves higher powers, they apply only to t, which is the
independent variable. The dependent variable, y, and its derivative both appear
only to the first power, which is what determines whether or not a differential
equation is linear.

Consider this differential equation:

$$7\left(\frac{dy}{dt}\right)^2 - 3\frac{dy}{dt} + 5y = 0$$

This equation contains a derivative raised to the second power, so it is classified
as nonlinear. An equation, containing derivatives or not, can be nonlinear in
other ways besides containing powers greater than one of dependent variables
or derivatives. For example, the following equations are nonlinear:

$$\sin y = 0$$

$$y + \ln y = 0$$

$$y + \frac{dy}{dt} - 8t(\ln y) = 0$$

A linear system, then, is a set of equations that express a certain physical
situation without involving terms that include a dependent variable or the
derivatives of that variable to a power greater than one. A nonlinear system is
like a linear one, except that one or more terms are nonlinear.

The distinction between linear and nonlinear systems in mathematics defines
the boundary between the relatively knowable and the frustratingly elusive.
Both types of systems can describe the dynamics of many different processes,
such as planets orbiting each other, fluctuations in animal populations, the
behavior of electrical circuits, and so on. The difference between linear and
nonlinear lies in the details of the equations that govern how these systems
interact. For systems that behave linearly, it is relatively easy to find exact
solutions that we can use to predict future behavior within the system. For
nonlinear systems, we are lucky to find any such solution. Indeed, in nonlinear
dynamics, we often have to redefine what we consider to be a solution. Before
we get to this new view of solutions, however, let’s take a closer look at the
older, linear view.

SPRINGS AND THINGS


• A mass on a spring is an example of a simple harmonic oscillator, a well-
understood linear system.
• Linear systems can be solved relatively simply because they can be broken
down into parts that can be solved separately.

If we attached a weight of mass m to the free end of a spring of strength k that is
suspended vertically from a board or the ceiling and allowed the mass to bounce
up and down, we would have what is known as a harmonic oscillator. Given an
initial displacement (either lifting the mass above or pulling the mass below
its resting position), the weight would bounce up and down until the friction of
the air and the inelasticity in the spring combine to slow
the oscillations to a stop. The position of the mass is a dynamical system and is
easily defined with this well-known differential equation:

$$m\frac{d^2x}{dt^2} + b\frac{dx}{dt} + kx = 0$$

This equation represents the balance of forces acting upon the mass. We know
that, given time, the mass will return to rest at its original position; in other
words, the forces acting to cause the oscillations must balance out to a zero
sum. The first term in the equation comes from Newton’s second law of motion,
F = ma (force equals mass times acceleration). In our equation, the mass is
represented by m and the acceleration is represented by $\frac{d^2x}{dt^2}$. The second
term is the product of the velocity of the mass, $\frac{dx}{dt}$, and some constant, b, that
represents the effect of air resistance. The final term represents the force
contributed by the contraction of the spring. This contribution is proportional
to how far the spring has been stretched: the more the stretching, the greater
the contribution. To find this contribution, we simply multiply the strength
of the spring, k, by the amount by which it is stretched, x. We add all these
contributions together and set them equal to zero. This is Newton’s second law
rearranged: the mass times its acceleration equals the sum of the resistance
force, $-b\frac{dx}{dt}$, and the spring force, $-kx$, and moving every term to one side
leaves zero on the other.

As this equation is written above, it incorporates both first and second
derivatives, making it somewhat difficult to solve directly. We can transform
the equation to one without a second derivative and, hence, one more easily
solved by performing a change of variables. To do this, we must first recognize
that $\frac{d^2x}{dt^2}$ is just the first derivative of $\frac{dx}{dt}$. If we let $x = x_1$ and $\frac{dx}{dt} = x_2$, then
$\frac{d^2x}{dt^2}$ becomes $\frac{dx_2}{dt}$. With the second derivative conveniently eliminated, we can now
write a system of equations to model our oscillator:

$$\frac{dx_1}{dt} = x_2$$

$$m\frac{dx_2}{dt} + b x_2 + k x_1 = 0$$

We can rewrite this as:

$$\frac{dx_1}{dt} = x_2$$

$$\frac{dx_2}{dt} = -\frac{b}{m} x_2 - \frac{k}{m} x_1$$

This is a linear system because all of its terms are single, first-degree variables
with constant coefficients. We need not work through the details of the solution
to this system. It is important to realize, however, that it would be some function
x(t) that describes where the mass would be at any time, t, that we choose. The
solution is an equation that can be used to determine the exact location of the
mass at any time during its oscillation.
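
Although the text does not work through the exact solution, the first-order system can be stepped forward numerically to see where the mass is at later times. The sketch below is an illustration only: it assumes the values m = 1, b = 0.2, and k = 4, a starting displacement of 1 with zero velocity, and the classical fourth-order Runge-Kutta method, none of which come from the original discussion.

```python
def oscillator_rhs(state, m, b, k):
    """Right-hand side of the first-order system.

    state = (x1, x2), where x1 is the position and x2 is the velocity.
    Returns (dx1/dt, dx2/dt).
    """
    x1, x2 = state
    return (x2, -(b / m) * x2 - (k / m) * x1)


def rk4_step(state, dt, m, b, k):
    """Advance the state by one time step dt (classical Runge-Kutta)."""
    def nudge(scale, slope):
        return tuple(s + scale * d for s, d in zip(state, slope))

    s1 = oscillator_rhs(state, m, b, k)
    s2 = oscillator_rhs(nudge(dt / 2, s1), m, b, k)
    s3 = oscillator_rhs(nudge(dt / 2, s2), m, b, k)
    s4 = oscillator_rhs(nudge(dt, s3), m, b, k)
    return tuple(
        s + dt / 6 * (a + 2 * p + 2 * q + r)
        for s, a, p, q, r in zip(state, s1, s2, s3, s4)
    )


def simulate(state, dt, steps, m, b, k):
    """Return the list of states visited, starting with the initial state."""
    path = [state]
    for _ in range(steps):
        state = rk4_step(state, dt, m, b, k)
        path.append(state)
    return path


# Assumed, illustrative values: unit mass, light air resistance, spring strength 4.
m, b, k = 1.0, 0.2, 4.0
path = simulate((1.0, 0.0), dt=0.01, steps=5000, m=m, b=b, k=k)
print(path[0], path[-1])  # position and velocity at the start and after 50 time units
```

With these assumed values the oscillation has nearly died away by the end of the run, as expected for a spring with friction.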

Because this system is linear, we could use the principle of superposition to
solve it. This principle enables us to break a system of equations into pieces
that are more easily solved, solve them, and then combine the partial solutions
to find a solution of the entire system. This is a case of the whole solution being
exactly the sum of the partial solutions. Because of the applicability of this
principle of superposition, it is relatively easy to get exact, predictive solutions
for linear systems.
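
Superposition can be checked numerically for the oscillator system. The sketch below is an illustration under assumed values (m = 1, b = 0.2, k = 4, and a simple Euler time step, all hypothetical choices): it evolves two different starting states separately, evolves their sum, and compares the results.

```python
def step(state, dt, m=1.0, b=0.2, k=4.0):
    """One Euler step of dx1/dt = x2, dx2/dt = -(b/m)*x2 - (k/m)*x1."""
    x1, x2 = state
    return (x1 + dt * x2, x2 + dt * (-(b / m) * x2 - (k / m) * x1))


def evolve(state, dt=0.001, steps=5000):
    """Apply the Euler step repeatedly and return the final state."""
    for _ in range(steps):
        state = step(state, dt)
    return state


sol_a = evolve((1.0, 0.0))     # start displaced, at rest
sol_b = evolve((0.0, 2.0))     # start at the resting position, already moving
combined = evolve((1.0, 2.0))  # start with both at once

summed = (sol_a[0] + sol_b[0], sol_a[1] + sol_b[1])
print(combined)
print(summed)  # agrees with `combined` up to rounding
```

The combined run matches the sum of the separate runs because every term of the system is first degree in x1 and x2. A system containing, say, a squared term would not pass this check.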

GOOD BEHAVIOR

• Linear systems tend toward one of four predictable behaviors.

One nice thing about linear systems is that, because they are exactly solvable,
we can categorize the types of behavior that they can exhibit. When we refer to
the behavior of a system, what we are really concerned with is the behavior of
the variables that describe the state of the system. For our oscillating spring,
the pertinent variables are the position of the mass, x, and its velocity,
dx/dt. Given these two values, we know exactly what the system is doing at any
moment. In general, the variables that describe the state of linear systems can:

1. Grow exponentially, heading toward infinity. An example of this occurs when
bacteria are allowed to grow with unlimited resources.

2. Decay exponentially, heading toward zero. A common example is the decay of
radioactive materials.

3. Cycle periodically, forever oscillating between values. An example is a
harmonic oscillator acting in the absence of friction.

4. Exhibit any combination of the above behaviors. Our mass and spring
oscillator acting with friction behaves as a combination of (3) and (2) in a sort of
decaying oscillation; it oscillates, but each swing is smaller than the preceding
one until the mass stops moving. Another example of this is the case of a
bungee jumper coming to rest at the bottom of her cord.

All four of these behaviors are nice and predictable in the linear view.
Unfortunately, most real-life systems are not so well behaved and do not fit well
into a linear model.
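
For the mass and spring equation, which of these behaviors occurs can be read off from the constants alone. The sketch below uses a standard technique that the text does not cover: substituting a trial solution of the form e^(rt) into the equation gives m r^2 + b r + k = 0, and the two roots r decide the behavior. The function names and sample values are hypothetical; the negative value of b in the last example is an artificial choice made only to produce growth.

```python
import cmath


def characteristic_roots(m, b, k):
    """Roots r of m*r**2 + b*r + k = 0, as complex numbers."""
    disc = cmath.sqrt(b * b - 4 * m * k)
    return ((-b + disc) / (2 * m), (-b - disc) / (2 * m))


def describe(m, b, k, tol=1e-12):
    """Return (oscillates, trend) for the linear oscillator equation.

    A nonzero imaginary part means the solution cycles; the sign of the
    real part decides between exponential growth and exponential decay.
    """
    roots = characteristic_roots(m, b, k)
    oscillates = any(abs(r.imag) > tol for r in roots)
    if any(r.real > tol for r in roots):
        trend = "grows"
    elif all(r.real < -tol for r in roots):
        trend = "decays"
    else:
        trend = "steady"
    return oscillates, trend


print(describe(1.0, 0.0, 4.0))   # no friction: cycles forever
print(describe(1.0, 0.2, 4.0))   # light friction: decaying oscillation
print(describe(1.0, 5.0, 4.0))   # heavy friction: decay without cycling
print(describe(1.0, -5.0, 4.0))  # artificial negative friction: growth
```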

OVER THE TOP


• A pendulum swinging outside of the small-angle approximation, where
sin θ ≈ θ, is an example of a nonlinear system.
• For small swings, a pendulum behaves predictably, but for large swings, it
can behave strangely.


Let’s look at a slightly different type of oscillator, a pendulum. This is a very
common nonlinear system. To make things easier on ourselves, let’s say our
pendulum is just a mass, m, at the end of a string (considered to have no mass)
of length L, moving under the acceleration due to gravity, g. Such a pendulum
exists only in the mind of a physicist; the arm of a real pendulum has mass and
is affected by air resistance, even when it is only a string or thread. However,
this simplified, ideal model is good for our present purposes.

The force on the pendulum mass is a balance of the tension in the string and
the acceleration due to gravity. These forces vary, depending on the angle of
the pendulum. For instance, at the bottom of the swing, gravity is completely
mitigated by the tension in the string. At the top of the swing, the tension in the
string acts in the same direction as gravity. To model these varying forces, we
need a sinusoidal function.

The acceleration in terms of the angle the pendulum makes with the vertical is
then given by:

$$\frac{d^2\theta}{dt^2} = -\frac{g}{L}\sin\theta$$

The sine term of the dependent variable makes this a nonlinear equation. To
solve this, we can make our lives easier, as we did before in the example using a
spring, by performing a change of variables. To do this, we let $\theta = \theta_1$ and
$\theta_2 = \frac{d\theta_1}{dt}$. $\frac{d^2\theta}{dt^2}$ then becomes $\frac{d\theta_2}{dt}$. Our system then becomes:

$$\frac{d\theta_1}{dt} = \theta_2$$

$$\frac{d\theta_2}{dt} = -\frac{g}{L}\sin\theta_1$$

This eliminated the second derivative, but the sine term is still there, so this
system remains nonlinear.
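
To see how much the sine term matters, we can step this system forward numerically and compare it with the linear version obtained by replacing sin θ with θ. The sketch below is an illustration only: the string length of 1 meter, the release from rest, the run time of 10 seconds, and the Runge-Kutta stepping are all assumptions, and the function names are hypothetical.

```python
import math

G = 9.81  # acceleration due to gravity, in m/s^2
L = 1.0   # assumed string length, in meters


def pendulum_accel(theta):
    """Nonlinear pendulum: d(theta2)/dt = -(g/L) * sin(theta1)."""
    return -(G / L) * math.sin(theta)


def small_angle_accel(theta):
    """Linear stand-in that replaces sin(theta) with theta."""
    return -(G / L) * theta


def simulate(accel, theta0, dt=0.001, steps=10000):
    """Runge-Kutta integration, starting at angle theta0 with zero velocity."""
    th, om = theta0, 0.0
    angles = [th]
    for _ in range(steps):
        k1_th, k1_om = om, accel(th)
        k2_th, k2_om = om + 0.5 * dt * k1_om, accel(th + 0.5 * dt * k1_th)
        k3_th, k3_om = om + 0.5 * dt * k2_om, accel(th + 0.5 * dt * k2_th)
        k4_th, k4_om = om + dt * k3_om, accel(th + dt * k3_th)
        th += dt / 6 * (k1_th + 2 * k2_th + 2 * k3_th + k4_th)
        om += dt / 6 * (k1_om + 2 * k2_om + 2 * k3_om + k4_om)
        angles.append(th)
    return angles


def max_gap(theta0):
    """Largest difference in angle between the two models over the run."""
    exact = simulate(pendulum_accel, theta0)
    approx = simulate(small_angle_accel, theta0)
    return max(abs(a - b) for a, b in zip(exact, approx))


print(max_gap(0.05))  # small swing: the two models stay close
print(max_gap(2.5))   # large swing: the two models drift far apart
```

For a small release angle the linear model tracks the pendulum closely, which is why the small-angle approximation is so useful. For a large release angle the two quickly fall out of step.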


These so-called nonlinear systems can exhibit some wild behaviors, behaviors
that might be considered surprising, behaviors that don’t fit so nicely into
equations. For example, our simple pendulum behaves very smoothly and
predictably as long as it doesn’t swing too high.

[Figure: a swinging pendulum. Panels: Small angle, Larger angle, Larger angle.]

For larger and larger angles, the range of possible behaviors is more varied than
the simple cycling back and forth. For example, if the pendulum has sufficient
momentum, it will swing past the horizontal line of the pivot and go all the way
around, over the top. If it has a little less momentum than this, it might stall
near the vertical position above the pivot, lose the tension of the string, and drop
almost straight down under the influence of gravity. Both of these behaviors
are examples of nonlinearities. It’s worth noting that for a pendulum to swing
higher than its pivot, the mass must have some initial velocity. Velocity due to
gravity alone will not suffice. Since we are only concerned with general methods
and qualitative behavior, we can ignore this.
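
The three regimes described above can be separated with an energy argument. The sketch below is not from the text: it assumes an ideal pendulum on a string with no air resistance, launched from the bottom of its swing with speed v, and uses the standard textbook thresholds v^2 = 2gL (just reaching the height of the pivot) and v^2 = 5gL (string still taut at the top). The function name and sample speeds are hypothetical.

```python
def swing_outcome(v_bottom, length, g=9.81):
    """Classify the motion of an ideal string pendulum launched from the bottom.

    v_bottom is the speed at the lowest point, in m/s; length is in meters.
    """
    v_squared = v_bottom ** 2
    if v_squared >= 5 * g * length:
        return "over the top"
    if v_squared > 2 * g * length:
        return "rises above the pivot, then the string goes slack"
    return "swings back and forth"


print(swing_outcome(3.0, 1.0))
print(swing_outcome(6.0, 1.0))
print(swing_outcome(8.0, 1.0))
```

The first case is the ordinary swinging that the small-angle equations describe best. The other two are the "stall and drop" and "over the top" behaviors.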

Some nonlinear systems do behave nicely and predictably, while others do not.
The range of nonlinear behaviors is vast, with chaos being just one type. It’s
the type that we understand the best. As we will see in the next section, our
understanding of chaos does not mean that we can make exact predictions in a
chaotic system, as we can with linear systems. In fact, to go any further in our
exploration of chaos we will have to redefine what we even mean by the term
“solution.”


SECTION 13.3

LIMITS OF PREDICTABILITY
• The Two-Body Problem
• Poincaré’s Discovery
• Phase Portraits

THE TWO-BODY PROBLEM


• Newton calculated the motion of the planets using differential equations for
two objects influencing each other with gravity.
• Given the initial conditions and the relevant equations, one can predict
where two mutually orbiting objects will be at any point in time.

A common belief toward the end of the 19th century was that mathematics can
be used to obtain an exact description of the world around us. The pinnacle of
this belief was the doctrine of determinism, which holds that if we can know the
state of the universe at one moment, and write out all the equations that govern
it, we can accurately predict its state at any other moment in the future.

Much of the impetus for this popular view came from the work of Sir Isaac
Newton in formulating both laws of motion and the mathematical techniques
of calculus that could be used to make accurate predictions based on those
laws. According to the Newtonian view, if one had the proper equations and
reasonably accurate knowledge of starting conditions, one could predict the
future behavior of a system with extreme accuracy.

This deterministic Newtonian view was, and is still, a powerful paradigm. It
fostered mathematical understanding of aspects of the world around us that
were previously inaccessible. A key example of the power of this line of thinking
was Newton’s solution of the two-body problem.

The two-body problem is a simplified version of the problem of describing the
motions of the planets. Numerous philosophers and scientists throughout the
centuries had attempted to explain planetary motion. Newton was the first to
model these motions mathematically and to make accurate predictions about
how the planets move and why.

Newton used his newly formulated law of gravitation to model the forces that
two massive bodies exert on each other. Plugging quantitative values for these
forces into his equations of motion, Newton was able to predict how the two
objects would move with respect to one another. He found a number of different
possible orbits that depended on specific conditions such as the masses of
the bodies, their separation distance, and their initial velocities. In short,
he found that any system of two orbiting bodies exhibits one of two possible
behaviors. The two bodies either settle into a periodic orbit, cycling between
positions forever, or they affect each other only briefly and then separate along
asymptotic paths, in much the same way that a meteor shoots past a planet.
According to Newton, the specific starting values of the system determined
which one of these behaviors would occur. Once the system was quantified and
put in motion, its fate was known and there were no surprises.
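
One way to see how the starting values settle the outcome is the standard energy test for two bodies, which the text does not spell out. In the sketch below, mu stands for the gravitational constant times the combined mass of the two bodies, r for their separation, and v for their relative speed. If the energy per unit mass, v^2/2 - mu/r, is negative, the bodies remain bound; otherwise they part for good. The function name, the units (chosen so that mu = 1 and r = 1), and the sample speeds are assumptions for illustration, and head-on collisions are ignored.

```python
def two_body_fate(mu, r, v):
    """Classify the relative motion of two bodies from their starting values.

    mu: gravitational constant times the combined mass of the two bodies
    r:  initial separation
    v:  initial relative speed
    """
    energy = 0.5 * v * v - mu / r
    return "periodic orbit" if energy < 0 else "brief interaction"


# In these assumed units the dividing speed is sqrt(2), about 1.41.
print(two_body_fate(mu=1.0, r=1.0, v=1.0))  # slow enough to stay bound
print(two_body_fate(mu=1.0, r=1.0, v=2.0))  # fast enough to escape
```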

[Figure: two bodies moving about their center of mass. Panels: Orbit, Orbit, Brief interaction.]

POINCARÉ’S DISCOVERY
• The three-body problem is very different from the two-body problem.
• Poincaré showed that the behavior of a three-body system cannot be
quantitatively predicted.

The solution of the two-body problem was a triumph of both science
and mathematics. It gave hope that if the heavens could be understood
mathematically, so could other aspects of life. Perhaps there was a bright
future in which much of the unpleasant uncertainty in people’s lives could be
eliminated. It was assumed that Newton’s methods could be easily extended
from a system of two massive bodies to one with three and eventually to systems
with any number of bodies. Unfortunately, the “tricks” that Newton applied to
generate an exact solution to the two-body problem are not applicable to the
three-body problem. Many of the greatest mathematical minds of the 18th and
19th centuries, including Euler and Lagrange, attempted to find a general, exact
solution. The problem of describing the interrelated motion of more than two
bodies remained so elusive that the King of Sweden, in the late 19th century,
established a prize for its solution. The king phrased his challenge in
these terms:

“Given a system of arbitrarily many mass points which attract each other
according to Newton’s laws, try to find, under the assumption that no two points
ever collide, a representation of the coordinates of each point as a series in a
variable which is some known function of time and for all of whose values the
series converges uniformly.”

The great French mathematician and scientist, Henri Poincaré, tackled this
challenge. His response, while not providing the general solution that the king
sought, laid the groundwork for what would later be known as chaos theory.
He examined a very specific case of the three-body problem, a case in which two
of the bodies orbited each other as Newton described, while a third mass-less
speck orbited them. The advantage of this purely theoretical model was that the
speck exerted no gravitational attraction on the other two bodies.

As he delved into the problem, Poincaré abandoned the goal of finding exact
solutions of the type desired by the king and instead focused on studying the
qualitative behavior of the system. He realized that an exact solution, as was
available in the two-body case, was not possible for the case involving three
bodies. Fortunately, he also realized that this did not preclude answering
important qualitative questions such as, “Is the system stable or will the planets
eventually fly off to infinity?” What he found was that the behavior of the mass-
less speck was wildly unpredictable.

Poincaré was able to explore such qualitative features of the system by using the
concept of phase space. Phase space is an abstract space of the state variables
of a system. In other words, if you take all the possible combinations of, say,
position and velocity and arrange them as coordinates in an abstract space, then
a path through this space represents how the system will evolve. The initial
conditions of a system correspond to where it starts in phase space.

The actual phase space for the three-body problem is 18-dimensional. Each
of the three bodies requires three dimensions to describe its position, x, y, and
z, and three dimensions to describe its velocity, $\frac{dx}{dt}$, $\frac{dy}{dt}$, and $\frac{dz}{dt}$. By looking
only at the mass-less speck and confining its position and velocity to the orbital
plane, Poincaré reduced the 18 dimensions to 4: x, y, $\frac{dx}{dt}$, and $\frac{dy}{dt}$. Constraining
the total energy of the system eliminates one more variable dimension, leaving a
three-dimensional phase space, which is readily visualized.


[Figure: Poincaré’s phase space for three bodies. Labels: Stable Manifold (orbits move toward the periodic orbit); Unstable Manifold (orbits move away from the periodic orbit).]

What sorts of information can we infer from a picture such as this? It would be
better to look at a simpler example of phase space to get an idea of how we can
use it to analyze the qualitative behavior of a system.

PHASE PORTRAITS
• A phase portrait is a way to visualize all states of a system.
• Using a phase portrait, one can deduce the qualitative features of a
system’s evolution.
• If a system starts out at an equilibrium point, it will not be driven to change
its state.
• Equilibria can be stable (attractors) or unstable (repellers).

A more accessible example of this qualitative method is the phase portrait,
which is a specific path or set of paths through a phase space, of a system such
as:

$$\frac{dx}{dt} = \sin x$$

This system describes an object whose velocity varies sinusoidally with its position.
Although this system can be solved directly through integration, looking at the
phase portrait will tell us more about the actual behavior of the system than
would be obvious in an exact solution.

[Figure: phase portrait of the system, plotting the velocity $\frac{dx}{dt}$ against the position x]

What this picture portrays is the velocity, $\frac{dx}{dt}$, of the system at any given
position, x. The arrows on the x-axis serve to remind us of the directional
component of the velocity. Values of x that yield positive velocities will move the
particle to the right; values of x associated with negative velocities will move it
to the left.

The places where our path crosses the x-axis correspond to positions that yield
no velocity (because sin x = 0 there). These are known as equilibrium points, because if
we started a particle at any of these points, it would not be influenced to move
in any direction. There are two main types of equilibrium points, stable and
unstable. A stable equilibrium point is one to which a particle would return if it
were displaced by some small amount. Think of releasing a grape on the inside
rim of a bowl. No matter where you release the grape, it will always end up in
the center of the bottom of the bowl. This is a stable equilibrium point. If you
were to turn the bowl over and place the grape very carefully in the exact center
of its top, the grape would stay where you put it. If you placed it anywhere
else on the outside of the bowl, however, it would roll off. The exact center of
the top is therefore an unstable equilibrium point: the grape can rest there, but
the slightest displacement sends it rolling away.


[Figure: Stable equilibrium: a grape inside a bowl. After a small nudge, the grape returns to the center, the equilibrium point. Unstable equilibrium: a grape on top of an overturned bowl. A small nudge sends the grape away from the equilibrium point.]

We can determine what kind of equilibrium points we have in our phase portrait
by looking at the velocities associated with particle movement around each
point. The velocity to the left of point A is positive, driving the particle to the
right. The velocity to the right of point A is negative, driving the particle to
the left. This means that if a particle starts out anywhere relatively close to
point A, it will eventually come to rest at point A. Point A is, therefore, a stable
equilibrium point. Because it seems to attract particles, we can call it an
attractor.

Point B, on the other hand, is a bit different. The velocities corresponding to
positions to its left are negative, tending to drive the particle away from the
point. The velocities corresponding to positions to its right are positive, also
tending to drive the particle away from point B. This indicates that starting a
particle anywhere near point B will result in that particle moving away from
that position and toward one of the attractors. Point B is, therefore, an unstable
equilibrium point. Because it tends to repel particles, we can call it a repeller.
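
The same test, looking at the sign of the velocity just to the left and just to the right of an equilibrium point, can be carried out in a few lines of code for dx/dt = sin x. This is a minimal sketch: the function names and the small offset eps are hypothetical choices, and the code lists the equilibria at multiples of π without assuming which of them the figure labels A or B.

```python
import math


def velocity(x):
    """The system dx/dt = sin x."""
    return math.sin(x)


def classify(x_eq, eps=1e-3):
    """Classify an equilibrium point from the velocities on either side of it."""
    left = velocity(x_eq - eps)
    right = velocity(x_eq + eps)
    if left > 0 and right < 0:
        return "attractor"  # pushed right from the left, pushed left from the right
    if left < 0 and right > 0:
        return "repeller"   # pushed away on both sides
    return "neither"


# The velocity is zero wherever sin x = 0, that is, at whole multiples of pi.
equilibria = [n * math.pi for n in range(-2, 3)]
labels = [classify(x) for x in equilibria]

for n, label in zip(range(-2, 3), labels):
    print(f"x = {n} * pi: {label}")
```

Attractors and repellers alternate along the x-axis, so a particle started anywhere between two repellers drifts toward the attractor that lies between them.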

Examining systems in this way, geometrically, has the advantage of enabling
us to see certain aspects of their behavior very clearly, without having to plow
through pages of equations. By identifying attractors and repellers, we can tell
how a system will evolve qualitatively over time, depending upon where it starts.


Although in the preceding example we saw how certain points act as attractors
or repellers, this does not always have to be the case. For example, if we look at
the phase portrait of a simple harmonic oscillator, such as the mass and spring
that we examined previously (but without friction this time), we see that there
is a nice closed loop corresponding to each combination of starting position and
starting velocity.

[Figure: phase portrait of the frictionless mass and spring, plotting $\frac{dx}{dt}$ against x. Panels: Small initial displacement, Large initial displacement.]

Wherever we start our system, we see that it will trace out a path in phase
space. This path is like a series of equilibrium points, and the system will
naturally evolve toward following this path through phase space. This means
that attractors do not have to be single points; instead, they can be paths, or
trajectories, through phase space.
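
The closed loops of the frictionless oscillator can be checked directly. The sketch below is an illustration, not part of the original text: it assumes m = 1 and k = 4, a release from rest at displacement x0, and the standard solution x(t) = x0 cos(ωt) with ω = sqrt(k/m). It shows that the quantity (1/2)kx^2 + (1/2)mv^2 is the same at every moment, so the point (x, dx/dt) stays on one closed curve and returns to its starting place after each period.

```python
import math


def state_at(t, x0, m=1.0, k=4.0):
    """Position and velocity of a frictionless mass on a spring released at x0."""
    w = math.sqrt(k / m)
    return (x0 * math.cos(w * t), -x0 * w * math.sin(w * t))


def loop_size(x, v, m=1.0, k=4.0):
    """Total energy; it labels which closed loop the point (x, v) lies on."""
    return 0.5 * k * x * x + 0.5 * m * v * v


small = [loop_size(*state_at(t, 0.5)) for t in (0.0, 0.3, 1.0, 2.7)]
large = [loop_size(*state_at(t, 2.0)) for t in (0.0, 0.3, 1.0, 2.7)]
print(small)  # the same value at every time: one loop
print(large)  # a different, larger value: a bigger loop

period = 2 * math.pi / math.sqrt(4.0 / 1.0)
print(state_at(period, 0.5))  # back at the starting point, up to rounding
```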

This notion of examining phase space was part of the contribution that won
Poincaré the prize for the three-body problem. He did not find an exact solution,
but his techniques were so important that he won the contest anyway. He found
that the three-body system exhibits a range of possible behaviors, including
some that seem to be wildly unpredictable; that is, even if you know the state
of the system at some point in time, there is no guarantee that you will be able
to say what the state will be at some significantly later time. In the short term,
things are predictable, but in the long term, there is no way to know for sure
what will happen.

These initial insights from Poincaré dealt a blow to the Newtonian paradigm
of deterministic predictability. Poincaré’s dynamics were still deterministic
in the sense that the state of a system at any one time depends on its state at
a previous time. However, in opposition to Newton’s viewpoint, they were far
from delineating a predictable future. Poincaré’s concept, which represented
the first notion of what is now called mathematical chaos, made it clear that
there is a limit to how far into the future we can see using mathematics. These
ideas, though shocking, did not really take hold until the mid-20th century when
the advent of the computer enabled mathematicians to practice mathematics
in an entirely new way. With the help of computers, mathematicians soon
found another remarkable aspect of chaotic behavior, the idea of sensitive
dependence, to which we will now turn.


SECTION 13.4

SENSITIVE DEPENDENCE
• Rounding Error
• Butterflies

ROUNDING ERROR
• Lorenz discovered that a small change in the input to a certain system of
equations resulted in a surprisingly large change in output.

Edward Lorenz is a noted mathematician and meteorologist. Throughout the
mid-to-late 20th century he was a meteorological researcher at MIT. In the
1950s and 1960s, the study of meteorology was as much art as it was science.
Weather forecasters could find certain patterns in weather systems that were
somewhat tame and predictable, but there was always an element of surprise.
It was thought that this was simply because the dynamics of the atmosphere
were so complex, involving so many variables, that it was impossible to state
with any precision at any one time what exactly was going on. Without knowing
the initial conditions of the system, it was very hard to make exact predictions
about what it would do next.

Lorenz hoped to gain some insight into the complexity of the weather by working
with an extremely simplified version of a weather model and running it on the
newly available computers. With a computer, he believed that he could have
exquisite control over the initial conditions, allowing the modeling equations
to function more-or-less free of measurement error. By looking at such an
ideal and simplified system, he hoped to get a better idea of the fundamental
phenomena that underlie the weather.

After considering a complicated, 12-equation model of how air moves, Lorenz
chose to focus on a system employing just three equations, a simple model of
convection rolls.

$$
\begin{aligned}
\frac{dx}{dt} &= \sigma (y - x) \\
\frac{dy}{dt} &= r x - y - x z \\
\frac{dz}{dt} &= x y - b z
\end{aligned}
$$

Lorenz’s model represented an extreme simplification of a weather system.
Using simplified equations for convection currents, his model simulated various
winds interacting. In the early days of scientific computing, this was a tedious
process. He would input his equations and a set of initial conditions and then
have the computer calculate what would happen as time moved forward in
discrete steps. To make sense of his model’s output, he would choose a specific
variable, such as the direction of the west wind, and plot its behavior graphically.
He watched as the wind shifted directions, a phenomenon represented by
a wavy-line computer printout. This line represented a record of how that
direction of the wind changed according to his mock-up, as calculated by the
computer.

As the story goes, one day, he was forced to stop his calculations mid-
simulation. When he returned a bit later, he decided to start the simulation
again, using the values that had been generated and recorded at its stopping
point, rather than starting the simulation over with the initial values. He
entered the values from before as the initial conditions and was amazed by what
happened. The simulation progressed as predicted for a while, but then quickly
and inexplicably diverged from what he had seen in previous simulations.

Lorenz initially suspected that there had been a computer malfunction. In
Newtonian determinism, there should be no difference between an interrupted
and a non-interrupted test. Upon further investigation and reflection, Lorenz
realized that there had been no malfunction; the discrepancy was due to a tiny
rounding difference between the computer and the printer that displayed the
data.

Lorenz’s computer’s memory was programmed to register six decimal places.
For example, at the end of a round of simulation, the computer would output a
number such as 0.506127. This number would then automatically be used as
the initial condition for the next round of simulation. Lorenz’s printout, on the
other hand, displayed only three decimal places (a paper-saving feature), and
it was this printout that he used to input the starting values when he re-started
the interrupted experiment. Had the computer not been interrupted, it would
have continued using the 6-digit number; Lorenz had assumed that inputting a
3-digit approximation would not change the results very much.

The difference between 0.506127 and 0.506 is a little more than one part in ten
thousand. This is a minuscule deviation, the kind of discrepancy that scientists
regularly ignore because they assume that small errors in input have only
small effects on output. Lorenz found, however, that this tiny discrepancy had
profound implications for the long-range behavior of his “simple” system.
Lorenz had thought that perhaps computers would be the supreme data
processors, capable of generating complete, accurate weather predictions.
Nonetheless, he also knew that a computer’s output is only as reliable as its
input. Experimental scientists have long known that the initial conditions of a
system can never be quantified with 100% accuracy. What Lorenz found in his
computer simulations was that a small difference in initial conditions could
result in large discrepancies between expected outcomes in certain systems.
This concept, which came to be known as sensitive dependence, is the key trait
of systems that exhibit chaos.
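
Lorenz's restart can be re-enacted in a few lines. The following is only a sketch under stated assumptions: the parameter values (σ = 10, r = 28, b = 8/3), the step size, the Runge-Kutta stepping method, and the starting state are illustrative choices that do not come from the text, and they are not a reconstruction of Lorenz's actual program. One run starts from a six-decimal value and the other from the same value rounded to three decimals, and the code records how far apart the two runs drift.

```python
# Sketch only. Assumed, not taken from the text: sigma = 10, r = 28, b = 8/3,
# a fixed step dt = 0.01, Runge-Kutta stepping, and the starting state below.

def lorenz(state, sigma=10.0, r=28.0, b=8.0 / 3.0):
    """Right-hand side of the three Lorenz equations."""
    x, y, z = state
    return (sigma * (y - x), r * x - y - x * z, x * y - b * z)


def rk4_step(state, dt):
    """Advance the state by one fourth-order Runge-Kutta step."""
    def nudge(slope, h):
        return tuple(s + h * k for s, k in zip(state, slope))

    k1 = lorenz(state)
    k2 = lorenz(nudge(k1, dt / 2))
    k3 = lorenz(nudge(k2, dt / 2))
    k4 = lorenz(nudge(k3, dt))
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def separation_history(start_a, start_b, steps, dt=0.01):
    """Return the distance between two runs after every step."""
    run_a, run_b = start_a, start_b
    history = []
    for _ in range(steps):
        run_a, run_b = rk4_step(run_a, dt), rk4_step(run_b, dt)
        history.append(sum((p - q) ** 2 for p, q in zip(run_a, run_b)) ** 0.5)
    return history


full = (0.506127, 1.0, 20.0)     # six-decimal value, as held in memory
rounded = (0.506, 1.0, 20.0)     # three-decimal value, as read off a printout
gaps = separation_history(full, rounded, steps=4000)

for step in (1, 1000, 2000, 3000, 4000):
    print(f"time {step * 0.01:5.2f}   separation {gaps[step - 1]:.6f}")
```

The two runs begin about one ten-thousandth apart. Printing the separation at later times shows whether, and how quickly, that gap grows to the size of the motion itself.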

butterflies
• The phase space of the Lorenz system contains an attractor whose phase
portrait resembles a butterfly.
• The Lorenz attractor helps to explain how small changes in starting
conditions lead to greater changes down the line.

To understand sensitive dependence a little better, Lorenz decided to look at the
phase space of his system. He saw something much more complicated than the
simple phase portraits that we observed in the previous section.

This phase portrait is three-dimensional, one dimension for each of the
variables in Lorenz’s equations. It represents how Lorenz’s simplified weather
system evolves through time. It is an abstract path consisting of points whose
coordinates are determined by Lorenz’s equations. If we imagine a particle
sliding along this abstract path, that particle’s behavior will be indicative of the
behavior of the system in general.

What is remarkable about this object is that if you were to start two different
particles off in almost but not quite exactly the same location and then allow
them to flow along the curve, they would remain close to each other for a while
but would at some point start to diverge in their paths very rapidly. This is just
like the example of falling leaves from the introduction to this unit. Although
the two leaves start out in almost, but not quite exactly, the same position,
we all know that by the time they reach the ground, they can be very far apart
indeed. In the present example, note also that the particles, even though they
follow different paths, still stay somewhere close to this butterfly pattern.
That is another hallmark of chaos: indeterminacy mixed with some notion of
determinacy, in that the motion stays bounded in space. Just as the leaf is sure
to hit the ground eventually, chaotic behaviors are confined in their outcomes.

Chaotic unpredictability and sensitive dependence can arise in some nonlinear
systems, but not all. They represent just a small part of the broader, mostly
untamed, field of nonlinear dynamics. While the initial discoveries of chaotic
behavior came from the realm of continuous dynamics, such as the motions
of planets, chaos also arises in discrete-time situations. These are situations
in which a process is repeated for several steps, each step using the output of
the step before as its initial condition. Lorenz, for example, made his discovery
by examining discrete-time solutions to his differential equations.
The mechanics of chaos can be better understood by looking at these iterative
functions, and so it is to the subject of iteration that we will now turn our
attention.


SECTION 13.5

iteration
• Folding Dough

folding dough
• A simple way to see sensitive dependence is to look at discrete, iterative
processes, such as folding dough.

In everyday life, we rarely perceive any boundary between one moment and
the next, but, instead, perceive time as flowing continuously. Some processes,
however, can be broken into discrete steps. Folding and kneading dough is a
good example of this; each fold is more or less an instantaneous event and the
time between folds serves as a boundary.

[Figure: a piece of dough is alternately stretched and folded.]

Folding dough can, therefore, be modeled, approximately, in “discrete time.” You
can think of discrete time as something like a sequence of snapshots, whereas
continuous time is more like a movie. Discrete time breaks a process up into
the inputs and outputs at individual, separated moments in time (or space).
A discrete dynamical system generally takes at each moment the output of a
given step to be used as the input for the next step in the process. This process
is called iteration; complete a step by performing an action that generates a
new value, then use that new value as the starting point as you repeat the same
action. Repeat this process for as long as you like.

Imagine a flake of pepper on the surface of the dough. As we knead and fold the
dough the pepper flake gets moved about, its location changing from one discrete
moment to the next. A computational analogy would be as follows: We start
with a number; “stretch” it; chop off a bit we don’t need; and end up with a new
number. The stretching will be accomplished by multiplication, the chopping
by a modular arithmetic operation. Our process will be to multiply the starting
number by ten, then take the result modulo 1. (Recall from the unit on primes
and modular arithmetic that “modulo 1” is the mathematical way of saying
“remove the integer part.”) This eliminates any whole-number part of
the result, leaving only a decimal fraction to begin the next iteration.

Let’s start our process with a decimal input, 0.506127.

First we stretch it by multiplying by ten:

10 × 0.506127 = 5.061270

Next we take the result mod 1:

5.061270 mod 1 = 0.061270

We now use this result as the starting point for the next iteration:

0.061270 × 10 = 0.612700

0.612700 mod 1 = 0.612700 (no change, because there was no whole number
component of the number)

So far so good, but what does this have to do with chaos? If we take two
numbers that are almost but not quite exactly the same, say 0.12345 and
0.12349, and perform this iterative stretching and chopping process, we will see
the essence of chaos unfold before our eyes. The following table records the
evolution of the iterative process, and the image that follows demonstrates the
divergence of the two values on a series of number lines.

INPUT                        0.12345    0.12349
1st Iteration (10x mod 1)    0.23450    0.23490
2nd Iteration                0.34500    0.34900
3rd Iteration                0.45000    0.49000
4th Iteration                0.50000    0.90000

[Figure: the two values marked on a number line from 0 to 1 at each step. The two points start close to each other (0.12345 and 0.12349); by the 4th step they are far apart (0.5 and 0.9).]

Notice that the two numbers start out virtually indistinguishable, with the
difference between them being only 4 parts in 100,000, hardly something to note.
As the iterative process begins to unfold, the numbers stay relatively close to
one another. After the first iteration, they differ by only 4 parts in 10,000. They
continue to remain relatively close to one another all the way up to the end of
the 3rd iteration. After four rounds of stretching and chopping, the numbers no
longer resemble one another at all; their initial difference has been amplified by
a factor of 10,000 and one is now nearly twice the value of the other. This is the
essence of sensitive dependence.
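
The stretch-and-chop process is easy to reproduce. The sketch below is one possible way to do it; the function names are illustrative, and Python's `decimal` module is used (an implementation choice, not something from the text) so that the digits are handled exactly rather than as binary floating-point approximations.

```python
from decimal import Decimal


def stretch_and_chop(value):
    """One iteration: multiply by ten, then keep only the fractional part."""
    return (value * 10) % 1


def orbit(start, steps):
    """Return the starting value followed by `steps` iterations of it."""
    values = [Decimal(start)]
    for _ in range(steps):
        values.append(stretch_and_chop(values[-1]))
    return values


first = orbit("0.12345", 4)
second = orbit("0.12349", 4)

for n, (x, y) in enumerate(zip(first, second)):
    print(f"iteration {n}: {x}  {y}  difference {abs(x - y)}")
```

Each pass multiplies the gap between the two values by ten, which is why a difference in the fifth decimal place reaches the first decimal place after four iterations.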

Notice that in this system, there was a particular point, namely the 4th iteration,
at which time the divergence of the values escalated quickly. We can call this
breakpoint the threshold of chaos. In the study of nonlinear dynamics, other,
more complicated systems can have similar thresholds of chaos. These
thresholds are determined by the system and the exact values chosen as
initial conditions. An important question to explore is “when does chaos set
in?” Stated in other terms, the question is “how do I know when a system is
predictable and when it is not?” To see how one might answer these questions,
we are going to look at a famous model that involves the rise and fall of the
populations of various wild animal species.


SECTION 13.6

the logistic map
• Bifurcation
• The Road to Chaos
• Feigenbaum’s Constants

bifurcation
• A bifurcation is an abrupt change in the qualitative behavior of a system.

The iterative, discrete-time view of chaos is powerful because it allows us to see
how a system evolves, step-by-step. Most nonlinear systems are chaotic only
under certain circumstances. A discrete-time analysis can help pin down these
circumstances. An example of this is the flow of water, or any fluid. As long
as it is allowed to flow at a reasonable speed along a course free of obstacles,
fluid flow is nice and predictable. However, as the speed of flow increases, or
as obstacles are added in the path of the flow, the flow starts to get somewhat
unpredictable. Eventually, under certain conditions, the fluid no longer behaves
in a predictable way at all; this condition is called turbulence.

Turbulence is a good deal more complicated and less understood than classic
chaos, but the point is that our system changes its qualitative behavior,
depending on the specific parameters we assign to it. We expect that using
different starting values will give us different results, but we also naturally tend
to expect that those results, while different quantitatively, will be somewhat
similar qualitatively. We might expect that doubling the weight of a moving
particle would halve its acceleration, given the same amount of force. We would
probably also expect that the particle would still get to where it was headed
initially, although it might take longer. In a chaotic system, however, doubling
the weight might cause the particle to reverse direction, stop, oscillate between
two or more values, or exhibit any number of qualitatively different behaviors.

The point at which a system changes from one fundamental type of behavior to
another is called a bifurcation. An important question then is “for what values
of our system’s parameters does bifurcation occur?” Applied to our system
of moving water, the question is “at what speed does the water flow become
turbulent?” Answering this question and others like it is of great importance if
you are designing boats, testing aircraft, trying to understand the fluctuations of
the stock market, or trying to predict how populations of wild animals rise and
fall.

the road to chaos
• The logistic map is a model of population growth that exhibits many different
types of behavior, depending on the value of a few constants.
• Above a certain parameter value, the logistic map becomes chaotic.

Let’s take a look at one specific iterative function, or map, to see bifurcation and
chaos in action. The function we will investigate, often called the logistic map,
represents a highly simplified model of population fluctuations. It takes an
initial population level and tells you what the population will be after some fixed
interval of time, or time step. The time step can be as long or as short as you
care to make it, depending on what species you are studying. For our purposes,
we’ll just make it some arbitrary quantity representing a generation. The
equation then, for some population $p_{n+1}$ after an arbitrary time step, starting
with population $p_n$, is:

$$p_{n+1} = r p_n (1 - p_n)$$

Here the population is expressed as a fraction of the largest population the
environment can support, so it lies between 0 and 1. In this equation, the
parameters that we can modify are the growth rate, $r$, and
the initial population, $p_0$. In particular, we would like to know how the growth
rate affects the overall behavior of the system.

For example, if $r$ is less than 1, $p_n$ goes to zero as $n$ goes to infinity. This means
that the population diminishes to the point of extinction.

[Figure: population Xn plotted against iteration n for r = 2.8.]

If r is between 1 and 3, the population eventually settles at some steady-state
value. Although the population may wobble a bit over short time spans, the
long-term behavior after many iterations is for the system to settle on one
population size.
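
The steady-state value, and the range of r for which it holds, can be worked out directly from the logistic equation. The derivation below is a sketch that uses the standard stability test for a fixed point of an iterated map (the slope of the map at the fixed point must be smaller than 1 in size); that test is not stated in the text and is brought in here as an assumption. The symbols $f$ and $p^*$ are introduced only for this calculation.

$$
\begin{aligned}
f(p) &= r p (1 - p), \qquad p^* = f(p^*) \;\Rightarrow\; p^* = 0 \ \text{ or } \ p^* = 1 - \frac{1}{r} \\
f'(p) &= r (1 - 2p), \qquad f'(0) = r, \qquad f'\!\left(1 - \tfrac{1}{r}\right) = 2 - r \\
|f'(0)| &< 1 \iff r < 1, \qquad |2 - r| < 1 \iff 1 < r < 3
\end{aligned}
$$

So for r below 1 the population is drawn to zero, and for r between 1 and 3 it is drawn to $1 - 1/r$. With r = 2.8, for example, the steady-state value is $1 - 1/2.8 \approx 0.643$. At r = 3 the slope reaches $-1$ and the steady state stops attracting nearby values.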

[Figure: population Xn plotted against iteration n for r = 3.3.]

If we let r increase past 3, we see a surprising change in the system’s behavior.
Instead of
settling on one value, the population oscillates between two different values
forever. For our population this would mean, for example, that boom years
are followed directly by bust years and vice versa. This change in behavior is a
bifurcation from steady-state values to oscillations of period 2. We say “period
2” because it takes two iterations to return to the original value.

[Figure: population Xn plotted against iteration n for r = 3.5.]

As r increases beyond 3, more interesting behavior emerges. We start to see
more bifurcations, and they become more frequent. Each time, the period of
oscillation doubles.

Bifurcation    r-value    Period
r1             3          2
r2             3.449      4
r3             3.54409    8
r4             3.5644     16

Values of r versus their oscillation period, paired with graphs that show the successive period doublings.

The population oscillates first with period 2 when r = 3. When r = 3.449, the
period doubles to period 4, indicating that it now takes four iterations for the
population to return to a value that it has had before. The period continues
to double from 4 to 8 to 16, each time at a successively smaller increment of
increase in r. Eventually, when r = 3.569946, the period becomes infinite. This
means that the population fluctuates wildly, never regularly returning to any
previous value.
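
The period doublings can be checked numerically. The sketch below iterates the logistic map long enough for the population to settle, then counts how many further steps it takes to return to the same value. The function names, the starting population of 0.5, the number of settling steps, and the tolerance are all illustrative assumptions rather than anything prescribed by the text.

```python
def logistic(p, r):
    """One time step of the logistic map."""
    return r * p * (1 - p)


def attractor_period(r, p0=0.5, transient=5000, max_period=64, tol=1e-9):
    """Return the period the population settles into, or None if no
    period up to max_period is found (for example, when it is chaotic)."""
    p = p0
    for _ in range(transient):      # let short-term wobbles die away
        p = logistic(p, r)
    reference = p
    for period in range(1, max_period + 1):
        p = logistic(p, r)
        if abs(p - reference) < tol:
            return period
    return None


for r in (0.8, 2.8, 3.3, 3.5, 3.55, 3.9):
    print(f"r = {r}: period {attractor_period(r)}")
```

A result of 1 means a steady state, 2 means boom-and-bust years, and so on. A result of `None` only says that no short period was found, which is what one would expect above r = 3.569946.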

These period-doubling bifurcations are quite fascinating. Why does a population
that is stable at 2.999999 start swinging between two different values at 3? Also,
why does this oscillation occur more and more rapidly as the r-value approaches
the magic number of 3.569946? Furthermore, what happens if we let r get
bigger than 3.569946?

It is tempting to think that the more r increases, the more chaotic the population
becomes, but the actual behavior is much more varied than this. The logistic
map shows a range of behaviors. Above the magic number, the population
becomes chaotic, never settling onto a fixed value and never falling into any
periodicity. This is the same sort of behavior that we saw earlier in Lorenz’s
weather simulations.

[Figure: population Xn plotted against iteration n for r = 3.9.]

There are certain “windows” of r-values, above the magic number, that give
oscillating populations. It seems that the system bifurcates both into and out of
chaos, depending on what r-values one chooses.

We can see the global behavior of the logistic map by looking at what is known
as an orbit diagram. This type of diagram is different than the ones we have
previously seen in this unit. Those previous diagrams showed how population
evolved in time, step by step. An orbit diagram shows how the behavior of a
system changes, depending on the r-value. It’s a way to see the long-range
global behavior of the entire system at a glance.
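
The data behind an orbit diagram can be generated with a short loop. The sketch below is one way to do it, with assumed settings: a starting population of 0.5, 2,000 discarded settling steps, 200 recorded steps, and rounding to six decimals so that repeated values are recognized as the same. For each r it collects the values the population keeps visiting in the long run; plotting those values above each r would give the diagram.

```python
def long_run_values(r, p0=0.5, transient=2000, keep=200, digits=6):
    """Return the sorted distinct values visited after the transient."""
    p = p0
    for _ in range(transient):          # discard the settling-in phase
        p = r * p * (1 - p)
    seen = set()
    for _ in range(keep):               # record the long-run behavior
        p = r * p * (1 - p)
        seen.add(round(p, digits))
    return sorted(seen)


for r in (2.8, 3.2, 3.5, 3.832, 3.9):
    values = long_run_values(r)
    if len(values) <= 8:
        print(f"r = {r}: {values}")
    else:
        print(f"r = {r}: {len(values)} distinct values")
```

The value r = 3.832 is included because it lies inside one of the "windows" above the magic number, where the population oscillates among three values even though nearby r-values give chaos.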

[Figure: orbit diagram of the logistic map, with r from 3.4 to 4.0 on the horizontal axis and values from 0.0 to 1.0 on the vertical axis, together with a zoomed view for r from 3.847 to 3.857 and values from .13 to .18.]
Note that if we zoom in on the white band, we see a self-similar structure, reminiscent of a fractal.

Looking at this diagram, we see r represented along the horizontal and a
general p-value along the vertical. This tells us which values of p are accessible
for a given value of r. For r between 1 and 3, p settles on one value (not shown
in graph). At 3 we see the graph bifurcate into an oscillation between two
values. A little bit further along, we see each of those values bifurcate into two
more values at a little more than 3.4. This indicates that the population varies
between four different values before it returns to where it started.

A little further along, we can see the system double, double again, and then
double yet again. Eventually, around r = 3.6, it gets really messy. This is chaos,
but notice that it does not last forever. As r continues to increase, we see the
messiness clear up, at least for small windows of clean oscillations.

There are many different maps like the logistic map that show bifurcations
and chaotic behavior. In addition to the surprising mixture of order and chaos
revealed in the logistic map, there is a more-deeply-hidden surprise awaiting
when the bifurcation behavior of all such maps is examined. This surprise was
one of the first footholds that mathematicians established in the seemingly
hopeless world of chaos.

feigenbaum’s constants
• While the distance between successive bifurcations in the logistic map
changes, the ratio of those distances is a constant.

Mitchell Feigenbaum was a fixture at Los Alamos National Laboratory in the
1970s. Known for his breadth of knowledge, he was a trusted resource when a
colleague needed to bounce around ideas from any number of challenging fields.
One of Feigenbaum’s many interests was the bifurcation behavior of different
maps. Specifically, he looked at the intervals at which successive bifurcations
occur. In the logistic map, we saw that bifurcations did not occur at some steady
rate, but rather tended to cluster together. In other words, a system might
take a long time to evolve from steady-state values to oscillating behavior,
but not nearly as long to have a period-doubling bifurcation. Feigenbaum was
interested in the pattern behind these bifurcations, if there was any. Because
these bifurcations occur before the onset of chaos, they can be thought of as
“the road to chaos” in some sense. Feigenbaum felt that if he could understand
the bifurcations, he would have made an in-road into understanding chaos.

He began by looking at the r-values at which successive bifurcations occur.
Although he found no regularity in the values themselves, he found an
astonishing pattern in the differences between them. For example, if one
bifurcation occurred at
3, and the next occurred at 3.4, and the next at 3.5, the successive differences
would be 0.4 and 0.1 respectively. When he looked at the ratios of successive
differences, he found that they tended toward a certain number, the
first few digits of which are 4.669. What is remarkable is that this number is the
same no matter which map one looks at, as long as it has only one parameter, as
does the logistic map.

Feigenbaum’s constant, 4.669…, can be thought of as the limiting ratio of successive
bifurcation intervals in a system. It can be used to predict the onset of chaos
in a system before it ever shows up. So, even though a chaotic system is
fundamentally unpredictable, one can predict when the system will reach the
chaotic state. The fact that the same constant appears in so many different
maps, known as universality, was an important step in the
understanding of chaotic behavior.
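
To see how the constant can be used for prediction, the sketch below works with four bifurcation values of the logistic map. The values 3, 3.449, and 3.54409 are the ones quoted earlier in this unit; 3.5644 is the commonly tabulated value for the period-16 bifurcation and is an assumption here. The prediction step also assumes that every later gap shrinks by exactly the factor 4.669, which is only approximately true.

```python
# r-values at which the period becomes 2, 4, 8, and 16.
# 3.5644 is the commonly tabulated period-16 value (assumed here).
bifurcations = [3.0, 3.449, 3.54409, 3.5644]

# Gaps between successive bifurcations, and the ratios of successive gaps.
gaps = [b - a for a, b in zip(bifurcations, bifurcations[1:])]
ratios = [wide / narrow for wide, narrow in zip(gaps, gaps[1:])]
print("gaps:  ", [round(g, 5) for g in gaps])
print("ratios:", [round(q, 3) for q in ratios])

# If each later gap is the previous one divided by DELTA, the remaining gaps
# form a geometric series whose sum is last_gap / (DELTA - 1).
DELTA = 4.669
predicted_onset = bifurcations[-1] + gaps[-1] / (DELTA - 1)
print("predicted onset of chaos:", round(predicted_onset, 5))
```

The ratios come out near 4.7, already close to 4.669, and the predicted onset can be compared with the value 3.569946 at which the period becomes infinite.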

Feigenbaum’s work showed that the study of chaos was more than just an
exercise in rationalizing our inability to predict certain phenomena. He showed
that the onset of chaos itself could be predicted and thus, hopefully, better
controlled. Furthermore, because of the notion of sensitive dependence,
if chaos can be controlled, perhaps it can be manipulated to achieve some
desirable end, instead of simply imposing a barrier to impede our ability to
predict the future. In our final section, we will see how the concepts of chaos
can be used to our benefit.


SECTION 13.7

fly me to the moon
• Chaos…in…Space
• Leaves in the Stream

Chaos…in…Space
• Deterministic, Newtonian mechanics were sufficient to get us to the moon
during the space race.

Lorenz’s discovery of sensitive dependence in the 1960s occurred at the time
of the golden era of space exploration in both the United States and the USSR.
These two competing superpowers utilized the best of deterministic, Newtonian
thinking to send human beings into space and to the moon. Achieving this
required huge, expensive rockets and enormous amounts of fuel. Most of the
fuel required for a space flight was needed to escape the grip of Earth’s gravity
and to allow different types of orbits. Additional fuel was required to enable
spacecraft to move between different orbits, including orbits that coincided with
the path of the moon.

[Figure: a spacecraft moving from an old orbit around Earth to a new orbit by way of a transfer orbit.]

To compute these orbits, engineers used classic linear thinking, sticking to
paths that they knew would be forgiving. They knew that small changes would
result in small movements, and this helped to minimize error and maximize
control. The problem with this strategy is that the opposite is also true: large
movements require large changes, and large changes require large amounts of
fuel. Exploring the solar system, or just our closest neighbor, the moon, in this
manner is effective and relatively safe, but it is extremely expensive.

Fast-forward thirty years to the 1990s and the space race was in decline.
After the breakup of the USSR, the United States’ chief competitor for space
dominance was out of the game. With the chief impetus for space exploration
out of the picture, the United States space program had slowly declined from its
ambitious projects of the 60s, 70s, and early 80s. No longer could they justify
expensive missions, such as those that landed humans on the moon. In this
political/social climate, a new paradigm of space exploration began to take
shape.

In the 1990s, scientists at NASA’s Jet Propulsion Laboratory began to wonder
whether some of the ideas from chaos theory might be useful in designing a
way to travel around the solar system using very small amounts of fuel. They
thought that perhaps they could use nonlinearities to their advantage to get
large accelerations for relatively small amounts of fuel. To get a better idea
of how this would work, let’s return to the example of falling leaves from the
introduction to this unit.

leaves in the stream
• Space scientists are able to use sensitive dependence to their advantage to
plan minimal-fuel routes through the solar system.
• By connecting Lagrange points, scientists have created an Interplanetary
Superhighway.

Recall that in our opening example, the two falling leaves started out in almost,
but not quite exactly the same position. By the time they reached the ground,
they ended up in very different locations. This is an example of the sensitive
dependence that is the hallmark of chaos theory.


If we imagine the two leaves to be spacecraft and the branch to be the Earth’s
orbit, then we get some sense for how this new paradigm of space exploration
works. Two spacecraft could start out in minutely different positions and be
carried throughout the solar system to very different locations. A very small
adjustment at the beginning of a journey could determine whether a spacecraft
ends up orbiting the moon or Pluto. The mechanism that would make all this
possible came to be called the Interplanetary Superhighway (IPS).

Item 1709/NASA/JPL-Caltech, ARTIST’S CONCEPT OF INTERPLANETARY
SUPERHIGHWAY (2002) Courtesy of NASA/JPL-Caltech.

To understand how the IPS works and what it has to do with chaos theory, let’s
look a little more closely at how the gravitational fields of different planetary
bodies interact.

Item 1713/NASA/WMAP Science Team, LAGRANGE POINTS 1-5 OF THE
SUN-EARTH SYSTEM (2001). Courtesy of NASA/WMAP Science Team.
Note the Lagrange points of the Sun-Earth system, L1, L2, L3, L4, L5

We normally envision an orbit to be an elliptical path that results when mutual
gravitation between two bodies acts to keep one (the satellite) circling around
the other without flying off into space or crashing into its surface. Other types
of orbits are possible, however. One alternative type of orbit is characterized by
instability, and it is highly susceptible to small changes of course. These orbits
are known as halo orbits, and they are the nodes of the IPS network.

Halo orbits take advantage of what are known as Lagrange points. These are
points in space where the gravitational pulls of two bodies, together with the
effect of their orbital motion, are exactly balanced. An object situated at a
Lagrange point will be able to remain motionless relative to the two bodies,
like the rope in a stalemated tug-of-war.

[Figure: a small nudge sends a satellite away from a Lagrange point between the Sun and Earth. A grape balances on the bowl, but a small nudge results in a large change.]

Just a minimal applied force is enough to send an object hurtling away from the
Lagrange point in much the same way that a mere touch is sufficient to send
a delicately balanced grape rolling off of the top of an upside-down bowl. If
you knew exactly how and where to nudge the grape, you could control where
it ends up (for a perfectly spherical grape). Furthermore, your small exertion
would result in a large effect on the grape’s position. This is the essence of how
sensitive dependence can be harnessed and used to help us explore our solar
system.
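
The grape on the bowl can be turned into a small calculation. The following is an idealized sketch, not a model of a real grape or a real spacecraft: it assumes a frictionless point mass sliding on a hemispherical bowl of radius $R$, with $\theta$ the angle measured from the top of the bowl, $g$ the acceleration of gravity, and a release from rest at a tiny angle $\theta_0$. All of these symbols are introduced here for illustration only.

$$
\begin{aligned}
R \, \frac{d^2\theta}{dt^2} &= g \sin\theta \approx g\,\theta \quad \text{(for small } \theta\text{)} \\
\theta(t) &= \theta_0 \cosh\!\left(\frac{t}{\tau}\right) \approx \frac{\theta_0}{2}\, e^{t/\tau} \quad \text{for } t \gg \tau, \qquad \tau = \sqrt{\frac{R}{g}}
\end{aligned}
$$

Two things stand out. The sign of $\theta_0$ decides which side of the bowl the grape rolls down, so a tiny nudge selects the outcome. The size of the displacement then grows exponentially, so the effect is far larger than the effort, at least until $\theta$ is no longer small and the approximation breaks down.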

Objects can sit at Lagrange points, albeit tentatively. They can also orbit them
in a manner similar to how they would orbit a planet, except that orbits around
Lagrange points are extremely unstable. The IPS is a very precise path that
connects the different Lagrange points across our solar system. It can be
visualized as a system of tubes whose surfaces represent paths that naturally
tend toward Lagrange points. By staying on the surface of one of these tubes,
a spacecraft can basically surf the gravitational landscape of the solar system
using very little fuel. Imagine our grape being nudged off of the first overturned
bowl and onto the pinnacle of another overturned bowl, where the process
is repeated—a theoretically perpetual system of motion with very little input
energy.

[Figure: small thrusts carry a spacecraft among the Sun, Earth, Mars, and Jupiter, just as small nudges carry a grape from bowl to bowl.]

In this system course corrections or alterations require very little fuel compared
to the amount required in the more Newtonian paradigm of powering one’s way
through space along deterministic orbits. By taking advantage of the sensitive
dependence of Lagrange points in the IPS, spacecraft can travel farther more
economically, and can devote more of their payload to mission equipment as
opposed to the equipment and materials related to propulsion. NASA began to
design missions using these concepts in the late 1990s and early 2000s. The IPS
is both an exciting development in the field of space exploration and a triumph of
using the mathematics of nonlinear systems and chaos.

UNIT 13 AT A GLANCE

SECTION 13.2
Linear vs. nonlinear systems

• Chaos is one of many behaviors that a nonlinear system can display.
• A mass on a spring is an example of a simple harmonic oscillator, a
well-understood linear system.
• Linear systems can be solved relatively simply because they can be broken
down into parts that can be solved separately.
• Linear systems tend toward one of four predictable behaviors.
• A pendulum swinging outside of the small-angle approximation, where
sin θ ≈ θ, is an example of a nonlinear system.
• For small swings, a pendulum behaves predictably, but for large swings, it
can behave strangely.
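
The small-angle approximation mentioned above can be checked numerically. The following is a minimal sketch, not taken from the text: the function name and the sample angles are illustrative choices. It measures how far θ drifts from sin θ as the swing grows, which is the point at which the linear pendulum model stops being trustworthy.

```python
import math

def small_angle_relative_error(theta):
    """Relative error made by replacing sin(theta) with theta (theta in radians)."""
    return abs(theta - math.sin(theta)) / math.sin(theta)

# Sample swing amplitudes in degrees (illustrative values, not from the text).
for degrees in (5, 20, 60, 120):
    theta = math.radians(degrees)
    print(f"{degrees:>3} deg: sin = {math.sin(theta):.4f}, "
          f"theta = {theta:.4f}, "
          f"relative error = {small_angle_relative_error(theta):.1%}")
```

For swings of a few degrees the error is a small fraction of a percent, while for wide swings it is too large to ignore, so the nonlinear term can no longer be dropped.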

SECTION 13.3
Limits of predictability

• Newton calculated the motion of the planets using differential equations for
two objects influencing each other with gravity.
• Given the initial conditions and the relevant equations, one can predict
where the two mutually orbiting objects will be at any point in time.
• The three-body problem is very different from the two-body problem.
• Poincaré showed that the behavior of a three-body system cannot be
quantitatively predicted.
• A phase portrait is a way to visualize all states of a system.
• Using a phase portrait, one can deduce the qualitative features of a system’s
evolution.
• If a system starts out at an equilibrium point, it will not be driven to change
its state.
• Equilibria can be stable (attractors) or unstable (repellers).

SECTION 13.4
Sensitive dependence

• Lorenz discovered that a small change in the input to a certain system of
equations resulted in a surprisingly large change in output.
• The phase space of the Lorenz system contains an attractor whose phase
portrait resembles a butterfly.
• The Lorenz attractor helps to explain how small changes in starting
conditions lead to greater changes down the line.
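
Sensitive dependence in the Lorenz system can be seen by following two nearly identical starting points. The sketch below is an illustration under stated assumptions: it uses the parameter values usually quoted for the Lorenz system (σ = 10, ρ = 28, β = 8/3), a hand-written fourth-order Runge-Kutta step, and a starting gap of one part in a million. None of these choices come from the summary above.

```python
def lorenz(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz equations (assumed classic parameters)."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)

def rk4_step(f, state, dt):
    """Advance state by one fourth-order Runge-Kutta step of size dt."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(
        s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def separation_history(start_a, start_b, dt=0.01, steps=4000):
    """Distance between two trajectories after each time step."""
    a, b = start_a, start_b
    history = []
    for _ in range(steps):
        a = rk4_step(lorenz, a, dt)
        b = rk4_step(lorenz, b, dt)
        history.append(sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5)
    return history

# Two starting points that differ by one part in a million in x.
gaps = separation_history((1.0, 1.0, 1.0), (1.0 + 1e-6, 1.0, 1.0))
for step in (0, 999, 1999, 2999, 3999):
    print(f"t = {(step + 1) * 0.01:5.1f}: separation = {gaps[step]:.3e}")
```

The gap starts out microscopic and eventually becomes comparable to the size of the attractor itself, after which the two trajectories are effectively unrelated.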

SECTION 13.5
Iteration

• A simple way to see sensitive dependence is to look at discrete, iterative
processes, such as folding dough.
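
A stretch-and-fold process can be imitated with a one-line rule. The sketch below uses the tent map as an assumed stand-in for folding dough: each step stretches the interval from 0 to 1 to twice its length and folds it back on itself. The starting values and function names are illustrative, not taken from the text.

```python
def tent_map(x):
    """Stretch [0, 1] to twice its length, then fold it back onto [0, 1]."""
    return 2.0 * x if x < 0.5 else 2.0 * (1.0 - x)

def separations(x, y, steps):
    """Gap between two points after each application of the tent map."""
    result = []
    for _ in range(steps):
        x, y = tent_map(x), tent_map(y)
        result.append(abs(x - y))
    return result

# Two grains of flour that start one millionth of the dough's length apart.
gaps = separations(0.2, 0.2 + 1e-6, 20)
for n, gap in enumerate(gaps, start=1):
    print(f"after fold {n:2d}: gap = {gap:.6f}")
```

While the two points lie on the same side of the fold, each step doubles the gap, so a difference in the sixth decimal place grows to a visible fraction of the interval in fewer than twenty folds.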

SECTION 13.6
The logistic map

• A bifurcation is an abrupt change in the qualitative behavior of a system.
• The logistic map is a model of population growth that exhibits many different
types of behavior, depending on the value of a few constants.
• Above a certain parameter value, the logistic map becomes chaotic.
• While the distance between successive bifurcations in the logistic map
shrinks, the ratio of those distances approaches a constant (the Feigenbaum
constant, about 4.669).
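
The change from steady to periodic to chaotic behavior can be observed directly by iterating the map. The sketch below assumes the standard form x → r·x·(1 − x) with a single growth parameter r. The sample values of r, the starting point, and the helper names are illustrative choices rather than anything specified in the text.

```python
def logistic_orbit(r, x0=0.5, transient=2000, keep=64):
    """Iterate x -> r*x*(1-x), discard the transient, return the next `keep` values."""
    x = x0
    for _ in range(transient):
        x = r * x * (1.0 - x)
    orbit = []
    for _ in range(keep):
        x = r * x * (1.0 - x)
        orbit.append(x)
    return orbit

def distinct_values(orbit, digits=6):
    """Count the values the orbit visits, treating near-equal values as one."""
    return len({round(x, digits) for x in orbit})

# Illustrative growth rates: one before the first bifurcation, two in the
# period-doubling range, and one in the chaotic range.
for r in (2.8, 3.2, 3.5, 3.9):
    count = distinct_values(logistic_orbit(r))
    print(f"r = {r}: long-run orbit visits {count} distinct value(s)")
```

A count of one corresponds to a steady population, two and four to period-doubled cycles, and a count near the number of retained iterates to an orbit that never settles down.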

SECTION 13.7
Fly me to the moon

• Deterministic Newtonian mechanics was sufficient to get us to the moon
during the space race.
• Space scientists are able to use sensitive dependence to their advantage to
plan minimal-fuel routes through the solar system.
• By connecting Lagrange points, scientists have created an Interplanetary
Superhighway.


BIBLIOGRAPHY

Websites
http://ecommons.library.cornell.edu/handle/1813/97

PRINT
Belbruno, Edward. Fly Me to the Moon: An Insider’s Guide to the New Science of
Space Travel. Princeton, NJ: Princeton University Press, 2007.

Chernikov, Aleksander A., Roald Z. Sagdeev, and Georgii M. Zaslavskii.
“Chaos: How Regular Can It Be?” Physics Today, vol. 41 (November 1988).

Diacu, Florin and Philip Holmes. Celestial Encounters: The Origins of Chaos and
Stability. Princeton, NJ: Princeton University Press, 1996.

Glass, Leon, Michael R. Guevara, Alvin Shrier, and Rafael Perez.
“Bifurcation and Chaos in a Periodically Stimulated Cardiac Oscillator,”
Physica, vol. 7 (1983).

Gleick, James. Chaos: Making a New Science. New York: Penguin Books, 1988.

Holland, John H. Emergence: From Chaos to Order. Reading, MA: Helix Books
(Addison-Wesley Press), 1998.

Lo, M.W. “The Interplanetary Superhighway and the Origins Program,”
IEEE Aerospace Conference Proceedings, vol. 7 (2002).

Nolasco, J.B. and R.W. Dahlen. “A Graphic Method for the Study of Alternation in
Cardiac Action Potentials,” Journal of Applied Physiology, vol. 25, no. 2 (1968).

Pikovsky, Arkady, Michael Rosenblum, and Jurgen Kurths. Synchronization:
A Universal Concept in Nonlinear Sciences. Cambridge, New York: Cambridge
University Press, 2001.

Smith, Douglas L. “Next Exit 0.5 Million Kilometers,” Engineering and Science,
vol. 65, no. 4 (2002).

Stewart, Ian. From Here to Infinity: A Guide to Today’s Mathematics. Oxford,
Great Britain: Oxford University Press, 1996.

Strogatz, Steven. Nonlinear Dynamics and Chaos: With Applications to Physics,
Biology, Chemistry and Engineering. Cambridge, MA: Perseus Books Publishing,
2000.

Marion, Jerry B. and Stephen T. Thornton. Classical Dynamics of Particles and
Systems, 4th ed. Orlando, FL: Saunders College Publishing (Harcourt Brace
College Publishers), 1995.

Wang, Q.D. “Power Series Solutions and Integral Manifold of the N-Body
Problem,” Regular and Chaotic Dynamics, vol. 6, no. 4 (2001).

Feynman, Richard. The Character of Physical Law. New York: Modern Library,
1994, p. 108.

Poincaré, Henri. New Methods of Celestial Mechanics, vol. 1. Edited and
introduced by Daniel L. Goroff. Los Angeles, CA: American Institute of Physics,
1993, pp. 110-111.
