Simone Cerreia-Vioglio
Department of Decision Sciences and IGIER, Università Bocconi
Massimo Marinacci
AXA-Bocconi Chair, Department of Decision Sciences and IGIER, Università Bocconi
Elena Vigna
Dipartimento Esomas, Università di Torino and Collegio Carlo Alberto
5 September 2016
This manuscript is a very preliminary version of a textbook that will be published by Springer
International Publishing (ISBN 978-3-319-44713-1). It is for the personal use of Bocconi students who
are attending first-year mathematics courses. We thank Gabriella Chiomio and Claudio Mattalia,
who thoroughly translated a first version of the manuscript, as well as Alexandra Fotiou, Giacomo
Lanzani and Kelly Gail Strada for excellent research assistance, and Margherita Cigola, Guido Osimo,
and Lorenzo Peccati for some very useful comments that helped us to improve the manuscript. We
are especially indebted to Pierpaolo Battigalli, Erio Castagnoli (with whom this project started),
Itzhak Gilboa, Fabio Maccheroni, Luigi Montrucchio, and David Schmeidler for the discussions that
over the years shaped our views on economics and mathematics.
Contents
I Structures
3 Linear structure
3.1 Vector subspaces of R^n
3.2 Linear independence and dependence
3.3 Linear combinations
3.4 Generated subspaces
3.5 Bases
3.6 Bases of subspaces
4 Euclidean structure
4.1 Absolute value and norm
4.1.1 Inner product
4.1.2 Absolute value
4.1.3 Norm
4.2 Orthogonality
5 Topological structure
5.1 Distances
5.2 Neighborhoods
5.3 Taxonomy of the points of R^n with respect to a set
5.3.1 Interior, exterior and boundary points
5.3.2 Limit (accumulation) points
5.4 Open and closed sets
5.5 Set-theoretical stability
5.6 Compact sets
5.7 Closure and convergence
6 Functions
6.1 The concept
6.2 Applications
6.2.1 Static choices
6.2.2 Intertemporal choices
6.3 General properties
6.3.1 Preimages and level curves
6.3.2 Algebra of functions
6.3.3 Composition
6.4 Classes of functions
6.4.1 Injective, surjective, and bijective functions
6.4.2 Inverse functions
6.4.3 Bounded functions
6.4.4 Monotonic functions
6.4.5 Concave and convex functions (preview)
6.4.6 Separable functions
6.5 Elementary functions on R
6.5.1 Polynomial functions
6.5.2 Exponential and logarithmic functions
6.5.3 Trigonometric and periodic functions
7 Cardinality
7.1 Actual infinite and potential infinite
7.2 Bijective functions and cardinality
7.3 A Pandora’s box
8 Sequences
8.1 The concept
8.2 The space of sequences
8.3 Application: intertemporal choices
8.4 Images and classes of sequences
8.5 Limits: introductory examples
8.6 Limits and asymptotic behavior
8.6.1 Convergence
8.6.2 Limits from above and from below
8.6.3 Divergence
8.6.4 Topology of R and general definition of limit
8.7 Properties of limits
8.7.1 Monotonicity and convergence
8.7.2 Heron’s method
8.7.3 The Bolzano-Weierstrass Theorem
8.8 Algebra of limits and fundamental limits
8.8.1 (Many) certainties
8.8.2 Some basic limits
8.8.3 Indeterminate forms for the limits
8.8.4 Summarizing tables
8.8.5 But how many indeterminate forms are?
8.9 Convergence criteria
8.10 The Cauchy condition
8.11 Napier’s constant
8.12 Orders of convergence and of divergence
8.12.1 Generalities
8.12.2 Little-o algebra
8.12.3 Asymptotic equivalence
8.12.4 Characterization and decay
8.12.5 Terminology
8.12.6 Scales of infinities
9 Series
9.1 The concept
9.1.1 Three classical series
9.1.2 Intertemporal utility with infinite horizon
9.2 Elementary properties
9.3 Series with positive terms
9.3.1 Comparison convergence criterion
9.3.2 Ratio convergence criterion: prelude
9.3.3 Ratio criterion
9.3.4 A first series expansion
9.4 Series with terms of any sign
9.4.1 Absolute convergence
9.4.2 Alternating series
V Optima
18 Derivatives
18.1 Definition
18.1.1 Observations
18.2 Geometric interpretation
18.3 Derivative function
18.4 Unilateral derivatives
18.5 Derivability and continuity
18.6 Derivatives of elementary functions
18.7 Algebra of derivatives
18.8 The chain rule
18.9 Derivative of inverse functions
18.10 Formulary
18.11 Differentiability and linearity
21 Approximation
21.1 Taylor’s polynomial approximation
21.1.1 Polynomial expansions
21.1.2 Taylor’s Theorem
32 Stieltjes’ integral
32.1 Definition
32.2 Integrability criteria
32.3 Calculus
32.4 Properties
32.5 Step integrators
32.6 Integration by parts
32.7 Change of variable
33 Moments
33.1 Densities
33.2 Moments
33.3 The problem of moments
IX Appendices
A Permutations
A.1 Generalities
A.2 Permutations
A.3 Anagrams
A.4 Newton’s binomial formula
I Structures
Chapter 1
Sets and numbers: an intuitive introduction
1.1 Sets
A set (or aggregate) is a collection of distinguishable objects. There are two ways to describe
a set: by listing its elements directly, or by specifying a property that its elements have in
common. The second way is more common than the first one; for instance,
$$\{11, 13, 17, 19, 23, 29\}$$
can be described as the set of the prime numbers between 10 and 30. The chairs of your
kitchen form a set of objects, the chairs, that have in common the property of being part
of your kitchen. The chairs of your bedroom form another set, just as the letters of the Latin
alphabet form a set, distinct from the set of the letters of the Greek alphabet (and from the
set of chairs or from the set of numbers considered above).
Sets are usually denoted by capital letters: $A$, $B$, $C$, and so on; their elements are denoted
by small letters: $a$, $b$, $c$, and so on. To denote that an element $a$ belongs to the set $A$ we
write
$$a \in A$$
where $\in$ is the symbol of belonging. Instead, to denote that an element $a$ does not belong
to the set $A$ we write $a \notin A$.
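The two ways of describing a set, and the membership symbol, can be tried out on a computer. What follows is only an illustrative sketch in Python, not part of the original text; the helper name `is_prime` is our own assumption.

```python
def is_prime(n: int) -> bool:
    """Return True when n is a prime number (plain trial division)."""
    if n < 2:
        return False
    return all(n % d != 0 for d in range(2, int(n ** 0.5) + 1))

# First way: list the elements directly.
listed = {11, 13, 17, 19, 23, 29}

# Second way: specify a property that the elements have in common.
by_property = {n for n in range(10, 31) if is_prime(n)}

print(listed == by_property)  # compares the two descriptions
print(13 in listed)           # membership: 13 belongs to the set
print(15 not in listed)       # non-membership: 15 does not belong to the set
```

Python's built-in `set` ignores order and repetitions, which matches the idea of a collection of distinguishable objects.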
Off the record remark (O.R.). The concept of set, apparently introduced in 1847 by
Bernhard Bolzano, is for us a primitive concept, not defined through other notions. The
situation is similar to the one we have in Euclidean geometry, in which points and lines are
primitive concepts (with an intuitive geometric meaning that readers may give them).
1.1.1 Subsets
The chairs of your bedroom are a subset of the chairs of your home: a chair that belongs to
your bedroom also belongs to your home. In general, a set $A$ is a subset of a set $B$ when all
the elements of $A$ are also elements of $B$. In this case we write $A \subseteq B$. Formally,
Definition 1 The set $A$ is a subset of the set $B$, in symbols $A \subseteq B$, if $x \in A$ implies $x \in B$.
For example, let
$$A = \{11, 13, 17, 19, 23, 29\} \tag{1.1}$$
be the set of the prime numbers between 10 and 30, and let
$$B = \{11, 13, 15, 17, 19, 21, 23, 25, 27, 29\} \tag{1.2}$$
be the set of the odd numbers between 10 and 30. We have $A \subseteq B$.
[Figure: Venn diagram of $A \subseteq B$, with the set $A$ drawn inside the set $B$.]
The figure uses a so-called Venn diagram to represent the sets $A$ and $B$ graphically: it is a
naive, yet effective, way to visualize sets.
When we have both $A \subseteq B$ and $B \subseteq A$, that is, $x \in A$ if and only if $x \in B$, the two
sets $A$ and $B$ are said to be equal; in symbols $A = B$. For example, let $A$ be the set of
the solutions of the quadratic equation $x^2 - 3x + 2 = 0$ and let $B$ be the set formed by the
numbers 1 and 2. It is easy to see that $A = B$.
When $A \subseteq B$ and $A \neq B$, we write $A \subset B$ and say that $A$ is a proper subset of $B$.
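Inclusion, proper inclusion, and equality can be checked mechanically on small sets. The following Python lines are a minimal sketch; restricting the search for solutions to the integers from -10 to 10 is our own simplifying assumption.

```python
A = {11, 13, 17, 19, 23, 29}                  # primes between 10 and 30
B = {n for n in range(10, 31) if n % 2 == 1}  # odd numbers between 10 and 30

is_subset = A <= B         # A is a subset of B
is_proper_subset = A < B   # A is a subset of B and A differs from B

# Integer solutions of x^2 - 3x + 2 = 0, searched in a small range.
solutions = {x for x in range(-10, 11) if x * x - 3 * x + 2 == 0}
equal_sets = solutions == {1, 2}
```

In Python, `<=` and `<` between sets play the roles of the inclusion and of the proper inclusion, while `==` tests the equality of sets.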
The sets $A = \{a\}$ that consist of a single element are called singletons. They are a
peculiar, but altogether legitimate, class of sets.[1]
N.B. Though the two symbols $\in$ and $\subseteq$ are conceptually well distinct and must not be
confused, there exists an interesting relation between them. Indeed, consider the set formed
by a single element $a$, that is, the singleton $\{a\}$. Through such a singleton, we can establish
the relation
$$a \in A \text{ if and only if } \{a\} \subseteq A$$
between $\in$ and $\subseteq$.
[1] Note that $a$ and $\{a\}$ are not the same thing; $a$ is an element and $\{a\}$ is a set, even if it is formed by only
one element. For instance, the set $A$ of the nations of the Earth with a flag of only one colour had (until
2011) only one element, Libya, but $A$ is not Libya itself: Tripoli is not the capital of $A$.
1.1.2 Operations
There are three basic operations among sets: union, intersection, and difference. As we will
see, they take any two given sets and, starting from them, form a new set.
The first operation that we consider is the intersection of two sets $A$ and $B$. As the
term “intersection” suggests, with this operation we select all the elements that belong
simultaneously to the sets $A$ and $B$.
Definition 2 Given two sets $A$ and $B$, their intersection $A \cap B$ is the set of all the elements
that belong both to $A$ and to $B$, that is, $x \in A \cap B$ if $x \in A$ and $x \in B$.
For example, let $A$ be the set of the left-handers and $B$ the set of the right-handers in Italy.
The intersection $A \cap B$ is the set of the ambidextrous Italians. If, instead, $A$ is the set of the
petrol cars and $B$ the set of the methane cars, the intersection $A \cap B$ is the set of the bi-fuel
cars that run on both petrol and methane.
It can happen that two sets have no elements in common. For example, let
$$C = \{10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30\} \tag{1.3}$$
be the set of the even numbers between 10 and 30. It has no elements in common with the
set $B$ in (1.2). In this case we talk of disjoint sets. Such a
notion gives us the opportunity to introduce a fundamental set.
Definition 3 The empty set, denoted by $\emptyset$, is the set without elements.
As a first use of the notion, note that two sets $A$ and $B$ are disjoint when they have
empty intersection, that is, $A \cap B = \emptyset$. For example, for the sets $B$ and $C$ in (1.2) and (1.3),
we have $B \cap C = \emptyset$.
We write $A \neq \emptyset$ when the set $A$ is not empty, that is, it contains at least one element.
Conventionally, we consider the empty set as a subset of any set, that is, $\emptyset \subseteq A$ for every set
$A$.
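The intersection, the notion of disjoint sets, and the empty set have direct counterparts in Python. The lines below are a small illustrative sketch built on the sets in (1.2) and (1.3); the variable names are our own.

```python
B = {n for n in range(10, 31) if n % 2 == 1}  # odd numbers, as in (1.2)
C = {n for n in range(10, 31) if n % 2 == 0}  # even numbers, as in (1.3)

common = B & C                    # intersection of B and C
are_disjoint = B.isdisjoint(C)    # True exactly when the intersection is empty

empty = set()                     # the empty set (note: {} would be a dict)
empty_is_subset = empty <= B      # the empty set is a subset of any set
```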
It is immediate that $A \cap B \subseteq A$ and that $A \cap B \subseteq B$. The next result is more subtle and
establishes a useful property that links $\subseteq$ and $\cap$.
Proposition 4 $A \cap B = A$ if and only if $A \subseteq B$.
Proof “If”. Let $A \subseteq B$. We want to prove that $A \cap B = A$. In order to show that two
sets are equal, we always need to prove separately the two opposite inclusions: in this case,
$A \cap B \subseteq A$ and $A \subseteq A \cap B$.
The first inclusion $A \cap B \subseteq A$ is easily proven to be true. Indeed, let $x \in A \cap B$.[2] Then,
by definition, $x$ belongs both to $A$ and to $B$. In particular, $x \in A$ and this is enough to
conclude that $A \cap B \subseteq A$.
Let us prove the second inclusion: $A \subseteq A \cap B$. Let $x \in A$. Since, by hypothesis, $A \subseteq B$,
each element of $A$ also belongs to $B$, and so $x \in B$. Hence, $x$ belongs both to $A$ and
to $B$, i.e., $x \in A \cap B$, and this proves that $A \subseteq A \cap B$.
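We have shown that both the inclusions $A \cap B \subseteq A$ and $A \subseteq A \cap B$ hold; we can therefore
conclude that $A \cap B = A$, which completes the proof of the “If” part.
“Only if”. Let $A \cap B = A$. If $x \in A$, then $x \in A \cap B$ and therefore $x \in B$. Hence $A \subseteq B$.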
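The link between inclusion and intersection just proved can also be tested by brute force on a small universe. This Python sketch is only a sanity check, not a proof; the helper `subsets` and the universe {1, 2, 3, 4} are our own assumptions.

```python
from itertools import combinations

def subsets(universe):
    """Return all the subsets of a finite collection, as frozensets."""
    items = list(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

all_sets = subsets({1, 2, 3, 4})

# For every pair (A, B): the intersection equals A exactly when A is a subset of B.
equivalence_holds = all(((A & B) == A) == (A <= B)
                        for A in all_sets
                        for B in all_sets)
```

A check of this kind can only find counterexamples among the listed sets; the general statement rests on the proof above.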
The next operation we consider is the union. Here again the term “union” already
suggests how in this operation all the elements of both sets are collected together.
Definition 5 Given two sets $A$ and $B$, their union $A \cup B$ is the set of all the elements that
belong to $A$ or to $B$, that is, $x \in A \cup B$ if $x \in A$ or $x \in B$.[3]
Note that an element can belong to both sets (unless the sets are disjoint). For example,
if A is again the set of the left-handers and B is the set of the right-handers in Italy, the
union set contains all the Italians with at least one hand, and there are individuals (the
ambidexters) who belong to both sets.
It is immediate to see that $A \subseteq A \cup B$ and that $B \subseteq A \cup B$. It then follows that
$$A \cap B \subseteq A \cup B$$
[2] In proving an inclusion between sets, say $C \subseteq D$, throughout the book we will tacitly assume that $C \neq \emptyset$
since the inclusion is trivially true when $C = \emptyset$. For this reason our inclusion proof will show that $x \in C$ (i.e.,
$C \neq \emptyset$) implies $x \in D$.
[3] The conjunction “or” has the inclusive sense of the Latin “vel” ($x$ belongs to $A$ or to $B$ or to both) and
not the exclusive sense of “aut” ($x$ belongs either to $A$ or to $B$, but not to both). Indeed, Giuseppe Peano
gave the symbol $\cup$ the meaning “vel” when he first introduced it, along with the intersection symbol $\cap$ and
the membership symbol $\in$, which he interpreted as the Latin “et” and “est”, respectively (see the “signorum
tabula” in his 1889 work Arithmetices principia, nova methodo exposita, a seminal work on the foundations
of mathematics).
[Figure: Venn diagram of the union $A \cup B$ of the sets $A$ and $B$.]
Definition 6 Given two sets $A$ and $B$, their difference $A - B$ is the set of all the elements
that belong to $A$, but not to $B$, that is, $x \in A - B$ if both $x \in A$ and $x \notin B$.
The difference set[4] $A - B$ is therefore obtained by eliminating from $A$ all the elements
that belong (also) to $B$. Graphically:
[Figure: Venn diagram of the difference $A - B$ of the sets $A$ and $B$.]
For example, let us go back to the sets $A$ and $B$ identified in (1.1) and (1.2). Then,
$$B - A = \{15, 21, 25, 27\}$$
that is, $B - A$ is the set of the non-prime odd numbers between 10 and 30. Note that: (i)
when $A$ and $B$ are disjoint, we have $A - B = A$ and $B - A = B$; (ii) $A \subseteq B$ is equivalent
to $A - B = \emptyset$ since, by removing from $A$ all the elements that belong also to $B$, the set $A$ is
deprived of all its elements, that is, we are left with the empty set.
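The difference of the sets in (1.1), (1.2), and (1.3) can be computed directly. The Python lines below are a minimal sketch; the variable names are our own.

```python
A = {11, 13, 17, 19, 23, 29}                      # set (1.1)
B = {11, 13, 15, 17, 19, 21, 23, 25, 27, 29}      # set (1.2)
C = {10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30}  # set (1.3)

b_minus_a = B - A   # odd numbers between 10 and 30 that are not prime
a_minus_b = A - B   # nothing is left, because A is a subset of B
b_minus_c = B - C   # B and C are disjoint, so nothing is removed from B
```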
In many applications there is a general set of reference, an all-inclusive set, of which
various subsets are considered. For example, for demographers this set can be the entire
population of a country, of which they can consider various subsets according to the demographic
properties that are of interest (for instance, age is a common demographic variable
through which the population can be subdivided into subsets).
[4] The set difference $A - B$ is often denoted by $A \setminus B$.
The general set of reference is called universal set or, more commonly, space. There is no
standard notation for this set (which is often clear from the context), which we denote
temporarily by $S$. Given any of its subsets $A$, the difference $S - A$ is denoted by $A^c$ and
is called the complement set, or simply the complement, of $A$. The difference operation is
called complementation when it involves the universal set.
Example 7 If $S$ is the set of all citizens of a country and $A$ is the set of all citizens that are
at least 65 years old, the complement $A^c$ consists of all citizens that are (strictly) less
than 65 years old.
Proposition 8 For every subset $A$ of $S$, we have $(A^c)^c = A$.
Proof Since we have to verify an equality between sets (as in the proof of Proposition 4),
we have to consider separately the two inclusions $(A^c)^c \subseteq A$ and $A \subseteq (A^c)^c$.
If $a \in (A^c)^c$, then $a \notin A^c$ and therefore $a \in A$. It follows that $(A^c)^c \subseteq A$.
Vice versa, if $a \in A$, then $a \notin A^c$ and therefore $a \in (A^c)^c$; hence $A \subseteq (A^c)^c$.
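Complementation is simply a difference from the space, so it is easy to experiment with. In this Python sketch the space of ages from 0 to 99 and the function name `complement` are our own illustrative assumptions.

```python
S = set(range(0, 100))               # toy space: ages from 0 to 99 (assumption)
A = {age for age in S if age >= 65}  # ages of at least 65 years

def complement(subset, space):
    """Complement of `subset` with respect to `space`, i.e. space - subset."""
    return space - subset

A_c = complement(A, S)               # ages strictly below 65
A_cc = complement(A_c, S)            # complement of the complement
```

Taking the complement twice returns the set we started from, as in the equality just proved.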
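Proposition 9 The operations of union and intersection are:
(i) commutative, that is, for any two sets $A$ and $B$, we have $A \cap B = B \cap A$ and $A \cup B =
B \cup A$;
(ii) associative, that is, for any three sets $A$, $B$, and $C$, we have $A \cup (B \cup C) = (A \cup B) \cup C$
and $A \cap (B \cap C) = (A \cap B) \cap C$.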
We leave to the reader the simple proof. Property (ii) allows us to write $A \cup B \cup C$
and $A \cap B \cap C$ and, therefore, to extend without ambiguity the operations of union and
intersection to an arbitrary (finite) number of sets:
$$\bigcup_{i=1}^{n} A_i \quad \text{and} \quad \bigcap_{i=1}^{n} A_i$$
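It is possible to extend such operations also to infinitely many sets. If $A_1, A_2, \dots, A_n, \dots$ is an
infinite collection of sets, the union
$$\bigcup_{n=1}^{\infty} A_n$$
is the set of the elements that belong to at least one of the $A_n$, that is,
$$\bigcup_{n=1}^{\infty} A_n = \{a : a \in A_n \text{ for at least one index } n\}$$
The intersection
$$\bigcap_{n=1}^{\infty} A_n$$
is the set of the elements that belong to every $A_n$, that is,
$$\bigcap_{n=1}^{\infty} A_n = \{a : a \in A_n \text{ for every index } n\}$$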
Example 10 Let $A_n$ be the set of the non-negative even numbers $\leq n$. For example, $A_3 = \{0, 2\}$ and
$A_6 = \{0, 2, 4, 6\}$. We have $\bigcap_{n=1}^{\infty} A_n = \{0\}$, since 0 is the only number such that $0 \in A_n$
for each $n \geq 1$. Moreover, $\bigcup_{n=1}^{\infty} A_n = \{2n : n = 0, 1, 2, \dots\}$, that is, $\bigcup_{n=1}^{\infty} A_n$ is the set
of all non-negative even numbers.
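A computer can only handle finitely many of the sets $A_n$, but a truncated version of Example 10 already shows the pattern. In this Python sketch the cut-off `N_MAX` and the function name `A` are our own assumptions.

```python
from functools import reduce

def A(n: int) -> set:
    """The set A_n: even numbers between 0 and n (both included)."""
    return set(range(0, n + 1, 2))

N_MAX = 20  # hypothetical cut-off: only the first N_MAX sets are considered
family = [A(n) for n in range(1, N_MAX + 1)]

intersection = reduce(set.intersection, family)  # elements common to all the sets
union = reduce(set.union, family)                # elements of at least one set
```

The truncated intersection already consists of 0 alone, while the truncated union contains the even numbers up to `N_MAX` and keeps growing as the cut-off increases.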
We turn to the relations between the operations of intersection and union. Note the
symmetry between properties (1.4) and (1.5), in which $\cap$ and $\cup$ are exchanged.
Proposition 11 The operations of union and intersection are distributive, that is, given
any three sets $A$, $B$, and $C$, we have
$$A \cap (B \cup C) = (A \cap B) \cup (A \cap C) \tag{1.4}$$
and
$$A \cup (B \cap C) = (A \cup B) \cap (A \cup C) \tag{1.5}$$
Proof We prove only (1.4). We have to consider separately the two inclusions $A \cap (B \cup C) \subseteq
(A \cap B) \cup (A \cap C)$ and $(A \cap B) \cup (A \cap C) \subseteq A \cap (B \cup C)$.
If $x \in A \cap (B \cup C)$, then $x \in A$ and $x \in B \cup C$, that is, (i) $x \in A$ and (ii) $x \in B$ or
$x \in C$. It follows that $x \in A \cap B$ or $x \in A \cap C$, i.e., $x \in (A \cap B) \cup (A \cap C)$, and therefore
$A \cap (B \cup C) \subseteq (A \cap B) \cup (A \cap C)$.
Vice versa, if $x \in (A \cap B) \cup (A \cap C)$, then $x \in A \cap B$ or $x \in A \cap C$, that is, $x$ belongs
to $A$ and to at least one of $B$ and $C$ and therefore $x \in A \cap (B \cup C)$. It follows that
$(A \cap B) \cup (A \cap C) \subseteq A \cap (B \cup C)$.
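Both distributive laws, including (1.5) whose proof is not given, can be checked exhaustively on a small universe. This Python sketch is a sanity check rather than a proof; the helper `subsets` and the universe {1, 2, 3} are our own assumptions.

```python
from itertools import combinations, product

def subsets(universe):
    """Return all the subsets of a finite collection, as frozensets."""
    items = list(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

family = subsets({1, 2, 3})

# Law (1.4): intersection distributes over union.
law_1_4 = all((A & (B | C)) == ((A & B) | (A & C))
              for A, B, C in product(family, repeat=3))

# Law (1.5): union distributes over intersection.
law_1_5 = all((A | (B & C)) == ((A | B) & (A | C))
              for A, B, C in product(family, repeat=3))
```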
Definition 12 A family
$$\{A_1, A_2, \dots, A_n\} = \{A_i\}_{i=1}^{n}$$
of subsets of a set $A$ is a partition of $A$ if the subsets are pairwise disjoint, that is, $A_i \cap A_j = \emptyset$
for every $i \neq j$, and if their union coincides with $A$, that is, $\bigcup_{i=1}^{n} A_i = A$.
Example 13 Let $A$ be the set of all citizens of a country. Its subsets $A_1$, $A_2$, and $A_3$
formed, respectively, by the citizens of school or pre-school age (from 0 to 17 years old), by
the citizens of working age (from 18 to 64 years old) and by the elders (from 65 years old
on) constitute a partition of the set $A$.
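The two requirements in the definition of partition (pairwise disjoint subsets whose union is the whole set) translate into a short test. In this Python sketch citizens are represented only by their age, from 0 to 100, which is our own simplification, as is the function name `is_partition`.

```python
def is_partition(parts, whole) -> bool:
    """True when the sets in `parts` are pairwise disjoint and their union is `whole`."""
    parts = [set(p) for p in parts]
    pairwise_disjoint = all(parts[i].isdisjoint(parts[j])
                            for i in range(len(parts))
                            for j in range(i + 1, len(parts)))
    covers_whole = set().union(*parts) == set(whole)
    return pairwise_disjoint and covers_whole

ages = set(range(0, 101))       # hypothetical ages 0..100
young = set(range(0, 18))       # from 0 to 17 years old
working = set(range(18, 65))    # from 18 to 64 years old
elders = set(range(65, 101))    # from 65 years old on

good = is_partition([young, working, elders], ages)
# If the working-age class also included age 65, it would overlap the elders.
bad = is_partition([young, set(range(18, 66)), elders], ages)
```

The second call shows why the age classes must not share a boundary value: a common age makes two classes overlap, so they no longer form a partition.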
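We conclude with the so-called De Morgan’s laws for complementation: they illustrate
the relationship between the operations of intersection, union, and complementation.
Proposition 14 Given two subsets $A$ and $B$ of a space $S$, we have $(A \cup B)^c = A^c \cap B^c$ and
$(A \cap B)^c = A^c \cup B^c$.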
Proof We prove only the first law, leaving the second one to the reader. As usual, in order
to prove an equality between sets, we have to consider separately the two inclusions that
compose it. (i) $(A \cup B)^c \subseteq A^c \cap B^c$. If $x \in (A \cup B)^c$, then $x \notin A \cup B$, that is, $x$ belongs
neither to $A$ nor to $B$. It follows that $x$ belongs simultaneously to $A^c$ and to $B^c$ and,
therefore, to their intersection. (ii) $A^c \cap B^c \subseteq (A \cup B)^c$. If $x \in A^c \cap B^c$, then $x \notin A$ and
$x \notin B$; therefore, $x$ does not belong to their union, that is, $x \in (A \cup B)^c$.
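De Morgan's laws, including the second one left to the reader, can be verified on a concrete example. The space and the two subsets in this Python sketch are arbitrary choices of ours.

```python
S = set(range(10))   # small space (assumption)
A = {0, 1, 2, 3}
B = {2, 3, 4, 5}

def c(subset):
    """Complement with respect to the space S."""
    return S - subset

first_law = c(A | B) == (c(A) & c(B))    # complement of the union
second_law = c(A & B) == (c(A) | c(B))   # complement of the intersection
```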
De Morgan’s laws show that, when considering complements, the operations $\cup$ and $\cap$ are
essentially interchangeable. Often these laws are written in the equivalent form
$$A \cup B = (A^c \cap B^c)^c \quad \text{and} \quad A \cap B = (A^c \cup B^c)^c$$
[...] would require an ad hoc, highly non-trivial, course). But it is important to be aware of these
paradoxes because the methods that have been developed to address them have affected the
practice of mathematics, as well as that of the empirical sciences.
1.2 Numbers
To quantify the variables of interest in economic applications (for example, the prices and
quantities of goods traded in some market) we need an adequate set of numbers. This is the
topic of the present section.
The natural numbers
$$0, 1, 2, 3, \dots$$
do not need any introduction; their set will be denoted by the symbol $\mathbb{N}$.
The set $\mathbb{N}$ of natural numbers is closed with respect to the fundamental operations of
addition and multiplication:
(i) $m + n \in \mathbb{N}$ when $m, n \in \mathbb{N}$;
(ii) $m \cdot n \in \mathbb{N}$ when $m, n \in \mathbb{N}$.
On the contrary, $\mathbb{N}$ is not closed with respect to the fundamental operations of subtraction
and division: for example, neither $5 - 6$ nor $5/6$ is a natural number. It is therefore clear
that $\mathbb{N}$ is inadequate as a set of numbers to quantify all economic quantities: the budget of
a company is a first obvious example in which closure with respect to subtraction is
crucial (otherwise, how can we quantify losses?).
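The integer numbers[5]
$$\dots, -3, -2, -1, 0, 1, 2, 3, \dots$$
form a first extension, denoted by the symbol $\mathbb{Z}$, of the set $\mathbb{N}$. It leads to a set that is closed
with respect to addition and multiplication, as well as to subtraction. Indeed, by setting
$m - n = m + (-n)$,[6] we have
(i) $m - n \in \mathbb{Z}$ when $m, n \in \mathbb{Z}$;
(ii) $m \cdot n \in \mathbb{Z}$ when $m, n \in \mathbb{Z}$.
In terms of $\mathbb{N}$, the set $\mathbb{Z}$ can be written as
$$\mathbb{Z} = \{m - n : m, n \in \mathbb{N}\}$$
Proposition 15 $\mathbb{N} \subseteq \mathbb{Z}$.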
[5] In ancient India positive numbers and negative numbers were distinguished by writing them, respectively,
in red and in black. This convention is in contrast to the one banks follow, according to which a checking
account with negative balance is “in the red”.
[6] The difference $m - n$ is simply the sum of $m$ with the negative $-n$ of $n$. Concerning this aspect, recall
the notion of algebraic sum.
We are left with a fundamental operation with respect to which $\mathbb{Z}$ is not closed: division.
For example, $1/3$ is not an integer number. To remedy this important shortcoming of the
integers (if we want to divide 1 cake among 3 guests, how can we quantify their portions
if only $\mathbb{Z}$ is available?), we need a further enlargement to the set of the rational numbers,
denoted by the symbol $\mathbb{Q}$, and given by
$$\mathbb{Q} = \left\{ \frac{m}{n} : m, n \in \mathbb{Z} \text{ with } n \neq 0 \right\}$$
In other words, the set of the rational numbers consists of all the fractions with integer
numbers in the numerator and in the denominator (not equal to zero).
Proposition 16 $\mathbb{Z} \subseteq \mathbb{Q}$.
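The set of rational numbers is closed with respect to all the four fundamental operations:[7]
(i) $m \pm n \in \mathbb{Q}$ when $m, n \in \mathbb{Q}$;
(ii) $m \cdot n \in \mathbb{Q}$ when $m, n \in \mathbb{Q}$;
(iii) $m / n \in \mathbb{Q}$ when $m, n \in \mathbb{Q}$ with $n \neq 0$.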
O.R. Each rational number that is not periodic, that is, that has a finite number of decimals,
has two decimal representations. For example, $1 = 0.\overline{9}$ because
$$0.\overline{9} = 3 \cdot 0.\overline{3} = 3 \cdot \frac{1}{3} = 1$$
In an analogous way, $2.5 = 2.4\overline{9}$, $51.2 = 51.1\overline{9}$, and so on. On the contrary, periodic rational
numbers and irrational numbers have a unique decimal representation (which is infinite).
This is not a simple curiosity: if $0.\overline{9}$ were not equal to 1, we could state that $0.\overline{9}$ is the
number that immediately precedes 1 (without any other number in between), which would
violate a notable property that we will discuss shortly.
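The equality between 1 and the periodic decimal with infinitely many nines can be made plausible with exact rational arithmetic. This Python sketch uses the standard `fractions` module; the function name `truncated_nines` is our own. It only looks at finitely many nines, so it illustrates the claim without proving it.

```python
from fractions import Fraction

def truncated_nines(n: int) -> Fraction:
    """Exact value of the decimal 0.99...9 with n nines after the point."""
    return sum(Fraction(9, 10 ** k) for k in range(1, n + 1))

# Distance from 1 of the truncations with 1, 5, and 10 nines.
gaps = [1 - truncated_nines(n) for n in (1, 5, 10)]

# The computation used in the remark: three times one third is exactly 1.
three_thirds = 3 * Fraction(1, 3)
```

With $n$ nines the distance from 1 is exactly $10^{-n}$, so no positive number can separate the periodic decimal from 1.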
The set of rational numbers seems, therefore, to be equipped with everything that can be useful.
Some simple observations on multiplication, however, will bring us some surprising
findings. If $q$ is a rational number, as is well known, the notation $q^n$, with $n \geq 1$, means
$$\underbrace{q \cdot q \cdots q}_{n \text{ times}}$$
We agree that $q^0 = 1$ for every $q \neq 0$. By itself the notation $q^n$, called power of base $q$
and exponent $n$, is just a simple way to write more compactly the repeated multiplication
of the same factor. Nevertheless, given a rational $q > 0$, it is natural to consider the inverse
path, that is, to determine the positive “number”, denoted by $q^{\frac{1}{n}}$ (sometimes by $q^{1/n}$) or,
equivalently, by $\sqrt[n]{q}$, and called root of order $n$ of $q$, such that
$$\left( q^{\frac{1}{n}} \right)^n = q$$
[7] The names of the four fundamental operations are addition, subtraction, multiplication, and division,
while the names of their results are, respectively, sum, difference, product, and quotient (the addition of 3
and 4 has 7 as sum, and so on).
For example, $\sqrt{25} = 5$ as $5^2 = 25$. To understand the importance of roots, we can consider
the following simple geometric figure:
[Figure: right triangle with both legs of unit length.]
By Pythagoras’ Theorem, the length of the hypotenuse is $\sqrt{2}$. To quantify elementary
geometric entities, we thus need square roots. Here we have a surprise, tragic to some.
Theorem 17 $\sqrt{2} \notin \mathbb{Q}$.
Proof Suppose, by contradiction, that $\sqrt{2} \in \mathbb{Q}$. Then there exist $m, n \in \mathbb{Z}$ such that
$m/n = \sqrt{2}$, and therefore
$$\left( \frac{m}{n} \right)^2 = 2 \tag{1.6}$$
We can assume that $m/n$ is already reduced to its lowest terms, i.e., that $m$ and $n$ have no
factors in common. This means that $m$ and $n$ cannot both be even numbers (otherwise, 2
would be a common factor).
Formula (1.6) implies
$$m^2 = 2n^2 \tag{1.7}$$
and therefore $m^2$ is even. As the square of an odd number is odd, $m$ is also even (otherwise,
if $m$ were odd, $m^2$ would also be odd). Therefore, there exists an integer $k \neq 0$ such that
$$m = 2k \tag{1.8}$$
From (1.7) and (1.8) it follows that $4k^2 = 2n^2$, that is, $n^2 = 2k^2$.
Therefore $n^2$ is even, and so $n$ itself is even. In conclusion, both $m$ and $n$ are even, but this
contradicts the fact that $m/n$ is reduced to its lowest terms. This contradiction proves that
$\sqrt{2} \notin \mathbb{Q}$.
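The theorem says that the equation $m^2 = 2n^2$ has no solution in integers with $n \neq 0$. A finite search cannot prove this, but it can illustrate it. In the Python sketch below, the function name `has_rational_sqrt2` and the bound on the denominators are our own assumptions.

```python
from math import isqrt

def has_rational_sqrt2(max_n: int) -> bool:
    """Look for integers m and n, with 1 <= n <= max_n, such that m*m == 2*n*n."""
    for n in range(1, max_n + 1):
        m = isqrt(2 * n * n)        # largest integer m with m*m <= 2*n*n
        if m * m == 2 * n * n:      # would mean that m/n is a square root of 2
            return True
    return False

found = has_rational_sqrt2(10_000)
```

For each denominator it is enough to test the integer part of the square root of $2n^2$, since it is the only candidate numerator; by the theorem, the search fails for every bound.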
This magnificent result is one of the great theorems of Greek mathematics. Proved by
the Pythagorean school between the sixth and the fifth century B.C., it was a turning point in
the history of mathematics. Leaving aside the philosophical aspects, from the mathematical
point of view it shows the need for a further enlargement of the set of numbers in order to
quantify basic geometric entities (as well as basic economic quantities, as will become clear in
the sequel).
To introduce, at an intuitive level, this final enlargement,[11] consider the classical real
line:
[Figure: the real line.]
It is easy to see how on this line we can represent the rational numbers:
[Figure: rational numbers represented on the real line.]
The rational numbers do not exhaust, however, the real line. For example, also roots like
$\sqrt{2}$, or other non-rational numbers, such as $\pi$, must find their representation on the real
line:[12]
[Figure: $\sqrt{2}$ and $\pi$ represented on the real line.]
We denote by $\mathbb{R}$ the set of all the numbers that can be represented on the real line; they are
called real numbers.
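The set $\mathbb{R}$ has the following properties in terms of the fundamental operations (here $a$, $b$,
and $c$ are generic real numbers):
(i) $a + b \in \mathbb{R}$ and $a \cdot b \in \mathbb{R}$;
(ii) $a + b = b + a$ and $a \cdot b = b \cdot a$;
(iii) $a + (b + c) = (a + b) + c$ and $a \cdot (b \cdot c) = (a \cdot b) \cdot c$;
(iv) $a + 0 = a$ and $b \cdot 1 = b$;
(v) $a + (-a) = 0$ and $b \cdot b^{-1} = 1$ provided $b \neq 0$;
(vi) $a \cdot (b + c) = a \cdot b + a \cdot c$.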
[11] For a rigorous treatment we refer, for example, to the first chapter of W. Rudin, Principles of mathematical
analysis, McGraw-Hill, 1976.
[12] Though intuitive, it is actually a postulate (of continuity of the real line).
Clearly, $\mathbb{Q} \subseteq \mathbb{R}$; but $\mathbb{Q} \neq \mathbb{R}$: there are many real numbers, called irrationals, that are
not rational. Many roots and the numbers $\pi$ and $e$ are examples of irrational numbers. It
is actually possible to prove that most real numbers are irrational. Although a rigorous
treatment of this topic would take us too far, the next simple result is already a clear
indication of how rich the set of the irrational numbers is.
Proposition 18 Given any two rational numbers $a < b$, there exists an irrational number
$c \in \mathbb{R}$ such that $a < c < b$.
In conclusion, R is the set of numbers that we will consider in the rest of the book. It
turns out to be adequate for most economic applications.13
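One way to see why Proposition 18 holds is to exhibit an explicit candidate. The following is only a sketch, assuming the irrationality of $\sqrt{2}$ proved earlier; the particular choice of $c$ is an illustrative assumption and is not taken from the text.

```latex
% Given rational numbers a < b, set
c = a + \frac{b - a}{\sqrt{2}} .
% Since 0 < 1/\sqrt{2} < 1, we have a < c < b.
% If c were rational, then
\sqrt{2} = \frac{b - a}{c - a}
% would be a ratio of two rational numbers, hence rational: a contradiction.
% Therefore c is irrational.
```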
Example 19 The integer 6 is divisible by the integer 2, that is $2 \mid 6$, as the integer 3 is such
that $6 = 2 \cdot 3$. Furthermore, 6 is divisible by 3, that is $3 \mid 6$, as the integer 2 is such
that $6 = 2 \cdot 3$. ▲
13 An important further enlargement, which we do not consider, is the set $\mathbb{C}$ of complex numbers.
16 CHAPTER 1. SETS AND NUMBERS: AN INTUITIVE INTRODUCTION
The reader may have learned in elementary school how to divide two integers by using
remainders and quotients. For example, if $n = 7$ and $m = 2$, we have $n = 3 \cdot 2 + 1$, with 3 as
the quotient and 1 as the remainder. The next simple result formalizes the above procedure
and shows that it holds for any pair of integers (something that young learners take for
granted, but from now on we will take nothing for granted).
Proposition 20 Given any two integers $m$ and $n$, with $m$ strictly positive,14 there is one
and only one pair of integers $q$ and $r$ such that
$$n = qm + r$$
with $0 \leq r < m$.
Proof Two distinct properties are stated in the proposition: the existence of the pair $(q, r)$,
and its uniqueness. Let us start by proving its existence. We will only consider the case
in which $n \geq 0$ (you need only change the sign if $n < 0$). Consider the set
$A = \{p \in \mathbb{N} : p \leq n/m\}$. Since $n \geq 0$, $A$ is non-empty, as it contains at least the integer zero. Let
$q$ be the largest element of $A$. By definition, $qm \leq n < (q + 1) m$. Setting $r = n - qm$, we
have
$$0 \leq n - qm = r < (q + 1) m - qm = m$$
We have thus shown the existence of the desired pair $(q, r)$.
Let us now consider uniqueness. By contradiction, let $(q', r')$ and $(q'', r'')$ be two different
pairs such that
$$n = q' m + r' = q'' m + r'' \qquad (1.9)$$
with $0 \leq r', r'' < m$. Since $(q', r')$ and $(q'', r'')$ are different we have either $q' \neq q''$ or $r' \neq r''$
or both. If $q' \neq q''$, without loss of generality, we can suppose that $q' < q''$; that is,
$$q' + 1 \leq q'' \qquad (1.10)$$
since $q'$ and $q''$ are integers. It follows from (1.9) that $(q'' - q') m = r' - r''$. Since
$(q'' - q') m \geq 0$, we have that $0 \leq r' - r'' < m$. Hence,
$$(q'' - q') m = r' - r'' < m$$
which implies that $q'' - q' < 1$, that is, $q'' < q' + 1$, which contradicts (1.10). We can
conclude that, necessarily, $q' = q''$. This leaves open only the possibility that $r' \neq r''$. But,
since $q' = q''$, we have that
$$0 = (q'' - q') m = r' - r'' \neq 0,$$
a contradiction. Hence, the assumption of having two different pairs $(q', r')$ and $(q'', r'')$ is
false.
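The existence part of Proposition 20 can be tried out numerically. The following is a minimal sketch in Python; the function name `quotient_remainder` is a hypothetical choice of ours, and the sketch relies on Python's floor division, which returns the largest integer q with qm ≤ n also when n is negative.

```python
def quotient_remainder(n: int, m: int) -> tuple[int, int]:
    """Return the pair (q, r) with n = q*m + r and 0 <= r < m, for m > 0."""
    if m <= 0:
        raise ValueError("m must be strictly positive")
    q = n // m          # floor division: largest integer q with q*m <= n
    r = n - q * m       # the remainder, as in the proof
    return q, r


# The example from the text: 7 = 3*2 + 1
print(quotient_remainder(7, 2))
# A negative dividend still gives a remainder in [0, m)
print(quotient_remainder(-7, 2))
```

Note that the remainder is kept in the range from 0 to m - 1 even for negative n, in line with the statement of the proposition.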
14 An integer $m$ is said to be strictly positive if $m > 0$, that is, $m \geq 1$.
Theorem 21 (Euclid) Any pair of strictly positive integers has one and only one greatest
common divisor.
Proof Like Proposition 20, this is also an existence and uniqueness result. Uniqueness is
obvious; let us prove existence. Let $m$ and $n$ be any two strictly positive integers. By
Proposition 20, there is a unique pair $(q_1, r_1)$ such that
$$n = q_1 m + r_1 \qquad (1.11)$$
with $0 \leq r_1 < m$. If $r_1 = 0$, then $\gcd(m, n) = m$, and the proof is concluded. If $r_1 > 0$, we
iterate the procedure by applying Proposition 20 to $m$. We thus have a unique pair $(q_2, r_2)$
such that
$$m = q_2 r_1 + r_2 \qquad (1.12)$$
where $0 \leq r_2 < r_1$. If $r_2 = 0$, then $\gcd(m, n) = r_1$. Indeed, (1.12) implies $r_1 \mid m$. Furthermore,
by (1.11) and (1.12), we have that
$$\frac{n}{r_1} = \frac{q_1 m + r_1}{r_1} = \frac{q_1 q_2 r_1 + r_1}{r_1} = q_1 q_2 + 1$$
and so $r_1 \mid n$. Thus $r_1$ is a divisor both for $n$ and $m$. We now need to show that it is the
greatest of those divisors. Suppose $p$ is a strictly positive integer such that $p \mid m$ and $p \mid n$.
By definition, there are two strictly positive integers $a$ and $b$ such that $n = ap$ and $m = bp$.
We have that
$$0 < \frac{r_1}{p} = \frac{n - q_1 m}{p} = a - q_1 b$$
Hence $r_1/p$ is a strictly positive integer, which implies that $r_1 \geq p$. To sum up, $\gcd(m, n) = r_1$
if $r_2 = 0$. If this is the case, the proof is concluded.
If $r_2 > 0$, we iterate the procedure once more by applying Proposition 20 to $r_2$. We thus
have a unique pair $(q_3, r_3)$ such that
$$r_1 = q_3 r_2 + r_3$$
determines with a finite number of iterations the mathematical entity whose existence is
stated (here, the greatest common divisor). The notion of algorithm is of paramount
importance because, when an algorithm is available, it makes mathematical entities computable.
In principle an algorithm can be automated by means of an appropriate computer program
(for example, Euclid's Algorithm allows us to automate the search for the greatest common
divisors).
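As an illustration of how such an automation might look, here is a minimal sketch of Euclid's Algorithm in Python. The function name `euclid_gcd` is a hypothetical choice of ours; the loop simply repeats the division with remainder of Proposition 20, replacing the pair (n, m) by (m, r) until the remainder is zero.

```python
def euclid_gcd(m: int, n: int) -> int:
    """Greatest common divisor of two strictly positive integers."""
    if m <= 0 or n <= 0:
        raise ValueError("both integers must be strictly positive")
    while m > 0:
        # n = q*m + r with 0 <= r < m; continue with the pair (m, r)
        n, m = m, n % m
    return n  # the last non-zero remainder


print(euclid_gcd(1708, 3801))
```

Since the remainders strictly decrease and are non-negative, the loop necessarily stops after finitely many iterations.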
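A natural number which is not prime is called composite. Let us denote the set of prime
numbers by $\mathbb{P}$. Obviously, $\mathbb{P} \subseteq \mathbb{N}$ and $\mathbb{N} \setminus \mathbb{P}$ is the set of composite numbers. The reader can
easily verify the following factorizations of natural numbers:
$$12 = 2^2 \cdot 3$$
$$60 = 2^2 \cdot 3 \cdot 5$$
$$522 = 2 \cdot 3^2 \cdot 29$$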
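The three factorizations above can also be checked mechanically. The sketch below uses plain trial division, which is our own illustrative choice and is practical only for small numbers; the function name `prime_factorization` is hypothetical.

```python
def prime_factorization(n: int) -> dict[int, int]:
    """Map each prime factor of n > 1 to its exponent, by trial division."""
    if n < 2:
        raise ValueError("n must be greater than 1")
    factors: dict[int, int] = {}
    p = 2
    while p * p <= n:
        while n % p == 0:          # divide out p as many times as possible
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:                      # what is left is itself a prime factor
        factors[n] = factors.get(n, 0) + 1
    return factors


for number in (12, 60, 522):
    print(number, prime_factorization(number))
```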
What we have just seen raises two questions: whether every natural number admits a prime
factorization (we have only seen a few specific examples up to now) and whether such factorization
is unique. The next result, the Fundamental Theorem of Arithmetic, resolves both matters
by showing that every natural number greater than 1 admits one and only one prime factorization. In other words,
every such number can be expressed uniquely as a product of prime numbers.
Prime numbers are thus the “atoms” of $\mathbb{N}$: they are “indivisible” (as they are divisible
only by 1 and themselves) and by means of them any other natural number can be expressed
uniquely. The importance of this result, which shows the centrality of prime numbers, can
be seen in its name. Its first proof can be found in the famous Disquisitiones Arithmeticae,
published in 1801 by Carl Friedrich Gauss, although Euclid was already aware of the result
in its essence.
Proof Let us start by showing the existence of this factorization. We will proceed by
contradiction. Suppose there are natural numbers that do not have a prime factorization
as in (1.13). Let $n > 1$ be the smallest among them. Obviously, $n$ is a composite number.
There are then two natural numbers $p$ and $q$ such that $n = pq$ with $1 < p, q < n$. Since $n$
is the smallest number that does not admit a prime factorization, the numbers $p$ and $q$ do
admit such factorization. In particular, we can write
$$p = p_1^{n_1} p_2^{n_2} \cdots p_k^{n_k} \quad \text{and} \quad q = q_1^{n'_1} q_2^{n'_2} \cdots q_s^{n'_s}$$
Since $q_1$ is a divisor of $n$, it must be a divisor of at least one of the factors $p_1 < \cdots < p_k$.15
For example, let $p_1$ be one such factor. Since both $q_1$ and $p_1$ are primes, we have that $q_1 = p_1$.
Hence
$$p_1^{n_1 - 1} p_2^{n_2} \cdots p_k^{n_k} = q_1^{n'_1 - 1} q_2^{n'_2} \cdots q_s^{n'_s} < n$$
which contradicts the minimality of $n$, as the number $p_1^{n_1 - 1} p_2^{n_2} \cdots p_k^{n_k}$ also admits multiple
factorizations. The contradiction proves the uniqueness of the prime factorization.
From a methodological viewpoint it must be noted that this proof of existence is carried
out by contradiction and, as such, cannot be constructive. Indeed, such proofs are based on
the law of excluded middle (a property is true if and only if it is not false) and the truth
of a statement is established by showing its non-falseness. This often allows for such proofs
to be short and elegant but, although logically air-tight,16 they are almost metaphysical as
they do not provide a procedure for constructing the mathematical entities whose existence
they establish. In other words, they do not provide an algorithm with which such entities
can be determined.
15 This mathematical fact, although intuitive, requires a mathematical proof. This is indeed the content of
Euclid's Lemma, which we do not prove. This lemma allows us to conclude that if a prime $p$ divides a product
of strictly positive integers, then it must divide at least one of them.
16 Unless one rejects the law of excluded middle, as many eminent mathematicians have done (although it
constitutes a minority view and a very subtle methodological issue, the analysis of which is surely premature).
To sum up, we invite the reader to compare this proof of existence with the constructive
one provided for Theorem 21. This comparison should clarify the differences between the two
fundamental types of proofs of existence, constructive/direct and non-constructive/indirect.
It is not a coincidence that the proof of the existence in the Fundamental Theorem of
Arithmetic is not constructive. Indeed, designing algorithms which allow us to factorize
a natural number $n$ into prime numbers (the so-called factorization tests) is exceedingly
complex. After all, constructing algorithms which can assess whether $n$ is prime or composite
(the so-called primality tests) is already extremely cumbersome and it is to this day an active
research field (so much so that an important result in this field dates to 2002).17
In order to grasp the complexity of the problem it suffices to observe that, if $n$ is composite,
there are two natural numbers $a, b > 1$ such that $n = ab$. Hence, $a \leq \sqrt{n}$ or $b \leq \sqrt{n}$
(otherwise, $ab > n$), and so there is a divisor of $n$ among the natural numbers between 2
and $\sqrt{n}$. In order to verify whether $n$ is prime or composite, we can merely divide $n$ by all
natural numbers between 2 and $\sqrt{n}$: if none of them is a divisor for $n$, we can safely conclude
that $n$ is a prime number, or, if this is not the case, that $n$ is composite. This procedure
requires at most $\sqrt{n}$ steps.
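The procedure just described can be sketched in a few lines of Python. This is the naive trial-division test of the text, not one of the sophisticated primality tests alluded to above; the function name `is_prime` is a hypothetical choice of ours.

```python
import math


def is_prime(n: int) -> bool:
    """Trial division: look for a divisor d with 2 <= d <= sqrt(n)."""
    if n < 2:
        return False
    for d in range(2, math.isqrt(n) + 1):   # isqrt(n) is the integer part of sqrt(n)
        if n % d == 0:
            return False                     # d is a proper divisor: n is composite
    return True


print([k for k in range(2, 40) if is_prime(k)])
```

The loop runs at most about $\sqrt{n}$ times, which is exactly the bound on the number of steps given in the text.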
With this in mind, suppose we want to test whether the number $10^{100} + 1$ is prime or
composite (it is a number with 101 digits, so it is big, but not huge). The procedure requires
at most $\sqrt{10^{100} + 1}$ operations, that is, at most $10^{50}$ operations (approximately). Suppose we
have an extremely powerful computer which is able to carry out $10^{10}$ (ten billion) operations
per second. Since there are 31,536,000 seconds in a year, that is, approximately $3 \cdot 10^{7}$
seconds, our computer would be able to carry out approximately $3 \cdot 10^{7} \cdot 10^{10} = 3 \cdot 10^{17}$
operations in one year. In order to carry out the operations our procedure might require,
our computer would need
$$\frac{10^{50}}{3 \cdot 10^{17}} = \frac{1}{3} \cdot 10^{33}$$
years. We had better get started...
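The back-of-the-envelope computation above can be reproduced with exact integer arithmetic. The sketch below uses the figures of the text (about $10^{50}$ operations, $10^{10}$ operations per second, a year of 365 days); the variable names are our own.

```python
number = 10**100 + 1
digits = len(str(number))                      # number of decimal digits

operations = 10**50                            # roughly sqrt(10**100 + 1)
operations_per_second = 10**10                 # ten billion
seconds_per_year = 365 * 24 * 60 * 60          # 31,536,000

operations_per_year = operations_per_second * seconds_per_year
years = operations // operations_per_year      # whole years needed

print(digits)
print(seconds_per_year)
print(f"{years:.3e}")                          # same order of magnitude as (1/3) * 10**33
```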
It should be noted that, if the prime factorization of two natural numbers $n$ and $m$ is
known, we can easily determine their greatest common divisor. For example, from
$$3801 = 3 \cdot 7 \cdot 181 \quad \text{and} \quad 1708 = 2^2 \cdot 7 \cdot 61$$
it easily follows that $\gcd(3801, 1708) = 7$, which confirms the result of Euclid's Algorithm.
Given how difficult it is to factorize natural numbers, the observation is hardly useful from
a computational standpoint. Thus, it is a good idea to hold on to Euclid's Algorithm, which
thanks to Lamé's Theorem is able to produce the greatest common divisors with reasonable
efficiency, without having to conduct any factorization.
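To make the observation concrete, here is a small sketch of how a greatest common divisor can be read off two known prime factorizations: one multiplies the common primes, each raised to the smaller of its two exponents. The factorizations $3801 = 3 \cdot 7 \cdot 181$ and $1708 = 2^2 \cdot 7 \cdot 61$ are entered by hand (they are easily checked by multiplication); the function name is a hypothetical choice of ours.

```python
from math import prod


def gcd_from_factorizations(f: dict[int, int], g: dict[int, int]) -> int:
    """gcd of two numbers given as {prime: exponent} dictionaries."""
    common_primes = f.keys() & g.keys()
    # each common prime enters with the smaller of its two exponents
    return prod(p ** min(f[p], g[p]) for p in common_primes)


factors_3801 = {3: 1, 7: 1, 181: 1}
factors_1708 = {2: 2, 7: 1, 61: 1}
print(gcd_from_factorizations(factors_3801, factors_1708))
```

The hard part, of course, is obtaining the two dictionaries in the first place, which is why this route is of little computational use.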
17 One of the reasons why the study of factorization tests is an active research field is that the difficulty
in factorizing natural numbers is exploited by modern cryptography to build unbreakable codes (see Section
6.4).
Proof The proof is carried out by contradiction. Suppose that there are only finitely many
prime numbers and denote them by $p_1 < p_2 < \cdots < p_n$. Define
$$q = p_1 p_2 \cdots p_n$$
and set $m = q + 1$. The natural number $m$ is larger than any prime number, hence it is a
composite number. By the Fundamental Theorem of Arithmetic, it is divisible by at least
one of the prime numbers $p_1$, $p_2$, ..., $p_n$. Let us denote this divisor by $p$. Both natural
numbers $m$ and $q$ are thus divisible by $p$. It follows that also their difference, that is the
natural number $1 = m - q$, is divisible by $p$, which is impossible since $p > 1$. Hence, the
assumption that there are finitely many prime numbers is false.
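The mechanism of the proof can be observed on a concrete list of primes. In the sketch below (our own illustration, with hypothetical names) we take the first six primes, form their product plus one, and look for its smallest prime factor: by the argument above, that factor cannot be one of the primes we started from.

```python
from math import prod


def smallest_prime_factor(m: int) -> int:
    """Smallest divisor d >= 2 of m (necessarily a prime), by trial division."""
    d = 2
    while d * d <= m:
        if m % d == 0:
            return d
        d += 1
    return m          # no divisor up to sqrt(m): m itself is prime


primes = [2, 3, 5, 7, 11, 13]
m = prod(primes) + 1            # leaves remainder 1 when divided by each listed prime
p = smallest_prime_factor(m)

print(m, p, p in primes)
```

Note that $m$ itself need not be prime: what matters is only that each of its prime factors lies outside the original list.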
In conclusion, we have looked at some basic notions in number theory, the branch of
mathematics which deals with the properties of integers. It is one of the most fascinating
and complex fields of mathematics, and it bears incredibly deep results, which are often easy
to state, but very hard to prove. A classic example is Fermat's (famous) Last Theorem,
whose statement is quite simple: if $n \geq 3$, there cannot exist three strictly positive integers
$x$, $y$, and $z$ such that $x^n + y^n = z^n$. Thanks to Pythagoras' Theorem we know that for
$n = 2$ such triplets of integers do exist (for example, $3^2 + 4^2 = 5^2$); Fermat's Last Theorem
states that $n = 2$ is indeed the only case in which this remarkable property holds. Stated
by Fermat, the theorem was first proven in 1994 by Andrew Wiles after more than three
centuries of unfruitful attempts.
(i) reflexivity: $a \geq a$;
(iv) completeness (or totality): for every pair $a, b \in \mathbb{R}$, we have $a \geq b$ or $b \geq a$ (or both);
$ac \geq bc$ if $c > 0$
$ac = bc = 0$ if $c = 0$
$ac \leq bc$ if $c < 0$
(vii) separation:18 given two sets of real numbers $A$ and $B$, if $a \geq b$ for every $a \in A$ and
$b \in B$, then there exists $c \in \mathbb{R}$ such that $a \geq c \geq b$ for every $a \in A$ and $b \in B$.
The …rst three properties have an obvious interpretation. Completeness guarantees that
any two real numbers can always be ordered. Additive independence ensures that the initial
ordering between two real numbers a and b is not altered by adding to both the same real
number c. Multiplicative independence considers, instead, the stability of such ordering with
respect to multiplication.
Finally, separation makes it possible to separate two sets ordered by $\geq$ (that is, such that each
element of one of the two sets is greater than or equal to each element of the other one)
through a real number $c$, called separating element.19 Separation is a fundamental property
of “continuity” of the real numbers and it is what mainly distinguishes them from the rational
numbers (for which such property does not hold, as remarked in the last footnote) and makes
them the natural environment for mathematical analysis.
The strict form $a > b$ of the “weak” inequality $\geq$ indicates that $a$ is strictly greater than
$b$. In terms of $\geq$, we have $a > b$ if and only if $b \not\geq a$, that is, the strict inequality can be
defined as the negation of the weak inequality (of opposite direction). The reader can verify
that transitivity and independence (both additive and multiplicative) hold also for the strict
inequality $>$, while the other properties of the inequality $\geq$ do not hold for $>$.
(iii) the half-closed (or half-open) bounded intervals $(a, b] = \{x \in \mathbb{R} : a < x \leq b\}$ and
$[a, b) = \{x \in \mathbb{R} : a \leq x < b\}$.
(iv) the unbounded intervals $[a, \infty) = \{x \in \mathbb{R} : x \geq a\}$ and $(a, \infty) = \{x \in \mathbb{R} : x > a\}$, and
their analogues $(-\infty, a]$ and $(-\infty, a)$.20 In particular, the positive half-line $[0, \infty)$ is
often denoted by $\mathbb{R}_{+}$, while $\mathbb{R}_{++}$ denotes $(0, \infty)$, that is, the positive half-line without
the origin.
The use of the adjectives open, closed, and unbounded will become clear in Chapter 5.
To ease notation, in what follows $(a, b)$ will denote both an open bounded interval and the
unbounded ones $(a, \infty)$, $(-\infty, b)$ and $(-\infty, \infty) = \mathbb{R}$. Analogously, $(a, b]$ and $[a, b)$ will denote
both the half-closed bounded intervals and the unbounded ones $(-\infty, b]$ and $[a, \infty)$.
$$h \geq x \quad \forall x \in A$$
while it is called lower bound of $A$ if it is smaller than or equal to each element of $A$, that
is, if
$$h \leq x \quad \forall x \in A$$
For example, if $A = [0, 1]$, the number 3 is an upper bound and the number $-1$ is a lower
bound since $-1 \leq x \leq 3$ for every $x \in [0, 1]$. In particular, the set of upper bounds of $A$ is
the interval $[1, \infty)$ and the set of the lower bounds is the interval $(-\infty, 0]$.
We will denote by $A^{*}$ the set of upper bounds of $A$ and by $A_{*}$ the set of lower bounds.
In the example just seen, $A^{*} = [1, \infty)$ and $A_{*} = (-\infty, 0]$.
(i) Upper bounds and lower bounds do not necessarily belong to the set $A$: the upper
bound 3 and the lower bound $-1$, for the set $[0, 1]$, are an example of this.
(ii) Upper bounds and lower bounds might not exist. For example, for the set of even
numbers
$$\{0, 2, 4, 6, \ldots\} \qquad (1.14)$$
there is no real number which is greater than all its elements: hence, this set does not
have upper bounds. Analogously, the set
$$\{0, -2, -4, -6, \ldots\} \qquad (1.15)$$
has no lower bounds, while the set of integers $\mathbb{Z}$ is a simple example of a set without
upper and lower bounds.
20 When there is no danger of confusion, we will write simply $\infty$ instead of $+\infty$. The symbol $\infty$, introduced
in mathematics by John Wallis in the 17th century, recalls a curve called lemniscate and a kind of hat or
halo (symbol of force) put on the head of some tarot card figures: in any case, it is definitely not a flattened
8.
21 The universal quantifier $\forall$ reads “for every”. Therefore, “$\forall x \in A$” reads “for every element $x$ that belongs
to the set $A$”.
1.4. ORDER STRUCTURE OF R 25
Through upper bounds and lower bounds we can give a first classification of sets of the
real line.
For example, the closed interval $[0, 1]$ is bounded, since it is bounded both from above
and from below, while the set (1.14) of even numbers is bounded from below, but not from
above (indeed, it has no upper bounds).22 Analogously, the set (1.15) is bounded from above,
but not from below.
Note that this classification of sets is not exhaustive: there exist sets that do not fall in
any of the types (i)-(iii) of the previous definition. For example, $\mathbb{Z}$ has neither an upper
bound nor a lower bound in $\mathbb{R}$, and therefore it is not of any of the types (i)-(iii). Such sets
are called unbounded.
$$\hat{x} \geq x \quad \forall x \in A$$
$$\hat{x} \leq x \quad \forall x \in A$$
The key feature of this definition is the condition that the maximum and minimum belong
to the set $A$ at hand. It is immediate to see that maxima and minima are, respectively, upper
bounds and lower bounds. Indeed, they are nothing but the upper bounds and lower bounds
that belong to the set $A$. For this reason, maxima and minima can be seen as the “best”
among the upper bounds and the lower bounds. Many economic applications are, indeed,
based on the search for maxima or minima of suitable sets of alternatives.
Unfortunately, maxima and minima are fragile notions: sets often do not admit them.
22 By using Proposition 38, the reader can formally prove that, indeed, the set of even numbers is not
bounded from above.
Example 32 The half-closed interval $[0, 1)$ has minimum 0, but it has no maximum. Indeed,
suppose by contradiction that there exists a maximum $\hat{x} \in [0, 1)$, so that $\hat{x} \geq x$ for every
$x \in [0, 1)$. Set
$$\tilde{x} = \frac{1}{2} \hat{x} + \frac{1}{2}$$
Since $\hat{x} < 1$, we have $\hat{x} < \tilde{x}$. But, it is obvious that $\tilde{x} \in [0, 1)$, which contradicts the fact
that $\hat{x}$ is the maximum of $[0, 1)$. ▲
(i) the half-closed interval $(0, 1]$ has maximum 1, but it has no minimum;
(ii) the open interval $(0, 1)$ has neither minimum, nor maximum.
The maximum of a set $A$ is denoted by $\max A$, and its minimum by $\min A$. For example,
for $A = [0, 1]$ we have $\max A = 1$ and $\min A = 0$.
Let $\hat{x} \in A$ be the maximum of $A$. If $h$ is an upper bound of $A$, we have $h \geq \hat{x}$, since $\hat{x} \in A$.
On the other hand, $\hat{x}$ is also an upper bound, and we thus obtain (1.16).
Example 34 The set of upper bounds of $[0, 1]$ is the interval $[1, \infty)$. In this example, the
equality (1.16) takes the form $\max [0, 1] = \min [1, \infty)$. ▲
Thus, when it exists, the maximum is the smallest upper bound. But, the smallest upper
bound (that is, $\min A^{*}$) might exist also when the maximum does not exist. For example,
consider $A = [0, 1)$: the maximum does not exist, but the smallest upper bound exists and
it is 1, i.e., $\min A^{*} = 1$.
23 As already mentioned, in economics maxima play a fundamental role.
All of this suggests that the smallest upper bound is the surrogate for the maximum
which we are looking for. Indeed, in the example just seen, the point 1 is, in absence of a
maximum, its closest approximation.
Reasoning in a similar way, the greatest lower bound, i.e., $\max A_{*}$, is the natural candidate
to be the surrogate for the minimum when the latter does not exist. Motivated by what
we have just seen, we give the following definition.
Definition 35 Given a non-empty set $A \subseteq \mathbb{R}$, one calls supremum of $A$ the least upper
bound of $A$, that is, $\min A^{*}$, and infimum the greatest lower bound of $A$, that is, $\max A_{*}$.
Thanks to Proposition 33, both the supremum and the infimum of $A$ are unique, when
they exist. We denote them by $\sup A$ and $\inf A$. For example, for $A = (0, 1)$ we have
$\inf A = 0$ and $\sup A = 1$.
As already remarked, when $\inf A \in A$, it is the minimum of $A$, and when $\sup A \in A$, it
is the maximum of $A$.
Although suprema and infima may exist when maxima and minima do not, they do not
always exist.
Example 36 Consider the set $A$ of the even numbers in (1.14). In this case $A^{*} = \emptyset$ and so
$A$ has no supremum. More generally, if $A$ is not bounded from above, we have $A^{*} = \emptyset$ and
the supremum does not exist. In a similar way, the sets that are not bounded from below
have no infima.24 ▲
To be useful surrogates, suprema and infima must exist for a large class of sets; otherwise,
if also their existence were problematic, they would be of little help as surrogates.25
Fortunately, the next important result shows that suprema and infima do indeed exist for a
large class of sets (with sets of the kind seen in the last example being the only troublesome
ones).
Theorem 37 (Least Upper Bound Principle) Each non-empty set $A \subseteq \mathbb{R}$ has supremum
if it is bounded from above and it has infimum if it is bounded from below.
Proof We limit ourselves to proving the first statement. To say that $A$ is bounded from above
means that it admits an upper bound, i.e., that $A^{*} \neq \emptyset$. Since $a \leq h$ for every $a \in A$ and
every $h \in A^{*}$, by the separation property there exists a separating element $c \in \mathbb{R}$ such that
$a \leq c \leq h$ for every $a \in A$ and every $h \in A^{*}$. Since $c \geq a$ for every $a \in A$, we have that $c$
is an upper bound of $A$, so that $c \in A^{*}$. But, since $c \leq h$ for every $h \in A^{*}$, it follows that
$c = \min A^{*}$, that is, $c = \sup A$. This proves the existence of the supremum of $A$.
Except for the sets that are not bounded from above, all the other sets in $\mathbb{R}$ admit
supremum. Analogously, except for the sets that are not bounded from below, all the other
sets in $\mathbb{R}$ have infimum. Suprema and infima are thus excellent surrogates that exist, and so
help us, for a large class of subsets of $\mathbb{R}$.
24 If $A$ does not admit supremum, we write $\sup A = +\infty$ and, when it does not admit infimum, $\inf A = -\infty$.
Moreover, by convention, we set $\sup \emptyset = -\infty$ and $\inf \emptyset = +\infty$. This is motivated by the fact that each real
number must be considered simultaneously an upper bound and a lower bound of $\emptyset$: then it is natural to
conclude that $\sup \emptyset = \inf \emptyset^{*} = \inf \mathbb{R} = -\infty$ and $\inf \emptyset = \sup \emptyset_{*} = \sup \mathbb{R} = +\infty$.
25 The utility of a surrogate depends on how well it approximates the original, as well as on its availability.
Note that a simple, but useful, consequence of the previous theorem is that bounded sets
have both supremum and in…mum.
1.4.3 Density
The order structure is also useful to clarify the relations among the sets N, Z, Q, and R.
First of all, we make rigorous a natural intuition: however large a real number is, there
always exists a greater natural number. This is the so-called Archimedean property of real
numbers.
Proposition 38 For each real number $a \in \mathbb{R}$, there exists a natural number $n \in \mathbb{N}$ such that
$n \geq a$.
Proof By contradiction, assume that there exists $a \in \mathbb{R}$ such that $a > n$ for all $n \in \mathbb{N}$.
By the Least Upper Bound Principle, $\sup \mathbb{N}$ exists and belongs to $\mathbb{R}$. Recall that, by the
definition of sup,
$$\sup \mathbb{N} \geq n \quad \forall n \in \mathbb{N} \qquad (1.17)$$
At the same time, again by the definition of sup, we have $\sup \mathbb{N} - 1 < n$ for some $n \in \mathbb{N}$
(otherwise, $\sup \mathbb{N} - 1$ would be an upper bound of $\mathbb{N}$, thus violating the fact that $\sup \mathbb{N}$ is the
least of these upper bounds). We can conclude that $\sup \mathbb{N} < n + 1 \in \mathbb{N}$, which contradicts
(1.17).
The next property shows a fundamental difference between the structures of $\mathbb{N}$ and $\mathbb{Z}$, on
the one side, and of $\mathbb{Q}$ and $\mathbb{R}$, on the other side. If we take an integer number, we can talk
in a very natural way of predecessor and successor. In particular, if $m \in \mathbb{Z}$, its predecessor
is the integer $m - 1$, while its successor is the integer $m + 1$ (for example, the predecessor of
317 is 316 and its successor is 318). In other words, $\mathbb{Z}$ has a discrete “rhythm”.
In contrast, we cannot talk of predecessors and successors in $\mathbb{Q}$ or in $\mathbb{R}$. Consider first
$\mathbb{Q}$. Given a rational number $q = m/n$, let $q' = m'/n'$ be any rational such that $q' > q$. Set
$$q'' = \frac{1}{2} q' + \frac{1}{2} q$$
The number $q''$ is rational, since
$$q'' = \frac{1}{2} \frac{m'}{n'} + \frac{1}{2} \frac{m}{n} = \frac{1}{2} \frac{m' n + m n'}{n n'}$$
and one has
$$q < q'' < q' \qquad (1.18)$$
Therefore, there is no smallest rational number greater than $q$. Analogously, it is easy to
see that there is no greatest rational number smaller than $q$. Rational numbers, hence,
do not admit predecessors and successors.
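The midpoint construction can be tried out with exact rational arithmetic. The sketch below uses Python's standard `fractions.Fraction` type; the function name `midpoint` and the sample values are our own illustrative choices.

```python
from fractions import Fraction


def midpoint(q: Fraction, q_prime: Fraction) -> Fraction:
    """The rational q'' = (1/2) q' + (1/2) q lying between q and q'."""
    return Fraction(1, 2) * q_prime + Fraction(1, 2) * q


q, q_prime = Fraction(1, 3), Fraction(1, 2)
q_second = midpoint(q, q_prime)
print(q, q_second, q_prime)

# Repeating the step produces rationals ever closer to q, none of them a "successor"
candidate = q_prime
for _ in range(5):
    candidate = midpoint(q, candidate)
print(candidate)
```

However close to $q$ the candidate gets, one more step yields a rational that is closer still, which is exactly why $q$ has no successor.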
In a similar way we show that, given any two real numbers $a < b$, there exists a real
number $c$ such that $a < c < b$. Indeed,
$$a < \frac{1}{2} a + \frac{1}{2} b < b$$
Real numbers, therefore, also do not admit predecessors and successors. The rhythm of both
rational and real numbers is “tight”, without discrete interruptions (which are intervals).
Such property of $\mathbb{Q}$ and $\mathbb{R}$ is called density. Unlike $\mathbb{N}$ and $\mathbb{Z}$, which are discrete sets, $\mathbb{Q}$ and
$\mathbb{R}$ are dense sets.26
Proposition 39 Given any two real numbers $a < b$, there exists a rational number $q \in \mathbb{Q}$
such that $a < q < b$.
$$[a + 1] = [a] + 1 \qquad (1.19)$$
since, for each $n \in \mathbb{Z}$, we have $n \leq a$ if and only if $n + 1 \leq a + 1$. Moreover, $[a] < a$ when
$a \notin \mathbb{Z}$.
Case 2: Let $b - a > 1$, i.e., $a < a + 1 < b$. From Case 1 it follows that there exists $q \in \mathbb{Q}$
such that $a < q < a + 1 < b$.
Case 3: Let $b - a < 1$. By the Archimedean property of real numbers, there exists
$0 \neq n \in \mathbb{N}$ such that
$$n \geq \frac{1}{b - a}$$
so that $nb - na = n (b - a) \geq 1$. Then, by what we have just seen in cases 1 and 2, there
exists $q \in \mathbb{Q}$ such that $na < q < nb$. Therefore,
$$a < \frac{q}{n} < b$$
which completes the proof because $q/n \in \mathbb{Q}$.
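The idea of the proof (stretch the interval by a natural number $n$ until it is long enough to contain an integer, then divide by $n$) can be sketched in code. This is only an illustration under stated assumptions: real numbers are approximated by floating-point values, we pick $n$ strictly greater than $1/(b - a)$ rather than merely $\geq$, and the function name `rational_between` is hypothetical.

```python
import math
from fractions import Fraction


def rational_between(a: float, b: float) -> Fraction:
    """Return a rational k/n with a < k/n < b, assuming a < b."""
    if not a < b:
        raise ValueError("need a < b")
    n = math.floor(1 / (b - a)) + 1    # natural number with n*(b - a) > 1
    k = math.floor(n * a) + 1          # integer with n*a < k <= n*a + 1 < n*b
    return Fraction(k, n)


print(rational_between(math.sqrt(2), 1.5))
print(rational_between(-0.75, -0.5))
```

For inputs that are extremely close together the floating-point approximation may of course break down; the sketch is meant to mirror the argument, not to be numerically robust.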
26 In his famous argument against plurality, Zeno of Elea remarks that a “plurality” is infinite because “...
there will always be other things between the things that are, and yet others between those others.” (trans.
Raven). Zeno thus identifies density as the characterizing property of an infinite collection. With the hindsight
of twenty-five centuries, we can say that he is neglecting the integers. Yet, it is stunning how he was
able to identify a key property of infinite sets.
(i) We have defined $a^r$ only for $a > 0$, in order to avoid dangerous and embarrassing
misunderstandings. Think, for example, of $(-5)^{3/2}$. It could be rewritten as
$\sqrt[2]{(-5)^3} = \sqrt[2]{-125}$ or as $\left(\sqrt[2]{-5}\right)^3$, which do not exist (among the real numbers). But, it could
also be written as $(-5)^{3/2} = (-5)^{6/4}$ which, in turn, can be expressed as either
$\sqrt[4]{(-5)^6} = \sqrt[4]{15{,}625}$ or $\left(\sqrt[4]{-5}\right)^6$. The former exists and is approximately equal to 11.180339, but
the latter does not exist.
(ii) Let us consider the root $\sqrt{a} = a^{1/2}$. It is well known that each positive number has
two algebraic roots, for example $\sqrt{9} = \pm 3$. The unique positive value of the root is
called, instead, arithmetical root. For example, 3 and $-3$ are the two algebraic roots
of 9, while 3 is its unique arithmetical root. In what follows the (even order) roots will
always be in the arithmetical sense (and therefore with a unique value). It is, by the
way, the standard convention: for example, in the classical solution formula
$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$$
of the quadratic equation $ax^2 + bx + c = 0$, the root is in the arithmetical sense
(otherwise, we should not write $\pm$ because the root would be automatically double).
We now extend the notion of power to the case $a^x$, with $0 < a \in \mathbb{R}$ and $x \in \mathbb{R}$. Since,
unfortunately, the details of this extension are tedious, we limit ourselves to saying that, if
$a > 1$, the power $a^x$ is the supremum of the set of all the values $a^q$ when the exponent $q$
varies among the rational numbers such that $q \leq x$. Formally,
$$a^x = \sup \{ a^q : q \in \mathbb{Q},\ q \leq x \} \qquad (1.21)$$
In a similar way we define $a^x$ for $0 < a < 1$. We have the following properties that, by (1.21),
follow from the analogous properties that hold when the exponent is rational.
$a^x < a^y$ if $a < 1$
$a^x = a^y = 1$ if $a = 1$
Among the bases a > 0, the most important is the number e (which will be introduced
in Chapter 8). As we will see, the power ex has truly remarkable properties.
1.5.2 Logarithms
The operations of addition and multiplication are commutative: a + b = b + a and ab = ba.
Therefore, they have only one inverse operation, respectively the subtraction and the division:
The power operation $a^b$, with $a > 0$, is not commutative: $a^b$ might well be different from
$b^a$. Therefore, it has two distinct inverse operations.
Let $a^b = c$. The first inverse operation (given $c$ and $b$, find out $a$) is called root with index
$b$ of $c$:
$$a = \sqrt[b]{c} = c^{1/b}$$
The second one (given $c$ and $a$, find out $b$) is called logarithm with base $a$ of $c$:
$$b = \log_a c$$
Note that, together with $a > 0$ and $c > 0$, one must also have $a \neq 1$, because $1^b = c$ is
impossible except when $c = 1$.
The logarithm is a fundamental notion, ubiquitous in mathematics and in all its applications.
As we have just seen, it is a simple notion: the number $b = \log_a c$ is nothing but the
exponent that must be given to $a$ in order to get $c$, that is,
$$a^{\log_a c} = c$$
The properties of the logarithms derive easily from the properties of the powers seen in
Lemma 40.
In view of the change of base property (vi), it is possible to take as base of the logarithms
always the same number, say 10, because
$$\log_a c = \frac{\log_{10} c}{\log_{10} a}$$
As for the powers $a^x$, also for the logarithms the most common base is the number $e$. In
such a case we simply write $\log x$ instead of $\log_e x$. Because of its importance, $\log x$ is called
the natural logarithm of $x$, which leads to the notation $\ln x$ sometimes used in place of $\log x$.
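The change of base formula is easy to try numerically. The following sketch computes a logarithm in an arbitrary base from base-10 logarithms only, using `math.log10` from Python's standard library; the function name `log_base` is a hypothetical choice, and the results are floating-point approximations.

```python
import math


def log_base(a: float, c: float) -> float:
    """log_a(c) computed via the change of base formula with base 10."""
    if a <= 0 or a == 1 or c <= 0:
        raise ValueError("need a > 0, a != 1 and c > 0")
    return math.log10(c) / math.log10(a)


print(log_base(2, 8))            # close to 3, since 2**3 == 8
print(log_base(math.e, 10.0))    # close to the natural logarithm of 10
print(math.log(10.0))
```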
The next result shows the close connections between logarithms and powers, which can
be actually seen as inverse notions.
$$\log_a a^x = x \quad \forall x \in \mathbb{R}$$
and
$$a^{\log_a x} = x \quad \forall x > 0$$
We leave to the reader the simple proof. To check their understanding of the material of
this section, the reader can also verify that $b^{\log_a c} = c^{\log_a b}$ for all strictly positive numbers
$a \neq 1$, $b$, and $c$.
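Before attempting a proof, the reader may find it reassuring to test the two inverse relations and the last identity on a few numbers. The sketch below is a numerical check only (it proves nothing); the helper `log_a` and the sample values are our own choices, and comparisons are made up to floating-point rounding.

```python
import math


def log_a(a: float, x: float) -> float:
    """Logarithm of x with base a, via natural logarithms."""
    return math.log(x) / math.log(a)


a, b, c = 2.0, 3.0, 5.0

# logarithms and powers undo each other
print(log_a(a, a ** 1.7), a ** log_a(a, 7.0))

# the identity b^(log_a c) = c^(log_a b)
left = b ** log_a(a, c)
right = c ** log_a(a, b)
print(left, right)
```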
For example, in this manner, 4357 means 4 thousands, 3 hundreds, 5 tens and 7 units.
The natural numbers are thus expressed by powers of 10, each of which causes a digit to be
added: writing 4357 is the abbreviation of
$$4 \cdot 10^3 + 3 \cdot 10^2 + 5 \cdot 10 + 7$$
The choice of decimal notation is due to the mere fact that we have ten fingers, but
obviously it is not the only possible one. Some Native American tribes used to count on their
hands using the eight spaces between their fingers rather than the ten fingers themselves.
They would have chosen only 8 digits, which could have easily been
0, 1, 2, 3, 4, 5, 6, 7
and they would have articulated the integers along the powers of 8, that is 8, 64, 512, 4096,
... They would have written our decimal number 4357 as
$$1 \cdot 8^4 + 0 \cdot 8^3 + 4 \cdot 8^2 + 0 \cdot 8 + 5 = 10405$$
and the decimal number 0.515625 as
$$4 \cdot 0.125 + 1 \cdot 0.015625 = 4 \cdot 8^{-1} + 1 \cdot 8^{-2} = 0.41$$
In general, given a base $b$ and a set of digits
$$C_b = \{c_0, c_1, \ldots, c_{b-1}\}$$
used to represent the integers between 0 and $b - 1$, every natural number $n$ is written in the
base $b$ as
$$d_k d_{k-1} \cdots d_1 d_0$$
where $k$ is an appropriate natural number and
$$n = d_k b^k + d_{k-1} b^{k-1} + \cdots + d_1 b + d_0$$
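The digits $d_k, \ldots, d_0$ can be obtained by repeated division with remainder by the base. The following is a minimal sketch; the function names `to_base` and `from_base` are hypothetical, and digits are returned as a list of integers between 0 and $b - 1$ rather than as symbols of a digit set.

```python
def to_base(n: int, b: int) -> list[int]:
    """Digits d_k, ..., d_1, d_0 of the natural number n written in base b."""
    if n < 0 or b < 2:
        raise ValueError("need n >= 0 and b >= 2")
    digits = []
    while True:
        n, d = divmod(n, b)     # d is the next digit, from the right
        digits.append(d)
        if n == 0:
            break
    return digits[::-1]         # most significant digit first


def from_base(digits: list[int], b: int) -> int:
    """Evaluate d_k b^k + ... + d_1 b + d_0."""
    n = 0
    for d in digits:
        n = n * b + d
    return n


print(to_base(4357, 10))
print(to_base(4357, 8))
print(to_base(11, 2))
```

The second function evaluates the displayed sum from left to right, so that converting back and forth returns the number we started from.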
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, |, •
We have used the symbols | and • for the two additional digits we need compared to the
decimal notation. The duodecimal number
$$1011 = 1 \cdot 2^3 + 0 \cdot 2^2 + 1 \cdot 2^1 + 1 \cdot 2^0$$
and in decimal notation
$$11 = 1 \cdot 10^1 + 1 \cdot 10^0$$
The considerable reduction in the digit set $C_2$ made possible by the base 2 comes at the
cost of the large number of bits required to represent numbers in binary notation. For example:
while 16 consists of two decimal digits, the corresponding binary 10000 requires five bits; while 201
requires three digits, the corresponding binary 11001001 requires eight bits; while 2171 requires
four digits, the corresponding binary 100001111011 requires twelve bits, and so on. Very
quickly, binary notation requires a number of bits that only a computer is able to process.
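The digit counts quoted in these examples can be reproduced directly. The sketch below relies on Python's built-in `bin` function and `int.bit_length` method; the function name `digit_counts` is our own.

```python
def digit_counts(n: int) -> tuple[int, int, str]:
    """Number of decimal digits, number of bits, and binary string of n > 0."""
    binary = bin(n)[2:]                 # strip the '0b' prefix
    return len(str(n)), n.bit_length(), binary


for n in (16, 201, 2171):
    decimal_digits, bits, binary = digit_counts(n)
    print(n, decimal_digits, bits, binary)
```

Roughly speaking, each decimal digit costs a little more than three bits, since $\log_2 10 \approx 3.32$.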
From a purely mathematical perspective, the choice of base is merely conventional, and
going from one base to another is easy (although tedious).28 Bases 2 and 10 are nowadays the
most important ones, but many others have been used in the past, such as 20 (the number of
fingers and toes, a trace of which is still found in the French language where “quatre-vingts”,
or “four-twenties”, stands for eighty and “four-twenty-ten” stands for ninety), as well as 16
(the number of spaces between fingers and toes) and 60 (which is convenient because it is
divisible by 2, 3, 4, 5, 6, 10, 12, 15, 20 and 30; a significant trace of this system remains in
how we divide hours and minutes and how we measure angles).
28 Operations on numbers written in a non-decimal notation are not particularly difficult either. For example,
11 + 9 = 20 can be calculated in a binary way as
1011 +
1001 =
10100
It is sufficient to remember that the “carrying” must be done at 2 and not at 10.
1.6. NUMBERS, FINGERS AND CIRCUITS 35
Positional notation has been used to perform manual calculations since the dawn of time (just think about computations carried out with the abacus), but it is a relatively recent conquest in terms of writing, made possible by the fundamental innovation of the zero, and has been exceptionally important in the development of mathematics and its countless applications: commercial, scientific, and technological. Born in India (apparently around the 5th century AD), positional notation was developed during the early Middle Ages in the Arab world (especially thanks to the works of Al-Khwarizmi), from which the name “Arabic numerals” for the digits (1.22) derives, and arrived in the Western world thanks to Italian merchants between the 11th and 12th centuries. In particular, the son of one of those merchants, Leonardo da Pisa (also known as Fibonacci), was the most important medieval mathematician: he authored a famous treatise in 1202, the Liber Abaci, the most acclaimed among the first essays in Europe regarding positional notation. Until then non-positional Roman numerals were used
which made even trivial operations overly complex (try to sum up CXL and MCL, and then 140 and 1150).
Let us conclude with the incipit of the first chapter of Liber Abaci and the extraordinary innovation the book brought to the Western world:
9, 8, 7, 6, 5, 4, 3, 2, 1
Cum his itaque novem figuris, et cum hoc signo, quod arabice zephirum appellatur, scribitur quilibet numerus, ut inferius demonstratur. [...] ut in sequenti cum figuris numeris super notatis ostenditur.
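\[ \mathbb{R} \cup \{-\infty, +\infty\} \]
denoted by the symbol $\overline{\mathbb{R}}$ (sometimes by $[-\infty, +\infty]$). The order structure of $\mathbb{R}$ can be naturally extended to $\overline{\mathbb{R}}$ by setting $-\infty < a < +\infty$ for each $a \in \mathbb{R}$.
\[ a + \infty = +\infty, \quad a - \infty = -\infty \quad \forall a \in \mathbb{R} \tag{1.23} \]
\[ +\infty + \infty = +\infty \quad \text{and} \quad -\infty - \infty = -\infty \]
with, in particular,
(v) division:
\[ \frac{a}{+\infty} = \frac{a}{-\infty} = 0 \quad \forall a \in \mathbb{R} \]
(vi) power of a real number:
\[ \begin{cases} a^{+\infty} = +\infty & \text{if } a > 1 \\ a^{+\infty} = 0 & \text{if } 0 < a < 1 \\ a^{-\infty} = 0 & \text{if } a > 1 \\ a^{-\infty} = +\infty & \text{if } 0 < a < 1 \end{cases} \]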
30
A real number is often called scalar.
1.7. THE EXTENDED REAL LINE 37
While the addition of infinities with the same sign is a well-defined operation (for example, the sum of two positive infinities is again a positive infinity), the addition of infinities of different sign is not defined. For example, the result of $+\infty - \infty$ is not defined. This is a first example of an indeterminate operation in $\overline{\mathbb{R}}$. In general, the following operations are indeterminate:
\[ \pm\infty \cdot 0 \quad \text{and} \quad 0 \cdot (\pm\infty) \tag{1.25} \]
(iii) divisions with denominator equal to zero or with numerator and denominator that are both infinities:
\[ \frac{a}{0} \quad \text{and} \quad \frac{\pm\infty}{\pm\infty} \tag{1.26} \]
with $a \in \overline{\mathbb{R}}$;
The indeterminate operations (i)–(iv) are called forms of indetermination and will play an important role in the theory of limits. Note that, by setting $a = 0$, formula (1.26) takes the form
\[ \frac{0}{0} \]
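Floating-point arithmetic offers a rough computational analogue of these conventions. The sketch below uses Python's `math.inf`; it is only an illustration, not a definition of the operations in the extended real line. In particular, Python reports the indeterminate forms as `nan` ("not a number"), while a division by zero raises an exception instead of returning a value.

```python
import math

inf = math.inf

# Determinate operations
print(5.0 + inf)     # inf
print(5.0 - inf)     # -inf
print(inf + inf)     # inf
print(5.0 / inf)     # 0.0
print(2.0 ** inf)    # inf
print(0.5 ** inf)    # 0.0
print(2.0 ** -inf)   # 0.0
print(0.5 ** -inf)   # inf

# Indeterminate forms are reported as nan
print(math.isnan(inf - inf))   # True
print(math.isnan(inf * 0.0))   # True
print(math.isnan(inf / inf))   # True
```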
O.R. As we have observed, the most natural geometric image of $\mathbb{R}$ is the (real) line: to each point there corresponds a number and, vice versa, to each number there corresponds a point. If we take a closed (and obviously bounded) segment, we can “transport” all the numbers from the real line to the segment, as the following figure shows:31
31
We refer to the proof of Proposition 249 for the analytic expression of the bijection shown here.
38 CHAPTER 1. SETS AND NUMBERS: AN INTUITIVE INTRODUCTION
[Figure: the points of the real line mapped onto a bounded segment]
All the real numbers that found a place on the real line also find a place on the segment, extremes excluded (maybe packed, but they really all fit). Two points are left, the extremes of the segment, to which it is natural to associate, respectively, $+\infty$ and $-\infty$. The geometric image of $\overline{\mathbb{R}}$ is therefore a closed segment.
Elea in the V century B.C. and that has in Parmenides and Zeno its most famous exponents.
In Parmenides’ famous doctrine of the Being, a turning point in intellectual history that the reader might have encountered in some high school philosophy course, it is logic that permits the study of the Being, that is, of the world of truth (ἀλήθεια). This study is impossible for the senses, which can only guide us among the appearances that characterize the world of opinion (δόξα). In particular, only reason can dominate the arguments by contradiction, which have no empirical substratum, but are the pure result of reason. Such arguments, developed, according to Szabo, by the Eleatic school and at the center of its dialectics (culminated in the famous paradoxes of Zeno), for example enabled the Eleatic philosopher Melissus of Samos to state that the Being “always was what it was and always will be. For
if it had come into being, necessarily before it came into being there was nothing. But, if
there was nothing, in no way could something come into being from nothing”.33
True knowledge is thus theoretic, only the eye of the mind can see the truth, while
empirical analysis necessarily stops at the appearance. The anti-empirical character of the
Eleatic school could have been decisive in the birth of the deductive method, at least in
creating a favorable intellectual environment. Naturally, it is not possible to exclude an
opposite causality to the one proposed by Szabo: The deductive method could have been
developed inside mathematics and could have then influenced philosophy, and in particular the Eleatics.34 Indeed, the irrationality of $\sqrt{2}$, established by the Pythagorean school (the other great pre-Socratic school of Magna Graecia), is a first decisive triumph of such a method in mathematics: only the eye of the mind could see such a property, which is devoid of any “empirical” intuition. It is the eye of the mind that explains the inescapable error incurred by every empirical measurement of the hypotenuse of a right triangle with catheti of unitary length: however accurate this measurement is, it will always be a rational approximation of the true irrational distance, $\sqrt{2}$, with a consequent approximation error (that, by the way, will probably vary from measurement to measurement).
In any case, between the VI and the V century B.C. two pre-Socratic schools of Magna
Graecia were the cradle of an incredible intellectual revolution. In the III century B.C. an-
other famous Magna Graecia scholar, Archimedes from Syracuse, led this revolution to its
maximum splendor in the classical world (and beyond). We close with Plato’s famous (probably fictional) description of two protagonists of this revolution, Parmenides and Zeno.35
They came to Athens ... the former was, at the time of his visit, about 65 years
old, very white with age, but well favoured. Zeno was nearly 40 years of age,
tall and fair to look upon: in the days of his youth he was reported to have been
beloved by Parmenides.
On another label one reads: 1 year of ageing and 10 degrees. In this case we can write
\[ (1, 10) \]
The pairs $(2, 12)$ and $(1, 10)$ are called ordered pairs and in them we distinguish the first element, the ageing, from the second one, the alcoholic content. In an ordered pair the position is, therefore, crucial.
Let $A_1$ be the set of the possible years of ageing and let $A_2$ be the set of the possible alcoholic contents. We can write
Definition 43 Given two sets $A_1$ and $A_2$, the Cartesian product $A_1 \times A_2$ is the set of all the ordered pairs $(a_1, a_2)$ with $a_1 \in A_1$ and $a_2 \in A_2$.
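For finite sets the Cartesian product can be listed explicitly. The following sketch uses the standard `itertools.product`; the two small sets of ageing years and alcoholic contents are hypothetical values chosen only for illustration.

```python
from itertools import product

# Hypothetical label values, chosen only for illustration
A1 = {1, 2}          # years of ageing
A2 = {10, 12}        # alcoholic content, in degrees

pairs = set(product(A1, A2))   # the Cartesian product A1 x A2
print(sorted(pairs))           # [(1, 10), (1, 12), (2, 10), (2, 12)]
print((2, 12) in pairs)        # True
print((12, 2) in pairs)        # False: position matters in an ordered pair
```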
In the example, we have $A_1 \subseteq \mathbb{N}$ and $A_2 \subseteq \mathbb{N}$, i.e., the elements of $A_1$ and $A_2$ are natural numbers. More generally, we can assume that $A_1 = A_2 = \mathbb{R}$, so that the elements of $A_1$ and $A_2$ are any real numbers, although with a possibly different interpretation according to their position. In this case $A_1 \times A_2 = \mathbb{R} \times \mathbb{R} = \mathbb{R}^2$ and the pair $(a_1, a_2)$ can be represented by a point in the plane:
42 CHAPTER 2. CARTESIAN STRUCTURE AND RN
(i) $\{(a_1, a_2) \in \mathbb{R}^2 : a_1 = 0\}$, that is, the set of the ordered pairs of the form $(0, a_2)$; it is the vertical axis (or axis of the ordinates).
(ii) $\{(a_1, a_2) \in \mathbb{R}^2 : a_2 = 0\}$, that is, the set of the ordered pairs of the form $(a_1, 0)$; it is the horizontal axis (or axis of the abscissae).
(iii) $\{(a_1, a_2) \in \mathbb{R}^2 : a_1 \geq 0 \text{ and } a_2 \geq 0\}$, that is, the set of the ordered pairs $(a_1, a_2)$ with both components that are positive; it is the first quadrant of the Cartesian plane (also called positive orthant). In a similar way we can define the other quadrants:
[Figure: the four quadrants I, II, III, IV of the Cartesian plane]
(iv) $\{(a_1, a_2) \in \mathbb{R}^2 : a_1^2 + a_2^2 = 1\}$ and $\{(a_1, a_2) \in \mathbb{R}^2 : a_1^2 + a_2^2 \leq 1\}$, that is, respectively the circumference and the circle with center at the origin and radius equal to 1.
2.1. CARTESIAN PRODUCTS AND RN 43
Above we have classified wines using two characteristics, ageing and alcoholic content. We now consider a slightly more complicated product, for example a portfolio of assets. We suppose that there exist four different assets that can be purchased on the market. A portfolio is then described by an ordered quadruple
\[ (a_1, a_2, a_3, a_4) \]
where $a_1$ is the amount of money invested in the first asset, $a_2$ is the amount of money invested in the second asset, and so on. For example,
\[ (1000, 1500, 1200, 600) \]
denotes a portfolio in which 1000 euros have been invested in the first asset, 1500 in the second one, and so on. The position is crucial: the portfolio
is very different from the previous one, although the amounts of money invested in the different assets are the same.
Since amounts of money are numbers that are not necessarily integers, possibly negative (in case of short sales), it is natural to take $A_1 = A_2 = A_3 = A_4 = \mathbb{R}$, where $A_i$ is the set of the possible amounts of money that can be invested in asset $i = 1, 2, 3, 4$. We have
\[ (a_1, a_2, a_3, a_4) \in A_1 \times A_2 \times A_3 \times A_4 = \mathbb{R}^4 \]
In particular,
\[ (1000, 1500, 1200, 600) \in \mathbb{R}^4 \]
In general, if we consider $n$ sets $A_1, A_2, \ldots, A_n$ we can give the following definition.
\[ A_1 \times A_2 \times \cdots \times A_n \]
denoted by $\prod_{i=1}^n A_i$ (sometimes by $\times_{i=1}^n A_i$), is the set of all the ordered $n$-tuples $(a_1, a_2, \ldots, a_n)$ with $a_1 \in A_1, a_2 \in A_2, \ldots, a_n \in A_n$.
\[ \mathbb{R}^n = \underbrace{\mathbb{R} \times \mathbb{R} \times \cdots \times \mathbb{R}}_{n \text{ times}} \]
An element
\[ x = (x_1, x_2, \ldots, x_n) \in \mathbb{R}^n \]
is called a vector.1 The Cartesian product $\mathbb{R}^n$ is called the ($n$-dimensional) Euclidean space.
For $n = 1$, $\mathbb{R}$ is represented by the real line; for $n = 2$, $\mathbb{R}^2$ is represented by the plane; and so on. As for $\mathbb{R}$ and $\mathbb{R}^2$, the vectors $(a_1, a_2, a_3)$ in $\mathbb{R}^3$ admit a graphic representation:
[Figure: a vector $(a_1, a_2, a_3)$ of $\mathbb{R}^3$ drawn in the space with axes $x$, $y$, $z$ and origin $O$]
This is no longer possible in $\mathbb{R}^n$ when $n \geq 4$. The graphic representation may help the intuition, but from a theoretical and computational viewpoint it has no importance because the vectors of $\mathbb{R}^n$, with $n \geq 4$, are completely well-defined entities. They actually turn out to be fundamental in economic applications, as we will see in Section 2.4.
Notation. We will denote the components of a vector by the same letter used for the vector itself, along with an ad hoc index: for example $a_3$ is the third component of the vector $a$, $y_7$ the seventh component of the vector $y$, and so on.
2.2 Operations in Rn
Let us consider two vectors in $\mathbb{R}^n$,
\[ x + y = (x_1 + y_1, x_2 + y_2, \ldots, x_n + y_n) \]
For example, for the two vectors $x = (7, 8, 9)$ and $y = (2, 4, 7)$ in $\mathbb{R}^3$, we have
Even in this case, we have $\alpha x \in \mathbb{R}^n$. In other words, also with the operation of multiplication by scalars, we have built a new element of $\mathbb{R}^n$.
(i) $x + y = y + x$ (commutativity),
(ii) $(x + y) + z = x + (y + z)$ (associativity),
Proof We prove (i), leaving the other properties to the reader. We have
as desired.
(iv) $\alpha(\beta x) = (\alpha\beta) x$ (associativity).
Proof We prove (ii): the other properties are left to the reader. We have
\[ \begin{aligned} (\alpha + \beta) x &= ((\alpha + \beta) x_1, (\alpha + \beta) x_2, \ldots, (\alpha + \beta) x_n) \\ &= (\alpha x_1 + \beta x_1, \alpha x_2 + \beta x_2, \ldots, \alpha x_n + \beta x_n) \\ &= (\alpha x_1, \alpha x_2, \ldots, \alpha x_n) + (\beta x_1, \beta x_2, \ldots, \beta x_n) = \alpha x + \beta x \end{aligned} \]
as claimed.
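The two operations, and the distributive property just proved, can be tried out on concrete vectors. This is only a sketch: the function names `add` and `scale` are ours, and vectors are represented as Python tuples.

```python
def add(x, y):
    """Componentwise sum x + y of two vectors of the same length."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of components")
    return tuple(xi + yi for xi, yi in zip(x, y))


def scale(alpha, x):
    """Multiplication of the vector x by the scalar alpha."""
    return tuple(alpha * xi for xi in x)


x, y = (7, 8, 9), (2, 4, 7)
print(add(x, y))     # (9, 12, 16)
print(scale(2, x))   # (14, 16, 18)
# one instance of (alpha + beta) x = alpha x + beta x
print(scale(2 + 3, x) == add(scale(2, x), scale(3, x)))   # True
```

A check on one instance is of course not a proof; it only illustrates the componentwise argument.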
As we will see better in the next chapter (Section 3.3), the operations of addition and multiplication by scalars allow us to define the important notion of linear combination of vectors. In particular, a vector $x \in \mathbb{R}^n$ will be said to be a linear combination of the vectors $\{x^i\}_{i=1}^m$ of $\mathbb{R}^n$ if there exist $m$ real numbers (coefficients) $\{\alpha_i\}_{i=1}^m$ such that $x = \alpha_1 x^1 + \cdots + \alpha_m x^m$.
The last operation in $\mathbb{R}^n$ that we consider is the inner product. Given two vectors $x$ and $y$ in $\mathbb{R}^n$, their inner product, denoted by $x \cdot y$, is defined as
\[ x \cdot y = x_1 y_1 + x_2 y_2 + \cdots + x_n y_n \]
Other common notations for the inner product are $(x, y)$ and $\langle x, y \rangle$.
For example, for the vectors $x = (1, -1, 5, -3)$ and $y = (-2, 3, \pi, -1)$ of $\mathbb{R}^4$, we have
\[ x \cdot y = 1 \cdot (-2) + (-1) \cdot 3 + 5 \cdot \pi + (-3) \cdot (-1) = 5\pi - 2 \]
The inner product is an operation that differs from addition and scalar multiplication in a structural aspect: while the latter operations determine a new vector of $\mathbb{R}^n$, the result of the inner product is a scalar. The next result gathers the main properties of the inner product (we leave to the reader the simple proof).
(i) $x \cdot y = y \cdot x$ (commutativity),
(ii) $(x + y) \cdot z = (x \cdot z) + (y \cdot z)$ (distributivity),
(iii) $\alpha x \cdot z = \alpha (x \cdot z)$ (distributivity).
Note that the two distributive properties can be summarized in the single property $(\alpha x + \beta y) \cdot z = \alpha (x \cdot z) + \beta (y \cdot z)$.
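The inner product is easy to compute directly from its definition. In the sketch below the function name `inner` is ours, the third component of $y$ is taken to be $\pi$, and the vector `z` is an extra vector introduced only to check commutativity on one instance.

```python
import math


def inner(x, y):
    """Inner product x . y = x_1 y_1 + ... + x_n y_n."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of components")
    return sum(xi * yi for xi, yi in zip(x, y))


x = (1, -1, 5, -3)
y = (-2, 3, math.pi, -1)
z = (2, 0, 1, 4)          # an extra vector, chosen only for this illustration

print(math.isclose(inner(x, y), 5 * math.pi - 2))   # True
print(inner(x, z) == inner(z, x))                   # True (commutativity)
```

Note that the result is a number, not a tuple, in line with the structural remark above.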
The study of the basic properties of the inequality $\geq$ reveals a first important novelty: when $n \geq 2$, the order $\geq$ does not satisfy completeness. Indeed, consider for example $x = (0, 1)$ and $y = (1, 0)$ in $\mathbb{R}^2$: we have neither $x \geq y$ nor $y \geq x$. We say, therefore, that $\geq$ on $\mathbb{R}^n$ is a partial order (which becomes complete when $n = 1$).
It is easy to find vectors in $\mathbb{R}^n$ that are not comparable. The following figure shows the vectors of $\mathbb{R}^2$ that are $\geq$ or $\leq$ than the vector $x = (1, 2)$; the darker area represents the points smaller than $x$, the lighter area those greater than $x$, and the two white areas represent the points that are not comparable with $x$.
[Figure: the regions of $\mathbb{R}^2$ that are smaller than, greater than, and not comparable with $x = (1, 2)$]
Apart from completeness, it is easy to verify that $\geq$ on $\mathbb{R}^n$ continues to enjoy the properties seen for $n = 1$:
(i) reflexivity: $x \geq x$,
(iv) separation: given two sets $A$ and $B$ in $\mathbb{R}^n$, if $a \geq b$ for every $a \in A$ and $b \in B$, then there exists $c \in \mathbb{R}^n$ such that $a \geq c \geq b$ for every $a \in A$ and $b \in B$.
Another notion that becomes surprisingly delicate when $n \geq 2$ is that of strict inequality. Indeed, given two vectors $x = (x_1, x_2, \ldots, x_n)$ and $y = (y_1, y_2, \ldots, y_n)$ of $\mathbb{R}^n$, two cases can happen:
All the components of $x$ are $\geq$ than the corresponding components of $y$, with some of them strictly greater; i.e., $x_i \geq y_i$ for each index $i = 1, 2, \ldots, n$, with $x_i > y_i$ for at least one index $i$.
All the components of $x$ are $>$ than the corresponding components of $y$; i.e., $x_i > y_i$ for each $i = 1, 2, \ldots, n$.
In the first case we have a strict inequality, in symbols $x > y$; in the second case a strong inequality, in symbols $x \gg y$.
\[ x \gg y \implies x > y \implies x \geq y \]
The three notions of inequality among vectors in $\mathbb{R}^n$ are, therefore, more and more stringent:
(i) a weak notion, $\geq$, that permits the equality between the two vectors;
(ii) an intermediate notion, $>$, that requires at least one strict inequality among the components;
(iii) a strong notion, $\gg$, that requires strict inequality among all the components of the two vectors.
When $n = 1$, both $>$ and $\gg$ reduce to the classical $>$ on $\mathbb{R}$ seen in Section 1.4. Moreover, the same symbols “reversed”, i.e., $\leq$, $<$, and $\ll$, are used in the opposite case.
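The three notions of inequality translate directly into componentwise checks. The sketch below is only illustrative: the function names `weak`, `strict` and `strong` are ours, and vectors are represented as tuples of equal length.

```python
def weak(x, y):
    """x >= y: every component of x is at least the corresponding one of y."""
    return all(xi >= yi for xi, yi in zip(x, y))


def strict(x, y):
    """x > y: x >= y with at least one strictly greater component."""
    return weak(x, y) and any(xi > yi for xi, yi in zip(x, y))


def strong(x, y):
    """x >> y: every component of x is strictly greater."""
    return all(xi > yi for xi, yi in zip(x, y))


print(strict((1, 2), (1, 1)), strong((1, 2), (1, 1)))   # True False
print(strong((2, 2), (1, 1)))                           # True
print(weak((0, 1), (1, 0)), weak((1, 0), (0, 1)))       # False False
```

The last line shows two vectors of $\mathbb{R}^2$ that are not comparable: neither is weakly greater than the other.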
An especially important case is the comparison between a vector $x$ and the vector $0$. We say that the vector $x$ is:
(ii) strictly positive if $x > 0$, i.e., if all the components of $x$ are positive and at least one of them is strictly positive;
(iii) strongly positive if $x \gg 0$, i.e., all the components of $x$ are strictly positive.
N.B. The notation and terminology that we have introduced are not the only possible ones. For example, some authors use $\geqq$, $\geq$, and $>$ in place of $\geq$, $>$, and $\gg$; other authors call “non-negative” the vectors that we call positive, and so on.
Together with the lack of completeness of $\geq$, the presence of the two different notions of strict inequality is the main novelty that we have in $\mathbb{R}^n$, when $n \geq 2$, with respect to the special case $\mathbb{R}$, i.e., $n = 1$, of Section 1.4.
We also have
N.B. (i) The intervals in $\mathbb{R}^n$ can be expressed as Cartesian products of intervals in $\mathbb{R}$; for example,
\[ [a, b] = \prod_{i=1}^n [a_i, b_i] \]
2.4 Applications
2.4.1 Static choices
Let us consider a consumer who has to choose how many kilograms of apples and of potatoes to buy at the market. For convenience, we assume that these goods are infinitely divisible, so that the consumer can buy any real positive quantity (for example, 3 kg of apples and $\pi$ kg of potatoes). In this case, $\mathbb{R}_+$ is the set of the possible quantities of apples that can be bought, and the same for potatoes. Therefore, the set of the bundles of apples and potatoes that the consumer can buy is
\[ \mathbb{R}^2_+ = \mathbb{R}_+ \times \mathbb{R}_+ = \{(x_1, x_2) : x_1, x_2 \in \mathbb{R}_+\} \]
Graphically, this is the first quadrant of the plane. In general, if a consumer chooses among more than two goods, say $n$ goods, the set of the bundles is represented by the Cartesian product
\[ \mathbb{R}^n_+ = \mathbb{R}_+ \times \mathbb{R}_+ \times \cdots \times \mathbb{R}_+ = \{(x_1, x_2, \ldots, x_n) : x_i \in \mathbb{R}_+ \text{ for } i = 1, 2, \ldots, n\} \]
represents the quantity of money in different periods: in this case $x$ is a cash flow. For example, the current account of a family records each day the balance between revenues (wages, incomes, etc.) and expenditures (purchases, rents, etc.): setting $T = 365$, the resulting cash flow is
\[ x = (x_1, x_2, \ldots, x_{365}) \]
Therefore, $x_1$ is the balance of the current account on January 1, $x_2$ is the balance on January 2, and so on until $x_{365}$, which is the balance at the end of the year.
Instead of a single good over several periods, we can consider a bundle of several goods over several periods. Similarly, in an intertemporal problem of production, we will have vectors of input over several periods. Such situations are modeled by means of matrices, a simple notion that will be studied in Chapter 13. Many economic applications focus, however, on the single good case, and therefore $\mathbb{R}^T$ is a very important space in the theory of intertemporal choices.
Indeed, requiring that all the points of $A$ be $\leq \hat{x}$ amounts to requiring that none of them be $> \hat{x}$. A similar reformulation can be given for minima.
We turn now our attention to subsets of $\mathbb{R}^n$ and the partial order $\geq$. We can extend the notion of maximum in the following way.
3
The notation t = 1; 2; : : : ; T is equivalent to t 2 f1; 2; : : : ; T g, like the notation i = 1; 2; : : : ; n is equivalent
to i 2 f1; 2; : : : ; ng. Choosing one of them is a matter of convenience.
2.5. PARETO OPTIMA 51
In an analogous way we can define the minimum. Moreover, the analogue of Proposition 33 holds: the maximum (minimum) of a set $A \subseteq \mathbb{R}^n$, if it exists, is unique (the proof is similar).
Unfortunately, the notions of maximum and minimum are of little interest in applications because subsets of $\mathbb{R}^n$ often do not have maxima or minima, since the order $\geq$ is only partial in $\mathbb{R}^n$ (as seen in Section 2.3). It is much more profitable to follow, instead, the order of ideas sketched in Lemma 49. Indeed, the characterization established there is equivalent to the usual definition of maximum in $\mathbb{R}$, but it becomes more general in $\mathbb{R}^n$. This motivates the next definition, of great importance in economic applications.
In a similar way we can define minimals, which are also called Pareto optima.4 Say that a point $x \in A$ is dominated by another point $y \in A$ if $x < y$, that is, if $x_i \leq y_i$ for each index $i$, with $x_i < y_i$ for at least one index $i$ (Section 2.3). A dominated point is thus outperformed by another point available in the set. For instance, if they represent bundles of goods, a dominated bundle $x$ is obviously a no better alternative than the dominant one $y$. In terms of dominance, we can say that a point $a$ of $A$ is maximal if it is not dominated by any other point in $A$. That is, $a$ is not outperformed by any other alternative available in $A$. Maximality is thus the natural extension of the notion of maximum when dealing, as is often the case in applications, with alternatives that are multi-dimensional (and so represented by vectors of $\mathbb{R}^n$).
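For a finite set, the maximals can be found by checking dominance pair by pair. The sketch below follows the definition of dominance just given; the function names and the small set `A` are made up for illustration.

```python
def dominates(y, x):
    """True if x < y: y_i >= x_i for every i, with y_i > x_i for some i."""
    return all(yi >= xi for xi, yi in zip(x, y)) and any(
        yi > xi for xi, yi in zip(x, y)
    )


def maximals(A):
    """Points of the finite set A that no other point of A dominates."""
    return [a for a in A if not any(dominates(y, a) for y in A)]


# A small, made-up set of points of R^2
A = [(1, 3), (2, 2), (3, 1), (1, 1), (2, 1)]
print(maximals(A))   # [(1, 3), (2, 2), (3, 1)]
```

This set has three maximals but no maximum: none of its points is weakly greater than all the others.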
In the rest of the section we focus on maxima and maximals, the most relevant in economic
applications, leaving to the reader the dual properties that hold for minima and minimals.
The set in the next figure has a maximum, point $a$, which thanks to this lemma is therefore also the unique maximal.
4
Optima, like angels, have no gender. Even if it were preferable to talk about Pareto maxima and minima, unfortunately the tradition does not distinguish between them, calling them both Pareto optima. Their nature is then clarified by the context.
Lemma 52 has, therefore, established that the maximum of a set, when it exists, is the unique maximal; that is,
\[ \text{maximum} \implies \text{maximal} \]
But the converse is false: there exist maximals that are not maxima; that is,
Example 53 The next figure shows a set $A$ of $\mathbb{R}^2$ that has no maxima, but infinitely many maximals.
[Figure: the set $A$ in the plane, with a point $a$ on its dark edge]
It is easy to see that any point $a \in A$ on the dark edge is maximal: there is no $x \in A$ such that $x > a$. On the other hand, $a$ is not a maximum: we have $a \geq x$ only for the points $x \in A$ that are comparable with $a$, which are represented in the shaded part of $A$:
Nothing can be said, instead, for the points that are not comparable with $a$ (the non-shaded part of $A$). The lack of maxima for this set is due to the non-comparability of all the elements of the set, so, in the final analysis, to the fact that the order $\geq$ is only partial in $\mathbb{R}^n$ when $n > 1$.
The set $A$ of the example illustrates another fundamental difference between maxima and maximals in $\mathbb{R}^n$ with $n > 1$: the maximum of a set, if it exists, is unique, while a maximal might not be unique (indeed, very often, it is not).
In conclusion, because of the incompleteness of the order $\geq$ on $\mathbb{R}^n$, maxima are much less important than maximals, which are the key notion in $\mathbb{R}^n$. That said, maximals might also not exist: the 45° straight line is a simple subset of $\mathbb{R}^2$ without maximals and minimals.5
Maximals are fundamental in economics, where they are (often) called Pareto optima. The set of these points is of particular importance.
Definition 54 The set of the maximals of a set $A \subseteq \mathbb{R}^n$ is called the Pareto (or efficient) frontier of $A$.
5
This set is the graph of the function $f : \mathbb{R} \to \mathbb{R}$ given by $f(x) = x$, as we will see in Chapter 6.
In the last example, the dark edge is the Pareto frontier of the set $A$:
[Figure: the set $A$ with its Pareto frontier, the dark edge]
As a first economic application, assume for example that the different vectors of a set $A \subseteq \mathbb{R}^n$ represent the profits that $n$ individuals can get. The Pareto optima represent the situations from which it is not possible to move away without reducing the profit of at least one of the individuals. In other words, the $n$ individuals would not object to restricting $A$ to the set of its Pareto optima (nobody loses), that is, to its Pareto frontier; a conflict of interests arises, instead, when a point on the frontier has to be selected.
The concept of Pareto optimum, simple but ingenious, has the great merit of allowing us to narrow down, with a unanimous consensus, a set $A$ of alternative possibilities, and so to identify the true “critical” subset, the Pareto frontier, which is often much smaller than the original set $A$.6
A magnificent illustration of this key aspect of Pareto optimality is the famous Edgeworth box.7 Consider two agents, Albert and Barbara, who have to divide between them unitary quantities of two infinitely divisible goods (for example, a kilogram of flour and a liter of wine). We want to model the problem of division (probably determined by a bargaining between them) and to see if, thanks to Pareto optimality, we can say something non-trivial about it.
Each pair $x = (x_1, x_2)$ with $x_1 \in [0, 1]$ and $x_2 \in [0, 1]$ is a possible allocation of the two goods to one of the two agents: in other words, the Cartesian product $[0, 1] \times [0, 1]$ describes them all. The two agents must agree on the allocations $(a_1, a_2)$ of Albert and $(b_1, b_2)$ of Barbara. Clearly,
\[ a_1 + b_1 = a_2 + b_2 = 1 \tag{2.1} \]
6
For Pareto optimality it is key that the agents only consider their own alternatives (bundles of goods, profits, etc.), without minding those of their peers. In other words, that they do not feel envy or similar social emotions. To see why, think of a tribe of “envious” members, whose head decides to double the food rations of half of the members of the tribe, leaving unchanged those of the other members. The new allocation would provoke lively protests by the “unchanged” members even though nothing changed for them.
7
Since we will use notions that we will introduce in Chapter 6, the reader may want to read this application
after having read that chapter.
To complete the description of the problem, we have to say what the desiderata of the two agents are. To this end, we suppose that they have identical utility functions $u_a, u_b : [0, 1] \times [0, 1] \to \mathbb{R}$, and that, for simplicity, they are of the Cobb-Douglas type $u_a(x_1, x_2) = u_b(x_1, x_2) = \sqrt{x_1 x_2}$ (see Example 174). The indifference curves can be “packed” in the following way:
This is the classic Edgeworth box. By condition (2.1), we can think of a point $(x_1, x_2) \in [0, 1] \times [0, 1]$ as the allocation of Albert. We can actually identify each possible division between the two agents with the allocations $(x_1, x_2)$ of Albert; indeed, the allocations of Barbara $(1 - x_1, 1 - x_2)$ are univocally determined once those of Albert are known.
Each allocation $(x_1, x_2)$ has utility $u_a(x_1, x_2)$ for Albert and $u_b(1 - x_1, 1 - x_2)$ for Barbara. Let
be the set of all the utility profiles of the two agents determined by the division of the two goods.
By looking at the Edgeworth box, the reader will be easily convinced that the Pareto frontier of $A$, i.e., the set of the Pareto optima of $A$, is given by the diagonal
of the box. That is, by the locus of the tangency points of the indifference curves (called contract curve). To prove this rigorously, we need the next simple result.
Since the last inequality is always true, we conclude that (2.2) holds. Moreover, these equivalences imply that
\[ 1 - \sqrt{x_1 x_2} = \sqrt{(1 - x_1)(1 - x_2)} \iff (x_1 - x_2)^2 = 0 \]
Having established this lemma, we can now prove rigorously what the graph suggested.
that is,
\[ \sqrt{x_1 x_2} > \sqrt{dd} = d \quad \text{and} \quad \sqrt{(1 - x_1)(1 - x_2)} \geq \sqrt{(1 - d)(1 - d)} = 1 - d \]
Therefore,
\[ 1 - \sqrt{x_1 x_2} < 1 - d \leq \sqrt{(1 - x_1)(1 - x_2)} \]
which contradicts (2.2). It follows that there is no $(x_1, x_2) \in [0, 1] \times [0, 1]$ for which (2.4) holds. This completes the proof.
8
A similar argument holds when $u_a(x_1, x_2) \geq u_a(d, d)$ and $u_b(1 - x_1, 1 - x_2) > u_b(1 - d, 1 - d)$.
By Proposition 56, we can say that if the agents maximize their Cobb-Douglas utilities, the bargaining will be solved in a division of the goods on the diagonal of the Edgeworth box, i.e., such that each agent has an equal quantity of both goods.
Naturally, Proposition 56 cannot tell us anything about which of the points of the diagonal is, then, effectively determined by the bargaining. The Pareto frontier $D$ is, however, a small subset of $A$: through the notion of Pareto optimum we have been able to say something highly non-trivial about the problem of division.
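The conclusion can also be probed numerically. The sketch below is not a proof: it only checks, on a finite grid of allocations (our choice, with step 0.05), that an off-diagonal allocation is dominated while no diagonal allocation is. The function names and the tolerance `eps` are assumptions of this illustration.

```python
import math


def ua(x1, x2):
    return math.sqrt(x1 * x2)               # Albert's utility


def ub(x1, x2):
    return math.sqrt((1 - x1) * (1 - x2))   # Barbara's utility at Albert's allocation


def profile(x1, x2):
    return (ua(x1, x2), ub(x1, x2))


def dominated(p, grid, eps=1e-9):
    """True if some grid allocation gives both agents at least as much utility
    as the profile p (up to eps) and one of them strictly more."""
    for x1 in grid:
        for x2 in grid:
            q = profile(x1, x2)
            if q[0] >= p[0] - eps and q[1] >= p[1] - eps and (
                q[0] > p[0] + eps or q[1] > p[1] + eps
            ):
                return True
    return False


grid = [i / 20 for i in range(21)]            # allocations 0, 0.05, ..., 1
print(dominated(profile(0.2, 0.8), grid))     # True: off the diagonal
print(any(dominated(profile(d, d), grid) for d in grid))   # False
```

For instance, the allocation $(0.2, 0.8)$ gives utility $0.4$ to both agents, while the diagonal allocation $(0.5, 0.5)$ gives $0.5$ to both.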
Chapter 3
Linear structure
In this chapter we study more in depth the linear structure of $\mathbb{R}^n$ which was introduced in Section 2.2. The study of such a fundamental structure of $\mathbb{R}^n$, which we will continue in Chapter 13 on linear functions, is part of linear algebra. The theory of finance is a fundamental application of linear algebra, as we will see in Section 17.5.
(v1) $x + y = y + x$ (commutativity)
(v2) $(x + y) + z = x + (y + z)$ (associativity)
(v5) $\alpha(x + y) = \alpha x + \alpha y$ (distributivity)
(v6) $(\alpha + \beta) x = \alpha x + \beta x$ (distributivity)
(v8) $\alpha(\beta x) = (\alpha\beta) x$ (associativity)
For this reason, as the reader will learn in more advanced courses, $\mathbb{R}^n$ is an example of a vector space, which, in general, is a set where we can define two operations of addition and multiplication by scalars that satisfy properties (v1)–(v8).1 For example, in Chapter 13 we will see another example of vector space, the space of matrices.
We call vector subspaces of $\mathbb{R}^n$ its subsets that behave well with respect to the two operations:
1
The notion of vector space (first proposed by Giuseppe Peano in 1888) is central in mathematics, but it is necessary to go beyond $\mathbb{R}^n$ to fully understand it. For this reason the reader will study in depth this notion in more advanced courses.
60 CHAPTER 3. LINEAR STRUCTURE
We leave to the reader the easy check that the two operations satisfy in $V$ properties (v1)–(v8). In this regard, note that the origin belongs to each vector subspace $V$, i.e., $0 \in V$, since $0x = 0$ for every vector $x \in V$.
\[ \alpha x + \beta y \in V \tag{3.1} \]
Proof “Only if”. Let $V$ be a vector subspace and let $x, y \in V$. As $V$ is closed with respect to multiplication by scalars, we have $\alpha x \in V$ and $\beta y \in V$. It follows that $\alpha x + \beta y \in V$ since $V$ is closed with respect to addition.
“If”. Putting $\alpha = \beta = 1$ in (3.1), we get $x + y \in V$, while, putting $\beta = 0$, we get $\alpha x \in V$. Therefore, $V$ is closed with respect to the operations of addition and multiplication by scalars inherited from $\mathbb{R}^n$.
Putting $\alpha = \beta = 0$, (3.1) implies that $0 \in V$. This confirms that each vector subspace contains the origin $0$.
Example 59 There are two legitimate, but trivial, subspaces of $\mathbb{R}^n$: the singleton $\{0\}$ and the space $\mathbb{R}^n$ itself. In particular, the reader can check that a singleton $\{x\}$ is a vector subspace of $\mathbb{R}^n$ if and only if $x = 0$.
\[ M = \{x \in \mathbb{R}^n : x_1 = \cdots = x_m = 0\} \]
\[ \begin{aligned} \alpha x + \beta y &= (\alpha x_1 + \beta y_1, \ldots, \alpha x_n + \beta y_n) \\ &= (0, \ldots, 0, \alpha x_{m+1} + \beta y_{m+1}, \ldots, \alpha x_n + \beta y_n) \in M \end{aligned} \]
In other words, $M$ is the set of the solutions of this system of equations. It is a vector subspace: the reader can check that, given $x, y \in M$ and $\alpha, \beta \in \mathbb{R}$, we have $\alpha x + \beta y \in M$. Performing the computations,3 we find that the vectors
\[ \left( -\frac{10}{3} t, -6t, -\frac{2}{3} t, t \right) \tag{3.2} \]
\[ M = \left\{ \left( -\frac{10}{3} t, -6t, -\frac{2}{3} t, t \right) : t \in \mathbb{R} \right\} \]
If $V_1$ and $V_2$ are two vector subspaces, we can show that also their intersection $V_1 \cap V_2$ is a vector subspace. More generally, we have the following result.
Differently from the intersection, the union of vector subspaces is not in general a vector subspace, as the next example shows.
3
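For the sake of completeness, we provide the computations. We consider $x_4$ as a “parameter” and solve the system in $x_1$, $x_2$, and $x_3$; clearly, we will get solutions that depend on the value of the parameter $x_4$:
\[ \begin{cases} 2x_1 - x_2 + 2x_3 + 2x_4 = 0 \\ x_1 - x_2 - 2x_3 - 4x_4 = 0 \\ x_1 - 2x_2 - 2x_3 - 10x_4 = 0 \end{cases} \implies \begin{cases} 2x_1 - x_2 = -2x_3 - 2x_4 \\ x_1 - x_2 = 2x_3 + 4x_4 \\ x_1 - 2x_2 - 2x_3 - 10x_4 = 0 \end{cases} \implies \]
\[ \begin{cases} 2(x_2 + 2x_3 + 4x_4) - x_2 = -2x_3 - 2x_4 \\ x_1 + (-2x_3 - 2x_4 - 2x_1) = 2x_3 + 4x_4 \\ x_1 - 2x_2 - 2x_3 - 10x_4 = 0 \end{cases} \implies \begin{cases} x_2 = -6x_3 - 10x_4 \\ x_1 = -4x_3 - 6x_4 \\ x_1 - 2x_2 - 2x_3 - 10x_4 = 0 \end{cases} \implies \]
\[ \begin{cases} x_2 = -6x_3 - 10x_4 \\ x_1 = -4x_3 - 6x_4 \\ (-4x_3 - 6x_4) - 2(-6x_3 - 10x_4) - 2x_3 - 10x_4 = 0 \end{cases} \implies \begin{cases} x_2 = -6x_3 - 10x_4 \\ x_1 = -4x_3 - 6x_4 \\ x_3 = -\frac{2}{3} x_4 \end{cases} \implies \]
\[ \begin{cases} x_2 = -6\left(-\frac{2}{3} x_4\right) - 10x_4 \\ x_1 = -4\left(-\frac{2}{3} x_4\right) - 6x_4 \\ x_3 = -\frac{2}{3} x_4 \end{cases} \implies \begin{cases} x_2 = -6x_4 \\ x_1 = -\frac{10}{3} x_4 \\ x_3 = -\frac{2}{3} x_4 \end{cases} \]
In conclusion, the vectors of $\mathbb{R}^4$ of the form (3.2) are the solutions of the system for every $t \in \mathbb{R}$.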
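The result of these computations can be double-checked with exact rational arithmetic. In the sketch below the coefficient rows are our reading of the system (the signs are an assumption, chosen to agree with the intermediate steps), and the names `SYSTEM`, `solution` and `residuals` are ours.

```python
from fractions import Fraction as F

# Coefficient rows of the homogeneous system (signs are an assumption,
# reconstructed so as to agree with the intermediate steps of the computation)
SYSTEM = [
    (2, -1,  2,   2),
    (1, -1, -2,  -4),
    (1, -2, -2, -10),
]


def solution(t):
    """The vector of the form (3.2) for the parameter t."""
    t = F(t)
    return (F(-10, 3) * t, -6 * t, F(-2, 3) * t, t)


def residuals(x):
    """Left-hand side of each equation evaluated at x."""
    return [sum(c * xi for c, xi in zip(row, x)) for row in SYSTEM]


for t in (0, 1, -3, F(7, 2)):
    # every left-hand side vanishes, whatever the value of t
    print(t, all(r == 0 for r in residuals(solution(t))))
```

Using `Fraction` avoids rounding errors, so the residuals are exactly zero rather than merely small.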
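\[ V_1 \cup V_2 = \{x \in \mathbb{R}^2 : x_1 = 0 \text{ or } x_2 = 0\} \]
\[ \alpha_1 = \alpha_2 = \cdots = \alpha_m = 0 \]
The set $\{x^i\}_{i=1}^m$ is, instead, said to be linearly dependent if it is not linearly independent, i.e.,4 if there exists a set $\{\alpha_i\}_{i=1}^m$ of real numbers, not all equal to zero, such that
\[ \alpha_1 x^1 + \alpha_2 x^2 + \cdots + \alpha_m x^m = 0 \]
\[ \begin{aligned} e^1 &= (1, 0, 0, \ldots, 0) \\ e^2 &= (0, 1, 0, \ldots, 0) \\ &\ \ \vdots \\ e^n &= (0, 0, \ldots, 0, 1) \end{aligned} \]
called standard unit vectors (or versors) of $\mathbb{R}^n$. The set $\{e^1, \ldots, e^n\}$ is linearly independent. Indeed
\[ \alpha_1 e^1 + \cdots + \alpha_n e^n = (\alpha_1, \ldots, \alpha_n) \]
and therefore $\alpha_1 e^1 + \cdots + \alpha_n e^n = 0$ implies $\alpha_1 = \cdots = \alpha_n = 0$.
Example 66 All the sets of vectors $\{x^i\}_{i=1}^m$ of $\mathbb{R}^n$ that contain the vector $0$ are linearly dependent. Indeed, without loss of generality, set $x^1 = 0$. Given a set $\{\alpha_i\}_{i=1}^m$ of scalars with $\alpha_1 \neq 0$ and $\alpha_i = 0$ for $i = 2, \ldots, m$, we have
\[ \alpha_1 x^1 + \alpha_2 x^2 + \cdots + \alpha_m x^m = 0 \]
which proves the linear dependence of the set $\{x^i\}_{i=1}^m$.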
4
See Section C.6.3 of the appendix for a careful analysis of this important negation.
3.2. LINEAR INDEPENDENCE AND DEPENDENCE 63
Example 67 Two vectors $x^1$ and $x^2$ that are linearly dependent are called collinear. This
happens if and only if there exist two scalars $\alpha_1$ and $\alpha_2$, at least one of which is different from
zero, such that $\alpha_1 x^1 = \alpha_2 x^2$. In other words, if and only if either $x^1 = 0$, or $x^2 = 0$, or there
exists a scalar $\alpha \neq 0$ such that $x^1 = \alpha x^2$. N
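\[
\begin{aligned}
\alpha_1 x^1 + \alpha_2 x^2 + \alpha_3 x^3 &= \alpha_1 (1, 1, 1) + \alpha_2 (3, 1, 5) + \alpha_3 (9, 1, 25) \\
&= (\alpha_1 + 3\alpha_2 + 9\alpha_3,\ \alpha_1 + \alpha_2 + \alpha_3,\ \alpha_1 + 5\alpha_2 + 25\alpha_3)
\end{aligned}
\]
and, therefore, $\alpha_1 x^1 + \alpha_2 x^2 + \alpha_3 x^3 = 0$ means
\[
\begin{cases} \alpha_1 + 3\alpha_2 + 9\alpha_3 = 0 \\ \alpha_1 + \alpha_2 + \alpha_3 = 0 \\ \alpha_1 + 5\alpha_2 + 25\alpha_3 = 0 \end{cases}
\]
which is a system of equations whose unique solution is $(\alpha_1, \alpha_2, \alpha_3) = (0, 0, 0)$. More
generally, to verify if k vectors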
If $(\alpha_1, \dots, \alpha_k) = (0, \dots, 0)$ is the unique solution, then the vectors are linearly independent in
$\mathbb{R}^n$. For example, consider in $\mathbb{R}^3$ the two vectors $x^1 = (1, 3, 4)$ and $x^2 = (2, 5, 1)$. The system
to solve is
\[
\begin{cases} \alpha_1 + 2\alpha_2 = 0 \\ 3\alpha_1 + 5\alpha_2 = 0 \\ 4\alpha_1 + \alpha_2 = 0 \end{cases}
\]
It has the unique solution $(\alpha_1, \alpha_2) = (0, 0)$, and so the two vectors $x^1$ and $x^2$ are linearly
independent. N
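The check described above (is the zero vector of coefficients the only solution?) can be mechanized. The sketch below uses a different but equivalent technique, which is not the one in the text: it counts pivots by Gaussian elimination, since a finite set of vectors is linearly independent exactly when the rank of the matrix having them as rows equals their number. The function names `rank` and `linearly_independent` are illustrative.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) via Gaussian elimination over exact fractions."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0  # number of pivots found so far
    for col in range(len(m[0]) if m else 0):
        # look for a row at or below position r with a nonzero entry in this column
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def linearly_independent(vectors):
    """Vectors are independent iff the rank equals the number of vectors."""
    return rank(vectors) == len(vectors)

print(linearly_independent([(1, 1, 1), (3, 1, 5), (9, 1, 25)]))  # vectors of the first system
print(linearly_independent([(1, 3, 4), (2, 5, 1)]))              # the two vectors in R^3
print(linearly_independent([(1, 2), (2, 4)]))                    # a collinear pair
```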
\[
\left( -\tfrac{10}{3}t,\ -6t,\ -\tfrac{2}{3}t,\ t \right) \tag{3.3}
\]
for each $t \in \mathbb{R}$. Therefore, $(0, 0, 0, 0)$ is not the unique solution of the system, and so the
vectors $x^1$, $x^2$, $x^3$, and $x^4$ are linearly dependent. Indeed, by setting for example $t = 1$ in
(3.3), the set of four numbers
\[
(\alpha_1, \alpha_2, \alpha_3, \alpha_4) = \left( -\tfrac{10}{3},\ -6,\ -\tfrac{2}{3},\ 1 \right)
\]
is a set of real coefficients, with at least one different from zero, such that
$\alpha_1 x^1 + \alpha_2 x^2 + \alpha_3 x^3 + \alpha_4 x^4 = 0$. N
Proposition 70 The subsets of a linearly independent set are, in turn, linearly independent.
The simple proof is left to the reader, who can also check that if we add a vector (or
more than one) to a linearly dependent set, the set remains linearly dependent.
\[
x = \alpha_1 x^1 + \cdots + \alpha_m x^m
\]
Theorem 73 A finite set $S$ of $\mathbb{R}^n$, with $S \neq \{0\}$, is linearly dependent if and only if there
exists at least an element of $S$ that is a linear combination of other elements of $S$.
3.3. LINEAR COMBINATIONS 65
Proof “Only if”. Let $S = \{x^i\}_{i=1}^m$ be a linearly dependent set of $\mathbb{R}^n$. Let $2 \le k \le m$
be the smallest natural number between 2 and $m$ such that the set $\{x^1, \dots, x^k\}$ is linearly
dependent. At worst, $k$ is equal to $m$ since by hypothesis $\{x^i\}_{i=1}^m$ is linearly dependent. By
the definition of linear dependence, there exist therefore $k$ real coefficients $\{\alpha_i\}_{i=1}^k$, with at
least one different from zero, such that
\[
\alpha_1 x^1 + \alpha_2 x^2 + \cdots + \alpha_k x^k = 0
\]
We have $\alpha_k \neq 0$, because otherwise $\{x^1, \dots, x^{k-1}\}$ would be a linearly dependent set, contradicting
the fact that $k$ is the smallest natural number between 2 and $m$ such that $\{x^1, \dots, x^k\}$
is a linearly dependent set. Given that $\alpha_k \neq 0$, we can write
\[
x^k = -\frac{\alpha_1}{\alpha_k} x^1 - \frac{\alpha_2}{\alpha_k} x^2 - \cdots - \frac{\alpha_{k-1}}{\alpha_k} x^{k-1}
\]
and therefore $x^k$ is a linear combination of the vectors $x^1, \dots, x^{k-1}$. In other words, the
vector $x^k$ of $S$ is a linear combination of other elements of $S$.
“If”. Suppose that the vector $x^k$ of a finite set $S = \{x^i\}_{i=1}^m$ is a linear combination of
other elements of $S$. Without loss of generality, assume $k = 1$. There exists a set $\{\beta_i\}_{i=2}^m$ of
real coefficients such that
\[
x^1 = \beta_2 x^2 + \cdots + \beta_m x^m
\]
Define the real coefficients $\{\alpha_i\}_{i=1}^m$ as follows
\[
\alpha_i = \begin{cases} -1 & i = 1 \\ \beta_i & i \ge 2 \end{cases}
\]
By construction, $\{\alpha_i\}_{i=1}^m$ is a set of real coefficients, with at least one different from zero,
such that $\sum_{i=1}^m \alpha_i x^i = 0$. Indeed
\[
\sum_{i=1}^m \alpha_i x^i = -x^1 + \beta_2 x^2 + \beta_3 x^3 + \cdots + \beta_m x^m = -x^1 + x^1 = 0
\]
It follows that $\{x^i\}_{i=1}^m$ is a linearly dependent set.
Corollary 76 A finite set $S$ of $\mathbb{R}^n$ is linearly independent if and only if none of the vectors
in $S$ is a linear combination of other vectors in $S$.
The next result shows how span S has a “concrete” representation in terms of linear
combinations of S.
Before illustrating the theorem with some examples, we state a simple consequence.
N
3.5 Bases
By Theorem 77, the subspace generated by a subset $S$ of $\mathbb{R}^n$ is formed by all the linear
combinations of the vectors in $S$. Suppose that $S$ is a linearly dependent set. By Theorem
73, some vectors in $S$ are linear combinations of other elements of $S$. By Corollary 78, such
vectors are, therefore, redundant for the generation of $\operatorname{span} S$. Indeed, if a vector $x \in \operatorname{span} S$
is a linear combination of vectors of $S$, then by Corollary 78 we have
2. all the vectors of S are essential for this representation, none of them is redundant.
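\[
x = \sum_{i=1}^m \alpha_i x^i = \sum_{i=1}^m \beta_i x^i
\]
Hence,
\[
\sum_{i=1}^m (\alpha_i - \beta_i)\, x^i = 0
\]
and, since the vectors in $S$ are linearly independent, it follows that $\alpha_i - \beta_i = 0$ for every
$i = 1, \dots, m$; that is, $\alpha_i = \beta_i$ for every $i = 1, \dots, m$.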
“If”. Let $S = \{x^1, \dots, x^m\}$ and suppose that each $x \in \mathbb{R}^n$ can be written in a unique way
as a linear combination of vectors in $S$. Clearly, by Theorem 77, we have $\mathbb{R}^n = \operatorname{span} S$. It
remains to prove that $S$ is a linearly independent set. Suppose that the scalars $\{\alpha_i\}_{i=1}^m$ are
such that
\[
\sum_{i=1}^m \alpha_i x^i = 0
\]
Since we also have
\[
\sum_{i=1}^m 0\, x^i = 0
\]
we conclude that $\alpha_i = 0$ for every $i = 1, \dots, m$ since, by hypothesis, the vector $0$ can be
written in only one way as a linear combination of vectors in $S$.
that is, the coefficients of the linear combination are the components of the vector $x$. N
Example 85 The canonical basis of $\mathbb{R}^2$ is $\{(1, 0), (0, 1)\}$. But there exist infinitely many
other bases of $\mathbb{R}^2$: for example, $S = \{(1, 2), (0, 7)\}$ is another such basis. It is easy to
prove the linear independence of $S$. To show that $\operatorname{span}(S) = \mathbb{R}^2$, consider any vector
$x = (x_1, x_2) \in \mathbb{R}^2$. We need to show that there exist $\alpha_1, \alpha_2 \in \mathbb{R}$ such that
$\alpha_1 (1, 2) + \alpha_2 (0, 7) = x$, that is,
\[
\begin{cases} \alpha_1 = x_1 \\ 2\alpha_1 + 7\alpha_2 = x_2 \end{cases}
\]
Since
\[
\alpha_1 = x_1, \qquad \alpha_2 = \frac{x_2 - 2x_1}{7}
\]
solve the system, we conclude that $S$ is indeed a basis of $\mathbb{R}^2$. N
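As a small illustration of Example 85, the following sketch computes the coefficients of a vector with respect to the basis $S = \{(1, 2), (0, 7)\}$ using the formulas just derived, and then rebuilds the vector from them. The function names `coordinates_in_S` and `combine` are illustrative.

```python
from fractions import Fraction

def coordinates_in_S(x1, x2):
    """Coefficients (a1, a2) such that a1*(1, 2) + a2*(0, 7) == (x1, x2)."""
    a1 = Fraction(x1)
    a2 = (Fraction(x2) - 2 * a1) / 7
    return a1, a2

def combine(a1, a2):
    """The linear combination a1*(1, 2) + a2*(0, 7)."""
    return (a1 * 1 + a2 * 0, a1 * 2 + a2 * 7)

a1, a2 = coordinates_in_S(3, -1)
print(a1, a2)           # coefficients of (3, -1) in the basis S
print(combine(a1, a2))  # rebuilds the original vector
```

Since the coefficients are determined by explicit formulas, they are unique, in line with the characterization of bases through unique representations.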
Theorem 86 For each linearly independent set $\{x^1, \dots, x^k\}$ of $\mathbb{R}^n$ with $k \le n$, there exist
$n - k$ vectors $x^{k+1}, \dots, x^n$ such that the total set $\{x^i\}_{i=1}^n$ is a basis of $\mathbb{R}^n$.
Due to its importance, we give two different proofs of the result. Both proofs require the
following lemma.
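\[
x = c_1 b^1 + \cdots + c_n b^n
\]
It follows that
\[
\operatorname{span}\{x, b^2, \dots, b^n\} = \operatorname{span}\{b^1, b^2, \dots, b^n\} = \mathbb{R}^n
\]
It remains to show that the set $\{x, b^2, \dots, b^n\}$ is linearly independent, so that we can
conclude that it is a basis of $\mathbb{R}^n$. Let $\{\alpha_i\}_{i=1}^n \subseteq \mathbb{R}$ be coefficients for which
\[
\alpha_1 x + \sum_{i=2}^n \alpha_i b^i = 0 \tag{3.5}
\]
If $\alpha_1 \neq 0$, we have
\[
x = \sum_{i=2}^n \left( -\frac{\alpha_i}{\alpha_1} \right) b^i = 0\, b^1 + \sum_{i=2}^n \left( -\frac{\alpha_i}{\alpha_1} \right) b^i
\]
Since $x$ can be written in a unique way as a linear combination of the vectors of the basis
$\{b^i\}_{i=1}^n$, one gets that $c_1 = 0$, which contradicts the hypothesis $c_1 \neq 0$. This means that
$\alpha_1 = 0$ and (3.5) simplifies to
\[
0\, b^1 + \sum_{i=2}^n \alpha_i b^i = 0
\]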
5
See Appendix D for the induction principle.
Suppose now that the statement of the theorem is true for each set of $k - 1$ vectors; we
want to show that it is true for each set of $k$ vectors. Let therefore $\{x^1, \dots, x^k\}$ be a set of $k$
linearly independent vectors. The subset $\{x^1, \dots, x^{k-1}\}$ is linearly independent and has $k - 1$
elements. By the induction hypothesis, there exist $n - (k - 1)$ vectors $\tilde{y}^k, \dots, \tilde{y}^n$ such that
$\{x^1, \dots, x^{k-1}, \tilde{y}^k, \dots, \tilde{y}^n\}$ is a basis of $\mathbb{R}^n$. Therefore, there exist coefficients $\{\alpha_i\}_{i=1}^n \subseteq \mathbb{R}$ such
that
\[
x^k = \sum_{i=1}^{k-1} \alpha_i x^i + \sum_{i=k}^{n} \alpha_i \tilde{y}^i \tag{3.7}
\]
As the vectors $x^1, \dots, x^{k-1}, x^k$ are linearly independent, at least one of the coefficients
$\{\alpha_i\}_{i=k}^n$ is different from zero. Otherwise we would have $x^k = \sum_{i=1}^{k-1} \alpha_i x^i$ and the vector $x^k$
Proof 2 of Theorem 86 The theorem holds for $k = 1$. Indeed, consider a singleton $\{x\}$,$^{6}$
with $x \neq 0$, and the canonical basis $\{e^1, \dots, e^n\}$ of $\mathbb{R}^n$. As $x = \sum_{i=1}^n x_i e^i$, there exists at
least one index $i$ such that $x_i \neq 0$. By Lemma 87, $\{e^1, \dots, e^{i-1}, x, e^{i+1}, \dots, e^n\}$ is a basis of
$\mathbb{R}^n$.
Since the statement holds for $k = 1$, let $1 < k \le n$ be the smallest integer for which
the property is false. By Lemma 87, there exists a linearly independent set $\{x^1, \dots, x^k\}$ such
that there do not exist $n - k$ vectors of $\mathbb{R}^n$ that, added to $\{x^1, \dots, x^k\}$, yield a basis of $\mathbb{R}^n$.
Given that $\{x^1, \dots, x^{k-1}\}$ is, in turn, linearly independent, the minimality of $k$ implies that
there are $x^k, \dots, x^n$ such that $\{x^1, \dots, x^{k-1}, x^k, \dots, x^n\}$ is a basis of $\mathbb{R}^n$. But then
\[
x^k = c_1 x^1 + \cdots + c_{k-1} x^{k-1} + c_k x^k + \cdots + c_n x^n
\]
is a basis of $\mathbb{R}^n$, a contradiction.
Proof (i) It is enough to set $k = n$ in Theorem 86. (ii) Let $S = \{x^1, \dots, x^k\}$ be a linearly
independent set in $\mathbb{R}^n$. We want to show that $k \le n$. By contradiction, suppose $k > n$.
Then, $\{x^1, \dots, x^n\}$ is in turn a linearly independent set and by assertion (i) is a basis of $\mathbb{R}^n$.
Hence, the vectors $x^{n+1}, \dots, x^k$ are linear combinations of the vectors $x^1, \dots, x^n$, which,
by Corollary 76, contradicts the linear independence of the vectors $x^1, \dots, x^k$. Therefore,
$k \le n$, which completes the proof.
6 Note that a singleton $\{x\}$ is linearly independent when $\alpha x = 0$ implies $\alpha = 0$, which is equivalent to
requiring $x \neq 0$.
Example 89 By assertion (i), any two linearly independent vectors form a basis of $\mathbb{R}^2$.
Going back to Example 85, it is therefore sufficient to verify that the vectors $(1, 2)$ and $(0, 7)$
are linearly independent to conclude that $S = \{(1, 2), (0, 7)\}$ is a basis of $\mathbb{R}^2$. N
Proof Suppose that $\mathbb{R}^n$ has a basis of $n$ elements. By item (ii) of Corollary 88, every other
basis of $\mathbb{R}^n$ can have at most $n$ elements. Let $\{x^1, \dots, x^k\}$ be any other basis of $\mathbb{R}^n$. We
show that one cannot have $k < n$, and so conclude that $k = n$. Suppose that $k < n$. By
Theorem 86, there exist $n - k$ vectors $x^{k+1}, \dots, x^n$ such that the set $\{x^1, \dots, x^k, x^{k+1}, \dots, x^n\}$
is a basis of $\mathbb{R}^n$. This, however, contradicts the assumption that $\{x^1, \dots, x^k\}$ is a basis of $\mathbb{R}^n$,
because the vectors $x^{k+1}, \dots, x^n$ are not linear combinations of the vectors $x^1, \dots, x^k$:
$\{x^1, \dots, x^n\}$ is a linearly independent set. Therefore $k = n$.
Bases of vector subspaces, too, make it possible to represent each vector of the subspace as a linear
combination of basis elements, and such a representation is essential, without redundancies.
The results of the previous section can be easily generalized.$^{7}$ We start with Theorem
83.
Theorem 95 All bases of a vector subspace of Rn have the same number of elements.
Although in view of Theorem 90 the result is not surprising, it remains of great elegance
because it shows how, despite their diversity, the bases share a fundamental characteristic
such as their cardinality. This motivates the next definition, which was implicit in the discussion
that followed Theorem 90.
By Theorem 95, this number is unique, and is denoted by $\dim V$. It is the notion of dimension
that, indeed, makes this (otherwise routine) section interesting, as the next examples
show.
Example 97 In the special case $V = \mathbb{R}^n$ we have $\dim \mathbb{R}^n = n$, which makes rigorous the
discussion that followed Theorem 90. N
Example 98 (i) The horizontal axis is a vector subspace of dimension one of $\mathbb{R}^2$. (ii) The
plane $M = \{x = (x_1, x_2, x_3) \in \mathbb{R}^3 : x_1 = 0\}$ is a vector subspace of dimension two of $\mathbb{R}^3$, that
is, $\dim M = 2$. N
Example 99 If $V = \{0\}$, that is, if $V$ is the trivial vector subspace formed only by the
origin $0$, we set $\dim V = 0$. On the other hand, $V$ does not contain linearly independent
vectors (why?) and, therefore, it has as basis the empty set $\emptyset$. N
Chapter 4
Euclidean structure
Before studying the norm we introduce the absolute value, which is the scalar version of
the norm and probably already familiar to the reader.
\[
|x| = \begin{cases} x & \text{if } x \ge 0 \\ -x & \text{if } x < 0 \end{cases}
\]
For example, $|5| = |-5| = 5$. The absolute value satisfies the following elementary properties
that the reader can verify:
4.1.3 Norm
The notion of norm generalizes that of absolute value to $\mathbb{R}^n$. In particular, the (Euclidean)
norm of a vector $x \in \mathbb{R}^n$, denoted by $\|x\|$, is given by
\[
\|x\| = (x \cdot x)^{\frac{1}{2}} = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}
\]
When $n = 1$, the norm reduces to the absolute value; indeed, thanks to (4.2) we have
\[
\|x\| = \sqrt{x^2} = |x| \qquad \forall x \in \mathbb{R}
\]
For example, if $x = -4$ we have $\|x\| = \sqrt{(-4)^2} = \sqrt{16} = 4 = |-4| = |x|$.
Geometrically, the norm of a vector is nothing but the length of the segment that joins it
with the origin, that is, its distance from the origin. For $n = 2$ this length, by Pythagoras’
Theorem, is exactly $\|x\| = \sqrt{x_1^2 + x_2^2}$.
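(iii) if $x = (a, 2a, -a) \in \mathbb{R}^3$, then $\|x\| = \sqrt{a^2 + (2a)^2 + (-a)^2} = |a| \sqrt{6}$;
(iv) if $x = (2, \pi, \sqrt{2}, 3) \in \mathbb{R}^4$, then
\[
\|x\| = \sqrt{2^2 + \pi^2 + (\sqrt{2})^2 + 3^2} = \sqrt{4 + \pi^2 + 2 + 9} = \sqrt{15 + \pi^2}
\]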
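The definition of the norm translates directly into a few lines of code. The sketch below (the helper name `norm` and the sample value of `a` are illustrative choices) checks the case $n = 1$, where the norm is the absolute value, and item (iii) for one value of $a$.

```python
import math

def norm(x):
    """Euclidean norm of a vector given as a tuple of numbers."""
    return math.sqrt(sum(xi * xi for xi in x))

a = -2.5
print(norm((-4,)))                                               # n = 1: equals |-4|
print(math.isclose(norm((a, 2 * a, -a)), abs(a) * math.sqrt(6))) # item (iii)
```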
The norm satisfies some elementary properties that extend to $\mathbb{R}^n$ those seen for the
absolute value. The next result gathers the simplest ones.
(i) $\|x\| \ge 0$;
(iii) $\|\alpha x\| = |\alpha| \, \|x\|$.
Proof We verify (ii), leaving the rest to the reader. If $x = 0 = (0, 0, \dots, 0)$, then $\|x\| =
\sqrt{0 + 0 + \cdots + 0} = 0$; vice versa, if $\|x\| = 0$, we have
Since $x_i^2 \ge 0$ for each $i = 1, 2, \dots, n$, from (4.3) it follows that $x_i^2 = 0$ for each $i = 1, 2, \dots, n$
since a sum of nonnegative numbers is zero only if they are all zero.
Property (iii) extends the property $|xy| = |x| \, |y|$ of the absolute value. We now state the
famous Cauchy-Schwarz inequality, which is a different, and more subtle, extension of this
property.
1 Recall that two vectors are said to be collinear if they are linearly dependent (Example 67).
Whence
\[
(x \cdot y)^2 \le \|x\|^2 \|y\|^2
\]
and, by taking square roots of both sides, we obtain the inequality (4.4). It remains to prove
that equality holds if and only if the vectors $x$ and $y$ are collinear.
“Only if”. Let us assume that (4.4) holds as equality. Then, by (4.5), it follows that $\Delta = 0$.
Thus, there exists a point $\hat{t}$ where the parabola $at^2 + bt + c$ takes the value 0, i.e.,
\[
0 = (x + \hat{t} y) \cdot (x + \hat{t} y) = \left\| x + \hat{t} y \right\|^2
\]
The Cauchy-Schwarz inequality allows us to prove the triangle inequality, thereby completing
the extension to the norm of properties (i)–(iv) of the absolute value.
that is
\[
\left| \sum_{i=1}^n x_i y_i \right| \le \left( \sum_{i=1}^n x_i^2 \right)^{\frac{1}{2}} \left( \sum_{i=1}^n y_i^2 \right)^{\frac{1}{2}}
\]
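A quick numerical sanity check of the Cauchy-Schwarz inequality, written in the component form $|x \cdot y| \le \|x\| \, \|y\|$, may help fix ideas. This is only a sketch with arbitrarily chosen sample vectors; the helper `cauchy_schwarz_gap` is an illustrative name, and floating-point arithmetic means the collinear case is tested up to a small tolerance.

```python
import math

def dot(x, y):
    """Inner product of two vectors of equal length."""
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    return math.sqrt(dot(x, x))

def cauchy_schwarz_gap(x, y):
    """||x|| ||y|| - |x . y|: nonnegative, and zero when x and y are collinear."""
    return norm(x) * norm(y) - abs(dot(x, y))

print(cauchy_schwarz_gap((1, 2, 3), (4, -5, 6)))    # strictly positive: not collinear
print(cauchy_schwarz_gap((1, 2, 3), (-2, -4, -6)))  # numerically zero: y = -2x
```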
4.1. ABSOLUTE VALUE AND NORM 79
\[
e^n = (0, 0, \dots, 0, 1)
\]
are the standard (or canonical) versors of $\mathbb{R}^n$ introduced in Chapter 3. To see their special
status, note that in $\mathbb{R}^2$ they are
\[
e^1 = (1, 0) \quad \text{and} \quad e^2 = (0, 1)
\]
and lie on the horizontal and on the vertical axes, respectively. In particular, the four versors
$\pm e^1$, $\pm e^2$ are the versors that belong to the Cartesian axes of $\mathbb{R}^2$:
[Figure: the four versors $+e^1$, $-e^1$, $+e^2$, $-e^2$ on the Cartesian axes of $\mathbb{R}^2$, with origin $O$]
In this case, too, the six versors $\pm e^1$, $\pm e^2$, $\pm e^3$ are the versors that belong to the Cartesian
axes of $\mathbb{R}^3$.
4.2 Orthogonality
Appendix B.3 shows how two vectors $x$ and $y$ of the plane can be seen to be perpendicular
when their inner product is zero, i.e., $x \cdot y = 0$. This suggests the following:
\[
x \cdot y = 0
\]
When $x$ and $y$ are orthogonal, we write $x \perp y$. From the commutativity of the inner
product it follows that $x \perp y$ is equivalent to $y \perp x$.
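Example 105 (i) Two different standard versors are orthogonal. For example, for $e^1$ and
$e^2$ in $\mathbb{R}^3$ we have
\[
e^1 \cdot e^2 = (1, 0, 0) \cdot (0, 1, 0) = 0
\]
(ii) The vectors $\left( \sqrt{2}/2, \sqrt{6}/2 \right)$ and $\left( -\sqrt{3}/2, 1/2 \right)$ are orthogonal:
\[
\left( \frac{\sqrt{2}}{2}, \frac{\sqrt{6}}{2} \right) \cdot \left( -\frac{\sqrt{3}}{2}, \frac{1}{2} \right) = -\frac{\sqrt{6}}{4} + \frac{\sqrt{6}}{4} = 0
\]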
Proof We have
as desired.
Definition 107 A set of vectors $\{x^i\}_{i=1}^k$ of $\mathbb{R}^n$ is said to be orthogonal if its vectors are
pairwise orthogonal.
The set e1 ; :::; en of the fundamental versors is the most classical example of orthogonal
set.
Proposition 108 Any orthogonal set that does not contain the zero vector is linearly inde-
pendent.
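Proof Let $\{x^i\}_{i=1}^k$ be an orthogonal set of $\mathbb{R}^n$. Let $\{\alpha_i\}_{i=1}^k$ be a set of scalars such that
$\sum_{i=1}^k \alpha_i x^i = 0$. We have to show that $\alpha_1 = \alpha_2 = \cdots = \alpha_k = 0$. We have
\[
\begin{aligned}
0 &= \left( \sum_{j=1}^k \alpha_j x^j \right) \cdot 0 = \left( \sum_{j=1}^k \alpha_j x^j \right) \cdot \left( \sum_{i=1}^k \alpha_i x^i \right) \\
&= \alpha_1 x^1 \cdot \left( \sum_{i=1}^k \alpha_i x^i \right) + \alpha_2 x^2 \cdot \left( \sum_{i=1}^k \alpha_i x^i \right) + \cdots + \alpha_k x^k \cdot \left( \sum_{i=1}^k \alpha_i x^i \right) \\
&= \alpha_1^2 \left\| x^1 \right\|^2 + \alpha_1 x^1 \cdot \left( \sum_{i=2}^k \alpha_i x^i \right) + \alpha_2^2 \left\| x^2 \right\|^2 + \alpha_2 x^2 \cdot \left( \alpha_1 x^1 + \sum_{i=3}^k \alpha_i x^i \right) \\
&\quad + \cdots + \alpha_k^2 \left\| x^k \right\|^2 + \alpha_k x^k \cdot \left( \sum_{i=1}^{k-1} \alpha_i x^i \right) \\
&= \sum_{i=1}^k \alpha_i^2 \left\| x^i \right\|^2
\end{aligned}
\]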
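An orthogonal set composed of vectors of unit norm, i.e., of versors, is called orthonormal.
The set $\{e^1, \dots, e^n\}$ of the standard versors is, for example, orthonormal. In general,
given an orthogonal set $\{x^i\}_{i=1}^k$ of nonzero vectors of $\mathbb{R}^n$, the set
\[
\left\{ \frac{x^i}{\|x^i\|} \right\}_{i=1}^k
\]
obtained by dividing each element by its norm is orthonormal. Indeed, we have
\[
\left\| \frac{x^i}{\|x^i\|} \right\| = \frac{1}{\|x^i\|} \left\| x^i \right\| = 1
\quad \text{and} \quad
\frac{x^i}{\|x^i\|} \cdot \frac{x^j}{\|x^j\|} = \frac{1}{\|x^i\| \, \|x^j\|} \, x^i \cdot x^j = 0 \quad \text{for every } i \neq j.
\]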
\[
x^1 = (1, 1, 1), \qquad x^2 = (-2, 1, 1), \qquad x^3 = (0, -1, 1)
\]
2
In reading this result, recall that a set of vectors containing the zero vector is necessarily linearly dependent
(see Example 66).
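Then
\[
\left\| x^1 \right\| = \sqrt{3}, \qquad \left\| x^2 \right\| = \sqrt{6}, \qquad \left\| x^3 \right\| = \sqrt{2}
\]
Dividing each vector by its norm, we get the orthonormal vectors
\[
\frac{x^1}{\|x^1\|} = \left( \frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}} \right), \quad
\frac{x^2}{\|x^2\|} = \left( -\frac{2}{\sqrt{6}}, \frac{1}{\sqrt{6}}, \frac{1}{\sqrt{6}} \right), \quad
\frac{x^3}{\|x^3\|} = \left( 0, -\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}} \right)
\]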
The orthonormal bases of $\mathbb{R}^n$, in primis the standard one, are the most important among
the bases of $\mathbb{R}^n$ because for them it is easy to determine the coefficients of the linear
combinations that represent the vectors of $\mathbb{R}^n$:
The coefficients $y \cdot x^i$ are called Fourier coefficients in the given basis.
Proof Since $\{x^1, x^2, \dots, x^n\}$ is a basis, there exist $n$ scalars $\alpha_1, \alpha_2, \dots, \alpha_n$ such that
\[
y = \sum_{i=1}^n \alpha_i x^i
\]
\[
x^i \cdot x^j = \begin{cases} 0 & \text{if } i \neq j \\ 1 & \text{if } i = j \end{cases}
\]
For the standard basis $\{e^1, e^2, \dots, e^n\}$, for each $y = (y_1, \dots, y_n) \in \mathbb{R}^n$ we have $y \cdot e^i = y_i$
and in this way we find again (3.4), i.e.,
\[
y = \sum_{i=1}^n y_i e^i
\]
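\[
x^1 = \left( \frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}} \right), \quad
x^2 = \left( -\frac{2}{\sqrt{6}}, \frac{1}{\sqrt{6}}, \frac{1}{\sqrt{6}} \right), \quad
x^3 = \left( 0, -\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}} \right)
\]
Consider, for example, the vector $y = (2, 3, 4)$. Since
\[
x^1 \cdot y = \frac{9}{\sqrt{3}}, \qquad x^2 \cdot y = \frac{3}{\sqrt{6}}, \qquad x^3 \cdot y = \frac{1}{\sqrt{2}}
\]
we have
\[
\begin{aligned}
y &= (x^1 \cdot y)\, x^1 + (x^2 \cdot y)\, x^2 + (x^3 \cdot y)\, x^3 \\
&= \frac{9}{\sqrt{3}} \left( \frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}} \right)
+ \frac{3}{\sqrt{6}} \left( -\frac{2}{\sqrt{6}}, \frac{1}{\sqrt{6}}, \frac{1}{\sqrt{6}} \right)
+ \frac{1}{\sqrt{2}} \left( 0, -\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}} \right)
\end{aligned}
\]
N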
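The example can be replayed numerically. The sketch below normalizes three pairwise orthogonal vectors, computes the Fourier coefficients of $y = (2, 3, 4)$, and rebuilds $y$ from them. The sign pattern of the third vector, $(0, -1, 1)$, is an assumption (the signs are not legible in this copy); the rebuilt vector is the same with the opposite choice $(0, 1, -1)$, because the coefficient changes sign together with the vector.

```python
import math

def dot(x, y):
    """Inner product of two vectors of equal length."""
    return sum(a * b for a, b in zip(x, y))

def normalize(x):
    """Divide a nonzero vector by its norm."""
    n = math.sqrt(dot(x, x))
    return tuple(xi / n for xi in x)

# Pairwise orthogonal vectors; the signs of the third one are an assumption.
basis = [normalize(v) for v in [(1, 1, 1), (-2, 1, 1), (0, -1, 1)]]
y = (2, 3, 4)

coeffs = [dot(b, y) for b in basis]  # Fourier coefficients y . x^i
rebuilt = tuple(sum(c * b[j] for c, b in zip(coeffs, basis)) for j in range(3))

print(coeffs)   # approximately 9/sqrt(3), 3/sqrt(6), 1/sqrt(2)
print(rebuilt)  # approximately (2, 3, 4)
```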
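\[
\left\| \sum_{i=1}^k x^i \right\|^2 = \sum_{i=1}^k \left\| x^i \right\|^2
\]
Proof We proceed by induction. We already know that the assertion holds for $k = 2$. We
suppose that it holds for $k - 1$, i.e.,
\[
\left\| \sum_{i=1}^{k-1} x^i \right\|^2 = \sum_{i=1}^{k-1} \left\| x^i \right\|^2 \tag{4.7}
\]
We show that this implies that it holds for $k$. Observe that, setting $y = \sum_{i=1}^{k-1} x^i$, we have
$y \perp x^k$. Indeed,
\[
y \cdot x^k = \left( \sum_{i=1}^{k-1} x^i \right) \cdot x^k = \sum_{i=1}^{k-1} x^i \cdot x^k = 0
\]
Therefore,
\[
\begin{aligned}
\left\| \sum_{i=1}^{k} x^i \right\|^2 &= \left\| \sum_{i=1}^{k-1} x^i + x^k \right\|^2 = \left\| y + x^k \right\|^2 = \|y\|^2 + \left\| x^k \right\|^2 \\
&= \left\| \sum_{i=1}^{k-1} x^i \right\|^2 + \left\| x^k \right\|^2 = \sum_{i=1}^{k-1} \left\| x^i \right\|^2 + \left\| x^k \right\|^2 = \sum_{i=1}^{k} \left\| x^i \right\|^2
\end{aligned}
\]
as desired.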
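The identity just proved can be checked on a concrete orthogonal set. The sketch below uses integer vectors, so the comparison is exact; the three vectors are an illustrative choice of pairwise orthogonal vectors, and `squared_norm` is an illustrative helper name.

```python
def dot(x, y):
    """Inner product of two vectors of equal length."""
    return sum(a * b for a, b in zip(x, y))

def squared_norm(x):
    return dot(x, x)

vectors = [(1, 1, 1), (-2, 1, 1), (0, -1, 1)]  # pairwise orthogonal
total = tuple(sum(column) for column in zip(*vectors))  # componentwise sum of the vectors

lhs = squared_norm(total)                    # squared norm of the sum
rhs = sum(squared_norm(v) for v in vectors)  # sum of the squared norms
print(total, lhs, rhs)
```

For vectors that are not pairwise orthogonal the two quantities generally differ, which is why orthogonality is assumed in the statement.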
Chapter 5
Topological structure
In this chapter we introduce the fundamental notion of distance between points of Rn and
we study its main properties and the consequences of its presence for Rn .
5.1 Distances
The norm, studied in Section 4.1, allows us to define a distance in $\mathbb{R}^n$. We start with $n = 1$,
when the norm is simply the absolute value $|x|$. Consider two points $x$ and $y$ on the real
line, with $x > y$:
The distance between the two points is $x - y$, which is the length of the segment that joins
them. On the other hand, if we take any two points $x$ and $y$ on the real line, without knowing
their order (i.e., whether $x \le y$ or $x \ge y$), the distance becomes
\[
|x - y|
\]
\[
|x - y| = \begin{cases} x - y & \text{if } x \ge y \\ y - x & \text{if } x < y \end{cases}
\]
and hence the absolute value of the difference provides the distance between the two points
independently of their order. In symbols, we can write
\[
d(x, y) = |x - y| \qquad \forall x, y \in \mathbb{R}
\]
In particular, $d(0, x) = |x|$ and therefore the absolute value, or, equivalently, the norm of a
point $x \in \mathbb{R}$ can be regarded as its distance from the origin.
Let us now consider n = 2. We take two vectors x = (x1 ; x2 ) and y = (y1 ; y2 ) in R2 :
The distance between the two vectors $x$ and $y$ is given by the length of the segment that
joins them (in boldface in the figure). By Pythagoras’ Theorem, this distance is
\[
d(x, y) = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2} \tag{5.1}
\]
since it is the hypotenuse of the right triangle whose catheti are the segments that join $x_i$
and $y_i$ for $i = 1, 2$.
Observe that the distance (5.1) is nothing but the norm of the vector $x - y$ (and also of
$y - x$), i.e.,
\[
d(x, y) = \|x - y\|
\]
The distance between two vectors in $\mathbb{R}^2$ is, therefore, given by the norm of their difference.
It is easy to see, applying again Pythagoras’ Theorem, that the distance between two vectors
$x$ and $y$ in $\mathbb{R}^3$ is given by
\[
d(x, y) = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2 + (x_3 - y_3)^2}
\]
and therefore we have again
\[
d(x, y) = \|x - y\|
\]
At this point we generalize the notion of distance to any $n$.
Definition 113 The (Euclidean) distance $d(x, y)$ between two vectors $x$ and $y$ in $\mathbb{R}^n$ is the
norm of their difference: $d(x, y) = \|x - y\|$.
In particular, $d(x, 0) = \|x\|$: the norm $\|x\|$ of the vector $x \in \mathbb{R}^n$ can be
regarded as its distance from the vector $0$, i.e., as we have already said, as the length of the
segment that represents $x$.
We state the following proposition for distances between vectors of $\mathbb{R}^n$, leaving its simple
proof (it is sufficient to apply the definitions) to the reader.
(i) $d(x, y) \ge 0$;
(ii) $d(x, y) = 0 \iff x = y$;
Properties (i)–(iv) are all natural for a notion of distance. (i) says that a distance is
always a nonnegative quantity, which by (ii) is zero only between vectors that are equal, the
distance between distinct vectors being always strictly positive. (iii) says that distance is
a symmetric notion: in measuring a distance between two vectors, it does not matter from
which of the two vectors we begin the measurement. Finally, (iv) is the so-called triangle
inequality: for example, the distance between Milan, $x$, and Rome, $y$, cannot exceed the sum
of the distances between Milan and any other place $z$ and between that place $z$ and Rome:
detours cannot shorten the distance one needs to cover.
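The four properties of a distance can be observed on sample points. The following sketch implements $d(x, y) = \|x - y\|$ and checks nonnegativity, the zero case, symmetry, and the triangle inequality for three arbitrarily chosen points of $\mathbb{R}^2$; it illustrates the properties on one instance and is not a proof.

```python
import math

def distance(x, y):
    """Euclidean distance d(x, y) = ||x - y|| between vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

x, y, z = (1.0, 2.0), (4.0, 6.0), (0.0, -1.0)

print(distance(x, y))                                     # nonnegative
print(distance(x, x) == 0.0)                              # zero between equal vectors
print(distance(x, y) == distance(y, x))                   # symmetry
print(distance(x, y) <= distance(x, z) + distance(z, y))  # triangle inequality
```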
N
5.2 Neighborhoods
Definition 116 We call (spherical) neighborhood of center $x_0 \in \mathbb{R}^n$ and radius $\varepsilon > 0$, and
denote it by $B_\varepsilon(x_0)$, the set
The neighborhood $B_\varepsilon(x_0)$ is therefore the locus of $\mathbb{R}^n$ whose points lie at distance strictly
smaller than $\varepsilon$ from $x_0$.
In $\mathbb{R}$ such a neighborhood is the open interval $(x_0 - \varepsilon, x_0 + \varepsilon)$, i.e.,
Indeed,
Hence in $\mathbb{R}$ the neighborhoods are intervals. It is easily seen that in $\mathbb{R}^2$ they are discs
(without circumference), in $\mathbb{R}^3$ balls (without surface), etc. Indeed, the points that lie at a
distance less than $\varepsilon$ from $x_0$ form a disc, a ball, etc. “without peel” of center $x_0$.$^{1}$
1
Some textbooks consider as neighbourhood of a point x0 2 R any open interval containing x0 ; in this
textbook, however, we will not do this.
[Figure: a neighborhood of center $x_0$ and radius $\varepsilon$ in the plane]
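Let us give some examples of neighborhoods. For simplicity of notation, we will write
$B_\varepsilon(x_1, \dots, x_n)$ instead of $B_\varepsilon((x_1, \dots, x_n))$.
(ii) We have
\[
B_{\frac{3}{2}}(1) = \left( 1 - \frac{3}{2},\ 1 + \frac{3}{2} \right) = \left( -\frac{1}{2},\ \frac{5}{2} \right)
\]
(iii) The notations $B_{-1}(0)$ and $B_0(1)$ are meaningless because we need $\varepsilon > 0$.
(iv) We have
\[
B_3(0, 0) = B_3(0) = \left\{ x \in \mathbb{R}^2 : d(x, 0) < 3 \right\} = \left\{ x \in \mathbb{R}^2 : \sqrt{x_1^2 + x_2^2} < 3 \right\}
\]
(v) We have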
O.R. A point has infinitely many neighborhoods: one for each value of $\varepsilon > 0$. It is therefore
misleading to talk about the neighborhood of a point as if it were only one. H
For some purposes we will have the opportunity to use, exclusively in $\mathbb{R}$, also “half
neighborhoods” of a point $x_0$; precisely:
Definition 118 The interval $[x_0, x_0 + \varepsilon)$, with $\varepsilon > 0$, is called the right neighborhood of
$x_0 \in \mathbb{R}$ of radius $\varepsilon$. The interval $(x_0 - \varepsilon, x_0]$, with $\varepsilon > 0$, is called the left neighborhood of
$x_0$ of radius $\varepsilon$.
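Membership in a neighborhood is a one-line test on the distance from the center. The sketch below (with the illustrative helper `in_neighborhood`) reproduces items (ii)–(iv): points of $\mathbb{R}$ are passed as one-element tuples, the strict inequality excludes the "peel", and a nonpositive radius is rejected.

```python
import math

def distance(x, y):
    """Euclidean distance between two points given as equal-length tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def in_neighborhood(x, center, eps):
    """True if x lies in B_eps(center); the radius must be strictly positive."""
    if eps <= 0:
        raise ValueError("a neighborhood needs a radius eps > 0")
    return distance(x, center) < eps

print(in_neighborhood((2.4,), (1.0,), 1.5))        # inside B_{3/2}(1) = (-1/2, 5/2)
print(in_neighborhood((2.5,), (1.0,), 1.5))        # the endpoint 5/2 is excluded
print(in_neighborhood((2.0, 2.0), (0.0, 0.0), 3))  # inside B_3(0, 0)
print(in_neighborhood((3.0, 0.0), (0.0, 0.0), 3))  # on the circumference, excluded
```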
With them we can give a useful characterization of the supremum and infimum of a
subset of $\mathbb{R}$, introduced in Section 1.4.2.
(ii) for every $\varepsilon > 0$, there exists $x \in A$ such that $x > a - \varepsilon$.
Proof “Only if”. If $a = \sup A$, (i) is obviously satisfied. Let $\varepsilon > 0$. Since $\sup A > a - \varepsilon$, the
point $a - \varepsilon$ is not an upper bound of $A$. Therefore, there exists $x \in A$ such that $x > a - \varepsilon$.
“If”. Suppose that $a \in \mathbb{R}$ satisfies (i) and (ii). By (i), $a$ is an upper bound of $A$. By (ii),
it is also the least upper bound. Indeed, each $b < a$ can be written as $b = a - \varepsilon$, by setting
$\varepsilon = a - b > 0$. Given $b < a$, by (ii) there exists $x \in A$ such that $x > a - \varepsilon = b$. Therefore, $b$
is not an upper bound of $A$, which implies that there is no upper bound smaller than $a$.
can say that the interior points are the points that belong to $A$ both in the set-theoretical sense
($x \in A$) and in the topological sense (there exists $B_\varepsilon(x) \subseteq A$).
The set of the interior points of $A$ is called the interior of $A$ and it is denoted by $\operatorname{int} A$.
By definition, $\operatorname{int} A \subseteq A$.
Example 121 Let $A = (0, 1)$. Each point of $A$ is interior, that is, $\operatorname{int} A = A$. Let indeed
$x \in (0, 1)$ and consider the smallest among the distances of $x$ from the extreme points 0 and
1, i.e., $\min\{d(0, x), d(1, x)\}$. Take $\varepsilon > 0$ such that
Then
\[
B_\varepsilon(x) = (x - \varepsilon, x + \varepsilon) \subseteq (0, 1)
\]
and therefore $x$ is an interior point of $A$. Since $x$ was any point of $A$, it follows that $\operatorname{int} A = A$.
N
Example 122 Let $A = [0, 1]$. We have $\operatorname{int} A = (0, 1)$. Indeed, by proceeding as above, we
see that the points in $(0, 1)$ are all interior, that is, $(0, 1) \subseteq \operatorname{int} A$. It remains to examine the
extreme points 0 and 1. Consider 0. Each of its neighborhoods has the form $(-\varepsilon, \varepsilon)$, with
$\varepsilon > 0$, and hence it contains also points of $A^c$. It follows that $0 \notin \operatorname{int} A$. In an analogous
way one can show that $1 \notin \operatorname{int} A$. We conclude that $\operatorname{int} A = (0, 1)$.
The set of the exterior points of $A$ coincides with the complement set $A^c = (-\infty, 0) \cup
(1, +\infty)$, and therefore $\operatorname{int} A^c = A^c$, as the reader can easily verify. N
A point $x_0$ is therefore a boundary point for $A$ if each of its neighborhoods contains both
points of $A$ (because it is not exterior) and points of $A^c$ (because it is not interior). The set
of the boundary points of a set $A$ is called the boundary or frontier of $A$ and it is denoted
by $\partial A$. Intuitively, the frontier is the “border” of a set.
Note that the definition of boundary points is residual: a point is a boundary point if it is
not “anything else”. This implies that the classification into interior, exterior, and boundary
points is exhaustive: given a set $A$, each point $x_0 \in \mathbb{R}^n$ necessarily falls into one of
these three categories.
Example 124 (i) Let $A = (0, 1)$. Given the residual nature of the definition of boundary
points, to determine $\partial A$ we have first of all to identify the interior and exterior points. We
have seen that $\operatorname{int} A = (0, 1)$, and also that $A^c = (-\infty, 0] \cup [1, +\infty)$, and hence
The exterior points of $A$ are therefore those of the set $(-\infty, 0) \cup (1, +\infty)$. It follows that
\[
\partial A = \{0, 1\}
\]
i.e., the boundary of $(0, 1)$ consists of the two points 0 and 1. Note that $A \cap \partial A = \emptyset$:
in this example the boundary points do not belong to the set.
(ii) Let $A = [0, 1]$. In Example 122 we have seen that $\operatorname{int} A = (0, 1)$ and that $A^c$ is
the set of the exterior points of $A$. Therefore, $\partial A = \{0, 1\}$. Here we have $\partial A \subseteq A$: the set
contains its own boundary points.
(iii) Let $A = (0, 1]$. The reader can verify that $\operatorname{int} A = (0, 1)$ and that all the points
of $(-\infty, 0) \cup (1, +\infty)$ are exterior. Hence, $\partial A = \{0, 1\}$. In this example, the frontier lies
partly outside and partly inside the set: the boundary point 1 is in $A$, while the boundary
point 0 is not.
(iv) If
\[
A = \left\{ (x_1, x_2) \in \mathbb{R}^2 : x_1^2 + x_2^2 \le 1 \right\} \subseteq \mathbb{R}^2
\]
then all the points such that $x_1^2 + x_2^2 < 1$ are interior, that is,
while all the points such that $x_1^2 + x_2^2 > 1$ are exterior. Therefore,
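The next lemma generalizes what we saw in items (i)–(iii) of the example.
Lemma 125 Let $A \subseteq \mathbb{R}$ be a bounded set. Then $\sup A \in \partial A$ and $\inf A \in \partial A$.
Proof We prove that $\alpha = \sup A \in \partial A$ (the proof for the infimum is analogous). Consider
an arbitrary neighborhood of $\alpha$, $(\alpha - \varepsilon, \alpha + \varepsilon)$. We have $(\alpha, \alpha + \varepsilon) \subseteq A^c$, and therefore
$(\alpha - \varepsilon, \alpha + \varepsilon) \cap A^c \neq \emptyset$. Moreover, thanks to Proposition 119 for every $\varepsilon > 0$ there exists
$x_0 \in A$ such that $x_0 > \alpha - \varepsilon$, so that $(\alpha - \varepsilon, \alpha] \cap A \neq \emptyset$, and hence $(\alpha - \varepsilon, \alpha + \varepsilon) \cap A \neq \emptyset$.
Therefore, for every $\varepsilon > 0$ we have $(\alpha - \varepsilon, \alpha + \varepsilon) \cap A \neq \emptyset$ and $(\alpha - \varepsilon, \alpha + \varepsilon) \cap A^c \neq \emptyset$, that
is, $\alpha \in \partial A$.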
Hence, a point $x_0 \in A$ is isolated if there exists a neighborhood $B_\varepsilon(x_0)$ such that $A \cap
B_\varepsilon(x_0) = \{x_0\}$. As the terminology suggests, the isolated points are points of the set
“separated” from the rest of the set.
Example 127 Let $A = [0, 1] \cup \{2\}$. It consists of the closed unit interval and, in addition,
the point 2. The latter is isolated. Indeed, if $B_\varepsilon(2)$ is a neighborhood of 2 with $\varepsilon < 1$, then
$A \cap B_\varepsilon(2) = \{2\}$. N
As anticipated, we have
Hence, $x_0$ is a limit point of $A$ if, for every $\varepsilon > 0$, there exists some $x \in A$ such that$^{2}$
$0 < \|x_0 - x\| < \varepsilon$. The set of limit points of $A$ is denoted by $A'$ and it is called the derived set
of $A$. Note that it is not required that the limit point $x_0$ belongs to $A$.
N.B. Definition 129 can be equivalently expressed by saying that $x_0 \in \mathbb{R}^n$ is a limit point for $A$ if,
for every $\varepsilon > 0$, the neighborhood $B_\varepsilon(x_0)$ of $x_0$ is such that $(B_\varepsilon(x_0) \setminus \{x_0\}) \cap A \neq \emptyset$.
O
First of all, let us state the relations of the limit points with the classification just seen.
Obviously, limit points are never exterior. Moreover:
Proof (i) If $x_0 \in \operatorname{int} A$, there exists a neighborhood $B_{\varepsilon'}(x_0)$ of $x_0$ such that $B_{\varepsilon'}(x_0) \subseteq A$.
Let $B_\varepsilon(x_0)$ be any neighborhood of $x_0$. The intersection
is in turn a neighborhood of $x_0$ of radius $\min\{\varepsilon', \varepsilon\} > 0$. Hence $B_{\min\{\varepsilon', \varepsilon\}}(x_0) \subseteq A$ and,
in order to complete the proof, it is sufficient to consider any $x \in B_{\min\{\varepsilon', \varepsilon\}}(x_0)$ such that
$x \neq x_0$. Indeed, $x$ belongs also to the neighborhood $B_\varepsilon(x_0)$ and it is distinct from $x_0$.
(ii) “If”. Consider a point $x_0$ that is a boundary point, but not an isolated point. By the
definition of boundary points, for every $\varepsilon > 0$ we have $B_\varepsilon(x_0) \cap A \neq \emptyset$. By the definition
of non-isolated points, for every $\varepsilon > 0$ we have $B_\varepsilon(x_0) \cap A \neq \{x_0\}$. This implies that for
every $\varepsilon > 0$ we have $(B_\varepsilon(x_0) \setminus \{x_0\}) \cap A \neq \emptyset$, i.e., that $x_0$ is a limit point of $A$. “Only if”.
Take a point $x_0$ that is both a boundary point and a limit point, i.e., $x_0 \in \partial A \cap A'$. Each
neighborhood $B_\varepsilon(x_0)$ contains at least a point $x \in A$ distinct from $x_0$, that is, $B_\varepsilon(x_0) \cap A \neq
\{x_0\}$. It follows that $x_0$ is not isolated.
In the light of this result, the set $A'$ of the limit points consists of the interior points of
$A$ and the non-isolated boundary points of $A$. Therefore, a point of $A$ is either a limit point or
an isolated point, tertium non datur.
Example 131 (i) If $A = [0, 1) \subseteq \mathbb{R}$, all the points of the interval $[0, 1]$ and only them are
limit points, that is, $A' = [0, 1]$. Note how 1 is a limit point although it does not belong to
$A$.
(ii) If $A = \left\{ (x_1, x_2) \in \mathbb{R}^2 : x_1^2 + x_2^2 \le 1 \right\}$, all the points of $A$ are limit points, that is,
$A = A'$. N
In the definition of limit point it is required that each of its neighborhoods contains at
least one point of $A$ other than itself. Actually, as we show now, it necessarily contains
infinitely many of them.
5.4. OPEN AND CLOSED SETS 95
Proposition 133 Each neighborhood of a limit point of $A$ contains infinitely many points
of $A$.
Proof Let $x$ be a limit point of $A$. Suppose, by contradiction, that there exists a neighborhood
$B_\varepsilon(x)$ of $x$ containing a finite number of points $\{x_1, \dots, x_n\}$ of $A$, except, at most, $x$
itself. Since $\{x_1, \dots, x_n\}$ is a finite set, the minimum distance
\[
\min_{i = 1, \dots, n} d(x, x_i)
\]
exists and it is strictly positive, i.e., $\min_{i=1,\dots,n} d(x, x_i) > 0$. Let $\delta > 0$ be such that
$\delta < \min_{i=1,\dots,n} d(x, x_i)$. It is evident that $0 < \delta < \varepsilon$, since $\delta < \min_{i=1,\dots,n} d(x, x_i) < \varepsilon$.
Hence $B_\delta(x) \subseteq B_\varepsilon(x)$. It is also evident, by construction, that for each $i = 1, 2, \dots, n$ we have
$x_i \notin B_\delta(x)$. So, if $x \in A$, we have $B_\delta(x) \cap A = \{x\}$; if instead $x \notin A$, we have $B_\delta(x) \cap A = \emptyset$.
Independently of whether $x$ belongs to $A$ or not, we have
\[
B_\delta(x) \cap A \subseteq \{x\}
\]
Therefore, the unique point of $A$ that $B_\delta(x)$ can contain is, at most, $x$ itself. But this
contradicts the hypothesis that $x$ is a limit point of $A$.
O.R. The concept of interior point of a set $A$ requires the existence of a neighborhood of
the point that is entirely formed by points of $A$. This means that it is possible to move away
(at least a bit) from the point by following any path that starts from it and remain inside
$A$ (i.e., it is possible to go for a “little walk” in any direction without showing a passport):
looking at the path in the opposite direction, we can say that it is possible to approach the
point by coming from any direction and by remaining within $A$.
The concept of limit point of a set $A$, which does not require that the point belongs to $A$,
requires instead that we can get as close as we want to the point by “jumping” on points of
the set (i.e., that, as when we cross a river jumping on surfacing stones, we can get as close
as we want to our target on “stones” that all belong to the set). This idea of approaching a
certain point by remaining within a given set will be crucial for the definition of the limit of
a function. H
Definition 134 A subset $A \subseteq \mathbb{R}^n$ is called open if all its points are interior, i.e., if $\operatorname{int} A =
A$.
3 With the caveat of Example 124-(v).
Example 135 The open intervals $(a, b)$ are open (whence the name). Indeed, let $x \in (a, b)$ be any point of $(a, b)$. We show that it is interior. Let
$$0 < \varepsilon < \min\{x - a, \; b - x\}$$
We have $B_\varepsilon(x) \subseteq (a, b)$ and therefore $x$ is an interior point of $(a, b)$. It follows that $(a, b)$ is open. N
Example 136 The set $A = B_1(0, 0) \setminus \{(0, 0)\} = \{x \in \mathbb{R}^2 : 0 < x_1^2 + x_2^2 < 1\}$ is open. Graphically, it is the disc without both the “peel” and the origin, that is,
[Figure: the set $A$ in the $(x_1, x_2)$ plane.]
Given that the neighborhoods in $\mathbb{R}$ are of the type $(a, b)$, they are all open. The next result shows that the property of the neighborhoods of being open holds in general in $\mathbb{R}^n$.
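Proof Let $B_\varepsilon(x_0)$ be a neighborhood of a point $x_0 \in \mathbb{R}^n$. To show that $B_\varepsilon(x_0)$ is open, we have to show that each of its points is interior. Let $x \in B_\varepsilon(x_0)$. To prove that $x$ is interior to $B_\varepsilon(x_0)$, let
$$0 < \varepsilon' < \varepsilon - d(x, x_0) \qquad (5.2)$$
Then $B_{\varepsilon'}(x) \subseteq B_\varepsilon(x_0)$. Indeed, let $y \in B_{\varepsilon'}(x)$. Then, by the triangle inequality,
$$d(y, x_0) \le d(y, x) + d(x, x_0) < \varepsilon' + d(x, x_0) < \varepsilon$$
where the last inequality follows from (5.2). Therefore, $B_{\varepsilon'}(x) \subseteq B_\varepsilon(x_0)$, which completes the proof.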
Clearly, $A \subseteq \bar{A}$. The closure of $A$ is, thus, an “enlargement” of $A$ that includes all its boundary points, that is, the borders. Naturally, the notion of closure is relevant when the borders are not already part of $A$.
Definition 141 A subset $A$ of $\mathbb{R}^n$ is called closed if it contains all its boundary points, that is, if $A = \bar{A}$.
Example 142 The set $A = [0, 1) \subseteq \mathbb{R}$ is not closed since $A \neq \bar{A}$, while the set $A = \{(x_1, x_2) \in \mathbb{R}^2 : x_1^2 + x_2^2 \le 1\}$ is closed since $A = \bar{A}$. N
Example 143 The closed intervals $[a, b] \subseteq \mathbb{R}$ are closed (whence the name). The unbounded intervals $(a, \infty)$ and $(-\infty, a)$ are open. The unbounded intervals $[a, \infty)$ and $(-\infty, a]$ are closed. N
The notions of open and closed sets are dual, as the next basic result shows.4
4
In many textbooks a closed set is defined as one whose complement is open, and the property that each closed set contains its boundary is then proved as a theorem. In other words, the definition and the theorem are switched with respect to the formulation we have chosen.
Proof “Only if”. Let $A$ be open. We show that $A^c$ is closed. Let $x$ be an arbitrary boundary point of $A^c$, that is, $x \in \partial A^c$. By definition, $x$ is not interior either for $A$ or for $A^c$. Hence, $x \notin \operatorname{int} A$. But $A = \operatorname{int} A$, since $A$ is open. Therefore $x \notin A$, that is, $x \in A^c$. It follows that $\partial A^c \subseteq A^c$, since $x$ was an arbitrary point of $\partial A^c$. Therefore, $\overline{A^c} = A^c$, which proves that $A^c$ is closed.
“If”. Let $A^c$ be closed. We show that $A$ is open. Let $x$ be any point of $A$. Since $x \notin A^c = \overline{A^c}$, $x$ is not a boundary point for $A^c$ and it is therefore interior for $A$ or interior for $A^c$. But, since $x \notin A^c$ implies $x \notin \operatorname{int} A^c$, we conclude that $x \in \operatorname{int} A$. Hence the point $x$ is interior, which implies that $A$ is open.
Example 146 The finite sets $A = \{x_1, x_2, \dots, x_n\}$ of $\mathbb{R}^n$ (in particular, the singletons) are closed. To verify it, observe that the complement $A^c$ is open. Indeed, let $x \in A^c$ and $\varepsilon > 0$ be such that
$$\varepsilon < d(x, x_i) \qquad \forall i = 1, \dots, n$$
We have $B_\varepsilon(x) \subseteq A^c$ and hence $x$ is an interior point. It follows that $A^c$ is open. We leave the reader to verify that $\operatorname{int} A = \emptyset$ and $\partial A = A$. N
[Figure: plot in the $(x_1, x_2)$ plane of the set]
$$\{(2, 1)\} \cup \{(x_1, x_2) \in \mathbb{R}^2 : x_2 = x_1^2\} \cup \{(x_1, x_2) \in \mathbb{R}^2 : (x_1 + 1)^2 + (x_2 + 1)^2 \le 1/4\}$$
of $\mathbb{R}^2$. N
Open and closed sets are therefore two sides of the same coin: stating that a set is closed/open is equivalent to stating that its complement is open/closed. Naturally, there are many sets that satisfy neither of these properties. We now see a very simple example of this.
5.4. OPEN AND CLOSED SETS 99
Example 148 The set $A = [0, 1) \subseteq \mathbb{R}$ is neither open nor closed. Indeed, $\operatorname{int} A = (0, 1) \neq A$ and $\bar{A} = [0, 1] \neq A$. N
There is a case in which the duality of open and closed sets assumes a curious appearance.
Example 149 The empty set $\emptyset$ and the entire $\mathbb{R}^n$ are simultaneously open and closed. By Theorem 145, it is sufficient to show that $\mathbb{R}^n$ is both open and closed. But this is obvious. Indeed, $\mathbb{R}^n$ is open since, trivially, each of its points is interior, and it is closed because $\mathbb{R}^n$ necessarily coincides with its own closure. It is possible to show that $\emptyset$ and $\mathbb{R}^n$ are the only sets with such a double personality. N
We go back to the notion of closure $\bar{A}$. The next result shows how it can equivalently be seen as the addition to the set $A$ of its limit points $A'$. In other terms, adding the borders is equivalent to adding the limit points.
From the equivalence just shown it follows, as a corollary, that a set is closed when it contains all its limit points. It is a remarkable equivalence.
Corollary 151 A subset A of Rn is closed if and only if it contains all its limit points.
Example 152 The inclusion $A' \subseteq A$ in Corollary 151 can be strict, in which case the set $A \setminus A'$ consists of the isolated points of $A$. For example, let $A = [0, 1] \cup \{-1, 4\}$. Then $A$ is closed and $A' = [0, 1]$. Hence $A'$ is strictly included in $A$ and the set $A \setminus A' = \{-1, 4\}$ consists of the isolated points of $A$. N
$$\operatorname{int} A \subseteq A \subseteq \bar{A} \qquad (5.4)$$
The set of interior points $\operatorname{int} A$ is therefore the largest open set that approximates $A$ “from inside”, while the closure $\bar{A}$ is the smallest closed set that approximates $A$ “from outside”. The relation (5.4) is therefore the best topological sandwich, with lower open slice and upper closed slice, that we can have for the set $A$.5
It is now easy to prove an interesting and intuitive property of the boundary of a set.
Proof Let $A$ be any set in $\mathbb{R}^n$. Since the exterior points to $A$ are interior to its complement, we have
$$(\partial A)^c = \operatorname{int} A \cup \operatorname{int} A^c$$
and hence $\partial A$ is closed because $\operatorname{int} A$ and $\operatorname{int} A^c$ are open and, as we will see in Theorem 156, a union of open sets is open.
The next result, whose proof is left to the reader, shows that the difference between the closure and the interior of a set is given by its boundary points.
The result makes precise the intuition that open sets are sets without borders. Indeed, Proposition 155 implies that $A$ is open if and only if $\partial A \cap A = \emptyset$. On the other hand, by definition, a set is closed if and only if $\partial A \subseteq A$, that is, when it includes the borders.
The same is true for intersections of a finite number of neighborhoods. It is, however, no longer true for intersections of infinitely many neighborhoods: for example,
$$\bigcap_{n=1}^{\infty} B_{1/n}(x_0) = \{x_0\} \qquad (5.5)$$
i.e., this intersection reduces to the singleton $\{x_0\}$, which is closed, as observed in Example 146. Therefore, the intersection of infinitely many neighborhoods might well not be open.
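The shrinking-neighborhoods example can be explored numerically. The following Python sketch is only an illustration: the helper names `dist`, `in_ball` and `first_excluding_index` are hypothetical, and the code checks finitely many radii, so it suggests the result rather than proving it.

```python
import math

def dist(x, y):
    """Euclidean distance between two points of R^n given as tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def in_ball(y, center, radius):
    """True if y belongs to the open neighborhood of the given center and radius."""
    return dist(y, center) < radius

def first_excluding_index(y, x0):
    """Smallest n >= 1 such that y is not in the neighborhood of x0 of radius 1/n.

    Requires y != x0, because x0 belongs to every such neighborhood.
    """
    if dist(x0, y) == 0:
        raise ValueError("x0 belongs to every neighborhood of radius 1/n")
    n = 1
    while in_ball(y, x0, 1.0 / n):
        n += 1
    return n

x0 = (0.0, 0.0)
y = (0.0, 0.25)                      # a point at distance 0.25 from x0
n = first_excluding_index(y, x0)     # from this index on, y is left out
```

However close `y` is chosen to `x0`, the loop stops after finitely many steps, which is why no point other than `x0` survives in the intersection.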
5
Clearly, there are also sandwiches with a lower closed slice and an upper open slice, as the reader will see
in more advanced courses.
5.5. SET-THEORETICAL STABILITY 101
To check (5.5), note that a point belongs to the intersection $\bigcap_{n=1}^{\infty} B_{1/n}(x_0)$ if and only if it belongs to each neighborhood $B_{1/n}(x_0)$. This is certainly true for $x_0$, and therefore $x_0 \in \bigcap_{n=1}^{\infty} B_{1/n}(x_0)$. We show that it is the unique point that satisfies this property. Suppose, by contradiction, that $y \neq x_0$ is such that $y \in \bigcap_{n=1}^{\infty} B_{1/n}(x_0)$. Since $y \neq x_0$, we have $d(x_0, y) > 0$. If we take $n$ sufficiently large, in particular
$$n > \frac{1}{d(x_0, y)}$$
then $1/n < d(x_0, y)$, so that $y \notin B_{1/n}(x_0)$, a contradiction.
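The union of two neighborhoods $B_{\varepsilon_1}(x_0)$ and $B_{\varepsilon_2}(x_0)$ of the same point is nothing but the larger of the two. More generally, in the case of infinitely many neighborhoods $B_{\varepsilon_i}(x_0)$, if $\sup_i \varepsilon_i < +\infty$ we set $\varepsilon = \sup_i \varepsilon_i$, so that
$$\bigcup_{i=1}^{\infty} B_{\varepsilon_i}(x_0) = B_\varepsilon(x_0)$$
For example,
$$\bigcup_{n=1}^{\infty} B_{1/n}(x_0) = B_1(x_0)$$
If, instead, $\sup_i \varepsilon_i = +\infty$, the union is the whole space. For example,
$$\bigcup_{n=1}^{\infty} B_n(x_0) = \mathbb{R}^n$$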
Theorem 156 (i) The intersection of any finite family of open sets is open. (ii) The union of any family (finite or not) of open sets is open.
Proof (i) Let $A = \bigcap_{i=1}^{n} A_i$, with all $A_i$ open sets. Each $x \in A$ belongs to all the $A_i$ and it is interior to all of them (because they are open), i.e., there exist neighborhoods of $x$, $B_{\varepsilon_i}(x) \subseteq A_i$. We call $B$ their intersection, $B = \bigcap_{i=1}^{n} B_{\varepsilon_i}(x)$: it is still a neighborhood of $x$ (with radius $\varepsilon = \min\{\varepsilon_1, \dots, \varepsilon_n\}$) and, even more so, $B \subseteq A_i$ for each $i = 1, 2, \dots, n$. But then $B \subseteq A$ and it is a neighborhood of $x$ all contained in $A$. Therefore, $A$ is open.
(ii) Let $A = \bigcup_{\alpha} A_\alpha$, where $\alpha$ runs over a finite or infinite set. Each $x \in A$ belongs to at least one among the $A_\alpha$'s, call it $A_{\bar{\alpha}}$. Since all the $A_\alpha$'s are open, there exists a neighborhood of $x$ contained in $A_{\bar{\alpha}}$ and hence, even more so, in $A$. Therefore, $x$ is interior to $A$ and, given the arbitrariness of $x$, $A$ is open.
By Theorem 145 and the De Morgan laws, it is easy to prove that dual properties hold for closed sets: closedness is preserved by all intersections, but only by finite unions.
Corollary 157 The union of any finite family of closed sets is closed. The intersection of any family (finite or not) of closed sets is closed.
$$|x| < K \qquad \forall x \in A$$
The next definition is the natural extension of this idea to $\mathbb{R}^n$, where the absolute value is replaced by the more general notion of norm.
$$\|x\| < K \qquad \forall x \in A$$
By recalling that $\|x\|$ is the distance of $x$ from the origin, it is easily seen that a set $A$ is bounded if there exists $K > 0$ such that, for every $x \in A$, we have $d(x, 0) < K$. In other words, $A$ is bounded if there exists a neighborhood $B_K(0)$ of the origin that contains it, i.e., $A \subseteq B_K(0)$. It is immediate to see that all the neighborhoods $B_\varepsilon(x)$ are bounded sets, as are their closures (5.3): by the triangle inequality, it is sufficient to take any $K > \|x\| + \varepsilon$. On the contrary, the interval $(a, \infty)$ is a simple example of an unbounded set (for this reason it is called an unbounded interval).
Using boundedness, we can de…ne a class of closed sets that turns out to be very important
for applications.
For example, all the closed and bounded intervals in $\mathbb{R}$ are compact.6 More generally, the closure $\overline{B_\varepsilon(x_0)}$ of any neighborhood $B_\varepsilon(x_0)$ in $\mathbb{R}^n$ is compact. For example, the set
$$\overline{B_1(0)} = \{x \in \mathbb{R}^n : \|x\| \le 1\}$$
is compact in $\mathbb{R}^n$. This classical set of $\mathbb{R}^n$ is called the closed unit ball. The reason for this terminology is evident in the special case $n = 2$:
[Figure: the closed unit ball in the $(x_1, x_2)$ plane.]
Like closedness, compactness is stable under finite unions and arbitrary intersections, as we leave the reader to prove.7
Example 160 The finite sets $A = \{x_1, x_2, \dots, x_n\}$, and in particular the singletons, are compact sets. Indeed, in Example 146 we showed that they are closed sets. Since they are obviously bounded, they are compact. N
Example 161 Budget sets are a fundamental example of compact sets in consumer theory,
as Proposition 670 will show. N
Theorem 162 A set $C$ in $\mathbb{R}^n$ is closed if and only if it contains the limit of every convergent sequence of its points. That is, $C$ is closed if and only if
$$\{x_n\} \subseteq C, \; x_n \to x \implies x \in C \qquad (5.6)$$
6
The empty set ;, however trivial, is considered a compact set.
7
With regard to this, the reader can observe that, since the empty set is compact, the intersection of two
disjoint compact sets is the empty (compact) set.
8
Therefore, the section can be skipped at first and read only after having studied sequences.
Proof “Only if”. Let $C$ be closed. Let $\{x_n\} \subseteq C$ be a sequence such that $x_n \to x$. We want to show that $x \in C$. Suppose, by contradiction, that $x \notin C$. Since $x_n \to x$, for every $\varepsilon > 0$ there exists $n_\varepsilon \ge 1$ such that $x_n \in B_\varepsilon(x)$ for every $n \ge n_\varepsilon$. Since $x \notin C$, these points $x_n$ of $C$ are all different from $x$. Therefore, $x$ is a limit point for $C$, which contradicts $x \notin C$ because $C$ is closed and so contains all its limit points.
“If”. Let $C$ be a set for which property (5.6) holds. By contradiction, let $C$ be non-closed. Then there exists at least one boundary point $x$ of $C$ that does not belong to $C$. As it cannot be isolated (otherwise it would belong to $C$), by Lemma 130 $x$ is a limit point for $C$. Each neighborhood $B_{1/n}(x)$ does contain a point of $C$: call it $x_n$. The sequence of such $x_n$'s converges to $x \notin C$, contradicting (5.6). Hence, $C$ is closed.
This property is very important: a set is closed if and only if “it is closed with respect
to the limit operation”, that is, if by taking limits of sequences we never leave the set. The
property is natural in economics: a set is closed if (and only if), whenever it is possible to
get arbitrarily close to a point by still staying in the set, the point must belong to the set.
In a concrete problem it would be very strange if, with points of the set, one could get
arbitrarily close to a point x without being able to reach it: it would be like licking the window
of a confectioner without being able to reach the pastries (very close, yet unreachable). For
this reason the sets that appear in economic models are almost always closed.
Example 163 Consider the closed interval $C = [a, b]$. We show that it is closed using Theorem 162. Let $\{x_n\} \subseteq C$ be such that $x_n \to x \in \mathbb{R}$. Thanks to Theorem 162, to show that $C$ is closed, it is sufficient to show that $x \in C$. Since $a \le x_n \le b$, a simple application of the comparison criterion shows that $a \le x \le b$, that is, $x \in C$. N
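The sequential characterization of closed sets can be illustrated with a small Python sketch. It uses the set $[0, 1)$ of Example 148 and the sequence $x_n = 1 - 1/n$, whose limit 1 is known analytically; the function names are hypothetical, and a finite sample of the sequence is of course only suggestive.

```python
def in_half_open(x):
    """Membership test for A = [0, 1)."""
    return 0 <= x < 1

def in_closed(x):
    """Membership test for C = [0, 1]."""
    return 0 <= x <= 1

# First terms of the sequence x_n = 1 - 1/n, which converges to 1.
sequence = [1.0 - 1.0 / n for n in range(1, 10001)]
limit = 1.0

all_terms_in_A = all(in_half_open(x) for x in sequence)
all_terms_in_C = all(in_closed(x) for x in sequence)
limit_in_A = in_half_open(limit)   # False: the limit escapes from [0, 1)
limit_in_C = in_closed(limit)      # True: the limit stays in [0, 1]
```

All the sampled terms lie in both sets, but the limit belongs only to $[0, 1]$: taking limits can lead out of $[0, 1)$, which is therefore not closed.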
Functions
In other words, if the greengrocer buys 10 kg of walnuts he will pay 4 euros per kg, if he buys 20 kg he will pay 3.9 euros per kg, and so on (we are assuming that the larger the quantity purchased, the lower the unit price).
The table is an example of a supply function, which associates to each quantity the corresponding purchase price: $A = \{10, 20, 30, 40\}$ is the set of the quantities and $B = \{4, 3.9, 3.8, 3.7\}$ is the set of their unit prices; the supply function is a rule that associates to each element of the set $A$ an element of the set $B$.
In general, we have
Definition 165 Given any two sets $A$ and $B$, a function defined on $A$ and with values in $B$, denoted by $f : A \to B$, is a rule that associates to each element of the set $A$ one, and only one, element of the set $B$.
$$b = f(a)$$
106 CHAPTER 6. FUNCTIONS
The rule can be completely arbitrary; what matters is only that it associates to each element $a$ of $A$ only one element $b$ of $B$.1
The arbitrariness of the rule is the crucial aspect of the notion of function. It is one of the fundamental ideas of mathematics, to which mathematicians arrived relatively recently: the notion of function considered above was introduced in 1829 by Dirichlet after about 150 years of discussions (the first ideas regarding this notion go back to Leibniz at the end of the seventeenth century).
Note that it is perfectly legitimate that the same element of $B$ is associated to two (or more) different elements of $A$, that is,
[Figure: Legitimate]
On the contrary, it cannot happen that several elements of $B$ are associated to the same element of $A$, i.e.,
[Figure: Illegitimate]
1
We have written in italics the most important words: the rule must hold for each element of $A$ and, to each one, it must associate only one element of $B$.
6.1. THE CONCEPT 107
Before considering some examples, we introduce a bit of terminology. The two variables,
a and b, are traditionally called the independent variable and the dependent variable, re-
spectively. Moreover, the set A is called the domain of the function, while the set B is its
codomain.
The codomain is the set in which the function takes its values, but it does not necessarily contain only such values: it can also be larger. Concerning this aspect, the next notion is important: given $a \in A$, the element $f(a) \in B$ is called the image of $a$. Given any subset $C$ of the domain $A$, the set
$$f(C) = \{f(x) : x \in C\} \subseteq B \qquad (6.1)$$
of the images of the points in $C$ is called the image of $C$. In particular, the set $f(A)$ of all the images of points of the domain is called the image (set) of the function $f$. It is denoted by $\operatorname{Im} f$ and it is therefore the subset of the codomain constituted by the elements that are the image of some element of the domain:
$$\operatorname{Im} f = f(A) = \{f(x) : x \in A\} \subseteq B$$
Note that each set that contains $\operatorname{Im} f$ is, indeed, a possible codomain for the function: if $\operatorname{Im} f \subseteq B$ and $\operatorname{Im} f \subseteq C$, then writing both $f : A \to B$ and $f : A \to C$ is fine. The choice of codomain is a mere question of convenience. For example, in this book we will often consider functions taking real values, that is, $f(x) \in \mathbb{R}$ for each $x$ in the domain of $f$; in this case, the natural choice for the codomain is the entire real line and we will usually write $f : A \to \mathbb{R}$.
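For a finite domain, the notions of function, image and codomain can be made concrete in a few lines of Python. This is only a sketch: the helper names `is_function` and `image` are hypothetical, the last two rows of the price table are assumed from the sets of quantities and prices (the text spells out only the first two), and the larger codomain `B` is made up for illustration.

```python
from collections import Counter

# Greengrocer table: quantity in kg -> unit price in euros per kg.
# Only the rows for 10 kg and 20 kg are spelled out in the text; the others are assumed.
price = {10: 4.0, 20: 3.9, 30: 3.8, 40: 3.7}

def is_function(pairs, domain):
    """True if the pairs (a, b) assign exactly one b to each element a of the domain."""
    counts = Counter(a for a, _ in pairs)
    return set(counts) == set(domain) and all(c == 1 for c in counts.values())

def image(f, subset):
    """Image f(C) = {f(x) : x in C} of a subset C of the domain of the dict f."""
    return {f[x] for x in subset}

A = set(price)                        # domain
B = {3.5, 3.6, 3.7, 3.8, 3.9, 4.0}    # hypothetical codomain, larger than the image
im_f = image(price, A)                # Im f = f(A)

# A rule that gives two different prices for 10 kg is not a function.
illegitimate = {(10, 4.0), (10, 3.9), (20, 3.9)}
```

Here `im_f` is a proper subset of `B`, which mirrors the remark that any set containing the image can serve as codomain.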
Example 166 Let $A$ be the set of all countries on Earth and $B$ a set containing some colors (at least four). The function $f : A \to B$ associates to each country the color given to it on a geographic map: $\operatorname{Im} f$ is the set of the colors actually used at least once. N
Example 167 The rule that associates to each living human being his date of birth is a function $f : A \to B$, where $A$ is the set of the human beings and, for example, $B$ is the set of the dates of the last 150 years (a codomain sufficiently large to contain all the possible birth dates). N
Example 168 Consider the rule that associates to each positive real number $x$ both the positive square root and the negative square root (the so-called algebraic root), that is, $\{-\sqrt{x}, \sqrt{x}\}$. For example, it associates to 4 the elements $\{-2, 2\}$. This rule does not describe a function $f : \mathbb{R}_+ \to \mathbb{R}$ since to each element of the domain different from 0 two different elements of the codomain are associated. N
Example 169 Let $f : \mathbb{R} \to \mathbb{R}$ be defined by $f(x) = x^3$, for which the rule is to associate to each real number its cube; each real number has a unique cube, and hence the rule defines a function. Graphically:
[Figure: graph of the function $x^3$.]
Example 170 Let $f : \mathbb{R} \to \mathbb{R}$ be defined by $f(x) = x^2$, for which the rule is to associate to each real number its square; each real number has a unique square that is certainly $\ge 0$ and hence this rule, too, defines a function with $\operatorname{Im} f = f(\mathbb{R}) = \mathbb{R}_+$. Graphically:
[Figure: graph of the function $x^2$.]
Observe how in this case the same element can correspond to two different elements of the domain: for example, $f(1) = f(-1) = 1$. N
Example 171 (i) Let $f : \mathbb{R}_+ \to \mathbb{R}$ be the function defined by $f(x) = \sqrt{x}$, which associates to each positive real number its (arithmetic) square root. The domain is the positive half-line and $\operatorname{Im} f = \mathbb{R}_+$. Graphically:
[Figure: graph of the function $\sqrt{x}$.]
(ii) The function $f : \mathbb{R}_{++} \to \mathbb{R}$ defined by $f(x) = \log_a x$, with $a > 0$ and $a \neq 1$, which associates to each strictly positive real number its logarithm, has as domain $\mathbb{R}_{++}$. Moreover, $\operatorname{Im} f = \mathbb{R}$. Graphically:
[Figure: graph of the function $\log x$.]
Example 172 (i) Let $f : \mathbb{R} \to \mathbb{R}$ be defined by $f(x) = |x|$ for every $x \in \mathbb{R}$. It is called the absolute value function of $x$ (or modulus function of $x$). For this function with domain $\mathbb{R}$ we have $\operatorname{Im} f = \mathbb{R}_+$. Graphically:
[Figure: graph of the function $|x|$.]
[Figure: graph of the function $1/|x|$.]
Here the domain is $A = \mathbb{R} \setminus \{0\}$, the real line without the origin, while $\operatorname{Im} f = \mathbb{R}_{++}$. N
$$f(x_1, x_2) = x_1 + x_2$$
It associates to each pair $x = (x_1, x_2) \in \mathbb{R}^2$ the sum of its components; for every $x \in \mathbb{R}^2$ such sum is unique, and therefore the rule defines a function with $\operatorname{Im} f = f(\mathbb{R}^2) = \mathbb{R}$.
(ii) The function $f : \mathbb{R}^n \to \mathbb{R}$ defined by
$$f(x_1, x_2, \dots, x_n) = \sum_{i=1}^{n} x_i$$
generalizes to $\mathbb{R}^n$ the function of two variables $f(x_1, x_2) = x_1 + x_2$ (which is the special case $n = 2$). N
It associates to each $x = (x_1, x_2) \in \mathbb{R}^2_+$ the square root of the product of the components; for each $x \in \mathbb{R}^2_+$ this root is unique and, therefore, the rule defines a function with $\operatorname{Im} f = \mathbb{R}_+$.
(ii) The function $f : \mathbb{R}^n_+ \to \mathbb{R}$ defined by
$$f(x_1, x_2, \dots, x_n) = \prod_{i=1}^{n} x_i^{\alpha_i}$$
with the exponents $\alpha_i > 0$ of unit sum, that is, $\sum_{i=1}^{n} \alpha_i = 1$, generalizes to $\mathbb{R}^n$ the function of two variables $f(x_1, x_2) = \sqrt{x_1 x_2}$ (which is the special case with $n = 2$ and $\alpha_1 = \alpha_2 = 1/2$). This function is widely used in economics with the name of Cobb-Douglas function. N
4
To be consistent with the notation adopted for vectors, we should write $f((x_1, x_2))$; but, to ease notation, throughout the book we write $f(x_1, x_2)$.
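As a concrete illustration, a Cobb-Douglas function with positive exponents of unit sum can be evaluated with a short Python sketch. The function name `cobb_douglas` and the argument name `alpha` for the exponents are hypothetical choices made here, not notation taken from the text.

```python
import math

def cobb_douglas(x, alpha):
    """Product of x_i ** alpha_i, with exponents alpha_i > 0 that sum to 1."""
    if len(x) != len(alpha):
        raise ValueError("x and alpha must have the same length")
    if any(a <= 0 for a in alpha) or not math.isclose(sum(alpha), 1.0):
        raise ValueError("exponents must be strictly positive and sum to 1")
    if any(xi < 0 for xi in x):
        raise ValueError("the function is defined on non-negative vectors")
    return math.prod(xi ** ai for xi, ai in zip(x, alpha))

# Special case n = 2 with both exponents equal to 1/2: the square root of the product.
value = cobb_douglas((4.0, 9.0), (0.5, 0.5))

# A single zero component makes the whole product vanish.
value_with_zero = cobb_douglas((0.0, 5.0), (0.5, 0.5))
```

With equal exponents the value is the geometric mean of the components, so `value` coincides with the square root of $4 \cdot 9$.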
For example, if $(x_1, x_2) = (2, 5)$, then $f(x_1, x_2) = (2, 2 \cdot 5) = (2, 10) \in \mathbb{R}^2$.
b=f(a)
real numbers into vectors of Rm , and for operators it transforms vectors of Rn into vectors
of Rm .
Observe that the names of the variables are altogether irrelevant: we can indifferently write $a = f(b)$, or $y = f(x)$, or $s = f(t)$, or $\beta = f(\alpha)$, etc., or also $\heartsuit = f(\clubsuit)$: the names of the variables are simple placeholders and what counts is only the sequence of operations (almost always numerical) that lead from $a$ to $b = f(a)$. Writing $b = a^2 + 2a + 1$ is exactly the same as writing $y = x^2 + 2x + 1$, or $s = t^2 + 2t + 1$, or $\beta = \alpha^2 + 2\alpha + 1$, or even $\heartsuit = \clubsuit^2 + 2\clubsuit + 1$: the function (its name is $f$) is identified by the operations square + double + 1 that allow us to pass from the independent variable to the dependent one. H
We close this introductory section by making rigorous the notion of graph of a function, until now used at an intuitive level. For the parabola $x^2$ the graph
[Figure: graph of the parabola $x^2$.]
is the locus of the points $(x, f(x))$ of the plane, when $x$ varies in the domain of the function. For example, the points $(-1, 1)$, $(0, 0)$ and $(1, 1)$ belong to the graph of the parabola.
$$\operatorname{Gr} f = \{(x, f(x)) : x \in A\} \subseteq A \times B$$
unique f (x).
[Figure: Curve in $\mathbb{R}^2$]
(ii) When $A \subseteq \mathbb{R}^2$ and $B = \mathbb{R}$, the graph is a subset of the three-dimensional space $\mathbb{R}^3$, i.e., a surface (without thickness).
[Figure: Surface in $\mathbb{R}^3$]
6.2 Applications
6.2.1 Static choices
Let us assume that, as in Section 2.4.1, the vectors in $\mathbb{R}^n_+$ have the meaning of bundles of goods. It is natural to think that the consumer will prefer some bundles to others. For example, it is reasonable to assume that, if $x \ge y$ ($x$ is “more abundant” than $y$), $x$ is preferred to $y$. In symbols, we write $x \succsim y$, where the symbol $\succsim$ represents the preference relation of the consumer on the bundles.
In general, we assume that the preference $\succsim$ on the various bundles of goods can be represented by a function $u : \mathbb{R}^n_+ \to \mathbb{R}$, called utility function, such that the bundle $x$ is preferred to $y$ if and only if $u(x) \ge u(y)$, i.e.,
$$x \succsim y \iff u(x) \ge u(y) \qquad (6.2)$$
Originally, around 1870, the first marginalists (in particular Jevons, Menger and Walras) interpreted $u(x)$ as the level of welfare/physical satisfaction produced by the bundle $x$. They therefore gave a physiological interpretation of utility functions, which quantified the emotions that the consumer felt in owning different bundles. In this so-called cardinalist interpretation of utility functions, which goes back to Jeremy Bentham and to his “pain and pleasure calculus”,6 utility functions, besides representing the preference $\succsim$, are inherently interesting because they quantify an emotive state of the consumer, his degree of pleasure induced by the bundles. In addition to the comparison $u(x) \ge u(y)$, it is also legitimate to compare the differences
$$u(x) - u(y) \ge u(z) - u(w) \qquad (6.3)$$
which indicates that the bundle $x$ is preferred to $y$ at least as intensely as the bundle $z$ is preferred to the bundle $w$. Moreover, since $u(x)$ measures the degree of pleasure that the consumer gets from the bundle $x$, in the cardinalist interpretation it is also legitimate to compare these measures among different consumers, i.e., to make interpersonal comparisons of utility.
The cardinalist interpretation came into question at the end of the nineteenth century due to the impossibility of measuring experimentally the supposed physiological aspects that lie at the basis of utility functions.7 For this reason, with the works of Vilfredo Pareto at the beginning of the twentieth century, developed first by Eugen Slutsky in 1915 and then by John Hicks in the 1930s,8 the ordinalist interpretation of utility functions prevailed: more modestly, it is assumed that they are only a mere numerical representation of the preference $\succsim$ of the consumer. According to such an interpretation, what counts is only that the ordering $u(x) \ge u(y)$ represents the preference for bundle $x$ over bundle $y$, that is, $x \succsim y$. On the other hand, it is of no interest to know if it also represents the, more or less intense, consumer’s emotions. In other terms, in the ordinalist approach the fundamental notion is the one of preference $\succsim$, while the utility function is a mere numerical representation of it. The comparisons of intensity (6.3) or the interpersonal comparisons of utility no longer have meaning.
6
See his Introduction to the Principles of Morals and Legislation, published in 1789.
7
Around 1901, the famous mathematician Henri Poincaré wrote to Walras: “I can say that one satisfaction is greater than another, since I prefer one to the other, but I cannot say that the first satisfaction is two or three times greater than the other.” Even if he did not have great economic knowledge, Poincaré, with great sensibility, understood the main point.
8
The interested reader can read G. J. Stigler, The development of utility theory I, II, Journal of Political Economy, 58, 307–327 and 373–396, 1950.
At the empirical level, the consumer’s preference % is revealed in the choices among
bundles which are much simpler to observe than emotions or other mental states.
The ordinalist interpretation established itself as the standard one because, besides the superior empirical content just mentioned, the works of Pareto showed that it is sufficient for developing consumer theory. Nevertheless, at an intuitive level many economists continue to use cardinalist categories because of their introspective plausibility. In any case, thanks to utility functions we can deal with the problem of a consumer who has to choose a bundle in an assigned set $A$ of $\mathbb{R}^n_+$. The consumer will be guided in such a choice by his utility function $u : A \subseteq \mathbb{R}^n_+ \to \mathbb{R}$; namely, $u(x) \ge u(y)$ indicates that the consumer prefers the bundle $x$ of goods to the bundle $y$ or that he is indifferent between the two. The image $\operatorname{Im} u$ represents all the levels of utility that can be obtained by the consumer.
For example,
$$u(x) = \sum_{i=1}^{n} x_i$$
is the utility function of a consumer who orders the bundles simply according to the sum of the quantities of the goods that they contain. The classic Cobb-Douglas utility function is
$$u(x) = \prod_{i=1}^{n} x_i^{\alpha_i}$$
with the exponents $\alpha_i > 0$ such that $\sum_{i=1}^{n} \alpha_i = 1$ (see Example 174). When $\alpha_i = 1/n$ for each $i$, we have
$$u(x) = \prod_{i=1}^{n} (x_i)^{\frac{1}{n}} = \left( \prod_{i=1}^{n} x_i \right)^{\frac{1}{n}}$$
so that the bundles are ordered according to the $n$-th root of the product of the quantities of the goods that they contain.9
Going back instead to Section 2.4.1, let us consider a producer that has to decide how much output to produce. In such a decision the so-called production function $f : A \subseteq \mathbb{R}^n_+ \to \mathbb{R}$ plays a crucial role. The production function describes how much output $f(x)$ is obtained starting from a vector $x \in \mathbb{R}^n_+$ of input. For example,
$$f(x) = \left( \prod_{i=1}^{n} x_i \right)^{\frac{1}{n}}$$
is the Cobb-Douglas production function in which the output is equal to the $n$-th root of the product of the input components.
9
Note that, by an obvious property of the product, all the bundles with at least one zero component xi
have 0 utility. From an economic viewpoint, it is not really plausible to think that the presence of one zero
component has such drastic consequences. For this reason, it is often preferred to de…ne the Cobb-Douglas
function only on Rn++ , and we will do so.
6.3. GENERAL PROPERTIES 117
$$U(x) = u_1(x_1) + \beta u_2(x_2) + \cdots + \beta^{T-1} u_T(x_T) = \sum_{t=1}^{T} \beta^{t-1} u_t(x_t) \qquad (6.4)$$
where $\beta \in (0, 1)$ is a subjective discount factor that depends on how “patient” the consumer is. The more patient the consumer, i.e., the more he is willing to postpone his consumption of a given quantity of the good, the higher the value of $\beta$. The closer $\beta$ gets to 1, the closer we approach the form
$$U(x) = u_1(x_1) + u_2(x_2) + \cdots + u_T(x_T) = \sum_{t=1}^{T} u_t(x_t)$$
in which the consumption in each period is evaluated in an identical way. On the contrary, the closer $\beta$ gets to 0, the closer $U(x)$ gets to $u_1(x_1)$, that is, the consumer becomes extremely impatient and does not attach importance to future consumption.
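The role of the discount factor can be seen in a small numerical sketch. It rests on assumptions that are not in the text: the same period utility (a square root) is used in every period, the consumption stream is made up, and the names `discounted_utility` and `beta` are hypothetical.

```python
import math

def discounted_utility(consumption, beta, u=math.sqrt):
    """Sum over periods t = 1, ..., T of beta**(t-1) * u(x_t), with 0 < beta < 1.

    The same period utility u is used in every period (a simplifying assumption).
    """
    if not 0 < beta < 1:
        raise ValueError("the discount factor must lie strictly between 0 and 1")
    # enumerate starts at 0, so period t = 1 gets weight beta**0 = 1
    return sum(beta ** t * u(c) for t, c in enumerate(consumption))

stream = [4.0, 4.0, 4.0]                           # same consumption in T = 3 periods
U_half = discounted_utility(stream, 0.5)           # 2 + 0.5*2 + 0.25*2
U_patient = discounted_utility(stream, 0.999)      # close to the undiscounted sum
U_impatient = discounted_utility(stream, 0.001)    # close to the first-period utility
```

With a discount factor near 1 the three periods count almost equally, while with a factor near 0 only the first period matters.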
$$f^{-1}(y) = \{x \in A : f(x) = y\}$$
of the elements of the domain whose image is $y$. More generally, given any subset $D$ of the codomain $B$, its preimage $f^{-1}(D)$ is the set
$$f^{-1}(D) = \{x \in A : f(x) \in D\}$$
Example 177 Consider the function $f : A \to B$ that to each person associates the date of birth. If $y \in B$ is a possible such date, $f^{-1}(y)$ is the set of the (living) persons that have $y$ as date of birth; in other words, all the persons in $f^{-1}(y)$ have the same age. N
Observe that, as in the last case, when $a < 0$ we have $f^{-1}((a, b)) = f^{-1}([0, b))$. This is due to the fact that the elements between $a$ and 0 have no preimage. For example, if $D = (-1, 2)$, then
$$f^{-1}(D) = \left( -\sqrt{2}, \sqrt{2} \right)$$
Note that
$$f^{-1}(D) = f^{-1}([0, 2)) = f^{-1}((-1, 2))$$
that is, the negative elements of $D$ are irrelevant (since they do not belong to the image of the function). N
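On a finite domain, preimages can be computed by direct enumeration. The sketch below assumes that the function of the example is the square $f(x) = x^2$ (the statement of the example is not fully legible here) and restricts it to a few integers; `preimage` is a hypothetical helper that takes a membership test for the subset of the codomain.

```python
def preimage(f, domain, in_D):
    """Set of the points x of the domain whose image f(x) satisfies the test in_D."""
    return {x for x in domain if in_D(f(x))}

def square(x):
    return x * x

domain = range(-3, 4)   # the integers -3, ..., 3, a finite stand-in for the real line

pre_point = preimage(square, domain, lambda y: y == 4)          # preimage of the value 4
pre_D = preimage(square, domain, lambda y: -1 < y < 2)          # preimage of (-1, 2)
pre_D_positive = preimage(square, domain, lambda y: 0 <= y < 2) # preimage of [0, 2)
pre_negative = preimage(square, domain, lambda y: y == -1)      # no point has image -1
```

The sets `pre_D` and `pre_D_positive` coincide, in line with the observation that the negative elements of the subset play no role.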
This terminology, which expresses the idea that the points of $f^{-1}(k)$ are the points of the domain in which the function reaches the “level” $k$, is particularly fitting in several economic applications, as we will see shortly. The level curves are especially used for the functions $f : \mathbb{R}^2 \to \mathbb{R}$ because in this case it is possible to give a geometric representation that may prove illuminating.
10
The motivation is the same as the one that leads to representing the mountains on a geographic map
through the so-called isohypses, i.e., the ideal lines that connect all the points at the same altitude above the
sea level. For the functions of two variables, the problem is exactly the same: it is possible to represent a
surface in R3 through the lines that join the points (x1 ; x2 ) for which the function assumes the same value k.
Example 180 Let $f : \mathbb{R}^2 \to \mathbb{R}$ be given by $f(x_1, x_2) = x_1^2 + x_2^2$. For every $k \ge 0$, the level curve $f^{-1}(k)$ is the locus in $\mathbb{R}^2$ of equation
$$x_1^2 + x_2^2 = k$$
i.e., it is the circle with center in the origin and radius $\sqrt{k}$. Graphically, the level curves can therefore be represented as:
[Figure: three-dimensional plot with axes $x_1$, $x_2$, $x_3$.]
Two different level curves of the same function cannot have any point in common, that is,
$$f^{-1}(k_1) \cap f^{-1}(k_2) = \emptyset \qquad (6.5)$$
if $k_1 \neq k_2$. Indeed, assuming there is a point $x \in \mathbb{R}^n$ that belongs to the two curves of levels $k_1$ and $k_2$, one would have $f(x) = k_1$ and $f(x) = k_2$ with $k_1 \neq k_2$, but this is forbidden because, by definition, a function assumes only one value at each point.
Example 181 Let $f : A \subseteq \mathbb{R}^2 \to \mathbb{R}$ be given by $f(x_1, x_2) = \sqrt{7x_1^2 - x_2}$. For every $k \ge 0$, the level curve $f^{-1}(k)$ is the locus in $\mathbb{R}^2$ of equation $\sqrt{7x_1^2 - x_2} = k$, that is, $x_2 = -k^2 + 7x_1^2$. It is a parabola that intersects the vertical axis at $-k^2$. Graphically:
[Figure: the level curves for $k = 0$, $k = 1$ and $k = 2$ in the $(x_1, x_2)$ plane.]
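The level curves of this example can be sampled numerically. The sketch below assumes that the function is $f(x_1, x_2) = \sqrt{7x_1^2 - x_2}$, so that the curve of level $k$ is the parabola $x_2 = 7x_1^2 - k^2$ (the minus signs are our reading of the formula); the helper name `level_curve_point` and the sampling grid are hypothetical.

```python
import math

def f(x1, x2):
    """f(x1, x2) = sqrt(7*x1**2 - x2), defined where 7*x1**2 - x2 >= 0."""
    return math.sqrt(7 * x1 ** 2 - x2)

def level_curve_point(k, x1):
    """Point (x1, x2) of the level curve of level k >= 0, with x2 = 7*x1**2 - k**2."""
    return (x1, 7 * x1 ** 2 - k ** 2)

grid = [i / 4 for i in range(-8, 9)]     # sample abscissas between -2 and 2
levels = [0.0, 1.0, 2.0]
curves = {k: [level_curve_point(k, x1) for x1 in grid] for k in levels}

# Ordinates at which the three sampled parabolas cross the vertical axis (x1 = 0).
intercepts = {k: level_curve_point(k, 0.0)[1] for k in levels}
```

Evaluating `f` on the sampled points returns the corresponding level, and the sampled curves share no point, as property (6.5) requires.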
$$f(x_1, x_2) = \sqrt{\frac{x_1^2 + x_2^2}{x_1}}$$
Its level curves have equation
$$\sqrt{\frac{x_1^2 + x_2^2}{x_1}} = k$$
that is, $x_1^2 + x_2^2 - k^2 x_1 = 0$, and therefore they are circles passing through the origin and centered at $(k^2/2, 0)$.
Note that, although all such circles have the origin as a common point, the “true” level curves are the circles without the origin (because at $(0, 0)$ the function is not defined) and so they cannot have any point in common. N
O.R. We limit ourselves to functions of two variables. The generic level curve of $f$ has equation
$$f(x_1, x_2) = k$$
It can be rewritten, in an apparently more complicated form, as
$$\begin{cases} y = f(x_1, x_2) \\ y = k \end{cases}$$
(ii) the equation $y = k$ represents a horizontal plane (it contains the points $(x_1, x_2, k) \in \mathbb{R}^3$, i.e., all the points of “height” $k$);
(iii) the brace $\{$ geometrically means intersection between the sets defined by the two equations.
The curve of level $k$ appears therefore as the intersection between the surface that represents $f$ and a horizontal plane.
[Figure: three-dimensional plot with axes $x_1$, $x_2$, $x_3$.]
Hence, the various level curves are obtained by cutting the surface horizontally with hori-
zontal planes (at various levels) and representing the edges of the “slices” obtained in this
way on the plane (x1 ; x2 ). H
Indifference curves
We see now a classical economic application of the level curves. Given a utility function $u : A \subseteq \mathbb{R}^n_+ \to \mathbb{R}$, the level curves
$$u^{-1}(k) = \{x \in A : u(x) = k\}$$
are called indifference curves. In other words, an indifference curve is formed by all the bundles $x \in \mathbb{R}^n_+$ that have the same utility $k$, and are therefore indifferent for the consumer. The set $\{u^{-1}(k) : k \in \mathbb{R}\}$ of all the indifference curves is sometimes called indifference map.
Example 183 Consider the simple Cobb-Douglas utility function $u : \mathbb{R}^2_+ \to \mathbb{R}$ given by $u(x) = (x_1 x_2)^{1/2}$. For every $k > 0$ we have
$$u^{-1}(k) = \left\{x \in \mathbb{R}^2_+ : (x_1 x_2)^{1/2} = k\right\} = \left\{x \in \mathbb{R}^2_+ : x_1 x_2 = k^2\right\} = \left\{x \in \mathbb{R}^2_+ : x_2 = \frac{k^2}{x_1}\right\}$$
Therefore, the indifference curve of level $k$ is the hyperbola of equation
$$x_2 = \frac{k^2}{x_1}$$
When $k > 0$ varies we get the indifference map $\{u^{-1}(k)\}_k$, i.e.,
[Figure: the indifference curves for $k = 1$, $k = 2$, and $k = 3$.]
Note that the property of the indifference curves being disjoint is nothing but a special case of property (6.5), valid for any family of level curves.
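The computation in Example 183 can be checked numerically. The following is only a minimal sketch: the function names `u` and `indifference_curve` and the sample values of $x_1$ are illustrative assumptions, not part of the text.

```python
import math

def u(x1, x2):
    """Cobb-Douglas utility u(x) = (x1 * x2) ** (1/2) of Example 183."""
    return math.sqrt(x1 * x2)

def indifference_curve(k, x1_values):
    """Bundles (x1, k**2 / x1) lying on the indifference curve of level k > 0."""
    return [(x1, k**2 / x1) for x1 in x1_values]

# Hypothetical sample of first coordinates on the curve of level k = 2.
curve_k2 = indifference_curve(2.0, [0.5, 1.0, 2.0, 4.0])

# Every bundle on the curve should have utility (approximately) 2.
levels = [u(x1, x2) for x1, x2 in curve_k2]
```

Since each bundle is assigned exactly one utility value, a bundle cannot lie on two curves of different levels, which is the disjointness property mentioned above.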
are called isoquants. In other words, an isoquant is the set of all the input vectors $x \in \mathbb{R}^n_+$ that produce the same output. The set $\{f^{-1}(k) : k \in \mathbb{R}\}$ of all the isoquants is sometimes called the isoquant map.
Finally, for a cost function $c : A \subseteq \mathbb{R}_+ \to \mathbb{R}$, the level curves
$$c^{-1}(k) = \{x \in A : c(x) = k\}$$
are called isocosts. In other words, an isocost is the set of all the levels of output $x \in A$ that have the same cost. The set $\{c^{-1}(k) : k \in \mathbb{R}\}$ of all the isocosts is sometimes called the isocost map.
Indifference curves, isoquants and isocosts are all examples of level curves, whose properties they inherit. For example, the fact that two level curves have no points in common (property (6.5)) implies the analogous classical property of the indifference curves, as already observed.
Definition 184 Given any two functions $f$ and $g$ in $\mathbb{R}^A$, the function $f + g$ is the element of $\mathbb{R}^A$ for which
$$(f + g)(x) = f(x) + g(x) \quad \forall x \in A$$
The sum function $f + g : A \to \mathbb{R}$ is hence built by adding, for each element $x$ of the domain $A$, the images $f(x)$ and $g(x)$ of $x$ under the two functions.
Example 185 Let $\mathbb{R}^{\mathbb{R}}$ be the set of all the functions $f : \mathbb{R} \to \mathbb{R}$. Consider $f(x) = x$ and $g(x) = x^2$. The sum function $f + g$ is defined by $(f + g)(x) = x + x^2$. N
(iii) the ratio function $(f/g)(x) = f(x)/g(x)$ for every $x \in A$, provided $g(x) \neq 0$.
We have introduced four operations in the set $\mathbb{R}^A$, based on the four basic operations on the real numbers. It is easy to see that these operations enjoy properties analogous to those of the basic operations. For example, addition is commutative, that is, $f + g = g + f$, and associative, that is, $(f + g) + h = f + (g + h)$.
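The pointwise nature of these operations can be made concrete with a short sketch. The helper names `add`, `sub`, `mul` and `div` are hypothetical; functions with real values are represented here simply as Python callables sharing the same domain.

```python
def add(f, g):
    """Pointwise sum: (f + g)(x) = f(x) + g(x)."""
    return lambda x: f(x) + g(x)

def sub(f, g):
    """Pointwise difference: (f - g)(x) = f(x) - g(x)."""
    return lambda x: f(x) - g(x)

def mul(f, g):
    """Pointwise product: (fg)(x) = f(x) g(x)."""
    return lambda x: f(x) * g(x)

def div(f, g):
    """Pointwise ratio, defined only at points where g(x) != 0."""
    def ratio(x):
        if g(x) == 0:
            raise ZeroDivisionError("the ratio f/g is not defined where g(x) = 0")
        return f(x) / g(x)
    return ratio

# The functions of Example 185: f(x) = x and g(x) = x^2.
f = lambda x: x
g = lambda x: x**2

f_plus_g = add(f, g)   # x + x^2
g_plus_f = add(g, f)   # same function, by commutativity
```

Evaluating `f_plus_g` and `g_plus_f` at the same point gives the same number, because commutativity of the sum of functions is inherited from commutativity of the sum of real numbers.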
N.B. (i) In Definition 184 and in that of the other operations the functions have to share the same domain $A$. For example, if $f(x) = x^2$ and $g(x) = \sqrt{x}$, the sum $f + g$ is meaningless because, for $x < 0$, the function $g$ is not defined. (ii) The domain $A$ may be any set: numbers, chairs, or other. By contrast, it is essential that the codomain is $\mathbb{R}$ because it is among real numbers that we are able to perform the four basic operations. O
6.3.3 Composition
Consider two functions $f : A \to B$ and $g : C \to D$, with $\operatorname{Im} f \subseteq C$. Take any point $x \in A$. Since $\operatorname{Im} f \subseteq C$, the image $f(x)$ belongs to the domain $C$ of the function $g$. We can apply the function $g$ to the image $f(x)$, obtaining in this way the element $g(f(x))$ of $D$. Indeed, the function $g$ has as its argument the image $f(x)$ of $x$.
[Figure: diagram of the composition: $x \in A$ is mapped by $f$ to $f(x) \in \operatorname{Im} f \subseteq C$, which is mapped by $g$ to $g(f(x)) \in D$.]
6.4. CLASSES OF FUNCTIONS 125
We have therefore associated to each element $x$ of the set $A$ the element $g(f(x))$ of the set $D$. This rule, called composition, starts with the functions $f$ and $g$ and defines a new function from $A$ to $D$, denoted by $g \circ f$. Formally:
$$(g \circ f)(x) = g(f(x)) \quad \forall x \in A$$
Note that the inclusion condition, $\operatorname{Im} f \subseteq C$, is key in making the composition possible. Let us give some examples.
$$g(f(x)) = g(x^2) = x^2 + 1$$
$$f(g(x)) = f(x + 1) = (x + 1)^2$$
Example 189 If in the previous example we consider $\tilde{g} : [1, +\infty) \to \mathbb{R}$ given by $\tilde{g}(x) = x - 1$, the inclusion condition is satisfied for $f \circ \tilde{g}$, because $\operatorname{Im} \tilde{g} = [0, +\infty) = \mathbb{R}_+$. In particular, $f \circ \tilde{g} : [1, +\infty) \to \mathbb{R}$ is given by $\sqrt{x - 1}$. As we will see in Section 6.7, the function $\tilde{g}$ is the restriction of $g$ to $[1, +\infty)$. N
Example 190 Let $A$ be the set of all Italian citizens, $f : A \to \mathbb{R}$ the function that associates to each of them his income for this year, and $g : \mathbb{R} \to \mathbb{R}$ the function that associates to each possible income the tax that must be paid. The composite function $g \circ f : A \to \mathbb{R}$ establishes the correspondence between each Italian and the tax that he has to pay. For the tax offices (and also for the citizens) such a composite function is of great interest. N
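The composition rule and the role of the inclusion condition can be sketched as follows. The helper `compose` and the names `g_tilde`, `g_after_f`, `f_after_g` are illustrative assumptions; the functions $x^2$, $x + 1$, $x - 1$ and $\sqrt{x}$ are those appearing in the examples above.

```python
import math

def compose(g, f):
    """Return the composite function g∘f, that is, x -> g(f(x))."""
    return lambda x: g(f(x))

def f(x):
    return x**2

def g(x):
    return x + 1

g_after_f = compose(g, f)   # x -> x^2 + 1
f_after_g = compose(f, g)   # x -> (x + 1)^2

def g_tilde(x):
    """g_tilde(x) = x - 1 on the domain [1, +inf), so that Im g_tilde = [0, +inf)."""
    if x < 1:
        raise ValueError("x is outside the domain [1, +inf)")
    return x - 1

# The inclusion condition holds: the image of g_tilde lies in the domain of sqrt.
sqrt_after_g_tilde = compose(math.sqrt, g_tilde)   # x -> sqrt(x - 1)
```

The two composites `g_after_f` and `f_after_g` generally differ, which shows that composition is not commutative.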
that is, if $f$ associates different elements of the codomain to different elements of the domain. Graphically:
[Figure: diagram of an injective function from $A = \{a_1, a_2\}$ to $B = \{b_1, b_2, b_3\}$.]
A simple example of an injective function is $f(x) = x^3$. Indeed, two distinct real numbers always have distinct cubes, that is, $x \neq y$ implies $x^3 \neq y^3$ for every $x, y \in \mathbb{R}$. A classical example of a non-injective function is $f(x) = x^2$: for instance, to the two distinct points $2$ and $-2$ of $\mathbb{R}$ there corresponds the same square, that is, $f(2) = f(-2) = 4$.
Note that (6.6) is equivalent to the contrapositive:12
which requires that two elements of the domain that have the same image be equal.
Im f = B
that is, if for each element $y$ of $B$ there exists at least one element $x$ of $A$ such that $f(x) = y$. In other words, a function is surjective if each element of the codomain is the image of at least one point in the domain.
[Figure: diagram of a function between $A = \{a_1, a_2, a_3\}$ and $B = \{b_1, b_2, b_3\}$.]
For example, the function $f : \mathbb{R} \to \mathbb{R}$ given by $f(x) = x^3$ is bijective. In the case of finite sets we have the following simple, but interesting, result, where $|A|$ denotes the cardinality of a finite set $A$, that is, the number of elements that belong to it.
Proposition 192 Let $A$ and $B$ be any two finite sets. There exists a bijection $f : A \to B$ if and only if $|A| = |B|$.
Proof “If”. Denote $|A| = |B| = n$ and write $A = \{a_1, a_2, \ldots, a_n\}$ and $B = \{b_1, b_2, \ldots, b_n\}$. Then define the bijection $f : A \to B$ by $f(a_i) = b_i$ for $i = 1, 2, \ldots, n$.
“Only if”. Let $f : A \to B$ be a bijection. By injectivity, we have $|A| \le |B|$. Indeed, to each $x \in A$ there corresponds a distinct $f(x) \in B$. On the other hand, by surjectivity, we have $|B| \le |A|$. Indeed, for each $y \in B$, set $C(y) = f^{-1}(y) = \{x \in A : f(x) = y\}$; by surjectivity, each $C(y)$ is non-empty. If $y_1 \neq y_2$, we have $C(y_1) \cap C(y_2) = \emptyset$. Hence, setting $\mathcal{C} = \{C(y) : y \in B\}$, we have $|B| = |\mathcal{C}|$. But it is easy to see that $|\mathcal{C}| \le |A|$, whence $|B| \le |A|$. In conclusion, we have $|A| = |B|$.
As we will see in Chapter 7, paraphrasing a famous quote of David Hilbert, this result is
the door to the paradise of Cantor.
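For finite sets, injectivity, surjectivity and Proposition 192 can be explored directly. The sketch below is only illustrative: a function $f : A \to B$ is represented, by assumption, as a Python dictionary whose keys are the elements of $A$ and whose values lie in $B$; the helper names are hypothetical.

```python
def is_injective(f):
    """True if different elements of the domain have different images."""
    values = list(f.values())
    return len(set(values)) == len(values)

def is_surjective(f, B):
    """True if every element of the codomain B is the image of some point."""
    return set(f.values()) == set(B)

def is_bijective(f, B):
    return is_injective(f) and is_surjective(f, B)

A = ["a1", "a2", "a3"]
B = ["b1", "b2", "b3"]

# The bijection f(a_i) = b_i used in the "If" part of the proof.
f = dict(zip(A, B))

# A map from A onto a smaller set: surjective, but necessarily not injective.
B_small = ["b1", "b2"]
h = {"a1": "b1", "a2": "b2", "a3": "b2"}
```

In line with the proposition, no dictionary with keys `A` and values in `B_small` can pass `is_bijective`, since the two sets have different cardinalities.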
We therefore have
$$f^{-1}(f(x)) = x \quad \forall x \in A \qquad (6.7)$$
and
$$f(f^{-1}(y)) = y \quad \forall y \in \operatorname{Im} f \qquad (6.8)$$
The inverse functions go the opposite way to the original ones: from $x \in A$ we arrive at $f(x) \in B$, and we go back with $f^{-1}(f(x)) = x$.
[Figure: diagram of $f$ mapping $x \in A$ to $y \in B$ and of $f^{-1}$ mapping $y$ back to $x$.]
It makes sense to talk about the inverse function only for injective functions, which are then called invertible. Indeed, if $f$ were not injective and there were therefore two elements of the domain $x_1 \neq x_2$ with the same image $y = f(x_1) = f(x_2)$, the set of the preimages of $y$ would not be a singleton (because it would contain at least the two elements $x_1$ and $x_2$) and the relation $f^{-1}$ would not be a function. When the function $f$ is also surjective, and is therefore bijective, we have $f^{-1} : B \to A$. In such a case the domain of the inverse is the entire codomain of $f$.
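$$f(x) = \begin{cases} \dfrac{x}{2} & \text{if } x < 0 \\ 3x & \text{if } x \ge 0 \end{cases}$$
$$f^{-1}(y) = \begin{cases} 2y & \text{if } y < 0 \\ \dfrac{y}{3} & \text{if } y \ge 0 \end{cases}$$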
Example 196 Let $f : \mathbb{R} \setminus \{0\} \to \mathbb{R}$ be given by $f(x) = 1/x$. From $y = 1/x$ it follows that $x = 1/y$, and therefore $f^{-1} : \mathbb{R} \setminus \{0\} \to \mathbb{R}$ is given by $f^{-1}(y) = 1/y$. In this case $f = f^{-1}$. Note that $\mathbb{R} \setminus \{0\}$ is both the domain of $f^{-1}$ and the image of $f$. N
$$f(x) = \begin{cases} x & \text{if } x \in \mathbb{Q} \\ -x & \text{if } x \notin \mathbb{Q} \end{cases}$$
It is easy to see that, when it exists, the inverse $(g \circ f)^{-1}$ of the composite function $g \circ f$ is
$$f^{-1} \circ g^{-1} \qquad (6.9)$$
that is, the composition of the inverses, but in reverse order: indeed, from $y = g(f(x))$ we get $g^{-1}(y) = f(x)$ and finally $f^{-1}(g^{-1}(y)) = x$. Likewise, in dressing, first we put on the underpants ($f$) and then the trousers ($g$); in undressing, first we take off the trousers ($g^{-1}$) and then the underpants ($f^{-1}$).
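Formula (6.9) can be verified on a simple pair of invertible functions. The choice $f(x) = x + 1$ and $g(x) = 3x$ below is a hypothetical example, not taken from the text, selected only because the inverses are immediate.

```python
def f(x):
    return x + 1

def f_inv(y):
    return y - 1

def g(x):
    return 3 * x

def g_inv(y):
    return y / 3

def g_after_f(x):
    """The composite (g∘f)(x) = 3 (x + 1)."""
    return g(f(x))

def inverse_of_composite(y):
    """(g∘f)^{-1} = f^{-1}∘g^{-1}: first undo g, then undo f."""
    return f_inv(g_inv(y))

def wrong_order(y):
    """g^{-1}∘f^{-1}: the inverses composed in the wrong order."""
    return g_inv(f_inv(y))

y = g_after_f(4)   # 3 * (4 + 1) = 15
```

Applying `inverse_of_composite` to `y` recovers the starting point 4, while `wrong_order` does not.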
O.R. The graph of the function $f^{-1}$ is the same as that of $f$, once the Cartesian axes have been exchanged. The simplest way to see this is to trace the graph of $f$ on a thin sheet of paper and to hold it up to the light, rotating the axes by 90° so as to exchange abscissae and ordinates: what appears is the graph of $f^{-1}$.
[Figure: two panels. Left: the function $y = f(x) = \sqrt[3]{x}$. Right: the function $y = f^{-1}(x) = x^3$.]
Inverses and cryptography The computation of the cube $x^3$ of any scalar $x$ is much easier than the computation of the cube root $\sqrt[3]{x}$: it is much easier to compute $80^3 = 512{,}000$ (two multiplications suffice) than $\sqrt[3]{512{,}000} = 80$. In other words, the computation of the cubic function $f(x) = x^3$ is much easier than the computation of its inverse $f^{-1}(x) = \sqrt[3]{x}$. This computational difference increases significantly as we take higher and higher odd powers (for example $f(x) = x^5$, $f(x) = x^7$ and so on).
Similarly, while the computation of $e^x$ is fairly easy, that of $\log x$ is much harder (before the advent of electronic calculators, logarithmic tables were used to aid such computations). From a merely computational viewpoint (not theoretical, where everything works smoothly), the inverse function $f^{-1}$ may be very difficult to deal with. The injective functions for which the computation of $f$ is easy, while that of $f^{-1}$ is complex, are called one-way.13
For example, let $A = \{(p, q) \in \mathbb{P} \times \mathbb{P} : p < q\}$, and consider the function $f : A \subseteq \mathbb{P} \times \mathbb{P} \to \mathbb{N}$ defined as $f(p, q) = pq$, which associates to each pair of prime numbers $p, q \in \mathbb{P}$, with $p < q$, their product $pq$. For example, $f(2, 3) = 6$ and $f(11, 13) = 143$. Thanks to the Fundamental Theorem of Arithmetic, it is an injective function.14 Given two prime numbers $p$ and $q$, the computation of their product is a trivial multiplication. Instead, given any natural number $n$ it is quite complex, and it can require a long time, even for a powerful computer, to determine if it is the product of two prime numbers. In this regard, the reader may recall the discussion regarding factorization and primality tests from Section 1.3.2 (to experience the difficulty firsthand, the reader may try to check whether the number 4343 is the product of two prime numbers). This makes the computation of the inverse function $f^{-1}$ very complex, as opposed to the very simple computation of $f$. For this reason, $f$ is a classic example of a one-way function.
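The asymmetry between $f$ and $f^{-1}$ can be experienced with a short sketch. It uses plain trial division, which is an assumption made here for simplicity and is far from the methods used in practice; the names `smallest_factor`, `is_prime` and `invert_product` are hypothetical.

```python
def smallest_factor(n):
    """Smallest divisor d >= 2 of n (n itself if n is prime), by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n

def is_prime(n):
    return n >= 2 and smallest_factor(n) == n

def f(p, q):
    """The easy direction: one multiplication."""
    return p * q

def invert_product(n):
    """The hard direction: return (p, q) with p < q primes and p * q = n, else None."""
    if n < 4:
        return None
    p = smallest_factor(n)   # the smallest factor greater than 1 is always prime
    q = n // p
    if p < q and is_prime(q):
        return (p, q)
    return None
```

For a small number such as 4343 the search ends quickly, but the number of trial divisions grows with the size of the smallest prime factor, whereas computing `f(p, q)` always costs a single multiplication.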
13 The notions of “simple” and “complex”, here used qualitatively, can be made more rigorous (as the curious reader may discover in cryptography texts).
14 But not surjective: for example, $4 \notin \operatorname{Im} f$ because no two different prime numbers whose product is 4 exist.
(i) bounded from above if its image $\operatorname{Im} f$ is a set bounded from above in $\mathbb{R}$, i.e., if there exists $M \in \mathbb{R}$ such that $f(x) \le M$ for every $x \in A$;
(ii) bounded from below if its image $\operatorname{Im} f$ is a set bounded from below in $\mathbb{R}$, i.e., if there exists $m \in \mathbb{R}$ such that $f(x) \ge m$ for every $x \in A$;
(iii) bounded if it is both bounded from above and from below.
is bounded from below, but not from above, since $f(x) \ge 0$ for every $x \in \mathbb{R}$, while the function $f : \mathbb{R} \to \mathbb{R}$ given by $f(x) = -x^2$ is bounded from above, but not from below, since $f(x) \le 0$ for every $x \in \mathbb{R}$.
The next lemma gives us a simple, but very useful, condition of boundedness.
Lemma 198 A function $f : A \to \mathbb{R}$ is bounded if and only if there exists $k > 0$ such that
$$|f(x)| \le k \quad \forall x \in A \qquad (6.10)$$
Proof If $f$ is bounded, there exist $m, M \in \mathbb{R}$ such that $m \le f(x) \le M$ for every $x \in A$. Let $k > 0$ be such that $-k \le m \le M \le k$. Then (6.10) holds. Vice versa, suppose that (6.10) holds. Thanks to (4.1), which holds also for $\le$, we have $-k \le f(x) \le k$, which implies that $f$ is bounded both from above and from below.
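The choice of $k$ in the proof can be illustrated numerically. In the sketch below the function $3 \sin x - 1$, the bounds $m = -4$, $M = 2$ and the sample grid are hypothetical choices; taking $k = \max\{|m|, |M|\}$ is one admissible choice whenever $m$ and $M$ are not both zero.

```python
import math

def f(x):
    # Hypothetical bounded function: -4 <= f(x) <= 2 for every real x.
    return 3 * math.sin(x) - 1

m, M = -4.0, 2.0

# One admissible k with -k <= m <= M <= k.
k = max(abs(m), abs(M))

# A finite sample of points: this is a numerical check, not a proof.
sample = [i / 10 for i in range(-100, 101)]

within_two_sided = all(m <= f(x) <= M for x in sample)
within_abs_bound = all(abs(f(x)) <= k for x in sample)
```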
Thus, we have a first taxonomy of the functions with real values $f : A \to \mathbb{R}$, that is, of the elements of the space15 $\mathbb{R}^A$. Note that such a taxonomy is not exhaustive, i.e., there exist functions that do not satisfy any of the conditions (i)–(iii): this is the case, for example, when $f(x) = x$. Such functions are called unbounded (their image is an unbounded set).
We denote by $\sup_{x \in A} f(x)$, often shortened as $\sup f$, the supremum of the image of a function $f : A \to \mathbb{R}$ bounded from above, that is,
$$\sup_{x \in A} f(x) = \sup(\operatorname{Im} f)$$
By the definition of the supremum, a number $M$ is such that $f(x) \le M$ for every $x \in A$ if and only if $\sup f \le M$.
Similarly, we denote by $\inf_{x \in A} f(x)$, often shortened as $\inf f$, the infimum of the image of a function $f : A \to \mathbb{R}$ bounded from below, that is,
$$\inf_{x \in A} f(x) = \inf(\operatorname{Im} f)$$
By the definition of the infimum, a scalar $m$ is such that $f(x) \ge m$ for every $x \in A$ if and only if $m \le \inf f$.
Clearly, a bounded function $f : A \to \mathbb{R}$ has both extrema, and so
$$\inf f \le f(x) \le \sup f \quad \text{for every } x \in A$$
In particular, the numbers $m$ and $M$ are such that $m \le f(x) \le M$ for every $x \in A$ if and only if $m \le \inf f \le \sup f \le M$.
Example 199 For the function (6.11) one has that $\sup f = 1$ and $\inf f = -2$. For the function $f : \mathbb{R} \setminus \{0\} \to \mathbb{R}$ given by $f(x) = 1/|x|$, which is bounded from below, but not from above, one has $\inf f = 0$. N
15 Note the use of the term space to denote a set of reference (in this case the set $\mathbb{R}^A$ of all the functions with real values defined on $A$).
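Monotonic functions on R
(i) increasing, if
$$x > y \implies f(x) \ge f(y) \quad \forall x, y \in A \qquad (6.12)$$
strictly increasing, if
$$x > y \implies f(x) > f(y) \quad \forall x, y \in A \qquad (6.13)$$
(ii) decreasing, if
$$x > y \implies f(x) \le f(y) \quad \forall x, y \in A \qquad (6.14)$$
strictly decreasing, if
$$x > y \implies f(x) < f(y) \quad \forall x, y \in A$$
(iii) constant, if there exists $k \in \mathbb{R}$ such that
$$f(x) = k \quad \forall x \in A$$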
Note that a function is constant if and only if it is both increasing and decreasing. In
other words, constancy is equivalent to having both monotonicity properties. It is for this
reason that we have introduced constancy among the forms of monotonicity. Soon, we will
see that for vector functions the relation between constancy and monotonicity is a bit more
subtle.
Increasing or decreasing functions are generically called monotonic. We thus have strict monotonicity when the inequality between the images $f(x)$ and $f(y)$ is strict for all points $x \neq y$ in the domain. In other words, strict monotonicity excludes the possibility that the function is constant in some region of the domain. Formally:
An analogous result holds for strictly decreasing functions. Strictly monotonic functions
are therefore injective, and hence invertible.
Proof “Only if”. Let $f$ be strictly increasing and let $f(x) = f(y)$. Suppose, by contradiction, that $x \neq y$: then either $x > y$ or $y > x$. In both cases, by (6.13), we have $f(x) \neq f(y)$, which contradicts $f(x) = f(y)$. It follows that $x = y$, as desired.
“If”. Let us suppose that (6.15) holds. Let $f$ be increasing. We prove that it is also strictly increasing. Let $x > y$. By increasing monotonicity, we have $f(x) \ge f(y)$, but we cannot have $f(x) = f(y)$, because from (6.15) it would follow that $x = y$. We have therefore $f(x) > f(y)$, as claimed.
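Example 202 The functions $f : \mathbb{R} \to \mathbb{R}$ given by $f(x) = x$ and $f(x) = x^3$ are strictly increasing, while the function
$$f(x) = \begin{cases} x & \text{if } x \ge 0 \\ 0 & \text{if } x < 0 \end{cases}$$
is increasing, but not strictly increasing, since it is constant for every $x < 0$. The same holds for the function defined by
$$f(x) = \begin{cases} x - 1 & \text{if } x \ge 1 \\ 0 & \text{if } -1 < x < 1 \\ x + 1 & \text{if } x \le -1 \end{cases} \qquad (6.16)$$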
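The functions of Example 202 can be tested on a finite grid. This is only a numerical sketch: a check on finitely many points can refute monotonicity but cannot prove it, and the grid and the function names below are illustrative assumptions.

```python
def is_increasing(f, xs):
    """Check f(a) <= f(b) for consecutive sorted points (enough by transitivity)."""
    xs = sorted(xs)
    return all(f(a) <= f(b) for a, b in zip(xs, xs[1:]))

def is_strictly_increasing(f, xs):
    """Check f(a) < f(b) for consecutive sorted, distinct points."""
    xs = sorted(xs)
    return all(f(a) < f(b) for a, b in zip(xs, xs[1:]))

def cube(x):
    return x**3

def positive_part(x):
    """x if x >= 0, and 0 if x < 0."""
    return x if x >= 0 else 0

def flat_in_the_middle(x):
    """The function (6.16): constant (equal to 0) on the interval (-1, 1)."""
    if x >= 1:
        return x - 1
    if x <= -1:
        return x + 1
    return 0

# Hypothetical grid of distinct points in [-3, 3].
grid = [i / 4 for i in range(-12, 13)]
```

On this grid `cube` passes both checks, while `positive_part` and `flat_in_the_middle` pass only the weaker one, because of their constant pieces.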
Note that in (6.12) we can replace $x > y$ by $x \ge y$ without any consequence, since we have $f(x) = f(y)$ if $x = y$. Hence, increasing monotonicity is equivalent to
which requires that to larger values of the image correspond larger values of the argument. Clearly, $f(x) = f(y)$ is equivalent to having both $f(x) \ge f(y)$ and $f(y) \ge f(x)$, which in turn, by (6.18), imply $x \ge y$ and $y \ge x$, that is, $x = y$. Therefore, from (6.18) it follows that
In the light of Proposition 201, we conclude that an increasing function that also satisfies the converse implication (6.18) is strictly increasing. The next result shows that the converse is also true, establishing in this way an interesting characterization of the strictly increasing functions; an analogous result holds for the strictly decreasing functions.
Proof Thanks to what we have seen above, it remains to prove the “Only if” part, i.e., that a strictly increasing function satisfies (6.20). Since a strictly increasing function is increasing, the implication
$$x \ge y \implies f(x) \ge f(y)$$
is obvious. To prove (6.20) it remains to show that
$$f(x) \ge f(y) \implies x \ge y$$
Let $f(x) \ge f(y)$ and suppose, by contradiction, that $x < y$. Strict increasing monotonicity implies $f(x) < f(y)$, which contradicts $f(x) \ge f(y)$. We have therefore $x \ge y$, as desired.
Monotonic functions on $\mathbb{R}^n$
The monotonicity notions seen in the case $n = 1$ generalize in a natural way to the case of arbitrary $n$, but with some delicate aspects due to the two peculiarities of the case $n \ge 2$, that is, the incompleteness of $\ge$ and the presence of two notions of strict inequality, $>$ and $\gg$. For the sake of brevity, we consider increasing monotonicity (analogous notions hold for decreasing monotonicity). The notion of increasing monotonicity can be extended in an obvious way: a function $f : A \subseteq \mathbb{R}^n \to \mathbb{R}$ is said to be increasing if
$$x \ge y \implies f(x) \ge f(y) \quad \forall x, y \in A$$
Note that this notion does not concern vectors $x$ and $y$ that cannot be compared, such as, for example, $(1, 2)$ and $(2, 1)$ in $\mathbb{R}^2$. Analogously, it is possible to introduce the concept of decreasing function. Moreover, $f$ is constant if there exists $k \in \mathbb{R}$ such that
$$f(x) = k \quad \forall x \in A$$
More delicate is the extension to $\mathbb{R}^n$ of strict monotonicity, given that we have two distinct concepts of strict inequality. A function $f : A \subseteq \mathbb{R}^n \to \mathbb{R}$ is said to be strictly increasing if
$$x > y \implies f(x) > f(y) \quad \forall x, y \in A$$
and strongly increasing if it is increasing and
$$x \gg y \implies f(x) > f(y) \quad \forall x, y \in A \qquad (6.22)$$
Proof A strongly increasing function is, by definition, increasing. It remains to prove that strictly increasing implies strongly increasing. Let therefore $f$ be strictly increasing. We need to prove that $f$ is increasing and satisfies (6.22). If $x \ge y$, we have $x = y$ or $x > y$. In the first case $f(x) = f(y)$. In the second case $f(x) > f(y)$, and hence $f(x) \ge f(y)$. Thus, $f$ is increasing. Moreover, if $x \gg y$, a fortiori we have $x > y$, and therefore $f(x) > f(y)$. The function $f$ is therefore strongly increasing.
The converses of the previous implications do not hold. An increasing function with constant pieces is an example of an increasing, but not strongly increasing, function. Therefore
Moreover, as the next example shows, there exist functions that are strongly increasing, but not strictly increasing, that is,
is strongly increasing, but not strictly increasing. For example, $x = (1, 2)$ and $y = (1, 1)$ are such that $x > y$, but $f(x) = f(y) = 1$. N
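The difference between the two strict orders can be made concrete. The sketch below assumes, purely for illustration, the function $f(x) = \min\{x_1, x_2\}$, which is consistent with the values $f(1, 2) = f(1, 1) = 1$ quoted above; the helper names `geq`, `gt` and `gg` are hypothetical.

```python
def geq(x, y):
    """x >= y: every component of x is at least the corresponding one of y."""
    return all(a >= b for a, b in zip(x, y))

def gt(x, y):
    """x > y: x >= y and x != y (at least one component is strictly larger)."""
    return geq(x, y) and tuple(x) != tuple(y)

def gg(x, y):
    """x >> y: every component of x is strictly larger."""
    return all(a > b for a, b in zip(x, y))

def f(x):
    # Assumed example function: the minimum of the components.
    return min(x)

x, y = (1, 2), (1, 1)
z = (2, 3)
```

Here `gt(x, y)` holds but `f(x) == f(y)`, so $f$ is not strictly increasing on this pair, whereas `gg(z, y)` holds and `f(z) > f(y)`, as strong monotonicity requires.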
N.B. For operators $f : \mathbb{R}^n \to \mathbb{R}^m$ with $m > 1$ the notions of monotonicity studied for the case $m = 1$ assume a different meaning, since the images $f(x)$ and $f(y)$ might not be comparable, that is, neither $f(x) \ge f(y)$ nor $f(y) \ge f(x)$ holds. For example, if $f : \mathbb{R}^2 \to \mathbb{R}^2$ is such that $f(0, 1) = (1, 2)$ and $f(3, 4) = (2, 1)$, the images $(1, 2)$ and $(2, 1)$ are not comparable. For brevity, we do not deal with this issue and we leave to more advanced courses the study of notions of monotonicity suitable for operators $f : \mathbb{R}^n \to \mathbb{R}^m$ when $m > 1$. O
Utility functions
Let $u : A \to \mathbb{R}$ be a utility function defined on a suitable set $A \subseteq \mathbb{R}^n_+$ of bundles of goods. A transformation $f \circ u : A \to \mathbb{R}$ of $u$, where $f : \operatorname{Im} u \subseteq \mathbb{R} \to \mathbb{R}$, defines another utility function with the same meaning provided
In other words, the function $f \circ u$ orders the goods in the same way as $u$, that is,
$$x \succsim y \iff (f \circ u)(x) \ge (f \circ u)(y) \quad \forall x, y \in A$$
is a utility function equivalent to $u$ on $\mathbb{R}^n_{++}$.17 It is the logarithmic version of the Cobb-Douglas function, often called log-linear utility function.18 N
In this case it is sufficient to increase the amount of any of the goods to achieve a greater utility: “the more of any good is always better”.
If, instead, we want to contemplate the possibility that some good can actually be useless to the consumer, we can only ask for $u$ to be increasing:
Indeed, if a good is “useless” (as wine is for a teetotaller, or for a drunk who has already had too much of it), the inequality $x > y$ might be determined exactly by a larger amount of this good, keeping all the others unchanged; it is reasonable then that $u(x) = u(y)$, since the consumer does not get any benefit in passing from $y$ to $x$. In this case “the more of any good can be better or indifferent”.
Finally, the “the more of any good is always better” property implied by strict monotonicity can be weakened in the sense of strong monotonicity by assuming that “the more of all the goods is always better”, that is,
In this case, there is an improvement only when the amounts of all goods increase; it is not enough to increase the amount of only some goods. Such a form of monotonicity reflects a form of complementarity among goods, so that an increase of the amounts of only some of them can turn out to be superfluous for the consumer if the quantities of other goods remain unchanged. Perfect complementarity à la Leontief is the extreme case, a classical example being the pairs of shoes, right and left.19
17 Recall that, even if mathematically it can be defined on the entire positive orthant $\mathbb{R}^n_+$, from the economic viewpoint it is precisely on $\mathbb{R}^n_{++}$ that the Cobb-Douglas function is interesting (Example 207).
18 It is necessary to consider the Cobb-Douglas function on $\mathbb{R}^n_{++}$, and not on the entire positive orthant $\mathbb{R}^n_+$, in order for the logarithmic transformation to be well defined on strictly positive numbers. While the Cobb-Douglas function can be defined on the entire positive orthant $\mathbb{R}^n_+$, the log-linear function is defined only on $\mathbb{R}^n_{++}$. On the other hand, note also what we have observed in the previous footnote.
in which the goods are perfect complements, is strongly increasing. By (6.23), it is also increasing. As we have already seen in Example 205, it is not strictly increasing.
(iii) The reader can check which properties of monotonicity hold if we consider the two previous utility functions on the entire positive orthant $\mathbb{R}^n_+$ and not just on $\mathbb{R}^n_{++}$. N
Observe that consumers with strictly monotonic or strongly monotonic utility functions are “insatiable”, because by increasing their bundles in a suitable way their utility also increases. This property of utility functions is sometimes called insatiability, and it is shared by both strict and strong monotonicity. The only form of monotonicity that can encompass the possibility of satiety is increasing monotonicity (6.25): as observed for the drunk consumer, this weaker form of monotonicity allows for the possibility that a given good, when it exceeds a certain level, does not result in a further increase of utility. On the contrary, it cannot happen that utility decreases: if (6.25) holds, utility either increases or remains constant, but it never decreases. Therefore, if an extra glass of wine results in a decrease of the drunk’s utility, this cannot be modelled by any form of increasing monotonicity, no matter how weak.
f ( x + (1 ) y) f (x) + (1 ) f (y)
Note that the domain must be an interval for the points $\lambda x + (1 - \lambda) y$ to belong to it, so that the expression $f(\lambda x + (1 - \lambda) y)$ makes sense.
Example 209 The functions $f, g : \mathbb{R} \to \mathbb{R}$ given by $f(x) = x^2$ and $g(x) = e^x$ are convex, while the function $f : (0, +\infty) \to \mathbb{R}$ given by $f(x) = \ln x$ is concave. The function $f : \mathbb{R} \to \mathbb{R}$ given by $f(x) = x^3$ is neither concave nor convex. N
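The claims of Example 209 can be probed numerically with the usual convexity inequality $f(\lambda x + (1 - \lambda) y) \le \lambda f(x) + (1 - \lambda) f(y)$, with the reverse inequality for concavity. The sketch below is an assumption-laden check on finite grids (the grids, the weights and the tolerance are illustrative): it can disprove convexity, but passing it is not a proof.

```python
import math

def is_convex_on_sample(f, points, lambdas, tol=1e-9):
    """Check f(l*x + (1-l)*y) <= l*f(x) + (1-l)*f(y) on all sampled triples."""
    return all(
        f(l * x + (1 - l) * y) <= l * f(x) + (1 - l) * f(y) + tol
        for x in points for y in points for l in lambdas
    )

def is_concave_on_sample(f, points, lambdas, tol=1e-9):
    """f is concave exactly when -f is convex."""
    return is_convex_on_sample(lambda t: -f(t), points, lambdas, tol)

lambdas = [0.0, 0.25, 0.5, 0.75, 1.0]
real_points = [i / 2 for i in range(-6, 7)]       # sample of [-3, 3]
positive_points = [i / 2 for i in range(1, 13)]   # sample of (0, 6]

square_is_convex = is_convex_on_sample(lambda x: x**2, real_points, lambdas)
exp_is_convex = is_convex_on_sample(math.exp, real_points, lambdas)
log_is_concave = is_concave_on_sample(math.log, positive_points, lambdas)
cube_is_convex = is_convex_on_sample(lambda x: x**3, real_points, lambdas)
cube_is_concave = is_concave_on_sample(lambda x: x**3, real_points, lambdas)
```

Note that the logarithm is sampled only at strictly positive points, in line with its domain.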
Example 214 Separable utility functions are very important in the static case as well. The utility functions used by the first marginalists were indeed of the form
$$u(x) = \sum_{i=1}^{n} u_i(x_i) \qquad (6.27)$$
In other words, it was assumed that the utility (cardinally intended) of a bundle $x$ is decomposable into the utility of the quantities $x_i$ of the various goods that compose it. This is a restrictive assumption that ignores each possible interdependency, for example of complementarity or substitutability, among the different goods. Due to its remarkable tractability, however, (6.27) remained for a long time the usual form of the utility functions until, at the end of the nineteenth century, the works of Edgeworth and Pareto showed how to develop consumer theory for utility functions that are not necessarily separable. N
Example 215 If in (6.27) we set $u_i(x_i) = x_i$ for all $i$, we obtain the important special case
$$u(x) = \sum_{i=1}^{n} x_i$$
6.5. ELEMENTARY FUNCTIONS ON R 141
where the goods are perfect substitutes. The utility of a bundle $x$ depends only on the sum of the amounts of the different goods, regardless of the specific amounts of the individual goods. For example, think of $x$ as a bundle of different types of oranges, which differ in origin and taste, but are identical in terms of nutritional values. In this case, if the utility of the bundle depends only on its nutritional value, then these different types of oranges are perfect substitutes. This case is opposite to the case of perfect complements that characterizes the Leontief utility function. N
Example 216 More generally, if in (6.27) we set $u_i(x_i) = \alpha_i x_i$ for all $i$, with $\alpha_i > 0$, we have
$$u(x) = \sum_{i=1}^{n} \alpha_i x_i$$
In this case, the goods in the bundle are no longer perfect substitutes; rather, their relevance depends on their weights $\alpha_i$. Therefore, in order to keep utility constant, each good can be replaced with another according to a linear trade-off. Intuitively, one unit of good $i$ is equivalent to $\alpha_i / \alpha_j$ units of good $j$. The notion of marginal rate of substitution (Section 23.2.2) formalizes this idea. N
of the Cobb-Douglas utility function, that is, the log-linear utility function (Example 206), is separable. The example shows that sometimes it is possible to obtain separable versions of utility functions by using their strictly monotonic transformations. Usually, the separable versions are the most convenient from the analytical point of view (for example, the log-linear utility is handier to manipulate than the non-separable version (6.26)). N
$$f(x) = a_0 + a_1 x + \cdots + a_n x^n$$
with $a_i \in \mathbb{R}$ for every $0 \le i \le n$ and $a_n \neq 0$. Let $P_n$ be the set of all polynomials of degree lower than or equal to $n$. Naturally, one has
$$P_0 \subseteq P_1 \subseteq P_2 \subseteq \cdots \subseteq P_n \subseteq \cdots$$
Example 219 A polynomial $f$ has degree zero when there exists $a \in \mathbb{R}$ such that $f(x) = a$ for every $x$. The constant functions can therefore be regarded as polynomials of degree zero. N
The set of all polynomials, of any degree, is denoted by $P$; that is, $P = \bigcup_{n \ge 0} P_n$.
$$f(x) = a^x$$
[Figure: graph of the function $e^x$.]
The negative exponential function f (x) = e x is also very important; its graph is:
[Figure: graph of the negative exponential function.]
The image of the exponential function is the set $(0, +\infty)$ of the strictly positive scalars. Moreover, thanks to Lemma 40-(iv), the exponential function $a^x$ is:
(ii) constant if a = 1;
$$\log_a a^x = x \quad \forall x \in \mathbb{R}$$
and
$$a^{\log_a x} = x \quad \forall x \in (0, +\infty)$$
are therefore nothing but the relations (6.7) and (6.8) for the inverse functions, i.e., the relations $f^{-1}(f(x)) = x$ and $f(f^{-1}(y)) = y$.
In light of the importance of the natural logarithm, we will call $f(x) = \log x = \log_e x$ the logarithmic function without further specification.20 Like the exponential function, to which it is strictly linked as we will see soon, the logarithmic function is central in many fields. Its graph is:
20
Another standard notation for log x is ln x.
[Figure: graph of the function $\log x$.]
We conclude with a result that summarizes the monotonicity properties of these elementary functions.
Lemma 220 Both the exponential function $a^x$ and the logarithmic function $\log_a x$ are increasing if $a > 1$ and decreasing if $0 < a < 1$.
Proof For the exponential function, observe that, when $a > 1$, also $a^h > 1$ for every $h > 0$. Therefore $a^{x+h} = a^x a^h > a^x$ for every $h > 0$. For the logarithmic function, after observing that $\log_a k > 0$ if $a > 1$ and $k > 1$, we have
$$\log_a (x + h) = \log_a \left[ x \left( 1 + \frac{h}{x} \right) \right] = \log_a x + \log_a \left( 1 + \frac{h}{x} \right) > \log_a x$$
for every $h > 0$, as desired.
Trigonometric functions
The sine function $f : \mathbb{R} \to \mathbb{R}$ defined by $f(x) = \sin x$ is the first example of a trigonometric function. For each $x \in \mathbb{R}$ we have
$$\sin(x + 2k\pi) = \sin x \quad \forall k \in \mathbb{Z}$$
[Figure: graph of the sine function.]
The function $f : \mathbb{R} \to \mathbb{R}$ defined by $f(x) = \cos x$ is the cosine function. For each $x \in \mathbb{R}$ we have
$$\cos(x + 2k\pi) = \cos x \quad \forall k \in \mathbb{Z}$$
[Figure: graph of the cosine function.]
[Figure: graph of the tangent function.]
The functions $\sin x$, $\cos x$ and $\tan x$ are monotonic, and hence invertible, respectively in the intervals $[-\pi/2, \pi/2]$, $[0, \pi]$, and $(-\pi/2, \pi/2)$. Their inverse functions are denoted respectively by $\arcsin x$ (or $\sin^{-1} x$), $\arccos x$ (or $\cos^{-1} x$), and $\arctan x$ (or $\tan^{-1} x$). In particular, restricting ourselves to the interval $[-\pi/2, \pi/2]$ of strict monotonicity of the function $\sin x$, we have
$$\sin x : \left[ -\frac{\pi}{2}, \frac{\pi}{2} \right] \to [-1, 1]$$
Hence, the inverse function of $\sin x$ is
$$\arcsin x : [-1, 1] \to \left[ -\frac{\pi}{2}, \frac{\pi}{2} \right]$$
and its graph is:
[Figure: graph of the function $\arcsin x$.]
$$\cos x : [0, \pi] \to [-1, 1]$$
$$\arccos x : [-1, 1] \to [0, \pi]$$
[Figure: graph of the function $\arccos x$.]
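Finally, restricting ourselves to the interval $(-\pi/2, \pi/2)$ of strict monotonicity of $\tan x$ we have:
$$\tan x : \left( -\frac{\pi}{2}, \frac{\pi}{2} \right) \to \mathbb{R}$$
$$\arctan x : \mathbb{R} \to \left( -\frac{\pi}{2}, \frac{\pi}{2} \right)$$
[Figure: graph of the function $\arctan x$.]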
It is immediate to see that, for $\alpha \in (0, \pi/2)$, one has $0 < \sin \alpha < \alpha < \tan \alpha$.
Periodic functions
The smallest (if it exists) among such $p > 0$ is called the period of $f$. In particular, the functions $\sin x$ and $\cos x$ are periodic of period $2\pi$, while the function $\tan x$ has period $\pi$. Their graphs show the property that characterizes periodic functions, that is, they repeat themselves identically on each interval of width $p$.
Example 222 The functions $\sin^2 x$ and $\log \tan x$ are periodic of period $\pi$. N
Example 223 The function $f : \mathbb{R} \to \mathbb{R}$ given by $f(x) = x - [x]$ is called the mantissa function.22 For $x > 0$ the mantissa of $x$ is its decimal part; for example $f(2.37) = 0.37$. The mantissa function is periodic of period 1: by (1.19), $[x + 1] = [x] + 1$ for every $x \in \mathbb{R}$, and therefore
$$f(x + 1) = x + 1 - [x + 1] = x + 1 - ([x] + 1) = x - [x] = f(x)$$
Its graph is:
[Figure: graph of the mantissa function.]
The reader can verify that periodicity is preserved by the fundamental operations among functions. That is, if $f$ and $g$ are two periodic functions of the same period $p$, the functions $f(x) + g(x)$, $f(x) g(x)$ and $f(x) / g(x)$ are also periodic (of period at most $p$).
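Periodicity can be checked numerically on a finite sample. The sketch below assumes that the integer part $[x]$ is the floor of $x$ (the largest integer not exceeding $x$), and the helper `has_period`, the grid and the tolerance are illustrative choices; the check only tests $f(x + p) = f(x)$ on the sample and says nothing about $p$ being the smallest such number.

```python
import math

def mantissa(x):
    """f(x) = x - [x], with [x] taken to be the floor of x."""
    return x - math.floor(x)

def has_period(f, p, sample, tol=1e-9):
    """Check f(x + p) == f(x), up to a tolerance, on every sampled point."""
    return all(math.isclose(f(x + p), f(x), abs_tol=tol) for x in sample)

def sin_squared(x):
    return math.sin(x) ** 2

def sine_plus_cosine(x):
    """Sum of two functions of period 2*pi."""
    return math.sin(x) + math.cos(x)

sample = [i / 8 for i in range(-40, 41)]   # points in [-5, 5]
```

For instance, `has_period(mantissa, 1, sample)` and `has_period(sin_squared, math.pi, sample)` both hold, while `math.sin` fails the same test with $p = \pi$.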
$$f(\hat{x}) \ge f(x) \quad \forall x \in A \qquad (6.29)$$
The value $f(\hat{x})$ of the function at $\hat{x}$ is called a global maximum (or maximum value) of $f$ on $A$. The maximum of a function $f : A \to \mathbb{R}$ is, if it exists, the value $M \in \mathbb{R}$ such that
$$M = \max(\operatorname{Im} f)$$
In this case we write $M = \max_{x \in A} f(x)$, and a point $x_0 \in A$ such that $f(x_0) = M$ is called a maximizer of $f$ on $A$.
Thus, the maximum value of $f$ on $A$ is nothing but the maximum of the set $f(A) = \operatorname{Im} f$, that is,
$$f(\hat{x}) = \max f(A) = \max(\operatorname{Im} f)$$
Thanks to Proposition 33, the maximum value is unique. We denote this unique value by
$$\max_{x \in A} f(x)$$
Analogous definitions hold for the minimum value of $f$ on $A$ and for the minimizer of $f$ on $A$.
Example 225 Consider the parabola $f(x) = x^2$, whose graph is
[Figure: graph of the parabola $f(x) = x^2$.]
As one can see from the graph, the minimizer of $f$ is $0$ and the minimum value is $0$. Indeed, $0 = f(0) \le f(x)$ for every $x \in \mathbb{R}$. N
As we have seen, if it exists, the maximum (minimum) of $f$ on $A$ is unique. By contrast, the maximizer and the minimizer might not be unique; indeed, in general they are not, as the next example shows.
Example 226 Let $f : \mathbb{R} \to \mathbb{R}$ be the sine function $f(x) = \sin x$ (Section 6.5.3). Since $\operatorname{Im} f = [-1, 1]$, the unique maximum of $f$ on $\mathbb{R}$ is $1$ and the unique minimum of $f$ on $\mathbb{R}$ is $-1$. Nevertheless there are both infinitely many maximizers (all the points $x = \pi/2 + 2k\pi$ with $k \in \mathbb{Z}$) and infinitely many minimizers (all the points $x = -\pi/2 + 2k\pi$ with $k \in \mathbb{Z}$), as we can easily see from the graph:
[Figure: graph of the sine function.]
N
6.7. DOMAINS AND RESTRICTIONS 151
The restriction fjC can therefore be seen as f restricted on the subset C of A. Thanks
to the smaller domain, the function fjC can satisfy properties di¤erent from those of the
original function f .
Example 228 Let $g : [0, 1] \to \mathbb{R}$ be defined by $g(x) = x^2$. The function g can be seen as
the restriction to the interval [0, 1] of the function $f : \mathbb{R} \to \mathbb{R}$ given by $f(x) = x^2$; that
is, $g = f_{|[0,1]}$. Thanks to its restricted domain, the function g has more (better) properties
than the function f. For example: g is strictly increasing, while f is not; g is injective (and
therefore invertible), while f is not; g is bounded, while f is only bounded from below; g
has both a (global) maximizer and a minimizer, while f does not have a maximizer. ▲
an economic viewpoint, this utility function is often considered only on $\mathbb{R}^2_{++}$. Therefore,
purely economic considerations lead to restricting the domain on which we study the
function $f(x_1, x_2) = \sqrt{x_1 x_2}$. ▲
Example 231 Let $g : [0, +\infty) \to \mathbb{R}$ be defined by $g(x) = x^3$. The function g can be seen
as the restriction to the interval $[0, +\infty)$ of the function $f : \mathbb{R} \to \mathbb{R}$ given by $f(x) = x^3$, that
is, $g = f_{|[0,+\infty)}$. We observe that g is convex, while f is not; g is bounded from below, while
f is not; g has a minimizer, while f does not. ▲
In a dual way relative to the concept of restriction, we introduce now the concept of
extension of a function (function “extended” to a domain larger than the initial one).
is called an extension of f to C.
It is evident from the definitions just given that restriction and extension are two sides
of the same coin: g is an extension of f if and only if f is a restriction of g. In particular,
a function defined on its natural domain A is an extension to A of each restriction of this
function. It is also evident that if a function has an extension, it has infinitely many.23
$$g(x) = \begin{cases} \dfrac{1}{x} & \text{if } x \neq 0 \\ 0 & \text{if } x = 0 \end{cases}$$
is an extension of the function $f(x) = 1/x$, whose natural domain is $\mathbb{R} \setminus \{0\}$. ▲
$$g(x) = \begin{cases} x & \text{for } x \leq 0 \\ \log x & \text{for } x > 0 \end{cases}$$
It is an extension of the function $f(x) = \log x$, which has natural domain $\mathbb{R}_{++}$. ▲
23
It could happen that a function does not have restrictions or does not have extensions. Indeed, let
$f : A \subseteq \mathbb{R} \to \mathbb{R}$. In the extreme situations, if $A = \{x_0\}$, that is, if the domain of f is a single point, then f
does not have restrictions. If instead $A = \mathbb{R}$, f does not have extensions.
6.8. GRAND FINALE: PREFERENCES AND UTILITY 153
(i) we write $x \succ y$ if the bundle x is strictly preferred to y, that is, if $x \succsim y$, but not $y \succsim x$;
(ii) we write $x \sim y$ if the bundle x is indifferent to the bundle y, that is, if both
$x \succsim y$ and $y \succsim x$.
Note that the relations $\succ$ and $\sim$ are, obviously, mutually exclusive: between two
indifferent bundles there cannot exist strict preference, and vice versa.
This first axiom reflects the “weakness” of $\succsim$: each bundle is preferred to itself. The next
axiom is more interesting.
It is an axiom of rationality that requires that the preferences of the decision maker have
no cycles:
$$x \succsim y \succsim z \succ x$$
Strict preference and indifference inherit these first two properties (with the obvious
exception of reflexivity for the strict preference).
(ii) $\succ$ is transitive.
Proof (i) We have $x \sim x$ since, thanks to the reflexivity of $\succsim$, both $x \succsim x$ and $x \precsim x$ hold.
Hence, the relation $\sim$ is reflexive. To prove the transitivity, suppose that $x \sim y$ and $y \sim z$.
We show that this implies $x \sim z$. By definition, $x \sim y$ means that $x \succsim y$ and $y \succsim x$, while
$y \sim z$ means that $y \succsim z$ and $z \succsim y$. Thanks to the transitivity of $\succsim$, from $x \succsim y$ and $y \succsim z$
24
In the weak sense of “prefers or is indifferent”.
$$[x] = \{y \in A : y \sim x\}$$
the collection of the bundles indifferent to it. This set is the indifference class of $\succsim$ determined
by the bundle x.
and
$$x \nsim y \iff [x] \cap [y] = \emptyset \tag{6.31}$$
Relations (6.30) and (6.31) express two fundamental properties of the indifference classes.
By (6.30), the indifference class [x] does not depend on the choice of the bundle x: each
indifferent bundle determines the same indifference class. By (6.31), distinct indifference classes
do not have elements in common: they do not intersect.
Proof By the previous lemma, $\sim$ is reflexive and transitive; by its very definition it is also
symmetric. This implies (6.30) and (6.31).
Concerning (6.30), suppose that $x \sim y$. We show that this implies $[x] \subseteq [y]$. Let $z \in [x]$, that
is, $z \sim x$. Since $\sim$ is transitive, $x \sim y$ and $z \sim x$ imply that $z \sim y$, that is, $z \in [y]$, which
shows that $[x] \subseteq [y]$. A similar argument shows that $[y] \subseteq [x]$, and therefore we conclude
that $x \sim y$ implies $[y] = [x]$. Since the converse is obvious, (6.30) is proved.
We move now to (6.31) and suppose that $x \nsim y$. We show that this implies $[x] \cap [y] = \emptyset$.
Suppose, by contradiction, that this is not the case and that there exists $z \in [x] \cap [y]$. By
definition, we have both $z \sim x$ and $z \sim y$ and hence, by the symmetry and transitivity of $\sim$,
we have $x \sim y$, which contradicts $x \nsim y$. The contradiction shows that $x \nsim y$ implies
$[x] \cap [y] = \emptyset$. Since here also the converse is obvious, the proof is complete.
We now resume the study of $\succsim$. The next axiom does not concern the rationality, but
the information of the decision maker.
Completeness requires that the consumer be able to compare any two bundles of goods,
even very different ones. Naturally, to do so the consumer must, at least, have sufficient
information about the two possibilities: it is easy to think of examples where this assumption
is rather strong.
In any case, note how completeness requires, inter alia, that each bundle be comparable
to itself, that is, $x \succsim x$. Thus, completeness implies reflexivity.
Given the completeness assumption, the relations $\succ$ and $\sim$ are both exclusive (as seen
above) and exhaustive.
Lemma 238 Let $\succsim$ be complete. Given any two bundles x and y, we always have $x \succ y$ or
$y \succ x$ or $x \sim y$.25
Since we are considering bundles of economic goods (and not of “bads”), it is natural to
assume monotonicity, i.e., that “more is better”. The triad $\geq$, $>$, and $\gg$ leads to three
possible incarnations of this simple principle of rationality:
The relationships among the three notions are very similar to those seen for the analogous
notions of monotonicity studied (also for utility functions) in Section 6.4.4. For example,
strict monotonicity means that, given a bundle, an increase in the quantity of any good of
the bundle determines a strictly preferred bundle.
Analogous considerations hold for the other notions. In particular, (6.23) assumes the
form:
$$\text{strict monotonicity} \implies \text{strong monotonicity} \implies \text{monotonicity}$$
The function u is called a (Paretian) utility function, and it also represents strict preference and
indifference:
Proof Indeed,
Expression (6.33) allows us to represent the indifference classes as indifference curves of the
utility function:
$$[x] = \{y \in X : u(y) = u(x)\}$$
As already observed in Section 6.4.4, the utility function is a mere representation of the
preference relation, which is the basic notion, without any special psychological meaning.
Indeed, we have already seen how each strictly increasing function $f : \operatorname{Im} u \to \mathbb{R}$ defines an
equivalent utility function $f \circ u$, for which it holds that
$$x \succsim y \iff (f \circ u)(x) \geq (f \circ u)(y)$$
$$\alpha x + (1 - \alpha) z \succ y \succ \beta x + (1 - \beta) z$$
The axiom implies that there exist no infinitely preferred and no infinitely “unpreferred”
bundles. Given the preferences $x \succ y$ and $y \succ z$, for the consumer the bundle x cannot
be infinitely better than y, nor can the bundle z be infinitely worse than y. Indeed, by
combining appropriately the bundles x and z we get both a bundle better than y, that is,
$\alpha x + (1 - \alpha) z$, and a bundle worse than y, that is, $\beta x + (1 - \beta) z$. This would be impossible
if x were infinitely better than y, or if z were infinitely worse than y.
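Concerning this aspect, recall the analogous property of the real numbers: if $x, y, z \in \mathbb{R}$
are three scalars with $x > y > z$, there exist $\alpha, \beta \in (0, 1)$ such that
$$\alpha x + (1 - \alpha) z > y > \beta x + (1 - \beta) z \tag{6.35}$$
The property does not hold if we consider $-\infty$ and $+\infty$, that is, the extended real line
$\overline{\mathbb{R}} = [-\infty, +\infty]$. In this case, if $y \in \mathbb{R}$, but $x = +\infty$ and/or $z = -\infty$, the scalar x is infinitely
greater than y, and z is infinitely smaller than y, and there do not exist $\alpha, \beta \in (0, 1)$ that
satisfy the inequality (6.35). Indeed, $\alpha \cdot (+\infty) = +\infty$ and $\beta \cdot (-\infty) = -\infty$ for every $\alpha, \beta \in (0, 1)$,
as seen in Section 1.7.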
In conclusion, the Archimedean axiom makes the bundles of different but comparable
quality: however different, they belong to the same league. Thanks to it we can now
state the existence theorem, whose nontrivial proof we omit.
Theorem 240 Let $\succsim$ be a preference defined on $A = \mathbb{R}^n_+$. The following conditions are
equivalent:
(ii) there exists a strictly monotonic and continuous function28 $u : A \to \mathbb{R}$ such that (6.32)
holds, that is,
$$x \succsim y \iff u(x) \geq u(y)$$
This is a result of remarkable importance: most economic applications use utility
functions and the theorem shows which conditions on preferences justify such use.29
To appreciate the importance of Theorem 240, we close the chapter with a famous
example of preferences that do not admit a utility function. Let $A = \mathbb{R}^2_+$ and, given two bundles
x and y, let us write $x \succsim y$ if $x_1 > y_1$ or if $x_1 = y_1$ and $x_2 \geq y_2$. The consumer starts by
considering the first coordinate: if $x_1 > y_1$, then $x \succsim y$. If, on the other hand, $x_1 = y_1$, then
he turns his attention to the second coordinate: if $x_2 \geq y_2$, then $x \succsim y$.
The preference mimics the way in which dictionaries order words; for this reason
$\succsim$ is called lexicographic preference. In particular, we have $x \succ y$ if $x_1 > y_1$ or $x_1 = y_1$ and
$x_2 > y_2$, while we have $x \sim y$ if and only if $x = y$. The indifference classes are therefore
singletons, a first remarkable characteristic of this preference.
The lexicographic preference is complete, transitive and strictly monotonic, as the reader
can easily verify. It is not Archimedean, however. Indeed, consider, for example, $x = (1, 0)$,
$y = (0, 1)$, and $z = (0, 0)$. We have $x \succ y \succ z$ and
$$\alpha x + (1 - \alpha) z = (\alpha, 0) \succ y \succ z \qquad \forall \alpha \in (0, 1)$$
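The failure of the Archimedean property can also be explored numerically. The sketch below encodes the lexicographic preference on pairs of numbers; the function names `weakly_prefers`, `strictly_prefers` and `mix`, as well as the particular list of weights, are our own illustrative choices. A finite check is of course not a proof: it only illustrates that, however small the weight $\alpha > 0$ on x = (1, 0), the mixture with z = (0, 0) stays strictly better than y = (0, 1).

```python
def weakly_prefers(x, y):
    """Lexicographic preference: x is weakly preferred to y."""
    return x[0] > y[0] or (x[0] == y[0] and x[1] >= y[1])

def strictly_prefers(x, y):
    """Strict preference: x weakly preferred to y, but not y to x."""
    return weakly_prefers(x, y) and not weakly_prefers(y, x)

def mix(alpha, x, z):
    """Convex combination alpha*x + (1 - alpha)*z, coordinate by coordinate."""
    return tuple(alpha * a + (1 - alpha) * b for a, b in zip(x, z))

x, y, z = (1.0, 0.0), (0.0, 1.0), (0.0, 0.0)

# Hypothetical sample of weights alpha = 10^-1, ..., 10^-15 in (0, 1).
alphas = [10.0 ** (-k) for k in range(1, 16)]
always_better = all(strictly_prefers(mix(a, x, z), y) for a in alphas)
print(always_better)
```

No weight in the sample yields a mixture worse than y, in line with the displayed relation.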
Proposition 241 The lexicographic preference does not admit any utility function.
28
Continuity is an important property, to which Chapter 12 is devoted.
29
There exist other results on the existence of utility functions, in great part proved in the 1940s and
1950s.
Proof Suppose, by contradiction, that there exists $u : \mathbb{R}^2_+ \to \mathbb{R}$ that represents the
lexicographic preference. Let $a < b$ be any two positive scalars. For each $x \geq 0$ we have
$(x, a) \prec (x, b)$ and therefore $u(x, a) < u(x, b)$. By Proposition 39, there exists a rational
number $q(x)$ such that $u(x, a) < q(x) < u(x, b)$. The rule $x \mapsto q(x)$ therefore defines a
function $q : \mathbb{R}_+ \to \mathbb{Q}$. It is injective. If $x \neq y$, for example $y < x$, then:
$$u(y, a) < q(y) < u(y, b) < u(x, a) < q(x) < u(x, b)$$
and hence $q(x) \neq q(y)$. But, since $\mathbb{R}_+$ has the same cardinality as $\mathbb{R}$, the injectivity of the
function $q : \mathbb{R}_+ \to \mathbb{Q}$ implies $|\mathbb{Q}| \geq |\mathbb{R}|$, contradicting Cantor's Theorem 250. This proves
that the lexicographic preference does not admit any utility function.
Chapter 7
Cardinality
and infinite, thus putting the notion of set at the foundations of mathematics. It is not by
chance that our textbook starts with such a notion. The rest of the chapter is devoted to
the Cantorian study of infinite sets, in particular of their cardinality.
Example 242 The set $A = \{11, 13, 15, 17, 19\}$ of the odd integer numbers between 10 and
20 is finite and $|A| = 5$. ▲
Thanks to Proposition 192, two finite sets have the same cardinality if and only if their
elements can be put in a one-to-one correspondence: for example, if we have seven seats and
seven students, we can pair each seat with a student by making the latter sit on the former.
In particular, we have the following definition.
In other words, A is finite if there exist a set $\{1, 2, \ldots, n\}$ of natural numbers and a bijective
function $f : \{1, 2, \ldots, n\} \to A$. The set $\{1, 2, \ldots, n\}$ can be seen as the “prototypical” set of
cardinality n, relative to which it is possible to “calibrate” all the other finite sets of the same
cardinality through suitable bijective functions.
For the cardinality of finite sets, the functional viewpoint, based on bijective functions
and on isolating a prototypical set, is not much more than a curiosity. However, it becomes
substantial when we want to extend the notion of cardinality to infinite sets. This was one of
the fundamental intuitions of Georg Cantor, which led to the birth of the theory of infinite
sets. Indeed, the possibility of establishing a one-to-one correspondence among infinite sets
allows for a classification of these sets by “size” and leads to the discovery of properties that
are not always intuitive.
Relative to finite sets, countable sets immediately exhibit a remarkable, possibly puzzling,
property: it is always possible to put a countable set into a one-to-one correspondence with
an infinite proper subset of it. In other words, losing elements may not affect cardinality
when dealing with countable sets.
Theorem 245 Each infinite subset of a countable set is also countable.
Proof Let X be a countable set and let $A \subseteq X$ be an infinite proper subset of X, i.e.,
$A \neq X$. Since X is countable, its elements can be listed as a sequence of distinct elements
$X = \{x_0, x_1, \ldots, x_n, \ldots\} = \{x_i\}_{i \in \mathbb{N}}$. Let us denote by $n_0$ the smallest integer larger than or
equal to 0 such that $x_{n_0} \in A$ (if, for example, $x_0 \in A$, we have $n_0 = 0$; if $x_0 \notin A$ and $x_1 \in A$,
we have $n_0 = 1$, and so on). Analogously, let us denote by $n_1$ the smallest integer number
(strictly) larger than $n_0$ such that $x_{n_1} \in A$. Given $n_0, n_1, \ldots, n_j$ ($j \geq 1$), let us define $n_{j+1}$ as
the smallest integer number larger than $n_j$ such that $x_{n_{j+1}} \in A$. Consider now the function
$f : \mathbb{N} \to A$ defined by $f(i) = x_{n_i}$, with $i = 0, 1, \ldots, n, \ldots$. It is easy to check that f is a
one-to-one correspondence between $\mathbb{N}$ and A, and so A is countable.
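The construction used in the proof (scan the listing of X and keep, in order, the elements that fall in A) is essentially a filtering procedure. As a minimal sketch, assume that X is the set of natural numbers listed as 0, 1, 2, ... and that A is the set of perfect squares; the names `enumerate_subset` and `is_square` are hypothetical helpers introduced only for this illustration.

```python
import math
from itertools import count, islice

def enumerate_subset(listing, in_subset):
    """Yield, in order of appearance, the elements of the listing that belong to the subset.

    The i-th element produced plays the role of f(i) = x_{n_i} in the proof.
    """
    for x in listing:
        if in_subset(x):
            yield x

def is_square(n):
    """Membership test for the (assumed) subset A of perfect squares."""
    r = math.isqrt(n)
    return r * r == n

# f(0), f(1), ..., f(5) for X = {0, 1, 2, ...} and A = perfect squares.
first_terms = list(islice(enumerate_subset(count(0), is_square), 6))
print(first_terms)
```

Since A is infinite, the generator never runs out of elements, which is exactly what makes f defined on all of the natural numbers.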
The following example should clarify the scope of the previous theorem. The set E of
even numbers is, clearly, a proper subset of $\mathbb{N}$ that we may think contains only “half” of
its elements. Nevertheless, it is possible to establish a one-to-one correspondence with $\mathbb{N}$ by
putting in correspondence each even number to its half:
$$2n \in E \longleftrightarrow n \in \mathbb{N}$$
and therefore $|E| = |\mathbb{N}|$. Galileo had already realized this remarkable peculiarity of infinite sets,
which clearly distinguishes them from finite sets, whose proper subsets always have smaller
cardinality.3 In a famous passage of the Discorsi e dimostrazioni matematiche intorno a
due nuove scienze,4 published in 1638, he observed that the natural numbers can be put in
a one-to-one correspondence with their squares by setting $n^2 \leftrightarrow n$. The squares, which at
first sight seem to constitute a rather small subset of $\mathbb{N}$, are thus in equal number with the
natural numbers: “in an infinite number, if one could conceive of such a thing, he would be
forced to admit that there are as many squares as there are numbers all taken together”. The
clarity with which Galileo exposes the problem is worthy of his genius. Unfortunately, the
mathematical notions available to him were completely insufficient for further developing
his intuitions. For example, the notion of function, fundamental for the ideas of Cantor,
emerged (in a primitive form) only at the end of the seventeenth century in the works of
Leibniz.
Clearly, the union of a finite number of countable sets is also countable. Much more is
actually true.
3
The mathematical fact considered here is at the basis of several little stories. For example, The Paradise
Hotel has countably infinitely many rooms, progressively numbered 1, 2, 3, .... At a certain moment, they are all
occupied when a new guest checks in. At this point, the hotel manager faces a conundrum: how to find a
room for the new guest? Well, after some thought, he realizes that it is easier than he imagined! It is enough
to ask every guest to move to the room coming after the one they are currently occupying ($1 \to 2$, $2 \to 3$,
$3 \to 4$, etc.). In this way, room number 1 will become free. He also realizes that it is possible to improve
upon this new arrangement! It is enough to ask everyone to move to the room with a number which is twice
that of the room currently occupied ($1 \to 2$, $2 \to 4$, $3 \to 6$, etc.). In this way, infinitely many rooms will become
available: all the odd ones.
4
The passage is in a dialogue between Sagredo, Salviati, and Simplicio, during the …rst day.
Theorem 246 The union of a countable collection of countable sets is also countable.
With a similar argument it is possible to prove that also the Cartesian product of a finite
number of countable sets is countable. In particular, the result above yields that the set $\mathbb{Q}$
of the rational numbers is countable.
The reader can verify that f is indeed bijective, proving that $\mathbb{Z}$ is countable. On the other
hand, the set
$$\mathbb{Q} = \left\{ \frac{m}{n} : m \in \mathbb{Z} \text{ and } n \in \mathbb{N}, \text{ with } n \neq 0 \right\}$$
7.2. BIJECTIVE FUNCTIONS AND CARDINALITY 163
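can be written as the union
$$\mathbb{Q} = \bigcup_{n=1}^{+\infty} A_n$$
where
$$A_n = \left\{ \frac{0}{n}, \frac{1}{n}, -\frac{1}{n}, \frac{2}{n}, -\frac{2}{n}, \ldots, \frac{m}{n}, -\frac{m}{n}, \ldots \right\}$$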
The property just stated is quite surprising: though the rational numbers seem much more
numerous than the natural numbers, there exists a way to put these two classes of numbers
into a one-to-one correspondence. The cardinality of $\mathbb{N}$, and of any countable set, is usually
denoted by $\aleph_0$: $|\mathbb{N}| = \aleph_0$. Therefore, we can write
$$|\mathbb{Q}| = \aleph_0$$
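To make the countability of the rational numbers concrete, one can write an explicit listing. The sketch below is one possible enumeration, not necessarily the one implicit in the text: it sweeps the fractions m/n with n ≥ 1 according to the value of |m| + n and skips those not in lowest terms, so that every rational number appears exactly once. The generator name `rationals` is our own.

```python
from fractions import Fraction
from itertools import islice
from math import gcd

def rationals():
    """List every rational number exactly once, sweeping |m| + n = 1, 2, 3, ..."""
    s = 1
    while True:
        for n in range(1, s + 1):
            m = s - n
            if gcd(m, n) != 1:
                continue  # m/n is not in lowest terms: it was listed in an earlier sweep
            yield Fraction(m, n)
            if m > 0:
                yield Fraction(-m, n)
        s += 1

first_terms = list(islice(rationals(), 200))
print(first_terms[:7])
```

The position of a rational number in this listing is the natural number it is paired with, which is the one-to-one correspondence with the natural numbers in action.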
Definition 248 A set A has the cardinality of the continuum if it can be put in a one-to-one
correspondence with the set $\mathbb{R}$ of the real numbers. In this case, we write $|A| = |\mathbb{R}|$.
The cardinality of the continuum is often denoted by c, that is, $|\mathbb{R}| = c$. Also in this case
there exist subsets that are, prima facie, much smaller than $\mathbb{R}$, but turn out to have the same
cardinality. Let us see an example which will be useful in proving that $\mathbb{R}$ is uncountable.
Proposition 249 The interval $(0, 1)$ has the cardinality of the continuum.
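Proof We want to show that $|(0, 1)| = |\mathbb{R}|$. To do this we have to show that the numbers of
$(0, 1)$ can be put in a one-to-one correspondence with those of $\mathbb{R}$. The bijection $f : \mathbb{R} \to (0, 1)$
defined by
$$f(x) = \begin{cases} 1 - \dfrac{1}{2} e^{x} & \text{if } x < 0 \\ \dfrac{1}{2} e^{-x} & \text{if } x \geq 0 \end{cases}$$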
5
$\aleph$ (aleph) is the first letter of the Hebrew alphabet. In the following section we will formalize, also for
infinite sets, the idea of having the same or greater cardinality; now, we treat these notions intuitively.
[Figure: graph of the function f in the (x, y) plane, with the levels 1/2 and 1 marked on the vertical axis]
shows that, indeed, this is the case (as the reader can also formally verify).
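The formal verification left to the reader amounts to exhibiting an inverse. As a sketch, assume that the bijection is the piecewise rule $f(x) = 1 - \frac{1}{2} e^{x}$ for $x < 0$ and $f(x) = \frac{1}{2} e^{-x}$ for $x \geq 0$ (this reading of the formula is itself an assumption). Solving $y = f(x)$ on each branch gives the candidate inverse coded below as `f_inverse`, a name of our own.

```python
import math

def f(x):
    """Assumed bijection from the real line onto (0, 1)."""
    return 1 - 0.5 * math.exp(x) if x < 0 else 0.5 * math.exp(-x)

def f_inverse(y):
    """Candidate inverse on (0, 1): values above 1/2 come from x < 0, the others from x >= 0."""
    return math.log(2 * (1 - y)) if y > 0.5 else -math.log(2 * y)

# Round trip on a few sample points (a numerical check, not a proof).
samples = [-5.0, -1.0, -0.1, 0.0, 0.1, 1.0, 5.0]
round_trip_ok = all(abs(f_inverse(f(x)) - x) < 1e-9 for x in samples)
print(round_trip_ok)
```

Under this reading f is strictly decreasing, takes the value 1/2 at 0, approaches 1 as $x \to -\infty$ and approaches 0 as $x \to +\infty$.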
Proof We proceed by contradiction and assume that $\mathbb{R}$ is countable. Hence, there exists
a bijective function $g : \mathbb{N} \to \mathbb{R}$. By Proposition 249, it follows that there exists a bijective
function $f : \mathbb{R} \to (0, 1)$. The reader can easily prove that $f \circ g$ is a bijective function from
$\mathbb{N}$ to $(0, 1)$, yielding that $(0, 1)$ is countable. We will next reach a contradiction, showing
that $(0, 1)$ cannot be countable. To this end, we write all the numbers in $(0, 1)$ using their
decimal representation: each $x \in (0, 1)$ will be written as
$$x = 0.c_0 c_1 \cdots c_n \cdots$$
with $c_i \in \{0, 1, \ldots, 9\}$, always using infinitely many digits (for example, 3.54 will be written
3.54000000...). Since we have obtained that $(0, 1)$ is countable, there exists a way to
list its elements as a sequence.
and so on. Let us then take the number $x = 0.d_0 d_1 d_2 d_3 \cdots d_n \cdots$ such that its generic decimal
digit $d_n$ is different from $c_{nn}$ (but without choosing 9 infinitely many times, so as to avoid a
periodic 9 which, as we know, does not exist on its own). The number x belongs to $(0, 1)$, but
sadly does not belong to the list written above, since $d_n \neq c_{nn}$ (and therefore it is different
from $x_0$ since $d_0 \neq c_{00}$, from $x_1$ since $d_1 \neq c_{11}$, etc.). We conclude that the list written
above cannot be complete and hence the numbers of $(0, 1)$ cannot be put in a one-to-one
correspondence with $\mathbb{N}$. The interval $(0, 1)$ therefore is not countable, a contradiction.
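The diagonal construction can be mimicked on a finite table of digits. The sketch below is purely illustrative: the table `rows` is made-up data standing for the first digits of the first few numbers of a hypothetical list, and the rule "pick 5, or 6 if the diagonal digit is already 5" is just one admissible way of choosing digits that differ from the diagonal while never using 9.

```python
def diagonal_digits(digit_rows):
    """Return digits d_0, d_1, ... with d_n different from the n-th digit of the n-th row.

    Only the digits 5 and 6 are used, so no periodic 9 can arise.
    """
    return [5 if row[n] != 5 else 6 for n, row in enumerate(digit_rows)]

# Made-up data: row n holds the first digits c_n0, c_n1, c_n2, c_n3 of the n-th listed number.
rows = [
    [1, 4, 1, 5],
    [5, 5, 5, 5],
    [9, 9, 0, 0],
    [2, 7, 1, 8],
]
d = diagonal_digits(rows)
print(d)  # differs from row n in position n, for every n
```

Whatever the table, the resulting digit string cannot coincide with any of its rows, which is the heart of the argument.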
7.3. A PANDORA’S BOX 165
The set $\mathbb{R}$ of real numbers is, therefore, much richer than $\mathbb{N}$ and $\mathbb{Q}$. The rational numbers
(which have, as we remarked, a “quick rhythm”) are comparatively very few with respect
to the real numbers: they form a kind of very fine dust that overlaps with the real numbers
without covering them all. At the same time, it is dust so fine that between any two real
numbers, no matter how close they are, there are particles of it.
In sum, the real line is a new prototype of infinite set.
It is possible to prove that both the union and the Cartesian product of a finite or
countable collection of sets that have the cardinality of the continuum have, in turn, the
cardinality of the continuum. This has the next remarkable consequence.
Theorem 251 $\mathbb{R}^n$ has the power of the continuum for each $n \geq 1$.
This is another remarkable finding, which is surprising already in the special case of the
plane $\mathbb{R}^2$ that, intuitively, may appear to contain many more points than the real line. It was
when faced with results of this type, so surprising for our “finitary” intuition, that Cantor wrote
in a letter to Dedekind “I see it, but I do not believe it”. His key intuition on the use of
bijective functions to study the cardinality of infinite sets opened a new and fundamental
area of mathematics, which is also rich in terms of philosophical implications (mentioned at
the beginning of the chapter).
Sets can have the same size, but also different sizes. This motivates the following
definition:
Definition 254 A set A has cardinality less than or equal to that of B, written $|A| \leq |B|$,
if there exists an injective function $f : A \to B$. A set A has cardinality strictly less than that
of B, written $|A| < |B|$, if $|A| \leq |B|$ and $|A| \neq |B|$.
(ii) $|A| \leq |B|$ and $|B| \leq |C|$ imply that $|A| \leq |C|$;
(iii) $|A| \leq |B|$ and $|B| \leq |A|$ if and only if $|A| = |B|$;
Example 256 We have $|\mathbb{N}| < |\mathbb{R}|$. Indeed, by Theorem 250 $|\mathbb{N}| \neq |\mathbb{R}|$ and, by assertion (iv),
$\mathbb{N} \subseteq \mathbb{R}$ implies $|\mathbb{N}| \leq |\mathbb{R}|$. ▲
Properties (i) and (ii) say that the order $\leq$ is reflexive and transitive. As for property
(iii), it tells us that $\leq$ and $=$ are related in a natural way. Finally, (iv) confirms the intuitive
idea that smaller sets have a smaller cardinality. Remarkably, this intuition does not carry
over to $<$ (i.e., $A \subsetneq B$ does not imply $|A| < |B|$) because, as we have already noted, a
proper subset of an infinite set may have the same cardinality as the original set (as Galileo
had envisioned).
(ii) Since $|A| \leq |B|$, there exists an injective function $f : A \to B$. Since $|B| \leq |C|$, there
exists an injective function $g : B \to C$. Next, note that $h = g \circ f$ is well defined, $h : A \to C$,
and, by the initial part of the proof, we also know it is injective, proving that $|A| \leq |C|$.
(iii) We only prove the “if” part. The “only if” part is the content of the
Schroeder-Bernstein Theorem, which we leave to more advanced courses. By definition and since
$|A| = |B|$, there exists a bijection $f : A \to B$. Since f is bijective, it follows that $f^{-1} : B \to A$
is well defined and bijective. Thus, both $f : A \to B$ and $f^{-1} : B \to A$ are injective, yielding
that $|A| \leq |B|$ and $|B| \leq |A|$.
(iv) Define $f : A \to B$ by the rule $f(a) = a$. Since $A \subseteq B$, f is well defined and clearly
injective, proving the statement.
When a set A is finite and non-empty, we clearly have $|A| < |2^A|$. Remarkably, the
inequality continues to hold for infinite sets.
Theorem 257 (Cantor) For each set A, finite or infinite, we have $|A| < |2^A|$.
Proof Consider a set A and the collection of all singletons $C = \{\{a\}\}_{a \in A}$. It is immediate to
see that there is a bijective mapping between A and C, that is, $|A| = |C|$, and $C \subseteq 2^A$. Since
$|C| \leq |2^A|$, we conclude that $|A| \leq |2^A|$. Next, by contradiction, assume that $|A| = |2^A|$.
Then there exists a bijection between A and $2^A$ which associates to each element $a \in A$ an
element $b = b(a) \in 2^A$ and vice versa: $a \leftrightarrow b$. Observe that each $b(a)$, being an element of
$2^A$, is a subset of A. Consider now all the elements $a \in A$ such that the corresponding subset
$b(a)$ does not contain a. Call S the subset of these elements, that is, $S = \{a \in A : a \notin b(a)\}$.
Since S is a subset of A, $S \in 2^A$. Since we have a bijection between A and $2^A$, there must
exist an element $c \in A$ such that $b(c) = S$. We have two cases:
(i) if $c \in S$, then by the definition of S, $b(c)$ does not contain c and therefore $c \notin b(c) = S$;
(ii) if $c \notin S$, then by the definition of S, $b(c)$ contains c and therefore $c \in b(c) = S$.
In both cases we reach a contradiction. Hence $|A| \neq |2^A|$ and so $|A| < |2^A|$.
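For a small finite set the argument can be checked exhaustively. The sketch below takes the hypothetical set A = {0, 1, 2}, runs through all 8³ = 512 maps b from A to its power set, and verifies that the set S = {a ∈ A : a ∉ b(a)} is never among the images of b, so that no such map is onto. The helper names `power_set` and `cantor_set` are our own.

```python
from itertools import combinations, product

def power_set(a):
    """All subsets of the finite set a, as frozensets."""
    items = sorted(a)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def cantor_set(a, b):
    """The set S = {x in A : x not in b(x)} for a map b given as a dictionary."""
    return frozenset(x for x in a if x not in b[x])

A = {0, 1, 2}
subsets = power_set(A)

# For every map b : A -> 2^A, the set S is missing from the images of b.
missed_every_time = all(
    cantor_set(A, dict(zip(sorted(A), images))) not in images
    for images in product(subsets, repeat=len(A))
)
print(missed_every_time)
```

The exhaustive check is only feasible for tiny sets; the point of the proof is that the same set S works for every set A, finite or infinite.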
Cantor's Theorem offers a simple way to make a “cardinality jump” starting from a
given set A: it is sufficient to consider the power set $2^A$. For example, $|2^{\mathbb{R}}| > |\mathbb{R}|$, then also
$|2^{2^{\mathbb{R}}}| > |2^{\mathbb{R}}|$, and so on. We can therefore build an infinite sequence of sets that are of higher
and higher cardinality. In this way, we enrich (7.2), which now becomes
$$\left\{ 1, 2, \ldots, n, \ldots, \aleph_0, |\mathbb{R}|, |2^{\mathbb{R}}|, |2^{2^{\mathbb{R}}}|, \ldots \right\} \tag{7.3}$$
Here is the Pandora's box mentioned above, which Theorem 257 has allowed us to uncover.
The breathtaking sequence (7.3) is only the incipit of the theory of infinite sets, whose
study (even the introductory part) would take us too far away.
Before moving on with the book, however, we consider a final famous aspect of the
theory, the so-called continuum hypothesis (which the reader might have already heard of).
By Theorem 257, we know that $|2^{\mathbb{N}}| > |\mathbb{N}|$. On the other hand, by Theorem 250 we also
have $|\mathbb{R}| > |\mathbb{N}|$. The next result (we omit its proof) shows that these two inequalities are
actually not distinct.
Therefore, the power set of $\mathbb{N}$ has the cardinality of the continuum. The continuum
hypothesis states that there is no set A such that
$$|\mathbb{N}| < |A| < |\mathbb{R}|$$
That is, there does not exist any infinite set of intermediate cardinality between $\aleph_0$ and c.
In other words, a set that has cardinality larger than $\aleph_0$ must have at least the cardinality
of the continuum.
The validity of the continuum hypothesis is the first among the celebrated Hilbert
problems, posed by David Hilbert in 1900, and represents one of the deepest questions in
mathematics. By adopting this hypothesis, it is possible to set
$$\aleph_1 = |\mathbb{R}|$$
and to consider the cardinality of the continuum as the second infinite cardinal number $\aleph_1$
after the first one $\aleph_0 = |\mathbb{N}|$.
The continuum hypothesis can be reformulated in a suggestive way by writing
$$\aleph_1 = 2^{\aleph_0}$$
That is, the smallest cardinal number greater than $\aleph_0$ is equal to the cardinality of the power
set of $\mathbb{N}$ or, equivalently, of any set of cardinality $\aleph_0$ (like, for example, the rational numbers).
The generalized continuum hypothesis states that, for each n, we have
$$\aleph_{n+1} = 2^{\aleph_n}$$
All the jumps of cardinality in (7.3), not only the first one from $\aleph_0$ to $\aleph_1$, are thus obtained
by considering the power set. Therefore,
$$\aleph_2 = |2^{\mathbb{R}}|, \qquad \aleph_3 = |2^{2^{\mathbb{R}}}|$$
Summing up, the depth of the problems that the use of bijective functions opened is
incredible. As we have seen, this study started by Cantor is, at the same time, rigorous
and intrepid (as typical of the best mathematics, at the basis of its beauty). It relies on
the use of bijective functions to capture the fundamental principle of similarity (in terms of
numerosity) among sets.6
6
The reader who wants to learn more about set theory can consult P. Halmos, Naive set theory, Van
Nostrand, 1960 or P. Suppes, Axiomatic set theory, Van Nostrand, 1960.
Part II
Discrete analysis
Chapter 8
Sequences
where each number occupies a place of order, i.e., it follows (except the first one) a real
number and precedes another one. The next definition formalizes this. We denote by $\mathbb{N}_+$
the set of the natural numbers without 0.
$$n \mapsto 2n \tag{8.2}$$
and so we have the sequence of even strictly positive integers. The image $f(n)$ is usually
denoted by $x_n$. With such notation, the sequence of the even strictly positive integers is
$x_n = 2n$ for each $n \geq 1$. The images $x_n$ are called terms (or elements) of the sequence. We
will denote sequences by $\{x_n\}_{n=1}^{\infty}$, or briefly by $\{x_n\}$.1
There are different ways to define a sequence $\{x_n\}$, that is, to describe the underlying
function $f : \mathbb{N}_+ \to \mathbb{R}$. A first way is to describe it in closed form, i.e., through a formula: for
example, it is what we have done with the sequence of the even numbers using (8.2). Other
defining rules are, for example,
$$n \mapsto 2n - 1 \tag{8.3}$$
$$n \mapsto n^2 \tag{8.4}$$
$$n \mapsto \frac{1}{\sqrt{2^{n-1}}} \tag{8.5}$$
1
The choice of starting the sequence from $n = 1$ instead of $n = 0$ is a mere convention. In contexts where
it is more suitable to start from $n = 0$, it is perfectly legitimate to consider sequences $\{x_n\}_{n=0}^{\infty}$.
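Rule (8.3) gives rise to the sequence of odd strictly positive integers
$$\{1, 3, 5, 7, \ldots\} \tag{8.6}$$
rule (8.4) to the sequence of squares
$$\{1, 4, 9, 16, \ldots\}$$
and rule (8.5) to the sequence
$$\left\{ 1, \frac{1}{\sqrt{2}}, \frac{1}{\sqrt{4}}, \frac{1}{\sqrt{8}}, \ldots \right\} \tag{8.7}$$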
Another important way to define a sequence is by recurrence (or recursion). Consider
the classical Fibonacci sequence
$$\{0, 1, 1, 2, 3, 5, 8, 13, \ldots\}$$
in which each term is the sum of the two terms that precede it, with fixed initial values 0
and 1. For example, in the fourth position we find the number 2, i.e., the sum $1 + 1$ of the
two terms that precede it, in the fifth position we find the number 3, i.e., the sum $1 + 2$ of
the two terms that precede it, and so on. The underlying function $f : \mathbb{N}_+ \to \mathbb{R}$ is, hence,
$$\begin{cases} f(1) = 0, \quad f(2) = 1 \\ f(n) = f(n-1) + f(n-2) \quad \text{for } n \geq 3 \end{cases} \tag{8.8}$$
We therefore have two initial values, $f(1) = 0$ and $f(2) = 1$, and a recursive rule that allows
us to calculate the term in position n once the two preceding terms are known. Unlike
the sequences defined through a closed formula, such as (8.3)–(8.5), to obtain the term
$x_n$ we now have to first build, using the recursive rule, all the terms that precede it. For
example, to calculate the term $x_{100}$ in the sequence (8.6) of the odd numbers, it is sufficient
to substitute $n = 100$ in formula (8.3), finding $x_{100} = 199$. On the contrary, to calculate the
term $x_{100}$ in the Fibonacci sequence we first have to build by recurrence the first 99 terms
of the sequence. Indeed, it is true that to determine $x_{100}$ it is sufficient to know the values
of $x_{99}$ and $x_{98}$ and then to use the rule $x_{100} = x_{99} + x_{98}$, but to determine $x_{99}$ and $x_{98}$ we
must first know $x_{97}$ and $x_{96}$, and so on.
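The contrast between a closed form and a recurrence is easy to see in code. The sketch below is one possible implementation, with function names of our own choosing: `odd_term` evaluates the closed formula $n \mapsto 2n - 1$ in a single step, while `fibonacci` has to walk through all the preceding terms, starting from the initial values $f(1) = 0$ and $f(2) = 1$ (which can be changed through the optional arguments).

```python
def odd_term(n):
    """Closed form: the n-th odd strictly positive integer."""
    return 2 * n - 1

def fibonacci(n, first=0, second=1):
    """n-th term (n >= 1) of f(n) = f(n-1) + f(n-2) with f(1) = first, f(2) = second."""
    if n < 1:
        raise ValueError("positions start from n = 1")
    a, b = first, second
    for _ in range(n - 1):  # builds, one after the other, the terms preceding position n
        a, b = b, a + b
    return a

print(odd_term(100))                         # direct substitution in the formula
print([fibonacci(n) for n in range(1, 9)])   # first eight terms
print(fibonacci(100))                        # requires 99 steps of the recurrence
```

The loop keeps only the last two terms in memory, but it still has to pass through every position up to n, which is exactly the point made in the text.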
Therefore, the recursive definition of a sequence consists of one or more initial values
and of a recurrence rule that, starting from them, allows us to build the various terms of the
sequence. The initial values are arbitrary. For example, if in (8.8) we choose $f(1) = 2$ and
$f(2) = 1$ we have the following Fibonacci sequence
$$\{2, 1, 3, 4, 7, 11, 18, \ldots\}$$
We now provide a pair of classic examples of sequences, the first one defined by recurrence
and the second one in closed form.
$$\begin{cases} f(1) = a \\ f(n) = f(n-1) + b \quad \text{for } n \geq 2 \end{cases}$$
8.1. THE CONCEPT 173
The initial value is $f(1) = a$, starting from which it is possible to build the entire sequence
through the recursive formula $f(n) = f(n-1) + b$. Such a sequence is called arithmetic (or
an arithmetic progression) with first term a and common difference b. For example, if $a = 2$
and $b = 4$, we have
$$\{2, 6, 10, 14, 18, 22, \ldots\}$$
▲
$$\left\{ 1, \frac{1}{2}, \frac{1}{3}, \frac{1}{4}, \frac{1}{5}, \ldots \right\}$$
$$\left\{ a, aq, aq^2, aq^3, aq^4, \ldots \right\}$$
is called geometric (or a geometric progression) with first term a and common ratio q. ▲
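Both progressions are straightforward to generate. In the sketch below the arithmetic progression is built by recurrence, as in the definition $f(n) = f(n-1) + b$, while the geometric one uses the closed form $x_n = a q^{n-1}$ read off from the listing $a, aq, aq^2, \ldots$. The function names and the sample values for the geometric case are our own illustrative choices.

```python
from itertools import islice

def arithmetic(a, b):
    """Arithmetic progression by recurrence: start from a, then keep adding b."""
    x = a
    while True:
        yield x
        x = x + b

def geometric_term(a, q, n):
    """Closed form of the geometric progression: the n-th term is a * q**(n - 1)."""
    return a * q ** (n - 1)

first_six = list(islice(arithmetic(2, 4), 6))
print(first_six)                                       # first term 2, common difference 4
print([geometric_term(3, 2, n) for n in range(1, 6)])  # hypothetical values a = 3, q = 2
```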
Clearly, not all sequences can be described in closed or recursive form. The most famous
example is the sequence $\{p_n\}$ of prime numbers: it is infinite by Euclid's Theorem, but it
does not have a (known) explicit description. In particular:
(i) Given n, we do not know any formula that tells us what $p_n$ is; in other words, the
sequence $\{p_n\}$ cannot be defined in closed form (as far as we know).
(ii) Given $p_n$, we do not know any formula that tells us what $p_{n+1}$ is; in other words, the
sequence $\{p_n\}$ cannot be defined by recurrence.
(iii) Given any prime number p, we do not know of any formula that gives us a prime
number q greater than p; in other words, the knowledge of a prime number does not
give any information on the subsequent prime numbers.
Hence, we do not have a clue on how the prime numbers follow one another, that is,
on the form of the function $f : \mathbb{N}_+ \to \mathbb{R}$ that defines such a sequence. We have, therefore,
to consider all the natural numbers and check, one by one, whether or not they are prime
numbers through the primality tests (Section 1.3.2). Having eternity at our disposal, we
could then construct term by term the sequence $\{p_n\}$. More modestly, in the short time that
passed between Euclid and us, tables of prime numbers have been compiled; they establish
the terms of the sequence $\{p_n\}$ up to numbers that may seem very large to us, but that are
nothing relative to the infinity of all the prime numbers.
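The "check one by one" procedure can be sketched in a few lines. The code below uses trial division as the primality test, which is only one simple choice and not necessarily the test discussed in Section 1.3.2; the names `is_prime` and `primes` are our own.

```python
from itertools import count, islice

def is_prime(n):
    """Primality test by trial division (a simple, slow choice made for illustration)."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def primes():
    """Build the sequence of primes term by term, testing 2, 3, 4, ... one by one."""
    return (n for n in count(2) if is_prime(n))

first_ten = list(islice(primes(), 10))
print(first_ten)
```

The generator never stops on its own: only the patience of the user (here, the request for ten terms) decides how far the table of primes extends.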
O.R. Concerning observation (iii), for centuries mathematicians have looked for a rule that,
given a prime number p, made it possible to find a greater prime $q > p$, that is, a function
$q = f(p)$. A famous example of a possible such rule is given by the prime numbers of
2 It is called harmonic because $1/2, 1/3, 1/4, \ldots$ are the positions at which we have to put a finger on a
vibrating string to obtain the different notes.
174 CHAPTER 8. SEQUENCES
Mersenne. A prime number is said to be a Mersenne number if it can be written in the form
$2^p - 1$ with $p$ prime. It is possible to prove that if $2^p - 1$ is prime, then so is $p$. For centuries,
it was believed (or hoped) that the much more interesting converse was true, namely: if $p$
is prime, so is $2^p - 1$. This conjecture was definitively disproved in 1536, when Hudalricus
Regius showed that
$2^{11} - 1 = 2047 = 23 \times 89$
thus finding the first counterexample to the conjecture: $p = 11$ is prime, but $2^{11} - 1$ is not.
In any case, the Mersenne numbers are among the most important prime numbers. In
particular, as of 2016, the greatest prime number known is
$2^{74207281} - 1$
which has 22,338,618 digits and is a Mersenne number (see the Great Internet Mersenne Prime
Search). H
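The counterexample and the digit count can be checked with a short sketch. The helper `smallest_factor` is ours and uses naive trial division, which is adequate only for small exponents; the digit count is obtained from logarithms, without building the number itself.

```python
import math


def smallest_factor(m):
    """Smallest divisor d > 1 of m (returns m itself when m is prime)."""
    d = 2
    while d * d <= m:
        if m % d == 0:
            return d
        d += 1
    return m


for p in (2, 3, 5, 7, 11, 13):
    m = 2 ** p - 1
    d = smallest_factor(m)
    print(p, m, "prime" if d == m else f"= {d} * {m // d}")

# A power of 2 is never a power of 10, so subtracting 1 does not change
# the number of decimal digits of 2**74207281.
digits = math.floor(74207281 * math.log10(2)) + 1
print(digits)  # 22338618
```

Among the exponents tried, only $p = 11$ fails to give a prime.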
We close the section by observing that, given any function $f : \mathbb{R}_+ \to \mathbb{R}$, its restriction
$f|_{\mathbb{N}_+}$ to $\mathbb{N}_+$ is a sequence.
$x = \{x_n\} = \{x_1, x_2, \ldots, x_n, \ldots\}$
The operations on functions seen in Section 6.3.2 have as a special case the operations
on sequences, that is, on elements of the space $\mathbb{R}^\infty$. In particular, given two sequences
$x = \{x_n\}$ and $y = \{y_n\}$ in $\mathbb{R}^\infty$, we have:
In view of (i), for convenience of notation, we will denote the sum directly as $\{x_n + y_n\}$
instead of $\{(x + y)_n\}$, and we will do the same for the other operations.4
(ii) $x > y$ if $x \ge y$ and $x \ne y$, that is, if $x \ge y$ and there exists at least one position index $n$
such that $x_n > y_n$;
$x \gg y \implies x > y \implies x \ge y \quad \forall x, y \in \mathbb{R}^\infty$
(i) increasing if
$x \ge y \implies g(x) \ge g(y) \quad \forall x, y \in A$ (8.9)
The decreasing counterparts of these notions are defined in an analogous way. Moreover,
$g$ is constant if there exists $k \in \mathbb{R}$ such that
$g(x) = k \quad \forall x \in A$
For brevity we do not dwell further upon these notions, and we limit ourselves to observing
that strictly increasing monotonicity implies the other two properties.
where 2 (0; 1) can be interpreted as a subjective discount factor that, as we have seen,
depends on the degree of patience of the consumer.
Regarding this, a few observations analogous to those made in Section 6.4.4 for utility functions
on $\mathbb{R}^n$ are valid. In particular, here we have, too,
$\{-1, 1, -1, 1, \ldots\}$ (8.10)
in which the two values $-1$ and $1$ are repeated. The constant sequence, with generic element
$x_n = 2$ for every $n \ge 1$,
$\{2, 2, 2, \ldots\}$ (8.11)
consists only of 2 (the corresponding $f$ is therefore the constant function $f(n) = 2$ for
every $n \ge 1$).
Concerning this aspect, the image (or range)
$\operatorname{Im} f = \{f(n) : n \ge 1\}$
of the sequence, which consists exactly of the values that the sequence assumes, disregarding
repetitions, is important. For example, the image of the sequence (8.10) is $\{-1, 1\}$, while
for the constant sequence (8.11) it is the singleton $\{2\}$. The image therefore gives very
important information because it indicates which values the sequence actually assumes,
without the repetitions: as we have seen, they can be very few and repeat themselves over
and over again along the sequence. On the other hand, the sequence (8.6) of the odd
8.4. IMAGES AND CLASSES OF SEQUENCES 177
numbers does not contain any repetition, and its image consists of all its terms, that is,
$\operatorname{Im} f = \{2n - 1 : n \ge 1\}$.
Through the image, in Section 6.4.3 we studied various notions of boundedness for
functions. In the special case of sequences (i.e., of functions $f : \mathbb{N}_+ \to \mathbb{R}$) these
general notions assume the following form. A sequence $\{x_n\}$ is:
(i) bounded from above if there exists $k \in \mathbb{R}$ such that $x_n \le k$ for every $n \ge 1$;
(ii) bounded from below if there exists $k \in \mathbb{R}$ such that $x_n \ge k$ for every $n \ge 1$;
(iii) bounded if it is both bounded from above and from below, i.e., if there exists $k > 0$
such that $|x_n| \le k$ for every $n \ge 1$.
For example, the sequence $\{x_n\} = \{(-1)^n\}$ is bounded, while that of the odd numbers
(8.6) is only bounded from below. Note that, as usual, this classification is not exhaustive,
since there exist sequences that are neither bounded from above nor bounded from below:
for example, $x_n = (-1)^n n$. Such sequences are called unbounded.
Another important class of sequences is that of the monotonic ones, which are defined in a
similar way to what we saw for functions in Section 6.4.4. In particular, a sequence $\{x_n\}$ is:
(i) increasing if
$x_{n+1} \ge x_n \quad \forall n \ge 1$
strictly increasing if
$x_{n+1} > x_n \quad \forall n \ge 1$
(ii) decreasing if
$x_{n+1} \le x_n \quad \forall n \ge 1$
strictly decreasing if
$x_{n+1} < x_n \quad \forall n \ge 1$
(iii) constant if it is both increasing and decreasing, i.e., if there exists $k \in \mathbb{R}$ such that
$x_n = k \quad \forall n \ge 1$
Definition 262 We say that a sequence satisfies a property P eventually if, starting from
a certain position $n = n_P$, all the terms of the sequence satisfy P.
Obviously, the position (or index) $n$ depends on the property P: this is indicated by writing
$n = n_P$.
5 For sequences the notions of strict monotonicity are not so important.
Example 263 (i) The sequence $\{2, 4, 6, 32, 57, 1, 3, 5, 7, 9, 11, \ldots\}$ is eventually increasing:
indeed, starting from the 6th term, it is increasing.
(ii) The sequence $\{n\}$ is eventually $\ge 1{,}000$: indeed, all the terms of the sequence, starting
from the one in position 1,000, are $\ge 1{,}000$.
O.R. To eventually satisfy a property, the sequence, “when young”, can do what it wants;
what matters is that, “when old enough” (that is, from a certain $n$ onward), “it settles down”.
Youthful blunders are forgiven: what is important is that, sooner or later, all the terms of
the sequence satisfy the property. H
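On a finite list of terms, the position $n_P$ for the property "increasing" can be located mechanically. The sketch below is only an illustration on the terms displayed in Example 263 (i): a finite prefix can suggest, but never prove, that a property holds eventually. The function name is ours.

```python
def increasing_from(prefix):
    """1-based position from which the finite list `prefix` is (weakly) increasing."""
    start = 0
    for i in range(len(prefix) - 1):
        if prefix[i + 1] < prefix[i]:
            start = i + 1  # a descent ends at index i + 1: restart from there
    return start + 1


terms = [2, 4, 6, 32, 57, 1, 3, 5, 7, 9, 11]
print(increasing_from(terms))  # 6
```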
$\left\{1, \frac{1}{\sqrt{2}}, \frac{1}{\sqrt{4}}, \frac{1}{\sqrt{8}}, \ldots\right\}$
By continuing, we can verify that, for larger and larger values of $n$, its terms $x_n = 1/\sqrt{2^{n-1}}$
become closer and closer (we say that they “tend”) to the value $L = 0$. In this case we say that the
sequence tends to 0 and we write
$\lim_{n \to \infty} \frac{1}{\sqrt{2^{n-1}}} = 0$
$\{1, 3, 5, 7, \ldots\}$
the terms $x_n = 2n - 1$ of the sequence grow larger and larger for larger and larger values of
$n$. In this case we say that the sequence diverges positively, and we write
$\lim_{n \to \infty} (2n - 1) = +\infty$
8.6. LIMITS AND ASYMPTOTIC BEHAVIOR 179
$\{-1, 1, -1, 1, \ldots\}$
As $n$ varies, it continues to oscillate between the values $-1$ and $1$, never
approaching (eventually) any particular value. In this case, we say that the sequence is
oscillating (or irregular): it does not have a limit.
(ii) divergence to $+\infty$ or to $-\infty$;
(iii) oscillation.
In the first two cases we say that the sequence is regular: it tends to (it asymptotically
approaches) a value, possibly infinite. In case (iii) we say that the sequence is irregular (or
oscillating). In the rest of the section we formalize the intuitive idea of “tending to a value”.6
8.6.1 Convergence
We start with convergence, that is, with case (i).
Therefore, a sequence $\{x_n\}$ converges to $L$ when, for each quantity $\varepsilon > 0$, however small,
there exists a position $n_\varepsilon$ (that depends on $\varepsilon$!) starting from which the distance
between the terms $x_n$ of the sequence and the limit $L$ is always smaller than $\varepsilon$. A sequence
$\{x_n\}$ that converges to a point $L \in \mathbb{R}$ is called convergent.
6 Often, irregular sequences are called divergent. In order to avoid any confusion with regular sequences
that are not convergent, the latter have the extra specification of being divergent to either $+\infty$ or $-\infty$.
We have said that the position (index) $n_\varepsilon$ depends on $\varepsilon$. Moreover, as should be clear
from Examples 266 and 267, the choice of $n_\varepsilon$ is not unique: if there exists a position $n_\varepsilon$ such
that $|x_n - L| < \varepsilon$ for every $n \ge n_\varepsilon$, the same holds for any subsequent position, which can
itself be chosen as $n_\varepsilon$. Which of these positions we call $n_\varepsilon$ is completely
irrelevant: the definition only asks that (at least) one exists. The two examples that we will
present shortly should clarify the question.
that is
$n \ge n_\varepsilon \implies L - \varepsilon < x_n < L + \varepsilon$
O.R. The definition requires that “falling eventually inside” happens for every neighborhood
of $L$: it is thus essential that this happens for arbitrarily small neighborhoods (it is easy to
belong to an enormous neighborhood, but difficult to belong to a very small one). H
Example 266 Consider the sequence $\{1/n\}$. The natural candidate for its limit is 0. Let
us verify that this is indeed the case. Let $\varepsilon > 0$. We have
$\left| \frac{1}{n} - 0 \right| < \varepsilon \iff \frac{1}{n} < \varepsilon \iff n > \frac{1}{\varepsilon}$
Therefore, if we take as $n_\varepsilon$ any integer greater than $1/\varepsilon$, for example $n_\varepsilon = [1/\varepsilon] + 1$,7 then
we have
$n \ge n_\varepsilon \implies 0 < \frac{1}{n} < \varepsilon$
and therefore 0 is actually the limit of the sequence. For example, if $\varepsilon = 10^{-100}$, we have
$n_\varepsilon = 10^{100} + 1$. Note that we could have chosen $n_\varepsilon$ to be any integer greater than $10^{100} + 1$.
N
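The threshold of Example 266 can be computed exactly. The sketch below uses exact rational arithmetic so that $\varepsilon = 10^{-100}$ poses no rounding problems; the function name `n_eps` is ours.

```python
from fractions import Fraction
import math


def n_eps(eps):
    """The threshold [1/eps] + 1 of the example, with [.] the integer part."""
    return math.floor(1 / eps) + 1


eps = Fraction(1, 10 ** 100)
n0 = n_eps(eps)
print(n0 == 10 ** 100 + 1)        # True
print(Fraction(1, n0) < eps)      # True: the term in position n0 is below eps
print(Fraction(1, n0 - 1) < eps)  # False: one position earlier it is not
```

Here $n_\varepsilon$ happens to be the smallest admissible position, although the definition does not require this.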
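Example 267 Consider the sequence (8.7), that is, $\{1/\sqrt{2^{n-1}}\}$. Here, too, the natural
candidate for its limit is 0. Let us verify this. Let $\varepsilon > 0$. We have
$\left| \frac{1}{\sqrt{2^{n-1}}} - 0 \right| < \varepsilon \iff \frac{1}{2^{\frac{n-1}{2}}} < \varepsilon \iff 2^{\frac{n-1}{2}} > \frac{1}{\varepsilon} \iff n > 1 + 2 \log_2 \frac{1}{\varepsilon}$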
7 Recall that $[\cdot]$ denotes the integer part, introduced in Section 1.4.3.
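and therefore, by taking $n_\varepsilon$ to be any integer greater than $1 + 2 \log_2 \varepsilon^{-1}$, for example
$n_\varepsilon = 2 + [2 \log_2 \varepsilon^{-1}]$, we have
$n \ge n_\varepsilon \implies 0 < \frac{1}{\sqrt{2^{n-1}}} < \varepsilon$
Therefore, 0 is the limit of the sequence. For example, if $\varepsilon = 10^{-100}$ we have
$n_\varepsilon = 2 + [2 \log_2 10^{100}] = 2 + [200 \log_2 10] = 666$. N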
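The value of $n_\varepsilon$ for $\varepsilon = 10^{-100}$ can be cross-checked in two ways: by a direct search in integer arithmetic, using the equivalence $1/\sqrt{2^{n-1}} < \varepsilon \iff 2^{n-1} > \varepsilon^{-2}$, and by the formula $2 + [2 \log_2 \varepsilon^{-1}]$ evaluated in floating point. The function name is ours.

```python
import math


def first_index_below(eps_inverse):
    """Smallest n with 1/sqrt(2**(n-1)) < 1/eps_inverse, i.e. 2**(n-1) > eps_inverse**2."""
    n = 1
    while 2 ** (n - 1) <= eps_inverse ** 2:
        n += 1
    return n


exact = first_index_below(10 ** 100)           # eps = 10**-100, integers only
formula = 2 + math.floor(200 * math.log2(10))  # 2 + [2 log2(10**100)]
print(exact, formula)  # 666 666
```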
We have seen two examples of sequences that converge to 0. Such sequences are called
infinitesimal (or null). Thanks to the next result, the computation of their limits is of
particular importance.
Proof “Only if”. Let $\lim_{n \to \infty} x_n = L$. Consider the (new) sequence with terms $y_n = d(x_n, L)$.
We have to prove that $\lim_{n \to \infty} y_n = 0$, i.e., that for every $\varepsilon > 0$ there exists $n_\varepsilon \ge 1$ such that
$n \ge n_\varepsilon$ implies $|y_n| < \varepsilon$. On the other hand, since $y_n \ge 0$, this is equivalent to showing that
$n \ge n_\varepsilon \implies d(x_n, L) < \varepsilon$ (8.13)
Since $x_n \to L$, given $\varepsilon > 0$, there exists $n_\varepsilon \ge 1$ such that $d(x_n, L) < \varepsilon$ for every $n \ge n_\varepsilon$, and
therefore (8.13) holds.
“If”. Suppose that $\lim_{n \to \infty} d(L, x_n) = 0$. Let $\varepsilon > 0$. There exists $n_\varepsilon \ge 1$ such that
$d(L, x_n) < \varepsilon$ for every $n \ge n_\varepsilon$. Therefore, $x_n \in B_\varepsilon(L)$ for every $n \ge n_\varepsilon$, as desired.
We can therefore reduce the study of the convergence of any sequence to the convergence
to 0 of the sequence $\{d(x_n, L)\}_{n \ge 1}$ of real numbers. In other words, to check whether $x_n \to L$, it
is sufficient to check whether $d(x_n, L) \to 0$.
$x_n = 1 + \frac{(-1)^n}{n}$
$d(x_n, 1) = \left| 1 + \frac{(-1)^n}{n} - 1 \right| = \left| \frac{(-1)^n}{n} \right| = \frac{1}{n} \to 0$;
We leave to the reader the rigorous definition of limit from above and from below in
terms of right and left neighborhoods of $L$.
8.6.3 Divergence
We now consider divergence, starting with positive divergence. The idea of the
definition is similar, mutatis mutandis, to the previous ones.
$n \ge n_K \implies x_n > K$
In other words, a sequence diverges positively when it eventually becomes greater than
every $K > 0$. Since the constant $K$ can be taken arbitrarily large, this can happen only if
the sequence is not bounded from above.
O.R. The definition requires that the inequality holds for every scalar $K$: it is decisive
that this happens for arbitrarily large values of $K$ (it is easy to be $> K$ when $K$ is small,
increasingly difficult the larger $K$ is). H
Example 272 Consider the sequence of the even numbers, $x_n = 2n$, and let us verify that
it diverges positively. Let $K \in \mathbb{R}$. We have
$2n > K \iff n > \frac{K}{2}$
and so we can choose as $n_K$ any integer greater than $K/2$. For example, if $K = 10^{100}$, we
can put $n_K = 10^{100}/2 + 1$. Therefore $\{x_n\} = \{2n\}$ diverges positively. N
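The choice of $n_K$ in Example 272 can be verified with exact integer arithmetic. In this sketch we restrict attention, for simplicity, to integer values $K \ge 0$ (an assumption of ours, not of the example), and take the smallest admissible position.

```python
def n_K(K):
    """Smallest integer n with 2n > K, for an integer K >= 0: [K/2] + 1."""
    return K // 2 + 1


K = 10 ** 100
n0 = n_K(K)
print(2 * n0 > K, 2 * (n0 - 1) > K)  # True False
```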
$n \ge n_K \implies x_n < K$
In such a case, the terms of the sequence are eventually smaller than every $K < 0$:
although the constant can take arbitrarily large negative values (in absolute value), there
exists a position beyond which all the terms of the sequence are smaller than the
constant. This characterizes the divergence to $-\infty$ of the sequence.
Proposition 274 A sequence $\{x_n\}$, with eventually $x_n > 0$, diverges positively if and only
if the sequence $\{1/x_n\}$ converges to zero.
An analogous result holds for negative divergence. Note that the hypothesis “eventually
$x_n > 0$” is redundant for a sequence that diverges positively, since such a sequence
always satisfies this condition.
Proof “If”. Let $1/x_n \to 0$. Let $K > 0$. Setting $\varepsilon = 1/K > 0$, by Definition 264, there exists
$n_{1/K} \ge 1$ such that $1/x_n < 1/K$ for every $n \ge n_{1/K}$. Since eventually $x_n > 0$, we can take
$n_{1/K}$ so large that also $x_n > 0$ for every $n \ge n_{1/K}$. Therefore, $x_n > K$ for every $n \ge n_{1/K}$,
and by Definition 271 we have $x_n \to +\infty$.
“Only if”. Let $x_n \to +\infty$ and let $\varepsilon > 0$. Setting $K = 1/\varepsilon > 0$, by Definition 271, there
exists $n_{1/\varepsilon}$ such that $x_n > 1/\varepsilon$ for every $n \ge n_{1/\varepsilon}$. Therefore, $0 < 1/x_n < \varepsilon$ for every $n \ge n_{1/\varepsilon}$
and therefore $1/x_n \to 0$.
O.R. Adding, removing, or altering (in any way) a finite number of terms
of a sequence does not change its asymptotic behavior: if it is regular, i.e., convergent or
(properly) divergent, it remains so, and with the same limit; if it is oscillating (irregular),
it remains so. This is because the definition of limit requires that a property
(either “hitting” an arbitrarily small neighborhood in case of convergence or being greater
than an arbitrarily large number in case of divergence) holds only eventually. H
O.R. A neighborhood $B_\varepsilon(x)$ of a point is smaller the smaller $\varepsilon > 0$ is; a neighborhood
$(K, +\infty]$ of $+\infty$ is smaller the greater $K$ is (and similarly for the neighborhoods $[-\infty, K)$
of $-\infty$). H
Having observed that $(K, +\infty]$ and $[-\infty, K)$ are open in $\overline{\mathbb{R}}$ for every $K \in \mathbb{R}$, we can state
a lemma that will turn out to be useful in defining limits of sequences and functions.
Proof Since the proof of (ii) is analogous, it is sufficient to show (i). “If”. Let $A$ be unbounded
from above, i.e., $A$ does not have an upper bound. Let $(K, +\infty]$ be a neighborhood of $+\infty$.
Since $A$ does not have any upper bound, $K$ is not an upper bound of $A$. Therefore there
exists $x \in A$ such that $x > K$, i.e., $x \in (K, +\infty] \cap A$ and $x \ne +\infty$. It follows that $+\infty$ is a
limit point of $A$ (indeed, each neighborhood of $+\infty$ contains points of $A$ different from $+\infty$).
“Only if”. Let $+\infty$ be a limit point of $A$. We show that $A$ does not have any upper
bound. Suppose, by contradiction, that $K \in \mathbb{R}$ is an upper bound of $A$. Since $+\infty$ is a limit
point of $A$, the neighborhood $(K, +\infty]$ of $+\infty$ contains a point $x \in A$ such that $x \ne +\infty$.
Therefore $K < x$, contradicting the fact that $K$ is an upper bound of $A$.
Example 277 The sets $A$ such that $(a, +\infty) \subseteq A$ for some $a \in \mathbb{R}$ constitute an important
class of sets unbounded from above. By Lemma 276, it follows that for them $+\infty$ is a limit
point. In a similar way, $-\infty$ is a limit point for the sets $A$ such that $(-\infty, a) \subseteq A$ for some
$a \in \mathbb{R}$. N
Using the topology of $\overline{\mathbb{R}}$ we can give a general definition of limit that extends Definition
265 in order to include also Definitions 271 and 273 of divergence. We observe that in
the next definition, which unifies all the possible definitions of limit of a sequence, we have
that:
$U(L) = \begin{cases} B_\varepsilon(L) & \text{if } L \in \mathbb{R} \\ (K, +\infty] & \text{if } L = +\infty \\ [-\infty, K) & \text{if } L = -\infty \end{cases}$
$n \ge n_U \implies x_n \in U(L)$
O.R. Observe that if $L \in \mathbb{R}$, $n_U$ depends on an arbitrary radius $\varepsilon > 0$ (in particular, as
small as we want), and hence we can write $n_U = n_\varepsilon$. If, instead, $L = +\infty$, $n_U$ depends on
any real number $K$ (in particular, arbitrarily large) and we can write $n_U = n_K$, with $K > 0$
without loss of generality. Finally, if $L = -\infty$, $n_U$ depends on any negative real number $K$
(in particular, arbitrarily large in absolute value) and, without loss of generality, we can set
$n_U = n_K$ with $K < 0$. When $L$ is finite, it is decisive that the property
holds also for arbitrarily small values of $\varepsilon$. When $L = \pm\infty$, it is instead decisive that the
property holds also for $K$ arbitrarily large in absolute value. H
8.7. PROPERTIES OF LIMITS 185
Theorem 279 (Uniqueness of the limit) A sequence $\{x_n\}$ converges to at most one limit
$L \in \overline{\mathbb{R}}$.
Proof Let us suppose, by contradiction, that there exist two distinct limits belonging to the
set $\overline{\mathbb{R}}$. For such limits different cases are possible.
We analyze first the case of two distinct finite limits $L', L'' \in \mathbb{R}$, i.e., $L' \ne L''$. Without
loss of generality, suppose that $L'' > L'$. Take $\varepsilon > 0$ such that
$\varepsilon < \frac{L'' - L'}{2}$
Then
$B_\varepsilon(L') \cap B_\varepsilon(L'') = \emptyset$
[Figure: the disjoint neighborhoods $(L' - \varepsilon, L' + \varepsilon)$ and $(L'' - \varepsilon, L'' + \varepsilon)$ marked on the vertical axis.]
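as the reader can verify. On the other hand, by Definition 265, there exists $n'_\varepsilon \ge 1$ such that
$x_n \in B_\varepsilon(L')$ for every $n \ge n'_\varepsilon$, and there exists $n''_\varepsilon \ge 1$ such that $x_n \in B_\varepsilon(L'')$ for every
$n \ge n''_\varepsilon$. Setting $n_\varepsilon = \max\{n'_\varepsilon, n''_\varepsilon\}$, we have therefore both $x_n \in B_\varepsilon(L')$ and $x_n \in B_\varepsilon(L'')$
for every $n \ge n_\varepsilon$, i.e., $x_n \in B_\varepsilon(L') \cap B_\varepsilon(L'')$ for every $n \ge n_\varepsilon$. But this contradicts
$B_\varepsilon(L') \cap B_\varepsilon(L'') = \emptyset$. It follows that $L' = L''$ and therefore the limit is unique.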
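Let us now analyze the case in which the sequence admits one finite limit $L$ and another
one equal to $+\infty$. For every $\varepsilon > 0$ and every $K > 0$, there exist $n_\varepsilon$ and $n_K$ such that
$L - \varepsilon < x_n < L + \varepsilon \quad \forall n \ge n_\varepsilon \qquad \text{and} \qquad x_n > K \quad \forall n \ge n_K$
It is now sufficient to take $K = L + \varepsilon$ to realize that, for $n \ge \max\{n_\varepsilon, n_K\}$, the two
inequalities cannot coexist.
The remaining cases can be treated in an analogous way.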
The next result shows that, when a sequence converges to a point $L \in \mathbb{R}$, in each neighborhood
of $L$ we find almost all the points of the sequence.
In other words, the sequence eventually belongs to any neighborhood $B_\varepsilon(L)$ of $L$.
Proof Let us suppose $x_n \to L$. By Definition 265, for every $\varepsilon > 0$ there exists $n_\varepsilon \ge 1$ such
that $x_n \in B_\varepsilon(L)$ for every $n \ge n_\varepsilon$. Therefore, except at most the values $x_n$ with $1 \le n < n_\varepsilon$,
all the elements of the sequence belong to $B_\varepsilon(L)$.
Vice versa, given any neighborhood $B_\varepsilon(L)$ of $L$, suppose that all the terms of the sequence
belong to it, except at most a finite number of them. Denote by $\{x_{n_k}\}$, $k = 1, 2, \ldots, m$, with
$n_1 < n_2 < \cdots < n_m$, the set of the elements of the sequence that do not belong to $B_\varepsilon(L)$.
Setting $n_\varepsilon = n_m + 1$, we
have that $x_n \in B_\varepsilon(L)$ for every $n \ge n_\varepsilon$. As this is true for each neighborhood $B_\varepsilon(L)$ of $L$,
by Definition 265 we have that $x_n \to L$.
The next classical result, concerning the permanence of sign, shows that if a sequence
has a non-zero limit $L$, then the terms of the sequence eventually have the same sign as $L$.
Theorem 281 (Permanence of sign) Let $\{x_n\}$ be a sequence that converges to the limit
$L \ne 0$. Then, there exists $\bar{n} \ge 1$ such that $x_n$ has the same sign as $L$ for every $n \ge \bar{n}$, that
is,
$x_n L > 0 \quad \forall n \ge \bar{n}$
Analogously, if $x_n \to +\infty$ ($-\infty$), then there exists $\bar{n}$ such that $x_n$ is positive (negative) for
every $n \ge \bar{n}$.
Proof Suppose $L > 0$ (an analogous argument holds if $L < 0$). Let $\varepsilon \in (0, L)$. By Definition
264 there exists $\bar{n} \ge 1$ such that $|x_n - L| < \varepsilon$, i.e., $L - \varepsilon < x_n < L + \varepsilon$, for every $n \ge \bar{n}$. Since
$\varepsilon \in (0, L)$, we have $L - \varepsilon > 0$. Therefore,
$x_n > L - \varepsilon > 0 \quad \forall n \ge \bar{n}$
The Theorem on the permanence of sign represents a property of limits with respect
to the order structure on $\mathbb{R}$. We give another simple result of the same type, leaving the
proof to the reader.
Proposition 282 Let $\{x_n\}$ and $\{y_n\}$ be two sequences such that $x_n \to L \in \overline{\mathbb{R}}$ and
$y_n \to H \in \overline{\mathbb{R}}$. If eventually $x_n \ge y_n$, then $L \ge H$.
The converse does not hold: for example, let $L = H = 0$, $\{x_n\} = \{-1/n\}$ and
$\{y_n\} = \{1/n\}$. We have $L \ge H$, but $x_n < y_n$ for every $n$. However, if we assume $L > H$, the
converse holds in the following strict form:
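Proposition 283 If $\{x_n\}$ and $\{y_n\}$ are two sequences such that $x_n \to L \in \overline{\mathbb{R}}$ and $y_n \to H \in \overline{\mathbb{R}}$,
with $L > H$, then eventually $x_n > y_n$.
Proof We prove the statement for $L, H \in \mathbb{R}$, leaving the other cases to the reader. Let
$0 < \varepsilon < (L - H)/2$. Since $H + \varepsilon < L - \varepsilon$, we have $(H - \varepsilon, H + \varepsilon) \cap (L - \varepsilon, L + \varepsilon) = \emptyset$.
Moreover, there exist $n'_\varepsilon, n''_\varepsilon \ge 1$ such that $y_n \in (H - \varepsilon, H + \varepsilon)$ for every $n \ge n'_\varepsilon$ and
$x_n \in (L - \varepsilon, L + \varepsilon)$ for every $n \ge n''_\varepsilon$. For $n \ge \max\{n'_\varepsilon, n''_\varepsilon\}$ it follows that $y_n \in (H - \varepsilon, H + \varepsilon)$
and $x_n \in (L - \varepsilon, L + \varepsilon)$, so that $x_n > L - \varepsilon > H + \varepsilon > y_n$ for every $n \ge \max\{n'_\varepsilon, n''_\varepsilon\}$.
The scope of the proposition is remarkable. It allows us, for example, to verify the positive
or negative divergence of a sequence through a simple comparison with other divergent
sequences. Indeed, if $x_n \ge y_n$ and $x_n$ diverges negatively, then so does $y_n$; if $x_n \ge y_n$ and $y_n$
diverges positively, then so does $x_n$.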
Proof Suppose $x_n \to L$. Setting $\varepsilon = 1$, there exists $n_1 \ge 1$ such that $x_n \in B_1(L)$ for every
$n \ge n_1$. Let $M$ be a constant such that $M > \max\{1, d(x_1, L), \ldots, d(x_{n_1 - 1}, L)\}$. We have
$d(x_n, L) < M$ for every $n \ge 1$, i.e., $|x_n - L| < M$ for every $n \ge 1$. This implies that
$L - M < x_n < L + M$
Thanks to Proposition 284, the convergent sequences constitute a subset of the bounded
ones. Therefore, if a sequence is not bounded, it cannot be convergent.
In general, the converse of Proposition 284 is false. For example, the sequence
$x_n = (-1)^n$ is bounded, but does not converge. However, for an important class of sequences, the
monotonic ones, the converse holds: for such sequences, boundedness is both a necessary
and sufficient condition for convergence. Before stating and proving such a result, we need
another important theorem:
Proof Let $\{x_n\}$ be an increasing sequence in $\mathbb{R}$ (the proof in the case of a decreasing sequence
is analogous). The sequence $\{x_n\}$ can be bounded or not bounded from above (it is surely
bounded from below since $x_1 \le x_n$ for every $n \ge 1$).
Thus, Theorem 285 guarantees that monotonic sequences cannot be irregular.8 We are
now able to state and prove the theorem anticipated above on the equivalence of boundedness
and convergence for monotonic sequences.
We close with an obvious, but useful observation: the results just discussed hold, more
generally, for sequences that are eventually monotonic.
Thus, Heron’s sequence converges to the square root of a. On top of that, the rate of
convergence is quite fast, as we shall see.
Proof The sequence is (strictly) decreasing, at least after n = 2, as the following claims
show:
8 The version for functions of a real variable of this important result is Lemma 881 (which will be used in
the study of improper integrals).
9 We could also take $x_1 = b$ with $b$ between $\sqrt{a}$ and $a$: in this way the rate of convergence is increased.
To conclude, the sequence $\{x_n\}$ is decreasing (at least for $n \ge 2$) and thus admits a limit
$L$ which must satisfy
$L = \frac{1}{2}\left(L + \frac{a}{L}\right); \quad 2L = L + \frac{a}{L}; \quad L = \frac{a}{L}; \quad L^2 = a$
Hence, $L = \sqrt{a}$.
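Example 288 (i) Let us compute $\sqrt{2}$, which we know to be approximately 1.4142135.
Heron’s sequence is made up of the following elements:
$x_1 = 2$
$x_2 = \frac{1}{2}\left(2 + \frac{2}{2}\right) = \frac{3}{2} = 1.5$
$x_3 = \frac{1}{2}\left(\frac{3}{2} + \frac{2}{3/2}\right) = \frac{17}{12} \simeq 1.4166667$
$x_4 = \frac{1}{2}\left(\frac{17}{12} + \frac{2}{17/12}\right) = \frac{577}{408} \simeq 1.4142156$
$x_5 = \frac{1}{2}\left(\frac{577}{408} + \frac{2}{577/408}\right) = \frac{665857}{470832} \simeq 1.4142135$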
10 If $a = 1$, then clearly $x_n = 1$ for all $n$.
(ii) Let us compute $\sqrt{428356} \simeq 654.48911$:
(iii) For $\sqrt{0.13} \simeq 0.3605551$ we have
Note that, since $0.13 < 1$, the sequence is decreasing starting from the second element.
N
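The terms of Heron’s sequence can be reproduced with exact rational arithmetic. The sketch below assumes the starting point $x_1 = a$ used in part (i) and the recurrence $x_{n+1} = \frac{1}{2}\left(x_n + \frac{a}{x_n}\right)$; the function name is ours.

```python
from fractions import Fraction


def heron(a, steps):
    """First `steps` terms of x_1 = a, x_{n+1} = (x_n + a / x_n) / 2, as exact fractions."""
    a = Fraction(a)
    x = a
    terms = [x]
    for _ in range(steps - 1):
        x = (x + a / x) / 2
        terms.append(x)
    return terms


for n, x in enumerate(heron(2, 5), start=1):
    print(n, x, float(x))
```

For $a = 2$ the fractions printed are $2$, $3/2$, $17/12$, $577/408$ and $665857/470832$, as in part (i). Since numerators and denominators roughly double in length at each step, exact fractions are practical only for a handful of iterations.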
$x_{n+1} = \frac{1}{2}\left(x_n + \frac{a}{x_n}\right) < x_n$
By iterating the algorithm, $x_n$ and $a/x_n$ become closer and closer: their common value cannot
be anything other than $\sqrt{a}$.
[Figure: the positions of $a/x_n$, $a/x_{n+1}$, $x_{n+1}$ and $x_n$.]
with each $n_k \ge 1$, the sequence $\{x_{n_k}\}_{k=1}^{\infty}$ is called a subsequence of $\{x_n\}$. In other words, the
subsequence $\{x_{n_k}\}$ is a sequence built starting from the original sequence $\{x_n\}$ taking only
the terms of position (index) $n_k$.
$\left\{1, \frac{1}{3}, \frac{1}{5}, \frac{1}{7}, \ldots, \frac{1}{2k-1}, \ldots\right\}$
in which $\{n_k\}_{k \ge 1}$ is the sequence of the odd numbers $\{1, 3, 5, \ldots\}$: this subsequence has been
built by selecting the elements in odd positions of the original one. Another subsequence of
(8.14) is given by
$\left\{\frac{1}{2}, \frac{1}{4}, \frac{1}{8}, \frac{1}{16}, \ldots, \frac{1}{2^k}, \ldots\right\}$
where this time $\{n_k\}_{k \ge 1}$ is formed by the (strictly positive integer) powers of 2, that is,
$\{2, 2^2, 2^3, \ldots\}$
Example 290 Consider the oscillating sequence in $\mathbb{R}$ with generic element $x_n = (-1)^n$. A
simple subsequence is given by
$\{1, 1, 1, \ldots, 1, \ldots\}$ (8.15)
in which $\{n_k\}_{k \ge 1}$ is the sequence of the even numbers: this subsequence has been built by
selecting the elements in even positions of the original one. If we selected those in odd positions we
would have built the subsequence
$\{-1, -1, -1, \ldots, -1, \ldots\}$ (8.16)
Taking $\{n_k\}_{k \ge 1} = \{1000k\}$, i.e., selecting only the elements in positions 1,000, 2,000,
3,000, ... we still get $\{1, 1, 1, \ldots, 1, \ldots\}$. N
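The three selections of Example 290 can be reproduced by describing a subsequence through its index map $k \mapsto n_k$. This is a minimal sketch; the names `x` and `subsequence` are ours.

```python
def x(n):
    """Generic term x_n = (-1)**n of the oscillating sequence."""
    return (-1) ** n


def subsequence(index, length):
    """First `length` terms x_{n_k}, where `index` maps k to the position n_k."""
    return [x(index(k)) for k in range(1, length + 1)]


print(subsequence(lambda k: 2 * k, 5))      # even positions: [1, 1, 1, 1, 1]
print(subsequence(lambda k: 2 * k - 1, 5))  # odd positions: [-1, -1, -1, -1, -1]
print(subsequence(lambda k: 1000 * k, 5))   # positions 1000k: [1, 1, 1, 1, 1]
```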
A subsequence is obtained simply by discarding some terms (possibly infinitely many) of the
sequence, while leaving in place an infinite number of them. If a sequence is regular, each
of its subsequences is regular, with the same limit (ubi maior ...).
Proposition 291 A sequence $\{x_n\}$ in $\mathbb{R}$ is regular, with limit $L \in \overline{\mathbb{R}}$, if and only if each of
its subsequences is regular with the same limit $L$.
Proof We prove the result for $L \in \mathbb{R}$, leaving the cases $\pm\infty$ to the reader. “Only if”.
Suppose that $\{x_n\}$ converges to $L$. Let $\varepsilon > 0$. There exists $n_\varepsilon \ge 1$ such that $|x_n - L| < \varepsilon$
for every $n \ge n_\varepsilon$. Let $\{x_{n_k}\}_{k=1}^{\infty}$ be a subsequence of $\{x_n\}$. Since $n_k \ge k$ for every $k \ge 1$, we
a fortiori have $|x_{n_k} - L| < \varepsilon$ for every $k \ge n_\varepsilon$, so that $\{x_{n_k}\}$ converges to $L$. “If”. Suppose
that each subsequence of $\{x_n\}$ converges to $L$. Assume, by contradiction, that $\{x_n\}$ does
not converge to $L$. Then, there exists $\varepsilon_0 > 0$ such that, for every integer $k \ge 1$, there exists
$n_k \ge k$ for which $|x_{n_k} - L| > \varepsilon_0$. Consider 11 the sequence of such $x_{n_k}$. It is a subsequence
of $\{x_n\}$ that, by construction, does not converge to $L$: contradiction. Hence, $\{x_n\}$ converges
to $L$.
As the last example shows, it can happen that, while the original sequence is irregular,
some of its subsequences are convergent. In other words, by selecting the
elements in a suitable way, we can sometimes “extract” a convergent trend out of an irregular one.
In Example 290 we have an oscillating sequence, from which we have selected a constant
subsequence by taking only the elements in even positions (or only those in odd positions). The next
result, the Bolzano-Weierstrass Theorem, shows that this is always possible, provided the
sequence is bounded.
Theorem 292 (Bolzano-Weierstrass) Each bounded sequence has (at least) one convergent
subsequence.
In other words, from any bounded sequence $\{x_n\}$, even if very irregular, it is possible
to extract a convergent subsequence $\{x_{n_k}\}$, i.e., such that there exists $L \in \mathbb{R}$ for which
$\lim x_{n_k} = L$. The possibility of always being able to “extract” a convergent behavior from
any bounded sequence is a property of great importance.
Example 293 The sequence $x_n = (-1)^n$ is bounded, since its image is the bounded set
$\{-1, 1\}$. By the Bolzano-Weierstrass Theorem, it has at least one convergent subsequence.
Indeed, such are the constant subsequences (8.15) and (8.16). N
Let us suppose that, instead, $I$ is not finite, i.e., that there exist infinitely many positions
$n \ge 1$ such that
$m > n \implies x_m > x_n$ (8.17)
Since they are infinitely many, we denote the elements of $I$ as $I = \{n_1, n_2, \ldots, n_k, \ldots\}$, with
$n_1 < n_2 < \cdots < n_k < \cdots$ Thanks to (8.17) we have
$x_{n_1} < x_{n_2} < \cdots < x_{n_k} < \cdots$
The subsequence $\{x_{n_k}\}$ is therefore monotonic increasing; this completes the proof in Case
2.
O.R. The Bolzano-Weierstrass Theorem states in substance that it is not possible to take
infinitely many numbers (the elements of the sequence) in a bounded real interval in such a way
that they (or a part of them) are “well separated” from one another: necessarily they crowd
in the proximity of (at least) one point. H
Proposition 295 Each unbounded sequence has a divergent subsequence (to $+\infty$ when it is
not bounded from above, to $-\infty$ when it is not bounded from below).12
Proof If the sequence is unbounded from above, then for every $K > 0$ there exists at least
one element of the sequence greater than $K$. We denote by $x_{n_K}$ the first term (the one of
smallest index) in the sequence $\{x_n\}$ that turns out to be $> K$: taking $K = 1, 2, \ldots$, $\{x_{n_K}\}$ is clearly a subsequence
of $\{x_n\}$ (all its terms have been taken among those of $\{x_n\}$) and, by definition, it diverges
to $+\infty$. The case of sequences not bounded from below can be treated in an analogous way.
In conclusion:
O.R. We can therefore say that there is no way of taking infinitely many real numbers
without at least a part of them crowding somewhere (in proximity of either a finite number
or of $+\infty$ or of $-\infty$, i.e., of some point of $\overline{\mathbb{R}}$). H
12 In the case where it is neither bounded from above nor bounded from below it has both a subsequence
diverging to $+\infty$ and a subsequence diverging to $-\infty$.
$+\infty - \infty$ or $-\infty + \infty$
(ii) $x_n y_n \to LH$, provided that $LH$ is not an indeterminate form (1.25), of the type
$\pm\infty \cdot 0$ or $0 \cdot (\pm\infty)$
Proof (i) Let $x_n \to L$ and $y_n \to H$, with $L, H \in \mathbb{R}$. This means that, for every $\varepsilon > 0$, there
exist $n_1$ and $n_2$ such that
$x_n + y_n > K + L - \varepsilon$
and since $K + L - \varepsilon > 0$ is arbitrary, it follows that $x_n + y_n \to +\infty$. The other cases of
infinite limit are treated in an analogous way.
(ii) Let $x_n \to L$ and $y_n \to H$, with $L, H \in \mathbb{R}$. This means that, for every $\varepsilon > 0$, there
exist $n_1$ and $n_2$ such that
Moreover, being convergent, $\{y_n\}$ is bounded (recall Proposition 284): there exists $b > 0$
such that $|y_n| \le b$ for every $n$. Now, for $n \ge n_3 = \max\{n_1, n_2\}$,
$|x_n y_n - LH| = |y_n (x_n - L) + L (y_n - H)| \le |y_n| |x_n - L| + |L| |y_n - H| < \varepsilon (b + |L|)$
one also has, for every $K > 0$, $y_n > K$ for $n \ge n_2$. It follows that, for $n \ge n_3 = \max\{n_1, n_2\}$,
$x_n y_n > (L - \varepsilon) K$
and, by the arbitrariness of $(L - \varepsilon) K > 0$, we conclude that $x_n y_n \to +\infty$. If $L < 0$ and
$H = +\infty$, $x_n y_n < (L + \varepsilon) K$ and therefore $x_n y_n \to -\infty$. The other cases of infinite limits
are treated in an analogous way.
The following result shows that the case $a/0$ of assertion (iii) with $a \ne 0$ is not an
indeterminacy for the algebra of limits, although it is so for the extended real line (as
indicated in Section 1.7).
We have therefore also the following indeterminate forms for the limits: $1^{\pm\infty}$, which we
will often denote by writing $1^\infty$; $0^0$; and $(+\infty)^0$, which we will often denote by writing $\infty^0$.
$\lim n = +\infty$
$\lim \frac{1}{n} = 0$
since $0 < 1/n < \varepsilon$ for every $n \ge [1/\varepsilon] + 1$.
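As we have anticipated, from these two elementary limits we deduce, using Propositions
297 and 302, many other ones:
(iii) we have:
$\lim \alpha^n = \begin{cases} +\infty & \text{if } \alpha > 1 \\ 1 & \text{if } \alpha = 1 \\ 0^+ & \text{if } 0 < \alpha < 1 \end{cases}$
$\lim \log_\alpha n = \begin{cases} +\infty & \text{if } \alpha > 1 \\ -\infty & \text{if } 0 < \alpha < 1 \end{cases}$
and many other limits. For example, we have
$\lim (5n^7 + n^2 + 1) = +\infty + \infty + 1 = +\infty$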
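as well as
$\lim (n^2 - 3n + 1) = \lim n^2 \left(1 - \frac{3}{n} + \frac{1}{n^2}\right) = +\infty \cdot (1 - 0 + 0) = +\infty$
$\lim \frac{n^2 - 5n - 7}{2n^2 + 4n + 6} = \lim \frac{n^2 \left(1 - \frac{5}{n} - \frac{7}{n^2}\right)}{n^2 \left(2 + \frac{4}{n} + \frac{6}{n^2}\right)} = \frac{1 - 0 - 0}{2 + 0 + 0} = \frac{1}{2}$
$\lim \frac{5 - \frac{1}{n}}{2n^2} = [0 \cdot (5 - 0)] = 0$
and
$\lim \frac{n (n + 1) (n + 2)}{(2n - 1)(3n - 2)(5n - 4)} = \lim \frac{n \cdot n \left(1 + \frac{1}{n}\right) \cdot n \left(1 + \frac{2}{n}\right)}{2n \left(1 - \frac{1}{2n}\right) \cdot 3n \left(1 - \frac{2}{3n}\right) \cdot 5n \left(1 - \frac{4}{5n}\right)}$
$= \lim \frac{\left(1 + \frac{1}{n}\right)\left(1 + \frac{2}{n}\right)}{30 \left(1 - \frac{1}{2n}\right)\left(1 - \frac{2}{3n}\right)\left(1 - \frac{4}{5n}\right)}$
$= \frac{1 \cdot 1}{30 \cdot 1 \cdot 1 \cdot 1} = \frac{1}{30}$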
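A numerical evaluation at large $n$ cannot prove these limits, but it offers a quick sanity check on the algebra. The sketch below evaluates the two rational expressions exactly and prints their decimal values; the function names are ours.

```python
from fractions import Fraction


def quadratic_ratio(n):
    """(n^2 - 5n - 7) / (2n^2 + 4n + 6), computed exactly."""
    return Fraction(n * n - 5 * n - 7, 2 * n * n + 4 * n + 6)


def cubic_ratio(n):
    """n(n+1)(n+2) / ((2n-1)(3n-2)(5n-4)), computed exactly."""
    return Fraction(n * (n + 1) * (n + 2), (2 * n - 1) * (3 * n - 2) * (5 * n - 4))


for n in (10, 10 ** 3, 10 ** 6):
    print(n, float(quadratic_ratio(n)), float(cubic_ratio(n)))
```

As $n$ grows, the printed values approach $1/2$ and $1/30$ respectively.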
Indeterminate form $\infty - \infty$
Let us consider the indeterminate form $\infty - \infty$. For example, the limit of the sum $x_n + y_n$ of
the sequences $x_n = n$ and $y_n = -n^2$ falls under this form of indeterminacy, so one cannot
resort to the previous results. We have, however,
$x_n + y_n = n - n^2 = n (1 - n)$
so that, since $n \to +\infty$ and $1 - n \to -\infty$, $\lim (x_n + y_n) = -\infty$.
Now take $x_n = n^2$ and $y_n = -n$. Also in this case, the limit of the sum $x_n + y_n$ falls
under the indeterminacy $\infty - \infty$. Proceeding as we have just done, this time we obtain
$\lim (x_n + y_n) = \lim n (n - 1) = \lim n \cdot \lim (n - 1) = +\infty$
Next, take $x_n = n$ and $y_n = \frac{1}{n} - n$, again of type $\infty - \infty$. Here again a simple manipulation
allows us to find a way out:
$\lim (x_n + y_n) = \lim \left(n + \frac{1}{n} - n\right) = \lim \frac{1}{n} = 0$
Finally, take $x_n = n^2 + (-1)^n n$ and $y_n = -n^2$, which is again of type $\infty - \infty$ since $x_n \to +\infty$,
because $x_n \ge n^2 - n = n (n - 1)$. Now
$\lim (x_n + y_n) = \lim (-1)^n n$
does not exist.
Therefore, when we have a case of type $\infty - \infty$, it can happen that the limit under
consideration is either $+\infty$ or $-\infty$ or finite or nonexistent. In other words, anything can
happen. The simple observation that the case at hand is of type $\infty - \infty$ does not allow us to
say anything about the limit of the sum.16 In the case $\infty - \infty$ we really have to look carefully
at the two sequences and, each time, find a way, which is very often simple, to
avoid the indeterminacy (as we have seen in the small examples discussed above). The same
can be said for the other indeterminate forms.
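The four behaviors of the form $\infty - \infty$ discussed above can be observed numerically. The sketch evaluates $x_n + y_n$ for the four pairs at a few values of $n$; it illustrates the trends without proving them, and the function name is ours.

```python
from fractions import Fraction


def sums(n):
    """x_n + y_n for the four pairs discussed above, in the same order."""
    return [
        n + (-n * n),                     # x_n = n, y_n = -n^2: tends to -infinity
        n * n + (-n),                     # x_n = n^2, y_n = -n: tends to +infinity
        n + (Fraction(1, n) - n),         # x_n = n, y_n = 1/n - n: tends to 0
        (n * n + (-1) ** n * n) - n * n,  # x_n = n^2 + (-1)^n n, y_n = -n^2: no limit
    ]


for n in (10, 11, 1000, 1001):
    print(n, sums(n))
```

The last column alternates in sign while growing in absolute value, which is why no limit exists in that case.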
Indeterminate form $0 \cdot \infty$
Let, for example, $x_n = 1/n$ and $y_n = n^3$. The limit of their product has the form $0 \cdot \infty$, and
therefore we cannot let ourselves be guided by the previous results. We have, however,
$\lim x_n y_n = \lim \frac{1}{n} n^3 = \lim n^2 = +\infty$
If $x_n = \frac{1}{n^3}$ and $y_n = n$, then
$\lim x_n y_n = \lim \frac{1}{n^3} n = \lim \frac{1}{n^2} = 0$
If $x_n = n^3$ and $y_n = 7/n^3$, then
$\lim x_n y_n = \lim n^3 \frac{7}{n^3} = \lim 7 = 7$
If $x_n = 1/n$ and $y_n = n(\cos n + 2)$,17 then
$\lim x_n y_n = \lim (\cos n + 2)$
does not exist.
Again, only the direct calculation of the limit can determine its value: the indeterminate
form can give completely different results.
16 If instead it were a form of type $\infty + a$, even without knowing how the two sequences are defined, we
would have been able to say that the limit of their sum is $\infty$.
17 Using the comparison criterion, which we will study soon (Theorem 303), it is possible to prove easily that
$y_n \to +\infty$.
200 CHAPTER 8. SEQUENCES
\[ \lim \frac{x_n}{y_n} = \lim \frac{n}{n^2} = \lim \frac{1}{n} = 0 \]
On the other hand, exchanging $x_n$ with $y_n$, the indeterminacy $\infty/\infty$ remains, but
\[ \lim \frac{y_n}{x_n} = \lim \frac{n^2}{n} = \lim n = +\infty \]
\[ \lim \frac{x_n}{y_n} = \lim \frac{n^2}{1 + 2n^2} = \lim \frac{1}{\frac{1}{n^2} + 2} = \frac{1}{2} \]
Observe the simple relation between the indeterminacies $\infty/\infty$ and $0/0$: if the limit of the quotient of the sequences $\{x_n\}$ and $\{y_n\}$ falls under the indeterminate form $\infty/\infty$, the limit of the quotient of the sequences $\{1/x_n\}$ and $\{1/y_n\}$ falls under the indeterminate form $0/0$, and vice versa.
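\[
\begin{array}{c|ccc}
\text{sum} & +\infty & L & -\infty \\ \hline
+\infty & +\infty & +\infty & ?? \\
H & +\infty & L + H & -\infty \\
-\infty & ?? & -\infty & -\infty
\end{array}
\]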
We have two indeterminate cases out of nine. We pass to the product: the inner cells give the result for the limit of $\{x_n y_n\}$:
where there are four indeterminate cases out of twenty-five. Finally, for the quotient we have the following table, where the inner cells give the result for the limit of $\{x_n / y_n\}$:
where, in the light of Proposition 300, in the third row we have assumed that $y_n$ tends to 0 from above ($y_n \to 0^+$) or from below ($y_n \to 0^-$). In turn, this determines the sign of the infinity; for example,
\[ \lim \frac{1}{\frac{1}{n}} = \lim n = +\infty \quad \text{and} \quad \lim \frac{1}{-\frac{1}{n}} = \lim (-n) = -\infty \]
O.R. The case $0^{+\infty}$ is not an indeterminacy. It is obviously an abbreviation for $\lim x_n^{y_n}$, where the base is a sequence (positive, otherwise the power is not defined!) tending to 0 (more precisely to $0^+$) and the exponent is a divergent sequence. We can set without difficulty $0^{+\infty} = 0$: the idea is to multiply 0 by itself "infinitely many times", and we get a zero as large as a palace (a "very big zero", as a famous professor used to say). The form $0^{-\infty}$ is reciprocal to the previous one and therefore $0^{-\infty} = +\infty$.
(i) If $x_n, y_n \to \infty$, their ratio $x_n / y_n$ appears in the form $\infty/\infty$, but it is sufficient to write the ratio as
\[ x_n \cdot \frac{1}{y_n} \]
to get the form $0 \cdot \infty$.
(ii) If $x_n, y_n \to 0$, their ratio $x_n / y_n$ appears in the form $0/0$, but it is sufficient to write the ratio as
\[ x_n \cdot \frac{1}{y_n} \]
to get the form $0 \cdot \infty$.
(iv) For the last three cases it is sufficient to consider the logarithm to place oneself in the case $0 \cdot \infty$.
The number $e$, which we will meet shortly, represents the limit of an indeterminate form (the most valuable) of the type $1^\infty$.
The reader can try to bring back all the forms of indeterminacy to $0/0$ or to $\infty/\infty$.
We start with the classical comparison criterion: when two sequences converge to the
same limit, the same is true for any sequence whose terms are “sandwiched” between those
of the two original sequences.20
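Theorem 303 (Comparison criterion) Let $\{x_n\}$, $\{y_n\}$, and $\{z_n\}$ be three sequences. If, eventually,
\[ y_n \le x_n \le z_n \tag{8.21} \]
and
\[ \lim y_n = \lim z_n = L \in \mathbb{R} \tag{8.22} \]
then
\[ \lim x_n = L \]
Proof Let $\varepsilon > 0$. From (8.22) it follows, by Definition 265, that there exists $n_1$ such that $y_n \in B_\varepsilon (L)$ for every $n \ge n_1$, and there exists $n_2$ such that $z_n \in B_\varepsilon (L)$ for every $n \ge n_2$. Finally, we call $n_3$ the index starting from which one has $y_n \le x_n \le z_n$. Setting $\bar{n} = \max \{n_1, n_2, n_3\}$, we have $y_n \in B_\varepsilon (L)$, $z_n \in B_\varepsilon (L)$, and $y_n \le x_n \le z_n$ for every $n \ge \bar{n}$, and therefore
\[ L - \varepsilon < y_n \le x_n \le z_n < L + \varepsilon \qquad \forall n \ge \bar{n} \]
that is, $x_n \in B_\varepsilon (L)$ for every $n \ge \bar{n}$. Hence, $x_n \to L$ as claimed.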
The typical use of the result is in proving the convergence of a sequence by showing that
it can be “trapped” between two suitable convergent sequences.
Example 304 Consider the sequence $x_n = n^{-2} \sin^2 n$. Since $-1 \le \sin n \le 1$ for every $n \in \mathbb{N}$, we have $0 \le \sin^2 n \le 1$ for every $n \ge 1$ and therefore
\[ 0 \le \frac{\sin^2 n}{n^2} \le \frac{1}{n^2} \qquad \forall n \ge 1 \]
If we consider the sequences with $y_n = 0$ and $z_n = 1/n^2$, conditions (8.21) and (8.22) with $L = 0$ are satisfied. By the comparison criterion, $\lim x_n = 0$.
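The trapping argument of Example 304 can also be observed numerically. The following is only an illustrative sketch (the helper name squeeze_terms is ours, not the text's): it evaluates the lower bound, the sequence, and the upper bound for a few values of n.

```python
import math

def squeeze_terms(n):
    """Return (y_n, x_n, z_n) with y_n = 0, x_n = sin(n)^2 / n^2, z_n = 1 / n^2."""
    lower = 0.0
    middle = math.sin(n) ** 2 / n ** 2
    upper = 1.0 / n ** 2
    return lower, middle, upper

for n in (1, 10, 100, 1000, 10_000):
    lower, middle, upper = squeeze_terms(n)
    # x_n is trapped between two sequences that both tend to 0.
    print(f"n={n:>6}  y_n={lower:.1e}  x_n={middle:.3e}  z_n={upper:.3e}")
```

A finite table of values proves nothing about the limit, of course; it only illustrates how the two bounds force the middle sequence toward 0.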
The next two simple and useful theorems introduce some analytical tools that will also be used for the convergence of series, as we will see in the next chapter.
20
In Italy, the theorem is sometimes called “the two carabinieri (policemen) theorem”. Indeed, if convict
fxn g is escorted by the two policemen fyn g and fzn g (one on each “side”), then he is forced to go wherever
they go.
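Theorem 306 (Ratio criterion) If there exists a number $q < 1$ such that, eventually,
\[ \left| \frac{x_{n+1}}{x_n} \right| \le q \tag{8.23} \]
then the sequence $\{x_n\}$ converges to 0.
Proof Suppose that the inequality holds starting from $n = 1$: if it held only from a certain $n$ onwards, just recall that eliminating a finite number of terms does not alter the limit. From
\[ \left| \frac{x_{n+1}}{x_n} \right| \le q \]
it follows, by iteration, that $|x_n| \le q^{n-1} |x_1|$, that is,
\[ -q^{n-1} |x_1| \le x_n \le q^{n-1} |x_1| \qquad \forall n \ge 2 \]
Since $0 < q < 1$, we have $q^{n-1} \to 0$ and the result then follows from the comparison criterion.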
Note that the theorem does not simply require that the ratio $|x_{n+1}/x_n|$ be $< 1$, that is,
\[ \left| \frac{x_{n+1}}{x_n} \right| < 1 \]
but that it is "far from it", i.e., that it is smaller than a number $q$ which, in turn, is itself lower than 1. The next example clarifies this observation.
Example 307 The sequence $x_n = (-1)^n \left( 1 + \frac{1}{n} \right)$ does not converge (indeed, the subsequence extracted from it with the even indices tends to $+1$, whereas that with the odd indices tends to $-1$), although
\[ \left| \frac{x_{n+1}}{x_n} \right| = \frac{1 + \frac{1}{n+1}}{1 + \frac{1}{n}} = \frac{n^2 + 2n}{n^2 + 2n + 1} < 1 \]
for every $n \ge 1$.
Note that the property (8.23) clearly holds if the ratio $|x_{n+1}/x_n|$ has a limit and the latter is strictly smaller than 1, that is,
\[ \lim \left| \frac{x_{n+1}}{x_n} \right| < 1 \tag{8.24} \]
Indeed, let us denote by $L$ this limit and let $\varepsilon > 0$ be such that $L + \varepsilon < 1$. By the definition of limit, eventually we have
\[ \left| \left| \frac{x_{n+1}}{x_n} \right| - L \right| < \varepsilon \]
8.9. CONVERGENCE CRITERIA 205
that is, $L - \varepsilon < |x_{n+1}/x_n| < L + \varepsilon$. Therefore, setting $q = L + \varepsilon$, it follows that eventually $|x_{n+1}/x_n| < q$, which is property (8.23).
Indeed, the limit form (8.24) is the most common way in which the ratio criterion is applied, as we will soon see in some examples.
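\[ \lim \frac{n^k}{\alpha^n} = 0 \tag{8.25} \]
Indeed, by setting
\[ x_n = \frac{n^k}{\alpha^n} \]
and by taking the ratio of two consecutive terms (the absolute value is irrelevant given that here all the terms are positive), we have
\[ \frac{x_{n+1}}{x_n} = \frac{(n+1)^k}{\alpha^{n+1}} \cdot \frac{\alpha^n}{n^k} = \frac{1}{\alpha} \left( \frac{n+1}{n} \right)^k = \frac{1}{\alpha} \left( 1 + \frac{1}{n} \right)^k \to \frac{1}{\alpha} < 1 \]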
\[ \lim \frac{\log^k n}{n} = \lim \left( \frac{\log n}{n^{1/k}} \right)^k = 0 \]
What precedes indicates a precise hierarchy among the following classes of divergent sequences:
\[ \alpha^n \text{ with } \alpha > 1; \qquad n^k \text{ with } k > 0; \qquad \log^k n \text{ with } k > 0 \]
The "strongest" are the exponentials, graded according to the base $\alpha$, then the powers follow, graded according to the exponent $k$, and, finally, the logarithms, graded according to the exponent $k$.
For example,
\[ 5^n - 6 \cdot 2^n - n^{123} + 7 n^{87} - n^{36} \log n \to +\infty \]
inheriting the behavior of $5^n$, and
\[ \frac{n^4 - 3n^3 + 6n^2 - 4}{5n^4 + 7n^3 + 25n^2 + 342} \to \frac{1}{5} \]
because the numerator inherits the behavior of $n^4$ and the denominator that of $5n^4$.
Soon, in Section 8.12, we will make rigorous these observations on limits based on the rate of convergence (or divergence).
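The hierarchy among exponentials, powers, and logarithms can be glimpsed numerically. The sketch below is purely illustrative: the function names and the sample values k = 5, k = 3, and base 1.5 are our own choices, and the first ratio is computed through logarithms only to avoid overflow.

```python
import math

def power_over_exponential(n, k, alpha):
    """n^k / alpha^n, computed as exp(k log n - n log alpha) to avoid overflow."""
    return math.exp(k * math.log(n) - n * math.log(alpha))

def log_power_over_n(n, k):
    """(log n)^k / n."""
    return math.log(n) ** k / n

for n in (10, 100, 1_000, 10_000, 1_000_000):
    # Both ratios shrink toward 0: the exponential beats the power,
    # and the power beats the logarithm.
    print(n, power_over_exponential(n, k=5, alpha=1.5), log_power_over_n(n, k=3))
```

Note that the ratios need not decrease from the very first terms: the hierarchy is an asymptotic statement, so small values of n can be misleading.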
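Theorem 308 (Root criterion) If there exists a number $q < 1$ such that, at least eventually,
\[ \sqrt[n]{|x_n|} \le q \tag{8.26} \]
then the sequence $\{x_n\}$ converges to 0.
The strict inequality $q < 1$ is key: the constant sequence $x_n = 1$ does not converge to 0, although $\sqrt[n]{|x_n|} \le 1$ for every $n$.
Proof From (8.26) we immediately get that $|x_n| \le q^n$, i.e., that $-q^n \le x_n \le q^n$. Since $0 < q < 1$, we have $q^n \to 0$ and the result follows from the comparison criterion.
As for the ratio criterion, property (8.26) holds whenever the limit of $\sqrt[n]{|x_n|}$ exists and
\[ \lim \sqrt[n]{|x_n|} < 1 \tag{8.27} \]
This limit form is the most common in which the criterion is applied.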
The next simple example shows that both the ratio and the root criteria are sufficient, yet not necessary, conditions for convergence. However useful, they cannot always be conclusive in determining the convergence of a sequence.
Example 309 The sequence with $x_n = 1/n$ converges to zero. However, we have that
\[ \left| \frac{x_{n+1}}{x_n} \right| = \frac{n}{n+1} \to 1 \]
and so the ratio criterion is not applicable. Furthermore, $\sqrt[n]{n} \to 1$ as
\[ \log \sqrt[n]{n} = \log n^{1/n} = \frac{\log n}{n} \to 0 \]
It follows that
\[ \sqrt[n]{\frac{1}{n}} = \frac{1}{\sqrt[n]{n}} \to 1 \]
hence the root criterion is not applicable either. Neither of the two criteria can thus be useful in determining the convergence of such a simple sequence.
8.10. THE CAUCHY CONDITION 207
Lastly, note that both sequences $x_n = 1/n$ and $x_n = (-1)^n$ satisfy the condition
\[ \left| \frac{x_{n+1}}{x_n} \right| \to 1 \]
although the first one converges to zero and the second one does not converge at all. Therefore such a condition does not allow us to draw any conclusion regarding the asymptotic behavior of the sequence: it can converge or not while satisfying this condition. Similar considerations hold for the condition $\sqrt[n]{|x_n|} \to 1$: it is enough to look at the sequences $\{n\}$ and $\{1/n\}$. All this confirms the importance of the condition $< 1$ for the limit forms (8.24) and (8.27) to have conclusive power. The next well-known limit further confirms our statement.
Proposition 310 For every $k > 0$ we have that $\lim \sqrt[n]{k} = 1$.
implies that $\sqrt[n]{k} - 1 \to 0$, that is, $\lim \sqrt[n]{k} = 1$.
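Theorem 311 (Cauchy) The sequence $\{x_n\}$ is convergent if and only if for each $\varepsilon > 0$ there exists an integer $n_\varepsilon \ge 1$ such that
\[ |x_n - x_m| < \varepsilon \qquad \forall n, m \ge n_\varepsilon \tag{8.28} \]
Condition (8.28), which can be rewritten as $d (x_n, x_m) < \varepsilon$ for every $n, m \ge n_\varepsilon$, is called the Cauchy condition: its validity for every $\varepsilon > 0$ is therefore a necessary and sufficient condition for convergence. Sequences that satisfy this condition are called "Cauchy sequences".
Proof "Only if". If $x_n \to L$ then, by definition, for each $\varepsilon > 0$ there exists $n_\varepsilon \ge 1$ such that $|x_n - L| < \varepsilon$ for every $n \ge n_\varepsilon$. This implies that for every $n, m \ge n_\varepsilon$
\[ |x_n - x_m| \le |x_n - L| + |L - x_m| < 2 \varepsilon \]
Since $\varepsilon > 0$ is arbitrary, the Cauchy condition follows.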
Next, we denote by $A$ (respectively, $B$) the set of the real numbers such that the sequence is eventually strictly larger (respectively, smaller) than each of them. Formally, we define
\[ A = \{ a \in \mathbb{R} : \exists n_a \in \mathbb{N} \text{ such that } a < x_n \ \forall n \ge n_a \} \]
and
\[ B = \{ b \in \mathbb{R} : \exists n_b \in \mathbb{N} \text{ such that } b > x_n \ \forall n \ge n_b \} \]
Note that:
(i) $A$ and $B$ are not empty. Indeed, we have $x_{n_\varepsilon} - \varepsilon \in A$ and $x_{n_\varepsilon} + \varepsilon \in B$.
(iii) $\sup A = \inf B$. Indeed, by the Least Upper Bound Principle and the previous two points, $\sup A$ and $\inf B$ are well defined and such that $\sup A \le \inf B$. Since, by point (i), $x_{n_\varepsilon} - \varepsilon \in A$ and $x_{n_\varepsilon} + \varepsilon \in B$, we have $x_{n_\varepsilon} - \varepsilon \le \sup A \le \inf B \le x_{n_\varepsilon} + \varepsilon$, and, in particular, $|\inf B - \sup A| \le 2 \varepsilon$. Since $\varepsilon$ can be chosen to be arbitrarily small, we have $|\inf B - \sup A| = 0$, that is, $\inf B = \sup A$.
Let us call $z$ the common value of $\sup A$ and $\inf B$. We claim that $x_n \to z$. Indeed, by fixing arbitrarily a number $\eta > 0$, there exist $a \in A$ and $b \in B$ such that $0 \le b - a < \eta$ and therefore (since $a \le z \le b$ and hence $z - \eta < a$ and $b < z + \eta$)
\[ z - \eta < a < b < z + \eta \]
But, by definition of $A$ and $B$, the sequence is eventually strictly larger than $a$ and eventually strictly smaller than $b$ and hence the sequence is such that eventually
\[ z - \eta < x_n < z + \eta \]
which, due to the arbitrary choice of $\eta$, shows that $z$ is the limit of the sequence $\{x_n\}$.
8.11. NAPIER’S CONSTANT 209
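Example 312 The sequence with generic term $x_n = 1/n$ is a Cauchy sequence. Indeed, let $\varepsilon > 0$. We have to show that there exists $n_\varepsilon \ge 1$ such that for every $n, m \ge n_\varepsilon$ one has $|x_n - x_m| < \varepsilon$. Without loss of generality, assume that $n \ge m$. Note that for $n \ge m$ we have $|x_n - x_m| = \frac{1}{m} - \frac{1}{n}$. Clearly, for every $n \ge m$, $0 \le \frac{1}{m} - \frac{1}{n} < \frac{1}{m}$. Since $\frac{1}{m} < \varepsilon$ is the same as $m > \frac{1}{\varepsilon}$, by choosing $n_\varepsilon = \left[ \frac{1}{\varepsilon} \right] + 1$, we have $|x_n - x_m| < \varepsilon$ for every $n \ge m \ge n_\varepsilon$, proving that $x_n = 1/n$ is a Cauchy sequence.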
Example 313 The sequence with generic term $x_n = \log n$ is not a Cauchy sequence. Let us suppose, by contradiction, that for fixed $\varepsilon > 0$ there exists $n_\varepsilon \ge 1$ such that for every $n, m \ge n_\varepsilon$ we have $|x_n - x_m| < \varepsilon$. First, note that if $n = m + h$ with $h \in \mathbb{N}$, we have $|x_n - x_m| = \log \frac{m+h}{m} < \varepsilon$ if and only if $h < m (e^\varepsilon - 1)$. Thus, by choosing $h = [m (e^\varepsilon - 1)] + 1$ and $m \ge n_\varepsilon$, we obtain $|x_n - x_m| = \log \frac{m+h}{m} \ge \varepsilon$, which contradicts (since $n, m \ge n_\varepsilon$) $|x_n - x_m| < \varepsilon$. Therefore, $x_n = \log n$ is not a Cauchy sequence.
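The two examples can be contrasted with a small numerical experiment. This is only a sketch under our own naming (witness_gap and harmonic_gap are hypothetical helpers): for log n it builds the index shift h used in Example 313, for 1/n it evaluates the gap estimated in Example 312.

```python
import math

def witness_gap(m, eps):
    """For x_n = log n: return h = floor(m (e^eps - 1)) + 1 and the gap |x_{m+h} - x_m|."""
    h = math.floor(m * (math.exp(eps) - 1)) + 1
    return h, math.log((m + h) / m)

def harmonic_gap(m, n):
    """For x_n = 1/n with n >= m: the gap |x_n - x_m| = 1/m - 1/n."""
    return 1 / m - 1 / n

eps = 0.1
for m in (1, 10, 1000, 10**6):
    h, gap = witness_gap(m, eps)
    # However large m is, some later term is at least eps away from x_m.
    print(f"log n:  m={m:>8}  h={h:>7}  gap={gap:.6f}")

n_eps = math.floor(1 / eps) + 1
# For 1/n, every pair of indices beyond n_eps is closer than eps.
print("1/n:   ", harmonic_gap(n_eps, 10**9), "<", eps)
```

The design mirrors the text: for log n a single "bad" pair of indices beyond any threshold is enough to violate the Cauchy condition, whereas for 1/n the bound 1/m controls all pairs at once.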
O.R. The previous theorem, sometimes called the general criterion of convergence, does state a fundamental property of convergent sequences, yet its relevance is due to the structural property it isolates. This property of the set of real numbers is called completeness (and is of great conceptual importance).
For example, let us assume (as was the case for Pythagoras) that we only knew the rational numbers: the space on which we operate is therefore $\mathbb{Q}$. Consider the sequence whose elements (all rationals) are the decimal approximations of $\pi$. The distance between any two of its terms of sufficiently high index can be made arbitrarily small, so the sequence satisfies the Cauchy condition. The sequence, however, does not converge (recall that we are acting as if we were only aware of $\mathbb{Q}$, and not of $\mathbb{R}$) to any point of $\mathbb{Q}$: if we knew $\mathbb{R}$, we could say that it converges to $\pi$ (and it could not converge to anything else since the limit is unique). Therefore, in $\mathbb{Q}$ the Cauchy condition is necessary, but not sufficient, for convergence.
Reiterating, there are spaces, such as $\mathbb{R}$, in which the Cauchy condition is sufficient for convergence and others, such as $\mathbb{Q}$, in which it is not. The first ones are called complete and the second ones incomplete. Without going into details, $\mathbb{R}$ is the "completion" of $\mathbb{Q}$ in that it is the smallest complete space that contains $\mathbb{Q}$. In other words, $\mathbb{Q}$ is efficiently enriched with all the "limits" (which do not exist in $\mathbb{Q}$) of the sequences that satisfy the Cauchy condition, obtaining in this way the complete space $\mathbb{R}$.
We will prove soon that it converges to a number (irrational, and actually transcendental)21 that is usually denoted by $e$ and is equal to $2.71828...$
$e$ is the most natural base of the logarithms and as such it acquires remarkable properties. From now on we will take without exception $e$ as the base of the logarithms and very often as the base of powers.
Theorem 314 The sequence (8.29) is convergent. Its limit is denoted by $e$ and it is called Napier's constant.
Since the sequence involves powers, the root criterion is the first possibility to consider. Unfortunately,
\[ \sqrt[n]{\left( 1 + \frac{1}{n} \right)^n} = 1 + \frac{1}{n} \to 1 \]
and therefore this criterion cannot be applied. The proof is based, instead, on the following classical inequality.
Proof The proof is done by induction. Inequality (8.30) holds for $n = 2$. Indeed, for each $a \ne 0$ we have:
\[ (1 + a)^2 = 1 + 2a + a^2 > 1 + 2a \]
Suppose now that (8.30) holds for some $n \ge 2$, i.e.,
\[ (1 + a)^n > 1 + an \]
Then
\[ (1 + a)^{n+1} = (1 + a) (1 + a)^n > (1 + a) (1 + an) = 1 + a (n + 1) + a^2 n > 1 + a (n + 1) \]
where the first inequality, due to the induction hypothesis, holds only for $a > -1$. This completes the induction step.
We proceed by steps.
21. A non-rational number is called algebraic if it is a root of some polynomial equation with integer coefficients: for example, $\sqrt{2}$ is algebraic because it is a root of the equation $x^2 - 2 = 0$. Non-algebraic irrational numbers are called transcendental.
22. For $n = 1$ the equality holds trivially.
and, using the inequality $(1 + a)^n > 1 + an$ proved in Lemma 315,23 we see that
\[ \left( 1 + \frac{1}{n^2 - 1} \right)^n > 1 + \frac{n}{n^2 - 1} > 1 + \frac{n}{n^2} = 1 + \frac{1}{n} \]
\[ 0 < b_n - a_n = \frac{b_n}{n + 1} < \frac{b_1}{n + 1} \to 0 \]
Since the two sequences are strictly monotonic, convergent, and such that their difference tends to 0, we conclude that $\sup a_n = \inf b_n = \lim a_n = \lim b_n$.
One obtains
\[
\begin{array}{ll}
a_1 = 2^1 = 2 & b_1 = 2^2 = 4 \\
a_2 = \left( \frac{3}{2} \right)^2 = \frac{9}{4} = 2.25 & b_2 = \left( \frac{3}{2} \right)^3 = \frac{27}{8} = 3.375 \\
a_{10} = \left( \frac{11}{10} \right)^{10} \simeq 2.59 & b_{10} = \left( \frac{11}{10} \right)^{11} \simeq 2.85
\end{array}
\]
and therefore Napier's constant lies between 2.59 and 2.85. As we have already indicated, the number $e$ is transcendental and is equal to $2.71828...$.
23. Note that $-1 < \frac{1}{n^2 - 1} \ne 0$ for $n \ge 2$.
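The bracketing of e between the two sequences can be reproduced numerically. In this sketch we assume, consistently with the values tabulated above, that a_n = (1 + 1/n)^n and b_n = (1 + 1/n)^(n+1); the function names are ours.

```python
import math

def lower_bound(n):
    """a_n = (1 + 1/n)^n, the increasing sequence."""
    return (1 + 1 / n) ** n

def upper_bound(n):
    """b_n = (1 + 1/n)^(n+1), the decreasing sequence."""
    return (1 + 1 / n) ** (n + 1)

for n in (1, 2, 10, 100, 10_000):
    a, b = lower_bound(n), upper_bound(n)
    # e always lies in [a_n, b_n], and the width b_n - a_n = b_n / (n + 1) shrinks.
    print(f"n={n:>6}  a_n={a:.5f}  b_n={b:.5f}  width={b - a:.5f}")

print("math.e =", math.e)
```

The slow shrinking of the width, roughly like 1/n, shows why these sequences are convenient for the proof but poor for actually computing digits of e.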
From the fundamental limit just indicated we can deduce many other limits. The limit (8.29) and the following ones (i)–(v) are other examples of fundamental limits.
For $k = 1$ the proof can be done easily by considering the integer part of $x_n$. For any $k$, it is sufficient to set $k / x_n = 1 / y_n$, so that
\[ \left( 1 + \frac{k}{x_n} \right)^{x_n} = \left( 1 + \frac{1}{y_n} \right)^{k y_n} = \left[ \left( 1 + \frac{1}{y_n} \right)^{y_n} \right]^k \to e^k \]
\[ \lim \frac{\log_b (1 + a_n)}{a_n} = \log_b e \qquad \forall \, 0 < b \ne 1 \]
\[ \lim \frac{c^{y_n} - 1}{y_n} = \log c \]
\[ \frac{c^{y_n} - 1}{y_n} = \frac{a_n}{\log_c (1 + a_n)} \]
and so we are reduced to the (reciprocal of the) previous case, in which the limit is $1 / \log_c e = \log_e c = \log c$.
8.12. ORDERS OF CONVERGENCE AND OF DIVERGENCE 213
Even though here the numerator also tends to $+\infty$, the denominator has driven the fraction to its side, forcing it to zero. Hence, the higher rate of divergence (that is, of convergence to $+\infty$) of the sequence $\{x_n\}$ reveals itself in the convergence to zero of the ratio $y_n / x_n$. The ratio seems therefore to be the natural field of comparison for the relative speed of convergence/divergence of the two sequences.
The next definition formalizes this intuition, which is important from both the conceptual and the computational points of view.
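Definition 316 Let $\{x_n\}$ and $\{y_n\}$ be two sequences, with the terms of the first one eventually different from zero.
(i) If
\[ \frac{y_n}{x_n} \to 0 \]
we say that $\{y_n\}$ is negligible with respect to $\{x_n\}$, and we write
\[ y_n = o (x_n) \]
(ii) If
\[ \frac{y_n}{x_n} \to k \ne 0 \tag{8.31} \]
we say that $\{y_n\}$ is of the same order as, or that it is comparable with, $\{x_n\}$, and we write
\[ y_n \asymp x_n \]
(iii) In particular, if $k = 1$ we say that $\{y_n\}$ and $\{x_n\}$ are asymptotic, and we write
\[ y_n \sim x_n \]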
This classification is comparative. For example, $\{y_n\}$ is negligible with respect to $\{x_n\}$: this does not mean that $\{y_n\}$ is in itself negligible, but that it becomes so when it is compared to $\{x_n\}$. The sequence $\{n^2\}$ is negligible with respect to $\{n^5\}$, but, in the absence of $\{n^5\}$, it is not negligible at all (it tends to infinity!).
Observe, in addition, that thanks to Proposition 274 we have
\[ \frac{y_n}{x_n} \to \pm\infty \iff \frac{x_n}{y_n} \to 0 \]
i.e., if and only if $x_n = o (y_n)$. Therefore, also when the ratio diverges we can use the above classification: no separate analysis is needed.
Lemma 317 Let $\{x_n\}$ and $\{y_n\}$ be two sequences with terms eventually different from zero. Then,
(i) the relation of comparability $\asymp$ (in particular $\sim$) is both symmetric, that is, $y_n \asymp x_n$ if and only if $x_n \asymp y_n$, and transitive, that is, $z_n \asymp y_n$ and $y_n \asymp x_n$ imply $z_n \asymp x_n$.
(ii) the relation of negligibility is transitive, that is, $z_n = o (y_n)$ and $y_n = o (x_n)$ imply $z_n = o (x_n)$.
We now consider the more interesting cases, where both sequences are infinitesimal or divergent. We start with two infinitesimal sequences $\{x_n\}$ and $\{y_n\}$, that is, $\lim_{n \to \infty} x_n = \lim_{n \to \infty} y_n = 0$. In this case, the negligible sequences tend faster to zero. Let us consider, for example, $x_n = 1/n$ and $y_n = 1/n^2$. Intuitively, $y_n$ goes to zero faster than $x_n$. Indeed,
\[ \frac{\frac{1}{n^2}}{\frac{1}{n}} = \frac{1}{n} \to 0 \]
N.B. Setting $x_n = n$ and $y_n = n + k$, with $k > 0$, the sequences $\{x_n\}$ and $\{y_n\}$ are asymptotic. Indeed, no matter how large $k$ is, the divergence to $+\infty$ of the two sequences will make negligible, from the asymptotic point of view, the role of $k$. Such a fundamental viewpoint, central to the theory of sequences, should not make us forget that two asymptotic sequences are, in general, very different (to fix ideas, set for example $k = 10^{10}$, i.e., 10 billion, and consider the asymptotic sequences $x_n = n$ and $y_n = n + 10^{10}$).
Proposition 318 For every pair of sequences $\{x_n\}$ and $\{y_n\}$ and for every scalar $c \ne 0$, it holds that:
(i) $o (x_n) + o (x_n) = o (x_n)$;
(ii) $o (x_n) o (y_n) = o (x_n y_n)$;
(iii) $c \cdot o (x_n) = o (x_n)$;
(iv) $o (y_n) + o (x_n) = o (x_n)$ if $y_n = o (x_n)$.
The relation $o (x_n) + o (x_n) = o (x_n)$ in (i), bizarre at first sight, simply means that the sum of two little-o of the same sequence is still a little-o of the same sequence, that is, it continues to be negligible with respect to that sequence. The analogous re-reading of the other properties in the proposition facilitates its understanding. Note that (ii) has the remarkable special case
\[ o (x_n) o (x_n) = o (x_n^2) \]
Proof If a sequence is little-o of $x_n$ it can be written as $x_n \varepsilon_n$, where $\varepsilon_n$ is an infinitesimal sequence. Indeed
\[ \lim \frac{x_n \varepsilon_n}{x_n} = \lim \varepsilon_n = 0 \]
and therefore $x_n \varepsilon_n$ is little-o of $x_n$. The proof will be based on this very useful artifice.
(i) Let us call $x_n \varepsilon_n$ the first of the two little-o to the left of the equality symbol, and $x_n \eta_n$ the second one, with $\varepsilon_n$ and $\eta_n$ two infinitesimal sequences. Then
\[ \lim \frac{x_n \varepsilon_n + x_n \eta_n}{x_n} = \lim (\varepsilon_n + \eta_n) = 0 \]
which shows that $o (x_n) + o (x_n)$ is $o (x_n)$.
(ii) Let us call $x_n \varepsilon_n$ the little-o of $x_n$ and $y_n \eta_n$ the little-o of $y_n$, with $\varepsilon_n$ and $\eta_n$ two infinitesimal sequences. Then
\[ \lim \frac{x_n \varepsilon_n \cdot y_n \eta_n}{x_n y_n} = \lim (\varepsilon_n \eta_n) = 0 \]
(iii) Let us call $x_n \varepsilon_n$ the little-o of $x_n$, with $\varepsilon_n$ an infinitesimal sequence. Then
\[ \lim \frac{c \cdot x_n \varepsilon_n}{x_n} = c \lim \varepsilon_n = 0 \]
which shows that $c \cdot o (x_n)$ is $o (x_n)$.
(iv) Let us write $y_n = x_n \varepsilon_n$, with $\varepsilon_n$ an infinitesimal sequence. Then, the little-o of $y_n$ can be written as $y_n \eta_n$, that is, $x_n \varepsilon_n \eta_n$, with $\eta_n$ an infinitesimal sequence. Moreover, we call $x_n \lambda_n$ the little-o of $x_n$, with $\lambda_n$ an infinitesimal sequence. Then
\[ \lim \frac{x_n \varepsilon_n \eta_n + x_n \lambda_n}{x_n} = \lim (\varepsilon_n \eta_n + \lambda_n) = 0 \]
so that $o (y_n) + o (x_n) = o (x_n)$.
Example 319 Let $\{x_n\}$ be the sequence with $n$-th term $x_n = n^2$. Consider the sequences with $n$-th term $y_n = n$ and $z_n = 2 (\log n - n)$. It is immediate to see that $y_n = o (x_n) = o (n^2)$ and $z_n = o (x_n) = o (n^2)$.
(i) Summing up the two sequences we obtain $y_n + z_n = 2 \log n - n$, which is still $o (n^2)$, in accordance with (i) proved above.
(ii) Multiplying the two sequences we obtain $y_n z_n = 2n \log n - 2n^2$, which is $o (n^2 \cdot n^2)$, i.e., $o (n^4)$, in accordance with (ii) proved above (in the special case $o (x_n) o (x_n)$). Note that $y_n z_n$ is not $o (n^2)$.
(iii) Take $c = 3$ and consider $c \cdot y_n = 3n$. It is immediate that $3n$ is still $o (n^2)$, in accordance with (iii) proved above.
(iv) Consider the sequence $w_n = \sqrt{n} - 1$. It is immediate that $w_n = o (y_n) = o (n)$. Consider now the sum $w_n + z_n$ (with $z_n$ defined above), which is the sum of an $o (y_n)$ and an $o (x_n)$, with $y_n = o (x_n)$. We have $w_n + z_n = \sqrt{n} - 1 + 2 \log n - 2n$, which is $o (n^2)$, i.e., $o (x_n)$, in accordance with (iv) proved above. Note that $w_n + z_n$ is not $o (y_n)$, even if $w_n$ is $o (y_n)$.
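The claims of Example 319 can be checked numerically by looking at the relevant ratios for large n. The sketch below is only indicative (a ratio evaluated at finitely many points cannot prove a limit), and the helper names are ours.

```python
import math

def x(n): return n ** 2                    # x_n = n^2
def y(n): return n                         # y_n = n
def z(n): return 2 * (math.log(n) - n)     # z_n = 2 (log n - n)
def w(n): return math.sqrt(n) - 1          # w_n = sqrt(n) - 1

def checks(n):
    """Ratios whose limits decide the little-o claims of Example 319."""
    return {
        "(y+z)/n^2": (y(n) + z(n)) / x(n),      # tends to 0: the sum is o(n^2)
        "(y*z)/n^4": (y(n) * z(n)) / n ** 4,    # tends to 0: the product is o(n^4)
        "(y*z)/n^2": (y(n) * z(n)) / x(n),      # tends to -2: the product is NOT o(n^2)
        "3y/n^2": 3 * y(n) / x(n),              # tends to 0
        "(w+z)/n^2": (w(n) + z(n)) / x(n),      # tends to 0: the sum is o(n^2)
        "(w+z)/n": (w(n) + z(n)) / y(n),        # tends to -2: the sum is NOT o(n)
    }

for n in (10**2, 10**4, 10**8):
    print(n, {name: round(value, 6) for name, value in checks(n).items()})
```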
N.B. (i) To state that a sequence is $o (1)$ simply means that it tends to 0: indeed, $x_n = o (1)$ means that $x_n / 1 = x_n \to 0$. (ii) The fourth property in the previous proposition is particularly important, since it highlights that if $y_n$ is negligible with respect to $x_n$, in the sum $o (y_n) + o (x_n)$ the little-o $o (y_n)$ is absorbed by $o (x_n)$.
\[ y_n \to L \iff x_n \to L \tag{8.33} \]
In detail:
All this suggests that it is possible to replace $x_n$ by $y_n$ (or vice versa) in the calculation of limits. Such a possibility is attractive because it would allow us to replace a complicated sequence by a simpler one that is asymptotic to it.
To make this intuition precise we start by observing that asymptotic equivalence is preserved under the fundamental operations.
\[ \left| \frac{x_n}{x_n + w_n} \right| \le k \]
(ii) $y_n z_n \sim x_n w_n$;
Note that for sums, differently from the case of products and ratios, the result does not hold in general, but only under a significant ad hoc hypothesis.
For this reason assertions (ii) and (iii) are the most interesting, and in the sequel we will concentrate on the asymptotic equivalence of products and ratios, leaving to the reader the study of sums.
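Proof (i) We have
\[ \frac{y_n + z_n}{x_n + w_n} = \frac{y_n}{x_n + w_n} + \frac{z_n}{x_n + w_n} = \frac{y_n}{x_n} \frac{x_n}{x_n + w_n} + \frac{z_n}{w_n} \frac{w_n}{x_n + w_n} \]
\[ = \frac{y_n}{x_n} \frac{x_n}{x_n + w_n} + \frac{z_n}{w_n} \left( 1 - \frac{x_n}{x_n + w_n} \right) = \left( \frac{y_n}{x_n} - \frac{z_n}{w_n} \right) \frac{x_n}{x_n + w_n} + \frac{z_n}{w_n} \]
Since $y_n / x_n \to 1$ and $z_n / w_n \to 1$, we have
\[ \frac{y_n}{x_n} - \frac{z_n}{w_n} \to 0 \]
hence
\[ 0 \le \left| \left( \frac{y_n}{x_n} - \frac{z_n}{w_n} \right) \frac{x_n}{x_n + w_n} \right| = \left| \frac{y_n}{x_n} - \frac{z_n}{w_n} \right| \left| \frac{x_n}{x_n + w_n} \right| \le k \left| \frac{y_n}{x_n} - \frac{z_n}{w_n} \right| \to 0 \]
By the comparison criterion,
\[ \left( \frac{y_n}{x_n} - \frac{z_n}{w_n} \right) \frac{x_n}{x_n + w_n} \to 0 \]
and hence, since $z_n / w_n \to 1$, we have
\[ \frac{y_n + z_n}{x_n + w_n} \to 1 \]
as desired.
(ii) and (iii) We have
\[ \frac{y_n z_n}{x_n w_n} = \frac{y_n}{x_n} \frac{z_n}{w_n} \to 1 \]
and
\[ \frac{\frac{y_n}{z_n}}{\frac{x_n}{w_n}} = \frac{y_n w_n}{z_n x_n} = \frac{y_n}{x_n} \frac{w_n}{z_n} \to 1 \]
since $y_n / x_n \to 1$ and $z_n / w_n \to 1$.
24. For example, the condition holds if $\{x_n\}$ and $\{w_n\}$ are both eventually positive.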
The next simple lemma is very useful: in the calculation of a limit it is good to neglect what is negligible.
\[ x_n + o (x_n) \to L \iff x_n \to L \]
What is negligible with respect to the sequence $\{x_n\}$, i.e., what is $o (x_n)$, is asymptotically irrelevant and one can safely ignore it. Together with Lemma 320, this implies, for products and ratios, that
\[ (x_n + o (x_n)) (y_n + o (y_n)) \sim x_n y_n \tag{8.34} \]
and
\[ \frac{x_n + o (x_n)}{y_n + o (y_n)} \sim \frac{x_n}{y_n} \tag{8.35} \]
We illustrate these very useful asymptotic equivalences with some examples, which we invite the reader to study with particular attention.
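\[ \lim \frac{n^4 - 3n^3 + 5n^2 - 7}{2n^5 + 12n^4 - 6n^3 + 4n + 1} \]
\[ \frac{n^4 - 3n^3 + 5n^2 - 7}{2n^5 + 12n^4 - 6n^3 + 4n + 1} = \frac{n^4 + o (n^4)}{2n^5 + o (n^5)} \sim \frac{n^4}{2n^5} = \frac{1}{2n} \to 0 \]
\[ \lim \left( n^2 - 7n + 3 \right) \left( 2 + \frac{1}{n} - \frac{3}{n^2} \right) \]
By (8.34),25
\[ \left( n^2 - 7n + 3 \right) \left( 2 + \frac{1}{n} - \frac{3}{n^2} \right) = \left( n^2 + o (n^2) \right) (2 + o (1)) \sim 2n^2 \to +\infty \]
\[ \lim \frac{n (n + 1) (n + 2) (n + 3)}{(n - 1) (n - 2) (n - 3) (n - 4)} \]
By (8.35),
\[ \frac{n (n + 1) (n + 2) (n + 3)}{(n - 1) (n - 2) (n - 3) (n - 4)} = \frac{n^4 + o (n^4)}{n^4 + o (n^4)} \sim \frac{n^4}{n^4} = 1 \to 1 \]
\[ \lim e^{-n} \left( 7 + \frac{1}{n} \right) \]
By (8.34),
\[ e^{-n} \left( 7 + \frac{1}{n} \right) = e^{-n} (7 + o (1)) \sim 7 e^{-n} \to 0 \]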
By (8.32), we have
\[ \frac{y_n}{z_n} \sim \frac{x_n}{w_n} \iff \frac{z_n}{y_n} \sim \frac{w_n}{x_n} \tag{8.36} \]
provided that the ratios are (eventually) well defined and not zero. Therefore, once we have established the asymptoticity of the ratios $y_n / z_n$ and $x_n / w_n$, we "automatically" have also the asymptoticity of their reciprocals $z_n / y_n$ and $w_n / x_n$.
25. For $k \in \mathbb{R}$, with $k \ne 0$, we have $k + o (1) \sim k$. Indeed,
\[ \frac{k + o (1)}{k} = 1 + \frac{1}{k} o (1) \to 1 \]
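\[ \lim \frac{e^{5n} - n^7 - 4n^2 + 3n}{6^n + n^8 - n^4 + 5n^3} \]
By (8.35),
\[ \frac{e^{5n} - n^7 - 4n^2 + 3n}{6^n + n^8 - n^4 + 5n^3} = \frac{e^{5n} + o (e^{5n})}{6^n + o (6^n)} \sim \frac{e^{5n}}{6^n} = \left( \frac{e^5}{6} \right)^n \to +\infty \]
If, instead, we consider
\[ \lim \frac{6^n + n^8 - n^4 + 5n^3}{e^{5n} - n^7 - 4n^2 + 3n} \]
then, by (8.36),
\[ \frac{6^n + n^8 - n^4 + 5n^3}{e^{5n} - n^7 - 4n^2 + 3n} \sim \left( \frac{6}{e^5} \right)^n \to 0 \]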
\[ x_n \sim y_n \iff x_n = y_n + o (y_n) \]
In other words, two sequences are asymptotic when they are equal modulo a component that is asymptotically negligible with respect to them. This result further clarifies how the relation $\sim$ can be seen as an asymptotic equality.
\[ \frac{x_n}{y_n} = \frac{y_n + o (y_n)}{y_n} = 1 + \frac{o (y_n)}{y_n} \to 1 \]
Proposition 328 Let $\{x_n\}$ be a sequence with terms eventually non-zero. Then
\[ \frac{1}{n} \log |x_n| \to k \ne 0 \tag{8.37} \]
if and only if $|x_n| = e^{kn + o (n)}$.
Proof "If." We have
\[ \frac{1}{n} \log |x_n| = \frac{1}{n} \log e^{kn + o (n)} = \frac{kn + o (n)}{n} \to k \]
"Only if." Set $z_n = \log |x_n|$. Since $k \ne 0$, from (8.37) it follows that $z_n / kn \to 1$, i.e., $z_n \sim kn$. From the previous proposition and Proposition 318-(iii) it follows that
\[ z_n = kn + o (kn) = kn + o (n) \]
so that $|x_n| = e^{z_n} = e^{kn + o (n)}$, as claimed.
When $k < 0$, the condition (8.37) characterizes the sequences that converge to zero at an exponential rate. In that case, we speak of exponential decay. When $k > 0$, there is instead an explosive exponential behavior.
8.12.5 Terminology
Due to their importance, for the comparison both of infinitesimal sequences and of divergent sequences there is a specific terminology. In particular,
(i) if two infinitesimal sequences $\{x_n\}$ and $\{y_n\}$ are such that $y_n = o (x_n)$, we say that the sequence $\{y_n\}$ is infinitesimal of higher order with respect to $\{x_n\}$;
(ii) if two divergent sequences $\{x_n\}$ and $\{y_n\}$ are such that $y_n = o (x_n)$, we say that the sequence $\{y_n\}$ is of lower order of infinity with respect to $\{x_n\}$.
In other words, a sequence is infinitesimal of higher order if it tends to zero faster, while it is of lower order of infinity if it tends to infinity slower. Besides the terminology (which is not universal), it is important to recall the idea of negligibility that lies at the basis of the relation $y_n = o (x_n)$.
(ii) $n^k = o (\alpha^n)$ for every $\alpha > 1$, as already proved with the ratio criterion. We have $\alpha^n = o (n^k)$ if, instead, $0 < \alpha < 1$ and $k > 0$.
\[ \frac{\log^{k_2} n}{\log^{k_1} n} = \frac{1}{\log^{k_1 - k_2} n} \to 0 \]
The next lemma reports two important comparisons of infinities that show that exponentials are of lower order of infinity than factorials $n!$. We omit the proof.
Lemma 329 One has that $\alpha^n = o (n!)$, with $\alpha > 0$, and $n! = o (n^n)$.
Note that this implies, by Lemma 317, that $\alpha^n = o (n^n)$. Exponentials are, therefore, of lower order of infinity also compared with sequences of the type $n^n$.
The different orders of infinity and infinitesimal are sometimes organized through scales. If we limit ourselves to the infinities (similar considerations hold for the infinitesimals), the most classical scale of infinities is the logarithmic-exponential one. Taking $x_n = n$ as the basis, we have the ascending scale
\[ n, \; n^2, \; \ldots, \; n^k, \; \ldots, \; e^n, \; e^{2n}, \; \ldots, \; e^{kn}, \; \ldots, \; e^{n^2}, \; \ldots, \; e^{n^k}, \; \ldots, \; e^{e^n}, \; \ldots \]
They give some "samples" for the asymptotic behavior of a sequence $\{x_n\}$ that tends to infinity. For example, if $x_n \sim \log n$, the sequence $\{x_n\}$ is asymptotically logarithmic; if $x_n \sim n^2$, the sequence $\{x_n\}$ is asymptotically quadratic, and so on.
Although for brevity we omit the details, Lemma 329 shows that the logarithmic-exponential scale can be remarkably refined with orders of infinity of the type $n!$ and $n^n$.
Given this, in the applications one seldom considers orders of infinity higher than $e^{e^n}$ and lower than $\log \log n$. On the other hand, $\log \log n$ has an almost imperceptible increase, it is almost constant:
\[
\begin{array}{c|cccccc}
n & 10 & 10^2 & 10^3 & 10^4 & 10^5 & 10^6 \\ \hline
\log \log n & 0.83403 & 1.5272 & 1.9326 & 2.2203 & 2.4435 & 2.6258
\end{array}
\]
while $e^{e^n}$ increases explosively:
\[
\begin{array}{c|cccc}
n & 3 & 4 & 5 & 6 \\ \hline
e^{e^n} & 5.2849 \times 10^{8} & 5.1484 \times 10^{23} & 2.8511 \times 10^{64} & 1.6103 \times 10^{175}
\end{array}
\]
The asymptotic behavior of sequences tending to infinity relevant in the applications usually ranges between the slowness of $\log \log n$ and the explosiveness of $e^{e^n}$. But, from the theoretical point of view, the study of the scales of infinity is of great elegance.
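The two tables above can be reproduced with a few lines of code. This is a minimal sketch with function names of our own choosing; it relies on the fact that e^(e^6) is still representable as a double-precision number, which would no longer be true for n = 7.

```python
import math

def log_log(n):
    """log log n, with natural logarithms."""
    return math.log(math.log(n))

def exp_exp(n):
    """e^(e^n); overflows for n >= 7 in double precision."""
    return math.exp(math.exp(n))

for exponent in range(1, 7):
    n = 10 ** exponent
    # Six orders of magnitude in n move log log n by less than 2.
    print(f"n=10^{exponent}  log log n = {log_log(n):.5f}")

for n in (3, 4, 5, 6):
    # Each unit step in n multiplies the order of magnitude enormously.
    print(f"n={n}  e^(e^n) = {exp_exp(n):.4e}")
```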
Two approximations of $\log n!$ are thus known. The first one, which De Moivre came up with, is slightly less precise as it has an error term of order $o (n)$. The second approximation was given by Stirling and is more accurate (its error term is $o (1)$) but also more complex.27
Proof We shall only show the first equality. By setting $x_n = n! / n^n$, in the proof of Lemma 329 we have seen that
\[ \lim \frac{x_{n+1}}{x_n} = \frac{1}{e} \]
From (10.18), we have also that
\[ \lim \sqrt[n]{x_n} = \lim \frac{\sqrt[n]{n!}}{n} = \frac{1}{e} \]
We can thus conclude that $n / \sqrt[n]{n!} = e (1 + o (1))$, or $n! / n^n = e^{-n} (1 + o (1))^{-n}$, that is to say
\[ n! = n^n e^{-n} (1 + o (1))^{-n} \]
One can hence conclude that $n! = n^n e^{-n} \sqrt{2 \pi n} \, e^{o (1)}$, and so
\[ \frac{n!}{n^n e^{-n} \sqrt{2 \pi n}} = e^{o (1)} \to 1 \]
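The quality of the two approximations can be explored numerically. The sketch below is illustrative only: it works with logarithms through the standard-library function math.lgamma (for which lgamma(n + 1) = log n!), and the function names are our own.

```python
import math

def stirling_ratio(n):
    """n! / (n^n e^(-n) sqrt(2 pi n)), computed through logarithms to avoid overflow."""
    log_factorial = math.lgamma(n + 1)
    log_stirling = n * math.log(n) - n + 0.5 * math.log(2 * math.pi * n)
    return math.exp(log_factorial - log_stirling)

def de_moivre_error_per_n(n):
    """(log n! - (n log n - n)) / n: an o(n) error term divided by n tends to 0."""
    return (math.lgamma(n + 1) - (n * math.log(n) - n)) / n

for n in (1, 10, 100, 10_000, 10**6):
    print(f"n={n:>8}  Stirling ratio={stirling_ratio(n):.8f}  "
          f"De Moivre error / n={de_moivre_error_per_n(n):.2e}")
```

The first column tends to 1, in line with the limit displayed above, while the second shows that the error of the cruder approximation, although unbounded, is negligible with respect to n.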
\[
\begin{array}{c|ccccccccccccccc}
n & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12 & 13 & 14 & 15 \\ \hline
\pi (n) & 0 & 1 & 2 & 2 & 3 & 3 & 4 & 4 & 4 & 4 & 5 & 5 & 6 & 6 & 6
\end{array}
\]
and so on. It is naturally not possible to fully describe the sequence $\pi$, as it would be equivalent to describing the sequence of prime numbers, which we have argued to be impossible. Nevertheless, we can still ask ourselves whether there is a sequence $\{x_n\}$ which can be described in closed form and is asymptotically equal to $\pi$: in other words, our question is whether we can find a reasonably simple sequence that asymptotically approximates $\pi$ well enough.
Around the year 1800, Gauss and Legendre noticed, independently of one another, that the sequence $\{n / \log n\}$ approximated $\pi$ well, as we can check by inspection of the following table.
\[
\begin{array}{c|c|c|c}
n & \pi (n) & \dfrac{n}{\log n} & \dfrac{\pi (n)}{n / \log n} \\ \hline
10 & 4 & 4.3 & 0.921 \\
10^2 & 25 & 21.7 & 1.151 \\
10^3 & 168 & 145 & 1.161 \\
10^4 & 1229 & 1086 & 1.132 \\
10^5 & 9592 & 8686 & 1.104 \\
10^{10} & 455052511 & 434294482 & 1.048 \\
10^{15} & 29844570422669 & 28952965460217 & 1.031 \\
10^{20} & 2220819602560918840 & 2171472409516250000 & 1.023
\end{array}
\]
The ratio in the last column becomes closer and closer to 1 as $n$ increases. Gauss and Legendre's conjecture was that this was so because $\pi$ is asymptotically equal to $\{n / \log n\}$. Their conjecture remained unproven for about a century, until it was proven to be true in 1896 by two great mathematicians
independently, Jacques Hadamard and Charles de la Vallée Poussin. The importance of
Although we are not able to describe the sequence $\pi$, thanks to the Prime Number Theorem we can say that its asymptotic behavior is similar to that of the simple sequence $\{n / \log n\}$, that is to say, the number of primes in any given interval of natural numbers $[m, n]$ is approximately
\[ \pi (n) - \pi (m) \approx \frac{n}{\log n} - \frac{m}{\log m} \]
with increasing accuracy. This result, which undoubtedly has a statistical "flavor", is incredibly elegant, even more so if we consider the following remarkable consequence.
The sequence of prime numbers $\{p_n\}$ is thus asymptotically equivalent to $\{n \log n\}$. The $n$-th prime number's value is, approximately, $n \log n$.29 For example, by inspecting the prime number table one can see that for $n = 100$ one has $p_n = 541$, while its "estimate" is $n \log n = 460$ (rounding down). Similarly:
\[
\begin{array}{c|c|c|c}
n & p_n & n \log n & \dfrac{p_n}{n \log n}
\end{array}
\]
One can see that the ratio between $p_n$ and its estimate $n \log n$ stays steadily around 1.
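The small entries of the tables on prime numbers can be recomputed directly. The following sketch uses a sieve of Eratosthenes, which is our own choice of method and not one discussed in the text; the names primes_up_to, prime_count, and the bound 200000 are likewise illustrative assumptions.

```python
import bisect
import math

def primes_up_to(limit):
    """Sieve of Eratosthenes: the list of all primes <= limit (limit >= 1)."""
    is_prime = [True] * (limit + 1)
    is_prime[0:2] = [False, False]
    for p in range(2, math.isqrt(limit) + 1):
        if is_prime[p]:
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
    return [i for i, flag in enumerate(is_prime) if flag]

PRIMES = primes_up_to(200_000)

def prime_count(n):
    """pi(n): the number of primes <= n (valid here only for n <= 200000)."""
    return bisect.bisect_right(PRIMES, n)

for n in (10, 10**2, 10**3, 10**4, 10**5):
    estimate = n / math.log(n)
    print(f"n={n:>6}  pi(n)={prime_count(n):>5}  n/log n={estimate:>8.1f}  "
          f"ratio={prime_count(n) / estimate:.3f}")

n = 100
# p_n is the n-th prime, i.e., PRIMES[n - 1] with zero-based indexing.
print("p_100 =", PRIMES[n - 1], " estimate n log n =", math.floor(n * math.log(n)))
```

The larger rows of the table (n = 10^10 and beyond) are out of reach for such a naive sieve; they require dedicated prime-counting algorithms.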
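\[ \left| \pi (n) \frac{\log n}{n} - 1 \right| < \varepsilon \qquad \forall n \ge n_\varepsilon \tag{8.39} \]
Since $p_n \to \infty$, there is an $\bar{n}_\varepsilon$ such that $p_n \ge n_\varepsilon$ for $n \ge \bar{n}_\varepsilon$. Hence (8.39) implies that
\[ \left| \pi (p_n) \frac{\log p_n}{p_n} - 1 \right| < \varepsilon \qquad \forall n \ge \bar{n}_\varepsilon \]
and, since $\pi (p_n) = n$,
\[ \left| n \frac{\log p_n}{p_n} - 1 \right| < \varepsilon \qquad \forall n \ge \bar{n}_\varepsilon \]
that is,
\[ n \frac{\log p_n}{p_n} \to 1 \tag{8.40} \]
from which it follows that
\[ \log \left( n \frac{\log p_n}{p_n} \right) \to \log 1 = 0 \]
or, $\log n + \log \log p_n - \log p_n \to 0$. Since $\log p_n \to +\infty$,
\[ \frac{\log n}{\log p_n} + \frac{\log \log p_n}{\log p_n} - 1 = \frac{\log n + \log \log p_n - \log p_n}{\log p_n} \to 0 \]
\[ \frac{\log n}{\log p_n} \to 1 \]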
8.13 Sequences in Rn
We now examine sequences xk of vectors in Rn . For them we give the following de…nition
of limit that follows closely the one already given for sequences in R. The fundamental
di¤erence is that each element of the sequence is now a vector xk = (xk1 ; xk2 ; :::; xkn ) 2 Rn and
not a scalar.
xk L ! 0 () xki Li !0 8i = 1; 2; : : : ; n (8.41)
that is if and only if the numerical sequences xki of the i-th components converge to the
component Li of the vector L.
The convergence of a sequence of vectors therefore reduces to the convergence of the
sequences of the single components and hence it does not present any di¢ culty of under-
standing or calculation.
N.B. Observe that a sequence in Rn is nothing but the restriction to N+ of a vector function
f : R ! Rn . O
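$$\left( 1 + \frac{1}{k}, \; \frac{1}{k^2}, \; \frac{2k+3}{5k-7} \right)$$
in $\mathbb{R}^3$. Since
$$1 + \frac{1}{k} \to 1, \qquad \frac{1}{k^2} \to 0 \qquad \text{and} \qquad \frac{2k+3}{5k-7} \to \frac{2}{5}$$
the sequence converges to the vector $(1, 0, 2/5)$.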
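Componentwise convergence is easy to observe numerically. In the sketch below (the function names `x` and `max_component_gap` are ours, introduced only for illustration) we evaluate the vector sequence of the example and measure the largest distance between a component and the corresponding component of the limit.

```python
def x(k):
    """k-th vector of the sequence (1 + 1/k, 1/k^2, (2k + 3)/(5k - 7))."""
    return (1 + 1 / k, 1 / k**2, (2 * k + 3) / (5 * k - 7))

LIMIT = (1.0, 0.0, 2 / 5)

def max_component_gap(k):
    """Largest absolute difference between the components of x(k) and of the limit."""
    return max(abs(a - b) for a, b in zip(x(k), LIMIT))

for k in (10, 1000, 100000):
    print(k, max_component_gap(k))
```

The gap shrinks as $k$ grows because each of the three scalar sequences converges.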
In an analogous way we define divergence to $+\infty$ and to $-\infty$ when all the components
of the vectors that form the sequence diverge respectively to $+\infty$ or to $-\infty$. When, finally, the
single components have different behaviors (some converge, others diverge or are irregular)
the sequence of vectors does not have a limit. For brevity, we omit the details.
Notation The sequences of vectors are denoted by $\{x^k\}$ instead of $\{x_n\}$ to avoid confusion
with the dimension $n$ of the space $\mathbb{R}^n$ and to be able to indicate the single components $x^k_i$
of each vector $x^k$ of the sequence.
Chapter 9
Series
The idea that we want to develop here concerns, roughly, the possibility of summing infinitely
many addends. To provide a rudimentary example, imagine a stick 1 meter long and cut it
in half, obtaining in this way two pieces 1/2 meter long; then cut the second piece in half,
obtaining two pieces 1/4 meter long; cut again the second piece, obtaining two pieces 1/8
meter long, and continue ideally, without ever stopping. This cutting process would result
in infinitely many pieces of length 1/2, 1/4, 1/8, ... into which the original stick of 1 meter
has been divided. It is rather natural to imagine that
$$\frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \cdots + \frac{1}{2^n} + \cdots = 1 \tag{9.1}$$
i.e., that, by reassembling the individual pieces, we would get back the original meter.
In this chapter we will give a precise meaning to equalities like (9.1). Let us imagine,
therefore, a sequence $\{x_n\}$, and that we want to “sum” all its terms, i.e., to carry out the
operation
$$x_1 + x_2 + \cdots + x_n + \cdots = \sum_{n=1}^{\infty} x_n$$
To give a precise meaning to this new operation of “addition of infinitely many summands”,
which is different from ordinary addition (as we will realize),1 we will sum a finite number
of terms, say $n$, and then make $n$ tend to infinity and take the resulting limit, if it exists, as
the value to assign to the series. We are, therefore, thinking of constructing a new sequence
1 We cannot really sum infinitely many summands: all the paper in the world would not suffice, nor would our
entire life, we would not know where to put the line that one traditionally writes under the summands before
adding them, etc.
$\{s_n\}$, so defined by
$$\begin{aligned} s_1 &= x_1 \\ s_2 &= x_1 + x_2 \\ s_3 &= x_1 + x_2 + x_3 \\ &\;\;\vdots \\ s_n &= x_1 + \cdots + x_n \end{aligned} \tag{9.2}$$
and to take the possible limit of $\{s_n\}$ as the sum of the series.
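The construction (9.2) is exactly a running sum. As a small sketch (the name `partial_sums` is ours), we can compute the partial sums of the stick example (9.1) with exact rational arithmetic and observe that the piece still missing after $n$ cuts is $1/2^n$.

```python
from fractions import Fraction
from itertools import accumulate

def partial_sums(terms):
    """Return [s_1, s_2, ..., s_n] for the finite list of terms [x_1, ..., x_n]."""
    return list(accumulate(terms))

# Terms 1/2, 1/4, ..., 1/2^10 of the stick example.
stick = [Fraction(1, 2**n) for n in range(1, 11)]
s = partial_sums(stick)

for n, s_n in enumerate(s, start=1):
    print(n, s_n, 1 - s_n)   # the gap 1 - s_n equals 1/2^n
```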
Definition 335 The series with terms given by a sequence $\{x_n\}$ of scalars, in symbols
$\sum_{n=1}^{\infty} x_n$, is the sequence $\{s_n\}$ defined in (9.2). The terms $s_n$ of the sequence are called
partial sums of the series.
The series $\sum_{n=1}^{\infty} x_n$ is therefore defined as the sequence $\{s_n\}$ of the partial sums, whose
limit behavior determines its value. In particular, the series $\sum_{n=1}^{\infty} x_n$ is:
(i) convergent, with sum $S$, in symbols
$$\sum_{n=1}^{\infty} x_n = S$$
if $\lim s_n = S \in \mathbb{R}$;
(ii) positively divergent, in symbols $\sum_{n=1}^{\infty} x_n = +\infty$, if $\lim s_n = +\infty$;
(iii) negatively divergent, in symbols $\sum_{n=1}^{\infty} x_n = -\infty$, if $\lim s_n = -\infty$;
(iv) irregular if the sequence $\{s_n\}$ is irregular.
Briefly, we attribute to the series the same character (convergence, divergence, or
irregularity) as that of its sequence of partial sums.2
O.R. Sometimes it is useful to start the series with the index $n = 0$ rather than with $n = 1$.
When the option exists (we will see that this is not the case for some types of series, like the
harmonic series, whose terms are not defined for $n = 0$), the choice to start a series
from either $n = 0$ or $n = 1$ (or from another value of $n$) is a pure matter of convenience
and the context itself typically suggests the best choice. In any case, this choice does not
alter the character of the series and, therefore, it does not affect the problem of determining
whether the series converges or not.
2
Using the terminology already employed for the sequences, a series is sometimes called regular when it is
not irregular, that is when one of the cases (i)–(iii) holds.
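Since
$$\frac{1}{n(n+1)} = \frac{1}{n} - \frac{1}{n+1}$$
one has that
$$\begin{aligned} s_n &= \frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 3} + \cdots + \frac{1}{n(n+1)} \\ &= 1 - \frac{1}{2} + \frac{1}{2} - \frac{1}{3} + \frac{1}{3} - \frac{1}{4} + \cdots + \frac{1}{n} - \frac{1}{n+1} = 1 - \frac{1}{n+1} \to 1 \end{aligned}$$
Therefore,
$$\sum_{n=1}^{\infty} \frac{1}{n(n+1)} = 1$$
and so the Mengoli series converges and it has sum 1.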
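The telescoping identity $s_n = 1 - 1/(n+1)$ can be checked exactly with rational arithmetic. This is just a sanity check, not a proof; the name `mengoli_partial_sum` is introduced here for illustration.

```python
from fractions import Fraction

def mengoli_partial_sum(n):
    """Exact partial sum s_n = 1/(1*2) + 1/(2*3) + ... + 1/(n(n+1))."""
    return sum(Fraction(1, k * (k + 1)) for k in range(1, n + 1))

for n in (1, 2, 3, 10, 100):
    print(n, mengoli_partial_sum(n), 1 - Fraction(1, n + 1))
```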
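Let us consider its partial sums taken for indices $n$ that are powers of 2 ($n = 2^k$):
$$s_1 = 1, \qquad s_2 = 1 + \frac{1}{2}$$
$$s_4 = 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} > 1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{4} = s_2 + \frac{1}{2} = 1 + 2 \cdot \frac{1}{2}$$
$$s_8 = s_4 + \frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8} > s_4 + \frac{1}{8} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8} = s_4 + \frac{1}{2} > 1 + 3 \cdot \frac{1}{2}$$
Continuing in this way we see that, for every $k \ge 2$,
$$s_{2^k} > 1 + k \cdot \frac{1}{2} \tag{9.3}$$
The sequence of partial sums is strictly increasing (since the summands are all positive) and
therefore it admits a limit; (9.3) guarantees that it is not bounded from above and therefore
$\lim s_n = +\infty$. Hence,
$$\sum_{n=1}^{\infty} \frac{1}{n} = +\infty$$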
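The lower bound on the partial sums of the harmonic series at powers of 2 can be verified exactly for small $k$. The sketch below (the function name `harmonic` is ours) compares $s_{2^k}$ with $1 + k/2$; equality holds for $k = 0, 1$ and the inequality is strict from $k = 2$ on.

```python
from fractions import Fraction

def harmonic(n):
    """Exact n-th partial sum of the harmonic series: 1 + 1/2 + ... + 1/n."""
    return sum(Fraction(1, i) for i in range(1, n + 1))

for k in range(0, 11):
    s = harmonic(2**k)
    bound = 1 + Fraction(k, 2)
    print(k, float(s), float(bound), s > bound)
```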
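Example 338 (Geometric series) The geometric series with ratio $q$ is defined as follows:
$$1 + q + q^2 + q^3 + \cdots + q^n + \cdots = \sum_{n=0}^{\infty} q^n$$
If $q = 1$, then
$$s_n = \underbrace{1 + 1 + \cdots + 1}_{n+1 \text{ times}} = n + 1 \to +\infty$$
If $q \ne 1$, since
$$\begin{aligned} s_n - q s_n &= \left( 1 + q + q^2 + q^3 + \cdots + q^n \right) - q \left( 1 + q + q^2 + q^3 + \cdots + q^n \right) \\ &= \left( 1 + q + q^2 + q^3 + \cdots + q^n \right) - \left( q + q^2 + q^3 + \cdots + q^{n+1} \right) = 1 - q^{n+1} \end{aligned}$$
we have
$$(1 - q) s_n = 1 - q^{n+1}$$
that is, $s_n = (1 - q^{n+1})/(1 - q)$. It follows that:
(i) if $q > 1$, then $q^{n+1} \to +\infty$ and so $s_n \to +\infty$;
(ii) if $|q| < 1$, then $q^{n+1} \to 0$ and so
$$s_n \to \frac{1}{1 - q}$$
(iii) if $q = -1$, the partial sums of odd order are equal to zero, while those of even order
are equal to 1. The sequence formed by them is hence irregular;
(iv) if $q < -1$, the sequence $\{q^{n+1}\}$ is irregular and therefore so is $\{s_n\}$.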
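The closed form $s_n = (1 - q^{n+1})/(1 - q)$ for $q \ne 1$ can be compared with the direct sum. In this sketch (function names are ours) we use exact fractions so that the two expressions can be tested for equality, and we print a few partial sums for ratios covering the different cases.

```python
from fractions import Fraction

def geometric_partial_sum(q, n):
    """Direct partial sum s_n = 1 + q + ... + q^n (index starting from 0)."""
    return sum(q**i for i in range(n + 1))

def closed_form(q, n):
    """Closed form (1 - q^(n+1)) / (1 - q), valid only for q != 1."""
    return (1 - q ** (n + 1)) / (1 - q)

for q in (Fraction(1, 2), Fraction(-1), Fraction(3), Fraction(-2)):
    sums = [geometric_partial_sum(q, n) for n in range(6)]
    print(q, [str(s) for s in sums])
```

For $q = 1/2$ the printed values approach $1/(1 - q) = 2$, for $q = -1$ they alternate between 1 and 0, and for $q = 3$ and $q = -2$ they are unbounded.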
Epicurus in his letter to Herodotus wrote “Once one says that there are infinite parts in
a body or parts of any degree of smallness, it is not possible to conceive how this should be,
and indeed how could the body any longer be limited in size?” The former examples show
that, indeed, if these “parts”, these particles, have strictly positive but different sizes (for
example either $1/(n(n+1))$ or $q^n$, with $q \in (0, 1)$), then the series might converge, and so
the size of the “body” could be defined. Nevertheless, Epicurus was right in the sense that,
if we assume, as it seems he does too, that all the particles have the same size $\varepsilon$, no matter how
small, the series
$$\varepsilon + \varepsilon + \varepsilon + \cdots + \varepsilon + \cdots$$
positively diverges, that is, $\sum_{n=1}^{\infty} \varepsilon = +\infty$, for every $\varepsilon > 0$. Indeed, we have $s_n = n\varepsilon \to +\infty$.
This simple series has an important philosophical counterpart (the properties of series have
often been used, even within philosophy, to try to clarify the nature of the potential infinite).
where $\beta \in (0, 1)$ is the subjective discount factor. In the light of what we have just seen,
(9.4) is the series
$$\sum_{t=1}^{\infty} \beta^{t-1} u_t(x_t) \tag{9.5}$$
Series therefore allow us to give a correct meaning to the fundamental specification (9.4)
of the intertemporal utility function. Naturally, we are interested in the case in which the
series (9.5) is convergent, because we want the overall utility that the consumer gets
from the intertemporal consumption $\{x_1, x_2, \dots, x_t, \dots\}$ to be finite. Otherwise, how could
we compare, and hence choose, among such profiles if we get infinite utility?4
Using the properties of the geometric series, it is possible to show that the series (9.5)
converges if and only if $\beta < 1$, provided that the utility functions $u_t$ are positive and bounded
by the same constant. In such a case, having assumed $\beta \in (0, 1)$, the intertemporal utility
function
$$U(x) = \sum_{t=1}^{\infty} \beta^{t-1} u_t(x_t) \tag{9.6}$$
Convergence to zero of the sequence $\{x_n\}$ is therefore a necessary condition for convergence
of its series. The fact that this condition is only necessary is demonstrated by the
harmonic series: even if we have $1/n \to 0$, the series $\sum_{n=1}^{\infty} 1/n$ diverges.
Proposition 341 Each series with positive terms is convergent or positively divergent. In
particular, it is convergent if and only if it is bounded from above.6
The series with positive terms inherit hence the remarkable regularity properties of mono-
tonic sequences. This gives them a particularly important status among the series. For them,
we now recast the convergence criteria presented in Section 8.9 for the sequences.
Proposition 342 (Comparison criterion) Let $\sum_{n=1}^{\infty} x_n$ and $\sum_{n=1}^{\infty} y_n$ be two series with
positive terms and let $x_n \le y_n$ eventually.
(i) If $\sum_{n=1}^{\infty} x_n$ diverges positively, then so does $\sum_{n=1}^{\infty} y_n$.
(ii) If $\sum_{n=1}^{\infty} y_n$ converges, then so does $\sum_{n=1}^{\infty} x_n$.
Proof Let $n_0 \ge 1$ be such that $x_n \le y_n$ for all $n \ge n_0$, and set $\alpha = \sum_{n=1}^{n_0} (y_n - x_n)$. By
calling $s_n$ and $\sigma_n$ the partial sums of the two series, for $n > n_0$ we have that
$$\sigma_n - s_n = \alpha + \sum_{k=n_0+1}^{n} (y_k - x_k) \ge \alpha$$
that is, $\sigma_n \ge s_n + \alpha$. Therefore, the statement follows from Proposition 282 (which is the
sequence counterpart of this statement).
Note that (i) is the contrapositive of (ii), and vice versa: indeed, thanks to Proposition
341, for a series with positive terms the negation of the convergence is the positive diver-
gence.7 For their utility we stated both, but it is the same property seen in equivalent
ways.
the convergence of the geometric series with ratio 2=5 guarantees, via the comparison cri-
terion, the convergence of the series. N
If $\alpha > 2$, then
$$\frac{1}{n^{\alpha}} < \frac{1}{n^2}$$
for every $n > 1$ and therefore we still have convergence.
Finally, it is possible to see, but it is more delicate, that the generalized harmonic series
converges also for all values of $\alpha \in (1, 2)$.
To sum up, the generalized harmonic series
$$\sum_{n=1}^{\infty} \frac{1}{n^{\alpha}}$$
For the generalized harmonic series, the case $\alpha = 1$ is hence the “last” case of divergence:
it is sufficient to very slightly increase the exponent, from 1 to $1 + \varepsilon$ with $\varepsilon > 0$, and the
series will converge. This suggests that the divergence is extremely slow, as the reader can
check by calculating some of the partial sums.10 This intuition is made precise by the next nice
result.
$$1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} \sim \log n \tag{9.7}$$
In other words, the sequence of the partial sums of the harmonic series is asymptotic to
the logarithm.
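The slowness of the divergence, and the asymptotic relation (9.7), can be seen by computing the ratio between the partial sum and $\log n$. This is a floating point sketch (the name `harmonic_float` is ours); `math.fsum` is used only to limit rounding error.

```python
import math

def harmonic_float(n):
    """Floating point n-th partial sum of the harmonic series."""
    return math.fsum(1.0 / i for i in range(1, n + 1))

# Ratio s_n / log n for n = 10, 100, ..., 100000.
ratios = [harmonic_float(10**e) / math.log(10**e) for e in range(1, 6)]
for e, r in zip(range(1, 6), ratios):
    print(10**e, round(r, 4))
```

The ratio decreases towards 1, although slowly: even after one hundred thousand terms the partial sum is only about 12.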
converges for $\alpha > 1$ and any $\beta$, as well as for $\alpha = 1$ and $\beta > 1$, while it diverges for $\alpha < 1$
and any $\beta$, as well as for $\alpha = 1$ and any $\beta \le 1$.
The comparison criterion has a nice and useful asymptotic version, based on the asymptotic
comparison of the terms of the sequences.
Proposition 348 (Asymptotic comparison criterion) Let $\sum_{n=1}^{\infty} x_n$ and $\sum_{n=1}^{\infty} y_n$ be two
series with positive and non-zero terms.11 If $x_n \sim y_n$, then the series $\sum_{n=1}^{\infty} x_n$ and $\sum_{n=1}^{\infty} y_n$
have the same character.
10 A famous professor talked about “cadaverous infinity”.
11 The hypothesis that the terms are non-zero is necessary to make the ratio $x_n / y_n$ well-defined.
The theorem requires that the ratios are (uniformly) smaller than a number $q$ which
is itself smaller than 1, and not simply that they are all smaller than 1. Indeed, for the
harmonic series the ratios are
$$\frac{1/(n+1)}{1/n} = \frac{n}{n+1}$$
so all lower than 1, but the series diverges (as the ratios tend to 1, there is no room to insert
a number $q$ that is simultaneously greater than all of them and smaller than 1).
Since the convergence of $\sum_{n=1}^{\infty} x_n$ implies $x_n \to 0$ (Theorem 339), the ratio criterion for
series can be seen as an extension of the homonymous criterion for the sequences. A similar
observation holds for the root criterion that we will see soon.
Proof From $x_{n+1} \le q x_n$ we deduce, as in the analogous criterion for sequences, that
$0 < x_n \le q^{n-1} x_1$, and the first statement follows from the comparison criterion (Proposition 342) and
from the convergence of the geometric series. If instead $x_{n+1}/x_n \ge 1$, i.e., if $x_{n+1} \ge x_n > 0$,
$\{x_n\}$ is increasing and therefore it cannot tend to 0.
12 The hypothesis that the terms $x_n$ are non-zero ensures that the ratio $x_{n+1}/x_n$ is well defined (recall the
analogous condition required for the asymptotic comparison criterion).
It is possible to prove (see Section 10.4) that if $\lim (x_{n+1}/x_n)$ exists, the ratio criterion
assumes exactly the tripartite form given in Proposition 350; in particular, if
$$\lim \frac{x_{n+1}}{x_n} = 1$$
the criterion fails and gives no indication about the character of the series.
At the operative level, the tripartite form is the usual one in which we apply the ratio
criterion. At the mechanical level, it is sufficient to recall the tripartition of Proposition
350 and the illustrative examples given in the Prelude. But, in order to do mathematics rather
than plumbing, it is important to keep in mind the theoretical foundations provided by
Proposition 352.
Let us see other examples.
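Example 353 (i) The series $\sum_{n=1}^{\infty} n^k q^n$ converges for every $k \in \mathbb{R}$ and every $0 < q < 1$.
Indeed,
$$\frac{(n+1)^k q^{n+1}}{n^k q^n} = \left( \frac{n+1}{n} \right)^k q \to q < 1$$
This shows also that this series diverges positively when $q > 1$.
(ii) The series $\sum_{n=1}^{\infty} x^n / n!$ converges for every $x > 0$. Indeed,
$$\frac{x^{n+1}}{(n+1)!} \cdot \frac{n!}{x^n} = \frac{x}{n+1} \to 0 \qquad \forall x > 0$$
(iii) The series $\sum_{n=1}^{\infty} x^n / n$ converges for every $0 < x < 1$. Indeed,
$$\frac{x^{n+1}}{n+1} \cdot \frac{n}{x^n} = x \frac{n}{n+1} \to x$$
which obviously is $< 1$ when $0 < x < 1$.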
We stop here our study of the convergence criteria. Much more can be said: in Section
10.4 we will continue to investigate this topic in some more depth.
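Proof In Example 344 we have shown that the series converges. Let us calculate its sum.
By Newton’s formula (A.4), we have
$$\left( 1 + \frac{1}{n} \right)^n = \sum_{k=0}^{n} \binom{n}{k} \frac{1}{n^k} = \sum_{k=0}^{n} \frac{1}{k!} \frac{n!}{(n-k)!} \frac{1}{n^k}$$
Now,
$$\frac{n!}{(n-k)!} = \underbrace{n (n-1) \cdots (n-k+1)}_{k \text{ times}} \le \underbrace{n \cdot n \cdots n}_{k \text{ times}} = n^k$$
Therefore,
$$\frac{n!}{(n-k)!} \frac{1}{n^k} \le 1$$
which implies
$$\left( 1 + \frac{1}{n} \right)^n = \sum_{k=0}^{n} \frac{1}{k!} \frac{n!}{(n-k)!} \frac{1}{n^k} \le \sum_{k=0}^{n} \frac{1}{k!}$$
It follows that
$$\sum_{n=0}^{\infty} \frac{1}{n!} \ge e \tag{9.10}$$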
The sum (9.9) can be generalized in a substantial way (we omit the proof).
The equality (9.12) holds for every number $x$ and it reduces to (9.9) in the special case
$x = 1$. Note the very remarkable series expression
$$e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!} = 1 + x + \frac{x^2}{2} + \frac{x^3}{3!} + \cdots + \frac{x^n}{n!} + \cdots \tag{9.13}$$
of the exponential function. We will see soon that $\sum_{n=0}^{\infty} x^n / n!$ is a power series. For this
reason, the equality (9.13) is called the power series expansion of the exponential function.
It is a result as elegant as it is important, which allows us to “decompose” the exponential function
into a sum (although infinite) of elementary functions such as the powers $x^n$.
We will study in greater generality series expansions with tools of differential calculus,
of which series expansions are one of the most remarkable applications.
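The expansion (9.13) can be tested numerically by truncating the series. The sketch below (the name `exp_partial_sum` is ours) builds each term $x^n/n!$ from the previous one, which avoids computing large factorials, and compares the result with the library function `math.exp`.

```python
import math

def exp_partial_sum(x, terms):
    """Sum of the first `terms` terms of the series 1 + x + x^2/2! + x^3/3! + ..."""
    total = 0.0
    term = 1.0                # x^0 / 0!
    for n in range(terms):
        total += term
        term *= x / (n + 1)   # turn x^n/n! into x^(n+1)/(n+1)!
    return total

for x in (1.0, -2.0, 3.5):
    print(x, exp_partial_sum(x, 40), math.exp(x))
```

With $x = 1$ the partial sums approximate the number $e$, in agreement with (9.9).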
The next result shows that the convergence of the series of absolute values, which can be
veri…ed with the criteria just discussed, guarantees the convergence of the much wilder, not
necessarily positive, original series. We omit the proof.
Theorem 357 If $\sum_{n=1}^{\infty} x_n$ converges absolutely, then $\sum_{n=1}^{\infty} x_n$ converges.
The class of absolutely convergent series is therefore contained in the one of convergent
series. It is possible to prove that this subclass has fundamental properties of regularity with
respect to the convergent series of generic terms that are not absolutely convergent. In other
words, the absolutely convergent series are, among the convergent ones, ones that behave
signi…cantly better.
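$$\frac{|x_{n+1}|}{|x_n|} = \left| \frac{x^{n+1}}{(n+1)!} \cdot \frac{n!}{x^n} \right| = \frac{|x|}{n+1} \to 0 \qquad \forall x \in \mathbb{R}$$
it follows that it converges absolutely.
(iii) The series $\sum_{n=1}^{\infty} x^n / n$ converges for every $-1 < x < 1$. Indeed,
$$\frac{|x_{n+1}|}{|x_n|} = \left| \frac{x^{n+1}}{n+1} \cdot \frac{n}{x^n} \right| = |x| \frac{n}{n+1} \to |x|$$
which obviously is $< 1$ when $-1 < x < 1$. Thus, this series also converges absolutely.
We have $\left| (-1)^n / n^2 \right| = 1/n^2$. Therefore, the given series converges absolutely.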
with every $x_n \ge 0$. We can already say something interesting about them (we omit the
proof).
Proposition 361 The series $\sum_{n=1}^{\infty} (-1)^{n+1} x_n$ with alternating signs is convergent if the
sequence $\{x_n\}$ is decreasing and infinitesimal.
As we well know by now, the condition $x_n \to 0$ is necessary, but not sufficient, for the
convergence of a generic series. For an alternating series though, it becomes also sufficient,
provided $\{x_n\}$ is decreasing.13
Example 362 By using this result, we can conclude that the series
$$\sum_{n=1}^{\infty} (-1)^{n+1} \frac{1}{n} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \frac{1}{6} + \cdots$$
13 Note that $x_n$ is the absolute value of the term $(-1)^{n+1} x_n$ of the sequence.
14 The sum of the alternating harmonic series is $\log 2$.
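The convergence of the alternating harmonic series to $\log 2$ can be observed numerically. The sketch below (the function name is ours) also checks the standard error estimate for alternating series with decreasing infinitesimal terms, namely that the distance between the $n$-th partial sum and the sum does not exceed the first omitted term, here $1/(n+1)$.

```python
import math

def alternating_harmonic(n):
    """n-th partial sum of 1 - 1/2 + 1/3 - 1/4 + ..."""
    return math.fsum((-1) ** (k + 1) / k for k in range(1, n + 1))

for n in (1, 10, 100, 1000):
    s_n = alternating_harmonic(n)
    print(n, s_n, abs(s_n - math.log(2)), 1 / (n + 1))
```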
Chapter 10
Discrete calculus
Discrete calculus deals with problems analogous to those of differential calculus, with the
difference that sequences, that is functions $f : \mathbb{N} \setminus \{0\} \to \mathbb{R}$ with discrete domain, are
considered instead of functions on $\mathbb{R}$. As we will see, the results of discrete calculus are rougher
and less neat than those obtained in differential calculus.1 Nevertheless, discrete calculus
can be very useful in some applications. More precisely, in this chapter we will show its use
in the study of series and sequences, allowing for a deeper analysis of some issues which we
have already discussed.
Example 363 If we consider the sequence $\{(-1)^n\}$ we have that $y_n = 1$ and $z_n = -1$ for
every $n$, whereas for the sequence $\{1/n\}$ we have that $y_n = 1/n$ and $z_n = 0$ for every $n$.
$$-M \le z_n \le y_n \le M \qquad \forall n \tag{10.1}$$
therefore $\{y_n\}$ is decreasing and $\{z_n\}$ is increasing. Since they are monotone, from Theorem
285 we have that $\{y_n\}$ and $\{z_n\}$ converge. If we denote their limits as $y$ and $z$, that is,
$y_n \to y$ and $z_n \to z$, we can write
1 Some parts of this chapter require a basic knowledge of differential calculus. This chapter can be read
seamlessly after Chapter 18.
The limits $y$ and $z$ are respectively referred to as limit superior and limit inferior of $\{x_n\}$
and they are denoted as $\limsup x_n$ and $\liminf x_n$.
This example shows two fundamental properties of these limits: they always exist, even if
the original sequence has no limit,2 and their equality is a necessary and sufficient condition
for the convergence of the sequence $\{x_n\}$: $\limsup x_n = \liminf x_n$ if and only if $\lim x_n$ exists.
Formally:
Proof Thanks to (10.1), (10.2) follows from Proposition 282. The proof of the second part
of the statement is left as an exercise for the reader.
$$\liminf (-x_n) = -\limsup x_n \qquad \text{and} \qquad \limsup (-x_n) = -\liminf x_n \tag{10.3}$$
They are duality properties, as they relate the limit superior and limit inferior of the sequence
$\{x_n\}$ with those of the opposite sequence $\{-x_n\}$. For instance, this simple duality allows us to
easily translate some properties of the limit superior into properties of the limit inferior, and
vice versa. This is exactly what happens in the next proof.
Another interesting consequence of the duality is the possibility to rewrite the inequality
(10.2) as
$$\liminf x_n \le -\liminf (-x_n)$$
The next result lists some simple, yet useful, properties of the limit superior and the
limit inferior. Thanks to the previous result, they imply the similar properties which we
have listed for convergent sequences.
Lemma 366 Let $\{x_n\}$ and $\{y_n\}$ be two bounded sequences of real numbers. We have:
(iii) $\liminf x_n \le \liminf y_n$ and $\limsup x_n \le \limsup y_n$ if eventually $x_n \le y_n$.
Proof (i) For every $n$ we have that $\sup_{k \ge n} (x_k + y_k) \le \sup_{k \ge n} x_k + \sup_{k \ge n} y_k$. Hence, (i)
follows from Proposition 282. (ii) follows from (i) and the duality result (10.3):
If the sequence converges, there exists a unique limit point: the limit of the sequence.
If the sequence does not converge, the limit points are the scalars which are approached
by infinitely many elements of the sequence, even if the sequence does not converge to said
scalars. Indeed, it can be easily shown that $L$ is a limit point for $\{x_n\}$ if and only if there
exists a subsequence $\{x_{n_k}\}$ that converges to $L$.
Example 368 (i) The interval $[-1, 1]$ is the set of limit points of the sequence $\{\sin n\}$,
whereas $-1$ and $1$ are the limit points of the sequence $\{(-1)^n\}$. (ii) The scalar $0$ is the
unique limit point of the convergent sequence $\{1/n\}$.
The next result shows that the limit points belong to the interval determined by the limit
superior and the limit inferior.
Proposition 369 Let $\{x_n\}$ be a bounded sequence of scalars. A value $x \in \mathbb{R}$ is a limit point
for the sequence only if $x \in [\liminf x_n, \limsup x_n]$.
Intuitively, the larger the set of limit points, the more the sequence is divergent; in particular,
this set reduces to a singleton when the sequence converges. In light of the last result, the
difference between superior and inferior limits, that is, the length of $[\liminf x_n, \limsup x_n]$,
is a (not particularly precise) indicator of the divergence of a sequence.
Thanks to the inequality $\liminf x_n \le -\liminf (-x_n)$, the interval $[\liminf x_n, \limsup x_n]$
can be rewritten as $[\liminf x_n, -\liminf (-x_n)]$. For instance, if $x_n = \sin n$ or $x_n = \cos n$, we
have that $[\liminf x_n, -\liminf (-x_n)] = [-1, 1]$.
N.B. Up to this point, we have considered only bounded sequences. Actually, if we allow the
limit superior and limit inferior to assume infinity as a value, all these results can be easily
extended to generic sequences. For instance, if we consider the sequence $\{n\}$, which diverges
to $+\infty$, we have that $\liminf x_n = \limsup x_n = +\infty$; for the sequence $\{-e^n\}$, which diverges
to $-\infty$, we have that $\limsup x_n = \liminf x_n = -\infty$, whereas for the sequence $\{(-1)^n n\}$ we
have that $\liminf x_n = -\infty$ and $\limsup x_n = +\infty$, so that $[\liminf x_n, \limsup x_n] = \overline{\mathbb{R}}$. We
leave the extension of the previous results to generic sequences to the reader.
The next result lists the algebraic properties of the differences, that is, their behavior
with respect to the fundamental operations. It is the discrete counterpart of the results in
Section 18.7.
Proposition 371 Let $\{x_n\}$ and $\{y_n\}$ be two sequences. We have that
(i) $\Delta (x_n + y_n) = \Delta x_n + \Delta y_n$;
On the one hand, (i) guarantees that the difference distributes over addition; on the
other hand, (ii) and (iii) show that more complex rules hold for multiplication and division.
Properties (ii) and (iii) are respectively called product rule and quotient rule.
Therefore, the monotonicity of the original sequence is revealed by the sign of the
differences.
$$\Delta x_n = a^{n+1} - a^n = (a - 1) a^n = (a - 1) x_n$$
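$$\Delta^k x_n = \Delta \left( \Delta^{k-1} x_n \right) = \Delta^{k-1} x_{n+1} - \Delta^{k-1} x_n = \sum_{i=0}^{k} (-1)^{k-i} \binom{k}{i} x_{n+i} \tag{10.5}$$
$$\Delta n^2 = (n+1)^2 - n^2 = 2n + 1$$
$$\Delta^2 n^2 = 2 (n+1) + 1 - (2n + 1) = 2$$
$$\begin{aligned} \Delta^k n^k &= \Delta^{k-1} \left( k n^{k-1} + \binom{k}{2} n^{k-2} + \cdots + 1 \right) \\ &= k \, \Delta^{k-1} n^{k-1} + \binom{k}{2} \Delta^{k-1} n^{k-2} + \cdots + \Delta^{k-1} 1 \\ &= k (k-1)! + 0 + \cdots + 0 = k! \end{aligned}$$
Notice that the zeroes in the last line follow from the induction hypothesis that guarantees
that, for $2 \le r \le k$,
$$\Delta^{k-1} n^{k-r} = \Delta^{r-1} \left( \Delta^{k-r} n^{k-r} \right) = \Delta^{r-1} (k-r)! = 0$$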
The example shows the analogy between the operator $\Delta$ in discrete calculus and the derivative in
“continuous” calculus. Indeed, in the continuous case, it is necessary to differentiate $x^k$ $k$ times
in order to obtain a constant and $k + 1$ times to get the constant 0. In the discrete case,
we must apply the operator $\Delta$ $k$ times to the sequence $\{n^k\}$ in order to obtain a constant and
$k + 1$ times to get the constant 0.
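Both facts are easy to check on finite stretches of a sequence. In the sketch below (all function names are ours, and sequences are stored as Python lists indexed from 0) `difference` computes $\Delta$, `iterated_difference` applies it $k$ times, and `difference_formula` evaluates the binomial sum in (10.5).

```python
from math import comb, factorial

def difference(seq):
    """Delta: the list of x_{n+1} - x_n (one element shorter than seq)."""
    return [b - a for a, b in zip(seq, seq[1:])]

def iterated_difference(seq, k):
    """Apply Delta k times."""
    for _ in range(k):
        seq = difference(seq)
    return seq

def difference_formula(seq, n, k):
    """Right-hand side of (10.5): sum over i of (-1)^(k-i) * C(k, i) * x_{n+i}."""
    return sum((-1) ** (k - i) * comb(k, i) * seq[n + i] for i in range(k + 1))

k = 4
powers = [n**k for n in range(12)]
print(iterated_difference(powers, k))       # constant list equal to k!
print(iterated_difference(powers, k + 1))   # list of zeroes
```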
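Formula (10.5) permits the following beautiful generalization of the series expansion
(9.12) of the exponential function.
Proposition 375 Let $\{y_n\}$ be any sequence of scalars. Then, for each $n \ge 1$,
$$\sum_{k=0}^{\infty} \frac{x^k}{k!} \Delta^k y_n = e^{-x} \sum_{j=0}^{\infty} \frac{x^j}{j!} y_{n+j} \qquad \forall x \in \mathbb{R} \tag{10.6}$$
$$\sum_{k=0}^{\infty} \frac{x^k}{k!} \sum_{i=0}^{k} (-1)^{k-i} \binom{k}{i} y_{n+i} = e^{-x} \sum_{j=0}^{\infty} \frac{x^j}{j!} y_{n+j} \qquad \forall x \in \mathbb{R} \tag{10.7}$$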
Fix an integer $j \ge 0$. We show that the coefficients of $y_{n+j}$ on the two sides of (10.6) are
equal. Clearly, on the right hand side this coefficient is $e^{-x} x^j / j!$. As to the left hand side,
this coefficient is
$$\sum_{k=0}^{\infty} \frac{x^k}{k!} (-1)^{k-j} \binom{k}{j} = \sum_{k=j}^{\infty} \frac{x^k}{k!} (-1)^{k-j} \binom{k}{j}$$
where the equality holds because the binomial coefficients are zero if $k < j$. Therefore, it
remains to prove that
$$\sum_{k=j}^{\infty} \frac{x^k}{k!} (-1)^{k-j} \binom{k}{j} = e^{-x} \frac{x^j}{j!} \tag{10.8}$$
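Set $i = k - j$. Then,
$$\begin{aligned} \sum_{k=j}^{\infty} \frac{x^k}{k!} (-1)^{k-j} \binom{k}{j} &= \sum_{i=0}^{\infty} \frac{x^{i+j}}{(i+j)!} (-1)^i \binom{i+j}{j} = \sum_{i=0}^{\infty} \frac{x^{i+j}}{(i+j)!} \frac{(i+j)!}{i! \, j!} (-1)^i \\ &= \frac{x^j}{j!} \sum_{i=0}^{\infty} x^i \frac{(-1)^i}{i!} = \frac{x^j}{j!} \sum_{i=0}^{\infty} \frac{(-x)^i}{i!} = \frac{x^j}{j!} e^{-x} \end{aligned}$$
as desired.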
The series expansion (9.12) is a special case of (10.6). Indeed, let $n = 0$ so that (10.6)
becomes
$$\sum_{k=0}^{\infty} \frac{x^k}{k!} \Delta^k y_0 = e^{-x} \sum_{j=0}^{\infty} \frac{x^j}{j!} y_j \tag{10.9}$$
Therefore, even if the ratio $x_n / y_n$ does converge, the ratio $\Delta x_n / \Delta y_n$ of
the differences may not. Conversely, the next result shows that the asymptotic behavior of
the ratio $\Delta x_n / \Delta y_n$ determines the one of $x_n / y_n$.
Theorem 377 (Cesàro) Let $\{y_n\}$ be an increasing sequence that diverges to infinity, that
is, $y_n \uparrow +\infty$, and let $\{x_n\}$ be a generic sequence. It follows that
$$\liminf \frac{\Delta x_n}{\Delta y_n} \le \liminf \frac{x_n}{y_n} \le \limsup \frac{x_n}{y_n} \le \limsup \frac{\Delta x_n}{\Delta y_n} \tag{10.10}$$
In particular, this inequality implies that, if the (finite or infinite) limit of the ratio
$\Delta x_n / \Delta y_n$ exists, we have that
$$\liminf \frac{\Delta x_n}{\Delta y_n} = \liminf \frac{x_n}{y_n} = \limsup \frac{x_n}{y_n} = \limsup \frac{\Delta x_n}{\Delta y_n} \tag{10.11}$$
that is, $x_n / y_n$ converges to the same limit. Therefore, as stated above, the “regularity” of
the asymptotic behavior of the ratio $\Delta x_n / \Delta y_n$ implies the “regularity” of the original ratio
$x_n / y_n$. At the same time, if the ratio $x_n / y_n$ presents an “irregular” asymptotic behavior, so
will the difference ratio.
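Proof We will only prove the special case (10.11) when $\Delta x_n / \Delta y_n$ admits a limit, finite or
infinite. Therefore, let $\Delta x_n / \Delta y_n \to L \in \mathbb{R}$. It follows that, for $\varepsilon > 0$, there exists $n_\varepsilon$ such
that
$$L - \varepsilon < \frac{\Delta x_n}{\Delta y_n} < L + \varepsilon$$
for every $n \ge n_\varepsilon$. Since, by hypothesis, $y_{n+1} - y_n > 0$, we have
$$(L - \varepsilon)(y_{n+1} - y_n) < x_{n+1} - x_n < (L + \varepsilon)(y_{n+1} - y_n) \qquad \forall n \ge n_\varepsilon$$
In particular, for every $n > n_\varepsilon$, we obtain
$$(L - \varepsilon)(y_{n_\varepsilon + 1} - y_{n_\varepsilon}) < x_{n_\varepsilon + 1} - x_{n_\varepsilon} < (L + \varepsilon)(y_{n_\varepsilon + 1} - y_{n_\varepsilon})$$
$$(L - \varepsilon)(y_{n_\varepsilon + 2} - y_{n_\varepsilon + 1}) < x_{n_\varepsilon + 2} - x_{n_\varepsilon + 1} < (L + \varepsilon)(y_{n_\varepsilon + 2} - y_{n_\varepsilon + 1})$$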
The previous result can be interpreted as the discrete version of de l’Hospital’s Theorem.
As de l’Hospital’s Theorem is useful in finding the limit of functions, in particular if they
present indeterminate forms, its discrete analogue by Cesàro proves to be most useful
in finding the limit of sequences that present indeterminate forms.
In the next section, we will see how Cesàro’s Theorem allows for a better understanding
of convergence criteria for series (see Section 10.4). To this end, the following remarkable
consequence of Cesàro’s Theorem will be crucial.
Corollary 379 Let $\{x_n\}$ be a sequence such that, eventually, $x_n > 0$. It follows
$$\liminf \frac{x_{n+1}}{x_n} \le \liminf \sqrt[n]{x_n} \le \limsup \sqrt[n]{x_n} \le \limsup \frac{x_{n+1}}{x_n}$$
Proof Let $\{x_n\}$ be a positive sequence. We have
$$\log \frac{x_{n+1}}{x_n} = \log x_{n+1} - \log x_n = \Delta \log x_n \qquad \text{and} \qquad \log \sqrt[n]{x_n} = \frac{\log x_n}{n}$$
Considering the sequences $\{\log x_n\}$ and $\{y_n\}$ with $y_n = n$, (10.10) takes the form
$$\liminf \frac{\Delta \log x_n}{\Delta y_n} \le \liminf \frac{\log x_n}{y_n} \le \limsup \frac{\log x_n}{y_n} \le \limsup \frac{\Delta \log x_n}{\Delta y_n}$$
that is
$$\liminf \frac{\log \frac{x_{n+1}}{x_n}}{1} \le \liminf \log \sqrt[n]{x_n} \le \limsup \log \sqrt[n]{x_n} \le \limsup \frac{\log \frac{x_{n+1}}{x_n}}{1}$$
from which (10.18) follows.
The sequence
$$\frac{\sum_{i=1}^{n} x_i}{n}$$
of arithmetic means always converges to the same limit as the sequence $\{x_n\}$, whereas the
converse does not hold: the sequence of means may converge while the original one does not.
Therefore, the sequence of means is more “stable” than the original one. This motivates
the following, more general, definition of limit of a sequence, which is named after Cesàro
(or “in mean”) and is fundamental in probability theory (and in its applications).
Definition 382 We say that a sequence $\{x_n\}$ converges à la Cesàro (or in mean) to $L$, and
we write $x_n \xrightarrow{C} L$, when
$$\frac{x_1 + x_2 + \cdots + x_n}{n} \to L$$
From the previous results, it follows that standard convergence to a limit implies convergence
à la Cesàro to the same limit. The converse does not hold: we may have convergence
à la Cesàro without standard convergence.
Example 383 The sequence $\{(-1)^n\}$ from the last example does not converge, whereas
$(-1)^n \xrightarrow{C} 0$.
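Convergence in mean of $\{(-1)^n\}$ can be observed directly. In this sketch (the name `cesaro_means` is ours) the arithmetic means are computed exactly; the mean of the first $n$ terms is $-1/n$ for odd $n$ and $0$ for even $n$, so it tends to 0 although the sequence itself keeps oscillating.

```python
from fractions import Fraction
from itertools import accumulate

def cesaro_means(terms):
    """Arithmetic means (x_1 + ... + x_n)/n for n = 1, ..., len(terms)."""
    return [Fraction(total, n) for n, total in enumerate(accumulate(terms), start=1)]

signs = [(-1) ** n for n in range(1, 2001)]
means = cesaro_means(signs)

for n in (1, 2, 3, 10, 101, 2000):
    print(n, signs[n - 1], means[n - 1])
```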
It is useful to find conditions under which the converse holds, that is, the convergence of the
sequence of means implies the convergence of the original sequence. These results are called
Tauberian theorems. We provide one as an example:
Proposition 384 (Landau) Let $\{x_n\}$ be a sequence such that there exists $k < 0$ such that
$n \, \Delta x_n > k$ for every $n \ge 1$. Then $x_n \xrightarrow{C} L \in \mathbb{R}$ if and only if $x_n \to L$.
In particular, the hypothesis is always satisfied when the sequence $\{x_n\}$ is increasing: an
increasing sequence converges to $L$ if and only if it converges à la Cesàro to $L$.
Whenever a sequence does not converge in mean, we consider the sequence of the means
of the means, which, by the previous results, is more likely to converge than the sequence of
means: this is called $(C, 2)$ convergence. This idea can be extended to the mean of the
mean iterated $k$ times. We won’t consider such cases. However, the fundamental principle is
that means tend to smooth the behavior of a sequence. In various fashions, often stochastic
(an example is the law of large numbers previously mentioned), this principle is of central
importance in the applications. In medio stat virtus.
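$$s_1 = 1, \quad s_2 = 0, \quad s_3 = 1, \quad s_4 = 0, \quad s_5 = 1, \quad \dots$$
$$y_1 = 1, \quad y_2 = \frac{1 + 0}{2} = \frac{1}{2}, \quad y_3 = \frac{2}{3}, \quad y_4 = \frac{2}{4} = \frac{1}{2}, \quad y_5 = \frac{3}{5}, \quad \dots$$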
Even if this is not his main scientific contribution, the name of Guido Grandi is
remembered for his treatment of this series. It is curious to notice that, until the mid-nineteenth
century, even the greatest mathematicians believed that this series summed up to 1/2. Until
then, mathematics had been developing untidily: on the one hand, highly complex theorems
were known; on the other, the attention to well-posed definitions and rigor to which we are now
used was lacking.
The monk Guido Grandi proposed the following explanation, which contains two mistakes.
First of all, he identified $1 - 1 + 1 - 1 + 1 - 1 + \cdots$ as a geometric series with common
ratio $q = -1$ (correct) and therefore having sum
$$\frac{1}{1 - q} = \frac{1}{1 - (-1)} = \frac{1}{2}$$
(wrong, since the geometric series converges only when $|q| < 1$); pairing the addenda (wrong,
since the associative property does not generally hold for series) he derived the equality
$(1 - 1) + (1 - 1) + \cdots = 0 + 0 + \cdots$ in order to conclude
$$0 + 0 + 0 + \cdots = \frac{1}{2}$$
that is, the sum of infinitely many zeroes is equal to 1/2. This led him not to deny the existence
of God, but to deem as irrelevant his intervention in the creation; even without divine
intervention, something can come out of nothing (if you wait long enough): creatio ex nihilo.
Having said this, Grandi can be satisfied with his work: he made several mistakes, yet, well
ahead of his time, he provided an answer to a much more general question.
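Lemma 386 Let $\{x_n\}$ be a sequence with, eventually, $x_n > 0$. There exists $q < 1$ such that,
eventually, $x_{n+1}/x_n \le q$ if and only if
$$\limsup \frac{x_{n+1}}{x_n} < 1 \tag{10.13}$$
Proof “Only if”. Suppose that there exists $q < 1$ such that eventually (9.8) holds. There
exists $\bar{n}$ such that $x_{n+1}/x_n \le q$ for every $n \ge \bar{n}$. Therefore, for any such $n$ we have
$\sup_{k \ge n} x_{k+1}/x_k \le q$, which implies
$$\limsup \frac{x_{n+1}}{x_n} = \lim_{n \to \infty} \sup_{k \ge n} \frac{x_{k+1}}{x_k} \le q < 1$$
“If”. Let $L = \limsup x_{n+1}/x_n < 1$. For every $\varepsilon > 0$ there exists $\bar{n}$ such that
$$\left| \sup_{k \ge n} \frac{x_{k+1}}{x_k} - L \right| < \varepsilon \qquad \forall n \ge \bar{n}$$
that is
$$L - \varepsilon < \sup_{k \ge n} \frac{x_{k+1}}{x_k} < L + \varepsilon \qquad \forall n \ge \bar{n}$$
If we choose $\varepsilon$ sufficiently small so that $L + \varepsilon < 1$, by setting $q = L + \varepsilon$ we obtain the desired
condition.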
The previous analysis leads to the following Corollary, which is very useful for computations,
where the ratio criterion is expressed in terms of limits.
Corollary 387 Let $\sum_{n=1}^{\infty} x_n$ be a series with, eventually, $x_n > 0$.
(i) If
$$\limsup \frac{x_{n+1}}{x_n} < 1$$
then the series converges.
(ii) If
$$\liminf \frac{x_{n+1}}{x_n} > 1 \tag{10.14}$$
then the series diverges positively.
Notice that, thanks to Lemma 386, point (i) is equivalent to point (i) of Proposition 352.
In contrast, point (ii) is weaker than point (ii) of Proposition 352 since condition (10.14) is
only sufficient, but not necessary, to have that $x_{n+1}/x_n \ge 1$ eventually.
As shown by the following examples, this specification of the ratio criterion is particularly
useful when the limit
$$\lim \frac{x_{n+1}}{x_n}$$
exists, that is, whenever
$$\lim \frac{x_{n+1}}{x_n} = \limsup \frac{x_{n+1}}{x_n} = \liminf \frac{x_{n+1}}{x_n}$$
In this particular case, the ratio criterion takes the useful tripartite form of Proposition 350:
(i) if
$$\lim \frac{x_{n+1}}{x_n} < 1$$
the series converges;
(ii) if
$$\lim \frac{x_{n+1}}{x_n} > 1$$
the series diverges positively;
(iii) if
$$\lim \frac{x_{n+1}}{x_n} = 1$$
the criterion fails and it does not determine the behavior of the series.
As we have seen in Section 8.9, this form of the ratio criterion is the one which is usually
used in applications. Examples 351 and 353 have shown cases (i) and (ii). The unfortunate
case (iii) is well exemplified by $\sum_{n=1}^{\infty} 1/n$ and $\sum_{n=1}^{\infty} 1/n^2$.
Let us see its limit form. By Lemma 386, point (i) can be equivalently stated as
$$\limsup \sqrt[n]{x_n} < 1$$
As to point (ii), it requires that $\sqrt[n]{x_n} \ge 1$ for infinitely many values of $n$, that is, that there
is a subsequence $\{n_k\}$ such that $\sqrt[n_k]{x_{n_k}} \ge 1$ for every $k$. Such a condition holds if
$$\limsup \sqrt[n]{x_n} > 1 \tag{10.16}$$
and only if
$$\limsup \sqrt[n]{x_n} \ge 1 \tag{10.17}$$
The constant sequence $x_n = 1$ exemplifies how condition (10.17) can hold even if (10.16)
does not. The sequence $\{(1 - 1/n)^n\}$, on the other hand, shows how even condition (ii) from
Proposition 388 may not hold although (10.17) holds. It is therefore clear that (10.16)
implies point (ii) of Proposition 388, which in turn implies (10.17), but that the opposite
implications do not hold.
All this brings us to the following limit form, in which point (i) is equivalent to that of
Proposition 388, while point (ii) is weaker than its counterpart since, as we have seen above,
condition (10.16) is only a sufficient condition for $\sqrt[n]{x_n} \ge 1$ to hold for infinitely many values
of $n$.
Corollary 389 (Root criterion in limit form) Let $\sum_{n=1}^{\infty} x_n$ be a positive term series.
(i) If $\limsup \sqrt[n]{x_n} < 1$, the series converges.
(ii) If $\limsup \sqrt[n]{x_n} > 1$, the series diverges positively.
Proof If $\limsup \sqrt[n]{x_n} < 1$, we have that $\sqrt[n]{x_n} \leq q$ for some $q < 1$, eventually. The desideratum
follows from Proposition 388. If $\limsup \sqrt[n]{x_n} > 1$, then $\sqrt[n]{x_n} \geq 1$ for infinitely many
values of $n$, and the thesis follows from Proposition 388.
As for the limit form of the ratio criterion, also that of the root criterion is particularly
useful when $\lim \sqrt[n]{x_n}$ exists. Under such circumstances the criterion takes the following
tripartite form:
(i) if
\[ \lim \sqrt[n]{x_n} < 1 \]
the series converges;
(ii) if
\[ \lim \sqrt[n]{x_n} > 1 \]
the series diverges positively;
(iii) if
\[ \lim \sqrt[n]{x_n} = 1 \]
the criterion fails and it does not determine the behavior of the series.
As for the tripartite form of the ratio criterion, that of the root criterion is its most useful
form at a computational level. Nonetheless, we hope the reader will always keep in mind the
theoretical background of the criterion, as “ye were not made to live like unto brutes, but
for pursuit of virtue and of knowledge”.
Example 390 (i) Let $q > 0$. The series
\[ \sum_{n=1}^{\infty} \frac{q^n}{n^n} \]
converges as
\[ \sqrt[n]{\frac{q^n}{n^n}} = \frac{q}{n} \to 0 \]
(ii) Let $0 \leq q < 1$. The series $\sum_{n=1}^{\infty} n^k q^n$ converges for every $k$: indeed
\[ \sqrt[n]{n^k q^n} = q\, n^{k/n} \to q \]
since $n^{k/n} \to 1$ (as $\log n^{k/n} = (k/n) \log n \to 0$). ▲
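The convergence $\sqrt[n]{n^k q^n} \to q$ of point (ii) can be observed numerically. The sketch below is only an illustration: the values $k = 5$ and $q = 0.5$ are arbitrary choices, and the root is computed in the algebraically equivalent form $q\, n^{k/n}$ to avoid floating-point overflow or underflow.

```python
def nth_root_of_term(n, k, q):
    """n-th root of n**k * q**n, written as q * n**(k/n) to avoid overflow and underflow."""
    return q * n ** (k / n)


# For small n the root can exceed 1; for large n it settles near q.
for n in (10, 1_000, 1_000_000):
    print(n, nth_root_of_term(n, k=5, q=0.5))
```

Note that for small $n$ the root may well be larger than 1: only the eventual behavior matters for the criterion.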
Proposition 391 For every sequence $\{x_n\}$ with positive terms, we have that
\[ \liminf \frac{x_{n+1}}{x_n} \leq \liminf \sqrt[n]{x_n} \leq \limsup \sqrt[n]{x_n} \leq \limsup \frac{x_{n+1}}{x_n} \tag{10.18} \]
If $\lim x_{n+1}/x_n$ exists, we have that
\[ \lim \frac{x_{n+1}}{x_n} = \lim \sqrt[n]{x_n} \tag{10.19} \]
and so the two criteria are equivalent in their limit form. However, if $\lim x_{n+1}/x_n$ does not
exist, we still obtain from (10.18) that
\[ \limsup \frac{x_{n+1}}{x_n} < 1 \implies \limsup \sqrt[n]{x_n} < 1 \]
and
\[ \liminf \frac{x_{n+1}}{x_n} > 1 \implies \limsup \sqrt[n]{x_n} > 1 \]
which suggests that the root criterion is more powerful than the ratio criterion in determining
convergence: whenever the ratio criterion rules in favor of convergence or of divergence, we
would have reached the same conclusion by using the root criterion. The opposite does not
hold, as the next example shows: the ratio criterion fails while the root criterion determines
that the series in question converges.
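that is:
\[ \frac{1}{2} + 1 + \frac{1}{8} + \frac{1}{4} + \frac{1}{32} + \frac{1}{16} + \frac{1}{128} + \frac{1}{64} + \cdots \]
We have that
\[ \frac{x_{n+1}}{x_n} = \begin{cases} \dfrac{1/2^{(n+1)-2}}{1/2^{n}} = 2 & \text{if } n \text{ odd} \\[1ex] \dfrac{1/2^{n+1}}{1/2^{n-2}} = \dfrac{1}{8} & \text{if } n \text{ even} \end{cases} \]
and
\[ \sqrt[n]{x_n} = \begin{cases} \dfrac{1}{2} & \text{if } n \text{ odd} \\[1ex] \dfrac{\sqrt[n]{4}}{2} & \text{if } n \text{ even} \end{cases} \]
so that
\[ \limsup \frac{x_{n+1}}{x_n} = 2, \qquad \liminf \frac{x_{n+1}}{x_n} = \frac{1}{8} \]
and
\[ \limsup \sqrt[n]{x_n} = \frac{1}{2} \]
The ratio criterion thus fails, while the root criterion tells us that the series converges. ▲
5 See Rudin (1976) p. 67.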
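The computations of this example can be replayed numerically. The sketch below assumes, consistently with the displayed terms $1/2, 1, 1/8, 1/4, \ldots$ and with the ratios and roots computed above, that $x_n = 2^{-n}$ for odd $n$ and $x_n = 2^{-(n-2)}$ for even $n$; the function name `x` is just a convenient label.

```python
def x(n):
    """n-th term (n >= 1): 2**(-n) for odd n, 2**(-(n - 2)) for even n."""
    return 2.0 ** (-n) if n % 2 == 1 else 2.0 ** (-(n - 2))


ratios = [x(n + 1) / x(n) for n in range(1, 41)]
roots = [x(n) ** (1.0 / n) for n in range(1, 41)]

print(sorted(set(ratios)))      # the ratios only take the values 1/8 and 2
print(roots[-2], roots[-1])     # roots for n = 39 (odd) and n = 40 (even)
```

The ratios keep jumping between a value below 1 and a value above 1, whereas the roots for large $n$ stay close to $1/2$.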
Even though the root criterion is more powerful, the ratio criterion can still be useful as
it is generally easier to compute the limit of ratios than that of roots. The root criterion may
be more powerful from a theoretical standpoint, yet it is harder to use from a computational
perspective.
In light of this, when using the criteria for solving problems, one should first check
whether $\lim x_{n+1}/x_n$ exists and, if it does, compute it. In such a case, thanks to (10.19) we
also know the value of $\lim \sqrt[n]{x_n}$ and thus we can use the more powerful root criterion.
In the unfortunate case in which $\lim x_{n+1}/x_n$ does not exist, so that we can no longer
determine $\limsup x_{n+1}/x_n$ and $\liminf x_{n+1}/x_n$, we can either use the less powerful ratio
criterion (which may fail as we have seen in the previous example), or we may try to compute
$\limsup \sqrt[n]{x_n}$ directly, hoping it exists (as in the previous example) so that the root criterion
can be used in its handier limit form.
Finally, note that, however powerful it may be, the root criterion (and, a fortiori, the
weaker ratio criterion) only gives a sufficient condition for convergence, as the following
example shows.
Proof Take $q > 0$ such that $\limsup \sqrt[n]{x_n} < q < 1$. There is an $n_q \geq 1$ such that
\[ \sqrt[n]{x_n} \leq q \]
Thanks to (10.20), we can say that those convergent series whose terms converge to zero
less quickly than the geometric sequence, that is, such that $q^n = o(x_n)$, are out of the root
criterion's reach. For example, for every natural number $k \geq 2$ we have that
\[ \frac{q^n}{1/n^k} \to 0 \]
and so $q^n = o(n^{-k})$. In order to determine whether the series $\sum_{n=1}^{\infty} n^{-k}$ converges, the root
criterion is thus useless. This is confirmed by the fact that
\[ \lim \sqrt[n]{\frac{1}{n^k}} = 1 \]
and we are able to understand why the root criterion fails in this instance thanks to Proposition 394.
Such a function orders all possible intertemporal consumption profiles $x = (x_1, \ldots, x_t, \ldots) \in \mathbb{R}^{\infty}$.
In particular, the higher the subjective discount factor $\beta$, the more the decision maker
cares about future periods, that is, the more patient he is.
One may ask oneself what happens in the limit case $\beta \uparrow 1$,$^{6}$ that is, when the subjective
discount factor tends to 1. Intuitively, we are in an “infinite patience” setting, where all
periods, present and future, count the same for the decision maker. When the horizon $T$ is
finite, the answer is simple:
\[ \lim_{\beta \uparrow 1} \sum_{t=1}^{T} \beta^{t-1} u_t(x_t) = \sum_{t=1}^{T} u_t(x_t) \tag{10.22} \]
so that the limit case corresponds to the sum of the utilities of all periods, all with equal
unitary weight. When the horizon is infinite the problem becomes far more complex as, by
Theorem 339, for the series $\sum_{t=1}^{\infty} u_t(x_t)$ to converge, it must be that $\lim_{t \to \infty} u_t(x_t) = 0$, which
is hardly justifiable from an economic perspective.
6 For the meaning of $\beta \uparrow 1$ we refer the reader to Section 8.6.2.
Let us consider instead the limit
\[ \lim_{\beta \uparrow 1} (1 - \beta) \sum_{t=1}^{\infty} \beta^{t-1} u_t(x_t) \]
\[ \underbrace{0, 0}_{2 \text{ elements}}, \underbrace{1, 1}_{2 \text{ elements}}, \underbrace{0, 0, 0, 0}_{4 \text{ elements}}, \underbrace{1, 1, 1, 1, 1, 1, 1, 1}_{8 \text{ elements}}, \ldots \]
where every block of 0s and 1s has length equal to the sum of the lengths of the previous
blocks. One can show that $\lim_{\beta \uparrow 1} (1 - \beta) \sum_{t=1}^{\infty} \beta^{t-1} x_t$ does not exist. ▲
The next remarkable result, the non-simple proof of which we omit, shows how the
existence of the limit is equivalent to convergence in means.
\[ V(x) = (1 - \beta)\, U(x) \qquad \forall x \in \mathbb{R}^{\infty} \]
as long as the limits exist. The infinite patience case is thus captured by the limit of the
average utilities
\[ \lim_{T \to +\infty} \frac{1}{T} \sum_{t=1}^{T} u_t(x_t) \tag{10.24} \]
that is, by the limit à la Cesàro of the sequence $\{u_t(x_t)\}$. Such a criterion can thus be seen
as a limit case for $\beta \uparrow 1$ of the intertemporal utility function $V$.
The role the sum $\sum_{t=1}^{T} u_t(x_t)$ plays in case (10.22) with finite horizon is thus played in
the infinite horizon case by the limit of the average utilities (10.24). This relevant economic
application of Frobenius-Littlewood's Theorem allows us to elegantly conclude this chapter.
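As a numerical aside, the averages in (10.24) can be computed for the block sequence of 0s and 1s considered earlier in this section (blocks of lengths 2, 2, 4, 8, ..., each as long as all the previous ones together). The following Python sketch is only an illustration under that reading of the sequence; the helper names are hypothetical. The running means keep oscillating, which is consistent with the non-existence of the limit stated above.

```python
from itertools import islice


def block_sequence():
    """Yield 0,0,1,1,0,0,0,0,1,...: each block is as long as all earlier blocks together."""
    value, length, total = 0, 2, 0
    while True:
        for _ in range(length):
            yield value
        total += length
        value = 1 - value
        length = total


def cesaro_mean(T):
    """Average of the first T terms of the block sequence."""
    return sum(islice(block_sequence(), T)) / T


# Means taken at the end of each block: they alternate between high and low values.
for T in (4, 8, 16, 32, 64, 128, 256, 512):
    print(T, cesaro_mean(T))
```

At the end of the blocks of 1s the mean drifts toward $2/3$, at the end of the blocks of 0s toward $1/3$, so the averages do not settle on a single value.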
Part III
Continuity
Chapter 11
Limits of functions
Inserting other values closer and closer to the origin, we can verify that the corresponding
values of $\sin x / x$ get closer and closer to the limit $L = 1$. In this case we say that “the limit
of $\sin x / x$ as $x$ tends to $x_0 = 0$ is $L = 1$”; in symbols,
\[ \lim_{x \to 0} \frac{\sin x}{x} = 1 \]
Observe that in this example the point $x_0 = 0$ where we take the limit does not belong to
the domain of the function $f$.
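A table of values of this kind is easy to generate. The short Python sketch below (the sample points are arbitrary choices) evaluates $\sin x / x$ at points approaching the origin from both sides, never at $x_0 = 0$ itself, where the ratio is not defined.

```python
import math


def sin_over_x(x):
    """Value of sin(x) / x; not defined at x = 0."""
    return math.sin(x) / x


for x in (0.5, 0.1, 0.01, 0.001, -0.001, -0.01, -0.1):
    print(f"{x:>8}: {sin_over_x(x):.10f}")
```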
\[ f(x) = \begin{cases} x & \text{for } x \leq 1 \\ 1 & \text{for } x > 1 \end{cases} \]
Its graph is
[Figure: graph of $f$]
How does f behave when one approaches the point x0 = 1? Taking points closer and
closer to x0 = 1 we have:
Adding other values, closer and closer to $x_0 = 1$, we can verify that as $x$ gets closer and
closer to $x_0 = 1$, $f(x)$ gets closer and closer to $L = 1$. In this case we say that “the limit of
$f(x)$ as $x$ tends to $x_0 = 1$ is $L = 1$”, and we write
\[ \lim_{x \to 1} f(x) = 1 \]
Observe that the value that the function assumes at the point $x_0 = 1$ is $f(1) = 1$, and
therefore the limit $L = 1$ is equal to the value $f(1)$ of the function at $x_0 = 1$.
\[ f(x) = \begin{cases} x & \text{if } x < 1 \\ 2 & \text{if } x = 1 \\ 1 & \text{if } x > 1 \end{cases} \]
Compared to the previous example we have introduced a “jump” at the point $x = 1$, so that
the function jumps to the value 2 (we have indeed $f(1) = 2$).
If we study the behavior of $f$ for values of $x$ closer and closer to $x_0 = 1$, we build the
same table as before (because the function, except at the point 1, is identical to the one in
the previous example), and therefore also in this case we have
\[ \lim_{x \to 1} f(x) = 1 \]
This time the value that the function assumes at the point 1 is $f(1) = 2$, different from the
value $L = 1$ of the limit.
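The fact that the limit ignores the value at the point itself can be seen by evaluating the function near, but not at, $x_0 = 1$. The sketch below simply transcribes the piecewise definition with the jump; the step sizes are arbitrary.

```python
def f(x):
    """Piecewise function with a jump at x = 1: f(1) = 2, but f(x) -> 1 as x -> 1."""
    if x < 1:
        return x
    if x == 1:
        return 2.0
    return 1.0


for h in (0.1, 0.01, 0.001):
    print(f(1 - h), f(1 + h))   # values near, but not at, x0 = 1
print(f(1))                      # value at x0 = 1
```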
Until now we have approached the point $x_0$ both from the right and from the left, that is,
bilaterally (in two-sided manner). Sometimes this is not possible; rather, one can approach
from either the right or the left, that is, unilaterally (in one-sided manner). Let us consider,
for example, the function $f : \mathbb{R} \setminus \{2\} \to \mathbb{R}$ given by $f(x) = 1/(x - 2)$ and let $x_0 = 2$.
“To approach the point $x_0 = 2$ from the right” means to approach it by considering only
values $x > 2$:
x      2.0001   2.001   2.01   2.05   2.1   2.2   2.5
f(x)   10,000   1,000   100    20     10    5     2
For values closer and closer to 2 from the right the function assumes values that are larger
and larger and not bounded from above. In this case we say that “the function $f$ tends to
$+\infty$ as $x$ tends to 2 from the right” and we write
\[ \lim_{x \to 2^+} f(x) = +\infty \]
Let us see now what happens by approaching $x_0 = 2$ from the left, that is, by considering
values $x < 2$:
For values closer and closer to 2 from the left the function assumes larger and larger (in
absolute value) negative values. In this case we say that “the function $f$ tends to $-\infty$ as $x$
tends to 2 from the left” and we write
\[ \lim_{x \to 2^-} f(x) = -\infty \]
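The two one-sided behaviors of $f(x) = 1/(x - 2)$ can be tabulated with a few lines of Python. This is only a numerical sketch; the points $2 \pm 10^{-k}$ are an arbitrary choice of values approaching 2 from the right and from the left.

```python
def f(x):
    """f(x) = 1 / (x - 2), defined for x != 2."""
    return 1.0 / (x - 2.0)


# Values approaching 2 from the right (x > 2) and from the left (x < 2).
right = [f(2 + 10.0 ** (-k)) for k in range(1, 6)]
left = [f(2 - 10.0 ** (-k)) for k in range(1, 6)]

print(right)   # positive and growing without bound
print(left)    # negative and growing in absolute value
```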
Observe that
\[ \lim_{x \to 2^+} f(x) \neq \lim_{x \to 2^-} f(x) \]
that is, there exist the “right-hand” and the “left-hand” limits, but they are different. As
we will see in Proposition 413, the fact that the one-sided limits are distinct reflects the
fact that the two-sided limit of $f(x)$ as $x$ tends to 2 does not exist. Indeed, the equality of
the one-sided limits is equivalent to the existence of the two-sided limit. For example, if we
modify the function by considering $f(x) = 1/|x - 2|$, we have
Now the two one-sided limits are equal, and they coincide with the two-sided one, which in
this case exists (even if infinite).
Considering again the function $f(x) = 1/(x - 2)$, what happens if as $x_0$ we take
$+\infty$? In other terms, what happens if we consider increasingly larger values of $x$?
Look at the following table:
For increasingly larger values of $x$ the function assumes values closer and closer to 0. In this
case we say that “the function tends to 0 as $x$ tends to $+\infty$” and we write
\[ \lim_{x \to +\infty} f(x) = 0 \]
Observe how the function assumes values close to 0, but always strictly positive: $f$ approaches
0 “from above”. If we want to emphasize this aspect we write
\[ \lim_{x \to +\infty} f(x) = 0^+ \]
where $0^+$ suggests that while converging to 0 the values of $f(x)$ remain strictly positive.
What happens if, instead, as $x_0$ we take $-\infty$? We have the following table of
values:
For negative and increasingly larger (in absolute value) values of $x$, the function assumes
values closer and closer to 0. We say that “the function tends to 0 as $x$ tends to $-\infty$” and
we write
\[ \lim_{x \to -\infty} f(x) = 0 \]
If we want to emphasize that the function, in tending to 0, remains negative, we write
\[ \lim_{x \to -\infty} f(x) = 0^- \]
Finally, after having seen various types of limits, let us consider a function that has no
limit, i.e., that does not exhibit any “definite trend”. Let $f : \mathbb{R} \setminus \{0\} \to \mathbb{R}$ be given by
\[ f(x) = \sin \frac{1}{x} \]
At the point $x_0 = 0$ the function does not have a limit: for $x$ closer and closer to $x_0 = 0$, the
function continues to oscillate with a tighter and tighter sinusoidal trend:
[Figure: graph of $f(x) = \sin(1/x)$]
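The oscillation of $\sin(1/x)$ near the origin can be made concrete by evaluating the function along two families of points that both approach 0. The points $1/(n\pi)$ and $1/(\pi/2 + 2n\pi)$ used in this sketch are an illustrative choice, not taken from the text: along the first family the function is (numerically) 0, along the second it is (numerically) 1.

```python
import math


def f(x):
    """f(x) = sin(1 / x), defined for x != 0."""
    return math.sin(1.0 / x)


# Two families of points approaching 0 from the right.
at_zeros = [f(1.0 / (n * math.pi)) for n in range(1, 6)]
at_peaks = [f(1.0 / (math.pi / 2 + 2 * n * math.pi)) for n in range(1, 6)]

print(at_zeros)   # values numerically equal to 0
print(at_peaks)   # values numerically equal to 1
```

Since the function takes values near 0 and near 1 at points arbitrarily close to the origin, it cannot approach a single value there.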
The point $x_0 = 0$ is, however, the unique point where the function does not have a limit:
at all the points of the domain the limit exists. A much more dramatic behavior is displayed
by the function $f : \mathbb{R} \to \mathbb{R}$ given by
\[ f(x) = \begin{cases} 1 & \text{for } x \in \mathbb{Q} \\ 0 & \text{for } x \notin \mathbb{Q} \end{cases} \tag{11.2} \]
This remarkable recipe, called the Dirichlet function, oscillates “obsessively” between the
values 0 and 1 because, by the density of the rational numbers in the real numbers, for any
pair $x < y$ of real numbers there exists a rational number $q$ such that $x < q < y$. As we will
see, the Dirichlet function does not have a limit at any point $x_0 \in \mathbb{R}$.
if for every $\varepsilon > 0$ there exists $\delta_\varepsilon > 0$ such that, for every $x \in A$,
\[ 0 < |x - x_0| < \delta_\varepsilon \implies |f(x) - L| < \varepsilon \tag{11.3} \]
The value $L$ is called the limit of the function at $x_0$.
Note that (11.3) can be written as
\[ 0 < d(x, x_0) < \delta_\varepsilon \implies d(f(x), L) < \varepsilon \tag{11.4} \]
The definition requires that, for any fixed quantity $\varepsilon > 0$, arbitrarily small, there exists
a value $\delta_\varepsilon$ such that all the points $x \in A$ lying at a distance smaller than $\delta_\varepsilon$ from $x_0$ have
images $f(x)$ lying at a distance smaller than $\varepsilon$ from the value $L$ of the limit. Note that the
condition $d(x, x_0) > 0$ is equivalent to requiring $x \neq x_0$.
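Example 398 Let us show that $\lim_{x \to 2} (3x - 5) = 1$. We have to verify that, for every
$\varepsilon > 0$, there exists $\delta_\varepsilon > 0$ such that
\[ 0 < |x - 2| < \delta_\varepsilon \implies |(3x - 5) - 1| < \varepsilon \tag{11.5} \]
We have $|(3x - 5) - 1| < \varepsilon$ if and only if $|x - 2| < \varepsilon/3$ and therefore, setting $\delta_\varepsilon = \varepsilon/3$ yields
(11.5). ▲
Note how the quantity $\delta_\varepsilon$ depends on the value chosen for $\varepsilon$: the smaller the value of $\varepsilon$,
the smaller $\delta_\varepsilon$. Naturally, the choice of $\delta_\varepsilon$ is not unique, so that, when we find a value of $\delta_\varepsilon$,
all the values smaller than it also work fine: in Example 398, we could actually choose as $\delta_\varepsilon$
any (positive) value lower than $\varepsilon/3$.
N.B. The value of $\delta_\varepsilon$, besides depending on $\varepsilon$, clearly depends also on $x_0$. ▽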
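The choice $\delta_\varepsilon = \varepsilon/3$ of Example 398 can be spot-checked numerically. The Python sketch below draws random points within (99% of) the distance $\delta_\varepsilon$ from $x_0 = 2$ and verifies that their images lie within $\varepsilon$ of $L = 1$. The sampling scheme, the 0.99 safety factor against rounding, and the helper names are assumptions made for this illustration; a random check of this kind is of course not a proof.

```python
import random


def f(x):
    return 3 * x - 5


def delta_for(eps):
    """The delta of Example 398: any point within eps / 3 of 2 works."""
    return eps / 3


def implication_holds(eps, samples=1000, seed=0):
    """Sample x with 0 < |x - 2| < delta and check that |f(x) - 1| < eps."""
    rng = random.Random(seed)
    delta = delta_for(eps)
    for _ in range(samples):
        x = 2 + 0.99 * rng.uniform(-delta, delta)
        if x != 2 and abs(f(x) - 1) >= eps:
            return False
    return True


for eps in (1.0, 0.1, 0.001):
    print(eps, delta_for(eps), implication_holds(eps))
```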
Example 399 Let us reconsider the Dirichlet function (11.2): $\lim_{x \to x_0} f(x)$ does not exist
for any $x_0 \in \mathbb{R}$. Indeed, given $x_0 \in \mathbb{R}$, let us suppose, by contradiction, that $\lim_{x \to x_0} f(x)$
exists and is equal to $L \in \mathbb{R}$. Let $0 < \varepsilon < 1/2$. By definition, there exists $\delta = \delta_\varepsilon$ such that
\[ x \in (x_0 - \delta, x_0 + \delta) \text{ and } x \neq x_0 \implies |f(x) - L| < \varepsilon < \frac{1}{2} \]
In each neighborhood $(x_0 - \delta, x_0 + \delta)$ there exist both rational points and irrational points
distinct from $x_0$ (see Proposition 39), that is, points $x \in (x_0 - \delta, x_0 + \delta)$ for which $f(x) = 1$
and points $x \in (x_0 - \delta, x_0 + \delta)$ for which $f(x) = 0$. But this contradicts $|f(x) - L| < 1/2$
for every $x \in (x_0 - \delta, x_0 + \delta)$ with $x \neq x_0$.$^{1}$ Therefore, $\lim_{x \to x_0} f(x)$ does not exist at any
point $x_0 \in \mathbb{R}$. ▲
A definition such as Definition 397, in which the distances are made explicit, is said to be of “$\varepsilon$-$\delta$” type. In
the light of (11.4), the rewriting of Definition 397 in the language of neighborhoods is
immediate. To make the notation more expressive, rather than by the letter $B$ we will denote
by $U_\delta(x_0)$ a neighborhood of $x_0$ of radius $\delta$ (of the independent
variable, i.e., a neighborhood in abscissa) and by $V_\varepsilon(L)$ a neighborhood of $L$ of
radius $\varepsilon$ (of the dependent variable, i.e., a neighborhood in ordinate).
\[ \lim_{x \to x_0} f(x) = L \in \mathbb{R} \]
if, for every neighborhood $V_\varepsilon(L)$ of $L$, there exists a neighborhood $U_{\delta_\varepsilon}(x_0)$ of $x_0$ such that
if, for every neighborhood $V_\varepsilon(L)$ of $L$, there exists a neighborhood $U_{\delta_\varepsilon}(x_0)$ of $x_0$ such that
\[ x \in U_{\delta_\varepsilon}(x_0) \cap A \text{ and } x \neq x_0 \implies f(x) \in V_\varepsilon(L) \tag{11.7} \]
The difference between Definitions 400 and 401 is obviously minor: where in the first
definition we had $\mathbb{R}$, in the second one we have $\overline{\mathbb{R}}$. This simple modification, however, allows us
to consider also the cases (ii), (iii) and (iv). In particular:
As an example, we consider explicitly some subcases, leaving to the reader the other
ones. We start with the subcase $x_0 \in \mathbb{R}$ and $L = +\infty$ of (i). In this case Definition 401 is
equivalent to the following one, in “$\varepsilon$-$\delta$” form, that is, making the distances explicit.
if, for every $M > 0$, there exists $\delta_M > 0$ such that, for every $x \in A$, we have
\[ 0 < |x - x_0| < \delta_M \implies f(x) > M \tag{11.8} \]
In other words, for each constant $M$, no matter how large, there exists $\delta_M > 0$ such that all
the points $x \in A$ that lie at distance less than $\delta_M$ from $x_0$ (excluding at most $x_0$) have images $f(x)$
larger than $M$.
\[ 0 < |x - x_0| < \delta_M \iff 0 < |x - 2| < \frac{1}{M} \implies \frac{1}{|x - 2|} > M \]
and therefore
\[ 0 < |x - 2| < \delta_M \implies f(x) > M \]
that is, $\lim_{x \to 2} f(x) = +\infty$. ▲
2 In a nutshell, we can say that “there exists a neighbourhood” takes the place of “eventually” as employed
for sequences.
Let us now consider case (iii) with $x_0 = +\infty$ and $L \in \mathbb{R}$. Here Definition 401 is equivalent
to the following one, still in “$\varepsilon$-$\delta$” form because the distances are made explicit.
\[ \lim_{x \to +\infty} f(x) = L \in \mathbb{R} \]
if, for every $\varepsilon > 0$, there exists $M_\varepsilon > 0$ such that, for every $x \in A$, we have
In this case, for each choice of $\varepsilon > 0$ arbitrarily small, there exists a value $M_\varepsilon$ such that
the images of points $x$ greater than $M_\varepsilon$ lie at distance less than $\varepsilon$ from $L$.
Finally, we consider case (iv) with $x_0 = L = +\infty$. In this case Definition 401 is equivalent
to the following one:
\[ \lim_{x \to +\infty} f(x) = +\infty \]
if, for every $M > 0$, there exists $N$ such that, for every $x \in A$, we have
Setting $N = M^2$ yields
\[ x > N \implies f(x) > M \]
that is, $\lim_{x \to +\infty} f(x) = +\infty$. ▲
3 By Lemma 276, the fact that $A$ is not bounded from above guarantees that $+\infty$ is a limit point of $A$.
For example, this is the case when $(a, +\infty) \subseteq A$.
O.R. It is useful to see the concept of limit “in three stages” (as a rocket):
(i) for every neighborhood $V$ of $L$ (in ordinate)
(ii) there exists a neighborhood $U$ of $x_0$ (in abscissa) such that
(iii) all the values of $f$ with $x \in U$, $x \neq x_0$, belong to $V$, i.e., all the images of $f$ in $U \cap A$
(excluding at most $f(x_0)$) belong to $V$: $f(U \cap A \setminus \{x_0\}) \subseteq V$.
[Figure: a neighborhood $V(L)$ on the vertical axis and a corresponding neighborhood $U(x_0)$ on the horizontal axis]
We are often tempted to simplify to two stages: “the values of $x$ close to $x_0$ have images
$f(x)$ close to $L$”, that is,
for every $U$ there exists $V$ such that $f(U \cap A \setminus \{x_0\}) \subseteq V$
Unfortunately, in such a way we say nothing at all because what precedes is always true, as
the figure shows:
[Figure: a small neighborhood $U(x_0)$ on the horizontal axis and a large neighborhood $V(L)$ on the vertical axis containing its image]
In the figure, for every neighborhood $U$ of $x_0$, however small, there always exists a
neighborhood $V$ of $L$ (usually quite big) inside which all the values of $f(x)$ with $x \in U \setminus \{x_0\}$
fall. Such $V$ can always be taken as an open interval that contains $f(U \setminus \{x_0\})$. ▼
O.R. As we have said many times, a sequence is a function defined on $\mathbb{N}_+$. This set has only
one limit point: $+\infty$. For a sequence the only limit we can talk about is therefore $\lim_{n \to \infty}$,
which we sometimes denote simply by $\lim$ since there is no danger of confusion. ▼
It is easy to see that the limit $\lim_{x \to 1} f(x)$ does not exist. In these cases one can resort to
the weaker notion of one-sided (or unilateral) limit, already met in an intuitive way in the
introductory examples of the present chapter. They suggest the different possibilities:
Note how, in the one-sided limits, the point $x_0$ is necessarily in $\mathbb{R}$, while the value of the
limit is in $\overline{\mathbb{R}}$.
The next definition includes the two possible general cases (i) and (ii), by suitably modifying
Definition 401.
if, for every neighborhood $V_\varepsilon(L)$ of $L$, there exists a right neighborhood $U^+_{\delta_\varepsilon}(x_0) = [x_0, x_0 + \delta_\varepsilon)$
of $x_0$ such that
\[ x \in U^+_{\delta_\varepsilon}(x_0) \cap A \text{ and } x \neq x_0 \implies f(x) \in V_\varepsilon(L) \tag{11.11} \]
The value $L$ is called the right limit of the function at $x_0$.
Since, excluding $x_0$, the neighborhood $U^+_{\delta_\varepsilon}(x_0)$ reduces to $(x_0, x_0 + \delta_\varepsilon)$, (11.11) can be
written more simply as
\[ x \in (x_0, x_0 + \delta_\varepsilon) \cap A \implies f(x) \in V_\varepsilon(L) \]
In particular:
\[ \lim_{x \to x_0^+} f(x) = L \in \mathbb{R} \]
if, for every $\varepsilon > 0$, there exists $\delta = \delta_\varepsilon > 0$ such that, for every $x \in A$,
Let us consider the subcase $L = +\infty$ of (ii), leaving to the reader the subcase $L = -\infty$.
For this case Definition 408 is equivalent to the following “$\varepsilon$-$\delta$” definition:
\[ \lim_{x \to x_0^+} f(x) = +\infty \]
if, for every $M > 0$, there exists $\delta_M > 0$ such that, for every $x \in A$,
In the same way we define the left limits, which are denoted by $\lim_{x \to x_0^-} f(x)$. We leave
the details to the reader.
To end this section, let us see an example in which the two one-sided limits (the right
one and the left one) exist, but are different.
Proposition 413 Let $f : A \subseteq \mathbb{R} \to \mathbb{R}$ be a function and $x_0$ a point for which there exists a
neighborhood $B_\varepsilon(x_0)$ such that $B_\varepsilon(x_0) \setminus \{x_0\} \subseteq A$. Then $\lim_{x \to x_0} f(x) = L \in \overline{\mathbb{R}}$ if and only
if
\[ \lim_{x \to x_0^+} f(x) = \lim_{x \to x_0^-} f(x) = L \in \overline{\mathbb{R}} \]
Note that $B_\varepsilon(x_0) \setminus \{x_0\}$ is a neighborhood of $x_0$ “with a hole in it”, i.e., without the
point $x_0$ itself. The condition $B_\varepsilon(x_0) \setminus \{x_0\} \subseteq A$ requires that there exists at least one such
neighborhood with a hole included in $A$. Naturally, an obvious and important case where
there exists a neighborhood $B_\varepsilon(x_0)$ such that $B_\varepsilon(x_0) \setminus \{x_0\} \subseteq A$ is when $x_0$ is an interior
point of $A$.
Going back to the examples just seen, for $f(x) = 1/|x - 2|$ we have $\lim_{x \to 2} f(x) = +\infty$
and hence, by Proposition 413,
and hence, by Proposition 413, the two-sided limit $\lim_{x \to 2} f(x)$ does not exist.
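Proof We prove the proposition for $L \in \mathbb{R}$, leaving to the reader the case $L = \pm\infty$.
Moreover, for simplicity we suppose that $x_0$ is an interior point of $A$.
“If”. We show that if $\lim_{x \to x_0^-} f(x) = \lim_{x \to x_0^+} f(x) = L$, then $\lim_{x \to x_0} f(x) = L$. Let
$\varepsilon > 0$. Since $\lim_{x \to x_0^+} f(x) = L$, there exists $\delta'_\varepsilon > 0$ such that for every $x \in (x_0, x_0 + \delta'_\varepsilon) \cap A$
we have $|f(x) - L| < \varepsilon$. On the other hand, since $\lim_{x \to x_0^-} f(x) = L$, there exists $\delta''_\varepsilon > 0$
such that for every $x \in (x_0 - \delta''_\varepsilon, x_0) \cap A$ we have $|f(x) - L| < \varepsilon$. Let $\delta_\varepsilon = \min\{\delta'_\varepsilon, \delta''_\varepsilon\}$.
Then
\[ x \in (x_0, x_0 + \delta_\varepsilon) \cap A \implies |f(x) - L| < \varepsilon \tag{11.14} \]
and
\[ x \in (x_0 - \delta_\varepsilon, x_0) \cap A \implies |f(x) - L| < \varepsilon \tag{11.15} \]
that is
\[ x \in (x_0 - \delta_\varepsilon, x_0 + \delta_\varepsilon) \cap A \text{ and } x \neq x_0 \implies |f(x) - L| < \varepsilon \]
Therefore, $\lim_{x \to x_0} f(x) = L$.
“Only if”. That is, if $\lim_{x \to x_0} f(x) = L$, then $\lim_{x \to x_0^-} f(x) = \lim_{x \to x_0^+} f(x) = L$. Let
$\varepsilon > 0$. Since $\lim_{x \to x_0} f(x) = L$, there exists $\delta_\varepsilon > 0$ such that
\[ x \in (x_0 - \delta_\varepsilon, x_0 + \delta_\varepsilon) \cap A \text{ and } x \neq x_0 \implies |f(x) - L| < \varepsilon \tag{11.16} \]
Since $x_0$ is not a boundary point, both $(x_0 - \delta_\varepsilon, x_0) \cap A$ and $(x_0, x_0 + \delta_\varepsilon) \cap A$ are not empty.
Therefore, (11.16) implies both (11.14) and (11.15), that is, $\lim_{x \to x_0^+} f(x) = \lim_{x \to x_0^-} f(x) = L$.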
The reader will have observed that, when $A$ is an interval, Proposition 413 forbids that
$x_0$ is a boundary point. Indeed, to fix ideas, let us consider an interval of the real line with
extremes $a < b$.$^{4}$ When $x_0 = a = \inf A$, it does not make sense to talk about the one-sided
limit $\lim_{x \to a^-} f(x)$, while when $x_0 = b = \sup A$ it does not make sense to talk about
the unilateral limit $\lim_{x \to b^+} f(x)$. We will set, instead, $\lim_{x \to a} f(x) = \lim_{x \to a^+} f(x)$ and
$\lim_{x \to b} f(x) = \lim_{x \to b^-} f(x)$ since the conditions in the definition of the two-sided limit are
perfectly satisfied: for each (two-sided) neighborhood $V$ of $L$ there exists a neighborhood,
by force one-sided because $x_0$ is a boundary point, such that the images of $f$, except perhaps
$f(x_0)$, fall in $V$.
A similar observation can be made if $A$ is a half-line bounded from below (above): in
such a case the left (right) limit as $x$ tends to the infimum (supremum) of $A$ does not exist.
For example, for $f(x) = \sqrt{x}$ and $x_0 = 0$, we have $x_0 = \inf A$ (even better, $x_0 = \min A$) and
the limit $\lim_{x \to 0^-} \sqrt{x}$ is not defined. When $x_0$ is a generic boundary point, for example the
point 3 for the set $A = (0, 3) \cup [5, 19]$, the problem appears exactly in the same terms.
Example 414 Let $f : \mathbb{R}_+ \to \mathbb{R}$ be given by $f(x) = \sqrt{x}$. In Example 410 we have seen that
$\lim_{x \to 0^+} f(x) = 0$. By what we have just said, we can also write $\lim_{x \to 0} f(x) = 0$. It is
instructive to calculate this bilateral limit also directly, through Definition 397. Let $\varepsilon > 0$.
As we have seen in Example 410, we have
\[ |f(x) - L| = \sqrt{x} < \varepsilon \iff x < \varepsilon^2 \]
Setting $\delta_\varepsilon = \varepsilon^2$, for every $x \in A$, that is, for every $x \geq 0$, we have
4 In other words, one of the following four cases holds: (i) $A = (a, b)$; (ii) $A = [a, b)$; (iii) $A = (a, b]$; (iv)
$A = [a, b]$.
\[ \lim_{x \to x_0} f(x) = L \in \overline{\mathbb{R}} \]
\[ f((U \cap A) \setminus \{x_0\}) \subseteq V \]
It is this version of two-sided limit that the reader will find generalized to topological
spaces in more advanced courses. We are happy to leave to the reader the analogous general
version for one-sided limits.
We observe that also for functions $f : A \subseteq \mathbb{R} \to \mathbb{R}$ it is possible to introduce the notions
of limit from above and of limit from below studied for sequences. Clearly, these notions
should not be confused with the one-sided limits discussed before.
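\[ \lim_{x \to x_0^+} f(x) = +\infty \text{ or } -\infty \]
\[ \lim_{x \to x_0^-} f(x) = +\infty \text{ or } -\infty \]
(ii) when
\[ \lim_{x \to +\infty} f(x) = L \quad \left(\text{or } \lim_{x \to -\infty} f(x) = L\right) \]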
Example 416 The function $f(x) = \frac{1}{x+1} + 2$, with graph
[Figure: graph of $f(x) = \frac{1}{x+1} + 2$]
if, for every neighborhood $V_\varepsilon(L)$ of $L$, there exists a neighborhood $U_{\delta_\varepsilon}(x_0)$ of $x_0$ such that
Definition 401 is the special case with $n = 1$. In the “$\varepsilon$-$\delta$” version we have (11.17) if, for
every $\varepsilon > 0$, there exists $\delta_\varepsilon > 0$ such that, for every $x \in A$,
Note how (11.19) is absolutely identical to (11.4): the distance $d(x, x_0)$, that is, $|x - x_0|$
in the case $n = 1$, in the general case becomes $\|x - x_0\|$.
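Example 418 Let $f : \mathbb{R}^n \to \mathbb{R}$ be given by $f(x) = 1 + \sum_{i=1}^{n} x_i$. We verify that $\lim_{x \to 0} f(x) = 1$.
Let $\varepsilon > 0$. We have
\[ d(f(x), 1) = \left| 1 + \sum_{i=1}^{n} x_i - 1 \right| < \varepsilon \iff \left| \sum_{i=1}^{n} x_i \right| < \varepsilon \]
Set $\delta_\varepsilon = \varepsilon/n$. Since $\left| \sum_{i=1}^{n} x_i \right| \leq \sum_{i=1}^{n} |x_i|$, we have
\[ d(x, x_0) < \delta_\varepsilon \iff \sqrt{\sum_{i=1}^{n} x_i^2} < \frac{\varepsilon}{n} \iff \sum_{i=1}^{n} x_i^2 < \frac{\varepsilon^2}{n^2} \implies x_i^2 < \frac{\varepsilon^2}{n^2} \quad \forall i = 1, 2, \ldots, n \]
\[ \implies |x_i| = \sqrt{x_i^2} < \sqrt{\frac{\varepsilon^2}{n^2}} = \frac{\varepsilon}{n} \quad \forall i = 1, 2, \ldots, n \]
\[ \implies \sum_{i=1}^{n} |x_i| < \varepsilon \implies d(f(x), 1) = \left| \sum_{i=1}^{n} x_i \right| \leq \sum_{i=1}^{n} |x_i| < \varepsilon \]
5 We consider only the case $x_0 \in \mathbb{R}^n$, since the study of the analogous notion $x \to \infty$ is particularly
delicate for the vector case.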
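The choice $\delta_\varepsilon = \varepsilon/n$ of Example 418 can be spot-checked numerically. The sketch below draws random points of $\mathbb{R}^n$ with norm smaller than $\delta_\varepsilon$ and verifies that $|f(x) - 1| < \varepsilon$. The dimension $n = 5$, the sampling scheme (random direction, random radius within 99% of $\delta_\varepsilon$), and the helper names are assumptions of this illustration, which is not a substitute for the proof above.

```python
import math
import random


def f(x):
    """f(x) = 1 + x_1 + ... + x_n for a vector x given as a list."""
    return 1 + sum(x)


def implication_holds(eps, n=5, samples=1000, seed=0):
    """Sample x with ||x|| < eps / n and check that |f(x) - 1| < eps."""
    rng = random.Random(seed)
    delta = eps / n
    for _ in range(samples):
        direction = [rng.gauss(0, 1) for _ in range(n)]
        norm = math.sqrt(sum(c * c for c in direction))
        radius = 0.99 * delta * rng.random()
        x = [radius * c / norm for c in direction]
        if abs(f(x) - 1) >= eps:
            return False
    return True


for eps in (1.0, 0.1, 0.001):
    print(eps, implication_holds(eps))
```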
To extend the definition of one-sided limit to the vector case we should consider the
different directions along which we can approach a point $x_0 \in \mathbb{R}^n$. Leaving to more advanced
courses the study of the topic, here we will only consider the extension to $\mathbb{R}^n$ of the scalar
two-sided limit given by Definition 417. By contrast, there is no difficulty in extending to
vector functions the limits from above and from below (because $L$ is in $\mathbb{R}$ and not in $\mathbb{R}^n$).
The calculation of the limit of a function defined on $\mathbb{R}^n$ is much more delicate than in
the scalar case. Intuitively, for a function of several variables $f$ to tend to a value $L$, as $x$
tends to the vector $x_0$, it is necessary that this happens along all possible approaching paths
from $x$ to $x_0$. If, therefore, there are two approaching paths along which $f$ does not tend to
the same limit value, the function does not have a limit as $x \to x_0$. The following example
should clarify the issue.
\[ f(x_1, x_2) = \frac{\ln(1 + x_1 x_2)}{x_1^2} \]
Let us verify that $\lim_{(x_1, x_2) \to (0,0)} f(x)$ does not exist. Consider two approaching paths to the
vector $(0, 0)$, along the parabola $x_2 = x_1^2$, and along the straight line $x_2 = x_1$. Along the first
path we have
\[ \lim_{(x_1, x_2) \to (0,0)} f(x_1, x_2) = \lim_{x_1 \to 0} f(x_1, x_1^2) = \lim_{x_1 \to 0} \frac{\ln(1 + x_1^3)}{x_1^2} = 0 \tag{11.20} \]
while along the second path we have
\[ \lim_{(x_1, x_2) \to (0,0)} f(x_1, x_2) = \lim_{x_1 \to 0} f(x_1, x_1) = \lim_{x_1 \to 0} \frac{\ln(1 + x_1^2)}{x_1^2} = 1 \]
Since $f$ tends to two different limit values along the two different paths to $(0, 0)$, we conclude
that $\lim_{(x_1, x_2) \to (0,0)} f(x)$ does not exist. Let us prove it rigorously using the last definition.
Suppose, by contradiction, that the limit exists, that is,
\[ \lim_{(x_1, x_2) \to (0,0)} f(x_1, x_2) = L \]
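Set $\varepsilon = 1/4$. By the definition of limit, there exists $\delta_1 > 0$ such that for $(x_1, x_2) \in B_{\delta_1}(0, 0)$,
with $(x_1, x_2) \neq (0, 0)$, one has
\[ d(f(x_1, x_2), L) < \frac{1}{4} \tag{11.21} \]
From (11.20), setting
\[ g(x) = \frac{\ln(1 + x^3)}{x^2} \]
one gets that $\lim_{x_1 \to 0} g(x_1) = 0$. Therefore, by setting again $\varepsilon = 1/4$, there exists $\delta_2 > 0$
such that for $x_1 \in B_{\delta_2}(0) \subseteq \mathbb{R}$, with $x_1 \neq 0$, we have
\[ g(x_1) \in (-\varepsilon, \varepsilon) = \left( -\frac{1}{4}, \frac{1}{4} \right) \]
Now consider the spherical neighborhood $B_{\delta_2}(0, 0) \subseteq \mathbb{R}^2$ of $(0, 0)$. Take a point of the
parabola $x_2 = x_1^2$ that belongs to this neighborhood, that is, a point $(\hat{x}_1, \hat{x}_1^2) \in B_{\delta_2}(0, 0)$,
with $(\hat{x}_1, \hat{x}_1^2) \neq (0, 0)$. We have $\hat{x}_1 \in B_{\delta_2}(0)$,$^{6}$ and therefore
\[ f(\hat{x}_1, \hat{x}_1^2) = g(\hat{x}_1) \in \left( -\frac{1}{4}, \frac{1}{4} \right) \tag{11.22} \]
Analogously, from the limit along the second path above, setting
\[ h(x) = \frac{\ln(1 + x^2)}{x^2} \]
one gets $\lim_{x_1 \to 0} h(x_1) = 1$. Therefore, setting again $\varepsilon = 1/4$, there exists $\delta_3 > 0$ such that
for $x_1 \in B_{\delta_3}(0) \subseteq \mathbb{R}$, with $x_1 \neq 0$, we have
\[ h(x_1) \in (1 - \varepsilon, 1 + \varepsilon) = \left( \frac{3}{4}, \frac{5}{4} \right) \]
Now consider the spherical neighborhood $B_{\delta_3}(0, 0) \subseteq \mathbb{R}^2$ of $(0, 0)$ and take a point of the
straight line $x_2 = x_1$ that belongs to it, that is, a point $(\tilde{x}_1, \tilde{x}_1) \in B_{\delta_3}(0, 0)$, with
$(\tilde{x}_1, \tilde{x}_1) \neq (0, 0)$. We have $\tilde{x}_1 \in B_{\delta_3}(0)$, so that
\[ f(\tilde{x}_1, \tilde{x}_1) = h(\tilde{x}_1) \in \left( \frac{3}{4}, \frac{5}{4} \right) \tag{11.23} \]
Let $\delta = \min\{\delta_1, \delta_2, \delta_3\}$ and consider two points $(\hat{x}_1, \hat{x}_1^2)$ and $(\tilde{x}_1, \tilde{x}_1)$ on the parabola and
on the straight line that belong to $B_\delta(0, 0)$ and that are different from the origin $(0, 0)$. By
(11.22) and (11.23), we have
\[ d\left( f(\hat{x}_1, \hat{x}_1^2), f(\tilde{x}_1, \tilde{x}_1) \right) > \frac{1}{2} \]
On the other hand, by (11.21) we have
\[ d\left( f(\hat{x}_1, \hat{x}_1^2), f(\tilde{x}_1, \tilde{x}_1) \right) \leq d\left( f(\hat{x}_1, \hat{x}_1^2), L \right) + d\left( L, f(\tilde{x}_1, \tilde{x}_1) \right) < \frac{1}{4} + \frac{1}{4} = \frac{1}{2} \]
We reached a contradiction, and therefore the limit $\lim_{(x_1, x_2) \to (0,0)} f(x_1, x_2)$ does not exist.
▲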
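The different behavior of $f(x_1, x_2) = \ln(1 + x_1 x_2)/x_1^2$ along the parabola $x_2 = x_1^2$ and along the straight line $x_2 = x_1$ can also be observed numerically. The following Python sketch evaluates $f$ at a few points of each path approaching the origin; the sample values of $t$ are arbitrary, and `math.log1p` is used only to compute $\ln(1 + \cdot)$ accurately for small arguments.

```python
import math


def f(x1, x2):
    """f(x1, x2) = ln(1 + x1 * x2) / x1**2, for x1 != 0 and 1 + x1 * x2 > 0."""
    return math.log1p(x1 * x2) / x1 ** 2


ts = (0.1, 0.01, 0.001)
along_parabola = [f(t, t ** 2) for t in ts]   # path x2 = x1**2
along_line = [f(t, t) for t in ts]            # path x2 = x1

print(along_parabola)   # values shrinking toward 0
print(along_line)       # values approaching 1
```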
6 Indeed, $d((\hat{x}_1, \hat{x}_1^2), (0, 0)) < \delta_2$, that is, $\hat{x}_1^2 + \hat{x}_1^4 < \delta_2^2$, implies $\hat{x}_1^2 < \delta_2^2$, whence $d(\hat{x}_1, 0) < \delta_2$.
Finally, we observe that Definition 468 of Chapter 12 will extend the notion of limit
to operators. Since this last extension does not present any substantial novelty, we prefer
to study it directly together with the notion of continuity of operators (which justifies the
study).
“Only if”. Let us suppose $\lim_{x \to x_0} f(x) = L \in \mathbb{R}$. Let $\{x_n\}$ be a sequence of points of $A$,
with $x_n \neq x_0$ for every $n$, such that $x_n \to x_0$. Let $\varepsilon > 0$. There exists $\delta_\varepsilon > 0$ such that for
every $x \in A$ with $0 < d(x, x_0) < \delta_\varepsilon$ we have $d(f(x), L) < \varepsilon$. Since $x_n \to x_0$ and $x_n \neq x_0$,
there exists $n_\varepsilon \geq 1$ such that $0 < d(x_n, x_0) < \delta_\varepsilon$ for every $n \geq n_\varepsilon$. For every $n \geq n_\varepsilon$ we have
therefore $d(f(x_n), L) < \varepsilon$, which implies $f(x_n) \to L$.
Example 421 We go back to the limit $\lim_{x \to 2} (3x - 5)$ of Example 398. Since $A = \mathbb{R}$, let
$\{x_n\}$ be any sequence of real numbers, with $x_n \neq 2$ for every $n$, such that $x_n \to 2$. For
example, $x_n = 2 + 1/n$ or $x_n = 2 - 1/n^2$. By the algebra of limits of sequences, we have
and the limit $\lim_{x \to 0} f(x)$. Since $A = \mathbb{R}_{++}$ and $x_0 = 0$, let $\{x_n\}$ be any sequence of strictly
positive real numbers, i.e., $x_n > 0$ for every $n$, such that $x_n \to 0$. For example, $x_n = 1/n$ or
$x_n = 1/n^2$. By the algebra of limits of sequences, we have
\[ \lim_{n \to +\infty} \frac{\sqrt{x_n}}{x_n} = \lim_{n \to +\infty} \frac{1}{\sqrt{x_n}} = +\infty \]
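The sequential point of view lends itself to simple numerical experiments. The sketch below evaluates $3x - 5$ along the sequences $x_n = 2 + 1/n$ and $x_n = 2 - 1/n^2$, and $\sqrt{x}/x$ along $x_n = 1/n$ and $x_n = 1/n^2$, all mentioned in the example. The helper `images` and the chosen indices are illustrative assumptions; a finite computation of this kind only suggests, and does not prove, the value of a limit.

```python
import math


def images(f, sequence, indices):
    """Return the values f(x_n) for the given indices n."""
    return [f(sequence(n)) for n in indices]


indices = (10, 1_000, 100_000)

# First limit: 3x - 5 along two sequences converging to 2 (never equal to 2).
affine = lambda x: 3 * x - 5
print(images(affine, lambda n: 2 + 1 / n, indices))
print(images(affine, lambda n: 2 - 1 / n ** 2, indices))

# Second limit: sqrt(x) / x along two sequences of strictly positive terms converging to 0.
ratio = lambda x: math.sqrt(x) / x
print(images(ratio, lambda n: 1 / n, indices))
print(images(ratio, lambda n: 1 / n ** 2, indices))
```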
The characterization of limits through sequences is important both from the computational
viewpoint, because the calculation of the limits of functions reduces to the simpler
calculation of limits of sequences, and from the theoretical viewpoint, because in this way
many of the properties seen for limits of sequences extend immediately to limits of functions.
In this section we will rely on the second, more theoretical aspect, to obtain basic properties
of limits of functions. We start with the uniqueness of the limit.
Proof Let us suppose, by contradiction, that there exist two different limits $L' \neq L''$. Let
$\{x_n\}$ be a sequence in $A$, with eventually $x_n \neq x_0$, such that $x_n \to x_0$. By Proposition 420,
$f(x_n) \to L'$ and $f(x_n) \to L''$, which contradicts the uniqueness of the limit for sequences.
It follows that $L' = L''$.
Alternative proof By contradiction, let us suppose that there exist two different limits $L_1$
and $L_2$, that is, $L_1 \neq L_2$. We assume therefore that
\[ \lim_{x \to x_0} f(x) = L_1 \]
and
\[ \lim_{x \to x_0} f(x) = L_2 \]
with $L_1 \neq L_2$. Without loss of generality, suppose that $L_1 > L_2$. There exists a number $K$
such that $L_1 > K > L_2$. Setting $0 < \varepsilon_1 < L_1 - K$ and $0 < \varepsilon_2 < K - L_2$, the neighborhoods
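$B_{\varepsilon_1}(L_1) = (L_1 - \varepsilon_1, L_1 + \varepsilon_1)$ and $B_{\varepsilon_2}(L_2) = (L_2 - \varepsilon_2, L_2 + \varepsilon_2)$ are disjoint.
[Figure: the disjoint neighborhoods $(L_1 - \varepsilon_1, L_1 + \varepsilon_1)$ and $(L_2 - \varepsilon_2, L_2 + \varepsilon_2)$ on the vertical axis]
Since by hypothesis $\lim_{x \to x_0} f(x) = L_1$, given $\varepsilon_1 > 0$ one can find $\delta_1 > 0$ such that
\[ x \in (x_0 - \delta_1, x_0 + \delta_1) \cap A,\ x \neq x_0 \implies f(x) \in (L_1 - \varepsilon_1, L_1 + \varepsilon_1) \tag{11.24} \]
Analogously, since by hypothesis $\lim_{x \to x_0} f(x) = L_2$, given $\varepsilon_2 > 0$ one can find $\delta_2 > 0$ such
that
\[ x \in (x_0 - \delta_2, x_0 + \delta_2) \cap A,\ x \neq x_0 \implies f(x) \in (L_2 - \varepsilon_2, L_2 + \varepsilon_2) \tag{11.25} \]
Taking $\delta = \min\{\delta_1, \delta_2\}$ we have that the neighborhood $(x_0 - \delta, x_0 + \delta)$ of $x_0$ with radius
$\delta$ is contained in the two previous neighborhoods, i.e., in $(x_0 - \delta, x_0 + \delta)$ both (11.24) and
(11.25) hold:
\[ \forall x \in (x_0 - \delta, x_0 + \delta),\ x \neq x_0 \implies f(x) \in (L_1 - \varepsilon_1, L_1 + \varepsilon_1) \text{ and } f(x) \in (L_2 - \varepsilon_2, L_2 + \varepsilon_2) \]
Hence,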
We continue with the version for functions of the theorem on the permanence of sign.
We leave to the reader the “sequential” proof, based on Theorem 281 and on Proposition 420, giving instead an alternative proof that does not use limits of sequences.
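Alternative proof Suppose that $L > 0$. Since $\lim_{x\to x_0} f(x) = L$, taking $\varepsilon = L/2 > 0$, there exists a neighborhood $U_\varepsilon(x_0)$ of $x_0$ such that
$$x \in U_\varepsilon(x_0) \cap A \setminus \{x_0\} \implies f(x) \in \left(L - \frac{L}{2}, L + \frac{L}{2}\right) = \left(\frac{L}{2}, \frac{3L}{2}\right)$$
Since $L/2 > 0$, we are done. For $L < 0$ the proof is analogous.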
Again we leave to the reader the “sequential” proof, based on Theorem 303 and on Proposition 420, and provide an alternative proof.
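Alternative proof Given $\varepsilon > 0$ arbitrary, we have to show that there exists $\delta > 0$ such that $f(x) \in (L - \varepsilon, L + \varepsilon)$ for every $x \in (x_0 - \delta, x_0 + \delta) \cap A$ with $x \neq x_0$. Since $\lim_{x\to x_0} g(x) = L$, given $\varepsilon > 0$, there exists $\delta_1 > 0$ such that
$$\forall x \in (x_0 - \delta_1, x_0 + \delta_1) \cap A,\ x \neq x_0 \implies L - \varepsilon < g(x) < L + \varepsilon \tag{11.28}$$
Since $\lim_{x\to x_0} h(x) = L$, given $\varepsilon > 0$, there exists $\delta_2 > 0$ such that
$$\forall x \in (x_0 - \delta_2, x_0 + \delta_2) \cap A,\ x \neq x_0 \implies L - \varepsilon < h(x) < L + \varepsilon \tag{11.29}$$
Now taking $\delta = \min\{\delta_1, \delta_2\}$, we have that in $(x_0 - \delta, x_0 + \delta) \cap A$ both (11.28) and (11.29) hold. Moreover, $g(x) \le f(x) \le h(x)$ in $(x_0 - \delta, x_0 + \delta) \cap A$. Therefore, for any $x \in (x_0 - \delta, x_0 + \delta) \cap A$, $x \neq x_0$, we have
$$L - \varepsilon < g(x) \le f(x) \le h(x) < L + \varepsilon$$
that is
$$f(x) \in (L - \varepsilon, L + \varepsilon) \qquad \forall x \in (x_0 - \delta, x_0 + \delta) \cap A,\ x \neq x_0$$
Since $\varepsilon$ was arbitrary, we have $\lim_{x\to x_0} f(x) = L$, as claimed.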
The interpretation of the result is completely analogous to the version for sequences. Let
us see a simple application, similar, mutatis mutandis, to the one seen in Example 304.
Example 426 Let $f : \mathbb{R} \setminus \{0\} \to \mathbb{R}$ be given by $f(x) = e^{x\cos^2\frac{1}{x}}$ and let $x_0 = 0$. Since
$$0 \le \cos^2\frac{1}{x} \le 1 \qquad \forall x \neq 0$$
considering the monotonicity of the exponential function, for $x > 0$ we have
$$1 = e^{0 \cdot x} \le e^{x\cos^2\frac{1}{x}} \le e^{1 \cdot x} = e^x \qquad \forall x > 0$$
Setting $g(x) = 1$ and $h(x) = e^x$, conditions (11.26) and (11.27) are satisfied with $L = 1$. Therefore $\lim_{x\to 0} f(x) = 1$. The proof for $x < 0$ is analogous.
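As a purely numerical illustration of the comparison argument in Example 426 (not a proof), the following Python sketch evaluates $f(x) = e^{x\cos^2(1/x)}$ at a few points $x > 0$ approaching $0$ and checks the bounds $1 \le f(x) \le e^x$. The sample points and the names `f`, `squeeze_holds`, `xs`, `gaps` are illustrative choices, not part of the text.

```python
import math

def f(x):
    # f(x) = exp(x * cos^2(1/x)), defined for x != 0
    return math.exp(x * math.cos(1.0 / x) ** 2)

def squeeze_holds(x):
    # For x > 0 the example claims 1 <= f(x) <= e^x
    return 1.0 <= f(x) <= math.exp(x)

# Sample points 10^-1, 10^-2, ..., 10^-8 approaching 0 from the right
xs = [10.0 ** (-k) for k in range(1, 9)]

all_squeezed = all(squeeze_holds(x) for x in xs)

# Distance of f(x) from the claimed limit L = 1 at each sample point
gaps = [abs(f(x) - 1.0) for x in xs]
```

Since both bounds tend to $1$, the values in `gaps` are forced to shrink as the sample points approach $0$.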
As already observed, also for functions, the permanence of sign and the comparison
theorems are properties of the limits with respect to the structure of order. The next
proposition, the analogue for functions of Proposition 282, is yet another simple result of the
same type.
(ii) If $L > H$, then there exists a neighborhood of $x_0$ in which $f(x) > g(x)$.
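Proof (i) By contradiction, assume that $L < H$. Set $\varepsilon = H - L$, so that $\varepsilon > 0$. The neighborhoods $(L - \varepsilon/4, L + \varepsilon/4)$ and $(H - \varepsilon/4, H + \varepsilon/4)$ are disjoint since $L + \varepsilon/4 < H - \varepsilon/4$. Since $\lim_{x\to x_0} f(x) = L$, there exists $\delta_1 > 0$ such that
$$x \in (x_0 - \delta_1, x_0 + \delta_1),\ x \neq x_0 \implies f(x) \in \left(L - \frac{\varepsilon}{4}, L + \frac{\varepsilon}{4}\right)$$
Analogously, since $\lim_{x\to x_0} g(x) = H$, there exists $\delta_2 > 0$ such that
$$x \in (x_0 - \delta_2, x_0 + \delta_2),\ x \neq x_0 \implies g(x) \in \left(H - \frac{\varepsilon}{4}, H + \frac{\varepsilon}{4}\right)$$
By setting $\delta = \min\{\delta_1, \delta_2\}$, we have
$$x \in (x_0 - \delta, x_0 + \delta),\ x \neq x_0 \implies L - \frac{\varepsilon}{4} < f(x) < L + \frac{\varepsilon}{4} < H - \frac{\varepsilon}{4} < g(x) < H + \frac{\varepsilon}{4}$$
that is, $f(x) < g(x)$ for every $x \in B_\delta(x_0)$ with $x \neq x_0$. This contradicts the hypothesis that $f(x) \ge g(x)$ in a neighborhood $B_r(x_0)$ of $x_0$.
(ii) If we had $f(x) \le g(x)$ in every neighborhood of $x_0$, then by (i) we would have $L \le H$.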
Observe that in (i) $L \ge H$ continues to hold also when we have the strict inequality $f(x) > g(x)$. For example, if $f(x) = 1/x$ and $g(x) = 0$, for $x \to +\infty$ we have $L = H = 0$ although $f(x) > g(x)$ for every $x > 0$.
290 CHAPTER 11. LIMITS OF FUNCTIONS
(i) $\lim_{x\to x_0} (f + g)(x) = L + M$, provided that $L + M$ is not an indeterminate form (1.24), of the type
$$+\infty - \infty \quad\text{or}\quad -\infty + \infty$$
(ii) $\lim_{x\to x_0} (fg)(x) = LM$, provided that $LM$ is not an indeterminate form (1.25), of the type
$$\pm\infty \cdot 0 \quad\text{or}\quad 0 \cdot (\pm\infty)$$
(iii) $\lim_{x\to x_0} (f/g)(x) = L/M$ provided that $g(x) \neq 0$ in a neighborhood of $x_0$, with $x \neq x_0$, and $L/M$ is not an indeterminate form (1.26), of the type$^8$
$$\frac{\pm\infty}{\pm\infty} \quad\text{or}\quad \frac{a}{0}$$
Proof We prove only (i), leaving to the reader the analogous proof of (ii) and (iii). Let $\{x_n\}$ be a sequence in $A$, with $x_n \neq x_0$ for every $n \ge 1$, such that $x_n \to x_0$. By Proposition 420, $f(x_n) \to L$ and $g(x_n) \to M$. Let us suppose that $L + M$ is not an indeterminate form. By Proposition 297, $(f + g)(x_n) \to L + M$, and therefore, by Proposition 420 it follows that $\lim_{x\to x_0} (f + g)(x) = L + M$.
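Example 429 Let $f, g : \mathbb{R} \setminus \{0\} \to \mathbb{R}$ be given by $f(x) = \sin x / x$ and $g(x) = 1/|x|$. We have $\lim_{x\to 0} \sin x / x = 1$ and $\lim_{x\to 0} 1/|x| = +\infty$, and therefore
$$\lim_{x\to 0} \left(\frac{\sin x}{x} + \frac{1}{|x|}\right) = 1 + \infty = +\infty$$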
As in the case of sequences, the case $a/0$ of point (iii) with $a \neq 0$ is not an indeterminate form for the algebra of limits, as the following version for functions of Proposition 300 shows.
Proposition 430 Let $\lim_{x\to x_0} f(x) = L \in \mathbb{R}$, with $L \neq 0$, and $\lim_{x\to x_0} g(x) = 0$. The limit $\lim_{x\to x_0} (f/g)(x)$ exists if and only if there is a neighborhood $U(x_0)$ of $x_0 \in \mathbb{R}^n$ where the function $g$ has constant sign, except at most at $x_0$. In such a case:$^9$
$^7$ As in the previous section, we will consider limits at points $x_0 \in \mathbb{R}^n$, leaving to the reader the case $x \to \pm\infty$ for functions of one variable.
$^8$ As in the case of sequences, we observe that excluding the indeterminacy $a/0$ is equivalent to requiring that $M \neq 0$.
$^9$ Here $g \to 0^+$ and $g \to 0^-$ indicate that $\lim_{x\to x_0} g(x) = 0$ with, respectively, $g(x) > 0$ and $g(x) < 0$ for every $x_0 \neq x \in U(x_0)$.
11.5. ALGEBRA OF LIMITS 291
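(i) if $L > 0$ and $g \to 0^+$, or if $L < 0$ and $g \to 0^-$, then
$$\lim_{x\to x_0} \frac{f(x)}{g(x)} = +\infty$$
(ii) if $L > 0$ and $g \to 0^-$, or if $L < 0$ and $g \to 0^+$, then
$$\lim_{x\to x_0} \frac{f(x)}{g(x)} = -\infty$$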
Example 432 Take $f(x) = 1/x - 1$ and $g(x) = 1/x$. As $x \to +\infty$ we have $f \to -1$ and $g \to 0$. Since $g(x) > 0$ for every $x > 0$, and therefore also in any neighborhood of $+\infty$, we have $g \to 0^+$. Thanks to the version for $x \to \pm\infty$ of Proposition 430, one obtains $\lim_{x\to +\infty} (f/g)(x) = -\infty$.
Indeterminacy $\infty - \infty$
Consider, for example, the limit $\lim_{x\to 0} (f + g)(x)$ of the sum of the functions $f, g : \mathbb{R} \setminus \{0\} \to \mathbb{R}$ given by $f(x) = 1/x^2$ and $g(x) = -1/x^4$, which falls under this indeterminacy. We have
$$(f + g)(x) = \frac{1}{x^2} - \frac{1}{x^4} = \frac{1}{x^2}\left(1 - \frac{1}{x^2}\right)$$
and therefore
$$\lim_{x\to 0} (f + g)(x) = \lim_{x\to 0} \frac{1}{x^2} \cdot \lim_{x\to 0} \left(1 - \frac{1}{x^2}\right) = -\infty$$
since $(+\infty) \cdot (-\infty)$ is not an indeterminacy. Exchanging the signs between these two functions, that is, setting $f(x) = -1/x^2$ and $g(x) = 1/x^4$, we have again the indeterminacy $\infty - \infty$ at $x_0 = 0$, but this time $\lim_{x\to 0} (f + g)(x) = +\infty$. Clearly, also in the case of functions the indeterminate forms can give completely different results and must be resolved case by case.
Finally note that such $f$ and $g$ give rise to an indeterminacy at $x_0 = 0$, but not at $x_0 \neq 0$. Therefore, it is crucial to specify the point $x_0$ that we are considering.
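The two opposite outcomes of the form $\infty - \infty$ discussed above can be observed numerically. The Python sketch below is only an illustration under arbitrary choices (the names `s1`, `s2` and the sample points are assumptions made here): it evaluates $1/x^2 - 1/x^4$ and $-1/x^2 + 1/x^4$ at points approaching $0$.

```python
def s1(x):
    # f(x) + g(x) with f(x) = 1/x^2 and g(x) = -1/x^4
    return 1.0 / x ** 2 - 1.0 / x ** 4

def s2(x):
    # Signs exchanged: f(x) = -1/x^2 and g(x) = 1/x^4
    return -1.0 / x ** 2 + 1.0 / x ** 4

# Sample points approaching x0 = 0 from the right
xs = [0.1, 0.01, 0.001]

v1 = [s1(x) for x in xs]  # decreases without bound
v2 = [s2(x) for x in xs]  # increases without bound
```

The same indeterminate form thus produces values heading towards $-\infty$ in one case and $+\infty$ in the other, which is why each case must be computed directly.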
Indeterminacy $0 \cdot \infty$
For example, let $f, g : \mathbb{R} \setminus \{3\} \to \mathbb{R}$ be given by $f(x) = (x - 3)^2$ and $g(x) = 1/(x - 3)^4$. The limit $\lim_{x\to 3} (fg)(x)$ falls under this indeterminacy. But we have
$$\lim_{x\to 3} (fg)(x) = \lim_{x\to 3} (x - 3)^2 \cdot \frac{1}{(x - 3)^4} = \lim_{x\to 3} \frac{1}{(x - 3)^2} = +\infty$$
On the other hand, considering $f(x) = 1/(x - 3)^2$ and $g(x) = (x - 3)^4$, we have
$$\lim_{x\to 3} (fg)(x) = \lim_{x\to 3} \frac{1}{(x - 3)^2} \cdot (x - 3)^4 = \lim_{x\to 3} (x - 3)^2 = 0$$
Again, only the direct calculation of the limit can determine its value.
with results of opposite sign in the two cases: again, one cannot avoid the direct calculation
of the limit.
For the functions $f$ and $g$ just seen, at the point $x_0 = 0$ we have the indeterminacy $0/0$, but
$$\lim_{x\to 0} \frac{f}{g}(x) = \lim_{x\to 0} \frac{x^2}{x} = \lim_{x\to 0} x = 0$$
while, setting $g(x) = x^4$, we still have a form of the type $0/0$ and
$$\lim_{x\to 0} \frac{f}{g}(x) = \lim_{x\to 0} \frac{x^2}{x^4} = \lim_{x\to 0} \frac{1}{x^2} = +\infty$$
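On the other hand, taking $f : \mathbb{R}_+ \to \mathbb{R}$ given by $f(x) = x + \sqrt{x} - 2$ and $g : \mathbb{R} \setminus \{1\} \to \mathbb{R}$ given by $g(x) = x - 1$, we have
$$\lim_{x\to 1} \frac{f}{g}(x) = \lim_{x\to 1} \frac{x + \sqrt{x} - 2}{x - 1} = \lim_{x\to 1} \frac{x - 1 + \sqrt{x} - 1}{x - 1} = \lim_{x\to 1} \left(1 + \frac{\sqrt{x} - 1}{x - 1}\right)$$
$$= 1 + \lim_{x\to 1} \frac{\sqrt{x} - 1}{(\sqrt{x} - 1)(\sqrt{x} + 1)} = 1 + \lim_{x\to 1} \frac{1}{\sqrt{x} + 1} = 1 + \frac{1}{2} = \frac{3}{2}$$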
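The value $3/2$ obtained by the algebraic manipulation above can be cross-checked numerically. The following Python sketch is a sanity check under illustrative choices (the name `q` and the step sizes `hs` are assumptions introduced here): it evaluates the quotient $(x + \sqrt{x} - 2)/(x - 1)$ on both sides of $x_0 = 1$.

```python
import math

def q(x):
    # Quotient f(x)/g(x) with f(x) = x + sqrt(x) - 2 and g(x) = x - 1, x != 1
    return (x + math.sqrt(x) - 2.0) / (x - 1.0)

# Step sizes shrinking towards 0
hs = [1e-2, 1e-4, 1e-6]

right = [q(1.0 + h) for h in hs]  # approach from the right
left = [q(1.0 - h) for h in hs]   # approach from the left
```

Both lists settle near $1.5$, consistent with the limit computed after cancelling the common factor $\sqrt{x} - 1$.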
We close with two observations.
11.6. ELEMENTARY LIMITS AND IMPORTANT LIMITS 293
As for sequences, for functions the various indeterminacies can be reduced to one
another.
Also in the case of functions we can summarize what we have seen till now in tables
completely identical to those in Section 8.8.4, to which we refer.
$$\lim_{x\to x_0} x^n = x_0^n$$
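(iv) Let $f : \mathbb{R}_{++} \to \mathbb{R}$ be given by $f(x) = \log_a x$, with $a > 0$, $a \neq 1$. For every $x_0 > 0$, we have $\lim_{x\to x_0} \log_a x = \log_a x_0$. Moreover,
$$\lim_{x\to 0^+} \log_a x = \begin{cases} -\infty & \text{if } a > 1 \\ +\infty & \text{if } a < 1 \end{cases} \quad\text{and}\quad \lim_{x\to +\infty} \log_a x = \begin{cases} +\infty & \text{if } a > 1 \\ -\infty & \text{if } a < 1 \end{cases}$$
(v) Let $f, g : \mathbb{R} \to \mathbb{R}$ be given by $f(x) = \sin x$ and $g(x) = \cos x$. For every $x_0 \in \mathbb{R}$, we have $\lim_{x\to x_0} \sin x = \sin x_0$ and $\lim_{x\to x_0} \cos x = \cos x_0$. The limits $\lim_{x\to \pm\infty} \sin x$ and $\lim_{x\to \pm\infty} \cos x$ do not exist.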
Proposition 434 Let $f, g : \mathbb{R}_+ \to \mathbb{R}$ be defined by $f(x) = \sin x / x$ and $g(x) = (\cos x - 1)/x$. Then
$$\lim_{x\to 0} \frac{\sin x}{x} = 1 \tag{11.30}$$
and
$$\lim_{x\to 0} \frac{1 - \cos x}{x} = 0, \qquad \lim_{x\to 0} \frac{1 - \cos x}{x^2} = \frac{1}{2} \tag{11.31}$$
Proof It is easy to see graphically that $0 < \sin x < x < \tan x$ for $x \in (0, \pi/2)$ and that $\tan x < x < \sin x < 0$ for $x \in (-\pi/2, 0)$. Therefore, dividing all the terms by $\sin x$ and observing that $\sin x > 0$ when $x \in (0, \pi/2)$ and $\sin x < 0$ when $x \in (-\pi/2, 0)$, we have in all the cases
$$1 < \frac{x}{\sin x} < \frac{1}{\cos x}$$
The first limit then follows from the comparison criterion. For the third one, it is sufficient to observe that
$$\frac{1 - \cos x}{x^2} = \frac{1 - \cos x}{x^2} \cdot \frac{1 + \cos x}{1 + \cos x} = \frac{1 - \cos^2 x}{x^2 (1 + \cos x)} = \frac{\sin^2 x}{x^2} \cdot \frac{1}{1 + \cos x}$$
and that, as $x \to 0$, the first factor tends to $1$ while the second one tends to $1/2$.
Finally, the second limit follows immediately from the third one:
$$\frac{1 - \cos x}{x} = x \cdot \frac{1 - \cos x}{x^2} \to 0 \cdot \frac{1}{2} = 0$$
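The inequalities and the three limits of Proposition 434 lend themselves to a quick numerical check. The Python sketch below is illustrative only (the names `bounds_hold`, `sin_ratio`, `cos_ratio`, `cos_ratio_first` and the sample points are choices made here): it tests the double inequality $1 < x/\sin x < 1/\cos x$ on both sides of $0$ and evaluates the three quotients near $0$.

```python
import math

def bounds_hold(x):
    # Double inequality used in the proof, valid for 0 < |x| < pi/2
    return 1.0 < x / math.sin(x) < 1.0 / math.cos(x)

# Sample points approaching 0; the negatives are checked as well
xs = [0.5, 0.1, 0.01, 0.001]
bounds_ok = all(bounds_hold(x) and bounds_hold(-x) for x in xs)

sin_ratio = [math.sin(x) / x for x in xs]                     # expected near 1
cos_ratio = [(1.0 - math.cos(x)) / x ** 2 for x in xs]        # expected near 1/2
cos_ratio_first = [(1.0 - math.cos(x)) / x for x in xs]       # expected near 0
```

The sample points are kept no smaller than $10^{-3}$ because the subtraction $1 - \cos x$ loses floating-point precision for very small $x$.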
From the analogous propositions for sequences we easily deduce (the proofs are essentially
identical) the following ones:
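In particular
$$\lim_{x\to x_0} \left(1 + \frac{1}{f(x)}\right)^{f(x)} = e, \qquad \lim_{x\to \pm\infty} \left(1 + \frac{1}{x}\right)^{x} = e$$
$$\lim_{x\to x_0} \frac{a^{f(x)} - 1}{f(x)} = \log a$$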
11.7. ORDERS OF CONVERGENCE AND OF DIVERGENCE 295
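In particular,
$$\lim_{x\to 0} \frac{a^x - 1}{x} = \log a \tag{11.32}$$
which, when $a = e$, becomes
$$\lim_{x\to 0} \frac{e^x - 1}{x} = 1$$
(iii) Let $0 < a \neq 1$ and $f(x) \to 0$ as $x \to x_0$. Then
$$\lim_{x\to x_0} \frac{\log_a (1 + f(x))}{f(x)} = \frac{1}{\log a}$$
In particular,
$$\lim_{x\to 0} \frac{\log_a (1 + x)}{x} = \frac{1}{\log a}$$
which, when $a = e$, becomes
$$\lim_{x\to 0} \frac{\log(1 + x)}{x} = 1$$
(iv) If $f(x) \to 0$ as $x \to x_0$, we have
$$\lim_{x\to x_0} \frac{(1 + f(x))^{\alpha} - 1}{f(x)} = \alpha$$
In particular,
$$\lim_{x\to 0} \frac{(1 + x)^{\alpha} - 1}{x} = \alpha$$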
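The special cases with $f(x) = x$ and $x_0 = 0$ can be checked numerically. The Python sketch below is an illustration under assumptions: the exponent in (iv) is taken to be a real parameter, here called `alpha`, and the base $a = 2$, the value `alpha = 0.5` and the point $x = 10^{-6}$ are arbitrary choices.

```python
import math

def exp_ratio(a, x):
    # (a^x - 1) / x, expected to approach log(a) as x -> 0
    return (a ** x - 1.0) / x

def log_ratio(a, x):
    # log_a(1 + x) / x, expected to approach 1 / log(a) as x -> 0
    return math.log(1.0 + x, a) / x

def power_ratio(alpha, x):
    # ((1 + x)^alpha - 1) / x, expected to approach alpha as x -> 0
    return ((1.0 + x) ** alpha - 1.0) / x

x = 1e-6  # a small but not tiny point, to limit cancellation error

err_exp = abs(exp_ratio(2.0, x) - math.log(2.0))
err_exp_e = abs(exp_ratio(math.e, x) - 1.0)
err_log = abs(log_ratio(2.0, x) - 1.0 / math.log(2.0))
err_pow = abs(power_ratio(0.5, x) - 0.5)
```

All four errors are of the order of $x$ itself, as one would expect from the first neglected term in each expansion.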
(i) If
$$\lim_{x\to x_0} \frac{f(x)}{g(x)} = 0$$
we say that $f$ is negligible with respect to $g$ as $x \to x_0$; in symbols,
$$f = o(g) \quad\text{as } x \to x_0$$
(ii) If
$$\lim_{x\to x_0} \frac{f(x)}{g(x)} = k \neq 0 \tag{11.33}$$
we say that $f$ is comparable with $g$ as $x \to x_0$; in symbols,
$$f \asymp g \quad\text{as } x \to x_0$$
(iii) In particular, if
$$\lim_{x\to x_0} \frac{f(x)}{g(x)} = 1$$
we say that $f$ and $g$ are asymptotic (or asymptotically equivalent) to one another as $x \to x_0$ and we write
$$f(x) \sim g(x) \quad\text{as } x \to x_0$$
It is easy to see that for functions, too, the relations $\asymp$ and $\sim$ enjoy the same properties seen in Section 8.12 for sequences:
(i) the relations of comparability and of asymptotic equivalence are symmetric and transitive;
(iii) if the limits $\lim_{x\to x_0} f(x)$ and $\lim_{x\to x_0} g(x)$ are both finite and non-zero, then $f \asymp g$ as $x \to x_0$;
We now consider the cases, the most interesting also for functions, in which both functions converge to zero or diverge to $\pm\infty$. We start with the convergence to zero: $\lim_{x\to x_0} f(x) = \lim_{x\to x_0} g(x) = 0$. In this case, intuitively, $f$ is negligible with respect to $g$ as $x \to x_0$ if it tends to zero faster. Let, for example, $x_0 = 1$, and $f(x) = (x - 1)^2$ and $g(x) = x - 1$. We have
$$\lim_{x\to 1} \frac{(x - 1)^2}{x - 1} = \lim_{x\to 1} (x - 1) = 0$$
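larger and larger values in absolute value. For example, if $f(x) = x$ and $g(x) = x^2$, for $x_0 = +\infty$, we have
$$\lim_{x\to +\infty} \frac{x}{x^2} = \lim_{x\to +\infty} \frac{1}{x} = 0$$
so that $f = o(g)$ as $x \to +\infty$. At $x_0 = 0$, instead,
$$\lim_{x\to 0} \frac{x^2}{x} = \lim_{x\to 0} x = 0$$
and therefore $g = o(f)$ as $x \to 0$.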
For functions, too, the meaning of negligibility must be specified according to whether we consider convergence to zero or divergence to infinity. Moreover, the point $x_0$ where we take the limit is essential, and this represents the only meaningful novelty with respect to what we have seen for sequences.
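The role of the point $x_0$ can be made concrete with the pair $x$ and $x^2$: which of the two is negligible depends on where the limit is taken. The Python sketch below is a numerical illustration only (the names `at_infinity`, `at_zero` and the sample points are arbitrary choices made here).

```python
# Ratio x / x^2 along points diverging to +infinity:
# it shrinks, consistent with x = o(x^2) as x -> +infinity
at_infinity = [x / x ** 2 for x in (1e2, 1e4, 1e6)]

# Ratio x^2 / x along points converging to 0:
# it shrinks, consistent with x^2 = o(x) as x -> 0
at_zero = [x ** 2 / x for x in (1e-2, 1e-4, 1e-6)]
```

The two statements do not conflict, because each refers to a different point $x_0$.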
Proposition 436 For every pair of functions $f$ and $g$ and for every scalar $c \neq 0$, it holds that:
(i) $o(f) + o(f) = o(f)$;
The expression $o(f) + o(f) = o(f)$ in (i), bizarre at first sight, simply means that the sum of two little-o of the same function is still a little-o of that function, that is, it continues to be negligible with respect to that function. Re-reading the other properties of the proposition in the same way facilitates their understanding. Note that (ii) has the remarkable special case
$$o(f)\, o(f) = o(f^2)$$
Proof As for sequences, if a function is little-o of $f$ it can be written as $f\varepsilon$, where $\varepsilon$ is an infinitesimal. Indeed
$$\lim_{x\to x_0} \frac{f\varepsilon}{f} = \lim_{x\to x_0} \varepsilon = 0$$
and therefore $f\varepsilon$ is little-o of $f$. The proof will be based on this very useful artifice.
(i) Let us call $f\varepsilon$ the first of the two little-o to the left of the equality symbol, and $f\eta$ the second one, with $\varepsilon$ and $\eta$ two infinitesimals as $x \to x_0$. Then
$$\lim_{x\to x_0} \frac{f\varepsilon + f\eta}{f} = \lim_{x\to x_0} (\varepsilon + \eta) = 0$$
so that $o(f) + o(f) = o(f)$.
Example 437 Let $f(x)$ be the function given by $f(x) = x^n$, with $n > 2$. Consider the functions $g(x) = x^{n-1}$ and $h(x) = e^{-x} - 3x^{n-1}$. It is immediate to verify that $g = o(f) = o(x^n)$ and $h = o(f) = o(x^n)$ as $x \to +\infty$.
(i) Summing the two functions we obtain $g + h = e^{-x} - 2x^{n-1}$, which is still $o(x^n)$ as $x \to +\infty$, in accordance with (i) proved above.
(ii) Multiplying the two functions we obtain $g \cdot h = x^{n-1} e^{-x} - 3x^{2n-2}$, which is $o(x^n \cdot x^n)$, i.e., $o(x^{2n})$ as $x \to +\infty$, in accordance with (ii) proved above (in the special case $o(f)o(f)$). Note that (since $n > 2$) $g \cdot h$ is not $o(x^n)$.
(iii) Let us set $c = 3$, and consider $c \cdot g = 3x^{n-1}$. It is immediate to verify that $3x^{n-1}$ is still $o(x^n)$ as $x \to +\infty$, in accordance with (iii) proved above.
(iv) Consider the function $l(x) = x + 1$. It is immediate to verify that $l = o(g) = o(x^{n-1})$ as $x \to +\infty$. Consider now the sum $l + h$ (with $h$ defined above), which is a sum of an $o(g)$ and of an $o(f)$, with $g = o(f)$. We have $l + h = x + 1 + e^{-x} - 3x^{n-1}$, which is $o(x^n)$ as $x \to +\infty$, i.e., $o(f)$, in accordance with (iv) proved above. Note that $l + h$ is not $o(g)$, even if $l$ is an $o(g)$.
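The claims of Example 437 can be examined numerically for a specific exponent. The Python sketch below assumes $n = 3$ (any integer $n > 2$ would do) and uses illustrative names and sample points chosen here; it computes the ratios whose limits decide each little-o statement as $x \to +\infty$.

```python
import math

n = 3  # arbitrary choice satisfying n > 2

def f(x):
    return x ** n

def g(x):
    return x ** (n - 1)

def h(x):
    return math.exp(-x) - 3 * x ** (n - 1)

def l(x):
    return x + 1

# Sample points diverging to +infinity
xs = [1e1, 1e2, 1e3, 1e4]

sum_over_f = [(g(x) + h(x)) / f(x) for x in xs]        # tends to 0: g + h = o(f)
prod_over_f2 = [g(x) * h(x) / f(x) ** 2 for x in xs]   # tends to 0: g*h = o(f^2)
prod_over_f = [g(x) * h(x) / f(x) for x in xs]         # unbounded: g*h is not o(f)
lh_over_f = [(l(x) + h(x)) / f(x) for x in xs]         # tends to 0: l + h = o(f)
lh_over_g = [(l(x) + h(x)) / g(x) for x in xs]         # tends to -3: l + h is not o(g)
```

The last list approaches $-3$ rather than $0$, which shows numerically why the sum $l + h$ fails to be $o(g)$ even though $l$ alone is.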
N.B. (i) To state that a function is $o(1)$ as $x \to x_0$ simply means that it tends to $0$ as $x \to x_0$. Indeed, $f = o(1)$ means that $f/1 = f \to 0$ as $x \to x_0$. (ii) The fourth property of the previous proposition is particularly important, since it highlights that if $g$ is negligible with respect to $f$, in the sum $o(g) + o(f)$ the little-o $o(g)$ is “absorbed” by $o(f)$.
Let us see some classical examples of functions with different rates of divergence.
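$$\lim_{x\to +\infty} \frac{x^k}{\alpha^x} = 0$$
(ii) $x^h = o(x^k)$ as $x \to +\infty$ if $h < k$;
$$\lim_{x\to +\infty} \frac{\log_a x}{x^k} = 0$$
Note that, by the transitivity property of the negligibility relation, from (i) and (ii) it follows that
$$\log_a x = o(\alpha^x) \quad\text{as } x \to +\infty$$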
Proof Each of the three functions $\alpha^x$, $x^k$, $\log_a x$ is increasing, so that $f(n) \le f(x) \le f(n + 1)$, where $n = [x]$ is the integer part of $x$. It is then sufficient to use the sequential definition of the limit of a function together with the comparison criterion.
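The different rates of divergence can also be seen numerically. The Python sketch below is an illustration under assumptions: the base of the exponential is called `alpha` and set to $2$, the powers are $k = 3$ and $k = 1/2$, the logarithm is natural, and the sample points are arbitrary (kept moderate so that `alpha ** x` does not overflow).

```python
import math

alpha = 2.0  # base of the exponential, assumed greater than 1
k = 3.0      # exponent of the power compared with the exponential

# x^k / alpha^x along points diverging to +infinity
xs_exp = [10.0, 100.0, 500.0]
power_over_exp = [x ** k / alpha ** x for x in xs_exp]

# log(x) / x^(1/2) along points diverging to +infinity
xs_log = [1e2, 1e4, 1e8]
log_over_power = [math.log(x) / x ** 0.5 for x in xs_log]
```

The first ratio collapses extremely quickly, while the second decays slowly, which reflects how much closer a logarithm is to a small power than a power is to an exponential.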
N.B. To state that a function is an $o(1)$ as $x \to x_0$ simply means that it tends to $0$: indeed, to state that $f(x) = o(1)$ means that $f(x)/1 = f(x) \to 0$.
that is, two functions asymptotic to one another as x ! x0 have the same limit as x ! x0 .
In particular, we have the following version for functions of Lemma 320.
(i) $f(x) + h(x) \sim g(x) + l(x)$ as $x \to x_0$, provided that there exist $k > 0$ and a neighborhood $B_\varepsilon(x_0)$ of $x_0$ such that
$$\left|\frac{g(x)}{g(x) + l(x)}\right| \le k \tag{11.34}$$
for every $x \in B_\varepsilon(x_0)$;
(iii) $f(x)/h(x) \sim g(x)/l(x)$ as $x \to x_0$, provided that $h(x) \neq 0$ and $l(x) \neq 0$ in every point $x \neq x_0$ of a neighborhood $B_\varepsilon(x_0)$.
Therefore,
$$\lim_{x\to x_0} f(x) = L \iff \lim_{x\to x_0} \left(f(x) + o(f(x))\right) = L$$
and
$$\frac{f(x) + o(f(x))}{g(x) + o(g(x))} \sim \frac{f(x)}{g(x)} \quad\text{as } x \to x_0 \tag{11.37}$$
Let us see some examples.
3
and let us set f (x) = x and g (x) = x 2 . As x ! +1 we have
1 2
2x 2 + 5x 3 = o (f ) and 3 + 3x = o (g)
x 2+ 2x 4 + e x x 2
4 + x 8 + 3x 10 4
= x2 ! +1 as x ! +1
x x
N
$$\lim_{x\to 0} \frac{1 - \cos x}{\sin^2 x + x^3}$$
Applying …rst (11.37) and then Lemma 439 item (iii), we obtain
11.7.3 Terminology
Here too, for the comparison of two functions converging to zero and of two functions tending to $\pm\infty$, there is a specific terminology. In particular,
(ii) a function $f$ such that $\lim_{x\to x_0} f(x) = \pm\infty$ is called infinite (or unbounded, or infinitely large) as $x \to x_0$;
(iii) if two functions $f$ and $g$ are infinitesimal at $x_0$ and such that $f = o(g)$ as $x \to x_0$, then $f$ is said to be infinitesimal of higher order at $x_0$ with respect to $g$;
(iv) if two functions $f$ and $g$ are infinite at $x_0$ and such that $f = o(g)$ as $x \to x_0$, then $f$ is said to be infinite of lower order with respect to $g$.
A function is therefore infinitesimal of higher order than another one if it tends to zero faster, while it is infinite of lower order if it tends to infinity more slowly.
(ii) $x^k = o(\alpha^x)$ for every $\alpha > 1$ and $k > 0$, as already proved with the ratio criterion. If instead $0 < \alpha < 1$ and $k > 0$, then $\alpha^x = o(x^k)$.
(iii) If $k_1 > k_2 > 0$, then $x^{k_2} = o(x^{k_1})$: indeed, $x^{k_2}/x^{k_1} = x^{k_2 - k_1} \to 0$.
$$\frac{\log^{h_2} x}{\log^{h_1} x} = \frac{1}{\log^{h_1 - h_2} x} \to 0$$
The previous results adapt easily to the case in which instead of $x$ we have a function $f(x) \to +\infty$ as $x \to x_0$. Moreover, such results can be organized in scales of infinities and infinitesimals, in analogy with what we have seen for sequences. For brevity we omit the details.