A Course in Analysis - Volume I - Introductory Calculus, Analysis of Functions of One Real Variable


A Course in Analysis
Vol. I
Part 1 Introductory Calculus
Part 2 Analysis of Functions of One Real Variable
Vol. II
Part 3 Differentiation of Functions of Several Variables
Part 4 Integration of Functions of Several Variables
Part 5 Vector Calculus
Vol. III
Part 6 Measure and Integration Theory
Part 7 Complex-valued Functions of a Complex Variable
Part 8 Fourier Analysis
Vol. IV
Part 9 Ordinary Differential Equations
Part 10 Partial Differential Equations
Part 11 Calculus of Variations
Vol. V
Part 12 Functional Analysis
Part 13 Operator Theory
Part 14 Theory of Distributions
Vol. VI
Part 15 Differential Geometry of Curves and Surfaces
Part 16 Differentiable Manifolds and Riemannian Geometry
Part 17 Lie Groups
Vol. VII
Part 18 History of Analysis

A Course in
Analysis
Vol. I
Introductory Calculus
Analysis of Functions of One Real Variable
Niels Jacob
Kristian P Evans
Swansea University, UK
World Scientific
NEW JERSEY • LONDON • SINGAPORE • BEIJING • SHANGHAI • HONG KONG • TAIPEI • CHENNAI • TOKYO

Published by
World Scientific Publishing Co. Pte. Ltd.
5 Toh Tuck Link, Singapore 596224
USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601
UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE
British Library Cataloguing-in-Publication Data

A catalogue record for this book is available from the British Library.
A COURSE IN ANALYSIS
Volume I: Introductory Calculus, Analysis of Functions of One Real Variable
Copyright © 2016 by World Scientific Publishing Co. Pte. Ltd.
All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or
mechanical, including photocopying, recording or any information storage and retrieval system now known or to
be invented, without written permission from the publisher.
For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center,
Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from
the publisher.
ISBN 978-981-4689-08-3
ISBN 978-981-4689-09-0 (pbk)
Printed in Singapore

Preface
We are currently living in times where many undergraduates consider the
internet as their main, if not their only source for supporting their academic
studies. Furthermore, many publishers prefer short textbooks directly related
to modules as the best solution for mathematics textbooks. This project,
namely to write and publish a whole course on analysis consisting of up to 6
volumes, may therefore appear to be going against the grain, perhaps even
a Don Quixote-style fight against modernity. However, the motivation for
developing these volumes has slowly emerged over the last few years by our
observations while teaching analysis to undergraduates.
The modular approach to teaching combined with examination pressure has

prevented students from seeing crucial connections between topics being
taught in different modules, even when prerequisites and dependencies are
emphasised. In fact many universities in the U.K. expect their modules to
be quite independent. The problem is further amplified by the tendency for
lecturers to teach the same module for several years - mainly to reduce the
burden of teaching in order to gain more time for research. All this has led to
a situation where topics such as analysis of several variables, vector calculus,
differential geometry of curves and surfaces are seen by students as rather
unrelated topics. They also consider Lebesgue integration, real-variable theory
and Fourier analysis as separate topics with no connections, and this list is
unfortunately easy to extend. The situation where algebraic concepts (linear
algebra, symmetry and groups) are used in higher dimensional analysis is
even worse. In short, while in the most exciting recent mathematical research
the interplay of many diverse mathematical subject areas becomes
more important than ever, in our teaching as it is perceived by the students,
we artificially separate closely related mathematical topics and put them into
isolated boxes called modules. It is clear that such a common practice pre-
vents even the better students from advancing and getting a deeper insight
into mathematics.
Five years ago, after long discussions and preparations we changed the un-
dergraduate mathematics provision at Swansea University. We now think
more in terms of courses than modules. Our analysis course runs over five
terms as does our algebra course, and both are compulsory for all students.
Clearly there are still choices and in particular in the final year students can
choose out of quite a few advanced modules. A further, rather important new
feature of the new provision is that we leave (whenever possible) each course
for each cohort in the hands of one lecturer. The students seem to favour
this type of continuity in terms of both the presentation of material and the
lecturer, and more importantly they are performing much better than they
have done in previous years.
Another problem that needed to be addressed was providing students with

problems that fitted to their lecture material. Everyone who has taught
mathematics for some time has experienced that many problems eventually
do not work out because at some point in the solution a result not yet covered
in the lectures is needed. But students still need to have a good number
of problems with correct solutions. These should be a mixture of routine
exercises, more testing problems going beyond what was so far covered in
the lecture and some real challenges. Moreover problems can provide an
opportunity to extend the theory or link to other parts of mathematics, but
they are only useful when students are confident that they have mastered
them correctly. For this reason we have added to every chapter a good
number of problems and we have provided complete solutions. In total, for
the 32 chapters in volume 1 there are more than 360 problems (often with
sub-problems) with complete solutions. This part constitutes more than 25%
of the first volume. Note that problems marked with * are more challenging.
Our aim is to provide students and lecturers with a coherent text which can
and should serve entire undergraduate studies in Analysis. The Course can
also be used as a standard reference work. It might be worth mentioning
that for graduate students in analysis such a lack of a modern course was
also felt at no other place but Princeton University. E.M. Stein’s four-volume
course “Princeton Lectures in Analysis” published jointly with R. Shakarchi
between 2003 - 2011 is a response to such a real need, i.e. multiple-volume
courses are by no means out of date, maybe they are needed more than ever
to give students a foundation and a lasting reference for their mathematical
education and beyond.
The first named author has taught mathematics, mainly analysis related
topics, but also geometry and probability theory, for over 38 years at 7 uni-
versities in 2 countries. The material in this course is based on ca. 40
different modules he has taught over the years. For these volumes the material
was of course rearranged and amended, but nonetheless to a large extent
they still reflect the provision. This first volume covers first year analysis
as taught by the first named author with the support of the second named
author in Swansea in the academic year 2010/11, an introduction to calculus
and analysis of functions of one variable.
Finally we want to thank all who have supported us in writing this volume,
in particular the World Scientific Press team.
Niels Jacob
Kristian P. Evans
Swansea, January 2015
Contents
Preface
Acknowledgements and Apologies
List of Symbols
The Greek Alphabet
Part 1: Introductory Calculus
1 Numbers - Revision
2 The Absolute Value, Inequalities and Intervals
3 Mathematical Induction
4 Functions and Mappings
5 Functions and Mappings Continued
6 Derivatives
7 Derivatives Continued
8 The Derivative as a Tool to Investigate Functions
9 The Exponential and Logarithmic Functions
10 Trigonometric Functions and Their Inverses
11 Investigating Functions
12 Integrating Functions
13 Rules for Integration
Part 2: Analysis in One Dimension
14 Problems with the Real Line
15 Sequences and their Limits
16 A First Encounter with Series
17 The Completeness of the Real Numbers
18 Convergence Criteria for Series, b-adic Fractions
19 Point Sets in R
20 Continuous Functions
21 Differentiation
22 Applications of the Derivative
23 Convex Functions and some Norms on $\mathbb{R}^n$
24 Uniform Convergence and Interchanging Limits
25 The Riemann Integral
26 The Fundamental Theorem of Calculus
27 A First Encounter with Differential Equations
28 Improper Integrals and the Γ-Function
29 Power Series and Taylor Series
30 Infinite Products and the Gauss Integral
31 More on the Γ-Function
32 Selected Topics on Functions of a Real Variable
Appendices
Appendix I: Elementary Aspects of Mathematical Logic
Appendix II: Sets and Mappings. A Collection of Formulae
Appendix III: The Peano Axioms
Appendix IV: Results from Elementary Geometry
Appendix V: Trigonometric and Hyperbolic Functions
Appendix VI: More on the Completeness of R
Appendix VII: Limes Superior and Limes Inferior
Appendix VIII: Connected Sets in R
Solutions to Problems of Part 1
Solutions to Problems of Part 2
References
Mathematicians Contributing to Analysis
Subject Index
Acknowledgements and Apologies

Calculus and basic analysis of functions of one real variable is a standard
topic taught in mathematics across the world. The material is well studied
and a lot of textbooks covering the topics exist. The first textbook dealing
with “calculus”, i.e. analysis of a real-valued function of one variable, was
published in 1696 by de l’Hospital. In the last 300 years thousands of such
textbooks have been published in all major languages, in addition many col-
lections of problems have been added. This is easy to understand since the
topic was and still is rapidly developing, in particular its place within mathe-
matics, and this has of course an impact on its presentation. Thus, there is a
need to “rewrite” calculus and analysis textbooks in each generation. How-
ever basic results and examples (and hence problems) remain unchanged and
still have a place in modern presentations. The tradition in writing textbooks
on such a topic is not to give detailed references to original sources, in fact
this is almost impossible. In drafting my own lecture notes I made use of
many of them, but as all academics know, when drafting lecture notes about
standard material we do not usually make a lot of references. Consequently,
when now using my notes which are partly three decades old, I do not recall
most of the sources I used and combined at that time. There are a number
of books that I used as both a student and a lecturer and therefore they have
been used here. Thus in the main text there are essentially no references
but I do acknowledge the important influence of the following treatises (and
I will always refer below to the copy I had used).
Dieudonné, J., Grundzüge der modernen Analysis, 2. Aufl. Logik und

Grundlagen der Mathematik Bd. 8. Friedrich Vieweg & Sohn, Braunschweig
1972.
Endl, K., und Luh, W., Analysis I, 3. Aufl. Analysis II, 2. Aufl. Akade-
mische Verlagsgesellschaft, Wiesbaden 1975 und 1974.
Fichtenholz, G.M., Differential- und Integralrechnung I, 8. Aufl. Differential-

und Integralrechnung II, 4. Aufl. Differential- und Integralrechnung III, 6.
Aufl. Hochschulbücher für Mathematik Bd. 61, 62, 63. VEB Deutscher
Verlag der Wissenschaften, Berlin 1973, 1972 und 1973.
Forster, O., Analysis 1, 2. Nachdruck. Analysis 2, 2. Nachdruck. Analysis 3.

Friedrich Vieweg & Sohn, Braunschweig 1978, 1979, 1981. (These books will
have a stronger impact on some passages, in particular in parts dealing with

integration theory, since they were much used textbooks when I started my
teaching career supporting corresponding modules.)
Heuser, H., Lehrbuch der Analysis. Teil 1 und 2. B.G. Teubner Verlag,
Stuttgart 1980 und 1981.
Rudin, W., Principles of Mathematical Analysis, 3rd ed. McGraw-Hill In-

ternational Editions, Mathematical Series. McGraw-Hill Book Company,
Singapore 1976.
Walter, W., Gewöhnliche Differentialgleichungen. Heidelberger Taschenbücher

Bd. 110. Springer Verlag, 1972.
Walter, W., Analysis 1, 3. Aufl. Analysis 2, 4. Aufl. Springer Verlag, Berlin,

1992 und 1995.
For compiling the lists of formulae in some of the appendices we often used
Zeidler, E., (ed.), Oxford Users’ Guide to Mathematics. Oxford University

Press, Oxford 2004.
Solved problems are important for students and we used some existing col-
lections of solved problems to supplement our selection. Sometimes these
collections served only to get some ideas, on some occasions we picked prob-
lems but provided different or modified solutions, but here and there we used
complete solutions. The main sources which are very valuable for students
are
Kaczor, W.J., and Nowak, M.T., Problems in Mathematical Analysis I, II

and III. Students Mathematical Library Vol. 4, 12, and 21. American Math-
ematical Society, Providence R.I., 2000, 2001, and 2003.
Lipschutz, M.M., Differentialgeometrie. Theorie und Anwendung. (Deutsche

Bearbeitung von H.-D. Landschulz.) Schaum’s Outline Series. McGraw-Hill
Book Company, Duesseldorf, 1980.
Spiegel, M.R. Advanced Calculus. Schaum’s Outline Series Theory and

Problems. McGraw-Hill Book Company, New York 1963.
Spiegel, M.R., Real Variables. Schaum’s Outline Series Theory and Problems.
McGraw-Hill Book Company, New York 1969.
Spiegel, M.R., Advanced Mathematics for Engineers and Scientists. Schaum’s

Outline Series Theory and Problems. McGraw-Hill Book Company, New
York 1971.
We would finally like to mention that although we have endeavoured to correct
all typos etc. via proof-reading, clearly some errors may remain. Please
contact us if you find any such mistakes.
Niels Jacob
List of Symbols
N natural numbers
kN := {n ∈ N | n = km for m ∈ N}

$\mathbb{N}_0 := \mathbb{N} \cup \{0\}$
Z integers
Q rational numbers
R real numbers
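$\mathbb{R}_+$   non-negative real numbers
$\mathbb{R}^n = \mathbb{R} \times \cdots \times \mathbb{R}$   set of ordered n-tuples of real numbers
$x^{-1} := \frac{1}{x}$
$x^n := x \cdot x \cdot \ldots \cdot x$   (n factors)
$a^{\frac{1}{n}}$ or $\sqrt[n]{a}$   nth root of a
$x^{\frac{n}{m}} = \sqrt[m]{x^n}$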
x > 0 x is strictly greater than 0
x < 0 x is strictly less than 0
x ≥ 0 x is non-negative
x ≤ 0 x is non-positive
|x| absolute value of x
∞ infinity
−∞ negative infinity
n! n factorial
$\binom{n}{k}$   binomial coefficient
$\max\{a_1, \ldots, a_n\}$   maximum of $a_1, \ldots, a_n$
$\min\{a_1, \ldots, a_n\}$   minimum of $a_1, \ldots, a_n$
$\sum_{j=1}^{n} a_j$   finite sum of $a_j$
$\sum_{k=1}^{\infty} a_k$   infinite series
$\sum_{j=m}^{k} a_j = a_m + a_{m+1} + \cdots + a_k$
$\prod_{j=1}^{n} a_j$   finite product of $a_j$
$\prod_{j=l}^{n} a_j = a_l \cdot a_{l+1} \cdot \ldots \cdot a_n$
$\prod_{j=1}^{\infty} a_j$   infinite product of $a_j$
X ×Y Cartesian product
∅ empty set
P(X) power set of the set X
∈ belongs to
∉ does not belong to
x • y binary operation
⊂ set subset
M1 \ M2 set subtraction
∩ set intersection
∪ set union
A complement of A
=⇒ implies
xRy relation
∼ equivalence relation
[a] equivalence class
∨ or
∧ and
⇐⇒ equivalence (statements)
∀ for all
∃ there exists
¬p negation of p
$\bigcup_{j=1}^{N} A_j$   finite union of sets $A_j$
$\bigcap_{j=1}^{N} A_j$   finite intersection of sets $A_j$
$B_\epsilon(a) := \{x \in \mathbb{R} \mid |x - a| < \epsilon\}$
$S^1$   circle centred at the origin with radius 1
$\overline{B_\epsilon(a)} := \{x \in \mathbb{R} \mid |x - a| \le \epsilon\}$
(a, b) := {x ∈ R | a < x < b}
[a, b) := {x ∈ R | a ≤ x < b}
(a, b] := {x ∈ R | a < x ≤ b}
[a, b] := {x ∈ R | a ≤ x ≤ b}
(0, ∞) := {x ∈ R | x > 0}
max D maximum of D
min D minimum of D
sup D supremum of D
inf D infimum of D
f : D → R mappings, see Chapter 4
D(f ) domain of f
Γ(f ) graph of f
R(f ) range of f
f (D) image of D under f
$f^{-1}(B)$   pre-image of B
Aut(X)   set of all bijective mappings f : X → X
$f_2 \circ f_1$   composition of $f_1$ with $f_2$
$\chi_A$   characteristic function of a set A
$\mathrm{pr}_1$   first coordinate projection
$\mathrm{pr}_2$   second coordinate projection
$f^{-1}$   inverse mapping
$\mathrm{id}_D$   identity mapping
$f|_{D_1}$   restriction of f to $D_1$
$f^+$   positive part of f
$f^-$   negative part of f
f ⊥ g   f and g orthogonal
$C^k(I)$   k-times continuously differentiable functions
$C(I) = C^0(I)$   continuous functions
$C^\infty(I)$   arbitrarily often differentiable functions
$C_b^k(I)$   k-times differentiable bounded functions
M(K; R) set of functions from K to R

$M_b(K; \mathbb{R}) := \{f : K \to \mathbb{R} \mid \sup_{x \in K} |f(x)| < \infty\}$
BV([a, b])   set of functions of bounded variation on [a, b]
T[a, b]   step functions on [a, b]
$\lim_{y \to x} f(y) = a$   limit of the function f
$\lim_{y \to \infty} f(y) = a$   limit of the function f at ∞
$f'(x_0)$ or $\frac{df(x_0)}{dx}$   derivative of f with respect to x at $x_0$
$f''(x_0)$ or $f^{(2)}(x_0)$ or $\frac{d^2 f(x_0)}{dx^2}$   second derivative of f at $x_0$
$f^{(k)}(x_0)$ or $\frac{d^k f(x_0)}{dx^k}$   kth derivative of f at $x_0$
$(a_n)_{n \in \mathbb{N}}$   sequence
$(a_{n_l})_{l \in \mathbb{N}}$   subsequence
$\lim_{n \to \infty} a_n = a$   limit of a sequence
$\limsup_{n \to \infty} = \overline{\lim}_{n \to \infty}$   limit superior
$\liminf_{n \to \infty} = \underline{\lim}_{n \to \infty}$   limit inferior
lim f (y) or y→x
lim f (y) limit from the right
yx
y>x
lim f (y) or y→x
lim f (y) limit from the left
yx
y<x
$Z(t_1, \ldots, t_n)$ or $Z_n$   partition
$m(Z_n)$   mesh size of $Z_n$
$V_Z(f) := \sum_{k=0}^{n-1} |f(x_{k+1}) - f(x_k)|$
$V(f) := \sup_Z V_Z(f)$
$V_a^b(f) := V(f)$
$S_r(g, Z_n)$   Riemann sum of g with respect to $Z_n$
$\int_a^b g(t)\,dt$   definite integral
$\int g(t)\,dt$   indefinite integral
$\int_a^{*b}$   upper integral
$\int_{*a}^{b}$   lower integral
T(ca n ) power series associated with cn centred at a

$T_{f,c}^{(k)}$   Taylor polynomials
$R_{f,c}^{(n+1)}$   remainder of Taylor’s formula
$\|x\|_1 = |x_1| + \cdots + |x_n|$ for $x = (x_1, \ldots, x_n) \in \mathbb{R}^n$
$\|x\|_2 = (x_1^2 + \cdots + x_n^2)^{\frac{1}{2}}$ for $x = (x_1, \ldots, x_n) \in \mathbb{R}^n$
$\|x\|_\infty = \max\{|x_1|, \ldots, |x_n|\}$ for $x = (x_1, \ldots, x_n) \in \mathbb{R}^n$
$\|x\|_p := \left(\sum_{\nu=1}^{n} |x_\nu|^p\right)^{\frac{1}{p}}$
$\|f\|_{K,\infty} := \sup_{x \in K} |f(x)|$
$\|f\|_p := \left(\int_a^b |f(x)|^p\,dx\right)^{\frac{1}{p}}$
$\exp x = e^x$   exponential function
$\ln x$   natural logarithm
$a^x := e^{x \ln a}$
$\log_a x$   logarithm of x with respect to the basis a
[x] entier-function
sin sine function
cos cosine function
tan tangent function
cot cotangent function
sec secant function
csc co-secant function
arcsin inverse sine function
arccos inverse cosine function
arctan inverse tangent function
arccot inverse cotangent function
sinh hyperbolic sine function
cosh hyperbolic cosine function
tanh hyperbolic tangent function
coth hyperbolic cotangent function
sech hyperbolic secant function
cosech hyperbolic co-secant function
arsinh inverse hyperbolic sine function

arcosh inverse hyperbolic cosine function
artanh inverse hyperbolic tangent function
Γ(x) gamma-function
$J_l(x)$   Bessel function
B(x, y) beta-function
e Euler number
γ Euler’s constant
The Greek Alphabet

alpha α A
beta β B
gamma γ Γ
delta δ Δ
epsilon ε E
zeta ζ Z
eta η H
theta θ Θ
iota ι I
kappa κ K
lambda λ Λ
mu μ M
nu ν N
xi ξ Ξ
omikron o O
pi π Π
rho ρ P
sigma σ Σ
tau τ T
upsilon υ Υ
phi φ Φ
chi χ X
psi ψ Ψ
omega ω Ω
Note that ϕ is also used for φ.
Part 1: Introductory Calculus
1 Numbers - Revision
Before we start with calculus we need to know how to manipulate complicated
expressions of real numbers and above all we must become familiar with doing
this. We urge students to avoid using calculators in this course. The intention
here is to ensure that we understand the basics; much of what is introduced
may seem obvious but the concepts will become very useful later in the book.
In particular, we will need a lot of familiarity with manipulating expressions
where numbers are replaced by functions or later on even by operators. We
start to systematically introduce set theory as a common language in modern
mathematics. Basic notions from logic on which we rely are taught in other
courses; however, these are collected in Appendix I.
The natural numbers or positive integers are the numbers
1, 2, 3, 4, 5, . . . (1.1)
We denote the set of all natural numbers by N. When we want to indicate

that n is a natural number, i.e. an element of the set N, we simply write
n ∈ N, (1.2)
for example
12 ∈ N. (1.3)
The set of all integers is denoted by Z and consists of the numbers
. . . , −5, −4, −3, −2, −1, 0, 1, 2, 3, 4, 5, . . . (1.4)
When k is an integer we write

k ∈ Z, (1.5)
for example
−15 ∈ Z. (1.6)
Note that −15 is not a natural number and for this we write
−15 ∉ N. (1.7)
It is obvious that every natural number is an integer, or more formally
n ∈ N implies n ∈ Z. (1.8)
We say that N is a subset of Z and for this we write
N ⊂ Z. (1.9)
Clearly, there are other subsets of Z, for example the set of all negative
integers, or the set of all even integers, etc. The rational numbers are
denoted by Q and this is the set of all fractions
$q = \frac{k}{n}$, where k ∈ Z and n ∈ N. (1.10)
Examples of fractions are
$\frac{3}{7}, \frac{7}{7}, \frac{-1}{8}, \frac{-12}{3}$, etc. (1.11)
We also write $-\frac{1}{8}$ for $\frac{-1}{8}$, etc. Note that we face a problem: $-\frac{12}{3}$ and $-\frac{4}{1}$
are different formal expressions which represent the same number, and in
addition we want to consider $-\frac{4}{1}$ and $-4$ to be equal. For now we use a naïve
approach where we consider two rational numbers $q = \frac{k}{m}$ and $r = \frac{l}{n}$ as equal
if $kn = lm$, and further, for $\frac{k}{1}$ we write $k$. The last identification of $\frac{k}{1}$ with
k, k ∈ Z, allows us to consider Z as a subset of Q, i.e.
Z ⊂ Q. (1.12)
Since N ⊂ Z, i.e. every natural number is also an integer, we find further
N ⊂ Q. (1.13)
Note that the last argument is:
“a subset of a subset is a subset”.
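The naïve equality test for fractions can be tried out mechanically. The following Python sketch is an illustration only: the helper name `same_rational` is our own and does not appear in the text. It compares two formal fractions k/m and l/n by cross-multiplication, and then checks the result against the standard library type `fractions.Fraction`, which stores every fraction in lowest terms.

```python
from fractions import Fraction


def same_rational(k: int, m: int, l: int, n: int) -> bool:
    """Return True if k/m and l/n represent the same rational number.

    Hypothetical helper: implements the test k*n == l*m from the text,
    assuming the denominators m and n are natural numbers (m, n >= 1).
    """
    if m < 1 or n < 1:
        raise ValueError("denominators must be natural numbers")
    return k * n == l * m


# -12/3 and -4/1 are different formal expressions for the same number.
print(same_rational(-12, 3, -4, 1))   # True
# 3/7 and 7/7 are different numbers.
print(same_rational(3, 7, 7, 7))      # False
# Fraction reduces to lowest terms, so -12/3 is identified with the integer -4.
print(Fraction(-12, 3) == -4)         # True
```

The last line mirrors the identification of k/1 with k that allows us to regard Z as a subset of Q.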
It is helpful to introduce at this stage the few notions and notations from set
theory that we have used so far in a more systematic way. Unfortunately,
there is no simple and unproblematic way to introduce the general notion of
a set. For our purposes the original definition of G. Cantor is sufficient:
We consider a set as a collection of well distinguishable objects of our intu-

ition or our thinking as an entity M. We call these objects the elements of
the set M.
If M is a set and m an element of M we write
m ∈ M, (1.14)
and if k does not belong to M we write
k ∉ M. (1.15)
Before we can do anything with sets we need to define when two sets M1 and
M2 are equal:
Two sets are equal if and only if they have the same elements.
For this we write

M1 = M2 . (1.16)
If every element of a set M2 is an element of a set M1 we call M2 a subset
of M1 and we write
M2 ⊂ M1 . (1.17)
Thus (1.17) means that m ∈ M2 implies that m ∈ M1 . In the case that M3
is a further set which is a subset of M2 , i.e. M3 ⊂ M2 , then every element of
M3 must be an element of M1 too. Hence we have
M3 ⊂ M2 and M2 ⊂ M1 implies M3 ⊂ M1 . (1.18)
It may happen that M2 is a subset of M1 and M1 is a subset of M2 , i.e.

M2 ⊂ M1 and M1 ⊂ M2 . In this case every element of M1 is an element of
M2 and every element of M2 is an element of M1 , hence M1 = M2 , or
M2 ⊂ M1 and M1 ⊂ M2 implies M1 = M2 . (1.19)
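For finite sets the statements (1.17) to (1.19) can be checked directly. The sketch below is only an illustration with small sets of integers chosen by us; the function name `is_subset` is hypothetical and simply spells out the definition of a subset element by element.

```python
def is_subset(a: set, b: set) -> bool:
    """Check that every element of a is an element of b, as in (1.17)."""
    return all(x in b for x in a)


M1 = {1, 2, 3, 4, 5, 6}
M2 = {2, 4, 6}
M3 = {2, 4}

# (1.18): M3 ⊂ M2 and M2 ⊂ M1 implies M3 ⊂ M1.
print(is_subset(M3, M2), is_subset(M2, M1), is_subset(M3, M1))  # True True True

# (1.19): mutual inclusion is the same as equality.
A = {1, 2, 3}
B = {3, 2, 1, 1}   # order and repetition do not matter in a set
print(is_subset(A, B) and is_subset(B, A), A == B)              # True True
```

Python's built-in comparison `a <= b` for sets performs the same test as `is_subset(a, b)`.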
So far we have introduced the natural numbers, the integers and the rational
numbers. We already know that there are numbers which are not rational,
i.e. have no representation as a fraction. Take for example π or √2. We call
these numbers irrational numbers. The real numbers, denoted by R, form
the set consisting of all rational and irrational numbers. Of course, a second
thought shows that this is not a proper definition. However, up until now we
have had a naïve idea of what the real numbers are, for example the points
on a straight line. We will operate with this naïve approach for some time
until we can eventually give a proper definition and characterisation of R.
This approach is all the more justified since historically an understanding

of the nature of R took mankind a few thousand years of mathematical
thinking. Indeed, the understanding of real numbers was one of the most
outstanding and important problems in the history of mathematics and it is
still a challenge to students.
Therefore, for the moment, the irrational numbers are those real numbers
which are not rational. Let M2 ⊂ M1 , i.e. M2 is a subset of the set M1 . The
set consisting of elements in M1 not belonging to M2 is denoted by M1 \ M2 ;
we write
M1 \ M2 = {x ∈ M1 | x ∉ M2 }. (1.20)
In this notation we find that the irrational numbers form the set
R \ Q = {x ∈ R | x ∉ Q}, (1.21)
for which we do not introduce an extra symbol. Note that (1.20) suggests a
way to characterise sets, for example
{k ∈ Z | k = 2n for some n ∈ N} (1.22)
is the set of all even natural numbers. Again, it is easier to slowly get used
to this notation than to give a formal definition. The idea is to consider all
those elements of a given set M which share a certain property A, i.e.
{m ∈ M | m has the property A}. (1.23)
Another way to characterise a set is by listing all of its elements, for example
{1, 2, 3} or {x, y}. (1.24)
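Set comprehensions in Python follow the pattern of (1.23) closely, which makes them a convenient way to get used to the notation. The sketch below is an illustration under one loud assumption: since a Python set cannot hold all integers, we work in a finite window of Z, and the names `Z_window` and `N_window` are our own.

```python
# Assumption: a finite window of Z stands in for the infinite sets Z and N.
Z_window = set(range(-10, 11))               # the integers -10, ..., 10
N_window = {k for k in Z_window if k >= 1}   # the natural numbers 1, ..., 10

# (1.22) restricted to the window: {k in Z | k = 2n for some n in N}
even_naturals = {k for k in Z_window if any(k == 2 * n for n in N_window)}
print(sorted(even_naturals))                 # [2, 4, 6, 8, 10]

# (1.20): M1 \ M2 = {x in M1 | x not in M2}
non_naturals = {x for x in Z_window if x not in N_window}
print(non_naturals == Z_window - N_window)   # True
```

The built-in set difference `Z_window - N_window` agrees with the comprehension, which is exactly the content of (1.20).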
Before turning to algebraic operations in R, we introduce a further practical

notation:
x, y ∈ M means x ∈ M and y ∈ M, (1.25)
which of course generalises to more than two elements.
We have the following rules for adding real numbers x, y, z ∈ R
(x + y) + z = x + (y + z), (1.26)
and
x + y = y + x. (1.27)
From equality (1.26) we deduce that it does not make any difference whether
we first add x and y together and then add z, or whether we first add y
and z together and then add x. We say that addition of real numbers is
associative and (1.26) is called the associative law of addition. Equality
(1.27) tells us that the order does not matter when adding real numbers, i.e.
addition in R is commutative.
There is one (and only one) real number which is very special with respect to
addition: we may add this number to any other number x ∈ R and the result
is again x. This number is 0 and we consider 0 as the neutral element
with respect to addition, i.e.
x + 0 = x for all x ∈ R. (1.28)
Given a real number x, there is always exactly one real number −x such that
x + (−x) = 0. (1.29)
We call −x the inverse element to x with respect to addition. Instead

of (1.29) we often write
x − x = 0, (1.30)
and more generally if −y is the inverse of y and x is a real number we write
x − y := x + (−y). (1.31)
Note that we have used the symbol “:=” here for the first time. In general
A := B means that A is defined by B, for example we write
2N := {n ∈ N | n = 2m for m ∈ N} (1.32)
for the even natural numbers.

We can also multiply real numbers where the rules
(x · y) · z = x · (y · z) (1.33)
and
x·y = y·x (1.34)
hold for all x, y, z ∈ R. Hence multiplication is associative and commutative.
For the time being we write x · y for the product of x and y
but later on we will adopt the usual notation and will just write xy. As in
the case of addition, for multiplication there exists a neutral element,

namely the real number 1. Indeed, for all x ∈ R we have
1 · x = x. (1.35)
This leads immediately to the question of the existence of an inverse ele-

ment with respect to multiplication for a real number x. We already
know the answer: all real numbers but one have an inverse with respect to
multiplication. The real number 0 does not. Thus for x ∈ R \ {0} = {y ∈
R | y ≠ 0} there exists an element $x^{-1} \in \mathbb{R}$ such that
$x \cdot x^{-1} = 1.$ (1.36)
Shortly we will investigate why 0 does not have an inverse element with
respect to multiplication. A further notation for $x^{-1}$ is
$\frac{1}{x} := x^{-1}, \quad x \neq 0,$ (1.37)
and for $x \cdot y^{-1} = y^{-1} \cdot x$ we write
$\frac{x}{y} := x \cdot y^{-1}, \quad y \neq 0.$ (1.38)
Finally we want to link addition and multiplication. This is done by the law
of distributivity which states that for all x, y, z ∈ R
x · (y + z) = (x · y) + (x · z). (1.39)
With the standard convention that multiplication precedes addition we write
x · (y + z) = x · y + x · z (1.40)
or
x(y + z) = xy + xz. (1.41)
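The rules (1.26), (1.27), (1.33), (1.34) and (1.39) can be spot-checked on a handful of rational numbers. The following Python sketch is only a sanity check on samples chosen by us, not a proof; the function name `check_rules` is hypothetical. It uses `fractions.Fraction` because that type computes exactly with rational numbers.

```python
from fractions import Fraction
from itertools import product

samples = [Fraction(-7, 3), Fraction(0), Fraction(1), Fraction(5, 8), Fraction(12)]


def check_rules(values) -> bool:
    """Test the rules for addition and multiplication on all triples of values."""
    for x, y, z in product(values, repeat=3):
        ok = (
            (x + y) + z == x + (y + z)        # (1.26) associativity of +
            and x + y == y + x                # (1.27) commutativity of +
            and (x * y) * z == x * (y * z)    # (1.33) associativity of ·
            and x * y == y * x                # (1.34) commutativity of ·
            and x * (y + z) == x * y + x * z  # (1.39) distributivity
        )
        if not ok:
            return False
    return True


print(check_rules(samples))                    # True

# Floating-point numbers only approximate real numbers and may violate (1.26):
print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))  # False
```

The second output is one practical reason for the advice to work with exact expressions rather than with a calculator.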
Now we can address the problem why 0 cannot have an inverse with respect
to multiplication. Since 1 − 1 = 0, for any x ∈ R it follows that
0 · x = (1 − 1)x = x − x = 0.
Since 0 ≠ 1 it follows that there is no real number x such that 0 · x = 1, hence
0 cannot have an inverse with respect to multiplication. This is nothing but
the following well-known statement: you cannot divide by 0; the expression
$\frac{k}{0}$, k ∈ Z, does not make sense.
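Exact arithmetic in Python reflects the same fact. In the sketch below, which is an illustration only, the helper `inverse` is our own name; it returns the inverse with respect to multiplication for a non-zero rational number and refuses to return anything for 0.

```python
from fractions import Fraction


def inverse(x: Fraction) -> Fraction:
    """Multiplicative inverse of x, defined only for x != 0 (hypothetical helper)."""
    if x == 0:
        raise ZeroDivisionError("0 has no inverse with respect to multiplication")
    return 1 / x


samples = [Fraction(-3, 2), Fraction(0), Fraction(1), Fraction(7, 5)]

# 0 * x = 0 for every sample, so no sample x satisfies 0 * x = 1.
print(all(0 * x == 0 for x in samples))   # True

print(inverse(Fraction(7, 5)))            # 5/7

try:
    inverse(Fraction(0))
except ZeroDivisionError as err:
    print("no inverse:", err)
```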
Let us have a more formal look at addition in R and multiplication in R\{0}.
In the set R we can pick any two elements x, y ∈ R and form a new element in
R called x+y. This is an example of a binary operation on a given set. In our
current naïve approach, given a set A which contains at least one element,
i.e. A is non-empty, we call any rule a binary operation if it assigns to a
pair (x, y) of elements x and y in the set A a new element z ∈ A. For this
new element we write for now
z = x • y. (1.42)
The condition that a set A is non-empty will occur quite often. We formally
introduce the empty set ∅ as the set which has no elements and A being
non-empty means
A ≠ ∅. (1.43)
The set R \ {0} is of course non-empty and multiplication on R \ {0} gives
a further binary operation. We write (R, +) and (R \ {0}, ·) to indicate that
we want to consider R with the binary operation “+” and R \ {0} with the
binary operation “·”.
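For a finite set the defining property of a binary operation, namely that the result z = x • y again belongs to A, can be tested exhaustively. The sketch below is an illustration with a small set chosen by us; the function name `is_binary_operation` is hypothetical.

```python
from itertools import product


def is_binary_operation(op, A: set) -> bool:
    """Check that op assigns to every pair (x, y) of elements of A an element of A."""
    if not A:
        raise ValueError("A must be non-empty")
    return all(op(x, y) in A for x, y in product(A, repeat=2))


A = {-1, 0, 1}
print(is_binary_operation(lambda x, y: x * y, A))  # True
print(is_binary_operation(lambda x, y: x + y, A))  # False, since 1 + 1 = 2 is not in A
```

So multiplication is a binary operation on {−1, 0, 1} in the above sense, whereas addition is not.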
Let us return to R with the algebraic operation addition satisfying (1.26) -
(1.29), the algebraic operation multiplication satisfying (1.33) - (1.36) and
the law of distributivity (1.39). We want to derive some simple rules for
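doing calculations. For a, b, c, d ∈ R, b ≠ 0 and d ≠ 0 we have
$\frac{a}{b} + \frac{c}{d} = \frac{ad + cb}{bd}.$ (1.44)
Indeed:
$\frac{a}{b} + \frac{c}{d} = ab^{-1} + cd^{-1}$
$= (ab^{-1} + cd^{-1}) \frac{bd}{bd}$
$= \left((ab^{-1} + cd^{-1}) \cdot (bd)\right) \cdot \frac{1}{bd}$
$= (ab^{-1}bd + cd^{-1}bd) \cdot \frac{1}{bd}$
$= \frac{ad + cb}{bd},$
where we used that for every real number x, x ≠ 0, we have $\frac{x}{x} = 1$. (Recall that
$\frac{1}{1} = 1$ and $\frac{x}{x} = \frac{1}{1}$ if and only if 1 · x = x · 1 which is of course true.) Let us
do an example. For $a = \frac{3}{4}$, $b = \frac{8}{15}$, $c = \frac{9}{2}$ and $d = \frac{2}{3}$ we find
$\frac{a}{b} + \frac{c}{d} = \frac{\frac{3}{4}}{\frac{8}{15}} + \frac{\frac{9}{2}}{\frac{2}{3}},$
and it follows that
$\frac{\frac{3}{4}}{\frac{8}{15}} + \frac{\frac{9}{2}}{\frac{2}{3}} = \frac{3}{4} \cdot \left(\frac{8}{15}\right)^{-1} + \frac{9}{2} \cdot \left(\frac{2}{3}\right)^{-1}$
$= \frac{3}{4} \cdot \frac{15}{8} + \frac{9}{2} \cdot \frac{3}{2}$
$= \frac{45}{32} + \frac{27}{4}$
$= \frac{45 \cdot 4 + 27 \cdot 32}{32 \cdot 4} = \frac{1044}{128} = \frac{261}{32}.$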
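The worked example can be confirmed with exact rational arithmetic. The following Python sketch is merely a check of the result, with a = 3/4, b = 8/15, c = 9/2 and d = 2/3; the variable names `lhs` and `rhs` are our own.

```python
from fractions import Fraction

a, b, c, d = Fraction(3, 4), Fraction(8, 15), Fraction(9, 2), Fraction(2, 3)

lhs = a / b + c / d                  # a/b + c/d computed directly
rhs = (a * d + c * b) / (b * d)      # right-hand side of (1.44)
print(lhs, rhs)                      # 261/32 261/32

# The two intermediate products of the calculation.
print(a * (1 / b), c * (1 / d))      # 45/32 27/4
```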
Here we have already used the general rule that for a ≠ 0 and b ≠ 0 we have
that
$\left(\frac{a}{b}\right)^{-1} = \frac{b}{a},$ (1.45)
as we know the rule
$a \cdot \frac{1}{b} = \frac{a}{b}$ for a ∈ R and b ∈ R \ {0}. (1.46)
In addition we know that
$\frac{a}{b} \cdot \frac{c}{d} = \frac{ac}{bd}, \quad b \neq 0, \ d \neq 0.$ (1.47)
The rule (1.45) is of some more interest, so let us spend some time on it.
Recall that $\frac{a}{b} = ab^{-1}$, hence (1.45) claims that $(ab^{-1})^{-1} = ba^{-1}$. We can prove
this easily by assuming that the inverse element is uniquely determined:
$(ab^{-1})(ba^{-1}) = ab^{-1}ba^{-1}$
$= a(b^{-1}b)a^{-1} = a \cdot 1 \cdot a^{-1}$
$= a \cdot a^{-1} = 1.$
Next we turn our attention to powers of real numbers. Let x, y ∈ R and
n, m ∈ N. We set
$x^n := x \cdot x \cdot x \cdot \ldots \cdot x$ (n factors). (1.48)
Elementary rules are
$x^n \cdot x^m = x^{n+m}$ (1.49)
and
$(x \cdot y)^n = x^n \cdot y^n.$ (1.50)
Clearly we have
$0^n = 0$ for all n ∈ N. (1.51)
Suppose that x ≠ 0, then $x^n \neq 0$ and we may consider the inverse element
$(x^n)^{-1}$ of $x^n$. It follows that
$x^n \cdot (x^n)^{-1} = \frac{x^n}{x^n} = 1,$
but in addition we have
$x^n \cdot (x^n)^{-1} = \frac{x \cdot \ldots \cdot x}{x \cdot \ldots \cdot x} = (x \cdot \ldots \cdot x) \frac{1}{x \cdot \ldots \cdot x},$
thus we find
$(x^n)^{-1} = \frac{1}{x^n}$ (1.52)
and we write
$x^{-n} := (x^n)^{-1}.$ (1.53)
The rules (1.49) and (1.50) now extend to all n, m ∈ Z provided that x ≠ 0
and y ≠ 0. If we agree to define
$x^0 = 1,$ (1.54)
for all x ∈ R, then we may summarise our considerations to
$x^k \cdot x^l = x^{k+l}$ (1.55)
and
$(x \cdot y)^k = x^k \cdot y^k$ (1.56)
for all x, y ∈ R \ {0} and k, l ∈ Z. For fractions we find that

a k ak
= (1.57)
b bk
is true for either a, b ∈ R \ {0} and k ∈ Z, or a ∈ R, b ∈ R \ {0} and k ∈ N.
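Now we may calculate
\[ \frac{\left(\frac{3}{2}\right)^2 - \left(\frac{2}{3}\right)^3}{\left(\frac{4}{3}\right)^{-2} + \frac{7}{8}} = \frac{\frac{9}{4} - \frac{8}{27}}{\frac{9}{16} + \frac{7}{8}} = \frac{\frac{211}{108}}{\frac{23}{16}} = \frac{844}{621}. \]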
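The calculation above, together with the power rules (1.55) and (1.56), can be spot-checked with exact rational arithmetic. The following is only a sketch with hypothetically chosen sample values x and y; a finite check of this kind illustrates the rules but does not prove them.

```python
from fractions import Fraction

# The worked calculation: numerator and denominator separately.
numerator = Fraction(3, 2) ** 2 - Fraction(2, 3) ** 3    # 9/4 - 8/27
denominator = Fraction(4, 3) ** -2 + Fraction(7, 8)      # 9/16 + 7/8
result = numerator / denominator

# Spot-check (1.55) and (1.56) for integer exponents, including negative ones.
x, y = Fraction(3, 2), Fraction(-5, 7)   # sample non-zero values (assumption)
exponents = range(-3, 4)
rule_155 = all(x ** k * x ** l == x ** (k + l) for k in exponents for l in exponents)
rule_156 = all((x * y) ** k == x ** k * y ** k for k in exponents)

print(numerator, denominator, result)
```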
Finally we extend our considerations to fractional powers. We take it for granted that for n ∈ N and a ≥ 0, a ∈ R, there exists a unique b ∈ R, b ≥ 0, such that $b^n = a$. This number b is denoted by $a^{\frac{1}{n}}$ or $\sqrt[n]{a}$. We call $a^{\frac{1}{n}}$ the nth root of a. Note that so far the nth root is only defined for a ≥ 0 and it is unique and non-negative. For a ≥ 0 and b ≥ 0 and m, n ∈ N we have
\[ (a \cdot b)^{\frac{1}{n}} = a^{\frac{1}{n}} \cdot b^{\frac{1}{n}}. \tag{1.58} \]
Indeed, we can extend (1.55) and (1.56) to fractional powers. For x > 0 and y > 0 and p, q ∈ Q it follows that
\[ x^p \cdot x^q = x^{p+q} \tag{1.59} \]
and
\[ (x \cdot y)^p = x^p \cdot y^p. \tag{1.60} \]
Further, for $p = \frac{n}{m}$, n, m ∈ N, and x ≥ 0 we write
\[ x^{\frac{n}{m}} = \sqrt[m]{x^n}. \tag{1.61} \]
We have already used the notion of “positive” or “negative” real numbers.

Let us recollect this order structure on R. Given any real number x ∈ R
then exactly one of the following three statements is true
x = 0, x > 0, x < 0, (1.62)
i.e. either x is equal to 0, or it is strictly larger than 0, or it is strictly less

than 0. We can represent the real numbers as points on a line, the real line:
[Figure 1.1: The real line, with the integers from −5 to 5 and the half-integers from −9/2 to 9/2 marked.]
At the moment we pretend that there is a one-to-one correspondence between
the points on the real line and the real numbers. If x > 0 we say that x is
positive, we call x negative if x < 0. We write x ≥ 0 if x > 0 or x = 0 and
we write x ≤ 0 if x < 0 or x = 0. It is convenient to add the notation $\mathbb{R}_+$ for all non-negative real numbers, i.e. $\mathbb{R}_+ := \{x \in \mathbb{R} \mid x \geq 0\}$. If x ≥ 0 we call x non-negative, if x ≤ 0 we call x non-positive. The following rules hold for x, y ∈ R:
x > 0 then −x < 0, x < 0 then −x > 0; (1.63)
x > 0 then $x^{-1} > 0$, x < 0 then $x^{-1} < 0$; (1.64)

x > 0 and y > 0 then x · y > 0; (1.65)
x > 0 and y < 0 then x · y < 0; (1.66)
x < 0 and y > 0 then x · y < 0; (1.67)
x < 0 and y < 0 then x · y > 0; (1.68)
x > 0 and y > 0 then x + y > 0. (1.69)
Furthermore we write
x < y if x − y < 0, (1.70)
or
x > y if x − y > 0, (1.71)
as well as
x ≤ y if x < y or x = y, (1.72)
and
x ≥ y if x > y or x = y. (1.73)
Clearly we have
x > y if and only if y < x (1.74)
and
x ≥ y if and only if y ≤ x. (1.75)
Here are some simple rules for handling inequalities. For a, b ∈ R and x, y ∈ R
we have:
x > y implies x + a > y + a; (1.76)
x ≥ y implies x + a ≥ y + a; (1.77)
x < y implies x + a < y + a; (1.78)
x ≤ y implies x + a ≤ y + a; (1.79)
x > y and a > b implies x + a > y + b. (1.80)
If x, y ∈ R and a ∈ R, a > 0, then we have:
x > y implies a · x > a · y; (1.81)
x ≥ y implies a · x ≥ a · y; (1.82)
x < y implies a · x < a · y; (1.83)
x ≤ y implies a · x ≤ a · y. (1.84)
We also know that
a > b > 0 and x > y > 0 imply a · x > b · y. (1.85)
However, for a < 0 we have
x > y implies a · x < a · y; (1.86)
x ≥ y implies a · x ≤ a · y; (1.87)
x < y implies a · x > a · y; (1.88)
x ≤ y implies a · x ≥ a · y. (1.89)
In the next section we will often make use of these rules. Here are some
simple examples:
i) $\frac{3}{4} \leq \frac{7}{8}$, hence $4 \cdot \frac{3}{4} = 3 \leq \frac{7}{2} = 4 \cdot \frac{7}{8}$, however $(-4) \cdot \frac{3}{4} = -3 \geq -\frac{7}{2} = (-4) \cdot \frac{7}{8}$.
ii) 3 + x > 2 + y implies 1 + x > y or y − x < 1.
iii) Consider 7x − 5 > 21x + 30. This inequality is equivalent to 7x > 21x + 35, which is again equivalent to x > 3x + 5, or −5 > 2x, implying $x < -\frac{5}{2}$. In fact all these manipulations are reversible. Thus the problem: find all x ∈ R such that
7x − 5 > 21x + 30
has the solution x ∈ R such that $x < -\frac{5}{2}$. More formally, the set of solutions of the inequality
7x − 5 > 21x + 30
is given by
\[ \left\{ x \in \mathbb{R} \;\middle|\; x < -\frac{5}{2} \right\}. \]
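To see the solution set of example iii) at work, the following minimal sketch tests the inequality 7x − 5 > 21x + 30 at a few sample points around −5/2. The helper name `satisfies` and the choice of sample points are illustrative assumptions; the check is finite and is no substitute for the argument above.

```python
from fractions import Fraction

def satisfies(x):
    """Return True if x solves 7x - 5 > 21x + 30."""
    return 7 * x - 5 > 21 * x + 30

boundary = Fraction(-5, 2)

# Sample points in steps of 1/4 on both sides of the boundary.
samples = [boundary + Fraction(k, 4) for k in range(-8, 9)]

# The inequality should hold exactly for the samples with x < -5/2.
agree = all(satisfies(x) == (x < boundary) for x in samples)
print(agree)
```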
In this chapter we have summarised what we may have already learned else-
where about real numbers. We might have even slightly extended these ideas.
Some ideas from set theory have been introduced, further, we occasionally
pointed out that some of the statements and rules we take for granted need
a proper justification, and we indicated some of the more formal aspects,
such as relations to binary operations. In Part 1 of our course we will conse-
quently use the following approach: starting from a basic knowledge we will
gradually move to more and more precision, indicating any gaps in our work
along the way. Eventually we will be prepared for a more mature approach
to mathematics in particular in analysis when entering Part 2.
Problems
1. Is the set {∅} empty?
2. Decide which of the following sets is empty
a) {x ∈ R | $x^2 = 16$ and 2x + 3 = 12}, b) {x ∈ Q | $x^2 = 9$ and 3x − 6 = 3},
c) {x ∈ R | x ≠ x}, d) {x ∈ Z | $x^2 = \frac{1}{4}$}, e) {x ∈ Q | $x^2 = \frac{1}{4}$}.
3. Given the 3 sets
A = {3, 5, 7, 9, 11}, B = {z ∈ Z | z is odd} and C = {z ∈ Z \ {2} | z is prime}.
Recall that z ∈ Z is an even number if z is divisible by 2, otherwise

it is an odd number. By definition 0 is an even number. State which
of the following inclusions are true:
a) A ⊂ B; b) A ⊂ C; c) C ⊂ B.
4. Let M = {n ∈ N | n ≥ 5}. Find Z \ M.
5. With R = {k ∈ N | $k^2 \leq 10$} and B = {1, 2, 3, 4, 5, 6} find B \ R.
6. Determine the following sets:
a) {x ∈ Z | 5x + 7 = 13}; b) {x ∈ Q | 5x + 7 = 13};
c) {x ∈ Z | 5x + 7 ≤ 13}.
7. Simplify:
−7
27 18
3
+7 42 −33
a) 3 8
− 5
; b) 4 12
2
− 17
; c) 52 +19
.
19
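8. a) Simplify:
\[ \frac{3a + 4(a + b)^2 - 6a\left(\frac{1}{2} + b\right) - 2b(a + 2b)}{\frac{1}{2}(a + b)}, \quad a + b \neq 0. \]
b) Show that for a + b ≠ c
\[ \frac{\frac{1}{2}(a^2 - 3b^2 - c^2 - 2ab + 4bc)}{\frac{1}{4}(a + b - c)} = 2a - 6b + 2c. \]
c) Simplify:
\[ \frac{a - b}{a + b} + \frac{4ab}{(a + b)^2} - \frac{a + b}{a - b} \quad (a \neq b \text{ and } a \neq -b). \]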
d) Simplify:

x3 − y 3 1 x y
− y 4x2 3
− +
y−x y x y x
(x = y, x = 0, y = 0).
9. Simplify: 8 12 6
1
9 11
− 29 5
−7

8 3 7
.
3 4
−2
10. Simplify:
2 3 1 4 8 3
( 25 ) −( 38 )
2
a) 3
− 2
+5 9
; b) 19 .
40
11. Simplify:
a)
\[ \frac{(a + b)^3 - (b - a)^2 (b + a)}{4ab}, \quad ab \neq 0; \]
b)
a 3 4
b
− ab
, ab = 0.
a2 b3
12. Find:
a) $\sqrt{625}$; b) $\sqrt{\frac{225}{49}}$; c) $\sqrt{\frac{a^4 b^6}{(a+b)^2}}$, a ≥ 0, b ≥ 0 and a + b ≠ 0.
13. Find every x ∈ R such that
a) 3x − 12 ≥ −7, b) $\frac{7}{4} + \frac{2}{5}x \leq \frac{3}{8}x$, c) (x − 3)(x + 4) ≥ 0,
and give a graphical representation of the set of solutions.
14. Let x > 0, y > 0, z > 0. Is the term $x^{y^z}$ well defined?
Hint: try x = 2, y = 3, z = 2 and compare $(x^y)^z$ with $x^{(y^z)}$.
15. Prove by using the stated rules for addition and multiplication that
(a) $\frac{1}{b} + \frac{1}{d} = \frac{d + b}{d \cdot b}$; b ≠ 0, d ≠ 0.
(b) $\frac{\frac{a}{b}}{\frac{c}{d}} = \frac{a}{b} \cdot \frac{d}{c}$, b ≠ 0, c ≠ 0, d ≠ 0.
Hint: first prove that for x ≠ 0, $(x^{-1})^{-1} = x$.
16. Let a, b, c ∈ R, a > 0 and $b^2 - 4ac \geq 0$.
(a) Prove that $ax^2 + bx + c = 0$ for some x ∈ R if and only if
\[ a\left(x + \frac{b}{2a}\right)^2 - \frac{b^2}{4a} + c = 0. \]
(b) Use the fact that for y ≥ 0 there exists exactly one real number $\sqrt{y} \geq 0$ such that $(\sqrt{y})^2 = y$ to find all solutions to the quadratic equation
\[ ax^2 + bx + c = 0. \]
2 The Absolute Value, Inequalities and Intervals
In order to be able to handle inequalities and to handle terms involving real
numbers we need to know whether x ∈ R is zero, positive or negative. Let
us start with a simple example:
x ∈ R then $x^2 \geq 0$. (2.1)
For x = 0 there is nothing to prove. If x > 0 then by (1.65) we know $x \cdot x = x^2 > 0$, if x < 0 then by (1.68) it follows that $x^2 > 0$.
This may look quite trivial but it opens the way to a non-trivial result: let a, b ∈ R then we always have
\[ ab \leq \frac{a^2 + b^2}{2}. \tag{2.2} \]
Here we say that $\frac{a^2 + b^2}{2}$ is an estimate (upper estimate) for ab. To show this we firstly see that for a, b ∈ R it follows from (2.1) that $(a - b)^2 \geq 0$. However $(a - b)^2 = a^2 - 2ab + b^2$, therefore
\[ a^2 - 2ab + b^2 \geq 0, \]
or
\[ a^2 + b^2 \geq 2ab, \]
implying
\[ ab \leq \frac{a^2 + b^2}{2}. \tag{2.3} \]
For the case a = 5 and b = 6, we find
\[ 30 \leq \frac{25 + 36}{2} = 30\tfrac{1}{2}. \]
This is a reasonably good estimate since intuitively $30\tfrac{1}{2}$ is a fairly good estimate of 30; it is not too far away. For a = −5 and b = 6 we find
\[ -30 \leq \frac{25 + 36}{2} = 30\tfrac{1}{2}, \]
which is a rather crude result, i.e. $30\tfrac{1}{2}$ is a poor estimate for −30. The problem is that on the right hand side we only have positive terms and they cannot give a good estimate of negative terms.
To remedy this situation we introduce one of the most important notations in calculus and analysis.
Definition 2.1. Let x ∈ R. The absolute value of x, denoted by |x|, is defined by
\[ |x| := \begin{cases} x, & x > 0; \\ 0, & x = 0; \\ -x, & x < 0. \end{cases} \tag{2.4} \]
Thus for all x ∈ R the absolute value |x| is non-negative, i.e.
|x| ≥ 0 for all x ∈ R. (2.5)
Here are some examples: $|\frac{3}{5}| = \frac{3}{5}$, $|-\frac{7}{8}| = \frac{7}{8}$, |0| = 0.

We claim that we can improve (2.2) by
\[ |ab| \leq \frac{a^2 + b^2}{2} \quad \text{for all } a, b \in \mathbb{R}. \tag{2.6} \]
We already know
\[ ab \leq \frac{a^2 + b^2}{2}, \tag{2.7} \]
therefore all we need to show is that
\[ -ab \leq \frac{a^2 + b^2}{2}. \]
To do this consider $(a + b)^2$. As before we find
\[ 0 \leq (a + b)^2 = a^2 + 2ab + b^2, \]
and therefore $-2ab \leq a^2 + b^2$, or
\[ -ab \leq \frac{a^2 + b^2}{2}. \tag{2.8} \]
Thus, (2.7) and (2.8) imply
\[ |ab| \leq \frac{a^2 + b^2}{2}, \tag{2.9} \]
since |ab| can only take the value ab or −ab.
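The estimate |ab| ≤ (a² + b²)/2 can be explored numerically. The sketch below checks it on a small grid of rational sample values (the grid is an arbitrary assumption) and also collects the cases of equality, which occur exactly when |a| = |b|, because (a² + b²)/2 − |ab| = (|a| − |b|)²/2.

```python
from fractions import Fraction
from itertools import product

def bound(a, b):
    """The upper estimate (a^2 + b^2) / 2 for |ab|."""
    return (a * a + b * b) / 2

# Sample grid: -6, -11/2, ..., 11/2, 6.
values = [Fraction(n, 2) for n in range(-12, 13)]
pairs = list(product(values, repeat=2))

# The estimate holds on every sample pair.
holds = all(abs(a * b) <= bound(a, b) for a, b in pairs)

# Pairs where the estimate is an equality.
equality_cases = [(a, b) for a, b in pairs if abs(a * b) == bound(a, b)]

print(holds, bound(Fraction(5), Fraction(6)))
```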
Here are some rules for handling the absolute value: For x, y ∈ R we find
|xy| = |x||y|, in particular |x| = | − x|. (2.10)
We can prove (2.10) by considering 4 cases. First note the table
x≥0 y≥0 x≤0 y≤0
|x| = x |y| = y |x| = −x |y| = −y.

Now we have
1. x ≥ 0, y ≥ 0 then |x||y| = xy, and xy ≥ 0, i.e. |xy| = xy;
2. x ≥ 0, y ≤ 0 then |x||y| = x(−y) = −xy, and xy ≤ 0, i.e. |xy| = −xy;
3. x ≤ 0, y ≥ 0 then |x||y| = (−x)y = −xy, and xy ≤ 0, i.e. |xy| = −xy;
4. x ≤ 0, y ≤ 0 then |x||y| = (−x)(−y) = xy, and xy ≥ 0, i.e. |xy| = xy.

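For y ≠ 0 it follows from (2.10) that
\[ \left| \frac{x}{y} \right| = \frac{|x|}{|y|}. \tag{2.11} \]
Thus we have for example
\[ \left| \frac{3}{7} \cdot \frac{-4}{8} \right| = \frac{3}{7} \cdot \frac{4}{8} \quad \text{or} \quad \left| \frac{-12}{-5} \right| = \frac{|-12|}{|-5|} = \frac{12}{5}. \]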
The triangle inequality is a very important result: It states that for x, y ∈

R we have
|x + y| ≤ |x| + |y|. (2.12)
Again we prove (2.12) by discussing the different cases:
1. x ≥ 0 and y ≥ 0 implies x + y ≥ 0, hence |x + y| = x + y, but in this
case |x| = x and |y| = y, hence |x| + |y| = x + y and we have proved
(2.12) with equality.
2. x ≥ 0 and y ≤ 0. Two cases may occur : x + y ≥ 0 or x + y ≤ 0. In

the first case |x + y| = x + y ≤ x − y = |x| + |y|, in the second case
|x + y| = −(x + y) = −x − y ≤ x − y = |x| + |y|.
3. x ≤ 0 and y ≥ 0. This is just the second case with x and y interchanged.
4. x ≤ 0 and y ≤ 0. Then x + y ≤ 0, hence |x + y| = −x − y but |x| = −x

and |y| = −y, hence |x + y| = −x − y = |x| + |y|.
Basic Properties of the Absolute Value:

i) |x| ≥ 0 for all x ∈ R and |x| = 0 if and only if x = 0;
ii) |xy| = |x||y| for all x, y ∈ R;
iii) |x + y| ≤ |x| + |y| for all x, y ∈ R.
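The three basic properties can be tested on sample values. The following minimal sketch implements the absolute value by the three cases of Definition 2.1 (the function name `absolute_value` and the sample grid are assumptions made for illustration) and checks i) to iii) exhaustively on that grid.

```python
from fractions import Fraction
from itertools import product

def absolute_value(x):
    """Absolute value by the three cases of Definition 2.1."""
    if x > 0:
        return x
    if x == 0:
        return 0
    return -x

# Sample grid: -3, -8/3, ..., 8/3, 3.
samples = [Fraction(n, 3) for n in range(-9, 10)]
pairs = list(product(samples, repeat=2))

# i) non-negativity, with |x| = 0 only for x = 0
property_i = all(absolute_value(x) >= 0 and (absolute_value(x) == 0) == (x == 0)
                 for x in samples)
# ii) |xy| = |x||y|
property_ii = all(absolute_value(x * y) == absolute_value(x) * absolute_value(y)
                  for x, y in pairs)
# iii) triangle inequality |x + y| <= |x| + |y|
property_iii = all(absolute_value(x + y) <= absolute_value(x) + absolute_value(y)
                   for x, y in pairs)

print(property_i, property_ii, property_iii)
```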
Note that both x and −x have the same absolute value |x|. On the real line
we find that for any x ∈ R
−|x| 0 |x|
Figure 2.1
Let us change our point of view. Consider on R the set
{y ∈ R| |y| = x, x > 0 fixed}. (2.13)
This set only consists of two points: x and −x. Thus we may use the
absolute value to define subsets of R. We may extend this procedure by
allowing inequalities:
Let ε > 0 and a ∈ R be fixed. Define on R the subset
Bε (a) := {x ∈ R | |x − a| < ε}. (2.14)
We want to find all points in R belonging to the set Bε (a). Using the definition
of the absolute value we find
|x − a| < ε if and only if − ε < x − a < ε,
or
|x − a| < ε if and only if − ε + a < x < a + ε.
As the simplest case take a = 0. This means that in Bε (0) we find all points
with absolute value less than ε, or equivalently those whose distance to 0 is
less than ε :
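[Figure 2.2: The set $B_\varepsilon(0)$, the open interval from −ε to ε centred at 0.]
But now we see the general interpretation: in $B_\varepsilon(a)$ we find all points which have a distance less than ε to a.
[Figure 2.3: The set $B_\varepsilon(a)$, the open interval from −ε + a to a + ε centred at a.]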
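Example 2.2. A. Consider the set
\[ \begin{aligned}
B_{\frac{1}{2}}(5) &= \left\{ x \in \mathbb{R} \;\middle|\; |x - 5| < \tfrac{1}{2} \right\} \\
&= \left\{ x \in \mathbb{R} \;\middle|\; -\tfrac{1}{2} + 5 < x < 5 + \tfrac{1}{2} \right\} \\
&= \left\{ x \in \mathbb{R} \;\middle|\; \tfrac{9}{2} < x < \tfrac{11}{2} \right\}.
\end{aligned} \]
[Figure 2.4: The set $B_{\frac{1}{2}}(5)$, the open interval from 9/2 to 11/2 centred at 5.]
B. Next we look at
\[ \begin{aligned}
B_2\left(-\tfrac{2}{3}\right) &= \left\{ x \in \mathbb{R} \;\middle|\; \left| x - \left(-\tfrac{2}{3}\right) \right| < 2 \right\} \\
&= \left\{ x \in \mathbb{R} \;\middle|\; \left| x + \tfrac{2}{3} \right| < 2 \right\} \\
&= \left\{ x \in \mathbb{R} \;\middle|\; -2 - \tfrac{2}{3} < x < 2 - \tfrac{2}{3} \right\} \\
&= \left\{ x \in \mathbb{R} \;\middle|\; -\tfrac{8}{3} < x < \tfrac{4}{3} \right\}.
\end{aligned} \]
[Figure 2.5: The set $B_2(-\frac{2}{3})$, the open interval from −8/3 to 4/3 centred at −2/3.]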
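The passage from |x − a| < ε to −ε + a < x < a + ε is easy to mechanise. Below is a minimal sketch; the function names `ball` and `in_ball` are hypothetical helpers, not notation from the text. It recomputes the end points of both sets in Example 2.2 and checks that the two descriptions of membership agree on sample points.

```python
from fractions import Fraction

def ball(a, eps):
    """End points (a - eps, a + eps) of the open interval B_eps(a)."""
    if eps <= 0:
        raise ValueError("eps must be positive")
    return (a - eps, a + eps)

def in_ball(x, a, eps):
    """Membership in B_eps(a) via the defining condition |x - a| < eps."""
    return abs(x - a) < eps

# Example 2.2 A and B.
ball_a = ball(Fraction(5), Fraction(1, 2))
ball_b = ball(Fraction(-2, 3), Fraction(2))

# Both descriptions of B_2(-2/3) agree on a grid of sample points.
lower, upper = ball_b
samples = [Fraction(n, 6) for n in range(-24, 25)]
consistent = all(in_ball(x, Fraction(-2, 3), 2) == (lower < x < upper)
                 for x in samples)

print(ball_a, ball_b, consistent)
```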
We can now define sets in R by using inequalities. Let us define for a, b ∈ R,

a < b,
(a, b) := {x ∈ R| a < x < b} (2.15)
0 a b
Figure 2.6
Thus in (a, b) we find all real numbers x which are larger than a and less
than b. For example
(−3, 8) = {x ∈ R| − 3 < x < 8}
0
−3 8
Figure 2.7
With this notation we have
(−ε, ε) = Bε (0)
or more generally
(−ε + a, a + ε) = Bε (a)
for ε > 0 and a ∈ R. Note that the numbers a and b do not belong to (a, b).
Again we can extend our procedure of defining sets. For a, b ∈ R, a < b we
set
[a, b) := {x ∈ R| a ≤ x < b}, (2.16)
which corresponds to
a b
Figure 2.8
Also, we may consider
(a, b] := {x ∈ R| a < x ≤ b}, (2.17)
a b
Figure 2.9
Finally we introduce
[a, b] := {x ∈ R| a ≤ x ≤ b}, (2.18)
a b
Figure 2.10
Definition 2.3. For a, b ∈ R, a < b, we call

(a,b) the open interval with end points a and b;
(a,b] the (left) half-open interval with end points a and b;
[a,b) the (right) half-open interval with end points a and b;
[a,b] the closed interval with end points a and b.

An important remark: in the case of a closed interval the end points belong
to the interval (set) whereas in the case of an open interval the end points
do not belong to the interval (set).
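Example 2.4. We find
\[ \left[ -3, \tfrac{2}{3} \right] = \left\{ x \in \mathbb{R} \;\middle|\; -3 \leq x \leq \tfrac{2}{3} \right\} \]
[Figure 2.11: The closed interval $[-3, \frac{2}{3}]$ on the real line.]
or
\[ \left( \tfrac{1}{5}, \tfrac{3}{4} \right] = \left\{ x \in \mathbb{R} \;\middle|\; \tfrac{1}{5} < x \leq \tfrac{3}{4} \right\} \]
[Figure 2.12: The half-open interval $(\frac{1}{5}, \frac{3}{4}]$ on the real line.]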
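For the closed interval [−ε + a, a + ε] we also write
\[ \overline{B_\varepsilon(a)} := [-\varepsilon + a, a + \varepsilon]. \tag{2.19} \]
Often we will encounter the following type of problem: given $\varepsilon_1 > 0$ and $\varepsilon_2 > 0$ as well as $a_1, a_2 \in \mathbb{R}$, find all points x ∈ R such that $x \in B_{\varepsilon_1}(a_1)$ and $x \in B_{\varepsilon_2}(a_2)$. We have an easy geometric solution to the problem: it may happen that
[Figure 2.13: The intervals from $a_1 - \varepsilon_1$ to $a_1 + \varepsilon_1$ and from $a_2 - \varepsilon_2$ to $a_2 + \varepsilon_2$ lie apart on the real line.]
or
[Figure 2.14: The intervals from $a_1 - \varepsilon_1$ to $a_1 + \varepsilon_1$ and from $a_2 - \varepsilon_2$ to $a_2 + \varepsilon_2$ overlap.]
In the first case $B_{\varepsilon_1}(a_1)$ and $B_{\varepsilon_2}(a_2)$ have no points in common, i.e. they are disjoint. In the second case there are points in the intersection of $B_{\varepsilon_1}(a_1)$ and $B_{\varepsilon_2}(a_2)$, i.e. these points belong to both sets. In order to find the points in the intersection, we must solve simultaneously the inequalities
\[ -\varepsilon_1 + a_1 < x < a_1 + \varepsilon_1 \quad \text{and} \quad -\varepsilon_2 + a_2 < x < a_2 + \varepsilon_2. \tag{2.20} \]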
The conditions on x are
−ε1 + a1 < x and − ε2 + a2 < x
and
x < a1 + ε1 and x < a2 + ε2
therefore
max{−ε1 + a1 , −ε2 + a2 } < x
and
x < min{a1 + ε1 , a2 + ε2 }.
Thus the solution to (2.20) is
x ∈ (max{−ε1 + a1 , −ε2 + a2 }, min{a1 + ε1 , a2 + ε2 }).
Example 2.5. A. We have $x \in B_2(3)$ and $x \in B_2(4)$ only for
x ∈ (max{1, 2}, min{5, 6}) = (2, 5).
[Figure 2.15: The sets $B_2(3) = (1, 5)$ and $B_2(4) = (2, 6)$ on the real line; they overlap in (2, 5).]
B. We have $x \in B_2(3)$ and $x \in B_2(8)$ only for
x ∈ (max{1, 6}, min{5, 10}) = (6, 5),
but (6, 5) is not an interval since 6 > 5, i.e. there are no points belonging to both sets.
[Figure 2.16: The sets $B_2(3) = (1, 5)$ and $B_2(8) = (6, 10)$ on the real line; they do not overlap.]
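The max/min recipe for solving (2.20) translates directly into code. The following is a minimal sketch; the function name `ball_intersection` and the convention of returning `None` for an empty intersection are illustrative choices, not part of the text.

```python
def ball_intersection(a1, eps1, a2, eps2):
    """Intersection of the open intervals B_eps1(a1) and B_eps2(a2).

    Returns the end points (lower, upper) of the intersection,
    or None if the two sets are disjoint.
    """
    lower = max(a1 - eps1, a2 - eps2)
    upper = min(a1 + eps1, a2 + eps2)
    # An open interval (lower, upper) contains points only if lower < upper.
    return (lower, upper) if lower < upper else None

print(ball_intersection(3, 2, 4, 2))  # Example 2.5 A: (2, 5)
print(ball_intersection(3, 2, 8, 2))  # Example 2.5 B: None (disjoint)
```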
The set of all points belonging to Bε1 (a1 ) and to Bε2 (a2 ) is denoted by
Bε1 (a1 ) ∩ Bε2 (a2 ), (2.21)
and this set is called the intersection of Bε1 (a1 ) and Bε2 (a2 ). In the case where
there are no points in the intersection, i.e. in the case where the intersection
is empty, we write
Bε1 (a1 ) ∩ Bε2 (a2 ) = ∅. (2.22)
We define the intersection of two general sets A and B by
A ∩ B = {x ∈ A | x ∈ B} = {x ∈ B | x ∈ A} = {x | x ∈ A and x ∈ B},
i.e. x ∈ A ∩ B if x ∈ A and x ∈ B. Two sets with an empty intersection

are called disjoint. Before we continue to discuss intersections of intervals
in more detail, we want to introduce a few more ideas from set theory. For
two sets A and B we introduce their union by
A ∪ B = {x | x ∈ A or x ∈ B}. (2.23)
Often it is advantageous to consider the sets we are dealing with as subsets

of a given set X. For example all our intervals are subsets of R. Suppose
A ⊂ X and B ⊂ X for which we sometimes write A, B ⊂ X. Then the
intersection and union of A and B are given by
A ∩ B = {x ∈ X | x ∈ A and x ∈ B}, (2.24)
A ∪ B = {x ∈ X | x ∈ A or x ∈ B}. (2.25)
For example with X = N, A = {1, 2, 3, 5, 7} and B = {3, 4, 5, 8, 9} we find
A ∩ B = {1, 2, 3, 5, 7} ∩ {3, 4, 5, 8, 9} = {3, 5}
and
A ∪ B = {1, 2, 3, 5, 7} ∪ {3, 4, 5, 8, 9} = {1, 2, 3, 4, 5, 7, 8, 9}.
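Python's built-in `set` type offers the same two operations, so the example can be reproduced directly. This is just an illustration of (2.24) and (2.25) for finite sets; the operators `&` and `|` are Python's notation for intersection and union.

```python
A = {1, 2, 3, 5, 7}
B = {3, 4, 5, 8, 9}

intersection = A & B   # elements belonging to A and to B
union = A | B          # elements belonging to A or to B

print(sorted(intersection))  # [3, 5]
print(sorted(union))         # [1, 2, 3, 4, 5, 7, 8, 9]
```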
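Given a set X and a subset A ⊂ X we may form a new set, the complement of A in X, for which we write $A^\complement$ and which is defined by
\[ A^\complement := X \setminus A = \{x \in X \mid x \notin A\}. \tag{2.26} \]
Note that $A^\complement$ depends on X, therefore we should write $A^{\complement_X}$ or, in the more traditional way, $\complement_X A$. For example, N ⊂ Z, and
\[ \complement_{\mathbb{Z}} \mathbb{N} = \{z \in \mathbb{Z} \mid z \notin \mathbb{N}\} = \{z \in \mathbb{Z} \mid z \leq 0\} \tag{2.27} \]
whereas for N ⊂ R we find
\[ \complement_{\mathbb{R}} \mathbb{N} = \{x \in \mathbb{R} \mid x \notin \mathbb{N}\} = \mathbb{R} \setminus \mathbb{N} \tag{2.28} \]
and clearly
\[ \complement_{\mathbb{Z}} \mathbb{N} \neq \complement_{\mathbb{R}} \mathbb{N}. \]
We will use the notation $A^\complement$ when it is clear from the context which set X is meant, i.e. for which X we consider A to be a subset, otherwise we write X \ A instead of $A^\complement$.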
In Appendix II we have collected many results about operations on sets. Here
we summarise some rules and give an outline of some of the proofs. Further
proofs are given in Appendix II. The empty set is a special set; its basic rules, all of which are discussed in Appendix II, are as follows. For any set X the following hold:
X ∪ ∅ = X and X ∩ ∅ = ∅. (2.29)
Further, ∅ ⊂ X for every set X and when considering ∅ as a subset of X we have $\emptyset^\complement = X$. For every set X we have the obvious relations
X ∪ X = X and X ∩ X = X, (2.30)
and for two sets X and Y we have
X ∪ Y = Y ∪ X and X ∩ Y = Y ∩ X. (2.31)
Let us have a look at X ∪ Y = Y ∪ X. We prove the equality of the two sets,

as mentioned previously, by proving that each is a subset of the other. Thus
in the case under consideration we prove
X ∪ Y ⊂ Y ∪ X and Y ∪ X ⊂ X ∪ Y. (2.32)
The next rule for proving such statements is to transform these statements
into a formal logical statement: for example X ∪ Y ⊂ Y ∪ X corresponds to
(x ∈ X ∪ Y ) implies (x ∈ Y ∪ X) (2.33)
or equivalently
(x ∈ X ∪ Y ) =⇒ (x ∈ Y ∪ X). (2.34)
Now let us have a closer look at the statement x ∈ X ∪ Y :
x ∈ X ∪ Y if and only if x ∈ X or x ∈ Y, (2.35)
or equivalently
(x ∈ X ∪ Y ) ⇐⇒ (x ∈ X) ∨ (x ∈ Y ). (2.36)
But x ∈ X or x ∈ Y is equivalent to x ∈ Y or x ∈ X, more formally
(x ∈ X) ∨ (x ∈ Y ) ⇐⇒ (x ∈ Y ) ∨ (x ∈ X). (2.37)
The latter statement however implies x ∈ Y ∪ X. Thus we have proved

x ∈ X ∪ Y implies x ∈ Y ∪ X, or X ∪ Y ⊂ Y ∪ X. Analogously, we may
prove Y ∪ X ⊂ X ∪ Y , however this is left as a useful exercise.
We can prove further similar rules for the sets X, Y, Z:
X ∪ (Y ∪ Z) = (X ∪ Y ) ∪ Z (2.38)
which allows us just to write X ∪ Y ∪ Z, and further
X ∩ (Y ∩ Z) = (X ∩ Y ) ∩ Z (2.39)
which similarly allows us just to write X ∩ Y ∩ Z. We can also combine

unions and intersections, however more care is needed here:
X ∪ (Y ∩ Z) = (X ∪ Y ) ∩ (X ∪ Z) (2.40)
and
X ∩ (Y ∪ Z) = (X ∩ Y ) ∪ (X ∩ Z). (2.41)
Let us prove (2.41): we need to prove
X ∩ (Y ∪ Z) ⊂ (X ∩ Y ) ∪ (X ∩ Z) (2.42)
and
(X ∩ Y ) ∪ (X ∩ Z) ⊂ X ∩ (Y ∪ Z). (2.43)
Note that
x ∈ X ∩ (Y ∪ Z) ⇐⇒ (x ∈ X) ∧ (x ∈ Y ∪ Z)
⇐⇒ (x ∈ X) ∧ ((x ∈ Y ) ∨ (x ∈ Z))
⇐⇒ ((x ∈ X) ∧ (x ∈ Y )) ∨ ((x ∈ X) ∧ (x ∈ Z)),
where we used (A.I.10) from Appendix I. However,
((x ∈ X) ∧ (x ∈ Y )) ∨ ((x ∈ X) ∧ (x ∈ Z)) ⇐⇒ x ∈ (X ∩ Y ) ∪ (X ∩ Z). (2.44)
Thus we have proved (2.42) as well as (2.43).

Now let us turn to the complement. In the following, A, B, C are all subsets
of a fixed set X. First we note that
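\[ (A^\complement)^\complement = A, \tag{2.45} \]
which follows from
\[ x \in (A^\complement)^\complement \iff x \notin A^\complement \iff x \in A. \]
Finally we state de Morgan's laws:
\[ (A \cap B)^\complement = A^\complement \cup B^\complement \tag{2.46} \]
and
\[ (A \cup B)^\complement = A^\complement \cap B^\complement. \tag{2.47} \]
We prove (2.46). The fact that $x \in (A \cap B)^\complement$ means
\[ \begin{aligned}
x \notin A \cap B &\iff (x \notin A) \vee (x \notin B) \\
&\iff (x \in A^\complement) \vee (x \in B^\complement) \\
&\iff x \in (A^\complement \cup B^\complement),
\end{aligned} \]
therefore we have proved $(A \cap B)^\complement \subset (A^\complement \cup B^\complement)$ as well as $(A^\complement \cup B^\complement) \subset (A \cap B)^\complement$.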
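For a small finite set X the rules above can be verified exhaustively, by running through all subsets. The sketch below does this for a hypothetical four-element set X; the helper names `complement` and `subsets` are assumptions for illustration. It checks the double complement rule (2.45), de Morgan's laws (2.46) and (2.47), and the two distributive laws (2.40) and (2.41). Such a check covers only this particular X and is no replacement for the proofs.

```python
from itertools import combinations

X = frozenset({1, 2, 3, 4})   # hypothetical ambient set

def complement(A):
    """Complement of A in X, i.e. X \\ A."""
    return X - A

def subsets(S):
    """All subsets of the finite set S."""
    items = sorted(S)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

P = subsets(X)   # the 16 subsets of X

double_complement = all(complement(complement(A)) == A for A in P)
de_morgan = all(complement(A & B) == complement(A) | complement(B)
                and complement(A | B) == complement(A) & complement(B)
                for A in P for B in P)
distributive = all((A | (B & C)) == ((A | B) & (A | C))
                   and (A & (B | C)) == ((A & B) | (A & C))
                   for A in P for B in P for C in P)

print(double_complement, de_morgan, distributive)
```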

Let $A_1, \ldots, A_N$ be a finite number of sets. For their union we write
\[ \bigcup_{j=1}^{N} A_j = A_1 \cup \cdots \cup A_N, \tag{2.48} \]
and for their intersection we write
\[ \bigcap_{j=1}^{N} A_j = A_1 \cap \cdots \cap A_N. \tag{2.49} \]
Thus, $x \in \bigcup_{j=1}^{N} A_j$ if for at least one $j_0 \in \{1, \ldots, N\}$ we have $x \in A_{j_0}$, whereas $x \in \bigcap_{j=1}^{N} A_j$ means that $x \in A_j$ for all j ∈ {1, . . . , N}.
We now return to intervals on the real line. We may determine intersections

of intervals:
(a, b) ∩ (c, d) or [a, b) ∩ [c, d] etc.
In each case we have to solve systems of inequalities
x ∈ (a, b) ∩ (c, d) if and only if a < x < b and c < x < d,
i.e.
max{a, c} < x < min{b, d},
or
x ∈ [a, b) ∩ [c, d] if and only if a ≤ x < b and c ≤ x ≤ d,
i.e.
max{a, c} ≤ x < b if b ≤ d
or
max{a, c} ≤ x ≤ d if d < b.
Here max{a, c} stands for the larger number, i.e. the maximum of a and c,
whereas min{b, d} stands for the smaller number, i.e. the minimum of b and
d.
Example 2.6. We have
[−2, 5) ∩ [3, 6] = [3, 5) (2.50)
[Figure 2.17: The intervals [−2, 5) and [3, 6] on the real line, with the points −2, 0, 3, 5 and 6 marked.]
Note that (2.50) is an equality of sets, namely
{x ∈ R | −2 ≤ x < 5} ∩ {x ∈ R | 3 ≤ x ≤ 6} = {x ∈ R | 3 ≤ x < 5}. (2.51)
We may also look at unions of intervals, which is less problematic since we do not need to solve inequalities; however, we might have to combine them. For two, say open, intervals (a, b) and (c, d) it may happen that they do not intersect; their union is then just (a, b) ∪ (c, d).
[Figure 2.18: Two disjoint open intervals, with end points in the order a, b, c, d.]
If (a, b) ∩ (c, d) ≠ ∅, then (a, b) ∪ (c, d) is either one of these intervals, namely (a, b) if (c, d) ⊂ (a, b) or (c, d) if (a, b) ⊂ (c, d),
[Figure 2.19: The interval (a, b) contained in (c, d), with end points in the order c, a, b, d.]
or (a, b) ∪ (c, d) = (min{a, c}, max{b, d}).
[Figure 2.20: Overlapping intervals (a, b) and (c, d), with end points in the order a, c, b, d.]
Note that in the case of closed or half-open intervals we may meet some new possibilities (compared with open intervals). The two intervals (a, b] and (b, c), for example, do not intersect
(a, b] ∩ (b, c) = {x ∈ R | a < x ≤ b and b < x < c} = ∅,
however
(a, b] ∪ (b, c) = {x ∈ R | a < x ≤ b or b < x < c}
= {x ∈ R | a < x < c} = (a, c).

Thus the union of two disjoint open intervals is never an interval, while in
the case of disjoint half-open intervals the union might be an interval. We
will discuss more cases in the exercises.
For convenience let us introduce some further notation
(a, ∞) := {x ∈ R|x > a}, (2.52)

[a, ∞) := {x ∈ R|x ≥ a}, (2.53)
(−∞, b) := {x ∈ R|x < b}, (2.54)
(−∞, b] := {x ∈ R|x ≤ b}, (2.55)
and
(−∞, ∞) := R. (2.56)
We call “∞” infinity and “−∞” minus infinity and at the moment it is just
a useful name and notation.
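We have already used max and min without stating the formal definitions:
\[ \max\{a, b\} := \begin{cases} a, & a \geq b; \\ b, & b \geq a, \end{cases} \tag{2.57} \]
and
\[ \min\{a, b\} := \begin{cases} a, & a \leq b; \\ b, & b \leq a. \end{cases} \tag{2.58} \]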
It is interesting to note that we can express max and min using the absolute
value.
Lemma 2.7. For a, b ∈ R we have
\[ \max\{a, b\} = \frac{1}{2}(a + b + |a - b|) \tag{2.59} \]
and
\[ \min\{a, b\} = \frac{1}{2}(a + b - |a - b|). \tag{2.60} \]
Proof. We prove (2.59) and leave (2.60) as an exercise. If a ≥ b then max{a, b} = a. In this case a − b ≥ 0, hence |a − b| = a − b and
\[ \frac{1}{2}(a + b + a - b) = \frac{1}{2} \cdot 2a = a. \]
If however b ≥ a then max{a, b} = b. In this case a − b ≤ 0, hence |a − b| = b − a and we find
\[ \frac{1}{2}(a + b + b - a) = \frac{1}{2} \cdot 2b = b. \]
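Both formulas of Lemma 2.7 can be compared with Python's built-in `max` and `min` on sample values. The function names `max_via_abs` and `min_via_abs` and the sample grid below are illustrative assumptions; exact rationals are used so that no rounding interferes.

```python
from fractions import Fraction
from itertools import product

def max_via_abs(a, b):
    """max{a, b} expressed through the absolute value, as in (2.59)."""
    return (a + b + abs(a - b)) / 2

def min_via_abs(a, b):
    """min{a, b} expressed through the absolute value, as in (2.60)."""
    return (a + b - abs(a - b)) / 2

# Sample grid: -2, -7/4, ..., 7/4, 2.
samples = [Fraction(n, 4) for n in range(-8, 9)]

formulas_agree = all(max_via_abs(a, b) == max(a, b)
                     and min_via_abs(a, b) == min(a, b)
                     for a, b in product(samples, repeat=2))
print(formulas_agree)
```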
The notions of maximum and minimum easily extend to finite sets of real numbers. If $a_1, \ldots, a_n \in \mathbb{R}$ then
\[ \max\{a_1, \ldots, a_n\} := a_k \quad \text{if } a_k \geq a_j \text{ for } j = 1, \ldots, n \tag{2.61} \]
and
\[ \min\{a_1, \ldots, a_n\} := a_l \quad \text{if } a_l \leq a_j \text{ for } j = 1, \ldots, n. \tag{2.62} \]
Definition (2.61) tells us that $a_k$ is greater than or equal to all other elements in the set $\{a_1, \ldots, a_n\}$ and (2.62) says that $a_l$ is less than or equal to all other elements of the set $\{a_1, \ldots, a_n\}$.
Example 2.8. The following hold
\[ \max\left\{1, 7, -\tfrac{3}{5}, 13\right\} = 13, \]
and
\[ \min\left\{\tfrac{1}{3}, 2, -5, 13\right\} = -5. \]
We close this chapter by showing some additional properties of the absolute
value. As a rule lower bounds or estimates from below are in general more
difficult to obtain. Let us consider the triangle inequality
|a + b| ≤ |a| + |b|.
Since |a + b| ≥ 0 the estimate
−|a| − |b| ≤ |a + b|
is trivial. The converse triangle inequality however is non-trivial:
Lemma 2.9. For all a, b ∈ R we have
||a| − |b|| ≤ |a − b| (2.63)
and
||a| − |b|| ≤ |a + b|. (2.64)
Proof. First note that (2.64) follows from (2.63) and vice versa. In fact we
may take the real number −b instead of b in (2.63) to find
||a| − |b|| = ||a| − | − b|| ≤ |a − (−b)| = |a + b|.
The proof that (2.64) implies (2.63) follows the same idea.
Now we prove (2.63). By the triangle inequality we know that
|a| = |a − b + b| ≤ |a − b| + |b|
implying
|a| − |b| ≤ |a − b|. (2.65)
On the other hand we have
|b| = |b − a + a| ≤ |b − a| + |a| = |a − b| + |a|
implying
−(|a| − |b|) ≤ |a − b|, (2.66)
thus together with (2.65) we have
||a| − |b|| ≤ |a − b|. (2.67)
Problems
1. Let X = {a, b, c, d, e, f, g, h, i} and consider the subsets
A = {a, b, c, d}, B = {b, d, f, h} and C = {c, d, e, f}. Find $A^\complement$, $(A \cap C)^\complement$, B \ C, and $(A \cup B)^\complement$.
2. Find the following subsets of the real line: 1 7

3
a) B
4 (2) ∩ B
3 (8); b)
(B2 (5) ∩ B7 (−2)) ; c) ( −3, 2
∪ −4, 3 ) ;
d) −2, 73 ∩ 35 , 154
.
In each case, sketch the solution set.
3. For the sets A ⊂ X and B ⊂ X, prove the following statements:

a) A ∩ B ⊂ A ⊂ A ∪ B; b) (A \ B) ∩ B = ∅; c) $B \setminus A = B \cap A^\complement$.
4. For A, B, C ⊂ X prove the following statements:
a) (A ∩ B) ∪ C = (A ∪ C) ∩ (B ∪ C); b) $(A \cup B)^\complement = A^\complement \cap B^\complement$.
5. For A, B ⊂ X (which means that A ⊂ X and B ⊂ X) prove that the following statements are equivalent
a) A ⊂ B, b) A ∩ B = A, c) $B^\complement \subset A^\complement$, d) A ∪ B = B, e) $B \cup A^\complement = X$,
f) $A \cap B^\complement = \emptyset$.
6. Calculate
the following
values:
√
a) − 58 ; b) 11
3
− 3 ; c) 7 − 12 ; d) | | − 3| − | − 5| |; e) a2 ,
9 5
a ∈ R.
7. Prove that for every ε > 0 and all a, b ∈ R the following hold
\[ |ab| \leq \varepsilon a^2 + \frac{1}{4\varepsilon} b^2, \]
and
\[ \min\{a, b\} = \frac{1}{2}(a + b - |a - b|). \]
Furthermore, for a > 0 prove that
\[ a + \frac{1}{a} \geq 2. \]
8. Prove for a, b, c ∈ R that
|a − c| ≤ |a − b| + |b − c|
and
| |a − b| − |c| | ≤ | |a − b| − c| ≤ |a| + |b| + |c|.
9. a) Find every x ∈ R that satisfies
8x − 11 > −24x + 89.
b) Find every x ∈ Z that satisfies
−3 ≤ 7x − 2 < 6x + 5.
c) Find every x ∈ R that satisfies
|x − 3| ≤ |x + 3|.
10. a) For which values of x ∈ R does the inequality
2x + 6(2 − x) ≥ 8 − 2x
hold?
b) Find all values of x ∈ R such that
$x^2 + 2x - 10 < 3x + 2$.
3 Mathematical Induction
Mathematics derives new statements from given ones. The underlying pro-
cedure is of course called a proof. It is by no means easy to define what a
(correct) proof is, and there is no need to do this here. For a working math-
ematician a proof reduces to the following: you start with some statements
either being taken for granted to be true (axioms) or already proven (theo-
rems, propositions, lemmata), and then you apply the usual rules of (math-
ematical) logic which we have collected in Appendix I in order to arrive at
new statements. Very often we have to handle statements A(n) depending on n ∈ N or on n ∈ {k ∈ Z | k ≥ m} for some m ∈ Z. For example the statement A(n) could be
\[ A(n): \quad 1 + 2 + \cdots + n = \frac{n(n + 1)}{2}, \quad n \in \mathbb{N}. \tag{3.1} \]
To prove that such a statement is true for all n ∈ N we cannot just check one-by-one that it is true for every natural number; however, we may use a method called mathematical induction. It is possible to show that this method is sufficient for proving statements like A(n), n ∈ N, however this involves
looking at the actual construction of N and Peano’s axioms which goes be-
yond the scope of this introductory section. For more about Peano’s axioms
and mathematical induction, see Appendix III. The method of mathemati-
cal induction follows from the axiom of mathematical induction (one of
Peano’s axioms): Suppose that for each n ≥ m, m, n ∈ Z, a mathematical
statement A(n) is given. If A(m) is true and if for all n ≥ m the statement
A(n) implies that the statement A(n + 1) is true, then A(n) is true for all
n ≥ m.
At this stage we will just assume this axiom. An alternative version of the
axiom of mathematical induction is:
Suppose for each n ≥ m, m, n ∈ Z, a statement A(n) is given. If A(m) is
true and if for all n ≥ m the statements A(m), . . . , A(n) imply the statement
A(n + 1), then A(n) holds for all n ≥ m.
In simple terms this means that the method of mathematical induction is
as follows: we begin by showing that A(m) is true for some m ∈ N, usually
m = 0 or m = 1 (base case). Next we assume that A(n) is true for arbitrary
n ≥ m (induction hypothesis) and then prove that A(n+1) is true (induction
step).
Let us start with a simple example to see how we can apply the axiom of
mathematical induction.
Example 3.1. For every n ≥ 0, n ∈ Z, the statement
\[ A(n): \quad 11^{n+2} + 12^{2n+1} \text{ is divisible by } 133 \tag{3.2} \]
holds. Recall that a natural number l is divisible by a natural number m if there exists a natural number k such that l = k · m. We start by proving that A(0) is true, i.e. that
\[ 11^2 + 12 \text{ is divisible by } 133. \]
Since $11^2 + 12 = 121 + 12 = 133$ this statement is true. Now we assume that for arbitrary but fixed n ≥ 0, the statement A(n) is true and we want to deduce that A(n + 1) is also true. In other words, we want to prove that if $11^{n+2} + 12^{2n+1}$ is divisible by 133 then $11^{(n+1)+2} + 12^{2(n+1)+1}$ is divisible by 133 too. Note that n is not specified, it is arbitrary but fixed. We cannot take a particular n, say n = 5 or n = 12543. So we have to prove that $11^{(n+1)+2} + 12^{2(n+1)+1} = 11^{n+3} + 12^{2n+3}$ is divisible by 133 assuming that $11^{n+2} + 12^{2n+1}$ is divisible by 133. How can we reduce or transform A(n + 1) to a statement to which we can apply A(n)?
Here is a suggestion:
\[ \begin{aligned}
11^{n+3} + 12^{2n+3} &= 11 \cdot 11^{n+2} + 12^2 \cdot 12^{2n+1} \\
&= 11 \cdot 11^{n+2} + 144 \cdot 12^{2n+1} \\
&= 11 \cdot 11^{n+2} + (11 + 133) \cdot 12^{2n+1} && (3.3) \\
&= 11 \cdot 11^{n+2} + 11 \cdot 12^{2n+1} + 133 \cdot 12^{2n+1} \\
&= 11(11^{n+2} + 12^{2n+1}) + 133 \cdot 12^{2n+1}. && (3.4)
\end{aligned} \]
The step in (3.3) is crucial: splitting 144 into the sum of 11 and 133 allows us to deduce a statement to which we can apply A(n). Indeed, since by assumption $11^{n+2} + 12^{2n+1}$ is divisible by 133, there exists a k ∈ N such that $11^{n+2} + 12^{2n+1} = 133k$, and (3.4) becomes
\[ \begin{aligned}
11^{n+3} + 12^{2n+3} &= 11^{(n+1)+2} + 12^{2(n+1)+1} && (3.5) \\
&= 11 \cdot 133 \cdot k + 133 \cdot 12^{2n+1} \\
&= 133(11k + 12^{2n+1}).
\end{aligned} \]
Since $11k + 12^{2n+1}$ is a natural number, say m, it follows that $11^{(n+1)+2} + 12^{2(n+1)+1} = 133m$, i.e. A(n + 1) is correct. Now the principle of mathematical induction yields that A(n) holds for all n ≥ 0.
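A computer can only ever test finitely many cases, which is exactly why the induction argument is needed; still, a finite test is a useful sanity check. The sketch below checks divisibility by 133 for the first 200 values of n (an arbitrary cut-off) and also verifies the identity (3.4) that drives the induction step. The function name `expression` is a hypothetical helper.

```python
def expression(n):
    """The number 11^(n+2) + 12^(2n+1) from statement A(n)."""
    return 11 ** (n + 2) + 12 ** (2 * n + 1)

# Finite check of A(n) for n = 0, ..., 199 (not a proof).
divisible = all(expression(n) % 133 == 0 for n in range(200))

# The identity (3.4): A(n+1)'s number equals 11 * (A(n)'s number) + 133 * 12^(2n+1).
step_identity = all(
    expression(n + 1) == 11 * expression(n) + 133 * 12 ** (2 * n + 1)
    for n in range(200)
)

print(expression(0), divisible, step_identity)
```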
This example already gives an insight that mathematical induction as a
method of proving a statement A(n) for all n ≥ m, n, m ∈ Z, is a way
forward. However, depending on the statement A(n) we may need addi-
tional knowledge for the proof that A(n) implies A(n + 1). Indeed this is
already of course the case when proving A(m).
There is a reason why we have not started with proving (3.1). Although the
notation 1 + 2 + · · · + n is intuitively clear, we will introduce a better one.
Suppose that $a_1, \ldots, a_n \in \mathbb{R}$, which is shorthand for: suppose that for every j ∈ {k ∈ N | k ≤ n} we have $a_j \in \mathbb{R}$. The sum A of these n real numbers is denoted by
\[ A := \sum_{j=1}^{n} a_j, \tag{3.6} \]
which is what we mean when writing
\[ A = a_1 + \cdots + a_n. \tag{3.7} \]
At the end of this chapter we will discuss an even more formal way to intro-
duce (3.6). Here are some examples on how to use this new notation
Example 3.2. A. For j ∈ N let $a_j = j$. Then
\[ \sum_{j=1}^{n} a_j = \sum_{j=1}^{n} j \tag{3.8} \]
gives of course the expression considered in (3.1).
B. Take $a_j = j^2$, j ∈ N, to find
\[ \sum_{j=1}^{n} a_j = \sum_{j=1}^{n} j^2. \tag{3.9} \]
C. Now take $b_j = \frac{1}{j}$, j ∈ N, to form the sum
\[ \sum_{j=1}^{n} b_j = \sum_{j=1}^{n} \frac{1}{j}. \tag{3.10} \]
(Of course it does not matter whether we denote the numbers by $a_j$ or $b_j$.)
D. Finally, with $c_j = 2^j$, $j \in \mathbb{N}_0 = \mathbb{N} \cup \{0\}$, we can form the sum
\[ \sum_{j=0}^{n} c_j = \sum_{j=0}^{n} 2^j. \tag{3.11} \]
The last part of Example 3.2 is interesting. Everyone understands what is meant by
\[ \sum_{j=0}^{n} c_j = c_0 + c_1 + \cdots + c_n. \tag{3.12} \]
We can extend this notation: for m, k ∈ Z, m ≤ k, let the real numbers $a_m, a_{m+1}, \ldots, a_k$ be given. We set for their sum
\[ \sum_{j=m}^{k} a_j = a_m + a_{m+1} + \cdots + a_k. \tag{3.13} \]
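For example with $a_j = (j + \frac{1}{2})^{-2}$ we can form
\[ \begin{aligned}
\sum_{j=-3}^{2} a_j &= \sum_{j=-3}^{2} \frac{1}{(j + \frac{1}{2})^2} \\
&= \frac{1}{(-\frac{5}{2})^2} + \frac{1}{(-\frac{3}{2})^2} + \frac{1}{(-\frac{1}{2})^2} + \frac{1}{(\frac{1}{2})^2} + \frac{1}{(\frac{3}{2})^2} + \frac{1}{(\frac{5}{2})^2} \\
&= 4\left(\frac{1}{25} + \frac{1}{9} + 1 + 1 + \frac{1}{9} + \frac{1}{25}\right) = \frac{2072}{225}.
\end{aligned} \]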
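The sum notation with a lower index m and an upper index k corresponds closely to a loop over `range(m, k + 1)`. As a minimal sketch, the example above can be recomputed with exact fractions; the helper name `term` is hypothetical. Note that `range(-3, 3)` runs over j = −3, …, 2, and that an empty range gives the sum 0, in line with the usual convention for empty sums.

```python
from fractions import Fraction

def term(j):
    """The summand a_j = (j + 1/2)^(-2)."""
    return 1 / (j + Fraction(1, 2)) ** 2

# Sum over j = -3, ..., 2.
total = sum(term(j) for j in range(-3, 3))

# An empty index range (upper index below lower index) yields 0.
empty = sum(term(j) for j in range(3, 1))

print(total, empty)  # total is 2072/225, empty is 0
```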
It is convenient to include the following convention
\[ \sum_{j=m}^{k} a_j = 0 \quad \text{for } k < m. \tag{3.14} \]
Moreover, the associative law for addition implies for k ≤ l ≤ m that
\[ \sum_{j=k}^{m} a_j = \sum_{j=k}^{l} a_j + \sum_{j=l+1}^{m} a_j \quad \bigl(= (a_k + \cdots + a_l) + (a_{l+1} + \cdots + a_m)\bigr). \tag{3.15} \]
Now we return to statement (3.1).
Example 3.3. Prove that
\[ A(n): \quad \sum_{j=1}^{n} j = \frac{n(n + 1)}{2}, \quad n \geq 1. \tag{3.16} \]
We start by proving A(1), i.e. we note that
\[ \sum_{j=1}^{1} j = 1 \quad \text{as well as} \quad \frac{1(1 + 1)}{2} = 1, \]
i.e. A(1) holds. Now suppose that A(n) holds for arbitrary but fixed n ∈ N. We want to show that then A(n + 1) holds too. Indeed we have
\[ \sum_{j=1}^{n+1} j = \sum_{j=1}^{n} j + (n + 1), \]
and this is already the crucial step since it allows us to use statement A(n), namely
\[ \begin{aligned}
\sum_{j=1}^{n+1} j &= \sum_{j=1}^{n} j + (n + 1) \\
&= \frac{n(n + 1)}{2} + (n + 1) = \frac{n(n + 1) + 2(n + 1)}{2} \\
&= \frac{(n + 1)(n + 2)}{2},
\end{aligned} \]
which is A(n + 1).
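The formula (3.16) can be compared with a direct summation for a range of n. The following sketch does so for n up to 1000 (an arbitrary bound); it mirrors the statement A(n) but of course checks only finitely many cases, so it does not replace the induction proof. The function names are illustrative.

```python
def direct_sum(n):
    """1 + 2 + ... + n, computed term by term."""
    return sum(range(1, n + 1))

def closed_form(n):
    """The right-hand side n(n + 1)/2 of (3.16); n(n + 1) is always even."""
    return n * (n + 1) // 2

formula_holds = all(direct_sum(n) == closed_form(n) for n in range(1, 1001))
print(direct_sum(100), closed_form(100), formula_holds)
```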
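Example 3.4. For $x \neq 1$ the statement
\[ A(n):\quad \sum_{j=0}^{n} x^j = \frac{x^{n+1} - 1}{x - 1}, \quad n \ge 0 \tag{3.17} \]
holds. Recall that $x^0 = 1$, thus we have $\sum_{j=0}^{n} x^j = 1 + x + x^2 + \cdots + x^n$. For $n = 0$ the statement $A(0)$ is correct:
\[ \sum_{j=0}^{0} x^j = x^0 = 1 \quad \text{and} \quad \frac{x^1 - 1}{x - 1} = \frac{x - 1}{x - 1} = 1. \]
Now if $\sum_{j=0}^{n} x^j = \frac{x^{n+1} - 1}{x - 1}$ then
\begin{align*}
\sum_{j=0}^{n+1} x^j &= \sum_{j=0}^{n} x^j + x^{n+1} = \frac{x^{n+1} - 1}{x - 1} + x^{n+1} \\
&= \frac{x^{n+1} - 1}{x - 1} + \frac{x^{n+1}(x - 1)}{x - 1} \\
&= \frac{x^{n+1} + x^{n+2} - x^{n+1} - 1}{x - 1} = \frac{x^{n+2} - 1}{x - 1},
\end{align*}
i.e. we have proved that $A(n+1)$ is correct:
\[ \sum_{j=0}^{n+1} x^j = \frac{x^{n+2} - 1}{x - 1}. \]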
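The geometric sum formula (3.17) can also be compared with a term-by-term evaluation. The following sketch uses exact fractions for a few sample values of $x \neq 1$; the function names and the sample values are assumptions chosen for illustration only.

```python
from fractions import Fraction

def geometric_direct(x, n):
    """x^0 + x^1 + ... + x^n computed term by term."""
    return sum(x ** j for j in range(n + 1))

def geometric_closed(x, n):
    """The closed form (x^(n+1) - 1) / (x - 1), valid for x != 1."""
    return (x ** (n + 1) - 1) / (x - 1)

for x in (Fraction(2), Fraction(-3, 2), Fraction(1, 3)):
    for n in range(0, 12):
        assert geometric_direct(x, n) == geometric_closed(x, n)

print(geometric_closed(Fraction(2), 3))  # 15
```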
We can also use mathematical induction to prove inequalities or estimates.
Lemma 3.5. Let $a_1, \ldots, a_n \in \mathbb{R}$. Then we have the estimates
\[ \left| \sum_{l=1}^{n} a_l \right| \le \sum_{l=1}^{n} |a_l| \le n \max\{|a_1|, \ldots, |a_n|\}. \tag{3.18} \]
Proof. For $n = 1$ we find
\[ \left| \sum_{l=1}^{1} a_l \right| = |a_1| = 1 \cdot \max\{|a_1|\}. \]
Now suppose that the first estimate in (3.18) holds for arbitrary but fixed $n \in \mathbb{N}$. We find using the triangle inequality (2.12) that
\begin{align*}
\left| \sum_{l=1}^{n+1} a_l \right| = \left| \sum_{l=1}^{n} a_l + a_{n+1} \right| &\le \left| \sum_{l=1}^{n} a_l \right| + |a_{n+1}| \\
&\le \sum_{l=1}^{n} |a_l| + |a_{n+1}|,
\end{align*}
where in the last step we used (3.18) for $n$. Now the rest is straightforward since
\[ \sum_{l=1}^{n} |a_l| + |a_{n+1}| = \sum_{l=1}^{n+1} |a_l|, \]
and the first estimate is proved for $n + 1$ provided it holds for $n$, hence by mathematical induction the first estimate holds for all $n \in \mathbb{N}$. The second estimate in (3.18) is proved without induction. Let $\max\{|a_1|, \ldots, |a_n|\} = |a_k|$ for some $1 \le k \le n$. Replacing each number $|a_l|$ in $\sum_{l=1}^{n} |a_l|$ by $|a_k|$ will not decrease the sum, i.e.
\[ \sum_{l=1}^{n} |a_l| \le |a_k| + \cdots + |a_k| = n \cdot \max\{|a_1|, \ldots, |a_n|\}. \]
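Both estimates of Lemma 3.5 can be observed on random data. The sketch below is only an experiment, not a proof; it uses integers to avoid rounding effects, and the helper name `estimates_hold` is an assumption made for this illustration.

```python
import random

def estimates_hold(values):
    """Check |sum a_l| <= sum |a_l| <= n * max |a_l| for a non-empty list."""
    n = len(values)
    left = abs(sum(values))
    middle = sum(abs(v) for v in values)
    right = n * max(abs(v) for v in values)
    return left <= middle <= right

random.seed(0)  # fixed seed so the experiment is reproducible
for _ in range(1000):
    n = random.randint(1, 10)
    sample = [random.randint(-50, 50) for _ in range(n)]
    assert estimates_hold(sample)

print(estimates_hold([1, -2, 3]))  # True: 2 <= 6 <= 9
```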
As in the case for finite sums we can introduce a notation for finite products. Let $a_1, \ldots, a_n \in \mathbb{R}$ be given. We denote their product by
\[ \prod_{j=1}^{n} a_j = a_1 \cdot a_2 \cdot \ldots \cdot a_n. \tag{3.19} \]
Clearly, using the associative law for multiplication we have for $m < n$ that
\[ \prod_{j=1}^{n} a_j = \prod_{j=1}^{m} a_j \cdot \prod_{j=m+1}^{n} a_j. \tag{3.20} \]
Note that the second term on the right hand side of (3.20) is an obvious generalisation of (3.19), compare with the analogous notation for sums, see (3.15).
Hence, for $l < n$, $l, n \in \mathbb{Z}$, and real numbers $a_l, a_{l+1}, \ldots, a_n$ we write for their product
\[ \prod_{j=l}^{n} a_j = a_l \cdot a_{l+1} \cdot \ldots \cdot a_n. \tag{3.21} \]
We introduce further for $n \in \mathbb{N}$
\[ n! := \prod_{j=1}^{n} j, \tag{3.22} \]
and we call this number $n$ factorial. For example we have $6! = 1 \cdot 2 \cdot 3 \cdot 4 \cdot 5 \cdot 6 = 720$. Using (3.20) we find
\[ (n+1)! = n!\,(n+1). \tag{3.23} \]
Further we define
\[ 0! = 1. \tag{3.24} \]
Definition 3.6. For $n, k \in \mathbb{N} \cup \{0\}$, $k \le n$, we define the binomial coefficient by
\[ \binom{n}{k} := \frac{n!}{k!(n-k)!}, \tag{3.25} \]
where we read $\binom{n}{k}$ as $n$ over $k$. For $k > n$ we set
\[ \binom{n}{k} = 0. \tag{3.26} \]
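Example 3.7. The following hold:
\[ \binom{n}{0} = \frac{n!}{0!\,n!} = 1; \]
\[ \binom{n}{1} = \frac{n!}{1!(n-1)!} = n; \]
\[ \binom{n}{n} = \frac{n!}{n!(n-n)!} = 1; \]
\[ \binom{2}{1} = \frac{2!}{1!\,1!} = 2; \]
\[ \binom{4}{2} = \frac{4!}{2!(4-2)!} = \frac{1 \cdot 2 \cdot 3 \cdot 4}{2 \cdot 2} = 6. \]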
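Lemma 3.8. For $1 \le k \le n$ the following holds
\[ \binom{n}{k} = \binom{n-1}{k-1} + \binom{n-1}{k}. \tag{3.27} \]
Proof. For $n = k$ it is straightforward:
\[ \binom{n}{n} = \binom{n-1}{n-1} + \binom{n-1}{n} \]
or
\[ 1 = 1 + 0. \]
Now for $1 \le k < n$ we have
\begin{align*}
\binom{n-1}{k-1} + \binom{n-1}{k} &= \frac{(n-1)!}{(k-1)!(n-k)!} + \frac{(n-1)!}{k!(n-k-1)!} \\
&= \frac{k(n-1)! + (n-k)(n-1)!}{k!(n-k)!} \\
&= \frac{(n-k+k)(n-1)!}{k!(n-k)!} = \frac{n!}{k!(n-k)!} = \binom{n}{k}.
\end{align*}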
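The factorial, the binomial coefficient and the identity of Lemma 3.8 translate directly into a few lines of code. The following is a minimal sketch that follows (3.23)–(3.26) literally; the function names are assumptions for this illustration, and the recursive factorial is meant for small arguments only.

```python
def factorial(n):
    """n! via (n + 1)! = n! (n + 1) from (3.23), starting from 0! = 1 as in (3.24)."""
    return 1 if n == 0 else factorial(n - 1) * n

def binomial(n, k):
    """Binomial coefficient as in (3.25), with the value 0 for k > n as in (3.26)."""
    if k > n:
        return 0
    return factorial(n) // (factorial(k) * factorial(n - k))

# Lemma 3.8 for all 1 <= k <= n in a small range
for n in range(1, 12):
    for k in range(1, n + 1):
        assert binomial(n, k) == binomial(n - 1, k - 1) + binomial(n - 1, k)

print(factorial(6), binomial(4, 2), binomial(3, 5))  # 720 6 0
```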
We can now prove our first non-trivial result. The following formulae should be familiar:
\[ (a+b)^2 = a^2 + 2ab + b^2 \quad \text{and} \quad (a-b)^2 = a^2 - 2ab + b^2. \]
These are generalised by:
Theorem 3.9 (Binomial theorem). For $x, y \in \mathbb{R}$ and $n \in \mathbb{N} \cup \{0\}$
\[ (x+y)^n = \sum_{k=0}^{n} \binom{n}{k} x^{n-k} y^k. \tag{3.28} \]
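Proof. We use mathematical induction. Denote the statement in (3.28) as $A(n)$. For $n = 0$ we have
\[ (x+y)^0 = 1 \quad \text{and since} \quad \sum_{k=0}^{0} \binom{0}{k} x^{0-k} y^k = \binom{0}{0} x^0 y^0 = 1 \]
the statement $A(0)$ holds. Now we prove that $A(n)$ implies $A(n+1)$:
\begin{align*}
(x+y)^{n+1} &= (x+y)^n (x+y) = \left( \sum_{k=0}^{n} \binom{n}{k} x^{n-k} y^k \right)(x+y) \\
&= \sum_{k=0}^{n} \binom{n}{k} x^{n+1-k} y^k + \sum_{k=0}^{n} \binom{n}{k} x^{n-k} y^{k+1} \\
&= x^{n+1} + \sum_{k=1}^{n} \binom{n}{k} x^{n+1-k} y^k + \sum_{k=0}^{n-1} \binom{n}{k} x^{n-k} y^{k+1} + y^{n+1} \\
&= x^{n+1} + \sum_{k=1}^{n} \binom{n}{k} x^{n+1-k} y^k + \sum_{k=1}^{n} \binom{n}{k-1} x^{n-(k-1)} y^k + y^{n+1} \tag{3.29} \\
&= x^{n+1} + \sum_{k=1}^{n} \left( \binom{n}{k} + \binom{n}{k-1} \right) x^{n+1-k} y^k + y^{n+1} \\
&= \binom{n+1}{0} x^{n+1} y^0 + \sum_{k=1}^{n} \left( \binom{n}{k} + \binom{n}{k-1} \right) x^{n+1-k} y^k + \binom{n+1}{n+1} x^0 y^{n+1} \\
&= \sum_{k=0}^{n+1} \binom{n+1}{k} x^{n+1-k} y^k,
\end{align*}
proving the result.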
In Remark 3.13 below we clarify the calculation leading to (3.29) in more detail.
Corollary 3.10. The following holds
\[ \sum_{k=0}^{n} \binom{n}{k} = (1+1)^n = 2^n, \tag{3.30} \]
and moreover for $n \ge 1$ we have
\[ \sum_{k=0}^{n} \binom{n}{k} (-1)^k = (1-1)^n = 0. \tag{3.31} \]
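The binomial theorem and the two sums of Corollary 3.10 can be checked for small exponents. The sketch below uses `math.comb` from the Python standard library (available from Python 3.8) for the binomial coefficients; the function name `binomial_expansion` and the sample values are assumptions for this illustration. Note that for n = 0 the alternating sum consists of the single term 1.

```python
from math import comb

def binomial_expansion(x, y, n):
    """Right hand side of (3.28): sum over k of C(n, k) x^(n-k) y^k."""
    return sum(comb(n, k) * x ** (n - k) * y ** k for k in range(n + 1))

for n in range(0, 10):
    assert binomial_expansion(3, -5, n) == (3 + (-5)) ** n
    assert sum(comb(n, k) for k in range(n + 1)) == 2 ** n
    alternating = sum(comb(n, k) * (-1) ** k for k in range(n + 1))
    assert alternating == (1 if n == 0 else 0)

print(binomial_expansion(1, 1, 5))  # 32
```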
Example 3.11. Using the binomial theorem we get
\begin{align*}
(x+y)^0 &= 1, \\
(x+y)^1 &= x + y, \\
(x+y)^2 &= x^2 + 2xy + y^2, \\
(x-y)^2 &= x^2 - 2xy + y^2, \\
(x+y)^3 &= x^3 + 3x^2 y + 3xy^2 + y^3, \\
(x+y)^4 &= x^4 + 4x^3 y + 6x^2 y^2 + 4xy^3 + y^4.
\end{align*}
Remark 3.12. The binomial coefficients will play an important part in prob-
ability theory and combinatorics.
Remark 3.13 (Changing the running index in a sum). In deriving (3.29) we used
\[ \sum_{k=0}^{n-1} \binom{n}{k} x^{n-k} y^{k+1} = \sum_{k=1}^{n} \binom{n}{k-1} x^{n-(k-1)} y^k. \tag{3.32} \]
To obtain this result we argue as follows: in the first sum introduce the new running index $l = k + 1$. Thus, whenever we see $k$ we replace it by $l - 1$ to get
\begin{align*}
\sum_{l-1=0}^{n-1} \binom{n}{l-1} x^{n-(l-1)} y^{l-1+1} &= \sum_{l-1=0}^{n-1} \binom{n}{l-1} x^{n-(l-1)} y^{l} \\
&= \sum_{l=1}^{n} \binom{n}{l-1} x^{n-(l-1)} y^{l},
\end{align*}
and now rename $l$ as $k$.
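The index shift in (3.32) amounts to running through the same terms with a different label. A small sketch makes this visible by evaluating both sides for sample integers; the function names and the sample values are assumptions for this illustration.

```python
from math import comb

def sum_before_shift(x, y, n):
    """Left hand side of (3.32): k runs from 0 to n - 1."""
    return sum(comb(n, k) * x ** (n - k) * y ** (k + 1) for k in range(0, n))

def sum_after_shift(x, y, n):
    """Right hand side of (3.32): k runs from 1 to n."""
    return sum(comb(n, k - 1) * x ** (n - (k - 1)) * y ** k for k in range(1, n + 1))

for n in range(0, 8):
    assert sum_before_shift(2, -3, n) == sum_after_shift(2, -3, n)

print(sum_before_shift(2, -3, 1), sum_after_shift(2, -3, 1))  # -6 -6
```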
There is still a need to improve our formal definition of the sum of n real
numbers as given in (3.6), the same applies to the definition of their product,
see (3.21). We have to introduce the concept of a recursive definition.
Suppose for $m \le j \le n$, $m, n \in \mathbb{Z}$, mathematical objects $C(j)$ are defined. For example $C(j) = \sum_{l=1}^{j} a_l$ for $1 \le j \le n$ and $a_l \in \mathbb{R}$. It might happen that we can extend the definition to get a new object $C(n+1)$. In our example we may define
\[ C(n+1) := \sum_{l=1}^{n+1} a_l := a_{n+1} + \sum_{l=1}^{n} a_l = a_{n+1} + C(n). \tag{3.33} \]
Thus we use the already defined objects $C(m), \ldots, C(n)$ to define the new object $C(n+1)$. If we can extend this to all $n \ge m$, i.e. for all $n \ge m$, $m, n \in \mathbb{Z}$, we can define $C(n+1)$ given $C(m), \ldots, C(n)$, then we say that the sequence of objects $C(n)$, $n \ge m$, is recursively defined, or defined by recursion, or as some authors say defined by mathematical induction.
Here are a few examples in addition to (3.33):
\[ \prod_{l=m}^{n+1} a_l := a_{n+1} \cdot \prod_{l=m}^{n} a_l; \tag{3.34} \]
\[ a^{n+1} := a \cdot a^n, \quad n \ge m, \ a \neq 0. \tag{3.35} \]
We can put this into a more formal scheme which indicates that we may prove
by mathematical induction that when defining objects C(j) by recursion we
indeed have defined all the elements of the sequence C(j), j ≥ m. The formal
proof however we omit. If C(j), j ≥ m, j, m ∈ Z, are the objects we want to
define we start with
A(m) : C(m) is defined by some formula,
for example
\[ A(1):\quad \sum_{l=1}^{1} a_l := a_1. \]
In the next step we consider
A(n + 1) : C(n + 1) which is defined using C(m), . . . , C(n),
for example
\[ A(n+1):\quad C(n+1) := \sum_{l=1}^{n+1} a_l := a_{n+1} + \sum_{l=1}^{n} a_l = a_{n+1} + C(n). \]
Note that we will not always need C(m), . . . , C(n) to define C(n + 1); in our
example C(n) is sufficient. We can interpret A(n) as the statement: given
A(m), . . . , A(n−1), then it is formally possible to define A(n). The proof that
a definition by recursion gives all objects C(n), n ≥ m, must now show that
for all n ≥ m the following holds: if we can formally define A(m), . . . A(n),
then we can also formally define A(n + 1).
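Definitions by recursion correspond closely to recursive functions in a programming language. The following sketch mirrors the scheme for the sum (3.33) and for powers (3.35); the function names, the list representation of $a_1, \ldots, a_n$ and the starting value $a^0 := 1$ for the power are assumptions made for this illustration.

```python
def rec_sum(a, n):
    """C(n) = a_1 + ... + a_n for n >= 1, where a[l - 1] holds a_l."""
    if n == 1:
        return a[0]                      # A(1): C(1) := a_1
    return a[n - 1] + rec_sum(a, n - 1)  # C(n) := a_n + C(n - 1), cf. (3.33)

def rec_power(a, n):
    """a^n for n >= 0, defined by a^(n+1) := a * a^n as in (3.35)."""
    if n == 0:
        return 1                         # assumed starting value a^0 := 1
    return a * rec_power(a, n - 1)

print(rec_sum([1, 2, 3, 4], 4))  # 10
print(rec_power(2, 10))          # 1024
```

Each call only needs the previously defined object, which matches the remark that $C(n)$ alone is sufficient to define $C(n+1)$ in this example.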
Next comes an observation which will force us to be a bit cautious. So far mathematical statements are objects which we have not really defined, however we have a naïve but often correct idea of what statements are. Mathematical induction was introduced to prove such (naïve) statements. The situation above, i.e. the definition by recursion, is slightly different. The
statement we want to prove is that a formal definition is correct, i.e. we need
to know what “formally correct” definitions are. Currently, for our course
we need not resolve these problems, all we need to know is that sometimes
we must be cautious. Those of you who will later study mathematical logic
or the foundations of mathematics will read more about this and similar
problems.
Problems
1. a) Use mathematical induction to prove that for $k \in \mathbb{N} \cup \{0\}$
\[ k^3 + (k+1)^3 + (k+2)^3 \]
is divisible by 9.
b) Prove by mathematical induction that for every integer $n \ge 0$ the number
\[ \frac{n^5}{5} + \frac{n^4}{2} + \frac{n^3}{3} - \frac{n}{30} \]
is an integer.
2. Prove by mathematical induction that
a) for every $x, y \in \mathbb{R}$ and all $n \in \mathbb{N}$ the term $x^n - y^n$ always has $x - y$ as a factor, i.e.
\[ x^n - y^n = (x - y) Q_n(x, y) \]
where for $x$ fixed $Q_n(x, y)$ is a polynomial with respect to $y$ and for $y$ fixed $Q_n(x, y)$ is a polynomial with respect to $x$.
b) For every $x > 0$ and $y > 0$ and for all $n \in \mathbb{N}$ the following holds
\[ (n-1)x^n + y^n \ge n x^{n-1} y. \]
3. Find the value of each of the following sums:
a) $\sum_{j=-2}^{2} \frac{1}{2^j}$; b) $\sum_{k=2}^{5} (a^k - a^{k-2})$, $a \neq 1$; c) $\sum_{l=1}^{6} (-1)^l \frac{l+1}{l}$.
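4. a) For $\lambda \in \mathbb{R}$ and $a_1, \ldots, a_N, b_1, \ldots, b_N \in \mathbb{R}$ show that
\[ \lambda \sum_{j=1}^{N} a_j = \sum_{j=1}^{N} (\lambda a_j) \]
and
\[ \sum_{j=1}^{N} a_j + \sum_{j=1}^{N} b_j = \sum_{j=1}^{N} (a_j + b_j). \]
b) For $x, y \in \mathbb{R}$ simplify
\[ (x - y) \sum_{k=0}^{5} x^k y^{5-k}. \]
5. Prove the following identities:
a) $\sum_{k=1}^{n} \frac{1}{(2k-1)(2k+1)} = \frac{n}{2n+1}$; b) $\sum_{n=1}^{k} n \cdot n! = (k+1)! - 1$;
c) $\sum_{j=1}^{m} (a + (j-1)d) = \frac{1}{2} m (2a + (m-1)d)$.
6. Find the value of each of the following products:
a) $\prod_{k=-2}^{2} 2^{-k}$; b) $\prod_{j=3}^{6} (j-4)$; c) $\prod_{j=1}^{5} \frac{j+2}{j+4}$.
7. For $\nu, \mu \in \mathbb{R}$ and $a_1, \ldots, a_N \in \mathbb{R}$ show that
\[ \prod_{j=1}^{N} (\mu a_j) + \prod_{j=1}^{N} (\nu a_j) = (\mu^N + \nu^N) \prod_{j=1}^{N} a_j. \]
8. Find the value of the following:
a) $7!$ and $\frac{63!}{60!}$; b) $\frac{(n+1)! - n!}{n}$; c) $\frac{(n+1)!}{(n-1)!}$.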
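9. Prove the following by induction:
a) $\prod_{k=1}^{n} \frac{2k-1}{2k} = \frac{1}{2^{2n}} \binom{2n}{n}$, $n \ge 2$; b) $\prod_{k=1}^{n-1} \left( 1 + \frac{1}{k} \right)^k = \frac{n^n}{n!}$, $n \ge 1$.
10. Find the binomial expansion of the following:
a) $(5x^2 + 3y)^4$; b) $(x - y)^n$.
11. Prove the following:
a)
\[ \binom{n}{k} = \frac{n(n-1) \cdot \ldots \cdot (n-k+1)}{1 \cdot 2 \cdot \ldots \cdot k}; \]
b) For $\alpha \in \mathbb{R}$ and $k \in \mathbb{N}$ consider
\[ \binom{\alpha}{k} := \frac{\alpha(\alpha-1) \cdot \ldots \cdot (\alpha-k+1)}{1 \cdot 2 \cdot \ldots \cdot k}, \quad \binom{\alpha}{0} := 1, \]
and prove for $k \ge 2$ that
\[ \binom{\frac{1}{2}}{k} = (-1)^{k-1} \frac{1 \cdot 3 \cdot \ldots \cdot (2k-3)}{2 \cdot 4 \cdot \ldots \cdot (2k)}. \]
12. Let $p, k \in \mathbb{N}$. Use mathematical induction to prove:
a) $k \ge 1$ and $p \ge 2$ implies $p^k > k$;
b) $k \ge 1$ and $p \ge 3$ implies $p^k > k^2$;
c) for $k \ge 5$ it is true that $2^k > k^2$.
13. Prove the following by induction:
a) $\sum_{j=1}^{N} \frac{1}{\sqrt{j}} \le 2\sqrt{N}$; b) $\prod_{m=1}^{k} (2m)! \ge ((k+1)!)^k$.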
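14. Prove the arithmetic-geometric mean inequality:
For $k \ge 2$ and $a_1, \ldots, a_k \in \mathbb{R}$, $a_j \ge 0$ where $j = 1, \ldots, k$, the following holds
\[ (\ast\ast) \qquad \sqrt[k]{a_1 \cdot \ldots \cdot a_k} \le \frac{a_1 + \ldots + a_k}{k} \]
or
\[ \left( \prod_{j=1}^{k} a_j \right)^{\frac{1}{k}} \le \frac{1}{k} \sum_{j=1}^{k} a_j. \]
Hint: first prove $(\ast\ast)$ by induction for $k = 2^n$, $n \in \mathbb{N}$. Then for general $k \in \mathbb{N}$ choose $n$ such that $k < 2^n$, and with
\[ a := \frac{1}{k} \sum_{j=1}^{k} a_j \]
consider $a_1 \cdot \ldots \cdot a_k \cdot a^{2^n - k}$.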
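15. Define
\[ x_n := \frac{1}{2} \left( x_{n-1} + \frac{c}{x_{n-1}} \right), \quad n \in \mathbb{N}, \]
with $c > 0$ and $x_0 := 1$. Further set
\[ a_n := \frac{c}{x_n}, \quad n \in \mathbb{N} \cup \{0\}. \]
Prove $a_n \le a_{n+1} \le x_{n+1} \le x_n$ for $n \ge 1$.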
4 Functions and Mappings

Let D ⊂ R, i.e. D is a subset of the real numbers. Often we need to associate
with x ∈ D a new real number which we denote at the moment by f (x).
Example 4.1. A. Suppose that a shop offers n ∈ N items for sale and we
enumerate these items by 1, · · · , n, we may then assign a price to each. Thus
D = {1, · · · , n} and for x ∈ D the new number f (x) denotes the price.
B. For x ∈ D = R we can consider its absolute value |x|, i.e. f (x) = |x|.
C. With $D = \{x \in \mathbb{R} \mid x \ge 0\}$ we may consider
\[ x \mapsto f(x) = \sqrt{x}, \]
i.e. for each $x \in D$ we consider its square root.

D. Let $D = \{x \in \mathbb{R} \mid x \neq 0\} = \mathbb{R} \setminus \{0\}$. With $x \in D$ we may consider its inverse with respect to multiplication, i.e.
\[ f(x) = \frac{1}{x} = x^{-1}. \]
Let us agree to the following
Definition 4.2. Let $D \subset \mathbb{R}$. A function $f : D \to \mathbb{R}$ is a rule which assigns to every $x \in D$ exactly one real value $f(x)$. For this we write $x \mapsto f(x)$ and say that $x$ is mapped onto $f(x)$, or $f(x)$ is the value of $f$ at $x$.
It is convenient to introduce the notation
\begin{align*}
f : D &\to \mathbb{R} \\
x &\mapsto f(x).
\end{align*}
Note that when thinking more carefully about the foundations of mathemat-
ics this definition causes some problems. However for now it is absolutely
sufficient for our purposes. We call D the domain of the function f , some-
times we write D(f ) instead of D. Often, if no confusion arises (just as
above) we call f a function and omit the domain and the target set or
co-domain R. Two functions fj : Dj → R, j = 1, 2, are equal if and only if
D1 = D2 and if for all x ∈ D1 = D2 we have f1 (x) = f2 (x). Sometimes it is
useful to write f (·) instead of f .
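Example 4.3. A. The absolute value is the function
\begin{align*}
|\cdot| : \mathbb{R} &\to \mathbb{R} \\
x &\mapsto |x|.
\end{align*}
B. The square root (function) is given by
\begin{align*}
\sqrt{\cdot} : \mathbb{R}_+ &\to \mathbb{R} \\
x &\mapsto \sqrt{x}
\end{align*}
where we write $\mathbb{R}_+ = \{x \in \mathbb{R} \mid x \ge 0\}$.

C. Consider
\[ f_1 : \mathbb{R} \to \mathbb{R},\ x \mapsto x^2 \quad \text{and} \quad f_2 : \mathbb{Z} \to \mathbb{R},\ x \mapsto x^2. \]
Both are functions but they are not equal since $D(f_1) \neq D(f_2)$.
D. Let $k \in \mathbb{N}_0 := \mathbb{N} \cup \{0\}$ and $a_0, a_1, \ldots, a_k \in \mathbb{R}$. Then for every $x \in \mathbb{R}$ we can construct a new real number by
\[ p(x) := \sum_{j=0}^{k} a_j x^j = a_0 + a_1 x + \ldots + a_k x^k. \tag{4.1} \]
Thus we may define the function
\begin{align*}
p : \mathbb{R} &\to \mathbb{R} \\
x &\mapsto p(x) := \sum_{j=0}^{k} a_j x^j.
\end{align*}
Functions of this type are called polynomial functions (on $\mathbb{R}$) or in shorthand polynomials.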
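A polynomial function is completely described by its coefficients $a_0, \ldots, a_k$, which makes it easy to model. In the sketch below the coefficients are stored in a list and the value $p(x)$ is computed directly from (4.1); the helper name `polynomial` and the sample coefficients are assumptions for this illustration.

```python
def polynomial(coefficients):
    """Return the function x -> a_0 + a_1 x + ... + a_k x^k for coefficients [a_0, ..., a_k]."""
    def p(x):
        return sum(a_j * x ** j for j, a_j in enumerate(coefficients))
    return p

p = polynomial([1, -3, 0, 2])  # p(x) = 1 - 3x + 2x^3
print(p(0), p(1), p(2))        # 1 0 11
```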
We are mainly interested in studying functions defined on some subset of R,

often an interval. There is a simple but important way to interpret such a
function, namely by considering it as a set of ordered pairs of real numbers:
{ (x, f (x))| x ∈ D} .
Let us first formalise this idea and then we will use it to give a geometric
interpretation of a function. For x ∈ R and y ∈ R we can form the pairs
(x, y) and (y, x) where it matters whether x or y is in the first position. The
set of all ordered pairs of real numbers is called the Cartesian product
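of $\mathbb{R}$ with itself and is denoted by $\mathbb{R} \times \mathbb{R}$ or simply by $\mathbb{R}^2$. Thus $a \in \mathbb{R}^2$ if $a$ is a pair $(x, y)$ of real numbers, $x, y \in \mathbb{R}$. Two pairs $(x_1, y_1)$ and $(x_2, y_2)$ are equal if and only if $x_1 = x_2$ and $y_1 = y_2$. For example $(\sqrt{4}, 1) = (2, 1)$ but $(2, 1) \neq (1, 2)$. If $D \subset \mathbb{R}$ we can define a subset of $\mathbb{R}^2$ by
\[ D \times \mathbb{R} := \{ (x, y) \in \mathbb{R}^2 \mid x \in D \text{ and } y \in \mathbb{R} \}, \tag{4.2} \]
and of course this extends easily to $D \subset \mathbb{R}$ and $R \subset \mathbb{R}$:
\[ D \times R := \{ (x, y) \in \mathbb{R}^2 \mid x \in D \text{ and } y \in R \}. \tag{4.3} \]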
Now, given a function f : D −→ R it follows that
{ (x, f (x))| x ∈ D} ⊂ D × R ⊂ R2 .
We call this set the graph of f and denote it by Γ(f )
Γ(f ) := { (x, f (x))| x ∈ D} . (4.4)
For a function f : D −→ R the value at x is the real number f (x) and the
graph Γ(f ) is a subset of the Cartesian product D × R.
Consider the function |.| : R −→ R, x −→ |x|. It is defined for all x ∈ R but
only non-negative real numbers may occur as a value of the function, since
|x| ≥ 0 for x ∈ R. We introduce the range of a function f : D −→ R as the
set
R(f ) := { y ∈ R| there exists x ∈ D such that y = f (x)} . (4.5)
Another way to look at the range of f is to consider it as the image of D. In

this sense we define the image of D under f , denoted by f (D), as
f (D) = {y ∈ R| exists x ∈ D such that y = f (x)} = R(f ).
An important problem is to determine the range of a given function. Let us

give a geometrical interpretation of the graph of a function. We have already
agreed to interpret a real number x as a point on the real line. Thus it is
natural to interpret a pair of real numbers as a point in the plane. The graph
Γ(f ) of a function f : D −→ R is the collection of all points (x, f (x)) for
x ∈ D, thus Γ(f ) ⊂ D × R.
[Figure 4.1: the points $(-2, \frac{3}{2})$, $(3, 2)$, $(-4, -\frac{7}{2})$ and $(1, -\frac{7}{2})$ marked in the $x$-$y$ plane.]
Here are some examples with y = f (x) (or y = g(x), y = h(x)). In the
following figure the function f is the identity on R, g is a parabola, again
defined on R, and h is the square root function which is of course only defined
on R+ = {x ∈ R | x ≥ 0}.
[Figure 4.2: graphs of $f(x) = x$, $g(x) = x^2$ and $h(x) = \sqrt{x}$.]
In Figure 4.3 the function f is the absolute value with domain R, the function
g is a hyperbola defined on R \ {0}, and h is a cubic polynomial with domain
R.
[Figure 4.3: graphs of $f(x) = |x|$, $g(x) = \frac{1}{x}$ and $h(x) = x^3$.]
It is likely you have already seen these graphs before, but there is a non-
trivial question: how do we know that they are correct? A typical domain
D(f ) contains infinitely many points. We cannot calculate all values f (x).
Thus before we can draw the graph we need to understand and discuss the
function f : D −→ R and its behaviour. The following are natural questions:
• are there lower and upper bounds?
• are there local or global extreme values, i.e. maxima or minima?
• is the function monotone?
• is the graph connected?
  ⋮
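The last question arises when looking at $f : \mathbb{R} \setminus \{0\} \to \mathbb{R}$, $x \mapsto \frac{1}{x}$. The graph $\Gamma(f)$ has the two components $\Gamma_+(f) = \{ (x, \frac{1}{x}) \mid x > 0 \}$ and $\Gamma_-(f) = \{ (x, \frac{1}{x}) \mid x < 0 \}$, i.e.
\[ \Gamma(f) = \Gamma_+(f) \cup \Gamma_-(f) \tag{4.6} \]
and in addition
\[ \Gamma_+(f) \cap \Gamma_-(f) = \emptyset. \tag{4.7} \]
Thus it is not possible to get from a point in $\Gamma_+(f)$ to a point in $\Gamma_-(f)$ while staying in $\Gamma(f)$.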
So far we “know” only a few functions and they all look very “nice”, i.e.
“smooth” and easy to deal with. Here are a few not so nice candidates:
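Example 4.4. A. Let $A \subset \mathbb{R}$ be any set, its characteristic function $\chi_A : \mathbb{R} \to \mathbb{R}$, $x \mapsto \chi_A(x)$, is defined by
\[ \chi_A(x) = \begin{cases} 1, & x \in A \\ 0, & x \notin A. \end{cases} \tag{4.8} \]
The graph of $\chi_A$ for $A = [-1, -\frac{1}{2}] \cup \{\frac{1}{2}\} \cup [1, 2]$ is given by
[Figure 4.4: graph of $y = \chi_A(x)$.]
For $A = \mathbb{Q}$ we get the Dirichlet function $\chi_{\mathbb{Q}} : \mathbb{R} \to \mathbb{R}$
\[ \chi_{\mathbb{Q}}(x) = \begin{cases} 1, & x \in \mathbb{Q} \\ 0, & x \in \mathbb{R} \setminus \mathbb{Q}, \end{cases} \tag{4.9} \]
however it is not possible to draw this graph.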
B. The entier-function is given by $x \mapsto [x]$, i.e. $[\cdot] : \mathbb{R} \to \mathbb{R}$, $x \mapsto [x]$, with $[x]$ being the largest integer less than or equal to $x$. Thus $[1] = 1$, in general $[k] = k$ for $k \in \mathbb{Z}$, but $[\frac{1}{2}] = 0$, $[-\frac{1}{2}] = -1$ etc. Note that we always have $x - [x] \in [0, 1)$. Here is the graph of $[x]$ and $x - [x]$:
[Figure 4.5: graphs of $[x]$ and $x - [x]$.]
where [ indicates that the left end point is included and ) indicates that the
right point is not included. In addition let us consider the new function
f : R −→ R, x −→ x − [x]. Its graph looks periodic with period 1. This
means that f (x + 1) = f (x) for all x ∈ R. Thus for a general function we
may ask whether it is periodic.
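The entier-function corresponds to `math.floor` in Python, so the statements above can be tried out directly. The following sketch uses exact fractions; the function names `entier` and `fractional_part` are assumptions for this illustration.

```python
import math
from fractions import Fraction

def entier(x):
    """[x]: the largest integer less than or equal to x."""
    return math.floor(x)

def fractional_part(x):
    """The function x -> x - [x], with values in [0, 1)."""
    return x - entier(x)

print(entier(Fraction(1, 2)), entier(Fraction(-1, 2)), entier(3))  # 0 -1 3

x = Fraction(-7, 3)
print(fractional_part(x))                            # 2/3
print(fractional_part(x + 1) == fractional_part(x))  # True, period 1
```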
There are some simple procedures to construct new functions from given
ones. Let f1 , f2 : D −→ R be given functions. Note that they have the same
domain. We define
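i) their sum by
\begin{align*}
f_1 + f_2 : D &\to \mathbb{R} \\
x &\mapsto (f_1 + f_2)(x) := f_1(x) + f_2(x); \tag{4.10}
\end{align*}
ii) their difference by
\begin{align*}
f_1 - f_2 : D &\to \mathbb{R} \\
x &\mapsto (f_1 - f_2)(x) := f_1(x) - f_2(x); \tag{4.11}
\end{align*}
iii) their product by
\begin{align*}
f_1 \cdot f_2 : D &\to \mathbb{R} \\
x &\mapsto (f_1 \cdot f_2)(x) := f_1(x) \cdot f_2(x). \tag{4.12}
\end{align*}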
The constant function fc : R −→ R, x −→ c, c ∈ R fixed, is a polynomial

function therefore we have already encountered this function. In particular,
when taking fc for the function f1 in (4.12) we find
(fc · f2 )(x) = c f2 (x) for all x ∈ R, (4.13)
i.e. we can form a new function by multiplying it pointwise by a constant.

However this argument is not quite correct: The product of two functions
is defined only when they have the same domain. We resolve this problem
by introducing the restriction of a function to subsets of its domain. Let
D1 ⊂ D ⊂ R and let f : D −→ R be a function. We call f |D1 : D1 −→ R,
x −→ f |D1 (x), the restriction of f to D1 if f |D1 (x) = f (x) for all x ∈ D1 .
In the case where no confusion may occur we write simply f |D1 or even f .
Now, since fc : R −→ R, x −→ c, is defined on the whole real line we may
restrict it to any subset D and therefore (4.13) makes sense for all functions
defined on some D ⊂ R.
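The role of the common domain becomes concrete when functions on finite domains are stored as dictionaries, with the keys forming the domain. This representation, the helper names `restrict` and `pointwise`, and the sample domains are all assumptions for this illustration. The sketch restricts a constant function to the domain of a second function before forming the pointwise product.

```python
import operator

def restrict(f, d1):
    """Restriction f|_{D1} of a function stored as a dict {x: f(x)}."""
    if not set(d1) <= set(f):
        raise ValueError("D1 must be a subset of the domain of f")
    return {x: f[x] for x in d1}

def pointwise(op, f1, f2):
    """Combine two functions with the same domain value by value."""
    if set(f1) != set(f2):
        raise ValueError("f1 and f2 must have the same domain")
    return {x: op(f1[x], f2[x]) for x in f1}

# Hypothetical finite domains, chosen only for illustration.
f_c = {x: 3 for x in range(-2, 3)}   # constant function x -> 3 on {-2, ..., 2}
f_2 = {x: x * x for x in (0, 1, 2)}  # x -> x^2 on D = {0, 1, 2}

product = pointwise(operator.mul, restrict(f_c, f_2.keys()), f_2)
print(product)  # {0: 0, 1: 3, 2: 12}
```

Calling `pointwise(operator.mul, f_c, f_2)` without the restriction raises an error, which reflects the requirement that both functions have the same domain.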
A problem, in fact a more serious one than one may think at the beginning, is to define the quotient of two functions $f_1, f_2 : D \to \mathbb{R}$. The idea is to define
\[ \frac{f_1}{f_2}(x) = \frac{f_1(x)}{f_2(x)} \quad \text{for } x \in D. \tag{4.14} \]
However this does not make sense for $f_2(x) = 0$. We either have to assume $f_2(x) \neq 0$ for all $x \in D$ or we can only define $\frac{f_1}{f_2}$ on $D_q = \{x \in D \mid f_2(x) \neq 0\}$. In fact the situation is more delicate if we look at the simple case where $f_1, f_2 : \mathbb{R} \to \mathbb{R}$, $f_1(x) = x$ and $f_2(x) = x$ for all $x \in \mathbb{R}$. Of course $\frac{f_1(x)}{f_2(x)} = 1$ for all $x \neq 0$ but we would like to extend this so that it also holds for $x = 0$. Thus a further problem to study is: when does a given function $f : D \to \mathbb{R}$ have an extension to a larger domain $D_1$, $D \subset D_1 \subset \mathbb{R}$? We call $f_1 : D_1 \to \mathbb{R}$ an extension of $f : D \to \mathbb{R}$ if $D \subset D_1$ and $f_1|_D = f$.
In our example $\frac{f_1}{f_2}$ is only defined on $\mathbb{R} \setminus \{0\}$, but we may extend $\frac{f_1}{f_2}$ to $\mathbb{R}$ by defining $\left( \frac{f_1}{f_2} \right)(0) := 1$. Again here a new problem arises. There is nothing to stop us defining $\left( \frac{f_1}{f_2} \right)(0) = 2$ or $\left( \frac{f_1}{f_2} \right)(0) = q$, $q$ being any real number. In each case we get a function extending $\frac{f_1}{f_2}$ with domain $D\left( \frac{f_1}{f_2} \right) = \mathbb{R} \setminus \{0\}$ to a function with domain $\mathbb{R}$. Extensions are not unique. Thus we may add conditions to achieve uniqueness. In the above example it is natural to define $\left( \frac{f_1}{f_2} \right)(0)$ by 1. We long for some criteria providing us with some help to find natural extensions.

Consider two polynomial functions $p : \mathbb{R} \to \mathbb{R}$, $x \mapsto \sum_{j=0}^{k} a_j x^j$, and $q : \mathbb{R} \to \mathbb{R}$, $x \mapsto \sum_{i=0}^{l} b_i x^i$. We can easily define their sum $p + q$, their difference $p - q$ and their product $p \cdot q$. By easily we mean that we can rely on (4.10), (4.11) and (4.12). In Problem 10 we will show that $p + q$, $p - q$ and $p \cdot q$ are also polynomials. We will determine their coefficients, i.e. each of the functions is of the type $x \mapsto \sum_{r=0}^{m} c_r x^r$ where $m$ is determined by $k$ and $l$, whereas the coefficients $c_r$, $0 \le r \le m$, are determined by the numbers $a_j$, $0 \le j \le k$, and $b_i$, $0 \le i \le l$, and of course in each case they are different. However since $q(x)$ might be zero, we have a problem to define the quotient $x \mapsto \frac{p(x)}{q(x)}$.
Thus when discussing functions, the set of their zeroes, or more generally the set of their $a$-points, i.e. the set $\{x \in D \mid f(x) = a\}$ is also of importance.
The set of all functions $h : D_h \to \mathbb{R}$, $h(x) = \frac{p(x)}{q(x)}$ for two polynomials $p$ and $q$, with $D_h := \{x \in \mathbb{R} \mid q(x) \neq 0\}$ is called the set of all rational functions. Since $q(x) = 1$ for all $x \in \mathbb{R}$ is a polynomial, all polynomials are rational functions.
Note that we can add polynomials, but in general we cannot add two rational
functions: they might have different domains. However, if h1 : D(h1 ) −→ R
and h2 : D(h2 ) −→ R are two rational functions then
\[ h_1|_{D(h_1) \cap D(h_2)} + h_2|_{D(h_1) \cap D(h_2)} \]
is always defined. The same type of argument holds for the difference and
the product of two rational functions, and with the obvious extension in each
case for finitely many ones.
Now look at $p(x) = (x-1)^2$ and $q(x) = (x-1)$. Both are polynomials, $\{x \in \mathbb{R} \mid q(x) = 0\} = \{1\}$. Thus we may define their quotient on $\mathbb{R} \setminus \{1\}$ by
\[ \frac{p}{q}(x) = \frac{(x-1)^2}{x-1} = x - 1. \tag{4.15} \]
Obviously we can extend $\frac{p}{q} : \mathbb{R} \setminus \{1\} \to \mathbb{R}$ to $\mathbb{R}$ just by defining $\frac{p}{q}(1) = 0$. Hence the domain $D(\frac{p}{q}) = \mathbb{R} \setminus \{x \in \mathbb{R} \mid q(x) = 0\}$ does not have to be the natural one for the quotient $\frac{p}{q}$. We will return to this problem later. By being careful with domains we can even define the quotient of two rational functions:
\[ \frac{\frac{p_1}{q_1}}{\frac{p_2}{q_2}} = \frac{p_1 q_2}{p_2 q_1}, \tag{4.16} \]
but note that the left hand side requires $q_1(x) \neq 0$, $p_2(x) \neq 0$ and $q_2(x) \neq 0$, whereas for the right hand side we need at most that $p_2(x) q_1(x) \neq 0$.
We want to extend the idea of a function to arbitrary sets $X$ and $Y$, $X \neq \emptyset$ and $Y \neq \emptyset$. We start by transforming our old definition:
A mapping $f : X \to Y$, $x \mapsto f(x)$, is a rule which associates to every $x \in X$ one and only one $y := f(x) \in Y$. Sometimes we also write
\begin{align*}
f : X &\to Y \\
x &\mapsto f(x).
\end{align*}
For example we may take X as the set of all bounded open intervals, i.e.
X := {(a, b) | a < b and a, b ∈ R}, and for Y we may take the non-negative
real numbers, i.e. Y = R+ . Now we may define λ : X → R+ , λ((a, b)) = b−a.
Thus the mapping λ maps every bounded open interval (a, b) ⊂ R onto its
length $b - a \in \mathbb{R}$. Another example is the following: Take $X = \{a \in \mathbb{R} \mid a > 0\}$ and $Y$ to be the set of all closed intervals $[0, a]$, $a > 0$, i.e. $Y := \{[0, a] \mid a > 0\}$. Then $f : X \to Y$, $a \mapsto [0, a]$ is a mapping.
However, as we have already pointed out in the case of functions, the term
“rule” is not well defined. Thus we try something else taking into account
our experience with functions.
Consider the Cartesian product X × Y , i.e.
X × Y := {(x, y) | x ∈ X and y ∈ Y }. (4.17)
A subset R ⊂ X × Y is called a relation of elements in X and Y . For

example with X = Y = Z we may look at:
$R_1 := \{(k, k^2) \mid k \in \mathbb{Z}\} \subset \mathbb{Z} \times \mathbb{Z}$, with graphical representation:
[Figure 4.6: the points of $R_1$ in the plane.]
or $R_2 := \{(k, m) \mid k \in \mathbb{Z}, m \in \mathbb{Z}, |m| = |k|\}$, with graphical representation:
[Figure 4.7: the points of $R_2$ in the plane.]
Given a relation R ⊂ X × Y we sometimes write xRy instead of (x, y) ∈ R.

A relation R ⊂ X × X is called:
reflexive if xRx, i.e. (x, x) ∈ R, for all x ∈ X;
symmetric if xRy implies yRx, i.e. if (x, y) ∈ R implies (y, x) ∈ R, for all x, y ∈ X;
transitive if xRy and yRz implies xRz, i.e. if (x, y) ∈ R and (y, z) ∈ R
implies (x, z) ∈ R.
A reflexive, symmetric and transitive relation is called an equivalence re-
lation and these relations are of central importance in mathematics. Often
we write “∼” to indicate an equivalence relation.
The relation R = {(k, −k) | k ∈ Z} ⊂ Z × Z is symmetric but neither reflexive
nor transitive. The identity relation
R = {(x, x) | x ∈ X} ⊂ X × X
is an equivalence relation.
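For relations on a finite set the three properties can be tested by simply running through all pairs. In the sketch below a relation is stored as a set of pairs, symmetry is tested in the form "xRy implies yRx", and a finite piece of $\mathbb{Z}$ stands in for $\mathbb{Z}$ itself; the function names and this finite stand-in are assumptions for this illustration.

```python
def is_reflexive(rel, X):
    """(x, x) belongs to rel for every x in X."""
    return all((x, x) in rel for x in X)

def is_symmetric(rel):
    """(x, y) in rel implies (y, x) in rel."""
    return all((y, x) in rel for (x, y) in rel)

def is_transitive(rel):
    """(x, y) in rel and (y, z) in rel imply (x, z) in rel."""
    return all((x, z) in rel
               for (x, y) in rel
               for (w, z) in rel
               if y == w)

X = set(range(-3, 4))              # finite stand-in for Z
R = {(k, -k) for k in X}           # the relation {(k, -k)}
identity = {(x, x) for x in X}     # the identity relation

print(is_reflexive(R, X), is_symmetric(R), is_transitive(R))  # False True False
print(is_reflexive(identity, X), is_symmetric(identity), is_transitive(identity))  # True True True
```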
Definition 4.5. A mapping f : X → Y is a relation Rf ⊂ X × Y such

that for every x ∈ X there exists exactly one y ∈ Y such that xRf y. We
write y := f (x).
In other words
Rf = {(x, f (x)) | x ∈ X} ⊂ X × Y
is a generalisation of the graph of a function. Once again X is called the
domain of f , Y is sometimes called the co-domain of f or the target set.
In this sense functions are mappings f : D → R, D ⊂ R. Making this dis-
tinction between mappings and functions may seem artificial, but it might
be helpful in the beginning. As a rough guide, when speaking about func-
tions we mean mappings from some set to the real numbers (or the complex
numbers in later parts).
The range of f or the image of X under f is
R(f ) = {y ∈ Y | there exists x ∈ X such that y = f (x)}.
We may restrict f : X → Y to a subset Z ⊂ X and then it makes sense to

speak of the image of Z under f , i.e. to consider f (Z). Note that f (Z) is
always a subset of Y , not an element of Y . Some results for the image are
obvious, for example Z1 ⊂ Z2 ⊂ X implies f (Z1) ⊂ f (Z2 ) ⊂ R(f ) = f (X).
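Example 4.6. A. Let $f : \mathbb{R} \to \mathbb{R}$, $x \mapsto x^2$. Then we have
\[ f([-2, -1] \cup [1, 2]) = f([-2, -1]) = f([1, 2]) = [1, 4]. \]
B. Let $A \subset \mathbb{R}$ and $\chi_A : \mathbb{R} \to \mathbb{R}$. We find $\chi_A(B) = \{1\}$ for every non-empty $B \subset A$, and $\chi_A(C) = \{0\}$ for every non-empty $C \subset \mathbb{R}$ such that $C \subset A^{\complement} = \mathbb{R} \setminus A$.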
The graph of f is again denoted by Γ(f ) which is of course Rf ⊂ X × Y .
Clearly every function f : D → R, D ⊂ R, is a mapping in this new sense.
Let f : X → Y be a mapping and let B ⊂ Y . The pre-image of B is the
set
\[ f^{-1}(B) := \{x \in X \mid f(x) \in B\}. \tag{4.18} \]
Example 4.7. A. For the parabola $f : \mathbb{R} \to \mathbb{R}$, $x \mapsto x^2$, we find for $y > 0$ that $f^{-1}(\{y\}) = \{-\sqrt{y}, \sqrt{y}\}$, for $y = 0$ we have $f^{-1}(\{0\}) = \{0\}$, and for $y < 0$ we have that $f^{-1}(\{y\}) = \emptyset$.
B. Consider the function $f : \mathbb{R} \to \mathbb{R}$, $x \mapsto x - [x]$, compare with Example 4.4.B. If $B \subset \mathbb{R} \setminus [0, 1)$ then $f^{-1}(B) = \emptyset$, however for every $y \in [0, 1)$ the pre-image $f^{-1}(\{y\})$ consists of infinitely many points. Indeed for $y \in [0, 1)$ we have
\[ f^{-1}(\{y\}) = \{y + k \mid k \in \mathbb{Z}\}. \]
This is typical behaviour of a periodic function.
C. Let $A \subset \mathbb{R}$ be a non-empty set and let $\chi_A$ be the characteristic function of $A$. Then the following hold: $\chi_A^{-1}(\{1\}) = A$; $\chi_A^{-1}(\{0\}) = A^{\complement}$; $\chi_A^{-1}(\{y\}) = \emptyset$ if $y \notin \{0, 1\}$.
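Images and pre-images are sets, and on a finite domain they can be computed by set comprehensions. The sketch below uses the parabola $x \mapsto x^2$ on a finite range of integers standing in for the domain; the function names and this finite domain are assumptions for this illustration.

```python
def image(f, Z):
    """f(Z) = {f(x) : x in Z}."""
    return {f(x) for x in Z}

def preimage(f, X, B):
    """f^{-1}(B) = {x in X : f(x) in B}, where X is the (finite) domain."""
    return {x for x in X if f(x) in B}

def square(x):
    return x * x

X = range(-3, 4)  # finite stand-in for the domain

print(sorted(image(square, [-2, -1, 1, 2])))  # [1, 4]
print(sorted(preimage(square, X, {4})))       # [-2, 2]
print(sorted(preimage(square, X, {0})))       # [0]
print(sorted(preimage(square, X, {-1})))      # []
```

Note that the pre-image of $\{0\}$ is the set $\{0\}$, and the pre-image of a set of negative numbers is empty.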
Before returning to functions f : D → R, D ⊂ R, we want to discuss a
further new idea: the power set. Let X be a set. Its power set P(X) is by
definition the set of all subsets of X i.e.
P(X) := {Y | Y ⊂ X}
where we understand that ∅ ⊂ X for every set and X ⊂ X. If X = {1, 2}

then P(X) = {∅, {1}, {2}, {1, 2}}.
Note: elements of P(X) are sets.
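For a finite set the power set can be listed explicitly. The following sketch builds it from `itertools.combinations` in the Python standard library and represents each subset as a `frozenset`, since elements of a Python set must be hashable; the function name `power_set` is an assumption for this illustration.

```python
from itertools import combinations

def power_set(X):
    """Return the set of all subsets of the finite set X."""
    elements = list(X)
    return {frozenset(c)
            for r in range(len(elements) + 1)
            for c in combinations(elements, r)}

P = power_set({1, 2})
print(P == {frozenset(), frozenset({1}), frozenset({2}), frozenset({1, 2})})  # True
print(len(power_set({1, 2, 3})))  # 8
```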

We may ask the following question: consider P(R2 ), the power set of R2 , i.e.
all subsets of the plane. Can we define for every A ∈ P(R2 ), i.e for every
subset A ⊂ R2 , area in a reasonable way? Thus we are looking for a mapping
μ : P(R2 ) → [0, ∞], A → μ(A), where μ(A) is the area of A. We will see that
this is not possible if we want to maintain basic properties of area. However,

this example indicates that mappings might be defined on families of sets
but we should not be afraid of working with such mappings.
Problems
1. a) Find the product sets A × B and B × A for A = {3, 4, 5, 6} and
B = {1, 2, 3} and sketch the set in the plane.
b) Prove that N × Z ⊂ R × Q.
c) Let X = {1, 2, 3}, Y = {3, 4, 5}, and Z = {6, 7}.
Find (X ∪ Y ) × Z, X × (Y ∪ Z) and (X × Z) ∩ (Y × Z).
2. For the sets A, B, C and D prove that: a) (A∪B)×C = (A×C)∪(B×C)

and b) (A × B) ∩ (C × D) = (A ∩ C) × (B ∩ D).
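3. For the non-empty sets $X$, $Y$ and $X'$, $Y'$ show that $X \times Y \subset X' \times Y'$ if and only if $X \subset X'$ and $Y \subset Y'$.
4. Sketch the sets
\[ \bigcup_{j=1}^{5} (\{j\} \times I_j) \quad \text{and} \quad \bigcup_{j=1}^{5} (I_j \times \{j\}) \]
for $I_j = [j, j+1] \subset \mathbb{R}$, i.e. $I_j = \{x \in \mathbb{R} \mid j \le x \le j+1\}$.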
5. Let p ∈ N be fixed and consider on Z the following relation: mRp n

if m − n is divisible by p. For this we should use the more commonly
used notation
m ≡ n mod(p)
(this reads as: m is congruent to n modulo p). The reader has probably
already seen this relation in an algebra course. Prove that
m ≡ n mod(p) is an equivalence relation on Z.
6. Consider the set Z × N and define a relation on Z × N by

(k, m) ∼ (l, n) if and only if nk = lm. Prove that “∼” is an equivalence
relation on Z × N.
7. Find the power set of: a) the empty set $\emptyset$; b) the set $\{1, 2, 3\}$.
8. Let $X$ be a set with $N$ elements. Prove that the power set $\mathcal{P}(X)$ of $X$ has $2^N$ elements. Use the fact that the number of subsets of $k$ elements of a set with $N$ elements is $\binom{N}{k}$.
9. Consider the following rule: $x \in \mathbb{R}$ is mapped onto the solution of the quadratic equation $y^2 - 2y + x = 0$. Does this rule define a function on $\mathbb{R}$?
10. Let p, q : R → R be two polynomials. Prove that p + q and p · q are

also polynomials.
11. We call a polynomial even if it is of the form
\[ p(x) = \sum_{j=0}^{n} a_{2j} x^{2j}, \quad a_{2n} \neq 0. \]
a) Show that $p$ is a polynomial of degree $2n$ and has the unique representation
\[ p(x) = \sum_{l=0}^{2n} b_l x^l. \]
b) Define the function $f : \mathbb{R} \to \mathbb{R}$ by
\[ x \mapsto f(x) := \sum_{j=0}^{n} a_{2j} |x|^{2j}. \]
Prove that $p = f$ (as functions).
c) Determine the largest set $D \subset \mathbb{R}$ where $g : \mathbb{R} \to \mathbb{R}$, $x \mapsto x^3$, and $h : \mathbb{R} \to \mathbb{R}$, $x \mapsto |x|^3$, coincide, i.e. $g|_D = h|_D$.
12. For each of the following rational expressions $q(x)$ find the largest set $D \subset \mathbb{R}$ such that $q : D \to \mathbb{R}$ is a well defined function. Where appropriate try to extend $q : D \to \mathbb{R}$ to a larger domain in a meaningful way by modifying $q$.
a) $q_1(x) = \frac{x^3 - 5x^2 - 17}{x^2 + 7}$; b) $q_2(x) = \frac{(x-3)^2 (2x+7)^5}{(x-3)(x+4)(2x+7)^8}$; c) $q_3(x) = \frac{x^2 - x - 12}{(x-4)(x+2)}$.
13. (i) Let $f : X \to Y$ be a mapping. For pre-images prove:
a) $f^{-1}(A \cap B) = f^{-1}(A) \cap f^{-1}(B)$, $A, B \subset Y$;
b) $f^{-1}(A \cup B) = f^{-1}(A) \cup f^{-1}(B)$, $A, B \subset Y$.
(ii) Let $f : X \to Y$ be a mapping and $A, B \subset X$. For the image prove:
a) $f(A \cap B) \subset f(A) \cap f(B)$;
b) $f(A \cup B) = f(A) \cup f(B)$;
c) $f(\{x\}) = \{f(x)\}$ for $x \in X$.
14. (i) In each of the following cases find the pre-images:
a) $f : \mathbb{R} \to \mathbb{R}$, $x \mapsto x^2 + 1$, find $f^{-1}(\{y\})$ for $y \in \mathbb{R}$;
b) $g : \mathbb{R} \setminus \{0\} \to \mathbb{R}$, $x \mapsto \frac{1}{x}$, find $g^{-1}(\{z\})$ for $z \in \mathbb{R}$;
c) $h : \mathbb{R} \to \mathbb{R}$, $x \mapsto \frac{1}{2}x + 3$, find $h^{-1}((a, b))$ for $(a, b) \subset \mathbb{R}$, $a < b$.
(ii) In each of the following cases find the image of the indicated set:
a) $f : [0, \infty) \to \mathbb{R}$, $x \mapsto \sqrt{x}$, find $f([\frac{1}{4}, 9])$;
b) $g : \mathbb{R} \to \mathbb{R}$, $x \mapsto \frac{x^2 - 1}{x^2 + 2}$, find $g(\{1, 2, 3, 4\})$;
c) $h : \mathbb{R} \to \mathbb{R}$, $x \mapsto 2^x$, find $h(\mathbb{N})$.
5 Functions and Mappings Continued

We continue our considerations on mappings f : X → Y between two sets
X and Y . We may consider functions f : D → F , with D, F ⊂ R instead
of functions f : D → R. While we can always restrict a given function
f : D → R or f : D → F to a subset D1 ⊂ D, we cannot in general easily
restrict the target set or co-domain F . For example f : R −→ R, x −→ x+2,
is a well defined function. However, when shrinking the target set to [0, 2],
then f : R −→ [0, 2], x −→ x+2, does not define a function since for example
for x = 5 ∈ R the “value” f (5) = 7 does not belong to the co-domain [0, 2].
However, if we restrict f to the set [−2, 0] then f |[−2,0] : [−2, 0] −→ [0, 2] is
of course once again a function. Nonetheless, for reasons which will become
clear later, it makes sense to consider functions with co-domains different to
R.
Definition 5.1. Let $f : D \to F$, $D, F \subset \mathbb{R}$, be a function. A. We call $f$ injective or one-to-one if for $x, y \in D$, $x \neq y$, it follows that $f(x) \neq f(y)$.
B. We call $f$ surjective or onto if for every $y \in F$ there exists $x \in D$ such that $f(x) = y$. C. If $f$ is injective and surjective we call $f$ bijective.
Remark 5.2. A. Obviously we can extend the definition of injectivity, surjectivity and bijectivity to general mappings. A mapping $f : X \to Y$ is injective if $x_1 \neq x_2$, $x_1, x_2 \in X$, implies $f(x_1) \neq f(x_2)$. The mapping $f$ is surjective if for every $y \in Y$ there exists $x \in X$ such that $f(x) = y$. If $f$ is injective and surjective then it is bijective.
B. A mapping f : X → Y (or a function f : D → F , D, F ⊂ R) is surjective
if and only if R(f ) = Y (or R(f ) = F ). There is an easy way to make ev-
ery mapping surjective: shrink the co-domain to the range. This is formally
correct but of course in general we do not know R(f ) explicitly.
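For mappings between finite sets the three notions can be checked by brute force. The following is a minimal sketch, not part of the original text; the helper names `is_injective`, `is_surjective` and `is_bijective` are illustrative choices, and the domain is assumed to be a list of distinct elements.

```python
def is_injective(f, domain):
    """Distinct points of the domain must have distinct images."""
    images = [f(x) for x in domain]
    return len(set(images)) == len(images)

def is_surjective(f, domain, codomain):
    """Every element of the co-domain must be hit, i.e. R(f) equals the co-domain."""
    return {f(x) for x in domain} == set(codomain)

def is_bijective(f, domain, codomain):
    return is_injective(f, domain) and is_surjective(f, domain, codomain)

square = lambda x: x * x
D = [-2, -1, 0, 1, 2]

print(is_injective(square, D))                     # False: (-1)^2 == 1^2
print(is_surjective(square, D, [0, 1, 4]))         # True: co-domain shrunk to the range
print(is_bijective(square, [0, 1, 2], [0, 1, 4]))  # True: domain reduced as well
```

Shrinking the co-domain to the range makes the squaring map surjective, and reducing the domain makes it injective as well.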
Example 5.3. A. Consider the function $f_1 : \mathbb{R} \to \mathbb{R}$, $x \mapsto x^2$. Since for $x \neq -x$, i.e. $x \neq 0$, it follows that $x^2 = (-x)^2$, the function $f_1$ is not injective. Moreover, since $x^2 \geq 0$ for all $x \in \mathbb{R}$, no negative number belongs to the range of $f_1$, hence $f_1$ is not surjective. However, the function $\tilde{f}_1 : \mathbb{R} \to \mathbb{R}_+$, $x \mapsto x^2$, is surjective: given $y \geq 0$ there exists a unique $x_y \geq 0$ such that $x_y^2 = y$, namely $x_y = \sqrt{y}$. But $\tilde{f}_1$ is still not injective since $\tilde{f}_1(x_y) = \tilde{f}_1(-x_y)$. Now, we may also reduce the domain of $\tilde{f}_1$ and consider $f_1^* : \mathbb{R}_+ \to \mathbb{R}_+$, $x \mapsto x^2$. We know that for $y \geq 0$ there exists a unique $x_y = \sqrt{y}$ such that $x_y^2 = y$ and $x_y \geq 0$, hence $f_1^*$ is injective and surjective, i.e. bijective. This example shows the importance of the domain and the range of a function when deciding about its injectivity and its surjectivity, respectively.
B. For $a \in \mathbb{R}$ consider $f_a : \mathbb{R} \to \mathbb{R}$, $x \mapsto x + a$. We claim that $f_a$ is always bijective. First, for $x \neq y$ it follows that $f_a(x) = x + a \neq y + a = f_a(y)$, i.e. $f_a$ is injective, and secondly, given $z \in \mathbb{R}$ the equation $f_a(x) = z$, i.e. $x + a = z$, has the (unique) solution $x = z - a$ and it follows that $f_a(z - a) = (z - a) + a = z$, i.e. $f_a$ is surjective. The last calculation shows what “determine whether $f : D \to F$ is surjective” or “find the range $R(f)$” really means: we have to solve the equation $f(x) = y$ for all $y \in F$ such that the solution belongs to $D$.
C. For $a \neq 0$ the function $g_a : \mathbb{R} \to \mathbb{R}$, $x \mapsto ax$, is bijective. First note that for $x, z \in \mathbb{R}$, $x \neq z$, it follows that $ax \neq az$, i.e. $g_a$ is injective. To show that $g_a$ is surjective we have to solve for all $y \in \mathbb{R}$ the equation $g_a(x) = y$, i.e. $ax = y$. Clearly the solution is $x = \frac{y}{a}$ provided $a \neq 0$. Thus $g_a$, $a \neq 0$, is bijective.
Note that in the case where $a = 0$ the function $g_0$ is the constant function $g_0 : \mathbb{R} \to \mathbb{R}$, $x \mapsto 0$. This function is neither injective nor surjective. In fact for every $c \in \mathbb{R}$ the constant function $h_c : \mathbb{R} \to \mathbb{R}$, $x \mapsto c$, i.e. $h_c(x) = c$ for all $x \in \mathbb{R}$, is neither injective nor surjective.
D. Let $A \subset \mathbb{R}$ be a set and consider $\chi_A : \mathbb{R} \to [0, 1]$, the characteristic function of the set $A$. For all $x \in A$ this function is equal to 1, and for all $x \notin A$ it has the value 0. Thus it is neither injective nor surjective: it is not injective since either $A$ or its complement $A^{\complement}$ has at least two elements and they are mapped by $\chi_A$ onto the same value. In addition for $\frac{1}{2} \in [0, 1]$ there is no $x \in \mathbb{R}$ such that $\chi_A(x) = \frac{1}{2}$, therefore it is not surjective.
E. The absolute value $|\cdot| : \mathbb{R} \to \mathbb{R}_+$ is surjective but not injective. Indeed, for $x \neq -x$, i.e. $x \neq 0$, we know that $|x| = |-x|$, i.e. $|\cdot|$ is not injective. On the other hand, for $y \geq 0$ we may take $x = y$ to find $|x| = y$, showing surjectivity.
Next we meet some examples considering general mappings.
Example 5.4. A. Let $X = \mathbb{N}$ and $Y = \mathbb{R}$. We consider mappings $f : \mathbb{N} \to \mathbb{R}$, $n \mapsto f(n)$. Such mappings are called sequences of real numbers and it is convenient to write $(f(n))_{n \in \mathbb{N}}$ for such a sequence. Later on we will just start with a sequence $(a_n)_{n \in \mathbb{N}}$, $a_n \in \mathbb{R}$, often suppressing that we are working with a mapping, i.e. that $a_n = f(n)$ for some $f : \mathbb{N} \to \mathbb{R}$. Now the question arises whether a mapping $f : \mathbb{N} \to \mathbb{R}$ can be surjective. The answer is no; a proof will be given later, see Theorem 18.35.
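B. Let $X = \mathbb{R}^2 = \mathbb{R} \times \mathbb{R}$ and $Y = \mathbb{R}$. We define the two coordinate projections $\mathrm{pr}_1 : \mathbb{R}^2 \to \mathbb{R}$, $\mathrm{pr}_1(x) = x_1$, and $\mathrm{pr}_2 : \mathbb{R}^2 \to \mathbb{R}$, $\mathrm{pr}_2(x) = x_2$, where $x = (x_1, x_2) \in \mathbb{R}^2$. Both projections are surjective.
We give a proof for $\mathrm{pr}_1$: given $x_1 \in \mathbb{R}$ we need to find $y = (y_1, y_2) \in \mathbb{R}^2$ such that $\mathrm{pr}_1(y) = x_1$. Any pair $y = (x_1, y_2)$, $y_2 \in \mathbb{R}$, will do. However, $\mathrm{pr}_1$ and $\mathrm{pr}_2$ are not injective. Again, we only deal with $\mathrm{pr}_1$ and consider $x = (x_1, x_2)$ and $y = (x_1, y_2)$ with $x_2 \neq y_2$. Then $x \neq y$ but $\mathrm{pr}_1(x) = x_1 = \mathrm{pr}_1(y)$.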
Consider now two functions f1 : D1 −→ F1 and f2 : D2 −→ F2 . Suppose in
addition that R(f1 ) = D2 . Given x ∈ D1 then f1 (x) ∈ R(f1 ) = D2 . Hence
we may apply f2 to f1 (x), i.e. we may form f2 (f1 (x)).
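[Figure 5.1: the composition $f_2 \circ f_1$; a point $x \in D_1$ is mapped by $f_1$ to $f_1(x) \in R(f_1) = D_2 \subset F_1$ and then by $f_2$ to $f_2(f_1(x)) \in F_2$.]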
Thus we have defined a new function from D1 to F2 :

Definition 5.5. Let f1 : D1 −→ F1 and f2 : D2 −→ F2 be two functions such
that R(f1 ) = D2 . The function g : D1 −→ F2 defined by g(x) = f2 (f1 (x)) is
called the composition of f1 with f2 and is denoted by f2 ◦ f1 .
Remark 5.6. Once again we can extend our considerations to general map-
pings f : X → Y and g : Y → Z. If R(f ) = Y then we may define the
composition h := g ◦ f : X → Z by h(x) = g(f (x)).
Remark 5.7. A. Note that in the case where $f_2 \circ f_1$ is well defined, $f_1 \circ f_2$ need not be defined. For example take $f_1 : \mathbb{R}_+ \to \mathbb{R}$, $x \mapsto \sqrt{x}$, and $f_2 : \mathbb{R}_+ \to \mathbb{R}$, $x \mapsto -x$. Then $(f_2 \circ f_1)(x) = -\sqrt{x}$. But since $f_2(x) \leq 0$ for all $x \in \mathbb{R}_+$ we cannot apply $f_1$ to $f_2(x)$ for $x > 0$.
B. Suppose that $f_1 : \mathbb{R}_+ \to \mathbb{R}_+$ and $f_2 : \mathbb{R}_+ \to \mathbb{R}_+$ are both surjective. Then $f_2 \circ f_1$ and $f_1 \circ f_2$ are both defined. However they do not necessarily coincide. For example, take $f_1(x) = x^2$ and $f_2(x) = 2x$. Then $f_2(f_1(x)) = 2x^2$ whereas $f_1(f_2(x)) = (2x)^2 = 4x^2$. Thus, even when both $f_2 \circ f_1$ and $f_1 \circ f_2$ are defined, they are in general different functions.
C. We may extend our definition to the situation where $R(f_1) \subset D_2$. Then we can still define $f_2|_{R(f_1)} \circ f_1$. For example consider the two functions $f_1 : \mathbb{R} \to \mathbb{R}$, $x \mapsto x^2$, and $f_2 : \mathbb{R} \to \mathbb{R}$, $f_2$ being an arbitrary function. Since $R(f_1) = \mathbb{R}_+$ we have $R(f_1) \subset D(f_2)$. Thus we can form $(f_2|_{\mathbb{R}_+} \circ f_1)(x) = f_2(x^2)$. Soon we will also write $f_2 \circ f_1$ instead of $f_2|_{\mathbb{R}_+} \circ f_1$.
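The non-commutativity in part B can be checked numerically. This is a small illustrative sketch; the helper `compose` is a hypothetical name, not notation from the text.

```python
def compose(f2, f1):
    """Return the composition f2 after f1, i.e. x -> f2(f1(x))."""
    return lambda x: f2(f1(x))

f1 = lambda x: x ** 2   # f1(x) = x^2
f2 = lambda x: 2 * x    # f2(x) = 2x

f2_after_f1 = compose(f2, f1)   # x -> 2x^2
f1_after_f2 = compose(f1, f2)   # x -> (2x)^2 = 4x^2

print(f2_after_f1(3), f1_after_f2(3))  # 18 36
```

Both compositions are defined on all of $\mathbb{R}_+$, yet they already differ at $x = 3$.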
Lemma 5.8. Let $f_1 : D_1 \to F_1$ and $f_2 : D_2 \to F_2$ be two injective functions. Suppose that $R(f_1) = D_2$. Then the function $f_2 \circ f_1 : D_1 \to F_2$ is injective too, i.e. the composition of two injective functions is also injective.
Proof. Let $x, y \in D_1$, $x \neq y$. Since $f_1$ is injective it follows that $f_1(x) \neq f_1(y)$. Now, the injectivity of $f_2$ implies further that $f_2(f_1(x)) \neq f_2(f_1(y))$.
Lemma 5.9. Let f1 : D1 −→ F1 and f2 : D2 −→ F2 be two surjective

functions. Suppose that R(f1 ) = D2 . Then the composed function f2 ◦ f1 :
D1 −→ F2 is surjective.
Proof. Let z ∈ F2 . Since f2 is surjective there exists y ∈ D2 such that

f2 (y) = z. Now, D2 = R(f1 ) and f1 is surjective. Hence there exists x ∈ D1
such that f1 (x) = y ∈ D2 = R(f1 ). Thus we have f2 (f1 (x)) = z implying
that f2 ◦ f1 is surjective.
Corollary 5.10. Let $f_1 : D_1 \to F_1$ and $f_2 : D_2 \to F_2$ be two bijective functions such that $R(f_1) = D_2$. Then $f_2 \circ f_1 : D_1 \to F_2$ is bijective too.
Proof. By Lemma 5.8 and Lemma 5.9 we know that in this case $f_2 \circ f_1$ is injective and surjective.
Exercise 5.11. Prove that the composition of two injective mappings is in-
jective and that of two surjective mappings is surjective. Deduce that the
composition of two bijective mappings is bijective.
Consider now three functions f1 : D1 −→ F1 , f2 : D2 −→ F2 , f3 : D3 −→ F3 .

Suppose that R(f1 ) = F1 = D2 and that R(f2 ) = F2 = D3 . Then we may
consider the two compositions
$$f_3 \circ (f_2 \circ f_1) : D_1 \to F_3 \tag{5.1}$$
$$(f_3 \circ f_2) \circ f_1 : D_1 \to F_3. \tag{5.2}$$
From (5.1) we find for all $x \in D_1$
$$(f_3 \circ (f_2 \circ f_1))(x) = f_3((f_2 \circ f_1)(x)) = f_3(f_2(f_1(x)))$$
and (5.2) yields for all $x \in D_1$
$$((f_3 \circ f_2) \circ f_1)(x) = (f_3 \circ f_2)(f_1(x)) = f_3(f_2(f_1(x))).$$
Thus we have proved
Lemma 5.12. The composition of functions (mappings) is associative, i.e.

for f1 : D1 −→ F1 , f2 : D2 −→ F2 , f3 : D3 −→ F3 with R(f1 ) = F1 = D2
and R(f2 ) = F2 = D3 we have
f3 ◦ (f2 ◦ f1 ) = (f3 ◦ f2 ) ◦ f1 . (5.3)
By Lemma 5.12 we may just write f3 ◦ f2 ◦ f1 for both expressions (5.1) and
(5.2). This clearly extends to finitely many functions.
Example 5.13. Let $f_1 : \mathbb{R} \to \mathbb{R}_+$, $x \mapsto 1 + x^2$, and $f_2 : \{x \in \mathbb{R} \mid x \geq 1\} \to \mathbb{R}_+$, $x \mapsto \sqrt{x}$. Then $R(f_1) = \{x \mid x \geq 1\} = D(f_2)$ and we find $(f_2 \circ f_1)(x) = \sqrt{1 + x^2}$. Clearly we may consider $\tilde{f}_2 : \mathbb{R}_+ \to \mathbb{R}_+$, $x \mapsto \sqrt{x}$, and then we may form $\tilde{f}_2|_{\{x \mid x \geq 1\}} \circ f_1 = f_2 \circ f_1$. Everyone will agree that the latter approach is simpler and no confusion will arise when we just write $\tilde{f}_2 \circ f_1$, which is however an abuse of notation.
Let f : D −→ F be a bijective function. Given y ∈ F we can find a unique

x ∈ D such that f (x) = y. This defines a new function mapping y to x.
Definition 5.14. Let $f : D \to F$ be a bijective function. The function $f^{-1} : F \to D$, $y \mapsto f^{-1}(y)$, where $f^{-1}(y) = x$ if $f(x) = y$, is called the inverse function, or just the inverse, of $f$.
Remark 5.15. Once again, this definition extends to arbitrary mappings

in the obvious way: let f : X → Y be bijective. Define f −1 : Y → X by
f −1 (y) = x if f (x) = y.
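Example 5.16. A. Consider the function $f_a : \mathbb{R} \to \mathbb{R}$, $x \mapsto x + a$. The inverse function $f_a^{-1}$ is determined by finding for $y \in \mathbb{R}$ the value $x \in \mathbb{R}$ such that $y = f_a(x) = x + a$, which gives $x = y - a$. Thus $f_a^{-1} : \mathbb{R} \to \mathbb{R}$, $y \mapsto y - a$, or $f_a^{-1} = f_{-a}$.
B. For $a \neq 0$ consider the function $g_a : \mathbb{R} \to \mathbb{R}$, $x \mapsto ax$. The inverse function is determined by solving $y = ax$, i.e. $x = \frac{y}{a}$. Hence $g_a^{-1} : \mathbb{R} \to \mathbb{R}$, $y \mapsto \frac{y}{a}$, i.e. $g_a^{-1} = g_{a^{-1}}$.
C. Consider $\sqrt{\cdot} : \mathbb{R}_+ \to \mathbb{R}_+$. We want to determine its inverse function. Now we have to solve the equation $y = \sqrt{x}$, i.e. $x = y^2$. Thus the inverse is given by $f : \mathbb{R}_+ \to \mathbb{R}_+$, $y \mapsto y^2$. Note that $\sqrt{\cdot} : \mathbb{R}_+ \to \mathbb{R}_+$ is not the inverse to $\tilde{f} : \mathbb{R} \to \mathbb{R}$, $x \mapsto x^2$. This function is not bijective. However it is easy to check that $\sqrt{\cdot} : \mathbb{R}_+ \to \mathbb{R}_+$ is the inverse of the restriction $\tilde{f}|_{\mathbb{R}_+} : \mathbb{R}_+ \to \mathbb{R}_+$, $x \mapsto x^2$.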
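The three inverses of Example 5.16 can be sanity-checked on sample points. This is only a sketch under illustrative assumptions: the value $a = 3/2$ and the sample points are arbitrary choices, and a finite check of course proves nothing about all of $\mathbb{R}$.

```python
from fractions import Fraction
import math

a = Fraction(3, 2)                      # any a != 0 would do
f_a = lambda x: x + a
f_a_inv = lambda y: y - a               # claimed inverse of f_a
g_a = lambda x: a * x
g_a_inv = lambda y: y / a               # claimed inverse of g_a

samples = [Fraction(n, 4) for n in range(-8, 9)]
shift_ok = all(f_a_inv(f_a(x)) == x and f_a(f_a_inv(x)) == x for x in samples)
scale_ok = all(g_a_inv(g_a(x)) == x and g_a(g_a_inv(x)) == x for x in samples)

# Square root and squaring undo each other on R_+ (up to floating point rounding).
nonneg = [0.0, 0.25, 1.0, 2.0, 9.0]
root_ok = all(
    math.isclose(math.sqrt(x) ** 2, x) and math.isclose(math.sqrt(x ** 2), x)
    for x in nonneg
)

# On all of R squaring is not injective, so it has no inverse there.
squaring_injective_on_R = (-2.0) ** 2 != 2.0 ** 2

print(shift_ok, scale_ok, root_ok, squaring_injective_on_R)  # True True True False
```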
Let f : D −→ F be bijective with inverse f −1 : F −→ D. We may consider
the two compositions
f −1 ◦ f : D −→ D (5.4)
f ◦ f −1 : F −→ F. (5.5)
In the first case we find (f −1 ◦ f )(x) = f −1 (f (x)) and since f −1 (y) = x when
f (x) = y it follows that (f −1 ◦ f )(x) = x for all x ∈ D. On the other hand,
for y ∈ F we have f (f −1(y)) = f (x) for f (x) = y, hence (f ◦ f −1 )(y) = y.
Definition 5.17. Let D be a set. The identity (or identity mapping) on D
is the function idD : D −→ D, x −→ x.
Obviously we have for f : D −→ F
f ◦ idD = f and idF ◦ f = f. (5.6)
Therefore, just before giving Definition 5.17 we proved:
f −1 ◦ f = idD (5.7)
and
f ◦ f −1 = idF . (5.8)
Corollary 5.18. If f : D −→ F is bijective then f −1 : F −→ D is also

bijective and (f −1 )−1 = f . Moreover f −1 is uniquely determined.
Proof. Firstly we claim that $f^{-1}$ is injective. For $y_1 \neq y_2$, $y_1, y_2 \in F$, suppose that $f^{-1}(y_1) = f^{-1}(y_2)$. Then by (5.8) we find
y1 = f (f −1 (y1)) = f (f −1(y2 )) = y2 (5.9)
which is a contradiction, hence f −1 is injective. Next we claim that f −1 is
surjective. Given x ∈ D, with y = f (x) we find by (5.7) that
f −1 (y) = f −1 (f (x)) = x, (5.10)
i.e. f −1 is surjective. A bijective function g : D −→ F is the inverse of
the bijective function f −1 : F −→ D if g(x) = y for f −1 (y) = x, but f has
exactly this property, i.e. (f −1 )−1 = f.
Finally we prove that f −1 is uniquely determined. Let g, h : F −→ D be
two bijective functions such that g ◦ f = h ◦ f = idD . We have to prove that
g(y) = h(y) for all y ∈ F . Let y ∈ F . Since f is bijective there exists a
unique x ∈ D such that f (x) = y. Now it follows that
g(y) = g(f (x)) = x = h(f (x)) = h(y)
implying that g = h.
The reader may have noted that Definition 5.17 and its Corollary are now
given for D being an arbitrary set, i.e. f being a mapping and not necessarily
a function.
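Lemma 5.19. Let $f_1 : D_1 \to F_1$ and $f_2 : D_2 \to F_2$ be bijective mappings such that $R(f_1) = F_1 = D_2$. Then the composition $f_2 \circ f_1 : D_1 \to F_2$ has the inverse function
$$(f_2 \circ f_1)^{-1} = f_1^{-1} \circ f_2^{-1} : F_2 \to D_1. \tag{5.11}$$
Proof. We know that $f_2 \circ f_1$ is bijective, hence $(f_2 \circ f_1)^{-1}$ exists and is bijective. Since we also know that $(f_2 \circ f_1)^{-1}$ is uniquely determined we may find an expression for $(f_2 \circ f_1)^{-1}$ from the two following calculations:
$$\begin{aligned}
(f_2 \circ f_1) \circ (f_1^{-1} \circ f_2^{-1}) &= f_2 \circ (f_1 \circ f_1^{-1}) \circ f_2^{-1} \\
&= f_2 \circ \mathrm{id}_{F_1} \circ f_2^{-1} \\
&= f_2 \circ \mathrm{id}_{D_2} \circ f_2^{-1} \\
&= f_2 \circ f_2^{-1} = \mathrm{id}_{F_2}
\end{aligned}$$
and
$$\begin{aligned}
(f_1^{-1} \circ f_2^{-1}) \circ (f_2 \circ f_1) &= f_1^{-1} \circ (f_2^{-1} \circ f_2) \circ f_1 \\
&= f_1^{-1} \circ \mathrm{id}_{D_2} \circ f_1 = f_1^{-1} \circ \mathrm{id}_{F_1} \circ f_1 \\
&= f_1^{-1} \circ f_1 = \mathrm{id}_{D_1}
\end{aligned}$$
proving the lemma.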
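For finite sets the identity (5.11) can be verified directly. The sketch below stores mappings as Python dictionaries; the sets and the helper names `compose` and `inverse` are illustrative assumptions, not taken from the text.

```python
def compose(g, f):
    """g after f for finite mappings stored as dicts (keys of g must contain the values of f)."""
    return {x: g[f[x]] for x in f}

def inverse(f):
    """Inverse of a bijective finite mapping; fails if f is not injective."""
    inv = {y: x for x, y in f.items()}
    assert len(inv) == len(f), "f is not injective"
    return inv

f1 = {1: "a", 2: "b", 3: "c"}        # D1 = {1, 2, 3}, F1 = {a, b, c}
f2 = {"a": 20, "b": 30, "c": 10}     # D2 = F1,        F2 = {10, 20, 30}

lhs = inverse(compose(f2, f1))              # (f2 after f1)^{-1}
rhs = compose(inverse(f1), inverse(f2))     # f1^{-1} after f2^{-1}
print(lhs == rhs)  # True
```

Note the reversed order on the right-hand side: $f_2^{-1}$ is applied first, then $f_1^{-1}$.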
There are easy ways to understand the concept of injectivity, surjectivity,

bijectivity, and inverse functions by looking at the graph of a function.
If f : D −→ F is surjective then for every value in the target set F considered
as a subset of the y-axis there must correspond at least one value in the
domain D considered as a subset of the x-axis:
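[Figure 5.2: graph of a surjective function; the domain $D$ lies on the x-axis, the target set $F$ on the y-axis, and a value $y \in F$ is attained at several points $x_1, x_2, x_3, x_4 \in D$, i.e. $f(x_j) = y$.]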
If f : D −→ F is injective then for every value on the y-axis belonging also

to F there corresponds at most one value in D considered as a subset of the
x-axis:
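[Figure 5.3: graph of an injective function $f$; the domain $D$ lies on the x-axis, the target set $F$ on the y-axis, with marked points $x_1$, $y_0$ and $y_1$.]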
If f is bijective then for every value y ∈ F , F considered as subset of the

y-axis, there corresponds one and only one point x ∈ D, D considered as a
subset of the x-axis:
[Figure 5.4: graph of a bijective function $f$; to each $y \in F$ on the y-axis corresponds exactly one $x \in D$ on the x-axis.]
Before looking further at bijective functions we consider a useful geometric

interpretation. Let (a, b) ∈ R2 . The point (b, a) ∈ R2 is obtained by reflecting
(a, b) in the line y = x:
[Figure 5.5: reflection of the point $(a, b)$ in the line $y = x$.]
Let f : D −→ F be bijective and let Γ(f ) be its graph. Since f is bijective we

may also consider the graph of f −1 which is Γ(f −1 ) = {(y, f −1(y))|y ∈ F } ⊂
F × D. Now if we reflect the whole coordinate system in the line y = x, i.e.
in the principal diagonal, we can recover the graph Γ(f −1 ) from Γ(f ).
[Figure 5.6: the graphs $\Gamma(f)$ and $\Gamma(f^{-1})$ as mirror images of each other in the line $y = x$.]
For example for $\sqrt{\cdot} : \mathbb{R}_+ \to \mathbb{R}_+$ we find:
[Figure 5.7: the graphs of $f(x) = \sqrt{x}$ and $f^{-1}(x) = x^2$ together with the line $y = x$.]
We end this chapter by looking at some algebraic operations. Let h : D1 −→

F1 and f, g : D2 −→ F2 be functions such that R(h) = D2 . Then we define
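$$(f \pm g) \circ h := f \circ h \pm g \circ h, \tag{5.12}$$
$$(f \cdot g) \circ h := (f \circ h) \cdot (g \circ h), \tag{5.13}$$
and if $g(y) \neq 0$ for all $y \in D_2$
$$\left(\frac{f}{g}\right) \circ h := \frac{f \circ h}{g \circ h}. \tag{5.14}$$
For example we may consider $h : \mathbb{R} \to \mathbb{R}_+$, $x \mapsto |x|$, $f : \mathbb{R}_+ \to \mathbb{R}$, $x \mapsto \sqrt{x}$, and $g : \mathbb{R}_+ \to \mathbb{R}$, $x \mapsto \sqrt{1 + x}$, where we get
$$((f \pm g) \circ h)(x) = \sqrt{|x|} \pm \sqrt{1 + |x|},$$
$$((f \cdot g) \circ h)(x) = \sqrt{|x|}\,\sqrt{1 + |x|},$$
$$\left(\left(\frac{f}{g}\right) \circ h\right)(x) = \frac{\sqrt{|x|}}{\sqrt{1 + |x|}}.$$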
Now let us again consider some more abstract mathematics.

Let us return to general sets and equivalence relations. Let $X \neq \emptyset$ be a set. We define on its power set $\mathcal{P}(X)$ a relation $R$ by $(A, B) \in R \subset \mathcal{P}(X) \times \mathcal{P}(X)$ if and only if there exists $f : A \to B$ which is bijective. (In our other notation we would write $A \sim B$ if there exists a bijective mapping $f : A \to B$.) We claim that $R$ is an equivalence relation.
$R$ is reflexive: $ARA$: take $f = \mathrm{id}_A : A \to A$, $x \mapsto \mathrm{id}_A(x) = x$.
$R$ is symmetric: $ARB$ implies $BRA$: if $ARB$ then there exists a bijective $f : A \to B$, but $f^{-1} : B \to A$ is bijective too.
$R$ is transitive: $ARB$ and $BRC$ means that there exist $f : A \to B$ and $g : B \to C$, both bijective; then $g \circ f : A \to C$ is bijective too, i.e. $ARC$.
This is one of the most important equivalence relations which had an enor-
mous influence on the historical development of set theory. We will return
to it later.
To proceed further we need the following considerations. Let $X \neq \emptyset$ be a set and “$\sim$” an equivalence relation on $X$. Let $a \in X$. We denote by $[a]$ the set of all $x \in X$ with $x \sim a$, i.e.
$$[a] := \{x \in X \mid x \sim a\} \tag{5.15}$$
and we call $[a]$ the equivalence class of $a$ or generated by $a$. A partition of a set $X$ is a set of subsets of $X$ such that every element of $X$ belongs to exactly one of these subsets, for example $\{1, 2\}, \{3, 4\}, \{5\}$ would be a partition of $\{1, 2, 3, 4, 5\}$, however neither $\{1\}, \{3, 4\}$ nor $\{1, 2\}, \{2, 3, 4\}, \{5\}$ would be. Formally, we call a family of sets $\mathcal{Z} \subset \mathcal{P}(X)$ a partition of $X$ if
1. every $x \in X$ belongs to some $Z \in \mathcal{Z}$, i.e. for $x \in X$ there exists $Z \in \mathcal{Z}$ such that $x \in Z$, or
$$X = \bigcup_{Z \in \mathcal{Z}} Z, \quad \text{with} \quad \bigcup_{Z \in \mathcal{Z}} Z := \{x \in X \mid x \in Z \text{ for some } Z \in \mathcal{Z}\};$$
2. for $Z_1, Z_2 \in \mathcal{Z}$ we have that either $Z_1 \cap Z_2 = \emptyset$ or $Z_1 = Z_2$, i.e. we have that $x \in Z_1 \cap Z_2$ implies $Z_1 = Z_2$.
Figure 5.8 illustrates a typical partition of a set X:
[Figure 5.8: a set $X$ divided into four pairwise disjoint subsets $Z_1, Z_2, Z_3, Z_4$.]
Proposition 5.20. Let $X \neq \emptyset$ and “$\sim$” be an equivalence relation on $X$. Then $\{[a] \mid a \in X\}$ is a partition of $X$.
Proof. Clearly $a \in [a]$ since $a \sim a$. Hence $X = \bigcup_{a \in X} [a]$. Further, if $c \in [a] \cap [b]$ then $c \sim a$ and $c \sim b$, therefore $a \sim b$, implying $[a] \subset [b]$ as well as $[b] \subset [a]$, i.e. $[a] = [b]$. Thus, if $[a] \cap [b] \neq \emptyset$ then $[a] = [b]$. Thus we have proved that $\{[a] \mid a \in X\}$ is a partition of $X$.
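On a finite set the equivalence classes can be computed explicitly, which makes the partition property visible. The following is a minimal sketch; the set $\{0, \dots, 9\}$ and the relation "same remainder modulo 3" are illustrative choices, and the helper assumes that `related` really is an equivalence relation.

```python
def equivalence_classes(X, related):
    """Group the elements of X into classes [a] = {x in X | x ~ a}.

    `related(x, a)` is assumed to be reflexive, symmetric and transitive,
    so comparing x with a single representative of each class suffices.
    """
    classes = []
    for x in X:
        for cls in classes:
            if related(x, next(iter(cls))):
                cls.add(x)
                break
        else:
            classes.append({x})      # x starts a new class
    return classes

X = range(10)
same_remainder = lambda x, y: (x - y) % 3 == 0
classes = equivalence_classes(X, same_remainder)

print([sorted(c) for c in classes])  # [[0, 3, 6, 9], [1, 4, 7], [2, 5, 8]]

# Partition check: the classes cover X and are pairwise disjoint.
covers = set().union(*classes) == set(X)
disjoint = sum(len(c) for c in classes) == len(set(X))
print(covers, disjoint)  # True True
```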
It is of interest that given a partition $\mathcal{Z}$ of $X$ there exists an equivalence relation “$\sim$” on $X$ such that the elements of $\mathcal{Z}$ are exactly the equivalence classes corresponding to “$\sim$”. Indeed, we just have to define $x \sim y$ if and only if $x, y \in Z$ for some $Z \in \mathcal{Z}$. Obviously $x \sim x$ since $x \in Z$ for some $Z \in \mathcal{Z}$, and $x \sim y$ if and only if $y \sim x$ since equivalence means to belong to the same set $Z$. Finally, if $x \sim y$ and $y \sim z$ then $x, y \in Z$ and $y, z \in Z'$ for some $Z, Z' \in \mathcal{Z}$, however $y \in Z \cap Z'$ implying $Z = Z'$, i.e. $x \sim z$.
Definition 5.21. Let $X \neq \emptyset$ and “$\sim$” be an equivalence relation on $X$ with equivalence classes $[a]$, $a \in X$, inducing the partition $\mathcal{Z}$ of $X$. We call a subset $R \subset X$ a complete set of representatives with respect to “$\sim$” if
1. $r_1, r_2 \in R$ and $r_1 \sim r_2$ implies $r_1 = r_2$
and
2. $X = \bigcup_{r \in R} [r] = \bigcup_{r \in R} \{x \in X \mid x \sim r\}$.
Now we return to the equivalence relation we considered at the beginning of

the chapter.
Dealing with the set of all sets can be quite troublesome and it may lead to
some serious problems. Let us fix a set X ≠ ∅ and suppose X is “large”. On
P(X), the power set of X, i.e.
P(X) = {A | A ⊂ X} (5.16)
we introduce the equivalence relation
A ∼ B if there exists a bijection fAB : A → B. (5.17)
This is our old example; it induces a partition of P(X). When X = R and
A ∼ B we say that A and B have the same cardinality. The notion of
cardinality of sets can be extended to more general sets than subsets of R,
but for our purposes it is sufficient to restrict ourselves to the case of R.
Denote by Nn := {1, . . . , n} the first n natural numbers. Every finite subset
A ⊂ R is equivalent to one and only one of the sets Nn , n ∈ N. Indeed, if A
has n elements a1 , . . . , an then j → aj is a bijection from Nn to A and n is
uniquely determined. Thus {Nn }n∈N gives a complete set of representatives
of the finite subsets of R. In this equivalence relation the representative
is just determined by the number of elements of the finite set. Now, N
itself is not finite and determines a further equivalence class, the class of the
countable sets and of course N is a representative of this class. A set
Y ⊂ R is countable if there exists a bijection from N to Y or equivalently
if there exists a bijection from Y to N. The finite sets together with the
countable sets, i.e. sets A ⊂ R, such that A ∼ Nn or A ∼ N, are called the
denumerable subsets of R.
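We claim that $\mathbb{Z}$ and $\mathbb{Q}$ are countable. (We identify $\mathbb{Q}$ as a subset of $\mathbb{R}$.) How do we prove this surprising statement? There are clearly many “more” integers or fractions than natural numbers, i.e. $\mathbb{N} \subset \mathbb{Z}$, $\mathbb{N} \subset \mathbb{Q}$, $\mathbb{N} \neq \mathbb{Z}$ and $\mathbb{N} \neq \mathbb{Q}$ as well as $\mathbb{Z} \neq \mathbb{Q}$. Note that this is typical for infinite sets: they contain proper subsets which can be mapped bijectively onto them, i.e. they have the same cardinality. Here is a possible bijection $f_{\mathbb{Z}\mathbb{N}} : \mathbb{Z} \to \mathbb{N}$:
$$f_{\mathbb{Z}\mathbb{N}}(k) := \begin{cases} 2k, & k \in \mathbb{N} \\ 2|k| + 1, & k \in \mathbb{Z} \setminus \mathbb{N}. \end{cases} \tag{5.18}$$
Clearly $f_{\mathbb{Z}\mathbb{N}}$ is injective, i.e. $k \neq k'$ implies $f_{\mathbb{Z}\mathbb{N}}(k) \neq f_{\mathbb{Z}\mathbb{N}}(k')$. However $f_{\mathbb{Z}\mathbb{N}}$ is also surjective: given $n \in \mathbb{N}$, if $n$ is even, i.e. $n = 2k$, $k \in \mathbb{N}$, then $f_{\mathbb{Z}\mathbb{N}}(k) = n$. If $n$ is odd, i.e. $n = 2k + 1$, $k \in \mathbb{N} \cup \{0\}$, then $f_{\mathbb{Z}\mathbb{N}}(-k) = n$.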
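The bijection (5.18) is easy to test on a finite window of integers. The sketch below assumes $\mathbb{N} = \{1, 2, 3, \dots\}$, in line with $\mathbb{N}_n = \{1, \dots, n\}$ above; the function names are illustrative.

```python
def f_ZN(k):
    """Map Z onto N = {1, 2, 3, ...}: k >= 1 goes to 2k, k <= 0 goes to 2|k| + 1."""
    return 2 * k if k >= 1 else 2 * abs(k) + 1

def f_ZN_inverse(n):
    """Inverse mapping N -> Z: even n comes from n/2, odd n from -(n - 1)/2."""
    return n // 2 if n % 2 == 0 else -((n - 1) // 2)

ks = range(-50, 51)
values = [f_ZN(k) for k in ks]

print(sorted(values) == list(range(1, 102)))        # True: each of 1, ..., 101 is hit once
print(all(f_ZN_inverse(f_ZN(k)) == k for k in ks))  # True
```

The integers $-50, \dots, 50$ are mapped exactly onto $1, \dots, 101$, with no value repeated or skipped.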
The case of Q is more involved and we only indicate the idea of how to prove that all non-negative fractions can be mapped bijectively onto N. Note that there is a lot of multiple counting in the following scheme, i.e. we need to refine the counting process. This enumeration scheme is due to G. Cantor who, together with R. Dedekind, is the founder of set theory, one of the greatest intellectual achievements of mankind.
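$$\begin{array}{cccccc}
\frac{0}{1} & \frac{1}{1} & \frac{2}{1} & \frac{3}{1} & \frac{4}{1} & \frac{5}{1} \\[4pt]
\frac{0}{2} & \frac{1}{2} & \frac{2}{2} & \frac{3}{2} & \frac{4}{2} & \frac{5}{2} \\[4pt]
\frac{0}{3} & \frac{1}{3} & \frac{2}{3} & \frac{3}{3} & \frac{4}{3} & \frac{5}{3} \\[4pt]
\frac{0}{4} & \frac{1}{4} & \frac{2}{4} & \frac{3}{4} & \frac{4}{4} & \frac{5}{4} \\[4pt]
\frac{0}{5} & \frac{1}{5} & \frac{2}{5} & \frac{3}{5} & \frac{4}{5} & \frac{5}{5} \\[4pt]
\frac{0}{6} & \frac{1}{6} & \frac{2}{6} & \frac{3}{6} & \frac{4}{6} & \frac{5}{6}
\end{array}$$
Figure 5.9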
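One way to refine the counting process is to walk through the scheme along its diagonals and to skip every fraction whose value has already occurred. The sketch below is an illustration of this idea only: the particular traversal order (diagonals $p + q = \text{const}$) is an assumption, since the path drawn in the original figure is not recoverable here, and the function name is hypothetical.

```python
from fractions import Fraction
from itertools import count, islice

def nonnegative_rationals():
    """Yield every non-negative rational number exactly once.

    Walk the diagonals p + q = s of the scheme p/q (p >= 0, q >= 1)
    and skip values that were already produced (multiple counting).
    """
    seen = set()
    for s in count(1):
        for q in range(1, s + 1):
            r = Fraction(s - q, q)   # Fraction reduces to lowest terms
            if r not in seen:
                seen.add(r)
                yield r

first = list(islice(nonnegative_rationals(), 8))
print([str(r) for r in first])  # ['0', '1', '2', '1/2', '3', '1/3', '4', '3/2']
```

Assigning to the $n$-th value produced the natural number $n$ gives an injective and surjective mapping from the non-negative fractions onto $\mathbb{N}$, since every fraction $p/q$ is reached on the diagonal $s = p + q$.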
Now we consider the question: is R countable? That is, does a bijective function from N to R exist? The answer is no and we will prove this when we discuss decimal fractions in Chapter 18 of Part 2. For this we first need to understand the convergence of series of real numbers. Thus we know that R has finite and countable subsets but it is not itself countable. We may ask whether there is a subset C ⊂ R such that C is not countable and does not have the same cardinality as R, i.e. there is no bijection fCR : C → R. The famous continuum hypothesis (CH) states that such a set does not exist. CH can be neither proved nor disproved within the standard axiom system of set theory, which is denoted by ZFC, where Z stands for E. Zermelo, F for A. Fraenkel and C for the Axiom of Choice: K. Gödel proved that in ZFC the hypothesis CH cannot be disproved, and about twenty-five years later P. Cohen proved that it cannot be proved either, i.e. CH is independent of ZFC.
Problems
1. Decide whether or not the following functions are injective, surjective
or bijective. Sketch the graph in each case.
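a) $f_1 : \mathbb{R} \to \mathbb{R}_+$, $x \mapsto |x - 3| + 2$;
b) $f_2 : [1, \infty) \to (0, 2]$, $x \mapsto \frac{2}{x}$, where for $a \in \mathbb{R}$ we write $[a, \infty) = \{x \in \mathbb{R} \mid x \geq a\}$;
c) $f_3 : [-2, 7] \to [0, 3]$, $x \mapsto \sqrt{x + 2}$.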

2. a) Consider the mapping $g : \mathbb{Q} \to \mathbb{Z}$, $\frac{p}{q} \mapsto g\left(\frac{p}{q}\right) = p + q$. Is $g$ injective, surjective or bijective?
b) Let r : R × R −→ R × R, (x, y) −→ r(x, y) := (y, x). Test r for
injectivity, surjectivity and bijectivity.
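3. a) Given $f : \mathbb{R} \to \mathbb{R}$, $x \mapsto 5x^2 - 2x + 1$, and $g : [-5, \infty) \to \mathbb{R}$, $x \mapsto \sqrt{5 + x}$. Find $f \circ g : [-5, \infty) \to \mathbb{R}$.
b) Consider $f : \mathbb{R} \to \mathbb{R}$, $x \mapsto |x + 3| - 2$, and $h : \mathbb{R} \to \mathbb{R}$, $x \mapsto \sqrt{x^4 + 2}$. Find the largest sets $D_1 \subset \mathbb{R}$ and $D_2 \subset \mathbb{R}$ such that we can form $f \circ h : D_1 \to \mathbb{R}$ and $h \circ f : D_2 \to \mathbb{R}$. In each case give a formula for the function, i.e. for $f \circ h$ and $h \circ f$.
c) Find the largest set $D \subset \mathbb{R}$ where we can define $f \circ h$ where $f : [0, \infty) \to \mathbb{R}$, $x \mapsto \sqrt{x}$, and $h : \mathbb{R} \to \mathbb{R}$, $x \mapsto |x + 2| - 1$.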
4. (Exercise 5.11) Prove that the composition of two injective mappings

is injective and that of two surjective mappings is surjective. Deduce
that the composition of two bijective mappings is bijective.
5. Let X ≠ ∅ be a non-empty set. Denote by Aut(X) the set of all bijective mappings f : X −→ X. The abbreviation Aut(X) comes from
automorphism, a notion that is dealt with in algebra. Prove that
(Aut(X), ◦) is in general a non-Abelian group, where “◦” stands for
the composition of mappings.
Note: in order to verify that Aut(X) is a non-Abelian group the fol-
lowing must be proved:
(i) f, g, h ∈ Aut(X) implies (f ◦ g) ◦ h = f ◦ (g ◦ h);
(ii) f, g ∈ Aut(X) implies f ◦ g ∈ Aut(X);
(iii) there exists e ∈ Aut(X) such that for all f ∈ Aut(X) the
following holds: f ◦ e = e ◦ f = f ;
(iv) for f ∈ Aut(X) there exists kf ∈ Aut(X) such that
f ◦ kf = kf ◦ f = e.
6. Let f : X −→ Y be a mapping.
a) Prove that f is injective if and only if there exists a mapping
g : Y −→ X such that g ◦ f = idX .
b) Prove that f is surjective if and only if there exists a mapping
h : Y −→ X such that f ◦ h = idY .
7. Consider the mapping $f : \{x \in \mathbb{R} \mid x > 0\} \to \{x \in \mathbb{R} \mid x > 0\}$, $x \mapsto \frac{1}{x}$. Prove that $f = f^{-1}$.
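8. For $h : \mathbb{R} \to \mathbb{R}$, $x \mapsto x^2 + 2$, and $f : \mathbb{R}_+ \to \mathbb{R}$, $x \mapsto \frac{1}{x}$, $g : \mathbb{R}_+ \to \mathbb{R}$, $x \mapsto \sqrt{x} + |x - 2|$, find $(f + g) \circ h$, $(f \cdot g) \circ h$ and $\frac{1}{g} \circ h$.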
9. Let $X \neq \emptyset$ and $f : X \to \mathbb{R}$. Define two functions
$$f^+ : X \to \mathbb{R}, \quad x \mapsto f^+(x) := \frac{|f(x)| + f(x)}{2}$$
and
$$f^- : X \to \mathbb{R}, \quad x \mapsto f^-(x) := \frac{|f(x)| - f(x)}{2},$$
which are called the positive part of $f$ and the negative part of $f$ respectively. Prove that $f^+(x) \geq 0$ and $f^-(x) \geq 0$ for all $x \in X$ and $f = f^+ - f^-$, $|f| = f^+ + f^-$.
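10. Find the inverse of each of the following mappings:
a) $f_1 : \{x \in \mathbb{R} \mid x \geq 0\} \to (0, 1]$, $x \mapsto \frac{1}{1 + x^2}$;
b) $f_2 : \{x \in \mathbb{R} \mid x \geq 0\} \to (0, 2]$ where
$$f_2(x) = \begin{cases} -x + 2, & x \in [0, 1] \\ \frac{1}{x}, & x \in (1, \infty). \end{cases}$$
Sketch the graph of $f_2$.
c) $f_3 : \mathbb{N} \to \left\{ q \mid q = \frac{1}{n^3} \text{ and } n \in \mathbb{N} \right\}$, $n \mapsto \frac{1}{n^3}$.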
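11. a) Let $\mathrm{pr}_1 : \mathbb{R}^2 \to \mathbb{R}$, $(x, y) \mapsto \mathrm{pr}_1((x, y)) = x$. Denote by $B_1(0) = \{(x, y) \in \mathbb{R}^2 \mid x^2 + y^2 \leq 1\}$ the disc with centre $0 = (0, 0) \in \mathbb{R}^2$ and radius 1 and by $S^1 = \{(x, y) \in \mathbb{R}^2 \mid x^2 + y^2 = 1\}$ the circle with centre $0 = (0, 0) \in \mathbb{R}^2$ and radius 1. Find $\mathrm{pr}_1(B_1(0))$ and $\mathrm{pr}_1(S^1)$. (Sketch the situations.)
b) Let $R(g) = \{(x, g(x)) \mid x \in [0, 1] \text{ and } g(x) = x^2 + 1\}$. Find $\mathrm{pr}_2(R(g))$ where $\mathrm{pr}_2 : \mathbb{R}^2 \to \mathbb{R}$, $(x, y) \mapsto \mathrm{pr}_2((x, y)) = y$.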
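12. Let $X$ and $Y$ be two non-empty sets and $A \subset X$, $B \subset Y$. For the projections $\mathrm{pr}_1 : X \times Y \to X$, $(x, y) \mapsto \mathrm{pr}_1((x, y)) = x$, and $\mathrm{pr}_2 : X \times Y \to Y$, $(x, y) \mapsto \mathrm{pr}_2((x, y)) = y$, prove that $\mathrm{pr}_1^{-1}(A) = A \times Y$ and $\mathrm{pr}_2^{-1}(B) = X \times B$.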
13. Let j : N −→ R be a mapping. Prove that the image j(N) ⊂ R is

a countable set if j is injective. Does the converse hold, i.e. does the
countability of j(N) imply that j is injective?
14. Let D ⊂ R, D ≠ ∅, and denote by M(D; R) the set of all mappings f :
D −→ R. We define the relation f ∼ g for f, g ∈ M(D; R) as follows:
f ∼ g if and only if there exists a finite set A = Af,g = {x1 , . . . , xm }
depending on f and g, i.e. the points xj as well as m depend on f
and g, such that f |D\A = g|D\A . Prove that “∼” defines an equivalence
relation on M(D; R).
15. Let X, Y, Z be sets. We can define the Cartesian product (X × Y ) × Z.

An element of this set has the form ((x, y), z) with (x, y) ∈ X × Y and
z ∈ Z. Define the set X × Y × Z as the set of all ordered triples
(x, y, z) where x ∈ X, y ∈ Y and z ∈ Z, i.e.
X × Y × Z = {(x, y, z) | x ∈ X ∧ y ∈ Y ∧ z ∈ Z}.
Prove that
J : (X × Y ) × Z −→ X × Y × Z
((x, y), z) −→ (x, y, z)
is a bijective mapping. (Note that by definition $(x, y, z) = (x', y', z')$ if and only if $x = x'$, $y = y'$ and $z = z'$.)
Remark: for finitely many sets $A_1, \dots, A_N$ we can define their Cartesian product by
A1 × . . . × AN := {(a1 , . . . , aN ) | a1 ∈ A1 , . . . , aN ∈ AN },
or more formally
A1 × . . . × AN := {(a1 , . . . , aN ) | for all j ∈ {1, . . . , N} : aj ∈ Aj }.
In particular we may work with
$\mathbb{R}^n := \mathbb{R} \times \dots \times \mathbb{R}$ ($n$ terms);
$\mathbb{Z}^m := \mathbb{Z} \times \dots \times \mathbb{Z}$ ($m$ terms)
and more generally
$A^k := A \times \dots \times A$ ($k$ terms).
6 Derivatives
We want to study real-valued functions f : D → R, D ⊂ R, more closely.
For example we would like to know whether f is monotone increasing or
decreasing, attains local extreme values, has zeroes, etc. For all this and for
many more problems the concept of the derivative is very helpful. We will
spend some time on the construction of the derivative which we will formally
define in Definition 6.2. The central idea is to substitute locally, i.e. in a
neighbourhood of a point x0 ∈ D, the graph Γ(f ) of a function f by a straight
line, more precisely by the graph Γ(g) of a function g : R → R, x → ax + b.
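[Figure 6.1: the graph $\Gamma(f)$ over $D = [a, b]$ together with several candidate straight lines $\Gamma(g_1), \Gamma(g_2), \Gamma(g_3), \Gamma(g_4)$ near the point $(x_0, y_0)$, $y_0 = f(x_0)$.]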
As Figure 6.1 shows, many straight lines are possible. We have already
indicated one condition that we want to impose: if x0 ∈ D is the point of
interest i.e. if we want to replace Γ(f ) in a neighbourhood of x0 by Γ(g),
then (x0 , f (x0 )) should lie on the straight line being selected.
The equation of a straight line passing through (x0 , f (x0 )) can be obtained as
follows. A straight line should be interpreted as the graph Γ(g) of a function
g : R → R, x → g(x) = ax + b. The condition that (x0 , f (x0 )) ∈ Γ(g)
means
g(x0 ) = ax0 + b = f (x0 ) (6.1)
which is one equation for the two unknowns a and b. Thus we need a further
condition to determine g, i.e. Γ(g). Since our aim is to substitute locally, i.e.
in a neighbourhood of x0 ∈ D, Γ(f ) by Γ(g), we may argue as follows. For
|x − x0 | small we should have
f (x) ≈ g(x) = ax + b, (6.2)
where “ ≈ ” stands at the moment for “f (x) being close to g(x)”. Of course,
in addition to (6.2) we assume (6.1).
Thus for |x − x0 | small we should have
f (x) − f (x0 ) ≈ a(x − x0 ), (6.3)
which we obtain by subtracting (6.1) from (6.2). For $x \neq x_0$ this yields
$$\frac{f(x) - f(x_0)}{x - x_0} = a + \text{error}. \tag{6.4}$$
Now if |x − x0 | tends to 0 then the error should also go to 0. This would
determine a and from (6.1) we can now calculate b to be
b = f (x0 ) − ax0 . (6.5)
We need to be precise about what “the error goes to 0 as |x − x0 | goes to 0”
means. Before this we give a geometric interpretation for our considerations.
We have the intuitive idea of a tangent to a given curve, in the case of a
circle we can even give a precise definition:
[Figure 6.2: a circle with centre $(0, 0)$ and its tangent at the point $(x_0, y_0)$.]
A straight line is a tangent to a circle at the point (x0 , y0) if the point
(x0 , y0 ) belongs to (the graph of) this straight line and this straight line is
perpendicular (later we will also say orthogonal) to the straight line through
the centre of the circle and the point (x0 , y0 ).
For a general curve we cannot use this definition, but we may do the following:
consider the graph Γ(f ) of f : D → R, x0 ∈ D.
[Figure 6.3: the graph $\Gamma(f)$ over $[a, b]$ and the straight line $\Gamma(g)$ through the point $(x_0, y_0)$, $y_0 = f(x_0)$.]
Instead of g, which we do not know, we consider a straight line g̃ nearby

given as g̃(x) = ãx + b̃, which has the property that (x0 , f (x0 )) ∈ Γ(g̃), i.e.
the point (x0 , f (x0 )) lies on the graph of g̃, therefore g̃(x0 ) = ãx0 + b̃ = f (x0 ),
and for |x − x0 | small Γ(g̃) intersects Γ(f ) (only) in one further point, say
$(x_1, f(x_1))$.
[Figure 6.4: the graph $\Gamma(f)$ and the straight line $\Gamma(\tilde{g})$ through the two points $(x_0, f(x_0))$ and $(x_1, f(x_1))$.]
This straight line is completely determined by the two conditions:
g̃(x0 ) = f (x0 ) = ãx0 + b̃; (6.6)
and
g̃(x1 ) = f (x1 ) = ãx1 + b̃. (6.7)
This leads to
$$\tilde{a} = \frac{f(x_0) - f(x_1)}{x_0 - x_1} \tag{6.8}$$
and
$$\tilde{b} = \frac{f(x_1)x_0 - f(x_0)x_1}{x_0 - x_1}. \tag{6.9}$$
Thus the error term in (6.4) should be given by $|a - \tilde{a}|$. Intuitively we now take a sequence of points $(x_\nu, f(x_\nu))$, $\nu \in \mathbb{N}$, on $\Gamma(f)$, $x_\nu \neq x_0$ for all $\nu \in \mathbb{N}$, tending to $(x_0, f(x_0))$ and consider the corresponding straight lines $g_\nu(x) = a_\nu x + b_\nu$ with
$$a_\nu = \frac{f(x_0) - f(x_\nu)}{x_0 - x_\nu} \tag{6.10}$$
and
$$b_\nu = \frac{f(x_\nu)x_0 - f(x_0)x_\nu}{x_0 - x_\nu}. \tag{6.11}$$
[Figure 6.5: the graph $\Gamma(f)$ and the straight lines $\Gamma(g_1), \Gamma(g_2), \Gamma(g_3)$ through $(x_0, f(x_0))$ and the points with abscissae $x_1, x_2, x_3$ approaching $x_0$.]
We may think that the tangent is just the “limit line”. However, here we
encounter one of the main problems in analysis. Do we know that the “limit”
exists?
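To get a feeling for the "limit line", the coefficients (6.10) and (6.11) can be computed for a concrete function. The following sketch uses $f(x) = x^2$, $x_0 = 1$ and $x_\nu = 1 - 10^{-\nu}$; these choices and the name `secant_line` are illustrative assumptions, and the computation only suggests, but does not prove, that a limit exists.

```python
from fractions import Fraction

def secant_line(f, x0, x1):
    """Coefficients (a, b) of the line through (x0, f(x0)) and (x1, f(x1)), x0 != x1."""
    a = (f(x0) - f(x1)) / (x0 - x1)
    b = (f(x1) * x0 - f(x0) * x1) / (x0 - x1)
    return a, b

f = lambda x: x * x
x0 = Fraction(1)

for nu in range(1, 6):
    x_nu = x0 - Fraction(1, 10 ** nu)      # points approaching x0 from the left
    a_nu, b_nu = secant_line(f, x0, x_nu)
    print(nu, float(a_nu), float(b_nu))
```

For this $f$ one finds $a_\nu = x_0 + x_\nu$ and $b_\nu = -x_0 x_\nu$, so the printed slopes approach 2 and the intercepts approach $-1$, the coefficients of the line $x \mapsto 2x - 1$.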
Having these preliminaries in mind we now do the preparations needed for
correct and precise statements. We need to understand the concept of a limit
of a function:
$$\lim_{y \to x} f(y) = a. \tag{6.12}$$
This should be equivalent to
$$\lim_{y \to x} (f(y) - a) = 0 \tag{6.13}$$
or
$$\lim_{y \to x} |f(y) - a| = 0. \tag{6.14}$$
The latter means: given a small error bound ε > 0, if y is close to x then
|f (y) − a| < ε.
So let us give a first definition: we say that the limit of $f : D \to \mathbb{R}$ as $y \in D$ approaches $x \in D$ is equal to $a$, i.e.
$$\lim_{y \to x} f(y) = a, \tag{6.15}$$
if for every $\varepsilon > 0$ there exists $\delta > 0$ such that $0 < |x - y| < \delta$ implies $|f(y) - a| < \varepsilon$.
We will see later in Part 2 that this definition yields the following simple
rules for limits:
Let $f, g : D \to \mathbb{R}$ be two functions and assume that
$$\lim_{y \to x} f(y) = a \tag{6.16}$$
and
$$\lim_{y \to x} g(y) = b, \tag{6.17}$$
then we have
$$\lim_{y \to x} (f \pm g)(y) = \lim_{y \to x} f(y) \pm \lim_{y \to x} g(y) = a \pm b, \tag{6.18}$$
as well as
$$\lim_{y \to x} (f \cdot g)(y) = \lim_{y \to x} f(y) \cdot \lim_{y \to x} g(y) = a \cdot b. \tag{6.19}$$
If in addition $g(y) \neq 0$ for all $y \in D$ and $b \neq 0$, then
$$\lim_{y \to x} \frac{f(y)}{g(y)} = \frac{\lim_{y \to x} f(y)}{\lim_{y \to x} g(y)} = \frac{a}{b}. \tag{6.20}$$
(Note that we will improve (6.20): we will need only the assumption that $b \neq 0$.)
Example 6.1. A. Consider the constant function $h_c : \mathbb{R} \to \mathbb{R}$, $x \mapsto c$, i.e. $h_c(x) = c$ for all $x \in \mathbb{R}$. Then $|h_c(y) - h_c(x)| = |c - c| = 0$ and therefore whatever the value of $|x - y|$ is, $|h_c(y) - h_c(x)| < \varepsilon$ for every $\varepsilon > 0$. Hence
$$\lim_{y \to x} h_c(y) = c. \tag{6.21}$$
B. We claim for $f_a : \mathbb{R} \to \mathbb{R}$, $x \mapsto f_a(x) = a + x$, that
$$\lim_{y \to x} f_a(y) = f_a(x). \tag{6.22}$$
Indeed, consider
$$|f_a(y) - f_a(x)| = |a + y - a - x| = |y - x|.$$
Given $\varepsilon > 0$, take $\delta = \varepsilon$ to find that for $|y - x| < \delta$ it follows that
$$|f_a(y) - f_a(x)| = |y - x| < \varepsilon,$$
i.e.
$$|f_a(y) - f_a(x)| < \varepsilon.$$
C. Now let $p : \mathbb{R} \to \mathbb{R}$ be a polynomial, i.e.
$$p(x) = \sum_{j=0}^{M} a_j x^j = a_0 + a_1 x + a_2 x^2 + \dots + a_M x^M.$$
Thus $p$ is the finite sum of finite products of functions for which we know the limits, hence by (6.18) and (6.19)
$$\lim_{y \to x} p(y) = p(x). \tag{6.23}$$
D. Consider the characteristic function $\chi_A$ : R → R, x ↦ $\chi_A(x)$, for A = (0, ∞) := {x ∈ R | x > 0}. The graph of $\chi_{(0,\infty)}$ is shown in Figure 6.6.
[Figure 6.6: graph of $\chi_{(0,\infty)}$]
Suppose that $\lim_{y \to 0} \chi_{(0,\infty)}(y) = a$ for some a ∈ R. Then for every ε > 0 there exists δ > 0 such that 0 < |y| < δ implies $|\chi_{(0,\infty)}(y) - a| < \varepsilon$, i.e. −δ < y < δ, y ≠ 0, implies $|\chi_{(0,\infty)}(y) - a| < \varepsilon$. Now for −δ < y < 0 we have $\chi_{(0,\infty)}(y) = 0$, implying that |a| < ε for all ε > 0, i.e. a must be equal to 0. However, for 0 < y < δ we have $\chi_{(0,\infty)}(y) = 1$, and with a = 0 we would have |1 − 0| = 1 < ε for every ε > 0. This is of course a contradiction. Therefore $\lim_{y \to 0} \chi_{(0,\infty)}(y)$ does not exist.
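The ε-δ definition can be explored numerically. The following is only a sketch: the helper `find_violation`, the sample count and the choice a = 3 are our own assumptions, and testing finitely many points can support but never prove a limit statement.

```python
def chi(y):
    """Characteristic function of the open interval (0, infinity)."""
    return 1.0 if y > 0 else 0.0


def find_violation(f, x, a, eps, delta, samples=1000):
    """Search sample points y with 0 < |y - x| < delta for |f(y) - a| >= eps.

    Returns such a y, or None if no sampled point violates the bound.
    Only finitely many points are tested, so None is evidence, not a proof.
    """
    for k in range(1, samples + 1):
        h = delta * k / (samples + 1)  # guarantees 0 < h < delta
        for y in (x - h, x + h):
            if abs(f(y) - a) >= eps:
                return y
    return None


def f_a(y, a=3.0):
    """The function of Example 6.1.B with the (hypothetical) choice a = 3."""
    return a + y


# Example 6.1.B: delta = eps works at x = 2, no sampled point violates the bound.
print(find_violation(f_a, 2.0, f_a(2.0), eps=1e-3, delta=1e-3))

# Example 6.1.D: with eps = 1/2 each candidate limit fails, however small delta is.
for candidate in (0.0, 0.5, 1.0):
    print(candidate, find_violation(chi, 0.0, candidate, eps=0.5, delta=1e-9))
```

For the characteristic function a violating point is found on one side of 0 for each candidate value, in line with the contradiction derived in Example 6.1.D.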
We now return to our original problem and study for a given function f : D → R the limit
$$\lim_{y \to x} \frac{f(y) - f(x)}{y - x} = \lim_{y \to x} \frac{f(x) - f(y)}{x - y}. \tag{6.24}$$
Definition 6.2. Let f : D → R, D ⊂ R, be a function and $x_0 \in D$. We say that f is differentiable at $x_0$ and has the derivative $f'(x_0)$ at $x_0$ if the following limit
$$\lim_{\substack{y \to x_0 \\ y \neq x_0}} \frac{f(y) - f(x_0)}{y - x_0} \tag{6.25}$$
exists. In this case we set
$$f'(x_0) = \lim_{\substack{y \to x_0 \\ y \neq x_0}} \frac{f(y) - f(x_0)}{y - x_0}. \tag{6.26}$$
Remark 6.3. A. It is clear that we have to exclude the value $y = x_0$ in (6.25), otherwise $\frac{f(y) - f(x_0)}{y - x_0}$ may not be defined.
B. Note that $\lim_{\substack{y \to x_0 \\ y \neq x_0}} \frac{f(y) - f(x_0)}{y - x_0}$ always means that only points y ∈ D are considered.
C. Note that we have given a pointwise definition, i.e. given f : D → R, so far we have only defined its derivative at $x_0$ and this is the real number $f'(x_0)$.
D. For historical reasons as well as for practical reasons we will often write
$$\frac{df(x_0)}{dx} = f'(x_0) \quad \text{or} \quad \frac{df}{dx}(x_0) = f'(x_0). \tag{6.27}$$
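Example 6.4. A. Consider the constant function $h_c$ : R → R, x ↦ $h_c(x) = c$. For every $x_0 \in \mathbb{R}$ we find
$$\lim_{\substack{y \to x_0 \\ y \neq x_0}} \frac{h_c(y) - h_c(x_0)}{y - x_0} = \lim_{\substack{y \to x_0 \\ y \neq x_0}} \frac{c - c}{y - x_0} = \lim_{\substack{y \to x_0 \\ y \neq x_0}} h_0(y) = 0,$$
i.e.
$$h_c'(x_0) = 0 \quad \text{for all } x_0 \in \mathbb{R}. \tag{6.28}$$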
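B. For the function f : R → R, x ↦ ax + b, a, b ∈ R, we find for every $x_0 \in \mathbb{R}$
$$
\begin{aligned}
\lim_{\substack{y \to x_0 \\ y \neq x_0}} \frac{f(y) - f(x_0)}{y - x_0} &= \lim_{\substack{y \to x_0 \\ y \neq x_0}} \frac{ay + b - (ax_0 + b)}{y - x_0} \\
&= \lim_{\substack{y \to x_0 \\ y \neq x_0}} \frac{a(y - x_0)}{y - x_0} = \lim_{\substack{y \to x_0 \\ y \neq x_0}} h_a(y) = a,
\end{aligned}
$$
i.e.
$$f'(x_0) = a \quad \text{for all } x_0 \in \mathbb{R}. \tag{6.29}$$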
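C. Let g : R → R, x ↦ $ax^2$, a ∈ R. Using the formula
$$y^2 - x_0^2 = (y + x_0)(y - x_0)$$
we get for $x_0 \in \mathbb{R}$
$$
\begin{aligned}
\lim_{\substack{y \to x_0 \\ y \neq x_0}} \frac{g(y) - g(x_0)}{y - x_0} &= \lim_{\substack{y \to x_0 \\ y \neq x_0}} \frac{ay^2 - ax_0^2}{y - x_0} = \lim_{\substack{y \to x_0 \\ y \neq x_0}} \frac{a(y^2 - x_0^2)}{y - x_0} \\
&= \lim_{\substack{y \to x_0 \\ y \neq x_0}} \frac{a(y + x_0)(y - x_0)}{y - x_0} = \lim_{\substack{y \to x_0 \\ y \neq x_0}} a(y + x_0) = 2ax_0,
\end{aligned}
$$
i.e.
$$g'(x_0) = 2ax_0. \tag{6.30}$$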
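D. We want to differentiate the function h : R \ {0} → R, x ↦ $\frac{1}{x}$. For $x_0 \neq 0$ we find
$$
\begin{aligned}
\lim_{\substack{y \to x_0 \\ y \neq x_0}} \frac{\frac{1}{y} - \frac{1}{x_0}}{y - x_0} &= \lim_{\substack{y \to x_0 \\ y \neq x_0}} \frac{\frac{x_0 - y}{y \cdot x_0}}{y - x_0} \\
&= \lim_{\substack{y \to x_0 \\ y \neq x_0}} \frac{-1}{y \cdot x_0} = -\frac{1}{x_0^2},
\end{aligned}
$$
i.e.
$$h'(x_0) = -\frac{1}{x_0^2}, \quad x_0 \neq 0. \tag{6.31}$$
Recall that by assumption y ∈ D(h) = R \ {0}, i.e. y ≠ 0.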
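The four limits of Example 6.4 can be illustrated numerically. The sketch below evaluates the difference quotient of (6.25) at points y = x0 + h for shrinking h; the parameter values a = 2, b = −1, c = 4 and x0 = 1.5 are arbitrary choices of ours, and the printed values only suggest the limit, they do not prove it.

```python
def difference_quotient(f, x0, y):
    """Return (f(y) - f(x0)) / (y - x0); the value y = x0 is excluded as in (6.25)."""
    if y == x0:
        raise ValueError("y must differ from x0")
    return (f(y) - f(x0)) / (y - x0)


a, b, c, x0 = 2.0, -1.0, 4.0, 1.5  # hypothetical sample parameters

# Each entry pairs a function with the derivative found in Example 6.4.
examples = {
    "constant": (lambda x: c, 0.0),
    "affine": (lambda x: a * x + b, a),
    "square": (lambda x: a * x**2, 2 * a * x0),
    "reciprocal": (lambda x: 1.0 / x, -1.0 / x0**2),
}

for name, (f, derivative) in examples.items():
    for h in (1e-2, 1e-4, 1e-6):
        q = difference_quotient(f, x0, x0 + h)
        print(f"{name:10s} h={h:.0e} quotient={q:.8f} derivative={derivative:.8f}")
```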
Note that in all these examples we can find the derivative for all points in
the domain. Thus in each case we can define a new function. Therefore we
give
Definition 6.5. Let f : D → R be a function. If $f'(x)$ exists for all x ∈ D we define the new function $f'$, called the derivative (or first order derivative) of f, by $f'$ : D → R, x ↦ $f'(x)$.
By Example 6.4 we may write
$$(c)' = 0; \tag{6.32}$$
$$(ax + b)' = a; \tag{6.33}$$
$$(ax^2)' = 2ax; \tag{6.34}$$
$$\left(\frac{1}{x}\right)' = -\frac{1}{x^2}. \tag{6.35}$$
In the next step we want to derive some rules for calculating derivatives.
Theorem 6.6. Let f, g : D → R be two functions each differentiable at $x_0 \in D$ with derivatives $f'(x_0)$ and $g'(x_0)$, respectively. Then for all a ∈ R we have
$$(af)'(x_0) = af'(x_0) \tag{6.36}$$
and
$$(f \pm g)'(x_0) = f'(x_0) \pm g'(x_0). \tag{6.37}$$
In particular, this means that $(af)'(x_0)$ and $(f \pm g)'(x_0)$ exist.
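Proof. To see (6.36) just note
$$
\begin{aligned}
\lim_{\substack{y \to x_0 \\ y \neq x_0}} \frac{(af)(y) - (af)(x_0)}{y - x_0} &= \lim_{\substack{y \to x_0 \\ y \neq x_0}} \frac{a(f(y) - f(x_0))}{y - x_0} \\
&= \Big(\lim_{\substack{y \to x_0 \\ y \neq x_0}} a\Big) \cdot \Big(\lim_{\substack{y \to x_0 \\ y \neq x_0}} \frac{f(y) - f(x_0)}{y - x_0}\Big) = af'(x_0),
\end{aligned}
$$
where we write $\lim_{y \to x_0} a$ for $\lim_{y \to x_0} h_a(y)$ and $h_a(y) = a$ for all y ∈ R. Now we prove (6.37):
$$
\begin{aligned}
\lim_{\substack{y \to x_0 \\ y \neq x_0}} \frac{(f \pm g)(y) - (f \pm g)(x_0)}{y - x_0} &= \lim_{\substack{y \to x_0 \\ y \neq x_0}} \left(\frac{f(y) - f(x_0)}{y - x_0} \pm \frac{g(y) - g(x_0)}{y - x_0}\right) \\
&= \lim_{\substack{y \to x_0 \\ y \neq x_0}} \frac{f(y) - f(x_0)}{y - x_0} \pm \lim_{\substack{y \to x_0 \\ y \neq x_0}} \frac{g(y) - g(x_0)}{y - x_0} \\
&= f'(x_0) \pm g'(x_0).
\end{aligned}
$$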
To proceed further, we need the following simple but far reaching observation.
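Lemma 6.7. If g : D → R is differentiable at $x_0 \in D$ then
$$\lim_{\substack{y \to x_0 \\ y \neq x_0}} g(y) = \lim_{y \to x_0} g(y) = g(x_0).$$
Proof. Note that for $y \neq x_0$ we have
$$g(y) - g(x_0) = \frac{g(y) - g(x_0)}{y - x_0}\,(y - x_0).$$
Now $\lim_{\substack{y \to x_0 \\ y \neq x_0}} \frac{g(y) - g(x_0)}{y - x_0} = g'(x_0)$ and $\lim_{\substack{y \to x_0 \\ y \neq x_0}} (y - x_0) = 0$. Consequently we have
$$\lim_{\substack{y \to x_0 \\ y \neq x_0}} (g(y) - g(x_0)) = \lim_{\substack{y \to x_0 \\ y \neq x_0}} \frac{g(y) - g(x_0)}{y - x_0} \cdot \lim_{\substack{y \to x_0 \\ y \neq x_0}} (y - x_0) = 0,$$
or
$$\lim_{\substack{y \to x_0 \\ y \neq x_0}} g(y) = g(x_0).$$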
We want to determine $(f \cdot g)'(x_0)$ for two functions f, g : D → R differentiable at $x_0$. For this firstly consider
$$
\begin{aligned}
\frac{(f \cdot g)(y) - (f \cdot g)(x_0)}{y - x_0} &= \frac{f(y)g(y) - f(x_0)g(x_0)}{y - x_0} \\
&= \frac{(f(y) - f(x_0))g(y) + (g(y) - g(x_0))f(x_0)}{y - x_0} \\
&= \frac{f(y) - f(x_0)}{y - x_0} \cdot g(y) + f(x_0) \cdot \frac{g(y) - g(x_0)}{y - x_0}.
\end{aligned}
$$
Now we can prove Leibniz’s rule, which is also known as the product rule.
Theorem 6.8. Let f, g : D → R be two functions each differentiable at $x_0 \in D$. Then f · g is differentiable at $x_0$ and for $(f \cdot g)'(x_0)$ we find
$$(f \cdot g)'(x_0) = f'(x_0)g(x_0) + f(x_0)g'(x_0). \tag{6.38}$$
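Proof. Using the calculation made above we have
$$
\begin{aligned}
\lim_{\substack{y \to x_0 \\ y \neq x_0}} \frac{(fg)(y) - (fg)(x_0)}{y - x_0} &= \lim_{\substack{y \to x_0 \\ y \neq x_0}} \left(\frac{f(y) - f(x_0)}{y - x_0} \cdot g(y) + f(x_0) \cdot \frac{g(y) - g(x_0)}{y - x_0}\right) \\
&= \lim_{\substack{y \to x_0 \\ y \neq x_0}} \left(\frac{f(y) - f(x_0)}{y - x_0} \cdot g(y)\right) + \lim_{\substack{y \to x_0 \\ y \neq x_0}} \left(f(x_0) \cdot \frac{g(y) - g(x_0)}{y - x_0}\right).
\end{aligned}
$$
Now it follows by Lemma 6.7 that
$$
\begin{aligned}
\lim_{\substack{y \to x_0 \\ y \neq x_0}} \frac{(fg)(y) - (fg)(x_0)}{y - x_0} &= \lim_{\substack{y \to x_0 \\ y \neq x_0}} \frac{f(y) - f(x_0)}{y - x_0} \cdot \lim_{\substack{y \to x_0 \\ y \neq x_0}} g(y) + \lim_{\substack{y \to x_0 \\ y \neq x_0}} f(x_0) \cdot \lim_{\substack{y \to x_0 \\ y \neq x_0}} \frac{g(y) - g(x_0)}{y - x_0} \\
&= f'(x_0)g(x_0) + f(x_0)g'(x_0).
\end{aligned}
$$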
With Lemma 6.7 in mind we add a new, central concept to our considerations.
Definition 6.9. A function f : D → R is called continuous at $x_0 \in D$ if $\lim_{y \to x_0} f(y) = f(x_0)$. If f is continuous for each $x_0 \in D$ we call f continuous (in D).
We can now restate Lemma 6.7 as:
Corollary 6.10. Let f : D → R be a function. This function is continuous

at each point where it is differentiable.
The class of continuous functions is much larger than the class of differen-
tiable functions and we will discuss these functions in greater detail later on.
We will also give an example of a continuous function which is not differen-
tiable.
Remark 6.11. A function f : (a, b) → R is continuous at $x_0 \in (a, b)$ if and only if
$$\lim_{y \to x_0} f(y) = f\Big(\lim_{y \to x_0} y\Big) = f(x_0). \tag{6.39}$$
Next we use Leibniz’s rule to calculate further derivatives.
Example 6.12. A. The derivative of the function $M_n$ : R → R, x ↦ $x^n$, n ∈ N, is given by $M_n'(x) = nx^{n-1}$, i.e.
$$(x^n)' = nx^{n-1}. \tag{6.40}$$
We prove this by mathematical induction. For n = 1 we have $M_1(x) = x$, i.e.
$$M_1'(x) = 1 = 1 \cdot x^0.$$
Now assume that $M_m'(x) = mx^{m-1}$. We calculate
$$
\begin{aligned}
M_{m+1}'(x) &= (xM_m(x))' \\
&= M_m(x) + xM_m'(x) \\
&= x^m + mx^{m-1} \cdot x \\
&= (m + 1)x^m,
\end{aligned}
$$
which proves (6.40).

B. For a polynomial $p(x) = \sum_{j=0}^{N} a_j x^j$ we have
$$p'(x) = \sum_{j=0}^{N} a_j j x^{j-1} = \sum_{j=1}^{N} a_j j x^{j-1}. \tag{6.41}$$
The proof consists of the following chain of observations:
$$(a_j x^j)' = a_j j x^{j-1},$$
and for differentiable functions $f_j$ we have
$$\Big(\sum_{j=0}^{N} f_j\Big)' = \sum_{j=0}^{N} f_j',$$
which follows from (6.37). For example we find
$$(5x^2 + 7x^3 - 3x^5)' = 10x + 21x^2 - 15x^4.$$

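C. Consider the function x ↦ $\frac{x^2 + 1}{x}$ for x ≠ 0. We can write this function as
$$(x^2 + 1) \cdot \frac{1}{x},$$
and to determine its derivative we use also $\left(\frac{1}{x}\right)' = -\frac{1}{x^2}$:
$$
\begin{aligned}
\left((x^2 + 1)\Big(\frac{1}{x}\Big)\right)' &= (x^2 + 1)'\Big(\frac{1}{x}\Big) + (x^2 + 1)\Big(\frac{1}{x}\Big)' \\
&= 2x\Big(\frac{1}{x}\Big) + (x^2 + 1)\Big(-\frac{1}{x^2}\Big) \\
&= 2 - \frac{x^2 + 1}{x^2} \\
&= \frac{2x^2 - x^2 - 1}{x^2} \\
&= \frac{x^2 - 1}{x^2} \\
&= 1 - \frac{1}{x^2}.
\end{aligned}
$$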
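D. For n ∈ N we claim
$$(x^{-n})' = -nx^{-n-1}, \quad x \neq 0. \tag{6.42}$$
Again we use induction. For n = 1 we know
$$(x^{-1})' = -x^{-2} = -x^{-1-1}.$$
Now, if $(x^{-m})' = -mx^{-m-1}$ it follows that
$$
\begin{aligned}
(x^{-m-1})' &= \Big(\frac{1}{x}\,(x^{-m})\Big)' \\
&= -\frac{1}{x^2}(x^{-m}) + \frac{1}{x}(x^{-m})' \\
&= -x^{-m-2} - mx^{-m-1}\,\frac{1}{x} \\
&= -(m + 1)x^{-m-2},
\end{aligned}
$$
proving (6.42).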
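Formula (6.41) translates directly into an operation on coefficient lists. The following is a minimal sketch, assuming (our convention, not the text's) that a polynomial is stored as the list of its coefficients with `coeffs[j]` equal to a_j.

```python
def poly_derivative(coeffs):
    """Coefficients of p' for p(x) = sum_j coeffs[j] * x**j, following (6.41)."""
    # The term a_j x^j contributes j * a_j x^(j-1); the j = 0 term drops out.
    return [j * a for j, a in enumerate(coeffs)][1:]


def poly_eval(coeffs, x):
    """Evaluate the polynomial with the given coefficient list at x."""
    return sum(a * x**j for j, a in enumerate(coeffs))


# The example from the text: 5x^2 + 7x^3 - 3x^5.
p = [0, 0, 5, 7, 0, -3]
dp = poly_derivative(p)
print(dp)                # coefficients of 10x + 21x^2 - 15x^4
print(poly_eval(dp, 2))  # value of the derivative at x = 2
```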
In the next chapter we will discuss more examples after having investigated
the derivatives of composed functions.
Problems
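1. Using the rules (6.18) to (6.20) for limits prove:
a) $\lim_{x \to \frac{3}{4}} \left(\frac{5}{3}x^2 - \frac{7}{12}x\right) = \frac{1}{2}$; b) $\lim_{x \to 1} \frac{1 - x^2}{1 - x} = 2$;
c) $\lim_{x \to 3} \frac{x^3 - 4x^2 + 7x - 13}{-\frac{7}{5}x^2 + \frac{1}{1 + x^2}} = \frac{2}{25}$.
2. Find the following limits:
a) $\lim_{x \to 4} \frac{x^2 - 2x + 5}{x - 2}$; b) $\lim_{x \to -3} \frac{x^2 - 9}{(x + 5)(x + 3)}$.
3. Consider the function f : R → R,
$$x \mapsto \begin{cases} x^3 - 22, & x \neq 3 \\ 17, & x = 3. \end{cases}$$
Find $\lim_{x \to 3} f(x)$. Is f continuous at x = 3?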
4. a) Assume: f, g : (a, b) → R, a < b, are two functions such that |f(x)| ≤ g(x) for all x ∈ (a, b), then $\lim_{x \to c} g(x) = 0$ implies $\lim_{x \to c} f(x) = 0$, a < c < b.
Prove that for every bounded function h : (−2, 2) → R it follows that $\lim_{x \to 0} (xh(x)) = 0$. Here, we call h bounded if for some M ≥ 0 we have |h(x)| ≤ M for all x ∈ (−2, 2).
b) Use part a) to prove that the function f : R → R,
$$x \mapsto \begin{cases} x \sin\frac{1}{x}, & x \neq 0 \\ 0, & x = 0 \end{cases}$$
is continuous at x = 0.
5. By using the definition of the derivative prove that f : (−1, 1) → R, x ↦ $\frac{3}{4}x^2 - 2$ is differentiable at $x_0 = -\frac{1}{2}$ and find $f'\left(-\frac{1}{2}\right)$.
6.* Consider the characteristic function $\chi_{[0,1]}$ : R → R. Prove that this function is differentiable for x ∉ {0, 1} and has derivative 0, while for x ∈ {0, 1} the function is not differentiable.
7.* Consider the function g : R → R,
$$g(x) = \begin{cases} 1, & x \leq 2 \\ x^2 - 3, & x > 2. \end{cases}$$
Prove that g is not differentiable at $x_0 = 2$. Is g continuous at $x_0 = 2$?
Hint: you will need to go back to the very definition in order to investigate the continuity of g at $x_0 = 2$.
8. Using rules (6.36) to (6.38) as well as Example 6.12 find the derivatives of the following functions:
a) f : (1, 5) → R, $f(x) = \frac{7}{5}x^2 - \frac{2}{x^3}$;
b) g : (1, 2) → R, $g(t) = \frac{t^7 + 12t^3 - 2}{t^5}$;
c) h : (2, 7) → R, $h(s) = \sum_{j=1}^{M} j s^{-j}$, M ≥ 2.
9. First prove that f : R → R, $f = \chi_{\mathbb{R}_+}$, is not differentiable at $x_0 = 0$. Now consider the function h : R → R, x ↦ $x^2 f(x) = x^2 \chi_{\mathbb{R}_+}(x)$. Is h differentiable at $x_0 = 0$? If it is, find $h'(0)$.
7 Derivatives Continued
In this chapter we want to extend the number of rules for calculating deriva-
tives. Before doing this, let us agree to a slight simplification in our notation.
In the following we will often write
$$\lim_{y \to x_0} \frac{f(y) - f(x_0)}{y - x_0} \quad \text{instead of} \quad \lim_{\substack{y \to x_0 \\ y \neq x_0}} \frac{f(y) - f(x_0)}{y - x_0},$$
however we still assume that $y \neq x_0$ when using the simplified notation.

Example 7.1. We want to find the derivative of the function f : R → R, x ↦ $(x^2 + 1)^2$. There is an easy way to do this:
$$(x^2 + 1)^2 = x^4 + 2x^2 + 1,$$
hence
$$f'(x) = 4x^3 + 4x.$$
If we instead consider the function $\tilde{f}_k$ : R → R, x ↦ $(x^2 + 1)^k$, k ∈ N, the calculation becomes more involved: we first calculate $(x^2 + 1)^k = \dots$ and then take the derivative. Note that f and $\tilde{f}_k$ are composed functions. With $\tilde{g}$ : R → R, x ↦ $x^2 + 1$, we find $f(x) = (\tilde{g}(x))^2$ and $\tilde{f}_k(x) = (\tilde{g}(x))^k$. Thus with $h_k(y) = y^k$ we have $f = h_2 \circ \tilde{g}$ and $\tilde{f}_k = h_k \circ \tilde{g}$. We aim to express for an arbitrary composed function f = h ∘ g its derivative by using those of h and g. Note that $h_k'(x) = kx^{k-1}$ is simple to calculate as is $\tilde{g}'(x) = 2x$.
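Example 7.2. Consider $\sqrt{\cdot} : \mathbb{R}_+ \to \mathbb{R}$, x ↦ $\sqrt{x}$. We want to calculate the derivative of $\sqrt{\cdot}$ at $x_0 \in \mathbb{R}_+$. Thus we have to look at
$$
\begin{aligned}
\frac{\sqrt{x} - \sqrt{x_0}}{x - x_0} &= \frac{(\sqrt{x} - \sqrt{x_0})(\sqrt{x} + \sqrt{x_0})}{(x - x_0)(\sqrt{x} + \sqrt{x_0})} \\
&= \frac{x - x_0}{(x - x_0)(\sqrt{x} + \sqrt{x_0})} \\
&= \frac{1}{\sqrt{x} + \sqrt{x_0}}.
\end{aligned}
$$
Assuming $\lim_{x \to x_0} \sqrt{x} = \sqrt{x_0}$ we find for $x_0 \neq 0$
$$(\sqrt{x})'\big|_{x = x_0} = \frac{1}{2\sqrt{x_0}};$$
or
$$(x^{1/2})' = \frac{1}{2x^{1/2}} = \frac{1}{2}x^{-1/2}, \quad x > 0. \tag{7.1}$$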

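Now we want to calculate $\frac{d}{dx}\sqrt{p(x)}$, where p : R → R is a differentiable function with R(p) ⊂ {x ∈ R | x > 0}.
Consider
$$
\begin{aligned}
\frac{\sqrt{p(x)} - \sqrt{p(x_0)}}{x - x_0} &= \frac{(\sqrt{p(x)} - \sqrt{p(x_0)})(\sqrt{p(x)} + \sqrt{p(x_0)})}{(x - x_0)(\sqrt{p(x)} + \sqrt{p(x_0)})} \\
&= \frac{p(x) - p(x_0)}{x - x_0} \cdot \frac{1}{\sqrt{p(x)} + \sqrt{p(x_0)}},
\end{aligned}
$$
and for $x \to x_0$ we find, assuming $\lim_{x \to x_0} \sqrt{p(x)} = \sqrt{p(x_0)}$, that
$$\lim_{x \to x_0} \frac{\sqrt{p(x)} - \sqrt{p(x_0)}}{x - x_0} = \frac{1}{2\sqrt{p(x_0)}}\,p'(x_0).$$
If we write for a moment $g(x) = \sqrt{x}$ the above result reads as
$$
\begin{aligned}
(g \circ p)'(x_0) &= \lim_{x \to x_0} \frac{g(p(x)) - g(p(x_0))}{x - x_0} \\
&= \lim_{x \to x_0} \frac{\sqrt{p(x)} - \sqrt{p(x_0)}}{x - x_0} \\
&= \frac{1}{2\sqrt{p(x_0)}}\,p'(x_0) = g'(p(x_0)) \cdot p'(x_0),
\end{aligned}
$$
where we used that $g'(x) = (\sqrt{x})' = \frac{1}{2\sqrt{x}}$, which we still need to prove.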
The previous example suggests the following general result:
$$(f \circ h)'(x) = f'(h(x)) \cdot h'(x)$$
and we are going to prove this now.

Theorem 7.3 (Chain rule). Let h : D → R be a differentiable function and let f : G → R, R(h) ⊂ G, be a further differentiable function. Then the composed function f ∘ h : D → R is differentiable and
$$(f \circ h)'(x) = f'(h(x)) \cdot h'(x), \quad x \in D. \tag{7.2}$$
Proof. First recall that by Corollary 6.10 both functions h and f are continuous. In particular we have
$$\lim_{x \to x_0} h(x) = h(x_0).$$
Now consider as a first attempt
$$\frac{f(h(x)) - f(h(x_0))}{x - x_0} = \frac{f(h(x)) - f(h(x_0))}{h(x) - h(x_0)} \cdot \frac{h(x) - h(x_0)}{x - x_0};$$
with $y = h(x)$, $y_0 = h(x_0)$ this reads as
$$\frac{f(h(x)) - f(h(x_0))}{x - x_0} = \frac{f(y) - f(y_0)}{y - y_0} \cdot \frac{h(x) - h(x_0)}{x - x_0}.$$
As $x \to x_0$ we know that
$$\lim_{x \to x_0} \frac{h(x) - h(x_0)}{x - x_0} = h'(x_0)$$
and since $x \to x_0$ implies $y = h(x) \to h(x_0) = y_0$ we have
$$\lim_{x \to x_0} \frac{f(h(x)) - f(h(x_0))}{h(x) - h(x_0)} = \lim_{y \to y_0} \frac{f(y) - f(y_0)}{y - y_0} = f'(y_0) = f'(h(x_0)),$$
which yields indeed
$$(f \circ h)'(x_0) = f'(h(x_0)) \cdot h'(x_0).$$
However, there is a problem: $h(x) - h(x_0) \neq 0$ need not be true. Indeed the term $h(x) - h(x_0)$ could be zero for infinitely many values of x. Thus we have to modify the proof. Define the function
$$f^*(y) := \begin{cases} \dfrac{f(y) - f(y_0)}{y - y_0} & \text{for } y \neq y_0 \\[2mm] f'(y_0) & \text{for } y = y_0. \end{cases} \tag{7.3}$$
Then we have
$$\lim_{y \to y_0} f^*(y) = f^*(y_0) = f'(y_0) \tag{7.4}$$
and further
$$f(y) - f(y_0) = f^*(y)(y - y_0). \tag{7.5}$$
Now it follows that
$$
\begin{aligned}
(f \circ h)'(x_0) &= \lim_{x \to x_0} \frac{f(h(x)) - f(h(x_0))}{x - x_0} \\
&= \lim_{x \to x_0} \frac{f^*(h(x))(h(x) - h(x_0))}{x - x_0} \\
&= \lim_{x \to x_0} f^*(h(x)) \cdot \lim_{x \to x_0} \frac{h(x) - h(x_0)}{x - x_0} \\
&= f'(h(x_0))\,h'(x_0),
\end{aligned}
$$
where we used that $\lim_{x \to x_0} h(x) = h(x_0)$ and therefore $\lim_{y \to y_0} f^*(y) = f'(y_0)$ implies $\lim_{x \to x_0} f^*(h(x)) = f'(h(x_0))$.
Example 7.4. A. In the situation of Example 7.1 we first find
$$\left((x^2 + 1)^2\right)' = 2(x^2 + 1) \cdot 2x = 4x^3 + 4x,$$
and more generally
$$\left((x^2 + 1)^k\right)' = k(x^2 + 1)^{k-1} \cdot 2x = 2xk(x^2 + 1)^{k-1}.$$
B. Let g : R → R, g(x) ≠ 0 for all x ∈ R, be differentiable. We want to find $\left(\frac{1}{g(\cdot)}\right)'(x)$. With $f(x) = \frac{1}{x}$ for x ≠ 0 the function x ↦ $\frac{1}{g(x)}$ is the composed function x ↦ (f ∘ g)(x).
Therefore we find
$$
\begin{aligned}
\left(\frac{1}{g(\cdot)}\right)'(x) &= (f \circ g)'(x) = f'(g(x)) \cdot g'(x) \\
&= -\frac{1}{g^2(x)} \cdot g'(x) = -\frac{g'(x)}{g(x)^2},
\end{aligned}
$$
i.e.
$$\left(\frac{1}{g}\right)' = -\frac{g'}{g^2}, \quad g \neq 0. \tag{7.6}$$
Thus for $g(x) = x^2 + 1$ we find
$$\left(\frac{1}{x^2 + 1}\right)' = \frac{-2x}{(x^2 + 1)^2}.$$
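C. Let g : R → R, g(x) ≠ 0 for all x ∈ R, be a differentiable function and let h : R → R be a further differentiable function. Then it follows using Leibniz's rule and (7.6) that
$$
\begin{aligned}
\left(\frac{h}{g}\right)'(x) &= \left(h \cdot \frac{1}{g}\right)'(x) \\
&= h'(x) \cdot \frac{1}{g(x)} + h(x)\left(\frac{1}{g}\right)'(x) \\
&= \frac{h'(x)}{g(x)} - \frac{h(x) \cdot g'(x)}{g^2(x)} \\
&= \frac{g(x) \cdot h'(x) - g'(x) \cdot h(x)}{g(x)^2}.
\end{aligned}
$$
This rule is often called the quotient rule.
For example we find
$$
\begin{aligned}
\left(\frac{x^3 - 7x}{x^2 + 3}\right)' &= \frac{(x^2 + 3)(3x^2 - 7) - 2x(x^3 - 7x)}{(x^2 + 3)^2} \\
&= \frac{x^4 + 16x^2 - 21}{x^4 + 6x^2 + 9}.
\end{aligned}
$$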
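The closed formulas of Example 7.4 can be compared with a numerical approximation of the derivative. This is a sketch under our own assumptions: it uses a symmetric difference quotient (not the one-sided quotient of Definition 6.2), a fixed step h = 1e-5 and the arbitrary exponent k = 5; agreement at a few sample points is a plausibility check, not a proof.

```python
import math


def central_difference(f, x, h=1e-5):
    """Symmetric difference quotient (f(x + h) - f(x - h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2 * h)


k = 5  # hypothetical sample exponent


def chain_example(x):
    return (x**2 + 1) ** k


def chain_formula(x):
    # Example 7.4.A: ((x^2 + 1)^k)' = 2xk(x^2 + 1)^(k - 1)
    return 2 * x * k * (x**2 + 1) ** (k - 1)


def quotient_example(x):
    return (x**3 - 7 * x) / (x**2 + 3)


def quotient_formula(x):
    # Example 7.4.C: (x^4 + 16x^2 - 21) / (x^4 + 6x^2 + 9)
    return (x**4 + 16 * x**2 - 21) / (x**4 + 6 * x**2 + 9)


for x in (-1.0, 0.5, 2.0):
    for f, formula in ((chain_example, chain_formula), (quotient_example, quotient_formula)):
        approx = central_difference(f, x)
        exact = formula(x)
        print(f.__name__, x, approx, exact, math.isclose(approx, exact, rel_tol=1e-6, abs_tol=1e-6))
```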
We may use the chain rule to determine the derivative of the inverse function of f : R → R provided it exists. Since $f^{-1}(f(x)) = x$ we find by the chain rule
$$\left(f^{-1} \circ f\right)'(x) = (f^{-1})'(f(x)) \cdot f'(x) = (x)' = 1.$$
In the case where $f'(x) \neq 0$ we find
$$\left(f^{-1}\right)'(f(x)) = \frac{1}{f'(x)}$$
or with f(x) = y, i.e. $x = f^{-1}(y)$, we get
$$\left(f^{-1}\right)'(y) = \frac{1}{f'(f^{-1}(y))}$$
or putting $\varphi(y) := f^{-1}(y)$:
$$\varphi'(y) = \frac{1}{f'(\varphi(y))}.$$
This calculation has some critical points, but it paves the way to prove:
Theorem 7.5. Let D ⊂ R be a closed interval and let f : D → R be an injective function, i.e. f : D → R(f) is bijective. Suppose that f is differentiable at the point $x_0 \in D$ and that $f'(x_0) \neq 0$. Then the inverse function
$$\varphi := f^{-1} : R(f) \to \mathbb{R}$$
is differentiable at $y_0 := f(x_0)$ and we have
$$\varphi'(y_0) = \frac{1}{f'(x_0)} = \frac{1}{f'(\varphi(y_0))}. \tag{7.7}$$
We will provide a complete proof of this theorem later in our course but for
the moment we take this result for granted.
Example 7.6. Let $f : \mathbb{R}_+ \to \mathbb{R}_+$, x ↦ $x^2$. For x ≠ 0 we have $f'(x) = 2x \neq 0$. The inverse function $f^{-1}$ is of course $f^{-1} : \mathbb{R}_+ \to \mathbb{R}_+$, x ↦ $\sqrt{x}$. From (7.7) we derive with $\sqrt{y_0} = x_0$, i.e. $y_0 = x_0^2$, that
$$(\sqrt{y_0})' = \frac{1}{2x_0} = \frac{1}{2\sqrt{y_0}}$$
confirming our previous result.
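Formula (7.7) is easy to evaluate once the inverse function and the derivative of f are available. The helper below is a sketch of ours (the name `inverse_derivative` and its arguments are hypothetical); it takes both as given and does not construct the inverse itself.

```python
import math


def inverse_derivative(f_prime, phi, y0):
    """Evaluate (7.7): phi'(y0) = 1 / f'(phi(y0)), where phi is the inverse of f."""
    slope = f_prime(phi(y0))
    if slope == 0:
        raise ZeroDivisionError("f' vanishes at phi(y0), so (7.7) does not apply")
    return 1.0 / slope


# Example 7.6: f(x) = x^2 on the positive half line, phi(y) = sqrt(y).
y0 = 9.0
from_formula = inverse_derivative(lambda x: 2 * x, math.sqrt, y0)
direct = 1.0 / (2.0 * math.sqrt(y0))  # the value given by (7.1)
print(from_formula, direct)
```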

We close this chapter by providing an example of a continuous function which
is not differentiable.
Example 7.7. The function | · | : R → R is not differentiable at $x_0 = 0$. Consider the quotient
$$\frac{|x| - |x_0|}{x - x_0} = \frac{|x|}{x} = \begin{cases} 1, & \text{for } x > 0 \\ -1, & \text{for } x < 0. \end{cases}$$
Suppose that $\lim_{\substack{x \to 0 \\ x \neq 0}} \frac{|x|}{x} = a$ for some a ∈ R. Then for $\varepsilon = \frac{1}{2}$ there exists δ > 0 such that for all x ∈ R with 0 < |x| < δ, i.e. −δ < x < δ, x ≠ 0, it follows that $\left|\frac{|x|}{x} - a\right| < \frac{1}{2}$. In particular, for −δ < x < 0 we have $|-1 - a| < \frac{1}{2}$ and for 0 < x < δ we have $|1 - a| < \frac{1}{2}$, i.e.
$$-\frac{1}{2} < -1 - a < \frac{1}{2} \quad \text{and} \quad -\frac{1}{2} < 1 - a < \frac{1}{2},$$
implying that $-\frac{3}{2} < a < -\frac{1}{2}$ and $\frac{1}{2} < a < \frac{3}{2}$, which is a contradiction. Therefore x ↦ |x| is not differentiable at $x_0 = 0$.
The continuity of x ↦ |x| at $x_0 = 0$ is trivial. We just need to prove that $\lim_{x \to 0} |x| = 0$, i.e. given ε > 0 we need to find δ > 0 such that |x| < δ implies ||x| − |0|| = |x| < ε. Thus δ = ε will do.
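The two one-sided behaviours used in Example 7.7 are visible directly when the quotient |x|/x is tabulated near 0. The sample points below are our own choice; the snippet only illustrates the case distinction made in the text.

```python
def quotient_at_zero(x):
    """Difference quotient (|x| - |0|) / (x - 0) for x != 0."""
    return abs(x) / x


# Approach 0 from the right and from the left along x = +-10^(-n).
right = [quotient_at_zero(10.0**-n) for n in range(1, 8)]
left = [quotient_at_zero(-(10.0**-n)) for n in range(1, 8)]

print(right)  # all values equal 1.0
print(left)   # all values equal -1.0
```

Since the quotient stays at 1 on one side and at −1 on the other, no single number a can be within 1/2 of both, which is the contradiction derived above.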
Problems
To solve these problems knowledge of derivatives of rational functions and the square root function x ↦ $\sqrt{x}$, x > 0, may be used. Moreover, while solving these problems results from previous questions may be used without proof or justification.
1. Consider the function $h_k$ : (0, ∞) → R, k ∈ N, $h_k(x) = \sqrt{x^k}$. Find $\frac{d}{dx}h_k(x)$.
2. Find the derivatives of the following functions:

k
i) f : R −→ R, f (x) = (1 + x2 )− 2 , k ∈ N;

ii) g : R \ {0} −→ R, g(y) = 1 + y14 ;

z4
iii) h : R −→ R, h(z) = 1+z 2.
3. Find the derivatives of the following functions:

i) f : R → R, $f(u) = \frac{3u^5 - 7u^9}{1 + u^6 + u^8}$;
ii) g : R → R, $g(v) = \frac{(1 + v^2)^{\frac{1}{2}}}{(5 + v^2)^{\frac{7}{2}}}$;
√
z 5 −2z 4
iii) h : (0, ∞) −→ R, h(z) = 12+z 2 (1+z 3 )
.
4. The function f : (0, ∞) → (0, ∞), x ↦ $x^k$, k ∈ N, is bijective and $f'(x) = kx^{k-1} \neq 0$ for all x ∈ (0, ∞). Find the derivative of its inverse function $f^{-1}$.
1
5. In
√ the following denote the inverse function of x → xk by x → x k =
k
x, x > 0 for k ∈ N. Find the derivatives of:
1
i) f : R −→ R, f (s) = (1 + s2 ) k ;
√
1+t4
ii) g : R −→ R, g(t) = √
5
1+t6 +t8
;
1
iii) h : (0, ∞) −→ R, h(u) = u 7 .
1+u2
1+u4
l 1 √
6. For x > 0, k ∈ N and l ∈ N0 we set x k = xl · x k = xl k x. Find the
derivatives of: 3
l 2 )− 2
i) f : (0, ∞) −→ R, f (x) = x k ; ii) g : R −→ R, g(s) = (1+s
(1+s )4 5 .
7. Let p, q : R → R be two polynomials such that q(x) ≠ 0 for all x ∈ R and $\frac{p(x)}{q(x)} > 2$ for all x ∈ R. Find
$$\frac{d}{dx}\sqrt{\frac{p(x)}{q(x)} - 2}.$$
8. Find the derivative of g : (−1, 1) → R, where $g(t) = (t^2 - 1)(2t + 3)^{\frac{1}{2}}$.
9. Let f : (0, 1) → (2, 3) and h : (2, 3) → (3, 4) be two bijective and differentiable functions such that $f'(x) \neq 0$ for all x ∈ (0, 1) and $h'(y) \neq 0$ for all y ∈ (2, 3). For z ∈ (3, 4) find the derivative of $(h \circ f)^{-1}(z)$.
10. Let $p(x) = \sum_{k=0}^{m} a_k x^k$ be a polynomial and u : R → R be a differentiable function. i) Find $\frac{d}{dx}p(u(x))$. ii) Find $\frac{d}{dx}u(p(x))$. iii) Suppose that u(x) ≠ 0 for all x ∈ R. Find $\frac{d}{dx}\frac{1}{u(p(x))}$.
8 The Derivative as a Tool to Investigate Functions
In this chapter we discuss how to use the derivative to investigate functions.
We will give some motivation for the results and statements, but we postpone
most of the proofs until Part 2 of our course. The reason is simple: all proofs
will require a deeper understanding of the concept of a limit. However it is
helpful to introduce at an early stage certain useful tools. In fact this is the
main justification for a calculus course preceding a rigorous analysis course.
Example 8.1. Consider the function f corresponding to the given graph Γ(f):
[Figure 8.1: graph Γ(f) with the points x0 and x1 marked on the x-axis]
It looks like the function is unbounded but at x0 the function has a local maximum and at x1 it has a local minimum.
We want to find criteria for these properties to hold. For this we first need
some definitions.
Definition 8.2. A function f : D → R is said to be bounded if there exists
M ≥ 0 such that |f (x)| ≤ M for all x ∈ D.
Example 8.3. A. The function χA : R → R is for every set A ⊂ R bounded
since |χA (x)| ≤ 1 for all x ∈ R.
B. The function | · | : R → R, x → |x| is unbounded. Indeed suppose there
exists M ≥ 0 such that |x| ≤ M for all x ∈ R. Then for x = M + 1 we would
find
|M + 1| = M + 1 ≤ M
which is a contradiction.
Definition 8.4. Let f : (a, b) → R be a function, a < b. We say that f has a local maximum at $x_0 \in (a, b)$ (a local minimum at $x_1 \in (a, b)$) if there exists ε > 0 such that $f(x_0) \geq f(y)$ for all y ∈ (a, b) satisfying $|x_0 - y| < \varepsilon$ ($f(x_1) \leq f(y)$ for all y ∈ (a, b) such that $|x_1 - y| < \varepsilon$).
In the case that f has either a local maximum or a local minimum at $x_2 \in (a, b)$ we just speak of a local extreme value or a local extremum at $x_2$.
Of central importance is:
Theorem 8.5. Suppose that f : (a, b) → R has a local extremum at $x_0 \in (a, b)$. If f is differentiable at $x_0$ then $f'(x_0) = 0$.
This result fits well with our intuition. Look at the graph:
[Figure 8.2: graph of a function with the points x0 and x1 marked on the x-axis]
The function f has a local maximum at x0 and a local minimum at x1. At these points we expect there to be a horizontal tangent, i.e. a tangent with slope zero.
Example 8.6. A. Consider the parabola f : R → R, x ↦ $(x - \alpha)^2 + \beta$. It is differentiable for all x ∈ R with derivative $f'(x) = 2(x - \alpha)$, thus $f'(x_0) = 0$ if and only if $x_0 = \alpha$. If we restrict f to any interval (a, b) such that α ∈ (a, b) then according to Theorem 8.5 the function $f|_{(a,b)}$ might have a local extreme value at $x_0 = \alpha$. In this example the statement is of course easy to prove without using the derivative. Since $(x - \alpha)^2 \geq 0$ for all x ∈ R it follows that f(x) ≥ β for all x ∈ R but for x = α we have f(α) = β implying that there is a (local) minimum at $x_0 = \alpha$.
B. Consider the function g : (−N, M) → R, M, N ∈ N, x ↦ $x^3$.
The only zero of $g'(x) = 3x^2$ is $x_0 = 0$. Now g(0) = 0, but g(x) < 0 for x < 0 and g(x) > 0 for x > 0. Hence the function has no local extreme value at $x_0 = 0$. This is obvious from its graph:
[Figure 8.3: graph of $x \mapsto x^3$]
This example shows that Theorem 8.5 is a necessary but not a sufficient condition for a local extreme value.
C. The function | · | : R → R, x → |x| is for all x > 0 or x < 0 strictly
positive whereas |0| = 0. Thus at x0 = 0 it has a local minimum. However
the absolute value is not differentiable at x0 = 0, compare Example 7.7. Thus
we cannot apply Theorem 8.5.
Theorem 8.5 only gives a necessary condition for local extreme values to
exist. We want to find now sufficient criteria for local maxima and minima.
It turns out that for this we need higher order derivatives.
Let f : D → R be a function such that $f'(x)$ exists for all x ∈ D. Then we can consider $f'$ as a new function $f'$ : D → R, x ↦ $f'(x)$. Next we may ask whether $f'$ has at $x_0 \in D$ a derivative, i.e. whether
$$\lim_{\substack{x \to x_0 \\ x \neq x_0}} \frac{f'(x) - f'(x_0)}{x - x_0} = f''(x_0) \tag{8.1}$$
exists. When it does we call $f''(x_0)$ the second derivative of f at $x_0$. Instead of $f''(x_0)$ the notations $f^{(2)}(x_0)$ or $\frac{d^2}{dx^2}f(x_0)$ are common.
Of course we may iterate the process and define
$$
\begin{aligned}
\frac{d^k f}{dx^k}(x_0) = f^{(k)}(x_0) &= \left(f^{(k-1)}\right)'(x_0) \\
&= \lim_{\substack{x \to x_0 \\ x \neq x_0}} \frac{f^{(k-1)}(x) - f^{(k-1)}(x_0)}{x - x_0}
\end{aligned}
$$
as the k-th derivative of f at $x_0$. By definition: $f^{(0)} = \frac{d^0 f}{dx^0} = f$.
Note that the definition of higher order derivatives is a definition by recursion:
$$\frac{d^k}{dx^k}f(x) := \frac{d}{dx}\left(\frac{d^{k-1}}{dx^{k-1}}f(x)\right).$$
Example 8.7. A. Consider f : R → R, x ↦ $x^2$. Then we find $f'(x) = 2x$, $f''(x) = 2$ and $f^{(3)}(x) = 0$, hence $f^{(k)}(x) = 0$ for k ≥ 3.
B. Consider g : (0, ∞) → R, x ↦ $x^{-1}$. We find
$$g'(x) = -1 \cdot x^{-2}, \quad g''(x) = (-1)(-2)x^{-3}, \quad g^{(3)}(x) = (-1)(-2)(-3)x^{-4}.$$
Clearly we may extend our rules for taking derivatives to higher order deriva-
tives. Here are some of the simple ones:
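$$\frac{d^k}{dx^k}(f \pm g) = \frac{d^k f}{dx^k} \pm \frac{d^k g}{dx^k} \tag{8.2}$$
and
$$\frac{d^k}{dx^k}(cf) = c\,\frac{d^k f}{dx^k}. \tag{8.3}$$
However the following rule is not so simple:
$$\frac{d^k}{dx^k}(f \cdot g) = \sum_{l=0}^{k} \binom{k}{l} \frac{d^{k-l} f}{dx^{k-l}}\,\frac{d^l g}{dx^l}, \tag{8.4}$$
where $\binom{k}{l}$ denote the binomial coefficients. We will return to this formula in Part 2, see Problem 2 in Chapter 21. Here is the above rule in its simplest form, i.e. when k = 2:
$$
\begin{aligned}
\frac{d^2}{dx^2}(f \cdot g) &= \frac{d}{dx}\left(\frac{d}{dx}(f \cdot g)\right) \\
&= \frac{d}{dx}(f'g + fg') \\
&= f''g + f'g' + f'g' + fg'' \\
&= f''g + 2f'g' + fg'' \\
&= \binom{2}{0}\frac{d^2 f}{dx^2} \cdot g + \binom{2}{1}\frac{df}{dx}\,\frac{dg}{dx} + \binom{2}{2} f\,\frac{d^2 g}{dx^2}.
\end{aligned}
$$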
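For polynomials, rule (8.4) can be checked exactly with integer arithmetic. The sketch below stores a polynomial as its list of coefficients (an assumption of ours, with index j holding the coefficient of x^j) and compares the k-th derivative of a product with the right-hand side of (8.4); the two sample polynomials are arbitrary.

```python
from math import comb


def deriv(p, k=1):
    """k-th derivative of the polynomial with coefficient list p."""
    for _ in range(k):
        p = [j * a for j, a in enumerate(p)][1:]
    return p


def mul(p, q):
    """Product of two polynomials given as coefficient lists."""
    if not p or not q:
        return []
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def add(p, q):
    """Sum of two polynomials given as coefficient lists."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]


def trim(p):
    """Drop trailing zero coefficients so that equal polynomials compare equal."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def leibniz(f, g, k):
    """Right-hand side of (8.4) for polynomials f and g."""
    total = []
    for l in range(k + 1):
        term = [comb(k, l) * c for c in mul(deriv(f, k - l), deriv(g, l))]
        total = add(total, term)
    return trim(total)


f = [1, 2, 3]     # 1 + 2x + 3x^2
g = [0, 1, 0, 4]  # x + 4x^3
for k in range(7):
    print(k, trim(deriv(mul(f, g), k)) == leibniz(f, g, k))
```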
Now let us return to our original problem.
Theorem 8.8. Let f : (a, b) → R be a differentiable function. Suppose that f has a second order derivative at $x_0 \in (a, b)$. If $f'(x_0) = 0$ and $f''(x_0) < 0$ then f has a local maximum at $x_0$. If $f'(x_0) = 0$ and $f''(x_0) > 0$ then f has a local minimum at $x_0$.
This is sometimes referred to as the second derivative test. We will later, in Part 2, find a geometric interpretation of this result, compare with Remark 23.3.
Example 8.9. Again we look at f : (a, b) → R, x ↦ $(x - \alpha)^2 + \beta$, α ∈ (a, b). We know already that $f'(\alpha) = 0$ and α is the only zero of $f'$. In addition we find $f''(x) = \frac{d}{dx}(2(x - \alpha)) = 2$. Hence $f''(\alpha) > 0$ and f has a local minimum at α.
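The second derivative test of Theorem 8.8 can be written as a small decision procedure. In this sketch the derivatives are passed in as functions, and the tolerance `tol` used to decide whether f'(x0) vanishes is our own assumption for working with floating point numbers.

```python
def second_derivative_test(f1, f2, x0, tol=1e-12):
    """Classify x0 using Theorem 8.8; f1 and f2 are the first and second derivative."""
    if abs(f1(x0)) > tol:
        return "not a critical point"
    curvature = f2(x0)
    if curvature < 0:
        return "local maximum"
    if curvature > 0:
        return "local minimum"
    # f''(x0) = 0: Theorem 8.8 makes no statement in this case.
    return "inconclusive"


alpha = 1.0  # hypothetical sample value
# Example 8.9: f(x) = (x - alpha)^2 + beta, f'(x) = 2(x - alpha), f''(x) = 2.
print(second_derivative_test(lambda x: 2 * (x - alpha), lambda x: 2.0, alpha))
# Example 8.6.B: g(x) = x^3, g'(x) = 3x^2, g''(x) = 6x.
print(second_derivative_test(lambda x: 3 * x**2, lambda x: 6 * x, 0.0))
```

For x ↦ x³ the test returns "inconclusive", which matches Example 8.6.B: the derivative vanishes at 0 although there is no local extreme value.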
The following result, called the mean value theorem, is useful to study functions in more detail.
Theorem 8.10. Let f : [a, b] → R be a continuous function differentiable in (a, b). Then there exists ξ ∈ (a, b) such that
$$f(b) - f(a) = f'(\xi)(b - a). \tag{8.5}$$
Writing (8.5) as $\frac{f(b) - f(a)}{b - a} = f'(\xi)$ we get the following intuitive graphical representation (note that both dotted lines are parallel, i.e. they have the same slope):
[Figure 8.4: graph of f over [a, b] with two parallel dotted lines, one of slope $\frac{f(b) - f(a)}{b - a}$ and one of slope $f'(\xi)$ at the point ξ]
Remark 8.11. When proving the mean value theorem later in Part 2 of our
course, compare with Corollary 22.6, we will carefully discuss the importance
of each of the assumptions in the above theorem.
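For a concrete function the point ξ in (8.5) can sometimes be located numerically. The following sketch uses bisection on f'(ξ) minus the secant slope; this search is our own device and works only under extra assumptions not made in the theorem (f' continuous, and a sign change of the difference between a and b).

```python
import math


def mean_value_point(f, f_prime, a, b, iterations=200):
    """Search for xi in [a, b] with f'(xi) = (f(b) - f(a)) / (b - a) by bisection."""
    slope = (f(b) - f(a)) / (b - a)

    def g(x):
        return f_prime(x) - slope

    lo, hi = a, b
    if g(lo) * g(hi) > 0:
        raise ValueError("no sign change of f' - slope; this simple search does not apply")
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if g(lo) * g(mid) <= 0:
            hi = mid  # a zero of g lies in [lo, mid]
        else:
            lo = mid  # otherwise it lies in [mid, hi]
    return (lo + hi) / 2


# f(x) = x^3 on [0, 3]: the secant slope is 9, and 3 * xi^2 = 9 gives xi = sqrt(3).
xi = mean_value_point(lambda x: x**3, lambda x: 3 * x**2, 0.0, 3.0)
print(xi, math.sqrt(3.0))
```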
The mean value theorem has important consequences:
Corollary 8.12. Suppose that f : [a, b] → R fulfils the assumptions of the mean value theorem. Further suppose that $m \leq f'(\eta) \leq M$ for all η ∈ (a, b). Then we have the estimates
$$m(x - y) \leq f(x) - f(y) \leq M(x - y) \tag{8.6}$$
for all x, y ∈ (a, b), y ≤ x.
Proof. We may apply the mean value theorem to $f|_{[y,x]}$ to find first
$$f(x) - f(y) = f'(\xi)(x - y)$$
for some ξ ∈ (y, x), ξ = ξ(x, y). Since $f'(\xi) \geq m$ and x − y ≥ 0 this implies
$$m(x - y) \leq f(x) - f(y).$$
Further, since $f'(\xi) \leq M$ and x − y ≥ 0 we find in addition
$$f(x) - f(y) \leq M(x - y).$$
Corollary 8.13. Suppose that f : [a, b] → R is a function satisfying the assumptions of the mean value theorem. If $f'(x) = 0$ for all x ∈ (a, b) then f is constant, i.e. there exists c ∈ R such that f(x) = c for all x ∈ [a, b].
Proof. Using (8.6) with m = M = 0 we find f(x) = f(y) for all x, y ∈ (a, b), i.e. $f(x) = c := f(x_0)$ for x ∈ [a, b] and some fixed $x_0 \in [a, b]$.
Finally we discuss monotone functions.
Definition 8.14. Let f : D → R be a function, D ⊂ R. We call f
increasing if x, y ∈ D and x < y implies f (x) ≤ f (y);

strictly increasing if x, y ∈ D and x < y implies f (x) < f (y);
decreasing if x, y ∈ D and x < y implies f (x) ≥ f (y);
strictly decreasing if x, y ∈ D and x < y implies f (x) > f (y).
A function satisfying one of these conditions is called monotone.
Some authors prefer to call increasing functions non-decreasing and strictly

increasing functions just increasing as well as decreasing functions non-increasing
and strictly decreasing functions just decreasing.
Example 8.15. A. The function $\chi_{(0,\infty)}$ : R → R is increasing but not strictly increasing. This is most easily seen by looking at its graph:
[Figure 8.5: graph of $\chi_{(0,\infty)}$]
B. The function $f_a$ : R → R, x ↦ ax is strictly increasing for a > 0 and strictly decreasing for a < 0. Indeed, a > 0 implies for x < y that ax < ay, whereas a < 0 implies for x < y that ax > ay.
C. The function g : R → R, x → x2 is not monotone as is seen from

its graph, or verified by an easy calculation. However g|(a,b) is for every
(a, b) ⊂ R+ strictly increasing, and g|(c,d) is for every (c, d) ⊂ R\R+ strictly
decreasing.
Theorem 8.16. Let f : R → R be continuous and differentiable on (a, b). We then have the following statements: if
$$f'(x) \geq 0 \text{ for all } x \in (a, b) \text{ then } f \text{ is increasing;} \tag{8.7}$$
$$f'(x) > 0 \text{ for all } x \in (a, b) \text{ then } f \text{ is strictly increasing;} \tag{8.8}$$
$$f'(x) \leq 0 \text{ for all } x \in (a, b) \text{ then } f \text{ is decreasing;} \tag{8.9}$$
$$f'(x) < 0 \text{ for all } x \in (a, b) \text{ then } f \text{ is strictly decreasing.} \tag{8.10}$$
For a proof we refer to Part 2, Theorem 22.13.
Example 8.17. Consider f : R → R, x ↦ $x^5$. Since $f'(x) = 5x^4$ for all x ∈ R it follows that $f'(x) \geq 0$ for all x ∈ R. In fact $f'(x) > 0$ for all x ∈ R\{0}. Hence f is increasing, in fact strictly increasing. The latter is clear on (−∞, 0) and (0, ∞), and since f(0) = 0 it follows also f(x) < f(0) for x < 0 and f(0) < f(y) for 0 < y.
Problems
1. a) Let f : D1 −→ R and g : D2 −→ R be two functions such
that f (D1 ) ⊂ D2 . Suppose that g is bounded with bound M ≥ 0, i.e.
|g(x)| ≤ M for all x ∈ D2 . Prove that g ◦ f : D1 −→ R is bounded and
find a bound for g ◦ f .
b) Consider f : (1, 2) → R, x ↦ $(x - 1)^2$ and g : (0, ∞) → R, y ↦ $\frac{1}{y}$. Show that f is bounded and that f((1, 2)) ⊂ (0, ∞). Is the function g ∘ f : (1, 2) → R bounded?
c) Give an example of a continuous function f : (a, b) → R, a < b, with the property that for all $a_1$ and $b_1$ such that $a < a_1 < b_1 < b$ the function $f|_{[a_1,b_1]}$ is bounded but f is unbounded.
2. Let p : R → R be a polynomial of degree $k \in \mathbb{N}_0$. Prove that if $n \geq \frac{k}{2}$ then the function $f(x) = \frac{p(x)}{(1 + x^2)^n}$ is bounded on R.
3. Find the following derivatives:
a) $\frac{d^2}{dx^2}\left(\frac{x^3 + 2x - 5}{5x - 1}\right)$, $x \neq \frac{1}{5}$; b) $\frac{d^3}{dt^3}\left(\sqrt{t^4 + 1}\right)$;
c) $\frac{d}{ds}\left(\frac{|s|}{s^2 + 4}\right)$. Does the function s ↦ $\frac{|s|}{s^2 + 4}$ have a second derivative for s = 0?
4. Let u, v : R → R be two twice differentiable functions. Find
$$\frac{d^2}{dx^2}\left((u^2(x) + 1)(v^2(x) + 1)^{-1}\right).$$
5. Let f : (a, b) → R and g : (c, d) → R be two twice differentiable functions and suppose that f((a, b)) ⊂ (c, d). Prove that
$$\frac{d^2}{dx^2}(g \circ f)(x) = g''(f(x))(f'(x))^2 + g'(f(x))f''(x).$$
Now find $\frac{d^2}{dt^2}\left((1 + f^2(t))^{-\frac{1}{2}}\right)$ where f : R → R is twice differentiable.
6. Find

d2 1 x2
where u : R −→ R, u(x) = √ .
dx2 (u (x) + 2)2
2
1 + x2
7. Prove that for $n \in \mathbb{N}_0$ there exists a polynomial $p_n$ of degree k ≤ n such that
$$\frac{d^n}{dx^n}\left(\frac{1}{1 + x^2}\right) = \frac{p_n(x)}{(1 + x^2)^{n+1}}.$$
Now deduce that there exists a constant $c_n \geq 0$ such that
$$\left|\frac{d^n}{dx^n}\left(\frac{1}{1 + x^2}\right)\right| \leq \frac{c_n}{(1 + x^2)^{\frac{n+2}{2}}}.$$
Hint: a) Use mathematical induction, b) Use Problem 2 of this chapter.
8. Find the local extreme values of:
a) f : R → R, $f(x) = |x|^3$; b) g : R → R, $g(s) = \frac{s^2 - 2s}{2 + 3s^2}$;
c) h : (−1, 1) → R, $h(u) = (1 + u)\sqrt{1 - u^2}$.
9. a) The function g : (−1, 1) → R, x ↦ $x^2$, has a minimum at $x_0 = 0$. Find a function f : (−1, 1) → R, f(t) ≥ 0 for all t ∈ (−1, 1), such that f is non constant and f ∘ g has a maximum at $x_0 = 0$.
b) Let f : R → R and suppose that f has a local maximum at $x_0 \in \mathbb{R}$. Let c ∈ R. Prove that h : R → R, h(x) = f(x − c), has a local maximum at $c + x_0$.
10. a) Suppose that $\sin'(x) := \frac{d}{dx}(\sin x) = \cos x$ for all x ∈ R and suppose that |cos x| ≤ 1. Use the mean value theorem to deduce that |sin x| ≤ |x| knowing that sin 0 = 0.
b) Consider g : [1, 2] → R, $g(x) = \sqrt{x}$. Deduce that $\left|\frac{d}{dx}g(x)\right| \leq \frac{1}{2}$ for all x ∈ [1, 2]. Now use the mean value theorem or its corollaries to estimate $\sqrt{\frac{11}{10}}$ by
$$\frac{19}{20} \leq \sqrt{\frac{11}{10}} \leq \frac{21}{20}.$$
11. a) For $n \in \mathbb{N}_0$ define $\chi_n := \chi_{[n,\infty)}$ to be the characteristic function of [n, ∞). For N ∈ N define
$$X_N(t) := \sum_{n=0}^{N} \chi_n(t).$$
Sketch the graph of $X_5$. Is $X_N$ increasing?
b) Consider $f_a : \mathbb{R}_+ \to \mathbb{R}$, x ↦ $\frac{x}{1 + ax^2}$, where a > 0 is a fixed constant. Determine the largest subset of $\mathbb{R}_+$ where $f_a$ is decreasing and the subset where $f_a$ is increasing.
12. a) Let f : (a, b) −→ R and g : (c, d) −→ R be two monotone
increasing functions such that g((c, d)) ⊂ (a, b). Prove that f ◦ g :
(c, d) −→ R is monotone increasing.
b) Let f, g : R → R be differentiable functions. Prove that if $f'$ and $g'$ are either both positive or both negative valued functions then f ∘ g and g ∘ f are monotone increasing.
13. Let f, g : [a, b] → R, a < b, be continuous and differentiable on (a, b). Suppose that f(a) = g(a) and $0 \leq f'(x) < g'(x)$ for all x ∈ (a, b). Prove that f(x) < g(x) for all x ∈ (a, b).
Hint: use the mean value theorem with h = g − f.
9 The Exponential and Logarithmic Functions

The functions we will introduce in this and the following chapters i.e. ex-
ponential and logarithmic functions, trigonometric functions and hyperbolic
functions are the so-called elementary transcendental functions. Their
definition requires more than just algebraic operations. In fact even the
existence of these functions requires a proof. One way to introduce the expo-
nential function is to consider it as the (unique) solution to a simple initial
value problem for a first order differential equation. We will later on prove
Theorem 9.1. There exists a function f : R −→ R such that
\[ f'(x) = f(x) \ \text{ for all } x \in \mathbb{R} \quad\text{and}\quad f(0) = 1. \tag{9.1} \]
Definition 9.2. The function f in Theorem 9.1 is called the exponential
function and is denoted by exp, i.e. exp : R −→ R, exp′ = exp and exp(0) = 1.
Lemma 9.3. For all x ∈ R, exp(x) ≠ 0 and
\[ \exp(-x) = \frac{1}{\exp(x)} = (\exp(x))^{-1}. \]
Proof. For f = exp we find
\[ \frac{d}{dx}\bigl(f(x)f(-x)\bigr) = f'(x)f(-x) + f(x)\bigl(-f'(-x)\bigr) = f(x)f(-x) - f(x)f(-x) = 0. \]
Therefore we know that with some c ∈ R
\[ f(x)f(-x) = c \quad\text{for all } x \in \mathbb{R}. \tag{9.2} \]
But for x = 0 we find $c = (f(0))^2 = 1$. Now it follows from (9.2) that
\[ f(x)f(-x) = 1, \]
in particular f(x) ≠ 0, and $\frac{1}{f(x)} = f(-x)$, or with f = exp
\[ \exp(-x) = \frac{1}{\exp(x)}. \tag{9.3} \]
Lemma 9.4. The function exp is unique.

Proof. Suppose that f and g both satisfy (9.1). Then by the previous lemma
$\frac{g}{f}$ is defined and we have
\[ \frac{d}{dx}\Bigl(\frac{g}{f}\Bigr)(x) = \frac{g'(x)f(x) - f'(x)g(x)}{f(x)^2} = \frac{g(x)f(x) - f(x)g(x)}{f(x)^2} = 0, \]
implying $\frac{g}{f}(x) = K$ for all x ∈ R and some K ∈ R, or g(x) = Kf (x). Since
g(0) = f (0) = 1 we find g(0) = 1 = Kf (0) = K, i.e. K = 1 and f = g.
Before we can proceed further we state without proof (which we will provide
later, in Part 2, Theorem 20.17) the intermediate value theorem:
Theorem 9.5. Let f : [a, b] −→ R be a continuous function and set α :=

f (a) and β := f (b). Suppose α < γ < β. Then there exists x0 ∈ (a, b) such
that f (x0 ) = γ. In the case where β < γ < α we get the same conclusion.
The intermediate value theorem applied to exp implies that exp(x) > 0 for
all x ∈ R. Indeed, suppose that there is x1 ∈ R such that exp(x1 ) < 0. Since
exp(0) = 1 > 0 and exp is continuous, we conclude that there must be x0 ∈ (x1 , 0)
if x1 < 0 or x0 ∈ (0, x1 ) if x1 > 0, such that exp(x0 ) = 0, which is impossible
by Lemma 9.3. Hence exp(x) > 0 for all x ∈ R.
Lemma 9.6. The exponential function is strictly positive and strictly increasing.
Proof. It remains to prove that exp is strictly increasing. But exp′(x) =
exp(x) > 0, implying the result.
The following result is very important:
Lemma 9.7 (Functional equation for exp). For all x, y ∈ R we have
exp(x + y) = exp(x) exp(y). (9.4)
Proof. For y ∈ R fixed we consider the function x −→ g(x) := exp(y + x). It
follows that
\[ g'(x) = (\exp(y + x))' = \exp(y + x) = g(x), \]
hence g(x) = K exp(x) for some K ∈ R, compare with the proof of Lemma
9.4. Now, with x = 0 we find
exp(y) = g(0) = K exp(0) = K,
or
exp(x + y) = g(x) = exp(x) exp(y)
proving the lemma.
A function f : R → R is said to solve Cauchy’s functional equation if
f (x + y) = f (x)f (y) for all x, y ∈ R. In this sense exp is a solution
to Cauchy’s functional equation. Note that exp is not the only solution to
this functional equation: for every c ∈ R the function x −→ exp(cx) is a
continuous solution, and so is the zero function. However, exp is the only
differentiable solution f with f′(0) = 1.
We define the Euler number e by
e := exp(1). (9.5)
Since exp is strictly increasing we have e > 1.
Corollary 9.8. For all n ∈ N we have
\[ \exp(n) = e^n. \tag{9.6} \]
Proof. For n = 1 there is nothing to prove. Suppose that $\exp(n) = e^n$ for
some n ∈ N. Then it follows that
\[ \exp(n + 1) = \exp(n)\exp(1) = e^n e = e^{n+1}. \]
The principle of mathematical induction now yields the corollary.
Using (9.3) we deduce from (9.6) that for m ∈ N
\[ \exp(-m) = e^{-m} = \frac{1}{e^m}. \tag{9.7} \]
It is possible to justify for all x ∈ R
\[ \exp(x) = e^x. \tag{9.8} \]
In particular we have
\[ e^{x+y} = e^x e^y \quad\text{and}\quad e^0 = 1. \tag{9.9} \]
We know that exp is strictly increasing and exp(x) > 0 for all x ∈ R. Assume
for a moment that R(exp) = {x ∈ R|x > 0}. Then we know that exp : R −→
{x ∈ R|x > 0} is bijective and has a differentiable inverse, i.e. there exists a
function ln : {x ∈ R | x > 0} −→ R, x −→ ln x, with the properties
ln(exp x) = x for x ∈ R (9.10)
and
exp(ln y) = y for y > 0. (9.11)
We call ln the (natural) logarithm. For its derivative we find using (7.7)
that
\[ \frac{d}{dy}\ln y = \frac{1}{\exp'(\ln y)} = \frac{1}{\exp(\ln y)} = \frac{1}{y}, \]
i.e.
\[ (\ln y)' = \frac{1}{y}, \quad y > 0, \tag{9.12} \]
which also implies that ln is strictly increasing on {y ∈ R|y > 0}. Further-
more we have
ln(1) = 0 (9.13)
since 1 = exp(0), and we claim for x, y > 0 that
ln(x · y) = ln x + ln y. (9.14)
Fix y > 0 and consider g(x) = ln(y · x) − ln x. Differentiating with respect to
x yields
\[ g'(x) = y \ln'(y \cdot x) - \ln'(x) = y\,\frac{1}{yx} - \frac{1}{x} = 0, \]
hence g′(x) = 0 for all x > 0 implying that g(x) = c for some c ∈ R, and all
x > 0. Since g(1) = ln y we find
\[ \ln y = g(1) = g(x) = \ln(yx) - \ln x \]
or
\[ \ln(yx) = \ln y + \ln x, \]
proving (9.14). Finally we note that for x > 0
\[ 0 = \ln 1 = \ln\frac{x}{x} = \ln x + \ln\frac{1}{x} = \ln x + \ln x^{-1}, \]
or
\[ \ln x^{-1} = \ln\frac{1}{x} = -\ln x. \tag{9.15} \]
Now let a > 0 be given. We define on R the function $x \mapsto a^x$ by
\[ a^x := e^{x \ln a}. \tag{9.16} \]
It is easy and a good exercise to prove for x, y ∈ R that
\[ a^{x+y} = a^x a^y \quad\text{and}\quad a^0 = 1, \tag{9.17} \]
as well as
\[ (a^x)^y = a^{xy}. \tag{9.18} \]
Further, for a ≠ 1, $x \mapsto a^x$ is bijective with range {y ∈ R | y > 0} and has an inverse
function which is denoted by $x \mapsto \log_a x$. The value of $\log_a x$ is called the
logarithm of x with respect to the base a.
(Note that since $a^{-1} = \frac{1}{a}$, it is often convenient to define $x \mapsto a^x$ and
$y \mapsto \log_a y$ only for a > 1.)
For the derivative of $x \mapsto a^x$ we find
\[ \frac{d}{dx} a^x = \frac{d}{dx} e^{x \ln a} = (\ln a)e^{x \ln a} = (\ln a)a^x, \tag{9.19} \]
and this implies for $x \mapsto \log_a x$
\[ \frac{d}{dx}\log_a x = \frac{1}{(a^x)'(\log_a x)} = \frac{1}{(\ln a)a^{\log_a x}} = \frac{1}{(\ln a)x}. \tag{9.20} \]
Here are some examples of derivatives
\[ \frac{d}{dx}(x \ln x) = \ln x + x \cdot \frac{1}{x} = 1 + \ln x, \tag{9.21} \]
\[ \frac{d}{dx}(x^x) = \frac{d}{dx} e^{x \ln x} = \Bigl(\frac{d}{dx}(x \ln x)\Bigr) e^{x \ln x} = (1 + \ln x)x^x. \tag{9.22} \]
For differentiable functions u : R −→ R, v : R −→ R+ \ {0} we find
\[ \frac{d}{dx} e^{u(x)} = u'(x)e^{u(x)} \tag{9.23} \]
and
\[ \frac{d}{dx}\ln v(x) = \frac{1}{v(x)}\,v'(x) = \frac{v'(x)}{v(x)}. \tag{9.24} \]
The term $\frac{v'}{v}$ is often called the logarithmic derivative of v.
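Formula (9.22) can be compared with a numerical difference quotient. The following Python sketch is only a plausibility check and not a proof; the helper names `central_difference`, `x_to_the_x` and `derivative_formula`, the step size h and the sample points are assumptions made for this illustration.

```python
import math

def central_difference(g, x, h=1e-6):
    """Symmetric difference quotient approximating g'(x)."""
    return (g(x + h) - g(x - h)) / (2.0 * h)

def x_to_the_x(x):
    """x^x for x > 0, written as exp(x ln x) as in (9.16)."""
    return math.exp(x * math.log(x))

def derivative_formula(x):
    """Right hand side of (9.22): (1 + ln x) x^x."""
    return (1.0 + math.log(x)) * x_to_the_x(x)

if __name__ == "__main__":
    for x in (0.5, 1.0, 2.0, 3.0):
        print(x, central_difference(x_to_the_x, x), derivative_formula(x))
```

At each sample point the two printed values should agree up to the discretisation and rounding error of the difference quotient.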
Before we can draw the graph of exp and ln, we need to study the asymptotic
behaviour of functions.
Let f : R −→ R be a function. We want to study the behaviour of f (x) for
x becoming larger and larger, i.e. for x tending to infinity. It may happen
that for x tending to infinity f (x) tends to some number a or to infinity, but
other cases are possible.
We write
lim f (x) = a (9.25)
x−→∞
if for every ε > 0 given, there exists N = N(ε) ∈ N such that
x > N implies |f (x) − a| < ε.
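Example 9.9. We claim for $f(x) = \frac{1}{1+x^2}$ that
\[ \lim_{x \to \infty} f(x) = \lim_{x \to \infty} \frac{1}{1 + x^2} = 0. \]
Thus, given ε > 0 we need to find N(ε) ∈ N such that
\[ x > N(\varepsilon) \quad\text{implies}\quad \Bigl|\frac{1}{1 + x^2} - 0\Bigr| = \frac{1}{1 + x^2} < \varepsilon. \]
Since for x > 0 it follows that
\[ \frac{1}{1 + x^2} < \frac{1}{x} \]
we are done if for ε > 0 we can find N(ε) ∈ N such that
\[ x > N(\varepsilon) \quad\text{implies}\quad \frac{1}{x} < \varepsilon. \]
But this is easy: take $N(\varepsilon) = \bigl[\frac{1}{\varepsilon}\bigr] + 1 > \frac{1}{\varepsilon}$, where [·] denotes the integer part. If $x > \bigl[\frac{1}{\varepsilon}\bigr] + 1$ then
\[ \frac{1}{1 + x^2} < \frac{1}{x} < \frac{1}{\bigl[\frac{1}{\varepsilon}\bigr] + 1} < \varepsilon. \]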
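The choice of N(ε) in Example 9.9 can be tried out numerically. The following Python sketch is an illustration and not part of the proof, since it only tests finitely many points; the names `f`, `n_of_eps` and `check` and the sampling scheme are assumptions introduced here.

```python
import math

def f(x):
    """The function from Example 9.9."""
    return 1.0 / (1.0 + x * x)

def n_of_eps(eps):
    """N(eps) = [1/eps] + 1, where [.] denotes the integer part."""
    return math.floor(1.0 / eps) + 1

def check(eps, samples=1000):
    """Test |f(x) - 0| < eps at finitely many sample points x > N(eps)."""
    n = n_of_eps(eps)
    return all(f(n + 0.5 * k) < eps for k in range(1, samples + 1))

if __name__ == "__main__":
    for eps in (0.5, 0.3, 1e-3):
        print(eps, n_of_eps(eps), check(eps))
```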
Now, it may happen that a in (9.25) is itself infinity, i.e. we write
lim f (x) = ∞ (9.26)
x−→∞
if for every M > 0 there exists $N = N_M \in \mathbb{N}$ such that x > N implies
f (x) > M.
Example 9.10. We claim for n ∈ N that
\[ \lim_{x \to \infty} x^n = \infty. \tag{9.27} \]
We have to find for M > 0 given a natural number $N = N_M$ such that if
$x > N_M$ then $x^n > M$. Take $N_M = [M] + 1$. Now $x > N_M$ implies
$x^n > N_M^n = ([M] + 1)^n > M$, proving (9.27).
In order to study $\lim_{x \to \infty} \exp(x)$ and related limits we need
Lemma 9.11. A (Bernoulli’s inequality). Let a > 0 and n ∈ N0 . Then
\[ (1 + a)^n \ge 1 + na. \tag{9.28} \]
B. Let a > 0 and n ∈ N0 . Then
\[ (1 + a)^n \ge 1 + na + \frac{n(n-1)}{2}a^2. \tag{9.29} \]
Proof. Since $\frac{n(n-1)}{2}a^2 \ge 0$ it follows that (9.29) implies (9.28). We now prove
(9.29). For n = 0 we find
\[ 1 = (1 + a)^0 = 1 + 0 \cdot a + \frac{0 \cdot (-1)}{2}a^2 = 1. \]
Now assume that (9.29) holds for some fixed n ∈ N0 . For n + 1 we find
\[ (1 + a)^{n+1} = (1 + a)^n (1 + a) \ge \Bigl(1 + na + \frac{n(n-1)}{2}a^2\Bigr)(1 + a) \]
\[ = 1 + na + \frac{n(n-1)}{2}a^2 + a + na^2 + \frac{n(n-1)}{2}a^3 \]
\[ \ge 1 + (n + 1)a + \frac{n(n-1) + 2n}{2}a^2 = 1 + (n + 1)a + \frac{(n+1)n}{2}a^2 \]
and the result follows by the principle of mathematical induction.
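Both inequalities of Lemma 9.11 can be tested in exact rational arithmetic. The following Python sketch is a finite sanity check under our own choice of sample values and in no way replaces the induction proof; the function names `bernoulli_gap`, `quadratic_gap` and `check` are hypothetical.

```python
from fractions import Fraction

def bernoulli_gap(a, n):
    """(1 + a)^n - (1 + n a); non-negative by (9.28)."""
    return (1 + a) ** n - (1 + n * a)

def quadratic_gap(a, n):
    """(1 + a)^n - (1 + n a + n(n-1)/2 a^2); non-negative by (9.29)."""
    return (1 + a) ** n - (1 + n * a + Fraction(n * (n - 1), 2) * a * a)

def check(max_n=20):
    """Test (9.28) and (9.29) for a few rational a > 0 and n = 0, ..., max_n."""
    values = [Fraction(1, 100), Fraction(1, 2), Fraction(1), Fraction(7, 3)]
    return all(
        bernoulli_gap(a, n) >= quadratic_gap(a, n) >= 0
        for a in values
        for n in range(max_n + 1)
    )

if __name__ == "__main__":
    print(check())
```

Using `Fraction` avoids rounding errors, so an equality case such as n = 2 in (9.29) is detected exactly.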
Lemma 9.12. We have
\[ \lim_{x \to \infty} e^x = \infty. \tag{9.30} \]
Proof. Let M > 0 be given. We have to find N ∈ N such that x > N implies
$e^x > M$. First note that $e = e^1 > e^0 = 1$, i.e. e = 1 + b for some b > 0. The
monotonicity of exp implies for x > N using (9.28)
\[ e^x > e^N = (1 + b)^N \ge 1 + bN. \]
Thus, given M > 0 choose N ∈ N such that 1 + bN > M to find that
\[ x > N \quad\text{implies}\quad e^x > e^N = (1 + b)^N \ge 1 + bN > M. \]
Remark 9.13. Note that we have assumed that for M > 0 we find N ∈ N
such that 1 + bN > M. If M ≤ 1 then every N ∈ N will do, but this case is
of course not interesting. If M > 1 then M − 1 > 0 and we may take N such
that $N > \frac{M-1}{b}$, for example $N = 1 + \bigl[\frac{M-1}{b}\bigr]$.
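Lemma 9.14. We have
\[ \lim_{x \to \infty} \frac{x}{\exp(x)} = \lim_{x \to \infty} xe^{-x} = 0. \tag{9.31} \]
Proof. We claim that $\varphi(x) := xe^{-x}$ is for x > 1 strictly decreasing. This
follows from
\[ \varphi'(x) = e^{-x} - xe^{-x} = e^{-x}(1 - x) < 0, \]
provided x > 1. Hence for x > N > 1 it follows that
\[ 0 \le xe^{-x} < Ne^{-N}. \]
Now, given ε > 0 take N > 1 such that
\[ \frac{2}{b^2}\,\frac{1}{N - 1} < \varepsilon, \quad\text{i.e.}\quad N > \frac{2}{b^2}\,\frac{1}{\varepsilon} + 1, \]
where b is determined by e = 1 + b. Now using (9.29)
\[ 0 \le xe^{-x} \le Ne^{-N} = N(1 + b)^{-N} = \frac{N}{(1 + b)^N} \le \frac{N}{1 + Nb + \frac{N(N-1)}{2}b^2} \le \frac{1}{\frac{N-1}{2}b^2} = \frac{2}{b^2}\,\frac{1}{N - 1} < \varepsilon. \]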
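The threshold N used in the proof of Lemma 9.14 can be computed explicitly. The following Python sketch assumes floating point arithmetic is accurate enough for the chosen values of ε; the names `B`, `n_of_eps` and `phi` are introduced here for illustration only, and the sampled points are an arbitrary choice.

```python
import math

B = math.e - 1.0  # the number b determined by e = 1 + b

def n_of_eps(eps):
    """Smallest natural number N with N > 2 / (b^2 eps) + 1."""
    return math.floor(2.0 / (B * B * eps) + 1.0) + 1

def phi(x):
    """phi(x) = x e^{-x} as in the proof of Lemma 9.14."""
    return x * math.exp(-x)

if __name__ == "__main__":
    for eps in (0.5, 0.1, 1e-3):
        n = n_of_eps(eps)
        # The proof gives phi(x) <= phi(N) < eps for all x > N.
        print(eps, n, phi(n), all(phi(n + 0.25 * k) < eps for k in range(400)))
```

The bound obtained from (9.29) is far from sharp: the values of φ at N are much smaller than ε, which is harmless since the definition of the limit only asks for some N.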
Next we extend our considerations to very small values of x. It may happen

that f : R −→ R tends to a ∈ R when x becomes smaller and smaller. For
this we define
lim f (x) = a (9.32)
x−→−∞
if for every ε > 0 there exists N = N(ε) ∈ N such that x < −N implies
|f (x) − a| < ε.
Lemma 9.15. We have
\[ \lim_{x \to -\infty} e^x = 0. \tag{9.33} \]
Proof. We have to prove that for every ε > 0 there exists N ∈ N such that
x < −N implies $|e^x - 0| = e^x < \varepsilon$. With y := −x > 0 this is equivalent to
N < y implies $e^{-y} < \varepsilon$, or N < y implies $\frac{1}{e^y} < \varepsilon$, i.e.
\[ \lim_{y \to \infty} e^{-y} = 0. \tag{9.34} \]
We now prove (9.34). The function $y \mapsto g(y) = e^{-y}$ is strictly decreasing
since $g'(y) = -e^{-y} < 0$. By Bernoulli’s inequality we find therefore for N < y
and using e = 1 + b that
\[ e^{-y} \le e^{-N} \le \frac{1}{1 + Nb} < \frac{1}{Nb}. \]
Hence, given ε > 0 choose N ∈ N such that $\frac{1}{Nb} < \varepsilon$ to find that
\[ N < y \quad\text{implies}\quad e^{-y} < e^{-N} \le \frac{1}{1 + Nb} < \frac{1}{Nb} < \varepsilon. \]
Thus (9.34) and therefore (9.33) is proved.
Note that Lemma 9.15 together with Lemma 9.12 finally proves that the
range of exp is equal to {x ∈ R|x > 0}.
Now we can sketch the graph of x −→ exp(x). It must be strictly positive,
strictly increasing, for x −→ −∞ it tends to 0, at x = 0 it has the value 1,
and for x −→ ∞ it tends to ∞:
[Figure 9.1: the graph of y = exp(x).]
By our general considerations we can now also sketch the graph of x −→ ln x.
We only have to reflect the graph of exp at the principal diagonal:
[Figure 9.2: the graphs of y = exp(x), y = x and y = ln(x).]
Let us calculate some further limits. First we note that
\[ \lim_{x \to \infty} \ln x = \infty. \tag{9.35} \]
Given x > 0 set x = exp y. Now x tending to infinity implies that exp y
tends to infinity which is only possible when y tends to infinity, but y = ln x.
Next consider ln x for x tending to 0. This is equivalent to considering $\ln\frac{1}{x}$
for x tending to ∞. But $\ln\frac{1}{x} = -\ln x$ and $\lim_{x \to \infty} \ln x = \infty$. Thus we find
ln x −→ −∞ for x −→ 0. For this we write
\[ \lim_{x \to 0} \ln x = -\infty. \tag{9.36} \]
Theorem 9.16. We have
\[ \lim_{x \to \infty} \frac{\ln x}{x} = 0. \tag{9.37} \]
Proof. Let $x = e^y$, i.e. y = ln x. Then
\[ \frac{\ln x}{x} = \frac{y}{e^y} = ye^{-y}. \]
Since x −→ ∞ implies y −→ ∞ we can apply Lemma 9.14 to find
\[ \lim_{x \to \infty} \frac{\ln x}{x} = \lim_{y \to \infty} ye^{-y} = 0. \]
Problems
1. a) Using the definition of $\lim_{x \to \infty} f(x) = \infty$ prove that $\lim_{x \to \infty} (x^2 - 5) = \infty$.
b) Let p : R −→ R, $p(x) = \sum_{l=0}^{k} a_l x^l$, be a polynomial of degree k
with $a_k > 0$. Prove that $\lim_{x \to \infty} p(x) = \infty$.
c) For a ∈ R prove that
\[ \lim_{x \to \infty} \frac{1 + a + ax^2}{1 + x^2} = a. \]
2. a) For n ∈ N deduce from Lemma 9.11.B the Bernoulli inequality:
$(1 + a)^n > 1 + na$, i.e. the strict inequality holds.
b) Use part a) to prove for n ≥ 2 that
\[ \Bigl(1 + \frac{1}{n^2 - 1}\Bigr)^n > 1 + \frac{1}{n}. \]
3. For a > 0 define $a^x := \exp(x \ln a) = e^{x \ln a}$, and prove that $a^{x+y} = a^x a^y$
and $a^0 = 1$.
4. Find the following derivatives:
a) $\frac{d}{dx}\exp(-\sqrt{x^2 + 1})$; b) $\frac{d}{du}\exp(-\log_a(1 + u^2))$, a > 0; c) $\frac{d^2}{dt^2}\exp\bigl(-\frac{1}{1 + t^2}\bigr)$.
5. By induction show that for n ∈ N0 there exists a polynomial $p_n$ of
degree at most n such that
\[ \frac{d^n}{dx^n} e^{-x^2} = p_n(x)e^{-x^2}. \]
6. Find the following derivatives:
a) $\frac{d}{ds}\ln(\sqrt{s^4 + 1} - s^2)$; b) $\frac{d}{dx}(\ln(a^x))$, a > 1; c) $\frac{d^2}{dy^2}\ln((y^2 + 1)^{-k})$, k ∈ N.
7. a) For a > 0 prove that
\[ \lim_{x \to \infty} \frac{x}{\exp(ax)} = 0. \]
b) Use part a) to prove for a > 0 and n ∈ N that
\[ \lim_{x \to \infty} \frac{x^n}{\exp(ax)} = 0. \]
Hint: $\exp x = \exp\frac{x}{n} \cdot \ldots \cdot \exp\frac{x}{n}$.
8. Let $p(x) = \sum_{k=0}^{m} b_k x^k$, $b_m > 0$, be a polynomial. Find
\[ \lim_{x \to -\infty} (\exp(p(x))). \]
Hint: distinguish whether m is even or odd.
9. Let $p(x) = \sum_{k=0}^{n} a_k x^k$ be a polynomial and $a_n > 0$. Prove that there
exists R > 0 such that p(x) > 0 for x ≥ R. Hence for x ≥ R the
function x → ln(p(x)) is defined. Now show that $\lim_{x \to \infty} \frac{\ln(p(x))}{x} = 0$.
10. a) For x, y > 0 prove under the assumption that for a > 0 it follows
that $\ln a^{\frac{1}{2}} = \frac{1}{2}\ln a$ the estimate
\[ \frac{\ln x + \ln y}{2} \le \ln\frac{x + y}{2}. \]
b) For x > y > 0 such that x − y = 1 prove that
\[ \frac{1}{x} \le \ln x - \ln y \le \frac{1}{y}. \]
(Use the mean value theorem.)
11. Let v : R −→ R be a differentiable function and suppose that the
logarithmic derivative of v is identically 1 and that v(0) = 1. Find the
function v.
10 Trigonometric Functions and Their Inverses

Since we have introduced the exponential function as a solution of a differential
equation and an initial condition, we might think of introducing sin and
cos as solutions of the differential equations and initial conditions
\[ f' = g, \quad g' = -f \tag{10.1} \]
\[ f(0) = 0, \quad g(0) = 1. \tag{10.2} \]
Postponing the existence proof, it is possible to identify f with sin and g
with cos, and to prove their basic properties by only using (10.1) and (10.2).
We follow however a different method. We introduce both functions by using
elementary geometry of the circle and then we will derive some of their
properties. It turns out that switching from very classical geometry to calculus
leads to some problems, not all of which can be resolved in this part of
the course. However, in Part 2 we will have a more rigorous approach using
power series and therefore we may justify our naı̈ve handling of trigonometric
functions here.
Consider the circle in R2 with centre (0, 0) and radius 1. The total length of
its circumference is 2π. It makes sense to measure the size of an angle φ by
the corresponding arc length. More precisely, let φ be the angle ∠CAB in

Figure 10.1 below and denote by l(BC) the length of the arc BC connecting

B and C. For the measure of the size of φ we take the value l(BC).
[Figure 10.1: the unit circle with A = (0, 0), B = (1, 0) and C = (x0 , y0 ) = (cos φ, sin φ), where x0 = cos φ and y0 = sin φ, together with the point (1, tan φ).]
In this way we find that an angle of 45° corresponds to $\frac{\pi}{4}$, an angle of 90°
corresponds to $\frac{\pi}{2}$ etc. For 0 ≤ φ < 2π we can now define the following two
functions
φ −→ sin φ and φ −→ cos φ
where the definitions are easily taken from Figure 10.1: denote by C =
(x0 , y0 ) the point where the ray starting at A = (0, 0) forming the angle φ
with the x-axis intersects the circle (as usual angles in the unit circle are
measured anticlockwise). Then we define:
sin φ = y0 , cos φ = x0 . (10.3)
Figures 10.2 and 10.3 below give a further insight into the values of sin and
cos for 0 ≤ φ < 2π. First we look at Figure 10.2:
[Figure 10.2: the unit circle with the points (cos φ, sin φ), (− cos φ, sin φ), (− cos φ, − sin φ) and (cos φ, − sin φ), and the angles φ, π/2 + φ, π + φ and 3π/2 + φ.]
We find for example that cos(π − φ) = − cos φ and sin(π + φ) = − sin φ, etc.
Next we consider Figure 10.3:
[Figure 10.3: the unit circle with the points (cos φ, sin φ), (− sin φ, cos φ), (− cos φ, − sin φ) and (sin φ, − cos φ), and the angles φ, π/2 + φ, π + φ and 3π/2 + φ.]
Here we find for example that $\cos(\frac{\pi}{2} + \varphi) = -\sin\varphi$ and $\sin(\frac{3\pi}{2} + \varphi) = -\cos\varphi$.
Further similar formulae can be found in Appendix V. Note that in our
definition we have excluded φ = 2π. We remedy this by extending both
functions to all of R in the following way: let φ ∈ R, then there exists a
unique k ∈ Z such that φ ∈ [2kπ, 2(k + 1)π), i.e. 2kπ ≤ φ < 2(k + 1)π.
We now set
sin φ := sin(φ − 2kπ), cos φ := cos(φ − 2kπ). (10.4)
Note that φ − 2kπ ∈ [0, 2π) and therefore sin(φ − 2kπ) and cos(φ − 2kπ) are
well defined. From this extension it follows immediately that sin : R −→ R
and cos : R −→ R are periodic functions with period 2π, i.e. sin(φ + 2π) =
sin φ and cos(φ + 2π) = cos φ. Further it follows that
| sin φ| ≤ 1 and | cos φ| ≤ 1 (10.5)
and we have the special values
sin 0 = 0, cos 0 = 1,
sin(π/2) = 1, cos(π/2) = 0,
sin π = 0, cos π = −1,
sin(3π/2) = −1, cos(3π/2) = 0,
sin 2π = 0, cos 2π = 1.
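The periodic extension (10.4) amounts to first reducing φ to the interval [0, 2π). The following Python sketch illustrates this reduction; the name `reduce_angle` is hypothetical, and in floating point arithmetic the result can be slightly inaccurate when φ lies very close to a multiple of 2π.

```python
import math

TWO_PI = 2.0 * math.pi

def reduce_angle(phi):
    """Return (k, phi - 2 k pi), where k is the integer with 2 k pi <= phi < 2 (k + 1) pi."""
    k = math.floor(phi / TWO_PI)
    return k, phi - k * TWO_PI

if __name__ == "__main__":
    for phi in (7.0, -1.0, 100.0):
        k, r = reduce_angle(phi)
        # sin and cos of the reduced angle agree with those of phi, see (10.4).
        print(phi, k, r, math.sin(r) - math.sin(phi), math.cos(r) - math.cos(phi))
```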
Moreover by Pythagoras’ theorem, see Appendix IV, we know
\[ x_0^2 + y_0^2 = 1 \]
or
\[ \cos^2\varphi + \sin^2\varphi = 1. \tag{10.6} \]
We also note the following results:
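\[ \sin(\varphi_1 + \varphi_2) = \sin\varphi_1\cos\varphi_2 + \cos\varphi_1\sin\varphi_2; \tag{10.7} \]
\[ \cos(\varphi_1 + \varphi_2) = \cos\varphi_1\cos\varphi_2 - \sin\varphi_1\sin\varphi_2; \tag{10.8} \]
\[ \sin\varphi_1 - \sin\varphi_2 = 2\sin\frac{\varphi_1 - \varphi_2}{2}\cos\frac{\varphi_1 + \varphi_2}{2}; \tag{10.9} \]
\[ \cos\varphi_1 - \cos\varphi_2 = -2\sin\frac{\varphi_1 + \varphi_2}{2}\sin\frac{\varphi_1 - \varphi_2}{2}; \tag{10.10} \]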
as well as the symmetries
sin(−x) = − sin x, cos(−x) = cos x. (10.11)
Again we refer to Appendix V where we have collected more similar formulae.

The formulae in (10.11) suggest:
Definition 10.1. Let f, g : R −→ R be two functions. We call f an even
function if
f (x) = f (−x) for all x ∈ R, (10.12)
and we call g an odd function if
g(x) = −g(−x) for all x ∈ R. (10.13)
Hence sin is an odd and cos is an even function.

Lemma 10.2. Let f1 , f2 : R −→ R be two even functions and let g1 , g2 :
R −→ R be two odd functions. Then f1 · f2 and g1 · g2 are even, whereas
f1 · g1 is odd, i.e. the product of two even or two odd functions is even, the
product of an even function with an odd function is odd.
Proof. The following hold
(f1 · f2 )(−x) = f1 (−x)f2 (−x) = f1 (x)f2 (x) = (f1 · f2 )(x),
(g1 · g2 )(−x) = g1 (−x)g2 (−x) = (−g1 (x))(−g2 (x)) = g1 (x)g2 (x),

(f1 · g1 )(−x) = f1 (−x)g1 (−x) = f1 (x)(−g1 (x)) = −(f1 · g1 )(x),
proving the lemma.
Next if we compare in Figure 10.1 sin φ with φ, we get
| sin φ| ≤ |φ|. (10.14)
The latter allows us to calculate
\[ \lim_{\varphi \to 0} \sin\varphi = 0. \tag{10.15} \]
Indeed, given ε > 0 choose δ = ε to find for |φ − 0| = |φ| < δ that | sin φ − 0| =
| sin φ| ≤ |φ| < δ = ε. Thus we have proved that sin is continuous at 0. Since
$\cos\varphi = \sqrt{1 - \sin^2\varphi}$ for $|\varphi| \le \frac{\pi}{2}$ we find that
\[ \lim_{\varphi \to 0} \cos\varphi = \lim_{\varphi \to 0} \sqrt{1 - \sin^2\varphi} = 1, \tag{10.16} \]
i.e. cos is also continuous at 0. This further implies:

Corollary 10.3. The functions sin and cos are continuous.
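Proof. For φ0 fixed we find with h = φ − φ0
\[ \lim_{\varphi \to \varphi_0} \sin\varphi = \lim_{h \to 0} \sin(\varphi_0 + h) = \lim_{h \to 0} (\sin\varphi_0\cos h + \cos\varphi_0\sin h) \]
\[ = \sin\varphi_0 \bigl(\lim_{h \to 0} \cos h\bigr) + \cos\varphi_0 \bigl(\lim_{h \to 0} \sin h\bigr) = \sin\varphi_0, \]
proving the continuity of sin. Observing that, by (10.8),
\[ \lim_{\varphi \to \varphi_0} \cos\varphi = \lim_{h \to 0} \cos(\varphi_0 + h) = \lim_{h \to 0} (\cos\varphi_0\cos h - \sin\varphi_0\sin h) = \cos\varphi_0 \]
we deduce that cos is continuous.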

From elementary geometry we know that a sector OAB with an angle φ,
0 ≤ φ < 2π, of a circle with radius r has area $\frac{1}{2}r^2\varphi$, see the following figure
for an explanation.
[Figure 10.4: a sector OAB of a circle with centre 0 and radius r; the arc AB has length rφ.]
Now we consider the unit circle and the following figure:
[Figure 10.5: the unit circle with the points 0 = (0, 0), A = (cos φ, 0), C = (1, 0), D = (cos φ, sin φ) and B; the arc AB has length rφ.]

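It is obvious that the area of the sector OAB is less than or equal to that of the
triangle OCD. Since r = cos φ and the area of the triangle OCD is given by
$\frac{1}{2}\sin\varphi$ we find for $0 \le \varphi < \frac{\pi}{2}$
\[ \frac{1}{2}\varphi\cos^2\varphi \le \frac{1}{2}\sin\varphi, \]
and for $0 < \varphi < \frac{\pi}{2}$
\[ \cos^2\varphi \le \frac{\sin\varphi}{\varphi}. \tag{10.17} \]
Since $\cos^2\varphi$ as well as $\frac{\sin\varphi}{\varphi}$ are even functions we can extend (10.17) to all φ,
$0 < |\varphi| < \frac{\pi}{2}$.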
We now claim:
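Theorem 10.4. We have
\[ \lim_{\varphi \to 0} \frac{\sin\varphi}{\varphi} = 1. \tag{10.18} \]
Proof. From (10.17) and (10.14) we deduce for $0 < |\varphi| < \frac{\pi}{2}$ that
\[ 1 - \sin^2\varphi = \cos^2\varphi \le \frac{\sin\varphi}{\varphi} \le 1. \]
Now it follows that
\[ -\sin^2\varphi \le \frac{\sin\varphi}{\varphi} - 1 \le 0, \]
or
\[ 0 \le 1 - \frac{\sin\varphi}{\varphi} = \Bigl|\frac{\sin\varphi}{\varphi} - 1\Bigr| \le \sin^2\varphi \le \varphi^2. \]
Given ε > 0 take $\delta = \min\bigl(\sqrt{\varepsilon}, \frac{\pi}{2}\bigr)$ to find for |φ| < δ, φ ≠ 0, that
\[ \Bigl|\frac{\sin\varphi}{\varphi} - 1\Bigr| \le \sin^2\varphi \le \varphi^2 = |\varphi|^2 < \delta^2 \le \varepsilon. \]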
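The two estimates used in the proof of Theorem 10.4 can be observed numerically. The following Python sketch tests them at finitely many points with 0 < |φ| < π/2; it is an illustration under our own choice of sample points, and the name `sandwich_holds` is hypothetical.

```python
import math

def sandwich_holds(phi):
    """Check cos^2(phi) <= sin(phi)/phi <= 1 and |sin(phi)/phi - 1| <= phi^2."""
    ratio = math.sin(phi) / phi
    return math.cos(phi) ** 2 <= ratio <= 1.0 and abs(ratio - 1.0) <= phi * phi

if __name__ == "__main__":
    samples = [0.001 * k for k in range(1, 1571)]
    print(all(sandwich_holds(p) and sandwich_holds(-p) for p in samples))
    for phi in (1.0, 0.1, 0.01):
        print(phi, math.sin(phi) / phi)
```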
Corollary 10.5. The function sin : R −→ R is differentiable and we have
\[ \sin' = \cos. \tag{10.19} \]
Moreover, cos : R −→ R is differentiable and we have
\[ \cos' = -\sin. \tag{10.20} \]
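Proof. Note that sin 0 = 0, and therefore Theorem 10.4 states that sin is
differentiable at 0 with derivative 1 which is equal to cos 0. Now, using
(10.9) we find
\[ \frac{\sin\varphi - \sin\varphi_0}{\varphi - \varphi_0} = \frac{2\sin\frac{\varphi - \varphi_0}{2}\cos\frac{\varphi + \varphi_0}{2}}{\varphi - \varphi_0} = \frac{\sin\bigl((\varphi - \varphi_0)/2\bigr)}{(\varphi - \varphi_0)/2} \cdot \cos\frac{\varphi + \varphi_0}{2}, \]
which implies
\[ \lim_{\varphi \to \varphi_0} \frac{\sin\varphi - \sin\varphi_0}{\varphi - \varphi_0} = \Bigl(\lim_{\varphi \to \varphi_0} \frac{\sin\frac{\varphi - \varphi_0}{2}}{\frac{\varphi - \varphi_0}{2}}\Bigr)\Bigl(\lim_{\varphi \to \varphi_0} \cos\frac{\varphi + \varphi_0}{2}\Bigr) \]
\[ = \Bigl(\lim_{h \to 0} \frac{\sin h}{h}\Bigr)\Bigl(\lim_{\varphi \to \varphi_0} \cos\frac{\varphi + \varphi_0}{2}\Bigr) = \cos\varphi_0, \]
where we used the continuity of cos, compare with Corollary 10.3.
Knowing that sin is differentiable and sin′ = cos allows us to calculate the
derivative of cos by using the chain rule. For $|x| < \frac{\pi}{2}$, where cos x > 0, we find
\[ \frac{d}{dx}\cos x = \frac{d}{dx}(1 - \sin^2 x)^{\frac{1}{2}} = \frac{1}{2}(-2\sin x\cos x) \cdot (1 - \sin^2 x)^{-\frac{1}{2}} = -\frac{\sin x\cos x}{\cos x} = -\sin x. \]
For general x the same result follows from $\cos x = \sin(x + \frac{\pi}{2})$, see (10.21), and the chain rule.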
Corollary 10.6. The function sin has for $\varphi = (2k + \frac{1}{2})\pi$, k ∈ Z, a (local)
maximum and for $\varphi = (2k - \frac{1}{2})\pi$, k ∈ Z, a (local) minimum. The function
cos has for 2kπ, k ∈ Z, a (local) maximum and for (2k + 1)π, k ∈ Z, a (local)
minimum.
Proof. We know (sin φ)′ = cos φ = 0 for $\varphi = (k + \frac{1}{2})\pi$, k ∈ Z. Now (sin φ)″ =
− sin φ. Hence for $\varphi = (2k + \frac{1}{2})\pi$, we find
\[ \sin''\bigl((2k + \tfrac{1}{2})\pi\bigr) = -\sin\bigl((2k + \tfrac{1}{2})\pi\bigr) = -\sin\frac{\pi}{2} = -1 < 0, \]
thus sin has a local maximum for $\varphi = (2k + \frac{1}{2})\pi$. For $\varphi = (2k - \frac{1}{2})\pi$ we find
\[ \sin''\bigl((2k - \tfrac{1}{2})\pi\bigr) = -\sin\bigl((2k - \tfrac{1}{2})\pi\bigr) = -\sin\Bigl(-\frac{\pi}{2}\Bigr) = \sin\frac{\pi}{2} = 1 > 0 \]
implying that sin has a local minimum for $\varphi = (2k - \frac{1}{2})\pi$. The result for cos
is proved in an analogous way.
From our definition of sin and cos it is clear that φ = π is the smallest zero
of sin larger than 0, as is $\frac{\pi}{2}$ the smallest zero of cos larger than 0. We also
note the formula:
\[ \cos\varphi = \sin\Bigl(\varphi + \frac{\pi}{2}\Bigr). \tag{10.21} \]
The graphs Γ(sin) and Γ(cos) look like:
[Figure 10.6: the graphs Γ(sin) and Γ(cos) over the interval from −π to 2π.]
Consider the function sin : R −→ R. Since it has period 2π it cannot be
injective. Further we know that sin π = 0, i.e. sin π = sin 0 = − sin(−π) = 0,
implying that sin cannot be injective on [0, π]. However we claim that
$\sin : [-\frac{\pi}{2}, \frac{\pi}{2}] \to \mathbb{R}$ is injective, in fact strictly increasing. For this we only need
to consider
\[ \sin' x = \cos x > 0 \quad\text{for } x \in \Bigl(-\frac{\pi}{2}, \frac{\pi}{2}\Bigr), \]
implying that $\sin|_{(-\frac{\pi}{2}, \frac{\pi}{2})}$ is strictly increasing. Since $\sin(-\frac{\pi}{2}) = -1$ and
$\sin(\frac{\pi}{2}) = 1$ it follows that $\sin : [-\frac{\pi}{2}, \frac{\pi}{2}] \to [-1, 1]$ is bijective. Hence it has
an inverse function defined on [−1, 1] which we denote by $\sin^{-1}$ or arcsin. In
the same way we find that cos : [0, π] −→ [−1, 1] is strictly decreasing, recall
cos′ x = − sin x and for x ∈ (0, π) we have sin x > 0. Hence there exists the
inverse function $\cos^{-1}$ or arccos which is defined on [−1, 1].
Definition 10.7. The function arcsin is called the arcus-sine function

and arccos is called the arcus-cosine function.
Theorem 10.8. A. The function $\sin : [-\frac{\pi}{2}, \frac{\pi}{2}] \to [-1, 1]$ is bijective with
inverse function $\arcsin : [-1, 1] \to [-\frac{\pi}{2}, \frac{\pi}{2}]$ and for −1 < x < 1 we have
\[ \frac{d}{dx}\arcsin(x) = \frac{1}{\sqrt{1 - x^2}}. \tag{10.22} \]
B. The function cos : [0, π] −→ [−1, 1] is bijective with inverse function
arccos : [−1, 1] −→ [0, π] and for −1 < x < 1 we have
\[ \frac{d}{dx}\arccos(x) = -\frac{1}{\sqrt{1 - x^2}}. \tag{10.23} \]
Proof. It remains to prove (10.22) and (10.23). From Theorem 7.5 we know
\[ \varphi'(x) = \frac{1}{f'(\varphi(x))} \]
for $\varphi = f^{-1}$. For arcsin we deduce
\[ \arcsin'(x) = \frac{d}{dx}\arcsin x = \frac{1}{\sin'(\arcsin x)} = \frac{1}{\cos(\arcsin x)} = \frac{1}{\sqrt{1 - \sin^2(\arcsin x)}} = \frac{1}{\sqrt{1 - x^2}}. \]
For arccos we find
\[ \arccos'(x) = \frac{d}{dx}\arccos x = \frac{1}{\cos'(\arccos x)} = \frac{1}{-\sin(\arccos x)} = -\frac{1}{\sqrt{1 - \cos^2(\arccos x)}} = -\frac{1}{\sqrt{1 - x^2}}. \]
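Formulae (10.22) and (10.23) can be compared with difference quotients of the library functions `math.asin` and `math.acos`. The following Python sketch is a numerical plausibility check only; the helper names, the step size h and the sample points in (−1, 1) are assumptions made for this illustration.

```python
import math

def central_difference(g, x, h=1e-6):
    """Symmetric difference quotient approximating g'(x)."""
    return (g(x + h) - g(x - h)) / (2.0 * h)

def arcsin_derivative(x):
    """Right hand side of (10.22)."""
    return 1.0 / math.sqrt(1.0 - x * x)

def arccos_derivative(x):
    """Right hand side of (10.23)."""
    return -1.0 / math.sqrt(1.0 - x * x)

if __name__ == "__main__":
    for x in (-0.9, -0.5, 0.0, 0.5, 0.9):
        print(x,
              central_difference(math.asin, x) - arcsin_derivative(x),
              central_difference(math.acos, x) - arccos_derivative(x))
```

The sample points stay away from x = ±1, where both derivatives are unbounded and the difference quotient becomes unreliable.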
Using sin and cos we may introduce some further functions of importance.
Consider first the tangent function
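\[ \tan x := \frac{\sin x}{\cos x}. \tag{10.24} \]
Of course we must ensure that cos x ≠ 0, thus we define the function tan on
the set $\mathbb{R} \setminus \{(k + \frac{1}{2})\pi \mid k \in \mathbb{Z}\}$. It is obvious that tan is an odd function since
\[ \tan(-x) = \frac{\sin(-x)}{\cos(-x)} = -\frac{\sin x}{\cos x} = -\tan x, \]
and we find on $\mathbb{R} \setminus \{(k + \frac{1}{2})\pi \mid k \in \mathbb{Z}\}$ that
\[ \tan'(x) = \frac{d}{dx}\tan(x) = \frac{d}{dx}\Bigl(\frac{\sin x}{\cos x}\Bigr) = \frac{\cos x\cos x + \sin x\sin x}{\cos^2 x} = \frac{1}{\cos^2 x}, \]
i.e.
\[ \tan' x = \frac{1}{\cos^2 x}. \tag{10.25} \]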
Further we may introduce the cotangent function
\[ \cot x := \frac{\cos x}{\sin x}, \tag{10.26} \]
which is defined on R \ {kπ | k ∈ Z}. Once again we find that cot is an odd
function and we have
\[ \cot'(x) = \frac{d}{dx}\Bigl(\frac{\cos x}{\sin x}\Bigr) = \frac{-\sin x\sin x - \cos x\cos x}{\sin^2 x} = -\frac{1}{\sin^2 x}, \]
i.e.
\[ \cot'(x) = -\frac{1}{\sin^2 x}. \tag{10.27} \]
From (10.25) it follows that on $(-\frac{\pi}{2}, \frac{\pi}{2})$ the function tan is strictly increasing,
hence it has an inverse, the arcus-tangent function $\arctan : \mathbb{R} \to (-\frac{\pi}{2}, \frac{\pi}{2})$.
Note however that we have not yet proved that $R(\tan|_{(-\frac{\pi}{2}, \frac{\pi}{2})}) = \mathbb{R}$.
For arctan we find by Theorem 7.5 that
\[ \arctan'(x) = \frac{1}{\tan'(\arctan x)} = \cos^2(\arctan x). \]
Now, $\cos^2 y = \frac{1}{1 + \tan^2 y}$ as follows from
\[ \frac{1}{1 + \frac{\sin^2 y}{\cos^2 y}} = \frac{1}{\frac{\cos^2 y}{\cos^2 y} + \frac{\sin^2 y}{\cos^2 y}} = \frac{\cos^2 y}{\sin^2 y + \cos^2 y} = \cos^2 y, \]
which yields
\[ \arctan'(x) = \frac{1}{1 + \tan^2(\arctan x)} = \frac{1}{1 + x^2}, \]
i.e.
\[ \arctan'(x) = \frac{1}{1 + x^2}. \tag{10.28} \]
From (10.27) we find that for x ∈ (0, π) the function cot is strictly decreasing
and hence it has an inverse function arccot, arcus-cotangent. For arccot
we find
\[ \operatorname{arccot}'(x) = \frac{1}{\cot'(\operatorname{arccot} x)} = -\sin^2(\operatorname{arccot} x). \]
Since $\sin^2 y = \frac{1}{1 + \cot^2 y}$ we find
\[ \operatorname{arccot}' x = -\frac{1}{1 + \cot^2(\operatorname{arccot} x)} = -\frac{1}{1 + x^2}, \]
i.e.
\[ \operatorname{arccot}' x = -\frac{1}{1 + x^2}. \tag{10.29} \]
We postpone the proof of $R(\tan|_{(-\frac{\pi}{2}, \frac{\pi}{2})}) = R(\cot|_{(0,\pi)}) = \mathbb{R}$ until Remark
20.18.B, and we refer to Appendix V where one can find a lot of formulae
connecting sin, cos, tan, cot, arcsin, arccos, arctan, arccot. We mention that
often a new name is introduced for $x \mapsto \frac{1}{\sin x}$ and $x \mapsto \frac{1}{\cos x}$, namely
\[ \csc x = \frac{1}{\sin x} \quad\text{and}\quad \sec x = \frac{1}{\cos x}, \tag{10.30} \]
called the co-secant and secant function. We finally consider the following
graphs:
[Figures: the graphs of arcsin(x) and arccos(x); of tan(x) and cot(x); and of arctan(x) and arccot(x).]
Problems
1. a) Let f : R+ −→ R be any function and let g : R −→ R+ be an
even function. Prove that f ◦ g : R −→ R is an even function.
b) Let f : R −→ R and g : R −→ R be odd functions. Is f ◦ g an
odd function too?
c) Given an even function f : R −→ R and (a, b) ⊂ R, a < 0 < b.
Prove that f |(a,b) cannot have an inverse function.
2. a) Let f : R −→ R be a differentiable function. Prove that if f is
even then f′ is odd and if f is odd then f′ is even. Deduce that if f
is an even, k times continuously differentiable function and l ≤ k is an even
number then $f^{(l)}$ is even.
b) Let f : R+ −→ R be a function. Show that f has an even
extension g : R −→ R and f |(0,∞) an odd extension h : R −→ R.
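3. a) Does the limit $\lim_{x \to \infty} (\sin x)$ exist?
b) Prove for k ∈ N that
\[ \lim_{x \to \infty} \frac{(\sin x)^k}{x} = 0. \]
4. Using the definitions of sin, cos, tan and cot, and the addition theorems
find the values of
a) $\sin\frac{\pi}{8}$, b) $\cos\frac{\pi}{6}$, c) $\tan\frac{\pi}{3}$, d) $\cot\frac{\pi}{12}$.
5. Find the values of
a) $\arcsin\frac{\sqrt{3}}{2}$, b) $\arccos\bigl(-\frac{1}{2}\sqrt{2}\bigr)$, c) $\arctan\frac{1}{\sqrt{3}}$, d) $\operatorname{arccot}(-\sqrt{3})$.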
6. a) For x, y ∈ R prove that | sin x − sin y| ≤ |x − y|.

b) For $x, y \in [-a, a] \subset (-\frac{\pi}{2}, \frac{\pi}{2})$ show that
\[ |\tan x - \tan y| \le \frac{1}{\cos^2 a}|x - y|. \]
c) Prove that for all n ∈ N and all x ∈ R we have
| sin nx| ≤ n| sin x|. Does the statement: for all a > 0 and all x ∈ R
| sin ax| ≤ a| sin x| hold?
7. Let f : R −→ R be a function. Further let g : R −→ R be a periodic
function with period a > 0. Prove that the function f ◦ g is periodic
with period a. Is the function g ◦ f periodic?
8. Find the derivatives (on the natural domains) of the given functions:
a) $\frac{d}{dx}\cos(\ln(1 + x^2))$; b) $\frac{d}{dt}\frac{\sin(\tan t)}{\sqrt{1 - \cos^4 t}}$; c) $\frac{d}{ds}\arcsin(\sqrt{1 + \cos s})$;
d) $\frac{d}{du}\arctan(e^{-u^2}\cot u)$.
9. For n ∈ N the Dirichlet kernel, which is of great importance in Fourier
analysis, is defined on $[-\frac{\pi}{2}, \frac{\pi}{2}]$ by
\[ D_n(t) := \begin{cases} \dfrac{\sin(2n+1)t}{\sin t}, & t \in [-\frac{\pi}{2}, \frac{\pi}{2}],\ t \ne 0 \\ 2n + 1, & t = 0. \end{cases} \]
Prove that
\[ \frac{1}{2}D_n(t) = C_n(2t), \]
where
\[ C_n(t) = \frac{1}{2} + \sum_{j=1}^{n} \cos jt, \]
and deduce that $D_n$ is on $[-\frac{\pi}{2}, \frac{\pi}{2}]$ arbitrarily often differentiable.
Hint: first find $\sum_{j=1}^{n} \cos jt$ and consider $\cos jt \cdot \sin\frac{t}{2}$.
11 Investigating Functions
In this chapter we want to develop a scheme for investigating a given func-
tion in a systematic way. The first problem we have to address is that of the
domain. Clearly, if a function is given as f : D −→ R we know D. However,
often we have to handle functions which are obtained from given ones or con-
structed “indirectly”: the exponential function was introduced as a solution
of a certain differential equation; the tangent function is the quotient of two
functions both having many zeroes; the function $x \mapsto \frac{x^2}{|x|}$ is not defined for
x = 0 but easily extended to the function x −→ |x| which is defined for all
x ∈ R.
Thus our starting point should be an expression f (x) defined originally for
some subset D̃ ⊂ R such that x −→ f (x) is a function on D̃. The first step
is to determine the maximal domain D of the expression f (x), i.e. the
largest set D ⊂ R such that D̃ ⊂ D and f (x) is defined on D. We distinguish
the maximal domain D of the expression f (x) from the domain of the maximal
extension of f : D̃ −→ R.
Example 11.1. Consider on D̃ = {x ∈ R | x ≠ 0 and x ≠ 1} the expression
$f(x) = \frac{x^2 - 1}{x - 1}$. This expression is well defined for x = 0 and we can extend the
domain of this expression easily to D = {x ∈ R | x ≠ 1}, obtaining a function
f : D −→ R. Since $x^2 - 1 = (x - 1)(x + 1)$ we find that $f(x) = \frac{(x-1)(x+1)}{x-1}$
which is for x ≠ 1 equal to x + 1. However for x = 1 the expression $\frac{x^2 - 1}{x - 1}$ is
not defined and we cannot extend this expression to R whereas the function
f : D −→ R, $x \mapsto \frac{x^2 - 1}{x - 1}$, has an extension to R by the function f ∗ : R −→ R,
x −→ x + 1. Indeed, for x ≠ 1 we have $f^*(x) = x + 1 = \frac{(x+1)(x-1)}{x-1} = \frac{x^2 - 1}{x - 1}$,
hence f ∗ |D = f. This distinction might look a bit artificial, however it is
not as we will see later. At the moment we agree to concentrate only on
determining the maximal domain D of the expression f (x).
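The distinction made in Example 11.1 between the expression and its extension also shows up when both are evaluated on a computer. In the following Python sketch the names `f` and `f_star` are our own; the sketch assumes floating point input and only illustrates that the expression fails at x = 1 while the extension does not.

```python
def f(x):
    """The expression (x^2 - 1)/(x - 1); it is not defined for x = 1."""
    return (x * x - 1.0) / (x - 1.0)

def f_star(x):
    """The extension x -> x + 1, which is defined for all real x."""
    return x + 1.0

if __name__ == "__main__":
    for x in (0.0, 3.0, -2.0):
        print(x, f(x), f_star(x))
    try:
        f(1.0)
    except ZeroDivisionError:
        # Evaluating the expression at x = 1 means dividing by zero.
        print("f(1) is not defined, f_star(1) =", f_star(1.0))
```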
Next we investigate symmetry and monotonicity. So far we know three sym-

metries: f can be even or odd or periodic (or none of these). Suppose
f : D −→ R is given. In order for f to be even (odd) we must have that
x ∈ D implies −x ∈ D, and in order for f to have period a we need to
have x ∈ D implies x + a ∈ D. Monotonicity is best checked (if possible) by
looking at f′.
In general D will be a proper subset of R, i.e. not equal to R. We call a
point x0 ∈ D an interior point or inner point of D if there exists ε > 0
such that (−ε + x0 , x0 + ε) ⊂ D. Assume D = (a, b) = {x ∈ R | a < x < b} is
an open interval. We claim that all points of (a, b) are inner points. Indeed,
let x0 ∈ (a, b) be given.
[Figure 11.1: the interval (a, b) containing the points x0 − ε, x0 and x0 + ε.]
Consider $\varepsilon := \frac{1}{2}\min(x_0 - a, b - x_0) > 0$. Then we claim (−ε + x0 , x0 + ε) ⊂ (a, b).
The proof is simple: x ∈ (−ε + x0 , x0 + ε) means −ε + x0 < x < x0 + ε and
with $\varepsilon = \frac{1}{2}\min(x_0 - a, b - x_0)$ we find in the case where $\varepsilon = \frac{1}{2}(x_0 - a)$ that
\[ -\frac{1}{2}(x_0 - a) + x_0 < x < x_0 + \frac{1}{2}(x_0 - a) \]
which yields
\[ a < \frac{1}{2}a + \frac{1}{2}x_0 < x < \frac{1}{2}(x_0 - a) + x_0 \le \frac{1}{2}b + \frac{1}{2}x_0 < b, \]
hence x ∈ (−ε + x0 , x0 + ε), $\varepsilon = \frac{1}{2}(x_0 - a)$, implies x ∈ (a, b). The case where
$\varepsilon = \frac{1}{2}(b - x_0)$ is proved in the same way and is left as an exercise.
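The choice of ε made above for an inner point of (a, b) is easily computed. The following Python sketch uses the hypothetical name `inner_radius` for this ε and checks on an example that the interval (x0 − ε, x0 + ε) stays inside (a, b); the numerical values are an arbitrary illustration.

```python
def inner_radius(a, b, x0):
    """Return eps = 1/2 min(x0 - a, b - x0) for a point x0 of the open interval (a, b)."""
    if not a < x0 < b:
        raise ValueError("x0 must belong to the open interval (a, b)")
    return 0.5 * min(x0 - a, b - x0)

if __name__ == "__main__":
    a, b, x0 = 0.0, 1.0, 0.25
    eps = inner_radius(a, b, x0)
    # (x0 - eps, x0 + eps) is contained in (a, b) if a < x0 - eps and x0 + eps < b.
    print(eps, a < x0 - eps, x0 + eps < b)
```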
We call x0 ∈ R a boundary point of D ⊂ R if for every ε > 0 the interval
(−ε + x0 , x0 + ε) contains at least a point belonging to D and a point belonging
to $D^\complement$, recall $D^\complement = \{x \in \mathbb{R} \mid x \notin D\}$. It may happen that a boundary point
belongs to D but it need not belong to D. Consider the set
D = (a, b] = {x ∈ R|a < x ≤ b}.
By definition a ∉ D but b ∈ D. We claim that both a and b are boundary
points.
[Figure 11.2: the interval (a, b] with the points a − ε, a, a + ε and b.]
We start with a and choose any ε > 0. The set (−ε + a, a + ε) consists of all
points x ∈ R such that −ε + a < x < a + ε, hence all points −ε + a < x ≤ a
belong to $(a, b]^\complement$ and all points a < x < a + ε belong to (a, b] provided ε ≤ b − a.
Thus a is a boundary point not belonging to (a, b]. Now, to see that b is a
boundary point take ε > 0 and consider (−ε + b, b + ε). These are all points
x satisfying −ε + b < x < b + ε. Those x satisfying −ε + b < x ≤ b belong to
(a, b], provided ε < b − a, and those satisfying b < x < b + ε belong to $(a, b]^\complement$.
Hence b is also a boundary point and it belongs to D = (a, b].
Note, in both cases we have to modify our argument if ε becomes too large;
$\varepsilon < \frac{1}{2}(b - a)$ will always be sufficient.
By definition we call −∞ and +∞ the boundary points (at infinity) of the
intervals (−∞, a) or (−∞, a] and (b, +∞) or [b, +∞), respectively, as well as
of R = (−∞, ∞). This is a slight abuse of the definition but helpful.
Typically the domains D we will have to work with will consist of a finite
union of finite or infinite intervals which could be open, closed or half-open.
However, countable unions of finite intervals may also occur, think of the
tangent function. The set ∂D of all boundary points of D (excluding −∞
and +∞) is called the boundary of D. The first task is to find all boundary
points of D.
In the following we will only investigate functions which are continuous on
D, in fact we will assume the functions to be a few times differentiable. Here
is a fact which we will prove in Part 2: if f : D −→ R is continuous and D
a finite union of bounded and closed intervals then f is bounded, i.e. there
exists M ≥ 0 such that |f (x)| ≤ M for all x ∈ D.
As the example $f : (0, 1] \longrightarrow \mathbb{R}$, $x \mapsto \frac{1}{x}$, shows, this does not hold for non-closed intervals, and $g : \mathbb{R} \longrightarrow \mathbb{R}$, $x \mapsto x$, shows that it does not hold for unbounded intervals.
We want to study the continuous function f : D −→ R at boundary points of
D. First consider the case where D is a bounded interval. In the case where
D = [a, b] is closed (and bounded) we know that f is bounded and f (a) as
well as f (b) are finite values. Suppose that D is not closed, i.e. D = (a, b]
or D = [a, b) or D = (a, b). Of course f could still be bounded, but it need
not be. If a boundary point does not belong to $D$ anything may happen. However, if a boundary point belongs to $D$, $f$ remains "locally" bounded, i.e. bounded at this boundary point (and in a small neighbourhood of it belonging to $D$), but no information is known a priori for all of $D$. Indeed, if $a \in D$ (the case $b \in D$ goes analogously) we find that $f|_{[a,\, a + \frac{b-a}{2}]}$ is continuous, hence bounded. The simple proof that $f : D \longrightarrow \mathbb{R}$ being continuous implies the continuity of $f|_{\tilde D}$, $\tilde D \subset D$, is left to the reader.
Let f : (a, b) −→ R be a continuous function. Here are some examples of
what may happen at the boundary:
Example 11.2. A. The function $f : (0, 1) \longrightarrow \mathbb{R}$, $x \mapsto \frac{1}{x}$, is unbounded as $x \to 0$. However, we can control its behaviour as $x \to 0$. It is strictly monotone decreasing, i.e. $x < y$ implies $\frac{1}{x} > \frac{1}{y}$. Further it is always non-negative.
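B. The function $f : (1, \infty) \longrightarrow \mathbb{R}$, $x \mapsto \frac{x^2 + 1}{x - 1}$, is unbounded as $x \to \infty$. However, we can find its behaviour as $x \to \infty$. Since
$$\frac{x^2 + 1}{x - 1} = x \, \frac{1 + \frac{1}{x^2}}{1 - \frac{1}{x}}$$
and $\lim_{x \to \infty} \frac{1 + \frac{1}{x^2}}{1 - \frac{1}{x}} = 1$ it follows with $g : (1, \infty) \longrightarrow \mathbb{R}$, $x \mapsto x$, that
$$\lim_{x \to \infty} \frac{f(x)}{g(x)} = 1. \qquad (11.1)$$
Now, $\lim_{x \to \infty} \frac{f(x)}{g(x)} = 1$ means that given $\epsilon > 0$ there exists $N \in \mathbb{N}$ such that for $x > N$ it follows that
$$\left| \frac{f(x)}{g(x)} - 1 \right| < \epsilon \quad \text{or} \quad |f(x) - g(x)| < \epsilon \, g(x),$$
i.e.
$$-\epsilon \, g(x) < f(x) - g(x) < \epsilon \, g(x)$$
or
$$(1 - \epsilon) g(x) < f(x) < (1 + \epsilon) g(x) \quad \text{for } x > N, \qquad (11.2)$$
recall $g(x) = x$ which is positive for $x > N$. This means that for $\epsilon > 0$ given and $x$ sufficiently large, the behaviour of $f$ is controlled by $g$.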
C. Consider $g : (0, 1) \longrightarrow \mathbb{R}$, $x \mapsto \sin \frac{1}{x}$. This function is bounded but it does not have a limit or specific asymptotic behaviour as $x \to 0$. Indeed, for the sequence $x_n = \frac{1}{n\pi}$ we have $\sin \frac{1}{x_n} = 0$, for the sequence $y_n = \frac{1}{(2n + \frac12)\pi}$ it follows that $\sin \frac{1}{y_n} = 1$, and in fact for every value $z \in [-1, 1]$ we can find a sequence $z_n$, $z_n \to 0$, such that $\sin \frac{1}{z_n} \to z$.
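The behaviour described in parts B and C can be observed numerically. The following is a minimal sketch using only Python's standard library; the sample points and the helper names `f`, `g`, `along_x` and `along_y` are illustrative choices, not part of the text.

```python
import math

def f(x):
    # f from Example 11.2.B
    return (x**2 + 1) / (x - 1)

def g(x):
    # the comparison function g(x) = x
    return x

# The quotient f(x)/g(x) approaches 1 as x grows (compare (11.1)).
ratios = [f(x) / g(x) for x in (10.0, 1e3, 1e6)]

# Example 11.2.C: two sequences tending to 0 along which sin(1/x) takes
# different constant values (up to rounding errors).
along_x = [math.sin(1 / (1 / (n * math.pi))) for n in range(1, 6)]
along_y = [math.sin(1 / (1 / ((2 * n + 0.5) * math.pi))) for n in range(1, 6)]
```

The values in `ratios` shrink towards 1, whereas `along_x` stays near 0 and `along_y` stays near 1, so no single limit of $\sin \frac{1}{x}$ can exist as $x \to 0$.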
The most interesting case is Example 11.2.B which leads to:
Definition 11.3. Let $f : (a, b) \longrightarrow \mathbb{R}$ be a function, $-\infty \le a < b \le \infty$. We call $g : (a, b) \longrightarrow \mathbb{R}$ an asymptote of $f$ at $a$ (at $b$) if
$$\lim_{x \to a} \frac{f(x)}{g(x)} = 1 \qquad (11.3)$$
$$\left( \lim_{x \to b} \frac{f(x)}{g(x)} = 1 \right).$$
If $g$ is an asymptote of $f$ at $a$ we say that as $x$ tends to $a$ the function $f$ behaves asymptotically as $g$.
Note that there are more general notions of an asymptote but the one given
is sufficient for our purpose.
Example 11.4. Consider the polynomial $p : \mathbb{R} \longrightarrow \mathbb{R}$, $x \mapsto p(x) = \sum_{j=0}^{N} a_j x^j$, with $a_N \ne 0$. We claim that $g(x) = a_N x^N$ is an asymptote of $p$ as $x \to +\infty$. We have to prove
$$\lim_{x \to \infty} \frac{p(x)}{a_N x^N} = 1.$$
Since
$$\frac{\sum_{j=0}^{N} a_j x^j}{a_N x^N} = \sum_{j=0}^{N} \frac{a_j}{a_N} x^{j-N} = 1 + \sum_{j=0}^{N-1} \frac{a_j}{a_N} x^{j-N}$$
it remains to prove
$$\lim_{x \to \infty} \sum_{j=0}^{N-1} \frac{a_j}{a_N} x^{j-N} = 0.$$
But we know that $\lim_{x \to \infty} x^{j-N} = 0$ for $j < N$. Note that the same argument yields that $g(x) = a_N x^N$ is also an asymptote of $p(x)$ as $x \to -\infty$. Further, this example shows that an asymptote is not uniquely determined. Take for simplicity $p(x) = x^2 + 1$, then $x \mapsto x^2$ is an asymptote, but by a trivial calculation it is easy to see that $x \mapsto x^2 + c$, $c \in \mathbb{R}$, is a further one.
Now suppose we are given a continuous function $f : D \longrightarrow \mathbb{R}$ where $D$ is maximal and has boundary points $a_1, \dots, a_N$ ($\pm\infty$ might be included). In order to investigate $f$ we need to determine the behaviour of $f$ at $a_1, \dots, a_N$. The function might be bounded at some boundary points, it might have asymptotes at other boundary points, but there might also be boundary points where we have quite an irregular behaviour, i.e. we end up with no specific statement.
In order to obtain asymptotes we need to calculate limits such as
$$\lim_{x \to a} \frac{f(x)}{g(x)}$$
where both $f$ and $g$ may tend to zero as $x \to a$, or may tend to infinity as $x \to a$. (Note $a = \pm\infty$ is allowed.)
Without proof (see [3, p. 152]) we state
Theorem 11.5 (de l'Hospital). Let $f$ and $g$ be differentiable functions defined on $(a, b)$, $-\infty \le a < b \le \infty$, and suppose that $g'(x) \ne 0$ for all $x \in (a, b)$. Suppose that either
$$\lim_{\substack{x \to a \\ x \ne a}} f(x) = \lim_{\substack{x \to a \\ x \ne a}} g(x) = 0 \qquad (11.4)$$
or
$$\lim_{\substack{x \to a \\ x \ne a}} g(x) = +\infty \text{ or } -\infty. \qquad (11.5)$$
Then
$$\lim_{\substack{x \to a \\ x \ne a}} \frac{f(x)}{g(x)} = \lim_{\substack{x \to a \\ x \ne a}} \frac{f'(x)}{g'(x)} \qquad (11.6)$$
provided the limit on the right hand side exists. An analogous statement holds for the boundary point $b$.
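Example 11.6. A. For $\alpha > 0$ we have
$$\lim_{x \to \infty} \frac{e^{\alpha x}}{x} = \lim_{x \to \infty} \frac{\alpha e^{\alpha x}}{1} = +\infty. \qquad (11.7)$$
B. For every polynomial $p(x) = \sum_{j=0}^{N} a_j x^j$, $a_N \ne 0$, and $\alpha > 0$ we have
$$\lim_{x \to \infty} \Big( \sum_{j=0}^{N} a_j x^j \Big) e^{-\alpha x} = 0. \qquad (11.8)$$
Indeed
$$\lim_{x \to +\infty} \frac{\sum_{j=0}^{N} a_j x^j}{e^{\alpha x}} = \lim_{x \to +\infty} \frac{\frac{d}{dx} \sum_{j=0}^{N} a_j x^j}{\alpha e^{\alpha x}} = \dots = \lim_{x \to +\infty} \frac{\frac{d^N}{dx^N} \sum_{j=0}^{N} a_j x^j}{\alpha^N e^{\alpha x}} = \lim_{x \to +\infty} \frac{N! \, a_N}{\alpha^N e^{\alpha x}} = 0.$$
(We are allowed of course to iterate applications of de l'Hospital's rule.)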
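C. We claim
$$\lim_{\substack{x \to 0 \\ x > 0}} x^x = 1. \qquad (11.9)$$
First note that by the continuity of exp we have
$$\lim_{\substack{x \to 0 \\ x > 0}} x^x = \lim_{\substack{x \to 0 \\ x > 0}} \exp(x \ln x) = \exp\Big( \lim_{\substack{x \to 0 \\ x > 0}} x \ln x \Big).$$
Now
$$\lim_{\substack{x \to 0 \\ x > 0}} (x \ln x) = \lim_{\substack{x \to 0 \\ x > 0}} \frac{\ln x}{\frac{1}{x}} = \lim_{\substack{x \to 0 \\ x > 0}} \frac{\frac{1}{x}}{-\frac{1}{x^2}} = \lim_{\substack{x \to 0 \\ x > 0}} (-x) = 0,$$
hence
$$\lim_{\substack{x \to 0 \\ x > 0}} x^x = \exp\Big( \lim_{\substack{x \to 0 \\ x > 0}} (x \ln x) \Big) = \exp(0) = 1.$$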
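The limits (11.8) and (11.9) can be made plausible by evaluating the expressions at a few points. This is only a numerical sketch and not a proof; the polynomial coefficients, the value of $\alpha$ and the sample points are hypothetical choices made for illustration.

```python
import math

def poly_times_exp(coeffs, alpha, x):
    """Evaluate (sum_j coeffs[j] * x**j) * exp(-alpha * x)."""
    p = sum(a * x**j for j, a in enumerate(coeffs))
    return p * math.exp(-alpha * x)

# Hypothetical polynomial p(x) = 1 - 2x + 5x^3 and alpha = 0.5, compare (11.8).
coeffs = [1.0, -2.0, 0.0, 5.0]
decay = [poly_times_exp(coeffs, 0.5, x) for x in (10.0, 50.0, 200.0)]

# x**x for small positive x, compare (11.9).
x_to_x = [x**x for x in (1e-1, 1e-3, 1e-6)]
```

The entries of `decay` fall rapidly towards 0 even though the polynomial factor grows, and the entries of `x_to_x` increase towards 1.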
Now, knowing how to investigate functions at the boundary of their domains we turn to the interior of the domain, i.e. all points $x \in D$ which together with a small open interval $(-\epsilon + x, x + \epsilon)$ belong to $D$.
We assume that $f : D \longrightarrow \mathbb{R}$ is twice continuously differentiable. We want to determine local extreme values. For this we know what to do: determine all zeroes $x_1, \dots, x_K$ of $f'$ in $D$, and then consider $f''(x_j)$. If $f''(x_j) > 0$ then we have a local minimum, if $f''(x_j) < 0$ then we have a local maximum. Special consideration is needed for points where $f'(x_l) = f''(x_l) = 0$. It is still possible for a function to have a local extreme value at such a point, for example $f : \mathbb{R} \to \mathbb{R}$, $f(x) = x^4$, has a local (and global) minimum at $x = 0$, however $f'(0) = f''(0) = 0$. On the other hand, for $g : \mathbb{R} \to \mathbb{R}$, $g(x) = x^3$, we also have $g'(0) = g''(0) = 0$, but at $x = 0$ the function $g$ does not have a local extreme value, in fact it is an example of a point of inflexion. If $x < 0$ then $g(x) < 0$ and if $x > 0$ then $g(x) > 0$, while $g(0) = 0$.
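When the second derivative vanishes at a critical point, one way to decide the matter numerically is to look at the sign of the first derivative on both sides of the point. The sketch below uses this sign-change test, which is a different technique from the second-derivative test described above; the function name `classify_critical_point` and the step size `h` are assumptions made for illustration, and the test is only meaningful if the derivative has no further zero within distance `h`.

```python
def classify_critical_point(df, x0, h=1e-3):
    """Classify a zero x0 of the derivative df by the signs of df near x0."""
    left, right = df(x0 - h), df(x0 + h)
    if left < 0 < right:
        # decreasing before x0, increasing after x0
        return "local minimum"
    if left > 0 > right:
        # increasing before x0, decreasing after x0
        return "local maximum"
    return "no local extreme value detected"

kind_f = classify_critical_point(lambda x: 4 * x**3, 0.0)  # f(x) = x^4
kind_g = classify_critical_point(lambda x: 3 * x**2, 0.0)  # g(x) = x^3
```

For $f(x) = x^4$ the derivative changes sign from negative to positive at $0$, while for $g(x) = x^3$ it is positive on both sides, in agreement with the discussion above.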
Let us summarise our method: given a function f : D̃ → R, D̃ ⊂ R, in order
to properly investigate its behaviour we do the following:
• we determine its maximal domain D;
• we determine all of its symmetries;
• we investigate whether it is monotone or not;
• we study its behaviour at the boundary points of D;
• we look for local extreme values;
• we try to sketch the graph.
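We want to investigate the hyperbolic functions:
$$\sinh x := \frac{e^x - e^{-x}}{2}; \qquad (11.10)$$
$$\cosh x := \frac{e^x + e^{-x}}{2}; \qquad (11.11)$$
$$\tanh x := \frac{\sinh x}{\cosh x} = \frac{e^x - e^{-x}}{e^x + e^{-x}}; \qquad (11.12)$$
and
$$\coth x := \frac{\cosh x}{\sinh x} = \frac{e^x + e^{-x}}{e^x - e^{-x}}. \qquad (11.13)$$
Other hyperbolic functions are:
$$\operatorname{cosech} x := \frac{1}{\sinh x}; \qquad (11.14)$$
and
$$\operatorname{sech} x := \frac{1}{\cosh x}. \qquad (11.15)$$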
We start with sinh. The domain of sinh is obviously $\mathbb{R}$, and since
$$\sinh(-x) = \frac{e^{-x} - e^{-(-x)}}{2} = -\frac{e^x - e^{-x}}{2} = -\sinh(x), \qquad (11.16)$$
sinh is an odd function with $\sinh(0) = 0$. Asymptotes $g_1$ for $x \to \infty$ and $g_2$ for $x \to -\infty$ are determined by $g_1(x) = \frac{e^x}{2}$ and $g_2(x) = -\frac{e^{-x}}{2}$. Indeed we have
$$\lim_{x \to \infty} \frac{\sinh x}{g_1(x)} = \lim_{x \to \infty} \frac{e^x - e^{-x}}{e^x} = \lim_{x \to \infty} \left( 1 - e^{-2x} \right) = 1,$$
and
$$\lim_{x \to -\infty} \frac{\sinh x}{g_2(x)} = \lim_{x \to -\infty} \frac{e^x - e^{-x}}{-e^{-x}} = \lim_{x \to -\infty} \left( 1 - e^{2x} \right) = 1.$$
Further we find
$$\sinh'(x) = \frac{e^x + e^{-x}}{2} = \cosh x > 0, \qquad (11.17)$$
implying that sinh is strictly monotone increasing. The graph of sinh looks like:
[Figure 11.3: graph of sinh.]
For the domain $D$ of cosh we again find that $D = \mathbb{R}$ and from
$$\cosh(-x) = \frac{e^{-x} + e^{-(-x)}}{2} = \frac{e^{-x} + e^x}{2} = \cosh(x)$$
we deduce that cosh is an even function, which implies that cosh cannot be strictly monotone. An asymptote for $x \to \infty$ is $g_1(x) = \frac{e^x}{2}$ and for $x \to -\infty$ an asymptote is $g_3(x) = \frac{e^{-x}}{2}$. Indeed we find
$$\lim_{x \to \infty} \frac{\cosh x}{g_1(x)} = \lim_{x \to \infty} \frac{e^x + e^{-x}}{e^x} = \lim_{x \to \infty} \left( 1 + e^{-2x} \right) = 1$$
and
$$\lim_{x \to -\infty} \frac{\cosh x}{g_3(x)} = \lim_{x \to -\infty} \frac{e^x + e^{-x}}{e^{-x}} = \lim_{x \to -\infty} \left( 1 + e^{2x} \right) = 1.$$
Since
$$\cosh'(x) = \frac{e^x - e^{-x}}{2} = \sinh(x) \qquad (11.18)$$
we find that $x_0 = 0$ is the only zero of $\cosh'$. Further
$$\cosh''(x) = \sinh'(x) = \cosh(x) > 0$$
for all $x$. Hence cosh has a minimum at $x_0 = 0$ with value $\cosh 0 = \frac{e^0 + e^0}{2} = 1$.
The graph of cosh is given by
[Figure 11.4: graph of cosh.]
Next we discuss tanh. Since $\cosh x \ne 0$ for all $x \in \mathbb{R}$ we find that the domain of tanh is again $\mathbb{R}$. Further, tanh is the product of an even and an odd function, hence it is an odd function. From
$$\tanh'(x) = \left( \frac{\sinh x}{\cosh x} \right)' = \frac{\cosh^2 x - \sinh^2 x}{\cosh^2 x} = \frac{1}{\cosh^2 x} \qquad (11.19)$$
we deduce that tanh is strictly monotone increasing. Note that we have used
$$\cosh^2 x - \sinh^2 x = 1 \qquad (11.20)$$
which is left as an exercise. Since
$$\tanh x = \frac{e^x - e^{-x}}{e^x + e^{-x}}$$
we get for $x \to \infty$
$$\lim_{x \to \infty} \tanh(x) = \lim_{x \to \infty} 1 \cdot \frac{1 - e^{-2x}}{1 + e^{-2x}} = 1$$
and for $x \to -\infty$ we have
$$\lim_{x \to -\infty} \tanh(x) = \lim_{x \to -\infty} (-1) \cdot \frac{1 - e^{2x}}{1 + e^{2x}} = -1.$$
Thus $x \mapsto 1$ is an asymptote for $x \to \infty$ and $x \mapsto -1$ is an asymptote for $x \to -\infty$. The graph of tanh looks like
[Figure 11.5: graph of tanh.]
Finally we consider $\coth x = \frac{\cosh x}{\sinh x}$. Since $\sinh(0) = 0$, coth is only defined on $\mathbb{R} \setminus \{0\}$. Moreover, as a product of an even and an odd function it is an odd function. Thus we may restrict our discussions to $x > 0$.
The derivative of coth is given by
$$\coth'(x) = \left( \frac{\cosh x}{\sinh x} \right)' = \frac{\sinh^2 x - \cosh^2 x}{\sinh^2 x} = -\frac{1}{\sinh^2 x}, \qquad (11.21)$$
which is for $x \ne 0$ always strictly negative, hence $\coth|_{\{x \in \mathbb{R} \mid x > 0\}}$ and $\coth|_{\{x \in \mathbb{R} \mid x < 0\}}$ are strictly decreasing functions.
For $x \to \infty$ we find
$$\lim_{x \to \infty} \coth x = \lim_{x \to \infty} \frac{e^x + e^{-x}}{e^x - e^{-x}} = 1$$
implying that $x \mapsto 1$ is an asymptote for coth as $x \to \infty$.
Further, for $x \to 0$, $x > 0$, we find that
$$\frac{\cosh x}{\sinh x} = \frac{e^x + e^{-x}}{e^x - e^{-x}} \longrightarrow +\infty.$$
Thus coth, when it is restricted to $(0, \infty)$, decreases from $+\infty$ to $1$ as $x \to \infty$. Using $\coth(-x) = -\coth(x)$ we find that $x \mapsto -1$ is an asymptote for $x \to -\infty$ and that $\coth x \to -\infty$ as $x \to 0$ for $x < 0$. The graph of coth is given by
[Figure 11.6: graph of coth.]
Note that Figure 11.6 suggests that $R(\coth)$ has a gap, namely the interval $[-1, 1]$. It is the discontinuity of coth at $x = 0$ which permits such behaviour.
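Since sinh is strictly increasing with range $R(\sinh) = \mathbb{R}$ it has an inverse defined on $\mathbb{R}$. By definition
$$\operatorname{arsinh} x := \sinh^{-1} x. \qquad (11.22)$$
The notation comes from area sinus hyperbolicus and in some books one may see the notation area sinh for arsinh. Using formula (7.7) for the derivative of the inverse function we find
$$\left( \sinh^{-1} \right)'(y) = \frac{1}{\sinh'\left( \sinh^{-1}(y) \right)} = \frac{1}{\cosh\left( \sinh^{-1}(y) \right)}.$$
Now, $\cosh x = \sqrt{1 + \sinh^2 x}$, recall (11.20), implying that
$$\left( \sinh^{-1} \right)'(y) = \frac{1}{\sqrt{1 + \sinh^2\left( \sinh^{-1}(y) \right)}} = \frac{1}{\sqrt{1 + y^2}}. \qquad (11.23)$$
We claim
$$\operatorname{arsinh} x = \ln\left( x + \sqrt{x^2 + 1} \right). \qquad (11.24)$$
Note that
$$\frac{d}{dx} \ln\left( x + \sqrt{x^2 + 1} \right) = \frac{\frac{d}{dx}\left( x + \sqrt{x^2 + 1} \right)}{x + \sqrt{x^2 + 1}} = \frac{1}{\sqrt{x^2 + 1}},$$
i.e.
$$\frac{d}{dx} \ln\left( x + \sqrt{x^2 + 1} \right) = \operatorname{arsinh}'(x),$$
and therefore they differ only by a constant:
$$\operatorname{arsinh} x = c + \ln\left( x + \sqrt{x^2 + 1} \right).$$
But $\operatorname{arsinh} 0 = 0$ which gives
$$0 = \operatorname{arsinh} 0 = c + \ln\left( 0 + \sqrt{0 + 1} \right) = c,$$
i.e. $c = 0$ and (11.24) holds.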

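The function tanh is strictly increasing with range $(-1, 1)$, hence it has an inverse
$$\operatorname{artanh} : (-1, 1) \longrightarrow \mathbb{R}, \quad \operatorname{artanh} := \tanh^{-1}. \qquad (11.25)$$
We want to find $\operatorname{artanh}'$. First note that
$$\tanh'(x) = \frac{1}{\cosh^2 x} = \frac{\cosh^2 x - \sinh^2 x}{\cosh^2 x} = 1 - \tanh^2(x).$$
Now we get by (7.7)
$$\operatorname{artanh}'(y) = \left( \tanh^{-1} \right)'(y) = \frac{1}{\tanh'\left( \tanh^{-1}(y) \right)} = \frac{1}{\left( 1 - \tanh^2 \right)\left( \tanh^{-1}(y) \right)} = \frac{1}{1 - y^2},$$
i.e. we have
$$\operatorname{artanh}'(x) = \frac{1}{1 - x^2} \quad \text{for } -1 < x < 1. \qquad (11.26)$$
As in the case of arsinh we can prove
$$\operatorname{artanh}(x) = \frac{1}{2} \ln \frac{1 + x}{1 - x}. \qquad (11.27)$$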
In the exercises there will be questions related to the inverse function of $\cosh|_{[0, \infty)}$. This function is denoted by arcosh and is defined on $[1, \infty)$. Its derivative is given by
$$\operatorname{arcosh}'(x) = \frac{1}{\sqrt{x^2 - 1}}, \quad x > 1 \qquad (11.28)$$
and we have
$$\operatorname{arcosh} x = \ln\left( x + \sqrt{x^2 - 1} \right), \quad x > 1. \qquad (11.29)$$
For coth we restrict our attention first to values $x > 1$. Thus $\coth|_{(0, \infty)}$ is considered as a strictly decreasing function with range $(1, \infty)$. This function has an inverse function arcoth and we have
$$\operatorname{arcoth}'(x) = \frac{1}{1 - x^2}, \quad x > 1 \qquad (11.30)$$
and
$$\operatorname{arcoth}(x) = \frac{1}{2} \ln \frac{x + 1}{x - 1}, \quad x > 1. \qquad (11.31)$$
Using the symmetry of coth we can extend (11.30) and (11.31) to $x < -1$.
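The logarithmic formulae (11.24), (11.27), (11.29) and (11.31) can be compared with the inverse hyperbolic functions of Python's `math` module. This is a small numerical sketch; the test arguments are arbitrary, and since the standard library has no inverse of coth we use the observation that $\coth y = x$ is equivalent to $\tanh y = \frac{1}{x}$.

```python
import math

def arsinh(x):
    # formula (11.24)
    return math.log(x + math.sqrt(x * x + 1))

def artanh(x):
    # formula (11.27), valid for -1 < x < 1
    return 0.5 * math.log((1 + x) / (1 - x))

def arcosh(x):
    # formula (11.29), valid for x > 1
    return math.log(x + math.sqrt(x * x - 1))

def arcoth(x):
    # formula (11.31), valid for x > 1
    return 0.5 * math.log((x + 1) / (x - 1))

# Pairs (value from the formula, value from the math module).
checks = [
    (arsinh(1.5), math.asinh(1.5)),
    (artanh(0.3), math.atanh(0.3)),
    (arcosh(2.0), math.acosh(2.0)),
    (arcoth(2.0), math.atanh(1 / 2.0)),
]
```

Each pair in `checks` agrees up to rounding errors.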
Problems
1. Consider the set D := [−1, 2) ∪ {3, 4} ∪ [5, 6]. Find every interior point
of D and the boundary ∂D of D.
2. For each of the following expressions find the maximal set D ⊂ R such
that on D the expressions define functions.

a) (x2 − 1) (x2 + 4x).
b) $\frac{\cos(\ln(\arctan x))}{x^3 + 4x^2 - 5x}$.
c) $\left( (\sinh x)(1 - x^4) \right)^{\frac12}$.
d) $\cot(\arcsin x)$.
3. Use l'Hospital's rules to find the following limits. If necessary, iterate an application of these rules.
a) $\lim_{x \to 1} \frac{1 + \cos \pi x}{x^2 - 2x + 1}$;
b) $\lim_{\substack{t \to 0 \\ t > 0}} \frac{\ln(\cos 3t)}{\ln(\cos 2t)}$;
c) $\lim_{y \to \infty} \frac{3y^2 - y + 5}{5y^2 - 6y - 3}$;
d) $\lim_{u \to 0} \left( \frac{1}{\sin^2 u} - \frac{1}{u^2} \right)$.
Hint: rewrite $\frac{1}{\sin^2 u} - \frac{1}{u^2}$ as $\frac{u^2 - \sin^2 u}{u^4} \cdot \frac{u^2}{\sin^2 u}$, note that $\lim_{x \to 0} \frac{\sin x}{x} = 1$, and when applying l'Hospital's rules to $\frac{u^2 - \sin^2 u}{u^4}$ make use of the addition theorems for trigonometric functions.
4. a) For $g : \mathbb{R} \longrightarrow \mathbb{R}$ find the asymptote as $x \to +\infty$ where
$$g(x) = \ln\left( 1 + x^2 + e^{x^2} \right).$$
b) Find the asymptote as $t \to \pm\infty$ for the function
$$h(t) = e^{-\frac{1}{1 + t^2}}.$$
5. Following the method introduced in this chapter investigate the follow-

ing functions fj : D̃j −→ R and sketch their graphs. (Note that D̃j ⊂ R
is some domain, therefore firstly find the maximal domain Dj .)
a) $f_1 : \tilde D_1 \longrightarrow \mathbb{R}$, $f_1(x) = \frac{2x^2 + 12x - 2}{15 (x^2 - 1)^{\frac12}}$, $\tilde D_1 = [2, \infty)$;
s2
π π
b) f2 : D̃2 −→ R, f2 (s) = tan 1+s 4 , D̃2 = , ;
6 4

2
c) f3 : D̃3 −→ R, f3 (t) = arsinh 1 − e−t , D̃3 = R+ .
6. Prove the following formulae for hyperbolic functions:

a) cosh2 x − sinh2 x = 1;
b) sinh2 x = 1
cosh2 x−1
;
c) sinh(x ± y) = sinh x cosh y ± cosh x sinh y;
d) $\tanh(x - y) = \frac{\tanh x - \tanh y}{1 - \tanh x \tanh y}$.
The following identity may be used:
cosh(x − y) = cosh x cosh y − sinh x sinh y.
12 Integrating Functions
Let us start to analyse a natural problem in mathematics. Given a continuous function $g : [a, b] \to \mathbb{R}$, $a, b \in \mathbb{R}$, $a < b$, can we find a function $f : [a, b] \to \mathbb{R}$ such that on $(a, b)$
$$f'(t) = g(t)? \qquad (12.1)$$
Let us assume that we know the value of $f(a)$. A very rough approximation of $f'(t)$, $a < t < b$, is
$$\frac{f(t) - f(a)}{t - a}.$$
Hence (12.1) would give
$$f(t) - f(a) \approx g(t)(t - a) \qquad (12.2)$$
or
$$f(t) \approx f(a) + g(t)(t - a), \qquad (12.3)$$
where $g(t) \approx h(t)$ means that $g$ is close to $h$. There is a simple geometric interpretation of the right hand side of (12.2):
[Figure 12.1: the graph of $y = g(x)$ over $[a, b]$ together with the rectangle of height $g(t)$ over $[a, t]$.]
The area of the rectangle with vertices (a, 0), (t, 0), (t, g(t)) and (a, g(t)) is
given by g(t)(t − a). Of course, when t varies in [a, b] we obtain a function
t → g(t)(t − a) + f (a). (12.4)
But only for very small values of t − a, t > a, do we expect the function
(12.4) to be a reasonable approximation of a function f satisfying (12.1).
However we may improve the approximation. Given $t \in (a, b)$ as before, take $t_1 \in (a, b)$, $t_1 < t$, and note that
$$f(t) - f(a) = f(t) - f(t_1) + f(t_1) - f(a) \approx g(t)(t - t_1) + g(t_1)(t_1 - a).$$
[Figure 12.2: the graph of $y = g(x)$ with the two rectangles of heights $g(t_1)$ and $g(t)$ over $[a, t_1]$ and $[t_1, t]$.]
Iterating this process $n$ times we find with $a < t_1 < t_2 < \dots < t_n < t < b$,
$$f(t) - f(a) = f(t) - f(t_n) + f(t_n) - f(t_{n-1}) + \dots + f(t_1) - f(a)$$
$$\approx g(t)(t - t_n) + g(t_n)(t_n - t_{n-1}) + \dots + g(t_1)(t_1 - a) = \sum_{j=1}^{n+1} g(t_j)(t_j - t_{j-1}), \qquad (12.5)$$
where $t_0 := a$ and $t_{n+1} := t$.
[Figure 12.3: the graph of $y = g(x)$ with the rectangles of heights $g(t_1), g(t_2), g(t_3), \dots, g(t)$ over the sub-intervals of $a < t_1 < t_2 < t_3 < \dots < t_n < t = t_{n+1}$.]
Now letting $n \to \infty$ such that $\max_{1 \le j \le n+1} (t_j - t_{j-1}) \to 0$, we may conjecture that $f(t) - f(a)$ is given by the area bounded by the sets $\{(x, y) \in \mathbb{R}^2 \mid x = a\}$, $\{(x, y) \in \mathbb{R}^2 \mid x = t\}$, $\{(x, y) \in \mathbb{R}^2 \mid y = 0\}$, and $\{(x, y) \in \mathbb{R}^2 \mid y = g(x)\}$, or in short the area of the set bounded by the $x$-axis, the function $g$ and the lines $x = a$ and $x = t$.
Although this is the correct conjecture we must overcome some problems to
justify this solution. Most of all, we need to define what is meant by “the
area bounded by the x-axis, the function g and the lines x = a and x = t”.
Let $g : [a, b] \to \mathbb{R}$ be a continuous function (which must be bounded, as every continuous function on a closed and bounded interval is). Let $a = t_0 < t_1 < t_2 < \dots < t_n < t_{n+1} = b$ be a finite sequence of points in $[a, b]$:
[Figure 12.4: the points $a = t_0 < t_1 < \dots < t_6 < t_7 = b$ marked on the real line.]
We call such a finite sequence a partition of $[a, b]$ into sub-intervals $[t_{j-1}, t_j]$, $j = 1, \dots, n + 1$, and we sometimes write $Z(t_1, \dots, t_n)$ or just $Z_n$ for such a partition. The number
$$m(Z_n) := \max\{t_j - t_{j-1} \mid j = 1, \dots, n + 1\} \qquad (12.6)$$
is called the mesh size or width of the partition $Z_n$. Given a partition $Z_n$ we can form the (Riemann) sum (of $g$ with respect to $Z_n$)
$$S_r(g, Z_n) := \sum_{j=1}^{n+1} g(t_j)(t_j - t_{j-1}). \qquad (12.7)$$
In the case where $g \ge 0$ we already know an interpretation of $S_r(g, Z_n)$ as an approximation of the area bounded by the $x$-axis, the function $g$ and the lines $x = a$ and $x = b$:
[Figure 12.5: the Riemann sum $S_r(g, Z_n)$ as a sum of rectangle areas under $y = g(x)$ for the partition $a = t_0 < t_1 < \dots < t_7 = b$.]
In fact we may generalise $S_r(g, Z_n)$ slightly to the general Riemann sum of $g$ with respect to $Z_n$ and points $\xi_j \in [t_{j-1}, t_j]$, which is defined by:
$$S(g, Z_n, \xi) := \sum_{j=1}^{n+1} g(\xi_j)(t_j - t_{j-1}) \qquad (12.8)$$
where $\xi = (\xi_1, \dots, \xi_{n+1})$.
[Figure 12.6: the general Riemann sum for $y = g(x)$ with intermediate points $\xi_1, \dots, \xi_7$ in the sub-intervals of $a = t_0 < t_1 < \dots < t_7 = b$.]
Definition 12.1. Let $g : [a, b] \to \mathbb{R}$, $a, b \in \mathbb{R}$, $a < b$, be a continuous function. Suppose there exists a number $I_{a,b}(g) \in \mathbb{R}$ such that for every $\epsilon > 0$ there exists $\delta > 0$ with the property that if $Z_n$ is any finite partition of $[a, b]$ with mesh size $m(Z_n) < \delta$ then
$$|I_{a,b}(g) - S_r(g, Z_n)| < \epsilon.$$
In this case we call $I_{a,b}(g)$ the (Riemann) integral of $g$ over the interval $[a, b]$ and denote it by
$$\int_a^b g(t) \, dt := I_{a,b}(g). \qquad (12.9)$$
In Chapter 25, in particular Theorem 25.24, we will discuss in detail Riemann sums and their relation to Riemann integrability. Without proof we quote
Theorem 12.2. For every continuous function $g : [a, b] \to \mathbb{R}$, $a, b \in \mathbb{R}$, $a < b$, the integral $\int_a^b g(t) \, dt$ exists and its value can be calculated by using (12.8) instead of (12.7).
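The sums (12.7) and (12.8) are easy to compute for a concrete function. The following sketch evaluates both for $g(t) = t^2$ on $[0, 1]$; the helper names `riemann_sum` and `uniform_partition`, the equidistant partition and the choice of midpoints as intermediate points are assumptions made for illustration only.

```python
def riemann_sum(g, points, tags=None):
    """Sum g(xi_j) * (t_j - t_{j-1}) over the partition given by `points`.

    Without `tags` the right endpoints xi_j = t_j are used, as in (12.7);
    otherwise `tags` supplies the intermediate points of (12.8).
    """
    if tags is None:
        tags = points[1:]
    return sum(g(xi) * (t1 - t0) for xi, t0, t1 in zip(tags, points, points[1:]))

def uniform_partition(a, b, n):
    """Equidistant points a = t_0 < t_1 < ... < t_n = b."""
    return [a + (b - a) * j / n for j in range(n + 1)]

def square(t):
    return t * t

pts = uniform_partition(0.0, 1.0, 1000)
right = riemann_sum(square, pts)
mid = riemann_sum(square, pts, [(s + t) / 2 for s, t in zip(pts, pts[1:])])
```

Both `right` and `mid` are close to $\frac13$, which is the value of $\int_0^1 t^2 \, dt$ obtained later from (12.18), in line with the statement of Theorem 12.2 that either kind of sum may be used.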
Definition 12.3. The area $A$ of a set in $\mathbb{R}^2$ bounded by the $x$-axis, a non-negative continuous function $g : [a, b] \to \mathbb{R}$, $a, b \in \mathbb{R}$, $a < b$, and the lines $x = a$ and $x = b$, is by definition
$$A := \int_a^b g(t) \, dt. \qquad (12.10)$$
Remark 12.4. A. Note that Definition 12.3 is not tautological: it is a non-trivial problem to define the area of an arbitrary subset of $\mathbb{R}^2$.
B. Let us agree to define for any function $g : [a, b] \to \mathbb{R}$
$$\int_c^c g(t) \, dt = 0 \quad \text{for all } c \in [a, b]. \qquad (12.11)$$
This definition is justified by the idea that the interval $[c, c]$ has length zero, hence the rectangle with one side of length $g(t)$ and the other of length $0$ should have area $0$.
Let $g : [a, b] \to \mathbb{R}$ be a continuous function. We define a new function $f : [a, b] \to \mathbb{R}$ by
$$x \mapsto f(x) := \int_a^x g(t) \, dt. \qquad (12.12)$$
Since $[a, x]$ is a closed and bounded interval and $g|_{[a, x]}$ is continuous, $f(x)$ is well defined.
The following theorem is important:
Theorem 12.5. Let $g : [a, b] \to \mathbb{R}$ be a continuous function. Then $f : [a, b] \to \mathbb{R}$ defined by (12.12) is differentiable and we have
$$f'(x) = g(x), \qquad (12.13)$$
i.e. we have
$$\frac{d}{dx} f(x) = \frac{d}{dx} \int_a^x g(t) \, dt = g(x). \qquad (12.14)$$
We will prove this result in Part 2 of our course in a similar way as to how we motivated the introduction of the integral. Note that Theorem 12.5 allows us to calculate integrals. First we give
Definition 12.6. Let $g : [a, b] \to \mathbb{R}$ be a function. We call a differentiable function $f$ a primitive of $g$ if $f' = g$.
Hence by Theorem 12.5, $x \mapsto \int_a^x g(t) \, dt$ is a primitive of $g$. A primitive of a function $g$ is not unique. If $f$ is a primitive of $g$ then for every constant $c \in \mathbb{R}$ a further primitive of $g$ is given by $f + c$ since $(f + c)' = f'$. It is important that this is the only type of non-uniqueness of a primitive: if $f$ and $h$ are two primitives of $g$ then there exists a constant $c \in \mathbb{R}$ such that $f - h = c$. Indeed, being a primitive implies
$$(f - h)' = g - g = 0,$$
which yields $f - h = c$.
Theorem 12.7 (Fundamental Theorem of Calculus). Let $g : [a, b] \to \mathbb{R}$ be a continuous function and let $h$ be a primitive of $g$. Then we have
$$\int_a^b g(t) \, dt = h(b) - h(a). \qquad (12.15)$$
Proof. We know that $f$ defined by (12.12) is a primitive of $g$. Since $f(a) = 0$ and $f(b) = \int_a^b g(t) \, dt$ we find in this case that
$$\int_a^b g(t) \, dt = f(b) - f(a).$$
Now, if $h$ is any further primitive, then $f - h = c$ implying that
$$f(b) - f(a) = h(b) - h(a)$$
and (12.15) follows.
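Formula (12.15) can be checked numerically by comparing a Riemann sum with the difference of a primitive at the end points. The sketch below does this for $g = \cos$ with primitive $h = \sin$; the interval, the number of sub-intervals and the helper name `right_riemann_sum` are illustrative assumptions.

```python
import math

def right_riemann_sum(g, a, b, n):
    """Riemann sum (12.7) of g for the equidistant partition of [a, b] into n parts."""
    width = (b - a) / n
    return sum(g(a + j * width) for j in range(1, n + 1)) * width

a, b = 0.5, 2.0
approx = right_riemann_sum(math.cos, a, b, 100_000)
exact = math.sin(b) - math.sin(a)  # h(b) - h(a) with the primitive h = sin
```

The two numbers `approx` and `exact` differ only by a small discretisation error that shrinks as the mesh size decreases.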
Let us now introduce a useful notation. If $f$ is a primitive of $g$ we write
$$\int_a^b g(t) \, dt = f \Big|_a^b. \qquad (12.16)$$
Now we use the fundamental theorem to evaluate integrals.
Example 12.8. For $k \in \mathbb{N}$ we have
$$\int_a^b x^k \, dx = \frac{x^{k+1}}{k + 1} \Big|_a^b, \qquad (12.17)$$
i.e. $f(x) = \frac{x^{k+1}}{k+1}$ is a primitive of $g(x) = x^k$ and
$$\int_a^b x^k \, dx = \frac{b^{k+1}}{k + 1} - \frac{a^{k+1}}{k + 1}. \qquad (12.18)$$
We only have to note that
$$f'(x) = \frac{d}{dx} \left( \frac{x^{k+1}}{k + 1} \right) = (k + 1) \frac{1}{k + 1} x^{k+1-1} = x^k.$$
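Example 12.9. For $0 < a < b$ we have
$$\int_a^b \frac{1}{x} \, dx = \ln x \Big|_a^b = \ln b - \ln a, \qquad (12.19)$$
since $\frac{d}{dx} \ln x = \frac{1}{x}$ for $x > 0$. Further we have
$$\ln b - \ln a = \ln b + \ln a^{-1} = \ln \frac{b}{a}.$$
Example 12.10. Let $k \in \mathbb{Z}$, $k < -1$, and write $k = -n$ with $n > 1$. Further assume that either $a < b < 0$ or $0 < a < b$. Then we find
$$\int_a^b x^{-n} \, dx = \frac{x^{-n+1}}{-n + 1} \Big|_a^b. \qquad (12.21)$$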
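Example 12.11. Let $\alpha > 0$ and define for $x > 0$
$$x^\alpha := e^{\alpha \ln x}. \qquad (12.22)$$
For $\alpha = n \in \mathbb{N}$ we find
$$e^{n \ln x} = e^{\ln x} \cdots e^{\ln x} = x \cdots x = x^n,$$
thus (12.22) generalises the power function. Moreover we have
$$\frac{d}{dx} x^\alpha = \frac{d}{dx} e^{\alpha \ln x} = \alpha \frac{1}{x} e^{\alpha \ln x} = \alpha x^{\alpha - 1},$$
which yields for $0 < a < b$
$$\int_a^b x^\alpha \, dx = \frac{x^{\alpha + 1}}{\alpha + 1} \Big|_a^b \qquad (12.23)$$
for $\alpha > 0$. Indeed we find that $\frac{d}{dx} \frac{x^{\alpha + 1}}{\alpha + 1} = x^\alpha$, i.e. $x \mapsto \frac{x^{\alpha + 1}}{\alpha + 1}$ is a primitive of $x \mapsto x^\alpha$. Without proof we note that (12.23) holds for all $\alpha \ne -1$.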
We want to return to Example 12.9:
Example 12.12. For $a < b < 0$ we have
$$\int_a^b \frac{1}{x} \, dx = \ln(-x) \Big|_a^b. \qquad (12.24)$$
Indeed for $x < 0$ we find $\frac{d}{dx} \ln(-x) = -\frac{1}{-x} = \frac{1}{x}$. We can combine (12.19) with (12.24) to get
$$\int_a^b \frac{1}{x} \, dx = \ln |x| \Big|_a^b, \quad 0 \notin [a, b]. \qquad (12.25)$$
Example 12.13. Since $\sin' = \cos$ and $\cos' = -\sin$ we have
$$\int_a^b \sin x \, dx = -\cos x \Big|_a^b \qquad (12.26)$$
and
$$\int_a^b \cos x \, dx = \sin x \Big|_a^b. \qquad (12.27)$$
Taking in (12.27) $a = 0$ and $b = \pi$ we find
$$\int_0^\pi \cos x \, dx = \sin \pi - \sin 0 = 0.$$
Hence there are functions not identical to zero whose integral over a certain interval is zero.
Example 12.14. For exp we find
$$\int_a^b e^x \, dx = e^x \Big|_a^b = e^b - e^a. \qquad (12.28)$$
Example 12.15. We find
$$\int_a^b \frac{1}{\sqrt{1 - x^2}} \, dx = \int_a^b \frac{dx}{\sqrt{1 - x^2}} = \arcsin x \Big|_a^b, \quad [a, b] \subset (-1, 1). \qquad (12.29)$$
$$\int_a^b \frac{dx}{1 + x^2} = \arctan x \Big|_a^b. \qquad (12.30)$$
$$\int_a^b \frac{dx}{\sqrt{1 + x^2}} = \ln\left( x + \sqrt{1 + x^2} \right) \Big|_a^b = \operatorname{arsinh} x \Big|_a^b. \qquad (12.31)$$
All these examples are simple to prove: we just use our knowledge about derivatives. Whenever we know two functions with $f' = g$ we can immediately write
$$\int_a^b g(t) \, dt = f(x) \Big|_a^b.$$
In the next chapter we will meet rules on how to reduce a given integral to an integral which we can evaluate. Unfortunately this is not always possible.
Before doing this, let us introduce a further traditional notation. If $g$ is a continuous function then we denote its generic primitive by
$$\int g(x) \, dx \quad \text{or} \quad \int g \, dx.$$
Thus $\int$ may have two interpretations: in the form $\int_a^b g(t) \, dt$ it helps us to define a number, hence $x \mapsto \int_a^x g(t) \, dt$ defines a unique function; in the form $\int g(x) \, dx$ it denotes the generic primitive of $g$. Older books tend to call $\int_a^b g(t) \, dt$ a definite integral and $\int g \, dt$ an indefinite integral.
Problems
1. a) Find the Riemann sum of the function $f : [1, 2] \longrightarrow \mathbb{R}$ with $f(t) = 2t^2 - t$ with respect to the partition $t_k = 1 + \frac{k}{n}$, $k = 0, 1, \dots, n$, and $\xi_k$ being the midpoint of the interval $[t_{k-1}, t_k]$, $k = 1, \dots, n$.
b) Let $a < b$ and $h : [a, b] \longrightarrow \mathbb{R}$ be the function $h(t) = \frac{1}{1 + t^2}$. For the partition $t_l = \frac{a(m^2 - l^2) + l^2 b}{m^2}$, $l = 0, 1, \dots, m$, and $\xi_l \in [t_l, t_{l+1}]$, $l = 0, \dots, m - 1$, such that $\xi_l - t_l = \frac13 (t_{l+1} - t_l)$ and $t_{l+1} - \xi_l = \frac23 (t_{l+1} - t_l)$ find the corresponding Riemann sum.
(After calculating $t_k - t_{k-1}$, $k = 1, \dots, m$, and $\xi_k$, $k = 1, \dots, m$, and forming the sum $\sum_{k=1}^{m} g(\xi_k)(t_k - t_{k-1})$, it will not be possible to simplify much in this expression.)
2. Let $g : [a, b] \longrightarrow \mathbb{R}$ be a function with the Riemann sum
$$S(g, Z_n, \xi) = \sum_{j=1}^{n} g(\xi_j)(t_j - t_{j-1}).$$
Let $a < t_k < b$ be a fixed point in $Z_n$. Prove that
$$S(g|_{[a, t_k]}, Z_n|_{[a, t_k]}, \xi|_{[a, t_k]}) + S(g|_{[t_k, b]}, Z_n|_{[t_k, b]}, \xi|_{[t_k, b]}) = S(g, Z_n, \xi).$$
Here $Z_n|_{[a, t_k]}$ is the partition $a = t_0 < t_1 < \dots < t_k$, $Z_n|_{[t_k, b]}$ is the partition $t_k < t_{k+1} < \dots < t_n = b$, and $\xi|_{[a, t_k]}$ as well as $\xi|_{[t_k, b]}$ denote the points $\xi_j$ belonging to $[a, t_k]$ and $[t_k, b]$ respectively.
3. a) By interpreting integration as the area under a curve (Definition 12.3) find
$$\int_{-2}^{1} |x| \, dx$$
by calculating the area of the triangles $ABC$ and $BDE$ in Figure 12.7 where $A = (-2, 0)$, $B = (0, 0)$, $C = (-2, 2)$, $D = (1, 0)$, $E = (1, 1)$.
[Figure 12.7: the graph of $|x|$ on $[-2, 1]$ with the triangles $ABC$ and $BDE$.]
b) The upper semicircle with radius $R$ in Figure 12.8 is the graph of the function $g : [-R, R] \longrightarrow \mathbb{R}$, $g(r) = \sqrt{R^2 - r^2}$.
[Figure 12.8: the upper semicircle of radius $R$ over $[-R, R]$ with the value $g(r)$ marked at a point $r$.]
Again using the interpretation that integration represents area, see Definition 12.3, find $\int_{-R}^{R} g(r) \, dr = \int_{-R}^{R} \sqrt{R^2 - r^2} \, dr$.
4. By calculating derivatives prove that in each of the following cases $F$ is a primitive of $f$, i.e. $F' = f$.
a) $F(x) = \ln(\cosh x)$, $f(x) = \tanh x$;
b) $F(s) = \frac{a^s}{\ln a}$, $a > 1$, $f(s) = a^s$;
c) $F(u) = \frac{e^u (\sin 5u - 5 \cos 5u)}{26}$, $f(u) = e^u \sin 5u$;
d) $F(r) = -\frac12 \cos(r^2 + 4r - 6)$, $f(r) = (r + 2) \sin(r^2 + 4r - 6)$.
13 Rules for Integration

There are essentially two sets of rules for integration. The first and easier
ones are derived from properties of the summation process. The second set
of rules is derived from our rules for taking derivatives and the fundamental
theorem.
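From Definition 12.1 it follows that we can approximate for a continuous function $g : [a, b] \longrightarrow \mathbb{R}$ the integral $\int_a^b g(t) \, dt$ by (finite) Riemann sums. Since for two continuous functions $g_1 : [a, b] \longrightarrow \mathbb{R}$ and $g_2 : [a, b] \longrightarrow \mathbb{R}$ and for real numbers $\lambda, \mu \in \mathbb{R}$ we have
$$S_r(\lambda g_1 + \mu g_2, Z_n) = \lambda S_r(g_1, Z_n) + \mu S_r(g_2, Z_n)$$
the triangle inequality yields for a given $\epsilon > 0$
$$\left| \int_a^b (\lambda g_1(t) + \mu g_2(t)) \, dt - \lambda \int_a^b g_1(t) \, dt - \mu \int_a^b g_2(t) \, dt \right|$$
$$\le \left| \int_a^b (\lambda g_1(t) + \mu g_2(t)) \, dt - S_r(\lambda g_1 + \mu g_2, Z_n) \right| + \left| \lambda \int_a^b g_1(t) \, dt - \lambda S_r(g_1, Z_n) \right| + \left| \mu \int_a^b g_2(t) \, dt - \mu S_r(g_2, Z_n) \right| < 3\epsilon$$
provided the mesh size $m(Z_n)$ is small enough. Since $\epsilon > 0$ was arbitrary, we have proved
$$\int_a^b (\lambda g_1(t) + \mu g_2(t)) \, dt = \lambda \int_a^b g_1(t) \, dt + \mu \int_a^b g_2(t) \, dt, \qquad (13.1)$$
i.e. the integral is linear. Furthermore, if $g : [a, b] \longrightarrow \mathbb{R}$ is continuous and non-negative, i.e. $g \ge 0$, then it follows that $S_r(g, Z_n) \ge 0$. Now it follows for a given $\epsilon > 0$ and for a sufficiently small $m(Z_n)$ that
$$-\epsilon + S_r(g, Z_n) \le \int_a^b g(t) \, dt$$
implying that $-\epsilon \le \int_a^b g(t) \, dt$ for all $\epsilon > 0$, i.e.
$$\int_a^b g(t) \, dt \ge 0 \quad \text{if } g(t) \ge 0 \text{ for all } t \in [a, b]. \qquad (13.2)$$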
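In all of our considerations we have so far assumed that $a < b$. We extend the integral to the case $a > b$ by defining
$$\int_a^b g(t) \, dt := -\int_b^a g(t) \, dt. \qquad (13.3)$$
For example we have
$$\int_0^{-1} x \, dx = -\int_{-1}^0 x \, dx = -\frac{x^2}{2} \Big|_{-1}^0 = -\left( \frac{0^2}{2} - \frac{(-1)^2}{2} \right) = \frac12.$$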
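A simple application of (13.1) is
Example 13.1. Let $a, b \in \mathbb{R}$ and $p(t) = \sum_{j=0}^{N} c_j t^j$, $c_j \in \mathbb{R}$, be a polynomial. Then we have
$$\int_a^b p(t) \, dt = \int_a^b \sum_{j=0}^{N} c_j t^j \, dt = \sum_{j=0}^{N} c_j \int_a^b t^j \, dt = \sum_{j=0}^{N} \frac{c_j}{j + 1} t^{j+1} \Big|_a^b$$
$$= \sum_{j=0}^{N} \frac{c_j}{j + 1} b^{j+1} - \sum_{j=0}^{N} \frac{c_j}{j + 1} a^{j+1} = \sum_{j=0}^{N} \frac{c_j}{j + 1} \left( b^{j+1} - a^{j+1} \right).$$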
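The closed formula of Example 13.1 translates directly into a short function. This is a minimal sketch; the function name `integrate_polynomial` and the convention of passing the coefficients $c_0, \dots, c_N$ as a list are assumptions made for illustration.

```python
def integrate_polynomial(coeffs, a, b):
    """Integral over [a, b] of p(t) = sum_j coeffs[j] * t**j.

    Uses sum_j c_j / (j + 1) * (b**(j+1) - a**(j+1)) from Example 13.1.
    """
    return sum(c / (j + 1) * (b ** (j + 1) - a ** (j + 1)) for j, c in enumerate(coeffs))

# p(t) = 1 + 3 t^2 on [0, 2]: the integral is 2 + 8 = 10.
value = integrate_polynomial([1.0, 0.0, 3.0], 0.0, 2.0)

# Swapping the limits changes the sign, in agreement with (13.3).
reversed_value = integrate_polynomial([1.0, 0.0, 3.0], 2.0, 0.0)
```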
Now we turn to rules following from the fundamental theorem of calculus and rules for taking derivatives. We start with
Theorem 13.2 (Integration by Parts). Let $f, g : [a, b] \longrightarrow \mathbb{R}$ be two continuously differentiable functions. Then
$$\int_a^b f(s) g'(s) \, ds = f \cdot g \Big|_a^b - \int_a^b g(s) f'(s) \, ds. \qquad (13.4)$$
Proof. From Leibniz's rule we know
$$(f g)'(s) = f'(s) g(s) + f(s) g'(s).$$
Integrating this equality we get
$$\int_a^b f(s) g'(s) \, ds = \int_a^b (f g)'(s) \, ds - \int_a^b f'(s) g(s) \, ds.$$
Since $f \cdot g$ is a primitive of $(f \cdot g)'$ the fundamental theorem implies
$$\int_a^b (f g)'(s) \, ds = f g \Big|_a^b$$
which finally yields (13.4).
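Example 13.3. Let $0 < a < b$. We want to show that
$$\int_a^b \ln x \, dx = \left( (x \ln x) - x \right) \Big|_a^b = x \left( (\ln x) - 1 \right) \Big|_a^b. \qquad (13.5)$$
For this we take $f(x) = \ln x$ and $g(x) = x$ in (13.4). Since $g'(x) = 1$ and $(\ln x)' = \frac{1}{x}$ we find
$$\int_a^b (\ln x) \cdot 1 \, dx = (\ln x) x \Big|_a^b - \int_a^b x \frac{1}{x} \, dx = (\ln x) x \Big|_a^b - \int_a^b 1 \, dx = \left( (x \ln x) - x \right) \Big|_a^b.$$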
Example 13.4. For $a < b$ and with $f(x) = x$ and $g'(x) = \sin x$, i.e. we may take $g(x) = -\cos x$, we find
$$\int_a^b x \sin x \, dx = -x \cos x \Big|_a^b - \int_a^b 1 \cdot (-\cos x) \, dx = -x \cos x \Big|_a^b + \int_a^b \cos x \, dx = \left( -x \cos x + \sin x \right) \Big|_a^b.$$
Example 13.5. For $a < b$ we find
$$\int_a^b (\cos x) e^x \, dx = \cos x \, e^x \Big|_a^b + \int_a^b (\sin x) e^x \, dx$$
$$= \cos x \, e^x \Big|_a^b + \sin x \, e^x \Big|_a^b - \int_a^b (\cos x) e^x \, dx = (\cos x + \sin x) e^x \Big|_a^b - \int_a^b (\cos x) e^x \, dx,$$
or
$$2 \int_a^b (\cos x) e^x \, dx = (\cos x + \sin x) e^x \Big|_a^b$$
implying
$$\int_a^b (\cos x) e^x \, dx = \frac{(\cos x + \sin x) e^x}{2} \Big|_a^b.$$
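The result of Example 13.5 can be compared with a numerical approximation of the integral. The sketch below uses a midpoint rule, i.e. a Riemann sum (12.8) with midpoints as intermediate points; the interval $[0, 1.5]$, the number of sub-intervals and the helper names are illustrative assumptions.

```python
import math

def primitive(x):
    # the primitive found in Example 13.5
    return (math.cos(x) + math.sin(x)) * math.exp(x) / 2

def midpoint_rule(g, a, b, n=20_000):
    """Riemann sum of g over [a, b] with n equal parts and midpoints as tags."""
    width = (b - a) / n
    return sum(g(a + (j + 0.5) * width) for j in range(n)) * width

a, b = 0.0, 1.5
numeric = midpoint_rule(lambda x: math.cos(x) * math.exp(x), a, b)
closed_form = primitive(b) - primitive(a)
```

The values `numeric` and `closed_form` agree up to a small discretisation error.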
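Sometimes integrals "longing" for an integration by parts can be handled more easily with a little trick.
Example 13.6. For $\alpha, \beta \in \mathbb{R}$ and $a < b$ we have
$$\sin \alpha x \sin \beta x = \frac12 \left( \cos(\alpha - \beta)x - \cos(\alpha + \beta)x \right),$$
compare with (10.10). Therefore we find for $\alpha \ne \beta$ and $\alpha \ne -\beta$
$$\int_a^b \sin \alpha x \sin \beta x \, dx = \frac12 \int_a^b \left( \cos(\alpha - \beta)x - \cos(\alpha + \beta)x \right) dx = \frac12 \left( \frac{\sin(\alpha - \beta)x}{\alpha - \beta} - \frac{\sin(\alpha + \beta)x}{\alpha + \beta} \right) \Big|_a^b.$$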
Our next rule for integration is derived from the chain rule.
Theorem 13.7 (Change of variables, Part 1). Let $g : [a, b] \longrightarrow \mathbb{R}$ be a continuous function and let $\varphi : [\alpha, \beta] \longrightarrow [a, b]$ be a differentiable function with continuous derivative $\varphi'$. Then
$$\int_\alpha^\beta g(\varphi(t)) \varphi'(t) \, dt = \int_{\varphi(\alpha)}^{\varphi(\beta)} g(x) \, dx. \qquad (13.6)$$
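Proof. Let $f : [a, b] \longrightarrow \mathbb{R}$ be a primitive of $g$, i.e. $f' = g$. The chain rule yields
$$(f \circ \varphi)'(t) = f'(\varphi(t)) \varphi'(t) = g(\varphi(t)) \varphi'(t).$$
Now it follows from the fundamental theorem of calculus that
$$\int_\alpha^\beta g(\varphi(t)) \varphi'(t) \, dt = \int_\alpha^\beta (f \circ \varphi)'(t) \, dt = (f \circ \varphi) \Big|_\alpha^\beta = f(\varphi(\beta)) - f(\varphi(\alpha)) = \int_{\varphi(\alpha)}^{\varphi(\beta)} g(x) \, dx.$$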
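Example 13.8. For a continuous function $g : \mathbb{R} \longrightarrow \mathbb{R}$ we find for $\alpha < \beta$ and $c \in \mathbb{R}$ that
$$\int_\alpha^\beta g(t + c) \, dt = \int_{\alpha + c}^{\beta + c} g(x) \, dx. \qquad (13.7)$$
Indeed, we just have to take $\varphi(t) = t + c$, note $\varphi'(t) = 1$, and restrict $g$ to $[\alpha + c, \beta + c]$.
Example 13.9. For a continuous function $g : \mathbb{R} \longrightarrow \mathbb{R}$ we find for $\alpha < \beta$ and $c \ne 0$ that
$$\int_\alpha^\beta g(ct) \, dt = \frac{1}{c} \int_{\alpha c}^{\beta c} g(x) \, dx. \qquad (13.8)$$
This follows from (13.6) with $\varphi(t) = ct$, $\varphi'(t) = c$ and restricting $g$ to $[\alpha c, \beta c]$ (or $[\beta c, \alpha c]$ if $c < 0$).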
Remark 13.10. In Examples 13.8 and 13.9 the function g does not have to
be defined on all of R. It would be sufficient to consider functions defined on
[α + c, β + c] and [αc, βc], respectively.
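Formulae (13.7) and (13.8) can be tested numerically, including the case $c < 0$ where the limits on the right hand side are in reversed order and convention (13.3) applies. The sketch below takes $g = \exp$; the values of $\alpha$, $\beta$, $c$, the shift $3$ and the helper name `midpoint_rule` are hypothetical choices for illustration.

```python
import math

def midpoint_rule(g, a, b, n=20_000):
    """Midpoint Riemann sum of g from a to b.

    For a > b the width is negative, so the sign convention (13.3) is built in.
    """
    width = (b - a) / n
    return sum(g(a + (j + 0.5) * width) for j in range(n)) * width

g = math.exp
alpha, beta, c = 0.0, 1.0, -2.0

# (13.8): integral of g(ct) over [alpha, beta] versus (1/c) * integral of g from alpha*c to beta*c
scale_lhs = midpoint_rule(lambda t: g(c * t), alpha, beta)
scale_rhs = midpoint_rule(g, alpha * c, beta * c) / c

# (13.7) with the shift 3: integral of g(t + 3) over [alpha, beta] versus integral of g over [alpha + 3, beta + 3]
shift_lhs = midpoint_rule(lambda t: g(t + 3.0), alpha, beta)
shift_rhs = midpoint_rule(g, alpha + 3.0, beta + 3.0)
```

In this example both sides of (13.8) are approximately $\frac{1 - e^{-2}}{2}$, and both sides of (13.7) are approximately $e^4 - e^3$.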
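Example 13.11. Let $\varphi : [a, b] \longrightarrow \mathbb{R}$ be a differentiable function with continuous derivative $\varphi'$. Assume further that $\varphi(t) \ne 0$ for all $t \in [a, b]$. Then
$$\int_a^b \frac{\varphi'(t)}{\varphi(t)} \, dt = \ln |\varphi(t)| \Big|_a^b. \qquad (13.9)$$
For this note first that $\frac{d}{dt} \ln \varphi(t) = \frac{\varphi'(t)}{\varphi(t)}$ provided $\ln \varphi(t)$ is defined, i.e. $\varphi(t) > 0$. Now we use in the change of variable formula $g(x) = \frac{1}{x}$ and it follows that
$$\int_a^b \frac{\varphi'(t)}{\varphi(t)} \, dt = \int_{\varphi(a)}^{\varphi(b)} \frac{1}{x} \, dx = \ln x \Big|_{\varphi(a)}^{\varphi(b)} = \ln \varphi(t) \Big|_a^b.$$
The case where $\varphi(t) < 0$ is treated by switching from $\varphi(t)$ to $-\varphi(t)$. As an immediate consequence of (13.9) we find
$$\int_a^b \frac{x}{1 + x^2} \, dx = \frac12 \int_a^b \frac{2x}{1 + x^2} \, dx = \frac12 \ln(1 + x^2) \Big|_a^b \qquad (13.10)$$
or
$$\int_a^b \cot t \, dt = \int_a^b \frac{\cos t}{\sin t} \, dt = \ln |\sin t| \Big|_a^b, \qquad (13.11)$$
provided $\sin t$ has no zero in $[a, b]$. Note further that
$$\int_a^b \tan t \, dt = \int_a^b \frac{\sin t}{\cos t} \, dt = -\int_a^b \frac{-\sin t}{\cos t} \, dt = -\ln |\cos t| \Big|_a^b \qquad (13.12)$$
provided $\cos t$ has no zero in $[a, b]$.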
Before we use the change of variables method in more sophisticated situations we want to discuss a slightly modified change of variables formula.
Theorem 13.12 (Change of variables, Part 2). Let $g : [a, b] \longrightarrow \mathbb{R}$ be a continuous function and let $\varphi : [\alpha, \beta] \longrightarrow \mathbb{R}$ be a strictly monotone differentiable function with continuous derivative. Suppose that $\varphi(\alpha) = a$ and $\varphi(\beta) = b$, i.e. $\varphi^{-1}(a) = \alpha$, $\varphi^{-1}(b) = \beta$. Then
$$\int_a^b g(x) \, dx = \int_\alpha^\beta g(\varphi(t)) \varphi'(t) \, dt = \int_{\varphi^{-1}(a)}^{\varphi^{-1}(b)} g(\varphi(t)) \varphi'(t) \, dt. \qquad (13.13)$$
Proof. Of course (13.13) follows from (13.6) using that $\varphi^{-1}$ exists.
Let us compare (13.6) with (13.13). In (13.6) we have to identify the function we want to integrate as a term $g(\varphi(t)) \varphi'(t)$, whereas in (13.13) we start with the integral $\int_a^b g(x) \, dx$ and modify it. But we have to pay a price: we have to find an invertible (bijective) smooth change of variable, i.e. we need to find $t = t(x) = \varphi^{-1}(x)$ to transform $\int_a^b g(x) \, dx$ to the right hand side in (13.13). Note that
$$\frac{dt}{dx} = \frac{d \varphi^{-1}}{dx}(x) = \frac{1}{\varphi'(\varphi^{-1}(x))} = \frac{1}{\varphi'(t)}.$$
The transformation of $\int_a^b g(x) \, dx$ could be done in a formal way
$$g(x) \longrightarrow g(\varphi(t)), \qquad dx \longrightarrow \varphi'(t) \, dt, \qquad a \longrightarrow \varphi^{-1}(a), \qquad b \longrightarrow \varphi^{-1}(b).$$
The second step looks a bit more demanding. In principle we can easily introduce $t = \varphi^{-1}(x)$. But now we need $\varphi'(t)$, i.e. we have to invert $\varphi^{-1}$. In certain examples this is often not needed.
Example 13.13. Consider the integral $\int_a^b (x+2)\sin(x^2+4x-6)\,dx$. We
choose $t = \varphi^{-1}(x) = x^2 + 4x - 6$, i.e. $\frac{dt}{dx} = 2x + 4 = 2(x+2)$.
Now we use
\[ \sin(x^2+4x-6) \to \sin t \]
but instead of
\[ dx \to \varphi'(t)\,dt \]
we observe that
\[ (x+2)\,dx = \frac12\,dt \]
which yields
\[ \int_a^b (x+2)\sin(x^2+4x-6)\,dx = \frac12\int_{\varphi^{-1}(a)}^{\varphi^{-1}(b)} \sin t\,dt = -\frac12\cos t\,\Big|_{\varphi^{-1}(a)}^{\varphi^{-1}(b)} = \frac12\cos(\varphi^{-1}(a)) - \frac12\cos(\varphi^{-1}(b)). \]
Note that in our example we must ensure that φ is defined on an interval
where it is invertible, since $\varphi^{-1}$ is needed. A simple calculation gives
\[ t = \varphi^{-1}(x) = x^2 + 4x - 6 = x^2 + 4x + 4 - 10 = (x+2)^2 - 10 \]
or
\[ x = \varphi(t) = -2 + \sqrt{t+10}, \quad t \ge -10 \text{ and } x \ge -2, \]
and
\[ x = \varphi(t) = -2 - \sqrt{t+10}, \quad t \ge -10 \text{ and } x \le -2. \]
Hence for b > a ≥ −2 or a < b ≤ −2 we may use our calculation. In each
case we eventually get
\[ \int_a^b (x+2)\sin(x^2+4x-6)\,dx = \frac12\cos(a^2+4a-6) - \frac12\cos(b^2+4b-6). \]
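As a plausibility check of Example 13.13, the closed form can be compared with a numerical value of the integral on one interval to the right of −2 and one to the left of −2. The helper `simpson` and the two intervals are assumptions made for this sketch only.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def integrand(x):
    return (x + 2) * math.sin(x * x + 4 * x - 6)

def closed_form(a, b):
    """(1/2) cos(a^2 + 4a - 6) - (1/2) cos(b^2 + 4b - 6)."""
    phi_inv = lambda x: x * x + 4 * x - 6
    return 0.5 * math.cos(phi_inv(a)) - 0.5 * math.cos(phi_inv(b))

# one interval with a >= -2 and one with b <= -2
right = (simpson(integrand, -1.0, 1.0), closed_form(-1.0, 1.0))
left = (simpson(integrand, -4.0, -2.5), closed_form(-4.0, -2.5))
print(right, left)
```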
We want to optimise our strategy to evaluate integrals further by using the
notation
\[ \int g(x)\,dx \tag{13.14} \]
for the primitives of g, i.e. with this notation we can write for a primitive f
of g
\[ f(x) = \int g(t)\,dt + c \tag{13.15} \]
where c is a constant. (This is not a very well defined notation, but very
useful.)
Using in (13.14) a change of variables $t = \varphi^{-1}(x)$ we find that
\[ \int g(\varphi(t))\varphi'(t)\,dt + \tilde c \tag{13.16} \]
is a primitive of $g(\varphi(t))\varphi'(t)$, and (13.15) and (13.16) differ only by a constant.
Thus instead of always transforming the limits of the integral we first work
on the level of primitives:
\[ \int g(x)\,dx = \int g(\varphi(t))\varphi'(t)\,dt. \]
To eventually find $\int_a^b g(x)\,dx$ we observe that
\[ \int_a^b g(x)\,dx = h(t)\,\Big|_{\varphi^{-1}(a)}^{\varphi^{-1}(b)} \]
where h is any primitive of $g(\varphi(t))\varphi'(t)$.
Example 13.14. A. Consider
\[ \int \frac{\cot(\ln x)}{x}\,dx. \]
Using t = ln x, i.e. $dt = \frac{1}{x}\,dx$, we find by (13.11) that
\[ \int \frac{\cot(\ln x)}{x}\,dx = \int \cot t\,dt = \ln|\sin t| + c = \ln|\sin(\ln x)| + c. \]
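B. Consider
\[ \int_{-1}^{1} \frac{dx}{\sqrt{(x+2)(3-x)}}. \]
Observe first that
\[ \int \frac{dx}{\sqrt{(x+2)(3-x)}} = \int \frac{dx}{\sqrt{6-(x^2-x)}} = \int \frac{dx}{\sqrt{\frac{25}{4}-\left(x-\frac12\right)^2}}. \]
Now take $t = x - \frac12$, i.e. dt = dx, to find
\[ \int \frac{dx}{\sqrt{(x+2)(3-x)}} = \int \frac{dt}{\sqrt{\frac{25}{4}-t^2}} = \int \frac{dt}{\frac52\sqrt{1-\left(\frac{2t}{5}\right)^2}} = \frac25\int \frac{dt}{\sqrt{1-\left(\frac{2t}{5}\right)^2}}. \]
By a further change of variables $s = \frac{2t}{5}$, i.e. $ds = \frac25\,dt$, we find
\[ \frac25\int \frac{dt}{\sqrt{1-\left(\frac{2t}{5}\right)^2}} = \int \frac{ds}{\sqrt{1-s^2}} = \arcsin s + c = \arcsin\frac{2t}{5} + c. \]
Therefore we have
\[ \int \frac{dx}{\sqrt{(x+2)(3-x)}} = \arcsin\frac{2t}{5} + c = \arcsin\frac{2x-1}{5} + c \]
and finally
\[ \int_{-1}^{1} \frac{dx}{\sqrt{(x+2)(3-x)}} = \arcsin\frac{2x-1}{5}\,\Big|_{-1}^{1} = \arcsin\frac15 - \arcsin\left(-\frac35\right) = \arcsin\frac15 + \arcsin\frac35. \]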
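C. Consider
\[ \int 2^{-x}\tanh\left(2^{1-x}\right)dx. \]
Take $t = 2^{1-x}$ which yields $dt = -(\ln 2)\,2^{1-x}\,dx$, i.e. $2^{-x}\,dx = -\frac{1}{2\ln 2}\,dt$, and
therefore
\[ \int 2^{-x}\tanh\left(2^{1-x}\right)dx = \int (\tanh t)\left(-\frac{1}{2\ln 2}\right)dt = -\frac{1}{2\ln 2}\ln\cosh t + c = -\frac{1}{2\ln 2}\ln\cosh 2^{1-x} + c \]
where we used that $(\ln\cosh t)' = \tanh t$.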
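D. Consider
\[ \int_0^{1/\sqrt2} \frac{x\arcsin x^2}{\sqrt{1-x^4}}\,dx. \]
Take $t = \arcsin x^2$ to find $dt = \frac{2x}{\sqrt{1-(x^2)^2}}\,dx$, i.e. $dt = \frac{2x\,dx}{\sqrt{1-x^4}}$, and hence
\[ \int \frac{x\arcsin x^2}{\sqrt{1-x^4}}\,dx = \frac12\int t\,dt = \frac14 t^2 + c. \]
It follows that
\[ \int_0^{1/\sqrt2} \frac{x\arcsin x^2}{\sqrt{1-x^4}}\,dx = \frac14(\arcsin x^2)^2\,\Big|_0^{1/\sqrt2} = \frac14\left(\arcsin\frac12\right)^2 - \frac14(\arcsin 0)^2 = \frac{\pi^2}{144} \]
since arcsin 0 = 0 and $\arcsin\frac12 = \frac{\pi}{6}$.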
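The two definite integrals in parts B and D of Example 13.14 can be compared with numerical approximations. This is a sketch only; the quadrature helper `simpson` is an assumption made here and is not part of the text.

```python
import math

def simpson(f, a, b, n=4000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Part B: integral of 1 / sqrt((x + 2)(3 - x)) over [-1, 1]
value_b = simpson(lambda x: 1 / math.sqrt((x + 2) * (3 - x)), -1.0, 1.0)
closed_b = math.asin(1 / 5) + math.asin(3 / 5)

# Part D: integral of x arcsin(x^2) / sqrt(1 - x^4) over [0, 1/sqrt(2)]
upper = 1 / math.sqrt(2)
value_d = simpson(lambda x: x * math.asin(x * x) / math.sqrt(1 - x**4), 0.0, upper)
closed_d = math.pi**2 / 144

print(value_b, closed_b)
print(value_d, closed_d)
```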
Important Remark. Using the change of variable formula requires experience
and routine which one only gets by doing many examples. There is no
general principle on how to find the best change of variables, but of course
there are some rules. Nowadays we can use powerful programme packages
to evaluate integrals. However, one still needs some experience to handle in-
tegrals without using such a package as it will be useful in many theoretical
considerations in many fields of mathematics.
A further method we need to learn is related to the decomposition of rational
functions into partial fractions. Let P (x) and Q(x) be two polynomials
and suppose that the degree of P (x) is less than that of Q(x). (Otherwise
use polynomial division to decompose $\frac{P(x)}{Q(x)} = g(x) + \frac{R(x)}{Q(x)}$, where g(x) is a
polynomial and R(x) is now a polynomial of degree less than Q(x).) From
algebra we know that each polynomial in R with leading coefficient equal to
1 has the unique factorisation
\[ Q(x) = (x-z_1)^{p_1}\cdots(x-z_k)^{p_k}(x^2+\alpha_1x+\beta_1)^{q_1}\cdots(x^2+\alpha_lx+\beta_l)^{q_l} \tag{13.17} \]
where the polynomials $x - z_j$, j = 1, . . . , k, and $x^2 + \alpha_jx + \beta_j$, j = 1, . . . , l,
have real coefficients and are mutually different, and $p_j, q_j \in \mathbb{N}$. It can be
shown that
\[ \frac{P(x)}{Q(x)} = \sum_{i=1}^{k}\sum_{j=1}^{p_i}\frac{a_{ij}}{(x-z_i)^j} + \sum_{i=1}^{l}\sum_{j=1}^{q_i}\frac{b_{ij}x+c_{ij}}{(x^2+\alpha_ix+\beta_i)^j} \tag{13.18} \]
holds with suitable real numbers $a_{ij}$, $b_{ij}$ and $c_{ij}$. Hence, whenever the integral
$\int_a^b \frac{P(x)}{Q(x)}\,dx$ exists we have
\[ \int_a^b \frac{P(x)}{Q(x)}\,dx = \sum_{i=1}^{k}\sum_{j=1}^{p_i}\int_a^b\frac{a_{ij}}{(x-z_i)^j}\,dx + \sum_{i=1}^{l}\sum_{j=1}^{q_i}\int_a^b\frac{b_{ij}x+c_{ij}}{(x^2+\alpha_ix+\beta_i)^j}\,dx. \tag{13.19} \]
In practice we work as in the following example:
\[ \frac{3x-2}{(4x-3)(2x+5)^3} = \frac{A}{4x-3} + \frac{B}{2x+5} + \frac{C}{(2x+5)^2} + \frac{D}{(2x+5)^3} \]
\[ = \frac{A(2x+5)^3 + B(4x-3)(2x+5)^2 + C(4x-3)(2x+5) + D(4x-3)}{(4x-3)(2x+5)^3}. \]
This leads to the equality
\[ 3x - 2 = A(2x+5)^3 + B(4x-3)(2x+5)^2 + C(4x-3)(2x+5) + D(4x-3). \]
Expanding the right hand side and comparing coefficients we end up with
four linear equations for the four unknowns A, B, C, D. Note that $\tilde Q(x) =
(4x-3)(2x+5)^3$ does not have leading coefficient 1 and it is not of type
(13.17). However
\[ \tilde Q(x) = 4\left(x-\frac34\right)2^3\left(x+\frac52\right)^3 = 32\left(x-\frac34\right)\left(x+\frac52\right)^3 = 32\,Q(x) \]
and Q(x) has leading coefficient 1 and is of type (13.17). In general we can
find $\gamma_0 \in \mathbb{R}$ such that for a polynomial $\tilde Q(x)$ we get $\tilde Q(x) = \gamma_0 Q(x)$ where
Q(x) has leading coefficient 1 and is of type (13.17).
For practical purposes switching from $\tilde Q$ to $\gamma_0 Q$ is often not needed, but in
order to get in (13.17) uniqueness up to the order of factors it is needed. An
alternative way is to use in (13.17) for a general polynomial $\tilde Q$ the representation
$\gamma_0(x-z_1)^{p_1}\cdots(x-z_k)^{p_k}(x^2+\alpha_1x+\beta_1)^{q_1}\cdots(x^2+\alpha_lx+\beta_l)^{q_l}$ where $\gamma_0$
is the leading coefficient of $\tilde Q$.
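The four linear equations for A, B, C, D can also be set up and solved by machine. The sketch below does not compare coefficients; since both sides of the identity are polynomials of degree at most 3, it is enough that they agree at four distinct points, and the points 0, 1, −1, 2 are an arbitrary choice made here. The helper names `solve` and `basis` are hypothetical, and exact rational arithmetic is used.

```python
from fractions import Fraction as F

def solve(matrix, rhs):
    """Gauss-Jordan elimination over the rationals (exact arithmetic)."""
    n = len(rhs)
    rows = [list(matrix[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        # find a row with a non-zero entry in this column and move it up
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [v - factor * w for v, w in zip(rows[r], rows[col])]
    return [rows[i][n] for i in range(n)]

def basis(x):
    """Values at x of the four polynomials multiplying A, B, C and D."""
    return [(2 * x + 5) ** 3,
            (4 * x - 3) * (2 * x + 5) ** 2,
            (4 * x - 3) * (2 * x + 5),
            4 * x - 3]

points = [F(0), F(1), F(-1), F(2)]
A, B, C, D = solve([basis(x) for x in points], [3 * x - 2 for x in points])
print(A, B, C, D)
```

Setting x = 3/4 and x = −5/2 in the identity gives A and D directly, which provides an independent check of the computed values.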
Here is a simpler example:
Example 13.15. A. Find
\[ \int \frac{6-x}{(x-3)(2x+5)}\,dx. \]
Write
\[ \frac{6-x}{(x-3)(2x+5)} = \frac{A}{x-3} + \frac{B}{2x+5} = \frac{A(2x+5)+B(x-3)}{(x-3)(2x+5)} \]
implying
\[ 6 - x = 5A - 3B + x(2A + B) \]
or
\[ 5A - 3B = 6 \quad\text{and}\quad 2A + B = -1 \]
which yields $A = \frac{3}{11}$ and $B = -\frac{17}{11}$.
Hence
\[ \frac{6-x}{(x-3)(2x+5)} = \frac{\frac{3}{11}}{x-3} - \frac{\frac{17}{11}}{2x+5}, \]
and therefore
\[ \int \frac{6-x}{(x-3)(2x+5)}\,dx = \frac{3}{11}\int\frac{1}{x-3}\,dx - \frac{17}{2\cdot 11}\int\frac{2}{2x+5}\,dx = \frac{3}{11}\ln|x-3| - \frac{17}{22}\ln|2x+5| + c. \]
B. Let −1, 1 ∉ [a, b] and consider $\int_a^b \frac{dx}{1-x^2}$. We try
\[ \frac{1}{1-x^2} = \frac{1}{(1-x)(1+x)} = \frac{A}{1-x} + \frac{B}{1+x} = \frac{(A+B)+(A-B)x}{1-x^2} \]
which leads to A + B = 1 and A − B = 0, i.e. $A = B = \frac12$.
This implies
\[ \int_a^b \frac{1}{1-x^2}\,dx = \frac12\int_a^b\frac{1}{1-x}\,dx + \frac12\int_a^b\frac{1}{1+x}\,dx = \frac12\left(\int_a^b\frac{1}{1+x}\,dx - \int_a^b\frac{1}{x-1}\,dx\right) \]
\[ = \frac12\left(\ln|x+1| - \ln|x-1|\right)\Big|_a^b = \frac12\ln\left|\frac{x+1}{x-1}\right|\,\Big|_a^b. \]
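Both parts of Example 13.15 are easy to double-check. In this sketch the 2 × 2 system of part A is solved by Cramer's rule in exact arithmetic, the primitive of part A is differentiated numerically at x = 0, and part B is evaluated on [2, 3]; the test point, the interval and the helper names are choices made here for illustration.

```python
import math
from fractions import Fraction as F

# Part A: 5A - 3B = 6 and 2A + B = -1, solved by Cramer's rule
det = F(5 * 1 - (-3) * 2)
A = F(6 * 1 - (-3) * (-1)) / det
B = F(5 * (-1) - 6 * 2) / det

def primitive(x):
    """(3/11) ln|x - 3| - (17/22) ln|2x + 5|."""
    return 3 / 11 * math.log(abs(x - 3)) - 17 / 22 * math.log(abs(2 * x + 5))

def integrand(x):
    return (6 - x) / ((x - 3) * (2 * x + 5))

h = 1e-5
derivative_at_0 = (primitive(h) - primitive(-h)) / (2 * h)  # central difference

# Part B on [2, 3], which contains neither -1 nor 1
def log_form(x):
    return 0.5 * math.log(abs((x + 1) / (x - 1)))

n = 2000
step = 1.0 / n
midpoint_sum = sum(1 / (1 - (2 + (i + 0.5) * step) ** 2) for i in range(n)) * step

print(A, B, derivative_at_0, integrand(0.0))
print(midpoint_sum, log_form(3.0) - log_form(2.0))
```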
Problems
1. Find
\[ \int_0^1 \sum_{k=1}^{n} (1+k^2)\,x^{\frac{1}{k^2}}\,dx. \]
2. a) For f : [a, b] −→ R integrable prove that
\[ \int_a^b |f(x)|\,dx = \int_a^b f^+(x)\,dx + \int_a^b f^-(x)\,dx. \]
b) Prove that if f : [a, b] −→ R is integrable and satisfies |f(t)| ≤ M
for all t ∈ [a, b] then
\[ \left|\int_a^b f(t)\,dt\right| \le M(b-a). \]
c) Let f : [−1, 0] −→ R be a differentiable function such that
f(−1) = 0 and f′(x) ≥ 0 for all x ∈ [−1, 0]. Show that $\int_{-1}^{0} f(x)\,dx \ge 0$.
3. By only using symmetry considerations prove that

1
1
1+ 2
sin x3 dx = 0.
−1 1 + x
4. Denote the Dirichlet kernel discussed in Problem 3 of Chapter 9 by $D_n$.
Use the results of that problem to show that
\[ \frac{2}{\pi}\int_0^{\frac{\pi}{2}} D_n(t)\,dt = 1 \]
for all n ∈ N.
5. For a continuous function f : R −→ R use a straightforward change of
variable to find the integrals
\[ \int_a^b f(\alpha t)\,dt \quad\text{and}\quad \int_a^b f(\alpha t+\beta)\,dt, \quad \alpha \ne 0,\ \beta \in \mathbb{R}, \]
in terms of the integral $\int f(t)\,dt$, a < b.
6. Use integration by parts and where appropriate the results of Problem
5 to evaluate the following integrals:
a) $\int_0^{\frac{\pi}{4}} \vartheta\cos\vartheta\,d\vartheta$;
b) $\int_{\frac12}^{2} x\ln(2x+1)\,dx$;
c) $\int_0^{\frac{1}{m}} s\sinh(ms)\,ds$;
d) $\int_1^3 \frac{\ln t}{\sqrt t}\,dt$;
e) $\int_0^{\pi} e^{2r}\sin 3r\,dr$.
7. For $m, n \in \mathbb{N}_0$ prove
\[ \frac{1}{\pi}\int_{-\pi}^{\pi} (\cos nx)(\cos mx)\,dx = \begin{cases} 0, & n \ne m \\ 1, & n = m > 0 \\ 2, & n = m = 0 \end{cases} \]
and
\[ \frac{1}{\pi}\int_{-\pi}^{\pi} (\sin nx)(\cos mx)\,dx = 0. \]
8. Find the following primitives:
a) $\int x^2 e^{\lambda x}\,dx$;
b) $\int \frac{dt}{at^2+bt+c}$, a, b, c ∈ R,
note that different cases must be considered for different a, b, c.
9. Let g : R −→ R be a continuous and periodic function with period
a > 0, i.e. g(t + a) = g(t) for all t ∈ R. For all c ∈ R show that
\[ \int_0^a g(t)\,dt = \int_c^{c+a} g(t)\,dt. \]
10. Use a change of variable to evaluate the following integrals:
a) $\int_e^{e^2} \frac{dx}{x(\ln x)^3}$;
b) $\int_{\frac{\pi}{3}}^{\frac{\pi}{2}} \frac{dt}{5+3\cos t}$ (try: $\tan\frac{t}{2} = s$);
c) $\int_0^{\frac{1}{\sqrt2}} \frac{y\arcsin y^2}{\sqrt{1-y^4}}\,dy$ (try: $\arcsin y^2 = \nu$);
d) $\int_{\frac12}^{1} \frac{ds}{\sqrt{5-4s-s^2}}$;
e) $\int_1^4 \frac{1}{(1+x^2)^{\frac32}}\,dx$ (try: x = sinh t).
11. Evaluate the following integrals. Note that a change of variables and
integration by parts may need to be used.
a) $\int_0^4 3^{\sqrt{2t+1}}\,dt$;
b) $I := \int_0^{\pi} \frac{x\sin x}{1+\cos^2 x}\,dx$.
Hint: derive the equality $I = \frac{\pi^2}{2} - I$.
12. Use partial fractions to find
\[ \int \frac{x+1}{x^4-x}\,dx. \]
(The result of Problem 8 b) may eventually become useful.)
13. For f, g : [a, b] −→ R being three times continuously differentiable
prove
\[ \int_a^b f(t)g^{(3)}(t)\,dt = fg''\,\big|_a^b - f'g'\,\big|_a^b + f''g\,\big|_a^b - \int_a^b f^{(3)}(t)g(t)\,dt. \]
14. Prove that for g continuously differentiable and g(s) > 0 we have
\[ \int \frac{g'(s)}{\sqrt{g(s)}}\,ds = 2\sqrt{g(s)}. \]
Now find
\[ \int_{\frac{\pi}{6}}^{\frac{\pi}{2}} \frac{\cos r}{\sqrt{\sin r}}\,dr. \]
15. Let f : [−π, π] −→ R be a continuously differentiable function such
that |f′(t)| ≤ M for all t ∈ [−π, π]. Prove that
\[ \left|\int_{-\pi}^{\pi} f(t)\cos nt\,dt\right| \le \frac{2\pi M}{n}. \]
16. For n ≥ 2, n ∈ N, find
\[ \lim_{x\to\infty}\int_1^x t^{-n}\,dt. \]
Part 2
Analysis in One
Dimension
14 Problems with the Real Line

In Part 1 we omitted several proofs; some were omitted because they are ob-
vious, whereas others were omitted because they depend on tools or results
proved in an algebra course and these were therefore perhaps not yet known.
However most of the proofs we omitted claim the existence of a real number
with certain properties and we could not prove this in Part 1.
We have identified the real numbers with the real line and sometimes we
switched from algebraic to geometric arguments, but this is in fact a non-
trivial problem. In this chapter we want to analyse this problem in more
detail.
Let us summarise, i.e. recollect from Part 1, the basic algebraic properties
of the real numbers. On R we have two operations, addition and multipli-
cation
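\[ + : \mathbb{R}\times\mathbb{R} \to \mathbb{R},\ (x, y) \mapsto x + y, \qquad \cdot : \mathbb{R}\times\mathbb{R} \to \mathbb{R},\ (x, y) \mapsto x\cdot y. \]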
The rules for addition are for x, y, z ∈ R
(x + y) + z = x + (y + z); (14.1)
x + 0 = x; (14.2)
x + (−x) = 0; (14.3)
x + y = y + x; (14.4)
where (14.2) means that in R there exists an element 0 such that x+0 = x for
all x ∈ R, i.e. 0 is a neutral element with respect to addition. Further we
interpret (14.3) as follows: for every x ∈ R there exists an inverse element
−x with respect to addition.
For multiplication we have with x, y, z ∈ R the rules
(x · y) · z = x · (y · z); (14.5)
1 · x = x; (14.6)
$x \cdot x^{-1} = 1$ for x ≠ 0; (14.7)
x · y = y · x; (14.8)
here (14.6) means that there exists 1 ∈ R, 1 ≠ 0, such that 1 is a neutral
element with respect to multiplication and (14.7) means that each x ∈ R,
x ≠ 0, has an inverse element with respect to multiplication. These two
operations are linked by the law of distribution
x · (y + z) = x · y + x · z. (14.9)
It turns out that there are many sets K with operations + and · satisfying
(14.4)-(14.9), we call each such algebraic object (K, +, ·) a (commutative)
field. It can be easily checked that all rational numbers form a field, as do
the complex numbers C. In algebra a lot of consequences of these axioms
can be learned. These consequences justify our usual calculations in Q, R,
or C. Here we take these consequences for granted.
For R and Q we also have axioms of order:
for every x ∈ R (∈ Q) one and only one

of the statements x = 0, x > 0, x < 0 holds; (14.10)
x > 0 and y > 0 implies x + y > 0; (14.11)

x > 0 and y > 0 implies x · y > 0. (14.12)
All further properties of the order structure on R (or Q) can be deduced

from (14.10)-(14.12). Recall that we write x < y if x − y < 0 and x > y if
x − y > 0, compare with (1.70) and (1.71). Let us prove some consequences
of (14.10)-(14.12) to get some flavour of the arguments involved.
We claim that x > y and y > z implies x > z. From each inequality we
deduce that x − y > 0 and y − z > 0 respectively, hence
(x − y) + (y − z) = x − z > 0 or x > z.
Next we show x > y and a > 0 implies ax > ay. Since x − y > 0 and a > 0
it follows that a(x − y) > 0 or ax − ay > 0, i.e. ax > ay.
Of course we will continue to use the notation x ≤ y and x ≥ y as defined in
Part 1, Chapter 1.
A (commutative) field (K, +, ·) on which (14.10)-(14.12) hold is called an
ordered field. Both R and Q are ordered fields, but C is not. This follows
for example from the fact that $i^2 = -1$. Indeed, (14.12) implies for x > 0
that $x^2 > 0$, for x = 0 we have $x^2 = 0$, and for x < 0 it follows that −x > 0,
and therefore $x^2 = (-x)(-x) > 0$. Hence (14.4)-(14.12) imply $x^2 \ge 0$ which
does not hold in C. Thus we have a distinction between C and R (or Q).
So far we cannot make any distinction between R and Q. In fact, we do not
even know what R should be. But in Q we have a problem, in fact we have
several quite similar problems:
Claim: in Q there is no element a such that $a^2 = 2$.
Suppose a ∈ Q has this property where we may assume that a > 0. Then
$a = \frac{q}{p}$, p ≠ 0, $p, q \in \mathbb{N}_0$, and q and p have no common factor. From $a^2 = 2$ we
deduce that $q^2/p^2 = 2$ or $q^2 = 2p^2$. This implies that $q^2$ is an even number,
hence q is an even number, say q = 2r. Now it follows that $4r^2 = 2p^2$ or
$p^2 = 2r^2$, i.e. $p^2$ and hence p is an even number too, which is a contradiction,
therefore the above claim is true.
Now let us turn to our geometric interpretation of R as the points on the
(real) line. Consider the unit square in the plane
[Figure 14.1: the unit square with vertices A = (0, 0), B = (1, 0), C = (1, 1), D = (0, 1); the point (l, 0) is marked on the horizontal axis.]
We know that by Pythagoras’ theorem the length l of AC is given by $l^2 =
1^2 + 1^2 = 2$, i.e. $l^2 = 2$. Hence this length l is not given by a rational number.
Certainly we can consider all rational numbers as points on a line. In doing
so, the above consideration shows that on the line containing only rational
numbers (points) there are gaps.
On the other hand, given two rational numbers q, p ∈ Q, q < p, there are
infinitely many rational numbers r ∈ Q such that q < r < p. Indeed,
take $r_1 = \frac{q+p}{2}$ and then take instead of p the number $r_1$, now continue this
procedure. Doing this N times we find $r_N = q + \frac{p-q}{2^N}$. Thus given ε > 0 we
can find a rational number $r_N$ such that $|q - r_N| < \varepsilon$. Just take p = q + 1
and N such that $2^{-N} < \varepsilon$.
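The halving procedure can be carried out in exact rational arithmetic. The following sketch uses Python's `fractions.Fraction`; the function name `halving` and the starting value q = 1/3 are chosen here only for illustration.

```python
from fractions import Fraction

def halving(q, p, steps):
    """Return r_1, ..., r_steps with r_1 = (q + p)/2 and r_{j+1} = (q + r_j)/2."""
    out, right = [], p
    for _ in range(steps):
        right = (q + right) / 2  # midpoint of q and the previous right end point
        out.append(right)
    return out

q = Fraction(1, 3)
p = q + 1
rs = halving(q, p, 10)
print(rs[-1], rs[-1] - q)
```

After ten steps the distance to q is $2^{-10}$, in agreement with $r_N = q + \frac{p-q}{2^N}$.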
So we face the following strange situation: not every point on the “line”
corresponds to a rational number but we can put into the gap between two
rational numbers as many rational numbers as we like.
The number l, $l^2 = 2$, lies between two rational numbers. We can argue as
follows: the square of the length d of the side AF of the triangle AEF with
A = (0, 0), E = (4, 0) and F = (4, 3) is equal to $4^2 + 3^2 = 5^2$, i.e. d = 5, but
l < d:
[Figure 14.2: the unit square ABCD together with the triangle AEF, A = (0, 0), E = (4, 0), F = (4, 3); the point (l, 0) lies on the horizontal axis between B and E.]
Thus we have 0 < l < 5 implying that we can get as close as we wish to l
in terms of rational numbers. A word of caution: once again we have mixed
geometric arguments with algebraic ones.
We need to resolve these problems, but before we can do this we need more
knowledge about properties of the real numbers (if they exist). We continue
for a while to pretend as if we already have the real numbers at our disposal
and try to deduce new tools so that we are eventually in a position to establish
the existence of the real numbers.
First let us add a further axiom.
Archimedes’ Axiom. Given x, y ∈ R, x > 0 and y > 0, there exists a
natural number n ∈ N such that nx > y.
Note that Archimedes’ axiom links the order structure of the real numbers
with properties of the natural numbers.
Consequences of Archimedes’ Axiom
1. Given x ∈ R, x > 0, there exists n ∈ N such that n > x.
2. Given x ∈ R, there exists a unique k ∈ Z such that k ≤ x < k + 1.
3. For every ε > 0 there exists n ∈ N such that $\frac{1}{n} < \varepsilon$. Indeed: there exists
n ∈ N such that $n > \frac{1}{\varepsilon}$, implying $\frac{1}{n} < \varepsilon$.
As before, compare with Example 4.4.B, we denote the unique number k ∈ Z
in 2. as [x].
We now extend Bernoulli’s inequality (Lemma 9.11.A). Let x ∈ R, x ≥
−1. Then for all $n \in \mathbb{N}_0$
\[ (1+x)^n \ge 1 + nx. \tag{14.13} \]
Proof. For n = 0 we have
\[ (1+x)^0 = 1 \ge 1 + 0\cdot x. \]
Suppose that $(1+x)^k \ge 1 + kx$. It follows that
\[ (1+x)^{k+1} \ge (1+kx)(1+x) = 1 + (k+1)x + kx^2 \ge 1 + (k+1)x \]
provided 1 + x ≥ 0, i.e. x ≥ −1. Now (14.13) follows from the principle of
mathematical induction.
In order to get more used to inequalities let us derive some consequences of
Bernoulli’s inequality.
Definition 14.1. For n ∈ N let $a_1, \dots, a_n$ be positive real numbers. Their
arithmetic mean is defined by
\[ A_n := \frac{a_1+\dots+a_n}{n} = \frac{1}{n}\sum_{k=1}^{n} a_k, \tag{14.14} \]
and their geometric mean is given by
\[ G_n := (a_1\cdot\ldots\cdot a_n)^{\frac1n} = \left(\prod_{j=1}^{n} a_j\right)^{\frac1n}. \tag{14.15} \]
Lemma 14.2. For positive numbers $a_1, \dots, a_n \in \mathbb{R}$, n ∈ N, the arithmetic-geometric
mean inequality holds:
\[ \left(\prod_{j=1}^{n} a_j\right)^{\frac1n} \le \frac{1}{n}\sum_{k=1}^{n} a_k, \tag{14.16} \]
or $G_n \le A_n$.
Proof. The case n = 1 is trivial. Let n ≥ 2 and with y = x + 1, x > −1,
Bernoulli’s inequality reads as
\[ y^n \ge 1 + n(y-1). \tag{14.17} \]
With $y = \frac{A_n}{A_{n-1}} > 0$, n ≥ 2, we deduce from (14.17) that
\[ \left(\frac{A_n}{A_{n-1}}\right)^n \ge 1 + n\left(\frac{A_n}{A_{n-1}} - 1\right) = \frac{A_{n-1} + nA_n - nA_{n-1}}{A_{n-1}} = \frac{nA_n - (n-1)A_{n-1}}{A_{n-1}} = \frac{a_n}{A_{n-1}} \]
implying
\[ A_n^n \ge a_nA_{n-1}^{n-1} \ge a_na_{n-1}A_{n-2}^{n-2} \ge \dots \ge a_na_{n-1}\cdot\ldots\cdot a_1 = G_n^n, \]
or $G_n \le A_n$.
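The key step of the proof, $A_n^n \ge a_nA_{n-1}^{n-1}$, and the resulting inequality can be tested in exact arithmetic for a list of positive integers. This is a sketch; the function names and the sample list are assumptions made here.

```python
from fractions import Fraction
from math import prod

def arithmetic_mean(values):
    """A_n as an exact fraction (values are positive integers)."""
    return Fraction(sum(values), len(values))

def chain_holds(values):
    """Check A_n^n >= a_n * A_{n-1}^{n-1} for n = 2, ..., len(values)."""
    for n in range(2, len(values) + 1):
        a_n = values[n - 1]
        left = arithmetic_mean(values[:n]) ** n
        right = a_n * arithmetic_mean(values[:n - 1]) ** (n - 1)
        if left < right:
            return False
    return True

def am_gm_holds(values):
    """G_n <= A_n, tested in the equivalent form a_1 * ... * a_n <= A_n^n."""
    return prod(values) <= arithmetic_mean(values) ** len(values)

sample = [3, 1, 4, 1, 5, 9, 2, 6]
print(chain_holds(sample), am_gm_holds(sample))
```

Testing $G_n^n \le A_n^n$ instead of $G_n \le A_n$ avoids n-th roots, so no rounding is involved.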
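Corollary 14.3. For $a_1, \dots, a_n, b_1, \dots, b_n \in \mathbb{R}$, n ∈ N, the Cauchy-Schwarz
inequality holds:
\[ \left|\sum_{k=1}^{n} a_kb_k\right| \le \sum_{k=1}^{n} |a_kb_k| \le \left(\sum_{k=1}^{n} a_k^2\right)^{\frac12}\left(\sum_{k=1}^{n} b_k^2\right)^{\frac12}. \tag{14.18} \]
Proof. The first estimate is just the triangle inequality. For $c_1, c_2 \in \mathbb{R}$ we
know that
\[ |c_1c_2| \le \frac{c_1^2 + c_2^2}{2}. \]
This implies that for every j = 1, . . . , n
\[ |a_jb_j|\left(\sum_{k=1}^{n} a_k^2\right)^{\frac12}\left(\sum_{k=1}^{n} b_k^2\right)^{\frac12} = \left(b_j^2\sum_{k=1}^{n} a_k^2\right)^{\frac12}\left(a_j^2\sum_{k=1}^{n} b_k^2\right)^{\frac12} \le \frac{b_j^2\sum_{k=1}^{n} a_k^2 + a_j^2\sum_{k=1}^{n} b_k^2}{2}, \]
and summing from j = 1 to j = n gives
\[ \left(\sum_{j=1}^{n} |a_jb_j|\right)\left(\sum_{k=1}^{n} a_k^2\right)^{\frac12}\left(\sum_{k=1}^{n} b_k^2\right)^{\frac12} \le \frac{\sum_{j=1}^{n} b_j^2\sum_{k=1}^{n} a_k^2 + \sum_{j=1}^{n} a_j^2\sum_{k=1}^{n} b_k^2}{2} = \left(\sum_{j=1}^{n} a_j^2\right)\left(\sum_{j=1}^{n} b_j^2\right) \]
which implies (the case where one of the two sums of squares vanishes being trivial)
\[ \sum_{j=1}^{n} |a_jb_j| \le \left(\sum_{j=1}^{n} a_j^2\right)^{\frac12}\left(\sum_{j=1}^{n} b_j^2\right)^{\frac12}. \]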
Remark 14.4. A. The Cauchy-Schwarz inequality is often called the
Cauchy-Schwarz-Bunyakovsky inequality.
B. The proof of Lemma 14.2 is taken from L. Maligranda [9] and that of
Corollary 14.3 is taken from M. Lin [8].
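Lemma 14.5 (Minkowski’s inequality). For real numbers $a_1, \dots, a_n, b_1, \dots, b_n$
we have
\[ \left(\sum_{k=1}^{n} (a_k+b_k)^2\right)^{\frac12} \le \left(\sum_{k=1}^{n} a_k^2\right)^{\frac12} + \left(\sum_{k=1}^{n} b_k^2\right)^{\frac12}. \tag{14.19} \]
Proof. If $\sum_{k=1}^{n} (a_k+b_k)^2 = 0$ the statement is trivial.
In the case $\sum_{k=1}^{n} (a_k+b_k)^2 > 0$ we find
\[ \sum_{k=1}^{n} (a_k+b_k)^2 = \sum_{k=1}^{n} (a_k+b_k)a_k + \sum_{k=1}^{n} (a_k+b_k)b_k \le \sum_{k=1}^{n} |a_k+b_k||a_k| + \sum_{k=1}^{n} |a_k+b_k||b_k| \]
\[ \le \left(\sum_{k=1}^{n} (a_k+b_k)^2\right)^{\frac12}\left(\sum_{k=1}^{n} a_k^2\right)^{\frac12} + \left(\sum_{k=1}^{n} (a_k+b_k)^2\right)^{\frac12}\left(\sum_{k=1}^{n} b_k^2\right)^{\frac12} \]
\[ = \left(\sum_{k=1}^{n} (a_k+b_k)^2\right)^{\frac12}\left(\left(\sum_{k=1}^{n} a_k^2\right)^{\frac12} + \left(\sum_{k=1}^{n} b_k^2\right)^{\frac12}\right) \]
implying (14.19).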
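A numerical experiment cannot replace the proofs, but it is a quick safeguard against misreading (14.18) and (14.19). The sketch below draws random vectors and records how much room each inequality leaves; the names `norm`, `cauchy_schwarz_gap` and `minkowski_gap`, the sample size and the seed are assumptions made here.

```python
import math
import random

def norm(v):
    """Square root of the sum of squares of the entries of v."""
    return math.sqrt(sum(x * x for x in v))

def cauchy_schwarz_gap(a, b):
    """Right-hand side minus middle term of (14.18); non-negative if (14.18) holds."""
    return norm(a) * norm(b) - sum(abs(x * y) for x, y in zip(a, b))

def minkowski_gap(a, b):
    """Right-hand side minus left-hand side of (14.19)."""
    return norm(a) + norm(b) - norm([x + y for x, y in zip(a, b)])

random.seed(0)
gaps = []
for _ in range(1000):
    n = random.randint(1, 8)
    a = [random.uniform(-10, 10) for _ in range(n)]
    b = [random.uniform(-10, 10) for _ in range(n)]
    gaps.append((cauchy_schwarz_gap(a, b), minkowski_gap(a, b)))

print(min(g[0] for g in gaps), min(g[1] for g in gaps))
```

Because floating point arithmetic is used, a gap may come out as a tiny negative number where equality holds (for instance when n = 1), so any comparison should allow a small tolerance.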
Corollary 14.6. For real numbers $a_1, \dots, a_n, b_1, \dots, b_n, c_1, \dots, c_n$ we have
\[ \left(\sum_{k=1}^{n} (a_k-b_k)^2\right)^{\frac12} \le \left(\sum_{k=1}^{n} (a_k-c_k)^2\right)^{\frac12} + \left(\sum_{k=1}^{n} (c_k-b_k)^2\right)^{\frac12}. \tag{14.20} \]
Proof. We only need to take $a_k - c_k$ for $a_k$ and $c_k - b_k$ for $b_k$ in Minkowski’s
inequality.
We will use the next result quite often, it is a result which depends on
Archimedes’ axiom.
Lemma 14.7. Let a > 1 be a real number. For every R ∈ R, R > 0, there
exists $n_0 \in \mathbb{N}$ such that
\[ a^{n_0} > R. \tag{14.21} \]
Proof. If we take x = a − 1 > 0 in Bernoulli’s inequality we find
\[ a^n = (1+x)^n \ge 1 + nx. \]
Let R > 1 then by Archimedes’ axiom we can find an $n_0 \in \mathbb{N}$ such that
$n_0x > R - 1$, i.e.
\[ a^{n_0} \ge 1 + n_0x > R. \]
For R ∈ (0, 1] the statement is trivial.
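Corollary 14.8. For 0 < a < 1 and ε > 0 there exists $n_0 \in \mathbb{N}$ such that
$a^{n_0} < \varepsilon$.
Proof. We know that $\frac{1}{a} > 1$ and by Lemma 14.7 there exists $n_0 \in \mathbb{N}$ such
that
\[ \left(\frac{1}{a}\right)^{n_0} > \frac{1}{\varepsilon}, \quad\text{i.e.}\quad a^{n_0} < \varepsilon. \]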
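The proofs of Lemma 14.7 and Corollary 14.8 are constructive: Archimedes' axiom supplies an explicit $n_0$. A minimal sketch in exact arithmetic follows; the function names are hypothetical, and the $n_0$ returned is the one from the Bernoulli estimate, which is in general far from the smallest possible exponent.

```python
import math
from fractions import Fraction

def power_exceeding(a, R):
    """For a > 1 and R > 0 return n0 with a**n0 > R, chosen as in the proof of Lemma 14.7."""
    if R <= 1:
        return 1  # a**1 = a > 1 >= R
    x = a - 1
    # smallest natural number n0 with n0 * x > R - 1
    return math.floor((R - 1) / x) + 1

def power_below(a, eps):
    """For 0 < a < 1 and eps > 0 return n0 with a**n0 < eps, as in Corollary 14.8."""
    return power_exceeding(1 / a, 1 / eps)

n_up = power_exceeding(Fraction(11, 10), Fraction(100))
n_down = power_below(Fraction(9, 10), Fraction(1, 1000))
print(n_up, n_down)
```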
Problems
1. Given x, y ∈ R, x < y. Prove the existence of z ∈ R such that x < z <
y.
2. Using the axioms of an ordered field prove that for x, y, z ∈ R:
a) x < 0 implies −x > 0;
b) x2 > 0 for all x ∈ R;
c) a < 0 and x < y implies ax > ay.
3. Prove that Archimedes’ axiom holds in Q.
4. Show that there is no element a ∈ Q such that $a^2 = 3$.
5. Using Bernoulli’s inequality prove
\[ 2n^n \le (n+1)^n \quad\text{for } n \ge 1, \]
then, by induction show that
\[ n! \le 2\left(\frac{n}{2}\right)^n. \]
6. Use mathematical induction to prove for $x_k \ge 0$, k ∈ N, and n ∈ N
that
\[ \prod_{k=1}^{n} (1+x_k) \ge 1 + \sum_{k=1}^{n} x_k. \]
7.* Prove that the arithmetic-geometric mean inequality implies Bernoulli’s

inequality and therefore by the proof of Lemma 14.2 it is in fact equiv-
alent to the Bernoulli inequality.
Hint: first prove the cases n = 1 and n ≥ 2 with 0 < x < 1 − n1 .
Then apply the arithmetic-geometric mean inequality to the n num-
bers 1 + n(1 + x), 1, . . . , 1.
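8. For x ∈ R and n, m ∈ N prove that if −x < n < m then
\[ \left(1+\frac{x}{n}\right)^n \le \left(1+\frac{x}{m}\right)^m. \]
9. For $a_k \in \mathbb{R}$, k = 1, . . . , n, prove by using the Cauchy-Schwarz inequality
that
\[ \left|\sum_{k=1}^{n} a_k\right| \le \sqrt{n}\left(\sum_{k=1}^{n} a_k^2\right)^{\frac12}. \]
Now prove
\[ \frac{1}{\sqrt n}\sum_{k=1}^{n} |a_k| \le \left(\sum_{k=1}^{n} a_k^2\right)^{\frac12} \le \sqrt{n}\,\max(|a_1|, \dots, |a_n|). \]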
15 Sequences and their Limits

By definition a sequence of real numbers is a mapping from N to R, i.e.
each n ∈ N is mapped on some an ∈ R. Usually we write (an )n∈N for a
sequence, but also sometimes (a1 , a2 , a3 , . . . ). It is appropriate to consider
a little generalisation, namely to consider a mapping from {n ∈ Z|n ≥ k},
k ∈ Z, to R and we denote the corresponding sequence by (an )n≥k .
Example 15.1. A. Let $a_n = a$, a ∈ R fixed, for all n ∈ N, then we obtain
the constant sequence (a, a, a, . . . ).
B. Put $a_n = \frac{1}{n}$, n ∈ N, this gives the sequence $\left(\frac{1}{n}\right)_{n\in\mathbb{N}}$ or $(1, \frac12, \frac13, \frac14, \dots)$.
C. The sequence (−1, 1, −1, 1, . . . ) could be written as $((-1)^n)_{n\in\mathbb{N}}$. More
generally if $(a_n)_{n\in\mathbb{N}}$, $a_n \ge 0$, is a sequence of non-negative numbers we may
consider the sequence $((-1)^na_n)_{n\in\mathbb{N}}$ which has an alternating sign.
D. Take $a_n = \frac{n}{n+1}$ for $n \in \mathbb{N}_0$. This leads to the sequence $(0, \frac12, \frac23, \frac34, \frac45, \dots)$.
E. Let a ∈ R, a ≠ 0. The sequence $(a^n)_{n\in\mathbb{N}_0}$ is called a geometric
sequence.
Note that we need to know all terms $a_n$ of the sequence $(a_n)_{n\ge k}$; knowing a
finite number is not sufficient. In particular, there is no way to find $a_{n+1}$ by
only knowing $a_1, \dots, a_n$. For example
\[ 1, \frac12, \frac13, \frac14, \frac15, \frac16, \dots \]
does not give us a sequence, by no means can we deduce that the next term
is $\frac17$. The next term could be any number. For this reason any question in
which a finite sequence of real numbers is given and the reader is then asked
to find the next number is not valid.
Example 15.2. The Fibonacci numbers are the sequence defined by $a_0 =
1$, $a_1 = 1$, and $a_n = a_{n-1} + a_{n-2}$ for n ≥ 2. This sequence is defined by a
recursion formula. The first Fibonacci numbers are 1, 1, 2, 3, 5, 8, 13, 21, . . . .
The Fibonacci numbers form an example of a recursively defined sequence.
Consider the sequence defined by
\[ a_{k+1} = \lambda a_k, \quad k \in \mathbb{N}_0, \quad a_0 = 1. \tag{15.1} \]
Thus $\frac{a_{k+1}}{a_k} = \lambda$ and the right hand side is independent of k. The geometric
sequence $(q^k)_{k\in\mathbb{N}_0}$, q ∈ R \ {0}, has the property that $\frac{q^{k+1}}{q^k} = q$. Now $\frac{q^1}{q^0} = q =
\lambda$ implies that $a_k = \lambda^k$, $k \in \mathbb{N}_0$. Next we want to see whether $(q^k)_{k\in\mathbb{N}_0}$ can
lead to explicit expressions for a more general recursively defined sequence,
for example the Fibonacci numbers:
\[ a_{k+2} = a_{k+1} + a_k, \quad k \ge 0, \quad a_0 = a_1 = 1. \tag{15.2} \]
Taking $a_k = q^k$ in (15.2) we arrive at
\[ q^{k+2} = q^{k+1} + q^k, \quad k \ge 0, \]
which we may write as
\[ q^k(q^2 - q - 1) = 0. \tag{15.3} \]
Since $q^k \ne 0$ we need to find solutions to the quadratic equation $q^2 - q - 1 = 0$
which are $\alpha = \frac12 + \frac12\sqrt5$ and $\beta = \frac12 - \frac12\sqrt5$. Now $a_k := A\alpha^k + B\beta^k$, A, B ∈ R,
satisfies for k ≥ 0
\[ 0 = A\alpha^k(\alpha^2 - \alpha - 1) + B\beta^k(\beta^2 - \beta - 1), \]
i.e.
\[ a_{k+2} = A\alpha^{k+2} + B\beta^{k+2} = A\alpha^{k+1} + B\beta^{k+1} + A\alpha^k + B\beta^k = a_{k+1} + a_k \]
and now we determine A and B such that $a_0 = a_1 = 1$, i.e. we look at the
system
\[ 1 = A + B \quad\text{and}\quad 1 = A\alpha + B\beta \tag{15.4} \]
which has the solution $A = \frac{\sqrt5+1}{2\sqrt5}$, $B = \frac{\sqrt5-1}{2\sqrt5}$. Since $A = \frac{\alpha}{\sqrt5}$ and $B = -\frac{\beta}{\sqrt5}$,
the Fibonacci numbers are given by
\[ a_k = \frac{\left(\frac{1+\sqrt5}{2}\right)^{k+1} - \left(\frac{1-\sqrt5}{2}\right)^{k+1}}{\sqrt5}, \quad k \ge 0. \tag{15.5} \]
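The closed form can be compared with the recursion directly. The following sketch assumes the starting values $a_0 = a_1 = 1$ of Example 15.2 and uses the coefficients A and B that solve (15.4); floating point numbers are used, so the closed form is rounded to the nearest integer before comparing.

```python
import math

def fibonacci(count):
    """First `count` terms of a_0 = a_1 = 1, a_{k+2} = a_{k+1} + a_k."""
    terms = [1, 1]
    while len(terms) < count:
        terms.append(terms[-1] + terms[-2])
    return terms[:count]

SQRT5 = math.sqrt(5)
ALPHA = (1 + SQRT5) / 2  # roots of q^2 - q - 1 = 0
BETA = (1 - SQRT5) / 2
A = (SQRT5 + 1) / (2 * SQRT5)  # solution of 1 = A + B, 1 = A*alpha + B*beta
B = (SQRT5 - 1) / (2 * SQRT5)

def closed_form(k):
    """A*alpha^k + B*beta^k, which equals (alpha^(k+1) - beta^(k+1)) / sqrt(5)."""
    return A * ALPHA**k + B * BETA**k

print(fibonacci(8))
print([round(closed_form(k)) for k in range(8)])
```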
Note that we may extend this approach to tackle more general recursively
defined sequences such as
\[ a_{k+n} = A_1a_{k+n-1} + \dots + A_na_k, \qquad a_j = x_j, \quad j = 0, \dots, n-1, \]
by looking at solutions of
\[ q^n - A_1q^{n-1} - A_2q^{n-2} - \dots - A_n = 0. \]
An elementary discussion of recursively defined sequences is given in A. I.
Markuschewitsch [10].
We now come to one of the fundamental definitions of this course, the limit
of a sequence.
Definition 15.3. Let $(a_n)_{n\ge k}$ be a sequence of real numbers. The sequence
is called convergent to a ∈ R if for every ε > 0 there exists N = N(ε) ∈ N
such that n ≥ N(ε) implies
\[ |a_n - a| < \varepsilon. \tag{15.6} \]
If $(a_n)_{n\in\mathbb{N}}$ converges to a we call a the limit of $(a_n)_{n\in\mathbb{N}}$ and we write
\[ \lim_{n\to\infty} a_n = a. \tag{15.7} \]
Before discussing some examples let us give some different formulations of
our definition. For a ∈ R and ε > 0 we may consider the open interval
(a − ε, a + ε) := {x ∈ R; |x − a| < ε} = {x ∈ R; a − ε < x < a + ε}.
[Figure 15.1: the interval from a − ε to a + ε on the real line, centred at a.]
If (an )n∈N converges to a, then given ε > 0, all elements an , n ≥ N(ε), will
lie in the interval (a − ε, a + ε). This is equivalent to the statement that for
every ε > 0 all but a finite number of the an ’s will lie in (a − ε, a + ε).
We call the interval (a − ε, a + ε), ε > 0, an ε-neighbourhood of a. Thus
the convergence of (an )n∈N to a means that for every ε > 0 all but finitely
many elements of the sequence lie in the corresponding ε-neighbourhood of
a.
Definition 15.4. A sequence of real numbers is called divergent if it has

no limit, i.e. it does not converge.
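Example 15.5. A. If $a_n = a \in \mathbb{R}$ for all n ∈ N, then $\lim_{n\to\infty} a_n = a$. Indeed,
given ε > 0 then we have
\[ |a_n - a| = |a - a| = 0 < \varepsilon \quad\text{for all } n \ge 1. \]
B. Consider the sequence $\left(\frac{1}{n}\right)_{n\in\mathbb{N}}$. We claim $\lim_{n\to\infty} \frac{1}{n} = 0$.
Given ε > 0, let N(ε) ∈ N be such that $N(\varepsilon) > \frac{1}{\varepsilon}$. It follows that
\[ \left|\frac{1}{n} - 0\right| = \frac{1}{n} < \varepsilon \quad\text{for all } n \ge N(\varepsilon). \]
C. The sequence $((-1)^n)_{n\in\mathbb{N}}$ is divergent.
Assume $((-1)^n)_{n\in\mathbb{N}}$ converges to a ∈ R. Then for ε = 1 there must exist
N ∈ N such that for all n ≥ N it follows that $|(-1)^n - a| < 1$. But for all n
we have $|(-1)^{n+1} - (-1)^n| = 2$, and for n ≥ N
\[ 2 = |(-1)^{n+1} - (-1)^n| = |((-1)^{n+1} - a) + (a - (-1)^n)| \le |(-1)^{n+1} - a| + |a - (-1)^n| < 1 + 1 = 2, \]
which is a contradiction. Hence no a ∈ R can be the limit of $((-1)^n)_{n\in\mathbb{N}}$.
D. The limit of $\left(\frac{n}{n+1}\right)_{n\in\mathbb{N}}$ is 1, i.e. $\lim_{n\to\infty} \frac{n}{n+1} = 1$.
For n ∈ N we find
\[ \left|\frac{n}{n+1} - 1\right| = \left|\frac{n - (n+1)}{n+1}\right| = \frac{1}{n+1}. \]
Hence, given ε > 0, if we choose $N(\varepsilon) = \left[\frac{1}{\varepsilon}\right] + 1$, then for each n ≥ N(ε) we have
\[ \left|\frac{n}{n+1} - 1\right| < \varepsilon. \]
E. We have $\lim_{n\to\infty} \frac{n}{2^n} = 0$.
For n > 3 we know that $n^2 \le 2^n$. It follows that
\[ \frac{n^2}{2^n} \le 1 \quad\text{or}\quad \frac{n}{2^n} \le \frac{1}{n} \quad\text{for } n > 3. \]
Let ε > 0 be given and take $N(\varepsilon) > \max\{3, \frac{1}{\varepsilon}\}$. Now n ≥ N(ε) implies
\[ \left|\frac{n}{2^n} - 0\right| = \frac{n}{2^n} \le \frac{1}{n} < \varepsilon. \]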
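In parts B, D and E the number N(ε) is given explicitly, so the estimates can be tested for a concrete ε. The sketch below works with exact fractions and checks only finitely many n, which of course proves nothing; the function names and the choice ε = 1/100 are assumptions made here.

```python
import math
from fractions import Fraction

def index_for_reciprocal(eps):
    """A natural number N with N > 1/eps, so that 1/n < eps for all n >= N."""
    return math.floor(1 / eps) + 1

def index_for_n_over_2_pow_n(eps):
    """A natural number N > max{3, 1/eps}, as chosen in Example 15.5.E."""
    return max(3, math.floor(1 / eps)) + 1

eps = Fraction(1, 100)
N_b = index_for_reciprocal(eps)
N_e = index_for_n_over_2_pow_n(eps)

ok_b = all(Fraction(1, n) < eps for n in range(N_b, N_b + 1000))
ok_d = all(abs(Fraction(n, n + 1) - 1) < eps for n in range(N_b, N_b + 1000))
ok_e = all(Fraction(n, 2**n) <= Fraction(1, n) < eps for n in range(N_e, N_e + 300))
print(N_b, N_e, ok_b, ok_d, ok_e)
```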
A helpful observation is
Lemma 15.6. For a convergent sequence $(a_n)_{n\ge k}$ of real numbers and any
m ∈ N
\[ \lim_{n\to\infty} a_n = \lim_{n\to\infty} a_{n+m} \tag{15.8} \]
holds.
Proof. First define $b_n := a_{n+m}$. Since $\lim_{n\to\infty} a_n = a$ exists, for every ε > 0
there exists N such that n ≥ N implies $|a - a_n| < \varepsilon$. However for these n we
have n + m ≥ N and therefore
\[ |b_n - a| = |a_{n+m} - a| < \varepsilon. \]
Definition 15.7. A sequence of real numbers $(a_n)_{n\ge k}$ is bounded above if
there exists $K_1 \in \mathbb{R}$ such that $a_n \le K_1$ for all n ≥ k. It is called bounded
below if there exists $K_2 \in \mathbb{R}$ such that $K_2 \le a_n$ for all n ≥ k. We call
$(a_n)_{n\ge k}$ bounded if it is bounded above and below, i.e. if there is some
K ∈ R such that $-K \le a_n \le K$, or $|a_n| \le K$, for all n ≥ k.
Theorem 15.8. Every convergent sequence (an )n≥k is bounded. If K is a
bound for |an |, i.e. |an | ≤ K for all n ≥ k, and if a ∈ R is the limit of
(an )n≥k , then |a| ≤ K.
Proof. Let $(a_n)_{n\ge k}$ be a sequence converging to a, i.e. $\lim_{n\to\infty} a_n = a$. By
definition, for ε = 1 there exists N such that $|a_n - a| < 1$ for all n ≥ N. This
implies
\[ |a_n| = |a_n - a + a| \le |a| + |a_n - a| \le |a| + 1 \]
for n ≥ N. Now if we define $M := \max\{|a_k|, \dots, |a_{N-1}|, |a| + 1\}$, then
$|a_n| \le M$ for all n ≥ k, i.e. $(a_n)_{n\ge k}$ is bounded.
Further, if $|a_n| \le K$ for all n ≥ k we find
\[ |a| \le |a_n| + |a_n - a| \le K + |a_n - a|. \]
For ε > 0 there exists N(ε) ∈ N such that n ≥ N(ε) implies $|a_n - a| < \varepsilon$ and
therefore n ≥ N(ε) implies |a| ≤ K + ε. Since ε > 0 is arbitrary we deduce
that |a| ≤ K.
Remark 15.9. Of course, a bounded sequence need not be convergent:
((−1)n )n∈N is bounded since
|(−1)n | = 1 for all n ∈ N,
but we already know that this sequence is divergent.
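Example 15.10. The sequence $(a_n)_{n\ge 0}$ of all Fibonacci numbers is divergent
since we always have that $a_n \ge n$ for $n \in \mathbb{N}_0$. For n = 0, 1, 2 this is clear
since $a_0 = a_1 = 1$ and $a_2 = 2$. Now suppose $a_n \ge n$ for all n ≤ N, where N ≥ 2;
we find $a_{N+1} = a_N + a_{N-1} \ge N + N - 1 = 2N - 1 \ge N + 1$. Hence the sequence
is unbounded and by Theorem 15.8 it cannot be convergent.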
Example 15.11. We want to study the geometric sequence $(q^n)_{n\in\mathbb{N}}$.
A. If |q| < 1, then $\lim_{n\to\infty} q^n = 0$.
We know by Corollary 14.8 that for ε > 0 there exists N ∈ N such that
$|q|^N < \varepsilon$. Now we find
\[ |q^n - 0| = |q^n| = |q|^n \le |q|^N < \varepsilon \]
for all n ≥ N.
B. For q = 1 we have $q^n = 1$ and we already know that
\[ \lim_{n\to\infty} q^n = \lim_{n\to\infty} 1 = 1. \]
C. For q = −1, i.e. $q^n = (-1)^n$, we have just shown that $((-1)^n)_{n\in\mathbb{N}}$ is
divergent.
D. For |q| > 1 it follows that $(|q|^n)_{n\in\mathbb{N}}$, hence $(q^n)_{n\in\mathbb{N}}$, is unbounded, compare
Lemma 14.7. Therefore $(q^n)_{n\in\mathbb{N}}$ is divergent.
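Example 15.12. We claim that $\lim_{n\to\infty} \sqrt[n]{n} = \lim_{n\to\infty} n^{\frac1n} = 1$. For this we set
$a_n := \sqrt[n]{n} - 1$. Given ε > 0 we need to find N(ε) ∈ N such that n ≥ N(ε)
implies $|a_n| = a_n < \varepsilon$. The binomial theorem yields
\[ n = (1+a_n)^n = \sum_{j=0}^{n} \binom{n}{j}a_n^j \ge 1 + \binom{n}{2}a_n^2 = 1 + \frac{n(n-1)}{2}a_n^2. \]
For n ≥ 2 this implies
\[ a_n^2 \le \frac{2(n-1)}{n(n-1)} = \frac{2}{n}, \]
or
\[ a_n \le \frac{\sqrt2}{\sqrt n}. \]
Thus we need to find N(ε) such that n ≥ N(ε) implies $\frac{\sqrt2}{\sqrt n} < \varepsilon$. However
with $N_0 > \frac{2}{\varepsilon^2}$ and $n \ge N_0 \ge 2$ it follows that
\[ \frac{\sqrt2}{\sqrt n} \le \frac{\sqrt2}{\sqrt{N_0}} < \varepsilon. \]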
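The bound $a_n \le \frac{\sqrt2}{\sqrt n}$ and the resulting choice of $N_0$ can be tested for concrete values. The sketch assumes ε = 1/10 (given as an exact fraction so that $N_0$ is computed without rounding) and a finite range of n; the function names are hypothetical.

```python
import math
from fractions import Fraction

def a(n):
    """a_n = n**(1/n) - 1."""
    return n ** (1.0 / n) - 1.0

def threshold(eps):
    """A natural number N0 >= 2 with N0 > 2 / eps**2, so that sqrt(2/n) < eps for n >= N0."""
    return max(2, math.floor(2 / eps**2) + 1)

eps = Fraction(1, 10)
N0 = threshold(eps)
bound_ok = all(0 <= a(n) <= math.sqrt(2 / n) for n in range(2, 5000))
eps_ok = all(a(n) < eps for n in range(N0, N0 + 2000))
print(N0, bound_ok, eps_ok)
```

The threshold obtained in this way is far from optimal, since $a_n$ is already smaller than 1/10 for much smaller n; the proof only needs some admissible $N_0$.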
So far we have defined the limit of a sequence. But do we know that it is
unique?
Theorem 15.13. The limit a ∈ R of a sequence $(a_n)_{n\ge k}$ is unique.
Proof. Suppose that $(a_n)_{n\ge k}$ has two limits a and a′. Then given ε > 0, since
$\lim_{n\to\infty} a_n = a$, there exists $N_1 \in \mathbb{N}$ such that $|a_n - a| < \frac12\varepsilon$ for $n \ge N_1$. On
the other hand, since we also have $\lim_{n\to\infty} a_n = a'$ there exists $N_2$ such that
$|a_n - a'| < \frac12\varepsilon$ for $n \ge N_2$. Thus it follows that, if $n \ge \max\{N_1, N_2\}$, then
\[ |a - a'| = |(a - a_n) + (a_n - a')| \le |a - a_n| + |a_n - a'| < \frac12\varepsilon + \frac12\varepsilon = \varepsilon. \]
This is true for all ε > 0 and so |a − a′| = 0 or a = a′.
Theorem 15.14 (Sum of convergent sequences). Let $(a_n)_{n\ge k}$ and $(b_n)_{n\ge k}$
be two convergent sequences with limits a and b, respectively, i.e. $\lim_{n\to\infty} a_n = a$
and $\lim_{n\to\infty} b_n = b$. Then the sequence $(c_n)_{n\ge k}$, $c_n := a_n + b_n$, converges to a + b,
i.e.
\[ \lim_{n\to\infty} c_n = a + b. \]
Proof. Let ε > 0 be given. For $\frac{\varepsilon}{2} > 0$ there exist $N_1$ and $N_2$ such that
\[ n \ge N_1 \text{ implies } |a - a_n| < \frac{\varepsilon}{2}, \]
and
\[ n \ge N_2 \text{ implies } |b - b_n| < \frac{\varepsilon}{2}. \]
For $N = \max\{N_1, N_2\}$ we find that n ≥ N implies
\[ |c_n - (a+b)| = |a_n + b_n - (a+b)| = |(a_n - a) + (b_n - b)| \le |a_n - a| + |b_n - b| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon, \]
thus $\lim_{n\to\infty} c_n = a + b$.
Example 15.15. Consider $c_n = \frac{n+1}{n}$, n ∈ N. Setting $a_n = 1$ and $b_n = \frac{1}{n}$ we
find $c_n = a_n + b_n$. We know $\lim_{n\to\infty} a_n = 1$ and $\lim_{n\to\infty} b_n = \lim_{n\to\infty} \frac{1}{n} = 0$. Thus we
find $\lim_{n\to\infty} \frac{n+1}{n} = 1 + 0 = 1$.
Theorem 15.16 (Product of convergent sequences). Let $(a_n)_{n\ge k}$ and
$(b_n)_{n\ge k}$ be two convergent sequences with limits a and b, respectively. Then the
sequence $(a_n\cdot b_n)_{n\ge k}$ converges to a · b, i.e.
\[ \lim_{n\to\infty} (a_n\cdot b_n) = \left(\lim_{n\to\infty} a_n\right)\cdot\left(\lim_{n\to\infty} b_n\right). \]
Proof. Put $\lim_{n\to\infty} a_n = a$ and $\lim_{n\to\infty} b_n = b$. We know that $(a_n)_{n\ge k}$ is bounded,
hence with some K > 0 we have $|a_n| \le K$ for all n ≥ k. But $(b_n)_{n\ge k}$ is
also a bounded sequence, and without loss of generality we may also assume
that $|b_n| \le K$ for all n ≥ k. In addition by Theorem 15.8 we also know that
|a| ≤ K and |b| ≤ K. The convergence of $(a_n)_{n\ge k}$ and $(b_n)_{n\ge k}$ implies that
for ε > 0 there exist $N_1, N_2 \in \mathbb{N}$ such that
\[ |a_n - a| < \frac{\varepsilon}{2K} \text{ for } n \ge N_1, \quad\text{and}\quad |b_n - b| < \frac{\varepsilon}{2K} \text{ for } n \ge N_2. \]
Now, for all $n \ge N := \max(N_1, N_2)$ we find
\[ |a_n\cdot b_n - a\cdot b| = |a_n(b_n - b) + (a_n - a)\cdot b| \le |a_n||b_n - b| + |a_n - a||b| < K\cdot\frac{\varepsilon}{2K} + K\cdot\frac{\varepsilon}{2K} = \varepsilon. \]
Corollary 15.17. Let $(a_n)_{n\ge k}$ be a convergent sequence and λ ∈ R. Then
the sequence $(\lambda a_n)_{n\ge k}$ converges and the limit is given by
\[ \lim_{n\to\infty} (\lambda a_n) = \lambda\lim_{n\to\infty} a_n. \]
Proof. We may apply Theorem 15.16 with $b_n = \lambda$ for all n ≥ k.

Corollary 15.18. Let $(a_n)_{n\ge k}$ and $(b_n)_{n\ge k}$ be two convergent sequences.
Then the sequence $(a_n - b_n)_{n\ge k}$ is convergent and its limit is
\[ \lim_{n\to\infty} (a_n - b_n) = \lim_{n\to\infty} a_n - \lim_{n\to\infty} b_n. \]
Proof. Just combine Corollary 15.17 with Theorem 15.14.

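Theorem 15.19. Let $(a_n)_{n \ge k}$ and $(b_n)_{n \ge k}$ be convergent sequences and suppose that $\lim_{n\to\infty} b_n \ne 0$. Then there exists $N_0 \in \mathbb{N}$ such that $b_n \ne 0$ for $n \ge N_0$ and the sequence $\big(\frac{a_n}{b_n}\big)_{n \ge N_0}$ is convergent to
$$\lim_{n\to\infty} \frac{a_n}{b_n} = \frac{\lim_{n\to\infty} a_n}{\lim_{n\to\infty} b_n}.$$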
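Proof. With $b := \lim_{n\to\infty} b_n \ne 0$ we find for $\varepsilon := \frac{|b|}{2} > 0$ a number $N_0 \in \mathbb{N}$ such that
$$|b_n - b| < \frac{|b|}{2} \text{ for } n \ge N_0.$$
Since $|b| - |b_n| \le |b_n - b|$, for $n \ge N_0$ we have
$$|b| - |b_n| < \frac{|b|}{2} \quad \text{or} \quad \frac{|b|}{2} < |b_n|,$$
the last statement being equivalent to $\frac{1}{|b_n|} < \frac{2}{|b|}$. In particular for all $n \ge N_0$ we have $b_n \ne 0$ and hence for these $n$ the expression $\frac{a_n}{b_n}$ always makes sense.
Next suppose that $a_n = 1$ for all $n$. Given $\varepsilon > 0$ there exists $N_1 \in \mathbb{N}$ such that
$$|b_n - b| < \frac{\varepsilon |b|^2}{2} \text{ for } n \ge N_1.$$
Therefore, for $n \ge N := \max\{N_0, N_1\}$ we find
$$\left| \frac{1}{b_n} - \frac{1}{b} \right| = \left| \frac{b - b_n}{b_n b} \right| = \frac{1}{|b_n||b|} |b - b_n| < \frac{2}{|b|^2} |b_n - b| < \frac{2}{|b|^2} \cdot \frac{\varepsilon |b|^2}{2} = \varepsilon.$$
Hence we have proved that
$$\lim_{n\to\infty} \frac{1}{b_n} = \frac{1}{b}.$$
Since $\frac{a_n}{b_n} = a_n \cdot \frac{1}{b_n}$, $n \ge N_0$, the general case follows now from Theorem 15.16.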
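Example 15.20. Consider the sequence $a_n = \frac{7n^2 + 3n}{n^2 - 2}$, $n \in \mathbb{N}$. We may write $a_n = \frac{7 + \frac{3}{n}}{1 - \frac{2}{n^2}}$. Now, $\lim_{n\to\infty} \frac{1}{n} = 0$, implying that $\lim_{n\to\infty} \frac{1}{n^2} = \lim_{n\to\infty} \big(\frac{1}{n} \cdot \frac{1}{n}\big) = 0$ by Theorem 15.16. Further, by Corollary 15.17 we find that $\lim_{n\to\infty} \frac{3}{n} = 0$ and $\lim_{n\to\infty} \frac{2}{n^2} = 0$. Thus $\lim_{n\to\infty} \big(7 + \frac{3}{n}\big) = 7$ and $\lim_{n\to\infty} \big(1 - \frac{2}{n^2}\big) = 1$. According to Theorem 15.19 we have
$$\lim_{n\to\infty} \frac{7n^2 + 3n}{n^2 - 2} = \frac{\lim_{n\to\infty} \big(7 + \frac{3}{n}\big)}{\lim_{n\to\infty} \big(1 - \frac{2}{n^2}\big)} = \frac{7}{1} = 7.$$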
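The computation in Example 15.20 can be checked numerically. The following is only an illustrative sketch (the function names `a` and `rewritten` are ours, not the book's): it evaluates the terms exactly with rational arithmetic, confirms that both ways of writing $a_n$ agree, and prints the distance $|a_n - 7|$ for a few values of $n$.

```python
from fractions import Fraction


def a(n: int) -> Fraction:
    """Term a_n = (7n^2 + 3n) / (n^2 - 2) as an exact rational number."""
    return Fraction(7 * n * n + 3 * n, n * n - 2)


def rewritten(n: int) -> Fraction:
    """The same term written as (7 + 3/n) / (1 - 2/n^2)."""
    return (7 + Fraction(3, n)) / (1 - Fraction(2, n * n))


for n in (10, 100, 1000, 10000):
    # Both representations must agree exactly.
    assert a(n) == rewritten(n)
    # The distance to the claimed limit 7 shrinks as n grows.
    print(n, float(a(n)), float(abs(a(n) - 7)))
```

A finite table of values is of course no proof of convergence; it merely illustrates what the limit theorems guarantee.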
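Theorem 15.21. Let $(a_n)_{n \ge k}$ and $(b_n)_{n \ge k}$ be two convergent sequences and suppose that $a_n \le b_n$ for all $n \ge k$. Then we have $\lim_{n\to\infty} a_n \le \lim_{n\to\infty} b_n$.
Proof. Suppose $b := \lim_{n\to\infty} b_n < a := \lim_{n\to\infty} a_n$. For $\varepsilon := \frac{a - b}{2} > 0$ there exist $N_1, N_2 \in \mathbb{N}$ such that $|a_n - a| < \varepsilon$ for $n \ge N_1$ and $|b_n - b| < \varepsilon$ for $n \ge N_2$. For $n \ge \max\{N_1, N_2\}$ we find that
$$a_n > a - \varepsilon \quad \text{and} \quad b_n < b + \varepsilon.$$
By the definition of $\varepsilon$ we have
$$a - \varepsilon = a - \frac{a - b}{2} = \frac{a + b}{2} = b + \frac{a - b}{2} = b + \varepsilon,$$
implying that $b_n < b + \varepsilon = a - \varepsilon < a_n$, which contradicts the assumption $a_n \le b_n$, and so the theorem is proved.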
Remark 15.22. A. In particular $a_n \ge 0$ implies that $\lim_{n\to\infty} a_n \ge 0$.
B. Note that in Theorem 15.21 we need not assume that $a_n \le b_n$ for all $n \ge k$. It is sufficient to assume that $a_n \le b_n$ for all $n \ge N_0$ with some $N_0 \ge k$, $N_0 \in \mathbb{N}$.
C. Note further that $a_n < b_n$ for all $n \ge k$ (or all $n \ge N_0$, $N_0 \ge k$) does not imply that
$$\lim_{n\to\infty} a_n < \lim_{n\to\infty} b_n.$$
To see this take the sequence $a_n = 0$ for all $n \in \mathbb{N}$ and $b_n = \frac{1}{n}$. For all $n \in \mathbb{N}$ we know that $a_n = 0 < \frac{1}{n} = b_n$, but $\lim_{n\to\infty} a_n = 0 = \lim_{n\to\infty} b_n$.
Corollary 15.23. Suppose that $(a_n)_{n\in\mathbb{N}}$ is a convergent sequence and that with two numbers $A$ and $B$ we have $A \le a_n \le B$ for all $n \in \mathbb{N}$, $n \ge N_0$. Then
$$A \le \lim_{n\to\infty} a_n \le B.$$
Problems
1. Let $M$ be a countable set and $f : M \to \mathbb{R}$. Prove that we can arrange $f(M)$ as a sequence, i.e. $f(M) = \{a_k \in \mathbb{R} \mid k \in \mathbb{N}\}$ with suitable real numbers $a_k$.
Hint: recall that $M$ is countable if and only if there exists a bijective mapping $g : \mathbb{N} \to M$.
2. a) Let $(a_n)_{n \ge k}$ be a sequence of real numbers such that $a_n = a$ for all $n \ge M \ge k$. Prove that $(a_n)_{n \ge k}$ converges and find its limit.
b) Let $(a_n)_{n \ge k}$ be a sequence with limit $a$. Consider the sequence
$$b_n := \begin{cases} c_n, & k \le n \le M - 1 \\ a_n, & n \ge M \end{cases}$$
for any choice of numbers $c_n$, $k \le n \le M - 1$, and any choice of $M \ge k$. Prove that $\lim_{n\to\infty} b_n = a$.
3. Let $(a_n)_{n \ge k}$ be a sequence converging to 0 and let $(b_n)_{n \ge k}$ be a bounded sequence. Show that $\lim_{n\to\infty} (b_n a_n) = 0$.
4. a) Suppose that $a = \lim_{n\to\infty} a_n = \lim_{n\to\infty} b_n$. Moreover for $n \ge k$ let $c_n \in \mathbb{R}$ be given satisfying $a_n \le c_n \le b_n$. Show that $(c_n)_{n \ge k}$ converges to $a$.
b) Suppose that $a = \lim_{n\to\infty} a_n$ and $b = \lim_{n\to\infty} b_n$, $a < b$, and suppose that for $n \in \mathbb{N}$ the numbers $c_n \in \mathbb{R}$ satisfy $a_n \le c_n \le b_n$. Does this imply the convergence of $(c_n)_{n\in\mathbb{N}}$?
5. a) Prove that $\lim_{n\to\infty} a_n = a$ implies $\lim_{n\to\infty} |a_n| = |a|$. Now deduce that $\lim_{n\to\infty} a_n = a$ is equivalent to $\lim_{n\to\infty} |a_n - a| = 0$.
b) Let $(a_n)_{n\in\mathbb{N}}$ be a sequence of real numbers and $a \in \mathbb{R}$. Further let $(\mu_n)_{n\in\mathbb{N}}$ be a sequence of non-negative numbers converging to 0. Suppose that for all $n \in \mathbb{N}$ we have $|a_n - a| \le \mu_n$. Prove that $\lim_{n\to\infty} a_n = a$.
6. Suppose that $\lim_{n\to\infty} a_n = a$ and $\lim_{n\to\infty} b_n = b$. Prove that $\lim_{n\to\infty} \max\{a_n, b_n\} = \max\{a, b\}$ and $\lim_{n\to\infty} \min\{a_n, b_n\} = \min\{a, b\}$.
Hint: find a representation of the maximum and the minimum of two numbers with the help of the absolute value. Then use the result of Problem 5 a).
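7. a) Prove that $\lim_{n\to\infty} \frac{5}{n+6} = 0$, i.e. prove that for every $\varepsilon > 0$ there exists $N = N(\varepsilon) \in \mathbb{N}$ such that $n \ge N(\varepsilon)$ implies $\left| \frac{5}{n+6} - 0 \right| < \varepsilon$.
b) For $\varepsilon = \frac{1}{1000}$ find $N \in \mathbb{N}$ such that $n \ge N$ implies
$$\left| \frac{4n}{3n + 2} - \frac{4}{3} \right| < \frac{1}{1000}.$$
8. For $k \in \mathbb{N}$ prove:
a) $\lim_{n\to\infty} \frac{1}{n^k} = 0$;
b) $\lim_{n\to\infty} \frac{1}{n^{1/k}} = 0$.
9. Use the theorems about limits and already proved results about limits of sequences to find:
a) $\lim_{n\to\infty} \frac{(n+1)^2 - n^2}{n}$;
b) $\lim_{n\to\infty} \big(\sqrt{n+1} - \sqrt{n}\big)$;
c) $\lim_{n\to\infty} \frac{\sum_{j=1}^{n} j}{n^2}$;
d) $\lim_{n\to\infty} \frac{\sum_{j=1}^{n} j^2}{n^3}$;
e) $\lim_{n\to\infty} \frac{1 + 2 \cdot 3^n}{5 + 4 \cdot 3^n}$;
f) $\lim_{n\to\infty} \frac{n + 4^n}{5^n}$.
10. Prove that $\lim_{n\to\infty} \sqrt[n]{a} = 1$ for $a \ge 1$.
11. Find the following limit:
$$\lim_{n\to\infty} \prod_{j=1}^{n} \left(1 - \frac{1}{j+1}\right).$$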
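12. Find
$$\lim_{\nu\to\infty} \frac{\sum_{k=0}^{n} a_k \nu^k}{\sum_{l=0}^{m} b_l \nu^l}, \quad \nu \in \mathbb{N}, \ a_k, b_l \in \mathbb{R}.$$
Note that the cases $n < m$, $m < n$ and $n = m$ need to be considered separately.
13. Suppose that $\lim_{n\to\infty} a_n = a$. Prove that
$$\lim_{n\to\infty} \frac{\sum_{j=1}^{n} a_j}{n} = a.$$
14. Let $f : (a, b) \to \mathbb{R}$ be a function and $x_0 \in (a, b)$ be fixed. Suppose that for every $\varepsilon > 0$ there exists $\delta > 0$ such that $0 < |x - x_0| < \delta$ implies $\left| \frac{f(x) - f(x_0)}{x - x_0} - A \right| < \varepsilon$. Deduce that then for every $\varepsilon > 0$ there exists $N(\varepsilon) \in \mathbb{N}$ such that $n \ge N(\varepsilon)$ implies
$$\left| n \left( f\Big(x_0 + \frac{1}{n}\Big) - f(x_0) \right) - A \right| < \varepsilon,$$
i.e.
$$\lim_{n\to\infty} n \left( f\Big(x_0 + \frac{1}{n}\Big) - f(x_0) \right) = A.$$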
16 A First Encounter with Series

We next want to look at sequences from a different (but equivalent) point of view. Let $(a_n)_{n\in\mathbb{N}}$ be a sequence of real numbers. Starting with $(a_n)_{n\in\mathbb{N}}$ we may introduce a new sequence
$$s_n := \sum_{k=1}^{n} a_k, \quad n \in \mathbb{N},$$
more generally, if $(a_n)_{n \ge l}$, then $s_n := \sum_{k=l}^{n} a_k$. We call $s_n$ the $n$th partial sum of the (infinite) series $\sum_{k=1}^{\infty} a_k$. Thus we have a new sequence $(s_n)_{n\in\mathbb{N}}$ and note that at the moment $\sum_{k=1}^{\infty} a_k$ is just a formal expression for this sequence.
However, it may happen that the sequence of the partial sums $(s_n)_{n\in\mathbb{N}}$ converges to some limit $s$. In this case we denote the limit also by $\sum_{k=1}^{\infty} a_k$. Thus the symbol $\sum_{k=1}^{\infty} a_k$ will have two meanings: a formal expression for the sequence of partial sums $\big(\sum_{k=1}^{n} a_k\big)_{n\in\mathbb{N}}$ and, if it exists, the limit of the sequence of partial sums.
Remark 16.1. Note that every sequence $(s_n)_{n\in\mathbb{N}}$ has a representation as the sequence of partial sums of a series, i.e. in a certain sense sequences and series are in a one-to-one correspondence. Indeed, given $(s_n)_{n\in\mathbb{N}}$ define (with $s_0 := 0$)
$$a_n = s_n - s_{n-1}.$$
Then $s_n = \sum_{k=1}^{n} a_k$.
Let us formally state
Definition 16.2. Let $(a_n)_{n \ge k}$ be a sequence of real numbers and denote by $(s_n)_{n \ge k}$ the sequence of partial sums $s_n = \sum_{l=k}^{n} a_l$. We call the series $\sum_{l=k}^{\infty} a_l$ convergent to $s \in \mathbb{R}$ and denote the limit also by $\sum_{l=k}^{\infty} a_l$ if the sequence $(s_n)_{n \ge k}$ converges to $s$.

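Example 16.3. Consider the series $\sum_{k=1}^{\infty} \frac{1}{k(k+1)}$. We then see that
$$\begin{aligned} s_n = \sum_{k=1}^{n} \frac{1}{k(k+1)} &= \sum_{k=1}^{n} \left( \frac{1}{k} - \frac{1}{k+1} \right) \\ &= 1 + \sum_{k=2}^{n} \frac{1}{k} - \sum_{k=1}^{n-1} \frac{1}{k+1} - \frac{1}{n+1} \\ &= 1 - \frac{1}{n+1} + \sum_{k=1}^{n-1} \frac{1}{k+1} - \sum_{k=1}^{n-1} \frac{1}{k+1} \\ &= 1 - \frac{1}{n+1} = \frac{n}{n+1}, \end{aligned}$$
i.e. $(s_n)_{n\in\mathbb{N}} = \big(\frac{n}{n+1}\big)_{n\in\mathbb{N}}$ and therefore the series $\sum_{k=1}^{\infty} \frac{1}{k(k+1)}$ converges and its limit is given by
$$\sum_{k=1}^{\infty} \frac{1}{k(k+1)} = \lim_{n\to\infty} s_n = \lim_{n\to\infty} \frac{n}{n+1} = 1.$$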
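The closed form $s_n = \frac{n}{n+1}$ can be confirmed by summing the terms directly. The sketch below is only an illustration (the helper name `partial_sum` is ours): it uses exact rational arithmetic, so the comparison with $\frac{n}{n+1}$ is an exact equality rather than a floating-point approximation.

```python
from fractions import Fraction


def partial_sum(n: int) -> Fraction:
    """n-th partial sum of the series sum_{k>=1} 1/(k(k+1)), computed term by term."""
    return sum((Fraction(1, k * (k + 1)) for k in range(1, n + 1)), Fraction(0))


for n in (1, 2, 5, 10, 100):
    s = partial_sum(n)
    # The telescoping argument predicts s_n = n/(n+1).
    assert s == Fraction(n, n + 1)
    # The remaining distance to the limit 1 is exactly 1/(n+1).
    print(n, s, float(1 - s))
```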
Theorem 16.4. Let $x \in \mathbb{R}$ and $|x| < 1$. Then we have
$$\sum_{k=0}^{\infty} x^k = \frac{1}{1 - x}.$$
Proof. We first claim
$$s_n = s_n(x) = \sum_{k=0}^{n} x^k = \frac{1 - x^{n+1}}{1 - x}.$$
Once this is proved, from Example 15.11.A it follows that
$$\lim_{n\to\infty} x^{n+1} = x \cdot \lim_{n\to\infty} x^n = 0 \text{ for } |x| < 1,$$
therefore we find that
$$\lim_{n\to\infty} s_n = \lim_{n\to\infty} \frac{1 - x^{n+1}}{1 - x} = \frac{1}{1 - x}.$$
The series $\sum_{k=0}^{\infty} x^k$ is called the geometric series (with parameter or variable $x \in (-1, 1)$). Now we prove: let $x \in \mathbb{R}$, $x \ne 1$, then for all $n \in \mathbb{N}_0$ we have
$$\sum_{k=0}^{n} x^k = \frac{1 - x^{n+1}}{1 - x}.$$
Indeed, for $n = 0$ we find
$$\sum_{k=0}^{0} x^k = x^0 = 1 = \frac{1 - x^{0+1}}{1 - x},$$
and further
$$\begin{aligned} \sum_{k=0}^{n+1} x^k &= \sum_{k=0}^{n} x^k + x^{n+1} = \frac{1 - x^{n+1}}{1 - x} + x^{n+1} \\ &= \frac{1 - x^{n+1}}{1 - x} + \frac{(1 - x) x^{n+1}}{1 - x} \\ &= \frac{1 - x^{n+1} + x^{n+1} - x^{n+2}}{1 - x} \\ &= \frac{1 - x^{n+2}}{1 - x}, \end{aligned}$$
and the result follows by mathematical induction.
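As a sanity check of the finite formula $\sum_{k=0}^{n} x^k = \frac{1 - x^{n+1}}{1 - x}$, one can compare both sides for a few rational values of $x$. This is a minimal sketch under our own naming (`geometric_partial_sum` and `closed_form` are hypothetical helpers, not notation from the text); exact fractions avoid rounding issues.

```python
from fractions import Fraction


def geometric_partial_sum(x: Fraction, n: int) -> Fraction:
    """Left-hand side: 1 + x + x^2 + ... + x^n, summed term by term."""
    return sum((x ** k for k in range(n + 1)), Fraction(0))


def closed_form(x: Fraction, n: int) -> Fraction:
    """Right-hand side: (1 - x^(n+1)) / (1 - x), valid for x != 1."""
    return (1 - x ** (n + 1)) / (1 - x)


for x in (Fraction(1, 2), Fraction(-1, 2), Fraction(9, 10)):
    for n in range(12):
        assert geometric_partial_sum(x, n) == closed_form(x, n)
    # For |x| < 1 the partial sums approach 1/(1 - x).
    print(x, float(geometric_partial_sum(x, 50)), float(1 / (1 - x)))
```

For $x = \frac{9}{10}$ the partial sums approach the limit $10$ noticeably more slowly than for $x = \pm\frac{1}{2}$, in line with the factor $x^{n+1}$ in the closed form.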
Remark 16.5. Let us change our point of view and consider the function $f : \mathbb{R} \setminus \{1\} \to \mathbb{R}$, $f(x) = \frac{1}{1 - x}$. If $|x| < 1$ then Theorem 16.4 says that $f$ has a representation by $x \mapsto \sum_{k=0}^{\infty} x^k$ in the sense that $f(x) = \sum_{k=0}^{\infty} x^k$ for $|x| < 1$. We say that for $|x| < 1$ the series $\sum_{k=0}^{\infty} x^k$ converges to the function $f$. Note that $f$ is defined on a much larger set than the set on which the series converges, i.e. the series represents $f$ only on a subset of the domain of $f$.
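Example 16.6. A. The following holds
$$\sum_{k=0}^{\infty} 2^{-k} = \sum_{k=0}^{\infty} \left(\frac{1}{2}\right)^k = 1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \cdots = \frac{1}{1 - \frac{1}{2}} = 2.$$
B. We have
$$\sum_{k=0}^{\infty} (-2)^{-k} = \sum_{k=0}^{\infty} \left(-\frac{1}{2}\right)^k = 1 - \frac{1}{2} + \frac{1}{4} - \frac{1}{8} \pm \cdots = \frac{1}{1 - \left(-\frac{1}{2}\right)} = \frac{2}{3}.$$
C. For $\varphi \in (0, \pi)$ we know that $|\cos \varphi| < 1$ and consequently we find
$$\sum_{k=0}^{\infty} (\cos^2 \varphi)^k = \frac{1}{1 - \cos^2 \varphi} = \frac{1}{\sin^2 \varphi}.$$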
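Since the convergence of a series is by definition the convergence of the sequence of its partial sums, we may immediately derive some rules for handling convergent series by using known results for sequences:
Theorem 16.7. Let $\sum_{l=k}^{\infty} a_l$ and $\sum_{l=k}^{\infty} b_l$ be two convergent series and $\lambda \in \mathbb{R}$. Then the series $\sum_{l=k}^{\infty} (a_l + b_l)$, $\sum_{l=k}^{\infty} (a_l - b_l)$ and $\sum_{l=k}^{\infty} (\lambda a_l)$ converge. Moreover, for their limits we have
$$\sum_{l=k}^{\infty} (a_l \pm b_l) = \sum_{l=k}^{\infty} a_l \pm \sum_{l=k}^{\infty} b_l$$
and
$$\sum_{l=k}^{\infty} \lambda a_l = \lambda \sum_{l=k}^{\infty} a_l.$$
Proof. With $c_n := \sum_{l=k}^{n} a_l$ and $d_n := \sum_{l=k}^{n} b_l$ we have
$$\sum_{l=k}^{n} (a_l \pm b_l) = \sum_{l=k}^{n} a_l \pm \sum_{l=k}^{n} b_l = c_n \pm d_n,$$
implying that
$$\begin{aligned} \sum_{l=k}^{\infty} (a_l \pm b_l) &= \lim_{n\to\infty} \sum_{l=k}^{n} (a_l \pm b_l) = \lim_{n\to\infty} \left( \sum_{l=k}^{n} a_l \pm \sum_{l=k}^{n} b_l \right) \\ &= \lim_{n\to\infty} \sum_{l=k}^{n} a_l \pm \lim_{n\to\infty} \sum_{l=k}^{n} b_l = \sum_{l=k}^{\infty} a_l \pm \sum_{l=k}^{\infty} b_l. \end{aligned}$$
The final assertion is shown in an analogous way.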

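Example 16.8. Recall the series $\sum_{k=1}^{\infty} \frac{1}{k(k+1)}$. We know that $\sum_{k=1}^{\infty} \frac{1}{k(k+1)} = 1$, and further we have
$$\frac{1}{k(k+1)} = \frac{1}{k} - \frac{1}{k+1}.$$
But we will see later that the series $\sum_{k=1}^{\infty} \frac{1}{k}$ is not convergent, hence $\sum_{k=1}^{\infty} \frac{1}{k+1}$ does not converge. Hence
$$\sum_{k=1}^{\infty} \frac{1}{k} - \sum_{k=1}^{\infty} \frac{1}{k+1}$$
does not make sense.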
Definition 16.9. A sequence $(a_n)_{n \ge k}$ of real numbers is called divergent to $+\infty$ (to $-\infty$) if for any $K \in \mathbb{R}$ there exists $N = N(K) \in \mathbb{N}$ such that if $n \ge N$ then $a_n > K$ ($a_n < K$).
For a sequence divergent to $+\infty$ ($-\infty$) we will write
$$\lim_{n\to\infty} a_n = \infty \quad \big( \lim_{n\to\infty} a_n = -\infty \big).$$
Example 16.10. A. For $m \in \mathbb{N}$ the sequence $(n^m)_{n\in\mathbb{N}}$ diverges to $+\infty$.
B. The sequence $(-(2^n))_{n\in\mathbb{N}}$ diverges to $-\infty$.
C. The sequence $(s_n)_{n\in\mathbb{N}}$, $s_n := \sum_{k=1}^{n} k = \frac{n(n+1)}{2}$, diverges to $+\infty$.
D. The sequence $((-1)^n)_{n\in\mathbb{N}}$ diverges, but it does not diverge to $+\infty$ or $-\infty$.
E. The sequence of the Fibonacci numbers diverges to $+\infty$.
Theorem 16.11. Let $(a_n)_{n \ge k}$ be a sequence diverging to $+\infty$ or $-\infty$. Then there exists $n_0 \in \mathbb{N}$ such that for all $n \ge n_0$ we have $a_n \ne 0$ and the sequence $\big(\frac{1}{a_n}\big)_{n \ge n_0}$ converges to 0.
Proof. Suppose that $\lim_{n\to\infty} a_n = +\infty$. There exists $n_0 \ge k$ such that $a_n > 0$ for all $n \ge n_0$. In particular we have $a_n \ne 0$ for $n \ge n_0$. Now, given $\varepsilon > 0$, there exists $n_1$ such that $a_n > \frac{1}{\varepsilon}$ for $n \ge n_1$, which implies $\frac{1}{a_n} < \varepsilon$ for $n \ge \max\{n_0, n_1\}$. The other case is shown in an analogous way, or by considering the sequence $(-a_n)_{n \ge k}$.
Theorem 16.12. Let $(a_n)_{n \ge k}$ be a sequence of positive (negative) real numbers such that $\lim_{n\to\infty} a_n = 0$. Then the sequence $\big(\frac{1}{a_n}\big)_{n \ge k}$ diverges to $+\infty$ (or $-\infty$).
Proof. We only handle the case $a_n > 0$ for all $n \ge k$. Let $K > 0$ be given. Since $\lim_{n\to\infty} a_n = 0$ there exists $N \in \mathbb{N}$ such that $|a_n| < \varepsilon := \frac{1}{K}$ for $n \ge N$. Hence
$$\frac{1}{a_n} = \frac{1}{|a_n|} > \frac{1}{\varepsilon} = K \text{ for } n \ge N,$$
i.e. $\lim_{n\to\infty} \frac{1}{a_n} = +\infty$. The case $a_n < 0$ follows in a similar way.
Example 16.13. Using Example 15.5.E we find that
$$\lim_{n\to\infty} \frac{2^n}{n} = +\infty.$$
Let us now return to series. Consider a sequence $(a_n)_{n \ge k}$ of non-negative real numbers. The corresponding sequence of partial sums $(s_n)_{n \ge k}$, $s_n = \sum_{l=k}^{n} a_l$, has the property that $m > n$ implies $s_m \ge s_n$ since
$$\sum_{l=k}^{m} a_l = \sum_{l=k}^{n} a_l + \sum_{l=n+1}^{m} a_l \ge \sum_{l=k}^{n} a_l.$$
Suppose that there exists $\kappa > 0$ such that infinitely many $a_l$ satisfy $a_l \ge \kappa$. We claim that in this case $(s_n)_{n \ge k}$ diverges to $+\infty$. Indeed, given $K > 0$ we can find $N_0 \in \mathbb{N}$ such that $\kappa N_0 > K$. Since $a_l \ge \kappa$ for infinitely many $l \ge k$ there exists $N_1 \in \mathbb{N}$ such that in the set $\{a_k, \dots, a_{N_1}\}$ at least $N_0$ elements satisfy $a_l \ge \kappa$. We introduce the set
$$M(N_0, N_1) := \{l \in \mathbb{N} \mid k \le l \le N_1 \text{ and } a_l \ge \kappa\}.$$
For $n \ge N_1$ it follows that
$$\sum_{l=k}^{n} a_l \ge \sum_{l \in M(N_0, N_1)} a_l \ge \sum_{l \in M(N_0, N_1)} \kappa \ge N_0 \kappa > K.$$
Here we used the fact that $M(N_0, N_1)$ has at least $N_0$ elements. The notation $\sum_{l \in M(N_0, N_1)} a_l$ is almost self-explanatory: the summation is over all elements of $M(N_0, N_1)$, i.e. we sum up all $a_l$ with $l \in M(N_0, N_1)$.
Therefore for a series $\sum_{l=k}^{\infty} a_l$ of non-negative numbers to converge the following must hold: for every $\varepsilon > 0$ there exists $N(\varepsilon) \in \mathbb{N}$ such that $n \ge N(\varepsilon)$ implies $a_n = |a_n| < \varepsilon$, i.e. $\lim_{n\to\infty} a_n = 0$.
Observe that the following two new concepts arose in the considerations above:
- monotonicity of a sequence: $m > n$ implies $a_m \ge a_n$;
- selecting a subsequence: for infinitely many $l$ we have $a_l \ge \kappa$, in other words we can find a sequence of integers $l_j$, $j \in \mathbb{N}$, $l_j \ge k$, such that $a_{l_j} \ge \kappa$, i.e. $(a_{l_j})_{j\in\mathbb{N}}$ is a new sequence whose elements are elements of the sequence $(a_l)_{l\in\mathbb{N}}$ and $a_{l_j} \ge \kappa$ for all $j \in \mathbb{N}$.
In the next chapter we will investigate these issues in more detail.
Problems
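1. Let $S_n := \frac{n(n+1)(2n+1)}{6}$, $n \in \mathbb{N}$, be the $n$th partial sum of a sequence $(a_n)_{n\in\mathbb{N}}$. Find $a_n$.
2. Let $a_n \le b_n$, $n \in \mathbb{N}$, and suppose that $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$ converge. Prove that $\sum_{n=1}^{\infty} a_n \le \sum_{n=1}^{\infty} b_n$ holds.
3. Use the fact that $\frac{2}{4k^2 - 1} = \frac{1}{2k - 1} - \frac{1}{2k + 1}$ to prove that
$$\sum_{k=1}^{\infty} \frac{1}{4k^2 - 1} = \frac{1}{2}.$$
4. Find the limit of the following series:
a) $\sum_{k=0}^{\infty} \frac{(-1)^k}{5^k}$;
b) $\sum_{n=0}^{\infty} e^{-nx}$, $x < 0$;
c) $\sum_{k=2}^{\infty} \left(\frac{4}{7}\right)^k$.
5. Find all $y \in \mathbb{R}$ for which $\sum_{k=0}^{\infty} \frac{1}{(y-2)^k}$ is a convergent geometric series. When there is convergence find the limit.
6. Series of the type $\sum_{k=1}^{\infty} (a_k - a_{k-1})$ and $\sum_{k=1}^{\infty} (a_k - a_{k+1})$ are called telescopic series. Prove that they converge if and only if $\lim_{k\to\infty} a_k$ exists. In this case we have
$$\sum_{k=1}^{\infty} (a_k - a_{k-1}) = \lim_{k\to\infty} a_k - a_0 \quad \text{and} \quad \sum_{k=1}^{\infty} (a_k - a_{k+1}) = a_1 - \lim_{k\to\infty} a_k.$$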
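7. a) Find $\sum_{k=0}^{\infty} \left( \frac{1}{2^k} + \frac{(-1)^k}{3^k} \right)$;
b) Under the assumption that $\lim_{n\to\infty} \ln\big(1 + \frac{1}{n}\big) = 0$ show that
$$\sum_{k=2}^{\infty} \ln\left(1 - \frac{1}{k^2}\right) = \ln \frac{1}{2};$$
c) Suppose $\sum_{k=1}^{\infty} \frac{1}{k^2} = A$. Prove that
$$\sum_{k=1}^{\infty} \frac{1}{(2k - 1)^2} = \frac{3A}{4}.$$
8. a) Prove that the sequence $a_n = \frac{n^3 + 2n^2 - 2}{15n^2 + n}$, $n \in \mathbb{N}$, diverges to $+\infty$.
b) Prove that the sequence $\big(\frac{1}{\sin \frac{1}{n}}\big)_{n\in\mathbb{N}}$ diverges to $+\infty$. Hint: for all $x \in \mathbb{R}$, we have $|\sin x| \le |x|$.
9. Construct sequences $(a_n)_{n\in\mathbb{N}}$ and $(b_n)_{n\in\mathbb{N}}$ of real numbers such that $\lim_{n\to\infty} a_n = \infty$, $\lim_{n\to\infty} b_n = 0$ and
a) $\lim_{n\to\infty} (a_n b_n) = +\infty$;
b) $\lim_{n\to\infty} (a_n b_n) = -\infty$;
c) $\lim_{n\to\infty} (a_n b_n) = c$, where $c \in \mathbb{R}$ is a given number.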
17 The Completeness of the Real Numbers

We want to discuss the problem of there being “gaps on the real line”. Recall that the rational numbers $\mathbb{Q}$ have gaps: there is no rational number $q$ such that $q^2 = 2$. However such a number would represent the length of the diagonal of the unit square, i.e. there is a “need” for such a number to exist.
There are other situations where we expect a number with certain properties to exist but we still cannot prove its existence. Consider a sequence $(a_n)_{n\in\mathbb{N}}$, $a_n \in \mathbb{R}$, such that $a_n < a_{n+1}$ for all $n \in \mathbb{N}$ and assume in addition that $a_n \le M$ for all $n \in \mathbb{N}$.
[Figure 17.1: the points $a_1, a_2, a_3, a_4, \dots$ marked on the real line to the left of the bound $M$.]
The distance between $a_n$ and $a_{n+1}$, i.e. $a_{n+1} - a_n > 0$, must become smaller and smaller. Indeed, suppose that for infinitely many $n_k \in \mathbb{N}$, $k \in \mathbb{N}$, we have $a_{n_k + 1} - a_{n_k} \ge \eta > 0$. We claim that there must exist an $N \in \mathbb{N}$ such that $n \ge N$ implies $a_{n+1} \ge M$, which is a contradiction.
By Archimedes’ axiom, given $\eta > 0$ there exists $N \in \mathbb{N}$ such that $N\eta \ge M + |a_1|$. Since $a_{n_k + 1} - a_{n_k} \ge \eta$ for infinitely many $n_k$ there exists $N_1 \in \mathbb{N}$ such that for at least $N$ elements $n_l \in \{1, \dots, N_1\}$ we have $a_{n_l + 1} - a_{n_l} \ge \eta$. Now for $n \ge N_1$ we find
$$\begin{aligned} a_{n+1} - a_1 &= a_{n+1} - a_n + a_n - a_{n-1} + \cdots + a_2 - a_1 \\ &= \sum_{j=1}^{n} (a_{j+1} - a_j) \\ &\ge \sum_{l=1}^{N} (a_{n_l + 1} - a_{n_l}) \ge N\eta \ge M + |a_1| \end{aligned}$$
or, since $|x| + x \ge 0$,
$$a_{n+1} \ge M + |a_1| + a_1 \ge M.$$
Thus we know that $a_{n+1} - a_n$ must become smaller and smaller, therefore intuitively we would expect $(a_n)_{n\in\mathbb{N}}$ to have a limit. However does such a limit exist in $\mathbb{R}$?
The following definition is a more formal approach to the statement that the “distance between elements of a sequence becomes smaller and smaller”.
Definition 17.1. A sequence $(a_n)_{n \ge k}$ of real numbers is called a Cauchy sequence if for every $\varepsilon > 0$ there exists $N \in \mathbb{N}$ such that $n, m \ge N$ implies $|a_n - a_m| < \varepsilon$.
Remark 17.2. Note that the condition $|a_n - a_m| < \varepsilon$ for $n, m \ge N$ is equivalent to $|a_{n+k} - a_n| < \varepsilon$ for $n \ge N$ and $k \in \mathbb{N}$.
Proposition 17.3. A. Every convergent sequence is a Cauchy sequence.
B. Every Cauchy sequence is bounded.
Proof. Suppose that $(a_n)_{n \ge k}$ converges to $a$. Given $\varepsilon > 0$ we can find $N \in \mathbb{N}$ such that $|a_n - a| < \frac{\varepsilon}{2}$ for all $n \ge N$. Thus for $n, m \ge N$ we get
$$|a_n - a_m| = |(a_n - a) - (a_m - a)| \le |a_n - a| + |a_m - a| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$
Now let $(a_n)_{n \ge k}$ be a Cauchy sequence. For $\varepsilon = 1$ there exists $N \in \mathbb{N}$ such that $n \ge N$ implies
$$|a_n| - |a_N| \le |a_n - a_N| < 1,$$
which yields for all $l \ge k$
$$|a_l| \le \max\{1 + |a_N|, |a_k|, \dots, |a_{N-1}|\}.$$
It is quite a difficult problem to decide whether every Cauchy sequence converges. In fact it is impossible to deduce from the axioms for $\mathbb{R}$ in the way we have listed them so far that every Cauchy sequence has a limit in $\mathbb{R}$.
Therefore we add the following axiom:
Axiom of Completeness. In $\mathbb{R}$ every Cauchy sequence has a limit.
Although we can derive some very important results from this axiom immediately, we of course need to justify the axiom. In Appendix VI we see that the axiom is equivalent to the statement that every non-empty set of real numbers bounded from above has a least upper bound - a statement which is more intuitive. In addition we show in Appendix VI how (starting with $\mathbb{Q}$) to construct an ordered field satisfying Archimedes’ axiom, the completeness axiom and into which we can embed $\mathbb{Q}$ as a dense subset.
Next we continue by proving some of the important consequences of the completeness axiom. We start with the famous Theorem of Bolzano and Weierstrass. We first need to consider the following definition.
Definition 17.4. Given a sequence $(a_n)_{n \ge k}$ let $k \le n_1 < n_2 < n_3 < \dots$ be a strictly increasing sequence of integers $(n_l)_{l\in\mathbb{N}}$. We call the sequence $(a_{n_l})_{l\in\mathbb{N}} = (a_{n_1}, a_{n_2}, \dots)$ a subsequence of $(a_n)_{n \ge k}$.
Example 17.5. Consider the sequence $((-1)^n)_{n\in\mathbb{N}}$. The sequence $((-1)^{2k})_{k\in\mathbb{N}}$ and the sequence $((-1)^{2k+1})_{k\in\mathbb{N}}$ are subsequences of $((-1)^n)_{n\in\mathbb{N}}$. Indeed, in the first case we take $n_k = 2k$, in the second case we have $n_k = 2k + 1$. Note that $(-1)^{2k} = 1$ and $(-1)^{2k+1} = -1$, i.e. $((-1)^{2k})_{k\in\mathbb{N}} = (1, 1, \dots)$, and $((-1)^{2k+1})_{k\in\mathbb{N}} = (-1, -1, \dots)$. Thus while $((-1)^n)_{n\in\mathbb{N}}$ is divergent, each of the two subsequences is convergent, but they have different limits.
Theorem 17.6 (Bolzano-Weierstrass). Every bounded sequence $(a_n)_{n\in\mathbb{N}_0}$ in $\mathbb{R}$ has a convergent subsequence.
Proof. We proceed in three steps.
1. Since $(a_n)_{n\in\mathbb{N}_0}$ is bounded there exist numbers $A, B \in \mathbb{R}$ such that $A \le a_n \le B$ for all $n \in \mathbb{N}_0$. Let us consider the interval $[A, B] := \{x \in \mathbb{R};\ A \le x \le B\}$. We will use the principle of mathematical induction to construct a sequence $[A_k, B_k]$, $k \in \mathbb{N}_0$, of intervals having the following properties:
i) In each interval $[A_k, B_k]$ there are infinitely many elements of $(a_n)_{n\in\mathbb{N}_0}$;
ii) $[A_k, B_k] \subset [A_{k-1}, B_{k-1}]$, $k \ge 1$;
iii) $B_k - A_k = 2^{-k}(B - A)$.
We start by setting $A_0 = A$ and $B_0 = B$. Now suppose that $[A_k, B_k]$ is already constructed and has the properties i)-iii). Let $M := \frac{A_k + B_k}{2}$ be the centre of the interval. Since in $[A_k, B_k]$ there are infinitely many elements of $(a_n)_{n\in\mathbb{N}_0}$, either $[A_k, M]$ or $[M, B_k]$ (or both) must contain infinitely many elements of $(a_n)_{n\in\mathbb{N}_0}$. Now we define the interval
$$[A_{k+1}, B_{k+1}] := \begin{cases} [A_k, M], & \text{if } [A_k, M] \text{ contains infinitely many elements of } (a_n)_{n\in\mathbb{N}_0}, \\ [M, B_k], & \text{otherwise.} \end{cases}$$
Obviously $[A_{k+1}, B_{k+1}]$ satisfies i)-iii).
2. Now we define inductively a subsequence $(a_{n_k})_{k\in\mathbb{N}_0}$ of $(a_n)_{n\in\mathbb{N}_0}$ such that $a_{n_k} \in [A_k, B_k]$ for all $k \in \mathbb{N}_0$. We start with $a_{n_0} := a_0$.
Now suppose that $a_{n_k}$ is already defined. Since in $[A_{k+1}, B_{k+1}]$ there are infinitely many elements of $(a_n)_{n\in\mathbb{N}_0}$, we may choose some $a_{n_{k+1}} \in [A_{k+1}, B_{k+1}]$ such that $a_{n_{k+1}}$ is an element of the original sequence and $n_{k+1} > n_k$.
3. Finally we prove that $(a_{n_k})_{k\in\mathbb{N}_0}$ is a Cauchy sequence. For this let $\varepsilon > 0$ be given and take $N \in \mathbb{N}$ such that $2^{-N}(B - A) < \varepsilon$. For all $k, j \in \mathbb{N}$, $k, j \ge N$, we find
$$a_{n_k} \in [A_k, B_k] \subset [A_N, B_N], \qquad a_{n_j} \in [A_j, B_j] \subset [A_N, B_N],$$
thus
$$|a_{n_k} - a_{n_j}| \le |B_N - A_N| = 2^{-N}(B - A) < \varepsilon.$$
By the axiom of completeness the Cauchy sequence $(a_{n_k})_{k\in\mathbb{N}_0}$ converges, and we are done.
Remark 17.7. Clearly the Bolzano-Weierstrass theorem also holds for se-
quences (an )n≥k .
Definition 17.8. A. A number a ∈ R is called an accumulation point or

a cluster point or a limit point of a sequence (an )n≥k if there exists a
subsequence (anl )l∈N of (an )n≥k , anl = a converging to a, i.e. lim anl = a.
l→∞
B. A point $a \in \mathbb{R}$ is an accumulation point of $B \subset \mathbb{R}$ if there exists a sequence $(b_n)_{n\in\mathbb{N}}$, $b_n \in B$, $b_n \ne a$, converging to $a$.
Example 17.9. A. The sequence $((-1)^n)_{n\in\mathbb{N}}$ has two accumulation points, namely $+1$ and $-1$, compare with Example 17.5. Thus while the limit of a sequence is always unique, a sequence may have a lot of (even infinitely many) accumulation points.
B. The sequence $a_n = (-1)^n + \frac{1}{n}$, $n \in \mathbb{N}$, also has the two accumulation points $+1$ and $-1$. Indeed we have
$$\lim_{n\to\infty} a_{2n} = \lim_{n\to\infty} \left( (-1)^{2n} + \frac{1}{2n} \right) = \lim_{n\to\infty} \left( 1 + \frac{1}{2n} \right) = 1,$$
and
$$\lim_{n\to\infty} a_{2n+1} = \lim_{n\to\infty} \left( (-1)^{2n+1} + \frac{1}{2n+1} \right) = \lim_{n\to\infty} \left( -1 + \frac{1}{2n+1} \right) = -1.$$
C. The sequence $a_n = n$ has no accumulation point since each of its subsequences is unbounded.
D. Consider the sequence
$$a_n = \begin{cases} n^2 & \text{for } n \text{ even} \\ \frac{1}{n} & \text{for } n \text{ odd.} \end{cases}$$
It is unbounded but has one accumulation point, namely 0, since $\lim_{n\to\infty} a_{2n-1} = \lim_{n\to\infty} \frac{1}{2n-1} = 0$.
Lemma 17.10. A. If $(a_n)_{n \ge k}$ converges, then the limit is the only accumulation point of $(a_n)_{n \ge k}$, i.e. every subsequence of a converging sequence converges to the same limit.
B. If a subsequence of a Cauchy sequence converges, the whole sequence converges (to the same limit).
Proof. Part A is obvious. B. Let $(a_k)_{k\in\mathbb{N}}$ be a Cauchy sequence and suppose that $(a_{k_l})_{l\in\mathbb{N}}$ converges to $a$. For $\varepsilon > 0$ there exists $N_1 \in \mathbb{N}$ such that $l \ge N_1$ implies $|a_{k_l} - a| < \frac{\varepsilon}{2}$. Since $(a_k)_{k\in\mathbb{N}}$ is a Cauchy sequence there exists $N_2 \in \mathbb{N}$ such that $n, m \ge N_2$ implies $|a_n - a_m| < \frac{\varepsilon}{2}$. Thus, fixing some $l \ge N_1$ with $k_l \ge N_2$, we find for all $n \ge N_2$
$$|a_n - a| \le |a_n - a_{k_l}| + |a_{k_l} - a| < \varepsilon.$$
Definition 17.11. Let $(a_n)_{n \ge k}$ be a sequence of real numbers. We call $(a_n)_{n \ge k}$
monotone increasing if $a_n \le a_{n+1}$ for all $n \ge k$;
strictly monotone increasing if $a_n < a_{n+1}$ for all $n \ge k$;
monotone decreasing if $a_n \ge a_{n+1}$ for all $n \ge k$;
strictly monotone decreasing if $a_n > a_{n+1}$ for all $n \ge k$.
Remark 17.12. We call $(a_n)_{n \ge k}$ just monotone if one of the four conditions of Definition 17.11 holds.
Example 17.13. A. The sequence $a_n = \frac{1}{n}$ is strictly monotone decreasing.
B. The Fibonacci sequence is increasing but not strictly increasing.
C. The sequence $((-1)^n)_{n\in\mathbb{N}}$ is neither monotone increasing nor decreasing.
D. If $(a_n)_{n\in\mathbb{N}}$ is a sequence of positive numbers $a_n > 0$ then the sequence of partial sums $s_n = \sum_{k=1}^{n} a_k$ is strictly monotone increasing.
The next result resolves one of the problems discussed at the beginning of this chapter.
Theorem 17.14. Every monotone and bounded sequence $(a_n)_{n \ge k}$ is convergent.
Proof. We know that $(a_n)_{n \ge k}$ is bounded, hence by the Bolzano-Weierstrass theorem it has a convergent subsequence $(a_{n_l})_{l\in\mathbb{N}}$ and we denote its limit by $a$. We will prove that the whole sequence $(a_n)_{n \ge k}$ converges to $a$. For this let $\varepsilon > 0$ be given and $l_0 \in \mathbb{N}$ such that
$$|a_{n_l} - a| < \varepsilon \text{ for all } l \ge l_0.$$
Set $N := n_{l_0}$, then for every $n \ge N$ there exists $l \ge l_0$ such that $n_l \le n < n_{l+1}$. If $(a_n)_{n \ge k}$ is monotone increasing (decreasing) it follows that
$$a_{n_l} \le a_n \le a_{n_{l+1}} \quad (a_{n_l} \ge a_n \ge a_{n_{l+1}}).$$
In either case we find that
$$|a_n - a| \le \max\big(|a_{n_l} - a|, |a_{n_{l+1}} - a|\big) < \varepsilon,$$
which proves the theorem.
Next we introduce the principle of nested intervals. Let $I_n := [A_n, B_n]$, $n \in \mathbb{N}_0$, be a family of non-empty (and non-degenerate) intervals $[A_n, B_n] = \{x \in \mathbb{R};\ A_n \le x \le B_n\}$ with length $l_n = B_n - A_n > 0$.
Suppose that
i) $I_{n+1} \subset I_n$, i.e. $A_n \le A_{n+1} < B_{n+1} \le B_n$;
ii) $\lim_{n\to\infty} l_n = 0$.
Such a family $(I_n)_{n\in\mathbb{N}_0}$ is called a family of nested intervals. Let us look at the intersection of these intervals
$$I := \bigcap_{n\in\mathbb{N}_0} I_n := \{x \in \mathbb{R} \mid x \in I_n \text{ for all } n \in \mathbb{N}_0\}.$$
Theorem 17.15 (Principle of nested intervals). Let $(I_n)_{n\in\mathbb{N}_0}$ be a family of nested intervals. Then there exists exactly one point $x_0 \in I$, i.e. $I = \{x_0\}$.
Proof. Since $A_n \le A_{n+1} < B_{n+1} \le B_n$ it follows that the sequence of left end points $(A_n)_{n\in\mathbb{N}_0}$ as well as the sequence of right end points $(B_n)_{n\in\mathbb{N}_0}$ are bounded. Since each of these sequences is monotone it is convergent. Denote their limits by $A$ and $B$, respectively. Clearly we have $A \le B$. If $A = B$ we are done. Suppose that $A < B$. Then $[A, B] \subset \bigcap_{n\in\mathbb{N}_0} I_n$ and there exists $x_0 \in [A, B]$ such that $A < x_0 < B$. But in this case $A_n < x_0 < B_n$ implying
$$0 < x_0 - A_n < B_n - A_n$$
and
$$A_n - B_n < -B_n + x_0 < 0$$
leading to
$$0 \le x_0 - \lim_{n\to\infty} A_n \le \lim_{n\to\infty} (B_n - A_n) = 0$$
and
$$0 = \lim_{n\to\infty} (A_n - B_n) \le -\lim_{n\to\infty} B_n + x_0 \le 0,$$
i.e. $x_0 = \lim_{n\to\infty} A_n$ and $x_0 = \lim_{n\to\infty} B_n$, i.e. $A = B$, contradicting the assumption.
Remark 17.16. It is clear that the principle of nested intervals can be formulated and proved for a sequence $(I_n)_{n \ge k}$.
The proof of the Bolzano-Weierstrass theorem requires the axiom of completeness, hence all other results in this chapter do. The following example shows that we can use the axiom of completeness to find a number $x$ in $\mathbb{R}$ such that $x^2 = 2$.
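Example 17.17. Let $a > 0$ and $x_0 > 0$ be two real numbers. We define the sequence $(x_n)_{n\in\mathbb{N}_0}$ by starting with $x_0$ and setting
$$x_{n+1} := \frac{1}{2}\left(x_n + \frac{a}{x_n}\right).$$
The sequence $(x_n)_{n\in\mathbb{N}_0}$ converges to $\sqrt{a}$, and $\sqrt{a}$ is the unique positive solution of the equation $x^2 = a$.
We show this result using the following steps.
1. For all $n$ we have $x_n > 0$.
Indeed, $x_0 > 0$ by assumption and if $x_n > 0$, so is $x_{n+1} = \frac{1}{2}\big(x_n + \frac{a}{x_n}\big)$.
2. For all $n \ge 1$ we have $x_n^2 \ge a$.
For this note that
$$\begin{aligned} x_n^2 - a &= \frac{1}{4}\left(x_{n-1} + \frac{a}{x_{n-1}}\right)^2 - a \\ &= \frac{1}{4}\left(x_{n-1}^2 + 2a + \frac{a^2}{x_{n-1}^2}\right) - a \\ &= \frac{1}{4}\left(x_{n-1} - \frac{a}{x_{n-1}}\right)^2 \ge 0. \end{aligned}$$
3. For $n \ge 1$ we also have $x_{n+1} \le x_n$, i.e. the sequence is monotone decreasing, since
$$x_n - x_{n+1} = x_n - \frac{1}{2}\left(x_n + \frac{a}{x_n}\right) = \frac{1}{2x_n}\left(x_n^2 - a\right) \ge 0;$$
note that $x_n > 0$ and $x_n^2 \ge a$.
4. We conclude that $(x_n)_{n\in\mathbb{N}}$ is a monotone decreasing sequence satisfying $0 \le x_n \le x_1$, i.e. it is bounded. Hence it is convergent by Theorem 17.14 and the limit $x$ of $(x_n)_{n\in\mathbb{N}_0}$ satisfies $0 \le x \le x_1$.
5. By step 2 and Theorem 15.21 we have $x^2 \ge a > 0$, so $x \ne 0$. Applying the rules for convergent sequences to the equation
$$x_{n+1} = \frac{1}{2}\left(x_n + \frac{a}{x_n}\right),$$
we obtain
$$x = \frac{1}{2}\left(x + \frac{a}{x}\right),$$
i.e. $x^2 = a$. Since $x \ge 0$, $x$ is the positive solution to $x^2 = a$.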
Example 17.18. Suppose that $a = 2$ in Example 17.17. Starting with $x_0 = 1$, we obtain the sequence: $1, \frac{3}{2}, \frac{17}{12}, \frac{577}{408}, \dots$ which converges rapidly to $\sqrt{2}$. Note that all the terms of the sequence are rational, but the limit is not.
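The first terms of this sequence are easy to reproduce. The following sketch (the function name `heron` is our own label for the recursion $x_{n+1} = \frac{1}{2}(x_n + \frac{a}{x_n})$ of Example 17.17) works with exact fractions, so it also illustrates that every term is rational, and it checks the two properties $x_n^2 \ge a$ and $x_{n+1} \le x_n$ for $n \ge 1$ on the computed terms.

```python
from fractions import Fraction


def heron(a: Fraction, x0: Fraction, steps: int) -> list:
    """Return [x_0, x_1, ..., x_steps] for x_{n+1} = (x_n + a / x_n) / 2."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append((x + a / x) / 2)
    return xs


xs = heron(Fraction(2), Fraction(1), 5)
for n, x in enumerate(xs):
    # Print each exact term together with its squared value as a float.
    print(n, x, float(x * x))

# Properties from steps 2 and 3 of Example 17.17, checked for n >= 1.
assert all(x * x >= 2 for x in xs[1:])
assert all(xs[i + 1] <= xs[i] for i in range(1, len(xs) - 1))
```

The printed squares approach 2 very quickly, which is what "converges rapidly" refers to; the computation itself does not replace the convergence proof.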
The following example shows why we cannot just take the limits of the defining equation without first proving that the sequence converges.
Example 17.19. Define a sequence by $x_0 = 2$ and $x_{n+1} = 2x_n - \frac{2}{x_n}$. If the sequence has a limit $x$ we obtain $x = 2x - \frac{2}{x}$ or $x^2 = 2$. Since all the terms are positive, this would be the positive square root of 2 as before. However, the sequence is $2, 3, 5\tfrac{1}{3}, 10\tfrac{7}{24}, \dots$ which is an increasing sequence and unbounded, as we can prove by induction.
Problems

1. a) Consider the sequence $(s_n)_{n\in\mathbb{N}}$, $s_n := \sum_{j=1}^{n} \frac{1}{j}$. Prove that $s_{2n} - s_n \ge \frac{1}{2}$ and deduce that $(s_n)_{n\in\mathbb{N}}$ diverges to $+\infty$.
b) Prove that the sequence $(s_n)_{n\in\mathbb{N}}$, $s_n := \sum_{j=1}^{n} \frac{(-1)^{j+1}}{j}$, is a Cauchy sequence.
2. Let $(a_n)_{n\in\mathbb{N}}$ be a sequence of real numbers such that for all $n \ge N$ we have $|a_n - a_{n+1}| < 2^{-n}$. Prove that $(a_n)_{n\in\mathbb{N}}$ is a Cauchy sequence.
3. Let $(a_n)_{n \ge k}$, $(b_n)_{n \ge k}$, and $(c_n)_{n \ge k}$ be sequences of real numbers such that $\lim_{n\to\infty} a_n = a$, $\lim_{n\to\infty} b_n = b$ and for all $n \ge k$ we have $a_n \le c_n \le b_n$. Prove that $(c_n)_{n \ge k}$ has a convergent subsequence.
4. Given the sequence $\big(\frac{\sqrt{n}}{n+1}\big)_{n\in\mathbb{N}}$. Show that this is a bounded decreasing sequence and deduce that its limit exists.
5. Consider the sequence $\big(\sum_{k=0}^{n} \frac{1}{k!}\big)_{n \ge 0}$. Prove that this sequence is bounded and deduce that it must have the limit $\sum_{k=0}^{\infty} \frac{1}{k!}$.
Hint: first show that $k! \ge 2^{k-1}$ for $k \in \mathbb{N}$.
6. Let $(a_n)_{n\in\mathbb{N}}$, $a_n \ge 0$, be a sequence and assume that $(a_n)_{n\in\mathbb{N}}$ has no accumulation points. Prove that $\lim_{n\to\infty} a_n = \infty$.
7. Give an example of a sequence (an )n∈N such that −2, 13 , 17 are accumu-
lation points of (an )n∈N and −3 ≤ an ≤ 19 for all n ∈ N.
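8.* Let $a > 0$, $k \in \mathbb{N}$ and $x_0^k > a$, $x_0 > 0$. Define
$$x_{n+1} := x_n - \frac{x_n^k - a}{k x_n^{k-1}} = \frac{(k-1) x_n^k + a}{k x_n^{k-1}}, \quad n \in \mathbb{N}_0.$$
Prove that $\lim_{n\to\infty} x_n = a^{\frac{1}{k}} = \sqrt[k]{a}$. Hint: use the following steps:
i) $x_n > 0$ for all $n \in \mathbb{N}$;
ii) $-\frac{x_n^k - a}{k x_n^k} \ge -1$;
iii) by using Bernoulli’s inequality prove that
$$\left( x_n - \frac{x_n^k - a}{k x_n^{k-1}} \right)^k \ge a;$$
iv) $x_n^k \ge a$;
v) $x_{n+1} \le x_n$.
9.* Prove that the sequence $(a_n)_{n\in\mathbb{N}}$, $a_n = \big(1 + \frac{1}{n}\big)^n$, has the limit $\sum_{j=0}^{\infty} \frac{1}{j!}$. We denote the limit by $e$ where $e$ is called the Euler number.
10. Let $a_n = \big(1 + \frac{1}{n}\big)^n$ and $b_n = \big(1 + \frac{1}{n}\big)^{n+1}$. Prove that $([a_n, b_n])_{n\in\mathbb{N}}$ are nested intervals with $\{e\} = \bigcap_{n\in\mathbb{N}} [a_n, b_n]$, and $e$ is the Euler number.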
18 Convergence Criteria for Series, b-adic Fractions
Our new understanding of the completeness of the real line, in particular
the concept of a Cauchy sequence, gives us new tools to handle series. We
formulate our first results for sequences (an )n≥k . We will soon switch to
sequences (an )n∈N or (an )n∈N0 , but extending results to the case (an )n≥k is
straightforward. We start by formulating the Cauchy criterion for series.

Theorem 18.1. Given a sequence $(a_n)_{n \ge k}$ of real numbers. The series $\sum_{n=k}^{\infty} a_n$ converges if and only if for every $\varepsilon > 0$ there exists $N = N(\varepsilon) \in \mathbb{N}$ such that $n \ge m \ge N$ implies
$$\left| \sum_{l=m}^{n} a_l \right| < \varepsilon. \qquad (18.1)$$
Proof. Let $s_p := \sum_{l=k}^{p} a_l$ be the $p$th partial sum. It follows that
$$s_n - s_{m-1} = \sum_{l=m}^{n} a_l,$$
and the criterion is nothing but the statement that the sequence of partial sums forms a Cauchy sequence.

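Theorem 18.2. If the series $\sum_{l=k}^{\infty} a_l$ converges then $\lim_{l\to\infty} a_l = 0$.
Proof. If $\sum_{l=k}^{\infty} a_l$ converges, then by Theorem 18.1, for every $\varepsilon > 0$ it follows that $\left| \sum_{l=m}^{n} a_l \right| < \varepsilon$ provided $n \ge m \ge N$ for some suitable $N \in \mathbb{N}$. Putting $n = m$ we find that $|a_n| < \varepsilon$ for all $n \ge N$, i.e. $\lim_{l\to\infty} a_l = 0$.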
Example 18.3. A. For $|q| < 1$ we know that $\sum_{k=0}^{\infty} q^k = \frac{1}{1 - q}$, i.e. the series converges. Moreover, by Example 15.11.A we know that $\lim_{k\to\infty} q^k = 0$.
B. The series $\sum_{k=1}^{\infty} (-1)^k$ diverges since the sequence $((-1)^k)_{k\in\mathbb{N}}$ does not converge to 0.

Theorem 18.4. Let $\sum_{l=k}^{\infty} a_l$ be a series of non-negative numbers $a_l \ge 0$. This series converges if and only if it is bounded, i.e. the sequence of its partial sums is bounded.
Proof. Since $a_l \ge 0$ for all $l \ge k$ the sequence of partial sums $s_p = \sum_{l=k}^{p} a_l$ is monotone increasing. If it is also bounded, then by Theorem 17.14 it is convergent. Conversely, if $\sum_{l=k}^{\infty} a_l$ is convergent the corresponding sequence of partial sums must be bounded.

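Example 18.5. The harmonic series $\sum_{n=1}^{\infty} \frac{1}{n}$ diverges.
Referring to Problem 1 a) in Chapter 17, we may argue that $(s_n)_{n \ge 1}$ is not a Cauchy sequence, hence it cannot converge. We give here a further proof by showing that the partial sums are unbounded. Consider the special partial sums
$$\begin{aligned} s_{2^{k+1}} := \sum_{n=1}^{2^{k+1}} \frac{1}{n} &= 1 + \frac{1}{2} + \sum_{p=1}^{k} \left( \sum_{n=2^p + 1}^{2^{p+1}} \frac{1}{n} \right) \\ &= 1 + \frac{1}{2} + \left( \frac{1}{3} + \frac{1}{4} \right) + \left( \frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8} \right) + \cdots + \left( \sum_{n=2^k + 1}^{2^{k+1}} \frac{1}{n} \right). \end{aligned}$$
Each of the terms in brackets is at least $\frac{1}{2}$. Indeed we have $2^p$ terms to add in the sum $\sum_{n=2^p + 1}^{2^{p+1}} \frac{1}{n}$, the smallest of which is $\frac{1}{2^{p+1}}$, hence
$$\sum_{n=2^p + 1}^{2^{p+1}} \frac{1}{n} \ge 2^p \cdot \frac{1}{2^{p+1}} = \frac{1}{2}.$$
Therefore we find $s_{2^{k+1}} \ge 1 + \frac{k}{2}$, implying that the partial sums are unbounded and so $\sum_{n=1}^{\infty} \frac{1}{n}$ is divergent.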
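The block estimate used in Example 18.5 can be observed numerically. The sketch below is only an illustration with hypothetical helper names (`harmonic` and `dyadic_block` are ours): it computes the partial sums at the indices $2^{k+1}$ exactly and checks both the bound for each dyadic block and the resulting lower bound $1 + \frac{k}{2}$.

```python
from fractions import Fraction


def harmonic(n: int) -> Fraction:
    """n-th partial sum 1 + 1/2 + ... + 1/n of the harmonic series."""
    return sum((Fraction(1, j) for j in range(1, n + 1)), Fraction(0))


def dyadic_block(p: int) -> Fraction:
    """Sum of 1/n over the block n = 2^p + 1, ..., 2^(p+1)."""
    return sum((Fraction(1, n) for n in range(2 ** p + 1, 2 ** (p + 1) + 1)), Fraction(0))


for k in range(8):
    s = harmonic(2 ** (k + 1))
    # Lower bound derived in Example 18.5.
    assert s >= 1 + Fraction(k, 2)
    print(k, 2 ** (k + 1), float(s), 1 + k / 2)

# Every dyadic block contributes at least 1/2.
assert all(dyadic_block(p) >= Fraction(1, 2) for p in range(1, 8))
```

The growth is very slow (doubling the number of terms adds only a bounded amount), which is why the divergence of the harmonic series is hard to guess from a table of values alone.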

Remark 18.6. Note that $\sum_{n=1}^{\infty} \frac{1}{n}$ is an example of a divergent series $\sum_{l=k}^{\infty} a_l$ with $\lim_{l\to\infty} a_l = 0$. Hence the converse of Theorem 18.2 does not hold.

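Example 18.7. For all $k \in \mathbb{N}$, $k > 1$, the series $\sum_{n=1}^{\infty} \frac{1}{n^k}$ converges. To see this we apply Theorem 18.4 and prove the boundedness of the partial sums of $\sum_{n=1}^{\infty} \frac{1}{n^k}$ for $k > 1$. For $p \in \mathbb{N}$ such that $N \le 2^{p+1} - 1$ we find
$$\begin{aligned} s_N := \sum_{n=1}^{N} \frac{1}{n^k} &\le \sum_{n=1}^{2^{p+1} - 1} \frac{1}{n^k} \\ &= 1 + \left( \frac{1}{2^k} + \frac{1}{3^k} \right) + \cdots + \sum_{n=2^p}^{2^{p+1} - 1} \frac{1}{n^k} \\ &\le \sum_{q=0}^{p} 2^q \frac{1}{(2^q)^k} = \sum_{q=0}^{p} \left( \frac{1}{2^{k-1}} \right)^q \\ &\le \sum_{q=0}^{\infty} \left( 2^{-k+1} \right)^q = \frac{1}{1 - 2^{-k+1}} = \frac{2^{k-1}}{2^{k-1} - 1}. \end{aligned}$$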
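The bound $\frac{2^{k-1}}{2^{k-1} - 1}$ obtained in Example 18.7 can be compared with actual partial sums. This is a small floating-point sketch (the names `partial_sum` and `bound` are ours); it only illustrates the estimate for a few exponents $k$ and a few values of $N$.

```python
def partial_sum(k: int, N: int) -> float:
    """Partial sum s_N = sum_{n=1}^{N} 1 / n^k, in floating point."""
    return sum(1.0 / n ** k for n in range(1, N + 1))


def bound(k: int) -> float:
    """Upper bound 2^(k-1) / (2^(k-1) - 1) from Example 18.7, for k > 1."""
    return 2 ** (k - 1) / (2 ** (k - 1) - 1)


for k in (2, 3, 4):
    for N in (10, 100, 1000, 10000):
        # Every partial sum stays below the bound, independently of N.
        assert partial_sum(k, N) < bound(k)
    print(k, partial_sum(k, 10000), bound(k))
```

The bound is not sharp: for $k = 2$ it equals 2, while the printed partial sums stay noticeably below that value.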
The next result is useful when dealing with alternating series, i.e. series
in which consecutive terms change sign.
Theorem 18.8 (Leibniz’s criterion for alternating series). Let $(a_n)_{n\in\mathbb{N}}$ be a monotone decreasing sequence of non-negative real numbers with $\lim_{n\to\infty} a_n = 0$. Then the series $\sum_{n=1}^{\infty} (-1)^n a_n$ converges.

Proof. Set $s_k := \sum_{n=1}^{k} (-1)^n a_n$. Since
\[ s_{2k+2} - s_{2k} = -a_{2k+1} + a_{2k+2} \le 0 \]
it follows that
\[ s_0 \ge s_2 \ge s_4 \ge \cdots \ge s_{2k+2} \ge \dots \]
and analogously, since
\[ s_{2k+3} - s_{2k+1} = a_{2k+2} - a_{2k+3} \ge 0 \]
we find
\[ s_1 \le s_3 \le s_5 \le \cdots \le s_{2k+3} \le \dots \]
In addition we have
\[ s_{2k+1} \le s_{2k} \]
since $s_{2k+1} - s_{2k} = -a_{2k+1} \le 0$.
The sequence $(s_{2k})_{k\in\mathbb{N}}$ is monotone decreasing and bounded since $s_{2k} \ge s_1$. By Theorem 17.14 it is convergent, hence
\[ \lim_{k\to\infty} s_{2k} = S \]
for some $S \in \mathbb{R}$. Analogously we see that $(s_{2k+1})_{k\in\mathbb{N}}$ is monotone increasing and bounded, hence convergent:
\[ \lim_{k\to\infty} s_{2k+1} = S'. \]
Further we find
\[ S - S' = \lim_{k\to\infty} (s_{2k} - s_{2k+1}) = \lim_{k\to\infty} a_{2k+1} = 0, \]
i.e. $S = S'$. Now we prove $\sum_{k=1}^{\infty} (-1)^k a_k = S$. For this let $\varepsilon > 0$ be given. Then there exist $N_1(\varepsilon), N_2(\varepsilon) \in \mathbb{N}$ such that $|s_{2k} - S| < \varepsilon$ for $k \ge N_1$, $|s_{2k+1} - S| < \varepsilon$ for $k \ge N_2$. Thus for $k \ge \max(2N_1, 2N_2 + 1)$ we find $|s_k - S| < \varepsilon$.
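The mechanism of the proof, even partial sums decreasing and odd partial sums increasing towards a common limit, can be observed numerically. The sketch below uses $a_n = 1/n$; the helper name `alternating_partial_sums` is hypothetical, and the value $-\ln 2$ for the limit of $\sum_{n=1}^{\infty} (-1)^n/n$ is a standard fact assumed here, not something proved in this section.

```python
import math


def alternating_partial_sums(a, count):
    """Return [s_1, ..., s_count] with s_k = sum_{n=1}^k (-1)^n * a(n)."""
    sums, total = [], 0.0
    for n in range(1, count + 1):
        total += (-1) ** n * a(n)
        sums.append(total)
    return sums


sums = alternating_partial_sums(lambda n: 1.0 / n, 2000)
even = sums[1::2]  # s_2, s_4, s_6, ...
odd = sums[0::2]   # s_1, s_3, s_5, ...

if __name__ == "__main__":
    # The two subsequences squeeze the limit from above and from below.
    print(odd[-1], -math.log(2), even[-1])
```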

Example 18.9. A. The alternating harmonic series $\sum_{k=1}^{\infty} \frac{(-1)^k}{k}$ converges. (Also compare with Problem 1 b) in Chapter 17.)
B. The series $\sum_{k=0}^{\infty} \frac{(-1)^k}{2k+1}$ converges.

Definition 18.10. A series $\sum_{k=1}^{\infty} a_k$ of real numbers is called absolutely convergent if the series $\sum_{k=1}^{\infty} |a_k|$ converges.
Theorem 18.11. Any absolutely convergent series is convergent.
Proof. Suppose $\sum_{k=1}^{\infty} a_k$ is absolutely convergent. According to the Cauchy criterion applied to the series $\sum_{k=1}^{\infty} |a_k|$, for $\varepsilon > 0$ there exists a number $N(\varepsilon)$ such that $n \ge m \ge N$ implies
\[ \sum_{k=m}^{n} |a_k| < \varepsilon. \]
Now the triangle inequality yields for $n \ge m \ge N$
\[ \Big| \sum_{k=m}^{n} a_k \Big| \le \sum_{k=m}^{n} |a_k| < \varepsilon, \]
i.e. the Cauchy criterion holds for $\sum_{k=1}^{\infty} a_k$ which implies the convergence of $\sum_{k=1}^{\infty} a_k$ by Theorem 18.1.
Remark 18.12. The alternating harmonic series shows that the converse of
Theorem 18.11 is not true: a convergent series need not be absolutely conver-
gent. Convergent series which are not absolutely convergent are sometimes
called conditionally convergent.

Theorem 18.13 (Comparison test). Let $\sum_{k=1}^{\infty} c_k$ be a convergent series of non-negative real numbers $c_k \ge 0$. Further let $(a_k)_{k\in\mathbb{N}}$ be a sequence such that $|a_k| \le c_k$ for all $k \in \mathbb{N}$. Then the series $\sum_{k=1}^{\infty} a_k$ converges absolutely.
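Proof. Given $\varepsilon > 0$ there exists $N(\varepsilon) \in \mathbb{N}$ such that
\[ \Big| \sum_{k=m}^{n} c_k \Big| = \sum_{k=m}^{n} c_k < \varepsilon \quad \text{for } n \ge m \ge N. \]
Therefore we find
\[ \sum_{k=m}^{n} |a_k| \le \sum_{k=m}^{n} c_k < \varepsilon \quad \text{for } n \ge m \ge N, \]
i.e. the Cauchy criterion holds for $\sum_{k=1}^{\infty} |a_k|$, and so $\sum_{k=1}^{\infty} a_k$ converges absolutely.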
The next two tests, the ratio test and the root test, are very powerful tools.
We will use these tests in this part and later on when dealing with power
series, and also in the chapter on complex analysis.

Theorem 18.14 (Ratio test). Let $\sum_{n=0}^{\infty} a_n$ be a series such that $a_n \ne 0$ for all $n \ge N_0$. Suppose that there exists $\nu$, $0 < \nu < 1$, such that
\[ \Big| \frac{a_{n+1}}{a_n} \Big| \le \nu \quad \text{for all } n \ge N_0. \]
Then the series $\sum_{n=0}^{\infty} a_n$ converges absolutely.

Proof. The convergence of the series $\sum_{n=0}^{\infty} a_n$ does not depend on the first $N_0$ terms. Now
\[ \Big| \frac{a_{n+1}}{a_n} \Big| \le \nu \quad \text{for all } n \ge N_0, \]
implies that $|a_{N_0+k}| \le |a_{N_0}| \nu^k$. Since $0 < \nu < 1$ the series $\sum_{n=N_0}^{\infty} \nu^n$ converges, so by Theorem 18.13 the theorem is proved.

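Corollary 18.15. Let $(a_n)_{n\in\mathbb{N}_0}$ be a sequence and suppose that $\lim_{n\to\infty} \big| \frac{a_{n+1}}{a_n} \big| = a < 1$. Then the series $\sum_{n=0}^{\infty} a_n$ converges absolutely.
Proof. Since $a < 1$ there exists $\varepsilon > 0$ such that $0 < a + \varepsilon < 1$. For this $\varepsilon > 0$ there exists $N \in \mathbb{N}$ such that $n \ge N$ implies
\[ \Big| \Big| \frac{a_{n+1}}{a_n} \Big| - a \Big| < \varepsilon \]
or
\[ \Big| \frac{a_{n+1}}{a_n} \Big| < a + \varepsilon < 1 \]
and the ratio test then gives the result.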
Remark 18.16. A. Note that changing finitely many elements in a sequence or series does not affect its convergence behaviour.
B. The series $\sum_{k=1}^{\infty} c_k$ in Theorem 18.13 is called a majorant of the series $\sum_{k=1}^{\infty} a_k$.
C. Note that the conditions $\big| \frac{a_{n+1}}{a_n} \big| < 1$ or $\lim_{n\to\infty} \big| \frac{a_{n+1}}{a_n} \big| = 1$ are not sufficient for the (absolute) convergence of $\sum_{k=1}^{\infty} a_k$ as the harmonic series shows. Here $a_n = \frac{1}{n}$, hence $\big| \frac{a_{n+1}}{a_n} \big| = \frac{a_{n+1}}{a_n} = \frac{n}{n+1} < 1$ as well as $\lim_{n\to\infty} \frac{a_{n+1}}{a_n} = \lim_{n\to\infty} \frac{n}{n+1} = 1$ and $\sum_{n=1}^{\infty} \frac{1}{n}$ diverges.
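Example 18.17. The series $\sum_{n=0}^{\infty} \frac{n^2}{2^n}$ converges.
If $a_n = \frac{n^2}{2^n}$, then for $n \ge 3$ we have
\[ \Big| \frac{a_{n+1}}{a_n} \Big| = \frac{(n+1)^2 2^n}{2^{n+1} n^2} = \frac{1}{2} \Big( 1 + \frac{1}{n} \Big)^2 \le \frac{1}{2} \Big( 1 + \frac{1}{3} \Big)^2 = \frac{8}{9} < 1, \]
and so the series is convergent by Theorem 18.14.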
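The bound $\frac{8}{9}$ for the quotients in Example 18.17 can be confirmed with exact arithmetic. This is a small sketch; the names `a`, `ratio`, `max_ratio` and `partial_sum` are introduced here for illustration. The closed value 6 for the sum is a known result which is not derived in the text and is used only as a plausibility check.

```python
from fractions import Fraction


def a(n: int) -> Fraction:
    """Term a_n = n^2 / 2^n of the series in Example 18.17."""
    return Fraction(n * n, 2 ** n)


def ratio(n: int) -> Fraction:
    """Quotient a_{n+1} / a_n, defined for n >= 1."""
    return a(n + 1) / a(n)


# Largest quotient over the range n = 3, ..., 199.
max_ratio = max(ratio(n) for n in range(3, 200))
# Partial sum of the first 200 terms, converted to a float for display.
partial_sum = float(sum(a(n) for n in range(0, 200)))

if __name__ == "__main__":
    print(max_ratio, partial_sum)
```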

Theorem 18.18 (Root Test). Let $(a_n)_{n\in\mathbb{N}_0}$ be a sequence of real numbers and suppose that for all $n \ge N_0$ we have $|a_n|^{\frac{1}{n}} \le q < 1$. Then $\sum_{n=0}^{\infty} a_n$ converges absolutely.
Proof. For $n \ge N_0$ it follows that $|a_n| \le q^n$. Therefore $\sum_{n=0}^{N_0-1} |a_n| + \sum_{n=N_0}^{\infty} q^n$ is a convergent majorant for $\sum_{n=0}^{\infty} |a_n|$ and by the comparison test, Theorem 18.13, the result follows.
Remark 18.19. We can replace the condition $|a_n|^{\frac{1}{n}} \le q < 1$ by $\lim_{n\to\infty} |a_n|^{\frac{1}{n}} < 1$, see Problem 12 b).
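Example 18.20. For $|r| < 1$ and $a \ge 1$ consider the sequence
\[ a_n := \begin{cases} r^n, & n \text{ even} \\ a r^n, & n \text{ odd.} \end{cases} \]
It follows that
\[ \sqrt[n]{|a_n|} = \begin{cases} |r|, & n \text{ even} \\ \sqrt[n]{a}\, |r|, & n \text{ odd,} \end{cases} \]
therefore using Problem 10 in Chapter 15 we find $\lim_{n\to\infty} \sqrt[n]{|a_n|} = |r| < 1$ and taking Remark 18.19, i.e. Problem 12 b), into account we find that $\sum_{n=1}^{\infty} a_n$ converges absolutely.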
The comparison test and its consequences discussed so far cannot help to decide the convergence of $\sum_{n=2}^{\infty} \frac{1}{n \ln n}$ or similar series. However we can establish an integral (comparison) test and this is indeed the most powerful test. The basic idea behind this test is that integrals are limits of sums.
Theorem 18.21 (Integral Test). Let $f : [1, \infty) \to \mathbb{R}$ be a non-negative decreasing function which for every $N \in \mathbb{N}$ is integrable over the interval $[1, N]$. The series $\sum_{n=1}^{\infty} f(n)$ converges if and only if $\lim_{N\to\infty} \int_1^N f(x)\,dx$ exists and is finite.
Proof. For the interval $[1, N]$ we choose the partition $t_1 = 1 < 2 < 3 < \cdots < N$. Then the sum $f(1) + \cdots + f(N-1)$ is the Riemann sum for $\int_1^N f(x)\,dx$ with respect to this partition and the points $\xi_j = t_j = j$, whereas the sum $f(2) + \cdots + f(N)$ is the Riemann sum for $\int_1^N f(x)\,dx$ with respect to the same partition and the points $\xi_j = t_{j+1} = j + 1$. Since $f$ is decreasing it follows that
\[ f(2) + \cdots + f(N) \le \int_1^N f(x)\,dx \le f(1) + \cdots + f(N-1). \]
Figure 18.1: Graph of $y = f(x)$ with the values $f(1), f(2), \dots, f(6)$ marked over the points $1, 2, \dots, 6$.
Since $f \ge 0$ it follows that $\big( \int_1^N f(x)\,dx \big)_{N\in\mathbb{N}}$ is an increasing sequence. Now if $\sum_{n=1}^{\infty} f(n)$ converges then $\big( \int_1^N f(x)\,dx \big)_{N\in\mathbb{N}}$ is a bounded increasing sequence and therefore $\lim_{N\to\infty} \int_1^N f(x)\,dx$ exists. Conversely if $\lim_{N\to\infty} \int_1^N f(x)\,dx$ exists, i.e. is finite, then the increasing sequence $S_N = \sum_{n=1}^{N} f(n)$ is bounded by $\lim_{N\to\infty} \int_1^N f(x)\,dx + f(1)$ and hence $\sum_{n=1}^{\infty} f(n)$ converges.
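The two-sided estimate at the heart of this proof can be illustrated for a concrete function. The sketch below assumes $f(x) = 1/x^2$, for which $\int_1^N f(x)\,dx = 1 - \frac{1}{N}$; the helper name `sandwich` and the names `f` and `F` are chosen here for illustration only.

```python
def sandwich(f, integral, N):
    """Return (f(2)+...+f(N), integral of f over [1, N], f(1)+...+f(N-1))."""
    lower = sum(f(n) for n in range(2, N + 1))
    upper = sum(f(n) for n in range(1, N))
    return lower, integral(N), upper


def f(x):
    """Non-negative decreasing function on [1, infinity)."""
    return 1.0 / x ** 2


def F(N):
    """Exact value of the integral of 1/x^2 over [1, N]."""
    return 1.0 - 1.0 / N


if __name__ == "__main__":
    for N in (2, 10, 1000):
        print(N, sandwich(f, F, N))
```

Since the integrals stay below 1, the partial sums of $\sum 1/n^2$ stay below $1 + f(1) = 2$, which is the boundedness argument used in the proof.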
Example 18.22. A. The series $\sum_{n=2}^{\infty} \frac{1}{n \ln(n)}$ diverges since $x \mapsto \frac{1}{x \ln(x)}$ is a decreasing function and
\[ \int_2^N \frac{1}{x \ln(x)}\,dx = \ln(\ln(x)) \Big|_2^N = \ln(\ln N) - \ln(\ln 2) \]
does not have a finite limit. However the series $\sum_{n=2}^{\infty} \frac{1}{n (\ln(n))^2}$ converges since $x \mapsto \frac{1}{x (\ln(x))^2}$ is a decreasing function and
\[ \int_2^N \frac{1}{x (\ln(x))^2}\,dx = -\frac{1}{\ln(x)} \Big|_2^N = -\frac{1}{\ln N} + \frac{1}{\ln 2} \]
converges.
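B. For $\alpha > 1$ the function $x \mapsto \frac{1}{x^{\alpha}}$ is positive and decreasing on $[1, \infty)$. Further
\[ \int_1^N \frac{1}{x^{\alpha}}\,dx = \frac{1}{1-\alpha} x^{1-\alpha} \Big|_1^N = \frac{1}{1-\alpha} (N^{1-\alpha} - 1). \]
Since $\alpha > 1$ it follows that
\[ \lim_{N\to\infty} \int_1^N \frac{1}{x^{\alpha}}\,dx = \lim_{N\to\infty} \frac{1}{1-\alpha} (N^{1-\alpha} - 1) = \frac{1}{\alpha - 1} \]
implying the convergence of the series $\sum_{n=1}^{\infty} \frac{1}{n^{\alpha}}$ for $\alpha > 1$.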
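Lemma 18.23. For $\alpha > 0$, we have $\lim_{n\to\infty} \frac{\ln(n)}{n^{\alpha}} = 0$.
Proof. Putting $n = e^k$, this is equivalent to $\frac{k}{e^{\alpha k}} \to 0$ as $k \to \infty$. But $\sum_{k=1}^{\infty} \frac{k}{e^{\alpha k}}$ is a convergent series by the ratio test:
\[ \frac{a_{k+1}}{a_k} = \frac{(k+1) e^{\alpha k}}{e^{\alpha(k+1)} k} = \frac{k+1}{k} \cdot \frac{1}{e^{\alpha}} \to \frac{1}{e^{\alpha}} < 1. \]
Therefore by Lemma 9.14 $\frac{k}{e^{\alpha k}} \to 0$ as $k \to \infty$.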
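Theorem 18.24. The sequence $\big( 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} - \ln(n) \big)_{n\in\mathbb{N}}$ converges to a limit. This limit is denoted by $\gamma$ and is called Euler’s constant.
Proof. On the interval $[1, n]$ we consider the partition $1 < 2 < \cdots < n$. Then the sum $\frac{1}{2} + \cdots + \frac{1}{n}$ is a Riemann sum for $\int_1^n \frac{dx}{x}$ which is less than the integral and the sum $1 + \frac{1}{2} + \cdots + \frac{1}{n-1}$ is a Riemann sum for $\int_1^n \frac{dx}{x}$ which is greater than the integral, therefore we find
\[ \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} < \ln n < 1 + \frac{1}{2} + \cdots + \frac{1}{n-1}. \]
Now set $a_n := \ln n - \big( \frac{1}{2} + \cdots + \frac{1}{n} \big)$. Note that
\[ a_{n+1} = \ln(n+1) - \ln(n) - \frac{1}{n+1} + a_n \]
and since
\[ \ln(n+1) - \ln(n) - \frac{1}{n+1} = \int_n^{n+1} \frac{1}{x}\,dx - \frac{1}{n+1} \ge 0, \]
it follows that $(a_n)_{n\in\mathbb{N}}$ is monotone increasing. But by the estimate above $0 < a_n < 1 - \frac{1}{n} < 1$, so $(a_n)_{n\in\mathbb{N}}$ is bounded and hence it has a limit. Therefore $1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} - \ln n = 1 - a_n$ must also tend to a limit.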
Remark 18.25. A numerical approximation for the Euler constant is γ ≈
0.577215664901 . . .
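The convergence in Theorem 18.24 is slow but easy to observe. The following sketch evaluates the sequence $1 + \frac{1}{2} + \cdots + \frac{1}{n} - \ln(n)$ in floating point arithmetic; the function name `euler_sequence` is chosen here for illustration, and the reference value is the approximation quoted in Remark 18.25.

```python
import math


def euler_sequence(n: int) -> float:
    """Return 1 + 1/2 + ... + 1/n - ln(n)."""
    harmonic = math.fsum(1.0 / j for j in range(1, n + 1))
    return harmonic - math.log(n)


if __name__ == "__main__":
    for n in (10, 100, 1000, 100000):
        print(n, euler_sequence(n))
```

The printed values decrease towards the limit, in agreement with the proof, where $1 - a_n$ is decreasing because $(a_n)$ is increasing.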
We know that for a finite sum we can change the order of the summation: addition is commutative. For series the question is a different one, since summation is combined with taking a limit. Thus it is a new, non-trivial question when we ask whether we can rearrange the order of elements “summed up” in a series.
Definition 18.26. Let $\sum_{n=0}^{\infty} a_n$ be a series and $\tau : \mathbb{N}_0 \to \mathbb{N}_0$ be a bijective mapping. The series $\sum_{n=0}^{\infty} a_{\tau(n)}$ is called a rearrangement of the series $\sum_{n=0}^{\infty} a_n$.

Theorem 18.27. Let $\sum_{n=0}^{\infty} a_n$ be an absolutely convergent series with limit $A$, i.e. $\sum_{n=0}^{\infty} a_n = A$. Then every rearrangement of this series also converges to $A$.
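Proof. Let $\tau : \mathbb{N}_0 \to \mathbb{N}_0$ be any bijective mapping. We have to prove that
\[ \lim_{m\to\infty} \sum_{k=0}^{m} a_{\tau(k)} = A. \]
Let $\varepsilon > 0$. Since $\sum_{k=0}^{\infty} |a_k|$ converges, there exists $N_0 \in \mathbb{N}$ such that $\sum_{k=N_0}^{\infty} |a_k| < \frac{\varepsilon}{2}$. This implies that
\[ \Big| A - \sum_{k=0}^{N_0-1} a_k \Big| = \Big| \sum_{k=N_0}^{\infty} a_k \Big| \le \sum_{k=N_0}^{\infty} |a_k| < \frac{\varepsilon}{2}. \]
Now, take $N$ such that $\{\tau(0), \dots, \tau(N)\} \supset \{0, 1, \dots, N_0 - 1\}$. For $m \ge N$ we find
\[ \Big| \sum_{k=0}^{m} a_{\tau(k)} - A \Big| \le \Big| \sum_{k=0}^{m} a_{\tau(k)} - \sum_{k=0}^{N_0-1} a_k \Big| + \Big| \sum_{k=0}^{N_0-1} a_k - A \Big| \le \sum_{k=N_0}^{\infty} |a_k| + \frac{\varepsilon}{2} < \varepsilon, \]
and the theorem is proved.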

Remark 18.28. If $\sum_{n=0}^{\infty} a_n$ is convergent but not absolutely convergent, then rearrangements will in general change the limit. In fact, such a series can be rearranged so that its limit is any prescribed real number, see Problem 16.
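Example 18.29. Consider $\tau_N : \mathbb{N}_0 \to \mathbb{N}_0$ defined to be
\[ \tau_N(n) := \begin{cases} n + N, & 0 \le n < N \\ n - N, & N \le n < 2N \\ n, & n \ge 2N. \end{cases} \]
This is a bijective mapping from $\mathbb{N}_0$ to $\mathbb{N}_0$ and in the case of $\sum_{n=0}^{\infty} (-1)^n \frac{1}{n+1}$ the rearranged series for $N = 3$ is
\[ \sum_{n=0}^{\infty} (-1)^{\tau_3(n)} \frac{1}{\tau_3(n) + 1} = \frac{(-1)^3}{3+1} + \frac{(-1)^4}{4+1} + \frac{(-1)^5}{5+1} + \frac{(-1)^0}{0+1} + \frac{(-1)^1}{1+1} + \frac{(-1)^2}{2+1} + \sum_{n=6}^{\infty} (-1)^n \frac{1}{n+1} \]
\[ = -\frac{1}{4} + \frac{1}{5} - \frac{1}{6} + 1 - \frac{1}{2} + \frac{1}{3} + \sum_{n=6}^{\infty} (-1)^n \frac{1}{n+1}. \]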
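The mapping of Example 18.29 only permutes the first $2N$ indices, so the partial sums of the rearranged series agree with those of the original series from index $2N - 1$ onwards. A minimal sketch of this observation follows; the names `tau`, `term` and `partial_sum` are hypothetical and exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def tau(n: int, N: int) -> int:
    """The bijection tau_N from Example 18.29."""
    if n < N:
        return n + N
    if n < 2 * N:
        return n - N
    return n


def term(n: int) -> Fraction:
    """Term (-1)^n / (n + 1) of the series in Example 18.29."""
    return Fraction((-1) ** n, n + 1)


def partial_sum(m: int, N: int = 0) -> Fraction:
    """Sum of term(tau_N(n)) for n = 0, ..., m; N = 0 gives the original order."""
    return sum((term(tau(n, N)) for n in range(m + 1)), Fraction(0))


if __name__ == "__main__":
    print([tau(n, 3) for n in range(8)])
    print(partial_sum(5, 3), partial_sum(5))
```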
The next result relates rational numbers to real ones from the point of view
of approximation.
Theorem 18.30. Every real number can be approximated by a sequence of

rational numbers.
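Proof. We will show that every positive real number can be approximated by dyadic rationals, i.e. by numbers of the form $\frac{m}{2^n}$ with $n \in \mathbb{Z}$ and $m \in \mathbb{N}$. Let $x \in \mathbb{R}$, $x > 0$, and let
\[ x_n := \sum_{l=-k}^{n} a_l 2^{-l}. \]
Here $k$ is the smallest positive integer such that
\[ x < 2^{k+1}. \]
Then we put $a_{-k} = 1$ and define $a_l$ by
\[ a_l = \begin{cases} 1 & \text{if } x - x_{l-1} > 2^{-l} \\ 0 & \text{if } x - x_{l-1} \le 2^{-l}. \end{cases} \tag{18.2} \]
From the construction it follows that
\[ x_{n+1} \le x < x_{n+1} + 2^{-n-1}, \]
i.e.
\[ |x - x_n| \le 2^{-n} \quad \text{for } n \ge -k, \]
hence $x = \lim_{n\to\infty} x_n = \sum_{n=-k}^{\infty} a_n 2^{-n}$.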
Finally we wish to address a problem about real numbers: their representation as decimal and dyadic numbers or, more generally, as b-adic numbers.
Definition 18.31. Let $b \in \mathbb{N}$, $b \ge 2$. A b-adic fraction is a series of the type
\[ \pm \sum_{n=-k}^{\infty} a_n b^{-n} \]
with $k \ge 0$ and $a_n \in \mathbb{N}_0$ such that $0 \le a_n < b$.
If $b$ is fixed then it is sufficient to write
\[ \pm a_{-k} a_{-k+1} \cdots a_{-1} a_0 . a_1 a_2 a_3 \cdots \]
If $b = 10$ we are dealing with decimal fractions and if $b = 2$ we have dyadic fractions.
Proposition 18.32. Every b-adic fraction converges.
Proof. We show that the sequence of partial sums forms a Cauchy sequence. It is sufficient to consider the case of non-negative b-adic fractions. We therefore let $\sum_{n=-k}^{\infty} a_n b^{-n}$ be a b-adic fraction and for $m \ge -k$ we set $s_m = \sum_{n=-k}^{m} a_n b^{-n}$. For $m' \ge m \ge -k$ we find
\[ |s_{m'} - s_m| = \sum_{n=m+1}^{m'} a_n b^{-n} \le \sum_{n=m+1}^{m'} (b-1) b^{-n} = (b-1) b^{-m-1} \sum_{n=0}^{m'-m-1} b^{-n} \le (b-1) b^{-m-1} \frac{1}{1 - b^{-1}} = b^{-m}. \]
For $\varepsilon > 0$ we find $N \in \mathbb{N}$ such that $m' \ge m \ge N$ implies $|s_{m'} - s_m| < \varepsilon$, namely if $b^{-m} < \varepsilon$ for $m \ge N$, and the result then follows.
Of central importance is
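Theorem 18.33. Let $b \in \mathbb{N}$ where $b \ge 2$ then every real number $x \in \mathbb{R}$ has a representation as a b-adic fraction, i.e.
\[ x = \operatorname{sgn}(x) \sum_{n=-k}^{\infty} a_n b^{-n} \]
where $k \ge 0$ and $a_n \in \mathbb{N}_0$ such that $0 \le a_n < b$, and
\[ \operatorname{sgn}(x) = \begin{cases} 1, & x > 0 \\ 0, & x = 0 \\ -1, & x < 0. \end{cases} \]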
Proof. Again we may assume that $x > 0$. By Lemma 14.7 there exists $l \in \mathbb{N}_0$ such that $x < b^{l+1}$. Let $k$ be the smallest non-negative integer such that
\[ 0 \le x < b^{k+1}. \]
Now we construct a sequence $(a_n)_{n \ge -k}$ of integers $0 \le a_n \le b - 1$ such that for
\[ x_m := \sum_{n=-k}^{m} a_n b^{-n} \]
we have
\[ x_m \le x < x_m + b^{-m}. \]
Since
\[ 0 = 0 \cdot b^k < 1 \cdot b^k < \cdots < (b-1) b^k < b \cdot b^k = b^{k+1} \]
is a partition of $[0, b^{k+1}]$ and since $0 \le x < b^{k+1}$, there exists exactly one non-negative integer $0 \le a_{-k} \le b - 1$ such that
\[ x_{-k} = a_{-k} b^k \le x < (a_{-k} + 1) b^k = x_{-k} + b^k. \]
Figure 18.2: The partition $0, b^k, 2b^k, 3b^k, \dots, (b-1)b^k, b^{k+1}$ of $[0, b^{k+1}]$ with the point $x$ marked.
Thus we have a starting point for induction. Next we suppose that all $a_n$ for $n \le m$ are already constructed such that
\[ x_m \le x < x_m + b^{-m}. \]
We now consider the partition
\[ x_m < x_m + b^{-m-1} < x_m + 2b^{-m-1} < \cdots < x_m + b \cdot b^{-m-1} = x_m + b^{-m}. \]
Then there exists a unique non-negative integer $0 \le a_{m+1} \le b - 1$ such that
\[ x_m + a_{m+1} b^{-m-1} \le x < x_m + (a_{m+1} + 1) b^{-m-1}. \]
Since $x_{m+1} = x_m + a_{m+1} b^{-m-1}$ we have
\[ x_{m+1} \le x < x_{m+1} + b^{-m-1}, \]
and the sequence is constructed. By construction we have
\[ |x - x_m| < b^{-m} \quad \text{for all } m \ge -k, \]
which implies $\lim_{m\to\infty} x_m = x$, i.e.
\[ x = \sum_{n=-k}^{\infty} a_n b^{-n}. \]
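The inductive construction in this proof is effectively an algorithm: at each step the digit $a_n$ is the largest integer with $x_{n-1} + a_n b^{-n} \le x$. The sketch below follows that idea for non-negative rational $x$, using exact fractions; the function name `b_adic_digits` and its parameters are hypothetical and only the first finitely many digits are produced.

```python
from fractions import Fraction


def b_adic_digits(x: Fraction, b: int, num_fractional: int):
    """Return (k, [a_{-k}, ..., a_{num_fractional}]) for x >= 0 and base b >= 2."""
    if x < 0 or b < 2:
        raise ValueError("need x >= 0 and b >= 2")
    # Smallest non-negative k with x < b^(k+1), as in the proof.
    k = 0
    while x >= Fraction(b) ** (k + 1):
        k += 1
    digits = []
    remainder = x  # equals x - x_{n-1} at the start of step n
    for n in range(-k, num_fractional + 1):
        weight = Fraction(b) ** (-n)
        a_n = int(remainder // weight)  # largest digit with a_n * b^(-n) <= remainder
        digits.append(a_n)
        remainder -= a_n * weight
    return k, digits


if __name__ == "__main__":
    print(b_adic_digits(Fraction(5, 8), 2, 3))
    print(b_adic_digits(Fraction(13), 10, 1))
```

For example, $\frac{5}{8}$ has the dyadic digits $0.101$, which the first call reproduces with $k = 0$.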
Remark 18.34. A. For $b = 10$ we can find the decimal representation of real numbers and only Theorem 18.33 allows us to work with it as we do. For $b = 2$ we get the dyadic numbers or the dyadic representation of real numbers which is important in the representation of numbers in computing.
B. Theorem 18.33 also implies: given any real number $x$ and $\varepsilon > 0$ there exists a rational number $q = q(\varepsilon)$ such that $|x - q| < \varepsilon$, i.e. we can approximate every real number by rational numbers. In fact we only need to take $\sum_{n=-k}^{N} a_n b^{-n}$ with $N$ such that $\big| x - \sum_{n=-k}^{N} a_n b^{-n} \big| < \varepsilon$ since $\sum_{n=-k}^{N} a_n b^{-n} \in \mathbb{Q}$. From this it is evident that every real number in an interval $I \subset \mathbb{R}$ can be approximated by the rational numbers in this interval, i.e. by numbers belonging to $I \cap \mathbb{Q}$. For $b = 2$ this is the content of Theorem 18.30.
Finally we can prove
Theorem 18.35. The real numbers are not countable.
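Proof. We prove that $(0, 1) \subset \mathbb{R}$ is not countable which of course implies that $\mathbb{R}$ is not countable. Suppose that $(0, 1)$ is countable then there exists a sequence $(x_n)_{n\in\mathbb{N}}$ of real numbers $x_n$ such that
\[ (0, 1) = \{ x_n \mid n \in \mathbb{N} \}. \]
We represent each $x_n$ by its decimal fraction
\[
\begin{aligned}
x_1 &= 0.a_{11} a_{12} a_{13} a_{14} a_{15} \dots \\
x_2 &= 0.a_{21} a_{22} a_{23} a_{24} a_{25} \dots \\
x_3 &= 0.a_{31} a_{32} a_{33} a_{34} a_{35} \dots \\
x_4 &= 0.a_{41} a_{42} a_{43} a_{44} a_{45} \dots \\
&\;\;\vdots
\end{aligned}
\]
We define $c \in (0, 1)$ by its decimal representation
\[ c = 0.c_1 c_2 c_3 c_4 c_5 \dots \]
with
\[ c_k := \begin{cases} 1 & \text{if } a_{kk} \ne 1 \\ 2 & \text{if } a_{kk} = 1. \end{cases} \]
In particular we have $c_k \ne a_{kk}$ for all $k \ge 1$. Note that the decimal representation of $c$ is unique since it contains neither the digit 0 nor the digit 9. By assumption there must be some $n \in \mathbb{N}$ such that $x_n = c$ which would imply $a_{nn} = c_n$. This is a contradiction and the theorem is proved.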
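The diagonal construction can be tried on any finite list of digit strings. The sketch below is only an illustration of the rule for $c_k$; the function name `diagonal_number` and the sample rows are made up for this example, and a finite list of course proves nothing about countability.

```python
def diagonal_number(rows):
    """rows[k] holds the decimal digits of x_{k+1} after '0.'; return the digits of c."""
    return "".join("2" if row[k] == "1" else "1" for k, row in enumerate(rows))


rows = ["1415", "7182", "4142", "5772"]  # made-up digit strings
c = diagonal_number(rows)

if __name__ == "__main__":
    # c differs from the k-th row in the k-th digit.
    print("0." + c)
```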
Remark 18.36. The procedure used in the proof of Theorem 18.35 is called
Cantor’s diagonalisation argument (or procedure). In fact it was used 15
years earlier by Paul du Bois-Reymond.
Corollary 18.37. The irrational numbers R \ Q are not countable.
This follows from Theorem 18.35 and
Theorem 18.38. For $n \in \mathbb{N}$ let $A_n$ be a countable set then $\bigcup_{n\in\mathbb{N}} A_n = \{ x \mid x \in A_n \text{ for some } n \in \mathbb{N} \}$ is countable. (I.e. a countable union of countable sets is countable.)
Proof. Each set $A_n$ can be written as a sequence
\[ A_n = (a_{nj})_{j\in\mathbb{N}} = (a_{n1}, a_{n2}, a_{n3}, \dots). \]
Now we can arrange $\bigcup_{n\in\mathbb{N}} A_n$ in the following way:
\[
\begin{matrix}
a_{11} & a_{12} & a_{13} & a_{14} & a_{15} & a_{16} & \dots \\
a_{21} & a_{22} & a_{23} & a_{24} & a_{25} & a_{26} & \dots \\
a_{31} & a_{32} & a_{33} & a_{34} & a_{35} & a_{36} & \dots \\
a_{41} & a_{42} & a_{43} & a_{44} & a_{45} & a_{46} & \dots \\
a_{51} & a_{52} & a_{53} & a_{54} & a_{55} & a_{56} & \dots \\
a_{61} & a_{62} & a_{63} & a_{64} & a_{65} & a_{66} & \dots \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots &
\end{matrix}
\]
and we construct a bijection to $\mathbb{N}$ as in the case of the rational numbers in $(0, 1)$.
Figure 18.3: The array $(a_{nj})_{n,j\in\mathbb{N}}$ shown above.
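One common way to enumerate such an array is to run through the anti-diagonals $n + j = 2, 3, 4, \dots$ one after the other. The exact traversal used in the original figure is not recoverable from the text, so the order below is an assumption, and the generator name `diagonal_pairs` is hypothetical. The sketch only lists index pairs $(n, j)$; repeated elements of the union would still have to be skipped to obtain a bijection.

```python
from itertools import count, islice


def diagonal_pairs():
    """Yield index pairs (n, j) with n, j >= 1 along the anti-diagonals n + j = const."""
    for total in count(2):
        for n in range(1, total):
            yield (n, total - n)


if __name__ == "__main__":
    print(list(islice(diagonal_pairs(), 10)))
```

Every pair $(n, j)$ is reached after finitely many steps, namely within the first $\frac{(n+j-1)(n+j)}{2}$ pairs.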
Problems
1. For $\varepsilon > 0$, find $N \in \mathbb{N}$ such that $n \ge N$ implies $\sum_{k=1}^{m} \big( \frac{1}{2} \big)^{n+k} < \varepsilon$. Why does this imply the Cauchy criterion holds for the series $\sum_{k=0}^{\infty} 2^{-k}$?
2. Let $(a_n)_{n\in\mathbb{N}}$ be a monotone decreasing sequence of non-negative numbers. Prove that if $\sum_{n=1}^{\infty} a_n$ converges then $\lim_{n\to\infty} (n a_n) = 0$.
3. Let $(a_n)_{n\in\mathbb{N}}$ be a sequence of non-negative numbers which is decreasing. Prove that the series $\sum_{n=1}^{\infty} a_n$ converges if and only if the series $\sum_{n=1}^{\infty} 2^n a_{2^n}$ converges.
Hint: compare $s = \sum_{n=1}^{\infty} a_n$ with the partial sum $s_{2^n}$ and use the monotonicity criterion, i.e. Theorem 18.8.
4. Apply the result of Problem 3 to test the following series for convergence:
a) $\sum_{n=1}^{\infty} \frac{1}{n^{\alpha}}$, $\alpha \in \mathbb{R}$;
b) $\sum_{n=2}^{\infty} \frac{1}{n (\ln n)^{\alpha}}$, $\alpha \in \mathbb{R}$.
5. Test the following alternating series for convergence:
a) $\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n^{\alpha}}$, $\alpha \in \mathbb{R}$;
b) $\sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{2n-1}$;
c) $\sum_{n=2}^{\infty} \frac{(-1)^n}{n \ln n}$.
6. Let $(a_n)_{n\ge k}$ and $(b_n)_{n\ge k}$ be two sequences of real numbers such that $0 \le a_n \le b_n$. Suppose that $\sum_{n=k}^{\infty} a_n$ diverges. Prove that $\sum_{n=k}^{\infty} b_n$ diverges too.
7. Use a comparison with a convergent or divergent series or otherwise to investigate the following series for convergence:
a) $\sum_{k=1}^{\infty} \frac{(-1)^k k^2}{k^4 + 2k}$;
b) $\sum_{k=1}^{\infty} \frac{k!}{k^k}$;
c) $\sum_{n=1}^{\infty} \frac{\ln(n+1)}{3n^3 + 7}$;

d) ∞ n=1 sin 3 ;
1
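e) $\sum_{k=1}^{\infty} \frac{\cos kx}{1 + k^2}$, $x \in \mathbb{R}$;
f) $\sum_{m=1}^{\infty} \frac{e^{mx}}{m^4}$, $x \in \mathbb{R}$;
g) $\sum_{l=1}^{\infty} \frac{x^2}{l^2 + x^2}$, $x \in \mathbb{R}$;
h) $\sum_{n=1}^{\infty} \frac{n+5}{(2n+1)\sqrt{n+3}}$.
8. Suppose that $(a_n)_{n\in\mathbb{N}}$ and $(b_n)_{n\in\mathbb{N}}$ are two sequences such that $\sum_{n=1}^{\infty} a_n^2$ and $\sum_{n=1}^{\infty} b_n^2$ converge. Prove the (extended) Cauchy-Schwarz inequality
\[ \Big| \sum_{k=1}^{\infty} a_k b_k \Big| \le \sum_{k=1}^{\infty} |a_k b_k| \le \Big( \sum_{k=1}^{\infty} a_k^2 \Big)^{\frac{1}{2}} \Big( \sum_{k=1}^{\infty} b_k^2 \Big)^{\frac{1}{2}} \]
and the (extended) Minkowski inequality
\[ \Big( \sum_{k=1}^{\infty} |a_k + b_k|^2 \Big)^{\frac{1}{2}} \le \Big( \sum_{k=1}^{\infty} a_k^2 \Big)^{\frac{1}{2}} + \Big( \sum_{k=1}^{\infty} b_k^2 \Big)^{\frac{1}{2}}. \]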
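9. Let $(a_k)_{k\in\mathbb{N}}$ be a sequence of real numbers. Prove that the series $\sum_{k=1}^{\infty} \frac{1}{2^k} \frac{|a_k|}{1 + |a_k|}$ converges. Furthermore, for two sequences $(a_k)_{k\in\mathbb{N}}$ and $(b_k)_{k\in\mathbb{N}}$ of real numbers prove
\[ \sum_{k=1}^{\infty} \frac{1}{2^k} \frac{|a_k + b_k|}{1 + |a_k + b_k|} \le \sum_{k=1}^{\infty} \frac{1}{2^k} \frac{|a_k|}{1 + |a_k|} + \sum_{k=1}^{\infty} \frac{1}{2^k} \frac{|b_k|}{1 + |b_k|}. \]
10. Use the ratio test or otherwise to investigate the convergence of the following series:
a) $\sum_{n=1}^{\infty} n^6 e^{-n^2}$;
b) $\sum_{n=1}^{\infty} \frac{4n^2 + 15n - 3}{n^2 (n+1)^{\frac{3}{2}}}$;
c) $\sum_{k=0}^{\infty} \frac{x^k}{k!}$, $x \in \mathbb{R}$;
d) $\sum_{k=0}^{\infty} (-1)^k \frac{x^{2k}}{(2k)!}$, $x \in \mathbb{R}$.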

11. Prove the following: if for a sequence $(a_n)_{n\in\mathbb{N}}$ of real numbers $\big| \frac{a_{n+1}}{a_n} \big| \ge \lambda > 1$ then the series $\sum_{n=1}^{\infty} a_n$ diverges. Use this result to show the divergence of:
a) $\sum_{n=1}^{\infty} \frac{(-1)^n 3^n}{n^4}$;
3
b) ∞ n√2
n=1 (n+3) 4n+15 .
12. Let $(a_n)_{n\in\mathbb{N}}$ be a sequence of real numbers.
a) Prove that if $|a_n|^{\frac{1}{n}} \ge 1$ then $\sum_{n=1}^{\infty} |a_n|$ diverges.
b) Prove that if $\lim_{n\to\infty} |a_n|^{\frac{1}{n}} = a < 1$ then $\sum_{n=1}^{\infty} |a_n|$ converges.
13.* Prove Raabe’s test: suppose that $\big| \frac{a_{n+1}}{a_n} \big| \le 1 - \frac{a}{n}$ holds for $n \ge N$. If $a > 1$ then $\sum_{n=1}^{\infty} a_n$ converges absolutely.
14. Consider the series
\[ \sum_{n=1}^{\infty} \Big( \frac{1 \cdot 4 \cdot 7 \cdot \ldots \cdot (3n-2)}{3 \cdot 6 \cdot 9 \cdot \ldots \cdot 3n} \Big)^2. \]
Use Raabe’s test to show that it converges. Is it possible to use the ratio test to prove convergence of this series?
15. Use the integral test to investigate convergence or divergence of the following series:
a) $\sum_{k=2}^{\infty} \frac{1}{k (\ln k)^{\alpha}}$, $\alpha > 1$;
b) $\sum_{l=1}^{\infty} l e^{-l^2}$;
c) $\sum_{k=2}^{\infty} \frac{\ln k}{k}$;
d) $\sum_{k=2}^{\infty} \frac{\ln k}{k^2}$.
16.* Let $(a_n)_{n\in\mathbb{N}}$ be a sequence of real numbers for which $\sum_{n=1}^{\infty} a_n$ converges but $\sum_{n=1}^{\infty} |a_n|$ diverges, i.e. the series is not absolutely convergent. Prove that for $c \in \mathbb{R}$ given there exists a rearrangement of $\sum_{n=1}^{\infty} a_n$ the limit of which is $c$.
17. Find the representation of $x = \frac{1}{7}$ as a b-adic fraction when
a) b = 2;
b) b = 7;
c) b = 10.
18. Prove that if D ⊂ R is a set which contains an open interval (a, b), i.e.
(a, b) ⊂ D, then D is not countable.
Hint: use the fact that the interval (0, 1) is not countable and construct
a bijective mapping f : (a, b) → (0, 1).
19 Point Sets in R
Functions or sequences map subsets of the real line onto subsets of the real
line. In order to understand this process better we need to acquire more
knowledge of subsets of the real line. This is a task which will accompany
us for some time and it is partly more abstract and formal than students are
used to at the beginning of their studies. However it is unavoidable in order
to gain a deeper understanding of mathematics.
We already know a certain class of subsets of R and we have seen its impor-
tance: intervals.
For a ≤ b we define the closed interval by
[a, b] := {x ∈ R|a ≤ x ≤ b}, (19.1)
noting that [a, a] = {a} is a closed interval. For a < b we have the open
interval
(a, b) := {x ∈ R|a < x < b}, (19.2)
and for a < b we have two kinds of half-open intervals, namely
[a, b) := {x ∈ R|a ≤ x < b} (19.3)
and
(a, b] := {x ∈ R|a < x ≤ b}. (19.4)
We extend these notions to infinite or unbounded intervals. For a ∈ R
we set
[a, ∞) := {x ∈ R|x ≥ a}, (19.5)

(a, ∞) := {x ∈ R|x > a}, (19.6)
(−∞, a] := {x ∈ R|x ≤ a}, (19.7)
(−∞, a) := {x ∈ R|x < a}. (19.8)
Moreover we define
R+ := [0, ∞), (19.9)
so that (0, ∞) = R+ \ {0} and we occasionally use
(−∞, ∞) := R, (19.10)
i.e. we consider R as an interval. The following definition has far reaching

consequences.
Definition 19.1. A set $A \subset \mathbb{R}$ is called open, more precisely an open subset of $\mathbb{R}$, or open in $\mathbb{R}$, if for every $x \in A$ there exists an $\varepsilon > 0$ such that the open interval $(x - \varepsilon, x + \varepsilon)$ belongs entirely to $A$, i.e. $(x - \varepsilon, x + \varepsilon) \subset A$. By definition the empty set $\emptyset$ is open.
Clearly R is an open set. Moreover we find
Lemma 19.2. Every open interval (a, b) ⊂ R is an open subset of R.
Proof. First note that there is a need for a proof. At a first glance the notion of an open interval is unrelated to the notion of an open set. But of course we should expect some consistency in our notions. Therefore let $(a, b) \subset \mathbb{R}$ be an open interval. We want to prove that for $x \in (a, b)$ there exists $\varepsilon > 0$ such that the open interval $(x - \varepsilon, x + \varepsilon)$ is a subset of $(a, b)$, i.e. $(x - \varepsilon, x + \varepsilon) \subset (a, b)$. For this choose $\varepsilon := \frac{1}{2} \min(x - a, b - x) > 0$ and it follows that $(x - \varepsilon, x + \varepsilon) \subset (a, b)$.
This proof has a clear geometric idea:
Figure 19.1: The interval $(a, b)$ containing the point $x$ together with the interval $(x - \varepsilon, x + \varepsilon)$.
Note that the proof is also valid for (−∞, b) or (a, ∞), i.e. both are open
sets.
We next want to study some properties of open sets.
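Lemma 19.3. A. For a finite collection of open subsets $A_1, \dots, A_N$ of $\mathbb{R}$ the intersection $\bigcap_{\nu=1}^{N} A_\nu$ is open.
B. Let $I \ne \emptyset$ be an arbitrary index set and for $j \in I$ let $A_j \subset \mathbb{R}$ be an open set, then the union $\bigcup_{j\in I} A_j$ is an open set in $\mathbb{R}$.
Proof. A. Assume that $\bigcap_{\nu=1}^{N} A_\nu \ne \emptyset$, otherwise there is nothing to prove since by definition $\emptyset$ is open. Let $x \in \bigcap_{\nu=1}^{N} A_\nu$, thus $x \in A_\nu$ for all $\nu = 1, \dots, N$. Since $A_\nu$ is open there exists $\varepsilon_\nu > 0$ such that $(x - \varepsilon_\nu, x + \varepsilon_\nu) \subset A_\nu$. For $\varepsilon := \min_{1 \le \nu \le N} \varepsilon_\nu > 0$ we find
\[ x \in (x - \varepsilon, x + \varepsilon) \subset \bigcap_{\nu=1}^{N} (x - \varepsilon_\nu, x + \varepsilon_\nu) \subset \bigcap_{\nu=1}^{N} A_\nu, \]
implying the openness of $\bigcap_{\nu=1}^{N} A_\nu$.
B. Now let $I \ne \emptyset$ be any index set and for $j \in I$ let $A_j \subset \mathbb{R}$ be open. Consider
\[ A := \bigcup_{j\in I} A_j := \{ x \in \mathbb{R} \mid x \in A_{j_0} \text{ for some } j_0 \in I \}, \tag{19.11} \]
and assume that at least one set $A_{j_1}$ is non-empty, otherwise $A = \emptyset$ and nothing remains to prove. Take $x \in A$, then for some $j_0 \in I$ we have $x \in A_{j_0}$ and since $A_{j_0}$ is open there exists an open interval $(x - \varepsilon, x + \varepsilon) \subset A_{j_0}$ which yields $(x - \varepsilon, x + \varepsilon) \subset \bigcup_{j\in I} A_j = A$ and the lemma is proved.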
Example 19.4. A. If a1 < b1 < a2 < b2 then the two intervals (a1 , b1 ) and
(a2 , b2 ) are open and disjoint. Their union (a1 , b1 ) ∪ (a2 , b2 ) is open too but
it is not an interval anymore.
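Figure 19.2: The disjoint open intervals $(a_1, b_1)$ and $(a_2, b_2)$ on the real line.
Moreover, the set $\bigcup_{n=1}^{\infty} \big( n - \frac{1}{n}, n + \frac{1}{n} \big)$ is open.
B. Consider the open intervals $\big( 1 - \frac{1}{n+1}, 1 + \frac{1}{n+1} \big)$. Their intersection is given by
\[ \{1\} = \bigcap_{n=1}^{\infty} \Big( 1 - \frac{1}{n+1}, 1 + \frac{1}{n+1} \Big) \]
(compare also with Problem 4). The set $\{1\}$ does not contain an open interval, hence we cannot expect that an infinite intersection of open sets is open.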
C. The following type of construction will be used (in a modified form) quite often. Let $a < b$, let $f : [a, b] \to \mathbb{R}$ be a function and let $\varepsilon > 0$. For $x \in [a, b]$ consider the open interval $(f(x) - \varepsilon, f(x) + \varepsilon) \subset \mathbb{R}$. It follows that $\bigcup_{x\in[a,b]} (f(x) - \varepsilon, f(x) + \varepsilon) \subset \mathbb{R}$ is an open set. The image of $f$, i.e. $f([a, b])$, is a subset of $\mathbb{R}$ and clearly $f([a, b]) \subset \bigcup_{x\in[a,b]} (f(x) - \varepsilon, f(x) + \varepsilon)$. Thus we can consider $f([a, b])$ as a subset of an open set and every $y = f(x) \in f([a, b])$ is the centre of an open interval of length $2\varepsilon$ entirely belonging to this open set. Clearly $f([a, b])$ does not have to be open, just consider $f : [a, b] \to \mathbb{R}$, $f(x) = c \in \mathbb{R}$ for all $x \in [a, b]$. Then $f([a, b]) = \{c\}$ which is not open.
Recall that by Definition 17.8.B a point $a \in \mathbb{R}$ is an accumulation point of $B \subset \mathbb{R}$ if there exists a sequence $(b_n)_{n\in\mathbb{N}}$, $b_n \in B$, $b_n \ne a$, converging to $a$.
Definition 19.5. A set B ⊂ R is called closed, more precisely a closed

subset of R, or closed in R, if it contains all its accumulation points.
Theorem 19.6. A set $B \subset \mathbb{R}$ is closed if and only if its complement $B^{\complement}$ is open. Consequently $A \subset \mathbb{R}$ is open if $A^{\complement}$ is closed.
Proof. Suppose $B$ is closed and $x \in B^{\complement}$, then $x$ is not an accumulation point of $B$, i.e. there is no sequence $(b_n)_{n\in\mathbb{N}}$, $b_n \in B$, converging to $x$, and so there exists an interval $(x - \varepsilon, x + \varepsilon)$ which contains no point of $B$, i.e. $(x - \varepsilon, x + \varepsilon) \subset B^{\complement}$, and so $B^{\complement}$ is open. Conversely, suppose $B^{\complement}$ is open and $a$ is an accumulation point of $B$. Then, if $a \in B^{\complement}$, there exists an open interval $(a - \varepsilon, a + \varepsilon)$ contained in $B^{\complement}$, which contradicts the fact that $a$ is an accumulation point of $B$, i.e. the existence of a sequence $(b_n)_{n\in\mathbb{N}}$, $b_n \in B$, $b_n \ne a$, converging to $a$. Hence $a \in B$ and $B$ is closed. The final statement follows from $(A^{\complement})^{\complement} = A$.
Lemma 19.7. The sets ∅ and R are closed and any closed interval is closed.
Moreover, the union of finitely many closed sets is closed and the intersection
of an arbitrary collection of closed sets is closed.
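Proof. We have $\emptyset^{\complement} = \mathbb{R}$ and $\mathbb{R}^{\complement} = \emptyset$ implying that $\emptyset$ and $\mathbb{R}$ are closed. For the interval $[a, b]$ we can write $[a, b] = ((-\infty, a) \cup (b, \infty))^{\complement}$ implying that $[a, b]$ is closed. Also $(-\infty, b] = (b, \infty)^{\complement}$, so that $(-\infty, b]$ is closed. Similarly $[a, \infty)$ is closed.
Now let $B_\nu \subset \mathbb{R}$, $\nu = 1, \dots, N$, be a family of closed sets. Then
\[ \Big( \bigcup_{\nu=1}^{N} B_\nu \Big)^{\complement} = \bigcap_{\nu=1}^{N} B_\nu^{\complement}, \]
and since $B_\nu^{\complement}$ is open, $\bigcap_{\nu=1}^{N} B_\nu^{\complement}$ is open, and hence $\bigcup_{\nu=1}^{N} B_\nu$ is closed. For an arbitrary collection $B_j \subset \mathbb{R}$, $j \in I$, of closed sets we have
\[ \bigcap_{j\in I} B_j = \{ x \in \mathbb{R} \mid x \in B_j \text{ for all } j \in I \} \]
and therefore
\[ \Big( \bigcap_{j\in I} B_j \Big)^{\complement} = \bigcup_{j\in I} B_j^{\complement}, \]
and since each $B_j^{\complement}$ is open it follows from Lemma 19.3 that $\bigcup_{j\in I} B_j^{\complement}$ is open, hence $\bigcap_{j\in I} B_j$ is closed.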
Remark 19.8. In Problem 1 we will prove that [a, b) and (a, b] are neither
open nor closed.
Example 19.9. A. A single point $a \in \mathbb{R}$ forms a closed set $\{a\}$ since $\{a\} = ((-\infty, a) \cup (a, \infty))^{\complement}$. This implies that any finite union of points $a_1, \dots, a_N$ is closed:
\[ \{ a_\nu \mid \nu = 1, \dots, N \} = \bigcup_{\nu=1}^{N} \{a_\nu\}. \]
B. Let $a_\nu \in \mathbb{R}$, $\nu \in \mathbb{N}$ and assume for some $\delta > 0$ that $|a_\nu - a_{\nu+1}| \ge \delta$. Then $\bigcup_{\nu=1}^{\infty} \{a_\nu\}$ is a closed set. (Compare with Problem 3).
Definition 19.10. A set $U \subset \mathbb{R}$ is called a neighbourhood of $x \in \mathbb{R}$ if there exists an open set $A \subset U$ containing $x$, i.e. $x \in A \subset U$.
Obviously every open set is a neighbourhood of all its points. However
the closed interval [a, b] is only a neighbourhood of the points belonging
to (a, b) ⊂ [a, b]. It is not a neighbourhood in R of {a}, {b} or any subset
containing a or b (or both). From our considerations above we have
Theorem 19.11. Let $U \subset \mathbb{R}$ be a neighbourhood of $x \in \mathbb{R}$ then there exists an open interval $(x - \delta, x + \delta) \subset U$, $\delta > 0$. Further, by Theorem 18.33 we know that there exists a dyadic fraction
\[ y = \operatorname{sgn}(x) \sum_{l=-k}^{N} a_l 2^{-l}, \quad a_l \in \mathbb{N}_0, \tag{19.12} \]
such that $|x - y| < \delta$, i.e. $y \in U$, implying that in every neighbourhood of a real number we can find a rational number.
Next we want to understand the idea of boundedness for subsets of the real
line.
Definition 19.12. A set $D \subset \mathbb{R}$ is called bounded from above (bounded from below) if there exists $K \in \mathbb{R}$ such that
\[ x \le K \quad (x \ge K) \quad \text{for all } x \in D. \tag{19.13} \]
We call $K$ an upper (lower) bound for $D$. If $D$ is bounded from above and from below we call $D$ bounded.
Remark 19.13. A. Upper and lower bounds are not uniquely determined. In fact if $D$ is bounded from above by $K$ then every $K' > K$ is a further upper bound and if $D$ is bounded from below by $M$ then every $M' < M$ is a further lower bound.
B. A set D ⊂ R is bounded if and only if for some K we have |x| ≤ K for
all x ∈ D. Indeed, since A ≤ x ≤ B for some A ≤ B, we may also take
K := max (|A|, |B|) to find −K ≤ x ≤ K for x ∈ D.
C. Note further that a sequence (an )n∈N is bounded if and only if the set
{aν |ν ∈ N} ⊂ R is bounded in R.
D. Let a < b be real numbers then the corresponding open, closed and half-
open intervals (a, b), [a, b], [a, b) and (a, b] are all bounded with lower bound
a and upper bound b. However in some cases the bound belongs to the
interval, in other cases it does not. The intervals (−∞, a) and (−∞, a] are
not bounded sets, but they are bounded from above, while (b, ∞) and [b, ∞)
are not bounded but bounded from below.
The last remark raises the following interesting question: Suppose that $D \subset \mathbb{R}$ is bounded above. We would like to know whether there exists a smallest upper bound, i.e. $K \in \mathbb{R}$ being an upper bound of $D$ with the property that if $K' < K$ then $K'$ cannot be an upper bound of $D$.
Of fundamental importance is the following theorem which once again needs
the completeness of R.
Theorem 19.14. Every non-empty set D ⊂ R which is bounded from above
has a least upper bound. Every non-empty set D ⊂ R which is bounded from
below has a greatest lower bound.
Definition 19.15. Let D ⊂ R be a subset. The least upper bound of D is
called its supremum, its greatest lower bound is called its infimum. The
supremum of a set D is denoted by sup D, the infimum is denoted by inf D.
Proof of Theorem 19.14. We show the case where $D$ is bounded from above. Since $D \ne \emptyset$ and bounded from above there exists $x_0 \in D$ and $K_0 \in \mathbb{R}$, an upper bound of $D$, such that $x_0 \le K_0$, hence $r := K_0 - x_0 \ge 0$. We now take the arithmetic mean $\frac{K_0 + x_0}{2}$ which may or may not be an upper bound for $D$. If it is, we call it $K_1$. If it is not an upper bound for $D$, there exists $x_1 \in D$, $x_1 > x_0$, larger than $\frac{K_0 + x_0}{2}$. In this case we set $K_1 := K_0$, i.e. we do not change the upper bound. We repeat this process to obtain a decreasing sequence of upper bounds and an increasing sequence of elements belonging to $D$, and we will prove that they converge to the same limit. Our demonstration uses mathematical induction:
We construct
i) a sequence $x_0 \le x_1 \le x_2 \le \cdots$ of elements in $D$, and
ii) a sequence $K_0 \ge K_1 \ge K_2 \ge \cdots$ of upper bounds of $D$ such that
\[ K_n - x_n \le 2^{-n} r \quad \text{for all } n \in \mathbb{N}, \; r = K_0 - x_0. \tag{19.14} \]
Starting with $x_0$ and $K_0$ let us assume that $x_0, \dots, x_n \in D$ and $K_0, \dots, K_n$, upper bounds of $D$, are already constructed such that (19.14) holds. Define
\[ M := \frac{K_n + x_n}{2}. \]
There are two possibilities: if $M$ is an upper bound of $D$, we put $x_{n+1} := x_n$ and $K_{n+1} := M$; if $M$ is not an upper bound of $D$, we put $K_{n+1} := K_n$ and choose $x_{n+1} \in D$ with $x_{n+1} > M$. In each case we have
\[ x_n \le x_{n+1}, \quad K_n \ge K_{n+1} \quad \text{and} \quad K_{n+1} - x_{n+1} \le 2^{-n-1} r. \]
The sequence $(K_n)_{n\in\mathbb{N}_0}$ is monotone decreasing and bounded since $x_0 \le K_n \le K_0$. Hence $(K_n)_{n\in\mathbb{N}_0}$ converges to some $K \in \mathbb{R}$. Since for $x \in D$ we always have $x \le K_n$, it follows that $x \le \lim_{n\to\infty} K_n = K$, i.e. $K$ is an upper bound for $D$. To show that it is the least upper bound, suppose $K' < K$. Then there exists $n_0 \in \mathbb{N}$ such that $2^{-n_0} r < K - K'$, which yields
\[ x_{n_0} \ge K_{n_0} - 2^{-n_0} r \ge K - 2^{-n_0} r > K', \]
so that $K'$ is not an upper bound. Hence $K = \sup D$. Note that (19.14) implies $\lim_{n\to\infty} K_n = \lim_{n\to\infty} x_n$.
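The bisection in this proof can be carried out literally when $D$ is a finite list of numbers, since then it can be decided whether the midpoint $M$ is an upper bound. The sketch below makes exactly this simplifying assumption; the function name `approximate_sup` and the sample set are chosen for illustration. For a general set $D$ the decision in each step is not computable in this way.

```python
def approximate_sup(D, K0, steps):
    """Run the bisection from the proof for a finite list D with upper bound K0.

    Returns (x_n, K_n) after `steps` steps: x_n is an element of D and
    K_n is an upper bound of D.
    """
    x, K = D[0], K0
    for _ in range(steps):
        M = (K + x) / 2
        # Look for an element of D larger than M.
        witness = next((d for d in D if d > M), None)
        if witness is None:
            K = M        # M is an upper bound: lower the bound
        else:
            x = witness  # M is not an upper bound: move up inside D
    return x, K


D = [n * n / (n * n + 1) for n in range(1, 50)]
x_n, K_n = approximate_sup(D, 2.0, 40)

if __name__ == "__main__":
    print(x_n, K_n, max(D))
```

After 40 steps the gap $K_n - x_n$ is at most $2^{-40} r$ up to rounding, in line with (19.14).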
Example 19.16. A. For a closed interval [a, b], a ≤ b, we have sup[a, b] = b
and inf[a, b] = a.
B. For an open interval $(a, b)$, $a < b$, we find $\sup(a, b) = b$ and $\inf(a, b) = a$. We show that $b = \sup(a, b)$. Clearly, $b$ is an upper bound for $(a, b)$. Suppose that $b' < b$. It follows that
\[ x := \max\Big( \frac{a + b}{2}, \frac{b' + b}{2} \Big) \in (a, b) \]
and $b' < x$, hence $b'$ could not be an upper bound.
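Example 19.17. The following holds
\[ \sup\Big\{ \frac{n^2}{n^2 + 1} \;\Big|\; n \in \mathbb{N} \Big\} = 1. \]
Suppose $0 < \varepsilon < 1$ is given. Since $\lim_{n\to\infty} \frac{n^2}{n^2+1} = 1$ and since $\big( \frac{n^2}{n^2+1} \big)_{n\in\mathbb{N}}$ is an increasing sequence it follows that there exists $N(\varepsilon)$ such that $n \ge N(\varepsilon)$ implies $\varepsilon < \frac{n^2}{n^2+1}$, hence $\varepsilon < 1$ cannot be an upper bound, while 1 is clearly an upper bound. This example easily extends. Let $(a_n)_{n\in\mathbb{N}}$ be a sequence of real numbers converging to $a$, i.e. $\lim_{n\to\infty} a_n = a$. Suppose that $a_n \le a$ for all $n \in \mathbb{N}$ then $\sup\{ a_n \mid n \in \mathbb{N} \} = a$ (compare with Problem 10).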
The examples show that sometimes inf D or sup D belong to D, sometimes

not.
Definition 19.18. A. If D ⊂ R and x = sup D ∈ D, then we call x the

maximum of D and write x = max D. In this case we have sup D = max D.
If D ⊂ R and y = inf D ∈ D, then we call y the minimum of D and write
y = min D. In this case we have inf D = min D.
B. If a set D is not bounded from above we write sup D = ∞, if it is not
bounded from below, we write inf D = −∞.
If D is bounded from above, it need not have a maximum. However there is

always a sequence in D converging to sup D as shown in the proof of Theorem
19.14. A similar statement holds for the minimum and infimum.
We now turn to sequences. A sequence may have or may not have a limit, or
it may have several converging subsequences. The following notions of limit
superior and limit inferior will help to clarify the situation.
Definition 19.19. Let (an)n∈N be a sequence of real numbers. We define its
limit superior by
\[ \limsup_{n\to\infty} a_n := \lim_{n\to\infty} \bigl( \sup\{a_k \mid k \ge n\} \bigr) \tag{19.15} \]
and its limit inferior by
\[ \liminf_{n\to\infty} a_n := \lim_{n\to\infty} \bigl( \inf\{a_k \mid k \ge n\} \bigr). \tag{19.16} \]
19 POINT SETS IN R
Remark 19.20. A. An alternative notation is
\[ \overline{\lim} = \limsup \quad \text{and} \quad \underline{\lim} = \liminf. \]
B. The sequence (sup{ak | k ≥ n})n∈N is monotone decreasing whereas the
sequence (inf{ak | k ≥ n})n∈N is monotone increasing. Therefore
\[ \limsup_{n\to\infty} a_n \quad \text{and} \quad \liminf_{n\to\infty} a_n \]
exist either as limits in R or as “improper limits” +∞ or −∞, i.e. the
sequence (sup{ak | k ≥ n})n∈N diverges to ±∞, and/or the sequence
(inf{ak | k ≥ n})n∈N diverges to ±∞.

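Example 19.21. A. Consider the sequence (an)n∈N, where an = (−1)^n (1 + 1/n²).
We find
\[ \sup\{a_k \mid k \ge n\} = \begin{cases} 1 + \frac{1}{n^2}, & \text{if } n \text{ is even} \\ 1 + \frac{1}{(n+1)^2}, & \text{if } n \text{ is odd,} \end{cases} \]
hence lim supn→∞ an = 1. Further we find
\[ \inf\{a_k \mid k \ge n\} = \begin{cases} -\left(1 + \frac{1}{n^2}\right), & \text{if } n \text{ is odd} \\ -\left(1 + \frac{1}{(n+1)^2}\right), & \text{if } n \text{ is even,} \end{cases} \]
hence lim infn→∞ an = −1.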

B. For the sequence (an )n∈N , an = n, we find
sup{ak |k ≥ n} = ∞ and inf{ak |k ≥ n} = n,
which yields
\[ \limsup_{n\to\infty} a_n = \liminf_{n\to\infty} a_n = +\infty. \]
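The tail suprema and infima of Example 19.21.A can be checked numerically. The sketch below is an illustration under one stated assumption: the supremum over all k ≥ n is replaced by a maximum over a finite window, which is harmless here because the extremal value is attained at index n or n + 1.

```python
def a(n):
    """a_n = (-1)^n (1 + 1/n^2) from Example 19.21.A."""
    return (-1) ** n * (1 + 1 / n**2)


def tail_sup(n, window=2000):
    # Finite stand-in for sup{a_k | k >= n}: maximum over k = n, ..., n + window - 1.
    return max(a(k) for k in range(n, n + window))


def tail_inf(n, window=2000):
    # Finite stand-in for inf{a_k | k >= n}.
    return min(a(k) for k in range(n, n + window))


def closed_form_sup(n):
    # The formula for sup{a_k | k >= n} stated in the example.
    return 1 + 1 / n**2 if n % 2 == 0 else 1 + 1 / (n + 1) ** 2


def closed_form_inf(n):
    # The formula for inf{a_k | k >= n} stated in the example.
    return -(1 + 1 / n**2) if n % 2 == 1 else -(1 + 1 / (n + 1) ** 2)
```

Evaluating `tail_sup(n)` for growing n shows the values decreasing towards 1, while `tail_inf(n)` increases towards −1, matching lim sup = 1 and lim inf = −1.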
Theorem 19.22. A. Let (an)n∈N, an ∈ R, be a bounded sequence and denote
by A the set of all its accumulation points. It holds that
\[ \limsup_{n\to\infty} a_n = \sup A \tag{19.17} \]
and
\[ \liminf_{n\to\infty} a_n = \inf A. \tag{19.18} \]
B. A sequence (an)n∈N of real numbers an ∈ R converges to a limit a ∈ R if
and only if
\[ \limsup_{n\to\infty} a_n = \liminf_{n\to\infty} a_n = a. \tag{19.19} \]
Proof. A. We prove (19.17), the proof of (19.18) is similar. With An :=
sup{ak | k ≥ n} we have by the definition of lim sup that
\[ a' := \limsup_{n\to\infty} a_n = \lim_{n\to\infty} A_n. \]
Since (an)n∈N is bounded it follows that An ∈ R as well as a′ ∈ R. We claim
that a′ is an accumulation point, i.e. a′ ∈ A, and that a ≤ a′ for all a ∈ A.
By definition A is the set of all limits of converging subsequences of (an)n∈N.
Therefore, to prove a′ ∈ A it is sufficient to show that for every N ∈ N and
every ε > 0 there exists n ≥ N, n = n(N, ε), such that |an − a′| < ε. Indeed by
this we get a subsequence of (an)n∈N converging to a′. Since limn→∞ An = a′
we find m ≥ N such that |Am − a′| < ε/2 and the definition of Am implies the
existence of n, n ≥ m, such that |an − Am| < ε/2 which yields for n ≥ N that
|an − a′| < ε. Thus we have proved a′ ∈ A. Let a ∈ A be an accumulation
point of (an)n∈N. Then there exists a subsequence (ank)k∈N of (an)n∈N such
that limk→∞ ank = a. By definition of Ank we have Ank ≥ ank. This implies
\[ a' = \lim_{n\to\infty} A_n = \lim_{k\to\infty} A_{n_k} \ge \lim_{k\to\infty} a_{n_k} = a, \]
but a′ ∈ A and a ≤ a′ for all a ∈ A implies a′ = sup A.
B. In the case where (an)n∈N converges to a ∈ R, we know by Theorem 15.8
that (an)n∈N is bounded and further A = {a}. Thus applying part A we get
\[ \limsup_{n\to\infty} a_n = \sup A = a = \inf A = \liminf_{n\to\infty} a_n. \]
Now suppose that (19.19) holds. We set as before An := sup{ak | k ≥ n}
and further Bn := inf{ak | k ≥ n}. In other words limn→∞ An = limn→∞ Bn = a.
Given ε > 0 there exists N ∈ N such that |a − AN| < ε and |a − BN| < ε. Since
BN ≤ an ≤ AN for all n ≥ N, it follows that −(a − BN) ≤ an − a ≤ AN − a
or |an − a| < ε for all n ≥ N, i.e. (an)n∈N converges to a.
The proof of Theorem 19.22 gives an alternative characterisation of lim sup
and lim inf.
Corollary 19.23. Let (an )n∈N be a bounded sequence. Its greatest accumula-
tion point is lim supn→∞ an and its smallest accumulation point is lim inf n→∞ an .
Moreover we have
\[ \limsup_{n\to\infty} a_n = - \liminf_{n\to\infty} (-a_n). \tag{19.20} \]
In order to see (19.20) note that passing from (an )n∈N to (−an )n∈N is a re-
flection about 0 which reverses all order relations.
Finally we want to provide some results which are useful to know, but we
provide the proofs only in Appendix VIII. We start with
Definition 19.24. A. Let A ⊂ R be a non-empty set. We call a pair

{O1 , O2} of non-empty open and disjoint subsets of R a splitting of A if
A ⊂ O1 ∪ O2 and A ∩ O1 as well as A ∩ O2 is non-empty.
B. A non-empty subset A ⊂ R is called connected if A does not have a
splitting.
Theorem 19.25. A non-empty subset of R is connected if and only if it is

an interval.
Corollary 19.26. A subset A ⊂ R is both open and closed if and only if A

is either empty, i.e. A = ∅, or A is all of R, i.e. A = R.
Proof. Both ∅ and R are open and closed. Indeed ∅ is open by definition,
hence its complement ∅^c = R is closed. However $\mathbb{R} = \bigcup_{n \in \mathbb{N}} (-n, n)$
is the union of open sets, hence open, implying that R^c = ∅ is closed.
Suppose that A is open and closed, hence A^c is open and closed and
R = A ∪ A^c. If both A and A^c were non-empty, then {A, A^c} would be a
splitting of the connected set R, which is impossible. Hence either A or
A^c is empty, i.e. A is either R or ∅.
We finally have
Theorem 19.27. Every open set A ⊂ R is a denumerable union of disjoint

open intervals.
In Appendix VIII we will provide a proof of Theorem 19.25 and Theorem

19.27.
Problems
1. Prove that for a < b the half-open interval [a, b) is neither open nor
closed.
2. Is Q ⊂ R, i.e. the set of all rational numbers, a closed or an open

subset of R?
3. Let aν ∈ R, ν ∈ N, assume aν < aν+1 for ν ∈ N and limν→∞ aν = ∞.
Prove that $\bigcup_{\nu=1}^{\infty} \{a_\nu\}$ is closed.
4. Give an example of a sequence (Bν )ν∈N of closed sets in R such that

∪ν∈N Bν is not closed.
5. Let (aν )ν∈N , aν ∈ R, be a sequence converging to a ∈ R. Is {aν |ν ∈ N}
closed in general? Prove that {aν |ν ∈ N} ∪ {a} is closed.
6. Let A and B be two non-empty sets of real numbers and define A+B :=
{c = a + b|a ∈ A, b ∈ B}. Prove that if A and B are both bounded
then A + B is bounded too.
7. a) Given the set M := (−3, 2) ∪ [4, 6] ∪ {10} ⊂ R. Prove that
(−3, 2) ∪ (4, 6) is the largest open set contained in M and that [−3, 2] ∪
[4, 6] ∪ {10} is the smallest closed set which contains M.

b) Prove that $\bigcap_{n \in \mathbb{N}} \left( -\frac{1}{n}, \frac{1}{n} \right) = \{0\}$.
8. a) Consider the set
\[ G := \left\{ y \in \mathbb{R} \,\Big|\, y = \frac{1}{x},\ x \ge \frac{1}{2} \right\}. \]
Find inf G and sup G. Does G have a maximum or minimum?
b) Find a sequence (an)n∈N, an ∈ R, with 3 accumulation points
such that sup{an | n ∈ N} = 3, inf{an | n ∈ N} = 0, lim supn→∞ an = 5/2 and
lim infn→∞ an = 1/2.
9. For each of the following sequences (an)n∈N, an ∈ R, determine
sup{an | n ∈ N}, inf{an | n ∈ N}, lim supn→∞ an and lim infn→∞ an:
a) $a_n = 2 - \frac{n-1}{10}$;
b) $a_n = \frac{(-1)^{n-1}}{n+1}$;
c) $a_n = \frac{2}{3}\left(1 - \frac{1}{10^n}\right)$.
10. Let (an )n∈N be a sequence of real numbers converging to a, i.e.
limn→∞ an = a. Suppose that an ≤ a for all n ∈ N. Prove that
sup{an |n ∈ N} = a.
11. Let (an)n∈N be a sequence. Prove that a = lim supn→∞ an if and only
if for every ε > 0 the estimate an < a + ε holds for all but finitely many
n ∈ N and the estimate an > a − ε holds for infinitely many n ∈ N.
12. Let (an )n∈N and (bn )n∈N be two sequences and let λ > 0. Prove
a) lim supn→∞ (λan ) = λ lim supn→∞ an ;
b) lim supn→∞ (an + bn ) ≤ lim supn→∞ an + lim supn→∞ bn ;
c) lim supn→∞ (an + bn ) ≥ lim supn→∞ an + lim inf n→∞ bn ;
d) if limn→∞ bn = b, i.e. the limit exists, then
\[ \limsup_{n\to\infty} (a_n + b_n) = \limsup_{n\to\infty} a_n + \lim_{n\to\infty} b_n. \]
Hint: use Problem 11.
13. The set A := [0, 1] ∪ {2} ∪ (3, 4) ⊂ R is not an interval, hence not
connected. Give a splitting {O1 , O2} of A.
20 Continuous Functions
In Chapter 6 we encountered the concept of a continuous function, see Defi-
nition 6.9. This notion depends on the idea of a limit of a function (at some
point of its domain) which was introduced in Chapter 6. Recall: a function
f : D → R, D ⊂ R, has the limit a as y ∈ D approaches x if for every ε > 0
there exists δ > 0 such that 0 < |x − y| < δ implies |f(y) − a| < ε. First we
want to relate this definition to limits of sequences.
Theorem 20.1. Let D ⊂ R and f : D → R be a function and suppose that
for x ∈ R there exists a sequence (xk)k∈N, xk ∈ D, xk ≠ x, converging to x.
The function has the limit a ∈ R as y ∈ D approaches x, i.e. limy→x f(y) = a,
if and only if for every sequence (xn)n∈N, xn ∈ D \ {x}, converging to x, i.e.
limn→∞ xn = x, it follows that limn→∞ f(xn) = a.
Proof. Suppose that limy→x f(y) = a, i.e. for ε > 0 there exists δ > 0 such
that 0 < |y − x| < δ, y ∈ D, implies |f(y) − a| < ε. Let limn→∞ xn = x,
xn ∈ D \ {x}. Then there exists N = N(δ) such that for n ≥ N(δ) it follows
that 0 < |xn − x| < δ. By assumption it follows that |f(xn) − a| < ε for
n ≥ N(δ) = N(δ(ε)), i.e. limn→∞ f(xn) = a.
Suppose now that for every sequence (xn)n∈N, xn ∈ D \ {x}, with limn→∞ xn = x
it follows that limn→∞ f(xn) = a. We have to prove that for every ε > 0
there exists δ > 0 such that y ∈ D and 0 < |y − x| < δ implies |f(y) − a| < ε.
Suppose this does not hold. Then there exists ε > 0 such that for no value of
δ > 0 do we have |f(y) − a| < ε for all y ∈ D with 0 < |y − x| < δ. Thus for
every n ∈ N there exists xn ∈ D such that
\[ 0 < |x_n - x| < \frac{1}{n} \quad \text{and} \quad |f(x_n) - a| \ge \varepsilon. \]
This implies that limn→∞ xn = x and therefore limn→∞ f(xn) = a, but
|f(xn) − a| ≥ ε for all n ∈ N, which is a contradiction.
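The sequence criterion of Theorem 20.1 is often used in the negative direction: two sequences converging to x along which f has different limits show that limy→x f(y) does not exist. The sketch below illustrates this for f(y) = sin(1/y) at x = 0. This function is an example chosen here and does not come from the text, and the values are floating-point approximations.

```python
import math


def f(y):
    """Illustrative function sin(1/y), defined for y != 0."""
    return math.sin(1 / y)


# Two sequences in R \ {0} converging to 0.
zeros = [1 / (n * math.pi) for n in range(1, 200)]                    # f is 0 here
peaks = [1 / (math.pi / 2 + 2 * n * math.pi) for n in range(1, 200)]  # f is 1 here

values_on_zeros = [f(y) for y in zeros]
values_on_peaks = [f(y) for y in peaks]
```

Along the first sequence the values are 0 up to rounding, along the second they are 1 up to rounding, so by Theorem 20.1 this f has no limit at 0.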
We now have the following characterisations of continuity of f at a point x:
Theorem 20.2. A function f : D → R, D ⊂ R, is continuous at x ∈ D if

either of the following equivalent conditions holds:
i) for every ε > 0 there exists δ = δ(ε) > 0 such that for y ∈ D the
condition 0 < |y − x| < δ implies |f (y) − f (x)| < ε;
ii) for every sequence (xn )n∈N , xn ∈ D, converging to x ∈ D it fol-

lows that (f (xn ))n∈N converges to f (x), i.e. limn→∞ xn = x implies
limn→∞ f (xn ) = f (x).
Note that statement i) is just Definition 6.9.
Definition 20.3. We call f : D → R, D ⊂ R, continuous on D if f

is continuous for each x ∈ D. The set of all continuous functions on D is
denoted by C(D).
From Example 6.1.C we can deduce that every polynomial p : R → R is
continuous. In particular, this applies to the constant function x ↦ c, c ∈ R,
the identity x ↦ x and x ↦ x². Furthermore, it is easy to see that x ↦ |x| is
continuous on R. Indeed the converse triangle inequality yields ||x| − |y|| ≤
|x − y|, thus given ε > 0 choose δ = ε to find for 0 < |x − y| < δ that
||x| − |y|| ≤ |x − y| < ε.
Corollary 20.4. Let f : D → R be continuous at x ∈ D and f(x) ≠ 0.
Then f(y) ≠ 0 for all y in a neighbourhood of x, i.e. there exists δ > 0 such
that f(y) ≠ 0 for all y ∈ D, |x − y| < δ.
Proof. For ε := |f(x)| > 0 there exists δ > 0 such that y ∈ D and 0 <
|y − x| < δ implies |f(y) − f(x)| < ε. It follows that
\[ |f(y)| \ge |f(x)| - |f(y) - f(x)| > 0 \quad \text{for } y \in D,\ 0 < |y - x| < \delta. \]
Before we prove deeper results on continuous functions we want to investigate

more the concept of the limit of a function.
Let f : D → R, D ⊂ R, be a function and let x ∈ R be an accumulation
point of D in the sense that there exists a sequence (xk )k∈N , xk ∈ D \ {x},
such that limk→∞ xk = x. Let D1 , D2 ⊂ D be such that x is an accumulation
point of both D1 and D2 and suppose that D1 ∩ D2 = ∅. If limy→x f (y) = a
then limy→x f|D1(y) = a and limy→x f|D2(y) = a. Of special interest is the
case where D1 and D2 are subsets of open intervals with x being the right
end point of the interval containing D1 and the left end point of the interval
containing D2 , still x is supposed to be an accumulation point of D1 and D2 .
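[Figure 20.1: the real line with c1 < x < c2, where D1 lies in (c1, x) to the left of x and D2 lies in (x, c2) to the right of x.]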
If limy→x f (y) = a then in the case limy→x f |D1 (y) we are approaching x from
the left, i.e. y < x, and in the case of limy→x f |D2 (y) we are approaching x
from the right, i.e. x < y. This leads to
Definition 20.5. A. We say that f : D → R has a limit a from the right
at x if for every sequence (xn)n∈N, xn ∈ D and xn > x, with limn→∞ xn = x it
follows that limn→∞ f(xn) = a. We write
\[ \lim_{\substack{y \to x \\ y > x}} f(y) = a \quad \text{or} \quad \lim_{y \searrow x} f(y) = a. \tag{20.1} \]
In the case where a = f(x) we call f right continuous or continuous
from the right at x.
B. We say that f : D → R has a limit a from the left at x if for every
sequence (xn)n∈N, xn ∈ D and xn < x, with limn→∞ xn = x it follows that
limn→∞ f(xn) = a. We write
\[ \lim_{\substack{y \to x \\ y < x}} f(y) = a \quad \text{or} \quad \lim_{y \nearrow x} f(y) = a. \tag{20.2} \]
In the case where a = f(x) we call f left continuous or continuous from
the left at x.
Lemma 20.6. Let x ∈ R be an accumulation point of D ⊂ R and f : D → R
a function. The function f has a limit a at x if and only if it has a limit
from the right and a limit from the left at x and both coincide and are equal
to a.
Proof. We already know that if f has a limit a at x, then it also has a limit
from the right and from the left at x and these limits are equal to a. Now
suppose that
\[ \lim_{\substack{y \to x \\ y > x}} f(y) = \lim_{\substack{y \to x \\ y < x}} f(y) = a. \]
Let (xk)k∈N, xk ∈ D \ {x}, be any sequence converging to x. For ε > 0 there
exist δ1 > 0 and δ2 > 0 such that for xk ∈ D, xk > x and |xk − x| < δ1 it
follows that |f(xk) − a| < ε, and for xk ∈ D, xk < x and |xk − x| < δ2 it
follows that |f(xk) − a| < ε. Thus for δ = min(δ1, δ2) it follows that
xk ∈ D \ {x} and |xk − x| < δ implies |f(xk) − a| < ε. Since (xk)k∈N converges
to x, we find N ∈ N such that k ≥ N implies |xk − x| < δ, and hence
|f(xk) − a| < ε for k ≥ N, proving the lemma.
Definition 20.7. We say that f : D → R has a limit a at ∞ if for each se-

quence (xn )n∈N , xn ∈ D and limn→∞ xn = ∞, it follows that limn→∞ f (xn ) =
a. We write
\[ \lim_{y \to \infty} f(y) = a. \tag{20.3} \]
Analogously we define limy→−∞ f (y) = a.
Example 20.8. A. Consider x ↦ [x]. Then limx↘1 [x] = 1 and limx↗1 [x] = 0.
Indeed, for any sequence (xn )n≥0 , xn > 1 and limn→∞ xn = 1 it follows for n
sufficiently large that [xn ] = 1, if however xn → 1 and xn < 1 then [xn ] = 0
for n large.
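B. Let P(x) = x^k + a1 x^{k−1} + . . . + ak−1 x + ak, k ≥ 1, be a polynomial. It
follows that
\[ \lim_{x \to \infty} P(x) = \infty \]
and
\[ \lim_{x \to -\infty} P(x) = \begin{cases} +\infty & \text{for } k \text{ even} \\ -\infty & \text{for } k \text{ odd.} \end{cases} \]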
Proof. For x ≠ 0 we write
\[ P(x) = x^k g(x) = x^k \left( 1 + \frac{a_1}{x} + \frac{a_2}{x^2} + \ldots + \frac{a_k}{x^k} \right). \]
If x ≥ c := max(1, 2k|a1|, . . . , 2k|ak|), then |aj|/x^j ≤ |aj|/x ≤ 1/(2k) for
j = 1, . . . , k, and it follows that
\[ g(x) \ge \frac{1}{2}, \]
hence for these x we have
\[ P(x) \ge \frac{1}{2} x^k \ge \frac{x}{2}. \]
Thus, if xn → ∞ then P(xn) ≥ xn/2 → ∞, or limn→∞ P(xn) = ∞. Since
\[ P(-x) = (-1)^k Q(x) = (-1)^k \left( x^k - a_1 x^{k-1} + \ldots + (-1)^{k-1} a_{k-1} x + (-1)^k a_k \right) \]
the second statement follows from the first.
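The lower bound P(x) ≥ x^k/2 ≥ x/2 for x ≥ c can be checked on a concrete polynomial. The sketch below uses integer arithmetic and an arbitrarily chosen cubic; the helper names are illustrative and not taken from the text.

```python
def lower_bound_threshold(coeffs):
    """c = max(1, 2k|a_1|, ..., 2k|a_k|) for P(x) = x^k + a_1 x^(k-1) + ... + a_k.

    `coeffs` is the list [a_1, ..., a_k] of lower-order coefficients.
    """
    k = len(coeffs)
    return max([1] + [2 * k * abs(a) for a in coeffs])


def P(x, coeffs):
    # Horner evaluation of the monic polynomial with lower-order coefficients `coeffs`.
    value = 1
    for a in coeffs:
        value = value * x + a
    return value


# Illustration: P(x) = x^3 - 3x^2 + 2x - 5, so k = 3.
coeffs = [-3, 2, -5]
c = lower_bound_threshold(coeffs)
```

Here c = 30, and for every integer x ≥ 30 one finds 2 P(x) ≥ x³, which is the estimate g(x) ≥ 1/2 from the proof.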
Theorem 20.9. Let f, g : D → R be two functions continuous at x ∈ D,
and let λ ∈ R. The following functions are continuous at x:
f + g, λf, f · g.
In addition, if g(x) ≠ 0, then f/g is also continuous at x.
Proof. Let (xn)n∈N, xn ∈ D, be a sequence converging to x. It follows from
the limit theorems for sequences that
\begin{align*}
\lim_{n\to\infty} (f + g)(x_n) &= \lim_{n\to\infty} f(x_n) + \lim_{n\to\infty} g(x_n) = f(x) + g(x) = (f + g)(x), \\
\lim_{n\to\infty} (\lambda f)(x_n) &= \lambda \lim_{n\to\infty} f(x_n) = \lambda f(x), \\
\lim_{n\to\infty} (f \cdot g)(x_n) &= \lim_{n\to\infty} f(x_n) \cdot \lim_{n\to\infty} g(x_n) = f(x) \cdot g(x) = (f \cdot g)(x), \\
\lim_{n\to\infty} \frac{f}{g}(x_n) &= \frac{\lim_{n\to\infty} f(x_n)}{\lim_{n\to\infty} g(x_n)} = \frac{f(x)}{g(x)} = \frac{f}{g}(x),
\end{align*}
note that by assumption (f/g)(xn) is well defined for n large enough.
Remark 20.10. In case that f, g ∈ C(D), i.e. f and g are continuous on D,
then Theorem 20.9 implies that f + g, λ f , f · g ∈ C(D). Thus C(D) forms
an algebra with the natural operations. In particular C(D) is a vector space.
Corollary 20.11. All rational functions x ↦ P(x)/Q(x) where P and Q are
polynomials are continuous on the set R \ {x0 ∈ R | Q(x0) = 0}.
Theorem 20.12. Let f : D → R and g : E → R be two functions such that
f (D) ⊂ E. Suppose that f is continuous at x ∈ D and that g is continuous
at y := f (x) ∈ E. Then the function g ◦ f : D → R is continuous at x.
Proof. Let (xn )n∈N , xn ∈ D, be a sequence with limn→∞ xn = x. Since f is
continuous at x it follows that limn→∞ f (xn ) = f (x). Setting yn := f (xn )
it follows that limn→∞ yn = y and the continuity of g at y implies that
limn→∞ g(yn ) = g(y), hence
\[ \lim_{n\to\infty} g(f(x_n)) = \lim_{n\to\infty} (g \circ f)(x_n) = (g \circ f)(x). \]
Example 20.13. A. If f : D → R is continuous, then so is |f |.

B. The continuity of |f| : D → R, x ↦ |f(x)| however does not imply the
continuity of f .
Theorem 20.14. For a < b, let f : [a, b] → R be a continuous function

with f (a) < 0 and f (b) > 0 (or f (a) > 0 and f (b) < 0). Then there exists
ξ ∈ [a, b] such that f (ξ) = 0.
Proof. Suppose that f(a) < 0 and f(b) > 0. We will construct a sequence of
closed intervals ([an, bn])n∈N0 with the properties
(i) [an, bn] ⊂ [an−1, bn−1] for n ≥ 1;
(ii) bn − an = 2^{−n}(b − a);
(iii) f(an) ≤ 0 and f(bn) ≥ 0.
We start with [a0, b0] = [a, b]. Suppose that [an, bn] has already been
constructed and set m := (an + bn)/2. If f(m) ≥ 0, then take [an+1, bn+1] = [an, m],
if f(m) < 0, then take [an+1, bn+1] = [m, bn]. Obviously (i)–(iii) are fulfilled.
The sequence (bn)n∈N0 is monotone decreasing and bounded. The sequence
(an)n∈N0 is monotone increasing and bounded, hence both sequences are
convergent and because of (ii) they have the same limit. Let
\[ \xi := \lim_{n\to\infty} a_n = \lim_{n\to\infty} b_n. \]
Since f is continuous it follows that
\[ \lim_{n\to\infty} f(a_n) = \lim_{n\to\infty} f(b_n) = f(\xi). \]
In addition
\[ f(\xi) = \lim_{n\to\infty} f(a_n) \le 0 \le \lim_{n\to\infty} f(b_n) = f(\xi), \]
so that f(ξ) = 0.
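The interval halving in this proof is constructive and translates directly into a short routine. The following is a minimal sketch in floating-point arithmetic, so the limit ξ is only approximated; the function name and the test function x³ − 2 are choices made here for illustration.

```python
def bisect_zero(f, a, b, steps=50):
    """Interval halving as in the proof: keeps f(a_n) <= 0 <= f(b_n).

    Requires f(a) < 0 < f(b); returns the final pair (a_n, b_n).
    """
    if not (f(a) < 0 < f(b)):
        raise ValueError("need f(a) < 0 < f(b)")
    for _ in range(steps):
        m = (a + b) / 2
        if f(m) >= 0:
            b = m  # [a_{n+1}, b_{n+1}] = [a_n, m]
        else:
            a = m  # [a_{n+1}, b_{n+1}] = [m, b_n]
    return a, b


# Illustration: the zero of x^3 - 2 in [0, 2] is the cube root of 2.
a_n, b_n = bisect_zero(lambda x: x**3 - 2, 0.0, 2.0)
```

After n steps the interval has length 2^{−n}(b − a), which is property (ii), so both end points approximate ξ to within that length, up to rounding.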
Remark 20.15. This result allows us to decide whether the equation f (x) =
0 has a solution in the domain [a, b] of f . Suppose that for some c1 ∈ [a, b]
we have f (c1 ) > 0 (f (c1 ) < 0) and for some c2 ∈ [a, b], c2 > c1 we have
f (c2 ) < 0 (f (c2 ) > 0), then f |[c1,c2 ] satisfies the conditions of Theorem 20.14
and hence f (x) = 0 must have a solution ξ ∈ [c1 , c2 ] ⊂ [a, b].
Example 20.16. If f : R → R, x ↦ x^n + c1 x^{n−1} + . . . + cn is a polynomial

and n is odd, then f has a zero, i.e. there exists some z ∈ R such that
f (z) = 0. Indeed, since limx→∞ f (x) = +∞ and limx→−∞ f (x) = −∞, there
exists a closed interval [a, b], a < b, such that f (a) < 0 and f (b) > 0, which
implies the result by Theorem 20.14.
We can now provide a proof of the intermediate value theorem, see The-
orem 9.5.
Theorem 20.17. Let f : [a, b] → R, a < b, be a continuous function and
let η be any real number between f (a) and f (b). Then there exists ξ ∈ [a, b]
such that f (ξ) = η.
Proof. Suppose that f (a) < η < f (b) and define g : [a, b] → R by g(x) =
f (x) − η. Then it follows that g(a) < 0 < g(b) and Theorem 20.14 gives the
result, since g(ξ) = 0 if and only if f (ξ) = η.
Remark 20.18. A. The content of Theorem 20.17 allows the following refor-
mulation: the image of an interval under a continuous function is an interval.
In light of Theorem 19.25 we may further rephrase the result as: a continuous
function maps connected sets onto connected sets. In this formulation the
result has a generalisation far beyond the situation discussed so far.
B. We can use the intermediate value theorem to determine the range of a
function. Suppose that f : (a, b) → R is continuous and limx→a f(x) = −∞ as
well as limx→b f(x) = ∞. Then the range of f must be R. Indeed, given any
ξ ∈ R we can find a1 and b1 , a < a1 < b1 < b, such that f (a1 ) ≤ ξ ≤ f (b1 ).
Hence, by Theorem 20.17 there exists x0 ∈ [a1 , b1 ] ⊂ (a, b) such that f (x0 ) =
ξ, i.e. ξ is in the range of f . In Chapter 10 we have used this already to
determine the range of tan and cot.
We recall the definition of a bounded function, see Definition 8.2.
Definition 20.19. A function f : D → R is bounded if f (D) ⊂ R is
bounded, i.e. if there is M ≥ 0 such that
|f (x)| ≤ M for all x ∈ D .
Theorem 20.20. Every continuous function f : [a, b] → R defined on a closed
and bounded interval [a, b] is bounded and there are p, q ∈ [a, b] such that
\[ f(p) = \sup\{f(x) \mid x \in [a, b]\} = \max\{f(x) \mid x \in [a, b]\} \]
and
\[ f(q) = \inf\{f(x) \mid x \in [a, b]\} = \min\{f(x) \mid x \in [a, b]\}. \]
Proof. We prove the result for the maximum. For the minimum we only have
to consider −f instead of f. Set
\[ A := \sup\{f(x) \mid x \in [a, b]\} \in \mathbb{R} \cup \{\infty\}. \]
Take a sequence (xn)n∈N, xn ∈ [a, b], such that
\[ \lim_{n\to\infty} f(x_n) = A. \]
The sequence (xn)n∈N is bounded, hence by the Bolzano-Weierstrass theorem
there is a subsequence (xnk)k∈N converging to some p ∈ [a, b], i.e.
\[ \lim_{k\to\infty} x_{n_k} = p \in [a, b]. \]
The continuity of f implies now
\[ A = \lim_{k\to\infty} f(x_{n_k}) = f(p), \]
i.e. f(p) = sup f([a, b]) = max f([a, b]).

Continuous functions on bounded closed intervals have the “best” properties
you may imagine. The reason behind this is compactness, a notion we will
investigate now.
Definition 20.21. Let D ⊂ R be any set. We call a collection of open sets
Aν ⊂ R, ν ∈ I, an open covering of D if
\[ D \subset \bigcup_{\nu \in I} A_\nu. \]
Definition 20.22. A set K ⊂ R is compact if for every open covering
(Aν)ν∈I of K we may select a finite subcovering of K, i.e. there exist
ν1, . . . , νN ∈ I such that
\[ K \subset \bigcup_{k=1}^{N} A_{\nu_k}. \]
Remark 20.23. The important point in the definition of compactness is

that for every open covering we may select a finite subcovering of K.
Proposition 20.24. Every compact set K ⊂ R is bounded and closed.
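Proof. Since ((−n, n))n∈N is an open covering of K, we may select a finite
subcovering (−n1, n1), . . . , (−nN, nN) such that
\[ K \subset \bigcup_{k=1}^{N} (-n_k, n_k) = (-n_{N_0}, n_{N_0}) \]
where nN0 = max1≤k≤N nk. Thus K ⊂ (−nN0, nN0), and so
|x| ≤ nN0 for all x ∈ K, i.e. K is bounded. Next we prove that K^c is open.
Take x ∈ K^c. For every y ∈ K it follows that |x − y| > εy > 0 (for some
εy > 0) and the open intervals (x − εy/2, x + εy/2) and (y − εy/2, y + εy/2)
are disjoint.
[Figure 20.2: the points x and y on the real line, each surrounded by an open interval of radius εy/2; the two intervals are disjoint.]
Clearly ((y − εy/2, y + εy/2))y∈K is an open covering of K. By the compactness
of K we may take a finite subcovering
\[ \left( y_1 - \frac{\varepsilon_{y_1}}{2}, y_1 + \frac{\varepsilon_{y_1}}{2} \right), \ldots, \left( y_N - \frac{\varepsilon_{y_N}}{2}, y_N + \frac{\varepsilon_{y_N}}{2} \right) \]
of K. It follows that
\[ B_x := \bigcap_{j=1}^{N} \left( x - \frac{\varepsilon_{y_j}}{2}, x + \frac{\varepsilon_{y_j}}{2} \right) \]
is open and x ∈ Bx. In addition
\[ B_x \cap \bigcup_{j=1}^{N} \left( y_j - \frac{\varepsilon_{y_j}}{2}, y_j + \frac{\varepsilon_{y_j}}{2} \right) = \emptyset \]
implying that Bx ∩ K = ∅, or Bx ⊂ K^c. Thus we have proved that the
complement of K is open, i.e. K is closed.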
In preparing the converse to Proposition 20.24 we show
Proposition 20.25. Every bounded closed interval [a, b] ⊂ R is compact.
Proof. We prove the proposition by contradiction. Suppose that there is an
open covering (Aν)ν∈I of [a, b] which has no finite subcovering. For m = (a + b)/2
it follows that at least one of the intervals [a, m] and [m, b] cannot be covered
by a finite subcovering of (Aν)ν∈I. Call this interval I1. By induction we get
a sequence of closed intervals (Ij)j∈N with the following properties:
(i) [a, b] ⊃ I1 ⊃ I2 ⊃ . . .
(ii) Ij is not covered by a finite subcovering of (Aν)ν∈I
(iii) for x, y ∈ Ij it follows that |x − y| ≤ 2^{−j}(b − a).
By the principle of nested intervals, Theorem 17.15, there is one point x0
which lies in $\bigcap_{j \in \mathbb{N}} I_j$. Therefore, for some index j0 ∈ I we have
x0 ∈ Aj0. Since Aj0 is open there is some ε > 0 such that |y − x0| < ε implies
y ∈ Aj0. Taking n such that 2^{−n}(b − a) < ε, it follows from (iii) that
In ⊂ Aj0 which contradicts (ii).
Now we may prove the famous Heine-Borel Theorem.

Theorem 20.26. A set K ⊂ R is compact if and only if it is bounded and
closed.
Proof. We know already that compact sets are bounded and closed, so it
remains to prove that a closed and bounded set is compact. Let (Aν)ν∈I be
an open covering of the closed and bounded set K. Since K is bounded, there
exists a closed interval [a, b] ⊂ R such that K ⊂ [a, b]. The family of open
sets (Aν)ν∈I, together with Ap := R \ K, where p is an index not in I, forms
an open covering of R, since
\[ \bigcup_{\nu \in I} A_\nu \cup A_p \supset K \cup K^c = \mathbb{R}. \]
Therefore, (Aν)ν∈I∪{p} is also an open covering
of [a, b] and by Proposition 20.25 it contains a finite subcovering (Aνj)νj∈IN
where IN is a finite subset of I ∪ {p}. If p ∈ IN, then, since K ∩ Ap = ∅, we
can remove Ap and we still have a finite covering of K.
Our first application of compactness is related to uniform continuity.
Definition 20.27. A function f : D → R is called uniformly continuous
on D if for every ε > 0 there exists δ > 0 such that for x, y ∈ D the inequality
|x − y| < δ implies |f (x) − f (y)| < ε.
Remark 20.28. A. The important difference of continuity on D and uniform
continuity lies in the fact that in the latter case δ is independent of x ∈ D.
B. If f : D → R is uniformly continuous on D, then it is obviously continuous
on D. However the converse is false.
Example 20.29. The function f : (0, 1] → R, x ↦ 1/x, is continuous on (0, 1],
but not uniformly continuous.
Indeed, for p ∈ (0, 1] and ε > 0 it follows with δ := min(p/2, p²ε/2) that
\[ |f(x) - f(p)| = \left| \frac{1}{x} - \frac{1}{p} \right| = \frac{|x - p|}{xp} \le \frac{2|x - p|}{p^2} < \frac{2\delta}{p^2} \le \varepsilon, \]
where we use that |x − p| < δ ≤ p/2 implies −p/2 < x − p or p/2 < x, i.e. 1/x < 2/p.
Thus, f is continuous on (0, 1].
Now, suppose that f is uniformly continuous on (0, 1]. Then there would be
some δ > 0 such that for all x, y ∈ (0, 1] and |x − y| < δ it would follow that
\[ |f(x) - f(y)| = \left| \frac{1}{x} - \frac{1}{y} \right| < 1. \]
For n ∈ N we have
\[ \left| \frac{1}{n} - \frac{1}{2n} \right| = \frac{1}{2n} \quad \text{and} \quad \left| \frac{1}{1/n} - \frac{1}{1/(2n)} \right| = n, \]
thus for 1/(2n) < δ it follows that
\[ \left| f\left(\frac{1}{n}\right) - f\left(\frac{1}{2n}\right) \right| = n \ge 1, \]
which contradicts |f(x) − f(y)| < 1.
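Both halves of Example 20.29 can be tried out with exact rational arithmetic. In the sketch below, `delta_for` implements the point-dependent δ from the example and `gap` evaluates the difference used in the contradiction; the names and the sample points are choices made here, not part of the text.

```python
from fractions import Fraction


def delta_for(p, eps):
    """The delta from the example: min(p/2, p^2 * eps / 2)."""
    return min(p / 2, p * p * eps / 2)


def gap(n):
    """|f(1/n) - f(1/(2n))| for f(x) = 1/x; this equals n."""
    return abs(1 / Fraction(1, n) - 1 / Fraction(1, 2 * n))


eps = Fraction(1, 10)
points = [Fraction(1, k) for k in range(2, 50)]

# For each p, test two points x with |x - p| = (9/10) * delta < delta.
continuity_ok = all(
    abs(1 / x - 1 / p) < eps
    for p in points
    for x in (p - delta_for(p, eps) * Fraction(9, 10),
              p + delta_for(p, eps) * Fraction(9, 10))
)
```

The admissible δ shrinks like p² as p approaches 0, and `gap(n)` equals n although the two arguments are only 1/(2n) apart, so no single δ can serve all of (0, 1].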
Theorem 20.30. Every continuous function f : K → R on a compact set
K ⊂ R is uniformly continuous and bounded.
Proof. Let ε > 0. Since f is continuous, for each x ∈ K there is δx,ε > 0 such
that y ∈ K and |x − y| < δx,ε implies |f(x) − f(y)| < ε/2. Denote by I(x) the
interval (x − δx,ε/2, x + δx,ε/2). Clearly (I(x))x∈K is an open covering of K. By
compactness there is a finite subcovering
\[ \left( x_l - \frac{\delta_{x_l,\varepsilon}}{2}, x_l + \frac{\delta_{x_l,\varepsilon}}{2} \right)_{l \in \{1, \ldots, N\}}. \]
Take δ := ½ min(δx1,ε, . . . , δxN,ε). For x, y ∈ K with |x − y| < δ it follows
that for some 1 ≤ j ≤ N we have
\[ x \in \left( x_j - \frac{\delta_{x_j,\varepsilon}}{2}, x_j + \frac{\delta_{x_j,\varepsilon}}{2} \right) \]
and further
\[ |x_j - y| \le |x - y| + |x - x_j| < \delta + \frac{\delta_{x_j,\varepsilon}}{2} \le \delta_{x_j,\varepsilon}, \]
and therefore
\[ |f(y) - f(x)| \le |f(y) - f(x_j)| + |f(x) - f(x_j)| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon \]
proving that f is uniformly continuous. Next we prove that f is bounded. For
ε = 1 and x ∈ K there exists δx > 0 such that y ∈ K and |x − y| < δx implies
|f(x) − f(y)| < 1. The intervals J(x) := (x − δx, x + δx), x ∈ K, form an open
covering of K. Hence, since K is compact, we can cover K by finitely many of
these intervals, say J(x1), . . . , J(xN). For y ∈ J(xj) ∩ K we have
|f(y) − f(xj)| < 1 or |f(y)| ≤ 1 + |f(xj)|, implying
|f(y)| ≤ 1 + max1≤j≤N |f(xj)| for all y ∈ K.
Finally in this chapter we prove
Theorem 20.31. Let f : [a, b] → R, f([a, b]) = [A, B], have an inverse
function f^{−1}, i.e. f^{−1} : [A, B] → R and f ◦ f^{−1} = id[A,B] and f^{−1} ◦ f = id[a,b].
If f is continuous, so is f^{−1}.
Proof. Suppose that f^{−1} is not continuous. Then there is y ∈ [A, B] and a
sequence (yn)n∈N, yn ∈ [A, B], such that limn→∞ yn = y and for some ε > 0
\[ |f^{-1}(y_n) - f^{-1}(y)| \ge \varepsilon \quad \text{for all } n \in \mathbb{N}. \]
Since f^{−1}(yn) ∈ [a, b], a subsequence (f^{−1}(ynk))k∈N converges by the
Bolzano-Weierstrass theorem:
\[ \lim_{k\to\infty} f^{-1}(y_{n_k}) = c, \]
and |c − f^{−1}(y)| ≥ ε. Further f(f^{−1}(ynk)) = ynk and the continuity of f
implies
\[ y = \lim_{k\to\infty} y_{n_k} = \lim_{k\to\infty} f(f^{-1}(y_{n_k})) = f(c), \]
i.e. f^{−1}(y) = f^{−1}(f(c)) = c contradicting |c − f^{−1}(y)| ≥ ε and the theorem
is proved.
Problems
1. Let f : [a, b] → R, a < b, be a function. Prove that f is continuous
at x ∈ [a, b] if and only if for every sequence (xn )n∈N , xn ∈ (a, b),
converging to x the following holds
\[ \lim_{n\to\infty} f(x_n) = f\left( \lim_{n\to\infty} x_n \right). \]
2.* Let D ⊂ R be an open set. Prove that f : D → R is continuous if and

only if the pre-image of every open set in R is again open, i.e. f −1 (U)
is open whenever U ⊂ R is open.
3. Give an ε−δ definition for f : D → R having a right (left) limit at

x ∈ D.
4. a) Consider the function χ[0,1]∩Q : [0, 1] → R, i.e.
\[ \chi_{[0,1] \cap \mathbb{Q}}(x) = \begin{cases} 1, & x \in [0, 1] \cap \mathbb{Q} \\ 0, & x \in [0, 1] \text{ and } x \notin \mathbb{Q}. \end{cases} \]
Prove that χ[0,1]∩Q is not continuous at any point x ∈ [0, 1].

b) Define f : R → R by
\[ f(x) := \begin{cases} x, & x \in \mathbb{Q} \\ 0, & x \in \mathbb{R} \setminus \mathbb{Q}. \end{cases} \]
Prove that f is only continuous at x = 0.

5. Let g : [0, 1] → R be an arbitrary bounded function. Prove that f :
[0, 1] → R, f (x) = xg(x), is continuous at x = 0.
6.* a) Let f : (a, b) → R be a monotone function and x0 ∈ (a, b). Prove
that
\[ \lim_{\substack{x \to x_0 \\ x > x_0}} f(x) \quad \text{exists.} \]
b) Let f : D → R be a function and x ∈ D. We call x a point

of discontinuity of f if f is not continuous at x. Now let I ⊂ R be
an interval (bounded or unbounded) and let g : I → R be a monotone
function. Prove that g has at most countably many points of discontinuity.
7.* Let I ⊂ R be an interval (bounded or unbounded). We call f : I → R
a càdlàg function (continu à droite, limites à gauche) if for all x ∈ I
the function f is continuous from the right and has a limit from the
left, i.e.
\[ f_r := \lim_{\substack{y \to x \\ y > x}} f(y) = f(x) \quad \text{and} \quad f_l := \lim_{\substack{y \to x \\ y < x}} f(y) \ \text{exists.} \]
Prove that if f : I → R is a monotone function then there exists a

monotone function h : I → R which is càdlàg and coincides with f
apart from at a countable number of points of discontinuity.
Hint: use the result of Problem 6.
8. a) Let f, g : D → R be two continuous functions. Prove that ϕ, ψ :

D → R defined by ϕ(x) = max(f (x), g(x)) and ψ(x) = min(f (x), g(x))
are also continuous on D.
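b) Let f : D → R be a function and define
\[ f_+(x) := \begin{cases} f(x), & \text{if } f(x) \ge 0 \\ 0, & \text{if } f(x) < 0 \end{cases} \]
and
\[ f_-(x) := \begin{cases} -f(x), & \text{if } f(x) \le 0 \\ 0, & \text{if } f(x) > 0. \end{cases} \]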
Show that f = f+ − f− and |f | = f+ + f− . Moreover, show that f is

continuous if and only if f+ and f− are continuous.
9. Let f, g : [a, b] → R be two continuous functions and suppose that

f |[a,b]∩Q = g|[a,b]∩Q. Prove that f = g, i.e. if f and g coincide on
rational points of their domains, then they coincide everywhere.
10. Let f : [0, a] → R be a continuous function. Prove that f has a unique

continuous extension to [−a, a] as an even function and that f − f (0)
has a unique continuous extension to [−a, a] as an odd function.
11. Let D ⊂ R be a non-empty set and C(D) the set of all continuous
functions f : D → R.
a) Prove that C(D) with its natural operations forms an R-algebra.
b) Let a : D → R be a fixed continuous function and define Aop :
C(D) → C(D) by Aop u = au, i.e. Aop u(x) = a(x)u(x). Prove that Aop
is a linear operator on C(D).
12. Let f : D → R, D ⊂ R, be a function. We call x0 ∈ D a fixed point

of f if f (x0 ) = x0 .
a) Give a geometric or graphical interpretation of a fixed point.
b) Prove that h : D → R has an a-point, i.e. there exists x0 ∈ D
such that h(x0 ) = a, if and only if g : D → R, g(x) = h(x) + x − a has
a fixed point.
13. a) Let f, g : [a, b] → R be two continuous functions such that

f (a) < g(a) and f (b) > g(b). Prove that there exists x0 ∈ [a, b] such
that f (x0 ) = g(x0 ).
b) Prove that there exists at least one x0 ∈ [π/2, 3π/2] solving the
equation $\sin x = \frac{1}{2 + \cos 4x}$.
14. a) Consider the two sets A := {1/n | n ∈ N} ⊂ R and B := A ∪ {0}.

Using the basic definition of compactness prove that A is not compact
but B is.
b) Let (an )n∈N , an ∈ R, be a sequence of real numbers converging
to a0 ∈ R. Prove that {ak |k ∈ N0 } is compact.
15. For every N ∈ N an open covering of (0, 1) is given by (Ux)x∈[0,1],
Ux = (x − 3/(4N), x + 3/(4N)). Prove that (Uk/N)k=0,...,N is an open
subcovering of (0, 1) but (0, 1) is not compact.

16. Let (Kν)ν∈I be a family of compact sets Kν ⊂ R. Prove that
$\bigcap_{\nu \in I} K_\nu$ is compact, but in general $\bigcup_{\nu \in I} K_\nu$ is not compact.
17. a) Let f : K → R be a continuous function defined on a compact

set K. If f (x) > 0 for all x ∈ K then there exists α > 0 such that
f (x) ≥ α > 0 for all x ∈ K, i.e. if a continuous function is strictly
positive on a compact set, it is bounded away from 0.
b) Prove that if f : D → R is uniformly continuous and D is
bounded, then f is bounded.
18. For a ∈ R consider f : [−a, ∞) → R, f(x) = √(x + a). Prove that f is
uniformly continuous.
19. Let f : [a, b] → R be a continuous function. We call f piecewise linear

if there exists a partition a = x0 < x1 < · · · < xN = b of [a, b] and
real numbers αk and βk such that f |[xk−1,xk ] = αk x + βk , k = 1, . . . , N.
Let g : [a, b] → R be a continuous function. Prove that for every ε > 0
there exists a piecewise linear function ϕ : [a, b] → R such that for all
x ∈ [a, b]
\[ |g(x) - \varphi(x)| \le \varepsilon. \]
20. We call a function f : D → R Lipschitz continuous if for some κ > 0

we have |f (x) −f (y)| ≤ κ|x−y| for all x, y ∈ D. Prove that a Lipschitz
continuous function is uniformly continuous.
21. a) Let f : D → R be a uniformly continuous function. Prove that
for every D′ ⊂ D the function f|D′ : D′ → R is uniformly continuous
too.
b) Let g : (a, b] → R be a continuous function. Suppose that
\[ \lim_{\substack{x \to a \\ x > a}} g(x) \]
exists. Prove that g is uniformly continuous.
Hint: show that g has a continuous extension to [a, b].
21 Differentiation
Let D ⊂ R and f : D → R be a function. We know by examples that even
continuous functions may look rather complicated. Thus we may ask the
question whether it is possible to approximate locally a given function by a
simpler function. Obviously straight lines (considered as graphs of functions)
are the simplest functions on R . They are given by
ga,b : R → R
x → ax + b
with a, b ∈ R. For a moment we take a more general point of view and
relate our considerations to linear algebra. Let (V, R) be an n-dimensional
vector space over the reals. A mapping A : V → V is
called linear if A(λx + μy) = λAx + μAy holds for all λ, μ ∈ R and x, y ∈ V.
Choosing a fixed basis in V we know that with respect to this basis A has a
representation as an n×n-matrix. Now, R is a real vector space of dimension
1 and taking 1 ∈ R as basis any matrix is just a real number.
Thus all linear mappings Aa : R → R have the matrix representation x →
Aa x = ax where a ∈ R represents Aa .
Therefore we may interpret a straight line as the graph of the composition
of two mappings: A linear mapping x → ax and a translation Tb : R → R,
x → x + b, i.e. we consider
Tb ◦ Aa : R → R
x → Tb (Aa x) = Tb (ax) = ax + b.
We call these mappings the affine mappings ha,b := Tb ◦ Aa , a, b ∈ R, on R.
Thus straight lines are the graphs of affine maps. More generally:
Definition 21.1. Let (V, R) be a vector space over R. We call F : V → V
an affine mapping if F x = Ax + b holds for a linear mapping A : V → V
and a vector b ∈ V .
Let us return to our original problem. Given f : D → R and x0 ∈ D. We
are looking for an affine mapping ha,b : R → R such that in a neighbourhood
of x0 the function x → ha,b (x) is a good approximation of x → f (x). In
particular we require f (x0 ) = ha,b (x0 ). Thus in a neighbourhood of x0 we
want to have that
|f (x) − ha,b (x)| = |f (x) − (ax + b)| is small, and
f (x0 ) = ax0 + b.
Thus
|f (x) − f (x0 ) − a(x − x0 )| should be small,
which leads to
\[ \left| \frac{f(x) - f(x_0)}{x - x_0} - a \right| \, |x - x_0| \]
should be small. Now suppose that φx0 : D → R is a function such that
\[ \lim_{x \to x_0} \frac{\varphi_{x_0}(x)}{x - x_0} = 0. \tag{21.1} \]
Consider for some c

f (x) = f (x0 ) + c(x − x0 ) + φx0 (x).
It follows that
\[ \frac{f(x) - f(x_0)}{x - x_0} - c = \frac{\varphi_{x_0}(x)}{x - x_0}. \tag{21.2} \]
Since by our assumption lim_{x→x0} φx0(x)/(x − x0) = 0, we find that in a neighbourhood
of x0 the expression
\[ \frac{f(x) - f(x_0)}{x - x_0} - c \]
will be small and x → hc,f (x0 )−cx0 (x) would be locally an affine linear approx-
imation of f at x0 . However, in order that we may argue as before, it is clear
from (21.2) and (21.1) that
\[ \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} = c \tag{21.3} \]
must hold.
The existence of the limit (21.3) is by no means clear. Take the function
x → |x| and x0 = 0. For x > 0 we find
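\[ \lim_{\substack{x \to x_0 \\ x > 0}} \frac{f(x) - f(x_0)}{x - x_0} = \lim_{\substack{x \to x_0 \\ x > 0}} \frac{|x| - |0|}{x - 0} = \lim_{\substack{x \to x_0 \\ x > 0}} \frac{x}{x} = 1 \]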
but for x < 0 we have

\[ \lim_{\substack{x \to x_0 \\ x < 0}} \frac{f(x) - f(x_0)}{x - x_0} = \lim_{\substack{x \to x_0 \\ x < 0}} \frac{-x}{x} = -1, \]
thus lim_{x→0} |x|/x does not exist. The following definition is crucial.
Definition 21.2. Let D ⊂ R and f : D → R be a function. We call f

differentiable at x0 if the limit
\[ f'(x_0) := \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} \tag{21.4} \]
exists. The number f'(x0) is called the derivative of f at x0. If f is
differentiable at all points of D we call the function f' : D → R, x ↦ f'(x), the
derivative of f.
Instead of f'(x0) or f' we will also write (df/dx)(x0) or df/dx, respectively.
From the previous considerations the following theorem is almost clear.
Theorem 21.3. Let D ⊂ R, x0 ∈ D, and suppose that there is at least one
sequence (xn )n∈N , xn ∈ D \ {x0 } converging to x0 . A function f : D → R is
differentiable at x0 if and only if there is a constant c ∈ R and a function
φx0 : D → R satisfying
\[ \lim_{x \to x_0} \frac{\varphi_{x_0}(x)}{x - x_0} = 0, \tag{21.5} \]
such that
\[ f(x) = f(x_0) + c(x - x_0) + \varphi_{x_0}(x). \tag{21.6} \]
In this case we have c = f'(x0).
Proof. Suppose that f is differentiable at x0 ∈ D and set c = f'(x0). Defining
\[ \varphi_{x_0}(x) := f(x) - f(x_0) - f'(x_0)(x - x_0) \]
we find
\[ \lim_{x \to x_0} \frac{\varphi_{x_0}(x)}{x - x_0} = \lim_{x \to x_0} \left( \frac{f(x) - f(x_0)}{x - x_0} - f'(x_0) \right) = 0, \]
and obviously
\[ f(x) = f(x_0) + f'(x_0)(x - x_0) + \varphi_{x_0}(x). \]
Now suppose that (21.6) holds with φx0 satisfying (21.5). Then we find
immediately that

\[ \lim_{x \to x_0} \left( \frac{f(x) - f(x_0)}{x - x_0} - c \right) = \lim_{x \to x_0} \frac{\varphi_{x_0}(x)}{x - x_0} = 0 \]
or
\[ \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} = c, \quad \text{i.e. } c = f'(x_0). \]
Thus the concept of differentiation is that of an affine approximation. It is
more convenient to speak about linear approximations with the interpretation
that f(x) − f(x0) is approximated by f'(x0)(x − x0) and f'(x0) : R → R
is considered to be a linear mapping.
We may also provide a geometric interpretation. Suppose that f'(x0) exists.
The term
\[ \frac{f(x) - f(x_0)}{x - x_0}, \]
which is called the difference quotient of f at x0 , gives the slope of the
straight line through the points (x0 , f (x0 )) and (x, f (x)):
[Figure 21.1: the straight line through (x0, f(x0)) and (x, f(x)); axes: x-axis, y-axis; marked values y0 = f(x0) and y = f(x).]
In the limiting case, if f is differentiable at x0, for x → x0 these straight
lines will “converge” to a straight line with slope f'(x0) passing through
(x0, f(x0)). This straight line is called the tangent line of (the graph of) f
at the point (x0, f(x0)):
Definition 21.4. Let f : D → R be a function differentiable at x0 ∈ D. The

graph of the function gx0 : R → R, gx0(t) = f'(x0)(t − x0) + f(x0), is called
the tangent or tangent line of f at x0 .
Note that the tangent is defined after we have introduced the notion of dif-
ferentiability. An important consequence of Theorem 21.3 is
Corollary 21.5. A function differentiable at x0 is continuous at x0 .
Proof. Since f is differentiable at x0 we have
\[ f(x) = f(x_0) + f'(x_0)(x - x_0) + \varphi_{x_0}(x) \]
where lim_{x→x0} φx0(x)/(x − x0) = 0. In particular we have lim_{x→x0} φx0(x) = 0 which
leads to
\[ \lim_{x \to x_0} f(x) = f(x_0) + f'(x_0) \lim_{x \to x_0} (x - x_0) + \lim_{x \to x_0} \varphi_{x_0}(x) = f(x_0). \]
Let us recollect some concrete derivatives already discussed in Chapter 6,

Example 6.1.
Example 21.6. A. The function f1 : R → R, x ↦ c ∈ R, is differentiable on R and
we have f1'(x) = 0 for all x:
\[ f_1'(x_0) = \lim_{x \to x_0} \frac{f_1(x) - f_1(x_0)}{x - x_0} = \lim_{x \to x_0} \frac{c - c}{x - x_0} = 0. \]
In this case φx0 (x) = 0 for all x ∈ R.

B. The function f2 : R → R, x ↦ cx, is differentiable on R and we have
f2'(x) = c for all x:
\[ f_2'(x_0) = \lim_{x \to x_0} \frac{f_2(x) - f_2(x_0)}{x - x_0} = \lim_{x \to x_0} \frac{c(x - x_0)}{x - x_0} = c. \]
Again we find φx0 (x) = 0 for all x ∈ R. This should not be a surprise: since
f is linear, i.e. its graph is a straight line, we just take this straight line as
an approximation.
C. For f3 : R → R, x ↦ x^2, we find
\[ \frac{f_3(x) - f_3(x_0)}{x - x_0} = \frac{x^2 - x_0^2}{x - x_0} = \frac{(x - x_0)(x + x_0)}{x - x_0} = x + x_0 \]
which yields
\[ f_3'(x_0) = \lim_{x \to x_0} \frac{x^2 - x_0^2}{x - x_0} = \lim_{x \to x_0} (x + x_0) = 2x_0. \]
Here we have φx0(x) = (x − x0)^2.

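D. For f4, x ↦ 1/x, defined on R \ {0} we find
\[ \frac{\frac{1}{x} - \frac{1}{x_0}}{x - x_0} = \frac{\frac{x_0 - x}{x x_0}}{x - x_0} = -\frac{1}{x x_0}, \]
and it follows for x0 ≠ 0 that
\[ f_4'(x_0) = \lim_{x \to x_0} \frac{\frac{1}{x} - \frac{1}{x_0}}{x - x_0} = \lim_{x \to x_0} \left( -\frac{1}{x x_0} \right) = -\frac{1}{x_0^2}. \]
In this case we find
\[ \varphi_{x_0}(x) = \left( \frac{1}{x_0^2} - \frac{1}{x x_0} \right)(x - x_0). \]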
Note that x → |x| provides us with a function which is not differentiable at

x0 = 0.
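The limits in Example 21.6 can also be observed numerically. The following is a minimal sketch, not part of the text: the helper names `difference_quotient` and `one_sided_quotients` and the chosen step sizes are assumptions made for illustration. It evaluates the difference quotient from the right and from the left for shrinking |h|.

```python
def difference_quotient(f, x0, h):
    """Slope of the straight line through (x0, f(x0)) and (x0 + h, f(x0 + h))."""
    return (f(x0 + h) - f(x0)) / h


def one_sided_quotients(f, x0, steps=(1e-1, 1e-3, 1e-5)):
    """Return (right, left): difference quotients for h > 0 and for h < 0."""
    right = [difference_quotient(f, x0, h) for h in steps]
    left = [difference_quotient(f, x0, -h) for h in steps]
    return right, left


if __name__ == "__main__":
    examples = [
        ("x^2 at x0 = 3", lambda x: x * x, 3.0),   # expected slope 2 * x0
        ("1/x at x0 = 2", lambda x: 1.0 / x, 2.0),  # expected slope -1 / x0^2
        ("|x| at x0 = 0", abs, 0.0),                # one-sided slopes differ
    ]
    for name, f, x0 in examples:
        right, left = one_sided_quotients(f, x0)
        print(name, "right:", right, "left:", left)
```

For x ↦ |x| at x0 = 0 the two lists stay at +1 and −1 respectively, which is the numerical counterpart of the statement that the limit of the difference quotient does not exist there.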
Elementary rules for differentiation are collected in the following theorem;
all results were proved in Chapters 6 and 7.
Theorem 21.7. Let f, g : D → R be differentiable at x0 ∈ D and λ ∈ R.
Then the functions
f + g , λf , f · g : D → R
are differentiable at x0 and we have
\[ (f + g)'(x_0) = f'(x_0) + g'(x_0); \tag{21.7} \]
\[ (\lambda f)'(x_0) = \lambda f'(x_0); \tag{21.8} \]
\[ (f \cdot g)'(x_0) = f'(x_0) g(x_0) + f(x_0) g'(x_0) \quad \text{(Leibniz's rule)}. \tag{21.9} \]
If in addition g(x0) ≠ 0, then x ↦ f(x)/g(x) is also differentiable at x0 and we
have
\[ \left( \frac{f}{g} \right)'(x_0) = \frac{f'(x_0) g(x_0) - f(x_0) g'(x_0)}{g^2(x_0)}. \tag{21.10} \]

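Corollary 21.8. All polynomial functions x ↦ \sum_{ν=0}^{m} a_ν x^ν are differentiable
and so are all rational functions
\[ x \mapsto \frac{\sum_{\nu=0}^{m} a_\nu x^\nu}{\sum_{\mu=0}^{k} b_\mu x^\mu} \]
on their domain, i.e. the set {x ∈ R | \sum_{μ=0}^{k} b_μ x^μ ≠ 0}. Moreover, we have for n ∈ N
\[ (x^n)' = n x^{n-1}, \quad x \in \mathbb{R}, \]
and
\[ \left( \frac{1}{x^n} \right)' = -n x^{-n-1}, \quad x \in \mathbb{R} \setminus \{0\}. \]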
Next we recollect the chain rule, see Theorem 7.3.

Theorem 21.9. Let f : D → R and g : E → R be two functions such
that f (D) ⊂ E. Suppose that f is differentiable at x0 ∈ D and that g is
differentiable at y0 := f (x0 ) ∈ E. Then the function
g ∘ f : D → R
is differentiable at x0 and we have
\[ (g \circ f)'(x_0) = g'(f(x_0)) f'(x_0). \tag{21.11} \]
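Example 21.10. For α ∈ R consider f : (0, ∞) → R, x ↦ x^α.
Since x^α = e^{α ln x} = exp(α ln x) we find
\[ \frac{d}{dx} x^\alpha = \frac{d}{dx} \exp(\alpha \ln x) = \exp'(\alpha \ln x) \, \frac{d}{dx}(\alpha \ln x) = \frac{\alpha}{x} \exp(\alpha \ln x) = \frac{\alpha}{x} x^\alpha = \alpha x^{\alpha - 1}. \]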
Furthermore we restate Theorem 7.5:
Theorem 21.11. For a strictly monotone and continuous function
f : I → R which is differentiable at x0 with f'(x0) ≠ 0 the inverse function
is differentiable at y0 = f(x0) and
\[ (f^{-1})'(y_0) = \frac{1}{f'(f^{-1}(y_0))} \tag{21.12} \]
holds.
Let f : D → R be a differentiable function. Then f' : D → R is again
a function and we may ask whether f' is differentiable. If yes, we call the
derivative (f')' of f' the second derivative of f and write simply f''. By
induction we define higher order derivatives (if they exist)
\[ \frac{d^k}{dx^k} f(x) := f^{(k)}(x) := \frac{d}{dx} \left( \frac{d^{k-1}}{dx^{k-1}} f(x) \right) = \frac{d}{dx} f^{(k-1)}(x). \]
If f has a k-th derivative as defined above we call f k-times differentiable.
Corollary 21.12. Let f, g : D → R be two k-times differentiable functions.

Then f · g : D → R is k-times differentiable and we have
\[ \frac{d^k}{dx^k} (f \cdot g)(x) = \sum_{l=0}^{k} \binom{k}{l} f^{(k-l)}(x) g^{(l)}(x). \]
Exercise 21.13. Prove Corollary 21.12.

Let us introduce some useful notations. For any interval I ⊂ R and k ∈ N
we denote the set of functions f that are k-times differentiable with the k-th
derivative continuous on I by C^k(I). Clearly, C^k(I) is a vector space over R,
and Leibniz's rule (Corollary 21.12) shows that C^k(I) is an algebra. Further
C(I) = C^0(I) denotes the space of all continuous functions f : I → R, and
we set
\[ C^\infty(I) := \bigcap_{k \in \mathbb{N}} C^k(I). \]
Moreover we set
\[ C_b^k(I) := \{ u \in C^k(I) \mid u^{(l)} \text{ is bounded for } l = 0, \dots, k \}, \]
recall u^{(0)} := u.
Exercise 21.14. Show that for 0 ≤ k ≤ ∞ the set C^k(I) with its natural
operations, i.e. pointwise addition and multiplication, is an R-algebra.
Note that functions that are differentiable need not have continuous deriva-
tives.
Example 21.15. Let
\[ f(x) := \begin{cases} x^2 \sin\left(\frac{1}{x}\right), & x \neq 0 \\ 0, & x = 0. \end{cases} \]
Then, by the rules, f is differentiable for all x ≠ 0. In fact
\[ f'(x) = 2x \sin\frac{1}{x} - \cos\frac{1}{x} \]
and we have
\[ f'(0) = \lim_{x \to 0} \frac{x^2 \sin\frac{1}{x} - 0}{x} = \lim_{x \to 0} x \sin\frac{1}{x} = 0. \]
But f'(x) does not have a limit as x → 0.
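To see why the derivative in Example 21.15 has no limit at 0, one can evaluate it along two sequences tending to 0. The sketch below is only an illustration; the function names `f_prime` and `sample_near_zero` and the choice of the two sequences 1/(2πn) and 1/((2n+1)π) are assumptions made here, not taken from the text.

```python
import math


def f_prime(x):
    """Derivative of x^2 sin(1/x) for x != 0; the value 0 at x = 0 comes from the difference quotient."""
    if x == 0.0:
        return 0.0
    return 2.0 * x * math.sin(1.0 / x) - math.cos(1.0 / x)


def sample_near_zero(n):
    """Evaluate f' at 1/(2*pi*n) and at 1/((2n+1)*pi); both points tend to 0 as n grows."""
    a = 1.0 / (2.0 * math.pi * n)
    b = 1.0 / ((2 * n + 1) * math.pi)
    return f_prime(a), f_prime(b)


if __name__ == "__main__":
    for n in (1, 10, 1000):
        print(n, sample_near_zero(n))
```

Along the first sequence cos(1/x) = 1, so the values stay near −1; along the second cos(1/x) = −1, so they stay near +1. Two sequences with different limits rule out the existence of lim f'(x) as x → 0.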
Finally we want to state without proof a formula which allows us to calculate

higher order derivatives of composed functions. Let I1 ⊂ R and I2 ⊂ R be two
intervals and f : I2 → R and g : I1 → R be two arbitrarily often differentiable
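functions such that g(I1) ⊂ I2. The n-th derivative of h := f ∘ g : I1 → R is
given by
\[ h^{(n)}(x) = \sum \frac{n!}{k_1! \cdots k_n!} \, f^{(k)}(g(x)) \left( \frac{g^{(1)}(x)}{1!} \right)^{k_1} \left( \frac{g^{(2)}(x)}{2!} \right)^{k_2} \cdots \left( \frac{g^{(n)}(x)}{n!} \right)^{k_n} \tag{21.13} \]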
where k = k1 + · · · + kn and the summation is over all k1 , . . . , kn such that

k1 + 2k2 + · · · + nkn = n. The formula (21.13) is called the Faà di Bruno
formula, a proof of which is given in W. J. Kaczor and M. T. Nowak [6,
p. 227-231].
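As a plausibility check of (21.13), the cases n = 2 and n = 3 can be written out by listing the admissible tuples. This is only an illustration and not part of the cited proof. For n = 2 the tuples (k1, k2) with k1 + 2k2 = 2 are (2, 0) and (0, 1); for n = 3 the tuples (k1, k2, k3) with k1 + 2k2 + 3k3 = 3 are (3, 0, 0), (1, 1, 0) and (0, 0, 1).

```latex
% n = 2: (k_1,k_2) = (2,0) gives k = 2, (0,1) gives k = 1
h''(x) = \frac{2!}{2!\,0!} f''(g(x)) \left(\frac{g'(x)}{1!}\right)^{2}
       + \frac{2!}{0!\,1!} f'(g(x)) \left(\frac{g''(x)}{2!}\right)
       = f''(g(x))\, g'(x)^2 + f'(g(x))\, g''(x).

% n = 3: (3,0,0) gives k = 3, (1,1,0) gives k = 2, (0,0,1) gives k = 1
h'''(x) = f'''(g(x))\, g'(x)^3 + 3 f''(g(x))\, g'(x)\, g''(x) + f'(g(x))\, g'''(x).
```

Both agree with what one obtains by applying the chain rule and Leibniz's rule repeatedly.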
Problems
1. In the situation of Figure 21.1 find the straight line Sx0 ,x : R → R
passing through the points (x0 , f (x0 )) and (x, f (x)). Further prove
that if gx0 is the tangent line of f at x0 then limx→x0 Sx0 ,x (t) = gx0 (t).
2. Prove Leibniz’s rule for higher order derivatives: for f, g ∈ C m (I),

m ∈ N ∪ {∞}, and k ∈ N, k ≤ m,
\[ \frac{d^k}{dx^k} (f \cdot g)(x) = (f \cdot g)^{(k)}(x) = \sum_{l=0}^{k} \binom{k}{l} f^{(k-l)}(x) g^{(l)}(x). \]
3. Show that C^k(I), 0 ≤ k ≤ ∞, is an R-algebra.
4. Find a, b, c, d ∈ R such that f : R → R given by

\[ f(x) := \begin{cases} ax + b, & x \le 0 \\ cx^2 + dx, & 0 < x \le 1 \\ 1 - \frac{1}{x}, & x > 1 \end{cases} \]
is differentiable.
5. For 1 ≤ j ≤ n, let fj : R → R, fj(x) > 0, be differentiable. Prove
\[ \left( \frac{\left( \prod_{k=1}^{n} f_k \right)'}{\prod_{k=1}^{n} f_k} \right)(x) = \sum_{k=1}^{n} \frac{f_k'(x)}{f_k(x)}. \]
6. a) Let f : (a, b) → R be differentiable at x0 ∈ (a, b). Show that

\[ \lim_{h \to 0} \frac{f(x_0 + h) - f(x_0 - h)}{2h} = f'(x_0). \]
b) Give an example of a function g : (a, b) → R such that for some
x0 ∈ (a, b)
\[ \lim_{h \to 0} \frac{g(x_0 + h) - g(x_0 - h)}{2h} = A \]
exists but g'(x0) does not, i.e. g is not differentiable at x0.
7. Let h : [−a, a] → R be a bounded function. Prove that f : [−a, a] → R,
f(x) = x^2 h(x), is differentiable at x0 = 0 and f'(0) = 0.
8. a) Prove that the derivative of an even function f : R → R is odd
and that the derivative of an odd function g : R → R is even.
b) Let f : R → R be an a-periodic function, i.e. f (x + a) = f (x)
for all x ∈ R. Suppose that f(x) ≠ 0 for all x ∈ R and that f is
differentiable. Prove that f' is also a-periodic.

9. For |x| < 1 prove that \sum_{k=1}^{∞} k x^k = x/(1 − x)^2.
Hint: recall \sum_{k=0}^{∞} x^k = 1/(1 − x) for |x| < 1.
10. a) Prove that for k ∈ N there exists a polynomial Pk of degree at
most k such that
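\[ \frac{d^k}{dx^k} (1 + x^2)^{-\frac{1}{2}} = \frac{P_k(x)}{(1 + x^2)^{\frac{2k+1}{2}}} \]
and derive that
\[ \left| \frac{d^k}{dx^k} (1 + x^2)^{-\frac{1}{2}} \right| \le c_k \frac{1}{(1 + x^2)^{\frac{k+1}{2}}}. \]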
b) Let f ∈ C_b^m(R), m ∈ N. Use the Faà di Bruno formula to prove
\[ \left| f^{(m)}\left( (1 + x^2)^{-\frac{1}{2}} \right) \right| \le c_m \frac{1}{(1 + x^2)^{\frac{m+1}{2}}}. \]
22 Applications of the Derivative

In this chapter we first recollect results from Part 1 and we provide some of
the missing proofs. Moreover, we will add some further applications of the
derivative to problems in geometry. We start with
Definition 22.1. Let f : (a, b) → R be a function. The function f has a

local maximum (local minimum) at x ∈ (a, b) if there is an ε > 0 such
that
f (x) ≥ f (y) (f (x) ≤ f (y)) whenever |y − x| < ε. (22.1)
If in (22.1) equality holds only for x = y, then we will speak of an isolated
maximum (minimum). By a local extreme value we mean either a
local maximum or a local minimum.
Of fundamental importance is now
Theorem 22.2. Suppose f : (a, b) → R has a local extreme value at x ∈ (a, b)

and that f is differentiable at x. Then we have f'(x) = 0.
Proof. Suppose first that f
has a local maximum at x ∈ (a, b). Take ε > 0 such that (x − ε, x + ε) ⊂ (a, b)
and
f(y) ≤ f(x) for all y ∈ (x − ε, x + ε).
It follows that
\[ f'_+(x) = \lim_{\substack{y \to x \\ y > x}} \frac{f(y) - f(x)}{y - x} \le 0 \tag{22.2} \]
and
\[ f'_-(x) = \lim_{\substack{y \to x \\ y < x}} \frac{f(y) - f(x)}{y - x} \ge 0. \tag{22.3} \]
The differentiability of f implies now
\[ f'_+(x) = f'_-(x) = f'(x), \]
hence f'(x) = 0. If f has a local minimum apply the proof to −f.
Remark 22.3. A. Note that f'(x) = 0 does not imply that f has a local
extreme value at x: take f(x) = x^3 and x = 0.
B. The geometric interpretation of Theorem 22.2 is that at a local extreme
value f has a horizontal tangent.

C. Suppose that f : [a, b] → R is continuous and differentiable on (a, b). We
know that f has a global maximum and minimum on [a, b]. But if one of
these global extreme values lies at a boundary point a or b, then it is not
necessary that f'(a) or f'(b) is zero. Here we must understand f'(a) and
f'(b) as one-sided limits of the difference quotient. Consider f : [0, 1] → R,
x ↦ x, which has the global minimum at x1 = 0 and the global maximum
at x2 = 1. But f'(y) = 1 for all y ∈ [0, 1].
D. We say that f is differentiable from the right at x (differentiable
from the left) if the limit (22.2) ((22.3)) exists.
Theorem 22.4 (Rolle’s Theorem). Let f : [a, b] → R be a continuous

function, differentiable on (a, b), with f (a) = f (b). Then there is a point
x ∈ (a, b) such that f'(x) = 0.
Proof. Since [a, b] is compact and f is continuous, f has a maximum and a
minimum. If f is constant, then f'(x) = 0 for every x ∈ (a, b). If f is not
constant, then since f(a) = f(b) either the maximum
or the minimum is attained at some point x0 ∈ (a, b). Hence by Theorem
22.2 we must have f'(x0) = 0.
Theorem 22.5. Let f, g : [a, b] → R be two continuous functions differen-

tiable on (a, b). Then there is a point x ∈ (a, b) such that
\[ (f(b) - f(a)) \, g'(x) = (g(b) - g(a)) \, f'(x). \tag{22.4} \]
Proof. Consider the continuous function h : [a, b] → R,
h(t) = [f (b) − f (a)] g(t) − [g(b) − g(a)] f (t)
which is differentiable on (a, b). Since
h(a) = f (b)g(a) − f (a)g(b) = h(b),
we can apply Theorem 22.4, and hence there is x ∈ (a, b) such that h'(x) = 0,
i.e.
\[ 0 = h'(x) = (f(b) - f(a)) g'(x) - (g(b) - g(a)) f'(x), \]
implying the theorem.
Theorem 22.5 is sometimes called the second or the generalised mean
value theorem. Setting g(x) = x in this theorem gives
Corollary 22.6 (Mean value theorem). If f is a real valued continuous

function on [a, b] which is differentiable in (a, b), then there is a point x ∈
(a, b) such that
\[ f(b) - f(a) = (b - a) f'(x). \tag{22.5} \]
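A simple instance, added here only as an illustration, shows that the point x in (22.5) can sometimes be found explicitly. For f(x) = x^2 on [a, b] the mean value theorem is satisfied by the midpoint:

```latex
f(b) - f(a) = b^2 - a^2 = (b - a)(a + b) = (b - a) \cdot 2 \cdot \frac{a + b}{2}
            = (b - a)\, f'\!\left( \frac{a + b}{2} \right),
\qquad \frac{a + b}{2} \in (a, b).
```

In general the theorem only asserts the existence of such a point and gives no formula for it.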
Corollary 22.7. If f is as in Corollary 22.6 and furthermore m ≤ f'(z) ≤
M holds for all z ∈ (a, b), then we have the estimates
\[ m(y - x) \le f(y) - f(x) \le M(y - x) \tag{22.6} \]
for all x, y ∈ [a, b], x < y.
Proof. Apply the mean value theorem to f|[x,y] to find
\[ f(y) - f(x) = (y - x) f'(z), \quad \text{some } z \in (x, y), \]
and use m ≤ f'(z) ≤ M.

Corollary 22.8. If f'(x) = 0 for all x ∈ (a, b), then f is constant on (a, b).
Proof. Apply Corollary 22.7 with m = M = 0.
This corollary has an interesting consequence. Suppose that both f1, f2 :
(a, b) → R satisfy the equation f'(x) = h(x) for all x ∈ (a, b) where h :
(a, b) → R is a given function. It follows that f1'(x) − f2'(x) = 0 for all
x ∈ (a, b), hence f1 − f2 = c for some c ∈ R. Thus two solutions of the
differential equation f'(x) = h(x), x ∈ (a, b), differ only by a constant.
We always have a problem evaluating the quotient 0/0. The usual example
is when we want to evaluate the limit as x → x0 of a quotient f(x)/g(x) when
f(x0) = g(x0) = 0. If the functions are differentiable then there is a useful
corollary to Theorem 22.5:
Theorem 22.9 (L’Hospital’s rule). If f and g are differentiable in a neigh-
bourhood of x0 and f (x0 ) = g(x0 ) = 0, then
\[ \lim_{x \to x_0} \frac{f(x)}{g(x)} = \lim_{x \to x_0} \frac{f'(x)}{g'(x)} \tag{22.7} \]
provided the limit on the right hand side exists.
Proof. Suppose first that x > x0 . By Theorem 22.5 applied to the interval
[x0 , x], there exists y, x0 < y < x, such that
\[ \frac{f'(y)}{g'(y)} = \frac{f(x) - f(x_0)}{g(x) - g(x_0)} = \frac{f(x)}{g(x)}. \]
As x → x0, x > x0, it follows that y → x0, thus if lim_{y→x0} f'(y)/g'(y) exists, it is
equal to lim_{x→x0} f(x)/g(x). A similar argument works when x < x0.
Already in Part 1 we made use of these rules, see Theorem 11.5 and Example
11.6.
Sometimes we may have to use L’Hospital’s rule more than once.
\[ \lim_{x \to 0} \frac{\sin(x) - x}{x^3} = \lim_{x \to 0} \frac{\cos(x) - 1}{3x^2} = \lim_{x \to 0} \frac{-\sin(x)}{6x} = -\frac{1}{6}, \]
where we used lim_{x→0} sin(x)/x = 1, compare with Theorem 10.4.
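The repeated application of L'Hospital's rule above can be checked numerically. The following sketch is an illustration under the assumption that floating point evaluation at a moderately small x is accurate enough; the function names `quotient` and `quotients_after_each_step` are hypothetical.

```python
import math


def quotient(x):
    """The original quotient (sin x - x) / x^3."""
    return (math.sin(x) - x) / x**3


def quotients_after_each_step(x):
    """The quotient and the two quotients obtained after differentiating once and twice."""
    original = quotient(x)
    once = (math.cos(x) - 1.0) / (3.0 * x**2)
    twice = -math.sin(x) / (6.0 * x)
    return original, once, twice


if __name__ == "__main__":
    for x in (1e-1, 1e-2):
        print(x, quotients_after_each_step(x))
```

All three values approach −1/6 as x → 0. Very small x should be avoided in such an experiment because sin(x) − x suffers from cancellation in floating point arithmetic.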
Remark 22.11. We cannot use L'Hospital's rule to establish non-convergence,
as it is possible that lim_{x→x0} f(x)/g(x) exists while lim_{x→x0} f'(x)/g'(x) does not.
x→x0 g(x) x→x0 g (x)
Example 22.12. Let
\[ f(x) := \begin{cases} x^2 \sin\frac{1}{x}, & x \neq 0 \\ 0, & x = 0 \end{cases} \]
and g(x) = x. Then f(0) = g(0) = 0 and both functions are differentiable at
0. Now f(x)/g(x) = x sin(1/x) → 0 as x → 0, but for x ≠ 0
\[ \frac{f'(x)}{g'(x)} = f'(x) = 2x \sin\frac{1}{x} - \cos\frac{1}{x} \]
which does not have a limit as x → 0.
Other forms of L'Hospital's rule are:
1. If f and g are differentiable and lim_{x→∞} f(x) = lim_{x→∞} g(x) = 0, then
\[ \lim_{x \to \infty} \frac{f(x)}{g(x)} = \lim_{x \to \infty} \frac{f'(x)}{g'(x)} \tag{22.8} \]
when the limit on the right hand side exists.
2. If f and g are differentiable and lim_{x→∞} f(x) = lim_{x→∞} g(x) = ∞, then
\[ \lim_{x \to \infty} \frac{f(x)}{g(x)} = \lim_{x \to \infty} \frac{f'(x)}{g'(x)} \tag{22.9} \]
when the limit on the right hand side exists.

Next we will characterise monotone functions using the derivative.
Theorem 22.13. Let f : [a, b] → R be continuous and differentiable on
(a, b). If f'(x) > 0 for all x ∈ (a, b) (or f'(x) ≥ 0, f'(x) ≤ 0, f'(x) < 0), then
f is strictly monotone increasing on [a, b] (monotone increasing, monotone
decreasing, strictly monotone decreasing).
Proof. We discuss only the case f'(x) > 0 for all x ∈ (a, b), the other cases
are analogous. Suppose that f is not strictly monotone increasing. Then
there are x1, x2 ∈ [a, b], x1 < x2, such that f(x1) ≥ f(x2). By the mean value
theorem we find some y ∈ (x1, x2) such that
\[ f'(y) = \frac{f(x_2) - f(x_1)}{x_2 - x_1} \le 0, \]
which contradicts f'(y) > 0.
Exercise 22.14. Show that, if f is monotone decreasing on [a, b] and is
differentiable on (a, b) then f'(x) ≤ 0 for all x.
Remark 22.15. If f is strictly increasing and differentiable, we need not
have f'(x) > 0 for all x. The function f(x) = x^3 is strictly monotone
increasing and differentiable, but f'(0) = 0.
Theorem 22.16. Suppose that f : (a, b) → R is twice differentiable at x ∈
(a, b). In addition assume that
f'(x) = 0 and f''(x) > 0 (f''(x) < 0).
Then f has an isolated local minimum (maximum) at x.
Proof. We consider the case f''(x) > 0, the second case goes analogously. By
assumption we have
\[ f''(x) = \lim_{y \to x} \frac{f'(y) - f'(x)}{y - x} > 0. \]
Hence there is an ε > 0 such that
\[ \frac{f'(y) - f'(x)}{y - x} > 0 \quad \text{for all } y, \ 0 < |y - x| < \varepsilon. \]
Since f'(x) = 0 it follows that
f'(y) < 0 for x − ε < y < x
and
f'(y) > 0 for x < y < x + ε.
Therefore f is strictly monotone decreasing in [x−ε, x] and strictly monotone
increasing in [x, x + ε] :
[Figure 22.1; axes: x-axis, y-axis; the point x is marked on the x-axis.]
Thus, f has an isolated local minimum at x.
Remark 22.17. As the function x ↦ x^4 shows we may have a minimum,
here at x0 = 0, and f'(x0) = f''(x0) = 0. Thus if f''(x0) = 0 for a twice
differentiable function with f'(x0) = 0 we cannot in general make a statement
about whether f has an extreme value at x0 or not.
Let f : [a, b] → R be a twice continuously differentiable function. We want
to study its graph Γ(f ) ⊂ R2 as a geometrical object.
[Figure 22.2; labels: y = f(x), h̃x0, g̃x0, x0, x.]
Locally, i.e. in a neighbourhood of x0, we can replace Γ(f) by the tangent
line g̃x0 to give an approximation of Γ(f). Recall that g̃x0 is the straight line
\[ \tilde g_{x_0} = \{ (t, g_{x_0}(t)) \mid g_{x_0}(t) = f'(x_0) t + f(x_0) - x_0 f'(x_0), \ t \in \mathbb{R} \} = \Gamma(g_{x_0}) \tag{22.10} \]
which we also interpret as the graph Γ(gx0) of the function t ↦ gx0(t),
gx0(t) = f'(x0)t + f(x0) − x0 f'(x0). In the case where f'(x0) = 0,
g̃x0 is a line parallel to the x-axis, the line ñx0 := {(x0, t) | t ∈ R} is parallel
to the y-axis and passes through (x0, f(x0)), and they are perpendicular
to each other. However, ñx0 is not the graph of a function. For f'(x0) ≠ 0
we can consider the straight line
\[ \tilde n_{x_0} = \left\{ (t, n_{x_0}(t)) \,\middle|\, n_{x_0}(t) = -\frac{1}{f'(x_0)} t + f(x_0) + \frac{x_0}{f'(x_0)} \right\} = \Gamma(n_{x_0}), \]
which is the graph of t ↦ nx0(t), nx0(t) = −t/f'(x0) + f(x0) + x0/f'(x0). We find
that g̃x0 and ñx0 intersect at the point (x0, f(x0)) and they are perpendicular.
The latter follows from the fact that g̃x0 has direction vector (1, f'(x0))^T and
ñx0 has direction vector (1, −1/f'(x0))^T, implying that their scalar product in R^2
is 0:
\[ \left\langle \begin{pmatrix} 1 \\ f'(x_0) \end{pmatrix}, \begin{pmatrix} 1 \\ -\frac{1}{f'(x_0)} \end{pmatrix} \right\rangle = 1 - \frac{f'(x_0)}{f'(x_0)} = 0. \]
We call ñx0 the normal line of f (or Γ(f)) at x0 (or (x0, f(x0))). In Volume
II, Chapter 39, we will understand why it is of advantage to replace the
direction vector of ñx0 by
\[ \begin{pmatrix} -f'(x_0) \\ 1 \end{pmatrix} = -f'(x_0) \begin{pmatrix} 1 \\ -\frac{1}{f'(x_0)} \end{pmatrix}. \]
If in a neighbourhood of x0 the graph Γ(f ) is not a straight line, it may
be argued that we can approximate the graph Γ(f ) even better by a circle
κx0 passing through (x0 , f (x0 )). Suppose that the circle is given by the
set {(x, y) ∈ R^2 | (x − c1)^2 + (y − c2)^2 = r^2} and suppose further that in a
neighbourhood of x0 we can represent y as a twice continuously differentiable
function of x, y = h(x). Thus we have (x − c1)^2 + (h(x) − c2)^2 = r^2, or in a
neighbourhood of x0 we have
\[ |y - c_2| = |h(x) - c_2| = \sqrt{r^2 - (x - c_1)^2} \quad \text{or} \quad h(x) = \pm \sqrt{r^2 - (x - c_1)^2} + c_2. \]
To be a better approximation than g̃x0 we must have h(x0) = f(x0) and
h'(x0) = f'(x0), i.e. the circle must pass through (x0, f(x0)) and have the
same tangent line at (x0, f(x0)) as f has. To improve the approximation we
add the condition h''(x0) = f''(x0). Now we want to determine c1, c2 and r.
Differentiating (x − c1)^2 + (h(x) − c2)^2 = r^2 twice we find
\[ (x - c_1) + h'(x)(h(x) - c_2) = 0 \tag{22.11} \]
and
\[ 1 + h'(x)^2 + h''(x)(h(x) - c_2) = 0. \tag{22.12} \]
For x0 this implies
\[ 1 + f'^2(x_0) + f''(x_0)(f(x_0) - c_2) = 0, \]
or if f''(x0) ≠ 0
\[ c_2 = f(x_0) + \frac{1 + f'^2(x_0)}{f''(x_0)}, \tag{22.13} \]
and then
\[ x_0 - c_1 + f'(x_0)(f(x_0) - c_2) = 0, \]
or again if f''(x0) ≠ 0
\[ c_1 = x_0 - f'(x_0) \frac{1 + f'^2(x_0)}{f''(x_0)}, \tag{22.14} \]
and finally, if f''(x0) ≠ 0,
\[ r = \frac{(1 + f'^2(x_0))^{\frac{3}{2}}}{|f''(x_0)|}. \tag{22.15} \]
The condition f''(x0) ≠ 0 is of course natural when assuming that locally we
can improve the approximation by a straight line. The circle
\[ \kappa_{x_0} := \{ (x, y) \in \mathbb{R}^2 \mid (x - c_1)^2 + (y - c_2)^2 = r^2 \} \]
\[ = \left\{ (x, y) \in \mathbb{R}^2 \,\middle|\, \left( x - x_0 + f'(x_0) \frac{1 + f'^2(x_0)}{f''(x_0)} \right)^2 + \left( y - f(x_0) - \frac{1 + f'^2(x_0)}{f''(x_0)} \right)^2 = \frac{(1 + f'^2(x_0))^3}{|f''(x_0)|^2} \right\} \]
is called the circle of curvature or osculating circle. Further we call
(c1, c2) the centre of curvature, r is called the radius of curvature and
1/r is called the curvature of f at x0 (or of Γ(f) at (x0, f(x0))).
If we also assume that f'(x0) ≠ 0, then we find
\[ n_{x_0}(c_1) = -\frac{1}{f'(x_0)} c_1 + \frac{x_0}{f'(x_0)} + f(x_0) \]
\[ = -\frac{1}{f'(x_0)} \left( x_0 - f'(x_0) \frac{1 + f'^2(x_0)}{f''(x_0)} \right) + \frac{x_0}{f'(x_0)} + f(x_0) \]
\[ = \frac{1 + f'^2(x_0)}{f''(x_0)} + f(x_0) = c_2, \]
thus the centre of curvature lies on the normal line.
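Formulas (22.13) to (22.15) translate directly into a small computation. The sketch below is an illustration only: the function name `osculating_circle` and its argument names are assumptions, and the caller is expected to supply the values f(x0), f'(x0) and f''(x0) rather than having them computed.

```python
def osculating_circle(x0, f0, d1, d2):
    """Centre (c1, c2) and radius r of the circle of curvature via (22.13)-(22.15).

    f0, d1, d2 stand for f(x0), f'(x0), f''(x0); d2 must be non-zero.
    """
    if d2 == 0.0:
        raise ValueError("f''(x0) = 0: the circle of curvature is not defined")
    s = 1.0 + d1 * d1            # the recurring term 1 + f'(x0)^2
    c1 = x0 - d1 * s / d2        # (22.14)
    c2 = f0 + s / d2             # (22.13)
    r = s ** 1.5 / abs(d2)       # (22.15)
    return c1, c2, r


if __name__ == "__main__":
    # Parabola f(x) = x^2 at x0 = 0: f = 0, f' = 0, f'' = 2.
    print(osculating_circle(0.0, 0.0, 0.0, 2.0))
    # Upper unit semicircle g(x) = sqrt(1 - x^2) at x0 = 0.6:
    # g = 0.8, g' = -0.75, g'' = -1 / 0.8**3.
    print(osculating_circle(0.6, 0.8, -0.75, -1.0 / 0.8**3))
```

For the semicircle the computed circle of curvature is, up to rounding, the unit circle itself, as one would expect: a circle is its own osculating circle.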
Problems
1. Use the generalised mean value theorem to prove that for f ∈ C^2([a, b])
satisfying f(a) = f(b) and f'(a) = f'(b) = 0 there exist x1, x2 ∈ (a, b),
x1 ≠ x2, such that f''(x1) = f''(x2).
2. Let f : [a, b] → R be a function satisfying the estimate |f(x) − f(y)| ≤
c|x − y|^{1+α} for all x, y ∈ [a, b]. Prove that f is constant, i.e. f(x) = c0
for some c0 ∈ R and all x ∈ [a, b].
3. Let f : (a, b) → R be differentiable with bounded derivative f', i.e.
|f'(x)| ≤ M for all x ∈ (a, b). Prove that f is Lipschitz continuous
and hence uniformly continuous, see Problem 20 in Chapter 20 for the
definition of Lipschitz continuity.
4. For 0 < p < q and x > 0 use the mean value theorem to show
\[ \left( 1 + \frac{x}{p} \right)^p < \left( 1 + \frac{x}{q} \right)^q. \]
Hint: apply the mean value theorem to y ↦ ln(1 + y) on [0, x/q] and on
[x/q, x/p], x > 0.
5. For α, β > 0 prove:

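a) \[ \lim_{x \to \infty} \frac{e^{\alpha x}}{x^\beta} = +\infty; \]
b) \[ \lim_{x \to \infty} \frac{(\ln x)^\beta}{x^\alpha} = 0; \]
c) \[ \lim_{\substack{x \to 0 \\ x > 0}} x^x = 1. \]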
6. Find the following limit:

\[ \lim_{x \to 7} (8 - x)^{\frac{1}{x - 7}}. \]
7. Let f ∈ C^2(R) such that f(0) = 1, f'(0) = 0, and f''(0) = −1. Prove
that for any a ∈ R
\[ \lim_{\substack{x \to \infty \\ x > 0}} \left( f\left( \frac{a}{\sqrt{x}} \right) \right)^x = e^{-\frac{a^2}{2}}. \]
(This problem is taken from [6].)
8. Show that if f is monotone decreasing on [a, b] and is differentiable on
[a, b] then f'(x) ≤ 0 for all x ∈ (a, b).
9. A function f ∈ C^∞((0, ∞)) is said to be completely monotone if for
all k ∈ N0 the following holds
\[ (-1)^k \frac{d^k f(t)}{dt^k} \ge 0. \tag{22.16} \]
A function f ∈ C^∞((0, ∞)) is called a Bernstein function if f ≥ 0
and for all k ∈ N
\[ (-1)^k \frac{d^k f(t)}{dt^k} \le 0. \tag{22.17} \]
Prove that for a > 0 the function t ↦ e^{−at} is completely monotone and
the function t ↦ 1 − e^{−at} is a Bernstein function. Furthermore show
that t ↦ t^α, 0 < α ≤ 1, is another Bernstein function.
10. a) Determine all local extreme values of f(x) = x^{1/3}(1 − x)^{2/3}.
b) Find the maximum of f : R → R given by
\[ f(x) = \frac{1}{1 + |x|} + \frac{1}{1 + |x - 1|}. \]
(This problem is taken from [6].)

11. Let g : [−1, 1] → R, g(x) = √(1 − x^2). For x0 ∈ (−1, 1) find the tangent
line, the normal line and the circle of curvature of g at x0 .
12. Consider the hyperbola f : (0, ∞) → R, f(x) = 1/x. For x0 ∈ (0, ∞)

find the normal line and the curvature of f at x0 .
23 Convex Functions and some Norms on Rn

Let us begin with
Definition 23.1. Let I ⊂ R be an interval and f : I → R be a function. We
call f : I → R convex if for all x1 , x2 ∈ I and all λ ∈ (0, 1) the inequality
f (λx1 + (1 − λ)x2 ) ≤ λf (x1 ) + (1 − λ)f (x2 ) (23.1)
holds. If −f is convex, we call f concave.
Obviously (23.1) is also correct for λ = 1 and λ = 0.
[Figure 23.1: graph of a convex function; on the x-axis the points x1, λx1 + (1 − λ)x2 and x2 are marked, on the y-axis the values f(x1), f(x2), f(λx1 + (1 − λ)x2) and λf(x1) + (1 − λ)f(x2).]
Theorem 23.2. Let I ⊂ R be an open interval and f : I → R a twice

differentiable function. The function f is convex if and only if f''(x) ≥ 0 for
all x ∈ I.
Proof. Suppose that f''(x) ≥ 0 for all x ∈ I. It follows that f' is monotone
increasing on I. For x1, x2 ∈ I, x1 < x2, and 0 < λ < 1 we put x :=
λx1 + (1 − λ)x2 and so x1 < x < x2. By the mean value theorem there exist
y1 ∈ (x1, x) and y2 ∈ (x, x2) such that
\[ \frac{f(x) - f(x_1)}{x - x_1} = f'(y_1) \le f'(y_2) = \frac{f(x_2) - f(x)}{x_2 - x}. \]
But x − x1 = (1 − λ)(x2 − x1 ) and x2 − x = λ(x2 − x1 ), which leads to

\[ \frac{f(x) - f(x_1)}{1 - \lambda} \le \frac{f(x_2) - f(x)}{\lambda}, \]
i.e.
λf (x) − λf (x1 ) ≤ (1 − λ)f (x2 ) − (1 − λ)f (x),
or
λf (x) + (1 − λ)f (x) = f (x) ≤ λf (x1 ) + (1 − λ)f (x2 ),
hence f is convex.
Now suppose that f : I → R is convex. Further assume that for some x0 ∈ I
we have f''(x0) < 0. For c := f'(x0) and φ(x) := f(x) − c(x − x0), x ∈ I, it
follows that φ'(x0) = 0 and φ''(x0) = f''(x0) < 0. Therefore the function φ
must have an isolated local maximum at x0. It follows that there exists h > 0
such that [x0 − h, x0 + h] ⊂ I and φ(x0 − h) < φ(x0), φ(x0 + h) < φ(x0),
which implies
\[ f(x_0) = \varphi(x_0) > \frac{1}{2} \left( \varphi(x_0 - h) + \varphi(x_0 + h) \right) = \frac{1}{2} \left( f(x_0 - h) + f(x_0 + h) \right). \]
Taking x1 := x0 − h, x2 := x0 + h and λ = 1/2, we find
x0 = λx1 + (1 − λ)x2
and
f(λx1 + (1 − λ)x2) > λf(x1) + (1 − λ)f(x2),
which contradicts the convexity of f.
Remark 23.3. The criterion for convexity (concavity) given in Theorem
23.2 we may combine with our sufficient criterion for the existence of a local
minimum (maximum). If f : (a, b) → R is twice continuously differentiable
and has a critical point at c ∈ (a, b), i.e. f'(c) = 0, then the graph Γ(f) of
f has at c a horizontal tangent. If f is convex (concave) in a neighbourhood
of c the graph of f must lie above (below) this horizontal tangent, hence
at c the function f has a local minimum (maximum). Thus our sufficient
criterion for the existence of a local minimum (maximum) at c, i.e. Theorem
8.8, has a natural geometric interpretation: if f has at c a horizontal tangent
and if f''(c) > 0 (f''(c) < 0) then f is locally, i.e. in a neighbourhood of c,
convex (concave) and therefore f has at c a minimum (maximum).
The basic definition of convexity does not require differentiability and not
even continuity, it is a geometric statement expressed by an inequality. If we
consider Figure 23.1 then inequality (23.1) says that for all x ∈ [x1 , x2 ] the
graph of f lies below the line segment connecting (x1 , f (x1 )) to (x2 , f (x2 )).
This line segment is the graph of the function
\[ g(t) = f(x_1) + \frac{f(x_1) - f(x_2)}{x_1 - x_2} (t - x_1), \quad t \in [x_1, x_2]. \tag{23.2} \]
Hence convexity means
\[ f(t) \le f(x_1) + \frac{f(x_1) - f(x_2)}{x_1 - x_2} (t - x_1) \tag{23.3} \]
for all t ∈ [x1 , x2 ].
Lemma 23.4. A function f : I → R is convex if and only if for any three
points x < z < y, x, y, z ∈ I, the inequalities
\[ \frac{f(x) - f(z)}{x - z} \le \frac{f(x) - f(y)}{x - y} \le \frac{f(z) - f(y)}{z - y} \tag{23.4} \]
hold.
Proof. From (23.3) we deduce with x = x1, y = x2, z = t that
\[ \frac{f(x) - f(z)}{x - z} = \frac{f(z) - f(x)}{z - x} \le \frac{f(x) - f(y)}{x - y}, \]
which is the first inequality in (23.4). Since
\[ f(x) - f(y) = \frac{f(x) - f(y)}{x - y} (x - y) = \frac{f(x) - f(y)}{x - y} (z - y + x - z) \]
we find
\[ f(x) + \frac{f(x) - f(y)}{x - y} (z - x) = f(y) + \frac{f(x) - f(y)}{x - y} (z - y), \]
and with (23.3) it follows that
\[ f(z) \le f(y) + \frac{f(x) - f(y)}{x - y} (z - y), \]
or
\[ f(z) - f(y) \le \frac{f(x) - f(y)}{x - y} (z - y). \]
Taking into account that z − y < 0, we eventually arrive at
\[ \frac{f(x) - f(y)}{x - y} \le \frac{f(z) - f(y)}{z - y}, \]
proving the second inequality in (23.4). Now suppose that (23.4) holds and
take z = αx + (1 − α)y to find
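\[ \frac{f(x) - f(\alpha x + (1 - \alpha) y)}{x - \alpha x - (1 - \alpha) y} \le \frac{f(x) - f(y)}{x - y} \le \frac{f(\alpha x + (1 - \alpha) y) - f(y)}{\alpha x + (1 - \alpha) y - y}, \]
which yields
\[ \frac{f(x) - f(\alpha x + (1 - \alpha) y)}{(1 - \alpha)(x - y)} \le \frac{f(\alpha x + (1 - \alpha) y) - f(y)}{\alpha (x - y)}, \]
and since x − y < 0 we arrive at
\[ \alpha \left( f(x) - f(\alpha x + (1 - \alpha) y) \right) \ge (1 - \alpha) \left( f(\alpha x + (1 - \alpha) y) - f(y) \right), \]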
or
f (αx + (1 − α)y) ≤ αf (x) + (1 − α)f (y),
proving the convexity of f .
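The three-chord inequality (23.4) is easy to test on concrete functions. The following sketch is purely illustrative; the names `chord_slopes` and `satisfies_23_4` and the tolerance parameter are assumptions made here. A single triple x < z < y can only refute convexity, never establish it.

```python
def chord_slopes(f, x, z, y):
    """Return the three difference quotients appearing in (23.4) for x < z < y."""
    if not x < z < y:
        raise ValueError("need x < z < y")
    return (
        (f(x) - f(z)) / (x - z),
        (f(x) - f(y)) / (x - y),
        (f(z) - f(y)) / (z - y),
    )


def satisfies_23_4(f, x, z, y, tol=1e-12):
    """Check the chain of inequalities (23.4) for one triple, up to a rounding tolerance."""
    s1, s2, s3 = chord_slopes(f, x, z, y)
    return s1 <= s2 + tol and s2 <= s3 + tol


if __name__ == "__main__":
    print(chord_slopes(lambda t: t * t, -1.0, 0.5, 2.0))     # convex: slopes increase
    print(satisfies_23_4(lambda t: -t * t, -1.0, 0.5, 2.0))  # concave: (23.4) fails
```

For x ↦ x^2 the slopes increase from left to right, while for the concave function x ↦ −x^2 the check fails, in line with Lemma 23.4.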
Theorem 23.5. Let I be an interval with end points a < b and let f : I → R
be a convex function. For every x ∈ (a, b) the function is differentiable from
the right and from the left.
Proof. Take x ∈ (a, b) and t1 , t2 ∈ I such that x < t1 < t2 . From (23.4) we
deduce
$$\frac{f(x) - f(t_1)}{x - t_1} \le \frac{f(x) - f(t_2)}{x - t_2},$$
in other words, the function $F : (x, b] \cap I \to \mathbb{R}$, $F(t) = \frac{f(x) - f(t)}{x - t}$, is monotone
increasing. Further, for x1 ∈ I with x1 < x it follows again by (23.4) that
$$\frac{f(x_1) - f(x)}{x_1 - x} \le \frac{f(x) - f(t)}{x - t} = F(t),$$
implying that F is bounded from below. Hence
$$\lim_{\substack{t \to x \\ t > x}} F(t) = \lim_{\substack{t \to x \\ t > x}} \frac{f(x) - f(t)}{x - t} = f'_+(x)$$
exists. Analogously we see that f is differentiable from the left, i.e. $f'_-(x)$
exists for x ∈ (a, b).
Corollary 23.6. Let f : I → R be convex with I being an interval with end

points a < b, then f |(a,b) is continuous.
Proof. With the same argument as in the proof of Corollary 21.5 we deduce
that if f is differentiable from the right (left) at x ∈ (a, b) then f is continuous
from the right (left) at x. Hence being continuous from the right and from the
left, f must be continuous at x. (A more detailed proof is given in Problem
2.)
Remark 23.7. Using Problem 6 in Chapter 20 and some further considerations
it is possible to prove that a convex function f as in Theorem 23.5 is
non-differentiable on at most a countable set. (Compare with D. J. H. Garling,
[4, Corollary 7.2.4, p. 184].)
Proposition 23.8. Let I ⊂ R be an interval and f, g, fn : I → R, n ∈ N

be convex functions. Then f + g and αf, α ≥ 0, are convex functions and if
F (x) := limn→∞ fn (x) exists and is finite for every x ∈ I, then F : I → R is
convex too.
Proof. The convexity of f + g and αf follows from the defining inequalities
f (λx1 + (1 − λ)x2 ) ≤ λf (x1 ) + (1 − λ)f (x2 )
and
g(λx1 + (1 − λ)x2 ) ≤ λg(x1 ) + (1 − λ)g(x2 )
by adding and multiplying by α ≥ 0, respectively. Moreover, if limn→∞ fn (x) =
F (x) < ∞ exists for all x ∈ I we can pass to the limit in
fn (λx1 + (1 − λ)x2 ) ≤ λfn (x1 ) + (1 − λ)fn (x2 )
and the inequality is preserved for the limit function.
Remark 23.9. Note that by Corollary 23.6 convex functions provide us with
a class of functions for which the pointwise limit of sequences belonging to
this class is always continuous.
Proposition 23.10. Let I ⊂ R be an interval and J ≠ ∅ an index set.
Suppose that for each j ∈ J a convex function fj : I → R is given. Then if
g(x) := sup{fj (x)|j ∈ J} < ∞ (23.5)
is finite for each x ∈ I, then g : I → R is convex.
Proof. Let ε > 0 and fix x1 , x2 ∈ I and λ ∈ (0, 1). By the definition of the
supremum there exists j ∈ J such that
fj (λx1 + (1 − λ)x2 ) ≥ g(λx1 + (1 − λ)x2 ) − ε,
which implies by the convexity of fj
g(λx1 + (1 − λ)x2 ) − ε ≤ fj (λx1 + (1 − λ)x2 )
≤ λfj (x1 ) + (1 − λ)fj (x2 )
≤ λg(x1 ) + (1 − λ)g(x2 ).
Since ε > 0 is arbitrary we eventually get
g(λx1 + (1 − λ)x2 ) ≤ λg(x1 ) + (1 − λ)g(x2 ).
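As a small illustration of Proposition 23.10 (an example chosen here, not taken from the text), take the index set J = {1, 2} and the two linear, hence convex, functions below. Their supremum is the absolute value, and the convexity inequality for g is just the triangle inequality in R.

```latex
f_1(x) = x, \qquad f_2(x) = -x, \qquad
g(x) = \sup\{f_1(x), f_2(x)\} = |x|,
\\[4pt]
g(\lambda x_1 + (1-\lambda)x_2) = |\lambda x_1 + (1-\lambda)x_2|
\le \lambda |x_1| + (1-\lambda)|x_2|
= \lambda g(x_1) + (1-\lambda) g(x_2), \qquad \lambda \in [0,1].
```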
The following simple inequality turns out to be quite useful:

Lemma 23.11. Let p, q ∈ (1, ∞) such that $\frac{1}{p} + \frac{1}{q} = 1$, then we have for all
x, y ≥ 0
$$x^{\frac{1}{p}}\, y^{\frac{1}{q}} \le \frac{x}{p} + \frac{y}{q}. \tag{23.6}$$
Proof. We may assume x, y > 0. For the function ln : (0, ∞) → R we find
$\frac{d^2}{dx^2}(\ln(x)) = -\frac{1}{x^2} < 0$, thus the function ln is concave, implying that
$$\ln\Big(\frac{1}{p}x + \frac{1}{q}y\Big) \ge \frac{1}{p}\ln x + \frac{1}{q}\ln y,$$
or
$$\exp\Big(\ln\Big(\frac{1}{p}x + \frac{1}{q}y\Big)\Big) \ge \exp\Big(\frac{1}{p}\ln x + \frac{1}{q}\ln y\Big),$$
leading to (23.6).
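Inequality (23.6) can be spot-checked numerically. The sketch below assumes a handful of arbitrarily chosen sample triples (x, y, p); the function name `young_gap` is illustrative only. It returns the difference between the right-hand and left-hand sides, which should never be negative.

```python
def young_gap(x, y, p):
    """Return x/p + y/q - x**(1/p) * y**(1/q), where 1/p + 1/q = 1."""
    q = p / (p - 1.0)  # conjugate exponent of p
    return x / p + y / q - x ** (1.0 / p) * y ** (1.0 / q)


# Assumed sample data: (x, y, p) with x, y >= 0 and p > 1.
samples = [(0.0, 3.0, 2.0), (1.0, 1.0, 3.0), (4.0, 9.0, 2.0), (0.5, 7.0, 1.5)]
gaps = [young_gap(x, y, p) for x, y, p in samples]
print(gaps)
```

For x = y the gap vanishes, which matches the equality case of the concavity argument in the proof.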
The following considerations are just the beginning of a better understanding

of the concept of a limit and convergence.
On R, the natural distance between two numbers is the absolute value of their
difference. Using this we are able to define convergence. In other spaces, we
need a notion of ‘distance’ or metric. Even in R2 we have a choice:

the Euclidean distance: $d(x, y) = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2}$;
the distance: $d(x, y) = |x_1 - y_1| + |x_2 - y_2|$;
and the sup metric: $d(x, y) = \max\{|x_1 - y_1|, |x_2 - y_2|\}$.
In these cases we actually only need to define the distance from a point to
the origin.
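Definition 23.12. A mapping $\|\cdot\| : \mathbb{R}^n \to \mathbb{R}$ is called a norm on $\mathbb{R}^n$ if
1. $\|x\| \ge 0$ for all $x \in \mathbb{R}^n$ and $\|x\| = 0$ if and only if x = 0;
2. $\|\lambda x\| = |\lambda|\,\|x\|$ for all $x \in \mathbb{R}^n$, $\lambda \in \mathbb{R}$;
3. $\|x + y\| \le \|x\| + \|y\|$, $x, y \in \mathbb{R}^n$ (triangle inequality).
Given a norm we define the metric $d(x, y) = \|x - y\|$. Corresponding to
the distances above, we write $\|x\|_2 = \sqrt{x_1^2 + x_2^2}$, $\|x\|_1 = |x_1| + |x_2|$, and
$\|x\|_\infty = \max\{|x_1|, |x_2|\}$.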
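To see how the three norms on R² differ on a concrete vector, here is a minimal sketch; the sample point (3, −4) and the function names are assumptions made for illustration.

```python
import math


def norm_2(x):
    """Euclidean norm of a point x = (x1, x2) in R^2."""
    return math.sqrt(x[0] ** 2 + x[1] ** 2)


def norm_1(x):
    """Sum of the absolute values of the two coordinates."""
    return abs(x[0]) + abs(x[1])


def norm_inf(x):
    """Largest absolute value of a coordinate (sup norm)."""
    return max(abs(x[0]), abs(x[1]))


x = (3.0, -4.0)  # assumed sample point
print(norm_2(x), norm_1(x), norm_inf(x))  # 5.0 7.0 4.0
```

On this sample the sup norm is the smallest and the 1-norm the largest of the three values.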
The unit sphere in Rn with respect to a given norm is the locus of points
at distance 1 from 0, i.e. {x ∈ Rn |d(x, 0) = ||x|| = 1}. The unit spheres for
the three norms $\|x\|_2$, $\|x\|_1$, and $\|x\|_\infty$ are respectively:
[Figure 23.2: the unit sphere {x ∈ R2 | ||x||2 = 1}]
[Figure 23.3: the unit sphere {x ∈ R2 | ||x||1 = 1}]
[Figure 23.4: the unit sphere {x ∈ R2 | ||x||∞ = 1}]
Definition 23.13. Let p ≥ 1 be a real number and x = (x1 , . . . , xn ) ∈ Rn .
We define
$$\|x\|_p := \Big(\sum_{\nu=1}^{n} |x_\nu|^p\Big)^{\frac{1}{p}}. \tag{23.7}$$
Remark 23.14. A. For p = 2 we find
$$\|x\|_2 = \Big(\sum_{\nu=1}^{n} |x_\nu|^2\Big)^{\frac{1}{2}},$$
and therefore $\|x - y\|_2$ is the Euclidean distance of x and y in Rn .

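B. Obviously we have
$$\|\lambda x\|_p = \Big(\sum_{\nu=1}^{n} |\lambda x_\nu|^p\Big)^{\frac{1}{p}} = |\lambda|\,\|x\|_p, \quad \lambda \in \mathbb{R}, \tag{23.8}$$
and
$$\|x\|_p \ge 0 \text{ for all } x \in \mathbb{R}^n \text{ and } \|x\|_p = 0 \text{ if and only if } x = 0 \in \mathbb{R}^n. \tag{23.9}$$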
Theorem 23.15 (Hölder’s inequality). Let p, q ∈ (1, ∞), $\frac{1}{p} + \frac{1}{q} = 1$. For
x, y ∈ Rn it follows that the inequality
$$\sum_{\nu=1}^{n} |x_\nu y_\nu| \le \|x\|_p\, \|y\|_q \tag{23.10}$$
holds.
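Proof. Suppose that $\|x\|_p \neq 0$ and $\|y\|_q \neq 0$, otherwise (23.10) is trivial.
Consider
$$\xi_\nu := \frac{|x_\nu|^p}{\|x\|_p^p}, \qquad \eta_\nu := \frac{|y_\nu|^q}{\|y\|_q^q}.$$
It follows that $\sum_{\nu=1}^{n} \xi_\nu = \sum_{\nu=1}^{n} \eta_\nu = 1$. Applying (23.6) to ξν and ην we obtain
$$\frac{|x_\nu \cdot y_\nu|}{\|x\|_p \|y\|_q} = \xi_\nu^{\frac{1}{p}}\, \eta_\nu^{\frac{1}{q}} \le \frac{\xi_\nu}{p} + \frac{\eta_\nu}{q}$$
and summing over all ν we have
$$\frac{1}{\|x\|_p \|y\|_q} \sum_{\nu=1}^{n} |x_\nu \cdot y_\nu| \le \frac{1}{p} + \frac{1}{q} = 1$$
which implies Hölder’s inequality.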
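A quick numerical check of (23.10) may help to make the statement concrete. The sketch below assumes two arbitrary sample vectors in R³; `p_norm` implements (23.7), and `hoelder_sides` is an illustrative helper returning both sides of the inequality.

```python
def p_norm(x, p):
    """The norm ||x||_p from (23.7) for a list of real numbers x."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)


def hoelder_sides(x, y, p):
    """Return (sum |x_v y_v|, ||x||_p * ||y||_q) with 1/p + 1/q = 1."""
    q = p / (p - 1.0)  # conjugate exponent of p
    lhs = sum(abs(a * b) for a, b in zip(x, y))
    rhs = p_norm(x, p) * p_norm(y, q)
    return lhs, rhs


x = [1.0, -2.0, 3.0]   # assumed sample vectors
y = [4.0, 0.5, -1.0]
lhs, rhs = hoelder_sides(x, y, 3.0)
print(lhs, rhs)

# For p = q = 2 and y = x both sides agree up to rounding.
eq_lhs, eq_rhs = hoelder_sides(x, x, 2.0)
print(eq_lhs, eq_rhs)
```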

Remark 23.16. For p = 2 Hölder’s inequality reduces to the Cauchy-Schwarz
inequality (compare with Corollary 14.3)
$$\Big|\sum_{\nu=1}^{n} x_\nu y_\nu\Big| \le \sum_{\nu=1}^{n} |x_\nu y_\nu| \le \|x\|_2\, \|y\|_2.$$
Next we extend Minkowski’s inequality from the Euclidean norm || · ||2

(Lemma 14.5) to the norm || · ||p , 1 ≤ p < ∞.
Theorem 23.17 (Minkowski’s inequality). Let p ∈ [1, ∞). Then we have
for all x, y ∈ Rn the inequality
$$\|x + y\|_p \le \|x\|_p + \|y\|_p. \tag{23.11}$$
Proof. For p = 1 we apply the triangle inequality
$$\sum_{\nu=1}^{n} |x_\nu + y_\nu| \le \sum_{\nu=1}^{n} |x_\nu| + \sum_{\nu=1}^{n} |y_\nu|.$$
Now, for p > 1 and $q = \frac{p}{p-1}$, i.e. $\frac{1}{p} + \frac{1}{q} = 1$, we consider z ∈ Rn ,
$z_\nu = |x_\nu + y_\nu|^{p-1}$, ν = 1, . . . , n. It follows that
$$z_\nu^q = |x_\nu + y_\nu|^{q(p-1)} = |x_\nu + y_\nu|^p,$$
or
$$\|z\|_q = \|x + y\|_p^{\frac{p}{q}}.$$
Next we first apply the triangle inequality and then Hölder’s inequality to
obtain
$$\sum_{\nu=1}^{n} |x_\nu + y_\nu|\,|z_\nu| \le \sum_{\nu=1}^{n} |x_\nu z_\nu| + \sum_{\nu=1}^{n} |y_\nu z_\nu| \le (\|x\|_p + \|y\|_p)\,\|z\|_q.$$
Using the definition of z, we find
$$\|x + y\|_p^p \le (\|x\|_p + \|y\|_p)\,\|x + y\|_p^{\frac{p}{q}},$$
and since $p - \frac{p}{q} = 1$ the theorem is proved.
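The role of the assumption p ≥ 1 in (23.11) can be seen in a small experiment. The sketch below assumes the sample vectors (1, 0) and (0, 1); `minkowski_gap` is an illustrative helper returning the right-hand side minus the left-hand side. The last call uses the exponent 1/2, which lies outside the range of the theorem and is included only to show that the triangle inequality can then fail.

```python
def p_norm(x, p):
    """The expression (sum |x_v|^p)^(1/p); a norm only for p >= 1."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)


def minkowski_gap(x, y, p):
    """Return ||x||_p + ||y||_p - ||x + y||_p."""
    s = [a + b for a, b in zip(x, y)]
    return p_norm(x, p) + p_norm(y, p) - p_norm(s, p)


e1, e2 = [1.0, 0.0], [0.0, 1.0]      # assumed sample vectors
gap_p1 = minkowski_gap(e1, e2, 1.0)  # equality for p = 1
gap_p2 = minkowski_gap(e1, e2, 2.0)  # 2 - sqrt(2) > 0
gap_half = minkowski_gap(e1, e2, 0.5)  # negative: p = 1/2 is not admissible
print(gap_p1, gap_p2, gap_half)
```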
Corollary 23.18. For 1 ≤ p < ∞ a norm is given on Rn by || · ||p .

Definition 23.19. Let $\|\cdot\|$ be any norm on Rn .
A. A sequence (xk )k∈N , xk ∈ Rn , converges in Rn with respect to the norm
$\|\cdot\|$ to x ∈ Rn if for every ε > 0 there exists N(ε) ∈ N such that for k ≥ N(ε)
$$\|x_k - x\| < \varepsilon.$$
B. Let D ⊂ Rn be a set and x0 ∈ D. We call f : D → R continuous at x0
with respect to the norm $\|\cdot\|$ if for every ε > 0 there exists δ = δ(ε, x0 ) > 0 such
that x ∈ D and 0 < ||x − x0 || < δ implies
|f (x) − f (x0 )| < ε.
If f is continuous at all points of D we just call it continuous on D.

Example 23.20. Let L : Rn → R be a linear mapping, i.e. L ∈ (Rn )∗ . Then
L is continuous with respect to any of the norms $\|\cdot\|_p$, 1 ≤ p < ∞.
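Proof. Choose a basis {b1 , . . . , bn } ⊂ Rn . Then
$$L(x) = L\Big(\sum_{\nu=1}^{n} x_\nu b_\nu\Big) = \sum_{\nu=1}^{n} x_\nu L(b_\nu).$$
Now, for p > 1 we use Hölder’s inequality and obtain
$$|L(x) - L(y)| = |L(x - y)| \le \sum_{\nu=1}^{n} |x_\nu - y_\nu|\,|L(b_\nu)| \le \Big(\sum_{\nu=1}^{n} |L(b_\nu)|^q\Big)^{\frac{1}{q}} \|x - y\|_p,$$
where $\frac{1}{p} + \frac{1}{q} = 1$, or
$$|L(x) - L(y)| \le M\,\|x - y\|_p.$$
Hence, given ε > 0, take $\delta = \frac{\varepsilon}{M}$ to find for $\|x - y\|_p < \delta$
$$|L(x) - L(y)| \le M\,\|x - y\|_p < \varepsilon.$$
For p = 1 we just find
$$|L(x) - L(y)| = \Big|\sum_{\nu=1}^{n} (x_\nu - y_\nu) L(b_\nu)\Big| \le \max_{\nu = 1, \dots, n} |L(b_\nu)|\; \|x - y\|_1$$
implying the continuity of L in $(\mathbb{R}^n, \|\cdot\|_1)$.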
Example 23.21. (Compare with Problem 9 b)) Every norm $\|\cdot\|$ on Rn is
continuous, i.e. the mapping || · || : Rn → R, x ↦ ||x||, is continuous. Indeed,
the triangle inequality gives
$$\big|\, \|x\| - \|y\| \,\big| \le \|x - y\|$$
which implies the continuity.
Exercise 23.22. Prove that $(x_k)_{k \in \mathbb{N}}$, $x_k \in \mathbb{R}^n$, $x_k = (x_k^{(1)}, \dots, x_k^{(n)})$, converges
with respect to $\|\cdot\|_p$, 1 ≤ p < ∞, to $x = (x^{(1)}, \dots, x^{(n)}) \in \mathbb{R}^n$ if and only if
for all ν, 1 ≤ ν ≤ n, the sequence $(x_k^{(\nu)})_{k \in \mathbb{N}}$, $x_k^{(\nu)} \in \mathbb{R}$, converges in R to $x^{(\nu)}$.
Problems
1. Prove that the convexity of f : I → R implies Jensen’s inequality:
for every m ∈ N, m ≥ 2, and any choice of points x1 , . . . , xm ∈ I and
all 0 ≤ λj ≤ 1, j = 1, . . . , m, such that λ1 + · · · + λm = 1 it follows that
f (λ1 x1 + · · · + λm xm ) ≤ λ1 f (x1 ) + · · · + λm f (xm ). (23.12)
Hint: use mathematical induction with respect to m.
2. Give a direct proof that a convex function f : I → R, I ⊂ R being an
interval with end points a < b, is continuous on (a, b). Hint: use (23.4)
to estimate $\frac{f(x) - f(y)}{x - y}$ against a constant.
3. Let f : R → R be a convex function and suppose that at x0 ∈ R the

function f attains a local minimum. Show that x0 is in fact a global
minimum.
4. a) Using the fact that x ↦ ln x is a concave function on (0, ∞),
give a simple proof of the arithmetic-geometric mean inequality,
see Lemma 14.2, i.e. prove for x1 , . . . , xn > 0 that
$$\Big(\prod_{k=1}^{n} x_k\Big)^{\frac{1}{n}} \le \frac{1}{n} \sum_{k=1}^{n} x_k.$$
b) Prove that f : (0, ∞) → R, f (x) = x ln x is convex and derive
$$(x + y) \ln \frac{x + y}{2} \le x \ln x + y \ln y. \tag{23.13}$$
5. For $a \in [1, \frac{3}{2}]$ consider fa : [−1, 1] → R, $f_a(x) = e^{ax}$. Prove that fa is
convex and that
$$\sup_{a \in [1, \frac{3}{2}]} f_a(x) = \begin{cases} e^{\frac{3}{2}x}, & x \in [0, 1] \\ e^{x}, & x \in [-1, 0]. \end{cases}$$
6. Let f, h : R → R be convex and assume in addition that f is increasing.
Prove that f ◦ h is convex.
7. For k ∈ N let || · ||k be a norm on Rn . Prove that
$$d(x, y) := \sum_{k=1}^{\infty} \frac{1}{2^k}\, \frac{\|x - y\|_k}{1 + \|x - y\|_k}$$
is a metric on Rn , i.e. d(x, y) ≥ 0 and d(x, y) = 0 if and only if x = y,
d(x, y) = d(y, x), and the triangle inequality d(x, z) ≤ d(x, y) + d(y, z)
holds. Hint: to prove the triangle inequality use the fact that
$t \mapsto f(t) = \frac{t}{1 + t}$ is increasing on [0, ∞).
8. For the Euclidean norm || · || on Rn prove Peetre’s inequality
$$\frac{1 + \|x\|^2}{1 + \|y\|^2} \le 2\,(1 + \|x - y\|^2). \tag{23.14}$$
9. a) Let || · ||(1) and || · ||(2) be two norms on Rn . Prove that by
||x|| := ||x||(1) + ||x||(2)
and
|||x||| := max(||x||(1) , ||x||(2) )
two further norms are given on Rn .
b) Prove the converse triangle inequality
||x|| − ||y|| ≤ | ||x|| − ||y|| | ≤ ||x − y||.
10. Let $(x_k)_{k \in \mathbb{N}}$, $x_k \in \mathbb{R}^n$, $x_k = (x_k^{(1)}, \dots, x_k^{(n)})$, be a sequence in Rn . Prove
that (xk )k∈N converges to x ∈ Rn , $x = (x^{(1)}, \dots, x^{(n)})$, in the norm
|| · ||p , 1 ≤ p < ∞, if and only if
$$\lim_{k \to \infty} |x_k^{(j)} - x^{(j)}| = 0$$
for 1 ≤ j ≤ n.
11. Let $(x_k)_{k \in \mathbb{N}}$, $x_k = (x_k^{(1)}, \dots, x_k^{(n)}) \in \mathbb{R}^n$, be a sequence converging in the
norm || · ||p , p ∈ [1, ∞), to some $x = (x^{(1)}, \dots, x^{(n)}) \in \mathbb{R}^n$. Suppose that
|| · || is a further norm on Rn satisfying the inequality ||y|| ≤ c||y||p for
all y ∈ Rn with some c > 0. Prove that (xk )k∈N converges to x with
respect to || · ||.
24 Uniform Convergence and Interchanging Limits
A lot of the material in this chapter can be skipped during a first reading.
Of importance are the definitions of pointwise and uniform convergence, the
fact that uniform convergence can be described as convergence with respect
to the supremum norm and the result that the uniform limit of continuous
functions is continuous, Theorem 24.6. However here is the correct place to
add some further material to be considered later.
In the following let K ≠ ∅ be a set. We may consider functions f, g : K → R
and for α ∈ R it follows that the functions f ± g, f · g and αf can be defined
on K by
(f ± g)(x) := f (x) ± g(x), (24.1)
(f · g)(x) := f (x)g(x), (24.2)
(αf )(x) := αf (x). (24.3)
Note that we use the algebraic operation for real numbers (the target set of
our functions) to implement an algebraic structure on the set of functions
f : K → R. Of course this is not new to us, see Chapter 4. If we denote
the set of functions from K to R by M(K; R) := {f |f : K → R}, it is easy
to see that with the natural or pointwise operations (24.1)-(24.3) M(K; R)
is an R-algebra, in particular it is an R-vector space. The elements of this
vector space are functions. For example, if K = I ⊂ R is an interval we find
that C(I) ⊂ M(I; R) is a subspace, in fact a sub-algebra. Recall that C(I)
stands for the vector space of all continuous functions from I to R (also see
Problem 11, Chapter 20). The idea of considering functions as elements of
a vector space (or an algebra) is new to us - our next step is to consider
sequences of functions as sequences of elements in a vector space. For n ∈ N
let fn : K → R be a function. We may ask whether such a sequence (fn )n∈N
or (fn )n≥k , k ∈ Z, of functions has a limit, however what does this mean?
So far we only know limits of sequences of real numbers or of vectors in Rn
with respect to a norm || · ||p , see Definition 23.19. Thus instead of looking
at (fn )n∈N we may look at (fn (x))n∈N , x ∈ K, which is a sequence of real
numbers. More precisely for every x ∈ K we have a sequence of real numbers,
i.e. we are dealing with a family (indexed by K) of sequences of real numbers.
We can define (at least) two types of convergence, and in each case the limit
is again a function f : K → R.
Definition 24.1. A. We say that (fn )n∈N converges pointwise on K to
f if for all x ∈ K the sequences (fn (x))n∈N converge to f (x), i.e. for every
x ∈ K and every ε > 0 there exists N = N(x, ε) ∈ N such that n ≥ N
implies
|fn (x) − f (x)| < ε.
B. The sequence (fn )n∈N is said to converge uniformly to f if for every
ε > 0 there is N = N(ε) ∈ N such that n ≥ N implies for all x ∈ K
|fn (x) − f (x)| < ε.
The important difference is that in the case of uniform convergence N is
independent of x. Clearly, uniform convergence implies pointwise convergence.
Example 24.2. For n ≥ 2 define fn : [0, 1] → R by
$f_n(x) = \max\big(n - n^2 |x - \tfrac{1}{n}|,\, 0\big)$, see Figure 24.1.
[Figure 24.1: graph of fn ; the points 1/n, 2/n and 1 are marked on the x-axis]
The sequence (fn )n∈N\{1} converges pointwise on [0, 1] to f = 0. Indeed, for
x = 0 we have fn (x) = 0 for all n. Further, for every x ∈ (0, 1] there exists
N = N(x) ≥ 2 such that
$$\frac{2}{n} \le x \quad \text{for } n \ge N(x),$$
implying that for n ≥ N(x)
$$n - n^2 \Big|x - \frac{1}{n}\Big| \le n - n^2 \Big(\frac{2}{n} - \frac{1}{n}\Big) = n - n = 0,$$
hence fn (x) = 0 for n ≥ N(x), which yields $\lim_{n \to \infty} f_n(x) = 0$.
However (fn )n∈N\{1} does not converge uniformly to f = 0 since for no n ≥ 2
we have |fn (x) − 0| < 1 for all x ∈ [0, 1], note that $\max_{x \in [0,1]} f_n(x) = n$.
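The behaviour in Example 24.2 can be reproduced with a few lines of code. This is a minimal sketch; the function name `tent` and the sample point x = 0.1 are assumptions made for illustration.

```python
def tent(n, x):
    """The function f_n(x) = max(n - n^2 |x - 1/n|, 0) from Example 24.2."""
    return max(n - n * n * abs(x - 1.0 / n), 0.0)


x = 0.1  # assumed fixed sample point in (0, 1]
# Pointwise behaviour: once 2/n <= x the value at x is zero.
tail = [tent(n, x) for n in (25, 50, 1000)]
print(tail)

# The peak at x = 1/n has height n, so sup |f_n - 0| = n does not tend to 0.
peaks = [tent(n, 1.0 / n) for n in (2, 10, 100)]
print(peaks)
```

The first list stays at zero while the second grows with n, which is the gap between pointwise and uniform convergence.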
The last remark leads to a different description of uniform convergence.

Lemma 24.3. The sequence (fn )n∈N converges uniformly to f if and only if
for every ε > 0 there is N = N(ε) ∈ N such that for all n ≥ N
$$\sup_{x \in K} |f_n(x) - f(x)| < \varepsilon.$$
Proof. Suppose that (fn )n∈N converges uniformly to f. Then for ε > ε′ > 0
there exists N = N(ε′) such that
$$|f_n(x) - f(x)| < \varepsilon' \quad \text{for all } x \in K \text{ and } n \ge N,$$
hence
$$\sup_{x \in K} |f_n(x) - f(x)| \le \varepsilon' < \varepsilon \quad \text{for all } n \ge N.$$
Conversely, since
$$|f_n(x) - f(x)| \le \sup_{y \in K} |f_n(y) - f(y)|$$
for all x ∈ K it follows that if $\sup_{y \in K} |f_n(y) - f(y)| < \varepsilon$, then
|fn (x) − f (x)| < ε for all x ∈ K.
It turns out that uniform convergence can be considered as convergence with

respect to a suitable norm.
Definition 24.4. Let K ≠ ∅ be a set and f : K → R be a function. We set
$$\|f\|_{K,\infty} := \sup_{x \in K} |f(x)|. \tag{24.4}$$
If the set K is fixed we just write $\|f\|_\infty$ instead of $\|f\|_{K,\infty}$. We call ||f ||∞ the
supremum norm or just the sup norm of f .
Lemma 24.5. On the set $M_b(K; \mathbb{R}) := \{f : K \to \mathbb{R} \mid \sup_{x \in K} |f(x)| < \infty\}$ a
norm is given by $\|\cdot\|_{K,\infty}$, i.e. the following hold:
||f ||K,∞ ≥ 0 and ||f ||K,∞ = 0 if and only if f (x) = 0 for all x ∈ K;
i.e. f is the 0-element in Mb (K; R); (24.5)
||λf ||K,∞ = |λ|||f ||K,∞ for λ ∈ R and f ∈ Mb (K; R); (24.6)
||f + g||K,∞ ≤ ||f ||K,∞ + ||g||K,∞ for all f, g ∈ Mb (K; R). (24.7)
Proof. Clearly $\|f\|_\infty \ge 0$ and $\|f\|_\infty = 0$ means that |f (x)| = 0 for all x ∈ K,
implying f (x) = 0 for all x ∈ K. Further, for λ ∈ R we find
$$\|\lambda f\|_\infty = \sup_{x \in K} |\lambda f(x)| = |\lambda| \sup_{x \in K} |f(x)| = |\lambda|\,\|f\|_\infty.$$
Finally, for f, g ∈ Mb (K; R) it follows that
$$\|f + g\|_\infty = \sup_{x \in K} |f(x) + g(x)| \le \sup_{x \in K} |f(x)| + \sup_{x \in K} |g(x)| = \|f\|_\infty + \|g\|_\infty.$$
Note that the triangle inequality implies the converse triangle inequality,
i.e.
||f ||∞,K − ||g||∞,K ≤ | ||f ||∞,K − ||g||∞,K | ≤ ||f − g||∞,K ,
compare with Lemma 2.9 or Problem 9 b) in Chapter 23. The next theorem
shows the importance of uniform convergence.
Theorem 24.6. Let (fn )n∈N be a sequence in C(I), where I ⊂ R is an

interval and suppose that (fn )n∈N converges uniformly to f : I → R. Then f
is continuous, i.e. the uniform limit of continuous functions is continuous.
Proof. Let x ∈ I. We have to prove: given ε > 0 then there exists δ =
δ(x, ε) > 0 such that
$$|f(x) - f(x')| < \varepsilon \quad \text{for all } x' \in I,\ |x - x'| < \delta.$$
Since (fn )n∈N converges uniformly to f, there exists N ∈ N such that
$$|f_N(y) - f(y)| < \frac{\varepsilon}{3} \quad \text{for all } y \in I.$$
Since fN is continuous at x, there exists δ > 0 such that
$$|f_N(x) - f_N(x')| < \frac{\varepsilon}{3} \quad \text{for all } x' \in I,\ |x - x'| < \delta.$$
Therefore, for all x′ ∈ I such that |x − x′| < δ it follows that
$$|f(x) - f(x')| \le |f(x) - f_N(x)| + |f_N(x) - f_N(x')| + |f_N(x') - f(x')| < \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon$$
and the theorem is proved.
Example 24.7. Consider on [0, 1] the sequence of functions $f_n(x) = x^n$. This
sequence of continuous functions converges pointwise, namely for x ∈ [0, 1)
we find $\lim_{n \to \infty} x^n = 0$ whereas for x = 1 we have $\lim_{n \to \infty} x^n = 1$.
The limit function is
$$f(x) = \begin{cases} 0, & x \in [0, 1) \\ 1, & x = 1 \end{cases}$$
and it is discontinuous.
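That the convergence in Example 24.7 is not uniform can also be seen directly: for every n there is a point in [0, 1) where fn still takes the value 1/2. The sketch below computes such a point; the helper name `witness` is an assumption made for illustration.

```python
def f(n, x):
    """The function f_n(x) = x^n on [0, 1]."""
    return x ** n


def witness(n):
    """A point x in [0, 1) with f_n(x) = 1/2, namely x = (1/2)^(1/n)."""
    return 0.5 ** (1.0 / n)


for n in (1, 10, 1000):
    x = witness(n)
    # |f_n(x) - f(x)| = 1/2 at this point, since the limit f vanishes on [0, 1).
    print(n, x, f(n, x))

# Pointwise convergence at a fixed point, here x = 0.9:
print(f(200, 0.9))
```

Hence the supremum of |fn − f| over [0, 1] is at least 1/2 for every n, although fn(x) tends to 0 at each fixed x < 1.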
We want to study uniform convergence more closely, and as already men-
tioned, the following could be skipped in a first reading. As we will see there
is a small problem when dealing with uniform convergence and boundedness
of sequences.
Example 24.8. Consider the sequence (fn )n∈N0 where fn : R → R and
$f_0(x) = e^x$ and $f_n(x) = \frac{1}{n} \sin nx$ for n ≥ 1. Given ε > 0 we take $N(\varepsilon) = \lfloor \frac{1}{\varepsilon} \rfloor + 1$ to find
that for n > N(ε) it follows that $\frac{1}{n} < \varepsilon$, and consequently, for n > N(ε) we
have
$$|f_n(x) - 0| = \frac{1}{n} |\sin nx| \le \frac{1}{n} < \varepsilon.$$
Hence (fn )n∈N0 converges uniformly to the function x ↦ 0 for all x ∈ R.
However f0 is unbounded. Thus for n > N(ε) we have $\sup_{x \in \mathbb{R}} |f_n(x)| < \varepsilon$,
i.e. fn , n > N(ε), is bounded and clearly the limit function is bounded, but
not all functions of the sequence (fn )n∈N0 must be bounded.
In general we have
Lemma 24.9. Let (fn )n∈N , fn ∈ M(K; R), be a sequence converging uniformly
to f ∈ M(K; R). If for all n ≥ N0 the functions fn are bounded, i.e.
n ≥ N0 implies fn ∈ Mb (K; R), then the limit f must be a bounded function
too, i.e.
$$|f(x)| \le \sup_{x \in K} |f(x)| = \|f\|_\infty < \infty. \tag{24.8}$$
Proof. By uniform convergence we know that for ε = 1 there exists N1 ∈ N
such that ||f − fn ||∞ ≤ 1 for n ≥ N1 . For n0 ≥ max(N0 , N1 ) we find
$$\|f\|_\infty \le \|f - f_{n_0}\|_\infty + \|f_{n_0}\|_\infty \le 1 + \|f_{n_0}\|_\infty < \infty.$$
Corollary 24.10. If a sequence fn ∈ Mb (K; R) converges uniformly to f ∈

M(K; R) then f ∈ Mb (K; R) and the sequence is bounded in the sense that
||fn ||∞ ≤ C < ∞ with C independent of n. Moreover we have ||f ||∞ ≤ C.
Proof. The first part follows from
||fn ||∞ ≤ ||fn − f ||∞ + ||f ||∞
and Lemma 24.9. To prove ||f ||∞ ≤ C note that for ε > 0 there exists N(ε)
such that n ≥ N(ε) implies by the converse triangle inequality that
||f ||∞ − ||fn ||∞ ≤ ||f − fn ||∞ < ε
or
||f ||∞ ≤ ε + ||fn ||∞ ≤ ε + C,
however ε > 0 was arbitrary which implies ||f ||∞ ≤ C.
In order to simplify matters, in the following we will only investigate uniform
convergence in Mb (K; R). As a first result we prove that the Cauchy criterion
holds for uniform convergence in Mb (K; R).
Theorem 24.11. A sequence (fn )n∈N , fn ∈ Mb (K; R), converges uniformly
with limit f ∈ Mb (K; R) if and only if for every ε > 0 there exists N(ε) such
that n, m ≥ N(ε) implies ||fn − fm ||∞ < ε.
Proof. Suppose that (fn )n∈N converges uniformly to f . For ε > 0 there exists
N(ε) such that n ≥ N(ε) implies $\|f - f_n\|_\infty < \frac{\varepsilon}{2}$ which yields for n, m ≥ N(ε)
that
$$\|f_n - f_m\|_\infty = \|f_n - f + f - f_m\|_\infty \le \|f_n - f\|_\infty + \|f_m - f\|_\infty < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$
Conversely suppose that for ε > 0 there exists N(ε) such that n, m ≥ N(ε)
implies ||fn − fm ||∞ < ε. This gives for every x ∈ K and n, m ≥ N(ε),
$$|f_n(x) - f_m(x)| \le \|f_n - f_m\|_\infty < \varepsilon, \tag{24.9}$$
i.e. for every x ∈ K the sequence (fn (x))n∈N is a Cauchy sequence in R,
hence has a limit f (x). We define the function f : K → R by x ↦ f (x), and
we want to prove that (fn )n∈N converges uniformly to f . In (24.9) we may
pass to the limit as m → ∞ to find
$$|f_n(x) - f(x)| \le \varepsilon, \tag{24.10}$$
which yields
$$\|f_n - f\|_\infty = \sup_{x \in K} |f_n(x) - f(x)| \le \varepsilon, \tag{24.11}$$
i.e. (fn )n∈N converges uniformly to f , and by Lemma 24.9 we have f ∈ Mb (K; R).

Definition 24.12. A sequence (fn )n∈N , fn ∈ Mb (K; R), is called a Cauchy
sequence with respect to the norm || · ||∞ if for every ε > 0 there exists
N(ε) ∈ N such that n, m ≥ N(ε) implies ||fn − fm ||∞ < ε.
We proved in Theorem 24.11 that on the vector space Mb (K; R) equipped
with the sup norm || · ||∞ every Cauchy sequence with respect to the sup
norm has a limit in Mb (K; R) with respect to the sup norm. In this sense we
call (Mb (K; R), || · ||∞ ) a complete normed space or Banach space.
Lemma 24.13. Let (fn )n∈N , (gn )n∈N be two sequences in Mb (K; R) which
converge uniformly to f and g, respectively. Then (fn + gn )n∈N converges
uniformly to f + g and (fn · gn )n∈N converges uniformly to f · g. In particular,
for λ ∈ R the sequence (λfn )n∈N converges uniformly to λf .
Proof. In light of Theorem 24.6 we need to prove the convergence of (fn +
gn )n∈N to f + g and the convergence of (fn · gn )n∈N to f · g with respect to
the norm || · ||∞ . We proceed as in the proofs of the analogous results for
sequences of real numbers by replacing the absolute value by the norm ||·||∞ .
For ε > 0 there exists N(ε) such that n ≥ N(ε) implies ||f − fn ||∞ < ε and
||g − gn ||∞ < ε which implies by the triangle inequality
$$\|(f_n + g_n) - (f + g)\|_\infty \le \|f_n - f\|_\infty + \|g_n - g\|_\infty < \varepsilon + \varepsilon = 2\varepsilon,$$
i.e. (fn + gn )n∈N converges uniformly to f + g. Moreover, since (gn )n∈N is
bounded with respect to || · ||∞ , i.e. ||gn ||∞ ≤ c0 , and with ||f ||∞ ≤ c1 it
follows that
$$\|f_n g_n - f g\|_\infty = \|f_n g_n - f g_n + f g_n - f g\|_\infty \le \|(f_n - f) g_n\|_\infty + \|f (g_n - g)\|_\infty.$$
Since for h1 , h2 ∈ Mb (K; R) we have
$$\|h_1 h_2\|_\infty = \sup_{x \in K} |h_1(x) h_2(x)| \le \sup_{x \in K} |h_1(x)| \sup_{x \in K} |h_2(x)| = \|h_1\|_\infty \|h_2\|_\infty$$
it follows
$$\|f_n g_n - f g\|_\infty \le \|g_n\|_\infty \|f_n - f\|_\infty + \|f\|_\infty \|g_n - g\|_\infty \le c_0 \|f_n - f\|_\infty + c_1 \|g_n - g\|_\infty < (c_0 + c_1)\varepsilon,$$
implying the uniform convergence of (fn · gn )n∈N to f · g.
We have seen in Example 24.7 that there are pointwise convergent sequences
of continuous functions which are not uniformly convergent and whose limit
is not continuous. If we combine Proposition 23.8 and Corollary 23.6 we
see that the pointwise limit of convex functions is continuous, i.e. uniform
convergence is not needed to get continuity. The argument is that con-
vex functions are continuous and that pointwise limits of convex functions
are convex. The next result gives a further example that using additional
information we sometimes get that pointwise convergence implies uniform
convergence.
Proposition 24.14. Let fn : [a, b] → R, a < b be a sequence of increasing

functions converging pointwise to a continuous function f : [a, b] → R then
the convergence is uniform.
Remark 24.15. A. This result also holds for sequences of decreasing func-
tions.
B. Note that we do not require fn to be continuous, i.e. we may have a
sequence of non-continuous functions converging uniformly to a continuous
function.
Proof of Proposition 24.14. As a continuous function on a compact interval,
f is uniformly continuous. Thus for ε > 0 there exists δ > 0 such that
|x − y| < δ, x, y ∈ [a, b], implies $|f(x) - f(y)| < \frac{\varepsilon}{2}$. Now we choose a partition
of [a, b] with points a = x0 < x1 < · · · < xk = b such that |xj − xj−1 | < δ
for j = 1, . . . , k. Using the pointwise convergence of the sequence (fn )n∈N we
deduce
$$\lim_{n \to \infty} f_n(x_j) = f(x_j), \quad j = 0, \dots, k.$$
Thus there exists N0 such that n ≥ N0 implies
$$|f_n(x_j) - f(x_j)| < \frac{\varepsilon}{2}, \quad j = 0, \dots, k. \tag{24.12}$$
For x ∈ [a, b] we find j such that xj−1 ≤ x ≤ xj and the monotonicity of fn
implies by (24.12) that
$$f(x_{j-1}) - \frac{\varepsilon}{2} < f_n(x_{j-1}) \le f_n(x) \le f_n(x_j) < f(x_j) + \frac{\varepsilon}{2}.$$
As a pointwise limit of increasing functions f must be increasing, compare
with Problem 4, i.e. f (xj−1 ) ≤ f (x) ≤ f (xj ) which now yields using the
uniform continuity of f
$$-\varepsilon < f(x_{j-1}) - f(x_j) - \frac{\varepsilon}{2} \le f_n(x) - f(x) \le f(x_j) - f(x_{j-1}) + \frac{\varepsilon}{2} < \varepsilon$$
or |fn (x) − f (x)| < ε for x ∈ [a, b] and n ≥ N0 , i.e. for n ≥ N0 we have
||fn − f ||∞ < ε.
Exercise 24.16. Let fn : [a, b] → R be a sequence of monotone increasing
functions converging pointwise to f : [a, b] → R. Show that f is increasing.
The Bolzano-Weierstrass theorem, Theorem 17.6, states that every bounded
sequence in R has a convergent subsequence. We may ask whether such
a result holds for uniformly convergent sequences of functions too. In fact
this is not the case, but with certain additional conditions the result can be
rescued. This is the famous Arzela-Ascoli theorem which will be discussed
later in our course.
Let us return to Theorem 24.6. We can interpret this result differently as
follows:
$$\lim_{n \to \infty} \lim_{x \to x_0} f_n(x) = \lim_{x \to x_0} \lim_{n \to \infty} f_n(x), \tag{24.13}$$
i.e. under uniform convergence we are allowed to interchange the order of
the limits.
We have seen that pointwise convergence is not sufficient to justify (24.13).
However, what about differentiability? When does the following hold?
$$\Big(\lim_{n \to \infty} f_n\Big)'(x) = \lim_{n \to \infty} f_n'(x). \tag{24.14}$$
It turns out that uniform convergence of $(f_n)_{n \in \mathbb{N}}$ or of $(f_n')_{n \in \mathbb{N}}$ is not sufficient:
Example 24.17. For n ∈ N consider fn : R → R, $f_n(x) = \frac{1}{n} \sin nx$. Since
$\|f_n\|_\infty = \sup_{x \in \mathbb{R}} |\frac{1}{n} \sin nx| = \frac{1}{n}$ it follows that (fn )n∈N converges uniformly
on R to f0 (x) = 0 for all x ∈ R and f0 is differentiable with derivative
$f_0'(x) = 0$ for all x ∈ R. However $f_n'(x) = \frac{1}{n}\, n \cos nx = \cos nx$ and $(f_n')_{n \in \mathbb{N}}$ is
not even pointwise convergent, note that for x = π it follows for even n that
cos nπ = 1, while for odd n we have cos nπ = −1.
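The two effects in Example 24.17 can be observed numerically. The following sketch evaluates fn on a few assumed sample points and the derivatives at x = π; the names `f` and `df` are illustrative only.

```python
import math


def f(n, x):
    """The function f_n(x) = sin(n x) / n."""
    return math.sin(n * x) / n


def df(n, x):
    """Its derivative f_n'(x) = cos(n x)."""
    return math.cos(n * x)


# Uniform smallness: |f_n(x)| <= 1/n at every sample point.
points = [-3.0, -0.5, 0.0, 1.0, 2.5]  # assumed sample points
bounds_ok = all(abs(f(n, x)) <= 1.0 / n for n in (1, 5, 50) for x in points)
print(bounds_ok)

# The derivatives at x = pi alternate between -1 and 1 and do not converge.
values = [df(n, math.pi) for n in range(1, 7)]
print(values)
```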
Example 24.18. Now consider gn : R → R, $g_n(x) = n \cos\big(\frac{1}{n^2} x\big)$. Then
limn→∞ gn (x) does not in general exist, i.e. we cannot define a limit function
g. However, $g_n'(x) = -\frac{1}{n} \sin\big(\frac{1}{n^2} x\big)$, and since
$$\|g_n'\|_\infty = \sup_{x \in \mathbb{R}} \Big|\frac{1}{n} \sin\Big(\frac{1}{n^2} x\Big)\Big| = \frac{1}{n},$$
it follows that $(g_n')_{n \in \mathbb{N}}$ converges uniformly to 0, i.e. the function x ↦ 0 for
all x ∈ R.
It turns out that the pointwise convergence of $(f_n)_{n \in \mathbb{N}}$ and the uniform
convergence of $(f_n')_{n \in \mathbb{N}}$ will be sufficient to imply (24.14), but to prove this we
will need more tools.
Problems
1. Show that the sequence (gn )n∈N , gn : R → R, where
$$g_n(x) = \begin{cases} \frac{x}{n}, & \text{if } n \text{ is even} \\ \frac{1}{n}, & \text{if } n \text{ is odd} \end{cases}$$
is pointwise convergent but not uniformly convergent.
2. Prove that the pointwise limit of the sequence (fn )n∈N , fn : [0, 1] → R,
$f_n(x) = \frac{1}{1 + (nx - 1)^2}$, is
$$f(x) = \begin{cases} \frac{1}{2}, & x = 0 \\ 0, & x \in (0, 1], \end{cases}$$
and deduce that the convergence cannot be uniform.
3. Test the following for uniform convergence:
a) $f_n(x) = x^n (1 - x)$ on [0, 1];
b) $g_n(x) = \frac{n x^2}{1 + n x}$ on [0, 1];
c) $h_n(x) = \arctan \frac{4x}{x^2 + n^4}$ on R;
d) $k_{\alpha,n}(x) = \frac{1}{n^\alpha} \cos(a_n x)$ on R for any sequence (an )n∈N , an ∈ R, α > 0.
4. Prove that if (fn )n∈N , fn : I → R, where I ⊂ R is an interval, is increasing
and converges pointwise on I to f : I → R, then f is increasing
too.

5. Consider the polynomial $p(x) = \sum_{k=0}^{N} c_k x^k$ of degree N. Show that
there exists a sequence of polynomials $p_n(x) = \sum_{k=0}^{N} c_{k,n} x^k$, with rational
coefficients ck,n ∈ Q converging uniformly on [0, 1] to p.
6. Let I ⊂ R be an interval and suppose that the sequence fn : I → R

of continuous functions converges uniformly to the continuous function
f : I → R. Let (xn )n∈N , xn ∈ I, be a sequence converging to x ∈ I.
Prove that
lim fn (xn ) = f (x).
n→∞
7. Let fn ∈ C((a, b)), a < b and n ∈ N, be a sequence of continuous

functions with the property that for every compact interval [α, β] ⊂
(a, b) the sequence $(f_n|_{[\alpha,\beta]})_{n \in \mathbb{N}}$ converges uniformly to a function gα,β .
Prove that then (fn )n∈N converges pointwise on (a, b) to a continuous
function f ∈ C((a, b)).
8. Let f : R → R be a continuously differentiable function such that
f′ is uniformly continuous on R. Prove that gn : R → R,
$g_n(x) := n\big(f(x + \tfrac{1}{n}) - f(x)\big)$, converges uniformly on R to f′.
9. Consider fn : [−1, 1] → R where $f_n(x) = \frac{x}{1 + n x^2}$. Prove that (fn )n∈N
converges uniformly to the zero function, i.e. to the function x ↦
h(x) = 0 for all x ∈ [−1, 1], while $(f_n')_{n \in \mathbb{N}}$ converges pointwise to
$$g(x) := \begin{cases} 1, & x = 0 \\ 0, & x \in [-1, 1] \setminus \{0\}. \end{cases}$$
25 The Riemann Integral

In this and the following chapter we want to rigorously derive the results
already discussed and used in Chapters 12 and 13. Our starting point is to
determine the area A bounded by the graph Γ(f ) of a function f : [a, b] → R,
a < b, the interval [a, b], the line segment joining (a, 0) and (a, f (a)) and the
line segment joining (b, 0) and (b, f (b)), see Figure 25.1.
[Figure 25.1: the graph Γ(f ) over [a, b] with the values f (a) and f (b) marked]
We take for granted that the area of a rectangle with vertices (a, 0), (b, 0),
(b, c), (a, c), a < b, 0 < c, is given by
A = (b − a)c.
[Figure 25.2: the rectangle over the interval [a, b]]
Interpreting the line segment connecting (a, c) with (b, c) as the graph of the
function fc : [a, b] → R, fc (x) = c, we find for the area of this rectangle
A = fc (a)(b − a), in fact A = fc (ξ)(b − a) for every ξ ∈ [a, b]. Furthermore,
when looking at Figure 25.3
[Figure 25.3: a piecewise constant function with values c1 , c2 , c3 , . . . , cn on the partition a = x0 < x1 < · · · < xn−1 < xn = b]
we of course agree that the area A is given by
$$A = \sum_{k=1}^{n} c_k (x_k - x_{k-1}) \tag{25.1}$$
or with f : [a, b] → R, $f|_{(x_{k-1}, x_k)}(x) = c_k$,
$$A = A(f) = \sum_{k=1}^{n} f(\xi_k)(x_k - x_{k-1}), \quad \xi_k \in (x_{k-1}, x_k). \tag{25.2}$$
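Formula (25.1) translates directly into code. The sketch below assumes a piecewise constant function given by its partition points and its values on the open subintervals; the name `step_area` and the sample data are illustrative.

```python
def step_area(points, heights):
    """Return sum_k c_k (x_k - x_{k-1}) as in (25.1).

    points  -- partition a = x_0 < x_1 < ... < x_n = b
    heights -- values c_1, ..., c_n on the open intervals (x_{k-1}, x_k)
    """
    if len(points) != len(heights) + 1:
        raise ValueError("need exactly one height per subinterval")
    return sum(c * (right - left)
               for c, left, right in zip(heights, points[:-1], points[1:]))


# Assumed sample data: heights 2, 0.5, 3 on [0, 1], [1, 3], [3, 4].
area = step_area([0.0, 1.0, 3.0, 4.0], [2.0, 0.5, 3.0])
print(area)  # 2*1 + 0.5*2 + 3*1 = 6.0
```

The values at the partition points themselves do not enter the sum, in line with (25.2), where each ξk is taken from the open interval.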
Now we have the obvious idea: in order to find the area bounded by Γ(f ), f :
[a, b] → R, f (x) ≥ 0, and the interval [a, b] as well as the line segment
connecting (a, 0) and (a, f (a)) and the line segment connecting (b, 0) and
(b, f (b)), see Figure 25.1, we approximate Γ(f ) by the graphs of piecewise
constant functions, see Figure 25.4, and try to pass to the limit.
[Figure 25.4: approximation of Γ(f ) by piecewise constant functions with respect to the partitions a = x0 < x1 < · · · < xn = b and a = t0 < t1 < · · · < tm = b]
This idea eventually leads to a solution of the original problem, but we must
first overcome a few difficulties. One problem is that the area A(f ) we
are looking for is not yet defined. In fact only after we define a proper
approximation process for a (large) class of functions can we define the “area
under the graph” of a function. We therefore need to find for a class of
functions a way of approximating them with piecewise constant functions ϕ
such that the area A(ϕ) associated with ϕ by (25.2) converges to a quantity
A(f ) which we can interpret as the “area under Γ(f )”.
First let us consider piecewise constant functions.
Definition 25.1. Let [a, b], a < b, be a closed and bounded, hence compact
interval. We call a finite set of numbers or points a = x0 < x1 < · · · <
xn−1 < xn = b a partition Z of [a, b].
We denote partitions by Z = Z(x0 , . . . , xn ) with the understanding that x0 =

a and xn = b. When we want to emphasise the corresponding interval [a, b]
we write Z(a, x1 , . . . , xn−1 , b). On [a, b] we can consider several partitions
Z1 = Z(x0 , . . . , xn ), Z2 = Z(t0 , . . . , tm ) (clearly a = x0 = t0 , b = xn = tm ).
Given two partitions Z1 (x0 , . . . , xn ) and Z2 (t0 , . . . , tm ) we can construct the
joint partition Z(y0 , . . . , yk ) by
Z(y0 , . . . , yk ) = Z1 (x0 , . . . , xn ) ∪ Z2 (t0 , . . . , tm ) = {x0 , . . . , xn } ∪ {t0 , . . . , tm },
and clearly y0 = x0 = t0 = a, yk = xn = tm = b. Given a partition Z =
Z(x0 , . . . , xn ) of [a, b], we call
max{xk − xk−1 |k = 1, . . . n} (25.3)
the mesh size or width of Z. If xk − xk−1 = η is independent of k we call

Z an equidistant partition of [a, b], a < b.
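The joint partition and the mesh size (25.3) are simple to compute. Here is a minimal sketch, assuming partitions are stored as increasing lists of floats; the function names and the sample partitions of [0, 1] are illustrative.

```python
def joint_partition(z1, z2):
    """Union of two partitions of the same interval, as a sorted list."""
    return sorted(set(z1) | set(z2))


def mesh(z):
    """Mesh size max{x_k - x_{k-1}} of a partition z, see (25.3)."""
    return max(right - left for left, right in zip(z[:-1], z[1:]))


z1 = [0.0, 0.5, 1.0]          # assumed partitions of [0, 1]
z2 = [0.0, 0.25, 0.75, 1.0]
z = joint_partition(z1, z2)
print(z)                      # [0.0, 0.25, 0.5, 0.75, 1.0]
print(mesh(z1), mesh(z2), mesh(z))
```

Since the joint partition contains both Z1 and Z2, its mesh size cannot exceed either of theirs; in this sample it happens to be equidistant.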
Definition 25.2. Let ϕ : [a, b] → R be a function. We call ϕ a step function
on [a, b] if there exists a partition Z = Z(x0 , . . . , xn ) of [a, b] such that
ϕ|(xk−1 ,xk ) is constant, k = 1, . . . , n, i.e. ϕ(x) = ck for all x ∈ (xk−1 , xk ) and
some ck ∈ R.
The set of all step functions on [a, b] is denoted by T [a, b]. Note that in
Definition 25.2 no statement about the values ϕ(xk ), xk ∈ Z, is made, except
that they are real numbers.
Remark 25.3. A. It is worth mentioning that ϕ ∈ T [a, b] may have different

step function representations. Take the constant function ϕ(x) = c for all
x ∈ [a, b]. It is a step function with respect to the partition Z = Z(x0 , x1 ) =
{a, b}, but for any finite number of points x0 < x1 < · · · < xn−1 < xn
we can consider ϕ as a step function with respect to that partition, i.e.
ϕ|(xk−1 ,xk ) = ck = c. In general if ϕ is a step function with respect to Z1 and
Z2 is a partition such that Z1 ⊂ Z2 then we can also represent ϕ as a step
function with respect to Z2 .
B. Given a step function ϕ : [a, b] → R with respect to the partition Z = Z(x0 , . . . , xn ) with ϕ|(xk−1 ,xk ) = ck . We can write ϕ as
$$\phi(x) = \sum_{k=1}^{n} c_k\,\chi_{(x_{k-1},x_k)}(x) + \sum_{k=0}^{n} \phi(x_k)\,\chi_{\{x_k\}}(x), \qquad (25.4)$$
where as usual
$$\chi_A(x) = \begin{cases} 1, & x \in A \\ 0, & x \notin A \end{cases}$$
denotes the characteristic function of the set A.
We have seen that the set of all functions f : [a, b] → R, a < b, form an
R-vector space, in fact even an algebra, with respect to pointwise operations,
i.e. (f + g)(x) = f (x) + g(x), (λf )(x) = λf (x), (f · g)(x) = f (x)g(x).
Lemma 25.4. The step functions T [a, b] are a subspace of the vector space
of all real-valued functions defined on [a, b].
Proof. We have to prove that ϕ, ψ ∈ T [a, b], λ ∈ R imply that ϕ + ψ ∈ T [a, b]
and λϕ ∈ T [a, b]. Let ϕ be given with respect to the partition Z1 and ψ with
respect to the partition Z2 . We now consider ϕ and ψ as step functions with
respect to the joint partition Z = Z1 ∪ Z2 = {t0 , . . . , tk }. For 1 ≤ l ≤ k
we have ϕ|(tl−1 ,tl ) = cl and ψ|(tl−1 ,tl ) = dl for some cl , dl ∈ R, and therefore
(ϕ + ψ)|(tl−1 ,tl ) = cl + dl , i.e. with respect to Z the function ϕ + ψ is also a
step function. Obviously, with ϕ ∈ T [a, b] it follows that λϕ ∈ T [a, b], since
ϕ|(xj−1 ,xj ) = cj implies (λϕ)|(xj−1 ,xj ) = λcj .
Exercise 25.5. Prove that T [a, b] is an algebra.

The next result is crucial for the following reason. It tells us that a continuous
function can always be “sandwiched” between two step functions such that
these two step functions differ only by a prescribed magnitude.
Theorem 25.6. Let f : [a, b] → R, a < b, be a continuous function. For ε > 0 there exist ϕ, ψ ∈ T [a, b] such that
ϕ(x) ≤ f (x) ≤ ψ(x) for all x ∈ [a, b] (25.5)
and
ψ(x) − ϕ(x) = |ψ(x) − ϕ(x)| ≤ ε for all x ∈ [a, b]. (25.6)
Proof. As a continuous function on a compact set, f is uniformly continuous. Hence for ε > 0 there exists δ > 0 such that x, y ∈ [a, b] and |x − y| < δ imply |f (x) − f (y)| < ε/2. We divide [a, b] into n equally long intervals with length less than δ:
$$t_k := a + k\,\frac{b-a}{n}, \qquad k = 0, 1, \dots, n,$$
where n is chosen such that (b − a)/n < δ. This gives an equidistant partition Z = Z(t0 , . . . , tn ) of [a, b] and we will define ϕ and ψ with respect to Z. For 1 ≤ k ≤ n we set
$$c_k := f(t_k) + \frac{\varepsilon}{2}, \qquad c_k' := f(t_k) - \frac{\varepsilon}{2},$$
and
ϕ(a) = ψ(a) := f (a), (25.7)
as well as for x ∈ (tk−1 , tk ], k = 1, . . . , n,
$$\phi(x) = c_k', \qquad \psi(x) = c_k. \qquad (25.8)$$
The definition of $c_k$, $c_k'$ yields
|ϕ(x) − ψ(x)| ≤ ε for all x ∈ [a, b].
For x = a = t0 we have ϕ(x) = f (x) = ψ(x), hence ϕ(x) ≤ f (x) ≤ ψ(x). For x ∈ (tk−1 , tk ] it follows that |x − tk | < δ and therefore
$$-\frac{\varepsilon}{2} < f(x) - f(t_k) < \frac{\varepsilon}{2},$$
or
$$\phi(x) = c_k' = f(t_k) - \frac{\varepsilon}{2} < f(x) < f(t_k) + \frac{\varepsilon}{2} = c_k = \psi(x),$$
i.e. ϕ(x) ≤ f (x) ≤ ψ(x) for all x ∈ [a, b].
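The construction in this proof can be carried out numerically. The sketch below assumes that a suitable δ is supplied by the caller (for sin, which is Lipschitz continuous with constant 1, δ = ε/2 works); the helper name `sandwich` and the sampling check are illustrative and not part of the text.

```python
import math


def sandwich(f, a, b, eps, delta):
    """Step functions (phi, psi) as in the proof of Theorem 25.6.

    Assumption: |x - y| < delta implies |f(x) - f(y)| < eps/2 on [a, b].
    """
    n = math.floor((b - a) / delta) + 1          # then (b - a)/n < delta
    t = [a + k * (b - a) / n for k in range(n + 1)]

    def index(x):
        # smallest k >= 1 with x <= t_k, i.e. x lies in (t_{k-1}, t_k]
        return next(k for k in range(1, n + 1) if x <= t[k])

    def phi(x):
        return f(a) if x == a else f(t[index(x)]) - eps / 2

    def psi(x):
        return f(a) if x == a else f(t[index(x)]) + eps / 2

    return phi, psi


eps = 0.1
phi, psi = sandwich(math.sin, 0.0, 3.0, eps, delta=eps / 2)

# Check (25.5) and (25.6) on a grid of sample points (a spot check, not a proof).
samples = [3.0 * i / 1000 for i in range(1001)]
ok = all(
    phi(x) <= math.sin(x) <= psi(x) and psi(x) - phi(x) <= eps + 1e-12
    for x in samples
)
```

The check only samples finitely many points; the proof above is what guarantees the inequalities on all of [a, b].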
Now we define an integral for step functions ϕ ∈ T [a, b] with the aim to
extend it at least to all continuous functions on [a, b].
Definition 25.7. Let ϕ ∈ T [a, b] be given with respect to the partition Z = Z(x0 , . . . , xn ) by ϕ|(xk−1 ,xk ) = ck . The integral of ϕ is defined by
$$\int_a^b \phi(x)\,dx := \sum_{k=1}^{n} c_k\,(x_k - x_{k-1}). \qquad (25.9)$$
Note that the integral does not depend on the values ϕ(xk ), k = 0, . . . , n. However, the integral as defined by (25.9) seems to depend on Z, but ϕ can be represented with respect to other partitions. So we need to prove that the integral only depends on ϕ and not on the chosen partition to represent ϕ.
Lemma 25.8. The definition of $\int_a^b \phi(x)\,dx$ is independent of the choice of partition representing ϕ.
Proof. Let Z1 (x0 , . . . , xn ) and Z2 (t0 , . . . , tm ) be two partitions of [a, b] such that $\phi|_{(x_{k-1},x_k)} = c_k$ and $\phi|_{(t_{l-1},t_l)} = c_l'$. We have to prove
$$\sum_{i=1}^{n} c_i\,(x_i - x_{i-1}) = \sum_{j=1}^{m} c_j'\,(t_j - t_{j-1}).$$
Suppose first that Z1 ⊂ Z2 , $x_i = t_{k_i}$. It follows that
$$x_{i-1} = t_{k_{i-1}} < t_{k_{i-1}+1} < \dots < t_{k_i} = x_i, \qquad 1 \le i \le n,$$
and
$$c_j' = c_i \quad \text{for } k_{i-1} < j \le k_i,$$
implying
$$\sum_{j=1}^{m} c_j'\,(t_j - t_{j-1}) = \sum_{i=1}^{n} \sum_{j=k_{i-1}+1}^{k_i} c_j'\,(t_j - t_{j-1}) = \sum_{i=1}^{n} c_i\,(x_i - x_{i-1}).$$
The general case follows by using Z = Z1 ∪ Z2 as a third partition and applying the case just proven to Z and Z1 as well as Z and Z2 .
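A small numerical illustration of Lemma 25.8: the sum (25.9) is evaluated for one step function with respect to a partition and with respect to a refinement. This is a sketch under the assumption that a step function is stored as a list of partition points plus a list of values on the open sub-intervals; the names `step_integral` and `refine` are hypothetical.

```python
from typing import List, Sequence


def step_integral(points: Sequence[float], values: Sequence[float]) -> float:
    """Sum of c_k (x_k - x_{k-1}) as in (25.9)."""
    if len(values) != len(points) - 1:
        raise ValueError("need exactly one value per open sub-interval")
    return sum(c * (r - l) for c, l, r in zip(values, points, points[1:]))


def refine(points: Sequence[float], values: Sequence[float],
           finer: Sequence[float]) -> List[float]:
    """Values of the same step function on the sub-intervals of a finer partition."""
    new_values = []
    for left, right in zip(finer, finer[1:]):
        mid = 0.5 * (left + right)
        # find k with x_{k-1} < mid < x_k; it exists because `finer` contains `points`
        k = next(i for i in range(1, len(points)) if points[i - 1] < mid < points[i])
        new_values.append(values[k - 1])
    return new_values


z1 = [0.0, 1.0, 3.0]
c = [2.0, -1.0]                   # phi = 2 on (0, 1), phi = -1 on (1, 3)
z2 = [0.0, 0.5, 1.0, 2.0, 3.0]    # a partition with Z1 contained in Z2
c2 = refine(z1, c, z2)

integral_z1 = step_integral(z1, c)
integral_z2 = step_integral(z2, c2)
```

Both sums agree, as the lemma predicts; the values of ϕ at the partition points never enter the computation.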
On T [a, b] the integral is linear and positivity preserving, i.e. the integral of
non-negative functions is non-negative.
Theorem 25.9. For ϕ, ψ ∈ T [a, b] and λ ∈ R the following hold:
$$\int_a^b (\phi + \psi)(x)\,dx = \int_a^b \phi(x)\,dx + \int_a^b \psi(x)\,dx \qquad (25.10)$$
and
$$\int_a^b (\lambda\phi)(x)\,dx = \lambda \int_a^b \phi(x)\,dx. \qquad (25.11)$$
Further, if ϕ ≥ 0, i.e. ϕ(x) ≥ 0 for all x ∈ [a, b], then we have
$$\int_a^b \phi(x)\,dx \ge 0. \qquad (25.12)$$
Proof. For (25.10) we need to represent ϕ and ψ with respect to the same
partition and then we can use as for the proofs of (25.11) and (25.12) the
fact that the summation process is additive, homogeneous and positivity
preserving, i.e. the sum of non-negative numbers is non-negative.
Corollary 25.10. Let ϕ, ψ ∈ T [a, b] and ϕ ≤ ψ, i.e. ϕ(x) ≤ ψ(x) for all x ∈ [a, b], then we have
$$\int_a^b \phi(x)\,dx \le \int_a^b \psi(x)\,dx. \qquad (25.13)$$
Proof. Since ψ − ϕ ≥ 0, using (25.10)-(25.12) we find
$$0 \le \int_a^b (\psi(x) - \phi(x))\,dx = \int_a^b \psi(x)\,dx - \int_a^b \phi(x)\,dx.$$
Now we want to extend the integral to a larger class of functions. We try to

use the following idea: given f : [a, b] → R and a step function ψ : [a, b] → R
such that f ≤ ψ. The infimum of the integrals of all ψ with this prop-
erty should approximate “the area under Γ(f )” from above. On the other
hand the supremum of the integrals of all ϕ ∈ T [a, b], ϕ ≤ f , should ap-
proximate “the area under Γ(f )” from below, see Figures 25.5 and 25.6.
[Figure 25.5 and Figure 25.6: step functions on [a, b] approximating the area under Γ(f ).]
This leads us to introduce the upper and lower integral as well as the Rie-
mann integral.
Definition 25.11. A. Let f : [a, b] → R be a bounded function. The upper integral of f is defined as
$${\int_a^b}^{*} f(x)\,dx := \inf\left\{ \int_a^b \phi(x)\,dx \;\middle|\; \phi \in T[a,b] \text{ and } \phi \ge f \right\} \qquad (25.14)$$
and the lower integral of f is defined by
$${\int_a^b}_{*} f(x)\,dx := \sup\left\{ \int_a^b \phi(x)\,dx \;\middle|\; \phi \in T[a,b] \text{ and } \phi \le f \right\}. \qquad (25.15)$$
B. We call a bounded function f : [a, b] → R Riemann integrable if
$${\int_a^b}_{*} f(x)\,dx = {\int_a^b}^{*} f(x)\,dx. \qquad (25.16)$$
In this case we write
$$\int_a^b f(x)\,dx := {\int_a^b}_{*} f(x)\,dx = {\int_a^b}^{*} f(x)\,dx \qquad (25.17)$$
and call the left hand side of (25.17) the Riemann integral of f (over [a, b]).
C. Let f : [a, b] → R be a non-negative Riemann integrable function. The area A(f ) bounded by Γ(f ), the interval [a, b], the line segment connecting (a, 0) and (a, f (a)) and the line segment connecting (b, 0) and (b, f (b)) is defined by
$$A(f) = \int_a^b f(x)\,dx.$$
(Compare with Definition 12.3.)
Note that from Definition 25.11.A we always have
$${\int_a^b}_{*} f(x)\,dx \le {\int_a^b}^{*} f(x)\,dx. \qquad (25.18)$$
Thus the aim is to determine the class of bounded functions where equality
holds in (25.18) and to discuss properties of the Riemann integral. First
however we give two examples.
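Example 25.12. A. For f ∈ T [a, b] clearly we have
$${\int_a^b}_{*} f(x)\,dx = {\int_a^b}^{*} f(x)\,dx = \int_a^b f(x)\,dx,$$
thus step functions are Riemann integrable.
B. For χQ∩[0,1] : [0, 1] → R, i.e.
$$\chi_{\mathbb{Q}\cap[0,1]}(x) = \begin{cases} 1, & x \in \mathbb{Q}\cap[0,1] \\ 0, & x \in [0,1],\ x \notin \mathbb{Q} \end{cases}$$
it follows that
$${\int_0^1}^{*} \chi_{\mathbb{Q}\cap[0,1]}(x)\,dx = 1 \quad\text{and}\quad {\int_0^1}_{*} \chi_{\mathbb{Q}\cap[0,1]}(x)\,dx = 0,$$
and therefore χQ∩[0,1] is not Riemann integrable.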
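Theorem 25.13. Let f, g : [a, b] → R be bounded functions. Then we have
$${\int_a^b}^{*} (f+g)\,dx \le {\int_a^b}^{*} f\,dx + {\int_a^b}^{*} g\,dx \quad \text{(subadditivity)} \qquad (25.19)$$
and for λ ≥ 0
$${\int_a^b}^{*} (\lambda f)\,dx = \lambda {\int_a^b}^{*} f\,dx \quad \text{(positive homogeneity)}. \qquad (25.20)$$
Proof. Let us write $\int^{*} f$ for ${\int_a^b}^{*} f\,dx$ if it is clear what is meant, analogously we will write $\int_{*} f$. Now, to prove (25.19) it is sufficient to show that
$$\int^{*} (f+g)\,dx \le \int^{*} f\,dx + \int^{*} g\,dx + \varepsilon \quad \text{for all } \varepsilon > 0.$$
We know that there are ϕ, ψ ∈ T [a, b], ϕ ≥ f, ψ ≥ g, such that
$$\int \phi \le \int^{*} f + \frac{\varepsilon}{2} \quad\text{and}\quad \int \psi \le \int^{*} g + \frac{\varepsilon}{2}.$$
Since ϕ + ψ ≥ f + g it follows that
$$\int^{*} (f+g)\,dx \le \int^{*} f\,dx + \int^{*} g\,dx + \varepsilon.$$
To prove (25.20) it is sufficient to show that
$$\lambda \int^{*} f - \varepsilon \le \int^{*} (\lambda f) \le \lambda \int^{*} f + \varepsilon \quad \text{for all } \varepsilon > 0.$$
Further we may assume that λ > 0. By definition there is ϕ ∈ T [a, b], ϕ ≥ f, such that
$$\int \phi \le \int^{*} f + \frac{\varepsilon}{\lambda}.$$
Since λϕ ≥ λf it follows that
$$\int^{*} (\lambda f) \le \int (\lambda\phi) = \lambda \int \phi \le \lambda\left(\int^{*} f + \frac{\varepsilon}{\lambda}\right) = \lambda \int^{*} f + \varepsilon,$$
and analogously we may prove
$$\lambda \int^{*} f \le \int^{*} (\lambda f) + \varepsilon.$$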
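Corollary 25.14. Let f, g : [a, b] → R be bounded functions. Then it follows that
$$\int_{*} (f+g)\,dx \ge \int_{*} f\,dx + \int_{*} g\,dx, \qquad (25.21)$$
$$\int_{*} (\lambda f)\,dx = \lambda \int_{*} f\,dx \quad \text{for all } \lambda \ge 0, \qquad (25.22)$$
and for λ < 0 we have
$$\int^{*} \lambda f = \lambda \int_{*} f \quad\text{and}\quad \int_{*} \lambda f = \lambda \int^{*} f. \qquad (25.23)$$
Proof. We only need to note the equality
$$\int_{*} f = -\int^{*} (-f)$$
which follows from the definition.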

Suppose that f : [a, b] → R is Riemann integrable, i.e.
$$\int_a^b f(x)\,dx = {\int_a^b}_{*} f(x)\,dx = {\int_a^b}^{*} f(x)\,dx.$$
Using the definition of inf and sup, given ε > 0, we can find ψ, ϕ ∈ T [a, b], ϕ ≤ f ≤ ψ, such that
$$\int_a^b f(x)\,dx \le \int_a^b \phi(x)\,dx + \frac{\varepsilon}{2} \quad\text{and}\quad \int_a^b \psi(x)\,dx - \frac{\varepsilon}{2} \le \int_a^b f(x)\,dx,$$
hence we have:
Theorem 25.15. The function f : [a, b] → R is Riemann integrable if and only if for every ε > 0 there exist step functions ϕ, ψ ∈ T [a, b] such that ϕ ≤ f ≤ ψ and
$$\int_a^b (\psi(x) - \phi(x))\,dx = \int_a^b \psi(x)\,dx - \int_a^b \phi(x)\,dx \le \varepsilon. \qquad (25.24)$$
This together with Theorem 25.6 gives

Theorem 25.16. A continuous function f : [a, b] → R is Riemann inte-
grable.
Proof. By Theorem 25.6, given ε > 0, there are step functions ϕ, ψ ∈ T [a, b] such that ϕ ≤ f ≤ ψ and ψ(x) − ϕ(x) ≤ ε/(b − a) for all x ∈ [a, b]. It follows that
$$\int_a^b \psi(x)\,dx - \int_a^b \phi(x)\,dx = \int_a^b (\psi(x) - \phi(x))\,dx \le \frac{\varepsilon}{b-a} \int_a^b dx = \varepsilon.$$
Furthermore we have
Theorem 25.17. Every monotone function f : [a, b] → R is Riemann inte-
grable.
Proof. Suppose that f is increasing (for decreasing functions the proof goes analogously). By $x_k := a + k\,\frac{b-a}{n}$, k = 0, 1, . . . , n, an equidistant partition of [a, b] is given. We now define the two step functions ϕ, ψ ∈ T [a, b] by
ϕ(x) := f (xk−1 ), xk−1 ≤ x < xk ,
ψ(x) := f (xk ), xk−1 ≤ x < xk ,
as well as ϕ(b) = ψ(b) = f (b). Since f is monotone increasing we find ϕ ≤ f ≤ ψ. Furthermore we have
$$\int_a^b \psi(x)\,dx - \int_a^b \phi(x)\,dx = \sum_{k=1}^{n} f(x_k)(x_k - x_{k-1}) - \sum_{k=1}^{n} f(x_{k-1})(x_k - x_{k-1})$$
$$= \frac{b-a}{n}\left(\sum_{k=1}^{n} f(x_k) - \sum_{k=1}^{n} f(x_{k-1})\right) = \frac{b-a}{n}\,(f(b) - f(a)).$$
Now given ε > 0 we can find N ∈ N such that for n ≥ N it follows that
$$\frac{b-a}{n}\,(f(x_n) - f(x_0)) = \frac{b-a}{n}\,(f(b) - f(a)) < \varepsilon,$$
i.e. we have
$$\int_a^b \psi(x)\,dx - \int_a^b \phi(x)\,dx < \varepsilon$$
and by Theorem 25.15 the result follows.
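The telescoping identity used in this proof is easy to check numerically. The following sketch assumes the increasing function f(x) = x³ on [0, 2] as a test case; the helper name `lower_upper` is illustrative.

```python
def lower_upper(f, a, b, n):
    """Integrals of the step functions phi <= f <= psi from the proof of Theorem 25.17."""
    x = [a + k * (b - a) / n for k in range(n + 1)]
    lower = sum(f(x[k - 1]) * (x[k] - x[k - 1]) for k in range(1, n + 1))
    upper = sum(f(x[k]) * (x[k] - x[k - 1]) for k in range(1, n + 1))
    return lower, upper


def cube(x):
    return x ** 3          # monotone increasing on [0, 2]


a, b, n = 0.0, 2.0, 100
lower, upper = lower_upper(cube, a, b, n)
gap = upper - lower
predicted_gap = (b - a) / n * (cube(b) - cube(a))   # (b - a)/n * (f(b) - f(a))
```

Here the value of the integral, 4, lies between `lower` and `upper`, and the gap shrinks like 1/n.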
Note that now we have two classes of integrable functions which are not
necessarily continuous: step functions and monotone functions.
Theorem 25.18. The set of all Riemann integrable functions f : [a, b] → R forms a real vector space. In addition we have for two Riemann integrable functions f, g : [a, b] → R
$$f \le g \quad\text{implies}\quad \int_a^b f(x)\,dx \le \int_a^b g(x)\,dx. \qquad (25.25)$$
Proof. From Theorem 25.13 and Corollary 25.14 we deduce immediately
$$\int_{*} f + \int_{*} g \le \int_{*} (f+g) \le \int^{*} (f+g) \le \int^{*} f + \int^{*} g$$
and
$$\int_{*} f = \int^{*} f, \quad \int_{*} g = \int^{*} g \quad\text{implies that}\quad \int_{*} (f+g) = \int^{*} (f+g)$$
as well as
$$\int_a^b (f(x) + g(x))\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx.$$
By Theorem 25.13 and Corollary 25.14 we find for λ > 0
$$\lambda \int_{*} f\,dx = \int_{*} (\lambda f)\,dx \le \int^{*} (\lambda f)\,dx = \lambda \int^{*} f\,dx,$$
so λf is integrable and $\int \lambda f\,dx = \lambda \int^{*} f\,dx = \lambda \int f\,dx$. For λ < 0 we use Corollary 25.14, in particular (25.23). Finally, (25.25) follows once we know that f ≥ 0 implies $\int_a^b f(x)\,dx \ge 0$. But for f ≥ 0 there always exists ϕ ∈ T [a, b] such that 0 ≤ ϕ ≤ f , hence
$$0 \le {\int_a^b}_{*} f(x)\,dx = \int_a^b f(x)\,dx.$$
Recall that the positive part of f : D → R is defined by
$$f_+(x) := \begin{cases} f(x), & f(x) > 0 \\ 0, & f(x) \le 0 \end{cases} \qquad (25.26)$$
and the negative part is defined by
$$f_-(x) := \begin{cases} -f(x), & f(x) < 0 \\ 0, & f(x) \ge 0. \end{cases} \qquad (25.27)$$
Clearly f+ ≥ 0, f− ≥ 0 and f = f+ − f− as well as |f | = f+ + f− .
Theorem 25.19. If f, g : [a, b] → R are Riemann integrable functions then f+ , f− and |f |^p , 1 ≤ p < ∞, as well as f · g are Riemann integrable.
Proof. By our assumptions, given ε > 0 there are step functions ϕ, ψ ∈ T [a, b] such that ϕ ≤ f ≤ ψ and
$$\int_a^b (\psi - \phi)(x)\,dx \le \varepsilon.$$
The functions ϕ+ , ψ+ are also step functions and we have ϕ+ ≤ f+ ≤ ψ+ . In addition it follows that
$$\int_a^b (\psi_+ - \phi_+)(x)\,dx \le \int_a^b (\psi - \phi)(x)\,dx \le \varepsilon.$$
Analogously we find that f− is integrable. Now, it follows that |f | is integrable, recall |f | = f+ + f− . We want to prove that for p ≥ 1 the function |f |^p is integrable. Suppose first that 0 ≤ f ≤ 1. Then for ε > 0 there are step functions ϕ, ψ ∈ T [a, b] such that
0 ≤ ϕ ≤ f ≤ ψ ≤ 1
and
$$\int_a^b (\psi - \phi)(x)\,dx \le \frac{\varepsilon}{p}.$$
It follows that $\phi^p$ and $\psi^p$ are step functions and $\phi^p \le f^p \le \psi^p$. Since
$$\frac{d}{dx}\,x^p = p\,x^{p-1}$$
the mean value theorem yields
$$\psi^p - \phi^p \le p\,(\psi - \phi),$$
note that $x^{p-1} \le 1$ for 0 ≤ x ≤ 1. Hence we find
$$\int_a^b (\psi^p - \phi^p)(x)\,dx \le p \int_a^b (\psi - \phi)(x)\,dx \le \varepsilon,$$
thus $|f|^p = f^p$ is integrable. Now, for arbitrary f we find that $|f|^p = f_+^p + f_-^p$, hence we may reduce the general case to non-negative functions. Further, if f ≥ 0 but $\sup_{x\in[a,b]} f(x) > 1$, we may consider $g(x) := \frac{f(x)}{\sup_{x\in[a,b]} f(x)}$, i.e. 0 ≤ g(x) ≤ 1. It follows that $\int_a^b |g|^p(x)\,dx$ exists, but
$$\int_a^b |g|^p(x)\,dx = \int_a^b |f(x)|^p \cdot \frac{1}{(\sup_t f(t))^p}\,dx = \frac{1}{(\sup_t f(t))^p} \int_a^b |f(x)|^p\,dx$$
and the integrability of |f |^p is proved. Since $f \cdot g = \frac{1}{4}\left((f+g)^2 - (f-g)^2\right)$ the integrability of f · g follows from the integrability of |f |^2 .
Corollary 25.20. For a Riemann integrable function f : [a, b] → R the triangle inequality for integrals holds, i.e.
$$\left|\int_a^b f(x)\,dx\right| \le \int_a^b |f(x)|\,dx. \qquad (25.28)$$
Proof. From Theorem 25.19 we deduce
$$\left|\int_a^b f(x)\,dx\right| = \left|\int_a^b (f_+ - f_-)(x)\,dx\right| = \left|\int_a^b f_+(x)\,dx - \int_a^b f_-(x)\,dx\right|$$
$$\le \int_a^b f_+(x)\,dx + \int_a^b f_-(x)\,dx = \int_a^b |f(x)|\,dx.$$
We now prove the mean value theorem for integrals:

Theorem 25.21. Let f, φ : [a, b] → R be continuous functions and suppose that φ ≥ 0. Then there exists ξ ∈ [a, b] such that
$$\int_a^b f(x)\varphi(x)\,dx = f(\xi) \int_a^b \varphi(x)\,dx. \qquad (25.29)$$
In particular, for φ = 1 it follows that
$$\int_a^b f(x)\,dx = f(\xi)(b - a) \quad \text{for some } \xi \in [a, b]. \qquad (25.30)$$
Proof. Define
m := inf{f (x); x ∈ [a, b]},
and
M := sup{f (x); x ∈ [a, b]}.
It follows that
mφ ≤ f φ ≤ M φ,
hence
$$m \int_a^b \varphi(x)\,dx \le \int_a^b f(x)\varphi(x)\,dx \le M \int_a^b \varphi(x)\,dx.$$
Thus there is μ ∈ [m, M ] such that
$$\int_a^b f(x)\varphi(x)\,dx = \mu \int_a^b \varphi(x)\,dx.$$
Now, the intermediate value theorem for continuous functions, Theorem 20.17 or Theorem 9.5, gives the existence of ξ ∈ [a, b] such that f (ξ) = μ.
When combining the mean value theorem and the fundamental theorem of
calculus we get a powerful tool to derive estimates. For this reason we post-
pone applications of the mean value theorem until the next chapter. However
we state a very useful and often applied consequence of (25.30).
Corollary 25.22. Let f ∈ C([a, b]) and h > 0 such that x, x + h ∈ [a, b], then
$$\lim_{h \to 0} \frac{1}{h} \int_x^{x+h} f(t)\,dt = f(x). \qquad (25.31)$$
Proof. By (25.30) we find
$$\frac{1}{h} \int_x^{x+h} f(t)\,dt = f(\xi) \quad \text{for some } \xi \in [x, x + h],$$
and the continuity of f implies the result.
Given an integrable function f : [a, b] → R, it is in general not easy to find step functions that are close to f. Therefore we introduce a further way of approximating the integral $\int_a^b f(x)\,dx$ by using certain values of f.
Definition 25.23. Let f : [a, b] → R be a function and Z = Z(x0 , . . . , xn ) be a partition of [a, b]. Further, for 1 ≤ k ≤ n let ξk ∈ [xk−1 , xk ]. Then
$$\sum_{k=1}^{n} f(\xi_k)(x_k - x_{k-1}) \qquad (25.32)$$
is called the Riemann sum of f with respect to the partition Z and points ξk , k = 1, . . . , n.
As before we denote the mesh size of the partition Z by
$$\eta := \eta(Z) := \max_{1 \le k \le n} (x_k - x_{k-1}). \qquad (25.33)$$
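A Riemann sum (25.32) is straightforward to compute once a partition and points ξk are chosen. The sketch below uses a randomly generated partition and random points purely for illustration; the names `riemann_sum` and `random_tagged_partition`, the seed, and the test function x² on [0, 1] are assumptions made here.

```python
import random


def riemann_sum(f, points, tags):
    """Riemann sum (25.32) for the partition `points` and tags xi_k in [x_{k-1}, x_k]."""
    for left, right, xi in zip(points, points[1:], tags):
        if not left <= xi <= right:
            raise ValueError("each tag must lie in its sub-interval")
    return sum(f(xi) * (right - left)
               for left, right, xi in zip(points, points[1:], tags))


def random_tagged_partition(a, b, n, rng):
    """A partition of [a, b] with n sub-intervals and one random tag per sub-interval."""
    inner = sorted(rng.uniform(a, b) for _ in range(n - 1))
    points = [a] + inner + [b]
    tags = [rng.uniform(left, right) for left, right in zip(points, points[1:])]
    return points, tags


rng = random.Random(0)                      # fixed seed, so the run is reproducible
points, tags = random_tagged_partition(0.0, 1.0, 200, rng)
mesh = max(right - left for left, right in zip(points, points[1:]))
s = riemann_sum(lambda x: x * x, points, tags)
error = abs(s - 1.0 / 3.0)                  # the integral of x^2 over [0, 1] is 1/3
```

Since |ξ² − x²| ≤ 2|ξ − x| on [0, 1], the error here is at most twice the mesh size, whatever partition and tags are drawn.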
Theorem 25.24. Let f : [a, b] → R be a Riemann integrable function. Then for every ε > 0 there is δ > 0 such that for every partition Z = Z(x0 , . . . , xn ) with mesh size less than or equal to δ, i.e. η(Z) ≤ δ, and any choice of general points ξk ∈ [xk−1 , xk ]
$$\left|\int_a^b f(x)\,dx - \sum_{k=1}^{n} f(\xi_k)(x_k - x_{k-1})\right| \le \varepsilon$$
holds.
Proof. Given ε > 0 there are step functions ϕ, ψ ∈ T [a, b] such that
$$\phi \le f \le \psi \quad\text{and}\quad \int_a^b (\psi - \phi)(x)\,dx \le \frac{\varepsilon}{2}.$$
Without loss of generality we may assume that ϕ and ψ are given with respect to the same partition
a = t0 < t1 < . . . < tm = b.
Since f is bounded
M := sup{|f (x)| | x ∈ [a, b]} ≥ 0
is finite and we may assume M ≠ 0. We claim that for
$$\delta := \frac{\varepsilon}{8Mm}$$
the assertion of the theorem holds.
For this let Z(x0 , . . . , xn ) be any partition of [a, b] such that η(Z) ≤ δ and take general points ξk ∈ [xk−1 , xk ]. We define the step function F ∈ T [a, b] by
F (xk ) = 0 and F (x) = f (ξk ) for xk−1 < x < xk .
It follows that
$$\int_a^b F(x)\,dx = \sum_{k=1}^{n} f(\xi_k)(x_k - x_{k-1})$$
is the Riemann sum of f with respect to the partition Z(x0 , . . . , xn ) and points ξk ∈ [xk−1 , xk ]. The step function F has the properties:
1. ϕ(x) − 2M ≤ F (x) ≤ ψ(x) + 2M, x ∈ [a, b],
2. if [xk−1 , xk ] ⊂ (tj−1 , tj ) for some j, then
ϕ(x) ≤ F (x) ≤ ψ(x) for all x ∈ [xk−1 , xk ].
Denote by A ⊂ [a, b] the set
$$A := \bigcup \{(x_{k-1}, x_k) \mid \text{there is } j \text{ such that } [x_{k-1}, x_k] \subset (t_{j-1}, t_j)\}$$
and define s ∈ T [a, b] by
$$s(x) := \begin{cases} 0, & x \in A \\ 2M, & x \notin A. \end{cases}$$
It follows by 1) and 2) that
ϕ(x) − s(x) ≤ F (x) ≤ ψ(x) + s(x) for all x ∈ [a, b].
There are at most 2m intervals [xk−1 , xk ] where s is not 0, thus
$$\int_a^b s(x)\,dx \le 2M(2m\delta) \le \frac{\varepsilon}{2},$$
which implies
$$\int_a^b \phi(x)\,dx - \frac{\varepsilon}{2} \le \int_a^b F(x)\,dx \le \int_a^b \psi(x)\,dx + \frac{\varepsilon}{2}.$$
The choice of ϕ and ψ yields further
$$\int_a^b f(x)\,dx \le \int_a^b \phi(x)\,dx + \frac{\varepsilon}{2} \quad\text{and}\quad \int_a^b \psi(x)\,dx \le \int_a^b f(x)\,dx + \frac{\varepsilon}{2}$$
or
$$\left|\int_a^b f(x)\,dx - \int_a^b F(x)\,dx\right| \le \varepsilon$$
proving the theorem.
We want to use Theorem 25.24 to generalise Minkowski’s and Hölder’s inequality to integrable functions. Let f : [a, b] → R be a Riemann integrable function and p ≥ 1. We define
$$\|f\|_p := \left(\int_a^b |f(x)|^p\,dx\right)^{\frac{1}{p}}. \qquad (25.34)$$
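Proposition 25.25. For f, g : [a, b] → R Riemann integrable and p ≥ 1 we have
$$\|f + g\|_p \le \|f\|_p + \|g\|_p, \qquad (25.35)$$
and for 1 < p < ∞, $q := \frac{p}{p-1}$, it follows that
$$\left|\int_a^b f(x)g(x)\,dx\right| \le \int_a^b |f(x)g(x)|\,dx \le \|f\|_p\,\|g\|_q. \qquad (25.36)$$
Proof. We just have to approximate the integrals by a Riemann sum and then we have to pass to the limit. We prove Minkowski’s inequality in detail:
$$\|f + g\|_p = \left(\int_a^b |f(x) + g(x)|^p\,dx\right)^{\frac{1}{p}} \qquad (25.37)$$
$$\le \left(\sum_{k=1}^{n} |f(\xi_k) + g(\xi_k)|^p (x_k - x_{k-1})\right)^{\frac{1}{p}} + \varepsilon$$
$$= \left(\sum_{k=1}^{n} \left|f(\xi_k)(x_k - x_{k-1})^{\frac{1}{p}} + g(\xi_k)(x_k - x_{k-1})^{\frac{1}{p}}\right|^p\right)^{\frac{1}{p}} + \varepsilon$$
$$\le \left(\sum_{k=1}^{n} |f(\xi_k)|^p (x_k - x_{k-1})\right)^{\frac{1}{p}} + \left(\sum_{k=1}^{n} |g(\xi_k)|^p (x_k - x_{k-1})\right)^{\frac{1}{p}} + \varepsilon$$
$$\le \left(\int_a^b |f(x)|^p\,dx\right)^{\frac{1}{p}} + \left(\int_a^b |g(x)|^p\,dx\right)^{\frac{1}{p}} + 2\varepsilon.$$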
Exercise 25.26. For f, g : [a, b] → R Riemann integrable prove Hölder’s

inequality, i.e. (25.36).
Interchanging limits and integrals is an important topic and we will return
to this on many occasions. Here we state a first result.
Theorem 25.27. Let fn : [a, b] → R, n ∈ N, be a sequence of continuous functions converging uniformly on [a, b] to f : [a, b] → R, then we have
$$\int_a^b f(x)\,dx = \int_a^b \lim_{n\to\infty} f_n(x)\,dx = \lim_{n\to\infty} \int_a^b f_n(x)\,dx.$$
Proof. By Theorem 24.6 we know that f is continuous, hence integrable and it follows
$$\left|\int_a^b f(x)\,dx - \int_a^b f_n(x)\,dx\right| \le \int_a^b |f(x) - f_n(x)|\,dx \le (b - a)\,\|f - f_n\|_\infty$$
and since ||f − fn ||∞ → 0 as n → ∞ the theorem is proved.

Finally we want to consider the integral as a set function. First we note the trivial fact that if ϕ ∈ T [a, b] and a < c < b, then ϕ|[a,c] ∈ T [a, c]. Moreover, if f : [a, b] → R is integrable and for ε > 0 given ϕ, ψ ∈ T [a, b] are such that ϕ ≤ f ≤ ψ and $\int_a^b (\psi - \phi)(x)\,dx < \varepsilon$, then ϕ|[a,c] ≤ f |[a,c] ≤ ψ|[a,c] and
$$\int_a^c (\psi|_{[a,c]} - \phi|_{[a,c]})(x)\,dx = \int_a^b ((\psi - \phi)\chi_{[a,c]})(x)\,dx \le \int_a^b (\psi - \phi)(x)\,dx < \varepsilon.$$
Hence for c ∈ (a, b) the function f |[a,c] is integrable and moreover
$$\int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx. \qquad (25.38)$$
This easily extends to
Proposition 25.28. Let f : [a, b] → R be integrable and (Ij )j=1,...,N be a finite partition of [a, b] into intervals Ij = [aj , bj ] such that $[a, b] = \bigcup_{j=1}^{N} I_j$ and (aj , bj ) ∩ (al , bl ) = ∅ for j ≠ l, then we have
$$\int_a^b f(x)\,dx = \sum_{j=1}^{N} \int_{I_j} f(x)\,dx, \qquad (25.39)$$
where with Ij = [aj , bj ] we write $\int_{I_j} f(x)\,dx$ for $\int_{a_j}^{b_j} f(x)\,dx$.
Rewriting (25.39) as
$$\int_{\bigcup_{j=1}^{N} I_j} f(x)\,dx = \sum_{j=1}^{N} \int_{I_j} f(x)\,dx \qquad (25.40)$$
we may interpret (25.39), in the form (25.40), as set-additivity of the integral. Later on we will extend (25.40) to countably many sub-intervals Ij , j ∈ N. Furthermore, we will try to replace intervals by more general sets.
Two helpful definitions are
$$\int_a^b f(x)\,dx := -\int_b^a f(x)\,dx \quad \text{if } b < a \qquad (25.41)$$
and
$$\int_a^a f(x)\,dx := 0. \qquad (25.42)$$
Note that Proposition 25.28 implies for f : [a, b] → R, f ≥ 0, that for a ≤ c ≤ d ≤ b we have
$$\int_c^d f(x)\,dx \le \int_a^b f(x)\,dx. \qquad (25.43)$$
Problems
1. Prove that the product of two step functions is a step function, i.e.
ϕ, ψ ∈ T [a, b] implies ϕ · ψ ∈ T [a, b], and deduce that T [a, b] is an
algebra.
2. Let f : [a, b] → R be Riemann integrable and y1 , . . . , yN ∈ [a, b]. Define f̃ : [a, b] → R by
$$\tilde f(x) := \begin{cases} f(x), & x \in [a, b] \setminus \{y_1, \dots, y_N\} \\ c_j, & x = y_j \end{cases}$$
for some cj ∈ R. Prove that f̃ is Riemann integrable and $\int_a^b \tilde f(x)\,dx = \int_a^b f(x)\,dx$.
3. We call f : [a, b] → R piecewise continuous if there exists a partition Z = Z(x0 , . . . , xn ) of [a, b] such that f |(xk−1 ,xk ) is continuous and the one-sided limits $\lim_{x \to x_{k-1},\, x > x_{k-1}} f(x)$ and $\lim_{x \to x_k,\, x < x_k} f(x)$ exist.
a) Give an example of a piecewise continuous function which is not continuous.
b) Let f : [a, b] → R be a bounded piecewise continuous function. Prove that f is Riemann integrable.
4. If f : [a, b] → R is Riemann integrable and f (x) ≥ γ > 0 for all x ∈ [a, b], then $\frac{1}{f}$ is Riemann integrable.
5. Does $\int_a^b |f(x)|\,dx = 0$ imply for a Riemann integrable function f : [a, b] → R that f (x) = 0 for all x ∈ [a, b]?
6. Prove for f ∈ C([a, b]) that $\int_a^b |f(x)|\,dx = 0$ implies that f (x) = 0 for all x ∈ [a, b]. Deduce that
$$\|f\|_{L^1} := \int_a^b |f(x)|\,dx$$
is a norm on the vector space C([a, b]).
7. a) Let f : [a, b] → R be a Riemann integrable function and (Zn )n∈N be the sequence of partitions $Z_n = Z(x_0^{(n)}, \dots, x_n^{(n)})$ where $x_j^{(n)} := a + \frac{j}{n}(b - a)$, 0 ≤ j ≤ n. Further let $S_n(f) := \sum_{j=1}^{n} f(x_j^{(n)})(x_j^{(n)} - x_{j-1}^{(n)})$. Prove that
$$\lim_{n\to\infty} S_n(f) = \int_a^b f(x)\,dx.$$
b) Denote by $-\!\!\!\!\!\int_a^b f(x)\,dx := \frac{1}{b-a} \int_a^b f(x)\,dx$ the mean value of f : [a, b] → R which we assume to be Riemann integrable. With $x_j^{(n)} := a + \frac{j}{n}(b - a)$, j = 0, . . . , n, prove that
$$-\!\!\!\!\!\int_a^b f(x)\,dx = \lim_{n\to\infty} \frac{1}{n} \sum_{j=1}^{n} f(x_j^{(n)}).$$
8. For two Riemann integrable functions f, g : [a, b] → R prove Hölder’s inequality:
$$\left|\int_a^b f(x)g(x)\,dx\right| \le \int_a^b |f(x)g(x)|\,dx \le \left(\int_a^b |f(x)|^p\,dx\right)^{\frac{1}{p}} \left(\int_a^b |g(x)|^q\,dx\right)^{\frac{1}{q}}$$
with $\frac{1}{p} + \frac{1}{q} = 1$, 1 < p.
9. a) Use Hölder’s inequality to prove for a Riemann integrable function f : [a, b] → R the estimate
$$\int_a^b |f(x)|^p\,dx \le (b - a)^{\frac{q-p}{q}} \left(\int_a^b |f(x)|^q\,dx\right)^{\frac{p}{q}}$$
where 1 ≤ p < q. Hint: note that $|f(x)|^p = 1 \cdot |f(x)|^p$ and x → 1 is integrable.
b) Prove that for two Riemann integrable functions f, g : [a, b] → R and every ε > 0 we have
$$\int_a^b |f(x)g(x)|\,dx \le \varepsilon \int_a^b |f(x)|^2\,dx + \frac{1}{4\varepsilon} \int_a^b |g(x)|^2\,dx.$$
10. Let f : [a, b] → R be Riemann integrable. For k ∈ N prove
$$\left(\int_a^b f(x) \sin kx\,dx\right)^2 + \left(\int_a^b f(x) \cos kx\,dx\right)^2 \le (b - a) \int_a^b f^2(x)\,dx.$$
11. Let h : [a, b] → R be Riemann integrable and f : [c, d] → R be a convex function such that h([a, b]) ⊂ [c, d]. Show Jensen’s inequality for integrals
$$f\left(\frac{1}{b-a} \int_a^b h(t)\,dt\right) \le \frac{1}{b-a} \int_a^b f(h(t))\,dt. \qquad (25.44)$$
Hint: use Jensen’s inequality for sums, see Problem 1 in Chapter 23.
12. On [0, 1] consider the sequence (fn )n∈N of functions
$$f_n(x) := \begin{cases} 4n^2 x, & 0 \le x \le \frac{1}{2n} \\ -4n^2 x + 4n, & \frac{1}{2n} \le x \le \frac{1}{n} \\ 0, & \frac{1}{n} \le x \le 1. \end{cases}$$
Sketch the graph of fn and prove that fn is continuous. Furthermore show that limn→∞ fn (x) = 0 for every x ∈ [0, 1], i.e. fn converges on [0, 1] pointwise to the zero function. Verify by using a simple geometric interpretation (calculating the area of a triangle) that $\int_0^1 f_n(x)\,dx = 1$. Hence we have an example of a sequence converging pointwise, the integrals converge, but the integral of the limit is not equal to the limit of the integrals.
13. Prove that if a sequence of Riemann integrable functions fn : [a, b] → R converges uniformly to f : [a, b] → R then f is Riemann integrable and
$$\lim_{n\to\infty} \int_a^b f_n(x)\,dx = \int_a^b f(x)\,dx = \int_a^b \left(\lim_{n\to\infty} f_n(x)\right) dx.$$
26 The Fundamental Theorem of Calculus

We want to investigate the relation of integration and differentiation. In the
following I ⊂ R will denote any interval (open, closed or half-open) with
distinct end points. Note that I need not be bounded.
Theorem 26.1. Let f : I → R be a continuous function and a ∈ I. If F : I → R is defined by
$$F(x) := \int_a^x f(t)\,dt$$
then F is differentiable with F′ = f. In particular F is continuous.
Proof. For h ≠ 0 we find
$$\frac{F(x + h) - F(x)}{h} = \frac{1}{h}\left(\int_a^{x+h} f(t)\,dt - \int_a^x f(t)\,dt\right) = \frac{1}{h} \int_x^{x+h} f(t)\,dt.$$
By the mean value theorem, Theorem 25.21, there is ξh ∈ [x, x + h] (or ξh ∈ [x + h, x] if h < 0) such that
$$\int_x^{x+h} f(t)\,dt = h f(\xi_h).$$
Since $\lim_{h\to 0} \xi_h = x$ and since f is continuous it follows that
$$F'(x) = \lim_{h\to 0} \frac{1}{h} \int_x^{x+h} f(t)\,dt = \lim_{h\to 0} \frac{1}{h}\,h f(\xi_h) = f(x).$$
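The statement F′ = f can be observed numerically by comparing difference quotients of F with f. The sketch below assumes f = cos and a = 0, and it approximates the integral by a midpoint rule; the helper `integral` is an illustrative stand-in for the Riemann integral, not a construction from the text.

```python
import math


def integral(f, a, b, n=20_000):
    """Midpoint-rule approximation of the integral of f from a to b.

    For b < a the step is negative, which matches the convention (25.41).
    """
    step = (b - a) / n
    return sum(f(a + (i + 0.5) * step) for i in range(n)) * step


def F(x, a=0.0):
    """F(x) = integral of cos from a to x."""
    return integral(math.cos, a, x)


x = 0.7
quotients = {h: (F(x + h) - F(x)) / h for h in (0.1, 0.01, -0.01)}
errors = {h: abs(q - math.cos(x)) for h, q in quotients.items()}
```

The errors shrink roughly in proportion to |h|, and negative h is handled in the same way as in the proof.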
Definition 26.2. Let F : I → R be a differentiable function. If F′ = f, f : I → R, then we call F a primitive of f.
Proposition 26.3. Two primitives of f differ only by a constant.
Proof. Let c ∈ R be a constant and F′ = f, then (F + c)′ = f, i.e. F + c is a primitive of f . Conversely, if F and G are two primitives of f, then F′ − G′ = f − f = 0, hence (F − G) = c or F = G + c.
Now we may prove the fundamental theorem of calculus, compare with

Theorem 12.7.
Theorem 26.4. Let f : I → R be a continuous function with primitive F. For all a, b ∈ I, a < b, we have
$$\int_a^b f(x)\,dx = F(b) - F(a).$$
Proof. For x ∈ I set
$$F_0(x) = \int_a^x f(t)\,dt \quad (a \in I \text{ fixed}).$$
Then F0 is a primitive of f such that
$$F_0(a) = 0 \quad\text{and}\quad F_0(b) = \int_a^b f(t)\,dt.$$
If F is any primitive of f, then there is c ∈ R such that F − F0 = c which yields
$$F(b) - F(a) = F_0(b) - F_0(a) = F_0(b) = \int_a^b f(t)\,dt.$$
A useful notation is
$$\int_a^b f(x)\,dx = F|_a^b, \qquad (26.1)$$
and more generally
$$h|_a^b := h(b) - h(a). \qquad (26.2)$$
Let us restate (with full proofs) some rules for integration that have already
been proved in Chapter 13.
Proposition 26.5 (Integration by Parts). For two continuously differentiable functions f, g : [a, b] → R we have
$$\int_a^b f(x)g'(x)\,dx = f \cdot g|_a^b - \int_a^b f'(x)g(x)\,dx = f(b)g(b) - f(a)g(a) - \int_a^b f'(x)g(x)\,dx. \qquad (26.3)$$
Proof. With F := f · g we find
F′(x) = f′(x)g(x) + f (x)g′(x).
This yields by the fundamental theorem
$$\int_a^b f'(x)g(x)\,dx + \int_a^b f(x)g'(x)\,dx = F|_a^b = f \cdot g|_a^b.$$
Proposition 26.6 (Integration by Substitution). For f : I → R a continuous function and φ : [a, b] → R a continuously differentiable function, i.e. φ ∈ C^1 [a, b], such that φ([a, b]) ⊂ I we have
$$\int_a^b f(\varphi(t))\varphi'(t)\,dt = \int_{\varphi(a)}^{\varphi(b)} f(x)\,dx. \qquad (26.4)$$
Proof. Let F be a primitive of f. Using the chain rule we find for F ◦ φ : [a, b] → R
(F ◦ φ)′(t) = F′(φ(t))φ′(t) = f (φ(t))φ′(t)
which implies by the fundamental theorem
$$\int_a^b f(\varphi(t))\varphi'(t)\,dt = F \circ \varphi|_a^b = F(\varphi(b)) - F(\varphi(a)) = \int_{\varphi(a)}^{\varphi(b)} f(x)\,dx.$$
In Chapter 13 we have already given a lot of applications of these rules. Here

we are interested in more theoretical applications.
Proposition 26.7. A. Let h : [−a, a] → R be an even continuous function. Then we find
$$\int_{-a}^{a} h(t)\,dt = 2 \int_0^a h(t)\,dt. \qquad (26.5)$$
B. For an odd continuous function g : [−a, a] → R it follows that
$$\int_{-a}^{a} g(t)\,dt = 0. \qquad (26.6)$$
Proof. A. We know that
$$\int_{-a}^{a} h(t)\,dt = \int_{-a}^{0} h(t)\,dt + \int_0^a h(t)\,dt.$$
Now the change of variable t → −s gives
$$\int_{-a}^{0} h(t)\,dt = -\int_a^0 h(-s)\,ds = -\int_a^0 h(s)\,ds = \int_0^a h(s)\,ds,$$
i.e.
$$\int_{-a}^{a} h(t)\,dt = \int_0^a h(t)\,dt + \int_0^a h(s)\,ds = 2 \int_0^a h(t)\,dt.$$
B. Since
$$\int_{-a}^{a} g(t)\,dt = \int_{-a}^{0} g(t)\,dt + \int_0^a g(t)\,dt$$
we are done if we can show that $\int_{-a}^{0} g(t)\,dt = -\int_0^a g(t)\,dt$. The change of variable t → −s yields however
$$\int_{-a}^{0} g(t)\,dt = -\int_a^0 g(-s)\,ds = \int_a^0 g(s)\,ds = -\int_0^a g(t)\,dt.$$
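Both symmetry rules can be checked on concrete functions. In the sketch below the even function t² cos t, the odd function t³ + sin t, and the midpoint-rule helper `integral` are choices made for illustration only.

```python
import math


def integral(f, a, b, n=20_000):
    """Midpoint-rule approximation of the integral of f from a to b."""
    step = (b - a) / n
    return sum(f(a + (i + 0.5) * step) for i in range(n)) * step


def even_h(t):
    return t * t * math.cos(t)      # h(-t) = h(t)


def odd_g(t):
    return t ** 3 + math.sin(t)     # g(-t) = -g(t)


a = 2.0
full_even = integral(even_h, -a, a)
half_even = integral(even_h, 0.0, a)
full_odd = integral(odd_g, -a, a)
```

Up to the discretisation error of the midpoint rule, `full_even` equals `2 * half_even` and `full_odd` vanishes, in line with (26.5) and (26.6).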
A further symmetry we have encountered is periodicity. We claim first
Proposition 26.8. Let f : R → R be a continuous function with period c > 0. If for some a ∈ R we have
$$\int_{a-c}^{a} f(t)\,dt = 0 \qquad (26.7)$$
then every primitive F of f has period c too.
Proof. Let F be any primitive of f . It follows that
$$F(x + c) - F(a) = \int_a^{x+c} f(t)\,dt = \int_a^{x+c} f(t - c)\,dt = \int_{a-c}^{x} f(t)\,dt = F(x) - F(a - c),$$
or
$$F(x + c) - F(x) = F(a) - F(a - c) = \int_{a-c}^{a} f(t)\,dt = 0. \qquad (26.8)$$
Remark 26.9. A. The function g(x) = 1 + cos x is 2π-periodic since
g(x + 2π) = 1 + cos(x + 2π) = 1 + cos x = g(x).
A primitive of g is G(x) = x + sin x since G′(x) = 1 + cos x. However,
G(x + 2π) = x + 2π + sin(x + 2π) = x + 2π + sin x ≠ G(x).
Moreover we have
$$\int_0^{2\pi} (1 + \cos x)\,dx = \int_a^{a+2\pi} (1 + \cos x)\,dx = 2\pi \ne 0$$
for all a ∈ R. Hence we cannot expect Proposition 26.8 to hold for all periodic functions.
B. From (26.8) it follows that $\int_b^{b+c} f(t)\,dt = 0$ for all b ∈ R if $\int_{a-c}^{a} f(t)\,dt = 0$ for an a ∈ R.
Let f : R → R be a continuous function with period c > 0 and for some a ∈ R set $A := \int_{a-c}^{a} f(t)\,dt$, A need not be zero. The function fA : R → R, $x \mapsto f_A(x) = f(x) - \frac{A}{c}$, is once again periodic with period c:
$$f_A(x + c) = f(x + c) - \frac{A}{c} = f(x) - \frac{A}{c} = f_A(x).$$
Moreover it holds
$$\int_{a-c}^{a} f_A(x)\,dx = \int_{a-c}^{a} \left(f(x) - \frac{A}{c}\right) dx = A - A = 0.$$
Hence we may apply Proposition 26.8 to fA to find that
$$0 = \int_b^{b+c} f_A(t)\,dt = \int_b^{b+c} \left(f(x) - \frac{A}{c}\right) dx = \int_b^{b+c} f(x)\,dx - A, \qquad (26.9)$$
and we have proved
Corollary 26.10. Let f : R → R be a continuous function with period c > 0. For every b ∈ R it holds
$$\int_b^{b+c} f(x)\,dx = A, \qquad (26.10)$$
i.e. the integrals of f over any interval of length c have all the same value.
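Corollary 26.10 can be illustrated with the function g(x) = 1 + cos x from Remark 26.9, whose integral over any interval of length 2π should be 2π. The midpoint-rule helper `integral` and the chosen starting points b are assumptions of this sketch.

```python
import math


def integral(f, a, b, n=20_000):
    """Midpoint-rule approximation of the integral of f from a to b."""
    step = (b - a) / n
    return sum(f(a + (i + 0.5) * step) for i in range(n)) * step


def g(x):
    return 1.0 + math.cos(x)        # period c = 2 pi


c = 2 * math.pi
starts = (-3.0, 0.0, 1.7, 10.0)
values = [integral(g, b, b + c) for b in starts]
deviations = [abs(v - 2 * math.pi) for v in values]
```

Every entry of `values` agrees with A = 2π up to rounding, independently of the starting point b.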
From the last few results we may pick up an important message: symmetry
may be used to simplify the evaluation of integrals.
We have seen that the Riemann integral is positivity preserving, with the consequence that for two integrable functions f, g : I → R, the inequality f ≤ g implies
$$\int_I f(x)\,dx \le \int_I g(x)\,dx. \qquad (26.11)$$
Corollary 26.11. A. Let f : [a, b] → R be a continuous function such that m ≤ f (x) ≤ M with m, M ∈ R. Then it holds
$$m(b - a) \le \int_a^b f(t)\,dt \le M(b - a). \qquad (26.12)$$
B. Let f : [a, b] → R be a continuous function such that |f (t)| ≤ M for all t ∈ [a, b], M ∈ R. Then we have
$$\left|\int_a^b f(t)\,dt\right| \le M(b - a). \qquad (26.13)$$
Proof. A. Integrating the inequality m ≤ f (t) ≤ M we find
$$m(b - a) = \int_a^b m\,dt \le \int_a^b f(t)\,dt \le \int_a^b M\,dt = M(b - a).$$
B. Using the triangle inequality for integrals we find with (26.11)
$$\left|\int_a^b f(t)\,dt\right| \le \int_a^b |f(t)|\,dt \le \int_a^b M\,dt = M(b - a).$$
This corollary has many nice applications.
Example 26.12. We claim that for all x, y ∈ R
| sin x − sin y| ≤ |x − y|. (26.14)
Indeed for x ≥ y we find
$$|\sin x - \sin y| = \left|\int_y^x \cos t\,dt\right| \le \int_y^x 1\,dt = x - y = |x - y|,$$
and for x ≤ y it follows that
| sin x − sin y| = | sin y − sin x| ≤ |y − x| = |x − y|.
Analogously we find for all x, y ∈ R
| cos x − cos y| ≤ |x − y|. (26.15)
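Example 26.13. Let 1 < a < b. We claim
$$1 - \frac{a}{b} = \frac{1}{b}(b - a) \le \ln\frac{b}{a} \le \frac{1}{a}(b - a) = \frac{b}{a} - 1. \qquad (26.16)$$
Proof. First note that
$$\ln\frac{b}{a} = \ln b - \ln a = \int_a^b \frac{1}{x}\,dx.$$
Now we estimate the integral. Since $\frac{1}{b} < \frac{1}{x} < \frac{1}{a}$ for a < x < b it follows that
$$\frac{1}{b}(b - a) = \frac{1}{b} \int_a^b 1\,dx \le \int_a^b \frac{1}{x}\,dx \le \frac{1}{a} \int_a^b 1\,dx = \frac{1}{a}(b - a),$$
implying (26.16).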
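The two-sided estimate (26.16) can be spot-checked for a few pairs 1 < a < b. The function name `log_bounds` and the sample pairs below are illustrative choices, and a finite check of this kind is of course no substitute for the proof.

```python
import math


def log_bounds(a, b):
    """Return (1 - a/b, ln(b/a), b/a - 1), the three quantities compared in (26.16)."""
    if not 1 < a < b:
        raise ValueError("the example assumes 1 < a < b")
    return 1 - a / b, math.log(b / a), b / a - 1


pairs = [(2.0, 3.0), (1.5, 1.6), (10.0, 1000.0), (1.001, 7.0)]
results = [log_bounds(a, b) for a, b in pairs]
all_hold = all(lower <= middle <= upper for lower, middle, upper in results)
```

For (a, b) = (2, 3), for instance, the three numbers are approximately 0.333, 0.405 and 0.5.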
Example 26.14. For all a ≥ 0 and t ≥ 0 we have
$$\frac{at}{1 + at} \le 1 - e^{-at} \le at. \qquad (26.17)$$
Since
$$1 - e^{-at} = \int_0^{at} e^{-x}\,dx \le \int_0^{at} 1\,dx = at$$
we get the right inequality. To get the left inequality note that $t \mapsto e^{at} - 1 - at$ is on [0, ∞) for all a ≥ 0 monotone increasing since $\frac{d}{dt}(e^{at} - 1 - at) = a(e^{at} - 1) \ge 0$. For t = 0 we have $(e^{at} - 1 - at)|_{t=0} = 0$, hence we find
$$0 \le e^{at} - 1 - at$$
or
$$(1 + at) \le e^{at},$$
i.e. $(1 + at)e^{-at} \le 1$, which is equivalent to
$$0 \le 1 - (1 + at)e^{-at}$$
or
$$at \le 1 + at - (1 + at)e^{-at},$$
leading eventually to
$$\frac{at}{1 + at} \le 1 - e^{-at}.$$
Example 26.15. For a ≥ 0 and t > 0 it holds
$$\frac{|e^{-at} - 1 + at|}{t} \le \frac{1}{2} a^2 t. \qquad (26.18)$$
Proof. Observe that
$$0 \le \int_0^{at} (1 - e^{-x})\,dx = at + e^{-at} - 1,$$
and therefore using (26.17)
$$|e^{-at} - 1 + at| = \int_0^{at} (1 - e^{-x})\,dx \le \int_0^{at} x\,dx = \frac{(at)^2}{2}$$
which implies (26.18).
Note that all these examples work along the same idea: we want to estimate the difference f (x) − f (y) for a given function f . If we can identify f as a primitive, i.e. for some function g we have
$$f(x) - f(y) = \int_y^x g(t)\,dt,$$
then we can use estimates for g to control this difference.

The following inequality is a first, rather crude version of the Poincaré inequality in one dimension.
Proposition 26.16. Let f ∈ C^1 ([a, b]) and suppose that f (a) = f (b) = 0. Then there exists a constant γ0 > 0 such that
$$\left(\int_a^b |f(x)|^2\,dx\right)^{\frac{1}{2}} \le \gamma_0 \left(\int_a^b |f'(x)|^2\,dx\right)^{\frac{1}{2}}. \qquad (26.19)$$
Proof. We note that
$$\int_a^b |f(x)|^2\,dx = \int_a^b f^2(x)\,dx = \int_a^b \left(\frac{d}{dx}\,x\right) f^2(x)\,dx = -\int_a^b x\,\frac{d}{dx} f^2(x)\,dx = -2 \int_a^b x f(x) f'(x)\,dx,$$
or by the Cauchy-Schwarz inequality
$$\int_a^b |f(x)|^2\,dx \le 2 \max(|a|, |b|) \int_a^b |f(x)||f'(x)|\,dx \le 2 \max(|a|, |b|) \left(\int_a^b |f(x)|^2\,dx\right)^{\frac{1}{2}} \left(\int_a^b |f'(x)|^2\,dx\right)^{\frac{1}{2}}$$
implying (26.19) with γ0 = 2 max(|a|, |b|).
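The inequality (26.19) with γ0 = 2 max(|a|, |b|) can be tested on functions that vanish at both end points. The two test functions, the helper names `integral` and `poincare_sides`, and the midpoint-rule approximation are assumptions of this sketch.

```python
import math


def integral(f, a, b, n=20_000):
    """Midpoint-rule approximation of the integral of f from a to b."""
    step = (b - a) / n
    return sum(f(a + (i + 0.5) * step) for i in range(n)) * step


def poincare_sides(f, df, a, b):
    """Left and right hand side of (26.19) with gamma_0 = 2 max(|a|, |b|)."""
    lhs = math.sqrt(integral(lambda x: f(x) ** 2, a, b))
    gamma0 = 2 * max(abs(a), abs(b))
    rhs = gamma0 * math.sqrt(integral(lambda x: df(x) ** 2, a, b))
    return lhs, rhs


# f(x) = sin(pi x) on [0, 1], with f(0) = f(1) = 0
lhs1, rhs1 = poincare_sides(
    lambda x: math.sin(math.pi * x),
    lambda x: math.pi * math.cos(math.pi * x),
    0.0, 1.0,
)

# f(x) = (x - 2)(3 - x) on [2, 3], with f(2) = f(3) = 0 and f'(x) = 5 - 2x
lhs2, rhs2 = poincare_sides(
    lambda x: (x - 2) * (3 - x),
    lambda x: 5 - 2 * x,
    2.0, 3.0,
)
```

In both cases the left hand side is far below the right hand side, which reflects that this constant γ0 is rather crude.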

Remark 26.17. Using the notation $\|f\|_{L^2} = \left(\int_a^b |f(x)|^2\,dx\right)^{\frac{1}{2}}$ the Poincaré inequality reads as
$$\|f\|_{L^2} \le \gamma_0\,\|f'\|_{L^2}, \qquad (26.20)$$
and as we will see in Problem 10 this implies that on $C_0^1([a, b]) := \{f \in C^1([a, b]) \mid f(a) = f(b) = 0\}$ a norm is given by $\|f'\|_{L^2}$. Clearly $\|f'\|_{L^2}$ is not a norm on C^1 ([a, b]) since every constant function fc , fc (x) = c, (restricted to [a, b]) belongs to C^1 ([a, b]) and has derivative $f_c' = 0$, i.e. $\|f_c'\|_{L^2} = 0$ but fc ≠ 0 for c ≠ 0.
Consider a continuous function $f : [a, b] \to \mathbb{R}$ and let $\alpha, \beta : [c, d] \to [a, b]$ be two continuously differentiable functions such that $\alpha(x) \le \beta(x)$ for all $x \in [c, d]$. We define the function $G : [c, d] \to \mathbb{R}$ by
$$G(x) := \int_{\alpha(x)}^{\beta(x)} f(t)\,dt. \tag{26.21}$$
Proposition 26.18. The function $G$ in (26.21) is differentiable and
$$G'(x) = \beta'(x)f(\beta(x)) - \alpha'(x)f(\alpha(x)). \tag{26.22}$$
Proof. For a primitive $F$ of $f$, i.e. $F' = f$, it follows that
$$G(x) = F(\beta(x)) - F(\alpha(x))$$
and the chain rule yields
$$G'(x) = \beta'(x)F'(\beta(x)) - \alpha'(x)F'(\alpha(x)) = \beta'(x)f(\beta(x)) - \alpha'(x)f(\alpha(x)).$$
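Formula (26.22) can be compared with a finite-difference derivative. The sketch below assumes hypothetical choices $f = \cos$, $\alpha(x) = x$ and $\beta(x) = x^2 + 2$ (so that $\alpha \le \beta$ everywhere); the function names and the Simpson rule helper are illustrative, not part of the text.

```python
import math

def integrate(func, lo, hi, n=2000):
    """Composite Simpson rule on [lo, hi]; n must be even."""
    h = (hi - lo) / n
    s = func(lo) + func(hi)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * func(lo + i * h)
    return s * h / 3.0

f = math.cos
alpha = lambda x: x            # alpha'(x) = 1
beta = lambda x: x * x + 2.0   # beta'(x) = 2x, and alpha(x) <= beta(x) for all real x

def G(x):
    """G(x) = integral of f from alpha(x) to beta(x), as in (26.21)."""
    return integrate(f, alpha(x), beta(x))

def G_prime_formula(x):
    """Right hand side of (26.22) for the choices above."""
    return 2.0 * x * f(beta(x)) - 1.0 * f(alpha(x))

def G_prime_numeric(x, h=1e-5):
    """Central difference quotient of G."""
    return (G(x + h) - G(x - h)) / (2.0 * h)

print(G_prime_formula(0.7), G_prime_numeric(0.7))
```

The two printed values should agree up to the discretisation error of the difference quotient.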
We now consider limits of sequences of differentiable functions.

Theorem 26.19. Let $f_n : [a, b] \to \mathbb{R}$ be a sequence of continuously differentiable functions converging pointwise to $f : [a, b] \to \mathbb{R}$. Suppose that the sequence $(f_n')_{n\in\mathbb{N}}$ converges uniformly. Then $f$ is differentiable and we have
$$f'(x) = \lim_{n\to\infty} f_n'(x) \quad \text{for all } x \in [a, b]. \tag{26.23}$$
Proof. Define $f^*(x) := \lim_{n\to\infty} f_n'(x)$. It follows that $f^* : [a, b] \to \mathbb{R}$ is a continuous function. Further, for $x \in [a, b]$ we have
$$f_n(x) = f_n(a) + \int_a^x f_n'(t)\,dt.$$
By Theorem 25.27 we know that $\int_a^x f_n'(t)\,dt \to \int_a^x f^*(t)\,dt$, implying
$$f(x) = f(a) + \int_a^x f^*(t)\,dt.$$
Now the fundamental theorem yields $f'(x) = f^*(x)$ and the theorem is proved.
We refer to Problem 9 in Chapter 24 for an example showing that uniform convergence of $(f_n)_{n\in\mathbb{N}}$ and pointwise convergence of $(f_n')_{n\in\mathbb{N}}$ is not sufficient for (26.23) to hold. Theorem 26.19 will become particularly powerful when applied to series of functions.
Problems
1. Prove
a) If $f \in C^k([a, b])$ then all primitives of $f$ belong to $C^{k+1}([a, b])$, $k \ge 0$.
b) Every $f \in C([a, b])$ determines by its primitive a one-dimensional affine subspace of $C^1([a, b])$.
2. Define on $C([a, b])$ the mapping $T : C([a, b]) \to C([a, b])$ by $(Tf)(x) := f(a) + \int_a^x e^{-t} f(t)\,dt$, $x \in [a, b]$. We call $g \in C([a, b])$ a fixed point of $T$ if $Tg = g$, i.e. $(Tg)(x) = g(x)$ for all $x \in [a, b]$. Prove that if $g \in C^k([a, b])$ is a fixed point it must belong to $C^{k+1}([a, b])$, hence $g \in C^\infty([a, b]) = \bigcap_{k=1}^{\infty} C^k([a, b])$.
3. For $f \in C(\mathbb{R})$, $f \ge 0$, define for the right half-open interval $I = [a, b)$, $a, b \in \mathbb{R}$,
$$\mu(I) := \int_a^b f(t)\,dt \quad\text{and}\quad \mu(\emptyset) := 0.$$
Moreover, for $I_1 = [a_1, b_1)$, $I_2 = [a_2, b_2)$, $b_1 < a_2$, we define
$$\mu(I_1 \cup I_2) = \mu(I_1) + \mu(I_2).$$
a) Prove that the union of two bounded right half-open intervals is either disjoint or a right half-open interval and deduce that the union of finitely many right half-open intervals is the union of finitely many mutually disjoint right half-open intervals or consists of one right half-open interval. Moreover the intersection of two bounded right half-open intervals is either empty or right half-open.
b) Let $I_1$ and $I_2$ be two bounded right half-open intervals. Prove that
$$\mu(I_1 \cup I_2) + \mu(I_1 \cap I_2) = \mu(I_1) + \mu(I_2).$$
c) For $a_0 \in \mathbb{R}$ fixed define the function $\mu_{a_0}(x) := \mu([a_0, x))$. Show that $\mu_{a_0} \in C^1(\mathbb{R})$ and $\mu_{a_0}'(x) = f(x)$.
4. Let $f : [-a, a] \to \mathbb{R}$ be a continuous function and suppose that for all $x \in (0, a]$ we have $\int_{-x}^{x} f(t)\,dt = 0$. Prove that $f$ is an odd function.
Hint: first prove that if $g : [a, b] \to \mathbb{R}$ is continuous and if for all $\alpha < \beta$, $\alpha, \beta \in [a, b]$, it follows that $\int_\alpha^\beta g(t)\,dt = 0$ then $g(t) = 0$ for all $t \in [a, b]$.
5. a) For $\rho > 1$ and $0 \le x < y \le 1$ prove that
$$y^\rho - x^\rho \le \rho(y - x).$$
b) For $-\frac{\pi}{4} \le x < y \le \frac{\pi}{4}$ show that
$$y - x \le \frac{2}{\sqrt{2}}(\sin y - \sin x).$$
6. Let $f : [a, b] \to \mathbb{R}$ be a continuous function. Prove that a primitive $F$ of $f$ is Lipschitz continuous and
$$|F(x) - F(y)| \le \|f\|_\infty |x - y| \quad\text{for all } x, y \in [a, b].$$
7. Let $f, g : [a, b] \to \mathbb{R}$ be two Riemann integrable functions which are not identically zero. We call $f$ and $g$ orthogonal if $\int_a^b f(x)g(x)\,dx = 0$. If $f$ and $g$ are orthogonal we write $f \perp g$. We agree that $0$ is orthogonal to every $f : [a, b] \to \mathbb{R}$.
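a) Prove that
$$f(x) = \begin{cases} c, & x \in \left[a, \frac{a+b}{2}\right] \\ 0, & x \in \left(\frac{a+b}{2}, b\right] \end{cases}$$
and
$$g(x) = \begin{cases} 0, & x \in \left[a, \frac{a+b}{2}\right] \\ -c, & x \in \left(\frac{a+b}{2}, b\right] \end{cases}$$
are orthogonal.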
b) Suppose that $f : [-a, a] \to \mathbb{R}$ is even and $g : [-a, a] \to \mathbb{R}$ is odd. Show that they are orthogonal.
c) For $f \in C([a, b])$ define $\{f\}^\perp := \{g \in C([a, b]) \mid f \perp g\}$. Prove that $\{f\}^\perp$ is a subspace of $C([a, b])$.
8. Let $f \in C^1([a, b])$ and suppose that $f \perp f'$. Prove that this implies $|f(a)| = |f(b)|$.
9. Prove that on $C_0^1([a, b]) := \{f \in C^1([a, b]) \mid f(a) = f(b) = 0\}$ a norm is given by $\|f'\|_{L^2}$. Hint: use Corollary 26.10 and Problem 6 in Chapter 25.
10. Let $\alpha, \beta : [a, b] \to \mathbb{R}$ be two differentiable functions such that $\alpha < \beta$. In addition suppose that $\alpha$ is decreasing and $\beta$ is increasing. Let $f : [\alpha(b), \beta(b)] \to \mathbb{R}$ be a non-negative function. Prove that $G : [a, b] \to \mathbb{R}$,
$$G(x) := \int_{\alpha(x)}^{\beta(x)} f(t)\,dt$$
is increasing.
11.* The following result sharpens Theorem 26.19 considerably: let $f_n : [a, b] \to \mathbb{R}$ be a sequence of differentiable functions. Suppose that for an $x_0 \in [a, b]$ the sequence $(f_n(x_0))_{n\in\mathbb{N}}$ converges and that $(f_n')_{n\in\mathbb{N}}$ converges uniformly on $[a, b]$ to some function. Then there exists a differentiable function $f : [a, b] \to \mathbb{R}$ such that $(f_n)_{n\in\mathbb{N}}$ converges uniformly to $f$ and we have
$$f'(x) = \lim_{n\to\infty} f_n'(x) \quad\text{for all } x \in [a, b].$$
Hint: prove that $(f_n)_{n\in\mathbb{N}}$ forms a Cauchy sequence with respect to the norm $\|\cdot\|_\infty$ and apply Theorem 24.11.
12. Consider the sequence $S_N(x) := \sum_{k=0}^{N} x^k$ defined on $(-1, 1)$. Let $[a, b] \subset (-1, 1)$ be a compact interval. Prove that $S_N(x)$ as well as $S_N'(x)$ converge uniformly on $[a, b]$. Deduce that for $m \in \mathbb{N}$, $m \ge 2$ it follows that
$$\sum_{k=1}^{\infty} \frac{k}{m^k} = \frac{m}{(m - 1)^2}.$$
27 A First Encounter with Differential Equations
The fundamental theorem paves the way to solve some differential equations, more precisely some ordinary differential equations. First let us re-interpret the fundamental theorem. So far we have used this theorem (see Part 1) to evaluate integrals. We may ask the following question:
Given a function $h : [a, b] \to \mathbb{R}$, can we find a function $u : [a, b] \to \mathbb{R}$ such that
$$u'(x) = h(x) \tag{27.1}$$
holds for all $x \in (a, b)$?
Obviously $u$ will not be unique since for $v := u + c$, $c \in \mathbb{R}$, we find
$$v'(x) = u'(x) + c' = u'(x) = h(x).$$
However, two functions satisfying (27.1) can only differ by a constant. Thus, if we prescribe for example $u(a) = u_0$, $u_0 \in \mathbb{R}$, in the class of all differentiable functions on $(a, b)$ with continuous extension to $[a, b]$ there will be at most one function solving the initial value problem
$$u' = h, \quad u(a) = u_0. \tag{27.2}$$
In the case that $h$ is continuous we can find a solution to (27.2) by integration:
$$u(x) = u_0 + \int_a^x h(t)\,dt, \tag{27.3}$$
which follows from differentiating (27.3). Indeed by the fundamental theorem we find that $x \mapsto \int_a^x h(t)\,dt$ is differentiable and further
$$\frac{d}{dx}u(x) = \frac{d}{dx}\left(u_0 + \int_a^x h(t)\,dt\right) = h(x)$$
as well as
$$u(a) = u_0 + \int_a^a h(t)\,dt = u_0.$$
Note that we can “derive” the solution (27.3) to (27.2) by “integrating” (27.2):
$$\int_a^x u'(t)\,dt = \int_a^x h(t)\,dt$$
implying
$$u(x) - u(a) = \int_a^x h(t)\,dt,$$
or
$$u(x) = u(a) + \int_a^x h(t)\,dt = u_0 + \int_a^x h(t)\,dt.$$
Of course (27.2) is a rather simple initial value problem. We may want to solve a more general initial value problem, namely
$$g(u(x))u'(x) = h(x) \quad\text{and}\quad u(a) = u_0, \tag{27.4}$$
where $u : [a, b] \to \mathbb{R}$ is sought to be continuous on $[a, b]$ and differentiable on $(a, b)$. We assume again $h : [a, b] \to \mathbb{R}$ to be a continuous function and we assume $g : \mathbb{R} \to \mathbb{R}$ to be continuous too. We may integrate (27.4) to obtain
$$\int_a^x g(u(t))u'(t)\,dt = \int_a^x h(t)\,dt. \tag{27.5}$$
Let us have a closer look at the integral on the left hand side. If we consider $z = u(t)$ as a new variable then the rule for integration by substitution, compare with Proposition 26.6 or Theorem 13.12, yields
$$\int_{u(a)}^{u(x)} g(z)\,dz = \int_a^x g(u(t))u'(t)\,dt, \tag{27.6}$$
which implies
$$\int_{u(a)}^{u(x)} g(z)\,dz = \int_a^x h(t)\,dt. \tag{27.7}$$
Now let $G$ be a primitive of the continuous function $g$ and let $H$ be a primitive of the continuous function $h$. Then (27.7) becomes
$$G(u(x)) - G(u(a)) = H(x) - H(a), \tag{27.8}$$
i.e.
$$G(u(x)) = H(x) + G(u_0) - H(a). \tag{27.9}$$
If we add the assumption that $G$ has an inverse, we find
$$u(x) = G^{-1}(H(x) + G(u_0) - H(a)). \tag{27.10}$$
First we note that in (27.10) for $x = a$ we get
$$u(a) = G^{-1}(H(a) + G(u_0) - H(a)) = G^{-1}(G(u_0)) = u_0.$$
Next we suppose that $G^{-1}$ is differentiable, for example the condition $G'(y) = g(y) \ne 0$ for all $y \in \mathbb{R}$ will be sufficient. It follows from (27.10) that
$$\begin{aligned}
\frac{d}{dx}u(x) &= \frac{d}{dx}G^{-1}(H(x) + G(u_0) - H(a)) \\
&= (G^{-1})'(H(x) + G(u_0) - H(a))\,\frac{d}{dx}(H(x) + G(u_0) - H(a)) \\
&= \frac{1}{G'(G^{-1}(H(x) + G(u_0) - H(a)))}\,h(x) \\
&= \frac{1}{g(G^{-1}(H(x) + G(u_0) - H(a)))}\,h(x) \\
&= \frac{1}{g(u(x))}\,h(x),
\end{aligned}$$
or
$$g(u(x))\frac{d}{dx}u(x) = h(x),$$
i.e. by (27.10) we have indeed a solution to (27.4). We have added two new conditions: $G$ has an inverse and $G^{-1}$ is differentiable. However, these two conditions are not independent: if $g(y) \ne 0$ for all $y \in \mathbb{R}$, it must be either strictly positive or strictly negative. Since $G$ is a primitive of $g$, i.e. $G' = g$, in the first case $G$ is strictly monotone increasing and in the second case it is strictly monotone decreasing. In each case however $G$ has an inverse which is differentiable. Thus we have proved the following existence and uniqueness result:
Theorem 27.1. Let $h : [a, b] \to \mathbb{R}$, $a < b$, be a continuous function and $u_0 \in \mathbb{R}$. Suppose that $g : \mathbb{R} \to \mathbb{R}$ is a continuous function and $g(y) \ne 0$ for all $y \in \mathbb{R}$. In this case the initial value problem
$$g(u(x))u'(x) = h(x), \quad u(a) = u_0 \tag{27.11}$$
has the unique solution
$$u(x) = G^{-1}(H(x) + G(u_0) - H(a)), \tag{27.12}$$
where $H$ is a primitive of $h$ and $G$ is a primitive of $g$.
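The solution formula of Theorem 27.1 can be tried out numerically. The following minimal sketch assumes a hypothetical example that is not taken from the text: $g(y) = 1 + y^2$ (continuous and without zeros), $h(x) = \cos x$, $a = 0$ and $u_0 = 1$, so that $G(y) = y + y^3/3$ and $H(x) = \sin x$. Since no closed formula for $G^{-1}$ is used, the inverse is computed by bisection, which relies only on $G$ being strictly increasing.

```python
import math

def G(y):
    return y + y ** 3 / 3.0   # primitive of the assumed g(y) = 1 + y^2

def H(x):
    return math.sin(x)        # primitive of the assumed h(x) = cos x

def G_inverse(s, lo=-10.0, hi=10.0, tol=1e-13):
    """Invert the strictly increasing G by bisection (assumes G(lo) <= s <= G(hi))."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if G(mid) < s:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def u(x, a=0.0, u0=1.0):
    """Candidate solution u(x) = G^{-1}(H(x) + G(u0) - H(a))."""
    return G_inverse(H(x) + G(u0) - H(a))

def residual(x, h=1e-5):
    """g(u(x)) u'(x) - h(x), with u' replaced by a central difference quotient."""
    du = (u(x + h) - u(x - h)) / (2.0 * h)
    return (1.0 + u(x) ** 2) * du - math.cos(x)

print(u(0.0), residual(0.8))
```

The printed residual should be close to zero, and $u(0)$ should reproduce the initial value, up to the tolerances of the bisection and of the difference quotient.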
Remark 27.2. Note that after we have derived a candidate for u as a solution
to (27.4) we have to verify that this function is indeed a solution. This is
typical for solving differential equations: to derive a formula for u we need
to do some calculations, but since we do not know what u is, we may not
be able to justify these calculations. Thus we pretend that all steps in the
calculation are allowed, and once we have derived a formula we try (and indeed
we have) to verify that this formula gives a solution.
Remark 27.3. Let us return to formula (27.8) (or (27.9)). In the case
that h and g are continuous, hence have a primitive, this formula makes
sense for any u : [a, b] → R, u(a) = u0 . Hence we can call a function, not
necessarily differentiable, a generalised solution to (27.4) if (27.8) holds. Non-
differentiable solutions of (partial) differential equations are of importance,
however we first need to understand more about differentiable solutions.
For solving differential equations, say the initial value problem (27.4), sometimes a rather formal approach is helpful. With $y = u(x)$ we write (27.4) as
$$g(y)\frac{dy}{dx} = h(x), \tag{27.13}$$
and
$$y_0 = u(a). \tag{27.14}$$
We now write (27.13) formally as
$$g(y)\,dy = h(x)\,dx \tag{27.15}$$
and take primitives on both sides, i.e. look at
$$\int g(y)\,dy = \int h(x)\,dx + C \tag{27.16}$$
(of course we only need to include one constant). Thus we have a formal algorithm:
To solve $g(u(x))u'(x) = h(x)$, look at $g(y)\frac{dy}{dx} = h(x)$, and integrate $\int g(y)\,dy = \int h(x)\,dx$, i.e. for the integration process the variables are separated and consequently this method is called separation of variables. Once we have separated the variables we try to evaluate the two integrals in (27.16) and we then return to (27.8).
Before discussing some examples we want to give a simple generalisation of the method. Consider the differential equation
$$h_2(x)g_2(u(x))u'(x) - h_1(x)g_1(u(x)) = 0 \tag{27.17}$$
where $h_1, h_2 : [a, b] \to \mathbb{R}$ and $g_1, g_2 : \mathbb{R} \to \mathbb{R}$ are continuous functions. We formally transform this equation to
$$\frac{g_2(u(x))}{g_1(u(x))}u'(x) = \frac{h_1(x)}{h_2(x)}, \tag{27.18}$$
and with $g = \frac{g_2}{g_1}$ and $h = \frac{h_1}{h_2}$ we are back to the first case provided $g_1(y) \ne 0$ for all $y \in \mathbb{R}$ and $h_2(x) \ne 0$ for all $x \in [a, b]$. Thus we derive formally the condition
$$\int \frac{g_2(y)}{g_1(y)}\,dy = \int \frac{h_1(x)}{h_2(x)}\,dx + C, \quad y = u(x),$$
i.e. we are looking for a primitive $G$ of $\frac{g_2}{g_1}$ as well as for a primitive $H$ of $\frac{h_1}{h_2}$, and then we try to solve the equation
$$G(y) = H(x) + C$$
for $y$. If this is possible we obtain a function $y = u(x)$ and eventually we can try to adjust the initial value $u_0$ by choosing $C$ such that $u(a) = u_0$. Thus the strategy is to find $G^{-1}$ and then to justify or verify that
$$u(x) = G^{-1}(H(x) + C)$$
solves (27.17).
Example 27.4. We want to solve
$$3u^2(x)u'(x) = \cos x, \quad u(0) = 1 \tag{27.19}$$
for $x \in \mathbb{R}$. With $g(z) = 3z^2$ and $h(x) = \cos x$ we find the primitives $G(z) = z^3$ and $H(x) = \sin x$, respectively. The inverse of $G$ is of course $G^{-1}(s) = s^{1/3}$ and since $G(u_0) = G(1) = 1$ and $H(0) = 0$ it follows that
$$u(x) = (1 + \sin x)^{1/3} \tag{27.20}$$
is a candidate for a solution to (27.19). An easy calculation shows that $u(0) = 1$ and
$$\frac{du}{dx} = \frac{d}{dx}(1 + \sin x)^{1/3} = \frac{1}{3}(1 + \sin x)^{-2/3}\cos x$$
or
$$3u^2(x)\frac{d}{dx}u(x) = \cos x.$$
Note that we cannot apply Theorem 27.1 since $g(z) = 3z^2$ has a zero at $z_0 = 0$. The short calculation above is still formal, we still need to specify for which values of $x$ it holds. The problematic points are those where $\sin x = -1$, i.e. $x = \frac{3\pi}{2} + 2k\pi$, $k \in \mathbb{Z}$. At these points $u$ given by (27.20) is not differentiable. However for $x \ne \frac{3\pi}{2} + 2k\pi$ we have
$$3u^2(x)\frac{du(x)}{dx} = 3(1 + \sin x)^{2/3} \cdot \frac{1}{3}(1 + \sin x)^{-2/3}\cos x = \cos x$$
which implies that although $\frac{du}{dx}$ does not exist for $x = \frac{3\pi}{2} + 2k\pi$ we still have
$$\lim_{x \to \frac{3\pi}{2} + 2k\pi} 3u^2(x)\frac{du(x)}{dx} = \lim_{x \to \frac{3\pi}{2} + 2k\pi} \cos x = \cos\frac{3\pi}{2}\ (= 0).$$
With this interpretation we can claim that $u(x) = (1 + \sin x)^{1/3}$ satisfies $3u^2u' = \cos$ on the entire real axis even if it is not differentiable at certain points.
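The claim of Example 27.4 can be checked numerically away from the exceptional points. The sketch below evaluates $3u^2(x)u'(x)$ for $u(x) = (1 + \sin x)^{1/3}$ with a central difference quotient in place of $u'$; the step size and the sample points are arbitrary choices made for illustration.

```python
import math

def u(x):
    """Candidate solution from the example, u(x) = (1 + sin x)^(1/3)."""
    return (1.0 + math.sin(x)) ** (1.0 / 3.0)

def lhs(x, h=1e-6):
    """3 u(x)^2 u'(x), with u' approximated by a central difference quotient."""
    du = (u(x + h) - u(x - h)) / (2.0 * h)
    return 3.0 * u(x) ** 2 * du

# Sample points; the last one lies close to 3*pi/2, where u is not differentiable.
for x in (0.5, 2.0, 1.5 * math.pi + 0.01):
    print(x, lhs(x), math.cos(x))
```

Close to $x = \frac{3\pi}{2}$ the difference quotient of $u$ becomes large while $u^2$ becomes small, and the product still follows $\cos x$, in line with the limit computed in the example.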
Example 27.5. Our aim is to find a solution to the initial value problem
$$\frac{du(x)}{dx} + 3u(x) = 8, \quad u(0) = 2. \tag{27.21}$$
We may write this equation as
$$1 \cdot \frac{dy}{dx} + (3y - 8) \cdot 1 = 0, \quad y = u(x),\ y_0 = u(0),$$
and formally we find
$$\frac{dy}{8 - 3y} = dx,$$
or
$$\int \frac{dy}{8 - 3y} = \int dx + C,$$
i.e.
$$-\frac{1}{3}\ln(8 - 3y) = x + C.$$
With $y_0 = 2$ we find $-\frac{1}{3}\ln(8 - 6) = -\frac{1}{3}\ln 2 = C$, and we arrive at
$$\ln(8 - 3y) - \ln 2 = \ln\frac{8 - 3y}{2} = -3x,$$
i.e.
$$\frac{8 - 3y}{2} = e^{-3x},$$
leading to
$$y = u(x) = \frac{2}{3}(4 - e^{-3x}).$$
Thus we conjecture that $u(x) = \frac{2}{3}(4 - e^{-3x})$ solves (27.21). The verification is straightforward:
$$u(0) = \frac{2}{3}(4 - 1) = 2$$
and
$$\frac{du(x)}{dx} = \frac{d}{dx}\left(\frac{2}{3}\left(4 - e^{-3x}\right)\right) = -\frac{2}{3}(-3)e^{-3x} = 2e^{-3x},$$
which yields
$$\frac{du(x)}{dx} + 3u(x) = 2e^{-3x} + 3 \cdot \frac{2}{3}(4 - e^{-3x}) = 2e^{-3x} + 8 - 2e^{-3x} = 8.$$
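The verification in Example 27.5 is easy to repeat numerically. The following short sketch evaluates the candidate $u(x) = \frac{2}{3}(4 - e^{-3x})$ and its derivative $2e^{-3x}$ at a few arbitrarily chosen points and prints $u'(x) + 3u(x)$, which should equal $8$ up to rounding.

```python
import math

def u(x):
    """Candidate solution u(x) = (2/3)(4 - e^{-3x}) of u' + 3u = 8, u(0) = 2."""
    return (2.0 / 3.0) * (4.0 - math.exp(-3.0 * x))

def du(x):
    """Derivative of u, i.e. 2 e^{-3x}."""
    return 2.0 * math.exp(-3.0 * x)

print(u(0.0))                      # initial value
for x in (0.0, 0.3, 1.0, 2.5):     # arbitrary sample points
    print(x, du(x) + 3.0 * u(x))   # left hand side of (27.21)
```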
Remark 27.6. A word of caution: in our course the exponential function was introduced as the solution to the initial value problem $u' = u$, $u(0) = 1$. Thus we still owe a proof of its existence and consequently of the existence of $\ln$ and the corresponding integrals. This will be done shortly in Chapter 29.
We close this chapter with a very useful formula. The calculation leading to the justification of (27.10) needs an evaluation of
$$\frac{d}{dx}\int_a^{u(x)} g(z)\,dz,$$
which is of course done with the help of the fundamental theorem. We want to consider a more general expression, namely
$$x \mapsto \int_{v(x)}^{u(x)} g(z)\,dz$$
with differentiable functions $v, u$ such that $v \le u$. Let $G$ be a primitive of the continuous function $g$. It follows that
$$\int_{v(x)}^{u(x)} g(z)\,dz = G(u(x)) - G(v(x)).$$
Consequently we find
$$\begin{aligned}
\frac{d}{dx}\int_{v(x)}^{u(x)} g(z)\,dz &= \frac{d}{dx}\left(G(u(x)) - G(v(x))\right) \\
&= G'(u(x))u'(x) - G'(v(x))v'(x) \\
&= g(u(x))u'(x) - g(v(x))v'(x).
\end{aligned}$$
Hence we have
Proposition 27.7. Let $g : [a, b] \to \mathbb{R}$, $a < b$, be a continuous function and let $u, v : [a, b] \to \mathbb{R}$ be two differentiable functions such that $v(x) \le u(x)$ holds for all $x \in [a, b]$. The function $F : [a, b] \to \mathbb{R}$ given by
$$F(x) := \int_{v(x)}^{u(x)} g(z)\,dz$$
is differentiable in $(a, b)$ and we have for $a < x < b$
$$F'(x) = \frac{d}{dx}F(x) = \frac{d}{dx}\int_{v(x)}^{u(x)} g(z)\,dz = g(u(x))u'(x) - g(v(x))v'(x). \tag{27.22}$$
Even in the case of Theorem 27.1 we will in general have no explicit formula for the solution of (27.11) and of course this applies to (27.17) too. Neither should we expect to find $G$ or $H$ explicitly, nor will we have an explicit formula for $G^{-1}$. Nonetheless, for a function satisfying, say (27.17), we can derive some properties. Here are some first observations. Suppose $u : [a, b] \to \mathbb{R}$ is differentiable and solves
$$u'(x) = h(x), \quad x \in (a, b). \tag{27.23}$$
If $h \ge 0$ ($> 0$, $\le 0$, $< 0$) then $u$ must be monotone increasing (strictly increasing, decreasing, strictly decreasing). This is trivial, for us the observation that we can find properties of a solution to (27.23) without having an explicit formula is the important one. A similar, more surprising result is the following: if $u : [a, b] \to \mathbb{R}$ satisfies
$$u'(x) + f(x)u(x) = 0, \quad x \in (a, b) \tag{27.24}$$
and if $f : (a, b) \to \mathbb{R}$ is $k$-times continuously differentiable then $u$ is on $(a, b)$ $(k + 1)$-times continuously differentiable. The proof is as follows: since $u$ is on $(a, b)$ continuous and differentiable, from (27.24) we derive
$$u'(x) = -f(x)u(x)$$
and the right hand side $-fu$ is continuously differentiable. Hence $u'$ is continuously differentiable and we find
$$u''(x) = -f'(x)u(x) - f(x)u'(x) = -f'(x)u(x) + f^2(x)u(x).$$
Now we observe that $-f'u + f^2u$ is continuously differentiable implying that $u''$ is continuously differentiable and the following holds
$$u'''(x) = \left(-f''(x) + 3f(x)f'(x) - f^3(x)\right)u(x).$$
We can iterate this process until on the right hand side the $k$th derivative of $f$ appears, which is of course the case when forming $u^{(k+1)}$:
$$u^{(k+1)}(x) = \frac{d^k}{dx^k}u'(x) = \frac{d^k}{dx^k}\left(-f(x)u(x)\right) = -\sum_{l=0}^{k}\binom{k}{l}f^{(l)}(x)u^{(k-l)}(x),$$
and we see that we can replace $u^{(k-l)}$ by an expression involving derivatives of $f$ up to order $k - l$ and $u$. In particular, if $f$ is arbitrarily often differentiable we find that $u$ is too. Thus, even though for satisfying equation (27.24) we only need the first derivative of $u$, depending on the smoothness, i.e. the order of differentiability of $f$, $u$ must have higher order derivatives too. From now on we will encounter a few problems involving ordinary differential equations, and step by step we will establish a theory of ordinary differential equations.
Problems
1. For $c < 0$ consider the function $u_c : \mathbb{R} \to \mathbb{R}$ defined by
$$u_c(x) := \begin{cases} \frac{x^2}{4}, & x > 0 \\ 0, & c \le x \le 0 \\ -\frac{(x - c)^2}{4}, & x < c. \end{cases}$$
Prove that $u_c$ is differentiable and satisfies
$$u_c'(x) = \sqrt{|u_c(x)|}.$$
Prove further that every $u_c$ satisfies $u_c(2) = 1$. Now deduce that the initial value problem
$$v'(x) = \sqrt{|v(x)|}, \quad x \in (2, 3), \quad v(2) = 1,$$
is solvable but the solution is not unique.
2. Let $f, h : [a, b] \to \mathbb{R}$ be continuous functions and consider the differential equation
$$f(x)u'(x) + h(x)u(x) = 0, \quad x \in (a, b). \tag{27.25}$$
Prove that if $u_1, u_2 : (a, b) \to \mathbb{R}$ are two solutions to (27.25) then for every $\lambda, \mu \in \mathbb{R}$ the function $\lambda u_1 + \mu u_2$ is a further solution.
3. Let $p_0, p_1 : \mathbb{R} \to \mathbb{R}$ be continuous functions and $p_0(x) \ne 0$ for all $x \in \mathbb{R}$. Prove that a solution to
$$p_0(x)u'(x) + p_1(x)u(x) = 0, \quad x \in \mathbb{R}, \quad u(a) = u_a \in \mathbb{R},$$
is given by
$$u(x) := u_a \exp\left(-\int_a^x \frac{p_1(t)}{p_0(t)}\,dt\right). \tag{27.26}$$
4. By using the separation of variables method, if possible, find a solution to the following initial value problems. In each case give a (reasonable) domain for the solution.
a) $xu'(x) = 2u(x)$, $u(1) = 3$;
b) $y'(t) = 2y^2(t)$, $y(0) = -1$;
c) $\varphi'(s) = \frac{\varphi(s)}{\tan s}$, $\varphi(\frac{\pi}{4}) = \frac{\pi}{4}$;
d) $5x^4(r)x'(r) = r\cos r$, $x(\frac{\pi}{2}) = 1$.
5. a) If $g$ is a continuous function defined on $\mathbb{R}$ find
$$\frac{d}{dx}\int_{\cos x}^{\sqrt{x^2 + 1}} g(z)\,dz.$$
b) For the differentiable functions $u, v : \mathbb{R} \to \mathbb{R}$, $v(x) \le u(x)$ for all $x \in \mathbb{R}$, find
$$\frac{d}{dx}\int_{v(x)}^{u(x)} \frac{1}{1 + t^2}\,dt.$$
6. Let $h : \mathbb{R} \to \mathbb{R}$ be an odd, continuous function and let $u : \mathbb{R} \to \mathbb{R}$ be a non-negative, continuously differentiable function. Prove by a direct calculation that
$$\frac{d}{dx}\int_{-u(x)}^{u(x)} h(t)\,dt = 0,$$
hence $x \mapsto \int_{-u(x)}^{u(x)} h(t)\,dt$ is a constant function. Now give reasons without doing any calculation that we have in fact $\int_{-u(x)}^{u(x)} h(t)\,dt = 0$ for all $x \in \mathbb{R}$.
7. Suppose that $u : [0, \infty) \to \mathbb{R}$ is a continuous function which is continuously differentiable in $(0, \infty)$. If $u$ is a solution of the initial value problem
$$u' = \frac{1}{1 + u^{2k}}, \quad u(0) = 1, \quad k \in \mathbb{N},$$
prove that then $u$ is a strictly monotone increasing, arbitrarily often differentiable function which is convex. (You are not expected to find an explicit expression for $u$.)
28 Improper Integrals and the Γ-Function

So far we have only integrated certain classes of bounded functions which are
defined on a compact interval. We want to extend our notion of integrals to
unbounded functions as well as to non-compact intervals.
Definition 28.1. Let $I$ be a bounded interval with end points $a < b$ and $f : I \to \mathbb{R}$ be a function. We assume:
a) for every $c, d \in I$, $c < d$, the function $f|_{[c,d]}$ is continuous;
b) for some $\alpha \in I$ the following two limits exist:
$$\lim_{c \to a}\int_c^\alpha f(t)\,dt \quad\text{and}\quad \lim_{d \to b}\int_\alpha^d f(t)\,dt. \tag{28.1}$$
Then we define the integral of $f$ over the interval $I$ by
$$\int_a^b f(t)\,dt := \int_I f(t)\,dt := \lim_{c \to a}\int_c^\alpha f(t)\,dt + \lim_{d \to b}\int_\alpha^d f(t)\,dt, \tag{28.2}$$
where $\alpha \in I$ is any point.
Remark 28.2. A. Since for $\alpha, \beta \in I$ the following identity holds
$$\int_\beta^d f(t)\,dt = \int_\beta^\alpha f(t)\,dt + \int_\alpha^d f(t)\,dt \tag{28.3}$$
we can replace (28.1) by the condition that
$$\lim_{c \to a}\int_c^\alpha f(t)\,dt \quad\text{and}\quad \lim_{d \to b}\int_\beta^d f(t)\,dt$$
exist for $\alpha, \beta \in I$.
B. The definition of $\int_a^b f(t)\,dt$ is independent of the choice of $\alpha$ since for $\alpha, \gamma \in I$ it follows that
$$\int_c^d f(t)\,dt = \int_c^\alpha f(t)\,dt + \int_\alpha^d f(t)\,dt = \int_c^\gamma f(t)\,dt + \int_\gamma^d f(t)\,dt.$$
C. In the case that $f|_{[a,\alpha]}$ or $f|_{[\alpha,b]}$ is already integrable, i.e. $\int_a^\alpha f(t)\,dt$ or $\int_\alpha^b f(t)\,dt$ exist, then we need only require that $\lim_{d \to b}\int_\alpha^d f(t)\,dt$ or $\lim_{c \to a}\int_c^\alpha f(t)\,dt$ exist and we can define
$$\int_a^b f(t)\,dt := \int_a^\alpha f(t)\,dt + \lim_{d \to b}\int_\alpha^d f(t)\,dt \tag{28.4}$$
and
$$\int_a^b f(t)\,dt := \lim_{c \to a}\int_c^\alpha f(t)\,dt + \int_\alpha^b f(t)\,dt, \tag{28.5}$$
respectively.
D. Of course it is possible to reduce all limits under consideration to limits of the type
$$\lim_{\epsilon \to 0}\int_{a+\epsilon}^\alpha f(t)\,dt \quad\text{or}\quad \lim_{\epsilon \to 0}\int_\alpha^{b-\epsilon} f(t)\,dt, \tag{28.6}$$
with $\epsilon > 0$ such that $a + \epsilon, b - \epsilon \in I$.
This definition allows us already to integrate certain continuous functions defined on open or half-open intervals, and even some unbounded functions are included.
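Example 28.3. Let $R > 0$ and $0 < \alpha < 1$. Consider the unbounded, continuous function $f_\alpha : (0, R] \to \mathbb{R}$, $x \mapsto \frac{1}{x^\alpha}$. For $c \in (0, R]$ it follows that
$$\int_c^R f_\alpha(x)\,dx = \int_c^R \frac{1}{x^\alpha}\,dx = \frac{1}{1 - \alpha} \cdot \frac{1}{x^{\alpha - 1}}\Big|_c^R = \frac{1}{1 - \alpha}\left(R^{1-\alpha} - c^{1-\alpha}\right).$$
Since $1 - \alpha > 0$ we find
$$\lim_{c \to 0}\int_c^R f_\alpha(x)\,dx = \frac{1}{1 - \alpha}R^{1-\alpha},$$
hence
$$\int_0^R f_\alpha(x)\,dx = \int_0^R \frac{1}{x^\alpha}\,dx = \frac{1}{1 - \alpha}R^{1-\alpha}. \tag{28.7}$$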
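To see the limit in Example 28.3 at work, the following sketch compares the truncated integrals $\int_c^R x^{-\alpha}\,dx$ with the limit value $\frac{1}{1-\alpha}R^{1-\alpha}$. The values $\alpha = \frac{1}{2}$ and $R = 1$ are arbitrary illustrative choices, and the midpoint rule is only used as an independent cross-check of the closed formula for one moderate value of $c$.

```python
def truncated_integral(alpha, c, R, n=20000):
    """Midpoint-rule approximation of integral_c^R x^(-alpha) dx for 0 < c < R."""
    h = (R - c) / n
    return sum((c + (i + 0.5) * h) ** (-alpha) for i in range(n)) * h

def closed_form(alpha, c, R):
    """Value of integral_c^R x^(-alpha) dx for alpha != 1, as computed in the example."""
    return (R ** (1 - alpha) - c ** (1 - alpha)) / (1 - alpha)

alpha, R = 0.5, 1.0
limit = R ** (1 - alpha) / (1 - alpha)       # right hand side of (28.7)
for c in (1e-2, 1e-4, 1e-6):
    print(c, closed_form(alpha, c, R), limit)

# Cross-check of the closed formula against numerical quadrature.
print(truncated_integral(alpha, 1e-2, R), closed_form(alpha, 1e-2, R))
```

As $c$ decreases, the truncated integrals approach the limit from below, since the integrand is positive.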
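Example 28.4. We have
$$\begin{aligned}
\int_{-1}^{1} \frac{dx}{\sqrt{1 - x^2}} &= \lim_{\epsilon \to 0}\int_{-1+\epsilon}^{0} \frac{dx}{\sqrt{1 - x^2}} + \lim_{\epsilon \to 0}\int_{0}^{1-\epsilon} \frac{dx}{\sqrt{1 - x^2}} \\
&= -\lim_{\epsilon \to 0}\arcsin(-1 + \epsilon) + \lim_{\epsilon \to 0}\arcsin(1 - \epsilon) \\
&= -\left(-\frac{\pi}{2}\right) + \frac{\pi}{2} = \pi,
\end{aligned}$$
i.e.
$$\int_{-1}^{1} \frac{dx}{\sqrt{1 - x^2}} = \pi. \tag{28.8}$$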
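Example 28.5. We want to investigate the integral of $x \mapsto \ln(\sin x)$ over the interval $(0, \frac{\pi}{2})$. For $x \in (0, \frac{\pi}{2})$ the range of $\sin x$ is $(0, 1)$, but $\lim_{y \to 0,\, y > 0} \ln y = -\infty$. We first note for $0 < \epsilon < \frac{\pi}{4}$ that the substitution $x = \frac{\pi}{2} - y$ yields
$$I_\epsilon := \int_\epsilon^{\frac{\pi}{2}-\epsilon} \ln(\sin x)\,dx = -\int_{\frac{\pi}{2}-\epsilon}^{\epsilon} \ln\left(\sin\left(\tfrac{\pi}{2} - y\right)\right)dy = \int_\epsilon^{\frac{\pi}{2}-\epsilon} \ln\left(\sin\left(\tfrac{\pi}{2} - x\right)\right)dx = \int_\epsilon^{\frac{\pi}{2}-\epsilon} \ln(\cos x)\,dx,$$
or
$$\begin{aligned}
2I_\epsilon &= \int_\epsilon^{\frac{\pi}{2}-\epsilon} \left(\ln(\sin x) + \ln(\cos x)\right)dx = \int_\epsilon^{\frac{\pi}{2}-\epsilon} \ln(\sin x\cos x)\,dx = \int_\epsilon^{\frac{\pi}{2}-\epsilon} \ln\frac{\sin 2x}{2}\,dx \\
&= \int_\epsilon^{\frac{\pi}{2}-\epsilon} \ln(\sin 2x)\,dx - \ln 2\int_\epsilon^{\frac{\pi}{2}-\epsilon} dx = \int_\epsilon^{\frac{\pi}{2}-\epsilon} \ln(\sin 2x)\,dx - \left(\frac{\pi}{2} - 2\epsilon\right)\ln 2.
\end{aligned}$$
We now study the remaining integral: the substitution $2x = t$ gives
$$\begin{aligned}
\int_\epsilon^{\frac{\pi}{2}-\epsilon} \ln(\sin 2x)\,dx &= \frac{1}{2}\int_{2\epsilon}^{\pi-2\epsilon} \ln(\sin t)\,dt \\
&= \frac{1}{2}\left(\int_\epsilon^{\frac{\pi}{2}-\epsilon} \ln(\sin t)\,dt + \int_{\frac{\pi}{2}+\epsilon}^{\pi-\epsilon} \ln(\sin t)\,dt\right) + \frac{1}{2}\int_{\frac{\pi}{2}-\epsilon}^{\frac{\pi}{2}+\epsilon} \ln(\sin t)\,dt \\
&\quad - \frac{1}{2}\left(\int_\epsilon^{2\epsilon} \ln(\sin t)\,dt + \int_{\pi-2\epsilon}^{\pi-\epsilon} \ln(\sin t)\,dt\right),
\end{aligned}$$
and using the first part (or the substitution $x = \pi - t$) we find
$$\int_\epsilon^{\frac{\pi}{2}-\epsilon} \ln(\sin 2x)\,dx = \int_\epsilon^{\frac{\pi}{2}-\epsilon} \ln(\sin x)\,dx + \frac{1}{2}\int_{\frac{\pi}{2}-\epsilon}^{\frac{\pi}{2}+\epsilon} \ln(\sin t)\,dt - \frac{1}{2}\left(\int_\epsilon^{2\epsilon} \ln(\sin t)\,dt + \int_{\pi-2\epsilon}^{\pi-\epsilon} \ln(\sin t)\,dt\right),$$
implying
$$2I_\epsilon = I_\epsilon - \left(\frac{\pi}{2} - 2\epsilon\right)\ln 2 + \frac{1}{2}\int_{\frac{\pi}{2}-\epsilon}^{\frac{\pi}{2}+\epsilon} \ln(\sin t)\,dt - \frac{1}{2}\left(\int_\epsilon^{2\epsilon} \ln(\sin t)\,dt + \int_{\pi-2\epsilon}^{\pi-\epsilon} \ln(\sin t)\,dt\right),$$
or
$$I_\epsilon = -\left(\frac{\pi}{2} - 2\epsilon\right)\ln 2 + \frac{1}{2}\int_{\frac{\pi}{2}-\epsilon}^{\frac{\pi}{2}+\epsilon} \ln(\sin t)\,dt - \frac{1}{2}\left(\int_\epsilon^{2\epsilon} \ln(\sin t)\,dt + \int_{\pi-2\epsilon}^{\pi-\epsilon} \ln(\sin t)\,dt\right).$$
We now claim that
$$\lim_{\epsilon \to 0}\int_\epsilon^{2\epsilon} \ln(\sin t)\,dt = \lim_{\epsilon \to 0}\int_{\pi-2\epsilon}^{\pi-\epsilon} \ln(\sin t)\,dt = 0. \tag{28.9}$$
Since $\lim_{y \to 0}(y^\alpha \ln y) = 0$ for any $\alpha > 0$ we first note that $\lim_{\epsilon \to 0}\left((\sin\epsilon)\ln(\sin\epsilon)\right) = 0$, implying of course that $\lim_{\epsilon \to 0}\epsilon\ln(\sin\epsilon) = 0$. Next we note that
$$\max_{t \in [\epsilon, 2\epsilon]} |\ln(\sin t)| = -\ln(\sin\epsilon),$$
and
$$\max_{t \in [\pi-2\epsilon, \pi-\epsilon]} |\ln(\sin t)| = -\ln(\sin\epsilon),$$
so each of the two integrals in (28.9) is bounded in absolute value by $-\epsilon\ln(\sin\epsilon)$, which implies (28.9). Moreover, since $t \mapsto \ln(\sin t)$ is continuous at $\frac{\pi}{2}$, the integral over $[\frac{\pi}{2} - \epsilon, \frac{\pi}{2} + \epsilon]$ tends to $0$ for $\epsilon \to 0$ too, and consequently we find
$$\int_0^{\frac{\pi}{2}} \ln(\sin x)\,dx = -\frac{\pi}{2}\ln 2.$$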
In a further step we want to extend the integral for certain functions defined
on unbounded intervals.
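Definition 28.6. A. Let $f : [a, \infty) \to \mathbb{R}$ ($g : (-\infty, b] \to \mathbb{R}$) be a continuous function. If the limit
$$\lim_{R \to \infty}\int_a^R f(x)\,dx \quad \left(\lim_{R \to \infty}\int_{-R}^b g(x)\,dx\right) \tag{28.10}$$
exists we denote it by
$$\int_a^\infty f(x)\,dx := \lim_{R \to \infty}\int_a^R f(x)\,dx \quad \left(\int_{-\infty}^b g(x)\,dx = \lim_{R \to \infty}\int_{-R}^b g(x)\,dx\right). \tag{28.11}$$
B. Let $f : (a, b) \to \mathbb{R}$ be a function where $a \in \mathbb{R} \cup \{-\infty\}$ and $b \in \mathbb{R} \cup \{\infty\}$. Suppose that for every $c, d \in \mathbb{R}$, $c < d$, such that $[c, d] \subset (a, b)$, the function $f|_{[c,d]}$ is continuous. If for some $\alpha \in (a, b)$ the limits
$$\int_a^\alpha f(x)\,dx := \lim_{c \to a}\int_c^\alpha f(x)\,dx \tag{28.12}$$
and
$$\int_\alpha^b f(x)\,dx := \lim_{d \to b}\int_\alpha^d f(x)\,dx \tag{28.13}$$
exist, then we define
$$\int_a^b f(x)\,dx := \int_a^\alpha f(x)\,dx + \int_\alpha^b f(x)\,dx := \lim_{c \to a}\int_c^\alpha f(x)\,dx + \lim_{d \to b}\int_\alpha^d f(x)\,dx. \tag{28.14}$$
Remark 28.7. As before, if one of the integrals $\int_a^\alpha f(x)\,dx$ or $\int_\alpha^b f(x)\,dx$ exists as the Riemann integral of a continuous function defined on a compact interval, then in (28.14) we need to consider only one limit.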
Definition 28.8. Any of the integrals defined in Definition 28.1 or Definition

28.6 we call the improper (Riemann) integral of f .
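Example 28.9. For $\alpha > 1$ we have
$$\int_1^\infty \frac{dx}{x^\alpha} = \frac{1}{\alpha - 1}. \tag{28.15}$$
Indeed, for $R > 0$ the integral $\int_1^R \frac{dx}{x^\alpha}$ exists and we find
$$\int_1^R \frac{dx}{x^\alpha} = \frac{1}{1 - \alpha} \cdot \frac{1}{x^{\alpha-1}}\Big|_1^R = \frac{1}{\alpha - 1}\left(1 - \frac{1}{R^{\alpha-1}}\right).$$
Since $\lim_{R \to \infty}\frac{1}{R^{\alpha-1}} = 0$, note that $\alpha - 1 > 0$, it follows that
$$\int_1^\infty \frac{dx}{x^\alpha} = \lim_{R \to \infty}\int_1^R \frac{dx}{x^\alpha} = \lim_{R \to \infty}\frac{1}{\alpha - 1}\left(1 - \frac{1}{R^{\alpha-1}}\right) = \frac{1}{\alpha - 1}.$$
Example 28.10. The following holds
$$\int_{-\infty}^{\infty} \frac{dx}{1 + x^2} = \pi. \tag{28.16}$$
We have for $R > 0$
$$\begin{aligned}
\int_{-\infty}^{\infty} \frac{dx}{1 + x^2} &= \lim_{R \to \infty}\int_{-R}^{0} \frac{dx}{1 + x^2} + \lim_{R \to \infty}\int_{0}^{R} \frac{dx}{1 + x^2} \\
&= -\lim_{R \to \infty}\arctan(-R) + \lim_{R \to \infty}\arctan R \\
&= -\left(-\frac{\pi}{2}\right) + \frac{\pi}{2} = \pi.
\end{aligned}$$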
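Example 28.11. For $\alpha > 0$ we have
$$\int_0^\infty e^{-\alpha t}\,dt = \frac{1}{\alpha}. \tag{28.17}$$
Indeed, for $R > 0$ it follows that
$$\int_0^R e^{-\alpha t}\,dt = -\frac{1}{\alpha}e^{-\alpha t}\Big|_0^R = \frac{1}{\alpha}\left(1 - e^{-\alpha R}\right)$$
and passing to the limit $R \to \infty$ we find (28.17).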
In most cases we cannot do explicit calculations to check whether or not
an improper integral exists. Thus we need criteria for the convergence or
divergence of improper integrals.
In the following $I$ is an interval with end points $a \in \mathbb{R} \cup \{-\infty\}$ and $b \in \mathbb{R} \cup \{\infty\}$, $a < b$, and $f : I \to \mathbb{R}$ is a function which is continuous on any compact interval $[c, d] \subset I$. For the improper integral of $f$ over $I$ (if it exists) we will write $\int_I f(x)\,dx$. Our first criterion is the Cauchy criterion for improper integrals.
Theorem 28.12. The improper integral $\int_I f(x)\,dx$ exists (converges) if for every $\alpha \in (a, b)$ we have:
For every $\epsilon > 0$ there exist $s_0, y_0 \in (a, b)$ such that $b > t > s > s_0 > a$ and $a < z < y < y_0 < b$ imply
$$\left|\int_s^t f(x)\,dx\right| < \epsilon \quad\text{and}\quad \left|\int_z^y f(x)\,dx\right| < \epsilon.$$
Proof. The first condition is equivalent to the existence of the limit
$$\lim_{t_0 \to b}\int_{s_0}^{t_0} f(x)\,dx,$$
while the second condition is equivalent to the existence of the limit
$$\lim_{z_0 \to a}\int_{z_0}^{y_0} f(x)\,dx.$$
Theorem 28.13. Suppose that $f \ge 0$. Then for every $\alpha \in (a, b)$ the integral $\int_\alpha^b f(x)\,dx$ converges if there exists a constant $M > 0$ such that for all $\beta \in (\alpha, b)$
$$\int_\alpha^\beta f(x)\,dx \le M \tag{28.18}$$
holds.
Proof. Since $f \ge 0$ the function $\beta \mapsto \int_\alpha^\beta f(x)\,dx$ is monotone increasing and it is bounded, hence the limit
$$\lim_{\beta \to b}\int_\alpha^\beta f(x)\,dx$$
exists, see Problem 6 in Chapter 20.

Definition 28.14. We call $\int_I f(x)\,dx$ absolutely convergent if $\int_I |f(x)|\,dx$ converges.
Lemma 28.15. If $\int_I f(x)\,dx$ converges absolutely, then it converges.
Proof. This follows from the Cauchy criterion with the help of the triangle inequality. Since $\int_I |f(x)|\,dx$ converges the Cauchy criterion holds for $|f|$, i.e. for $\alpha \in (a, b)$ we have: for every $\epsilon > 0$ there exist $s_0, y_0 \in (a, b)$ such that $b > t > s > s_0 > a$ and $a < z < y < y_0 < b$ imply
$$\int_s^t |f(x)|\,dx < \epsilon \quad\text{and}\quad \int_z^y |f(x)|\,dx < \epsilon, \tag{28.19}$$
and consequently
$$\left|\int_s^t f(x)\,dx\right| < \epsilon \quad\text{and}\quad \left|\int_z^y f(x)\,dx\right| < \epsilon. \tag{28.20}$$
Remark 28.16. In Theorem 25.19 we have proved that if $f$ is Riemann integrable on the compact interval $[a, b]$, then $|f|$ is Riemann integrable on $[a, b]$ too. We will see in Problem 9 that this does not hold for improper integrals. Moreover, while the product of two Riemann integrable functions on a compact interval is also Riemann integrable, the product of two improper integrable functions need not be improper integrable. Indeed, by Example 28.3 the function $f_{\frac{1}{2}}(x) := \frac{1}{\sqrt{x}}$ is improper integrable on $(0, 1]$, however $(f_{\frac{1}{2}} \cdot f_{\frac{1}{2}})(x) = \frac{1}{x}$ is not improper integrable on $(0, 1]$:
$$\int_\epsilon^1 \frac{1}{x}\,dx = \ln 1 - \ln\epsilon = \ln\frac{1}{\epsilon}$$
and the limit $\epsilon \to 0$ does not exist.
The following criterion is useful:

Theorem 28.17. Let $g : I \to \mathbb{R}$, $g(x) \ge 0$, and suppose that $\int_I g(x)\,dx$ converges. If $|f(x)| \le g(x)$ for all $x \in I$ then $\int_I f(x)\,dx$ converges absolutely.
Proof. We use once again the Cauchy criterion. Since the integral $\int_I g(x)\,dx$ exists, the Cauchy criterion holds, so we can replace in (28.19) the function $|f|$ by $g$. Now we need to observe that
$$\left|\int_s^t f(x)\,dx\right| \le \int_s^t |f(x)|\,dx \le \int_s^t g(x)\,dx$$
as well as
$$\left|\int_z^y f(x)\,dx\right| \le \int_z^y |f(x)|\,dx \le \int_z^y g(x)\,dx.$$
Corollary 28.18. Suppose that $h : I \to \mathbb{R}$, $h(x) \ge 0$, is continuous (integrable would be sufficient) and that $\int_I h(x)\,dx$ diverges. If $h(x) \le f(x)$ for all $x \in I$, then $\int_I f(x)\,dx$ diverges too.
Proof. If we assume the contrary, Theorem 28.17 would imply the convergence of $\int_I h(x)\,dx$.
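Example 28.19. A. The integrals
$$\int_1^\infty \frac{\cos x}{x^2}\,dx \quad\text{and}\quad \int_1^\infty \frac{\sin x}{x^2}\,dx$$
exist. Indeed, if $P(\sin x, \cos x) = \sum_{k,l=0}^{N} A_{k,l}\sin^k(a_k x)\cos^l(b_l x)$ and $\alpha > 1$ then
$$\int_r^\infty \frac{P(\sin x, \cos x)}{x^\alpha}\,dx, \quad r > 0,$$
exists. We only need to observe that
$$|P(\sin x, \cos x)| \le \sum_{k,l=0}^{N} |A_{k,l}|$$
and that for $\alpha > 1$ the integral $\int_r^\infty \frac{1}{x^\alpha}\,dx$ converges.
B. The integral $\int_0^\infty \frac{\sin x}{x}\,dx$ converges. We use the Cauchy criterion to show this. Let $0 < s < t$, then integration by parts gives
$$\int_s^t \frac{\sin x}{x}\,dx = \frac{-\cos x}{x}\Big|_s^t - \int_s^t \frac{\cos x}{x^2}\,dx$$
which implies
$$\left|\int_s^t \frac{\sin x}{x}\,dx\right| \le \frac{1}{s} + \frac{1}{t} + \int_s^t \frac{dx}{x^2} = \frac{1}{s} + \frac{1}{t} + \left(-\frac{1}{x}\right)\Big|_s^t = \frac{2}{s}.$$
Thus given $\epsilon > 0$, choose $s_0 \ge \frac{2}{\epsilon}$ to find that for $t > s > s_0$ it follows
$$\left|\int_s^t \frac{\sin x}{x}\,dx\right| \le \frac{2}{s} < \epsilon.$$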
s
Remark 28.20. Of course we can also use the integral test, Theorem 18.21,
to test improper integrals for convergence.
We now want to introduce one of the most important functions in mathematics and for this we need some preparation. For $x \in \mathbb{R}$ consider $y \mapsto \cos xy$. It follows that
$$\int_1^2 \cos(xy)\,dy = \frac{1}{x}\sin xy\Big|_1^2 = \frac{\sin 2x - \sin x}{x}.$$
Hence we have defined a new function (at least) on $\mathbb{R} \setminus \{0\}$ by
$$x \mapsto \int_1^2 \cos(xy)\,dy = \frac{\sin 2x - \sin x}{x}.$$
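More generally, for each $x \in I$, $I \subset \mathbb{R}$ an interval, let a continuous function $g_x : (a, b) \to \mathbb{R}$, $y \mapsto g_x(y)$, be given, $-\infty \le a < b \le \infty$. [...] For $x > 0$ the improper integral
$$\Gamma(x) := \int_0^\infty t^{x-1}e^{-t}\,dt \tag{28.21}$$
exists.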
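Proof. For $t > 0$ we have the estimate
$$t^{x-1}e^{-t} \le t^{x-1}, \tag{28.22}$$
implying for $\epsilon > 0$ that
$$\int_\epsilon^1 t^{x-1}e^{-t}\,dt \le \int_\epsilon^1 t^{x-1}\,dt = \frac{1}{x}(1 - \epsilon^x) \le \frac{1}{x}.$$
Since the function $\epsilon \mapsto \int_\epsilon^1 t^{x-1}e^{-t}\,dt$ is monotone, bounded and continuous it follows that
$$\int_0^1 t^{x-1}e^{-t}\,dt = \lim_{\epsilon \to 0}\int_\epsilon^1 t^{x-1}e^{-t}\,dt$$
exists and is finite. Further, using that $\lim_{t \to \infty} t^{x+1}e^{-t} = 0$ implies that for some $N \in \mathbb{N}$ the condition $t \ge N$ yields
$$t^{x-1}e^{-t} \le \frac{1}{t^2}, \tag{28.23}$$
we find for $R > N$ that
$$\begin{aligned}
\int_1^R t^{x-1}e^{-t}\,dt &= \int_1^N t^{x-1}e^{-t}\,dt + \int_N^R t^{x-1}e^{-t}\,dt \\
&\le \int_1^N t^{x-1}e^{-t}\,dt + \int_N^R \frac{1}{t^2}\,dt \\
&= \int_1^N t^{x-1}e^{-t}\,dt + \frac{1}{N} - \frac{1}{R} \le C(N) < \infty.
\end{aligned}$$
As before we observe that $R \mapsto \int_1^R t^{x-1}e^{-t}\,dt$ is monotone, bounded and continuous, hence
$$\lim_{R \to \infty}\int_1^R t^{x-1}e^{-t}\,dt = \int_1^\infty t^{x-1}e^{-t}\,dt$$
exists. Thus
$$\Gamma(x) := \lim_{\epsilon \to 0}\int_\epsilon^1 t^{x-1}e^{-t}\,dt + \lim_{R \to \infty}\int_1^R t^{x-1}e^{-t}\,dt$$
is well defined.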
Definition 28.22. The function $\Gamma : (0, \infty) \to \mathbb{R}$ defined by (28.21), i.e.
$$\Gamma(x) := \int_0^\infty t^{x-1}e^{-t}\,dt, \tag{28.24}$$
is called the Γ-function.
Theorem 28.23. For x > 0 we have
Γ(x + 1) = xΓ(x). (28.25)
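Proof. Integration by parts yields
$$\int_\epsilon^R t^x e^{-t}\,dt = -t^x e^{-t}\Big|_\epsilon^R + x\int_\epsilon^R t^{x-1}e^{-t}\,dt.$$
For $\epsilon \to 0$ and $R \to \infty$ we find
$$\begin{aligned}
\Gamma(x + 1) &= \lim_{\epsilon \to 0}\lim_{R \to \infty}\int_\epsilon^R t^x e^{-t}\,dt \\
&= \lim_{\epsilon \to 0}\lim_{R \to \infty}\left(-t^x e^{-t}\Big|_\epsilon^R\right) + x\lim_{\epsilon \to 0}\lim_{R \to \infty}\int_\epsilon^R t^{x-1}e^{-t}\,dt \\
&= \lim_{\epsilon \to 0}\lim_{R \to \infty}\left(-R^x e^{-R} + \epsilon^x e^{-\epsilon}\right) + x\Gamma(x) \\
&= x\Gamma(x),
\end{aligned}$$
which proves (28.25).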
Since
$$\Gamma(1) = \lim_{R \to \infty}\int_0^R e^{-t}\,dt = \lim_{R \to \infty}(1 - e^{-R}) = 1, \tag{28.26}$$
we deduce from (28.25) for $n \in \mathbb{N}$ that
$$\Gamma(n + 1) = n\Gamma(n) = n(n - 1)\Gamma(n - 1) = \cdots = n(n - 1)(n - 2)\cdots 1 \cdot \Gamma(1) = n!$$
Corollary 28.24. If n ∈ N then
Γ(n + 1) = n!. (28.27)
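The identity $\Gamma(n + 1) = n!$ can be checked by approximating the defining integral directly. The sketch below truncates the integral at a hypothetical cut-off $T = 60$ (the neglected tail is tiny for the small $n$ used here) and applies a composite Simpson rule; it is meant for arguments $x \ge 1$, where the integrand is bounded at $t = 0$. Python's built-in `math.gamma` is printed alongside for comparison.

```python
import math

def gamma_numeric(x, T=60.0, n=60000):
    """Simpson approximation of integral_0^T t^(x-1) e^(-t) dt, intended for x >= 1 (n even)."""
    h = T / n
    f = lambda t: t ** (x - 1) * math.exp(-t)
    total = f(0.0) + f(T)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3.0

for k in range(1, 6):
    print(k, gamma_numeric(k + 1), math.gamma(k + 1), math.factorial(k))
```

The functional equation (28.25) can be spot-checked in the same way for non-integer arguments, for example by comparing `math.gamma(3.5)` with `2.5 * math.gamma(2.5)`.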
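Lemma 28.25. We have the following
$$\int_{-\infty}^{\infty} e^{-x^2}\,dx = \Gamma\left(\frac{1}{2}\right). \tag{28.28}$$
Proof. The substitution $x = t^{\frac{1}{2}}$ yields
$$\int_\epsilon^R e^{-x^2}\,dx = \frac{1}{2}\int_{\epsilon^2}^{R^2} t^{-\frac{1}{2}}e^{-t}\,dt,$$
or
$$\int_0^\infty e^{-x^2}\,dx = \frac{1}{2}\int_0^\infty t^{-\frac{1}{2}}e^{-t}\,dt = \frac{1}{2}\Gamma\left(\frac{1}{2}\right).$$
Since
$$\int_{-\infty}^{\infty} e^{-x^2}\,dx = 2\int_0^\infty e^{-x^2}\,dx$$
the result follows.
Remark 28.26. We will prove in Theorem 30.14 that $\Gamma\left(\frac{1}{2}\right) = \sqrt{\pi}$, implying
$$\int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi}. \tag{28.29}$$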
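The value of the Gaussian integral can be observed numerically. The following sketch approximates $\int_{-L}^{L} e^{-x^2}\,dx$ with a composite Simpson rule, where the cut-off $L = 10$ is an assumption made for illustration (the tails beyond it are negligible), and compares the result with `math.gamma(0.5)` and $\sqrt{\pi}$.

```python
import math

def gauss_integral(L=10.0, n=20000):
    """Simpson approximation of integral_{-L}^{L} exp(-x^2) dx (n even)."""
    h = 2.0 * L / n
    f = lambda x: math.exp(-x * x)
    total = f(-L) + f(L)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(-L + i * h)
    return total * h / 3.0

print(gauss_integral(), math.gamma(0.5), math.sqrt(math.pi))
```

All three printed numbers should coincide to many digits, in agreement with (28.28) and (28.29).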
Definition 28.27. Let I ⊂ R be an interval and F : I → (0, ∞) be a

function. We call F logarithmic convex if ln F : I → R is convex.
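Remark 28.28. If $F : I \to (0, \infty)$ is logarithmic convex we have for $0 < \lambda < 1$ and $x, y \in I$ that
$$\ln F(\lambda x + (1 - \lambda)y) \le \lambda\ln F(x) + (1 - \lambda)\ln F(y)$$
or
$$\ln F(\lambda x + (1 - \lambda)y) \le \ln\left(F(x)^\lambda F(y)^{1-\lambda}\right),$$
implying
$$F(\lambda x + (1 - \lambda)y) \le F(x)^\lambda F(y)^{1-\lambda}. \tag{28.30}$$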
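Theorem 28.29. The Γ-function is logarithmic convex.
Proof. First we note that $\Gamma(x) = \int_0^\infty t^{x-1} e^{-t}\,dt > 0$ for $x > 0$. Next, for $x, y \in (0,\infty)$ and $0 < \lambda < 1$ we set $p := \frac{1}{\lambda}$ and $q := \frac{1}{1-\lambda}$, i.e. $\frac{1}{p} + \frac{1}{q} = 1$. Define $f(t) = t^{\frac{x-1}{p}} e^{-\frac{t}{p}}$ and $g(t) = t^{\frac{y-1}{q}} e^{-\frac{t}{q}}$. For $\epsilon > 0$ and $R > \epsilon$ Hölder's inequality yields
\[ \int_\epsilon^R f(t) g(t)\,dt \le \left(\int_\epsilon^R f(t)^p\,dt\right)^{\frac{1}{p}} \left(\int_\epsilon^R g(t)^q\,dt\right)^{\frac{1}{q}}, \]
but
\begin{align*}
f(t) g(t) &= t^{\frac{x}{p} + \frac{y}{q} - 1} e^{-t} = t^{\lambda x + (1-\lambda)y - 1} e^{-t}, \\
f(t)^p &= t^{x-1} e^{-t}, \\
g(t)^q &= t^{y-1} e^{-t},
\end{align*}
i.e. we find
\[ \int_\epsilon^R t^{\lambda x + (1-\lambda)y - 1} e^{-t}\,dt \le \left(\int_\epsilon^R t^{x-1} e^{-t}\,dt\right)^{\lambda} \left(\int_\epsilon^R t^{y-1} e^{-t}\,dt\right)^{1-\lambda}. \]
For $\epsilon \to 0$ and $R \to \infty$ we eventually arrive at
\[ \Gamma(\lambda x + (1-\lambda)y) \le \Gamma(x)^\lambda\, \Gamma(y)^{1-\lambda}, \]
i.e. Γ is logarithmic convex.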
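As a sanity check (not a proof), the inequality $\Gamma(\lambda x + (1-\lambda)y) \le \Gamma(x)^\lambda \Gamma(y)^{1-\lambda}$ can be tested on a few sample points. The sketch below works with logarithms via the standard library function `math.lgamma`; the helper name `log_convexity_gap` and the sample grid are illustrative assumptions.

```python
import math

def log_convexity_gap(x, y, lam):
    """Return lam*ln Gamma(x) + (1-lam)*ln Gamma(y) - ln Gamma(lam*x + (1-lam)*y).

    For x, y > 0 and 0 < lam < 1 logarithmic convexity of Gamma says
    that this quantity is non-negative.
    """
    mid = lam * x + (1 - lam) * y
    return lam * math.lgamma(x) + (1 - lam) * math.lgamma(y) - math.lgamma(mid)

# Sample a small grid of points and weights and report the smallest gap.
points = [0.1, 0.5, 1.0, 2.5, 7.0]
weights = [0.1, 0.5, 0.9]
smallest = min(log_convexity_gap(x, y, lam)
               for x in points for y in points for lam in weights)
print("smallest gap on the grid:", smallest)
```

Up to rounding error the smallest gap is zero, attained when $x = y$.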

In Chapters 30 and 31 we will return to the Γ-function.
Problems
1. a) Let $a < b$. Prove that the improper integral
\[ \int_a^b \frac{dx}{(x-a)^\alpha} \]
converges for $\alpha < 1$ and diverges for $\alpha \ge 1$.

b) Prove the existence of the improper integral
\[ \int_0^2 \frac{dx}{\sqrt{x(2-x)}}. \]
c) Prove that for every $\alpha \in \mathbb{R}$ the integral
\[ \int_0^\infty x^\alpha\,dx \]
diverges.
d) Show that
\[ \int_0^\infty e^{-ax} \cos(wx)\,dx = \frac{a}{a^2 + w^2}. \]
2. Let $f : [0,\infty) \to \mathbb{R}$ be a continuous function satisfying with some $\beta \in \mathbb{R}$ the estimate $|f(r)| \le c_0 (1+r^2)^{\frac{\beta}{2}}$. Prove that for $\beta + 1 < \alpha$ the integral
\[ (*) \qquad \int_0^\infty \frac{f(r)}{(1+r^2)^{\frac{\alpha}{2}}}\,dr \]
converges absolutely. Now suppose that $f$ is a polynomial of degree $m \in \mathbb{N}$. For which $\alpha \in \mathbb{R}$ does $(*)$ converge?
3. Use mathematical induction to prove for $\alpha > -1$ and $k \in \mathbb{N}_0$
\[ \int_0^1 x^k (1-x)^\alpha\,dx = \frac{k!}{(\alpha+1)(\alpha+2)\cdots(\alpha+k+1)}. \]
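4. Prove that for $a > 0$
\[ \int_0^\infty \frac{\ln x}{x^2 + a^2}\,dx \quad\text{and}\quad \int_0^\infty \frac{\sin^2 t}{t^2 + a^2}\,dt \]
converge.
5. Let $g : [-1,1] \to \mathbb{R}$ be an even, continuous function, $g(0) \ne 0$. Prove that
\[ \int_{-1}^0 \frac{g(x)}{x}\,dx \quad\text{and}\quad \int_0^1 \frac{g(x)}{x}\,dx \]
diverge and consequently we cannot define
\[ \int_{-1}^1 \frac{g(x)}{x}\,dx. \]
Find now
\[ \lim_{\epsilon\to 0} \left( \int_{-1}^{-\epsilon} \frac{g(x)}{x}\,dx + \int_\epsilon^1 \frac{g(x)}{x}\,dx \right). \]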
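6. Let $f : [0,\infty) \to \mathbb{R}$ be a continuous function. Prove that if $\lim_{x\to\infty} x^\alpha f(x) = c_0 \in \mathbb{R}$ and $\alpha > 1$, then
\[ (**) \qquad \int_0^\infty f(x)\,dx \]
converges. However, if $\lim_{x\to\infty} x^\alpha f(x) = c_0 \ne 0$ and $\alpha \le 1$, then $(**)$ diverges.
7. Use the result of Problem 6 to test for the convergence or divergence of:
a) $\int_1^\infty \frac{\ln x}{1+x}\,dx$;
b) $\int_0^\infty \frac{1-\cos y}{y^2}\,dy$;
c) $\int_{-\infty}^{-1} \frac{e^t}{t}\,dt$.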
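8. Prove that the integral
\[ \int_0^\infty \frac{\sin x}{x}\,dx \]
does not converge absolutely.
Hint: note that
\[ \int_0^\infty \left|\frac{\sin x}{x}\right| dx = \sum_{n=0}^\infty \int_{n\pi}^{(n+1)\pi} \left|\frac{\sin x}{x}\right| dx. \]
Now prove that
\[ \int_{n\pi}^{(n+1)\pi} \left|\frac{\sin x}{x}\right| dx = \int_0^\pi \frac{\sin t}{t + n\pi}\,dt \]
and test the resulting series for divergence.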
9. Prove the following quotient test for improper integrals: Let $f, g : (a,b] \to \mathbb{R}$ be two non-negative continuous functions. If $\lim_{x\to a} \frac{f(x)}{g(x)} = c_0 > 0$, then $\int_a^b f(x)\,dx$ exists if and only if $\int_a^b g(x)\,dx$ exists. If $\lim_{x\to a} \frac{f(x)}{g(x)} = 0$ and $\int_a^b g(x)\,dx$ converges, then $\int_a^b f(x)\,dx$ converges. If $\lim_{x\to a} \frac{f(x)}{g(x)} = \infty$ and $\int_a^b g(x)\,dx$ diverges then $\int_a^b f(x)\,dx$ diverges.
10. Show that
\[ \int_0^1 \frac{ds}{\sqrt{-\ln s}} = \Gamma\left(\tfrac{1}{2}\right). \]
11. For $x > 0$ and $y > 0$ prove the existence of the improper integral
\[ B(x,y) := \int_0^1 t^{x-1} (1-t)^{y-1}\,dt \]
and deduce $B(x,y) = B(y,x)$. Further, by using the substitution $t = \sin^2\vartheta$ show that
\[ \int_0^{\frac{\pi}{2}} \sin^{2m-1}\vartheta\, \cos^{2n-1}\vartheta\,d\vartheta = \frac{1}{2} B(m,n). \]
12. a) Prove that the product of two logarithmic convex functions is logarithmic convex.
b) Show that a twice continuously differentiable function $f : I \to \mathbb{R}$, $I \subset \mathbb{R}$ an interval, is logarithmic convex if $f > 0$ and $f f'' - (f')^2 \ge 0$.
c) Prove that the limit of a sequence of logarithmic convex functions is logarithmic convex.
29 Power Series and Taylor Series

We have already discussed sequences of functions on several occasions and we know that sequences are closely related to series. We now want to start to discuss series of functions. Let $(f_n)_{n\in\mathbb{N}_0}$, $f_n : K \to \mathbb{R}$, $K \subset \mathbb{R}$, be a sequence of functions. We may study the partial sums
\[ S_N^f(x) := \sum_{n=0}^N f_n(x). \tag{29.1} \]
Theorem 29.1 (Weierstrass' convergence criterion or Weierstrass' M-test). Let $(f_n)_{n\in\mathbb{N}_0}$, $f_n : K \to \mathbb{R}$, be a sequence of functions and suppose that
\[ \sum_{n=0}^\infty \|f_n\|_\infty < \infty. \tag{29.2} \]
Then the series, i.e. the sequence $(S_N^f)_{N\in\mathbb{N}_0}$ of partial sums, converges absolutely and uniformly on $K$ to a function $F : K \to \mathbb{R}$.

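Proof. First we prove that $\sum_{n=0}^\infty f_n(x)$ converges pointwise, i.e. for every $x \in K$, to some function $F : K \to \mathbb{R}$. Since $|f_n(x)| \le \|f_n\|_\infty$ the series $\sum_{n=0}^\infty f_n(x)$ converges absolutely by the comparison test, see Theorem 18.13. Therefore we can define for $x \in K$
\[ F(x) := \sum_{n=0}^\infty f_n(x), \]
which is a function $F : K \to \mathbb{R}$. Next we prove that the convergence is uniform, i.e. the sequence of partial sums $(S_N^f)_{N\in\mathbb{N}_0}$ converges uniformly. Since $\sum_{n=0}^\infty \|f_n\|_\infty < \infty$, for every $\epsilon > 0$ there exists $\tilde{N}(\epsilon)$ such that
\[ \sum_{n=N+1}^\infty \|f_n\|_\infty < \epsilon \quad\text{for } N \ge \tilde{N}(\epsilon). \]
Therefore, for $N \ge \tilde{N}(\epsilon)$ it follows that
\begin{align*}
\|S_N^f - F\|_\infty &= \sup_{x\in K} |S_N^f(x) - F(x)| \\
&= \sup_{x\in K} \left| \sum_{n=N+1}^\infty f_n(x) \right| \le \sup_{x\in K} \sum_{n=N+1}^\infty |f_n(x)| \\
&\le \sum_{n=N+1}^\infty \|f_n\|_\infty < \epsilon.
\end{align*}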

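Example 29.2. The series $\sum_{n=1}^\infty \frac{\cos nx}{n^2}$ converges uniformly on $\mathbb{R}$ since with $f_n(x) = \frac{\cos nx}{n^2}$ we have
\[ \|f_n\|_\infty = \sup_{x\in\mathbb{R}} \left| \frac{\cos nx}{n^2} \right| = \frac{1}{n^2} \]
and
\[ \sum_{n=1}^\infty \frac{1}{n^2} < \infty. \]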
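The uniform bound behind the M-test can be observed numerically for the series $\sum \cos(nx)/n^2$. In the sketch below a long partial sum stands in for the limit function, and the supremum over $x$ is approximated on a finite grid; the names `partial_sum` and `sup_gap`, the reference length `N_ref` and the grid size are illustrative assumptions. The comparison value $1/N$ comes from the elementary estimate $\sum_{n>N} 1/n^2 < \int_N^\infty x^{-2}\,dx = 1/N$.

```python
import math

def partial_sum(x, N):
    """Partial sum of cos(n x) / n^2 for n = 1, ..., N."""
    return sum(math.cos(n * x) / n**2 for n in range(1, N + 1))

def sup_gap(N, N_ref=5000, grid=40):
    """Largest difference between the N-th and the N_ref-th partial sum
    over an equidistant grid of points in [0, 2*pi)."""
    xs = [2 * math.pi * i / grid for i in range(grid)]
    return max(abs(partial_sum(x, N_ref) - partial_sum(x, N)) for x in xs)

# The gap is bounded by the tail of sum 1/n^2, hence by 1/N, uniformly in x.
for N in (10, 100, 1000):
    print(N, sup_gap(N), 1 / N)
```

The bound does not depend on the grid point, which is exactly the uniformity asserted by Theorem 29.1.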
Now we return to power and Taylor series.
Definition 29.3. Let $(c_n)_{n\in\mathbb{N}_0}$ be a sequence of real numbers and $a \in \mathbb{R}$. We call
\[ T^a_{(c_n)}(x) := \sum_{n=0}^\infty c_n (x-a)^n, \quad x \in \mathbb{R}, \tag{29.3} \]
the (formal) power series associated with $(c_n)_{n\in\mathbb{N}_0}$ and centre $a$.
Most important of course is the question for which $x \ne a$ the formal power series $T^a_{(c_n)}(x)$ converges.
Theorem 29.4. Let $(c_n)_{n\in\mathbb{N}_0}$ be a sequence of real numbers and $a \in \mathbb{R}$. If $T^a_{(c_n)}(x)$ converges for some $x_1 \ne a$, then it converges for all $x \in \mathbb{R}$ such that $|x-a| \le \rho < |x_1-a|$, i.e. it converges for all $x \in [a-\rho, a+\rho]$, $0 < \rho < |x_1-a|$. Moreover, the convergence is absolute and uniform on $[a-\rho, a+\rho]$ and the same holds for the series
\[ T^a_{(nc_n)}(x) = \sum_{n=1}^\infty n c_n (x-a)^{n-1}. \tag{29.4} \]
In particular $x \mapsto T^a_{(c_n)}(x)$ and $x \mapsto T^a_{(nc_n)}(x)$ are for every $0 < \rho < |x_1-a|$ continuous functions on $[a-\rho, a+\rho]$.
Proof. We set $f_n(x) := c_n (x-a)^n$, hence formally we find with $f(x) := T^a_{(c_n)}(x)$ that $f = \sum_{n=0}^\infty f_n$. Since $\sum_{n=0}^\infty f_n(x_1)$ converges by our assumption, there exists $M \ge 0$ such that $|f_n(x_1)| \le M$ for all $n \in \mathbb{N}_0$.
For $0 < \rho < |x_1-a|$ and $x \in [a-\rho, a+\rho]$ it follows that
\[ |f_n(x)| = |c_n (x-a)^n| = |c_n (x_1-a)^n| \left| \frac{x-a}{x_1-a} \right|^n \le M \vartheta^n \]
where $\vartheta := \frac{\rho}{|x_1-a|} < 1$. Thus we have
\[ \|f_n\|_{[a-\rho,a+\rho],\infty} = \sup_{x\in[a-\rho,a+\rho]} |f_n(x)| \le M \vartheta^n \]
implying that $\sum_{n=0}^\infty \|f_n\|_{[a-\rho,a+\rho],\infty} \le M \frac{1}{1-\vartheta}$, and hence by Theorem 29.1 the series $\sum_{n=0}^\infty f_n$ converges absolutely and uniformly on $[a-\rho, a+\rho]$, and since $f_n$ is continuous on $[a-\rho, a+\rho]$ the function $f$ is continuous too, see Theorem 24.6.
Now define $g_n(x) := n c_n (x-a)^{n-1}$, and $g = \sum_{n=1}^\infty g_n$. As before we may prove that
\[ \|g_n\|_{[a-\rho,a+\rho],\infty} \le n M \vartheta^{n-1} \]
and the ratio test implies the convergence of $\sum_{n=1}^\infty n M \vartheta^{n-1}$. Note that $\frac{(n+1) M \vartheta^n}{n M \vartheta^{n-1}} = \frac{n+1}{n}\vartheta < 1$ for $n$ large since $\vartheta < 1$ and $\lim_{n\to\infty} \frac{n+1}{n} = 1$. Now, Theorem 29.1 together with Theorem 24.6 implies the result.
Definition 29.5. Let $T^a_{(c_n)}$ be a formal power series. We call the set of all $x \in \mathbb{R}$ for which $T^a_{(c_n)}$ converges the domain of convergence of $T^a_{(c_n)}$.

Corollary 29.6. Let $f(x) = \sum_{n=0}^\infty c_n (x-a)^n$ be a power series converging in $[a-\rho, a+\rho]$, $\rho > 0$, uniformly. Then we have for $a-\rho \le b < c \le a+\rho$
\[ \int_b^c f(x)\,dx = \sum_{n=0}^\infty c_n \int_b^c (x-a)^n\,dx = \sum_{n=0}^\infty \frac{c_n}{n+1} \left( (c-a)^{n+1} - (b-a)^{n+1} \right). \tag{29.5} \]
This corollary follows from Theorem 29.4 and Theorem 25.27.

Corollary 29.7. Let $f(x) = \sum_{n=0}^\infty c_n (x-a)^n$ be a power series converging uniformly in $[a-\rho, a+\rho]$. Then we find for $x \in (a-\rho, a+\rho)$ that
\[ f'(x) = \sum_{n=1}^\infty n c_n (x-a)^{n-1} \tag{29.6} \]
and the series $\sum_{n=1}^\infty n c_n (x-a)^{n-1}$ converges uniformly in $[a-\rho, a+\rho]$.
This corollary follows from Theorem 29.4 and Theorem 26.19.

Corollary 29.8. Let $f(x) = \sum_{n=0}^\infty c_n (x-a)^n$ be as in Corollary 29.7. Then $f : (a-\rho, a+\rho) \to \mathbb{R}$ is arbitrarily often differentiable and we have for $k \in \mathbb{N}_0$
\[ c_k = \frac{1}{k!} f^{(k)}(a). \tag{29.7} \]
Proof. A repeated application of Corollary 29.7 yields first the existence of all derivatives and then
\[ f^{(k)}(x) = \sum_{n=k}^\infty n(n-1)\cdots(n-k+1)\, c_n (x-a)^{n-k} \]
which gives for $x = a$
\[ c_k = \frac{1}{k!} f^{(k)}(a). \]
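Example 29.9. For $|x| < 1$ we find
\begin{align*}
\sum_{n=1}^\infty n x^n &= x \sum_{n=1}^\infty n x^{n-1} = x \frac{d}{dx} \sum_{n=0}^\infty x^n \\
&= x \frac{d}{dx} \frac{1}{1-x} = \frac{x}{(1-x)^2},
\end{align*}
which for example yields $\sum_{n=1}^\infty \frac{n}{2^n} = 2$.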
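The closed form $\sum_{n\ge 1} n x^n = x/(1-x)^2$ is easy to confirm numerically for a few values of $|x| < 1$. This is only an illustration; the function names `series_n_xn` and `closed_form` and the truncation length are assumptions made for the sketch.

```python
def series_n_xn(x, terms=200):
    """Truncated series sum_{n=1}^{terms} n * x^n (meaningful for |x| < 1)."""
    return sum(n * x**n for n in range(1, terms + 1))

def closed_form(x):
    """Value x / (1 - x)^2 obtained by differentiating the geometric series."""
    return x / (1 - x) ** 2

for x in (0.5, -0.3, 0.9):
    print(x, series_n_xn(x, terms=2000), closed_form(x))
```

For $x = \tfrac{1}{2}$ both columns are close to 2, in line with the example above.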
Having Corollary 29.8 in mind, we note that the power series allow us to
define arbitrarily often differentiable functions. This opens the road to even-
tually prove the existence of the exponential function exp : R → R.
Theorem 29.10. There exists a unique function $\exp : \mathbb{R} \to \mathbb{R}$ with $\exp' = \exp$ and $\exp(0) = 1$.
Proof. We set
\[ \exp(x) := \sum_{k=0}^\infty \frac{x^k}{k!}. \tag{29.8} \]
First we claim that this power series converges for all $x \in \mathbb{R}$. Indeed for $x \in \mathbb{R}$ fixed we find
\[ \left| \frac{\frac{x^{k+1}}{(k+1)!}}{\frac{x^k}{k!}} \right| = |x| \frac{1}{k+1}, \]
and therefore, if $k \ge 2|x|$ it follows that $\frac{|x|}{k+1} \le \frac{k}{2(k+1)} < \frac{1}{2}$, and the ratio test implies the convergence of $\sum_{k=0}^\infty \frac{x^k}{k!}$. Now we deduce from Theorem 29.4 that this convergence is uniform on every compact interval. Consequently, by Corollary 29.7 we find
\[ \exp'(x) = \sum_{k=1}^\infty k \frac{x^{k-1}}{k!} = \sum_{k=1}^\infty \frac{x^{k-1}}{(k-1)!} = \sum_{k=0}^\infty \frac{x^k}{k!} = \exp(x), \]
and $\exp(0) = 1$.
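The rapid convergence of the series (29.8) can be seen by computing partial sums. The sketch below builds each term from the previous one, mirroring the ratio $|x|/(k+1)$ used in the convergence argument; the name `exp_partial` is an illustrative assumption, and `math.exp` serves only as a reference value.

```python
import math

def exp_partial(x, K):
    """Partial sum sum_{k=0}^{K} x^k / k! of the exponential series."""
    term = 1.0      # x^0 / 0!
    total = 1.0
    for k in range(1, K + 1):
        term *= x / k          # x^k / k! obtained from x^(k-1) / (k-1)!
        total += term
    return total

for K in (5, 10, 20):
    print(K, exp_partial(1.0, K), math.e)
```

Once $k \ge 2|x|$ every further term is less than half of its predecessor, so the error decays at least geometrically from that point on.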
We will see later how we can use power series to solve certain differential
equations (some examples are given in the Problems).
Our aim is to discuss Taylor's formula and the Taylor series. The starting point is the fundamental theorem of calculus.
Let $I = [a,b]$ be an interval and $f : (a,b) \to \mathbb{R}$ be of the class $C^2$, i.e. $f \in C^2((a,b))$, and suppose that $f$, $f'$ and $f''$ have continuous extensions on $[a,b]$. Since $f$ is a primitive of the continuous function $f'$ and since by the fundamental theorem
\[ f'(x) = f'(c) + \int_c^x f''(t)\,dt, \quad c, x \in (a,b), \]
a further application of the fundamental theorem yields
\begin{align*}
f(x) &= f(c) + \int_c^x \left( f'(c) + \int_c^t f''(s)\,ds \right) dt \\
&= f(c) + f'(c)(x-c) + \int_c^x \int_c^t f''(s)\,ds\,dt.
\end{align*}
Let $M_{f''} = \|f''\|_{[a,b],\infty}$. For $x, t \ge c$ we get
\begin{align*}
\left| R^{(2)}_{f,c}(x) \right| &= \left| \int_c^x \int_c^t f''(s)\,ds\,dt \right| \le M_{f''} \int_c^x \int_c^t 1\,ds\,dt \\
&= M_{f''} \int_c^x (t-c)\,dt = M_{f''} \frac{(x-c)^2}{2},
\end{align*}
which yields
\[ f(x) = f(c) + f'(c)(x-c) + R^{(2)}_{f,c}(x) \tag{29.9} \]
where
\[ |R^{(2)}_{f,c}(x)| \le M_{f''} \frac{(x-c)^2}{2} = M_{f''} \frac{|x-c|^2}{2}. \tag{29.10} \]
Note that it is easy to see that (29.9) and (29.10) hold for all $x \in [a,b]$. Moreover with
\[ M_{f'} := \|f'\|_{[a,b],\infty} \quad\text{and}\quad R^{(1)}_{f,c}(x) = \int_c^x f'(t)\,dt \]
we have
\[ |R^{(1)}_{f,c}(x)| \le M_{f'} |x-c|. \tag{29.11} \]
Here is the interpretation of these results: If $|x-c|$ is small we can approximate $f(x)$ by $f(c)$, and we might get a better approximation by $f(c) + f'(c)(x-c)$:
\[ |f(x) - f(c)| \le M_{f'} |x-c|, \]
and
\[ |f(x) - (f(c) + f'(c)(x-c))| \le M_{f''} \frac{(x-c)^2}{2}. \]
Recall that for $|x-c| < \epsilon$ and $\epsilon < 1$ it follows that $|x-c|^2 < |x-c|$. The main question is whether we can get an even better approximation when increasing the order of derivatives and iterating the above process.
Theorem 29.11 (Taylor's formula). Let $f : [a,b] \to \mathbb{R}$ be a function such that $f|_{(a,b)} \in C^{n+1}((a,b))$ and $f, f', \ldots, f^{(n+1)}$ have continuous extensions to $[a,b]$. Then for every $c, x \in (a,b)$ the following holds
\[ f(x) = f(c) + \frac{f'(c)}{1!}(x-c) + \frac{f''(c)}{2!}(x-c)^2 + \cdots + \frac{f^{(n)}(c)}{n!}(x-c)^n + R^{(n+1)}_{f,c}(x) \tag{29.12} \]
where the remainder term is given by
\[ R^{(n+1)}_{f,c}(x) = \frac{1}{n!} \int_c^x (x-t)^n f^{(n+1)}(t)\,dt. \tag{29.13} \]
Proof. We use mathematical induction. The fundamental theorem yields
\[ f(x) = f(c) + \int_c^x f'(t)\,dt \]
which is (29.12), (29.13) for $n = 0$. Now suppose that (29.12), (29.13) hold for $n-1 \in \mathbb{N}_0$. Consider
\begin{align*}
R^{(n)}_{f,c}(x) &= \frac{1}{(n-1)!} \int_c^x (x-t)^{n-1} f^{(n)}(t)\,dt \\
&= -\int_c^x \frac{d}{dt}\left( \frac{(x-t)^n}{n!} \right) \cdot f^{(n)}(t)\,dt.
\end{align*}
Integration by parts gives
\begin{align*}
R^{(n)}_{f,c}(x) &= -\int_c^x \frac{d}{dt}\left( \frac{(x-t)^n}{n!} \right) \cdot f^{(n)}(t)\,dt \\
&= -f^{(n)}(t) \cdot \frac{(x-t)^n}{n!} \Big|_{t=c}^{t=x} + \int_c^x \frac{(x-t)^n}{n!} \cdot f^{(n+1)}(t)\,dt \\
&= \frac{f^{(n)}(c)}{n!}(x-c)^n + R^{(n+1)}_{f,c}(x).
\end{align*}
Thus
\begin{align*}
f(x) &= \sum_{j=0}^{n-1} \frac{f^{(j)}(c)}{j!}(x-c)^j + R^{(n)}_{f,c}(x) \\
&= \sum_{j=0}^{n-1} \frac{f^{(j)}(c)}{j!}(x-c)^j + f^{(n)}(c)\frac{(x-c)^n}{n!} + R^{(n+1)}_{f,c}(x) \\
&= \sum_{j=0}^{n} \frac{f^{(j)}(c)}{j!}(x-c)^j + R^{(n+1)}_{f,c}(x),
\end{align*}
and the theorem is proven.
Definition 29.12. Let $f : [a,b] \to \mathbb{R}$ be an $(n+1)$-times continuously differentiable function on $(a,b)$ and let $f, f', \ldots, f^{(n+1)}$ have continuous extensions to $[a,b]$. The first $n$ Taylor polynomials of $f$ around $c \in [a,b]$ are given by
\[ T^{(k)}_{f,c}(x) := \sum_{j=0}^k \frac{f^{(j)}(c)}{j!}(x-c)^j, \quad k = 1, \ldots, n. \tag{29.14} \]
Thus we have
\[ f(x) = T^{(n)}_{f,c}(x) + R^{(n+1)}_{f,c}(x) \tag{29.15} \]
with
\[ |R^{(n+1)}_{f,c}(x)| \le M_{f^{(n+1)}} \frac{|x-c|^{n+1}}{(n+1)!} \tag{29.16} \]
where $M_{f^{(n+1)}} = \|f^{(n+1)}\|_{[a,b],\infty}$.
Corollary 29.13. Let $f$ be as in Definition 29.12. If $f^{(n+1)}(x) = 0$ for all $x \in [a,b]$ then $f$ is a polynomial of degree less than or equal to $n$.
Proof. In this case $R^{(n+1)}_{f,c}(x) = 0$ for all $x \in [a,b]$.
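Here are some examples of Taylor polynomials:
\[ T^{(k)}_{\exp,0}(x) = \sum_{j=0}^k \frac{1}{j!} x^j; \tag{29.17} \]
\[ T^{(k)}_{g,0}(x) = \sum_{l=0}^k \frac{\prod_{j=0}^{l-1} \left(\frac{1}{2} - j\right)}{l!} x^l, \quad g(x) = \sqrt{1+x}; \tag{29.18} \]
\[ T^{(k)}_{h,0}(x) = \sum_{j=1}^k \frac{(-1)^{j-1} x^j}{j}, \quad h(x) = \ln(1+x); \tag{29.19} \]
\[ T^{(2k+1)}_{\sin,0}(x) = \sum_{j=0}^k (-1)^j \frac{x^{2j+1}}{(2j+1)!}; \tag{29.20} \]
\[ T^{(2k)}_{\cos,0}(x) = \sum_{j=0}^k (-1)^j \frac{x^{2j}}{(2j)!}. \tag{29.21} \]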
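To get a feeling for (29.20), one can evaluate the Taylor polynomial of $\sin$ and compare the error with the bound from (29.16). Since all derivatives of $\sin$ are bounded by 1 and $T^{(2k+1)}_{\sin,0} = T^{(2k+2)}_{\sin,0}$, that bound reads $|x|^{2k+3}/(2k+3)!$. The function names in the sketch below are illustrative assumptions.

```python
import math

def taylor_sin(x, k):
    """Taylor polynomial of sin about 0 of degree 2k + 1, see (29.20)."""
    return sum((-1) ** j * x ** (2 * j + 1) / math.factorial(2 * j + 1)
               for j in range(k + 1))

def remainder_bound(x, k):
    """Bound |x|^(2k+3) / (2k+3)! for the error of the degree 2k+1 polynomial."""
    return abs(x) ** (2 * k + 3) / math.factorial(2 * k + 3)

for x in (0.1, 1.0, 2.0):
    for k in (0, 1, 2, 3):
        error = abs(math.sin(x) - taylor_sin(x, k))
        print(x, k, error, remainder_bound(x, k))
```

In each row the observed error stays below the bound, and the bound shrinks rapidly as $k$ grows.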
We want to understand how well the Taylor polynomial approximates the function and for this we need to estimate the remainder term. Sometimes, instead of using the integral form of $R^{(n+1)}_{f,c}$, it is more helpful to use the Lagrange form of the remainder term.
Theorem 29.14. Let $f$ be as in Definition 29.12 and $x, x_0 \in [a,b]$. Then there is $\xi \in [x, x_0]$ or $\xi \in [x_0, x]$ such that
\[ f(x) = \sum_{k=0}^n \frac{f^{(k)}(x_0)}{k!}(x-x_0)^k + \frac{f^{(n+1)}(\xi)}{(n+1)!}(x-x_0)^{n+1}. \tag{29.22} \]
Proof. Let us suppose that $x_0 < x$, the other case goes analogously. By the mean value theorem for integrals we find
\begin{align*}
R^{(n+1)}_{f,x_0}(x) &= \frac{1}{n!} \int_{x_0}^x (x-t)^n f^{(n+1)}(t)\,dt \\
&= f^{(n+1)}(\xi) \int_{x_0}^x \frac{(x-t)^n}{n!}\,dt = f^{(n+1)}(\xi) \frac{(x-x_0)^{n+1}}{(n+1)!},
\end{align*}
proving the theorem.

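Example 29.15. A. For $x \mapsto \ln(1+x)$, $0 \le x \le \frac{1}{10}$, we find using the integral form of the remainder
\begin{align*}
\left| R^{(2)}_{\ln(1+\cdot),0}(x) \right| &= \left| \int_0^x (x-t) \frac{d^2}{dt^2} \ln(1+t)\,dt \right| \\
&= \left| -\int_0^x (x-t) \frac{1}{(1+t)^2}\,dt \right| \le \int_0^x (x-t)\,dt \\
&= \frac{x^2}{2},
\end{align*}
which implies for $0 \le x \le \frac{1}{10}$ that $|\ln(1+x) - x| \le \frac{x^2}{2} \le \frac{1}{100}$.
B. For $\sin : [0, 2\pi] \to \mathbb{R}$ we find
\[ R^{(2n+3)}_{\sin,0}(x) = (-1)^{n+1} \frac{\cos\xi}{(2n+3)!} x^{2n+3} \quad\text{for } \xi \in [0, 2\pi] \]
and therefore
\[ \left| \sin x - T^{(2n+1)}_{\sin,0}(x) \right| \le \frac{|x|^{2n+3}}{(2n+3)!}. \]
Therefore for $n = 1$ and $|x| \le \frac{1}{10}$ we have already
\[ \left| \sin x - T^{(3)}_{\sin,0}(x) \right| \le \left( \frac{1}{10} \right)^5 \cdot \frac{1}{120}. \]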
Finally we consider the Taylor formula as $n$ goes to infinity.
Definition 29.16. Let $f : (a,b) \to \mathbb{R}$ be an arbitrarily often differentiable function and $x_0 \in (a,b)$. We call
\[ T_{f,x_0}(x) := \sum_{k=0}^\infty \frac{f^{(k)}(x_0)}{k!}(x-x_0)^k \tag{29.23} \]
the Taylor series (or Taylor expansion) of $f$ about $x_0$.

Remark 29.17. So far $T_{f,x_0}(x)$ is a formal power series which does not necessarily converge for all $x \in (a,b)$. Moreover, if $T_{f,x_0}(x)$ converges, the limit does not have to be $f(x)$. In fact the Taylor series $T_{f,x_0}(x)$ converges to $f(x)$ if and only if
\[ \lim_{n\to\infty} R^{(n+1)}_{f,x_0}(x) = 0. \]
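Example 29.18. Consider $f : \mathbb{R} \to \mathbb{R}$ defined by
\[ f(x) = \begin{cases} e^{-\frac{1}{x^2}}, & x \ne 0 \\ 0, & x = 0. \end{cases} \]
We claim that $f \in C^\infty(\mathbb{R})$ and $f^{(n)}(0) = 0$ for all $n \in \mathbb{N}_0$. This implies that $T_{f,0}(x) = 0$ for all $x \in \mathbb{R}$, in particular $T_{f,0}(x)$ converges for all $x \in \mathbb{R}$ but for $x \ne 0$ we have $T_{f,0}(x) \ne f(x)$. To prove our claim we show the existence of polynomials $p_n$ such that
\[ f^{(n)}(x) = \begin{cases} p_n\left(\frac{1}{x}\right) e^{-\frac{1}{x^2}}, & x \ne 0 \\ 0, & x = 0. \end{cases} \]
The case $n = 0$ is clear, just take $p_0 = 1$.

Now, for $x \ne 0$ we have
\begin{align*}
f^{(n+1)}(x) &= \frac{d}{dx} f^{(n)}(x) = \frac{d}{dx}\left( p_n\left(\tfrac{1}{x}\right) e^{-\frac{1}{x^2}} \right) \\
&= \left( -p_n'\left(\tfrac{1}{x}\right) \frac{1}{x^2} + 2 p_n\left(\tfrac{1}{x}\right) \frac{1}{x^3} \right) e^{-\frac{1}{x^2}},
\end{align*}
thus
\[ p_{n+1}(t) := -p_n'(t)\,t^2 + 2 p_n(t)\,t^3. \]
But for $x = 0$
\begin{align*}
f^{(n+1)}(0) &= \lim_{x\to 0} \frac{f^{(n)}(x) - f^{(n)}(0)}{x} \\
&= \lim_{x\to 0} \frac{p_n\left(\frac{1}{x}\right) e^{-\frac{1}{x^2}}}{x} \\
&= \lim_{r\to\pm\infty} r\,p_n(r)\,e^{-r^2} = 0.
\end{align*}
Note that if $T_{f,x_0}(x)$ converges to $f(x)$ for some $x \ne x_0$, then in the interval $[x_0-\rho, x_0+\rho]$, $0 < \rho < |x-x_0|$, the convergence is uniform.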
We will encounter Taylor series (and power series) later on when discussing
functions of several real variables and most of all when treating complex-
valued functions of a complex variable.
We have introduced the exponential function now as a convergent power series and we may ask whether we can prove the functional equation for exp, i.e.
\[ \exp(x+y) = \exp(x)\exp(y), \tag{29.24} \]
without using the fact that exp satisfies the initial value problem $u' = u$, $u(0) = 1$, compare with Lemma 9.7. The right hand side of (29.24) is the product of two power series and we first want to discuss products of infinite series.
Let $(a_n)_{n\in\mathbb{N}_0}$ and $(b_n)_{n\in\mathbb{N}_0}$ be two sequences of real numbers and $A := \sum_{n=0}^\infty a_n$, $B := \sum_{m=0}^\infty b_m$ the corresponding series which we assume to converge. The aim is to find conditions under which we can represent $\left(\sum_{n=0}^\infty a_n\right)\left(\sum_{m=0}^\infty b_m\right)$ as a series converging to $A \cdot B$.
For two partial sums we have
\[ \left( \sum_{n=0}^N a_n \right) \left( \sum_{m=0}^M b_m \right) = \sum_{n,m} a_n b_m \]
where on the right hand side we form all products $a_n b_m$, $0 \le n \le N$ and $0 \le m \le M$, and add them up. However we cannot proceed in the same way with infinitely many terms. We set
\[ c_n := a_n b_0 + a_{n-1} b_1 + \cdots + a_0 b_n = \sum_{k=0}^n a_{n-k} b_k \tag{29.25} \]
and give
Definition 29.19. Let $\sum_{n=0}^\infty a_n$ and $\sum_{m=0}^\infty b_m$ be two series of real numbers and define $c_n$ by (29.25). The Cauchy product of these series is given by
\[ \sum_{n=0}^\infty c_n := \sum_{n=0}^\infty \sum_{k=0}^n a_{n-k} b_k. \tag{29.26} \]
Remark 29.20. So far the definition does not include a statement about convergence. Thus $\sum_{n=0}^\infty c_n$ stands for the sequence of partial sums $\sum_{n=0}^N \left( \sum_{k=0}^n a_{n-k} b_k \right)$.

Theorem 29.21. Let $A := \lim_{n\to\infty} A_n$, $A_n := \sum_{k=0}^n a_k$, and $B := \lim_{m\to\infty} B_m$, $B_m := \sum_{l=0}^m b_l$. If $\sum_{k=0}^\infty a_k$ converges absolutely and $\sum_{k=0}^\infty b_k$ converges, then their Cauchy product converges to $A \cdot B$, i.e.
\[ A \cdot B = \sum_{n=0}^\infty c_n = \lim_{N\to\infty} \sum_{n=0}^N \sum_{k=0}^n a_{n-k} b_k. \tag{29.27} \]
In the case where $\sum_{l=0}^\infty b_l$ also converges absolutely, $\sum_{n=0}^\infty c_n$ is absolutely convergent too.
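Proof. We may write
\begin{align*}
\sum_{k=0}^n c_k &= a_0 b_0 + (a_0 b_1 + a_1 b_0) + \cdots + (a_0 b_n + \cdots + a_n b_0) \\
&= a_0 B_n + a_1 B_{n-1} + \cdots + a_n B_0 \\
&= \sum_{k=0}^n a_{n-k}(B_k - B) + B \sum_{k=0}^n a_k.
\end{align*}
By assumption $\sum_{k=0}^\infty a_k = A$, and hence we are done if we can prove that
\[ \lim_{n\to\infty} \sum_{k=0}^n a_{n-k}(B_k - B) = 0. \]
Given $\epsilon > 0$ we can find $N(\epsilon)$ such that for $k \ge N(\epsilon)$ we have $|B_k - B| < \epsilon$. For $n > N = N(\epsilon)$ it follows that
\begin{align*}
\left| \sum_{k=0}^n a_{n-k}(B_k - B) \right| &\le \sum_{k=0}^N |a_{n-k}||B_k - B| + \sum_{k=N+1}^n |a_{n-k}||B_k - B| \\
&\le \max_{k\le N} |B_k - B| \sum_{k=0}^N |a_{n-k}| + \epsilon \sum_{k=N+1}^n |a_{n-k}| \\
&\le \max_{k\le N} |B_k - B| \sum_{k=0}^N |a_{n-k}| + \epsilon \sum_{k=0}^\infty |a_k|.
\end{align*}
For $n \to \infty$ it follows that $a_n \to 0$ since $\sum_{n=0}^\infty a_n$ converges. Therefore it follows that for every fixed $N$
\[ \lim_{n\to\infty} \sum_{k=0}^N |a_{n-k}| = 0. \]
Hence
\[ 0 \le \limsup_{n\to\infty} \left| \sum_{k=0}^n a_{n-k}(B_k - B) \right| \le \epsilon \sum_{k=0}^\infty |a_k|, \]
and since $\epsilon > 0$ was arbitrary this implies the convergence of $\sum_{n=0}^\infty c_n$ to $A \cdot B$. Now suppose that both series converge absolutely. Then we get
\begin{align*}
\sum_{n=0}^M |c_n| &= \sum_{n=0}^M \left| \sum_{k=0}^n a_{n-k} b_k \right| \le \sum_{n=0}^M \sum_{k=0}^n |a_{n-k}||b_k| \\
&\le \left( \sum_{n=0}^M |a_n| \right) \left( \sum_{k=0}^M |b_k| \right) \le \left( \sum_{n=0}^\infty |a_n| \right) \left( \sum_{k=0}^\infty |b_k| \right),
\end{align*}
implying the absolute convergence of $\sum_{n=0}^\infty c_n$.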
We now apply this result to exp in order to prove its functional equation.
Proposition 29.22. For $x, y \in \mathbb{R}$ the relation (29.24) holds, i.e.
\[ \exp(x)\exp(y) = \exp(x+y). \]
Proof. We know that $\exp(x) = \sum_{n=0}^\infty \frac{x^n}{n!}$ and $\exp(y) = \sum_{k=0}^\infty \frac{y^k}{k!}$ and both series converge absolutely. Therefore, by Theorem 29.21 we find
\[ \exp(x)\exp(y) = \sum_{n=0}^\infty \sum_{k=0}^n \frac{x^{n-k}}{(n-k)!} \frac{y^k}{k!}. \]
Using the binomial theorem we get
\begin{align*}
\sum_{n=0}^\infty \sum_{k=0}^n \frac{x^{n-k}}{(n-k)!} \frac{y^k}{k!} &= \sum_{n=0}^\infty \frac{1}{n!} \sum_{k=0}^n \binom{n}{k} x^{n-k} y^k \\
&= \sum_{n=0}^\infty \frac{1}{n!}(x+y)^n = \exp(x+y).
\end{align*}
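The Cauchy product (29.25) is straightforward to compute for finitely many coefficients. As an illustration, the sketch below forms the Cauchy product of truncated exponential series for $x$ and $y$ and compares the result with the series of $\exp(x+y)$; the helper names `exp_coeffs` and `cauchy_product` and the sample values are assumptions made for this example.

```python
import math

def exp_coeffs(x, N):
    """Terms x^n / n! of the exponential series for n = 0, ..., N."""
    return [x**n / math.factorial(n) for n in range(N + 1)]

def cauchy_product(a, b):
    """Terms c_n = sum_{k=0}^{n} a_{n-k} b_k for all n covered by both lists."""
    N = min(len(a), len(b))
    return [sum(a[n - k] * b[k] for k in range(n + 1)) for n in range(N)]

x, y, N = 0.7, -1.3, 40
c = cauchy_product(exp_coeffs(x, N), exp_coeffs(y, N))

# By the binomial theorem c_n should equal (x + y)^n / n!.
print(c[3], (x + y) ** 3 / math.factorial(3))
print(sum(c), math.exp(x + y))
```

Each $c_n$ only involves the first $n+1$ terms of both series, so the truncation does not affect the terms that are actually computed.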
Problems
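1. For $n \in \mathbb{N}_0$ consider the functions $g_n(x) = \frac{x^4}{(1+x^4)^n}$ defined on $\mathbb{R}$. Prove that
\[ \sum_{n=0}^\infty g_n(x) = \begin{cases} 1 + x^4, & x \ne 0 \\ 0, & x = 0. \end{cases} \]
Why does this series not converge uniformly?
2. Prove that the following series converge absolutely and uniformly in the given domain.
a) $\sum_{k=1}^\infty \frac{\sin kx}{k^\alpha}$, $\alpha > 1$, $x \in \mathbb{R}$;
b) $\sum_{n=1}^\infty \frac{x^n}{n^{3/2}}$, $-1 \le x \le 1$;
c) $\sum_{n=1}^\infty \frac{1}{n^2 + r^2}$, $r \in \mathbb{R}$.
3. For $\alpha \in \mathbb{R}$ define
\[ \binom{\alpha}{n} := \prod_{k=1}^n \frac{\alpha - k + 1}{k}. \]
Prove that for $\alpha \in \mathbb{N}$ this is a binomial coefficient. Let $g_\alpha : (-1,1) \to \mathbb{R}$, $g_\alpha(x) = (1+x)^\alpha$. Show that $g_\alpha^{(k)}(0) = k! \binom{\alpha}{k}$ and find the Taylor polynomial $T^{(n)}_{g_\alpha,0}(x)$.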
4. Suppose that $\sum_{k=0}^\infty a_k x^k$ and $\sum_{k=0}^\infty b_k x^k$ converge absolutely and uniformly on $[-c, c]$. Prove that then $\sum_{k=0}^\infty (a_k + b_k) x^k$ and $\sum_{k=0}^\infty (\lambda a_k) x^k$, $\lambda \in \mathbb{R}$, also converge absolutely and uniformly in $[-c, c]$ and that we have
\[ \sum_{k=0}^\infty (a_k + b_k) x^k = \sum_{k=0}^\infty a_k x^k + \sum_{k=0}^\infty b_k x^k, \]
\[ \sum_{k=0}^\infty (\lambda a_k) x^k = \lambda \sum_{k=0}^\infty a_k x^k. \]
5. Given that $e^x = \sum_{k=0}^\infty \frac{x^k}{k!}$ find the Taylor series of sinh and cosh.
6. Find the Taylor series about 0 of
\[ f(x) = \frac{1}{2} \ln\frac{1+x}{1-x}, \quad |x| < 1. \]
7. For $l \in \mathbb{N}_0$ we define the Bessel function of order $l$ by
\[ J_l(x) := \sum_{n=0}^\infty \frac{(-1)^n \left(\frac{x}{2}\right)^{2n+l}}{n!\,(n+l)!}. \tag{29.28} \]
Prove that $J_l$ converges uniformly and absolutely on every compact interval in $\mathbb{R}$. Now prove that $J_l$ solves
\[ x^2 J_l''(x) + x J_l'(x) + (x^2 - l^2) J_l(x) = 0. \]
Note that we can write $J_l(x)$ as
\[ J_l(x) = \sum_{n=0}^\infty (-1)^n \frac{1}{n!\,(n+l)!\,2^{2n+l}} x^{2n+l} = \frac{x^l}{2^l} \sum_{n=0}^\infty (-1)^n \frac{x^{2n}}{n!\,(n+l)!\,2^{2n}}. \]
8. Justify
\[ \frac{1}{1+t^2} = \sum_{l=0}^\infty (-1)^l t^{2l}, \quad |t| < 1, \]
and by using the identity
\[ \arctan x = \int_0^x \frac{1}{1+t^2}\,dt \]
find $T_{\arctan,0}(x)$.
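9. Use the result of Problem 8 to show
\[ \frac{\pi}{6} = \frac{1}{\sqrt{3}} \sum_{n=0}^\infty \frac{1}{2n+1} \left( -\frac{1}{3} \right)^n. \]
10. Prove Abel's convergence theorem: if the series $\sum_{k=0}^\infty a_k$ converges then
\[ \sum_{k=0}^\infty a_k = \lim_{\substack{x\to 1 \\ x<1}} \sum_{k=0}^\infty a_k x^k. \]
11. Use Abel's convergence theorem to show
a) $\ln 2 = \sum_{l=1}^\infty (-1)^{l+1} \frac{1}{l} = \sum_{l=1}^\infty \frac{1}{(2l-1)2l}$.
b) $\frac{\pi}{4} = \sum_{k=0}^\infty (-1)^k \frac{1}{2k+1}$.
12. Show the following inequalities for $x > 0$
a) $x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} < \ln(1+x) < x - \frac{x^2}{2} + \frac{x^3}{3}$;
b) $1 + \frac{x}{2} - \frac{x^2}{8} < \sqrt{1+x} < 1 + \frac{x}{2} - \frac{x^2}{8} + \frac{x^3}{16}$.
13. By using the Cauchy product prove
a) $\left( \frac{1}{1-x} \right)^2 = \sum_{n=0}^\infty (n+1) x^n$, $|x| < 1$;
b)
\begin{align*}
\frac{\cos x}{1-x} &= 1 + x + \left(1 - \frac{1}{2!}\right) x^2 + \left(1 - \frac{1}{2!}\right) x^3 + \left(1 - \frac{1}{2!} + \frac{1}{4!}\right) x^4 \\
&\quad + \left(1 - \frac{1}{2!} + \frac{1}{4!}\right) x^5 + \cdots \\
&= \sum_{n=0}^\infty \left( \sum_{k=0}^n (-1)^k \frac{1}{(2k)!} \right) (x^{2n} + x^{2n+1}), \quad |x| < 1.
\end{align*}
14. Let $f, g : (-1,1) \to \mathbb{R}$ have convergent Taylor expansions $f(x) = \sum_{k=0}^\infty \frac{f^{(k)}(0)}{k!} x^k$ and $g(x) = \sum_{k=0}^\infty \frac{g^{(k)}(0)}{k!} x^k$. Assuming that $f \cdot g$ also has a convergent Taylor expansion in $(-1,1)$ prove
\[ (f \cdot g)(x) = \sum_{k=0}^\infty \left( \sum_{l=0}^k \frac{f^{(l)}(0)}{l!} \frac{g^{(k-l)}(0)}{(k-l)!} \right) x^k. \]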
30 Infinite Products and the Gauss Integral

Let $(c_n)_{n\in\mathbb{N}}$ be a sequence of real numbers with $c_k \ne 0$. For $N \in \mathbb{N}$ we use the notation
\[ P_N := \prod_{k=1}^N c_k = c_1 \cdot \ldots \cdot c_N. \tag{30.1} \]
We want to study the convergence of the sequence $(P_N)_{N\in\mathbb{N}}$.
Definition 30.1. Let $(c_n)_{n\in\mathbb{N}}$ be a sequence of real numbers with $c_k \ne 0$. We call the sequence $(P_N)_{N\in\mathbb{N}}$ the infinite product of $(c_n)_{n\in\mathbb{N}}$ and denote it by
\[ \prod_{k=1}^\infty c_k. \tag{30.2} \]
Note that as in the case of a series we may also consider $\prod_{k=m_0}^\infty c_k$ with its obvious definition. So $\prod_{k=1}^\infty c_k$ is just a further symbol for $(P_N)_{N\in\mathbb{N}} = \left( \prod_{k=1}^N c_k \right)_{N\in\mathbb{N}}$ and we are interested in conditions under which $\left( \prod_{k=1}^N c_k \right)_{N\in\mathbb{N}}$ converges.

Definition 30.2. We say that the infinite product $\prod_{k=1}^\infty c_k$ converges to $P \ne 0$ if the sequence $(P_N)_{N\in\mathbb{N}}$ converges to $P$. In this case we write for the limit $P$
\[ P = \prod_{k=1}^\infty c_k. \tag{30.3} \]
If $\lim_{N\to\infty} \prod_{k=1}^N c_k = 0$ we say that $\prod_{k=1}^\infty c_k$ diverges to 0.
Remark 30.3. If $\prod_{k=1}^\infty c_k$ converges then it follows that
\[ \lim_{N\to\infty} c_N = \lim_{N\to\infty} \frac{P_N}{P_{N-1}} = \frac{\lim_{N\to\infty} P_N}{\lim_{N\to\infty} P_{N-1}} = 1. \]
This condition is however not sufficient for the convergence of an infinite product as is seen by
\[ c_k := 1 + \frac{1}{k} = \frac{k+1}{k} \to 1 \]
but
\[ P_N = \frac{2}{1} \cdot \frac{3}{2} \cdot \ldots \cdot \frac{N+1}{N} = N + 1 \to \infty. \]
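Example 30.4. The infinite product $\prod_{k=2}^\infty \left(1 - \frac{1}{k^2}\right)$ converges to $\frac{1}{2}$. Indeed we have
\begin{align*}
P_N = \prod_{k=2}^N \left(1 - \frac{1}{k^2}\right) &= \left(1 - \frac{1}{2^2}\right)\left(1 - \frac{1}{3^2}\right) \cdot \ldots \cdot \left(1 - \frac{1}{N^2}\right) \\
&= \frac{2^2 - 1}{2^2} \cdot \frac{3^2 - 1}{3^2} \cdot \ldots \cdot \frac{N^2 - 1}{N^2} \\
&= \frac{(2-1)(2+1)}{2^2} \cdot \frac{(3-1)(3+1)}{3^2} \cdot \ldots \cdot \frac{(N-1)(N+1)}{N^2} \\
&= \frac{1}{2} \frac{N+1}{N}.
\end{align*}
The latter follows easily by induction: for $N = 2$ we have $\frac{(2-1)(2+1)}{2^2} = \frac{3}{4}$. Furthermore
\begin{align*}
\frac{(2-1)(2+1)}{2^2} \cdot \frac{(3-1)(3+1)}{3^2} \cdot \ldots \cdot \frac{(N-1)(N+1)}{N^2} \cdot \frac{N(N+2)}{(N+1)^2} &= \frac{1}{2} \frac{N+1}{N} \frac{N(N+2)}{(N+1)^2} \\
&= \frac{1}{2} \frac{N+2}{N+1}.
\end{align*}
Thus for $N \to \infty$ we find
\[ \prod_{k=2}^\infty \left(1 - \frac{1}{k^2}\right) = \lim_{N\to\infty} P_N = \lim_{N\to\infty} \frac{1}{2} \frac{N+1}{N} = \frac{1}{2}. \]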
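The closed form $P_N = \frac{1}{2}\frac{N+1}{N}$ can be confirmed with exact rational arithmetic. The following sketch uses Python's standard `fractions` module; the function name `partial_product` is an illustrative assumption.

```python
from fractions import Fraction

def partial_product(N):
    """Exact partial product of (1 - 1/k^2) for k = 2, ..., N."""
    P = Fraction(1)
    for k in range(2, N + 1):
        P *= 1 - Fraction(1, k * k)
    return P

for N in (2, 3, 10, 100):
    print(N, partial_product(N), Fraction(N + 1, 2 * N))
```

Both columns agree exactly, and the values decrease towards $\frac{1}{2}$ as $N$ grows.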

Since for the convergence of $\prod_{k=1}^\infty c_k$ it is necessary that $\lim_{k\to\infty} c_k = 1$ we may introduce $a_k := c_k - 1$, i.e. $c_k = 1 + a_k$, and consider $\prod_{k=1}^\infty (1 + a_k)$. Clearly we have now the necessary condition $\lim_{k\to\infty} a_k = 0$ and $a_k = -1$ is excluded. Suppose that $a_k > -1$, i.e. $c_k > 0$. Then we find
\[ \ln P_N = \ln \prod_{k=1}^N (1 + a_k) = \sum_{k=1}^N \ln(1 + a_k), \]
or
\[ P_N = \exp\left( \sum_{k=1}^N \ln(1 + a_k) \right). \]
Since exp is continuous the convergence of $\sum_{k=1}^\infty \ln(1 + a_k)$ will imply the convergence of $\prod_{k=1}^\infty c_k = \prod_{k=1}^\infty (1 + a_k)$. Conversely, if $\prod_{k=1}^\infty (1 + a_k)$ converges then $\sum_{k=1}^\infty \ln(1 + a_k)$ converges too.
Thus we have proved
Lemma 30.5. Let $(a_k)_{k\in\mathbb{N}}$ be a sequence of real numbers where $a_k > -1$ and set $c_k = 1 + a_k$. Then the convergence of $\prod_{k=1}^\infty c_k$ is equivalent to the convergence of $\sum_{k=1}^\infty \ln(1 + a_k)$.
Remark 30.6. As in the case of series we can sharpen Lemma 30.5 slightly by assuming that $a_k > -1$ for all $k \ge N_0$. Note that if $a_k \le -1$ for finitely many values of $k$, $k \le N_0$, then for these $k$ the terms $\ln(1 + a_k)$ are not defined. We find however the equivalence of the convergence of $\prod_{k>N_0} c_k$ and $\sum_{k=N_0+1}^\infty \ln(1 + a_k)$, and the convergence of this series also implies the convergence of $\prod_{k=1}^\infty c_k$.
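The Cauchy criterion for infinite products is as follows:

Proposition 30.7. The infinite product $\prod_{k=1}^\infty c_k$ converges if and only if for every $\epsilon > 0$ there exists $N = N(\epsilon) \in \mathbb{N}$ such that $n > m > N(\epsilon)$ implies
\[ \left| \prod_{k=m+1}^n c_k - 1 \right| < \epsilon. \]
Proof. We assume first that $\prod_{k=1}^\infty c_k$ converges to $c \ne 0$. The Cauchy criterion applied to the convergent sequence $\left( \prod_{k=1}^N c_k \right)_{N\in\mathbb{N}}$ states: for every $\eta > 0$ there exists $N(\eta)$ such that $n > m > N(\eta)$ implies
\[ \left| \prod_{k=1}^n c_k - \prod_{k=1}^m c_k \right| < \eta \]
or
\[ \left| \prod_{k=m+1}^n c_k - 1 \right| < \frac{\eta}{\left| \prod_{k=1}^m c_k \right|}. \]
Since $\lim_{m\to\infty} \prod_{k=1}^m c_k = c \ne 0$ it follows that for $m \ge N_0$ we have $\left| \prod_{k=1}^m c_k \right| \ge \frac{|c|}{2} > 0$. Hence, given $\epsilon > 0$, for $n > m > \max(N_0, N(\eta))$ we have with $\eta = \frac{|c|}{2}\epsilon$
\[ \left| \prod_{k=m+1}^n c_k - 1 \right| < \frac{\eta}{\left| \prod_{k=1}^m c_k \right|} \le \epsilon. \]
Now we prove the converse. First we note that for $\epsilon = \frac{1}{2}$ there exists $N_1 \in \mathbb{N}$ such that $n > m > N_1$ implies
\[ \left| \prod_{k=m+1}^n c_k - 1 \right| < \frac{1}{2} \]
which yields
\[ \frac{1}{2} < \prod_{k=m+1}^n c_k < \frac{3}{2}, \tag{30.4} \]
and in particular $c_l \ne 0$ for $l > N_1$. Now let $N > N_1$ be fixed. For every $0 < \epsilon < \frac{1}{2}$ there exists by assumption $N(\epsilon) > N$ such that $n > m > N(\epsilon)$ implies
\[ \left| \frac{\prod_{k=N}^n c_k}{\prod_{k=N}^m c_k} - 1 \right| = \left| \prod_{k=m+1}^n c_k - 1 \right| = |c_{m+1} \cdot c_{m+2} \cdot \ldots \cdot c_n - 1| < \frac{2}{3}\epsilon, \]
or
\[ \left| \prod_{k=N}^n c_k - \prod_{k=N}^m c_k \right| < \left| \prod_{k=N}^m c_k \right| \cdot \frac{2}{3}\epsilon < \epsilon, \]
where we used (30.4). Thus $\left( \prod_{k=1}^N c_k \right)_{N\in\mathbb{N}}$ is a Cauchy sequence in $\mathbb{R}$ and therefore convergent.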
Definition 30.8. The product $\prod_{k=1}^\infty c_k = \prod_{k=1}^\infty (1 + a_k)$ is called absolutely convergent if $\prod_{k=1}^\infty (1 + |a_k|)$ converges.

Proposition 30.9. If $\prod_{k=1}^\infty (1 + a_k)$ converges absolutely then it converges.
Proof. We aim to apply the Cauchy criterion, and for this we note that for $a_1, \ldots, a_n \in \mathbb{R}$ the following holds:
\[ |(1 + a_1)(1 + a_2) \cdot \ldots \cdot (1 + a_n) - 1| \le (1 + |a_1|)(1 + |a_2|) \cdot \ldots \cdot (1 + |a_n|) - 1. \]
Indeed, for $n = 1$ we have
\[ |(1 + a_1) - 1| = |a_1| = (1 + |a_1|) - 1. \]
Moreover,
\begin{align*}
|(1 + a_1)(1 + a_2) \cdot \ldots \cdot (1 + a_n)(1 + a_{n+1}) - 1| &= |(1 + a_1)(1 + a_2) \cdot \ldots \cdot (1 + a_n + a_{n+1} + a_n a_{n+1}) - 1| \\
&\le (1 + |a_1|)(1 + |a_2|) \cdot \ldots \cdot (1 + |a_n + a_{n+1} + a_n a_{n+1}|) - 1,
\end{align*}
but $(1 + |a_n + a_{n+1} + a_n a_{n+1}|) \le (1 + |a_n|)(1 + |a_{n+1}|)$.

Proposition 30.10. The product $\prod_{k=1}^\infty (1 + a_k)$ converges absolutely if and only if the series $\sum_{k=1}^\infty a_k$ converges absolutely.
Proof. Since
\[ |a_1| + \cdots + |a_n| \le (1 + |a_1|)(1 + |a_2|) \cdot \ldots \cdot (1 + |a_n|) \]
it follows that the absolute convergence of $\prod_{k=1}^\infty (1 + a_k)$ implies the absolute convergence of $\sum_{k=1}^\infty a_k$. On the other hand, for $x \ge 0$ we have $e^x \ge 1 + x$, which yields
\[ (1 + |a_1|)(1 + |a_2|) \cdot \ldots \cdot (1 + |a_n|) \le e^{|a_1| + \cdots + |a_n|}. \]
Now, if $\sum_{k=1}^\infty |a_k|$ converges, then $\left( \prod_{k=1}^N (1 + |a_k|) \right)_{N\in\mathbb{N}}$ must converge as it is an increasing sequence which is bounded.
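Example 30.11. The product $\prod_{k=2}^\infty \left(1 + \frac{(-1)^k}{k}\right)$ converges since
\[ \prod_{k=2}^{2n} \left(1 + \frac{(-1)^k}{k}\right) = \frac{3}{2} \cdot \frac{2}{3} \cdot \frac{5}{4} \cdot \frac{4}{5} \cdot \ldots \cdot \left(1 + \frac{1}{2n}\right) = 1 + \frac{1}{2n} \to 1 \]
and
\[ \prod_{k=2}^{2n-1} \left(1 + \frac{(-1)^k}{k}\right) = \frac{3}{2} \cdot \frac{2}{3} \cdot \frac{5}{4} \cdot \frac{4}{5} \cdot \ldots \cdot \frac{2n-1}{2n-2} \cdot \frac{2n-2}{2n-1} = 1. \]
However we already know that $\prod_{k=2}^\infty \left(1 + \frac{1}{k}\right)$ does not converge, hence $\prod_{k=2}^\infty \left(1 + \frac{(-1)^k}{k}\right)$ does not converge absolutely.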
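We are interested in finding the value of Wallis' product, i.e. we want to prove
\[ \prod_{n=1}^\infty \frac{4n^2}{4n^2 - 1} = \frac{\pi}{2}. \tag{30.5} \]
We start by considering
\[ A_m := \int_0^{\frac{\pi}{2}} \sin^m x\,dx \tag{30.6} \]
and claim
\[ A_m = \frac{m-1}{m} A_{m-2}, \quad\text{for } m \ge 2. \tag{30.7} \]
Clearly we have $A_0 = \frac{\pi}{2}$ and $A_1 = 1$. In order to prove (30.7) note that
\begin{align*}
\int_0^{\frac{\pi}{2}} \sin^m x\,dx &= \int_0^{\frac{\pi}{2}} \sin^{m-1} x \sin x\,dx \\
&= \sin^{m-1} x\,(-\cos x)\Big|_0^{\frac{\pi}{2}} - \int_0^{\frac{\pi}{2}} \frac{d}{dx}\left(\sin^{m-1} x\right)(-\cos x)\,dx \\
&= (m-1) \int_0^{\frac{\pi}{2}} \sin^{m-2} x \cos^2 x\,dx \\
&= (m-1) \int_0^{\frac{\pi}{2}} \sin^{m-2} x\,(1 - \sin^2 x)\,dx \\
&= (m-1) \int_0^{\frac{\pi}{2}} \sin^{m-2} x\,dx - (m-1) \int_0^{\frac{\pi}{2}} \sin^m x\,dx,
\end{align*}
or
\[ A_m = (m-1) A_{m-2} - (m-1) A_m \]
which implies (30.7). Using (30.7) we find
\[ A_{2n} = \frac{(2n-1)}{2n} \cdot \frac{(2n-3)}{(2n-2)} \cdot \ldots \cdot \frac{3}{4} \cdot \frac{1}{2} \cdot \frac{\pi}{2} \tag{30.8} \]
and
\[ A_{2n+1} = \frac{2n}{(2n+1)} \cdot \frac{(2n-2)}{(2n-1)} \cdot \ldots \cdot \frac{4}{5} \cdot \frac{2}{3}. \tag{30.9} \]
For $x \in [0, \frac{\pi}{2}]$, i.e. $0 \le \sin x \le 1$, we have
\[ 0 \le \sin^{2m+2} x \le \sin^{2m+1} x \le \sin^{2m} x \le 1 \]
implying that
\[ A_{2m+2} \le A_{2m+1} \le A_{2m}. \tag{30.10} \]
Since
\[ \lim_{m\to\infty} \frac{A_{2m+2}}{A_{2m}} = \lim_{m\to\infty} \frac{2m+1}{2m+2} = 1 \]
we also get by (30.10)
\[ \lim_{m\to\infty} \frac{A_{2m+1}}{A_{2m}} = 1. \]
Finally we find
\[ \frac{A_{2m+1}}{A_{2m}} = \frac{2m \cdot 2m \cdot \ldots \cdot 4 \cdot 2 \cdot 2}{(2m+1)(2m-1) \cdot \ldots \cdot 3 \cdot 3 \cdot 1} \cdot \frac{2}{\pi}, \]
i.e.
\[ 1 = \lim_{m\to\infty} \frac{A_{2m+1}}{A_{2m}} = \frac{2}{\pi} \lim_{m\to\infty} \frac{2m \cdot 2m \cdot \ldots \cdot 4 \cdot 2 \cdot 2}{(2m+1)(2m-1) \cdot \ldots \cdot 3 \cdot 3 \cdot 1} \]
or
\[ \frac{\pi}{2} = \lim_{m\to\infty} \prod_{n=1}^m \frac{4n^2}{4n^2 - 1} = \prod_{n=1}^\infty \frac{4n^2}{4n^2 - 1}. \]
Thus we have proved
Theorem 30.12 (Wallis' Product). The following holds:
\[ \prod_{n=1}^\infty \frac{4n^2}{4n^2 - 1} = \frac{\pi}{2}. \tag{30.11} \]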
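Wallis' product converges rather slowly, which a short computation makes visible. The sketch below multiplies the first $m$ factors $\frac{4n^2}{4n^2-1}$; the function name `wallis_partial` is an illustrative assumption. Since every factor exceeds 1, the partial products increase towards $\frac{\pi}{2}$.

```python
import math

def wallis_partial(m):
    """Partial product of 4n^2 / (4n^2 - 1) for n = 1, ..., m."""
    p = 1.0
    for n in range(1, m + 1):
        p *= 4 * n * n / (4 * n * n - 1)
    return p

for m in (1, 10, 100, 1000, 10000):
    print(m, wallis_partial(m), math.pi / 2 - wallis_partial(m))
```

The printed differences suggest an error that shrinks roughly in proportion to $1/m$.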
We want to use (30.11) to prove that
\[ \int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi}. \tag{30.12} \]
For this we will study the Γ-function a bit more closely. We know that on $(0,\infty)$ the Γ-function is logarithmic convex, i.e. for $0 < \lambda < 1$ and $x, y \in (0,\infty)$ we have
\[ \Gamma(\lambda x + (1-\lambda)y) \le \Gamma(x)^\lambda\, \Gamma(y)^{1-\lambda}. \tag{30.13} \]
Further we know that $\Gamma(x+1) = x\Gamma(x)$ implying
\[ \Gamma(x+n) = \Gamma(x)\,x(x+1) \cdot \ldots \cdot (x+n-1), \quad x > 0,\ n \in \mathbb{N}. \tag{30.14} \]
Since $n + x = (1-x)n + x(n+1)$ the logarithmic convexity of Γ implies for $0 < x < 1$
\[ \Gamma(x+n) \le \Gamma(n)^{1-x}\, \Gamma(n+1)^x = \Gamma(n)^{1-x}\, \Gamma(n)^x\, n^x = (n-1)!\,n^x. \tag{30.15} \]
Using $n + 1 = x(n+x) + (1-x)(n+1+x)$ we derive in a similar way for $0 < x < 1$
\[ n! = \Gamma(n+1) \le \Gamma(n+x)^x\, \Gamma(n+1+x)^{1-x} = \Gamma(n+x)(n+x)^{1-x}. \tag{30.16} \]
Combining (30.15) and (30.16) gives
\[ n!\,(x+n)^{x-1} \le \Gamma(n+x) \le (n-1)!\,n^x, \]
or with (30.14)
\[ a_n(x) := \frac{n!\,(x+n)^{x-1}}{x(x+1) \cdot \ldots \cdot (x+n-1)} \le \Gamma(x) \le \frac{(n-1)!\,n^x}{x(x+1) \cdot \ldots \cdot (x+n-1)} =: b_n(x). \]
Note that
\[ \frac{b_n(x)}{a_n(x)} = \frac{(n-1)!\,n^x}{n!\,(n+x)^{x-1}} = \frac{(n-1)!\,(n+x)\,n^x}{n!\,(n+x)^x} = \frac{(n+x)\,n^x}{n\,(n+x)^x} \]
which implies
\[ \lim_{n\to\infty} \frac{b_n(x)}{a_n(x)} = \lim_{n\to\infty} \frac{a_n(x)}{b_n(x)} = 1. \]
Thus
\[ \frac{a_n(x)}{b_n(x)} \le \frac{\Gamma(x)}{b_n(x)} \le 1 \]
and for $n \to \infty$ we find for $0 < x < 1$ that $1 \le \frac{\Gamma(x)}{\lim_{n\to\infty} b_n(x)} \le 1$ or
\[ \Gamma(x) = \lim_{n\to\infty} \frac{(n-1)!\,n^x}{x(x+1) \cdot \ldots \cdot (x+n-1)}. \tag{30.17} \]
Hence we have proved a product representation for $\Gamma(x)$ provided that $0 < x < 1$. More generally we have
Theorem 30.13. For all $x > 0$
\[ \Gamma(x) = \lim_{n\to\infty} \frac{n!\,n^x}{x(x+1) \cdot \ldots \cdot (x+n)} \tag{30.18} \]
holds.
Proof. Since $\lim_{n\to\infty} \frac{n}{x+n} = 1$ we deduce (30.18) for $0 < x < 1$ from (30.17), and for $x = 1$ we find
\[ \frac{n!\,n}{1(1+1) \cdot \ldots \cdot (1+n)} = \frac{n!\,n}{(n+1)!} = \frac{n}{n+1} \]
which yields
\[ \Gamma(1) = 1 = \lim_{n\to\infty} \frac{n}{n+1}, \]
i.e. (30.18) holds for $0 < x \le 1$. Now, if (30.18) holds for $x \in (0,\infty)$ then it holds also for $y := x + 1$ since
\begin{align*}
\Gamma(y) = \Gamma(x+1) = x\Gamma(x) &= \lim_{n\to\infty} \frac{n!\,n^x}{(x+1) \cdot \ldots \cdot (x+n)} \\
&= \lim_{n\to\infty} \frac{n!\,n^{y-1}}{y(y+1) \cdot \ldots \cdot (y+n-1)} \\
&= \lim_{n\to\infty} \frac{n!\,n^y}{y(y+1) \cdot \ldots \cdot (y+n-1)(y+n)},
\end{align*}
where we used $\lim_{n\to\infty} \frac{n}{y+n} = 1$, and the theorem is proved.
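The limit (30.18) can be explored numerically. To avoid overflow in $n!$ the sketch below evaluates the logarithm of the quotient as a sum of logarithms and exponentiates at the end; this is an implementation choice for the illustration, and the function name `gamma_product` is an assumption. The values $\sqrt{\pi}$ for $\Gamma(\tfrac{1}{2})$ and $2$ for $\Gamma(3)$ serve as references.

```python
import math

def gamma_product(x, n):
    """Evaluate n! * n^x / (x (x+1) ... (x+n)) for x > 0 via logarithms."""
    log_value = x * math.log(n)
    log_value += sum(math.log(k) for k in range(1, n + 1))      # ln n!
    log_value -= sum(math.log(x + k) for k in range(n + 1))     # ln of denominator
    return math.exp(log_value)

for n in (10, 1000, 100000):
    print(n, gamma_product(0.5, n), math.sqrt(math.pi))
```

The approximation improves slowly with $n$, so the product formula is of theoretical rather than computational interest.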
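In Lemma 28.25 we have already proved
\[ \int_{-\infty}^{\infty} e^{-x^2}\,dx = \Gamma\left(\tfrac{1}{2}\right) \]
or equivalently
\[ \int_0^\infty e^{-x^2}\,dx = \frac{1}{2}\,\Gamma\left(\tfrac{1}{2}\right). \]
We note further that by (30.18) we find
\[ \Gamma\left(\tfrac{1}{2}\right) = \lim_{n\to\infty} \frac{n!\,\sqrt{n}}{\frac{1}{2}\left(1 + \frac{1}{2}\right)\left(2 + \frac{1}{2}\right) \cdot \ldots \cdot \left(n + \frac{1}{2}\right)} \]
and
\[ \Gamma\left(\tfrac{1}{2}\right) = \lim_{n\to\infty} \frac{n!\,\sqrt{n}}{\left(1 - \frac{1}{2}\right)\left(2 - \frac{1}{2}\right) \cdot \ldots \cdot \left(n - \frac{1}{2}\right)\left(n + \frac{1}{2}\right)} \]
implying by Theorem 30.12 that
\begin{align*}
\Gamma\left(\tfrac{1}{2}\right)^2 &= \lim_{n\to\infty} \frac{2n}{n + \frac{1}{2}} \cdot \frac{(n!)^2}{\left(1 - \frac{1}{4}\right)\left(4 - \frac{1}{4}\right) \cdot \ldots \cdot \left(n^2 - \frac{1}{4}\right)} \\
&= 2 \lim_{n\to\infty} \frac{n}{n + \frac{1}{2}} \cdot \lim_{n\to\infty} \prod_{k=1}^n \frac{k^2}{k^2 - \frac{1}{4}} = \pi,
\end{align*}
or $\Gamma\left(\tfrac{1}{2}\right) = \sqrt{\pi}$ which yields
Theorem 30.14. The following holds:
\[ \int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi}. \tag{30.19} \]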
Next we want to study infinite products of functions. For this let $I \subset \mathbb{R}$ be an interval and for $k \in \mathbb{N}$ let $u_k : I \to \mathbb{R}$ be a function. In addition we assume $u_k(x) \ne 0$ for all $x \in I$ and $k \in \mathbb{N}$. With $v_k = u_k - 1$ we set for $x \in I$
\[ \prod_{k=1}^\infty u_k(x) = \prod_{k=1}^\infty (1 + v_k(x)). \tag{30.20} \]
When the product in (30.20) converges for every $x \in I$ a new function $u : I \to \mathbb{R}$ is defined by
\[ u(x) = \prod_{k=1}^\infty u_k(x) = \lim_{N\to\infty} \prod_{k=1}^N u_k(x), \tag{30.21} \]
or
\[ u(x) = \prod_{k=1}^\infty (1 + v_k(x)) = \lim_{N\to\infty} \prod_{k=1}^N (1 + v_k(x)). \tag{30.22} \]
Thus $u$ is the pointwise limit of $\left( \prod_{k=1}^N u_k \right)_{N\in\mathbb{N}} = \left( \prod_{k=1}^N (1 + v_k) \right)_{N\in\mathbb{N}}$. We say that $u$ has the product representation or product expansion (30.21) or (30.22). Clearly, in order to check pointwise convergence of $\prod_{k=1}^\infty (1 + v_k(x))$ we can use our previous criteria.
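Example 30.15. The product $\prod_{k=1}^\infty \left(1 - \frac{x^2}{k^2}\right)$ converges for every $x \in \mathbb{R}$ such that $x^2 \ne m^2$, $m \in \mathbb{N}$. Indeed we have absolute convergence since with $a_k = -\frac{x^2}{k^2}$ we have that $\sum_{k=1}^\infty |a_k| = x^2 \sum_{k=1}^\infty \frac{1}{k^2}$ converges. But now we can extend $\prod_{k=1}^\infty \left(1 - \frac{x^2}{k^2}\right)$ to all $x \in \mathbb{R}$ by defining it to be 0 for $x^2 = m^2$, $m \in \mathbb{N}$. Thus $\prod_{k=1}^\infty \left(1 - \frac{x^2}{k^2}\right)$ defines a function on $\mathbb{R}$. Now it is even easier to see that $\prod_{k=1}^\infty \left(1 + \frac{x^2}{k^2}\right)$ converges for all $x \in \mathbb{R}$.

Definition 30.16. We call $\prod_{k=1}^\infty (1 + v_k)$ uniformly convergent to $u$ if the sequence $\left( \prod_{k=1}^N (1 + v_k) \right)_{N\in\mathbb{N}}$ converges on $I$ uniformly to $u$.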
The Cauchy criterion extends to uniform convergence in the usual way, i.e. we have
Theorem 30.17. The product $\prod_{k=1}^{\infty} (1 + v_k)$ converges on $I$ uniformly if for every $\epsilon > 0$ there exists $N(\epsilon) \in \mathbb{N}$ such that $n \geq N(\epsilon)$ and $m \in \mathbb{N}$ implies that for all $x \in I$
\[ \left| \prod_{k=n+1}^{n+m} (1 + v_k(x)) - 1 \right| < \epsilon. \tag{30.23} \]

Theorem 30.18. Suppose that the series $\sum_{k=1}^{\infty} |v_k|$ converges uniformly on $I$ and that all functions $v_k$ are continuous. Then $\prod_{k=1}^{\infty} (1 + v_k)$ converges uniformly to a continuous function.
Proof. We know that the convergence of $\sum_{k=1}^{\infty} |v_k(x)|$ implies the convergence of $\prod_{k=1}^{\infty} (1 + v_k(x))$, see Propositions 30.10 and 30.9. Thus we can define a function on $I$ by
\[ u(x) := \prod_{k=1}^{\infty} (1 + v_k(x)). \]
Furthermore, there exists $M \in \mathbb{N}$ such that for all $x \in I$ and all $l \in \mathbb{N}$ the following holds
\[ |v_{M+1}(x)| + |v_{M+2}(x)| + \cdots + |v_{M+l}(x)| < 1, \tag{30.24} \]
which is again a direct consequence of the uniform convergence of $\sum_{k=1}^{\infty} |v_k|$. Consider now the product
\[ w_M(x) := \prod_{k=M+1}^{\infty} (1 + v_k(x)). \tag{30.25} \]
With $p_{n,M}(x) := \prod_{k=M+1}^{n} (1 + v_k(x))$, $n \geq M+1$, and the convention $p_{M,M}(x) := 1$ (empty product) we find
\[ w_M(x) = 1 + \sum_{k=M+1}^{\infty} \left(p_{k,M}(x) - p_{k-1,M}(x)\right) = 1 + \sum_{k=M}^{\infty} p_{k,M}(x)\, v_{k+1}(x). \]
We claim that the series $\sum_{k=M}^{\infty} p_{k,M}(x) v_{k+1}(x)$ is uniformly convergent on $I$. Indeed, for $k \geq M$ we have
\[ |p_{k,M}(x)| \leq (1 + |v_{M+1}(x)|)(1 + |v_{M+2}(x)|) \cdot \ldots \cdot (1 + |v_k(x)|) \leq e^{|v_{M+1}(x)| + \cdots + |v_k(x)|} < e, \]
where we used (30.24) and the fact that $1 + y \leq e^y$ for $y \geq 0$. Thus we find
\[ \sum_{k=M}^{\infty} |p_{k,M}(x) v_{k+1}(x)| \leq e \sum_{k=M}^{\infty} |v_{k+1}(x)| \]
and the uniform convergence of $\sum_{k=1}^{\infty} |v_k(x)|$ together with the comparison argument of the Weierstrass M-test yields the uniform and absolute convergence of $\sum_{k=M}^{\infty} p_{k,M}(x) v_{k+1}(x)$, hence $w_M$ is a continuous function and the convergence in (30.25) is uniform. Multiplying $w_M(x)$ by $\prod_{k=1}^{M} (1 + v_k(x))$ does not change the continuity nor the uniform convergence and the result follows.
Finally we want to establish a result for the derivative of an infinite product of differentiable functions.
Theorem 30.19. Suppose that $v_k : I \to \mathbb{R}$, $k \in \mathbb{N}$, is differentiable and that both $\sum_{k=1}^{\infty} |v_k|$ and $\sum_{k=1}^{\infty} |v_k'|$ converge uniformly. Then the product $u(x) := \prod_{k=1}^{\infty} (1 + v_k(x))$ is differentiable and we have
\[ \frac{u'(x)}{u(x)} = \frac{\left(\prod_{k=1}^{\infty} (1 + v_k(x))\right)'}{\prod_{k=1}^{\infty} (1 + v_k(x))} = \sum_{k=1}^{\infty} \frac{v_k'(x)}{1 + v_k(x)}. \tag{30.26} \]
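Proof. First note that for a finite product $g = g_1 \cdot \ldots \cdot g_M$ we have
\[ \frac{d}{dx} \ln(g_1 \cdot \ldots \cdot g_M) = \frac{(g_1 \cdot \ldots \cdot g_M)'}{g_1 \cdot \ldots \cdot g_M} \]
and
\[ \frac{d}{dx} \ln(g_1 \cdot \ldots \cdot g_M) = \frac{d}{dx} \sum_{j=1}^{M} \ln g_j = \sum_{j=1}^{M} \frac{g_j'}{g_j}, \]
i.e.
\[ \frac{(g_1 \cdot \ldots \cdot g_M)'}{g_1 \cdot \ldots \cdot g_M} = \sum_{j=1}^{M} \frac{g_j'}{g_j} \tag{30.27} \]
provided $g_j > 0$ for all $j$. (The case where $g_j \neq 0$ for all $j$ and some $g_j < 0$ can also be treated by looking at $|g|$.) Now, the uniform convergence of $\sum_{k=1}^{\infty} |v_k(x)|$ implies the existence of $M \in \mathbb{N}$ such that for all $x \in I$ we have
\[ \sum_{k=M+1}^{\infty} |v_k(x)| < \frac{1}{2}, \]
in particular we have $|v_k(x)| < \frac{1}{2}$ for $k \geq M+1$. This implies by Proposition 30.10 the convergence of $\prod_{k=M+1}^{\infty} (1 + v_k(x))$ and hence by Lemma 30.5 the convergence of $\sum_{k=M+1}^{\infty} \ln(1 + v_k(x))$. Since $|v_k(x)| < \frac{1}{2}$ for $k \geq M+1$ it follows that for these $k$ we have $\left|\frac{1}{1 + v_k(x)}\right| < 2$ and consequently, since
\[ \sum_{k=M+1}^{\infty} \frac{d}{dx} \left(\ln(1 + v_k(x))\right) = \sum_{k=M+1}^{\infty} \frac{v_k'(x)}{1 + v_k(x)}, \]
the uniform convergence of $\sum_{k=1}^{\infty} |v_k'|$ implies the absolute and uniform convergence of $\sum_{k=M+1}^{\infty} \frac{v_k'}{1 + v_k}$. By Theorem 26.19 we now have
\[ \frac{d}{dx} \sum_{k=M+1}^{\infty} \ln(1 + v_k) = \sum_{k=M+1}^{\infty} \frac{v_k'}{1 + v_k}. \]
From
\[ \prod_{k=M+1}^{N} (1 + v_k) = \exp\left( \sum_{k=M+1}^{N} \ln(1 + v_k) \right) \]
we derive
\[ \frac{\frac{d}{dx} \prod_{k=M+1}^{N} (1 + v_k)}{\prod_{k=M+1}^{N} (1 + v_k)} = \sum_{k=M+1}^{N} \frac{v_k'}{1 + v_k} \]
and we are now allowed to pass to the limit to obtain
\[ \frac{\frac{d}{dx} \prod_{k=M+1}^{\infty} (1 + v_k)}{\prod_{k=M+1}^{\infty} (1 + v_k)} = \sum_{k=M+1}^{\infty} \frac{v_k'}{1 + v_k}. \]
Using (30.27) for $(1 + v_1) \cdot \ldots \cdot (1 + v_M)$ we eventually get
\[ \frac{\frac{d}{dx} \prod_{k=1}^{\infty} (1 + v_k)}{\prod_{k=1}^{\infty} (1 + v_k)} = \sum_{k=1}^{\infty} \frac{v_k'}{1 + v_k}. \]
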
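Example 30.20. For $x \in (-1, 1)$ we know that $F(x) = x \prod_{k=1}^{\infty} \left(1 - \frac{x^2}{k^2}\right)$ converges uniformly, since $\sum_{k=1}^{\infty} \left|\frac{x^2}{k^2}\right| \leq |x|^2 \sum_{k=1}^{\infty} \frac{1}{k^2}$. Further we have
\[ \sum_{k=1}^{\infty} \left| \left( \frac{x^2}{k^2} \right)' \right| = 2|x| \sum_{k=1}^{\infty} \frac{1}{k^2} \]
and therefore we find for $x \neq 0$
\[ \frac{F'(x)}{F(x)} = \frac{1}{x} + \sum_{k=1}^{\infty} \frac{2x}{x^2 - k^2}. \]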
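A quick numerical plausibility check of this logarithmic derivative is sketched below. It borrows the sine product of Theorem 31.14, which gives $F(x) = \frac{1}{\pi}\sin \pi x$ and hence $F'(x)/F(x) = \pi \cot \pi x$; that identification, and the helper names `log_derivative_partial` and `pi_cot`, are assumptions of this sketch rather than part of the example itself.

```python
import math

def log_derivative_partial(x: float, n: int) -> float:
    """Partial sum 1/x + sum_{k=1}^{n} 2x / (x^2 - k^2) of the series for F'(x)/F(x)."""
    return 1.0 / x + sum(2.0 * x / (x * x - k * k) for k in range(1, n + 1))

def pi_cot(x: float) -> float:
    """pi * cot(pi x), the logarithmic derivative of sin(pi x)/pi."""
    return math.pi / math.tan(math.pi * x)

if __name__ == "__main__":
    for x in (0.1, 0.3, -0.7):
        print(x, log_derivative_partial(x, 100_000), pi_cot(x))
```

The truncation error of the partial sum is of order 2|x|/n, so the agreement improves only linearly in n.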
Problems
1. Prove
a) $\prod_{k=2}^{\infty} \frac{k^3 - 1}{k^3 + 1} = \frac{2}{3}$;
b) $\prod_{l=1}^{\infty} \left(1 + \frac{1}{l(l+2)}\right) = 2$.

2. Let $a_k \geq 0$, $a_k \neq 1$ for $k \in \mathbb{N}$. Show: the convergence of $\prod_{k=1}^{\infty} (1 - a_k)$ is equivalent to the convergence of $\sum_{k=1}^{\infty} a_k$.

3. Suppose that $\sum_{k=1}^{\infty} a_k$ converges.
a) Prove that $\prod_{k=1}^{\infty} (1 + a_k)$ converges if and only if $\sum_{k=1}^{\infty} a_k^2$ converges.
b) Prove that if $\sum_{k=1}^{\infty} a_k^2$ diverges then $\prod_{k=1}^{\infty} (1 + a_k)$ diverges.

4. Prove that if the infinite product $\prod_{k=1}^{\infty} (1 + a_k)$ converges absolutely then we can rearrange its factors and the corresponding product converges to the same limit.

5. a) For $|x| < 1$ prove that $\prod_{k=0}^{\infty} \left(1 + x^{2^k}\right) = \frac{1}{1-x}$.
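b) Show that for $x \neq 0$, $x \neq 2^k \left(\frac{\pi}{2} + l\pi\right)$, $k \in \mathbb{N}$, $l \in \mathbb{Z}$, the following holds:
\[ \prod_{j=1}^{\infty} \cos \frac{x}{2^j} = \frac{\sin x}{x}, \]
and derive $\prod_{j=1}^{\infty} \cos \frac{\pi}{2^{j+1}} = \frac{2}{\pi}$.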
31 More on the Γ-Function

In this chapter we want to discuss more properties of the Γ-function. Al-
though the Γ-function is one of the most important functions in mathemat-
ics, typical undergraduate courses do not have enough time to treat the
Γ-function in detail. We therefore view this chapter as an addition to the
basics that are usually met.
In Theorem 28.29 we have seen that the Γ-function is logarithmic convex. A
classical result due to H. Bohr and J. Mollerup gives an interesting charac-
terisation of the Γ-function using logarithmic convexity.
Theorem 31.1. Let $G : (0, \infty) \to \mathbb{R}$ be a positive, logarithmic convex function satisfying the functional equation of the Γ-function, i.e. $G(x + 1) = xG(x)$, and the normalisation $G(1) = 1$. Then $G$ is the Γ-function.
Proof. From the functional equation and the normalisation we deduce for $n \in \mathbb{N}$ the fact that $G(n + 1) = n!$. Now, for $0 < x \leq 1$ we find
\[ n + x = (1 - x)n + x(n + 1), \]
i.e. $n + x$ is a convex combination of $n$ and $n + 1$. Since $G$ is logarithmic convex we find
\[ \ln G(n + x) \leq (1 - x) \ln G(n) + x \ln G(n + 1), \]
or
\[ G(n + x) \leq G(n)^{1-x} G(n + 1)^{x} = ((n - 1)!)^{1-x} (n!)^{x} = n!\, n^{x-1}, \]
i.e.
\[ G(n + x) \leq n!\, n^{x-1}. \tag{31.1} \]
Moreover,
\[ n + 1 = x(n + x) + (1 - x)(n + x + 1), \]
i.e. $n + 1$ is a convex combination of $n + x$ and $n + x + 1$, and this gives
\[ n! = G(n + 1) \leq G(n + x)^{x}\, G(n + x + 1)^{1-x} = G(n + x)^{x} \left(G(n + x)(n + x)\right)^{1-x} = G(n + x)(n + x)^{1-x}, \]
thus we have
\[ n! = G(n + 1) \leq G(n + x)(n + x)^{1-x}. \tag{31.2} \]
Combining (31.1) and (31.2) we find
\[ n!\,(n + x)^{x-1} \leq G(n + x) \leq n!\, n^{x-1}. \tag{31.3} \]
Observing that
\[ G(n + x) = x(x + 1) \cdot \ldots \cdot (x + n - 1)\, G(x), \]
it follows from (31.3) and the fact that $0 < x \leq 1$ that
\[ \frac{n!\, n^x}{x(x + 1) \cdot \ldots \cdot (x + n)} \leq \frac{n!\,(n + x)^x}{x(x + 1) \cdot \ldots \cdot (x + n)} \leq G(x) \leq \frac{n!\, n^x}{x(x + 1) \cdot \ldots \cdot (x + n - 1)\, n} = \frac{n!\, n^x}{x(x + 1) \cdot \ldots \cdot (x + n)} \cdot \frac{x + n}{n}, \]
and since $\frac{x+n}{n} \to 1$ we get by Theorem 30.13 for $n \to \infty$
\[ \Gamma(x) \leq G(x) \leq \Gamma(x), \tag{31.4} \]
i.e. $G(x) = \Gamma(x)$ for $0 < x \leq 1$. But now $G(x + 1) = xG(x)$ and $\Gamma(x + 1) = x\Gamma(x)$ imply $G(x) = \Gamma(x)$ for all $x > 0$.
In the proof of the above result, Theorem 30.13 was quite important. Using this result we can also give a representation of the Γ-function as an infinite product involving the exponential function. Recall that the existence of the Euler constant
\[ \gamma := \lim_{N\to\infty} \left( \sum_{k=1}^{N} \frac{1}{k} - \ln N \right) \tag{31.5} \]
was proved in Theorem 18.24. Denote by $\Gamma_n$ the function
\[ \Gamma_n(x) := \frac{n^x\, n!}{x(x + 1) \cdot \ldots \cdot (x + n)}, \tag{31.6} \]
and then Theorem 30.13 reads as
\[ \Gamma(x) = \lim_{n\to\infty} \Gamma_n(x). \tag{31.7} \]
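From (31.6) we deduce
\[ \Gamma_n(x + 1) = x\,\Gamma_n(x)\, \frac{n}{x + n + 1}, \tag{31.8} \]
or
\[ \Gamma_n(x) = \frac{1}{x} \cdot \frac{x + n + 1}{n}\, \Gamma_n(x + 1). \tag{31.9} \]
Next we observe that
\[ \frac{k}{x + k} = \frac{1}{1 + \frac{x}{k}} \]
and
\[ e^{x\left(\ln n - 1 - \frac{1}{2} - \ldots - \frac{1}{n}\right)} = n^x e^{-\frac{x}{1}} e^{-\frac{x}{2}} \cdot \ldots \cdot e^{-\frac{x}{n}}, \]
which implies
\[ \Gamma_n(x) = e^{x\left(\ln n - 1 - \frac{1}{2} - \ldots - \frac{1}{n}\right)} \cdot \frac{1}{x} \cdot \frac{e^{\frac{x}{1}}}{1 + \frac{x}{1}} \cdot \ldots \cdot \frac{e^{\frac{x}{n}}}{1 + \frac{x}{n}}. \tag{31.10} \]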
Now we pass to the limit $n \to \infty$ and using (31.5) we arrive at
Theorem 31.2. For $x > 0$ the Γ-function has the Weierstrass product representation
\[ \Gamma(x) = \frac{e^{-\gamma x}}{x} \prod_{k=1}^{\infty} \frac{e^{\frac{x}{k}}}{1 + \frac{x}{k}}. \tag{31.11} \]
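As an illustration, the partial products of (31.11) can be compared with a library value of the Γ-function. This is a hedged sketch: the name `weierstrass_gamma`, the hard-coded double-precision value of the Euler constant, and the use of Python's `math.gamma` as reference are all assumptions of the example.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler constant gamma, hard-coded to double precision

def weierstrass_gamma(x: float, n: int) -> float:
    """Partial Weierstrass product (e^{-gamma x}/x) * prod_{k<=n} e^{x/k}/(1+x/k), via logarithms."""
    log_value = -EULER_GAMMA * x - math.log(x)
    log_value += sum(x / k - math.log1p(x / k) for k in range(1, n + 1))
    return math.exp(log_value)

if __name__ == "__main__":
    for x in (0.5, 1.0, 2.5, 4.0):
        print(x, weierstrass_gamma(x, 100_000), math.gamma(x))
```

Summing logarithms instead of multiplying factors avoids overflow and makes the slow convergence visible: the omitted tail of the logarithm is roughly x²/(2n).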
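From (31.10) we can immediately deduce
\[ \frac{1}{\Gamma_n(x)} = x\, e^{x\left(1 + \frac{1}{2} + \cdots + \frac{1}{n} - \ln n\right)} \left(1 + \frac{x}{1}\right) e^{-\frac{x}{1}} \cdot \left(1 + \frac{x}{2}\right) e^{-\frac{x}{2}} \cdot \ldots \cdot \left(1 + \frac{x}{n}\right) e^{-\frac{x}{n}} \]
and passing to the limit $n \to \infty$ yields
Corollary 31.3. For $x > 0$ we have
\[ \frac{1}{\Gamma(x)} = x\, e^{\gamma x} \prod_{k=1}^{\infty} \left(1 + \frac{x}{k}\right) e^{-\frac{x}{k}}. \tag{31.12} \]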
The Weierstrass product representation allows us to prove
Theorem 31.4. On (0, ∞) the Γ-function is arbitrarily often differentiable.
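Proof. Since for $x > 0$ we have $\Gamma(x) > 0$ it follows that $\Gamma$ is differentiable if and only if $\ln \Gamma$ is differentiable. From (31.11) we derive
\[ \ln \Gamma(x) = -\gamma x - \ln x + \sum_{k=1}^{\infty} \left( \frac{x}{k} - \ln\left(1 + \frac{x}{k}\right) \right), \tag{31.13} \]
and we know that the series $\sum_{k=1}^{\infty} \left( \frac{x}{k} - \ln\left(1 + \frac{x}{k}\right) \right)$ converges pointwise. Further we note that
\[ \sum_{k=1}^{\infty} \left( \frac{x}{k} - \ln\left(1 + \frac{x}{k}\right) \right)' = \sum_{k=1}^{\infty} \left( \frac{1}{k} - \frac{1}{k + x} \right) = \sum_{k=1}^{\infty} \frac{x}{k(k + x)}, \]
and this series converges uniformly on compact intervals in $(0, \infty)$. Consequently we have
\[ (\ln \Gamma(x))' = \frac{\Gamma'(x)}{\Gamma(x)} = -\gamma - \frac{1}{x} + \sum_{k=1}^{\infty} \frac{x}{k(k + x)} = -\gamma - \frac{1}{x} + \sum_{k=1}^{\infty} \left( \frac{1}{k} - \frac{1}{k + x} \right), \tag{31.14} \]
which now yields also that $\Gamma$ is arbitrarily often differentiable on $(0, \infty)$. Indeed we find for $l \geq 2$
\[ \frac{d^{l-1}}{dx^{l-1}} \left( \frac{\Gamma'(x)}{\Gamma(x)} \right) = \sum_{k=0}^{\infty} \frac{(-1)^l (l - 1)!}{(x + k)^l}. \tag{31.15} \]
It is worth noting that
\[ (\ln \Gamma)''(x) = \sum_{k=0}^{\infty} \frac{1}{(k + x)^2}, \tag{31.16} \]
which in particular confirms that $\ln \Gamma$ is convex, which is of course already known to us.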
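Formula (31.16) lends itself to a simple numerical cross-check. The sketch below compares a truncated series with a central second difference of `math.lgamma`; the function names, the truncation length and the step size `h` are illustrative choices, not part of the text.

```python
import math

def log_gamma_second_derivative_series(x: float, terms: int = 100_000) -> float:
    """Truncated series sum_{k=0}^{terms-1} 1/(k+x)^2 from (31.16)."""
    return sum(1.0 / (k + x) ** 2 for k in range(terms))

def log_gamma_second_derivative_fd(x: float, h: float = 1e-3) -> float:
    """Central second difference of ln Gamma at x, using the library function math.lgamma."""
    return (math.lgamma(x + h) - 2.0 * math.lgamma(x) + math.lgamma(x - h)) / (h * h)

if __name__ == "__main__":
    for x in (1.0, 2.5, 7.0):
        print(x, log_gamma_second_derivative_series(x), log_gamma_second_derivative_fd(x))
    # At x = 1 the series is sum 1/k^2, whose value pi^2/6 is classical.
    print(math.pi ** 2 / 6)
```

Both values are positive, in line with the convexity of ln Γ; the truncated tail of the series is about 1/terms.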
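We want to study the asymptotic behaviour of the Γ-function. Since on $\mathbb{N}$ it coincides with the factorials we expect rapid growth. More precisely, recalling
\[ \left(1 + \frac{1}{k}\right)^{k} < e < \left(1 + \frac{1}{k}\right)^{k+1}, \tag{31.17} \]
see Problem 10 in Chapter 17, we find when multiplying these inequalities for $k = 1, \ldots, (n - 1)$
\[ \frac{n^{n-1}}{(n - 1)!} < e^{n-1} < \frac{n^{n}}{(n - 1)!}, \tag{31.18} \]
or
\[ e\, n^{n} e^{-n} < n! < e\, n^{n+1} e^{-n}, \tag{31.19} \]
which suggests
\[ \Gamma(x) = x^{x - \frac{1}{2}} e^{-x} e^{\vartheta(x)}. \tag{31.20} \]
Using methods not yet at our disposal the general Stirling formula can be proved.
Theorem 31.5. For $x > 0$ we have
\[ \Gamma(x) = \sqrt{2\pi}\, x^{x - \frac{1}{2}} e^{-x} e^{\vartheta(x)} \tag{31.21} \]
where
\[ \vartheta(x) = \int_0^{\infty} \left( \frac{1}{e^t - 1} - \frac{1}{t} + \frac{1}{2} \right) \frac{e^{-xt}}{t}\, dt. \tag{31.22} \]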
A proof of Theorem 31.5 is given in R. Beals and R. Wong [1]. Here we give
a proof of the Stirling formula for the factorial, or equivalently for Γ(n + 1).
As preparation we prove
Lemma 31.6. For $k \in \mathbb{N}$ we find $\xi_k \in [k, k + 1]$ such that
\[ \int_k^{k+1} \ln x\, dx = \frac{1}{2} (\ln k + \ln(k + 1)) + \frac{1}{12 \xi_k^2}. \tag{31.23} \]
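Proof. Let $g(x) = \frac{x(1-x)}{2} \geq 0$ on $[0, 1]$ and set $g_k : [k, k + 1] \to \mathbb{R}$, $g_k(x) = g(x - k) = \frac{(x-k)(1+k-x)}{2}$. We have $g_k(x) \geq 0$, $g_k'(x) = -x + \frac{2k+1}{2}$ and $g_k''(x) = -1$, and therefore
\[ \int_k^{k+1} \ln x\, dx = -\int_k^{k+1} g_k''(x) \ln x\, dx = -g_k'(x) \ln x \Big|_k^{k+1} + \int_k^{k+1} g_k'(x) (\ln x)'\, dx \]
\[ = -g_k'(x) \ln x \Big|_k^{k+1} + g_k(x) (\ln x)' \Big|_k^{k+1} - \int_k^{k+1} g_k(x) (\ln x)''\, dx. \]
Now, since
\[ -g_k'(x) \ln x \Big|_k^{k+1} = \frac{1}{2} (\ln(k + 1) + \ln k) \]
and
\[ g_k(x) (\ln x)' \Big|_k^{k+1} = 0, \]
we find
\[ \int_k^{k+1} \ln x\, dx = \frac{1}{2} (\ln(k + 1) + \ln k) + \int_k^{k+1} \frac{1}{x^2}\, g_k(x)\, dx \]
\[ = \frac{1}{2} (\ln(k + 1) + \ln k) + \frac{1}{\xi_k^2} \int_k^{k+1} g_k(x)\, dx = \frac{1}{2} (\ln(k + 1) + \ln k) + \frac{1}{12 \xi_k^2}, \quad \xi_k \in [k, k + 1]. \]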
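Since the integral of ln x is known in closed form, one can solve (31.23) for $\xi_k$ and confirm that it lies in $[k, k+1]$. The following is only an illustrative sketch; the function name `xi` is hypothetical and the check is restricted to moderate k, where floating-point cancellation is harmless.

```python
import math

def xi(k: int) -> float:
    """Solve (31.23) for xi_k, using the antiderivative x ln x - x of ln x."""
    integral = (k + 1) * math.log(k + 1) - k * math.log(k) - 1.0
    trapezoid = 0.5 * (math.log(k) + math.log(k + 1))
    remainder = integral - trapezoid  # equals 1 / (12 xi_k^2) by the lemma
    return 1.0 / math.sqrt(12.0 * remainder)

if __name__ == "__main__":
    for k in (1, 2, 5, 10, 50):
        print(k, xi(k))
```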
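Now we sum (31.23) from $k = 1$ to $n - 1$ and we get
\[ \int_1^n \ln x\, dx = \sum_{k=1}^{n} \ln k - \frac{1}{2} \ln n + \frac{1}{12} \sum_{k=1}^{n-1} \frac{1}{\xi_k^2}. \]
Since $\int_1^n \ln x\, dx = n \ln n - n + 1$ we find further
\[ \sum_{k=1}^{n} \ln k = \left(n + \frac{1}{2}\right) \ln n - n + \eta_n, \tag{31.24} \]
where
\[ \eta_n = 1 - \frac{1}{12} \sum_{k=1}^{n-1} \frac{1}{\xi_k^2}. \tag{31.25} \]
But $e^{\sum_{k=1}^{n} \ln k} = n!$, and therefore
\[ n! = n^{\left(n + \frac{1}{2}\right)} e^{-n} c_n, \quad c_n = e^{\eta_n}. \tag{31.26} \]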
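For $k \leq \xi_k \leq k + 1$ we have $\frac{1}{\xi_k^2} \leq \frac{1}{k^2}$ which yields that
\[ \eta := \lim_{n\to\infty} \eta_n = 1 - \frac{1}{12} \sum_{k=1}^{\infty} \frac{1}{\xi_k^2} \]
exists, hence
\[ c = \lim_{n\to\infty} c_n = e^{\eta}. \]
We want to find $c$. We must have
\[ c = \lim_{n\to\infty} \frac{c_n^2}{c_{2n}}, \]
and by (31.26) we have
\[ \frac{c_n^2}{c_{2n}} = \frac{(n!)^2 \sqrt{2n}\, (2n)^{2n}}{n^{2n+1} (2n)!} = \sqrt{2}\, \frac{2^{2n} (n!)^2}{\sqrt{n}\, (2n)!}. \]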
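Recall Wallis' product, Theorem 30.12, i.e.
\[ 2 \prod_{k=1}^{\infty} \frac{4k^2}{4k^2 - 1} = \pi. \tag{31.27} \]
Note that $4k^2 - 1 = (2k - 1)(2k + 1)$ and hence
\[ \prod_{k=1}^{N} \frac{4k^2}{4k^2 - 1} = \frac{2 \cdot 2 \cdot 4 \cdot 4 \cdot \ldots \cdot 2N \cdot 2N}{1 \cdot 3 \cdot 3 \cdot 5 \cdot \ldots \cdot (2N - 1) \cdot (2N + 1)} \]
which gives
\[ \left( 2 \prod_{k=1}^{N} \frac{4k^2}{4k^2 - 1} \right)^{\frac{1}{2}} = \sqrt{2}\, \frac{2 \cdot 4 \cdot \ldots \cdot 2N}{3 \cdot 5 \cdot \ldots \cdot (2N - 1) \cdot \sqrt{2N + 1}} = \frac{1}{\sqrt{N + \frac{1}{2}}} \cdot \frac{2^2 \cdot 4^2 \cdot \ldots \cdot (2N)^2}{2 \cdot 3 \cdot 4 \cdot 5 \cdot \ldots \cdot (2N - 1)(2N)} = \frac{1}{\sqrt{N + \frac{1}{2}}} \cdot \frac{2^{2N} (N!)^2}{(2N)!}, \]
so that $\lim_{N\to\infty} \frac{2^{2N} (N!)^2}{\sqrt{N}\, (2N)!} = \sqrt{\pi}$, which yields
\[ c = \lim_{N\to\infty} \sqrt{2}\, \frac{2^{2N} (N!)^2}{\sqrt{N}\, (2N)!} = \sqrt{2\pi}, \tag{31.28} \]
implying
\[ \lim_{n\to\infty} \frac{n!}{\sqrt{2\pi}\, n^{\left(n + \frac{1}{2}\right)} e^{-n}} = 1. \]
Thus we have proved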
Theorem 31.7 (Stirling formula). The following holds
\[ \lim_{n\to\infty} \frac{n!}{\sqrt{2\pi}\, n^{\left(n + \frac{1}{2}\right)} e^{-n}} = 1. \tag{31.29} \]
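Further it follows that
Corollary 31.8. For $n \geq 2$ we have
\[ \sqrt{2\pi}\, n^{\left(n + \frac{1}{2}\right)} e^{-n} < n! < \sqrt{2\pi}\, n^{\left(n + \frac{1}{2}\right)} e^{-n} e^{\frac{1}{12(n-1)}}. \tag{31.30} \]
Proof. We only need to observe that
\[ 0 < \eta_n - \eta = \frac{1}{12} \sum_{k=n}^{\infty} \frac{1}{\xi_k^2} \leq \frac{1}{12} \sum_{k=n}^{\infty} \frac{1}{k^2} < \frac{1}{12} \int_{n-1}^{\infty} \frac{dx}{x^2} = \frac{1}{12(n - 1)} \]
and now we have to use (31.26).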
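The two-sided estimate (31.30) is easy to test on a range of n. The sketch below works with logarithms so that no large factorials are formed; `stirling_log_bounds` is an illustrative name and `math.lgamma(n + 1)` is used as the reference value of ln n!.

```python
import math

def stirling_log_bounds(n: int) -> tuple[float, float]:
    """Logarithms of the lower and upper bounds for n! in (31.30), valid for n >= 2."""
    lower = 0.5 * math.log(2.0 * math.pi) + (n + 0.5) * math.log(n) - n
    upper = lower + 1.0 / (12.0 * (n - 1))
    return lower, upper

if __name__ == "__main__":
    for n in (2, 5, 10, 100):
        lower, upper = stirling_log_bounds(n)
        print(n, lower, math.lgamma(n + 1), upper)
```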
In Problem 11 in Chapter 28 we have seen that the improper integral
\[ B(x, y) = \int_0^1 t^{x-1} (1 - t)^{y-1}\, dt \tag{31.31} \]
converges for $x > 0$ and $y > 0$.
Definition 31.9. The function $B : (0, \infty) \times (0, \infty) \to \mathbb{R}$, $(x, y) \mapsto B(x, y)$, is called (Euler's) beta-function.
Our aim is to relate the beta-function to the Γ-function. First we note for $x, y > 0$
\[ B(x + 1, y) = \int_0^1 t^{x} (1 - t)^{y-1}\, dt = \int_0^1 (1 - t)^{x+y-1} \left( \frac{t}{1 - t} \right)^{x} dt. \tag{31.32} \]
Lemma 31.10. For $x, y > 0$ we have
\[ B(x + 1, y) = \frac{x}{x + y}\, B(x, y). \tag{31.33} \]
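Proof. For $0 < \epsilon, \eta < \frac{1}{2}$ we find using integration by parts
\[ \int_{\epsilon}^{1-\eta} (1 - t)^{x+y-1} \left( \frac{t}{1 - t} \right)^{x} dt = \left[ -\frac{(1 - t)^{x+y}}{x + y} \left( \frac{t}{1 - t} \right)^{x} \right]_{\epsilon}^{1-\eta} + \frac{x}{x + y} \int_{\epsilon}^{1-\eta} (1 - t)^{x+y} \left( \frac{t}{1 - t} \right)^{x-1} \frac{1}{(1 - t)^2}\, dt \]
\[ = \frac{(1 - \epsilon)^{y} \epsilon^{x} - \eta^{y} (1 - \eta)^{x}}{x + y} + \frac{x}{x + y} \int_{\epsilon}^{1-\eta} t^{x-1} (1 - t)^{y-1}\, dt. \]
For $\epsilon$ and $\eta$ tending to 0 we get
\[ B(x + 1, y) = \frac{x}{x + y}\, B(x, y). \]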
Now we can prove

Theorem 31.11. For $x, y > 0$ the following holds
\[ B(x, y) = \frac{\Gamma(x) \Gamma(y)}{\Gamma(x + y)}. \tag{31.34} \]
Proof. For $y > 0$ fixed we consider the function
\[ f(x) := B(x, y) \Gamma(x + y). \]
With (31.33) we find
\[ f(x + 1) = B(x + 1, y) \Gamma(x + 1 + y) = \frac{x}{x + y} B(x, y) (x + y) \Gamma(x + y) = x B(x, y) \Gamma(x + y) = x f(x), \]
thus $f$ satisfies the functional equation of the Γ-function. Further, for $y > 0$ fixed the functions $x \mapsto \Gamma(x + y)$ and $x \mapsto B(x, y)$ are logarithmic convex. For the Γ-function this is trivial, in the case of the beta-function we only need to note that $x \mapsto t^{x-1} (1 - t)^{y-1}$ is logarithmic convex and hence the integral defining $B(x, y)$ is a pointwise limit of logarithmic convex functions. By Problem 12 c) in Chapter 28 it follows that $x \mapsto B(x, y)$ is logarithmic convex. Finally, Problem 12 a) in Chapter 28 shows that $x \mapsto B(x, y) \Gamma(x + y)$ is logarithmic convex. Since both results hold also for $g(x) := \frac{f(x)}{f(1)}$ and $g(1) = 1$ we deduce by Theorem 31.1 that the function $\frac{f(x)}{f(1)}$ is the Γ-function, i.e. we have
\[ f(1) \Gamma(x) = B(x, y) \Gamma(x + y). \]
In order to find $f(1)$ we note that
\[ B(1, y) = \int_0^1 (1 - t)^{y-1}\, dt = \frac{1}{y} \]
and therefore
\[ f(1) = B(1, y) \Gamma(1 + y) = \frac{1}{y}\, y\, \Gamma(y) = \Gamma(y) \]
and it follows that
\[ B(x, y) = \frac{\Gamma(x) \Gamma(y)}{\Gamma(x + y)}. \]
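Relation (31.34) can be sanity-checked by direct quadrature. The sketch below uses a plain midpoint rule, which is an assumption made for brevity; it is only adequate here because the sample arguments satisfy x, y ≥ 1, so the integrand has no endpoint singularity. The function names are illustrative.

```python
import math

def beta_midpoint(x: float, y: float, steps: int = 100_000) -> float:
    """Midpoint-rule approximation of the beta integral (31.31); intended for x, y >= 1."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += t ** (x - 1.0) * (1.0 - t) ** (y - 1.0)
    return h * total

def beta_via_gamma(x: float, y: float) -> float:
    """Right-hand side of (31.34), using the library function math.gamma."""
    return math.gamma(x) * math.gamma(y) / math.gamma(x + y)

if __name__ == "__main__":
    for x, y in ((1.0, 4.0), (2.5, 3.5), (6.0, 1.0)):
        print(x, y, beta_midpoint(x, y), beta_via_gamma(x, y))
```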
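Calculating $B(x, x)$ we find
\[ B(x, x) = \int_0^1 t^{x-1} (1 - t)^{x-1}\, dt = 2 \int_0^{\frac{1}{2}} (t(1 - t))^{x-1}\, dt, \quad x > 0, \]
where we used that $t \mapsto (t(1 - t))^{x-1}$ is symmetric with respect to the axis $t_0 = \frac{1}{2}$. Using the substitution $s = 4t(1 - t)$ we obtain
\[ B(x, x) = 2 \int_0^1 s^{x-1} (1 - s)^{-\frac{1}{2}} \cdot 2^{-2x}\, ds = 2^{1-2x} \int_0^1 s^{x-1} (1 - s)^{-\frac{1}{2}}\, ds = 2^{1-2x} B\left(x, \frac{1}{2}\right). \]
Now we apply (31.34) to find
\[ \frac{\Gamma(x)^2}{\Gamma(2x)} = B(x, x) = 2^{1-2x}\, \frac{\Gamma(x) \Gamma\left(\frac{1}{2}\right)}{\Gamma\left(x + \frac{1}{2}\right)} \]
or using $\Gamma\left(\frac{1}{2}\right) = \sqrt{\pi}$ we arrive at the Legendre duplication formula for the Γ-function:
\[ \Gamma(2x) = \frac{2^{2x-1}}{\sqrt{\pi}}\, \Gamma(x) \Gamma\left(x + \frac{1}{2}\right). \tag{31.35} \]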
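The duplication formula (31.35) can be confirmed numerically with a few sample values. This is a minimal sketch relying on Python's `math.gamma`; the name `duplication_rhs` is illustrative.

```python
import math

def duplication_rhs(x: float) -> float:
    """Right-hand side of (31.35): 2^(2x-1) / sqrt(pi) * Gamma(x) * Gamma(x + 1/2)."""
    return 2.0 ** (2.0 * x - 1.0) / math.sqrt(math.pi) * math.gamma(x) * math.gamma(x + 0.5)

if __name__ == "__main__":
    for x in (0.25, 1.0, 3.7, 10.0):
        print(x, math.gamma(2.0 * x), duplication_rhs(x))
```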
We close our theoretical considerations by proving an interesting relation between the Γ-function and the sine-function. For this we first extend $\Gamma$ to a larger domain. Let $x > 0$ and $n \in \mathbb{N}$. Iterating the functional equation of the Γ-function we get
\[ \Gamma(x + n) = (x + n - 1)(x + n - 2) \cdot \ldots \cdot (x + 1)\, x\, \Gamma(x), \tag{31.36} \]
which allows us to define $\Gamma$ for all $x$, $x > -n$, but $x \neq 0, -1, \ldots, -n$ by
\[ \Gamma(x) := \frac{\Gamma(x + n)}{(x + n - 1)(x + n - 2) \cdot \ldots \cdot (x + 1)\, x}. \]
Thus we can extend $\Gamma$ to $\mathbb{R} \setminus (-\mathbb{N}_0)$, $-\mathbb{N}_0 := \{k \mid -k \in \mathbb{N}_0\}$, and the functional equation of $\Gamma$ also holds for this extension.
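We now consider the function
\[ \varphi(x) := \Gamma(x) \Gamma(1 - x) \sin \pi x \tag{31.37} \]
which is defined for all $x \in \mathbb{R} \setminus \mathbb{Z}$. For such a value of $x$ we find
\[ \varphi(x + 1) = \Gamma(x + 1) \Gamma(1 - x - 1) \sin(\pi(x + 1)) = x \Gamma(x) \cdot \frac{\Gamma(1 - x)}{-x} \cdot (-\sin \pi x) = \varphi(x), \]
where we used $\Gamma(1 - x) = -x \Gamma(-x)$. Thus $\varphi$ is a function with period 1.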

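Our next aim is to extend $\varphi$ to $\mathbb{R}$. Applying (31.35) with $x$ replaced by $\frac{x}{2}$ we find
\[ \Gamma\left(\frac{x}{2}\right) \Gamma\left(\frac{x + 1}{2}\right) = c_0\, 2^{-x}\, \Gamma(x), \tag{31.38} \]
where $c_0 = 2\sqrt{\pi}$. Replacing $x$ by $1 - x$ in (31.38) we get
\[ \Gamma\left(\frac{1 - x}{2}\right) \Gamma\left(1 - \frac{x}{2}\right) = c_0\, 2^{x-1}\, \Gamma(1 - x), \tag{31.39} \]
and we find
\[ \varphi\left(\frac{x}{2}\right) \varphi\left(\frac{x + 1}{2}\right) = \Gamma\left(\frac{x}{2}\right) \Gamma\left(1 - \frac{x}{2}\right) \sin \frac{\pi x}{2}\; \Gamma\left(\frac{x + 1}{2}\right) \Gamma\left(\frac{1 - x}{2}\right) \cos \frac{\pi x}{2} = \frac{c_0^2}{4}\, \Gamma(x) \Gamma(1 - x) \sin \pi x = \frac{c_0^2}{4}\, \varphi(x), \]
or
\[ \varphi\left(\frac{x}{2}\right) \varphi\left(\frac{x + 1}{2}\right) = \frac{c_0^2}{4}\, \varphi(x) = \pi\, \varphi(x). \tag{31.40} \]
For $x \in \mathbb{R} \setminus \mathbb{Z}$ the function $\varphi$ is arbitrarily often differentiable since the sine and the Γ-functions are. The functional equation of the Γ-function yields
\[ \varphi(x) = \frac{\Gamma(1 + x)}{x}\, \Gamma(1 - x) \sin \pi x = \Gamma(1 + x) \Gamma(1 - x)\, \frac{\sin \pi x}{x} = \Gamma(1 + x) \Gamma(1 - x) \sum_{k=0}^{\infty} (-1)^k \frac{\pi^{2k+1} x^{2k}}{(2k + 1)!}, \]
and the series on the right hand side converges for all $x \in \mathbb{R}$. Moreover, as $x \to 0$ the right hand side tends to $\pi$ and is indeed an arbitrarily often differentiable function. Thus the function
\[ \tilde{\varphi}(x) := \begin{cases} \varphi(x), & x \in \mathbb{R} \setminus \mathbb{Z} \\ \pi, & x \in \mathbb{Z} \end{cases} \]
has period 1 and is on $\mathbb{R}$ arbitrarily often differentiable, and further (31.40) holds for all $x \in \mathbb{R}$.
Now we claim that $\tilde{\varphi}$ is constant. Denote by $g$ the function
\[ g(x) := \frac{d^2}{dx^2} \ln \tilde{\varphi}(x), \quad 0 \leq x \leq 1. \]
Clearly $g$ has period 1 and by (31.40) we find
\[ \ln\left( \tilde{\varphi}\left(\frac{x}{2}\right) \tilde{\varphi}\left(\frac{x + 1}{2}\right) \right) = \ln\left( \pi\, \tilde{\varphi}(x) \right) \]
or
\[ \ln \tilde{\varphi}\left(\frac{x}{2}\right) + \ln \tilde{\varphi}\left(\frac{x + 1}{2}\right) = \ln \pi + \ln \tilde{\varphi}(x), \]
which yields after differentiating twice
\[ \frac{1}{4}\, g\left(\frac{x}{2}\right) + \frac{1}{4}\, g\left(\frac{x + 1}{2}\right) = g(x). \tag{31.41} \]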
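On $[0, 1]$ the function $g$ is continuous, hence bounded, say $|g(x)| \leq M$ on $[0, 1]$, which implies by (31.41)
\[ |g(x)| \leq \frac{1}{4} \left| g\left(\frac{x}{2}\right) \right| + \frac{1}{4} \left| g\left(\frac{x + 1}{2}\right) \right| \leq \frac{M}{2} \tag{31.42} \]
and iterating (31.42) $N$-times we find
\[ |g(x)| \leq \frac{M}{2^N}, \tag{31.43} \]
which due to the periodicity of $g$ extends to all $x \in \mathbb{R}$, thus we must have $g(x) = 0$ for all $x \in \mathbb{R}$. Hence, since $g = \frac{d^2}{dx^2} \ln \tilde{\varphi}$, the function $\ln \tilde{\varphi}$ must be a linear function and periodic, i.e. it must be a constant. Consequently $\tilde{\varphi}$ must be constant, but $\tilde{\varphi}(0) = \pi$. Thus by (31.37) we have proved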
Theorem 31.13. For $x \in \mathbb{R} \setminus \mathbb{Z}$
\[ \Gamma(x) \Gamma(1 - x) = \frac{\pi}{\sin \pi x}. \tag{31.44} \]
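The reflection formula (31.44) can be spot-checked numerically, including at negative non-integer arguments, where Python's `math.gamma` is also defined. The helper names below are illustrative; this is a sketch, not a proof.

```python
import math

def reflection_lhs(x: float) -> float:
    """Gamma(x) * Gamma(1 - x) for non-integer x."""
    return math.gamma(x) * math.gamma(1.0 - x)

def reflection_rhs(x: float) -> float:
    """pi / sin(pi x) for non-integer x."""
    return math.pi / math.sin(math.pi * x)

if __name__ == "__main__":
    for x in (0.3, 0.5, -1.5, 2.7):
        print(x, reflection_lhs(x), reflection_rhs(x))
```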
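Writing (31.44) as
\[ \frac{1}{\pi} \sin \pi x = \frac{1}{\Gamma(x) \Gamma(1 - x)} \]
and using again $\Gamma(1 - x) = -x \Gamma(-x)$ we obtain
\[ \sin \pi x = \frac{\pi}{-x\, \Gamma(x) \Gamma(-x)}. \]
If we note that the Weierstrass product representation extends to $x \in \mathbb{R} \setminus \mathbb{Z}$ we find by Theorem 31.2 the following product representation of the sine function:
Theorem 31.14. For $x \in \mathbb{R}$ the following holds
\[ \sin \pi x = \pi x \prod_{k=1}^{\infty} \left( 1 - \frac{x^2}{k^2} \right). \tag{31.45} \]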
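Remark 31.15. From our derivation we can only conclude that (31.45) holds for $x \in \mathbb{R} \setminus \mathbb{Z}$. But for $x \in \mathbb{Z}$, one term in $\prod_{k=1}^{\infty} \left(1 - \frac{x^2}{k^2}\right)$ vanishes as does $\sin \pi x$, hence we can extend (31.45) to $\mathbb{R}$.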
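A short experiment with partial products of (31.45) illustrates both the slow convergence at non-integer points and the exact vanishing at integers noted in the remark. The function name `sine_product` and the truncation length are illustrative assumptions.

```python
import math

def sine_product(x: float, n: int = 100_000) -> float:
    """Partial product pi * x * prod_{k=1}^{n} (1 - x^2 / k^2) of (31.45)."""
    product = math.pi * x
    for k in range(1, n + 1):
        product *= 1.0 - (x / k) ** 2
    return product

if __name__ == "__main__":
    for x in (0.3, 2.5, 3.0):
        print(x, sine_product(x), math.sin(math.pi * x))
```

At x = 3 the factor with k = 3 is exactly zero, so the partial product vanishes identically, whereas the floating-point value of sin(3π) is only close to zero.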
When turning to complex-valued functions of a complex variable and intro-
ducing meromorphic functions we will return to the Γ-function and related
functions. In fact many of the formulae proved here will show their full power
in the complex setting.
Problems
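1. Show that
\[ \Gamma\left(n + \frac{1}{2}\right) = \frac{(2n)!\, \sqrt{\pi}}{4^n\, n!}, \quad n \in \mathbb{N}. \]
2. Let $\alpha > -1$ and $f_\alpha : (0, \infty) \to \mathbb{R}$, $f_\alpha(t) = t^\alpha$. Prove that
\[ F_\alpha(s) := \int_0^{\infty} t^{\alpha} e^{-st}\, dt = \frac{\Gamma(\alpha + 1)}{s^{\alpha+1}}. \]
3. Prove that
\[ \Gamma(x) = \int_0^1 \left( \ln \frac{1}{t} \right)^{x-1} dt \]
and derive
\[ \int_0^1 \left( \ln \frac{1}{t} \right)^{\frac{1}{2}} dt = \frac{\sqrt{\pi}}{2}, \]
as well as
\[ \int_0^1 \left( \ln \frac{1}{t} \right)^{-\frac{1}{2}} dt = \sqrt{\pi}. \]
4. Prove that $\Gamma'(1) = -\gamma$, where $\gamma$ is the Euler constant.

5. The function $\psi(x) := \frac{d}{dx} \ln \Gamma(x) = \frac{\Gamma'(x)}{\Gamma(x)}$ is often called the digamma-function. Prove:
a) $\psi(x) - \psi(1) = -\sum_{k=0}^{\infty} \left( \frac{1}{x+k} - \frac{1}{k+1} \right)$;
b) $\psi(x + n) = \frac{1}{x} + \cdots + \frac{1}{x+n-1} + \psi(x)$.
6. For the Beta-function derive the representation
\[ B(x, y) = \int_0^{\infty} \frac{s^{x-1}}{(1 + s)^{x+y}}\, ds. \]
Hint: use the substitution $t = \frac{s}{1+s}$ in the definition of $B(x, y)$.
7. Find
\[ \int_0^{\infty} \frac{x^5}{(1 + x)^7}\, dx. \]
8. Prove the following product representation of the Beta-function:
\[ B(x, y) = \frac{x + y}{xy} \prod_{n=1}^{\infty} \frac{1 + \frac{x+y}{n}}{\left(1 + \frac{x}{n}\right)\left(1 + \frac{y}{n}\right)}. \]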
32 Selected Topics on Functions of a Real Variable

We have discussed in much detail continuous functions, differentiable func-
tions (of a certain order including arbitrarily often differentiable functions),
integrable functions etc. In particular we could clarify some of their rela-
tions, for example that functions differentiable on an open set are continu-
ous, continuous functions on a compact interval are integrable, etc. Maybe
most striking was the fundamental theorem of calculus in the form that if $f : [a, b] \to \mathbb{R}$ is continuous then the function $F : [a, b] \to \mathbb{R}$ defined by
\[ F(x) := \int_a^x f(t)\, dt \]
is differentiable and $F'(x) = f(x)$.

However we also have important function classes for which these results do not apply: a monotone function need not be continuous, but one-sided limits exist, see Problem 6 in Chapter 20, or a bounded monotone function on $[a, b]$ is Riemann integrable but we should not expect that
\[ G(x) := \int_a^x g(t)\, dt \]
is differentiable as the example $g : [-1, 1] \to \mathbb{R}$, $g|_{[-1,0]} = 0$, $g|_{(0,1]} = 1$ with corresponding $G$ given by $G|_{[-1,0]} = 0$ and $G(x) = x$ for $x \in (0, 1]$ shows.
Thus for handling monotone functions we require an extension of our theory.
It turns out that a much better understanding of point sets in R is needed.
In this chapter we want to give some first ideas of the topic “Theory of Real
Variables”. Only after we have introduced the Lebesgue measure and the
Lebesgue integral can we deal with this topic in more detail.
Recall that a set A is called countable if it is the bijective image of N. If A is
finite or countable we call A denumerable. In R we have finite and countable
subsets, for example N, Z or Q, and non-countable subsets, for example R or
R \ Q. Moreover, in R we have some topological notions: we have open and
closed intervals, in fact open and closed sets, or compact sets. We now want
to add a further notion of “smallness”.
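Definition 32.1. A set $A \subset \mathbb{R}$ is called a null set if for every $\epsilon > 0$ there exists a denumerable number of bounded intervals $I_n$ with end points $a_n < b_n$ such that
\[ A \subset \bigcup_{n\in\mathbb{N}} I_n \quad \text{and} \quad \sum_{n=1}^{\infty} (b_n - a_n) \leq \epsilon. \tag{32.1} \]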
Remark 32.2. There is no need to be more restrictive in the choice of In ,

i.e. we may allow open, closed or half-open intervals.
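Lemma 32.3. A. If $A' \subset A$ and $A \subset \mathbb{R}$ is a null set, then $A'$ is a null set too.
B. Every denumerable set $A \subset \mathbb{R}$ is a null set.
Proof. A. This is trivial since $A' \subset \bigcup_{n\in\mathbb{N}} I_n$ provided $A \subset \bigcup_{n\in\mathbb{N}} I_n$.
B. Let $A = \{a_\nu \mid \nu \in \mathbb{N}\}$ be a denumerable subset of $\mathbb{R}$. (If $A$ is finite with $m$ elements we set $a_{m+j} = a_1$ for $j \in \mathbb{N}$.) Given $\epsilon > 0$, choose $I_\nu = (a_\nu - \epsilon 2^{-\nu-1}, a_\nu + \epsilon 2^{-\nu-1})$, which is an interval of length $\epsilon 2^{-\nu}$, and consequently $A \subset \bigcup_{\nu\in\mathbb{N}} I_\nu$ as well as for the total length of these intervals
\[ \sum_{\nu=1}^{\infty} \epsilon\, 2^{-\nu} = \epsilon \sum_{\nu=1}^{\infty} 2^{-\nu} = \epsilon. \]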
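The covering used in part B can be made concrete for finitely many rationals of [0, 1]. The sketch below is illustrative only: the enumeration by increasing denominator and all function names are assumptions of the example, and exact rational arithmetic from Python's `fractions` module is used so that the total length can be compared with ε without rounding.

```python
from fractions import Fraction

def rationals_in_unit_interval(count: int) -> list[Fraction]:
    """First `count` distinct rationals of [0, 1], enumerated by increasing denominator."""
    found: list[Fraction] = []
    seen: set[Fraction] = set()
    q = 1
    while len(found) < count:
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in seen:
                seen.add(r)
                found.append(r)
                if len(found) == count:
                    break
        q += 1
    return found

def covering_intervals(points: list[Fraction], eps: Fraction) -> list[tuple[Fraction, Fraction]]:
    """Interval number nu (counted from 1) is centred at the nu-th point and has length eps * 2^(-nu)."""
    intervals = []
    for nu, a in enumerate(points, start=1):
        half = eps / 2 ** (nu + 1)
        intervals.append((a - half, a + half))
    return intervals

def total_length(intervals: list[tuple[Fraction, Fraction]]) -> Fraction:
    """Sum of the lengths of the given intervals."""
    return sum((right - left for left, right in intervals), Fraction(0))

if __name__ == "__main__":
    eps = Fraction(1, 100)
    pts = rationals_in_unit_interval(50)
    print(float(total_length(covering_intervals(pts, eps))), float(eps))
```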
Before proceeding further, we briefly consider the idea of how to measure “length”. We have no problem in accepting that the length of the bounded interval $[a, b] \subset \mathbb{R}$ is given by $\lambda^{(1)}([a, b]) = b - a$. We may next ask how to determine the “length” or “size” of an arbitrary subset $A \subset \mathbb{R}$. For simplicity we assume that $A$ is bounded. Reasonable properties for a function measuring “length” would include for $A, A_j \subset \mathbb{R}$ the following:
\[ \lambda^{(1)}(\emptyset) = 0, \tag{32.2} \]
i.e. the empty set has no length;
\[ \lambda^{(1)}(A) \geq 0, \tag{32.3} \]
i.e. length is non-negative;
\[ \lambda^{(1)}(A_1 \cup A_2) = \lambda^{(1)}(A_1) + \lambda^{(1)}(A_2) \quad \text{for } A_1 \cap A_2 = \emptyset, \tag{32.4} \]
or more naturally and more generally
\[ \lambda^{(1)}\left( \bigcup_{j=1}^{\infty} A_j \right) = \sum_{j=1}^{\infty} \lambda^{(1)}(A_j) \quad \text{for } A_j \cap A_l = \emptyset \text{ if } j \neq l. \tag{32.5} \]
Moreover, with $c + A = \{c + x \mid x \in A\}$, $c \in \mathbb{R}$,
\[ \lambda^{(1)}(c + A) = \lambda^{(1)}(A), \tag{32.6} \]
i.e. length is invariant under translations.
Suppose we can define a mapping $\lambda^{(1)}$ with these properties. We want to calculate for $A := [0, 1] \cap \mathbb{Q}$ and $B := [0, 1] \setminus \mathbb{Q}$ the length $\lambda^{(1)}(A)$ and $\lambda^{(1)}(B)$. Since $1 = \lambda^{(1)}(A \cup B) = \lambda^{(1)}(A) + \lambda^{(1)}(B)$ we only need to find $\lambda^{(1)}(A)$. Since $\mathbb{Q}$ is countable we know that $A = [0, 1] \cap \mathbb{Q}$ is countable. Let $\tau : \mathbb{N} \to A$ be a fixed bijective mapping (an enumeration of $A$) and put $A_j = \{\tau(j)\}$. Clearly $A_j \cap A_k = \emptyset$ for $j \neq k$ and $A = \bigcup_{j\in\mathbb{N}} A_j$. Therefore we have
\[ \lambda^{(1)}(A) = \sum_{j=1}^{\infty} \lambda^{(1)}(A_j). \tag{32.7} \]
Each set $A_j$ consists of a single point and hence by translation invariance we must have $\lambda^{(1)}(A_j) = \alpha$ for all $j \in \mathbb{N}$. If $\alpha \neq 0$ then $\lambda^{(1)}(A) = \infty$ which is a contradiction to $\lambda^{(1)}(A) \leq 1$. Thus $\alpha = 0$ and therefore $\lambda^{(1)}(A) = 0$ implying that $\lambda^{(1)}(B) = 1$. It follows that the infinite set $A$ must have “length” zero and if we take away this infinite set from $[0, 1]$, the length remains unchanged.
So far the results might be surprising but they are consistent. However it turns out that we cannot define on all bounded subsets of $\mathbb{R}$ a mapping $\lambda^{(1)}$ with the properties listed above. We will see later that we can construct $\lambda^{(1)}$, the one-dimensional Lebesgue measure, on a large family of sets, the Borel sets $\mathcal{B}$, and with the normalisation $\lambda^{(1)}([0, 1]) = 1$, $\lambda^{(1)}$ is even uniquely defined. All open and closed subsets of $\mathbb{R}$ belong to $\mathcal{B}$ as do all countable sets. (Unfortunately not every subset of a null set will belong to $\mathcal{B}$ which will cause a few problems later.) At the moment it is sufficient to accept that for all countable, all closed and all open sets of $\mathbb{R}$ we can define “length” which is finite for bounded sets (if defined) and zero for countable sets. Moreover, if $A \subset \mathbb{R}$ is a Borel set then $A^{\complement}$, its complement, is a Borel set too. If $I$ is a bounded interval with end points $a < b$ then $\lambda^{(1)}(I) = b - a$. If $A \subset [a, b]$ is a Borel set then $[a, b] \setminus A$ is a Borel set and $\lambda^{(1)}([a, b] \setminus A) = (b - a) - \lambda^{(1)}(A)$.
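Now we want to discuss a compact set which is not denumerable but nonetheless has “length” zero. This is one of the interesting properties of the famous Cantor set. We start by setting
\[ C_0 := [0, 1]. \tag{32.8} \]
From $C_0$ we take away the open interval $\left(\frac{1}{3}, \frac{2}{3}\right)$ to obtain
\[ C_1 := [0, 1] \setminus \left(\frac{1}{3}, \frac{2}{3}\right) = \left[\frac{0}{3}, \frac{1}{3}\right] \cup \left[\frac{2}{3}, \frac{3}{3}\right]. \tag{32.9} \]
In the next step we take away from $\left[\frac{0}{3}, \frac{1}{3}\right]$ and $\left[\frac{2}{3}, \frac{3}{3}\right]$ the open “middle interval” of length $\frac{1}{9}$, i.e.
\[ C_2 := \left( \left[\frac{0}{3}, \frac{1}{3}\right] \setminus \left(\frac{1}{9}, \frac{2}{9}\right) \right) \cup \left( \left[\frac{2}{3}, \frac{3}{3}\right] \setminus \left(\frac{7}{9}, \frac{8}{9}\right) \right) = \left[\frac{0}{9}, \frac{1}{9}\right] \cup \left[\frac{2}{9}, \frac{3}{9}\right] \cup \left[\frac{6}{9}, \frac{7}{9}\right] \cup \left[\frac{8}{9}, \frac{9}{9}\right]. \tag{32.10} \]
We continue this process. Clearly $C_N$ consists of $2^N$ disjoint closed intervals $C_{N,j}$, $j = 1, \ldots, 2^N$, each of length $\frac{1}{3^N}$. From $C_N$ we move to $C_{N+1}$ by taking away from each interval $C_{N,j}$ the open “middle interval” of length $\frac{1}{3^{N+1}}$. We define the Cantor set $C$ by
\[ C := \bigcap_{N=0}^{\infty} C_N. \tag{32.11} \]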
So what can we say about $C$? First, since each set $C_N$ is closed by Lemma 19.7 it follows that $C$ is closed too. Moreover, since $C \subset [0, 1]$, the Cantor set is bounded, hence by the Heine-Borel theorem, Theorem 20.26, it is compact. Further, in the $N$th step we get from $C_N$ to $C_{N+1}$ by removing $2^N$ open intervals of length $\frac{1}{3^{N+1}}$. The total length of the removed intervals adds up to
\[ \sum_{N=0}^{\infty} \frac{2^N}{3^{N+1}} = \frac{1}{3} \sum_{N=0}^{\infty} \left(\frac{2}{3}\right)^N = \frac{1}{3} \cdot \frac{1}{1 - \frac{2}{3}} = 1. \tag{32.12} \]
This implies however that
\[ \lambda^{(1)}(C) = 0. \tag{32.13} \]
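The construction of the sets $C_N$ can be replayed with exact rational arithmetic. The sketch below is illustrative: the names `cantor_step` and `cantor_level` are hypothetical, and intervals are represented simply as pairs of end points.

```python
from fractions import Fraction

def cantor_step(intervals: list[tuple[Fraction, Fraction]]) -> list[tuple[Fraction, Fraction]]:
    """Remove the open middle third from every closed interval in the list."""
    result = []
    for left, right in intervals:
        third = (right - left) / 3
        result.append((left, left + third))
        result.append((right - third, right))
    return result

def cantor_level(n: int) -> list[tuple[Fraction, Fraction]]:
    """End points of the 2^n closed intervals whose union is C_n."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        intervals = cantor_step(intervals)
    return intervals

if __name__ == "__main__":
    for n in range(6):
        level = cantor_level(n)
        remaining = sum(right - left for left, right in level)
        print(n, len(level), remaining, 1 - remaining)
```

The remaining length after N steps is $(2/3)^N$, so the removed length $1 - (2/3)^N$ tends to 1, in agreement with (32.12).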
Finally we observe that $C$ is not denumerable. For this we use first Theorem 18.33 which implies that every $x \in [0, 1]$ has a ternary or 3-adic representation
\[ x = \sum_{n=1}^{\infty} a_n 3^{-n}, \quad a_n \in \{0, 1, 2\}. \tag{32.14} \]
A different way to write $x$ in this representation is
\[ x = 0.a_1 a_2 a_3 \cdots, \quad a_n \in \{0, 1, 2\}, \tag{32.15} \]
and of course we identify
\[ x = 0.00 \ldots 01000 \ldots \quad \text{(1 is in position } k\text{)} \tag{32.16} \]
and
\[ y = 0.00 \ldots 00222 \ldots \quad \text{(first 2 is in position } k + 1\text{)}. \tag{32.17} \]
Using this identification, in $C_1$ we only find elements with first digit in the ternary representation being either 0 or 2. In $C_2$ we only find elements belonging to $C_1$ and with the second digit being either 0 or 2, and in $C_N$ we only have elements from $C_{N-1}$ with $N$th digit either 0 or 2. Thus $x \in C$ implies
\[ x = \sum_{n=1}^{\infty} a_n 3^{-n}, \quad a_n \in \{0, 2\}. \tag{32.18} \]
Conversely, every x with a representation (32.18) must belong to C. Now we

can use the proof of Theorem 18.35 to show that C is not denumerable. We
only have to restrict Ak−l in that proof to 0 or 2. Eventually, we have now
proved
Theorem 32.4. The Cantor set is a compact, non-denumerable null set.
This result tells us that sets being large when judged by their cardinality still can be small with respect to “length” or measure. Having these considerations in mind we return to monotone functions. In the following we consider monotone functions $f$ defined on a compact interval $[a, b]$ which are bounded. If $f$ is monotone decreasing then $-f$ is monotone increasing and hence when investigating the “smoothness” or “regularity” of a monotone function we can confine ourselves to increasing functions. Let $f : [a, b] \to \mathbb{R}$ be a bounded increasing function. Since $f$ is real-valued and increasing we have of course $f(a) \leq f(x) \leq f(b) < \infty$, i.e. $f$ is bounded, however sometimes we prefer to emphasise in this chapter the boundedness of $f$. From Problem 6 in Chapter 20 we know that for $x_0 \in (a, b)$
\[ f(x_0+) := \lim_{\substack{x \to x_0 \\ x > x_0}} f(x) = \inf\{f(x) \mid x_0 < x \leq b\} \tag{32.19} \]
and
\[ f(x_0-) := \lim_{\substack{x \to x_0 \\ x < x_0}} f(x) = \sup\{f(x) \mid a \leq x < x_0\} \tag{32.20} \]
exist and the following must hold
\[ f(x_0-) \leq f(x_0) \leq f(x_0+). \tag{32.21} \]
We call
\[ [f](x_0) := f(x_0+) - f(x_0-) \geq 0 \tag{32.22} \]
the jump of $f$ at $x_0$. In part b) of Problem 6 in Chapter 20 we have proved that $f$ can only have finitely many jumps larger than a given $\eta > 0$. Indeed, since $f$ is bounded, there exists $n_0 \in \mathbb{N}$ such that
\[ n_0 \eta \geq f(b) - f(a), \]
implying that an upper bound for the number of jumps of size larger than $\eta$ is the largest $n \in \mathbb{N}$ such that $n\eta \leq f(b) - f(a)$. This implies also, again see Problem 6 in Chapter 20, that $f$ can have at most countably many jumps, i.e. outside a countable set $f$ is continuous.
Lemma 32.5. Let $f : [a, b] \to \mathbb{R}$ be a bounded increasing function. For $a = x_0 < x_1 < \cdots < x_n < x_{n+1} = b$ we have
\[ (f(a+) - f(a)) + \sum_{k=1}^{n} [f](x_k) + (f(b) - f(b-)) \leq f(b) - f(a). \tag{32.23} \]
Proof. Let $y_k \in (x_k, x_{k+1})$, $k = 0, \ldots, n$. It follows for $k = 1, \ldots, n$ that
\[ f(x_k+) - f(x_k-) \leq f(y_k) - f(y_{k-1}), \]
and further
\[ f(a+) - f(a) \leq f(y_0) - f(a), \]
\[ f(b) - f(b-) \leq f(b) - f(y_n), \]
and adding these inequalities yields (32.23), since the right hand sides telescope to $f(b) - f(a)$.
Suppose that $f$ has countably many jumps occurring at $x_j$, $j \in \mathbb{N}$, $a < x_1 < \cdots < x_{j-1} < x_j$, $x_j < b$. For $N \in \mathbb{N}$ denote by $S_N := \{x_1^N, \ldots, x_{k(N)}^N\}$ the finite subset of $\{x_j \mid j \in \mathbb{N}\}$ corresponding to jumps of size larger than $\frac{1}{N}$. Clearly we have $S_N \subset S_{N+1}$ and $\bigcup_{N\in\mathbb{N}} S_N = \{x_j \mid j \in \mathbb{N}\}$. For $S_N$ inequality (32.23) holds and the sequence $\left( \sum_{j=1}^{k(N)} [f](x_j^N) \right)_{N\in\mathbb{N}}$ is increasing. Since this sequence is also bounded it converges and in the limit we obtain
\[ (f(a+) - f(a)) + \sum_{k=1}^{\infty} [f](x_k) + (f(b) - f(b-)) \leq f(b) - f(a). \tag{32.24} \]
Definition 32.6. Let $f : [a, b] \to \mathbb{R}$ be a bounded increasing function. We define the corresponding jump function $s_f : [a, b] \to \mathbb{R}$ by
\[ s_f(x) := \begin{cases} 0, & x = a \\ (f(a+) - f(a)) + \sum_{y < x} [f](y) + (f(x) - f(x-)), & a < x \leq b. \end{cases} \tag{32.25} \]
Note that since $f$ has at most countably many jumps, say $x_1 < x_2 < \cdots < x_j < \cdots$, the sum in (32.25) stands for
\[ \sum_{x_j < x} [f](x_j). \]
Of interest is now
Theorem 32.7. Let $f : [a, b] \to \mathbb{R}$ be a bounded increasing function and $s_f$ its jump function. The function $\varphi_f : [a, b] \to \mathbb{R}$ defined by
\[ \varphi_f(x) := f(x) - s_f(x) \tag{32.26} \]
is increasing and continuous.
Proof. Let $a \leq x < y \leq b$. We apply (32.24) to the interval $[x, y]$ and obtain
\[ s_f(y) - s_f(x) \leq f(y) - f(x), \tag{32.27} \]
which implies $\varphi_f(y) - \varphi_f(x) \geq 0$, i.e. $\varphi_f$ is increasing. Further, passing in (32.27) to the limit $y \to x$ we find
\[ s_f(x+) - s_f(x) \leq f(x+) - f(x), \]
but the definition of $s_f$ implies
\[ f(x+) - f(x) \leq s_f(y) - s_f(x), \]
which gives for $y \to x$
\[ f(x+) - f(x) \leq s_f(x+) - s_f(x), \]
or $f(x+) - f(x) = s_f(x+) - s_f(x)$, i.e. $\varphi_f(x+) = \varphi_f(x)$. Analogously we may prove $\varphi_f(x-) = \varphi_f(x)$.
The jump function $s_f$ is the pointwise and monotone limit of the sequence $(S_N)_{N\in\mathbb{N}}$, $S_N(x) = \sum_{x_j^N < x} [f](x_j^N)$, each term of which is an increasing step function on $[a, b]$. Thus every bounded monotone increasing function is the sum of a continuous increasing function and a monotone limit of step functions.
Let $f : [a, b] \to \mathbb{R}$ be a bounded function and $a = x_0 < x_1 < \cdots < x_{n-1} < x_n = b$ be a finite partition $Z$ of $[a, b]$ for which we write as before $Z(x_0, \ldots, x_n)$. We can now form
\[ V_Z(f) := \sum_{k=0}^{n-1} |f(x_{k+1}) - f(x_k)|. \tag{32.28} \]
Definition 32.8. Let $f : [a, b] \to \mathbb{R}$ be a function.
A. By
\[ V(f) := \sup_Z V_Z(f) \tag{32.29} \]
we denote the total variation of $f$, where the supremum is taken over all (finite) partitions of $[a, b]$.
B. We call $f$ a function of bounded variation if $V(f) < \infty$. The set of all functions of bounded variation on $[a, b]$ is denoted by $BV([a, b])$.
Remark 32.9. A. Sometimes it is helpful to emphasise the interval $[a, b]$, and then we write
\[ V_a^b(f) := V(f), \quad f : [a, b] \to \mathbb{R}. \]
B. If $f \in BV([a, b])$ then $f|_{[c,d]} \in BV([c, d])$, $a \leq c < d \leq b$.
C. Some authors prefer to speak of functions of finite variation, but the symbol $BV$ is now widely used and therefore we prefer to call them functions of bounded variation.
Lemma 32.10. A function of bounded variation is bounded.
Proof. Let x ∈ [a, b]. Then a ≤ x ≤ b is a partition of [a, b] and therefore
|f (x) − f (a)| + |f (b) − f (x)| ≤ V (f )
which implies
|f (x)| ≤ |f (a)| + V (f ).
Proposition 32.11. If f : [a, b] → R is monotone then f belongs to BV ([a, b]).

Proof. Since $f \in BV([a, b])$ if and only if $-f \in BV([a, b])$ we may assume that $f$ is increasing. For $f$ increasing we have $f(x_{k+1}) - f(x_k) \geq 0$ for any two points $x_k < x_{k+1}$. Hence for every partition $Z$ of $[a, b]$ we find
\[ 0 \leq V_Z(f) = \sum_{k=0}^{n-1} (f(x_{k+1}) - f(x_k)) = f(b) - f(a), \]
implying that $\sup_Z V_Z(f)$ is finite.

Proposition 32.12. A Lipschitz continuous function f : [a, b] → R belongs
to BV ([a, b]).
Proof. For some κ ≥ 0 we have for all x, y ∈ [a, b]
|f (x) − f (y)| ≤ κ|x − y|, (32.30)
thus for a partition Z(x0 , . . . , xn ) of [a, b] we find
\[ V_Z(f) = \sum_{k=0}^{n-1} |f(x_{k+1}) - f(x_k)| \le \kappa \sum_{k=0}^{n-1} (x_{k+1} - x_k) = \kappa(b - a), \]
which yields
V (f ) ≤ κ(b − a).
Example 32.13. The continuous function f : [0, 1] → R defined by
\[ f(x) = \begin{cases} 0, & x = 0 \\ x \sin\frac{1}{x}, & x \in (0, 1] \end{cases} \]
is not of bounded variation. To see this, consider the partition x_0 = 0, x_j = 2/((2n − 2j + 1)π) for 1 ≤ j ≤ n − 1, and x_n = 1. With this partition we find for k = 1, . . . , n − 2 that
\[ |f(x_{k+1}) - f(x_k)| \ge 2x_k \]
and further we note that
\[ \lim_{n\to\infty} \frac{4}{\pi} \sum_{j=1}^{n-2} \frac{1}{2n - 2j + 1} = \frac{4}{\pi} \lim_{n\to\infty} \sum_{k=1}^{n-2} \frac{1}{2k + 1} = \infty, \]
hence f is not of bounded variation.
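The growth of V_Z(f) along the partitions of Example 32.13 can also be observed numerically. The following is only an illustrative sketch, not a proof; the function names `example_partition` and `variation` are our own choices and floating point arithmetic is assumed to be accurate enough for this purpose.

```python
import math


def f(x: float) -> float:
    """f(0) = 0 and f(x) = x*sin(1/x) on (0, 1], as in Example 32.13."""
    return 0.0 if x == 0.0 else x * math.sin(1.0 / x)


def example_partition(n: int) -> list[float]:
    """x_0 = 0, x_j = 2/((2n-2j+1)*pi) for 1 <= j <= n-1, x_n = 1 (n >= 2)."""
    inner = [2.0 / ((2 * n - 2 * j + 1) * math.pi) for j in range(1, n)]
    return [0.0] + inner + [1.0]


def variation(func, points: list[float]) -> float:
    """V_Z(func) for the partition Z given by the increasing list `points`."""
    return sum(abs(func(b) - func(a)) for a, b in zip(points, points[1:]))


if __name__ == "__main__":
    # The printed values keep growing (roughly like log n) instead of settling.
    for n in (10, 100, 1000, 10000):
        print(n, variation(f, example_partition(n)))
```

Since the partition for a larger n refines the one for a smaller n, the computed values can only increase, in agreement with the divergence of the sum above.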
Theorem 32.14. The set BV ([a, b]) with the natural pointwise operations
forms an algebra. In particular for f, g ∈ BV ([a, b]) and λ ∈ R we have
f + g, λf, f · g ∈ BV ([a, b]).
Proof. Clearly we need only to prove that f + g and f · g belong to BV([a, b]) if f, g ∈ BV([a, b]). For this let Z(x0, . . . , xn) be a partition of [a, b]. Since
\[ \sum_{k=1}^{n} |f(x_k) + g(x_k) - f(x_{k-1}) - g(x_{k-1})| \le \sum_{k=1}^{n} |f(x_k) - f(x_{k-1})| + \sum_{k=1}^{n} |g(x_k) - g(x_{k-1})| \]
we conclude first that
\[ V_Z(f + g) \le V_Z(f) + V_Z(g), \]
and then by taking the supremum over all partitions of [a, b] we get
\[ V(f + g) \le V(f) + V(g). \]
Furthermore, since f and g are bounded by Lemma 32.10, we have
\[ \begin{aligned} |f(x_k)g(x_k) - f(x_{k-1})g(x_{k-1})| &\le |f(x_k)g(x_k) - f(x_{k-1})g(x_k)| + |f(x_{k-1})g(x_k) - f(x_{k-1})g(x_{k-1})| \\ &\le \|g\|_\infty |f(x_k) - f(x_{k-1})| + \|f\|_\infty |g(x_k) - g(x_{k-1})|, \end{aligned} \]
implying
\[ V(f \cdot g) \le \|g\|_\infty V(f) + \|f\|_\infty V(g). \]
The next result gives a surprising characterisation of functions of bounded

variation.
Theorem 32.15. For f ∈ BV([a, b]) there exist two monotone increasing functions g, h : [a, b] → R such that f = g − h.
Proof. Given f ∈ BV([a, b]) we define v_f : [a, b] → R by v_f(x) = V_a^x(f). Clearly we have v_f(a) = 0, v_f(b) = V(f) and v_f(x) ≤ v_f(y) for x < y, i.e. v_f is increasing. We define g := v_f and h := v_f − f and find immediately that f = g − h. It remains to prove that h is increasing, i.e. that f − v_f is decreasing. For a ≤ x < y ≤ b we have
\[ f(y) - f(x) \le V_x^y(f) = v_f(y) - v_f(x), \]
implying
\[ -h(y) = f(y) - v_f(y) \le f(x) - v_f(x) = -h(x), \]
i.e. −h is decreasing as claimed.
Corollary 32.16. The following holds
BV ([a, b]) = {g − h|g, h : [a, b] → R are increasing }. (32.31)
An immediate consequence of Corollary 32.16 is that f ∈ BV([a, b]) has at most countably many jump discontinuities and further that the limits
\[ f(x_0+) = \lim_{\substack{x \to x_0 \\ x > x_0}} f(x), \qquad f(x_0-) = \lim_{\substack{x \to x_0 \\ x < x_0}} f(x) \tag{32.32} \]
exist for every x0 ∈ (a, b), as do the limits f(a+) and f(b−).
We have now the following situation: BV ([a, b]) is a vector space, in fact
an algebra, and every element in BV ([a, b]) is Riemann integrable. However
certain results that we have considered for continuous, integrable functions do
not hold, for example the fundamental theorem, or rules such as integration
by parts, since for this result we need differentiability. A natural question is to what extent we can “rescue” these results, i.e. can we find an extension of our theory of integration which will allow us to prove these results, perhaps with some generalised interpretation? It turns out that we can achieve this; however, we will need the Lebesgue measure and we will take up this problem in Volume 3.
We know that BV([a, b]) ∪ C([a, b]) is a subset of the set of all Riemann integrable functions. The following result gives a characterisation of Riemann integrable functions, a proof of which we will give in Volume 3.
Theorem 32.17. A bounded function f : [a, b] → R is Riemann integrable

if and only if the set
Ds (f ) := {x ∈ [a, b]|f is not continuous at x}
is a null set.
Problems
1. Define f : [0, 1] → R by f(0) = 0 and f(x) = x cos(π/x) for 0 < x ≤ 1 and prove that f is continuous but not of bounded variation. Hint: consider the partition 0 < 1/(2k) < 1/(2k − 1) < · · · < 1/3 < 1/2 < 1.
2. Show that if f, g ∈ BV([a, b]) then g⁺, g⁻, |g|, max(f, g) and min(f, g) all belong to BV([a, b]) too. Hint: prove first that |g| ∈ BV([a, b]) and use the fact that BV([a, b]) is a vector space. Recall the representation of max and min using | · |.
3. Suppose that g ∈ BV([a, b]) and inf |g| > 0. Prove that 1/g ∈ BV([a, b]).
4. Let f ∈ C([a, b]) and F(x) := ∫_a^x f(t) dt, x ∈ [a, b]. Show that F ∈ BV([a, b]) and V(F) = ∫_a^b |f(t)| dt.
5. We call f : [a, b] → R, a < b, absolutely continuous if for every ε > 0 there exists δ > 0 such that for all m ∈ N and any choice of pairwise disjoint open intervals (a_j, b_j) ⊂ [a, b], j = 1, . . . , m, the estimate
\[ \sum_{j=1}^{m} (b_j - a_j) < \delta \quad \text{implies} \quad \sum_{j=1}^{m} |f(b_j) - f(a_j)| < \varepsilon. \]
a) Prove that every absolutely continuous function is continuous.

b) Prove that every Lipschitz continuous function is absolutely con-
tinuous.
c) Prove that an absolutely continuous function is of bounded vari-
ation.
6. Show that the absolutely continuous functions on [a, b] form an algebra.
7. Let f ∈ C([a, b]) ∪ BV([a, b]) and prove that F : [a, b] → R, F(x) := ∫_a^x f(t) dt, is absolutely continuous.
Appendices
The material collected in the following appendices is additional: it is either material which students should have learnt by now and which therefore serves as a reminder, or it is material which will be taught in more detail elsewhere. In the latter case the material introduced will be brief, omitting proofs and examples. In some of the appendices, however, we handle in more detail additional aspects of material treated within the main text, or we provide proofs of results that are only cited in the main text.
Appendix I: Elementary Aspects of Mathematical Logic
Before we consider elementary concepts of logic, let us make some general
remarks. A fair summary of our knowledge of the foundations of knowledge
could be: at the beginning there was no beginning. This perhaps bizarre
statement reflects the fact that there is no point zero to start with, maybe
the central insight of philosophy. This has of course impacted on how we
think about mathematics. However we must start somewhere. By experience
and taking into account the historical development of the subject the best
way to start is by taking certain facts for granted and then to investigate
the consequences and the nature or essence of these facts. However this may
result in severe changes of what we initially took for granted.
A major problem is that we have to use our everyday language to formulate these facts and the objectives we are interested in, but our everyday language is not precise.
The beginner in mathematics normally encounters these problems when learning how to use mathematical logic and when learning naïve set theory. A suggestion to help in learning these topics is: make a start and then return to these problems occasionally.
The first basic idea we need is that of a (mathematical) statement. We define

it as follows:
A statement is a sentence which is either true or false.
Mathematics is concerned with deriving new correct (mathematical) state-
ments from given ones. We usually denote a statement by p, q or r. Every
statement creates a new statement, its negation ¬p (read: not p):
¬p is true if p is false and it is false if p is true.
We may present this definition in an easy way by using a truth table
p ¬p
T F
F T
Table A.I.1
Given several statements we may try to combine them to get compound

statements. There are three basic ways to combine two statements p and q:
Conjunction: p ∧ q (read: p and q);
Disjunction: p ∨ q (read: p or q);
Implication: p =⇒ q (read: p implies q).
Our main task is to decide, i.e. to define, when these new statements are
true and when they are false. Here are the truth tables for conjunction,
disjunction and implication. Conjunction:
p q p∧q
T T T
T F F
F T F
F F F
Table A.I.2
Thus the conjunction p ∧ q is true if and only if both p and q are true.
Disjunction:
p q p∨q
T T T
T F T
F T T
F F F
Table A.I.3
Hence the disjunction p ∨ q is true when at least one of p and q is true.
Implication:
p q p =⇒ q
T T T
T F F
F T T
F F T
Table A.I.4
In the case of implication we have a surprising result: we would expect an

implication to be true when both p and q are true, i.e. when the premise p
and the conclusion q are true. However, we can also define that whatever
the conclusion is, true or false, it may be derived from a false premise, i.e.
p being false also leads to a true statement p =⇒ q independent of q.
There are two further formulations related to the implication: we call p a
sufficient condition for q (read: p is sufficient for q), and we call q a
necessary condition for p (read: q is necessary for p). We next want to
introduce a further compound statement, but one might have different views
on its place in the system. We are speaking about the equivalence of two
statements p and q for which we write
p ⇐⇒ q (read: p is equivalent to q),
and which we define by
p q p ⇐⇒ q
T T T
T F F
F T F
F F T
Table A.I.5
Thus p is equivalent to q if both are true or both are false, but this is not
really what we mean when saying that p is equivalent to q. What we really
mean is the following
(p =⇒ q) ∧ (q =⇒ p), (A.I.1)
i.e. p implies q and q implies p. Taking (A.I.1) as the definition for equiva-
lence, then we may introduce a new notation, namely
p ⇐⇒ q if and only if (p =⇒ q) ∧ (q =⇒ p). (A.I.2)
Note that Tables A.I.2 and A.I.3 imply
(p ∧ q) ⇐⇒ (q ∧ p) (A.I.3)
and
(p ∨ q) ⇐⇒ (q ∨ p). (A.I.4)
We want to study some of these compound statements in more detail. We
start with the negation of negation, i.e. ¬(¬p) with truth table
p ¬p ¬(¬p)
T F T
F T F
Table A.I.6
Thus ¬(¬p) is equivalent to p, i.e. ¬(¬p) is true if p is true and false if p is

false. Next we consider p ∧ (¬p):
p ¬p p ∧ (¬p)
T F F
F T F
Table A.I.7
This statement is always false. Conversely, when looking at p ∨ (¬p) we find
p ¬p p ∨ (¬p)
T F T
F T T
Table A.I.8
Thus the statement p ∨ (¬p) is always true, i.e. given any statement, either
p or ¬p is true, there is no other possibility. This fact is called the law
of the excluded middle or tertium non datur . But note: the law of the
excluded middle depends on the fact that any statement is only allowed to
be true or false. As soon as we allow a third option we cannot prove the law
of the excluded middle. When we take the negation of p ∨ (¬p) we get the
statement ¬(p ∨ (¬p)) which is always false. Therefore having Table A.I.7 in
mind we find
(¬(p ∨ (¬p))) ⇐⇒ (p ∧ (¬p)). (A.I.5)
We clearly apply the fact that two compound statements are equivalent if
they have identical truth tables.
Of interest are the two laws dealing with negation of conjunction and dis-
junctions. They are called de Morgan’s laws and they state
(¬(p ∧ q)) ⇐⇒ ((¬p) ∨ (¬q)) (A.I.6)
and
(¬(p ∨ q)) ⇐⇒ ((¬p) ∧ (¬q)). (A.I.7)
Note that (A.I.5) follows from (A.I.7) with ¬p instead of q. A further im-
portant conclusion we can make from the negation of the implication is
(¬(p =⇒ q)) ⇐⇒ (p ∧ (¬q)). (A.I.8)
Thus instead of proving that p does not imply q, we may prove that p and
¬q are true. Since (A.I.8) implies
(p =⇒ q) ⇐⇒ ((¬p) ∨ q) (A.I.9)
instead of proving that p implies q we may prove that either ¬p or q is true.

In addition we have
(p =⇒ q) ⇐⇒ (¬q =⇒ (¬p)), (A.I.10)
i.e. instead of proving p implies q we may prove that ¬q implies ¬p. The
equivalence (A.I.10) is known as contra-position and a proof using ¬q =⇒
¬p instead of p =⇒ q is called a proof by contra-position. Combining (A.I.8)
with the law of the excluded middle we obtain a very powerful method for
proving statements: reductio ad absurdum or proof by contradiction. Here
is the method:
Suppose we want to prove that p =⇒ q. Instead we assume that ¬q is

true and p is true. Now we try to construct a contradiction to the statement
(¬q) ∧ p, i.e. we prove that (¬q) ∧ p is false. This implies by (A.I.8) that
¬(p =⇒ q) is false too. Hence by the law of the excluded middle p =⇒ q
is true.
The implication has a further very useful property
((p =⇒ q) ∧ (q =⇒ r)) =⇒ (p =⇒ r), (A.I.11)
i.e. if p implies q and q implies r, then p must imply r. In fact, most if not
all proofs rely on a finite number of applications of (A.I.11).
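Since a compound statement built from finitely many statements is determined by its truth table, equivalences such as (A.I.6) to (A.I.11) can be checked mechanically. The following sketch does this by brute force; the names `IMPLICATION_TABLE`, `implies`, `is_tautology` and `LAWS` are our own and only serve this illustration.

```python
from itertools import product

# Table A.I.4, written out row by row: (p, q) -> truth value of p => q.
IMPLICATION_TABLE = {
    (True, True): True,
    (True, False): False,
    (False, True): True,
    (False, False): True,
}


def implies(p: bool, q: bool) -> bool:
    """Look up the truth value of p => q in Table A.I.4."""
    return IMPLICATION_TABLE[(p, q)]


def is_tautology(formula, arity: int) -> bool:
    """True if `formula` is true for every assignment of truth values."""
    return all(formula(*row) for row in product((True, False), repeat=arity))


# Equivalence of two statements is expressed by `==` on truth values.
LAWS = {
    "(A.I.6)": (lambda p, q: (not (p and q)) == ((not p) or (not q)), 2),
    "(A.I.7)": (lambda p, q: (not (p or q)) == ((not p) and (not q)), 2),
    "(A.I.8)": (lambda p, q: (not implies(p, q)) == (p and (not q)), 2),
    "(A.I.9)": (lambda p, q: implies(p, q) == ((not p) or q), 2),
    "(A.I.10)": (lambda p, q: implies(p, q) == implies(not q, not p), 2),
    "(A.I.11)": (
        lambda p, q, r: implies(implies(p, q) and implies(q, r), implies(p, r)),
        3,
    ),
}

if __name__ == "__main__":
    for name, (formula, arity) in LAWS.items():
        print(name, is_tautology(formula, arity))
```

By contrast, the statement (p =⇒ q) ⇐⇒ (q =⇒ p) fails this check, which reflects the fact that an implication and its converse are in general not equivalent.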
The following considerations are more involved and often cause some prob-
lems to begin with. We must learn to work with statements which include
quantifiers. To explain this in more detail we need to consider some set
theory.
Let X be a non-empty set. Often we need to consider for each x ∈ X a
statement p which depends on x. For this we write p(x), for example if
X = N the statement could be:
p(n) : n is a prime number.
(Note that we do not interpret p(x) as the value of a mapping at x ∈ X. The

co-domain of such a function must be (a subset of) the set of all statements
and this is a construction we wish to avoid.)
Now let a set X ≠ ∅ and a family of statements p(x), x ∈ X, be given. An all-statement is a statement of the type
for all x ∈ X the statement p(x) is true. (A.I.12)
For example we may consider
for all n ∈ N it is true that n ≥ 0,
here X = N and p(n) is the statement that n ≥ 0.

An existence-statement is a statement of the type
there exists x ∈ X such that p(x) is true. (A.I.13)
For example we may consider
there exists z ∈ Z such that z ≤ 0,
where now X = Z and p(z) is the statement z ≤ 0. For all-statements and

existence statements a new notation is introduced. For (A.I.12) we write
∀x ∈ X : p(x), (A.I.14)
and for (A.I.13) we write

∃x ∈ X : p(x) (A.I.15)
The symbol “∀” is called the all-quantifier and the symbol “∃” is called the
existence-quantifier. Next we may form compound statements involving
quantifiers, for example
∀x ∈ R : (∃n ∈ N : n ≥ x), (A.I.16)
for which we may also write
∀x ∈ R ∃n ∈ N : n ≥ x.
Another example is
∃M ∈ R : (∀x ∈ R : | sin x| ≤ M), (A.I.17)
for which we often write
∃M ∈ R ∀x ∈ R : | sin x| ≤ M.
The rules for negation of statements involving quantifiers are
¬(∀x ∈ X : p(x)) ⇐⇒ (∃x ∈ X : ¬p(x)) (A.I.18)
and
¬(∃x ∈ X : p(x)) ⇐⇒ (∀x ∈ X : ¬p(x)). (A.I.19)
Thus the negation of (A.I.17) is
¬(∃M ∈ R : (∀x ∈ R : | sin x| ≤ M))
⇐⇒ ∀M ∈ R : ¬(∀x ∈ R : | sin x| ≤ M)
⇐⇒ ∀M ∈ R : (∃x ∈ R : | sin x| > M),
and since (A.I.17) is true, just take M = 1, the last statement is of course
false.
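On a finite set X the negation rules (A.I.18) and (A.I.19) can be tried out directly. The sketch below is only an illustration under this finiteness assumption (the sets R and N themselves cannot be enumerated); the helper names `for_all` and `exists` and the chosen sample sets are our own.

```python
import math


def for_all(domain, predicate) -> bool:
    """All-statement: predicate(x) is true for all x in the finite domain."""
    return all(predicate(x) for x in domain)


def exists(domain, predicate) -> bool:
    """Existence-statement: predicate(x) is true for some x in the domain."""
    return any(predicate(x) for x in domain)


def is_prime(n: int) -> bool:
    return n >= 2 and all(n % d != 0 for d in range(2, int(n ** 0.5) + 1))


X = range(1, 31)

# (A.I.18): not (for all x: p(x))  <=>  there exists x: not p(x)
lhs_18 = not for_all(X, is_prime)
rhs_18 = exists(X, lambda n: not is_prime(n))

# (A.I.19): not (there exists x: p(x))  <=>  for all x: not p(x)
lhs_19 = not exists(X, is_prime)
rhs_19 = for_all(X, lambda n: not is_prime(n))

# A finite stand-in for (A.I.17) and its negation.
bounds = [0.5, 1.0, 2.0]
samples = [k * 0.1 for k in range(100)]
statement = exists(bounds, lambda M: for_all(samples, lambda x: abs(math.sin(x)) <= M))
negation = for_all(bounds, lambda M: exists(samples, lambda x: abs(math.sin(x)) > M))

if __name__ == "__main__":
    print(lhs_18, rhs_18, lhs_19, rhs_19, statement, negation)
```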
Note: symbols such as ¬, ∧, ∨, =⇒ , ⇐⇒, ∀, ∃ have their meaning in a

formal language or in a formal mathematical context. They are not abbrevi-
ations. In our course, wherever possible, we try to avoid using these symbols.
Clearly, we do not and cannot avoid the ideas of negations, conjunctions, dis-
junctions, implications, equivalences, all-statements or existence-statements.
We believe however that to begin with it is better to use the longhand ap-
proach, thus for (A.I.17) we write:
there exists M ∈ R such that for all x ∈ R it follows that | sin x| ≤ M,
whereas the negation of this statement reads as
for all M ∈ R there exists x ∈ R such that | sin x| > M.
Appendix II: Sets and Mappings. A Collection of Formulae
In this appendix we give a collection of formulae on set operations and prop-
erties of mappings which every mathematics student should eventually know
and be able to work with. (In compiling this list we followed closely J.
Dieudonné [2].) Many of these formulae have already been used and some of
them have been proved in Part 1, partly in the solved exercises. At the end
of this appendix we will pick up some of the principal ideas of the proofs of
these statements.
Elementary Operations for Sets
X \ X = ∅ and X \ ∅ = X; (A.II.1)
X ∪ X = X and X ∩ X = X; (A.II.2)
X ∪ Y = Y ∪ X and X ∩ Y = Y ∩ X; (A.II.3)
The statements X ⊂ Y, X ∪ Y = Y, X ∩ Y = X are equivalent; (A.II.4)
The statements X ⊂ X ∪ Y and X ∩ Y ⊂ X are equivalent; (A.II.5)
X ⊂ Z and Y ⊂ Z if and only if X ∪ Y ⊂ Z; (A.II.6)
Z ⊂ X and Z ⊂ Y if and only if Z ⊂ X ∩ Y ; (A.II.7)
X ∪ (Y ∪ Z) = (X ∪ Y ) ∪ Z, i.e. X ∪ Y ∪ Z makes sense; (A.II.8)
X ∩ (Y ∩ Z) = (X ∩ Y ) ∩ Z, i.e. X ∩ Y ∩ Z makes sense; (A.II.9)
X ∪ (Y ∩ Z) = (X ∪ Y ) ∩ (X ∪ Z) and X ∩ (Y ∪ Z) = (X ∩ Y ) ∪ (X ∩ Z);
(A.II.10)
if X ⊂ E and Y ⊂ E, then, writing X^∁ := E \ X for the complement in E,
(X^∁)^∁ = X, (X ∪ Y)^∁ = X^∁ ∩ Y^∁, (X ∩ Y)^∁ = X^∁ ∪ Y^∁; (A.II.11)
X ⊂ Y ⊂ E is equivalent to Y^∁ ⊂ X^∁; (A.II.12)
if X ⊂ E and Y ⊂ E then X ∩ Y = ∅ if and only if X ⊂ Y^∁; (A.II.13)
if X ⊂ E and Y ⊂ E then X ∪ Y = E if and only if X^∁ ⊂ Y, and Y^∁ ⊂ X; (A.II.14)
X × Y = ∅ if and only if X = ∅ or Y = ∅; (A.II.15)
if X × Y ≠ ∅ then X × Y ⊂ X′ × Y′ if and only if X ⊂ X′ and Y ⊂ Y′; (A.II.16)
(X × Y) ∪ (X′ × Y) = (X ∪ X′) × Y; (A.II.17)
(X × Y) ∩ (X′ × Y′) = (X ∩ X′) × (Y ∩ Y′); (A.II.18)
(X × Y) × Z := X × Y × Z. (A.II.19)
Mappings
For Z := X × Y we define
pr1 : Z → X, (x, y) ↦ x and pr2 : Z → Y, (x, y) ↦ y.
For a mapping F : X → Y we denote by
F(A) = {y ∈ Y | y = F(x) and x ∈ A ⊂ X} ⊂ Y
the image of A ⊂ X, and by
F⁻¹(A′) = {x ∈ X | y = F(x) and y ∈ A′ ⊂ Y} ⊂ X
the pre-image of A′ ⊂ Y. Further we write
Γ(F) = {(x, F(x)) | x ∈ X}
for the graph of F. We will write F⁻¹(y) for F⁻¹({y}).
F (A) = pr2 (Γ(F ) ∩ (A × Y )); (A.II.20)
A = ∅ if and only if F (A) = ∅; (A.II.21)

F ({x}) = {F (x)} for all x ∈ X; (A.II.22)
A ⊂ B implies F (A) ⊂ F (B); (A.II.23)
F (A ∩ B) ⊂ F (A) ∩ F (B); (A.II.24)
F (A ∪ B) = F (A) ∪ F (B); (A.II.25)
F⁻¹(A′) = pr1(Γ(F) ∩ (X × A′)); (A.II.26)
F⁻¹(A′) = F⁻¹(A′ ∩ F(X)); (A.II.27)
F⁻¹(∅) = ∅ (A.II.28)
(but note: F⁻¹(A′) = ∅ does not imply A′ = ∅);
A′ ⊂ B′ implies F⁻¹(A′) ⊂ F⁻¹(B′); (A.II.29)
F⁻¹(A′ ∩ B′) = F⁻¹(A′) ∩ F⁻¹(B′); (A.II.30)
F⁻¹(A′ ∪ B′) = F⁻¹(A′) ∪ F⁻¹(B′); (A.II.31)
F⁻¹(A′ \ B′) = F⁻¹(A′) \ F⁻¹(B′) if B′ ⊂ A′; (A.II.32)
F(F⁻¹(A′)) = A′ ∩ F(X) for A′ ⊂ Y; (A.II.33)
A ⊂ F⁻¹(F(A)) for A ⊂ X; (A.II.34)
pr1⁻¹(A) = A × Y for A ⊂ X; (A.II.35)
pr2⁻¹(A′) = X × A′ for A′ ⊂ Y; (A.II.36)
C ⊂ pr1(C) × pr2(C) for C ⊂ X × Y. (A.II.37)
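For mappings between finite sets the rules for images and pre-images can be tested on concrete examples. In the following sketch a mapping F : X → Y is modelled, as an assumption of this illustration, by a Python dictionary; the helpers `image` and `preimage` and the particular sets are hypothetical choices.

```python
def image(F: dict, A: set) -> set:
    """F(A) = {F(x) | x in A} for a subset A of the domain of F."""
    return {F[x] for x in A}


def preimage(F: dict, A_prime: set) -> set:
    """Pre-image of A_prime: all x in the domain of F with F(x) in A_prime."""
    return {x for x in F if F[x] in A_prime}


# A mapping F : X -> Y which is neither injective nor surjective.
F = {1: "a", 2: "a", 3: "b", 4: "b"}
X = set(F)
Y = {"a", "b", "c"}
A, B = {1, 3}, {2, 3}

if __name__ == "__main__":
    # (A.II.24): the inclusion can be strict.
    print(image(F, A & B), image(F, A) & image(F, B))
    # (A.II.25): equality for unions.
    print(image(F, A | B) == image(F, A) | image(F, B))
    # Remark after (A.II.28): an empty pre-image of a non-empty set.
    print(preimage(F, {"c"}))
    # (A.II.33) and (A.II.34).
    print(image(F, preimage(F, {"b", "c"})) == {"b", "c"} & image(F, X))
    print({1} <= preimage(F, image(F, {1})))
```

The example was chosen so that the inclusions in (A.II.24) and (A.II.34) are strict, which shows that they cannot be improved to equalities in general.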
If F : X → Y and G : Y → Z we define the composition H := G ◦ F by
H : X → Z, x ↦ H(x) = G(F(x)).
H(A) = G(F(A)) for A ⊂ X; (A.II.38)
H⁻¹(A′) = F⁻¹(G⁻¹(A′)) for A′ ⊂ Z; (A.II.39)
if F and G are injective (surjective, bijective) then
H = G ◦ F is injective (surjective, bijective); (A.II.40)
if F : X → Y is bijective we denote its inverse mapping by F⁻¹ : Y → X
(A.II.41)
(this does not cause any trouble with the notation for the pre-image because in this case the pre-image of one point is a set containing exactly one point). For a bijective mapping we have
F ◦ F⁻¹ = id_Y and F⁻¹ ◦ F = id_X,
where id_Y is the identity on Y and id_X is the identity on X, respectively.
Families of Sets
In the following I and J are arbitrary index sets and (Ai )i∈I and (Bj )j∈J are
families of sets. We define the union and the intersection of such families by:

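\[ \bigcup_{i \in I} A_i := \{x \mid x \in A_i \text{ for some } i \in I\}; \]
\[ \bigcap_{i \in I} A_i := \{x \mid x \in A_i \text{ for all } i \in I\}. \]
Clearly if I = {1, 2} then
\[ \bigcup_{i \in I} A_i = A_1 \cup A_2 \quad \text{and} \quad \bigcap_{i \in I} A_i = A_1 \cap A_2 \]
with the obvious generalisation to a finite index set I.
\[ \Bigl(\bigcup_{i \in I} A_i\Bigr)^{\complement} = \bigcap_{i \in I} A_i^{\complement}; \qquad \Bigl(\bigcap_{i \in I} A_i\Bigr)^{\complement} = \bigcup_{i \in I} A_i^{\complement}; \tag{A.II.42} \]
\[ \Bigl(\bigcup_{i \in I} A_i\Bigr) \cap \Bigl(\bigcup_{j \in J} B_j\Bigr) = \bigcup_{(i,j) \in I \times J} (A_i \cap B_j); \tag{A.II.43} \]
\[ \Bigl(\bigcap_{i \in I} A_i\Bigr) \cup \Bigl(\bigcap_{j \in J} B_j\Bigr) = \bigcap_{(i,j) \in I \times J} (A_i \cup B_j); \tag{A.II.44} \]
Let F : X → Y be a mapping and (A_i)_{i∈I} a collection of subsets of X and (A′_j)_{j∈J} a collection of subsets of Y.
\[ F\Bigl(\bigcup_{i \in I} A_i\Bigr) = \bigcup_{i \in I} F(A_i); \tag{A.II.45} \]
\[ F^{-1}\Bigl(\bigcup_{j \in J} A'_j\Bigr) = \bigcup_{j \in J} F^{-1}(A'_j); \tag{A.II.46} \]
\[ F^{-1}\Bigl(\bigcap_{j \in J} A'_j\Bigr) = \bigcap_{j \in J} F^{-1}(A'_j). \tag{A.II.47} \]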
If B ⊂ X is a subset and (A_i)_{i∈I} is a collection of subsets of X, i.e. A_i ⊂ X, then we call (A_i)_{i∈I} a covering of B if B ⊂ ⋃_{i∈I} A_i.
Denumerable Sets
Let X be any set. We call X denumerable if it consists either of finitely many elements or if there is a bijective mapping f : N → X. If we only have the latter case then we call X countable.
every subset of a denumerable set is denumerable; (A.II.48)
the sets N, Z and Q are countable; (A.II.49)
if X1, . . . , Xk, k ∈ N, are countable, then
\[ X_1 \times \cdots \times X_k = \prod_{j=1}^{k} X_j \text{ is countable too;} \tag{A.II.50} \]
the union of denumerably many denumerable sets is denumerable
and
the union of countably many countable sets is countable, (A.II.51)
i.e. if (X_j)_{j∈N} is a family of countable sets, then
\[ \bigcup_{j \in \mathbb{N}} X_j \]
is countable. (Note that instead of N we may take any countable index set.)
Next we want to give some hints on how to prove (in principle) statements
about sets and mappings when starting with the basics. There is a nat-
ural correspondence between certain logical operations and set theoretical
operations. Let us introduce the following statements
p : x ∈ X
q : x ∈ Y
then
x ∈ X ∩ Y ⇐⇒ p ∧ q
x ∈ X ∪ Y ⇐⇒ p ∨ q
x ∉ X ⇐⇒ ¬p
and if X ⊂ Z, Z fixed, we have
x ∈ X^∁ ⇐⇒ ¬p.
Further, if for some index set J, sets X_j, j ∈ J, are given and if
p_j : x ∈ X_j
then
\[ x \in \bigcap_{j \in J} X_j \iff \forall j \in J : p_j \]
and
\[ x \in \bigcup_{j \in J} X_j \iff \exists j \in J : p_j. \]
Now we may use truth tables to prove compound statements when finitely
many statements are involved. For example in order to prove the second
statement of (A.II.10), i.e.
X ∩ (Y ∪ Z) = (X ∩ Y ) ∪ (X ∩ Z)
we can look at
x ∈ X x ∈ Y x ∈ Z (x ∈ X) ∧ (x ∈ Y ∨ x ∈ Z) (x ∈ X ∧ x ∈ Y ) ∨ (x ∈ X ∧ x ∈ Z)
T T T T T
T T F T T
T F T T T
T F F F F
F T T F F
F T F F F
F F T F F
F F F F F
Table A.II.1
Since the last two columns coincide the two statements are equivalent. Moreover,
(x ∈ X) ∧ (x ∈ Y ∨ x ∈ Z) ⇐⇒ x ∈ X ∩ (Y ∪ Z)
and
(x ∈ X ∧ x ∈ Y ) ∨ (x ∈ X ∧ x ∈ Z) ⇐⇒ x ∈ (X ∩ Y ) ∪ (X ∩ Z).
Note: all statements about relations of sets given in our collection are state-
ments involving quantifiers, for example the above statement (A.II.10) is
equivalent to
∀x ∈ X ∪ Y ∪ Z : ((x ∈ X ∩ (Y ∪ Z)) ⇐⇒ (x ∈ (X ∩ Y ) ∪ (X ∩ Z))).
In our proof we only considered the equivalence for a single x, but since x
was arbitrary this means that we proved it for all x ∈ X ∪ Y ∪ Z.
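For subsets of a small finite set E, identities between sets can also be confirmed by checking all cases. The sketch below tests the two distributive laws X ∪ (Y ∩ Z) = (X ∪ Y) ∩ (X ∪ Z) and X ∩ (Y ∪ Z) = (X ∩ Y) ∪ (X ∩ Z) as well as de Morgan's laws for complements in E; the three-element set E and all function names are assumptions made for this illustration, and such a finite check does not replace the general proof.

```python
from itertools import combinations, product


def subsets(universe: frozenset) -> list[frozenset]:
    """All subsets of the finite set `universe`."""
    items = sorted(universe)
    return [
        frozenset(c)
        for r in range(len(items) + 1)
        for c in combinations(items, r)
    ]


def distributive_laws_hold(universe: frozenset) -> bool:
    """Check both distributive laws for every triple of subsets."""
    P = subsets(universe)
    return all(
        X | (Y & Z) == (X | Y) & (X | Z) and X & (Y | Z) == (X & Y) | (X & Z)
        for X, Y, Z in product(P, repeat=3)
    )


def de_morgan_laws_hold(universe: frozenset) -> bool:
    """Check de Morgan's laws for complements taken in `universe`."""
    P = subsets(universe)

    def comp(S: frozenset) -> frozenset:
        return universe - S

    return all(
        comp(X | Y) == comp(X) & comp(Y) and comp(X & Y) == comp(X) | comp(Y)
        for X, Y in product(P, repeat=2)
    )


E = frozenset({0, 1, 2})

if __name__ == "__main__":
    print(distributive_laws_hold(E), de_morgan_laws_hold(E))
```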
Although the method of truth tables will always provide a proof as long as
only finitely many statements are involved, it could be quite a time consuming
process to check all cases. For example to prove
(X1 × Y1 ) ∩ (X2 × Y2 ) = (X1 ∩ X2 ) × (Y1 ∩ Y2 ) (A.II.52)
one would have to complete a truth table with 16 rows. However, a short
and transparent proof is obtained by using step by step basic definitions and
simple rules for handling logical statements:
(x, y) ∈ (X1 × Y1 ) ∩ (X2 × Y2 )
⇐⇒ (x, y) ∈ (X1 × Y1 ) ∧ (x, y) ∈ (X2 × Y2 )
⇐⇒ x ∈ X1 ∧ y ∈ Y1 ∧ x ∈ X2 ∧ y ∈ Y2
⇐⇒ x ∈ (X1 ∩ X2 ) ∧ y ∈ (Y1 ∩ Y2 )
⇐⇒ (x, y) ∈ (X1 ∩ X2 ) × (Y1 ∩ Y2 ).
Since the pair (x, y) is arbitrary the statement (A.II.52) (which of course is
(A.II.18)) is proved. Similarly we can prove statements with quantifiers, for
example the first statement in (A.II.42):

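\[ \Bigl(\bigcup_{i \in I} A_i\Bigr)^{\complement} = \bigcap_{i \in I} A_i^{\complement}. \]
We have
\[ \begin{aligned} x \in \Bigl(\bigcup_{i \in I} A_i\Bigr)^{\complement} &\iff x \notin \bigcup_{i \in I} A_i \\ &\iff \neg(\exists i \in I : x \in A_i) \\ &\iff \forall i \in I : \neg(x \in A_i) \\ &\iff \forall i \in I : x \in A_i^{\complement} \\ &\iff x \in \bigcap_{i \in I} A_i^{\complement}. \end{aligned} \]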
The proofs for the statements listed above involving mappings are reduced
to statements for sets. For example the meaning of (A.II.24) is
y ∈ F (A ∩ B) =⇒ y ∈ F (A) ∩ F (B)
and in more detail
y ∈ F(A ∩ B) means y ∈ {ỹ ∈ Y | ∃x ∈ A ∩ B : F(x) = ỹ},
y ∈ F(A) means y ∈ {ỹ ∈ Y | ∃x′ ∈ A : F(x′) = ỹ},
y ∈ F(B) means y ∈ {ỹ ∈ Y | ∃x″ ∈ B : F(x″) = ỹ}.
Thus F(A ∩ B) ⊂ F(A) ∩ F(B) is the statement
{ỹ ∈ Y | ∃x ∈ A ∩ B : F(x) = ỹ} ⊂
{ỹ ∈ Y | ∃x′ ∈ A : F(x′) = ỹ} ∩ {ỹ ∈ Y | ∃x″ ∈ B : F(x″) = ỹ},
which holds since any x ∈ A ∩ B with F(x) = ỹ may serve as both x′ and x″.

The proofs for statements involving unions or intersections of arbitrary fam-
ilies of sets are similar but they will need quantifiers. Let us prove (A.II.46)

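\[ F^{-1}\Bigl(\bigcup_{j \in J} A'_j\Bigr) = \bigcup_{j \in J} F^{-1}(A'_j). \]
First note that this statement says
\[ x \in F^{-1}\Bigl(\bigcup_{j \in J} A'_j\Bigr) \iff x \in \bigcup_{j \in J} F^{-1}(A'_j). \]
Now, x ∈ F⁻¹(⋃_{j∈J} A′_j) means
\[ x \in \Bigl\{\tilde{x} \in X \Bigm| F(\tilde{x}) \in \bigcup_{j \in J} A'_j\Bigr\} \]
which is equivalent to
\[ x \in \{\tilde{x} \in X \mid \exists j \in J : F(\tilde{x}) \in A'_j\}, \]
but the meaning of x ∈ ⋃_{j∈J} F⁻¹(A′_j) is nothing but
\[ x \in \{\tilde{x} \in X \mid \exists j \in J : F(\tilde{x}) \in A'_j\} \]
and the statement is proved.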
As mentioned at the beginning of this appendix, we only want to indicate the principal strategies on how to prove the statements listed. The reader is encouraged to prove some of the other statements as an exercise.
Appendix III: The Peano Axioms

As we have stated previously, when starting to think about the foundations
of knowledge, in our case the foundations of mathematics, we must come to
the conclusion that “at the beginning there was no beginning”. To make a start, the axiomatic method, which nowadays is accepted by all mathematicians, suggests using a system of axioms, i.e. statements we accept as true without giving any justification or proof, as a starting point and drawing conclusions from these. Of course, a system of axioms should satisfy certain
conditions, for example it should not lead to (obvious) contradictions, axioms
must be “reasonable” statements etc. In Euclid’s geometry such an approach
had already been indicated, however he still partly tried to justify axioms or
relate the content of axioms to experience. Nowadays, systems of axioms are
seen to be completely independent of “exterior experiences”. The mystery is that such a method is extremely successful in providing the most powerful tools for science, engineering, economics etc., i.e. for real world problems. As E. Wigner put it, we have some “Unreasonable Effectiveness of Mathematics in Natural Sciences”.
For beginners in mathematics this method might seem unusual and requires
some time to be understood and appreciated. Therefore looking back at Part
1 we can see that we have not used the axiomatic approach to its full extent.
It is possible to introduce the natural numbers by a system of axioms in
such a way that a beginner should follow. Historically, this approach to the
natural numbers was one of the first axiomatic theories. Thus we dedicate
this appendix to an axiomatic introduction of the natural numbers. The
system of axioms in question are the Peano Axioms.
P.A.1
1 is a natural number.
P.A.2
For every natural number n there exists a unique natural number called the successor of n which is denoted by n′.
P.A.3
n′ ≠ 1 for all natural numbers n.
P.A.4
If n′ = m′ then n = m.
P.A.5 (Axiom of Induction)
Let M be a subset of the natural numbers such that:
• 1 ∈ M;
• if n ∈ M then n′ ∈ M.
Then M is the set of all natural numbers.
Of course, we denote as before the set of all natural numbers by N and further
2 := 1′, 3 := 2′, 4 := 3′ etc.
Here are some consequences of the Peano axioms:
Proposition A.III.1. A. For n, m ∈ N it follows that n ≠ m implies n′ ≠ m′.
B. For n ∈ N we have n′ ≠ n.
C. If n ≠ 1, n ∈ N, then there exists a unique m ∈ N such that n = m′.
Before we prove this proposition, let us consider some interpretations. P.A.2 states (by its uniqueness property) that if n = m then n′ = m′. Now part A of the proposition says that two distinct natural numbers have two distinct successors. Part B tells us that n is never its own successor, and part C states that every natural number n ≠ 1 is indeed a successor of another natural number.
Proof of Proposition A.III.1. A. Suppose that n ≠ m but n′ = m′. Then by P.A.4 it follows that n = m, which is a contradiction, hence n′ ≠ m′.
B. Let M be the set of all n ∈ N with n′ ≠ n, i.e. M = {n ∈ N | n′ ≠ n}. By P.A.1 and P.A.3 we have 1′ ≠ 1, implying 1 ∈ M. Further if n ∈ M, i.e. n′ ≠ n, then by part A it follows that (n′)′ ≠ n′, hence n′ ∈ M. Now P.A.5 implies M = N.
C. Let M be the set containing 1 and all n ∈ N such that there is m ∈ N with n = m′, i.e.
M = {1} ∪ {n ∈ N \ {1} | ∃m ∈ N : n = m′}.
Clearly 1 ∈ M. Furthermore, if n ∈ M then for m = n we find n′ = m′, i.e. n′ ∈ M. Now by P.A.5 we conclude that M = N. The uniqueness of m follows from P.A.4.
So far we have only defined a set N of natural numbers. Clearly we want

to add natural numbers together as we are used to. We achieve this by
introducing on N a binary operation which we call addition.
Theorem A.III.2. For every pair of natural numbers (n, m) there exists a unique natural number denoted by add(n, m) such that
add(n, 1) = n′ for every n ∈ N; (A.III.1)
and
add(n, m′) = (add(n, m))′ for all (n, m) ∈ N × N. (A.III.2)
Let us now try to understand how to proceed. First we introduce axiomati-

cally a set, called the natural numbers, denoted by N. We then introduce a
mapping from N × N to N
add : N × N → N (A.III.3)
by the two properties (A.III.1) and (A.III.2). Of course we have to prove

that such a mapping exists and is unique. This is what the above theorem
considers; however, we do not give the proof here. Once the theorem is proved, i.e. once we know there is such a binary operation add, we can start to study its properties. For simplicity we write from now on
n + m := add(n, m) (A.III.4)
and the task is to prove using P.A.1-P.A.5 and Theorem A.III.2 only prop-
erties such as
(k + m) + n = k + (m + n) associativity,
or
n + m = m + n commutativity.
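The two defining properties (A.III.1) and (A.III.2) translate directly into a recursive procedure. The following sketch models a natural number as either 1 or the successor of another natural number; the class `Nat` and the helpers `succ`, `from_int` and `to_int` are hypothetical names introduced only for this illustration, and testing a few cases is of course no substitute for the proofs by induction asked for above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Nat:
    """A natural number: 1 (pred is None) or the successor of `pred`."""
    pred: Optional["Nat"] = None


ONE = Nat()


def succ(n: Nat) -> Nat:
    """The successor n' of n."""
    return Nat(n)


def add(n: Nat, m: Nat) -> Nat:
    """add(n, 1) = n' (A.III.1) and add(n, m') = (add(n, m))' (A.III.2)."""
    if m.pred is None:  # m = 1
        return succ(n)
    return succ(add(n, m.pred))  # m = k' with k = m.pred


def from_int(k: int) -> Nat:
    """Build the natural number k >= 1 by applying succ (k - 1) times to 1."""
    n = ONE
    for _ in range(k - 1):
        n = succ(n)
    return n


def to_int(n: Nat) -> int:
    """Count how often succ was applied, for display purposes only."""
    count = 1
    while n.pred is not None:
        n = n.pred
        count += 1
    return count


if __name__ == "__main__":
    print(to_int(add(from_int(3), from_int(4))))
```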
E. Landau in [7] gives a very systematic way of introducing N, addition and the extension from N to Z as well as from Z to Q.
Finally we want to discuss how mathematical induction relates to the
Peano axioms. Recall that mathematical induction works as follows: suppose
that for n ∈ N a statement A(n) is given. If A(1) is true and if A(n) always
implies A(n + 1) then A(n) is true for all n ∈ N.
Denote by M the set of all natural numbers such that A(n) is true, i.e.
M := {n ∈ N|A(n) is true}.
We have to prove
(A(1) ∧ (∀n ∈ N : A(n) =⇒ A(n + 1))) =⇒ M = N.
Since 1 ∈ M by assumption and since n + 1 = n′ we know that n ∈ M implies n′ ∈ M. Hence by P.A.5 it follows that M = N.
Thus introducing N via the Peano axioms in an axiomatic way we can deduce
that mathematical induction is providing what we want.
A final remark: in Chapter 3 we have formulated the principle of mathemat-
ical induction for a more general starting point, say k ∈ Z. Of course we can
use the above argument to justify this formulation. We only need to make a
change of the enumeration index.
Appendix IV: Results from Elementary Geometry
Here we recollect some basic results from elementary geometry for reference
purposes. Typically students will have already met these results.
We first consider straight lines. Let g1 and g2 be two parallel lines in the
plane and h a straight line transverse to both g1 and g2 , see Figure A.IV.1
below.
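[Two parallel lines g1 and g2 cut by the transversal h, with the angles α1, . . . , α4 and β1, . . . , β4 at the two points of intersection]
Figure A.IV.1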
The following relations hold for the above angles:

α1 + α2 = π; (A.IV.1)
α1 = α3 and α2 = α4 ; (A.IV.2)
α1 = β1 , α2 = β2 , α3 = β3 , α4 = β4 ; (A.IV.3)
α1 = β3 , α2 = β4 , α3 = β1 , α4 = β2 . (A.IV.4)
Now, let ABC be a triangle in the plane, see Figure A.IV.2.
[Triangle ABC with sides a, b, c, angles α, β, γ and height h_c]
Figure A.IV.2
Note that
α + β + γ = π, (A.IV.5)
and for the area of ABC we have
\[ \operatorname{area}(ABC) = \frac{1}{2} h_c \cdot c \tag{A.IV.6} \]
where h_c is the height from C to AB. Clearly we have
\[ \frac{1}{2} h_c c = \frac{1}{2} h_b b = \frac{1}{2} h_a a = \operatorname{area}(ABC), \]
where hb and ha denote the heights from B to the side AC and A to the
side BC respectively. In the case of a right angled triangle ABC, see Figure
A.IV.3 we have Pythagoras’ theorem
a² + b² = c². (A.IV.7)
(Note that there is a slight abuse of notation here: a, b, c denote the sides in
ABC, whereas in (A.IV.7) we use the same symbols to denote the length of
these sides.)
[Right-angled triangle ABC with sides a, b, c, angles α, β and the right angle γ = π/2 at C]
Figure A.IV.3
Note that we use the “continental” way to indicate an angle of size π/2, see Figure A.IV.4.
[Symbol for a right angle]
Figure A.IV.4
Now let C_r(O) be a circle of radius r and centre O, see Figure A.IV.5.
[Circle C_r(O) of radius r and centre O]
Figure A.IV.5
Its area is given by

area(C_r(O)) = πr² (A.IV.8)
and its circumference ∂Cr (O) has length
length(∂Cr (O)) = 2πr. (A.IV.9)
There are two scales to measure the size of an angle in the unit circle, i.e. in C1(O): degrees and radians. In degrees, an angle is measured as a fraction of 360°, i.e. by definition we say that the full circle forms an angle of 360° and the size of α is just the corresponding fraction, for example a right angle has size 90°. A better way to do this is to take the length of the segment AB as a measure of α, see Figure A.IV.6.
[Unit circle (r = 1) with centre O, points A and B and the segment AB]
Figure A.IV.6

By the segment AB we mean the arc joining A and B, i.e. on ∂C_r(O). Often we say that the angle is measured by the arc length. This definition of the size of an angle implies the following correspondence:
π/6 ≅ 30°, π/4 ≅ 45°, π/3 ≅ 60°, π/2 ≅ 90°,
3π/4 ≅ 135°, π ≅ 180°, 3π/2 ≅ 270°, 2π ≅ 360°.

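For a circle $C_r(O)$, see Figure A.IV.7, the length of the arc $\widehat{AB}$ with angle $\alpha$ is given by
\[ \operatorname{length}(\widehat{AB}) = r\alpha \quad (\alpha \text{ measured by the arc length}) \tag{A.IV.10} \]
and the area of the sector $OAB$ is given by
\[ \operatorname{area}(OAB) = \frac{r^2 \alpha}{2} \quad (\alpha \text{ measured by the arc length}). \tag{A.IV.11} \]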
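The formulae (A.IV.8) to (A.IV.11) can be checked numerically. The following is only a small sketch in Python; the function names `arc_length` and `sector_area` are chosen here for illustration and do not appear elsewhere in the text.

```python
import math

def arc_length(r, alpha):
    """Length of the arc with angle alpha (in radians) on a circle of radius r, cf. (A.IV.10)."""
    return r * alpha

def sector_area(r, alpha):
    """Area of the sector with angle alpha (in radians) in a circle of radius r, cf. (A.IV.11)."""
    return r * r * alpha / 2

# Degrees versus arc length on the unit circle, printed as a multiple of pi.
for deg in (30, 45, 60, 90, 135, 180, 270, 360):
    print(deg, math.radians(deg) / math.pi)

full_circle = 2 * math.pi
print(arc_length(2.0, full_circle))   # the circumference 2*pi*r, cf. (A.IV.9)
print(sector_area(2.0, full_circle))  # the full area pi*r^2, cf. (A.IV.8)
```

Taking the full angle 2π recovers the circumference and the area of the circle, which is a quick consistency check of the two formulae.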

[Figure A.IV.7: circle with centre O and radius r, the angle α, the segment AB and the sector OAB.]
Appendix V: Trigonometric and Hyperbolic Functions
Trigonometric and hyperbolic functions play an important role in many areas
of mathematics. Here we collect some of the most useful formulae for these
functions.
A. Trigonometric Functions
1. Symmetries
sin(−x) = − sin x, sin(x + 2π) = sin x (A.V.1)

cos(−x) = cos x, cos(x + 2π) = cos x (A.V.2)
tan(−x) = − tan x, tan(x + π) = tan x (A.V.3)
cot(−x) = − cot x, cot(x + π) = cot x (A.V.4)
2. Addition Theorems
sin(x ± y) = sin x cos y ± cos x sin y (A.V.5)
cos(x ± y) = cos x cos y ∓ sin x sin y (A.V.6)

\[ \tan(x \pm y) = \frac{\tan x \pm \tan y}{1 \mp \tan x \tan y} \tag{A.V.7} \]
\[ \cot(x \pm y) = \frac{\cot x \cot y \mp 1}{\cot y \pm \cot x} \tag{A.V.8} \]
3. Consequences of the Addition Theorems
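\[ \sin\left(\tfrac{\pi}{2} + x\right) = \cos x, \quad \sin(\pi + x) = -\sin x \tag{A.V.9} \]
\[ \cos\left(\tfrac{\pi}{2} + x\right) = -\sin x, \quad \cos(\pi + x) = -\cos x \tag{A.V.10} \]
\[ \tan\left(\tfrac{\pi}{2} \pm x\right) = \mp \cot x \tag{A.V.11} \]
\[ \cot\left(\tfrac{\pi}{2} \pm x\right) = \mp \tan x \tag{A.V.12} \]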
4. Double Arguments (Double angle formulae)
sin 2x = 2 sin x cos x (A.V.13)
\[ \cos 2x = \cos^2 x - \sin^2 x \tag{A.V.14} \]
\[ \tan 2x = \frac{2 \tan x}{1 - \tan^2 x} \tag{A.V.15} \]
\[ \cot 2x = \frac{\cot^2 x - 1}{2 \cot x} \tag{A.V.16} \]
5. Half Arguments (Half angle formulae)
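\[ \sin\frac{x}{2} = \begin{cases} \sqrt{\tfrac{1}{2}(1 - \cos x)}, & 0 \le x \le 2\pi \\ -\sqrt{\tfrac{1}{2}(1 - \cos x)}, & 2\pi \le x \le 4\pi \end{cases} \tag{A.V.17} \]
\[ \cos\frac{x}{2} = \begin{cases} \sqrt{\tfrac{1}{2}(1 + \cos x)}, & -\pi \le x \le \pi \\ -\sqrt{\tfrac{1}{2}(1 + \cos x)}, & \pi \le x \le 3\pi \end{cases} \tag{A.V.18} \]
\[ \tan\frac{x}{2} = \frac{\sin x}{1 + \cos x} = \frac{1 - \cos x}{\sin x} \tag{A.V.19} \]
\[ \cot\frac{x}{2} = \frac{\sin x}{1 - \cos x} = \frac{1 + \cos x}{\sin x} \tag{A.V.20} \]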
6. Sums
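\[ \sin x \pm \sin y = 2 \sin\frac{x \pm y}{2} \cos\frac{x \mp y}{2} \tag{A.V.21} \]
\[ \cos x + \cos y = 2 \cos\frac{x + y}{2} \cos\frac{x - y}{2} \tag{A.V.22} \]
\[ \cos x - \cos y = 2 \sin\frac{x + y}{2} \sin\frac{y - x}{2} \tag{A.V.23} \]
\[ \cos x \pm \sin x = \sqrt{2} \sin\left(\tfrac{\pi}{4} \pm x\right) \tag{A.V.24} \]
\[ \tan x \pm \tan y = \frac{\sin(x \pm y)}{\cos x \cos y} \tag{A.V.25} \]
\[ \cot x \pm \cot y = \pm \frac{\sin(x \pm y)}{\sin x \sin y} \tag{A.V.26} \]
\[ \tan x + \cot y = \frac{\cos(x - y)}{\cos x \sin y} \tag{A.V.27} \]
\[ \cot x - \tan y = \frac{\cos(x + y)}{\sin x \cos y} \tag{A.V.28} \]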
7. Products
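\[ \sin x \sin y = \tfrac{1}{2}\left(\cos(x - y) - \cos(x + y)\right) \tag{A.V.29} \]
\[ \cos x \cos y = \tfrac{1}{2}\left(\cos(x - y) + \cos(x + y)\right) \tag{A.V.30} \]
\[ \sin x \cos y = \tfrac{1}{2}\left(\sin(x - y) + \sin(x + y)\right) \tag{A.V.31} \]
\[ \tan x \tan y = \frac{\tan x + \tan y}{\cot x + \cot y} \tag{A.V.32} \]
\[ \cot x \cot y = \frac{\cot x + \cot y}{\tan x + \tan y} \tag{A.V.33} \]
\[ \tan x \cot y = \frac{\tan x + \cot y}{\cot x + \tan y} \tag{A.V.34} \]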
8. Squares
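\[ \sin^2 x + \cos^2 x = 1 \tag{A.V.35} \]
\[ \sin^2 x = \frac{\tan^2 x}{1 + \tan^2 x} = \frac{1}{1 + \cot^2 x} \tag{A.V.36} \]
\[ \cos^2 x = \frac{1}{1 + \tan^2 x} = \frac{\cot^2 x}{1 + \cot^2 x} \tag{A.V.37} \]
\[ \sin^2\frac{x}{2} = \tfrac{1}{2}(1 - \cos x) \tag{A.V.38} \]
\[ \cos^2\frac{x}{2} = \tfrac{1}{2}(1 + \cos x) \tag{A.V.39} \]
\[ \tan^2 x = \frac{\sin^2 x}{1 - \sin^2 x} = \frac{1 - \cos^2 x}{\cos^2 x} \tag{A.V.40} \]
\[ \cot^2 x = \frac{\cos^2 x}{1 - \cos^2 x} = \frac{1 - \sin^2 x}{\sin^2 x} \tag{A.V.41} \]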
9. Useful Values
x        0     π/6     π/4     π/3     π/2    2π/3     3π/4     5π/6     π
         0°    30°     45°     60°     90°    120°     135°     150°     180°
sin x    0     1/2     √2/2    √3/2    1      √3/2     √2/2     1/2      0
cos x    1     √3/2    √2/2    1/2     0      −1/2     −√2/2    −√3/2    −1
tan x    0     √3/3    1       √3      -      −√3      −1       −√3/3    0
cot x    -     √3      1       √3/3    0      −√3/3    −1       −√3      -
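A table of this kind, and the formulae preceding it, are easy to spot-check numerically. The sketch below uses only Python's standard `math` module; the helper names `table_error` and `check_identities` are illustrative and not part of the text. A small deviation supports a formula but of course does not prove it.

```python
import math

r2, r3 = math.sqrt(2), math.sqrt(3)

# Rows (x, sin x, cos x) as listed in the table of useful values.
table = [
    (0.0, 0.0, 1.0),
    (math.pi / 6, 1 / 2, r3 / 2),
    (math.pi / 4, r2 / 2, r2 / 2),
    (math.pi / 3, r3 / 2, 1 / 2),
    (math.pi / 2, 1.0, 0.0),
    (2 * math.pi / 3, r3 / 2, -1 / 2),
    (3 * math.pi / 4, r2 / 2, -r2 / 2),
    (5 * math.pi / 6, 1 / 2, -r3 / 2),
    (math.pi, 0.0, -1.0),
]

def table_error():
    """Largest deviation between the tabulated values and math.sin / math.cos."""
    return max(max(abs(math.sin(x) - s), abs(math.cos(x) - c)) for x, s, c in table)

def check_identities(x, y):
    """Largest deviation among a few identities of this appendix at the point (x, y)."""
    return max(
        abs(math.sin(x + y) - (math.sin(x) * math.cos(y) + math.cos(x) * math.sin(y))),      # (A.V.5)
        abs(math.cos(x) - math.cos(y) - 2 * math.sin((x + y) / 2) * math.sin((y - x) / 2)),  # (A.V.23)
        abs(math.sin(x / 2) ** 2 - (1 - math.cos(x)) / 2),                                   # (A.V.38)
    )

print(table_error())
print(check_identities(0.7, 1.9))
```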
B. Hyperbolic Functions
1. Symmetries
sinh(−x) = − sinh x (A.V.42)
cosh(−x) = cosh x (A.V.43)
tanh(−x) = − tanh x (A.V.44)
coth(−x) = − coth x (A.V.45)
2. Addition Theorems
sinh(x ± y) = sinh x cosh y ± cosh x sinh y (A.V.46)
cosh(x ± y) = cosh x cosh y ± sinh x sinh y (A.V.47)

\[ \tanh(x \pm y) = \frac{\tanh x \pm \tanh y}{1 \pm \tanh x \tanh y} \tag{A.V.48} \]
\[ \coth(x \pm y) = \frac{1 \pm \coth x \coth y}{\coth x \pm \coth y} \tag{A.V.49} \]
3. Double Arguments
sinh 2x = 2 sinh x cosh x (A.V.50)
\[ \cosh 2x = \sinh^2 x + \cosh^2 x \tag{A.V.51} \]
\[ \tanh 2x = \frac{2 \tanh x}{1 + \tanh^2 x} \tag{A.V.52} \]
\[ \coth 2x = \frac{1 + \coth^2 x}{2 \coth x} \tag{A.V.53} \]
4. Half Arguments
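\[ \sinh\frac{x}{2} = \begin{cases} \sqrt{\tfrac{1}{2}(\cosh x - 1)}, & x \ge 0 \\ -\sqrt{\tfrac{1}{2}(\cosh x - 1)}, & x < 0 \end{cases} \tag{A.V.54} \]
\[ \cosh\frac{x}{2} = \sqrt{\tfrac{1}{2}(\cosh x + 1)} \tag{A.V.55} \]
\[ \tanh\frac{x}{2} = \frac{\cosh x - 1}{\sinh x} = \frac{\sinh x}{\cosh x + 1} \tag{A.V.56} \]
\[ \coth\frac{x}{2} = \frac{\sinh x}{\cosh x - 1} = \frac{\cosh x + 1}{\sinh x} \tag{A.V.57} \]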
5. Sums
\[ \sinh x \pm \sinh y = 2 \sinh\tfrac{1}{2}(x \pm y) \cosh\tfrac{1}{2}(x \mp y) \tag{A.V.58} \]
\[ \cosh x + \cosh y = 2 \cosh\tfrac{1}{2}(x + y) \cosh\tfrac{1}{2}(x - y) \tag{A.V.59} \]
\[ \cosh x - \cosh y = 2 \sinh\tfrac{1}{2}(x + y) \sinh\tfrac{1}{2}(x - y) \tag{A.V.60} \]
\[ \tanh x \pm \tanh y = \frac{\sinh(x \pm y)}{\cosh x \cosh y} \tag{A.V.61} \]
6. Squares
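\[ \cosh^2 x - \sinh^2 x = 1 \tag{A.V.62} \]
\[ \sinh^2 x = \cosh^2 x - 1 = \frac{\tanh^2 x}{1 - \tanh^2 x} = \frac{1}{\coth^2 x - 1} \tag{A.V.63} \]
\[ \cosh^2 x = \sinh^2 x + 1 = \frac{1}{1 - \tanh^2 x} = \frac{\coth^2 x}{\coth^2 x - 1} \tag{A.V.64} \]
\[ \tanh^2 x = \frac{\sinh^2 x}{\sinh^2 x + 1} = \frac{\cosh^2 x - 1}{\cosh^2 x} = \frac{1}{\coth^2 x} \tag{A.V.65} \]
\[ \coth^2 x = \frac{\sinh^2 x + 1}{\sinh^2 x} = \frac{\cosh^2 x}{\cosh^2 x - 1} = \frac{1}{\tanh^2 x} \tag{A.V.66} \]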
Note that we will see the relationship between hyperbolic and trigonometric
functions when we consider complex arguments later in this course.
Appendix VI: More on the Completeness of R

In this appendix we want to discuss in more detail some aspects of the Axiom
of Completeness which as we recall (see Chapter 17) is: In R every Cauchy
sequence has a limit. This axiom was needed to prove many central results
including:
• the Bolzano-Weierstrass theorem (Theorem 17.6);
• every increasing (decreasing) sequence bounded from above (below) converges (Theorem 17.14);
• the principle of nested intervals (Theorem 17.15);
• every set bounded from above (below) has a least (greatest) upper
(lower) bound (Theorem 19.14).
Without these results we cannot prove many others, hence the completeness
of R is key for our theory. Nonetheless there are at least two problems
with the axiom of completeness. Firstly, it looks quite artificial: an ad hoc
requirement which turns out to be useful. Secondly, while we may suppose
the axiom to hold, we have so far given no proof that a complete Archimedean
ordered field exists.
First we want to discuss an equivalent way of introducing the completeness
of R by choosing a different axiom as a starting point.
Axiom A
Every non-empty set of real numbers bounded from above has a least upper bound.
Clearly this axiom is equivalent to
Axiom A′
Every non-empty set of real numbers bounded from below has a greatest lower bound.
The first consequence of Axiom A is
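Theorem A.VI.1. An increasing sequence $(x_n)_{n\in\mathbb{N}}$, $x_n \in \mathbb{R}$, which is bounded from above converges to the least upper bound $x$ of the set $\{x_n \mid n \in \mathbb{N}\}$.
Proof. Let $x$ be the least upper bound of $\{x_n \mid n \in \mathbb{N}\}$. Given $\epsilon > 0$ there exists $N \in \mathbb{N}$ such that $x - \frac{\epsilon}{2} < x_N \le x$, since otherwise $x - \frac{\epsilon}{2}$ would be a smaller upper bound. Since $(x_n)_{n\in\mathbb{N}}$ is increasing it follows for all $n \ge N$ that $x - \frac{\epsilon}{2} \le x_N \le x_n \le x$, or for all $n \ge N$ we have $0 \le x - x_n \le \frac{\epsilon}{2}$, i.e. $|x_n - x| < \epsilon$, implying the convergence of $(x_n)_{n\in\mathbb{N}}$ to $x$.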
Corollary A.VI.2. A decreasing sequence $(x_n)_{n\in\mathbb{N}}$, $x_n \in \mathbb{R}$, which is bounded from below converges to the greatest lower bound $x$ of the set $\{x_n \mid n \in \mathbb{N}\}$.
Theorem A.VI.3. If Axiom A holds every Cauchy sequence in R converges.
Proof. Let $(x_n)_{n\in\mathbb{N}}$ be a Cauchy sequence. We know by Proposition 17.3.B that $(x_n)_{n\in\mathbb{N}}$ is bounded. We consider the sets $A_k := \{x_l \mid l \ge k\}$ which are bounded and satisfy $A_{k+1} \subset A_k$, $A_1 = \{x_n \mid n \in \mathbb{N}\}$. Each of the sets $A_k$ has a greatest lower bound $c_k$ and the sequence $(c_k)_{k\in\mathbb{N}}$ is increasing, i.e. $c_k \le c_{k+1}$ for $k \in \mathbb{N}$, and bounded from above. By Theorem A.VI.1 this sequence has a limit $c$, $c = \lim_{k\to\infty} c_k$. We claim now that a subsequence of $(x_n)_{n\in\mathbb{N}}$ converges to $c$. Given $\epsilon > 0$ there exists $N \in \mathbb{N}$ such that for $m \ge N$ it follows that $0 \le c - c_m < \frac{\epsilon}{2}$. Since $c_m$ is the greatest lower bound of $A_m$ there exists $k_m \ge m$ such that $0 \le x_{k_m} - c_m < \frac{\epsilon}{2}$. For the subsequence $(x_{k_m})_{m\in\mathbb{N}}$ the following holds
\[ |x_{k_m} - c| \le |x_{k_m} - c_m| + |c_m - c| < \epsilon, \]
i.e. $(x_{k_m})_{m\in\mathbb{N}}$ converges to $c$. Now Lemma 17.10.B implies the result.
Theorem A.VI.3 shows that Axiom A (or Axiom A′) implies the Axiom of Completeness. Since the converse is Theorem 19.14, the two are equivalent, and arguably Axiom A is more natural to accept. It is possible to prove the equivalence of other statements to the Axiom of Completeness, but we do not want to go into further detail.
The following material is mathematically more advanced and may be skipped on a first reading.
Our goal is to sketch how to construct $\mathbb{R}$. Let us start with the following
problem: given $\mathbb{N}$ as a set characterised by the Peano axioms, see Appendix
III, can we construct the ring $\mathbb{Z}$? We have of course an idea of what $\mathbb{Z}$ should
consist of and this will give us hints for our formal construction. Note that
every integer $z \in \mathbb{Z}$ is the difference between two natural numbers $m, n \in \mathbb{N}$,
i.e. $z = n - m$. The problem is that in $\mathbb{N}$ the operation “−” is not yet
defined. Moreover, the representation is not unique: $0 = n - n$ for all $n \in \mathbb{N}$,
or $n = n + m - m$ for all $m \in \mathbb{N}$. A way forward is to use pairs of natural numbers. On $\mathbb{N} \times \mathbb{N}$ we define the relation
\[ (n, m) \sim_{\mathbb{Z}} (n', m') \quad \text{if and only if} \quad n + m' = m + n'. \tag{A.VI.1} \]
This definition is of course inspired by the fact that $n + m' = m + n'$ is equivalent to $n - m = n' - m'$, if “−” is defined in the usual way. It is easy to see that on $\mathbb{N} \times \mathbb{N}$ the relation “$\sim_{\mathbb{Z}}$” is an equivalence relation. Indeed, $(n, m) \sim_{\mathbb{Z}} (n, m)$ is trivial, and since $n + m' = m + n'$ if and only if $n' + m = m' + n$, we also have the symmetry $(n, m) \sim_{\mathbb{Z}} (n', m')$ if and only if $(n', m') \sim_{\mathbb{Z}} (n, m)$. Moreover, if $(n, m) \sim_{\mathbb{Z}} (n', m')$ and $(n', m') \sim_{\mathbb{Z}} (n'', m'')$ it follows that $n + m' = m + n'$ and $n' + m'' = n'' + m'$ and therefore $n + m' + n' + m'' = m + n' + n'' + m'$, and the arithmetic rules in $\mathbb{N}$ yield $n + m'' = m + n''$ or $(n, m) \sim_{\mathbb{Z}} (n'', m'')$. We denote now by $\mathbb{Z} := \mathbb{N} \times \mathbb{N}/\sim_{\mathbb{Z}}$ the family of all equivalence classes and introduce on $\mathbb{Z}$ the operations
\[ [(n, m)] \oplus [(n', m')] := [(n + n', m + m')] \]
and
\[ [(n, m)] \odot [(n', m')] := [(nn' + mm', nm' + mn')]. \]
First we can prove that these definitions are independent of the representatives chosen. Moreover we can identify $n \in \mathbb{N}$ with $[(n + m, m)]$, $m \in \mathbb{N}$, and we may define $0 := [(n, n)]$, as we may set $-n$ for $[(m, n + m)]$. It takes some work, but it is not difficult to see that $\mathbb{N} \times \mathbb{N}/\sim_{\mathbb{Z}}$ with the operations $\oplus$ and $\odot$ forms a ring and we will use the standard notations from now on, i.e. $0, 1, n, -n, n + m, n - m$ when working in $\mathbb{Z}$. We do not want to go much further into the details since we will do so when passing from $\mathbb{Z}$ to $\mathbb{Q}$ for which we employ a similar construction.
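The construction of the integers from pairs of natural numbers can be tried out on a computer. The following Python sketch is only an illustration under the assumption that the natural numbers are 1, 2, 3, ...; the class name `IntPair` and the helpers `embed` and `negative` are hypothetical names introduced here. A pair (n, m) stands for the difference n − m, and only addition and multiplication in the natural numbers are used.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class IntPair:
    """A representative (n, m) of a class in N x N / ~Z; it stands for n - m."""
    n: int
    m: int

    def __post_init__(self):
        if self.n < 1 or self.m < 1:
            raise ValueError("both entries must be natural numbers")

    def equivalent(self, other):
        # (n, m) ~ (n', m') if and only if n + m' = m + n', cf. (A.VI.1)
        return self.n + other.m == self.m + other.n

    def __add__(self, other):
        # [(n, m)] + [(n', m')] := [(n + n', m + m')]
        return IntPair(self.n + other.n, self.m + other.m)

    def __mul__(self, other):
        # [(n, m)] * [(n', m')] := [(nn' + mm', nm' + mn')]
        return IntPair(self.n * other.n + self.m * other.m,
                       self.n * other.m + self.m * other.n)

def embed(n, m=1):
    """The class identified with the natural number n, represented by (n + m, m)."""
    return IntPair(n + m, m)

def negative(n, m=1):
    """The class playing the role of -n, represented by (m, n + m)."""
    return IntPair(m, n + m)

print((embed(2) + negative(3)).equivalent(negative(1)))     # 2 + (-3) = -1
print((negative(2) * negative(3)).equivalent(embed(6)))     # (-2) * (-3) = 6
print(embed(2, 1).equivalent(embed(2, 7)))                  # different representatives of 2
```

Note that equality of classes is tested with `equivalent`, not with `==`, since different pairs may represent the same integer.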
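On $\mathbb{Z} \times \mathbb{N}$ we define the relation
\[ (k, m) \sim_{\mathbb{Q}} (l, n) \quad \text{if and only if} \quad nk = ml. \tag{A.VI.2} \]
Again it is easy to see that $\sim_{\mathbb{Q}}$ is an equivalence relation: $(k, m) \sim_{\mathbb{Q}} (k, m)$ is trivial, and since $kn = ml$ if and only if $lm = nk$ the symmetry relation
\[ (k, m) \sim_{\mathbb{Q}} (l, n) \quad \text{if and only if} \quad (l, n) \sim_{\mathbb{Q}} (k, m) \]
follows. Moreover, if $(k, m) \sim_{\mathbb{Q}} (l, n)$ and $(l, n) \sim_{\mathbb{Q}} (p, q)$ we have $kn = lm$ and $lq = np$, which yields $knlq = lmnp$ or $kq = mp$ (for $l \neq 0$; if $l = 0$ then $k = p = 0$ and $kq = mp$ holds trivially), i.e. $(k, m) \sim_{\mathbb{Q}} (p, q)$.
We denote by $\mathbb{Q}$ the family of all equivalence classes, i.e.
\[ \mathbb{Q} := \mathbb{Z} \times \mathbb{N}/\sim_{\mathbb{Q}}, \]
and for $[(k, m)] \in \mathbb{Z} \times \mathbb{N}/\sim_{\mathbb{Q}}$ we will soon write again $\frac{k}{m}$.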
Next we want to define the “usual” algebraic operations on $\mathbb{Q}$, and again we take guidance from our previous knowledge about the rationals. The rules we know are
\[ \frac{k}{m} + \frac{l}{n} = \frac{nk + lm}{nm} \]
and
\[ \frac{k}{m} \cdot \frac{l}{n} = \frac{k \cdot l}{m \cdot n}, \]
therefore we define
\[ [(k, m)] \oplus [(l, n)] := [(nk + lm, nm)] \tag{A.VI.3} \]
and
\[ [(k, m)] \odot [(l, n)] := [(kl, mn)]. \tag{A.VI.4} \]
Note that $mn = m + \cdots + m$ ($n$ summands), so we need only addition in $\mathbb{N}$ (which we get from the Peano axioms) to define $\oplus$ and $\odot$. First we need to prove that our definitions are independent of the choice of representatives. So let $(k, m) \sim_{\mathbb{Q}} (k', m')$ and $(l, n) \sim_{\mathbb{Q}} (l', n')$. We find
\[ [(k, m)] \oplus [(l, n)] = [(nk + lm, nm)] \]
and
\[ [(k', m')] \oplus [(l', n')] = [(n'k' + l'm', n'm')]. \]
However we have $km' = k'm$ and $ln' = l'n$ and therefore
\[ nkn'm' + lmn'm' = n'k'nm + l'm'nm = n'(k'm)n + (l'n)mm' = n'(km')n + (ln')mm', \]
implying
\[ (nk + lm)n'm' = (n'k' + l'm')nm, \]
or
\[ (nk + lm, nm) \sim_{\mathbb{Q}} (n'k' + l'm', n'm'), \]
i.e.
\[ [(nk + lm, nm)] = [(n'k' + l'm', n'm')]. \]
Analogously we can prove that (A.VI.4) is independent of the representatives. The next task is to verify the field axioms for $(\mathbb{Q}, \oplus, \odot)$. For example we find
\[ [(k, m)] \oplus [(l, n)] = [(l, n)] \oplus [(k, m)] \]
since
\[ [(k, m)] \oplus [(l, n)] = [(kn + lm, nm)] \]
and
\[ [(l, n)] \oplus [(k, m)] = [(lm + kn, mn)]. \]
For $n \in \mathbb{Z}$ we identify $[(n, 1)]$ with $n$, and since $(0, 1) \sim_{\mathbb{Q}} (0, m)$ for all $m \in \mathbb{N}$ we can represent 0 by any pair of the type $(0, m)$. Further, for $n \in \mathbb{N}$ we can identify $[(n, n)]$ with 1, indeed we get
\[ [(n, n)] \odot [(k, m)] = [(nk, nm)] \]
but $(nk, nm) \sim_{\mathbb{Q}} (k, m)$ as $nkm = nmk$, and further
\[ [(0, 1)] \odot [(l, n)] = [(0, n)] = [(0, 1)]. \]
For $[(m, n)] \neq [(0, 1)]$ we can form its inverse with respect to multiplication by $[(n, m)]$ (for $m > 0$; for $m < 0$ we take $[(-n, -m)]$ so that the second entry belongs to $\mathbb{N}$):
\[ [(n, m)] \odot [(m, n)] = [(mn, mn)]. \]
Thus, along these lines it is possible to prove that $(\mathbb{Q}, \oplus, \odot)$ is a field and we can consider this field as a model of the rational numbers. We can also introduce an order relation $\le$ on $(\mathbb{Q}, \oplus, \odot)$ by
\[ [(k, m)] \le [(l, n)] \]
if and only if $kn \le lm$. Again it is possible to prove that the definition is independent of the choice of representatives and that typical properties of an order relation hold. For example we know that $kn \le lm$ and $lq \le np$ imply $knq \le lmq \le npm$, and since $n > 0$ we obtain $kq \le mp$ and therefore
\[ [(k, m)] \le [(l, n)] \text{ and } [(l, n)] \le [(p, q)] \text{ implies } [(k, m)] \le [(p, q)]. \]
Moreover we find
\[ [(0, 1)] \le [(k, m)] \tag{A.VI.5} \]
if and only if $k \in \mathbb{N}_0$ since (A.VI.5) means $0 \cdot m \le k$. In particular we have
\[ [(0, 1)] \le [(n, n)] \quad \text{for } n \in \mathbb{N}. \]
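The passage from the integers to the rationals can be illustrated in the same way. The Python sketch below takes Python's built-in `int` as a stand-in for the ring of integers; the class name `RatPair` and its methods are hypothetical names introduced for this illustration and are not part of the text.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RatPair:
    """A representative (k, m) in Z x N of a class in Z x N / ~Q; it stands for k/m."""
    k: int  # first entry, an integer
    m: int  # second entry, a natural number

    def __post_init__(self):
        if self.m < 1:
            raise ValueError("the second entry must be a natural number")

    def equivalent(self, other):
        # (k, m) ~ (l, n) if and only if nk = ml, cf. (A.VI.2)
        return other.m * self.k == self.m * other.k

    def __add__(self, other):
        # [(k, m)] + [(l, n)] := [(nk + lm, nm)], cf. (A.VI.3)
        return RatPair(other.m * self.k + other.k * self.m, other.m * self.m)

    def __mul__(self, other):
        # [(k, m)] * [(l, n)] := [(kl, mn)], cf. (A.VI.4)
        return RatPair(self.k * other.k, self.m * other.m)

    def __le__(self, other):
        # [(k, m)] <= [(l, n)] if and only if kn <= lm
        return self.k * other.m <= other.k * self.m

    def inverse(self):
        """Inverse with respect to multiplication; the second entry is kept in N."""
        if self.k == 0:
            raise ZeroDivisionError("the class of (0, 1) has no inverse")
        return RatPair(self.m, self.k) if self.k > 0 else RatPair(-self.m, -self.k)

half, third = RatPair(1, 2), RatPair(1, 3)
print((half + third).equivalent(RatPair(5, 6)))
# Independence of the representatives: 2/4 + 3/9 is equivalent to 1/2 + 1/3.
print((RatPair(2, 4) + RatPair(3, 9)).equivalent(half + third))
```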
The principle should now be clear: the natural numbers $\mathbb{N}$ and addition in $\mathbb{N}$ we introduce using the Peano axioms, and then we can construct the ring $\mathbb{Z}$ and the ordered field $\mathbb{Q}$ using appropriate equivalence relations. This is now our basic idea to pass from $\mathbb{Q}$ to $\mathbb{R}$: on the set of all Cauchy sequences of rational numbers we will introduce an equivalence relation “$\sim_{\mathbb{R}}$” and on the corresponding equivalence classes we can implement the structure of a complete ordered field in which Archimedes’ axiom holds. Of course this field will become $\mathbb{R}$.
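We denote by $C$ the set of all Cauchy sequences of rational numbers. Hence $(x_n)_{n\in\mathbb{N}} \in C$ if $x_n \in \mathbb{Q}$ and for every $\epsilon \in \mathbb{Q}$, $\epsilon > 0$, there exists $N = N(\epsilon) \in \mathbb{N}$ such that $n, m \ge N(\epsilon)$ implies $|x_n - x_m| < \epsilon$. Two Cauchy sequences $(x_n)_{n\in\mathbb{N}}$, $(y_n)_{n\in\mathbb{N}} \in C$ are said to be equivalent if their difference tends to $0 \in \mathbb{Q}$:
\[ (x_n)_{n\in\mathbb{N}} \sim_{\mathbb{R}} (y_n)_{n\in\mathbb{N}} \quad \text{if and only if} \quad \lim_{n\to\infty} (x_n - y_n) = 0. \tag{A.VI.6} \]
First we claim that “$\sim_{\mathbb{R}}$” is an equivalence relation on $C$. Clearly, $\lim_{n\to\infty} (x_n - x_n) = 0$ for every sequence $(x_n)_{n\in\mathbb{N}}$ and therefore $(x_n)_{n\in\mathbb{N}} \sim_{\mathbb{R}} (x_n)_{n\in\mathbb{N}}$. Moreover, since $\lim_{n\to\infty} (x_n - y_n) = 0$ if and only if $\lim_{n\to\infty} (y_n - x_n) = 0$ it follows that $(x_n)_{n\in\mathbb{N}} \sim_{\mathbb{R}} (y_n)_{n\in\mathbb{N}}$ if and only if $(y_n)_{n\in\mathbb{N}} \sim_{\mathbb{R}} (x_n)_{n\in\mathbb{N}}$, i.e. the relation $\sim_{\mathbb{R}}$ is symmetric. Finally we observe that $\lim_{n\to\infty} (x_n - y_n) = 0$ and $\lim_{n\to\infty} (y_n - z_n) = 0$ implies
\[ \lim_{n\to\infty} (x_n - z_n) = \lim_{n\to\infty} (x_n - y_n + y_n - z_n) = \lim_{n\to\infty} (x_n - y_n) + \lim_{n\to\infty} (y_n - z_n) = 0, \]
i.e. $(x_n)_{n\in\mathbb{N}} \sim_{\mathbb{R}} (y_n)_{n\in\mathbb{N}}$ and $(y_n)_{n\in\mathbb{N}} \sim_{\mathbb{R}} (z_n)_{n\in\mathbb{N}}$ implies $(x_n)_{n\in\mathbb{N}} \sim_{\mathbb{R}} (z_n)_{n\in\mathbb{N}}$. Hence we have proved that “$\sim_{\mathbb{R}}$” is an equivalence relation on $C$. Now we consider
\[ \mathbb{R} := C/\sim_{\mathbb{R}}, \tag{A.VI.7} \]
the set of all equivalence classes of Cauchy sequences of rational numbers. On $C/\sim_{\mathbb{R}}$ we introduce the following two operations:
\[ [(x_n)_{n\in\mathbb{N}}] \oplus [(y_n)_{n\in\mathbb{N}}] := [(x_n + y_n)_{n\in\mathbb{N}}] \tag{A.VI.8} \]
and
\[ [(x_n)_{n\in\mathbb{N}}] \odot [(y_n)_{n\in\mathbb{N}}] := [(x_n y_n)_{n\in\mathbb{N}}]. \tag{A.VI.9} \]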
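First we need to show that these definitions are independent of the choice of representatives. If $(x_n)_{n\in\mathbb{N}} \sim_{\mathbb{R}} (x'_n)_{n\in\mathbb{N}}$ and $(y_n)_{n\in\mathbb{N}} \sim_{\mathbb{R}} (y'_n)_{n\in\mathbb{N}}$ then it follows immediately that $(x_n + y_n)_{n\in\mathbb{N}} \sim_{\mathbb{R}} (x'_n + y'_n)_{n\in\mathbb{N}}$ since
\[ \lim_{n\to\infty} (x_n + y_n - (x'_n + y'_n)) = \lim_{n\to\infty} (x_n - x'_n) + \lim_{n\to\infty} (y_n - y'_n) = 0. \]
Furthermore we know that $(y_n)_{n\in\mathbb{N}}$ and $(x'_n)_{n\in\mathbb{N}}$ are bounded and
\[ x_n y_n - x'_n y'_n = (x_n - x'_n) y_n + x'_n (y_n - y'_n) \]
implies now that $\lim_{n\to\infty} (x_n y_n - x'_n y'_n) = 0$, i.e. $(x_n y_n)_{n\in\mathbb{N}} \sim_{\mathbb{R}} (x'_n y'_n)_{n\in\mathbb{N}}$.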
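The independence of the representatives can be observed on a concrete example with exact rational arithmetic. In the following Python sketch the four sequences `x`, `x2`, `y`, `y2` are hypothetical choices made for illustration: `x` and `x2` both represent 1, while `y` and `y2` both represent 2. A finite computation only illustrates the statement; it does not replace the limit argument.

```python
from fractions import Fraction

# Hypothetical representatives: x ~ x2 (both tend to 1) and y ~ y2 (both tend to 2).
def x(n):
    return 1 + Fraction(1, n)

def x2(n):
    return 1 - Fraction(1, n * n)

def y(n):
    return 2 + Fraction(1, n)

def y2(n):
    return Fraction(2)

def sum_gap(n):
    """(x_n + y_n) - (x'_n + y'_n)."""
    return (x(n) + y(n)) - (x2(n) + y2(n))

def product_gap(n):
    """x_n y_n - x'_n y'_n, checked against the splitting used in the text."""
    lhs = x(n) * y(n) - x2(n) * y2(n)
    rhs = (x(n) - x2(n)) * y(n) + x2(n) * (y(n) - y2(n))
    assert lhs == rhs  # exact, since Fraction arithmetic involves no rounding
    return lhs

for n in (1, 10, 100, 1000):
    print(n, float(sum_gap(n)), float(product_gap(n)))
```

Both gaps shrink as n grows, in line with the sums and products of equivalent sequences being equivalent.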
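Next we claim that $(C/\sim_{\mathbb{R}}, \oplus, \odot)$ is a field. We will check only some of the axioms and the reader is invited to check the remaining ones. For the addition $\oplus$ we find for example
\[ [(x_n)_{n\in\mathbb{N}}] \oplus [(y_n)_{n\in\mathbb{N}}] = [(x_n + y_n)_{n\in\mathbb{N}}] = [(y_n + x_n)_{n\in\mathbb{N}}] = [(y_n)_{n\in\mathbb{N}}] \oplus [(x_n)_{n\in\mathbb{N}}]. \]
Further, with $[0] := [(c_n)_{n\in\mathbb{N}}]$, $c_n = 0$ for all $n \in \mathbb{N}$,
\[ [(x_n)_{n\in\mathbb{N}}] \oplus [0] = [(x_n + c_n)_{n\in\mathbb{N}}] = [(x_n)_{n\in\mathbb{N}}] \]
or
\[ [(x_n)_{n\in\mathbb{N}}] \oplus [(-x_n)_{n\in\mathbb{N}}] = [(x_n - x_n)_{n\in\mathbb{N}}] = [0]. \]
For the multiplication we have for example with $[e] = [(e_n)_{n\in\mathbb{N}}]$, $e_n = 1$ for $n \in \mathbb{N}$, that
\[ [(x_n)_{n\in\mathbb{N}}] \odot [e] = [(x_n e_n)_{n\in\mathbb{N}}] = [(x_n)_{n\in\mathbb{N}}]. \]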
More delicate is to prove that if $[(x_n)_{n\in\mathbb{N}}] \neq [0]$, then we can find an inverse with respect to the multiplication. We observe that if $[(x_n)_{n\in\mathbb{N}}] \neq [0]$ there exists $\delta \in \mathbb{Q}$, $\delta > 0$, and $N(\delta) \in \mathbb{N}$ such that $|x_n| \ge \delta$ for all $n \ge N(\delta)$. If this is not the case then $(x_n)_{n\in\mathbb{N}}$ has a subsequence $(x_{n_k})_{k\in\mathbb{N}}$ converging to zero, and by Lemma 17.10.B we conclude that $(x_n)_{n\in\mathbb{N}}$ must converge to zero, i.e. $[(x_n)_{n\in\mathbb{N}}] = [0]$, which is a contradiction. (Note that the proof of Lemma 17.10.B works for Cauchy sequences in $\mathbb{Q}$.) For $(x_n)_{n\in\mathbb{N}} \in C$ not equivalent to $(c_n)_{n\in\mathbb{N}}$, $c_n = 0$ for all $n \in \mathbb{N}$, we define
\[ \tilde{x}_n := \begin{cases} x_n^{-1}, & n \ge N(\delta) \\ 0, & n < N(\delta) \end{cases} \]
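where $\delta$ and $N(\delta)$ are as before. We find
\[ [(x_n)_{n\in\mathbb{N}}] \odot [(\tilde{x}_n)_{n\in\mathbb{N}}] = [(x_n \cdot \tilde{x}_n)_{n\in\mathbb{N}}] \]
where
\[ x_n \tilde{x}_n = \begin{cases} 1, & n \ge N(\delta) \\ 0, & n < N(\delta), \end{cases} \]
which implies that $(x_n \tilde{x}_n)_{n\in\mathbb{N}} \sim_{\mathbb{R}} (e_n)_{n\in\mathbb{N}}$, $e_n = 1$ for $n \in \mathbb{N}$. The remaining axioms, in particular the associative laws and the distributive law, are proved in a straightforward way along the lines indicated above.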
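We want to define an order structure on $C/\sim_{\mathbb{R}}$. We call $(x_n)_{n\in\mathbb{N}} \in C$ positive if there exists $\delta \in \mathbb{Q}$, $\delta > 0$, and $N(\delta) \in \mathbb{N}$ such that $x_n \ge \delta$ for all $n \ge N(\delta)$. Again, our first task before looking at $C/\sim_{\mathbb{R}}$ is to prove that the definition is independent of the representative. For this let $(x_n)_{n\in\mathbb{N}} \in C$ be positive and $(x'_n)_{n\in\mathbb{N}} \in C$ be equivalent to $(x_n)_{n\in\mathbb{N}}$. Then there exists $\delta \in \mathbb{Q}$, $\delta > 0$, and $N(\delta) \in \mathbb{N}$ such that $x_n \ge \delta$ for $n \ge N(\delta)$. Further, since $\lim_{n\to\infty} (x_n - x'_n) = 0$ we can find $\tilde{N}(\delta) \in \mathbb{N}$ such that $|x_n - x'_n| < \frac{\delta}{2}$ for all $n \ge \tilde{N}(\delta)$. This however implies for $n \ge \max(N(\delta), \tilde{N}(\delta))$ that $x'_n > x_n - \frac{\delta}{2} \ge \frac{\delta}{2}$ and hence $(x'_n)_{n\in\mathbb{N}}$ is positive too. With $[0] = [(c_n)_{n\in\mathbb{N}}]$, $c_n = 0$ for all $n \in \mathbb{N}$, we define
\[ [(x_n)_{n\in\mathbb{N}}] \ge [0] \tag{A.VI.10} \]
if and only if $(x_n)_{n\in\mathbb{N}}$ is positive. The claim is that $(C/\sim_{\mathbb{R}}, \oplus, \odot, \ge)$ is an ordered field. Again we will verify only some of the axioms and leave the rest to the reader.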
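For example, if $[(x_n)_{n\in\mathbb{N}}] \ge [0]$ and $[(y_n)_{n\in\mathbb{N}}] \ge [0]$ then we can find $\delta \in \mathbb{Q}$, $\delta > 0$, and $N(\delta) \in \mathbb{N}$ such that $x_n \ge \delta$ and $y_n \ge \delta$ for $n \ge N(\delta)$, implying that $x_n + y_n \ge 2\delta$ for $n \ge N(\delta)$, hence $[(x_n)_{n\in\mathbb{N}}] \oplus [(y_n)_{n\in\mathbb{N}}] \ge [0]$, and further we find $x_n y_n \ge \delta^2$, i.e. $[(x_n)_{n\in\mathbb{N}}] \odot [(y_n)_{n\in\mathbb{N}}] \ge [0]$. It is a bit more tricky to show that one and only one of the statements
\[ [(x_n)_{n\in\mathbb{N}}] = [0]; \quad [(x_n)_{n\in\mathbb{N}}] \ge [0] \text{ and } [(x_n)_{n\in\mathbb{N}}] \neq [0]; \quad [0] \ge [(x_n)_{n\in\mathbb{N}}] \text{ and } [(x_n)_{n\in\mathbb{N}}] \neq [0] \]
holds. Let $[(x_n)_{n\in\mathbb{N}}] \neq [0]$. We claim $[(|x_n|)_{n\in\mathbb{N}}] \ge [0]$. If this is not the case, then there exists a subsequence $(x_{n_k})_{k\in\mathbb{N}}$ of $(x_n)_{n\in\mathbb{N}}$ such that $|x_{n_k}| < \frac{1}{k}$, implying by Lemma 17.10.B that $(x_n)_{n\in\mathbb{N}} \sim_{\mathbb{R}} (c_n)_{n\in\mathbb{N}}$, $c_n = 0$ for all $n \in \mathbb{N}$, which is a contradiction. Now, we know $|x_n| \ge \delta > 0$ for $n \ge N(\delta)$ and $(x_n)_{n\in\mathbb{N}}$ is a Cauchy sequence. Thus there exists $\tilde{N}(\delta) \in \mathbb{N}$ such that $n, m \ge \tilde{N}(\delta)$ implies $|x_m - x_n| < \frac{\delta}{2}$. For $m_0 \ge \max(N(\delta), \tilde{N}(\delta))$ it follows from $|x_{m_0}| \ge \delta$ that either $x_{m_0} \ge \delta$ or $-x_{m_0} \ge \delta$. In the first case we get for $n \ge \tilde{N}(\delta)$ that $x_{m_0} - x_n \le |x_{m_0} - x_n| < \frac{\delta}{2}$ or $x_n > x_{m_0} - \frac{\delta}{2} \ge \frac{\delta}{2}$, and in the second case we find $-x_n > -x_{m_0} - \frac{\delta}{2} \ge \frac{\delta}{2}$, proving that either $(x_n)_{n\in\mathbb{N}}$ or $(-x_n)_{n\in\mathbb{N}}$ is positive. Therefore either $[(x_n)_{n\in\mathbb{N}}] \ge [0]$ or $[0] \ge [(x_n)_{n\in\mathbb{N}}]$.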
Thus we have already constructed an ordered field $(C/\sim_{\mathbb{R}}, \oplus, \odot, \ge)$. We now want to embed $\mathbb{Q}$ into $C/\sim_{\mathbb{R}}$ while preserving all structures. For $q \in \mathbb{Q}$ we can form the class $[q]$ by defining $[q] := [(x_n)_{n\in\mathbb{N}}]$, $x_n = q$ for all $n \in \mathbb{N}$. Consider
\[ j : \mathbb{Q} \to C/\sim_{\mathbb{R}}, \quad j(q) := [q]. \tag{A.VI.11} \]
Clearly, $q \neq q'$ implies $j(q) \neq j(q')$, i.e. $j$ is an injective mapping. Moreover the following hold (we leave the proofs for the reader):
\[ j(q_1 + q_2) = [q_1] \oplus [q_2]; \]
\[ j(q_1 \cdot q_2) = [q_1] \odot [q_2]; \]
\[ q_1 \ge q_2 \text{ implies } [q_1] \ge [q_2]; \]
\[ j^{-1}([q_1] \oplus [q_2]) = j^{-1}([q_1]) + j^{-1}([q_2]); \]
\[ j^{-1}([q_1] \odot [q_2]) = j^{-1}([q_1]) \, j^{-1}([q_2]); \]
\[ [q_1] \ge [q_2] \text{ implies } j^{-1}([q_1]) \ge j^{-1}([q_2]). \]
These results show that $j(\mathbb{Q})$ is a subset of $C/\sim_{\mathbb{R}}$ which is a subfield and respects the order relation, i.e. $j(\mathbb{Q})$ is in all structures isomorphic to $\mathbb{Q}$. With some further effort one can see that for $[(x_n)_{n\in\mathbb{N}}] \in C/\sim_{\mathbb{R}}$, $[(x_n)_{n\in\mathbb{N}}] \ge [0]$, there exists $[q] \in j(\mathbb{Q})$ such that $[(x_n)_{n\in\mathbb{N}}] \ge [q] \ge [0]$ for which we can of course write $[(x_n)_{n\in\mathbb{N}}] \ge j(q) \ge j(0)$. We can now introduce as usual the notation $>$, $<$, and $\le$. Moreover we can define the absolute value on $C/\sim_{\mathbb{R}}$ by
\[ |[(x_n)_{n\in\mathbb{N}}]| := \begin{cases} [(x_n)_{n\in\mathbb{N}}], & \text{if } [(x_n)_{n\in\mathbb{N}}] \ge [0] \\ -[(x_n)_{n\in\mathbb{N}}], & \text{if } [(x_n)_{n\in\mathbb{N}}] < [0], \end{cases} \tag{A.VI.12} \]
where $-[(x_n)_{n\in\mathbb{N}}]$ denotes the inverse of $[(x_n)_{n\in\mathbb{N}}]$ with respect to the addition $\oplus$. It is not difficult to see that
\[ |[(x_n)_{n\in\mathbb{N}}]| = [(|x_n|)_{n\in\mathbb{N}}] \tag{A.VI.13} \]
and in particular
\[ |j(q)| = j(|q|) \tag{A.VI.14} \]
as well as
\[ |j^{-1}([q])| = j^{-1}([|q|]) \tag{A.VI.15} \]
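hold. It remains to prove that $C/\sim_{\mathbb{R}}$ is complete. In order to simplify notation from now on we often write $\mathbb{R}$ for $C/\sim_{\mathbb{R}}$ and $x \in \mathbb{R}$ for elements in $C/\sim_{\mathbb{R}}$. But we still make a distinction between $\mathbb{Q}$ and $j(\mathbb{Q}) \subset \mathbb{R}$. We also will use the easier notation $+$ for $\oplus$, $\cdot$ for $\odot$, $\ge$ for the order relation defined by (A.VI.10), etc.
Using the absolute value as defined by (A.VI.12) we can now define convergence in $\mathbb{R}$ as we are used to: $(x_n)_{n\in\mathbb{N}}$, $x_n \in \mathbb{R}$, converges to $x \in \mathbb{R}$ if for every $\epsilon > 0$, $\epsilon \in \mathbb{R}$, there exists $N = N(\epsilon) \in \mathbb{N}$ such that $n \ge N(\epsilon)$ implies $|x_n - x| < \epsilon$. Further, $(x_n)_{n\in\mathbb{N}}$, $x_n \in \mathbb{R}$, is a Cauchy sequence in $\mathbb{R}$ if for every $\epsilon > 0$, $\epsilon \in \mathbb{R}$, there exists $N(\epsilon) \in \mathbb{N}$ such that $n, m \ge N(\epsilon)$ yields $|x_n - x_m| < \epsilon$.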
We prove the completeness of $\mathbb{R}$ in three steps. First we prove that $(q_n)_{n\in\mathbb{N}}$, $q_n \in \mathbb{Q}$, is a Cauchy sequence in $\mathbb{Q}$ if and only if $(j(q_n))_{n\in\mathbb{N}}$ is a Cauchy sequence in $\mathbb{R} = C/\sim_{\mathbb{R}}$. Then we show that every Cauchy sequence $(j(q_n))_{n\in\mathbb{N}}$, $q_n \in \mathbb{Q}$, has a limit in $\mathbb{R}$. Finally we will prove that every Cauchy sequence in $\mathbb{R}$ has a limit.
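Theorem A.VI.4. The sequence $(q_n)_{n\in\mathbb{N}}$, $q_n \in \mathbb{Q}$, is a Cauchy sequence in $\mathbb{Q}$ if and only if the sequence $(j(q_n))_{n\in\mathbb{N}}$ is a Cauchy sequence in $\mathbb{R}$.
Proof. Let $(q_n)_{n\in\mathbb{N}}$, $q_n \in \mathbb{Q}$, be a Cauchy sequence in $\mathbb{Q}$ and $\epsilon > 0$, $\epsilon \in \mathbb{R}$. Then there exists $\epsilon' \in j(\mathbb{Q})$ such that $0 < \epsilon' < \epsilon$. Since $(q_n)_{n\in\mathbb{N}}$ is a Cauchy sequence in $\mathbb{Q}$, for $j^{-1}(\epsilon') > 0$ there exists $N(\epsilon)$ such that $n, m \ge N(\epsilon)$ implies $|q_n - q_m| < j^{-1}(\epsilon')$ and we conclude
\[ |[q_n] - [q_m]| = |j(q_n) - j(q_m)| = |j(q_n - q_m)| = j(|q_n - q_m|) < j(j^{-1}(\epsilon')) = \epsilon' < \epsilon, \]
i.e. $(j(q_n))_{n\in\mathbb{N}}$ is a Cauchy sequence in $\mathbb{R}$.
Now let $(j(q_n))_{n\in\mathbb{N}}$, $q_n \in \mathbb{Q}$, be a Cauchy sequence in $\mathbb{R}$. Hence, given $\epsilon \in \mathbb{Q}$, $\epsilon > 0$, we can find $N(\epsilon) \in \mathbb{N}$ such that $n, m \ge N(\epsilon)$ implies $|j(q_n) - j(q_m)| < j(\epsilon)$, which yields
\[ |q_n - q_m| = |j^{-1}(j(q_n)) - j^{-1}(j(q_m))| = |j^{-1}(j(q_n) - j(q_m))| = j^{-1}(|j(q_n) - j(q_m)|) < j^{-1}(j(\epsilon)) = \epsilon. \]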
Theorem A.VI.5. Every Cauchy sequence $(j(q_n))_{n\in\mathbb{N}}$, $q_n \in \mathbb{Q}$, converges to a limit $x \in \mathbb{R}$.
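Proof. We have to prove the existence of $x \in \mathbb{R}$ such that (as a limit in $\mathbb{R}$) $\lim_{n\to\infty} j(q_n) = x$. Since $(j(q_n))_{n\in\mathbb{N}}$ is a Cauchy sequence in $\mathbb{R}$ it follows that $(q_n)_{n\in\mathbb{N}}$ is a Cauchy sequence in $\mathbb{Q}$. Consequently $(q_n)_{n\in\mathbb{N}}$ defines an element in $\mathbb{R}$ ($= C/\sim_{\mathbb{R}}$), and this element we denote by $x$ and we claim $\lim_{n\to\infty} j(q_n) = x$. Given $\epsilon > 0$, $\epsilon \in \mathbb{R}$, we can find as before $\epsilon' > 0$, $\epsilon' \in j(\mathbb{Q})$, such that $0 < \epsilon' < \epsilon$. Since $(q_n)_{n\in\mathbb{N}}$ is a Cauchy sequence in $\mathbb{Q}$ there exists $N \in \mathbb{N}$ such that for $n, m \ge N$ we have
\[ |q_n - q_m| < \frac{j^{-1}(\epsilon')}{2}. \]
For $m \ge N$ fixed consider the sequence $(y_n^{(m)})_{n\in\mathbb{N}}$ where
\[ y_n^{(m)} := j^{-1}(\epsilon') - |q_n - q_m|. \]
With $(q_n)_{n\in\mathbb{N}}$ also $(q_n - q_m)_{n\in\mathbb{N}}$ is a Cauchy sequence, hence $(y_n^{(m)})_{n\in\mathbb{N}}$ is a Cauchy sequence in $\mathbb{Q}$. Moreover we have for $n \ge N$
\[ y_n^{(m)} = j^{-1}(\epsilon') - |q_n - q_m| > j^{-1}(\epsilon') - \frac{j^{-1}(\epsilon')}{2} = \frac{j^{-1}(\epsilon')}{2} > 0, \]
i.e. $(y_n^{(m)})_{n\in\mathbb{N}}$ is a positive sequence, implying that
\[ [(y_n^{(m)})_{n\in\mathbb{N}}] = [(j^{-1}(\epsilon') - |q_n - q_m|)_{n\in\mathbb{N}}] > 0, \]
hence
\[ [(|q_n - q_m|)_{n\in\mathbb{N}}] < [j^{-1}(\epsilon')] = \epsilon', \]
or
\[ |j(q_m) - x| = |[q_m] - x| = |[q_m] - [(q_n)_{n\in\mathbb{N}}]| = |[(q_m - q_n)_{n\in\mathbb{N}}]| = [(|q_m - q_n|)_{n\in\mathbb{N}}] < \epsilon' < \epsilon. \]
Since $m \ge N$ was arbitrary it follows that $\lim_{m\to\infty} j(q_m) = x$.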
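Corollary A.VI.6. For every $x \in \mathbb{R}$ and $\epsilon > 0$ there exists $x_\epsilon \in j(\mathbb{Q})$ such that $|x - x_\epsilon| < \epsilon$.
Proof. Let $x \in \mathbb{R}$, i.e. $x = [(q_n)_{n\in\mathbb{N}}]$ for a Cauchy sequence $(q_n)_{n\in\mathbb{N}}$, $q_n \in \mathbb{Q}$. By Theorem A.VI.5 we have $\lim_{n\to\infty} j(q_n) = x$, so given $\epsilon > 0$ we choose $N(\epsilon) \in \mathbb{N}$ such that $|j(q_n) - x| < \epsilon$ for $n \ge N(\epsilon)$ and it follows that $|x - x_\epsilon| < \epsilon$ for $x_\epsilon = j(q_{N(\epsilon)+1})$.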
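Finally we can prove
Theorem A.VI.7. In $\mathbb{R}$ every Cauchy sequence converges.
Proof. Let $(x_n)_{n\in\mathbb{N}}$ be a Cauchy sequence in $\mathbb{R}$. We have to prove the existence of $x \in \mathbb{R}$ such that $\lim_{n\to\infty} x_n = x$. Let $(\epsilon_n)_{n\in\mathbb{N}}$, $\epsilon_n > 0$, be a sequence in $\mathbb{R}$ such that $\lim_{n\to\infty} \epsilon_n = 0$. (Any sequence $(\eta_n)_{n\in\mathbb{N}}$, $\eta_n \in \mathbb{Q}$, such that $\eta_n > 0$ and $\lim_{n\to\infty} \eta_n = 0$ will induce on $\mathbb{R}$ such a sequence by $\epsilon_n := j(\eta_n)$.)
By Corollary A.VI.6, for $n \in \mathbb{N}$ there exists $q_n \in \mathbb{Q}$ such that
\[ |j(q_n) - x_n| < \epsilon_n. \]
We claim that $(j(q_n))_{n\in\mathbb{N}}$ is a Cauchy sequence in $\mathbb{R}$. For $n, m \in \mathbb{N}$ we find
\[ |j(q_n) - j(q_m)| \le |j(q_n) - x_n| + |x_n - x_m| + |j(q_m) - x_m| \le \epsilon_n + \epsilon_m + |x_n - x_m|. \]
Since $\lim_{n\to\infty} \epsilon_n = 0$ and $(x_n)_{n\in\mathbb{N}}$ is a Cauchy sequence, given $\epsilon > 0$ we can find $N(\epsilon) \in \mathbb{N}$ such that $n, m \ge N(\epsilon)$ implies
\[ \epsilon_n < \frac{\epsilon}{3}, \quad \epsilon_m < \frac{\epsilon}{3}, \quad |x_n - x_m| < \frac{\epsilon}{3}, \]
or $|j(q_n) - j(q_m)| < \epsilon$, i.e. $(j(q_n))_{n\in\mathbb{N}}$ is a Cauchy sequence in $\mathbb{R}$. By Theorem A.VI.4 we know that $(q_n)_{n\in\mathbb{N}}$ must be a Cauchy sequence in $\mathbb{Q}$. We define $x := [(q_n)_{n\in\mathbb{N}}]$ and show that $\lim_{n\to\infty} x_n = x$. From Theorem A.VI.5 we deduce that $\lim_{n\to\infty} j(q_n) = x$ and therefore
\[ |x_n - x| \le |x_n - j(q_n)| + |j(q_n) - x| < \epsilon_n + |j(q_n) - x|. \]
Given $\epsilon > 0$, since $\lim_{n\to\infty} \epsilon_n = 0$ and $\lim_{n\to\infty} j(q_n) = x$, we can find $N \in \mathbb{N}$ such that $n \ge N$ yields $\epsilon_n < \frac{\epsilon}{2}$ and $|j(q_n) - x| < \frac{\epsilon}{2}$, or for $n \ge N$
\[ |x_n - x| < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon, \]
which implies $\lim_{n\to\infty} x_n = x$.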
Our presentation follows that in K. Endl and W. Luh [3]. However we have
left some of the details to the reader (including the fact that $\mathbb{R}$ as constructed
is an Archimedean field). The reason why the proof is so long is partly because
we have a lot of structure on $\mathbb{Q}$ which needs to be transferred to $\mathbb{R}$: algebraic
structures, order structure and convergence (topological structure).
Finally we want to mention a further possibility of constructing R from Q.
Let A, B ⊂ R be two non-empty sets such that A ∪ B = R. In addition we
require for all a ∈ A and b ∈ B that a < b. We call such a pair of subsets
of R a Dedekind cut and denote it by (A|B). Further we call any s ∈ R a
separating number for (A|B) if for all a ∈ A and b ∈ B we have a ≤ s ≤ b.
Equivalent to our axiom of completeness is:
Axiom D
Every Dedekind cut has exactly one separating number.
As it stands Axiom D is as “artificial” as our Axiom of Completeness. However we may introduce Dedekind cuts first in $\mathbb{Q}$ and then prove that with the help of these cuts we can construct $\mathbb{R}$. In W. Rudin [11] this construction is given in detail.
Axiom A and Axiom D use the order structure of R (or Q) and look more
natural than the Axiom of Completeness using Cauchy sequences. However
the construction using Cauchy sequences extends to many more situations
where no order structure is given but just a metric.
Appendix VII: Limes Superior and Limes Inferior
The concepts of limes superior and limes inferior are difficult ones, and stu-
dents will have to spend time to understand these ideas and how to work with
them. While we have proved some basic properties of lim sup and lim inf in
the main text or in the solved problems, we believe that it is of some benefit
to students to have a more detailed list of properties of lim sup and lim inf.
We will not give proofs, however many detailed proofs can be found in R. L.
Schilling [12].
In the following (an )n∈N and (bn )n∈N are sequences of real numbers and λ > 0
is a fixed real number.
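\[ \limsup_{n\to\infty} a_n = -\liminf_{n\to\infty} (-a_n), \quad \liminf_{n\to\infty} a_n = -\limsup_{n\to\infty} (-a_n); \tag{A.VII.1} \]
\[ \liminf_{n\to\infty} a_n \le \limsup_{n\to\infty} a_n; \tag{A.VII.2} \]
if $a \in \mathbb{R}$ is an accumulation point of $(a_n)_{n\in\mathbb{N}}$ then
\[ \liminf_{n\to\infty} a_n \le a \le \limsup_{n\to\infty} a_n, \tag{A.VII.3} \]
and $\liminf_{n\to\infty} a_n$ as well as $\limsup_{n\to\infty} a_n$ are accumulation points of $(a_n)_{n\in\mathbb{N}}$;
\[ \text{if } \lim_{n\to\infty} a_n = a \in \mathbb{R} \text{ exists then } \limsup_{n\to\infty} a_n = \liminf_{n\to\infty} a_n = \lim_{n\to\infty} a_n; \tag{A.VII.4} \]
\[ \limsup_{n\to\infty} (a_n + b_n) \le \limsup_{n\to\infty} a_n + \limsup_{n\to\infty} b_n; \tag{A.VII.5} \]
\[ \liminf_{n\to\infty} (a_n + b_n) \ge \liminf_{n\to\infty} a_n + \liminf_{n\to\infty} b_n; \tag{A.VII.6} \]
\[ \liminf_{n\to\infty} (a_n + b_n) \le \liminf_{n\to\infty} a_n + \limsup_{n\to\infty} b_n \le \limsup_{n\to\infty} (a_n + b_n); \tag{A.VII.7} \]
if $\lim_{n\to\infty} b_n = b \in \mathbb{R}$ exists then
\[ \limsup_{n\to\infty} (a_n + b_n) = \limsup_{n\to\infty} a_n + \lim_{n\to\infty} b_n, \quad \liminf_{n\to\infty} (a_n + b_n) = \liminf_{n\to\infty} a_n + \lim_{n\to\infty} b_n; \tag{A.VII.8} \]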
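if $a_n \ge 0$ and $b_n \ge 0$ for all $n \in \mathbb{N}$ then
\[ \limsup_{n\to\infty} (a_n b_n) \le \limsup_{n\to\infty} a_n \, \limsup_{n\to\infty} b_n, \quad \liminf_{n\to\infty} (a_n b_n) \ge \liminf_{n\to\infty} a_n \, \liminf_{n\to\infty} b_n; \tag{A.VII.9} \]
if $a_n \ge 0$ and $b_n \ge 0$ for all $n \in \mathbb{N}$ then
\[ \liminf_{n\to\infty} (a_n b_n) \le \liminf_{n\to\infty} a_n \, \limsup_{n\to\infty} b_n \le \limsup_{n\to\infty} (a_n b_n); \tag{A.VII.10} \]
if $a_n \ge 0$ and $b_n \ge 0$ for all $n \in \mathbb{N}$ and $\lim_{n\to\infty} b_n = b \in \mathbb{R}$ exists, then we have
\[ \limsup_{n\to\infty} (a_n b_n) = \limsup_{n\to\infty} a_n \, \lim_{n\to\infty} b_n, \quad \liminf_{n\to\infty} (a_n b_n) = \liminf_{n\to\infty} a_n \, \lim_{n\to\infty} b_n, \tag{A.VII.11} \]
in particular for $\lambda > 0$ it follows
\[ \limsup_{n\to\infty} (\lambda a_n) = \lambda \limsup_{n\to\infty} a_n, \quad \liminf_{n\to\infty} (\lambda a_n) = \lambda \liminf_{n\to\infty} a_n; \]
\[ \limsup_{n\to\infty} |a_n| = 0 \text{ implies } \lim_{n\to\infty} a_n = 0; \tag{A.VII.12} \]
\[ \limsup_{n\to\infty} a_n = \infty \text{ if and only if } \liminf_{n\to\infty} \frac{1}{a_n} = 0, \quad \liminf_{n\to\infty} a_n = \infty \text{ if and only if } \limsup_{n\to\infty} \frac{1}{a_n} = 0; \tag{A.VII.13} \]
if $0 < \liminf_{n\to\infty} a_n \le \limsup_{n\to\infty} a_n < \infty$ then
\[ \limsup_{n\to\infty} \frac{1}{a_n} = \frac{1}{\liminf_{n\to\infty} a_n}, \quad \liminf_{n\to\infty} \frac{1}{a_n} = \frac{1}{\limsup_{n\to\infty} a_n}; \tag{A.VII.14} \]
if $(a_n)_{n\in\mathbb{N}}$ is bounded then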
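\[ \liminf_{n\to\infty} a_n \le \liminf_{n\to\infty} \frac{a_1 + \cdots + a_n}{n} \le \limsup_{n\to\infty} \frac{a_1 + \cdots + a_n}{n} \le \limsup_{n\to\infty} a_n; \tag{A.VII.15} \]
if $(a_n)_{n\in\mathbb{N}}$ is bounded and $a_n > 0$ then
\[ \liminf_{n\to\infty} a_n \le \liminf_{n\to\infty} \sqrt[n]{a_1 \cdot \ldots \cdot a_n} \le \limsup_{n\to\infty} \sqrt[n]{a_1 \cdot \ldots \cdot a_n} \le \limsup_{n\to\infty} a_n. \tag{A.VII.16} \]
A proof of (A.VII.15) and (A.VII.16) is given for example in H. Heuser, [5, Section 28].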
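The chain of inequalities (A.VII.15) can be observed on a simple example. The following Python sketch uses the sequence $a_n = (-1)^n (1 + \frac{1}{n})$, chosen here only for illustration, for which the limes superior is 1 and the limes inferior is −1. The helper `tail_sup_inf` is a hypothetical name; it looks at a finite stretch of the tail only, so it gives an approximation and not the limit itself.

```python
def tail_sup_inf(seq, start, stop):
    """Maximum and minimum of seq(n) for start <= n < stop, a finite stand-in for the tail n >= start."""
    values = [seq(n) for n in range(start, stop)]
    return max(values), min(values)

def a(n):
    # Example sequence with lim sup = 1 and lim inf = -1.
    return (-1) ** n * (1 + 1 / n)

def mean(n):
    # Arithmetic mean (a_1 + ... + a_n) / n as in (A.VII.15).
    return sum(a(k) for k in range(1, n + 1)) / n

sup_tail, inf_tail = tail_sup_inf(a, 1000, 3000)
mean_sup, mean_inf = tail_sup_inf(mean, 1000, 1200)

print(inf_tail, mean_inf, mean_sup, sup_tail)
```

In this example the arithmetic means tend to 0, so all inequalities in (A.VII.15) are strict: the means converge although the sequence itself does not.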
Appendix VIII: Connected Sets in R

In this appendix we provide proofs for Theorem 19.25 and Theorem 19.27.
Recall that Theorem 19.25 states that a non-empty subset of R is connected
if and only if it is an interval.
Proof of Theorem 19.25. Suppose that $A \subset \mathbb{R}$ is not an interval. It follows that there exist $a < b < c$ such that $a, c \in A$ and $b \notin A$. Define $O_1 := (-\infty, b)$ and $O_2 := (b, \infty)$. Clearly $O_1 \cap O_2 = \emptyset$ and both sets are open. Moreover $A \cap O_1$ and $A \cap O_2$ are non-empty and $A \subset O_1 \cup O_2$. Thus $A$ has a non-trivial splitting and is therefore not connected, and we have proved that for $A$ to be connected it is necessary that $A$ is an interval.
Next we prove that $[a, b] \subset \mathbb{R}$ is connected. Suppose that $[a, b]$ is not connected and that $\{O_1, O_2\}$ is a non-trivial splitting of $[a, b]$ with $a \in O_1$. Define
\[ c := \sup\{x \in \mathbb{R} \mid [a, x] \subset O_1 \cap [a, b]\}. \]
If $c < b$ and $c \in O_1$ then there exists $\eta > 0$ such that $[c - \eta, c + \eta] \subset O_1 \cap [a, b]$ and $[a, c + \eta] \subset O_1 \cap [a, b]$ which is a contradiction. Consequently $c \in O_2 \cap [a, b]$ and $[c - \delta, c + \delta] \subset O_2 \cap [a, b]$ for some $\delta > 0$. But now we find for $c - \delta \le x$ that $[a, x]$ is not a subset of $O_1 \cap [a, b]$ which again contradicts the definition of $c$. Hence $c = b$, $[a, b) \subset O_1 \cap [a, b]$ and $\{b\} = O_2 \cap [a, b]$. But $O_2$ is open and therefore $O_2 \cap [a, b]$ is either empty or contains more than one point, which is a contradiction.
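Now any open interval $(a, b)$ has the representation
\[ (a, b) = \bigcup_{m \ge m_0} \left[a + \tfrac{1}{m}, b - \tfrac{1}{m}\right], \quad \mathbb{R} = \bigcup_{m\in\mathbb{N}} [-m, m], \]
\[ (-\infty, b) = \bigcup_{m\in\mathbb{N}} \left[-m, b - \tfrac{1}{m}\right], \quad (a, \infty) = \bigcup_{m\in\mathbb{N}} \left[a + \tfrac{1}{m}, m\right], \]
and a half-open interval is of the type $I \cup \{c\}$ where $I$ is an open interval with $c$ being an end point. Thus if we can prove that the union of intersecting connected sets is connected, and noting that $\{a\}$, $a \in \mathbb{R}$, is trivially connected, we are done.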
Our claim is: let $A_j \subset \mathbb{R}$, $j \in J \neq \emptyset$, be a family of connected sets such that $\bigcap_{j\in J} A_j \neq \emptyset$. Then $\bigcup_{j\in J} A_j$ is connected too.
Suppose $\bigcup_{j\in J} A_j$ is not connected and let $\{O_1, O_2\}$ be a non-trivial splitting of $\bigcup_{j\in J} A_j$ such that $O_1 \cap \bigcap_{j\in J} A_j \neq \emptyset$. Since $A_j$ is connected it follows that $A_j \cap O_2 = \emptyset$ for all $j \in J$, implying $O_2 \cap \bigcup_{j\in J} A_j = \emptyset$ which is a contradiction. Hence $\bigcup_{j\in J} A_j$ does not have a non-trivial splitting and therefore it is connected.
Next we prove Theorem 19.27 which states that every open set in R is a
denumerable union of disjoint open intervals.
Proof of Theorem 19.27. Let $A \subset \mathbb{R}$ be open and $x \in A$. Then there exists $\delta > 0$ such that $(x, x + \delta) \subset A$ and $(x - \delta, x) \subset A$. Let $b := \sup\{y \mid (x, y) \subset A\}$ and $a := \inf\{z \mid (z, x) \subset A\}$. Clearly $a < x < b$. We set $I_x := (a, b)$ and claim that $I_x \subset A$. Indeed, if $x < w < b$ then there exists $y > w$ such that $(x, y) \subset A$ but $w \in (x, y)$, so $w \in A$; the case $a < w < x$ goes analogously. Next we prove that $b \notin A$ (the fact that $a \notin A$ goes analogously). Suppose $b \in A$. In this case there would exist some $\epsilon > 0$ such that $(b - \epsilon, b + \epsilon) \subset A$, hence $(x, b + \epsilon) \subset A$ contradicting the definition of $b$. We consider now $(I_x)_{x\in A}$. Each $x \in A$ belongs to some of these intervals, for example $I_x$, and each $I_x$ is contained in $A$, thus $A = \bigcup_{x\in A} I_x$. We want to prove that either $I_{x_1} \cap I_{x_2} = \emptyset$ or $I_{x_1} = I_{x_2}$. Let $I_{x_1}$ and $I_{x_2}$, $x_1, x_2 \in A$, say $I_{x_1} = (\alpha_1, \beta_1)$ and $I_{x_2} = (\alpha_2, \beta_2)$, and suppose $x \in (\alpha_1, \beta_1) \cap (\alpha_2, \beta_2)$. In this case it follows that $\alpha_2 < \beta_1$ and $\alpha_1 < \beta_2$. But $\alpha_2 \notin A$ hence $\alpha_2 \notin (\alpha_1, \beta_1)$ and therefore $\alpha_2 \le \alpha_1$. Since $\alpha_1 \notin A$ and hence $\alpha_1 \notin (\alpha_2, \beta_2)$ we have $\alpha_1 \le \alpha_2$, i.e. $\alpha_1 = \alpha_2$. Similarly we can prove $\beta_1 = \beta_2$ to get $(\alpha_1, \beta_1) = (\alpha_2, \beta_2)$. Thus if $I_{x_1} \cap I_{x_2} \neq \emptyset$ then $I_{x_1} = I_{x_2}$. So we have already proved that $A$ is the union of disjoint open intervals. By Theorem 19.11 each of these intervals must contain a rational number. But the rational numbers are countable and no rational number can belong to two of these intervals. Hence we have at most countably many open intervals, the union of which is $A$.
Solutions to Problems of Part 1

Chapter 1
1. The set {φ} is not empty. It contains one element, the empty set, i.e. φ ∈ {φ}.
2. a) For a real number $x$ belonging to the set
$\{x \in \mathbb{R} \mid x^2 = 16 \text{ and } 2x + 3 = 12\}$ two conditions must be satisfied: $x^2 = 16$ and
$2x + 3 = 12$. The first condition implies that $x = 4$ or $x = -4$, however the second
condition implies that $x = \frac{9}{2}$. Hence we cannot satisfy both conditions so the set is
empty.
b) For a rational number to belong to the set
$\{x \in \mathbb{Q} \mid x^2 = 9 \text{ and } 3x - 6 = 3\}$ the following two conditions must be satisfied:
$x^2 = 9$ and $3x - 6 = 3$. The first condition gives $x = 3$ or $x = -3$ while the second
one leads to $x = 3$. Hence $\{x \in \mathbb{Q} \mid x^2 = 9 \text{ and } 3x - 6 = 3\} = \{3\} \neq \phi$.
c) It is clear that the set $\{x \in \mathbb{R} \mid x \neq x\}$ is empty. There is no real number
not equal to itself.
d) The condition $x^2 = \frac{1}{4}$ implies that $x = \frac{1}{2}$ or $x = -\frac{1}{2}$. Neither is an integer,
hence the set $\{x \in \mathbb{Z} \mid x^2 = \frac{1}{4}\}$ is empty.
e) Since $x^2 = \frac{1}{4}$ implies that $x = \frac{1}{2}$ or $x = -\frac{1}{2}$ and they are both rational
numbers, it follows that $\{x \in \mathbb{Q} \mid x^2 = \frac{1}{4}\} = \{\frac{1}{2}, -\frac{1}{2}\}$, hence the set is non-empty.
Note that this is different to problem d): In both problems we have to deal with the
same condition $x^2 = \frac{1}{4}$. However, we seek integers in problem d) while in problem
e) we seek rational numbers.
3. a) Since every element in A is an odd integer we have A ⊂ B.
b) 9 is not a prime number, therefore A is not a subset of C: there is (at least)
one element in A which does not belong to C.
c) Every number belonging to C is an odd integer, hence belonging to B, then
C ⊂ B.
4. The set Z \ M consists of all integers x ∈ Z that do not belong to M , i.e. in order
to belong to Z \ M a number x must be an integer and x < 5. Therefore we have
Z \ M = {x ∈ Z | x < 5} = {x ∈ Z | x ≤ 4}.
5. The set $R = \{k \in \mathbb{N} \mid k^2 \le 10\}$ consists of the numbers 1, 2 and 3, i.e. $R = \{1, 2, 3\}$. Consequently we have
$B \setminus R = \{1, 2, 3, 4, 5, 6\} \setminus \{1, 2, 3\} = \{4, 5, 6\}$.
6. a) The condition $5x + 7 = 13$ implies $x = \frac{6}{5} \notin \mathbb{Z}$ which gives
$\{x \in \mathbb{Z} \mid 5x + 7 = 13\} = \phi$.
b) This is the same condition as in a), but now we seek rational solutions and $\frac{6}{5} \in \mathbb{Q}$.
Therefore we have
$\{x \in \mathbb{Q} \mid 5x + 7 = 13\} = \left\{ \frac{6}{5} \right\}$.
c) Now we have to handle the inequality $5x + 7 \le 13$ which is equivalent to $x \le \frac{6}{5}$.
However, only integer solutions are allowed, which leads to
$\{x \in \mathbb{Z} \mid 5x + 7 \le 13\} = \{x \in \mathbb{Z} \mid x \le 1\}$.
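7. a)
\[ -\frac{7}{3}\left(\frac{27}{8} - \frac{18}{5}\right) = -\frac{7}{3} \cdot \frac{27 \cdot 5 - 18 \cdot 8}{8 \cdot 5} = -\frac{7}{3} \cdot \frac{135 - 144}{40} = -\frac{7}{3}\left(-\frac{9}{40}\right) = \frac{7}{3} \cdot \frac{9}{40} = \frac{7 \cdot 3}{40} = \frac{21}{40}. \]
b)
\[ \frac{\frac{3}{4} + \frac{7}{12}}{\frac{2}{19} - \frac{1}{7}} = \frac{\frac{3 \cdot 3}{3 \cdot 4} + \frac{7}{12}}{\frac{2 \cdot 7}{19 \cdot 7} - \frac{19}{19 \cdot 7}} = \frac{\frac{9 + 7}{12}}{-\frac{5}{133}} = -\frac{\frac{16}{12}}{\frac{5}{133}} = -\frac{\frac{4}{3}}{\frac{5}{133}} = -\frac{4 \cdot 133}{3 \cdot 5} = -\frac{532}{15}. \]
c)
\[ \frac{4^2 - 3^3}{5^2 + 19} = \frac{16 - 27}{25 + 19} = -\frac{11}{44} = -\frac{1}{4}. \]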
8. a)
\[ \frac{3a + 4(a + b)^2 - 6a(\frac{1}{2} + b) - 2b(a + 2b)}{\frac{1}{2}(a + b)} = \frac{3a + 4(a^2 + 2ab + b^2) - 3a - 6ab - 2ab - 4b^2}{\frac{1}{2}(a + b)} \]
\[ = \frac{4a^2 + 8ab + 4b^2 - 8ab - 4b^2}{\frac{1}{2}(a + b)} = \frac{4a^2}{\frac{1}{2}(a + b)} = \frac{8a^2}{a + b}. \]
b) We need to prove that
\[ \frac{1}{2}(a^2 - 3b^2 - c^2 - 2ab + 4bc) = \frac{1}{4}(a + b - c)(2a - 6b + 2c). \]
Now
\[ \frac{1}{4}(a + b - c)(2a - 6b + 2c) = \frac{1}{2}(a + b - c)(a - 3b + c) \]
\[ = \frac{1}{2}(a^2 + ab - ac - 3ab - 3b^2 + 3bc + ac + bc - c^2) = \frac{1}{2}(a^2 - 3b^2 - c^2 - 2ab + 4bc) \]
and therefore the result is proved.
c)
\[ \frac{a - b}{a + b} + \frac{4ab}{(a + b)^2} - \frac{a + b}{a - b} = \frac{(a - b)(a + b)(a - b)}{(a + b)^2 (a - b)} + \frac{4ab(a - b)}{(a + b)^2 (a - b)} - \frac{(a + b)(a + b)^2}{(a + b)^2 (a - b)} \]
\[ = \frac{(a^2 - b^2)(a - b) + 4a^2 b - 4ab^2 - (a + b)^3}{(a + b)^2 (a - b)} \]
\[ = \frac{a^3 - ab^2 - a^2 b + b^3 + 4a^2 b - 4ab^2 - a^3 - 3a^2 b - 3ab^2 - b^3}{(a + b)^2 (a - b)} = -\frac{8ab^2}{(a + b)^2 (a - b)}. \]
d)
\[ \frac{x^3 - y^3}{y - x} - y^4 x^2 \left( \frac{1}{y^3 x} - \frac{x}{y} + \frac{y}{x} \right) = -(x^2 + xy + y^2) - yx + y^3 x^3 - y^5 x \]
\[ = x^3 y^3 - x^2 - 2xy - xy^5 - y^2. \]
9.
1
12 6
8
9 11 − 29
5 − 7
7

8 3
−
3 4 222 84
1 72 30

− −
= 9 99 8 99 1135 35
3 · − 4
1 50 54 1·50·54
·99 · 35
=−9 82 = − 9·99·35
22
12 3
10.6 20 20
9·11·7 3·11·7 11·7
=− 22 = − 22 = − 22
3 3 1
20 10 10
=− =− =− ·
77 · 22 77 · 11 847
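10. a)
\[ \left(\frac{2}{3}\right)^3 - \left(\frac{1}{2}\right)^4 + 5 \cdot \frac{8}{9} = \frac{8}{27} - \frac{1}{16} + \frac{40}{9} = \frac{8 \cdot 16 - 27 + 40 \cdot 3 \cdot 16}{27 \cdot 16} = \frac{2021}{432}. \]
b)
\[ \frac{\left(\frac{2}{5}\right)^3 - \left(\frac{3}{8}\right)^2}{\frac{19}{40}} = \frac{\frac{8}{125} - \frac{9}{64}}{\frac{19}{40}} = \frac{\frac{512 - 1125}{8000}}{\frac{19}{40}} = \frac{-\frac{613}{200}}{19} = -\frac{613}{3800}. \]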
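11. a)
\[ \frac{(a + b)^3 - (b - a)^2 (b + a)}{4ab} = \frac{a^3 + 3a^2 b + 3ab^2 + b^3 - (b^2 - 2ab + a^2)(b + a)}{4ab} \]
\[ = \frac{a^3 + 3a^2 b + 3ab^2 + b^3 - b^3 + a^2 b + ab^2 - a^3}{4ab} = \frac{4a^2 b + 4ab^2}{4ab} = a + b; \]
b)
\[ \frac{\left(\frac{a}{b}\right)^3 - \left(\frac{b}{a}\right)^4}{a^2 b^3} = \frac{\frac{a^3}{b^3} - \frac{b^4}{a^4}}{a^2 b^3} = \frac{a^7 - b^7}{a^6 b^6}. \]
12. a) $\sqrt{625} = 25$; b) $\sqrt{\frac{225}{49}} = \frac{15}{7}$; c) $\sqrt{\frac{a^4 b^6}{(a + b)^2}} = \frac{a^2 b^3}{a + b}$.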
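13. a) First we observe that
$3x - 12 \ge -7$
is equivalent to
$3x \ge 5$
or
$x \ge \frac{5}{3}$.
Hence every $x \in \mathbb{R}$ with $x \ge \frac{5}{3}$ solves the above inequality.
[Figure: number line showing the solution set $x \ge \frac{5}{3}$.]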
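b) Note that
$\frac{7}{4} + \frac{2}{5} x \le \frac{3}{8} x$
is equivalent to
$\frac{2}{5} x - \frac{3}{8} x \le -\frac{7}{4}$,
i.e.
$\frac{1}{40} x \le -\frac{7}{4}$
or
$x \le -70$.
Thus every $x \in \mathbb{R}$ satisfying $x \le -70$ solves this inequality.
[Figure: number line showing the solution set $x \le -70$.]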
c) In order to have (x − 3)(x + 4) ≥ 0 we must have that either
(x − 3) ≥ 0 and (x + 4) ≥ 0
or
(x − 3) ≤ 0 and (x + 4) ≤ 0
is true.
The first pair of inequalities imply
x≥3 and x ≥ −4
hence whenever x ≥ 3 then (x − 3)(x + 4) ≥ 0.

The second pair of inequalities give
x≤3 and x ≤ −4
which yields that for all x ≤ −4 we have (x − 3)(x + 4) ≥ 0.
[Figure: number line showing the solution set $x \le -4$ or $x \ge 3$.]
14. The term $x^{y^z}$ is not well defined. For $x = 2$, $y = 3$, $z = 2$ we have
$x^y = 2^3 = 8$ and therefore $(x^y)^z = 8^2 = 64$,
however, since $y^z = 3^2 = 9$, it follows that
$x^{(y^z)} = 2^9 = 512$.
Thus $(x^y)^z \neq x^{(y^z)}$ and therefore the brackets are needed.
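The computation in Problem 14 can be replayed in a few lines of Python. This is only an illustrative sketch; the variable names are chosen here and are not part of the text.

```python
# Exponentiation is not associative: the two bracketings give different values.
x, y, z = 2, 3, 2

left = (x ** y) ** z     # (2^3)^2 = 8^2
right = x ** (y ** z)    # 2^(3^2) = 2^9

# Python resolves the ambiguity by convention: ** groups from the right,
# so the unbracketed expression is read as x ** (y ** z).
unbracketed = x ** y ** z

print(left, right, unbracketed)
```

A programming language has to fix one reading by convention; in written mathematics the brackets carry that information.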
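15. a) Using that $b^{-1} = \frac{1}{b}$, $d^{-1} = \frac{1}{d}$ we have
\[ \frac{1}{b} + \frac{1}{d} = b^{-1} + d^{-1} = \frac{b \cdot d}{b \cdot d}(b^{-1} + d^{-1}) = \frac{1}{b \cdot d}\left(b^{-1}(b \cdot d) + d^{-1}(b \cdot d)\right) \]
\[ = \frac{1}{b \cdot d}\left((b^{-1} \cdot b) \cdot d + (d^{-1} \cdot d) b\right) = \frac{1}{b \cdot d}(d + b) = \frac{d + b}{b \cdot d}. \]
b) We first show that $(x^{-1})^{-1} = x$ for $x \neq 0$. Since $(x^{-1})^{-1} x^{-1} = 1$ and
$x \cdot x^{-1} = 1$ it follows that
$(x^{-1})^{-1} \cdot x^{-1} = x \cdot x^{-1}$
or
$(x^{-1})^{-1} x^{-1} \cdot x = x \cdot x^{-1} \cdot x$,
i.e.
$(x^{-1})^{-1} = x$
and using fractions we get $\frac{1}{\frac{1}{x}} = x$. Now we find
\[ \frac{\frac{a}{b}}{\frac{c}{d}} = \frac{a}{b} \cdot \left(\frac{c}{d}\right)^{-1} = \frac{a}{b}\left(c \cdot \frac{1}{d}\right)^{-1} = \frac{a}{b} \cdot c^{-1} \left(\frac{1}{d}\right)^{-1} = \frac{a}{b} \cdot \frac{1}{c} \cdot d = \frac{ad}{bc}. \]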
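16. a) A straightforward calculation gives
\[ a\left(x + \frac{b}{2a}\right)^2 - \frac{b^2}{4a} + c = a\left(x^2 + 2\frac{bx}{2a} + \frac{b^2}{4a^2}\right) - \frac{b^2}{4a} + c \]
\[ = ax^2 + bx + \frac{b^2}{4a} - \frac{b^2}{4a} + c = ax^2 + bx + c, \]
therefore the equivalence is established.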

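b) By part (a), we have for $x \in \mathbb{R}$ such that $ax^2 + bx + c = 0$ that
\[ a\left(x + \frac{b}{2a}\right)^2 = \frac{b^2}{4a} - c \]
or
\[ \left(x + \frac{b}{2a}\right)^2 = \frac{1}{4a^2}(b^2 - 4ac). \]
By assumption $b^2 - 4ac \ge 0$, therefore we can take the square root on the right
hand side to get
\[ \sqrt{\frac{1}{4a^2}(b^2 - 4ac)} = \frac{1}{2a}\sqrt{b^2 - 4ac}. \]
Now we wish to take the square root on the left hand side. If $x + \frac{b}{2a} \ge 0$ we have
no problem to find
\[ x + \frac{b}{2a} = \frac{1}{2a}\sqrt{b^2 - 4ac}, \]
or
\[ x = -\frac{b}{2a} + \frac{1}{2a}\sqrt{b^2 - 4ac}. \]
If $x + \frac{b}{2a} \le 0$, we know that $-\left(x + \frac{b}{2a}\right) = -x - \frac{b}{2a} \ge 0$, however $\left(x + \frac{b}{2a}\right)^2 = \left(-x - \frac{b}{2a}\right)^2$. Thus we have
\[ \left(-x - \frac{b}{2a}\right)^2 = \frac{1}{4a^2}(b^2 - 4ac) \]
or
\[ -x - \frac{b}{2a} = \frac{1}{2a}\sqrt{b^2 - 4ac}, \]
implying
\[ x = -\frac{b}{2a} - \frac{1}{2a}\sqrt{b^2 - 4ac}. \]
Thus so long as $b^2 - 4ac \ge 0$ we find that the solutions of the quadratic equation
$ax^2 + bx + c = 0$ are
\[ x_1 = -\frac{b}{2a} + \frac{1}{2a}\sqrt{b^2 - 4ac} \]
and
\[ x_2 = -\frac{b}{2a} - \frac{1}{2a}\sqrt{b^2 - 4ac}. \]
If $b^2 = 4ac$ we have only one solution $x_1 = x_2 = -\frac{b}{2a}$.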
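The case distinction of Problem 16 b) translates directly into a small routine. The following Python sketch is an illustration only; the function name `solve_quadratic` and its return convention (a tuple of real roots) are assumptions made here, not something the text prescribes.

```python
import math


def solve_quadratic(a, b, c):
    """Real solutions of a*x**2 + b*x + c = 0 with a != 0, as a tuple.

    Mirrors the derivation: (x + b/(2a))**2 = (b**2 - 4ac) / (4a**2).
    """
    if a == 0:
        raise ValueError("a must be non-zero for a quadratic equation")
    disc = b * b - 4 * a * c            # the quantity b^2 - 4ac
    if disc < 0:
        return ()                       # no real solution
    shift = -b / (2 * a)                # the term -b/(2a)
    if disc == 0:
        return (shift,)                 # x1 = x2 = -b/(2a)
    root = math.sqrt(disc) / (2 * a)    # (1/(2a)) * sqrt(b^2 - 4ac)
    return (shift + root, shift - root)


# x^2 - x - 12 = (x - 4)(x + 3) has the roots 4 and -3.
print(solve_quadratic(1, -1, -12))
```

Floating-point arithmetic is used for brevity, so results for other coefficients may carry rounding errors.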
Chapter 2
1. Since $A^{\complement} = \{x \in X \mid x \notin A\}$ we have $A^{\complement} = \{e, f, g, h, i\}$. The set $A \cap C$ is given by
those elements belonging to both the sets $A$ and $C$, hence,
$A \cap C = \{c, d\}$.
Now we find
$(A \cap C)^{\complement} = \{a, b, e, f, g, h, i\}$.
The set $B \setminus C$ consists of every $x \in X$ which belongs to $B$ but does not belong to
$C$, so we find
$B \setminus C = \{b, h\}$.
Finally, since $A \cup B = \{a, b, c, d, f, h\}$, we have
$(A \cup B)^{\complement} = \{e, g, i\}$.
2. a) From the definition we know
$B_4(2) = \{x \in \mathbb{R} \mid |x - 2| < 4\} = \{x \in \mathbb{R} \mid -4 < x - 2 < 4\} = \{x \in \mathbb{R} \mid -2 < x < 6\}$
and analogously
$B_3(8) = \{x \in \mathbb{R} \mid |x - 8| < 3\} = \{x \in \mathbb{R} \mid 5 < x < 11\}$.
Thus for $x \in B_4(2) \cap B_3(8)$ the two sets of inequalities
$-2 < x < 6$ and $5 < x < 11$
must be true, i.e. $x$ must satisfy $5 < x < 6$, so
$B_4(2) \cap B_3(8) = \{x \in \mathbb{R} \mid 5 < x < 6\}$.
Here is the graphical solution to the problem:
[Figure: number line showing $B_4(2)$, $B_3(8)$ and their intersection.]
b) As in part a) we find
$B_2(5) = \{x \in \mathbb{R} \mid 3 < x < 7\}$
and
$B_7(-2) = \{x \in \mathbb{R} \mid -9 < x < 5\}$
implying that
$B_2(5) \cap B_7(-2) = \{x \in \mathbb{R} \mid 3 < x < 5\}$
and therefore we have
$(B_2(5) \cap B_7(-2))^{\complement} = \{x \in \mathbb{R} \mid x \le 3 \text{ or } x \ge 5\}$.
The graphical solution to the problem is the following:
[Figure: number line showing $B_2(5)$, $B_7(-2)$ and their intersection.]
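c) We have
\[ \left(-3, \frac{3}{2}\right) \cup \left(-\frac{1}{4}, \frac{7}{3}\right) = \left\{ x \in \mathbb{R} \,\middle|\, -3 < x < \frac{3}{2} \text{ or } -\frac{1}{4} < x < \frac{7}{3} \right\} = \left\{ x \in \mathbb{R} \,\middle|\, -3 < x < \frac{7}{3} \right\}, \]
which yields
\[ \left( \left(-3, \frac{3}{2}\right) \cup \left(-\frac{1}{4}, \frac{7}{3}\right) \right)^{\complement} = \left\{ x \in \mathbb{R} \,\middle|\, x \le -3 \text{ or } x \ge \frac{7}{3} \right\}. \]
Graphically we have:
[Figure: number line showing the two intervals and the complement of their union.]
d) Since
\[ \left[-2, \frac{7}{3}\right) = \left\{ x \in \mathbb{R} \,\middle|\, -2 \le x < \frac{7}{3} \right\} \]
and further
\[ \left[\frac{3}{5}, \frac{15}{4}\right] = \left\{ x \in \mathbb{R} \,\middle|\, \frac{3}{5} \le x \le \frac{15}{4} \right\} \]
we find
\[ \left[-2, \frac{7}{3}\right) \cap \left[\frac{3}{5}, \frac{15}{4}\right] = \left[\frac{3}{5}, \frac{7}{3}\right) \]
which we can also see from the following:
[Figure: number line showing the two intervals and their intersection.]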
3. a) It is true that x ∈ A ∩ B implies x ∈ A and x ∈ B, hence
x ∈ A ∩ B =⇒ x ∈ A
or we can write A ∩ B ⊂ A.
On the other hand x ∈ A implies x ∈ A or x ∈ B, i.e.
x ∈ A =⇒ x ∈ A ∪ B,
or we can write A ⊂ A ∪ B.
b) Let $x \in (A \setminus B) \cap B$. Then $x \in A \setminus B$ and $x \in B$, or
$(x \in A \wedge x \notin B) \wedge x \in B$
which implies that $x \notin B$ and $x \in B$ which does not hold for any $x$, i.e. $(A \setminus B) \cap B = \phi$.
c) The statement $x \in B \setminus A$ means that $x \in B$ and $x \notin A$, which is equivalent
to $x \in B$ and $x \in A^{\complement}$ which is the statement that $x \in B \cap A^{\complement}$.
4. a) The following holds:
x ∈ (A ∩ B) ∪ C
⇐⇒ (x ∈ A ∩ B) ∨ (x ∈ C)
⇐⇒ ((x ∈ A) ∧ (x ∈ B)) ∨ (x ∈ C)
⇐⇒ ((x ∈ A) ∨ (x ∈ C)) ∧ ((x ∈ B) ∨ (x ∈ C))
⇐⇒ x ∈ (A ∪ C) ∩ (B ∪ C) .
b) We use a truth table to prove this equality:

x ∈ A   x ∈ A^∁   x ∈ B   x ∈ B^∁   x ∈ A ∪ B   x ∈ (A ∪ B)^∁   x ∈ A^∁ ∩ B^∁
  T        F        T        F          T             F               F
  T        F        F        T          T             F               F
  F        T        T        F          T             F               F
  F        T        F        T          F             T               T

Since the last two columns coincide it follows that
$(A \cup B)^{\complement} = A^{\complement} \cap B^{\complement}$.
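The truth table covers all cases for a single element x. As a complementary sanity check, one can test both identities of Problem 4 on every triple of subsets of a small universe. The sketch below uses Python's built-in sets; the universe `X` and the helper names are hypothetical choices made for illustration.

```python
from itertools import combinations

# A small universe; any finite set would do.
X = frozenset(range(4))


def subsets(s):
    """Yield every subset of s as a frozenset."""
    items = sorted(s)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def complement(a):
    """Complement of a relative to the universe X."""
    return X - a


all_subsets = list(subsets(X))

# Problem 4 b): the complement of a union is the intersection of complements.
de_morgan_holds = all(
    complement(a | b) == complement(a) & complement(b)
    for a in all_subsets
    for b in all_subsets
)

# Problem 4 a): (A ∩ B) ∪ C = (A ∪ C) ∩ (B ∪ C).
distributive_holds = all(
    (a & b) | c == (a | c) & (b | c)
    for a in all_subsets
    for b in all_subsets
    for c in all_subsets
)

print(de_morgan_holds, distributive_holds)
```

Such a check is no substitute for the proof, since it only covers one finite universe, but it catches slips in a conjectured identity quickly.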
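5. We prove
a) =⇒ b) =⇒ c) =⇒ d) =⇒ e) =⇒ f) =⇒ a).
Now $A \subset B$ means $x \in A \Longrightarrow x \in B$. Therefore $x \in A$ implies $x \in A$ and $x \in B$,
hence $x \in A \cap B$, i.e. $A \subset A \cap B$. The inclusion $A \cap B \subset A$ has already been
proved (see Problem 3 a)). Thus $A \subset B$ implies $A \cap B = A$, i.e. a) implies b).
Next observe that $A \cap B = A$ is equivalent to $(A \cap B)^{\complement} = A^{\complement}$, or
$A^{\complement} \cup B^{\complement} = A^{\complement}$. Thus $x \in B^{\complement}$ must imply $x \in A^{\complement}$, i.e. $B^{\complement} \subset A^{\complement}$, proving b) implies c).
Suppose now that $B^{\complement} \subset A^{\complement}$. We have already proved that $A \subset B$ implies $A \cap B = A$.
Thus $B^{\complement} \subset A^{\complement}$ implies $A^{\complement} \cap B^{\complement} = B^{\complement}$. Taking the complement on both sides
gives $(A^{\complement} \cap B^{\complement})^{\complement} = B$, but $(A^{\complement} \cap B^{\complement})^{\complement} = A \cup B$, so we find $A \cup B = B$ and
therefore c) =⇒ d).
If $A \cup B = B$ then $A^{\complement} \cup B = A^{\complement} \cup A \cup B = X$, since $A^{\complement} \cup A = X$. Therefore we
have proved that d) implies e).
In order to prove that e) implies f) we only need to take complements: if $B \cup A^{\complement} = X$
then $(B \cup A^{\complement})^{\complement} = X^{\complement}$ or $B^{\complement} \cap A = \phi$.
Finally, we show that $A \cap B^{\complement} = \phi$ implies $A \subset B$. Note that $A \cap B^{\complement} = \phi$ means
that $x \in A$ and $x \in B^{\complement}$ cannot both hold, i.e. $x \in A$ implies $x \in B$ which is $A \subset B$ of
course, i.e. we have proved that f) implies a).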
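6. We have:
a) $\left| -\frac{5}{8} \right| = \frac{5}{8}$;
b) $\left| \frac{11}{3} - 3 \right| = \left| \frac{11}{3} - \frac{9}{3} \right| = \left| \frac{2}{3} \right| = \frac{2}{3}$;
c) $\left| \frac{7}{9} - \frac{12}{5} \right| = \left| \frac{35 - 108}{45} \right| = \left| -\frac{73}{45} \right| = \frac{73}{45}$;
d) $\left| \, |-3| - |-5| \, \right| = |3 - 5| = |-2| = 2$;
e) $\sqrt{a^2} = |a|$
(note that $a$ can be non-positive since $a \in \mathbb{R}$).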
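7. For $\epsilon > 0$ and $a, b \in \mathbb{R}$ we have by (2.9):
\[ |ab| = \left| \sqrt{2\epsilon}\, a \cdot \frac{1}{\sqrt{2\epsilon}}\, b \right| \le \frac{(\sqrt{2\epsilon}\, a)^2}{2} + \frac{\left(\frac{1}{\sqrt{2\epsilon}}\, b\right)^2}{2} = \epsilon a^2 + \frac{1}{4\epsilon} b^2. \]
In order to prove
\[ \min\{a, b\} = \frac{1}{2}(a + b - |a - b|), \]
note that $a \le b$ implies $a = \min\{a, b\}$ as well as $a - b \le 0$ or $|a - b| = b - a$ implying
\[ \frac{1}{2}(a + b - |a - b|) = \frac{1}{2}(a + b - (b - a)) = \frac{1}{2} \cdot 2a = a. \]
However, if $b \le a$ we have $b = \min\{a, b\}$ and $a - b \ge 0$ which gives $|a - b| = a - b$
and therefore
\[ \frac{1}{2}(a + b - |a - b|) = \frac{1}{2}(a + b - (a - b)) = \frac{1}{2} \cdot 2b = b. \]
Since $a > 0$, the inequality $a + \frac{1}{a} \ge 2$ is equivalent to
$a^2 + 1 \ge 2a$,
or
$a^2 - 2a + 1 = (a - 1)^2 \ge 0$,
which is clearly correct. Every step is an equivalent formulation of the previous
one, hence we have the equivalence of
$a + \frac{1}{a} \ge 2$ and $(a - 1)^2 \ge 0$.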
8. We may use the triangle inequality (2.11). Therefore we can easily see that
|a − c| = |a − b + b − c| ≤ |a − b| + |b − c|.
For the second estimate we use the converse triangle inequality, i.e. (2.55), which
states for α, β ∈ R, that
||α| − |β|| ≤ |α − β|. Now with α = |a − b| and β = c we find
||a − b| − |c|| ≤ ||a − b| − c|
and the triangle inequality gives:
||a − b| − c| ≤ |a − b| + |c| ≤ |a| + |b| + |c|.
9. a) For $x \in \mathbb{R}$
$8x - 11 > -24x + 89$
is equivalent to
$32x > 100$
or
$x > \frac{25}{8}$.
Thus $8x - 11 > -24x + 89$ holds for all $x > \frac{25}{8}$.
b) We have to satisfy two inequalities:

−3 ≤ 7x − 2 and 7x − 2 < 6x + 5.
The first yields:
$-\frac{1}{7} \le x$;
and the second:
$x < 7$.
We must now be careful: we only seek integer solutions of the system
$-\frac{1}{7} \le x < 7$,
namely $x_1 = 0$, $x_2 = 1$, $x_3 = 2$, $x_4 = 3$, $x_5 = 4$, $x_6 = 5$, $x_7 = 6$. Thus
the solution set is given by $\{0, 1, 2, 3, 4, 5, 6\}$.
c) We discuss the following four cases:

(i) x − 3 ≥ 0 and x + 3 ≥ 0, i.e. x ≥ 3 and x ≥ −3, which implies x ≥ 3;
(ii) x − 3 ≥ 0 and x + 3 ≤ 0, i.e. x ≥ 3 and x ≤ −3, which cannot happen;
(iii) x − 3 ≤ 0 and x + 3 ≥ 0, i.e. x ≤ 3 and x ≥ −3 which means x ∈ [−3, 3];
(iv) x − 3 ≤ 0 and x + 3 ≤ 0, i.e. x ≤ 3 and x ≤ −3 which implies x ≤ −3.
We now consider each case:
In case (i) |x − 3| ≤ |x + 3| is equivalent to
x−3≤x+3
which holds for all x, hence for x ≥ 3 the inequality holds.
In case (ii) |x − 3| ≤ |x + 3| can never hold.
In case (iii) |x − 3| ≤ |x + 3| is equivalent to
−(x − 3) ≤ x + 3 or − x + 3 ≤ x + 3,
which can only hold for x ≥ 0, then for x ∈ [0, 3] the inequality has a solution.
In case (iv) |x − 3| ≤ |x + 3| is equivalent to
−(x − 3) ≤ −(x + 3) or − x + 3 ≤ −x − 3
which never holds.
Thus the inequality |x − 3| ≤ |x + 3| holds for all x ≥ 0.
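The result of the case analysis in Problem 9 c) can be spot-checked on a grid of rational sample points. The Python sketch below is illustrative only; the grid and the helper name `holds` are assumptions made here.

```python
from fractions import Fraction


def holds(x):
    """Truth value of |x - 3| <= |x + 3| at the point x."""
    return abs(x - 3) <= abs(x + 3)


# Sample points -10, -9.75, ..., 10, using exact rational arithmetic.
grid = [Fraction(k, 4) for k in range(-40, 41)]

# The inequality should hold exactly at the sample points with x >= 0.
agrees = all(holds(x) == (x >= 0) for x in grid)
print(agrees)
```

The check agrees with the four cases: the boundary point x = 0 gives equality, and every negative sample point fails.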
10. a) Note that

2x + 6(2 − x) ≥ 8 − 2x
is equivalent to
2x + 12 − 6x ≥ 8 − 2x
or
−4x + 12 ≥ 8 − 2x,
i.e.
−2x ≥ −4
or x ≤ 2. Thus the inequality is solved by every x ∈ R, x ≤ 2.
b) First note that

$x^2 + 2x - 10 < 3x + 2$
is equivalent to
$x^2 - x - 12 < 0$.
We now factorise the left hand side noting that $x^2 - x - 12 = (x - 4)(x + 3)$. (We find
this factorisation by determining the roots of the quadratic equation $x^2 - x - 12 = 0$.)
The condition $(x - 4)(x + 3) < 0$ is fulfilled either if $x - 4 > 0$ and $x + 3 < 0$ or if
$x - 4 < 0$ and $x + 3 > 0$.
In the first case we have:
x > 4 and x < −3,
and in this case there is no solution.
The second case holds if
x < 4 and x > −3,
implying that every x ∈ (−3, 4) solves this inequality.
Chapter 3
1. a) For $k = 0$ we have
$0^3 + 1^3 + 2^3 = 1 + 8 = 9$
which is divisible by 9. If the statement holds for $k$, then we find for $k + 1$ that
\[ (k + 1)^3 + (k + 2)^3 + (k + 3)^3 = (k + 1)^3 + (k + 2)^3 + k^3 + 3 \cdot 3k^2 + 3 \cdot 3^2 k + 3^3 \]
\[ = (k^3 + (k + 1)^3 + (k + 2)^3) + 9(k^2 + 3k + 3). \]
Now the first term $k^3 + (k + 1)^3 + (k + 2)^3$ as well as the second term is divisible
by 9 and the result follows by mathematical induction.
b) For $n = 0$ we find
\[ \frac{0^5}{5} + \frac{0^4}{2} + \frac{0^3}{3} - \frac{0}{30} = 0 \in \mathbb{Z}. \]
Suppose now that
\[ \frac{n^5}{5} + \frac{n^4}{2} + \frac{n^3}{3} - \frac{n}{30} \]
is an integer. We need to show that
\[ \frac{(n + 1)^5}{5} + \frac{(n + 1)^4}{2} + \frac{(n + 1)^3}{3} - \frac{n + 1}{30} \]
is an integer too. Expanding all terms we arrive at
\[ \frac{n^5 + 5n^4 + 10n^3 + 10n^2 + 5n + 1}{5} + \frac{n^4 + 4n^3 + 6n^2 + 4n + 1}{2} + \frac{n^3 + 3n^2 + 3n + 1}{3} - \frac{n + 1}{30} \]
\[ = \frac{n^5}{5} + \frac{n^4}{2} + \frac{n^3}{3} - \frac{n}{30} + (n^4 + 2n^3 + 2n^2 + n) + (2n^3 + 3n^2 + 2n) + (n^2 + n) + \frac{6 + 15 + 10}{30} - \frac{1}{30}. \]
Now by our assumption
\[ \frac{n^5}{5} + \frac{n^4}{2} + \frac{n^3}{3} - \frac{n}{30} \]
is an integer. Moreover $(n^4 + 2n^3 + 2n^2 + n)$, $(2n^3 + 3n^2 + 2n)$, $(n^2 + n)$ and $\frac{31}{30} - \frac{1}{30} = 1$
are integers. Therefore the result follows by mathematical induction.
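Both statements of Problem 1 are easy to test numerically before (or after) proving them. The following Python sketch checks them for the first few hundred values using exact rational arithmetic; the function names are hypothetical, and a finite check of course does not replace the induction.

```python
from fractions import Fraction


def cubes_sum(k):
    """k^3 + (k+1)^3 + (k+2)^3, the expression of part a)."""
    return k**3 + (k + 1)**3 + (k + 2)**3


def poly(n):
    """n^5/5 + n^4/2 + n^3/3 - n/30 as an exact fraction (part b)."""
    n = Fraction(n)
    return n**5 / 5 + n**4 / 2 + n**3 / 3 - n / 30


# Part a): divisibility by 9.
divisible = all(cubes_sum(k) % 9 == 0 for k in range(200))

# Part b): the value is an integer exactly when the reduced denominator is 1.
integral = all(poly(n).denominator == 1 for n in range(200))

print(divisible, integral)
```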
2. a) For $n = 1$ we have $x^1 - y^1 = 1 \cdot (x - y)$. For $n \in \mathbb{N}$ suppose that the following
holds:
$x^n - y^n = (x - y) Q_n(x, y)$.
We need to show that for $x^{n+1} - y^{n+1}$ we have a similar factorisation. Since
\[ x^{n+1} - y^{n+1} = x x^n - y y^n = x x^n - x y^n + x y^n - y y^n \]
\[ = x(x^n - y^n) + (x - y) y^n = x(x - y) Q_n(x, y) + (x - y) y^n = (x - y)(x Q_n(x, y) + y^n) \]
we have a factorisation as required with $Q_{n+1}(x, y) = x Q_n(x, y) + y^n$, and the result
follows.
b) For $n = 1$, the statement reduces to
$(1 - 1)x + y^1 \ge 1 \cdot x^{1-1} y$
or $y \ge y$ which is of course correct. Now for $n \in \mathbb{N}$ fixed suppose that we have
(∗) $(n - 1)x^n + y^n \ge n x^{n-1} y$.
We want to prove that
$n x^{n+1} + y^{n+1} \ge (n + 1) x^n y$.
Since $x > 0$ we may multiply (∗) by $x$ to obtain
$(n - 1)x^{n+1} + y^n x \ge n x^n y$,
then adding $x^{n+1}$ and subtracting $y^n x$ yields
$n x^{n+1} \ge n x^n y + x^{n+1} - y^n x$,
and adding $y^{n+1}$ leads to:
\[ n x^{n+1} + y^{n+1} \ge n x^n y + x^{n+1} - y^n x + y^{n+1} = (n + 1) x^n y + x^{n+1} - y^n x + y^{n+1} - x^n y. \]
Thus we need to show that
$x^{n+1} - y^n x + y^{n+1} - x^n y \ge 0$.
Note that
\[ x^{n+1} - y^n x + y^{n+1} - x^n y = x^n (x - y) + y^n (y - x) = (x^n - y^n)(x - y). \]
Now if $x > y$ then $x - y > 0$ as well as $x^n - y^n > 0$. However if $x < y$ then
$x - y < 0$ as well as $x^n - y^n < 0$. In both cases we find that $(x^n - y^n)(x - y) \ge 0$ and the inequality
follows from mathematical induction.
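3. a)
\[ \sum_{j=-2}^{2} \frac{1}{2^j} = \frac{1}{2^{-2}} + \frac{1}{2^{-1}} + \frac{1}{2^0} + \frac{1}{2^1} + \frac{1}{2^2} = 2^2 + 2 + 1 + \frac{1}{2} + \frac{1}{4} = 7\tfrac{3}{4}. \]
b)
\[ \sum_{k=2}^{5} (a^k - a^{k-2}) = a^2 - a^{2-2} + a^3 - a^{3-2} + a^4 - a^{4-2} + a^5 - a^{5-2} \]
\[ = a^2 - 1 + a^3 - a + a^4 - a^2 + a^5 - a^3 = a^5 + a^4 - a - 1. \]
c)
\[ \sum_{l=1}^{6} (-1)^l \frac{l + 1}{l} = (-1)^1 \frac{1 + 1}{1} + (-1)^2 \frac{2 + 1}{2} + (-1)^3 \frac{3 + 1}{3} + (-1)^4 \frac{4 + 1}{4} + (-1)^5 \frac{5 + 1}{5} + (-1)^6 \frac{6 + 1}{6} \]
\[ = -2 + \frac{3}{2} - \frac{4}{3} + \frac{5}{4} - \frac{6}{5} + \frac{7}{6} = -2 - \frac{4}{3} - \frac{6}{5} + \frac{3}{2} + \frac{5}{4} + \frac{7}{6} \]
\[ = -\frac{68}{15} + \frac{47}{12} = \frac{-272 + 235}{60} = -\frac{37}{60}. \]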
4. a) In both cases our formal proof will use induction. However before giving
the formal proofs let us rewrite the statements as
$\lambda(a_1 + \ldots + a_N) = (\lambda a_1 + \ldots + \lambda a_N)$
and
$(a_1 + \ldots + a_N) + (b_1 + \ldots + b_N) = (a_1 + b_1) + \ldots + (a_N + b_N)$,
thus we get a feeling for the content of these statements: the first is an extension of
the law of distributivity, the second follows as an extension of the commutativity
of addition. Here are the formal proofs:
for $N = 1$ we obviously have
\[ \lambda \sum_{j=1}^{1} a_j = \lambda a_1 = \sum_{j=1}^{1} (\lambda a_j). \]
Now if
\[ \lambda \sum_{j=1}^{N} a_j = \sum_{j=1}^{N} (\lambda a_j) \]
then it follows that
\[ \lambda \sum_{j=1}^{N+1} a_j = \lambda \left( \sum_{j=1}^{N} a_j + a_{N+1} \right) = \lambda \sum_{j=1}^{N} a_j + \lambda a_{N+1} = \sum_{j=1}^{N} (\lambda a_j) + \lambda a_{N+1} = \sum_{j=1}^{N+1} (\lambda a_j). \]
Further for $N = 1$ we have:
\[ \sum_{j=1}^{1} a_j + \sum_{j=1}^{1} b_j = a_1 + b_1 = \sum_{j=1}^{1} (a_j + b_j). \]
If
\[ \sum_{j=1}^{N} a_j + \sum_{j=1}^{N} b_j = \sum_{j=1}^{N} (a_j + b_j) \]
holds then we find
\[ \sum_{j=1}^{N+1} a_j + \sum_{j=1}^{N+1} b_j = \sum_{j=1}^{N} a_j + a_{N+1} + \sum_{j=1}^{N} b_j + b_{N+1} = \sum_{j=1}^{N} (a_j + b_j) + (a_{N+1} + b_{N+1}) = \sum_{j=1}^{N+1} (a_j + b_j). \]
Hence both statements follow by mathematical induction.

b) Applying the results of part a) we note that:
\[ (x - y) \sum_{k=0}^{5} x^k y^{5-k} = \sum_{k=0}^{5} x^{k+1} y^{5-k} - \sum_{k=0}^{5} x^k y^{6-k} \]
\[ = xy^5 + x^2 y^4 + x^3 y^3 + x^4 y^2 + x^5 y + x^6 - y^6 - xy^5 - x^2 y^4 - x^3 y^3 - x^4 y^2 - x^5 y = x^6 - y^6. \]
5. We prove each of the following identities by mathematical induction:
a) For $n = 1$ we have
\[ \sum_{k=1}^{1} \frac{1}{(2k - 1)(2k + 1)} = \frac{1}{(2 - 1)(2 + 1)} = \frac{1}{3} = \frac{1}{2 \cdot 1 + 1}. \]
Now if
\[ \sum_{k=1}^{n} \frac{1}{(2k - 1)(2k + 1)} = \frac{n}{2n + 1} \]
then it follows that
\[ \sum_{k=1}^{n+1} \frac{1}{(2k - 1)(2k + 1)} = \sum_{k=1}^{n} \frac{1}{(2k - 1)(2k + 1)} + \frac{1}{(2(n + 1) - 1)(2(n + 1) + 1)} = \frac{n}{2n + 1} + \frac{1}{(2n + 1)(2n + 3)} \]
\[ = \frac{n(2n + 3) + 1}{(2n + 1)(2n + 3)} = \frac{(2n + 1)(n + 1)}{(2n + 1)(2n + 3)} = \frac{n + 1}{2n + 3}, \]
which proves the statement.
b) For $k = 1$ we find
\[ \sum_{n=1}^{1} n \cdot n! = 1 \cdot 1! = 1 = (1 + 1)! - 1. \]
If $\sum_{n=1}^{k} n \cdot n! = (k + 1)! - 1$ holds, then
\[ \sum_{n=1}^{k+1} n \cdot n! = \sum_{n=1}^{k} n \cdot n! + (k + 1)(k + 1)! = (k + 1)! - 1 + (k + 1)(k + 1)! = (k + 2)(k + 1)! - 1 = (k + 2)! - 1 \]
proving the assertion.
c) For $m = 1$ it follows that
\[ \sum_{j=1}^{1} (a + (j - 1)d) = a + (1 - 1)d = a = \frac{1}{2} \cdot 1 \cdot (2a + (1 - 1)d). \]
If
\[ \sum_{j=1}^{m} (a + (j - 1)d) = \frac{1}{2} m (2a + (m - 1)d) \]
holds then
\[ \sum_{j=1}^{m+1} (a + (j - 1)d) = \sum_{j=1}^{m} (a + (j - 1)d) + a + ((m + 1) - 1)d = \frac{1}{2} m (2a + (m - 1)d) + a + md \]
\[ = \frac{1}{2} \cdot 2am + a + \frac{1}{2} m(m - 1)d + md = \frac{1}{2} \cdot 2a(m + 1) + \frac{1}{2}(m + 1)md = \frac{1}{2}(m + 1)(2a + md). \]
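The three closed forms of Problem 5 can be compared with the directly computed sums for small upper limits. The Python sketch below does this with exact arithmetic; the function name `check` and the sample values of a and d are assumptions made for illustration.

```python
from fractions import Fraction
from math import factorial


def check(n, a=Fraction(3), d=Fraction(2)):
    """Compare the three sums of Problem 5 with their closed forms for upper limit n."""
    # a) sum of 1 / ((2k - 1)(2k + 1)) equals n / (2n + 1)
    s1 = sum(Fraction(1, (2 * k - 1) * (2 * k + 1)) for k in range(1, n + 1))
    # b) sum of k * k! equals (n + 1)! - 1
    s2 = sum(k * factorial(k) for k in range(1, n + 1))
    # c) arithmetic series: sum of a + (j - 1)d equals n(2a + (n - 1)d) / 2
    s3 = sum(a + (j - 1) * d for j in range(1, n + 1))
    return (
        s1 == Fraction(n, 2 * n + 1)
        and s2 == factorial(n + 1) - 1
        and s3 == Fraction(n, 2) * (2 * a + (n - 1) * d)
    )


print(all(check(n) for n in range(1, 30)))
```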
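6. a)
\[ \prod_{k=-2}^{2} 2^{-k} = 2^{-(-2)} \cdot 2^{-(-1)} \cdot 2^{-(0)} \cdot 2^{-1} \cdot 2^{-2} = 2^2 \cdot 2 \cdot 1 \cdot 2^{-1} \cdot 2^{-2} = 1. \]
b)
\[ \prod_{j=3}^{6} (j - 4) = (3 - 4)(4 - 4)(5 - 4)(6 - 4) = 0. \]
c)
\[ \prod_{j=1}^{5} \frac{j + 2}{j + 4} = \frac{3}{5} \cdot \frac{4}{6} \cdot \frac{5}{7} \cdot \frac{6}{8} \cdot \frac{7}{9} = \frac{1}{6}. \]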
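7. Again, as in Problem 4 a), we first rewrite the statement to understand its content:
\[ \prod_{j=1}^{N} (\mu a_j) + \prod_{j=1}^{N} (\nu a_j) = \mu a_1 \cdot \mu a_2 \cdot \ldots \cdot \mu a_N + \nu a_1 \cdot \nu a_2 \cdot \ldots \cdot \nu a_N \]
\[ = \mu^N a_1 \cdot \ldots \cdot a_N + \nu^N a_1 \cdot \ldots \cdot a_N = (\mu^N + \nu^N) a_1 \cdot \ldots \cdot a_N = (\mu^N + \nu^N) \prod_{j=1}^{N} a_j. \]
Here is the formal proof by induction:
for $N = 1$ we have
\[ \prod_{j=1}^{1} \mu a_j + \prod_{j=1}^{1} \nu a_j = \mu a_1 + \nu a_1 = (\mu + \nu) a_1 = (\mu + \nu) \prod_{j=1}^{1} a_j. \]
If the statement holds for $N$, in the form $\prod_{j=1}^{N} \mu a_j = \mu^N \prod_{j=1}^{N} a_j$ and likewise for $\nu$, then
\[ \prod_{j=1}^{N+1} \mu a_j + \prod_{j=1}^{N+1} \nu a_j = \left( \prod_{j=1}^{N} \mu a_j \right) \mu a_{N+1} + \left( \prod_{j=1}^{N} \nu a_j \right) \nu a_{N+1} \]
\[ = \mu^N \left( \prod_{j=1}^{N} a_j \right) \mu a_{N+1} + \nu^N \left( \prod_{j=1}^{N} a_j \right) \nu a_{N+1} = \mu^{N+1} \prod_{j=1}^{N+1} a_j + \nu^{N+1} \prod_{j=1}^{N+1} a_j. \]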
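8. a)
$7! = 1 \cdot 2 \cdot 3 \cdot 4 \cdot 5 \cdot 6 \cdot 7 = 5040$
and
\[ \frac{63!}{60!} = \frac{60! \cdot 61 \cdot 62 \cdot 63}{60!} = 61 \cdot 62 \cdot 63 = 238{,}266. \]
b)
\[ \frac{(n + 1)! - n!}{n} = \frac{(n + 1) n! - n!}{n} = \frac{((n + 1) - 1) n!}{n} = n! \]
c)
\[ \frac{(n + 1)!}{(n - 1)!} = \frac{(n - 1)! \cdot n \cdot (n + 1)}{(n - 1)!} = n(n + 1). \]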
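9. a) For $n = 2$ we find
\[ \prod_{k=1}^{2} \frac{2k - 1}{2k} = \frac{1}{2} \cdot \frac{3}{4} = \frac{3}{8} = \frac{1}{2^4} \binom{4}{2} = \frac{6}{16}. \]
Now we want to show that the statement for $n$ implies that for $n + 1$:
\[ \prod_{k=1}^{n+1} \frac{2k - 1}{2k} = \left( \prod_{k=1}^{n} \frac{2k - 1}{2k} \right) \frac{2(n + 1) - 1}{2(n + 1)} = \frac{1}{2^{2n}} \binom{2n}{n} \frac{2n + 1}{2n + 2} = \frac{1}{2^{2(n+1)}} \frac{(2n)!}{n!\, n!} \frac{2(2n + 1)}{n + 1}. \]
Thus it remains to prove that:
\[ \frac{(2n)!}{n!\, n!} \frac{2(2n + 1)}{n + 1} = \binom{2(n + 1)}{n + 1} = \frac{(2(n + 1))!}{(n + 1)!\,(n + 1)!}. \]
Note that
\[ \frac{(2n)!\, 2(2n + 1)}{n!\, n!\, (n + 1)} = \frac{2(2n + 1)!}{(n + 1)!\, n!} \]
and
\[ \frac{(2(n + 1))!}{(n + 1)!\,(n + 1)!} = \frac{(2n + 1)!\,(2n + 2)}{(n + 1)!\,(n + 1)!} = \frac{2(2n + 1)!\,(n + 1)}{(n + 1)!\,(n + 1)!} = \frac{2(2n + 1)!}{(n + 1)!\, n!} \]
and the identity is now proved.
b) Since
\[ \prod_{k=1}^{1-1} \left(1 + \frac{1}{k}\right)^k = \prod_{k=1}^{0} \left(1 + \frac{1}{k}\right)^k = 1 \]
the statement is true for $n = 1$. Now under the assumption that the statement
holds for $n$ we get for $n + 1$ that:
\[ \prod_{k=1}^{n} \left(1 + \frac{1}{k}\right)^k = \left( \prod_{k=1}^{n-1} \left(1 + \frac{1}{k}\right)^k \right) \left(1 + \frac{1}{n}\right)^n = \frac{n^n}{n!} \left(1 + \frac{1}{n}\right)^n = \frac{n^n}{n!} \left(\frac{n + 1}{n}\right)^n = \frac{(n + 1)^n}{n!} \]
\[ = \frac{(n + 1)^n (n + 1)}{n!\,(n + 1)} = \frac{(n + 1)^{n+1}}{(n + 1)!}, \]
and the assertion is proved.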
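10. a)
\[ (5x^2 + 3y)^4 = \sum_{k=0}^{4} \binom{4}{k} (5x^2)^{4-k} (3y)^k \]
\[ = \binom{4}{0} (5x^2)^4 + \binom{4}{1} (5x^2)^3 (3y) + \binom{4}{2} (5x^2)^2 (3y)^2 + \binom{4}{3} (5x^2)(3y)^3 + \binom{4}{4} (3y)^4 \]
\[ = 625x^8 + 1500x^6 y + 1350x^4 y^2 + 540x^2 y^3 + 81y^4. \]
b)
\[ (x - y)^n = \sum_{k=0}^{n} \binom{n}{k} x^{n-k} (-y)^k = \sum_{k=0}^{n} (-1)^k \binom{n}{k} x^{n-k} y^k. \]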
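11. a) By definition we have
\[ \binom{n}{k} = \frac{n!}{(n - k)!\, k!} = \frac{n(n - 1) \cdot \ldots \cdot (n - k + 1)(n - k) \cdot \ldots \cdot 2 \cdot 1}{((n - k)(n - k - 1) \cdot \ldots \cdot 2 \cdot 1)(1 \cdot 2 \cdot \ldots \cdot k)} = \frac{n(n - 1) \cdot \ldots \cdot (n - k + 1)}{1 \cdot 2 \cdot \ldots \cdot k}. \]
b) Using the definition for $\binom{1/2}{k}$ we find
\[ \binom{\frac{1}{2}}{k} = \frac{\frac{1}{2}\left(\frac{1}{2} - 1\right)\left(\frac{1}{2} - 2\right) \cdot \ldots \cdot \left(\frac{1}{2} - k + 1\right)}{1 \cdot 2 \cdot \ldots \cdot k} = \frac{\frac{1}{2^k}\left(1(1 - 2)(1 - 4) \cdot \ldots \cdot (1 - 2k + 2)\right)}{1 \cdot 2 \cdot \ldots \cdot k} \]
\[ = (-1)^{k-1} \frac{1(2 - 1)(4 - 1) \cdot \ldots \cdot (2k - 2 - 1)}{2^k\, 1 \cdot 2 \cdot \ldots \cdot k} = (-1)^{k-1} \frac{1 \cdot 3 \cdot \ldots \cdot (2k - 3)}{2 \cdot 4 \cdot \ldots \cdot (2k)}. \]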
12. We use mathematical induction:
a) Since by assumption $p \ge 2$, we have the correct statement $p > 1$ for $k = 1$.
Now suppose that $p^k > k$. We want to prove $p^{k+1} > k + 1$.
Since
$p p^k > pk$
and
$kp \ge 2k \ge k + 1$,
it follows that
$p^{k+1} > k + 1$.
b) For $k = 1$ it is true that $p > 1$ for $p \ge 3$ and also for $k = 2$ we have $p^2 > k^2$
since $p \ge 3$. (Note that $p \ge 2$ is not sufficient to get the strict inequality.) Assume
$p^k > k^2$ and $k \ge 2$. Multiplying by $p$ yields:
$p^{k+1} > p k^2 \ge 3k^2$
and it remains to prove that $3k^2 \ge (k + 1)^2$ which is equivalent to
$3k^2 \ge k^2 + 2k + 1$
or
$2k^2 \ge 2k + 1$,
i.e.
$k^2 + (k - 1)^2 \ge 2$,
which holds since $k \ge 2$. Thus by mathematical induction the statement holds for
all $k \ge 2$. The case $k = 1$ has already been proved.
c) Note that for $k = 2$, 3 and 4 the statement is false. For $k = 5$ we have:
$2^5 = 32 > 25 = 5^2$.
If we multiply $2^k > k^2$ by 2 we find
$2^{k+1} > 2k^2$
and the proof reduces to showing that $2k^2 \ge (k + 1)^2$ or $k^2 \ge 2k + 1$, which is equivalent to
$(k - 1)^2 \ge 2$ and therefore holds for $k \ge 5$.
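Where exactly the inequality of part c) fails is quickly found by brute force. The Python sketch below lists the exceptions to $2^k > k^2$ and spot-checks part b) for a few bases; the search bounds are arbitrary choices made here.

```python
# Part c): values of k in 1..59 for which 2^k > k^2 does NOT hold.
failures = [k for k in range(1, 60) if not 2**k > k**2]

# Part b): p^k > k^2 for p >= 3 and every k >= 1 (spot check).
part_b_ok = all(p**k > k**2 for p in range(3, 10) for k in range(1, 60))

print(failures, part_b_ok)
```

The exceptional values found this way are the ones named in the solution, which is why the induction in c) starts at k = 5.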
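13. a) For $N = 1$ we have:
\[ \sum_{j=1}^{1} \frac{1}{\sqrt{j}} = 1 \le 2\sqrt{1} = 2. \]
Now under the assumption that the statement holds for $N$ we find for $N + 1$ that:
\[ \sum_{j=1}^{N+1} \frac{1}{\sqrt{j}} = \sum_{j=1}^{N} \frac{1}{\sqrt{j}} + \frac{1}{\sqrt{N + 1}} \le 2\sqrt{N} + \frac{1}{\sqrt{N + 1}} \]
and it remains to show that
\[ 2\sqrt{N} + \frac{1}{\sqrt{N + 1}} \le 2\sqrt{N + 1}, \]
or
\[ \frac{1}{\sqrt{N + 1}} \le 2\left(\sqrt{N + 1} - \sqrt{N}\right). \]
Now multiplying this inequality by $\sqrt{N + 1} + \sqrt{N}$ gives the equivalent statement:
\[ \frac{\sqrt{N + 1} + \sqrt{N}}{\sqrt{N + 1}} \le 2\left(\sqrt{N + 1} - \sqrt{N}\right)\left(\sqrt{N + 1} + \sqrt{N}\right) = 2 \]
or
\[ 1 + \sqrt{\frac{N}{N + 1}} \le 2, \]
i.e. we need to justify the equivalent statement
\[ \sqrt{\frac{N}{N + 1}} \le 1 \]
which follows from $\frac{N}{N + 1} < 1$.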
b) For $k = 1$ we find:
\[ \prod_{m=1}^{1} (2m)! = 2! = 2 \ge ((1 + 1)!)^1 = 2. \]
Suppose that
\[ \prod_{m=1}^{k} (2m)! \ge ((k + 1)!)^k. \]
For $k + 1$ it follows that
\[ \prod_{m=1}^{k+1} (2m)! = \left( \prod_{m=1}^{k} (2m)! \right) (2(k + 1))! \ge ((k + 1)!)^k (2(k + 1))! \]
and our problem is to prove
\[ ((k + 1)!)^k (2(k + 1))! \ge ((k + 2)!)^{k+1}, \]
i.e.
\[ (2(k + 1))! \ge \frac{((k + 2)!)^{k+1}}{((k + 1)!)^k} = \frac{((k + 1)!(k + 2))^k (k + 2)!}{((k + 1)!)^k} = (k + 2)^k (k + 2)! \]
However note:
$(2(k + 1))! = (k + 2)!(k + 2 + 1) \cdot \ldots \cdot (k + 2 + k)$
or more formally
\[ (2(k + 1))! = (k + 2)! \prod_{j=1}^{k} (k + 2 + j). \]
Since for $j = 1, \ldots, k$ we have $k + 2 + j \ge k + 2$, it follows that
\[ \prod_{j=1}^{k} (k + 2 + j) \ge (k + 2)^k. \]
Hence we conclude
$(2(k + 1))! \ge (k + 2)^k (k + 2)!$
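14. We first prove (∗∗) for $n = 2^k$. For this we use mathematical induction. The case
$k = 1$, i.e. $n = 2$, follows from
$(a_1 + a_2)^2 - 4a_1 a_2 = (a_1 - a_2)^2 \ge 0$
or
\[ \sqrt{a_1 a_2} \le \frac{a_1 + a_2}{2}. \]
Now suppose that (∗∗) holds for $n = 2^{k-1}$, i.e.
\[ (a_1 \cdot \ldots \cdot a_{2^{k-1}})^{\frac{1}{2^{k-1}}} \le \frac{a_1 + \ldots + a_{2^{k-1}}}{2^{k-1}}. \]
However we also have for the “next” $2^{k-1}$ terms the following estimate:
\[ (a_{2^{k-1}+1} \cdot \ldots \cdot a_{2^k})^{\frac{1}{2^{k-1}}} \le \frac{a_{2^{k-1}+1} + \ldots + a_{2^k}}{2^{k-1}}, \]
or equivalently
\[ a_1 \cdot \ldots \cdot a_{2^{k-1}} \le \left( \frac{a_1 + \ldots + a_{2^{k-1}}}{2^{k-1}} \right)^{2^{k-1}} \]
and
\[ a_{2^{k-1}+1} \cdot \ldots \cdot a_{2^k} \le \left( \frac{a_{2^{k-1}+1} + \ldots + a_{2^k}}{2^{k-1}} \right)^{2^{k-1}}, \]
which gives
\[ a_1 \cdot \ldots \cdot a_{2^k} \le \left( \frac{a_1 + \ldots + a_{2^{k-1}}}{2^{k-1}} \cdot \frac{a_{2^{k-1}+1} + \ldots + a_{2^k}}{2^{k-1}} \right)^{2^{k-1}} \]
or
\[ (a_1 \cdot \ldots \cdot a_{2^k})^{\frac{1}{2^{k-1}}} \le \frac{(a_1 + \ldots + a_{2^{k-1}})(a_{2^{k-1}+1} + \ldots + a_{2^k})}{2^{k-1} \cdot 2^{k-1}}. \]
The case $k = 1$ also gives:
\[ (a_1 + \ldots + a_{2^{k-1}})(a_{2^{k-1}+1} + \ldots + a_{2^k}) \le \frac{1}{4}(a_1 + \ldots + a_{2^k})^2, \]
implying
\[ (a_1 \cdot \ldots \cdot a_{2^k})^{\frac{1}{2^{k-1}}} \le \frac{(a_1 + \ldots + a_{2^k})^2}{2^k \cdot 2^k} \]
or
\[ (a_1 \cdot \ldots \cdot a_{2^k})^{\frac{1}{2^k}} \le \frac{a_1 + \ldots + a_{2^k}}{2^k}. \]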
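Now let $n$ be any integer. Choose a $k \in \mathbb{N}$ such that $2^k > n$ and introduce
\[ a_j := a := \frac{1}{n} \sum_{k=1}^{n} a_k \]
for $n < j \le 2^k$. We may now apply the result for $2^k$ when looking at
\[ a_1 \cdot \ldots \cdot a_n \cdot a^{2^k - n} \le \left( \frac{a_1 + \ldots + a_n + a + \ldots + a}{2^k} \right)^{2^k}. \]
Now $a_1 + \ldots + a_n = na$, so the numerator equals $na + (2^k - n)a = 2^k a$ and therefore we have
\[ a_1 \cdot \ldots \cdot a_n \cdot a^{2^k - n} \le a^{2^k} \]
or
$a_1 \cdot \ldots \cdot a_n \le a^n$,
which is (∗∗).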
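15. First note that $x_n$ and $a_n$ are defined by recursion.
Note that
\[ x_n := \frac{1}{2}\left(x_{n-1} + \frac{c}{x_{n-1}}\right), \quad n \in \mathbb{N}, \quad x_0 = 1, \]
and
\[ a_n = \frac{c}{x_n}, \quad n \in \mathbb{N} \cup \{0\}, \]
implying that
\[ (∗) \quad x_n = \frac{1}{2}(x_{n-1} + a_{n-1}), \quad n \in \mathbb{N}. \]
All terms are non-negative and
\[ x_n^2 = \left( \frac{x_{n-1} + a_{n-1}}{2} \right)^2 \ge x_{n-1} a_{n-1}, \]
hence
\[ x_n \ge \frac{x_{n-1} a_{n-1}}{x_n} = \frac{x_{n-1}}{x_n} \cdot \frac{c}{x_{n-1}} = \frac{c}{x_n} = a_n. \]
Combining this with (∗) we find for $n \in \mathbb{N}$
$a_n \le x_{n+1} \le x_n$.
Therefore we deduce
\[ \frac{x_n}{x_{n+1}} \ge 1 \]
and consequently
\[ a_{n+1} = \frac{x_n a_n}{x_{n+1}} \ge a_n. \]
Together we now have
$a_n \le a_{n+1} \le x_{n+1} \le x_n$.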
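The chain of inequalities in Problem 15 can be observed on concrete numbers. The Python sketch below iterates the recursion with exact fractions and checks the chain for n ≥ 1; the function names `heron` and `chain_holds`, and the sample values of c, are assumptions made for illustration (c is taken to be positive, as in the problem).

```python
from fractions import Fraction


def heron(c, steps):
    """Return the lists [x_0, ..., x_steps] and [a_0, ..., a_steps].

    Uses x_0 = 1, x_n = (x_{n-1} + c / x_{n-1}) / 2 and a_n = c / x_n.
    """
    c = Fraction(c)
    x = Fraction(1)
    xs, lower = [x], [c / x]
    for _ in range(steps):
        x = (x + c / x) / 2
        xs.append(x)
        lower.append(c / x)
    return xs, lower


def chain_holds(c, steps=6):
    """Check a_n <= a_{n+1} <= x_{n+1} <= x_n for n = 1, ..., steps - 1."""
    xs, lower = heron(c, steps)
    return all(
        lower[n] <= lower[n + 1] <= xs[n + 1] <= xs[n]
        for n in range(1, steps)
    )


print(chain_holds(2), chain_holds(5), chain_holds(Fraction(1, 3)))
```

For c = 2 the first iterates are x_1 = 3/2 and x_2 = 17/12 with a_1 = 4/3 and a_2 = 24/17, so the two sequences enclose a common value from above and below.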
Chapter 4
1. a) By definition we have
A × B = {(x, y) | x ∈ A and y ∈ B}
and
B × A = {(x, y) | x ∈ B and y ∈ A}.
Therefore it follows that
A×B = {(3, 1), (4, 1), (5, 1), (6, 1), (3, 2), (4, 2), (5, 2), (6, 2), (3, 3), (4, 3), (5, 3), (6, 3)}
and
B×A = {(1, 3), (2, 3), (3, 3), (1, 4), (2, 4), (3, 4), (1, 5), (2, 5), (3, 5), (1, 6), (2, 6), (3, 6)}.
[Figure: the point sets A × B and B × A plotted in the plane.]
b) We need to prove: if (k, m) ∈ N×Z then (k, m) ∈ R×Q. Since k ∈ N implies

k ∈ R, i.e. N ⊂ R, and since m ∈ Z implies m ∈ Q, i.e. Z ⊂ Q, (k, m) ∈ N × Z
yields (k, m) ∈ R × Q.
c) First note that X ∪ Y = {1, 2, 3, 4, 5} and Y ∪ Z = {3, 4, 5, 6, 7}. Now it
follows that
(X ∪ Y ) × Z ={(1, 6), (1, 7), (2, 6), (2, 7), (3, 6), (3, 7),
(4, 6), (4, 7), (5, 6), (5, 7)},
X × (Y ∪ Z) ={(1, 3), (1, 4), (1, 5), (1, 6), (1, 7), (2, 3), (2, 4), (2, 5), (2, 6), (2, 7), (3, 3),
(3, 4), (3, 5), (3, 6), (3, 7)}.
Finally, from
X × Z = {(1, 6), (2, 6), (3, 6), (1, 7), (2, 7), (3, 7)}
and
Y × Z = {(3, 6), (4, 6), (5, 6), (3, 7), (4, 7), (5, 7)}
we deduce
(X × Z) ∩ (Y × Z) = {(3, 6), (3, 7)}.
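Listing Cartesian products by hand invites omissions, so it can help to generate them mechanically. The Python sketch below uses `itertools.product`; the sets X = {1, 2, 3}, Y = {3, 4, 5} and Z = {6, 7} are inferred here from the unions and products listed in the solution and should be read as an assumption about the problem statement.

```python
from itertools import product

# Sets inferred from the solution of Problem 1 c).
X, Y, Z = {1, 2, 3}, {3, 4, 5}, {6, 7}

union_times_z = set(product(X | Y, Z))   # (X ∪ Y) × Z
x_times_union = set(product(X, Y | Z))   # X × (Y ∪ Z)
x_times_z = set(product(X, Z))           # X × Z
y_times_z = set(product(Y, Z))           # Y × Z

print(sorted(union_times_z))
print(sorted(x_times_z & y_times_z))
```

The same objects also illustrate Problem 2 a): here (X ∪ Y) × Z coincides with (X × Z) ∪ (Y × Z).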
2. a) Since (x, y) ∈ (A ∪ B) × C is equivalent to x ∈ A ∪ B and y ∈ C, i.e. x ∈ A

or x ∈ B and y ∈ C, we note that this is equivalent to
x ∈ A and y ∈ C or x ∈ B and y ∈ C,
i.e. (x, y) ∈ (A × C) ∪ (B × C).
b) Note (x, y) ∈ (A × B) ∩ (C × D) means
(x, y) ∈ (A × B) and (x, y) ∈ (C × D),
i.e.
x ∈ A and y ∈ B and x ∈ C and y ∈ D
or
x ∈ A and x ∈ C and y ∈ B and y ∈ D,
i.e. x ∈ A ∩ C and y ∈ B ∩ D implying that (x, y) ∈ (A ∩ C) × (B ∩ D). However
all arguments are reversible, hence we also deduce that (x, y) ∈ (A ∩ C) × (B ∩ D)
implies that (x, y) ∈ (A × B) ∩ (C × D).
3. Suppose that $X \times Y \subset X' \times Y'$, i.e. $(x, y) \in X \times Y$ implies that $(x, y) \in X' \times Y'$.
This means that $x \in X$ and $y \in Y$ implies $x \in X'$ and $y \in Y'$, hence $X \subset X'$
and $Y \subset Y'$. Next if $X \subset X'$ and $Y \subset Y'$, then $(x, y) \in X \times Y$ implies
$(x, y) \in X' \times Y'$.
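4. The following hold:
\[ \bigcup_{j=1}^{5} (\{j\} \times I_j) = (\{1\} \times [1, 2]) \cup (\{2\} \times [2, 3]) \cup (\{3\} \times [3, 4]) \cup (\{4\} \times [4, 5]) \cup (\{5\} \times [5, 6]) \]
and
\[ \bigcup_{j=1}^{5} (I_j \times \{j\}) = ([1, 2] \times \{1\}) \cup ([2, 3] \times \{2\}) \cup ([3, 4] \times \{3\}) \cup ([4, 5] \times \{4\}) \cup ([5, 6] \times \{5\}). \]
This gives:
[Figure: the sets $\bigcup_{j=1}^{5} (\{j\} \times I_j)$ (vertical segments) and $\bigcup_{j=1}^{5} (I_j \times \{j\})$ (horizontal segments) plotted in the plane.]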
5. We need to prove that m ≡ n mod(p) is a reflexive, symmetric and transitive relation

on Z. Clearly m ≡ m mod(p) since m − m is divisible by p. Further if m − n is
divisible by p then n − m is divisible by p also, since m − n = rp implies n − m =
(−r)p. Hence this relation is reflexive and symmetric. Now, suppose m ≡ n mod(p)
and n ≡ k mod(p). We want to prove that m ≡ k mod(p). The congruence m ≡
n mod(p) stands for m − n = r1 p and the congruence n ≡ k mod(p) stands for
n − k = r2 p with r1 , r2 ∈ Z. Now it follows that
m − k = (m − n) + (n − k) = r1 p + r2 p = (r1 + r2 )p,
i.e. m ≡ k mod(p) implying the transitivity, and therefore we have proved that
m ≡ n mod(p) is an equivalence relation.
6. Again we have to prove that “∼” on Z × N is a reflexive, symmetric and transitive
relation. Now for (k, m) ∈ Z × N we see that km = mk implying (k, m) ∼ (k, m), i.e.
“∼” is reflexive. Also kn = lm is equivalent to lm = kn, i.e. (k, m) ∼ (l, n) if and
only if (l, n) ∼ (k, m), i.e. symmetry is proved. Further, suppose that (k, m) ∼ (l, n)
and (l, n) ∼ (p, q). It follows that kn = lm and lq = pn, implying lqkn = lmpn.
Now n ∈ N, hence n ≠ 0 and therefore we find lqk = lmp. If l ≠ 0, then it
follows that qk = mp or (k, m) ∼ (p, q). However l = 0 implies p = 0 and k = 0,
and therefore qk = 0 = mp, which proves the transitivity of “∼”, i.e. this is an
equivalence relation on Z × N.
7. a) Since φ is the only subset of φ we find that P(φ) = {φ}. Note that P(φ) ≠ φ:
the set {φ} contains one element, the set φ.
b) We have
P({1, 2, 3}) = {φ, {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3}}.
8. We need to add up the number of subsets of $X$ with $0, 1, 2, \ldots, N$ elements. However
the number of subsets of $X$ with $k$ elements is $\binom{N}{k}$ by the hint, so we need to find:
\[ \sum_{k=0}^{N} \binom{N}{k} = \sum_{k=0}^{N} \binom{N}{k} 1^k 1^{N-k} = (1 + 1)^N = 2^N. \]
Let us give a proof of the following:
Proposition. The number of subsets with $k$ elements of a set with $N$ elements is $\binom{N}{k}$.
Proof. Denote the number of subsets with $k$ elements of the set $X = \{x_1, \ldots, x_N\}$
with $N$ elements by $\nu_{N,k}$. The aim is to prove that $\nu_{N,k} = \binom{N}{k}$. We use mathematical
induction, i.e. we assume that the statement holds for $N$ and every $k \le N$. For
$N = 0$ we only have one subset, namely φ, hence $\nu_{0,0} = 1 = \binom{0}{0}$. For $N = 1$ we have
one subset with zero elements, namely φ, and one subset with one element, namely
$\{x_1\}$. Hence $\nu_{1,0} = 1 = \binom{1}{0}$ and $\nu_{1,1} = 1 = \binom{1}{1}$. Now suppose that the number of
subsets with $k$ elements of a set with $N$ elements is $\nu_{N,k} = \binom{N}{k}$. We want to find
$\nu_{N+1,k}$. Two cases are trivial:
\[ \nu_{N+1,0} = 1 = \binom{N + 1}{0} \quad \text{and} \quad \nu_{N+1,N+1} = 1 = \binom{N + 1}{N + 1}. \]
Thus we may assume $1 \le k \le N$. The subsets of $X = \{x_1, \ldots, x_{N+1}\}$ having $k$
elements form two disjoint sets $K_0$ and $K_1$. In $K_0$ we collect all subsets of $X$ with
$k$ elements which do not contain $x_{N+1}$, whereas $K_1$ is the family of subsets of $X$
having $k$ elements, one of which is $x_{N+1}$. The number of elements of $K_0$ is by our
assumption $\binom{N}{k}$, since here we are counting the subsets with $k$ elements of a
set with $N$ elements. Every set belonging to $K_1$ contains $x_{N+1}$ and $k - 1$ further
elements belonging to $\{x_1, \ldots, x_N\}$. Thus $K_1$ has by our assumption $\binom{N}{k-1}$ elements.
This implies
\[ \nu_{N+1,k} = \binom{N}{k} + \binom{N}{k - 1} = \binom{N + 1}{k}, \]
where we used Lemma 3.8 in the last step.
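The counting statement of Problem 8 can be confirmed by enumerating subsets for small N. The Python sketch below compares a brute-force count with `math.comb` (available from Python 3.8); the helper name `count_k_subsets` is a hypothetical choice.

```python
from itertools import combinations
from math import comb


def count_k_subsets(n, k):
    """Count the subsets with k elements of {0, ..., n-1} by listing them."""
    return sum(1 for _ in combinations(range(n), k))


# The number of k-element subsets equals the binomial coefficient.
counts_match = all(
    count_k_subsets(n, k) == comb(n, k)
    for n in range(9)
    for k in range(n + 1)
)

# Adding up over k gives the size of the power set, 2^N.
power_set_sizes = all(
    sum(count_k_subsets(n, k) for k in range(n + 1)) == 2**n
    for n in range(9)
)

# The recursion used in the induction step of the proof.
pascal_rule = all(
    comb(n + 1, k) == comb(n, k) + comb(n, k - 1)
    for n in range(1, 20)
    for k in range(1, n + 1)
)

print(counts_match, power_set_sizes, pascal_rule)
```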
9. The solutions of the quadratic equation $y^2 - 2y + x = 0$ are formally given by
\[ y_{1,2} = 1 \pm \sqrt{1 - x}, \]
but we are confined to real numbers, hence for $x < 1$ we have two solutions, for
$x = 1$ we have one solution and for $x > 1$ we have no solution. Therefore we cannot
define a mapping on $\mathbb{R}$ which maps $x$ to the solution of $y^2 - 2y + x = 0$.
10. Let $p(x) = \sum_{j=0}^{k} a_j x^j$ and $q(x) = \sum_{l=0}^{m} b_l x^l$ and suppose that $k \le m$. Define for $j = k+1, \dots, m$ the coefficients $a_j := 0$ to get $p(x) = \sum_{j=0}^{m} a_j x^j$. Now we find
\[ p(x) + q(x) = \sum_{j=0}^{m} a_j x^j + \sum_{j=0}^{m} b_j x^j = \sum_{j=0}^{m} (a_j + b_j) x^j \]
proving that $p + q$ is a polynomial.

Further we have
\[ p(x)q(x) = \left( \sum_{j=0}^{k} a_j x^j \right) \left( \sum_{l=0}^{m} b_l x^l \right) = \sum_{j=0}^{k} \sum_{l=0}^{m} a_j b_l x^{j+l} = \sum_{n=0}^{k+m} \left( \sum_{j+l=n} a_j b_l \right) x^n \]
and it follows that $p \cdot q$ is a polynomial.
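The two coefficient formulas translate directly into code. The sketch below is an illustration under our own conventions: a polynomial is stored as a non-empty list of coefficients `[a_0, a_1, ..., a_k]`, and the names `poly_add` and `poly_mul` are hypothetical helpers, not taken from the text.

```python
def poly_add(a: list, b: list) -> list:
    """Coefficients of p + q: pad the shorter list with zeros, then add."""
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [aj + bj for aj, bj in zip(a, b)]


def poly_mul(a: list, b: list) -> list:
    """Coefficients of p * q: c_n is the sum of a_j * b_l over j + l = n."""
    c = [0] * (len(a) + len(b) - 1)
    for j, aj in enumerate(a):
        for l, bl in enumerate(b):
            c[j + l] += aj * bl
    return c


# (1 + x) + 3x^2 and (1 + x)(1 + x) = 1 + 2x + x^2
print(poly_add([1, 1], [0, 0, 3]))
print(poly_mul([1, 1], [1, 1]))
```

The product has $k + m + 1$ coefficients, indexed by $n = 0, \dots, k+m$, matching the range of the outer sum.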
11. a) We need to determine the coefficients $b_l$, $0 \le l \le 2n$, given the coefficients $a_{2j}$, $0 \le j \le n$. The only choice is
\[ b_l = \begin{cases} a_{2j}, & l = 2j, \ j = 0, \dots, n \\ 0, & l = 1, 3, \dots, 2n - 1. \end{cases} \]
With this choice we clearly have:
\[ p(x) = \sum_{j=0}^{n} a_{2j} x^{2j} = \sum_{l=0}^{2n} b_l x^l. \]
b) Since for all $j \in \mathbb{N}$ we have $|x|^{2j} = (x^2)^j = x^{2j}$, it follows that $f$ and $p$ have the same domain, namely $\mathbb{R}$, and on $\mathbb{R}$ they coincide:
\[ p(x) = \sum_{j=0}^{n} a_{2j} x^{2j} = \sum_{j=0}^{n} a_{2j} |x|^{2j}. \]
c) For $x \ge 0$ we have $|x| = x$ and therefore $|x|^3 = x^3$. However, for $x < 0$ we have $|x| = -x$ and therefore $|x|^3 = (-1)^3 x^3 = -x^3 \ne x^3$. Hence the largest domain where $h$ and $g$ coincide is
\[ \mathbb{R}_+ = \{ x \in \mathbb{R} \mid x \ge 0 \}. \]
12. a) For all $x \in \mathbb{R}$ we know that $x^2 + 7 \ne 0$ and therefore
\[ \frac{x^3 - 5x^2 - 17}{x^2 + 7} \]
is defined for all $x \in \mathbb{R}$. Hence we can define a rational function:
\[ q_1 : \mathbb{R} \to \mathbb{R}, \quad x \mapsto q_1(x) = \frac{x^3 - 5x^2 - 17}{x^2 + 7}. \]
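b) The term $(x - 3)(x + 4)(2x + 7)^8$ has zeroes for $x = 3$, $x = -4$ and $x = -\frac{7}{2}$. Therefore we can define on $\mathbb{R} \setminus \{3, -4, -\frac{7}{2}\}$ the function $q_2 : \mathbb{R} \setminus \{3, -4, -\frac{7}{2}\} \to \mathbb{R}$, $x \mapsto q_2(x)$, where
\[ q_2(x) = \frac{(x - 3)^2 (2x + 7)^5}{(x - 3)(x + 4)(2x + 7)^8}. \]
However, on $\mathbb{R} \setminus \{3, -4, -\frac{7}{2}\}$ we find
\[ q_2(x) = \frac{x - 3}{(x + 4)(2x + 7)^3} \]
and this term is defined on $\mathbb{R} \setminus \{-4, -\frac{7}{2}\}$. Therefore we may extend $q_2 : \mathbb{R} \setminus \{3, -4, -\frac{7}{2}\} \to \mathbb{R}$ to a function $\tilde{q}_2 : \mathbb{R} \setminus \{-4, -\frac{7}{2}\} \to \mathbb{R}$ by
\[ x \mapsto \tilde{q}_2(x) = \frac{x - 3}{(x + 4)(2x + 7)^3}. \]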
c) The term $(x - 4)(x + 2)$ is zero for $x = 4$ and $x = -2$. It follows that on $\mathbb{R} \setminus \{4, -2\}$ we can define the function:
\[ q_3 : \mathbb{R} \setminus \{4, -2\} \to \mathbb{R}, \quad x \mapsto q_3(x) = \frac{x^2 - x - 12}{(x - 4)(x + 2)}. \]
However, for $x = 4$ we have $4^2 - 4 - 12 = 0$, or $x^2 - x - 12 = (x - 4)(x + 3)$. Thus on $\mathbb{R} \setminus \{4, -2\}$ we find
\[ \frac{x^2 - x - 12}{(x - 4)(x + 2)} = \frac{(x - 4)(x + 3)}{(x - 4)(x + 2)} = \frac{x + 3}{x + 2}. \]
Therefore we can extend $q_3$ to a function $\tilde{q}_3 : \mathbb{R} \setminus \{-2\} \to \mathbb{R}$ by
\[ x \mapsto \tilde{q}_3(x) = \frac{x + 3}{x + 2}. \]
13. (i)
a) By definition we have x ∈ f −1 (A ∩ B) if there exists y ∈ A ∩ B such that
f (x) = y. Since y ∈ A it follows that x ∈ f −1 (A) and since y ∈ B it follows that
x ∈ f −1 (B), i.e. x ∈ f −1 (A)∩f −1 (B). We have proved that f −1 (A∩B) ⊂ f −1 (A)∩
f −1 (B). Now let x ∈ f −1 (A) ∩ f −1 (B), i.e. x ∈ f −1 (A) and x ∈ f −1 (B). Hence
there exists y1 ∈ A such that f (x) = y1 and y2 ∈ B such that f (x) = y2 . However
this implies y1 = y2 and y1 = y2 ∈ A ∩ B. Consequently x ∈ f −1 (A ∩ B) proving
f −1 (A) ∩ f −1 (B) ⊂ f −1 (A ∩ B) which now proves f −1 (A ∩ B) = f −1 (A) ∩ f −1 (B).
b) If x ∈ f −1 (A ∪ B) then there exists y ∈ A ∪ B such that f (x) = y. Conse-
quently x ∈ f −1 (A) or x ∈ f −1 (B) implying f −1 (A ∪ B) ⊂ f −1 (A) ∪ f −1 (B). Now,
let x ∈ f −1 (A) ∪ f −1 (B). Then there exists y ∈ A ∪ B such that f (x) = y implying
that x ∈ f −1 (A ∪ B) or f −1 (A) ∪ f −1 (B) ⊂ f −1 (A ∪ B) proving the assertion.
(ii)
a) For y ∈ f (A ∩ B) there exists x ∈ A ∩ B such that f (x) = y, hence y ∈ f (A)
and y ∈ f (B), i.e. y ∈ f (A)∩f (B) and we have proved that f (A∩B) ⊂ f (A)∩f (B).
Of course we do not expect equality to hold in general: take f : R −→ R, x ↦ f(x) = x², and
choose A = {1} and B = {−1}. Then A ∩ B = ∅ and consequently f(A ∩ B) = f(∅) = ∅, while
f(A) = {1} and f(B) = {1}, i.e. f(A) ∩ f(B) = {1}.
b) If y ∈ f (A ∪ B) then there exists x ∈ A ∪ B such that f (x) = y, thus
y ∈ f (A) or y ∈ f (B) implying f (A ∪ B) ⊂ f (A) ∪ f (B). Now let y ∈ f (A) ∪ f (B)
then there exists x ∈ A or x ∈ B such that f (x) = y, i.e. x ∈ A ∪ B and f (x) = y
which yields y ∈ f (A ∪ B). Thus we have proved that f (A) ∪ f (B) ⊂ f (A ∪ B).
c) By definition f ({x}) = {y ∈ Y | y = f (x)} = {f (x)}.
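For finite sets the statements of Problem 13 can be tested directly. The following sketch is only an illustration; the helpers `image` and `preimage` and the finite stand-in `domain` for $\mathbb{R}$ are our own assumptions. It reproduces the counterexample $f(x) = x^2$, $A = \{1\}$, $B = \{-1\}$ and checks that pre-images commute with intersections and unions.

```python
def image(f, subset):
    """The image f(A) = {f(x) : x in A}."""
    return {f(x) for x in subset}


def preimage(f, domain, target):
    """The pre-image {x in domain : f(x) in target}."""
    return {x for x in domain if f(x) in target}


def square(x):
    return x * x


domain = set(range(-3, 4))  # finite stand-in for the real line
A, B = {1}, {-1}

# f(A ∩ B) is empty, but f(A) ∩ f(B) = {1}: the inclusion is strict.
print(image(square, A & B), image(square, A) & image(square, B))

# Pre-images behave well with respect to intersections and unions.
C, D = {0, 1, 4}, {4, 9}
assert preimage(square, domain, C & D) == preimage(square, domain, C) & preimage(square, domain, D)
assert preimage(square, domain, C | D) == preimage(square, domain, C) | preimage(square, domain, D)
```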
14. (i)
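a) Since $x^2 + 1 \ge 1$ we first note that $f^{-1}(\{y\}) = \emptyset$ if $y < 1$. In the case where $y = 1$ we deduce $f^{-1}(\{1\}) = \{0\}$, whereas for $y > 1$, $x^2 + 1 = y$ implies $x_{1,2} = \pm\sqrt{y - 1}$, i.e. $f^{-1}(\{y\}) = \{+\sqrt{y - 1}, -\sqrt{y - 1}\}$. This is easier to see in the following figure:
[Figure: graph of $y = x^2 + 1$ with horizontal lines at the levels $y > 1$, $y = 1$ and $y < 1$; the level $y > 1$ meets the graph above $x = -\sqrt{y - 1}$ and $x = \sqrt{y - 1}$.]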
b) Since $\frac{1}{x} \ne 0$ for all $x \in \mathbb{R} \setminus \{0\}$, we deduce that $g^{-1}(\{0\}) = \emptyset$, whereas for $z \ne 0$ it always follows from $g(x) = z$, i.e. $z = \frac{1}{x}$, that $g^{-1}(\{z\}) = \left\{ \frac{1}{z} \right\}$.
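c) Consider the following figure:
[Figure: graph of $h$ with the interval $(a, b)$ marked on the vertical axis and its pre-image $h^{-1}((a, b))$, with end points $A$ and $B$, marked on the horizontal axis.]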
It suggests that h−1 ((a, b)) is the interval (A, B) with A given by h(A) = a and B by
h(B) = b which implies A = 2a−6 and B = 2b−6. Thus h−1 ((a, b)) = (2a−6, 2b−6).
(ii)
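a) Using previous knowledge we find
\[ f\left( \left[ \tfrac{1}{4}, 9 \right] \right) = \left\{ y \in \mathbb{R} \mid y = \sqrt{x}, \ x \in \left[ \tfrac{1}{4}, 9 \right] \right\} = \left[ \tfrac{1}{2}, 3 \right]. \]
b) We have
\[ g(\{1, 2, 3, 4\}) = \{ g(1), g(2), g(3), g(4) \} = \left\{ 0, \tfrac{1}{2}, \tfrac{8}{11}, \tfrac{5}{6} \right\}. \]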
c) We have
h(N) = {y ∈ R | y = 2n , n ∈ N}.
Chapter 5
1. a) In order for f1 : R −→ R+ to be injective we must have that f1 (x) = f1 (y)

implies x = y, i.e. we need to consider the equation
|x − 3| + 2 = |y − 3| + 2,
or
|x − 3| = |y − 3|.
For every real number $a \in \mathbb{R} \setminus \{0\}$, where $x = 3 + a$ and $y = 3 - a$, it follows that
\[ |x - 3| = |3 + a - 3| = |a| = |3 - a - 3| = |y - 3|, \]
but $x = 3 + a \ne 3 - a = y$ for $a \ne 0$. Hence $f_1$ is not injective and therefore it cannot be bijective.
In order for f1 : R −→ R+ to be surjective, we need to find for every b ≥ 0 some
x ∈ R such that f1 (x) = b. Since f1 (x) = |x − 3| + 2 ≥ 2 for all x ∈ R, we cannot
find any x ∈ R such that |x − 3| + 2 = b for 0 ≤ b < 2. Hence f1 : R −→ R+ is also
not surjective.
The graph of f1 is as follows:
[Figure: graph of $f_1$, plotted for $-1 \le x \le 6$.]
b) We first test $f_2 : [1, \infty) \to (0, 2]$ for injectivity. Let $x, y \in [1, \infty)$ be given and suppose that $f_2(x) = f_2(y)$, i.e.
\[ \frac{2}{x} = \frac{2}{y} \quad \text{or} \quad 2y = 2x. \]
This implies that $x = y$ and therefore $f_2$ is injective. Now let $b \in (0, 2]$ and consider the equation
\[ b = f_2(x) = \frac{2}{x}. \]
This equation has the unique solution $x = \frac{2}{b}$ and for $0 < b \le 2$ it follows that $1 \le x < \infty$. Thus $f_2$ is surjective and with the previous result it follows that $f_2$ is bijective.
The graph of $f_2$ is as follows:
[Figure: graph of $f_2$, plotted for $0 \le x \le 5$.]
c) Again we start by checking the injectivity of $f_3 : [-2, 7] \to [0, 3]$. For $x, y \in [-2, 7]$ we find the condition
\[ \sqrt{x + 2} = \sqrt{y + 2} \quad \text{or} \quad x + 2 = y + 2 \]
implying $x = y$, i.e. $f_3$ is injective. Now let $b \in [0, 3]$ and consider the equation
\[ \sqrt{x + 2} = b \quad \text{or} \quad x + 2 = b^2. \]
This yields that $x = b^2 - 2$ and for $b \in [0, 3]$ we have $-2 \le b^2 - 2 \le 7$. Thus $f_3$ is surjective,
\[ f_3(b^2 - 2) = \sqrt{b^2 - 2 + 2} = \sqrt{b^2} = b. \]
Consequently, since $f_3$ has already been proved to be injective it follows that $f_3$ is bijective. Here is the graph of $f_3$:
[Figure: graph of $f_3$ on $[-2, 7]$.]

2. a) For $p_1 = 2$ and $q_1 = 3$ we have $g\left(\frac{p_1}{q_1}\right) = p_1 + q_1 = 5$. Further for $p_2 = 4$ and $q_2 = 1$ we have $g\left(\frac{p_2}{q_2}\right) = p_2 + q_2 = 5$, and therefore $g$ is not injective and hence not bijective either. Clearly, $g$ is surjective: given $k \in \mathbb{Z}$ we take $p = k - 1$ and $q = 1$ to find $g\left(\frac{k-1}{1}\right) = g\left(\frac{p}{q}\right) = p + q = k - 1 + 1 = k$.
b) Note that r maps pairs of real numbers into pairs of real numbers. Let
(x1 , y1 ) and (x2 , y2 ) be two pairs of real numbers and suppose that
r(x1 , y1 ) = r(x2 , y2 ), i.e. (y1 , x1 ) = (y2 , x2 ).
The equality (y1 , x1 ) = (y2 , x2 ) means that y1 = y2 and x1 = x2 , hence the pairs
(x1 , y1 ) and (x2 , y2 ) are equal implying the injectivity of r. The surjectivity of r is
straightforward: given the pair (a, b) ∈ R × R then for (x, y) := (b, a) it follows that
r(x, y) = r(b, a) = (a, b).
Thus r is injective and surjective, hence it is bijective.
3. a) Not taking into account domain and range (co-domain) problems, formally $(f \circ g)(x)$ is given by
\begin{align*}
(f \circ g)(x) &= f(g(x)) \\
&= 5(g(x))^2 - 2g(x) + 1 \\
&= 5\left(\sqrt{5 + x}\right)^2 - 2\sqrt{5 + x} + 1 \\
&= 5x - 2\sqrt{5 + x} + 26.
\end{align*}
Since $g$ is defined for all $x \ge -5$ and since $f$ is defined for all real numbers it follows that $f \circ g$ is defined on $[-5, \infty)$ and therefore $(f \circ g)(x) = 5x - 2\sqrt{5 + x} + 26$ holds for $x \in [-5, \infty)$.
b) For $D_1 = \mathbb{R}$ we can define $f \circ h$ by
\[ (f \circ h)(x) = f(x^4 + 2) = |x^4 + 2 + 3| - 2. \]
The same holds for $D_2 = \mathbb{R}$ and $h \circ f$:
\[ (h \circ f)(x) = h(|x + 3| - 2) = (|x + 3| - 2)^4 + 2. \]
Note that we can define $f \circ h : \mathbb{R} \to \mathbb{R}$ as well as $h \circ f : \mathbb{R} \to \mathbb{R}$, but of course $f \circ h \ne h \circ f$.
c) Here we have to be more careful since the range of $h$ is $R(h) = [-1, \infty)$ but $\sqrt{\cdot}$ is not defined for negative numbers. However for $x \ge -1$ or $x \le -3$ it follows that $|x + 2| - 1 \ge 0$. Hence on
\[ D = \{ x \in \mathbb{R} \mid x \le -3 \text{ or } x \ge -1 \} = \mathbb{R} \setminus (-3, -1) \]
we can define
\[ (f \circ h)(x) = f(h(x)) = f(|x + 2| - 1) = \sqrt{|x + 2| - 1}. \]
4. Let f1 : D1 −→ F1 and f2 : D2 −→ F2 be two injective mappings such that

f1 (D1 ) = D2 . Then f2 ◦ f1 : D1 −→ F2 is defined and for x, y ∈ D1 the equality
(f2 ◦ f1 )(x) = (f2 ◦ f1 )(y), i.e. f2 (f1 (x)) = f2 (f1 (y)),
implies f1 (x) = f1 (y) by the injectivity of f2 , and now the injectivity of f1 implies
x = y, i.e. f2 ◦ f1 is injective.
Now let g1 : D1 −→ F1 and g2 : D2 −→ F2 be two surjective mappings such that
g1 (D1 ) = D2 (= F1 ). Then g2 ◦ g1 : D1 −→ F2 is defined and for b ∈ F2 the
surjectivity of g2 implies the existence of a ∈ D2 such that g2 (a) = b. Since g1 is
surjective we know that g1 (D1 ) = F1 = D2 , thus given a ∈ D2 = F1 we find x ∈ D1
such that g1 (x) = a implying that
(g2 ◦ g1 )(x) = g2 (g1 (x)) = g2 (a) = b,
i.e. g2 ◦ g1 is surjective.
Now if f1 : D1 −→ F1 and f2 : D2 −→ F2 are injective and surjective, i.e. bijective,
and if f1 (D1 ) = D2 (= F1 ), then f2 ◦ f1 is also injective and surjective, i.e. it is
bijective.
5. Since all mappings belonging to Aut(X) are bijective their composition is always
defined.
(i) Since the composition of mappings is associative the statement (f ◦ g) ◦ h =
f ◦ (g ◦ h) follows immediately.
(ii) Clearly f ◦g maps X to X. We need to prove that it is bijective. By Problem
4 we know however that the composition of bijective mappings is bijective, hence
f ◦ g ∈ Aut(X).
(iii) The map idX : X −→ X, x → idX (x) = x, belongs to Aut(X) and the
following holds for f ∈ Aut(X):
(f ◦ idX )(x) = f (idX (x)) = f (x) = idX (f (x)) = (idX ◦ f )(x).
(iv) Since f ∈ Aut(X) it is bijective and with kf := f −1 we find

f ◦ f −1 = f −1 ◦ f = idX .
6. a) Let $f : X \to Y$ be injective. Then $f : X \to R(f)$ is bijective and $f^{-1} : R(f) \to X$ exists. Define $g : Y \to X$ by
\[ g(y) := \begin{cases} f^{-1}(y) & \text{for } y \in R(f) \\ x_0 \in X & \text{for } y \in Y \setminus R(f). \end{cases} \]
Then we find for x ∈ X since f (x) = y ∈ R(f ) that
(g ◦ f )(x) = g(f (x)) = g(y) = f −1 (y) = (f −1 ◦ f )(x) = x,
i.e. g ◦ f = idX .
Conversely, suppose that there exists a mapping g : Y −→ X such that
g ◦ f = idX . For x, y ∈ X with f (x) = f (y) it follows that
x = g(f (x)) = g(f (y)) = y,
i.e. x = y and f must be injective.
b) Suppose now that f : X −→ Y is surjective. For y ∈ Y choose xy ∈ X such
that f (xy ) = y. This defines a mapping
h : Y −→ X
y → xy .
For this mapping we find f ◦ h : Y −→ Y and
(f ◦ h)(y) = f (h(y)) = f (xy ) = y,
i.e. f ◦ h = idY .
Conversely if there exists a mapping h : Y −→ X such that f ◦ h = idY it follows
for any b ∈ Y that
b = idY (b) = (f ◦ h)(b) = f (h(b)).
Thus given b ∈ Y there exists x := h(b) ∈ X such that
f (x) = f (h(b)) = b, i.e. f is surjective.
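For finite sets both constructions of Problem 6 can be carried out explicitly. The sketch below is an illustration only; `left_inverse` and `right_inverse` are hypothetical helper names, and the choice of $x_y$ in part b) is made by taking the first suitable element found.

```python
def left_inverse(f, X, x0):
    """For injective f on X, return g with g(f(x)) = x; g sends all y outside R(f) to x0."""
    inverse_on_range = {f(x): x for x in X}
    return lambda y: inverse_on_range.get(y, x0)


def right_inverse(f, X, Y):
    """For f mapping X onto Y, return h with f(h(y)) = y by choosing one x_y per y."""
    choice = {}
    for x in X:
        choice.setdefault(f(x), x)  # keep the first x found with f(x) = y
    if set(Y) - set(choice):
        raise ValueError("f is not surjective onto Y")
    return lambda y: choice[y]


# Injective example: x -> 2x from {0, 1, 2} into {0, ..., 5}.
X1 = [0, 1, 2]
g = left_inverse(lambda x: 2 * x, X1, x0=0)
assert all(g(2 * x) == x for x in X1)

# Surjective example: x -> x^2 from {-2, ..., 2} onto {0, 1, 4}.
X2, Y2 = [-2, -1, 0, 1, 2], [0, 1, 4]
h = right_inverse(lambda x: x * x, X2, Y2)
assert all(h(y) ** 2 == y for y in Y2)
```

For infinite sets the choice of $x_y$ in part b) cannot be made by a loop, which is why the written proof simply states "choose $x_y$".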
7. We already know (or can easily check) that $f : (0, \infty) \to (0, \infty)$ is bijective, hence $f^{-1}$ exists. The claim is that $f \circ f = \mathrm{id}$. However for $x \in (0, \infty)$ we find
\[ f(f(x)) = f\left( \frac{1}{x} \right) = \frac{1}{\frac{1}{x}} = x = \mathrm{id}_{(0,\infty)}(x). \]
8. First note that the range of h is a subset of R+ \ {0}, and therefore we can define
f ◦ h and g ◦ h on a suitable domain, namely the domain of h. By definition we
have
(f + g) ◦ h = f ◦ h + g ◦ h,
(f · g) ◦ h = (f ◦ h) · (g ◦ h),
1 1
◦h= ,
g g◦h
and therefore it follows that
((f + g) ◦ h)(x) = f (h(x)) + g(h(x))

1
= + h(x) + |h(x) − 2|
h(x)
1
= 2 + x2 + 2 + x2 ,
x +1
((f · g) ◦ h)(x) = f (h(x)) · g(h(x))

1
= h(x) + |h(x) − 2|
h(x)
1 x2
= √ + 2 ,
x +1 x +1
2

1 1 1
◦ h (x) = =
g g(h(x)) h(x) + |h(x) − 2|
1
=√ .
x2 + 2 + x2
9. For every real number $a \in \mathbb{R}$ we have
\[ \frac{|a| + a}{2} \ge 0 \quad \text{and} \quad \frac{|a| - a}{2} \ge 0. \]
Indeed, if $a \ge 0$ then $|a| = a$ and
\[ \frac{|a| + a}{2} = \frac{2a}{2} = a \ge 0 \quad \text{and} \quad \frac{|a| - a}{2} = \frac{a - a}{2} = 0. \]
But if $a < 0$ then $|a| = -a$ and we find
\[ \frac{|a| + a}{2} = \frac{-a + a}{2} = 0 \]
as well as
\[ \frac{|a| - a}{2} = \frac{-2a}{2} = -a > 0 \quad \text{since } a < 0. \]
Since for $x \in X$ by definition we have $f(x) \in \mathbb{R}$ it follows that
\[ f^+(x) = \frac{|f(x)| + f(x)}{2} \ge 0 \quad \text{and} \quad f^-(x) = \frac{|f(x)| - f(x)}{2} \ge 0. \]
We call $f^+$ the positive part and $f^-$ the negative part of $f$. Note that the negative part of $f$ is a non-negative function.
Now it follows that
\[ f^+(x) - f^-(x) = \frac{|f(x)| + f(x)}{2} - \frac{|f(x)| - f(x)}{2} = f(x) \]
and
\[ f^+(x) + f^-(x) = \frac{|f(x)| + f(x)}{2} + \frac{|f(x)| - f(x)}{2} = |f(x)|. \]
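The two identities are easy to test numerically. The following sketch is an illustration only; the helper names `positive_part` and `negative_part` and the sample function $x \mapsto x^3 - x$ are our own choices.

```python
def positive_part(f):
    """Return the function x -> (|f(x)| + f(x)) / 2."""
    return lambda x: (abs(f(x)) + f(x)) / 2


def negative_part(f):
    """Return the function x -> (|f(x)| - f(x)) / 2, which is non-negative."""
    return lambda x: (abs(f(x)) - f(x)) / 2


def sample(x):
    return x ** 3 - x


f_plus, f_minus = positive_part(sample), negative_part(sample)

for x in range(-3, 4):
    assert f_plus(x) >= 0 and f_minus(x) >= 0
    assert f_plus(x) - f_minus(x) == sample(x)       # f = f^+ - f^-
    assert f_plus(x) + f_minus(x) == abs(sample(x))  # |f| = f^+ + f^-
```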
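10. a) We need to solve the equation
\[ y = \frac{1}{1 + x^2}, \]
or $1 + x^2 = \frac{1}{y}$, i.e. $x^2 = \frac{1}{y} - 1$.
Since $y \in (0, 1]$ it follows that $\frac{1}{y} - 1 \ge 0$. Hence
\[ x = \sqrt{\frac{1}{y} - 1} = \sqrt{\frac{1 - y}{y}}. \]
Thus we find the inverse function of $f_1$ to be:
\[ f_1^{-1} : (0, 1] \to [0, \infty), \quad y \mapsto \sqrt{\frac{1 - y}{y}}. \]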
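b) We first sketch $f_2$:
[Figure: graph of $f_2$, plotted for $0 \le x \le 7$.]
In order to find $f_2^{-1}$ we need to solve the equation
\[ f_2(x) = y. \]
For $0 \le x \le 1$ we find
\[ -x + 2 = y, \quad \text{i.e. } x = 2 - y. \]
For $1 < x < \infty$ we have
\[ \frac{1}{x} = y \quad \text{or} \quad x = \frac{1}{y}, \]
and therefore we obtain
\[ f_2^{-1} : (0, 2] \to [0, \infty), \quad y \mapsto \begin{cases} \frac{1}{y}, & y \in (0, 1) \\ 2 - y, & y \in [1, 2]. \end{cases} \]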
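c) Now the equation we have to solve is given by
\[ f_3(n) = q \quad \text{or} \quad \frac{1}{n^3} = q \]
which yields
\[ n = \frac{1}{\sqrt[3]{q}} = q^{-\frac{1}{3}}. \]
Thus $f_3^{-1} : \left\{ q \mid q = \frac{1}{n^3} \text{ and } n \in \mathbb{N} \right\} \to \mathbb{N}$, $q \mapsto q^{-\frac{1}{3}}$.
Note that
\[ \left( \frac{1}{n^3} \right)^{-\frac{1}{3}} = (n^3)^{\frac{1}{3}} = n, \]
therefore $f_3^{-1}$ has the desired properties.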
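11. a) First consider the figure below of the unit disc $B_1(0)$.
[Figure: the unit disc $B_1(0)$ with a point $(x, y)$ and its projection $x = \mathrm{pr}_1((x, y))$ marked on the horizontal axis.]
For $(x, y) \in B_1(0)$ we find that $\mathrm{pr}_1((x, y)) = x$. Denote the set
\[ \left\{ (x_0, y) \mid -\sqrt{1 - x_0^2} \le y \le \sqrt{1 - x_0^2} \right\} \]
by $A(x_0)$ for $x_0 \in [-1, 1]$. Then we find $\mathrm{pr}_1(A(x_0)) = \{x_0\}$.
Now for the circle $S^1 = \{ (x, y) \in \mathbb{R}^2 \mid x^2 + y^2 = 1 \}$ we find again that $\mathrm{pr}_1((x, y)) = x$, see the following figure:
[Figure: the unit circle $S^1$ with a point $(x, y)$ and its projection $x = \mathrm{pr}_1((x, y))$.]
For $x_0 \in [-1, 1]$ and $(x_0, y) \in S^1$ we find with $y = \pm\sqrt{1 - x_0^2}$ that only the points $\left( x_0, \pm\sqrt{1 - x_0^2} \right)$ are mapped to $x_0$ by $\mathrm{pr}_1$. In both cases we have however
\[ \mathrm{pr}_1(B_1(0)) = \mathrm{pr}_1(S^1) = [-1, 1]. \]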
b) We may rewrite R(g) as
R(g) = {(x, y) | x ∈ [0, 1] and g(x) = x2 + 1}
= {(x, x2 + 1) | x ∈ [0, 1]}.
This implies that
pr2 (R(g)) = {x2 + 1 | x ∈ [0, 1]}
= [1, 2],
i.e. we are dealing with the following situation:
[Figure: the set $R(g)$ over $[0, 1]$ and its projection $\mathrm{pr}_2(R(g)) = [1, 2]$ onto the vertical axis.]
12. First we look at pr1 : X × Y −→ X, (x, y) → x. Now, by the very definition of the
pre-image we have for A ⊂ X
pr1−1 (A) = {(x, y) ∈ X × Y | x ∈ A}

= {(x, y) | x ∈ A, y ∈ Y }
= A × Y.
Analogously we find for pr2 : X ×Y −→ Y , (x, y) → y, that for B ⊂ Y the following

holds
pr2−1 (B) = {(x, y) ∈ X × Y | y ∈ B}

= {(x, y) | x ∈ X, y ∈ B}
= X × B.
13. Suppose that j : N −→ R is injective. Then j : N −→ j(N) is surjective and

injective, hence bijective, implying that j(N) is countable as it is a bijective image
of N. Now consider the mapping j : N −→ {1} ∪ {2k | k ∈ N} with
\[ j(n) := \begin{cases} 1, & \text{for } n \text{ being odd} \\ 2n, & \text{for } n \text{ being even.} \end{cases} \]
Clearly j is not injective but j(N) is countable. Indeed we know that {2k | k ∈ N}
is countable and the union of a countable set with a finite set is again countable.
14. We have to prove that ‘∼’ is symmetric, reflexive and transitive.
If f, g ∈ M (D; R) and f ∼ g then there exists a finite set Af,g = {x1 , . . . , xm } ⊂ D
such that f (x) = g(x) for x ∈ D \ Af,g . But for x ∈ D \ Af,g we also have
g(x) = f (x), i.e. f ∼ g implies that g ∼ f and ‘∼’ is symmetric. Since f (x) = f (x)
for all x ∈ D and by definition the empty set is finite it follows with Af,f = φ that
f (x) = f (x) for all x ∈ D \ Af,f , i.e. ‘∼’ is reflexive.
Finally, if f, g, h ∈ M (D; R) and f ∼ g as well as g ∼ h, we find sets Af,g and Ag,h
such that
f (x) = g(x) for x ∈ D \ Af,g and g(x) = h(x) for x ∈ D \ Ag,h .
Now Af,h := Af,g ∪ Ag,h is a finite set and for

x ∈ D \ Af,h = D \ (Af,g ∪ Ag,h ) we have
f (x) = g(x) = h(x),
i.e. f (x) = h(x) for x ∈ D \ Af,h implying the transitivity of ‘∼’. Therefore it
follows that ‘∼’ is an equivalence relation.
15. The mapping $J$ is injective: if $((x, y), z) \ne ((x', y'), z')$ then either $z \ne z'$ or $(x, y) \ne (x', y')$. Hence at least one of the statements $z \ne z'$, $x \ne x'$, $y \ne y'$ is true which implies that $(x, y, z) \ne (x', y', z')$. The mapping $J$ is surjective: given $(x, y, z) \in X \times Y \times Z$, then $((x, y), z) \in (X \times Y) \times Z$ and $J(((x, y), z)) = (x, y, z)$. Hence $J$ is bijective.
Chapter 6
1. Firstly, a general remark: in order to calculate limits using (6.18)−(6.20) we assume
that all the relevant assumptions hold. However, while doing these calculations it
is important that we can justify that all steps are correct.
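a)
\begin{align*}
\lim_{x \to \frac{3}{4}} \left( \frac{5}{3} x^2 - \frac{7}{12} x \right) &\overset{(6.18)}{=} \lim_{x \to \frac{3}{4}} \frac{5}{3} x^2 - \lim_{x \to \frac{3}{4}} \frac{7}{12} x \\
&\overset{(6.19)}{=} \left( \lim_{x \to \frac{3}{4}} \frac{5}{3} \right) \left( \lim_{x \to \frac{3}{4}} x^2 \right) - \left( \lim_{x \to \frac{3}{4}} \frac{7}{12} \right) \left( \lim_{x \to \frac{3}{4}} x \right) \\
&= \frac{5}{3} \cdot \left( \frac{3}{4} \right)^2 - \frac{7}{12} \cdot \frac{3}{4} = \frac{5}{3} \cdot \frac{9}{16} - \frac{7}{12} \cdot \frac{3}{4} = \frac{5 \cdot 3}{16} - \frac{7}{16} = \frac{1}{2}.
\end{align*}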
b) First note that for $x \ne 1$
\[ \frac{1 - x^2}{1 - x} = \frac{(1 - x)(1 + x)}{1 - x} = 1 + x \]
and therefore
\[ \lim_{x \to 1} \frac{1 - x^2}{1 - x} = \lim_{x \to 1} (1 + x) \overset{(6.18)}{=} \lim_{x \to 1} 1 + \lim_{x \to 1} x = 2. \]
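c)
\begin{align*}
\lim_{x \to 3} \frac{x^3 - 4x^2 + 7x - 13}{-\frac{7}{5} x^2 + \frac{1}{1 + x^2}}
&\overset{(6.20)}{=} \frac{\lim_{x \to 3} (x^3 - 4x^2 + 7x - 13)}{\lim_{x \to 3} \left( -\frac{7}{5} x^2 + \frac{1}{1 + x^2} \right)} \\
&\overset{(6.18),(6.20)}{=} \frac{\lim_{x \to 3} x^3 - \lim_{x \to 3} 4x^2 + \lim_{x \to 3} 7x - \lim_{x \to 3} 13}{\lim_{x \to 3} \left( -\frac{7}{5} x^2 \right) + \frac{1}{\lim_{x \to 3} (1 + x^2)}} \\
&= \frac{3^3 - 4 \cdot 3^2 + 7 \cdot 3 - 13}{-\frac{7}{5} \cdot 3^2 + \frac{1}{1 + 3^2}} \\
&= \frac{27 - 36 + 21 - 13}{-\frac{63}{5} + \frac{1}{10}} \\
&= \frac{10(48 - 49)}{-126 + 1} = \frac{-10}{-125} = \frac{2}{25}.
\end{align*}
Note that since $\lim_{x \to 3} \left( -\frac{7}{5} x^2 + \frac{1}{1 + x^2} \right) = \frac{-125}{10} \ne 0$ we may apply (6.20).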
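Since the limit rules reduce this limit to evaluating the quotient at $x = 3$, the arithmetic can be repeated in exact rational numbers. The sketch below is only a check of the computation; the function name `quotient` is our own, and Python's `fractions.Fraction` is used to avoid rounding.

```python
from fractions import Fraction


def quotient(x) -> Fraction:
    """Evaluate (x^3 - 4x^2 + 7x - 13) / (-(7/5) x^2 + 1 / (1 + x^2)) exactly."""
    x = Fraction(x)
    numerator = x ** 3 - 4 * x ** 2 + 7 * x - 13
    denominator = Fraction(-7, 5) * x ** 2 + 1 / (1 + x ** 2)
    return numerator / denominator


print(quotient(3))  # value at x = 3, which equals the limit by (6.18)-(6.20)

# Values at points approaching 3 get close to that value.
for n in (10, 100, 1000):
    print(float(quotient(3 + Fraction(1, n))))
```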
2. The remark made at the beginning of the solution of Problem 1 also applies here.
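a)
\begin{align*}
\lim_{x \to 4} \frac{x^2 - 2x + 5}{x - 2} &= \frac{\lim_{x \to 4} (x^2 - 2x + 5)}{\lim_{x \to 4} (x - 2)} \\
&= \frac{\lim_{x \to 4} x^2 - \lim_{x \to 4} 2x + \lim_{x \to 4} 5}{\lim_{x \to 4} x - \lim_{x \to 4} 2} \\
&= \frac{16 - 8 + 5}{4 - 2} = \frac{13}{2},
\end{align*}
and we need to note that $\lim_{x \to 4} (x - 2) = 2 \ne 0$.
b)
\begin{align*}
\lim_{x \to -3} \frac{x^2 - 9}{(x + 5)(x + 3)} &= \lim_{x \to -3} \frac{(x - 3)(x + 3)}{(x + 5)(x + 3)} \\
&= \lim_{x \to -3} \frac{x - 3}{x + 5} = \frac{\lim_{x \to -3} (x - 3)}{\lim_{x \to -3} (x + 5)} \\
&= \frac{-6}{2} = -3.
\end{align*}
We need to note that for $x \ne -3$ we have $\frac{x^2 - 9}{(x + 5)(x + 3)} = \frac{x - 3}{x + 5}$, and that $\lim_{x \to -3} (x + 5) = 2 \ne 0$.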
3. For $x \ne 3$ we have $f(x) = x^3 - 22$ and therefore
\[ \lim_{x \to 3} f(x) = \lim_{x \to 3} (x^3 - 22) = 27 - 22 = 5 \]
and since $5 = \lim_{x \to 3} f(x) \ne f(3) = 17$, it follows that $f$ is not continuous at $x = 3$.
4. a) Since $h$ is bounded we know that $|h(x)| \le M$ for some $M \ge 0$, therefore we find that $|x h(x)| \le M|x|$. Therefore it remains to prove that $\lim_{x \to 0} (M|x|) = 0$ (using the assumption in the question) which follows from $\lim_{x \to 0} |x| = 0$.
We must satisfy the definition of the limit of a function: given $\varepsilon > 0$ we choose $\delta = \varepsilon$ to find for $|x| < \delta$ that
\[ \big| |x| - 0 \big| = |x| < \delta = \varepsilon, \quad \text{which implies } \lim_{x \to 0} |x| = 0. \]
Now we sketch the proof of the assumption:
$|f(x)| \le g(x)$ for all $x \in (a, b)$ and $\lim_{x \to c} g(x) = 0$, $c \in (a, b)$, implies $\lim_{x \to c} f(x) = 0$.
We know that for $\varepsilon > 0$ there exists $\delta > 0$ such that $0 < |x - c| < \delta$ implies $|g(x)| = g(x) < \varepsilon$. Therefore for $\varepsilon > 0$ given we find with the same $\delta > 0$ for $0 < |x - c| < \delta$ that
\[ |f(x) - 0| = |f(x)| \le g(x) < \varepsilon, \]
i.e. $\lim_{x \to c} f(x) = 0$.
b) For the function $f$ we find the estimate
\[ |f(x)| \le |x| \left| \sin \frac{1}{x} \right| \le |x| \quad \text{for } x \ne 0, \qquad |f(0)| = 0 = |0| \quad \text{for } x = 0. \]
Therefore it follows that
\[ |f(x)| \le |x| \quad \text{for all } x \in \mathbb{R} \]
and applying part a), in particular that $\lim_{x \to 0} |x| = 0$, it follows that $\lim_{x \to 0} f(x) = 0$.
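The estimate $|f(x)| \le |x|$ can be observed numerically. The following is a minimal sketch, assuming (as the estimate above suggests) that $f(x) = x \sin\frac{1}{x}$ for $x \ne 0$ and $f(0) = 0$; it samples points approaching $0$ from both sides.

```python
import math


def f(x: float) -> float:
    """x * sin(1/x) for x != 0, extended by f(0) = 0 (assumed form of f)."""
    return x * math.sin(1.0 / x) if x != 0 else 0.0


samples = [sign * 10.0 ** (-n) for n in range(1, 9) for sign in (1, -1)]

for x in samples:
    # The bound |f(x)| <= |x| forces f(x) to 0 as x tends to 0.
    assert abs(f(x)) <= abs(x)
    print(f"x = {x: .1e}   f(x) = {f(x): .3e}")
```

A finite sample cannot prove the limit; it only illustrates how the bound $|x|$ squeezes $f(x)$.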
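5. Consider the following
\begin{align*}
\frac{f(x) - f(x_0)}{x - x_0} &= \frac{\frac{3}{4} x^2 - 2 - \left( \frac{3}{4} \left( -\frac{1}{2} \right)^2 - 2 \right)}{x - \left( -\frac{1}{2} \right)} \\
&= \frac{3}{4} \cdot \frac{x^2 - \frac{1}{4}}{x + \frac{1}{2}} = \frac{3}{4} \cdot \frac{\left( x - \frac{1}{2} \right) \left( x + \frac{1}{2} \right)}{x + \frac{1}{2}} \\
&= \frac{3}{4} \left( x - \frac{1}{2} \right).
\end{align*}
Recall $f'(x_0) = \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0}$. Therefore for the limit we now find
\begin{align*}
\lim_{x \to -\frac{1}{2}} \frac{f(x) - f\left( -\frac{1}{2} \right)}{x - \left( -\frac{1}{2} \right)} &= \lim_{x \to -\frac{1}{2}} \frac{f(x) - f\left( -\frac{1}{2} \right)}{x + \frac{1}{2}} \\
&= \lim_{x \to -\frac{1}{2}} \frac{3}{4} \left( x - \frac{1}{2} \right) = \frac{3}{4} \left( -\frac{1}{2} - \frac{1}{2} \right) = \frac{3}{4} (-1) = -\frac{3}{4},
\end{align*}
thus $f'\left( -\frac{1}{2} \right) = -\frac{3}{4}$.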
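6. First let us sketch the graph of $\chi_{[0,1]} : \mathbb{R} \to \mathbb{R}$, where
\[ \chi_{[0,1]}(x) = \begin{cases} 1, & x \in [0, 1] \\ 0, & x \notin [0, 1]. \end{cases} \]
[Figure: graph of $\chi_{[0,1]}$, equal to $1$ on $[0, 1]$ and equal to $0$ elsewhere.]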
Now, for $x_0 < 0$ we find for $x \in \mathbb{R}$ such that $|x - x_0| < \delta$ and $\delta < |x_0|$, in particular $x < x_0 + \delta < 0$, that
\[ \frac{\chi_{[0,1]}(x) - \chi_{[0,1]}(x_0)}{x - x_0} = \frac{0 - 0}{x - x_0} = 0, \]
implying $\chi_{[0,1]}'(x_0) = 0$. In a similar way we find that $\chi_{[0,1]}$ is differentiable for $0 < x_0 < 1$: for $x$ close to $x_0$ and $x \in (0, 1)$ we find
\[ \frac{\chi_{[0,1]}(x) - \chi_{[0,1]}(x_0)}{x - x_0} = 0 \]
which gives $\chi_{[0,1]}'(x_0) = 0$. Moreover, for $x_0 > 1$ and $1 < x$ it follows once again that
\[ \frac{\chi_{[0,1]}(x) - \chi_{[0,1]}(x_0)}{x - x_0} = 0 \]
hence $\chi_{[0,1]}'(x_0) = 0$.
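Before we investigate the case $x_0 = 0$ or $x_0 = 1$, we make the following observation: in order for
\[ \lim_{x \to x_0} \frac{\chi_{[0,1]}(x) - \chi_{[0,1]}(x_0)}{x - x_0} \]
to exist it is necessary that for all $0 < \delta \le \delta_0$ the function
\[ x \mapsto \frac{\chi_{[0,1]}(x) - \chi_{[0,1]}(x_0)}{x - x_0} \]
is bounded on $0 < |x - x_0| < \delta$.

Suppose that
\[ \lim_{x \to x_0} \frac{\chi_{[0,1]}(x) - \chi_{[0,1]}(x_0)}{x - x_0} = a \]
for some $a \in \mathbb{R}$. Then for $\varepsilon = 1$ there exists $\tilde{\delta} > 0$ such that $0 < |x - x_0| < \tilde{\delta}$ implies
\[ \left| \frac{\chi_{[0,1]}(x) - \chi_{[0,1]}(x_0)}{x - x_0} - a \right| < 1. \]
Thus for $0 < |x - x_0| < \tilde{\delta}$ it follows that
\[ \left| \frac{\chi_{[0,1]}(x) - \chi_{[0,1]}(x_0)}{x - x_0} \right| - |a| \le \left| \frac{\chi_{[0,1]}(x) - \chi_{[0,1]}(x_0)}{x - x_0} - a \right| < 1, \]
or
\[ \left| \frac{\chi_{[0,1]}(x) - \chi_{[0,1]}(x_0)}{x - x_0} \right| < 1 + |a|. \]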
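Now, for $x_0 = 0$ we find with $0 < |x - x_0| = |x| < 1$ that
\[ \frac{\chi_{[0,1]}(x) - \chi_{[0,1]}(x_0)}{x - x_0} = \frac{\chi_{[0,1]}(x) - 1}{x} = \begin{cases} 0, & 0 < x < 1 \\ -\frac{1}{x}, & -1 < x < 0 \end{cases} \]
which is unbounded for $0 < |x| < 1$. Indeed, suppose that $\left| -\frac{1}{x} \right| = \left| \frac{1}{x} \right| \le c$ for some $c > 0$ and every $-1 < x < 0$, then it would follow that $0 < \frac{1}{c} \le |x|$ for all $x \in (-1, 0)$, which is a contradiction since $|x|$ can be chosen arbitrarily small.
For $x_0 = 1$ we have to consider
\[ \frac{\chi_{[0,1]}(x) - \chi_{[0,1]}(1)}{x - 1} = \begin{cases} 0, & 0 < x < 1 \\ -\frac{1}{x - 1}, & 1 < x \end{cases} \]
and this is again an unbounded function for $0 < |x - 1| < 1$. Thus neither $\chi_{[0,1]}'(0)$ nor $\chi_{[0,1]}'(1)$ exist.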
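7. Again it is helpful to sketch the graph of $g$, say on $[0, 3]$.
[Figure: graph of $g$ on $[0, 3]$.]
In order to have differentiability at $x_0 = 2$ we need to investigate the limit as $x \to 2$ for
\[ \frac{g(x) - g(2)}{x - 2} = \begin{cases} 0, & \text{for } x < 2 \\ \frac{x^2 - 4}{x - 2} = x + 2, & \text{for } x > 2. \end{cases} \]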
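Suppose that
\[ \lim_{x \to 2} \frac{g(x) - g(2)}{x - 2} = a \]
for some $a \in \mathbb{R}$.

Then for all $\varepsilon > 0$ there exists $\delta > 0$ such that $0 < |x - 2| < \delta$ implies $\left| \frac{g(x) - g(2)}{x - 2} - a \right| < \varepsilon$. In particular for $\varepsilon = \frac{1}{2}$ there exists some $\delta > 0$ such that $-\delta + 2 < x < 2 + \delta$, $x \ne 2$, implies
\[ \left| \frac{g(x) - g(2)}{x - 2} - a \right| < \frac{1}{2}. \]
For $-\delta + 2 < x < 2$ we have $|a| < \frac{1}{2}$, whereas for $2 < x < 2 + \delta$ we find that
\[ |x + 2 - a| < \frac{1}{2} \]
or
\[ |x + 2| - \frac{1}{2} < |a| \]
where we used the converse triangle inequality
\[ |x + 2 - a| \ge |x + 2| - |a|. \]
We may assume $\delta < 1$ to see that
\[ x + 2 - \frac{1}{2} < |a| \]
or
\[ x + \frac{3}{2} < |a|, \]
but $x > 1$ implies $|a| > \frac{3}{2}$, which is a contradiction. Thus $g$ is not differentiable at $x_0 = 2$.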
However $g$ is continuous at $x_0 = 2$. For this we need to prove that for every $\varepsilon > 0$ there exists $\delta > 0$ such that $0 < |x - 2| < \delta$ implies $|g(x) - 1| < \varepsilon$. Now for $-\delta + 2 < x < 2$ we have $|g(x) - 1| = 0$, hence every $\delta > 0$ will work. Whereas for $2 < x < \delta + 2$ we find
\[ |g(x) - 1| = |x^2 - 3 - 1| = |x^2 - 4| = |x + 2||x - 2| \]
and since we may assume without loss of generality that $\delta < 1$ we find that $|x - 2| < \delta < 1$ implies $|x| \le 3$, hence $|x + 2| \le 5$, and therefore
\[ |g(x) - 1| = |x + 2||x - 2| \le 5|x - 2|. \]
Thus for $\delta = \frac{\varepsilon}{5}$ (reduced, if necessary, so that $\delta < 1$) we find that $0 < |x - 2| < \delta$ implies
\[ |g(x) - 1| \le 5|x - 2| < 5 \cdot \delta \le 5 \cdot \frac{\varepsilon}{5} = \varepsilon \]
proving the continuity of $g$ at $x_0 = 2$.
8. In the following we make use of (6.36), (6.37), (6.38), (6.40) and (6.42).
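a)
\[ \frac{d}{dx} f(x) = \frac{d}{dx} \left( \frac{7}{5} x^2 - \frac{2}{x^3} \right) = 2 \cdot \frac{7}{5} x - 2(-3) \frac{1}{x^4} = \frac{14}{5} x + \frac{6}{x^4}. \]
b)
\begin{align*}
\frac{d}{dt} \left( \frac{t^7 + 12t^3 - 2}{t^5} \right) &= \frac{d}{dt} \left( (t^7 + 12t^3 - 2) \cdot \frac{1}{t^5} \right) \\
&\overset{(6.38)}{=} \left( \frac{d}{dt} (t^7 + 12t^3 - 2) \right) \frac{1}{t^5} + (t^7 + 12t^3 - 2) \frac{d}{dt} \left( \frac{1}{t^5} \right) \\
&= (7t^6 + 12 \cdot 3t^2) \frac{1}{t^5} + (t^7 + 12t^3 - 2) \left( -5 \cdot \frac{1}{t^6} \right) \\
&= 2t - \frac{24}{t^3} + \frac{10}{t^6}.
\end{align*}
c)
\begin{align*}
\frac{d}{ds} h(s) &= \frac{d}{ds} \left( \sum_{j=1}^{M} j s^{-j} \right) \\
&= \sum_{j=1}^{M} j \frac{d}{ds} (s^{-j}) = \sum_{j=1}^{M} (-j^2) s^{-j-1}.
\end{align*}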
9. The proof that χR+ is not differentiable at x0 = 0 follows in the same way as the
proof that χ[0,1] is not differentiable at x0 = 0, see Problem 6.
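In order to investigate the differentiability of $h : \mathbb{R} \to \mathbb{R}$, $x \mapsto x^2 f(x) = x^2 \chi_{\mathbb{R}_+}(x)$, we must consider the limit
\[ \lim_{x \to 0} \frac{h(x) - h(0)}{x - 0}. \]
Note that
\[ \left| \frac{h(x) - h(0)}{x - 0} \right| = \left| \frac{x^2 \chi_{\mathbb{R}_+}(x) - 0}{x - 0} \right| = \frac{x^2 \chi_{\mathbb{R}_+}(x)}{|x|} \le |x|. \]
Therefore, given $\varepsilon > 0$ we find for $\delta = \varepsilon$ that $0 < |x - 0| = |x| < \delta$ implies
\[ \left| \frac{h(x) - h(0)}{x - 0} - 0 \right| = \left| \frac{h(x) - h(0)}{x - 0} \right| \le |x| < \delta = \varepsilon, \]
i.e. $h$ is differentiable at $x_0 = 0$ and $h'(0) = 0$.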

Chapter 7
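1. The case $k = 1$ is known: $h_1(x) = \sqrt{x}$ and $\frac{d}{dx} h_1(x) = \frac{1}{2\sqrt{x}} = \frac{1}{2} \left( \sqrt{x} \right)^{-1}$. Now for $k = 2n$, $n \in \mathbb{N}$, being even we have $h_k(x) = h_{2n}(x) = \sqrt{x^{2n}} = x^n$ and therefore
\[ \frac{d}{dx} h_k(x) = n x^{n-1} = \frac{k}{2} x^{\frac{k}{2} - 1} = \frac{k}{2} \left( \sqrt{x} \right)^{k-2}. \]
Whereas for $k = 2n + 1$, $n \in \mathbb{N}$, being odd we find
\[ h_k(x) = h_{2n+1}(x) = \sqrt{x^{2n+1}} = x^n \sqrt{x} \]
which gives
\begin{align*}
\frac{d}{dx} h_k(x) &= \frac{d}{dx} \left( x^n \sqrt{x} \right) = n x^{n-1} \sqrt{x} + x^n \frac{1}{2\sqrt{x}} \\
&= \left( n + \frac{1}{2} \right) x^{n - \frac{1}{2}} = \frac{k}{2} x^{\frac{k}{2} - 1} = \frac{k}{2} \left( \sqrt{x} \right)^{k-2}.
\end{align*}
Thus we have for all $k \in \mathbb{N}$
\[ \frac{d}{dx} \sqrt{x^k} = \frac{k}{2} \left( \sqrt{x} \right)^{k-2}. \]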
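2. i)
\begin{align*}
\frac{d}{dx} f(x) &= -\frac{k}{2} (1 + x^2)^{-\frac{k}{2} - 1} (2x) \\
&= -kx (1 + x^2)^{\frac{-k-2}{2}} = \frac{-kx}{(1 + x^2)^{\frac{k+2}{2}}};
\end{align*}
ii)
\begin{align*}
\frac{d}{dy} g(y) &= \frac{1}{2\sqrt{1 + \frac{1}{y^4}}} \cdot \frac{d}{dy} \left( \frac{1}{y^4} \right) \\
&= \frac{1}{2\sqrt{1 + \frac{1}{y^4}}} \left( -4 \frac{1}{y^5} \right) \\
&= -\frac{2}{y^5 \sqrt{1 + \frac{1}{y^4}}} = \frac{-2}{y^3 \sqrt{y^4 + 1}};
\end{align*}
iii)
\begin{align*}
\frac{d}{dz} \sqrt{\frac{z^4}{1 + z^2}} &= \frac{d}{dz} \left( \frac{z^2}{\sqrt{1 + z^2}} \right) \\
&= \frac{2z}{\sqrt{1 + z^2}} + z^2 \frac{d}{dz} (1 + z^2)^{-\frac{1}{2}} \\
&= \frac{2z}{\sqrt{1 + z^2}} + z^2 \left( -\frac{1}{2} \cdot 2z (1 + z^2)^{-\frac{3}{2}} \right) \\
&= \frac{2z(1 + z^2) - z^3}{(1 + z^2)^{\frac{3}{2}}} = \frac{z^3 + 2z}{(1 + z^2)^{\frac{3}{2}}}.
\end{align*}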
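3. i) Using the quotient rule we find
\begin{align*}
\frac{d}{du} \left( \frac{3u^5 - 7u^9}{1 + u^6 + u^8} \right)
&= \frac{\frac{d}{du} (3u^5 - 7u^9) \, (1 + u^6 + u^8) - (3u^5 - 7u^9) \frac{d}{du} (1 + u^6 + u^8)}{(1 + u^6 + u^8)^2} \\
&= \frac{(15u^4 - 63u^8)(1 + u^6 + u^8) - (3u^5 - 7u^9)(6u^5 + 8u^7)}{(1 + u^6 + u^8)^2} \\
&= \frac{15u^4 - 63u^8 - 3u^{10} - 9u^{12} - 21u^{14} - 7u^{16}}{(1 + u^6 + u^8)^2};
\end{align*}
ii) By the quotient rule it follows that
\begin{align*}
\frac{d}{dv} \left( \frac{(1 + v^2)^{\frac{1}{2}}}{(5 + v^2)^{\frac{7}{2}}} \right)
&= \frac{\frac{d}{dv} (1 + v^2)^{\frac{1}{2}} \, (5 + v^2)^{\frac{7}{2}} - (1 + v^2)^{\frac{1}{2}} \frac{d}{dv} (5 + v^2)^{\frac{7}{2}}}{(5 + v^2)^7} \\
&= \frac{v(1 + v^2)^{-\frac{1}{2}} (5 + v^2)^{\frac{7}{2}} - (1 + v^2)^{\frac{1}{2}} \left( 7v (5 + v^2)^{\frac{5}{2}} \right)}{(5 + v^2)^7} \\
&= \frac{v(5 + v^2) - 7v(1 + v^2)}{(1 + v^2)^{\frac{1}{2}} (5 + v^2)^{\frac{9}{2}}} = \frac{-2v(1 + 3v^2)}{(1 + v^2)^{\frac{1}{2}} (5 + v^2)^{\frac{9}{2}}};
\end{align*}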
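A closed-form derivative such as the one in part ii) can be sanity-checked against a finite difference. The sketch below is not part of the original solution: the helper `central_difference`, the step size and the sample points are our own choices, and agreement up to a tolerance is of course no proof.

```python
import math


def f(v: float) -> float:
    """(1 + v^2)^(1/2) / (5 + v^2)^(7/2), the function from part ii)."""
    return (1 + v * v) ** 0.5 / (5 + v * v) ** 3.5


def f_prime(v: float) -> float:
    """The closed form -2v(1 + 3v^2) / ((1 + v^2)^(1/2) (5 + v^2)^(9/2))."""
    return -2 * v * (1 + 3 * v * v) / ((1 + v * v) ** 0.5 * (5 + v * v) ** 4.5)


def central_difference(g, x: float, h: float = 1e-5) -> float:
    """Symmetric difference quotient (g(x + h) - g(x - h)) / (2h)."""
    return (g(x + h) - g(x - h)) / (2 * h)


for v in (-2.0, -0.5, 0.0, 1.0, 3.0):
    approx = central_difference(f, v)
    assert math.isclose(approx, f_prime(v), rel_tol=1e-6, abs_tol=1e-12)
```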
iii) Again, our main tool is the quotient rule:
\begin{align*}
\frac{d}{dz} h(z) &= \frac{d}{dz} \left( \frac{\sqrt{z^5} - 2z^4}{12 + z^2(1 + z^3)} \right) \\
&= \frac{\frac{d}{dz} \left( \sqrt{z^5} - 2z^4 \right) (12 + z^2(1 + z^3)) - \left( \sqrt{z^5} - 2z^4 \right) \frac{d}{dz} (12 + z^2(1 + z^3))}{(12 + z^2(1 + z^3))^2} \\
&= \frac{\left( \frac{5}{2} \sqrt{z^3} - 8z^3 \right) (12 + z^2 + z^5) - \left( \sqrt{z^5} - 2z^4 \right) (2z + 5z^4)}{(12 + z^2 + z^5)^2} \\
&= \frac{(5z^5 + 5z^2 + 60) \sqrt{z^3} - 16z^8 - 16z^5 - 192z^3 + 2 \left( 10z^8 + 4z^5 + \sqrt{z^3} (-2z^2 - 5z^5) \right)}{2(12 + z^2 + z^5)^2} \\
&= \frac{\sqrt{z^3} (5z^5 + 5z^2 + 60 - 4z^2 - 10z^5) - 16z^8 - 16z^5 - 192z^3 + 20z^8 + 8z^5}{2(12 + z^2 + z^5)^2} \\
&= \frac{(-5z^5 + z^2 + 60) \sqrt{z^3} + 4z^8 - 8z^5 - 192z^3}{2(12 + z^2 + z^5)^2}.
\end{align*}
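4. For $f^{-1}$ we have
\[ \frac{d}{dy} (f^{-1})(y) = \frac{1}{f'(f^{-1}(y))}. \]
Since $f'(x) = k x^{k-1}$ and $f^{-1}(y) = y^{\frac{1}{k}}$, we find
\[ \frac{d}{dy} (f^{-1})(y) = \frac{1}{k (f^{-1}(y))^{k-1}} = \frac{1}{k y^{\frac{k-1}{k}}} = \frac{1}{k} y^{\frac{1}{k} - 1}. \]
Note that for $k = 2$ we know that the inverse of $x \mapsto x^2$, $x > 0$, is $y \mapsto \sqrt{y} = y^{\frac{1}{2}}$ and $\frac{d}{dy} y^{\frac{1}{2}} = \frac{1}{2} y^{-\frac{1}{2}} = \frac{1}{2\sqrt{y}}$, as we already know.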
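5. In all three sub-problems we use the result of Problem 4, namely that
\[ \frac{d}{dx} x^{\frac{1}{k}} = \frac{d}{dx} \sqrt[k]{x} = \frac{1}{k} x^{\frac{1}{k} - 1} = \frac{1}{k} x^{\frac{1-k}{k}} = \frac{1}{k} \frac{1}{\sqrt[k]{x^{k-1}}}. \]
i)
\begin{align*}
\frac{d}{ds} f(s) &= \frac{d}{ds} (1 + s^2)^{\frac{1}{k}} \\
&= 2s \cdot \frac{1}{k} (1 + s^2)^{\frac{1}{k} - 1} = \frac{2s}{k} (1 + s^2)^{\frac{1}{k} - 1} \\
&= \frac{2s}{k (1 + s^2)^{1 - \frac{1}{k}}}.
\end{align*}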
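ii)
\begin{align*}
\frac{d}{dt} g(t) &= \frac{d}{dt} \left( \frac{\sqrt{1 + t^4}}{\sqrt[5]{1 + t^6 + t^8}} \right) \\
&= \frac{\frac{d}{dt} \sqrt{1 + t^4} \, \left( \sqrt[5]{1 + t^6 + t^8} \right) - \sqrt{1 + t^4} \, \frac{d}{dt} \left( \sqrt[5]{1 + t^6 + t^8} \right)}{\left( \sqrt[5]{1 + t^6 + t^8} \right)^2} \\
&= \frac{\frac{2t^3 (1 + t^6 + t^8)^{\frac{1}{5}}}{(1 + t^4)^{\frac{1}{2}}} - \frac{(1 + t^4)^{\frac{1}{2}} (6t^5 + 8t^7)}{5 (1 + t^6 + t^8)^{\frac{4}{5}}}}{(1 + t^6 + t^8)^{\frac{2}{5}}} \\
&= \frac{10t^3 (1 + t^6 + t^8) - (1 + t^4)(6t^5 + 8t^7)}{5 (1 + t^4)^{\frac{1}{2}} (1 + t^6 + t^8)^{\frac{6}{5}}} \\
&= \frac{2t^{11} + 4t^9 - 8t^7 - 6t^5 + 10t^3}{5 (1 + t^4)^{\frac{1}{2}} (1 + t^6 + t^8)^{\frac{6}{5}}}.
\end{align*}
iii)
\begin{align*}
\frac{d}{du} \left( \frac{u^{\frac{1}{7}}}{\sqrt{\frac{1 + u^2}{1 + u^4}}} \right) &= \frac{d}{du} \left( \frac{u^{\frac{1}{7}} (1 + u^4)^{\frac{1}{2}}}{(1 + u^2)^{\frac{1}{2}}} \right) \\
&= \frac{\frac{d}{du} \left( u^{\frac{1}{7}} (1 + u^4)^{\frac{1}{2}} \right) (1 + u^2)^{\frac{1}{2}} - u^{\frac{1}{7}} (1 + u^4)^{\frac{1}{2}} \frac{d}{du} (1 + u^2)^{\frac{1}{2}}}{1 + u^2} \\
&= \frac{\left( \frac{1}{7} u^{-\frac{6}{7}} (1 + u^4)^{\frac{1}{2}} + u^{\frac{1}{7}} \, 2u^3 (1 + u^4)^{-\frac{1}{2}} \right) (1 + u^2)^{\frac{1}{2}} - u^{\frac{1}{7}} (1 + u^4)^{\frac{1}{2}} \, u (1 + u^2)^{-\frac{1}{2}}}{1 + u^2} \\
&= \frac{(1 + u^4 + 14u^4)(1 + u^2) - 7u^2 (1 + u^4)}{7 u^{\frac{6}{7}} (1 + u^4)^{\frac{1}{2}} (1 + u^2)^{\frac{3}{2}}} \\
&= \frac{8u^6 + 15u^4 - 6u^2 + 1}{7 u^{\frac{6}{7}} (1 + u^4)^{\frac{1}{2}} (1 + u^2)^{\frac{3}{2}}}.
\end{align*}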
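6. i) By the chain rule we find
\begin{align*}
\frac{d}{dx} x^{\frac{l}{k}} &= \frac{d}{dx} \left( x^{\frac{1}{k}} \right)^l \\
&= l \left( x^{\frac{1}{k}} \right)^{l-1} \cdot \frac{1}{k} x^{\frac{1}{k} - 1} \\
&= \frac{l}{k} x^{\frac{l-1}{k} + \frac{1}{k} - 1} \\
&= \frac{l}{k} x^{\frac{l}{k} - 1}.
\end{align*}
ii)
\begin{align*}
\frac{d}{ds} g(s) &= \frac{d}{ds} \left( \frac{(1 + s^2)^{-\frac{3}{2}}}{(1 + s^4)^5} \right) \\
&= \frac{d}{ds} \left( (1 + s^2)^{-\frac{3}{2}} (1 + s^4)^{-5} \right) \\
&= -\frac{3}{2} \cdot 2s (1 + s^2)^{-\frac{5}{2}} (1 + s^4)^{-5} + (1 + s^2)^{-\frac{3}{2}} \left( -5 \cdot 4s^3 (1 + s^4)^{-6} \right) \\
&= (1 + s^2)^{-\frac{5}{2}} (1 + s^4)^{-6} \left( -3s(1 + s^4) + (1 + s^2)(-20s^3) \right) \\
&= \frac{-(23s^5 + 20s^3 + 3s)}{(1 + s^2)^{\frac{5}{2}} (1 + s^4)^6}.
\end{align*}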
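7. A straightforward calculation using the chain rule and then the quotient rule gives
\begin{align*}
\frac{d}{dx} \sqrt{\frac{p(x)}{q(x)} - 2} &= \frac{1}{2\sqrt{\frac{p(x)}{q(x)} - 2}} \, \frac{d}{dx} \left( \frac{p(x)}{q(x)} - 2 \right) \\
&= \frac{p'(x) q(x) - p(x) q'(x)}{2 q^2(x) \sqrt{\frac{p(x)}{q(x)} - 2}} \\
&= \frac{p'(x) q(x) - p(x) q'(x)}{2 q(x)^{\frac{3}{2}} \sqrt{p(x) - 2q(x)}}.
\end{align*}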
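8. Again, we just apply the chain rule to find
\begin{align*}
\frac{dg(t)}{dt} &= \frac{d}{dt} \sqrt{(t^2 - 1)(2t + 3)^{\frac{1}{2}}} \\
&= \frac{1}{2\sqrt{(t^2 - 1)(2t + 3)^{\frac{1}{2}}}} \, \frac{d}{dt} \left( (t^2 - 1)(2t + 3)^{\frac{1}{2}} \right) \\
&= \frac{1}{2\sqrt{(t^2 - 1)(2t + 3)^{\frac{1}{2}}}} \left( 2t (2t + 3)^{\frac{1}{2}} + (t^2 - 1) \, 2 \cdot \frac{1}{2} (2t + 3)^{-\frac{1}{2}} \right) \\
&= \frac{1}{2\sqrt{(t^2 - 1)(2t + 3)^{\frac{1}{2}}}} \cdot \frac{2t(2t + 3) + t^2 - 1}{(2t + 3)^{\frac{1}{2}}} \\
&= \frac{2t(2t + 3) + t^2 - 1}{2\sqrt{(t^2 - 1)(2t + 3)^{\frac{3}{2}}}} = \frac{5t^2 + 6t - 1}{2\sqrt{(t^2 - 1)(2t + 3)^{\frac{3}{2}}}}.
\end{align*}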
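9. Since $(h \circ f)^{-1} = f^{-1} \circ h^{-1}$ we have to apply the chain rule to $f^{-1} \circ h^{-1}$, thus
\[ \frac{d}{dz} (f^{-1} \circ h^{-1})(z) = \left( \frac{d f^{-1}}{dy} \right) (h^{-1}(z)) \, \frac{d h^{-1}}{dz} (z) \]
with $f(x) = y$, $h(y) = z$, i.e. $z = h(f(x))$. Now using (7.7) we find
\[ \frac{d h^{-1}}{dz} (z) = \frac{1}{h'(y)} = \frac{1}{h'(h^{-1}(z))} \]
and further
\[ \left( \frac{d f^{-1}}{dy} \right) (h^{-1}(z)) = \frac{1}{f'(f^{-1}(h^{-1}(z)))} = \frac{1}{f'((h \circ f)^{-1}(z))}, \]
which gives
\[ \frac{d}{dz} \left( (h \circ f)^{-1} \right)(z) = \frac{1}{f'((h \circ f)^{-1}(z))} \cdot \frac{1}{h'(h^{-1}(z))}. \]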
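10. First note that $\frac{d}{dx} p(x) = \sum_{k=1}^{m} k a_k x^{k-1}$. Now using the chain rule we find
i)
\begin{align*}
\frac{d}{dx} p(u(x)) &= p'(u(x)) u'(x) \\
&= u'(x) \sum_{k=1}^{m} k a_k (u(x))^{k-1};
\end{align*}
ii)
\begin{align*}
\frac{d}{dx} u(p(x)) &= u'(p(x)) p'(x) \\
&= u'\left( \sum_{k=0}^{m} a_k x^k \right) \sum_{k=1}^{m} k a_k x^{k-1};
\end{align*}
iii)
\begin{align*}
\frac{d}{dx} \frac{1}{u(p(x))} &= \frac{-1}{u(p(x))^2} \, \frac{d}{dx} u(p(x)) \\
&= \frac{-u'(p(x))}{u(p(x))^2} \, p'(x) = -\frac{u'\left( \sum_{k=0}^{m} a_k x^k \right)}{u\left( \sum_{k=0}^{m} a_k x^k \right)^2} \cdot \sum_{k=1}^{m} k a_k x^{k-1}.
\end{align*}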
Chapter 8
1. a) This follows straightforward from the definition of the composition of map-
pings and the boundedness of g. For x ∈ D1 set y := f (x) and observe
|(g ◦ f )(x)| = |g(f (x))| = |g(y)| ≤ M.
Thus g ◦ f is bounded with bound M .

b) Since $|f(x)| = |(x - 1)^2| = (x - 1)^2 \le x^2 + 2|x| + 1$ we have $|f(x)| \le 9$ for $x \in (1, 2)$. We can in fact improve the bound: for $x \in (1, 2)$ it follows that $x - 1 \in (0, 1)$ and therefore $(x - 1)^2 \le 1$, i.e. a sharper bound for $f$ on $(1, 2)$ is $1$, i.e. $|f(x)| \le 1$ for $x \in (1, 2)$.
The function $g \circ f$ is given by
\[ (g \circ f)(x) = \frac{1}{(x - 1)^2}, \quad x \in (1, 2). \]
We claim that this function is unbounded. For this suppose that there exists $M \ge 0$ such that
\[ (\ast) \qquad \left| \frac{1}{(x - 1)^2} \right| = \frac{1}{(x - 1)^2} \le M \quad \text{for all } x \in (1, 2). \]
Now take $x_n = 1 + \frac{1}{n}$, $n \in \mathbb{N} \setminus \{1\}$. It follows that $x_n \in (1, 2)$ and
\[ \frac{1}{(x_n - 1)^2} = \frac{1}{\left( 1 + \frac{1}{n} - 1 \right)^2} = \frac{1}{\frac{1}{n^2}} = n^2 \]
and $(\ast)$ implies that
\[ n^2 \le M \quad \text{for all } n \ge 2 \]
which of course is a contradiction. Thus $g \circ f$ is unbounded on $(1, 2)$.
c) We may choose $a = 0$ and $b = 1$ and consider the function $f : (0, 1) \to \mathbb{R}$, $x \mapsto f(x) = \frac{1}{x}$. This function is unbounded on $(0, 1)$. Indeed as in part b) suppose that for some $M \ge 0$ we have $\left| \frac{1}{x} \right| = \frac{1}{x} \le M$ for all $x \in (0, 1)$. Then for $x_n = \frac{1}{n}$, $n \in \mathbb{N} \setminus \{1\}$, we would deduce $n \le M$ for $n \ge 2$ which is a contradiction. However for $x \in [a_1, b_1] \subset (0, 1)$ we find
\[ \frac{1}{x} \le \frac{1}{a_1}, \]
thus $f|_{[a_1, b_1]}$ is bounded by $\frac{1}{a_1}$.
2. Note that we need to find a bound, not necessarily the best bound, i.e. the smallest bound for $f$. Thus we may use rather crude estimates as long as we achieve our goal. Therefore let $p$ be a polynomial of degree $k \in \mathbb{N}_0$. It follows with $p(x) = \sum_{j=0}^{k} a_j x^j$ and
\[ c_0 := \max \{ |a_j| \mid j \in \{0, 1, \dots, k\} \} \]
that
\[ (\ast) \qquad |p(x)| = \left| \sum_{j=0}^{k} a_j x^j \right| \le \sum_{j=0}^{k} |a_j| |x|^j \le c_0 \sum_{j=0}^{k} |x|^j. \]
Now we claim that for all $x \in \mathbb{R}$
\[ |x| \le (1 + x^2)^{\frac{1}{2}} \]
which follows immediately from $x^2 \le 1 + x^2$. Thus by $(\ast)$ we get
\[ |p(x)| \le c_0 \sum_{j=0}^{k} |x|^j \le c_0 \sum_{j=0}^{k} (1 + x^2)^{\frac{j}{2}} \le (k + 1) c_0 (1 + x^2)^{\frac{k}{2}}. \]
Consequently we find
\[ \frac{|p(x)|}{(1 + x^2)^n} \le (k + 1) c_0 \frac{\left( 1 + |x|^2 \right)^{\frac{k}{2}}}{(1 + |x|^2)^n} = (k + 1) c_0 \left( 1 + |x|^2 \right)^{\frac{k}{2} - n}. \]
For $\frac{k}{2} - n \le 0$, i.e. $n \ge \frac{k}{2}$, the right hand side is bounded since for any $l \ge 0$ the function $x \mapsto (1 + x^2)^{-l}$ is bounded. The latter statement follows from $(1 + x^2)^{-1} \le 1$, which is equivalent to $1 \le 1 + x^2$.
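The crude estimate $|p(x)| \le (k+1) c_0 (1 + x^2)^{k/2}$ is easy to test for a concrete polynomial. The following sketch is an illustration only; the coefficient list and the helper names `poly_eval` and `crude_bound` are our own choices.

```python
def poly_eval(coeffs, x: float) -> float:
    """Evaluate p(x) = sum of a_j x^j for coeffs = [a_0, ..., a_k]."""
    return sum(a * x ** j for j, a in enumerate(coeffs))


def crude_bound(coeffs, x: float) -> float:
    """The bound (k + 1) * c_0 * (1 + x^2)^(k/2) with c_0 = max |a_j|."""
    k = len(coeffs) - 1
    c0 = max(abs(a) for a in coeffs)
    return (k + 1) * c0 * (1 + x * x) ** (k / 2)


coeffs = [3.0, -2.0, 0.0, 5.0]  # p(x) = 3 - 2x + 5x^3, an arbitrary example

for i in range(-100, 101):
    x = i / 10
    assert abs(poly_eval(coeffs, x)) <= crude_bound(coeffs, x)
```

Dividing both sides by $(1 + x^2)^n$ with $n \ge \frac{k}{2}$ then gives a bound that does not grow with $|x|$.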
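3. a) First note that
\begin{align*}
\frac{d}{dx} \left( \frac{x^3 + 2x - 5}{x - 1} \right) &= \frac{(3x^2 + 2)(x - 1) - (x^3 + 2x - 5) \cdot 1}{(x - 1)^2} \\
&= \frac{2x^3 - 3x^2 + 3}{(x - 1)^2}
\end{align*}
and now it follows that
\begin{align*}
\frac{d^2}{dx^2} \left( \frac{x^3 + 2x - 5}{x - 1} \right) &= \frac{d}{dx} \left( \frac{2x^3 - 3x^2 + 3}{(x - 1)^2} \right) \\
&= \frac{(6x^2 - 6x)(x - 1)^2 - (2x^3 - 3x^2 + 3) \, 2(x - 1)}{(x - 1)^4} \\
&= \frac{2x^4 - 8x^3 + 12x^2 - 12x + 6}{(x - 1)^4} = \frac{2x^3 - 6x^2 + 6x - 6}{(x - 1)^3}.
\end{align*}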
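b) It might be easier to write $(t^4 + 1)^{\frac{1}{2}}$ instead of $\sqrt{t^4 + 1}$. Now we find
\begin{align*}
\frac{d}{dt} \sqrt{t^4 + 1} &= \frac{d}{dt} (t^4 + 1)^{\frac{1}{2}} = 4t^3 \cdot \frac{1}{2} (t^4 + 1)^{-\frac{1}{2}} \\
&= 2t^3 (t^4 + 1)^{-\frac{1}{2}},
\end{align*}
\begin{align*}
\frac{d^2}{dt^2} \sqrt{t^4 + 1} &= \frac{d}{dt} \left( 2t^3 (t^4 + 1)^{-\frac{1}{2}} \right) \\
&= 6t^2 (t^4 + 1)^{-\frac{1}{2}} + 2t^3 \left( -\frac{1}{2} \cdot 4t^3 (t^4 + 1)^{-\frac{3}{2}} \right) \\
&= \frac{6t^2 (t^4 + 1) - 4t^6}{(t^4 + 1)^{\frac{3}{2}}} = (2t^6 + 6t^2)(t^4 + 1)^{-\frac{3}{2}}
\end{align*}
and therefore we get
\begin{align*}
\frac{d^3}{dt^3} \sqrt{t^4 + 1} &= \frac{d}{dt} \left( (2t^6 + 6t^2)(t^4 + 1)^{-\frac{3}{2}} \right) \\
&= (12t^5 + 12t)(t^4 + 1)^{-\frac{3}{2}} + (2t^6 + 6t^2) \left( -\frac{3}{2} \cdot 4t^3 \cdot (t^4 + 1)^{-\frac{5}{2}} \right) \\
&= \frac{(12t^5 + 12t)(t^4 + 1)}{(t^4 + 1)^{\frac{5}{2}}} + \frac{-12t^9 - 36t^5}{(t^4 + 1)^{\frac{5}{2}}} \\
&= \frac{-12t^5 + 12t}{(t^4 + 1)^{\frac{5}{2}}}.
\end{align*}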
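Each step of such a chain of derivatives can be checked against the previous one numerically: the difference quotient of the $n$-th derivative should agree with the closed form of the $(n+1)$-th. The sketch below does this for $\sqrt{t^4 + 1}$; the helper `central_difference`, the step size and the sample points are our own assumptions.

```python
import math


def first_derivative(t: float) -> float:
    """2 t^3 (t^4 + 1)^(-1/2)."""
    return 2 * t ** 3 * (t ** 4 + 1) ** -0.5


def second_derivative(t: float) -> float:
    """(2 t^6 + 6 t^2) (t^4 + 1)^(-3/2)."""
    return (2 * t ** 6 + 6 * t ** 2) * (t ** 4 + 1) ** -1.5


def third_derivative(t: float) -> float:
    """(-12 t^5 + 12 t) (t^4 + 1)^(-5/2)."""
    return (-12 * t ** 5 + 12 * t) * (t ** 4 + 1) ** -2.5


def central_difference(g, x: float, h: float = 1e-5) -> float:
    """Symmetric difference quotient (g(x + h) - g(x - h)) / (2h)."""
    return (g(x + h) - g(x - h)) / (2 * h)


for t in (-1.5, -0.5, 0.5, 2.0):
    assert math.isclose(central_difference(first_derivative, t), second_derivative(t), rel_tol=1e-6)
    assert math.isclose(central_difference(second_derivative, t), third_derivative(t), rel_tol=1e-6)
```

Note that the third derivative vanishes at $t = \pm 1$ and $t = 0$, so the sample points above were chosen away from these zeros where a relative tolerance is meaningful.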
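c) We first want to investigate the differentiability of $s \mapsto |s|^5$. For $s < 0$ this is just the function $s \mapsto -s^5$ with derivative $-5s^4$, whereas for $s > 0$ it is the function $s \mapsto s^5$ with derivative $5s^4$. Now at $s_0 = 0$ we find for $s \neq 0$
\[ \frac{|s|^5 - 0}{s - 0} = \frac{|s|^5}{s} = \begin{cases} s^4, & s > 0 \\ -|s|^4, & s < 0, \end{cases} \]
which tends to $0$ as $s \to 0$, implying that $s \mapsto |s|^5$ is differentiable and
\[ \frac{d}{ds}|s|^5 = \begin{cases} 5s^4, & s > 0 \\ 0, & s = 0 \\ -5s^4, & s < 0. \end{cases} \]
Now it follows with $g(s) = |s|^5$ that
\[ \frac{d}{ds}\left(\frac{|s|^5}{s^2 + 4}\right) = \frac{g'(s)(s^2 + 4) - g(s)\,2s}{(s^2 + 4)^2} = \frac{g'(s)s^2 + 4g'(s) - 2s|s|^5}{(s^2 + 4)^2} \]
\[ = \begin{cases} \frac{5s^4 \cdot s^2 + 4 \cdot 5s^4 - 2s \cdot s^5}{(s^2+4)^2}, & s > 0 \\ 0, & s = 0 \\ \frac{-5s^4 \cdot s^2 - 4 \cdot 5s^4 + 2s \cdot s^5}{(s^2+4)^2}, & s < 0 \end{cases} \;=\; \begin{cases} \frac{3s^6 + 20s^4}{(s^2+4)^2}, & s > 0 \\ 0, & s = 0 \\ \frac{-3s^6 - 20s^4}{(s^2+4)^2}, & s < 0. \end{cases} \]
In order to find the second derivative of $h(s) := \frac{|s|^5}{s^2+4}$ at $s = 0$ we need to find the limit
\[ \lim_{s \to 0} \frac{h'(s) - h'(0)}{s - 0}. \]
Now for $s > 0$ we have
\[ \frac{h'(s) - h'(0)}{s - 0} = \frac{3s^5 + 20s^3}{(s^2 + 4)^2} \]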
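and for $s < 0$ we have
\[ \frac{h'(s) - h'(0)}{s - 0} = \frac{-3s^5 - 20s^3}{(s^2 + 4)^2}, \]
and so for $s \in \mathbb{R} \setminus \{0\}$ we have
\[ \left|\frac{h'(s) - h'(0)}{s - 0}\right| \le 3|s|^5 + 20|s|^3, \]
implying that
\[ \lim_{s \to 0} \frac{h'(s) - h'(0)}{s - 0} = 0, \]
thus $s \mapsto \frac{|s|^5}{s^2+4}$ has a second derivative at $s = 0$ and this second derivative at $0$ is $0$.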
4. With $f(x) = u^2(x) + 1$ and $g(x) = (v^2(x) + 1)^{-1}$ we find
\[ \frac{d^2}{dx^2}\left((u^2(x) + 1)(v^2(x) + 1)^{-1}\right) = \frac{d^2}{dx^2}(f(x)g(x)) = f''(x)g(x) + 2f'(x)g'(x) + f(x)g''(x). \]
Next we note
\[ f'(x) = 2u'(x)u(x) \quad \text{and} \quad f''(x) = 2u''(x)u(x) + 2(u'(x))^2 \]
as well as
\[ g'(x) = -\frac{2v'(x)v(x)}{(v^2(x) + 1)^2} = -2v'(x)v(x)(v^2(x) + 1)^{-2} \]
and
\[ g''(x) = (-2v'(x)v(x))'\,(v^2(x) + 1)^{-2} - 2v'(x)v(x)\left((v^2(x) + 1)^{-2}\right)' \]
\[ = (-2v''(x)v(x) - 2v'(x)^2)(v^2(x) + 1)^{-2} - 2v'(x)v(x)\,(2v'(x)v(x))\,(-2(v^2(x) + 1)^{-3}) \]
\[ = \frac{(-2v''(x)v(x) - 2v'(x)^2)(v^2(x) + 1) + 8v'(x)^2 v^2(x)}{(v^2(x) + 1)^3} \]
\[ = \frac{-2v''(x)v^3(x) - 2v''(x)v(x) + 6v'(x)^2 v(x)^2 - 2v'(x)^2}{(v^2(x) + 1)^3}. \]
Therefore we find
\[ \frac{d^2}{dx^2}\left((u^2(x) + 1)(v^2(x) + 1)^{-1}\right) = \frac{2u''(x)u(x) + 2u'(x)^2}{v^2(x) + 1} + 2\,\frac{2u'(x)u(x)(-2v'(x)v(x))}{(v^2(x) + 1)^2} \]
\[ + \frac{(u^2(x) + 1)(-2v''(x)v^3(x) - 2v''(x)v(x) + 6v'(x)^2 v(x)^2 - 2v'(x)^2)}{(v^2(x) + 1)^3} = \frac{Q(u, v)(x)}{(v^2(x) + 1)^3}, \]
where
\[ Q(u, v)(x) = (2u''(x)u(x) + 2(u'(x))^2)(v^2(x) + 1)^2 - 8u'(x)v'(x)u(x)v(x)(v^2(x) + 1) \]
\[ + (u^2(x) + 1)(-2v''(x)v^3(x) - 2v''(x)v(x) + 6v'(x)^2 v(x)^2 - 2v'(x)^2). \]
As a check, for $v(x) = x$ the formula for $g''$ reduces to the familiar $\frac{d^2}{dx^2}\frac{1}{1+x^2} = \frac{6x^2 - 2}{(1+x^2)^3}$.
5. By the chain rule we find
\[ \frac{d}{dx}(g \circ f)(x) = g'(f(x))f'(x) \]
and therefore
\[ \frac{d^2}{dx^2}(g \circ f)(x) = \frac{d}{dx}\left(g'(f(x))f'(x)\right) = \frac{d}{dx}\left(g'(f(x))\right) f'(x) + g'(f(x))f''(x) = g''(f(x))f'(x)^2 + g'(f(x))f''(x). \]
For $h(t) = (1 + f^2(t))^{-\frac{1}{2}}$ we find with $g(s) = (1 + s^2)^{-\frac{1}{2}}$ that $h(t) = (g \circ f)(t)$ and therefore we may apply the above formula. For this note that
\[ g'(s) = -\frac{s}{(1 + s^2)^{\frac{3}{2}}} = -s(1 + s^2)^{-\frac{3}{2}} \]
and
\[ g''(s) = \frac{2s^2 - 1}{(1 + s^2)^{\frac{5}{2}}} = (2s^2 - 1)(1 + s^2)^{-\frac{5}{2}}. \]
Now we get
\[ \frac{d^2}{dt^2}(1 + f^2(t))^{-\frac{1}{2}} = (2f^2(t) - 1)(1 + f^2(t))^{-\frac{5}{2}} \cdot f'(t)^2 - f''(t)\,f(t)\,(1 + f(t)^2)^{-\frac{3}{2}} \]
\[ = \frac{(2f^2(t) - 1)f'(t)^2 - f(t)f''(t)\left(1 + f(t)^2\right)}{(1 + f(t)^2)^{\frac{5}{2}}}. \]
6. First we observe
1 1
= 2 2
(u2 (x) + 2)2 2
√x +2
1+x2
1
= 2
x4 2(1+x2 )
1+x2 + 1+x2
(1 + x2 )2 x4 + 2x2 + 1
= =
x4 + 2x2 + 2 x4 + 2x2 + 2
1
=1− 4 .
x + 2x2 + 2
This implies immediately that
\[ \frac{d^2}{dx^2}\frac{1}{(u^2(x) + 2)^2} = -\frac{d^2}{dx^2}(x^4 + 2x^2 + 2)^{-1}. \]
With $g(y) = \frac{1}{y}$ and $f(x) = x^4 + 2x^2 + 2$ we find $g'(y) = -\frac{1}{y^2}$, $g''(y) = \frac{2}{y^3}$, $f'(x) = 4x^3 + 4x$, and $f''(x) = 12x^2 + 4$. Thus it follows
\[ \frac{d^2}{dx^2}\frac{1}{(u^2(x) + 2)^2} = -\left(g''(f(x))f'(x)^2 + g'(f(x))f''(x)\right) \]
\[ = -\frac{2}{(x^4 + 2x^2 + 2)^3} \cdot (4x^3 + 4x)^2 + \frac{1}{(x^4 + 2x^2 + 2)^2}(12x^2 + 4) \]
\[ = \frac{-2(4x^3 + 4x)^2 + (12x^2 + 4)(x^4 + 2x^2 + 2)}{(x^4 + 2x^2 + 2)^3} = \frac{-20x^6 - 36x^4 + 8}{(x^4 + 2x^2 + 2)^3}. \]
7. We prove
\[ \frac{d^n}{dx^n}\left(\frac{1}{1 + x^2}\right) = \frac{p_n(x)}{(1 + x^2)^{n+1}} \]
by induction. For $n = 0$ we have $p_0(x) = 1$. Now we calculate
\[ \frac{d^{n+1}}{dx^{n+1}}\left(\frac{1}{1 + x^2}\right) = \frac{d}{dx}\left(\frac{p_n(x)}{(1 + x^2)^{n+1}}\right) \]
where we used the induction hypothesis. It follows that
\[ \frac{d}{dx}\left(\frac{p_n(x)}{(1 + x^2)^{n+1}}\right) = \frac{p_n'(x)(1 + x^2)^{n+1} - p_n(x)\left(2(n+1)x(1 + x^2)^n\right)}{(1 + x^2)^{2n+2}} = \frac{p_n'(x)(1 + x^2) - p_n(x)(2(n+1)x)}{(1 + x^2)^{n+2}} = \frac{p_{n+1}(x)}{(1 + x^2)^{n+2}} \]
with
\[ p_{n+1}(x) = p_n'(x)(1 + x^2) - 2(n+1)x\,p_n(x). \]
The degree of $p_n(x)$ is at most $n$ and that of $p_n'(x)$ is at most $n - 1$, therefore the degree of $p_{n+1}(x)$ is at most $n + 1$.
Now the estimate follows using Problem 2:
\[ \left|\frac{d^n}{dx^n}\left(\frac{1}{1 + x^2}\right)\right| = \frac{|p_n(x)|}{(1 + x^2)^{n+1}} \le \frac{c_n (1 + |x|^2)^{\frac{n}{2}}}{(1 + x^2)^{n+1}} = \frac{c_n}{(1 + x^2)^{\frac{n+2}{2}}}. \]
8. a) By the definition of the absolute value, we know that $|x|^3 \ge 0$ for all $x \in \mathbb{R}$ and $|x|^3 = 0$ if and only if $x = 0$, implying that $f(x) = |x|^3$ has a local minimum at $x_0 = 0$.
(Note that we did not use differential calculus as it is not necessary or helpful here.)
b) We first find $g'(s)$:
\[ g(s) = (s^2 - 2s)(2 + 3s^2)^{-1}, \]
therefore
\[ g'(s) = (2s - 2)(2 + 3s^2)^{-1} + (s^2 - 2s)\left(-(2 + 3s^2)^{-2}\,6s\right) = \frac{(2s - 2)(2 + 3s^2) - 6s(s^2 - 2s)}{(2 + 3s^2)^2} = \frac{6s^2 + 4s - 4}{(2 + 3s^2)^2}. \]
Therefore the condition $g'(s) = 0$ is equivalent to
\[ 2(3s^2 + 2s - 2) = 0, \]
i.e. we have to solve the quadratic equation
\[ 3s^2 + 2s - 2 = 0 \]
which gives
\[ s_{1,2} = -\frac{1}{3} \pm \frac{1}{3}\sqrt{7}. \]
In order to decide whether we have a local extreme value at $s_1$ or $s_2$, and if so of what type, we make use of $g''(s)$:
\[ g''(s) = \frac{d}{ds}\left((6s^2 + 4s - 4)(2 + 3s^2)^{-2}\right) = (12s + 4)(2 + 3s^2)^{-2} + (6s^2 + 4s - 4)\left(-2(2 + 3s^2)^{-3}(6s)\right) \]
\[ = \frac{(12s + 4)(2 + 3s^2) - 12s(6s^2 + 4s - 4)}{(2 + 3s^2)^3} = \frac{-36s^3 - 36s^2 + 72s + 8}{(2 + 3s^2)^3}. \]
Now we need to determine whether $g''(s_1)$ (respectively $g''(s_2)$) is strictly positive or strictly negative. We do not need to calculate the exact value of $g''(s_1)$ (respectively $g''(s_2)$): since $(2 + 3s^2)^3 > 0$, we only need to look at the sign of the polynomial $-36s^3 - 36s^2 + 72s + 8$ at $s_1$ and $s_2$. Note that
\[ s_1 = -\frac{1}{3} + \frac{1}{3}\sqrt{7}, \quad s_1^2 = \frac{8}{9} - \frac{2}{9}\sqrt{7}, \quad s_1^3 = -\frac{22}{27} + \frac{10}{27}\sqrt{7} \]
while
\[ s_2 = -\frac{1}{3} - \frac{1}{3}\sqrt{7}, \quad s_2^2 = \frac{8}{9} + \frac{2}{9}\sqrt{7}, \quad s_2^3 = -\frac{22}{27} - \frac{10}{27}\sqrt{7}. \]
Therefore we find
\[ -36(s_1^3 + s_1^2) + 72s_1 + 8 = -36\left(\frac{2}{27} + \frac{4}{27}\sqrt{7}\right) - 24 + 24\sqrt{7} + 8 = -\frac{8}{3} - \frac{16}{3}\sqrt{7} + 24\sqrt{7} - 16 = \frac{56}{3}\left(\sqrt{7} - 1\right) > 0, \]
i.e. $g''(s_1) > 0$, implying that $g$ has a local minimum at $s_1 = -\frac{1}{3} + \frac{1}{3}\sqrt{7}$. For $s_2$ we find
\[ -36(s_2^3 + s_2^2) + 72s_2 + 8 = -36\left(\frac{2}{27} - \frac{4}{27}\sqrt{7}\right) - 24 - 24\sqrt{7} + 8 = -\frac{8}{3} + \frac{16}{3}\sqrt{7} - 24\sqrt{7} - 16 = -\frac{56}{3}\left(1 + \sqrt{7}\right) < 0, \]
i.e. $g''(s_2) < 0$, implying that $g$ has a local maximum at $s_2 = -\frac{1}{3} - \frac{1}{3}\sqrt{7}$.
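The type of each critical point of $g(s) = \frac{s^2 - 2s}{2 + 3s^2}$ can also be checked without any derivative, simply by comparing $g$ at the critical point with its values slightly to the left and right. The sketch below does this; the offset `h` and the helper name `classify` are illustrative assumptions.

```python
import math

def g(s):
    return (s**2 - 2 * s) / (2 + 3 * s**2)

# Roots of 3s^2 + 2s - 2 = 0
s1 = (-1 + math.sqrt(7)) / 3
s2 = (-1 - math.sqrt(7)) / 3

def classify(func, s, h=1e-3):
    """Compare func(s) with its neighbours at distance h on both sides."""
    left, mid, right = func(s - h), func(s), func(s + h)
    if mid < left and mid < right:
        return "local minimum"
    if mid > left and mid > right:
        return "local maximum"
    return "undetermined"

print("s1 =", s1, classify(g, s1))
print("s2 =", s2, classify(g, s2))
```

For instance $g(0) = 0$ and $g(1) = -\frac{1}{5}$, while $g(s_1) \approx -0.27$ lies below both, which is consistent with a minimum at $s_1$.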
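c) For the first derivative of $h$ we find
\[ h'(u) = \frac{-2u^2 - u + 1}{\sqrt{1 - u^2}} \]
which has zeroes for $u_1 = \frac{1}{2}$ and $u_2 = -1$, but $-1 \notin (-1, 1)$. Thus $h$ may only have a local extreme value at $u_1 = \frac{1}{2}$.
Now
\[ h''(u) = \frac{2u^3 - 3u - 1}{(1 - u^2)^{\frac{3}{2}}} \]
and for $u_1 = \frac{1}{2}$ we find
\[ h''\left(\tfrac{1}{2}\right) = \frac{2 \cdot \left(\frac{1}{2}\right)^3 - 3 \cdot \frac{1}{2} - 1}{\left(1 - \left(\frac{1}{2}\right)^2\right)^{\frac{3}{2}}} = \frac{-9}{4\left(\frac{3}{4}\right)^{\frac{3}{2}}} < 0. \]
Therefore $h$ has a local maximum at $u_1 = \frac{1}{2}$.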
9. a) We may take, for example, $f : (-1, 1) \to \mathbb{R}$, $f(t) = \frac{1}{1+t}$. It follows that $(f \circ g)(x) = \frac{1}{1+x^2}$ and we find immediately that
\[ \left|\frac{1}{1 + x^2}\right| = \frac{1}{1 + x^2} \le 1 \quad \text{for all } x \in (-1, 1). \]
Since $(f \circ g)(0) = 1$ it follows that $f \circ g$ has a maximum at $x = 0$.

b) The function $f$ has a local maximum at $x_0$ if for some $\varepsilon > 0$ it follows for $x \in (-\varepsilon + x_0, x_0 + \varepsilon)$ that $f(x) \le f(x_0)$. This implies for all $x \in (-\varepsilon + x_0, x_0 + \varepsilon)$ that
\[ h(x_0 + c) = f(x_0 + c - c) = f(x_0) \ge f(x) = f(x + c - c) = h(x + c), \]
i.e.
\[ h(x_0 + c) \ge h(y) \quad \text{for all } y \in (-\varepsilon + x_0 + c, x_0 + c + \varepsilon), \]
and with $y_0 := x_0 + c \in (-\varepsilon + x_0 + c, x_0 + c + \varepsilon)$ we have
\[ h(y) \le h(y_0) \quad \text{for all } y \in (-\varepsilon + x_0 + c, x_0 + c + \varepsilon), \]
implying that $h$ has a local maximum at $y_0 = x_0 + c$.

Note that in the case where $f$ is twice differentiable we may use calculus. First note that we know $f'(x_0) = 0$ and $f''(x_0) < 0$. However $h'(x) = f'(x - c)$. Thus $h'(x_0 + c) = 0$ and since $h''(x) = f''(x - c)$ we also know that $h''(x_0 + c) < 0$, implying that $h$ has a local maximum at $x_0 + c$.
10. a) By the mean value theorem we have, for some $\xi$ between $x$ and $y$,
\[ |\sin x - \sin y| = |\sin' \xi||x - y| = |\cos \xi||x - y| \le |x - y| \]
and for $y = 0$ we find
\[ |\sin x| \le |x|. \]
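b) We apply the mean value theorem in the form
\[ |f(x) - f(y)| \le M|x - y| \]
where $|f'(z)| \le M$ for all $z$, $f : [x, y] \to \mathbb{R}$.

Thus in this case, with $g(z) = \sqrt{z}$, we have
\[ |g'(z)| = \frac{1}{2\sqrt{z}} \le \frac{1}{2} \quad \text{for } z \in [1, 2], \]
therefore with $x = \frac{11}{10}$ and $y = \frac{10}{10} = 1$:
\[ \left|\sqrt{\frac{11}{10}} - 1\right| \le \frac{1}{2} \cdot \frac{1}{10} = \frac{1}{20} \]
or
\[ -\frac{1}{20} + 1 \le \sqrt{\frac{11}{10}} \le 1 + \frac{1}{20}, \]
i.e.
\[ \frac{19}{20} \le \sqrt{\frac{11}{10}} \le \frac{21}{20}. \]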
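The two-sided estimate for $\sqrt{11/10}$ is easy to confirm with floating-point arithmetic. This is only an illustrative check of the final inequality, not part of the proof.

```python
import math

value = math.sqrt(11 / 10)
lower, upper = 19 / 20, 21 / 20

# The mean value theorem estimate predicts lower <= value <= upper.
print(lower, value, upper)
print("estimate holds:", lower <= value <= upper)
```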
11. a) We first note that each of the functions
\[ \chi_n : \mathbb{R} \to \mathbb{R}, \quad x \mapsto \chi_n(x) = \chi_{[n, \infty)}(x) \]
is increasing. Indeed,
if $x < y < n$ then $\chi_n(x) = \chi_n(y)$,
if $x < n \le y$ then $\chi_n(x) = 0 < 1 = \chi_n(y)$,
if $n \le x < y$ then $\chi_n(x) = \chi_n(y)$.
Since the sum of increasing functions is increasing ($g(x) \le g(y)$ and $f(x) \le f(y)$ implies $g(x) + f(x) \le g(y) + f(y)$), it follows that $X_N$ is increasing.
Here is the graph of $X_5$:

[Figure: graph of the step function $X_5$ over $[0, 7]$, with half-open steps at heights 1 to 5 on the intervals $[0,1), [1,2), \dots, [4,5)$ and height 6 from 5 onwards.]

which is justified by:
\[ \chi_0(x) = 1 \ \text{for all } x \ge 0, \qquad \chi_j(x) = \begin{cases} 0 & \text{for all } x < j \\ 1 & \text{for all } x \ge j \end{cases} \quad (j = 1, 2, 3, 4, 5), \]
therefore
\[ \sum_{n=0}^{5} \chi_n(x) = \begin{cases} 1, & 0 \le x < 1 \\ 2, & 1 \le x < 2 \\ 3, & 2 \le x < 3 \\ 4, & 3 \le x < 4 \\ 5, & 4 \le x < 5 \\ 6, & 5 \le x. \end{cases} \]
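The step structure of the sum can be reproduced in a few lines. In this sketch `chi` and `X` are hypothetical helper names for $\chi_n$ and the sum $\sum_{n=0}^{N} \chi_n$; the sample points are arbitrary.

```python
def chi(n, x):
    """Characteristic function of [n, infinity)."""
    return 1 if x >= n else 0

def X(N, x):
    """Sum of chi_0, ..., chi_N evaluated at x."""
    return sum(chi(n, x) for n in range(N + 1))

samples = [0.0, 0.5, 1.0, 2.7, 3.0, 4.2, 5.0, 7.0]
values = [X(5, x) for x in samples]
print(list(zip(samples, values)))

# A sum of increasing functions is increasing: the values never decrease.
print(all(a <= b for a, b in zip(values, values[1:])))
```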
b) We need to consider the sign of $f_a'$. Note that
\[ f_a'(x) = \frac{d}{dx}\left(x(1 + ax^2)^{-1}\right) = (1 + ax^2)^{-1} + x\left(-(1 + ax^2)^{-2}(2ax)\right) = \frac{1 + ax^2 - 2ax^2}{(1 + ax^2)^2} = \frac{1 - ax^2}{(1 + ax^2)^2}. \]
Since $(1 + ax^2)^2 > 0$ for all $x \in \mathbb{R}$ we only need to look at $1 - ax^2$. For $x < -\frac{1}{\sqrt{a}}$ or $x > \frac{1}{\sqrt{a}}$, it follows that $1 - ax^2 < 0$, hence in $\left(\frac{1}{\sqrt{a}}, \infty\right)$ the function is strictly decreasing. For $x \in \left(0, \frac{1}{\sqrt{a}}\right)$ we have $1 - ax^2 > 0$ and therefore in $\left(0, \frac{1}{\sqrt{a}}\right)$ the function is strictly increasing.
12. a) Let t1 , t2 ∈ (c, d) such that t1 < t2 . Since g is increasing it follows that
g(t1 ) ≤ g(t2 ). Now the fact that f is also increasing gives f (g(t1 )) ≤ f (g(t2 )), i.e.
f ◦ g is increasing.
b) Using the chain rule we find
\[ (f \circ g)'(x) = f'(g(x))g'(x) \]
and
\[ (g \circ f)'(x) = g'(f(x))f'(x). \]
In both cases, if $f'$ and $g'$ are positive functions, or if $f'$ and $g'$ are negative functions, we have that $(f \circ g)'$ and $(g \circ f)'$ are non-negative functions, hence $f \circ g$ and $g \circ f$ are increasing.
13. By the mean value theorem applied to $h = g - f$, for $x \in (a, b)$ there exists $\xi \in (a, x)$ such that
\[ h(x) - h(a) = h'(\xi)(x - a) = (g'(\xi) - f'(\xi))(x - a) > 0. \]
However $h(a) = g(a) - f(a) = 0$ and therefore $g(x) - f(x) = h(x) > 0$ for all $x \in (a, b)$, i.e. $f(x) < g(x)$ for all $x \in (a, b)$.
Chapter 9
1. a) Recall that $\lim_{x \to \infty} f(x) = \infty$ means that for all $M > 0$ there exists $N \in \mathbb{N}$ such that $x > N$ implies $f(x) > M$. Given $M > 0$ we have to find $N \in \mathbb{N}$ such that $x > N$ implies $x^2 - 5 > M$, or $x^2 > M + 5$. Hence for $N := [\sqrt{M + 5}] + 1$ it follows for $x > N = [\sqrt{M + 5}] + 1$ that
\[ x^2 - 5 > \left([\sqrt{M + 5}] + 1\right)^2 - 5 > \left(\sqrt{M + 5}\right)^2 - 5 = M, \]
where we used that $[y] + 1 > y$ for all $y \ge 0$.
b) Let us rewrite $p(x)$ as
\[ p(x) = a_k x^k \left(1 + \sum_{l=0}^{k-1} \frac{a_l}{a_k} x^{l-k}\right) \]
which is correct for say $x > 1$. Now for $0 \le l < k$ there exists $N_l$ such that for $x > N_l$
\[ \left|\frac{a_l}{a_k}\right| x^{l-k} < \frac{1}{2k}. \]
Indeed, this is equivalent to
\[ x^{k-l} > 2k \left|\frac{a_l}{a_k}\right| \]
for $x > N_l$, and this follows from Example 9.10. Therefore we see for $N_k := \max\{N_0, \dots, N_{k-1}\}$ that $x > N_k$ implies
\[ 1 + \sum_{l=0}^{k-1} \frac{a_l}{a_k} x^{l-k} \ge 1 - \sum_{l=0}^{k-1} \left|\frac{a_l}{a_k}\right| x^{l-k} \ge 1 - k \cdot \frac{1}{2k} = \frac{1}{2}. \]
This now implies for $x > N_k$
\[ p(x) \ge \frac{a_k}{2} x^k. \]
Again using Example 9.10 we deduce that given $M > 0$ there exists $\tilde{N} \in \mathbb{N}$ such that $x > \tilde{N}$ implies $\frac{a_k}{2} x^k > M$. Hence for $N = \max\{\tilde{N}, N_k\}$ it follows that $x > N$ implies $p(x) > M$, or
\[ \lim_{x \to \infty} p(x) = \infty. \]
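c) Note that
\[ \frac{1 + a + ax^2}{1 + x^2} = \frac{1}{1 + x^2} + \frac{a(1 + x^2)}{1 + x^2} = \frac{1}{1 + x^2} + a, \]
therefore for $\varepsilon > 0$ we have to find $N \in \mathbb{N}$ such that $x > N$ implies that
\[ \left|\frac{1 + a + ax^2}{1 + x^2} - a\right| = \left|\frac{1}{1 + x^2} + a - a\right| = \frac{1}{1 + x^2} < \varepsilon. \]
We can now continue as in Example 9.9 and take $N = N(\varepsilon) = \left[\frac{1}{\varepsilon}\right] + 1$, and see for $x > N(\varepsilon)$ that
\[ \left|\frac{1 + a + ax^2}{1 + x^2} - a\right| = \frac{1}{1 + x^2} < \frac{1}{x} < \frac{1}{\left[\frac{1}{\varepsilon}\right] + 1} < \varepsilon. \]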
2. a) Lemma 9.11.B says that for $a > 0$ and $n \in \mathbb{N}_0$
\[ (1 + a)^n \ge 1 + na + \frac{n(n-1)}{2} a^2. \]
For $n \ge 2$ we have $\frac{n(n-1)}{2} a^2 > 0$, which implies
\[ (1 + a)^n > 1 + na \qquad (∗) \]
for $a > 0$ and $n \ge 2$.
b) We apply (∗) to see for $n \ge 2$
\[ \left(1 + \frac{1}{n^2 - 1}\right)^n > 1 + n \frac{1}{n^2 - 1} = 1 + \frac{n}{n^2 - 1} \]
and it remains to prove that for $n \ge 2$ it follows that $\frac{n}{n^2 - 1} \ge \frac{1}{n}$, which is equivalent to $\frac{n^2}{n^2 - 1} \ge 1$, and this of course is correct.
3. From the definition we find
\[ a^{x+y} = \exp((x + y)\ln a) = \exp(x \ln a + y \ln a) = \exp(x \ln a)\exp(y \ln a) = a^x a^y, \]
as well as
\[ a^0 = \exp(0 \cdot \ln a) = \exp(0) = 1. \]
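4. a) By the chain rule we find
\[ \frac{d}{dx}\exp\left(-\sqrt{x^2 + 1}\right) = \frac{d}{dx}\left(-\sqrt{x^2 + 1}\right)(\exp')\left(-\sqrt{x^2 + 1}\right) = \frac{-x}{\sqrt{x^2 + 1}}\exp\left(-\sqrt{x^2 + 1}\right). \]
b) Again we use the chain rule to get
\[ \frac{d}{du}\exp(-\log_a(1 + u^2)) = \frac{d}{du}(-\log_a(1 + u^2))(\exp')(-\log_a(1 + u^2)) = \frac{-2u}{(\ln a)(1 + u^2)}\exp(-\log_a(1 + u^2)). \]
c) First we find
\[ \frac{d}{dt}\exp\left(-\frac{1}{1 + t^2}\right) = \frac{d}{dt}\left(-\frac{1}{1 + t^2}\right)(\exp')\left(-\frac{1}{1 + t^2}\right) = \frac{2t}{(1 + t^2)^2}\exp\left(-\frac{1}{1 + t^2}\right) \]
and now it follows that
\[ \frac{d^2}{dt^2}\exp\left(-\frac{1}{1 + t^2}\right) = \frac{d}{dt}\left(\frac{2t}{(1 + t^2)^2}\exp\left(-\frac{1}{1 + t^2}\right)\right) \]
\[ = \frac{d}{dt}\left(\frac{2t}{(1 + t^2)^2}\right)\exp\left(-\frac{1}{1 + t^2}\right) + \frac{2t}{(1 + t^2)^2}\,\frac{d}{dt}\exp\left(-\frac{1}{1 + t^2}\right) \]
\[ = \frac{2 - 6t^2}{(1 + t^2)^3}\exp\left(-\frac{1}{1 + t^2}\right) + \frac{2t}{(1 + t^2)^2} \cdot \frac{2t}{(1 + t^2)^2}\exp\left(-\frac{1}{1 + t^2}\right) = \frac{2 - 6t^4}{(1 + t^2)^4}\exp\left(-\frac{1}{1 + t^2}\right). \]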
5. The case $n = 0$ is straightforward, just take $p_0(x) = 1$. Now suppose that
\[ \frac{d^n}{dx^n} e^{-x^2} = p_n(x) e^{-x^2} \]
with $p_n(x)$ of degree $n$. It follows that
\[ \frac{d^{n+1}}{dx^{n+1}} e^{-x^2} = \frac{d}{dx}\left(p_n(x) e^{-x^2}\right) = p_n'(x) e^{-x^2} - 2x p_n(x) e^{-x^2} = (p_n'(x) - 2x p_n(x)) e^{-x^2} = p_{n+1}(x) e^{-x^2}. \]
The polynomial $p_{n+1}(x) := p_n'(x) - 2x p_n(x)$ has degree at most $n + 1$ since the degree of $p_n'(x)$ is at most $n - 1$ and that of $-2x p_n(x)$ is at most $n + 1$.
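The recursion $p_{n+1}(x) = p_n'(x) - 2x\,p_n(x)$ with $p_0(x) = 1$ can be iterated mechanically. In the sketch below a polynomial is stored as a list of coefficients `[a_0, a_1, ...]`; this representation and the helper names are assumptions made for illustration.

```python
def derivative(p):
    """Coefficients of p'(x) for p given as [a_0, a_1, ..., a_n]."""
    return [k * c for k, c in enumerate(p)][1:] or [0]

def next_poly(p):
    """Return the coefficients of p'(x) - 2x p(x)."""
    dp = derivative(p)
    shifted = [0] + [-2 * c for c in p]      # coefficients of -2x p(x)
    size = max(len(dp), len(shifted))
    dp = dp + [0] * (size - len(dp))
    shifted = shifted + [0] * (size - len(shifted))
    return [a + b for a, b in zip(dp, shifted)]

p = [1]                                      # p_0(x) = 1
for n in range(1, 4):
    p = next_poly(p)
    print(n, p)
```

The first three steps give $p_1(x) = -2x$, $p_2(x) = 4x^2 - 2$ and $p_3(x) = -8x^3 + 12x$, so the degree grows by exactly one in each step.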
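6. a) By the chain rule we find
\[ \frac{d}{ds}\ln\left(\sqrt{s^4 + 1} - s^2\right) = \frac{d}{ds}\left(\sqrt{s^4 + 1} - s^2\right)(\ln')\left(\sqrt{s^4 + 1} - s^2\right) = \frac{2s^3 - 2s\sqrt{s^4 + 1}}{\sqrt{s^4 + 1}} \cdot \frac{1}{\sqrt{s^4 + 1} - s^2} = \frac{2s^3 - 2s\sqrt{s^4 + 1}}{s^4 - s^2\sqrt{s^4 + 1} + 1}. \]
b) Once again by the chain rule we find
\[ \frac{d}{dx}\left(\ln(a^x)\right) = \frac{d}{dx}a^x\,(\ln')(a^x) = (\ln a)a^x \frac{1}{a^x} = \ln a. \]
Note that the derivative is constant.
c) First note that
\[ \frac{d}{dy}\ln\left((y^2 + 1)^{-k}\right) = \frac{d}{dy}(y^2 + 1)^{-k}\,(\ln')\left((y^2 + 1)^{-k}\right) = \frac{-2yk}{(y^2 + 1)^{k+1}} \cdot \frac{1}{(y^2 + 1)^{-k}} = \frac{-2yk}{y^2 + 1} \]
and it follows now that
\[ \frac{d^2}{dy^2}\ln\left((y^2 + 1)^{-k}\right) = \frac{d}{dy}\left(\frac{-2ky}{y^2 + 1}\right) = \frac{-2k(y^2 + 1) - (-2ky)(2y)}{(y^2 + 1)^2} = \frac{2ky^2 - 2k}{(y^2 + 1)^2}. \]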
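7. a) We can use Lemma 9.14 in the following way:
\[ \lim_{x \to \infty} \frac{\exp(ax)}{x} = a \lim_{x \to \infty} \frac{\exp(ax)}{ax} = a \lim_{y \to \infty} \frac{\exp(y)}{y} = \infty. \]
Here we used the fact that $x \to \infty$ if and only if $y = ax \to \infty$.
b) For $n \in \mathbb{N}$ we have
\[ \exp(ax) = \exp\left(\frac{ax}{n} + \dots + \frac{ax}{n}\right) = \exp\left(\frac{ax}{n}\right) \cdot \ldots \cdot \exp\left(\frac{ax}{n}\right) \quad (n \text{ terms}) \]
and therefore it follows that
\[ \lim_{x \to \infty} \frac{\exp(ax)}{x^n} = \lim_{x \to \infty}\left(\frac{\exp\left(\frac{ax}{n}\right)}{x} \cdot \ldots \cdot \frac{\exp\left(\frac{ax}{n}\right)}{x}\right) = \lim_{x \to \infty} \frac{\exp\left(\frac{ax}{n}\right)}{x} \cdot \ldots \cdot \lim_{x \to \infty} \frac{\exp\left(\frac{ax}{n}\right)}{x} = \infty. \]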
The following is important to note: we have not yet proved that if $\lim_{x \to \infty} f(x) = \infty$ and $\lim_{x \to \infty} g(x) = \infty$ then it follows that $\lim_{x \to \infty}(f(x)g(x)) = \lim_{x \to \infty} f(x) \lim_{x \to \infty} g(x) = \infty$. Suppose that $\lim_{x \to \infty} f(x) = \infty$ and $\lim_{x \to \infty} g(x) = \infty$. Given $M > 0$ there exists $N$ such that for $x > N$ we have $f(x) > \sqrt{M}$ and $g(x) > \sqrt{M}$. Therefore for $x > N$ it follows that $f(x) \cdot g(x) > \sqrt{M}\sqrt{M} = M$, i.e. we have proved that $\lim_{x \to \infty} f(x)g(x) = \infty$. Finally we use the convention that $(+\infty) \cdot (+\infty) = +\infty$.
8. Firstly we can use the considerations of Problem 1 b). Thus we first write for $x \neq 0$
\[ p(x) = b_m x^m \left(1 + \sum_{k=0}^{m-1} \frac{b_k}{b_m} x^{k-m}\right). \]
If $m$ is even we find further: given $K = \ln M$, $K > 0$, i.e. $M > 1$, there exists $N \in \mathbb{N}$ such that $x > N$ implies
\[ p(-x) = b_m x^m \left(1 + \sum_{k=0}^{m-1} (-1)^{k-m} \frac{b_k}{b_m} x^{k-m}\right) \ge \ln M. \]
Now it follows for $x > N$ that
\[ \exp(p(-x)) \ge M, \]
implying that for $m$ even $\lim_{x \to -\infty} \exp(p(x)) = \infty$.
Now let $m$ be odd. First we prove that
\[ \lim_{x \to -\infty} \exp(x^m) = 0 \]
which follows from
\[ \lim_{x \to -\infty} \exp(x^m) = \lim_{y \to \infty} \frac{1}{\exp(y^m)} = 0. \]
Now suppose that we can prove that there exists some $N \in \mathbb{N}$ such that $x < -N$ implies
\[ p(x) \le c x^m \qquad (∗) \]
with $c > 0$ independent of $N$. In this case we would have
\[ 0 \le \exp(p(x)) \le \exp(c x^m) \]
and therefore
\[ 0 \le \lim_{x \to -\infty} \exp(p(x)) \le \lim_{x \to -\infty} \exp(c x^m) = 0, \]
i.e.
\[ \lim_{x \to -\infty} \exp p(x) = 0. \]
In order to prove (∗), note that for $x < 0$
\[ \frac{1}{b_m x^m}\,p(x) = 1 + \sum_{k=0}^{m-1} \frac{b_k}{b_m} x^{k-m} \]
and we are done if we can show that
\[ 1 + \sum_{k=0}^{m-1} \frac{b_k}{b_m} x^{k-m} \le \tilde{c}, \quad \tilde{c} > 0. \]
Now note that for $x \le -1$
\[ 1 + \sum_{k=0}^{m-1} \frac{b_k}{b_m} x^{k-m} \le 1 + \sum_{k=0}^{m-1} \frac{|b_k|}{|b_m|} |x|^{k-m} \le 1 + \sum_{k=0}^{m-1} \frac{|b_k|}{|b_m|}, \]
and then the result follows.

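9. We again use the fact that for $x \neq 0$
\[ p(x) = a_n x^n \left(1 + \sum_{k=0}^{n-1} \frac{a_k}{a_n} x^{k-n}\right) \qquad (∗) \]
and the result shown in Problem 1 b) that for large $x$
\[ 1 + \sum_{k=0}^{n-1} \frac{a_k}{a_n} x^{k-n} \ge \frac{1}{2}. \]
Hence for $x \ge R$ we have that $\ln p(x)$ is defined.

Now we can investigate
\[ \lim_{x \to \infty} \frac{\ln p(x)}{x}. \]
With (∗) it follows for $x \ge R$ that
\[ 0 \le \frac{\ln p(x)}{x} = \frac{\ln\left(a_n x^n \left(1 + \sum_{k=0}^{n-1} \frac{a_k}{a_n} x^{k-n}\right)\right)}{x} = \frac{\ln(a_n x^n)}{x} + \frac{\ln\left(1 + \sum_{k=0}^{n-1} \frac{a_k}{a_n} x^{k-n}\right)}{x} \le \frac{\ln(a_n x^n)}{x} + \frac{\ln\left(1 + \sum_{k=0}^{n-1} \frac{|a_k|}{|a_n|} R^{k-n}\right)}{x}. \]
Clearly
\[ \lim_{x \to \infty} \frac{\ln\left(1 + \sum_{k=0}^{n-1} \frac{|a_k|}{|a_n|} R^{k-n}\right)}{x} = 0. \]
Thus we want to prove
\[ \lim_{x \to \infty} \frac{\ln(a_n x^n)}{x} = 0, \]
but
\[ \frac{\ln(a_n x^n)}{x} = \frac{\ln a_n}{x} + \frac{n \ln x}{x} \]
and Theorem 9.16 gives the result.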
10. a) First note that for $x, y > 0$
\[ (xy)^{\frac{1}{2}} \le \frac{x + y}{2}. \]
This estimate is equivalent to
\[ 4xy \le (x + y)^2 = x^2 + 2xy + y^2 \]
or
\[ 2xy \le x^2 + y^2 \]
which is correct since $0 \le (x - y)^2 = x^2 - 2xy + y^2$.

Now the monotonicity of $\ln$ gives
\[ \ln (xy)^{\frac{1}{2}} \le \ln \frac{x + y}{2} \]
but
\[ \ln (xy)^{\frac{1}{2}} = \frac{1}{2}\ln(xy) = \frac{\ln x + \ln y}{2}. \]
A function satisfying $g\left(\frac{x+y}{2}\right) \le \frac{g(x) + g(y)}{2}$ is called convex in the sense of J. Jensen or mid-point convex.
b) The mean value theorem gives
\[ |\ln x - \ln y| = |\ln' \xi||x - y| \]
for some $y \le \xi \le x$. Now $\ln' \xi = \frac{1}{\xi}$ and by assumption $|x - y| = 1$. Therefore we have
\[ |\ln x - \ln y| = \frac{1}{\xi}. \]
Since $\ln$ is monotone increasing we have $\ln x - \ln y > 0$, i.e. $\ln x - \ln y = |\ln x - \ln y|$, and further $\frac{1}{x} < \frac{1}{\xi} < \frac{1}{y}$, implying
\[ \frac{1}{x} \le \ln x - \ln y \le \frac{1}{y}. \]
11. The logarithmic derivative of $v$ is given by $\frac{v'}{v}$. Thus we have
\[ \frac{v'(x)}{v(x)} = 1, \quad v(0) = 1, \]
or $v'(x) = v(x)$ and $v(0) = 1$. Thus it follows that $v(x) = \exp x$.

Chapter 10
1. a) For x ∈ R we have
(f ◦ g)(−x) = f (g(−x)) = f (g(x)) = (f ◦ g)(x),
therefore f ◦ g is an even function.

b) For x ∈ R we find
(f ◦ g)(−x) = f (g(−x)) = f (−g(x))

= −f (g(x)) = −(f ◦ g)(x),
hence f ◦ g is an odd function.

c) Let $c = \min\{|a|, b\}$. Then $-\frac{c}{2}, \frac{c}{2} \in (a, b)$ and $f\left(-\frac{c}{2}\right) = f\left(\frac{c}{2}\right)$. Therefore $f|_{(a,b)}$ is not injective and therefore it does not have an inverse function.
2. a) Let $f$ be an even function and note that
\[ \frac{f(-y + h) - f(-y)}{h} = \frac{f(y - h) - f(y)}{h} = -\frac{f(y - h) - f(y)}{-h} \]
and for $h \to 0$ we find
\[ f'(-y) = \lim_{h \to 0} \frac{f(-y + h) - f(-y)}{h} = -\lim_{h \to 0} \frac{f(y - h) - f(y)}{-h} = -f'(y) \]
implying that $f'$ is an odd function.
Now if $f$ is an odd function we have
\[ \frac{f(-y + h) - f(-y)}{h} = \frac{-f(y - h) + f(y)}{h} = \frac{f(y - h) - f(y)}{-h} \]
and in the limit we have
\[ f'(-y) = \lim_{h \to 0} \frac{f(-y + h) - f(-y)}{h} = \lim_{h \to 0} \frac{f(y - h) - f(y)}{-h} = f'(y), \]
i.e. $f'$ is even.
Thus by iteration if $f$ is an even $C^k$ function then all derivatives $f^{(l)}$ with $l \le k$ and $l$ even are even functions and all derivatives $f^{(l)}$ with $l \le k$ and $l$ odd are odd functions.
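b) We define
\[ g(x) = \begin{cases} f(x), & x \ge 0 \\ f(-x), & x \le 0 \end{cases} \]
and
\[ h(x) = \begin{cases} f(x), & x > 0 \\ 0, & x = 0 \\ -f(-x), & x < 0. \end{cases} \]
Clearly $g$ is even and $h$ is odd. Note that we obtain $g$ by reflecting $f$ in the $y$-axis, whereas $h$ is obtained by a point reflection of $f|_{[0,\infty)}$ at $x_0 = 0$.

[Figure: two sketches titled "even extension of f" and "odd extension of f", marking the values $g(-x)$ and $f(x) = g(x)$, respectively $h(-x)$ and $f(x) = h(x)$, above the points $-x$ and $x$.]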
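3. a) This limit does not exist. Suppose that it does and that it is equal to $a$, i.e. for all $\varepsilon > 0$ there exists $N(\varepsilon) \in \mathbb{N}$ such that $x > N(\varepsilon)$ implies that $|\sin x - a| < \varepsilon$. For $\varepsilon = \frac{1}{2}$ take $k > N(\varepsilon)$, implying $x_k := 2\pi k > N(\varepsilon)$ and $y_k := \frac{\pi}{2} + 2\pi k > N(\varepsilon)$, implying
\[ 1 = |\sin x_k - \sin y_k| = |\sin x_k - a + a - \sin y_k| \le |\sin x_k - a| + |\sin y_k - a| < \varepsilon + \varepsilon = \frac{1}{2} + \frac{1}{2} = 1, \]
which is a contradiction.
b) Since $|(\sin x)^k| \le 1$ for all $x \in \mathbb{R}$ and $k \in \mathbb{N}$ it follows that
\[ \left|\frac{(\sin x)^k}{x} - 0\right| \le \frac{1}{x}, \quad x > 0. \]
Now, given $\varepsilon > 0$ choose $N(\varepsilon) = \left[\frac{1}{\varepsilon}\right] + 1$ to find for $x > N(\varepsilon)$, i.e. $\frac{1}{x} < \frac{1}{\left[\frac{1}{\varepsilon}\right] + 1} < \varepsilon$, that
\[ \left|\frac{(\sin x)^k}{x} - 0\right| = \left|\frac{(\sin x)^k}{x}\right| \le \frac{1}{x} < \varepsilon. \]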
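4. We use Figure 10.2 in the following.

a) First note that $\sin \frac{\pi}{4} = \cos \frac{\pi}{4}$ and since $1 = \sin^2 \frac{\pi}{4} + \cos^2 \frac{\pi}{4} = 2\sin^2 \frac{\pi}{4}$ it follows that $\sin \frac{\pi}{4} = \cos \frac{\pi}{4} = \frac{\sqrt{2}}{2}$.
By (10.10) we now find
\[ \cos \frac{\pi}{4} - \cos 0 = \frac{1}{2}\sqrt{2} - 1 = -2\left(\sin \frac{\pi}{8}\right)^2 \]
or
\[ \left(\sin \frac{\pi}{8}\right)^2 = \frac{1}{2}\left(1 - \frac{1}{2}\sqrt{2}\right), \]
i.e. $\sin \frac{\pi}{8} = \sqrt{\frac{1}{2} - \frac{1}{4}\sqrt{2}}$.
b) By looking at the figure below and using elementary geometry we deduce that $\sin \frac{\pi}{6} = \frac{1}{2}$.

[Figure: triangle $OBA$ with angle $\frac{\pi}{6}$ at $O$, height $\sin \frac{\pi}{6}$ and side length 1. $OBA$ is an equilateral triangle, hence $2\sin \frac{\pi}{6} = 1$.]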
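Since $\cos^2 \frac{\pi}{6} = 1 - \sin^2 \frac{\pi}{6} = 1 - \frac{1}{4} = \frac{3}{4}$ we find that
\[ \cos \frac{\pi}{6} = \frac{1}{2}\sqrt{3}. \]
c) First note that
\[ \sin \frac{\pi}{3} = \sin\left(\frac{\pi}{6} + \frac{\pi}{6}\right) = \sin \frac{\pi}{6}\cos \frac{\pi}{6} + \cos \frac{\pi}{6}\sin \frac{\pi}{6} = 2\sin \frac{\pi}{6}\cos \frac{\pi}{6} = \frac{1}{2}\sqrt{3} \]
and
\[ \cos \frac{\pi}{3} = \cos\left(\frac{\pi}{6} + \frac{\pi}{6}\right) = \cos \frac{\pi}{6}\cos \frac{\pi}{6} - \sin \frac{\pi}{6}\sin \frac{\pi}{6} = \frac{1}{2}\sqrt{3} \cdot \frac{1}{2}\sqrt{3} - \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{2}. \]
Therefore we find
\[ \tan \frac{\pi}{3} = \frac{\sin \frac{\pi}{3}}{\cos \frac{\pi}{3}} = \frac{\frac{1}{2}\sqrt{3}}{\frac{1}{2}} = \sqrt{3}. \]
d) Since $\sin \frac{\pi}{6} = \frac{1}{2}$ we find
\[ \frac{1}{2} = \sin \frac{\pi}{6} = \sin\left(\frac{\pi}{12} + \frac{\pi}{12}\right) = \sin \frac{\pi}{12}\cos \frac{\pi}{12} + \cos \frac{\pi}{12}\sin \frac{\pi}{12} = 2\sin \frac{\pi}{12}\cos \frac{\pi}{12} \]
or
\[ \sin \frac{\pi}{12}\cos \frac{\pi}{12} = \frac{1}{4}, \]
which yields
\[ \frac{1}{\sin \frac{\pi}{12}} = 4\cos \frac{\pi}{12}. \qquad (∗) \]
Further we find that
\[ \frac{1}{2}\sqrt{3} = \cos \frac{\pi}{6} = \cos\left(\frac{\pi}{12} + \frac{\pi}{12}\right) = \left(\cos \frac{\pi}{12}\right)^2 - \left(\sin \frac{\pi}{12}\right)^2. \]
Since $\left(\cos \frac{\pi}{12}\right)^2 + \left(\sin \frac{\pi}{12}\right)^2 = 1$ it follows that
\[ \frac{1}{2}\sqrt{3} = \left(\cos \frac{\pi}{12}\right)^2 + \left(\cos \frac{\pi}{12}\right)^2 - 1 \]
or
\[ \cos \frac{\pi}{12} = \sqrt{\frac{1}{2} + \frac{1}{4}\sqrt{3}}. \]
Finally, since $\cot \frac{\pi}{12} = \frac{\cos \frac{\pi}{12}}{\sin \frac{\pi}{12}} = 4\cos^2 \frac{\pi}{12}$, where we used (∗), we find
\[ \cot \frac{\pi}{12} = 4\left(\frac{1}{2} + \frac{1}{4}\sqrt{3}\right) = 2 + \sqrt{3}. \]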
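5. The problem is equivalent to finding the value $x$ such that

a) $\sin x = \frac{\sqrt{3}}{2}$,
b) $\cos x = -\frac{1}{2}\sqrt{2}$,
c) $\tan x = \frac{1}{\sqrt{3}}$,
d) $\cot x = -\sqrt{3}$.

a) $\frac{\sqrt{3}}{2} = \cos \frac{\pi}{6} = \sin\left(\frac{\pi}{6} + \frac{\pi}{2}\right) = \sin \frac{2\pi}{3}$, therefore $x = \frac{2\pi}{3}$; note that we have used part b) of Problem 1.
b) Since $\cos(\pi + x) = -\cos x$ and $\cos \frac{\pi}{4} = \frac{1}{2}\sqrt{2}$ we find $\cos \frac{5\pi}{4} = -\frac{1}{2}\sqrt{2}$, i.e. $x = \frac{5\pi}{4}$.
c) Since $\tan x = \frac{\sin x}{\cos x}$ and $\cos \frac{\pi}{6} = \frac{1}{2}\sqrt{3}$, $\sin \frac{\pi}{6} = \frac{1}{2}$ we find $\tan \frac{\pi}{6} = \frac{1}{\sqrt{3}}$.
d) Now $\cot x = \frac{1}{\tan x}$ and $\tan x = -\tan(-x)$, so we know that $\tan\left(-\frac{\pi}{6}\right) = -\frac{1}{\sqrt{3}}$ or $\cot\left(-\frac{\pi}{6}\right) = -\sqrt{3}$. Since in addition $\cot(x + \pi) = \cot x$, and it is preferred to consider $\cot$ on $(0, \pi)$, we take $x = \frac{5\pi}{6}$ as a solution, i.e. $\cot \frac{5\pi}{6} = -\sqrt{3}$.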
6. a) By the mean value theorem for some $\xi$ between $x$ and $y$ we get
\[ |\sin x - \sin y| = |\sin' \xi||x - y| = |\cos \xi||x - y| \le |x - y|. \]
b) Again the mean value theorem yields for some $\xi$ between $x$ and $y$
\[ |\tan x - \tan y| = |\tan' \xi||x - y| = \frac{1}{|\cos^2 \xi|}|x - y| \]
and since in $[-a, a]$ the function $\cos$ has its minimum at $a$ we find
\[ |\tan x - \tan y| \le \frac{1}{\cos^2 a}|x - y|. \]
c) For the first statement we use a short induction argument. For $n = 1$ we have $|\sin x| \le |\sin x|$. Now suppose that $|\sin nx| \le n|\sin x|$. Using the addition theorems we find
\[ \sin((n + 1)x) = (\sin nx)\cos x + \sin x \cos nx \]
or
\[ |\sin(n + 1)x| \le |\sin nx||\cos x| + |\sin x||\cos nx| \le n|\sin x| + |\sin x| = (n + 1)|\sin x|. \]
The statement $|\sin ax| \le a|\sin x|$ for all $a > 0$ and all $x \in \mathbb{R}$ is not correct. Take $a = \frac{1}{2}$ and $x = \pi$, then the claim is
\[ 1 = \left|\sin \frac{\pi}{2}\right| \le \frac{1}{2}|\sin \pi| = 0, \]
which of course is not correct.
7. Let $x \in \mathbb{R}$. It follows that
\[ (f \circ g)(x + a) = f(g(x + a)) = f(g(x)) = (f \circ g)(x), \]
thus $f \circ g$ has period $a$.
The function $g \circ f : \mathbb{R} \to \mathbb{R}$ does not have to be periodic. Consider the periodic function $g = \sin$ and the function $f : \mathbb{R} \to \mathbb{R}$, $x \mapsto \frac{1}{1+x^2}$. Note that the range of $f$ is $(0, 1]$. On $[0, 1]$ the function $\sin$ is strictly increasing. Now, let $a > 0$. For $x = 0$ we have $x + a \neq x$ and $\frac{1}{1+(x+a)^2} = \frac{1}{1+a^2} < 1 = \frac{1}{1+x^2}$. Thus
\[ \sin\left(\frac{1}{1 + a^2}\right) < \sin 1, \]
i.e. $(g \circ f)(0 + a) \neq (g \circ f)(0)$ for every $a > 0$, hence $x \mapsto \sin\left(\frac{1}{1+x^2}\right)$ is not periodic.
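8. a)
\[ \frac{d}{dx}\cos(\ln(1 + x^2)) = \frac{d}{dx}\left(\ln(1 + x^2)\right)(\cos')(\ln(1 + x^2)) = -\frac{2x}{1 + x^2}\sin(\ln(1 + x^2)). \]
b)
\[ \frac{d}{dt}\left(\frac{\sin(\tan t)}{\sqrt{1 - \cos^4 t}}\right) = \frac{\frac{d}{dt}\left(\sin(\tan t)\right)\sqrt{1 - \cos^4 t} - \sin(\tan t)\frac{d}{dt}\sqrt{1 - \cos^4 t}}{1 - \cos^4 t} \]
\[ = \frac{\frac{1}{\cos^2 t}\cos(\tan t)\sqrt{1 - \cos^4 t} - \sin(\tan t) \cdot (4\cos^3 t)(\sin t)\,\frac{1}{2}(1 - \cos^4 t)^{-\frac{1}{2}}}{1 - \cos^4 t} \]
\[ = \frac{(\cos(\tan t))(1 - \cos^4 t) - 2(\sin(\tan t))(\sin t)(\cos^5 t)}{(\cos^2 t)(1 - \cos^4 t)^{\frac{3}{2}}}. \]
c)
\[ \frac{d}{ds}\arcsin\sqrt{1 + \cos s} = \frac{d}{ds}\left(\sqrt{1 + \cos s}\right)(\arcsin')\left(\sqrt{1 + \cos s}\right) = \frac{-\sin s}{2\sqrt{1 + \cos s}} \cdot \frac{1}{\sqrt{1 - \left(\sqrt{1 + \cos s}\right)^2}} = \frac{-\sin s}{2\sqrt{1 + \cos s}\sqrt{-\cos s}}. \]
Note that $\sqrt{1 + \cos s}$ lies in the domain of $\arcsin$ only where $\cos s \le 0$, so $\sqrt{-\cos s}$ is defined.
d)
\[ \frac{d}{du}\arctan\left(e^{-u^2}\cot u\right) = \frac{d}{du}\left(e^{-u^2}\cot u\right)(\arctan')\left(e^{-u^2}\cot u\right) \]
\[ = \left(-2ue^{-u^2}\cot u - e^{-u^2}\frac{1}{\sin^2 u}\right)\frac{1}{1 + \left(e^{-u^2}\cot u\right)^2} = -\frac{\left(2u(\cot u)\sin^2 u + 1\right)e^{-u^2}}{(\sin^2 u)\left(1 + e^{-2u^2}\cot^2 u\right)}. \]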
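9. We first find $\sum_{j=1}^{n} \cos jt$. For $j \ge 1$ it follows that
\[ \cos jt \cdot \sin \frac{t}{2} = \frac{1}{2}\left(\sin\left(j + \frac{1}{2}\right)t - \sin\left(j - \frac{1}{2}\right)t\right) \]
and therefore
\[ \left(\sum_{j=1}^{n} \cos jt\right)\sin \frac{t}{2} = \frac{1}{2}\sum_{j=1}^{n}\left(\sin\left(j + \frac{1}{2}\right)t - \sin\left(j - \frac{1}{2}\right)t\right) = \frac{1}{2}\left(\sin\left(n + \frac{1}{2}\right)t - \sin \frac{t}{2}\right) \]
or for $t \neq 0$
\[ \sum_{j=1}^{n} \cos jt = \frac{1}{2}\left(\frac{\sin \frac{2n+1}{2}t}{\sin \frac{t}{2}} - 1\right), \]
i.e. for $t \neq 0$ we have
\[ C_n(t) = \frac{1}{2} + \sum_{j=1}^{n} \cos jt = \frac{1}{2}\,\frac{\sin \frac{2n+1}{2}t}{\sin \frac{t}{2}}. \]
Thus we find for $t \neq 0$ that
\[ C_n(2t) = \frac{1}{2}\,\frac{\sin(2n + 1)t}{\sin t}, \]
which yields for $t \neq 0$ that $C_n(2t) = \frac{1}{2}D_n(t)$. For $t = 0$ we find $C_n(0) = \frac{1}{2} + \sum_{j=1}^{n}(\cos j0) = \frac{2n + 1}{2}$, hence
\[ C_n(2t) = \frac{1}{2}D_n(t) \]
for all $t \in \left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$. Since $t \mapsto C_n(2t)$ is arbitrarily often differentiable on $\left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$ it follows that $D_n(\cdot)$ is arbitrarily often differentiable on $\left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$.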
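The telescoping identity for $C_n(t) = \frac{1}{2} + \sum_{j=1}^{n} \cos jt$ can be tested numerically. In this sketch the function names `C` and `closed_form` are hypothetical, and the test values of $n$ and $t$ are arbitrary.

```python
import math

def C(n, t):
    """Direct evaluation of 1/2 + sum_{j=1}^{n} cos(j t)."""
    return 0.5 + sum(math.cos(j * t) for j in range(1, n + 1))

def closed_form(n, t):
    """(1/2) * sin((2n+1) t / 2) / sin(t / 2), valid where sin(t/2) != 0."""
    return 0.5 * math.sin((2 * n + 1) * t / 2) / math.sin(t / 2)

for n, t in [(1, 0.3), (5, 0.7), (12, -1.1)]:
    print(n, t, C(n, t), closed_form(n, t))

# At t = 0 the sum is evaluated directly: C_n(0) = (2n + 1) / 2.
print(C(5, 0.0))
```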
Chapter 11
1. The interior points are $(-1, 2) \cup (5, 6)$ and the boundary is
\[ \{-1\} \cup \{2\} \cup \{3\} \cup \{4\} \cup \{5\} \cup \{6\}, \quad \text{i.e.} \quad \partial D = \{-1, 2, 3, 4, 5, 6\}. \]
To prove this we first look at the following sketch:
−1 0 1 2 3 4 5 6
[ | | ) • • [ ]
For $-1 < x < 2$, i.e. $x \in (-1, 2)$, we take $\varepsilon_1 = \frac{1}{2}\min\{|x + 1|, |x - 2|\}$ to find that $(-\varepsilon_1 + x, x + \varepsilon_1) \subset (-1, 2) \subset D$, and for $x \in (5, 6)$ we find with $\varepsilon_2 := \frac{1}{2}\min\{|x - 5|, |x - 6|\}$ that $(-\varepsilon_2 + x, x + \varepsilon_2) \subset (5, 6) \subset D$. The points $-1, 2, 3, 4, 5, 6$ are not interior points: every interval with centre $x \in \{-1, 2, 3, 4, 5, 6\}$ will contain points not belonging to $D$. This is almost the proof that $\partial D = \{-1, 2, 3, 4, 5, 6\}$. For $-1, 3, 4, 5, 6 \in D$ it follows immediately that these points belong to $\partial D$. For $2$ we need to argue more carefully: if $(-\varepsilon + 2, 2 + \varepsilon)$, $0 < \varepsilon < 1$, is an open interval with centre $2$, then $(-\varepsilon + 2, 2) \subset D$. Hence $(-\varepsilon + 2, 2 + \varepsilon)$, $0 < \varepsilon < 1$, always contains points in $D$ and in $D^{\complement}$. However this also holds when we do not use the restriction $\varepsilon < 1$.

2. a) In order for $\sqrt{(x^2 - 1)(x^2 + 4x)}$ to be defined we need to have that $(x^2 - 1)(x^2 + 4x) \ge 0$, i.e. $(x^2 - 1) \le 0$ and $(x^2 + 4x) \le 0$, or $(x^2 - 1) \ge 0$ and $(x^2 + 4x) \ge 0$. Now $x^2 - 1 \le 0$ if $x \in [-1, 1]$ and $x^2 + 4x = x(x + 4) \le 0$ if either $x \le 0$ and $x \ge -4$ or $x \ge 0$ and $x \le -4$. Moreover $x^2 - 1 \ge 0$ if $x \in \mathbb{R} \setminus (-1, 1)$ and $x(x + 4) \ge 0$ if either $x \ge 0$ and $x \ge -4$ or $x \le 0$ and $x \le -4$. Together we find that $(x^2 - 1)(x^2 + 4x) \ge 0$ if
\[ x \in \left(\mathbb{R} \setminus (-4, 1)\right) \cup [-1, 0]. \]
b) First note that we need to have $x^3 + 4x^2 - 5x \neq 0$, or $x(x^2 + 4x - 5) \neq 0$, i.e. $x \in \mathbb{R} \setminus \{-5, 0, 1\}$. Next note that $\cos$ is defined on all of $\mathbb{R}$ but $\ln$ is only defined on $(0, \infty)$. Hence we need $\arctan x > 0$ implying $x > 0$. Therefore we have
\[ D = \mathbb{R}_+ \setminus \{0, 1\}. \]
c) As in part a) we note that the condition is that $(\sinh x)(1 - x^4)$ is non-negative. Now for $x \ge 0$ we have $\sinh x \ge 0$ and for $x < 0$ we have $\sinh x < 0$. Further for $x \in [-1, 1]$ we have $1 - x^4 \ge 0$ and in $\mathbb{R} \setminus [-1, 1]$ we find that $1 - x^4 < 0$. Therefore it follows that
\[ D = (-\infty, -1] \cup [0, 1]. \]
d) The definition of $\cot$ gives
\[ \cot(\arcsin x) = \frac{\cos(\arcsin x)}{\sin(\arcsin x)} = \frac{\cos(\arcsin x)}{x} \]
which implies
\[ D = [-1, 1] \setminus \{0\}. \]
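3. a) Since $\cos \pi = -1$, hence $1 + \cos \pi = 0$, and $x^2 - 2x + 1$ has a zero at $x = 1$, we may use the rules of l'Hospital:
\[ \lim_{x \to 1} \frac{1 + \cos \pi x}{x^2 - 2x + 1} = \lim_{x \to 1} \frac{-\pi \sin \pi x}{2x - 2} \]
and since $\sin \pi = 0$, as is $2x - 2$ for $x = 1$, we use the rules again to find
\[ \lim_{x \to 1} \frac{1 + \cos \pi x}{x^2 - 2x + 1} = \lim_{x \to 1} \frac{-\pi \sin \pi x}{2x - 2} = \lim_{x \to 1} \frac{-\pi^2 \cos \pi x}{2} = \frac{\pi^2}{2}. \]
b) Note that for $t = 0$ we have $\cos 3t = \cos 2t = 1$, however $\ln 1 = 0$. Applying the rules of l'Hospital gives
\[ \lim_{t \to 0} \frac{\ln(\cos 3t)}{\ln(\cos 2t)} = \lim_{t \to 0} \frac{\frac{-3\sin 3t}{\cos 3t}}{\frac{-2\sin 2t}{\cos 2t}} = \lim_{t \to 0} \frac{3\sin 3t \cos 2t}{2\sin 2t \cos 3t} = \frac{3}{2}\lim_{t \to 0} \frac{\cos 2t}{\cos 3t}\,\lim_{t \to 0} \frac{\sin 3t}{\sin 2t} \]
\[ = \frac{3}{2}\lim_{t \to 0} \frac{\sin 3t}{\sin 2t} = \frac{3}{2}\lim_{t \to 0} \frac{3\cos 3t}{2\cos 2t} = \frac{3}{2} \cdot \frac{3}{2} = \frac{9}{4}, \]
where we used the rules once again in the final step.
c) It follows that
\[ \lim_{y \to \infty} \frac{3y^2 - y + 5}{5y^2 - 6y - 3} = \lim_{y \to \infty} \frac{6y - 1}{10y - 6} = \lim_{y \to \infty} \frac{6}{10} = \frac{3}{5}. \]
d) We first rewrite the term $\frac{1}{\sin^2 u} - \frac{1}{u^2}$ as:
\[ \frac{1}{\sin^2 u} - \frac{1}{u^2} = \frac{u^2 - \sin^2 u}{u^2 \sin^2 u} = \frac{u^2 - \sin^2 u}{u^4} \cdot \frac{u^2}{\sin^2 u}. \]
Next we note that
\[ \lim_{u \to 0} \frac{u^2}{\sin^2 u} = \left(\frac{1}{\lim_{u \to 0} \frac{\sin u}{u}}\right)^2 = 1 \]
and therefore it remains to find $\lim_{u \to 0} \frac{u^2 - \sin^2 u}{u^4}$. We have
\[ \lim_{u \to 0} \frac{u^2 - \sin^2 u}{u^4} = \lim_{u \to 0} \frac{2u - 2\sin u \cos u}{4u^3} = \lim_{u \to 0} \frac{2u - \sin 2u}{4u^3} = \lim_{u \to 0} \frac{2 - 2\cos 2u}{12u^2} = \lim_{u \to 0} \frac{4\sin 2u}{24u} = \frac{1}{3}\lim_{u \to 0} \frac{\sin 2u}{2u} = \frac{1}{3}. \]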
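The four limits can be approximated by evaluating each expression close to the limit point. The sketch below does this; the evaluation points (offsets of $10^{-3}$ and $10^{-2}$, and $y = 10^6$) are arbitrary choices, so the printed values agree with the limits only up to a small error.

```python
import math

def expr_a(x):
    return (1 + math.cos(math.pi * x)) / (x**2 - 2 * x + 1)

def expr_b(t):
    return math.log(math.cos(3 * t)) / math.log(math.cos(2 * t))

def expr_c(y):
    return (3 * y**2 - y + 5) / (5 * y**2 - 6 * y - 3)

def expr_d(u):
    return 1 / math.sin(u) ** 2 - 1 / u**2

approximations = {
    "a": (expr_a(1 + 1e-3), math.pi**2 / 2),
    "b": (expr_b(1e-3), 9 / 4),
    "c": (expr_c(1e6), 3 / 5),
    "d": (expr_d(1e-2), 1 / 3),
}
for part, (approx, limit) in approximations.items():
    print(part, approx, limit)
```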
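4. a) We claim that $x \mapsto f(x) = x^2$ is the asymptote for $g$ as $x \to \infty$. To show this we must prove that
\[ \lim_{x \to \infty} \frac{\ln\left(1 + x^2 + e^{x^2}\right)}{x^2} = 1. \]
Using the de l'Hospital rules we find
\[ \lim_{x \to \infty} \frac{\ln\left(1 + x^2 + e^{x^2}\right)}{x^2} = \lim_{x \to \infty} \frac{2x + 2xe^{x^2}}{2x\left(1 + x^2 + e^{x^2}\right)} = \lim_{x \to \infty} \frac{1 + e^{x^2}}{1 + x^2 + e^{x^2}} = \lim_{x \to \infty} \frac{e^{x^2}\left(e^{-x^2} + 1\right)}{e^{x^2}\left(e^{-x^2} + x^2 e^{-x^2} + 1\right)} = \lim_{x \to \infty} \frac{e^{-x^2} + 1}{e^{-x^2} + x^2 e^{-x^2} + 1} = 1. \]
b) Since $h$ is even we only need to consider the case $t \to \infty$. We guess, since $\lim_{t \to \infty} \frac{1}{1 + t^2} = 0$, that the asymptote is the function $t \mapsto f(t) = 1$.
Now we have
\[ \lim_{t \to \infty} \frac{e^{-\frac{1}{1+t^2}}}{1} = \lim_{t \to \infty} e^{-\frac{1}{1+t^2}} = e^{-\lim_{t \to \infty} \frac{1}{1+t^2}} = e^0 = 1. \]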
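5. a) In order for $f_1(x) = \frac{2x^2 + 12x - 2}{15(x^2 - 1)^{\frac{1}{2}}}$ to be defined we have to have $x^2 - 1 > 0$. If $x^2 - 1 = 0$ then we would be dividing by $0$; note that the numerator is not $0$ for $x = \pm 1$. If $x^2 - 1 < 0$ the square root is not defined. Hence the maximal domain $D_1$ of $f_1$ is $D_1 = \mathbb{R} \setminus [-1, 1]$.
On this domain $f_1$ is neither even nor odd and $f_1(x) = 0$ if and only if $x^2 + 6x - 1 = 0$, i.e. for
\[ x_{1,2} = -3 \pm \sqrt{9 + 1} = -3 \pm \sqrt{10}. \]
Since $3 < \sqrt{10} < 4$ the only root is $x_0 := -3 - \sqrt{10}$; the number $-3 + \sqrt{10}$ does not belong to $D_1$.
For $x > 1$ we have $2x^2 + 12x - 2 > 0$ and therefore
\[ \lim_{\substack{x \to 1 \\ x > 1}} \frac{2x^2 + 12x - 2}{15(x^2 - 1)^{\frac{1}{2}}} = +\infty. \]
For $x < -3 - \sqrt{10}$ we know that $2x^2 + 12x - 2 > 0$ but for $-3 - \sqrt{10} < x < -1$ we have $2x^2 + 12x - 2 < 0$, which implies that
\[ \lim_{\substack{x \to -1 \\ x < -1}} \frac{2x^2 + 12x - 2}{15(x^2 - 1)^{\frac{1}{2}}} = -\infty. \]
We claim that for $x \to \infty$ the asymptote is $x \mapsto \frac{2}{15}x$. Indeed we find
\[ \frac{\frac{2x^2 + 12x - 2}{15(x^2 - 1)^{\frac{1}{2}}}}{\frac{2}{15}x} = \frac{x^2 + 6x - 1}{x \cdot x\left(1 - \frac{1}{x^2}\right)^{\frac{1}{2}}} \]
and for $x \to \infty$ we have
\[ \lim_{x \to \infty} \frac{x^2 + 6x - 1}{x^2\left(1 - \frac{1}{x^2}\right)^{\frac{1}{2}}} = \lim_{x \to \infty} \frac{1 + \frac{6}{x} - \frac{1}{x^2}}{\left(1 - \frac{1}{x^2}\right)^{\frac{1}{2}}} = 1. \]
The asymptote for $x \to -\infty$ is the function $x \mapsto -\frac{2}{15}x$, since for $x < -1$
\[ \frac{\frac{2x^2 + 12x - 2}{15(x^2 - 1)^{\frac{1}{2}}}}{-\frac{2}{15}x} = \frac{x^2 + 6x - 1}{-x(-x)\left(1 - \frac{1}{x^2}\right)^{\frac{1}{2}}} = \frac{x^2 + 6x - 1}{x^2\left(1 - \frac{1}{x^2}\right)^{\frac{1}{2}}} \]
and the result follows as before.
To find local extreme values we consider
\[ \frac{d}{dx}\left(\frac{2x^2 + 12x - 2}{15(x^2 - 1)^{\frac{1}{2}}}\right) = \frac{2x^3 - 2x - 12}{15(x^2 - 1)^{\frac{3}{2}}}. \]
The condition $\frac{df_1}{dx}(x) = 0$ is equivalent to
\[ x^3 - x - 6 = 0. \]
One zero is easy to find: $x = 2$, indeed we have
\[ 2^3 - 2 - 6 = 0. \]
Polynomial division gives
\[ (x^3 - x - 6) : (x - 2) = x^2 + 2x + 3, \]
with the intermediate steps $x^3 - 2x^2$, then $2x^2 - x$ minus $2x^2 - 4x$, leaving $3x - 6$. Thus we have
\[ x^3 - x - 6 = (x - 2)(x^2 + 2x + 3). \]
The zeroes of $x^2 + 2x + 3$ are $x_{1,2} = -1 \pm \sqrt{1 - 3}$, hence they are not real and therefore only at $x = 2$ may we have a local extreme value.
We know that $\lim_{x \to 1, x > 1} f_1(x) = +\infty$ and $\lim_{x \to \infty} f_1(x) = +\infty$, therefore we have a local minimum at $x = 2$.
This can be checked by looking at $f_1''(2)$:
\[ f_1''(x) = \frac{d}{dx}\left(\frac{2x^3 - 2x - 12}{15(x^2 - 1)^{\frac{3}{2}}}\right) = \frac{(6x^2 - 2)(x^2 - 1) - 3x(2x^3 - 2x - 12)}{15(x^2 - 1)^{\frac{5}{2}}} = \frac{-2x^2 + 36x + 2}{15(x^2 - 1)^{\frac{5}{2}}} \]
and
\[ f_1''(2) = \frac{-2 \cdot 4 + 36 \cdot 2 + 2}{15 \cdot 3^{\frac{5}{2}}} = \frac{66}{135\sqrt{3}} = \frac{22}{45\sqrt{3}} > 0. \]
Note that $f_1(2) = \frac{2}{\sqrt{3}}$. Finally we sketch the graph of $f_1$.

[Figure: graph of $f_1$, horizontal axis from $-9$ to $9$, vertical axis down to $-6$.]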
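b) First we note that $\frac{s^2}{1+s^4} \in [0, 1)$ for all $s \in \mathbb{R}$ and therefore $s \mapsto \tan\left(\frac{s^2}{1+s^4}\right)$ is defined on $\mathbb{R}$. Secondly $s \mapsto \frac{s^2}{1+s^4}$ is even, implying that $s \mapsto \tan\left(\frac{s^2}{1+s^4}\right)$ is even. Consequently we can restrict our considerations to $s \in \mathbb{R}_+$. For $s_0 = 0$ it follows that $\frac{s_0^2}{1+s_0^4} = 0$, i.e. $\tan\left(\frac{s_0^2}{1+s_0^4}\right) = 0$, and $s_0$ is the only zero in $\mathbb{R}_+$. Also, since $\lim_{s \to \infty} \frac{s^2}{1+s^4} = 0$ it follows that $\lim_{s \to \infty} \tan\left(\frac{s^2}{1+s^4}\right) = 0$. Next we search for local extreme values and therefore we consider
\[ \frac{d}{ds}\tan\left(\frac{s^2}{1 + s^4}\right) = \frac{d}{ds}\left(\frac{s^2}{1 + s^4}\right) \cdot \frac{1}{\cos^2\left(\frac{s^2}{1+s^4}\right)} = \frac{2s - 2s^5}{(1 + s^4)^2} \cdot \frac{1}{\cos^2\left(\frac{s^2}{1+s^4}\right)}. \]
Thus $f_2'(s) = 0$ implies that $2s - 2s^5 = 0$, and since
\[ 2s - 2s^5 = 2s(1 - s^4) = 2s(1 - s^2)(1 + s^2) \]
the real zeroes are $s_0 = 0$, $s_1 = 1$, $s_2 = -1$, and due to symmetry we only need to investigate the function at $s_0 = 0$ and $s_1 = 1$.
We know that $\tan t \ge 0$ for $t \in \left[0, \frac{\pi}{2}\right)$, therefore the function $s \mapsto \tan\left(\frac{s^2}{1+s^4}\right)$ must have a minimum at $s_0 = 0$. Further, for $s \to \infty$ we know that $\lim_{s \to \infty} \tan\left(\frac{s^2}{1+s^4}\right) = 0$, hence we have a maximum at $s_1 = 1$, and by symmetry also at $s_2 = -1$. The value at $s_1 = 1$ is $\tan \frac{1}{2}$.

[Figure: sketch of the graph of $f_2$.]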
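c) Since $\operatorname{arsinh}$ is defined on $\mathbb{R}$ and so is $t \mapsto 1 - e^{-t^2}$, the maximal domain of $f_3$ is $D_3 = \mathbb{R}$. Further $t \mapsto 1 - e^{-t^2}$ is even, therefore $f_3$ is even too, hence we need to consider $f_3$ only on $[0, \infty)$.
The only zero of $\operatorname{arsinh}$ is $0$, and $1 - e^{-t^2} = 0$ only for $t = 0$, therefore $f_3$ has a zero at $0$. For $t \to \infty$ it follows that $e^{-t^2}$ tends to $0$, hence
\[ \lim_{t \to \infty} \operatorname{arsinh}\left(1 - e^{-t^2}\right) = \operatorname{arsinh}(1) = \ln\left(1 + \sqrt{2}\right). \]
We now look at the first derivative of $f_3$ to investigate local extreme values:
\[ \frac{d}{dt}\operatorname{arsinh}\left(1 - e^{-t^2}\right) = \frac{d}{dt}\left(1 - e^{-t^2}\right)(\operatorname{arsinh}')\left(1 - e^{-t^2}\right) = 2te^{-t^2}\frac{1}{\sqrt{1 + \left(1 - e^{-t^2}\right)^2}} > 0 \quad \text{for } t > 0, \]
hence on $[0, \infty)$ the function $f_3$ is strictly monotone increasing and the only local extreme value is at $0$, which is equal to
\[ \operatorname{arsinh}\left(1 - e^{-0}\right) = \operatorname{arsinh} 0 = 0. \]

[Figure: sketch of the graph of $f_3$ with the level $\ln\left(1 + \sqrt{2}\right)$ marked on the vertical axis.]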
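6. a) Using the definitions of cosh and sinh we find
\[ \cosh^2 x - \sinh^2 x = \left(\frac{e^x+e^{-x}}{2}\right)^2 - \left(\frac{e^x-e^{-x}}{2}\right)^2 = \frac{1}{4}\left(e^{2x}+2+e^{-2x}-e^{2x}+2-e^{-2x}\right) = 1. \]
b) Since $\coth x = \frac{\cosh x}{\sinh x}$ we have
\[ \frac{1}{\coth^2 x - 1} = \frac{1}{\frac{\cosh^2 x}{\sinh^2 x} - 1} = \frac{1}{\frac{\cosh^2 x}{\sinh^2 x} - \frac{\sinh^2 x}{\sinh^2 x}} = \frac{\sinh^2 x}{\cosh^2 x - \sinh^2 x} = \sinh^2 x, \]
where we used part a).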

c) We start with
\[ \sinh x\cosh y + \cosh x\sinh y = \frac{e^x-e^{-x}}{2}\cdot\frac{e^y+e^{-y}}{2} + \frac{e^x+e^{-x}}{2}\cdot\frac{e^y-e^{-y}}{2} \]
\[ = \frac{1}{4}\left(e^{x+y} - e^{-x+y} + e^{x-y} - e^{-(x+y)} + e^{x+y} + e^{-x+y} - e^{x-y} - e^{-(x+y)}\right) \]
\[ = \frac{1}{4}\left(2e^{x+y} - 2e^{-(x+y)}\right) = \frac{e^{x+y}-e^{-(x+y)}}{2} = \sinh(x+y). \]
Now for $-y$ instead of $y$ we find
\[ \sinh(x-y) = \sinh x\cosh(-y) + \cosh x\sinh(-y) = \sinh x\cosh y - \cosh x\sinh y, \]
where we used the fact that cosh is an even function and sinh is an odd function.
d) Recall that $\tanh x = \frac{\sinh x}{\cosh x}$ and therefore
\[ \frac{\tanh x - \tanh y}{1 - \tanh x\tanh y} = \frac{\frac{\sinh x}{\cosh x} - \frac{\sinh y}{\cosh y}}{1 - \frac{\sinh x\sinh y}{\cosh x\cosh y}} = \frac{\frac{\sinh x\cosh y - \sinh y\cosh x}{\cosh x\cosh y}}{\frac{\cosh x\cosh y - \sinh x\sinh y}{\cosh x\cosh y}} \]
\[ = \frac{\sinh x\cosh y - \sinh y\cosh x}{\cosh x\cosh y - \sinh x\sinh y} = \frac{\sinh(x-y)}{\cosh(x-y)} = \tanh(x-y). \]
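The identities of Problem 6 can be spot-checked numerically. The following is only a sanity check at sample points, not a proof; the function name `check_identities` and the chosen tolerance are illustrative assumptions.

```python
import math

def check_identities(x, y, tol=1e-9):
    """Return True if the identities of Problem 6 hold numerically at (x, y).

    Assumes x != 0 so that coth x is defined for part b).
    """
    # a) cosh^2 x - sinh^2 x = 1
    a_ok = math.isclose(math.cosh(x) ** 2 - math.sinh(x) ** 2, 1.0, rel_tol=tol)
    # b) 1 / (coth^2 x - 1) = sinh^2 x
    coth = math.cosh(x) / math.sinh(x)
    b_ok = math.isclose(1.0 / (coth ** 2 - 1.0), math.sinh(x) ** 2, rel_tol=1e-7)
    # c) sinh x cosh y + cosh x sinh y = sinh(x + y)
    c_val = math.sinh(x) * math.cosh(y) + math.cosh(x) * math.sinh(y)
    c_ok = math.isclose(c_val, math.sinh(x + y), rel_tol=tol, abs_tol=tol)
    # d) (tanh x - tanh y) / (1 - tanh x tanh y) = tanh(x - y)
    d_val = (math.tanh(x) - math.tanh(y)) / (1.0 - math.tanh(x) * math.tanh(y))
    d_ok = math.isclose(d_val, math.tanh(x - y), rel_tol=tol, abs_tol=tol)
    return a_ok and b_ok and c_ok and d_ok
```

For example, `check_identities(0.7, -1.3)` evaluates all four identities at one sample point.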
Chapter 12
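1. a) We use the formula
\[ (\ast)\qquad S(f, Z_n, \xi) = \sum_{k=1}^{n} f(\xi_k)(t_k - t_{k-1}) \]
and therefore with $t_k = 1 + \frac{k-1}{n}$, $k = 1, \dots, n+1$, we have
\[ t_k - t_{k-1} = 1 + \frac{k-1}{n} - 1 - \frac{k-2}{n} = \frac{1}{n} \]
and
\[ \xi_k = 1 + \frac{k-1}{n} + \frac{1}{2n} = 1 + \frac{2k-1}{2n}, \]
which gives, for $f(t) = 2t^2 - t$,
\[ S(f, Z_n, \xi) = \sum_{k=1}^{n} f(\xi_k)(t_k - t_{k-1}) = \frac{1}{n}\sum_{k=1}^{n} f\left(1 + \frac{2k-1}{2n}\right) = \frac{2}{n}\sum_{k=1}^{n}\left(1 + \frac{2k-1}{2n}\right)^2 - \frac{1}{n}\sum_{k=1}^{n}\left(1 + \frac{2k-1}{2n}\right). \]
Now we note that
\[ \frac{2}{n}\sum_{k=1}^{n}\left(1 + \frac{2k-1}{2n}\right)^2 = \frac{1}{2n^3}\sum_{k=1}^{n}(2n+2k-1)^2 = \frac{1}{2n^3}\left(\sum_{k=1}^{n}(4n^2-4n+1) + \sum_{k=1}^{n}(8n-4)k + \sum_{k=1}^{n}4k^2\right) \]
\[ = \frac{4n^2-4n+1}{2n^2} + \frac{(8n-4)n(n+1)}{4n^3} + \frac{2n(n+1)(2n+1)}{6n^3} = \frac{14}{3} - \frac{1}{6n^2} \]
and
\[ \frac{1}{n}\sum_{k=1}^{n}\left(1 + \frac{2k-1}{2n}\right) = \frac{1}{n}\sum_{k=1}^{n}1 + \frac{1}{n}\sum_{k=1}^{n}\frac{k}{n} - \frac{1}{n}\sum_{k=1}^{n}\frac{1}{2n} = \frac{1}{n}\cdot n + \frac{1}{n}\cdot\frac{n(n+1)}{2n} - \frac{1}{n}\cdot\frac{1}{2n}\cdot n = \frac{3}{2}, \]
therefore it follows that
\[ S(f, Z_n, \xi) = \frac{14}{3} - \frac{1}{6n^2} - \frac{3}{2} = \frac{19}{6} - \frac{1}{6n^2}. \]
Note that
\[ \int_1^2 (2t^2 - t)\,dt = \left[\frac{2}{3}t^3 - \frac{t^2}{2}\right]_1^2 = \frac{2}{3}\cdot 8 - \frac{4}{2} - \frac{2}{3} + \frac{1}{2} = \frac{32-12-4+3}{6} = \frac{19}{6}, \]
and the larger the $n$ the closer $\frac{1}{6n^2}$ is to $0$.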
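The Riemann sum of part a) can be evaluated exactly with rational arithmetic. The sketch below assumes the integrand $f(t) = 2t^2 - t$ on $[1, 2]$ and the midpoints $\xi_k = 1 + \frac{2k-1}{2n}$ from the solution; the helper names `f` and `midpoint_sum` are illustrative.

```python
from fractions import Fraction

def f(t):
    """Integrand of Problem 1 a): f(t) = 2 t^2 - t."""
    return 2 * t * t - t

def midpoint_sum(n):
    """Riemann sum over [1, 2] with n equal subintervals and
    intermediate points xi_k = 1 + (2k - 1) / (2n)."""
    total = Fraction(0)
    for k in range(1, n + 1):
        xi = 1 + Fraction(2 * k - 1, 2 * n)
        total += f(xi) * Fraction(1, n)
    return total

if __name__ == "__main__":
    exact = Fraction(19, 6)  # value of the integral of f over [1, 2]
    for n in (1, 2, 5, 10, 100):
        s = midpoint_sum(n)
        print(n, s, exact - s)  # the gap shrinks as n grows
```

Because `Fraction` is exact, the printed gap between the integral and the Riemann sum carries no rounding error.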
b) Again we wish to use $(\ast)$, therefore we note that
\[ t_k = \frac{a\left(m^2 - (k-1)^2\right) + (k-1)^2 b}{m^2}, \quad k = 1, \dots, m+1, \]
\[ t_{k+1} - t_k = \frac{2k-1}{m^2}(b-a), \quad k = 1, \dots, m, \]
\[ \xi_k = \frac{2}{3}t_k + \frac{1}{3}t_{k+1} = \frac{1}{3m^2}\left(\left(3m^2 - 3k^2 + 4k - 2\right)a + \left(3k^2 - 4k + 2\right)b\right), \quad k = 1, \dots, m, \]
and we find
\[ S(h, Z_m, \xi) = \sum_{k=1}^{m} h(\xi_k)(t_{k+1} - t_k) = \sum_{k=1}^{m}\frac{1}{1+\xi_k^2}\cdot\frac{2k-1}{m^2}(b-a) \]
\[ = 9m^2(b-a)\sum_{k=1}^{m}\frac{2k-1}{9m^4 + \left(\left(3m^2-3k^2+4k-2\right)a + \left(3k^2-4k+2\right)b\right)^2}, \]
an expression we do not wish to simplify further.
2. From the definition we find
\[ S\left(g|_{[a,t_k]}, Z_n|_{[a,t_k]}, \xi|_{[a,t_k]}\right) = \sum_{j=1}^{k} g(\xi_j)(t_j - t_{j-1}) \]
and
\[ S\left(g|_{[t_k,b]}, Z_n|_{[t_k,b]}, \xi|_{[t_k,b]}\right) = \sum_{j=k+1}^{n} g(\xi_j)(t_j - t_{j-1}) \]
which implies
\[ S(g, Z_n, \xi) = \sum_{j=1}^{n} g(\xi_j)(t_j - t_{j-1}) = \sum_{j=1}^{k} g(\xi_j)(t_j - t_{j-1}) + \sum_{j=k+1}^{n} g(\xi_j)(t_j - t_{j-1}) \]
\[ = S\left(g|_{[a,t_k]}, Z_n|_{[a,t_k]}, \xi|_{[a,t_k]}\right) + S\left(g|_{[t_k,b]}, Z_n|_{[t_k,b]}, \xi|_{[t_k,b]}\right). \]
It is clear from the following figure that such a result is true.
[Figure: graph of $f$ over $[a, b]$ with partition points $a = t_0 < t_1 < \dots < t_n = b$ and intermediate points $\xi_1, \dots, \xi_n$, indicating $[a, b] = [a, t_k] \cup [t_k, b]$.]
3. a) The figure in the problem shows the graph of $x \mapsto |x|$ on $[-2, 1]$ and indicates the two triangles. They are right-angled triangles and by denoting the distance between two points $A_1$, $A_2$ in the plane by $l(A_1, A_2)$ we find
\[ \operatorname{area}(ABC) = \frac{1}{2}\,l(A,B)\,l(A,C) = \frac{1}{2}\cdot 2\cdot 2 = 2 \]
and
\[ \operatorname{area}(BDE) = \frac{1}{2}\,l(B,D)\,l(D,E) = \frac{1}{2}\cdot 1\cdot 1 = \frac{1}{2}, \]
therefore we find
\[ \int_{-2}^{1} |x|\,dx = \frac{5}{2}. \]
b) The figure in the problem shows the graph of $r \mapsto g(r) = \sqrt{R^2 - r^2}$, $r \in [-R, R]$, and the area of the upper half disc with radius $R$ is $\frac{1}{2}\pi R^2$, hence we have
\[ \int_{-R}^{R} \sqrt{R^2 - r^2}\,dr = \frac{1}{2}\pi R^2. \]
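4. a)
\[ F'(x) = \frac{d}{dx}\left(\ln(\cosh x)\right) = \left(\frac{d}{dx}\cosh x\right)(\ln')(\cosh x) = \sinh x\cdot\frac{1}{\cosh x} = \tanh x, \]
therefore
\[ \int \tanh x\,dx = \ln\cosh x. \]
b)
\[ F'(s) = \frac{d}{ds}\,\frac{a^s}{\ln a} = \frac{d}{ds}\,\frac{e^{s\ln a}}{\ln a} = \frac{\ln a}{\ln a}\,e^{s\ln a} = a^s, \]
i.e.
\[ \int a^s\,ds = \frac{a^s}{\ln a}. \]
c)
\[ F'(u) = \frac{d}{du}\left(\frac{e^u}{26}(\sin 5u - 5\cos 5u)\right) = \frac{e^u}{26}(\sin 5u - 5\cos 5u) + \frac{e^u}{26}(5\cos 5u + 25\sin 5u) = \frac{26\,e^u\sin 5u}{26} = e^u\sin 5u, \]
and we find
\[ \int e^u\sin 5u\,du = \frac{e^u}{26}(\sin 5u - 5\cos 5u). \]
d)
\[ F'(r) = \frac{d}{dr}\left(-\frac{1}{2}\cos\left(r^2+4r-6\right)\right) = -\frac{1}{2}(2r+4)\left(-\sin\left(r^2+4r-6\right)\right) = (r+2)\sin\left(r^2+4r-6\right) \]
which gives
\[ \int (r+2)\sin\left(r^2+4r-6\right)dr = -\frac{1}{2}\cos\left(r^2+4r-6\right). \]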
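Each primitive in Problem 4 can be checked by differentiating it numerically and comparing with the integrand. The sketch below uses a central difference quotient; the base $a = 3$ in part b), the step size, and the names `derivative`, `PAIRS` and `max_error` are assumptions made for illustration only.

```python
import math

def derivative(F, x, h=1e-6):
    """Central difference approximation of F'(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)

BASE = 3.0  # sample value for the base a in part b) (an assumption)

# Each entry: (candidate primitive F, integrand F')
PAIRS = {
    "a": (lambda x: math.log(math.cosh(x)), lambda x: math.tanh(x)),
    "b": (lambda s: BASE ** s / math.log(BASE), lambda s: BASE ** s),
    "c": (lambda u: math.exp(u) / 26 * (math.sin(5 * u) - 5 * math.cos(5 * u)),
          lambda u: math.exp(u) * math.sin(5 * u)),
    "d": (lambda r: -0.5 * math.cos(r * r + 4 * r - 6),
          lambda r: (r + 2) * math.sin(r * r + 4 * r - 6)),
}

def max_error(points=(-1.0, 0.3, 1.2)):
    """Largest deviation |F'(x) - integrand(x)| per part over the sample points."""
    return {
        name: max(abs(derivative(F, x) - g(x)) for x in points)
        for name, (F, g) in PAIRS.items()
    }
```

Calling `max_error()` returns one small number per part; a large value would indicate a wrong primitive.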
Chapter 13
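1. We have
\[ \int_0^1 \sum_{k=1}^{n}(1+k^2)\,x^{\frac{1}{k^2}}\,dx = \sum_{k=1}^{n}\int_0^1 (1+k^2)\,x^{\frac{1}{k^2}}\,dx = \sum_{k=1}^{n}(1+k^2)\int_0^1 x^{\frac{1}{k^2}}\,dx \]
\[ = \sum_{k=1}^{n}(1+k^2)\left[\frac{1}{1+\frac{1}{k^2}}\,x^{1+\frac{1}{k^2}}\right]_0^1 = \sum_{k=1}^{n}(1+k^2)\left[\frac{k^2}{k^2+1}\,x^{1+\frac{1}{k^2}}\right]_0^1 = \sum_{k=1}^{n}k^2 = \frac{n(n+1)(2n+1)}{6}. \]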
2. a) We know that $|f| = f^+ + f^-$ and $f^+$, $f^-$ are integrable. Hence
\[ \int_a^b |f(x)|\,dx = \int_a^b \left(f^+(x) + f^-(x)\right)dx = \int_a^b f^+(x)\,dx + \int_a^b f^-(x)\,dx. \]
b) The function $t \mapsto M$ is integrable and $|f(t)| \le M$, therefore by monotonicity we have
\[ \int_a^b |f(t)|\,dt \le \int_a^b M\,dt = M(b-a). \]
Note that the monotonicity of the integral, i.e. the fact that $f \le g$ implies $\int_a^b f(t)\,dt \le \int_a^b g(t)\,dt$, is a simple consequence of the fact that the integral of a non-negative function is non-negative: $f \le g$ if and only if $g - f \ge 0$, hence $\int_a^b (g-f)(t)\,dt \ge 0$, implying $\int_a^b f(t)\,dt \le \int_a^b g(t)\,dt$.
c) If $h \ge 0$ is integrable on $[a, b]$ then $\int_a^b h(t)\,dt \ge 0$. Now $f'(x) \ge 0$ on $[-1, 0]$ implies that $f$ is monotone increasing on $[-1, 0]$ and since $f(-1) = 0$ it follows that $f(x) \ge 0$ for all $x \in [-1, 0]$, implying $\int_{-1}^{0} f(x)\,dx \ge 0$.

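3. The function $x \mapsto \left(1 + \frac{1}{1+x^2}\right)\sin x^3$ is odd:
\[ \left(1 + \frac{1}{1+(-x)^2}\right)\sin\left((-x)^3\right) = \left(1 + \frac{1}{1+x^2}\right)\sin\left(-x^3\right) = -\left(1 + \frac{1}{1+x^2}\right)\sin x^3. \]
Now note that for every odd function $f : [-a, a] \to \mathbb{R}$ we have $\int_{-a}^{a} f(t)\,dt = 0$. Indeed
\[ \int_{-a}^{a} f(t)\,dt = \int_0^a f(t)\,dt + \int_{-a}^{0} f(t)\,dt = \int_0^a f(t)\,dt - \int_{-a}^{0} f(-t)\,dt = \int_0^a f(t)\,dt - \int_a^0 f(s)\cdot(-1)\,ds \]
\[ = \int_0^a f(t)\,dt + \int_a^0 f(s)\,ds = \int_0^a f(t)\,dt - \int_0^a f(s)\,ds = 0. \]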
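4. Denote the Dirichlet kernel by $D_n$:
\[ D_n(t) := \begin{cases} \dfrac{\sin(2n+1)t}{\sin t}, & t \ne 0,\ t \in \left[-\frac{\pi}{2}, \frac{\pi}{2}\right] \\[2mm] 2n+1, & t = 0. \end{cases} \]
By Problem 3 in Chapter 9 we have
\[ \frac{1}{2}D_n(t) = \frac{1}{2} + \sum_{k=1}^{n}\cos(2kt) \]
implying
\[ \frac{2}{\pi}\int_0^{\frac{\pi}{2}} D_n(t)\,dt = \frac{2}{\pi}\int_0^{\frac{\pi}{2}}\left(1 + 2\sum_{k=1}^{n}\cos 2kt\right)dt = 1 + \frac{4}{\pi}\sum_{k=1}^{n}\int_0^{\frac{\pi}{2}}\cos 2kt\,dt \]
\[ = 1 + \frac{2}{\pi}\sum_{k=1}^{n}\frac{1}{k}\int_0^{k\pi}\cos s\,ds = 1 + \frac{2}{\pi}\sum_{k=1}^{n}\frac{1}{k}\,\sin s\Big|_0^{k\pi} = 1. \]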
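5. For $\alpha \ne 0$ and $\alpha t = s$ we have
\[ \int_a^b f(\alpha t)\,dt = \frac{1}{\alpha}\int_{\alpha a}^{\alpha b} f(s)\,ds \]
and further with $s = \alpha t + \beta$ we have
\[ \int_a^b f(\alpha t + \beta)\,dt = \frac{1}{\alpha}\int_{\alpha a+\beta}^{\alpha b+\beta} f(s)\,ds. \]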
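6. a) We have
\[ \int_0^{\frac{\pi}{4}} \vartheta\cos\vartheta\,d\vartheta = \vartheta\sin\vartheta\Big|_0^{\frac{\pi}{4}} - \int_0^{\frac{\pi}{4}}\sin\vartheta\,d\vartheta = \vartheta\sin\vartheta\Big|_0^{\frac{\pi}{4}} + \cos\vartheta\Big|_0^{\frac{\pi}{4}} = \frac{\pi}{4}\cdot\frac{1}{2}\sqrt{2} + \frac{1}{2}\sqrt{2} - 1 = \frac{1}{2}\sqrt{2}\left(1 + \frac{\pi}{4}\right) - 1. \]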
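b) We have
\[ \int_{\frac{1}{2}}^{2} x\ln(2x+1)\,dx = \frac{x^2}{2}\ln(2x+1)\Big|_{\frac{1}{2}}^{2} - \int_{\frac{1}{2}}^{2}\frac{x^2}{2}\cdot\frac{2}{2x+1}\,dx = 2\ln 5 - \frac{1}{8}\ln 2 - \int_{\frac{1}{2}}^{2}\frac{x^2}{2x+1}\,dx, \]
and we know
\[ (\ast)\qquad \frac{x^2}{2x+1} = \frac{1}{2}x - \frac{1}{4} + \frac{1}{8x+4}, \]
which is obtained by division: $x^2 = (2x+1)\left(\frac{1}{2}x - \frac{1}{4}\right) + \frac{1}{4}$. Thus we find
\[ \int_{\frac{1}{2}}^{2}\frac{x^2}{2x+1}\,dx = \int_{\frac{1}{2}}^{2}\left(\frac{1}{2}x - \frac{1}{4} + \frac{1}{8x+4}\right)dx = \frac{1}{4}x^2\Big|_{\frac{1}{2}}^{2} - \frac{1}{4}x\Big|_{\frac{1}{2}}^{2} + \frac{1}{8}\ln(8x+4)\Big|_{\frac{1}{2}}^{2} = \frac{9}{16} + \frac{1}{8}\ln 20 - \frac{1}{8}\ln 8, \]
which implies
\[ \int_{\frac{1}{2}}^{2} x\ln(2x+1)\,dx = 2\ln 5 - \frac{1}{8}\ln 2 - \frac{9}{16} - \frac{1}{8}\ln 20 + \frac{1}{8}\ln 8. \]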
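c) We have
\[ \int_0^{\frac{1}{m}} s\sinh ms\,ds = s\,\frac{\cosh(ms)}{m}\Big|_0^{\frac{1}{m}} - \frac{1}{m}\int_0^{\frac{1}{m}}\cosh ms\,ds = \frac{\cosh 1}{m^2} - \frac{1}{m^2}\sinh ms\Big|_0^{\frac{1}{m}} \]
\[ = \frac{\cosh 1 - \sinh 1}{m^2} = \frac{\left(e + e^{-1}\right) - \left(e - e^{-1}\right)}{2m^2} = \frac{1}{m^2 e}. \]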
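d) We have
\[ \int_1^3 \frac{\ln t}{\sqrt{t}}\,dt = \int_1^3 t^{-\frac{1}{2}}\ln t\,dt = 2t^{\frac{1}{2}}\ln t\Big|_1^3 - \int_1^3 2t^{\frac{1}{2}}\,\frac{1}{t}\,dt = 2\sqrt{3}\ln 3 - 2\int_1^3 t^{-\frac{1}{2}}\,dt \]
\[ = 2\sqrt{3}\ln 3 - 4t^{\frac{1}{2}}\Big|_1^3 = 4 + 2\sqrt{3}(\ln 3 - 2). \]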
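e) We have
\[ \int_0^{\pi} e^{2r}\sin 3r\,dr = \frac{1}{2}e^{2r}\sin 3r\Big|_0^{\pi} - \frac{3}{2}\int_0^{\pi} e^{2r}\cos 3r\,dr = -\frac{3}{2}\left(\frac{1}{2}e^{2r}\cos 3r\Big|_0^{\pi} + \frac{3}{2}\int_0^{\pi} e^{2r}\sin 3r\,dr\right) \]
or
\[ \left(1 + \frac{9}{4}\right)\int_0^{\pi} e^{2r}\sin 3r\,dr = -\frac{3}{4}\left(-e^{2\pi} - 1\right) \]
which gives
\[ \int_0^{\pi} e^{2r}\sin 3r\,dr = \frac{4}{13}\cdot\frac{3}{4}\left(e^{2\pi}+1\right) = \frac{3\left(e^{2\pi}+1\right)}{13}. \]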
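The five values obtained in Problem 6 can be compared with a numerical quadrature. The sketch below uses a composite Simpson rule; the sample value $m = 2$ in part c), the number of subintervals, and the names `simpson`, `CASES` and `discrepancies` are assumptions for illustration.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

M = 2.0  # sample value for the parameter m in part c) (an assumption)

# Each entry: (integrand, lower limit, upper limit, closed form from the solution)
CASES = {
    "a": (lambda t: t * math.cos(t), 0.0, math.pi / 4,
          math.sqrt(2) / 2 * (1 + math.pi / 4) - 1),
    "b": (lambda x: x * math.log(2 * x + 1), 0.5, 2.0,
          2 * math.log(5) - math.log(2) / 8 - 9 / 16
          - math.log(20) / 8 + math.log(8) / 8),
    "c": (lambda s: s * math.sinh(M * s), 0.0, 1 / M, 1 / (M * M * math.e)),
    "d": (lambda t: math.log(t) / math.sqrt(t), 1.0, 3.0,
          4 + 2 * math.sqrt(3) * (math.log(3) - 2)),
    "e": (lambda r: math.exp(2 * r) * math.sin(3 * r), 0.0, math.pi,
          3 * (math.exp(2 * math.pi) + 1) / 13),
}

def discrepancies():
    """Absolute difference between quadrature and closed form, per part."""
    return {k: abs(simpson(f, a, b) - exact) for k, (f, a, b, exact) in CASES.items()}
```

A small discrepancy in every part supports the closed forms; it does not replace the integration by parts argument.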
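7. Performing integration by parts twice gives
\[ \frac{1}{\pi}\int_{-\pi}^{\pi}(\cos nx)(\cos mx)\,dx = \frac{1}{\pi m}(\cos nx)(\sin mx)\Big|_{-\pi}^{\pi} + \frac{1}{\pi}\,\frac{n}{m}\int_{-\pi}^{\pi}(\sin nx)(\sin mx)\,dx = \frac{1}{\pi}\,\frac{n}{m}\int_{-\pi}^{\pi}(\sin nx)(\sin mx)\,dx \]
\[ = -\frac{1}{\pi}\,\frac{n}{m^2}(\sin nx)(\cos mx)\Big|_{-\pi}^{\pi} + \frac{1}{\pi}\,\frac{n^2}{m^2}\int_{-\pi}^{\pi}(\cos nx)(\cos mx)\,dx, \]
or
\[ (\ast)\qquad \frac{1}{\pi}\left(1 - \frac{n^2}{m^2}\right)\int_{-\pi}^{\pi}\cos nx\cos mx\,dx = 0. \]
For $n \ne m$ it follows from $(\ast)$ that
\[ \frac{1}{\pi}\int_{-\pi}^{\pi}(\cos nx)(\cos mx)\,dx = 0. \]
In the case where $n = m$ we find by noting
\[ \frac{1}{\pi}\int_{-\pi}^{\pi}\cos^2(nx)\,dx = \frac{1}{\pi}\int_{-\pi}^{\pi}\sin^2(nx)\,dx \]
that
\[ 2 = \frac{1}{\pi}\int_{-\pi}^{\pi}\left(\cos^2 nx + \sin^2 nx\right)dx = \frac{2}{\pi}\int_{-\pi}^{\pi}\cos^2 nx\,dx, \]
which gives
\[ \frac{1}{\pi}\int_{-\pi}^{\pi}\cos^2 nx\,dx = 1. \]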
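Furthermore, for $n, m \in \mathbb{N}$ we have
\[ \frac{1}{\pi}\int_{-\pi}^{\pi}(\sin nx)(\cos mx)\,dx = \frac{1}{\pi m}\sin nx\sin mx\Big|_{-\pi}^{\pi} - \frac{1}{\pi}\,\frac{n}{m}\int_{-\pi}^{\pi}(\cos nx)(\sin mx)\,dx = -\frac{1}{\pi}\,\frac{n}{m}\int_{-\pi}^{\pi}(\cos nx)(\sin mx)\,dx \]
\[ = \frac{n}{\pi m^2}(\cos nx)(\cos mx)\Big|_{-\pi}^{\pi} + \frac{1}{\pi}\,\frac{n^2}{m^2}\int_{-\pi}^{\pi}(\sin nx)(\cos mx)\,dx = \frac{1}{\pi}\,\frac{n^2}{m^2}\int_{-\pi}^{\pi}(\sin nx)(\cos mx)\,dx, \]
where the boundary term vanishes since cos is even. Hence
\[ \frac{1}{\pi}\left(1 - \frac{n^2}{m^2}\right)\int_{-\pi}^{\pi}(\sin nx)(\cos mx)\,dx = 0, \]
which gives the result for $n \ne m$. For $n = m$ the integrand $x \mapsto \sin nx\cos nx$ is odd, so the integral vanishes by Problem 3. In both cases it follows that
\[ \frac{1}{\pi}\int_{-\pi}^{\pi}(\sin nx)(\cos mx)\,dx = 0. \]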
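8. a) Integration by parts gives
\[ \int x^2 e^{\lambda x}\,dx = \frac{x^2 e^{\lambda x}}{\lambda} - \frac{2}{\lambda}\int x e^{\lambda x}\,dx = \frac{x^2 e^{\lambda x}}{\lambda} - \frac{2x e^{\lambda x}}{\lambda^2} + \frac{2}{\lambda^2}\int e^{\lambda x}\,dx = \left(\frac{x^2}{\lambda} - \frac{2x}{\lambda^2} + \frac{2}{\lambda^3}\right)e^{\lambda x}. \]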
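b) First, it is clear that the case $a = 0$ gives
\[ \int \frac{dt}{at^2+bt+c} = \int \frac{dt}{bt+c} = \frac{1}{b}\ln|bt+c|. \]
Therefore let $a \ne 0$ and set $D = b^2 - 4ac$. We consider the three cases $D > 0$, $D = 0$, and $D < 0$.
$D > 0$. It follows that
\[ at^2+bt+c = a\left(\left(t + \frac{b}{2a}\right)^2 - \left(\frac{\sqrt{D}}{2a}\right)^2\right) = a\left(t + \frac{b+\sqrt{D}}{2a}\right)\left(t + \frac{b-\sqrt{D}}{2a}\right). \]
Over any interval which does not contain $t_1$, $t_2$, $t_{1,2} = \frac{-b\pm\sqrt{D}}{2a}$, the function $t \mapsto \frac{1}{at^2+bt+c}$ is integrable and we have
\[ \frac{1}{at^2+bt+c} = \frac{1}{a\left(t + \frac{b+\sqrt{D}}{2a}\right)\left(t + \frac{b-\sqrt{D}}{2a}\right)} = \frac{1}{a}\left(\frac{\frac{a}{\sqrt{D}}}{t + \frac{b-\sqrt{D}}{2a}} - \frac{\frac{a}{\sqrt{D}}}{t + \frac{b+\sqrt{D}}{2a}}\right) = \frac{1}{\sqrt{D}}\left(\frac{1}{t + \frac{b-\sqrt{D}}{2a}} - \frac{1}{t + \frac{b+\sqrt{D}}{2a}}\right), \]
which implies
\[ \int \frac{dt}{at^2+bt+c} = \frac{1}{\sqrt{D}}\int \frac{dt}{t + \frac{b-\sqrt{D}}{2a}} - \frac{1}{\sqrt{D}}\int \frac{dt}{t + \frac{b+\sqrt{D}}{2a}} = \frac{1}{\sqrt{D}}\ln\left|\frac{2at + b - \sqrt{D}}{2at + b + \sqrt{D}}\right|. \]
$D = 0$. In this case we have
\[ at^2+bt+c = a\left(t + \frac{b}{2a}\right)^2 \]
and consequently
\[ \int \frac{dt}{at^2+bt+c} = \frac{1}{a}\int \frac{dt}{\left(t + \frac{b}{2a}\right)^2} = -\frac{1}{a}\cdot\frac{1}{t + \frac{b}{2a}} = \frac{-2}{2at+b}. \]
$D < 0$. Now we find
\[ at^2+bt+c = a\left(\left(t + \frac{b}{2a}\right)^2 + \left(\frac{\sqrt{|D|}}{2a}\right)^2\right) \]
which has no real zero and therefore
\[ \int \frac{dt}{at^2+bt+c} = \frac{1}{a}\int \frac{dt}{\left(t + \frac{b}{2a}\right)^2 + \left(\frac{\sqrt{|D|}}{2a}\right)^2}. \]
Since $\int \frac{dx}{x^2+\alpha^2} = \frac{1}{\alpha}\arctan\frac{x}{\alpha}$ it follows by the substitution $t + \frac{b}{2a} = y$ that
\[ \int \frac{dt}{at^2+bt+c} = \frac{1}{a}\cdot\frac{2a}{\sqrt{|D|}}\arctan\left(\frac{t + \frac{b}{2a}}{\frac{\sqrt{|D|}}{2a}}\right) = \frac{2}{\sqrt{|D|}}\arctan\frac{2at+b}{\sqrt{|D|}}. \]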
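The case distinction of Problem 8 b) translates directly into a small function. The following is a sketch under stated assumptions: the name `primitive` is illustrative, the comparison `D == 0` is only meaningful for exactly representable coefficients, and the returned function must be evaluated away from the real zeroes of $at^2 + bt + c$.

```python
import math

def primitive(a, b, c):
    """Return a primitive of t -> 1 / (a t^2 + b t + c) as a Python function,
    following the cases a = 0, D > 0, D = 0, D < 0 with D = b^2 - 4ac."""
    if a == 0:
        return lambda t: math.log(abs(b * t + c)) / b
    D = b * b - 4 * a * c
    if D > 0:
        r = math.sqrt(D)
        return lambda t: math.log(abs((2 * a * t + b - r) / (2 * a * t + b + r))) / r
    if D == 0:
        return lambda t: -2.0 / (2 * a * t + b)
    r = math.sqrt(-D)
    return lambda t: 2.0 / r * math.atan((2 * a * t + b) / r)

def check(a, b, c, t, h=1e-6):
    """Compare a central difference quotient of the primitive with the integrand."""
    F = primitive(a, b, c)
    numeric = (F(t + h) - F(t - h)) / (2 * h)
    return abs(numeric - 1.0 / (a * t * t + b * t + c))
```

For instance, `check(1, 0, -1, 3.0)` tests the case $D > 0$ at $t = 3$, away from the zeroes $\pm 1$.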
9. We have
\[ \int_0^a g(t)\,dt = \int_c^{a+c} g(s-c)\,ds = \int_c^{a+c} g(s)\,ds. \]
10. In general, we will first determine a primitive and then evaluate it at the boundary points of the integral.
a) With the change of variable $y = \ln x$ we find that $\frac{dy}{dx} = \frac{1}{x}$, $x = e^y$, and consequently
\[ \int \frac{dx}{x(\ln x)^3} = \int \frac{dy}{y^3} = \frac{-1}{2y^2} = \frac{-1}{2(\ln x)^2} \]
which gives
\[ \int_e^{e^2} \frac{dx}{x(\ln x)^3} = \frac{-1}{2(\ln x)^2}\Big|_e^{e^2} = \frac{-1}{2(\ln e^2)^2} + \frac{1}{2(\ln e)^2} = -\frac{1}{8} + \frac{1}{2} = \frac{3}{8}. \]
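b) With the change of variable $\tan\frac{t}{2} = s$ we find that $\cos t = \cos^2\frac{t}{2} - \sin^2\frac{t}{2} = \frac{1-s^2}{1+s^2}$ and $\frac{dt}{ds} = \frac{2}{1+s^2}$, thus we find
\[ \int \frac{dt}{5 + 3\cos t} = \int \frac{2}{5 + 3\,\frac{1-s^2}{1+s^2}}\cdot\frac{ds}{1+s^2} = \int \frac{2}{5 + 5s^2 + 3 - 3s^2}\,ds = \int \frac{1}{s^2+4}\,ds = \frac{1}{2}\arctan\frac{s}{2} = \frac{1}{2}\arctan\left(\frac{1}{2}\tan\frac{t}{2}\right). \]
Thus we find
\[ \int_{\frac{\pi}{3}}^{\frac{\pi}{2}} \frac{dt}{5 + 3\cos t} = \frac{1}{2}\arctan\left(\frac{1}{2}\tan\frac{t}{2}\right)\Big|_{\frac{\pi}{3}}^{\frac{\pi}{2}} = \frac{1}{2}\arctan\left(\frac{1}{2}\tan\frac{\pi}{4}\right) - \frac{1}{2}\arctan\left(\frac{1}{2}\tan\frac{\pi}{6}\right) = \frac{1}{2}\arctan\frac{1}{2} - \frac{1}{2}\arctan\frac{1}{2\sqrt{3}}. \]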
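c) With $\arcsin y^2 = \nu$ we find that $\frac{d\nu}{dy} = \frac{2y}{(1-y^4)^{\frac{1}{2}}}$ giving
\[ \int \frac{y\arcsin y^2}{\sqrt{1-y^4}}\,dy = \frac{1}{2}\int \nu\,d\nu = \frac{\nu^2}{4} = \frac{1}{4}\left(\arcsin y^2\right)^2 \]
and therefore
\[ \int_0^{\frac{1}{\sqrt{2}}} \frac{y\arcsin y^2}{\sqrt{1-y^4}}\,dy = \frac{1}{4}\left(\arcsin y^2\right)^2\Big|_0^{\frac{1}{\sqrt{2}}} = \frac{1}{4}\left(\arcsin\frac{1}{2}\right)^2 = \frac{1}{4}\left(\frac{\pi}{6}\right)^2 = \frac{\pi^2}{144}. \]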
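d) We have
\[ \int \frac{ds}{\sqrt{5 - 4s - s^2}} = \int \frac{ds}{\sqrt{9 - (s+2)^2}} = \frac{1}{3}\int \frac{ds}{\sqrt{1 - \left(\frac{s+2}{3}\right)^2}} \]
and the change of variable $\frac{s+2}{3} = y$ gives
\[ \int \frac{ds}{\sqrt{5 - 4s - s^2}} = \int \frac{dy}{\sqrt{1-y^2}} = \arcsin y = \arcsin\frac{s+2}{3} \]
implying
\[ \int_{\frac{1}{2}}^{1} \frac{ds}{\sqrt{5 - 4s - s^2}} = \arcsin\frac{s+2}{3}\Big|_{\frac{1}{2}}^{1} = \arcsin 1 - \arcsin\frac{5}{6} = \frac{\pi}{2} - \arcsin\frac{5}{6}. \]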
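e) With $x = \sinh t$ we find $\frac{dx}{dt} = \cosh t$ and further
\[ \int \frac{dx}{(1+x^2)^{\frac{3}{2}}} = \int \frac{\cosh t\,dt}{(1+\sinh^2 t)^{\frac{3}{2}}} = \int \frac{\cosh t}{(\cosh t)^3}\,dt = \tanh t = \frac{\sinh t}{\sqrt{1+\sinh^2 t}} = \frac{x}{\sqrt{1+x^2}}. \]
Therefore we get
\[ \int_1^4 \frac{dx}{(1+x^2)^{\frac{3}{2}}} = \frac{x}{\sqrt{1+x^2}}\Big|_1^4 = \frac{4}{\sqrt{17}} - \frac{1}{\sqrt{2}}. \]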
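11. a) To find a primitive of $t \mapsto 3^{\sqrt{2t+1}}$ we use the change of variable $y = \sqrt{2t+1}$, i.e. $2t + 1 = y^2$ and $\frac{dt}{dy} = y$, to get
\[ \int 3^{\sqrt{2t+1}}\,dt = \int 3^y\,y\,dy, \]
now we use integration by parts:
\[ \int 3^y\,y\,dy = \frac{y\cdot 3^y}{\ln 3} - \int \frac{3^y}{\ln 3}\,dy = \frac{y\cdot 3^y}{\ln 3} - \frac{3^y}{(\ln 3)^2}. \]
Thus we arrive at
\[ \int_0^4 3^{\sqrt{2t+1}}\,dt = \left[\frac{\sqrt{2t+1}\;3^{\sqrt{2t+1}}}{\ln 3} - \frac{3^{\sqrt{2t+1}}}{(\ln 3)^2}\right]_0^4 = \frac{3\cdot 3^3}{\ln 3} - \frac{1\cdot 3^1}{\ln 3} - \frac{3^3}{(\ln 3)^2} + \frac{3}{(\ln 3)^2} = \frac{78}{\ln 3} - \frac{24}{(\ln 3)^2}. \]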
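b) The substitution $x = \pi - y$ gives
\[ I = \int_0^{\pi} \frac{x\sin x}{1+\cos^2 x}\,dx = \int_0^{\pi} \frac{(\pi-y)\sin y}{1+\cos^2 y}\,dy = \pi\int_0^{\pi}\frac{\sin y}{1+\cos^2 y}\,dy - \int_0^{\pi}\frac{y\sin y}{1+\cos^2 y}\,dy \]
\[ = -\pi\int_0^{\pi}\frac{(\cos')(y)}{1+\cos^2 y}\,dy - I = -\pi\arctan(\cos y)\Big|_0^{\pi} - I = \frac{\pi^2}{2} - I, \]
or
\[ I = \frac{\pi^2}{4}. \]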
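12. First we note
\[ x^4 - x = x\left(x^3 - 1\right) = x(x-1)\left(x^2+x+1\right). \]
We wish to find $a, b, c, d \in \mathbb{R}$ such that
\[ \frac{x+1}{x^4-x} = \frac{x+1}{x(x-1)(x^2+x+1)} = \frac{a}{x} + \frac{b}{x-1} + \frac{cx+d}{x^2+x+1} = \frac{x^3(a+b+c) + x^2(b+d-c) + x(b-d) - a}{x(x-1)(x^2+x+1)}, \]
i.e.
\[ a+b+c = 0, \qquad b+d-c = 0, \qquad b-d = 1, \qquad -a = 1, \]
which gives $a = -1$, $b = \frac{2}{3}$, $c = \frac{1}{3}$, $d = -\frac{1}{3}$. Hence we have
\[ \int \frac{x+1}{x^4-x}\,dx = -\int \frac{1}{x}\,dx + \frac{2}{3}\int \frac{1}{x-1}\,dx + \frac{1}{3}\int \frac{x-1}{x^2+x+1}\,dx = -\ln|x| + \frac{2}{3}\ln|x-1| + \frac{1}{3}\int \frac{x-1}{x^2+x+1}\,dx, \]
and further
\[ \int \frac{x-1}{x^2+x+1}\,dx = \frac{1}{2}\int \frac{2x-2}{x^2+x+1}\,dx = \frac{1}{2}\int \frac{2x+1}{x^2+x+1}\,dx - \frac{1}{2}\int \frac{3}{x^2+x+1}\,dx \]
\[ = \frac{1}{2}\ln\left(x^2+x+1\right) - \frac{3}{2}\cdot\frac{2}{\sqrt{3}}\arctan\frac{2x+1}{\sqrt{3}} = \frac{1}{2}\ln\left(x^2+x+1\right) - \sqrt{3}\arctan\frac{2x+1}{\sqrt{3}}, \]
where we use the solution to Problem 8 b) for $D < 0$.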
13. This formula follows by iterating integration by parts:
\[ \int_a^b f(t)g^{(3)}(t)\,dt = fg''\Big|_a^b - \int_a^b f'g^{(2)}\,dt = fg''\Big|_a^b - \left(f'g'\Big|_a^b - \int_a^b f''(t)g'(t)\,dt\right) \]
\[ = fg''\Big|_a^b - f'g'\Big|_a^b + \int_a^b f''(t)g'(t)\,dt = fg''\Big|_a^b - f'g'\Big|_a^b + f''g\Big|_a^b - \int_a^b f^{(3)}(t)g(t)\,dt. \]
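14. Note that $\frac{d}{ds}\sqrt{g(s)} = \frac{g'(s)}{2\sqrt{g(s)}}$ and therefore we find
\[ \int \frac{g'(s)}{\sqrt{g(s)}}\,ds = 2\sqrt{g(s)}. \]
In the case of $\int_{\frac{\pi}{6}}^{\frac{\pi}{2}} \frac{\cos r}{\sqrt{\sin r}}\,dr$ we now find
\[ \int_{\frac{\pi}{6}}^{\frac{\pi}{2}} \frac{\cos r}{\sqrt{\sin r}}\,dr = 2\sqrt{\sin r}\,\Big|_{\frac{\pi}{6}}^{\frac{\pi}{2}} = 2 - \sqrt{2}. \]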
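15. Integration by parts gives
\[ \int_{-\pi}^{\pi} f(t)\cos nt\,dt = \frac{1}{n}(\sin nt)f(t)\Big|_{-\pi}^{\pi} - \frac{1}{n}\int_{-\pi}^{\pi} f'(t)\sin nt\,dt \]
which implies
\[ \left|\int_{-\pi}^{\pi} f(t)\cos nt\,dt\right| = \frac{1}{n}\left|\int_{-\pi}^{\pi} f'(t)\sin nt\,dt\right| \le \frac{1}{n}\int_{-\pi}^{\pi} M\,dt = \frac{2\pi M}{n}, \]
where we used the fact that $f'(t)\sin nt \le |f'(t)\sin nt| \le M$.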
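16. We have
\[ \int_1^x t^{-n}\,dt = \frac{1}{-n+1}\,t^{-n+1}\Big|_1^x = \frac{1}{n-1} - \frac{1}{n-1}\cdot\frac{1}{x^{n-1}} \]
which implies
\[ \lim_{x\to\infty}\int_1^x t^{-n}\,dt = \frac{1}{n-1} - \lim_{x\to\infty}\frac{1}{n-1}\cdot\frac{1}{x^{n-1}} = \frac{1}{n-1}. \]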
Solutions to Problems of Part 2

Chapter 14
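1. For $x, y \in \mathbb{R}$ it follows that $\frac{x}{2}$, $\frac{y}{2}$ and $\frac{x+y}{2}$ belong to $\mathbb{R}$ too and $x < y$ implies $\frac{x}{2} < \frac{y}{2}$.
Therefore we find $x = \frac{x}{2} + \frac{x}{2} < \frac{x}{2} + \frac{y}{2} = \frac{x+y}{2} = \frac{x}{2} + \frac{y}{2} < \frac{y}{2} + \frac{y}{2} = y$.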
2. a) By definition x < 0 if and only if 0 > x and this is equivalent to 0 − x > 0

or −x > 0.
b) For x > 0 it follows that x2 > 0 from (14.12). If x < 0 then −x > 0, hence
(−x)2 > 0 which gives x2 > 0.
c) Since y − x > 0 and −a > 0 it follows that −a(y − x) > 0 or ax − ay > 0
which implies ax > ay.
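3. Given $x, y \in \mathbb{Q}$, $x > 0$, $y > 0$ we need to find an $n \in \mathbb{N}$ such that $nx > y$. Let $x = \frac{p_1}{q_1}$ and $y = \frac{p_2}{q_2}$ with $p_1, p_2, q_1, q_2 \in \mathbb{N}$. With $r = q_1 q_2$, $p = p_1 q_2$, $q = q_1 p_2$ we find $x = \frac{p}{r}$ and $y = \frac{q}{r}$. Now $p \ge 1$ and therefore $p\cdot q \ge q$, and $r \ge 1$ implies $q\cdot r \ge q$ or $q \ge \frac{q}{r}$. Then with $\tilde{n} = r\cdot q$ we obtain
\[ \tilde{n}x = rq\,\frac{p}{r} = pq \ge q \ge \frac{q}{r} = y \]
and therefore with $n = \tilde{n} + 1$ we have $nx > y$.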
4. Suppose that for $a = \frac{p}{q} \in \mathbb{Q}$, $p \in \mathbb{Z}$ and $q \in \mathbb{N}$ having no common divisor, we have $a^2 = 3$, i.e. $\frac{p^2}{q^2} = 3$ or $p^2 = 3q^2$. Then 3 must divide $p$, say $p = 3r$. Therefore we have $9r^2 = 3q^2$ or $3r^2 = q^2$ and it follows that 3 also divides $q$, which is a contradiction.
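5. Taking $x = \frac{1}{n}$ in Bernoulli's inequality we find
\[ \left(1 + \frac{1}{n}\right)^n \ge 1 + n\,\frac{1}{n} = 2 \]
or
\[ \left(\frac{n}{n} + \frac{1}{n}\right)^n = \frac{(n+1)^n}{n^n} \ge 2 \]
implying
\[ (\ast)\qquad 2n^n \le (n+1)^n. \]
We now use induction to prove
\[ n! \le 2\left(\frac{n}{2}\right)^n. \]
For $n = 1$ we have
\[ 1! \le 2\left(\frac{1}{2}\right)^1 = 1 \]
which is correct. Next we find using $(\ast)$ that
\[ (n+1)! = n!\,(n+1) \le 2\left(\frac{n}{2}\right)^n(n+1) = \frac{1}{2^n}\cdot 2n^n(n+1) \le \frac{1}{2^n}(n+1)^n(n+1) = \frac{2}{2^{n+1}}(n+1)^{n+1} = 2\left(\frac{n+1}{2}\right)^{n+1} \]
proving the assertion.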
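The two inequalities of Problem 5 can be verified exactly for small $n$ with integer and rational arithmetic. This is a finite check, not a substitute for the induction; the function names are illustrative assumptions.

```python
from fractions import Fraction
from math import factorial

def star_holds(n):
    """Inequality (*): 2 n^n <= (n + 1)^n, checked with exact integers."""
    return 2 * n ** n <= (n + 1) ** n

def factorial_bound_holds(n):
    """Claim of Problem 5: n! <= 2 (n / 2)^n, checked with exact rationals."""
    return factorial(n) <= 2 * Fraction(n, 2) ** n

if __name__ == "__main__":
    for n in range(1, 11):
        print(n, star_holds(n), factorial_bound_holds(n))
```

Equality occurs for $n = 1$ and $n = 2$, which is why the comparison uses `<=` rather than `<`.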
6. First note that for $x_k = x$ for all $k = 1, \dots, n$ we find Bernoulli's inequality. For $n = 1$ the statement is trivial. Now if it holds for $n$ then we consider the case $n + 1$:
\[ \prod_{k=1}^{n+1}(1+x_k) = \prod_{k=1}^{n}(1+x_k)\,(1+x_{n+1}) = \prod_{k=1}^{n}(1+x_k) + x_{n+1}\prod_{k=1}^{n}(1+x_k) \ge 1 + \sum_{k=1}^{n}x_k + x_{n+1} = 1 + \sum_{k=1}^{n+1}x_k. \]
7. Suppose that
\[ (\ast)\qquad (a_1\cdot\ldots\cdot a_n)^{\frac{1}{n}} \le \frac{a_1 + \dots + a_n}{n} \]
holds and that $n \ge 2$. If $0 < x \le 1 - \frac{1}{n}$ then $x^n > 0 \ge 1 + n(x-1)$, or with $x = 1 + y$ we have $(1+y)^n \ge 1 + ny$.
Thus we need to prove the case $n \ge 2$ and $x > 1 - \frac{1}{n}$. In this case $1 + n(x-1) > 0$ and we may apply $(\ast)$ to $a_1 = 1 + n(x-1)$ and $a_2 = \dots = a_n = 1$ to find
\[ x^n = \left(\frac{1 + n(x-1) + 1 + \dots + 1}{n}\right)^n \ge (1 + n(x-1))\cdot 1\cdot\ldots\cdot 1 = 1 + n(x-1) \]
or again with $x = 1 + y$: $(1+y)^n \ge 1 + ny$.

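8. We take $a_1 = \dots = a_n = 1 + \frac{x}{n}$ and $a_{n+1} = \dots = a_m = 1$. Note that $a_n > 0$ is equivalent to $-x < n$. Now we find
\[ G_m = (a_1\cdot\ldots\cdot a_m)^{\frac{1}{m}} = \left(1 + \frac{x}{n}\right)^{\frac{n}{m}} \le \frac{1}{m}\sum_{j=1}^{m}a_j = \frac{1}{m}\left(n\left(1 + \frac{x}{n}\right) + (m-n)\cdot 1\right) = 1 + \frac{x}{m}, \]
which implies
\[ \left(1 + \frac{x}{n}\right)^n \le \left(1 + \frac{x}{m}\right)^m \quad\text{for } -x < n < m. \]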
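9. The Cauchy-Schwarz inequality yields
\[ \left|\sum_{k=1}^{n}a_k\right| \le \sum_{k=1}^{n}|a_k| = \sum_{k=1}^{n}1\cdot|a_k| \le \left(\sum_{k=1}^{n}1^2\right)^{\frac{1}{2}}\left(\sum_{k=1}^{n}a_k^2\right)^{\frac{1}{2}} = \sqrt{n}\left(\sum_{k=1}^{n}a_k^2\right)^{\frac{1}{2}}. \]
Now, from the above calculations we derive
\[ \frac{1}{\sqrt{n}}\sum_{k=1}^{n}|a_k| \le \left(\sum_{k=1}^{n}a_k^2\right)^{\frac{1}{2}}, \]
and since
\[ \sum_{k=1}^{n}a_k^2 \le n\cdot\max\{a_1^2, \dots, a_n^2\} = n\cdot\max\{|a_1|^2, \dots, |a_n|^2\} = n\cdot\left(\max\{|a_1|, \dots, |a_n|\}\right)^2 \]
we find
\[ \left(\sum_{k=1}^{n}a_k^2\right)^{\frac{1}{2}} \le \sqrt{n}\,\max\{|a_1|, \dots, |a_n|\}. \]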
Chapter 15
1. Let g : N → M be a bijective mapping which must exist since M is countable.

Define ak := f (g(k)), k ∈ N then {ak |ak = f (g(k))} = {f (m)|m ∈ M }.
2. a) We need to prove that for every $\varepsilon > 0$ there exists $N(\varepsilon) \in \mathbb{N}$ such that $n > N(\varepsilon)$ implies $|a_n - a| < \varepsilon$. For $n \ge M$ it follows however that
\[ |a_n - a| = |a - a| = 0 < \varepsilon \]
implying $\lim_{n\to\infty} a_n = a$.
b) Since $\lim_{n\to\infty} a_n = a$, for $\varepsilon > 0$ there exists $N_1(\varepsilon) \in \mathbb{N}$ such that $n \ge N_1(\varepsilon)$ implies $|a_n - a| < \varepsilon$. Therefore, for $n \ge N(\varepsilon) = \max\{N_1(\varepsilon), M\}$ we have $|b_n - a| = |a_n - a| < \varepsilon$.
Both problems show that the first $M$ ($M$ could be very large) elements of a sequence do not have any effect on the limit.
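3. Let $|b_n| \le M$ for all $n \ge k$ and given $\varepsilon > 0$ choose $N(\varepsilon) \in \mathbb{N}$, $N(\varepsilon) \ge k$, such that $n \ge N(\varepsilon)$ implies $|a_n| < \frac{\varepsilon}{M}$. It follows for $n \ge N(\varepsilon)$ that
\[ |b_n a_n| = |b_n||a_n| \le M|a_n| < M\,\frac{\varepsilon}{M} = \varepsilon, \]
i.e.
\[ \lim_{n\to\infty}|b_n a_n| = 0. \]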
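4. a) Since by assumption $a_n - a \le c_n - a \le b_n - a$ we find for all $n \ge k$ that
\[ |c_n - a| \le \max\{|a_n - a|, |b_n - a|\}. \]
Now, since $\lim_{n\to\infty} a_n = \lim_{n\to\infty} b_n = a$, given $\varepsilon > 0$ there exist $N_1(\varepsilon), N_2(\varepsilon) \in \mathbb{N}$ such that $n \ge N_1(\varepsilon)$ implies $|a_n - a| < \varepsilon$ and $n \ge N_2(\varepsilon)$ implies $|b_n - a| < \varepsilon$. Hence for $n \ge \max\{N_1(\varepsilon), N_2(\varepsilon)\}$, we deduce that $\max\{|a_n - a|, |b_n - a|\} < \varepsilon$, which implies for $n \ge \max\{N_1(\varepsilon), N_2(\varepsilon), k\}$ that $|c_n - a| < \varepsilon$.
b) We may look at $(c_n)_{n\in\mathbb{N}}$, $c_n = (-1)^n$, which is a divergent sequence as we know by Example 15.5.C. However $-1 \le c_n \le 1$ for all $n \in \mathbb{N}$ and the sequence $(a_n)_{n\in\mathbb{N}}$, $a_n = -1$ for all $n \in \mathbb{N}$, has limit $a = -1$. Also the sequence $(b_n)_{n\in\mathbb{N}}$, $b_n = 1$ for all $n \in \mathbb{N}$, has limit $b = 1$. Thus $a < b$, $a_n \le c_n \le b_n$ but $(c_n)_{n\in\mathbb{N}}$ does not converge.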
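5. a) Here we will use the converse triangle inequality
\[ \big||x| - |y|\big| \le |x - y| \quad\text{for } x, y \in \mathbb{R}. \]
Since $\lim_{n\to\infty} a_n = a$ yields for $\varepsilon > 0$ the existence of $N(\varepsilon) \in \mathbb{N}$ such that $n \ge N(\varepsilon)$ implies $|a_n - a| < \varepsilon$, for these $n$, i.e. $n \ge N(\varepsilon)$, it follows that
\[ \big||a_n| - |a|\big| \le |a_n - a| < \varepsilon, \]
i.e.
\[ \lim_{n\to\infty}|a_n| = |a|. \]
Since $\lim_{n\to\infty} a_n = a$ is equivalent to $\lim_{n\to\infty}(a_n - a) = 0$, we deduce that $\lim_{n\to\infty} a_n = a$ implies $\lim_{n\to\infty}|a_n - a| = 0$. That $\lim_{n\to\infty}|a_n - a| = 0$ implies $\lim_{n\to\infty} a_n = a$ follows from the definition.
b) Given $\varepsilon > 0$ there exists $N(\varepsilon) \in \mathbb{N}$ such that $n \ge N(\varepsilon)$ implies $\mu_n < \varepsilon$ or $|a_n - a| \le \mu_n < \varepsilon$, implying $\lim_{n\to\infty} a_n = a$.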
6. We know (compare with Lemma 2.7) that
\[ \max\{a, b\} = \frac{1}{2}(a + b + |a-b|) \quad\text{and}\quad \min\{a, b\} = \frac{1}{2}(a + b - |a-b|). \]
Therefore we find
\[ \max\{a_n, b_n\} = \frac{1}{2}(a_n + b_n + |a_n - b_n|) \quad\text{and}\quad \min\{a_n, b_n\} = \frac{1}{2}(a_n + b_n - |a_n - b_n|), \]
or with $c_n := a_n + b_n$, $d_n := a_n - b_n$
\[ \max\{a_n, b_n\} = \frac{1}{2}(c_n + |d_n|) \quad\text{and}\quad \min\{a_n, b_n\} = \frac{1}{2}(c_n - |d_n|). \]
Since $\lim_{n\to\infty} c_n = a + b$ and $\lim_{n\to\infty}|d_n| = |a - b|$, where for the last limit we have used Problem 5 a), it follows that
\[ \lim_{n\to\infty}\max\{a_n, b_n\} = \frac{1}{2}(a + b + |a-b|) = \max\{a, b\} \]
and
\[ \lim_{n\to\infty}\min\{a_n, b_n\} = \frac{1}{2}(a + b - |a-b|) = \min\{a, b\}. \]

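7. a) First we observe that $\left|\frac{5}{n+6} - 0\right| = \frac{5}{n+6} < \frac{5}{n}$. Now, given $\varepsilon > 0$ we choose $N(\varepsilon) \in \mathbb{N}$, $N(\varepsilon) > \frac{5}{\varepsilon}$, to find for $n \ge N(\varepsilon)$ that
\[ \left|\frac{5}{n+6} - 0\right| = \frac{5}{n+6} < \frac{5}{n} < \varepsilon, \]
which implies $\lim_{n\to\infty}\frac{5}{n+6} = 0$.
b) We have
\[ \left|\frac{4n}{3n+2} - \frac{4}{3}\right| = \left|\frac{3\cdot 4n - 4(3n+2)}{3(3n+2)}\right| = \left|\frac{-8}{9n+6}\right| = \frac{8}{9n+6} < \frac{8}{9n}. \]
Therefore, if $\frac{8}{9n} < \frac{1}{1000}$, i.e. $n > \frac{8000}{9}$, it follows that
\[ \left|\frac{4n}{3n+2} - \frac{4}{3}\right| < \frac{1}{1000}. \]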
8. a) Given > 0 for n > + 1 it follows that n < . For these n we find
1 1
nk ≤ n1 < , i.e. lim k = 0.
n→∞ n
631
1
b) Since lim = 0, given > 0 we find N () ∈ N such that n ≥ N () implies
n→∞ n
1 k
n < , implying for these n that

1
1 − 0 = 11 < ,
nk nk
1
which proves lim 1 = 0.
n→∞ nk
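9. a)
\[ \lim_{n\to\infty}\frac{(n+1)^2 - n^2}{n} = \lim_{n\to\infty}\frac{n^2 + 2n + 1 - n^2}{n} = \lim_{n\to\infty}\frac{2n+1}{n} = \lim_{n\to\infty}\left(2 + \frac{1}{n}\right) = 2. \]
b)
\[ \lim_{n\to\infty}\left(\sqrt{n+1} - \sqrt{n}\right) = \lim_{n\to\infty}\frac{\left(\sqrt{n+1} - \sqrt{n}\right)\left(\sqrt{n+1} + \sqrt{n}\right)}{\sqrt{n+1} + \sqrt{n}} = \lim_{n\to\infty}\frac{n+1-n}{\sqrt{n+1} + \sqrt{n}} = \lim_{n\to\infty}\frac{1}{\sqrt{n+1} + \sqrt{n}} = 0, \]
where the latter follows from $\frac{1}{\sqrt{n+1}+\sqrt{n}} \le \frac{1}{\sqrt{n}}$ and $\lim_{n\to\infty}\frac{1}{\sqrt{n}} = 0$.
c)
\[ \lim_{n\to\infty}\frac{\sum_{j=1}^{n} j}{n^2} = \lim_{n\to\infty}\frac{\frac{n(n+1)}{2}}{n^2} = \lim_{n\to\infty}\frac{n^2+n}{2n^2} = \lim_{n\to\infty}\left(\frac{1}{2} + \frac{1}{2n}\right) = \frac{1}{2}. \]
d)
\[ \lim_{n\to\infty}\frac{\sum_{j=1}^{n} j^2}{n^3} = \lim_{n\to\infty}\frac{\frac{n(n+1)(2n+1)}{6}}{n^3} = \lim_{n\to\infty}\frac{2n^3 + 3n^2 + n}{6n^3} = \lim_{n\to\infty}\left(\frac{1}{3} + \frac{1}{2n} + \frac{1}{6n^2}\right) = \frac{1}{3}. \]
e)
\[ \lim_{n\to\infty}\frac{1 + 2\cdot 3^n}{5 + 4\cdot 3^n} = \lim_{n\to\infty}\frac{\frac{1}{3^n} + 2}{\frac{5}{3^n} + 4} = \lim_{n\to\infty}\frac{\left(\frac{1}{3}\right)^n + 2}{5\left(\frac{1}{3}\right)^n + 4} = \frac{\lim_{n\to\infty}\left(\frac{1}{3}\right)^n + 2}{\lim_{n\to\infty}5\left(\frac{1}{3}\right)^n + 4} = \frac{2}{4} = \frac{1}{2}. \]
f)
\[ \lim_{n\to\infty}\frac{n + 4^n}{5^n} = \lim_{n\to\infty}\frac{n}{5^n} + \lim_{n\to\infty}\left(\frac{4}{5}\right)^n = 0 \]
since $\lim_{n\to\infty} q^n = 0$ for $|q| < 1$, and $\frac{n}{5^n} \le \frac{n}{2^n}$ and $\lim_{n\to\infty}\frac{n}{2^n} = 0$ by Example 15.5.E.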
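10. The case $a = 1$ is trivial. Let $a > 1$, then $a^{\frac{1}{n}} = 1 + b_n$ for some $b_n > 0$ and further
\[ a = (1 + b_n)^n = \sum_{k=0}^{n}\binom{n}{k}b_n^k > 1 + \binom{n}{1}b_n = 1 + nb_n \]
or
\[ 0 < b_n < \frac{a-1}{n}, \]
implying $\lim_{n\to\infty} b_n = 0$. Therefore
\[ \lim_{n\to\infty}\sqrt[n]{a} = \lim_{n\to\infty}a^{\frac{1}{n}} = \lim_{n\to\infty}(1 + b_n) = 1. \]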
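11. We first note that
\[ \prod_{j=1}^{n}\left(1 - \frac{1}{j+1}\right) = \prod_{j=1}^{n}\frac{j+1-1}{j+1} = \prod_{j=1}^{n}\frac{j}{j+1} = \frac{\prod_{j=1}^{n} j}{\prod_{j=1}^{n}(j+1)} = \frac{1\cdot 2\cdot\ldots\cdot n}{2\cdot 3\cdot\ldots\cdot n\cdot(n+1)} = \frac{1}{n+1}, \]
and now we conclude that
\[ \lim_{n\to\infty}\prod_{j=1}^{n}\left(1 - \frac{1}{j+1}\right) = \lim_{n\to\infty}\frac{1}{n+1} = 0. \]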
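12. Since every polynomial has at most a finite number of real zeroes and since we are only interested in limits we may assume in the following that for $\nu \ge K$ none of the polynomials under consideration has a root. Next we consider
\[ \sum_{k=0}^{n}a_k\nu^k = a_n\nu^n + \nu^n\sum_{k=0}^{n-1}a_k\nu^{k-n} \quad\text{and}\quad \sum_{l=0}^{m}b_l\nu^l = b_m\nu^m + \nu^m\sum_{l=0}^{m-1}b_l\nu^{l-m} \]
which gives
\[ \frac{\sum_{k=0}^{n}a_k\nu^k}{\sum_{l=0}^{m}b_l\nu^l} = \frac{a_n\nu^n + \nu^n\sum_{k=0}^{n-1}a_k\nu^{k-n}}{b_m\nu^m + \nu^m\sum_{l=0}^{m-1}b_l\nu^{l-m}}. \]
If $n = m$ then it follows that
\[ \frac{\sum_{k=0}^{n}a_k\nu^k}{\sum_{l=0}^{n}b_l\nu^l} = \frac{a_n + \sum_{k=0}^{n-1}a_k\nu^{k-n}}{b_n + \sum_{l=0}^{n-1}b_l\nu^{l-n}} \]
and since $\lim_{\nu\to\infty}\nu^{k-n} = 0$ for $k < n$, we deduce for $a_n \ne 0$, $b_m \ne 0$ that
\[ \lim_{\nu\to\infty}\frac{\sum_{k=0}^{n}a_k\nu^k}{\sum_{l=0}^{n}b_l\nu^l} = \frac{a_n}{b_n}. \]
Now if $m > n$ we find
\[ \frac{\sum_{k=0}^{n}a_k\nu^k}{\sum_{l=0}^{m}b_l\nu^l} = \frac{a_n + \sum_{k=0}^{n-1}a_k\nu^{k-n}}{\nu^{m-n}\left(b_m + \sum_{l=0}^{m-1}b_l\nu^{l-m}\right)} = \frac{1}{\nu^{m-n}}\cdot\frac{a_n + \sum_{k=0}^{n-1}a_k\nu^{k-n}}{b_m + \sum_{l=0}^{m-1}b_l\nu^{l-m}}, \]
and we note
\[ \lim_{\nu\to\infty}\frac{a_n + \sum_{k=0}^{n-1}a_k\nu^{k-n}}{b_m + \sum_{l=0}^{m-1}b_l\nu^{l-m}} = \frac{a_n}{b_m} \]
as well as $\lim_{\nu\to\infty}\frac{1}{\nu^{m-n}} = 0$ for $m > n$, hence
\[ \lim_{\nu\to\infty}\frac{\sum_{k=0}^{n}a_k\nu^k}{\sum_{l=0}^{m}b_l\nu^l} = 0 \quad\text{for } m > n. \]
In the case where $n > m$ we find
\[ \frac{\sum_{k=0}^{n}a_k\nu^k}{\sum_{l=0}^{m}b_l\nu^l} = \nu^{n-m}\,\frac{a_n + \sum_{k=0}^{n-1}a_k\nu^{k-n}}{b_m + \sum_{l=0}^{m-1}b_l\nu^{l-m}}. \]
We know that
\[ \lim_{\nu\to\infty}\frac{a_n + \sum_{k=0}^{n-1}a_k\nu^{k-n}}{b_m + \sum_{l=0}^{m-1}b_l\nu^{l-m}} = \frac{a_n}{b_m} \]
and $a_n \ne 0$ by assumption. Thus there exists $N_0 \in \mathbb{N}$ such that $\nu \ge N_0$ implies
\[ -\left|\frac{a_n}{2b_m}\right| \le \frac{a_n + \sum_{k=0}^{n-1}a_k\nu^{k-n}}{b_m + \sum_{l=0}^{m-1}b_l\nu^{l-m}} - \frac{a_n}{b_m} \le \left|\frac{a_n}{2b_m}\right|. \]
If $\frac{a_n}{b_m} > 0$ then we find
\[ \frac{a_n}{2b_m} \le \frac{a_n + \sum_{k=0}^{n-1}a_k\nu^{k-n}}{b_m + \sum_{l=0}^{m-1}b_l\nu^{l-m}} \le \frac{3a_n}{2b_m} \]
implying
\[ \frac{a_n}{2b_m}\,\nu^{n-m} \le \frac{\sum_{k=0}^{n}a_k\nu^k}{\sum_{l=0}^{m}b_l\nu^l} \le \frac{3a_n}{2b_m}\,\nu^{n-m} \]
which yields that $\frac{\sum_{k=0}^{n}a_k\nu^k}{\sum_{l=0}^{m}b_l\nu^l}$ is unbounded and therefore must diverge to $+\infty$.
If $\frac{a_n}{b_m} < 0$ then we consider
\[ \frac{-\sum_{k=0}^{n}a_k\nu^k}{\sum_{l=0}^{m}b_l\nu^l} \]
which must tend to $+\infty$ as $\nu \to +\infty$, and hence
\[ \frac{\sum_{k=0}^{n}a_k\nu^k}{\sum_{l=0}^{m}b_l\nu^l} \]
tends to $-\infty$ as $\nu \to \infty$.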
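13. With $b_n := a_n - a$ we have to prove that if $\lim_{j\to\infty} b_j = 0$ then $\lim_{n\to\infty}\frac{\sum_{j=1}^{n} b_j}{n} = 0$. For $m < n$ we have
\[ \frac{b_1 + \dots + b_n}{n} = \frac{b_1 + \dots + b_m}{n} + \frac{b_{m+1} + b_{m+2} + \dots + b_n}{n} \]
and therefore
\[ \left|\frac{b_1 + \dots + b_n}{n}\right| \le \frac{|b_1 + \dots + b_m|}{n} + \frac{|b_{m+1}| + \dots + |b_n|}{n}. \]
Since $\lim_{j\to\infty} b_j = 0$, given $\varepsilon > 0$ we can find $m \in \mathbb{N}$ such that $|b_j| < \frac{\varepsilon}{2}$ for $j > m$, which implies
\[ \frac{|b_{m+1}| + \dots + |b_n|}{n} \le \frac{n-m}{n}\cdot\frac{\varepsilon}{2} < \frac{\varepsilon}{2}. \]
Now we can choose $N$ such that for $n > N > m$ it follows that
\[ \frac{|b_1 + \dots + b_m|}{n} < \frac{\varepsilon}{2} \]
which eventually yields for $n > N > m$ that
\[ \frac{|b_1 + \dots + b_n|}{n} < \varepsilon, \]
i.e.
\[ \lim_{n\to\infty}\frac{|b_1 + \dots + b_n|}{n} = 0. \]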
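14. Given $x_0$ we take $x_n = x_0 + \frac{1}{n}$ and we deduce that for $\varepsilon > 0$ there exists $\delta > 0$ such that if $|x_n - x_0| = \frac{1}{n} < \delta$ then
\[ \left|\frac{f(x_n) - f(x_0)}{x_n - x_0} - A\right| < \varepsilon. \]
However
\[ \frac{f(x_n) - f(x_0)}{x_n - x_0} - A = \frac{f\left(x_0 + \frac{1}{n}\right) - f(x_0)}{\frac{1}{n}} - A = n\left(f\left(x_0 + \frac{1}{n}\right) - f(x_0)\right) - A. \]
Thus given $\varepsilon > 0$ we can find $N(\varepsilon) \in \mathbb{N}$ such that $n > N(\varepsilon) \ge \frac{1}{\delta}$ implies
\[ \left|n\left(f\left(x_0 + \frac{1}{n}\right) - f(x_0)\right) - A\right| < \varepsilon, \]
i.e.
\[ \lim_{n\to\infty}\left(n\left(f\left(x_0 + \frac{1}{n}\right) - f(x_0)\right) - A\right) = 0. \]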
Chapter 16
1. Given a sequence of partial sums $(s_n)_{n\in\mathbb{N}}$ we find the corresponding sequence $(a_n)_{n\in\mathbb{N}}$ with $s_n = \sum_{k=1}^{n} a_k$ by $a_n = s_n - s_{n-1}$, hence
\[ a_n = \frac{n(n+1)(2n+1)}{6} - \frac{(n-1)n(2(n-1)+1)}{6} = \frac{n}{6}\left(2n^2 + 3n + 1 - (n-1)(2n-1)\right) \]
\[ = \frac{n}{6}\left(2n^2 + 3n + 1 - 2n^2 + 2n + n - 1\right) = \frac{n}{6}\cdot 6n = n^2. \]
Indeed we already know that $\sum_{k=1}^{n} k^2 = \frac{n(n+1)(2n+1)}{6}$.
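2. From $a_n \le b_n$ we deduce $\sum_{k=1}^{n} a_k \le \sum_{k=1}^{n} b_k$ which implies
\[ \sum_{k=1}^{\infty} a_k = \lim_{n\to\infty}\sum_{k=1}^{n} a_k \le \lim_{n\to\infty}\sum_{k=1}^{n} b_k = \sum_{k=1}^{\infty} b_k. \]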
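3.
\[ \sum_{k=1}^{\infty}\frac{1}{4k^2-1} = \lim_{n\to\infty}\sum_{k=1}^{n}\frac{1}{4k^2-1} = \lim_{n\to\infty}\frac{1}{2}\sum_{k=1}^{n}\left(\frac{1}{2k-1} - \frac{1}{2k+1}\right) \]
\[ = \lim_{n\to\infty}\frac{1}{2}\left(1 + \sum_{k=2}^{n}\frac{1}{2k-1} - \sum_{k=1}^{n-1}\frac{1}{2k+1} - \frac{1}{2n+1}\right) = \frac{1}{2}\lim_{n\to\infty}\left(1 + \sum_{k=1}^{n-1}\frac{1}{2k+1} - \sum_{k=1}^{n-1}\frac{1}{2k+1} - \frac{1}{2n+1}\right) \]
\[ = \frac{1}{2}\lim_{n\to\infty}\left(1 - \frac{1}{2n+1}\right) = \frac{1}{2}. \]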
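4. a) This is the geometric series with $q = -\frac{1}{5}$. Since $|q| < 1$ we have
\[ \sum_{k=0}^{\infty}\frac{(-1)^k}{5^k} = \frac{1}{1 - \left(-\frac{1}{5}\right)} = \frac{1}{1 + \frac{1}{5}} = \frac{5}{6}. \]
b) Note that $e^{-nx} = (e^{-x})^n$ and for $x > 0$ we know that $0 < e^{-x} < 1$, hence
\[ \sum_{n=0}^{\infty} e^{-nx} = \frac{1}{1 - e^{-x}}. \]
c) We have
\[ \sum_{k=2}^{\infty}\left(\frac{4}{7}\right)^k = \sum_{k=0}^{\infty}\left(\frac{4}{7}\right)^k - 1 - \frac{4}{7} = \frac{1}{1 - \frac{4}{7}} - \frac{11}{7} = \frac{7}{3} - \frac{11}{7} = \frac{16}{21}. \]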

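5. The condition for the convergence of the geometric series is $\left|\frac{1}{y-2}\right| < 1$ or $1 < |y - 2|$. Thus if $y > 3$ or $y < 1$ the series $\sum_{k=0}^{\infty}\frac{1}{(y-2)^k}$ converges with limit
\[ \sum_{k=0}^{\infty}\frac{1}{(y-2)^k} = \frac{1}{1 - \frac{1}{y-2}} = \frac{y-2}{y-3}. \]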
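The value of the geometric series in Problem 5 can be compared with partial sums for a few admissible $y$. In this sketch the closed form is obtained from the geometric series formula $\frac{1}{1-q}$ with $q = \frac{1}{y-2}$; the function names and the number of terms are illustrative assumptions.

```python
def partial_sum(y, terms=200):
    """Partial sum of sum_{k >= 0} 1 / (y - 2)^k with the given number of terms."""
    q = 1.0 / (y - 2.0)
    return sum(q ** k for k in range(terms))

def closed_form(y):
    """Geometric series value 1 / (1 - q) with q = 1 / (y - 2), valid for |y - 2| > 1."""
    return (y - 2.0) / (y - 3.0)

if __name__ == "__main__":
    for y in (4.0, 5.5, -1.0):
        print(y, partial_sum(y), closed_form(y))
```

For $y = 4$ the ratio is $q = \frac{1}{2}$, so the partial sums approach $2$.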
6. We only need to note that
\[ s_n := \sum_{k=1}^{n}(a_k - a_{k-1}) = (a_1 - a_0) + (a_2 - a_1) + \dots + (a_n - a_{n-1}) = a_n - a_0 \]
and
\[ \tilde{s}_n := \sum_{k=1}^{n}(a_k - a_{k+1}) = (a_1 - a_2) + (a_2 - a_3) + \dots + (a_n - a_{n+1}) = a_1 - a_{n+1}. \]
Now we find
\[ \sum_{k=1}^{\infty}(a_k - a_{k-1}) = \lim_{n\to\infty} s_n = \lim_{n\to\infty}(a_n - a_0) = \lim_{n\to\infty} a_n - a_0 \]
and
\[ \sum_{k=1}^{\infty}(a_k - a_{k+1}) = \lim_{n\to\infty}\tilde{s}_n = \lim_{n\to\infty}(a_1 - a_{n+1}) = a_1 - \lim_{n\to\infty} a_n. \]
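7. a) We know that
\[ \sum_{k=0}^{\infty}\frac{1}{2^k} = \frac{1}{1 - \frac{1}{2}} = 2 \]
and
\[ \sum_{k=0}^{\infty}\frac{(-1)^k}{3^k} = \sum_{k=0}^{\infty}\left(-\frac{1}{3}\right)^k = \frac{1}{1 + \frac{1}{3}} = \frac{3}{4} \]
which implies
\[ \sum_{k=0}^{\infty}\left(\frac{1}{2^k} + \frac{(-1)^k}{3^k}\right) = \frac{11}{4}. \]
b) First we note that
\[ \ln\left(1 - \frac{1}{k^2}\right) = \ln\frac{k^2-1}{k^2} = \ln(k+1) + \ln(k-1) - 2\ln k = (\ln(k+1) - \ln k) - (\ln k - \ln(k-1)). \]
Thus
\[ \sum_{k=2}^{\infty}\ln\left(1 - \frac{1}{k^2}\right) = \sum_{k=2}^{\infty}\big((\ln(k+1) - \ln k) - (\ln k - \ln(k-1))\big) \]
is a telescopic series with respect to $a_k = \ln k - \ln(k-1)$. Therefore, by using a straightforward modification of Problem 6 we get
\[ \sum_{k=2}^{\infty}\ln\left(1 - \frac{1}{k^2}\right) = \lim_{k\to\infty}(\ln(k+1) - \ln k) - (\ln 2 - \ln 1) = \lim_{k\to\infty}\ln\left(1 + \frac{1}{k}\right) + \ln\frac{1}{2} = \ln\frac{1}{2}, \]
where we used the fact that $\lim_{k\to\infty}\ln\left(1 + \frac{1}{k}\right) = 0$.
c) The series $\sum_{k=1}^{\infty}\frac{1}{(2k-1)^2}$ sums up all reciprocals of squares of odd natural numbers, $\sum_{k=1}^{\infty}\frac{1}{(2k)^2}$ sums up all reciprocals of squares of even natural numbers, thus
\[ \sum_{k=1}^{\infty}\frac{1}{k^2} = \sum_{k=1}^{\infty}\frac{1}{(2k)^2} + \sum_{k=1}^{\infty}\frac{1}{(2k-1)^2} = \frac{1}{4}\sum_{k=1}^{\infty}\frac{1}{k^2} + \sum_{k=1}^{\infty}\frac{1}{(2k-1)^2} \]
or
\[ \frac{3}{4}\sum_{k=1}^{\infty}\frac{1}{k^2} = \sum_{k=1}^{\infty}\frac{1}{(2k-1)^2}, \]
i.e.
\[ \sum_{k=1}^{\infty}\frac{1}{(2k-1)^2} = \frac{3A}{4}. \]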
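8. a) For $a_n = \frac{n^3 + 2n^2 - 2}{15n^2 + n}$ it follows that
\[ a_n = \frac{n^3\left(1 + \frac{2}{n} - \frac{2}{n^3}\right)}{n^2\left(15 + \frac{1}{n}\right)} = n\,\frac{1 + \frac{2}{n} - \frac{2}{n^3}}{15 + \frac{1}{n}} \ge \frac{n}{16}, \]
where we used that $1 + \frac{2}{n} - \frac{2}{n^3} \ge 1$ and $15 + \frac{1}{n} \le 16$. Now, given $K \in \mathbb{R}$ we only have to choose $N > 16K$ to have that $n \ge N$ implies
\[ a_n \ge \frac{n}{16} \ge \frac{N}{16} > K. \]
b) Using the hint we find
\[ 0 < \sin\frac{1}{n} \le \frac{1}{n} \quad\text{or}\quad n \le \frac{1}{\sin\frac{1}{n}}. \]
Now, given $K > 0$ choose $N \in \mathbb{N}$ such that $N > K$. Then it follows for $n \ge N$ that
\[ \frac{1}{\sin\frac{1}{n}} \ge n \ge N \ge K, \]
hence $\lim_{n\to\infty}\frac{1}{\sin\frac{1}{n}} = +\infty$.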
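9. a) Take $a_n = n^2$ and $b_n = \frac{1}{n}$, then $\lim_{n\to\infty} a_n = \lim_{n\to\infty} n^2 = \infty$ and $\lim_{n\to\infty} b_n = \lim_{n\to\infty}\frac{1}{n} = 0$. Moreover $a_n\cdot b_n = n^2\cdot\frac{1}{n} = n$ and $\lim_{n\to\infty}(a_n b_n) = \lim_{n\to\infty} n = +\infty$.
b) Take $a_n = -n^2$ and $b_n = \frac{1}{n}$ and we find $\lim_{n\to\infty} a_n = -\infty$, $\lim_{n\to\infty} b_n = 0$ and $\lim_{n\to\infty}(a_n b_n) = \lim_{n\to\infty}\left(-n^2\,\frac{1}{n}\right) = \lim_{n\to\infty}(-n) = -\infty$.
c) Take $a_n = n$ and $b_n = \frac{c}{n}$, then $\lim_{n\to\infty} a_n = \infty$, $\lim_{n\to\infty} b_n = 0$ and $\lim_{n\to\infty}(a_n b_n) = \lim_{n\to\infty}\left(n\cdot\frac{c}{n}\right) = c$.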
Chapter 17
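1. a) Since
\[ s_{2n} - s_n = \sum_{j=n+1}^{2n}\frac{1}{j} = \frac{1}{n+1} + \dots + \frac{1}{2n} \ge \frac{n}{2n} = \frac{1}{2} \]
the first part follows in a straightforward way. However now the second part is trivial: $(s_n)_{n\in\mathbb{N}}$ is not a Cauchy sequence, hence has no limit.
b) We will prove that
\[ (\ast)\qquad 0 < s_{n+m} - s_n \le \frac{1}{n+1} \]
which implies that $(s_n)_{n\in\mathbb{N}}$ is a Cauchy sequence: given $\varepsilon > 0$ take $N > \frac{1}{\varepsilon}$ to find for $n \ge N$ that $|s_{n+m} - s_n| < \varepsilon$, compare with Remark 17.2.
Next we prove $(\ast)$. If $m$ is even we find
\[ s_{n+m} - s_n = \left(\frac{1}{n+1} - \frac{1}{n+2}\right) + \left(\frac{1}{n+3} - \frac{1}{n+4}\right) + \dots + \left(\frac{1}{n+m-1} - \frac{1}{n+m}\right) > 0 \]
and if $m$ is odd we have
\[ s_{n+m} - s_n = \left(\frac{1}{n+1} - \frac{1}{n+2}\right) + \left(\frac{1}{n+3} - \frac{1}{n+4}\right) + \dots + \left(\frac{1}{n+m-2} - \frac{1}{n+m-1}\right) + \frac{1}{n+m} > 0 \]
which proves the lower bound in $(\ast)$.
To show the upper bound we note for $m$ even that
\[ s_{n+m} - s_n = \frac{1}{n+1} - \left(\frac{1}{n+2} - \frac{1}{n+3}\right) - \left(\frac{1}{n+4} - \frac{1}{n+5}\right) - \dots - \frac{1}{n+m} < \frac{1}{n+1} \]
and for $m$ odd that
\[ s_{n+m} - s_n = \frac{1}{n+1} - \left(\frac{1}{n+2} - \frac{1}{n+3}\right) - \dots - \left(\frac{1}{n+m-1} - \frac{1}{n+m}\right) \le \frac{1}{n+1} \]
and therefore $(\ast)$ is proved.

Note that in 1b) a lot of cancellation happens when calculating $s_{n+m} - s_n$, whereas while calculating 1a) there is no cancellation; we just add up the strictly positive terms.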
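2. We know that for $\varepsilon > 0$ there exists $N \in \mathbb{N}$ such that $n \ge N$ implies $2^{-n+1} < \varepsilon$. For $n \ge N$ and $m \in \mathbb{N}$ it follows that
\[ |a_n - a_{n+m}| = \left|\sum_{j=0}^{m-1}(a_{n+j+1} - a_{n+j})\right| \le \sum_{j=0}^{m-1}|a_{n+j+1} - a_{n+j}| \le \sum_{j=0}^{m-1}2^{-n-j} \le 2^{-n}\sum_{j=0}^{\infty}2^{-j} = 2^{-n+1} < \varepsilon, \]
proving that $(a_n)_{n\in\mathbb{N}}$ is a Cauchy sequence.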

3. Since $(a_n)_{n\in\mathbb{N}}$ and $(b_n)_{n\in\mathbb{N}}$ converge they are bounded, i.e. $-A \le a_n \le A$ and $-B \le b_n \le B$ for some $A > 0$ and $B > 0$. It then follows that
\[ -A \le a_n \le c_n \le b_n \le B \]
implying that the sequence $(c_n)_{n\in\mathbb{N}}$ is bounded. By the Bolzano-Weierstrass theorem it must contain a convergent subsequence.
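4. Since $0 < \frac{\sqrt{n}}{n+1}$ for all $n \in \mathbb{N}$, once we know that $\left(\frac{\sqrt{n}}{n+1}\right)_{n\in\mathbb{N}}$ is decreasing we can deduce that it converges. In order to prove that $\left(\frac{\sqrt{n}}{n+1}\right)_{n\in\mathbb{N}}$ is decreasing we need to prove
\[ \frac{\sqrt{n}}{n+1} \ge \frac{\sqrt{n+1}}{n+2} \quad\text{for all } n \in \mathbb{N}. \]
This is equivalent to $\frac{n+2}{n+1} \ge \frac{\sqrt{n+1}}{\sqrt{n}}$ which is equivalent to $\sqrt{n}\,(n+2) \ge \sqrt{n+1}\,(n+1)$ or
\[ n(n+2)^2 \ge (n+1)^3, \]
i.e.
\[ n^3 + 4n^2 + 4n \ge n^3 + 3n^2 + 3n + 1 \]
which is a correct statement.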
5. First we prove by induction for $k \in \mathbb{N}$ that $k! \ge 2^{k-1}$. For $k = 1$ we have $1 \ge 1$ which is clearly true. Now we observe that
\[ (k+1)! = k!\,(k+1) \ge 2^{k-1}(k+1) \ge 2\cdot 2^{k-1} = 2^k = 2^{(k+1)-1}, \]
and the result follows. Next we find
\[ \sum_{k=0}^{n}\frac{1}{k!} = 1 + \sum_{k=1}^{n}\frac{1}{k!} \le 1 + \sum_{k=1}^{n}2^{-k+1} = 1 + 2\sum_{k=1}^{n}2^{-k} = 1 + 2\left(\sum_{k=0}^{n}2^{-k} - 1\right) = 1 + 2\left(1 - \left(\frac{1}{2}\right)^n\right) < 3, \]
where we used Theorem 16.4. Thus $\left(\sum_{k=0}^{n}\frac{1}{k!}\right)_{n\in\mathbb{N}}$ is a monotone increasing sequence which is bounded from above.
6. We have to prove that for every K > 0 there exists N ∈ N such that n ≥ N
implies an ≥ K. Suppose that there exists K0 > 0 such that for infinitely many
nl , l ∈ N, anl < K0 . The subsequence (anl )l∈N satisfies 0 ≤ anl ≤ K0 , i.e. it
is bounded. Consequently it must have a convergent subsequence, implying that
(an )n∈N must have an accumulation point which is a contradiction.
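7. We define
\[ a_n := \begin{cases} -2, & n = 3k+1,\ k \in \mathbb{N}_0 \\ -\frac{1}{3}, & n = 3k+2,\ k \in \mathbb{N}_0 \\ 17, & n = 3k,\ k \in \mathbb{N}_0. \end{cases} \]
Since $-3 < -2 < -\frac{1}{3} < 17 < 19$ we have $-3 \le a_n \le 19$ for all $n \in \mathbb{N}$. Further the subsequence $(a_{3k+1})_{k\in\mathbb{N}_0}$ converges to $-2$, the subsequence $(a_{3k+2})_{k\in\mathbb{N}_0}$ converges to $-\frac{1}{3}$ and the subsequence $(a_{3k})_{k\in\mathbb{N}_0}$ converges to $17$.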
8. Once we know that $(x_n)_{n\ge 0}$ converges to a limit $\lim_{n\to\infty} x_n = x > 0$ we find
\[ x = \frac{(k-1)x^k + a}{k\,x^{k-1}} \quad\text{or}\quad kx^k = (k-1)x^k + a, \]
i.e. $x^k = a$ or $x = a^{\frac{1}{k}} = \sqrt[k]{a}$.
Now we prove that $(x_n)_{n\ge 0}$ is bounded from below and decreasing, hence it must converge. We need the following steps:
i) We prove $x_n > 0$. We know that $x_0 > 0$ and $a > 0$. Since
\[ x_{n+1} = \frac{(k-1)x_n^k + a}{k\,x_n^{k-1}}, \]
we find $x_{n+1} > 0$ if $x_n > 0$, thus by induction $x_n > 0$ for all $n$.
ii) Since $x_n > 0$ it follows that $1 - \frac{x_n^k - a}{k\,x_n^k} \ge 0$, note that this is equivalent to $kx_n^k - x_n^k + a \ge 0$. Thus $-\frac{x_n^k - a}{k\,x_n^k} \ge -1$.
iii) Using Bernoulli's inequality we get
\[ \left(x_n - \frac{x_n^k - a}{k\,x_n^{k-1}}\right)^k = \left(x_n\left(1 - \frac{x_n^k - a}{k\,x_n^k}\right)\right)^k = x_n^k\left(1 - \frac{x_n^k - a}{k\,x_n^k}\right)^k \ge x_n^k\left(1 + k\left(-\frac{x_n^k - a}{k\,x_n^k}\right)\right) = x_n^k\left(1 - 1 + \frac{a}{x_n^k}\right) = a. \]
iv) Since $x_{n+1} = x_n - \frac{x_n^k - a}{k\,x_n^{k-1}}$ we deduce that $x_n^k \ge a$ for all $n \in \mathbb{N}$.
v) Since $x_n \ge 0$ and $\frac{x_n^k - a}{k\,x_n^{k-1}} \ge 0$ it follows that
\[ x_{n+1} = x_n - \frac{x_n^k - a}{k\,x_n^{k-1}} \le x_n. \]
Thus $(x_n)_{n\in\mathbb{N}}$ is decreasing and bounded below by $a^{\frac{1}{k}}$. Therefore $\lim_{n\to\infty} x_n$ exists and it must be $a^{\frac{1}{k}}$.
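The recursion of Problem 8 can be run with exact rational arithmetic to observe the properties proved in steps i) to v). The sketch below is illustrative only: the function name `root_iteration`, the starting values and the number of steps are assumptions.

```python
from fractions import Fraction

def root_iteration(a, k, x0, steps):
    """Iterate x_{n+1} = ((k - 1) x_n^k + a) / (k x_n^(k-1)) exactly.

    Returns the list [x_0, x_1, ..., x_steps] as Fractions.
    """
    xs = [Fraction(x0)]
    for _ in range(steps):
        x = xs[-1]
        xs.append(((k - 1) * x ** k + a) / (k * x ** (k - 1)))
    return xs

if __name__ == "__main__":
    xs = root_iteration(a=2, k=3, x0=Fraction(1, 2), steps=6)
    for n, x in enumerate(xs):
        # From n = 1 on the values satisfy x_n^k >= a and decrease.
        print(n, float(x), x ** 3 >= 2)
```

Note that $x_0$ itself need not satisfy $x_0^k \ge a$; the bound and the monotonicity are guaranteed from $x_1$ onwards, which matches step iv).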
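9. By the binomial theorem we find for $n > 1$ that
\[ a_n = \left(1 + \frac{1}{n}\right)^n = \sum_{k=0}^{n}\binom{n}{k}\frac{1}{n^k} = 1 + n\,\frac{1}{n} + \frac{n(n-1)}{2!}\,\frac{1}{n^2} + \dots + \frac{n(n-1)(n-2)\cdots 1}{n!}\,\frac{1}{n^n} \]
\[ (\ast)\qquad = 1 + 1 + \frac{1}{2!}\left(1 - \frac{1}{n}\right) + \dots + \frac{1}{n!}\left(1 - \frac{1}{n}\right)\cdots\left(1 - \frac{n-1}{n}\right) \le \sum_{j=0}^{n}\frac{1}{j!} \le \sum_{j=0}^{\infty}\frac{1}{j!}, \]
and by Problem 5 the series $\sum_{j=0}^{\infty}\frac{1}{j!}$ converges. Moreover, for $k > n$ we have
\[ \left(1 + \frac{1}{k}\right)^k > 1 + 1 + \frac{1}{2!}\left(1 - \frac{1}{k}\right) + \dots + \frac{1}{n!}\left(1 - \frac{1}{k}\right)\left(1 - \frac{2}{k}\right)\cdots\left(1 - \frac{n-1}{k}\right), \]
where we used the calculations made above leading to $(\ast)$.

For $n$ fixed and $k \to \infty$ we obtain
\[ \lim_{k\to\infty}\left(1 + \frac{1}{k}\right)^k \ge \sum_{j=0}^{n}\frac{1}{j!} \]
and therefore
\[ \lim_{k\to\infty}\left(1 + \frac{1}{k}\right)^k \ge \lim_{n\to\infty}\sum_{j=0}^{n}\frac{1}{j!} = \sum_{j=0}^{\infty}\frac{1}{j!}. \]
Thus we have proved that
\[ \sum_{j=0}^{\infty}\frac{1}{j!} \le \lim_{n\to\infty}\left(1 + \frac{1}{n}\right)^n \le \sum_{j=0}^{\infty}\frac{1}{j!}, \]
i.e.
\[ e = \lim_{n\to\infty}\left(1 + \frac{1}{n}\right)^n = \sum_{j=0}^{\infty}\frac{1}{j!}. \]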
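10. Since $a_n = \left(1 + \frac{1}{n}\right)^n < \left(1 + \frac{1}{n}\right)^{n+1} = b_n$ the intervals $[a_n, b_n]$ are bounded, closed and non-empty. We are done if we can prove that $(a_n)_{n\in\mathbb{N}}$ is monotone increasing and that $(b_n)_{n\in\mathbb{N}}$ is monotone decreasing. We then have $a_1 \leq a_n \leq b_n \leq b_1$ and in addition
\begin{align*}
b_n - a_n &= \left(1 + \frac{1}{n}\right)^{n+1} - \left(1 + \frac{1}{n}\right)^n \\
&= \left(1 + \frac{1}{n}\right)^n \frac{1}{n} = \frac{1}{n} a_n \leq \frac{1}{n} b_1 = \frac{4}{n},
\end{align*}
hence $b_n - a_n \to 0$ as $n \to \infty$.
For $n \geq 2$ Bernoulli's inequality yields
\[ \left(1 - \frac{1}{n^2}\right)^n > 1 - n \cdot \frac{1}{n^2} = 1 - \frac{1}{n}, \]
and therefore
\[ \frac{n-1}{n} < \left(\frac{n^2 - 1}{n^2}\right)^n = \frac{(n+1)^n}{n^n} \cdot \frac{(n-1)^n}{n^n} \]
or
\[ \left(\frac{n}{n-1}\right)^{n-1} < \left(\frac{n+1}{n}\right)^n, \]
i.e. for $n \geq 2$
\[ a_{n-1} = \left(1 + \frac{1}{n-1}\right)^{n-1} = \left(\frac{n}{n-1}\right)^{n-1} < \left(\frac{n+1}{n}\right)^n = \left(1 + \frac{1}{n}\right)^n = a_n. \]
Therefore $(a_n)_{n\in\mathbb{N}}$ is strictly monotone increasing. Once again, using Bernoulli's inequality we find for $n \geq 2$
\[ \left(1 + \frac{1}{n^2 - 1}\right)^n > 1 + \frac{n}{n^2 - 1} > 1 + \frac{n}{n^2} = 1 + \frac{1}{n}, \]
and further
\begin{align*}
1 + \frac{1}{n} < \left(\frac{n^2}{n^2 - 1}\right)^n &= \left(\frac{n}{n-1}\right)^n \left(\frac{n+1}{n}\right)^{-n} \\
&= \left(1 + \frac{1}{n-1}\right)^n \left(1 + \frac{1}{n}\right)^{-n},
\end{align*}
which yields
\[ b_n = \left(1 + \frac{1}{n}\right)^{n+1} < \left(1 + \frac{1}{n-1}\right)^n = b_{n-1}. \]
Now we see that $(b_n)_{n\in\mathbb{N}}$ is strictly monotone decreasing and the result follows, recall that $\lim_{n\to\infty} \left(1 + \frac{1}{n}\right)^n = e$.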
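The nested intervals $[a_n, b_n]$ of Problem 10 can be illustrated numerically. This is a small sketch only; the helper names `a` and `b` simply mirror the sequences of the text, and the printed sample indices are arbitrary.

```python
import math


def a(n: int) -> float:
    """Lower endpoint a_n = (1 + 1/n)^n."""
    return (1 + 1 / n) ** n


def b(n: int) -> float:
    """Upper endpoint b_n = (1 + 1/n)^(n+1)."""
    return (1 + 1 / n) ** (n + 1)


# a_n increases, b_n decreases, and b_n - a_n <= 4/n.
for n in (1, 10, 100, 1000):
    print(n, a(n), b(n), b(n) - a(n), 4 / n)
print("e =", math.e)
```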
Chapter 18
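1. This problem helps to understand how to handle the Cauchy criterion better. With $l := n + m$ and $j = n + 1$ it follows that
\[ \sum_{k=1}^{m} \left(\frac{1}{2}\right)^{n+k} = \sum_{i=j}^{l} \left(\frac{1}{2}\right)^{i}. \]
Thus we know that for $\varepsilon > 0$ there exists $N \in \mathbb{N}$ such that $l \geq j \geq N$ implies $\sum_{i=j}^{l} \left(\frac{1}{2}\right)^i = \sum_{i=j}^{l} 2^{-i} < \varepsilon$. Therefore the Cauchy criterion is satisfied.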
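2. Given $\varepsilon > 0$ there exists $m \in \mathbb{N}$ such that $k > m$ and $l \geq 1$ implies
\[ |s_{k+l} - s_k| = a_{k+1} + \cdots + a_{k+l} < \frac{\varepsilon}{2}. \]
Now take $n \geq 2m$ then for $k = [\frac{1}{2} n]$ it follows that $k \geq m$ and therefore
\[ a_{k+1} + \cdots + a_n < \frac{\varepsilon}{2}. \]
Since the sequence is decreasing and $a_j \geq 0$, we find
\[ (n - k) a_n = \left(n - \left[\tfrac{1}{2} n\right]\right) a_n < \frac{\varepsilon}{2}, \]
but $n - [\frac{1}{2} n] \geq \frac{n}{2}$ implying
\[ \frac{n}{2} a_n < \frac{\varepsilon}{2} \quad \text{or} \quad n a_n < \varepsilon. \]
Thus we have proved: given $\varepsilon > 0$ there exists $M = 2m \in \mathbb{N}$ such that $n > M = 2m$ yields $n a_n = |n a_n| < \varepsilon$, i.e. $\lim_{n\to\infty} n a_n = 0$.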
3. Since $a_n \geq 0$ and $a_{j+1} \leq a_j$ it follows for $s = \sum_{n=1}^{\infty} a_n < \infty$ that
\begin{align*}
s &\geq a_1 + a_2 + (a_3 + a_4) + (a_5 + a_6 + a_7 + a_8) + \cdots + (a_{2^{n-1}+1} + \cdots + a_{2^n}) \\
&\geq \frac{1}{2} a_1 + a_2 + 2a_4 + 4a_8 + \cdots + 2^{n-1} a_{2^n},
\end{align*}
or
\[ a_1 + 2a_2 + \cdots + 2^n a_{2^n} \leq 2s. \]
This implies the convergence of $\sum_{n=1}^{\infty} 2^n a_{2^n}$ by Theorem 18.4. Now suppose $\sum_{n=1}^{\infty} 2^n a_{2^n}$ converges, then we find for $2^k \geq n$
\begin{align*}
a_1 + a_2 + \cdots + a_n &\leq a_1 + (a_2 + a_3) + (a_4 + \cdots + a_7) + \cdots + (a_{2^k} + \cdots + a_{2^{k+1}-1}) \\
&\leq a_1 + 2a_2 + 4a_4 + \cdots + 2^k a_{2^k},
\end{align*}
and again Theorem 18.4 gives the result.
Note that the statement of this problem is sometimes called Cauchy's condensation theorem.
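4. a) For the "condensed" series we find
\[ \sum_{n=1}^{\infty} 2^n \cdot \frac{1}{(2^n)^{\alpha}} = \sum_{n=1}^{\infty} \left(\frac{1}{2^{\alpha-1}}\right)^n \]
and therefore for $\alpha > 1$ the series $\sum_{n=1}^{\infty} \frac{1}{n^{\alpha}}$ converges and for $\alpha \leq 1$ the series $\sum_{n=1}^{\infty} \frac{1}{n^{\alpha}}$ diverges.
b) We note that
\[ 2^n \frac{1}{2^n (\ln 2^n)^{\alpha}} = \frac{1}{n^{\alpha} (\ln 2)^{\alpha}} \]
and consequently $\sum_{n=2}^{\infty} \frac{1}{n (\ln n)^{\alpha}}$ converges for $\alpha > 1$ and diverges for $\alpha \leq 1$.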
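The two estimates behind Cauchy's condensation theorem (Problem 3) can be checked numerically for the series of Problem 4 a). The sketch below is illustrative only; the helper names `partial_sum`, `condensed_sum` and `power_sequence` are our own and hypothetical.

```python
from typing import Callable


def partial_sum(a: Callable[[int], float], n: int) -> float:
    """a_1 + a_2 + ... + a_n."""
    return sum(a(j) for j in range(1, n + 1))


def condensed_sum(a: Callable[[int], float], k: int) -> float:
    """a_1 + 2 a_2 + 4 a_4 + ... + 2^k a_{2^k}."""
    return sum(2**i * a(2**i) for i in range(k + 1))


def power_sequence(alpha: float) -> Callable[[int], float]:
    """The decreasing, non-negative sequence a_n = 1 / n^alpha (alpha >= 0)."""
    return lambda n: 1.0 / n**alpha


a = power_sequence(2.0)
for k in range(1, 8):
    # partial sum up to 2^(k+1) - 1  <=  condensed sum  <=  2 * partial sum up to 2^k
    print(k, partial_sum(a, 2 ** (k + 1) - 1), condensed_sum(a, k), 2 * partial_sum(a, 2**k))
```

Both inequalities hold for every decreasing non-negative sequence, so the same check can be run with other exponents, including the divergent case $\alpha \leq 1$.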

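5. a) We need to consider the sequence $\left(\frac{1}{n^{\alpha}}\right)_{n\in\mathbb{N}}$. Clearly for $\alpha \in \mathbb{R}$ we have $\frac{1}{n^{\alpha}} \geq 0$. Moreover, for $\alpha \geq 0$ the sequence is decreasing, while for $\alpha < 0$ it is not decreasing, but only for $\alpha > 0$ we have $\lim_{n\to\infty} \frac{1}{n^{\alpha}} = 0$. Hence for $\alpha > 0$ the series $\sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n^{\alpha}}$ converges by Leibniz's test, for $\alpha = 0$ the series reduces to $\sum_{n=1}^{\infty} (-1)^{n+1}$ which we know to be divergent and for $\alpha < 0$ it follows that $\lim_{n\to\infty} \frac{1}{n^{\alpha}} = +\infty$, therefore $\sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n^{\alpha}}$ diverges in this case.
b) Clearly $\frac{1}{2n-1} \geq 0$ for $n \geq 1$, $\lim_{n\to\infty} \frac{1}{2n-1} = 0$ and $\frac{1}{2(n+1)-1} = \frac{1}{2n+1} < \frac{1}{2n-1}$, i.e. the sequence $\left(\frac{1}{2n-1}\right)_{n\in\mathbb{N}}$ is decreasing and again Leibniz's test gives the convergence of $\sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{2n-1}$.
c) For $n \geq 2$ the term $\frac{1}{n \ln n}$ is defined and positive. Since $0 < \frac{1}{n \ln n} < \frac{1}{n}$ for $n \geq 3$ we deduce that $\lim_{n\to\infty} \frac{1}{n \ln n} = 0$. Moreover we claim that
\[ \frac{1}{(n+1)\ln(n+1)} < \frac{1}{n \ln n} \]
which follows from $\frac{1}{n+1} < \frac{1}{n}$ and $\frac{1}{\ln(n+1)} < \frac{1}{\ln n}$. Hence by Leibniz's criterion we know the convergence of $\sum_{n=2}^{\infty} \frac{(-1)^n}{n \ln n}$.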
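6. From the assumption it follows that
\[ \sum_{k=1}^{n} a_k \leq \sum_{k=1}^{n} b_k \]
and that for $K \geq 0$ there exists $N \in \mathbb{N}$ such that $n \geq N$ implies $\sum_{k=1}^{n} a_k \geq K$. This however gives that $n \geq N$ implies $\sum_{k=1}^{n} b_k \geq K$, i.e. $\sum_{k=1}^{\infty} b_k$ diverges.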
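7. a)
\[ \left| \frac{(-1)^k k^2}{k^4 + 2k} \right| \leq \frac{k^2}{k^4 + 2k} \leq \frac{1}{k^2} \]
and $\sum_{k=1}^{\infty} \frac{1}{k^2}$ converges, hence $\sum_{k=1}^{\infty} \frac{(-1)^k k^2}{k^4 + 2k}$ converges.
b)
\[ \frac{k!}{k^k} = \frac{1 \cdot \ldots \cdot k}{k \cdot \ldots \cdot k} = \frac{1}{k} \cdot \frac{2}{k} \cdot \frac{3}{k} \cdot \ldots \cdot \frac{k}{k} \leq \frac{2}{k^2} \]
and since $\sum_{k=1}^{\infty} \frac{2}{k^2}$ converges we deduce that $\sum_{k=1}^{\infty} \frac{k!}{k^k}$ converges.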
c) First we claim that $\ln(n+1) \leq n$ for all $n \in \mathbb{N}$. Indeed for $n = 1$ we have $\ln 2 \leq 1$ since $\ln e = 1$ and $\ln$ is a monotone function. Further $n + 1 \leq e^n$ is equivalent to $\ln(n+1) \leq n$ and if $n + 1 \leq e^n$ then we get $n + 2 \leq e^n + 1 \leq e^n + e^n = 2e^n \leq e e^n = e^{n+1}$, thus we know that $\ln(n+1) \leq n$ for all $n \in \mathbb{N}$ and it follows that
\[ \frac{\ln(n+1)}{3n^3 + 7} \leq \frac{\ln(n+1)}{3n^3} \leq \frac{1}{3n^2} \]
and the convergence of $\sum_{n=1}^{\infty} \frac{1}{3n^2}$ implies the result.

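d) Since $|\sin x| \leq |x|$ we deduce $\left|\sin \frac{1}{n^3}\right| \leq \frac{1}{n^3}$ implying the convergence of $\sum_{n=1}^{\infty} \sin \frac{1}{n^3}$ since $\sum_{n=1}^{\infty} \frac{1}{n^3}$ converges.
e) For all $x \in \mathbb{R}$ we have $\left|\frac{\cos kx}{1+k^2}\right| \leq \frac{1}{1+k^2}$ and since $\sum_{k=1}^{\infty} \frac{1}{1+k^2}$ converges it follows that $\sum_{k=1}^{\infty} \frac{\cos kx}{1+k^2}$ converges for all $x \in \mathbb{R}$.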
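f) For $x \leq 0$ we know that $e^{mx} \leq 1$ and therefore $\frac{e^{mx}}{m^4} \leq \frac{1}{m^4}$ and the convergence of $\sum_{m=1}^{\infty} \frac{1}{m^4}$ implies the convergence of $\sum_{m=1}^{\infty} \frac{e^{mx}}{m^4}$ for $x \leq 0$. However for $x > 0$ the series $\sum_{m=1}^{\infty} \frac{e^{mx}}{m^4}$ diverges since $\lim_{m\to\infty} \frac{e^{mx}}{m^4} \neq 0$. Indeed, applying the de l'Hospital rules four times to the function $t \mapsto \frac{e^{tx}}{t^4}$ we find
\[ \lim_{t\to\infty} \frac{e^{tx}}{t^4} = \lim_{t\to\infty} \frac{x^4 e^{tx}}{4!} = \frac{x^4}{4!} \lim_{t\to\infty} e^{tx} = \infty. \]
Therefore there exists $N \in \mathbb{N}$ such that $t \geq N$ implies $\frac{e^{tx}}{t^4} \geq 1$, hence for $m \in \mathbb{N}$ it follows that $m \geq N$ implies $\frac{e^{mx}}{m^4} \geq 1$.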
g)
\[ \sum_{l=1}^{\infty} \frac{x^2}{l^2 + x^2} = x^2 \sum_{l=1}^{\infty} \frac{1}{l^2 + x^2} \leq x^2 \sum_{l=1}^{\infty} \frac{1}{l^2} < \infty, \]
i.e. the series converges for all $x \in \mathbb{R}$.
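h) Since $2n + 1 \leq 2(n+5)$ and $\sqrt{n+3} \leq \sqrt{n+5}$ we have
\[ \frac{n+5}{(2n+1)\sqrt{n+3}} \geq \frac{n+5}{2(n+5)\sqrt{n+5}} = \frac{1}{2\sqrt{n+5}} \]
and
\[ \sum_{n=1}^{\infty} \frac{1}{2\sqrt{n+5}} \]
diverges, the series $\sum_{n=1}^{\infty} \frac{n+5}{(2n+1)\sqrt{n+3}}$ diverges too.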
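8. For $n \in \mathbb{N}$ we know both the Cauchy-Schwarz inequality (compare with Corollary 14.3)
\[ (*) \quad \left| \sum_{k=1}^{n} a_k b_k \right| \leq \sum_{k=1}^{n} |a_k b_k| \leq \left( \sum_{k=1}^{n} a_k^2 \right)^{\frac{1}{2}} \left( \sum_{k=1}^{n} b_k^2 \right)^{\frac{1}{2}} \]
and the Minkowski inequality (compare with Lemma 14.5)
\[ (**) \quad \left( \sum_{k=1}^{n} |a_k + b_k|^2 \right)^{\frac{1}{2}} \leq \left( \sum_{k=1}^{n} a_k^2 \right)^{\frac{1}{2}} + \left( \sum_{k=1}^{n} b_k^2 \right)^{\frac{1}{2}}. \]
Now, since by the assumptions $\sum_{k=1}^{\infty} a_k^2 < \infty$ and $\sum_{k=1}^{\infty} b_k^2 < \infty$, we may take the limit as $n \to \infty$ on the right hand sides of $(*)$ and $(**)$ to obtain
\[ \left| \sum_{k=1}^{n} a_k b_k \right| \leq \sum_{k=1}^{n} |a_k b_k| \leq \left( \sum_{k=1}^{\infty} a_k^2 \right)^{\frac{1}{2}} \left( \sum_{k=1}^{\infty} b_k^2 \right)^{\frac{1}{2}} \]
and
\[ \left( \sum_{k=1}^{n} |a_k + b_k|^2 \right)^{\frac{1}{2}} \leq \left( \sum_{k=1}^{\infty} a_k^2 \right)^{\frac{1}{2}} + \left( \sum_{k=1}^{\infty} b_k^2 \right)^{\frac{1}{2}}. \]
Now we see that the partial sums $\sum_{k=1}^{n} |a_k b_k|$ and $\left( \sum_{k=1}^{n} |a_k + b_k|^2 \right)^{\frac{1}{2}}$ are monotone and bounded, hence
\[ \sum_{k=1}^{\infty} |a_k b_k| \leq \left( \sum_{k=1}^{\infty} a_k^2 \right)^{\frac{1}{2}} \left( \sum_{k=1}^{\infty} b_k^2 \right)^{\frac{1}{2}} \]
and
\[ \left( \sum_{k=1}^{\infty} |a_k + b_k|^2 \right)^{\frac{1}{2}} \leq \left( \sum_{k=1}^{\infty} a_k^2 \right)^{\frac{1}{2}} + \left( \sum_{k=1}^{\infty} b_k^2 \right)^{\frac{1}{2}}. \]
Finally we note that the (absolute) convergence of $\sum_{k=1}^{\infty} |a_k b_k|$ implies the convergence of $\sum_{k=1}^{\infty} a_k b_k$.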
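9. Since for every sequence $(a_k)_{k\in\mathbb{N}}$ the term $\frac{|a_k|}{1+|a_k|}$ is bounded by 1 and $\sum_{k=1}^{\infty} \frac{1}{2^k}$ converges, it follows immediately that $\sum_{k=1}^{\infty} \frac{1}{2^k} \frac{|a_k|}{1+|a_k|}$ converges. The inequality
\[ \sum_{k=1}^{\infty} \frac{1}{2^k} \frac{|a_k + b_k|}{1 + |a_k + b_k|} \leq \sum_{k=1}^{\infty} \frac{1}{2^k} \frac{|a_k|}{1 + |a_k|} + \sum_{k=1}^{\infty} \frac{1}{2^k} \frac{|b_k|}{1 + |b_k|} \]
follows once we have proved for $a, b \in \mathbb{R}$
\[ \frac{|a + b|}{1 + |a + b|} \leq \frac{|a|}{1 + |a|} + \frac{|b|}{1 + |b|}. \]
However a straightforward calculation first shows
\[ \frac{|a + b|}{1 + |a + b|} \leq \frac{|a| + |b|}{1 + |a| + |b|} \]
and a further calculation shows
\[ \frac{|a| + |b|}{1 + |a| + |b|} \leq \frac{|a|}{1 + |a|} + \frac{|b|}{1 + |b|}. \]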
(The reader is encouraged to do these simple calculations.)
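For readers who want to compare their own work, one possible route through the two calculations is sketched here. It is our own suggestion, not necessarily the calculation the authors have in mind: the first step uses that $t \mapsto \frac{t}{1+t} = 1 - \frac{1}{1+t}$ is increasing on $[0, \infty)$ together with the triangle inequality, the second step only enlarges each fraction by shrinking its denominator.

```latex
% Step 1: |a+b| <= |a|+|b| and t -> 1 - 1/(1+t) is increasing on [0, \infty).
\[
\frac{|a+b|}{1+|a+b|} = 1 - \frac{1}{1+|a+b|}
\leq 1 - \frac{1}{1+|a|+|b|} = \frac{|a|+|b|}{1+|a|+|b|}.
\]
% Step 2: split the fraction and drop |b| (resp. |a|) from the denominators.
\[
\frac{|a|+|b|}{1+|a|+|b|}
= \frac{|a|}{1+|a|+|b|} + \frac{|b|}{1+|a|+|b|}
\leq \frac{|a|}{1+|a|} + \frac{|b|}{1+|b|}.
\]
```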
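10. a) We use the ratio test here:
\[ \frac{(n+1)^6 e^{-(n+1)^2}}{n^6 e^{-n^2}} = \left(\frac{n+1}{n}\right)^6 e^{-2n-1} \leq \left(\frac{n+1}{n}\right)^6 e^{-2n}. \]
Since $\lim_{n\to\infty} \left(\frac{n+1}{n}\right)^6 = 1$ and $\lim_{n\to\infty} e^{-2n} = 0$, it follows that
\[ \lim_{n\to\infty} \frac{(n+1)^6 e^{-(n+1)^2}}{n^6 e^{-n^2}} = 0 \]
implying the convergence of $\sum_{n=1}^{\infty} n^6 e^{-n^2}$.
b) Note that
\[ \frac{4n^2 + 15n - 3}{n^2 (n+1)^{\frac{3}{2}}} \leq \frac{4n^2 + 15n^2}{n^2 n^{\frac{3}{2}}} = \frac{19}{n^{\frac{3}{2}}} \]
and the series $\sum_{n=1}^{\infty} \frac{1}{n^{3/2}}$ converges implying the convergence of $\sum_{n=1}^{\infty} \frac{4n^2 + 15n - 3}{n^2 (n+1)^{3/2}}$.
c) For $x \in \mathbb{R}$ fixed we find
\[ \left| \frac{\frac{x^{k+1}}{(k+1)!}}{\frac{x^k}{k!}} \right| = |x| \frac{k!}{(k+1)!} = \frac{|x|}{k+1}. \]
Thus $\lim_{k\to\infty} \left| \frac{x^{k+1}/(k+1)!}{x^k/k!} \right| = 0$ and therefore $\sum_{k=0}^{\infty} \frac{x^k}{k!}$ converges for every $x \in \mathbb{R}$.
d) For every $x \in \mathbb{R}$ it follows that
\begin{align*}
\left| \frac{\frac{(-1)^{k+1} x^{2k+2}}{(2(k+1))!}}{\frac{(-1)^k x^{2k}}{(2k)!}} \right| &= x^2 \frac{(2k)!}{(2(k+1))!} \\
&= x^2 \frac{(2k)!}{(2k)!(2k+1)(2k+2)} = \frac{x^2}{(2k+1)(2k+2)}.
\end{align*}
Since $\lim_{k\to\infty} \frac{x^2}{(2k+1)(2k+2)} = 0$ we deduce that the series $\sum_{k=0}^{\infty} (-1)^k \frac{x^{2k}}{(2k)!}$ converges for every $x \in \mathbb{R}$.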

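11. The condition $\left|\frac{a_{n+1}}{a_n}\right| \geq \lambda > 1$ implies $|a_{n+1}| > |a_n|$, i.e. $(|a_n|)_{n\in\mathbb{N}}$ is a strictly increasing sequence of non-negative numbers, hence it cannot converge to 0, but this implies that $(a_n)_{n\in\mathbb{N}}$ itself cannot converge to 0.
We now study the two examples
a)
\[ \left| \frac{\frac{(-1)^{n+1} 3^{n+1}}{(n+1)^4}}{\frac{(-1)^n 3^n}{n^4}} \right| = 3 \left(\frac{n}{n+1}\right)^4 \geq 3 \left(\frac{4}{5}\right)^4 > 1 \quad \text{for } n \geq 4, \]
implying the divergence of $\sum_{n=1}^{\infty} \frac{(-1)^n 3^n}{n^4}$.
b) We find
\begin{align*}
\frac{a_{n+1}}{a_n} &= \frac{(n+1)\sqrt{n+1}}{n\sqrt{n}} \cdot \frac{(n+3)\sqrt{4n+15}}{(n+4)\sqrt{4n+19}} \\
&= \frac{(n+1)(n+3)}{(n+4)n} \cdot \frac{\sqrt{n+1}\sqrt{4n+15}}{\sqrt{n}\sqrt{4n+19}} \\
&= \frac{n^2 + 4n + 3}{n^2 + 4n} \sqrt{\frac{4n^2 + 19n + 15}{4n^2 + 19n}} > 1,
\end{align*}
implying the divergence of $\sum_{n=1}^{\infty} \frac{n^{3/2}}{(n+3)\sqrt{4n+15}}$.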
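12. a) From $|a_n|^{\frac{1}{n}} \geq 1$, i.e. $|a_n| \geq 1$, we deduce that $(a_n)_{n\in\mathbb{N}}$ cannot have the limit 0, hence $\sum_{n=1}^{\infty} a_n$ must diverge.
b) First we note that for $\varepsilon := \frac{1-a}{2} > 0$ there exists $N \in \mathbb{N}$ such that $n \geq N$ implies
\[ \left| |a_n|^{\frac{1}{n}} - a \right| < \varepsilon = \frac{1-a}{2} \]
or
\[ |a_n|^{\frac{1}{n}} < a + \frac{1-a}{2} = \frac{a+1}{2} < 1. \]
This implies the convergence of $\sum_{n=N}^{\infty} |a_n|$ by Theorem 18.18 and therefore $\sum_{n=1}^{\infty} |a_n|$ converges too.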

13. From $\left|\frac{a_{n+1}}{a_n}\right| \leq 1 - \frac{a}{n}$ for $n \geq N$ we deduce that
\[ n|a_{n+1}| \leq n|a_n| - a|a_n| \]
or
\[ (a-1)|a_n| \leq (n-1)|a_n| - n|a_{n+1}|. \]
Since $a > 1$ we find
\[ 0 < (n-1)|a_n| - n|a_{n+1}|, \]
or
\[ (n-1)|a_n| > n|a_{n+1}|. \]
Hence the sequence $(n|a_{n+1}|)_{n\in\mathbb{N}}$ is strictly monotone decreasing and bounded from below by 0, implying its convergence. Therefore we deduce that the series $\sum_{n=1}^{\infty} \left( (n-1)|a_n| - n|a_{n+1}| \right)$, which is a telescopic series, converges, compare with Chapter 16, Problem 6, and this implies, see Chapter 16, Problem 6 again, that $\sum_{n=1}^{\infty} |a_n|$ converges. Note that we can also prove: if for all $n \geq N$ we have $\frac{a_{n+1}}{a_n} \geq 1 - \frac{1}{n}$ then $\sum_{n=1}^{\infty} a_n$ diverges.
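14. Note that if
\[ \lim_{n\to\infty} n \left( 1 - \left| \frac{a_{n+1}}{a_n} \right| \right) > 1 \]
then there exists $N \in \mathbb{N}$ such that $n \geq N$ implies for some $a$
\[ n \left( 1 - \left| \frac{a_{n+1}}{a_n} \right| \right) \geq a > 1. \]
Now for $a_n = \left( \frac{1 \cdot 4 \cdot \ldots \cdot (3n-2)}{3 \cdot 6 \cdot \ldots \cdot 3n} \right)^2$ we find
\begin{align*}
n \left( 1 - \left| \frac{a_{n+1}}{a_n} \right| \right) &= n \left( 1 - \left( \frac{3n+1}{3(n+1)} \right)^2 \right) \\
&= n \, \frac{12n + 8}{9n^2 + 18n + 9} = \frac{12n^2 + 8n}{9n^2 + 18n + 9}
\end{align*}
implying $\lim_{n\to\infty} n \left( 1 - \left| \frac{a_{n+1}}{a_n} \right| \right) = \frac{4}{3} > 1$ and therefore by Raabe's criterion the series converges. Note that
\[ \lim_{n\to\infty} \left| \frac{a_{n+1}}{a_n} \right| = \lim_{n\to\infty} \frac{9n^2 + 6n + 1}{9n^2 + 18n + 9} = 1, \]
therefore the ratio test cannot give the result.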
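The two limits in Problem 14 can be checked with exact rational arithmetic. This is a sketch under the assumption that the sequence is $a_n = \left(\frac{1 \cdot 4 \cdots (3n-2)}{3 \cdot 6 \cdots 3n}\right)^2$, so that $\frac{a_{n+1}}{a_n} = \left(\frac{3n+1}{3(n+1)}\right)^2$; the function names `ratio` and `raabe` are hypothetical.

```python
from fractions import Fraction


def ratio(n: int) -> Fraction:
    """a_{n+1} / a_n = ((3n + 1) / (3(n + 1)))^2 for the assumed sequence."""
    return Fraction(3 * n + 1, 3 * (n + 1)) ** 2


def raabe(n: int) -> Fraction:
    """Raabe quantity n * (1 - a_{n+1} / a_n)."""
    return n * (1 - ratio(n))


for n in (1, 10, 1000, 10**6):
    # The ratio tends to 1 (ratio test inconclusive), the Raabe quantity to 4/3.
    print(n, float(ratio(n)), float(raabe(n)))
```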
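15. a) We note that for $\alpha \neq 1$
\begin{align*}
\int_2^N \frac{1}{x (\ln x)^{\alpha}} \, dx &= \int_{\ln 2}^{\ln N} \frac{1}{e^y y^{\alpha}} e^y \, dy \\
&= \int_{\ln 2}^{\ln N} \frac{1}{y^{\alpha}} \, dy = \frac{1}{1-\alpha} y^{1-\alpha} \Big|_{\ln 2}^{\ln N},
\end{align*}
therefore, for $\alpha > 1$ it follows that
\[ \lim_{N\to\infty} \int_2^N \frac{1}{x (\ln x)^{\alpha}} \, dx = \frac{(\ln 2)^{1-\alpha}}{\alpha - 1} \]
and the series $\sum_{k=2}^{\infty} \frac{1}{k (\ln k)^{\alpha}}$ converges. On the other hand, for $\alpha < 1$, it follows that
\[ \lim_{N\to\infty} \int_2^N \frac{1}{x (\ln x)^{\alpha}} \, dx = +\infty \]
implying the divergence of the corresponding series.
For $\alpha = 1$ we have to note in the above calculation that
\[ \int_2^N \frac{1}{x \ln x} \, dx = \int_{\ln 2}^{\ln N} \frac{1}{y} \, dy = \ln(\ln N) - \ln(\ln 2) \]
which yields $\lim_{N\to\infty} \int_2^N \frac{1}{x \ln x} \, dx = \infty$ and again we get the divergence of the corresponding series. Also compare with Problem 4 b).
b) Since
\[ \int_1^N x e^{-x^2} \, dx = -\frac{1}{2} \int_1^N \frac{d}{dx} e^{-x^2} \, dx = -\frac{1}{2} e^{-x^2} \Big|_1^N \]
we find $\lim_{N\to\infty} \int_1^N x e^{-x^2} \, dx = \frac{1}{2e}$ and hence the series $\sum_{l=1}^{\infty} l e^{-l^2}$ converges.
c) We have
\[ \int_2^N \frac{\ln x}{x} \, dx = \frac{1}{2} (\ln x)^2 \Big|_2^N = \frac{1}{2} \left( (\ln N)^2 - (\ln 2)^2 \right) \]
implying $\lim_{N\to\infty} \int_2^N \frac{\ln x}{x} \, dx = \infty$ and therefore the divergence of $\sum_{k=2}^{\infty} \frac{\ln k}{k}$.
d) Integration by parts yields
\[ \int_2^N \frac{\ln x}{x^2} \, dx = \left( -\frac{1}{x} \ln x - \frac{1}{x} \right) \Big|_2^N = -\frac{1 + \ln x}{x} \Big|_2^N \]
and we conclude that
\[ \lim_{N\to\infty} \int_2^N \frac{\ln x}{x^2} \, dx = \frac{1 + \ln 2}{2} \]
and $\sum_{k=2}^{\infty} \frac{\ln k}{k^2}$ converges.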
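16. For our purpose we may assume that $a_n \neq 0$ for all $n \in \mathbb{N}$. Let
\[ a_n^+ = \begin{cases} a_n, & a_n > 0 \\ 0, & a_n < 0 \end{cases} \quad \text{and} \quad a_n^- = \begin{cases} -a_n, & a_n < 0 \\ 0, & a_n > 0 \end{cases} \]
then $a_n = a_n^+ - a_n^-$. Since $\sum_{n=1}^{\infty} a_n$ converges the convergence of $\sum_{n=1}^{\infty} a_n^+$ or $\sum_{n=1}^{\infty} a_n^-$ implies the convergence of $\sum_{n=1}^{\infty} |a_n| = \sum_{n=1}^{\infty} (a_n^+ + a_n^-)$, hence both series $\sum_{n=1}^{\infty} a_n^+$ and $\sum_{n=1}^{\infty} a_n^-$ diverge, i.e.
\[ \lim_{N\to\infty} \sum_{n=1}^{N} a_n^+ = \lim_{N\to\infty} \sum_{n=1}^{N} a_n^- = \infty. \]
However, since $\sum_{n=1}^{\infty} a_n$ converges it follows that $\lim_{n\to\infty} a_n = 0$ implying that $\lim_{n\to\infty} a_n^+ = 0$ and $\lim_{n\to\infty} a_n^- = 0$.
Let $A \in \mathbb{R}$ be given and denote by $(b_n)_{n\in\mathbb{N}}$ the subsequence of all positive elements of $(a_n)_{n\in\mathbb{N}}$ and by $(c_n)_{n\in\mathbb{N}}$ the subsequence of all negative elements of $(a_n)_{n\in\mathbb{N}}$. Choose $n_0$ to be the smallest index such that
\[ \sum_{k=1}^{n_0} b_k > A, \]
next choose $n_1$ to be the smallest index such that
\[ \sum_{k=1}^{n_0} b_k - \sum_{k=1}^{n_1} |c_k| < A \]
and continue to choose $n_2$ such that
\[ \sum_{k=1}^{n_0} b_k - \sum_{k=1}^{n_1} |c_k| + \sum_{k=n_0+1}^{n_2} b_k > A, \]
and now continue with this process. We eventually obtain a series
\[ (*) \quad b_1 + \cdots + b_{n_0} - |c_1| - \cdots - |c_{n_1}| + b_{n_0+1} + \cdots + b_{n_2} - |c_{n_1+1}| - \cdots \]
which is a rearrangement of $\sum_{k=1}^{\infty} a_k$. Moreover
\[ 0 \leq A - \left( \sum_{k=1}^{n_0} b_k - \sum_{k=1}^{n_1} |c_k| + \cdots - \sum_{k=n_{2l-1}+1}^{n_{2l+1}} |c_k| \right) < \sum_{k=n_{2l}+1}^{n_{2l+2}} b_k \]
and
\[ 0 \leq \sum_{k=1}^{n_0} b_k - \sum_{k=1}^{n_1} |c_k| + \cdots + \sum_{k=n_{2l-2}+1}^{n_{2l}} b_k - A < \sum_{k=n_{2l-1}+1}^{n_{2l+1}} |c_k|. \]
Since $\lim_{k\to\infty} b_k = \lim_{k\to\infty} c_k = 0$, it follows that the rearranged series converges to $A$.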
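The rearrangement procedure of Problem 16 can be illustrated with the alternating harmonic series $\sum (-1)^{n+1}/n$, which converges but not absolutely. The sketch below is our own illustration: the function name `rearranged_partial_sum` is hypothetical, and the rule "add the next positive term while the sum is at most the target, otherwise the next negative term" is a slight simplification of the block-wise choice of $n_0, n_1, n_2, \ldots$ in the text (the two differ only when a partial sum hits the target exactly).

```python
from itertools import count


def rearranged_partial_sum(target: float, terms: int) -> float:
    """Partial sum of a rearrangement of sum (-1)^(n+1)/n steering towards target."""
    positives = (1.0 / n for n in count(1, 2))   # b_k: 1, 1/3, 1/5, ...
    negatives = (-1.0 / n for n in count(2, 2))  # c_k: -1/2, -1/4, ...
    s = 0.0
    for _ in range(terms):
        # Each term of the original series is used at most once, in its original order.
        s += next(positives) if s <= target else next(negatives)
    return s


for target in (-1.0, 0.3, 1.5):
    print(target, rearranged_partial_sum(target, 100_000))
```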
17. We could start by looking at $\frac{1}{7} = \sum_{n=-k}^{\infty} a_n b^{-n}$ and try to find the numbers $k$ and $a_n$ for a given $b$. However there is a more systematic suggestion for any $b$ and $0 < x < 1$:
Clearly
\[ x = \sum_{n=1}^{\infty} a_n b^{-n}, \quad 0 \leq a_n < b, \ b \in \mathbb{N}, \ b \geq 2. \]
Then $a_1$ is the largest integer such that $a_1 b^{-1} \leq x$. If $a_1, \ldots, a_{n-1}$ are already known, then $a_n$ is the largest integer such that
\[ a_n b^{-n} \leq x - (a_1 b^{-1} + \cdots + a_{n-1} b^{-n+1}) \]
or
\[ a_n \leq x b^n - a_1 b^{n-1} - \cdots - a_{n-1} b. \]
We set
\[ y_n := x b^n - a_1 b^{n-1} - \cdots - a_{n-1} b \]
and find the following algorithm:
\begin{align*}
y_1 &:= xb \\
a_1 &:= [y_1] \\
y_{n+1} &:= (y_n - a_n) b \\
a_{n+1} &:= [y_{n+1}].
\end{align*}
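We now solve (i)-(iii).

(i) $b = 2$
\[
\begin{array}{c|cccc}
n & 1 & 2 & 3 & 4 \\ \hline
y_n & \frac{2}{7} & \frac{4}{7} & \frac{8}{7} & \frac{2}{7} \\
a_n & 0 & 0 & 1 & 0
\end{array}
\]
We may stop now as the results start to repeat, therefore we have the following periodic expansion
\[ \frac{1}{7} = 0.001001\ldots \quad (b = 2) \]
(ii) The case $b = 7$ is trivial
\[ \frac{1}{7} = 0.1 \quad (b = 7) \]
(iii) $b = 10$
\[
\begin{array}{c|ccccccc}
n & 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ \hline
y_n & \frac{10}{7} & \frac{30}{7} & \frac{20}{7} & \frac{60}{7} & \frac{40}{7} & \frac{50}{7} & \frac{10}{7} \\
a_n & 1 & 4 & 2 & 8 & 5 & 7 & 1
\end{array}
\]
and again the result is the periodic expansion
\[ \frac{1}{7} = 0.142857142857\ldots \]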
It is straightforward to see that for every rational number $\frac{l}{k}$, $0 < l < k$, and for every $b \geq 2$ the $b$-adic fraction representation of $\frac{l}{k}$ is periodic.
Using induction we can prove
\[ y_n = \frac{p_n}{k}, \quad p_n \in \mathbb{N}_0, \ 0 \leq p_n < kb. \]
However there are only finitely many possibilities for $y_n$, thus there exist two numbers $r, s \in \mathbb{N}$ such that
\[ y_{r+s} = y_r \]
implying $a_{n+s} = a_n$ for all $n \geq r$.
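The algorithm $y_1 := xb$, $a_n := [y_n]$, $y_{n+1} := (y_n - a_n)b$ translates directly into code. The following sketch uses exact rational arithmetic so that the periodicity is not blurred by rounding; the function name `b_adic_digits` is our own choice.

```python
from fractions import Fraction
from math import floor


def b_adic_digits(x: Fraction, b: int, count: int) -> list[int]:
    """First `count` digits a_1, a_2, ... of the b-adic expansion of 0 < x < 1."""
    digits = []
    y = x * b                  # y_1 := x b
    for _ in range(count):
        a = floor(y)           # a_n := [y_n]
        digits.append(a)
        y = (y - a) * b        # y_{n+1} := (y_n - a_n) b
    return digits


x = Fraction(1, 7)
print(b_adic_digits(x, 2, 6))    # digits of 1/7 in base 2
print(b_adic_digits(x, 7, 3))    # digits of 1/7 in base 7
print(b_adic_digits(x, 10, 7))   # digits of 1/7 in base 10
```

With floating point input instead of `Fraction` the digits would eventually be polluted by rounding errors, which is why the sketch insists on exact fractions.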
18. In the case where we have a bijective mapping $f : (a, b) \to (0, 1)$, then $(a, b)$ is not countable, hence $D$, a set containing $(a, b)$, is not countable. Thus we need to construct $f$. This is easily done, for example $f : (a, b) \to (0, 1)$, $t \mapsto f(t)$, $f(t) = \frac{t}{b-a} - \frac{a}{b-a}$ is strictly monotone since $f'(t) = \frac{1}{b-a} > 0$, $f(a) = 0$, $f(b) = 1$, hence $f$ maps $(a, b)$ bijectively to $(0, 1)$.
Chapter 19
1. By definition a set is closed if and only if it contains all of its accumulation points. We claim that $b$ is an accumulation point of $[a, b)$. Since $b \notin [a, b)$ this implies that $[a, b)$ is not closed. The sequence $(b_n)_{n\in\mathbb{N}}$, $b_n := b - \frac{b-a}{2n}$, consists of elements belonging to $[a, b)$ since $a < b - \frac{b-a}{2n} < b$, and $\lim_{n\to\infty} b_n = b$. Hence $b$ is an accumulation point of $[a, b)$ and therefore $[a, b)$ is not closed. In order for $[a, b)$ to be open the set $[a, b)^c = (-\infty, a) \cup [b, \infty)$ must be closed. However $a$ is an accumulation point of $[a, b)^c$ not belonging to the set, so $[a, b)^c$ is not closed and therefore $[a, b)$ is not open.
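2. The set $\mathbb{Q} \subset \mathbb{R}$ cannot be closed since we know that for example $\sqrt{2} \in \mathbb{R}$ is an accumulation point of $\mathbb{Q}$ not belonging to $\mathbb{Q}$. However $\mathbb{Q} \subset \mathbb{R}$ cannot be open either. If it was open then $\mathbb{Q}^c$ must be closed. The sequence $\left(\frac{\sqrt{2}}{n}\right)_{n\in\mathbb{N}}$ is a sequence in $\mathbb{Q}^c$ with accumulation point $0 \in \mathbb{Q}$, so $\mathbb{Q}^c$ is not closed and consequently $\mathbb{Q}$ is not open in $\mathbb{R}$.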
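3. It follows that $\{a_\nu \mid \nu \in \mathbb{N}\}$ is the complement of the set $(-\infty, a_1) \cup (a_1, a_2) \cup (a_2, a_3) \cup \cdots$, i.e. we have
\[ \{a_\nu \mid \nu \in \mathbb{N}\}^c = \bigcup_{\nu=1}^{\infty} (a_{\nu-1}, a_\nu) \]
with $a_0 := -\infty$. Since $\bigcup_{\nu=1}^{\infty} (a_{\nu-1}, a_\nu)$ is open as it is a union of open sets it follows that $\{a_\nu \mid \nu \in \mathbb{N}\}$ is closed.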
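4. We claim that with $B_\nu = \left[ \frac{1}{\nu+2}, 1 - \frac{1}{\nu+2} \right]$ we have
\[ \bigcup_{\nu\in\mathbb{N}} B_\nu = (0, 1). \]
Clearly we have $B_\nu \subset (0, 1)$ for all $\nu \in \mathbb{N}$, which implies $\bigcup_{\nu\in\mathbb{N}} B_\nu \subset (0, 1)$.
Next we prove that $(0, 1) \subset \bigcup_{\nu\in\mathbb{N}} B_\nu$. For $x \in (0, 1)$ we need to show the existence of $\nu_0 \in \mathbb{N}$ such that $x \in \left[ \frac{1}{\nu_0+2}, 1 - \frac{1}{\nu_0+2} \right]$. However $\lim_{\nu\to\infty} \frac{1}{\nu+2} = 0$ and $\lim_{\nu\to\infty} \left( 1 - \frac{1}{\nu+2} \right) = 1$, implying that for $\nu$ large enough $\frac{1}{\nu+2} < x < 1 - \frac{1}{\nu+2}$.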
5. In general {aν |ν ∈ N} is not closed. A closed set must contain all of its accumulation
points and if a is not an element of {aν |ν ∈ N} then this set is not closed. However,
for a converging sequence (aν )ν∈N with limit a the set {aν |ν ∈ N} ∪ {a} is closed
since it contains all of its accumulation points.
6. Since A and B are bounded with some K1 , K2 > 0 we have −K1 ≤ a ≤ K1 for all
a ∈ A and −K2 ≤ b ≤ K2 for all b ∈ B. Consequently, for all a ∈ A and b ∈ B it
follows that
−(K1 + K2 ) ≤ a + b ≤ (K1 + K2 ),
i.e. A + B is bounded.
7. a) First we prove that (−3, 2) ∪ (4, 6) is open. Since the open interval (−3, 2)
and (4, 6) are open it follows that their union is open. Clearly (−3, 2) ∪ (4, 6) ⊂ M .
There are only three points, {4}, {6} and {10}, not belonging to (−3, 2) ∪ (4, 6) but
to M . For none of these points exists an open interval containing the point and
belonging entirely to M . Hence (−3, 2) ∪ (4, 6) is the largest open set contained in
M.
The set [−3, 2] ∪ [4, 6] ∪ {10} is closed since it is a finite union of the closed sets
[−3, 2], [4, 6] and {10}, note that {10} = ((−∞, 10)∪(10, ∞))c and (−∞, 10) as well
as (10, ∞) are open. Clearly M ⊂ [−3, 2] ∪ [4, 6] ∪ {10}. There are only two points,
{−3} and {2}, belonging to [−3, 2] ∪ [4, 6] ∪ {10} not belonging to M . However both
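are accumulation points of $M$: $-3 = \lim_{n\to\infty} (-3 + \frac{1}{n})$ and $-3 + \frac{1}{n} \in M$ for all $n \in \mathbb{N}$, and $2 = \lim_{n\to\infty} (2 - \frac{1}{n})$ and $2 - \frac{1}{n} \in M$ for all $n \in \mathbb{N}$. Hence $[-3, 2] \cup [4, 6] \cup \{10\}$ is the smallest closed set containing $M$.

b) Since $0 \in (-\frac{1}{n}, \frac{1}{n})$ for every $n \in \mathbb{N}$ it follows that $0 \in \bigcap_{n\in\mathbb{N}} (-\frac{1}{n}, \frac{1}{n})$, i.e. $\{0\} \subset \bigcap_{n\in\mathbb{N}} (-\frac{1}{n}, \frac{1}{n})$. Suppose $a \in \bigcap_{n\in\mathbb{N}} (-\frac{1}{n}, \frac{1}{n})$ then $a \in (-\frac{1}{n}, \frac{1}{n})$ for all $n \in \mathbb{N}$. If $a \neq 0$ then there exists $n \in \mathbb{N}$ such that $a < -\frac{1}{n}$ or $a > \frac{1}{n}$, implying $a \notin (-\frac{1}{n}, \frac{1}{n})$. Thus $\bigcap_{n\in\mathbb{N}} (-\frac{1}{n}, \frac{1}{n}) \subset \{0\}$ and together with the first part we have $\bigcap_{n\in\mathbb{N}} (-\frac{1}{n}, \frac{1}{n}) = \{0\}$. Each of the sets $(-\frac{1}{n}, \frac{1}{n})$ is open as it is an open interval. However $\{0\}$ is not open since it does not contain an entire open interval $(-\varepsilon, \varepsilon)$, $\varepsilon > 0$.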

8. a) We claim $G = \{ y \in \mathbb{R} \mid y = \frac{1}{x}, \ x \geq \frac{1}{2} \} = (0, 2]$. Indeed, $z \in G$ implies $z = \frac{1}{x}$ for some $x \geq \frac{1}{2}$. On $[\frac{1}{2}, \infty)$ the function $x \mapsto \frac{1}{x}$ is strictly decreasing, strictly positive and tends to 0 for $x$ tending to $\infty$, hence $G = (0, 2]$, $\inf G = 0$ and $\sup G = 2$. Since $2 \in G$ we have $2 = \max G \, (= \sup G)$, but $0 \notin G$ and therefore $G$ has no minimum.
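b) Consider the sequence
\[ a_n := \begin{cases} 3, & n = 1 \\ 0, & n = 2 \\ \frac{5}{2} - \frac{1}{n}, & n = 3k \\ \frac{n}{n+1}, & n = 3k + 1 \\ \frac{1}{2} + \frac{1}{n}, & n = 3k + 2 \end{cases} \]
It holds
\begin{align*}
\lim_{k\to\infty} a_{3k} &= \lim_{k\to\infty} \left( \frac{5}{2} - \frac{1}{3k} \right) = \frac{5}{2} \\
\lim_{k\to\infty} a_{3k+1} &= \lim_{k\to\infty} \frac{3k+1}{3k+2} = 1 \\
\lim_{k\to\infty} a_{3k+2} &= \lim_{k\to\infty} \left( \frac{1}{2} + \frac{1}{3k+2} \right) = \frac{1}{2}
\end{align*}
and these are obviously all convergent subsequences of $(a_n)_{n\in\mathbb{N}}$. Hence $(a_n)_{n\in\mathbb{N}}$ has 3 accumulation points. Further
\[ 0 \leq \frac{3}{2} \leq \frac{5}{2} - \frac{1}{n} \leq \frac{5}{2} \leq 3, \]
\[ 0 \leq \frac{n}{n+1} \leq 1 \leq 3 \]
and
\[ 0 \leq \frac{1}{2} \leq \frac{1}{2} + \frac{1}{n} \leq \frac{3}{2} \leq 3 \]
implying that $\sup\{a_n \mid n \in \mathbb{N}\} = a_1 = 3$ and $\inf\{a_n \mid n \in \mathbb{N}\} = a_2 = 0$. Note that in our case the supremum is a maximum and the infimum is a minimum.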
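Of course we expect
\[ \limsup_{n\to\infty} a_n = \lim_{k\to\infty} a_{3k} = \frac{5}{2} \]
and
\[ \liminf_{n\to\infty} a_n = \lim_{k\to\infty} a_{3k+2} = \frac{1}{2}. \]
Here comes the proof:
The terms $a_{3k} = \frac{5}{2} - \frac{1}{3k}$ increase to $\frac{5}{2}$, while $a_l \leq 1$ for all $l \geq 3$ which are not a multiple of 3. Hence for $n \geq 3$
\[ \sup\{a_l \mid l \geq n\} = \frac{5}{2} \]
which implies
\[ \lim_{n\to\infty} \left( \sup\{a_l \mid l \geq n\} \right) = \frac{5}{2}, \]
i.e.
\[ \limsup_{n\to\infty} a_n = \frac{5}{2}. \]
Moreover, the terms $a_{3k+2} = \frac{1}{2} + \frac{1}{3k+2}$ decrease to $\frac{1}{2}$, while $a_l \geq \frac{4}{5}$ for all $l \geq 3$ which are not of the form $3k + 2$. Hence for $n \geq 3$
\[ \inf\{a_l \mid l \geq n\} = \frac{1}{2} \]
implying
\[ \lim_{n\to\infty} \left( \inf\{a_l \mid l \geq n\} \right) = \frac{1}{2} \]
or
\[ \liminf_{n\to\infty} a_n = \frac{1}{2}. \]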
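The behaviour of the tails of the sequence in part b) can be explored with exact fractions. This is only a finite-window sketch: a program can inspect finitely many terms, so `tail_max` and `tail_min` (hypothetical helper names, with an arbitrary window length) merely approximate $\sup\{a_l \mid l \geq n\}$ and $\inf\{a_l \mid l \geq n\}$.

```python
from fractions import Fraction


def a(n: int) -> Fraction:
    """The sequence of Problem 8 b)."""
    if n == 1:
        return Fraction(3)
    if n == 2:
        return Fraction(0)
    if n % 3 == 0:
        return Fraction(5, 2) - Fraction(1, n)
    if n % 3 == 1:
        return Fraction(n, n + 1)
    return Fraction(1, 2) + Fraction(1, n)


def tail_max(n: int, window: int = 3000) -> Fraction:
    """Maximum of a_n, ..., a_{n+window-1}, a lower estimate of the tail supremum."""
    return max(a(l) for l in range(n, n + window))


def tail_min(n: int, window: int = 3000) -> Fraction:
    """Minimum of a_n, ..., a_{n+window-1}, an upper estimate of the tail infimum."""
    return min(a(l) for l in range(n, n + window))


for n in (3, 10, 100):
    print(n, float(tail_max(n)), float(tail_min(n)))
```

Enlarging the window moves the two printed values closer to $\frac{5}{2}$ and $\frac{1}{2}$ without ever reaching them.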
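9. Consider the following table:
\[
\begin{array}{l|cccc}
 & \sup\{a_n \mid n \in \mathbb{N}\} & \inf\{a_n \mid n \in \mathbb{N}\} & \limsup_{n\to\infty} a_n & \liminf_{n\to\infty} a_n \\ \hline
a_n = 2 - 10^{n-1} & 1 & -\infty & -\infty & -\infty \\
a_n = \frac{(-1)^{n-1}}{n+1} & \frac{1}{2} & -\frac{1}{3} & 0 & 0 \\
a_n = \frac{2}{3}\left(1 - \frac{1}{10^n}\right) & \frac{2}{3} & \frac{3}{5} & \frac{2}{3} & \frac{2}{3}
\end{array}
\]
Since each sequence converges or diverges to $-\infty$, the lim inf and lim sup is in each case the limit.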
10. Once Corollary 19.23 is at our disposal, the problem is trivial. Here we provide a solution using the same idea as in the first part of Example 19.17. Clearly $a$ is an upper bound of $\{a_n \mid n \in \mathbb{N}\}$ and therefore $\sup\{a_n \mid n \in \mathbb{N}\} \leq a$. Given $\varepsilon > 0$ we can find $N(\varepsilon)$ such that $a - a_n < \varepsilon$ for all $n \geq N(\varepsilon)$, note that since $a_n \leq a$ we do not need to use the absolute value in this estimate. Thus for $n \geq N(\varepsilon)$ we have $a - \varepsilon < a_n$, implying that $a - \varepsilon$ cannot be an upper bound.
11. We may assume that $a$ is finite, for $a = +\infty$ the statement is trivial. Suppose that for some $\varepsilon > 0$ there exist infinitely many $a_{n_l}$, $l \in \mathbb{N}$, such that $a_{n_l} \geq a + \varepsilon$. Then all accumulation points of the sequence $(a_{n_l})_{l\in\mathbb{N}}$ are greater or equal to $a + \varepsilon > a$. Hence $(a_n)_{n\in\mathbb{N}}$ has a subsequence converging to a point larger than its limit superior which is of course a contradiction.
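12. Let $a = \limsup_{n\to\infty} a_n$ and $b = \limsup_{n\to\infty} b_n$. For $\varepsilon > 0$ we have $a_n < a + \varepsilon$ and $b_n < b + \varepsilon$ for all but finitely many $n \in \mathbb{N}$. This implies $\lambda a_n < \lambda a + \lambda \varepsilon$ for all but finitely many $n \in \mathbb{N}$ and $a_n + b_n < a + b + 2\varepsilon$ for all but finitely many $n \in \mathbb{N}$, which gives a) and b) respectively.
Now we apply b) to the sequences $(a_n + b_n)_{n\in\mathbb{N}}$ and $(-b_n)_{n\in\mathbb{N}}$ to find with (19.20)
\begin{align*}
\limsup_{n\to\infty} a_n &= \limsup_{n\to\infty} (a_n + b_n - b_n) \\
&\leq \limsup_{n\to\infty} (a_n + b_n) + \limsup_{n\to\infty} (-b_n) \\
&= \limsup_{n\to\infty} (a_n + b_n) - \liminf_{n\to\infty} b_n,
\end{align*}
which yields
\[ \limsup_{n\to\infty} (a_n + b_n) \geq \limsup_{n\to\infty} a_n + \liminf_{n\to\infty} b_n, \]
proving c). Now d) follows since
\[ \limsup_{n\to\infty} b_n = \liminf_{n\to\infty} b_n = \lim_{n\to\infty} b_n \]
and combining b) and c) we find
\[ \limsup_{n\to\infty} a_n + \lim_{n\to\infty} b_n \leq \limsup_{n\to\infty} (a_n + b_n) \leq \limsup_{n\to\infty} a_n + \lim_{n\to\infty} b_n. \]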
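13. We need to find two non-empty open sets $O_1$ and $O_2$ such that $O_1 \cap O_2 = \emptyset$ and $A \subset O_1 \cup O_2$. The two open intervals $\left(-\frac{1}{2}, \frac{3}{2}\right)$ and $\left(\frac{7}{4}, 5\right)$ will suffice. Clearly we have $\left(-\frac{1}{2}, \frac{3}{2}\right) \cap \left(\frac{7}{4}, 5\right) = \emptyset$ and further we find $[0, 1] \subset \left(-\frac{1}{2}, \frac{3}{2}\right)$ and $\{2\} \cup (3, 4) \subset \left(\frac{7}{4}, 5\right)$, implying $[0, 1] \cup \{2\} \cup (3, 4) \subset \left(-\frac{1}{2}, \frac{3}{2}\right) \cup \left(\frac{7}{4}, 5\right)$.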
Chapter 20
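1. This is merely a reformulation, but a helpful one, of Theorem 20.2.(ii) by replacing $f(x)$ by $f(\lim_{n\to\infty} x_n)$. Let us add a remark. It is important that
\[ \lim_{n\to\infty} f(x_n) = f\left( \lim_{n\to\infty} x_n \right) \]
holds for all sequences converging to $x \in [a, b]$, $x_n \in [a, b]$. Consider the function
\[ g(x) = \begin{cases} 1, & x \geq 0 \\ -1, & x < 0. \end{cases} \]
The graph of $g$ is given in the figure below.
[Figure: graph of $g$, equal to $-1$ for $x < 0$ and to $1$ for $x \geq 0$.]
For the sequence $\left(\frac{1}{2n}\right)_{n\in\mathbb{N}}$ it holds that $\lim_{n\to\infty} \frac{1}{2n} = 0$ and $\lim_{n\to\infty} g\left(\frac{1}{2n}\right) = 1 = g(0)$. For the sequence $\left(\frac{-1}{2n+1}\right)_{n\in\mathbb{N}}$ it holds that $\lim_{n\to\infty} g\left(\frac{-1}{2n+1}\right) = -1 \neq g(0)$. Obviously, $g$ is discontinuous.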
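2. Suppose that $f$ is continuous and $V \subset \mathbb{R}$ is open. We have to prove that $f^{-1}(V)$ is open. Take $y_0 = f(x_0) \in V$. Since $V$ is open there exists $\varepsilon > 0$ such that $(y_0 - \varepsilon, y_0 + \varepsilon) \subset V$. By continuity of $f$, we can find $\delta > 0$ such that $(x_0 - \delta, x_0 + \delta) \subset D$ and $|f(x) - f(x_0)| < \varepsilon$ for $|x - x_0| < \delta$, thus $(x_0 - \delta, x_0 + \delta) \subset f^{-1}(V)$ proving that $f^{-1}(V)$ is open.
Conversely, suppose the pre-image $f^{-1}(V)$ of every open set $V \subset \mathbb{R}$ is open. Take $x_0 \in D$ and set $y_0 = f(x_0)$. The interval $(y_0 - \varepsilon, y_0 + \varepsilon) \subset \mathbb{R}$ is open and consequently $f^{-1}((y_0 - \varepsilon, y_0 + \varepsilon))$ is open too and $x_0 \in f^{-1}((y_0 - \varepsilon, y_0 + \varepsilon))$. Thus there exists $\delta > 0$ such that $(x_0 - \delta, x_0 + \delta) \subset f^{-1}((y_0 - \varepsilon, y_0 + \varepsilon))$ and $f((x_0 - \delta, x_0 + \delta)) \subset (y_0 - \varepsilon, y_0 + \varepsilon)$. In other words: for $x_0 \in D$, given $\varepsilon > 0$ there exists $\delta > 0$ such that $|x - x_0| < \delta$, $x \in D$, implies $|f(x) - f(x_0)| < \varepsilon$, i.e. $f$ is continuous at all $x_0 \in D$.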
3. The function $f : D \to \mathbb{R}$ has at $x_0$ the limit $a$ from the right if for every $\varepsilon > 0$ there exists $\delta > 0$ such that $x \in D$, $0 < |x - x_0| < \delta$ and $x > x_0$ implies $|f(x) - a| < \varepsilon$, i.e. $0 < x - x_0 < \delta$ implies $|f(x) - a| < \varepsilon$. Analogously we find that $f : D \to \mathbb{R}$ has at $x_0$ the limit $b$ from the left if for every $\varepsilon > 0$ there exists $\delta > 0$ such that $0 < x_0 - x < \delta$ implies $|f(x) - b| < \varepsilon$.
(Note that we have taken for granted the assumption that there exists a sequence $(x_k)_{k\in\mathbb{N}}$, $x_k \in D$, $x_k \neq x_0$, converging to $x_0$.)
4. a) By Theorem 18.30 we know that every real number can be approximated by
rational numbers and by Theorem 18.35 the real numbers are not countable. Thus,
given x ∈ [0, 1] we can find a sequence of rational numbers (qn )n∈N , qn ∈ [0, 1],
converging to x, and we can find a sequence of irrational numbers (rn )n∈N , rn ∈
[0, 1], converging to x too. However
\[ \lim_{n\to\infty} \chi_{[0,1]\cap\mathbb{Q}}(q_n) = 1 \neq 0 = \lim_{n\to\infty} \chi_{[0,1]\cap\mathbb{Q}}(r_n), \]
implying that $\chi_{[0,1]\cap\mathbb{Q}}$ is at all $x \in [0, 1]$ discontinuous.

b) For $x \neq 0$ we can argue as in part a). However for $x = 0$ we find that every sequence, whether consisting only of rational points, only of irrational points, or both rational and irrational points converging to 0, will be mapped by $f$ onto a sequence converging to 0. Hence $f$ is continuous at 0.

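5. Let $M$ be a bound for $g$, i.e. $|g(x)| \leq M$ for all $x \in [0, 1]$. Given $\varepsilon > 0$, take $\delta = \frac{\varepsilon}{M}$ to find for $x \in [0, 1]$ such that $|x| = |x - 0| < \delta = \frac{\varepsilon}{M}$ that
\[ |f(x) - f(0)| = |x g(x)| \leq M|x| < \delta M = \varepsilon, \]
i.e. $f$ is continuous at 0.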
6. a) Suppose that $f$ is increasing, the decreasing case goes analogously. Since $x_\nu > x_0$ and $\lim_{\nu\to\infty} x_\nu = x_0$ it follows that $x_0 < x_\nu \leq x_N$, $x_N := \max\{x_\nu \mid \nu \in \mathbb{N}\}$. Consequently $(f(x_\nu))_{\nu\in\mathbb{N}}$ is bounded from below by $f(x_0)$ and from above by $f(x_N)$. By the Bolzano-Weierstrass theorem $(f(x_\nu))_{\nu\in\mathbb{N}}$ has at least one converging subsequence.
Let $\left(f(x_{\nu_l^1})\right)_{l\in\mathbb{N}}$ and $\left(f(x_{\nu_k^2})\right)_{k\in\mathbb{N}}$ be two converging subsequences. We want to show that they have the same limit. For $\nu_l^1$ there exists $\nu_{k(l)}^2$ such that $x_{\nu_l^1} \geq x_{\nu_{k(l)}^2}$ and for $\nu_{k(l)}^2$ there exists $\nu_{m(k)}^1$ such that $x_{\nu_{k(l)}^2} \geq x_{\nu_{m(k)}^1}$ implying that
\[ f(x_{\nu_l^1}) \geq f(x_{\nu_{k(l)}^2}) \geq f(x_{\nu_{m(k)}^1}) \]
and therefore $\lim_{l\to\infty} f(x_{\nu_l^1}) = \lim_{k\to\infty} f(x_{\nu_k^2})$. Thus all subsequences of $(f(x_\nu))_{\nu\in\mathbb{N}}$ converge to the same limit.
b) We suppose again that f is monotone increasing. Let x0 ∈ I and (xν )ν∈N ,
xν ∈ I, xν > x0 , be a sequence converging from the right to x0 and (yν )ν∈N , yν ∈
I, yν < x0 , be a sequence converging from the left to x0 . Part a) implies that
(f (xν ))ν∈N has a limit f (x0 +) and (f (yν ))ν∈N has a limit f (x0 −). By monotonicity,
if f (x0 +) = f (x0 −) then both must coincide with f (x0 ). Denote by D(f, I) the set
of all points of discontinuity which f has in I. It follows that
D(f, I) = {x ∈ I|f (x−) < f (x+)}.
For every x ∈ D(f, I) exists a rational number r(x) such that f (x−) < r(x) <
f (x+). The mapping x → r(x) maps D(f, I) injectively to Q, hence D(f, I) must
be denumerable.
7. Let $f : I \to \mathbb{R}$ be a monotone function and denote by $D(f, I)$ the denumerable set of its discontinuities, which we write as a monotone sequence $x_1 < x_2 < x_3 < \cdots$. Now we define $h : I \to \mathbb{R}$ as follows
\[ h(x) = \begin{cases} f(x), & x \in I \setminus D(f, I) \\ f(x_k+), & x = x_k \in D(f, I). \end{cases} \]
Clearly $f|_{I \setminus D(f,I)} = h|_{I \setminus D(f,I)}$, so $f$ and $h$ coincide outside a countable set and further $\lim_{x\to x_k, x<x_k} h(x) = f(x_k-)$ exists. Finally we have
\[ \lim_{x\to x_k, x>x_k} h(x) = \lim_{x\to x_k, x>x_k} f(x) = f(x_k+) = h(x_k), \]
hence $h$ is continuous from the right.

Note that càdlàg functions are most important when investigating certain stochastic processes, e.g. Lévy processes or more generally Feller processes.
8. a) Since
1
ϕ(x) = (f (x) + g(x) + |f (x) − g(x)|)
2
and
1
ψ(x) = (f (x) + g(x) − |f (x) − g(x)|)
2
the result follows immediately since in both cases on the right hand side we have
continuous functions. (Recall f ± g are continuous as is |h| for h continuous).
b) Clearly
\[ f_+(x) = \max(f(x), 0) \]
and
\[ f_-(x) = -\min(f(x), 0). \]
This implies immediately with part a) that for a continuous function $f$ the functions $f_+$ and $f_-$ are continuous. Now
\[ f_+(x) - f_-(x) = \max(f(x), 0) + \min(f(x), 0); \]
if $f(x) \geq 0$ then $f_+(x) = f(x)$ and $f_-(x) = 0$, and if $f(x) \leq 0$ then $-f_-(x) = f(x) = \min(f(x), 0)$, and $f_+(x) = 0$. Thus in each case we get
\[ f_+(x) - f_-(x) = f(x). \]
This decomposition also implies that if $f_+$ and $f_-$ are continuous then $f$ is continuous.
Finally we note
\[ |f(x)| = \begin{cases} f(x), & f(x) \geq 0 \\ -f(x), & f(x) \leq 0 \end{cases} = \begin{cases} f_+(x), & f(x) \geq 0 \\ f_-(x), & f(x) \leq 0 \end{cases} \]
but $f_+(x) = 0$ if $f_-(x) \neq 0$ and $f_-(x) = 0$ if $f_+(x) \neq 0$ and therefore
\[ |f(x)| = f_+(x) + f_-(x). \]
Note that both the positive and the negative part of a function are non-negative functions.
9. Let $x \in [a, b]$. There exists a sequence of rational numbers $(q_\nu)_{\nu\in\mathbb{N}}$, $q_\nu \in (a, b)$, converging to $x$, i.e. $\lim_{\nu\to\infty} q_\nu = x$. Consequently, we have $f(q_\nu) = g(q_\nu)$ and hence, by continuity of $f$ and $g$
\[ f(x) = \lim_{\nu\to\infty} f(q_\nu) = \lim_{\nu\to\infty} g(q_\nu) = g(x). \]
10. The even extension of $f$ is given by $f_e : [-a, a] \to \mathbb{R}$, where
\[ f_e(x) = \begin{cases} f(x), & x \in [0, a] \\ f(-x), & x \in [-a, 0]. \end{cases} \]
If $x > 0$ then $f_e(x) = f(x)$ and $f_e$ is continuous at $x$. If $x < 0$ then $f_e(x) = (f \circ (-\mathrm{id}))(x)$ and the continuity follows. Since $\lim_{x\to 0, x>0} f_e(x) = f(0) = \lim_{x\to 0, x<0} f_e(x)$, the continuity is also proven for $x = 0$.
Now the odd extension of $f - f(0)$ is given by $g : [-a, a] \to \mathbb{R}$
\[ g(x) := \begin{cases} f(x) - f(0), & x \in [0, a] \\ -f(-x) + f(0), & x \in [-a, 0] \end{cases} \]
and the continuity of $g$ follows with arguments similar to those given above. For $x \in (0, a]$ the function $x \mapsto g(x) = f(x) - f(0)$ is continuous since $f$ is. For $x \in [-a, 0)$ the function $x \mapsto g(x) = -f(-x) + f(0)$ is continuous as composition and sum of continuous functions.
For $x = 0$ we find
\[ \lim_{x\to 0, x>0} g(x) = \lim_{x\to 0, x>0} (f(x) - f(0)) = 0 = \lim_{x\to 0, x<0} (-f(-x) + f(0)) = \lim_{x\to 0, x<0} g(x) \]
hence $g$ is also continuous at $x = 0$.

11. a) For f, g : D → R continuous and λ ∈ R we know that f + g and λf as well
as f · g are continuous, so C(D) is an R-algebra.
b) for f, g ∈ C(D) and λ, μ ∈ R we find
Aop (λf + μg)(x) = a(x)(λf + μg)(x)
= λa(x)f (x) + μa(x)g(x)
= (λAop f )(x) + (μAop g)(x)
= (λAop f + μAop g)(x)

so Aop (λf + μg) = λAop f + μAop g proving the linearity of Aop : C(D) → C(D).
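12. a) Consider $h : D \to \mathbb{R}$, $h(D) \subset D$, and let $x_0 \in D$ be a fixed point of $h$, i.e. $h(x_0) = x_0$. In this case, the graph of $h$ must intersect the line $g(x) = x$ at $x_0$, see the figure below.
[Figure: the graph of $h$ meets the line $g(x) = x$ at the point $(x_0, x_0)$, $x_0 = h(x_0)$.]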
b) This follows from a simple algebraic consideration:
g(x0 ) = h(x0 ) + x0 − a
implies h(x0 ) − a = 0. Conversely h(x0 ) = a implies
g(x0 ) = h(x0 ) + x0 − a = x0 .
13. a) We apply Theorem 20.14 to the function $g - f$. For $h(x) := g(x) - f(x)$ defined on $[a, b]$ we have $h(a) = g(a) - f(a) > 0$ and $h(b) = g(b) - f(b) < 0$ implying that for some $x_0 \in (a, b)$ it holds that $h(x_0) = 0$ or $g(x_0) = f(x_0)$.
b) We apply part a) to $g(x) = \sin x$ and $f(x) = \frac{1}{2 + \cos^4 x}$ defined on $[\frac{\pi}{2}, \frac{3\pi}{2}]$. Since $\sin \frac{\pi}{2} = 1$ and $\cos \frac{\pi}{2} = 0$ we find $g(\frac{\pi}{2}) = 1 > \frac{1}{2} = f(\frac{\pi}{2})$. But for $x = \frac{3\pi}{2}$ we have $\sin \frac{3\pi}{2} = -1$ and $f(\frac{3\pi}{2}) > 0$, so $g(\frac{3\pi}{2}) < f(\frac{3\pi}{2})$. Hence there exists at least one $\xi \in (\frac{\pi}{2}, \frac{3\pi}{2})$ with $\sin \xi = \frac{1}{2 + \cos^4 \xi}$.
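The existence statement of part b) can be complemented by a numerical approximation of such a $\xi$. The sketch below uses plain bisection on $h = g - f$; bisection is our own choice of method (the solution itself only uses the intermediate value theorem), and the names `h` and `bisect` are hypothetical.

```python
import math


def h(x: float) -> float:
    """h(x) = g(x) - f(x) = sin x - 1 / (2 + cos^4 x)."""
    return math.sin(x) - 1.0 / (2.0 + math.cos(x) ** 4)


def bisect(func, lo: float, hi: float, tol: float = 1e-12) -> float:
    """Bisection for a continuous func with func(lo) > 0 > func(hi)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if func(mid) > 0:
            lo = mid   # the sign change lies in [mid, hi]
        else:
            hi = mid   # the sign change lies in [lo, mid]
    return 0.5 * (lo + hi)


xi = bisect(h, math.pi / 2, 3 * math.pi / 2)
print(xi, math.sin(xi), 1.0 / (2.0 + math.cos(xi) ** 4))
```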

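14. a) Below is a picture of $A_n = \left( \frac{1}{2}\left(\frac{1}{n} + \frac{1}{n+1}\right), \frac{1}{2}\left(\frac{1}{n} + \frac{1}{n-1}\right) \right)$.
[Figure: the interval $A_n$ around $\frac{1}{n}$, with endpoints $\frac{1}{2}(\frac{1}{n} + \frac{1}{n+1})$ and $\frac{1}{2}(\frac{1}{n} + \frac{1}{n-1})$ lying between $\frac{1}{n+1}$ and $\frac{1}{n-1}$.]
This gives
[Figure: the pairwise disjoint intervals $A_{n+2}, A_{n+1}, A_n, A_{n-1}, A_{n-2}$ around the points $\frac{1}{n+2}, \frac{1}{n+1}, \frac{1}{n}, \frac{1}{n-1}, \frac{1}{n-2}$.]
Hence $(A_n)_{n\in\mathbb{N}}$ is an open covering of $\{\frac{1}{n} \mid n \in \mathbb{N}\}$ and $A_n \cap A_m = \emptyset$ for $n \neq m$.
Therefore a finite number of the sets $A_n$, $n \in \mathbb{N}$, can never cover $\{\frac{1}{n} \mid n \in \mathbb{N}\}$ implying that this set is not compact.
Now we consider the set $\{\frac{1}{n} \mid n \in \mathbb{N}\} \cup \{0\}$. We claim that this set is compact. Let $(A_\nu)_{\nu\in I}$ be an open covering of $\{\frac{1}{n} \mid n \in \mathbb{N}\} \cup \{0\}$. Since $\lim_{n\to\infty} \frac{1}{n} = 0$ there exists $A_{\nu_0}$ and $N \in \mathbb{N}$ such that $0 \in A_{\nu_0}$ and for $k \geq N$ it follows that $\frac{1}{k} \in A_{\nu_0}$. For $k \leq N - 1$ there exists $A_{\nu_k}$ such that $\frac{1}{k} \in A_{\nu_k}$ and therefore
\[ A_{\nu_0} \cup A_{\nu_1} \cup \cdots \cup A_{\nu_{N-1}} \supset \left\{ \tfrac{1}{n} \mid n \in \mathbb{N} \right\} \cup \{0\}, \]
i.e. we have a finite subcovering.

b) We can use the idea developed when proving that the second set in part a) is compact. Let $(U_j)_{j\in I}$ be an open covering of $C := \{a_k \mid k \in \mathbb{N}_0\}$. Since $a_0 \in C$ there exists $U_{j_0}$ such that $a_0 \in U_{j_0}$. Further, since $U_{j_0}$ is open there exists an $\varepsilon > 0$ such that $(a_0 - \varepsilon, a_0 + \varepsilon) \subset U_{j_0}$. Now, $\lim_{k\to\infty} a_k = a_0$ implies the existence of $N = N(\varepsilon)$ such that $k \geq N$ implies $a_k \in (a_0 - \varepsilon, a_0 + \varepsilon) \subset U_{j_0}$, note that $a_k \in (a_0 - \varepsilon, a_0 + \varepsilon)$ is equivalent to $|a_k - a_0| < \varepsilon$. For $a_1, \ldots, a_N$ we can find $U_{j_1}, \ldots, U_{j_N}$ such that $a_l \in U_{j_l}$ and consequently $C \subset U_{j_0} \cup U_{j_1} \cup \cdots \cup U_{j_N}$, i.e. we have constructed a finite subcovering of $C$.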
15. First we sketch the situation. The set $U_x$, $x \in [0, 1]$, is an open interval with mid point $x$ and of length $\frac{3}{2N}$. See:
[Figure: the interval $U_x = \left(x - \frac{3}{4N}, x + \frac{3}{4N}\right)$ around the point $x$.]
The points $0, \frac{1}{N}, \ldots, \frac{N-1}{N}, 1$ give a partition of $(0, 1)$ and the distance of two neighbouring points is $\frac{1}{N}$. Therefore, for $\frac{k}{N}$ and $\frac{k+1}{N}$ we find
\[ U_{\frac{k}{N}} \cap U_{\frac{k+1}{N}} = \left( \frac{4k+1}{4N}, \frac{4k+3}{4N} \right) \neq \emptyset \]
and
\[ (0, 1) \subset \bigcup_{k=0}^{N} U_{\frac{k}{N}} = \left( \frac{-3}{4N}, \frac{4N+3}{4N} \right). \]
Thus, $(U_{\frac{k}{N}})_{k=0,\ldots,N}$ is indeed a finite subcovering of $(0, 1)$. However, since $(0, 1)$ is open it cannot be compact. Finding a finite subcovering for a special open covering is of course not sufficient for compactness.
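The interval arithmetic in Problem 15 is easy to verify with exact fractions. The sketch assumes $U_x = \left(x - \frac{3}{4N}, x + \frac{3}{4N}\right)$ as described in the solution; the helper names `interval` and `overlap` and the sample value $N = 5$ are our own.

```python
from fractions import Fraction


def interval(x: Fraction, N: int) -> tuple[Fraction, Fraction]:
    """Endpoints of U_x = (x - 3/(4N), x + 3/(4N))."""
    radius = Fraction(3, 4 * N)
    return (x - radius, x + radius)


def overlap(k: int, N: int) -> tuple[Fraction, Fraction]:
    """Endpoints of the intersection of U_{k/N} and U_{(k+1)/N}."""
    lo1, hi1 = interval(Fraction(k, N), N)
    lo2, hi2 = interval(Fraction(k + 1, N), N)
    return (max(lo1, lo2), min(hi1, hi2))


N = 5  # illustrative value only
for k in range(N):
    print(k, overlap(k, N))
print(interval(Fraction(0), N)[0], interval(Fraction(1), N)[1])
```

Since neighbouring intervals overlap in a non-empty open interval, their union is the single interval from the left endpoint of $U_0$ to the right endpoint of $U_1$.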

16. First we note that $\bigcap_{\nu\in I} K_\nu \subset K_{j_0}$ for every $j_0 \in I$. Now, $K_{j_0} \subset \mathbb{R}$ is compact, hence closed and bounded implying immediately that $\bigcap_{\nu\in I} K_\nu$ is bounded. Further we know that the intersection of an arbitrary family of closed sets is closed, hence $\bigcap_{\nu\in I} K_\nu$ is closed and bounded and therefore compact.

The sets $[-\nu, \nu]$, $\nu \in \mathbb{N}$, are compact but the set $\bigcup_{\nu\in\mathbb{N}} [-\nu, \nu] = \mathbb{R}$ is not.
17. a) For $x \in K$ there exists $\delta_x > 0$ such that for $y \in K$ with $|x - y| < \delta_x$ it follows that $|f(y) - f(x)| < \frac{f(x)}{4}$. The family of intervals $(x - \delta_x, x + \delta_x)$, $x \in K$, forms an open covering of $K$ and therefore, by compactness, we can find points $x_1, \dots, x_N \in K$ such that $(x_j - \delta_{x_j}, x_j + \delta_{x_j})_{j=1,\dots,N}$ forms a finite subcovering of $K$. On $(x_j - \delta_{x_j}, x_j + \delta_{x_j})$ it holds that $|f(y) - f(x_j)| < \frac{f(x_j)}{4}$, or $-\frac{f(x_j)}{4} < f(y) - f(x_j) < \frac{f(x_j)}{4}$, implying $0 < \frac{3f(x_j)}{4} < f(y)$, which yields
\[
0 < \min_{1\leq j\leq N} \frac{3f(x_j)}{4} < f(y).
\]
b) Since $f$ is uniformly continuous on $D$, given $\varepsilon = 1$ there exists $\delta > 0$ such that $|x - y| < \delta$ implies $|f(x) - f(y)| < 1$. Since $D$ is bounded we can cover $D$ with a finite number $N$ of intervals of length $2\delta$ with midpoints $x_j$, $j = 1, \dots, N$, belonging to $D$. On $(x_j - \delta, x_j + \delta)$ we have $|f(x) - f(x_j)| < 1$, or $|f(x)| < 1 + |f(x_j)|$, implying $|f(x)| < 1 + \max_{1\leq j\leq N} |f(x_j)|$ for all $x \in D$, i.e. $f$ is bounded.
18. First we note that $f|_{[-a,-a+1]}$ is uniformly continuous as a continuous function on a compact set. Thus, given $\varepsilon > 0$ there exists $\delta_1 > 0$ such that for $x, y \in [-a, -a+1]$ with $|x - y| < \delta_1$ it follows that $|\sqrt{x+a} - \sqrt{y+a}| < \varepsilon$.
Next we observe that if either $x \geq -a+1$ or $y \geq -a+1$ then $\sqrt{x+a} + \sqrt{y+a} \geq 1$ and therefore
\[
|\sqrt{x+a} - \sqrt{y+a}| \leq |\sqrt{x+a} + \sqrt{y+a}|\,|\sqrt{x+a} - \sqrt{y+a}| = |x - y|.
\]
Thus, given $\varepsilon > 0$ choose $\delta = \min(\delta_1, \varepsilon)$ to find for all $x, y \in [-a, \infty)$ that $|x - y| < \delta$ implies $|\sqrt{x+a} - \sqrt{y+a}| < \varepsilon$, i.e. $f$ is uniformly continuous on $[-a, \infty)$.
19. By uniform continuity of $f$, given $\varepsilon > 0$ there exists $\delta > 0$ such that $|f(x) - f(y)| \leq \varepsilon$ for all $x, y \in [a, b]$ such that $|x - y| < \delta$. Now let $a = x_0 < x_1 < \dots < x_n = b$ be a partition of $[a, b]$ such that $|x_k - x_{k-1}| < \delta$ for $k = 1, \dots, n$. We define $\varphi : [a, b] \to \mathbb{R}$ as follows: on $[x_{k-1}, x_k]$ we set
\[
\varphi|_{[x_{k-1},x_k]}(x) = \frac{f(x_k) - f(x_{k-1})}{x_k - x_{k-1}}\, x + \frac{f(x_{k-1})x_k - f(x_k)x_{k-1}}{x_k - x_{k-1}},
\]
i.e. the graph of $\varphi|_{[x_{k-1},x_k]}$ is the line segment connecting $(x_{k-1}, f(x_{k-1}))$ with $(x_k, f(x_k))$. Clearly, $\varphi$ is piecewise linear. By assumption, we have for $x, y \in [x_{k-1}, x_k]$ that $|f(x) - f(y)| \leq \varepsilon$, or $f([x_{k-1}, x_k]) \subset [\gamma_k - \varepsilon, \gamma_k]$ where $\gamma_k := \sup\{f(x) \mid x \in [x_{k-1}, x_k]\}$. Since $\varphi(x)$ lies between $f(x_{k-1})$ and $f(x_k)$ for $x \in [x_{k-1}, x_k]$, we have also, by construction, that $|f(x) - \varphi(x)| \leq \varepsilon$ for all $x \in [a, b]$.
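20. Given $\varepsilon > 0$ we take $\delta = \frac{\varepsilon}{\kappa}$ to find for all $x, y \in D$ with $|x - y| < \delta$ that
\[
|f(x) - f(y)| \leq \kappa|x - y| \leq \kappa\delta = \kappa\,\frac{\varepsilon}{\kappa} = \varepsilon,
\]
which implies the uniform continuity of $f$.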
21. a) By definition $f : D \to \mathbb{R}$ is uniformly continuous if for every $\varepsilon > 0$ there exists $\delta > 0$ such that for all $x, y \in D$ the condition $|x - y| < \delta$ implies $|f(x) - f(y)| < \varepsilon$. Thus if we restrict $x, y$ to $D' \subset D$, given $\varepsilon > 0$ we may still work with the same $\delta > 0$ to get the uniform continuity of $f|_{D'}$.
b) Since $\lim_{x\to a,\, x>a} g(x) = A$ exists, we can define $\tilde{g} : [a, b] \to \mathbb{R}$ by
\[
\tilde{g}(x) := \begin{cases} g(x), & x \in (a, b] \\ A, & x = a. \end{cases}
\]
By construction, $\tilde{g}$ is continuous on the compact interval $[a, b]$, hence $\tilde{g}$ is uniformly continuous. Now, the result follows from part a).
Chapter 21
1. If $S_{x_0,x}(t) = at + b$ then we must have $S_{x_0,x}(x_0) = f(x_0)$ and $S_{x_0,x}(x) = f(x)$, which yields $f(x_0) = ax_0 + b$ and $f(x) = ax + b$, or
\[
a = \frac{f(x) - f(x_0)}{x - x_0}
\]
and
\[
b = \frac{f(x_0)x - f(x)x_0}{x - x_0},
\]
i.e.
\[
S_{x_0,x}(t) = \frac{f(x) - f(x_0)}{x - x_0}\, t + \frac{f(x_0)x - f(x)x_0}{x - x_0}.
\]
The tangent line through $(x_0, f(x_0))$ is given by $g_{x_0}(t) = \alpha t + \beta$ with $g_{x_0}(x_0) = f(x_0)$ and $g_{x_0}'(x_0) = f'(x_0)$. This implies $g_{x_0}(t) = f'(x_0)t + f(x_0) - f'(x_0)x_0 = f'(x_0)(t - x_0) + f(x_0)$. Now we find
\[
\begin{aligned}
S_{x_0,x}(t) - g_{x_0}(t) &= \frac{f(x) - f(x_0)}{x - x_0}\, t + \frac{f(x_0)x - f(x)x_0}{x - x_0} - f'(x_0)t - f(x_0) + f'(x_0)x_0 \\
&= \left(\frac{f(x) - f(x_0)}{x - x_0} - f'(x_0)\right) t + x_0\left(f'(x_0) - \frac{f(x) - f(x_0)}{x - x_0}\right),
\end{aligned}
\]
which implies for $t \in \mathbb{R}$ fixed that
\[
\lim_{x\to x_0} \big(S_{x_0,x}(t) - g_{x_0}(t)\big) = 0
\]
or $\lim_{x\to x_0} S_{x_0,x}(t) = g_{x_0}(t)$.
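2. We use mathematical induction. For $k = 1$ the statement is just the well-known Leibniz rule
\[
(f \cdot g)'(x) = f'(x)g(x) + f(x)g'(x).
\]
Now suppose that
\[
\frac{d^k}{dx^k}(f \cdot g)(x) = \sum_{l=0}^{k} \binom{k}{l} f^{(k-l)}(x) g^{(l)}(x)
\]
and consider
\[
\begin{aligned}
(*)\qquad \frac{d^{k+1}}{dx^{k+1}}(f \cdot g)(x) &= \frac{d}{dx} \sum_{l=0}^{k} \binom{k}{l} f^{(k-l)}(x) g^{(l)}(x) \\
&= \sum_{l=0}^{k} \binom{k}{l} \frac{d}{dx}\big(f^{(k-l)}(x) \cdot g^{(l)}(x)\big) \\
&= \sum_{l=0}^{k} \binom{k}{l} \Big[ f^{(k+1-l)}(x) g^{(l)}(x) + f^{(k-l)}(x) g^{(l+1)}(x) \Big].
\end{aligned}
\]
We now handle the last term in the same way as the analogous term in the proof of the binomial theorem, Theorem 3.9:
\[
\begin{aligned}
&\sum_{l=0}^{k} \binom{k}{l} f^{(k+1-l)}(x) g^{(l)}(x) + \sum_{l=0}^{k} \binom{k}{l} f^{(k-l)}(x) g^{(l+1)}(x) \\
&\quad = f^{(k+1)}(x) g(x) + \sum_{l=1}^{k} \binom{k}{l} f^{(k+1-l)}(x) g^{(l)}(x) + \sum_{l=0}^{k-1} \binom{k}{l} f^{(k-l)}(x) g^{(l+1)}(x) + f(x) g^{(k+1)}(x) \\
&\quad = f^{(k+1)}(x) g(x) + \sum_{l=1}^{k} \binom{k}{l} f^{(k+1-l)}(x) g^{(l)}(x) + \sum_{l=1}^{k} \binom{k}{l-1} f^{(k-(l-1))}(x) g^{(l)}(x) + f(x) g^{(k+1)}(x) \\
&\quad = \binom{k+1}{0} f^{(k+1)}(x) g(x) + \sum_{l=1}^{k} \left[\binom{k}{l} + \binom{k}{l-1}\right] f^{(k+1-l)}(x) g^{(l)}(x) + \binom{k+1}{k+1} f(x) g^{(k+1)}(x) \\
(**)\quad &\quad = \sum_{l=0}^{k+1} \binom{k+1}{l} f^{(k+1-l)}(x) g^{(l)}(x),
\end{aligned}
\]
where we used Lemma 3.8, i.e. $\binom{k+1}{l} = \binom{k}{l-1} + \binom{k}{l}$. Thus the general Leibniz rule is proved by combining $(*)$ and $(**)$.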
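As a quick sanity check of the general Leibniz rule, one may test it on the pair $f(x) = e^{ax}$, $g(x) = e^{bx}$, for which $f^{(j)} = a^j f$ and $g^{(j)} = b^j g$, so that the rule reduces to the binomial identity $\sum_l \binom{k}{l} a^{k-l} b^l = (a+b)^k$. The following is only a minimal sketch; the function name `leibniz_coefficient` and the choice of test functions are our own assumptions for illustration.

```python
from math import comb


def leibniz_coefficient(a, b, k):
    """Coefficient of e^{(a+b)x} produced by the Leibniz sum for the k-th
    derivative of e^{ax} * e^{bx}: sum over l of C(k, l) a^(k-l) b^l."""
    return sum(comb(k, l) * a ** (k - l) * b ** l for l in range(k + 1))


# The k-th derivative of e^{(a+b)x} has coefficient (a+b)^k, so both sides
# must agree for every k (integer arithmetic, hence exact).
for k in range(6):
    print(k, leibniz_coefficient(2, -3, k), (2 - 3) ** k)
```

With integer values of `a` and `b` the comparison is exact, which is why integers rather than floats are used here.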
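3. We need to prove for $f, g \in C^k(I)$ and $\lambda, \mu \in \mathbb{R}$ that $\lambda f + \mu g \in C^k(I)$ and $f \cdot g \in C^k(I)$. Now the linearity of the derivative, i.e. $\frac{d}{dx}(\lambda f + \mu g) = \lambda \frac{df}{dx} + \mu \frac{dg}{dx}$, implies immediately the linearity of higher derivatives: for $l \leq k$
\[
\begin{aligned}
\frac{d^l}{dx^l}(\lambda f + \mu g) &= \frac{d}{dx} \frac{d^{l-1}}{dx^{l-1}}(\lambda f + \mu g) \\
&= \frac{d}{dx}\left(\lambda \frac{d^{l-1}}{dx^{l-1}} f + \mu \frac{d^{l-1}}{dx^{l-1}} g\right) \\
&= \lambda \frac{d^l f}{dx^l} + \mu \frac{d^l g}{dx^l},
\end{aligned}
\]
whereas the general Leibniz rule, see Problem 2, yields for $l \leq k$
\[
\frac{d^l}{dx^l}(f \cdot g) = \sum_{j=0}^{l} \binom{l}{j} f^{(l-j)} g^{(j)}
\]
and it follows that $\frac{d^l}{dx^l}(fg) \in C^{k-l}(I)$.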
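4. For $x \neq 0$ and $x \neq 1$ the function is obviously differentiable. To be differentiable also at $x_0 = 0$ and $x_1 = 1$ the function must be continuous at these points, and the right and the left derivatives, i.e.
\[
\lim_{x\to\tilde{x},\, x>\tilde{x}} \frac{f(x) - f(\tilde{x})}{x - \tilde{x}} \quad \text{and} \quad \lim_{x\to\tilde{x},\, x<\tilde{x}} \frac{f(x) - f(\tilde{x})}{x - \tilde{x}},
\]
must exist and coincide, $\tilde{x} \in \{x_0, x_1\}$.
This yields
\[
\begin{aligned}
a \cdot 0 + b &= c \cdot 0^2 + d \cdot 0 \\
a &= 2c \cdot 0 + d \\
c + d &= 1 - \frac{1}{1} \\
2c \cdot 1 + d &= \frac{1}{1^2}
\end{aligned}
\]
or $b = 0$, $a = d$, $c + d = 0$, $2c + d = 1$, i.e. $a = -1$, $b = 0$, $c = 1$, $d = -1$.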
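5. Using the law of the logarithm
\[
\ln(a \cdot b) = \ln a + \ln b
\]
we find first
\[
\ln \prod_{k=1}^{n} f_k(x) = \sum_{k=1}^{n} \ln f_k(x),
\]
and consequently
\[
\left(\ln \prod_{k=1}^{n} f_k(x)\right)' = \sum_{k=1}^{n} \big(\ln f_k(x)\big)',
\]
which yields
\[
\frac{\left(\prod_{k=1}^{n} f_k(x)\right)'}{\prod_{k=1}^{n} f_k(x)} = \sum_{k=1}^{n} \frac{f_k'(x)}{f_k(x)},
\]
where we used $(\ln g)' = \frac{g'}{g}$.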
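6. a) In the case that $f$ is differentiable at $x_0$, we find
\[
\begin{aligned}
\frac{f(x_0 + h) - f(x_0 - h)}{2h} &= \frac{f(x_0 + h) - f(x_0) + f(x_0) - f(x_0 - h)}{2h} \\
&= \frac{1}{2}\,\frac{f(x_0 + h) - f(x_0)}{h} + \frac{1}{2}\,\frac{f(x_0 - h) - f(x_0)}{-h}
\end{aligned}
\]
and passing to the limit $h \to 0$ yields
\[
\begin{aligned}
\lim_{h\to 0} \frac{f(x_0 + h) - f(x_0 - h)}{2h} &= \frac{1}{2} \lim_{h\to 0} \frac{f(x_0 + h) - f(x_0)}{h} + \frac{1}{2} \lim_{h\to 0} \frac{f(x_0 - h) - f(x_0)}{-h} \\
&= \frac{1}{2} f'(x_0) + \frac{1}{2} f'(x_0) = f'(x_0).
\end{aligned}
\]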
b) The function $g : (-a, a) \to \mathbb{R}$, $g(x) = |x|$, is not differentiable at $x_0 = 0$, compare Example 7.7. However we have for $x_0 = 0$ that $g(0 + h) = |h| = |-h| = g(0 - h)$, and therefore
\[
\frac{g(0 + h) - g(0 - h)}{2h} = 0
\]
implying that
\[
\lim_{h\to 0} \frac{g(0 + h) - g(0 - h)}{2h} = 0
\]
but $g'(0)$ does not exist.
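The difference between the symmetric quotient and the ordinary one-sided quotients can be made visible numerically. The following is a minimal sketch; the helper names `symmetric_quotient` and `one_sided_quotient` are our own choices for illustration.

```python
def symmetric_quotient(g, x0, h):
    """Symmetric difference quotient (g(x0 + h) - g(x0 - h)) / (2h)."""
    return (g(x0 + h) - g(x0 - h)) / (2 * h)


def one_sided_quotient(g, x0, h):
    """Ordinary difference quotient (g(x0 + h) - g(x0)) / h, h != 0."""
    return (g(x0 + h) - g(x0)) / h


# For g(x) = |x| at x0 = 0 the symmetric quotient vanishes for every h,
# while the one-sided quotients stay at +1 (h > 0) and -1 (h < 0).
for h in (0.1, 0.01, 0.001):
    print(h,
          symmetric_quotient(abs, 0.0, h),
          one_sided_quotient(abs, 0.0, h),
          one_sided_quotient(abs, 0.0, -h))
```

The two one-sided values do not approach a common limit, which is exactly the failure of differentiability at $0$, although the symmetric quotient converges.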
7. Consider the quotient
f (x) − f (0) x2 h(x)

= = |x|h(x).
x−0 x
Now, since $h$ is bounded, i.e. $|h(x)| \leq M$ for all $x$, we deduce that
\[
f'(0) = \lim_{x\to 0} \frac{f(x) - f(0)}{x - 0} = \lim_{x\to 0} |x|h(x) = 0,
\]
since $\big||x|h(x)\big| \leq |x|M$ and $\lim_{x\to 0} |x|M = 0$.
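8. a) For an even function $f : \mathbb{R} \to \mathbb{R}$ we find for all $x_0 \in \mathbb{R}$, with $y = -x$ and $y_0 = -x_0$,
\[
\begin{aligned}
f'(x_0) &= \lim_{x\to x_0} \frac{f(x) - f(x_0)}{x - x_0} = \lim_{x\to x_0} \frac{f(-x) - f(-x_0)}{x - x_0} \\
&= \lim_{y\to y_0} \frac{f(y) - f(y_0)}{-y - (-y_0)} = \lim_{y\to y_0} \left(-\frac{f(y) - f(y_0)}{y - y_0}\right) \\
&= -\lim_{y\to y_0} \frac{f(y) - f(y_0)}{y - y_0} = -f'(y_0) = -f'(-x_0),
\end{aligned}
\]
i.e. for all $x_0 \in \mathbb{R}$ we have $f'(x_0) = -f'(-x_0)$, which means that $f'$ is odd. Now, if $g : \mathbb{R} \to \mathbb{R}$ is an odd function it follows for all $x_0 \in \mathbb{R}$ that
\[
\begin{aligned}
g'(x_0) &= \lim_{x\to x_0} \frac{g(x) - g(x_0)}{x - x_0} \\
&= \lim_{x\to x_0} \frac{-g(-x) - (-g(-x_0))}{x - x_0} \\
&= \lim_{y\to y_0} \frac{-g(y) + g(y_0)}{-y - (-y_0)} \\
&= \lim_{y\to y_0} \frac{-(g(y) - g(y_0))}{-(y - y_0)} \\
&= \lim_{y\to y_0} \frac{g(y) - g(y_0)}{y - y_0} = g'(y_0) = g'(-x_0),
\end{aligned}
\]
i.e. $g'$ is even.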
b) Since $f$ is $a$-periodic, we have
\[
\frac{d}{dx}\big(f(x) \cdot f(x + a)\big) = \frac{d}{dx} f^2(x) = 2f(x)f'(x)
\]
and
\[
\begin{aligned}
\frac{d}{dx}\big(f(x)f(x + a)\big) &= f'(x)f(x + a) + f(x)f'(x + a) \\
&= f'(x)f(x) + f(x)f'(x + a),
\end{aligned}
\]
or
\[
f(x)f'(x) = f(x)f'(x + a).
\]
By assumption, we have $f(x) \neq 0$, so we deduce for all $x \in \mathbb{R}$ that $f'(x) = f'(x + a)$, i.e. $f'$ is $a$-periodic too. Note that the assumption $f(x) \neq 0$ for all $x$ can be relaxed when assuming that $f'$ is continuous. In this case it would be sufficient to have $f(x) \neq 0$ for all $x \in \mathbb{R} \setminus \mathbb{Q}$, or more generally for a dense set in $\mathbb{R}$, a notion we will discuss later on.
9. For $|x| < 1$ we find
\[
\sum_{k=0}^{N} x^k = \frac{1 - x^{N+1}}{1 - x}
\]
which implies by differentiation
\[
\sum_{k=1}^{N} kx^{k-1} = \frac{1}{(1 - x)^2} + \frac{Nx^{N+1} - (N + 1)x^N}{(1 - x)^2},
\]
and further for $x \neq 0$
\[
\frac{1}{x} \sum_{k=1}^{N} kx^k = \frac{1}{(1 - x)^2} + \frac{Nx^{N+1} - (N + 1)x^{N}}{(1 - x)^2},
\]
or
\[
\sum_{k=1}^{N} kx^k = \frac{x}{(1 - x)^2} + \frac{Nx^{N+2} - (N + 1)x^{N+1}}{(1 - x)^2},
\]
and this identity holds also for $x = 0$. Hence for all $|x| < 1$ we find, since in this case $\lim_{N\to\infty} Nx^{N+l} = 0$, $l = 1, 2, \dots$, that
\[
\sum_{k=1}^{\infty} kx^k = \lim_{N\to\infty} \sum_{k=1}^{N} kx^k = \frac{x}{(1 - x)^2}.
\]
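The closed form for the finite sum, and the limit $\frac{x}{(1-x)^2}$, can be checked directly. The following is a minimal sketch; the function names `partial_sum` and `closed_form` are ours, and exact rational arithmetic via `fractions.Fraction` is used so that the finite identity can be compared without rounding.

```python
from fractions import Fraction


def partial_sum(x, N):
    """Left-hand side: sum_{k=1}^{N} k x^k."""
    return sum(k * x ** k for k in range(1, N + 1))


def closed_form(x, N):
    """Right-hand side: x/(1-x)^2 + (N x^(N+2) - (N+1) x^(N+1))/(1-x)^2."""
    return (x + N * x ** (N + 2) - (N + 1) * x ** (N + 1)) / (1 - x) ** 2


x = Fraction(1, 2)
print(partial_sum(x, 10) == closed_form(x, 10))   # exact comparison
print(float(partial_sum(x, 60)), float(x / (1 - x) ** 2))  # near the limit
```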
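10. a) For $k = 1$ we have
\[
\frac{d}{dx}(1 + x^2)^{-\frac{1}{2}} = -\frac{1}{2}(2x)(1 + x^2)^{-\frac{3}{2}} = \frac{-x}{(1 + x^2)^{\frac{3}{2}}}
\]
and $P_1(x) = -x$. Now suppose that $\frac{d^k}{dx^k}(1 + x^2)^{-\frac{1}{2}} = \frac{P_k(x)}{(1 + x^2)^{\frac{2k+1}{2}}}$ and that the degree of $P_k$ is less than or equal to $k$. We want to prove
\[
\frac{d^{k+1}}{dx^{k+1}}(1 + x^2)^{-\frac{1}{2}} = \frac{P_{k+1}(x)}{(1 + x^2)^{\frac{2(k+1)+1}{2}}} = \frac{P_{k+1}(x)}{(1 + x^2)^{\frac{2k+3}{2}}}
\]
with a polynomial $P_{k+1}$ of degree at most $k + 1$. Note that
\[
\begin{aligned}
\frac{d^{k+1}}{dx^{k+1}}(1 + x^2)^{-\frac{1}{2}} &= \frac{d}{dx}\, \frac{P_k(x)}{(1 + x^2)^{\frac{2k+1}{2}}} \\
&= \left(\frac{d}{dx} P_k(x)\right) \frac{1}{(1 + x^2)^{\frac{2k+1}{2}}} + P_k(x)\, \frac{d}{dx}\, \frac{1}{(1 + x^2)^{\frac{2k+1}{2}}} \\
&= \frac{P_k'(x)(1 + x^2) + P_k(x)\left(-\frac{2k+1}{2}\right)(2x)}{(1 + x^2)^{\frac{2(k+1)+1}{2}}} \\
&= \frac{P_k'(x)(1 + x^2) - (2k + 1)xP_k(x)}{(1 + x^2)^{\frac{2k+3}{2}}}.
\end{aligned}
\]
Since $P_k(x)$ has degree less than or equal to $k$, $P_k'(x)$ has degree less than or equal to $k - 1$ and this implies that $(1 + x^2)P_k'(x)$ as well as $xP_k(x)$ has degree less than or equal to $k + 1$.
Now, if $P_k(x)$ has degree less than or equal to $k$, then we know that $|P_k(x)| \leq C_k(1 + x^2)^{\frac{k}{2}}$, which implies
\[
\left|\frac{d^k}{dx^k}(1 + x^2)^{-\frac{1}{2}}\right| \leq \frac{|P_k(x)|}{(1 + x^2)^{\frac{2k+1}{2}}} \leq C_k \frac{(1 + x^2)^{\frac{k}{2}}}{(1 + x^2)^{\frac{2k+1}{2}}} = C_k \frac{1}{(1 + x^2)^{\frac{k+1}{2}}}.
\]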
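The induction step above gives the recursion $P_{k+1}(x) = (1 + x^2)P_k'(x) - (2k+1)xP_k(x)$ with $P_1(x) = -x$. The following minimal sketch iterates this recursion and reports the degrees; polynomials are represented as coefficient lists (constant term first), and all helper names are our own assumptions for illustration.

```python
def poly_derivative(p):
    """Derivative of a polynomial given as [c0, c1, c2, ...]."""
    return [i * c for i, c in enumerate(p)][1:] or [0]


def poly_mul(p, q):
    """Product of two coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_sub(p, q):
    """Difference p - q, padding the shorter list with zeros."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a - b for a, b in zip(p, q)]


def next_polynomial(p_k, k):
    """P_{k+1}(x) = (1 + x^2) P_k'(x) - (2k + 1) x P_k(x)."""
    first = poly_mul([1, 0, 1], poly_derivative(p_k))
    second = poly_mul([0, 2 * k + 1], p_k)
    return poly_sub(first, second)


def degree(p):
    """Degree of the polynomial (0 for the zero polynomial)."""
    nonzero = [i for i, c in enumerate(p) if c != 0]
    return max(nonzero) if nonzero else 0


p = [0, -1]  # P_1(x) = -x
for k in range(1, 6):
    print(k, p, degree(p))
    p = next_polynomial(p, k)
```

For instance, the recursion gives $P_2(x) = 2x^2 - 1$, in agreement with differentiating $-x(1+x^2)^{-3/2}$ by hand.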
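b) Let $f \in C_b^m(\mathbb{R})$. First we note that for $l \leq m$, since $|f^{(l)}(x)| \leq M_l$ for some $M_l$, it follows that
\[
\left|f^{(l)}\big((1 + x^2)^{-\frac{1}{2}}\big)\right| \leq M_l.
\]
Next, with $g(x) = (1 + x^2)^{-\frac{1}{2}}$ we find with part a), for $k_j \geq 1$,
\[
\begin{aligned}
|g^{(j)}(x)|^{k_j} &\leq \gamma_{j,k_j} \left(\frac{1}{(1 + x^2)^{\frac{1}{2}}}\right)^{k_j} \left(\frac{1}{(1 + x^2)^{\frac{j}{2}}}\right)^{k_j} \\
&\leq \gamma_{j,k_j} \frac{1}{(1 + x^2)^{\frac{1}{2}}}\, \frac{1}{(1 + x^2)^{\frac{jk_j}{2}}}.
\end{aligned}
\]
Now the Faà di Bruno formula yields, where the sums run over all $k_1, \dots, k_m \in \mathbb{N}_0$ with $k_1 + 2k_2 + \dots + mk_m = m$ and $k = k_1 + \dots + k_m$,
\[
\begin{aligned}
\left|\frac{d^m}{dx^m} f\big((1 + x^2)^{-\frac{1}{2}}\big)\right| &\leq \left|\sum c_{m,k_1,\dots,k_m} f^{(k)}\big((1 + x^2)^{-\frac{1}{2}}\big) \left(\frac{g^{(1)}(x)}{1!}\right)^{k_1} \cdots \left(\frac{g^{(m)}(x)}{m!}\right)^{k_m}\right| \\
&\leq \sum \tilde{c}_{m,k_1,\dots,k_m} |g^{(1)}(x)|^{k_1} \cdots |g^{(m)}(x)|^{k_m} \\
&\leq C \frac{1}{(1 + x^2)^{\frac{1}{2}}}\, \frac{1}{(1 + x^2)^{\frac{1}{2}\sum_j jk_j}} \\
&= C \frac{1}{(1 + x^2)^{\frac{m+1}{2}}}.
\end{aligned}
\]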
Chapter 22
1. We may first apply Rolle's theorem to $f$, which yields the existence of $x_0 \in (a, b)$ such that $f'(x_0) = 0$. Now consider $f'|_{[a,x_0]}$ and $f'|_{[x_0,b]}$. Since $f'(a) = 0$ and $f'(b) = 0$, both functions $f'|_{[a,x_0]}$ and $f'|_{[x_0,b]}$ satisfy the assumptions of Rolle's theorem. Thus there exist $x_1 \in (a, x_0)$ and $x_2 \in (x_0, b)$, hence $x_1 \neq x_2$, such that
\[
f''(x_1) = f''(x_2) = 0.
\]
2. By our assumption, we find for $x, y \in (a, b)$, $x \neq y$, that
\[
\left|\frac{f(x) - f(y)}{x - y}\right| \leq c|x - y|^{\alpha},
\]
which implies that $f$ is differentiable at $x$ and
\[
f'(x) = \lim_{y\to x} \frac{f(x) - f(y)}{x - y} = 0
\]
for all $x \in (a, b)$, which yields that $f$ must be constant on $(a, b)$. Clearly $f$ is continuous on $[a, b]$, hence $f$ is constant on $[a, b]$.
Remark: The condition $|f(x) - f(y)| \leq C|x - y|^{\beta}$, $\beta > 0$, is called the Hölder condition. For $\beta = 1$ we recover the Lipschitz condition. The result proved above says that if $f$ is Hölder continuous, i.e. satisfies the Hölder condition, with exponent (Hölder exponent) $\beta > 1$, then $f$ is constant. For $0 < \beta < 1$ there are non-trivial functions satisfying the Hölder condition.
3. Let $x, y \in (a, b)$, $x < y$. By the mean value theorem we can find $\xi \in (x, y)$ such that $f(x) - f(y) = f'(\xi)(x - y)$, which gives
\[
|f(x) - f(y)| = |f'(\xi)||x - y| \leq M|x - y|,
\]
i.e. $f$ is Lipschitz continuous and by Problem 20 in Chapter 20, $f$ is uniformly continuous.
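4. We consider the function $y \mapsto \ln(1 + y)$ on the intervals $[0, \frac{x}{q}]$ and $[\frac{x}{q}, \frac{x}{p}]$ for $x > 0$ and apply the mean value theorem on both intervals. Thus using for $0 < y_1 < y_2$ the formula
\[
\ln(1 + y_2) - \ln(1 + y_1) = \frac{1}{1 + \xi}(y_2 - y_1), \quad \xi \in (y_1, y_2),
\]
we find
\[
\ln\left(1 + \frac{x}{q}\right) = \ln\left(1 + \frac{x}{q}\right) - \ln 1 = \frac{1}{1 + \xi_0}\, \frac{x}{q}, \quad 0 < \xi_0 < \frac{x}{q},
\]
and
\[
\ln\left(1 + \frac{x}{p}\right) - \ln\left(1 + \frac{x}{q}\right) = \frac{1}{1 + \xi_1}\left(\frac{x}{p} - \frac{x}{q}\right), \quad \frac{x}{q} < \xi_1 < \frac{x}{p}.
\]
Thus, since $\frac{1}{1+\xi_0} > \frac{1}{1+\xi_1}$ we obtain
\[
\frac{\ln\left(1 + \frac{x}{q}\right)}{\frac{x}{q}} > \frac{\ln\left(1 + \frac{x}{p}\right) - \ln\left(1 + \frac{x}{q}\right)}{\frac{x}{p} - \frac{x}{q}},
\]
implying
\[
\left(\frac{x}{p} - \frac{x}{q}\right) \ln\left(1 + \frac{x}{q}\right) > \frac{x}{q}\left(\ln\left(1 + \frac{x}{p}\right) - \ln\left(1 + \frac{x}{q}\right)\right),
\]
which gives
\[
\frac{x}{p} \ln\left(1 + \frac{x}{q}\right) > \frac{x}{q} \ln\left(1 + \frac{x}{p}\right)
\]
or
\[
q \ln\left(1 + \frac{x}{q}\right) > p \ln\left(1 + \frac{x}{p}\right),
\]
i.e.
\[
\ln\left(1 + \frac{x}{p}\right)^{p} < \ln\left(1 + \frac{x}{q}\right)^{q}
\]
and this implies of course
\[
\left(1 + \frac{x}{p}\right)^{p} < \left(1 + \frac{x}{q}\right)^{q}.
\]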
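The inequality $\left(1 + \frac{x}{p}\right)^p < \left(1 + \frac{x}{q}\right)^q$ for $0 < p < q$ and $x > 0$ is easy to observe numerically. The following is a minimal sketch with a helper `compound` of our own naming; the sample values of $x$, $p$ and $q$ are arbitrary.

```python
import math


def compound(x, p):
    """The expression (1 + x/p)^p for x > 0 and p > 0."""
    return (1 + x / p) ** p


x = 1.0
for p, q in [(1, 2), (2, 5), (5, 10), (10, 100)]:
    # Each row shows (1 + x/p)^p < (1 + x/q)^q, both staying below e^x.
    print(p, q, compound(x, p), compound(x, q), math.exp(x))
```

The printed values increase with the exponent and remain below $e^x$, consistent with the familiar limit $\lim_{p\to\infty}(1 + \frac{x}{p})^p = e^x$.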
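5. a) By l'Hospital's rule we find first
\[
\lim_{x\to\infty} \frac{e^{\alpha x}}{x} = \lim_{x\to\infty} \frac{\alpha e^{\alpha x}}{1} = \infty,
\]
and since $\frac{e^{\alpha x}}{x^{\beta}} = \left(\frac{e^{\frac{\alpha}{\beta}x}}{x}\right)^{\beta}$, we deduce
\[
\lim_{x\to\infty} \frac{e^{\alpha x}}{x^{\beta}} = \infty.
\]
b) First we note that
\[
\lim_{x\to\infty} \frac{\ln x}{x^{\alpha}} = \lim_{x\to\infty} \frac{\frac{1}{x}}{\alpha x^{\alpha-1}} = \lim_{x\to\infty} \frac{1}{\alpha x^{\alpha}} = 0
\]
and now we note
\[
\frac{(\ln x)^{\beta}}{x^{\alpha}} = \left(\frac{\ln x}{x^{\frac{\alpha}{\beta}}}\right)^{\beta}
\]
and the result follows.
c) Since $x^x = e^{x \ln x}$ we find by the continuity of $\exp$
\[
\lim_{x\to 0} x^x = \lim_{x\to 0} e^{x \ln x} = \exp\Big(\lim_{x\to 0} x \ln x\Big) = \exp(0) = 1,
\]
where we used
\[
\lim_{x\to 0} x \ln x = \lim_{x\to 0} \frac{\ln x}{\frac{1}{x}} = \lim_{x\to 0} \frac{\frac{1}{x}}{-\frac{1}{x^2}} = \lim_{x\to 0} (-x) = 0.
\]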
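6. First note that
\[
(8 - x)^{\frac{1}{x-7}} = e^{\ln (8 - x)^{\frac{1}{x-7}}} = e^{\frac{1}{x-7} \ln(8 - x)},
\]
and therefore, since
\[
\lim_{x\to 7} \frac{\ln(8 - x)}{x - 7} = \lim_{y\to 0} \frac{\ln(1 - y)}{y} = \lim_{y\to 0} \frac{-\frac{1}{1-y}}{1} = -1,
\]
we find
\[
\lim_{x\to 7} (8 - x)^{\frac{1}{x-7}} = e^{-1}.
\]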
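A numerical look at the limit is a useful plausibility check. The following minimal sketch evaluates $(8 - x)^{\frac{1}{x-7}}$ at points approaching $7$ from both sides; the function name `power_term` is our own choice.

```python
import math


def power_term(x):
    """The expression (8 - x)^(1/(x - 7)), defined for x < 8, x != 7."""
    return (8 - x) ** (1 / (x - 7))


for step in (1e-2, 1e-4, 1e-6):
    # Values from the right and from the left of 7, next to e^{-1}.
    print(step, power_term(7 + step), power_term(7 - step), math.exp(-1))
```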
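7. We write
\[
f\left(\frac{a}{\sqrt{x}}\right)^{x} = e^{x \ln f\left(\frac{a}{\sqrt{x}}\right)}
\]
and note that
\[
\lim_{x\to\infty,\, x>0} f\left(\frac{a}{\sqrt{x}}\right)^{x} = \lim_{x\to\infty,\, x>0} e^{x \ln f\left(\frac{a}{\sqrt{x}}\right)} = \exp\left(\lim_{x\to\infty,\, x>0} x \ln f\left(\frac{a}{\sqrt{x}}\right)\right)
\]
and further, by applying l'Hospital's rule twice,
\[
\begin{aligned}
\lim_{x\to\infty,\, x>0} x \ln f\left(\frac{a}{\sqrt{x}}\right) &= \lim_{y\to 0,\, y>0} \frac{\ln f(a\sqrt{y})}{y} \\
&= \lim_{y\to 0,\, y>0} \frac{a f'(a\sqrt{y})}{2\sqrt{y}\, f(a\sqrt{y})} \\
&= \lim_{y\to 0,\, y>0} \frac{a^2 f''(a\sqrt{y})}{2f(a\sqrt{y}) + 2a\sqrt{y}\, f'(a\sqrt{y})} = -\frac{a^2}{2},
\end{aligned}
\]
where we used $f(0) = 1$, $f'(0) = 0$ and $f''(0) = -1$. Hence we arrive at
\[
\lim_{x\to\infty,\, x>0} f\left(\frac{a}{\sqrt{x}}\right)^{x} = e^{-\frac{a^2}{2}}.
\]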
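One concrete function satisfying $f(0) = 1$, $f'(0) = 0$ and $f''(0) = -1$ is $f = \cos$, and for this choice the limit can be observed numerically. The following is a minimal sketch; the choice $f = \cos$ and the name `power_limit_term` are our own assumptions for illustration, not part of the problem.

```python
import math


def power_limit_term(a, x):
    """The expression cos(a / sqrt(x))^x, i.e. f(a/sqrt(x))^x with f = cos."""
    return math.cos(a / math.sqrt(x)) ** x


for a in (1.0, 2.0):
    for x in (1e2, 1e4, 1e6):
        # The values approach exp(-a^2 / 2) as x grows.
        print(a, x, power_limit_term(a, x), math.exp(-a * a / 2))
```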
8. Let $x \in (a, b)$ and $h > 0$ such that $x + h \in (a, b)$. It follows that $f(x + h) - f(x) \leq 0$ and therefore
\[
f'(x) = \lim_{h\to 0,\, h>0} \frac{f(x + h) - f(x)}{h} \leq 0.
\]
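9. Clearly for $k = 0$ we have $e^{-at} \geq 0$. Moreover for $k \in \mathbb{N}$ we find
\[
\frac{d^k}{dt^k}(e^{-at}) = (-a)^k e^{-at} = (-1)^k a^k e^{-at},
\]
implying that $(-1)^k (-1)^k a^k e^{-at} = a^k e^{-at} \geq 0$, i.e. $t \mapsto e^{-at}$, $a > 0$, is completely monotone.
Now, $1 - e^{-at}$, $a > 0$, $t \geq 0$, is always non-negative and for $k \in \mathbb{N}$ we find
\[
\frac{d^k}{dt^k}\big(1 - e^{-at}\big) = -\frac{d^k}{dt^k} e^{-at}
\]
and by the previous result it follows that $(-1)^k \frac{d^k}{dt^k}(1 - e^{-at}) \leq 0$.
Finally, for $\alpha = 1$ we have $t \geq 0$ for $t > 0$ and $\frac{d}{dt}(t) = 1 \geq 0$, as well as $\frac{d^k}{dt^k}(t) = 0$ for $k \geq 2$. Thus $t \mapsto t$ is a Bernstein function. For $0 < \alpha < 1$ we find first $t^{\alpha} > 0$ for $t > 0$ and further
\[
\begin{aligned}
\frac{d^k}{dt^k}(t^{\alpha}) &= \alpha(\alpha - 1)(\alpha - 2) \cdots (\alpha - k + 1)\, t^{\alpha-k} \\
&= \alpha|\alpha - 1||\alpha - 2| \cdots |\alpha - k + 1| (-1)^{k-1} t^{\alpha-k},
\end{aligned}
\]
and we arrive at
\[
(-1)^k \frac{d^k}{dt^k}(t^{\alpha}) = (-1)\,\alpha|\alpha - 1||\alpha - 2| \cdots |\alpha - k + 1|\, t^{\alpha-k} \leq 0.
\]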
10. a) We have
\[
f'(x) = \frac{\frac{1}{3} - x}{\big(x^2(1 - x)\big)^{\frac{1}{3}}}
\]
provided $x \neq 0$ and $x \neq 1$. Thus $f'$ vanishes at $x_0 = \frac{1}{3}$, and for $0 < x < \frac{1}{3}$ we have $f'(x) > 0$ and for $\frac{1}{3} < x < 1$ we have $f'(x) < 0$, i.e. to the left of the point $x_0 = \frac{1}{3}$ the function $f$ is strictly increasing, and for $x > \frac{1}{3}$ the function is strictly decreasing, hence we must have a local maximum at $x_0 = \frac{1}{3}$ and the value is $f\left(\frac{1}{3}\right) = \frac{\sqrt[3]{4}}{3}$.
For $x = 0$ and $x = 1$ the function $f$ is not differentiable. For $x \in (0, 1)$ we have $f(x) > 0$ and further $f(x) < 0$ for $x < 0$, hence there is no local extreme value at $x = 0$. However for $x = 1$ we find that $f(1) = 0$ and $f(x) > 0$ for $x > 1$ as well as for $x \in (0, 1)$, hence there is a local minimum at $x = 1$, see J. Kaczor and M.T. Nowak [6, p. 298].
b) The function $f$ has only strictly positive values and $\lim_{x\to\infty} f(x) = \lim_{x\to-\infty} f(x) = 0$. For $x = 0$ and $x = 1$ the function is not differentiable. Thus, to find the maximum we have to look at $(-\infty, 0)$, $(0, 1)$ and $(1, \infty)$ for a local maximum and compare with $f(0) = \frac{3}{2}$ and $f(1) = \frac{3}{2}$.
Now for $x < 0$ we find
\[
f(x) = \frac{1}{1 - x} + \frac{1}{1 - x + 1} = \frac{1}{1 - x} + \frac{1}{2 - x}
\]
and $f'(x) = \frac{1}{(1-x)^2} + \frac{1}{(2-x)^2} > 0$, implying that $f$ is strictly increasing on $(-\infty, 0)$, hence $f(x) < \frac{3}{2}$ for $x \in (-\infty, 0)$.
For $0 < x < 1$ we find
\[
f(x) = \frac{1}{1 + x} + \frac{1}{2 - x}
\]
and
\[
f'(x) = -\frac{1}{(1 + x)^2} + \frac{1}{(2 - x)^2},
\]
implying $f'\left(\frac{1}{2}\right) = 0$. Since
\[
f''(x) = \frac{2}{(1 + x)^3} + \frac{2}{(2 - x)^3}
\]
and therefore $f''\left(\frac{1}{2}\right) > 0$, we find that $f$ has a local minimum at $\frac{1}{2}$.
Finally, for $x > 1$ we have
\[
f(x) = \frac{1}{1 + x} + \frac{1}{1 + x - 1} = \frac{1}{1 + x} + \frac{1}{x}
\]
and
\[
f'(x) = -\frac{1}{(1 + x)^2} - \frac{1}{x^2}.
\]
Thus on $(1, \infty)$ the function $f$ is strictly decreasing. It follows that the global maximum of $f$ is $\frac{3}{2}$ and it is attained at the two points $x_0 = 0$ and $x_1 = 1$.
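The three branches used above are consistent with the function $f(x) = \frac{1}{1+|x|} + \frac{1}{1+|x-1|}$; taking this as an assumption (the problem statement itself is not repeated here), a coarse grid search reproduces the global maximum $\frac{3}{2}$ at $0$ and $1$. The following is a minimal sketch with names of our own choosing.

```python
def f(x):
    """Assumed form of the function: 1/(1 + |x|) + 1/(1 + |x - 1|)."""
    return 1 / (1 + abs(x)) + 1 / (1 + abs(x - 1))


def grid_maximum(lo=-5000, hi=6000, scale=1000):
    """Maximum of f over the grid {i/scale : lo <= i <= hi} and its maximisers."""
    grid = [i / scale for i in range(lo, hi + 1)]
    best = max(f(x) for x in grid)
    return best, [x for x in grid if f(x) == best]


print(grid_maximum())   # maximum value and the grid points where it is attained
print(f(0.5))           # the local minimum inside (0, 1)
```

A grid search is of course no proof; it only confirms that the case analysis has not overlooked a larger value on the sampled range.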
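11. The tangent line at $x_0$ is the graph of
\[
g_{x_0}(t) = g'(x_0)t + g(x_0) - x_0 g'(x_0)
\]
and the normal line at $x_0$ is the graph of
\[
n_{x_0}(t) = -\frac{1}{g'(x_0)}\, t + g(x_0) + \frac{x_0}{g'(x_0)}
\]
provided $g'(x_0) \neq 0$.
With $g(x) = \sqrt{1 - x^2}$, $x \in (-1, 1)$, we find
\[
g'(x) = -\frac{x}{\sqrt{1 - x^2}} \neq 0
\]
for $x \neq 0$. Thus
\[
\begin{aligned}
g_{x_0}(t) &= \frac{-x_0 t}{\sqrt{1 - x_0^2}} + \sqrt{1 - x_0^2} + \frac{x_0^2}{\sqrt{1 - x_0^2}} \\
&= \frac{-x_0 t + 1 - x_0^2 + x_0^2}{\sqrt{1 - x_0^2}} = \frac{-x_0 t + 1}{\sqrt{1 - x_0^2}}
\end{aligned}
\]
and for $x_0 \neq 0$
\[
\begin{aligned}
n_{x_0}(t) &= \frac{\sqrt{1 - x_0^2}}{x_0}\, t + \sqrt{1 - x_0^2} - \frac{x_0\sqrt{1 - x_0^2}}{x_0} \\
&= \frac{\sqrt{1 - x_0^2}}{x_0}\, t.
\end{aligned}
\]
For $x_0 = 0$ the normal line is of course the ordinate axis, i.e. the vertical line through the origin. Since the graph of $g$ is the upper half circle, we expect the centre of curvature to be the origin for all $x_0 \in (-1, 1)$. In general we have for $c = (c_1, c_2)$
\[
c_1 = x_0 - g'(x_0)\, \frac{1 + g'(x_0)^2}{g''(x_0)}
\]
and
\[
c_2 = g(x_0) + \frac{1 + g'(x_0)^2}{g''(x_0)}.
\]
Since $g''(x_0) = \frac{-1}{(1 - x_0^2)^{\frac{3}{2}}}$ we find
\[
\begin{aligned}
1 + g'(x_0)^2 &= 1 + \frac{x_0^2}{1 - x_0^2} = \frac{1}{1 - x_0^2}, \\
\frac{1 + g'(x_0)^2}{g''(x_0)} &= -\frac{(1 - x_0^2)^{\frac{3}{2}}}{1 - x_0^2} = -\sqrt{1 - x_0^2}, \\
g'(x_0)\, \frac{1 + g'(x_0)^2}{g''(x_0)} &= -\frac{x_0}{\sqrt{1 - x_0^2}} \cdot \left(-\sqrt{1 - x_0^2}\right) = x_0,
\end{aligned}
\]
and consequently
\[
c_1 = x_0 - x_0 = 0
\]
as well as
\[
c_2 = \sqrt{1 - x_0^2} - \sqrt{1 - x_0^2} = 0.
\]
Finally, as radius of curvature we find
\[
r = \frac{\big(1 + g'(x_0)^2\big)^{\frac{3}{2}}}{|g''(x_0)|} = \frac{1}{(1 - x_0^2)^{\frac{3}{2}}} \cdot \frac{(1 - x_0^2)^{\frac{3}{2}}}{1} = 1,
\]
as we should expect: the circle of curvature of a circle is the circle itself.
[Figure: the upper half circle over $[-1, 1]$ with the tangent line $\tilde{g}_{x_0}$ and the normal line $\tilde{n}_{x_0}$ at the point $x_0$.]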
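12. For $x \in (0, \infty)$ we have $f'(x) = -\frac{1}{x^2}$ and $f''(x) = \frac{2}{x^3}$, which yields for the normal line that it is the graph of the function
\[
\begin{aligned}
n_{x_0}(t) &= -\frac{1}{f'(x_0)}\, t + f(x_0) + \frac{x_0}{f'(x_0)} \\
&= x_0^2\, t + \frac{1}{x_0} - x_0^3.
\end{aligned}
\]
The centre of curvature $c = (c_1, c_2)$ is given by
\[
c_1 = x_0 - f'(x_0)\, \frac{1 + f'(x_0)^2}{f''(x_0)}
\]
and
\[
c_2 = f(x_0) + \frac{1 + f'(x_0)^2}{f''(x_0)},
\]
and since
\[
\frac{1 + f'(x_0)^2}{f''(x_0)} = \frac{1 + x_0^4}{2x_0}
\]
we get
\[
\begin{aligned}
c_1 &= x_0 + \frac{1}{x_0^2} \cdot \frac{x_0^4 + 1}{2x_0} = \frac{2x_0^4 + x_0^4 + 1}{2x_0^3} = \frac{3x_0^4 + 1}{2x_0^3}, \\
c_2 &= \frac{1}{x_0} + \frac{1 + x_0^4}{2x_0} = \frac{3 + x_0^4}{2x_0}.
\end{aligned}
\]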
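The formulas for the centre and the radius of curvature used in Problems 11 and 12 can be evaluated directly. The following is a minimal sketch; the function `curvature_data` and the helper functions for the two curves are our own naming, and the derivatives are supplied by hand rather than computed symbolically.

```python
import math


def curvature_data(g, dg, d2g, x0):
    """Centre (c1, c2) and radius r of curvature of the graph of g at x0,
    given g, its first derivative dg and its second derivative d2g."""
    s = 1 + dg(x0) ** 2
    c1 = x0 - dg(x0) * s / d2g(x0)
    c2 = g(x0) + s / d2g(x0)
    r = s ** 1.5 / abs(d2g(x0))
    return c1, c2, r


# Upper half circle g(x) = sqrt(1 - x^2) on (-1, 1), as in Problem 11.
def half_circle(x):
    return math.sqrt(1 - x * x)


def half_circle_d1(x):
    return -x / math.sqrt(1 - x * x)


def half_circle_d2(x):
    return -1 / (1 - x * x) ** 1.5


# Hyperbola f(x) = 1/x on (0, infinity), as in Problem 12.
def hyperbola(x):
    return 1 / x


def hyperbola_d1(x):
    return -1 / x ** 2


def hyperbola_d2(x):
    return 2 / x ** 3


print(curvature_data(half_circle, half_circle_d1, half_circle_d2, 0.3))
print(curvature_data(hyperbola, hyperbola_d1, hyperbola_d2, 1.0))
```

For the half circle the centre comes out as the origin (up to rounding) with radius $1$; for $f(x) = \frac{1}{x}$ at $x_0 = 1$ the centre agrees with $\left(\frac{3x_0^4+1}{2x_0^3}, \frac{3+x_0^4}{2x_0}\right) = (2, 2)$.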
Chapter 23
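1. For $m = 2$ the statement is $f(\lambda_1 x_1 + \lambda_2 x_2) \leq \lambda_1 f(x_1) + \lambda_2 f(x_2)$, $\lambda_1, \lambda_2 \in [0, 1]$, $\lambda_1 + \lambda_2 = 1$. Thus with $\lambda := \lambda_1$ and $\lambda_2 = 1 - \lambda$ we recover the definition of convexity. Now suppose (23.12) holds for some $m \geq 2$. We want to prove that it also holds for $m + 1$. For this, take points $x_1, \dots, x_{m+1} \in I$ and $\lambda_1, \dots, \lambda_{m+1} \in [0, 1]$ with $\sum_{j=1}^{m+1} \lambda_j = 1$. Since for $\lambda_m + \lambda_{m+1} > 0$
\[
\lambda_m x_m + \lambda_{m+1} x_{m+1} = (\lambda_m + \lambda_{m+1}) \left(\frac{\lambda_m}{\lambda_m + \lambda_{m+1}}\, x_m + \frac{\lambda_{m+1}}{\lambda_m + \lambda_{m+1}}\, x_{m+1}\right) = \tilde{\lambda}_m \tilde{x}_m,
\]
by our induction hypothesis we find
\[
\begin{aligned}
f(\lambda_1 x_1 + \dots + \lambda_{m+1} x_{m+1}) &= f(\lambda_1 x_1 + \dots + \lambda_{m-1} x_{m-1} + \tilde{\lambda}_m \tilde{x}_m) \\
&\leq \lambda_1 f(x_1) + \dots + \lambda_{m-1} f(x_{m-1}) + \tilde{\lambda}_m f(\tilde{x}_m) \\
&= \lambda_1 f(x_1) + \dots + \lambda_{m-1} f(x_{m-1}) + (\lambda_m + \lambda_{m+1}) f\left(\frac{\lambda_m}{\lambda_m + \lambda_{m+1}}\, x_m + \frac{\lambda_{m+1}}{\lambda_m + \lambda_{m+1}}\, x_{m+1}\right) \\
&\leq \lambda_1 f(x_1) + \dots + \lambda_{m-1} f(x_{m-1}) + \lambda_m f(x_m) + \lambda_{m+1} f(x_{m+1}),
\end{aligned}
\]
where we used in the last step the convexity of $f$ and the fact that $\frac{\lambda_m}{\lambda_m + \lambda_{m+1}} + \frac{\lambda_{m+1}}{\lambda_m + \lambda_{m+1}} = 1$.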
2. We⏐ will prove more, namely that if I has end points a 0 such that a < a1 − η and b1 + η < b. Now choose x1 , y1 , x2 , y2 ∈ I

such that $x_1 < y_1 < a_1 - \eta$ and $b_1 + \eta < x_2 < y_2$. Take $x, y \in [a_1, b_1]$ and suppose $x < y$ (otherwise change the roles of $x$ and $y$ in the following argument). We apply Lemma 23.4 to $x_1, y_1, x$ and then to $y_1, x, y$ to find
\[
\frac{f(y_1) - f(x_1)}{y_1 - x_1} \leq \frac{f(x) - f(y_1)}{x - y_1}
\]
and
\[
\frac{f(x) - f(y_1)}{x - y_1} \leq \frac{f(y) - f(x)}{y - x},
\]
hence
\[
\frac{f(y_1) - f(x_1)}{y_1 - x_1} \leq \frac{f(y) - f(x)}{y - x}.
\]
Applying Lemma 23.4 once more, first to $x, y, x_2$ and then to $y, x_2, y_2$, we arrive at
\[
\frac{f(y) - f(x)}{y - x} \leq \frac{f(y_2) - f(x_2)}{y_2 - x_2}.
\]
Thus we get the estimate
\[
\left|\frac{f(y) - f(x)}{y - x}\right| \leq \max\left(\left|\frac{f(y_1) - f(x_1)}{y_1 - x_1}\right|, \left|\frac{f(y_2) - f(x_2)}{y_2 - x_2}\right|\right)
\]
which implies with
\[
L := \max\left(\left|\frac{f(y_1) - f(x_1)}{y_1 - x_1}\right|, \left|\frac{f(y_2) - f(x_2)}{y_2 - x_2}\right|\right)
\]
the Lipschitz estimate
\[
|f(y) - f(x)| \leq L|y - x|
\]
for $x, y \in [a_1, b_1]$.
3. Suppose that $f$ has at $x_0 \in \mathbb{R}$ a local minimum, i.e. for some $\delta > 0$ it follows that $|x - x_0| \leq \delta$ implies $f(x_0) \leq f(x)$. For $x \in \mathbb{R}$ such that $|x - x_0| > \delta$ we note that $\frac{\delta}{|x - x_0|} \in (0, 1)$ and further with
\[
y := \frac{\delta}{|x - x_0|}\, x + \left(1 - \frac{\delta}{|x - x_0|}\right) x_0
\]
we first find $|y - x_0| = \delta$, hence $f(x_0) \leq f(y)$, and using the convexity of $f$ we find
\[
f(x_0) \leq f(y) \leq \frac{\delta}{|x - x_0|}\, f(x) + \left(1 - \frac{\delta}{|x - x_0|}\right) f(x_0),
\]
implying $f(x_0) \leq f(x)$, i.e. $f(x_0)$ is a global minimum of $f$.

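4. a) On $(0, \infty)$ the function $\ln$ is twice continuously differentiable with $\frac{d^2}{dx^2} \ln x = -\frac{1}{x^2}$. Hence the function $x \mapsto -\ln x$ is convex, i.e. $\ln$ is concave. Using Jensen's inequality we obtain
\[
-\ln\left(\frac{x_1}{n} + \dots + \frac{x_n}{n}\right) \leq -\frac{1}{n} \ln x_1 - \frac{1}{n} \ln x_2 - \dots - \frac{1}{n} \ln x_n,
\]
or
\[
\ln\left(\frac{x_1}{n} + \dots + \frac{x_n}{n}\right) \geq \frac{1}{n}(\ln x_1 + \dots + \ln x_n),
\]
i.e.
\[
\ln\left(\frac{x_1}{n} + \dots + \frac{x_n}{n}\right) \geq \frac{1}{n} \ln(x_1 \cdot \ldots \cdot x_n)
\]
which yields
\[
\frac{1}{n} \sum_{k=1}^{n} x_k \geq \left(\prod_{k=1}^{n} x_k\right)^{\frac{1}{n}}.
\]
b) Since
\[
\frac{d^2}{dx^2}(x \ln x) = \frac{1}{x} > 0
\]
for $x \in (0, \infty)$, we note that $f$ is convex, and consequently by convexity
\[
\frac{x + y}{2} \ln \frac{x + y}{2} \leq \frac{x}{2} \ln x + \frac{y}{2} \ln y,
\]
or
\[
(x + y) \ln \frac{x + y}{2} \leq x \ln x + y \ln y.
\]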
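Both inequalities of Problem 4 lend themselves to a quick numerical illustration. The following is a minimal sketch; the function names are our own, and the geometric mean is computed through logarithms, mirroring the use of $\ln$ in the proof.

```python
import math


def arithmetic_mean(xs):
    """(x_1 + ... + x_n) / n."""
    return sum(xs) / len(xs)


def geometric_mean(xs):
    """(x_1 * ... * x_n)^(1/n) for positive x_k, computed via logarithms."""
    return math.exp(sum(math.log(x) for x in xs) / len(xs))


def entropy_gap(x, y):
    """x ln x + y ln y - (x + y) ln((x + y)/2), non-negative by part b)."""
    return x * math.log(x) + y * math.log(y) - (x + y) * math.log((x + y) / 2)


print(geometric_mean([1.0, 2.0, 4.0]), arithmetic_mean([1.0, 2.0, 4.0]))
print(entropy_gap(1.0, 3.0), entropy_gap(2.0, 2.0))
```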
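5. First we sketch the situation.
[Figure: graphs of the functions $f_a$ on $[-1, 1]$; the value $e^{\frac{3}{2}}$ is marked on the vertical axis.]
Now, for $x \in [-1, 0]$ and $1 \leq a \leq \frac{3}{2}$ we have $e^{ax} \leq e^{x}$ and for $x \in [0, 1]$ and $1 \leq a \leq \frac{3}{2}$ we find $e^{ax} \leq e^{\frac{3}{2}x}$, implying
\[
\sup_{a\in[1,\frac{3}{2}]} f_a(x) = \begin{cases} e^{\frac{3}{2}x}, & x \in [0, 1] \\ e^{x}, & x \in [-1, 0]. \end{cases}
\]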
6. For 0 ≤ λ ≤ 1 and x, y ∈ R we find using first the monotonicity of h and the

convexity of f , and then the convexity of h.
(h ◦ f )(λx + (1 − λ)y) = h(f (λx + (1 − λ)y))
≤ h(λf (x) + (1 − λ)f (y))

≤ λh (f (x)) + (1 − λ)h (f (y))
= λ(h ◦ f )(x) + (1 − λ)(h ◦ f )(y).
7. First we note that $\frac{\|x - y\|_k}{1 + \|x - y\|_k} \leq 1$, implying that the series converges for all $x, y \in \mathbb{R}^n$. Moreover, from the definition it follows that $d(x, y) \geq 0$ for all $x, y \in \mathbb{R}^n$, and if $d(x, y) = 0$ then $\|x - y\|_k = 0$ for all $k \in \mathbb{N}$, hence, since $\|\cdot\|_k$ is a norm, $x = y$. Since for every norm $\|x - y\| = \|y - x\|$ holds we also find that $d$ is symmetric, i.e. $d(x, y) = d(y, x)$.
Moreover, the monotonicity of $f(t) = \frac{t}{1+t}$, $t \geq 0$, implies
\[
f(\|x - y\|_k) \leq f(\|x - z\|_k + \|z - y\|_k),
\]
and it follows
\[
\begin{aligned}
d(x, y) &= \sum_{k=1}^{\infty} \frac{1}{2^k} f(\|x - y\|_k) \\
&\leq \sum_{k=1}^{\infty} \frac{1}{2^k} f(\|x - z\|_k + \|z - y\|_k) \\
&= \sum_{k=1}^{\infty} \frac{1}{2^k}\, \frac{\|x - z\|_k + \|z - y\|_k}{1 + \|x - z\|_k + \|z - y\|_k} \\
&= \sum_{k=1}^{\infty} \frac{1}{2^k}\, \frac{\|x - z\|_k}{1 + \|x - z\|_k + \|z - y\|_k} + \sum_{k=1}^{\infty} \frac{1}{2^k}\, \frac{\|z - y\|_k}{1 + \|x - z\|_k + \|z - y\|_k} \\
&\leq \sum_{k=1}^{\infty} \frac{1}{2^k}\, \frac{\|x - z\|_k}{1 + \|x - z\|_k} + \sum_{k=1}^{\infty} \frac{1}{2^k}\, \frac{\|z - y\|_k}{1 + \|z - y\|_k} \\
&= d(x, z) + d(z, y).
\end{aligned}
\]

8. For $x, y, z \in \mathbb{R}^n$ we find with $\|x\| := \|x\|_2$, using $2\sqrt{ab} \leq a + b$ which holds for $a, b \geq 0$, that
\[
\begin{aligned}
2(1 + \|y\|)(1 + \|z\|) &= 2 + 2\|y\| + 2\|z\| + 2\|y\| \cdot \|z\| \\
&= \big(1 + \|y\| + \|z\| + (\|y\| + \|z\|)\big) + (1 + 2\|y\| \cdot \|z\|) \\
&\geq 1 + \|y\| + \|z\| + 2\sqrt{\|y\| \cdot \|z\|} \\
&= 1 + \left(\sqrt{\|y\|} + \sqrt{\|z\|}\right)^2 \\
&\geq 1 + \left(\sqrt{\|y + z\|}\right)^2 = 1 + \|y + z\|,
\end{aligned}
\]
where for the last estimate we need $\sqrt{a + b} \leq \sqrt{a} + \sqrt{b}$ for $a, b \geq 0$. Thus with $z = x - y$ we get
\[
2(1 + \|y\|)(1 + \|x - y\|) \geq 1 + \|y + x - y\| = 1 + \|x\|,
\]
or
\[
\frac{1 + \|x\|}{1 + \|y\|} \leq 2(1 + \|x - y\|).
\]
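9. a) We first prove that $\|\cdot\|$ is a norm on $\mathbb{R}^n$. Note that
\[
\|x\| = \|x\|_{(1)} + \|x\|_{(2)} \geq 0
\]
and $\|x\| = 0$ implies $\|x\|_{(1)} + \|x\|_{(2)} = 0$, i.e. $\|x\|_{(1)} = 0$ and $\|x\|_{(2)} = 0$, implying $x = 0$.
Moreover, for $\lambda \in \mathbb{R}$ we find with $x \in \mathbb{R}^n$
\[
\|\lambda x\| = \|\lambda x\|_{(1)} + \|\lambda x\|_{(2)} = |\lambda|\|x\|_{(1)} + |\lambda|\|x\|_{(2)} = |\lambda|\big(\|x\|_{(1)} + \|x\|_{(2)}\big) = |\lambda|\|x\|.
\]
Finally for $x, y \in \mathbb{R}^n$ we get
\[
\begin{aligned}
\|x + y\| &= \|x + y\|_{(1)} + \|x + y\|_{(2)} \\
&\leq \|x\|_{(1)} + \|y\|_{(1)} + \|x\|_{(2)} + \|y\|_{(2)} \\
&= \|x\|_{(1)} + \|x\|_{(2)} + \|y\|_{(1)} + \|y\|_{(2)} \\
&= \|x\| + \|y\|.
\end{aligned}
\]
Now we turn to $|||\cdot|||$. Clearly
\[
|||x||| = \max\big(\|x\|_{(1)}, \|x\|_{(2)}\big) \geq 0
\]
and if $|||x||| = 0$ then $\|x\|_{(1)} = 0$ and $\|x\|_{(2)} = 0$, implying $x = 0$. For $\lambda \in \mathbb{R}$ and $x \in \mathbb{R}^n$ we have
\[
\begin{aligned}
|||\lambda x||| &= \max\big(\|\lambda x\|_{(1)}, \|\lambda x\|_{(2)}\big) \\
&= \max\big(|\lambda|\|x\|_{(1)}, |\lambda|\|x\|_{(2)}\big) \\
&= |\lambda| \max\big(\|x\|_{(1)}, \|x\|_{(2)}\big) = |\lambda|\,|||x|||.
\end{aligned}
\]
Moreover, for $x, y \in \mathbb{R}^n$ we have
\[
\begin{aligned}
|||x + y||| &= \max\big(\|x + y\|_{(1)}, \|x + y\|_{(2)}\big) \\
&\leq \max\big(\|x\|_{(1)} + \|y\|_{(1)}, \|x\|_{(2)} + \|y\|_{(2)}\big) \\
&= \frac{1}{2}\Big(\|x\|_{(1)} + \|y\|_{(1)} + \|x\|_{(2)} + \|y\|_{(2)} + \big|\|x\|_{(1)} + \|y\|_{(1)} - \|x\|_{(2)} - \|y\|_{(2)}\big|\Big) \\
&= \frac{1}{2}\Big(\big(\|x\|_{(1)} + \|x\|_{(2)}\big) + \big(\|y\|_{(1)} + \|y\|_{(2)}\big) + \big|\big(\|x\|_{(1)} - \|x\|_{(2)}\big) + \big(\|y\|_{(1)} - \|y\|_{(2)}\big)\big|\Big) \\
&\leq \frac{1}{2}\Big(\|x\|_{(1)} + \|x\|_{(2)} + \big|\|x\|_{(1)} - \|x\|_{(2)}\big|\Big) + \frac{1}{2}\Big(\|y\|_{(1)} + \|y\|_{(2)} + \big|\|y\|_{(1)} - \|y\|_{(2)}\big|\Big) \\
&= \max\big(\|x\|_{(1)}, \|x\|_{(2)}\big) + \max\big(\|y\|_{(1)}, \|y\|_{(2)}\big) \\
&= |||x||| + |||y|||,
\end{aligned}
\]
i.e. $|||\cdot|||$ is a norm on $\mathbb{R}^n$.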
b) The triangle inequality yields for $y, z \in \mathbb{R}^n$ that
\[
\|z + y\| \leq \|z\| + \|y\| \quad \text{or} \quad \|z + y\| - \|y\| \leq \|z\|,
\]
which gives with $x = z + y$, i.e. $z = x - y$, that
\[
\|x\| - \|y\| \leq \|x - y\|.
\]
Analogously we obtain
\[
-(\|x\| - \|y\|) = \|y\| - \|x\| \leq \|x - y\|
\]
implying
\[
\big|\|x\| - \|y\|\big| \leq \|x - y\|,
\]
where the inequality
\[
\|x\| - \|y\| \leq \big|\|x\| - \|y\|\big|
\]
is obvious.
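10. First we recall that $(x_k)_{k\in\mathbb{N}}$ converges to $x$ in $\|\cdot\|_p$ if for every $\varepsilon > 0$ there exists $N(\varepsilon)$ such that $k \geq N(\varepsilon)$ implies $\|x_k - x\|_p = \left(\sum_{j=1}^{n} |x_k^{(j)} - x^{(j)}|^p\right)^{1/p} < \varepsilon$, which implies immediately that $k \geq N(\varepsilon)$ yields $|x_k^{(j)} - x^{(j)}| < \varepsilon$ for $j = 1, \dots, n$, i.e. $(x_k^{(j)})_{k\in\mathbb{N}}$ converges to $x^{(j)}$. Conversely, suppose that for every $j = 1, \dots, n$ the sequence $(x_k^{(j)})_{k\in\mathbb{N}}$ converges to $x^{(j)}$. Given $\varepsilon > 0$ we can find $N(\varepsilon)$ such that $k \geq N(\varepsilon)$ implies for all $j = 1, \dots, n$ that $|x_k^{(j)} - x^{(j)}| < \frac{\varepsilon}{n^{1/p}}$, which gives
\[
\|x_k - x\|_p = \left(\sum_{j=1}^{n} \big|x_k^{(j)} - x^{(j)}\big|^p\right)^{\frac{1}{p}} < \varepsilon.
\]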
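11. Since $\lim_{k\to\infty} \|x_k - x\|_p = 0$ we have that for every $\varepsilon > 0$ there exists $N(\varepsilon) \in \mathbb{N}$ such that $k \geq N(\varepsilon)$ implies $\|x_k - x\|_p < \frac{\varepsilon}{c}$. Consequently, given $\varepsilon > 0$ and $N(\varepsilon)$ chosen as above, we find
\[
\|x_k - x\| \leq c\|x_k - x\|_p < c \cdot \frac{\varepsilon}{c} = \varepsilon,
\]
i.e. $\lim_{k\to\infty} \|x_k - x\| = 0$.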
Chapter 24
1. For $x \in \mathbb{R}$ fixed the sequence $(g_n(x))_{n\in\mathbb{N}}$ clearly converges to 0. However $\sup_{x\in\mathbb{R}} |g_{2n}(x)| = \infty$, thus we cannot expect $\|g_n - 0\|_\infty = \|g_n\|_\infty$ to converge to 0, and therefore the convergence is not uniform.
2. Once we have proved that $f$ is the pointwise limit of $(f_n)_{n\in\mathbb{N}}$ we have also shown that the convergence cannot be uniform: each $f_n$ is continuous, $f$ is not, but the uniform limit of continuous functions must be continuous.
Now, for $x = 0$ we have $f_n(0) = \frac{1}{2}$, hence $\lim_{n\to\infty} f_n(0) = \frac{1}{2}$. If $x \neq 0$ then $(nx - 1)^2$ diverges to $+\infty$ and hence $\lim_{n\to\infty} f_n(x) = \lim_{n\to\infty} \frac{1}{1 + (nx - 1)^2} = 0$.
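3. a) We know for $x \neq 1$ that $x^n \to 0$, and $f_n(1) = 0$ for all $n$. Thus we conclude that $(f_n)_{n\in\mathbb{N}}$ converges pointwise to 0. The function $f_n$ attains its maximum on $[0,1]$, and it is attained at $x_n = \frac{n}{n+1}$, since $f_n'(x) = nx^{n-1} - (n + 1)x^n$, and $nx^{n-1} - (n+1)x^n = 0$ implies either $x = 0$ or $x = \frac{n}{n+1}$. For $x_n$ we find
\[
f_n(x_n) = \left(\frac{n}{n+1}\right)^{n} \left(1 - \frac{n}{n+1}\right) = \frac{n^n}{(n+1)^{n+1}},
\]
and
\[
\lim_{n\to\infty} \frac{n^n}{(n + 1)^{n+1}} = \lim_{n\to\infty} \left(\frac{n}{n+1}\right)^{n} \frac{1}{n+1} = 0.
\]
Thus we have $\lim_{n\to\infty} \|f_n - 0\|_\infty = \lim_{n\to\infty} \left(\frac{n}{n+1}\right)^{n} \frac{1}{n+1} = 0$, implying the uniform convergence of $(f_n)_{n\in\mathbb{N}}$ to 0.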
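The computation above corresponds to $f_n(x) = x^n(1 - x)$ on $[0, 1]$ (this form is read off from $f_n'$ and $f_n(x_n)$ and is an assumption here). The supremum norm $\frac{n^n}{(n+1)^{n+1}}$ can then be compared with a brute-force maximum on a grid. The following is a minimal sketch with names of our own choosing.

```python
def f(n, x):
    """Assumed form of the sequence: f_n(x) = x^n (1 - x) on [0, 1]."""
    return x ** n * (1 - x)


def sup_norm_exact(n):
    """The value f_n(n/(n+1)) = n^n / (n+1)^(n+1)."""
    return n ** n / (n + 1) ** (n + 1)


def sup_norm_grid(n, steps=10000):
    """Maximum of f_n over the grid {i/steps : 0 <= i <= steps}."""
    return max(f(n, i / steps) for i in range(steps + 1))


for n in (1, 5, 25, 100):
    # The exact and the grid value agree closely and are below 1/(n + 1).
    print(n, sup_norm_exact(n), sup_norm_grid(n), 1 / (n + 1))
```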
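b) We notice that
\[
g_n(x) = \frac{x^2}{\frac{1}{n}(1 + nx)} = \frac{x^2}{\frac{1}{n} + x}
\]
and hence the pointwise limit is $g(x) = x$. Moreover we find for $x \in [0, 1]$
\[
|g_n(x) - x| = \left|\frac{nx^2}{1 + nx} - x\right| = \left|\frac{nx^2 - (1 + nx)x}{1 + nx}\right| = \frac{x}{1 + nx}
\]
and
\[
\sup_{x\in[0,1]} |g_n(x) - x| = \sup_{x\in[0,1]} \frac{x}{1 + nx} = \frac{1}{1 + n};
\]
note $\frac{d}{dx}\left(\frac{x}{1+nx}\right) = \frac{1}{(1+nx)^2}$, i.e. $x \mapsto \frac{x}{1+nx}$ is monotone increasing, hence on $[0, 1]$ we have $\frac{x}{1+nx} \leq \frac{1}{1+n}$. Now we conclude that $(g_n)_{n\in\mathbb{N}}$ converges uniformly on $[0, 1]$ to $g(x) = x$.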
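c) Since $\arctan$ is continuous and $\arctan 0 = 0$ we find for each $x \in \mathbb{R}$ that $\lim_{n\to\infty} \arctan \frac{4x}{x^2 + n^4} = 0$. The mean value theorem implies
\[
|\arctan z - \arctan y| \leq |z - y|
\]
since $\frac{d}{dx}(\arctan x) = \frac{1}{1+x^2} \leq 1$. Consequently we have
\[
\sup_{x\in\mathbb{R}} \left|\arctan \frac{4x}{x^2 + n^4} - 0\right| \leq \sup_{x\in\mathbb{R}} \left|\frac{4x}{x^2 + n^4} - 0\right| \leq \sup_{x\in\mathbb{R}} \frac{4|x|}{x^2 + n^4} = \sup_{x>0} \frac{4x}{x^2 + n^4}.
\]
But $\sup_{x>0} \frac{4x}{x^2 + n^4} = \frac{4n^2}{n^4 + n^4} = \frac{2}{n^2}$, the supremum being attained at $x = n^2$, and this tends to 0 as $n \to \infty$. Thus $(h_n)_{n\in\mathbb{N}}$, $h_n(x) = \arctan \frac{4x}{x^2 + n^4}$, converges on $\mathbb{R}$ uniformly to 0.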
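d) Since $|\cos a_n x| \leq 1$ and $\lim_{n\to\infty} \frac{1}{n^{\alpha}} = 0$ we find
\[
\sup_{x\in\mathbb{R}} \left|\frac{1}{n^{\alpha}} \cos(a_n x) - 0\right| = \frac{1}{n^{\alpha}} \sup_{x\in\mathbb{R}} |\cos(a_n x)| \leq \frac{1}{n^{\alpha}},
\]
and once again we have uniform convergence.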
4. Let $x, y \in I$, $x < y$. It follows that
\[
f(x) = \lim_{n\to\infty} f_n(x) \leq \lim_{n\to\infty} f_n(y) = f(y).
\]
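5. Since for every $c_k$ there exists a sequence $(c_{k,n})_{n\in\mathbb{N}}$ of rational numbers $c_{k,n}$ converging to $c_k$, given $\varepsilon > 0$ we can find $N_0(\varepsilon) \in \mathbb{N}$ such that $n \geq N_0(\varepsilon)$ implies $|c_k - c_{k,n}| < \frac{\varepsilon}{N+1}$ for all $k = 0, \dots, N$. This implies, since $0 \leq x \leq 1$,
\[
\begin{aligned}
|p(x) - p_n(x)| &= \left|\sum_{k=0}^{N} (c_k - c_{k,n})x^k\right| \\
&\leq \sum_{k=0}^{N} |c_k - c_{k,n}|\, x^k \\
&< \sum_{k=0}^{N} \frac{\varepsilon}{N+1} = \varepsilon,
\end{aligned}
\]
and consequently for $n \geq N_0(\varepsilon)$
\[
\|p - p_n\|_\infty < \varepsilon,
\]
implying the uniform convergence of $(p_n)_{n\in\mathbb{N}}$ to $p$.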

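6. Let $\varepsilon > 0$ be given. There exists $N(\varepsilon) \in \mathbb{N}$ such that $n \geq N(\varepsilon)$ implies $|f_n(y) - f(y)| \leq \|f_n - f\|_\infty < \frac{\varepsilon}{2}$ for all $y$, as well as $|f(x_n) - f(x)| < \frac{\varepsilon}{2}$, the latter due to the continuity of $f$ at $x$ and the convergence of $(x_n)_{n\in\mathbb{N}}$ to $x$. Hence for $n \geq N(\varepsilon)$ it follows that $|f_n(x_n) - f(x)| \leq |f_n(x_n) - f(x_n)| + |f(x_n) - f(x)| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon$, i.e. $\lim_{n\to\infty} f_n(x_n) = f(x)$.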
7. Let $x \in I$. There exist $\alpha < \beta$ such that $x \in [\alpha, \beta] \subset [a, b]$ and consequently we have
\[
g_{\alpha,\beta}(x) = \lim_{n\to\infty} f_n|_{[\alpha,\beta]}(x).
\]
For any interval $[\alpha', \beta'] \subset I$ such that $x \in [\alpha', \beta']$ it follows that $g_{\alpha,\beta}(x) = g_{\alpha',\beta'}(x)$. So we may define $f : I \to \mathbb{R}$, $f(x) = g_{\alpha,\beta}(x)$ for some $[\alpha, \beta] \subset I$, $x \in [\alpha, \beta]$. Moreover for every $x \in I$ we have $f(x) = \lim_{n\to\infty} f_n(x)$. The continuity of $f$ at $x$ follows from the uniform convergence of $f_n|_{[\alpha,\beta]}$ to $g_{\alpha,\beta}$. Thus $g_{\alpha,\beta}$ is continuous on $[\alpha, \beta]$ and consequently $f$ is continuous at every $x \in I$.
Note that in general we cannot prove the uniform convergence of $f_n$ to $f$ on $(a, b)$.
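8. We note first that $g_n(x) = \frac{f\left(x + \frac{1}{n}\right) - f(x)}{\frac{1}{n}}$ and now we use the mean value theorem to find
\[
\begin{aligned}
g_n(x) - f'(x) &= \frac{f\left(x + \frac{1}{n}\right) - f(x)}{\frac{1}{n}} - f'(x) \\
&= f'(\xi_n) - f'(x)
\end{aligned}
\]
for some $\xi_n \in \left(x, x + \frac{1}{n}\right)$, or
\[
|g_n(x) - f'(x)| = |f'(\xi_n) - f'(x)|.
\]
Now we use the uniform continuity of $f'$: For $\varepsilon > 0$ we can find $\delta > 0$ such that $|y - z| < \delta$ implies $|f'(y) - f'(z)| < \varepsilon$. For $\delta$ we may find $N \in \mathbb{N}$, $N = N(\varepsilon)$, such that $n \geq N(\varepsilon)$ implies $\frac{1}{n} < \delta$, and consequently for $n \geq N(\varepsilon)$
\[
|g_n(x) - f'(x)| < \varepsilon,
\]
proving the uniform convergence of $(g_n)_{n\in\mathbb{N}}$ to $f'$.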

9. Since $f_n'(x) = \frac{1 - n^2x^2}{(1 + n^2x^2)^2}$ we find
\[
\sup_{x\in[-1,1]} |f_n(x) - 0| = \sup_{x\in[-1,1]} |f_n(x)| = \frac{1}{2n},
\]
and we obtain uniform convergence of $(f_n)_{n\in\mathbb{N}}$ to 0. Now for $x = 0$ we have $f_n'(0) = 1$ for all $n$, whereas for $x \in [-1, 1] \setminus \{0\}$ we have
\[
\lim_{n\to\infty} \frac{1 - n^2x^2}{(1 + n^2x^2)^2} = 0.
\]
Since the pointwise limit is not continuous, the convergence of the derivatives cannot be uniform.
Chapter 25
1. Let ϕ ∈ T [a, b] be given with respect to the partition Zϕ (x0 , . . . , xn ) and ψ ∈
T [a, b] with respect to the partition Zψ (t0 , . . . , tm ). Denote the joint partition by
Z = Zϕ ∪ Zψ , Z = Z(y0 , . . . , yk ). For 1 ≤ l ≤ k it follows that ϕ|(yl−1 ,yl ) = cl and
ψ|(yl−1 ,yl ) = dl for some cl , dl ∈ R. Consequently (ϕ · ψ)|(yl−1 ,yl ) = cl · dl and hence
ϕ · ψ ∈ T [a, b].
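2. Since $f$ is Riemann integrable, given $\varepsilon > 0$ there exist step functions $\varphi, \psi \in T[a, b]$ such that $\varphi \leq f \leq \psi$ and $\int_a^b (\psi - \varphi)\,dx < \varepsilon$. Let $\varphi$ be given with respect to $Z_\varphi$ and $\psi$ with respect to $Z_\psi$. Let $Z = Z_\varphi \cup Z_\psi$ be the joint partition and $\tilde{Z} := Z \cup \{y_1, \dots, y_N\}$. With respect to $\tilde{Z}$ we define two step functions
\[
\tilde{\varphi}(x) = \begin{cases} c_j, & x = y_j \in \{y_1, \dots, y_N\} \\ \varphi(x), & x \in [a, b] \setminus \{y_1, \dots, y_N\} \end{cases}
\]
and
\[
\tilde{\psi}(x) = \begin{cases} c_j, & x = y_j \in \{y_1, \dots, y_N\} \\ \psi(x), & x \in [a, b] \setminus \{y_1, \dots, y_N\}. \end{cases}
\]
It follows that $\tilde{\varphi} \leq \tilde{f} \leq \tilde{\psi}$ and further
\[
\int_a^b (\tilde{\psi} - \tilde{\varphi})(x)\,dx = \int_a^b (\psi - \varphi)(x)\,dx < \varepsilon.
\]
The latter equality follows when using $\tilde{Z}$ to calculate both integrals. Note that for a step function represented with respect to $Z = Z(t_0, \dots, t_m)$ the values at $t_j$, $1 \leq j \leq m$, do not contribute to the integral.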
3. a) Consider the function g : [a, b] → R defined by

-
1, a ≤ x < c c x<c
and therefore g is not continuous at c.

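b) First we note that if $h : (\lambda, \mu) \to \mathbb{R}$ is a bounded continuous function such that $\lim_{x\to\lambda,\, x>\lambda} h(x) =: h_\lambda$ and $\lim_{x\to\mu,\, x<\mu} h(x) =: h_\mu$ exist, then we can extend $h$ to a continuous function $\bar{h} : [\lambda, \mu] \to \mathbb{R}$ by defining
\[
\bar{h}(x) := \begin{cases} h_\lambda, & x = \lambda \\ h(x), & x \in (\lambda, \mu) \\ h_\mu, & x = \mu. \end{cases}
\]
Therefore for the piecewise continuous function $f : [a, b] \to \mathbb{R}$ there exists a partition $Z(x_0, \dots, x_n)$ of $[a, b]$ such that $f|_{(x_{k-1},x_k)}$ is continuous with continuous extension $\bar{f}_k : [x_{k-1}, x_k] \to \mathbb{R}$.
Given $\varepsilon > 0$, for $\bar{f}_k$ there exist step functions $\varphi_k, \psi_k \in T[a, b]$ such that $\varphi_k \leq \bar{f}_k \leq \psi_k$ on $[x_{k-1}, x_k]$ and $\int_{x_{k-1}}^{x_k} (\psi_k - \varphi_k)(x)\,dx < \frac{\varepsilon}{n}$. We define the step functions
\[
\varphi(x) := \begin{cases} f(x_j), & x = x_j,\ j = 0, \dots, n \\ \varphi_k(x), & x \in (x_{k-1}, x_k),\ k = 1, \dots, n \end{cases}
\]
and
\[
\psi(x) := \begin{cases} f(x_j), & x = x_j,\ j = 0, \dots, n \\ \psi_k(x), & x \in (x_{k-1}, x_k),\ k = 1, \dots, n. \end{cases}
\]
It follows that $\varphi \leq f \leq \psi$ and
\[
\begin{aligned}
\int_a^b (\psi - \varphi)(x)\,dx &= \sum_{k=1}^{n} \int_{x_{k-1}}^{x_k} (\psi - \varphi)(x)\,dx \\
&= \sum_{k=1}^{n} \int_{x_{k-1}}^{x_k} (\psi_k - \varphi_k)(x)\,dx < \sum_{k=1}^{n} \frac{\varepsilon}{n} = \varepsilon,
\end{aligned}
\]
where we used the result of Problem 2.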

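4. Since $f$ is Riemann integrable, given $\varepsilon > 0$ there exist step functions $\varphi, \psi \in T[a, b]$ such that $\varphi \leq f \leq \psi$ and
\[
\int_a^b (\psi - \varphi)(x)\,dx \leq \gamma^2 \varepsilon.
\]
Since $f \geq \gamma$ we may assume that $\varphi \geq \gamma$. It follows that $\frac{1}{\psi}, \frac{1}{\varphi} \in T[a, b]$ and $\frac{1}{\psi} \leq \frac{1}{f} \leq \frac{1}{\varphi} \leq \frac{1}{\gamma}$. Thus $\frac{1}{f}$ is bounded and further
\[
\begin{aligned}
\int_a^b \left(\frac{1}{\varphi} - \frac{1}{\psi}\right)(x)\,dx &= \int_a^b \frac{1}{\varphi(x)\psi(x)} \big(\psi(x) - \varphi(x)\big)\,dx \\
&\leq \frac{1}{\gamma^2} \int_a^b (\psi - \varphi)(x)\,dx \leq \frac{1}{\gamma^2}\, \gamma^2 \varepsilon = \varepsilon,
\end{aligned}
\]
proving the Riemann integrability of $\frac{1}{f}$.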
5. By Problem 2 we know that changing a Riemann integrable function at finitely many points will not affect the value of its integral. Thus in general $\int_a^b |f(x)|\,dx = 0$ does not imply $f(x) = 0$ for all $x \in [a, b]$.
6. Suppose $f \in C([a,b])$ and $\int_a^b |f(x)|\,dx = 0$. Suppose further that for some $x_0 \in (a,b)$ we have $f(x_0) \ne 0$, say $f(x_0) > 0$; the case $f(x_0) < 0$ is analogous. Since $f$ is continuous there exists $\delta > 0$ such that $(x_0 - \delta, x_0 + \delta) \subset (a,b)$ and $f(x) > \frac{f(x_0)}{2}$ for $x \in \left( x_0 - \frac{\delta}{2}, x_0 + \frac{\delta}{2} \right)$. Consequently
\[ 0 = \int_a^b |f(x)|\,dx \ge \int_{x_0 - \frac{\delta}{2}}^{x_0 + \frac{\delta}{2}} f(x)\,dx \ge \int_{x_0 - \frac{\delta}{2}}^{x_0 + \frac{\delta}{2}} \frac{f(x_0)}{2}\,dx = \frac{f(x_0)\delta}{2} > 0, \]
which is a contradiction. Hence $f$ vanishes on $(a,b)$ and, by continuity, on all of $[a,b]$. It is now easy to show that $\|f\|_{L^1} = \int_a^b |f(x)|\,dx$ is a norm on $C([a,b])$. We need to prove:
i) $\|f\|_{L^1} \ge 0$ and $\|f\|_{L^1} = 0$ if and only if $f = 0$, i.e. $f$ is constant and has the value 0;
ii) $\|\lambda f\|_{L^1} = |\lambda| \|f\|_{L^1}$;
iii) $\|f + g\|_{L^1} \le \|f\|_{L^1} + \|g\|_{L^1}$.
Clearly $\int_a^b |f(x)|\,dx \ge 0$, and that $\int_a^b |f(x)|\,dx = 0$ if and only if $f = 0$ has just been proved above. For $\lambda \in \mathbb{R}$ we have
\[ \|\lambda f\|_{L^1} = \int_a^b |\lambda f(x)|\,dx = |\lambda| \int_a^b |f(x)|\,dx = |\lambda| \|f\|_{L^1}, \]
and iii) is Minkowski’s inequality.

7. a) This problem is more of an interpretation of the result given in Theorem 25.24. By definition
\[ \lim_{n \to \infty} S_n(f) = \int_a^b f(x)\,dx \]
if for $\epsilon > 0$ there exists $N \in \mathbb{N}$ such that $n \ge N$ implies
\[ (*) \qquad \left| S_n(f) - \int_a^b f(x)\,dx \right| < \epsilon. \]
By Theorem 25.24, for $\epsilon > 0$ and any partition $Z(x_0, \dots, x_n)$ with mesh size less than $\delta = \delta(\epsilon)$ and points $\xi_j \in [x_{j-1}, x_j]$ we have
\[ (**) \qquad \left| S(f) - \int_a^b f(x)\,dx \right| < \epsilon, \]
where $S(f)$ denotes the Riemann sum for $f$ with respect to $Z$ and $\xi_1, \dots, \xi_n$. The mesh size of $Z_n$ is
\[ x_j^{(n)} - x_{j-1}^{(n)} = a + \frac{j}{n}(b - a) - a - \frac{j-1}{n}(b - a) = \frac{b-a}{n}. \]
Hence, given $\epsilon > 0$ we determine $N \in \mathbb{N}$ such that for $n \ge N$ it follows that $\frac{b-a}{n} < \delta$, and now $(**)$ implies $(*)$.
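b) Since
\[ S_n(f) = \sum_{j=1}^n f(x_j^{(n)}) \frac{b-a}{n} \]
and since
\[ \lim_{n \to \infty} S_n(f) = \int_a^b f(x)\,dx \]
it follows that
\[ \lim_{n \to \infty} \left( \frac{1}{n} \sum_{j=1}^n f(x_j^{(n)}) \right) = \lim_{n \to \infty} \frac{1}{b-a} S_n(f) = \frac{1}{b-a} \int_a^b f(x)\,dx = \fint_a^b f(x)\,dx. \]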
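The statement of part b) can be observed numerically. The following is a minimal Python sketch, not part of the solution; the helper name `equidistant_mean` and the test function $f(x) = x^2$ on $[0,3]$ are illustrative assumptions. It averages $f$ over the right endpoints $x_j^{(n)} = a + \frac{j}{n}(b-a)$ and compares with the mean value $\frac{1}{b-a}\int_a^b f(x)\,dx = 3$.

```python
def equidistant_mean(f, a, b, n):
    """Mean of f over the right endpoints x_j = a + j*(b - a)/n, j = 1, ..., n."""
    h = (b - a) / n
    return sum(f(a + j * h) for j in range(1, n + 1)) / n


if __name__ == "__main__":
    # Illustrative choice (an assumption, not from the text): f(x) = x^2 on [0, 3],
    # whose mean value is (1/3) * int_0^3 x^2 dx = 3.
    for n in (10, 100, 1000, 10000):
        print(n, equidistant_mean(lambda x: x * x, 0.0, 3.0, n))
```

For this particular $f$ the averages decrease towards 3 as $n$ grows, in line with the limit above.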
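8. The following is Hölder’s inequality for finite sums:
\[ \sum_{k=1}^n |\alpha_k \beta_k| \le \left( \sum_{k=1}^n |\alpha_k|^p \right)^{\frac{1}{p}} \left( \sum_{k=1}^n |\beta_k|^q \right)^{\frac{1}{q}}. \]
Now let $Z_n = (x_0^{(n)}, \dots, x_n^{(n)})$ be a sequence of partitions of $[a,b]$ whose mesh size converges to 0. For the corresponding Riemann sums of $f \cdot g$ we find with $\xi_j^{(n)} \in [x_{j-1}^{(n)}, x_j^{(n)}]$, using $\frac{1}{p} + \frac{1}{q} = 1$,
\[ \begin{aligned}
\sum_{j=1}^n |f(\xi_j^{(n)}) g(\xi_j^{(n)}) (x_j^{(n)} - x_{j-1}^{(n)})|
&= \sum_{j=1}^n |f(\xi_j^{(n)})| (x_j^{(n)} - x_{j-1}^{(n)})^{\frac{1}{p}} \, |g(\xi_j^{(n)})| (x_j^{(n)} - x_{j-1}^{(n)})^{\frac{1}{q}} \\
&\le \left( \sum_{j=1}^n |f(\xi_j^{(n)})|^p (x_j^{(n)} - x_{j-1}^{(n)}) \right)^{\frac{1}{p}} \left( \sum_{j=1}^n |g(\xi_j^{(n)})|^q (x_j^{(n)} - x_{j-1}^{(n)}) \right)^{\frac{1}{q}}.
\end{aligned} \]
Passing to the limit $n \to \infty$ we obtain
\[ \int_a^b |f(x) g(x)|\,dx \le \left( \int_a^b |f(x)|^p\,dx \right)^{\frac{1}{p}} \left( \int_a^b |g(x)|^q\,dx \right)^{\frac{1}{q}}. \]
Note that if we agree that $q = \infty$ is the conjugate of $p = 1$, i.e. $\frac{1}{p} + \frac{1}{q} = 1$, then we find
\[ \int_a^b |f(x) g(x)|\,dx \le \sup_{x \in [a,b]} |g(x)| \int_a^b |f(x)|\,dx, \]
or
\[ \int_a^b |f(x) g(x)|\,dx \le \|f\|_{L^1} \|g\|_\infty. \]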
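The passage from finite sums to integrals can be mirrored in a small experiment. The Python sketch below is an illustration under stated assumptions: the helper names `midpoint_integral` and `hoelder_sides`, and the sample functions, are not from the text. Both sides of Hölder's inequality are approximated by midpoint Riemann sums; since the finite-sum inequality holds for every such sum, the approximated left side cannot exceed the approximated right side except for rounding.

```python
import math


def midpoint_integral(f, a, b, n=2000):
    """Midpoint Riemann sum of f over [a, b] with n equal subintervals."""
    h = (b - a) / n
    return h * sum(f(a + (j + 0.5) * h) for j in range(n))


def hoelder_sides(f, g, a, b, p):
    """Return (lhs, rhs) of Hoelder's inequality for an exponent p > 1."""
    q = p / (p - 1.0)  # conjugate exponent: 1/p + 1/q = 1
    lhs = midpoint_integral(lambda x: abs(f(x) * g(x)), a, b)
    rhs = (midpoint_integral(lambda x: abs(f(x)) ** p, a, b) ** (1.0 / p)
           * midpoint_integral(lambda x: abs(g(x)) ** q, a, b) ** (1.0 / q))
    return lhs, rhs


if __name__ == "__main__":
    # Sample functions chosen only for illustration.
    for p in (1.5, 2.0, 3.0):
        print(p, hoelder_sides(math.exp, lambda x: math.cos(3.0 * x), 0.0, 2.0, p))
```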
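9. a) We follow the hint and apply Hölder’s inequality to $|f|^p$ and 1: so given $p$ and $q$ we take $r = \frac{q}{p}$, and $r'$ such that $\frac{1}{r} + \frac{1}{r'} = 1$, i.e. $r' = \frac{r}{r-1} = \frac{q}{q-p}$, to find
\[ \begin{aligned}
\int_a^b |f(x)|^p\,dx = \int_a^b |f(x)|^p \cdot 1\,dx
&\le \left( \int_a^b |f(x)|^{p \cdot r}\,dx \right)^{\frac{1}{r}} \left( \int_a^b 1^{r'}\,dx \right)^{\frac{1}{r'}} \\
&= \left( \int_a^b |f(x)|^q\,dx \right)^{\frac{p}{q}} (b - a)^{\frac{q-p}{q}}.
\end{aligned} \]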
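b) By the Cauchy-Schwarz inequality we have
\[ (*) \qquad \int_a^b |f(x) g(x)|\,dx \le \left( \int_a^b |f(x)|^2\,dx \right)^{\frac{1}{2}} \left( \int_a^b |g(x)|^2\,dx \right)^{\frac{1}{2}}. \]
Further for $A, B > 0$, noting that $A \cdot B = \sqrt{2} A \cdot \frac{1}{\sqrt{2}} B$, we have
\[ (**) \qquad AB \le \frac{1}{2} (\sqrt{2} A)^2 + \frac{1}{2} \left( \frac{1}{\sqrt{2}} B \right)^2 = A^2 + \frac{1}{4} B^2. \]
The result follows by applying $(**)$ to $(*)$ with $A = \left( \int_a^b |f(x)|^2\,dx \right)^{\frac{1}{2}}$ and $B = \left( \int_a^b |g(x)|^2\,dx \right)^{\frac{1}{2}}$.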
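10. By the Cauchy-Schwarz inequality we find
\[ \left( \int_a^b f(x) \sin kx\,dx \right)^2 \le \int_a^b f^2(x)\,dx \int_a^b \sin^2 kx\,dx \]
and
\[ \left( \int_a^b f(x) \cos kx\,dx \right)^2 \le \int_a^b f^2(x)\,dx \int_a^b \cos^2 kx\,dx. \]
Adding these inequalities gives
\[ \begin{aligned}
\left( \int_a^b f(x) \sin kx\,dx \right)^2 + \left( \int_a^b f(x) \cos kx\,dx \right)^2
&\le \int_a^b f^2(x)\,dx \left( \int_a^b \sin^2 kx\,dx + \int_a^b \cos^2 kx\,dx \right) \\
&= \int_a^b f^2(x)\,dx \int_a^b (\sin^2 kx + \cos^2 kx)\,dx \\
&= (b - a) \int_a^b f^2(x)\,dx,
\end{aligned} \]
since $\sin^2 kx + \cos^2 kx = 1$.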
11. Let $Z = Z(x_0, \dots, x_n)$ be a partition of $[a,b]$ and consider the two Riemann sums
\[ \sum_{j=1}^n h(x_j)(x_j - x_{j-1}) \quad \text{and} \quad \sum_{j=1}^n f(h(x_j))(x_j - x_{j-1}). \]
We set $\lambda_j := \frac{x_j - x_{j-1}}{b-a}$ and therefore $0 \le \lambda_j \le 1$ and $\lambda_1 + \dots + \lambda_n = 1$. By Problem 1 in Chapter 23 we find
\[ \begin{aligned}
f\left( \frac{1}{b-a} \sum_{j=1}^n h(x_j)(x_j - x_{j-1}) \right) = f\left( \sum_{j=1}^n \lambda_j h(x_j) \right)
&\le \sum_{j=1}^n \lambda_j f(h(x_j)) \\
&= \frac{1}{b-a} \sum_{j=1}^n f(h(x_j))(x_j - x_{j-1}).
\end{aligned} \]
If we now replace $Z$ by a sequence $(Z_k)_{k \in \mathbb{N}}$ of partitions such that $Z_{k+1}$ is a refinement of $Z_k$ and for the mesh sizes we have $\eta(Z_k) \to 0$ as $k \to \infty$, then it follows from
\[ f\left( \frac{1}{b-a} \sum_{j=1}^{n_k} h(x_j)(x_j - x_{j-1}) \right) \le \frac{1}{b-a} \sum_{j=1}^{n_k} f(h(x_j))(x_j - x_{j-1}) \]
and the continuity of $f$ (recall that convex functions on an interval are continuous in the interior, see Corollary 23.6) that
\[ f\left( \frac{1}{b-a} \int_a^b h(t)\,dt \right) \le \frac{1}{b-a} \int_a^b f(h(t))\,dt. \]
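Jensen's inequality as derived here is easy to test on an example. The following Python sketch is only an illustration: the helper names `mean_value` and `jensen_sides`, the convex function $f = \exp$ and the choice $h = \sin$ on $[0, \pi]$ are assumptions made for the demonstration. Both mean values are approximated by midpoint sums with equal weights $\lambda_j = \frac{1}{n}$, so the discrete convexity inequality used in the proof applies to the computed numbers as well.

```python
import math


def mean_value(f, a, b, n=2000):
    """Approximate (1/(b-a)) * int_a^b f(t) dt by an equally weighted midpoint average."""
    h = (b - a) / n
    return sum(f(a + (j + 0.5) * h) for j in range(n)) / n


def jensen_sides(f, h, a, b):
    """Return (f(mean of h), mean of f(h)) for a convex function f."""
    lhs = f(mean_value(h, a, b))
    rhs = mean_value(lambda t: f(h(t)), a, b)
    return lhs, rhs


if __name__ == "__main__":
    # Illustrative choice: f = exp (convex), h = sin on [0, pi].
    print(jensen_sides(math.exp, math.sin, 0.0, math.pi))
```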
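12. The graph of $g_n$ is a triangle over $\left[ 0, \frac{1}{n} \right]$ with peak value $2n$ at $x = \frac{1}{2n}$; on $\left[ \frac{1}{n}, 1 \right]$ the function vanishes. [Figure: graph of $g_n$, with axis marks $\frac{1}{2n}$, $\frac{1}{n}$ and $2n$.]
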
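First let us show that $g_n$ is continuous. Clearly $g_n|_{[0, \frac{1}{2n}) \cup (\frac{1}{2n}, \frac{1}{n}) \cup (\frac{1}{n}, 1]}$ is continuous. If $x_0 = \frac{1}{2n}$ we have
\[ \lim_{x \to \frac{1}{2n},\, x < \frac{1}{2n}} g_n(x) = \lim_{x \to \frac{1}{2n},\, x < \frac{1}{2n}} 4n^2 x = 2n \]
and
\[ \lim_{x \to \frac{1}{2n},\, x > \frac{1}{2n}} g_n(x) = \lim_{x \to \frac{1}{2n},\, x > \frac{1}{2n}} (-4n^2 x + 4n) = 2n, \]
i.e. $g_n$ is continuous at $x_0 = \frac{1}{2n}$. At $x_0 = \frac{1}{n}$ we find
\[ \lim_{x \to \frac{1}{n},\, x < \frac{1}{n}} g_n(x) = \lim_{x \to \frac{1}{n},\, x < \frac{1}{n}} (-4n^2 x + 4n) = 0 \]
and
\[ \lim_{x \to \frac{1}{n},\, x > \frac{1}{n}} g_n(x) = \lim_{x \to \frac{1}{n},\, x > \frac{1}{n}} 0 = 0, \]
implying $g_n$ is continuous on $[0,1]$. As indicated in the hint, the integral is the area of the triangle with vertices $(0,0)$, $\left( \frac{1}{n}, 0 \right)$, $\left( \frac{1}{2n}, 2n \right)$, hence
\[ \int_0^1 g_n(x)\,dx = \frac{1}{2} \cdot \frac{1}{n} \cdot 2n = 1, \]
thus
\[ \lim_{n \to \infty} \int_0^1 g_n(x)\,dx = 1. \]
Finally we claim: for every $x \in [0,1]$ it holds
\[ \lim_{n \to \infty} g_n(x) = 0. \]
For $x = 0$ or $x = 1$ this follows from the definition. Now let $x \in (0,1)$. Since $x > 0$ it follows that for some $N$ it holds $\frac{1}{N} < x$, and now for $n \ge N$ it follows that $g_n(x) = 0$, implying that $\lim_{n \to \infty} g_n(x) = 0$. Since $\int_0^1 0\,dx = 0$ we find in this case
\[ 1 = \lim_{n \to \infty} \int_0^1 g_n(x)\,dx \ne \int_0^1 \lim_{n \to \infty} g_n(x)\,dx = 0. \]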
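The two competing limits can be made visible numerically. The Python sketch below is an illustration only; the function names `g` and `integral_g` are hypothetical, and the formula for $g_n$ is read off from the one-sided limits in the solution ($4n^2x$ on $[0, \frac{1}{2n})$, $-4n^2x + 4n$ on $[\frac{1}{2n}, \frac{1}{n})$, zero elsewhere). The midpoint grid is chosen so that the kinks of $g_n$ fall on grid nodes, which makes the midpoint rule exact on each linear piece up to rounding.

```python
def g(n, x):
    """Tent function g_n on [0, 1]: peak 2n at x = 1/(2n), zero on [1/n, 1]."""
    if 0.0 <= x < 1.0 / (2 * n):
        return 4.0 * n * n * x
    if x < 1.0 / n:
        return -4.0 * n * n * x + 4.0 * n
    return 0.0


def integral_g(n, cells_per_piece=500):
    """Midpoint sum of g_n over [0, 1]; kinks at 1/(2n) and 1/n lie on grid nodes."""
    steps = 2 * n * cells_per_piece
    h = 1.0 / steps
    return h * sum(g(n, (j + 0.5) * h) for j in range(steps))


if __name__ == "__main__":
    for n in (1, 5, 20):
        # The integral stays at 1 while the value at a fixed x > 0 drops to 0.
        print(n, integral_g(n), g(n, 0.3))
```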
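13. Since $(f_n)_{n \in \mathbb{N}}$ converges uniformly to $f$, given $\epsilon > 0$ there exists $N = N(\epsilon) \in \mathbb{N}$ such that $x \in [a,b]$ and $n \ge N(\epsilon)$ implies $|f_n(x) - f(x)| < \frac{\epsilon}{2(b-a)}$. For any $n$ we have
\[ f_n - |f_n - f| \le f \le f_n + |f_n - f|. \]
Now, since $f_n$ is Riemann integrable there exist step functions $\varphi_n, \psi_n \in T[a,b]$ such that $\varphi_n \le f_n \le \psi_n$ and $\int_a^b (\psi_n - \varphi_n)\,dx < \frac{\epsilon}{2}$. Hence for $n \ge N(\epsilon)$ we find with the step functions $\varphi_n - \frac{\epsilon}{2(b-a)}$ and $\psi_n + \frac{\epsilon}{2(b-a)}$ that
\[ \varphi_n - \frac{\epsilon}{2(b-a)} \le f \le \psi_n + \frac{\epsilon}{2(b-a)} \]
and
\[ \int_a^b \left( \psi_n + \frac{\epsilon}{2(b-a)} - \varphi_n + \frac{\epsilon}{2(b-a)} \right) dx = \int_a^b (\psi_n - \varphi_n)\,dx + \epsilon < \frac{3\epsilon}{2}, \]
which can be made arbitrarily small.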
Hence we have proved that the uniform limit of a sequence of Riemann integrable functions is Riemann integrable. Now it follows that
\[ \left| \int_a^b f(x)\,dx - \int_a^b f_n(x)\,dx \right| = \left| \int_a^b (f(x) - f_n(x))\,dx \right| \le \int_a^b |f(x) - f_n(x)|\,dx \le (b - a) \|f_n - f\|_\infty \]
implying
\[ \lim_{n \to \infty} \int_a^b f_n(x)\,dx = \int_a^b f(x)\,dx. \]
Chapter 26
1. a) Let $F$ be a primitive of $f$, i.e.
\[ F(x) = \int_a^x f(t)\,dt + c. \]
Since $F'(x) = f(x)$ by Theorem 26.1, if $f \in C^k([a,b])$ then $F' \in C^k([a,b])$ and $F$ is $(k+1)$-times continuously differentiable.
b) Recall that if $V$ is an $\mathbb{R}$-vector space, a set $W_a = a + W$, with $a \in V$ and $W \subset V$ a subspace, is called an affine subspace of $V$. The dimension of $W_a$ is that of $W$. Clearly the constant functions $f_c : [a,b] \to \mathbb{R}$, $f_c(x) = c$, form a one-dimensional subspace of $C([a,b])$; a basis, for example, is given by $f_1$, $f_1(x) = 1$. If $f \in C([a,b])$ then the set of all its primitives is given by
\[ \left\{ g : [a,b] \to \mathbb{R} \;\middle|\; g(x) = \int_a^x f(t)\,dt + f_c,\ c \in \mathbb{R} \right\} \]
or, with $W := \{ f_c \mid c \in \mathbb{R} \} \subset C^1([a,b])$ and $F \in C^1([a,b])$, $F(x) := \int_a^x f(t)\,dt$, the set of all primitives of $f$ is the affine subspace $F + W \subset C^1([a,b])$.
2. Note that nothing is claimed about the existence of a fixed point. The statement is that if $T$ has a fixed point then the fixed point must belong to $C^\infty([a,b])$. Now, by Theorem 26.1 we have that $Tf$ is differentiable and $(Tf)'(x) = e^{-x} f(x)$. This implies for a fixed point $Tg(x) = g(x)$ that $g$ is in $C^1$, i.e. a continuously differentiable function. Therefore $t \mapsto e^{-t} g(t)$ is a $C^1$ function, implying that $Tg$ is a $C^2$ function. By induction it follows that if $g = Tg$ and $g \in C^k([a,b])$ then $g \in C^{k+1}([a,b])$, and therefore a fixed point belongs to $C^\infty([a,b])$.
3. a) Let $I_1 = [a_1, b_1)$ and $I_2 = [a_2, b_2)$ and assume that $a_1 \le a_2$. If $b_1 < a_2$ then $I_1 \cup I_2$ is the union of two disjoint intervals. In the case that $a_2 \le b_1$ we either have $I_2 \subset I_1$, namely if $b_2 \le b_1$, hence $I_1 \cup I_2 = I_1$, or, if $b_1 \le b_2$, then $I_1 \cup I_2 = [a_1, b_2)$. Now, for finitely many right half-open intervals $I_1, \dots, I_N$, $I_j = [a_j, b_j)$, we proceed by induction. The case $N = 2$ has just been proved. We assume that $I_1 \cup \dots \cup I_{N-1}$ is the union of mutually disjoint right half-open intervals with some $b_{j_0}$, $j_0 \le N - 1$, being the supremum of $I_1 \cup \dots \cup I_{N-1}$. Now, if $b_{j_0} < a_N$ we are done. If not, for some $j_1 \le N - 1$ we have $[a_{j_1}, b_{j_1}) \cap I_N \ne \emptyset$. If now $b_N < b_{j_2}$ for some $j_2 \ge j_1$, then
\[ I_1 \cup \dots \cup I_N = I_1 \cup \dots \cup I_{j_1 - 1} \cup [a_{j_1}, b_{j_2}) \cup I_{j_2 + 1} \cup \dots \cup I_{N-1}. \]
If however $b_j < b_N$ for $j \le N - 1$ we have $I_1 \cup \dots \cup I_N = I_1 \cup \dots \cup I_{j_1 - 1} \cup [a_{j_1}, b_N)$.
For the intersection $[a_1, b_1) \cap [a_2, b_2)$, $a_1 \le a_2$, we find for $b_1 < a_2$ that $[a_1, b_1) \cap [a_2, b_2) = \emptyset$, otherwise we find $[a_1, b_1) \cap [a_2, b_2) = [a_2, \min(b_1, b_2))$.
b) Let $I_1 = [a_1, b_1)$ and $I_2 = [a_2, b_2)$, $a_1 \le a_2$. If $I_1 \cap I_2 = \emptyset$ there is nothing to prove. If $I_1 \cap I_2 \ne \emptyset$ then $I_1 \cup I_2 = [a, b)$ and $I_1 \cap I_2 = [c, d)$ with the following possibilities: $[a, b) = [a_1, b_1)$, implying $[c, d) = [a_2, b_2)$, or $[a, b) = [a_1, b_2)$, implying that $[c, d) = [a_2, b_1)$, with the convention that $[a_2, b_1) = \emptyset$ if $a_2 = b_1$. In the first case we have
\[ \mu(I_1 \cup I_2) + \mu(I_1 \cap I_2) = \int_{a_1}^{b_1} f(t)\,dt + \int_{a_2}^{b_2} f(t)\,dt = \mu(I_1) + \mu(I_2) \]
and in the second case we find
\[ \begin{aligned}
\mu(I_1 \cup I_2) + \mu(I_1 \cap I_2) &= \int_{a_1}^{b_2} f(t)\,dt + \int_{a_2}^{b_1} f(t)\,dt \\
&= \int_{a_1}^{b_1} f(t)\,dt + \int_{b_1}^{b_2} f(t)\,dt + \int_{a_2}^{b_1} f(t)\,dt \\
&= \int_{a_1}^{b_1} f(t)\,dt + \int_{a_2}^{b_2} f(t)\,dt = \mu(I_1) + \mu(I_2).
\end{aligned} \]
c) Since $\mu_{a_0}(x) = \mu([a_0, x)) = \int_{a_0}^x f(t)\,dt$ the result follows from Theorem 26.1.
4. First we note that for every $x$ we get
\[ 0 = \int_{-x}^x f(t)\,dt = \int_0^x (f(-t) + f(t))\,dt, \]
which yields for all $x, y$ that
\[ \int_x^y (f(-t) + f(t))\,dt = 0. \]
Now we claim that if for a continuous function $g : [a,b] \to \mathbb{R}$ we have for all $\alpha, \beta \in [a,b]$, $\alpha < \beta$, that $\int_\alpha^\beta g(t)\,dt = 0$, then $g(t) = 0$ for all $t$. Indeed, take $t_0 \in [a,b)$ and $h > 0$ such that $a \le t_0 < t_0 + h \le b$, to find by our assumptions and by the mean value theorem
\[ 0 = \int_{t_0}^{t_0 + h} g(t)\,dt = g(\xi_h) h, \qquad \xi_h \in [t_0, t_0 + h]. \]
This implies that $g(\xi_h) = 0$, and since $\lim_{h \to 0} \xi_h = t_0$ the continuity of $g$ implies that $g(t_0) = 0$; the value $g(b) = 0$ then follows by continuity as well. Therefore we deduce that $f(-x) + f(x) = 0$ for all $x$, i.e. $f(-x) = -f(x)$, which implies that $f$ is odd.
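5. a) We note that
\[ y^\rho - x^\rho = \rho \int_x^y t^{\rho - 1}\,dt \le \rho \int_x^y 1\,dt = \rho (y - x). \]
b) Since
\[ \int_x^y \cos t\,dt = \sin y - \sin x \]
and for $-\frac{\pi}{4} \le x < y \le \frac{\pi}{4}$ we have
\[ \int_x^y \cos t\,dt \ge \frac{1}{2} \sqrt{2}\,(y - x), \]
the estimate $(y - x) \le \frac{2}{\sqrt{2}} (\sin y - \sin x)$ follows.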
6. For $F$ we have
\[ F(x) - F(y) = \int_a^x f(t)\,dt - \int_a^y f(t)\,dt = \int_y^x f(t)\,dt, \]
which implies
\[ |F(x) - F(y)| = \left| \int_y^x f(t)\,dt \right| \le \|f\|_\infty \left| \int_y^x 1\,dt \right| = \|f\|_\infty |x - y|. \]
7. a) Since $(f \cdot g)(x) = 0$ for all $x \in [a,b]$, but $f$ and $g$ are not both zero, it follows that $f \perp g$.
b) The product of an odd function and an even function is odd. For any odd function $h : [-a, a] \to \mathbb{R}$ we have, see Proposition 26.7.B, $\int_{-a}^a h(t)\,dt = 0$.
c) Let $g, h \in C([a,b])$ be such that $f \perp g$ and $f \perp h$. For $\lambda, \mu \in \mathbb{R}$ we find
\[ \int_a^b f(x)(\lambda g(x) + \mu h(x))\,dx = \lambda \int_a^b f(x) g(x)\,dx + \mu \int_a^b f(x) h(x)\,dx = 0, \]
hence $f \perp (\lambda g + \mu h)$, which implies the required result.
8. We start with
\[ 0 = \int_a^b f(x) f'(x)\,dx = \int_a^b \frac{1}{2} \frac{d}{dx} (f^2(x))\,dx = \frac{1}{2} f^2(b) - \frac{1}{2} f^2(a), \]
or $f^2(b) = f^2(a)$, implying $|f(b)| = |f(a)|$.
9. Obviously we have for $f, g \in C_0^1([a,b])$ and $\lambda \in \mathbb{R}$ that
\[ \|f'\|_{L^2} \ge 0, \qquad \|\lambda f'\|_{L^2} = |\lambda| \|f'\|_{L^2}, \]
and
\[ \|f' + g'\|_{L^2} \le \|f'\|_{L^2} + \|g'\|_{L^2}, \]
since these results hold for all $f, g \in C^1([a,b])$ and $\lambda \in \mathbb{R}$. In order to prove that $\|f'\|_{L^2}$ is a norm we need to show in addition that $\|f'\|_{L^2} = 0$ implies $f(x) = 0$ for all $x \in [a,b]$, i.e. $f$ is the zero element in $C_0^1([a,b])$. By Proposition 26.16 we know Poincaré’s inequality:
\[ \|f\|_{L^2} \le \gamma_0 \|f'\|_{L^2}. \]
Thus $\|f'\|_{L^2} = 0$ implies $\|f\|_{L^2} = 0$, or $\int_a^b (f(x))^2\,dx = 0$. But by Problem 6 in Chapter 25 we now find that $f(x) = 0$ for all $x \in [a,b]$.
10. $G$ is differentiable with
\[ G'(x) = \beta'(x) f(\beta(x)) - \alpha'(x) f(\alpha(x)). \]
Since $f(y) \ge 0$ for all $y$ and $\beta'(x) \ge 0$ whereas $\alpha'(x) \le 0$, which follows from the fact that $\beta$ is increasing and $\alpha$ is decreasing, we find $G'(x) \ge 0$, hence $G$ is increasing.
11. For $\epsilon > 0$ we can find $N_0 \in \mathbb{N}$ such that for $n, m \ge N_0$ it follows that
\[ |f_n(x_0) - f_m(x_0)| < \frac{\epsilon}{2} \]
and
\[ |f_n'(t) - f_m'(t)| < \frac{\epsilon}{2(b-a)} \quad \text{for all } t \in [a,b]. \]
We now apply the mean value theorem to $f_n - f_m$ to find
\[ |(f_n - f_m)(x) - (f_n - f_m)(t)| \le \frac{\epsilon |x - t|}{2(b-a)} \le \frac{\epsilon}{2} \]
for all $x, t \in [a,b]$ and $n, m \ge N_0$. Hence it follows with $t = x_0$ for $n, m \ge N_0$ that
\[ |f_n(x) - f_m(x)| \le |(f_n - f_m)(x) - (f_n - f_m)(x_0)| + |f_n(x_0) - f_m(x_0)| < \epsilon, \]
or $n, m \ge N_0$ implies
\[ \|f_n - f_m\|_\infty \le \epsilon, \]
i.e. $(f_n)_{n \in \mathbb{N}}$ is a Cauchy sequence with respect to $\|\cdot\|_\infty$. Therefore it has a limit $f$ which is a continuous function. Moreover $f_n$ converges pointwise to $f$. Denote by $f^*$ the uniform limit of $(f_n')_{n \in \mathbb{N}}$. It follows that
\[ f_n(x) = f_n(x_0) + \int_{x_0}^x f_n'(t)\,dt \]
and for $n \to \infty$ we get
\[ f(x) = f(x_0) + \int_{x_0}^x f^*(t)\,dt, \]
implying that $f'(x) = f^*(x)$.

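12. We know by Theorem 16.4 that
\[ S_N(x) = \frac{1 - x^{N+1}}{1 - x}, \]
implying that
\[ S_N'(x) = \frac{1 - (N+1) x^N + N x^{N+1}}{(1 - x)^2}. \]
Moreover, for $|x| < 1$ we have $\sum_{k=0}^\infty x^k = \frac{1}{1-x}$, so we need to prove that for $[a,b] \subset (-1,1)$
\[ \sup_{x \in [a,b]} \left| S_N(x) - \frac{1}{1-x} \right| \quad \text{and} \quad \sup_{x \in [a,b]} \left| S_N'(x) - \frac{1}{(1-x)^2} \right| \]
both tend to 0. Denote $\kappa_1 := \sup_{x \in [a,b]} \left| \frac{1}{1-x} \right| < \infty$ and $\kappa_2 := \sup_{x \in [a,b]} \frac{1}{(1-x)^2} < \infty$. It follows that
\[ \left| S_N(x) - \frac{1}{1-x} \right| = \frac{|x|^{N+1}}{|1 - x|} \le \kappa_1 \max(|a|^{N+1}, |b|^{N+1}), \]
hence
\[ \sup_{x \in [a,b]} \left| S_N(x) - \frac{1}{1-x} \right| \le \kappa_1 \max(|a|^{N+1}, |b|^{N+1}), \]
and since $|a| < 1$ and $|b| < 1$ the uniform convergence of $S_N(x)$ to $\frac{1}{1-x}$ is proved. Further
\[ \left| S_N'(x) - \frac{1}{(1-x)^2} \right| \le \frac{(N+1)|x|^N + N|x|^{N+1}}{(1-x)^2} \le \kappa_2 (2N+1) \max(|a|^N, |b|^N), \]
where we used that for $|y| < 1$ we have $|y|^{N+1} \le |y|^N$. Since $\lim_{N \to \infty} (2N+1)|y|^N = 0$ for $|y| < 1$ it also follows that $S_N'(x)$ converges uniformly to $\frac{1}{(1-x)^2}$. Therefore we have
\[ \frac{1}{(1-x)^2} = \frac{d}{dx} \sum_{k=0}^\infty x^k = \frac{d}{dx} \lim_{N \to \infty} S_N(x) = \lim_{N \to \infty} \frac{d}{dx} S_N(x) = \lim_{N \to \infty} \sum_{k=1}^N k x^{k-1} = \sum_{k=1}^\infty k x^{k-1}. \]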
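For $x = \frac{1}{m}$, $m \ge 2$, we find
\[ \sum_{k=1}^\infty k \left( \frac{1}{m} \right)^{k-1} = m \sum_{k=1}^\infty \frac{k}{m^k} = \frac{1}{\left( 1 - \frac{1}{m} \right)^2} = \frac{m^2}{(m-1)^2}, \]
i.e.
\[ \sum_{k=1}^\infty \frac{k}{m^k} = \frac{m}{(m-1)^2}. \]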
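The closed form $\sum_{k=1}^\infty \frac{k}{m^k} = \frac{m}{(m-1)^2}$, which follows from setting $x = \frac{1}{m}$ in $\sum_{k=1}^\infty k x^{k-1} = \frac{1}{(1-x)^2}$, can be checked against partial sums. The short Python sketch below does this; the helper name `partial_sum` is hypothetical and the cut-off of 200 terms is an arbitrary choice.

```python
def partial_sum(m, terms):
    """Partial sum of k / m**k for k = 1, ..., terms."""
    return sum(k / m ** k for k in range(1, terms + 1))


if __name__ == "__main__":
    for m in (2, 3, 10):
        # Compare the partial sum with the closed form m / (m - 1)^2.
        print(m, partial_sum(m, 200), m / (m - 1) ** 2)
```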
Chapter 27
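1. For $x \in (-\infty, c) \cup (c, 0) \cup (0, \infty)$ the function $u_c$ is clearly differentiable. At $x = 0$ we find
\[ \frac{u_c(x) - u_c(0)}{x - 0} = \begin{cases} \frac{x}{4}, & x > 0 \\ 0, & c \le x < 0 \end{cases} \]
implying
\[ \lim_{x \to 0} \frac{u_c(x) - u_c(0)}{x - 0} = 0, \]
and at $x = c$ we have
\[ \frac{u_c(x) - u_c(c)}{x - c} = \begin{cases} 0, & c < x \le 0 \\ -\frac{(x - c)}{4}, & x < c, \end{cases} \]
which yields
\[ \lim_{x \to c} \frac{u_c(x) - u_c(c)}{x - c} = 0, \]
hence $u_c$ is differentiable on $\mathbb{R}$. Moreover
\[ u_c'(x) = \begin{cases} \frac{x}{2}, & x > 0 \\ 0, & c \le x \le 0 \\ -\frac{(x - c)}{2}, & x < c, \end{cases} \]
i.e.
\[ u_c'(x) = \sqrt{|u_c(x)|}. \]
Next we observe that $u_c(2) = 1$ for all $c < 0$, implying that for all $c < 0$ a solution to
\[ v'(x) = \sqrt{|v(x)|}, \qquad v(2) = 1 \]
is given by $u_c$. Hence we have existence but not uniqueness.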
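The non-uniqueness can be illustrated numerically. The Python sketch below rests on an assumption: the family $u_c$ is taken to be $u_c(x) = \frac{x^2}{4}$ for $x > 0$, $u_c(x) = 0$ for $c \le x \le 0$ and $u_c(x) = -\frac{(x-c)^2}{4}$ for $x < c$, which is the form suggested by the difference quotients in the solution, and the equation is taken to be $v' = \sqrt{|v|}$. The names `u` and `residual` are hypothetical. A central difference quotient replaces the derivative.

```python
import math


def u(c, x):
    """Assumed form of u_c for a parameter c < 0 (see the lead-in)."""
    if x > 0.0:
        return x * x / 4.0
    if x >= c:
        return 0.0
    return -((x - c) ** 2) / 4.0


def residual(c, x, h=1e-6):
    """Central difference quotient of u_c at x minus sqrt(|u_c(x)|)."""
    du = (u(c, x + h) - u(c, x - h)) / (2.0 * h)
    return du - math.sqrt(abs(u(c, x)))


if __name__ == "__main__":
    for c in (-0.5, -1.0, -3.0):
        # Every c < 0 gives the same initial value u_c(2) = 1.
        print(c, u(c, 2.0), max(abs(residual(c, x)) for x in (-5.0, -1.0, 0.0, 0.5, 4.0)))
```

Different values of $c$ give different functions, all with the same value at $x = 2$ and all with a residual close to zero.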
2. The calculation is simple and goes as follows:
\[ \begin{aligned}
f(x)(\lambda u_1 + \mu u_2)'(x) + h(x)(\lambda u_1 + \mu u_2)(x)
&= f(x)(\lambda u_1'(x) + \mu u_2'(x)) + h(x)(\lambda u_1(x) + \mu u_2(x)) \\
&= \lambda f(x) u_1'(x) + \mu f(x) u_2'(x) + \lambda h(x) u_1(x) + \mu h(x) u_2(x) \\
&= \lambda (f(x) u_1'(x) + h(x) u_1(x)) + \mu (f(x) u_2'(x) + h(x) u_2(x)) \\
&= 0.
\end{aligned} \]
It is important to note that if $u$ and $u'$ only appear linearly in a differential equation then linear combinations of solutions are solutions.
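3. First we note that
\[ u(a) = u_a e^{-\int_a^a \frac{p_1(t)}{p_0(t)}\,dt} = u_a e^0 = u_a, \]
i.e. the initial condition is fulfilled. Differentiating $u$ we find
\[ \begin{aligned}
u'(x) &= \frac{d}{dx} \left( u_a e^{-\int_a^x \frac{p_1(t)}{p_0(t)}\,dt} \right) \\
&= u_a \left( \frac{d}{dx} \left( -\int_a^x \frac{p_1(t)}{p_0(t)}\,dt \right) \right) e^{-\int_a^x \frac{p_1(t)}{p_0(t)}\,dt} \\
&= -u_a \frac{p_1(x)}{p_0(x)} e^{-\int_a^x \frac{p_1(t)}{p_0(t)}\,dt},
\end{aligned} \]
and it follows that
\[ \begin{aligned}
p_0(x) u'(x) + p_1(x) u(x)
&= -p_0(x) u_a \frac{p_1(x)}{p_0(x)} e^{-\int_a^x \frac{p_1(t)}{p_0(t)}\,dt} + p_1(x) u_a e^{-\int_a^x \frac{p_1(t)}{p_0(t)}\,dt} \\
&= u_a e^{-\int_a^x \frac{p_1(t)}{p_0(t)}\,dt} (-p_1(x) + p_1(x)) = 0.
\end{aligned} \]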
4. a) Using the method of separation of variables we find $x u' = 2u$, or $\frac{du}{u} = \frac{2\,dx}{x}$, which yields
\[ \ln |u(x)| = 2 \ln |x| + c \]
with some constant $c$. From here we derive
\[ u(x) = k x^2 \]
where $k$ is any real number. The initial condition demands $u(1) = k = 3$, so we expect $u(x) = 3x^2$ to be a solution to this initial value problem. Indeed we have $u(1) = 3 \cdot 1^2 = 3$ and $u'(x) = 6x$, hence $x u'(x) = 6x^2 = 2 \cdot u(x)$. Obviously $u$ is defined on the whole real line.
b) From $y'(t) = 2y^2(t)$ we derive $\frac{dy}{y^2} = 2\,dt$, or
\[ -\frac{1}{y} = 2t + c, \]
which gives $y(t) = -\frac{1}{2t + c}$. Adjusting the initial value requires
\[ y(0) = -\frac{1}{c} = -1, \]
implying that $y(t) = -\frac{1}{2t+1}$ is a candidate for a solution. We find $y(0) = -1$ and further
\[ y'(t) = \frac{2}{(2t+1)^2} = 2y^2(t). \]
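c) The differential equation $\varphi'(s) = \frac{\varphi(s)}{\tan s}$ leads to
\[ \frac{d\varphi}{\varphi} = \frac{ds}{\tan s} = \frac{\cos s}{\sin s}\,ds = \frac{(\sin s)'}{\sin s}\,ds \]
or
\[ \ln |\varphi| = \ln |\sin s| + c, \]
which yields
\[ \varphi(s) = \gamma \sin s \]
for some $\gamma \in \mathbb{R}$. The condition $\varphi\left( \frac{\pi}{4} \right) = \frac{\pi}{4}$ implies
\[ \frac{\pi}{4} = \gamma \sin \frac{\pi}{4} = \gamma \frac{\sqrt{2}}{2} \]
or
\[ \gamma = \frac{\pi}{2\sqrt{2}} = \frac{\pi \sqrt{2}}{4}. \]
For $\varphi(s) = \frac{\pi \sqrt{2}}{4} \sin s$ we find $\varphi'(s) = \frac{\pi \sqrt{2}}{4} \cos s$, which gives
\[ \varphi'(s) = \frac{\pi \sqrt{2}}{4} \cos s = \frac{\pi \sqrt{2}}{4} \sin s \, \frac{\cos s}{\sin s} = \frac{\varphi(s)}{\tan s}, \]
as well as $\varphi\left( \frac{\pi}{4} \right) = \frac{\pi \sqrt{2}}{4} \sin \frac{\pi}{4} = \frac{\pi \sqrt{2}}{4} \cdot \frac{\sqrt{2}}{2} = \frac{\pi}{4}$. Note that $\varphi$ is defined on $\mathbb{R}$, but the coefficient in the differential equation is not defined for $s = k\pi$, where $\tan s = 0$, and for $s = \frac{\pi}{2} + k\pi$, $k \in \mathbb{Z}$, where $\tan s$ is not defined.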
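d) From $5x^4(r) x'(r) = r \cos r$ we deduce
\[ 5x^4\,dx = r \cos r\,dr \]
or
\[ x^5 = \cos r + r \sin r + c, \]
which yields
\[ x(r) = (\cos r + r \sin r + c)^{\frac{1}{5}}, \]
and $x\left( \frac{\pi}{2} \right) = 1$ implies
\[ 1 = \left( \cos \frac{\pi}{2} + \frac{\pi}{2} \sin \frac{\pi}{2} + c \right)^{\frac{1}{5}}, \]
which is solved by $c = 1 - \frac{\pi}{2}$. An easy calculation now shows that $x(r) = \left( \cos r + r \sin r + 1 - \frac{\pi}{2} \right)^{\frac{1}{5}}$ indeed solves the initial value problem: $x\left( \frac{\pi}{2} \right) = \left( \cos \frac{\pi}{2} + \frac{\pi}{2} \sin \frac{\pi}{2} + 1 - \frac{\pi}{2} \right)^{\frac{1}{5}} = 1$, and
\[ \begin{aligned}
\frac{d}{dr} \left( \cos r + r \sin r + 1 - \frac{\pi}{2} \right)^{\frac{1}{5}}
&= \frac{1}{5} \left( \cos r + r \sin r + 1 - \frac{\pi}{2} \right)^{-\frac{4}{5}} (-\sin r + \sin r + r \cos r) \\
&= \frac{1}{5} \left( \cos r + r \sin r + 1 - \frac{\pi}{2} \right)^{-\frac{4}{5}} (r \cos r),
\end{aligned} \]
or
\[ 5x^4(r) x'(r) = 5 \left( \cos r + r \sin r + 1 - \frac{\pi}{2} \right)^{\frac{4}{5}} \frac{1}{5} \left( \cos r + r \sin r + 1 - \frac{\pi}{2} \right)^{-\frac{4}{5}} (r \cos r) = r \cos r. \]
Again the solution is defined for all $r \in \mathbb{R}$.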
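The four candidate solutions of Problem 4 can be cross-checked with difference quotients. The Python sketch below is only a plausibility check; the function names (`u_a`, `y_b`, `phi_c`, `x_d`, `derivative`, `residuals`) are hypothetical, and the sample points are chosen so that every expression is defined, in particular so that the base of the fifth root in d) is positive.

```python
import math

GAMMA = math.pi * math.sqrt(2.0) / 4.0  # constant from part c)
C = 1.0 - math.pi / 2.0                 # constant from part d)


def u_a(x):
    return 3.0 * x * x


def y_b(t):
    return -1.0 / (2.0 * t + 1.0)


def phi_c(s):
    return GAMMA * math.sin(s)


def x_d(r):
    return (math.cos(r) + r * math.sin(r) + C) ** 0.2


def derivative(f, t, h=1e-5):
    """Central difference quotient of f at t."""
    return (f(t + h) - f(t - h)) / (2.0 * h)


def residuals(t):
    """Residuals of the four differential equations at the point t."""
    return (
        t * derivative(u_a, t) - 2.0 * u_a(t),                       # x u' = 2u
        derivative(y_b, t) - 2.0 * y_b(t) ** 2,                      # y' = 2 y^2
        derivative(phi_c, t) - phi_c(t) / math.tan(t),               # phi' = phi / tan s
        5.0 * x_d(t) ** 4 * derivative(x_d, t) - t * math.cos(t),    # 5 x^4 x' = r cos r
    )


if __name__ == "__main__":
    for t in (1.0, math.pi / 4.0, 2.0):
        print(t, residuals(t))
```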

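5. a)
\[ \begin{aligned}
\frac{d}{dx} \int_{\cos x}^{\sqrt{x^2+1}} g(z)\,dz
&= g(\sqrt{x^2+1}) \frac{d}{dx} \sqrt{x^2+1} - g(\cos x) \frac{d}{dx} \cos x \\
&= g(\sqrt{x^2+1}) \frac{x}{\sqrt{x^2+1}} + g(\cos x) \sin x.
\end{aligned} \]
b)
\[ \frac{d}{dx} \int_{v(x)}^{u(x)} \frac{1}{1+t^2}\,dt = \frac{1}{1 + u^2(x)} u'(x) - \frac{1}{1 + v^2(x)} v'(x). \]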
6. We have
\[ \begin{aligned}
\frac{d}{dx} \int_{-u(x)}^{u(x)} h(t)\,dt &= h(u(x)) u'(x) - h(-u(x))(-u'(x)) \\
&= h(u(x)) u'(x) + h(u(x))(-u'(x)) \\
&= h(u(x)) u'(x) - h(u(x)) u'(x) = 0,
\end{aligned} \]
where we used that $h$ is odd. Thus $x \mapsto \int_{-u(x)}^{u(x)} h(t)\,dt$ has derivative zero, and therefore it must be constant. We know that for every odd function $h$ we have
\[ \int_{-a}^a h(t)\,dt = 0, \]
compare with Proposition 26.7.B, and therefore we must have for all $x \in \mathbb{R}$ that $\int_{-u(x)}^{u(x)} h(t)\,dt = 0$.
7. Since $u^{2k} \ge 0$ it follows from $u' = \frac{1}{1 + u^{2k}}$ that $u$ is strictly monotone increasing, and since $u(0) = 1$ we deduce that on $[0, \infty)$ the function $u$ is positive. Further we have
\[ u''(x) = \frac{d}{dx} \frac{1}{1 + u^{2k}(x)} = -\frac{2k u^{2k-1}(x) u'(x)}{(1 + u^{2k}(x))^2} = -\frac{2k u^{2k-1}(x)}{(1 + u^{2k}(x))^3} < 0 \]
since $u^{2k-1}(x) > 0$ (which follows from $u$ being strictly positive). Hence $u$ is concave on $[0, \infty)$. The fact that $u$ is an arbitrarily often differentiable function follows as discussed at the end of Chapter 27: we know that
\[ u' = g_1(u), \qquad g_1(t) = \frac{1}{1 + t^{2k}} \]
and
\[ u'' = g_2(u), \qquad g_2(t) = -\frac{2k t^{2k-1}}{(1 + t^{2k})^3}. \]
Now we claim that $u^{(n)} = g_n(u)$ with an arbitrarily often differentiable function $g_n$. For $n = 1$ (and $n = 2$) we know the result. Now if $u^{(n)} = g_n(u)$ then
\[ u^{(n+1)} = \frac{d}{dx} g_n(u) = g_n'(u) \cdot u' = g_n'(u) g_1(u), \]
implying the result.

Chapter 28
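1. a) Denote by $g_\alpha$ the function $g_\alpha : (a, b] \to \mathbb{R}$, $g_\alpha(x) = (x - a)^{-\alpha}$. A primitive of $g_\alpha$ is given by
\[ G_\alpha(x) = \begin{cases} \frac{1}{1 - \alpha} (x - a)^{1 - \alpha}, & \alpha \ne 1 \\ \ln(x - a), & \alpha = 1. \end{cases} \]
Consequently we have
\[ \int_{a + \epsilon}^b \frac{dx}{(x - a)^\alpha} = \begin{cases} \frac{1}{1 - \alpha} (b - a)^{1 - \alpha} - \frac{1}{1 - \alpha} \epsilon^{1 - \alpha}, & \alpha \ne 1 \\ \ln(b - a) - \ln \epsilon, & \alpha = 1, \end{cases} \]
and for $\epsilon \to 0$ we find: if $\alpha < 1$, then
\[ \lim_{\epsilon \to 0} \int_{a + \epsilon}^b \frac{dx}{(x - a)^\alpha} = \frac{1}{1 - \alpha} (b - a)^{1 - \alpha}, \]
however for $\alpha \ge 1$ the limit
\[ \lim_{\epsilon \to 0} \int_{a + \epsilon}^b \frac{dx}{(x - a)^\alpha} \]
does not exist (as a finite limit).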
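b) We have an unbounded integrand at $x = 0$ and at $x = 2$. Therefore we split the integral accordingly: for $0 < \epsilon < 1$
\[ \int_\epsilon^{2 - \epsilon} \frac{dx}{\sqrt{x(2 - x)}} = \int_\epsilon^1 \frac{dx}{\sqrt{x(2 - x)}} + \int_1^{2 - \epsilon} \frac{dx}{\sqrt{x(2 - x)}}. \]
For $0 < \epsilon \le x \le 1$ we have $\frac{1}{\sqrt{x(2 - x)}} \le \frac{1}{\sqrt{x}}$ and therefore
\[ 0 \le \int_\epsilon^1 \frac{dx}{\sqrt{x(2 - x)}} \le \int_\epsilon^1 \frac{dx}{\sqrt{x}} = 2 - 2\sqrt{\epsilon}, \]
implying the convergence of the first integral. For $1 \le x \le 2 - \epsilon$ we find $\frac{1}{\sqrt{x(2 - x)}} \le \frac{1}{\sqrt{2 - x}}$, which yields
\[ 0 \le \int_1^{2 - \epsilon} \frac{dx}{\sqrt{x(2 - x)}} \le \int_1^{2 - \epsilon} \frac{dx}{\sqrt{2 - x}} = 2 - 2\sqrt{\epsilon}, \]
and hence the second integral converges too, i.e. $\int_0^2 \frac{dx}{\sqrt{x(2 - x)}}$ converges.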
c) If the integral converges we can split the integral as follows:
\[ \int_0^\infty x^\alpha\,dx = \int_0^1 x^\alpha\,dx + \int_1^\infty x^\alpha\,dx. \]
The first integral converges if and only if $\alpha > -1$, but in this case the second integral diverges. Hence $\int_0^\infty x^\alpha\,dx$ will never converge.
d) A primitive of $g(x) = e^{-ax} \cos(wx)$ is the function
\[ G(x) = -\frac{e^{-ax}}{a^2 + w^2} (a \cos(wx) - w \sin(wx)) \]
and therefore
\[ \int_0^R e^{-ax} \cos(wx)\,dx = G(R) - G(0). \]
Since $\lim_{R \to \infty} G(R) = 0$ we find
\[ \int_0^\infty e^{-ax} \cos(wx)\,dx = \frac{a}{a^2 + w^2}. \]
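The primitive $G$ and the limit value $\frac{a}{a^2 + w^2}$ can be checked numerically. The following Python sketch is illustrative: the names `G` and `truncated_integral` and the parameter values $a = 1$, $w = 2$ are assumptions. It evaluates $G(R) - G(0)$ for a large $R$ and compares a difference quotient of $G$ with the integrand.

```python
import math


def G(x, a, w):
    """Primitive of exp(-a x) * cos(w x) as given in the solution."""
    return -math.exp(-a * x) * (a * math.cos(w * x) - w * math.sin(w * x)) / (a * a + w * w)


def truncated_integral(R, a, w):
    """Value of int_0^R exp(-a x) cos(w x) dx via the primitive."""
    return G(R, a, w) - G(0.0, a, w)


if __name__ == "__main__":
    a, w = 1.0, 2.0  # sample parameters, a > 0
    for R in (1.0, 10.0, 50.0):
        print(R, truncated_integral(R, a, w), a / (a * a + w * w))
```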
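2. First we note
\[ \frac{|f(r)|}{(1 + r^2)^{\frac{\alpha}{2}}} \le c_0 \frac{(1 + r^2)^{\frac{\beta}{2}}}{(1 + r^2)^{\frac{\alpha}{2}}} = c_0 (1 + r^2)^{\frac{\beta - \alpha}{2}}. \]
It follows that
\[ \int_0^R \frac{|f(r)|}{(1 + r^2)^{\frac{\alpha}{2}}}\,dr \le c_0 \int_0^R \frac{1}{(1 + r^2)^{\frac{\alpha - \beta}{2}}}\,dr = c_0 \int_0^1 \frac{1}{(1 + r^2)^{\frac{\alpha - \beta}{2}}}\,dr + c_0 \int_1^R \frac{1}{(1 + r^2)^{\frac{\alpha - \beta}{2}}}\,dr, \]
and clearly the first integral on the right hand side exists for all $\alpha$ and $\beta$. If $r \ge 1$ and $\beta < \alpha$ then $\frac{1}{(1 + r^2)^{\frac{\alpha - \beta}{2}}} \le \frac{1}{r^{\alpha - \beta}}$, and since
\[ \lim_{R \to \infty} \int_1^R r^{-\alpha + \beta}\,dr = \lim_{R \to \infty} \left[ \frac{1}{1 - \alpha + \beta} r^{1 - \alpha + \beta} \right]_1^R = \frac{1}{\alpha - 1 - \beta} + \lim_{R \to \infty} \frac{R^{1 - \alpha + \beta}}{1 - \alpha + \beta} \]
exists only for $1 - \alpha + \beta < 0$, i.e. $\beta + 1 < \alpha$, it follows that for $\beta + 1 < \alpha$ the integral $\int_0^\infty \frac{f(r)}{(1 + r^2)^{\frac{\alpha}{2}}}\,dr$ converges absolutely. Now if $f$ is a polynomial of degree $m$ we know that $|f(r)| \le c_0 (1 + r^2)^{\frac{m}{2}}$, and therefore for $m + 1 < \alpha$ the integral $\int_0^\infty \frac{f(r)}{(1 + r^2)^{\frac{\alpha}{2}}}\,dr$ converges absolutely in this case. In the case where $m + 1 \ge \alpha$ the integral must diverge. We may assume that $f(r) \ge 0$ for $r \ge R_0$, otherwise we switch to $-f$. From Example 11.4 we know that
\[ \lim_{r \to \infty} \frac{f(r)}{a_m r^m} = 1 \]
when $a_m > 0$ is the leading coefficient of $f(r)$. Thus we can find $R_1 \ge R_0$ such that $r \ge R_1$ implies $\left| \frac{f(r)}{a_m r^m} - 1 \right| < \frac{1}{2}$, or $\frac{a_m}{2} r^m \le f(r)$. Since for $m + 1 \ge \alpha$ the integral $\int_{R_1}^\infty \frac{a_m r^m}{2 (1 + r^2)^{\frac{\alpha}{2}}}\,dr$ diverges, it follows that $\int_0^\infty \frac{f(r)}{(1 + r^2)^{\frac{\alpha}{2}}}\,dr$ diverges.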
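3. For $k = 0$ we find
\[ \int_0^1 (1 - x)^\alpha\,dx = \frac{1}{\alpha + 1} = \frac{0!}{\alpha + 1}. \]
Assuming that
\[ \int_0^1 x^k (1 - x)^\alpha\,dx = \frac{k!}{(\alpha + 1)(\alpha + 2) \cdot \ldots \cdot (\alpha + k + 1)} \]
we find when integrating by parts
\[ \begin{aligned}
\int_0^1 x^{k+1} (1 - x)^\alpha\,dx &= \int_0^1 x^{k+1} \frac{d}{dx} \left( -\frac{(1 - x)^{\alpha + 1}}{\alpha + 1} \right) dx \\
&= \frac{k + 1}{\alpha + 1} \int_0^1 x^k (1 - x)^{\alpha + 1}\,dx \\
&= \frac{(k + 1)}{\alpha + 1} \cdot \frac{k!}{(\alpha + 2)(\alpha + 3) \cdot \ldots \cdot (\alpha + k + 2)},
\end{aligned} \]
where we have used that the boundary terms
\[ \left[ x^{k+1} \left( -\frac{(1 - x)^{\alpha + 1}}{\alpha + 1} \right) \right]_0^1 \]
vanish.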
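The formula $\int_0^1 x^k (1-x)^\alpha\,dx = \frac{k!}{(\alpha+1)(\alpha+2) \cdots (\alpha+k+1)}$ and the recursion used in the induction step can be tested numerically. The Python sketch below is only a plausibility check; the names `closed_form` and `midpoint_value` and the sample values $k = 3$, $\alpha = 2.5$ are assumptions.

```python
import math


def closed_form(k, alpha):
    """k! / ((alpha + 1)(alpha + 2)...(alpha + k + 1))."""
    denom = 1.0
    for i in range(1, k + 2):
        denom *= alpha + i
    return math.factorial(k) / denom


def midpoint_value(k, alpha, n=20000):
    """Midpoint Riemann sum for int_0^1 x^k (1 - x)^alpha dx."""
    h = 1.0 / n
    return h * sum(((j + 0.5) * h) ** k * (1.0 - (j + 0.5) * h) ** alpha for j in range(n))


if __name__ == "__main__":
    k, alpha = 3, 2.5  # sample values
    print(closed_form(k, alpha), midpoint_value(k, alpha))
    # Induction step: I(k+1, alpha) = (k+1)/(alpha+1) * I(k, alpha+1).
    print(closed_form(k + 1, alpha), (k + 1) / (alpha + 1) * closed_form(k, alpha + 1))
```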
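4. The second integral is straightforward since
\[ \left| \frac{\sin^2 t}{t^2 + a^2} \right| \le \frac{1}{t^2 + a^2}, \]
implying the absolute convergence of $\int_0^\infty \frac{\sin^2 t}{t^2 + a^2}\,dt$. The first integral we split into two integrals and we consider
\[ \int_\epsilon^1 \frac{\ln x}{x^2 + a^2}\,dx \quad \text{and} \quad \int_1^R \frac{\ln x}{x^2 + a^2}\,dx. \]
We note that, since $x^2 + a^2 \ge a^2$,
\[ 0 \le \int_\epsilon^1 \frac{-\ln x}{x^2 + a^2}\,dx \le \frac{1}{a^2} \int_\epsilon^1 (-\ln x)\,dx = -\frac{1}{a^2} (x \ln x - x) \Big|_\epsilon^1 = \frac{1}{a^2} (1 - \epsilon + \epsilon \ln \epsilon), \]
and since $\lim_{\epsilon \to 0} (\epsilon \ln \epsilon - \epsilon) = 0$, compare with the calculation in Example 11.6.C, it follows that the integral converges. The second integral converges since we know that for $x \ge 1$ we have $\ln x \le c_0 \sqrt{x}$ and therefore
\[ \int_1^R \frac{\ln x}{x^2 + a^2}\,dx \le \int_1^R \frac{c_0 \sqrt{x}}{x^2 + a^2}\,dx \le c_0 \int_1^R x^{-\frac{3}{2}}\,dx = 2c_0 (1 - R^{-\frac{1}{2}}), \]
implying the convergence of $\int_0^\infty \frac{\ln x}{x^2 + a^2}\,dx$.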
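5. For the first part we observe that since $g$ is continuous and $g(0) \ne 0$, for some $\eta > 0$ we have $g(x) \ne 0$ for $x \in (-\eta, \eta)$. We may assume that $g > 0$ in $(-\eta, \eta)$, and consequently there exist $0 < m \le M$ such that $0 < m \le g(x) \le M$ for $x \in \left[ -\frac{\eta}{2}, \frac{\eta}{2} \right]$. This implies for $0 < \epsilon < \frac{\eta}{2}$ that
\[ \int_\epsilon^{\frac{\eta}{2}} \frac{g(x)}{x}\,dx \ge m \int_\epsilon^{\frac{\eta}{2}} \frac{1}{x}\,dx = m \left( \ln \frac{\eta}{2} - \ln \epsilon \right) \]
and therefore $\lim_{\epsilon \to 0} \int_\epsilon^{\frac{\eta}{2}} \frac{g(x)}{x}\,dx$, and hence $\int_0^1 \frac{g(x)}{x}\,dx$, does not exist. The second integral goes analogously. Note that now we have $x < 0$, and since
\[ \int_{-\frac{\eta}{2}}^{-\epsilon} \frac{g(x)}{x}\,dx = -\int_{-\frac{\eta}{2}}^{-\epsilon} \frac{(-g(x))}{x}\,dx \]
the estimate
\[ \int_{-\frac{\eta}{2}}^{-\epsilon} \frac{(-g(x))}{x}\,dx \ge -m \int_{-\frac{\eta}{2}}^{-\epsilon} \frac{1}{x}\,dx = m \left( \ln \frac{\eta}{2} - \ln \epsilon \right) \]
yields for $\epsilon \to 0$ the divergence of this integral, and hence the divergence of $\int_{-1}^0 \frac{g(x)}{x}\,dx$. Since $g$ is even the function $x \mapsto \frac{g(x)}{x}$, $x \ne 0$, is odd and therefore
\[ \int_{-1}^{-\epsilon} \frac{g(x)}{x}\,dx = -\int_\epsilon^1 \frac{g(x)}{x}\,dx, \]
implying that
\[ (*) \qquad \lim_{\epsilon \to 0} \left( \int_{-1}^{-\epsilon} \frac{g(x)}{x}\,dx + \int_\epsilon^1 \frac{g(x)}{x}\,dx \right) = 0. \]
Clearly, $(*)$ does not imply $\int_{-1}^1 \frac{g(x)}{x}\,dx = 0$ since we know that the latter integral does not exist.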
6. Suppose that $\alpha > 1$ and $\lim_{x \to \infty} x^\alpha f(x) = c_0$. It follows that there exists $R > 0$ such that $x \ge R$ implies
\[ |x^\alpha f(x)| - |c_0| \le |x^\alpha f(x) - c_0| < 1, \]
or
\[ |f(x)| \le \frac{1 + |c_0|}{x^\alpha}, \]
implying for $\alpha > 1$ the convergence of $\int_R^\infty |f(x)|\,dx$ and hence the convergence of $\int_0^\infty |f(x)|\,dx$.
Now suppose for $c_0 \ne 0$ and $\alpha \le 1$ that $\lim_{x \to \infty} x^\alpha f(x) = c_0$. We consider the case $c_0 > 0$; the case $c_0 < 0$ goes analogously. The existence of the limit implies $x^\alpha f(x) \ge 0$ for $x \ge R_0$, i.e. $f(x) \ge 0$ for $x \ge R_0$, and consequently we can find $R_1 \ge R_0$ such that $x \ge R_1$ implies
\[ c_0 - x^\alpha f(x) \le |c_0 - x^\alpha f(x)| < \frac{c_0}{2}, \]
or for $x \ge R_1$
\[ \frac{c_0}{2x^\alpha} \le f(x), \]
implying
\[ \int_{R_1}^\infty \frac{c_0}{2x^\alpha}\,dx \le \int_{R_1}^\infty f(x)\,dx, \]
but for $\alpha \le 1$ the integral on the left hand side diverges. Note that in the second case $c_0 = \infty$ is allowed. Clearly, we can also apply these criteria to continuous functions $f : [a, \infty) \to \mathbb{R}$.

7. a) Since $\lim_{x \to \infty} x \frac{\ln x}{1 + x} = \infty$, by the second case in Problem 6 the integral diverges.
b) Here we have two boundary points which can cause potential problems and therefore we split the integral as follows:
\[ \int_0^\infty \frac{1 - \cos y}{y^2}\,dy = \int_0^\pi \frac{1 - \cos y}{y^2}\,dy + \int_\pi^\infty \frac{1 - \cos y}{y^2}\,dy. \]
Since $\lim_{y \to 0} \frac{1 - \cos y}{y^2} = \frac{1}{2}$ (use the rules of l’Hospital), it turns out that the first integral is a Riemann integral and not an improper integral. For the second integral we observe that
\[ \lim_{y \to \infty} y^{\frac{3}{2}} \frac{1 - \cos y}{y^2} = 0, \]
and the first part of Problem 6 gives the convergence of the integral.
c) The substitution $t \mapsto -s$ gives
\[ \int_{-\infty}^{-1} \frac{e^t}{t}\,dt = -\int_1^\infty \frac{e^{-s}}{s}\,ds \]
and we need only to note that $\lim_{s \to \infty} s^2 \frac{e^{-s}}{s} = 0$ to deduce that $\int_{-\infty}^{-1} \frac{e^t}{t}\,dt$ converges.
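8. Following the hint we write
\[ \int_0^\infty \left| \frac{\sin x}{x} \right| dx = \sum_{n=0}^\infty \int_{n\pi}^{(n+1)\pi} \left| \frac{\sin x}{x} \right| dx. \]
Taking into account that $\sin k\pi = 0$ as well as $|\sin x| = |\sin(x + \pi)|$, the substitution $x = t + n\pi$ yields
\[ \int_{n\pi}^{(n+1)\pi} \left| \frac{\sin x}{x} \right| dx = \int_0^\pi \frac{\sin t}{t + n\pi}\,dt. \]
Since for $0 \le t \le \pi$ it follows that $\frac{1}{t + n\pi} \ge \frac{1}{(n+1)\pi}$ we find
\[ \int_0^\pi \frac{\sin t}{t + n\pi}\,dt \ge \frac{1}{(n+1)\pi} \int_0^\pi \sin t\,dt = \frac{2}{\pi (n+1)}, \]
which implies
\[ \int_0^\infty \left| \frac{\sin x}{x} \right| dx \ge \sum_{n=0}^\infty \frac{2}{\pi (n+1)} = \frac{2}{\pi} \sum_{n=1}^\infty \frac{1}{n}, \]
and since the series $\sum_{n=1}^\infty \frac{1}{n}$ diverges we have proved that the integral $\int_0^\infty \left| \frac{\sin x}{x} \right| dx$ diverges.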
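9. Since $\lim_{x \to a} \frac{f(x)}{g(x)} = c_0 > 0$, for $\frac{c_0}{2} > 0$ there exists $\delta > 0$ such that $0 < x - a < \delta$ implies
\[ \left| \frac{f(x)}{g(x)} - c_0 \right| < \frac{c_0}{2}, \]
or $\frac{c_0}{2} < \frac{f(x)}{g(x)} < \frac{3c_0}{2}$, i.e.
\[ \frac{c_0}{2} g(x) \le f(x) \le \frac{3c_0}{2} g(x). \]
This yields
\[ \frac{c_0}{2} \int_a^{a + \delta} g(x)\,dx \le \int_a^{a + \delta} f(x)\,dx \le \frac{3c_0}{2} \int_a^{a + \delta} g(x)\,dx, \]
implying that $\int_a^b f(x)\,dx$ exists if and only if $\int_a^b g(x)\,dx = \int_a^{a + \delta} g(x)\,dx + \int_{a + \delta}^b g(x)\,dx$ exists. In the case where $\lim_{x \to a} \frac{f(x)}{g(x)} = 0$ we can still find for $\epsilon > 0$ some $\delta > 0$ such that for $0 < x - a < \delta$ it follows that
\[ f(x) \le \epsilon g(x), \]
implying that $\int_a^b f(x)\,dx$ converges if $\int_a^b g(x)\,dx = \int_a^{a + \delta} g(x)\,dx + \int_{a + \delta}^b g(x)\,dx$ converges. Now, if $\lim_{x \to a} \frac{f(x)}{g(x)} = \infty$ then for $R > 0$ there exists $\delta > 0$ such that $0 < x - a < \delta$ implies $\frac{f(x)}{g(x)} \ge R$, or $f(x) \ge R g(x)$. Therefore the divergence of $\int_a^b g(x)\,dx$ implies the divergence of $\int_a^b f(x)\,dx$.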
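10. For $0 < \epsilon < r < 1$ we find with $-\ln s = u$ that
\[ \int_\epsilon^r \frac{ds}{\sqrt{-\ln s}} = \int_{-\ln \epsilon}^{-\ln r} u^{-\frac{1}{2}} (-e^{-u})\,du = \int_{-\ln r}^{-\ln \epsilon} u^{-\frac{1}{2}} e^{-u}\,du. \]
Now, as $\epsilon \to 0$ it follows that $-\ln \epsilon \to \infty$, and as $r \to 1$ it follows that $-\ln r \to 0$. Hence for $0 < \epsilon < \alpha < r$ we find
\[ \begin{aligned}
\lim_{r \to 1} \int_\alpha^r \frac{ds}{\sqrt{-\ln s}} + \lim_{\epsilon \to 0} \int_\epsilon^\alpha \frac{ds}{\sqrt{-\ln s}}
&= \lim_{r \to 1} \int_{-\ln r}^{-\ln \alpha} u^{-\frac{1}{2}} e^{-u}\,du + \lim_{\epsilon \to 0} \int_{-\ln \alpha}^{-\ln \epsilon} u^{-\frac{1}{2}} e^{-u}\,du \\
&= \int_0^\infty u^{-\frac{1}{2}} e^{-u}\,du = \Gamma\left( \frac{1}{2} \right).
\end{aligned} \]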
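11. For $0 < \epsilon < \frac{1}{2}$ we find
\[ \int_\epsilon^{1 - \epsilon} t^{x-1} (1 - t)^{y-1}\,dt = \int_\epsilon^{\frac{1}{2}} t^{x-1} (1 - t)^{y-1}\,dt + \int_{\frac{1}{2}}^{1 - \epsilon} t^{x-1} (1 - t)^{y-1}\,dt. \]
Since $x > 0$ it follows that $x - 1 > -1$ and consequently, see Example 28.3,
\[ \lim_{\epsilon \to 0} \int_\epsilon^{\frac{1}{2}} t^{x-1} (1 - t)^{y-1}\,dt \]
exists. Analogously we deduce, also compare with Problem 1 a), that
\[ \lim_{\epsilon \to 0} \int_{\frac{1}{2}}^{1 - \epsilon} t^{x-1} (1 - t)^{y-1}\,dt \]
exists, implying the convergence of $B(x, y) = \int_0^1 t^{x-1} (1 - t)^{y-1}\,dt$. Substituting $t$ by $1 - s$ we find
\[ B(x, y) = -\int_1^0 (1 - s)^{x-1} s^{y-1}\,ds = \int_0^1 s^{y-1} (1 - s)^{x-1}\,ds = B(y, x). \]
This calculation has however a problem: we have not proved the substitution rule for improper integrals. Thus we should start with
\[ \int_\epsilon^{1 - \epsilon} t^{x-1} (1 - t)^{y-1}\,dt = -\int_{1 - \epsilon}^{\epsilon} (1 - s)^{x-1} s^{y-1}\,ds = \int_\epsilon^{1 - \epsilon} s^{y-1} (1 - s)^{x-1}\,ds \]
and pass to the limit.

Finally, substituting $t = \sin^2 \vartheta$ (and allowing ourselves to use a substitution rule for this particular improper integral) we find for $x = m$ and $y = n$, while noting that for $t = 0$ we have $\vartheta = 0$ ($t = \epsilon$, $\vartheta = \arcsin \sqrt{\epsilon}$) and for $t = 1$ we have $\vartheta = \frac{\pi}{2}$ ($t = 1 - \epsilon$, $\vartheta = \arcsin \sqrt{1 - \epsilon}$), that
\[ \begin{aligned}
B(m, n) &= \int_0^1 t^{m-1} (1 - t)^{n-1}\,dt \\
&= 2 \int_0^{\frac{\pi}{2}} (\sin^2 \vartheta)^{m-1} (\cos^2 \vartheta)^{n-1} \cos \vartheta \sin \vartheta\,d\vartheta \\
&= 2 \int_0^{\frac{\pi}{2}} (\sin \vartheta)^{2m-1} (\cos \vartheta)^{2n-1}\,d\vartheta,
\end{aligned} \]
where we used $1 - \sin^2 \vartheta = \cos^2 \vartheta$ and $\frac{dt}{d\vartheta} = 2 \cos \vartheta \sin \vartheta$. (The more correct calculation would be to first derive
\[ \int_\epsilon^{1 - \epsilon} t^{m-1} (1 - t)^{n-1}\,dt = 2 \int_{\arcsin \sqrt{\epsilon}}^{\arcsin \sqrt{1 - \epsilon}} (\sin \vartheta)^{2m-1} (\cos \vartheta)^{2n-1}\,d\vartheta \]
and pass to the limit $\epsilon \to 0$.)

The mapping $(x, y) \mapsto B(x, y)$ is the (Euler) beta-function and we will study it, in particular its relation to the $\Gamma$-function, in Chapter 31.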
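The trigonometric representation and the symmetry $B(x,y) = B(y,x)$ can be checked numerically. The Python sketch below is an illustration; the names `beta_trig` and `beta_gamma` are hypothetical, and the comparison value uses the classical identity $B(x,y) = \frac{\Gamma(x)\Gamma(y)}{\Gamma(x+y)}$, which is not proved in this solution and is taken here as a known fact.

```python
import math


def beta_trig(m, n, steps=2000):
    """Midpoint sum for 2 * int_0^{pi/2} sin^(2m-1) * cos^(2n-1) d(theta)."""
    h = (math.pi / 2.0) / steps
    total = 0.0
    for j in range(steps):
        theta = (j + 0.5) * h
        total += math.sin(theta) ** (2 * m - 1) * math.cos(theta) ** (2 * n - 1)
    return 2.0 * h * total


def beta_gamma(x, y):
    """Beta function via the Gamma function (classical identity, assumed known)."""
    return math.gamma(x) * math.gamma(y) / math.gamma(x + y)


if __name__ == "__main__":
    # B(2, 3) = int_0^1 t (1 - t)^2 dt = 1/12.
    print(beta_trig(2, 3), beta_trig(3, 2), beta_gamma(2, 3), 1.0 / 12.0)
```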
12. a) Since the sum of two convex functions is convex we have for $h$ and $g$ being logarithmic convex that
\[ \log h + \log g = \log(hg) \]
is convex, i.e. $h \cdot g$ is logarithmic convex.
b) We just need to note that the convexity of $\log f$ implies
\[ 0 \le (\log f)'' = \left( \frac{f'}{f} \right)' = \frac{f'' f - (f')^2}{f^2}. \]
c) Since the limit of a sequence of convex functions is convex, the continuity of the logarithmic function implies the result.
Chapter 29
1. For $x \ne 0$ we have $\frac{1}{1 + x^4} < 1$ and consequently
\[ \sum_{n=0}^\infty g_n(x) = x^4 \sum_{n=0}^\infty \frac{1}{(1 + x^4)^n} = x^4 \frac{1}{1 - \frac{1}{1 + x^4}} = 1 + x^4, \qquad x \ne 0. \]
However, for $x = 0$ we have $g_n(x) = 0$ for all $n$, thus $\sum_{n=0}^\infty g_n(0) = 0$. It follows that
\[ \lim_{x \to 0} \sum_{n=0}^\infty g_n(x) = 1 \ne 0 = \sum_{n=0}^\infty g_n(0), \]
i.e. $\sum_{n=0}^\infty g_n(x)$ is not continuous at $x = 0$. Since all functions $g_n$ are continuous, the convergence of $\sum_{n=0}^\infty g_n(x)$ cannot be uniform on any interval containing 0.

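2. a) Since $|\sin kx| \le 1$ for all $x \in \mathbb{R}$ and $k \in \mathbb{N}$ it follows that $\left| \frac{\sin kx}{k^\alpha} \right| \le \frac{1}{k^\alpha}$, and for $\alpha > 1$ the series $\sum_{k=1}^\infty \frac{1}{k^\alpha}$ converges, hence $\sum_{k=1}^\infty \frac{\sin kx}{k^\alpha}$ converges absolutely and uniformly.
b) We observe that for $|x| \le 1$ we have $\left| \frac{x^n}{n^{\frac{3}{2}}} \right| \le \frac{1}{n^{\frac{3}{2}}}$, and the convergence of $\sum_{n=1}^\infty \frac{1}{n^{\frac{3}{2}}}$ implies the absolute and uniform convergence of $\sum_{n=1}^\infty \frac{x^n}{n^{\frac{3}{2}}}$ for $|x| \le 1$.
c) Note that $\frac{1}{n^2 + r^2} \le \frac{1}{n^2}$ for any $r \in \mathbb{R}$, and since $\sum_{n=1}^\infty \frac{1}{n^2} < \infty$ it follows that $\sum_{n=1}^\infty \frac{1}{n^2 + r^2}$ converges for all $r \in \mathbb{R}$ absolutely and uniformly.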
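3. For $\alpha = m \in \mathbb{N}$, $m \ge n$, we find
\[ \prod_{k=1}^n \frac{m - k + 1}{k} = \frac{m!}{n!(m - n)!} = \binom{m}{n}. \]
Now, for $k \in \mathbb{N}_0$ we have
\[ g_\alpha^{(k)}(x) = \alpha (\alpha - 1) \cdot \ldots \cdot (\alpha - k + 1)(1 + x)^{\alpha - k} = k! \binom{\alpha}{k} (1 + x)^{\alpha - k}, \]
i.e. $g_\alpha^{(k)}(0) = k! \binom{\alpha}{k}$. Consequently the $n$th Taylor polynomial of $g_\alpha$ about 0 is given by
\[ T_{g_\alpha}^{(n)}(x) = \sum_{k=0}^n \binom{\alpha}{k} x^k. \]
Note: with $a_k = \binom{\alpha}{k} x^k$ we find
\[ \left| \frac{a_{k+1}}{a_k} \right| = \left| \frac{\binom{\alpha}{k+1} x^{k+1}}{\binom{\alpha}{k} x^k} \right| = |x| \left| \frac{\alpha - k}{k + 1} \right|. \]
Since $\lim_{k \to \infty} \left| \frac{\alpha - k}{k + 1} \right| = 1$, we find for $\eta$ such that $|x| < \eta < 1$ some $N = N(\eta)$ with the property that $k \ge N(\eta)$ implies $\left| \frac{a_{k+1}}{a_k} \right| \le \eta < 1$. Consequently, the series
\[ \sum_{k=0}^\infty \binom{\alpha}{k} x^k \]
converges for $|x| < 1$. It takes further effort to prove that $\sum_{k=0}^\infty \binom{\alpha}{k} x^k$ is indeed the Taylor series of $g_\alpha$, $|x| < 1$.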
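4. For $N \in \mathbb{N}$ we have
\[ \sum_{k=0}^N |(a_k + b_k) x^k| \le \sum_{k=0}^N |a_k| |x|^k + \sum_{k=0}^N |b_k| |x|^k \le \sum_{k=0}^\infty |a_k| |x|^k + \sum_{k=0}^\infty |b_k| |x|^k \]
as well as
\[ \sum_{k=0}^N |(\lambda a_k) x^k| \le |\lambda| \sum_{k=0}^N |a_k| |x|^k \le |\lambda| \sum_{k=0}^\infty |a_k| |x|^k, \]
which allows us in each case to pass to the limit as $N \to \infty$. Once we have secured absolute and uniform convergence, we may pass in the equalities
\[ \sum_{k=0}^N (a_k + b_k) x^k = \sum_{k=0}^N a_k x^k + \sum_{k=0}^N b_k x^k \]
and
\[ \sum_{k=0}^N (\lambda a_k) x^k = \lambda \sum_{k=0}^N a_k x^k \]
to the limit as $N \to \infty$.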
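5. We note that $\sinh x = \frac{e^x - e^{-x}}{2}$ and $\cosh x = \frac{e^x + e^{-x}}{2}$, and therefore we find
\[ \begin{aligned}
\sinh x &= \frac{1}{2} \left( \sum_{k=0}^\infty \frac{x^k}{k!} - \sum_{k=0}^\infty (-1)^k \frac{x^k}{k!} \right) \\
&= \frac{1}{2} \left( \sum_{l=0}^\infty \frac{x^{2l}}{(2l)!} + \sum_{m=1}^\infty \frac{x^{2m-1}}{(2m-1)!} - \sum_{l=0}^\infty (-1)^{2l} \frac{x^{2l}}{(2l)!} - \sum_{m=1}^\infty (-1)^{2m-1} \frac{x^{2m-1}}{(2m-1)!} \right) \\
&= \sum_{m=1}^\infty \frac{x^{2m-1}}{(2m-1)!},
\end{aligned} \]
and further
\[ \begin{aligned}
\cosh x &= \frac{1}{2} \left( \sum_{k=0}^\infty \frac{x^k}{k!} + \sum_{k=0}^\infty (-1)^k \frac{x^k}{k!} \right) \\
&= \frac{1}{2} \left( \sum_{l=0}^\infty \frac{x^{2l}}{(2l)!} + \sum_{m=1}^\infty \frac{x^{2m-1}}{(2m-1)!} + \sum_{l=0}^\infty (-1)^{2l} \frac{x^{2l}}{(2l)!} + \sum_{m=1}^\infty (-1)^{2m-1} \frac{x^{2m-1}}{(2m-1)!} \right) \\
&= \sum_{l=0}^\infty \frac{x^{2l}}{(2l)!}.
\end{aligned} \]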
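6. We know that for $|x| < 1$ the following holds:
\[ \ln(1 + x) = \sum_{n=1}^\infty (-1)^{n+1} \frac{x^n}{n}, \]
which implies for $|x| < 1$ that
\[ \begin{aligned}
\frac{1}{2} \ln \frac{1 + x}{1 - x} &= \frac{1}{2} (\ln(1 + x) - \ln(1 - x)) \\
&= \frac{1}{2} \left( \sum_{n=1}^\infty (-1)^{n+1} \frac{x^n}{n} - \sum_{n=1}^\infty (-1)^{n+1} \frac{(-x)^n}{n} \right) \\
&= \frac{1}{2} \left( \sum_{n=1}^\infty (-1)^{n+1} \frac{x^n}{n} - \sum_{n=1}^\infty (-1)^{2n+1} \frac{x^n}{n} \right) \\
&= \frac{1}{2} \sum_{n=1}^\infty \left( (-1)^{n+1} + 1 \right) \frac{x^n}{n} \\
&= \sum_{n=0}^\infty \frac{x^{2n+1}}{2n + 1}.
\end{aligned} \]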
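The series $\sum_{n=0}^\infty \frac{x^{2n+1}}{2n+1}$ for $\frac{1}{2} \ln \frac{1+x}{1-x}$ can be compared with a direct evaluation. The Python sketch below is a simple check; the name `artanh_series` and the truncation at 60 terms are choices made for the illustration, and `math.atanh` is used only as an independent reference, since $\frac{1}{2} \ln \frac{1+x}{1-x}$ is the inverse hyperbolic tangent.

```python
import math


def artanh_series(x, terms=60):
    """Partial sum of x^(2n+1) / (2n+1) for n = 0, ..., terms - 1 (needs |x| < 1)."""
    return sum(x ** (2 * n + 1) / (2 * n + 1) for n in range(terms))


if __name__ == "__main__":
    for x in (-0.5, 0.1, 0.5):
        direct = 0.5 * math.log((1.0 + x) / (1.0 - x))
        print(x, artanh_series(x), direct, math.atanh(x))
```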
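7. For $x \in \mathbb{R}$ fixed we apply the ratio test to the series representing $J_l(x)$:
\[
\left| \frac{ \dfrac{(-1)^{n+1} \left(\frac{x}{2}\right)^{l+2(n+1)}}{(n+1)!\,(n+1+l)!} }{ \dfrac{(-1)^{n} \left(\frac{x}{2}\right)^{l+2n}}{n!\,(n+l)!} } \right|
= \frac{ \dfrac{|x|^l}{2^l} \dfrac{x^2}{2^2} \dfrac{x^{2n}}{2^{2n}} \dfrac{1}{(n+1)\,n!\,(n+1+l)\,(n+l)!} }{ \dfrac{|x|^l}{2^l} \dfrac{x^{2n}}{2^{2n}} \dfrac{1}{n!\,(n+l)!} }
= \frac{x^2}{2^2 (n+1)(n+1+l)}.
\]
Thus, in order to obtain the convergence of the series for $J_l(x)$ we need to find $N \in \mathbb{N}$ such that $n \ge N$ implies
\[
\frac{x^2}{2^2 (n+1)(n+1+l)} \le \tau < 1
\]
for some $\tau$; of course $N$ may depend on $x$. Now
\[
\frac{x^2}{2^2 (n+1)(n+1+l)} \le \frac{x^2}{2^2 n^2},
\]
thus if $\frac{x^2}{2^2 n^2} \le \frac{1}{4}$ (but any $0 < \tau < 1$ will do instead of $\frac{1}{4}$) then we are done. Now
\[
\frac{x^2}{2^2 n^2} \le \frac{1}{4} \quad \text{holds if and only if} \quad |x| \le n.
\]
Hence, for $N := [|x|] + 1$ it follows for $n \ge N$ that
\[
\frac{x^2}{2^2 (n+1)(n+1+l)} \le \frac{1}{4},
\]
implying the convergence of the series for $J_l(x)$. (Note: there is no need to assume $l \in \mathbb{N}_0$.)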
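We can now differentiate $J_l(x)$ term by term to find
\[
\begin{aligned}
J_l(x) &= \sum_{n=0}^{\infty} \frac{(-1)^n x^{l+2n}}{2^{l+2n}\, n!\,(n+l)!}, \\
J_l'(x) &= \sum_{n=0}^{\infty} \frac{(-1)^n (l+2n)\, x^{l+2n-1}}{2^{l+2n}\, n!\,(n+l)!}, \\
J_l''(x) &= \sum_{n=0}^{\infty} \frac{(-1)^n (l+2n)(l+2n-1)\, x^{l+2n-2}}{2^{l+2n}\, n!\,(n+l)!},
\end{aligned}
\]
and therefore we have
\[
\begin{aligned}
(x^2 - l^2) J_l(x) &= \sum_{n=0}^{\infty} \frac{(-1)^n x^{l+2n+2}}{2^{l+2n}\, n!\,(n+l)!} - \sum_{n=0}^{\infty} \frac{(-1)^n l^2 x^{l+2n}}{2^{l+2n}\, n!\,(n+l)!}, \\
x J_l'(x) &= \sum_{n=0}^{\infty} \frac{(-1)^n (l+2n)\, x^{l+2n}}{2^{l+2n}\, n!\,(n+l)!}, \\
x^2 J_l''(x) &= \sum_{n=0}^{\infty} \frac{(-1)^n (l+2n)(l+2n-1)\, x^{l+2n}}{2^{l+2n}\, n!\,(n+l)!}.
\end{aligned}
\]
Now we have to add up these three terms to find
\[
\begin{aligned}
x^2 J_l''(x) + x J_l'(x) + (x^2 - l^2) J_l(x)
&= \sum_{n=0}^{\infty} \frac{(-1)^n x^{l+2n+2}}{2^{l+2n}\, n!\,(n+l)!} + \sum_{n=0}^{\infty} \frac{(-1)^n \{ -l^2 + (l+2n) + (l+2n)(l+2n-1) \}\, x^{l+2n}}{2^{l+2n}\, n!\,(n+l)!} \\
&= \sum_{n=0}^{\infty} \frac{(-1)^n x^{l+2n+2}}{2^{l+2n}\, n!\,(n+l)!} + \sum_{n=0}^{\infty} \frac{(-1)^n\, 4n(n+l)\, x^{l+2n}}{2^{l+2n}\, n!\,(n+l)!} \\
&= \sum_{n=0}^{\infty} \frac{(-1)^n x^{l+2n+2}}{2^{l+2n}\, n!\,(n+l)!} + \sum_{n=1}^{\infty} \frac{(-1)^n\, 4\, x^{l+2n}}{2^{l+2n}\, (n-1)!\,(n+l-1)!} \\
&= \sum_{n=1}^{\infty} \frac{(-1)^{n-1} x^{l+2n}}{2^{l+2n-2}\, (n-1)!\,(n-1+l)!} + \sum_{n=1}^{\infty} \frac{(-1)^n\, 4\, x^{l+2n}}{2^{l+2n}\, (n-1)!\,(n+l-1)!} \\
&= -\sum_{n=1}^{\infty} \frac{(-1)^n\, 4\, x^{l+2n}}{2^{l+2n}\, (n-1)!\,(n-1+l)!} + \sum_{n=1}^{\infty} \frac{(-1)^n\, 4\, x^{l+2n}}{2^{l+2n}\, (n-1)!\,(n+l-1)!} \\
&= 0.
\end{aligned}
\]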
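As a numerical sanity check of the computation in Problem 7, one can evaluate truncated versions of the three series and form the left hand side of the differential equation. The sketch below assumes $l \in \mathbb{N}_0$ (so that factorials can be used); the names `bessel_terms` and `bessel_residual` and the truncation at 40 terms are illustrative choices only.

```python
import math

def bessel_terms(l: int, x: float, terms: int = 40):
    """Partial sums of the series for J_l, J_l' and J_l'' (term-by-term derivatives)."""
    j = dj = d2j = 0.0
    for n in range(terms):
        p = l + 2 * n  # exponent of x in the n-th term
        c = (-1) ** n / (2 ** p * math.factorial(n) * math.factorial(n + l))
        j += c * x ** p
        if p >= 1:
            dj += c * p * x ** (p - 1)
        if p >= 2:
            d2j += c * p * (p - 1) * x ** (p - 2)
    return j, dj, d2j

def bessel_residual(l: int, x: float, terms: int = 40) -> float:
    """Value of x^2 J'' + x J' + (x^2 - l^2) J for the truncated series."""
    j, dj, d2j = bessel_terms(l, x, terms)
    return x * x * d2j + x * dj + (x * x - l * l) * j

print(bessel_residual(0, 3.0), bessel_residual(2, 1.5))
```

For moderate $x$ the residual should be of the order of rounding errors, in line with the result that the sum of the three series vanishes.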
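8. Since for $|r| < 1$ we have $\sum_{n=0}^{\infty} r^n = \frac{1}{1-r}$, we find with $r = -t^2$
\[
\frac{1}{1+t^2} = \sum_{n=0}^{\infty} (-t^2)^n = \sum_{n=0}^{\infty} (-1)^n t^{2n}.
\]
For $|x| < 1$ it holds that
\[
\arctan x = \int_0^x \frac{1}{1+t^2}\, dt = \int_0^x \sum_{n=0}^{\infty} (-1)^n t^{2n}\, dt.
\]
Since $|x| < 1$ implies $|t| \le |x| < 1$, the series under the integral sign converges uniformly on the interval of integration and therefore we find by changing the order of summation and integration that
\[
\arctan x = \sum_{n=0}^{\infty} (-1)^n \int_0^x t^{2n}\, dt = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+1}}{2n+1}.
\]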
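9. Since $\tan \frac{\pi}{6} = \frac{\sqrt{3}}{3}$ we have by Problem 8
\[
\begin{aligned}
\frac{\pi}{6} = \arctan \frac{\sqrt{3}}{3} &= \sum_{n=0}^{\infty} (-1)^n \frac{1}{2n+1} \left( \frac{\sqrt{3}}{3} \right)^{2n+1} \\
&= \frac{\sqrt{3}}{3} \sum_{n=0}^{\infty} (-1)^n \frac{1}{2n+1} \frac{3^n}{3^{2n}} = \frac{1}{\sqrt{3}} \sum_{n=0}^{\infty} \frac{(-1)^n}{(2n+1)\,3^n}.
\end{aligned}
\]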
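The series of Problem 9 converges quickly, which can be observed numerically. The following sketch multiplies the partial sum by $6/\sqrt{3}$ to approximate $\pi$; the function name `pi_from_arctan_series` and the default of 40 terms are illustrative assumptions.

```python
import math

def pi_from_arctan_series(terms: int = 40) -> float:
    """Approximate pi by (6/sqrt(3)) * sum_{n<terms} (-1)^n / ((2n+1) 3^n)."""
    partial = sum((-1) ** n / ((2 * n + 1) * 3 ** n) for n in range(terms))
    return 6.0 / math.sqrt(3.0) * partial

print(pi_from_arctan_series(), math.pi)
```

Because the terms decay like $3^{-n}$, a few dozen terms should already reproduce $\pi$ to double precision.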
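10. For $n \in \mathbb{N}_0$ we denote the $n$th partial sum of $\sum_{m=0}^{\infty} a_m$ by $S_n := \sum_{k=0}^{n} a_k$ and further we set $S_{-1} := 0$, implying that $a_n = S_n - S_{n-1}$ and $S := \lim_{n\to\infty} S_n = \sum_{n=0}^{\infty} a_n$. It follows for $|x| < 1$ that by $g : (-1,1) \to \mathbb{R}$, $g(x) = \sum_{n=0}^{\infty} a_n x^n$, a function is defined which satisfies
\[
g(x) = (1-x) \sum_{n=0}^{\infty} S_n x^n.
\]
Now let $\varepsilon > 0$. Then there exists $N = N(\varepsilon) \in \mathbb{N}$ such that $n > N$ implies $|S - S_n| < \frac{\varepsilon}{2}$. Further, since for $|x| < 1$ we have $(1-x)\sum_{n=0}^{\infty} x^n = 1$, it follows for $0 < x < 1$ that
\[
|g(x) - S| = \left| (1-x) \sum_{n=0}^{\infty} (S_n - S) x^n \right| \le (1-x) \sum_{n=0}^{N} |S_n - S| + \frac{\varepsilon}{2}.
\]
Now, for this $\varepsilon > 0$ we can also find $\delta > 0$ such that $1 - \delta < x < 1$ yields
\[
(1-x) \sum_{n=0}^{N} |S_n - S| < \delta \sum_{n=0}^{N} |S_n - S| < \frac{\varepsilon}{2},
\]
implying that $|g(x) - S| < \varepsilon$, or
\[
\lim_{\substack{x \to 1 \\ x < 1}} \sum_{n=0}^{\infty} a_n x^n = \sum_{n=0}^{\infty} a_n.
\]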
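11. a) Since for $|x| < 1$ we have the Taylor expansion
\[
\ln(1+x) = \sum_{l=1}^{\infty} (-1)^{l+1} \frac{x^l}{l},
\]
Abel's convergence theorem gives
\[
\lim_{x \to 1} \ln(1+x) = \ln 2 = \sum_{l=1}^{\infty} \frac{(-1)^{l+1}}{l}.
\]
For the second equality we just have to note that the terms with indices $2l-1$ and $2l$ add up to
\[
\frac{(-1)^{2l}}{2l-1} + \frac{(-1)^{2l+1}}{2l} = \frac{2l - (2l-1)}{(2l)(2l-1)} = \frac{1}{(2l)(2l-1)}.
\]
b) We can use the Taylor series for arctan:
\[
\arctan x = \sum_{k=0}^{\infty} (-1)^k \frac{x^{2k+1}}{2k+1}
\]
and Abel's theorem gives
\[
\frac{\pi}{4} = \arctan 1 = \sum_{k=0}^{\infty} \frac{(-1)^k}{2k+1}.
\]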
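12. In both cases we use the Taylor formula with the Lagrange remainder term.
a) With some $0 < \vartheta_1 < 1$ we have
\[
\ln(1+x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \frac{x^5}{5} \cdot \frac{1}{(1+\vartheta_1 x)^5} > x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4},
\]
and for some $0 < \vartheta_2 < 1$ we find
\[
\ln(1+x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} \cdot \frac{1}{(1+\vartheta_2 x)^4} < x - \frac{x^2}{2} + \frac{x^3}{3}.
\]
b) For some $0 < \vartheta_1 < 1$ we find
\[
\sqrt{1+x} = 1 + \frac{x}{2} - \frac{x^2}{8} + \frac{x^3}{16} - \frac{5x^4}{128} \cdot \frac{1}{(1+\vartheta_1 x)^{\frac{7}{2}}} < 1 + \frac{x}{2} - \frac{x^2}{8} + \frac{x^3}{16},
\]
and with some $0 < \vartheta_2 < 1$ we get
\[
\sqrt{1+x} = 1 + \frac{x}{2} - \frac{x^2}{8} + \frac{x^3}{16} \cdot \frac{1}{(1+\vartheta_2 x)^{\frac{5}{2}}} > 1 + \frac{x}{2} - \frac{x^2}{8}.
\]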
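13. a) We know that for $|x| < 1$ we have
\[
\sum_{n=0}^{\infty} x^n = \frac{1}{1-x},
\]
therefore
\[
\left( \frac{1}{1-x} \right)^2 = \left( \sum_{n=0}^{\infty} x^n \right) \left( \sum_{m=0}^{\infty} x^m \right) = \sum_{k=0}^{\infty} c_k x^k
\]
where
\[
c_k = \sum_{l=0}^{k} a_l b_{k-l}
\]
with $a_l = 1$, $b_l = 1$ for all $l$, hence
\[
c_k = \sum_{l=0}^{k} 1 = k + 1,
\]
which implies
\[
\left( \frac{1}{1-x} \right)^2 = \sum_{k=0}^{\infty} (k+1) x^k.
\]
b) We note that
\[
\frac{\cos x}{1-x} = (\cos x) \frac{1}{1-x} = \left( \sum_{k=0}^{\infty} (-1)^k \frac{x^{2k}}{(2k)!} \right) \left( \sum_{l=0}^{\infty} x^l \right) = \sum_{k=0}^{\infty} c_k x^k
\]
where with
\[
a_n = \begin{cases} \dfrac{(-1)^k}{(2k)!}, & n = 2k \\ 0, & n = 2k-1 \end{cases}
\qquad \text{and} \qquad b_m = 1
\]
it follows that
\[
c_k = \sum_{j=0}^{k} a_j b_{k-j} = \sum_{j=0}^{k} a_j.
\]
For $k = 2l$ we find
\[
c_{2l} = \sum_{j=0}^{l} a_{2j} = \sum_{j=0}^{l} (-1)^j \frac{1}{(2j)!}
\]
and for $k = 2l+1$ we have
\[
c_{2l+1} = \sum_{j=0}^{l} a_{2j} = \sum_{j=0}^{l} (-1)^j \frac{1}{(2j)!},
\]
implying the result.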
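The Cauchy product coefficients of Problem 13 can be reproduced with a few lines of Python. This is only a sketch: the helper `cauchy_product` and the truncation length `K = 8` are illustrative choices, and only the first `K` coefficients are computed.

```python
import math

def cauchy_product(a, b):
    """Coefficients c_k = sum_{l=0}^k a_l * b_{k-l} of the product of two power series."""
    n = min(len(a), len(b))
    return [sum(a[l] * b[k - l] for l in range(k + 1)) for k in range(n)]

K = 8  # number of coefficients to compute (illustrative)
ones = [1.0] * K  # coefficients of 1/(1-x)
# Coefficients of cos x: (-1)^k/(2k)! at even index n = 2k, zero at odd index.
cos_coeffs = [(-1) ** (n // 2) / math.factorial(n) if n % 2 == 0 else 0.0
              for n in range(K)]

print(cauchy_product(ones, ones))        # part a): expected pattern k + 1
print(cauchy_product(cos_coeffs, ones))  # part b): c_{2l} and c_{2l+1} coincide
```

In part b) consecutive coefficients come in equal pairs, which reflects that the odd-indexed $a_n$ vanish.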
14. From our assumptions we deduce first
\[
(f \cdot g)(x) = \sum_{k=0}^{\infty} \frac{(f \cdot g)^{(k)}(0)}{k!} x^k,
\]
and therefore it remains to prove that
\[
\frac{(f \cdot g)^{(k)}(0)}{k!} = \sum_{l=0}^{k} \frac{f^{(l)}(0)}{l!} \frac{g^{(k-l)}(0)}{(k-l)!}.
\]
However, Leibniz's rule for higher order derivatives, see Corollary 21.12, gives
\[
(f \cdot g)^{(k)}(0) = \sum_{l=0}^{k} \binom{k}{l} f^{(l)}(0)\, g^{(k-l)}(0)
\]
and since
\[
\frac{1}{k!} \binom{k}{l} = \frac{1}{l!} \frac{1}{(k-l)!}
\]
the result follows.
Chapter 30
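1. a) Since
\[
\frac{k^3 - 1}{k^3 + 1} = \frac{(k-1)(k^2 + k + 1)}{(k+1)(k^2 - k + 1)} = \frac{(k-1)\left((k+1)^2 - (k+1) + 1\right)}{(k+1)(k^2 - k + 1)}
\]
we find for $N \in \mathbb{N}$
\[
\begin{aligned}
\prod_{k=2}^{N} \frac{k^3 - 1}{k^3 + 1} &= \prod_{k=2}^{N} \frac{(k-1)\left((k+1)^2 - (k+1) + 1\right)}{(k+1)(k^2 - k + 1)} \\
&= \prod_{k=2}^{N} \frac{k-1}{k+1} \prod_{k=2}^{N} \frac{(k+1)^2 - (k+1) + 1}{k^2 - k + 1} \\
&= \frac{2}{((N-1)+1)(N+1)} \cdot \frac{(N+1)^2 - (N+1) + 1}{4 - 2 + 1} \\
&= \frac{2(N^2 + N + 1)}{3N(N+1)}.
\end{aligned}
\]
Thus we have
\[
\prod_{k=2}^{\infty} \frac{k^3 - 1}{k^3 + 1} = \lim_{N \to \infty} \prod_{k=2}^{N} \frac{k^3 - 1}{k^3 + 1} = \lim_{N \to \infty} \frac{2(N^2 + N + 1)}{3N(N+1)} = \frac{2}{3}.
\]
b) We note that
\[
1 + \frac{1}{l(l+2)} = \frac{(l+1)^2}{l(l+2)}
\]
and therefore
\[
\begin{aligned}
\prod_{l=1}^{N} \left( 1 + \frac{1}{l(l+2)} \right) &= \prod_{l=1}^{N} \frac{(l+1)^2}{l(l+2)} = \prod_{l=1}^{N} \frac{l+1}{l} \prod_{l=1}^{N} \frac{l+1}{l+2} \\
&= \frac{N+1}{1} \cdot \frac{2}{N+2} = \frac{2(N+1)}{N+2},
\end{aligned}
\]
which yields
\[
\prod_{l=1}^{\infty} \left( 1 + \frac{1}{l(l+2)} \right) = \lim_{N \to \infty} \prod_{l=1}^{N} \left( 1 + \frac{1}{l(l+2)} \right) = \lim_{N \to \infty} \frac{2(N+1)}{N+2} = 2.
\]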
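The closed forms of the partial products in Problem 1 are easy to test numerically. In the following sketch the function names and the choice $N = 50$ are illustrative assumptions.

```python
def partial_product_a(N: int) -> float:
    """Product over k = 2..N of (k^3 - 1)/(k^3 + 1)."""
    p = 1.0
    for k in range(2, N + 1):
        p *= (k ** 3 - 1) / (k ** 3 + 1)
    return p

def partial_product_b(N: int) -> float:
    """Product over l = 1..N of 1 + 1/(l(l+2))."""
    p = 1.0
    for l in range(1, N + 1):
        p *= 1 + 1 / (l * (l + 2))
    return p

N = 50  # illustrative cut-off
print(partial_product_a(N), 2 * (N * N + N + 1) / (3 * N * (N + 1)))
print(partial_product_b(N), 2 * (N + 1) / (N + 2))
```

For growing $N$ the first pair of values should approach $\frac{2}{3}$ and the second pair should approach $2$.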
2. We first observe that for $0 \le b_k \le 1$ we have
\[
(\ast) \qquad (1 - b_1) \cdot \ldots \cdot (1 - b_n) \ge 1 - (b_1 + \cdots + b_n).
\]
Indeed, for $n = 1$ we have equality and if $(\ast)$ holds for $n$, then
\[
\begin{aligned}
(1 - b_1) \cdot \ldots \cdot (1 - b_n)(1 - b_{n+1}) &\ge (1 - (b_1 + \cdots + b_n))(1 - b_{n+1}) \\
&= 1 - (b_1 + \cdots + b_n) - b_{n+1} + (b_1 + \cdots + b_n) b_{n+1} \\
&\ge 1 - (b_1 + \cdots + b_{n+1}).
\end{aligned}
\]
Now assume that $\sum_{k=1}^{\infty} a_k$ converges. Then there exists $N \in \mathbb{N}$ such that $\sum_{k=N}^{\infty} a_k < \frac{1}{2}$. For $n > N$ we find
\[
P_n = \prod_{k=1}^{n} (1 - a_k) = P_{N-1} \prod_{k=N}^{n} (1 - a_k),
\]
or
\[
\frac{P_n}{P_{N-1}} = \prod_{k=N}^{n} (1 - a_k) \ge 1 - \sum_{k=N}^{n} a_k > \frac{1}{2},
\]
implying that $\frac{P_n}{P_{N-1}}$ is bounded from below and since for $n > N$ we have $0 < 1 - a_n < 1$ it follows that $\left( \frac{P_n}{P_{N-1}} \right)_{n \in \mathbb{N}}$ is also monotone decreasing, hence it has a limit $p \in [\frac{1}{2}, 1]$. Therefore we find
\[
\prod_{k=1}^{\infty} (1 - a_k) = \lim_{n \to \infty} \prod_{k=1}^{n} (1 - a_k) = P_{N-1} \lim_{n \to \infty} \prod_{k=N}^{n} (1 - a_k) = P_{N-1}\, p \ne 0.
\]
Conversely, suppose that $\sum_{k=1}^{\infty} a_k$ diverges. In order to have convergence of $\prod_{k=1}^{\infty} (1 - a_k)$ it is necessary that $\lim_{k \to \infty} (1 - a_k) = 1$, i.e. $\lim_{k \to \infty} a_k = 0$. We assume now that $\lim_{k \to \infty} a_k = 0$, otherwise the divergence of $\prod_{k=1}^{\infty} (1 - a_k)$ would follow immediately. Since $a_k \ge 0$ we deduce $0 \le a_k \le 1$ for all $k \ge N$ with some $N \in \mathbb{N}$. For $0 \le x \le 1$ we have $1 - x \le e^{-x}$ and therefore with $n \ge N$
\[
0 \le \prod_{k=N}^{n} (1 - a_k) \le e^{-\sum_{k=N}^{n} a_k}
\]
and the divergence of $\sum_{k=1}^{\infty} a_k$ implies now that $\lim_{n \to \infty} \prod_{k=N}^{n} (1 - a_k) = 0$, which yields that $\prod_{k=1}^{\infty} (1 - a_k)$ diverges to 0.
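3. We want to use Lemma 30.5 and hence we need a control on $\ln(1 + a_k)$. Since $\sum_{k=1}^{\infty} a_k$ converges, hence $\lim_{k \to \infty} a_k = 0$, there exists $N \in \mathbb{N}$ such that for $k \ge N$ we have $|a_k| < \frac{1}{2}$. Now we apply the Taylor formula with Lagrange remainder, see Theorem 29.14, to $\ln(1 + x)$, $|x| < \frac{1}{2}$, to find
\[
\ln(1 + x) = x - \frac{x^2}{2(1 + \xi)^2}, \qquad 0 < |\xi| < \frac{1}{2},
\]
where
\[
\frac{2}{9} < \frac{1}{2(1 + \xi)^2} < 2.
\]
a) From the considerations made above it follows that for $k \ge N$
\[
\ln(1 + a_k) = a_k - \vartheta_k a_k^2, \qquad \frac{2}{9} < \vartheta_k < 2.
\]
If $\sum_{k=1}^{\infty} a_k^2$ converges, then from $\sum_{k=1}^{\infty} \vartheta_k a_k^2 \le 2 \sum_{k=1}^{\infty} a_k^2$ it follows that $\sum_{k=1}^{\infty} \vartheta_k a_k^2$ and hence $\sum_{k=1}^{\infty} \ln(1 + a_k)$ converge. If however $\prod_{k=1}^{\infty} (1 + a_k)$ converges then $\sum_{k=1}^{\infty} \ln(1 + a_k)$ converges, implying first the convergence of $\sum_{k=1}^{\infty} \vartheta_k a_k^2$ and since $\frac{2}{9} < \vartheta_k$ the convergence of $\sum_{k=1}^{\infty} a_k^2$ follows.
b) Now suppose that $\sum_{k=1}^{\infty} a_k^2$ diverges. From our previous considerations we deduce for $k \ge N$
\[
a_k - \ln(1 + a_k) > \frac{2}{9} a_k^2,
\]
and since $\sum_{k=1}^{\infty} a_k$ converges (in particular $\lim_{k \to \infty} |a_k| = 0$) it follows that $\sum_{k=1}^{\infty} \ln(1 + a_k)$ must diverge to $-\infty$. Consequently $\prod_{k=1}^{\infty} (1 + a_k)$ diverges and conversely, the divergence of $\prod_{k=1}^{\infty} (1 + a_k)$, i.e. the divergence of $\sum_{k=1}^{\infty} \ln(1 + a_k)$, implies the divergence of $\sum_{k=1}^{\infty} a_k^2$.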
4. If $\prod_{k=1}^{\infty} (1 + a_k)$ converges absolutely, then it converges and consequently $\sum_{k=1}^{\infty} \ln(1 + a_k)$ converges. Moreover, we must have $\lim_{k \to \infty} a_k = 0$, thus for some $N \in \mathbb{N}$ it follows that $a_k > -1$ if $k \ge N$. Since $\prod_{k=1}^{\infty} (1 + a_k) = \prod_{k=1}^{N-1} (1 + a_k) \prod_{k=N}^{\infty} (1 + a_k)$, and a finite rearrangement cannot change the value of the infinite product, we may assume that $a_k > -1$ for all $k \in \mathbb{N}$. In this case, with $P = \prod_{k=1}^{\infty} (1 + a_k)$ and $S = \sum_{k=1}^{\infty} \ln(1 + a_k)$ we have $P = \exp(S)$. If we can show that $\sum_{k=1}^{\infty} \ln(1 + a_k)$ converges absolutely, then we can rearrange the series without changing its value, see Theorem 18.27. But the equality $P = \exp(S)$ then implies that we can also rearrange the product $\prod_{k=1}^{\infty} (1 + a_k)$ without changing its value. Thus it remains to prove that the absolute convergence of the product $\prod_{k=1}^{\infty} (1 + a_k)$ implies the absolute convergence of the series $\sum_{k=1}^{\infty} \ln(1 + a_k)$. From Proposition 30.10 we deduce that $\sum_{k=1}^{\infty} a_k$ converges absolutely. Moreover, since $\lim_{k \to \infty} a_k = 0$ we find
\[
\lim_{k \to \infty} \frac{|\ln(1 + a_k)|}{|a_k|} = 1,
\]
or $\frac{1}{2} \le \frac{|\ln(1 + a_k)|}{|a_k|} \le 2$ for $k$ sufficiently large, implying the absolute convergence of $\sum_{k=1}^{\infty} \ln(1 + a_k)$.
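5. a) For $|x| < 1$ we find
\[
(1 + x^{2^k})(1 - x^{2^k}) = 1 - x^{2^{k+1}}
\]
which implies
\[
\prod_{k=0}^{N} (1 + x^{2^k}) = \prod_{k=0}^{N} \frac{1 - x^{2^{k+1}}}{1 - x^{2^k}} = \frac{1 - x^{2^{N+1}}}{1 - x}
\]
and therefore
\[
\prod_{k=0}^{\infty} (1 + x^{2^k}) = \lim_{N \to \infty} \prod_{k=0}^{N} (1 + x^{2^k}) = \lim_{N \to \infty} \frac{1 - x^{2^{N+1}}}{1 - x} = \frac{1}{1 - x}.
\]
b) First we observe that for $x = 0$ the product has the value 1 and the right hand side converges for $x \to 0$ to 1. Now, for $x \ne 2^k (\frac{\pi}{2} + l\pi)$ we have $\cos \frac{x}{2^k} \ne 0$ as well as $\sin \frac{x}{2^k} \ne 0$. Using $\sin(2\varphi) = 2 \sin \varphi \cos \varphi$ we find
\[
\cos \frac{x}{2^j} = \frac{1}{2} \frac{\sin \frac{x}{2^{j-1}}}{\sin \frac{x}{2^j}},
\]
and consequently
\[
\prod_{j=1}^{N} \cos \frac{x}{2^j} = \prod_{j=1}^{N} \frac{1}{2} \frac{\sin \frac{x}{2^{j-1}}}{\sin \frac{x}{2^j}} = \frac{\sin x}{2^N \sin \frac{x}{2^N}}.
\]
Since
\[
\frac{\sin x}{2^N \sin \frac{x}{2^N}} = \frac{\sin x}{x} \cdot \frac{\frac{x}{2^N}}{\sin \frac{x}{2^N}}
\]
we eventually get
\[
\prod_{j=1}^{\infty} \cos \frac{x}{2^j} = \lim_{N \to \infty} \prod_{j=1}^{N} \frac{1}{2} \frac{\sin \frac{x}{2^{j-1}}}{\sin \frac{x}{2^j}} = \lim_{N \to \infty} \frac{\sin x}{2^N \sin \frac{x}{2^N}} = \frac{\sin x}{x} \lim_{N \to \infty} \left( \frac{\sin \frac{x}{2^N}}{\frac{x}{2^N}} \right)^{-1} = \frac{\sin x}{x}.
\]
Finally, for $x = \frac{\pi}{2}$ we derive
\[
\prod_{j=1}^{\infty} \cos \frac{\pi}{2^{j+1}} = \frac{\sin \frac{\pi}{2}}{\frac{\pi}{2}} = \frac{2}{\pi}.
\]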
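The product formula of Problem 5 b) can be illustrated numerically. The sketch below truncates the product after $N$ factors; the function name `cos_product` and the default $N = 60$ are illustrative assumptions.

```python
import math

def cos_product(x: float, N: int = 60) -> float:
    """Partial product of cos(x / 2^j) over j = 1..N."""
    p = 1.0
    for j in range(1, N + 1):
        p *= math.cos(x / 2 ** j)
    return p

print(cos_product(1.0), math.sin(1.0) / 1.0)
print(cos_product(math.pi / 2), 2 / math.pi)
```

The second line corresponds to the special case $x = \frac{\pi}{2}$ and should reproduce $\frac{2}{\pi}$ up to rounding.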
Chapter 31
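1. From Theorem 31.12, the Legendre duplication formula, we find for $n \in \mathbb{N}$
\[
\begin{aligned}
\Gamma\left( n + \frac{1}{2} \right) &= \frac{\sqrt{\pi}\, \Gamma(2n)}{2^{2n-1}\, \Gamma(n)} \\
&= \frac{\sqrt{\pi}\, (2n-1)!}{4^n \cdot \frac{1}{2}\, (n-1)!} \\
&= \frac{\sqrt{\pi}\, (2n)!}{4^n\, n!} \cdot \frac{n}{\frac{1}{2} \cdot 2n} \\
&= \frac{\sqrt{\pi}\, (2n)!}{4^n\, n!}.
\end{aligned}
\]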
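The formula for $\Gamma(n + \frac{1}{2})$ can be compared with the Gamma function from Python's standard library. This is a minimal sketch; the helper name `gamma_half_integer` and the tested range of $n$ are illustrative choices.

```python
import math

def gamma_half_integer(n: int) -> float:
    """Closed form sqrt(pi) * (2n)! / (4^n * n!) for Gamma(n + 1/2), n a positive integer."""
    return math.sqrt(math.pi) * math.factorial(2 * n) / (4 ** n * math.factorial(n))

for n in range(1, 6):
    print(n, gamma_half_integer(n), math.gamma(n + 0.5))
```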
2. Using the substitution $r = st$ we find
\[
\int_0^{\infty} t^{\alpha} e^{-st}\, dt = \int_0^{\infty} \left( \frac{r}{s} \right)^{\alpha} e^{-r}\, \frac{1}{s}\, dr = \frac{1}{s^{\alpha+1}} \int_0^{\infty} r^{\alpha} e^{-r}\, dr = \frac{\Gamma(\alpha + 1)}{s^{\alpha+1}}.
\]
Note that we applied the change of variable formula to an improper integral. Meanwhile we have seen several times, in particular in the context of the $\Gamma$-function, how to derive a result as the above one by looking first at $\int_{\varepsilon}^{R} t^{\alpha} e^{-st}\, dt$ and then passing to the limit. For a function $f : (0, \infty) \to \mathbb{R}$ such that $F(s) := \int_0^{\infty} f(t) e^{-st}\, dt$ exists we call $F$ the Laplace transform of $f$.
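3. The substitution $s = -\ln t$, i.e. $t = e^{-s}$, yields
\[
\int_0^1 \left( \ln \frac{1}{t} \right)^{x-1} dt = \int_0^1 (-\ln t)^{x-1}\, dt = -\int_{\infty}^{0} s^{x-1} e^{-s}\, ds = \int_0^{\infty} s^{x-1} e^{-s}\, ds = \Gamma(x).
\]
For $x = \frac{3}{2}$ we find using $\Gamma\left( \frac{1}{2} \right) = \sqrt{\pi}$
\[
\int_0^1 \left( \ln \frac{1}{t} \right)^{\frac{1}{2}} dt = \Gamma\left( \frac{3}{2} \right) = \frac{1}{2} \Gamma\left( \frac{1}{2} \right) = \frac{\sqrt{\pi}}{2},
\]
and for $x = \frac{1}{2}$ we find
\[
\int_0^1 \left( \ln \frac{1}{t} \right)^{-\frac{1}{2}} dt = \Gamma\left( \frac{1}{2} \right) = \sqrt{\pi}.
\]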
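4. We use formula (31.14) to find
\[
\frac{\Gamma'(1)}{\Gamma(1)} = -\gamma - \frac{1}{1} + \sum_{k=1}^{\infty} \left( \frac{1}{k} - \frac{1}{k+1} \right).
\]
Since $\Gamma(1) = 1$ and since
\[
\sum_{k=1}^{N} \left( \frac{1}{k} - \frac{1}{k+1} \right) = 1 - \frac{1}{2} + \frac{1}{2} - \frac{1}{3} + \cdots - \frac{1}{N} + \frac{1}{N} - \frac{1}{N+1},
\]
we have
\[
\sum_{k=1}^{\infty} \left( \frac{1}{k} - \frac{1}{k+1} \right) = 1,
\]
and we find
\[
\Gamma'(1) = -\gamma.
\]
Note that if we can justify
\[
\frac{d}{dx} \Gamma(x) \Big|_{x=1} = \int_0^{\infty} \frac{d}{dx} t^{x-1} \Big|_{x=1} e^{-t}\, dt
\]
we would obtain
\[
\int_0^{\infty} (\ln t)\, e^{-t}\, dt = -\gamma.
\]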
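5. a) We again use formula (31.14) to get with $\psi(1) = \frac{\Gamma'(1)}{\Gamma(1)} = -\gamma$ that
\[
\psi(x) - \psi(1) = -\frac{1}{x} + \sum_{k=1}^{\infty} \left( \frac{1}{k} - \frac{1}{k+x} \right) = -\sum_{k=0}^{\infty} \left( \frac{1}{k+x} - \frac{1}{k+1} \right).
\]
b) Since
\[
\Gamma(x + n) = (x + n - 1)(x + n - 2) \cdot \ldots \cdot x\, \Gamma(x),
\]
we have
\[
\ln \Gamma(x + n) = \ln(x + n - 1) + \ln(x + n - 2) + \cdots + \ln x + \ln \Gamma(x),
\]
and therefore
\[
\begin{aligned}
\psi(n + x) &= \frac{d}{dx} \ln \Gamma(x + n) \\
&= \frac{1}{x + n - 1} + \frac{1}{x + n - 2} + \cdots + \frac{1}{x} + \frac{d}{dx} \ln \Gamma(x) \\
&= \frac{1}{x} + \cdots + \frac{1}{x + n - 1} + \psi(x).
\end{aligned}
\]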
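The results $\psi(1) = -\gamma$ and $\psi(n + x) = \frac{1}{x} + \cdots + \frac{1}{x+n-1} + \psi(x)$ can be illustrated numerically. The sketch below approximates $\psi$ by a central difference quotient of `math.lgamma`; this approximation, the step size `h`, the constant `EULER_GAMMA` and the function name `digamma_fd` are assumptions made for illustration and are not taken from the text.

```python
import math

EULER_GAMMA = 0.5772156649015329  # numerical value of the Euler constant

def digamma_fd(x: float, h: float = 1e-5) -> float:
    """Central difference approximation of psi(x) = d/dx ln Gamma(x), for x > h."""
    return (math.lgamma(x + h) - math.lgamma(x - h)) / (2 * h)

print(digamma_fd(1.0), -EULER_GAMMA)

x, n = 0.5, 3  # illustrative sample values
harmonic_part = sum(1.0 / (x + j) for j in range(n))
print(digamma_fd(n + x), harmonic_part + digamma_fd(x))
```

The agreement is limited by the accuracy of the difference quotient, so only several digits should be expected to match.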
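6. Starting with
\[
B(x, y) = \int_0^1 t^{x-1} (1 - t)^{y-1}\, dt,
\]
the substitution $t = \frac{s}{1+s}$ yields
\[
\begin{aligned}
\int_0^1 t^{x-1} (1 - t)^{y-1}\, dt &= \int_0^{\infty} \frac{s^{x-1}}{(s+1)^{x-1}} \left( 1 - \frac{s}{1+s} \right)^{y-1} \frac{1}{(1+s)^2}\, ds \\
&= \int_0^{\infty} \frac{s^{x-1}}{(s+1)^{x-1}} \frac{1}{(s+1)^{y-1}} \frac{1}{(1+s)^2}\, ds \\
&= \int_0^{\infty} \frac{s^{x-1}}{(s+1)^{x+y}}\, ds.
\end{aligned}
\]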
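7. We apply the result of Problem 6:
\[
\int_0^{\infty} \frac{x^5}{(1 + x)^7}\, dx = B(6, 1),
\]
and now we use Theorem 31.11 which states
\[
B(x, y) = \frac{\Gamma(x) \Gamma(y)}{\Gamma(x + y)}.
\]
Thus
\[
B(6, 1) = \frac{\Gamma(6) \Gamma(1)}{\Gamma(7)} = \frac{5!\, 0!}{6!} = \frac{1}{6}.
\]
Therefore we have proved that
\[
\int_0^{\infty} \frac{x^5}{(1 + x)^7}\, dx = \frac{1}{6}.
\]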
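The value $B(6, 1) = \frac{1}{6}$ can be cross-checked in two ways: through the Gamma function and through a simple quadrature of the defining integral on $[0, 1]$. The following sketch uses a midpoint rule, which is an illustrative choice; the names `beta_via_gamma` and `beta_midpoint` and the number of subintervals are assumptions, and the quadrature is only suitable for $x, y \ge 1$.

```python
import math

def beta_via_gamma(x: float, y: float) -> float:
    """B(x, y) computed as Gamma(x) Gamma(y) / Gamma(x + y)."""
    return math.gamma(x) * math.gamma(y) / math.gamma(x + y)

def beta_midpoint(x: float, y: float, n: int = 100_000) -> float:
    """Midpoint rule for the integral of t^(x-1) (1-t)^(y-1) over [0, 1]."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h  # midpoint of the i-th subinterval
        total += t ** (x - 1) * (1 - t) ** (y - 1)
    return h * total

print(beta_via_gamma(6, 1), beta_midpoint(6, 1), 1 / 6)
```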
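8. We apply Theorem 31.2 and Corollary 31.3 in combination with the formula
\[
B(x, y) = \frac{\Gamma(x) \Gamma(y)}{\Gamma(x + y)}
\]
to find
\[
\begin{aligned}
B(x, y) &= \frac{e^{-\gamma x}}{x} \prod_{k=1}^{\infty} \frac{e^{\frac{x}{k}}}{1 + \frac{x}{k}} \cdot \frac{e^{-\gamma y}}{y} \prod_{k=1}^{\infty} \frac{e^{\frac{y}{k}}}{1 + \frac{y}{k}} \cdot (x + y)\, e^{\gamma (x+y)} \prod_{k=1}^{\infty} \left( 1 + \frac{x+y}{k} \right) e^{-\frac{(x+y)}{k}} \\
&= \frac{x + y}{xy} \prod_{k=1}^{\infty} \frac{1 + \frac{x+y}{k}}{\left( 1 + \frac{x}{k} \right) \left( 1 + \frac{y}{k} \right)}.
\end{aligned}
\]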
Chapter 32
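1. For the partition $Z$, where $x_0 = 0$, $x_j = \frac{1}{j}$, $j \in \mathbb{N}_{2k}$, and $x_{2k+1} = 1$, we find that when $j = 2l$ is even a typical term in the variation sum is
\[
|f(x_j) - f(x_{j-1})| = \left| \frac{1}{j} \cos l\pi - \frac{1}{j-1} \cos\left( l - \frac{1}{2} \right)\pi \right| = \frac{1}{j} = \frac{1}{2l}
\]
and if $j = 2l + 1$ is odd
\[
|f(x_j) - f(x_{j-1})| = \left| \frac{1}{j} \cos\left( l + \frac{1}{2} \right)\pi - \frac{1}{j-1} \cos l\pi \right| = \frac{1}{j-1} = \frac{1}{2l}.
\]
This yields
\[
V_Z(f) \ge \sum_{l=1}^{k} \frac{1}{2} \cdot \frac{1}{l},
\]
and the right hand side diverges for $k \to \infty$.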
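The growth of the variation sums can be observed numerically. The sketch below assumes that the function in question is $f(x) = x \cos\frac{\pi}{2x}$ for $x > 0$ with $f(0) = 0$; this is inferred from the cosine terms in the solution and is not stated in this section. The partition points $0 < \frac{1}{2k} < \cdots < \frac{1}{2} < 1$ and the names `variation_sum` and `harmonic` are likewise illustrative choices.

```python
import math

def f(x: float) -> float:
    """Assumed function: f(x) = x*cos(pi/(2x)) for x > 0 and f(0) = 0."""
    return 0.0 if x == 0.0 else x * math.cos(math.pi / (2.0 * x))

def variation_sum(k: int) -> float:
    """Variation sum of f over the points 0 < 1/(2k) < ... < 1/2 < 1."""
    points = [0.0] + [1.0 / j for j in range(2 * k, 0, -1)]
    return sum(abs(f(b) - f(a)) for a, b in zip(points, points[1:]))

def harmonic(k: int) -> float:
    """k-th harmonic number 1 + 1/2 + ... + 1/k."""
    return sum(1.0 / l for l in range(1, k + 1))

for k in (10, 100, 1000):
    print(k, variation_sum(k), harmonic(k))
```

Under these assumptions the variation sums grow like the harmonic numbers, so they are unbounded as $k \to \infty$.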

2. First we note that $\big| |g(x_j)| - |g(x_{j-1})| \big| \le |g(x_j) - g(x_{j-1})|$ which implies for every partition $Z$ of $[a, b]$ that
\[
V_Z(|g|) \le V_Z(g),
\]
hence if $g \in BV([a, b])$ then $|g| \in BV([a, b])$. Next we note that $0 \in BV([a, b])$ and since
\[
\max(f, g) = \frac{1}{2}(f + g + |f - g|)
\]
and
\[
\min(f, g) = \frac{1}{2}(f + g - |f - g|)
\]
we deduce from the fact that $BV([a, b])$ is a vector space and the first part of our solution that $g^+$, $g^-$ as well as $\max(f, g)$ and $\min(f, g)$ belong to $BV([a, b])$.
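3. Let $A = \inf |g|$ and $Z(x_0, \ldots, x_n)$ a partition of $[a, b]$. It follows that
\[
\left| \frac{1}{g(x_k)} - \frac{1}{g(x_{k-1})} \right| = \frac{|g(x_{k-1}) - g(x_k)|}{|g(x_k)||g(x_{k-1})|}
\]
which implies
\[
\begin{aligned}
V_Z\left( \frac{1}{g} \right) &= \sum_{k=1}^{n} \left| \frac{1}{g(x_k)} - \frac{1}{g(x_{k-1})} \right| \\
&= \sum_{k=1}^{n} \frac{|g(x_{k-1}) - g(x_k)|}{|g(x_k)||g(x_{k-1})|} \\
&\le \frac{1}{A^2} \sum_{k=1}^{n} |g(x_k) - g(x_{k-1})| = \frac{1}{A^2} V_Z(g)
\end{aligned}
\]
and taking the supremum over all partitions $Z$ we arrive at $V\left( \frac{1}{g} \right) \le \frac{1}{A^2} V(g)$.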
4. For a partition $Z(x_0, \ldots, x_n)$ of $[a, b]$ we find
\[
|F(x_k) - F(x_{k-1})| = \left| \int_{x_{k-1}}^{x_k} f(t)\, dt \right| \le \int_{x_{k-1}}^{x_k} |f(t)|\, dt
\]
and therefore
\[
V_Z(F) = \sum_{k=1}^{n} |F(x_k) - F(x_{k-1})| \le \sum_{k=1}^{n} \int_{x_{k-1}}^{x_k} |f(t)|\, dt = \int_a^b |f(t)|\, dt,
\]
i.e. $V_Z(F) \le \int_a^b |f(t)|\, dt$ for all partitions $Z$, implying that $V(F) \le \int_a^b |f(t)|\, dt$. Now we prove the converse inequality. Let $m_k := \min\{ |f(t)| \mid t \in [x_{k-1}, x_k] \}$. By the mean value theorem for the Riemann integral there exists $\xi_k \in [x_{k-1}, x_k]$ such that
\[
F(x_k) - F(x_{k-1}) = f(\xi_k)(x_k - x_{k-1})
\]
implying
\[
|F(x_k) - F(x_{k-1})| = |f(\xi_k)|(x_k - x_{k-1}) \ge m_k (x_k - x_{k-1})
\]
and consequently
\[
V_Z(F) = \sum_{k=1}^{n} |F(x_k) - F(x_{k-1})| \ge \sum_{k=1}^{n} m_k (x_k - x_{k-1}).
\]
Taking the supremum over all partitions $Z$ of $[a, b]$ we find
\[
V(F) \ge \sup_Z \sum_{k=1}^{n} m_k (x_k - x_{k-1}) = \int_a^b |f(t)|\, dt,
\]
where the last equality follows from Theorem 25.24 when observing that $m_k = |f(\eta_k)|$ for some $\eta_k \in [x_{k-1}, x_k]$.
5. a) This is trivial: we only need to take $m = 1$.

b) If $f : [a, b] \to \mathbb{R}$ is Lipschitz continuous, i.e. $|f(x) - f(y)| \le \kappa |x - y|$ for all $x, y \in [a, b]$ with some $\kappa > 0$, then we find for $\varepsilon > 0$ with $\delta = \frac{\varepsilon}{\kappa}$ that with $(a_j, b_j)$ as in the definition
\[
\sum_{j=1}^{m} (b_j - a_j) < \delta = \frac{\varepsilon}{\kappa} \quad \text{implies} \quad \sum_{j=1}^{m} \kappa (b_j - a_j) < \varepsilon
\]
and therefore
\[
\sum_{j=1}^{m} |f(b_j) - f(a_j)| \le \sum_{j=1}^{m} \kappa (b_j - a_j) < \varepsilon.
\]
c) Let $f : [a, b] \to \mathbb{R}$ be absolutely continuous. For $\varepsilon = 1$ there exists $\delta > 0$ such that $\sum_{j=1}^{m} (b_j - a_j) < \delta$ (where $(a_j, b_j)$ is as in the definition) implies $\sum_{j=1}^{m} |f(b_j) - f(a_j)| < 1$. In particular we have $V_{\alpha}^{\beta}(f) \le 1$ for every interval $[\alpha, \beta] \subset [a, b]$ with $\beta - \alpha < \delta$. Now choose $k \in \mathbb{N}$ such that $k\delta > b - a$ and split $[a, b]$ into $k$ closed intervals $I_j \subset [a, b]$ of equal length, so that $\lambda^{(1)}(I_j) = \frac{b-a}{k} < \delta$, $j = 1, \ldots, k$, and $[a, b] = \bigcup_{j=1}^{k} I_j$. It follows that $V_a^b(f) \le \sum_{j=1}^{k} V_{I_j}(f) \le k$, hence $f$ has bounded variation.
6. Since the constant functions are obviously absolutely continuous we need to prove that with $f, g : [a, b] \to \mathbb{R}$ absolutely continuous the functions $f + g$ and $f \cdot g$ are absolutely continuous too. The absolute continuity of $f + g$ follows from the triangle inequality: if we know that for every $\varepsilon > 0$ there exists $\delta > 0$ such that for $(a_j, b_j) \subset [a, b]$, $j = 1, \ldots, m$, it follows that $\sum_{j=1}^{m} (b_j - a_j) < \delta$ implies $\sum_{j=1}^{m} |f(b_j) - f(a_j)| < \frac{\varepsilon}{2}$ and $\sum_{j=1}^{m} |g(b_j) - g(a_j)| < \frac{\varepsilon}{2}$ then we have of course that
\[
\sum_{j=1}^{m} |(f + g)(b_j) - (f + g)(a_j)| \le \sum_{j=1}^{m} |f(b_j) - f(a_j)| + \sum_{j=1}^{m} |g(b_j) - g(a_j)| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.
\]
In order to prove that $f \cdot g$ is absolutely continuous we first note that $f$ and $g$ must be bounded, i.e. $\|f\|_{\infty} < \infty$ and $\|g\|_{\infty} < \infty$. For an interval $(a_j, b_j) \subset [a, b]$ we find
\[
\begin{aligned}
|(f \cdot g)(b_j) - (f \cdot g)(a_j)| &= |f(b_j) g(b_j) - f(a_j) g(a_j)| \\
&\le |f(b_j) g(b_j) - f(a_j) g(b_j)| + |f(a_j) g(b_j) - f(a_j) g(a_j)| \\
&\le \|g\|_{\infty} |f(b_j) - f(a_j)| + \|f\|_{\infty} |g(b_j) - g(a_j)|.
\end{aligned}
\]
Thus with $\|f\|_{\infty} \le M$, $\|g\|_{\infty} \le M$, $M > 0$, given $\varepsilon > 0$, choose $\delta > 0$ such that for $(a_j, b_j)$, $j = 1, \ldots, m$, and $(a_j, b_j) \subset [a, b]$, from $\sum_{j=1}^{m} (b_j - a_j) < \delta$ it follows that $\sum_{j=1}^{m} |f(b_j) - f(a_j)| < \frac{\varepsilon}{2M}$ and $\sum_{j=1}^{m} |g(b_j) - g(a_j)| < \frac{\varepsilon}{2M}$. This implies
\[
\sum_{j=1}^{m} |(f \cdot g)(b_j) - (f \cdot g)(a_j)| < M \frac{\varepsilon}{2M} + M \frac{\varepsilon}{2M} = \varepsilon.
\]
7. For $f \in C([a, b]) \cup BV([a, b])$ we define $F$ by $F(x) := \int_a^x f(t)\, dt$. Let $x, y \in [a, b]$, it follows that
\[
|F(y) - F(x)| = \left| \int_x^y f(t)\, dt \right| \le \left| \int_x^y |f(t)|\, dt \right| \le M |x - y|
\]
for $M = \|f\|_{\infty} < \infty$; note that continuous functions on a compact interval are bounded, as are functions of bounded variation. Thus $F : [a, b] \to \mathbb{R}$ is Lipschitz continuous and therefore absolutely continuous.
731
References
[1] Beals, R., and Wong, R., Special Functions. A Graduate Text. Cambridge Studies in
Advanced Mathematics, Vol. 126. Cambridge University Press, 2010.
[2] Dieudonné, J., Grundzüge der modernen Analysis, 2. Aufl. Logik und Grundlagen
der Mathematik Bd. 8. Friedrich Vieweg & Sohn, Braunschweig 1972.
[3] Endl. K., und Luh, W., Analysis I, 3. Aufl. Akademische Verlagsgesellschaft, Wies-
baden 1975.
[4] Garling, D.J.H., A Course in Mathematical Analysis, Vol. I. Foundations and Ele-
mentary Real Analysis. Cambridge University Press, Cambridge 2013.
[5] Heuser, H., Lehrbuch der Analysis. Teil 1. B.G. Teubner Verlag, Stuttgart 1980.
[6] Kaczor, W.J., and Nowak, M.T., Problems in Mathematical Analysis II. Students
Mathematical Library, Vol. 12. American Mathematical Society, Providence R.I.,
2001.
[7] Landau, E., Grundlagen der Analysis. Akademische Verlagsgesellschaft, Leipzig 1930.
[8] Lin, M., The AM-GM inequality and the CBS inequality are equivalent.The Mathe-
matical Intelligencer 34. 2 (2012), 6.
[9] Maligranda, L., The AM-GM inequality is equivalent to the Bernoulli inequality. The
Mathematical Intelligencer 34. 1 (2012), 1-2.
[10] Markuschewitsch, A.I., Rekursive Folgen. Kleine Ergänzungsreihe zu den Hochschul-

buecher für Mathematik Bd. 11. VEB Deutscher Verlag der Wissenschaften, Berlin
1955.
[11] Rudin, W., Principles of Mathematical Analysis, 3rd ed. McGraw-Hill International
Editions, Mathematical Series. McGraw-Hill Book Company, Singapore 1976.
[12] Schilling, R.L., Measures, Integration and Martingales. Cambridge University Press,
Cambridge 2005.
733
Mathematicians Contributing to Analysis

Abel, Niels Henrik (1802-1829).
Archimedes, (ca. 287B.C.-212B.C.).
Banach, Stefan (1892-1945).

Bernoulli, Jakob I (1654-1705).
Bernstein, Sergej Natanowitsch (1880-1968).
Bessel, Friedrich Wilhelm (1784-1846).
Bohr, Harald (1887-1951).
Bolzano, Bernard (1781-1848).
Borel, Emile (1871-1956).
Bunyakovsky, Viktor Jakovlevitsh (1804-1889).
Cantor, Georg (1845-1918).

Cauchy, Augustin-Louis (1789-1857).
Cohen, Paul J. (1934-2007).
de Morgan, Augustus (1806-1871).

Dedekind, Richard (1831-1916).
Descartes, René (1596-1650).
Dirichlet, Johann Peter Gustav Lejeune- (1805-1859).
du Bois-Reymond, Paul (1831-1889).
Euclid, (ca. 325B.C.-265B.C.).

Euler, Leonhard (1707-1783).
Faà di Bruno, Francesco (1825-1888).

Fermat, Pierre (1601-1665).
Fibonacci, Leonardo of Pisa, called- (ca. 1170-ca.1250).
Fourier, Jean-Baptiste Joseph (1768-1830).
Fraenkel, Abraham (1891-1965).
Gödel, Kurt (1906-1978).
Hölder, Otto (1859-1937).

Heine, Heinrich Eduard (1821-1881).
Hospital, Guillaume François Antoine de l’ (1661-1704).
Jensen, Johan (1859-1925).
Lagrange, Joseph-Louis (1736-1813).

Laplace, Pierre Simon (1749-1827).
Lebesgue, Henri (1875-1941).
Legendre, Adrien-Marie (1752-1833).
Leibniz, Gottfried Wilhelm (1646-1716).
Lindelöf, Ernst (1870-1946).
Lipschitz, Rudolf (1832-1903).
Minkowski, Hermann (1864-1909).

Mollerup, Peter Johannes (1872-1937).
Newton, Isaac (1643-1727).
Pascal, Blaise (1623-1662).

Peano, Giuseppe (1858-1932).
Picard, Emile (1856-1941).
Poincaré, Henri (1854-1912).
Pythagoras, (ca.580B.C.-ca.500B.C.).
Raabe, Josef Ludwig (1801-1859).

Riemann, Bernhard (1826-1866).
Rolle, Michel (1652-1719).
Schwarz, Hermann Amandus (1843-1921).

Stirling, James (1692-1770).
Taylor, Brook (1685-1731).
Wallis, John (1616-1703).
Weierstrass, Karl Theodor Wilhelm (1815-1897).
Zermelo, Ernst Friedrich Ferdinand (1871-1953).
Subject Index
Γ-function, 405 Bernstein function, 315

ε-neighbourhood, 213 Bessel function, 425
k-times differentiable, 299 beta-function, 450, 712
k th derivative, 118 bijective, 71
nth partial sum, 225 binary operation, 9
binomial coefficient, 46
Abel’s convergence theorem, 426 binomial theorem, 47
absolute value, 20, 59 Bolzano-Weierstrass theorem, 235
absolutely continuous, 470 Borel set, 461
absolutely convergent, 246, 430 boundary, 157
absolutely convergent integral, 401 boundary point, 156
accumulation point, 236 boundary point at infinity, 157
addition, 201, 493 bounded, 115, 215
inverse element, 7, 201 bounded above, 267
neutral element, 7, 201 bounded below, 267
affine mapping, 293 bounded from above, 215
all-quantifier, 479 bounded from below, 215
all-statement, 478 bounded function, 283
alternating harmonic series, 246 bounded interval, 263
alternating series, 245 bounded variation, 466
Archimedes’ axiom, 205
arcus-cosine function, 147 càdlàg function, 289
arcus-cotangent function, 150 Cantor set, 462
arcus-sine function, 147 cardinality, 84
arcus-tangent function, 149 Cartesian product, 56, 64
area, 175 Cauchy criterion for improper integrals,
area sinus hyperbolicus, 167 400
arithmetic mean, 205 Cauchy criterion for infinite products, 429
arithmetic-geometric mean inequality, 53, Cauchy criterion for series, 243
206, 328 Cauchy product of series, 422
associative, 7 Cauchy sequence, 234, 337
asymptote, 159 Cauchy’s condensation theorem, 645
automorphism, 87 Cauchy’s functional equation, 127
axiom of completeness, 234, 505 Cauchy-Schwarz inequality, 206, 260, 325
axiom of induction, 492 centre of curvature, 313
axiom of mathematical induction, 39 chain rule, 108, 299
axiomatic method, 491 change of variables, 186, 188
axioms of order, 202 changing the running index in a sum, 49
characteristic function, 60
b-adic fraction, 254 circle of curvature, 313
Banach space, 337 closed interval, 25, 263
Bernoulli’s inequality, 131, 136, 205 closed set, 266
cluster point, 236 derivative, 98, 295

co-domain, 55, 66 k-times differentiable, 299
co-secant function, 150 k th derivative, 118
commutative, 7 chain rule, 108, 299
commutative field, 202 higher order derivatives, 117, 299
compact set, 284 Leibniz’s rule, 101
comparison test for series, 247 logarithmic, 130
complement, 28 product rule, 101
complete normed space, 337 quotient rule, 111
complete set of representatives, 83 rules, 100
completely monotone, 315 derivative of a function, 98, 295
composition, 73 difference, 62
composition of mappings, 483 difference quotient, 296
concave, 317 differentiable, 295
conclusion, 475 differentiable from the left, 306
conjunction, 474 differentiable from the right, 306
connected set, 273 digamma-function, 456
continuous, 102 Dirichlet function, 60
absolutely continuous, 470 Dirichlet kernel, 153, 616
Hölder continuous, 674 disjoint, 28
Lipschitz continuous, 292 disjunction, 474
continuous function, 278, 326 distributivity, 8
continuum hypothesis, 86 divergent, 213
contra-position, 477 divergent to ∞, 229
convergence, 213, 225 divergent to 0, 427
infinite product, 427 divisible, 40
pointwise, 332 domain, 55, 66
sequence, 326 interior of, 161
series, 225 domain of convergence, 413
uniform, 332 dyadic numbers, 257
converse triangle inequality, 35, 334
convex, 317 elementary transcendental functions, 125
convex in the sense of J. Jensen, 597 elements, 4
coordinate projections, 73 empty set, 9
cotangent function, 149 entier-function, 60
countable, 485 equation
countable sets, 84 ordinary differential, 383
covering, 485 equidistant partition, 345
curvature, 313 equivalence class, 82
equivalence of statements, 475
de Morgan’s laws, 31, 477 equivalence relation, 66
decimal representation, 257 estimate, 19
decreasing function, 121 upper, 19
Dedekind cut, 517 Euler constant, 252, 444
definite integral, 179 Euler number, 127, 242
denumerable, 84, 485 Euler’s beta function, 450, 712
even function, 142 digamma, 456

even number, 15 domain, 55
excluded middle, 477 elementary transcendental functions,
existence-quantifier, 479 125
existence-statement, 478 equal, 55
exponential function, 125 even, 142
extreme value, 116, 305 exponential function, 125
fixed point, 290
Faà di Bruno formula, 301 graph, 57
factorial, 46
hyperbolic, 162
Fibonacci numbers, 211
identity, 76
finite intersection, 31
image, 57
finite subcovering, 284
increasing, 121
finite union, 31
injective, 71
fixed point of a function, 290
inverse, 75
fraction
isolated maximum, 305
b-adic, 254
jump, 464
fractional powers, 12
jump function, 465
function, 55
left continuous, 279
Γ-function, 405
absolutely continuous, 470 limit at ∞, 280
arcus-cosine, 147 limit from the left, 279
arcus-cotangent, 150 limit from the right, 279
arcus-sine, 147 Lipschitz continuous, 292
arcus-tangent, 149 local extreme value, 116, 305
asymptote, 159 local maximum, 116, 305
Bernstein, 315 local minimum, 116, 305
Bessel, 425 locally bounded, 157
beta, 450, 712 logarithm, 128
bounded, 115, 283 logarithmic convex, 406
bounded variation, 466 monotone, 121
càdlàg, 289 negative part, 88, 356
co-secant, 150 odd, 142
completely monotone, 315 one-to-one, 71
composition, 73 orthogonal, 380
concave, 317 piecewise continuous, 365
continuous, 102, 278, 326 piecewise linear, 291
convex, 317 point of discontinuity, 289
cotangent, 149 pointwise limit, 436
decreasing, 121 polynomials, 56
derivative, 98 positive part, 88, 356
derivative of, 295 power, 178
difference, 62 primitive, 176, 369
differentiable, 295 product, 62
differentiable from the left, 306 range, 57
differentiable from the right, 306 restriction, 62
right continuous, 279 Peetre’s, 329

secant, 150 Poincaré, 377
step function, 345 triangle, 323
strictly decreasing, 121 triangle inequality for integrals, 357
strictly increasing, 121 infimum, 268
sum, 61 infinite interval, 263
surjective, 71 infinite product, 427
tangent, 148 Cauchy criterion for, 429
total variation, 466 convergence, 427
uniformly continuous, 286 divergent to 0, 427
functional equation for exp, 126 infinite series, 225
fundamental theorem of calculus, 176, 370 initial value problem, 383
injective, 71
geometric mean, 205 inner point, 155
geometric sequence, 211 integers, 3
geometric series, 227 integral
graph, 57, 67 absolutely convergent, 401
area, 175
Hölder condition, 674
Cauchy criterion for improper inte-
Hölder continuous, 674
grals, 400
Hölder’s inequality, 324, 362
change of variables, 186, 188
half-open interval, 25, 263
definite, 179
harmonic series, 244
improper Riemann, 399
Heine-Borel Theorem, 286
indefinite, 179
higher order derivatives, 117, 299
Hospital (rules of), 160, 307 integration by parts, 184, 371
hyperbola, 59 integration by substitution, 371
hyperbolic functions, 162 linear, 183
lower integral, 350
identity, 76 of a step function, 348
image, 57, 66, 482 ratio test for improper integrals, 410
implication, 474 Riemann, 175
improper Riemann integral, 399 Riemann integral, 350
increasing function, 121 upper integral, 350
indefinite integral, 179 integral test, 250
induction, 493 integration by parts, 184, 371
inequalities, 19 integration by substitution, 371
inequality interior of a domain, 161
arithmetic-geometric mean, 206, 328 interior point, 155
Bernoulli’s, 131, 136 intermediate value theorem, 126, 283
Cauchy-Schwarz, 206, 260, 325 intersection, 28
converse triangle, 334 interval
Hölder’s, 324, 362 bounded, 263
Jensen’s, 327 closed, 25, 263
Jensen’s inequality for integrals, 366 half-open, 25, 263
Minkowski’s, 207, 260, 362 infinite, 263
nested, 238 lower integral, 350

open, 25, 213, 263
principle of nested intervals, 238 majorant, 248
inverse element, 201, 202 mapping, 64, 66
inverse function, 75 affine, 293
irrational numbers, 5 bijective, 71
isolated maximum, 305 co-domain, 66
isolated minimum, 305 composition, 483
domain, 66
Jensen’s inequality, 327 graph, 67
Jensen’s inequality for integrals, 366 identity, 76
joint partition, 345 image, 66, 482
jump function, 465 injective, 71
jump of a function, 464 pre-image, 67, 482
range, 66
L’Hospital’s rule, 160, 307 surjective, 71
Lagrange form of the remainder term, 419 target set, 66
Laplace transform, 725 mathematical induction, 39, 103, 493
law of distribution, 202 max, 34
Lebesgue measure, 461 maximum, 270
left continuous, 279 mean value theorem, 119, 307
Legendre duplication formula, 452 for integrals, 358
Leibniz’s criterion for alternating series, second or generalised, 306
245 measured by the arc length, 498
Leibniz’s rule, 101, 298 mesh size, 173, 345
limit, 95, 213 metric, 323
pointwise, 436 mid-point convex, 597
rules, 96 min, 34
limit at ∞, 280 minimum, 270
limit from the left, 279 Minkowski’s inequality, 207, 260, 325, 362
limit from the right, 279 monotone, 237
limit inferior, 270 monotone decreasing sequence, 237
limit of a sequence, 213 monotone function, 121
limit point, 236 monotone increasing sequence, 237
limit superior, 270 monotonicity, 155
linear, 183 multiplication, 8, 201
linear approximation, 296 inverse element, 8, 202
Lipschitz continuous, 292 neutral element, 8, 202
local extreme value, 116, 305
local maximum, 116, 305 natural numbers, 3
local minimum, 116, 305 necessary condition, 475
locally bounded, 157 negation, 473
logarithm, 128 negative, 13
logarithmic convex function, 406 negative part, 88, 356
logarithmic derivative, 130 neighbourhood, 267
lower bound, 268 nested intervals, 238
neutral element, 201, 202 power series, 412

non-Abelian group, 87 domain of convergence, 413
non-negative, 13 power set, 67
non-positive, 13 pre-image, 67, 482
norm, 323 premise, 475
supremum norm, 333 primitive, 176, 369
normal line, 312 principle of nested intervals, 238
null set, 459 product, 62
divergent to 0, 427
odd function, 142 Wallis’ product, 433
odd number, 15 product expansion, 436
one-to-one, 71 product of convergent sequences, 218
open covering, 284 product representation, 436
open interval, 25, 213, 263 product representation of the beta-function,
open set, 264 457
order structure, 12 product representation of the sine func-
ordered field, 202 tion, 455
ordered pair, 56 product rule, 101
ordered triple, 89 proof by contradiction, 477
ordinary differential equation, 383 Pythagoras’ theorem, 496
separation of variables, 386
orthogonal, 380 quadratic equation
osculating circle, 313 solutions of, 531
quantifier, 478
parabola, 58 all, 479
partial fractions, 193 existence, 479
partition, 82, 173, 345 quotient rule, 111
equidistant partition, 345
joint partition, 345 Raabe’s test, 261
mesh size, 173 radius of curvature, 313
width, 173 range, 57, 66
Peano’s axioms, 39, 491 ratio test, 248
Peetre’s inequality, 329 for improper integrals, 410
periodic, 61 rational numbers, 4
piecewise continuous function, 365 real numbers, 5
piecewise linear, 291 rearrangement of a series, 252
Poincaré inequality, 377 recursive definition, 49
point of discontinuity, 289 recursively defined sequence, 211
point of inflexion, 162 reductio ad absurdum, 477
pointwise limit, 436 reflexive, 66
polynomial, 56 relation, 64
Taylor polynomials, 418 equivalence, 66
positive, 13 reflexive, 66
positive integers, 3 symmetric, 66
positive part, 88, 356 transitive, 66
power function, 178 remainder term, 417
    Lagrange form of, 419
restriction, 62
Riemann integral, 175, 350
Riemann sum, 174, 359
right continuous, 279
Rolle’s theorem, 306
root test, 249
rules for limits, 96

secant function, 150
second or generalised mean value theorem, 306
separation of variables, 386
sequence
    accumulation point, 236
    bounded, 215
    bounded from above, 215
    bounded from below, 215
    Cauchy, 234, 337
    cluster point, 236
    convergent, 213, 326
    divergent, 213
    divergent to ∞, 229
    geometric, 211
    limit, 213
    limit inferior, 270
    limit point, 236
    limit superior, 270
    monotone, 237
    monotone decreasing, 237
    monotone increasing, 237
    pointwise convergence, 332
    product of convergent sequences, 218
    recursively defined, 211
    strictly monotone decreasing, 237
    strictly monotone increasing, 237
    subsequence, 235
    sum of convergent sequences, 217
    uniform convergence, 332
    uniformly convergent, 437
sequence of real numbers, 72, 211
series
    absolutely convergent, 246
    alternating, 245
    alternating harmonic, 246
    Cauchy criterion for, 243
    Cauchy product, 422
    comparison test, 247
    convergent, 225
    geometric, 227
    harmonic, 244
    infinite, 225
    infinite product, 427
    integral test, 250
    Leibniz’s criterion for alternating series, 245
    majorant, 248
    power series, 412
    ratio test, 248
    rearrangement, 252
    root test, 249
    Taylor series, 420
    telescopic, 231
set, 4
    Borel, 461
    boundary, 157
    boundary point, 156
    boundary point at infinity, 157
    bounded from above, 267
    bounded from below, 267
    Cantor, 462
    closed, 266
    compact, 284
    complement, 28
    connected, 273
    countable, 84, 485
    denumerable, 485
    disjoint, 28
    elementary operations, 481
    empty set, 9
    equality, 5
    finite intersection, 31
    finite subcovering, 284
    finite union, 31
    infimum, 268
    inner point, 155
    interior point, 155
    intersection, 28
    lower bound, 268
    maximum, 270
    minimum, 270
    neighbourhood, 267
    null set, 459
    open, 264
    open covering, 284
    power set, 67
    splitting, 273
    subset, 5
    supremum, 268
    target set, 55
    union, 28
    upper bound, 268
splitting, 273
square root, 58
statement, 473
    all, 478
    equivalence, 475
    existence, 478
step function, 345
Stirling formula, 447, 450
strictly decreasing function, 121
strictly increasing function, 121
strictly monotone decreasing sequence, 237
strictly monotone increasing sequence, 237
subsequence, 235
subset, 5
    denumerable, 84
successor, 491
sufficient condition, 475
sum, 61
sum of convergent sequences, 217
supremum, 268
supremum norm, 333
surjective, 71
symmetric, 66
symmetry, 155

tangent, 296
tangent function, 148
target set, 55, 66
Taylor expansion, 420
Taylor polynomials, 418
Taylor series, 420
Taylor’s formula, 417
telescopic series, 231
tertium non datur, 477
total variation, 466
transitive, 66
triangle inequality, 21, 323
triangle inequality for integrals, 357
truth table, 473

uniform convergence, 437
uniformly continuous function, 286
union, 28
unit sphere, 323
upper bound, 268
upper estimate, 19
upper integral, 350

Wallis’ product, 433
Weierstrass product representation, 445
Weierstrass’ convergence criterion, 411
