© W W L Chen, 1981.
This work is available free, in the hope that it will be useful.
Any part of this work may be reproduced or transmitted in any form or by any means, electronic or mechanical, including
photocopying, recording, or any information storage and retrieval system, with or without permission from the author.
Chapter 1
ARITHMETIC FUNCTIONS
1.1. Introduction
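THEOREM 1.1. Suppose that the function $f : \mathbb{N} \to \mathbb{C}$ is multiplicative. Then the function $g : \mathbb{N} \to \mathbb{C}$, defined by
\[ g(n) = \sum_{m \mid n} f(m) \]
for every $n \in \mathbb{N}$, is multiplicative.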
Proof of Theorem 1.1. Suppose that $a, b \in \mathbb{N}$ and $(a, b) = 1$. If $u$ is a positive divisor of $a$ and $v$ is a positive divisor of $b$, then clearly $uv$ is a positive divisor of $ab$. On the other hand, it is well-known that every positive divisor $m$ of $ab$ can be expressed uniquely in the form $m = uv$, where $u$ is a positive divisor of $a$ and $v$ is a positive divisor of $b$. It follows that
\[ g(ab) = \sum_{m \mid ab} f(m) = \sum_{u \mid a} \sum_{v \mid b} f(uv) = \sum_{u \mid a} \sum_{v \mid b} f(u) f(v) = \left( \sum_{u \mid a} f(u) \right) \left( \sum_{v \mid b} f(v) \right) = g(a) g(b). \]
This chapter was first used in lectures given by the author at Imperial College, University of London, in 1981.
12 W W L Chen : Elementary and Analytic Number Theory
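We define the divisor function $d : \mathbb{N} \to \mathbb{C}$ by writing
\[ d(n) = \sum_{m \mid n} 1 \tag{1} \]
for every $n \in \mathbb{N}$. Here the sum is taken over all positive divisors $m$ of $n$. In other words, the value $d(n)$ denotes the number of positive divisors of the natural number $n$. On the other hand, we define the function $\sigma : \mathbb{N} \to \mathbb{C}$ by writing
\[ \sigma(n) = \sum_{m \mid n} m \tag{2} \]
for every $n \in \mathbb{N}$. Clearly, the value $\sigma(n)$ denotes the sum of all the positive divisors of the natural number $n$.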
THEOREM 1.2. Suppose that $n \in \mathbb{N}$ and that $n = p_1^{u_1} \dots p_r^{u_r}$ is the canonical decomposition of $n$. Then
\[ d(n) = (1 + u_1) \dots (1 + u_r) \quad\text{and}\quad \sigma(n) = \frac{p_1^{u_1+1} - 1}{p_1 - 1} \dots \frac{p_r^{u_r+1} - 1}{p_r - 1}. \]
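Proof. Every positive divisor $m$ of $n$ is of the form $m = p_1^{v_1} \dots p_r^{v_r}$, where for every $j = 1, \dots, r$, the integer $v_j$ satisfies $0 \le v_j \le u_j$. It follows from (1) that $d(n)$ is the number of choices for the $r$-tuple $(v_1, \dots, v_r)$. Hence
\[ d(n) = \sum_{v_1=0}^{u_1} \dots \sum_{v_r=0}^{u_r} 1 = (1 + u_1) \dots (1 + u_r). \]
On the other hand, it follows from (2) that $\sigma(n)$ is the product over $j = 1, \dots, r$ of the sums of the powers $p_j^{v_j}$ with $0 \le v_j \le u_j$, and for every $j = 1, \dots, r$, we have
\[ \sum_{v_j=0}^{u_j} p_j^{v_j} = 1 + p_j + p_j^2 + \dots + p_j^{u_j} = \frac{p_j^{u_j+1} - 1}{p_j - 1}. \]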
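The two formulas of Theorem 1.2 can be checked numerically. The following is a minimal sketch, not part of the original notes: the helper names `factorize`, `d_formula`, `sigma_formula`, `d_brute` and `sigma_brute` are hypothetical, and trial division is used only because it is short.

```python
def factorize(n):
    """Return the canonical decomposition of n as a list of (prime, exponent) pairs."""
    factors = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            u = 0
            while n % p == 0:
                n //= p
                u += 1
            factors.append((p, u))
        p += 1
    if n > 1:
        factors.append((n, 1))  # the remaining cofactor is a prime
    return factors


def d_formula(n):
    """d(n) = (1 + u_1) ... (1 + u_r)."""
    result = 1
    for _, u in factorize(n):
        result *= 1 + u
    return result


def sigma_formula(n):
    """sigma(n) = product of (p^(u+1) - 1) / (p - 1) over prime powers p^u of n."""
    result = 1
    for p, u in factorize(n):
        result *= (p ** (u + 1) - 1) // (p - 1)
    return result


def d_brute(n):
    """Count the positive divisors of n directly from definition (1)."""
    return sum(1 for m in range(1, n + 1) if n % m == 0)


def sigma_brute(n):
    """Sum the positive divisors of n directly from definition (2)."""
    return sum(m for m in range(1, n + 1) if n % m == 0)


# Compare the closed formulas with the definitions on a small range.
for n in range(1, 500):
    assert d_formula(n) == d_brute(n)
    assert sigma_formula(n) == sigma_brute(n)
```

For instance, $12 = 2^2 \cdot 3$ gives $d(12) = 3 \cdot 2 = 6$ and $\sigma(12) = 7 \cdot 4 = 28$.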
Natural numbers $n \in \mathbb{N}$ where $\sigma(n) = 2n$ are of particular interest, and are known as perfect numbers. A perfect number is therefore a natural number which is equal to the sum of its own proper divisors; in other words, the sum of all its positive divisors other than itself.
It is not known whether any odd perfect number exists. However, we can classify the even perfect numbers.
\[ (2^{m-1}, 2^m - 1) = 1. \]
so that
\[ \sigma(u) = \frac{2^m u}{2^m - 1} = u + \frac{u}{2^m - 1}. \tag{3} \]
Note that $\sigma(u)$ and $u$ are integers and $\sigma(u) > u$. Hence $u/(2^m - 1) \in \mathbb{N}$ and is a divisor of $u$. Since $m > 1$, we have $2^m - 1 > 1$, and so $u/(2^m - 1) \ne u$. It now follows from (3) that $\sigma(u)$ is equal to the sum of two of the positive divisors of $u$. But $\sigma(u)$ is equal to the sum of all the positive divisors of $u$. Hence $u$ must have exactly two positive divisors, so that $u$ is prime. Furthermore, we must have $u/(2^m - 1) = 1$, so that $u = 2^m - 1$.
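The conclusion $u = 2^m - 1$ with $u$ prime can be illustrated numerically. The sketch below assumes the classical form of the classification, namely that the even perfect numbers are exactly the numbers $2^{m-1}(2^m - 1)$ with $2^m - 1$ prime; the statement itself is not reproduced in the text above, and the function names are hypothetical.

```python
def sigma(n):
    """Sum of all positive divisors of n, pairing each divisor m with n // m."""
    total = 0
    m = 1
    while m * m <= n:
        if n % m == 0:
            total += m
            if m * m != n:
                total += n // m
        m += 1
    return total


def is_prime(n):
    """Trial division; adequate for the small values used here."""
    return n > 1 and all(n % k for k in range(2, int(n ** 0.5) + 1))


def even_perfect_by_search(limit):
    """Even n < limit with sigma(n) = 2n, found directly from the definition."""
    return [n for n in range(2, limit, 2) if sigma(n) == 2 * n]


def even_perfect_by_form(limit):
    """Numbers 2^(m-1) (2^m - 1) < limit for which 2^m - 1 is prime."""
    found = []
    m = 2
    while 2 ** (m - 1) * (2 ** m - 1) < limit:
        if is_prime(2 ** m - 1):
            found.append(2 ** (m - 1) * (2 ** m - 1))
        m += 1
    return found


print(even_perfect_by_search(10000))  # [6, 28, 496, 8128]
print(even_perfect_by_form(10000))    # [6, 28, 496, 8128]
```

Both lists agree below 10000, corresponding to $m = 2, 3, 5, 7$.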
We are interested in the behaviour of $d(n)$ and $\sigma(n)$ as $n \to \infty$. If $n \in \mathbb{N}$ is a prime, then clearly $d(n) = 2$. Also, the magnitude of $d(n)$ is sometimes greater than that of any power of $\log n$. More precisely, we have the following result.
THEOREM 1.5. For any fixed real number $c > 0$, the inequality $d(n) \ll (\log n)^c$ as $n \to \infty$ does not hold.
Proof. The idea of the proof is to consider integers which are divisible by many different primes. Suppose that $c > 0$ is given and fixed. Let $\ell \in \mathbb{N} \cup \{0\}$ satisfy $\ell \le c < \ell + 1$. For every $j = 1, 2, 3, \dots$, let $p_j$ denote the $j$-th positive prime in increasing order of magnitude, and consider the integer
\[ n = (p_1 \dots p_{\ell+1})^m. \]
On the other hand, the order of magnitude of $d(n)$ cannot be too large either.
THEOREM 1.6. For any fixed real number $\epsilon > 0$, we have $d(n) \ll_\epsilon n^\epsilon$ as $n \to \infty$.
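Proof. For every natural number $n > 1$, let $n = p_1^{u_1} \dots p_r^{u_r}$ be its canonical decomposition. It follows from Theorem 1.2 that
\[ \frac{d(n)}{n^\epsilon} = \frac{(1 + u_1)}{p_1^{\epsilon u_1}} \dots \frac{(1 + u_r)}{p_r^{\epsilon u_r}}. \]
We may assume without loss of generality that $\epsilon < 1$. If $2 \le p_j < 2^{1/\epsilon}$, then
\[ p_j^{\epsilon u_j} \ge 2^{\epsilon u_j} = e^{\epsilon u_j \log 2} > 1 + \epsilon u_j \log 2 > (1 + u_j) \epsilon \log 2, \]
so that
\[ \frac{(1 + u_j)}{p_j^{\epsilon u_j}} < \frac{1}{\epsilon \log 2}. \]
On the other hand, if $p_j \ge 2^{1/\epsilon}$, then $p_j^\epsilon \ge 2$, and so
\[ \frac{(1 + u_j)}{p_j^{\epsilon u_j}} \le \frac{1 + u_j}{2^{u_j}} \le 1. \]
It follows that
\[ \frac{d(n)}{n^\epsilon} < \prod_{p < 2^{1/\epsilon}} \frac{1}{\epsilon \log 2}, \]
where the right hand side is a positive constant depending only on $\epsilon$.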
We see from Theorems 1.5 and 1.6 and the fact that $d(n) = 2$ infinitely often that the magnitude of $d(n)$ fluctuates a great deal as $n \to \infty$. It may then be more fruitful to average the function $d(n)$ over a range of values $n$, and consider, for positive real numbers $X \in \mathbb{R}$, the value of the average
\[ \frac{1}{X} \sum_{n \le X} d(n). \]
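Proof. As $Y \to \infty$, we have
\begin{align*}
\sum_{n \le Y} \frac{1}{n}
&= \sum_{n \le Y} \left( \frac{1}{Y} + \int_n^Y \frac{1}{u^2} \, du \right) = \frac{[Y]}{Y} + \sum_{n \le Y} \int_n^Y \frac{1}{u^2} \, du \\
&= \frac{[Y]}{Y} + \int_1^Y \left( \sum_{n \le u} 1 \right) \frac{1}{u^2} \, du = \frac{[Y]}{Y} + \int_1^Y \frac{[u]}{u^2} \, du \\
&= \frac{[Y]}{Y} + \int_1^Y \frac{1}{u} \, du - \int_1^Y \frac{u - [u]}{u^2} \, du \\
&= \log Y + 1 + O\left( \frac{1}{Y} \right) - \int_1^\infty \frac{u - [u]}{u^2} \, du + \int_Y^\infty \frac{u - [u]}{u^2} \, du \\
&= \log Y + \left( 1 - \int_1^\infty \frac{u - [u]}{u^2} \, du \right) + O\left( \frac{1}{Y} \right).
\end{align*}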
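\begin{align*}
&= 2 \sum_{x \le X^{1/2}} \left[ \frac{X}{x} \right] - [X^{1/2}]^2 = 2 \sum_{x \le X^{1/2}} \frac{X}{x} + O(X^{1/2}) - (X^{1/2} + O(1))^2 \\
&= 2X \left( \log X^{1/2} + \gamma + O\left( \frac{1}{X^{1/2}} \right) \right) + O(X^{1/2}) - X \\
&= X \log X + (2\gamma - 1)X + O(X^{1/2}).
\end{align*}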
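The first expression in the chain above, $2 \sum_{x \le X^{1/2}} [X/x] - [X^{1/2}]^2$, can be evaluated directly. The sketch below assumes that the left hand side of the chain is $\sum_{n \le X} d(n)$, the number of pairs $(x, y)$ of natural numbers with $xy \le X$, and that $\gamma$ is Euler's constant; the function names and the sample value $X = 10^6$ are chosen only for illustration.

```python
import math

GAMMA = 0.5772156649015329  # Euler's constant (assumed meaning of gamma)


def divisor_sum_hyperbola(X):
    """Evaluate 2 * sum_{x <= sqrt(X)} [X/x] - [sqrt(X)]^2 for an integer X >= 1."""
    r = math.isqrt(X)
    return 2 * sum(X // x for x in range(1, r + 1)) - r * r


def divisor_sum_direct(X):
    """Sum of d(n) over n <= X, counting pairs (x, y) with x * y <= X."""
    return sum(X // x for x in range(1, X + 1))


def main_term(X):
    """The main term X log X + (2 gamma - 1) X."""
    return X * math.log(X) + (2 * GAMMA - 1) * X


X = 10 ** 6
exact = divisor_sum_hyperbola(X)
print(exact, main_term(X), exact - main_term(X))
```

The difference printed last should be small compared with $X^{1/2} = 1000$, in line with the error term $O(X^{1/2})$.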
We next turn our attention to the study of the behaviour of $\sigma(n)$ as $n \to \infty$. Every number $n \in \mathbb{N}$ has divisors $1$ and $n$, so we must have $\sigma(1) = 1$ and $\sigma(n) > n$ if $n > 1$. On the other hand, it follows from Theorem 1.6 that for any fixed real number $\epsilon > 0$, we have
\[ \sigma(n) \le n d(n) \ll_\epsilon n^{1+\epsilon} \]
as $n \to \infty$.
In fact, it is rather easy to prove a slightly stronger result.
Proof. As $n \to \infty$, we have
\[ \sigma(n) = \sum_{m \mid n} \frac{n}{m} \le n \sum_{m \le n} \frac{1}{m} \ll n \log n. \]
As in the case of $d(n)$, the magnitude of $\sigma(n)$ fluctuates a great deal as $n \to \infty$. As before, we shall average the function $\sigma(n)$ over a range of values $n$, and consider some average version of the function.
Corresponding to Theorem 1.7, we have the following result.
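Proof. As $X \to \infty$, we have
\begin{align*}
\sum_{n \le X} \sigma(n)
&= \sum_{n \le X} \sum_{m \mid n} \frac{n}{m} = \sum_{m \le X} \sum_{\substack{n \le X \\ m \mid n}} \frac{n}{m} = \sum_{m \le X} \sum_{r \le X/m} r \\
&= \frac{1}{2} \sum_{m \le X} \left[ \frac{X}{m} \right] \left( 1 + \left[ \frac{X}{m} \right] \right) = \frac{1}{2} \sum_{m \le X} \left( \frac{X}{m} + O(1) \right)^2 \\
&= \frac{X^2}{2} \sum_{m \le X} \frac{1}{m^2} + O\left( X \sum_{m \le X} \frac{1}{m} \right) + O\left( \sum_{m \le X} 1 \right) \\
&= \frac{X^2}{2} \sum_{m=1}^\infty \frac{1}{m^2} + O\left( X^2 \sum_{m > X} \frac{1}{m^2} \right) + O(X \log X) \\
&= \frac{\pi^2}{12} X^2 + O(X \log X).
\end{align*}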
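The rearrangement $\sum_{n \le X} \sigma(n) = \sum_{m \le X} \sum_{r \le X/m} r$ used in the proof gives a quick way to compute the sum and compare it with $\pi^2 X^2 / 12$. This is an illustrative sketch with hypothetical function names; the factor 2 in the final check is a generous assumption about the implied constant, not a value taken from the notes.

```python
import math


def sigma_sum(X):
    """Sum of sigma(n) over n <= X, using sum_{r <= k} r = k (k + 1) / 2 with k = [X/m]."""
    return sum((X // m) * (X // m + 1) // 2 for m in range(1, X + 1))


def sigma_sum_direct(X):
    """The same sum computed from the definition of sigma."""
    return sum(m for n in range(1, X + 1) for m in range(1, n + 1) if n % m == 0)


X = 10 ** 4
error = sigma_sum(X) - math.pi ** 2 / 12 * X ** 2
print(sigma_sum(X), error, X * math.log(X))
assert abs(error) < 2 * X * math.log(X)
```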
1.3. The Möbius Function
Remarks. (i) A natural number which is not divisible by the square of any prime is called a squarefree number. Note that 1 is both a square and a squarefree number. Furthermore, a number $n \in \mathbb{N}$ is squarefree if and only if $\mu(n) = \pm 1$.
(ii) The motivation for the definition of the Möbius function lies rather deep. To understand the definition, one needs to study the Riemann zeta function, an important function in the study of the distribution of primes. For a more detailed discussion, see Chapters 4, 5 and 6. At this point, it suffices to remark that the Möbius function is defined so that if we formally multiply the two series
\[ \sum_{n=1}^\infty \frac{1}{n^s} \quad\text{and}\quad \sum_{n=1}^\infty \frac{\mu(n)}{n^s}, \]
We shall establish this last fact and study some of its consequences over the next four theorems.
Proof. Suppose that $a, b \in \mathbb{N}$ and $(a, b) = 1$. If $a$ or $b$ is not squarefree, then neither is $ab$, and so $\mu(ab) = 0 = \mu(a)\mu(b)$. On the other hand, if both $a$ and $b$ are squarefree, then since $(a, b) = 1$, $ab$ must also be squarefree. Furthermore, the number of prime factors of $ab$ must be the sum of the numbers of prime factors of $a$ and of $b$.
for every $n \in \mathbb{N}$. It follows from Theorems 1.1 and 1.11 that $f$ is multiplicative. For $n = 1$, the result is trivial. To complete the proof, it therefore suffices to show that $f(p^k) = 0$ for every prime $p$ and every $k \in \mathbb{N}$. Indeed,
\[ f(p^k) = \sum_{m \mid p^k} \mu(m) = \mu(1) + \mu(p) + \mu(p^2) + \dots + \mu(p^k) = 1 - 1 + 0 + \dots + 0 = 0. \]
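The vanishing of $\sum_{m \mid n} \mu(m)$ for $n > 1$ is easy to observe numerically. The sketch below assumes the usual definition of the Möbius function, which is consistent with the remarks and proofs above: $\mu(n) = (-1)^r$ if $n$ is a product of $r$ distinct primes, and $\mu(n) = 0$ if $n$ is not squarefree. The function names are hypothetical.

```python
def mobius(n):
    """Moebius function by trial division (assumed standard definition)."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0          # p^2 divides the original n
            result = -result      # one more distinct prime factor
        p += 1
    if n > 1:
        result = -result          # the remaining cofactor is a prime
    return result


def mobius_divisor_sum(n):
    """Sum of mu(m) over all positive divisors m of n."""
    return sum(mobius(m) for m in range(1, n + 1) if n % m == 0)


print([mobius(n) for n in range(1, 11)])
print([mobius_divisor_sum(n) for n in range(1, 11)])
```

The second list should begin with 1 and then consist only of zeros.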
Theorem 1.12 plays the central role in the proof of the following two results which are similar in
nature.
THEOREM 1.13. (Möbius Inversion Formula) Given any function $f : \mathbb{N} \to \mathbb{C}$, suppose that the function $g : \mathbb{N} \to \mathbb{C}$ is defined by writing
\[ g(n) = \sum_{m \mid n} f(m) \]
Remark. In number theory, it occurs quite often that in the proof of a theorem, a change of order of summation of the variables is required, as illustrated in the proofs of Theorems 1.13 and 1.14. This process of changing the order of summation does not depend on the summand in question. In both instances, we are concerned with a sum of the form
\[ \sum_{m \mid n} \sum_{k \mid \frac{n}{m}} A(k, m). \]
This means that for every positive divisor $m$ of $n$, we first sum the function $A$ over all positive divisors $k$ of $n/m$ to obtain the sum
\[ \sum_{k \mid \frac{n}{m}} A(k, m), \]
which is a function of $m$. We then sum this sum over all divisors $m$ of $n$. Now observe that for every natural number $k$ satisfying $k \mid n/m$ for some positive divisor $m$ of $n$, we must have $k \mid n$. Consider therefore a particular natural number $k$ satisfying $k \mid n$. We must find all natural numbers $m$ satisfying the original summation conditions, namely $m \mid n$ and $k \mid n/m$. These are precisely those natural numbers $m$ satisfying $m \mid n/k$. We therefore obtain, for every positive divisor $k$ of $n$, the sum
\[ \sum_{m \mid \frac{n}{k}} A(k, m). \]
Since we are summing the function $A$ over the same collection of pairs $(k, m)$, and have merely changed the order of summation, we must have
\[ \sum_{m \mid n} \sum_{k \mid \frac{n}{m}} A(k, m) = \sum_{k \mid n} \sum_{m \mid \frac{n}{k}} A(k, m). \]
We define the Euler function $\phi : \mathbb{N} \to \mathbb{C}$ as follows. For every $n \in \mathbb{N}$, we let $\phi(n)$ denote the number of elements in the set $\{1, 2, \dots, n\}$ which are coprime to $n$.
Proof. We shall partition the set $\{1, 2, \dots, n\}$ into $d(n)$ disjoint subsets $B_m$, where for every positive divisor $m$ of $n$,
\[ B_m = \{x : 1 \le x \le n \text{ and } (x, n) = m\}. \]
If $x \in B_m$, let $x = mx'$. Then $(mx', n) = m$ if and only if $(x', n/m) = 1$. Also $1 \le mx' \le n$ if and only if $1 \le x' \le n/m$. Hence
\[ B_m' = \{x' : 1 \le x' \le n/m \text{ and } (x', n/m) = 1\} \]
has the same number of elements as $B_m$. Note now that the number of elements of $B_m'$ is exactly $\phi(n/m)$. Since every element of the set $\{1, 2, \dots, n\}$ falls into exactly one of the subsets $B_m$, we must have
\[ n = \sum_{m \mid n} \phi\left( \frac{n}{m} \right) = \sum_{m \mid n} \phi(m). \]
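The partition argument can be followed on a concrete value of $n$. The sketch below groups $\{1, \dots, n\}$ by the value of $(x, n)$ and compares the size of each class with $\phi(n/m)$; the function names are hypothetical and $\phi$ is computed straight from its definition.

```python
from math import gcd


def phi(n):
    """Euler function: count of x in {1, ..., n} coprime to n."""
    return sum(1 for x in range(1, n + 1) if gcd(x, n) == 1)


def partition_by_gcd(n):
    """Map each divisor m of n to the class B_m = {x : 1 <= x <= n, (x, n) = m}."""
    classes = {}
    for x in range(1, n + 1):
        classes.setdefault(gcd(x, n), []).append(x)
    return classes


n = 12
classes = partition_by_gcd(n)
for m in sorted(classes):
    # Each class B_m should have exactly phi(n / m) elements.
    print(m, classes[m], phi(n // m))

assert sum(phi(m) for m in range(1, n + 1) if n % m == 0) == n
```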
Applying the Möbius inversion formula to the conclusion of Theorem 1.15, we obtain immediately the following result.
Proof. Since the Möbius function $\mu$ is multiplicative, it follows that the function $f : \mathbb{N} \to \mathbb{C}$, defined by $f(n) = \mu(n)/n$ for every $n \in \mathbb{N}$, is multiplicative. The result now follows from Theorem 1.1.
THEOREM 1.18. Suppose that $n \in \mathbb{N}$ and $n > 1$, and that $n = p_1^{u_1} \dots p_r^{u_r}$ is the canonical decomposition of $n$. Then
\[ \phi(n) = n \prod_{j=1}^r \left( 1 - \frac{1}{p_j} \right) = \prod_{j=1}^r p_j^{u_j - 1} (p_j - 1). \]
Proof. The second equality is trivial. On the other hand, for every prime $p$ and every $u \in \mathbb{N}$, we have by Theorem 1.16 that
\[ \frac{\phi(p^u)}{p^u} = \sum_{m \mid p^u} \frac{\mu(m)}{m} = 1 + \frac{\mu(p)}{p} = 1 - \frac{1}{p}. \]
We now study the magnitude of $\phi(n)$ as $n \to \infty$. Clearly $\phi(1) = 1$ and $\phi(n) < n$ if $n > 1$.
Suppose first of all that $n$ has many different prime factors. Then $n$ must have many different divisors, and so $\sigma(n)$ must be large relative to $n$. But then many of the numbers $1, \dots, n$ cannot be coprime to $n$, and so $\phi(n)$ must be small relative to $n$. On the other hand, suppose that $n$ has very few prime factors. Then $n$ must have very few divisors, and so $\sigma(n)$ must be small relative to $n$. But then many of the numbers $1, \dots, n$ are coprime to $n$, and so $\phi(n)$ must be large relative to $n$. It therefore appears that if one of the two values $\sigma(n)$ and $\phi(n)$ is large relative to $n$, then the other must be small relative to $n$. Indeed, our heuristics are upheld by the following result.
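Proof. The result is obvious if $n = 1$, so suppose that $n > 1$. Let $n = p_1^{u_1} \dots p_r^{u_r}$ be the canonical decomposition of $n$. Recall Theorems 1.2 and 1.18. We have
\[ \sigma(n) = \prod_{j=1}^r \frac{p_j^{u_j+1} - 1}{p_j - 1} = n \prod_{j=1}^r \frac{1 - p_j^{-u_j-1}}{1 - p_j^{-1}} \]
and
\[ \phi(n) = n \prod_{j=1}^r (1 - p_j^{-1}). \]
Hence
\[ \frac{\sigma(n)\phi(n)}{n^2} = \prod_{j=1}^r (1 - p_j^{-u_j-1}). \]
The upper bound follows at once. On the other hand, the lower bound follows on observing that
\[ \prod_{j=1}^r (1 - p_j^{-u_j-1}) \ge \prod_{p \mid n} (1 - p^{-2}) \ge \prod_{m=2}^n \left( 1 - \frac{1}{m^2} \right) = \frac{n+1}{2n} > \frac{1}{2}. \]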
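We shall denote the class of all arithmetic functions by $\mathcal{A}$, and the class of all multiplicative functions by $\mathcal{M}$.
Given arithmetic functions $f, g \in \mathcal{A}$, we define the function $f * g : \mathbb{N} \to \mathbb{C}$ by writing
\[ (f * g)(n) = \sum_{m \mid n} f(m) g\left( \frac{n}{m} \right) \]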
Furthermore, the arithmetic function $I : \mathbb{N} \to \mathbb{C}$, defined by $I(1) = 1$ and $I(n) = 0$ for every $n \in \mathbb{N}$ satisfying $n > 1$, is an identity element for Dirichlet convolution. It is easy to check that $I * f = f * I = f$ for every $f \in \mathcal{A}$.
On the other hand, an inverse may not exist under Dirichlet convolution. Consider, for example, the function $f \in \mathcal{A}$ satisfying $f(n) = 0$ for every $n \in \mathbb{N}$.
THEOREM 1.22. For any $f \in \mathcal{A}$, the following two statements are equivalent:
(i) We have $f(1) \ne 0$.
(ii) There exists a unique $g \in \mathcal{A}$ such that $f * g = g * f = I$.
Proof. Suppose that (ii) holds. Then $f(1)g(1) = 1$, so that $f(1) \ne 0$. Conversely, suppose that $f(1) \ne 0$. We shall define $g \in \mathcal{A}$ iteratively by writing
\[ g(1) = \frac{1}{f(1)} \tag{5} \]
and
\[ g(n) = -\frac{1}{f(1)} \sum_{\substack{d \mid n \\ d > 1}} f(d) g\left( \frac{n}{d} \right) \tag{6} \]
for every $n \in \mathbb{N}$ satisfying $n > 1$. It is easy to check that this gives an inverse. Moreover, every inverse must satisfy (5) and (6), and so must be unique.
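The iterative construction in (5) and (6) translates directly into a short program. The sketch below assumes that (6) carries a minus sign, as forced by the requirement $(f * g)(n) = 0$ for $n > 1$, and uses exact rational arithmetic. The names `dirichlet_convolution`, `dirichlet_inverse` and `U` are hypothetical, with `U` standing for the constant function 1.

```python
from fractions import Fraction


def dirichlet_convolution(f, g, n):
    """(f * g)(n) = sum over divisors m of n of f(m) g(n / m)."""
    return sum(f(m) * g(n // m) for m in range(1, n + 1) if n % m == 0)


def dirichlet_inverse(f, N):
    """Return a dict g with g[n] for 1 <= n <= N, built from (5) and (6).

    Requires f(1) != 0, as in statement (i) of the theorem.
    """
    g = {1: Fraction(1) / f(1)}                      # equation (5)
    for n in range(2, N + 1):
        s = sum(f(d) * g[n // d] for d in range(2, n + 1) if n % d == 0)
        g[n] = -s / f(1)                             # equation (6)
    return g


def U(n):
    """The arithmetic function with U(n) = 1 for every n."""
    return 1


g = dirichlet_inverse(U, 30)
print([int(g[n]) for n in range(1, 11)])  # [1, -1, -1, 0, -1, 1, -1, 0, 0, 1]
```

The printed values are those of the Möbius function, in agreement with the fact that the inverse of the constant function 1 under Dirichlet convolution is $\mu$.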
We now describe Theorem 1.12 and Möbius inversion in terms of Dirichlet convolution. Recall that the function $U \in \mathcal{A}$ is defined by $U(n) = 1$ for all $n \in \mathbb{N}$.
THEOREM 1.23.
(i) We have $\mu * U = I$.
(ii) If $f \in \mathcal{A}$ and $g = f * U$, then $f = g * \mu$.
(iii) If $g \in \mathcal{A}$ and $f = g * \mu$, then $g = f * U$.
Proof. (i) follows from Theorem 1.12. To prove (ii), note that
\[ g * \mu = (f * U) * \mu = f * (U * \mu) = f * I = f. \]
Remark. Note that if $f \in \mathcal{M}$ is not identically zero, then $f(n) \ne 0$ for some $n \in \mathbb{N}$. Since $f(n) = f(1)f(n)$, we must have $f(1) = 1$.
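Proof of Theorem 1.24. For $\mathcal{A}'$, this is now trivial. We now consider $\mathcal{M}'$. Clearly $I \in \mathcal{M}'$. If $f, g \in \mathcal{M}'$ and $(m, n) = 1$, then
\begin{align*}
(f * g)(mn) &= \sum_{d \mid mn} f(d) g\left( \frac{mn}{d} \right) = \sum_{d_1 \mid m} \sum_{d_2 \mid n} f(d_1 d_2) g\left( \frac{mn}{d_1 d_2} \right) \\
&= \left( \sum_{d_1 \mid m} f(d_1) g\left( \frac{m}{d_1} \right) \right) \left( \sum_{d_2 \mid n} f(d_2) g\left( \frac{n}{d_2} \right) \right) = (f * g)(m) (f * g)(n),
\end{align*}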
\[ g(p^k) = h(p^k) \]
for every $n > 1$. Then $g \in \mathcal{M}'$. Furthermore, for every integer $n > 1$, we have
\[ (f * g)(n) = \prod_{p^k \| n} (f * g)(p^k) = \prod_{p^k \| n} (f * h)(p^k) = \prod_{p^k \| n} I(p^k) = I(n), \]
so that $g$ is an inverse of $f$.
ELEMENTARY AND
ANALYTIC NUMBER THEORY
W W L CHEN
Chapter 2
DISTRIBUTION OF PRIMES
I : INTRODUCTION
We have already seen the elegant and simple proof of Euclid's theorem, that there are infinitely many primes. Here we shall begin by proving a slightly stronger result.
The series
\[ \sum_p \frac{1}{p} \]
is divergent.
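\[ P_X = \prod_{p \le X} \left( 1 - \frac{1}{p} \right)^{-1}. \]
Then
\[ \log P_X = -\sum_{p \le X} \log\left( 1 - \frac{1}{p} \right) = S_1 + S_2, \]
where
\[ S_1 = \sum_{p \le X} \frac{1}{p} \quad\text{and}\quad S_2 = \sum_{p \le X} \sum_{h=2}^\infty \frac{1}{h p^h}. \]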
This chapter was first used in lectures given by the author at Imperial College, University of London, in 1981.
Since
\[ 0 \le \sum_{h=2}^\infty \frac{1}{h p^h} \le \sum_{h=2}^\infty \frac{1}{p^h} = \frac{1}{p(p-1)}, \]
we have
\[ 0 \le S_2 \le \sum_p \frac{1}{p(p-1)} \le \sum_{n=2}^\infty \frac{1}{n(n-1)} = 1, \]
so that $\pi(X)$ denotes the number of primes in the interval $[2, X]$. This function has been studied extensively by number theorists, and attempts to study it in depth have led to major developments in other important branches of mathematics.
As can be expected, many conjectures concerning the distribution of primes were made based purely on numerical evidence, including the celebrated Prime number theorem, proved in 1896 by Hadamard and de la Vallée Poussin, that
\[ \lim_{X \to \infty} \frac{\pi(X) \log X}{X} = 1. \]
We shall give an analytic proof of this in Chapter 5. Here we shall be concerned with the weaker result of Tchebycheff, that there exist positive absolute constants $c_1$ and $c_2$ such that for every real number $X \ge 2$, we have
\[ c_1 \frac{X}{\log X} < \pi(X) < c_2 \frac{X}{\log X}. \]
The study of the function $\pi(X)$ usually involves, instead of the characteristic function of the primes, a function which counts not only primes, but prime powers as well, and with weights. Accordingly, we introduce the von Mangoldt function $\Lambda : \mathbb{N} \to \mathbb{C}$, defined for every $n \in \mathbb{N}$ by writing
\[ \Lambda(n) = \begin{cases} \log p & \text{if } n = p^r, \text{ with } p \text{ prime and } r \in \mathbb{N}, \\ 0 & \text{otherwise}. \end{cases} \]
Proof. The result is clearly true for $n = 1$, so it remains to consider the case $n \ge 2$. Suppose that $n = p_1^{u_1} \dots p_r^{u_r}$ is the canonical decomposition of $n$. Then the only non-zero contribution to the sum on the left hand side comes from those natural numbers $m$ of the form $m = p_j^{v_j}$ with $j = 1, \dots, r$ and $1 \le v_j \le u_j$. It follows that
\[ \sum_{m \mid n} \Lambda(m) = \sum_{j=1}^r \sum_{v_j=1}^{u_j} \log p_j = \sum_{j=1}^r \log p_j^{u_j} = \log n. \]
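The identity $\sum_{m \mid n} \Lambda(m) = \log n$ established in this proof can be checked in floating point. This is a minimal sketch with a hypothetical function name; the tolerance in the final check only absorbs rounding error.

```python
import math


def von_mangoldt(n):
    """Lambda(n) = log p if n is a power p^r (r >= 1) of a prime p, and 0 otherwise."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            # p is the smallest prime factor; strip every factor p from n.
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
        p += 1
    return math.log(n)  # n itself is prime


for n in range(1, 500):
    total = sum(von_mangoldt(m) for m in range(1, n + 1) if n % m == 0)
    assert math.isclose(total, math.log(n), abs_tol=1e-9)
```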
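\[ \sum_{m \le X} \Lambda(m) \left[ \frac{X}{m} \right] = X \log X - X + O(\log X). \]
\[ \sum_{n \le X} \log n = \sum_{n \le X} \sum_{m \mid n} \Lambda(m) = \sum_{m \le X} \Lambda(m) \sum_{\substack{n \le X \\ m \mid n}} 1 = \sum_{m \le X} \Lambda(m) \left[ \frac{X}{m} \right]. \]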
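To prove (1), note that $\log X$ is an increasing function of $X$. In particular, for every $n \in \mathbb{N}$, we have
\[ \log n \le \int_n^{n+1} \log u \, du, \]
so that
\[ \sum_{n \le X} \log n \le \log(X + 1) + \int_1^X \log u \, du. \]
so that
\begin{align*}
\sum_{n \le X} \log n &= \sum_{2 \le n \le X} \log n \ge \int_1^{[X]} \log u \, du \\
&= \int_1^X \log u \, du - \int_{[X]}^X \log u \, du \ge \int_1^X \log u \, du - \log X.
\end{align*}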
The crucial step in the proof of Tchebycheff's theorem concerns obtaining bounds on sums involving the von Mangoldt function. More precisely, we prove the following result.
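THEOREM 2.4. There exist positive absolute constants $c_3$ and $c_4$ such that
\[ \sum_{m \le X} \Lambda(m) \ge \frac{1}{2} X \log 2 \quad (X \ge c_3) \tag{2} \]
and
\[ \sum_{\frac{X}{2} < m \le X} \Lambda(m) \le c_4 X \quad (X \ge 0). \tag{3} \]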
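Proof. If $m \in \mathbb{N}$ satisfies $X/2 < m \le X$, then clearly $[X/2m] = 0$. It follows from this and Theorem 2.3 that as $X \to \infty$, we have
\begin{align*}
\sum_{m \le X} \Lambda(m) \left( \left[ \frac{X}{m} \right] - 2 \left[ \frac{X}{2m} \right] \right)
&= \sum_{m \le X} \Lambda(m) \left[ \frac{X}{m} \right] - 2 \sum_{m \le \frac{X}{2}} \Lambda(m) \left[ \frac{X}{2m} \right] \\
&= (X \log X - X + O(\log X)) - 2 \left( \frac{X}{2} \log \frac{X}{2} - \frac{X}{2} + O(\log X) \right) \\
&= X \log 2 + O(\log X).
\end{align*}
Hence there exists a positive absolute constant $c_5$ such that for all sufficiently large $X$, we have
\[ \frac{1}{2} X \log 2 < \sum_{m \le X} \Lambda(m) \left( \left[ \frac{X}{m} \right] - 2 \left[ \frac{X}{2m} \right] \right) < c_5 X. \]
We now consider the function $[\alpha] - 2[\alpha/2]$. Clearly $[\alpha] - 2[\alpha/2] < \alpha - 2(\alpha/2 - 1) = 2$. Note that the left hand side is an integer, so we must have $[\alpha] - 2[\alpha/2] \le 1$. It follows that for all sufficiently large $X$, we have
\[ \frac{1}{2} X \log 2 < \sum_{m \le X} \Lambda(m). \]
The inequality (2) follows. On the other hand, if $X/2 < m \le X$, then $[X/m] = 1$ and $[X/2m] = 0$, so that for all sufficiently large $X$, we have
\[ \sum_{\frac{X}{2} < m \le X} \Lambda(m) \le c_5 X. \]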
THEOREM 2.5. (Tchebycheff) There exist positive absolute constants $c_1$ and $c_2$ such that for every real number $X \ge 2$, we have
\[ c_1 \frac{X}{\log X} < \pi(X) < c_2 \frac{X}{\log X}. \]
\[ = \sum_{p \le X} (\log p) \left[ \frac{\log X}{\log p} \right] \le \pi(X) \log X. \]
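holds for every integer $j \ge 0$ and every real number $X \ge 0$. Suppose that $X \ge 2$. Let the integer $k \ge 0$ be defined such that $2^k < X^{1/2} \le 2^{k+1}$. Then
\[ \sum_{X^{1/2} < p \le X} \log p \le \sum_{j=0}^k \sum_{\frac{X}{2^{j+1}} < p \le \frac{X}{2^j}} \log p \le c_4 X \sum_{j=0}^k 2^{-j} < 2 c_4 X, \]
so that
\[ \sum_{X^{1/2} < p \le X} 1 \le \sum_{X^{1/2} < p \le X} \frac{\log p}{\log X^{1/2}} < \frac{4 c_4 X}{\log X}, \]
whence
\[ \pi(X) \le X^{1/2} + \frac{4 c_4 X}{\log X} < \frac{c_2 X}{\log X} \]
for a suitable $c_2$.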
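The two-sided bound of Tchebycheff's theorem can be observed by tabulating the ratio $\pi(X) \log X / X$. The sketch below uses a sieve of Eratosthenes; the function names are hypothetical, and the range of ratios it displays is merely what is observed on these sample values, not the constants $c_1$ and $c_2$ of the theorem.

```python
import math


def prime_count(X):
    """pi(X): the number of primes in [2, X], by the sieve of Eratosthenes."""
    if X < 2:
        return 0
    sieve = bytearray([1]) * (X + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(X) + 1):
        if sieve[p]:
            # Cross out the multiples of p starting from p * p.
            sieve[p * p :: p] = bytearray(len(range(p * p, X + 1, p)))
    return sum(sieve)


def chebyshev_ratio(X):
    """The ratio pi(X) log X / X."""
    return prime_count(X) * math.log(X) / X


for X in (10, 100, 1000, 10 ** 4, 10 ** 5):
    print(X, prime_count(X), round(chebyshev_ratio(X), 4))
```

On these values the ratio stays between roughly 0.9 and 1.2, which is consistent with the theorem and with the Prime number theorem.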
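\[ \sum_{m \le X} \Lambda(m) \left[ \frac{X}{m} \right] = X \log X - X + O(\log X). \]
so that as $X \to \infty$, we have
\[ X \sum_{m \le X} \frac{\Lambda(m)}{m} = X \log X + O(X). \]
As $X \to \infty$, we have
\[ \sum_{p \le X} \sum_{2 \le k \le \frac{\log X}{\log p}} \frac{\log p}{p^k} \le \sum_{p \le X} (\log p) \sum_{k=2}^\infty \frac{1}{p^k} = \sum_{p \le X} \frac{\log p}{p(p-1)} \le \sum_{n=2}^\infty \frac{\log n}{n(n-1)} = O(1). \]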
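The inequality (5) follows. Finally, for every real number $X \ge 2$, let
\[ T(X) = \sum_{p \le X} \frac{\log p}{p}. \]
Then it follows from (5) that there exists a positive absolute constant $c_6$ such that $|T(X) - \log X| < c_6$ whenever $X \ge 2$. On the other hand,
\begin{align*}
\sum_{p \le X} \frac{1}{p}
&= \sum_{p \le X} \frac{\log p}{p} \left( \frac{1}{\log X} + \int_p^X \frac{dy}{y \log^2 y} \right) = \frac{T(X)}{\log X} + \int_2^X \frac{T(y) \, dy}{y \log^2 y} \\
&= \frac{T(X) - \log X}{\log X} + \int_2^X \frac{(T(y) - \log y) \, dy}{y \log^2 y} + 1 + \int_2^X \frac{dy}{y \log y}.
\end{align*}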
© W W L Chen, 1990.
This work is available free, in the hope that it will be useful.
Any part of this work may be reproduced or transmitted in any form or by any means, electronic or mechanical, including
photocopying, recording, or any information storage and retrieval system, with or without permission from the author.
Chapter 3
DIRICHLET SERIES
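THEOREM 3.1. Suppose that the series (1) converges for some $s \in \mathbb{C}$. Then there exist unique real numbers $\sigma_0, \sigma_1, \sigma_2$ satisfying $\sigma_0 \le \sigma_1 \le \sigma_2 < \infty$ and such that the following statements hold:
(i) The series (1) converges for every $s \in \mathbb{C}$ with $\sigma > \sigma_0$. Furthermore, for every $\epsilon > 0$, the series (1) diverges for some $s \in \mathbb{C}$ with $\sigma_0 - \epsilon < \sigma \le \sigma_0$.
(ii) For every $\delta > 0$, the series (1) converges uniformly on the set $\{s \in \mathbb{C} : \sigma > \sigma_1 + \delta\}$ and does not converge uniformly on the set $\{s \in \mathbb{C} : \sigma > \sigma_1 - \delta\}$.
(iii) The series (1) converges absolutely for every $s \in \mathbb{C}$ with $\sigma > \sigma_2$. Furthermore, for every $\epsilon > 0$, the series (1) does not converge absolutely for some $s \in \mathbb{C}$ with $\sigma_2 - \epsilon < \sigma \le \sigma_2$.
converges absolutely for every $s \in \mathbb{C}$ with $\sigma > 1$ and diverges for every real $s < 1$. It follows that $\sigma_0 = \sigma_1 = \sigma_2 = 1$ in this case.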
This chapter was first used in lectures given by the author at Imperial College, University of London, in 1990.
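Proof of Theorem 3.1. Suppose that the series (1) converges for $s = s^* = \sigma^* + it^*$. Then $f(n)n^{-s^*} \to 0$ as $n \to \infty$, so that $|f(n)n^{-s^*}| = O(1)$, and so $|f(n)| = O(n^{\sigma^*})$. It follows that for every $s \in \mathbb{C}$ with $\sigma > \sigma^* + 1$, we have
\[ |f(n)n^{-s}| = |f(n)|n^{-\sigma} = O(n^{\sigma^* - \sigma}), \]
so that the series (1) converges by the Comparison test. Now let
and
\[ \sigma_2 = \inf\{u \in \mathbb{R} : \text{the series (1) converges absolutely for all } s \in \mathbb{C} \text{ with } \sigma > u\}. \]
Clearly (i) and (iii) follow, and $\sigma_0 \le \sigma_2$. To prove (ii), let $\delta > 0$ and $\epsilon > 0$ be chosen. Then there exists $N \in \mathbb{N}$ such that
\[ \sum_{n=N+1}^\infty |f(n)| n^{-\sigma_2 - \delta} < \epsilon. \]
Hence
\[ \sup\left\{ \left| \sum_{n=1}^N f(n)n^{-s} - \sum_{n=1}^\infty f(n)n^{-s} \right| : \sigma \ge \sigma_2 + \delta \right\} \le \sum_{n=N+1}^\infty |f(n)| n^{-\sigma_2 - \delta} < \epsilon. \]
It follows that the series (1) converges uniformly on the set $\{s \in \mathbb{C} : \sigma \ge \sigma_2 + \delta\}$. Now let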
A simple consequence of uniform convergence is the following result concerning differentiation term by term.
THEOREM 3.2. For every $s \in \mathbb{C}$ with $\sigma > \sigma_1$, the series (1) may be differentiated term by term. In particular, $F'(s)$ exists and
\[ F'(s) = -\sum_{n=1}^\infty f(n) (\log n) n^{-s}. \]
Our next task is to prove the uniqueness theorem of Dirichlet series, a result of great importance in view
of the applications we have in mind.
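where $f : \mathbb{N} \to \mathbb{C}$ and $g : \mathbb{N} \to \mathbb{C}$ are arithmetic functions and $s \in \mathbb{C}$. Suppose further that there exists $\sigma_3 \in \mathbb{R}$ such that for every $s \in \mathbb{C}$ satisfying $\sigma \ge \sigma_3$, we have $F(s) = G(s)$. Then $f(n) = g(n)$ for every $n \in \mathbb{N}$.
where $f : \mathbb{N} \to \mathbb{C}$ is an arithmetic function and $s \in \mathbb{C}$. Suppose further that there exists $\sigma_3 \in \mathbb{R}$ such that for every $s \in \mathbb{C}$ satisfying $\sigma \ge \sigma_3$, we have $F(s) = 0$. Then $f(n) = 0$ for every $n \in \mathbb{N}$.
Proof. Since the series converges for $s = \sigma_3$, we must have $|f(n)| = O(n^{\sigma_3})$ for all $n \in \mathbb{N}$. Now let $\sigma \ge \sigma_3 + 2$. Then
\[ \sum_{n=N}^\infty f(n) n^{-\sigma} = O\left( \sum_{n=N}^\infty n^{\sigma_3 - \sigma} \right). \tag{2} \]
as $\sigma \to +\infty$. Hence $f(1) = 0$. Suppose now that $f(1) = f(2) = \dots = f(M-1) = 0$. Using (4) with $N = M + 1$, we obtain, for $\sigma \ge \sigma_3 + 2$,
\[ 0 = F(\sigma) = f(M)M^{-\sigma} + \sum_{n=M+1}^\infty f(n)n^{-\sigma} = f(M)M^{-\sigma} + O((M+1)^{\sigma_3 + 1 - \sigma}), \]
so that
\[ 0 = f(M) + O\left( (M+1)^{\sigma_3 + 1} \left( \frac{M}{M+1} \right)^\sigma \right) \to f(M) \]
as $\sigma \to +\infty$. Hence $f(M) = 0$. The result now follows from induction.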
Dirichlet series are extremely useful in tackling problems in number theory as well as in other branches
of mathematics. The main properties that underpin most of these applications are the multiplicative
aspects of these series.
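where $f_j : \mathbb{N} \to \mathbb{C}$ is an arithmetic function and $s \in \mathbb{C}$. Suppose further that for every $n \in \mathbb{N}$, we have
\[ f_3(n) = \sum_{x \mid n} f_1(x) f_2\left( \frac{n}{x} \right) = \sum_{y \mid n} f_1\left( \frac{n}{y} \right) f_2(y). \]
Then
\[ F_1(s) F_2(s) = F_3(s), \]
provided that $\sigma > \max\{\sigma_2^{(1)}, \sigma_2^{(2)}\}$, where, for every $j = 1, 2$, the series $F_j(s)$ converges absolutely for every $s \in \mathbb{C}$ with $\sigma > \sigma_2^{(j)}$.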
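Proof. We have
\[ \sum_{n=1}^N f_3(n)n^{-s} = \sum_{\substack{1 \le x \le N \\ 1 \le y \le N \\ xy \le N}} f_1(x)x^{-s} f_2(y)y^{-s}, \]
so that
\begin{align*}
&\sum_{n=1}^N f_3(n)n^{-s} - \sum_{x \le \sqrt{N}} f_1(x)x^{-s} \sum_{y \le \sqrt{N}} f_2(y)y^{-s} \\
&\qquad = \sum_{\sqrt{N} < x \le N} f_1(x)x^{-s} \sum_{y \le N/x} f_2(y)y^{-s} + \sum_{x \le \sqrt{N}} f_1(x)x^{-s} \sum_{\sqrt{N} < y \le N/x} f_2(y)y^{-s}.
\end{align*}
It follows that
\begin{align*}
&\left| \sum_{n=1}^N f_3(n)n^{-s} - \sum_{x \le \sqrt{N}} f_1(x)x^{-s} \sum_{y \le \sqrt{N}} f_2(y)y^{-s} \right| \\
&\qquad \le \sum_{x > \sqrt{N}} |f_1(x)|x^{-\sigma} \sum_{y=1}^\infty |f_2(y)|y^{-\sigma} + \sum_{x=1}^\infty |f_1(x)|x^{-\sigma} \sum_{y > \sqrt{N}} |f_2(y)|y^{-\sigma}. \tag{5}
\end{align*}
Suppose now that $\sigma > \max\{\sigma_2^{(1)}, \sigma_2^{(2)}\}$. Clearly
\[ \sum_{x > \sqrt{N}} |f_1(x)|x^{-\sigma} \quad\text{and}\quad \sum_{y > \sqrt{N}} |f_2(y)|y^{-\sigma} \]
are convergent. It follows that the right hand side of (5) converges to 0 as $N \to \infty$. On the other hand,
\[ \sum_{x \le \sqrt{N}} f_1(x)x^{-s} \quad\text{and}\quad \sum_{y \le \sqrt{N}} f_2(y)y^{-s} \]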
Remark. Theorem 3.5 generalizes to a product of $k$ Dirichlet series $F_1(s), \dots, F_k(s)$, where the general coefficient is
\[ \sum_{\substack{x_1, \dots, x_k \\ x_1 \dots x_k = n}} f_1(x_1) \dots f_k(x_k). \]
In many applications, the coefficients $f(n)$ of the Dirichlet series will be given by various important arithmetic functions in number theory. We therefore study next some consequences when the function $f : \mathbb{N} \to \mathbb{C}$ is multiplicative.
THEOREM 3.6. Suppose that the function $f : \mathbb{N} \to \mathbb{C}$ is multiplicative. Then for every $s \in \mathbb{C}$ satisfying $\sigma > \sigma_2$, the series (1) satisfies
\[ F(s) = \prod_p \sum_{h=0}^\infty f(p^h) p^{-hs}. \]
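By the uniqueness of factorization, the inner sum on the right hand side contains at most one term. Hence
\[ \prod_{j=1}^k \sum_{h=0}^\infty f(p_j^h) p_j^{-hs} = \sum_{n=1}^\infty \theta_k(n) f(n) n^{-s}, \]
where
\[ \theta_k(n) = \begin{cases} 1 & \text{if all the prime factors of } n \text{ are among } p_1, \dots, p_k, \\ 0 & \text{otherwise}. \end{cases} \]
It follows that as $k \to \infty$, we have
\[ \prod_{j=1}^k \sum_{h=0}^\infty f(p_j^h) p_j^{-hs} - \sum_{n=1}^\infty f(n)n^{-s} = \sum_{n=1}^\infty (\theta_k(n) - 1) f(n) n^{-s} = O\left( \sum_{n=k+1}^\infty |f(n)| n^{-\sigma} \right) \to 0. \]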
THEOREM 3.7. Suppose that the function $f : \mathbb{N} \to \mathbb{C}$ is totally multiplicative. Then for every $s \in \mathbb{C}$ satisfying $\sigma > \sigma_2$, the series (1) satisfies
\[ F(s) = \prod_p (1 - f(p)p^{-s})^{-1}. \]
Furthermore, if $f$ is not identically zero, then it is easy to see that $f(1) = 1$, so that the series (6) is now a convergent geometric series with sum $(1 - f(p)p^{-s})^{-1}$.
This is called the Euler product of the Riemann zeta function $\zeta(s)$.
Chapter 4
DISTRIBUTION OF PRIMES
II : ARITHMETIC PROGRESSIONS
The purpose of this chapter is to prove the following remarkable result of Dirichlet, widely regarded as
one of the greatest achievements in mathematics.
THEOREM 4.1. Suppose that $q \in \mathbb{N}$ and $a \in \mathbb{Z}$ satisfy $(a, q) = 1$. Then there are infinitely many primes $p \equiv a \pmod{q}$.
Note that the requirement $(a, q) = 1$ is crucial. If $n \equiv a \pmod{q}$, then clearly $(a, q) \mid n$. It follows that if $(a, q) > 1$, then the residue class $n \equiv a \pmod{q}$ of natural numbers contains at most one prime. In other words, Dirichlet's theorem asserts that any residue class $n \equiv a \pmod{q}$ of natural numbers must contain infinitely many primes if there is no simple reason to support the contrary.
It is easy to prove Theorem 4.1 by elementary methods for some special values of $a$ and $q$.
Example. There are infinitely many primes $p \equiv -1 \pmod{4}$. Suppose on the contrary that $p_1, \dots, p_r$ represent all such primes. Then $4p_1 \dots p_r - 1$ must have a prime factor $p \equiv -1 \pmod{4}$. But $p$ cannot be any of $p_1, \dots, p_r$.
Example. There are infinitely many primes $p \equiv 1 \pmod{4}$. Suppose on the contrary that $p_1, \dots, p_r$ represent all such primes. Consider the number $4(p_1 \dots p_r)^2 + 1$. Suppose that a prime $p$ divides $4(p_1 \dots p_r)^2 + 1$. Then $4(p_1 \dots p_r)^2 + 1 \equiv 0 \pmod{p}$. It follows that $-1$ is a quadratic residue modulo $p$, so that we must have $p \equiv 1 \pmod{4}$. Clearly $p$ cannot be any of $p_1, \dots, p_r$.
This chapter was first used in lectures given by the author at Imperial College, University of London, in 1990.
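\[ \sum_{p \equiv a \pmod{q}} \frac{\log p}{p^\sigma} \to +\infty \quad\text{as } \sigma \to 1+. \]
Let us illustrate the idea of Dirichlet by studying the case $n \equiv 1 \pmod{4}$.
First of all, we need a function that distinguishes between integers $n \equiv 1 \pmod{4}$ and the others. Suppose that $n$ is odd. Then it is easy to check that
\[ \frac{1 + (-1)^{\frac{n-1}{2}}}{2} = \begin{cases} 1 & \text{if } n \equiv 1 \pmod{4}, \\ 0 & \text{if } n \equiv -1 \pmod{4}; \end{cases} \]
so that
\[ \sum_{p \equiv 1 \pmod{4}} \frac{\log p}{p^\sigma} = \frac{1}{2} \sum_{p \text{ odd}} \frac{\log p}{p^\sigma} \left( 1 + (-1)^{\frac{p-1}{2}} \right). \]
converges as $\sigma \to 1+$.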
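The next idea is to show that if we consider the series
\[ \sum_{\substack{n=1 \\ n \text{ odd}}}^\infty \frac{(-1)^{\frac{n-1}{2}} \Lambda(n)}{n^\sigma} \tag{1} \]
instead, then the contribution from the terms corresponding to non-prime odd natural numbers $n$ is convergent. It therefore suffices to show that the series (1) converges as $\sigma \to 1+$.
Note now that the function
\[ \chi(n) = \begin{cases} 0 & \text{if } n \text{ is even}, \\ (-1)^{\frac{n-1}{2}} & \text{if } n \text{ is odd}, \end{cases} \]
\[ L(\sigma) = \sum_{n=1}^\infty \frac{\chi(n)}{n^\sigma}, \tag{2} \]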
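It follows from Theorems 3.2 and 3.5 that for $\sigma > 1$, we have
\[ L'(\sigma) = -\sum_{n=1}^\infty \frac{\chi(n) \log n}{n^\sigma} = -\left( \sum_{n=1}^\infty \frac{\chi(n)\Lambda(n)}{n^\sigma} \right) \left( \sum_{n=1}^\infty \frac{\chi(n)}{n^\sigma} \right). \]
Hence
\[ \sum_{\substack{n=1 \\ n \text{ odd}}}^\infty \frac{(-1)^{\frac{n-1}{2}} \Lambda(n)}{n^\sigma} = \sum_{n=1}^\infty \frac{\chi(n)\Lambda(n)}{n^\sigma} = -\frac{L'(\sigma)}{L(\sigma)}. \]
\[ L(\sigma) \to L(1) = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \dots > 0 \]
and
\[ L'(\sigma) \to \frac{\log 3}{3} - \frac{\log 5}{5} + \frac{\log 7}{7} - \dots \]
which converges by the Alternating series test. We therefore expect the series (1) to converge to a finite limit.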
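The special function used in this illustration is simple enough to experiment with. The sketch below assumes that it takes the value $(-1)^{(n-1)/2}$ at odd $n$ and $0$ at even $n$, as suggested by the series above; the names `chi`, `indicator_1_mod_4` and `L_partial` are hypothetical. The comparison with $\pi/4$ uses the classical value of the Leibniz series $1 - 1/3 + 1/5 - \dots$, which is not stated in the notes.

```python
import math


def chi(n):
    """0 for even n, (-1)^((n-1)/2) for odd n (assumed form of the function)."""
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1


def indicator_1_mod_4(n):
    """(1 + (-1)^((n-1)/2)) / 2 for odd n: 1 if n = 1 (mod 4), 0 if n = -1 (mod 4)."""
    return (1 + chi(n)) // 2


def L_partial(sigma, N):
    """Partial sum of L(sigma) = sum of chi(n) / n^sigma over n <= N."""
    return sum(chi(n) / n ** sigma for n in range(1, N + 1))


print([indicator_1_mod_4(n) for n in range(1, 20, 2)])
print(L_partial(1.0, 10 ** 5), math.pi / 4)
```

The partial sums at $\sigma = 1$ stay positive and settle near $0.785$, in line with $L(1) > 0$.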
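Dirichlet's most crucial discovery is that for every $q \in \mathbb{N}$, there is a family of $\phi(q)$ functions $\chi : \mathbb{N} \to \mathbb{C}$, known nowadays as the Dirichlet characters modulo $q$, which generalize the function $\chi$ in the special case and which satisfy
\[ \frac{1}{\phi(q)} \sum_{\chi \bmod q} \frac{\chi(n)}{\chi(a)} = \begin{cases} 1 & \text{if } n \equiv a \pmod{q}, \\ 0 & \text{if } n \not\equiv a \pmod{q}. \end{cases} \]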
To understand Dirichlet's ideas, we shall first of all study group characters. This is slightly more general than is necessary, but easier to understand.
Let $G$ be a finite abelian group of order $h$ and with identity element $e$. A character on $G$ is a non-zero complex-valued function $\chi$ on $G$ for which $\chi(uv) = \chi(u)\chi(v)$ for every $u, v \in G$. It is easy to check the following simple results.
Remark. We have
(i) $\chi(e) = 1$;
(ii) for every $u \in G$, $\chi(u)$ is an $h$-th root of unity;
(iii) the number $c$ of characters is finite; and
(iv) the characters form an abelian group.
Remark. If $u \in G$ and $u \ne e$, then there exists a character $\chi$ on $G$ such that $\chi(u) \ne 1$. To see this, note that $G$ can be expressed as a direct product of cyclic groups $G_1, \dots, G_s$ of orders $h_1, \dots, h_s$ respectively, where $h = h_1 \dots h_s$. Suppose that for each $j = 1, \dots, s$, the cyclic group $G_j$ is generated by $v_j$. Then we can write $u = v_1^{y_1} \dots v_s^{y_s}$, where $y_j \pmod{h_j}$ is uniquely determined for every $j = 1, \dots, s$. Since $u \ne e$, there exists $k = 1, \dots, s$ such that $y_k \not\equiv 0 \pmod{h_k}$. Let $\chi(v_k) = e(1/h_k)$, and let $\chi(v_j) = 1$ for every $j = 1, \dots, s$ such that $j \ne k$. Clearly $\chi(u) = e(y_k/h_k) \ne 1$.
We shall denote by $\chi_0$ the principal character on $G$. In other words, $\chi_0(u) = 1$ for every $u \in G$. Also, $\sum_\chi$ denotes a summation over all the distinct characters on $G$.
THEOREM 4.2. Suppose that $G$ is a finite abelian group of order $h$ and with identity element $e$. Suppose further that $\chi_0$ is the principal character on $G$.
(i) For every character $\chi$ on $G$, we have
\[ \sum_{u \in G} \chi(u) = \begin{cases} h & \text{if } \chi = \chi_0, \\ 0 & \text{if } \chi \ne \chi_0. \end{cases} \]
Proof. (i) If $\chi = \chi_0$, then the result is obvious. If $\chi \ne \chi_0$, then there exists $v \in G$ such that $\chi(v) \ne 1$, and so
\[ \chi(v) \sum_{u \in G} \chi(u) = \sum_{u \in G} \chi(u)\chi(v) = \sum_{u \in G} \chi(uv) = \sum_{u \in G} \chi(u), \]
the last equality following from the fact that $uv$ runs over all the elements of $G$ as $u$ runs over all the elements of $G$. Hence
\[ (1 - \chi(v)) \sum_{u \in G} \chi(u) = 0. \]
the last equality following from noting that the characters on $G$ form an abelian group so that $\chi\chi_1$ runs through all the characters on $G$ as $\chi$ runs through all the characters on $G$. Hence
\[ (1 - \chi_1(u)) \sum_\chi \chi(u) = 0. \]
We are now in a position to introduce Dirichlet characters. Let $q \in \mathbb{N}$ be given. Then there are exactly $\phi(q)$ residue classes $n \equiv a \pmod{q}$ satisfying $(a, q) = 1$. Under multiplication of residue classes, they form an abelian group of order $\phi(q)$. Suppose that these residue classes are represented by $a_1, \dots, a_{\phi(q)}$ modulo $q$. Let $G = \{a_1, \dots, a_{\phi(q)}\}$. We can now define a character $\chi$ on the group $G$ as described earlier, interpreting the group elements as residue classes. Furthermore, we can extend the definition to cover the remaining residue classes. Precisely, for every $n \in \mathbb{N}$, let
\[ \chi(n) = \begin{cases} \chi(a_j) & \text{if } n \equiv a_j \pmod{q} \text{ for some } j = 1, \dots, \phi(q), \\ 0 & \text{if } (n, q) > 1. \end{cases} \tag{3} \]
A function $\chi : \mathbb{N} \to \mathbb{C}$ of the form (3) is called a Dirichlet character modulo $q$. Note that $\chi$ is totally multiplicative. Also, clearly there are exactly $\phi(q)$ Dirichlet characters modulo $q$. Furthermore, the principal Dirichlet character $\chi_0$ modulo $q$ is defined by
\[ \chi_0(n) = \begin{cases} 1 & \text{if } (n, q) = 1, \\ 0 & \text{if } (n, q) > 1. \end{cases} \]
The following theorem follows immediately from these observations and Theorem 4.2.
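THEOREM 4.3. Suppose that $q \in \mathbb{N}$. Suppose further that $\chi_0$ is the principal Dirichlet character modulo $q$.
(i) For every Dirichlet character $\chi$ modulo $q$, we have
\[ \sum_{n \pmod{q}} \chi(n) = \begin{cases} \phi(q) & \text{if } \chi = \chi_0, \\ 0 & \text{if } \chi \ne \chi_0. \end{cases} \]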
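For a prime modulus the Dirichlet characters can be written down explicitly, which makes the relation in (i) easy to test. The sketch below is restricted to prime $q$, where the group of reduced residue classes is cyclic; this restriction, the use of a generator found by brute force, and the names `primitive_root` and `characters_mod_prime` are assumptions made for illustration and are not taken from the notes.

```python
import cmath


def primitive_root(q):
    """Smallest generator of the multiplicative group modulo a prime q (q is assumed prime)."""
    for g in range(1, q):
        if len({pow(g, j, q) for j in range(q - 1)}) == q - 1:
            return g
    raise ValueError("no generator found")


def characters_mod_prime(q):
    """Return the q - 1 Dirichlet characters modulo a prime q as Python functions.

    The character with index k sends g^j to e(jk / (q - 1)), where e(x) = exp(2 pi i x),
    and sends multiples of q to 0. Index k = 0 gives the principal character.
    """
    g = primitive_root(q)
    index = {pow(g, j, q): j for j in range(q - 1)}  # exponents j with g^j = n (mod q)

    def make(k):
        def chi(n):
            n %= q
            if n == 0:
                return 0
            return cmath.exp(2j * cmath.pi * k * index[n] / (q - 1))
        return chi

    return [make(k) for k in range(q - 1)]


q = 7
for k, chi in enumerate(characters_mod_prime(q)):
    total = sum(chi(n) for n in range(q))
    # Expect q - 1 = phi(q) for the principal character (k = 0) and 0 otherwise.
    print(k, round(abs(total), 9))
```

The same list of characters can be used to check the averaging relation $\frac{1}{\phi(q)} \sum_\chi \chi(n)/\chi(a)$ for a fixed $a$ coprime to $q$.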
Our next task is to introduce the functions analogous to the function (2) earlier. Let $s = \sigma + it \in \mathbb{C}$, where $\sigma, t \in \mathbb{R}$. For $\sigma > 1$, let
\[ \zeta(s) = \sum_{n=1}^\infty n^{-s}; \tag{4} \]
The functions (4) and (5) are called the Riemann zeta function and Dirichlet $L$-functions respectively. Note that the series are Dirichlet series and converge absolutely for $\sigma > 1$ and uniformly for $\sigma > 1 + \delta$ for any $\delta > 0$. Furthermore, the coefficients are totally multiplicative. It follows from Theorem 3.7 that for $\sigma > 1$, the series (4) and (5) have the Euler product representations
\[ \zeta(s) = \prod_p (1 - p^{-s})^{-1} \quad\text{and}\quad L(s, \chi) = \prod_p (1 - \chi(p)p^{-s})^{-1} \]
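The Euler product for the zeta function can be compared with the defining series at a sample point. The sketch below takes $s = 2$ and truncates both the series and the product at $10^4$; the comparison with $\pi^2/6$ relies on the classical value of $\sum 1/n^2$, already used in the proof concerning $\sum \sigma(n)$. The function names and truncation points are illustrative assumptions.

```python
import math


def primes_up_to(P):
    """List of primes p <= P, by the sieve of Eratosthenes (P >= 2)."""
    sieve = bytearray([1]) * (P + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(P) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, P + 1, p)))
    return [p for p in range(2, P + 1) if sieve[p]]


def zeta_partial_sum(s, N):
    """Sum of n^(-s) over 1 <= n <= N."""
    return sum(n ** (-s) for n in range(1, N + 1))


def zeta_partial_product(s, P):
    """Product of (1 - p^(-s))^(-1) over primes p <= P."""
    product = 1.0
    for p in primes_up_to(P):
        product /= 1 - p ** (-s)
    return product


print(zeta_partial_sum(2, 10 ** 4), zeta_partial_product(2, 10 ** 4), math.pi ** 2 / 6)
```

All three numbers should agree to about three decimal places.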
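THEOREM 4.4. Suppose that $\sigma > 1$. Then $\zeta(s) \ne 0$. Furthermore, $L(s, \chi) \ne 0$ for every Dirichlet character $\chi$ modulo $q$.
\[ |\zeta(s)| = \prod_p \left| (1 - p^{-s})^{-1} \right| \ge \prod_p (1 + p^{-\sigma})^{-1} = \prod_p \frac{1 - p^{-\sigma}}{1 - p^{-2\sigma}} = \frac{\zeta(2\sigma)}{\zeta(\sigma)} > 0 \]
and
\[ |L(s, \chi)| = \prod_p \left| (1 - \chi(p)p^{-s})^{-1} \right| \ge \prod_{p \nmid q} (1 + p^{-\sigma})^{-1} \ge \prod_p (1 + p^{-\sigma})^{-1} > 0. \]
THEOREM 4.5. Suppose that $\chi_0$ is the principal Dirichlet character modulo $q$. Then for $\sigma > 1$, we have
\[ L(s, \chi_0) = \zeta(s) \prod_{p \mid q} (1 - p^{-s}). \]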
that
\[ -\zeta'(s) = \left( \sum_{n=1}^\infty \Lambda(n)n^{-s} \right) \left( \sum_{n=1}^\infty n^{-s} \right). \]
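The first assertion follows. On the other hand, it also follows from Theorem 3.2 that
\[ L'(s, \chi) = -\sum_{n=1}^\infty \chi(n)(\log n)n^{-s}. \]
that
\[ -L'(s, \chi) = \left( \sum_{n=1}^\infty \chi(n)\Lambda(n)n^{-s} \right) \left( \sum_{n=1}^\infty \chi(n)n^{-s} \right). \]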
THEOREM 4.7. If $\sigma > 1$, then for every Dirichlet character $\chi$ modulo $q$, we have
\[ \log L(s, \chi) = \sum_p \sum_{m=1}^\infty m^{-1} \chi(p^m) p^{-ms}. \]
so that
\[ \log L(s, \chi) = -\sum_p \log(1 - \chi(p)p^{-s}). \]
The justification for (6) is that the series on the right hand side converges uniformly for $\sigma > 1 + \delta$, as can be deduced from the Weierstrass $M$-test on noting that
Our next task is to extend the definition of $\zeta(s)$ and $L(s, \chi)$ to the half plane $\sigma > 0$. This is achieved
by analytic continuation.
An example of analytic continuation is the following: Consider the geometric series
\[ f(s) = \sum_{n=0}^\infty s^n. \]
This series converges absolutely in the set $\{s \in \mathbb{C} : |s| < 1\}$ and uniformly in the set $\{s \in \mathbb{C} : |s| < 1 - \delta\}$
for any $\delta > 0$ to the sum $1/(1 - s)$. Now let
\[ g(s) = \frac{1}{1 - s} \]
in $\mathbb{C}$. Then $g$ is analytic in the set $\mathbb{C} \setminus \{1\}$, $g(s) = f(s)$ in the set $\{s \in \mathbb{C} : |s| < 1\}$, and $g$ has a pole at
$s = 1$. So $g$ can be viewed as an analytic continuation of $f$ to $\mathbb{C}$ with a pole at $s = 1$.
We shall prove the following results on analytic continuation of $\zeta(s)$ and $L(s, \chi)$.
THEOREM 4.8. The function $\zeta(s)$ admits an analytic continuation to the half plane $\sigma > 0$.
Furthermore, $\zeta(s)$ is analytic for $\sigma > 0$ except for a simple pole at $s = 1$ with residue 1.
THEOREM 4.9. Suppose that $q \in \mathbb{N}$ and $\chi_0$ is the principal Dirichlet character modulo $q$. Then
the function $L(s, \chi_0)$ admits an analytic continuation to the half plane $\sigma > 0$. Furthermore, $L(s, \chi_0)$ is
analytic for $\sigma > 0$ except for a simple pole at $s = 1$ with residue $\phi(q)/q$.
THEOREM 4.10. Suppose that $q \in \mathbb{N}$ and $\chi$ is a non-principal Dirichlet character modulo $q$. Then
the function $L(s, \chi)$ admits an analytic continuation to the half plane $\sigma > 0$. Furthermore, $L(s, \chi)$ is
analytic for $\sigma > 0$.
The proofs of these three theorems depend on the following two simple technical results. The first
of these is basically a result on partial summation.
THEOREM 4.11. Suppose that $a(n) = O(1)$ for every $n \in \mathbb{N}$. For every $x > 0$, write
\[ S(x) = \sum_{n \le x} a(n). \]
Then
\[ \sum_{n \le X} a(n) n^{-s} = S(X) X^{-s} + s \int_1^X S(x) x^{-s-1} \, dx. \tag{7} \]
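\[ \sum_{n \le X} a(n) n^{-s} - S(X) X^{-s} = \sum_{n \le X} a(n) (n^{-s} - X^{-s}) = \sum_{n \le X} a(n) \int_n^X s x^{-s-1} \, dx \]
\[ = s \int_1^X \left( \sum_{n \le x} a(n) \right) x^{-s-1} \, dx = s \int_1^X S(x) x^{-s-1} \, dx. \]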
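The partial summation identity (7) can be tested on a computer. Since $S(x)$ is a step function, the integral can be evaluated exactly interval by interval. The sketch below assumes an integer $X \ge 1$ and $s \ne 0$; the bounded test sequence (the non-principal character modulo 4) and the sample point $s = 1/2 + 3i$ are illustrative choices.

```python
def partial_summation_sides(a, s, X):
    """Return both sides of (7) for an integer X >= 1, coefficients a(n) and complex s != 0."""
    lhs = sum(a(n) * n ** (-s) for n in range(1, X + 1))
    # S(x) is constant, equal to S(k), on each interval [k, k+1), so the integral of
    # S(x) x^{-s-1} over [k, k+1] equals S(k) * (k^{-s} - (k+1)^{-s}) / s.
    S = 0
    integral = 0
    for k in range(1, X):
        S += a(k)
        integral += S * (k ** (-s) - (k + 1) ** (-s)) / s
    S += a(X)  # now S = S(X)
    rhs = S * X ** (-s) + s * integral
    return lhs, rhs

def chi_mod_4(n):
    """The non-principal character modulo 4, used here only as a bounded test sequence."""
    return (0, 1, 0, -1)[n % 4]

lhs, rhs = partial_summation_sides(chi_mod_4, 0.5 + 3j, 200)
```

The two sides should agree up to rounding error, for any bounded sequence and any admissible $s$.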
The second technical result, standard in complex function theory, will be stated without proof.
THEOREM 4.12. Suppose that the path $\Gamma$ is defined by $w(t) = u(t) + iv(t)$, where $u(t), v(t) \in \mathbb{R}$ for
every $t \in [0, 1]$. Suppose further that $u'(t)$ and $v'(t)$ are continuous on $[0, 1]$. Let $D$ be a domain in $\mathbb{C}$.
For every $s \in D$, let
\[ F(s) = \int_\Gamma f(s, w) \, dw, \]
where
(i) $f(s, w)$ is continuous for every $s \in D$ and every $w \in \Gamma$; and
(ii) for every $w \in \Gamma$, the function $f(s, w)$ is analytic in $D$.
Then $F(s)$ is analytic in $D$.
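Proof of Theorem 4.8. Let $F(s) = \zeta(s)$. In the notation of Theorem 4.11, we have $a(n) = 1$ for
every $n \in \mathbb{N}$, so that $S(x) = [x]$ for every $x > 0$. It follows from (8) that
\[ \zeta(s) = s \int_1^\infty [x] x^{-s-1} \, dx = s \int_1^\infty x^{-s} \, dx - s \int_1^\infty \{x\} x^{-s-1} \, dx = 1 + \frac{1}{s - 1} - s \int_1^\infty \{x\} x^{-s-1} \, dx. \]
We shall show that the last term on the right hand side represents an analytic function for $\sigma > 0$. We
can write
\[ \int_1^\infty \{x\} x^{-s-1} \, dx = \sum_{n=1}^\infty F_n(s), \]
where for every $n \in \mathbb{N}$,
\[ F_n(s) = \int_n^{n+1} \{x\} x^{-s-1} \, dx. \]
It remains to show that (i) for every $n \in \mathbb{N}$, the function $F_n(s)$ is analytic in $\mathbb{C}$; and (ii) for every $\delta > 0$,
the series $\sum_{n=1}^\infty F_n(s)$ converges uniformly for $\sigma > \delta$. To show (i), note that by a change of variable,
\[ F_n(s) = \int_0^1 t (n + t)^{-s-1} \, dt = \int_0^1 t e^{-(s+1) \log(n+t)} \, dt, \]
and (i) follows from Theorem 4.12. To show (ii), note that for $\sigma > \delta$, we have
\[ |F_n(s)| = \left| \int_n^{n+1} \{x\} x^{-s-1} \, dx \right| \le n^{-\sigma-1} < n^{-\delta-1}, \]
and (ii) follows from the Weierstrass M-test, since the series $\sum_{n=1}^\infty n^{-\delta-1}$ is convergent.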
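The representation $\zeta(s) = 1 + 1/(s-1) - s \sum_n F_n(s)$ lends itself to direct computation, because on $[n, n+1)$ the integrand $\{x\} x^{-s-1}$ equals $x^{-s} - n x^{-s-1}$ and integrates in closed form. The following sketch truncates the sum at an arbitrary cut-off; the function names and the cut-off are assumptions of this example.

```python
import math

def F(n, s):
    """F_n(s): integral over [n, n+1] of {x} x^{-s-1} dx, in closed form (s != 0, 1)."""
    # On [n, n+1) we have {x} = x - n, so the integrand is x^{-s} - n x^{-s-1}.
    first = ((n + 1) ** (1 - s) - n ** (1 - s)) / (1 - s)
    second = n * ((n + 1) ** (-s) - n ** (-s)) / s
    return first + second

def zeta_continued(s, terms=2000):
    """Approximate 1 + 1/(s-1) - s * (sum of F_n(s) over n <= terms)."""
    return 1 + 1 / (s - 1) - s * sum(F(n, s) for n in range(1, terms + 1))

value_at_2 = zeta_continued(2)
```

At $s = 2$ the result should be close to $\pi^2/6$. The same function can be evaluated for $0 < \sigma < 1$, where the original series diverges, although the truncated sum then converges slowly.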
Proof of Theorem 4.9. Suppose that $\sigma > 1$. Recall Theorem 4.5, that
\[ L(s, \chi_0) = \zeta(s) \prod_{p \mid q} \left( 1 - \frac{1}{p^s} \right). \]
Clearly the right hand side is analytic for $\sigma > 0$ except for a simple pole at $s = 1$. Furthermore, at
$s = 1$, the function $\zeta(s)$ has a simple pole with residue 1, while
\[ \prod_{p \mid q} \left( 1 - \frac{1}{p} \right) = \frac{\phi(q)}{q}. \]
We now attempt to prove Theorem 4.1. The following theorem will enable us to consider the analogue
of (1).
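Proof. Note first of all that the sum on the left hand side does not exceed the first term on the right
hand side. On the other hand, we have
\[ \sum_{n \equiv a \pmod q} \frac{\Lambda(n)}{n^\sigma} - \sum_{p \equiv a \pmod q} \frac{\log p}{p^\sigma} \le \sum_p \log p \sum_{m=2}^\infty \frac{1}{p^{m\sigma}} \le \sum_p \log p \sum_{m=2}^\infty \frac{1}{p^m} = \sum_p \frac{\log p}{p(p - 1)} \le \sum_{n=2}^\infty \frac{\log n}{n(n - 1)} = O(1). \]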
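\[ {} = \frac{1}{\phi(q)} \sum_{\chi \pmod q} \frac{1}{\chi(a)} \sum_{n=1}^\infty \chi(n) \Lambda(n) n^{-\sigma} + O(1) = -\frac{1}{\phi(q)} \sum_{\chi \pmod q} \frac{1}{\chi(a)} \frac{L'(\sigma, \chi)}{L(\sigma, \chi)} + O(1). \tag{9} \]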
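\[ \sum_{p \equiv a \pmod q} \frac{\log p}{p^\sigma} = -\frac{1}{\phi(q)} \frac{L'(\sigma, \chi_0)}{L(\sigma, \chi_0)} + O(1) = \frac{1}{\phi(q)} \frac{1}{\sigma - 1} + O(1), \]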
since the function $L'(s, \chi_0)/L(s, \chi_0)$ has a simple pole at $s = 1$ with residue $-1$ by Theorem 4.9. To
complete the proof of Dirichlet's theorem, it remains to prove (10). Clearly (10) will follow if we can
show that for every non-principal Dirichlet character $\chi \pmod q$, we have $L(1, \chi) \ne 0$. Here we need to
distinguish two cases, represented by the following two theorems.
THEOREM 4.14. Suppose that $q \in \mathbb{N}$ and $\chi$ is a non-real Dirichlet character modulo $q$. Then
$L(1, \chi) \ne 0$.
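\[ \sum_{\chi \pmod q} \log L(\sigma, \chi) = \sum_{\chi \pmod q} \sum_p \sum_{m=1}^\infty \chi(p^m) m^{-1} p^{-m\sigma} = \sum_p \sum_{m=1}^\infty \left( \sum_{\chi \pmod q} \chi(p^m) \right) m^{-1} p^{-m\sigma} = \phi(q) \sum_p \sum_{\substack{m=1 \\ p^m \equiv 1 \pmod q}}^\infty m^{-1} p^{-m\sigma} > 0, \]
\[ \sum_{\chi \pmod q} \sum_p \sum_{m=1}^\infty |\chi(p^m) m^{-1} p^{-m\sigma}| \]
\[ \prod_{\chi \pmod q} L(\sigma, \chi) > 1. \tag{11} \]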
Suppose that $\chi_1$ is a non-real Dirichlet character modulo $q$, and $L(1, \chi_1) = 0$. Then $\overline{\chi_1} \ne \chi_1$, and
$L(1, \overline{\chi_1}) = \overline{L(1, \chi_1)} = 0$ also. It follows that these two zeros more than cancel the simple pole of $L(\sigma, \chi_0)$
at $\sigma = 1$, so that the product on the left hand side of (11) has a zero at $\sigma = 1$. This gives a contradiction.
THEOREM 4.15. Suppose that $q \in \mathbb{N}$ and $\chi$ is a real, non-principal Dirichlet character modulo $q$.
Then $L(1, \chi) \ne 0$.
Proof. Suppose that the result is false, so that there exists a real Dirichlet character $\chi$ modulo $q$ such
that $L(1, \chi) = 0$. Then the function
\[ F(s) = \zeta(s) L(s, \chi) \]
is analytic for $\sigma > 0$. Note that for $\sigma > 1$, we have
\[ F(s) = \sum_{n=1}^\infty f(n) n^{-s}, \]
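Since $\chi$ is totally multiplicative, it suffices to prove (12) when $n = p^k$, where $p$ is a prime and $k \in \mathbb{N}$.
Indeed, since $\chi$ assumes only the values $\pm 1$ and 0, we have
\[ f(p^k) = 1 + \chi(p) + (\chi(p))^2 + \ldots + (\chi(p))^k = \begin{cases} 1 & \text{if } \chi(p) = 0, \\ k + 1 & \text{if } \chi(p) = 1, \\ 1 & \text{if } \chi(p) = -1 \text{ and } k \text{ is even}, \\ 0 & \text{if } \chi(p) = -1 \text{ and } k \text{ is odd}, \end{cases} \]
so that
\[ f(p^k) \ge g(p^k) = \begin{cases} 1 & \text{if } k \text{ is even}, \\ 0 & \text{if } k \text{ is odd}. \end{cases} \]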
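The case analysis above can be checked by brute force. The sketch below makes two assumptions that are not spelled out in the surviving text: it takes $f(n) = \sum_{m \mid n} \chi(m)$, which is what the formula for $f(p^k)$ suggests, and it extends $g$ multiplicatively from prime powers, so that $g(n)$ is 1 when $n$ is a perfect square and 0 otherwise. The character modulo 4 is an illustrative choice.

```python
import math

def chi(n):
    """The real non-principal Dirichlet character modulo 4 (an illustrative choice)."""
    return (0, 1, 0, -1)[n % 4]

def f(n):
    """Assumed definition: f(n) is the sum of chi(m) over the positive divisors m of n."""
    return sum(chi(m) for m in range(1, n + 1) if n % m == 0)

def g(n):
    """Assumed multiplicative extension: g(n) = 1 if n is a perfect square, 0 otherwise."""
    return 1 if math.isqrt(n) ** 2 == n else 0

violations = [n for n in range(1, 501) if f(n) < g(n)]
```

Under these assumptions the list `violations` should be empty, and the values at prime powers follow the four cases above: for instance $f(5) = 2$ since $\chi(5) = 1$, while $f(3) = 0$ and $f(9) = 1$ since $\chi(3) = -1$.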
Suppose now that $0 < r < 3/2$. Since $F(s)$ is analytic for $\sigma > 0$, we must have the Taylor expansion
\[ F(2 - r) = \sum_{\nu=0}^\infty \frac{F^{(\nu)}(2)}{\nu!} (-r)^\nu. \]
Now as $r \to 3/2-$, we must therefore have $F(2 - r) \to +\infty$. This contradicts our assertion that $F(s)$ is
analytic for $\sigma > 0$ and hence continuous at $s = 1/2$.
ELEMENTARY AND
ANALYTIC NUMBER THEORY
W W L CHEN
c W W L Chen, 1990.
Chapter 5
DISTRIBUTION OF PRIMES
III : THE PRIME NUMBER THEOREM
In this chapter, we give an analytic proof of the famous Prime number theorem, a result first obtained
in 1896 independently by Hadamard and de la Vallée Poussin.
\[ \pi(X) \sim \frac{X}{\log X}. \]
As in our earlier study of the distribution of primes, we use the von Mangoldt function $\Lambda$. For every
$X > 0$, let
\[ \psi(X) = \sum_{n \le X} \Lambda(n). \]
\[ \psi(X) \sim X \quad\text{if and only if}\quad \pi(X) \sim \frac{X}{\log X}. \]
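To get a feeling for the two asymptotic statements, one can tabulate $\pi(X)$ and $\psi(X)$ directly. The sketch below does this with a simple sieve; the function names and the sample value $X = 10^5$ are choices made for this illustration.

```python
import math

def prime_sieve(limit):
    """Return a bytearray with sieve[n] = 1 exactly when n is prime, for 0 <= n <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return sieve

def pi_and_psi(X):
    """Return (pi(X), psi(X)) for an integer X >= 2."""
    sieve = prime_sieve(X)
    prime_count = 0
    psi = 0.0
    for p in range(2, X + 1):
        if sieve[p]:
            prime_count += 1
            # Lambda(p^m) = log p for every prime power p^m <= X.
            power = p
            while power <= X:
                psi += math.log(p)
                power *= p
    return prime_count, psi

X = 100000
prime_count, psi = pi_and_psi(X)
ratio_pi = prime_count * math.log(X) / X
ratio_psi = psi / X
```

At this range `ratio_psi` is already very close to 1, whereas `ratio_pi` is still roughly 10% above 1, which illustrates why $\psi$ is the more convenient function to work with.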
This chapter was first used in lectures given by the author at Imperial College, University of London, in 1990.
To prove the Prime number theorem, it suffices to show that as $X \to \infty$, we have $\psi(X) \sim X$. However,
a direct discussion of $\psi(X)$ introduces various tricky convergence problems. We therefore consider a
smooth average of the function $\psi$. For $X > 0$, let
\[ \psi_1(X) = \int_0^X \psi(x) \, dx. \tag{4} \]
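Proof. Suppose that $0 < \alpha < 1 < \beta$. Since $\Lambda(n) \ge 0$ for every $n \in \mathbb{N}$, the function $\psi$ is an increasing
function. Hence for every $X > 0$, we have
\[ \psi(X) \le \frac{1}{\beta X - X} \int_X^{\beta X} \psi(x) \, dx = \frac{\psi_1(\beta X) - \psi_1(X)}{(\beta - 1) X}, \]
so that
\[ \frac{\psi(X)}{X} \le \frac{\psi_1(\beta X) - \psi_1(X)}{(\beta - 1) X^2}. \tag{5} \]
On the other hand, for every $X > 0$, we have
\[ \psi(X) \ge \frac{1}{X - \alpha X} \int_{\alpha X}^X \psi(x) \, dx = \frac{\psi_1(X) - \psi_1(\alpha X)}{(1 - \alpha) X}, \]
so that
\[ \frac{\psi(X)}{X} \ge \frac{\psi_1(X) - \psi_1(\alpha X)}{(1 - \alpha) X^2}. \tag{6} \]
As $X \to \infty$, we have
\[ \frac{\psi_1(\beta X) - \psi_1(X)}{(\beta - 1) X^2} \to \frac{1}{\beta - 1} \left( \frac{1}{2} \beta^2 - \frac{1}{2} \right) = \frac{1}{2} (\beta + 1) \tag{7} \]
and
\[ \frac{\psi_1(X) - \psi_1(\alpha X)}{(1 - \alpha) X^2} \to \frac{1}{1 - \alpha} \left( \frac{1}{2} - \frac{1}{2} \alpha^2 \right) = \frac{1}{2} (\alpha + 1). \tag{8} \]
Since $\alpha$ and $\beta$ are arbitrary, we conclude, on combining (5)–(8), that as $X \to \infty$, we have $\psi(X)/X \to 1$.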
The rest of this chapter is concerned with establishing the following crucial result.
\[ \psi_1(X) \sim \frac{1}{2} X^2. \]
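The asymptotic relation for $\psi_1$ can also be observed numerically. Interchanging the sum and the integral in the definition gives $\psi_1(X) = \sum_{n \le X} (X - n) \Lambda(n)$, which the following sketch evaluates for an integer $X$; the helper names and the sample value of $X$ are illustrative.

```python
import math

def von_mangoldt_table(limit):
    """Return a list Lambda with Lambda[n] equal to the von Mangoldt function at n."""
    Lambda = [0.0] * (limit + 1)
    is_composite = bytearray(limit + 1)
    for p in range(2, limit + 1):
        if not is_composite[p]:
            for multiple in range(p * p, limit + 1, p):
                is_composite[multiple] = 1
            # Lambda(p^m) = log p for every power of the prime p up to the limit.
            power = p
            while power <= limit:
                Lambda[power] = math.log(p)
                power *= p
    return Lambda

def psi_1(X):
    """psi_1(X) for an integer X, via the sum of (X - n) * Lambda(n) over n <= X."""
    Lambda = von_mangoldt_table(X)
    return sum((X - n) * Lambda[n] for n in range(2, X + 1))

X = 100000
ratio = psi_1(X) / (X * X / 2)
```

The ratio should be close to 1, and noticeably closer than the corresponding ratio for $\pi(X)$, reflecting the smoothing effect of the integral.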
The following theorem provides a link between $\psi_1(X)$ and the Riemann zeta function $\zeta(s)$.
A crucial step in the proof of Theorem 5.5 is provided by the following result concerning a particular
contour integral.
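Suppose first of all that $X \ge 1$. Let $A^-(c, T)$ denote the circular arc centred at $s = 0$ and passing
from $c - iT$ to $c + iT$ on the left of the line $\sigma = c$. Furthermore, let
\[ J_T^- = \frac{1}{2\pi i} \int_{A^-(c,T)} \frac{X^s}{s(s + 1)} \, ds. \]
\[ |J_T^-| \le \frac{1}{2\pi} \frac{X^c}{R(R - 1)} 2\pi R \le \frac{X^c}{T - 1} \to 0 \quad\text{as } T \to \infty. \]
\[ |J_T^+| \le \frac{1}{2\pi} \frac{X^c}{R^2} 2\pi R \le \frac{X^c}{T} \to 0 \quad\text{as } T \to \infty. \]
By Cauchy's residue theorem, we have
\[ I_T = J_T^+. \]
The result for $X \le 1$ follows on letting $T \to \infty$.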
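The value of the contour integral can be confirmed by direct numerical integration along a truncated vertical line. The standard evaluation of $\frac{1}{2\pi i} \int_{c - i\infty}^{c + i\infty} \frac{X^s}{s(s+1)} \, ds$ is $1 - 1/X$ for $X \ge 1$ and $0$ for $0 < X \le 1$. The sketch below uses the trapezoidal rule; the choices $c = 2$, the truncation height and the step size are assumptions of this example.

```python
import math

def truncated_line_integral(X, c=2.0, T=500.0, step=0.02):
    """Trapezoidal approximation of (1/(2*pi*i)) times the integral of X^s/(s(s+1)) ds
    over the segment from c - iT to c + iT."""
    def integrand(t):
        s = complex(c, t)
        return (X ** s / (s * (s + 1))).real

    steps = int(round(T / step))
    # For real X the integrand at c - it is the conjugate of that at c + it, so the
    # integral equals (1/pi) times the integral over [0, T] of the real part.
    total = 0.5 * (integrand(0.0) + integrand(T))
    total += sum(integrand(k * step) for k in range(1, steps))
    return total * step / math.pi

value_large = truncated_line_integral(3.0)   # compare with 1 - 1/3
value_small = truncated_line_integral(0.5)   # compare with 0
```

The truncation error is at most $X^c/(\pi T)$, since the integrand is bounded by $X^c/t^2$ in absolute value, so the agreement improves as the height $T$ increases.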
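the last equality following from interchanging the order of integration and summation. Note also that
the above conclusion holds trivially if $0 < X < 1$. It therefore follows from Theorem 5.6 that for every
$X > 0$, we have
\[ \frac{\psi_1(X)}{X} = \sum_{n \le X} \left( 1 - \frac{n}{X} \right) \Lambda(n) = \sum_{n \le X} \left( 1 - \frac{1}{X/n} \right) \Lambda(n) = \sum_{n=1}^\infty \frac{\Lambda(n)}{2\pi i} \int_{c - i\infty}^{c + i\infty} \frac{(X/n)^s}{s(s + 1)} \, ds, \]
where $c > 1$. Since $c > 1$, the order of summation and integration can be interchanged, as
\[ \sum_{n=1}^\infty \int_{c - i\infty}^{c + i\infty} \left| \frac{\Lambda(n) (X/n)^s}{s(s + 1)} \right| |ds| \le X^c \sum_{n=1}^\infty \frac{\Lambda(n)}{n^c} \int_{-\infty}^\infty \frac{dt}{c^2 + t^2} \]
as required.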
Recall first of all Theorem 4.11. In the case of the Riemann zeta function, equation (7) of Chapter 4
becomes
\[ \sum_{n \le X} n^{-s} = s \int_1^X [x] x^{-s-1} \, dx + [X] X^{-s} = s \int_1^X x^{-s} \, dx - s \int_1^X \{x\} x^{-s-1} \, dx + X^{-s+1} - \{X\} X^{-s} \]
\[ = \frac{s}{s - 1} - \frac{s}{(s - 1) X^{s-1}} - s \int_1^X \frac{\{x\}}{x^{s+1}} \, dx + \frac{1}{X^{s-1}} - \frac{\{X\}}{X^s}. \tag{9} \]
Letting $X \to \infty$, we deduce that
\[ \zeta(s) = \frac{s}{s - 1} - s \int_1^\infty \frac{\{x\}}{x^{s+1}} \, dx \tag{10} \]
if $\sigma > 1$. Recall also that (10) gives an analytic continuation of $\zeta(s)$ to $\sigma > 0$, with a simple pole at
$s = 1$. We shall use these formulae to deduce important information about the order of magnitude of
$|\zeta(s)|$ in the neighbourhood of the line $\sigma = 1$ and to the left of it. Note that $\zeta(\sigma + it)$ and $\zeta(\sigma - it)$ are
complex conjugates, so it suffices to study $\zeta(s)$ on the upper half plane.
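It follows that
\[ |\zeta(s)| \le \sum_{n \le X} \frac{1}{n^\sigma} + \frac{1}{t X^{\sigma-1}} + \frac{1}{X^\sigma} + |s| \int_X^\infty \frac{dx}{x^{\sigma+1}} \le \sum_{n \le X} \frac{1}{n^\sigma} + \frac{1}{t X^{\sigma-1}} + \frac{1}{X^\sigma} + \left( 1 + \frac{t}{\sigma} \right) \frac{1}{X^\sigma}. \tag{12} \]
If $\sigma \ge 1$, $t \ge 1$ and $X \ge 1$, then
\[ |\zeta(s)| \le \sum_{n \le X} \frac{1}{n} + \frac{1}{t} + \frac{1}{X} + \frac{1 + t}{X} \le (\log X + 1) + 3 + \frac{t}{X}. \]
Choosing $X = t$, we obtain
\[ |\zeta(s)| \le (\log t + 1) + 4 = O(\log t), \]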
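proving (i). On the other hand, if $\sigma \ge \delta$, $t \ge 1$ and $X \ge 1$, then it follows from (12) that
\[ |\zeta(s)| \le \sum_{n \le X} \frac{1}{n^\delta} + \frac{1}{t X^{\delta-1}} + \left( 2 + \frac{t}{\delta} \right) \frac{1}{X^\delta} \le \int_0^{[X]} \frac{dx}{x^\delta} + \frac{X^{1-\delta}}{t} + \frac{3t}{\delta X^\delta} \le \frac{X^{1-\delta}}{1 - \delta} + X^{1-\delta} + \frac{3t}{\delta X^\delta}. \]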
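proving (iii). To deduce (ii), we may differentiate (11) with respect to $s$ and proceed in a similar way.
Alternatively, suppose that $s_0 = \sigma_0 + it_0$ satisfies $\sigma_0 \ge 1$ and $t_0 \ge 2$. Let $C$ be the circle with centre $s_0$
and radius $\rho < 1/2$. Then
\[ |\zeta'(s_0)| = \left| \frac{1}{2\pi i} \int_C \frac{\zeta(s)}{(s - s_0)^2} \, ds \right| \le \frac{M}{\rho}, \]
where $M = \sup_{s \in C} |\zeta(s)|$. Now for every $s \in C$, we clearly have $\sigma \ge \sigma_0 - \rho \ge 1 - \rho$ and $2t_0 > t \ge
t_0 - \rho > 1$. It follows from (13) that for every $s \in C$, we must have ($\delta = 1 - \rho$)
\[ |\zeta(s)| \le (2t_0)^\rho \left( \frac{1}{\rho} + 1 + \frac{3}{1 - \rho} \right) \le \frac{10 t_0^\rho}{\rho}, \]
\[ |\zeta'(s_0)| \le \frac{10 t_0^\rho}{\rho^2}. \]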
THEOREM 5.8. The function $\zeta(s)$ has no zeros on the line $\sigma = 1$. Furthermore, there is a positive
constant $A$ such that as $t \to \infty$, we have, for $\sigma \ge 1$, that
\[ \frac{1}{\zeta(s)} = O\left( (\log t)^A \right). \]
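so that
\[ \log |\zeta(\sigma + it)| = \mathfrak{R} \sum_{n=2}^\infty c_n n^{-\sigma-it} = \sum_{n=2}^\infty c_n n^{-\sigma} \cos(t \log n), \tag{15} \]
where
\[ c_n = \begin{cases} 1/m & (n = p^m, \text{ where } p \text{ is prime and } m \in \mathbb{N}), \\ 0 & (\text{otherwise}). \end{cases} \tag{16} \]
Combining (14)–(16), we have
\[ \log |\zeta^3(\sigma) \zeta^4(\sigma + it) \zeta(\sigma + 2it)| = \sum_{n=2}^\infty c_n n^{-\sigma} (3 + 4 \cos(t \log n) + \cos(2t \log n)) \ge 0. \]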
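The non-negativity used in the last step rests on the elementary identity $3 + 4\cos\theta + \cos 2\theta = 2(1 + \cos\theta)^2$. A quick numerical check of both the identity and the resulting lower bound, on an arbitrary grid of angles, can be sketched as follows.

```python
import math

def trig_polynomial(theta):
    """The expression 3 + 4 cos(theta) + cos(2 theta) appearing in the sum above."""
    return 3 + 4 * math.cos(theta) + math.cos(2 * theta)

grid = [2 * math.pi * k / 10000 for k in range(10001)]
minimum = min(trig_polynomial(theta) for theta in grid)
max_identity_error = max(
    abs(trig_polynomial(theta) - 2 * (1 + math.cos(theta)) ** 2) for theta in grid
)
```

The minimum is attained at $\theta = \pi$, where the expression vanishes, so `minimum` should be zero up to rounding error.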
Suppose that the point $s = 1 + it$ is a zero of $\zeta(s)$. Then since $\zeta(s)$ is analytic at the points $s = 1 + it$
and $s = 1 + 2it$ and has a simple pole with residue 1 at $s = 1$, the left hand side of (17) must converge to
a finite limit as $\sigma \to 1+$, contradicting the fact that the right hand side diverges to infinity as $\sigma \to 1+$.
Hence $s = 1 + it$ cannot be a zero of $\zeta(s)$. To prove the second assertion, we may assume without loss
of generality that $1 \le \sigma \le 2$, since for $\sigma \ge 2$, we have
\[ \left| \frac{1}{\zeta(s)} \right| = \left| \prod_p (1 - p^{-s}) \right| \le \prod_p (1 + p^{-\sigma}) < \zeta(\sigma) \le \zeta(2). \]
by Theorem 5.7(i), where $A_1$ is a positive absolute constant. Since $\log(2t) \le 2 \log t$, it follows that
\[ |\zeta(\sigma + it)| \ge \frac{(\sigma - 1)^{3/4}}{A_2 (\log t)^{1/4}}, \tag{18} \]
where $A_2$ is a positive absolute constant. Note that (18) holds also when $\sigma = 1$. Suppose now that
$1 < \eta < 2$. If $1 \le \sigma \le \eta$ and $t \ge 2$, then it follows from Theorem 5.7(ii) that
\[ |\zeta(\sigma + it) - \zeta(\eta + it)| = \left| \int_\sigma^\eta \zeta'(x + it) \, dx \right| \le A_3 (\eta - 1) \log^2 t, \]
\[ |\zeta(\sigma + it)| \ge |\zeta(\eta + it)| - A_3 (\eta - 1) \log^2 t \ge \frac{(\eta - 1)^{3/4}}{A_2 (\log t)^{1/4}} - A_3 (\eta - 1) \log^2 t. \tag{19} \]
On the other hand, if $\eta \le \sigma \le 2$ and $t \ge 2$, then in view of (18), the inequality (19) must also hold. It
follows that inequality (19) holds if $1 \le \sigma \le 2$, $t \ge 2$ and $1 < \eta < 2$. We now choose $\eta$ so that
\[ \frac{(\eta - 1)^{3/4}}{A_2 (\log t)^{1/4}} = 2 A_3 (\eta - 1) \log^2 t; \]
in other words,
\[ \eta = 1 + (2 A_2 A_3)^{-4} (\log t)^{-9}, \]
where $t > t_0$ so that $\eta < 2$. Then
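We are now ready to complete the proof of Theorem 5.4. By Theorem 5.5, we have
\[ \frac{\psi_1(X)}{X^2} = \frac{1}{2\pi i} \int_{c - i\infty}^{c + i\infty} G(s) X^{s-1} \, ds, \tag{20} \]
where
\[ G(s) = -\frac{1}{s(s + 1)} \frac{\zeta'(s)}{\zeta(s)} = -\frac{1}{s(s + 1)} \zeta'(s) \frac{1}{\zeta(s)}. \]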
By Theorems 4.8, 5.7 and 5.8, we know that $G(s)$ is analytic for $\sigma \ge 1$, except at $s = 1$, and that for
some positive absolute constant $A$, we have
\[ G(s) = O\left( |t|^{-2} (\log |t|)^2 (\log |t|)^A \right) < |t|^{-3/2} \tag{21} \]
for all $|t| > t_0$. Let $\epsilon > 0$ be given. We now consider a contour made up of the straight line segments
\[ L_1 = [1 - iU, 1 - iT], \]
\[ L_2 = [1 - iT, \alpha - iT], \]
\[ L_3 = [\alpha - iT, \alpha + iT], \]
\[ L_4 = [\alpha + iT, 1 + iT], \]
\[ L_5 = [1 + iT, 1 + iU], \]
where $T = T(\epsilon) > \max\{t_0, 2\}$, $\alpha = \alpha(T) = \alpha(\epsilon) \in (0, 1)$ and $U$ are chosen to satisfy the following
conditions:
(i) We have
\[ \int_T^\infty |G(1 + it)| \, dt < \epsilon. \]
(ii) The rectangle $[\alpha, 1] \times [-T, T]$ contains no zeros of $\zeta(s)$. Note that this is possible since $\zeta(s)$
has no zeros on the line $\sigma = 1$ and, as an analytic function, has at most a finite number of zeros in the
region $[1/2, 1) \times [-T, T]$.
(iii) We have $U > T$.
Furthermore, define the straight line segments
\[ M_1 = [c - iU, 1 - iU], \]
\[ M_2 = [1 + iU, c + iU]. \]
[Figure: the contour running from $c - iU$ to $c + iU$ along $M_1$, $L_1$, $L_2$, $L_3$, $L_4$, $L_5$ and $M_2$, with vertices $c - iU$, $1 - iU$, $1 - iT$, $\alpha - iT$, $\alpha + iT$, $1 + iT$, $1 + iU$ and $c + iU$.]
and
\[ \left| \int_{L_3} G(s) X^{s-1} \, ds \right| \le 2 T M X^{\alpha-1}, \tag{26} \]
where
\[ M = M(\alpha, T) = M(\epsilon) = \sup_{L_2 \cup L_3 \cup L_4} |G(s)|. \tag{27} \]
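On letting $U \to \infty$, we have
\[ \left| \frac{\psi_1(X)}{X^2} - \frac{1}{2} \right| \le \frac{\epsilon}{\pi} + \frac{M}{\pi \log X} + \frac{T M}{\pi} X^{\alpha-1}. \]
\[ \frac{\psi_1(X)}{X^2} \to \frac{1}{2}. \]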
Chapter 6
THE RIEMANN ZETA FUNCTION
In Riemann's only paper on number theory, published in 1860, he proved the following result.
THEOREM 6.1. The function $\zeta(s)$ can be continued analytically over the whole plane, and satisfies
the functional equation
\[ \pi^{-s/2} \Gamma\left( \frac{s}{2} \right) \zeta(s) = \pi^{-(1-s)/2} \Gamma\left( \frac{1 - s}{2} \right) \zeta(1 - s), \tag{1} \]
where $\Gamma$ denotes the gamma function. In particular, $\zeta(s)$ is analytic everywhere, except for a simple pole
at $s = 1$ with residue 1.
Note that the functional equation (1) enables properties of $\zeta(s)$ for $\sigma < 0$ to be inferred from
properties of $\zeta(s)$ for $\sigma > 1$. In particular, the only zeros of $\zeta(s)$ for $\sigma < 0$ are at the poles of $\Gamma(s/2)$; in
other words, at the points $s = -2, -4, -6, \ldots$. These are called the trivial zeros of $\zeta(s)$.
The part of the plane with $0 \le \sigma \le 1$ is called the critical strip.
Riemann's paper is particularly remarkable in the conjectures it contains. While most of the
conjectures have been proved, the famous Riemann hypothesis has so far resisted all attempts to prove or
disprove it.
THEOREM 6.2. (Hadamard 1893) The function $\zeta(s)$ has infinitely many zeros in the critical strip.
It is easy to see that the zeros of $\zeta(s)$ in the critical strip are placed symmetrically with respect to
the line $t = 0$ as well as with respect to the line $\sigma = 1/2$, the latter observation being a consequence of
the functional equation (1).
This chapter was first used in lectures given by the author at Imperial College, University of London, in 1990.
THEOREM 6.3. (von Mangoldt 1905) Let $N(T)$ denote the number of zeros $\rho = \beta + i\gamma$ of the
function $\zeta(s)$ in the critical strip with $0 < \gamma \le T$. Then
\[ N(T) = \frac{T}{2\pi} \log \frac{T}{2\pi} - \frac{T}{2\pi} + O(\log T). \tag{2} \]
where $A$ and $B$ are constants and where $\rho$ runs over all the zeros of $\zeta(s)$ in the critical strip.
We comment here that the product representation (4) plays an important role in the first proof of
the Prime number theorem.
The most remarkable of Riemann's conjectures is an explicit formula for the difference $\pi(X) - \mathrm{li}(X)$,
containing a term which is a sum over the zeros of $\zeta(s)$ in the critical strip. This shows that the zeros of
$\zeta(s)$ play a crucial role in the study of the distribution of primes. Here we state a result closely related
to this formula.
Then
\[ \psi_0(X) - X = -\sum_\rho \frac{X^\rho}{\rho} - \frac{\zeta'(0)}{\zeta(0)} - \frac{1}{2} \log\left( 1 - \frac{1}{X^2} \right), \]
where the terms in the sum arising from complex conjugates are taken together.
However, there remains one of Riemann's conjectures which is still unsolved today. The open
question below is arguably the most famous unsolved problem in the whole of mathematics.
CONJECTURE. (Riemann Hypothesis) The zeros of the function $\zeta(s)$ in the critical strip all lie
on the line $\sigma = 1/2$.
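so that
\[ \pi^{-s/2} \Gamma\left( \frac{s}{2} \right) n^{-s} = \int_0^\infty x^{s/2-1} e^{-n^2 \pi x} \, dx. \]
It follows that for $\sigma > 1$, we have
\[ \pi^{-s/2} \Gamma\left( \frac{s}{2} \right) \zeta(s) = \sum_{n=1}^\infty \int_0^\infty x^{s/2-1} e^{-n^2 \pi x} \, dx = \int_0^\infty x^{s/2-1} \left( \sum_{n=1}^\infty e^{-n^2 \pi x} \right) dx, \]
where the change of order of summation and integration is justified by the convergence of
\[ \sum_{n=1}^\infty \int_0^\infty x^{\sigma/2-1} e^{-n^2 \pi x} \, dx. \]
Now write
\[ \omega(x) = \sum_{n=1}^\infty e^{-n^2 \pi x}. \]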
Note now that the integral on the right hand side of (8) converges absolutely for any $s$, and uniformly
in any bounded part of the plane, since $\omega(x) = O(e^{-\pi x})$ as $x \to +\infty$. Hence the integral represents an
entire function of $s$, and the formula gives the analytic continuation of $\zeta(s)$ over the whole plane. Note
also that the right hand side of (8) remains unchanged when $s$ is replaced by $1 - s$, so that the functional
equation (1) follows immediately. Finally, note that the function
\[ \xi(s) = \frac{1}{2} s(s - 1) \pi^{-s/2} \Gamma\left( \frac{s}{2} \right) \zeta(s) \]
is analytic everywhere. Since $s \Gamma(s/2)$ has no zeros, the only possible pole of $\zeta(s)$ is at $s = 1$, and we
have already shown earlier that $\zeta(s)$ has a simple pole at $s = 1$ with residue 1.
It remains to establish (6) for every $x > 0$. In other words, we need to prove that for every $x > 0$,
we have
\[ \sum_{n=-\infty}^\infty e^{-n^2 \pi / x} = x^{1/2} \sum_{n=-\infty}^\infty e^{-n^2 \pi x}. \]
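Before turning to the proof, the transformation formula can be checked numerically, since both series converge extremely fast. The sketch below truncates the sums at $|n| \le 60$, an arbitrary cut-off that is more than sufficient for the sample values of $x$ used here; the function names are illustrative.

```python
import math

def theta_sum(x, terms=60):
    """Sum of exp(-n^2 * pi * x) over all integers n with |n| <= terms."""
    return 1.0 + 2.0 * sum(math.exp(-n * n * math.pi * x) for n in range(1, terms + 1))

def both_sides(x):
    """Return the left and right hand sides of the transformation formula at x > 0."""
    return theta_sum(1.0 / x), math.sqrt(x) * theta_sum(x)

checks = [both_sides(x) for x in (0.05, 0.5, 1.0, 3.0)]
```

Each pair in `checks` should agree to within rounding error.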
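The starting point is the Poisson summation formula, that under certain conditions on a function $f(t)$,
we have
\[ {\sum_{A \le n \le B}}' f(n) = \sum_{\nu=-\infty}^\infty \int_A^B f(t) e^{2\pi i \nu t} \, dt, \tag{9} \]
where $\sum'$ denotes that the terms in the sum corresponding to $n = A$ and $n = B$ are $\frac{1}{2} f(A)$ and $\frac{1}{2} f(B)$
respectively. Using (9) with $A = -N$, $B = N$ and $f(t) = e^{-t^2 \pi / x}$, we have
\[ {\sum_{n=-N}^N}' e^{-n^2 \pi / x} = \sum_{\nu=-\infty}^\infty \int_{-N}^N e^{-t^2 \pi / x} e^{2\pi i \nu t} \, dt. \]
Letting $N \to \infty$, we obtain
\[ \sum_{n=-\infty}^\infty e^{-n^2 \pi / x} = \sum_{\nu=-\infty}^\infty \int_{-\infty}^\infty e^{-t^2 \pi / x} e^{2\pi i \nu t} \, dt. \tag{10} \]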
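and that
\[ \sum_{\nu \ne 0} \int_N^\infty e^{-t^2 \pi / x} \cos(2\pi \nu t) \, dt \to 0 \]
\[ \sum_{n=-\infty}^\infty e^{-n^2 \pi / x} = x \sum_{\nu=-\infty}^\infty \int_{-\infty}^\infty e^{-u^2 \pi x} e^{2\pi i \nu x u} \, du = x \sum_{\nu=-\infty}^\infty \int_{-\infty}^\infty e^{-(u - i\nu)^2 \pi x - \nu^2 \pi x} \, du \]
\[ = x \sum_{\nu=-\infty}^\infty e^{-\nu^2 \pi x} \int_{-\infty}^\infty e^{-(u - i\nu)^2 \pi x} \, du. \tag{11} \]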
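where
\[ A = \int_{-\infty}^\infty e^{-y^2 \pi} \, dy = 1. \tag{13} \]
\[ \sum_{n=-\infty}^\infty e^{-n^2 \pi / x} = x^{1/2} \sum_{\nu=-\infty}^\infty e^{-\nu^2 \pi x}. \]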
This gives (6), and the proof of Theorem 6.1 is now complete.
In this section, we shall prove some technical results on entire functions for use later in the proof of
Theorems 6.2 and 6.4.
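An entire function $f(s)$ is said to be of order 1 if
\[ f(s) = O\left( e^{|s|^\alpha} \right) \quad\text{as } |s| \to \infty \tag{14} \]
holds for every $\alpha > 1$. Without loss of generality, we may suppose that $g(0) = 0$. Then we can write
\[ g(Re^{i\theta}) = \sum_{k=1}^\infty (a_k + ib_k) R^k e^{ik\theta} \quad (a_k, b_k \in \mathbb{R}), \]
so that
\[ \mathfrak{R} g(Re^{i\theta}) = \sum_{k=1}^\infty a_k R^k \cos k\theta - \sum_{k=1}^\infty b_k R^k \sin k\theta. \]
and
\[ \int_0^{2\pi} \sin k\theta \cos n\theta \, d\theta = 0. \]
It follows that
\[ \int_0^{2\pi} \mathfrak{R} g(Re^{i\theta}) \cos n\theta \, d\theta = \pi a_n R^n, \]
so that
\[ \pi |a_n| R^n \le \int_0^{2\pi} |\mathfrak{R} g(Re^{i\theta})| \, d\theta = O(R^\alpha) \quad\text{as } R \to \infty \]
holds for every $\alpha > 1$. On letting $R \to \infty$, we see that $a_n = 0$ for every $n > 1$. A similar argument
using the function $\sin n\theta$ instead of the function $\cos n\theta$ gives $b_n = 0$ for every $n > 1$. We have therefore
proved the following result.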
THEOREM 6.6. Suppose that the entire function $h(s)$ has no zeros on the plane, and that (15) holds
for every $\alpha > 1$. Then $h(s) = e^{A+Bs}$, where $A$ and $B$ are constants.
Remark. In the preceding argument, note that it is enough to assume that the estimates for $h(s)$ hold
for a sequence of values $R$ with limit infinity.
Our next task is to study the distribution of the zeros of an entire function. The first step in this
direction is summarized by the result below.
THEOREM 6.7. (Jensen's Formula) Suppose that $f(s)$ is an entire function satisfying $f(0) \ne 0$.
Suppose further that $s_1, \ldots, s_n$ are the zeros of $f(s)$ in $|s| < R$, counted with multiplicities, and that
there are no zeros of $f(s)$ on $|s| = R$. Then
\[ \frac{1}{2\pi} \int_0^{2\pi} \log |f(Re^{i\theta})| \, d\theta - \log |f(0)| = \log \frac{R^n}{|s_1 \ldots s_n|}. \tag{16} \]
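\[ \frac{R^2 - \overline{s_j} s}{R} \]
has no zeros in $|s| \le R$ and satisfies
\[ \left| \frac{R^2 - \overline{s_j} s}{R} \right| = |s - s_j| \]
on the circle $|s| = R$, so that
\[ \frac{1}{2\pi} \int_0^{2\pi} \log |f(Re^{i\theta})| \, d\theta = n \log R + \log |f(0)| - \log |s_1 \ldots s_n|, \]
and the result follows.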
Remarks. (i) It is important to point out that Jensen's formula was in fact only discovered after
Hadamard's work in connection with Theorems 6.2 and 6.4.
(ii) Gauss's mean value theorem states that the value of an analytic function at the centre of a circle
is equal to the arithmetic mean of its values on the circle. In particular, if the function $F(s)$ is analytic
for $|s| < R_0$, then for every $R < R_0$, we have
\[ F(0) = \frac{1}{2\pi} \int_0^{2\pi} F(Re^{i\theta}) \, d\theta. \]
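Jensen's formula (16) is easy to test on a polynomial with known zeros. The sketch below uses the illustrative polynomial $f(s) = (s - 1)(s - 2i)(s + 3)$ and the radius $R = 2.5$, so that exactly two zeros lie inside the circle; the mean value on the circle is approximated by an equally spaced sum.

```python
import cmath
import math

ZEROS = [1, 2j, -3]  # zeros of the illustrative polynomial f below

def f(s):
    """f(s) = (s - 1)(s - 2i)(s + 3)."""
    result = 1
    for zero in ZEROS:
        result *= s - zero
    return result

def jensen_sides(R, samples=4000):
    """Return both sides of (16) for the polynomial f and the radius R."""
    # Left hand side: mean of log|f| over the circle |s| = R, minus log|f(0)|.
    mean_log = sum(
        math.log(abs(f(R * cmath.exp(2j * math.pi * k / samples)))) for k in range(samples)
    ) / samples
    lhs = mean_log - math.log(abs(f(0)))
    # Right hand side: log(R^n / |s_1 ... s_n|) over the zeros inside |s| < R.
    inside = [zero for zero in ZEROS if abs(zero) < R]
    rhs = len(inside) * math.log(R) - sum(math.log(abs(zero)) for zero in inside)
    return lhs, rhs

lhs, rhs = jensen_sides(2.5)
```

Both sides should equal $2 \log 2.5 - \log 2$. The equally spaced sum converges very quickly here because the integrand is a smooth periodic function of $\theta$ when no zero lies on the circle.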
A simple consequence of Jensen's formula is the following result on the zeros of entire functions.
THEOREM 6.8. Suppose that $f(s)$ is an entire function satisfying $f(0) \ne 0$, and that (14) holds for
every $\alpha > 1$. Suppose further that $s_1, s_2, s_3, \ldots$ are the zeros of $f(s)$, counted with multiplicities and
where $|s_1| \le |s_2| \le |s_3| \le \ldots$. Then for every $\alpha > 1$, the series
\[ \sum_{n=1}^\infty |s_n|^{-\alpha} \]
is convergent.
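where, for every non-negative $r \in \mathbb{R}$, $n(r)$ denotes the number of zeros of $f(s)$ in $|s| \le r$. To see this,
note that
\[ \int_0^R r^{-1} n(r) \, dr = \sum_{j=1}^{n-1} \int_{|s_j|}^{|s_{j+1}|} r^{-1} j \, dr + \int_{|s_n|}^R r^{-1} n \, dr \]
\[ = \sum_{j=1}^{n-1} j (\log |s_{j+1}| - \log |s_j|) + n (\log R - \log |s_n|) \]
\[ = n \log R - \log |s_1| - \ldots - \log |s_n|. \]
It follows that
\[ n(R) = O(R^\alpha) \quad\text{as } R \to \infty. \]
Now
\[ \sum_{n=1}^\infty |s_n|^{-\alpha} = \int_0^\infty r^{-\alpha} \, dn(r) = \alpha \int_0^\infty r^{-\alpha-1} n(r) \, dr < \infty \]
as required.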
Suppose now that $f(s)$ is an entire function satisfying $f(0) \ne 0$, and that (14) holds for every
$\alpha > 1$. Suppose further that $s_1, s_2, s_3, \ldots$ are the zeros of $f(s)$, counted with multiplicities and where
$|s_1| \le |s_2| \le |s_3| \le \ldots$. Then for every $\epsilon > 0$, the series
\[ \sum_{n=1}^\infty |s_n|^{-1-\epsilon} \]
converges absolutely for every $s \in \mathbb{C}$, and uniformly in any bounded domain not containing any zeros of
$f(s)$. It follows that $P(s)$ is an entire function, with zeros at $s_1, s_2, s_3, \ldots$. Now write
where $h(s)$ is an entire function without zeros. If (15) holds for every $\alpha > 1$, then $h(s) = e^{A+Bs}$, where
$A$ and $B$ are constants, and so
\[ f(s) = e^{A+Bs} \prod_{n=1}^\infty \left( 1 - \frac{s}{s_n} \right) e^{s/s_n}. \tag{21} \]
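A classical instance of a representation of the form (21) is $\sin(\pi s)/(\pi s) = \prod_{n \ne 0} (1 - s/n) e^{s/n}$, where the zeros are the non-zero integers and $A = B = 0$. This example is not taken from the notes; it is used here only to illustrate how the exponential factors make the product converge. The truncation point and the sample value of $s$ are arbitrary.

```python
import cmath
import math

def canonical_product(s, N=20000):
    """Product of (1 - s/n) * exp(s/n) over the integers n with 0 < |n| <= N."""
    product = 1 + 0j
    for n in range(1, N + 1):
        for zero in (n, -n):
            product *= (1 - s / zero) * cmath.exp(s / zero)
    return product

s = 0.3 + 0.2j
approx = canonical_product(s)
exact = cmath.sin(math.pi * s) / (math.pi * s)
```

The truncated product should agree with the closed form to a relative accuracy of order $|s|^2/N$.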
THEOREM 6.9. Under the hypotheses of Theorem 6.8, the inequality (15) holds for every $\alpha > 1$,
where the function $h(s)$ is defined by (19) and (20). In particular, the function $f(s)$ can be expressed in
the form (21), where $A$ and $B$ are constants.
Proof. To show that the inequality (15) holds for every $\alpha > 1$, it clearly suffices, in view of (14) and
(20), to establish a suitable lower bound for $|P(s)|$. Since the series
\[ \sum_{n=1}^\infty |s_n|^{-2} \]
has finite total length. It follows that there exist arbitrarily large positive values $R$ such that $R \notin S$, so
that
\[ |R - |s_n|| \ge |s_n|^{-2} \quad\text{for every } n \in \mathbb{N}. \tag{22} \]
For any such $R$, write, for $j = 1, 2, 3$,
\[ P_j(s) = \prod_{(23.j)} \left( 1 - \frac{s}{s_n} \right) e^{s/s_n}, \]
\[ |s_n| < \frac{R}{2}, \tag{23.1} \]
\[ \frac{R}{2} \le |s_n| < 2R, \tag{23.2} \]
\[ |s_n| \ge 2R, \tag{23.3} \]
respectively. Clearly
\[ P(s) = P_1(s) P_2(s) P_3(s). \tag{24} \]
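Let $\epsilon > 0$ be chosen and fixed. Suppose first of all that (23.1) holds. Then on $|s| = R$, we have
\[ \left| \left( 1 - \frac{s}{s_n} \right) e^{s/s_n} \right| \ge \left( \left| \frac{s}{s_n} \right| - 1 \right) e^{-|s|/|s_n|} > e^{-R/|s_n|}, \]
that
\[ |P_1(s)| \gg e^{-R^{1+2\epsilon}} \quad\text{as } R \to \infty. \tag{25} \]
in view of (22). Note that there are at most $O(R^{1+\epsilon})$ values of $n$ for which (23.2) holds. Hence on
$|s| = R$, we have
\[ |P_2(s)| \gg (R^{-3})^{R^{1+\epsilon}} \gg e^{-R^{1+2\epsilon}} \quad\text{as } R \to \infty. \tag{26} \]
for some positive constant $c$ (see the Remark below), and so it follows from
\[ \sum_{(23.3)} |s_n|^{-2} \le (2R)^{-1+\epsilon} \sum_{n=1}^\infty |s_n|^{-1-\epsilon} \]
that
\[ |P_3(s)| \gg e^{-R^{1+2\epsilon}} \quad\text{as } R \to \infty. \tag{28} \]
\[ |P(s)| \gg e^{-R^{1+3\epsilon}} \quad\text{as } R \to \infty. \tag{29} \]
The result now follows on combining (20) and (29), and noting that the inequality (14) holds for $\alpha = 1 + \epsilon$.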
where $|z| \le 1/2$. Write $z = x + iy$, where $x, y \in \mathbb{R}$. Then (30) will follow if we show that
whenever $|x| \le 1/2$. This last inequality can easily be established by using the theory of real valued
functions of a real variable.
THEOREM 6.10. Under the hypotheses of Theorem 6.8, suppose further that the series
\[ \sum_{n=1}^\infty |s_n|^{-1} \]
holds.
Proof. This follows from (21) and the inequality $|(1 - z) e^z| \le e^{2|z|}$ which holds for every $z \in \mathbb{C}$.
Recall that $\xi(s)$, defined by (3), is an entire function, and that $\xi(0) \ne 0$. Note also that the zeros of
$\xi(s)$ are precisely the zeros of $\zeta(s)$ in the critical strip. In order to establish Theorem 6.4, we shall use
Theorem 6.9. We therefore first need to show that
\[ \xi(s) = O\left( e^{|s|^\alpha} \right) \quad\text{as } |s| \to \infty \]
Proof. Since $\xi(s) = \xi(1 - s)$ for every $s \in \mathbb{C}$, it suffices to prove the inequality (31) for $\sigma \ge 1/2$. First
of all, there exists a positive constant $c_1$ such that
\[ \left| \frac{1}{2} s(s - 1) \pi^{-s/2} \right| < e^{c_1 |s|}. \]
as $|s| \to \infty$ is valid in the angle $-\pi/2 < \arg s < \pi/2$, and so there exists a positive constant $c_2$ such that
\[ \left| \Gamma\left( \frac{s}{2} \right) \right| < e^{c_2 |s| \log |s|}. \]
Finally, note that the formula
\[ \zeta(s) = \frac{s}{s - 1} - s \int_1^\infty \{x\} x^{-s-1} \, dx \]
is valid for $\sigma > 0$, and the integral is bounded for $\sigma \ge 1/2$, so that there exists a positive constant $c_3$
such that
\[ |\zeta(s)| < c_3 |s|. \]
This proves (31). On the other hand, note that as $s \to +\infty$ through real values, we have
\[ \log \Gamma\left( \frac{s}{2} \right) \sim \frac{s}{2} \log \frac{s}{2} \quad\text{and}\quad \zeta(s) \to 1, \]
so that (32) does not hold.
To complete the proof of Theorems 6.2 and 6.4, note that by Theorem 6.10, the series
\[ \sum_\rho |\rho|^{-1} \]
is divergent, where $\rho$ denotes the zeros of $\xi(s)$ and so the zeros of $\zeta(s)$ in the critical strip. Theorem 6.2
follows immediately. Theorem 6.4 now follows from Theorems 6.9 and 6.11.
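\[ \frac{\xi'(s)}{\xi(s)} = \frac{1}{s - 1} - \frac{1}{2} \log \pi + \frac{1}{2} \frac{\Gamma'(\frac{1}{2} s + 1)}{\Gamma(\frac{1}{2} s + 1)} + \frac{\zeta'(s)}{\zeta(s)}. \tag{34} \]
where $B$ is a constant and where $\rho$ denotes the zeros of $\zeta(s)$ in the critical strip.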
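The formula (35) exhibits the pole of $\zeta(s)$ at $s = 1$ and the zeros $\rho$ in the critical strip. The trivial
zeros are exhibited by the term
\[ -\frac{1}{2} \frac{\Gamma'(\frac{1}{2} s + 1)}{\Gamma(\frac{1}{2} s + 1)}. \]
To see this last point, we start from the Weierstrass formula
\[ \frac{1}{s \Gamma(s)} = e^{\gamma s} \prod_{n=1}^\infty \left( 1 + \frac{s}{n} \right) e^{-s/n}, \]
\[ -\frac{1}{2} \frac{\Gamma'(\frac{1}{2} s + 1)}{\Gamma(\frac{1}{2} s + 1)} = \frac{1}{2} \gamma + \sum_{n=1}^\infty \left( \frac{1}{s + 2n} - \frac{1}{2n} \right). \]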
The starting point of our discussion is based on the Argument principle. Suppose that the function $F(s)$
is analytic, apart from a finite number of poles, in the closure of a domain $D$ bounded by a simple closed
positively oriented Jordan curve $C$. Suppose further that $F(s)$ has no zeros or poles on $C$. Then
\[ \frac{1}{2\pi i} \int_C \frac{F'(s)}{F(s)} \, ds = \frac{1}{2\pi} \Delta_C \arg F(s) \]
represents the total number of zeros of $F(s)$ in $D$ minus the total number of poles of $F(s)$ in $D$, counted
with multiplicities. Here $\Delta_C \arg F(s)$ denotes the change of argument of the function $F(s)$ along $C$.
It is convenient to use the function $\xi(s)$, since it is entire and its zeros are precisely the zeros of $\zeta(s)$
in the critical strip. To calculate $N(T)$, it is convenient to take the domain $(-1, 2) \times (0, T)$, so that $C$
is the rectangular path passing through the vertices
\[ 2, \quad 2 + iT, \quad -1 + iT, \quad -1 \]
Let us now divide $C$ into the following parts. First, let $L_1$ denote the line segment from $-1$ to 2.
Next, let $L_2$ denote the line segment from 2 to $2 + iT$, followed by the line segment from $2 + iT$ to $\frac{1}{2} + iT$.
Finally, let $L_3$ denote the line segment from $\frac{1}{2} + iT$ to $-1 + iT$, followed by the line segment from $-1 + iT$
to $-1$.
Since $\xi(s)$ is real on $L_1$, clearly $\Delta_{L_1} \arg \xi(s) = 0$. On the other hand,
so that $\Delta_{L_2} \arg \xi(s) = \Delta_{L_3} \arg \xi(s)$. If we write $L = L_2$, so that $L$ denotes the line segment from 2 to
$2 + iT$, followed by the line segment from $2 + iT$ to $\frac{1}{2} + iT$, then
Recall that
\[ \xi(s) = (s - 1) \pi^{-s/2} \Gamma\left( \frac{s}{2} + 1 \right) \zeta(s). \]
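It follows that
\[ \Delta_L \arg \xi(s) = \Delta_L \arg(s - 1) + \Delta_L \arg \pi^{-s/2} + \Delta_L \arg \Gamma\left( \frac{s}{2} + 1 \right) + \Delta_L \arg \zeta(s). \tag{37} \]
Clearly
\[ \Delta_L \arg(s - 1) = \arg\left( -\frac{1}{2} + iT \right) = \frac{1}{2} \pi + O(T^{-1}) \tag{38} \]
and
\[ \Delta_L \arg \pi^{-s/2} = \Delta_L \left( -\frac{1}{2} t \log \pi \right) = -\frac{1}{2} T \log \pi. \tag{39} \]
On the other hand,
\[ \Delta_L \arg \Gamma\left( \frac{s}{2} + 1 \right) = \mathfrak{I} \log \Gamma\left( \frac{5}{4} + \frac{1}{2} iT \right). \]
By Stirling's formula,
\[ \log \Gamma\left( \frac{5}{4} + \frac{1}{2} iT \right) = \left( \frac{3}{4} + \frac{1}{2} iT \right) \log\left( \frac{5}{4} + \frac{1}{2} iT \right) - \frac{5}{4} - \frac{1}{2} iT + \frac{1}{2} \log 2\pi + O(T^{-1}), \]
so that
\[ \Delta_L \arg \Gamma\left( \frac{s}{2} + 1 \right) = \frac{T}{2} \log \frac{T}{2} + \frac{3}{8} \pi - \frac{T}{2} + O(T^{-1}). \tag{40} \]
Combining (36)–(40), we have
\[ N(T) = \frac{1}{2} - \frac{T}{2\pi} \log \pi + \frac{T}{2\pi} \log \frac{T}{2} + \frac{3}{8} - \frac{T}{2\pi} + S(T) + O(T^{-1}) \]
\[ = \frac{T}{2\pi} \log \frac{T}{2\pi} - \frac{T}{2\pi} + \frac{7}{8} + S(T) + O(T^{-1}), \]
where
\[ \pi S(T) = \Delta_L \arg \zeta(s). \]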
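The main term of this formula is already remarkably accurate for moderate values of $T$. The sketch below evaluates $\frac{T}{2\pi} \log \frac{T}{2\pi} - \frac{T}{2\pi} + \frac{7}{8}$, ignoring $S(T)$ and the $O(T^{-1})$ term, at $T = 100$; the function name is an illustrative choice.

```python
import math

def zero_count_main_term(T):
    """(T/2pi) log(T/2pi) - T/2pi + 7/8, that is N(T) without S(T) and the O(1/T) term."""
    x = T / (2 * math.pi)
    return x * math.log(x) - x + 7 / 8

estimate = zero_count_main_term(100.0)
```

The estimate is very close to 29, which agrees with the tabulated count of 29 zeros of $\zeta(s)$ in the critical strip with $0 < \gamma \le 100$.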
To prove Theorem 6.3, it suffices to prove the following result.
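Proof. Note first of all that $\arg \zeta(2) = 0$. On the other hand,
\[ \arg \zeta(s) = \tan^{-1} \frac{\mathfrak{I} \zeta(s)}{\mathfrak{R} \zeta(s)} \]
Suppose now that $\mathfrak{R} \zeta(s)$ vanishes $q$ times on the line segment from $2 + iT$ to $\frac{1}{2} + iT$. Then this line
segment can be divided into $q + 1$ parts, where in each subinterval, $\mathfrak{R} \zeta(s)$ may vanish only at one or
both of the endpoints and has constant sign strictly in between, so that the variation of $\arg \zeta(s)$ in each
such subinterval does not exceed $\pi$. It follows that for $s = \sigma + iT$, we have
\[ |\arg \zeta(s)| \le (q + 1) \pi + \frac{1}{2} \pi \quad \left( \frac{1}{2} \le \sigma \le 2 \right). \tag{41} \]
\[ \mathfrak{R} \zeta(s) = \frac{1}{2} (\zeta(\sigma + iT) + \zeta(\sigma - iT)). \]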
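\[ f_T(s) = \frac{1}{2} (\zeta(s + iT) + \zeta(s - iT)) \]
(note that we no longer insist that $s = \sigma + iT$). Then $q$ is the number of zeros of $f_T(s)$ on the line
segment from $1/2$ to 2, and so is bounded above by the number of zeros of $f_T(s)$ in the disc $|s - 2| \le 3/2$.
In other words,
\[ q \le n\left( \frac{3}{2} \right), \tag{42} \]
where, for every $r \ge 0$, $n(r)$ denotes the number of zeros of $f_T(s)$ in the disc $|s - 2| \le r$. By Jensen's
formula and noting that we may assume that $\zeta(\frac{1}{2} + iT) \ne 0$, we have
\[ \int_0^{7/4} \frac{n(r)}{r} \, dr = \frac{1}{2\pi} \int_0^{2\pi} \log \left| f_T\left( 2 + \frac{7}{4} e^{i\theta} \right) \right| d\theta - \log |f_T(2)|. \tag{43} \]
Observe that
\[ |f_T(2)| = \left| \frac{1}{2} (\zeta(2 + iT) + \zeta(2 - iT)) \right| = |\mathfrak{R} \zeta(2 + iT)| = \left| \mathfrak{R} \sum_{n=1}^\infty \frac{1}{n^{2+iT}} \right| \ge 1 - \sum_{n=2}^\infty \frac{1}{n^2} = 2 - \frac{\pi^2}{6} > 0, \]
so that
\[ \log |f_T(2)| = O(1). \tag{45} \]
Finally, recall that $|\zeta(s)| \ll T^{3/4}$
for every $\sigma \ge 1/4$. It follows that for every $\theta \in [0, 2\pi]$, we have
\[ \left| f_T\left( 2 + \frac{7}{4} e^{i\theta} \right) \right| \le \frac{1}{2} \left( \left| \zeta\left( 2 + \frac{7}{4} e^{i\theta} + iT \right) \right| + \left| \zeta\left( 2 + \frac{7}{4} e^{i\theta} - iT \right) \right| \right) \ll T^{3/4}, \]
so that
\[ \log \left| f_T\left( 2 + \frac{7}{4} e^{i\theta} \right) \right| \ll \log T. \tag{46} \]
Combining (43)–(46), we conclude that
\[ n\left( \frac{3}{2} \right) \ll \log T. \tag{47} \]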
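It follows that
\[ \psi_0(X) = {\sum_{n \le X}}' \Lambda(n) = \sum_{n=1}^\infty \Lambda(n) \frac{1}{2\pi i} \int_{c - i\infty}^{c + i\infty} \frac{(X/n)^s}{s} \, ds \]
\[ = \frac{1}{2\pi i} \int_{c - i\infty}^{c + i\infty} \left( \sum_{n=1}^\infty \frac{\Lambda(n)}{n^s} \right) \frac{X^s}{s} \, ds = \frac{1}{2\pi i} \int_{c - i\infty}^{c + i\infty} \left( -\frac{\zeta'(s)}{\zeta(s)} \right) \frac{X^s}{s} \, ds, \]
where $\sum'$ denotes that the term in the sum corresponding to $n = X$ is $\frac{1}{2} \Lambda(X)$.
The idea is now to move the line of integration away to infinity on the left, so that we can express
$\psi_0(X)$ as a sum of the residues at the poles of the function
\[ -\frac{\zeta'(s)}{\zeta(s)} \frac{X^s}{s}. \]
Here, the pole at $s = 1$ gives rise to a residue $X$, while the pole at $s = 0$ gives rise to a residue
$-\zeta'(0)/\zeta(0)$. On the other hand, each zero $\rho$ of $\zeta(s)$ in the critical strip gives rise to a residue $-X^\rho/\rho$,
while each trivial zero $-2n$ of $\zeta(s)$ gives rise to a residue $X^{-2n}/2n$. Finally, note that
\[ -\frac{1}{2} \log\left( 1 - \frac{1}{X^2} \right) = \sum_{n=1}^\infty \frac{X^{-2n}}{2n}. \]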
and regard the path of integration as one side of a rectangle with vertices at $c \pm iT$ and $-U \pm iT$, where
$U > 0$ is large. Here $T$ has to be chosen carefully so that the horizontal sides of the rectangle should
avoid the zeros of $\zeta(s)$ in the critical strip. Also $U$ has to be chosen carefully so that the left vertical
side of the rectangle should avoid the trivial zeros of $\zeta(s)$. On taking $U \to \infty$, this will result in a finite
version of Theorem 6.5, of the form
\[ \psi_0(X) = X - \sum_{|\gamma| < T} \frac{X^\rho}{\rho} - \frac{\zeta'(0)}{\zeta(0)} - \frac{1}{2} \log\left( 1 - \frac{1}{X^2} \right) + R(X, T), \]