An Introduction to LINEAR ANALYSIS

DONALD L. KREIDER, Dartmouth College
ROBERT G. KULLER, Wayne State University
DONALD R. OSTBERG, Indiana University
FRED W. PERKINS, Dartmouth College

ADDISON-WESLEY PUBLISHING COMPANY, INC., READING, MASSACHUSETTS, U.S.A.
ADDISON-WESLEY (CANADA) LIMITED, DON MILLS, ONTARIO


This book is in the
ADDISON-WESLEY SERIES IN MATHEMATICS

LYNN H. LOOMIS, Consulting Editor

Copyright © 1966
Philippines Copyright 1966
ADDISON-WESLEY PUBLISHING COMPANY, INC.

Printed in the United States of America

All rights reserved. This book, or parts thereof,
may not be reproduced in any form without
written permission of the publisher.

Library of Congress Catalog Card No. 65-23656

ADDISON-WESLEY PUBLISHING COMPANY, INC.
READING, MASSACHUSETTS   Palo Alto   London
NEW YORK   DALLAS   ATLANTA   BARRINGTON, ILLINOIS

ADDISON-WESLEY (CANADA) LIMITED
DON MILLS, ONTARIO
preface

For the student. Tradition dictates that textbooks should open with a few remarks
by the author in which he explains what his particular book is all about. This obli-
gation confronts the technical writer with something of a dilemma, since it is safe
to assume that the student is unfamiliar with the subject at hand; otherwise he
would hardly be reading it in the first place. Thus any serious attempt to describe
the content of a mathematics text is sure to be lost on the beginner until after he
has read the book, by which time, hopefully, he has discovered it for himself.
Still, there are a few remarks which can be addressed to the student before he

begins the task of learning a mathematical discipline. Above all, he should be told
what is expected of him in the way of prior knowledge, and just what special de-
mands, if any, are going to be made of him as he proceeds. In the present case the
first of these points is easily settled: We assume only a knowledge of elementary
calculus and analytic geometry such as is usually gained in a standard three-se-
mester course on these subjects. In particular, the reader should have encountered
the notion of an infinite series, and know how to take a partial derivative and eval-
uate a double integral. Actually almost two-thirds of this book can be read without
a knowledge of these last two items, while the first is covered quickly, but (for our
purposes) adequately, in Appendix I. In short, we have kept formal prerequisites
to a minimum. At the same time, however, we demand that the reader possess a
certain amount of that elusive quality called mathematical maturity, by which we
mean the patience to follow mathematical thought whither it may lead, and a will-
ingness to postpone concrete applications until enough mathematics has been done
to treat them properly.
This demand is a reflection of the fact that initially much of our work may
seem rather abstract, especially to the student coming directly from calculus. Thus,
even though we have made every effort to motivate our arguments by referring to
familiar situations and have illustrated them with numerous examples, it may not
be out of place to reassure those students not interested in mathematics per se that
every one of the topics we discuss is of fundamental importance in applied mathe-
matics, physics, and engineering. Indeed, without falsifying fact, this book could
have been entitled "An Introduction to Applied Mathematics" or "Advanced
Engineering Mathematics," and might have been, save that the material covered is
of real value for the student of "pure" mathematics too. Nevertheless, most of
the ideas which we have treated grew out of problems encountered in classical

physics and the mathematical analysis of physical systems. As such these ideas lie
at the foundations of modern physics and, to a lesser extent, modern mathe-
matics as well.
But even more important, the subject which we have chosen to call "Linear
Analysis" is, when viewed as an entity, one of the most profound creations of the
human mind, and numbers among its contributors a clear majority of the great
mathematicians and physicists of the past three centuries. For this reason alone
it is worthy of study, and as our discussion proceeds we can only hope that the stu-
dent will come to appreciate the beauty and power of the mathematical ideas it
exploits and the remarkable creativity of those who invented them. If he does, his
efforts and ours will have been well rewarded.
For the instructor. As its title suggests, this book is an introduction to those
branches of mathematics based on the notion of linearity. The subject matter
in these fields is vast, ranging all the way from differential and integral equa-
tions and the theory of Hilbert spaces to the mathematics encountered in con-
structing Green's functions and solving boundary-value problems in physics. Need-
less to say, no single book can do justice to such a variety of topics, particularly
when, as in the present case, it attempts to start at the beginning of things. Never-
theless, it is the firm conviction of the authors that the notion of linearity which
underlies these topics and ultimately enables them to be classified as branches of a
single mathematical discipline can be developed in such a way that the student will

gain a real understanding of the issues at stake.


Since we have assumed nothing more than a knowledge of elementary calculus
and analytic geometry, the first two chapters of the book are devoted to an exposi-
tion of the rudiments of linear algebra, which we develop to the point where dif-
ferential equations can be studied systematically. Anyone with a background in
linear algebra should be able to begin at once with Chapter 3, using the first two
chapters for reference.
Chapters 3 through 6 constitute an introduction to the theory of ordinary linear
differential equations comparable to that taught in most first courses on the sub-
ject. Following the usual preliminaries, we introduce the notion of an initial-value
problem and state the fundamental existence and uniqueness theorems for such
problems. With these theorems as our real starting point, we proceed to show that
the solution space of a normal nth-order homogeneous linear differential equation
must be n-dimensional, and use this fact to obtain a complete treatment of the
Wronskian. Then come equations with constant coefficients, solved by factoring
the operators involved, after which we turn our attention to the method of variation
of parameters. Here the algebraic point of view begins to pay real dividends, since

we are in a position to see this method as a technique for inverting linear differ-
ential operators. This leads naturally to the notion of Green's functions and their
associated integral operators, which we then treat in detail for initial-value prob-
lems. These ideas arise again in Chapter 5 when we study the Laplace transform,

and our approach is such that we are able to give an integrated treatment of what
all too often strike the student as unrelated techniques for solving differential equa-
tions. The sixth, and last, in this sequence of chapters extends the survey of linear

differential equations beyond the customary beginning course by proving the Sturm
separation and comparison theorems and the Sonin-Polya theorem. At this point

we anticipate our later needs by using these results to study the behavior of solu-
tions of Bessel's equation long before these solutions have been exhibited in series

form. Finally, continuing in the same spirit, we introduce the method of un-
determined coefficients for generating power series expansions of functions defined
by equations with analytic coefficients.
In Chapters 7 and 8 the setting changes to Euclidean spaces, and metric concepts
are introduced for the first time. We begin by proving the standard results for
finite dimensional spaces, and then proceed to discuss convergence in finite and

infinite dimensional spaces. Here we introduce the notion of an orthogonal basis


in an infinite dimensional Euclidean space together with the concept of orthogonal
series expansions in function spaces. Our point of view is that these ideas are
straightforward generalizations of concepts familiar from Euclidean n-space, and
we have made every effort to present them as such. In Chapters 9 and 11, we illus-

trate this theory by introducing several of the classical (Fourier) series of analysis,
first relative to the trigonometric functions, and then, in succession, relative to the

Legendre, Hermite, and Laguerre polynomials. (Chapter 10 is in the nature of a


digression and is devoted to the study of convergence of Fourier series.)

In Chapter 12, we pull the various threads of our story together by introducing
two-point boundary-value problems. We define eigenvalues and eigenvectors and
discuss the eigenvalue method for solving operator equations. As usual we begin
with the finite dimensional case, which is reduced to a problem in elementary alge-
bra via the characteristic equation, and then generalize to symmetric operators on
function spaces. The question of the existence of eigenfunction bases is treated in
a theorem (left unproved) of sufficient generality to cover the boundary-value prob-

lems considered in the chapters which follow. We conclude this discussion by re-
turning to the subject of Green's functions to establish their existence and unique-
ness for problems with unmixed boundary conditions.
The last three chapters of the book use these results to solve boundary-value
problems involving the wave, heat, and Laplace equations. The physical signifi-
cance of these equations is discussed and the method of separation of variables
is applied to reduce the problems considered to Sturm-Liouville systems which

fall under our earlier analysis. The question of the validity of the solutions obtained
is settled for the wave equation by appeal to earlier results on the convergence of
Fourier series. Various forms of Laplace's equation are then considered, and the
elegant theory of harmonic polynomials is introduced. Finally, cylindrical regions
make their appearance, and the Frobenius method is developed to the point where
Bessel's equation can be solved and orthogonal series involving Bessel functions
constructed.
The book ends with four appendices containing material which would have been
unduly disruptive in the body of the text. There we provide a discussion of point-

wise and uniform convergence which is sufficient for our needs, a brief treatment of
determinants, and a development of vector field theory to the point where unique-
ness theorems for boundary-value problems can be proved.

Having outlined what is in the book, a few words may be in order concerning
what is not. First, this is not a text in linear algebra. Thus, even though we do
present much of the material usually taught in a first course in linear algebra, a few
familiar topics have been omitted as unnecessary for the analysis we had in view.

Second, we have said nothing whatever about numerical approximations, finite


difference equations, and the like. Here our decision was guided by the belief that
this material properly belongs in a course on numerical analysis, and any attempt
to introduce it here would have resulted in an unwieldy book far too large to
appear decently in public. Finally, for similar reasons we have avoided all topics
which require a genuinely sophisticated use of operator theory, such as integral
equations and the Fourier transform and integral. Logically such material ought
to appear in a course following one based on a text such as this.
Given the modest level of preparation which we have assumed, we have made
every effort, particularly in the earlier chapters, to motivate what we do by slow
and careful explanations. Indeed, throughout the book we have been guided by
the feeling that it is better to err on the side of too much explanation than too little.

We have also tried to keep the discussion sharply in focus at all times by giving
formal definitions of new terminology and precise statements of results being
proved or used. For the most part, theorems stated in the text are proved on the
spot. Those which are not comprise results whose proofs were felt to be either too
difficult for a book at this level or unenlightening in view of our objectives. Such
statements are usually accompanied by a reference to a proof in the literature.
In its present form this book is sufficiently flexible to be used in one of several
courses. For instance, Chapters 1 through 6 plus parts of 7 and 15 provide mate-

rial for a combined course on (ordinary) differential equations and linear algebra

at the introductory level. On the other hand, Chapters 7 through 11 are logically
independent of everything which precedes them, save Chapter 1, and can be used
to give a course on series expansions and convergence in Euclidean spaces. By
omitting Chapter 10 and adding Chapter 12, the first few sections of Chapter 2,

and portions of Chapters 13 through 15, one obtains ample material for a one-se-
mester course in boundary-value problems suitable for students who are able to
solve elementary differential equations. Further, there is more than enough mate-
rial (though not exactly of the traditional sort) for several of those courses which

go under the name of "engineering mathematics." In fact, this book was written
primarily for such courses, and was motivated by the belief (or hope) that engineers
ultimately profit from mathematics courses only to the extent that these courses
present an honest treatment of the ideas involved.
For everyone. The internal reference system used in the text works as follows:
Items in a particular chapter are numbered consecutively as, for example, (3-1) to
(3-100). The first numeral refers to the chapter in question, the second to the num-
bered item within that chapter.
Throughout the book we have followed the popular device of indicating the end
of a formal proof by the mark | in the belief that students derive a certain comfort
from a clearly visible sign telling them how far they must go before they can relax.

As usual, sections marked with an asterisk may be omitted without courting dis-
aster. Everything so marked is either a digression which the authors had not the
strength to resist, or material of greater difficulty than that in the immediate vicinity.
As a gesture toward scholarly respectability, we have included a short bibliog-
raphy comprising those books which the authors personally found especially useful,
and for the convenience of those inveterate browsers of books we have prepared
an index of special symbols used in the text (see pp. xvi-xvii). Finally, a dia-
gram showing the logical interdependence of the various chapters appears after
the table of contents.
Debts and acknowledgements. Collectively and individually the authors are in-
debted to a large number of people who at long last can be publicly thanked:
First, the numerous students who have used portions of this material more or
less willingly at Dartmouth College and Indiana University over the past several

years, and whose comments have been far more valuable than they ever imagined.
Second, the surprisingly large number of professional colleagues whose advice
has been sought, sometimes unknowingly, and who have been more than gen-
erous in answering questions and furnishing criticism. In particular, special
thanks are due Professors H. Mirkil of Dartmouth College, G.-C. Rota of Mas-
sachusetts Institute of Technology, and M. Thompson of Indiana University, and
also Mr. L. Zalcman, presently at M.I.T.
And third, Mrs. Helen Hanchett of Hanover, New Hampshire, and Mrs.
Darlene Martin of Bloomington, Indiana, for their patience, good nature, and
unfailing accuracy in preparing typewritten versions of the manuscript too nu-
merous to count.
Thanks are also due, and hereby given, Dartmouth College for assistance ren-
dered in preparing a preliminary version of the manuscript and the Addison-
Wesley staff for seeing the book through press.
Lastly, thanks of a very special sort to our several wives for their constant sup-
port and encouragement as well as their equally constant insistence that we get on
with things and finish the job.
Conclusion. It seems to be one of the unfortunate facts of life that no mathe-
matics book can be published free of errors. Since the present book is undoubtedly
no exception, each of the authors would like to apologize in advance for any that
still remain and take this opportunity to state publicly that they are the fault of the
other three.

January, 1966 D. L. K.
R.G. K.
D. R. O.
F. W. P.
logical interdependence of chapters

[Diagram: boxes for 1 Vector Spaces; 2 Linear Transformations; 7 Euclidean Spaces; 3 Linear Differential Equations; 8 Convergence in Euclidean Spaces; 4 Equations with Constant Coefficients; 5 Laplace Transform; 9 Fourier Series; 10 Convergence of Fourier Series; 12 Boundary-Value Problems; 6 Linear Differential Equations (further topics); 11 Series of Polynomials; 13 Wave and Heat Equations; 14 Laplace Equation; 15 Bessel Functions. Arrows indicate which chapters presuppose which.]


contents

1 REAL VECTOR SPACES

1-1 Introduction 1
1-2 Real vector spaces 6
1-3 Elementary observations 10
1-4 Subspaces 12
1-5 Linear dependence and independence; bases 18
1-6 Coordinate systems 26
1-7 Dimension 28
1-8 Geometric vectors 32
1-9 Equivalence relations 37

2 LINEAR TRANSFORMATIONS AND MATRICES

2-1 Linear transformations 41
2-2 Addition and scalar multiplication of transformations 45
2-3 Products of linear transformations 48
2-4 The null space and image; inverses 55
2-5 Linear transformations and bases 61
2-6 Matrices 64
2-7 Addition and scalar multiplication of matrices 69
2-8 Matrix multiplication 74
2-9 Operator equations 80

3 THE GENERAL THEORY OF LINEAR DIFFERENTIAL EQUATIONS

3-1 Linear differential operators 86
3-2 Linear differential equations 91
3-3 First-order equations 95
3-4 Existence and uniqueness of solutions; initial-value problems 102
3-5 Dimension of the solution space 106
3-6 The Wronskian 111
3-7 Abel's formula 117
3-8 The equation y″ + y = 0 121

4 EQUATIONS WITH CONSTANT COEFFICIENTS


4-1 Introduction 126
4-2 Homogeneous equations of order two 127
4-3 Homogeneous equations of arbitrary order 132
4-4 Nonhomogeneous equations: variation of parameters and Green's functions 138
4-5 Variation of parameters; Green's functions (continued) 145
4-6 Reduction of order 154
4-7 The method of undetermined coefficients 157
4-8 The Euler equation 161
4-9 Elementary applications 166
4-10 Simple electrical circuits 170

5 THE LAPLACE TRANSFORM

5-1 Introduction 177
5-2 Definition of the Laplace transform 179
5-3 The Laplace transform as a linear transformation 183
5-4 Elementary formulas 186
5-5 Further properties of the Laplace transform 193
5-6 The Laplace transform and differential equations 202
5-7 The convolution theorem 206
5-8 Green's functions for constant coefficient linear differential operators 212
5-9 The vibrating spring; impulse functions 218

6 FURTHER TOPICS IN THE THEORY OF LINEAR DIFFERENTIAL EQUATIONS


6-1 The separation and comparison theorems 231
6-2 The zeros of solutions of Bessel's equation 234
6-3 Self-adjoint form; the Sonin-Polya theorem 237
6-4 Power series and analytic functions 241
6-5 Analytic solutions of linear differential equations 243
6-6 Further examples 250

7 EUCLIDEAN SPACES

7-1 Inner products 256


7-2 Length, angular measure, distance 261
7-3 Orthogonality 268
7-4 Orthogonalization 273
7-5 Perpendicular projections; distance to a subspace 281
7-6 The method of least squares 290
7-7 An application to linear differential equations 298

8 CONVERGENCE IN EUCLIDEAN SPACES

8-1 Sequential convergence 304


8-2 Sequences and series 310
8-3 Bases in infinite dimensional Euclidean spaces 313
8-4 Bessel's inequality; Parseval's equality 319
8-5 Closed subspaces 322

9 FOURIER SERIES

9-1 Introduction 329


9-2 The space of piecewise continuous functions 329
9-3 Even and odd functions 334
9-4 Fourier series 336
9-5 Sine and cosine series 349
9-6 Change of interval 355
9-7 The basis theorem 360
9-8 Orthogonal series in two variables 364

10 CONVERGENCE OF FOURIER SERIES

10-1 Introduction 371


10-2 The Riemann-Lebesgue lemma 371
10-3 Pointwise convergence of Fourier series 373
10-4 Uniform convergence of Fourier series 380
10-5 The Gibbs phenomenon 387
10-6 Differentiation and integration of Fourier series 391
10-7 Summability of Fourier series; Fejér's theorem 396
10-8 The Weierstrass approximation theorem 405

11 ORTHOGONAL SERIES OF POLYNOMIALS

11-1 Introduction 409


11-2 Legendre polynomials . 409
11-3 Orthogonality: the recurrence relation 412
11-4 Legendre series 421
11-5 Convergence of Legendre series 425
11-6 Hermite polynomials 434
11-7 Laguerre polynomials 443
11-8 Generating functions 447

12 BOUNDARY-VALUE PROBLEMS FOR ORDINARY DIFFERENTIAL EQUATIONS

12-1 Definitions and examples 457
12-2 Eigenvalues and eigenvectors 461
12-3 Eigenvectors in finite dimensional spaces 465
12-4 Symmetric linear transformations 471
12-5 Self-adjoint differential operators; Sturm-Liouville problems 476
12-6 Further examples 480

12-7 Boundary-value problems and series expansions 484


12-8 Orthogonality and weight functions 488
12-9 Green's functions for boundary-value problems: an example 491
12-10 Green's functions for boundary-value problems: unmixed boundary
conditions 496
12-11 Green's functions: a proof of the main theorem 500

13 BOUNDARY-VALUE PROBLEMS FOR PARTIAL DIFFERENTIAL EQUATIONS:


THE WAVE AND HEAT EQUATIONS

13-1 Introduction 505


13-2 Partial differential equations 505
13-3 The classical partial differential equations 508
13-4 Separation of variables: the one-dimensional wave equation 516
13-5 The wave equation; validity of the solution 522
13-6 The one-dimensional heat equation 528
13-7 The two-dimensional heat equation; biorthogonal series 532
13-8 The Schrödinger wave equation 536
13-9 The heat equation; validity of the solution 538

14 BOUNDARY-VALUE PROBLEMS FOR LAPLACE'S EQUATION

14-1 Introduction 546


14-2 Laplace's equation in rectangular regions 548
14-3 Laplace's equation in a circular region; the Poisson integral 553
14-4 Laplace's equation in a sphere; solutions independent of θ 558
14-5 Laplace's equation; spherical harmonics 564
14-6 Orthogonality of the spherical harmonics; Laplace series 568
14-7 Harmonic polynomials and the basis theorem 573

15 BOUNDARY-VALUE PROBLEMS INVOLVING BESSEL FUNCTIONS

15-1 Introduction 582


15-2 Regular singular points 583
15-3 Examples of solutions about a regular singular point 586
15-4 Solutions about a regular singular point; the general case 590
15-5 Solutions about a regular singular point: the exceptional cases 594
15-6 Bessel's equation 597
15-7 Properties of Bessel functions 606
15-8 The generating function 612
15-9 Sturm-Liouville problems for Bessel's equation 617
15-10 Bessel series of the first and second kinds 621
15-11 Laplace's equation in cylindrical coordinates 626
15-12 The vibrating circular membrane 631

APPENDIX I INFINITE SERIES

I-1 Introduction 637
I-2 Sequential convergence 638
I-3 Infinite series 642
I-4 Absolute convergence 648
I-5 Basic notions from elementary calculus 652
I-6 Sequences and series of functions 656
I-7 Power series 661
I-8 Taylor series 666
I-9 Functions defined by integrals 669

APPENDIX II LERCH'S THEOREM 678

APPENDIX III DETERMINANTS

III-1 Introduction 680
III-2 Basic properties of determinants 682
III-3 Minors and cofactors 687
III-4 Summary and examples 690
III-5 Multiplication of determinants 696

APPENDIX IV UNIQUENESS THEOREMS

IV-1 Surfaces in ℝ³; surface area 700
IV-2 Surface integrals of vector fields 707
IV-3 The divergence theorem 713
IV-4 Boundary-value problems revisited: uniqueness theorems 718

RECOMMENDATIONS FOR FURTHER READING 723

ANSWERS TO ODD-NUMBERED EXERCISES 725

INDEX 767
real vector spaces

1-1 INTRODUCTION
The Cartesian plane of analytic geometry, denoted by ℝ², is one of the most
familiar examples of what is known in mathematics as a real vector space. Each
of its points, or vectors, is an ordered pair (x₁, x₂) of real numbers whose individual
entries, x₁ and x₂, are called the components of that vector. Geometrically, the
vector x = (x₁, x₂) may be represented by means of an arrow drawn from the
origin of coordinates to the point (x₁, x₂), as shown in Fig. 1-1.*

[Figure 1-1: the vector x = (x₁, x₂) drawn as an arrow from the origin. Figure 1-2: the parallelogram law for vector addition.]

If x = (x₁, x₂) and y = (y₁, y₂) are any two vectors in ℝ², then, by definition,
their sum is the vector

x + y = (x₁ + y₁, x₂ + y₂)    (1-1)

obtained by adding the corresponding components of x and y. The graphical
interpretation of this addition is the familiar "parallelogram law," which states
that the vector x + y is the diagonal of the parallelogram formed from x and y
(see Fig. 1-2). It follows at once from (1-1) that vector addition is both associative

* Throughout this book we shall use boldface type (i.e., x, y, . . .) to denote vectors.

and commutative, namely,

x + (y + z) = (x + y) + z, (1-2)

x + y = y + x. (1-3)

Moreover, if we let 0 denote the vector (0, 0), and −x the vector (−x₁, −x₂)
obtained by reflecting x = (x₁, x₂) across the origin, then

x + 0 = x,    (1-4)
and
x + (−x) = 0    (1-5)

for every x. Taken together, Eqs. (1-2) through (1-5) imply that vector addition
behaves very much like the ordinary addition of arithmetic.
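The componentwise rules above are easy to check mechanically. The following short Python sketch (the function names are ours, not the text's) models vectors in ℝ² as pairs and verifies properties (1-2) through (1-5) on sample vectors.

```python
# Vectors in R^2 modeled as pairs (x1, x2); names are illustrative only.

def add(x, y):
    """Componentwise sum, Eq. (1-1)."""
    return (x[0] + y[0], x[1] + y[1])

def neg(x):
    """Reflection across the origin: -x = (-x1, -x2)."""
    return (-x[0], -x[1])

ZERO = (0, 0)

x, y, z = (1, 2), (3, -1), (-2, 5)

# (1-2) associativity and (1-3) commutativity
assert add(x, add(y, z)) == add(add(x, y), z)
assert add(x, y) == add(y, x)

# (1-4) and (1-5): 0 is the additive identity, -x the additive inverse
assert add(x, ZERO) == x
assert add(x, neg(x)) == ZERO
```

Any other sample vectors would serve equally well; the identities hold because addition of real numbers is itself associative and commutative, applied in each component.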
As well as being able to add vectors in ℝ², we can also form the product of a
real number α and a vector x. The result, denoted αx, is the vector each of whose
components is α times the corresponding component of x. Thus, if x = (x₁, x₂),
then

αx = (αx₁, αx₂).    (1-6)

Geometrically, this vector can be viewed as a "magnification" of x by the factor α,
as illustrated in Fig. 1-3.
The principal algebraic properties of this multiplication are the following:

α(x + y) = αx + αy,    (1-7)
(α + β)x = αx + βx,    (1-8)
(αβ)x = α(βx),    (1-9)
1x = x.    (1-10)

[Figure 1-3: the vectors 2x = (2x₁, 2x₂) and (−1)x = (−x₁, −x₂), shown as magnifications of x.]

The validity of each of these equations can be deduced easily from the definition
of the operations involved, and save for (1-9), which we prove by way of illustra-
tion, the equations are left for the student to verify. To establish (1-9), let
x = (x₁, x₂) be an arbitrary vector in ℝ², and let α and β be real numbers. Then
by repeated use of (1-6), we have

(αβ)x = ((αβ)x₁, (αβ)x₂)
      = (α(βx₁), α(βx₂))
      = α(βx₁, βx₂)
      = α(β(x₁, x₂))
      = α(βx),

which is what we wished to show.
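The same four properties can also be spot-checked numerically. Here is a minimal Python sketch (again with names of our own choosing); the sample scalars are dyadic rationals, so every check below is exact in floating point.

```python
# Scalar multiplication in R^2, Eq. (1-6); a hypothetical helper, not from the text.

def scale(a, x):
    """alpha * x = (alpha*x1, alpha*x2)."""
    return (a * x[0], a * x[1])

def add(x, y):
    return (x[0] + y[0], x[1] + y[1])

x, y = (2, -3), (1, 4)
a, b = 2.0, -1.5

assert scale(a, add(x, y)) == add(scale(a, x), scale(a, y))   # (1-7)
assert scale(a + b, x) == add(scale(a, x), scale(b, x))       # (1-8)
assert scale(a * b, x) == scale(a, scale(b, x))               # (1-9), proved above
assert scale(1, x) == x                                       # (1-10)
```

A numerical check of course proves nothing for arbitrary α, β, x; that is what the componentwise argument in the text accomplishes.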



The reason for calling attention to Eqs. (1-7) through (1-10) is that they,
together with properties (1-2) through (1-5) for vector addition, are precisely
what make ℝ² a real vector space. Indeed, these equations are none other than the
axioms in the general definition of such a space, and once this definition has been
given, the above discussion constitutes a verification of the fact that ℝ² is a real
vector space. But before giving this definition, we pause to look at another
example.
This time we consider the set C[a, b] consisting of all real valued, continuous
functions defined on a closed interval [a, b] of the real line.* For reasons which
will shortly become clear we shall call any such function a vector, and, following
our general convention, write it in boldface type. Thus f is a vector in C[a, b] if and
only if f is a real valued, continuous function on the interval [a, b].
At first sight it may seem that C[a, b] and ℝ² have nothing in common but
the name "real vector space." However, this is one of those instances in which
first impressions are misleading, for as we shall see, these spaces are remarkably
similar. This similarity arises from the fact that an addition and multiplication
by real numbers can also be defined in C[a, b] and that these operations enjoy
the same properties as the corresponding operations in ℝ².

Turning first to addition, let f and g be any two vectors in C[a, b]. Then their
sum, f + g, is defined to be the function (i.e., vector) whose value at any point x in
[a, b] is the sum of the values of f and g at x (see Fig. 1-4). In other words,

(f + g)(x) = f(x) + g(x).    (1-11)

At this point it is important to observe that since the sum of two continuous
functions is continuous, this definition is meaningful in the sense that f + g is
again a vector in C[a, b].†

[Figure 1-4: the graph of f + g obtained by adding the graphs of f and g.]

It is now easy to verify that apart from notation and interpretation Eqs. (1-2)
through (1-5) remain valid in C[a, b]. In fact,

f + (g + h) = (f + g) + h    (1-12)
and
f + g = g + f    (1-13)

follow immediately from (1-11), while if 0 denotes the function whose value is

* The closed interval [a, b] is the set of all real numbers x such that a ≤ x ≤ b; i.e.,
[a, b] is the interval from a to b, end points included. By contrast, if the end points are
not included in the interval, we speak of the open interval from a to b, and write (a, b).
† For a proof of this fact see, for example, C. B. Morrey, Jr., University Calculus with
Analytic Geometry, Addison-Wesley, 1962, p. 89.
[Figure 1-5: the zero function 0. Figure 1-6: f and its reflection −f.]

zero at each point of [a, b], then

f + 0 = f    (1-14)

for every f in C[a, b] (Fig. 1-5). Finally, if −f is the function whose value at x is
−f(x) (i.e., −f is the reflection of f across 0, as shown in Fig. 1-6), then f + (−f)
has the value zero at each point of [a, b], and we have

f + (−f) = 0.    (1-15)

We have seen that the sum of two vectors in ℝ² is found by adding their cor-
responding components (Eq. 1-1). A similar interpretation of vector addition is
possible in the present example and may be achieved as follows. If f is any vector
in C[a, b], we agree to say that the "component" of f at the point x is its functional
value at x. Of course, every vector in C[a, b] then has infinitely many components,
one for each x in the interval [a, b], but once this fact has been accepted it becomes
clear that Eq. (1-11) simply states that the sum of two vectors in C[a, b] is obtained
by adding corresponding "components," just as in ℝ².
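This "infinitely many components" point of view suggests modeling vectors in C[a, b] as functions and letting the operations act pointwise. A minimal Python sketch (our own notation, not the book's):

```python
# Functions on [a, b] as vectors: the "component" of f at x is f(x),
# so vector operations act pointwise. Helper names are hypothetical.

def f_add(f, g):
    """Pointwise sum, Eq. (1-11): (f + g)(x) = f(x) + g(x)."""
    return lambda x: f(x) + g(x)

def f_neg(f):
    """Pointwise negation: (-f)(x) = -f(x)."""
    return lambda x: -f(x)

zero = lambda x: 0   # the zero vector of Eq. (1-14)

f = lambda x: x * x
g = lambda x: 2 * x + 1

h = f_add(f, g)
for x in (-1.0, 0.0, 0.5, 1.0):
    assert h(x) == f(x) + g(x)               # components add, as in R^2
    assert f_add(f, zero)(x) == f(x)         # Eq. (1-14)
    assert f_add(f, f_neg(f))(x) == zero(x)  # Eq. (1-15)
```

The only genuinely new point the code cannot capture is the closure argument: f + g must again be continuous, which is what the text verifies above.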

Next, if f is any vector in C[a, b] and α is an arbitrary real number, we define the
product αf to be the vector in C[a, b] whose defining equation is

(αf)(x) = αf(x).    (1-16)

In other words, αf is the function whose value at x is the product of the real num-
bers α and f(x). The similarity between this multiplication and the corresponding
operation in ℝ² is clear, since αf is also formed by multiplying each "component"
of f by α. (Figure 1-7 illustrates this multiplication when α = 2.)

The analogy with R² is now complete, for Eqs. (1-7) through (1-10) are also valid in C[a, b]. We restate them here as

α(f + g) = αf + αg,  (1-17)
(α + β)f = αf + βf,  (1-18)
(αβ)f = α(βf),  (1-19)
1f = f,  (1-20)

and leave their proofs as an exercise.


This is perhaps an appropriate point at which to warn the unsuspecting reader that the space C[a, b] is much more than an idle example. Indeed, a great deal of our later work will be devoted to a study of this and similar vector spaces, and the student will do well to understand it before reading further.
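Since the vectors of C[a, b] are just functions, the pointwise operations (1-11) and (1-16) are easy to model on a computer. The Python sketch below (the helper names vec_add, vec_scale, and zero are ours, not the text's) spot-checks the identities (1-14) and (1-15) at a few points; such a numerical check illustrates, but of course does not prove, the vector-space axioms.

```python
import math

# Vectors in C[a, b] modeled as Python functions of one real variable.
# vec_add and vec_scale are illustrative helper names, not from the text.

def vec_add(f, g):
    """Pointwise sum (1-11): (f + g)(x) = f(x) + g(x)."""
    return lambda x: f(x) + g(x)

def vec_scale(a, f):
    """Pointwise scalar multiple (1-16): (a f)(x) = a * f(x)."""
    return lambda x: a * f(x)

zero = lambda x: 0.0   # the zero vector of (1-14)

f = math.sin
g = math.exp
h = vec_add(f, g)

for x in (0.0, 0.5, 1.0):
    # the "component" of f + g at x is the sum of the components
    assert abs(h(x) - (math.sin(x) + math.exp(x))) < 1e-12
    # f + 0 = f (1-14), and f + (-1)f = 0 (1-15)
    assert vec_add(f, zero)(x) == f(x)
    assert abs(vec_add(f, vec_scale(-1.0, f))(x)) < 1e-12
```

The design mirrors the text's "infinitely many components" picture: a function is stored not as a list of components but as a rule producing the component at any x.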

EXERCISES

In Exercises 1 through 5 compute the value of x + y and α(x + y) for the given vectors x and y in R² and real number α. Illustrate each of your computations with an appropriate diagram.

1. x = (0, 2), y = (−1, 1), α = 3
2. x = (½, 1), y = (1, −2), α = −2
3. x = (−½, ½), y = (−2, −1), α = −1
4. x = (5, −2), y = (−3, 2), α = ½
5. x = (−5, −2), y = (−1, −1), α = −3

In Exercises 6 through 10 compute the value of f + g and α(f + g) for the given vectors f and g in C[−1, 1] and real number α. Illustrate each of your computations with an appropriate diagram.
6. f(x) = 2x, g(x) = x² − x + ½, α = 2
7. f(x) = tan² x, g(x) = 1, α = −1
8. f(x) = eˣ, g(x) = e⁻ˣ, α = ½

9. f(x) = (x + 3)/(x − 2), g(x) = (x − 2)/(x + 3), α = 1/5

10. f(x) = cos² x, g(x) = sin² x, α = −3

In Exercises 11 through 20 determine whether or not the indicated function belongs to C[−1, 1]; if it does not, state why not.

11. sin x/(x + 1)
12. f(x) = { x if x ≥ 0, −x if x < 0 } (i.e., |x|)
13. tan x
14. tan (2x + 1)
15. ln |x|
16. 1/(x² − 3x + 2)

17. f(x) = { 1 if x ≥ 0, −1 if x < 0 }
18. f(x) = { x − 1 if x ≥ 0, x + 1 if x < 0 }
Solve each of the following vector equations for x.
21. 2x + 3(2, 1) = (0, 0)   22. 4x + 3x = 2(−1, 3)
23. (½ + ¾)x + 5(½(3, 4)) = ½(2, −1)   24. x − 3(2, 1) = 4x
Solve each of the following vector equations in C[−1, 1] for f.

25. 2f − 1 = sec² x   26. ln f = 1
27. e^f = x + 2   28. 2f + e⁻ˣ = eˣ

29. Prove Eqs. (1-2) through (1-5) and (1-7) through (1-10) of the text, and illustrate each with a diagram.

30. Prove Eqs. (1-12) through (1-15) and (1-17) through (1-20) of the text.

31. Let x be a vector in R², and let α be a real number. Prove that αx = 0 if and only if α = 0 or x = 0.

32. Let x ≠ 0 be a vector in R², and suppose αx = x. Prove that α = 1.

1-2 REAL VECTOR SPACES

With the examples of the preceding section in mind we now give the definition
of a real vector space.

Definition 1-1. A real vector space V is a collection of objects called vectors, together with operations of addition and multiplication by real numbers which satisfy the following axioms.

Axioms for addition. Given any pair of vectors x and y in V there exists a (unique) vector x + y in V called the sum of x and y. It is required that

(i) addition be associative,
x + (y + z) = (x + y) + z;

(ii) addition be commutative,
x + y = y + x;

(iii) there exist a vector 0 in V (called the zero vector) such that
x + 0 = x
for all x in V; and

(iv) for each x in V there exist a vector −x in V such that
x + (−x) = 0.

Axioms for scalar multiplication. Given any vector x in V and any real number α there exists a (unique) vector αx in V called the product, or scalar product, of α and x. It is required that

(v) α(x + y) = αx + αy,
(vi) (α + β)x = αx + βx,
(vii) (αβ)x = α(βx),
(viii) 1x = x.

The student should not be discouraged by the formality of this definition; it looks much worse than it really is. The issue here is simply that in order to deserve the name, a real vector space must have a number of elementary and eminently reasonable properties in common with R². We have already seen this happen in the case of the function space C[a, b], and before embarking on the general study of this subject we give several additional examples of such variety as to convince the reader that real vector spaces are very common objects indeed in mathematics. Still others will be found in the exercises at the end of this section. In each case we leave the verification of Axioms (i) through (viii) as an exercise to aid the beginner in assimilating the various requirements of the definition.

Example 1. The real numbers, with the ordinary definitions of addition and multiplication, form a real vector space. In this case Axioms (i) through (viii) are all familiar statements from arithmetic.

Example 2. Let n be a fixed positive integer, and let Rⁿ denote the totality of ordered n-tuples (x₁, ..., xₙ) of real numbers. If x = (x₁, ..., xₙ) and y = (y₁, ..., yₙ) are two such n-tuples, and α is a real number, set

x + y = (x₁ + y₁, ..., xₙ + yₙ)  (1-21)

and

αx = (αx₁, ..., αxₙ).  (1-22)

Then Rⁿ becomes a real vector space. It is clear that R¹ is the vector space of Example 1 above, and that R² and R³ are the vector spaces studied in analytic geometry.
Note that there is nothing in the least mysterious in the fact that n is allowed to assume values greater than three in this example. It is true, of course, that pictures of the usual sort cannot then be drawn, but this is no serious shortcoming. In fact, geometric intuition from 3-space, if used circumspectly, is still reasonably accurate in Rⁿ when n > 3, despite the lack of a visual representation for these spaces.
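The componentwise operations (1-21) and (1-22) can be sketched directly in code; here tuples stand in for n-tuples, and the helper names add and scale are ours, not the text's.

```python
# A minimal sketch of the R^n operations (1-21) and (1-22).

def add(x, y):
    """x + y = (x1 + y1, ..., xn + yn)   (1-21)"""
    return tuple(xi + yi for xi, yi in zip(x, y))

def scale(a, x):
    """a x = (a x1, ..., a xn)           (1-22)"""
    return tuple(a * xi for xi in x)

x = (1.0, 2.0, 3.0, 4.0)   # a vector in R^4: n may exceed 3
y = (0.0, -2.0, 1.0, 5.0)

assert add(x, y) == (1.0, 0.0, 4.0, 9.0)
assert scale(2.0, x) == (2.0, 4.0, 6.0, 8.0)
# commutativity (1-13) holds componentwise because real addition commutes
assert add(x, y) == add(y, x)
```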

Example 3. Let P denote the set of all polynomials in x with real coefficients, and let polynomial addition and multiplication by real numbers be defined as in high school algebra. Then P is a real vector space.

This completes our list of basic examples. As we shall see, each of the real vector spaces C[a, b], Rⁿ, and P has distinctive features not shared by either of the others, and for this reason these three spaces will occupy a special place in our later work. And since we have already mentioned the future importance of C[a, b], it is only fair to add that Rⁿ and P will in turn receive a great deal of attention.
Before we continue, a few general remarks concerning the definition of a real vector space are in order. First, since vector addition is associative, any finite sum of the form

x₁ + x₂ + ⋯ + xₙ

is well defined as it stands and need not be festooned with parentheses to avoid ambiguity. Moreover, the commutativity of vector addition implies that the value of such a sum does not depend upon the order of the summands.
Secondly, in the interest of simplicity we shall write x − y from now on in place of the unwieldy expression x + (−y). This, of course, involves the usual change in terminology, for we then say that x − y is obtained by subtracting y from x.
And finally, we call attention to the reasonably obvious fact that our insistence upon using real numbers in the definition of a vector space is quite unnecessary. Complex numbers, for instance, would do just as well, in which case we would have what is known as a complex vector space. Even further generalizations are, unsurprisingly, possible. However, these more general vector spaces will not appear in this book, and so we agree that the term "vector space" will always mean real vector space, as defined above. Furthermore, the term scalar will be used now and again as a synonym for the words "real number" in consonance with the standard terminology of the subject.

EXERCISES

1. With addition and scalar multiplication as defined in the text, prove that Rⁿ is a real vector space.

2. Prove that P is a real vector space. (Recall that the vectors in P may be written in the form

a₀ + a₁x + ⋯ + aₙxⁿ,

where a₀, ..., aₙ are real numbers.)


3. Find the value of α₁x₁ + α₂x₂ + α₃x₃ in R² when
(a) x₁ = (2, 1), x₂ = (−1, 2), x₃ = (1, 0), α₁ = 1, α₂ = 2, α₃ = −1;
(b) x₁ = (½, 3), x₂ = (−½, ½), x₃ = (−1, 2), α₁ = 2, α₂ = 9, α₃ = −2;
(c) x₁ = (−2, 2), x₂ = (−⅓, −3), x₃ = (⅔, −¼), α₁ = ½, α₂ = ⅔, α₃ = 2.


4. Find the value of α₁x₁ + α₂x₂ + α₃x₃ in R⁴ when
(a) x₁ = (2, 0, −1, 3), x₂ = (1, −2, 2, 0), x₃ = (2, −1, 3, 1), α₁ = −1, α₂ = 0, α₃ = 1;
(b) x₁ = (1, 2, 3, 4), x₂ = (0, 2, 1, −2), x₃ = (2, 6, 7, 6), α₁ = 2, α₂ = 1, α₃ = −1;
(c) x₁ = (−½, 2, 1, −3), x₂ = (1, −2, ½, 5), x₃ = (⅔, 0, 0, ½), α₁ = −2, α₂ = ½, α₃ = −1.
5. Find the value of α₁p₁ + α₂p₂ + α₃p₃ in P when
(a) p₁(x) = x² − x + 1, α₁ = 2;
p₂(x) = 3x³ + 2x − 1, α₂ = −1;
p₃(x) = −x³ + 2x, α₃ = −2;
(b) p₁(x) = 2x⁴ − 4x² + 1, α₁ = ½;
p₂(x) = −x⁴ + 2x³ + x² − x + 2, α₂ = 2;
p₃(x) = −x⁴ + 4x³ − 2x − ⅔, α₃ = −1;
(c) p₁(x) = ⅓x³ − 2x² + ½, α₁ = 3;
p₂(x) = 2x − 1, α₂ = −½;
p₃(x) = x², α₃ = 2.

6. Does the set of all polynomials in x with integral coefficients form a real vector space
with the usual definitions of addition and multiplication by real numbers? Why?
7. Let C denote the set of all complex numbers, i.e., all numbers of the form a + bi, a and b real, and i = √−1. With addition of complex numbers defined by

(a + bi) + (c + di) = (a + c) + (b + d)i,

and scalar multiplication defined by

α(a + bi) = αa + αbi,

prove that C is a real vector space. Compare this space with R².

8. Let R^∞ be the set consisting of all infinite sequences

x = (x₁, x₂, ...)

of real numbers. If y = (y₁, y₂, ...) is another such sequence, define x + y by

x + y = (x₁ + y₁, x₂ + y₂, ...);

while if α is a real number, define αx by

αx = (αx₁, αx₂, ...).

Prove that R^∞ is a real vector space.


9. With addition and scalar multiplication defined as in Q[a, b], determine which of the
following sets of functions is a real vector space
(a) all functions which are continuous everywhere on [a, b] except at a finite number
of points
(b) all functions which are zero everywhere on [a, b] except at a finite number of
points;
.


(c) all functions which are different from zero at all but a finite number of points
of [a, b];

(d) all continuous functions such that f(a) = f(b);

(e) all continuous functions such that f(a + x) = f(b − x);

(f) all continuous functions which are zero on some closed subinterval of [a, b];

(g) all continuous functions which are zero on a fixed subinterval of [a, b].

10. Let R⁺ be the set consisting of all positive real numbers, and define "addition" and "scalar multiplication" in R⁺ as follows: If x and y belong to R⁺, let

x + y = xy,

where the product appearing on the right-hand side of this equality is the ordinary product of the real numbers x and y; if α is an arbitrary real number, let

αx = x^α.

Prove that R⁺ is then a real vector space.
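The exotic operations of Exercise 10 can be spot-checked numerically before attempting the proof. In this sketch "addition" is ordinary multiplication, and the scalar multiplication is assumed to be α·x = x^α (the standard choice for this example; the exercise's own definition is illegible in this copy). A few numerical checks illustrate the axioms but prove nothing.

```python
# Numerical sanity check (not a proof) of the operations of Exercise 10.
# Assumption: scalar multiplication is a . x = x**a.

def add(x, y):      # x "+" y = xy
    return x * y

def scale(a, x):    # a "." x = x^a
    return x ** a

x, y, a, b = 2.0, 5.0, 3.0, -0.5
close = lambda u, v: abs(u - v) < 1e-9

assert close(add(x, y), add(y, x))                    # commutativity
assert close(add(x, 1.0), x)                          # 1 acts as the zero vector
assert close(add(x, 1.0 / x), 1.0)                    # 1/x acts as -x
assert close(scale(a, add(x, y)),
             add(scale(a, x), scale(a, y)))           # (xy)^a = x^a y^a
assert close(scale(a + b, x),
             add(scale(a, x), scale(b, x)))           # x^(a+b) = x^a x^b
assert close(scale(a * b, x), scale(a, scale(b, x)))  # (x^b)^a = x^(ab)
assert close(scale(1.0, x), x)                        # 1x = x
```

Note how each axiom translates into a familiar law of exponents; that translation is the heart of the requested proof.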


11. Let V be the set of all ordered pairs of real numbers. If x = (x₁, x₂) and y = (y₁, y₂) are any two elements in V, define

x + y = (x₁ + y₁, 0),

and let scalar multiplication be defined as in R². Is V a real vector space? Why?


12. Repeat Exercise 11, this time with addition defined as in R² and scalar multiplication changed to

αx = (x₁, 0).

13. Use the associativity of vector addition to prove that

(w + x + y) + z = w + (x + y + z).

14. Is the operation of subtraction in a vector space associative or commutative?

1-3 ELEMENTARY OBSERVATIONS


In this sectionwe note a number of immediate consequences of Definition 1-1,
allof which are so elementary that they will be used without explicit mention in
the future. The first of these concerns the zero vector and asserts that this vector
behaves very much as one might expect. Specifically,

Ox = for every x, (1-23)


and
<x0 = for every a. (1-24)

To prove the first of these assertions set α = β = 0 in αx + βx = (α + β)x. This gives

0x + 0x = (0 + 0)x = 0x.

Now subtract 0x from both sides of this equation, and then use the fact that 0x − 0x = 0 to obtain

0x + (0x − 0x) = 0x − 0x
and
0x + 0 = 0.

Hence 0x = 0.
The proof of (1-24) is similar; this time set x = y = 0 in α(x + y) = αx + αy.

Next, an equally elementary proof establishes the fact that the vectors −x and (−1)x are one and the same. Indeed, since 1x = x and 0x = 0, we have

x + (−1)x = 1x + (−1)x = (1 − 1)x = 0x = 0.

Now subtract x from both sides of this equation to obtain (−1)x = −x, as asserted.
Finally, it may be of interest to prove that in any vector space V the vector 0 is unique and −x is uniquely determined by x, in the sense that they are the only vectors in V possessing their particular properties. This is the burden of

Lemma 1-1. If 0′ is a vector in V such that x + 0′ = x for every x in V, then 0′ = 0. Similarly, if x′ is any vector in V such that x + x′ = 0, then x′ = −x.

Proof. If x + 0′ = x for every x in V, we have, in particular,

0 + 0′ = 0.

On the other hand, the zero vector has the property that 0 + x = x for every x. Hence

0 + 0′ = 0′,

and it follows that

0 = 0′.

The second statement of the lemma follows from the sequence of equalities

x′ = 0 + x′ = (−x + x) + x′ = −x + (x + x′) = −x + 0 = −x. ∎

EXERCISES*

1. Prove that αx = 0 implies that α = 0 or x = 0. [This is the converse of (1-23) and (1-24).]

2. Prove that αx = βx if and only if either x = 0 or α = β.

3. Establish the equality (−α)x = −(αx) for any α and any x.

* Exercises marked with an asterisk are intended to be somewhat more challenging than the rest.


*4. Prove that the commutativity of addition in a vector space is a consequence of the remaining axioms, as follows:
(a) Use Axioms (i), (iii), and (iv) to prove that x + z = y + z implies x = y. (This result is sometimes known as the cancellation law of vector addition.)
(b) Justify each step in the following sequence of equalities:

(0 + x) + (−x) = 0 + [x + (−x)] = 0 + 0 = 0 = x + (−x).

Now apply (a) to show that 0 + x = x for all x in V.
(c) With the aid of the result just proved, justify each step in the following sequence of equalities:

[(−x) + x] + (−x) = (−x) + [x + (−x)] = −x + 0 = −x = 0 + (−x).

Now apply (a) to show that −x + x = 0 for all x in V.
(d) Use (b) and (c) to prove that z + x = z + y implies x = y.
(e) Use Axioms (v) and (vi) of Definition 1-1 to expand (1 + 1)(x + y) in two ways, and then apply (a) and (d) to deduce that x + y = y + x for all x and y.

1-4 SUBSPACES
Now that we are well armed with examples, we begin the systematic study of
real vector spaces. To do so, we introduce the important notion of a subspace of
a vector space.

Definition 1-2. A subset W of a vector space V is said to be a subspace of V if W itself is a vector space under the operations of addition and scalar multiplication defined in V.†

Before giving examples, we consider the problem of determining when a given


collection of vectors actually is a subspace. One way of doing this, of course, is

to verify that the set of vectors in question satisfies all of the requirements of
Definition 1-1. However, the step by step verification of the axioms in this defi-
nition is both time consuming and tedious, and we now show how this procedure
can be substantially shortened. Specifically, we establish the following:

Subspace Criterion. If every vector of the form

α₁x₁ + α₂x₂  (1-25)

belongs to W whenever x₁ and x₂ belong to W, and α₁ and α₂ are arbitrary scalars, then W is a subspace of V.

† The term "subset" as used here and in similar contexts in the future means that every vector in W also belongs to V. Moreover, we always assume that there is at least one vector in W, a fact which is sometimes expressed by saying that W is nonempty.

To prove this assertion, we must show that W satisfies Definition 1-1. But by first setting α₁ = α₂ = 1 in (1-25), and then setting α₁ = α, α₂ = 0, we deduce in turn that (a) the sum of any two vectors in W again belongs to W, and (b) αx belongs to W for every real number α and every x in W.*

From (b) it follows in particular that −x belongs to W whenever x does, and that W also contains the zero vector. Thus W satisfies Axioms (iii) and (iv) of Definition 1-1. Finally, we observe that the remaining axioms certainly hold in W, since they are valid everywhere in V. Hence W is a subspace of V. ∎
Example 1. Every vector space has two subspaces: (a) the whole space, and (b) the subspace consisting of the zero vector by itself, called the trivial subspace. A subspace of V which is distinct from V is called a proper subspace.

Example 2. If W is the subset of R³ consisting of all those vectors whose third component is zero, then the above criterion implies at once that W is a subspace of R³. When the components of each vector in R³ are viewed as its ordinary x, y, z-components, then W is just the (x, y)-plane in 3-space.
Example 3. Let C¹[a, b] denote the set of all functions which possess a continuous derivative at every point of the interval [a, b]; i.e., the so-called continuously differentiable functions on [a, b]. Since a differentiable function is continuous, each function in C¹[a, b] also belongs to C[a, b]. But both the scalar multiple of a continuously differentiable function and the sum of two such functions are continuously differentiable. Hence C¹[a, b] is closed under addition and scalar multiplication and thus is a subspace of C[a, b]. More generally, if Cⁿ[a, b] denotes the set of all n times continuously differentiable functions on [a, b], then Cᵐ[a, b] is a subspace of Cⁿ[a, b] whenever m > n.

Example 4. Let Pₙ be the set consisting of all polynomials of the form

a₀ + a₁x + ⋯ + aₙ₋₁xⁿ⁻¹,

where a₀, ..., aₙ₋₁ are arbitrary real numbers, and n is a fixed positive integer; i.e., Pₙ consists of all polynomials with real coefficients of degree < n, together with the zero polynomial. Then Pₙ is a subspace of P and also of Pₘ whenever m > n.

Now that we know what a subspace is, it is natural to ask how one might go about finding all subspaces of a given vector space. In general this is a hard problem, but for certain spaces the answer can readily be given. For instance, it is not difficult to show that the only nontrivial, proper subspaces of R³ are lines and planes through the origin (see Exercise 4 below). And once this observation

* Mathematicians summarize these two facts by saying that W is closed under vector addition and scalar multiplication. In this language the above criterion becomes the statement that a nonempty subset of a vector space is a subspace if and only if it is closed under vector addition and scalar multiplication.

has been made, one is struck by the fact that the intersection of any two subspaces of R³ is again a subspace of R³. Actually, this is true in general, as is shown in

Lemma 1-2. If W₁ and W₂ are subspaces of V, then the set consisting of all vectors belonging to both W₁ and W₂ is a subspace of V.

Proof. Let W be the set in question, and note that W contains the zero vector, since this vector belongs to both W₁ and W₂. Now let x₁ and x₂ be any two vectors in W. Then x₁ and x₂ belong to W₁ and to W₂, and hence so does α₁x₁ + α₂x₂ for any pair of real numbers α₁ and α₂. This implies that α₁x₁ + α₂x₂ belongs to W, and the assertion that W is a subspace of V now follows from the subspace criterion. ∎

The subspace W of this lemma is known as the intersection of W₁ and W₂, and is denoted W₁ ∩ W₂ (read "W₁ intersect W₂").
We now return to the problem of finding all subspaces of an arbitrary vector space V. Rather than attempt a frontal assault on this problem, it turns out to be much more profitable to proceed as follows: Let X be any (nonempty) subset of V. Then, as was noted above, there is at least one subspace of V containing X, namely V itself. This being so, we attempt to find the "smallest" subspace of V containing X, where by this we mean that subspace of V which contains X, and which in turn is contained in every subspace of V containing X. To show that such a subspace actually exists, consider the totality of all subspaces of V which contain X, and let S(X) denote the set of vectors belonging to every one of these subspaces; i.e., S(X) is the intersection of these subspaces. Reasoning as in the proof of Lemma 1-2, we see that S(X) is a subspace of V, and from its very definition it is clear that there is no subspace of V which contains X and is properly contained in S(X). Thus S(X) is the desired subspace. It is called the subspace of V spanned by X and, as we shall see, is uniquely determined by the set X.

All this is well and good, but unless we can discover an easy method for finding S(X) in terms of the vectors belonging to X, we will have made little progress on the problem of surveying the subspaces of V. Fortunately (and this is the reason for introducing S(X) in the first place) such a method is easy to derive. To do so, we introduce the following definition.

Definition 1-3. An expression of the form

α₁x₁ + ⋯ + αₙxₙ,  (1-26)

where α₁, ..., αₙ are real numbers, is called a linear combination of the vectors x₁, ..., xₙ.

And now we can describe S(X): it is the set of all linear combinations of the vectors in X. Thus once X is known, so is S(X), and (1-26) gives the form of each of its vectors.

FIGURE 1-8

Before proving this assertion, let us look at some examples in R³. First let X consist of a single nonzero vector x. Then S(X) is the line through the origin determined by x. But the points on this line are simply all scalar multiples αx of x, and our assertion holds in this case. Next, let X consist of two nonzero, noncollinear vectors, x₁ and x₂. In this case, S(X) is the plane through the origin determined by x₁ and x₂. (Why?) But every linear combination of the form α₁x₁ + α₂x₂ certainly lies in this plane, and, conversely, we can "reach" any vector y in this plane by a linear combination of x₁ and x₂, as indicated in Fig. 1-8. Thus S(X) is again the set of all linear combinations of the vectors in X.
We now prove the general result.
Theorem 1-1. Let X be a (nonempty) subset of a vector space V. Then the subspace of V spanned by X consists of all linear combinations of the vectors in X.

Proof. In the first place, the set of all linear combinations of vectors in X is closed under addition and scalar multiplication, and hence is a subspace W of V. Moreover, the equation x = 1x shows that each x in X is a linear combination of vectors in X, thus proving that X is contained in W. Finally, every subspace of V which contains X must contain all vectors of the form (1-26), by virtue of the fact that a subspace is closed under addition and scalar multiplication. In other words, W is contained in every subspace of V containing X, and it follows that W = S(X). ∎
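Theorem 1-1 makes span membership a computational question: y belongs to S(x₁, x₂) exactly when the linear system α₁x₁ + α₂x₂ = y has a solution. The Python sketch below decides this by exact rational Gaussian elimination (the helper name in_span is ours); the sample vectors are those spanning the subspace in Exercise 7 below.

```python
# Decide whether y is a linear combination of the given vectors,
# using exact arithmetic so there is no round-off ambiguity.
from fractions import Fraction

def in_span(y, vectors):
    """True if y lies in the subspace spanned by the given tuples."""
    # Augmented matrix [x1 | x2 | ... | y], one row per coordinate.
    rows = [[Fraction(v[i]) for v in vectors] + [Fraction(y[i])]
            for i in range(len(y))]
    r = 0
    for c in range(len(vectors)):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c] / rows[r][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    # A row 0 = nonzero after elimination means the system is inconsistent.
    return all(any(x != 0 for x in row[:-1]) or row[-1] == 0 for row in rows)

# (4, 7, 6) = 2*(1, 2, 1) + 1*(2, 3, 4), so it lies in the span;
# (4, 1, 1) does not.
assert in_span((4, 7, 6), [(1, 2, 1), (2, 3, 4)])
assert not in_span((4, 1, 1), [(1, 2, 1), (2, 3, 4)])
```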

EXERCISES

1. Determine which of the following subsets are subspaces of the indicated vector space. Give reasons for your answers.
(a) The set of all vectors in R² of the form x = (1, x₂).
(b) The zero vector together with all vectors x = (x₁, x₂) in R² for which x₂/x₁ has a constant value.

(c) The set of all vectors x = (x₁, x₂, x₃) in R³ for which x₁ + x₂ + x₃ = 0.
(d) The set of all vectors in R³ of the form (x₁, x₂, x₁ + x₂).
(e) The set of all vectors x = (x₁, x₂, x₃) in R³ for which x₁² + x₂² + x₃² = 1.

2. Repeat Exercise 1 for the following subsets.
(a) The subset of Pₙ consisting of the zero polynomial and all polynomials of degree n − 1.
(b) The subset of P consisting of the zero polynomial and all polynomials of even degree.
(c) The subset of P consisting of the zero polynomial and all polynomials of degree 0.
(d) The subset of Pₙ, n > 1, consisting of all polynomials which have x as a factor.
(e) The subset of P consisting of all polynomials which have x − 1 as a factor.

3. Repeat Exercise 1 for the following subsets of C[a, b].
(a) The set of all functions in C[a, b] which vanish at the point x₀ in [a, b].
(b) The set of all nondecreasing functions in C[a, b].
(c) The set of all constant functions in C[a, b].
(d) The set of all functions f in C[a, b] such that f(a) = 1.
(e) The set of all functions f in C[a, b] such that ∫ₐᵇ f(x) dx = 0.
4. Prove that the only proper subspaces of R³ different from the trivial subspace are lines and planes through the origin.

5. Prove that Pₙ is a subspace of P.

6. Consider the set of all infinite sequences (x₁, ..., xₙ, ...) of real numbers which have only a finite number of nonzero entries. Prove that this set is a subspace of the vector space R^∞ defined in Exercise 8 of Section 1-2.
7. Determine which of the following vectors belong to the subspace of R³ spanned by (1, 2, 1) and (2, 3, 4).
(a) (4, 7, 6)   (b) (−½, −¼, −¾)   (c) (4, 1, 1)
(d) (2, 9, 5)   (e) (2, 9, 4)   (f) (0, ½, −¾)
8. Determine which of the following vectors belong to the subspace of R³ spanned by (1, −3, 2) and (0, 4, 1).
(a) (3, −1, 8)   (b) (2, −2, 1)   (c) (½, 1, ¾)
(d) (2, −½, 0)   (e) (½, −1, ¾)   (f) (⅓, 3, −⅔)
9. Determine which of the following polynomials belong to the subspace of P spanned by x³ + 2x² + 1, x² − 2, x³ + x.
(a) x² − x + 3   (b) x² − 2x + 1   (c) 4x³ − 3x + 5
(d) x⁴ + 1   (e) −4x³ + ⅔x² − x − 1   (f) x − 5

10. Let f and g be the functions in C[0, 1] defined by

f(x) = { 0 if 0 ≤ x ≤ ½,  x − ½ if ½ < x ≤ 1 },
g(x) = { −x + ½ if 0 ≤ x ≤ ½,  0 if ½ < x ≤ 1 }.

Find S(f), S(g), and S(f, g).



11. Let f be the function whose value is 1 at every point of the interval [a, b]. Find the subspace of C[a, b] spanned by f.
12. Find the subspace of R³ spanned by each of the following sets of vectors.
(a) (2, 1, 3), (−1, 2, 1)
(b) (1, 0, 2), (2, 1, −2)
(c) (−1, 1, 2), (0, 1, 0), (2, 4, 1)

13. Find the subspace of P spanned by each of the following sets of vectors.
(a) x², x(x + 1)
(b) x + 1, x² − 1
(c) 1, x − 2, (x − 2)²

14. Prove that the intersection of any collection of subspaces of a vector space V is a subspace of V. (By the intersection of a collection of subspaces of V we mean the totality of vectors common to all of the subspaces in question.)

15. (a) Prove that S(S(X)) = S(X) for any X.
(b) Prove that S(X) = X if and only if X is a subspace of V.


16. Let W₁, W₂, and W₃ be subspaces of a vector space, and suppose that W₁ is a subspace of W₂, and that W₂ is a subspace of W₃. Prove that W₁ is a subspace of W₃.

17. Let X₁ and X₂ be two sets of vectors in V, and let X₁ ∩ X₂ be the set of vectors belonging to both X₁ and X₂. Furthermore, let us agree that if X₁ and X₂ have no vectors in common, S(X₁ ∩ X₂) is the trivial subspace of V. Show that S(X₁ ∩ X₂) is a subspace of S(X₁) ∩ S(X₂). Give an example in R³ where these two subspaces are distinct, and one where they are identical.

18. Prove that the following two subsets span the same subspace of R³.
(a) (1, −1, 2), (3, 0, 1)
(b) (−1, −2, 3), (3, 3, −4)

19. Prove that the functions sin² x, cos² x, sin x cos x span the same subspace of C[a, b] as 1, sin 2x, and cos 2x.
20. Let W₁ and W₂ be subspaces of V, and let W denote the set of all vectors which belong to W₁ or W₂, or both. Prove that W is a subspace of V if and only if one of the Wᵢ (i = 1, 2) is contained in the other.

21. (a) Let W₁ be the subset of C[−a, a] consisting of all functions f such that f(x) = f(−x), and let W₂ be the subset consisting of all f such that f(x) = −f(−x). Prove that W₁ and W₂ are subspaces of C[−a, a].
(b) Give a graphical description of the functions belonging to each of these subspaces.
(c) What is W₁ ∩ W₂?
(d) Prove that every vector f in C[−a, a] can be written in one and only one way as a sum of a vector in W₁ and a vector in W₂. [Hint: Consider the functions (f(x) + f(−x))/2 and (f(x) − f(−x))/2.]
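The hint in Exercise 21(d) can be illustrated numerically: the two functions it suggests really are even and odd, respectively, and they sum back to f. A Python spot check (the helper names even_part and odd_part are ours), not a proof.

```python
# The decomposition suggested by the hint in Exercise 21(d):
# every f on [-a, a] splits into an even part plus an odd part.
import math

def even_part(f):
    return lambda x: (f(x) + f(-x)) / 2

def odd_part(f):
    return lambda x: (f(x) - f(-x)) / 2

f = math.exp
fe, fo = even_part(f), odd_part(f)   # these are cosh and sinh

for x in (-1.0, 0.0, 0.5):
    assert abs(fe(x) + fo(x) - f(x)) < 1e-12   # fe + fo = f
    assert fe(x) == fe(-x)                     # fe lies in W1 (even)
    assert fo(x) == -fo(-x)                    # fo lies in W2 (odd)
```

The uniqueness half of 21(d) is exactly the statement that the only function that is both even and odd, part (c), is the zero function.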
22. Let W₁ and W₂ be subspaces of V, and let W₁ + W₂ be the set of all vectors in V of the form x₁ + x₂, where x₁ belongs to W₁ and x₂ to W₂. Show that W₁ + W₂ is a subspace of V.

23. Let W₁ be the subset of Pₙ consisting of all polynomials which have zero as a root, and let W₂ be the subset of Pₙ consisting of the zero polynomial and all polynomials of degree zero (i.e., constant polynomials). Prove that Pₙ = W₁ + W₂ (see Exercise 22), and show that each vector in Pₙ can be written in one and only one way as the sum of a vector in W₁ and a vector in W₂.
24. If X₁ and X₂ are two sets of vectors in V, then X₁ ∪ X₂ (read "X₁ union X₂") is the set of vectors belonging to X₁ or to X₂ (or both). Prove that

S(X₁ ∪ X₂) = S(X₁) + S(X₂),

and illustrate this result by examples chosen from R³ (see Exercise 22).

25. Let α₁ and α₂ be real numbers and consider the linear equation

α₁x₁ + α₂x₂ = 0

in the unknowns x₁ and x₂. A vector (c₁, c₂) of R² is said to be a solution of this equation if the substitution of c₁ and c₂ for x₁ and x₂, respectively, reduces this equation to an identity. Show that the set of solutions of the given equation is a subspace of R². Describe this subspace graphically.
26. Show that the set of all simultaneous solutions of the pair of linear equations

α₁x₁ + α₂x₂ = 0 and β₁x₁ + β₂x₂ = 0

is a subspace of R², and give a geometric description of this subspace.

27. (a) Suppose that the vector x = (c₁, c₂) is a solution of the pair of linear equations

α₁x₁ + α₂x₂ = γ₁,
β₁x₁ + β₂x₂ = γ₂.  (I)

Prove that every solution of this pair of equations is of the form x + y, where y is a solution of

α₁x₁ + α₂x₂ = 0,
β₁x₁ + β₂x₂ = 0.  (II)

(b) Conversely, with x and y as in (a), prove that every vector of the form x + y is a solution of (I).
(c) Give a geometric description of the solutions of (II), and then use it and the above results to obtain a description of the solutions of (I).
*28. Let X be an arbitrary subset of a vector space V, and let x and y be vectors in V. Suppose that x belongs to the subspace S(X, y) but not to S(X). Prove that y then belongs to S(X, x). (This result is sometimes known as the exchange principle.)

1-5 LINEAR DEPENDENCE AND INDEPENDENCE; BASES

Consider the subspace S(x₁, x₂, x₃) spanned by three nonzero, coplanar vectors in R³, no two of which are collinear (Fig. 1-9). In this case it is perfectly clear that the given set contains more vectors than are needed to span the plane S(x₁, x₂, x₃), since any two of them suffice in this respect. But at least two vectors are always necessary to span a plane in R³, and hence we obtain a "minimal" subset of x₁, x₂, x₃ which spans S(x₁, x₂, x₃) by discarding any one of the given vectors. This example suggests that it may be possible to reduce any finite set of vectors x₁, ..., xₙ to a minimal subset which continues to span S(x₁, ..., xₙ). This is in fact the case, as we show in Theorem 1-2, but before doing so we introduce some useful terminology.

Definition 1-4. A vector x is said to be linearly dependent on x₁, ..., xₙ if x can be written in the form

x = α₁x₁ + ⋯ + αₙxₙ,

where the αᵢ are scalars. If, on the other hand, no such relation exists, x is said to be linearly independent of x₁, ..., xₙ.

Thus x is linearly dependent on x₁, ..., xₙ if and only if x is a linear combination of x₁, ..., xₙ (Definition 1-3), and hence if and only if x belongs to the subspace spanned by them (Theorem 1-1). In particular, each of the vectors xᵢ, 1 ≤ i ≤ n, is linearly dependent on x₁, ..., xₙ, since it belongs to the subspace spanned by these vectors. We also note that the equation 0 = 0x implies that the zero vector is linearly dependent on every vector.

FIGURE 1-9

It is convenient to extend the terminology of Definition 1-4 to include finite sets of vectors by saying that such a set is linearly independent if no one of its vectors is linearly dependent on the remaining ones. If this is not the case, we say that the set is linearly dependent. Finally, when referring to a linearly independent (or dependent) set x₁, ..., xₙ, we shall often relax our terminology and say that the vectors themselves are linearly independent (or dependent).

How does one determine in practice whether a set of vectors is linearly dependent or independent? The easiest method is given by the following test for linear independence.

Test for linear independence. The vectors x_1, ..., x_n are linearly independent if and only if the equation

α_1 x_1 + ··· + α_n x_n = 0

implies that α_1 = ··· = α_n = 0.

For instance, x_1 = (1, 3, −1, 2), x_2 = (2, 0, 1, 3), x_3 = (−1, 1, 0, 0) are linearly independent in R^4, since the equation

α_1 x_1 + α_2 x_2 + α_3 x_3 = 0

implies that

α_1 + 2α_2 − α_3 = 0,
3α_1 + α_3 = 0,
−α_1 + α_2 = 0,
2α_1 + 3α_2 = 0,

from which it easily follows that α_1 = α_2 = α_3 = 0.
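In computational terms the test says: form the homogeneous linear system above and check that it has only the trivial solution, i.e. that row reduction finds one pivot per vector. The sketch below is ours, not the book's (the function name is an assumption); it uses exact rational arithmetic to avoid round-off.

```python
from fractions import Fraction

def linearly_independent(vectors):
    """Return True iff the given tuples are linearly independent,
    by row-reducing and counting pivots (one pivot per vector)."""
    rows = [[Fraction(c) for c in v] for v in vectors]
    pivots = 0
    for col in range(len(rows[0]) if rows else 0):
        # find a row (below the finished pivots) with a nonzero entry here
        for r in range(pivots, len(rows)):
            if rows[r][col] != 0:
                rows[pivots], rows[r] = rows[r], rows[pivots]
                # eliminate this column from all later rows
                for s in range(pivots + 1, len(rows)):
                    f = rows[s][col] / rows[pivots][col]
                    rows[s] = [a - f * b for a, b in zip(rows[s], rows[pivots])]
                pivots += 1
                break
    return pivots == len(rows)

# The example from the text: three vectors in R^4
x1, x2, x3 = (1, 3, -1, 2), (2, 0, 1, 3), (-1, 1, 0, 0)
print(linearly_independent([x1, x2, x3]))  # True
```

A pivot appears for every row exactly when no vector is a combination of the others, which is the content of the test.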

We leave the proof of the above test as an exercise (see Exercise 9 below),
with the strong recommendation that it be done.
And now we are ready to show how one can weed the extraneous vectors from any finite set x_1, ..., x_n without disturbing S(x_1, ..., x_n). The basic idea is obvious; just get rid of as many linearly dependent vectors as possible.

To accomplish this we begin with the vector x_n. If x_n is linearly dependent on x_1, ..., x_{n−1}, then

x_n = α_1 x_1 + ··· + α_{n−1} x_{n−1},

and we can rewrite the expression

x = β_1 x_1 + ··· + β_n x_n

for an arbitrary vector in S(x_1, ..., x_n) in the form

x = (β_1 + α_1 β_n)x_1 + ··· + (β_{n−1} + α_{n−1} β_n)x_{n−1}.

This proves that x is already a linear combination of x_1, ..., x_{n−1}, and hence that S(x_1, ..., x_{n−1}) = S(x_1, ..., x_n). In this case we drop the vector x_n from the set x_1, ..., x_n. If, on the other hand, x_n is not linearly dependent on x_1, ..., x_{n−1}, we keep it.

If we repeat this procedure with each of the x_i in turn, dropping x_i if it is linearly dependent on the remaining vectors in the (possibly modified) set, keeping it otherwise, it is clear that we obtain a linearly independent subset of x_1, ..., x_n which spans the subspace S(x_1, ..., x_n). This, of course, is what we started out to show, and we have proved

Theorem 1-2. Every finite set of vectors X contains a linearly independent subset which spans the subspace S(X).

(Note, however, that in general X contains many such subsets. This was the case, for instance, in the example given at the beginning of this section. Other examples will be found in the exercises below.)
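The weeding argument can be sketched directly in code. One assumption to flag: the sketch below (helper names ours) scans the vectors in order rather than starting from the last one as in the proof; different scan orders can produce the different spanning subsets mentioned in the note above.

```python
from fractions import Fraction

def rank(vectors):
    """Number of pivots found by Gaussian elimination on the rows."""
    rows = [[Fraction(c) for c in v] for v in vectors]
    pivots = 0
    for col in range(len(rows[0]) if rows else 0):
        for r in range(pivots, len(rows)):
            if rows[r][col] != 0:
                rows[pivots], rows[r] = rows[r], rows[pivots]
                for s in range(pivots + 1, len(rows)):
                    f = rows[s][col] / rows[pivots][col]
                    rows[s] = [a - f * b for a, b in zip(rows[s], rows[pivots])]
                pivots += 1
                break
    return pivots

def weed(vectors):
    """Drop each vector that is linearly dependent on the ones kept so far;
    what survives is linearly independent and spans the same subspace."""
    kept = []
    for v in vectors:
        if rank(kept + [v]) > rank(kept):  # v adds a new direction
            kept.append(v)
    return kept

# Three coplanar vectors in R^3 (as in Fig. 1-9): two of them survive.
print(weed([(1, 0, 0), (0, 1, 0), (1, 1, 0)]))  # [(1, 0, 0), (0, 1, 0)]
```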
Linearly independent sets enjoy a special status in the study of vector spaces, and among such sets those which span the entire space are particularly important. Such sets are named in

Definition 1-5. A finite linearly independent subset B of a vector space V is said to be a basis for V if S(B) = V.

As an example, we cite the vectors i = (1, 0, 0), j = (0, 1, 0), k = (0, 0, 1), which form a basis for R^3. We shall prove this assertion in Example 1 below, and now merely wish to observe that every vector x = (x_1, x_2, x_3) in R^3 can be written in one and only one way as a linear combination of these basis vectors, namely, x = x_1 i + x_2 j + x_3 k. This last property actually serves to characterize a basis in a vector space, as we now show.

Theorem 1-3. A set of vectors e_1, ..., e_n is a basis for a vector space V if and only if every vector in V can be written uniquely as a linear combination of e_1, ..., e_n.

Proof. First suppose that e_1, ..., e_n is a basis for V. Then the e_i span V, and hence every vector in V can be written in at least one way as

x = α_1 e_1 + ··· + α_n e_n.    (1-27)

To show that this is the only such expression possible, let

x = β_1 e_1 + ··· + β_n e_n    (1-28)

be another. Then, subtracting (1-28) from (1-27), we obtain

0 = (α_1 − β_1)e_1 + ··· + (α_n − β_n)e_n.    (1-29)

But since e_1, ..., e_n is a basis for V, these vectors are linearly independent. Hence, by our test for linear independence, each of the coefficients in (1-29) is zero, and it follows that α_1 = β_1, ..., α_n = β_n, as desired.

Conversely, suppose every vector in V can be written uniquely as a linear combination of e_1, ..., e_n. Then these vectors certainly span V, and we need only prove their linear independence in order to show that they are a basis for V. To accomplish this, we observe that 0 = 0e_1 + ··· + 0e_n, and that our assumption concerning the uniqueness of such expressions implies that this is the only representation of 0 as a linear combination of e_1, ..., e_n. Thus if α_1 e_1 + ··· + α_n e_n = 0, we must have α_1 = ··· = α_n = 0, and the test for linear independence now applies. ∎


Example 1. The vectors

e_1 = (1, 0, ..., 0),
e_2 = (0, 1, ..., 0),
...
e_n = (0, 0, ..., 1)

are a basis for R^n, since x = x_1 e_1 + ··· + x_n e_n is the only way of expressing the vector x = (x_1, ..., x_n) as a linear combination of e_1, ..., e_n. This particular basis is called the standard basis for R^n.

Example 2. Again in R^n, let

e_1' = (1, 0, ..., 0),
e_2' = (1, 1, ..., 0),
...
e_n' = (1, 1, ..., 1),

where, in general, e_i' is the n-tuple having 1's in the first i places and 0's thereafter. Then e_1', ..., e_n' is a basis for R^n. To prove this let x = (x_1, ..., x_n) be given, and let us attempt to find real numbers α_1, ..., α_n such that x = α_1 e_1' + ··· + α_n e_n'. In order that such an equality hold we must have

(x_1, ..., x_n) = α_1(1, 0, ..., 0) + α_2(1, 1, ..., 0) + ··· + α_n(1, 1, ..., 1)
               = (α_1, 0, ..., 0) + (α_2, α_2, ..., 0) + ··· + (α_n, α_n, ..., α_n)
               = (α_1 + α_2 + ··· + α_n, α_2 + ··· + α_n, ..., α_n),

which leads to the system of equations

α_1 + α_2 + ··· + α_n = x_1,
      α_2 + ··· + α_n = x_2,
...

Hence

α_1 = x_1 − x_2,
α_2 = x_2 − x_3,
...
α_{n−1} = x_{n−1} − x_n,
α_n = x_n,

which simultaneously shows that x can be written as a linear combination of e_1', ..., e_n', and that the coefficients of this relation are uniquely determined. Thus the e_i' are a basis for R^n, as asserted.
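The coordinate formulas of Example 2 are easy to check mechanically: α_i = x_i − x_{i+1} for i < n and α_n = x_n, and recombining the basis vectors recovers x. A sketch (function names are ours, not the text's):

```python
def coords_in_staircase_basis(x):
    """Coordinates of x with respect to e_1' = (1,0,...,0), e_2' = (1,1,...,0), ...:
    alpha_i = x_i - x_{i+1} for i < n, and alpha_n = x_n."""
    n = len(x)
    return [x[i] - x[i + 1] for i in range(n - 1)] + [x[-1]]

def rebuild(alphas):
    """The k-th component of sum_i alpha_i e_i' is alpha_k + ... + alpha_n,
    since e_i' has 1's exactly in its first i places."""
    n = len(alphas)
    return [sum(alphas[k:]) for k in range(n)]

x = [3, 5, -2, 7]
a = coords_in_staircase_basis(x)
print(a)           # [-2, 7, -9, 7]
print(rebuild(a))  # [3, 5, -2, 7]
```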
Example 3. The polynomials 1, x, x^2, ..., x^{n−1} form a basis for the vector space P_n, since each polynomial in this space can be written in one and only one way in the form a_0 + a_1 x + ··· + a_{n−1} x^{n−1}.

Example 4. Let p_1(x), ..., p_n(x) be any finite set of polynomials in P, and let d be the maximum of the degrees of the p_i(x). Then no linear combination of these polynomials is of degree greater than d, from which it follows that p_1(x), ..., p_n(x) is not a basis for P, since P contains polynomials of arbitrarily high degree. Thus P does not possess a basis in the sense of Definition 1-5.

In Examples 1 and 2 we exhibited two distinct bases for R^n, and in each case found that the total number of vectors involved was the same. This was no coincidence, for it can be shown that any two bases in a vector space V always have the same number of elements. In other words, the number of vectors in a basis for V (provided V has a basis) is an intrinsic property of V itself. We shall prove this important result in Section 1-7, along with certain other facts about bases in vector spaces, and mention it here only in order to justify the following definition.

Definition 1-6. A vector space is said to be of dimension n if it has a basis consisting of n vectors, and is said to be infinite dimensional otherwise.* We denote the fact that V is n-dimensional by writing dim V = n.

On the strength of the above examples we can assert that both R^n and P_n are n-dimensional, and that P is infinite dimensional.

EXERCISES

1. Find all linearly independent subsets of the following sets of vectors in R^3.

(a) (1,0,0), (0,1,0), (0,0,1), (2,3,5)


(b) (1,1,1), (0,1,1), (0,0,1), (6,4,7)

(c) (1,1,1), (2,2,2), (1,0,0), (0,0,1)


(d) (0,0,0), (1,2,1), (1,3,3), (1,4,6)

2. Find all linearly independent subsets of the following sets of vectors in P_4.

(a) 1, x − 1, x^2 + 2x + 1, x^2
(b) x(x − 1), x^3, 2x^3 − x^2, x
(c) 2x, x^2 + 1, x + 1, x^2 − 1
(d) 1, x, x^2, x^3, x^2 + x + 1

* By convention, the vector space consisting of just the zero vector is assigned dimension 0.

3. Are the vectors (0, 2, −1), (0, \, −£), (0, §, −£) linearly independent in R^3? If not, find a linearly independent subset which has a maximum number of elements.
4. Prove that each of the following sets of vectors is a basis for R^2.

(a) (1, 0), (0, −1)
(b) (cos θ, sin θ), (−sin θ, cos θ), 0 ≤ θ < 2π
(c) (α, 0), (0, β), α, β nonzero real numbers
(d) (1, 1), (0, 1)
5. Express each of the following vectors in R^2 as a linear combination of the vectors in the various bases of Exercise 4:

i, j, i + j, α'i + β'j.

[Recall that i = (1, 0) and j = (0, 1).]


6. Prove that each of the following sets of vectors is a basis for R^4.

(a) (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (1, 1, 1, 1)
(b) (1, 1, 0, 0), (0, 0, 1, 1), (−1, 0, 1, 1), (0, −1, 0, 1)
(c) (2, −1, 0, 1), (1, 3, 2, 0), (0, −1, −1, 0), (−2, 1, 2, 1)
(d) (1, −1, 2, 0), (1, 1, 2, 0), (3, 0, 0, 1), (2, 1, −1, 0)

7. Express (2, −2, 1, 3) as a linear combination of the vectors in the various bases of Exercise 6.

8. (a) Show that the functions 1, sin^2 x, cos^2 x are linearly dependent in C[−π, π].
(b) Show that the functions 1, cos x, cos 2x are linearly independent in C[−π, π].
9. Prove that the vectors x_1, ..., x_n are linearly independent if and only if the equation

α_1 x_1 + ··· + α_n x_n = 0

implies that α_1 = ··· = α_n = 0.
10. Prove that the vectors (a, b) and (c, d) are linearly independent in R^2 if and only if ad − bc ≠ 0.

*11. Prove that the vectors (x_1, x_2, x_3), (y_1, y_2, y_3), (z_1, z_2, z_3) are linearly independent in R^3 if and only if

| x_1  y_1  z_1 |
| x_2  y_2  z_2 |  ≠ 0.
| x_3  y_3  z_3 |

*12. Show that the functions sin x, sin 2x, ..., sin nx are linearly independent in C[−π, π] for any positive integer n. [Hint: Let Σ_{k=1}^n a_k sin kx = 0; multiply by sin jx, where j = 1, ..., n, and integrate from −π to π.]

*13. Show that the functions 1, sin x, cos x, sin 2x, cos 2x, ..., sin nx, cos nx are linearly independent in C[−π, π]. [Hint: See Exercise 12.]


14. Prove that the polynomials

1, x, (3x^2 − 1)/2, (5x^3 − 3x)/2

form a basis for P_4.

15. Express x^2 and x^3 as linear combinations of the basis vectors for P_4 given in Exercise 14.

16. Assume that x_1, x_2, and x_3 are linearly independent vectors in V. Prove that x_1 + x_2, x_1 + x_3, and x_2 + x_3 are linearly independent.
17. Prove that the functions 1, e^x, e^{2x} are linearly independent in C[0, 1]. [Hint: Differentiate the expression α + βe^x + γe^{2x} = 0.]

18. Prove that the functions 1, e^x, xe^x are linearly independent in C[0, 1].

19. Show that the polynomials

1, x − a, (x − a)^2, ..., (x − a)^{n−1},

where a is an arbitrary real number, form a basis for P_n. [Hint: Consider the Taylor series expansion of a polynomial about the point x = a.]

*20. Show that the polynomials

1, x, x(x − 1), x(x − 1)(x − 2), ..., x(x − 1)(x − 2) ··· (x − n + 1)

form a basis for P_{n+1}. [Hint: Use mathematical induction.]

21. Express the polynomials x^3 and x^3 + 3x − 1 as linear combinations of the basis vectors in P_4 described in Exercise 20.

*22. Let X_n be any set of n polynomials in P_n, one of each degree 0, 1, ..., n − 1. Prove that X_n is a basis for P_n. [Hint: Use mathematical induction.]

23. Let X_4 be the basis for P_4 consisting of the cubic polynomial x^3 + 2x + 5 and its first three derivatives (see Exercise 22). Write each of the following polynomials as a linear combination of polynomials in X_4.

(a) x^3 + 2x + 5
(b) x^2 + 1
(c) 2x^3 − x^2 + 10x + 2

24. Let X be a finite linearly independent subset of a vector space V, and suppose that every finite subset of V which properly contains X is linearly dependent. Prove that X is a basis for V.

25. Let X be a finite subset of a vector space V which spans V, and suppose that no proper subset of X spans V. Prove that X is a basis for V.

26. Let e_1, ..., e_n be a basis for V. Prove that e_1, αe_1 + e_2, ..., αe_1 + e_n is also a basis for V for every real number α.
*27. Prove that every basis for R^3 contains exactly three vectors. [Hint: Let e_1, ..., e_n be any basis for R^3, and express each e_i as a linear combination of the standard basis vectors i, j, k as

e_1 = α_{11} i + α_{21} j + α_{31} k,
e_2 = α_{12} i + α_{22} j + α_{32} k,
...
e_n = α_{1n} i + α_{2n} j + α_{3n} k.

Use the fact that none of the e_i are zero to successively eliminate i, j, and k from these equations, and conclude that n ≤ 3. Now reverse the argument to show that 3 ≤ n.]

1-6 COORDINATE SYSTEMS


Definition 1-7. Let e l9 , e n be a basis for V and let

x = a^i + i
a n£n

be the unique expression for x in terms of this basis (Theorem 1-3). Then
the scalars <x\, . . . ,a n are called the coordinates or components of x with
respect to ei, . . . , e w and the basis vectors themselves are said to form
,

a coordinate system for V. Finally, the subspaces of V spanned by each


of the e t are called the coordinate axes of the given coordinate system.

Thus a basis is a coordinate system, and the unique expression for a vector as a linear combination of basis vectors is nothing other than the "decomposition" of the vector into its components along the various coordinate axes. In these terms the direct statement in Theorem 1-3 assumes the following eminently reasonable form: The coordinates of a vector are uniquely determined by the coordinate system.
At the same time, we caution the student not to expect too much from a coordinate system, and especially not to fall into the error of expecting coordinate axes to be mutually perpendicular. Strictly speaking, of course, the concept of perpendicularity in a vector space has no meaning yet, but it will be defined in Chapter 7. Nevertheless, it is common knowledge that certain coordinate axes, such as the standard ones in R^n, are mutually perpendicular. We merely wish to emphasize the sometime nature of this phenomenon, and call attention to the existence of "oblique" coordinate systems. One such is the coordinate system e_1', ..., e_n' for R^n introduced in Example 2 of the preceding section.
In this connection, it is also worth mentioning explicitly that the coordinates of a vector change with a change of coordinate system. Failure to appreciate the implications of this innocent and obvious statement often causes confusion, or worse, for the unwary. For example, the vector x = (4, 2) has coordinates 4, 2 with respect to the standard coordinate system e_1 = (1, 0), e_2 = (0, 1) in R^2, since

(4, 2) = 4e_1 + 2e_2.

FIGURE 1-10

However, if we use the coordinate system e_1' = (1, 0), e_2' = (1, 1), then the coordinates of x become 2, 2, since

(4, 2) = 2e_1' + 2e_2'.

Indeed, the vector in R^2 having coordinates 4, 2 with respect to e_1', e_2' is the ordered pair (6, 2). (See Figs. 1-10 and 1-11.)
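The coordinates of x = (4, 2) with respect to e_1' = (1, 0), e_2' = (1, 1) come from solving the little system α_1 + α_2 = 4, α_2 = 2. For any basis of R^2 this is a 2 × 2 system, solvable by Cramer's rule whenever ad − bc ≠ 0 (compare Exercise 10 of Section 1-5). A sketch with our own function name:

```python
from fractions import Fraction

def coords_r2(x, e1, e2):
    """Solve alpha1*e1 + alpha2*e2 = x by Cramer's rule.
    Requires e1, e2 linearly independent (nonzero determinant)."""
    det = Fraction(e1[0] * e2[1] - e2[0] * e1[1])
    a1 = (x[0] * e2[1] - e2[0] * x[1]) / det
    a2 = (e1[0] * x[1] - x[0] * e1[1]) / det
    return a1, a2

x = (4, 2)
print(coords_r2(x, (1, 0), (0, 1)))  # (Fraction(4, 1), Fraction(2, 1))
print(coords_r2(x, (1, 0), (1, 1)))  # (Fraction(2, 1), Fraction(2, 1))
```

The same routine confirms the remark about (6, 2): its coordinates with respect to e_1', e_2' are 4, 2.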
FIGURE 1-11

Finally, we call attention to the fact that the operations of vector addition and scalar multiplication are converted into ordinary addition and multiplication when they are carried out with respect to a basis. For then these operations are performed componentwise, irrespective of the nature of the vectors involved (n-tuples of real numbers, polynomials, etc.). We prove this assertion as follows.

Theorem 1-4. Let e_1, ..., e_n be any basis for a vector space V. Then the sum of two vectors in V is found by adding their corresponding components, and the product of a vector and a scalar α is found by multiplying each component of the vector by α.

Proof. If x = α_1 e_1 + ··· + α_n e_n and y = β_1 e_1 + ··· + β_n e_n, then it follows directly from the axioms defining a vector space that

x + y = (α_1 + β_1)e_1 + ··· + (α_n + β_n)e_n,

and that

αx = (αα_1)e_1 + ··· + (αα_n)e_n,

as asserted. ∎
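Theorem 1-4 is the reason calculations can ignore what the vectors "are". For instance, if polynomials in P_4 are stored as coefficient 4-tuples with respect to the basis 1, x, x^2, x^3 of Example 3, both operations become componentwise arithmetic. A sketch (the tuple representation is our choice, not the book's):

```python
def add(p, q):
    """Componentwise sum of coordinate tuples (Theorem 1-4)."""
    return tuple(a + b for a, b in zip(p, q))

def scale(c, p):
    """Componentwise scalar multiple of a coordinate tuple."""
    return tuple(c * a for a in p)

# (1 + 2x) + (3 - x + x^3) = 4 + x + x^3, computed on coefficients alone
p = (1, 2, 0, 0)    # 1 + 2x
q = (3, -1, 0, 1)   # 3 - x + x^3
print(add(p, q))    # (4, 1, 0, 1)
print(scale(2, p))  # (2, 4, 0, 0)
```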

Among its various implications, this theorem foreshadows the use of bases in
finite dimensional vector spaces whenever extensive numerical calculations are
in the offing. On the other hand, as long as we are concerned with the general
theory of vector spaces, bases are a distinct hindrance. This stems from the fact
that whenever a basis is used in the proof of a theorem which purports to be a
general statement about finite dimensional vector spaces, one must then prove
that the result in question is independent of the particular basis chosen to prove it.
And this is usually as difficult as it is to construct a coordinate free proof of the

original statement.

EXERCISES

1. Find the coordinates of each of the following vectors in R^3 with respect to the basis (1, 0, 0), (1, 1, 0), (1, 1, 1).

(a) (0, 1, 0)   (c) (0, 0, 1)   (e) (4, −2, 2)
(b) (−2, 1, 1)  (d) (−1, 4, 1)  (f) (1, 3, 2)
2. Repeat Exercise 1 using the basis (1,1, 0), (1, 0, 1), (0, 1, 1).

3. Prove that the vectors (2, 1, 0), (2, 1, 1), (2, 2, 1) form a basis for R^3. Find the vectors in R^3 which have the following coordinates with respect to this basis.

(a) 1, 0, 0   (c) 4, −5, 0   (e) 3, 1, −1
(b) −1, 2, 1  (d) 1/2, 2, 1  (f) 2, 2, 1

4. Find the coordinates of the standard basis vectors in R^3 with respect to the basis given in Exercise 3.

5. Find a basis in R^4 with respect to which the vector (−3, 1, 2, −1) has coordinates 1, 1, 1, 1.
6. Let e_1, e_2, e_3 and e_1', e_2', e_3' be bases for a vector space V, and suppose that

e_1 = α_1 e_1' + α_2 e_2' + α_3 e_3',
e_2 = β_1 e_1' + β_2 e_2' + β_3 e_3',
e_3 = γ_1 e_1' + γ_2 e_2' + γ_3 e_3'.

Find the coordinates of the vector x = x_1 e_1 + x_2 e_2 + x_3 e_3 with respect to the basis e_1', e_2', e_3'.

7. Does there exist a basis for R^2 with respect to which an arbitrary vector (x_1, x_2) has coordinates 2x_1 and 3x_2?

8. Find a basis for R^2 with respect to which an arbitrary vector (x_1, x_2) has coordinates x_1 and x_1 + 2x_2.
9. Let e_1, ..., e_n and e_1', ..., e_n' be bases for a finite dimensional vector space V, and let

e_1' = α_{11} e_1 + α_{21} e_2 + ··· + α_{n1} e_n,
e_2' = α_{12} e_1 + α_{22} e_2 + ··· + α_{n2} e_n,
...
e_n' = α_{1n} e_1 + α_{2n} e_2 + ··· + α_{nn} e_n.

Find the coordinates of x with respect to e_1, ..., e_n, given that

x = x_1' e_1' + x_2' e_2' + ··· + x_n' e_n'.

1-7 DIMENSION

Theorem 1-5. If V has a basis containing n vectors, then any n + 1 or


more vectors in V are linearly dependent.*

The technique used to prove this theorem has already been introduced in Exercise 27 of Section 1-5 to treat a particular case. The reader may find it helpful to keep that exercise in mind while reading the following proof.

* The statement of Theorem 1-5 also applies to the trivial space consisting of just the
zero vector, provided we agree that the empty set of vectors is a basis for this space. Such
an agreement is consistent with our definition of a basis for a vector space, and with the
convention that the dimension of the trivial space is zero.

Proof. Let e_1, ..., e_n be a basis for V, and suppose, contrary to the assertion of the theorem, that V contains a linearly independent set e_1', ..., e_m' in which m > n. Express each of the e_i' as a linear combination of the e_i, thereby obtaining the system of equations

e_1' = α_{11} e_1 + α_{21} e_2 + ··· + α_{n1} e_n,
e_2' = α_{12} e_1 + α_{22} e_2 + ··· + α_{n2} e_n,
...
e_m' = α_{1m} e_1 + α_{2m} e_2 + ··· + α_{nm} e_n,    (1-30)

in which the α_{ij} are scalars. Since none of the e_j' is the zero vector, at least one of the α_{ij} is different from zero in each of these equations. (Recall that the zero vector is linearly dependent on every vector in V.) Thus, by relabeling the e_i if necessary, we may assume that α_{11} ≠ 0. This done, solve the first equation for e_1, and substitute the value obtained in the remaining m − 1 equations. This eliminates e_1 from (1-30), and yields a system of equations of the form

e_2' = β_{22} e_2 + β_{32} e_3 + ··· + β_{n2} e_n + β_{12} e_1',
e_3' = β_{23} e_2 + β_{33} e_3 + ··· + β_{n3} e_n + β_{13} e_1',
...
e_m' = β_{2m} e_2 + β_{3m} e_3 + ··· + β_{nm} e_n + β_{1m} e_1'.    (1-31)

Focusing our attention on the first of these equations, we note that the linear independence of e_1' and e_2' implies that at least one of the coefficients β_{22}, β_{32}, ..., β_{n2} is different from zero. Assume that the e_i are labeled so that β_{22} ≠ 0. Then a repetition of the above argument, now applied to e_2', reduces (1-31) to the system

e_3' = γ_{33} e_3 + ··· + γ_{n3} e_n + γ_{13} e_1' + γ_{23} e_2',
...

Let us now speculate on the effect of our assumption that m is greater than n. A moment's thought will reveal that by continuing the above process of elimination we will eventually find ourselves confronted with a system of m − n equations expressing each of the vectors e_{n+1}', ..., e_m' as a linear combination of e_1', ..., e_n'. But this cannot be. Hence m ≤ n after all. ∎
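The elimination in this proof is constructive: carried out on n + 1 or more concrete vectors it produces an explicit linear dependence. A sketch of that computation (our own code, not the book's), again with exact arithmetic:

```python
from fractions import Fraction

def dependence(vectors):
    """Given m vectors with more vectors than components (m > n), return
    scalars, not all zero, with sum_i alpha_i * v_i = 0."""
    m, n = len(vectors), len(vectors[0])
    # One row per component: the homogeneous system in the unknowns alpha_j.
    rows = [[Fraction(vectors[j][i]) for j in range(m)] for i in range(n)]
    pivot_cols, r = [], 0
    for col in range(m):
        for s in range(r, len(rows)):
            if rows[s][col] != 0:
                rows[r], rows[s] = rows[s], rows[r]
                # Gauss-Jordan: clear the pivot column in every other row
                for t in range(len(rows)):
                    if t != r and rows[t][col] != 0:
                        f = rows[t][col] / rows[r][col]
                        rows[t] = [a - f * b for a, b in zip(rows[t], rows[r])]
                pivot_cols.append(col)
                r += 1
                break
    # m > n guarantees a pivot-free column; set its alpha to 1 and solve.
    free = next(c for c in range(m) if c not in pivot_cols)
    alphas = [Fraction(0)] * m
    alphas[free] = Fraction(1)
    for i, col in enumerate(pivot_cols):
        alphas[col] = -rows[i][free] / rows[i][col]
    return alphas

vs = [(1, 0), (0, 1), (1, 1)]  # three vectors in R^2 must be dependent
print(dependence(vs))          # a dependence: -(1,0) - (0,1) + (1,1) = 0
```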

Corollary 1. If V has a basis containing n vectors, then every basis for V contains n vectors.

Proof. If e_1, ..., e_n and e_1', ..., e_m' are bases for V, then the above theorem implies that m ≤ n and n ≤ m. Hence m = n. ∎

This result furnishes the necessary justification for the definition of the dimension
of a vector space given in Section 1-5.
Theorem 1-5 also allows us to prove the reassuring fact that every subspace of a
finite dimensional vector space is finite dimensional, and that its dimension does

not exceed the dimension of the whole space. This is the content of

Theorem 1-6. If W is a subspace of an n-dimensional vector space V, then dim W ≤ n.

Proof. The theorem is obviously true if n = 0, or if W is the trivial subspace of V.* Thus we can assume n > 0, and W nontrivial.

By virtue of this last assumption, W contains linearly independent sets of vectors, since any nonzero vector in W is, by itself, such a set. Moreover, every linearly independent set in W is also linearly independent as a set in V. Thus, by Theorem 1-5, the number of vectors in such a set cannot exceed n. Finally, if e_1, ..., e_m is a linearly independent set in W containing a maximum number of vectors, then S(e_1, ..., e_m) = W. Hence dim W = m ≤ n, as advertised. ∎

This theorem may be read as asserting that every nontrivial subspace W of an n-dimensional space V has a basis e_1, ..., e_m with m ≤ n. If m = n, then e_1, ..., e_m is also a basis for V, and W = V. On the other hand, if m < n, then W is a proper subspace of V (i.e., W ≠ V), and there exist vectors in V which do not belong to W. Choose any such vector, and label it e_{m+1}. Then it is all but obvious that e_1, ..., e_{m+1} are linearly independent in V.

To prove the truth of this observation, we apply the test for linear independence (p. 20) as follows. Suppose that

α_1 e_1 + ··· + α_m e_m + α_{m+1} e_{m+1} = 0.    (1-32)

Then α_{m+1} = 0, for otherwise

e_{m+1} = −(α_1/α_{m+1}) e_1 − ··· − (α_m/α_{m+1}) e_m,

and e_{m+1} is in W. Thus

α_1 e_1 + ··· + α_m e_m = 0,

and it follows from the linear independence of e_1, ..., e_m that α_1 = ··· = α_m = 0. Hence all of the coefficients in (1-32) are zero, and e_1, ..., e_{m+1} are linearly independent.
We now repeat the above argument, this time starting with the subspace S(e_1, ..., e_{m+1}). If S(e_1, ..., e_{m+1}) is a proper subspace of V, we can enlarge e_1, ..., e_{m+1} to a linearly independent set in V containing m + 2 vectors. But Theorem 1-5 implies that this process must come to a halt after n − m steps, at which point we will have a basis for V. With this we have proved the following important and useful result.

* Recall that the dimension of the trivial space is zero.

Theorem 1-7. Let V be an n-dimensional vector space, and let e_1, ..., e_m be a basis for an m-dimensional subspace of V. Then there exist n − m vectors e_{m+1}, ..., e_n in V such that e_1, ..., e_m, e_{m+1}, ..., e_n is a basis for V.
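In R^n the proof of Theorem 1-7 can be run directly: keep adjoining standard basis vectors that are independent of the vectors already collected, stopping when n vectors have been found. A sketch (function names ours; the example matches Exercise 9 below):

```python
from fractions import Fraction

def rank(vectors):
    """Number of pivots found by Gaussian elimination on the rows."""
    rows = [[Fraction(c) for c in v] for v in vectors]
    pivots = 0
    for col in range(len(rows[0]) if rows else 0):
        for r in range(pivots, len(rows)):
            if rows[r][col] != 0:
                rows[pivots], rows[r] = rows[r], rows[pivots]
                for s in range(pivots + 1, len(rows)):
                    f = rows[s][col] / rows[pivots][col]
                    rows[s] = [a - f * b for a, b in zip(rows[s], rows[pivots])]
                pivots += 1
                break
    return pivots

def extend_to_basis(independent_set, n):
    """Extend a linearly independent set in R^n to a basis of R^n by
    adjoining each standard basis vector that adds a new direction."""
    basis = list(independent_set)
    for i in range(n):
        e_i = tuple(1 if j == i else 0 for j in range(n))
        if rank(basis + [e_i]) > rank(basis):
            basis.append(e_i)
    return basis

# Extend (2,0,1,1), (1,1,0,3) to a basis of R^4 (cf. Exercise 9 below).
print(extend_to_basis([(2, 0, 1, 1), (1, 1, 0, 3)], 4))
# [(2, 0, 1, 1), (1, 1, 0, 3), (1, 0, 0, 0), (0, 1, 0, 0)]
```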

EXERCISES

1. What is the dimension of the subspace of R^3 spanned by

(a) the vectors (2, 1, −1), (3, 2, 1), (1, 0, −3)?
(b) the vectors (1, −1, 2), (0, 2, 1), (−1, 0, 1)?

2. What is the dimension of the subspace of R^4 spanned by

(a) the vectors (1, 0, 2, −1), (3, −1, −2, 0), (1, −1, −6, 2), (0, 1, 8, −3)?
(b) the vectors (−1/2, 1/2, 3, −1), (1/2, 0, 1, −1/2), (1, 1, 10, −4)?
3. Let W be the set of all polynomials in P_n whose second derivative is zero; i.e., p(x) belongs to W if and only if (d^2/dx^2)p(x) = 0.

(a) Prove that W is a subspace of P_n, and find a basis for W.
(b) Extend the basis for W found in (a) to a basis for P_n.

4. Let W be the set of all polynomials p(x) in P_n such that p(1) = p'(1) = 0.

(a) Prove that W is a subspace of P_n, and find a basis for W.
(b) Extend the basis for W found in (a) to a basis for P_n.

5. (a) Find the dimension of the subspace of C[−π, π] spanned by the vectors 1, sin^2 x, cos^2 x.
(b) Repeat part (a) for the vectors sin x cos x, sin 2x, cos 2x, sin^2 x, cos^2 x.

6. Prove that the vector space C[a, b] is infinite dimensional.


7. What is the dimension of the subspace of all solutions (in R^3) of the single linear equation

a_1 x_1 + a_2 x_2 + a_3 x_3 = 0?

8. What is the dimension of the subspace of all solutions (in R^n) of the single linear equation

a_1 x_1 + a_2 x_2 + ··· + a_n x_n = 0?
9. Given the vectors x_1 = (2, 0, 1, 1) and x_2 = (1, 1, 0, 3) in R^4, find vectors x_3 and x_4 such that x_1, x_2, x_3, x_4 form a basis for R^4.

10. Let V be a vector space of dimension n. Prove that V contains a sequence of subspaces

V_0, V_1, ..., V_n

having the following two properties:
(a) dim V_i = i;
(b) V_i is a subspace of V_j whenever i < j.



11. Let V_1 and V_2 be finite dimensional subspaces of a vector space V, and suppose that V_1 and V_2 have only the zero vector in common. Let e_1, ..., e_m be a basis for V_1, and e_1', ..., e_n' a basis for V_2. Prove that e_1, ..., e_m, e_1', ..., e_n' is a basis for the subspace W = V_1 + V_2 of V (cf. Exercise 22, Section 1-4). Deduce that dim W = dim V_1 + dim V_2.

*12. Let V_1 and V_2 be finite dimensional subspaces of a vector space V. Prove that V_1 + V_2 is finite dimensional, and that

dim(V_1 + V_2) = dim V_1 + dim V_2 − dim(V_1 ∩ V_2).

13. Let V_1 be an m-dimensional subspace of an n-dimensional vector space V. Prove that there exists an (n − m)-dimensional subspace V_2 of V such that
(a) V_1 + V_2 = V, and
(b) V_1 ∩ V_2 contains only the zero vector. [Hint: Choose a basis for V_1 and extend this to a basis for V.]

1-8 GEOMETRIC VECTORS

Informally, geometric vectors are arrows in the plane or 3-space. As such, they
are familiar to anyone who has studied elementary physics, where they appear
as forces, velocities, accelerations, etc., i.e., quantities having a magnitude and
direction. In this section we propose to examine some of the vague ideas associated
with the use of such arrows, and make these ideas precise by constructing the
space of two-dimensional geometric vectors. Besides furnishing us with still an-
other example of a real vector space, this discussion will provide the link between
our definition of the term vector and the vectors introduced in elementary calculus
and physics.*
The geometric notion of an arrow in the plane finds its mathematical analogue in the concept of a directed line segment. Specifically, the line segment between two distinct points A and B in R^2 is said to be directed if the points are given a definite order, say A, B. In this case we speak of the directed line segment from A to B, which we denote by AB. A is then called the initial point of the segment, and B the terminal point. We also agree to regard a single point as a directed line segment, in which case the relevant symbol is AA.
Intuitively, the directed line segment AB has a magnitude and direction. At first sight one might be tempted to define these concepts as length and angular measure with respect to a coordinate system in the plane. However, any such definition would have the grave defect of making the magnitude and direction of AB dependent upon the coordinate system used, in conflict with the intuitive demand that they be intrinsically associated with AB itself. Unfortunately there is

* With obvious minor changes the following discussion can be adapted to 3-space or, for that matter, to n-space.



no way out of this dilemma so long as we continue to focus our attention upon a single segment. But when we turn to the set of all directed line segments in R^2, we observe that it is as easy to determine when two segments have the same magnitude and direction as it is difficult to say what these terms mean. Indeed, using the notion of parallel translation, we can say that AB and CD have the same magnitude and direction if and only if they can be brought into coincidence by such a translation. Furthermore, the entire theory of geometric vectors can be based upon this simple observation.

We begin by giving the above discussion formal status in

Definition 1-8. Let AB and CD be directed line segments in the plane, and suppose CD is translated parallel to itself until its initial point coincides with A. Then, if the terminal points of the two segments also coincide, we say that AB and CD have the same magnitude and direction, and we write AB ~ CD (read "AB is equivalent to CD").

For future use we note the following simple consequences of this definition:

AB ~ AB,    (1-33)

AB ~ CD implies CD ~ AB,    (1-34)

and

AB ~ CD and CD ~ EF imply AB ~ EF.    (1-35)
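In coordinates, a parallel translation adds a fixed vector to both endpoints, so AB ~ CD holds exactly when B − A = D − C. A sketch of this test (the pair-of-points representation is our assumption); properties (1-33), (1-34), and (1-35) then reduce to the reflexivity, symmetry, and transitivity of equality:

```python
def displacement(seg):
    """B - A for a directed segment given as a pair of points (A, B)."""
    (ax, ay), (bx, by) = seg
    return (bx - ax, by - ay)

def equivalent(seg1, seg2):
    """AB ~ CD iff the two segments have the same displacement, i.e. can be
    brought into coincidence by a parallel translation."""
    return displacement(seg1) == displacement(seg2)

AB = ((0, 0), (2, 1))
CD = ((5, 5), (7, 6))  # same magnitude and direction, different location
EF = ((0, 0), (1, 2))
print(equivalent(AB, CD))  # True
print(equivalent(AB, EF))  # False
```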

And now we give the basic definition.

Definition 1-9. The collection of all directed line segments in the plane which have the same magnitude and direction as a given segment AB is, by definition, the two-dimensional geometric vector v(AB) determined by AB. Any directed line segment in this collection will be called a representative of v(AB), and the set G^2 consisting of all such vectors is called the space of two-dimensional geometric vectors.

At first sight this definition may seem somewhat bizarre, but it does in fact yield precisely the sort of quantity we want in a geometric vector. For, whatever other ideas one may have concerning geometric vectors, it is clear that every such vector must be completely determined by its magnitude and direction. In other words, it consists of nothing but a magnitude and direction. And when we consider equivalent directed line segments AB and CD as entities having only a magnitude and direction, forgetting about their initial and terminal points, it is also clear that AB and CD are then effectively identical. Thus equivalent directed line segments are not distinct geometric vectors; they are distinct representatives of the same geometric vector. This, of course, is the content of Definition 1-9. Figuratively speaking, G^2 is the totality of directed line segments in the plane viewed so myopically that segments having the same magnitude and direction are indistinguishable.
But can we accept Definition 1-9 as it stands? Hardly; for we still lack the necessary assurance that a geometric vector is unambiguously determined by any one of its representatives. Phrased somewhat differently, the validity of Definition 1-9 depends upon the fact that no directed line segment can be a representative of more than one geometric vector. Intuitively this is clear, but the student should nevertheless appreciate the need for a proof based upon the definitions. This is accomplished by the following theorem, which actually establishes a somewhat stronger result.

Theorem 1-8. Every directed line segment in the plane is a representative of one and only one two-dimensional geometric vector.

Proof. If AB is a directed line segment in ℝ², then by Definition 1-8, v(AB) is a geometric vector having AB as a representative. Hence every directed line segment is a representative of at least one geometric vector, and it remains to prove that AB cannot represent more than one.

Thus suppose AB also represents the vector v(CD). Then, if EF is any representative of v(CD), EF ~ CD. But AB ~ CD, and hence by (1-34) we have CD ~ AB, and by (1-35) EF ~ AB. This shows that every representative of v(CD) is also a representative of v(AB). Now reverse the argument to conclude that every representative of v(AB) is also a representative of v(CD). Hence v(AB) = v(CD), as required. |
FIGURE 1-12

To define vector addition in 𝒢² we make use of the simple and geometrically obvious fact that if v is an arbitrary geometric vector, and P is any point in the plane, then there exists precisely one representative of v with initial point P. (To produce this representative, select any segment belonging to v and translate its initial point to P.) This having been said, let v and w be vectors in 𝒢², and choose a representative AB of v. Then, if BC is that representative of w with initial point B, we define v + w to be the vector v(AC). Geometrically, AC is the third side of the triangle formed from AB and BC, as shown in Fig. 1-12. Here again we have a definition which requires justification, for it is predicated upon the fact that v + w remains the same regardless of which representatives of v and w are used to compute it. But this, we assert, is obvious. For if A'B' ~ AB and B'C' ~ BC, then triangles A'B'C' and ABC will be congruent, and a parallel translation will bring A'C' into coincidence with AC (see Fig. 1-13).

FIGURE 1-13

(v + w) + x = v + (w + x)

FIGURE 1-14    FIGURE 1-15
With this difficulty out of the way, we invoke elementary geometry to prove that v + (w + x) = (v + w) + x, and that v + w = w + v (appropriate diagrams appear above in Figs. 1-14 and 1-15). Moreover, if we set 0 = v(AA), then v + 0 = v for every v in 𝒢²; while if v = v(AB), it is clear that the vector -v = v(BA) has the property that v + (-v) = 0. Thus the additive axioms for a vector space are satisfied in 𝒢².

Next let v(AB) be an arbitrary nonzero vector in 𝒢², and let α be a real number. Choose a unit of distance, and let |AB| denote the length of the segment AB. Then there exists a unique point C on the ray from A through B such that |AC|/|AB| = |α|, where |α| denotes the absolute value of α. If α > 0, let αv = v(AC); while if α < 0, let αv = -v(AC) = v(CA). Finally, when v = 0, set αv = 0 for all α. (We can describe αv as the collection of line segments in the plane whose magnitude is |α| times the magnitude of AB, and whose direction is the same as or opposite to that of AB according as α is positive or negative.)

With this we have defined a scalar multiplication in 𝒢², and geometric arguments can again be used to prove that the required axioms are satisfied. Thus 𝒢² is a real vector space.
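The definitions just given can be mirrored numerically. In the sketch below (all function names are ours, purely illustrative), a geometric vector is recorded by the displacement B - A shared by all of its representatives; v + w is computed by chaining representatives as in the triangle of Fig. 1-12, and αv by scaling:

```python
# A geometric vector is recorded by the displacement (B - A) shared by
# every one of its representative directed segments.
def displacement(A, B):
    return (B[0] - A[0], B[1] - A[1])

def rep_from(v, P):
    """The unique representative of v with initial point P."""
    return (P, (P[0] + v[0], P[1] + v[1]))

def add(v, w):
    # Triangle rule: choose AB representing v, then BC representing w;
    # v + w is the vector determined by AC.
    A = (0.0, 0.0)
    _, B = rep_from(v, A)
    _, C = rep_from(w, B)
    return displacement(A, C)

def scale(a, v):
    return (a * v[0], a * v[1])

v = displacement((1, 1), (3, 2))       # v(AB) with A = (1, 1), B = (3, 2)
w = displacement((0, 0), (-1, 4))

assert add(v, w) == add(w, v)                    # v + w = w + v
assert add(v, scale(-1, v)) == (0.0, 0.0)        # v + (-v) = 0
assert add(add(v, w), v) == add(v, add(w, v))    # an instance of associativity
```

The choice of base point A in `add` does not matter, which is exactly the well-definedness argument made in the text.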

It is both interesting and instructive to compare this space with the vector space ℝ² defined in Section 1-1. To effect this comparison consider the totality of directed line segments in the plane which have their initial point at the coordinate origin in ℝ². On the one hand, this collection of vectors is none other than the set of vectors which comprise ℝ². However (and this is the crux of our argument), it may also be viewed as a complete set of representatives of the vectors in 𝒢². (Recall that every vector in 𝒢² has precisely one such representative, and conversely, every such directed line segment represents a unique vector in 𝒢².) In other words, the vectors in ℝ² are simply a particular set of representatives of the vectors in 𝒢². From this it follows that if we agree to replace each vector in 𝒢² by its unique representative emanating from the coordinate origin in ℝ², we find that ℝ² and 𝒢² then consist of precisely the same vectors. Moreover, addition and scalar multiplication in 𝒢² then become identical with the corresponding operations in ℝ². Thus we conclude that ℝ² and the space of two-dimensional geometric vectors are essentially identical. 𝒢² is simply ℝ² stripped of its coordinate system, and conversely, ℝ² is 𝒢² seen by means of a coordinate system.
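This identification is easy to exhibit concretely. In the sketch below (names hypothetical), v(AB) is replaced by its representative emanating from the origin, that is, by the point B - A of ℝ², and the triangle-rule sum then becomes coordinatewise addition:

```python
def to_R2(A, B):
    """Replace v(AB) by its representative with initial point at the origin:
    the directed segment from (0, 0) to B - A, recorded as the point B - A."""
    return (B[0] - A[0], B[1] - A[1])

# Two equivalent directed segments determine the same vector of R^2 ...
assert to_R2((1, 1), (3, 2)) == to_R2((0, -1), (2, 0)) == (2, 1)

# ... and the triangle rule v(AB) + v(BC) = v(AC) becomes coordinatewise
# addition in R^2.
A, B, C = (1, 1), (3, 2), (2, 5)
vAB, vBC, vAC = to_R2(A, B), to_R2(B, C), to_R2(A, C)
assert (vAB[0] + vBC[0], vAB[1] + vBC[1]) == vAC
```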

EXERCISES

In Exercises 1 through 5 find two other directed line segments which have the same
magnitude and direction as AB for the given values of A and B.

1. A = (1, 1), B = (3, 1) 2. A = (-2, 1), B = (0, 0)


3. A = (1, -1), B = (3, 1) 4. A = (-4, -£), B = (-^f)
5. A = (0,2), B = (-3,-1)
In Exercises 6 through 10 find v(AB) + v(CD) for the given values of A, B, C, D.

6. A = (0,0), B = (1,2), C = (-1,2), D = (3,0)


7. A = (0, -1), B = (-2, 1), C = (3,0), D = (5, -2)
8. A = (-1, 1), B = (2, 1), C = (1, 3), D = (-1, 2)
9. A = (4, i), B = (f, -f), C = (-2, i), D = (0, -f)
10. A = (-2, -5), B = (-1, -2), C = (1,4), D = (6,7)
In Exercises 11 through 15 find the vector in ℝ² which represents the geometric vector v(AB) for the given values of A and B.

11. A = (-1, 1), B = (2, 1) 12. A = (-2, 1),B = (0, 0)


13. A = (0,2), B = (-3, -2) 14. A = (f,i), B = (-£,2)
15. A = (6, -3), B = (-7, -9)

In Exercises 16 through 20 find the value of αv(AB) for the given values of A, B, and α.

16. A = (-5, 7), B = (3, 10), α = 4    17. A = (9, 2), B = (5, -2), α = -J
18. A = (f, 6), B = (4, |), α = 3    19. A = (0, 5), B = (11, 0), α = -1
20. A = (-3, -1), B = (-4, -5), α = £

21. Let A = (x1, y1) and B = (x2, y2) be any two points in ℝ², and suppose AB ~ CD, where C = (x3, y3). Find the coordinates of the point D.

22. Let A = (x1, y1) and B = (x2, y2) be any two points in ℝ², and let α be a real number. Find a representative of the geometric vector αv(AB).

23. Given A = (x1, y1), B = (x2, y2), C = (x3, y3), D = (x4, y4), four points in ℝ², compute v(AB) + v(CD).


24. Prove that v(AB) + v(CD) = v(CD) + v(AB) for any pair of geometric vectors. [Hint: Use the result of Exercise 23.]

25. Prove that the addition of geometric vectors is associative. [Hint: Use the result

of Exercise 23.]
26. Using the results of Exercises 22 and 23 prove that α(v + w) = αv + αw and (α + β)v = αv + βv for any pair of geometric vectors v, w, and any pair of real numbers α, β.

27. Use the result of Exercise 22 to prove that (αβ)v = α(βv) for any geometric vector v, and any pair of real numbers α, β.

*l-9 EQUIVALENCE RELATIONS


We have seen that a geometric vector is a collection of directed line segments
mutually related by magnitude and direction. This is but one example of the
method whereby a new mathematical entity is defined as a collection of related
objects of some familiar type. We shall have occasion to use this technique again,
and it may therefore be of some interest to present it in its general setting. As
usual we begin with a definition.

Definition 1-10. An equivalence relation ℛ on a set S is a set of ordered pairs (x, y) of elements of S, subject to the following conditions:

(i) The pair (x, x) belongs to ℛ for every x in S;
(ii) If (x, y) belongs to ℛ, then so does (y, x);
(iii) If (x, y) and (y, z) belong to ℛ, then (x, z) belongs to ℛ.

Whenever an ordered pair (x, y) belongs to an equivalence relation on S, one says that x is equivalent to y, and writes x ~ y. Custom then dictates that the symbol ~ (usually called "tilde" or simply "wiggle"), rather than ℛ itself, be referred to as the equivalence relation on S. In these terms the defining conditions of an equivalence relation on S become

(i) x ~ x for all x in S,
(ii) x ~ y implies y ~ x,
(iii) x ~ y and y ~ z imply x ~ z.

One also says that an equivalence relation is reflexive, symmetric, and transitive, these names being given respectively to properties (i), (ii), and (iii) above.
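On a finite set S, conditions (i)-(iii) of Definition 1-10 can be checked mechanically. A small sketch (the function name is ours, not the text's):

```python
from itertools import product

def is_equivalence(S, R):
    """R is a set of ordered pairs of elements of S (Definition 1-10)."""
    reflexive = all((x, x) in R for x in S)
    symmetric = all((y, x) in R for (x, y) in R)
    transitive = all((x, z) in R
                     for (x, y), (y2, z) in product(R, R) if y == y2)
    return reflexive and symmetric and transitive

S = {0, 1, 2, 3}
# "same parity" as a set of ordered pairs
R = {(x, y) for x in S for y in S if (x - y) % 2 == 0}
assert is_equivalence(S, R)

# removing one pair but not its mirror image destroys symmetry
assert not is_equivalence(S, R - {(0, 2)})
```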

Equivalence relations crop up in every branch of mathematics, and usually in a


very fundamental way, as the following examples illustrate.

Example 1. The relation of equality applied to the elements of any set S is


obviously an equivalence relation on S. In fact, the notion of an equivalence
relation can be viewed as a generalized form of equality.

Example 2. Let S be the set of all triangles in the plane, and let Δ1 ~ Δ2 mean that Δ1 and Δ2 are congruent. Then ~ is an equivalence relation on S.

Example 3. Let S be the set of directed line segments in the plane, and let

AB ~ CD have the meaning assigned in Definition 1-8. Then (1-33) through


(1-35) imply that ~ is an equivalence relation on S.

Example 4. Let S be the set of real valued functions which are continuous at all but a finite number of points in an interval [a, b]. If f and g are two such functions, let f ~ g mean that f(x) = g(x) for all but a finite number of values of x in [a, b]. Then ~ is an equivalence relation on S (see Exercise 1).

Example 5. Let S be the set of all "symbols" of the form a/b, where a and b are integers, and b ≠ 0. Set (a/b) ~ (c/d) if and only if ad = bc. Then ~ is an equivalence relation on S (see Exercise 2). (Just as arrows in the plane represent geometric vectors, the symbols a/b in S represent rational numbers, and the equivalence relation introduced here allows us to define a rational number as a collection of mutually equivalent symbols of the form a/b. This example is discussed in greater detail in Exercise 6.)
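Example 5 is easy to make computational. Representing the symbol a/b by the pair (a, b) (our convention), the test ad = bc sorts the symbols into the classes that Exercise 6 will call rational numbers:

```python
def equiv(s, t):
    """(a/b) ~ (c/d) if and only if ad = bc."""
    (a, b), (c, d) = s, t
    assert b != 0 and d != 0
    return a * d == b * c

assert equiv((1, 2), (2, 4))       # 1/2 and 2/4 represent the same rational
assert equiv((-3, 6), (1, -2))
assert not equiv((1, 2), (2, 3))

# the equivalence class of 1/2 among symbols with small entries
symbols = [(a, b) for a in range(-4, 5) for b in range(-4, 5) if b != 0]
cls = [s for s in symbols if equiv(s, (1, 2))]
assert (2, 4) in cls and (-1, -2) in cls and (2, 3) not in cls
```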

If ~ is an equivalence relation on a set S, and x is any element in S, then the set


of all elements y in S such that y ~ x is called the equivalence class determined by x.
This equivalence class is denoted by [x], and any element belonging to it is said
to be a representative of [x]. In particular, x itself is a representative of the equiva-
lence class [x], since x ~ x.
We now come to the fundamental theorem concerning equivalence relations,

a special case of which was proved in the last section.

Theorem 1-9. If ~ is an equivalence relation on a set S, then every element of S belongs to one and only one equivalence class of elements of S.

For a proof, see Theorem 1-8.


It is this result which justifies using equivalence relations to define new mathematical objects. The objects in question are the equivalence classes, and the above theorem asserts that every such object is uniquely determined by any one of its representatives. This type of definition is sometimes called "definition by abstraction," since it passes from the particular to the general by regarding mutually equivalent individuals as identical.

EXERCISES

1. Prove that the relation defined in Example 4 above is an equivalence relation.

2. Prove that the relation defined in Example 5 above is an equivalence relation. What is the equivalence class determined by ^? By 5? By f?

3. Let S be the set of all integers, and let m be a fixed positive integer. If x and y belong to S, set x ~ y if and only if x - y is divisible by m. Prove that ~ is an equivalence relation on S, and find all the equivalence classes for this relation when m = 2, 3, and 4.

4. A partition of a set S is defined to be a collection 𝒫 of subsets of S, each of which contains at least one element of S, and such that every element of S belongs to one and only one of the subsets making up 𝒫. (Informally, a partition chops S into pieces, and these pieces are the subsets belonging to 𝒫.) In these terms, the fundamental theorem on equivalence relations can be restated as follows: If ~ is an equivalence relation on a set S, then the equivalence classes determined by ~ yield a partition of S. Prove the following converse of this theorem: If 𝒫 is a partition of a set S, then there exists an equivalence relation on S whose equivalence classes coincide with the subsets of the given partition 𝒫. (Note that these two results imply that the concepts of an equivalence relation on a set and a partition of a set are identical.)
*5. Let f be a function with domain 𝒟 and range ℛ (i.e., f associates with each x in 𝒟 a unique element y in ℛ, denoted f(x) and called the image of x under f, and each y in ℛ is the image of at least one x in 𝒟).
(a) If x1 and x2 belong to 𝒟, set x1 ~ x2 if and only if f(x1) = f(x2). Prove that ~ is an equivalence relation on 𝒟.
(b) Let 𝒫 be the partition of 𝒟 defined by this equivalence relation (see Exercise 4). Prove that there exists a one-to-one function g with domain 𝒫 and range ℛ such that g([x]) = f(x) for each x in 𝒟. (Recall that a function f is one-to-one if f(x1) = f(x2) implies x1 = x2.)
*6. The rational numbers. Let S be the set of all symbols a/b, b ≠ 0, introduced in Example 5 above, and let ~ be the equivalence relation that was defined on S; i.e., a/b ~ c/d if and only if ad = bc. Let Q be the set of all equivalence classes of the elements of S under this equivalence relation. Then the equivalence classes belonging to Q are, by definition, rational numbers, and the set Q itself is called the set of all rational numbers. Thus the rational number [a/b] is the equivalence class containing the symbol a/b.
(a) To define addition of rational numbers set

[a/b] + [c/d] = [(ad + bc)/bd].

Prove that if a'/b' ~ a/b and c'/d' ~ c/d, then

(a'd' + b'c')/b'd' ~ (ad + bc)/bd,

and thus conclude that the above equation actually defines an addition of equivalence classes.

(b) To define multiplication of rational numbers set

[a/b] · [c/d] = [ac/bd].

Prove that if a'/b' ~ a/b and c'/d' ~ c/d, then

a'c'/b'd' ~ ac/bd,

and thus conclude that the above equation actually defines a multiplication of equivalence classes.
2

linear transformations and matrices

2-1 LINEAR TRANSFORMATIONS

Up to this point our study of real vector spaces can best be described as a modest generalization of some of the ideas implicit in analytic geometry. Although such terms as linear dependence and independence, subspaces, bases, and the like, may have been unfamiliar, they actually add little to the knowledge of vector spaces taught in elementary geometry. All this changes, however, as soon as these ideas are used to study functions defined on vector spaces. Here new and important things do happen, and as the following discussion unfolds we shall find the concepts introduced earlier taking on added meaning and significance.

Definition 2-1. A linear transformation, or linear operator, from a vector space 𝒱₁ to a vector space 𝒱₂ is a function A which associates with each vector x in 𝒱₁ a unique vector A(x) in 𝒱₂ in such a way that

A(x1 + x2) = A(x1) + A(x2)    (2-1)

and

A(αx) = αA(x)    (2-2)

for all vectors x1, x2, x in 𝒱₁, and all scalars α.

In other words, a linear transformation is a function, or mapping, from one vector


space to another which sends sums into sums and scalar products into scalar
products (see Fig. 2-1). These requirements are sometimes referred to by saying
that a linear transformation is "compatible" with the algebraic operations of addi-
tion and scalar multiplication defined on vector spaces, and it is just this com-
patibility which accounts for the importance of such functions in linear algebra.
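Requirements (2-1) and (2-2) translate directly into a numerical test. The sketch below (our own helper names) applies it to the shear map of the plane that appears as Example 2 later in this section:

```python
def shear(x):
    # A(x1, x2) = (x1 + x2, x2), a horizontal shear of the plane
    x1, x2 = x
    return (x1 + x2, x2)

def respects_linearity(A, x, y, a):
    add = lambda u, v: (u[0] + v[0], u[1] + v[1])
    mul = lambda c, u: (c * u[0], c * u[1])
    sums_ok = A(add(x, y)) == add(A(x), A(y))      # condition (2-1)
    scalars_ok = A(mul(a, x)) == mul(a, A(x))      # condition (2-2)
    return sums_ok and scalars_ok

assert respects_linearity(shear, (1, 2), (-3, 5), 4)

# a non-linear map fails the test: translation by (1, 0) moves the origin
translate = lambda x: (x[0] + 1, x[1])
assert not respects_linearity(translate, (1, 2), (-3, 5), 4)
```

Passing the test at sample points is of course only evidence, not a proof of linearity; the proof is the algebraic argument the text asks the reader to supply.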

FIGURE 2-1

One consequence of Definition 2-1 is that a linear transformation always maps the zero vector of 𝒱₁ onto the zero vector of 𝒱₂; that is,

A(0) = 0.    (2-3)

Another is that

A(α1x1 + ··· + αnxn) = α1A(x1) + ··· + αnA(xn)    (2-4)

for any finite collection of vectors x1, ..., xn in 𝒱₁ and scalars α1, ..., αn. The first of these assertions can be established by setting α = 0 in (2-2), the second by repeated use of (2-1) and (2-2) in the obvious fashion. In particular, when n = 2, (2-4) becomes

A(α1x1 + α2x2) = α1A(x1) + α2A(x2).    (2-5)

We call attention to this equation in order to remark that, by itself, it can be (and often is) taken as the definition of a linear transformation, since (2-1) and (2-2) are satisfied if and only if (2-5) is. (See Exercise 17.) From time to time we shall use this fact when proving that a function is a linear transformation.
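Written out, the first of these computations, together with the n = 2 case, is a routine expansion using (2-1) and (2-2) (here 0 = 0 · x for any x in 𝒱₁):

```latex
A(0) = A(0 \cdot x) = 0 \cdot A(x) = 0,
\qquad
A(\alpha_1 x_1 + \alpha_2 x_2)
  = A(\alpha_1 x_1) + A(\alpha_2 x_2)
  = \alpha_1 A(x_1) + \alpha_2 A(x_2).
```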
If A is a linear transformation from 𝒱₁ to 𝒱₂ we write A: 𝒱₁ → 𝒱₂ (read "A maps 𝒱₁ into 𝒱₂"), and refer to 𝒱₁ as the domain of A. In this case the set of all vectors y in 𝒱₂ such that y = A(x) for some x in 𝒱₁ is called the image or range of A, and is denoted by ℐ(A). Lest it be overlooked, we point out that the image of A need not be all of 𝒱₂, a possibility which is made explicit by saying that A maps 𝒱₁ into 𝒱₂ (Fig. 2-2). Of course, it may happen that ℐ(A) = 𝒱₂, in which case the term onto is used. And finally, it should be observed that there is nothing in the above definition to prevent 𝒱₁ and 𝒱₂ from being one and the same vector space. Indeed, this is one of the most fruitful settings in which to pursue the study of linear transformations.

FIGURE 2-2

FIGURE 2-3    FIGURE 2-4
We conclude this section by giving a number of examples, several of which will
figure prominently in our later work. For the most part we simply state the
definition of the function in question, and omit the routine verification of linearity
in the expectation that the reader will supply the missing argument for himself.

Example 1. Let x = (x1, x2) be an arbitrary vector in ℝ², and set

A(x) = (x1, -x2).

Geometrically A can be described as the linear transformation mapping ℝ² onto itself by reflection across the x1-axis. (See Fig. 2-3 where, for generality, the effect of A has been depicted relative to an oblique coordinate system.)
Example 2. Let A be the mapping of ℝ² onto itself obtained by shearing the plane horizontally so that the x2-axis is shifted through a 45° angle, as shown in Fig. 2-4. Analytically, A is defined by the equation

A(x1, x2) = (x1 + x2, x2),

and is clearly linear.
Example 3. Let ℒ be any line through the origin in ℝ³, and let A be a fixed rotation about ℒ. Then, arguing geometrically, it is easy to show that A is a linear transformation mapping ℝ³ onto itself.

Example 4. The mapping which sends each vector in 𝒱₁ onto the zero vector in 𝒱₂ is clearly a linear transformation from 𝒱₁ to 𝒱₂ for all 𝒱₁ and 𝒱₂. It is called the zero transformation, and is denoted by the symbol O, irrespective of the vector spaces involved.

Example 5. A second linear transformation for which we reserve a special symbol is the identity transformation I mapping a vector space 𝒱 onto itself. The defining equation for I is

I(x) = x

for all x in 𝒱; its linearity is obvious.

Example 6. Consider the space 𝒞[a, b] of all real valued continuous functions on the interval [a, b], and for each f in 𝒞[a, b] set

A(f) = ∫ₐˣ f(t) dt,    a ≤ x ≤ b.

Then since A(f) is continuous on [a, b], A can be viewed as a mapping of 𝒞[a, b] into itself. As such it is linear since

A(α1f1 + α2f2) = ∫ₐˣ [α1f1(t) + α2f2(t)] dt
= ∫ₐˣ α1f1(t) dt + ∫ₐˣ α2f2(t) dt
= α1 ∫ₐˣ f1(t) dt + α2 ∫ₐˣ f2(t) dt
= α1A(f1) + α2A(f2).
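The computation above can also be checked numerically: approximating A(f)(x) = ∫ₐˣ f(t) dt by a Riemann sum, the operator sends a linear combination of functions to the same combination of their integrals, up to floating-point rounding. A sketch, with the step count and the test functions chosen arbitrarily:

```python
import math

def A(f, a=0.0, n=1000):
    """Approximate A(f)(x) = integral of f from a to x by a left Riemann sum."""
    def F(x):
        h = (x - a) / n
        return sum(f(a + k * h) for k in range(n)) * h
    return F

f1, f2 = math.sin, math.exp
a1, a2 = 2.0, -3.0
combo = lambda t: a1 * f1(t) + a2 * f2(t)

x = 1.5
lhs = A(combo)(x)                       # A(a1 f1 + a2 f2)(x)
rhs = a1 * A(f1)(x) + a2 * A(f2)(x)     # a1 A(f1)(x) + a2 A(f2)(x)
assert abs(lhs - rhs) < 1e-9            # equal up to rounding error
```

The two sides agree essentially exactly because a Riemann sum is itself a linear operation on f; only the approximation of the integral is inexact.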

Example 7. For the same reasons as those just given, the mapping A: 𝒞[a, b] → ℝ¹ defined by

A(f) = ∫ₐᵇ f(x) dx

is also linear.

Example 8. Let 𝒞¹[a, b] denote the space of all continuously differentiable functions on [a, b] (see Example 3 in Section 1-4), and let D denote the operation of differentiation on this space; that is, D(f) = f′. Then the familiar identities

D(f1 + f2) = D(f1) + D(f2),    D(αf) = αD(f)

imply that D is a linear transformation from 𝒞¹[a, b] to 𝒞[a, b]. More generally, the operation of taking nth derivatives is a linear transformation mapping the space of n-times continuously differentiable functions on an interval [a, b] into the space 𝒞[a, b].
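On polynomials, the identities displayed above can be verified coefficient by coefficient. A sketch (our representation: p = c0 + c1 x + ··· stored as the list [c0, c1, ...]):

```python
def D(p):
    """Differentiate c0 + c1*x + c2*x^2 + ... given as [c0, c1, c2, ...]."""
    return [k * c for k, c in enumerate(p)][1:] or [0]

def add(p, q):
    n = max(len(p), len(q))
    p, q = p + [0] * (n - len(p)), q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def scale(a, p):
    return [a * c for c in p]

p, q = [1, 0, 3], [2, -1]                # 1 + 3x^2  and  2 - x
assert D(add(p, q)) == add(D(p), D(q))   # D(f1 + f2) = D(f1) + D(f2)
assert D(scale(5, p)) == scale(5, D(p))  # D(a f) = a D(f)
```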

EXERCISES

Prove that each of the following equations defines a linear transformation from ℝ² into (or onto) itself, and describe the effect of the transformation in geometric terms.

1. A(x1, x2) = -(x1, x2)    2. A(x1, x2) = (2x1, x2)
3. A(x1, x2) = 2(x1, x2/3)    4. A(x1, x2) = 3(x1, x2)
5. A(x1, x2) = √2(x1 - x2, x1 + x2)    6. A(x1, x2) = (x2, x1)

7. A(x1, x2) = -(x2, x1)    8. A(x1, x2) = (x1 + x2, x1 + x2)
9. A(x1, x2) = (0, 0)    10. A(x1, x2) = (x1 + x2, 0)

Determine which of the following equations defines a linear transformation on the space of polynomials 𝒫.

11. A(p) = p(x)²    12. A(p) = x(p(x))
13. A(p) = p(x + 1) - p(x)    14. A(p) = p(x + 1) - p(0)
15. A(p) = p″(x) - 2p′(x)    16. A(p) = p(x²)


17. Prove that A: 𝒱₁ → 𝒱₂ is linear if and only if

A(α1x1 + α2x2) = α1A(x1) + α2A(x2)

for all x1, x2 in 𝒱₁, and all scalars α1, α2.

18. Prove Eq. (2-4). [Hint: Use mathematical induction.]


19. Let e1, ..., en be a basis for a finite dimensional vector space 𝒱, and for each index i, 1 ≤ i ≤ n, let ηi be an arbitrary real number. Prove that the function A: 𝒱 → ℝ¹ defined by

A(α1e1 + ··· + αnen) = α1η1 + ··· + αnηn

for each vector α1e1 + ··· + αnen in 𝒱 is a linear transformation.
20. Let 𝒱₁ be a finite dimensional vector space with basis e1, ..., en, let 𝒱₂ be an arbitrary vector space, let y1, ..., yn be vectors in 𝒱₂, and for each x = α1e1 + ··· + αnen in 𝒱₁ set

A(x) = α1y1 + ··· + αnyn.

Prove that A is a linear transformation from 𝒱₁ to 𝒱₂.


*21. Find all linear transformations mapping ℝ¹ into (or onto) itself. [Hint: 1 is a basis for ℝ¹.]

2-2 ADDITION AND SCALAR MULTIPLICATION OF TRANSFORMATIONS
We begin the systematic study of linear transformations by describing several ways in which new transformations can be formed from old ones. Of these the simplest is the addition of two transformations, each of which maps a given vector space 𝒱₁ into the same space 𝒱₂. The definition reads as follows:

Definition 2-2. Let A and B be linear transformations from 𝒱₁ to 𝒱₂. Then their sum, A + B, is the transformation from 𝒱₁ to 𝒱₂ defined by

(A + B)(x) = A(x) + B(x)    (2-6)

for all x in 𝒱₁.

This, of course, is just the familiar addition of functions here applied to linear transformations, and it is an easy matter to show that A + B is again a linear transformation from 𝒱₁ to 𝒱₂. Indeed, if x1 and x2 belong to 𝒱₁, and α1 and α2 are scalars, then

(A + B)(α1x1 + α2x2) = A(α1x1 + α2x2) + B(α1x1 + α2x2)
= α1A(x1) + α2A(x2) + α1B(x1) + α2B(x2)
= α1[A(x1) + B(x1)] + α2[A(x2) + B(x2)]
= α1(A + B)(x1) + α2(A + B)(x2).

Thus A + B satisfies Eq. (2-5), and is therefore linear, as asserted.
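In code, Definition 2-2 is pointwise addition of functions, and the computation above can be replayed numerically. A sketch using two maps of ℝ² modeled on Examples 1 and 2 of Section 2-1 (the lambda names are ours):

```python
def vec_add(u, v):
    return tuple(a + b for a, b in zip(u, v))

def transform_sum(A, B):
    """(A + B)(x) = A(x) + B(x), Eq. (2-6)."""
    return lambda x: vec_add(A(x), B(x))

reflect = lambda x: (x[0], -x[1])          # Example 1, Section 2-1
shear   = lambda x: (x[0] + x[1], x[1])    # Example 2, Section 2-1

S = transform_sum(reflect, shear)
assert S((3, 4)) == (10, 0)

# the sum is again linear: it satisfies Eq. (2-5) at sample points
x, y = (1, 2), (-3, 5)
assert S(vec_add(x, y)) == vec_add(S(x), S(y))
```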

Example 1. Let D and D² denote, respectively, the operations of taking first and second derivatives in 𝒞²[a, b]. Then the sum D² + D is the linear transformation from 𝒞²[a, b] to 𝒞[a, b] which sends each function y in 𝒞²[a, b] onto the continuous function y″ + y′; that is,

(D² + D)(y) = D²y + Dy.

Example 2. Let K(t) be a fixed function in 𝒞[a, b], and let A be the linear transformation mapping 𝒞[a, b] into itself given by

A(f) = ∫ₐˣ K(t)f(t) dt,    a ≤ x ≤ b.

Then the sum A + I, I the identity transformation on 𝒞[a, b] (see Example 5, Section 2-1), is the linear transformation mapping 𝒞[a, b] into itself whose defining equation is

(A + I)(f) = ∫ₐˣ K(t)f(t) dt + f.

The addition of linear transformations defined above has a number of familiar and suggestive properties. In the first place, it is clear that

A + (B + C) = (A + B) + C    (2-7)

and

A + B = B + A    (2-8)

whenever A, B, and C are linear transformations from 𝒱₁ to 𝒱₂ (see Exercise 2). Secondly, the zero mapping from 𝒱₁ to 𝒱₂ defined in Example 4 of the preceding section acts as a "zero" for this addition since

A + O = O + A = A    (2-9)

for all A: 𝒱₁ → 𝒱₂. And finally, if A is any linear transformation from 𝒱₁ to 𝒱₂, and if we define -A by the equation

(-A)(x) = -A(x)    (2-10)

for all x in 𝒱₁, we obtain a linear transformation from 𝒱₁ to 𝒱₂ with the property that

A + (-A) = -A + A = O.    (2-11)

In short, the addition of linear transformations from 𝒱₁ to 𝒱₂ satisfies all of the axioms postulated for addition in a vector space.
To complete what should by now be an obvious sequence of ideas we introduce a scalar multiplication on the set of linear transformations from 𝒱₁ to 𝒱₂. The relevant definition is as follows:

Definition 2-3. The product of a real number α and a linear transformation A: 𝒱₁ → 𝒱₂ is the mapping αA from 𝒱₁ to 𝒱₂ given by

(αA)(x) = αA(x)    (2-12)

for all x in 𝒱₁. In other words, αA is the function whose value at x is computed by forming the scalar product of α and the vector A(x).

We omit the proof that αA is linear, as well as the easy sequence of arguments required to show that the remaining axioms in the definition of a real vector space are now satisfied. Granting the truth of these facts, we have

Theorem 2-1. The set of linear transformations from 𝒱₁ to 𝒱₂ is itself a real vector space under the definitions of addition and scalar multiplication given above.

EXERCISES

1. Cite the relevant axiom or definition needed to justify each step in the proof of the
linearity of A + B.
2. Prove that addition of linear transformations is associative and commutative.
3. Prove that the mapping -A defined in (2-10) is linear, and that

A + (-A) = -A + A = O.

4. (a) Prove that αA as defined in the text is linear.
(b) Prove Theorem 2-1.

5. (a) Let 𝒳 be a nonempty subset of a vector space 𝒱₁, and let 𝒜(𝒳) denote the set of all linear transformations A from 𝒱₁ to 𝒱₂ with the property that A(x) = 0 for all x in 𝒳. Prove that 𝒜(𝒳) is a subspace of the space of all linear transformations from 𝒱₁ to 𝒱₂.
(b) What is 𝒜(𝒳) when 𝒳 consists of just the zero vector? When 𝒳 = 𝒱₁?
(c) Prove that 𝒜(𝒳) = 𝒜(𝒮(𝒳)).

*6. Let 𝒱₁ and 𝒱₂ be finite dimensional vector spaces with bases e1, ..., en and ē1, ..., ēm, respectively. For each pair of integers i, j, 1 ≤ i ≤ n, 1 ≤ j ≤ m, define Aij: 𝒱₁ → 𝒱₂ by first defining Aij on the basis vectors of 𝒱₁ according to the formula

Aij(ek) = ēj if k = i,    Aij(ek) = 0 if k ≠ i,

and then use (2-4) to obtain the value of Aij for each x in 𝒱₁. (See Exercise 20, Section 2-1.)
(a) Prove that the Aij are linear transformations from 𝒱₁ to 𝒱₂, and that they are linearly independent in the vector space of all such transformations.
(b) Prove that the Aij span the space of linear transformations from 𝒱₁ to 𝒱₂, and hence deduce that this space is finite dimensional with dimension mn. [Hint: Two linear transformations from 𝒱₁ to 𝒱₂ are identical if and only if they coincide on a basis for 𝒱₁.]

2-3 PRODUCTS OF LINEAR TRANSFORMATIONS


The theorem established at the end of the last section would seem to imply that the study of linear transformations can be subsumed within the general theory of vector spaces. Such would indeed be the case were addition and scalar multiplication the only algebraic operations that could be performed on linear transformations. However, under suitable hypotheses, it is also possible to define a multiplication of transformations. And, as we shall see, this single fact makes their study much richer in content and quite different in spirit from that of vector spaces alone.

FIGURE 2-5

To introduce this multiplication, let 𝒱₁, 𝒱₂, 𝒱₃ be vector spaces, and consider a pair of linear transformations

A: 𝒱₁ → 𝒱₂,    B: 𝒱₂ → 𝒱₃.

Then, for each x in 𝒱₁, A(x) is a vector in 𝒱₂, and it therefore makes sense to speak of applying B to A(x) to obtain the vector B(A(x)) in 𝒱₃ (see Fig. 2-5). Thus A and B can be combined, or multiplied, to produce a function from 𝒱₁ to 𝒱₃ which will be denoted by BA, and called the product of A and B in that order, viz., first A, then B. This is the content of

Definition 2-4. If A: 𝒱₁ → 𝒱₂ and B: 𝒱₂ → 𝒱₃ are linear transformations, then their product, BA, is the mapping from 𝒱₁ to 𝒱₃ defined by the equation

BA(x) = B(A(x))    (2-13)

for all x in 𝒱₁.
The essential fact about such products is that they are always linear. Indeed, if x1 and x2 belong to 𝒱₁, and α1 and α2 are arbitrary real numbers, then

BA(α1x1 + α2x2) = B[A(α1x1 + α2x2)]
= B[α1A(x1) + α2A(x2)]
= α1B(A(x1)) + α2B(A(x2))
= α1BA(x1) + α2BA(x2).

Hence BA satisfies (2-5), and its linearity has been established.


Before going any further, a comment on notation is in order. At first sight it might seem more reasonable to denote the product of A and B by AB rather than BA as above. The explanation for not adopting this notation is quite simple. Were it used, (2-13) would have to be changed to read AB(x) = B(A(x)), and the writing of equations would then be an open invitation to error.
Having established the convention that the symbol BA stands for the product of A and B, in that order, we observe that this product is defined only when the image of A is contained in the domain of B. Thus one of the products AB or BA may exist and the other not, a phenomenon which will reappear later when we introduce the subject of matrices. But even when both A and B map a given vector space into itself, in which case AB and BA are linear transformations on the same space, it is by no means true that they must be equal. A simple example of this disturbing fact can be given in ℝ² by letting A be a counterclockwise rotation of 90° about the origin, and B a reflection across any line through the origin, say the x-axis. Then, with e1 and e2 the standard basis vectors, AB(e1) = e2 while BA(e1) = -e2, and AB ≠ BA. (Picture?) In short, the multiplication of linear transformations is noncommutative.
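The rotation/reflection computation takes only a few lines to verify (the lambda names are ours):

```python
A = lambda x: (-x[1], x[0])   # counterclockwise rotation through 90 degrees
B = lambda x: (x[0], -x[1])   # reflection across the x-axis

AB = lambda x: A(B(x))        # first B, then A
BA = lambda x: B(A(x))        # first A, then B

e1 = (1, 0)
assert AB(e1) == (0, 1)       # AB(e1) = e2
assert BA(e1) == (0, -1)      # BA(e1) = -e2, so AB != BA
```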
The foregoing example illustrates one of the ways in which this multiplication differs from "ordinary" multiplication. Why then call it multiplication at all? The answer is provided by the following identities, which show that most of the properties usually associated with the term multiplication are still valid when phrased in terms of linear transformations. Specifically, assuming that all of the indicated products are defined, we have

A(BC) = (AB)C,    (2-14)
(A1 + A2)B = A1B + A2B,    A(B1 + B2) = AB1 + AB2,    (2-15)
(αA)B = A(αB) = α(AB),    α a scalar,    (2-16)
AI = A,    IA = A,    I the identity map.    (2-17)

The first of these identities asserts that the multiplication of linear transformations is associative, the next two that it is distributive over addition, and the fourth that it commutes with the operation of scalar multiplication. Finally, (2-17) implies

that the identity transformation plays the same role in operator multiplication that the number one plays in arithmetic. The reader should note, however, that two different identity maps are usually involved here, and, strictly speaking, if A: 𝒱₁ → 𝒱₂, then (2-17) ought to be written

A I𝒱₁ = A,    I𝒱₂ A = A,

where I𝒱₁ denotes the identity map on 𝒱₁, I𝒱₂ the identity map on 𝒱₂. But this notation is rarely used since the meaning of the unidentified symbol I is always clear from the context.
The proof of each of the above identities is an easy exercise in the definitions
of the operations involved. Thus to establish (2-14) suppose that C: Vx —» V 2
,

B: V 2 -> V 3 and A: V 3 ,
— V4
> . Then each of the products A(BC) and (AB)C
isa linear transformation from V i to V 4 and to prove their , equality we simply
apply Definition 2-4, twice for each product. This gives

[A(BC)](x) = A(BC(x)) = A(B(C(x))),

and

[(AB)C](x) = AB(C(x)) = A(B(C(x))),

and (2-14) now follows from the equality of the right-hand sides of these
expressions. The remaining proofs have been left to the exercises.

Example 1. Powers of a linear transformation. If A is a linear transformation
on a fixed vector space V (i.e., A: V → V) we can form the product of A with
itself any finite number of times, thereby obtaining a sequence of linear trans-
formations on V known as the powers of A. The associativity of operator multi-
plication implies that each of these powers is independent of the grouping of
its factors and hence can be denoted without ambiguity by Aⁿ, n a positive integer.
Thus

A¹ = A,   A² = AA,   A³ = AA², ....

In addition, it is customary to let A⁰ denote the identity transformation on V,
i.e., A⁰ = I, so that all of the familiar rules for manipulating (nonnegative)
exponents become valid. In particular, we have

AᵐAⁿ = AⁿAᵐ = Aᵐ⁺ⁿ,   (Aᵐ)ⁿ = Aᵐⁿ

for all nonnegative integers m and n.
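When powers are computed by repeated multiplication, the exponent laws above can be checked mechanically; the shear matrix used below is an illustrative choice of our own.

```python
# A small check of the exponent law A^m A^n = A^(m+n) for a linear map
# on R^2 represented by a 2x2 matrix.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def power(A, n):
    """A^n with A^0 = I, computed by repeated multiplication."""
    result = [[1, 0], [0, 1]]          # the identity transformation I
    for _ in range(n):
        result = matmul(result, A)
    return result

A = [[1, 1], [0, 1]]                   # a shear of the plane; A^k = [[1, k], [0, 1]]
m, n = 2, 3
print(matmul(power(A, m), power(A, n)) == power(A, m + n))
print(matmul(power(A, n), power(A, m)) == power(A, m + n))
```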


Example 2. Let D be the differentiation operator on the space of polynomials
𝒫ₙ. Then D is a linear transformation mapping 𝒫ₙ into itself, and its powers are
simply the derivatives of orders two, three, etc. Since differentiation lowers the
degree of every nonzero polynomial by one, the nth power of D maps every
polynomial in 𝒫ₙ onto zero, and Dⁿ is the zero transformation on 𝒫ₙ. However,
if n > 1, then Dⁿ⁻¹, and hence D itself, is certainly different from zero, and we
have therefore shown that a power of a nonzero linear transformation may be zero.

2-3 | PRODUCTS OF LINEAR TRANSFORMATIONS 51

In general, a nonzero linear transformation A: V → V with the property that
Aⁿ = O for some n > 1 is said to be nilpotent on V, and the smallest integer n
such that Aⁿ = O is called the degree of nilpotence of A. We call attention to
the fact that the property of being nilpotent actually depends upon the vector
space under consideration as well as the linear transformation involved. For
instance, D is nilpotent of degree n on 𝒫ₙ, but is not nilpotent on 𝒫. (Why?)
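The nilpotence of D on 𝒫ₙ can be watched directly by representing polynomials as coefficient lists; the representation below is our own sketch, not notation from the text.

```python
# Differentiation on P_4 modeled on coefficient lists: the polynomial
# a0 + a1*x + a2*x^2 + a3*x^3 is the list [a0, a1, a2, a3].

def D(p):
    """Derivative, padded with a trailing zero so the space P_n is fixed."""
    return [k * p[k] for k in range(1, len(p))] + [0]

def D_power(p, m):
    for _ in range(m):
        p = D(p)
    return p

p = [5, 0, 0, 1]            # 5 + x^3 in P_4  (n = 4)
print(D_power(p, 3))        # [6, 0, 0, 0] -- the constant 6, so D^3 != O
print(D_power(p, 4))        # [0, 0, 0, 0] -- D^4 is the zero map on P_4
```

On the growing space 𝒫 the same operator is not nilpotent, since no fixed power annihilates every polynomial.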
Example 3. Polynomials in A. If A is a linear transformation on a vector space
V, we can use the powers of A together with the operations of addition and scalar
multiplication to form polynomials in A. Thus if

p(x) = a₀ + a₁x + ⋯ + aₙxⁿ

is a polynomial in x with real coefficients, we define p(A) to be the linear trans-
formation on V obtained by substituting A for x in p(x). In other words,

p(A) = a₀I + a₁A + ⋯ + aₙAⁿ,
or
p(A) = a₀ + a₁A + ⋯ + aₙAⁿ,

the factor I being understood in the first term of this expression just as x⁰ = 1 is
understood in p(x). Hence if x is any vector in V,

p(A)(x) = a₀x + a₁A(x) + ⋯ + aₙAⁿ(x).

Multiplicatively, these polynomials obey all of the familiar rules of polynomial
algebra with the single exception that products can sometimes vanish without
any of their factors vanishing, as was shown in the example above. In particular,
the multiplication of polynomials in a linear transformation is commutative, since
the identity p(x)q(x) = q(x)p(x) for "ordinary" polynomials p and q implies that
p(A)q(A) = q(A)p(A). This, in turn, implies that such polynomials can be factored,
for, as the reader will remember, factorization of polynomials depends only on
the commutativity of multiplication and its distributivity over addition.
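That polynomials in one fixed transformation always commute, even though general products do not, can be verified numerically; the matrix A and the polynomials p(x) = x + 2, q(x) = x − 1 below are arbitrary illustrative choices.

```python
# Illustrative check that polynomials in a fixed matrix A commute:
# p(A)q(A) == q(A)p(A) for p(x) = x + 2 and q(x) = x - 1.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(A, B):
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

def scale(c, A):
    return [[c * A[i][j] for j in range(2)] for i in range(2)]

I = [[1, 0], [0, 1]]
A = [[0, 1], [2, 3]]          # any fixed 2x2 matrix will do

pA = add(A, scale(2, I))      # p(A) = A + 2I
qA = add(A, scale(-1, I))     # q(A) = A - I
print(matmul(pA, qA) == matmul(qA, pA))
```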

Example 4. Let 𝒞^∞[a, b] denote the space of all infinitely differentiable functions
defined on the interval [a, b], and again let D be differentiation. Then D maps
𝒞^∞[a, b] into itself, and we can therefore form polynomials in D, which in this
setting are expressions of the type

aₙDⁿ + aₙ₋₁Dⁿ⁻¹ + ⋯ + a₁D + a₀,

a₀, ..., aₙ real numbers. (Such expressions are known as constant coefficient
linear differential operators, and it should be observed that they can also be
interpreted as linear transformations from 𝒞ⁿ[a, b] to 𝒞[a, b].) The polynomial
D² + D − 2 is a typical example, and if y is any function in 𝒞^∞[a, b] (or 𝒞²[a, b]),
then

(D² + D − 2)y = d²y/dx² + dy/dx − 2y.

By virtue of the remarks made in the preceding example, we know that
D² + D − 2 may be rewritten in either of the equivalent forms (D + 2)(D − 1)
or (D − 1)(D + 2), and in this case it is easy to verify directly that these factor-
izations are correct. Indeed,

(D + 2)(D − 1)y = (D + 2)[(D − 1)y]
                = (d/dx)(dy/dx − y) + 2(dy/dx − y)
                = d²y/dx² + dy/dx − 2y
                = (D² + D − 2)y,

while a similar calculation yields the equality (D − 1)(D + 2) = D² + D − 2.
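The factorization just verified can also be checked on polynomials, where both sides of the identity are exactly computable; the coefficient-list representation below is our own sketch.

```python
# Verifying (D + 2)(D - 1) = D^2 + D - 2 on polynomials, represented
# as coefficient lists [a0, a1, a2, a3] in a fixed space P_4.

def D(p):
    return [k * p[k] for k in range(1, len(p))] + [0]

def add(p, q):
    return [a + b for a, b in zip(p, q)]

def scale(c, p):
    return [c * a for a in p]

def D_minus_1(p):  return add(D(p), scale(-1, p))
def D_plus_2(p):   return add(D(p), scale(2, p))

y = [1, -2, 0, 4]                            # 1 - 2x + 4x^3
lhs = D_plus_2(D_minus_1(y))                 # (D + 2)(D - 1)y
rhs = add(add(D(D(y)), D(y)), scale(-2, y))  # (D^2 + D - 2)y
print(lhs == rhs)                            # the factorization holds
print(D_minus_1(D_plus_2(y)) == rhs)         # and the factors commute
```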

Example 5. Let A and B denote the linear transformations mapping 𝒞^∞[a, b]
into itself defined by

A = xD + 1,   B = D − x;

that is,

A(y) = x(dy/dx) + y,   B(y) = dy/dx − xy

for each y in 𝒞^∞[a, b] (see Exercise 4). Then

AB(y) = (xD + 1)[(D − x)y]
      = x(d/dx)(dy/dx − xy) + (dy/dx − xy)
      = x(d²y/dx²) + (1 − x²)(dy/dx) − 2xy
      = [xD² + (1 − x²)D − 2x]y,

and hence

(xD + 1)(D − x) = xD² + (1 − x²)D − 2x.                           (2-18)

On the other hand,

BA(y) = (D − x)[x(dy/dx) + y]
      = (d/dx)[x(dy/dx) + y] − x[x(dy/dx) + y]
      = x(d²y/dx²) + (2 − x²)(dy/dx) − xy
      = [xD² + (2 − x²)D − x]y,

and hence

(D − x)(xD + 1) = xD² + (2 − x²)D − x.                            (2-19)

Comparing these results, we see that

(xD + 1)(D − x) ≠ (D − x)(xD + 1),

and we have another illustration of the noncommutativity of operator multiplica-
tion. The reader should note that in this case neither the product AB nor BA can
be evaluated by using the rules of elementary algebra. This is another of the
unpleasant consequences of a noncommutative multiplication, and, as we shall
see, has a decisive effect upon the study of linear differential equations.
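Formulas (2-18) and (2-19) can be confirmed on a concrete function. The sketch below applies both products to y(x) = x, with polynomials held as coefficient lists that are allowed to grow (our own representation, not the text's).

```python
# Check that xD + 1 and D - x do not commute, by applying both products
# to y(x) = x.  Polynomials are coefficient lists [a0, a1, ...].

def D(p):
    return [k * p[k] for k in range(1, len(p))] or [0]

def x_times(p):
    return [0] + p                       # multiplication by x shifts coefficients

def add(p, q):
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p)); q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def neg(p): return [-a for a in p]

def xD_plus_1(p):  return add(x_times(D(p)), p)
def D_minus_x(p):  return add(D(p), neg(x_times(p)))

y = [0, 1]                               # y(x) = x
left  = xD_plus_1(D_minus_x(y))          # (2-18) predicts 1 - 3x^2
right = D_minus_x(xD_plus_1(y))          # (2-19) predicts 2 - 2x^2
print(left, right)                       # the two results differ
```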

In view of the examples just given it is clear that the time has come for us to
discuss the general problem of functional notation. All of the functions which
we shall encounter in this text will be elements in one of a number of vector
spaces, and hence should be denoted by such symbols as f or f(x) were we to be
inflexible in our use of boldface type. This, however, would ultimately involve
us in such unsightly (and confusing) expressions as xⁿ, sin x, d²y/dx², ∫ₐᵇ f(x)g(x) dx.
Such pedantry is pointless, and so we shall use the symbols f and f(x) when in
our opinion the printed page or its reader would suffer from the use of boldface
type. As a general rule, when we wish to call attention to the fact that a function
is a vector, we shall emphasize it; otherwise, not.

Finally, to settle notational matters once and for all, we comment on our in-
tended use (and mild misuse) of the symbols f and f(x). Strictly speaking, f
should be used to denote a function, and f(x) its value at the point x. But here
again strict adherence to the letter of the law violates the spirit of clarity of
exposition, for then we would be forced, for example, to write ( )ⁿ, sin, and exp,
where everyone expects xⁿ, sin x, and eˣ. We shall not disappoint the reader's
expectations on this score either.

EXERCISES

1. Cite the relevant axiom or definition needed to justify each step in the proof of the
linearity of BA as given in the text.
2. Prove the distributivity formulas (2-15).

3. Let D and L denote, respectively, the operations of differentiation and integration
on 𝒞^∞[a, b]; that is,

D(y) = dy/dx,   L(y) = ∫ₐˣ y(t) dt

for all y in 𝒞^∞[a, b].
(a) Compute the value of LD and DL.
(b) Compute the value of LⁿDⁿ and DⁿLⁿ, n a non-negative integer.
4. (a) Prove that the mappings A and B defined in Example 5 above are linear trans-
formations on 𝒞^∞[a, b].
(b) Prove that every expression of the form

aₙ(x)Dⁿ + aₙ₋₁(x)Dⁿ⁻¹ + ⋯ + a₁(x)D + a₀(x)

defines a linear transformation from 𝒞ⁿ[a, b] to 𝒞[a, b] whenever a₀(x), ..., aₙ(x)
are continuous on [a, b].

5. Write each of the following products in the form a₂(x)D² + a₁(x)D + a₀(x).
(a) (xD + 1)²
(b) (2xD + 1)(D − 1)
(c) (D − 1)(2xD + 1)
(d) (x²D + 2x)(D − 2x)
(e) (D − 2x)(x²D + 2x)
6. Let A, B: V → V be linear, and suppose that AB = BA. Find a formula for
(A + B)ⁿ, n a non-negative integer.

7. In each of the following find the result obtained by applying the given polynomial
in D to the indicated functions.
(a) D² − 1;  2eˣ, e⁻ˣ, eˣ + e⁻ˣ
(b) D² + 1;  sin x + cos x, 2 sin 2x, eˣ
(c) (D + 1)(D − 2);  sin x + e⁻ˣ + e²ˣ, eˣ, x²
(d) (D + 2)²;  e⁻²ˣ, xe⁻²ˣ, x²e⁻²ˣ
(e) (xD − 1)(xD + 2);  x², (x³ + 1)/x, eˣ
8. (a) Prove that AB = BA for any pair of linear transformations A, B mapping a
one-dimensional vector space V into itself. [Hint: Choose a basis for V.]
(b) Let V be finite dimensional, with dim V > 1. Prove that there always exist
linear transformations A and B mapping V into itself such that AB ≠ BA.
9. Let A, B, C be linear transformations each mapping a given (unspecified) vector
space into another vector space, and suppose that each of the products AB and BC
is defined. Prove that (AB)C and A(BC) are also defined.
*10. (a) Let A and B be nilpotent linear transformations on a vector space V, and sup-
pose that AB = BA. Prove that AB is nilpotent on V.
(b) Give an example to show that the conclusion in (a) may be false if AB ≠ BA.
11. Let B be a fixed linear transformation on a vector space V. Show that the set of
all linear transformations A on V such that AB = O is a subspace of the vector
space of all linear transformations on V. What is the subspace if B = O? If B = I?



2-4 THE NULL SPACE AND IMAGE; INVERSES


Let A be a linear transformation from Di to U 2 and let ?fl(A) denote the set of
,

all x in Di such that A(x) = 0. Then, as we have already observed, 31(A) always
contains the zero vector of D x . Actually we can say much more than this, for if
^(xO = A(x 2 ) = 0, then

A(aiZi + a 2x 2) = axAixx) + a 2 A(x 2 ) =

for all scalars a 1} a 2 and , it follows that 31(A) is a subspace o/Vi. This subspace
is called the null space or kernel of A, and is of fundamental importance in studying
the behavior of A on 1) i.

Of equal importance with the null space of A is its image, ℑ(A), which, we recall,
is the set of all y in V₂ such that y = A(x) for some x in V₁. It too is a subspace,
this time in V₂, since if y₁ and y₂ belong to ℑ(A) with y₁ = A(x₁), y₂ = A(x₂),
then

A(α₁x₁ + α₂x₂) = α₁A(x₁) + α₂A(x₂) = α₁y₁ + α₂y₂,

and α₁y₁ + α₂y₂ is also in the image of A, as required.

Example 1. Let I: V → V be the identity transformation. Then 𝔑(I) = 0,
the trivial subspace of V, while ℑ(I) = V.

Example 2. If O: V₁ → V₂ is the zero transformation, then, by its very defi-
nition, 𝔑(O) = V₁, ℑ(O) = 0.

Example 3. Let D be the differentiation operator on the space of polynomials
𝒫ₙ. Then the null space of D consists of all polynomials of degree zero together
with the zero polynomial, while its image consists of the zero polynomial and all
polynomials of degree < n − 1.

Example 4. Let 𝒞²(−∞, ∞) denote the space of all twice continuously differen-
tiable functions on (−∞, ∞), and let A: 𝒞²(−∞, ∞) → 𝒞(−∞, ∞) be the linear
transformation D² − I. Then

A(y) = d²y/dx² − y,

and the null space of A is the set of all functions y in 𝒞²(−∞, ∞) for which

d²y/dx² − y = 0.

Thus, 𝔑(A) is the set of solutions of a certain differential equation, and the prob-
lem of finding all solutions of this equation is identical with that of finding the
null space of D² − I.

Example 5. Let ℝ^∞ be the space of all infinite sequences {x₁, x₂, x₃, ...} of
real numbers, with addition and scalar multiplication defined termwise (see
Exercise 8, Section 1-2), and let A and B be the linear transformations on ℝ^∞
defined by

A{x₁, x₂, x₃, ...} = {x₂, x₃, x₄, ...},
B{x₁, x₂, x₃, ...} = {0, x₁, x₂, ...}.

Then 𝔑(A) is the subspace of ℝ^∞ consisting of all sequences of the form
{x₁, 0, 0, ...}, with x₁ arbitrary, while 𝔑(B) = 0. On the other hand, ℑ(A) = ℝ^∞,
while, by definition, ℑ(B) consists of all sequences whose first entry is zero.
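The two shift operators of this example can be imitated on finite lists; this is only a sketch, since genuine elements of ℝ^∞ are infinite, but the leading entries behave the same way.

```python
# Finite-list sketch of Example 5: A drops the first entry of a sequence,
# B prepends a zero.

def A(s): return s[1:]
def B(s): return [0] + s

s = [7, 1, 2, 3]
print(A(B(s)))   # [7, 1, 2, 3] -- AB acts as the identity
print(B(A(s)))   # [0, 1, 2, 3] -- BA wipes out the first entry
```

The same pair reappears in Section 2-4 as the standard example of a right inverse that is not a left inverse.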

Now that we have introduced the null space and image of a linear transforma-
tion, we propose to take a closer look at those transformations A: V₁ → V₂ for
which either

(i) 𝔑(A) = 0,   or   (ii) ℑ(A) = V₂,

or both. The second of these equations asserts that A maps V₁ onto V₂, and implies
that for each y in V₂ there exists at least one x in V₁ such that y = A(x). The
first, which says that the null space of A contains only the zero vector, turns out
to be equivalent to the assertion that A is one-to-one in the sense of the following
definition.

Definition 2-5. A linear transformation A: V₁ → V₂ is said to be one-
to-one if and only if A(x₁) = A(x₂) implies that x₁ = x₂.

In other words, A is one-to-one if and only if A maps distinct vectors in V₁ onto
distinct vectors in V₂; whence the name. (See Fig. 2-6.) This said, we now prove

Theorem 2-2. A linear transformation A: V₁ → V₂ is one-to-one if and
only if 𝔑(A) = 0.

Proof. Let A be one-to-one, and suppose that A(x) = 0. Then A(x) = A(0), and
Definition 2-5 implies that x = 0. Thus 𝔑(A) = 0. Conversely, if 𝔑(A) = 0
and A(x₁) = A(x₂), then A(x₁) − A(x₂) = 0, or A(x₁ − x₂) = 0. Thus
x₁ − x₂ = 0, and x₁ = x₂, as asserted. |

FIGURE 2-6 (a one-to-one map; a map that is not one-to-one)

Among the various transformations appearing in the examples above, only
B: ℝ^∞ → ℝ^∞ and I were one-to-one, since only they had trivial null spaces.
Additional examples of such transformations are provided by rotations of ℝ²
about the origin, reflections across any line through the origin, etc. The reader
should have no difficulty in augmenting this list indefinitely.
Linear transformations which are both one-to-one and onto are called iso-
morphisms, and are said to be invertible. They are of particular importance since,
just as with ordinary one-to-one onto functions, they have inverses, and all of
the standard facts concerning inverse functions can then be established. Indeed,
if A: V₁ → V₂ is one-to-one and onto, each vector y in V₂ is paired with a unique
vector x in V₁, and A can therefore be used to define a function from V₂ to V₁.
This function is called the inverse of A, and is denoted by A⁻¹ (read "A inverse").
It can be described explicitly as the function from V₂ to V₁ such that

A⁻¹(y) = x,   where A(x) = y,                                     (2-20)

for each y in V₂. Loosely speaking, A⁻¹ is obtained from A by reading the
definition of A from right to left, as suggested in Fig. 2-7, and is clearly a one-to-
one map of V₂ onto V₁. Moreover, it is linear, since if y₁ and y₂ belong to V₂
with y₁ = A(x₁), y₂ = A(x₂), then

A(α₁x₁ + α₂x₂) = α₁y₁ + α₂y₂,

and (2-20) implies that

A⁻¹(α₁y₁ + α₂y₂) = α₁x₁ + α₂x₂ = α₁A⁻¹(y₁) + α₂A⁻¹(y₂).

Having observed that A⁻¹ is one-to-one, onto, and linear, it follows that it too is
invertible, and if we simply parrot the construction given above, this time starting
with A⁻¹, we find that (A⁻¹)⁻¹ = A. Finally, if we form the products A⁻¹A and
AA⁻¹, each of which is certainly defined, an easy argument reveals that they both
reduce to the identity; that is,

A⁻¹A(x) = x   and   AA⁻¹(y) = y

for all x in V₁ and all y in V₂.* And with this we have proved the following theorem.

FIGURE 2-7

* These equations are the vector space analogs of such pairs of statements as

sin(sin⁻¹ x) = x,   sin⁻¹(sin x) = x,   −π/2 ≤ x ≤ π/2,
or
e^(ln x) = x,   ln(eˣ) = x,

which are familiar from calculus.



Theorem 2-3. Every one-to-one linear transformation A mapping V₁ onto
V₂ has a unique inverse A⁻¹ from V₂ to V₁ defined by

A⁻¹(y) = x,

where A(x) = y, for all y in V₂. A⁻¹ is also one-to-one, onto, and linear,
with (A⁻¹)⁻¹ = A, and

A⁻¹A = I_{V₁},   AA⁻¹ = I_{V₂},                                   (2-21)

where I_{V₁} and I_{V₂} denote, respectively, the identity maps on V₁ and V₂.

These last equations actually serve to characterize invertible linear transforma-
tions, a fact which when stated precisely reads as follows:

Theorem 2-4. Let A: V₁ → V₂ and B: V₂ → V₁ be linear, and suppose
that BA and AB are, respectively, the identity maps on V₁ and V₂. Then A
is one-to-one and onto, and B = A⁻¹.

Proof. Let x in V₁ be such that A(x) = 0. Then on the one hand,

B(A(x)) = B(0) = 0,
and on the other,
B(A(x)) = BA(x) = I(x) = x.

Thus x = 0, and 𝔑(A) = 0.
Now let y be an arbitrary vector in V₂. Then

y = I(y) = AB(y) = A(B(y)),

and it follows that y is the image under A of the vector B(y) in V₁. Thus ℑ(A) = V₂,
and we are done. |

Example 6. If A is any rotation of ℝ² about the origin through an angle θ,
then A is invertible with A⁻¹ the rotation through −θ, since A⁻¹A = AA⁻¹ = I.

Example 7. Let A: ℝ³ → ℝ³ be defined by

A(x₁, x₂, x₃) = (x₁ + x₂, x₂, x₃).

Then A is invertible, with A⁻¹ given by

A⁻¹(x₁, x₂, x₃) = (x₁ − x₂, x₂, x₃),

since

A⁻¹A(x₁, x₂, x₃) = (x₁, x₂, x₃) = AA⁻¹(x₁, x₂, x₃).
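Example 7's formulas are easy to check by machine; the sketch below contains nothing beyond the example itself, applied to one arbitrarily chosen vector.

```python
# Checking Example 7's inverse formula on R^3.

def A(x):
    x1, x2, x3 = x
    return (x1 + x2, x2, x3)

def A_inv(y):
    y1, y2, y3 = y
    return (y1 - y2, y2, y3)

v = (4, -1, 2)
print(A_inv(A(v)) == v)      # A^{-1}A = I
print(A(A_inv(v)) == v)      # AA^{-1} = I
```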

Theorem 2-4 suggests a natural and valuable generalization of the notion of
the inverse of a linear transformation A: V₁ → V₂; to wit, a linear transformation
B: V₂ → V₁ such that

AB = I,   BA ≠ I.

The fact that such transformations do exist can be seen by looking at Example 5
above, where

AB{x₁, x₂, x₃, ...} = {x₁, x₂, x₃, ...},
and
BA{x₁, x₂, x₃, ...} = {0, x₂, x₃, ...}.

Transformations of this sort are encountered fairly often in certain types of prob-
lems and are therefore distinguished by name according to the following definition.

Definition 2-6. A linear transformation B: V₂ → V₁ is said to be a right
inverse for A: V₁ → V₂ if the product AB is the identity map on V₂.
Similarly, B is said to be a left inverse for A if BA is the identity map on V₁.

Remark. If B is a right (left) inverse for A, then A is a left (right) inverse for B.

The example given a moment ago shows that a linear transformation may have
a right or left inverse without having an inverse. It is easy to show, however,
that if A has both a right inverse B and a left inverse C, then A is invertible, and
B = C = A⁻¹. For then

AB = I   and   CA = I,

and it follows that

C(AB) = CI = C,   (CA)B = IB = B,

and hence that B = C. Thus AB = BA = I, and the assertion that A is invertible
with B = C = A⁻¹ now follows from Theorem 2-4.

Example 8. Let 𝒞^∞[a, b] be the space of infinitely differentiable functions on
[a, b], and let D and L be differentiation and integration, respectively; that is,

D(y) = dy/dx,   L(y) = ∫ₐˣ y(t) dt.

Then

LD(y) = ∫ₐˣ y′(t) dt = y(x) − y(a),
while
DL(y) = (d/dx) ∫ₐˣ y(t) dt = y(x),

and it follows that DL = I, LD ≠ I. In other words, the operation of integration
on function spaces is only a right inverse, and not an inverse, for differentiation.
It is this fact more than any other which motivated us to introduce the notions
of right and left inverses in the first place.
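Example 8 can be rehearsed on polynomials with a = 0, where both operators have exact coefficient formulas; the coefficient-list representation is our own sketch, and exact rational arithmetic avoids rounding.

```python
# D is differentiation, L is integration from 0 to x, on coefficient lists.
# DL = I, while LD subtracts the value at 0.

from fractions import Fraction

def D(p):
    return [k * p[k] for k in range(1, len(p))] or [0]

def L(p):
    return [0] + [Fraction(p[k], k + 1) for k in range(len(p))]

y = [3, 0, 2]                 # y(x) = 3 + 2x^2, with y(0) = 3
print(D(L(y)) == y)           # DL = I
print(L(D(y)))                # [0, 0, 2], i.e. y - y(0) = 2x^2
```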



EXERCISES

1. Find the null space and, where applicable, the inverse of each of the following linear
transformations on ℝ².
(a) A(x₁, x₂) = 2(x₁, −x₂)
(b) A(x₁, x₂) = (x₂, 0)
(c) A(x₁, x₂) = (x₁ + x₂, x₁ + x₂)
(d) A(x₁, x₂) = (x₁ + x₂, x₁ − x₂)
2. Repeat Exercise 1 for the following transformations on ℝ³.
(a) A(x₁, x₂, x₃) = (x₁ + x₂, x₂ + x₃, x₁)
(b) A(x₁, x₂, x₃) = (2x₁, −x₃, x₁ + x₃)
(c) A(x₁, x₂, x₃) = (x₂ + 2x₃, x₁ − x₂, 0)
(d) A(x₁, x₂, x₃) = (x₁ − x₂, x₁ + x₂ + x₃, x₂ + x₃)

3. Repeat Exercise 1 for the following transformations on 𝒫.
(a) A(p) = dp/dx − 2p
(b) A(p) = xp(x)
(c) A(p) = p(x) − p(0)
(d) A(p) = p(x)q(x), q(x) a fixed polynomial in 𝒫

4. In giving the definition of A⁻¹ we insisted that A be one-to-one. Why?
5. Let A: ℝ² → ℝ² be defined by

A(x₁, x₂) = (α₁x₁ + α₂x₂, β₁x₁ + β₂x₂),

where α₁, α₂, β₁, β₂ are real numbers. Prove that A is linear, and find a necessary
and sufficient condition in terms of α₁, α₂, β₁, β₂ for A to be invertible.

6. Let A: V₁ → V₂ be linear, let W be a subspace of V₁, and let A(W) denote the
image of W in V₂; i.e., A(W) is the set of all vectors y in V₂ such that y = A(w)
for some w in W. Prove that A(W) is a subspace of V₂.

7. Let A: V₁ → V₂ be linear, and let W be a subspace of V₂. Prove that the set of all x
in V₁ such that A(x) belongs to W is a subspace of V₁.

8. Let A: V₁ → V₂ be a one-to-one linear transformation, and let e₁, ..., eₙ be
linearly independent vectors in V₁. Prove that A(e₁), ..., A(eₙ) are linearly inde-
pendent in V₂.
9. Let A: V₁ → V₂ be linear, and suppose that

dim V₁ = dim V₂ = n < ∞.

Prove that A is one-to-one if and only if it is onto. [Hint: See Exercise 8.]

10. Let A and B be invertible linear transformations mapping V onto itself. Prove that
AB and BA are also invertible, and that

(AB)⁻¹ = B⁻¹A⁻¹,   (BA)⁻¹ = A⁻¹B⁻¹.

11. Let A: V → V be linear, and suppose that A² + I = A. Prove that A is
invertible.

*12. (a) Let V₂ be finite dimensional, and let A: V₁ → V₂ be one-to-one. Prove that
A has a left inverse but not a right inverse whenever ℑ(A) ≠ V₂. [Hint: Choose
an appropriate basis in V₂.]
(b) Now let V₁ be finite dimensional, and suppose that A is onto. Prove that A has
a right inverse but no left inverse whenever 𝔑(A) ≠ 0.
13. A linear transformation P: V → V is said to be idempotent if and only if P² = P.
(a) Prove that P is idempotent if and only if I − P is.
(b) Prove that 𝔑(P) = ℑ(I − P), and that ℑ(P) = 𝔑(I − P) whenever P is idem-
potent. [Hint: The image of P consists of all x in V such that P(x) = x.]
(c) Use the results of (b) to show that every x in V can be written uniquely in
the form

x = x₁ + x₂

with x₁ in 𝔑(P), x₂ in ℑ(P) whenever P is idempotent.


14. Let V be finite dimensional, and let W be a subspace of V.
(a) Prove that there exists a linear transformation P: V → V such that ℑ(P) = W,
and P(x) = x for all x in W. A linear transformation of this type is said to be a
projection of V onto W. [Hint: Choose a basis for W, and extend it to a basis for V.]
(b) Prove that a linear transformation P on V is a projection onto a subspace W
if and only if P is idempotent and W = ℑ(P). (See Exercise 13.)
15. Two vector spaces V₁ and V₂ are said to be isomorphic if and only if there exists
an isomorphism A: V₁ → V₂.
(a) Prove that two finite dimensional spaces V₁ and V₂ are isomorphic if and only
if dim V₁ = dim V₂.
(b) Let ℝ⁺ denote the space of Exercise 10, Section 1-2. Prove that ℝ⁺ and ℝ¹
are isomorphic by exhibiting an isomorphism A: ℝ⁺ → ℝ¹.

2-5 LINEAR TRANSFORMATIONS AND BASES


Until now we have carefully refrained from using bases and coordinates in our
study of linear transformations in order to avoid subverting our results by tying
them to a particular choice of coordinate system or suggesting that they are
knows full well, coordinate
valid only for finite dimensional spaces. But as everyone
systems are invaluable in computations, and we therefore propose to devote the
remainder of this chapter to exploring some of the connections between linear
transformations and bases. In so doing we will be led to the notion of a matrix
and to the subject of matrix algebra, which can best be described as the arithmetic
of coordinatized linear transformations. Thus we now impose the restriction
that every one of the vector spaces encountered in the several sections which
follow is finite dimensional unless explicitly stated otherwise.
With this restriction in force, let A be a linear transformation from V t to V2 ,

and let e x,..., e„ be a basis for V v Then if

x = xiei + • • •
+ xn e n

is any vector in V₁,

A(x) = x₁A(e₁) + ⋯ + xₙA(eₙ),                                     (2-22)

and it follows that the value of A(x) is completely determined by the vectors
A(e₁), ..., A(eₙ) in V₂; i.e., A is uniquely determined by its values on a basis
for V₁. Moreover, if y₁, ..., yₙ are arbitrary vectors in V₂, the mapping
A: V₁ → V₂ defined by setting

A(e₁) = y₁, ..., A(eₙ) = yₙ,

and then using (2-22) to compute the value of A(x) for every x in V₁ is clearly
linear. Thus Eq. (2-22) also tells us how to construct all linear transformations
from V₁ to V₂, and we have proved

Theorem 2-5. Every linear transformation from V₁ to V₂ is uniquely
determined by its values on a basis for V₁. These values can be chosen
arbitrarily in V₂, different choices yielding different transformations, and
every linear transformation from V₁ to V₂ can be obtained in this way.

Example 1. Let e₁ and e₂ be the standard basis vectors in ℝ², and let A be a
linear transformation from ℝ² to ℝ¹. Then A is completely determined by the
pair of real numbers A(e₁), A(e₂), and can therefore be represented by the ordered
pair (A(e₁), A(e₂)). Since distinct ordered pairs define distinct linear trans-
formations, it follows that there are exactly as many linear transformations from
ℝ² to ℝ¹ as there are vectors in ℝ².

Example 2. Let A: ℝ² → ℝ² be the linear transformation defined by

A(e₁) = (α₁, α₂),   A(e₂) = (β₁, β₂),

where e₁ and e₂ are the standard basis vectors, and let

x = x₁e₁ + x₂e₂

be any vector in ℝ². Then

A(x) = x₁A(e₁) + x₂A(e₂)
     = x₁(α₁, α₂) + x₂(β₁, β₂)
     = (α₁x₁ + β₁x₂, α₂x₁ + β₂x₂).

In this case A can be represented by the array of scalars

⎡ α₁  β₁ ⎤
⎣ α₂  β₂ ⎦

and every such array can be viewed as the definition of a linear transformation
A of ℝ² into itself where

A(x₁, x₂) = (α₁x₁ + β₁x₂, α₂x₁ + β₂x₂).

Example 3. Let A: ℝ² → ℝ¹ be the linear transformation described by the
ordered pair of real numbers (2, −1) with respect to the standard basis in ℝ²;
that is,

A(e₁) = 2,   A(e₂) = −1,

and let e₁′ = e₁ + e₂, e₂′ = −e₁. Then e₁′, e₂′ is also a basis for ℝ², and we have

A(e₁′) = A(e₁) + A(e₂) = 1,
A(e₂′) = −A(e₁) = −2.

Thus the ordered pair which describes A with respect to the basis e₁′, e₂′ is (1, −2),
and we see that the description of a linear transformation by means of its values
on a basis changes with a change of basis.
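The computation of Example 3 can be repeated numerically; the sketch below uses exactly the data of the example.

```python
# Example 3 re-done numerically: A: R^2 -> R^1 with A(e1) = 2, A(e2) = -1,
# evaluated on the new basis e1' = e1 + e2, e2' = -e1.

def A(x):
    x1, x2 = x
    return 2 * x1 - 1 * x2       # A is determined by its values on e1, e2

e1p = (1, 1)                     # e1' = e1 + e2
e2p = (-1, 0)                    # e2' = -e1
print(A(e1p), A(e2p))            # 1 -2: the pair describing A in the new basis
```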

EXERCISES

1. Let A: ℝ² → ℝ¹ be represented by the ordered pair of real numbers (α₁, α₂) with
respect to the standard basis in ℝ², and let B: ℝ² → ℝ¹ be represented by (β₁, β₂).
Show that αA is represented by (αα₁, αα₂), and that A + B is represented by
(α₁ + β₁, α₂ + β₂).
2. Let A: ℝ² → ℝ¹ be represented by the ordered pair of real numbers (α₁, α₂) with
respect to the standard basis in ℝ². Find a necessary and sufficient condition that
there exist a basis e₁′, e₂′ in ℝ² with respect to which A is represented by (1, 1).
Assume that this condition is satisfied, and find e₁′, e₂′.

3. Let A: ℝ² → ℝ² be represented by the array

⎡ α₁  β₁ ⎤
⎣ α₂  β₂ ⎦

and let B: ℝ² → ℝ² be represented by

⎡ α₁′  β₁′ ⎤
⎣ α₂′  β₂′ ⎦

Find the representation of αA, A + B, and AB.
4. Let ℒ(ℝ², ℝ¹) denote the vector space of all linear transformations from ℝ² to ℝ¹,
and let

T: ℒ(ℝ², ℝ¹) → ℝ²

be defined by

T(A) = (α₁, α₂),

where (α₁, α₂) is the ordered pair of real numbers which describes A with respect to
the standard basis in ℝ².

(a) Prove that T is a one-to-one linear transformation mapping ℒ(ℝ², ℝ¹) onto ℝ²,
and hence deduce that dim ℒ(ℝ², ℝ¹) = 2. [Hint: See Exercise 1.]
(b) Use the result in (a) to find a basis A₁, A₂ in ℒ(ℝ², ℝ¹) which corresponds to
the standard basis in ℝ². What is the effect of A₁ on the vector (1, 1) in ℝ²? Of A₂?

*5. Generalize the technique used in the preceding exercise and show that there exists a
one-to-one linear transformation mapping ℒ(ℝ², ℝ²) onto ℝ⁴. What is the dimension
of ℒ(ℝ², ℝ²)?

2-6 MATRICES
We have seen that every linear transformation A: V₁ → V₂ can be obtained from
the formula

A(x) = x₁A(e₁) + ⋯ + xₙA(eₙ)                                      (2-23)

by suitably choosing the A(eⱼ) in V₂, and that (2-23) defines a linear transforma-
tion from V₁ to V₂ for every choice of these vectors. (Recall that e₁, ..., eₙ is a
basis for V₁, and that x₁, ..., xₙ are the coordinates of x with respect to this
basis.) We now use this observation to define the notion of a matrix for a linear
transformation, as follows.

Let f₁, ..., fₘ be a basis for V₂, and let A: V₁ → V₂ be given. Then, for each
integer j, 1 ≤ j ≤ n, there exist scalars αᵢⱼ such that

A(eⱼ) = Σᵢ₌₁ᵐ αᵢⱼ fᵢ,                                             (2-24)
i.e., such that

A(e₁) = α₁₁f₁ + α₂₁f₂ + ⋯ + αₘ₁fₘ,
A(e₂) = α₁₂f₁ + α₂₂f₂ + ⋯ + αₘ₂fₘ,
  ⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯                                               (2-25)
A(eₙ) = α₁ₙf₁ + α₂ₙf₂ + ⋯ + αₘₙfₘ.

For computational purposes it turns out to be convenient to display these scalars
in the rectangular array

⎡ α₁₁  α₁₂  ⋯  α₁ₙ ⎤
⎢ α₂₁  α₂₂  ⋯  α₂ₙ ⎥
⎢  ⋮    ⋮        ⋮  ⎥                                             (2-26)
⎣ αₘ₁  αₘ₂  ⋯  αₘₙ ⎦

whose columns are the coefficients of the various equations in (2-25). The reader
should note that the first subscript on an entry in (2-26) indicates the row in which
that entry appears, and the second indicates the column. With this convention in
force the entire array can be abbreviated (αᵢⱼ), it being understood that i and j
range independently over the integers 1, ..., m and 1, ..., n, respectively. When
displayed as above, this set of scalars is called the matrix of A with respect to the
bases ℬ₁ = {e₁, ..., eₙ} and ℬ₂ = {f₁, ..., fₘ}, and is denoted by [A: ℬ₁, ℬ₂],

or simply by [A]. (In the special case where A maps V into itself and ℬ₁ = ℬ₂ = ℬ,
the notation [A: ℬ] is also used.) When m = n, we say that (2-26) is a square
matrix; otherwise, rectangular. In general, a matrix consisting of m rows and n
columns will be referred to as an m × n-matrix (read "m by n").
The argument just given shows that every linear transformation from V₁ to V₂
determines a unique m × n-matrix with respect to ℬ₁ and ℬ₂. But since the αᵢⱼ
in (2-25) uniquely determine the A(eⱼ) and hence, by (2-23), A(x) for all x in V₁,
it follows that every m × n-matrix determines a unique linear transformation from
V₁ to V₂ in terms of ℬ₁ and ℬ₂, and we therefore have

Theorem 2-6. Let V₁ and V₂ be finite dimensional vector spaces with
dim V₁ = n, dim V₂ = m, and let ℬ₁ and ℬ₂ be bases for V₁ and V₂,
respectively. Then every linear transformation from V₁ to V₂ determines
a unique m × n-matrix with respect to ℬ₁ and ℬ₂, and, conversely, every
such matrix determines a unique linear transformation from V₁ to V₂
defined by (2-23) and (2-24).

It is important to realize that this theorem does not assert that every linear
transformation has a unique matrix. Indeed, any such assertion would be patently
false, for, as we have already seen, the matrix of a linear transformation can

change with a change of basis (Example 3, Section 2-5). Thus the several references
to bases which appear in Theorem 2-6 cannot under any circumstances be deleted.

Example 1. Let e₁ and e₂ be the standard basis vectors in ℝ², and let
A: ℝ² → ℝ² denote the reflection across the e₁-axis. Then

A(e₁) = 1·e₁ + 0·e₂,   A(e₂) = 0·e₁ − 1·e₂,

and the matrix of A with respect to the basis e₁, e₂ is

⎡ 1   0 ⎤
⎣ 0  −1 ⎦

Example 2. Let A: ℝ² → ℝ² be the (counterclockwise) rotation about the
origin through an angle θ, and again let e₁ and e₂ be the standard basis vectors.
Then

A(e₁) = (cos θ)e₁ + (sin θ)e₂,
A(e₂) = −(sin θ)e₁ + (cos θ)e₂

(see Fig. 2-8), and the matrix of A with respect to this basis is

⎡ cos θ  −sin θ ⎤
⎣ sin θ   cos θ ⎦
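The rotation matrix can be sanity-checked numerically for θ = π/2, where A should send e₁ to e₂ and e₂ to −e₁; this is a sketch, and floating-point arithmetic makes the entries only approximately 0 and 1.

```python
# Checking the rotation matrix of Example 2 at theta = 90 degrees.

import math

theta = math.pi / 2
R = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]

def apply(M, v):
    return (M[0][0]*v[0] + M[0][1]*v[1], M[1][0]*v[0] + M[1][1]*v[1])

Re1 = apply(R, (1, 0))       # approximately (0, 1), i.e. e2
Re2 = apply(R, (0, 1))       # approximately (-1, 0), i.e. -e1
print(Re1, Re2)
```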

Example 3. If I is the identity map on V, then, regardless of the basis used,
the matrix of I is the n × n-matrix

⎡ 1  0  ⋯  0 ⎤
⎢ 0  1  ⋯  0 ⎥
⎢ ⋮  ⋮      ⋮ ⎥
⎣ 0  0  ⋯  1 ⎦

with ones along its principal diagonal and zeros elsewhere. For obvious reasons
this matrix is called the n × n-identity matrix.
Similarly, the matrix of the zero transformation from V₁ to V₂ is always the
m × n-zero matrix

⎡ 0  ⋯  0 ⎤
⎢ ⋮      ⋮ ⎥
⎣ 0  ⋯  0 ⎦
Example 4. Let

D: Pn → Pn

be differentiation, and let

B = {1, x, ..., x^(n-1)}

be the "standard" basis. Then

D(1)       = 0·1 + 0·x + ... + 0·x^(n-2) + 0·x^(n-1),
D(x)       = 1·1 + 0·x + ... + 0·x^(n-2) + 0·x^(n-1),
D(x²)      = 0·1 + 2·x + ... + 0·x^(n-2) + 0·x^(n-1),
. . .
D(x^(n-1)) = 0·1 + 0·x + ... + (n - 1)·x^(n-2) + 0·x^(n-1),

and

[D: B] = [0 1 0 ...  0   ]
         [0 0 2 ...  0   ]
         [.  .  .    .   ]
         [0 0 0 ... n - 1]
         [0 0 0 ...  0   ]
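The same column-by-column recipe can be carried out concretely if polynomials in Pn are represented by their coefficient lists. The sketch below is our own illustration (the coefficient-list convention is an assumption, not the text's): it builds [D: B] and applies it to a coordinate vector.

```python
def diff_matrix(n):
    """[D: B] for differentiation on P_n with respect to {1, x, ..., x^(n-1)}:
    column j holds the coefficients of D(x^j) = j * x^(j-1)."""
    M = [[0] * n for _ in range(n)]
    for j in range(1, n):
        M[j - 1][j] = j
    return M

def apply_matrix(M, coords):
    """Multiply a matrix by a coordinate (column) vector."""
    return [sum(row[j] * coords[j] for j in range(len(coords))) for row in M]

# p(x) = 3 + 5x + 2x^2 has coordinate vector [3, 5, 2] in P_3;
# its image should be p'(x) = 5 + 4x, i.e. [5, 4, 0].
deriv_coords = apply_matrix(diff_matrix(3), [3, 5, 2])
```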

Example 5. Let

A: P3 → P5

be the linear transformation defined by

A(p(x)) = (2x² - 3)p(x)

for all p(x) in P3, and let B1 and B2 be the standard bases in P3 and P5, respectively. Then

A(1)  = 2x² - 3   = -3·1 + 0·x + 2·x² + 0·x³ + 0·x⁴,
A(x)  = 2x³ - 3x  = 0·1 - 3·x + 0·x² + 2·x³ + 0·x⁴,
A(x²) = 2x⁴ - 3x² = 0·1 + 0·x - 3·x² + 0·x³ + 2·x⁴,

and

[A: B1, B2] = [-3  0  0]
              [ 0 -3  0]
              [ 2  0 -3]
              [ 0  2  0]
              [ 0  0  2]
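As a check on the computation above, the matrix of A can be rebuilt mechanically: multiply each basis polynomial by 2x² - 3 and use the resulting coefficients as a column. A small Python sketch, our own illustration with hypothetical names:

```python
def mult_by_q(coeffs):
    """Coefficients of (2x^2 - 3) * p(x), where coeffs[k] is the
    coefficient of x^k in p(x)."""
    out = [0] * (len(coeffs) + 2)
    for k, c in enumerate(coeffs):
        out[k] += -3 * c       # the -3 * x^k term
        out[k + 2] += 2 * c    # the 2 * x^(k+2) term
    return out

# Columns of [A: B1, B2] are the images of the basis 1, x, x^2 of P3.
cols = [mult_by_q(e) for e in ([1, 0, 0], [0, 1, 0], [0, 0, 1])]
A_matrix = [[cols[j][i] for j in range(3)] for i in range(5)]
```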

EXERCISES

In Exercises 1 through 6 find the matrix of the given linear transformation with respect to each of the given pairs of bases.

1. A: R³ → R³; A(x1, x2, x3) = (x1 + x2, x1 + x3, 0)

(a) B1 and B2 the standard basis
(b) B1 the standard basis, B2 = {(1, 0, 1), (0, 1, 0), (0, 0, 1)}
(c) B1 = {(1, 0, 1), (0, 1, 0), (0, 1, 1)}, B2 = {(1, 0, 0), (1, 1, 0), (1, 1, 1)}

2. A: R³ → R²; A(x1, x2, x3) = (x1 - x2, 2x2 - 3x3)

(a) B1 and B2 the standard bases
(b) B1 = {(1, 0, 1), (0, 1, 0), (0, 1, 1)}, B2 the standard basis
(c) B1 = {(2, -1, -1), (0, 1, 1), (3, 2, 1)}, B2 = {(3, 1), (1, 2)}

3. A: P3 → R¹; A(p(x)) = ∫_0^1 p(x) dx

(a) B1 = {1, x, x²}, B2 = {1}
(b) B1 = {1, x - 1, x(x - 1)}, B2 = {1}
(c) B1 = {1, x - 1, x(x - 1)}, B2 = {2}

4. A: P3 → R¹; A(p(x)) = ∫_-1^1 p(x) dx

(a) B1 = {1, x, x²}, B2 = {1}
(b) B1 = {1, x, x² - 1/3}, B2 = {1}
(c) B1 = {1, x + 1, x² - 1}, B2 = {-2}
5. A: P3 → P4; A(p(x)) = (x - 1)²p′(x)

(a) B1 = {1, x, x²}, B2 = {1, x, x², x³}
(b) B1 = {1, x, x²}, B2 = {1, x - 1, (x - 1)², (x - 1)³}
(c) B1 = {1, x - 1, (x - 1)²}, B2 = {1, x - 2, (x - 2)², (x - 1)(x - 2)²}

6. A: P3 → P4; A(p(x)) = ∫_0^x p(t) dt

(a) B1 = {1, x, x²}, B2 = {1, x, x², x³}
(b) B1 = {1, x, x²}, B2 = {1, x, x²/2, x³/3}
(c) B1 = {2x - 1, 2x + 1, x(x + 1)}, B2 = {x - 1, x + 1, x², x³/3}

7. Find the value of A(x) for any x in V1, given that A: V1 → V2 is linear, and [A: B1, B2] = (α_ij).

8. A subspace W of a vector space V is said to be invariant under a linear transformation A: V → V if and only if A(w) belongs to W for every w in W.

(a) Every linear transformation A: V → V has at least two invariant subspaces. What are they?

(b) Give an example of a linear transformation on a finite dimensional vector space V which has exactly two invariant subspaces. (Do not assume dim V = 1.)

(c) Let dim V = n, dim W = m, and suppose that W is invariant under A: V → V. Prove that there exists a basis B for V such that

[A: B] = [α11 ... α1m   α1,m+1   ... α1n   ]
         [ .                               ]
         [αm1 ... αmm   αm,m+1   ... αmn   ]
         [0   ... 0     αm+1,m+1 ... αm+1,n]
         [ .                               ]
         [0   ... 0     αn,m+1   ... αnn   ]

[Hint: See Theorem 1-7.]

(d) Let W1 and W2 be invariant subspaces of V under A, and suppose that

(i) dim W1 = m, dim W2 = n - m, where n = dim V,
(ii) W1 and W2 have only the zero vector in common.

Prove that there exists a basis B for V such that

[A: B] = [α11 ... α1m   0        ... 0     ]
         [ .                               ]
         [αm1 ... αmm   0        ... 0     ]
         [0   ... 0     αm+1,m+1 ... αm+1,n]
         [ .                               ]
         [0   ... 0     αn,m+1   ... αnn   ]

(d) Let Wiand *W 2 be invariant subspaces of 13 under A, and suppose that


(i) dimWi = m, dimW 2 = n - m, where n = dim 13,
(ii) "W i and W
2 have only the zero vector in common.
Prove that there exists a basis (B for 13 such that

an • •
ai m

am i • Oimm
[A:(S>] =

•• am + 1.TO+1 '
• a m +i, n


•• a„ ,m+l CX-nn

2-7 ADDITION AND SCALAR MULTIPLICATION OF MATRICES


Let L(V1, V2) denote the set of all linear transformations from V1 to V2, and let M_mn denote the set of all m × n-matrices. Then if dim V1 = n and dim V2 = m, Theorem 2-6 asserts that the function which associates each A in L(V1, V2) with its matrix [A] with respect to a fixed pair of bases B1 and B2 is a one-to-one mapping of L(V1, V2) onto M_mn. This simple fact allows us to translate algebraic statements concerning linear transformations into statements concerning matrices, and leads to the subject of matrix algebra. In particular, it allows us to convert M_mn into a real vector space by using the matrix analogs of the addition and scalar multiplication in L(V1, V2) to define an addition and scalar multiplication for matrices. The argument goes as follows.

Let (α_ij) and (β_ij) be arbitrary m × n-matrices, and let σ be a real number. Then by choosing bases in V1 and V2 we find unique linear transformations A and B in L(V1, V2) such that

[A] = (α_ij),    [B] = (β_ij).

Thus, using the notation of the preceding section, we find

A(e_j) = Σ_{i=1}^m α_ij f_i,    B(e_j) = Σ_{i=1}^m β_ij f_i,

and it follows that

(A + B)(e_j) = A(e_j) + B(e_j)
             = Σ_{i=1}^m α_ij f_i + Σ_{i=1}^m β_ij f_i
             = Σ_{i=1}^m (α_ij + β_ij) f_i

and

(σA)(e_j) = σA(e_j) = Σ_{i=1}^m (σα_ij) f_i.

Hence

[A + B] = (α_ij + β_ij),    [σA] = (σα_ij),

and if we now require, as reason dictates we must, that

[A] + [B] = [A + B], and σ[A] = [σA],

we are forced to give the following definition.



Definition 2-7. The sum (α_ij) + (β_ij) of two m × n-matrices is by definition the m × n-matrix (α_ij + β_ij); the product σ(α_ij) of a real number σ and an m × n-matrix is by definition the m × n-matrix (σα_ij). In other words,

(α_ij) + (β_ij) = (α_ij + β_ij)    (2-27)

and

σ(α_ij) = (σα_ij)    (2-28)

for all m × n-matrices (α_ij), (β_ij), and all real numbers σ.

When expressed as rectangular arrays these equations become

[α11 α12 ... α1n]   [β11 β12 ... β1n]   [α11+β11 α12+β12 ... α1n+β1n]
[α21 α22 ... α2n] + [β21 β22 ... β2n] = [α21+β21 α22+β22 ... α2n+β2n]
[ .               ]   [ .             ]   [ .                         ]
[αm1 αm2 ... αmn]   [βm1 βm2 ... βmn]   [αm1+βm1 αm2+βm2 ... αmn+βmn]

and

  [α11 α12 ... α1n]   [σα11 σα12 ... σα1n]
σ [α21 α22 ... α2n] = [σα21 σα22 ... σα2n]
  [ .              ]   [ .                ]
  [αm1 αm2 ... αmn]   [σαm1 σαm2 ... σαmn]

and assert that matrix addition and scalar multiplication are performed entry by entry, or termwise. Moreover, we now have

Theorem 2-7. The set M_mn of all m × n-matrices is a real vector space under the above definitions of addition and scalar multiplication.
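Stated in code, Definition 2-7 amounts to two one-line loops over the entries. The sketch below is our own illustration, not the text's, with matrices stored as lists of rows:

```python
def mat_add(A, B):
    """Entrywise sum of two m x n matrices, stored as lists of rows."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scalar_mul(s, A):
    """Entrywise scalar multiple s[A]."""
    return [[s * a for a in row] for row in A]

example_sum = mat_add([[1, 2], [3, 4]], [[5, 6], [7, 8]])
example_scaled = scalar_mul(2, [[1, -2], [0, 3]])
```

Because the operations act entry by entry, the vector-space axioms for M_mn reduce to the corresponding axioms for the real numbers.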

The student should appreciate that there is no need for a formal proof at this point, since the asserted result follows automatically from Theorem 2-6, the fact that L(V1, V2) is a real vector space, and the way in which addition and scalar multiplication were defined in M_mn. Indeed, we can now assert that L(V1, V2) and M_mn are algebraically identical (or isomorphic), and that the function which sends each linear transformation A: V1 → V2 onto its matrix [A: B1, B2] with respect to a fixed pair of bases B1 and B2 is an isomorphism of L(V1, V2) onto M_mn.

As an illustration of the way in which this fact can be used to establish results which are not otherwise obvious, we now propose to show that L(V1, V2) is finite dimensional and to compute its dimension. For this purpose we introduce the special matrices (e_ij), 1 ≤ i ≤ m, 1 ≤ j ≤ n, each of which has the entry 1 at the intersection of the ith row and jth column and zeros elsewhere:

(e_ij) = [0 ... 0 ... 0]
         [.     .     .]
         [0 ... 1 ... 0]
         [.     .     .]
         [0 ... 0 ... 0]    (2-29)

Then, for each (α_ij) in M_mn we have

(α_ij) = α11(e11) + ... + α1n(e1n)
       + α21(e21) + ... + α2n(e2n)
       + . . .
       + αm1(em1) + ... + αmn(emn),

or

(α_ij) = Σ_{i,j=1}^{m,n} α_ij (e_ij).    (2-30)

Thus the (e_ij) span M_mn, and since it is clear that (2-30) is the only possible way of writing (α_ij) as a linear combination of the (e_ij), it follows that these matrices are a basis for M_mn. (This particular basis is called the standard basis for M_mn.) Hence L(V1, V2) is also finite dimensional with dimension mn, and we have proved

Theorem 2-8. If V1 and V2 are finite dimensional vector spaces, then so is L(V1, V2), and

dim L(V1, V2) = (dim V1)(dim V2).    (2-31)
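Equation (2-30) is easy to demonstrate concretely: the coefficients of a matrix in the standard basis are just its entries, read off row by row. The Python sketch below is our own illustration; the names are hypothetical.

```python
def standard_basis(m, n):
    """The mn matrices (e_ij), each with a single 1 in row i, column j."""
    basis = []
    for i in range(m):
        for j in range(n):
            E = [[0] * n for _ in range(m)]
            E[i][j] = 1
            basis.append(E)
    return basis

# Eq. (2-30): a matrix is the sum of its entries times the (e_ij).
A = [[1, -2, 0], [4, 3, 7]]
coeffs = [a for row in A for a in row]          # the alpha_ij, row by row
recon = [[0] * 3 for _ in range(2)]
for c, E in zip(coeffs, standard_basis(2, 3)):
    for i in range(2):
        for j in range(3):
            recon[i][j] += c * E[i][j]
```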

Example 1. The set M_1n of all 1 × n-matrices is an n-dimensional vector space with

(e11) = (1, 0, ..., 0),
(e12) = (0, 1, ..., 0),
. . .
(e1n) = (0, 0, ..., 1)

as a basis. In this case the (e_ij) can be identified with the standard basis vectors in R^n, and when this identification is made M_1n becomes identical with R^n. Thus L(V, R¹) is isomorphic with R^n whenever dim V = n, and we conclude that there are exactly as many linear transformations from V to R¹ as there are vectors in R^n (cf. Example 1, Section 2-5).

Example 2. Let

V1 = V2 = R².

Then dim L(R², R²) = 4, and the (e_ij) are four in number:

(e11) = [1 0]    (e12) = [0 1]
        [0 0]            [0 0]

(e21) = [0 0]    (e22) = [0 0]
        [1 0]            [0 1]

Moreover, if E_ij denotes the linear transformation corresponding to (e_ij) with respect to a fixed basis in R², then every linear transformation mapping R² into itself can be written uniquely in the form

α11 E11 + α12 E12 + α21 E21 + α22 E22

for suitable scalars α11, α12, α21, α22.

Example 3. If A is a nonzero linear transformation mapping a finite dimensional vector space V into itself, then

I, A, A², A³, ...

also map V into itself, and thus belong to L(V, V). But by Theorem 2-8 this set is linearly dependent in L(V, V). Hence there exists a smallest positive integer k such that A^k is linearly dependent on I, A, ..., A^(k-1), and it is now easy to show that these transformations are a basis for the subspace of L(V, V) spanned by the powers of A (see Exercise 14). In particular, we can write A^k in the form

A^k = a_(k-1) A^(k-1) + ... + a_1 A + a_0 I,

or

A^k - a_(k-1) A^(k-1) - ... - a_1 A - a_0 I = O,    (2-32)

where O is the zero transformation on V, and it follows that A is a root of the polynomial

m_A(x) = x^k - a_(k-1) x^(k-1) - ... - a_1 x - a_0.    (2-33)

Since k was chosen as small as possible in this argument there is no polynomial of lower degree having A as a root. For this reason m_A(x) is called the minimum polynomial of A. It can be characterized as the polynomial of least degree with leading coefficient 1 which has A as a root, and is clearly of degree ≤ n² when dim V = n. Actually, it can be shown that the degree of m_A(x) does not exceed the dimension of V for any nonzero transformation A: V → V. The proof, however, is not easy.
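For a concrete instance, the reflection of Example 1, Section 2-6, satisfies A² = I; and since A is not a scalar multiple of I, no monic polynomial of degree one has A as a root, so that m_A(x) = x² - 1 in that case. The check below is our own sketch, not part of the text:

```python
def mat_mul(A, B):
    """Product of compatible square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# The reflection across the e1-axis: A^2 = I, so A is a root of x^2 - 1.
A = [[1, 0], [0, -1]]
A_squared = mat_mul(A, A)
```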

EXERCISES

In Exercises 1 through 5 compute the value of α[A] + β[B] for the given scalars α, β and matrices [A], [B].

1. α = 2, β = -1,

   [A] = [1 -2  4]    [B] = [-1  2  0]
         [3  1 -5]          [ 3 -5  1]
         [2 -2  3]          [ 4 -2  0]
2. α = -1/2, β = 4,

   [A] = [2 -3]    [B] = [-1  2]
         [1  0]          [ 3  1]
         [4 -2]          [ 4 -5]
3. a = §, = -2,
6 2 -3 -8 3
2
l"
[4] = ' [5] =
_1 4.
_A3 3J

4. a = hP = l
6 2
-3" -1
2 2

[A] = 4 1 » [5] = —2 1 —2 i

--1 1
3 3. -f 2

5. a = hP- ~h
5"
* -2 3 3 -4 1

[A] = § -1 -i <
3 . [B] = 6 — z2 A
2

.-3 1
L 4 i
2
2 3J L.3 3
6. Let A denote the counterclockwise rotation of R² about the origin through the angle π/4, and let B denote the reflection across the origin. Find the matrix of A + B with respect to the standard basis e1 = (1, 0), e2 = (0, 1), and with respect to the basis e1′ = (1, 1), e2′ = (-1, 1).

7. Let A, B: P3 → P4 be defined by

A(p(x)) = xp(x) - p(1),    B(p(x)) = (x - 1)p(x).

Find the matrix of 2A - B with respect to the standard bases in P3 and P4, and with respect to the bases

B1 = {1, x - 1, (x - 1)²},    B2 = {1, x - 1, (x - 1)², (x - 1)³}.

8. Compute the dimension of the subspace of M22 spanned by the matrices

   [ 2 1]    [1 9]    [-6 2]
   [-1 0]    [3 4]    [ 3 2]

9. Let e1, ..., en and f1, ..., fm be bases for V1 and V2, respectively, and for each pair of integers i, j with 1 ≤ i ≤ m, 1 ≤ j ≤ n, let E_ij: V1 → V2 be defined by

E_ij(e_k) = 0    if j ≠ k,
            f_i  if j = k.

Prove that the E_ij are a basis for L(V1, V2).

10. Let A: M22 → M22 be the mapping defined by

A [α11 α12] = [α11 α12]
  [α21 α22]   [ 0  α22]

Prove that A is linear, and find the matrix of A with respect to the standard basis in M22.

11. Repeat Exercise 10 for the mapping A: M22 → M23 defined by

A [α11 α12] = [α11 α12 0]
  [α21 α22]   [α21 α22 0]
12. What is the dimension of the vector space L(M_m1, M_1n)? Of L(M_mn, M_pq)?

13. (a) Prove that the functions sin x, cos x, sin x cos x, sin²x, cos²x are linearly independent in C(-∞, ∞).
(b) Let V denote the subspace of C(-∞, ∞) spanned by the functions in (a). Prove that D^n, the nth power of the differentiation operator, maps V into itself for all n, and find the matrix of D² - 2D + 1 with respect to the given basis for V.
14. Let A: V → V be linear, with A ≠ O, and let k be the smallest positive integer such that A^k is linearly dependent on I, A, ..., A^(k-1). Prove that I, A, ..., A^(k-1) is a basis for the subspace of L(V, V) spanned by the powers of A.

15. Find the minimum polynomial of the linear transformation D: P3 → P3.

2-8 MATRIX MULTIPLICATION


Continuing in the spirit of the last section we now define a multiplication for matrices by rewriting the definition of the product of two linear transformations in matrix form. To this end let

B: V1 → V2,    A: V2 → V3

be given, let

B1 = {e1, ..., er},    B2 = {f1, ..., fn},    B3 = {g1, ..., gm}

be bases for V1, V2, V3, respectively, and let

A(f_k) = Σ_{i=1}^m α_ik g_i,    B(e_j) = Σ_{k=1}^n β_kj f_k.

Then we have

AB(e_j) = A(Σ_{k=1}^n β_kj f_k) = Σ_{k=1}^n β_kj A(f_k)
        = Σ_{k=1}^n β_kj (Σ_{i=1}^m α_ik g_i)
        = Σ_{i=1}^m (Σ_{k=1}^n α_ik β_kj) g_i,

and the matrix of AB with respect to B1, B3 is the m × r-matrix whose ijth entry is Σ_{k=1}^n α_ik β_kj. But since

[A: B2, B3] = (α_ik),    1 ≤ i ≤ m, 1 ≤ k ≤ n,

and

[B: B1, B2] = (β_kj),    1 ≤ k ≤ n, 1 ≤ j ≤ r,

the requirement that

[AB: B1, B3] = [A: B2, B3][B: B1, B2]

leads to the following definition of matrix multiplication.

Definition 2-8. The product (α_ik)(β_kj) of an m × n-matrix (α_ik) and an n × r-matrix (β_kj) is by definition the m × r-matrix

(α_ik)(β_kj) = (Σ_{k=1}^n α_ik β_kj).    (2-34)

It is important to notice that the product of two matrices is defined only when the number of columns in the first matrix is equal to the number of rows in the second, a restriction which is the matrix analog of the fact that the product of two linear transformations is defined only when the image of the first is contained in the domain of the second. When written in greater detail, Eq. (2-34) becomes

[α11 α12 ... α1n] [β11 β12 ... β1r]
[α21 α22 ... α2n] [β21 β22 ... β2r]
[ .              ] [ .             ]
[αm1 αm2 ... αmn] [βn1 βn2 ... βnr]

   [α11β11 + ... + α1nβn1   ...   α11β1r + ... + α1nβnr]
 = [α21β11 + ... + α2nβn1   ...   α21β1r + ... + α2nβnr]
   [ .                                                 ]
   [αm1β11 + ... + αmnβn1   ...   αm1β1r + ... + αmnβnr]

and is easily remembered in terms of the kinesthetic relationship between the rows of the first matrix and the columns of the second.
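Definition 2-8 translates directly into a triple loop (or, in Python, nested comprehensions). The sketch below is our own illustration, not the text's; the shape check mirrors the restriction just stated.

```python
def mat_mul(A, B):
    """Eq. (2-34): the ij-th entry of the product is the sum over k of
    alpha_ik * beta_kj; defined only when the number of columns of A
    equals the number of rows of B."""
    m, n, r = len(A), len(B), len(B[0])
    if any(len(row) != n for row in A):
        raise ValueError("columns of A must match rows of B")
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(r)]
            for i in range(m)]
```

Applied to the 2 × 3- and 3 × 2-matrices of Example 1, which follows, this reproduces the 2 × 2 product computed there.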

Example 1. If

(α_ik) = [2 -1  0]    (β_kj) = [ 1  3]
         [1  2 -3]             [-2  1]
                               [ 0  4]

then (α_ik)(β_kj) is defined, and we have

(α_ik)(β_kj) = [2·1 + (-1)(-2) + 0·0    2·3 + (-1)·1 + 0·4]
               [1·1 + 2·(-2) + (-3)·0   1·3 + 2·1 + (-3)·4]

             = [ 4   5]
               [-3  -7]

On the other hand, the product (β_kj)(α_ik) is a 3 × 3-matrix, and hence certainly different from the 2 × 2-matrix (α_ik)(β_kj).

Example 2. Let

(α_ij) = [ 3 1]    (β_ij) = [-1 2]
         [-1 2]             [ 1 0]

Then

(α_ij)(β_ij) = [ 3 1] [-1 2] = [-2  6]
               [-1 2] [ 1 0]   [ 3 -2]

(β_ij)(α_ij) = [-1 2] [ 3 1] = [-5  3]
               [ 1 0] [-1 2]   [ 3  1]

and we see that the multiplication of square matrices is noncommutative.

Example 3. Let [A] be the matrix of a linear transformation A: V → V with respect to a basis B in V, and set [A]⁰ = [I], where [I] is the appropriate identity matrix. Then the various powers [A]^k, k a nonnegative integer, are all defined, and are simply the matrices of the powers of A with respect to B. (Why?) This in turn implies that the matrix of a polynomial

p(A) = a_k A^k + a_(k-1) A^(k-1) + ... + a_1 A + a_0 I

in A is

p([A]) = a_k [A]^k + a_(k-1) [A]^(k-1) + ... + a_1 [A] + a_0 [I].

Such expressions are called polynomials in [A], and are defined whenever [A] is a square matrix. In particular, if

m_A(x) = x^k - a_(k-1) x^(k-1) - ... - a_1 x - a_0

is the minimum polynomial of A defined in Example 3 of the preceding section, then

m_A([A]) = [A]^k - a_(k-1) [A]^(k-1) - ... - a_1 [A] - a_0 [I] = [O],

and it follows that [A] is a root of m_A(x). For obvious reasons this polynomial is also called the minimum polynomial of the matrix [A].
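Evaluating a polynomial in [A] requires only matrix products and scalar multiples of the identity; Horner's rule keeps the number of products down to the degree of the polynomial. The sketch below is our own illustration with hypothetical names:

```python
def poly_of_matrix(coeffs, A):
    """Evaluate p([A]) by Horner's rule, with coeffs listed from the
    highest power down to the constant term; [A] must be square."""
    n = len(A)
    result = [[0] * n for _ in range(n)]
    for i in range(n):
        result[i][i] = coeffs[0]
    for c in coeffs[1:]:
        # result = result * A + c * I
        result = [[sum(result[i][k] * A[k][j] for k in range(n))
                   + (c if i == j else 0)
                   for j in range(n)] for i in range(n)]
    return result

# The reflection [[1, 0], [0, -1]] has minimum polynomial x^2 - 1,
# so m_A([A]) must be the zero matrix.
residue = poly_of_matrix([1, 0, -1], [[1, 0], [0, -1]])
```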

Since Definition 2-8 is simply a coordinatized version of the multiplication of linear transformations, all of the results proved in Section 2-3 remain true when phrased in terms of matrices. This allows us to assert (without proof) that matrix multiplication is associative, and is distributive over addition; that is,

[(α_ik)(β_kj)](γ_jl) = (α_ik)[(β_kj)(γ_jl)]    (2-35)

and

(α_ik)[(β_kj) + (γ_kj)] = (α_ik)(β_kj) + (α_ik)(γ_kj),
[(α_ik) + (β_ik)](γ_kj) = (α_ik)(γ_kj) + (β_ik)(γ_kj),    (2-36)

whenever the sums and products appearing in these equations are defined. Similarly, the identity matrices introduced in Example 3 of Section 2-6 play the same role in matrix multiplication that the identity transformations play in operator multiplication. And finally, we can use these identity matrices to define right, left, and two-sided inverses for matrices by rewriting Eq. (2-21) and Definition 2-6 in matrix terms. The details have been left to the reader.

EXERCISES

1. Evaluate each of the following products.

(a) [1 -3  2] [ 6  1]
    [4  1 -1] [-2  3]
    [2 -5  3] [ 3 -4]

(b) 2 0-1 1 1 3 -2~

4 5 3-2 1

.-10 2 4 L5 -1 3_

(c) a
3
i n
u £2 _1* _l"
2
6

J, 1 _J, -1 2 1

nil
6

L"
2

6
6

3 2 l 2-i

4 4

(d) [2 i -1 3] (e) [2 -1 3]
2 2

-1 -1

2. Find all 2 × 2-matrices which commute with

[1 -1]
[0  2]

3. Find necessary and sufficient conditions for the matrices

[α11 α12]    and    [α11 α21]
[α21 α22]           [α12 α22]

to commute.
4. An n × n-matrix [A] is said to be invertible with inverse [A]⁻¹ if and only if

[A][A]⁻¹ = [A]⁻¹[A] = [I],

where [I] is the n × n-identity matrix.

(a) Show that if [A] has a right inverse [B] and a left inverse [C], then

[B] = [C] = [A]⁻¹.


(b) Show that the matrix

[1 1 0]
[0 1 1]

has no left inverse in M32, but has infinitely many right inverses in M32.

5. Find the inverse of each of the following matrices.

(a) [0 1]    (b) [1 -1]    (c) [ 1 -3]
    [1 1]        [1  1]        [-2  1]

6. Show that the matrix

[1 α]
[α β]

is invertible if and only if β ≠ α², and find its inverse when this condition is satisfied.

7. Find all values of α and β for which each of the following matrices is invertible, and compute their inverses.

(a) [α 2]    (b) [α β]    (c) [1  β]
    [2 β]        [β α]        [β -1]
8. Compute [A]^n when

(a) [A] = [1 1]    (b) [A] = [1 1]    (c) [A] = [1 -1]
          [1 1]              [0 1]              [1  1]

(d) [A] = [α 1]    (e) [A] = [α 1]
          [0 α]              [1 α]

9. Find all 2 × 2-matrices

[α β]
[γ δ]

such that

[α β]²   [1 0]
[γ δ]  = [0 1]

10. An n × n-matrix (α_ij) is said to be nilpotent if and only if there exists an integer k > 0 such that (α_ij)^k is the n × n-zero matrix. The smallest positive integer for which this is true is called the degree of nilpotence of (α_ij). Show that each of the following matrices is nilpotent, and find their degree of nilpotence.

(a) [1 -1 2]    (b) [ 1  1  3]    (c) [ 1 -3 -4]
    [3 -2 5]        [ 5  2  6]        [-1  3  4]
    [4  ·  ·]       [-2 -1 -3]        [ 1 -3 -4]

11. Find all 2 X 2 nilpotent matrices with degree of nilpotence two. (See Exercise 10.)

12. Determine which of the following matrices are roots of the polynomial p(x) = x³ - x² + x - 1.

(a) [0 0]    (b) [1 0]    (c) [1  1]    (d) [1  0]
    [1 0]        [1 1]        [1 -1]        [1 -1]
13. Let A: M22 → M22 be defined by

A [α11 α12] = [2  1] [α11 α12]
  [α21 α22]   [0 -1] [α21 α22]

Prove that A is linear, and find its matrix with respect to the standard basis in M22.

14. Repeat Exercise 13 for the mapping A: M22 → M23 defined by

A [α11 α12] = [α11 α12] [1 0 -1]
  [α21 α22]   [α21 α22] [2 1  1]



15. Solve each of the following matrix equations for [X], given that

[A] = [2 1]    [B] = [-1 1]    [C] = [ 1 1]    [O] = [0 0]
      [3 2]          [ 1 0]          [-1 1]          [0 0]

(a) 2[X] - [A][C] = [B].    (b) [C][X] - [A][B] = [O].    (c) [X]² = [A]([B] - [C]).

16. Prove the associativity of matrix multiplication directly from Definition 2-8.

17. Find 2 × 2-matrices [A], [B] such that

[A][B] = [O],    [B][A] ≠ [O],

where [O] is the 2 × 2-zero matrix.

18. Does [A][B] = [A][C] for n × n-matrices [A], [B], [C] necessarily imply that [B] = [C]? Why?
19. An n × n-matrix (α_ij) is said to be a diagonal matrix if and only if α_ij = 0 whenever i ≠ j.

(a) Show that the product of diagonal matrices is diagonal.

(b) Under what conditions is an n × n-diagonal matrix invertible? What is its inverse?

(c) Let T: R¹ → M_nn be defined by

T(α) = [α 0 ... 0]
       [0 α ... 0]
       [.  .    .]
       [0 0 ... α]

for all α in R¹. Prove that T is one-to-one, linear, and that T(α)T(β) = T(αβ) = T(β)T(α) for all α, β.

(d) Find the matrix of T with respect to the standard bases in R¹ and M_nn.

20. Prove that the only n × n-matrices which commute with every matrix in M_nn are the scalar matrices

[α 0 ... 0]
[0 α ... 0]
[.  .    .]
[0 0 ... α]

2-9 OPERATOR EQUATIONS


Much of the study of linear transformations is given over to devising methods for solving equations of the form

Ax = y,    (2-37)

where y is known, x unknown, and A a linear transformation from V1 to V2. Such equations are known under the generic name of operator equations, and will appear throughout this book in a variety of forms. In general, of course, the technique for solving a particular operator equation depends upon the operator involved, and also upon the underlying vector spaces. Nevertheless there are a number of facts concerning such equations which can be proved without using anything other than the linearity of A, and we propose to get them on record here before going on to more specialized topics.
A vector x0 in V1 is said to be a solution of (2-37) if A(x0) = y. The totality of such vectors is called the solution set of the equation. In the special case of a homogeneous equation

Ax = 0,    (2-38)

whose right-hand side is zero, we know that this set is a subspace of V1. It is called the solution space of the equation. One of the most important properties of operator equations is that the problem of solving a nonhomogeneous equation Ax = y, y ≠ 0, can all but be reduced to that of solving its associated homogeneous equation Ax = 0. In fact, if x_p is a fixed solution of Ax = y, and if x_h is any solution whatever of Ax = 0, then x_p + x_h is also a solution of Ax = y, since

A(x_p + x_h) = A(x_p) + A(x_h) = y + 0 = y.

Moreover, every solution x of Ax = y can be written in this form for a suitable x_h, since from

A(x - x_p) = A(x) - A(x_p) = y - y = 0

it follows that x - x_p = x_h is a solution of Ax = 0, and hence that x = x_p + x_h, as asserted.

The solution x_p appearing in this argument is frequently called a particular solution of Ax = y, and in these terms we can state the above result as follows:

Theorem 2-9. If x_p is a particular solution of Ax = y, then the solution set of this equation consists of all vectors of the form x_p + x_h, where x_h is an arbitrary solution of the associated homogeneous equation Ax = 0.

Geometrically this theorem asserts that the solution set of a nonhomogeneous operator equation can be obtained from the solution space of its associated homogeneous equation by translating that subspace by a particular solution x_p, as shown in Fig. 2-9. Algebraically it gives us a prescription for solving Ax = y; viz., find all solutions of Ax = 0, one solution of Ax = y, and add. The reader will do well to keep both of these points of view in mind as we continue.

FIGURE 2-9

Example 1. Let A: C²(-∞, ∞) → C(-∞, ∞) be the linear transformation D² - I introduced in Example 4, Section 2-4. Then the operator equation

Ay = 1,

y in C²(-∞, ∞), assumes the form

d²y/dx² - y = 1,    (2-39)

and its solution set consists of all functions in C²(-∞, ∞) which satisfy this equation on the entire real line. In this case it is obvious that y = -1 is one such function. Thus, to complete the solution of (2-39) it suffices to find all solutions of the homogeneous equation

d²y/dx² - y = 0.

In Chapter 3 we will prove that the solution space of this equation has the functions e^x and e^(-x) as a basis, and hence, as a corollary, that the solution set of (2-39) is the totality of functions in C²(-∞, ∞) of the form

y = -1 + c1 e^x + c2 e^(-x),

where c1 and c2 are arbitrary constants.
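Pending the proof in Chapter 3, the reader can at least verify that every function of this form satisfies (2-39): differentiating twice reproduces c1 e^x + c2 e^(-x), and subtracting y leaves exactly 1. The numerical spot check below is our own illustration, not part of the text.

```python
import math
import random

def y(x, c1, c2):
    """The claimed general solution y = -1 + c1*e^x + c2*e^(-x)."""
    return -1 + c1 * math.exp(x) + c2 * math.exp(-x)

def y_second(x, c1, c2):
    """Its second derivative, computed by hand: the constant drops out."""
    return c1 * math.exp(x) + c2 * math.exp(-x)

random.seed(1)
ok = all(
    abs(y_second(x, c1, c2) - y(x, c1, c2) - 1) < 1e-8
    for c1, c2, x in ((random.uniform(-5, 5), random.uniform(-5, 5),
                       random.uniform(-3, 3)) for _ in range(200))
)
```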

Example 2. Let A be a linear transformation from R² to R¹, let e1 and e2 be the standard basis vectors in R², and let A(e1) = α1, A(e2) = α2, α1 and α2 real numbers. Then if x = x1 e1 + x2 e2 is any vector in R²,

A(x) = x1 A(e1) + x2 A(e2) = α1 x1 + α2 x2,

and the operator equation Ax = 0 is an abbreviated version of

α1 x1 + α2 x2 = 0.    (2-40)

Since (2-40) is the equation of the line through the origin in R² with slope -α1/α2, the solution space of the equation Ax = 0 is just the set of points in R² which comprise that line. In this case the solution set of the nonhomogeneous equation Ax = β, β a real number, can be interpreted as a translation of the line described by (2-40), as shown in Fig. 2-10.

α1 x1 + α2 x2 = β
α1 x1 + α2 x2 = 0

FIGURE 2-10

It goes without saying that in particular instances the solution set of Ax = y may be empty, in which case the equation has no solutions. In fact, one of the major problems in the study of operator equations (or arbitrary equations for that matter) is to determine conditions under which the equation will have solutions. This is the so-called existence problem for operator equations, and theorems which establish such conditions are called existence theorems.

Of equal, or even greater, importance is the problem of ascertaining when Ax = y admits at most one solution for any given y in V2. This problem is known as the uniqueness problem for operator equations, and can always be answered by examining the homogeneous equation Ax = 0 and using the following theorem.

Theorem 2-10. An operator equation Ax = y will have a unique solution (provided it has any solutions at all) if and only if its associated homogeneous equation Ax = 0 has no nonzero solutions, i.e., if and only if N(A) = 0.

The student should have no difficulty in convincing himself that this result is an immediate consequence of Theorem 2-9 and the description of the solution set of Ax = y given there.
In the case where A admits an inverse of one of the various types discussed in Section 2-4, the equation Ax = y can be immediately solved. If, for instance, A is invertible, then from Ax = y we deduce that

A⁻¹(Ax) = A⁻¹y,

or

x = A⁻¹y,

and the solution (which in this case must be unique) has been described in terms of A⁻¹. Similarly, if B is either a right or left inverse for A, we find that the solution set of Ax = y is the set of all x in V1 such that x = By. This technique for solving an operator equation is known as inverting the operator, and is used whenever an explicit formula for an inverse can be deduced from the definition of A.

Example 3. Systems of Linear Equations. As our final example we apply the above ideas to the study of systems of linear equations. Our motive for presenting this somewhat extended example is twofold. First, the theory of linear equations is important in its own right, and the results we are about to obtain will be needed from time to time in our later work. Second, this material provides perhaps the easiest application of linear transformations to the solution of a nontrivial problem, and should therefore help the student become familiar with such transformations.
We begin by introducing some standard terminology. A system of m linear equations in the n unknowns x1, x2, ..., xn is a set of equations of the form

α11 x1 + α12 x2 + ... + α1n xn = β1,
α21 x1 + α22 x2 + ... + α2n xn = β2,
. . .
αm1 x1 + αm2 x2 + ... + αmn xn = βm,    (2-41)

in which the α_ij and β_i are real numbers. The α_ij are called the coefficients of the system, and have been so indexed that the first subscript on any coefficient indicates the equation in which it appears, and the second the unknown with which it is associated. Such a system is said to be homogeneous if all of the β_i are zero, nonhomogeneous otherwise. Finally, a solution of (2-41) is an n-tuple of real numbers (c1, c2, ..., cn) with the property that when c1 is substituted for x1, c2 for x2, etc., each of the equations in the system becomes an identity. A system without solutions is said to be incompatible.

Let

[A] = [α11 α12 ... α1n]
      [α21 α22 ... α2n]
      [ .             ]
      [αm1 αm2 ... αmn]

be the m × n-matrix formed from the coefficients of (2-41), and let A: R^n → R^m be the linear transformation defined by this matrix relative to the standard bases. Then if x denotes the vector (x1, ..., xn) in R^n, and y the vector (β1, ..., βm) in R^m, (2-41) can be rewritten in operator form as

Ax = y.    (2-42)

Conversely, if A is an arbitrary linear transformation from R^n to R^m, and if [A] is the matrix of A with respect to the standard bases in these spaces, Eq. (2-42) is equivalent to a system of m linear equations in n unknowns for each fixed y in R^m. We now propose to use these observations to establish several important facts concerning solutions of systems of linear equations.


In the first place, we know that (2-42) will have solutions if and only if the vector y = (β1, ..., βm) belongs to the image of the linear transformation A. Furthermore, if

y1 = (α11, α21, ..., αm1),  ...,  yn = (α1n, α2n, ..., αmn)

denote the vectors in R^m formed from the columns of (2-41) (or, equivalently, from the columns of the matrix [A]), then

Ax = x1 y1 + x2 y2 + ... + xn yn

for each x = (x1, ..., xn) in R^n. Thus Ax is a linear combination of y1, ..., yn, and it follows that these vectors span the image of A. This, combined with the observation made a moment ago, yields our first existence theorem.

Theorem 2-11. The system of linear equations (2-41) has a solution if and only if the vector y = (β1, ..., βm) in R^m is linearly dependent on the vectors y1, ..., yn formed from the columns of the system.

To answer the uniqueness problem for (2-41) we pass to the associated homogeneous system Ax = 0, and apply Theorem 2-10. We again use the fact that if c = (c1, ..., cn) is any vector in R^n, then

Ac = c1 y1 + ... + cn yn.

Hence c will be a solution of the homogeneous equation Ax = 0 if and only if

c1 y1 + ... + cn yn = 0.

Thus Ax = 0 has a nontrivial solution (i.e., a solution in which at least one of the c_i ≠ 0) if and only if the vectors y1, ..., yn are linearly dependent in R^m, and we have

Theorem 2-12. The system of linear equations (2-41) has a unique solution (provided it has any solutions at all) if and only if the vectors formed from the columns of the system are linearly independent in R^m.

Finally, since y1, ..., yn are always linearly dependent if n > m, this result also yields

Corollary 2-1. A homogeneous system of linear equations has nontrivial solutions whenever the number of unknowns exceeds the number of equations.
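Theorems 2-11 and 2-12 are easy to test mechanically: linear dependence of y on the columns of [A] is the statement that adjoining y to [A] as an extra column does not raise the rank. The sketch below is our own illustration (the rank computation is ordinary Gaussian elimination, not anything developed in the text).

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix, by Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(M[0]) if M else 0):
        piv = next((i for i in range(r, len(M)) if M[i][col] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(len(M)):
            if i != r and M[i][col] != 0:
                f = M[i][col] / M[r][col]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def has_solution(A, y):
    """Theorem 2-11 restated: Ax = y is solvable iff y is linearly
    dependent on the columns of A, i.e. iff adjoining y as an extra
    column does not raise the rank."""
    return rank([row + [b] for row, b in zip(A, y)]) == rank(A)

def unique_if_solvable(A):
    """Theorem 2-12: the solution (if any) is unique iff the columns
    of A are linearly independent."""
    return rank(A) == len(A[0])
```

For instance, with [A] = [1 2; 2 4] the system Ax = (1, 2) is solvable (but not uniquely), while Ax = (1, 3) is incompatible.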
the general theory of
linear differential equations

3-1 LINEAR DIFFERENTIAL OPERATORS

We have already had occasion to remark that the operator D which maps a differentiable function onto its derivative is a linear transformation (see Section 2-1). The same is true of polynomials in D, and of even more complicated expressions such as xD² + D + x. Linear transformations of this sort which involve D and its powers are called linear differential operators. The study of such operators leads naturally to the theory of linear differential equations, the subject matter of this chapter and the chapters which follow.

To give a precise meaning to the term "linear differential operator," let I be an arbitrary interval of the real line, and for each non-negative integer n, let C^n(I) denote the vector space of all real-valued functions which have a continuous nth derivative everywhere in I. [Recall that the vectors in C^n(I) are real-valued functions whose first n derivatives exist and are continuous throughout I, and that vector addition and scalar multiplication in this space are defined by the equations

(f + g)(x) = f(x) + g(x),    (αf)(x) = αf(x),

for all x in I. By agreement, C⁰(I) = C(I).]


In these terms we now state

Definition 3-1. A linear transformation L: C^n(I) → C(I) is said to be a
linear differential operator of order n on the interval I if it can be expressed
in the form

    L = a_n(x)D^n + a_{n-1}(x)D^{n-1} + · · · + a_1(x)D + a_0(x),    (3-1)

where the coefficients a_0(x), . . . , a_n(x) are continuous everywhere in I,
and a_n(x) is not identically zero on I. In addition, the transformation
which maps every function in C^n(I) onto the zero function is also con-
sidered to be a linear differential operator. It, however, is not assigned an
order.

Thus the image of a function f in C^n(I) under the linear differential operator
described above is the function in C(I) defined by the identity

    Lf(x) = a_n(x) (d^n/dx^n)f(x) + · · · + a_1(x) (d/dx)f(x) + a_0(x)f(x),    (3-2)

or, more simply, by

    Ly = a_n(x)y^(n) + · · · + a_1(x)y' + a_0(x)y,    (3-3)

where y', . . . , y^(n) are the first n derivatives of the function y = f(x). Strictly
speaking, the left-hand side of (3-2) is the value of Lf at the point x, and a scrupu-
lous regard for accuracy would require that it be written (L(f))(x). For obvious
reasons the extra parentheses are almost always omitted. Moreover, we shall
occasionally refer to Lf(x) as "the linear transformation L applied to the function
f(x)," thereby following the familiar custom of confusing a function with its
value at a point. This, of course, is just a linguistic convenience, and once under-
stood as such causes no difficulty.

Example 1. The nth derivative operator, D^n, is the simplest example of a linear
differential operator of order n on an arbitrary interval I. When n = 0, D^0 is
just the identity transformation, and, in general, D^n can be viewed as the nth
power of the linear transformation D (see Section 2-3).
Example 2. Any polynomial in D of degree n, with real coefficients, is a linear
differential operator of order n on every interval of the real line.

Example 3. A linear differential operator of order zero on I has the form

    L = a_0(x),    (3-4)

where a_0(x) is continuous and not identically zero on I. Thus if f is any function
in C(I),

    Lf(x) = a_0(x)f(x),

which, of course, is just the product of the functions a_0 and f. Occasionally one
finds (3-4) written in the form a_0(x)D^0 to emphasize the fact that a_0(x) is being
viewed as an operator and not as a function in C(I).*

Example 4. The linear differential operator

    xD^2 + 3√x D - 1

* Note that the expression a_0(x)f(x) actually admits three different interpretations.
It can be viewed as the product of the functions a_0(x) and f(x), or as the value of the
operator a_0(x) applied to the function f(x), or as the product of the operators a_0(x) and
f(x). The particular interpretation chosen, however, is usually a matter of indifference.

is of order 2 in [0, ∞) or any of its subintervals.* By way of contrast,

    (x + |x|)D^2 - √(x + 1) D + ln (x + 1)

is of order 2 on (-1, 1), but of order 1 on the subinterval (-1, 0] since x + |x|
vanishes identically there. Thus the order of a linear differential operator may
depend upon the interval in which it is being considered, as well as on the algebraic
form of the operator itself.
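The interval-dependence just described is easy to spot-check numerically. The following snippet (our own illustration, not part of the text) samples the leading coefficient a_2(x) = x + |x| of the second operator and confirms that it vanishes identically on (-1, 0] but not on (0, 1):

```python
# Sample the leading coefficient a2(x) = x + |x| of the operator
# (x + |x|)D^2 - sqrt(x + 1) D + ln(x + 1) on each subinterval.

def a2(x):
    return x + abs(x)

left_samples = [-k / 10 for k in range(10)]      # points of (-1, 0]
right_samples = [k / 10 for k in range(1, 10)]   # points of (0, 1)

vanishes_on_left = all(a2(x) == 0 for x in left_samples)
vanishes_on_right = all(a2(x) == 0 for x in right_samples)

print(vanishes_on_left, vanishes_on_right)  # True False
```

Since a_2 is identically zero on (-1, 0], the D^2 term contributes nothing there, and the operator degenerates to order 1, exactly as stated above.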

A linear differential operator is, by definition, a linear transformation, and
hence, under suitable hypotheses, it makes sense to talk about the product of two
such operators. Such products are again linear differential operators, although it
is impossible to say very much about their order or domain of definition (see
Exercise 7). We remind the reader that the usual precautions arising from the
noncommutativity of operator multiplication must also be observed in this setting.
For instance, a product such as (xD + 2)(2xD + 1) cannot be computed by
multiplying the expressions xD + 2 and 2xD + 1 according to the usual rules
of algebra. Indeed, if it could, we would have (xD + 2)(2xD + 1) = 2x^2 D^2 +
5xD + 2, when, in fact, the correct answer is 2x^2 D^2 + 7xD + 2, as can be
seen from the following computation:

    (xD + 2)(2xD + 1)y = (xD + 2)(2xy' + y)
                       = xD(2xy' + y) + 2(2xy' + y)
                       = x(2xy'' + 3y') + 4xy' + 2y
                       = 2x^2 y'' + 7xy' + 2y.

However, in the special case of operators with constant coefficients, products
can be computed as though the operators were ordinary polynomials in D (see
Section 2-3, and Exercise 12 below). As we shall see, this fact will ultimately
enable us to solve all linear differential equations with constant coefficients.
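The bookkeeping in computations like the one above can be mechanized for operators whose coefficients are polynomials in x. The sketch below (our own, with invented helper names; not from the text) stores Σ a_k(x)D^k as a dict mapping each order k to a coefficient polynomial, and composes two operators by repeatedly applying D^m(b f) = Σ_j C(m, j) b^{(j)} D^{m-j} f, which is Leibnitz's rule of Exercise 15 below:

```python
from math import comb

def poly_mul(p, q):
    """Multiply two polynomials given as {power: coefficient} dicts."""
    r = {}
    for i, a in p.items():
        for j, b in q.items():
            r[i + j] = r.get(i + j, 0) + a * b
    return {k: v for k, v in r.items() if v != 0}

def poly_diff(p):
    """Differentiate a {power: coefficient} polynomial."""
    return {i - 1: i * a for i, a in p.items() if i > 0}

def compose(L1, L2):
    """Operator product L1 L2; each operator is a {order: coefficient-poly} dict.
    Expands a(x)D^m applied to b(x)D^n via Leibnitz's rule."""
    result = {}
    for m, a in L1.items():
        for n, b in L2.items():
            deriv = dict(b)          # j-th derivative of b, starting at j = 0
            for j in range(m + 1):
                term = poly_mul(poly_mul({0: comb(m, j)}, a), deriv)
                acc = result.setdefault(m - j + n, {})
                for p, c in term.items():
                    acc[p] = acc.get(p, 0) + c
                deriv = poly_diff(deriv)
    return {k: {p: c for p, c in v.items() if c != 0}
            for k, v in result.items() if any(v.values())}

xD_plus_2 = {1: {1: 1}, 0: {0: 2}}      # xD + 2
two_xD_plus_1 = {1: {1: 2}, 0: {0: 1}}  # 2xD + 1

# (xD + 2)(2xD + 1) expands to 2x^2 D^2 + 7xD + 2, not the "naive" 2x^2 D^2 + 5xD + 2
print(compose(xD_plus_2, two_xD_plus_1))

# Exercise 5 below: D(xD) = xD^2 + D, whereas (xD)D = xD^2
D_op, xD = {1: {0: 1}}, {1: {1: 1}}
print(compose(D_op, xD), compose(xD, D_op))
```

The same routine reproduces the degree-m + n count of Exercise 8 whenever the coefficients are polynomial.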

EXERCISES

1. Evaluate each of the following expressions.

(a) (D^2 + D)e^{2x}    (b) (3D^2 + 2D + 2) sin x
(c) (xD - x)(2 ln x)   (d) (D + 1)(D - x)(2e^x + cos x)

2. Repeat Exercise 1 for each of the following expressions.

(a) (aD^2 + bD + c)e^{kx},  a, b, c, k constants
(b) (x^2 D^2 - 2xD + 4)x^k,  k a constant

* The intervals [a, b) and (a, b] are defined, respectively, by the inequalities a ≤ x < b,
a < x ≤ b. The first is said to be open on the right and closed on the left, the second
closed on the right and open on the left.

(c) (4x^2 D^2 + 4xD + 4x^2 - 1) sin x
3. Find constants a, b, c such that a + b + c = 1, and

    [(1 - x^2)D^2 - 2xD + 6](ax^2 + bx + c) = 0.

4. Write each of the following linear differential operators in the standard form
a_n(x)D^n + · · · + a_1(x)D + a_0(x).

(a) (D^2 + 1)(D - 1)   (b) xD(D - x)
(c) (xD + D)^2         (d) D^2(xD - 1)D
(e) D(De^x + 1) + e^x

5. Show that D(xD) ≠ (xD)D.


6. (a) Prove that a linear differential operator of order n is a linear transformation
from C^n(I) to C(I).
(b) Is this linear transformation one-to-one when n > 0? Why?
7. (a) Compute the product of the linear differential operators a_1(x)D + 1 and
b_1(x)D + 1 when

    a_1(x) = 0 for x ≤ 0 and a_1(x) = x^2/2 for x > 0,
    b_1(x) = x^2/2 for x ≤ 0 and b_1(x) = 0 for x > 0,

and thus deduce that the order of the product of two such operators need not be
the sum of the orders of the factors.
(b) Give an example to show that the product of two linear differential operators
on an interval I need not be defined on the same interval.
8. Prove that D^m(a(x)D^n) is a linear differential operator of order m + n by expressing
this product in standard form as a "polynomial" in D. [Assume the existence and
continuity of all the necessary derivatives of a(x).]

9. Find the sum L_1 + L_2 of each of the following pairs of linear differential operators.
(a) L_1 = 2xD + 3,  L_2 = xD - 1
(b) L_1 = e^x D^2 + D,  L_2 = e^{-x} D^2 - D
(c) L_1 = xD + 1,  L_2 = Dx
10. Prove that the sum of two linear differential operators defined on an interval I is
the linear differential operator on I obtained by adding the corresponding coeffi-
cients in the standard "polynomial" representation (3-1) of the given operators.
11. Let

    L_1 = Σ_{k=0}^{m} a_k(x)D^k  and  L_2 = Σ_{k=0}^{n} b_k(x)D^k

be linear differential operators on an interval I. Prove that L_1 = L_2 if and only if
m = n and a_k(x) = b_k(x) for all k.

12. (a) Prove that

    (aD^m)(bD^n) = (bD^n)(aD^m) = abD^{m+n}

whenever a and b are constants.



(b) Use (a) and the general distributivity formula for linear transformations that
was established in Section 2-3 to prove that the multiplication of constant coeffi-
cient linear differential operators is commutative. Deduce from this that the product
of two such operators can be obtained by treating them as ordinary polynomials
in D and using the usual rules of elementary algebra.

13. Factor each of the following linear differential operators into a product of irreducible
factors of lower order.

(a) D^2 - 3D + 2    (e) 4D^4 + 4D^3 - 7D^2 + D - 2
(b) 2D^2 + 5D + 2   (f) D^4 - 1
(c) 4D^2 + 4D + 1   (g) D^4 + 1
(d) D^3 - 3D^2 + 4  (h) D^5 - 1

14. Prove that

(a) D^2[f(x)g(x)] = f''(x)g(x) + 2f'(x)g'(x) + f(x)g''(x),
(b) D^3[f(x)g(x)] = f'''(x)g(x) + 3f''(x)g'(x) + 3f'(x)g''(x) + f(x)g'''(x).

Can you make a conjecture as to the form of D^n[f(x)g(x)]? ("No" is not an ac-
ceptable answer.)
*15. Use mathematical induction to prove Leibnitz's rule:

    D^n[f(x)g(x)] = Σ_{k=0}^{n} (n choose k)(D^k f(x))(D^{n-k} g(x)),

where

    (n choose k) = n(n - 1) · · · (n - k + 1)/k! = n!/(k!(n - k)!).

16. Use the result of the preceding exercise to express each of the following linear
differential operators in the form a_n(x)D^n + · · · + a_1(x)D + a_0(x).
(a) D^3(xD)
(b) D^m(xD)
(c) D^5(xD^2 + e^x)
17. Prove that for any pair of non-negative integers k and m,

    D^m x^k = [k!/(k - m)!] x^{k-m} if m ≤ k,  and  D^m x^k = 0 if m > k.
"18. (a) Prove that
    (x^m D^m)x^k = k(k - 1) · · · (k - m + 1)x^k

for any real number k.

(b) Prove that

    (a_2 x^2 D^2 + a_1 xD + a_0)x^k = [a_2 k(k - 1) + a_1 k + a_0]x^k

for any real number k (a_0, a_1, a_2 constants).

(c) Prove that (xD)(x^3 D^3) = (x^3 D^3)(xD).


*19. A linear differential operator is sometimes said to be equidimensional or an Euler
operator if it can be written in the form a_n x^n D^n + · · · + a_1 xD + a_0, where
a_0, . . . , a_n are constants.

(a) Compute the value of Lx^k, k an arbitrary real number, when L is equidimensional.
(b) Prove that (x^m D^m)(x^n D^n) = (x^n D^n)(x^m D^m) for any pair of non-negative inte-
gers m, n, and hence deduce that the multiplication of equidimensional operators is
commutative. [Hint: Use the results of Exercises 15 and 17.]

3-2 LINEAR DIFFERENTIAL EQUATIONS

An nth-order linear differential equation on an interval I is, by definition, an
operator equation of the form

    Ly = h(x),    (3-5)

in which h is continuous on I, and L is an nth-order linear differential operator
defined on I. Such an equation is said to be homogeneous if h is identically zero
on I, nonhomogeneous otherwise, and normal whenever the leading coefficient
a_n(x) of the operator L does not vanish anywhere on I. Finally, a function y(x)
is said to be a solution of (3-5) if and only if y(x) belongs to C^n(I) and satisfies the
equation identically on I.
Thus an nth-order linear differential equation is simply an equation of the
form

    a_n(x) d^n y/dx^n + · · · + a_1(x) dy/dx + a_0(x)y = h(x),    (3-6)

whose coefficients a_0(x), . . . , a_n(x) and right-hand side h(x) are continuous on an
interval I in which a_n(x) is not identically zero. Typical examples are provided
by the equations
    d^2 y/dx^2 + y = 0,

which is homogeneous, normal, and of order 2 on (-∞, ∞) or any of its sub-
intervals, and

    x d^3 y/dx^3 + dy/dx = e^x,

which is nonhomogeneous, normal, and of order 3 on (0, ∞) and (-∞, 0), but
is non-normal on any interval containing the origin.
The primary objective in the study of linear differential equations is to find all
solutions of any given equation on an interval I. As might be expected, this is a
difficult problem, and a complete answer is known only for certain special types
of equations. However, there exists a considerable body of knowledge concerning
the general behavior of solutions of linear differential equations, and in this
respect the theory of such equations stands in refreshing contrast to that of non-
linear equations. This, of course, is due to the fact that the techniques of linear
algebra can be used in this context, and the present chapter constitutes our first
substantial application of these ideas to the study of a problem in analysis.

As an illustration of the way in which linear algebra intervenes in the study of
differential equations, let

    Ly = 0    (3-7)

be a normal, homogeneous linear differential equation of order n on an interval I
of the x-axis. In this case the solution set of the equation is none other than the
null space of the linear transformation L, and hence is a subspace of C^n(I). Out
of deference to the problem at hand this subspace is called the solution space of
the equation, and the task of solving (3-7) has been reduced to that of finding
a basis for its solution space, provided, of course, the solution space is finite
dimensional. It is; and later in this chapter we shall in fact prove that the solution
space of any normal nth-order homogeneous linear differential equation is an n-dimen-
sional subspace of C^n(I). Thus if L is normal, and if y_1(x), . . . , y_n(x) are n linearly
independent solutions of (3-7), then every solution of that equation must be of the
form

    y(x) = c_1 y_1(x) + · · · + c_n y_n(x)    (3-8)

for suitable real numbers c_i. Conversely, every function of this type is certainly
a solution of (3-7) whenever y_1(x), . . . , y_n(x) are, and for this reason (3-8),
with the c_i arbitrary, is called the general solution of (3-7). Finally, any function
obtained from the general solution by assigning definite values to the c_i is called
a particular solution. We leave the reader to reflect upon the merits and short-
comings of this somewhat unfortunate choice of terminology.
By a familiar line of reasoning, these results are also pertinent to the study of
nonhomogeneous equations. Indeed, in Section 2-9 we saw that if y_p(x) is any
solution of the nonhomogeneous equation

    Ly = h(x)    (3-9)

and if y_h(x) is the general solution of the associated homogeneous equation Ly = 0,
then the expression y_p(x) + y_h(x) is the general solution of (3-9). In other words,
the solution set of a nonhomogeneous linear differential equation can be found by
adding all solutions of the associated homogeneous equation to any particular solu-
tion of the given equation. Needless to say, this argument effectively reduces the
problem of solving a nonhomogeneous equation to that of finding the general
solution of its associated homogeneous equation. And in the next chapter we
shall complete this reduction by giving a method whereby a particular solution
y_p(x) of (3-9) can always be found once the general solution of the associated
homogeneous equation is known.

Example 1. The functions sin x and cos x are easily seen to be solutions of the
second-order equation

    y'' + y = 0    (3-10)

on the interval (-∞, ∞). Moreover, these functions are linearly independent
in C^2(-∞, ∞), since

    c_1 sin x + c_2 cos x = 0

for all x implies, by setting x = π/2 and x = 0, that c_1 = 0 and c_2 = 0. Thus
the general solution of (3-10) is

    y = c_1 sin x + c_2 cos x,    (3-11)

where c_1 and c_2 are arbitrary constants. The reader should note that without a
theorem such as the one cited above there would be no guarantee whatever that
(3-11) includes every solution of the given equation.

Example 2. The function y_p(x) = x is obviously a solution of the nonhomoge-
neous equation

    y'' + y = x    (3-12)

on (-∞, ∞). Hence, since c_1 sin x + c_2 cos x is the general solution of the
associated homogeneous equation y'' + y = 0, the general solution of (3-12) is

    y = x + c_1 sin x + c_2 cos x.
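As a quick sanity check, the family just obtained can be tested numerically. The snippet below (our own; not part of the text) approximates y'' with a central difference and confirms that y = x + c_1 sin x + c_2 cos x leaves essentially no residual in y'' + y = x for randomly chosen constants:

```python
import math
import random

def residual(c1, c2, x, h=1e-4):
    """Value of y'' + y - x for y = x + c1 sin x + c2 cos x,
    with y'' approximated by a central difference."""
    y = lambda t: t + c1 * math.sin(t) + c2 * math.cos(t)
    y2 = (y(x + h) - 2 * y(x) + y(x - h)) / h**2
    return y2 + y(x) - x

random.seed(1)
worst = max(abs(residual(random.uniform(-5, 5), random.uniform(-5, 5),
                         random.uniform(-3, 3)))
            for _ in range(100))
print(worst < 1e-4)  # residual is zero up to discretization error
```

Of course this numerical check cannot replace the dimension theorem quoted above: it shows that each member of the family is a solution, not that the family exhausts all solutions.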

Before leaving this section it may be instructive to compare the solution set
of a nonlinear differential equation with that of a linear equation. To this end we
consider

    y' - 3y^{2/3} = 0,    (3-13)

which, as is easily seen, has the family of cubic curves

    y = (x + c)^3    (3-14)

as its "general" solution on the interval (-∞, ∞). (See Fig. 3-1.) In particular,
the functions x^3 and (x + 1)^3 are solutions of (3-13). But their sum is not, and
hence the solution set of this equation is not a subspace of C^1(-∞, ∞), even
though the equation appears to be homogeneous. Moreover, all of the various
solutions obtained from (3-14) by assigning different values to c are linearly
independent in C^1(-∞, ∞), and we conclude that a first-order nonlinear differential
equation can actually have infinitely many linearly independent solutions. Finally,
(3-13) also admits an infinite number of solutions which cannot be obtained from
(x + c)^3 by specializing the constant c. All of these somewhat peculiar solutions
have the property that they are zero along an interval of the x-axis, and are of the
following three forms:

    y = (x - a)^3 for x ≤ a, and y = 0 for x > a;
    y = 0 for x < b, and y = (x - b)^3 for x ≥ b;
    y = (x - a)^3 for x ≤ a, y = 0 for a < x < b, and y = (x - b)^3 for x ≥ b.

FIGURE 3-1

Thus, to use the term "general solution" in reference to (3-14) is in this case a
genuine misnomer. In short, every single one of the properties enjoyed by the
solution set of a linear differential equation fails to hold here, a fact which, if it
does nothing else, should convince the student that linear differential equations
are rather more pleasant to encounter than nonlinear equations.

EXERCISES

1. Determine the order of each of the following linear differential equations on the
indicated intervals.
(a) xy'' - (2x + 1)y = 3, on (-∞, ∞)    (b) (D + 1)^3 y = 0, on (0, 1)
(c) (x + |x|)y''' + (sin x)y' = 2e^x, on (-1, 1); on (0, ∞)
(d) √x y'' - 2y' + (sin x)y = ln x, on (1, ∞)
(e) (x + 1 + |x + 1|)y''' + (x + |x|)y' + 2y = 0, on (-∞, ∞); on (0, ∞);
on (-1, 0)

2. In each of the following show that the given function is a solution of the associated
linear differential equation, and find the interval (or intervals) in which this is the
case.

(a) xy'' + y' = 0;  ln (1/x)
(b) 4x^2 y'' + 4xy' + (4x^2 - 1)y = 0;  √(2/(πx)) sin x
(c) (1 - x^2)y'' - 2xy' + 6y = 0;  3x^2 - 1
(d) x^2 y'' - xy' + y = 1;  1 + 2x ln x
(e) (1 - x^2)y'' - 2xy' + 2y = 2;  x tanh^{-1} x

3. (a) Show that e^{ax} cos bx and e^{ax} sin bx are linearly independent solutions of the
equation

    (D^2 - 2aD + a^2 + b^2)y = 0,  b ≠ 0,

on (-∞, ∞).
(b) What is the general solution of this equation?
(c) Find the particular solution of the equation in (a) which satisfies the "initial"
conditions y(0) = b, y'(0) = -a.
4. (a) Show that e^{ax} and xe^{ax} are linearly independent solutions of the equation

    (D - a)^2 y = 0.

(b) Find the particular solution of this equation which satisfies the "initial" condi-
tions y(0) = 1, y'(0) = 2.
5. (a) Verify that sin^3 x and sin x - (1/3) sin 3x are solutions of

    y'' + (tan x - 2 cot x)y' = 0

on any interval where tan x and cot x are both defined. Are these solutions linearly
independent?
(b) Find the general solution of this equation.
6. Show that (1/9)x^3 and (1/9)(x^{3/2} + 1)^2 are solutions of the nonlinear differential
equation (dy/dx)^2 - xy = 0 on (0, ∞). Is the sum of these functions a solution?
7. In each of the following show that the given functions span the solution space of
the associated differential equation. Find a basis for the solution space in each case,
and use it to obtain the general solution of the equation in question.

(a) y'' - y = 0;  sinh x, 2e^{-x}, -cosh x, on (-∞, ∞)
(b) x^2 y'' - 5xy' + 9y = 0;  2x^3 ln x, x^3, x^3(2 ln x - 1), on (0, ∞)
(c) y'' + 4y = 0;  sin 2x, -2 cos 2x, -cos (2x - 3), on (-∞, ∞)
(d) (1 - x^2)y'' - 2xy' + 2y = 0;  3x, (x/2) ln [(1 + x)/(1 - x)] - 1, on (-1, 1)

3-3 FIRST-ORDER EQUATIONS

Let

    a_1(x) dy/dx + a_0(x)y = h(x)    (3-15)

be a normal first-order linear differential equation defined on an interval I of the
x-axis. Then, as we know, the general solution of this equation can be expressed
in the form

    y = y_p(x) + y_h(x),    (3-16)

where y_p(x) is any "particular" solution, and y_h(x) is the general solution of the
homogeneous equation

    a_1(x) dy/dx + a_0(x)y = 0.    (3-17)

Since a_1(x) ≠ 0 everywhere in I, (3-17) may be rewritten

    (1/y) dy/dx = -a_0(x)/a_1(x),  y ≠ 0,

and integrated to yield

    ln |y| = -∫[a_0(x)/a_1(x)] dx + k,

or

    |y| = e^k e^{-∫[a_0(x)/a_1(x)] dx}.

Hence, by the theorem cited in the preceding section, the general solution of (3-17)
is

    y_h = c e^{-∫[a_0(x)/a_1(x)] dx},

where c is an arbitrary constant.


To obtain a particular solution of (3-15), we rewrite the equation as

dy a (x) h(x)
n 1R
.
(3_18)
Tx+ ^(x) y ^x)'

and multiply by g Jl8o(I)/fll(,)1<fa to obtain

( dy a (x) \ J[a (x)/ ai (x)]dx _ hjx) Jlo <z)/o x (g)]dx


,

y g (3-19)
\dx fli(x) ) fli(x)

But


dx
°* ft, J"k»o(*)/«i(*>]*«^
J
_
~
(dy.
\<£t
+
_|_
a °(X ^
ai (x)
yi,l
,Jl<*o<*>/«i(*>]<**
rfx )

and so (3-19) may be replaced by the equivalent equation

d_ / f[a (x)lai(x)]dx^ _ Kx )
e
Jla (x)/ ai (x)]dx
dx
Thus
_ —J[o (a;)/oi(a;)]da; h(-X) J\a {x)la l (x)\dx ,

/
3-3 | FIRST-ORDER EQUATIONS 97

and it follows from (3-16) that the general solution of (3-15) is

y = J
h\X) f[a (x)/o 1 (a;)]da;
^ Q—
f[a (x)/ai(x)]dx
(3~20)

c an arbitrary constant.
Considering the simplicity of the technique underlying this result, it is not
recommended that the student commit (3-20) to memory. Instead he should re-
member the general method, which can be described as follows: To find the general
solution of a normal first-order linear differential equation, rewrite the equation in
the form

    dy/dx + [a_0(x)/a_1(x)]y = h(x)/a_1(x),

multiply by e^{∫[a_0(x)/a_1(x)] dx}, and integrate.

Example 1. Find the general solution of

    dy/dx + 2xy = x.

In this case we multiply the equation by e^{∫2x dx} = e^{x^2} to obtain

    (d/dx)(y e^{x^2}) = x e^{x^2}.

Thus

    y e^{x^2} = ∫ x e^{x^2} dx + c = (1/2)e^{x^2} + c,

and it follows that

    y = ((1/2)e^{x^2} + c) e^{-x^2} = 1/2 + c e^{-x^2}.
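The recipe just used also lends itself to direct numerical implementation. In the sketch below (our own; names such as solve_linear are invented for illustration) the two integrals in the integrating-factor formula are computed by trapezoidal quadrature, and the result is checked against the closed form y = 1/2 + c e^{-x^2} obtained above:

```python
import math

def solve_linear(p, q, x0, y0, x, n=20000):
    """Approximate at x the solution of y' + p(x)y = q(x), y(x0) = y0, via
    y(x) = e^{-P(x)} (y0 + integral of q e^P), P(x) = integral of p from x0 to x."""
    h = (x - x0) / n
    P = 0.0     # running integral of p
    acc = 0.0   # running integral of q * e^P
    prev_p, prev_g = p(x0), q(x0)
    for k in range(1, n + 1):
        t = x0 + k * h
        P += h * (prev_p + p(t)) / 2
        g = q(t) * math.exp(P)
        acc += h * (prev_g + g) / 2
        prev_p, prev_g = p(t), g
    return math.exp(-P) * (y0 + acc)

c = 3.0
exact = lambda x: 0.5 + c * math.exp(-x * x)
approx = solve_linear(lambda t: 2 * t, lambda t: t, 0.0, exact(0.0), 1.5)
print(abs(approx - exact(1.5)) < 1e-4)  # True
```

The quadrature sketch is only a stand-in for the symbolic integration carried out in the example; its value lies in confirming that the integrating-factor formula and the closed form agree.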

Example 2. Solve the equation

    x dy/dx + y = x.    (3-21)

Since the leading coefficient of this equation vanishes when x = 0, the above
method applies only on the intervals (0, ∞) and (-∞, 0). There, however,
(3-21) may be rewritten

    dy/dx + y/x = 1,    (3-22)

and solved by introducing the "integrating factor"

    e^{∫dx/x} = e^{ln |x|} = |x|.

Multiplying (3-22) by |x| and integrating, we obtain

    y|x| = ∫ |x| dx + c = x^2/2 + c for x > 0, and y|x| = -x^2/2 + c for x < 0.

Thus

    y = x/2 + c/x,  x > 0,
    y = x/2 - c/x,  x < 0,

and since c is arbitrary we have

    y = x/2 + c/x.

We call the reader's attention to the fact that x/2 is the only solution of the given
equation defined on the entire real line. Nevertheless it is common practice to
call x/2 + c/x the "general solution" of (3-21) without specifying the interval
in question, a practice which is admittedly convenient, but potentially misleading.
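The general solution just found can be verified directly: with y = x/2 + c/x one has y' = 1/2 - c/x^2, so xy' + y = x/2 - c/x + x/2 + c/x = x wherever x ≠ 0. The snippet below (ours, not the text's) runs this check numerically for several values of c on both sides of the origin:

```python
def lhs(c, x):
    """Value of x y' + y for y = x/2 + c/x (valid only when x != 0)."""
    y = x / 2 + c / x
    yprime = 0.5 - c / x**2
    return x * yprime + y

deviations = [lhs(c, x) - x for c in (-2.0, 0.0, 5.0) for x in (0.3, 1.0, -4.0)]
print(all(abs(d) < 1e-12 for d in deviations))  # True
```

Note that only c = 0 gives a function defined at x = 0, which is the numerical face of the remark that x/2 is the lone solution on the whole real line.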
Example 3. We now use the above technique to solve the nonlinear equation

    a_1(x) dy/dx + a_0(x)y = h(x)y^n,    (3-23)

where n is an arbitrary real number. This equation is known as Bernoulli's
equation, and here, as always, we assume that a_0(x), a_1(x), and h(x) are con-
tinuous on an interval I, and that a_1(x) ≠ 0 on I.

Dismissing the cases n = 0 and n = 1, which have been treated above, we
rewrite (3-23) as

    a_1(x)y^{-n} dy/dx + a_0(x)y^{1-n} = h(x),    (3-24)

and make the change of variable u = y^{1-n}. Then

    du/dx = (1 - n)y^{-n} dy/dx,

and (3-24) becomes

    [a_1(x)/(1 - n)] du/dx + a_0(x)u = h(x),

which is a normal first-order equation on the interval I. We now solve this equa-
tion for u, and then express the general solution of (3-23) as y = u^{1/(1-n)}. Finally,
if n > 0, we add the solution y = 0 "suppressed" in passing from (3-23) to
(3-24), and we are done.
Thus, to solve

    dy/dx + y = (xy)^2,    (3-25)

we first rewrite the equation as

    y^{-2} dy/dx + y^{-1} = x^2,

and then make the change of variable u = y^{-1}. This gives

    du/dx - u = -x^2,

from which we obtain

    u e^{-x} = -∫ x^2 e^{-x} dx + c.

Hence

    u = 2 + 2x + x^2 + c e^x,

and the solutions of (3-25) are

    y = (2 + 2x + x^2 + c e^x)^{-1},  c arbitrary,

and y = 0.
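A final check (ours, not the text's): for any constant c the closed form just derived should satisfy the original Bernoulli equation y' + y = (xy)^2. Writing u = 2 + 2x + x^2 + c e^x, we have y = 1/u and y' = -u'/u^2, so the residual computed below should vanish identically:

```python
import math

def residual(c, x):
    """y' + y - (xy)^2 for y = 1/(2 + 2x + x^2 + c e^x)."""
    u = 2 + 2 * x + x * x + c * math.exp(x)
    du = 2 + 2 * x + c * math.exp(x)    # derivative of u
    y = 1 / u
    dy = -du / (u * u)
    return dy + y - (x * y) ** 2

vals = [residual(c, x) for c in (0.0, 1.5, -0.5) for x in (-1.0, 0.0, 2.0)]
print(all(abs(v) < 1e-12 for v in vals))  # True
```

The identity behind the check is simply du/dx = u - x^2, which is the linear equation for u solved in the example.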

EXERCISES

Find the general solution of each of the following equations.

1. xy' + 2y = 0
2. (1 - x^2)y' - y = 0
3. (sin x)y' + (cos x)y = 0
4. 3y' + ky = 0,  k a constant
5. 2y' + 3y = e^{-x}
6. 3xy' - y = ln x + 1
7. L di/dt + Ri = E,  L, R, E constants, L, R ≠ 0
8. (3x^2 + 1)y' - 2xy = 6x
9. (x^2 + 1)y' - (1 - x)^2 y = xe^{-x}
10. (x^2 + 1)y' + xy = (1 - 2x)√(x^2 + 1)
11. x sin x dy/dx + (sin x + x cos x)y = xe^x
12. x dy/dx + y/√(2x + 1) = 1 + √(2x + 1)

13. x dy/dx + y/√(1 - x^2) = 1 + √(1 - x^2)
14. sin x cos x dy/dx + y = tan x
15. (1 + sin x) dy/dx + (2 cos x)y = tan x
16. 2(1 - x^2)y' - x(1 - x^2)y = xy^3 e^{-x}
17. y' = (sin^2 x - y cos x)/(sin x cos x)
18. yy' + xy^2 - x = 0
19. (x + 1)√y y' = xe^{3x/2} + (1 - x)
20. (x^2 + x + 1)yy' + (2x + 1)y^2 = 2x - 1
21. xy' + y/ln x = (sin 2x/ln x) y^{2/3}
22. [x(x + ln x)/ln x] y' + y = (1 + cos x)
23. (x - 1)y' - 2y = √((x^2 - 1)y)
24. y' = [(x + 1) ln x - x(3x + 4)y]/[(x^3 + 2x^2 - 1)y^2]
25. (xy^2)' = (xy)^3 (x^2 + 1)

26. Find the particular solution of the equation xy' - (sin x)y = 0 on the interval
(0, ∞) which passes through the point (1, -1). [Hint: Show that the general solution
of this equation on (0, ∞) may be written in the form y = c e^{∫_1^x (sin t)/t dt}, x > 0.]

27. (a) Find the solution curve of the equation

    x dy/dx + y = e^{-x^2/2}

which passes through the point (2, -3). [Hint: Find the general solution and show
that it can be written in the form

    y = c/x + (1/x) ∫_2^x e^{-t^2/2} dt.]

(b) What is the ordinate of the point on the solution curve found in (a) corresponding
to the point x = 1? (Consult a table of values for (1/√(2π)) ∫_{-∞}^x e^{-t^2/2} dt.) Find the
slope of the solution curve at this point.

Riccati's equation. Any first-order differential equation of the form

    dy/dx + a_2(x)y^2 + a_1(x)y + a_0(x) = 0,    (3-26)

in which a_0(x), a_1(x), a_2(x) are continuous on an interval I and a_2(x) ≠ 0 on I, is called
a Riccati equation. A number of elementary facts concerning the solutions of such equa-
tions are given in the exercises which follow.
28. Let y_1(x) be a particular solution of (3-26). Make the change of variable y =
y_1 + 1/z to reduce (3-26) to a first-order linear equation in z, and hence deduce
that the general solution of a Riccati equation can be found as soon as a particular
solution is known.
Use the technique suggested in the preceding exercise to find the general solution of each
of the following Riccati equations.
29. y' - xy^2 + (2x - 1)y = x - 1;  particular solution y = 1

30. y' + xy^2 - 2x^2 y + x^3 = x + 1;  particular solution y = x - 1

31. 2y' - (y/x)^2 - 1 = 0;  particular solution y = x

32. y' + y^2 - (1 + 2e^x)y + e^{2x} = 0;  particular solution y = e^x

33. y' - (sin^2 x)y^2 + [1/(sin x cos x)]y + cos^2 x = 0;  particular solution
y = cos x/sin x
34. (a) Let y_1(x) and y_2(x) be two particular solutions of Eq. (3-26). Show that the
general solution of the equation is

    (y - y_1)/(y - y_2) = c e^{∫ a_2(x)(y_2 - y_1) dx},

c an arbitrary constant. [Hint: Consider the expression

    (y' - y_1')/(y - y_1) - (y' - y_2')/(y - y_2).]

(b) Let y_1(x), y_2(x), and y_3(x) be distinct particular solutions of Eq. (3-26). Use the
result established in (a) to prove that the general solution of the equation is

    (y - y_1)(y_3 - y_2)/[(y - y_2)(y_3 - y_1)] = c,

c an arbitrary constant.

35. (a) Show that a constant coefficient Riccati equation

    dy/dx + ay^2 + by + c = 0

has a solution of the form y = m, m a constant, if and only if m is a root of the
quadratic equation

    am^2 + bm + c = 0.

(b) Use this result, together with Exercise 28 or Exercise 34(a), as appropriate, to
find the general solution of each of the following Riccati equations.

(i) y' + y^2 + 3y + 2 = 0    (ii) y' + 4y^2 - 9 = 0
(iii) y' + y^2 - 2y + 1 = 0  (iv) 6y' + 6y^2 + y - 1 = 0
36. (a) Prove that the change of variable v = y'/y reduces the second-order homoge-
neous linear differential equation

    y'' + a_1(x)y' + a_0(x)y = 0    (3-27)

to the Riccati equation

    v' + v^2 + a_1(x)v + a_0(x) = 0,    (3-28)

and hence deduce that the problem of solving (3-27) is equivalent to that of solving
the simultaneous pair of first-order equations

    dy/dx = vy,    dv/dx = -v^2 - a_1(x)v - a_0(x).    (3-29)

[Equation (3-28) is called the Riccati equation associated with (3-27).]

(b) What conditions ought one impose on (3-29) to correspond to the conditions
y(0) = y_0, y'(0) = y_1 on (3-27)?

(c) Prove that every Riccati equation (3-26) in which a_2(x) ≠ 0 can be converted
to a second-order homogeneous linear differential equation by making the change of
variable y = v'/(a_2 v).
37. Find the Riccati equation associated with y'' - y = 0. Solve this equation, and
hence find the general solution of y'' - y = 0.
38. Prove that whenever m_1 and m_2 are distinct real roots of the quadratic equation

    am^2 + bm + c = 0,  a, b, c constants,

then e^{m_1 x} and e^{m_2 x} are linearly independent solutions in C(-∞, ∞) of the second-
order homogeneous linear differential equation

    ay'' + by' + cy = 0.

(See Exercises 35(a) and 36(a).)


39. Use the result of the preceding exercise to find the general solution of each of the
following second-order linear differential equations.
(a) y'' - 5y' + 6y = 0    (b) 2y'' + y' - 3y = 0
(c) (D + 1)(D - 2)y = 0   (d) (12D^2 - D - 20)y = 0
(e) (2D^2 - 3)y = 0
40. Prove that e^{mx} and xe^{mx} are linearly independent solutions in C(-∞, ∞) of the
second-order homogeneous linear differential equation

    y'' - 2my' + m^2 y = 0,

m a constant.
41. Use the result of the preceding exercise to find the general solution of each of the
following second-order linear differential equations.
(a) y'' + 2y' + y = 0    (b) 4y'' - 12y' + 9y = 0
(c) (D - 2/3)^2 y = 0    (d) (36D^2 - 12D + 1)y = 0
(e) (2D^2 - 2√2 D + 1)y = 0

3-4 EXISTENCE AND UNIQUENESS OF SOLUTIONS;
INITIAL-VALUE PROBLEMS

On the strength of the results of the preceding section we can assert that every
first-order linear differential equation which is normal on an interval I has solu-
tions. In fact, it has infinitely many, one for each value of c in the expression

    y = e^{-∫[a_0(x)/a_1(x)] dx} [∫ [h(x)/a_1(x)] e^{∫[a_0(x)/a_1(x)] dx} dx + c],    (3-30)

and the general solution of such an equation therefore is a one-parameter family
of plane curves which traverse the strip of the xy-plane determined by I, as shown
in Fig. 3-2. Even more important, it is easy to see that there is a solution curve
passing through any preassigned point (x_0, y_0) in this strip, since (3-30) can be
solved for c when x = x_0, y = y_0.

The problem of finding a function y = y(x) which is a solution of a normal
first-order linear differential equation and which also satisfies the condition
y(x_0) = y_0 is called an initial-value problem for the given equation. This termi-
nology is designed to serve as a reminder of the physical interpretation which
views such a solution as the path, or trajectory, of a moving particle which started
at the point (x_0, y_0), and whose subsequent motion was governed by the equation
in question. In these terms our earlier results can be summarized by saying that
every initial-value problem involving a normal first-order linear differential
equation has at least one solution.

FIGURE 3-2
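Solving an initial-value problem of this sort therefore amounts to solving (3-30) for the one admissible value of c. As an illustration (ours, not the text's), the equation y' + 2xy = x of Section 3-3, Example 1, has general solution y = 1/2 + c e^{-x^2}; the snippet below computes the unique c whose solution curve passes through a given point (x_0, y_0):

```python
import math

def c_through(x0, y0):
    """The unique c for which y = 1/2 + c e^{-x^2} passes through (x0, y0)."""
    return (y0 - 0.5) * math.exp(x0 * x0)

def solution(c):
    return lambda x: 0.5 + c * math.exp(-x * x)

c = c_through(2.0, -3.0)
y = solution(c)
print(abs(y(2.0) - (-3.0)) < 1e-12)  # the curve passes through (2, -3)
```

Because c_through returns exactly one value for each (x_0, y_0), exactly one curve of the family passes through each point of the strip, which is the content of Theorem 3-1 below for this particular equation.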

At this point it is only natural to ask whether or not such a problem can admit
more than one solution. This is the so-called uniqueness problem for first-order
linear differential equations, and is anything but an idle question. Indeed, in
applications of differential equations to the natural sciences it is often essential to
be able to guarantee that the problem being investigated has a unique solution,
since any attempt to predict the future behavior of a physical system governed
by an initial-value problem relies upon this knowledge. In the case at hand, it is
not difficult to show that the desired uniqueness obtains (see Exercise 14 below),
and hence the above assertion can be amended to read as follows:

Theorem 3-1. Every initial-value problem involving a normal first-order


linear differential equation has precisely one solution.

The general theory of linear differential equations can properly be said to begin
with the theorem which generalizes this result to nth-order equations. In the
special case treated above, the theorem was proved by the simple expedient of
exhibiting all of the solutions at issue. Unfortunately, it is impossible to give an
argument of this type for equations of higher order, and though the asserted
theorem is true, its proof is not conspicuously easy. Thus, rather than become
involved in a long and somewhat arid discussion at this time, we content ourselves
with a formal statement of the result.*

Theorem 3-2. (The existence and uniqueness theorem for linear differential
equations.) Let

a_n(x) d^n y/dx^n + · · · + a_0(x)y = h(x)    (3-31)

be a normal nth-order linear differential equation defined on an interval I,
and let x_0 be any point in I. Then if y_0, y_1, . . . , y_{n-1} are arbitrary real
numbers, there exists one and only one solution y(x) of (3-31) with the property
that

y(x_0) = y_0,  y'(x_0) = y_1,  . . . ,  y^(n-1)(x_0) = y_{n-1}.    (3-32)

As in the case of first-order equations, the problem of finding a solution of
(3-31) which satisfies the n additional conditions given in (3-32) is called an
initial-value problem with initial conditions

y(x_0) = y_0,  y'(x_0) = y_1,  . . . ,  y^(n-1)(x_0) = y_{n-1}.

It is also worth noting that Theorem 3-2 can be phrased in the language of linear
operators, in which case it assumes the following suggestive form:

If L: C^n(I) → C(I) is a normal nth-order linear differential operator, there exists
a unique inverse operator G: C(I) → C^n(I) such that

(i) L[G(h)] = h, for all h in C(I),

and

(ii) G(h)(x_0) = y_0,  G(h)'(x_0) = y_1,  . . . ,  G(h)^(n-1)(x_0) = y_{n-1}.

When stated in these terms it is clear that the task of solving an initial-value
problem for a normal linear differential equation comes down to finding an explicit
form for the inverse operator G, since once G is known the problem

Ly = h;    y(x_0) = y_0,  . . . ,  y^(n-1)(x_0) = y_{n-1}

can be solved by computing the value of G(h). This point of view will be exploited
in later chapters where much of our work will be directed toward finding G for
specific classes of linear differential operators. As we shall see, G will turn out to
be an integral operator of the type considered in Example 2 of Section 2-2.
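Although the text develops G analytically in later chapters, the content of Theorem 3-1 is easy to illustrate numerically. The Python sketch below is a hedged illustration, not anything from the text: the equation y' + 2y = 0 and the initial value y(0) = 3 are chosen arbitrarily, and the availability of `scipy` is assumed. It approximates the unique solution promised by the theorem and compares it with the known solution y = 3e^{-2x}.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Sample first-order initial-value problem: y' + 2y = 0, y(0) = 3.
# Theorem 3-1 guarantees exactly one solution; here it is y(x) = 3 e^{-2x}.
sol = solve_ivp(lambda x, y: -2.0 * y, (0.0, 2.0), [3.0],
                dense_output=True, rtol=1e-10, atol=1e-12)

xs = np.linspace(0.0, 2.0, 9)
numeric = sol.sol(xs)[0]                 # numerically constructed solution
exact = 3.0 * np.exp(-2.0 * xs)          # the unique analytic solution
max_err = np.max(np.abs(numeric - exact))
```

The tiny value of `max_err` reflects the fact that the numerical integrator is tracking the one and only solution through (0, 3).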

* For a proof the reader should consult E. Coddington, An Introduction to Ordinary
Differential Equations, Prentice-Hall, Englewood Cliffs, N.J., 1961.
3-4 | EXISTENCE AND UNIQUENESS OF SOLUTIONS 105

EXERCISES
Find the solution of each of the following initial-value problems and specify the domain
of the solution.

1. xy' + 2y = 0, y(1) = -1
2. (sin x)y' + (cos x)y = 0, y(π/2) = 2

3. 2y' + 3y = e^{-x}, y(-3) = -3

4. (x^2 + 1)y' - (1 - x^2)y = e^{-x}, y(-2) = 0

5. x sin x (dy/dx) + (sin x + x cos x)y = e^x, y(-π/2) = 0

6. x^2 y' + y/√(1 - x^2) = 1 + √(1 - x^2), y(1/2) = 0

7. (1 + sin x)(dy/dx) + (cot x)y = cos x, y(π/2) = 1

Use the given general solution to solve each of the following initial-value problems.
8. y'' - k^2 y = 0, y(0) = 0, y'(0) = 1;  y = c_1 sinh kx + c_2 cosh kx, k ≠ 0
9. (1 - x^2)y'' - 2xy' = 0, y(-2) = 0, y'(-2) = 1;  y = c_1 + c_2 ln |(x + 1)/(x - 1)|

10. xy'' + y' + xy = 0, y(1) = y'(1) = 1;  y = c_1 J_0(x) + c_2 Y_0(x), where J_0 and
Y_0 are linearly independent solutions of the equation on (0, ∞).

11. 4x^2 y'' + 4xy' + (4x^2 - 1)y = 0, y(π/2) = -1, y'(π/2) = 0;

y = (c_1 sin x + c_2 cos x)/√(πx)

12. y'' + (tan x - 2 cot x)y' = 0, y(π/4) = y'(π/4) = 1;  y = c_1 + c_2 sin^3 x

13. xy'' + y' = 0, y(-2) = y'(-2) = 1;  y = c_1 + c_2 ln |x|

14. (a) Let y_1 and y_2 be solutions of a normal first-order linear differential equation on
an interval I. Prove that y_1 - y_2 is either identically zero or is different from zero
everywhere on I.
(b) Use the result in (a) to deduce that every initial-value problem involving a normal
first-order linear differential equation has at most one solution.
15. Give an example to show that the conclusion of Theorem 3-2 fails when the hypothesis
of normality is not satisfied.

16. Let y_1 and y_2 be distinct solutions of a normal first-order linear differential equation
on an interval I. Prove that the general solution of the equation on I is

(y - y_1)/(y_1 - y_2) = c,

where c is an arbitrary constant. [Hint: See Exercise 14(a).]

17. Prove that a nontrivial solution of a homogeneous first-order linear differential
equation cannot intersect the x-axis. [Hint: Use Theorem 3-1.]


18. Use the results of this section to prove that y = c_1 sin x + c_2 cos x is the general
solution of y'' + y = 0 on (-∞, ∞). [Hint: If u(x) is any solution on (-∞, ∞),
show that c_1 and c_2 can be chosen so that y(0) = u(0), y'(0) = u'(0), and then
apply Theorem 3-2.]
19. Show that two distinct solutions of a normal first-order linear differential equation
cannot have a point of intersection.
20. Prove that every nontrivial solution u(x) of a normal second-order linear differential
equation

a_2(x)y'' + a_1(x)y' + a_0(x)y = 0

has only simple zeros. [A point x_0 is said to be a zero of a function u(x) if and only if
u(x_0) = 0. A zero of u(x) is simple if and only if u'(x_0) ≠ 0.]


21. Prove that distinct solutions of a normal second-order linear differential equation
have no point of mutual tangency.
22. Given the linear differential operator D, find the inverse operator G which satisfies
G(h)(x_0) = y_0.

23. Given the linear differential operator D - k (k a constant), find the inverse operator
G which satisfies G(h)(x_0) = y_0.

3-5 DIMENSION OF THE SOLUTION SPACE


In this section we shall use the existence and uniqueness theorem stated above to
give a simple yet elegant proof of the fact that the dimension of the solution space
of every normal homogeneous linear differential equation is equal to the order of
the equation. The reader should note, however, that this result fails in the case of
an equation whose leading coefficient vanishes somewhere in the interval under
consideration (witness the equation xy' + y = 0 on an interval containing the
origin).
This said, we now prove
Theorem 3-3. The solution space of any normal nth-order homogeneous
linear differential equation

a_n(x) d^n y/dx^n + · · · + a_0(x)y = 0    (3-33)

defined on an interval I is an n-dimensional subspace of C(I).

Proof. Let x_0 be a fixed point in I. Then by Theorem 3-2 we know that this
equation admits solutions y_1(x), . . . , y_n(x) which, respectively, satisfy the
initial conditions

y_1(x_0) = 1,  y_1'(x_0) = 0,  . . . ,  y_1^(n-1)(x_0) = 0,
y_2(x_0) = 0,  y_2'(x_0) = 1,  . . . ,  y_2^(n-1)(x_0) = 0,    (3-34)
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
y_n(x_0) = 0,  y_n'(x_0) = 0,  . . . ,  y_n^(n-1)(x_0) = 1.

In other words, y_1(x), . . . , y_n(x) have the property that the vectors

(y_i(x_0), y_i'(x_0), . . . , y_i^(n-1)(x_0)),    i = 1, . . . , n,

are the standard basis vectors in R^n.* (See Fig. 3-3.) We assert that these solutions
are a basis for the solution space of (3-33).

FIGURE 3-3

Indeed, suppose that c_1, . . . , c_n are real numbers such that

c_1 y_1(x) + · · · + c_n y_n(x) = 0

on I. Then this identity, together with its first n - 1 derivatives, yields the system

c_1 y_1(x) + c_2 y_2(x) + · · · + c_n y_n(x) = 0,
c_1 y_1'(x) + c_2 y_2'(x) + · · · + c_n y_n'(x) = 0,    (3-35)
. . . . . . . . . . . . . . . . . . . . . . . . . .
c_1 y_1^(n-1)(x) + c_2 y_2^(n-1)(x) + · · · + c_n y_n^(n-1)(x) = 0.

Setting x = x_0, we obtain

c_1 y_1(x_0) + c_2 y_2(x_0) + · · · + c_n y_n(x_0) = 0,
c_1 y_1'(x_0) + c_2 y_2'(x_0) + · · · + c_n y_n'(x_0) = 0,    (3-36)
. . . . . . . . . . . . . . . . . . . . . . . . . .
c_1 y_1^(n-1)(x_0) + c_2 y_2^(n-1)(x_0) + · · · + c_n y_n^(n-1)(x_0) = 0,

and (3-34) now implies that c_1 = c_2 = · · · = c_n = 0. Thus y_1(x), . . . , y_n(x)
are linearly independent in C(I).

It remains to prove that every solution of (3-33) can be written as a linear
combination of y_1(x), . . . , y_n(x). To this end let y(x) be an arbitrary solution
of the equation and suppose that

y(x_0) = a_0,  y'(x_0) = a_1,  . . . ,  y^(n-1)(x_0) = a_{n-1}.    (3-37)

* This choice of solutions has been illustrated in Fig. 3-3 for a second-order equation,
in which case

(y_1(x_0), y_1'(x_0)) = (1, 0),    (y_2(x_0), y_2'(x_0)) = (0, 1).

Then by the uniqueness statement in Theorem 3-2 we know that y(x) is the
solution of (3-33) which satisfies these particular initial conditions. But, using
(3-34) again, we see that the function

a_0 y_1(x) + a_1 y_2(x) + · · · + a_{n-1} y_n(x)

also satisfies this initial-value problem. Hence

y(x) = a_0 y_1(x) + a_1 y_2(x) + · · · + a_{n-1} y_n(x),

and it follows that y_1(x), . . . , y_n(x) span the solution space of (3-33). With this,
the proof is complete. ∎

We call the reader's attention to the fact that the particular numerical values
used to fix the solutions y_1(x), . . . , y_n(x) did not really play an essential role in
the argument given above. Indeed, the success of the proof depended only on the
linear independence of the vectors

(y_i(x_0), y_i'(x_0), . . . , y_i^(n-1)(x_0)),    i = 1, . . . , n,

in R^n, and the choice made in (3-34) merely served to simplify our computations.
For as long as these vectors are linearly independent, the system of homogeneous
linear equations (3-36) will have only the trivial solution, and the c_i will be zero,
as required.

Example 1. The second-order equation

d^2 y/dx^2 - y = 0    (3-38)

is normal on the entire x-axis, and thus its solution space is a 2-dimensional
subspace of C(-∞, ∞). Moreover, it is easy to show that the functions

y_1(x) = ½(e^x + e^{-x}) = cosh x,
y_2(x) = ½(e^x - e^{-x}) = sinh x

are solutions of (3-38) on (-∞, ∞), and since

y_1(0) = 1,  y_1'(0) = 0,
y_2(0) = 0,  y_2'(0) = 1,

the argument given above implies that cosh x and sinh x are a basis for the solution
space of this equation. Thus the general solution of (3-38) is

y = c_1 cosh x + c_2 sinh x,

where c_1 and c_2 are arbitrary constants.



Example 2. The functions

y_1(x) = e^x,    y_2(x) = e^{-x}

provide a second pair of solutions of Eq. (3-38). In this case

y_1(0) = 1,  y_1'(0) = 1,
y_2(0) = 1,  y_2'(0) = -1,

and since the vectors (1, 1) and (1, -1) are linearly independent in R^2, e^x and
e^{-x} also form a basis for the solution space of the equation. It follows that the
general solution of (3-38) may also be written

y = c_1 e^x + c_2 e^{-x},

which, of course, is a variant of the solution obtained above.

Example 3. The functions

y_1(x) = sin 2x,    y_2(x) = cos 2x

are solutions of the normal second-order equation

d^2 y/dx^2 + 4y = 0    (3-39)

on (-∞, ∞). Furthermore, y_1(0) = 0, y_1'(0) = 2, while y_2(0) = 1, y_2'(0) = 0,
and since (0, 2) and (1, 0) are linearly independent vectors in R^2, we conclude
that sin 2x and cos 2x are linearly independent in C(-∞, ∞). Hence they are a
basis for the solution space of (3-39), and the general solution of that equation is

y = c_1 sin 2x + c_2 cos 2x.
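Matching the initial data of an arbitrary solution against such a basis, as in the proof of Theorem 3-3, is just a 2 × 2 linear-algebra computation. The following Python sketch (with `numpy` assumed available; the target data u(0) = 3, u'(0) = -4 are sample values invented for the illustration) uses the initial-condition vectors of Example 3.

```python
import numpy as np

# Columns are the initial-condition vectors of y1 = sin 2x and y2 = cos 2x
# at x0 = 0, namely (y_i(0), y_i'(0)) = (0, 2) and (1, 0).
A = np.array([[0.0, 1.0],
              [2.0, 0.0]])

# Hypothetical target solution with u(0) = 3, u'(0) = -4.
u0, u1 = 3.0, -4.0
c1, c2 = np.linalg.solve(A, np.array([u0, u1]))

# Check that y = c1 sin 2x + c2 cos 2x reproduces the initial data.
y0 = c1 * np.sin(0.0) + c2 * np.cos(0.0)
dy0 = 2 * c1 * np.cos(0.0) - 2 * c2 * np.sin(0.0)
```

Since the two columns are linearly independent, the system is solvable for any choice of initial data, which is exactly why the pair forms a basis.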

At this point it is impossible to escape the conclusion that in proving Theorem
3-3 we also established a method for testing functions for linear independence.
This fact is well worth bringing out into the open, since it will be used in the
following sections to obtain a number of important results concerning (normal)
linear differential equations. Specifically, we have

Corollary 3-1. Let y_1(x), . . . , y_n(x) be functions in C(I), each of which
possesses derivatives up to and including those of order n - 1 everywhere
in I, and suppose that at some point x_0 in I the vectors

(y_i(x_0), y_i'(x_0), . . . , y_i^(n-1)(x_0)),    i = 1, . . . , n,    (3-40)

are linearly independent in R^n. Then y_1(x), . . . , y_n(x) are linearly independent
in C(I).


Example 4. The functions

e^x,    xe^x,    x^2 e^x

are linearly independent in C(-∞, ∞), since the above test applied at x = 0
yields the linearly independent vectors (1, 1, 1), (0, 1, 2), (0, 0, 2) in R^3.
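The test of Corollary 3-1 is mechanical enough to automate. The Python sketch below (with `sympy` assumed available) rebuilds the matrix whose columns are the vectors (3-40) for the three functions of Example 4 at x_0 = 0 and checks that its determinant is nonzero.

```python
import sympy as sp

x = sp.symbols('x')
funcs = [sp.exp(x), x * sp.exp(x), x**2 * sp.exp(x)]

# Entry (k, i) is the kth derivative of the ith function at x0 = 0, so the
# ith column is the vector (y_i(0), y_i'(0), y_i''(0)) of Corollary 3-1.
M = sp.Matrix([[sp.diff(f, x, k).subs(x, 0) for f in funcs]
               for k in range(3)])
det = M.det()   # nonzero determinant means the columns are independent
```

A nonzero determinant of this matrix is exactly the statement that the three vectors are linearly independent in R^3, hence that the functions are linearly independent.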

EXERCISES

For each of the following homogeneous linear differential equations, (a) show that the
given functions span the solution space of the equation on an appropriate interval,
(b) choose a basis for the solution space from among these functions and use it to express
the general solution of the equation, and (c) find a basis for the solution space of the
equation which satisfies initial conditions of the form (3-34) at the given point x_0.

1. y''' - y'' - 2y' = 0;  e^{-x}, sinh x - ½e^x, 2e^{2x}, 1;  x_0 = 0

2. y''' - y' = 0;  cosh x, e^{-x} + sinh x, e^x + sinh x, cosh x - 1;  x_0 = 0


3. (D^4 - 4D^3 + 7D^2 - 6D + 2)y = 0;
xe^x, (1 + x)e^x, e^x(1 - sin x), e^x(x + sin x), e^x cos x;  x_0 = 0
2
4. jcV" + 2x y" - xy' + y = 0;

x + - x + x In x, - + x In - x(l - In x); xo = e
x
>

xx ^-^>
»

5. (x^2 D^2 - 2)y = 0;  2x^2, -1/x, 3x^2;  x_0 = 1

6. y'' - (2x/(1 - x^2))y' = 0;  1 - tanh^{-1} x, 1 + ln((1 + x)/(1 - x)), 2;  x_0 = 0

7. Use the equation xy' + y = 0 to show that Theorem 3-3 fails if the hypothesis of
normality is not satisfied.

8. Show that the following functions are linearly independent in C(I) for the given
interval I.

(a) e^{ax}, e^{bx} (a ≠ b);  (-∞, ∞)
(b) 1, x, x^2;  (-∞, ∞)
(c) 1, ln |x|;  (1, ∞)
(d) x, x ln x;  (0, ∞)
(e) e^{ax} sin bx, e^{ax} cos bx (b ≠ 0);  (-∞, ∞)

9. Prove that the functions 1, x, x^2, . . . , x^n are linearly independent in C(I) for any
interval I.

*10. Suppose that y_1(x), . . . , y_n(x) are solutions of a normal homogeneous linear
differential equation of order m > n on an interval I. Prove that if y_1(x), . . . , y_n(x)
are linearly independent in C(I), they are linearly independent in C(I') for any
subinterval I' of I. Give an example to show that this result fails if y_1(x), . . . , y_n(x)
do not satisfy such an equation.

3-6 THE WRONSKIAN

In the preceding section we proved that y_1, . . . , y_n are linearly independent
functions in C^{n-1}(I) whenever there exists a point x_0 in I such that the vectors

(y_i(x_0), y_i'(x_0), . . . , y_i^(n-1)(x_0)),    i = 1, . . . , n,    (3-41)

are linearly independent in R^n. For our present purposes this result can be stated
more conveniently in terms of the determinant of a certain matrix, as follows:
Let y_1, . . . , y_n be arbitrary functions in C^{n-1}(I), and for each x in I consider
the matrix

[ y_1(x)         y_2(x)         · · ·   y_n(x)         ]
[ y_1'(x)        y_2'(x)        · · ·   y_n'(x)        ]    (3-42)
[ . . . . . . . . . . . . . . . . . . . . . . . . . .  ]
[ y_1^(n-1)(x)   y_2^(n-1)(x)   · · ·   y_n^(n-1)(x)   ]

Then (3-42) defines a function on the interval I whose value at x is the indicated
matrix, and by forming the determinant of this matrix we obtain a real valued
function on I known as the Wronskian of y_1, . . . , y_n. This function will be
denoted by W[y_1, . . . , y_n] to indicate its dependence on y_1, . . . , y_n, and its
value at x by W[y_1(x), . . . , y_n(x)]. In short, the Wronskian of y_1, . . . , y_n is
the (real valued) function whose defining equation is

                            | y_1(x)         y_2(x)         · · ·   y_n(x)         |
W[y_1(x), . . . , y_n(x)] = | y_1'(x)        y_2'(x)        · · ·   y_n'(x)        |    (3-43)
                            | . . . . . . . . . . . . . . . . . . . . . . . . . .  |
                            | y_1^(n-1)(x)   y_2^(n-1)(x)   · · ·   y_n^(n-1)(x)   |

For example,

W[x, sin x] = | x   sin x |  = x cos x - sin x,
              | 1   cos x |

and

W[x, 2x] = | x   2x |  = 0.
           | 1   2  |
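These 2 × 2 Wronskians can be confirmed symbolically. A minimal Python sketch follows (with `sympy` assumed available; `wronskian_2` is a helper written for this illustration, not a library routine).

```python
import sympy as sp

x = sp.symbols('x')

def wronskian_2(f, g):
    # The 2x2 case of Eq. (3-43): det [[f, g], [f', g']] = f g' - g f'.
    return sp.simplify(f * sp.diff(g, x) - g * sp.diff(f, x))

w1 = wronskian_2(x, sp.sin(x))   # expected: x*cos(x) - sin(x)
w2 = wronskian_2(x, 2 * x)       # expected: 0
```

That `w2` vanishes identically reflects the obvious linear dependence of x and 2x.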

We now recall that the determinant of an n × n matrix is nonzero if and only if
the columns of the matrix are linearly independent vectors in R^n (see Theorem
III-8). Thus the Wronskian of y_1, . . . , y_n is different from zero at x_0 if and
only if the columns of (3-42) are linearly independent when x = x_0. But for
each x in I the columns of (3-42) are none other than the vectors in (3-41), and
we therefore have the following theorem.

Theorem 3-4. The functions y_1, . . . , y_n are linearly independent in C^{n-1}(I),
and hence also in C(I), whenever their Wronskian is not identically zero on I.

Example 1. Since

W[e^x, e^{-x}] = | e^x    e^{-x}  |  = -2,
                 | e^x   -e^{-x}  |

the functions e^x and e^{-x} are linearly independent in C(I) for any interval I.

Example 2. The functions x, x^{1/2}, x^{3/2} are linearly independent in C(I) for
any subinterval I of the positive x-axis, since

W[x, x^{1/2}, x^{3/2}] = | x    x^{1/2}          x^{3/2}         |
                         | 1    ½x^{-1/2}        (3/2)x^{1/2}    |  = -¼.
                         | 0   -¼x^{-3/2}        (3/4)x^{-1/2}   |

More generally, x^α, x^β, x^γ are linearly independent in C(I), I as above, if and only
if α, β, γ are distinct real numbers (see Exercises 13 and 14 below).
Example 3. The functions x^3 and |x|^3 are linearly independent in C(-∞, ∞),
for if c_1 x^3 + c_2 |x|^3 = 0, then

c_1(1)^3 + c_2|1|^3 = 0,
c_1(-1)^3 + c_2|-1|^3 = 0,

and c_1 = c_2 = 0. On the other hand, the Wronskian of x^3 and |x|^3 is identically
zero on (-∞, ∞), since

W[x^3, |x|^3] = | x^3     x^3   |  = 0
                | 3x^2    3x^2  |

if x > 0, and

W[x^3, |x|^3] = | x^3     -x^3   |  = 0
                | 3x^2    -3x^2  |

if x < 0. Thus the converse of Theorem 3-4 is false, and one cannot deduce the
linear dependence of a set of functions in C(I) from the fact that their Wronskian
vanishes identically on I. (See Fig. 3-4.)

FIGURE 3-4

This example notwithstanding, it is true that the Wronskian of a linearly
dependent set of functions in C(I) vanishes identically on I, provided, of course,
that the Wronskian exists in the first place. Hence, rather than abandon the
search for a converse to Theorem 3-4, we weaken our requirements, and ask
whether it is possible to impose additional conditions on a set of functions which,
together with the vanishing of their Wronskian, will imply linear dependence.

This can in fact be done, simply by requiring that the functions be solutions of a
homogeneous linear differential equation. We prove this assertion as

Theorem 3-5. Let y_1, . . . , y_n be solutions of a normal nth-order homogeneous
linear differential equation

a_n(x) d^n y/dx^n + · · · + a_0(x)y = 0    (3-44)

on an interval I, and suppose that W[y_1, . . . , y_n] is identically zero on I.
Then y_1, . . . , y_n are linearly dependent in C(I).

Proof. Let x_0 be any point in I, and consider the system of equations

c_1 y_1(x_0) + · · · + c_n y_n(x_0) = 0,
c_1 y_1'(x_0) + · · · + c_n y_n'(x_0) = 0,    (3-45)
. . . . . . . . . . . . . . . . . . .
c_1 y_1^(n-1)(x_0) + · · · + c_n y_n^(n-1)(x_0) = 0,

in the unknowns c_1, . . . , c_n. Since the Wronskian of y_1, . . . , y_n vanishes
identically on I, the determinant of (3-45) is zero, and the system has a nontrivial
solution (c_1, . . . , c_n). (See Theorem III-9.) Thus the function

y(x) = c_1 y_1(x) + · · · + c_n y_n(x)

is a solution of the initial-value problem consisting of (3-44) and initial conditions

y(x_0) = 0,  y'(x_0) = 0,  . . . ,  y^(n-1)(x_0) = 0.

But the zero function is also a solution of this problem, and hence Theorem 3-2
implies that

c_1 y_1(x) + · · · + c_n y_n(x) = 0

for all x in I. The linear dependence of y_1, . . . , y_n now follows from the fact
that the c_i are not all zero. ∎

Once again we have established a result which is stronger than the one advertised.
For the above proof only made use of the fact that the Wronskian of
y_1, . . . , y_n vanished at a single point in I, and hence the conclusion remains true
under this weaker hypothesis. Combined with Theorem 3-4 this observation
yields

Theorem 3-6. A set of solutions of a normal nth-order homogeneous linear
differential equation is linearly independent in C(I), and hence is a basis for
the solution space of the equation, if and only if its Wronskian never vanishes
on I.

Example 4. By direct substitution the student can verify that sin^3 x and 1/sin^2 x
are solutions of

d^2 y/dx^2 + (tan x) dy/dx - 6(cot^2 x)y = 0    (3-46)

on any interval I in which tan x and cot x are both defined. Moreover,

W[sin^3 x, 1/sin^2 x] = | sin^3 x           1/sin^2 x          |  = -5 cos x.
                        | 3 sin^2 x cos x   -2 cos x/sin^3 x   |

Since cos x is never zero on I, the above theorem implies that sin^3 x and 1/sin^2 x
are linearly independent in C(I), and the general solution of (3-46) therefore is

y = c_1 sin^3 x + c_2/sin^2 x.

(Note that this result can also be obtained directly from Theorem 3-4.)
Example 5. The functions

y_1(x) = sin^3 x,    y_2(x) = sin x - ⅓ sin 3x

are solutions of

d^2 y/dx^2 + (tan x - 2 cot x) dy/dx = 0    (3-47)

on any interval I in which tan x and cot x are defined. But

W[y_1(x), y_2(x)] = | sin^3 x           sin x - ⅓ sin 3x |
                    | 3 sin^2 x cos x   cos x - cos 3x   |

  = sin^3 x (cos x - cos 3x) - 3 sin^2 x cos x (sin x - ⅓ sin 3x)
  = sin^2 x (sin 3x cos x - sin x cos 3x) - sin^2 x (2 sin x cos x)
  = sin^2 x sin 2x - sin^2 x sin 2x = 0.

Hence y_1 and y_2 are linearly dependent in C(I), and do not form a basis for the
solution space of (3-47).* In this case, however, it is clear that any constant c
is a solution of (3-47), and since c and sin^3 x are obviously linearly independent
in C(I), the general solution of the equation is

y = c_1 + c_2 sin^3 x.

Of course, this expression may also be written

y = c_1 + c_2(sin x - ⅓ sin 3x).

* In fact, sin^3 x = ¾ sin x - ¼ sin 3x.



EXERCISES

By computing Wronskians, show that each of the following sets of functions is linearly
independent in C(I) for the indicated interval I.

1. 1, e^{-x}, 2e^{2x}  on any interval I
2. e^x, sin 2x  on any interval I
3. 1, x, x^2, . . . , x^n  on any interval I
4. ln x, x ln x  on (0, ∞)
5. x^{1/2}, x^{1/3}  on (0, ∞)
6. e^{ax} sin bx, e^{ax} cos bx (b ≠ 0)  on any interval I
7. e^x, e^x sin x  on any interval I
8. e^{-x}, xe^{-x}, x^2 e^{-x}  on any interval I
9. 1, sin^2 x, 1 - cos x  on any interval I
10. x - 1, …  on (-∞, -1)
11. √(1 - x^2), x  on (-1, 1)
12. sin (x/2), cos^2 x  on any interval I

13. Show that x^α, x^β, x^γ are linearly independent in C(0, ∞) if and only if α, β, γ are
distinct real numbers. [Hint: If α, β, γ are distinct and c_1 x^α + c_2 x^β + c_3 x^γ = 0
on (0, ∞), show that c_1 = c_2 = c_3 = 0 by considering what happens as x tends
to ∞.]
14. Show that x^α, x^β, x^γ are linearly independent in C(0, ∞) if and only if they are
linearly independent in C(I) for every subinterval I of (0, ∞). [Hint: First establish
both of the following assertions, and then use Theorem 3-6:
(a) x^α, x^β, x^γ satisfy the linear differential equation

x^3 y''' + a_2 x^2 y'' + a_1 xy' + a_0 y = 0,

where a_2 = 3 - α - β - γ, a_1 = 1 - α - β - γ + αβ + αγ + βγ, a_0 = -αβγ.

(b) W(x^α, x^β, x^γ) = x^{α+β+γ-3} | 1         1         1         |
                                   | α         β         γ         |
                                   | α(α - 1)  β(β - 1)  γ(γ - 1)  |

and hence W(x^α, x^β, x^γ) either vanishes nowhere in (0, ∞) or vanishes identically.]

15. Generalize the results of Exercises 13 and 14(b) to show that x^{a_1}, . . . , x^{a_n} are
linearly independent in C(I) for any subinterval I of (0, ∞) if and only if a_1, . . . , a_n
are distinct real numbers.

16. Let f belong to C^1[a, b], and suppose that f is not the zero function. By computing
their Wronskian, show that f(x) and xf(x) are linearly independent in C^1[a, b].
(Also see Exercise 17.)
17. Show that f(x) and xf(x) are linearly independent in C[a, b] if f is continuous and
not identically zero on [a, b].

18. Suppose that f is an odd function in C^1[-a, a] (that is, f(-x) = -f(x)) and that
f(0) = f'(0) = 0. Show that

W[f(x), |f(x)|] = 0

for all x in [-a, a], but that f and |f| are linearly independent in C^1[-a, a] unless
f is identically zero. Compare this result with Example 3 in the text.
19. Let f, g be any two functions in C^1(I), and suppose that g never vanishes in I.
Prove that if W[f(x), g(x)] = 0 on I, then f and g are linearly dependent in C^1(I).
[Hint: Calculate d/dx (f(x)/g(x)).]
*20. Let f, g be any two functions in C^1(I) which have only finitely many zeros in I and
have no common zeros. Prove that if W[f(x), g(x)] = 0 on I, then f, g are linearly
dependent in C^1(I). [Hint: Apply the result of Exercise 19 to the finite number of
subintervals of I on which f or g never vanishes.]
*21. (a) Show that

W[e^{a_1 x}, . . . , e^{a_n x}] = e^{(a_1 + · · · + a_n)x} | 1            1            · · ·  1            |
                                                           | a_1          a_2          · · ·  a_n          |
                                                           | . . . . . . . . . . . . . . . . . . . . . .  |
                                                           | a_1^{n-1}    a_2^{n-1}    · · ·  a_n^{n-1}    |

(b) The determinant appearing in (a) is known as a Vandermonde determinant.
Show that it is zero if and only if a_i = a_j for some pair of indices i, j with i ≠ j.
[Hint: Expand the determinant by cofactors of the first column to obtain a polynomial
in a_1. Is a_2 a root of this polynomial?]
*22. Prove that if u_1, . . . , u_n is a basis for the solution space of the linear differential
equation

y^(n) + a_{n-1}(x)y^(n-1) + · · · + a_0(x)y = 0,

then the a_i(x), 0 ≤ i ≤ n - 1, are uniquely determined by u_1, . . . , u_n. [Hint: For
each index i, let u_{i,x_0} be the solution of the given equation which satisfies the initial
conditions u_{i,x_0}^(j)(x_0) = 0, 0 ≤ j ≤ n - 1, j ≠ i, and u_{i,x_0}^(i)(x_0) = -1. Then
a_i(x_0) = u_{i,x_0}^(n)(x_0).]

23. Let u_1 and u_2 be linearly independent solutions of the normal second-order linear
differential equation

y'' + a_1(x)y' + a_0(x)y = 0.

Express the coefficients a_0(x), a_1(x) in terms of u_1 and u_2. [Hint: Let y be an
arbitrary solution of the equation, and consider the Wronskian of y, u_1, u_2.]
24. Generalize the result of the preceding exercise to an nth-order equation

y^(n) + a_{n-1}(x)y^(n-1) + · · · + a_0(x)y = 0.

25. Use the results of Exercise 23 to find a homogeneous second-order linear differential
equation whose solution space has the following functions as a basis.
(a) x, xe^x    (b) x, x^2
(c) sin x, cos x    (d) x, sin x
(e) x, ln x

3-7 ABEL'S FORMULA


According to Theorem 3-6 the Wronskian of a set of solutions of a normal
homogeneous linear differential equation either vanishes identically or not at all.

This fact can also be deduced from the following theorem, which gives an explicit
formula for the Wronskian in this case.

Theorem 3-7. Let y_1, . . . , y_n be solutions on an interval I of the nth-order
equation

a_n(x) d^n y/dx^n + a_{n-1}(x) d^{n-1}y/dx^{n-1} + · · · + a_0(x)y = 0,    (3-48)

and suppose that a_n(x) ≠ 0 everywhere in I. Then

W[y_1(x), . . . , y_n(x)] = c e^{-∫[a_{n-1}(x)/a_n(x)]dx}    (3-49)

for an appropriate constant c. (This result is known as Abel's formula for
the Wronskian.)

Proof. In order to avoid using general properties of determinants, we shall prove
(3-49) only in the case n = 2. The general proof is identical, except that it uses
the formula for the derivative of an nth-order determinant (see Exercise 9).
Thus suppose that y_1 and y_2 are solutions of

a_2(x) d^2 y/dx^2 + a_1(x) dy/dx + a_0(x)y = 0

on an interval I in which a_2(x) does not vanish. Then

d/dx W[y_1(x), y_2(x)] = d/dx | y_1(x)    y_2(x)  |
                              | y_1'(x)   y_2'(x) |

  = d/dx [y_1(x)y_2'(x) - y_2(x)y_1'(x)]
  = y_1(x)y_2''(x) - y_1''(x)y_2(x),

and since y_i'' = -(a_1/a_2)y_i' - (a_0/a_2)y_i for i = 1, 2, this equals

  -(a_1(x)/a_2(x)) [y_1(x)y_2'(x) - y_2(x)y_1'(x)]
  = -(a_1(x)/a_2(x)) W[y_1(x), y_2(x)].

Thus W[y_1(x), y_2(x)] is differentiable on I, and satisfies the first-order linear
differential equation

dy/dx + (a_1(x)/a_2(x)) y = 0.*

But the general solution of this equation is

y = c e^{-∫[a_1(x)/a_2(x)]dx},

and hence, for an appropriate value of c, we have

W[y_1(x), y_2(x)] = c e^{-∫[a_1(x)/a_2(x)]dx},

as asserted. ∎

If x_0 is any fixed point in I, and W(x_0) denotes the value of the Wronskian
of y_1, . . . , y_n at x_0, then Abel's formula may be written in the form

W[y_1(x), . . . , y_n(x)] = W(x_0) e^{-∫_{x_0}^{x} [a_{n-1}(t)/a_n(t)]dt}.    (3-50)

This formula shows that the Wronskian of any basis for the solution space of a
normal homogeneous linear differential equation is determined up to a multiplicative
constant by the equation itself, and does not depend upon the particular
basis used to compute it. This simple observation will be important later on.

Example 1. Since the equation

x d^2 y/dx^2 + dy/dx + xy = 0

is normal on (0, ∞), the Wronskian of any two solutions y_1, y_2 of this equation
must be of the form

W[y_1(x), y_2(x)] = c e^{-∫dx/x} = c/x.

If, in addition, y_1 and y_2 satisfy the initial conditions

y_1(x_0) = a_0,  y_1'(x_0) = a_1,
y_2(x_0) = b_0,  y_2'(x_0) = b_1,

at some point x_0 > 0, then

c = x_0(a_0 b_1 - a_1 b_0),

and

W[y_1(x), y_2(x)] = x_0(a_0 b_1 - a_1 b_0)/x.
* This result actually holds for all n > 2, provided that a_1(x) and a_2(x) are replaced
by a_{n-1}(x) and a_n(x), respectively.
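Abel's formula is easy to check numerically for any particular normal equation. The Python sketch below (with `scipy` assumed available) uses the equation x y'' + y' + x y = 0 purely as an illustration: here a_1/a_2 = 1/x, so W(x) = W(x_0) x_0/x. Two solutions are built from the initial data (1, 0) and (0, 1) at x_0 = 1, and their Wronskian is compared with the formula.

```python
import numpy as np
from scipy.integrate import solve_ivp

# x y'' + y' + x y = 0, written as a first-order system; normal on (0, inf).
def rhs(x, u):
    y, yp = u
    return [yp, -(yp + x * y) / x]

x0, x1 = 1.0, 3.0
s1 = solve_ivp(rhs, (x0, x1), [1.0, 0.0], dense_output=True,
               rtol=1e-10, atol=1e-12)
s2 = solve_ivp(rhs, (x0, x1), [0.0, 1.0], dense_output=True,
               rtol=1e-10, atol=1e-12)

xs = np.linspace(x0, x1, 11)
y1, y1p = s1.sol(xs)
y2, y2p = s2.sol(xs)
W = y1 * y2p - y2 * y1p      # Wronskian of the two computed solutions
abel = 1.0 * x0 / xs         # Abel's formula with W(x0) = 1
err = np.max(np.abs(W - abel))
```

The agreement holds even though neither solution is elementary, which is the practical content of (3-50): the Wronskian is known from the coefficients alone.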

As our first substantial application of Theorem 3-7 we shall use Abel's formula
to find the general solution of a second-order homogeneous linear differential
equation given one nontrivial solution of the equation. Thus let y_1 ≠ 0 be a
solution of

a_2(x)y'' + a_1(x)y' + a_0(x)y = 0    (3-51)

on an interval I in which a_2(x) ≠ 0. Then every solution y_2 of (3-51) must satisfy
the equation

W[y_1(x), y_2(x)] = c e^{-∫[a_1(x)/a_2(x)]dx},    (3-52)

and y_2 can thus be found by solving the nonhomogeneous first-order equation

y_1(x) (dy_2/dx) - y_1'(x) y_2 = c e^{-∫[a_1(x)/a_2(x)]dx}.

By (3-20) the general solution of this equation is

y_2 = c y_1(x) ∫ [e^{-∫[a_1(x)/a_2(x)]dx} / y_1(x)^2] dx + k y_1(x),

where k is an arbitrary constant, and since this formula is valid on any
subinterval of I in which y_1 ≠ 0, it can be used to determine a second solution of
(3-51) on such a subinterval. In particular, the function

y_2(x) = y_1(x) ∫ [e^{-∫[a_1(x)/a_2(x)]dx} / y_1(x)^2] dx

will be such a solution, and is clearly linearly independent of y_1, as desired. Thus
we have proved the following theorem.

Theorem 3-8. If y_1 is a nontrivial solution of Eq. (3-51) on an interval I in
which a_2(x) does not vanish, then

y_2(x) = y_1(x) ∫ [e^{-∫[a_1(x)/a_2(x)]dx} / y_1(x)^2] dx    (3-53)

is a solution of the equation on any subinterval of I in which y_1 ≠ 0. Moreover,
y_2 is linearly independent of y_1, and the general solution of (3-51) is

y = c_1 y_1 + c_2 y_2,

where c_1 and c_2 are arbitrary constants.

Example 2. By direct substitution we find that x^2 is a solution on (0, ∞) of
the second-order equation

x^2 y'' + x^3 y' - 2(1 + x^2)y = 0.    (3-54)

Hence a second linearly independent solution in C^2(0, ∞) can be found by solving
the first-order equation

x^2 y_2' - 2x y_2 = e^{-x^2/2}.

Using Formula (3-53), we obtain

y_2 = x^2 ∫ [e^{-∫x dx} / x^4] dx = x^2 ∫ x^{-4} e^{-x^2/2} dx,

and the general solution of (3-54) on (0, ∞) is

y = x^2 (c_1 + c_2 ∫ x^{-4} e^{-x^2/2} dx).

Here the solution must be left in integral form, since ∫ x^{-4} e^{-x^2/2} dx cannot be
expressed in terms of elementary functions.

Example 3. The function y_1 = 1 is a solution of

y'' + (tan x - 2 cot x)y' = 0    (3-55)

on any interval in which tan x and cot x are both defined. Applying the above
result, we obtain a second solution

y_2(x) = ∫ e^{-∫(tan x - 2 cot x)dx} dx
       = ∫ e^{ln(cos x sin^2 x)} dx
       = ∫ sin^2 x cos x dx
       = ⅓ sin^3 x.

Thus the general solution of (3-55) is

y = c_1 + c_2 sin^3 x,

which agrees with the result obtained at the end of the last section.
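The closing integral in Example 3 can also be verified numerically. In the Python sketch below (with `scipy` assumed available; the base point a = 0.2 is an arbitrary choice inside (0, π/2)), the integrand cos t sin^2 t is integrated and compared with sin^3 x / 3.

```python
import numpy as np
from scipy.integrate import quad

# Reduction of order for y'' + (tan x - 2 cot x) y' = 0 with y1 = 1:
# the integrand of (3-53) reduces to exp(ln(cos t sin^2 t)) = cos t sin^2 t,
# whose antiderivative should be sin^3(x)/3 up to an additive constant.
def integrand(t):
    return np.cos(t) * np.sin(t) ** 2

a = 0.2                                   # arbitrary base point in (0, pi/2)
xs = np.linspace(0.3, 1.2, 7)
y2 = np.array([quad(integrand, a, x)[0] for x in xs])
exact = (np.sin(xs) ** 3 - np.sin(a) ** 3) / 3.0
err = np.max(np.abs(y2 - exact))
```

Changing the base point a merely shifts y_2 by a constant, which is itself a solution of (3-55), so any choice of a is acceptable.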

EXERCISES

1. For each of the following differential equations find the Wronskian of the solutions
y_1, y_2 which satisfy the given initial conditions.

(a) x^2 y'' + xy' + (x^2 + 1)y = 0;  y_1(1) = 0, y_1'(1) = 1, y_2(1) = y_2'(1) = 1
(b) (1 - x^2)y'' - 2xy' + n(n + 1)y = 0 (n a positive integer);
y_1(0) = y_1'(0) = 2, y_2(0) = 1, y_2'(0) = -1
(c) x^2 y'' - 3xy' + y = 0;  y_1(-1) = y_1'(-1) = 2, y_2(-1) = 0, y_2'(-1) = -1
(d) y'' + 2xy = 0;  y_1(0) = y_1'(0) = 1, y_2(0) = 1, y_2'(0) = 0
(e) y'' - (sin x)y' + 3(tan x)y = 0;  y_1(0) = 1, y_1'(0) = 0, y_2(0) = 0, y_2'(0) = 1
(f) √(1 + x^3) y'' - x^2 y' + y = 0;  y_1(1) = 1, y_1'(1) = 0, y_2(1) = -1, y_2'(1) = 1

In Exercises 2 through 8, one solution of the differential equation is given. Find a second
linearly independent solution using the method of this section.

2. y'' - 4y' + 4y = 0;  e^{2x}
3. y'' - 2ay' + a^2 y = 0 (a constant);  e^{ax}
4. 3xy'' - y' = 0;  1
5. y'' + (tan x)y' - 6(cot^2 x)y = 0;  sin^3 x
6. (1 - x^2)y'' - 2xy' = 0;  1
7. (1 - x^2)y'' - 2xy' + 2y = 0;  x
8. 2xy'' - (e^x)y' = 0;  1
9. (a) Prove that if f_ij is in C^1(I), 1 ≤ i, j ≤ 2, then so is the function F defined by

F(x) = | f_11(x)   f_12(x) |
       | f_21(x)   f_22(x) |

and that

dF/dx = | f_11'(x)   f_12(x) |  +  | f_11(x)   f_12'(x) |
        | f_21'(x)   f_22(x) |     | f_21(x)   f_22'(x) |

(b) Generalize the result in (a) to the case of nth-order determinants, and show, in
particular, that

d/dx | f_11(x)   f_12(x)   · · ·   f_1n(x) |
     | f_21(x)   f_22(x)   · · ·   f_2n(x) |
     | . . . . . . . . . . . . . . . . . . |
     | f_n1(x)   f_n2(x)   · · ·   f_nn(x) |

can be expressed as the sum of n determinants, the ith of which is obtained from
|f_ij(x)| by differentiating the functions in the ith column.

*3-8 THE EQUATION y'' + y = 0


By now it should be obvious that the theorems we have in hand are powerful
tools for studying linear differential equations. What is not so obvious, however,
is that they can also be used to obtain detailed information about the solutions
of an individual equation, and before going any further we propose to illustrate
this aspect of our results. Since this method is applied almost exclusively in
rather complicated situations, we have chosen to introduce it by means of an
example which, though somewhat artificial, has the merit of absolute clarity.
In the process we will undoubtedly give the impression of someone who is
resolutely shutting his eyes to the obvious (which, in fact, is what we shall be doing),
but the reader should nevertheless appreciate that the technique in question and
the spirit underlying its application are of considerable importance.
The problem we set for ourselves is to study the solutions of the second-order
equation

y'' + y = 0    (3-56)

without using any information other than that provided by the equation itself
and the general theorems proved above.

In the first place, this equation is normal on (-∞, ∞), and we can therefore
apply Theorem 3-3 to assert that its solution space is spanned by two linearly
independent functions, C(x) and S(x), which are defined for all x and satisfy
the initial conditions

C(0) = 1,  C'(0) = 0,
S(0) = 0,  S'(0) = 1.    (3-57)

Moreover, from the identities

    C''(x) + C(x) = 0,    S''(x) + S(x) = 0,          (3-58)

valid for all x, we conclude that both C(x) and S(x) are infinitely differentiable on (-∞, ∞), and that all of their derivatives are solutions of (3-56). For example, the identity C''(x) + C(x) = 0 implies that C''(x) is differentiable, since C(x) is, and that C'''(x) + C'(x) = 0. But this is just (3-56) again with C'(x) in place of y; S'(x) is treated similarly, and the argument can be repeated to establish the assertion for still higher derivatives. In particular,

    C'''(x) = -C'(x),     S'''(x) = -S'(x),
    C^(iv)(x) = C(x),     S^(iv)(x) = S(x),

and the derivatives of C(x) and S(x) repeat in cycles of four, as expected. Finally, (3-57) and (3-58) imply that

    C'(0) = 0,    C''(0) = -1,
    S'(0) = 1,    S''(0) = 0,

and it follows that C'(x) is the solution of (3-56) which satisfies the initial conditions C'(0) = 0, C''(0) = -1, while S'(x) is the solution which satisfies S'(0) = 1, S''(0) = 0. Thus

    C'(x) = -S(x),    S'(x) = C(x),

and we have evaluated all of the derivatives of C(x) and S(x).



Using these results we can now prove the familiar identity

    S(x)^2 + C(x)^2 = 1.          (3-59)

Indeed, since

    d/dx [S(x)^2 + C(x)^2] = 2S(x)S'(x) + 2C(x)C'(x) = 0,

it follows that

    S(x)^2 + C(x)^2 = k,

k a constant. Setting x = 0 gives k = 1, as asserted. Among other things, (3-59) implies that |C(x)| ≤ 1, |S(x)| ≤ 1 for all x.
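The identity (3-59) can be corroborated numerically without ever identifying C(x) and S(x) in closed form. The following sketch is an illustration only (it assumes numpy and scipy are available): it integrates (3-56) from the initial conditions (3-57) and checks that S(x)^2 + C(x)^2 stays equal to 1 along the computed solutions.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Write y'' + y = 0 as the first-order system (y, y')' = (y', -y).
def rhs(x, u):
    return [u[1], -u[0]]

xs = np.linspace(0.0, 10.0, 201)
# C(x): C(0) = 1, C'(0) = 0;   S(x): S(0) = 0, S'(0) = 1.
C = solve_ivp(rhs, (0, 10), [1.0, 0.0], t_eval=xs, rtol=1e-10, atol=1e-12)
S = solve_ivp(rhs, (0, 10), [0.0, 1.0], t_eval=xs, rtol=1e-10, atol=1e-12)

pythagorean = S.y[0]**2 + C.y[0]**2
print(np.max(np.abs(pythagorean - 1.0)))  # tiny (integration error only)
```

The same computation confirms the bounds |C(x)| ≤ 1 and |S(x)| ≤ 1 along the solution.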

Next, we establish the important addition formulas

    C(a + x) = C(a)C(x) - S(a)S(x),
    S(a + x) = S(a)C(x) + C(a)S(x),          (3-60)

a an arbitrary real number. To do so, we note that

    d^2/dx^2 C(a + x) = -C(a + x)   and   d^2/dx^2 S(a + x) = -S(a + x).

Thus C(a + x) and S(a + x) are solutions of (3-56), and as such must be linear combinations of C(x) and S(x); that is,

    C(a + x) = α1 C(x) + α2 S(x),
    S(a + x) = β1 C(x) + β2 S(x)          (3-61)

for suitable constants α1, α2, β1, β2. To obtain the values of these constants we set x = 0 in (3-61) and in the identities obtained from it by differentiation. This gives α1 = C(a), β1 = S(a), α2 = -S(a), β2 = C(a), and (3-60) has been proved.
In much the same fashion it can be shown that

    C(-x) = C(x),    S(-x) = -S(x),          (3-62)

and we conclude that the graph of C(x) is symmetric about the y-axis, while that of S(x) is symmetric about the origin (see Exercise 1).

At this point we could derive those long and all too familiar lists of trigonometric identities involving C(x) and S(x). However, it is much more instructive to prove that these functions are periodic with period 2π. Here we begin by defining π/2 to be the smallest positive real number at which C(x) = 0. (The proof that such a number exists has been left to the student in Exercise 2.) Then C(x) is positive on the interval (0, π/2), and since S'(x) = C(x), we conclude that S(x) is increasing on that interval. But S(0) = 0, and hence S(x) is also positive on (0, π/2).

Thus by (3-59), S(π/2) = 1, and the addition formulas now give

    C(π) = -1,    C(3π/2) = 0,    C(2π) = 1,
    S(π) = 0,     S(3π/2) = -1,   S(2π) = 0.

Hence

    C(x + 2π) = C(x)C(2π) - S(x)S(2π) = C(x),
    S(x + 2π) = S(x)C(2π) + C(x)S(2π) = S(x),

and the periodicity of C(x) and S(x) has been established.


To complete the discussion it remains to show that 2x is the smallest period
for each of these functions. For C(x) the argument goes as follows: From (3-61)
we obtain
C(x + tt/2) = -S(x),
C(x + x) = -S(x),
C(x + 3x/2) = S(x),

and it is negative on the interval (x/2, 3x/2), positive on


follows that C(jc)
(3x/2, 2x). For similar reasons S(x) is positive on (0, x), negative on (x, 2x).
But since C'(x) = —S(x), C(x) is decreasing on (0, x) and increasing on (x, 2x).
This, together with C(3tt/2) = and C(2x) = 1, implies that 2tt is the smallest
positive real number such that C(x) = 1, and we are done.
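The number π/2 introduced above was defined purely in terms of the differential equation, as the first positive zero of C(x); it can therefore be computed directly from (3-56). The sketch below is an illustration only (scipy assumed): it integrates the equation and locates that first zero by bisection.

```python
from scipy.integrate import solve_ivp
from scipy.optimize import brentq

# Integrate y'' + y = 0 with C(0) = 1, C'(0) = 0, keeping a dense
# interpolant so that C can be evaluated anywhere in [0, 3].
sol = solve_ivp(lambda x, u: [u[1], -u[0]], (0, 3), [1.0, 0.0],
                dense_output=True, rtol=1e-12, atol=1e-12)

def C(x):
    return sol.sol(x)[0]

# C(1) > 0 and C(2) < 0, so the first positive zero lies between them.
half_pi = brentq(C, 1.0, 2.0)
print(half_pi)  # approximately 1.5707963, i.e. pi/2
```

Nothing in the computation uses the trigonometric functions; the familiar value of π/2 emerges from the equation and the initial conditions alone.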

EXERCISES

1. Establish the identities (3-62) by showing that C(-x), S(-x) are solutions of (3-56) and expressing them in terms of the basis C(x), S(x) for the solution space.

*2. Show that there is a least positive real number a such that C(a) = 0. [Hint: Assume the contrary; then argue that
    (a) C(x) > 0 for 0 < x < ∞,
    (b) S(x) > 0 for 0 < x < ∞,
    (c) C(x) is strictly decreasing and concave downwards on (0, ∞),
and derive a contradiction from (a) and (c).* Having established that C(x) has positive zeros, consider the greatest lower bound a of the set of positive zeros of C(x).]

3. Let E(x) be the unique solution on (-∞, ∞) of the initial value problem y' - y = 0, y(0) = 1. Establish the following properties of E(x):

    (a) E(x) has derivatives of all orders, and E^(n)(x) = E(x);
    (b) E(x) > 0 on (0, ∞); [Hint: Otherwise let x_0 be the smallest positive real number for which E(x_0) = 0, and derive a contradiction by applying the mean value theorem. (Why would such a smallest number exist?)]

* To establish these facts rigorously, the intermediate value theorem and mean value theorem from calculus must be applied. The student, however, may give an intuitive argument based on a consideration of the graphs of C(x) and S(x).

    (c) E(x) is strictly increasing and concave upwards on (0, ∞);
    (d) E(a + x) = E(a)E(x) for all real numbers a, x;
    (e) E(-x) = 1/E(x) for every real number x; [Hint: Apply (d).]
    (f) 0 < E(x) < 1 on (-∞, 0);
    (g) lim_{x→∞} E(x) = ∞; [Hint: Use (c).]
    (h) lim_{x→-∞} E(x) = 0;
    (i) Set E(1) = e, and show that E(n) = e^n.
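Before proving the properties listed in Exercise 3, one can explore them numerically. The sketch below is an illustration only (scipy assumed): it integrates y' - y = 0, y(0) = 1, and spot-checks properties (d) and (i) at a few arbitrarily chosen points.

```python
from scipy.integrate import solve_ivp

# Solve y' = y, y(0) = 1 on [0, 3], keeping a dense interpolant so
# that E can be evaluated at arbitrary points of the interval.
sol = solve_ivp(lambda x, y: y, (0, 3), [1.0],
                dense_output=True, rtol=1e-12, atol=1e-12)

def E(x):
    return float(sol.sol(x)[0])

# Property (d): E(a + x) = E(a)E(x), checked at a = 0.7, x = 1.1.
print(abs(E(0.7 + 1.1) - E(0.7) * E(1.1)))   # very small

# Property (i): with e = E(1), E(n) = e^n for n = 2, 3.
e = E(1.0)
print(abs(E(2.0) - e**2), abs(E(3.0) - e**3))  # very small
```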

4. (a) Prove that every solution of a homogeneous linear differential equation with
constant coefficients has derivatives of all orders at every point on the x-axis.
(b) Generalize this result to homogeneous equations with variable coefficients and to
nonhomogeneous equations.
4
equations with constant coefficients

4-1 INTRODUCTION

Linear differential equations with constant coefficients, that is, equations of the form

    a_n y^(n) + a_{n-1} y^(n-1) + · · · + a_0 y = h(x)          (4-1)

in which a_0, ..., a_n are (real) constants, a_n ≠ 0, are in many respects the simplest of all differential equations. For one thing, they can be discussed entirely within the context of linear algebra, and they form the only substantial class of equations of order greater than one which can be explicitly solved. This, plus the fact that such equations arise in a surprisingly wide variety of physical problems, accounts for the special place they occupy in the theory of linear differential equations.
We shall begin the discussion of this chapter by considering the homogeneous version of Eq. (4-1), which can be written in normal form as

    (D^n + a_{n-1} D^{n-1} + · · · + a_0)y = 0,          (4-2)

or as

    Ly = 0,          (4-3)

where L is the constant coefficient linear differential operator D^n + a_{n-1}D^{n-1} + · · · + a_0. Algebraically such operators behave exactly as if they were ordinary polynomials in D, and can therefore be factored according to the rules of elementary algebra. In particular, it follows that every linear differential operator with constant coefficients can be expressed as a product of constant coefficient operators of degrees one and two (see Exercise 2 below). As we shall see, this essentially reduces the task of solving (4-2) to the second-order case, where complete results can be obtained with relative ease.
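Because a constant coefficient operator factors exactly as the corresponding polynomial in D does, the factorization into real linear and quadratic factors can be carried out mechanically from the polynomial's roots. The routine below is our own illustrative sketch (numpy assumed; the function name and tolerances are ours), applied to the operator of Exercise 5(a):

```python
import numpy as np

def real_factors(coeffs, tol=1e-6):
    """Split a real monic polynomial in D into linear factors D - r
    (one per real root) and real quadratic factors
    D^2 - 2aD + (a^2 + b^2) (one per conjugate pair a +/- bi)."""
    roots = np.roots(coeffs)
    factors, used = [], [False] * len(roots)
    for i, r in enumerate(roots):
        if used[i]:
            continue
        if abs(r.imag) < tol:            # (numerically) real root
            factors.append(('linear', float(r.real)))
            used[i] = True
        else:                            # complex pair a +/- bi
            j = next(k for k in range(len(roots))
                     if not used[k] and k != i
                     and abs(roots[k] - r.conjugate()) < 1e-4)
            factors.append(('quadratic', float(r.real), float(abs(r.imag))))
            used[i] = used[j] = True
    return factors

# Exercise 5(a): D^3 + 4D^2 + 5D + 2 = (D + 1)^2 (D + 2),
# so three linear factors with roots -1, -1, -2 should appear.
print(real_factors([1, 4, 5, 2]))
```

Running the same routine on the coefficients of D^2 + 1 produces a single quadratic factor, in agreement with the conjugate pair ±i.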
This done, we will take up the problem of finding a particular solution of Ly = h given the general solution of the associated homogeneous equation Ly = 0 (see Section 3-2). Here the restriction on the coefficients of L will be dropped and much more far-reaching results obtained. The language of operator theory and the ideas of linear algebra will dominate this portion of our discussion and furnish just that measure of insight needed to make it intelligible. Finally, we shall conclude the chapter with some special results involving constant coefficient equations and a number of applications to problems in elementary physics.

EXERCISES

1. (a) Prove that the product of two complex numbers a + bi and c + di is real if and only if either
    (i) b = d = 0, or
    (ii) a = c and b = -d.
[Hint: Recall that a complex number a + bi is real if and only if b = 0, and that a product of the form (a + bi)(c + di) is computed by using the distributive law and the rule i^2 = -1.]
(b) Let P(x) be a polynomial with real coefficients, and suppose that P(x) has a + bi, b > 0, as a root; that is, P(a + bi) = 0. Prove that a - bi is also a root of P(x).

2. Let P(x) be a polynomial of degree n, n > 0, with real coefficients. Use the fact that P(x) has exactly n roots in the complex number system to prove that P(x) can be factored into a product of linear and quadratic factors with real coefficients. [Hint: See Exercise 1(b) above.]

3. Find the second-degree polynomial which has a + bi and a - bi, b > 0, as roots.

4. Prove that every polynomial of odd degree with real coefficients has at least one real root. [Hint: See Exercises 1(b) and 2 above.]

5. Write each of the following linear differential operators as a product of operators of degrees one and two.
    (a) D^3 + 4D^2 + 5D + 2        (b) D^3 - D^2 + D - 1
    (c) D^4 + 2D^3 - 10D - 25      (d) D^4 - 5D^2 + 4
    (e) D^4 + 2D^2 + 10

6. Factor the operator D^4 + 1 into a product of operators with real coefficients.

4-2 HOMOGENEOUS EQUATIONS OF ORDER TWO


We have already emphasized that the technique for solving constant coefficient linear differential equations depends upon the commutativity of the operator multiplication involved. To make this dependence explicit, and at the same time phrase it in the form best suited to our immediate needs, we begin by establishing

Lemma 4-1. If L_1, ..., L_n are constant coefficient linear differential operators, then the null space of each of them is contained in the null space of their product.

Proof. To prove this assertion we must show that (L_1 · · · L_n)y = 0 whenever L_i y = 0. But this is a triviality, since

    (L_1 · · · L_n)y = (L_1 · · · L_{i-1}L_{i+1} · · · L_n L_i)y
                     = (L_1 · · · L_{i-1}L_{i+1} · · · L_n)(L_i y)
                     = (L_1 · · · L_{i-1}L_{i+1} · · · L_n)0
                     = 0. ∎

Example. The second-order equation

    (D^2 - 4)y = 0          (4-4)

may be rewritten

    (D + 2)(D - 2)y = 0.

Hence e^{2x} and e^{-2x} are solutions of (4-4) since they are, respectively, solutions of the first-order equations (D - 2)y = 0 and (D + 2)y = 0. Furthermore, these functions are linearly independent in C(-∞, ∞) (compute their Wronskian!), and it therefore follows from Theorem 3-3 that the general solution of (4-4) is

    y = c_1 e^{2x} + c_2 e^{-2x},

c_1 and c_2 arbitrary constants.


This simple example suggests that we attempt to solve the general second-order equation

    (D^2 + a_1 D + a_0)y = 0          (4-5)

by decomposing the operator D^2 + a_1 D + a_0 into linear factors. To this end we first find the roots α_1, α_2 of the quadratic equation

    m^2 + a_1 m + a_0 = 0,          (4-6)

known as the auxiliary or characteristic equation of (4-5), and then rewrite (4-5) as

    (D - α_1)(D - α_2)y = 0.          (4-7)

This done, the argument falls into cases depending on the nature of α_1 and α_2, as follows:

Case 1. α_1 and α_2 real and unequal. Here the reasoning used in the above example carries over without change; the functions e^{α_1 x} and e^{α_2 x} are linearly independent solutions of (4-7), and

    y = c_1 e^{α_1 x} + c_2 e^{α_2 x}

is the general solution.

Case 2. α_1 = α_2 = α. In this case (4-7) becomes

    (D - α)^2 y = 0,          (4-8)

and our earlier argument yields but one solution of the equation, namely e^{αx}. Using it, however, we can apply the method introduced in Section 3-7 to find a second linearly independent solution by solving the first-order equation

    W[e^{αx}, y(x)] = e^{2αx}.

An easy computation reveals that, up to multiplicative constants, y(x) = xe^{αx}, and hence that the general solution of (4-8) is

    y = (c_1 + c_2 x)e^{αx}.

Case 3. α_1 and α_2 complex. Here α_1 = a + bi, α_2 = a - bi, a and b real, b > 0, and the above method apparently breaks down. Nevertheless, if we pretend that e^{α_1 x} and e^{α_2 x} continue to make sense when α_1 and α_2 are complex,* the discussion under Case 1 would imply that the general solution of (4-7) is

    y = c_1 e^{α_1 x} + c_2 e^{α_2 x}
      = c_1 e^{(a+bi)x} + c_2 e^{(a-bi)x}
      = e^{ax}(c_1 e^{ibx} + c_2 e^{-ibx}).

At this point we invoke Euler's famous formula

    e^{ix} = cos x + i sin x

(see Exercise 34) to rewrite this expression as

    y = e^{ax}[c_1(cos bx + i sin bx) + c_2(cos bx - i sin bx)]
      = e^{ax}[(c_1 + c_2) cos bx + i(c_1 - c_2) sin bx]
      = c_3 e^{ax} cos bx + c_4 e^{ax} sin bx.

Thus, on purely formal grounds we are led to e^{ax} cos bx and e^{ax} sin bx as a basis for the solution space of (4-7) when α_1 = a + bi and α_2 = a - bi. Of course, we must now verify that these functions actually are solutions of the given equation, and that they are linearly independent in C(-∞, ∞). But this is routine and has been left as an exercise for the reader.

Since these three cases include all possible combinations of α_1 and α_2, we have completed the task of solving the general second-order homogeneous linear differential equation with constant coefficients. For convenience of reference we conclude by summarizing our results.

To solve a second-order homogeneous linear differential equation of the form

    (D^2 + a_1 D + a_0)y = 0,

* Properly interpreted they do, as the reader may verify by consulting any text on the theory of functions of a complex variable.

first find the roots α_1 and α_2 of the auxiliary equation

    m^2 + a_1 m + a_0 = 0.

Then the general solution of the given equation can be expressed in terms of α_1 and α_2 as follows:

    α_1, α_2                               General solution

    Real, α_1 ≠ α_2                        c_1 e^{α_1 x} + c_2 e^{α_2 x}
    Real, α_1 = α_2 = α                    (c_1 + c_2 x)e^{αx}
    Complex, α_1 = a + bi, α_2 = a - bi    e^{ax}(c_1 cos bx + c_2 sin bx)
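The three cases in the table can be packaged into a single routine. The following sketch is illustrative only (the function name and the finite-difference spot check are our own): it builds the general solution of y'' + a_1 y' + a_0 y = 0 for chosen constants c_1, c_2 and verifies numerically that the result satisfies the equation y'' + 4y' + 8y = 0.

```python
import math

def general_solution(a1, a0, c1, c2):
    """General solution of y'' + a1 y' + a0 y = 0, following the three
    cases for the roots of the auxiliary equation m^2 + a1 m + a0 = 0."""
    disc = a1 * a1 - 4 * a0
    if disc > 0:                    # real, unequal roots
        r1 = (-a1 + math.sqrt(disc)) / 2
        r2 = (-a1 - math.sqrt(disc)) / 2
        return lambda x: c1 * math.exp(r1 * x) + c2 * math.exp(r2 * x)
    if disc == 0:                   # real, repeated root (an exact float
        r = -a1 / 2                 # test is adequate for this sketch)
        return lambda x: (c1 + c2 * x) * math.exp(r * x)
    a, b = -a1 / 2, math.sqrt(-disc) / 2    # complex roots a +/- bi
    return lambda x: math.exp(a * x) * (c1 * math.cos(b * x)
                                        + c2 * math.sin(b * x))

# Spot check against y'' + 4y' + 8y = 0 (roots -2 +/- 2i) by a central
# finite difference; the residual should be near zero.
y = general_solution(4.0, 8.0, 1.0, -2.0)
x0, h = 0.7, 1e-5
d2 = (y(x0 + h) - 2 * y(x0) + y(x0 - h)) / h**2
d1 = (y(x0 + h) - y(x0 - h)) / (2 * h)
residual = d2 + 4 * d1 + 8 * y(x0)
print(abs(residual))  # small (finite-difference error only)
```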

EXERCISES

Find the general solution of each of the following differential equations.

1. y'' + y' - 2y = 0                2. 3y'' - 5y' + 2y = 0
3. 8y'' + 14y' - 15y = 0            4. y'' - 2y' = 0
5. y'' + 4y = 0                     6. 3y'' + 2y = 0
7. y'' + 4y' + 8y = 0               8. 4y'' - 4y' + 3y = 0
9. y'' - 2y' + 2y = 0               10. 9y'' - 12y' + 4y = 0
11. y'' + 2y' + 4y = 0              12. 2y'' - 2√2 y' + y = 0
13. 2y'' - 5√3 y' + 6y = 0          14. 9y'' + 6y' + y = 0
15. 64y'' - 48y' + 17y = 0

In Exercises 16 through 25 find the solutions of the given initial-value problems.

16. 2y'' - y' - 3y = 0;  y(0) = 2,  y'(0) = -|
17. y'' - 8y' + 16y = 0;  y(0) = \,  y'(0) = -\
18. 4y'' - 12y' + 9y = 0;  y(0) = 1,  y'(0) = f
19. y'' + 2y = 0;  y(0) = 2,  y'(0) = 2√2
20. 4y'' - 4y' + 5y = 0;  y(0) = \,  y'(0) = 1
21. y'' + 4y' + 13y = 0;  y(0) = 0,  y'(0) = -2
22. 9y'' - 3y' - 2y = 0;  y(0) = 3,  y'(0) = 1
23. y'' - 2√5 y' + 5y = 0;  y(0) = 0,  y'(0) = 3
24. 16y'' + 8y' + 5y = 0;  y(0) = 4,  y'(0) = -1
25. y'' - √2 y' + y = 0;  y(0) = V%,  y'(0) = 0
26. Prove that e^{α_1 x} and e^{α_2 x} are linearly independent in C(-∞, ∞) whenever α_1 and α_2 are distinct real numbers.

27. Verify that xe^{αx} is a solution of the second-order equation (D - α)^2 y = 0. Prove that this solution and e^{αx} are linearly independent in C(-∞, ∞).

28. Verify that e^{ax} cos bx and e^{ax} sin bx are linearly independent solutions of the equation

    (D - α_1)(D - α_2)y = 0

when α_1 = a + bi and α_2 = a - bi, b ≠ 0.

29. Find a constant coefficient linear differential equation whose general solution is
    (a) (c_1 + c_2 x)e^{-3x}            (b) c_1 e^x sin 2x + c_2 e^x cos 2x
    (c) (c_1 + c_2 x)e^{-2x} + 1        (d) c_1 e^{-x} + c_2 e^{-3x} + x + 4
    (e) c_1 sin 3x + c_2 cos 3x + x/3

30. For each of the following functions find a second-order linear differential equation with constant coefficients which has the given function as a particular solution.
    (a) x(1 + e^x)                      (b) 4 sin x cos x
    (c) (1 + 2e^x)e^{2x} + 6x + 5       (d) cos x(1 - 4 sin^2 x)
    (e) e^{3x} + e^{2x} + xe^{3x}

31. (a) Show that the general solution of the second-order equation

    [D^2 - 2aD + (a^2 + b^2)]y = 0

can be written in the form

    y = c_1 e^{ax} cos (bx + c_2),

c_1 and c_2 arbitrary constants. This form is frequently called the phase-amplitude form of the solution. Why?
(b) Write the general solution of (D^2 + 4)y = 0 in phase-amplitude form.

32. If L = (D - a)^2, a real, show that

    L e^{kx} = (k - a)^2 e^{kx}.

Differentiate both sides of this identity with respect to k to prove that

    L(xe^{kx}) = (k - a)[2e^{kx} + (k - a)xe^{kx}],

and then show that xe^{ax} is a solution of Ly = 0.

33. (a) Find the solution of

    (D^2 - 2D + 26)y = 0

whose graph passes through the point (0, 1) with slope 2.
(b) Solve the problem given in (a) again, this time writing the general solution in the form

    y = c_1 e^{(a+bi)x} + c_2 e^{(a-bi)x}.

Evaluate c_1 and c_2 formally, and then use Euler's formula to show that the resulting solution can be transformed into the solution found in (a).

*34. The function e^z, z a complex number, is defined by the infinite series

    e^z = 1 + z + z^2/2! + · · · + z^n/n! + · · · ,

and it can be shown that this series converges absolutely for all values of z.* Set z = ix in this series, and use the fact that i^2 = -1 to prove Euler's formula. [Hint: Since the series is absolutely convergent for all z, its terms may be rearranged at will.]

4-3 HOMOGENEOUS EQUATIONS OF ARBITRARY ORDER


The technique for solving homogeneous constant coefficient linear differential equations is now all but complete. For instance, to solve

    (D^4 - 2D^3 + 2D^2 - 2D + 1)y = 0          (4-9)

we first decompose the operator into linear and quadratic factors, as suggested in Section 4-1, to obtain the equivalent equation

    (D - 1)^2 (D^2 + 1)y = 0,

and then invoke Lemma 4-1 to assert that the solution space of each of the second-order equations (D - 1)^2 y = 0 and (D^2 + 1)y = 0 is contained in the solution space of (4-9). Thus e^x, xe^x, sin x, and cos x are solutions of (4-9), and since these functions are linearly independent in C(-∞, ∞), the general solution of the given equation is

    y = (c_1 + c_2 x)e^x + c_3 sin x + c_4 cos x.

This, in brief, is how all homogeneous constant coefficient equations are solved, and save for the difficulty occasioned by equations such as

    (D - 1)^4 y = 0
and
    (D^2 + 1)^2 y = 0,

where the above argument fails to yield the required number of linearly independent solutions, we are done. But, recalling our experience with the equation (D - α)^2 y = 0, it is not difficult to guess that the missing solutions for the above equations are, respectively, x^2 e^x, x^3 e^x and x sin x, x cos x. Both of these conjectures are correct, and we shall now prove the relevant generalization of this fact for arbitrary equations of the form

    (D^n + a_{n-1} D^{n-1} + · · · + a_0)y = 0,          (4-10)

where a_0, ..., a_{n-1} are constants.

* By definition the absolute value, or modulus, of a complex number z = a + bi is the real number √(a^2 + b^2). A series Σ_{n=1}^∞ z_n of complex numbers is said to converge absolutely if the series Σ_{n=1}^∞ |z_n| of real numbers converges in the usual sense.

We begin by decomposing the operator into linear and quadratic factors, the linear factors being determined by the real roots of the auxiliary or characteristic equation

    m^n + a_{n-1} m^{n-1} + · · · + a_0 = 0,          (4-11)

the quadratic factors by its complex roots. Then, by Lemma 4-1, we can find solutions of (4-10) by finding the null space of each factor of the form (D - α)^m corresponding to the real root α, and of each factor of the form [D^2 - 2aD + (a^2 + b^2)]^m corresponding to the pair of complex roots a ± bi, b > 0. This we accomplish in the following theorem, which, it should be noted, also includes the case (D - α)^2 discussed earlier.

Theorem 4-1. If y(x) belongs to the null space of a constant coefficient linear differential operator L, then x^{m-1} y(x) belongs to the null space of L^m.

Proof. To establish this result we must compute the value of the linear differential operator L^m applied to the product x^{m-1} y, and we therefore begin by giving a formula for evaluating all such expressions. Specifically, if L is any linear differential operator whatever, and if L', L'', ... denote the formal derivatives of L with respect to D,* then

    L(uv) = (Lu)v + (L'u)Dv + (1/2!)(L''u)D^2 v + (1/3!)(L'''u)D^3 v + · · ·          (4-12)

whenever u and v are functions for which L(uv) is defined.†

Accepting the validity of (4-12), let L and y(x) be as in the statement of the theorem; i.e., Ly = 0. We must prove that

    L^m(y x^{m-1}) = 0.

To this end we set L^m = M, and apply the above formula to obtain

    M(y x^{m-1}) = (My)x^{m-1} + (M'y)Dx^{m-1} + (1/2!)(M''y)D^2 x^{m-1} + · · · .

To show that this expression is zero, we first note that D^r x^{m-1} = 0 whenever r ≥ m. Moreover, when r < m, the rth formal derivative of M with respect to D, M^(r), consists of a sum of terms each of which contains the factor L^{m-r} (see Exercise 20 below). Hence, since Ly = 0 and since L is a constant coefficient operator, Lemma 4-1 applies, and we find that M^(r) y = 0. Thus all of the terms in the above expression are zero, and the theorem is proved. ∎

* Thus, if L = 3D^2 + 4D - 1, then L' = 6D + 4, L'' = 6, L''' = 0, etc.

† This formula is really no more than a generalized version of Leibnitz's formula D^n(uv) = Σ_{k=0}^n (n choose k)(D^k u)(D^{n-k} v), introduced in Exercise 15, Section 3-1. Its proof has been left as an exercise (see Exercise 18).

As a consequence of this theorem we can now assert that the null space of the operator (D - α)^m contains the functions

    e^{αx}, xe^{αx}, ..., x^{m-1} e^{αx},

and that the null space of [D^2 - 2aD + (a^2 + b^2)]^m contains

    e^{ax} sin bx, xe^{ax} sin bx, ..., x^{m-1} e^{ax} sin bx,
    e^{ax} cos bx, xe^{ax} cos bx, ..., x^{m-1} e^{ax} cos bx.

And it is out of just such functions that the general solution of every homogeneous constant coefficient linear differential equation is constructed. The construction depends, of course, upon the fact that the various functions obtained in this way are linearly independent in C(-∞, ∞). They are, but unfortunately there is no really brief proof of this assertion. One particularly elegant proof will be given in Section 7-7 as an illustration of the ideas introduced there, and in the meantime we will content ourselves with indicating an alternate approach in Example 5 below.
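Particular instances of Theorem 4-1 can be verified symbolically by expanding the operator as a polynomial in D and applying it term by term. The sketch below is an illustration only (sympy assumed; the helper apply_op is our own): it checks that x^2 e^{2x} is annihilated by (D - 2)^3, and that x e^{ax} sin bx is annihilated by [D^2 - 2aD + (a^2 + b^2)]^2.

```python
import sympy as sp

x, a, b = sp.symbols('x a b', real=True)
D = sp.symbols('D')

def apply_op(op_poly, y):
    """Apply a polynomial in the symbol D to y(x), reading D^k as
    the kth derivative with respect to x."""
    result = sp.Integer(0)
    for (k,), c in sp.Poly(op_poly, D).terms():
        result += c * sp.diff(y, x, k)
    return sp.simplify(sp.expand(result))

# (D - 2)^3 annihilates x^2 e^{2x}:
print(apply_op(sp.expand((D - 2)**3), x**2 * sp.exp(2 * x)))

# [D^2 - 2aD + (a^2 + b^2)]^2 annihilates x e^{ax} sin bx:
op = sp.expand((D**2 - 2*a*D + a**2 + b**2)**2)
print(apply_op(op, x * sp.exp(a*x) * sp.sin(b*x)))
```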

Example 1. Find the general solution of

    (D^3 + 1)y = 0.          (4-13)

Here the factorization of the operator is (D + 1)(D^2 - D + 1), and it follows that the roots of the auxiliary equation

    m^3 + 1 = 0

are -1, 1/2 + (√3/2)i, and 1/2 - (√3/2)i. Thus the general solution of (4-13) is

    y = c_1 e^{-x} + e^{x/2}(c_2 sin (√3/2)x + c_3 cos (√3/2)x).

Example 2. Solve

    y^(7) - 2y^(5) + y^(3) = 0.

In operator notation this equation becomes

    (D^7 - 2D^5 + D^3)y = 0,

and since

    D^7 - 2D^5 + D^3 = D^3[(D - 1)(D + 1)]^2,

the roots of the auxiliary equation are 0 (a root of multiplicity 3), 1 (a root of multiplicity 2), and -1 (a root of multiplicity 2).* Hence the general solution is

    y = c_1 + c_2 x + c_3 x^2 + (c_4 + c_5 x)e^x + (c_6 + c_7 x)e^{-x}.

* The multiplicity of a root α of the equation m^n + a_{n-1}m^{n-1} + · · · + a_0 = 0 is the largest integer k such that (m - α)^k is a factor of the polynomial.

Example 3. Find a constant coefficient linear differential equation which has e^{2x} and xe^{-3x} among its solutions.

In this case we must find a constant coefficient linear differential operator L with the property that Le^{2x} = 0 and Lxe^{-3x} = 0. (In more picturesque terminology it is sometimes said that L annihilates these functions.) Clearly any operator which contains the factors D - 2 and (D + 3)^2 will answer the problem, and hence

    (D - 2)(D + 3)^2 y = 0

will be an equation of the required type.

Example 4. Find a constant coefficient linear differential operator L which, when applied to the equation

    (D^2 + 1)(D - 1)y = e^x + 2 - 7x sin x,

produces the homogeneous equation

    L(D^2 + 1)(D - 1)y = 0.

Since L must annihilate the functions e^x, 2, and x sin x, it must contain the factors D - 1, D, and (D^2 + 1)^2. By Lemma 4-1 we can therefore set

    L = D(D - 1)(D^2 + 1)^2.

Example 5. As our final example we shall prove that the various solutions of the equation

    (D - 2)(D + 5)^3(D^2 - 4D + 13)y = 0

obtained using Theorem 4-1 are linearly independent in C(-∞, ∞). In this case the solutions are

    e^{2x}                                corresponding to the factor D - 2,
    e^{-5x}, xe^{-5x}, x^2 e^{-5x}        corresponding to the factor (D + 5)^3,
    e^{2x} cos 3x, e^{2x} sin 3x          corresponding to the factor D^2 - 4D + 13,

and it is obvious that they are somewhat too numerous to permit their Wronskian to be computed easily. Instead we reason as follows:

Let c_1, ..., c_6 be constants such that

    c_1 e^{2x} + (c_2 + c_3 x + c_4 x^2)e^{-5x} + e^{2x}(c_5 cos 3x + c_6 sin 3x) = 0          (4-14)

for all x. Apply the operator (D + 5)^3(D^2 - 4D + 13) to this expression to annihilate every term but the first, thereby obtaining

    c_1(D + 5)^3(D^2 - 4D + 13)e^{2x} = 0.

But since (D + 5)^3(D^2 - 4D + 13) does not annihilate e^{2x}, it follows that

c_1 = 0 and that (4-14) reduces to

    (c_2 + c_3 x + c_4 x^2)e^{-5x} + e^{2x}(c_5 cos 3x + c_6 sin 3x) = 0.          (4-15)

Next apply the operator (D + 5)^2(D^2 - 4D + 13) to annihilate every term in (4-15) except the one arising from c_4 x^2 e^{-5x}. This gives

    c_4(D + 5)^2(D^2 - 4D + 13)x^2 e^{-5x} = 0,

and since (D + 5)^2(D^2 - 4D + 13)x^2 e^{-5x} is not identically zero, we conclude that c_4 = 0. Hence (4-15) becomes

    (c_2 + c_3 x)e^{-5x} + e^{2x}(c_5 cos 3x + c_6 sin 3x) = 0.

Continue the argument, using the operators (D + 5)(D^2 - 4D + 13) and D^2 - 4D + 13 in turn to deduce that c_3 = 0 and c_2 = 0. Finally, from the identity

    e^{2x}(c_5 cos 3x + c_6 sin 3x) = 0

we conclude directly that c_5 = c_6 = 0, and the linear independence of this particular set of solutions has been proved.

In point of fact, this argument can be refined to give a general proof of the linear independence of the functions appearing in the solutions of constant coefficient homogeneous linear differential equations. We refrain from doing so, however, since the problem will be considered in a later chapter where an entirely different proof will be given.
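The conclusion of Example 5 can also be made plausible numerically: six functions satisfying a nontrivial linear relation on (-∞, ∞) would produce a singular matrix of values at any six sample points, so a full-rank value matrix rules out any such relation. A sketch (numpy assumed; the sample points are arbitrary):

```python
import numpy as np

# The six solutions from Example 5.
funcs = [
    lambda x: np.exp(2 * x),
    lambda x: np.exp(-5 * x),
    lambda x: x * np.exp(-5 * x),
    lambda x: x**2 * np.exp(-5 * x),
    lambda x: np.exp(2 * x) * np.cos(3 * x),
    lambda x: np.exp(2 * x) * np.sin(3 * x),
]

# Evaluate each function at six sample points; any nontrivial relation
# c1 f1 + ... + c6 f6 = 0 would force this matrix to be singular.
xs = np.array([-1.0, -0.5, 0.0, 0.4, 0.9, 1.3])
A = np.array([[f(x) for f in funcs] for x in xs])
print(np.linalg.matrix_rank(A))  # 6
```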

EXERCISES

Find the general solution of each of the following differential equations.

1. y''' + 3y'' - y' - 3y = 0        2. y''' + 5y'' - 8y' - 12y = 0
3. 4y''' + 12y'' + 9y' = 0          4. y''' + 6y'' + 13y' = 0
5. 2y''' + y'' - 8y' - 4y = 0       6. y''' + 3y'' + y' + 3y = 0
7. y^(iv) - y = 0                   8. y^(iv) - 8y'' + 16y = 0
9. y^(iv) + 18y'' + 81y = 0         10. 4y^(iv) - 8y''' - y'' + 2y' = 0
11. y^(iv) + y''' + y'' = 0         12. y^(iv) = 0
13. y^(iv) - 4y''' + 6y'' - 4y' + y = 0
14. y^(v) + 2y''' + y' = 0
15. y^(v) + 6y^(iv) + 15y''' + 26y'' + 36y' + 24y = 0

16. Find the general solution of (D^4 + 1)y = 0.
17. Find linear differential operators which annihilate each of the following functions.
    (a) x^2(e^x + 1)                (b) 3e^{2x} cos 2x
    (c) x(2x + 1) sin x             (d) 3 + 4x - 2e^{-2x}
    (e) x^2 sin x cos x             (f) x^2 e^x sin x
    (g) x sin (x + 1)               (h) (x^2 - 1)(cos x + 1)e^{3x}
    (i) (xe^x + 1)^3                (j) (x + e^x)^2
18. Use Leibnitz's formula and mathematical induction to establish (4-12) for any linear differential operator L.

19. Verify Formula (4-12) when
    (a) L = D
    (b) L = D^2 + 2D + 1
    (c) L = D^n

20. (a) Let P = Q^n R, where Q and R are polynomials, and n is a positive integer. Show that P' can be written in the form Q^{n-1} S, where S is again a polynomial. More generally, prove that if Q^k divides a polynomial P, then Q^{k-r} divides D^(r) P.
(b) Use the result in (a) to prove the assertion made during the proof of Theorem 4-1 to the effect that M^(r) y = 0 whenever r < m.
*21. (a) Show that the Wronskian of the functions e^{k_1 x}, e^{k_2 x}, ..., e^{k_n x} is

                              | 1          1          · · ·  1          |
    e^{(k_1 + k_2 + · · · + k_n)x} | k_1        k_2        · · ·  k_n        |
                              | . . . . . . . . . . . . . . . . . . . . |
                              | k_1^{n-1}  k_2^{n-1}  · · ·  k_n^{n-1}  |

(b) The determinant appearing in (a) is known as a Vandermonde determinant. Prove that every such determinant is nonzero whenever k_1, ..., k_n are distinct. [Hint: Let V denote the determinant in question, and consider k_1 as a variable. Use an inductive argument to show that for each n, V can be viewed as a polynomial of degree n - 1 in k_1 which has k_2, ..., k_n as roots.]

(c) By direct computation prove that every 3 × 3 Vandermonde determinant is different from zero if k_1, k_2, k_3 are distinct.

(d) What conclusion can be drawn about solutions of homogeneous constant coefficient linear differential equations from the results of (a) and (b)?
22. Prove that the solutions y_1(x), ..., y_n(x) of a constant coefficient linear differential equation are linearly independent in C(-∞, ∞) if and only if they are linearly independent in C(I) for every interval I. [Hint: Assume linear dependence on I and use the uniqueness theorem.]

23. Prove that the functions e^{2x}, xe^{2x}, e^{2x} sin x, e^{2x} cos x are linearly independent in C(I) for any I. [Hint: Show that this assertion is equivalent to asserting the linear independence of 1, x, sin x, cos x in C(-∞, ∞) (see Exercise 22), and then study the behavior of c_1 + c_2 x + c_3 sin x + c_4 cos x as x → ∞.]

24. Establish the linear independence of the functions given in Exercise 23 by applying the annihilator of sin x and cos x to the identity c_1 + c_2 x + c_3 sin x + c_4 cos x = 0.

25. Prove that the functions e^{ax}, xe^{ax}, ..., x^{m-1} e^{ax}, a real, are linearly independent in C(I) for any I. [Hint: A polynomial of positive degree has only a finite number of zeros.]

26. (a) Let P(x) be a polynomial with real coefficients, and let L = P(D) be the associated constant coefficient linear differential operator. Prove that

    L e^{ax} = P(a)e^{ax}.

(b) Use the result in (a) to show that for any constant coefficient linear differential operator L, Le^{ax} = 0 if and only if L has D - a as a factor.

27. Prove that

    (D - a)^k x^m e^{ax} = { 0                                          if k > m,
                           { m(m - 1) · · · (m - k + 1) x^{m-k} e^{ax}  if k ≤ m.

In particular, deduce that

    (D - a)^m x^m e^{ax} = m! e^{ax}.

28. (a) Let L = (D - α)^m (D - β)^n, α and β real, α ≠ β. Prove that the functions in the null space of L are linearly independent in C(I) for any I. [Hint: Apply the operators (D - β)^n (D - α)^{m-1}, (D - β)^n (D - α)^{m-2}, ..., in succession to the identity

    c_1 e^{αx} + · · · + c_m x^{m-1} e^{αx} + d_1 e^{βx} + · · · + d_n x^{n-1} e^{βx} = 0,

and use the results of Exercises 26 and 27.]
(b) Generalize the result in (a) to operators of the form

    L = (D - α_1)^{m_1} · · · (D - α_k)^{m_k},

where α_1, ..., α_k are distinct real numbers.

Remark. The results of the last three exercises, when generalized to include complex functions, furnish a proof of the linear independence of the solutions of the equation Ly = 0 obtained in this section.

4-4 NONHOMOGENEOUS EQUATIONS:
VARIATION OF PARAMETERS AND GREEN'S FUNCTIONS

In Section 3-2 we observed that the general solution of a nonhomogeneous linear differential equation

    a_n(x) d^n y/dx^n + · · · + a_0(x)y = h(x)          (4-16)

may be written in the form

    y = y_p + y_h,          (4-17)

where y_p is any particular solution of (4-16), and y_h is the general solution of the associated homogeneous equation

    a_n(x) d^n y/dx^n + · · · + a_0(x)y = 0.          (4-18)

Using the language of linear operators, the problem of finding a particular solution of (4-16), which we assume defined and normal on an interval I, consists of finding exactly one function in C^n(I) which satisfies the equation

    Ly = h,          (4-19)

where L is the linear differential operator a_n(x)D^n + · · · + a_0(x). And this, as we know, is equivalent to the problem of constructing a right inverse for L; meaning, of course, a linear transformation G: C(I) → C^n(I) such that L(G(h)) = h for all h in C(I) (cf. Section 2-5 and Section 3-4).

The existence of such inverses is guaranteed by the fact that Eq. (4-19) has solutions for every h in C(I), and the only open question is how to go about selecting a particular inverse for L from the infinitely many that exist. In other words, how do we impose conditions on Eq. (4-19) to ensure that it has a unique solution for each h in C(I)? When asked in these terms an answer is obvious: We simply require that the solution satisfy a "complete" set of initial conditions at some point x_0 in the interval I.* Since the particular solution obtained is quite immaterial we choose the simplest of all possible initial conditions, namely

    y(x_0) = 0,  y'(x_0) = 0,  ...,  y^{(n-1)}(x_0) = 0.          (4-20)

And with this we have in fact defined a right inverse G for the operator L. Specifically, G can be described as the (linear) mapping from C(I) to C^n(I) which sends each function h in C(I) onto the solution of (4-19) which satisfies the initial conditions given above. In this section we shall obtain an explicit formula for G in terms of a basis for the solution space of the homogeneous equation Ly = 0 when L is an operator of order two. In the next section these results will be generalized to operators of arbitrary order, and once this has been done the study of linear differential equations will have been reduced to the homogeneous case. (The reader should note that this portion of our discussion is not restricted to constant coefficient operators.)
Thus we begin by considering a normal second-order linear differential equation

    d²y/dx² + a₁(x) dy/dx + a₀(x)y = h(x)    (4-21)

defined on an interval I of the x-axis, and the general solution

    y_h = c₁y₁(x) + c₂y₂(x)    (4-22)

of its associated homogeneous equation. We seek a particular solution y_p of (4-21) such that

    y_p(x₀) = 0,    y_p′(x₀) = 0,    (4-23)

where x₀ is a fixed but otherwise arbitrary point in I.

* This requirement can be viewed as restricting the domain of L in such a way that L becomes one-to-one and has an inverse.
140 EQUATIONS WITH CONSTANT COEFFICIENTS | CHAP. 4

The construction of y_p is begun by making the unjustified but not unreasonable assumption that any particular solution of (4-21) ought to be related to the expression for y_h, and we therefore attempt to alter the latter in such a way that it becomes a solution of the given equation. One way of doing this is to allow the parameters c₁ and c₂ in (4-22) to vary with x in the hope of finding a solution of (4-21) of the form

    y_p = c₁(x)y₁(x) + c₂(x)y₂(x).*    (4-24)

If (4-24) is substituted in (4-21), and the notation simplified by suppressing mention of the variable x, we obtain

    c₁(y₁″ + a₁y₁′ + a₀y₁) + c₂(y₂″ + a₁y₂′ + a₀y₂)
        + (c₁′y₁ + c₂′y₂)′ + a₁(c₁′y₁ + c₂′y₂) + (c₁′y₁′ + c₂′y₂′) = h.    (4-25)

Moreover, since y₁ and y₂ are solutions of the homogeneous equation y″ + a₁y′ + a₀y = 0, the first two terms in (4-25) vanish, and we have

    (c₁′y₁ + c₂′y₂)′ + a₁(c₁′y₁ + c₂′y₂) + (c₁′y₁′ + c₂′y₂′) = h.

This identity, which must hold if (4-24) is to be a solution of (4-21), will obviously be satisfied if c₁ and c₂ can be chosen so that

    c₁′(x)y₁(x) + c₂′(x)y₂(x) = 0,
    c₁′(x)y₁′(x) + c₂′(x)y₂′(x) = h(x),    (4-26)

for all x in I. Thus it remains to show that these equations serve to determine c₁(x) and c₂(x), and that this can be done in such a way that the function

    y_p = c₁(x)y₁(x) + c₂(x)y₂(x)

satisfies the initial conditions given in (4-23).

But, for each x in I, (4-26) may be viewed as a pair of linear equations in the unknowns c₁′(x) and c₂′(x). As such, the determinant of its coefficients is

    | y₁(x)   y₂(x)  |
    | y₁′(x)  y₂′(x) |

which we recognize as the Wronskian of the linearly independent solutions y₁(x) and y₂(x) of the homogeneous equation associated with (4-21). We now recall (Theorem 3-6) that this determinant is a continuous function of x which never vanishes on I. Hence (4-26) has a unique solution for c₁′(x) and c₂′(x), and, once this solution is known, c₁(x) and c₂(x) can be found by integration. Moreover, by

* The term "parameter" is frequently used as a synonym for "constant," particularly


when, as in this case, the constant is allowed to assume arbitrary values.
4-4 |
NONHOMOGENEOUS EQUATIONS: VARIATION OF PARAMETERS 141

suitably choosing the limits of integration, the required initial conditions can also
be satisfied, and the argument is complete. For obvious reasons, this method of
constructing a particular solution for a nonhomogeneous linear differential equa-
tion out of the general solution of its associated homogeneous equation is known
as the method of variation of parameters.
Starting with (4-26), an easy calculation gives

    c₁′(x) = - h(x)y₂(x) / W[y₁(x), y₂(x)],    c₂′(x) = h(x)y₁(x) / W[y₁(x), y₂(x)].    (4-27)

Thus

    c₁(x) = - ∫_{x₀}^{x} (h(t)y₂(t) / W[y₁(t), y₂(t)]) dt,    c₂(x) = ∫_{x₀}^{x} (h(t)y₁(t) / W[y₁(t), y₂(t)]) dt,

and if these values are substituted in (4-24), and terms combined, we find that y_p can be written in integral form as

    y_p(x) = ∫_{x₀}^{x} ( [y₁(t)y₂(x) - y₁(x)y₂(t)] / W[y₁(t), y₂(t)] ) h(t) dt.    (4-28)

One reason for calling attention to this expression is that it can be read as the definition of a right inverse for the linear differential operator L = D² + a₁(x)D + a₀(x), and, in fact, is the particular right inverse discussed earlier in this section. For if h is any function in C(I), and if we set

    G(h) = ∫_{x₀}^{x} K(x, t)h(t) dt,    (4-29)

where

    K(x, t) = [y₁(t)y₂(x) - y₁(x)y₂(t)] / W[y₁(t), y₂(t)],    (4-30)

then G maps C(I) to C²(I), acts as a right inverse for L, and has the further property that G(h) satisfies the initial conditions G(h)(x₀) = G(h)′(x₀) = 0 (see Exercise 29). It should be pointed out that the function K(x, t) defined by (4-30) is independent of the particular choice of x₀ in the interval I, and is completely determined by the operator L.* As such it is referred to as the Green's function for L for initial value problems on the interval I, or, more simply, as the Green's function for L.
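As a quick sanity check of (4-29) and (4-30), K(x, t) can be tabulated numerically for a concrete operator. The sketch below is our illustration, not part of the text; it takes y₁ = sin x, y₂ = cos x (a basis for the solution space of y″ + y = 0) and verifies that K vanishes on the diagonal while its x-derivative equals 1 there, exactly the initial conditions built into G:

```python
import math

def K(x, t):
    # Formula (4-30) with y1 = sin, y2 = cos, solutions of y'' + y = 0;
    # here W[y1(t), y2(t)] = y1*y2' - y1'*y2 = -sin^2 t - cos^2 t = -1
    y1, y2 = math.sin, math.cos
    dy1, dy2 = math.cos, lambda s: -math.sin(s)
    W = y1(t)*dy2(t) - dy1(t)*y2(t)
    return (y1(t)*y2(x) - y1(x)*y2(t)) / W

# for L = D^2 + 1 the formula collapses to sin(x - t)
print(K(1.3, 0.4), math.sin(0.9))
```

The diagonal identities K(x, x) = 0 and K_x(x, x) = 1 hold no matter which basis is used, which is the content of Exercise 28.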

Example 1. Find the general solution of the second-order equation

    y″ + y = tan x.    (4-31)

* In the next section it will be shown that K(x, t) is also independent of the particular basis chosen for the solution space of Ly = 0.

In this case the associated homogeneous equation has

    y_h = c₁ sin x + c₂ cos x

as its general solution. Thus we seek a particular solution of (4-31) in the form

    y_p = c₁(x) sin x + c₂(x) cos x,

where c₁(x) and c₂(x) are determined from the pair of equations

    c₁′(x) sin x + c₂′(x) cos x = 0,
    c₁′(x) cos x - c₂′(x) sin x = tan x

(see Eq. 4-26). This gives

    c₁′(x) = sin x,    c₂′(x) = -sin x tan x,

and it follows that

    c₁(x) = ∫ sin x dx = -cos x,

    c₂(x) = - ∫ (sin²x / cos x) dx = - ∫ (sec x - cos x) dx
          = -ln |sec x + tan x| + sin x.

Thus

    y_p = -cos x sin x + [sin x - ln |sec x + tan x|] cos x
        = -cos x ln |sec x + tan x|,

and the general solution of (4-31) is

    y = -cos x ln |sec x + tan x| + c₁ sin x + c₂ cos x.
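This answer is easy to check numerically. The following sketch is ours, not the text's; it approximates y_p″ + y_p by central differences and compares the result with tan x at a few points of (0, π/2):

```python
import math

def yp(x):
    # particular solution found above: y_p = -cos x * ln|sec x + tan x|
    return -math.cos(x) * math.log(abs(1.0/math.cos(x) + math.tan(x)))

def residual(x, h=1e-4):
    # central-difference estimate of y_p'' + y_p - tan x; should be ~ 0
    ypp = (yp(x + h) - 2.0*yp(x) + yp(x - h)) / h**2
    return ypp + yp(x) - math.tan(x)

print([residual(x) for x in (0.3, 0.7, 1.0)])
```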

An alternate method of solving (4-31) relies upon determining the Green's function K(x, t) for the operator L = D² + 1. According to (4-30), since W[sin t, cos t] = -1,

    K(x, t) = sin x cos t - cos x sin t = sin (x - t),

and if we set x₀ = 0 in (4-29), the expression

    G(h) = ∫_{0}^{x} sin (x - t)h(t) dt

defines a right inverse for L. (The reader should note that at this point we are in a position to solve all linear differential equations involving D² + 1, a fact which

* It is standard practice to omit the constants of integration at this point since without them we still obtain a particular solution of the given equation.

vividly illustrates the economy of using Green's functions.) It now follows that a particular solution of (4-31) can be obtained by applying G to the function tan x; i.e., by computing

    y_p(x) = ∫_{0}^{x} sin (x - t) tan t dt.    (4-32)

This gives

    y_p(x) = ∫_{0}^{x} (sin x cos t - cos x sin t) tan t dt

           = sin x ∫_{0}^{x} sin t dt - cos x ∫_{0}^{x} (sin²t / cos t) dt

           = sin x - cos x ln |sec x + tan x|.
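The integral (4-32) can also be evaluated by numerical quadrature and compared with the closed form just obtained; the sketch below is our illustration, using Simpson's rule:

```python
import math

def G_tan(x, n=2000):
    # Simpson's rule for the integral in (4-32); n must be even
    h = x / n
    total = 0.0
    for i in range(n + 1):
        t = i * h
        w = 1 if i in (0, n) else (4 if i % 2 == 1 else 2)
        total += w * math.sin(x - t) * math.tan(t)
    return total * h / 3.0

closed_form = math.sin(1.0) - math.cos(1.0)*math.log(1.0/math.cos(1.0) + math.tan(1.0))
print(G_tan(1.0), closed_form)
```

(This particular solution differs from the one found first by the homogeneous term sin x, as expected of two right inverses evaluated with different constants of integration.)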

Example 2. In Section 4-8 we will show that the general solution of

    xy″ + y′ = 0

on (0, ∞) and (-∞, 0) is c₁ + c₂ ln |x|. Hence the nonhomogeneous equation

    xy″ + y′ = x + 1    (4-33)

has a particular solution of the form y_p = c₁(x) + c₂(x) ln |x|, where c₁(x) and c₂(x) are determined from the equations

    c₁′(x) + c₂′(x) ln |x| = 0,
    c₂′(x) · (1/x) = (x + 1)/x.

[Here h(x) = (x + 1)/x, since we must divide by x to put (4-33) in normal form.] Thus

    c₁′(x) = -(x + 1) ln |x|,    c₂′(x) = x + 1,

and

    c₁(x) = x²/4 + x - (x²/2 + x) ln |x|,    c₂(x) = x²/2 + x.

This gives

    y_p = x²/4 + x - (x²/2 + x) ln |x| + (x²/2 + x) ln |x| = x²/4 + x,

and the general solution of (4-33) is

    y = x²/4 + x + c₁ + c₂ ln |x|.
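A direct check of this conclusion (our sketch, not the text's): with y_p = x²/4 + x we have y_p′ = x/2 + 1 and y_p″ = 1/2, so xy_p″ + y_p′ should reproduce x + 1 exactly:

```python
def lhs(x):
    # x*y_p'' + y_p' with y_p = x^2/4 + x
    ypp = 0.5
    yprime = x/2.0 + 1.0
    return x*ypp + yprime

print([lhs(x) for x in (0.5, 2.0, 7.0)])  # compare with x + 1
```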

EXERCISES

Find the general solution of each of the following differential equations.

1. (D² + 1)y = 1/cos x
2. (D² - D - 2)y = e^{-x} sin x
3. (D² + 4D + 4)y = xe^{2x}
4. (D² + 3D - 4)y = x²eˣ
5. (4D² + 4D + 1)y = xe^{-x/2} sin x
6. (D² + 4)y = e^{2x}/x²
7. (D² + 10D - 12)y = e^{2x} + 1
8. (D + 3)²y = (x + 1)eˣ
9. (D² - 2D + 2)y = e^{2x} sin x
10. (4D² - 8D + 5)y = eˣ tan (x/2)

For each of the following differential equations verify that the given expression is the general solution of the associated homogeneous equation and then find a particular solution of the equation.

11. x²y″ - 2xy′ + 2y = x³ ln x,  x > 0;    y_h = c₁x + c₂x²
12. x²y″ - xy′ + y = x(x + 1);    y_h = (c₁ + c₂ ln |x|)x
13. (sin 4x)y″ - 4(cos² 2x)y′ = tan x;    y_h = c₁ + c₂ cos 2x
14. xy″ - (1 + 2x²)y′ = x⁵;    y_h = c₁ + c₂e^{x²}
15. (1 - x²)y″ - 2xy′ = 2x,  -1 < x < 1;    y_h = c₁ + c₂ ln [(1 + x)/(1 - x)]

In each of the following exercises find a Green's function for the given linear differential operator.

16. D² + 3
17. D² - D - 2
18. D² + 4D + 4
19. 4D² - 8D + 5
20. D² + 3D - 4
21. x²D² - 2xD + 2
22. xD² - (1 + 2x²)D
23. (1 - x²)D² - 2xD

24. Solve the initial value problem

    (D² + 2aD + b²)y = sin ωt,    y(0) = y′(0) = 0,

where a, b, ω are (real) constants, a < b. Consider separately the cases ω ≠ √(b² - a²) and ω = √(b² - a²), and sketch the solution curve in each case.

25. Find a Green's function for the operator in Exercise 24, and use this function to obtain the desired solution as G(sin ωt).

26. The second-order equation

    (D - 1)(xD + 3)y = eˣ

may be solved by setting (xD + 3)y = u, and then successively solving the first-order equations

    (D - 1)u = eˣ  and  (xD + 3)y = u.

Use this technique to show that the general solution of this equation is

    y = c₁/x³ + [c₂(x² - 2x + 2) + x³ - 3x² + 6x - 6] eˣ/x³.

27. Use the technique introduced in the preceding exercise to show once more that xe^{ax} is a solution of (D - a)²y = 0.

28. Let K(x, t) denote the Green's function (4-30) for initial value problems involving the operator L = D² + a₁(x)D + a₀(x), and assume that L is defined on an interval I of the x-axis.
(a) What is the domain of K(x, t) in the xt-plane?
(b) Prove that K(x, x) = 0 and K_x(x, x) = 1 for all x in I. [Note: K_x denotes the partial derivative of K(x, t) with respect to x.]
(c) Show that for each fixed t₀ in I the function φ(x) = K(x, t₀) is a solution on I of the initial value problem Ly = 0; φ(t₀) = 0, φ′(t₀) = 1.
(d) Use the results of (b) and (c) to deduce that K(x, t) is independent of the particular basis y₁(x) and y₂(x) chosen for the solution space of the homogeneous equation Ly = 0.

29. With K(x, t) as in the preceding exercise, show that the function y_p defined by

    y_p(x) = ∫_{x₀}^{x} K(x, t)h(t) dt

satisfies the initial conditions y_p(x₀) = y_p′(x₀) = 0 for all h in C(I). [Hint: Use Leibnitz's formula for differentiating integrals,† namely,

    d/dx ∫_{a(x)}^{b(x)} f(x, t) dt = ∫_{a(x)}^{b(x)} f_x(x, t) dt + f(x, b(x))b′(x) - f(x, a(x))a′(x).]

*30. Find the Green's function for initial value problems on (0, ∞) for the operator

    L = D² + (1/x)D + (1 - p²/x²),

p a non-negative real number.

*4-5 VARIATION OF PARAMETERS; GREEN'S FUNCTIONS (continued)

The method of variation of parameters can easily be extended to equations of arbitrary order. In this case we begin with a normal equation

    y^{(n)} + a_{n-1}(x)y^{(n-1)} + ··· + a₀(x)y = h(x)    (4-34)

defined on an interval I, and again assume that the general solution

    y_h = c₁y₁(x) + ··· + c_n y_n(x)    (4-35)

of the associated homogeneous equation is known. Then, following the argument given in the second-order case, we seek a particular solution of the form

    y_p = c₁(x)y₁(x) + ··· + c_n(x)y_n(x),    (4-36)

† See Theorem 1-36.


where, in addition to the requirement that y_p satisfy (4-34), we impose the following n - 1 conditions on the unknown functions c₁(x), ..., c_n(x):

    c₁′y₁ + ··· + c_n′y_n = 0,
    c₁′y₁′ + ··· + c_n′y_n′ = 0,
    ....................................
    c₁′y₁^{(n-2)} + ··· + c_n′y_n^{(n-2)} = 0,    (4-37)

for all x in I. If the expression for y_p is now substituted in (4-34) and the above conditions are used, we obtain the additional equation

    c₁′y₁^{(n-1)} + ··· + c_n′y_n^{(n-1)} = h(x),    (4-38)

and, for each x in I, (4-37) and (4-38) may be viewed as a system of n linear equations in the unknowns c₁′, ..., c_n′, whose determinant is W[y₁(x), ..., y_n(x)]. Our earlier reasoning still applies, and we can obtain a particular solution for (4-34) by solving this system for c₁′, ..., c_n′, integrating, and then substituting the resulting functions in (4-36).


In fact, if V_k(x) denotes the determinant obtained from W[y₁(x), ..., y_n(x)] by replacing its kth column with the column (0, 0, ..., 0, 1), then a straightforward computation gives

    c_k′(x) = V_k(x)h(x) / W[y₁(x), ..., y_n(x)].    (4-39)

(See Exercise 17.) Hence, just as in the second-order case, the particular solution may be written in integral form as

    y_p(x) = ∫_{x₀}^{x} ( [y₁(x)V₁(t) + ··· + y_n(x)V_n(t)] / W[y₁(t), ..., y_n(t)] ) h(t) dt,    (4-40)

where x₀ is any point in I, or as

    y_p(x) = ∫_{x₀}^{x} K(x, t)h(t) dt,    (4-41)

where

    K(x, t) = [y₁(x)V₁(t) + ··· + y_n(x)V_n(t)] / W[y₁(t), ..., y_n(t)],    (4-42)

or, for the reader who prefers determinant notation,

                | y₁(t)          ···   y_n(t)         |
                | y₁′(t)         ···   y_n′(t)        |
                | ⋮                    ⋮              |
                | y₁^{(n-2)}(t)  ···   y_n^{(n-2)}(t) |
                | y₁(x)          ···   y_n(x)         |
    K(x, t) = ──────────────────────────────────────── .    (4-43)
                | y₁(t)          ···   y_n(t)         |
                | y₁′(t)         ···   y_n′(t)        |
                | ⋮                    ⋮              |
                | y₁^{(n-2)}(t)  ···   y_n^{(n-2)}(t) |
                | y₁^{(n-1)}(t)  ···   y_n^{(n-1)}(t) |

The function K(x, t) defined here is called the Green's function for the operator L = Dⁿ + a_{n-1}(x)D^{n-1} + ··· + a₀(x) (for initial value problems in the interval I), and the expression

    G(h) = ∫_{x₀}^{x} K(x, t)h(t) dt    (4-44)

defines a right inverse G: C(I) → Cⁿ(I) for the operator L. In fact, G is the inverse for L such that G(h) satisfies the initial conditions

    G(h)(x₀) = G(h)′(x₀) = ··· = G(h)^{(n-1)}(x₀) = 0    (4-45)

for all h in C(I).
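For n = 2, Formula (4-43) can be transcribed almost literally into code. The sketch below is our illustration, not part of the text; it evaluates the two determinants for the basis sin x, cos x of y″ + y = 0 and recovers sin(x - t), in agreement with Exercise 18 below:

```python
import math

def det2(a, b, c, d):
    # | a  b |
    # | c  d |
    return a*d - b*c

def K(x, t, y1, y2, dy1, dy2):
    # Formula (4-43) with n = 2: the top determinant has rows [y_i(t)] and
    # [y_i(x)]; the bottom one is the Wronskian evaluated at t
    num = det2(y1(t), y2(t), y1(x), y2(x))
    den = det2(y1(t), y2(t), dy1(t), dy2(t))
    return num / den

val = K(1.1, 0.3, math.sin, math.cos, math.cos, lambda s: -math.sin(s))
print(val, math.sin(1.1 - 0.3))
```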

Example. Find a particular solution y_p for the equation

    3y‴ + 5y″ - 2y′ = r(x),    (4-46)

r(x) continuous on (-∞, ∞).

Here the general solution of the associated homogeneous equation is c₁ + c₂e^{-2x} + c₃e^{x/3}. Hence

    y_p = c₁(x) + c₂(x)e^{-2x} + c₃(x)e^{x/3},

where c₁(x), c₂(x), c₃(x) satisfy the identities

    c₁′(x) + c₂′(x)e^{-2x} + c₃′(x)e^{x/3} = 0,
    -2c₂′(x)e^{-2x} + (1/3)c₃′(x)e^{x/3} = 0,
    4c₂′(x)e^{-2x} + (1/9)c₃′(x)e^{x/3} = r(x)/3.

Thus

    c₁′(x) = -r(x)/2,
    c₂′(x) = (1/14)e^{2x}r(x),
    c₃′(x) = (3/7)e^{-x/3}r(x),

and it follows that

    y_p = -(1/2) ∫ r(x) dx + (e^{-2x}/14) ∫ e^{2x}r(x) dx + (3e^{x/3}/7) ∫ e^{-x/3}r(x) dx

        = ∫_{x₀}^{x} [-1/2 + (1/14)e^{-2(x-t)} + (3/7)e^{(x-t)/3}] r(t) dt.

Alternately, we could have computed the Green's function K(x, t) for the (normal) operator D³ + (5/3)D² - (2/3)D, and used (4-41) to express y_p as an integral involving K(x, t). Starting with the basis 1, e^{-2x}, e^{x/3} for the solution space of the associated homogeneous equation, we then obtain

                | 1   e^{-2t}    e^{t/3}      |
                | 0   -2e^{-2t}  (1/3)e^{t/3} |
                | 1   e^{-2x}    e^{x/3}      |
    K(x, t) = ──────────────────────────────────
                | 1   e^{-2t}    e^{t/3}      |
                | 0   -2e^{-2t}  (1/3)e^{t/3} |
                | 0   4e^{-2t}   (1/9)e^{t/3} |

            = [(7/3)e^{-5t/3} - 2e^{-2t}e^{x/3} - (1/3)e^{t/3}e^{-2x}] / [-(14/9)e^{-5t/3}]

            = -3/2 + (3/14)e^{-2(x-t)} + (9/7)e^{(x-t)/3}.

Thus, since h(t) = r(t)/3 in normal form,

    y_p = ∫_{x₀}^{x} K(x, t)(r(t)/3) dt = ∫_{x₀}^{x} [-1/2 + (1/14)e^{-2(x-t)} + (3/7)e^{(x-t)/3}] r(t) dt,

which agrees with the result obtained above.
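A numerical spot check of this K (our sketch, not the text's) confirms the diagonal identities K(x, x) = 0, K₁(x, x) = 0, K₂(x, x) = 1 that any Green's function for a normal third-order operator must satisfy (cf. Exercise 20):

```python
import math

def K(x, t):
    # K(x,t) = -3/2 + (3/14)e^{-2(x-t)} + (9/7)e^{(x-t)/3}
    # for the normal operator D^3 + (5/3)D^2 - (2/3)D
    u = x - t
    return -1.5 + (3.0/14.0)*math.exp(-2.0*u) + (9.0/7.0)*math.exp(u/3.0)

def K1(x, t, h=1e-5):
    # central-difference estimate of dK/dx
    return (K(x + h, t) - K(x - h, t)) / (2*h)

def K2(x, t, h=1e-4):
    # central-difference estimate of d^2K/dx^2
    return (K(x + h, t) - 2*K(x, t) + K(x - h, t)) / h**2

print(K(0.6, 0.6), K1(0.6, 0.6), K2(0.6, 0.6))
```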

The remainder of this section will be devoted to taking a closer look at Green's functions for initial-value problems, and to establishing some of their more important properties. Throughout this discussion we shall assume that

    L = Dⁿ + a_{n-1}(x)D^{n-1} + ··· + a₀(x)

is a fixed linear differential operator on Cⁿ(I), and that K(x, t) is the function obtained above from the general solution of the equation Ly = 0. It then follows

from the very way in which K(x, t) was constructed (see Exercises 18-21 below) that

(1) K(x, t) is defined throughout the region R of the xt-plane consisting of all points (x, t) with x and t in I (see Fig. 4-1);

(2) K(x, t) and ∂K/∂x, ∂²K/∂x², ..., ∂ⁿK/∂xⁿ are continuous everywhere in R;

(3) For every x₀ in I, and every h in C(I), the function

    y(x) = ∫_{x₀}^{x} K(x, t)h(t) dt

is a solution of the initial-value problem

    Ly = h;    y(x₀) = y′(x₀) = ··· = y^{(n-1)}(x₀) = 0

on I.

FIGURE 4-1

These properties are actually sufficient to characterize the Green's function for initial-value problems involving L in the sense that K(x, t) is the unique function defined in R which satisfies (1), (2), and (3). This assertion will be proved below as Theorem 4-3, and is mentioned here only to motivate the following definition.

Definition 4-1. A function H(x, t) is said to be a Green's function for initial value problems involving the linear differential operator L if and only if H(x, t) enjoys the three properties listed above for the function K(x, t).

This said, we proceed at once to give an alternate, and for our purposes much more useful, description of a Green's function for L. For convenience, we shall denote the various derivatives ∂H/∂x, ∂²H/∂x², ... appearing in the following argument by H₁, H₂, .... And with this notation in effect, we have

Theorem 4-2. Let H(x, t) be defined throughout the region R described above, and suppose that H and its partial derivatives H₁, H₂, ..., H_n are continuous everywhere in R. Then H(x, t) is a Green's function for the linear differential operator L = Dⁿ + a_{n-1}(x)D^{n-1} + ··· + a₀(x) if and only if the following identities are satisfied throughout R:

    H(x, x) = 0,
    H₁(x, x) = 0,
    ..............
    H_{n-2}(x, x) = 0,    (4-47)
    H_{n-1}(x, x) = 1,

and

    H_n(x, t) + a_{n-1}(x)H_{n-1}(x, t) + ··· + a₀(x)H(x, t) = 0.    (4-48)

Proof. First assume that H(x, t) is a Green's function for L. Then, by definition, the function

    y(x) = ∫_{x₀}^{x} H(x, t)h(t) dt    (4-49)

is a solution of the initial value problem

    Ly = h;    y(x₀) = y′(x₀) = ··· = y^{(n-1)}(x₀) = 0

for every x₀ in I, and every h in C(I); see (3) above. We now differentiate (4-49) using Leibnitz's formula* to obtain

    y′(x) = ∫_{x₀}^{x} H₁(x, t)h(t) dt + H(x, x)h(x),    (4-50)

which reduces to

    H(x₀, x₀)h(x₀) = 0

when x = x₀ (recall that y′(x₀) = 0). But, by assumption, this expression is valid for all h in C(I), and hence, in particular, for h = 1. Thus H(x₀, x₀) = 0, and since x₀ can be chosen arbitrarily in I, we have

    H(x, x) = 0,

and

    y′(x) = ∫_{x₀}^{x} H₁(x, t)h(t) dt.    (4-51)

We now repeat the argument, starting with (4-51), to get, first

    y″(x) = ∫_{x₀}^{x} H₂(x, t)h(t) dt + H₁(x, x)h(x),

then

    H₁(x, x) = 0,

and finally

    y″(x) = ∫_{x₀}^{x} H₂(x, t)h(t) dt.

Continuing in this fashion we eventually arrive at the situation

    H_{n-2}(x, x) = 0,

and

    y^{(n-1)}(x) = ∫_{x₀}^{x} H_{n-1}(x, t)h(t) dt.

* Leibnitz's formula is

    d/dx ∫_{a(x)}^{b(x)} f(x, t) dt = ∫_{a(x)}^{b(x)} (∂f/∂x)(x, t) dt + f(x, b(x))b′(x) - f(x, a(x))a′(x).

Differentiating once more, we obtain

    y^{(n)}(x) = ∫_{x₀}^{x} H_n(x, t)h(t) dt + H_{n-1}(x, x)h(x),

whence

    y^{(n)}(x₀) = H_{n-1}(x₀, x₀)h(x₀).    (4-52)

But since y(x) is a solution of Ly = h,

    y^{(n)}(x) + a_{n-1}(x)y^{(n-1)}(x) + ··· + a₀(x)y(x) = h(x),

and the initial conditions in effect imply that

    y^{(n)}(x₀) = h(x₀).

This, together with (4-52) and the fact that h and x₀ are still arbitrary, implies that

    H_{n-1}(x, x) = 1,

and

    y^{(n)}(x) = ∫_{x₀}^{x} H_n(x, t)h(t) dt + h(x).

Thus, in particular, we have established the several identities listed in (4-47).

To prove (4-48) we substitute the formulas obtained above for y, y′, ..., y^{(n)} in Ly = h. After the various terms are collected we have

    ∫_{x₀}^{x} [H_n(x, t) + a_{n-1}(x)H_{n-1}(x, t) + ··· + a₀(x)H(x, t)]h(t) dt = 0,    (4-53)

and the fact that this expression holds for all x₀ in I and all h in C(I) allows us to conclude that the bracketed portion of the integrand is identically zero (see Exercise 23). And with this, the first part of the proof is complete.

As for the remainder, the argument needed to show that (4-47) and (4-48) imply that H(x, t) is a Green's function for L is an even more elementary computation than the one just given, and has therefore been relegated to the exercises (see Exercise 24). |

Among other things, this theorem asserts that for any fixed t₀ in the interval I, the function

    k(x) = H(x, t₀)

is a solution of the initial-value problem

    Ly = 0;    y(t₀) = y′(t₀) = ··· = y^{(n-2)}(t₀) = 0,    y^{(n-1)}(t₀) = 1.

But, as we know, the solution of this problem is unique. Thus the values of H(x, t) are uniquely determined by the operator L on the line segment consisting of those points (x, t) in R with t = t₀ (see Fig. 4-2). However, t₀ can be chosen arbitrarily in I, and we therefore have our main result.

Theorem 4-3. The Green's function for initial value problems on I involving a linear differential operator L is uniquely determined by L, and hence must coincide with the function K(x, t) defined by (4-42) or (4-43). In particular, K(x, t) is independent of the basis for the solution space of Ly = 0 used in computing it.

FIGURE 4-2

Everything that has been said up to this point in our discussion applies to
arbitrary linear differential operators. As might be expected, much more precise
information can be given in the case of operators with constant coefficients, and
we conclude this section with a theorem which describes the Green's functions
obtained in this special case.

Theorem 4-4. The Green's function for a constant coefficient linear differential operator L can be written in the form k(x - t), where k(x) is the solution on (-∞, ∞) of the initial value problem

    Ly = 0;    y(0) = y′(0) = ··· = y^{(n-2)}(0) = 0,    y^{(n-1)}(0) = 1.

Proof. The function H(x, t) = k(x - t) clearly satisfies the identities (4-47) and (4-48) of Theorem 4-2, and hence, by Theorem 4-3, is the Green's function for L.* |
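Theorem 4-4 suggests a purely numerical route to Green's functions for constant-coefficient operators: integrate the initial value problem once. The sketch below is ours, not the text's; it does this for L = D² + 1 with a classical fourth-order Runge-Kutta step and recovers k(x) = sin x, so that k(x - t) is the Green's function sin(x - t) found in Section 4-4:

```python
import math

def k(x, n=1000):
    # solve y'' = -y, y(0) = 0, y'(0) = 1 by classical RK4;
    # by Theorem 4-4 the Green's function for D^2 + 1 is k(x - t)
    h = x / n
    y, v = 0.0, 1.0                      # y and y'
    for _ in range(n):
        k1y, k1v = v, -y
        k2y, k2v = v + h/2*k1v, -(y + h/2*k1y)
        k3y, k3v = v + h/2*k2v, -(y + h/2*k2y)
        k4y, k4v = v + h*k3v, -(y + h*k3y)
        y += h/6*(k1y + 2*k2y + 2*k3y + k4y)
        v += h/6*(k1v + 2*k2v + 2*k3v + k4v)
    return y

print(k(1.0), math.sin(1.0))
```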

EXERCISES

Use the method of variation of parameters (without employing Formulas (4-41) through (4-43)) to find the general solution of each of the following differential equations.

1. y‴ - y″ - y′ + y = 4xeˣ
2. y‴ - y′ = sin x
3. y‴ - 2y″ = 4(x + 1)
4. y‴ - 3y″ - y′ + 3y = 1 + eˣ
5. y‴ - 7y′ + 6y = 2 sin x
6. y‴ - 3y′ - 2y = 9e^{-x}
7. y^{(iv)} - y = sin x
8. y‴ + y″ + y′ + y = 2(sin x + cos x)
9. y^{(iv)} - y′ = 2xeˣ
10. y^{(iv)} - y″ = x² + 1

In Exercises 11 through 16 compute the Green's function K(x, t) for the given operator (a) by using Formula (4-42) or (4-43) and (b) by applying Theorem 4-4.

11. D²(D - 1)
12. D(D² - 4)
13. D³ - 6D² + 11D - 6
14. D³ + (5/3)D² - (2/3)D
15. D²(D² - 1)
16. D⁴ - 1

* Note that the assumption of constant coefficients is needed to verify (4-48); see Exercise 25.

17. Verify Formula (4-39). [Hint: Use Cramer's rule.]

Exercises 18 through 21 concern properties of the function K(x, t) defined by Formula (4-42) or (4-43). Establish each of them.

18. When n = 2, Formula (4-43) reduces to (4-30) of the preceding section.

19. K(x, t) is defined and has continuous derivatives through order n in the region R described in the text.

20. The partial derivatives with respect to x of K(x, t) satisfy the identities

    K(x, x) = K₁(x, x) = ··· = K_{n-2}(x, x) = 0,    K_{n-1}(x, x) = 1,

in the interval I. Here

    K_i(x, t) = (∂^i/∂x^i) K(x, t).

21. For each x₀ in I and each h in C(I) the function

    y(x) = ∫_{x₀}^{x} K(x, t)h(t) dt

satisfies the initial conditions y(x₀) = ··· = y^{(n-1)}(x₀) = 0 and the equation Ly = h(x). [Hint: Use Leibnitz's formula and the result of Exercise 20.]

22. If f(x) is continuous on the interval [a, b] and if

    ∫_{a}^{b} f(x)g(x) dx = 0

for every g in C[a, b], then f = 0 on [a, b]. [Hint: Assume f(x₀) ≠ 0 and use the continuity of f to obtain an interval (x₀ - δ, x₀ + δ) in which |f(x)| > |f(x₀)|/2. Then find a function g in C[a, b] for which the above integral is different from zero.]

23. Use the result of Exercise 22 to prove the assertion made in the text concerning the bracketed term in (4-53).

24. Let H(x, t) satisfy the hypotheses of Theorem 4-2, and the identities given in (4-47) and (4-48). Prove that for every x₀ in I and every h in C(I) the function

    y(x) = ∫_{x₀}^{x} H(x, t)h(t) dt

satisfies the initial value problem

    Ly = h;    y(x₀) = y′(x₀) = ··· = y^{(n-1)}(x₀) = 0.

[Hint: Use Leibnitz's formula.]


25. Show that the function H(x, t) = k(x - t) appearing in the proof of Theorem 4-4 satisfies (4-47) and (4-48).

The only values of K(x, t) which enter into the integration in (4-44) are those for which the point (x, t) lies in the subregion R_{x₀} of R shaded in Fig. 4-3. This suggests the possibility of generalizing the notion of a Green's function as follows:

Definition 4-2. A function K̄(x, t) is called a Green's function for the operator

    L = Dⁿ + a_{n-1}(x)D^{n-1} + ··· + a₀(x)

for initial value problems at the point x₀ if it is defined and continuous in R_{x₀}, and if for every h in C(I) the function

    y(x) = ∫_{x₀}^{x} K̄(x, t)h(t) dt

is the solution of the initial value problem

    Ly = h;    y(x₀) = ··· = y^{(n-1)}(x₀) = 0.

FIGURE 4-3

In the exercises which follow we explore some of the properties of these functions, and, in particular, show that under certain additional assumptions they coincide with K(x, t).

26. Let K̄(x, t) be as described above, and assume that L = D² + a₁(x)D + a₀(x). Suppose, in addition, that
(i) K̄(x, t), K̄₁(x, t), K̄₂(x, t) are continuous in R_{x₀},
(ii) K̄(x, x) = 0 on I,
(iii) K̄₁(x, x) = 1 on I.
Prove that

    K̄₂(x, t) + a₁(x)K̄₁(x, t) + a₀(x)K̄(x, t) = 0.

[Hint: Follow the proof of Theorem 4-2.]

27. Let K̄(x, t) be as described in Exercise 26. Prove that K̄(x, t) = K(x, t) in the region R_{x₀}.

28. Generalize the results of Exercises 26 and 27 to the nth-order case.

4-6 REDUCTION OF ORDER

One of the remarkable properties of linear differential equations is that we can simplify (and sometimes solve) the equation Ly = h even when we do not have a complete basis for the solution space of Ly = 0. Again the technique is variation of parameters, but this time it leads to a reduction in the order of the equation. The following example will serve to introduce the method.

Example 1. Consider the second-order equation

    x² d²y/dx² + x³ dy/dx - 2(1 + x²)y = x    (4-54)

on the interval (0, ∞). Here none of our earlier techniques is sufficient to obtain the general solution of the associated homogeneous equation

    x² d²y/dx² + x³ dy/dx - 2(1 + x²)y = 0.    (4-55)

However, the solution y = x² of (4-55) is easily discovered by inspection,* and we now proceed to seek solutions of (4-54) in the form

    y = x²c(x).

Then

    y′ = x²c′(x) + 2xc(x),
    y″ = x²c″(x) + 4xc′(x) + 2c(x),

and (4-54) yields

    x²(x²c″ + 4xc′ + 2c) + x³(x²c′ + 2xc) - 2(1 + x²)x²c = x,

which, upon simplification, becomes

    x⁴c″ + (4x³ + x⁵)c′ = x,

or

    c″ + ((4 + x²)/x) c′ = 1/x³.    (4-56)

But (4-56) may be viewed as a first-order equation in c′, and as such can be solved by the technique introduced in Section 3-3. Indeed, using the integrating factor

    e^{∫[(4+x²)/x] dx} = x⁴e^{x²/2},

we obtain

    c′ = [k₁ + ∫ xe^{x²/2} dx] x^{-4}e^{-x²/2}
       = [k₁ + e^{x²/2}] x^{-4}e^{-x²/2},

where k₁ is an arbitrary constant. Hence

    c(x) = -1/(3x³) + k₁ ∫ x^{-4}e^{-x²/2} dx + k₂,

where k₂ is also arbitrary, and it follows that

    y = x²c(x) = -1/(3x) + k₁x² ∫ x^{-4}e^{-x²/2} dx + k₂x²

* The phrase "discovered by inspection" is just a dodge to hide the fact that the process was one of trial and error.

is a solution of (4-54). In fact, since x² and x² ∫ x^{-4}e^{-x²/2} dx are linearly independent in C(0, ∞), this expression is actually the general solution of (4-54) on the interval (0, ∞).
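Taking k₁ = k₂ = 0 above leaves the particular solution y = -1/(3x). As a check (our sketch, not the text's), the residual of (4-54) for this function can be estimated by finite differences:

```python
def residual(x, h=1e-5):
    # x^2 y'' + x^3 y' - 2(1 + x^2) y - x for y = -1/(3x), x > 0
    y = lambda s: -1.0/(3.0*s)
    yp = (y(x + h) - y(x - h)) / (2*h)
    ypp = (y(x + h) - 2*y(x) + y(x - h)) / h**2
    return x*x*ypp + x**3*yp - 2*(1 + x*x)*y(x) - x

print([residual(x) for x in (0.5, 1.0, 3.0)])
```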

The preceding example is representative of the technique whereby the order of a linear differential equation can be reduced by one as soon as a single (nontrivial) solution of its associated homogeneous equation is known. To establish this assertion in general, let Ly = h be an equation of order n, and suppose that u(x) is a nontrivial solution of Ly = 0. Then, evaluating the left-hand side of the expression L[u(x)c(x)] = h(x) by means of Formula (4-12), we obtain

    (Lu)c + (L′u)c′ + (1/2!)(L″u)c″ + ··· + (1/n!)(L^{(n)}u)c^{(n)} = h.    (4-57)

But since Lu = 0, the first term in this equation vanishes, and we may therefore view (4-57) as a linear equation of order n - 1 in c′. This is the asserted reduction in order. In particular, this technique can always be used, as it was above, to find the general solution of a second-order equation whenever one nontrivial solution of its associated homogeneous equation is known.
Example 2. The second-order equation

    (D - a)²y = 0    (4-58)

has e^{ax} as a solution. Using the above technique to find the general solution we set

    y = c(x)e^{ax},

substitute in the equation, and obtain

    c″e^{ax} + 2ac′e^{ax} + a²ce^{ax} - 2a(c′e^{ax} + ace^{ax}) + a²ce^{ax} = 0.

This simplifies to c″ = 0, which, by two integrations, gives

    c = k₁x + k₂.

Thus

    y = (k₁x + k₂)e^{ax},

as expected.
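As a final check (our sketch, with the arbitrary illustrative values a = 0.7, k₁ = 2, k₂ = -1), one can verify numerically that (k₁x + k₂)e^{ax} is annihilated by (D - a)² = D² - 2aD + a²:

```python
import math

a, k1, k2 = 0.7, 2.0, -1.0           # illustrative values, not from the text

def y(x):
    return (k1*x + k2) * math.exp(a*x)

def residual(x, h=1e-4):
    # (D - a)^2 y = y'' - 2a y' + a^2 y, estimated by central differences
    yprime = (y(x + h) - y(x - h)) / (2*h)
    ypp = (y(x + h) - 2*y(x) + y(x - h)) / h**2
    return ypp - 2*a*yprime + a*a*y(x)

print([residual(x) for x in (0.0, 1.0, 2.0)])
```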

EXERCISES

In Exercises 1 through 6, find the general solution of each differential equation using the given solution of the associated homogeneous equation.

1. y″ + xy′ = 3x,    1
2. xy″ - (x + 2)y′ + 2y = x³ + x,    eˣ
3. xy″ - y′ = 0,    1
4. xy″ + (2 + x)y′ + y = e^{-x},    1/x



5. 4x²y″ - 8xy′ + 9y = 0,    x^{3/2}
6. xy″ + (x - 1)y′ - y = 0,    e^{-x}

7. Starting with the solution y = e^{-∫[a₀(x)/a₁(x)] dx} of the normal first-order equation

    a₁(x) dy/dx + a₀(x)y = 0,

use the method of reduction of order to find the general solution of

    a₁(x) dy/dx + a₀(x)y = h(x).

8. The second-order equation

    (1 - x²)y″ - 2xy′ + 2y = 0

has y₁ = x as a particular solution. Use the method of variation of parameters to reduce the order of this equation, and then find a second linearly independent solution in each of the intervals (-∞, -1), (-1, 1), (1, ∞) in which the equation is normal.

*9. It can be shown that the equation*

    xy″ + y′ + xy = 0

has a solution, J₀(x), on the interval (0, ∞) which can be expanded in a power series as

    J₀(x) = 1 - x²/2² + x⁴/(2⁴(2!)²) - x⁶/(2⁶(3!)²) + ···.

Find a second solution, linearly independent of J₀(x) in C(0, ∞), of the form

    J₀(x) ∫ dx/(x[J₀(x)]²),

and then prove that this solution can be written in the form

    J₀(x) ln x + (power series in x).

What is the behavior of this solution as x → 0?

10. Let y₁ and y₂ be linearly independent solutions of a normal third-order homogeneous linear differential equation Ly = 0 on an interval I, and let My = 0 be the second-order equation obtained by using the solution y₁ to reduce the order of Ly = 0 by one. Prove that (y₂/y₁)′ is a solution of My = 0 on any subinterval of I in which y₁ has no zeros.

* This equation is known as Bessel's equation of order zero and will be studied in detail in Chapter 15.

most efficient way of producing such a solution, and for certain equations the work involved can be considerably lightened. For instance, it would be pointless to use variation of parameters to find a particular solution of (D² - D + 5)y = 3, since the solution y_p = 3/5 is immediately evident. And even for an equation such as

    (D² + 3)y = eˣ

it is obvious that a solution of the form

    y_p = Aeˣ

must exist for a suitable value of A. Moreover, by substituting Aeˣ in the equation we obtain

    Aeˣ + 3Aeˣ = eˣ,

from which it follows that A = 1/4, and y_p = eˣ/4.


This method, wherein a particular solution is known up to certain undetermined
constants, and the values of these constants are found by using the differential
equation, is known as the method of undetermined coefficients. It is clear that this
method depends for its success upon the ability to recognize the form of a par-
ticular solution, and for this reason lacks the generality possessed by the method
of variation of parameters. Nevertheless, it can be used often enough to merit
some attention.
One type of equation for which this method always works is a constant coef-
ficient equation
Ly = h (4-59)

in which h itself is a solution of a linear differential equation with constant co-


efficients. For then we can find a linear differential operator L₁ which annihilates
h (i.e., such that L₁h = 0), and it follows that every solution of (4-59) is also a
solution of the homogeneous equation

L₁Ly = 0. (4-60)

Thus we can obtain a particular solution for (4-59) by appropriately determining


the constants in the general solution of (4-60). A few examples will suffice to
illustrate this technique.

Example 1. Since D³ annihilates the right-hand side of the equation

(D² + 1)y = 3x² + 4, (4-61)

a particular solution of (4-61) can be found among the solutions of the homo-
geneous equation

D³(D² + 1)y = 0. (4-62)

In other words, (4-61) has a particular solution of the form

yₚ = c₁ + c₂x + c₃x² + c₄ sin x + c₅ cos x (4-63)

for suitable values of c₁, . . . , c₅. In fact, we can say even more than this if we
observe that c₄ sin x + c₅ cos x is the general solution of the homogeneous
equation (D² + 1)y = 0. For then it is clear that the last two terms in (4-63)
will be annihilated when substituted in (4-61), and so, rather than mindlessly
dragging them through our computations only to see them disappear in the
process, we can begin by setting

yₚ = c₁ + c₂x + c₃x².

Substituting this expression in (4-61) we obtain

2c₃ + c₁ + c₂x + c₃x² = 3x² + 4,

from which it follows that

c₁ + 2c₃ = 4, c₂ = 0, c₃ = 3.

Thus c₁ = −2, c₂ = 0, c₃ = 3, and

yₚ = 3x² − 2.

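The arithmetic of Example 1 can be confirmed mechanically. The short Python sketch below (an illustration, not part of the original text) checks that yₚ = 3x² − 2 satisfies (D² + 1)y = 3x² + 4 at several sample points; since yₚ is quadratic, its second derivative is identically 6.

```python
# Spot-check of Example 1: y_p(x) = 3x^2 - 2 should satisfy y'' + y = 3x^2 + 4.
def y_p(x):
    return 3 * x**2 - 2

def residual(x):
    # the second derivative of y_p is identically 6
    return (6 + y_p(x)) - (3 * x**2 + 4)

print([residual(x) for x in (-2.0, 0.0, 1.5, 10.0)])  # [0.0, 0.0, 0.0, 0.0]
```

Because the residual vanishes identically, any choice of sample points would do equally well.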
Example 2. To find a particular solution of

(D² − 4D + 4)y = 2e^{2x} + cos x, (4-64)

we apply the operator (D − 2)(D² + 1) to the equation and obtain

(D − 2)³(D² + 1)y = 0.

The general solution of this last equation is

y = c₁e^{2x} + c₂xe^{2x} + c₃x²e^{2x} + c₄ sin x + c₅ cos x,

and since the first two terms are annihilated by the operator D² − 4D + 4, we
look for a particular solution of (4-64) of the form

yₚ = c₃x²e^{2x} + c₄ sin x + c₅ cos x.

In this case

yₚ′ = 2c₃xe^{2x} + 2c₃x²e^{2x} + c₄ cos x − c₅ sin x,
yₚ″ = 2c₃e^{2x} + 8c₃xe^{2x} + 4c₃x²e^{2x} − c₄ sin x − c₅ cos x,

and substitution in (4-64) yields

2c₃e^{2x} + (3c₄ + 4c₅) sin x + (3c₅ − 4c₄) cos x = 2e^{2x} + cos x.

Hence

2c₃ = 2, 3c₄ + 4c₅ = 0, 3c₅ − 4c₄ = 1,

and

c₃ = 1, c₄ = −4/25, c₅ = 3/25.

Thus the desired particular solution is

yₚ = x²e^{2x} − (4/25) sin x + (3/25) cos x.
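As a sanity check (an illustrative Python sketch, not part of the original text), the derivatives of yₚ can be coded by hand with c₃ = 1, c₄ = −4/25, c₅ = 3/25, and the residual of (4-64) evaluated numerically:

```python
import math

# y_p of Example 2 and its first two derivatives, written out explicitly.
def y(x):
    return x**2 * math.exp(2*x) - (4/25) * math.sin(x) + (3/25) * math.cos(x)

def dy(x):
    return (2*x + 2*x**2) * math.exp(2*x) - (4/25) * math.cos(x) - (3/25) * math.sin(x)

def d2y(x):
    return (2 + 8*x + 4*x**2) * math.exp(2*x) + (4/25) * math.sin(x) - (3/25) * math.cos(x)

def residual(x):
    # left side minus right side of (D^2 - 4D + 4)y = 2e^{2x} + cos x
    return d2y(x) - 4 * dy(x) + 4 * y(x) - (2 * math.exp(2*x) + math.cos(x))

print(max(abs(residual(x)) for x in (-1.0, 0.0, 0.5, 2.0)))  # ~0, roundoff only
```

The residual is zero in exact arithmetic; the printed value reflects only floating-point roundoff.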

EXERCISES

Use the method of undetermined coefficients to find a particular solution for each of the
following differential equations.

1. D(D + 1)y = 2x + 3eˣ                    2. D(D + 1)y = 2 + e⁻ˣ
3. D(D − 1)y = sin x                       4. (D² + 1)y = 3 cos x
5. (D² + 4D + 2)y = xe^{−2x}               6. (D² + D − 6)y = 6(x + 1)
7. (6D² + 2D − 1)y = 7x(x + 1)eˣ           8. (D² − 5D + 6)y = −2 + 36x² + eˣ
9. (D² − 4D + 4)y = (x + 1)³              10. (D² − 4D + 8)y = e^{2x}(1 + sin 2x)
11. (D² + 6D + 10)y = x⁴ + 2x² + 2        12. (D² − D + 1/4)y = xe^{x/2}
13. D(D² − 2D + 10)y = 3xeˣ
14. (D³ + 3D² + 3D + 1)y = x⁴ + 4x³ + 10x² + 20x + 1
15. (D³ − D² − D + 1)y = 2(x + 2e⁻ˣ)
16. (D³ − 3D − 2)y = eˣ(1 + xeˣ)
17. (D³ + D − 1)y = sin x + cos x
18. D²(D² + 1)y = 1 + 2xeˣ
19. D(D² − 1)(D − 2)y = x² + 2x + 3 − 2eˣ
20. (D⁴ + 5D² + 4)y = 2 cos x

Give the form of a particular solution for each of the following equations. The coeffi-
cients need not be evaluated.

21. (D² − 4D + 4)y = x(2e^{2x} + x sin x)
22. (D² + 2D + 2)y = x² − 3xe^{−2x} cos 5x
23. (D + 1)²(D³ − 1)y = 3e⁻ˣ + 5x² cos x
24. D²(D⁴ − 4D³ + 6D² − 4D + 1)y = (x + 1)(1 − eˣ)²
25. (D⁴ − 2D³ + 1)y = (2x − 1) cosh x + x sin x
26. (D³ − 1)(D² + D − 2)y = e^{x/2} sin √3 x − x cos √3 x
27. (D³ − 1)³y = (2x + 1)²eˣ + · · ·
28. (D² − 2D + 1)(D² − 4)²y = x sinh x + cosh 2x
29. [(4D² − 4D + 5)(D² + 2D + 1)]²y = x(1 + e^{x/2} sin x − x cos x)
30. D(D² − 4)⁵y = (x + 1)²[(x + 1) + sinh 2x]


4-8 THE EULER EQUATION


A linear differential equation of the form

xⁿ dⁿy/dxⁿ + a_{n−1}x^{n−1} d^{n−1}y/dx^{n−1} + · · · + a₁x dy/dx + a₀y = 0, (4-65)

with a₀, . . . , a_{n−1} constants, is called a (homogeneous) Euler equation of order n.
The reader should note that this equation is defined on the entire x-axis, but is
normal only on intervals which do not contain the point x = 0. It is one of the
relatively few equations with variable coefficients that can be solved in closed
form in terms of elementary functions, and is important because its solutions are,
to some extent, typical of those of a large class of linear differential equations
whose leading coefficient vanishes at the origin.

As we shall see, Eq. (4-65) can be converted into a (linear) equation with con-
stant coefficients by making the change of variable u = ln x, and hence can be
solved by the methods introduced in this chapter.* Although this reduction can
be effected in a routine fashion, it is illuminating to consider it from the point of
view of linear operators by introducing the transformation T: 𝒞ⁿ(−∞, ∞) →
𝒞ⁿ(0, ∞) defined by

(Tg)(x) = g(ln x) (4-66)

for all g in 𝒞ⁿ(−∞, ∞). Thus T maps the function x onto ln x, sin x onto sin (ln x),
etc., and is obviously linear. More important, it is invertible, with (T⁻¹f)(x) =
f(eˣ) for all f in 𝒞ⁿ(0, ∞), and hence is a one-to-one linear transformation mapping
𝒞ⁿ(−∞, ∞) onto 𝒞ⁿ(0, ∞). From this it follows that the problem of solving the
equation Ly = 0 on the interval (0, ∞), with L = xⁿDⁿ + a_{n−1}x^{n−1}D^{n−1} +
· · · + a₁xD + a₀, is equivalent to the problem of finding all functions g in
𝒞ⁿ(−∞, ∞) such that LTg = 0. In other words, we must find the null space of
the transformation LT.

To this end we begin by computing the various products DT, D²T, . . . , as
follows:

DTg = Dg(ln x) = (1/x) g′(ln x) = (1/x) TDg,

D²Tg = D(DTg) = D((1/x) TDg)
     = −(1/x²) TDg + (1/x) D(TDg)
     = (1/x²) TD(D − 1)g,

* In the following discussion we shall restrict our attention to the interval (0, ∞). On
(−∞, 0) the change of variable u = ln (−x) must be used.

and, in general,

DᵏTg = (1/xᵏ) TD(D − 1) · · · (D − k + 1)g,  k = 1, 2, . . . . (4-67)

Thus

xᵏDᵏT = TD(D − 1) · · · (D − k + 1),

and hence when

L = xⁿDⁿ + a_{n−1}x^{n−1}D^{n−1} + · · · + a₁xD + a₀, (4-68)

we have

LT = TL̄, (4-69)

where L̄ is the constant coefficient linear differential operator

L̄ = D(D − 1) · · · (D − n + 1) + a_{n−1}D(D − 1) · · · (D − n + 2)
    + · · · + a₁D + a₀. (4-70)

Equation (4-69), together with the fact that T is one-to-one, implies that the
null space of the transformation LT coincides with the null space of L̄. This
establishes our contention that (4-65) can be reduced to an equation with constant
coefficients and also allows us to describe the solution of any (homogeneous)
Euler equation on the interval (0, ∞) as follows:

The general solution on (0, ∞) of the equation Ly = 0, L as in (4-68), is

y = c₁y₁(ln x) + · · · + cₙyₙ(ln x), (4-71)

where y₁(u), . . . , yₙ(u) are a basis for the solution space of the constant
coefficient equation L̄y = 0, L̄ deduced from L by (4-70), and c₁, . . . , cₙ
are arbitrary constants.
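The passage from L to L̄ in (4-70) is purely combinatorial, so it is easy to mechanize. The following Python sketch (an illustration, not part of the original text; the helper names are ours) represents polynomials in m as coefficient lists and assembles the characteristic polynomial of L̄, that is, the indicial polynomial of the Euler operator:

```python
# Build the constant coefficient polynomial of (4-70) from the Euler
# coefficients a_0, ..., a_{n-1}.  A polynomial in m is a list p with
# p[k] the coefficient of m^k.
def poly_mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_add(p, q):
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def falling_factorial(k):
    # m(m-1)...(m-k+1); the empty product (k = 0) is 1
    p = [1]
    for r in range(k):
        p = poly_mul(p, [-r, 1])   # multiply by (m - r)
    return p

def indicial_poly(a):
    # a = [a_0, ..., a_{n-1}]; the coefficient of x^n D^n is taken to be 1
    total = falling_factorial(len(a))
    for k, a_k in enumerate(a):
        total = poly_add(total, [a_k * c for c in falling_factorial(k)])
    return total

# x^2 D^2 + 2xD - 6: m(m - 1) + 2m - 6 = m^2 + m - 6
print(indicial_poly([-6, 2]))  # [-6, 1, 1]
```

For the operator x²D² + 2xD − 6, treated in Example 2 of this section, the result is m² + m − 6 = (m − 2)(m + 3), whose roots 2 and −3 produce the solutions x² and x⁻³.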

Finally, to remove the restriction on the interval, we note that for each non-
negative integer k,

xᵏDᵏg(−x) = xᵏ(−1)ᵏ(Dᵏg)(−x) = (−x)ᵏ(Dᵏg)(−x).

Hence if y(x) is a solution of the Euler equation Ly = 0 on (0, ∞), y(−x) is a
solution on (−∞, 0), and y(|x|) is a solution on both (0, ∞) and (−∞, 0) (see
Exercise 12).

Example 1. The Euler equation of order two. Let

x²y″ + a₁xy′ + a₀y = 0 (4-72)

be a (homogeneous) Euler equation of order two. Then the constant coefficient
linear differential operator L̄ appearing above is D(D − 1) + a₁D + a₀, and
the general solution y(x) of (4-72) is determined by the roots α₁, α₂ of the quad-
ratic equation

m(m − 1) + a₁m + a₀ = 0, (4-73)

known as the indicial equation associated with (4-72). Thus

(i) if α₁ and α₂ are real, α₁ ≠ α₂, then

y(x) = c₁|x|^{α₁} + c₂|x|^{α₂};

(ii) if α₁ = α₂ = α, then

y(x) = |x|^α(c₁ + c₂ ln |x|);

(iii) if α₁ = a + bi, α₂ = a − bi, b > 0, then

y(x) = |x|^a[c₁ sin (b ln |x|) + c₂ cos (b ln |x|)].

We shall have occasion to refer to these results in a later chapter.

Example 2. Find the solution of

x²y″ + 2xy′ − 6y = 0 (4-74)

which passes through the point (1, 1) with slope zero.

The indicial equation associated with (4-74) is

m(m − 1) + 2m − 6 = 0,

and has 2 and −3 as roots. Thus the general solution of (4-74) on (0, ∞) is

y = c₁x² + c₂x⁻³,

and since the given initial conditions y(1) = 1, y′(1) = 0 imply that c₁ = 3/5,
c₂ = 2/5, the required solution is

y = (3/5)x² + (2/5)x⁻³.
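A quick numerical check (an illustrative Python sketch, not part of the original text) confirms that the solution of Example 2 satisfies both the differential equation and the initial conditions; the derivatives are written out by hand.

```python
# y = (3/5)x^2 + (2/5)x^{-3} and its first two derivatives.
def y(x):
    return 0.6 * x**2 + 0.4 * x**-3

def dy(x):
    return 1.2 * x - 1.2 * x**-4

def d2y(x):
    return 1.2 + 4.8 * x**-5

def residual(x):
    # left side of x^2 y'' + 2x y' - 6y = 0
    return x**2 * d2y(x) + 2 * x * dy(x) - 6 * y(x)

print(y(1.0), dy(1.0))  # 1.0 0.0
```

The residual vanishes for every x > 0, as the exercise below the example list invites the reader to verify by hand.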

Example 3. In Example 2 of Section 4-4 we said that the general solution of the
equation

xy″ + y′ = 0 (4-75)

is

y = c₁ + c₂ ln |x|. (4-76)

To prove this assertion we multiply (4-75) by x, and obtain the second order
Euler equation x²y″ + xy′ = 0. Here the indicial equation is m² = 0, and
(4-76) now follows from the results of Example 1.
The general solution of a nonhomogeneous Euler equation Ly = h, defined on
an interval I not including the origin, can be obtained by using variation of param-
eters or the Green's function for L (see Example 2, Section 4-4). But here too the
problem can be reduced to one involving a constant coefficient equation, and is
usually easier to solve in this form. To do so we again use the linear transformation
T, this time to rewrite the equation

Ly = h (4-77)

as

LT(T⁻¹y) = T(T⁻¹h).

But, by (4-69), LT = TL̄. Thus

TL̄(T⁻¹y) = T(T⁻¹h),

and since T is one-to-one, this last equation may be rewritten

L̄(T⁻¹y) = T⁻¹h, (4-78)

in which form it is a constant coefficient equation. Moreover, we can now assert
that if ȳ(u) is the general solution of (4-78), then ȳ(ln |x|) is the general solution
of (4-77) on I.

Example 4. Find the general solution of

x²y″ + xy′ = x + x² (4-79)

on (0, ∞).

In this case L̄ = D², T⁻¹(x + x²) = eᵘ + e^{2u}, and the transformed version
of (4-79) is

D²ȳ = eᵘ + e^{2u}. (4-80)

The general solution of D²ȳ = 0 is c₁ + c₂u, and a particular solution of (4-80)
may be found by using the method of undetermined coefficients on the expression
Aeᵘ + Be^{2u}. A simple computation gives A = 1, B = 1/4. Thus the general
solution of (4-80) is

ȳ = eᵘ + (1/4)e^{2u} + c₁ + c₂u,

and it follows that the general solution of (4-79) on (0, ∞) is

y = x + x²/4 + c₁ + c₂ ln x.
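Checking the particular solution of Example 4 takes one line (an illustration, not part of the original text): with y = x + x²/4 we have y′ = 1 + x/2 and y″ = 1/2, so the left side of (4-79) can be evaluated directly.

```python
# residual of x^2 y'' + x y' - (x + x^2) for the particular solution y = x + x^2/4
def residual(x):
    return x**2 * 0.5 + x * (1 + 0.5 * x) - (x + x**2)

print([residual(x) for x in (0.5, 1.0, 3.0)])  # [0.0, 0.0, 0.0]
```

The constants c₁ and c₂ ln x are annihilated by x²D² + xD, so they do not affect the residual.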

EXERCISES

Find the general solution of each of the following Euler equations.

1. x²y″ + 2xy′ − 2y = 0                    2. 4x²y″ − 8xy′ + 9y = 0
3. x²y″ + xy′ + 9y = 0                     4. x²y″ − 3xy′ + 7y = 0
5. x²y″ + xy′ − p²y = 0, p a constant      6. 2x²y″ + xy′ − y = 0
7. x³y‴ − 2x²y″ − 17xy′ − 7y = 0           8. x³y‴ − 3x²y″ + 6xy′ − 6y = 0
9. x³y‴ + 4x²y″ − 2y = 0                  10. x³y‴ + 4x²y″ − 8xy′ + 8y = 0
11. x⁴y⁽⁴⁾ + 6x³y‴ + 7x²y″ + xy′ − y = 0

12. Let S be the linear transformation from 𝒞ⁿ(0, ∞) to 𝒞ⁿ(−∞, 0) defined by

(Sf)(x) = f(−x).

(a) Show that S is one-to-one, onto, and that SL = LS whenever

L = xⁿDⁿ + a_{n−1}x^{n−1}D^{n−1} + · · · + a₁xD + a₀.

(b) Use the results of (a) to prove that L(Sy) = 0 if and only if Ly = 0, L as above,
and hence deduce that y(−x) is a solution of Ly = 0 on (−∞, 0) if and only if
y(x) is a solution on (0, ∞).

13. Compute the Green's function for the linear differential operator x²D² + a₁xD + a₀
on 𝒞(0, ∞).

14. Prove that (4-67) is valid for all positive integers k. [Hint: Use mathematical
induction.]

15. (a) Let T be the linear transformation defined in (4-66), and let L̄ = D + a₀.
Prove that

TL̄T⁻¹ = xD + a₀.

(b) Let L̄ = D² + a₁D + a₀. Prove that

TL̄T⁻¹ = x²D² + (a₁ + 1)xD + a₀.

16. Prove that every linear differential operator of the form

L = xⁿDⁿ + a_{n−1}x^{n−1}D^{n−1} + · · · + a₁xD + a₀

can be written as a product of operators of this form of orders one and two. [Hint:
Let L̄ be the constant coefficient operator associated with L, and write

L̄ = L̄₁L̄₂ · · · L̄ₖ,

where L̄ᵢ, i = 1, . . . , k, is a constant coefficient operator of order one or two.
Show that

L = (TL̄₁T⁻¹)(TL̄₂T⁻¹) · · · (TL̄ₖT⁻¹),

and use the results of Exercise 15.]

17. Use the results of Exercises 15 and 16 to factor each of the following operators.
(a) x²D² + 2xD − 2            (b) x²D² + xD − 9
(c) x³D³ − 2x²D² − 11xD − 1   (d) x³D³ − 3x²D² + 6xD − 6
(e) x⁴D⁴ + 6x³D³ + 7x²D² + xD − 1

Find a particular solution of each of the following Euler equations on (0, ∞).

18. x²y″ + xy′ − 9y = x³ + 1
19. x²y″ + xy′ + 9y = sin (ln x³)
20. x²y″ + 4xy′ + 2y = 2 ln x
21. x³y‴ + 4x²y″ − xy′ + y = x
22. x³y‴ + x²y″ + 4xy′ = 1 + cos (2 ln x)



4-9 ELEMENTARY APPLICATIONS


We conclude this chapter with several examples illustrating the way in which
linear differential equations arise in the study of natural phenomena. Although
the problems we discuss are rather simple — at least from a physical point of
view —they are not entirely without interest, and should be construed as reason-
able, albeit elementary, applications of differential equations to biology and physics.
Some of a more substantial nature will be considered in later chapters.
I. The growth of populations. Problems of this type consist of determining the
future size of a population under the assumption that its rate of growth is known,
and arise in such diverse situations as the radioactive decay of matter and the
increase of bacteria in a culture. In such problems it is frequently assumed that
the rate of increase (or decrease) in population at time t is proportional to the
number of individuals present at that time.* Then if y(t) denotes the number of
individuals present at time t,

dy/dt = ky (4-81)

for an appropriate constant k, and it follows that y obeys the well known law of
exponential growth

y = ce^{kt}, (4-82)

c a positive constant.
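For instance (a numerical illustration with made-up figures, in the spirit of the exercises at the end of this section): if a colony governed by (4-82) doubles every 40 hours, then k = ln 2 / 40 per hour, and the time required for a 10-fold increase follows by solving e^{kt} = 10 for t.

```python
import math

k = math.log(2) / 40          # growth constant, per hour, from the doubling time
t_tenfold = math.log(10) / k  # solve e^{kt} = 10 for t
print(round(t_tenfold, 2))    # 132.88 hours
```

Note that the answer depends only on the ratio ln 10 / ln 2, not on the initial population c.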
Hypotheses which result in something more realistic than prolonged exponential
growth are, of course, also used. One of the simplest of these is the assumption
that the supply of necessities for life is constant and sufficient to support a total
population P. This implies that the factor of proportionality in (4-81) depends
upon y and P, and approaches zero as y approaches P. Thus y must now satisfy
a nonlinear equation of the form

dy/dt = f(y, P)y, (4-83)

where f has to be determined experimentally. If, for instance, f(y, P) = P − y,
(4-83) becomes

dy/dt = Py − y²,

and can be solved as a Bernoulli equation. Indeed, making the substitution
u = y⁻¹, we find that

du/dt + Pu = 1, and u = (1/P)(1 + ce^{−Pt}).

Thus

y = P / (1 + ce^{−Pt}),

and if we assume that y(0) = P₀, we have

y = PP₀ / [P₀ + (P − P₀)e^{−Pt}].

In Fig. 4-4 we have sketched a number of these curves for various values of P₀.

* In the case of living organisms it has been found that this assumption is fairly accurate
when the population is small in comparison with the availability of such necessities as
food, living space, etc. For radioactive decay, short of an atomic explosion, the assump-
tion is in complete accord with experimental fact.
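The closed form just obtained is easy to test numerically (an illustration with arbitrary values of P and P₀, not part of the original text): the formula should satisfy dy/dt = (P − y)y, start at P₀, and level off at the limiting population P.

```python
import math

P, P0 = 5.0, 1.0   # limiting and initial populations, arbitrary choices

def y(t):
    return P * P0 / (P0 + (P - P0) * math.exp(-P * t))

def dy(t, h=1e-6):
    # central-difference approximation to dy/dt
    return (y(t + h) - y(t - h)) / (2 * h)

print(y(0.0), round(y(50.0), 6))  # 1.0 5.0
```

The small discrepancy between dy(t) and (P − y(t))·y(t) is entirely finite-difference error, as the derivation predicts.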

(FIGURE 4-4)

II. The simple pendulum. An overwhelming majority of the problems en-
countered in elementary physics are solved by invoking Newton's second law of
motion which, in crude form, asserts that the (vector) sum of the forces acting on
a moving object is proportional to the product of the mass m of the object and
its acceleration a; that is,

F = kma. (4-84)

The reader should note that in 3-space this equation can be rewritten as a system
of three scalar equations

Fₓ = kmaₓ,  F_y = kma_y,  F_z = kma_z,

where Fₓ, aₓ, etc., denote, respectively, the x-, y-, z-components of F and a.
For convenience the physical units are usually chosen so that k = 1 in these equa-
tions, and in the future we shall always assume this choice has been made.

As it stands, Eq. (4-84) assumes that m is constant. A slightly more sophisticated
formula which avoids this assumption is

F = d(mv)/dt, (4-85)

where v denotes velocity. The quantity mv appearing in (4-85) is known as the
momentum of the object, and in this form Newton's second law says that the
(vector) sum of the forces acting on a moving particle is equal to the time deriva-
tive of its momentum. This, for example, would be the equation used to study
the flight of a rocket whose mass decreases as its fuel is consumed.

As an illustration of how Newton's second law is used to solve an elementary
problem we consider a pendulum bob of mass m supported at the end of a string
of length L as shown in Fig. 4-5. The forces acting on the bob are the tension T
in the string and the vertical force mg due to gravity. Let φ denote the angular
displacement of the string from the vertical. Then the component of the gravitational
force parallel to the string balances the tension, while the component perpendicular
to the string provides the tangential restoring force which causes the pendulum to
oscillate. Since the magnitude of this tangential force is mg sin φ, and since the
momentum of the bob is mv = mL(dφ/dt), Newton's second law gives

mL d²φ/dt² = −mg sin φ. (4-86)

If we now assume that φ is small so that sin φ ≈ φ, the
above equation may be replaced by the linear equation

d²φ/dt² + (g/L)φ = 0, (4-87)

whose general solution is

φ(t) = c₁ sin √(g/L) t + c₂ cos √(g/L) t.

(FIGURE 4-5)

Finally, if we assume that the pendulum was initially released from rest at an
angle φ₀ from the vertical, then φ(0) = φ₀, φ′(0) = 0, and the corresponding
particular solution of (4-87) is φ(t) = φ₀ cos √(g/L) t. The student should recog-
nize this as the equation for simple harmonic motion whose period of oscillation
is 2π√(L/g).
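A short check (an illustration, not part of the original text; the values of g, L, and φ₀ are arbitrary) confirms that φ(t) = φ₀ cos √(g/L) t satisfies the linearized equation (4-87) exactly, and computes the period for a hypothetical one-meter pendulum.

```python
import math

g, L, phi0 = 9.8, 1.0, 0.1   # SI units, arbitrary choices
w = math.sqrt(g / L)

def phi(t):
    return phi0 * math.cos(w * t)

def phi_dd(t):
    # exact second derivative of phi
    return -phi0 * w * w * math.cos(w * t)

period = 2 * math.pi * math.sqrt(L / g)
print(round(period, 3))  # 2.007 seconds
```

Since φ″ = −(g/L)φ identically, the residual of (4-87) is zero apart from floating-point roundoff.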

EXERCISES

1. The half-life of a radioactive substance is defined as the length of time required for
half of the atoms in any sample of the substance to decay.
(a) The half-life of radioactive carbon 14 is 5600 years. Find the amount of carbon 14
remaining in a sample of amount xq at the end of t years. (Assume that the rate of
decay is proportional to the amount present.)
(b) If 90% of the carbon 14 in a given sample of carbon has decayed, find the age
of the sample.

2. If 20% of a certain radioactive substance disintegrates in 50 years, find the half-life


of the substance.
3. A radioactive substance with a half-life of 50 years lies exposed to the weather, and
erodes at a constant rate of k pounds per year.
(a) Find the formula for the amount of material remaining after t years in a sample
which originally contained xq pounds.
(b) How long will it take for the substance to disappear entirely?
4. Bacteria in a certain colony are born and die at rates proportional to the number
present, so that the equation governing the growth of the colony is

dy
= (*] k 2 )y.
dt

Determine k\ and & 2 if it is known that the colony doubles in size every 24 hours,
and would have its size halved in 8 hours were there no births.
5. If the population of a certain colony of bacteria doubles in 40 hours, how long will
it take for the population to increase 10-fold?
6. The population of a country increases 3% per year; its present population is 190
million.

(a) How many years will elapse before the population reaches 250 million?
(b) What will the total population be in 5 years? In 50 years?
7. Solve Exercise 6 under the additional assumption that the country admits 200,000
immigrants each year.
8. What is the annual rate of interest being paid on an account where interest is con-
tinuously credited at the rate of 5%?
9. Assume that evaporation causes a spherical raindrop to decrease in volume at a
rate proportional to its surface area, and find the length of time it will take a drop
of radius ro to evaporate entirely.
10. It can be shown that a body inside the earth is attracted
toward the center by a force which is directly proportional
to the distance from the center. Find the equation of mo-
tion of a ball dropped into a hole bored through the center
of the earth. When will the ball reach the opposite end of
the hole?
11. Liquid is oscillating without friction in a U-tube as shown
in Figure 4-6. If the liquid was initially at rest with one
side h₀ inches higher than the other, determine the subse-
quent motion of the liquid by finding h as a function of
time. Show that the period of oscillation is π√(2L/g),
where L is the total length of liquid in the tube, and g the
acceleration of gravity. (FIGURE 4-6)
12. The motion of a rotating body is governed by the formula T = Iα, where T
is the torque applied, I the moment of inertia of the body about its axis of rota-
tion, and α = d²φ/dt² is the angular acceleration. Suppose that a bob on a twisted
wire resists the twisting force with a torque kφ. Find φ as a function of time if k = 1
and the period of rotation is half a second.

4-10 SIMPLE ELECTRICAL CIRCUITS

The flow of current in an electrical network consisting of a finite number of
closed loops, or circuits, is governed by the following rules known as Kirchhoff's
laws:

(a) The algebraic sum of the currents flowing into any point in the network
is zero;
(b) The algebraic sum of the voltage drops across the various electrical com-
ponents in any oriented closed loop in the network is zero.*

In this discussion we shall restrict our attention to networks consisting of a
single circuit made up of a voltage source E, a resistance R, a capacitance C, and
an inductance L, the last in the form, say, of a coil of copper wire. The formulas
relating the flow of current i to the voltage drop across each of these components
are

E_R = iR          for a resistance,
E_L = L di/dt     for an inductance, (4-88)
i = C dE_C/dt     for a capacitance.

We begin by considering the R-L circuit shown in Fig. 4-7, where the symbol
—| |— denotes a constant source of voltage E such as might be supplied by
a battery, and the arrows indicate the direction of the flow of current. By Kirchhoff's
second law we have

E_L + E_R − E = 0,

the negative sign being due to the fact that the voltage rises across the battery.
Hence, by (4-88),

L di/dt + Ri = E,

and if we now assume that the circuit was energized at time t = 0, the flow of
current is obtained as the solution of the initial value problem

L di/dt + Ri = E,  i(0) = 0. (4-89)

An easy computation reveals that

i = (E/R)(1 − e^{−(R/L)t}), (4-90)

and we see that the current flow in this circuit is a sum of two terms, a time-
independent steady-state term E/R, and a transient term −(E/R)e^{−(R/L)t},
whose effect diminishes with time. (See Fig. 4-8.) Since the inductance L appears
only in the latter term it follows that a simple R-L circuit operating under a con-
stant impressed voltage will eventually behave very much as if the circuit were
noninductive. The length of time required for the transient term to become
negligible is sometimes called the delay time of the circuit, and furnishes a measure
of its sensitivity in responding to the voltage source E.

(FIGURE 4-7)  (FIGURE 4-8)

* A closed curve is said to be oriented if a positive direction has been assigned for
traversing it.
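The "easy computation" leading to (4-90) can be verified directly (an illustration, not part of the original text; the component values are arbitrary): differentiating i(t) = (E/R)(1 − e^{−(R/L)t}) and substituting back should reproduce the impressed voltage E.

```python
import math

E, R, L = 12.0, 40.0, 8.0   # volts, ohms, henries (arbitrary values)

def i(t):
    return (E / R) * (1 - math.exp(-(R / L) * t))

def di(t):
    # exact derivative of i
    return (E / L) * math.exp(-(R / L) * t)

t = 0.05
print(L * di(t) + R * i(t))  # equals E up to roundoff
print(i(0.0))                # 0.0
```

The transient term decays like e^{−(R/L)t}, so for these values the current is within 1% of the steady state E/R after roughly t = L ln 100 / R seconds.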


If we replace the battery in the above circuit by an alternating current source
E sin ωt, (4-89) becomes

L di/dt + Ri = E sin ωt,  i(0) = 0. (4-91)

This time the solution assumes the more complicated form

i = [ωEL/(R² + ω²L²)] e^{−(R/L)t} + [E/(R² + ω²L²)](R sin ωt − ωL cos ωt) (4-92)

(see Exercise 2), which may be rewritten

i = (ωEL/Z²) e^{−(R/L)t} + (E/Z) sin (ωt − α) (4-93)

by setting R = Z cos α and ωL = Z sin α, where Z = √(R² + ω²L²). Again
the flow is the sum of two terms, i_t and i_s, the first of which is transient and dies
out as t increases. The second,

i_s = (E/Z) sin (ωt − α),

is the steady-state current, and is sinusoidal in nature as one might expect. It
differs from the impressed voltage E sin ωt by the phase angle α and the multi-
plicative factor 1/Z. (The quantity Z = √(R² + ω²L²) is called the steady-state
impedance of the circuit.) Thus the graph of the steady-state current can be
obtained by multiplying the amplitude of the graph of the impressed voltage by
1/Z, and translating the result α units to the right (see Fig. 4-9). Since 0 <
α < π/2, the current in such a circuit is said to lag the voltage by the phase angle α.

(FIGURE 4-9)

Finally, since

sin α = ωL/Z  and  cos α = R/Z,

we see that α = 0 in a purely resistive circuit (L = 0), and that α = π/2 in a
purely inductive circuit (R = 0). Hence when L = 0 the current and voltage are
in phase, while when R = 0 they are 90° out of phase. In either case the reader
will note that the steady-state impedance plays exactly the same role that the
resistance plays in an R-L circuit under a constant voltage, a fact which explains
the use of the term "impedance" here.
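These relations are easy to explore numerically (an illustration, not part of the original text; the component values are arbitrary): the impedance is Z = √(R² + ω²L²), the lag angle satisfies tan α = ωL/R, and the two limiting cases just described fall out of the arctangent.

```python
import math

R, L, w = 40.0, 8.0, 60.0   # ohms, henries, rad/s (arbitrary values)

Z = math.sqrt(R**2 + (w * L)**2)
alpha = math.atan2(w * L, R)   # phase angle, 0 <= alpha <= pi/2 here

print(round(Z, 2), round(math.degrees(alpha), 2))  # 481.66 85.24

# limiting cases: purely resistive (L = 0) and purely inductive (R = 0)
print(math.atan2(0.0, R), math.atan2(w * L, 0.0) == math.pi / 2)  # 0.0 True
```

With ωL a dozen times larger than R, the circuit is dominantly inductive and the current lags the voltage by nearly 90°, as the formulas predict.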
We now consider a simple R-L-C circuit under a sinusoidal impressed voltage
E sin ωt (see Fig. 4-10). In this case it is somewhat simpler to describe the state
of the system in terms of the charge q on the capacitor as a function of time, rather
than in terms of the current i. Since

i = dq/dt, (4-94)

the voltage drop across a capacitor is (1/C)q, and Kirchhoff's second law leads to
the equation

L d²q/dt² + R dq/dt + (1/C)q = E sin ωt (4-95)

governing the accumulation of charge q(t) on the capacitor in the circuit. Thus
the behavior of this circuit can be determined as soon as the initial values of q
and i = dq/dt are known.

(FIGURE 4-10)

The solution q_h of the homogeneous equation associated with (4-95) can be
expressed in terms of the roots

m = −R/(2L) ± √[(R/2L)² − 1/(LC)]

of the auxiliary equation

Lm² + Rm + 1/C = 0.

Thus q_h assumes different forms depending on whether

(R/2L)² − 1/(LC)

is positive, negative, or zero. However, it is easily seen that no matter what initial
conditions are imposed on the circuit, q_h → 0 as t → ∞. In other words, q_h
represents a transient charge on the capacitor, and as t increases, the time variation
q(t) of the charge on the capacitor approaches a steady-state value q_s(t), which can
be determined by finding a particular solution of (4-95). Furthermore, it is
interesting to note that this steady-state charge is independent of the initial con-
ditions q(0) and i(0) imposed on the circuit, and depends only on R, L, C, and
E sin ωt. [Why?] This observation is in agreement with our intuition, which
suggests that the long-range behavior of such a circuit ought to depend only upon
its components, and not upon their state at time t = 0.

We now find the steady-state behavior of this system under the assumption
that R ≠ 0. In this case E sin ωt cannot be a solution of (4-95) for any value of
E, and hence the equation has a particular solution of the form

q_s = c₁ sin ωt + c₂ cos ωt.

When this expression is substituted in (4-95) and the resulting identity solved for
c₁ and c₂, we obtain

c₁ = −E[ωL − (1/ωC)] / (ω{R² + [ωL − (1/ωC)]²}),

c₂ = −ER / (ω{R² + [ωL − (1/ωC)]²}).

If we introduce the abbreviations

X = ωL − 1/(ωC),    Z² = R² + [ωL − (1/ωC)]²,

the particular solution q_s may be written

q_s = −(EX/ωZ²) sin ωt − (ER/ωZ²) cos ωt,

or even more simply

q_s = −(E/ωZ) cos (ωt − α), (4-96)

where sin α = X/Z and cos α = R/Z. Finally, by differentiating this expression
we find that the steady-state current for an R-L-C circuit is

i_s = (E/Z) sin (ωt − α). (4-97)

As in the case of an R-L circuit with impressed electromotive force E sin ωt, the
constants α and Z are called the phase angle and steady-state impedance of the
circuit. In fact, it is customary to view an R-L circuit as the limiting case of an
R-L-C circuit obtained by setting C = ∞.

Since the steady-state impedance

Z = √{R² + [ωL − (1/ωC)]²}

in an R-L-C circuit depends upon ω, the maximum amplitude E/Z of the steady-
state current i_s also depends on ω. It is clear that for fixed values of L and C the
quantity E/Z is a maximum when ωL − 1/ωC vanishes, i.e., when ω = 1/√(LC),
and for this value of ω, E/Z = E/R. Thus, if we plot E/Z, the amplitude of i_s,
as a function of ω, the graph will attain its maximum value when ω = 1/√(LC).
Furthermore, this maximum increases with decreasing R, as shown in Fig. 4-11.
Physically, these observations tell us that the circuit may react quite differently to
input voltages with different frequencies ω. The more the frequency differs from
1/√(LC), the smaller the amplitude of the steady-state current becomes, and hence
the voltage drop across the various components of the circuit is small. By minimiz-
ing its resistance, such a circuit can be made highly selective, in the sense that it
discriminates very sharply against inputs whose frequency differs from the circuit's
natural or resonating frequency 1/√(LC).

Thus if a mixture of sinusoidal input voltages of the same magnitude is applied
to an R-L-C circuit of high selectivity, so that the impressed electromotive force is
a sum of the form

Σᵢ Eᵢ sin ωᵢt,

the steady-state current will depend almost exclusively on those terms whose frequen-
cies ωᵢ are very close to the natural frequency 1/√(LC). The use of such circuits as
tuning circuits or filters in electronic equipment is obvious.

(FIGURE 4-11)
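A numerical illustration of this selectivity (arbitrary component values, not part of the original text): the amplitude E/Z as a function of ω peaks at the resonant frequency ω = 1/√(LC), where Z reduces to R, and falls off rapidly away from it.

```python
import math

E, R, L, C = 110.0, 5.0, 0.1, 300e-6   # volts, ohms, henries, farads (arbitrary)

def amplitude(w):
    Z = math.sqrt(R**2 + (w * L - 1.0 / (w * C))**2)
    return E / Z

w_res = 1.0 / math.sqrt(L * C)
print(round(w_res, 1))                          # 182.6 rad/s
print(round(amplitude(w_res), 6))               # 22.0, i.e. E/R
print(amplitude(w_res) > amplitude(2 * w_res))  # True
```

Halving R would double the peak amplitude while leaving the resonant frequency unchanged, which is exactly the behavior sketched in Fig. 4-11.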



EXERCISES*

1. Find the current flow in a simple R-L circuit under a damped sinusoidal voltage
E₀e^{−at} sin bt, with E₀, a, b constants, a > 0. [Assume that at time t = 0, i(0) = i₀.]
2. Verify Eq. (4-92) of the text.

3. A simple circuit consisting of a condenser C
and a resistor R is connected as shown in Fig.
4-12. Suppose that C initially carries a charge
of 0.03 coulombs and that the switch is closed
at time t = 0.
(a) Find q(t) as a function of time.
(b) Sketch the graph of q(t).
(c) Find the voltage drop across the resistor
after 10 seconds have elapsed if C = 300
microfarads (300 × 10⁻⁶ farad) and R =
10,000 ohms. (FIGURE 4-12)
4. Replace the resistor in the circuit of Exercise 3 by an inductance L, and show that
the charge q on the condenser satisfies the equation

d²q/dt² + (1/LC)q = 0.

Solve this equation under the initial conditions i(0) = 0, q(0) = 0.03 coulombs.
Sketch the graph of the solution.
5. Add an inductance to the circuit in Exercise 3 and solve the resulting differential
equation, considering separately the three cases

R > 2√(L/C),    R = 2√(L/C),    R < 2√(L/C).

Sketch the graphs of the three types of solutions obtained. (In this situation the
resistance R corresponds to a mechanical damping force, and these three cases yield,
respectively, overdamping, critical damping, and underdamping.)

6. Find the current flow in a simple R-L circuit under a constant voltage given that
R = 40 ohms, L = 8 henries, and that E = 0 volts and i = 10 amperes when
t = 0. At what time will i = 5 amperes?

7. If the resistance is removed from the R-L-C circuit discussed in the text, q satisfies
the differential equation

L d²q/dt² + (1/C)q = E sin ωt.

Solve this equation in the two cases

(i) ω ≠ 1/√(LC),    (ii) ω = 1/√(LC)

* The units of measurement used in the following exercises are resistance in ohms,
inductance in henries, capacitance in farads, current in amperes, charge in coulombs.

and discuss the behavior of the solution in each case. Which of these cases exhibits
the phenomenon of resonance? [Note that in the resonant case the voltage drop
across the capacitor oscillates with the same frequency as the input voltage
E sin (t/√(LC)). Its amplitude, however, is unbounded. This is the limiting case of
an R-L-C circuit when R = 0, but, of course, can only be approximated in practice.]


8. Suppose that q(0) = q′(0) = 0, and that R = 5 ohms, C = 300 microfarads (300 × 10^{-6} farad), and that L = 0.1 henries in the R-L-C circuit discussed in the text. If the impressed electromotive force is the standard 110-volt, 60-cycle alternating current in common use (i.e., E sin ωt = 110√2 sin (120πt), since it is the effective and not the peak voltage which is 110 volts), find q as a function of time. What is the voltage drop across R after 10 seconds? Across L?
the laplace transform

5-1 INTRODUCTION
In this chapter we shall for the first time make full use of the idea of a linear
operator and its inverse in solving initial-value problems involving linear differen-
tial equations. By contrast to the rather pedestrian methods developed in the last
chapter, our present investigations will yield an extremely efficient technique for
handling such problems. In addition, they will give a much deeper insight into the
role which operator theory plays in applied mathematics, and will serve as an
excellent introduction to a general method which will be used later to analyze
more difficult problems.
The particular linear transformation which we now intend to study is an integral operator £ known as the Laplace transform. Before giving the definition, however, we introduce the notion of a piecewise continuous function, which will be needed when we describe the domain of £.
Informally, a real valued function is said to be piecewise continuous on a closed interval if its graph consists of a finite number of continuous pieces. More precisely, f is piecewise continuous on [a, b] if it is continuous at all but a finite number of points of this interval, and if at each point x₀ of discontinuity both the right- and left-hand limits of f exist; that is, f(x₀ + h) and f(x₀ − h) both tend to a finite limit as h tends to zero through positive values.* Thus such functions as t² sin t and the "square wave" function shown in Fig. 5-1 are piecewise continuous on any finite interval of the t-axis. On the other hand, neither tan t nor sin (1/t) is piecewise continuous on [0, π/2]; the first because of its behavior near π/2, the second because it oscillates in such a manner that it does not approach a limit as t → 0+ (see Exercise 10).

FIGURE 5-1

* Note that only one of these limits is relevant when x₀ is an endpoint of the interval.
178 THE LAPLACE TRANSFORM CHAP. 5

For our purposes the essential fact concerning piecewise continuous functions on a finite interval is that they are integrable. Indeed, if f is piecewise continuous on [a, b], with discontinuities at x₁, x₂, . . . , xₙ, and possibly at a and b as well, then ∫ₐᵇ f(t) dt is defined and evaluated as

∫ₐᵇ f(t) dt = lim_{h→0+} [ ∫_{a+h}^{x₁−h} f(t) dt + ∫_{x₁+h}^{x₂−h} f(t) dt + ⋯ + ∫_{xₙ+h}^{b−h} f(t) dt ],

where the notation h → 0+ means that h approaches zero through positive values only. (See Fig. 5-2.) It is known that this limit always exists, and the student presumably has had practice heretofore in evaluating such integrals.
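In practice the limit formula amounts to integrating each continuous piece separately and adding the results. The following sketch is ours, not the book's (the helper names and the particular square wave are illustrative choices):

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson rule for a function continuous on [a, b] (n even)."""
    h = (b - a) / n
    s = f(a) + f(b)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * f(a + k * h)
    return s * h / 3

def integrate_piecewise(pieces):
    """pieces: list of (f, a, b), each f continuous on its own [a, b]."""
    return sum(simpson(f, a, b) for f, a, b in pieces)

# A square wave like the one in Fig. 5-1, say +1 on [0, 1) and -1 on [1, 2).
# The jump at t = 1 causes no trouble because each piece is handled alone.
total = integrate_piecewise([(lambda t: 1.0, 0.0, 1.0),
                             (lambda t: -1.0, 1.0, 2.0)])
```

Splitting at the discontinuities is exactly what the limit in the displayed formula prescribes; inside each subinterval the integrand is an ordinary continuous function.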


At this point a number of trivial remarks are in order. The first concerns our use of the letter t in place of x for the independent variable. This is nothing more than a convention which is all but universal when discussing the Laplace transform, and stems from the fact that in most initial-value problems the independent variable is time. Moreover, since negative values of time are usually excluded, it is also standard practice to restrict attention to the non-negative t-axis [0, ∞). In this context we shall be interested in functions which are piecewise continuous on every finite interval [0, t₀], t₀ > 0, and for the sake of brevity shall say that such functions are piecewise continuous on [0, ∞). Finally, the set of all such functions is obviously a real vector space under the definitions of vector addition and scalar multiplication given in Chapter 1 (see Exercise 5).* We shall have more to say about this space in the sections which follow.

FIGURE 5-2

EXERCISES

1. Which of the following functions are piecewise continuous on [0, ∞)? Give reasons for your answers.

(a) e^{1/t}   (b) ln (t + 1)   (c) (t + 1)/(t − 1)   (d) (t − 2)/(t² − t − 2)   (e) e

2. Repeat Exercise 1 for the following functions.

,
(a)
.
——
sin t
' na
...
positive integer

(b) t* (c) fif) = [t], the greatest integer less than t

1
iff n = 1,2, an inte 8er
(d) fit) = n (e) f(i) = |
'
f

-J 1 otherwise
11 otherwise

* That is, (f + g)(t) = f(t) + g(t) and (αf)(t) = αf(t).



3. Prove that for any real number a, e^{at} f(t) is piecewise continuous on [0, ∞) whenever f is.

4. Prove that the product of two piecewise continuous functions on [0, oo ) is piecewise
continuous.
5. Prove that the set of all piecewise continuous functions on an interval I is a real vector space under the "usual" definitions of addition and scalar multiplication.

Evaluate ∫₀² f(t) dt for each of the following functions.

6. f(t) = t for 0 ≤ t ≤ 1, and f(t) = t − 1 for 1 < t ≤ 2

7. f(t) = cos |πt|

- i t, < t < %
8. /(,) =
(
Jl
lr
-
-
',

1, 1
<
<
/

/
<
<
1

2
9 m =
[
J _
(a_f
4 ,2
?
g/ + 3
i
<
f<,<-2
f <
o

10. Show that lim_{t→0+} sin (1/t) does not exist. [Hint: Sketch the graph of sin (1/t) in the interval (0, ∞).]

11. Does lim_{t→0+} (t sin (1/t)) exist? Is t sin (1/t) piecewise continuous on [0, ∞)?

5-2 DEFINITION OF THE LAPLACE TRANSFORM


Let f(t) be a real valued function on the interval (0, ∞) and consider the integral

∫₀^∞ e^{−st} f(t) dt,   (5-1)

where s is a real variable.* Whenever f is sufficiently well behaved this integral will converge for certain values of s, in which case it defines a function of s called the Laplace transform of f, and denoted £[f], or £[f](s). Thus if f(t) = cos at, where a is a constant, then

£[cos at](s) = ∫₀^∞ e^{−st} cos at dt
            = lim_{t₀→∞} ∫₀^{t₀} e^{−st} cos at dt
            = lim_{t₀→∞} [ (e^{−st}/(s² + a²)) (a sin at − s cos at) ]₀^{t₀}
            = lim_{t₀→∞} [ (e^{−st₀}/(s² + a²)) (a sin at₀ − s cos at₀) + s/(s² + a²) ].

* We recall that an integral of this sort is evaluated according to the rule

∫₀^∞ e^{−st} f(t) dt = lim_{t₀→∞} ∫₀^{t₀} e^{−st} f(t) dt,

and is said to converge for a particular value of s if and only if this limit exists.

Since this limit exists if and only if s > 0, in which case it has the value s/(s² + a²), it follows that the Laplace transform of cos at is the function s/(s² + a²) restricted to the interval (0, ∞). In other words,

£[cos at] = s/(s² + a²),   s > 0.   (5-2)
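Formula (5-2) is easy to confirm numerically by truncating the improper integral (5-1) at a large T. The following check is our own sketch; the helper `laplace_numeric` and the parameters T and n are arbitrary choices, not part of the text:

```python
import math

def laplace_numeric(f, s, T=40.0, n=100_000):
    """Composite Simpson approximation of the integral of e^(-s t) f(t) over [0, T]."""
    h = T / n
    total = f(0.0) + math.exp(-s * T) * f(T)
    for k in range(1, n):
        t = k * h
        total += (4 if k % 2 else 2) * math.exp(-s * t) * f(t)
    return total * h / 3

a, s = 3.0, 2.0
approx = laplace_numeric(lambda t: math.cos(a * t), s)
exact = s / (s**2 + a**2)   # formula (5-2)
```

For s well inside the region of convergence the tail of the integral beyond T is negligible, so the truncated quadrature agrees with (5-2) to high accuracy.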

In most of our applications it will be possible to compute £[f] by direct evaluation of (5-1) as was done above. This, however, does not obviate the need for determining a "reasonable" set of conditions which will insure the existence of the Laplace transform of a given function f, particularly since we wish to view £ as a linear transformation defined on a suitable vector space. If we examine (5-1) from this point of view it is clear that f must be so chosen that

∫₀^{t₀} e^{−st} f(t) dt   (5-3)

exists for all t₀ > 0. This can be accomplished by demanding that f be piecewise continuous on every interval of the form [0, t₀], t₀ > 0, for then the integrand in (5-3) will be piecewise continuous, and the integral will exist. (See Exercise 4, Section 5-1.) Piecewise continuity by itself, however, is not enough to guarantee the existence of £[f], since (5-3) must also converge as t₀ → ∞ for at least one value of s. One way of assuring this convergence is to require that f(t) be "dominated" by some exponential function, thus in effect demanding that e^{−st} f(t) approach zero rapidly as t increases. To make this notion precise we lay down the following definition.

Definition 5-1. A function f is said to be of exponential order on [0, ∞) if there exist constants C and a, C > 0, such that

|f(t)| ≤ Ce^{at}   (5-4)

for all t > 0.*

In a moment we shall prove that the Laplace transform of any piecewise continuous function of exponential order does in fact exist, but first some examples.

The constant function f(t) = 1 is of exponential order, as can be seen by setting a = 0, C = 1 in (5-4). So too, and this is important, are the functions

tⁿ,   e^{at},   sin bt,   cos bt,   tⁿe^{at} sin bt,   tⁿe^{at} cos bt,

familiar from the study of constant coefficient linear differential equations. For instance, the proof that tⁿe^{at} cos bt is of exponential order goes as follows: If a > 0,

|tⁿe^{at} cos bt| / e^{2at} ≤ tⁿe^{at} / e^{2at} = tⁿ / e^{at},

and L'Hopital's rule shows that this expression tends to zero as t → ∞. In particular, it is eventually less than 1, and hence |tⁿe^{at} cos bt| < e^{2at} for sufficiently large values of t. Thus there exists a constant C > 0 such that |tⁿe^{at} cos bt| ≤ Ce^{2at} for all t > 0 (see Exercise 14 below). If a ≤ 0 the proof is even easier, for then

|tⁿe^{at} cos bt| ≤ tⁿ,

and the inequality tⁿ < eᵗ for large values of t implies the existence of a constant C > 0 such that |tⁿe^{at} cos bt| ≤ Ceᵗ for all t > 0.

* This inequality need only be satisfied, of course, at those points of the non-negative t-axis where f is defined.

On the other hand, the function e^{t²} is not of exponential order, since

lim_{t→∞} e^{t²}/e^{at} = lim_{t→∞} e^{t(t−a)} = ∞

for all a.
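Both claims are easy to see on a logarithmic scale, since |f(t)| ≤ Ce^{at} for large t is the statement that log |f(t)| − at stays bounded above. A small illustration, ours and not the book's:

```python
import math

# log(t^n e^{at}) - 2at = n log t - at  ->  -infinity, so t^n e^{at} cos bt is
# eventually dominated by e^{2at}; log(e^{t^2}) - at = t(t - a)  ->  +infinity
# for every fixed a, so e^{t^2} is not of exponential order.

def log_excess(log_f, a, t):
    """log |f(t)| - a t; very negative for large t means e^{at} dominates f."""
    return log_f(t) - a * t

n, a = 5, 2.0
dominated = log_excess(lambda t: n * math.log(t) + a * t, 2 * a, 100.0)
not_dominated = log_excess(lambda t: t * t, 100.0, 200.0)
```

Working with logarithms also avoids the floating-point overflow that evaluating e^{t²} directly would cause for even moderate t.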
This having been said, we now prove the theorem which justifies introducing
functions of exponential order in the first place.

Theorem 5-1. If f is a piecewise continuous function of exponential order, then there exists a real number a such that

∫₀^∞ e^{−st} f(t) dt

converges for all values of s > a.

Proof. This assertion is an immediate consequence of a well-known comparison theorem from analysis; viz., if f and g are integrable on every interval of the form [a, b], where a is fixed and b > a is arbitrary, and if |f(t)| ≤ g(t) for all t ≥ a, then ∫ₐ^∞ f(t) dt exists whenever ∫ₐ^∞ g(t) dt exists.*

Granting the truth of this result, choose C and a so that |f(t)| ≤ Ce^{at} for all t ≥ 0 (recall that f is of exponential order). Then

∫₀^∞ e^{−st}(Ce^{at}) dt = C ∫₀^∞ e^{−(s−a)t} dt
                        = C lim_{t₀→∞} (1/(s − a))[1 − e^{−(s−a)t₀}]
                        = C/(s − a)   if s > a,

and the comparison theorem implies that ∫₀^∞ e^{−st} f(t) dt exists for all s > a. |

On the strength of this result we can assert that the domain of definition of the Laplace transform of a piecewise continuous function of exponential order always includes a semi-infinite interval of the form (a, ∞). In point of fact, if s₀ denotes the greatest lower bound of the set of real numbers a such that £[f](s) exists for all s > a, it can be shown that £[f](s) does not converge for any s < s₀.* Thus, with the possible exception of the point s₀ itself, the domain of definition of £[f] is the open interval (s₀, ∞), and for this reason s₀ is known as the abscissa of convergence of the function f. (Note that for certain functions, such as e^{−t²} or 0, s₀ may be −∞.)

* The student who has not already met this theorem may be willing to accept it when we point out that it is the analog for integrable functions of the comparison test for the convergence of infinite series. For a proof see Appendix I.
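For a concrete case, take f(t) = e^{2t}, whose abscissa of convergence is s₀ = 2: the truncated integrals ∫₀ᵀ e^{−st} e^{2t} dt settle down as T grows when s > 2 and blow up when s < 2. A closed-form check, our own sketch rather than anything from the text:

```python
import math

def truncated_transform(s, T):
    """Integral of e^(-s t) * e^(2 t) over [0, T], in closed form (valid for s != 2)."""
    return (math.exp((2 - s) * T) - 1) / (2 - s)

# s = 3 > s0: the values approach the limiting value 1/(s - 2) = 1.
converges = [truncated_transform(3.0, T) for T in (10.0, 20.0, 40.0)]

# s = 1 < s0: the values grow without bound as T increases.
diverges = [truncated_transform(1.0, T) for T in (10.0, 20.0, 40.0)]
```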

The Laplace transform of any function of exponential order exists; this much we know. But what about the converse? Is it true that a function whose Laplace transform exists is necessarily of exponential order? The answer is no; as a matter of fact, the function 1/√t has a Laplace transform even though it is not of exponential order. All of this is by way of saying that the set of functions possessing Laplace transforms is larger than the set ℰ of functions of exponential order. How much larger we shall not say, for it is no easy matter to specify the domain of £ precisely. Fortunately the set of functions of exponential order contains all of the functions which arise in applications, and as such is large enough for most purposes.

EXERCISES

Compute the Laplace transform and abscissa of convergence of each of the following functions.

4. 1     5. sin at     6. f(t) = 0 for 0 < t < 1, and f(t) = 1 for t > 1

Prove that each of the following functions is of exponential order.

7. tⁿ, n a positive integer     8. e^{at}     9. sin bt
10. cos bt     11. ln (1 + t)     12. √t


*13. Show that the Laplace transform of a function f may exist even though f "grows too fast" to be of exponential order.

14. Let f be piecewise continuous on [0, ∞), and suppose that there exist constants C and a such that |f(t)| ≤ Ce^{at} whenever t ≥ t₀ > 0. Prove that f is of exponential order.

15. Prove that the product of two functions of exponential order is of exponential order.

* Recall that b is a lower bound of a nonempty set S of real numbers if and only if b ≤ s for all s in S, and that B is a greatest lower bound of S if and only if B is a lower bound of S and B ≥ b for every lower bound b of S. One of the most important properties of the real number system is that every nonempty set S of real numbers has a unique greatest lower bound B (provided B is allowed to assume the value −∞ whenever S has no finite lower bound).

16. Let f be piecewise continuous on [0, ∞).

(a) Prove that f is of exponential order whenever there exists a constant a such that

lim_{t→∞} f(t)/e^{at} = 0.

(b) Prove that f is not of exponential order if

lim_{t→∞} |f(t)|/e^{at} = ∞

for all real numbers a.


17. Use the results of the preceding exercise to prove that e^{tᵃ} is of exponential order if a ≤ 1, and not if a > 1.

18. Is the function tᵗ of exponential order on [0, ∞)? [Hint: Use Exercise 16 and the identity tᵗ = e^{t ln t}.]

19. Another version of the integral comparison theorem stated in the proof of Theorem 5-1 is the following: If f and g are integrable on [a, 1], 0 < a < 1, and if |f(t)| ≤ g(t) whenever 0 < t ≤ 1, then ∫₀¹ f(t) dt exists whenever ∫₀¹ g(t) dt exists. Use this result to prove that 1/√t has a Laplace transform.

[Hint: ∫₀^∞ (e^{−st}/√t) dt = ∫₀¹ (e^{−st}/√t) dt + ∫₁^∞ (e^{−st}/√t) dt.]

20. Let f be a function of exponential order, and let a₀ be the least real number such that for some constant C,

|f(t)| ≤ Ce^{at}

for all a > a₀.

(a) Show that a₀ ≥ s₀, the abscissa of convergence of f.
(b) Show that there exist functions for which a₀ > s₀. [Hint: Consider the function f(t) = eᵗ if t is an integer, and f(t) = 0 otherwise.]

21. Let f be piecewise continuous and bounded on [0, ∞); i.e., there exists a constant M such that |f(t)| ≤ M for all t ≥ 0. Prove that f is a function of exponential order with abscissa of convergence s₀ ≤ 0.

5-3 THE LAPLACE TRANSFORM AS A LINEAR TRANSFORMATION

Let ℰ denote the set of all piecewise continuous functions of exponential order, viewed as a real vector space under the usual definitions of addition and scalar multiplication, and let ℱ denote the set of all real valued functions defined on intervals of the form (s₀, ∞) or [s₀, ∞), s₀ ≥ −∞. Then ℱ too can be made into a real vector space provided we modify the addition used heretofore in function spaces to accommodate the fact that the members of ℱ are not all defined on the same interval. Specifically, if f and g are any two functions in ℱ, f + g is defined to be the function whose domain is the intersection of the domains of f and g, and whose value at any point s in that intersection is f(s) + g(s). Then, with scalar multiplication as usual, ℱ is a real vector space (see Exercise 1).
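The modified addition can be sketched in code: represent a member of the target space by the left endpoint s₀ of its half-line domain together with a rule, and let the sum live on the intersection of the two domains. The class and its names are our own illustration, not the book's notation:

```python
class HalfLineFn:
    """A real valued function defined on a half-line (s0, infinity)."""

    def __init__(self, s0, rule):
        self.s0 = s0
        self.rule = rule

    def __call__(self, s):
        if s <= self.s0:
            raise ValueError("s is outside the domain (s0, infinity)")
        return self.rule(s)

    def __add__(self, other):
        # Domain of the sum = intersection of the two half-lines.
        return HalfLineFn(max(self.s0, other.s0),
                          lambda s: self.rule(s) + other.rule(s))

f = HalfLineFn(0.0, lambda s: 1 / s)        # e.g. the transform of 1
g = HalfLineFn(1.0, lambda s: 1 / (s - 1))  # e.g. the transform of e^t
h = f + g                                   # defined only on (1, infinity)
```

Taking the larger of the two left endpoints is precisely the "intersection of the domains" rule of the text.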

By virtue of the observations made following the proof of Theorem 5-1, we can assert that the Laplace transform £ maps the vector space ℰ into the vector space ℱ, and it is only natural to ask if this mapping is linear. It might seem that the answer to this question is obvious, but, unfortunately, the obvious answer is wrong. The difficulty arises from the fact that £[f + g] need not be the same function as £[f] + £[g], as can be seen by considering the case where f(t) = cos at and g(t) = −cos at. For then £[f] + £[g] is the function which is zero on the interval (0, ∞), but is not defined for s ≤ 0, while £[f + g] = £[0] is the zero function on the entire s-axis (−∞, ∞). From this it is clear that we are only entitled to say that £[f + g] and £[f] + £[g] are identical for those values of s where both of these functions are defined, a statement which is not at all the same as asserting their equality.*
But once this difficulty has been recognized, it is clear that it can be circumvented simply by agreeing to regard two functions in ℱ as identical whenever they coincide on an interval of the form (s₀, ∞). (Thus, for example, the two functions encountered above are "identified," meaning that they are considered as one and the same.) Enforcing this identification, it is now an easy matter to prove that £[f + g] = £[f] + £[g] for any two functions f and g in ℰ, and since, in any event, £[αf] = α£[f] whenever α is a real number, we have succeeded in interpreting £ as a linear transformation from ℰ to ℱ.†
This done, we now ask if £ is one-to-one; i.e., does £[f] = £[g] imply that f = g? The reader should recognize that this is just another way of asking if an operator equation of the form

£[y] = φ(s)

can be solved uniquely for y when φ is given, and by now he should realize that this is not an idle question. As in the discussion of the linearity of £, there is a trivial difficulty which prevents us from giving an affirmative answer. For if f and g are functions in ℰ which differ only at their points of discontinuity, then £[f] = £[g] even though f ≠ g. But two such functions are "very nearly" identical, and should this be the worst that can happen we would certainly be justified in asserting that for all practical purposes £ is one-to-one. The following theorem guarantees that it is, and for this reason is one of the most important results in the theory of the Laplace transform.

* Recall that two functions are equal if and only if they have the same domain and take the same value at each point in this domain.
† Strictly speaking, we should at this point replace ℱ by the vector space ℱ* whose elements are equivalence classes of functions in ℱ as determined by the above process of identification (see Exercise 3). Then the mapping £*: ℰ → ℱ* defined by £*[f] = £[f]*, where £[f]* denotes the equivalence class in ℱ* containing the function £[f], is the linear transformation in question. It is common practice, however, to ignore this distinction and simply speak of £ itself as being linear.

Theorem 5-2. (Lerch's theorem.)* Let f and g be piecewise continuous functions of exponential order, and suppose there exists a real number s₀ such that £[f](s) = £[g](s) for all s > s₀. Then, with the possible exception of points of discontinuity, f(t) = g(t) for all t > 0.

Thus whenever an equation

£[y] = φ(s)   (5-5)

can be solved for y, the solution is "essentially" unique. In fact, if we agree to identify any two functions in ℰ which coincide except at their points of discontinuity, we can then speak of the solution of such an equation.† This solution is called the inverse Laplace transform of the function φ, and is denoted £⁻¹[φ]. It is characterized by the property

£⁻¹[φ] = y if and only if £[y] = φ.   (5-6)

At this point in our discussion only one general question remains unanswered; viz., does £ map ℰ onto ℱ? In terms of operator equations this is equivalent to asking if (5-5) has a solution for every function φ in ℱ. And this time the answer is an honest no, since we have

Theorem 5-3. If f is a function of exponential order, then lim_{s→∞} £[f] = 0.

Proof. Indeed, in proving Theorem 5-1 we saw that there exist constants C and a such that

|£[f]| ≤ C/(s − a)

for all s > a, and the desired result follows by taking the limit as s → ∞. |

On the strength of this theorem we can assert that such functions as 1, s, sin s, and s/(s + 1) do not have inverse transforms in ℰ, since none of them approaches zero as s → ∞.

EXERCISES

1. (a) Prove that the set ℰ of piecewise continuous functions of exponential order is a real vector space under the usual definitions of addition and scalar multiplication.
(b) Using the definition of addition given in the text, prove that the set ℱ is a real vector space.

* See Appendix II for a proof.
† The knowledgeable reader will recognize that we are again defining an equivalence relation.

*2. Let f and g belong to ℱ, and define f ~ g if and only if f(s) = g(s) on some interval of the form s₀ < s.

(a) Prove that ~ is an equivalence relation on ℱ.
(b) Exhibit an equivalence class of functions in ℱ which does not contain the Laplace transform of any function in ℰ.
(c) Can an equivalence class in ℱ contain two different functions which are both Laplace transforms of functions in ℰ? Why?
*3. Let ℱ* denote the set of all equivalence classes of functions in ℱ under the equivalence relation defined in Exercise 2.

(a) Give an appropriate definition of addition and scalar multiplication for elements of ℱ* so that ℱ* becomes a vector space.
(b) Define £*: ℰ → ℱ* by £*[f] = £[f]*, where £[f]* is the equivalence class in ℱ* containing the function £[f]. Prove that £* is a linear transformation.
4. Prove that s£[f] remains bounded as s → ∞ for any function f of exponential order, and then use this result together with Theorem 5-3 to deduce that £⁻¹[sᵃ] does not exist for any a > −1. [Hint: See the proof of Theorem 5-3.]

5-4 ELEMENTARY FORMULAS


In Section 5-2 we used the definition of the Laplace transform to prove that

£[cos at] = s/(s² + a²),   s > 0.   (5-7)

In like fashion one can produce an almost endless list of elementary formulas, among which are

£[1] = 1/s,   s > 0,   (5-8)

£[e^{at}] = 1/(s − a),   s > a,   (5-9)

£[sin at] = a/(s² + a²),   s > 0,   (5-10)

£[tⁿ] = n!/s^{n+1},   s > 0, n a non-negative integer.*   (5-11)

For example,

£[e^{at}] = ∫₀^∞ e^{−st} e^{at} dt
         = lim_{t₀→∞} ∫₀^{t₀} e^{−(s−a)t} dt
         = lim_{t₀→∞} (1/(s − a))[1 − e^{−(s−a)t₀}]
         = 1/(s − a),   s > a.

* Recall that 0! = 1.

This proves (5-9), and (5-8) can be obtained from it by setting a = 0. Formulas (5-10) and (5-11) will be established presently, and a more comprehensive list is found in the table of transforms on p. 228.

Although these simple formulas are not without significance, it is clear that any applications of the Laplace transform must rest upon more substantial results. One of the most important of these is a formula which expresses the transform of the derivative of f in terms of £[f] and the behavior of f at 0. This result in turn depends upon the elementary fact that any function which is continuous on (0, ∞), and has a piecewise continuous derivative which is of exponential order, is itself of exponential order (see Exercise 15). In particular, this allows us to deduce the existence of £[f] from the continuity of f and the existence of £[f′], and this is precisely what we need to prove

Theorem 5-4. Let f be continuous on (0, ∞), and suppose that f′ is piecewise continuous and of exponential order on [0, ∞). Then

£[f′] = s£[f] − f(0+),   (5-12)

where f(0+) = lim_{t→0+} f(t). More generally, if f, f′, . . . , f^{(n−1)} are continuous for all t > 0, and if f^{(n)} is piecewise continuous and of exponential order on [0, ∞), then

£[f″] = s²£[f] − sf(0+) − f′(0+),   (5-13)

£[f^{(n)}] = sⁿ£[f] − s^{n−1}f(0+) − s^{n−2}f′(0+) − ⋯ − f^{(n−1)}(0+).   (5-14)
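Formula (5-12) can be spot-checked numerically. Taking f(t) = cos 3t, so that f′(t) = −3 sin 3t and f(0+) = 1, the two sides below should agree; the quadrature helper and its parameters are our own choices, not part of the text:

```python
import math

def laplace_numeric(f, s, T=40.0, n=100_000):
    """Composite Simpson approximation of the integral of e^(-s t) f(t) over [0, T]."""
    h = T / n
    total = f(0.0) + math.exp(-s * T) * f(T)
    for k in range(1, n):
        t = k * h
        total += (4 if k % 2 else 2) * math.exp(-s * t) * f(t)
    return total * h / 3

s = 2.0
lhs = laplace_numeric(lambda t: -3 * math.sin(3 * t), s)          # L[f']
rhs = s * laplace_numeric(lambda t: math.cos(3 * t), s) - 1.0     # s L[f] - f(0+)
```

Both sides equal −9/13 at s = 2, in agreement with (5-12) and the known transforms of sin 3t and cos 3t.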

Proof. To establish (5-12) we use integration by parts to evaluate £[f′] as follows:

£[f′] = ∫₀^∞ e^{−st} f′(t) dt
     = e^{−st} f(t)|₀^∞ + s ∫₀^∞ e^{−st} f(t) dt
     = s£[f] + e^{−st} f(t)|₀^∞,

and the proof will be complete if we can show that e^{−st} f(t)|₀^∞ = −f(0+).

To this end we note that since f is of exponential order, e^{−st} f(t) → 0 as t → ∞ whenever s is sufficiently large. Thus e^{−st} f(t)|₀^∞ vanishes at its upper limit, and, taking account of the fact that f may have a jump discontinuity at the origin, we have

e^{−st} f(t)|₀^∞ = lim_{t→∞} e^{−st} f(t) − lim_{t→0+} e^{−st} f(t)
               = −lim_{t→0+} e^{−st} f(t)
               = −f(0+),

as required.

Formulas (5-13) and (5-14) can now be established by repeated use of (5-12), and we are done (see Exercise 28).* |

Example 1. Let f(t) = −(1/a) cos at. Then f′(t) = sin at, and so, using (5-12) and (5-7),

£[sin at] = s£[−(1/a) cos at] + 1/a
         = −(s/a)(s/(s² + a²)) + 1/a
         = a/(s² + a²),   s > 0.

This proves (5-10).

Example 2. Since Dⁿtⁿ = n!, (5-8) implies that

£[Dⁿtⁿ] = £[n!] = n!£[1] = n!/s,   s > 0.

On the other hand, since tⁿ and its first n − 1 derivatives all vanish at 0, (5-14) yields

£[Dⁿtⁿ] = sⁿ£[tⁿ] − s^{n−1}·0 − ⋯ − 0
        = sⁿ£[tⁿ].

Hence sⁿ£[tⁿ] = n!/s for every non-negative integer n, and

£[tⁿ] = n!/s^{n+1}.

This proves (5-11).
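A quick numerical spot check of (5-11), here with n = 4 and s = 2; the helper and tolerances are ours, not the book's:

```python
import math

def laplace_numeric(f, s, T=40.0, n=100_000):
    """Composite Simpson approximation of the integral of e^(-s t) f(t) over [0, T]."""
    h = T / n
    total = f(0.0) + math.exp(-s * T) * f(T)
    for k in range(1, n):
        t = k * h
        total += (4 if k % 2 else 2) * math.exp(-s * t) * f(t)
    return total * h / 3

power, s = 4, 2.0
approx = laplace_numeric(lambda t: t**power, s)
exact = math.factorial(power) / s**(power + 1)   # n!/s^(n+1) = 24/32
```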

With the mechanics of using the differentiation formulas now out of the way,
we are in a position to illustrate the use of Laplace transforms in the solution of
an initial-value problem. Our example must perforce be a simple one since our
list of transforms is still rather limited. Nevertheless it will illustrate all of the
essential steps of the technique.

Example 3. Use Laplace transforms to solve the initial-value problem

y″ − y = 1;   y(0) = 0, y′(0) = 1.

We begin by applying the operator £ to both sides of the given equation to obtain

£[y″] − £[y] = £[1].

* A generalization of (5-12) to the case where f has jump discontinuities is given in the exercises below.

(Note that this step depends upon the linearity of £.) Using (5-13) and (5-8), this equation may be rewritten

s²£[y] − 1 − £[y] = 1/s.

Thus

£[y] = 1/(s(s − 1)),   (5-15)

and to complete the solution we must find a function whose Laplace transform is given by this equation. To do so we use the method of partial fractions to rewrite (5-15) as

£[y] = 1/(s − 1) − 1/s,

from which it follows at once that

y = eᵗ − 1.
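It is worth verifying that this answer really solves the problem. A small finite-difference check, ours rather than the book's, confirms that y = eᵗ − 1 satisfies y″ − y = 1 together with the initial conditions:

```python
import math

def y(t):
    return math.exp(t) - 1.0

h = 1e-4
residuals = []
for t in (0.5, 1.0, 2.0):
    # Centered second difference approximates y''(t).
    y2 = (y(t + h) - 2 * y(t) + y(t - h)) / h**2
    residuals.append(y2 - y(t) - 1.0)   # ~ 0 when y'' - y = 1

y_at_0 = y(0.0)                          # should be 0
yprime_at_0 = (y(h) - y(-h)) / (2 * h)   # should be close to 1
```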

The above example illustrates the way Laplace transforms are used to solve initial-value problems. In general, if we are given an nth-order linear differential equation Ly = h(t) with constant coefficients and initial conditions y(0) = y₀, y′(0) = y₁, . . . , y^{(n−1)}(0) = y_{n−1}, then (5-14) can be used to convert this initial-value problem into an operator equation of the form

£[y] = φ(s),

whenever h(t) is of exponential order. It then follows that

y = £⁻¹[φ],

and the problem has been solved, provided £⁻¹[φ] can be explicitly found. The reader should appreciate that this argument depends upon two facts: first, that the unique solution of every such problem is of exponential order, so that the existence of £[y] is assured from the outset, and second, that the equation y = £⁻¹[φ] has a unique continuous solution. But both of these facts have already been established; the first in the paragraph following Definition 5-1, the second by Lerch's theorem.
We conclude this section by establishing the integral analogs of Formulas (5-12) and (5-14), which are frequently useful in computing transforms and their inverses.

Theorem 5-5. If f is a function of exponential order on [0, ∞), and a is a non-negative real number, then

£[∫ₐᵗ f(x) dx] = (1/s)£[f] − (1/s)∫₀ᵃ f(x) dx.   (5-16)

More generally,

£[∫ₐᵗ ⋯ ∫ₐᵗ f(x) dx ⋯ dx]   (n times)

   = (1/sⁿ)£[f] − (1/sⁿ)∫₀ᵃ f(x) dx − ⋯ − (1/s)∫₀ᵃ ∫ₐᵗ ⋯ ∫ₐᵗ f(x) dx ⋯ dx,   (5-17)

the final iterated integral being taken (n − 1) times.

Proof. The proof is based upon the observation that if f is of exponential order then so is ∫ₐᵗ f(x) dx (see Exercise 19). For using integration by parts with

u(t) = ∫ₐᵗ f(x) dx   and   dv = e^{−st} dt,

we have

£[∫ₐᵗ f(x) dx] = ∫₀^∞ e^{−st} ∫ₐᵗ f(x) dx dt
             = −(1/s) e^{−st} ∫ₐᵗ f(x) dx |₀^∞ + (1/s) ∫₀^∞ e^{−st} f(t) dt.

But since ∫ₐᵗ f(x) dx is of exponential order, the first term in this expression tends to zero as t → ∞ provided s is sufficiently large, and hence

£[∫ₐᵗ f(x) dx] = (1/s)£[f] + (1/s)∫ₐ⁰ f(x) dx.

Except for obvious notational changes this is (5-16). Equation (5-17) is established by iterating this result. |

In practice the integration formulas usually arise with a = 0, in which case they assume the much simpler forms

£[∫₀ᵗ f(x) dx] = (1/s)£[f],   (5-18)

£[∫₀ᵗ ⋯ ∫₀ᵗ f(x) dx ⋯ dx] = (1/sⁿ)£[f].   (5-19)

Example 4. Since ∫₀ᵗ cos ax dx = (1/a) sin at, (5-18) gives

(1/a)£[sin at] = (1/s)£[cos at].

Thus

£[sin at] = (a/s)£[cos at]
         = (a/s)(s/(s² + a²))
         = a/(s² + a²),   s > 0.

Example 5. Use the integration formulas to compute £[teᵗ].

Since

∫₀ᵗ xeˣ dx = teᵗ − eᵗ + 1,

(5-18) gives

£[teᵗ − eᵗ + 1] = (1/s)£[teᵗ].

Using the linearity of £ we then have

£[teᵗ] − £[eᵗ] + £[1] = (1/s)£[teᵗ].

Hence

£[teᵗ] − 1/(s − 1) + 1/s = (1/s)£[teᵗ],

and it follows that

£[teᵗ] = 1/(s − 1)²,   s > 1.

(The reader should note that this result can also be obtained by using the differentiation formula.)
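Formula (5-18) can also be confirmed numerically with the pair from Example 4, since the integral of cos ax over [0, t] is (1/a) sin at; the quadrature helper and its parameters are our own choices, not part of the text:

```python
import math

def laplace_numeric(f, s, T=40.0, n=100_000):
    """Composite Simpson approximation of the integral of e^(-s t) f(t) over [0, T]."""
    h = T / n
    total = f(0.0) + math.exp(-s * T) * f(T)
    for k in range(1, n):
        t = k * h
        total += (4 if k % 2 else 2) * math.exp(-s * t) * f(t)
    return total * h / 3

a, s = 3.0, 2.0
lhs = laplace_numeric(lambda t: math.sin(a * t) / a, s)   # transform of the integral
rhs = laplace_numeric(lambda t: math.cos(a * t), s) / s   # (1/s) L[cos at]
```

Both sides equal 1/(s² + a²) = 1/13 at these values, exactly as (5-18) and the computation above predict.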

EXERCISES

Find the Laplace transform of each of the following functions. (In those problems where it appears, a denotes a constant.)

1. sin (t + a)     2. (t + a)²     3. (t + a)ⁿ, n a positive integer     4. sinh at
5. cosh at     6. t²eᵗ     7. sin² at     8. t² cos t
9. t sin 2t     10. (2t − 3)e^{(t+2)/3}

11. Show that

£[cos³ t] = s(s² + 7)/((s² + 9)(s² + 1)).

12. Compute £[eᵗ sin t]. [Hint: Use the differentiation formula.]

13. Compute the Laplace transform of the function u_a(t) defined by the formula

u_a(t) = 0 for t < a,  and  u_a(t) = 1 for t > a.
192 THE LAPLACE TRANSFORM I CHAP. 5

14. Compute the Laplace transform of the function whose graph is given in Fig. 5-3.
[Hint: Recall the formula for the sum of a geometric series.]

FIGURE 5-3

15. Suppose that the function f is continuous on (0, ∞), and that its derivative f′ is of exponential order. Integrate the inequality

−Ce^{ax} ≤ f′(x) ≤ Ce^{ax}

to deduce that f is of exponential order.
16. In the proof of Theorem 5-4 we used integration by parts to evaluate the integral

∫₀^∞ e^{−st} f′(t) dt,

where f′ was of exponential order. Prove that this was legitimate. (The problem consists of investigating the behavior of this integral at zero and at the points of discontinuity of f′.)

17. During the proof of Theorem 5-4 it was asserted that e^{−st} f(t)|₀^∞ vanishes at its upper limit provided s is sufficiently large. How large is "sufficiently large"?


18. Suppose that the function f in Theorem 5-4 has a jump discontinuity at t₀ > 0. Prove that

£[f′] = s£[f] − f(0+) − e^{−st₀}[f(t₀⁺) − f(t₀⁻)],

where

f(t₀⁺) = lim_{h→0+} f(t₀ + h),   f(t₀⁻) = lim_{h→0+} f(t₀ − h).

19. Prove that ∫ₐᵗ f(x) dx, a ≥ 0, is of exponential order whenever f is of exponential order.

20. Use (5-16) to prove that

£[∫ₐᵗ ∫ₐᵗ f(x) dx dx] = (1/s²)£[f] − (1/s²)∫₀ᵃ f(x) dx − (1/s)∫₀ᵃ ∫ₐᵗ f(x) dx dx.

Solve each of the following initial-value problems using Laplace transforms.

21. y″ − 3y′ + 2y = 0;   y(0) = 3, y′(0) = 4
22. y″ + y = t;   y(0) = −1, y′(0) = 3
23. y‴ + y″ + 4y′ + 4y = −2;   y(0) = 0, y′(0) = 1, y″(0) = −1
24. y‴ − y″ + 9y′ − 9y = 0;   y(0) = 0, y′(0) = 3, y″(0) = 0
25. y″ − y′ − 6y = 3t² + t − 1;   y(0) = −1, y′(0) = 6
26. y″ + 6y′ + 8y = 4(4t + 3);   y(0) = 2, y′(0) = −6
27. 4y″ + y = −2;   y(0) = 0, y′(0) = ½
28. Derive Formula (5-14).
5-5 I
FURTHER PROPERTIES OF THE LAPLACE TRANSFORM 193

*29. If Ly = h(t) is a linear differential equation with constant coefficients and if h(t) is of exponential order, show that every solution of this equation is of exponential order. [Hint: Each solution is of the form y_h(t) + y_p(t), where y_h(t) satisfies the homogeneous equation Ly = 0 and y_p(t) = ∫₀ᵗ G(t, ξ)h(ξ) dξ. Recall how the Green's function G(t, ξ) is constructed from solutions of Ly = 0 (see Section 4-5).]

5-5 FURTHER PROPERTIES OF THE LAPLACE TRANSFORM


As we have seen, the solution of an initial-value problem by Laplace transform
methods comes down to finding the inverse transform of a function <p(s). In prac-
tice such inverses are obtained by using the method of partial fractions to convert
<p(s) to a form in which its inverse- can be recognized usually with the aid of —
certain specialformulas and a table such as the one given at the end of this chapter.
In this section we shall derive a number of the above mentioned formulas, and
illustrate their use in computations.
We begin with a very simple result which permits us to compute the Laplace
at
transform of e f(i) whenever the transform of/ is already known.

Theorem 5-6. If £[f] = φ(s), then

£[e^{at} f(t)] = φ(s − a).   (5-20)

Proof.

£[e^{at} f(t)] = ∫₀^∞ e^{−st} e^{at} f(t) dt
             = ∫₀^∞ e^{−(s−a)t} f(t) dt
             = £[f(t)](s − a)
             = φ(s − a). |

This result is sometimes known as the first shifting theorem (the second will be
given below), and may be written in terms of inverse transforms as
_1 - a -1
£ [^(5 a)] = c '£ [^(5)], (5-21)
or as
-1 _1
£ lVCs)] = e°'£ k(j + «)]• (5-22)

Example 1. Since £ [cos 3/] = s/(s


2
+ 9), (5-20) yields

~2
£'e '
c ° s3 "= (,+y+9 -

Example 2. Compute

£⁻¹[(2s + 3)/(s² − 4s + 20)].

Keeping (5-20) or (5-21) in mind, we write

(2s + 3)/(s² − 4s + 20) = (2s + 3)/((s − 2)² + 16)
= (2(s − 2) + 7)/((s − 2)² + 16)
= 2[(s − 2)/((s − 2)² + 16)] + (7/4)[4/((s − 2)² + 16)].

But

£⁻¹[(s − 2)/((s − 2)² + 16)] = e^{2t} cos 4t,

and

£⁻¹[4/((s − 2)² + 16)] = e^{2t} sin 4t.

Hence

£⁻¹[(2s + 3)/((s − 2)² + 16)] = 2e^{2t} cos 4t + (7/4)e^{2t} sin 4t.
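The shifting formula lends itself to a quick numerical sanity check. The Python sketch below (an editorial illustration, not part of the original text) approximates the Laplace integral of Example 1 with a truncated trapezoidal rule and compares it against φ(s − a); the truncation length, step count, and tolerance are ad hoc choices.

```python
import math

def laplace(f, s, T=60.0, n=120000):
    # Trapezoidal approximation of the Laplace integral on [0, T];
    # T is chosen large enough that the truncated tail is negligible.
    h = T / n
    total = 0.5 * (f(0.0) + math.exp(-s * T) * f(T))
    for k in range(1, n):
        t = k * h
        total += math.exp(-s * t) * f(t)
    return total * h

s = 3.0
numeric = laplace(lambda t: math.exp(-2 * t) * math.cos(3 * t), s)
exact = (s + 2) / ((s + 2) ** 2 + 9)   # phi(s - a) with a = -2, phi(s) = s/(s^2 + 9)
print(abs(numeric - exact) < 1e-5)
```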

In order to state our next result we must introduce the so-called unit step function u_a(t), which is defined by the formula

u_a(t) = {0, t < a; 1, t > a}.   (5-23)

FIGURE 5-4

(See Fig. 5-4.) (For our purposes we shall always assume a ≥ 0.) This function enables us to write the formula for a function such as the sinusoidal curve

f(t) = {0, t ≤ a; sin(t − a), t > a},   (5-24)

shown in Fig. 5-5, in very simple form. Indeed, since u_a(t) is zero for t < a, we have

f(t) = u_a(t) sin(t − a),   (5-25)

an expression which is much better adapted to computation than (5-24).* More generally, the expression

f(t) = u_a(t)g(t − a) = {0, t < a; g(t − a), t > a},

* Some authors restrict the term "unit step function" to mean what we have called u₀(t). In that case (5-25) becomes f(t) = u₀(t − a) sin(t − a).

FIGURE 5-5

describes the function obtained by translating or "shifting" g(t) a units to the right
and then annihilating that portion to the left of a (Fig. 5-6). Such functions arise
in practice as time delayed inputs to physical systems (i.e., inputs occurring at
time t = a > 0) and are of considerable practical importance. The next theorem,
known as the second shifting theorem, gives a formula for the Laplace transform
of such a function.

FIGURE 5-6

Theorem 5-7. Let

f(t) = u_a(t)g(t − a), a ≥ 0,

be a piecewise continuous function of exponential order. Then

£[f] = e^{−as}£[g].   (5-26)

(For physical reasons the factor e^{−as} in this formula is known as a delaying factor.)

Proof.
£[f] = ∫₀^∞ e^{−st}f(t) dt = ∫_a^∞ e^{−st}g(t − a) dt.

Thus if we make the substitution x = t − a, we obtain

£[f] = ∫₀^∞ e^{−s(x+a)}g(x) dx = e^{−as}∫₀^∞ e^{−sx}g(x) dx = e^{−as}£[g],

and the theorem is proved. ∎

To apply (5-26) in the computation of inverse transforms we rewrite it as

£⁻¹[e^{−as}£[g(t)]] = u_a(t)g(t − a),

or as

£⁻¹[e^{−as}φ(s)] = u_a(t)g(t − a),   (5-27)

where φ(s) = £[g(t)].

Example 3. If f(t) = u_a(t) sin t, then

£[f] = £[u_a(t) sin(t + a − a)]
= e^{−as}£[sin(t + a)]
= e^{−as}£[sin t cos a + cos t sin a]
= e^{−as}{cos a £[sin t] + sin a £[cos t]}
= e^{−as}(cos a + s sin a)/(s² + 1).

Example 4. Let f be the function whose graph is shown in Fig. 5-7. To compute £[f] it is convenient to think of f as the sum of the functions

f₁(t) = t, t ≥ 0,
f₂(t) = {0, t < 1; 1 − t, t > 1},
f₃(t) = {0, t < 2; −(2 − t), t > 2}.

FIGURE 5-7

Then

£[f] = £[f₁] + £[f₂] + £[f₃]
= 1/s² + e^{−s}£[−t] + e^{−2s}£[t]
= (1 − e^{−s} + e^{−2s})/s².

Example 5. Find £⁻¹[e^{−3s}/(s² + 6s + 10)].
Using (5-27) we have

£⁻¹[e^{−3s}/(s² + 6s + 10)] = u₃(t)g(t − 3),

where g(t) = £⁻¹[1/(s² + 6s + 10)]. But, by (5-21),

£⁻¹[1/(s² + 6s + 10)] = £⁻¹[1/((s + 3)² + 1)] = e^{−3t}£⁻¹[1/(s² + 1)] = e^{−3t} sin t.

Hence

£⁻¹[e^{−3s}/(s² + 6s + 10)] = u₃(t)e^{−3(t−3)} sin(t − 3).
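The second shifting theorem can be checked the same way. In this sketch (ours; the function below is the inverse transform found in Example 5, and the quadrature parameters are arbitrary) the transform of u₃(t)e^{−3(t−3)} sin(t − 3) is compared with e^{−3s}/(s² + 6s + 10).

```python
import math

def laplace(f, s, T=50.0, n=100000):
    # trapezoidal approximation of the Laplace integral on [0, T]
    h = T / n
    total = 0.5 * (f(0.0) + math.exp(-s * T) * f(T))
    for k in range(1, n):
        t = k * h
        total += math.exp(-s * t) * f(t)
    return total * h

def f(t):
    # u_3(t) e^{-3(t-3)} sin(t-3): zero before the delay a = 3
    return math.exp(-3 * (t - 3)) * math.sin(t - 3) if t > 3 else 0.0

s = 1.0
numeric = laplace(f, s)
exact = math.exp(-3 * s) / (s ** 2 + 6 * s + 10)
print(abs(numeric - exact) < 1e-5)
```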

Our next result is somewhat akin to Theorem 5-6 in that it permits us to compute the Laplace transform of tⁿf(t) when £[f] is known. Specifically, we have

Theorem 5-8. If £[f] = φ(s), then

£[tⁿf(t)] = (−1)ⁿ(dⁿ/dsⁿ)φ(s).   (5-28)

Proof. This result is established by differentiating both sides of the equation

φ(s) = ∫₀^∞ e^{−st}f(t) dt

n times with respect to s. Thus

(d/ds)φ(s) = (d/ds)∫₀^∞ e^{−st}f(t) dt = ∫₀^∞ (∂/∂s)[e^{−st}f(t)] dt = −∫₀^∞ e^{−st}tf(t) dt = −£[tf(t)],

and so forth. (For a justification of this differentiation under the integral sign see Appendix I.) ∎

This time the companion formula phrased in terms of £⁻¹ is

£⁻¹[(dⁿ/dsⁿ)φ(s)] = (−1)ⁿtⁿ£⁻¹[φ(s)].   (5-29)

Example 6.

£[t sin t] = −(d/ds)£[sin t] = −(d/ds)(1/(s² + 1)) = 2s/(s² + 1)².

Example 7.

£[tⁿ] = (−1)ⁿ(dⁿ/dsⁿ)£[1] = (−1)ⁿ(dⁿ/dsⁿ)(1/s) = n!/s^{n+1},

which again proves (5-11).


Example 8. Suppose we wish to compute £⁻¹[1/(s² + 1)²]. By comparing 1/(s² + 1)² with (d/ds)(1/(s² + 1)) we see that

1/(s² + 1)² = −(1/(2s))(d/ds)(1/(s² + 1)).

But (5-29), with n = 1, yields

£⁻¹[(d/ds)(1/(s² + 1))] = −t sin t.

Hence, applying Formula (5-18), we have

£⁻¹[1/(s² + 1)²] = (1/2)∫₀^t ξ sin ξ dξ = −(1/2)t cos t + (1/2) sin t.
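Both Example 6 and Example 8 can be verified numerically. The sketch below (our illustration; the quadrature routine and tolerances are ad hoc) checks £[t sin t] = 2s/(s² + 1)² and that the transform of −(1/2)t cos t + (1/2) sin t is 1/(s² + 1)².

```python
import math

def laplace(f, s, T=50.0, n=100000):
    # trapezoidal approximation of the Laplace integral on [0, T]
    h = T / n
    total = 0.5 * (f(0.0) + math.exp(-s * T) * f(T))
    for k in range(1, n):
        t = k * h
        total += math.exp(-s * t) * f(t)
    return total * h

s = 2.0
# Example 6: L[t sin t] = 2s/(s^2 + 1)^2
err1 = abs(laplace(lambda t: t * math.sin(t), s) - 2 * s / (s ** 2 + 1) ** 2)
# Example 8: L[(sin t - t cos t)/2] = 1/(s^2 + 1)^2
err2 = abs(laplace(lambda t: 0.5 * (math.sin(t) - t * math.cos(t)), s)
           - 1.0 / (s ** 2 + 1) ** 2)
print(err1 < 1e-5 and err2 < 1e-5)
```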

Our final formula is designed to reduce the computation of the Laplace transform of a periodic function* to the evaluation of an integral over a finite interval, and reads as follows:

Theorem 5-9. If f is of exponential order and is periodic with period p, then

£[f] = (∫₀^p e^{−st}f(t) dt)/(1 − e^{−ps}).   (5-30)

* A piecewise continuous function f is said to be periodic with period p > 0 if f(t + p) = f(t) for all values of t.



Proof. By definition,

£[f] = ∫₀^∞ e^{−st}f(t) dt
= ∫₀^p e^{−st}f(t) dt + ∫_p^{2p} e^{−st}f(t) dt + ⋯ + ∫_{np}^{(n+1)p} e^{−st}f(t) dt + ⋯.

We now set x + np = t in the (n + 1)st integral of the above series to obtain

∫_{np}^{(n+1)p} e^{−st}f(t) dt = ∫₀^p e^{−s(x+np)}f(x + np) dx = e^{−nps}∫₀^p e^{−sx}f(x) dx,

the last step following from the periodicity of f. Hence

£[f] = ∫₀^p e^{−sx}f(x) dx + e^{−ps}∫₀^p e^{−sx}f(x) dx + ⋯ + e^{−nps}∫₀^p e^{−sx}f(x) dx + ⋯
= [1 + e^{−ps} + e^{−2ps} + ⋯]∫₀^p e^{−sx}f(x) dx.

But the sum of the geometric series 1 + e^{−ps} + e^{−2ps} + ⋯ is 1/(1 − e^{−ps}), and it follows that

£[f] = (∫₀^p e^{−st}f(t) dt)/(1 − e^{−ps}),

as asserted. ∎

Example 9. Find the Laplace transform of the function whose graph is shown in Fig. 5-8. In this case f is periodic with period 2, whence

£[f] = (∫₀² e^{−st}f(t) dt)/(1 − e^{−2s})
= (∫₀¹ e^{−st} dt)/(1 − e^{−2s})
= (1 − e^{−s})/(s(1 − e^{−2s}))
= 1/(s(1 + e^{−s})).

FIGURE 5-8
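Theorem 5-9 can be tested against Example 9 by transforming the square wave directly. In the sketch below (ours; the looser tolerance allows for the trapezoidal rule's behavior at the jumps) the truncated Laplace integral of the period-2 square wave is compared with 1/(s(1 + e^{−s})).

```python
import math

def f(t):
    # period-2 square wave of Example 9: 1 on [0, 1), 0 on [1, 2)
    return 1.0 if (t % 2.0) < 1.0 else 0.0

def laplace(f, s, T=40.0, n=400000):
    # trapezoidal approximation of the Laplace integral on [0, T]
    h = T / n
    total = 0.5 * (f(0.0) + math.exp(-s * T) * f(T))
    for k in range(1, n):
        t = k * h
        total += math.exp(-s * t) * f(t)
    return total * h

s = 1.0
numeric = laplace(f, s)
exact = 1.0 / (s * (1.0 + math.exp(-s)))   # closed form from Example 9
print(abs(numeric - exact) < 1e-3)
```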


200 THE LAPLACE TRANSFORM |
CHAP. 5

EXERCISES

Find the Laplace transform of each of the following functions.

1. e^t sin 3t    2. 3e^t cos 2t    3. t sin 3t
4. te^{3t} cos t    5. e^{−t} cos(2t + 4)    6. te^t(d/dt)(sin 2t)
7. te^{2t}f'(t)    8. (tD + 1)f(t), where D = d/dt
9. f(t) = {0, t < 1; 1 + t, t > 1}    10. f(t) = {t, t < 2; 2, t > 2}
11. f(t) = {sin t, t < 2π; 0, t > 2π}    12. f(t) = {0, t < 2; cos t, 2 < t < 3π; 0, t > 3π}
13. f(t) = {t, t < 2; 8 − 3t, 2 < t < 3; t − 4, 3 < t < 4; 0, t > 4}    14. sin t cos t
15. 2e^{2t} sin t cos t    16. sin² t [Hint: What is the period of sin² t?]
17. |sin t|    18. f(t), as shown in Fig. 5-9
19. ∫₀^t ξe^{−3ξ} sin ξ dξ    20. e^{2t}∫₀^t ξ cos 4ξ dξ    21. t∫₀^t ξ sin ξ dξ
22. t∫₀^t e^ξ cos 3ξ dξ    23. (d²/dt²)∫₀^t e^ξ sin ξ dξ

24. Find the Laplace transform of the staircase function

f(t) = n + 1,  n < t < n + 1,  n = 0, 1, 2, …,

shown in Fig. 5-10.

FIGURE 5-9    FIGURE 5-10


l


Find the inverse Laplace transform of each of the following functions.

25. 1/(s(s + 1))    26. 1/((s − 1)²)    27. 1/(s(s + 2)²)
28. 5/(s²(s − 5)²)    29. 1/((s − a)ⁿ), n ≥ 1
30. 1/((s − a)(s − b)), a, b constants    31. 1/(s² + 4s + 29)
32. 2s/(2s² + 1)    33. 2s/((s² + 1)²) [Hint: See Example 8.]
*34. 1/((s² + 4)³)    35. s²/((s² + 1)²) [Hint: First expand in partial fractions.]
*36. 2s³/((s² + 1)³)    37. 3s/(3s² + 1)    38. 3e^{−2s}/(s⁴ + 1)
39. 1/(s⁴ + 1) [Hint: s⁴ + 1 = (s⁴ + 2s² + 1) − 2s².]
40. 3s/((s + 1)⁴) [Hint: See Exercise 35.]
41. 2/(s(s − 1)(s − 2))    42. 1/(s²(s − 1)(s − 2))
43. s/(s³ + a³)    44. 1/(s³ + a³)
45. ln((s + 3)/(s + 2)) [Hint: (d/ds) ln((s + 3)/(s + 2)) = −1/((s + 2)(s + 3)); apply Formula (5-29).]
46. ln((s² + 1)/(s(s + 3)))
*47. Describe a procedure for finding the inverse Laplace transform of any rational function P(s)/Q(s), P, Q polynomials with degree P < degree Q. In particular, show that it is sufficient to consider the special cases

1/((s² + a²)ⁿ),  s/((s² + a²)ⁿ),  1/((s + a)ⁿ),

where a is a constant and n a positive integer.


*48. Show that for any integer n ≥ 1, a ≠ 0,

£⁻¹[1/((s² + a²)^{n+1})] = (1/(2n))∫₀^t t£⁻¹[1/((s² + a²)ⁿ)] dt.

*49. (a) Use the result of Exercise 48 to show that

£⁻¹[1/((s² + a²)^{n+1})] = (1/(2ⁿaⁿn!))∫₀^t ∫₀^t ⋯ ∫₀^t t sin at dt dt ⋯ dt.

(b) Derive a similar formula for £⁻¹[s/((s² + a²)^{n+1})].



5-6 THE LAPLACE TRANSFORM AND DIFFERENTIAL EQUATIONS

The use of the Laplace transform in solving initial-value problems was introduced and justified in Section 5-4. In this section we shall illustrate how the formulas just derived allow us to solve more elaborate problems.

Example 1. Find the solution of

y'' + 4y' + 13y = 2t + 3e^{−2t} cos 3t,   (5-31)
y(0) = 0, y'(0) = −1.

Taking the Laplace transform of both sides of this equation, and applying the given initial conditions, we obtain

s²£[y] + 1 + 4s£[y] + 13£[y] = 2/s² + 3(s + 2)/((s + 2)² + 9).

Hence

£[y] = −1/(s² + 4s + 13) + 2/(s²(s² + 4s + 13)) + 3(s + 2)/((s² + 4s + 13)²),

and we must now find the inverse transform of the various terms on the right-hand side of this equation. The first can be disposed of without difficulty since

1/(s² + 4s + 13) = 1/((s + 2)² + 9) = (1/3)(3/((s + 2)² + 9)).

Hence

£⁻¹[−1/(s² + 4s + 13)] = −(1/3)e^{−2t} sin 3t.

To handle the second, we use the method of partial fractions, as follows:

2/(s²(s² + 4s + 13)) = A/s + B/s² + (Cs + D)/(s² + 4s + 13),

whence

As(s² + 4s + 13) + B(s² + 4s + 13) + (Cs + D)s² = 2.

In order that this equation hold identically in s, we must have

A + C = 0,
4A + B + D = 0,
13A + 4B = 0,
13B = 2,

and it follows that A = −8/169, B = 2/13, C = 8/169, D = 6/169. Thus

2/(s²(s² + 4s + 13)) = −(8/169)(1/s) + (2/13)(1/s²) + (8/169)((s + 2)/((s + 2)² + 9)) − (10/(3·169))(3/((s + 2)² + 9)),

and

£⁻¹[2/(s²(s² + 4s + 13))] = −8/169 + (2/13)t + (8/169)e^{−2t} cos 3t − (10/507)e^{−2t} sin 3t.

Finally, since

3(s + 2)/((s² + 4s + 13)²) = −(3/2)(d/ds)(1/((s + 2)² + 9)),

we can apply Formulas (5-21) and (5-29) to obtain

£⁻¹[3(s + 2)/((s² + 4s + 13)²)] = (1/2)te^{−2t} sin 3t.

Combining these results we see that the solution of (5-31) is

y = −8/169 + (2/13)t + (8/169)e^{−2t} cos 3t − (179/507)e^{−2t} sin 3t + (1/2)te^{−2t} sin 3t.

(The student who feels overwhelmed by the details of these computations is urged to compare them with those involved in the method of variation of parameters or undetermined coefficients. He will find that the argument given above is simpler and much more straightforward than either of the other two.)
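The solution just obtained can also be checked without redoing the transform algebra: it should vanish at t = 0, have slope −1 there, and satisfy the differential equation identically. The following sketch (our illustration; step sizes and tolerances are ad hoc) tests all three with finite differences.

```python
import math

def y(t):
    # the solution assembled in the text
    e = math.exp(-2 * t)
    return (-8 / 169 + (2 / 13) * t + (8 / 169) * e * math.cos(3 * t)
            - (179 / 507) * e * math.sin(3 * t) + 0.5 * t * e * math.sin(3 * t))

def forcing(t):
    return 2 * t + 3 * math.exp(-2 * t) * math.cos(3 * t)

h = 1e-4
def residual(t):
    # finite-difference check of y'' + 4y' + 13y = forcing
    d1 = (y(t + h) - y(t - h)) / (2 * h)
    d2 = (y(t + h) - 2 * y(t) + y(t - h)) / h ** 2
    return d2 + 4 * d1 + 13 * y(t) - forcing(t)

max_res = max(abs(residual(t)) for t in (0.3, 1.0, 2.5))
yp0 = (y(1e-6) - y(-1e-6)) / 2e-6   # estimate of y'(0); should be -1
print(abs(y(0.0)) < 1e-12, abs(yp0 + 1) < 1e-4, max_res < 1e-4)
```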

Example 2. In this example we show how the method of Laplace transforms can be used to find the general solution of a differential equation. As an illustration we again solve the equation

(D − α)²y = 0,   (5-32)

where α is an arbitrary real number.
In the absence of specific initial conditions we choose them arbitrarily as y(0) = c₁, y'(0) = c₂. Taking Laplace transforms of both sides of (5-32) and applying our initial conditions, we have

s²£[y] − c₁s − c₂ − 2α(s£[y] − c₁) + α²£[y] = 0.

Hence

£[y] = (c₁s + c₂ − 2αc₁)/(s² − 2αs + α²)
= c₁(s − α)/((s − α)²) + (c₂ − αc₁)/((s − α)²)
= c₁(1/(s − α)) + (c₂ − αc₁)(1/((s − α)²)).

But

£⁻¹[1/(s − α)] = e^{αt} and £⁻¹[1/((s − α)²)] = te^{αt}.

Hence

y = c₁e^{αt} + (c₂ − αc₁)te^{αt} = c₃e^{αt} + c₄te^{αt},

where c₃, c₄ are still arbitrary, is the general solution of (5-32).

EXERCISES

Solve each of the following initial-value problems using Laplace transforms.


1. (D² + 2D + 1)y = e^t;  y(0) = y'(0) = 0.

2. dy/dt + 3y = t sin at;  y(0) = −1.

3. d²y/dt² + 2(dy/dt) + 3y = 3t;  y(0) = 0, y'(0) = 1.
2 2(
4. (Z) - 4Z) + 4)y = 2e + cos t; y(0) = ft, /(0)
= <L.
25-

FIGURE 5-11

5. dy/dt + ky = h(t);  y(0) = 0, with k a constant and the graph of h(t) given by Fig. 5-11.

6. d²y/dt² − 2(dy/dt) + y = te^t sin t;  y(0) = y'(0) = 0.


1
i3

'^~^ + %~ Ay=
2

A
^ + 4g2<;
m =
'
/(0) = 5
'
^'^ = 3'

8 '
^+ ^+S" 3 3
^"^ =
^ A0)
= y(0) = y (0) = ^'^^
'
= °-


k_ _»
dt (l
r>2 .

10. d²y/dt² + y = t² + 1;  y(π) = π², y'(π) = 2π. [Hint: First make the substitution x = t − π.]

11. d²y/dt² − y = −10 sin 2t;  y(π) = −1, y'(π) = 0. [Hint: See Exercise 10.]

12. d⁴y/dt⁴ + y = {0, t ≤ 1; t − 1, t > 1};  y(1) = y'(1) = 1, y''(1) = y'''(1) = 0.
[Hint: See Exercise 10.]

13. Use Laplace transforms to solve the equation

dy/dt + 2y + ∫₀^t y(ξ) dξ = {t, 0 ≤ t < 1; 2 − t, 1 ≤ t < 2; 0, t ≥ 2},

subject to the initial condition y(0) = 1.

14. (a) Suppose that y(t) is of exponential order and is a solution of the Euler equation

t²(d²y/dt²) + at(dy/dt) + by = 0,

a, b constants. Show that £[y(t)] also satisfies an equation of the Euler type.
(b) Prove that the result in (a) is also valid for any solution (of exponential order) of an Euler equation of order n.

15. Show that if f(t) and f'(t) are of exponential order, and if f(t) is continuous for all t ≥ 0, then

lim_{s→∞} s£[f] = f(0⁺).

*16. Bessel's function of the first kind of order zero, denoted J₀, is by definition that solution of the differential equation

t(d²y/dt²) + dy/dt + ty = 0

which is defined for t = 0 and satisfies J₀(0) = 1. Prove that

£[J₀(t)] = 1/√(1 + s²).

[Hint: Show that £[J₀] is a solution of the differential equation

(1 + s²)φ'(s) + sφ(s) = 0,

and use Exercise 15. Assume J₀ is of exponential order.]

*17. By expanding s£[J₀] = s(s² + 1)^{−1/2} (see Exercise 16) in a binomial series, express J₀(t) as a power series.

5-7 THE CONVOLUTION THEOREM


In this section we shall establish the most important single property of the Laplace transform, the so-called convolution formula. Far from being just a computational device, as were the results of Section 5-5, the convolution formula plays an important role in certain theoretical investigations in advanced analysis. And in the following pages we shall find that it is also ideally suited to the task of constructing inverses for linear differential operators with constant coefficients.
We prove the formula in question as

Theorem 5-10. Let f and g be piecewise continuous functions of exponential order, and suppose that

£[f] = φ(s), £[g] = ψ(s).

Then

£[∫₀^t f(t − ξ)g(ξ) dξ] = φ(s)ψ(s).   (5-33)

When written in terms of inverse transforms, (5-33) becomes

£⁻¹[φ(s)ψ(s)] = ∫₀^t f(t − ξ)g(ξ) dξ,   (5-34)

and in this form asserts that if we know the inverse transforms, f and g, of the functions φ and ψ, we can express the inverse transform of the product φ(s)ψ(s) as an integral involving f and g. The integral in question is called the convolution of f and g and is denoted f * g; that is,

(f * g)(t) = ∫₀^t f(t − ξ)g(ξ) dξ.   (5-35)

Using this notation, (5-34) can be written

f * g = £⁻¹[φ(s)ψ(s)],

or, even more suggestively,

£⁻¹[φ(s)ψ(s)] = £⁻¹[φ(s)] * £⁻¹[ψ(s)].   (5-36)

Proof of the theorem. Using the definition of £ we have

£[∫₀^t f(t − ξ)g(ξ) dξ] = ∫₀^∞∫₀^t e^{−st}f(t − ξ)g(ξ) dξ dt,

where the integration is being performed over the region of the tξ-plane described by the inequalities

0 ≤ ξ ≤ t, 0 ≤ t < ∞

(see Fig. 5-12). But this region is also described by

ξ ≤ t < ∞, 0 ≤ ξ < ∞,

and hence the above iterated integral may be written

∫₀^∞∫_ξ^∞ e^{−st}f(t − ξ)g(ξ) dt dξ,

or

∫₀^∞ g(ξ)(∫_ξ^∞ e^{−st}f(t − ξ) dt) dξ.†

FIGURE 5-12

We now make the change of variable u = t − ξ in ∫_ξ^∞ e^{−st}f(t − ξ) dt and obtain

∫_ξ^∞ e^{−st}f(t − ξ) dt = ∫₀^∞ e^{−s(u+ξ)}f(u) du.

Hence

£[∫₀^t f(t − ξ)g(ξ) dξ] = ∫₀^∞ g(ξ)(∫₀^∞ e^{−s(u+ξ)}f(u) du) dξ
= ∫₀^∞ e^{−sξ}g(ξ)(∫₀^∞ e^{−su}f(u) du) dξ
= ∫₀^∞ e^{−su}f(u) du ∫₀^∞ e^{−sξ}g(ξ) dξ
= £[f]£[g],

and the proof is complete. ∎

The operation of convolution can be viewed as a multiplication on the vector space ℰ, with f * g as the product of f and g, and it is of some interest to determine its properties. For instance, Formula (5-34) and the equality

£[f]£[g] = £[g]£[f]

imply at once that

∫₀^t f(t − ξ)g(ξ) dξ = ∫₀^t g(t − ξ)f(ξ) dξ.

Hence f * g = g * f, and convolution is a commutative operation. It is also associative and distributive; that is,

f * (g * h) = (f * g) * h and f * (g + h) = f * g + f * h

(see Exercises 14 and 15 below), and thus defines a very reasonable multiplication on ℰ. Much of the advanced work connected with the convolution integral is devoted to studying the behavior of ℰ under this multiplication.

† It can be shown that an interchange of the order of integration is permissible at this point. See Appendix I.

Example 1. Find

£⁻¹[1/(s(s² + 1))].

It is clear that this problem can be solved by separating 1/(s(s² + 1)) into partial fractions as (1/s) − (s/(s² + 1)) and applying our earlier formulas. But it can also be handled just as well by using (5-35) and (5-36) as follows:

£⁻¹[1/(s(s² + 1))] = £⁻¹[1/s] * £⁻¹[1/(s² + 1)]
= 1 * sin t
= ∫₀^t sin ξ dξ
= 1 − cos t.
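The convolution integral itself is easy to approximate numerically, which gives a direct check on computations like the one above. The sketch below (ours; the pair f(t) = t, g(t) = e^{−t} and the closed form t − 1 + e^{−t} are our own illustrative choices) also tests the commutativity f * g = g * f.

```python
import math

def conv(f, g, t, n=4000):
    # trapezoidal approximation of (f*g)(t) = integral of f(t-xi) g(xi), xi from 0 to t
    h = t / n
    total = 0.5 * (f(t) * g(0.0) + f(0.0) * g(t))
    for k in range(1, n):
        xi = k * h
        total += f(t - xi) * g(xi)
    return total * h

f = lambda t: t
g = lambda t: math.exp(-t)
# closed form of f*g (integration by parts): t - 1 + e^{-t}
err = max(abs(conv(f, g, t) - (t - 1 + math.exp(-t))) for t in (0.5, 1.0, 3.0))
comm = abs(conv(f, g, 2.0) - conv(g, f, 2.0))   # commutativity: f*g = g*f
print(err < 1e-6, comm < 1e-6)
```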

Example 2. Find the solution of the initial-value problem

(D² + D − 6)y = h(t),
y(0) = y'(0) = 0,   (5-37)

given that h(t) is a function of exponential order.
We apply the operator £ to the given equation and obtain

(s² + s − 6)£[y] = £[h],

or

£[y] = £[h]/(s² + s − 6),

from which it follows that

y = £⁻¹[1/((s − 2)(s + 3))] * h(t).

But since

£⁻¹[1/((s − 2)(s + 3))] = £⁻¹[(1/5)(1/(s − 2)) − (1/5)(1/(s + 3))] = (1/5)e^{2t} − (1/5)e^{−3t},

we have

y = ∫₀^t [(1/5)e^{2(t−ξ)} − (1/5)e^{−3(t−ξ)}]h(ξ) dξ,   (5-38)

and the problem has been solved pending explicit knowledge of h.
If, for example, h(t) = 1, then

y = (1/5)∫₀^t e^{2(t−ξ)} dξ − (1/5)∫₀^t e^{−3(t−ξ)} dξ = (1/10)e^{2t} + (1/15)e^{−3t} − 1/6.
+ fse-

The reader will recall from Section 4-5 that the function K(t, ξ) = (1/5)e^{2(t−ξ)} − (1/5)e^{−3(t−ξ)} appearing in (5-38) is called the Green's function for the linear differential operator L = D² + D − 6 (for initial-value problems at t = 0), and as such completely determines an inverse for L. In the next section we shall have more to say about the simple formula

G(t, ξ) = £⁻¹[1/(s² + s − 6)](t − ξ),

which allows us to compute this Green's function by Laplace transform methods.
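Formula (5-38) and the closed form found for h(t) = 1 can be cross-checked numerically: the convolution of the Green's function with h ≡ 1 should reproduce (1/10)e^{2t} + (1/15)e^{−3t} − 1/6, which in turn should satisfy y'' + y' − 6y = 1 with vanishing initial data. A sketch (ours; the quadrature and finite-difference parameters are ad hoc):

```python
import math

def K(t, xi):
    # Green's function K(t, xi) = (1/5)e^{2(t-xi)} - (1/5)e^{-3(t-xi)} from (5-38)
    return (math.exp(2 * (t - xi)) - math.exp(-3 * (t - xi))) / 5

def y(t):
    # closed form obtained in the text for h(t) = 1
    return math.exp(2 * t) / 10 + math.exp(-3 * t) / 15 - 1 / 6

def conv(t, n=4000):
    # trapezoidal approximation of the convolution of K with h = 1
    h = t / n
    total = 0.5 * (K(t, 0.0) + K(t, t))
    for k in range(1, n):
        total += K(t, k * h)
    return total * h

match = abs(conv(1.0) - y(1.0))
h = 1e-5
t = 0.7
d1 = (y(t + h) - y(t - h)) / (2 * h)
d2 = (y(t + h) - 2 * y(t) + y(t - h)) / h ** 2
res = abs(d2 + d1 - 6 * y(t) - 1)   # residual of y'' + y' - 6y = 1
print(match < 1e-6, res < 1e-4, abs(y(0.0)) < 1e-12)
```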

EXERCISES

Use the convolution formula to find the inverse Laplace transform of each of the following functions.

1. £[f]/(s² + 1)    2. e^{−3s}£[f]/s²    3. 1/(s²(s + 1))
4. s/((s² + 1)²)    5. 3s/((s² + 1)²)    6. 1/((s − a)(s − b)), a ≠ b

Evaluate each of the following.

7. e^{at} * e^{bt}    8. t * cos at
9. sin at * cos bt    10. t * e^{at}
11. f(t − 1) * e^{−t}g(t + 1)    12. f(−t) * (sin t)g(t²)

13. Prove directly that f * g = g * f, that is,

∫₀^t f(t − ξ)g(ξ) dξ = ∫₀^t g(t − ξ)f(ξ) dξ.

[Hint: Make the substitution u = t − ξ.]

14. Prove that f * (g + h) = f * g + f * h.

15. Prove that f * (g * h) = (f * g) * h.

16. Find 1 * 1 and 1 * 1 * 1.

17. Derive a formula for 1 * 1 * ⋯ * 1 (n factors).

18. Prove that

∫₀^t sin a(t − u₁)∫₀^{u₁} sin a(u₁ − u₂)∫₀^{u₂} sin a(u₂ − u₃) ⋯ ∫₀^{u_{n−1}} sin a(u_{n−1} − u_n) sin au_n du_n ⋯ du₁
= (a/(2ⁿn!))∫₀^t ∫₀^t ⋯ ∫₀^t t sin at dt ⋯ dt.

[Hint: Compare £⁻¹[1/((s² + a²)^{n+1})] as computed by the formula of Exercise 49, Section 5-5, and as computed by repeated use of the convolution integral.]
19. Suppose that f(t) is of exponential order and that lim_{t→0⁺} f(t)/t exists. Assuming that the order of integration may be reversed in the computation, prove that

£[f(t)/t] = ∫_s^∞ £[f] ds.
Use the result of Exercise 19 to compute the Laplace transform of each of the following functions.

20. (sin t)/t, with lim_{t→0} (sin t)/t = 1    21. (e^{at} − e^{bt})/t
22. (1 − cos t)/t    23. (1 − e^{−t})/t
°°
24. (a) It can be proved that whenever ∫₀^∞ [f(t)/t] dt exists, the formula in Exercise 19 remains valid with s set equal to zero. Show that we then obtain

∫₀^∞ (f(t)/t) dt = ∫₀^∞ £[f] ds.

(b) Use this formula to prove that

∫₀^∞ (sin t)/t dt = π/2.
The Gamma function Γ(x) and Beta function B(x, y) are introduced in Exercises 25-33.

25. The gamma function is defined by the equation

Γ(x) = ∫₀^∞ e^{−t}t^{x−1} dt.   (5-39)

(It can be proved that this integral converges for all x > 0.) Use integration by parts to show that

Γ(x + 1) = xΓ(x) for all x > 0.

26. Use the result of Exercise 25 together with the value of Γ(1) to prove that

Γ(n + 1) = n!,  n = 0, 1, 2, ….

(Because of this property the gamma function is also called the generalized factorial function.)

27. By differentiation of (5-39), show that Γ''(x) is non-negative on (0, ∞). Where, approximately, does the minimum of Γ(x) lie? Draw an accurate graph of Γ(x).

28. Let n be an arbitrary real number greater than −1. Prove that

£[tⁿ] = Γ(n + 1)/s^{n+1}.

[Hint: Set t = u/s in the integral defining £[tⁿ].]

29. Prove that Γ(1/2) = √π. [Hint: Show that

Γ(1/2) = 2∫₀^∞ e^{−u²} du,

and hence that

[Γ(1/2)]² = 4∫₀^∞∫₀^∞ e^{−(u²+v²)} du dv.

Now evaluate this integral by changing to polar coordinates.]
30. Find the value of Γ(7/2). (See Exercise 29.)

31. Evaluate each of the following integrals.

(a) ∫₀^∞ (e^{−x}/√x) dx    (b) ∫₀^∞ e^{−√x} dx

32. Prove that

Γ(x)Γ(y)/Γ(x + y) = ∫₀¹ u^{x−1}(1 − u)^{y−1} du

whenever x, y > 0. This integral is known as the Beta function of x and y and is denoted B(x, y); that is,

B(x, y) = ∫₀¹ u^{x−1}(1 − u)^{y−1} du.

[Hint: Use the result of Exercise 28 and the convolution formula to evaluate £⁻¹[1/s^{x+y}] in two ways.]

33. Use Exercise 32 to prove that

∫₀¹ dx/√(1 − xⁿ) = (√π/n)·Γ(1/n)/Γ[(n + 2)/(2n)].

*5-8 GREEN'S FUNCTIONS FOR CONSTANT COEFFICIENT LINEAR DIFFERENTIAL OPERATORS

We begin this section by recalling a number of facts concerning linear differential operators which were established in Chapters 3 and 4.
Let L = Dⁿ + a_{n−1}(t)D^{n−1} + ⋯ + a₀(t) be a (normal) linear differential operator whose coefficients are continuous on an interval I, let h be continuous on I, and consider the equation

Ly = h(t).   (5-40)

Then L can be viewed as a linear transformation from 𝒞ⁿ(I) to 𝒞(I), and a solution of (5-40) can be described as a linear transformation G from 𝒞(I) to 𝒞ⁿ(I) which acts as a right inverse for L, i.e., is such that L(G(h)) = h for all h in 𝒞(I). In the absence of any further conditions on the unknown y, G is not uniquely determined by L since L is not one-to-one. Thus to construct G we must impose a sufficient number of additional restrictions on y so as to make the solution of (5-40) unique.
For example, we might require, as in Section 4-4, that y be the solution of (5-40) which satisfies the complete set of initial conditions

y(t₀) = y'(t₀) = ⋯ = y^{(n−1)}(t₀) = 0   (5-41)

at some point t₀ in the interval I. Then the corresponding inverse G can be expressed as an integral operator

G(h) = ∫_{t₀}^t K(t, ξ)h(ξ) dξ,   (5-42)

where the function K(t, ξ) may be constructed by the method of variation of parameters from a basis for the solution space of Ly = 0 [see Formulas (4-30) and (4-42)]. This function is called the Green's function for L for initial-value problems on I, and is completely determined by L since we saw that it is independent of the point t₀ at which the initial conditions were imposed and of the particular basis chosen for the solution space of Ly = 0.
In this section we shall rederive several of the above results in the case of a constant coefficient operator L by means of the convolution formula. One of the principal advantages of this method is that it allows us to bypass the explicit use of a basis for the solution space of Ly = 0, and eliminates the tedious computations involved in the method of variation of parameters. As such it furnishes an excellent example of the efficiency that accrues from using the "coordinate free" ideas of linear algebra.
Thus, for the remainder of this discussion we shall consider the constant coefficient linear differential operator

L = Dⁿ + a_{n−1}D^{n−1} + ⋯ + a₀,   (5-43)

and we begin by solving the initial-value problem

Ly = h(t),
y(0) = y'(0) = ⋯ = y^{(n−1)}(0) = 0,   (5-44)

where h(t) is a function of exponential order. Applying the Laplace transform £ to (5-44), we obtain the equation

p(s)£[y] = h(s),

where p(s) = sⁿ + a_{n−1}s^{n−1} + ⋯ + a₀, and h(s) = £[h]. Thus

£[y] = h(s)/p(s),

and hence if

g(t) = £⁻¹[1/p(s)],

the convolution formula yields

y(t) = ∫₀^t g(t − ξ)h(ξ) dξ   (5-45)

as the desired solution of (5-44).


We can now view (5-45) as determining, in the usual way, a mapping from 𝒞[0, ∞) to 𝒞ⁿ[0, ∞), and we have therefore obtained a right inverse G for the operator (5-43) on this interval.* The defining equation for G is

G(h) = ∫₀^t g(t − ξ)h(ξ) dξ,   (5-46)

and the function g(t − ξ) appearing in this integral is then, by definition, a Green's function for L for initial-value problems at t = 0 (see Definition 4-2).
Fortunately we can say much more about g(t − ξ). For the function

g(t) = £⁻¹[1/p(s)]

* In Exercise 14 it is shown that (5-46) can be derived, using Laplace transforms, without the assumption that h(t) is of exponential order.

is the unique solution on [0, ∞) of the initial-value problem

Ly = 0,
y(0) = y'(0) = ⋯ = y^{(n−2)}(0) = 0,   (5-47)
y^{(n−1)}(0) = 1,

as is easily verified by using Laplace transform methods to solve (5-47). (See Exercise 12.) Moreover, the usual techniques of Sections 5-4 and 5-5 for computing £⁻¹[1/p(s)] lead to a function K(t), which extends g(t) to the unique solution of (5-47) on all of (−∞, ∞). Thus by Theorem 4-3, K(t − ξ) is the Green's function for L (for initial-value problems) on the entire interval (−∞, ∞). When Laplace transforms are used in a purely computational fashion it is common practice to ignore the distinction between K(t) and g(t), as we do, for example, in stating the next theorem which summarizes the above results.

Theorem 5-11. Let L be the constant coefficient operator (5-43), and let

p(s) = sⁿ + a_{n−1}s^{n−1} + ⋯ + a₀

be the auxiliary polynomial of L. Then if

g(t) = £⁻¹[1/p(s)],

the function g(t − ξ) is the Green's function for L on the interval (−∞, ∞).

Example 1. To find a Green's function for the operator

L = D² − 2aD + a² + b²,  b ≠ 0,

we first set

g(t) = £⁻¹[1/(s² − 2as + a² + b²)] = £⁻¹[1/((s − a)² + b²)] = (1/b)e^{at} sin bt.

Then, by the above theorem, the desired Green's function is

g(t − ξ) = (1/b)e^{a(t−ξ)} sin b(t − ξ). ∎
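A Green's function produced by Theorem 5-11 should satisfy Lg = 0 with g(0) = 0 and g'(0) = 1. The sketch below (our illustration; the sample values a = 1, b = 2 and the finite-difference tolerances are arbitrary) checks this for the kernel of Example 1.

```python
import math

a, b = 1.0, 2.0   # sample values; any a, and any b != 0, would do

def g(t):
    # candidate kernel from Example 1: (1/b) e^{at} sin bt
    return math.exp(a * t) * math.sin(b * t) / b

h = 1e-5
g0 = abs(g(0.0))
slope_err = abs((g(h) - g(-h)) / (2 * h) - 1.0)   # g'(0) should be 1
res = 0.0
for t in (0.4, 1.3):
    d1 = (g(t + h) - g(t - h)) / (2 * h)
    d2 = (g(t + h) - 2 * g(t) + g(t - h)) / h ** 2
    # residual of g'' - 2a g' + (a^2 + b^2) g = 0
    res = max(res, abs(d2 - 2 * a * d1 + (a * a + b * b) * g(t)))
print(g0 < 1e-15, slope_err < 1e-8, res < 1e-4)
```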

Example 2. Find the particular solution of the differential equation

(4D⁴ − 4D³ + 5D² − 4D + 1)y = ln t

which satisfies the initial conditions y(1) = y'(1) = y''(1) = y'''(1) = 0.


We first use Theorem 5-11 to compute the Green's function for

L = D⁴ − D³ + (5/4)D² − D + 1/4.

Since

£⁻¹[1/(s⁴ − s³ + (5/4)s² − s + 1/4)] = £⁻¹[4/((2s − 1)²(s² + 1))]
= £⁻¹[(16/5)(1/((2s − 1)²)) − (32/25)(1/(2s − 1)) + (16/25)(s/(s² + 1)) − (12/25)(1/(s² + 1))]
= (4/5)te^{t/2} − (16/25)e^{t/2} + (16/25) cos t − (12/25) sin t,

the desired Green's function is

g(t − ξ) = (4/5)(t − ξ)e^{(t−ξ)/2} − (16/25)e^{(t−ξ)/2} + (16/25) cos(t − ξ) − (12/25) sin(t − ξ).

The solution of the given initial-value problem can now be written in the form

y(t) = ∫₁^t g(t − ξ)(ln ξ)/4 dξ,

since h(t) = (ln t)/4 is the right-hand side of the normalized differential equation

(D⁴ − D³ + (5/4)D² − D + 1/4)y = (1/4) ln t.

As a final example let us use Green's functions and Laplace transform methods to solve an initial-value problem with nonhomogeneous initial conditions, say,

Ly = h(t);
y(t₀) = c₀, …, y^{(n−1)}(t₀) = c_{n−1}.   (5-48)

Following the methods of Chapter 4 we can write the solution of (5-48) in the form

y = y_h + G(h),

with G(h) as above, and y_h the solution of the homogeneous equation Ly = 0 which satisfies the initial conditions of (5-48).
If, for example, (5-48) is

(D² − 2aD + a² + b²)y = h(t),
y(π) = 2, y'(π) = −3,   (5-49)

then, using the Green's function for D² − 2aD + a² + b² obtained in Example 1, we have

G(h) = (1/b)∫_π^t e^{a(t−ξ)} sin b(t − ξ)h(ξ) dξ.
lj^

Thus it remains to find the solution y_h(t) of the initial-value problem

(D² − 2aD + a² + b²)y = 0,
y(π) = 2, y'(π) = −3.

This, of course, may be done with the methods of Chapter 4. But we can also use Laplace transforms if we note that y_h(t) = y(t − π), where y(t) satisfies

(D² − 2aD + a² + b²)y = 0,
y(0) = 2, y'(0) = −3.   (5-50)

Applying £ to (5-50) we obtain

(s² − 2as + a² + b²)£[y] = 2s − 3 − 4a,

or

£[y] = (2s − 3 − 4a)/(s² − 2as + a² + b²)
= 2((s − a)/((s − a)² + b²)) − ((3 + 2a)/b)(b/((s − a)² + b²)).

Thus

y(t) = 2e^{at} cos bt − ((3 + 2a)/b)e^{at} sin bt,

and

y_h(t) = y(t − π) = e^{a(t−π)}[2 cos b(t − π) − ((3 + 2a)/b) sin b(t − π)].

The desired solution of the original problem is now completely determined as

y = e^{a(t−π)}[2 cos b(t − π) − ((3 + 2a)/b) sin b(t − π)] + (1/b)∫_π^t e^{a(t−ξ)} sin b(t − ξ)h(ξ) dξ.

EXERCISES

Determine Green's functions for each of the following linear differential operators in three ways: (a) by applying Formula (4-42) or (4-43); (b) by applying Theorem 4-4; and (c) by applying Theorem 5-11.

1. D² − 4D + 4    2. D² + D    3. D² + 6D + 13
4. D² + 4D − 4    5. D² − D + 1/4    6. 4D³ − D
7. D³ + 1    8. (D² + 1)²    9. D⁴ + 1
10. (D² − 4D + 20)²
2

11. Solve the initial-value problem

(D² − 1)y = {0, 0 < t < 1; t − 1, t > 1},
y(0) = y'(0) = 0,

in three different ways: (a) by using Laplace transforms directly; (b) by first determining a Green's function for D² − 1; and (c) by solving the initial-value problem

(D² − 1)y = t − 1;  y(1) = a, y'(1) = b,

with an appropriate choice of constants a, b.

12. Prove that the function g(t) = £⁻¹[1/p(s)] defined in the text satisfies the initial-value problem

Ly = 0;
y(0) = ⋯ = y^{(n−2)}(0) = 0, y^{(n−1)}(0) = 1,

on the interval [0, ∞).

13. Show that the methods of Sections 5-4 and 5-5, when used to compute £⁻¹[1/p(s)], lead to the unique solution on the entire interval (−∞, ∞) of the initial-value problem of Exercise 12. (This justifies the statement that g(t − ξ) is the Green's function for L on (−∞, ∞), where g(t) = £⁻¹[1/p(s)].)

14. Use Laplace transforms to derive (5-45) for any function h(t) which is piecewise continuous on [0, ∞). [Hint: First consider the solution of (5-44) with h(t) replaced by

H(t) = {h(t), 0 ≤ t ≤ a; 0, t > a},

where a is a constant > 0.]

15. If L is a constant coefficient operator, show that y_p(t) is a solution of the initial-value problem

Ly = h(t);
y(t₀) = c₀, y'(t₀) = c₁, …, y^{(n−1)}(t₀) = c_{n−1},

if and only if y_p(t) = Y(t − t₀), where Y(t) satisfies

Ly = h(t + t₀);
y(0) = c₀, y'(0) = c₁, …, y^{(n−1)}(0) = c_{n−1}.

Solve each of the following initial-value problems using the method given at the end of this section.

16. (D² + 1)y = e^{t−1};  y(1) = y'(1) = 0
17. (2D² + D − 1)y = sin t;  y(π) = y'(π) = 0
18. (4D² + 16D + 17)y = t² − 1;  y(a) = y'(a) = 0
19. (D² − 3D − 4)y = e^{−t};  y(2) = 3, y'(2) = 0
20. (2D² − 3D + 1)y = t;  y(1) = 0, y'(1) = −1

21. (4D² − 4D + 37)y = e^{t/2} cos 3t;  y(a) = y'(a) = −2, a > 0
22. (D³ + 1)y = te^t;  y(1) = y'(1) = 0, y''(1) = 1
23. D²(D² + 1)²y = h(t);  y(a) = y'(a) = y''(a) = y'''(a) = y⁽⁴⁾(a) = 0, y⁽⁵⁾(a) = 1

5-9 THE VIBRATING SPRING; IMPULSE FUNCTIONS

Consider an elastic spring which is fixed at one end and is free to vibrate in a vertical direction as shown in Fig. 5-13. Suppose a weight of mass m is attached to the spring and the entire system comes into equilibrium with the weight located y₀ units below the natural length of the spring. Then, by Hooke's law, the weight experiences an upward force of magnitude ky₀.* Since the system is in equilibrium this force is exactly counteracted by the force of gravity acting on the weight, and so we have

ky₀ = mg.   (5-51)

FIGURE 5-13

Suppose that the spring-weight system is in equilibrium, and that the weight is
now subjected to an additional vertical force h(t) which may vary with time. Then
at time t, with the weight a distance X0 fr° m the equilibrium position, and the
positive ^-direction measured downward, the forces on the weight are mg due to
gravity, —k(y — ) due to the restoring force in the
y spring, and h(t). Hence,

by Newton's second law, we have

m^= mg- k(y- y ) + Kt),

or, using (5-51),

m M+ ky== m (5-52)

Furthermore, since the system was initially at rest in equilibrium, y(t) must satisfy

y (0) = 0, y'(0) = 0, (5-53)

and the motion of the weight is thus obtained as the solution of an initial-value
problem.
To solve this problem we use Laplace transforms and obtain

ms²£[y] + k£[y] = £[h(t)].

* The positive constant k is known as the spring constant. It depends upon the material from which the spring is made, the dimensions of the spring, etc.

Hence

£[y] = £[h(t)]/(ms² + k),      (5-54)

and

y(t) = £⁻¹[1/(ms² + k)] * h(t)
     = £⁻¹[(1/√(mk)) √(k/m)/(s² + (k/m))] * h(t)
     = (1/√(mk)) sin (√(k/m) t) * h(t).†

Thus the equation of motion for this system under the action of an arbitrary external force h can be expressed in integral form as

y(t) = (1/√(mk)) ∫₀ᵗ sin [√(k/m)(t − ξ)] h(ξ) dξ.      (5-55)
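Formula (5-55) lends itself to a direct numerical check (a modern aside, not part of the original text): the convolution integral should reproduce the solution of m y″ + ky = h(t), y(0) = y′(0) = 0, obtained by any standard integrator. The values m = 2, k = 8 and the forcing h(t) = cos t below are arbitrary illustrative choices.

```python
import math

# Check of (5-55): y(t) = (1/sqrt(mk)) * int_0^t sin(sqrt(k/m)(t - xi)) h(xi) dxi
# should equal the solution of m y'' + k y = h(t) from rest.
m, k = 2.0, 8.0
h = lambda t: math.cos(t)

def y_convolution(t, n=4000):
    """Evaluate the convolution integral (5-55) by the trapezoidal rule."""
    w = math.sqrt(k / m)
    dxi = t / n
    total = 0.0
    for i in range(n + 1):
        xi = i * dxi
        wgt = 0.5 if i in (0, n) else 1.0
        total += wgt * math.sin(w * (t - xi)) * h(xi)
    return total * dxi / math.sqrt(m * k)

def y_rk4(t_end, n=4000):
    """Integrate m y'' + k y = h(t) from rest with classical Runge-Kutta."""
    f = lambda t, y, v: (v, (h(t) - k*y) / m)
    y, v, t = 0.0, 0.0, 0.0
    dt = t_end / n
    for _ in range(n):
        k1y, k1v = f(t, y, v)
        k2y, k2v = f(t + dt/2, y + dt/2*k1y, v + dt/2*k1v)
        k3y, k3v = f(t + dt/2, y + dt/2*k2y, v + dt/2*k2v)
        k4y, k4v = f(t + dt, y + dt*k3y, v + dt*k3v)
        y += dt/6*(k1y + 2*k2y + 2*k3y + k4y)
        v += dt/6*(k1v + 2*k2v + 2*k3v + k4v)
        t += dt
    return y

print(y_convolution(3.0), y_rk4(3.0))
```

The two printed values agree to several decimal places, as the derivation predicts.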
In applications the impressed force h(t) is often of the form

h(t) = A sin ωt,      (5-56)

where A and ω are positive constants, in which case the equation of motion becomes

y(t) = (A/√(mk)) ∫₀ᵗ sin [√(k/m)(t − ξ)] sin ωξ dξ.

Although this integral can be evaluated by elementary techniques, it is instructive to begin again with (5-54) and solve the problem directly. Thus

£[y] = (1/(ms² + k)) £[A sin ωt]
     = (1/(ms² + k)) · Aω/(s² + ω²).

We now consider two cases according as ω is or is not equal to √(k/m).


Case 1. ω ≠ √(k/m). Then Aω/[(ms² + k)(s² + ω²)] can be rewritten by the method of partial fractions as

(Aω/(k − mω²)) [1/(s² + ω²) − m/(ms² + k)].

† Those of our readers who are familiar with the material of the preceding section will recognize that we have now computed the Green's function g(t − ξ) for initial-value problems involving the normalized operator L = D² + (k/m), and that g(t − ξ) = √(m/k) sin √(k/m)(t − ξ).

y = 2 sin t − sin 2t

FIGURE 5-14

and, taking inverse transforms, we obtain

y(t) = (Aω/(k − mω²)) [(1/ω) sin ωt − √(m/k) sin √(k/m) t].      (5-57)

This function may be interpreted as the superposition of oscillations of two
different frequencies, √(k/m)/2π and ω/2π. The first of these is the so-called natural
frequency of the system, while the second is the frequency of the impressed force
(5-56). In Fig. 5-14 we have sketched the graph of y(t) when m = k = 1, A = 3,
and ω = 2.
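For the particular values of Fig. 5-14 (m = k = 1, A = 3, ω = 2), formula (5-57) reduces to y(t) = 2 sin t − sin 2t; the sketch below (an editorial addition) substitutes this function back into the initial-value problem, approximating y″ by centered differences.

```python
import math

# For m = k = 1, A = 3, omega = 2, (5-57) gives y(t) = 2 sin t - sin 2t.
# Verify y'' + y = 3 sin 2t with y(0) = y'(0) = 0.
def y(t):
    return 2*math.sin(t) - math.sin(2*t)

def ypp(t, eps=1e-4):
    return (y(t + eps) - 2*y(t) + y(t - eps)) / eps**2

assert abs(y(0.0)) < 1e-12                                # y(0) = 0
assert abs((y(1e-6) - y(-1e-6)) / 2e-6) < 1e-5            # y'(0) = 0
for t in (0.5, 1.7, 3.1, 6.0):
    assert abs(ypp(t) + y(t) - 3*math.sin(2*t)) < 1e-5    # the equation itself
print("(5-57) verified at the sample points")
```

The residual is small rather than exactly zero only because y″ is approximated numerically.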

Case 2. ω = √(k/m). Then

£[y] = A√(km)/(ms² + k)² = (A√(km)/s) · s/(ms² + k)².

But since

£⁻¹[s/(ms² + k)²] = £⁻¹[−(1/2m) (d/ds)(1/(ms² + k))]
                  = (t/2m) £⁻¹[1/(ms² + k)]
                  = (t/(2m√(km))) sin √(k/m) t,

y = ½ sin t − ½ t cos t

FIGURE 5-15

the convolution formula can be used to obtain

y = (A/2m) ∫₀ᵗ ξ sin √(k/m) ξ dξ
  = (A/2m) [(m/k) sin √(k/m) t − √(m/k) t cos √(k/m) t],

and we have

y(t) = (A/2k) sin √(k/m) t − (A/(2√(km))) t cos √(k/m) t.      (5-58)

Therefore, when the impressed frequency is equal to the natural frequency, the
amplitude of the oscillations increases with time and the spring is eventually
stretched beyond its elastic limit (see Fig. 5-15, sketched for A = k = m = 1).
This phenomenon is known as resonance, and is important in various physical
problems.†
A rather different situation arises if we attempt to find the response of this sys-
tem when the weight is struck a sharp blow in the vertical direction at time t = a,

† Also see the discussion of resonance in connection with electrical networks in Section 4-10.

a > 0. To obtain the equation of motion in this case we introduce the function h
defined by

h(t) = 0,    0 ≤ t < a,
       1/τ,  a ≤ t ≤ a + τ,      (5-59)
       0,    a + τ < t,

where τ is an arbitrary positive constant (see Fig. 5-16). Physically, h represents
a force of magnitude 1/τ acting on the system for a time τ, and hence h imparts a
total impulse of 1 to the system.* We now agree that the mathematical description
of the physical situation described by the words "sharp blow" is obtained by using
a force which acts throughout an arbitrarily short interval of time but imparts a
predetermined impulse, or change of momentum, to the system. Our problem
then becomes that of determining the behavior of the solution y(t) in (5-55)
when h is as above, and τ → 0.

FIGURE 5-16

Substitution of (5-59) in (5-55) gives

y(t) = 0,                                              0 ≤ t < a,
       (1/(τ√(mk))) ∫ₐᵗ sin [√(k/m)(t − ξ)] dξ,        a ≤ t ≤ a + τ,      (5-60)
       (1/(τ√(mk))) ∫ₐ^(a+τ) sin [√(k/m)(t − ξ)] dξ,   a + τ < t.

Hence, passing to the limit as τ → 0, we obtain the solution

y₀(t) = 0,                              t < a,
        (1/√(mk)) sin [√(k/m)(t − a)],  t > a,      (5-61)

(see Exercise 2). But y₀(t) is also the solution of the initial-value problem

m d²y/dt² + ky = 0;

y(a) = 0,    y′(a) = 1/m,

* A constant force of magnitude F acting on an object of mass m for t seconds is said
to impart an impulse I = Ft to the object. Since F = (d/dt)(mv), where v is the magnitude
of the velocity of the object (Newton's second law), it follows that when F is constant
the total change in the momentum mv of the object is equal to the impulse. Throughout
this discussion we shall assume for the sake of convenience that I = 1.

and as such can be interpreted as the response of a weighted spring which is given
unit momentum at t = a [i.e., my′(a) = 1], and is left undisturbed thereafter. It
follows (see the preceding footnote) that in this situation y₀(t) may be interpreted
as arising from an instantaneous unit impulse imparted to the system at t = a,
and the problem has been solved.
In certain circumstances it is convenient to think of such a unit impulse at t = 0
as arising from a fictitious function δ(t), called the Dirac delta function, whose
defining properties are

δ(t) = 0 for all t ≠ 0,    ∫₋∞^∞ δ(t) dt = 1.      (5-62)

In these terms a unit impulse at t = a would be provided by the "function"
δ(t − a). The initial-value problem leading to (5-61) then becomes

m d²y/dt² + ky = δ(t − a);

y(0) = y′(0) = 0,

and, as we have seen, has y₀(t) as its solution.

The foregoing discussion can easily be generalized to initial-value problems
of the form

Ly = δ(t − a);      (5-63)
y(0) = y′(0) = · · · = y⁽ⁿ⁻¹⁾(0) = 0,

where L = Dⁿ + aₙ₋₁Dⁿ⁻¹ + · · · + a₀. To solve this problem we again use
Laplace transforms, first replacing δ(t − a) by the function h defined in (5-59),
and then passing to the limit as τ → 0. Thus

£[y] = £[h(t)]/p(s),

with p(s) = sⁿ + aₙ₋₁sⁿ⁻¹ + · · · + a₀, and hence

y(t) = ∫₀ᵗ g(t − ξ)h(ξ) dξ,

where g(t) = £⁻¹[1/p(s)]. Using the given value of h(t) we thus obtain

y(t) = 0,                        t < a,
       (1/τ) ∫ₐᵗ g(t − ξ) dξ,        a ≤ t ≤ a + τ,      (5-64)
       (1/τ) ∫ₐ^(a+τ) g(t − ξ) dξ,   a + τ < t,

and it follows that the solution of (5-63) is

y₀(t) = 0,         t < a,      (5-65)
        g(t − a),  t > a.

This result is valuable because it can be used to define the Laplace transform of
δ(t − a). Indeed, by Formula (5-26) we have

£[y₀] = e^(−as) £[g] = e^(−as)/p(s).

On the other hand, if we apply £ to (5-63), under the unjustified assumption
that £[δ(t − a)] exists, we find that

£[y₀] = (1/p(s)) £[δ(t − a)].

Thus if £[δ(t − a)] exists at all it must have the value e^(−as), and hence we are
forced to the definition

£[δ(t − a)] = e^(−as).      (5-66)

(For a different approach to this formula see Exercise 1 below.) In particular,
for a = 0 we have the curious formula

£[δ(t)] = 1.      (5-67)

This apparent contradiction of Theorem 5-3, which asserts that £[f] → 0 as
s → ∞, is merely a reflection of the fact that δ(t) does not belong to the space of
piecewise continuous functions of exponential order. But this is hardly surprising
since, as was pointed out above, δ(t) is not even a function in the usual sense of
the term.

Example. Consider a spring in equilibrium with a body of mass m attached.
Suppose that at time t = 0 the system is subjected to a sinusoidal force h(t) =
A sin ωt, and that at time t = 10 it is struck a sharp blow from below which
instantaneously imparts 2 units of momentum to the mass. Problem: Find the
motion of the system from t = 0 onward.

In this case we have to solve the initial-value problem

m d²y/dt² + ky = A sin ωt − 2δ(t − 10),

y(0) = y′(0) = 0.

Then

(ms² + k)£[y] = Aω/(s² + ω²) − 2e^(−10s),

and

£[y] = Aω/[(ms² + k)(s² + ω²)] − (2/(ms² + k)) e^(−10s).

Referring to our earlier calculations, the inverse transform d(t) of
Aω/[(ms² + k)(s² + ω²)] is given by either (5-57) or (5-58), depending on
whether ω ≠ √(k/m) or not. As for the second term, its inverse transform is

e(t) = 0,                                 t < 10,
       −(2/√(km)) sin √(k/m)(t − 10),     t > 10,

that is, e(t) = −(2/√(km)) u₁₀(t) sin √(k/m)(t − 10).

Hence y(t) = d(t) + e(t), and the description of the motion is complete.
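The decomposition y = d + e can be confirmed by brute force (an editorial aside): integrate the equation of motion numerically and, at t = 10, change the velocity by −2/m, which is exactly what a blow from below imparting 2 units of momentum does (the positive direction is downward). The parameters m = k = 1, A = 3, ω = 2 are those of Fig. 5-14.

```python
import math

# Simulate m y'' + k y = A sin(wt), y(0) = y'(0) = 0, with a velocity jump
# of -2/m at t = 10, and compare with the closed form d(t) + e(t).
m, k, A, w = 1.0, 1.0, 3.0, 2.0

def d(t):   # (5-57) specialised to these values: 2 sin t - sin 2t
    return 2*math.sin(t) - math.sin(2*t)

def e(t):   # impulse term: -(2/sqrt(km)) u10(t) sin(sqrt(k/m)(t - 10))
    return 0.0 if t < 10 else -2/math.sqrt(k*m)*math.sin(math.sqrt(k/m)*(t - 10))

def rhs(t, y, v):
    return v, (A*math.sin(w*t) - k*y) / m

def simulate(t_end, dt=1e-3):
    y, v, t, kicked = 0.0, 0.0, 0.0, False
    for _ in range(round(t_end / dt)):
        if not kicked and t >= 10 - dt/2:
            v -= 2/m           # the blow: momentum changes by -2
            kicked = True
        k1y, k1v = rhs(t, y, v)
        k2y, k2v = rhs(t + dt/2, y + dt/2*k1y, v + dt/2*k1v)
        k3y, k3v = rhs(t + dt/2, y + dt/2*k2y, v + dt/2*k2v)
        k4y, k4v = rhs(t + dt, y + dt*k3y, v + dt*k3v)
        y += dt/6*(k1y + 2*k2y + 2*k3y + k4y)
        v += dt/6*(k1v + 2*k2v + 2*k3v + k4v)
        t += dt
    return y

print(simulate(15.0), d(15.0) + e(15.0))
```

The two printed values agree to several decimal places.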

EXERCISES

1. Show that the definition

£[δ(t − a)] = lim (τ→0) £[h(t)],

where h(t) is defined by (5-59), leads to the same result as Formula (5-66).
2. Let y(t) be defined as in (5-64). Prove that

lim (τ→0) y(t) = 0,         t ≤ a,
                 g(t − a),  t > a.

Use this result to verify that (5-61) is correct.

3. Solve the initial-value problem

y‴ + y = e^t + δ(t − 1);

y(0) = 1,  y′(0) = 1,  y″(0) = 2.

4. Find the equation of motion of a weight having mass m which is initially in equilibrium at the end of a spring, and which, at time t = 0, is struck a sharp blow from above which instantaneously imparts 1 unit of momentum to the system.
5. A weight of mass 1 is attached to a spring whose spring constant is 4, and at time t = 0 the weight is struck a blow from above which instantaneously imparts 1 unit of momentum to the system. At time t = π/2 a sinusoidal force of magnitude −sin (t − (π/2)) begins to act vertically on the system. Find the equation of motion of the mass.
6. A mass m, hanging in equilibrium at the end of a spring, is struck from below and instantaneously given two units of momentum. At time t = a the mass is subjected to the external force sin (t − a). Assuming that the spring constant is different from m, find the equation of motion of the mass m.

7. A unit mass is attached to a rigid spring whose spring constant is 3, and is then mounted in an elevator as shown in Fig. 5-17. At time t = 0 the elevator begins to descend with a constant velocity of 2 ft/sec, and at that moment the mass is struck a blow from above which instantaneously gives it one unit of momentum. Find the equation of motion of the mass as a function of time.

FIGURE 5-17 (v = 2 ft/sec)    FIGURE 5-18

8. A mass m is suspended on a spring beneath a car which moves with constant velocity v along the track shown in Fig. 5-18. Suppose that at time t = 0 the mass is struck from below and instantaneously given one additional unit of momentum. Find the equation for the motion of m in the vertical direction as a function of time.
9. Find the equation of motion of a mass m bobbing at the end of a spring with spring
constant k if the mass was originally displaced a units from equilibrium and given

initial velocity b. (This is an example of what is known as simple harmonic motion.)

10. It is observed that the spring of Exercise 9 oscillates with a period of 2 seconds when
m is 1 gram.

(a) Determine the spring constant k.

(b) Now suppose the mass on this spring is 9 grams and that the oscillations start at a = 4 with initial velocity b = π. Find the period and amplitude of the oscillations.

11. Suppose that a spring is suspended in a resisting medium which opposes any motion

through it with a force proportional to the velocity of the moving object.


(a) Show that the equation governing the motion of a mass m bobbing at the end of such a spring is

my″ + λy′ + ky = 0,

where λ is a positive constant and k is the spring constant.


(b) Let y(0) = a, y'(0) = b for the mass in (a). Show that

£[y] = (mas + mb + λa)/(ms² + λs + k).

12. Assume that the resisting force in Exercise 11 is sufficiently large to ensure that λ² > 4km.

(a) Show that the quadratic equation ms² + λs + k = 0 then has distinct negative
roots −σ₁, −σ₂ with σ₁ < σ₂, and that

y = e^(−σ₁t) [α + βe^(−(σ₂−σ₁)t)],

where

α = a/2 + (2bm + aλ)/(2√(λ² − 4km)),    β = a/2 − (2bm + aλ)/(2√(λ² − 4km)),

σ₁ = (λ − √(λ² − 4km))/(2m),    σ₂ = (λ + √(λ² − 4km))/(2m).

(b) Show that except in the trivial case where y ≡ 0, the mass in this problem is in
its equilibrium position y = 0 for at most one value of t. (The effect of the resistance
in this problem is often described by the word "damping," and the resulting motion
is known as damped harmonic motion.)
13. Suppose that the spring in Exercises 11 and 12 has spring constant 9, and that a unit mass is attached to the spring, displaced one unit in the positive y-direction, and given an initial velocity b = −10. Suppose in addition that the medium offers a resistance corresponding to λ = 10.
(a) Find the equation of motion of the mass.
(b) Show that the mass is in its equilibrium position for exactly one instant. When
is this?

(c) What is the maximum height which the mass attains? (Recall that the positive y-direction is downward.)
(d) Sketch the graph of the solution curve for this problem.
14. Show that a necessary and sufficient condition that the mass of Exercises 11 and 12 be in its equilibrium position at some one instant after the motion begins is that 0 < −α/β < 1, and hence deduce that if a ≠ 0 and b = 0 the mass approaches its equilibrium position without ever passing through it.

15. Show that the quadratic equation ms² + λs + k = 0 for the vibrating spring problem described in Exercise 11 has a double root s = −σ = −λ/(2m) < 0 when λ² = 4km. Find the equation of motion of the mass in this case. (This is an example of what is known as critical damping.)
16. Let y = y(t) be the solution obtained in Exercise 15.

(a) Show that y → 0 as t → ∞, and that the mass passes through its equilibrium position at most once during the motion.
(b) Show that if b/a < −σ, then y = 0 when t = −a/(σa + b) > 0, but that otherwise the mass approaches its equilibrium position without ever passing through it. (Assume y ≢ 0.)
17. Find the equation of motion for the system in Exercise 11 in the case where λ² < 4km. What is the behavior of y as t → ∞?

A Short Table of Laplace Transforms†

Function                                                      Transform

f(t)                                                          £[f] = ∫₀^∞ e^(−st) f(t) dt
αf(t) + βg(t)                                                 α£[f] + β£[g]
f′(t)                                                         s£[f] − f(0⁺)
f″(t)                                                         s²£[f] − sf(0⁺) − f′(0⁺)
f⁽ⁿ⁾(t)                                                       sⁿ£[f] − sⁿ⁻¹f(0⁺) − sⁿ⁻²f′(0⁺) − · · · − f⁽ⁿ⁻¹⁾(0⁺)
* ∫₀ᵗ f(t) dt                                                 £[f]/s
∫ₐᵗ f(t) dt                                                   £[f]/s − (1/s) ∫₀ᵃ f(t) dt
* ∫₀ᵗ · · · ∫₀ᵗ f(t) dt · · · dt   (n times)                  £[f]/sⁿ
∫ₐᵗ · · · ∫ₐᵗ f(t) dt · · · dt   (n times)                    £[f]/sⁿ − (1/sⁿ) ∫₀ᵃ f(t) dt − · · · − (1/s) ∫₀ᵃ ∫ₐᵗ · · · ∫ₐᵗ f(t) dt · · · dt   (n − 1 times)
* e^(at) f(t)                                                 f̂(s − a), where f̂(s) = £[f]
* tⁿ f(t)                                                     (−1)ⁿ (dⁿ/dsⁿ) £[f]
u_a(t)g(t) = 0, t < a;  g(t), t > a                           e^(−as) £[g(t + a)]
* u_a(t)g(t − a) = 0, t < a;  g(t − a), t > a                 e^(−as) £[g]
* ∫₀ᵗ f(t − ξ)g(ξ) dξ                                         £[f]£[g]
f(t) periodic with period p (p > 0)                           (∫₀^p e^(−st) f(t) dt)/(1 − e^(−ps))
f(t)/t   (if lim (t→0⁺) f(t)/t exists)                        ∫ₛ^∞ f̂(s) ds

1                                                             1/s
e^(at)                                                        1/(s − a)
tⁿ                                                            n!/sⁿ⁺¹
sin at                                                        a/(s² + a²)
cos at                                                        s/(s² + a²)
sinh at                                                       a/(s² − a²)
cosh at                                                       s/(s² − a²)
δ(t)                                                          1
δ(t − a)                                                      e^(−as)

tⁿ⁻¹e^(at)/(n − 1)!                                           1/(s − a)ⁿ   (n ≥ 1)
(1/2a³)(sin at − at cos at)                                   1/(s² + a²)²
(t/2a) sin at                                                 s/(s² + a²)²
(1/2n) ∫₀ᵗ t £⁻¹[1/(s² + a²)ⁿ] dt                             1/(s² + a²)ⁿ⁺¹
(t/2n) £⁻¹[1/(s² + a²)ⁿ]                                      s/(s² + a²)ⁿ⁺¹
(1/(2ⁿ a n!)) ∫₀ᵗ t ∫₀ᵗ t · · · ∫₀ᵗ t sin at dt · · · dt   (n times)        1/(s² + a²)ⁿ⁺¹
(1/(2ⁿ a n!)) t ∫₀ᵗ t · · · ∫₀ᵗ t sin at dt · · · dt   (n − 1 times)        s/(s² + a²)ⁿ⁺¹

† The table consists of three parts. The first contains formulas of a general nature,
and the second the transforms of a small number of selected functions. When used in
conjunction, these two parts of the table will yield the Laplace transforms of most functions
which occur in practice. The third part of the table is primarily intended for computing
inverse transforms, and hence is designed to be read from right to left. The methods
of partial fractions, completing the square, etc., are, of course, indispensable in such
computations. Finally, those formulas in the first part of the table which are particularly
well suited to evaluating inverse transforms have been marked with an asterisk.
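Entries of the table are easy to spot-check by direct numerical integration of the defining integral (a modern aside, not part of the original text); here £[sin at] = a/(s² + a²) and £[t e^(at)] = 1/(s − a)², the n = 2 case of the tⁿ⁻¹e^(at)/(n − 1)! entry, are verified for sample values s = 3, a = 2.

```python
import math

# Evaluate the truncated integral  ∫_0^T e^(-st) f(t) dt  by the
# trapezoidal rule; for s - a > 0 the tail beyond T = 40 is negligible.
def laplace(f, s, T=40.0, n=40000):
    h = T / n
    total = 0.5 * (f(0.0) + math.exp(-s*T) * f(T))
    for i in range(1, n):
        t = i * h
        total += math.exp(-s*t) * f(t)
    return total * h

s, a = 3.0, 2.0
lhs1 = laplace(lambda t: math.sin(a*t), s)
lhs2 = laplace(lambda t: t * math.exp(a*t), s)
print(lhs1, a/(s*s + a*a))      # table entry for sin at
print(lhs2, 1/(s - a)**2)       # table entry for t e^(at)
```

Both pairs of numbers agree to about four decimal places, limited only by the quadrature step.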
6
further topics in the theory of

linear differential equations*

6-1 THE SEPARATION AND COMPARISON THEOREMS


In this chapter we resume the general discussion of linear differential equations
begun in Chapter 3, and establish a number of extremely valuable results concerning
equations with variable coefficients. The first two are almost immediate consequences
of Abel's formula (Theorem 3-7), and describe the behavior of the
zeros of solutions of second-order homogeneous linear differential equations.

Theorem 6-1. (Sturm separation theorem.) If y₁ and y₂ are linearly independent
solutions of the second-order homogeneous equation

a₂(x)y″ + a₁(x)y′ + a₀(x)y = 0      (6-1)

on an interval I in which a₂ does not vanish, then the zeros of y₁ and y₂
alternate on I.†

Proof. Let a and b (a < b) be two points of I such that y₂(a) = y₂(b) = 0, and
suppose that y₂ is nonzero everywhere between a and b. We must show that there
exists exactly one point c between a and b such that y₁(c) = 0. (See Fig. 6-1.)

FIGURE 6-1

* None of the material in this chapter will be used in any essential way until Chapter 15,
where equations with regular singular points are studied.
† A zero of a function y is a point x₀ at which y(x₀) = 0. It may happen, of course,
that y₁ and y₂ have no zeros on I, in which case the conclusion of the theorem is vacuously
satisfied.

232 THE THEORY OF LINEAR DIFFERENTIAL EQUATIONS | CHAP. 6

Since y₁ and y₂ are linearly independent solutions of (6-1), their Wronskian
never vanishes on I (Theorem 3-6). Hence the constant in Abel's formula is
nonzero, and

W[y₁(x), y₂(x)] = y₁(x)y₂′(x) − y₂(x)y₁′(x)

has the same algebraic sign everywhere in I. Moreover, the values of the
Wronskian at x = a and x = b are, respectively,

y₁(a)y₂′(a) and y₁(b)y₂′(b),

and hence y₁(a), y₁(b), y₂′(a), and y₂′(b) are all different from zero. But since a
and b are successive zeros of y₂, the derivative of y₂ must have opposite signs at a
and b (i.e., the graph of y₂ must be rising at a and falling at b, or vice versa).
Hence y₁(a) and y₁(b) have opposite signs, and it follows that y₁(c) = 0 for at
least one point c between a and b.*

Finally, by reversing the roles of y₁ and y₂ in the above argument, we conclude
that y₂ has at least one zero between every pair of successive zeros of y₁ on I,
and we are done. |

Example 1. On the strength of this theorem we can deduce the well-known
fact that the zeros of sin x and cos x alternate on (−∞, ∞), since these functions
are linearly independent solutions of the equation

y″ + y = 0.      (6-2)

A somewhat less obvious consequence is that any two functions of the form

a₁ sin x + a₂ cos x,    b₁ sin x + b₂ cos x

have alternating zeros whenever a₁b₂ ≠ a₂b₁, for all such pairs of functions are
also linearly independent solutions of (6-2). (See Exercise 1.)
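The alternation just described is easy to observe numerically (an editorial aside, not in the original text). The particular coefficients below are arbitrary choices with a₁b₂ − a₂b₁ = 1·(−1) − 2·3 = −7 ≠ 0.

```python
import math

# Locate the zeros of f1 = sin x + 2 cos x and f2 = 3 sin x - cos x on
# (0, 12) by sign-change scanning plus bisection, and check that the two
# zero sets strictly interleave, as the separation theorem predicts.
def zeros(f, lo=0.0, hi=12.0, n=2400):
    zs, h = [], (hi - lo) / n
    for i in range(n):
        x0, x1 = lo + i*h, lo + (i + 1)*h
        if f(x0) == 0.0:
            zs.append(x0)
        elif f(x0) * f(x1) < 0:
            a, b = x0, x1
            for _ in range(60):
                mid = (a + b) / 2
                if f(a) * f(mid) <= 0:
                    b = mid
                else:
                    a = mid
            zs.append((a + b) / 2)
    return zs

f1 = lambda x: math.sin(x) + 2*math.cos(x)
f2 = lambda x: 3*math.sin(x) - math.cos(x)
merged = sorted([(z, 1) for z in zeros(f1)] + [(z, 2) for z in zeros(f2)])
labels = [tag for _, tag in merged]
assert all(labels[i] != labels[i+1] for i in range(len(labels) - 1))
print(labels)
```

No two consecutive zeros belong to the same solution, so the zeros alternate on the interval examined.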

Example 2. The functions sinh x and cosh x are linearly independent solutions
of y″ − y = 0, and hence their zeros alternate on (−∞, ∞). This, of course, is
obvious since sinh x has but a single zero, namely x = 0, while cosh x has no
zeros at all, and is cited merely to emphasize that the separation theorem says
nothing at all about the number of zeros of a solution of (6-1).

Questions of the latter sort can sometimes be answered by using the following
theorem, also due to Sturm.

Theorem 6-2. (Sturm comparison theorem.) Let y₁ and y₂ be, respectively,
nontrivial solutions of the differential equations

y″ + p₁(x)y = 0 and y″ + p₂(x)y = 0

on an interval I, and suppose that p₁(x) ≥ p₂(x) everywhere in I. Then
between any two zeros of y₂ there is at least one zero of y₁, unless p₁(x) ≡
p₂(x) and y₁ and y₂ are linearly dependent in C(I).

Proof. Let a and b be adjacent zeros of y₂, with a < b, and suppose that y₁ does
not vanish in the interval (a, b).* Since the zeros of a function y are the same
as those of −y, we may assume that y₁ and y₂ are both positive throughout
(a, b). Then, arguing as in the preceding proof, we have

W(a) = y₁(a)y₂′(a) ≥ 0,
W(b) = y₁(b)y₂′(b) ≤ 0,      (6-3)

where W(a) and W(b) denote the values of the Wronskian of y₁ and y₂ at a and b,
respectively. But

(d/dx) W[y₁(x), y₂(x)] = (d/dx)[y₁(x)y₂′(x) − y₂(x)y₁′(x)]
                       = y₁(x)y₂″(x) − y₂(x)y₁″(x)
                       = −y₁(x)p₂(x)y₂(x) + y₂(x)p₁(x)y₁(x)
                       = y₁(x)y₂(x)[p₁(x) − p₂(x)] ≥ 0,

and it follows that the Wronskian of y₁ and y₂ is a nondecreasing function on I.
This, however, contradicts (6-3) unless p₁(x) ≡ p₂(x), in which case
W[y₁(x), y₂(x)] = 0 for all x in I. Thus y₁ must have a zero between a and b whenever
p₁(x) ≢ p₂(x). Moreover, the separation theorem implies that this result continues
to hold when p₁(x) ≡ p₂(x) unless y₁ and y₂ are linearly dependent in
C(I), and the proof is complete. |

Example 3. Every nontrivial solution of the equation

y″ + p(x)y = 0      (6-4)

has at most one zero on any interval in which p(x) ≤ 0. For if we apply the
comparison theorem to this equation and

y″ = 0,      (6-5)

we conclude that on such an interval every solution of (6-5) must vanish at least
once between successive zeros of a solution of (6-4). The assertion now follows
from the fact that y″ = 0 has solutions (namely y = c, c ≠ 0) which do not vanish on
any interval.

* At this point we are making use of the fact that a nontrivial solution of a second-order
linear differential equation cannot have infinitely many zeros in any finite closed
interval of the x-axis. (See Exercise 6.)
234 THE THEORY OF LINEAR DIFFERENTIAL EQUATIONS |
CHAP. 6

If we agree to say that a nontrivial solution of a linear differential equation
oscillates on an interval I if and only if it has at least two zeros in I, we can restate
the conclusion of this example by saying that no solution of (6-4) oscillates on
any interval in which p(x) ≤ 0. In particular, this result holds for the nontrivial
solutions of

y″ − ky = 0

on (−∞, ∞) whenever k is a positive constant (cf. Example 2).
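The dichotomy between oscillation where p(x) > 0 and non-oscillation where p(x) ≤ 0 can be seen at a glance by integrating a single equation whose coefficient changes sign; the Airy-type equation y″ + xy = 0 below is an illustrative choice (a modern aside, not in the original text).

```python
# Integrate y'' + x y = 0 from y(0) = 1, y'(0) = 0 in both directions with
# classical Runge-Kutta, counting sign changes of y.  For x > 0 the
# coefficient is positive and the solution oscillates; for x < 0 it is
# negative and the solution can vanish at most once (here, never).
def count_zeros(direction, x_end, dx=1e-3):
    f = lambda x, y, v: (v, -x*y)
    x, y, v, zeros = 0.0, 1.0, 0.0, 0
    h = dx * direction
    for _ in range(round(x_end / dx)):
        y_old = y
        k1y, k1v = f(x, y, v)
        k2y, k2v = f(x + h/2, y + h/2*k1y, v + h/2*k1v)
        k3y, k3v = f(x + h/2, y + h/2*k2y, v + h/2*k2v)
        k4y, k4v = f(x + h, y + h*k3y, v + h*k3v)
        y += h/6*(k1y + 2*k2y + 2*k3y + k4y)
        v += h/6*(k1v + 2*k2v + 2*k3v + k4v)
        x += h
        if y_old * y < 0:
            zeros += 1
    return zeros

right = count_zeros(+1, 30.0)   # coefficient x > 0: many zeros
left  = count_zeros(-1, 30.0)   # coefficient x < 0: none for this solution
print(right, left)
```

The run confirms dozens of zeros on (0, 30) and none on (−30, 0) for this particular solution.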

EXERCISES

1. Show that the functions a₁ sin x + a₂ cos x and b₁ sin x + b₂ cos x are linearly independent solutions of y″ + y = 0 whenever a₁b₂ ≠ a₂b₁, and thus prove that the zeros of these functions alternate on (−∞, ∞).

2. Prove that sin k₁x has at least one zero between any two zeros of sin k₂x whenever k₁ > k₂ > 0.

3. (a) Show that every nontrivial solution of the equation y″ + (sinh x)y = 0 has at most one zero in (−∞, 0), and infinitely many zeros in (0, ∞).
(b) Prove that the distance between successive zeros of any nontrivial solution of this equation tends to zero as x → ∞.

4. Every solution of the equation y″ + xy = 0 has infinitely many zeros in (0, ∞). True or false? Why?


5. (a) Let f be continuous on (0, ∞), and suppose that f(x) > ε > 0 for all x > 0. Prove that every solution of y″ + f(x)y = 0 has infinitely many zeros in (0, ∞).
(b) Does the conclusion in (a) remain true if f(x) > 0 in (0, ∞)? Why?
*6. Prove that no nontrivial solution y(x) of a normal second-order linear differential equation can have infinitely many zeros in a closed interval I = [a, b]. [Hint: Assume the contrary, and then show by partitioning I into arbitrarily small subintervals that there would exist a point x₀ in I with the property that every open interval centered at x₀ contains infinitely many zeros of y(x). Use this fact to prove that

y(x₀) = y′(x₀) = 0,

and then invoke the uniqueness theorem.]

6-2 THE ZEROS OF SOLUTIONS OF BESSEL'S EQUATION


The second-order linear differential equation

x²y″ + xy′ + (x² − p²)y = 0,      (6-6)

p a non-negative real number, is known as Bessel's equation of order p. It is easily
one of the most important differential equations in mathematical physics, and will
be studied in some detail in Chapter 15. For the present we restrict ourselves to
investigating the behavior of the zeros of its solutions on the interval (0, ∞).

Before the comparison theorem can be applied to (6-6) the term involving y′
must be eliminated. This can be done by making the change of variable y = u/√x,
which transforms (6-6) into

u″ + (1 + (1 − 4p²)/(4x²)) u = 0      (6-7)

without disturbing the zeros of its solutions. Thus it suffices to study the solutions
of (6-7). There are three cases to be considered.

Case 1. 0 ≤ p < ½. Then 1 + (1 − 4p²)/4x² > 1, and the comparison
theorem implies that every solution of Bessel's equation vanishes at least once
between any two zeros of a nontrivial solution of u″ + u = 0. But, as we know,
the general solution of this equation is c₁ sin x + c₂ cos x. Moreover, if a is an
arbitrary real number, c₁ and c₂ can be chosen in such a way that the zeros of
c₁ sin x + c₂ cos x are a, a ± π, a ± 2π, . . . . (See Exercise 2.) Thus we conclude
that every solution of Bessel's equation of order p, p < ½, has at least one zero in every
subinterval of the positive x-axis of length π; i.e., the distance between successive
zeros of these solutions does not exceed π. Finally, it is not difficult to show that
this distance is always less than π, and approaches π as x → ∞ (Exercise 3).

Case 2. p = ½. Here Eq. (6-7) reduces to u″ + u = 0, and the general solution
of (6-6) can be written explicitly as

y = (1/√x)(c₁ sin x + c₂ cos x).

The remark made a moment ago concerning the choice of c₁ and c₂ again applies,
and we can assert that the zeros of every nontrivial solution of Bessel's equation of
order ½ are equally spaced along the positive x-axis, successive zeros separated by
an interval of length π.

Case 3. p > ½. In this case 1 + (1 − 4p²)/4x² < 1, and comparison with
u″ + u = 0 implies that every nontrivial solution of (6-7) has at most one zero
in any subinterval of the positive x-axis of length π. To prove the existence of
zeros here we reason as follows.

For any fixed value of p, (1 − 4p²)/4x² → 0 as x → ∞. Hence there exists
an x₀ > 0 such that 1 + (1 − 4p²)/4x² > ½ whenever x > x₀, and we now
apply the comparison theorem on the interval (x₀, ∞) to (6-7) and the equation
u″ + ½u = 0, with general solution c₁ sin (√2/2)x + c₂ cos (√2/2)x. Since this
last function has infinitely many zeros in (x₀, ∞), with successive zeros separated
by an interval of length √2 π, it follows that every solution of Bessel's equation of
order p > ½ has infinitely many zeros, and that the distance between successive
zeros eventually becomes less than √2 π. By modifying the argument in the
obvious way, it can be shown that here too the distance between successive zeros
approaches π as x → ∞.
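This approach of the zero spacing to π is easy to watch numerically (a modern aside, not part of the original text); the order p = 3 and initial data below are arbitrary choices.

```python
import math

# Integrate the normal form (6-7), u'' + (1 + (1 - 4p^2)/(4x^2)) u = 0,
# for p = 3 with Runge-Kutta, record the zeros of u by linear
# interpolation at sign changes, and examine the gaps between them.
p = 3.0

def coeff(x):
    return 1.0 + (1.0 - 4.0*p*p) / (4.0*x*x)

def zero_gaps(x0=1.0, x1=100.0, h=1e-3):
    f = lambda x, u, v: (v, -coeff(x)*u)
    x, u, v, zs = x0, 0.0, 1.0, []
    for _ in range(round((x1 - x0) / h)):
        u_old, x_old = u, x
        k1u, k1v = f(x, u, v)
        k2u, k2v = f(x + h/2, u + h/2*k1u, v + h/2*k1v)
        k3u, k3v = f(x + h/2, u + h/2*k2u, v + h/2*k2v)
        k4u, k4v = f(x + h, u + h*k3u, v + h*k3v)
        u += h/6*(k1u + 2*k2u + 2*k3u + k4u)
        v += h/6*(k1v + 2*k2v + 2*k3v + k4v)
        x += h
        if u_old * u < 0:
            zs.append(x_old + h * u_old / (u_old - u))
    return [b - a for a, b in zip(zs, zs[1:])]

gaps = zero_gaps()
print(gaps[0], gaps[-1])   # the gaps shrink toward pi = 3.14159...
```

The early gaps exceed π, as the text predicts for p > ½, while the last gap computed differs from π by only a few thousandths.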

EXERCISES

1. Verify that the substitution y = u/√x reduces Bessel's equation of order p to

u″ + (1 + (1 − 4p²)/(4x²)) u = 0.

2. Prove that for any real number a there exist constants c₁ and c₂ such that the zeros of c₁ sin x + c₂ cos x are a, a ± π, a ± 2π, . . . .

3. (a) Prove that the distance between successive zeros of any nontrivial solution of Bessel's equation of order p < ½ is always less than π. [Hint: For any fixed p < ½ and any x₀ > 0, 1 < 1 + ε < 1 + (1 − 4p²)/(4x²) on (0, x₀).]
(b) Prove that the distance in (a) approaches π as x → ∞. [Hint: (1 − 4p²)/(4x²) → 0 as x → ∞.]

4. Prove that the distance between successive zeros of any nontrivial solution of Bessel's equation of order p > ½ approaches π as x → ∞.
*5. (a) Let a₀(x) be continuous on (0, ∞), and suppose that there exist positive numbers b, B such that b² ≤ a₀(x) ≤ B² for all x > 0. Prove that every nontrivial solution of

y″ + a₀(x)y = 0

has infinitely many zeros on (0, ∞), and that the distance d between successive zeros can be estimated as

π/B ≤ d ≤ π/b.

[Hint: Use the comparison theorem on the given equation and each of y″ + b²y = 0 and y″ + B²y = 0.]
(b) Use the results in (a) to deduce the facts proved in this section concerning the zeros of the solutions of Bessel's equation.
6. Let

a₂(x) d²y/dx² + a₁(x) dy/dx + a₀(x)y = 0

be normal on an interval I. Show that there exists a function v(x) defined on I with the property that the substitution y = uv reduces this equation to

u″ + i(x)u = 0.

(The function i(x) is called the invariant of the equation.)


7. Use the method suggested in Exercise 6 to reduce the following equations to a form in which the first derivative does not appear. (In each of these equations p is a non-negative constant.)
(a) x²y″ + xy′ + (x² − p²)y = 0
(b) (1 − x²)y″ − 2xy′ + p(p + 1)y = 0
(c) (1 − x²)y″ − xy′ + p²y = 0
(d) y″ − 2xy′ + 2py = 0
8. Prove that every nontrivial solution of the Hermite equation

y″ − 2xy′ + 2py = 0,

p a non-negative constant, has at most finitely many zeros on (−∞, ∞). [Hint: See Example 3 and Exercise 6 of the preceding section, and Exercise 7(d) above.]

6-3 SELF-ADJOINT FORM; THE SONIN-POLYA THEOREM

Let

a₂(x) d²y/dx² + a₁(x) dy/dx + a₀(x)y = 0      (6-8)

be normal on an interval I, and let

p(x) = e^(∫[a₁(x)/a₂(x)] dx).      (6-9)

Then since

(d/dx)(p(x) dy/dx) = p(x) d²y/dx² + (a₁(x)/a₂(x)) p(x) dy/dx,

Eq. (6-8) may be rewritten

(d/dx)(p(x) dy/dx) + (a₀(x)/a₂(x)) p(x) y = 0,

or, more simply,

(d/dx)(p(x) dy/dx) + q(x)y = 0,      (6-10)

where q(x) = [a₀(x)/a₂(x)]p(x).
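As a concrete instance of this computation (an editorial addition): for the Legendre equation (1 − x²)y″ − 2xy′ + 6y = 0, which appears in Exercise 1(a) below, a₁/a₂ = −2x/(1 − x²), so p(x) = e^(∫ a₁/a₂ dx) = 1 − x² and q(x) = 6. The resulting self-adjoint form can be checked exactly on the polynomial solution y = (3x² − 1)/2.

```python
from fractions import Fraction

# Verify the self-adjoint form d/dx[(1 - x^2) y'] + 6 y = 0 for the
# Legendre polynomial y = (3x^2 - 1)/2.  Its derivative is 3x, so
# (1 - x^2) y' = 3x - 3x^3 and d/dx[(1 - x^2) y'] = 3 - 9x^2; the
# residual below should vanish identically.
def P2(x):
    return (3*x*x - 1) / 2

def residual(x):
    return (3 - 9*x*x) + 6*P2(x)

for x in (Fraction(-1, 2), Fraction(0), Fraction(2, 3), Fraction(7, 5)):
    assert residual(x) == 0
print("self-adjoint form of the Legendre equation verified")
```

Exact rational arithmetic is used so that the check is not clouded by rounding error.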


Equation (6-10) is known as the self-adjoint form of (6-8), and enjoys certain
advantages over the original version of the equation. Not the least of these is
that it provides a "standard form" for all normal second-order linear differential
equations which is easy to derive, and, as we shall see, well suited for computa-
tions. For future reference we note that the function p(x) appearing in (6-10) is

always positive throughout the interval /.

In addition to the self-adjoint form, there are a number of other special forms
for normal second-order equations which are sometimes useful. One of them
involving the invariant of the equation was given in the exercises at the end of
the last section. Another is a normalized version of the self-adjoint form in which
p(x) = 1. It can be deduced from (6-10) by making the change of variable

dx
= .

t
/ p(x)
For then

dy/dx = (dy/dt)(dt/dx) = (1/p(x)) dy/dt,

(d/dx)(p(x) dy/dx) = (d/dx)(dy/dt) = (d²y/dt²)(dt/dx) = (1/p(x)) d²y/dt²,

and (6-10) becomes

d²y/dt² + Q(t)y = 0,      (6-11)

where Q(t) = p(x)q(x).



This argument proves that every normal homogeneous second-order linear
differential equation can be written in the form required by the comparison
theorem, and thus considerably extends the usefulness of that result. Actually, it
proves even more. For if the points x₀ and t₀ correspond under the change of
variable

t = ∫ dx/p(x)

introduced in passing from (6-10) to (6-11), and if

Q₁(t₀) = p(x₀)q₁(x₀),
Q₂(t₀) = p(x₀)q₂(x₀),

then the fact that p(x) is positive throughout I implies that Q₁(t₀) ≥ Q₂(t₀) if
and only if q₁(x₀) ≥ q₂(x₀). Thus if

d²y/dt² + Q₁(t)y = 0 and d²y/dt² + Q₂(t)y = 0

are deduced, respectively, from the self-adjoint equations

(d/dx)(p(x) dy/dx) + q₁(x)y = 0 and (d/dx)(p(x) dy/dx) + q₂(x)y = 0,      (6-12)

then Q₁(t) ≥ Q₂(t) if and only if q₁(x) ≥ q₂(x). Hence the comparison theorem
stated in Section 6-1 is also valid for a pair of self-adjoint equations of the form
(6-12) whenever q₁(x) ≥ q₂(x) on I.*

As a final application of the ideas we have been exploring here we prove a rather
surprising result concerning the zeros of the derivative of any solution of a certain
class of self-adjoint equations.
As a final application of the ideas we have been exploring here we prove a rather
surprising result concerning the zeros of the derivative of any solution of a certain
class of self-adjoint equations.

Theorem 6-3. (Sonin-Polya theorem.) Let p(x) > 0 and q(x) ≠ 0 be
continuously differentiable on an interval I, and suppose that p(x)q(x) is
nonincreasing (nondecreasing) on I. Then the absolute values of the relative

* The reader should note that we have proved this assertion under the assumption
that the function p(x) is the same in both equations in (6-12). It can be shown, however,
that the conclusion of the comparison theorem continues to hold exactly as stated earlier
for the solutions of a pair of self-adjoint equations

(d/dx)(p₁(x) dy/dx) + q₁(x)y = 0,    (d/dx)(p₂(x) dy/dx) + q₂(x)y = 0,

in which q₁(x) ≥ q₂(x) and 0 < p₁(x) ≤ p₂(x). For a proof of this more general
theorem see G. Birkhoff and G. C. Rota, Ordinary Differential Equations, Ginn, Boston,
1962.
6-3 | SELF-ADJOINT FORM; THE SONIN-POLYA THEOREM 239

maxima and minima of every nontrivial solution of the equation

&{*<>£) + «*»- ^ 13 )

are nondecreasing (nonincreasing) as x increases*

Proof. Let y be a nontrivial solution of (6-13), and consider the function

$$F(x) = [y(x)]^2 + \frac{[p(x)y'(x)]^2}{p(x)q(x)}.$$

Then

$$F'(x) = 2yy' + \frac{2(py')(py')'}{pq} - \left(\frac{py'}{pq}\right)^{2}\frac{d}{dx}(pq),$$

and since, by assumption, $(py')' = -qy$,

$$F'(x) = -\left(\frac{y'}{q}\right)^{2}\frac{d}{dx}(pq).$$

Now suppose that pq is nonincreasing on I. Then $(d/dx)(pq) \le 0$ and F is nondecreasing on I (i.e., $F' \ge 0$ on I). Hence the same is true of any sequence of values of F computed at points $x_1 < x_2 < \cdots$. If, in particular, $x_1, x_2, \ldots$ are the points at which y has a relative maximum or minimum, then $y'(x_i) = 0$, $F(x_i) = y(x_i)^2$, and we have

$$y(x_1)^2 \le y(x_2)^2 \le \cdots.$$

Thus

$$|y(x_1)| \le |y(x_2)| \le \cdots,$$

as asserted. A similar argument applies when pq is nondecreasing, and the proof is complete. |

Example. In the preceding section we proved that every nontrivial solution of Bessel's equation of order p has infinitely many relative maxima and minima on (0, ∞). Since the self-adjoint form of Bessel's equation is

$$\frac{d}{dx}\left(x\,\frac{dy}{dx}\right) + \left(x - \frac{p^{2}}{x}\right)y = 0, \tag{6-14}$$

and since $p(x)q(x) = x^2 - p^2$ is increasing and positive on the interval (p, ∞), the Sonin-Pólya theorem implies that the magnitude of the oscillations of such a solution is nonincreasing on this interval. In fact, in this case it can be proved (see Exercise 6) that the oscillations actually decrease, as shown in Fig. 6-2, where the graph of a solution of Bessel's equation of order 1 has been sketched.

[FIGURE 6-2. Graph of a solution of Bessel's equation of order 1, oscillating with successive maxima and minima of decreasing magnitude.]

* Note that the relative maxima and minima of a solution y of (6-13) occur at the zeros of y'.
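The behavior just described is easy to observe on a machine. The sketch below (an illustration, not part of the text) integrates Bessel's equation of order 1, rewritten as $y'' = -y'/x - (1 - 1/x^2)y$, with a classical fourth-order Runge-Kutta step, and records |y| at the successive points where y' changes sign; by the theorem (and Exercise 6) these magnitudes should strictly decrease on (1, ∞). Any nontrivial solution will do, so arbitrary initial data are used.

```python
# Numerical illustration of the Sonin-Polya theorem for Bessel's
# equation of order 1:  x^2 y'' + x y' + (x^2 - 1) y = 0,  i.e.
# y'' = -y'/x - (1 - 1/x^2) y.

def f(x, y, v):
    return -v / x - (1.0 - 1.0 / x**2) * y

def rk4_extrema(x, y, v, h, x_end):
    """Integrate with classical RK4; return |y| where y' changes sign."""
    mags = []
    while x < x_end:
        k1y, k1v = v, f(x, y, v)
        k2y, k2v = v + h/2*k1v, f(x + h/2, y + h/2*k1y, v + h/2*k1v)
        k3y, k3v = v + h/2*k2v, f(x + h/2, y + h/2*k2y, v + h/2*k2v)
        k4y, k4v = v + h*k3v, f(x + h, y + h*k3y, v + h*k3v)
        y_new = y + h/6*(k1y + 2*k2y + 2*k3y + k4y)
        v_new = v + h/6*(k1v + 2*k2v + 2*k3v + k4v)
        if v * v_new < 0:            # y' changed sign: a max or min
            mags.append(abs(y_new))
        x, y, v = x + h, y_new, v_new
    return mags

mags = rk4_extrema(x=2.0, y=1.0, v=0.0, h=0.001, x_end=40.0)
print(mags)  # magnitudes of the successive maxima and minima
```

The printed magnitudes form a strictly decreasing sequence, exactly as Fig. 6-2 suggests.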

EXERCISES

1. Write each of the following equations in self-adjoint form.

(a) $(1 - x^2)y'' - 2xy' + 6y = 0$  (b) $x^2y'' - 2x^3y' - (4 - x^2)y = 0$
(c) $(x^3 - 2)y'' - x^2y' - 3y = 0$  (d) $2x(\ln x)y'' + 3y' - (\sin x)y = 0$
(e) $(x + 1)y'' - y' + 2xy = 0$
2. Prove that if $y_1$ and $y_2$ are solutions of the self-adjoint equation

$$\frac{d}{dx}\left(p(x)\frac{dy}{dx}\right) + q(x)y = 0,$$

then $p[y_1y_2' - y_1'y_2]$ is a constant. (This result is known as Abel's identity.)


3. Write Bessel's equation in the form

$$\frac{d^{2}y}{dt^{2}} + Q(t)y = 0$$

by using the substitution suggested in the text.

4. (a) Prove that no solution of a self-adjoint equation

$$\frac{d}{dx}\left(p(x)\frac{dy}{dx}\right) + q(x)y = 0$$

can oscillate in any interval where $q(x) < 0$.


(b) For each of the equations in Exercise 1 determine the intervals of the x-axis in which a nontrivial solution can have at most one zero.

5. Discuss the oscillatory behavior of the nontrivial solutions of each of the following equations.

(a) $2x^2y'' + 6xy' + \left(1 - \dfrac{1}{x}\right)y = 0$
(b) $y'' - y' + e^x(1 - x)y = 0$  (c) $x^3y'' + xy' + (x^2 - 1)y = 0$
6. Prove that the sequence of absolute values obtained in the Sonin-Pólya theorem is strictly increasing (decreasing) when pq is strictly increasing (decreasing).

7. (a) Discuss the oscillatory behavior of the nontrivial solutions of Airy's differential equation

$$y'' + xy = 0$$

on (−∞, ∞).

(b) Prove that if $y(x)$ is a solution of Bessel's equation of order $\tfrac{1}{3}$ on (0, ∞), then $x^{1/2}\,y\!\left(\tfrac{2}{3}x^{3/2}\right)$ is a solution of Airy's equation.

8. Let $p(x)$ and $q(x) > 0$ satisfy the hypotheses of the Sonin-Pólya theorem, and let $y(x)$ be a nontrivial solution of the equation

$$\frac{d}{dx}\left(p(x)\frac{dy}{dx}\right) + q(x)y = 0.$$

Prove that the values of $[p(x)q(x)]^{1/2}\,|y(x)|$ at the points where $y' = 0$ form an increasing (decreasing) sequence if $p(x)q(x)$ is an increasing (decreasing) function. [Hint: Argue as in the proof of Theorem 6-3, starting with the function

$$F(x) = p(x)q(x)y(x)^2 + [p(x)y'(x)]^2.]$$

6-4 POWER SERIES AND ANALYTIC FUNCTIONS


In an earlier section we introduced the equation

$$x^2y'' + xy' + (x^2 - p^2)y = 0$$

to illustrate how the comparison theorem is used to obtain information about the solutions of a differential equation. This particular equation is but one of a


number of linear differential equations with variable coefficients which arise re-
peatedly in mathematics and mathematical physics, and whose study has had a
decisive influence on the development of the theory of differential equations.
One of the distinctive features of these equations is that their solutions cannot, in general, be expressed in closed form in terms of elementary functions. Thus it is
quite impossible to "solve" them within the context of the naive interpretation
which considers a solution of a differential equation as a neat little formula in-
volving familiar functions. Nevertheless, it is possible to gain information about
the solutions of such equations (witness our discussion of the oscillatory behavior
of the solutions of Bessel's equation) and, in fact, enough information to be able
to use these solutions effectively in the analysis of other problems. This suggests
that we ought to drop the artificial restriction of seeking solutions within some
preassigned collection of "known" functions, and adopt instead the larger point

of view which sees the solutions of a differential equation as functions defined by


the equation itself.

One example of the way in which this point of view can be exploited was given earlier when we derived all of the familiar properties of the functions sin x and cos x from the fact that they satisfy the equation $y'' + y = 0$ (see Section 3-8). Another is furnished by that treatment of the natural logarithm which sees this function as the solution of the first-order equation $xy' = 1$ subject to the initial condition $y(1) = 0$. Together these examples pointedly illustrate the fact that one of the most satisfactory ways of defining new functions in mathematics is as solutions of differential equations.
In the remaining sections of this chapter we shall use the method of power
series expansions to study the solutions of a certain large class of linear differential
equations, or, rather, to study the functions defined by this class of equations.
Since this discussion depends upon certain results in the theory of power series
expansions of analytic functions we begin by reviewing some of the basic facts
concerning such series. It is assumed that the reader is already familiar with these results, and the real purpose of this résumé is to fix terminology and provide formulas for convenient reference.
In the first place, we recall that an expression of the form

$$\sum_{k=0}^{\infty} a_k (x - x_0)^k, \tag{6-15}$$

$x_0$ and $a_k$ constants, is called a power series in $x - x_0$, or, alternately, a power series about the point $x_0$, and is said to converge at $x_1$ if and only if the series obtained from it by setting $x = x_1$ is a convergent series of real numbers. It is well known that every power series about $x_0$ either converges at the single point $x = x_0$ or else throughout an interval centered at $x_0$ of the form $|x - x_0| < r$ with $0 < r \le \infty$. If, in the latter case, r is chosen as large as possible, the series diverges when $|x - x_0| > r$, and for this reason r is called the radius of convergence of the series.*

Every power series with radius of convergence r defines a function f in the interval $|x - x_0| < r$, and we write

$$f(x) = \sum_{k=0}^{\infty} a_k (x - x_0)^k. \tag{6-16}$$

It can be shown that such functions are infinitely differentiable throughout $|x - x_0| < r$, and that

$$f'(x) = \sum_{k=1}^{\infty} k a_k (x - x_0)^{k-1},$$

$$f''(x) = \sum_{k=2}^{\infty} k(k-1) a_k (x - x_0)^{k-2}, \tag{6-17}$$

where the radius of convergence of each of these derived series is identical with the radius of convergence of (6-16). In short, a power series may be differentiated term-by-term without changing its radius of convergence.

* The behavior of a power series at the endpoints of its interval of convergence cannot be predicted in advance.
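When a power series is represented by its list of coefficients, term-by-term differentiation is a one-line operation. The fragment below (illustrative only, not from the text) differentiates a truncated series about $x_0 = 0$ and checks the rule on the exponential series, whose derivative has the same coefficients.

```python
from fractions import Fraction
from math import factorial

def diff_series(a):
    """Coefficients of f' given the coefficients a of f (about x0 = 0):
    the k-th term a_k x^k differentiates to k a_k x^(k-1)."""
    return [k * a[k] for k in range(1, len(a))]

# Exponential series: a_k = 1/k!; its derivative is the same series.
exp_coeffs = [Fraction(1, factorial(k)) for k in range(10)]
print(diff_series(exp_coeffs)[:5])
```

Since $k/k! = 1/(k-1)!$, the differentiated list reproduces the exponential coefficients, one order shorter.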
Any function which can be represented by a convergent power series of the form (6-16) in an open interval I about the point $x_0$ is said to be analytic at $x_0$. We have just seen that such a function must have derivatives of all orders everywhere in I, and, as might be expected, is actually analytic at each point of I. Thus it is customary to speak of functions as being analytic on an interval, and the phrase "analytic at $x_0$" is used primarily to direct attention to the point about which the series is expanded. Finally, the reader should be aware of the fact that all of the elementary functions in mathematics (polynomials, exponentials, trigonometric functions, etc.) are analytic. Indeed, this is one of the major results of that chapter of calculus which deals with Taylor series expansions of (analytic) functions, and will be used throughout the following discussion.

EXERCISES

1. Let $\mathcal{A}(I)$ denote the set of all functions analytic on an open interval I of the x-axis. Prove that $\mathcal{A}(I)$ is a real vector space under the "usual" definitions of addition and scalar multiplication.

2. Let $y(x)$ be a solution of

$$\frac{d^{n}y}{dx^{n}} + a_{n-1}(x)\frac{d^{n-1}y}{dx^{n-1}} + \cdots + a_0(x)y = h(x)$$

on an interval I, and suppose that $a_0(x), \ldots, a_{n-1}(x)$, and $h(x)$ are analytic at a point $x_0$ in I. Prove that $y(x)$ is infinitely differentiable at $x_0$.

3. Prove that the power series expansion of an analytic function about the point $x_0$ is unique, i.e., that such a function has precisely one such expansion.

6-5 ANALYTIC SOLUTIONS OF LINEAR DIFFERENTIAL EQUATIONS


Analytic functions arise in the study of differential equations for the very simple reason that the solutions of any normal linear differential equation whose coefficients and right-hand side are analytic on an interval I are themselves analytic on I. This result is known as the existence theorem for equations with analytic coefficients, and, in contrast to most existence theorems in the theory of differential equations, immediately leads to an explicit technique for computing solutions. Before introducing this technique, however, we give a formal statement of the theorem in question.

Theorem 6-4. (The existence theorem for equations with analytic coefficients.) Let

$$\frac{d^{n}y}{dx^{n}} + a_{n-1}(x)\frac{d^{n-1}y}{dx^{n-1}} + \cdots + a_0(x)y = h(x) \tag{6-18}$$

be a normal nth-order linear differential equation whose coefficients $a_0(x), \ldots, a_{n-1}(x)$ and right-hand side $h(x)$ are analytic in an open interval I. Let $x_0$ be an arbitrary point in I, and suppose that the power series expansions of $a_0(x), \ldots, a_{n-1}(x), h(x)$ all converge in the interval $|x - x_0| < r$, $r > 0$. Then every solution of (6-18) which is defined at the point $x_0$ is analytic at that point, and its power series expansion also converges in the interval $|x - x_0| < r$.

It is important to note that this theorem not only asserts the analyticity of

solutions of an equation of the type described, but also specifies an interval in


which the power series expansions of these solutions converge. At the same time,
it should be observed that we make no statement concerning the behavior of such
series outside this interval. As we shall see, this is in the nature of things, since
it is possible to give examples of series solutions which converge in a larger in-
terval than the one described above.*

Example 1. Find the general solution of

$$y'' + xy' + y = 0 \tag{6-19}$$

on (−∞, ∞).

Since this equation satisfies the hypotheses of Theorem 6-4 on the entire x-axis, each of its solutions has a power series expansion about the point $x = 0$ which converges for all values of x.† To compute these solutions we use the so-called method of undetermined coefficients, as follows.

Let $y(x)$ be an arbitrary solution of (6-19). Then for suitable constants $a_k$ we have

$$y(x) = \sum_{k=0}^{\infty} a_k x^k, \tag{6-20}$$

and

$$y'(x) = \sum_{k=1}^{\infty} k a_k x^{k-1}, \qquad y''(x) = \sum_{k=2}^{\infty} k(k-1) a_k x^{k-2}.$$

Substituting these series in (6-19), we obtain

$$\sum_{k=2}^{\infty} k(k-1) a_k x^{k-2} + \sum_{k=1}^{\infty} k a_k x^k + \sum_{k=0}^{\infty} a_k x^k = 0, \tag{6-21}$$

and it follows that (6-20) will be a solution of the given equation if and only if the $a_k$ are chosen so that the sums of the coefficients of like powers of x in this expression are all zero (see Exercise 3 of the preceding section).

* For a proof of Theorem 6-4 the reader should consult E. A. Coddington, An Introduction to Ordinary Differential Equations, Prentice-Hall, Englewood Cliffs, N. J., 1961.
† The student should appreciate that the choice $x_0 = 0$ is one of convenience, not necessity.

To facilitate collecting terms in (6-21) we now replace the index of summation in the first series by $k + 2$, which can certainly be done without prejudice to the sum of the series. This gives

$$\sum_{k=2}^{\infty} k(k-1) a_k x^{k-2} = \sum_{k+2=2}^{\infty} (k+2)(k+1) a_{k+2} x^k = \sum_{k=0}^{\infty} (k+2)(k+1) a_{k+2} x^k,$$

and (6-21) becomes

$$\sum_{k=0}^{\infty} (k+2)(k+1) a_{k+2} x^k + \sum_{k=1}^{\infty} k a_k x^k + \sum_{k=0}^{\infty} a_k x^k = 0.$$

But since the order of summation in a power series is a matter of indifference, we can rewrite this expression as

$$a_0 + 2a_2 + \sum_{k=1}^{\infty} \left[(k+1)(k+2) a_{k+2} + (k+1) a_k\right] x^k = 0;$$

whence

$$a_0 + 2a_2 = 0,$$
$$(k+2) a_{k+2} + a_k = 0, \qquad k \ge 1.$$

The second of these equations is known as a recurrence relation (or finite difference equation), and can be used to express the $a_k$ from $k = 3$ onward in terms of the preceding ones. Moreover, since $a_2 = -(a_0/2)$, it follows that all of the $a_k$ for $k \ge 2$ are uniquely determined by the values of $a_0$ and $a_1$. They fall into two distinct sets depending on the parity of k, as follows. For k even,

$$a_2 = -\frac{a_0}{2}, \quad a_4 = \frac{a_0}{4\cdot 2}, \quad a_6 = -\frac{a_0}{6\cdot 4\cdot 2}, \quad \ldots, \quad a_{2k} = (-1)^k \frac{a_0}{(2k)(2k-2)\cdots 4\cdot 2};$$

for k odd,

$$a_3 = -\frac{a_1}{3}, \quad a_5 = \frac{a_1}{5\cdot 3}, \quad a_7 = -\frac{a_1}{7\cdot 5\cdot 3}, \quad \ldots, \quad a_{2k+1} = (-1)^k \frac{a_1}{(2k+1)(2k-1)\cdots 5\cdot 3}.$$

Substituting these values in (6-20), we obtain

$$y(x) = a_0\left[1 - \frac{x^2}{2} + \frac{x^4}{4\cdot 2} - \frac{x^6}{6\cdot 4\cdot 2} + \cdots\right] + a_1\left[x - \frac{x^3}{3} + \frac{x^5}{5\cdot 3} - \frac{x^7}{7\cdot 5\cdot 3} + \cdots\right], \tag{6-22}$$

where $a_0$ and $a_1$ are arbitrary constants.


To complete the problem it remains to show that (6-22) is the general solution of the given equation on (−∞, ∞). To this end we introduce the series

$$y_0(x) = 1 + \sum_{k=1}^{\infty} (-1)^k \frac{x^{2k}}{(2k)(2k-2)\cdots 4\cdot 2}, \qquad y_1(x) = x + \sum_{k=1}^{\infty} (-1)^k \frac{x^{2k+1}}{(2k+1)(2k-1)\cdots 5\cdot 3}, \tag{6-23}$$

and rewrite (6-22) as

$$y(x) = a_0 y_0(x) + a_1 y_1(x). \tag{6-24}$$

An easy computation using the ratio test now shows that $y_0$ and $y_1$ both converge for all values of x. Hence so does (6-24), and we conclude that this expression is a solution of (6-19) on the entire x-axis, just as predicted by Theorem 6-4. Finally, we note that $y_0$ and $y_1$ are themselves solutions of (6-19), and that

$$y_0(0) = 1, \quad y_0'(0) = 0, \qquad y_1(0) = 0, \quad y_1'(0) = 1.$$

Hence $y_0$ and $y_1$ are linearly independent in $C(-\infty, \infty)$ and therefore span the solution space of (6-19) (cf. Theorem 3-3). This implies that (6-24) is the general solution of the given equation, and we are done.
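The recurrence $(k+2)a_{k+2} + a_k = 0$ is easy to run on a machine. The check below (our illustration, not part of the text) generates the coefficients of $y_0$ and compares the partial sum with $e^{-x^2/2}$; that exponential is in fact the elementary closed form of $y_0$, a fact not needed above but easily read off from (6-23), since $(2k)(2k-2)\cdots 4\cdot 2 = 2^k k!$.

```python
import math

# Coefficients of the solution y0 of y'' + x y' + y = 0 with
# y0(0) = 1, y0'(0) = 0, from the recurrence a_{k+2} = -a_k/(k+2).
N = 40
a = [0.0] * N
a[0], a[1] = 1.0, 0.0
for k in range(N - 2):
    a[k + 2] = -a[k] / (k + 2)

x = 0.7
partial_sum = sum(a[k] * x**k for k in range(N))
print(partial_sum, math.exp(-x**2 / 2))  # should agree closely
```

The agreement to machine precision illustrates both the recurrence and the rapid convergence of the series.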
Example 2. The differential equation

$$(1 - x^2)y'' - 2xy' + \lambda(\lambda + 1)y = 0, \tag{6-25}$$

λ a non-negative constant, is known as Legendre's equation of order λ, and will play an important role in much of our later work. In this section we seek the power series expansion

$$y(x) = \sum_{k=0}^{\infty} a_k x^k \tag{6-26}$$

of its general solution about the point $x = 0$. Before we begin, however, we note that since the leading coefficient of (6-25) vanishes when $x = \pm 1$, Theorem 6-4 only guarantees that the series in question will converge in the interval (−1, 1).

This said, we substitute (6-26) and its first two derivatives in (6-25) to obtain

$$(1 - x^2)\sum_{k=2}^{\infty} k(k-1)a_k x^{k-2} - 2x\sum_{k=1}^{\infty} k a_k x^{k-1} + \lambda(\lambda+1)\sum_{k=0}^{\infty} a_k x^k = 0,$$

or

$$\sum_{k=2}^{\infty} k(k-1)a_k x^{k-2} - \sum_{k=2}^{\infty} k(k-1)a_k x^k - \sum_{k=1}^{\infty} 2k a_k x^k + \sum_{k=0}^{\infty} \lambda(\lambda+1) a_k x^k = 0.$$

By shifting the index of summation on the first series and consolidating the last three, this expression may be rewritten

$$-2a_1 x + \lambda(\lambda+1)[a_0 + a_1 x] + \sum_{k=0}^{\infty} (k+2)(k+1)a_{k+2}x^k + \sum_{k=2}^{\infty} \left[-k(k-1) - 2k + \lambda(\lambda+1)\right]a_k x^k = 0. \tag{6-27}$$

Finally, we use the identity

$$-k(k-1) - 2k + \lambda(\lambda+1) = (\lambda + k + 1)(\lambda - k)$$

to put (6-27) in the simpler form

$$\left[2a_2 + \lambda(\lambda+1)a_0\right] + \left[(\lambda+2)(\lambda-1)a_1 + (3\cdot 2)a_3\right]x + \sum_{k=2}^{\infty} \left[(k+2)(k+1)a_{k+2} + (\lambda+k+1)(\lambda-k)a_k\right]x^k = 0.$$

To complete the solution we set the various coefficients in this last series equal to zero and solve the resulting equations. This gives

$$2a_2 + \lambda(\lambda+1)a_0 = 0, \qquad (\lambda+2)(\lambda-1)a_1 + (3\cdot 2)a_3 = 0,$$
$$(k+2)(k+1)a_{k+2} + (\lambda+k+1)(\lambda-k)a_k = 0, \qquad k \ge 2,$$

and it is now an easy matter to express all of the $a_k$ in terms of $a_0$ and $a_1$. Indeed,

$$a_2 = -\frac{(\lambda+1)\lambda}{2!}\,a_0, \qquad a_4 = -\frac{(\lambda+3)(\lambda-2)}{4\cdot 3}\,a_2 = \frac{(\lambda+3)(\lambda+1)\lambda(\lambda-2)}{4!}\,a_0,$$

etc., while

$$a_3 = -\frac{(\lambda+2)(\lambda-1)}{3!}\,a_1, \qquad a_5 = -\frac{(\lambda+4)(\lambda-3)}{5\cdot 4}\,a_3 = \frac{(\lambda+4)(\lambda+2)(\lambda-1)(\lambda-3)}{5!}\,a_1,$$

etc. In general,

$$a_{2k} = (-1)^k \frac{(\lambda+2k-1)(\lambda+2k-3)\cdots(\lambda+1)\lambda(\lambda-2)\cdots(\lambda-2k+2)}{(2k)!}\,a_0,$$

$$a_{2k+1} = (-1)^k \frac{(\lambda+2k)(\lambda+2k-2)\cdots(\lambda+2)(\lambda-1)(\lambda-3)\cdots(\lambda-2k+1)}{(2k+1)!}\,a_1$$

for all $k > 0$, and it follows that (6-26) may be written

$$y(x) = a_0 y_0(x) + a_1 y_1(x), \tag{6-28}$$

where

$$y_0(x) = 1 - \frac{(\lambda+1)\lambda}{2!}x^2 + \frac{(\lambda+3)(\lambda+1)\lambda(\lambda-2)}{4!}x^4 - \cdots,$$

$$y_1(x) = x - \frac{(\lambda+2)(\lambda-1)}{3!}x^3 + \frac{(\lambda+4)(\lambda+2)(\lambda-1)(\lambda-3)}{5!}x^5 - \cdots, \tag{6-29}$$

and $a_0$ and $a_1$ are arbitrary constants. Moreover, since

$$y_0(0) = 1, \quad y_0'(0) = 0, \qquad y_1(0) = 0, \quad y_1'(0) = 1,$$

we see that (6-28) is the general solution of Legendre's equation of order λ on (−1, 1), as required.
When λ is a (non-negative) integer the solutions of Legendre's equation are of particular interest. For instance, when λ = 2n, n = 0, 1, 2, . . . , the series for $y_0$ given above has only a finite number of nonzero terms, and hence is a polynomial. In fact, it is a polynomial of degree 2n involving only even powers of x. Similarly, when λ = 2n + 1, n = 0, 1, 2, . . . , $y_1$ is a polynomial of degree 2n + 1 involving only odd powers of x. But since $y_0$ and $y_1$ are themselves solutions of (6-25), we conclude that Legendre's equation has polynomial solutions for each non-negative integral value of the parameter λ. These polynomials will be studied in considerable detail in Chapter 11, and we therefore say no more about them here other than to remark that they provide examples of power series solutions of a linear differential equation whose radius of convergence exceeds that predicted by Theorem 6-4.*

* In fact, Legendre's equation has polynomial solutions only when λ is a non-negative integer, and these polynomial solutions are the only ones which are continuous on the closed interval [−1, 1]. For a proof see F. Tricomi, Differential Equations, Hafner, New York, 1961, p. 192.
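The termination of the series for integral λ can be seen directly from the recurrence $(k+2)(k+1)a_{k+2} = -(\lambda+k+1)(\lambda-k)a_k$: the factor $(\lambda - k)$ vanishes when $k = \lambda$. The fragment below (our illustration, not from the text) generates the coefficients of $y_0$ for λ = 4 in exact rational arithmetic and confirms that the series stops after the $x^4$ term.

```python
from fractions import Fraction

def legendre_series_coeffs(lam, a0, a1, n):
    """First n coefficients of a solution of Legendre's equation of
    order lam, from a_{k+2} = -(lam+k+1)(lam-k) a_k / ((k+2)(k+1))."""
    a = [Fraction(0)] * n
    a[0], a[1] = Fraction(a0), Fraction(a1)
    for k in range(n - 2):
        a[k + 2] = -Fraction((lam + k + 1) * (lam - k),
                             (k + 2) * (k + 1)) * a[k]
    return a

# lam = 4: y0 terminates as the polynomial 1 - 10x^2 + (35/3)x^4,
# which is a constant multiple of the fourth Legendre polynomial.
a = legendre_series_coeffs(4, a0=1, a1=0, n=12)
print(a[:6])
```

With $a_0 = 1$, $a_1 = 0$ one finds $a_2 = -10$, $a_4 = 35/3$, and every later coefficient zero, exactly as the text predicts for λ = 2n.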

EXERCISES

Express the general solution of each of the following equations as a power series about the point $x = 0$.

1. $y'' + y = 0$  2. $y'' - y = 0$
3. $y'' - 3xy = 0$  4. $y'' - y = 3xy'$
5. $(x^2 + 1)y'' - 6y = 0$  6. $(x^2 + 1)y'' - 8xy' + 15y = 0$
7. $(2x^2 + 1)y'' + 2xy' - 18y = 0$  8. $2y'' + 2x^2y' - xy = 0$
9. $y'' + x^2y' + 2xy = 0$  10. $y'' - x^2y = 6x$
11. $3y'' - xy' + y = x^2 + 2x + 1$  12. $y''' + 3x^2y'' - 2y = 0$
13. $y''' - 3xy' - y = 0$  14. $y''' + x^2y' - xy = 0$

Use the method of undetermined coefficients to express the general solution of each of the following equations as a power series about the point $x = 0$, and specify an interval in which the solution is valid.

15. $(x^2 - 1)y'' + xy' - 4y = 0$
16. $(x^2 - 2)y'' + xy' - y = x^2$
17. $y'' + \dfrac{x}{x^2 - 4}\,y' + \dfrac{25}{x^2 - 4}\,y = \dfrac{1 + 2x}{x^2 - 4}$
18. $(x^3 - 8)y'' + x^2y' + xy = 16$
19. $(x^3 + 2)y'' + 6x^2y' - 6xy = 0$
20. $(x^4 - 4)y''' - 36x^2y' - 48xy = 0$
21. Use the method of undetermined coefficients to solve the initial-value problem

$$y'' + xy' - 2y = e^x; \qquad y(0) = y'(0) = 0.$$

[Hint: Expand $e^x$ as a power series about $x = 0$.]
22. Find the general solution of Airy's equation $y'' + xy = 0$.

23. Find a necessary and sufficient condition that a differential equation of the form

$$(x^2 + \alpha)y'' + \beta xy' + \gamma y = 0$$

has a polynomial solution of degree n.

24. Let $y_0$ and $y_1$ be the series solutions of Legendre's equation given in (6-29).
(a) Find the radius of convergence of these series. [Hint: Use the ratio test.]
(b) Prove that up to constant multiples Legendre's equation of order n (n a non-negative integer) has only one polynomial solution.
(c) Prove that the function

$$\frac{x}{2}\ln\left(\frac{1+x}{1-x}\right) - 1$$

is a solution of Legendre's equation of order one on the interval (−1, 1), and use this fact to write the general solution of the equation in closed form.

25. The second-order linear differential equation

$$y'' - 2xy' + 2\lambda y = 0,$$

λ a non-negative constant, is known as Hermite's equation of order λ.
(a) Prove that y is a solution of this equation if and only if $u = e^{-x^2/2}y$ is a solution of

$$u'' + (2\lambda + 1 - x^2)u = 0.$$

(b) Use the method of undetermined coefficients to find a basis for the solution space of Hermite's equation.
(c) Show that Hermite's equation of order n has polynomial solutions of degree n for each integer $n \ge 0$, and that up to constant multiples there is precisely one such solution for each n.
*26. Use the method of undetermined coefficients to show that Bessel's equation of order zero has a solution $J_0$ which is analytic on the entire x-axis and satisfies the condition $J_0(0) = 1$.

6-6 FURTHER EXAMPLES


In certain respects the exercises and examples in the preceding section were somewhat too special to be completely representative of the power series method for solving linear differential equations. For one thing, all of the coefficients in each of these equations were polynomials, which hardly qualify as typical analytic functions. And for another, each of the recurrence relations we obtained led immediately to a general formula for the coefficients of the series being sought. In many cases neither of these simplifications occurs, and in order to put this discussion in a more reasonable perspective we now present a number of less elementary examples.

Example 1. Solve the initial-value problem

$$3y'' - y' + (x+1)y = 1; \qquad y(0) = y'(0) = 0. \tag{6-30}$$

By Theorem 6-4 we know that the desired solution can be expressed in the form

$$y(x) = \sum_{k=0}^{\infty} a_k x^k \tag{6-31}$$

for suitable constants $a_k$, and that the resulting series will converge on (−∞, ∞). Thus we substitute (6-31) and its first two derivatives in the given equation to obtain

$$\sum_{k=2}^{\infty} 3k(k-1)a_k x^{k-2} - \sum_{k=1}^{\infty} k a_k x^{k-1} + \sum_{k=0}^{\infty} a_k x^{k+1} + \sum_{k=0}^{\infty} a_k x^k = 1,$$

or, after shifting indices of summation,

$$\sum_{k=0}^{\infty} 3(k+2)(k+1)a_{k+2} x^k - \sum_{k=0}^{\infty} (k+1)a_{k+1} x^k + \sum_{k=1}^{\infty} a_{k-1} x^k + \sum_{k=0}^{\infty} a_k x^k = 1.$$

Collecting terms we have

$$6a_2 - a_1 + a_0 + \sum_{k=1}^{\infty} \left[3(k+2)(k+1)a_{k+2} - (k+1)a_{k+1} + a_k + a_{k-1}\right] x^k = 1,$$

and it now follows that

$$6a_2 - a_1 + a_0 = 1,$$
$$3(k+2)(k+1)a_{k+2} - (k+1)a_{k+1} + a_k + a_{k-1} = 0, \qquad k \ge 1.$$

In addition, by setting $x = 0$ in (6-31) and its first derivative, and using the given initial conditions, we find that $a_0 = a_1 = 0$. Hence

$$a_0 = 0, \qquad a_1 = 0, \qquad a_2 = \tfrac{1}{6},$$

and, in general,

$$a_{k+2} = \frac{a_{k+1}}{3(k+2)} - \frac{a_k + a_{k-1}}{3(k+1)(k+2)}, \qquad k \ge 1.$$

Here, for the first time, we are confronted with a recurrence relation which cannot be solved for $a_k$ as a function of k alone. As suggested above, this is not at all uncommon, and when it occurs we have no choice but to compute a few terms of the series involved and then use Theorem 6-4 to determine a (minimal) interval of convergence. In point of fact, this is usually sufficient for most purposes, since it is always possible to develop the series to the point where it can be used in numerical work.

In the present case we have

$$a_3 = \tfrac{1}{54}, \qquad a_4 = -\tfrac{1}{324}, \qquad a_5 = -\tfrac{4}{1215},$$

and

$$y(x) = \tfrac{1}{6}x^2 + \tfrac{1}{54}x^3 - \tfrac{1}{324}x^4 - \tfrac{4}{1215}x^5 + \cdots,$$

an expression which already furnishes an excellent approximation to the required solution in the interval (−1, 1).
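When a recurrence such as this must be iterated, exact rational arithmetic removes any question of roundoff. The following check (our illustration, not part of the text) reproduces the coefficients just listed.

```python
from fractions import Fraction

# Coefficients for 3y'' - y' + (x+1)y = 1, y(0) = y'(0) = 0:
# a_0 = a_1 = 0, a_2 = 1/6, and for k >= 1
# a_{k+2} = a_{k+1}/(3(k+2)) - (a_k + a_{k-1})/(3(k+1)(k+2)).
N = 8
a = [Fraction(0)] * N
a[2] = Fraction(1, 6)
for k in range(1, N - 2):
    a[k + 2] = a[k + 1] / (3 * (k + 2)) \
             - (a[k] + a[k - 1]) / (3 * (k + 1) * (k + 2))

print(a[:6])
```

The loop yields $a_3 = 1/54$, $a_4 = -1/324$, $a_5 = -4/1215$, matching the expansion above.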

Example 2. Solve the initial-value problem

$$xy'' + y' + xy = 0; \qquad y(1) = 0, \quad y'(1) = -1. \tag{6-32}$$

Here we begin by making the change of variable $u = x - 1$ to shift the computations to the origin. Then

$$\frac{dy}{du} = \frac{dy}{dx}, \qquad \frac{d^{2}y}{du^{2}} = \frac{d^{2}y}{dx^{2}}, \tag{6-33}$$

and (6-32) becomes

$$(u+1)\frac{d^{2}y}{du^{2}} + \frac{dy}{du} + (u+1)y = 0; \qquad y(0) = 0, \quad y'(0) = -1.$$

Substituting

$$y(u) = \sum_{k=0}^{\infty} a_k u^k$$

in this equation we obtain

$$(u+1)\sum_{k=2}^{\infty} k(k-1)a_k u^{k-2} + \sum_{k=1}^{\infty} k a_k u^{k-1} + (u+1)\sum_{k=0}^{\infty} a_k u^k = 0,$$

or, after the usual simplifications,

$$a_0 + a_1 + 2a_2 + \sum_{k=1}^{\infty} \left[(k+2)(k+1)a_{k+2} + (k+1)^2 a_{k+1} + a_k + a_{k-1}\right] u^k = 0.$$

Thus

$$a_0 + a_1 + 2a_2 = 0,$$
$$(k+2)(k+1)a_{k+2} + (k+1)^2 a_{k+1} + a_k + a_{k-1} = 0, \qquad k \ge 1,$$

and since the initial conditions imply that $a_0 = 0$, $a_1 = -1$, we have

$$a_2 = \tfrac{1}{2}, \qquad a_3 = -\tfrac{1}{6}, \qquad a_4 = \tfrac{1}{6}, \ldots.$$

Hence

$$y(u) = -u + \tfrac{1}{2}u^2 - \tfrac{1}{6}u^3 + \tfrac{1}{6}u^4 + \cdots,$$

and, setting $u = x - 1$,

$$y(x) = 1 - x + \tfrac{1}{2}(x-1)^2 - \tfrac{1}{6}(x-1)^3 + \tfrac{1}{6}(x-1)^4 + \cdots.$$

From Theorem 6-4 we conclude that this series converges at least in the interval $0 < x < 2$, since this is the largest interval centered at $x = 1$ in which the equation is normal.
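As in the previous example, the shifted recurrence is convenient to verify in exact arithmetic (an illustration of ours, not part of the text):

```python
from fractions import Fraction

# Coefficients for (u+1)y'' + y' + (u+1)y = 0 with a0 = 0, a1 = -1:
# 2a_2 = -(a0 + a1), and for k >= 1
# (k+2)(k+1) a_{k+2} = -((k+1)^2 a_{k+1} + a_k + a_{k-1}).
N = 8
a = [Fraction(0)] * N
a[0], a[1] = Fraction(0), Fraction(-1)
a[2] = -(a[0] + a[1]) / 2
for k in range(1, N - 2):
    a[k + 2] = -((k + 1)**2 * a[k + 1] + a[k] + a[k - 1]) \
             / ((k + 2) * (k + 1))

print(a[:5])
```

The loop reproduces $a_2 = 1/2$, $a_3 = -1/6$, $a_4 = 1/6$, in agreement with the series for $y(u)$ above.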

Example 3. For our final example we solve the initial-value problem

$$y'' - e^x y = 0; \qquad y(0) = y'(0) = 1, \tag{6-34}$$

involving an equation with nonpolynomial coefficients.


As usual we start by substituting

$$y(x) = \sum_{k=0}^{\infty} a_k x^k$$

in the given equation. This yields

$$\sum_{k=2}^{\infty} k(k-1)a_k x^{k-2} - e^x \sum_{k=0}^{\infty} a_k x^k = 0,$$

which we rewrite as

$$\sum_{k=0}^{\infty} (k+2)(k+1)a_{k+2} x^k - \left(\sum_{k=0}^{\infty} \frac{x^k}{k!}\right)\left(\sum_{k=0}^{\infty} a_k x^k\right) = 0.* \tag{6-35}$$

In order to put this expression in the form required by the method of undetermined coefficients (i.e., a power series in x), we now use the theorem which asserts that power series may be multiplied according to the usual rules of algebra within their common intervals of convergence. Thus

$$(a_0 + a_1 x + a_2 x^2 + \cdots)\left(1 + x + \frac{x^2}{2} + \cdots\right) = a_0 + (a_0 + a_1)x + \left(\frac{a_0}{2} + a_1 + a_2\right)x^2 + \cdots.$$

Substituting this expression in (6-35) we obtain

$$(k+2)(k+1)a_{k+2} - \sum_{j=0}^{k} \frac{a_j}{(k-j)!} = 0, \qquad k \ge 0,$$

and it follows that

$$a_{k+2} = \frac{1}{(k+1)(k+2)} \sum_{j=0}^{k} \frac{a_j}{(k-j)!}, \qquad k \ge 0.$$

In particular,

$$a_2 = \frac{a_0}{2}, \qquad a_3 = \frac{a_0 + a_1}{6}, \qquad a_4 = \frac{1}{12}\left(\frac{a_0}{2} + a_1 + a_2\right) = \frac{a_0 + a_1}{12},$$

* Recall that $e^x = 1 + x + x^2/2! + x^3/3! + \cdots$.

etc., and in principle all of the $a_k$ can now be computed in terms of $a_0$ and $a_1$. Finally, since the initial conditions in effect imply that $a_0 = a_1 = 1$, we have

$$a_2 = \tfrac{1}{2}, \qquad a_3 = \tfrac{1}{3}, \qquad a_4 = \tfrac{1}{6},$$

and

$$y(x) = 1 + x + \frac{x^2}{2} + \frac{x^3}{3} + \frac{x^4}{6} + \cdots.$$
The validity of the above computations obviously depends upon the fact that
convergent power series can be multiplied as though they were polynomials. Since
we shall have occasion to refer to this result again we now state it precisely and
formally as

Theorem 6-5. Let

$$\sum_{k=0}^{\infty} a_k x^k \quad\text{and}\quad \sum_{k=0}^{\infty} b_k x^k \tag{6-36}$$

be convergent in the interval $|x| < r$, $r > 0$. Then the series

$$\sum_{k=0}^{\infty} c_k x^k, \tag{6-37}$$

with

$$c_k = \sum_{j=0}^{k} a_j b_{k-j}, \tag{6-38}$$

known as the Cauchy product of the series in (6-36), also converges for $|x| < r$, and

$$\left(\sum_{k=0}^{\infty} a_k x^k\right)\left(\sum_{k=0}^{\infty} b_k x^k\right) = \sum_{k=0}^{\infty} c_k x^k \tag{6-39}$$

for all x in this interval.

When phrased in terms of analytic functions this theorem asserts that the product of two functions f and g which are analytic on an interval I is itself analytic on I, and that its power series expansion about any point $x_0$ in I is the Cauchy product of the power series expansions of f and g about $x_0$.*

* For a proof the reader should consult Buck, Advanced Calculus, 2nd Ed., McGraw-Hill, New York, 1965.
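Formula (6-38) translates directly into code. As a check (ours, not the text's), multiplying the exponential series by itself must reproduce the series for $e^{2x}$, whose coefficients are $2^k/k!$.

```python
from fractions import Fraction
from math import factorial

def cauchy_product(a, b):
    """Coefficients c_k = sum_{j=0}^{k} a_j b_{k-j} of (6-38)."""
    n = min(len(a), len(b))
    return [sum(a[j] * b[k - j] for j in range(k + 1)) for k in range(n)]

exp_coeffs = [Fraction(1, factorial(k)) for k in range(10)]
c = cauchy_product(exp_coeffs, exp_coeffs)
print(c[:5])
```

Indeed $c_k = \sum_j 1/(j!(k-j)!) = 2^k/k!$ by the binomial theorem, and the computed list agrees.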

EXERCISES

Find the first four nonzero terms in the series expansion of the solution of each of the following initial-value problems, and determine a (minimal) interval of convergence for the series.

1. $y'' + (\sin x)y = 0$;  $y(0) = 1$, $y'(0) = 0$.
2. $2y'' - y' + (x+1)y = 0$;  $y(0) = 0$, $y'(0) = 1$.
3. $(x+1)y'' + y' + xy = 0$;  $y(0) = y'(0) = -1$.
4. $2y'' - xy = \cos x$;  $y(0) = y'(0) = 0$.
5. $(1 - 4x^2)y'' + x^2y' - 2y = 0$;  $y(0) = y'(0) = 1$.
6. $(\cos x)y'' + 2xy = 0$;  $y(0) = 0$, $y'(0) = 1$.
7. $(x - 3)y'' + x^2y' + y = 0$;  $y(0) = 0$, $y'(0) = 6$.
8. $[1 + \ln(1+x)]y'' - xy' + y = \sin x$;  $y(0) = 0$, $y'(0) = 1$.
9. $xy''' - y = 0$;  $y(3) = y'(3) = 0$, $y''(3) = 9$.
10. $xy'' + y' + xy = 0$;  $y(1) = 0$, $y'(1) = -1$.
11. $3y''' - xy' + x^2y = e^x$;  $y(0) = y'(0) = 0$, $y''(0) = 1$.
12. $3xy'' - y' = 0$;  $y(-2) = 1$, $y'(-2) = -1$.
13. Find the first four nonzero terms in the power series expansion of

$$\int_0^x e^{-t^2}\,dt.$$

What is the interval of convergence of the power series expansion of this integral?
14. Let

$$y(x) = \sum_{k=0}^{\infty} a_k x^k$$

be a solution of the equation

$$y'' + p(x)y' + q(x)y = 0$$

in the interval $|x| < r$, $r > 0$, and suppose that

$$p(x) = \sum_{k=0}^{\infty} p_k x^k, \qquad q(x) = \sum_{k=0}^{\infty} q_k x^k$$

in this interval. Prove that

$$a_{k+2} = -\frac{1}{(k+1)(k+2)} \sum_{j=0}^{k} \left[(j+1)p_{k-j} a_{j+1} + q_{k-j} a_j\right]$$

for $k = 0, 1, 2, \ldots$.
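The formula in Exercise 14 can at least be spot-checked mechanically (an illustration, not a solution of the exercise): for $y'' + y = 0$ we have $p(x) = 0$, $q(x) = 1$, and with $a_0 = 1$, $a_1 = 0$ the recurrence must generate the cosine series.

```python
from fractions import Fraction

def series_solution(p, q, a0, a1, n):
    """Coefficients of y'' + p(x)y' + q(x)y = 0 from the recurrence of
    Exercise 14; p and q are coefficient lists, padded with zeros."""
    p = list(p) + [Fraction(0)] * n
    q = list(q) + [Fraction(0)] * n
    a = [Fraction(0)] * n
    a[0], a[1] = Fraction(a0), Fraction(a1)
    for k in range(n - 2):
        s = sum((j + 1) * p[k - j] * a[j + 1] + q[k - j] * a[j]
                for j in range(k + 1))
        a[k + 2] = -s / ((k + 1) * (k + 2))
    return a

# y'' + y = 0, y(0) = 1, y'(0) = 0  ->  cosine coefficients
a = series_solution([], [Fraction(1)], 1, 0, 8)
print(a)
```

With these data the recurrence collapses to $a_{k+2} = -a_k/((k+1)(k+2))$, producing $1, 0, -1/2!, 0, 1/4!, \ldots$ as expected.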

euclidean spaces

7-1 INNER PRODUCTS

Much of the content of elementary geometry depends upon the ability to measure
distance between points. In this chapter we shall show how distance, together
with such related concepts as length and angular measure, can be generalized to
arbitrary real vector spaces. These so-called "metric" concepts are the foundation
of Euclidean geometry, and from them flow a wealth of results in both geometry
and analysis.
In order to introduce a metric into a real vector space we must first choose a
unit of distance for our measurements. This can be done most easily by defining
what is known as an inner product on a vector space. The logic behind taking the

notion of inner product as primitive will become compelling once we have used it
to define length, angular measure, and distance. For each of these concepts then
appears as a natural consequence of the notion of inner product, and the student
is led to appreciate them as an elaboration of a single idea.

Definition 7-1. An inner product is said to be defined on a real vector space V if with each pair of vectors x, y in V there is associated a real number x · y in such a way that

x · y = y · x, (7-1)
(αx) · y = α(x · y) for every real number α, (7-2)
(x₁ + x₂) · y = x₁ · y + x₂ · y, (7-3)
x · x ≥ 0, and x · x = 0 if and only if x = 0. (7-4)

A vector space with an inner product is known as a Euclidean or inner product space, and the real number x · y (read "x dot y") is called the inner product of x and y.*

* Some authors call x · y the "scalar product" of x and y. We shall avoid this terminology, however, because of the possibility of confusing it with scalar multiplication as introduced in Chapter 1.

If we apply Eq. (7-1) to (7-2) and (7-3), we see that an inner product also satisfies

x · (y₁ + y₂) = x · y₁ + x · y₂ and x · (αy) = α(x · y). (7-5)

Even more generally, we have

(α₁x₁ + α₂x₂) · (β₁y₁ + β₂y₂) = α₁β₁(x₁ · y₁) + α₁β₂(x₁ · y₂) + α₂β₁(x₂ · y₁) + α₂β₂(x₂ · y₂), (7-6)

where α₁, α₂, β₁, β₂ are arbitrary scalars. For future reference we also note that

x · y = 0 whenever x = 0 or y = 0. (7-7)

[The proofs of (7-6) and (7-7) have been left to the reader in Exercises 6 and 7 below.]
Equation (7-1) in the above definition asserts that an inner product is a commutative, or symmetric, operation on pairs of vectors. Equation (7-2) may be interpreted as an associativity requirement, this time with respect to scalars, while (7-3) requires that the operation be distributive. These two conditions, together with their analogs given in (7-5), are said to make the inner product bilinear. Finally, (7-4) is referred to by saying that an inner product is positive definite, the allusion here being to the fact that the inner product of a vector with itself is always greater than zero unless the vector involved is the zero vector. Thus one frequently hears an inner product called a real-valued, symmetric, bilinear, positive definite operation on pairs of vectors.

Example 1. Let $x = (x_1, \ldots, x_n)$ and $y = (y_1, \ldots, y_n)$ be vectors in $\mathbb{R}^n$, and define x · y by

$$x \cdot y = x_1 y_1 + \cdots + x_n y_n. \tag{7-8}$$

Then $\mathbb{R}^n$ becomes a Euclidean space, and as such is called Euclidean n-space. In $\mathbb{R}^2$ and $\mathbb{R}^3$ this inner product is none other than the familiar "dot product" of physics, where the definition is usually phrased geometrically as the product of the length of x, the length of y, and the cosine of the angle between them. The equivalence of these definitions will become apparent in the next section (see Eq. 7-21).
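A direct implementation of (7-8) makes the axioms of Definition 7-1 easy to check on sample vectors (an illustrative sketch of ours, not from the text):

```python
def dot(x, y):
    """The inner product (7-8) on R^n."""
    return sum(xi * yi for xi, yi in zip(x, y))

x, y, z = [1.0, -2.0, 3.0], [4.0, 0.5, -1.0], [2.0, 2.0, 2.0]
a, b = 3.0, -0.5

print(dot(x, y), dot(y, x))            # symmetry (7-1)
# bilinearity in the first slot, as in (7-6):
print(dot([a*xi + b*yi for xi, yi in zip(x, y)], z))
print(a * dot(x, z) + b * dot(y, z))   # the same value
```

Positive definiteness (7-4) also holds: dot(x, x) is a sum of squares, zero only for the zero vector.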

Example 2. This example furnishes the first intimation of things to come in


analysis. The vector space in question is Q[a, b], the space of all continuous
functions on the interval [a, b], with f g defined by •

b
fg = [ f(x)g(x)dx. (7-9)
Ja

258 EUCLIDEAN SPACES | CHAP. 7

It is not difficult to show that (7-9) satisfies all of the requirements for an inner
product. Indeed, it is perfectly obvious that f · g is both real-valued and sym-
metric, while bilinearity follows from the equations

∫ₐᵇ αf(x)g(x) dx = α ∫ₐᵇ f(x)g(x) dx,   α a real number,

and

∫ₐᵇ [f₁(x) + f₂(x)]g(x) dx = ∫ₐᵇ f₁(x)g(x) dx + ∫ₐᵇ f₂(x)g(x) dx.

Finally, if we recall that the integral of a non-negative function is non-negative,


and that the integral of a continuous non-negative function is zero if and only if

the function is identically zero (see Exercise 12), we see that

f · f = ∫ₐᵇ f(x)² dx ≥ 0,

and

f · f = 0 if and only if f = 0.

Hereafter, whenever we refer to ℝⁿ or C[a, b] as Euclidean spaces, we shall
assume that we are using the inner products defined in the above examples unless
express mention is made to the contrary.
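Both inner products are easy to experiment with numerically. The following sketch (ours, obviously not part of the text) computes (7-8) directly and approximates the integral in (7-9) by a midpoint Riemann sum; the function names are our own.

```python
import numpy as np

def dot(x, y):
    # the inner product (7-8) on R^n
    return sum(xi * yi for xi, yi in zip(x, y))

def inner_c(f, g, a, b, n=100_000):
    # the inner product (7-9) on C[a, b], approximated numerically
    h = (b - a) / n
    xs = a + h * (np.arange(n) + 0.5)   # midpoints of n subintervals
    return h * np.sum(f(xs) * g(xs))

print(dot((1, 2, 3), (4, -1, 2)))                          # 8
# f(x) = x, g(x) = 1 - x on [0, 1]; the exact value is 1/6
print(inner_c(lambda x: x, lambda x: 1 - x, 0.0, 1.0))
```

Symmetry and bilinearity can be spot-checked by evaluating both sides of (7-1) through (7-3) with such routines.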

Example 3. Let r be a non-negative function in C[a, b] which vanishes at most
at finitely many points in the interval [a, b]. Then

f · g = ∫ₐᵇ f(x)g(x)r(x) dx   (7-10)

also defines an inner product on C[a, b]. The function r is called the weight
function for this inner product, and when r is identically equal to 1, we note that
(7-10) reduces to the inner product defined in Example 2. We shall meet this
inner product again when we study boundary value problems for differential
equations.

Example 4. Let 𝒫 be the vector space consisting of all polynomials in x with
real coefficients. In most of our prior discussions of 𝒫 we have considered its
vectors as objects in and of themselves, and have ignored their interpretation as
real-valued continuous functions defined on the entire real line or any of its sub-
intervals. From now on, however, we shall consider the members of 𝒫 as poly-
nomial functions in order that 𝒫 may be viewed as a subspace of C[a, b] for any
pair of real numbers a < b. In this case we must define the inner product on 𝒫
by (7-9) or (7-10) according as one or the other of these products is being used
in C[a, b]. (Similar remarks apply to each of the vector spaces 𝒫ₙ.)

In the preceding example we mentioned the notion of a subspace of a Euclidean
space. The relevant definition is obvious: a Euclidean space W is a subspace of a
Euclidean space V if W is a subspace of V as defined in Chapter 1, and if the inner
product defined on W coincides with the inner product defined on V. It is clear
that an arbitrary subspace W of a Euclidean space V is always a subspace of V
in this sense provided we use as the inner product on W the one that is defined
on V. However, it is equally clear that it is possible to furnish W with an inner
product which differs from that defined on V, in which case W, as a Euclidean
space, is not a subspace of V. The reader should be able to construct examples
involving 𝒫 and C[a, b]. We furnish one in Exercise 10 below.

EXERCISES

1. Find x · y for each of the following pairs of vectors in ℝ³.

(a)x = (4,2,-1) (b)x = (f,i 1)


y = (4,-2,3) y = (-±4,2)
(c)x = (l,f, -3) (d)x = (-5,0,1)
y = (I,*, |) y = (1,0,-5)
(e) x = (2, -2, 1)

y = (i,i,3)
2. Find f · g for each of the following pairs of vectors in C[0, 1]. (Recall that the inner
product is defined by (7-9).)
(a) f(x) = x            (b) f(x) = x
    g(x) = 1 − x²           g(x) = 1 − x
(c) f(x) = sin πx/2     (d) f(x) = eˣ
    g(x) = cos πx/2         g(x) = sin x
(e) f(x) = |x − ½|
    g(x) = ½ − |x − ½|

3. Find f · g for each of the following pairs of vectors in C[0, 1] when the inner product
is defined with respect to the weight function r(x) = eˣ (see (7-10)).
(a) f(x) = 1 − 2x                  (b) f(x) = x²
    g(x) = e⁻ˣ                         g(x) = eˣ
(c) f(x) = x                       (d) f(x) = e⁻ˣ/² sin πx/2
    g(x) = 1 − x                       g(x) = e⁻ˣ/² sin 3πx/2
(e) f(x) = cos πx/2
    g(x) = 1
4. Prove that (7-8) defines an inner product on ℝⁿ.

5. Let x = (x₁, x₂), y = (y₁, y₂) be arbitrary vectors in ℝ². Determine which of the
following define an inner product on ℝ².
(a) x · y = x₁y₁
(b) x · y = 2(x₁y₁ + x₂y₂)
(c) x · y = −2(x₁y₁ + x₂y₂)
(d) x · y = (x₁y₁)² + (x₂y₂)²
(e) x · y = x₁y₁ + x₁y₂ + x₂y₁ + 2x₂y₂
(f) x · y = x₁y₂ + x₂y₁
6. Prove that Eq. (7-6) holds in any Euclidean space.
7. Prove (7-7). [Hint: Consider (0 + 0) · y.]

8. Let V be a real vector space, and set x · y = 0 for every pair of vectors x, y in V.
Is V a Euclidean space? Why?
9. (a) Let a < a₁ < b₁ < b be real numbers, and define f · g in C[a, b] by

f · g = ∫_{a₁}^{b₁} f(x)g(x) dx.

Is C[a, b] then a Euclidean space? Why?
(b) Answer the same question for 𝒫. Explain fully.

10. In the space of polynomials 𝒫 let

p · q = a₀b₀ + a₁b₁ + · · · + aₙbₙ,

where p(x) = a₀ + a₁x + · · · + aₙxⁿ and q(x) = b₀ + b₁x + · · · + bₙxⁿ.
(Note that by adding terms with zero coefficients we can make any two polynomials
in 𝒫 have the same apparent degree, as above.)

(a) Prove that this definition yields an inner product on 𝒫.
(b) Is 𝒫 with this inner product a subspace of the Euclidean space C[a, b]? Why?
11. Let V be a Euclidean space with inner product x · y.

(a) For each pair of vectors x, y in V let x ∘ y be defined by

x ∘ y = 2(x · y).

Prove that this definition yields an inner product on V.

(b) Let a be an arbitrary real number, and define x ∘ y by

x ∘ y = a(x · y).

Determine those values of a for which this definition yields an inner product on V.

*12. (a) Let f be a continuous function on the interval [a, b], and suppose f(x₀) > 0 for
some x₀ in this interval. Use the definition of continuity to prove that f(x) > 0
for all values of x in some subinterval of [a, b] containing the point x₀.

(b) Use the result in (a) to prove that in C[a, b], f · f ≥ 0, and that f · f = 0 if and
only if f = 0.

13. Prove that (7-10) defines an inner product on C[a, b].

14. Will (7-10) define an inner product on C[a, b] if we merely require r to be non-
negative on [a, b]? Why?

15. Let r be any function in C[a, b] which vanishes for at most finitely many values in the
interval [a, b]. Prove that

f · g = ∫ₐᵇ f(x)g(x)|r(x)| dx

defines an inner product on C[a, b].
2
*16. Let x = (x₁, x₂) and y = (y₁, y₂) be vectors in ℝ², and let

(a₁₁  a₁₂)
(a₂₁  a₂₂)

be a 2 × 2 matrix whose entries are real numbers. Set

x · y = a₁₁x₁y₁ + a₁₂x₁y₂ + a₂₁x₂y₁ + a₂₂x₂y₂.   (7-11)

(a) Show that this definition satisfies Eqs. (7-2) and (7-3) of Definition 7-1 for
every 2 × 2 matrix (aᵢⱼ).
(b) Show that Eq. (7-1) is satisfied if and only if a₁₂ = a₂₁ [i.e., if and only if (aᵢⱼ)
is a symmetric matrix], and hence deduce that (7-11) defines an inner product on ℝ²
if and only if (aᵢⱼ) is a 2 × 2 symmetric matrix such that

a₁₁x₁² + (a₁₂ + a₂₁)x₁x₂ + a₂₂x₂²

is non-negative for every choice of x₁ and x₂, and is zero if and only if x₁ = x₂ = 0.
(c) Find a matrix (aᵢⱼ) which reduces (7-11) to the ordinary inner product on ℝ².
(d) Determine which of the following matrices can be used to define an inner product
on ℝ²:

(::)•(-::>(::)
*17. Generalize the preceding exercise to ℝⁿ.

7-2 LENGTH, ANGULAR MEASURE, DISTANCE


In this section we fulfill our promise to define length, angular measure, and dis-
tance in terms of the inner product on a Euclidean space. Each of these concepts
has a well-defined meaning in Euclidean 2-space, and it is reasonable to demand
that any definition we adopt reduce to the familiar one in ℝ². Thus we can obtain
acceptable definitions by rewriting the relevant formulas from analytic geometry
in terms of the inner product on ℝ², and then adopting the results as definitions
for arbitrary Euclidean spaces.


Turning first to the notion of length, let x = (x₁, x₂) be any vector in ℝ².
Then the length of x, denoted by ||x||, is the non-negative real number

||x|| = √(x₁² + x₂²)

(see Fig. 7-1). But this expression may be rewritten in terms of the inner product
on ℝ² (Formula 7-8) as

||x|| = √(x · x),

and we have our first definition.

FIGURE 7-1

Definition 7-2. The length (or norm) of a vector x in a Euclidean space is
defined to be the non-negative real number

||x|| = √(x · x).   (7-12)

Thus, in particular, the length of a vector x = (x₁, . . . , xₙ) in ℝⁿ is

||x|| = √(x₁² + · · · + xₙ²),   (7-13)

while the length of a vector f in C[a, b] is

||f|| = (∫ₐᵇ f(x)² dx)^{1/2}.   (7-14)
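Formulas (7-13) and (7-14) are easily checked by machine; in this sketch of ours the quadrature routine (a midpoint Riemann sum) is our own device, not part of the text.

```python
import numpy as np

def norm_rn(x):
    # the length (7-13) of a vector in R^n
    return np.sqrt(sum(xi * xi for xi in x))

def norm_c(f, a, b, n=100_000):
    # the length (7-14) of f in C[a, b], with the integral approximated
    # by a midpoint Riemann sum
    h = (b - a) / n
    xs = a + h * (np.arange(n) + 0.5)
    return np.sqrt(h * np.sum(f(xs) ** 2))

print(norm_rn((1, -2, 2, 0)))          # 3.0
# ||f|| for f(x) = x on [0, 1]: exactly 1/sqrt(3)
print(norm_c(lambda x: x, 0.0, 1.0))
```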

Next, we observe that if x and y are any two nonzero vectors in ℝ², the formula

cos θ = (x · y)/(||x|| ||y||),   0 ≤ θ ≤ π,   (7-15)

is an immediate consequence of the law of cosines (see Exercise 6). But the ex-
pression

(x · y)/(||x|| ||y||)

is also meaningful in an arbitrary Euclidean space, a fact which suggests con-
sidering (7-15) as a reasonable candidate for the general definition of cos θ.

Before acting on this suggestion, however, we must establish the inequality

|x · y|/(||x|| ||y||) ≤ 1   (7-16)

for every pair of nonzero vectors in a Euclidean space, since, of course, any
definition of cos θ must satisfy the inequality −1 ≤ cos θ ≤ 1. This fact will
emerge as a consequence of the following important result, known as the Schwarz
or Cauchy-Schwarz inequality.

Theorem 7-1. (Schwarz inequality.) If x and y are any two vectors in a
Euclidean space, then

(x · y)² ≤ (x · x)(y · y).   (7-17)

Proof. We first observe that this inequality is immediate if either x or y is the
zero vector, since then both sides of (7-17) are zero [see (7-7)]. Thus it suffices
to consider the case in which x and y are nonzero. Here we use (7-6) to expand
(αx − βy) · (αx − βy), where α and β are arbitrary real numbers. By (7-4) we
have

0 ≤ (αx − βy) · (αx − βy)
  = α²(x · x) − 2αβ(x · y) + β²(y · y),

whence

2αβ(x · y) ≤ α²(x · x) + β²(y · y).

We now set

α = √(y · y)   and   β = √(x · x).

This gives

2√(x · x) √(y · y) (x · y) ≤ 2(x · x)(y · y),

or

x · y ≤ √(x · x) √(y · y).

Since the same argument applies with x replaced by −x, the left-hand side may be
replaced by |x · y|. Squaring, we obtain (7-17). |

In ℝⁿ the Schwarz inequality assumes the form

(Σᵢ₌₁ⁿ xᵢyᵢ)² ≤ (Σᵢ₌₁ⁿ xᵢ²)(Σᵢ₌₁ⁿ yᵢ²),   (7-18)

while in C[a, b] it becomes

(∫ₐᵇ f(x)g(x) dx)² ≤ (∫ₐᵇ f(x)² dx)(∫ₐᵇ g(x)² dx).   (7-19)

The first of these inequalities is valid for any collection x₁, . . . , xₙ, y₁, . . . , yₙ
of real numbers, and is usually called Cauchy's inequality. It is worth remembering
since it is often useful in deducing other arithmetic inequalities (see Exercises 9
through 11 below).
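Cauchy's inequality (7-18) is easy to spot-check on random data; the brief routine below is our own illustration, not part of the text.

```python
import random

def check_cauchy(trials=1000, dim=5, seed=0):
    # verify (x1*y1 + ... + xn*yn)^2 <= (sum xi^2)(sum yi^2) on random vectors
    rng = random.Random(seed)
    for _ in range(trials):
        x = [rng.uniform(-10, 10) for _ in range(dim)]
        y = [rng.uniform(-10, 10) for _ in range(dim)]
        lhs = sum(a * b for a, b in zip(x, y)) ** 2
        rhs = sum(a * a for a in x) * sum(b * b for b in y)
        assert lhs <= rhs + 1e-6   # small slack for floating-point rounding
    return True

print(check_cauchy())   # True
```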
In the notation of Definition 7-2 the Schwarz inequality becomes

|x · y| ≤ ||x|| ||y||,   (7-20)

and asserts that the absolute value of the inner product of two vectors does not
exceed the product of the lengths of the vectors. Thus

|x · y|/(||x|| ||y||) ≤ 1

whenever x and y are nonzero. But this is just another way of writing (7-16),
the inequality needed to justify using (7-15) as a definition of cos θ, and we can
now state

Definition 7-3. If x and y are nonzero vectors in a Euclidean space we
define the cosine of the angle between them to be

cos θ = (x · y)/(||x|| ||y||).   (7-21)

If, on the other hand, one of the vectors is zero, we set cos θ = 0.

It goes without saying that in defining the cosine of the angle between x and y
we have, by implication, also defined the angle in question; just take the principal
value of the inverse cosine.
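In computational terms, Definition 7-3 amounts to the following sketch (ours); the clamping step guards against rounding error pushing the quotient just outside [−1, 1], and the zero-vector convention of the definition gives θ = π/2.

```python
import math

def angle(x, y):
    # the angle whose cosine is given by (7-21)
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    if nx == 0.0 or ny == 0.0:
        return math.pi / 2            # cos(theta) = 0 by convention
    c = sum(a * b for a, b in zip(x, y)) / (nx * ny)
    return math.acos(max(-1.0, min(1.0, c)))   # principal value of arccos

print(angle((1, 0), (0, 1)))   # pi/2
print(angle((1, 1), (1, 0)))   # pi/4
```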

At this point all that remains of our original program is to define the distance
between any two points (i.e., vectors) in a Euclidean space. Again this is done
simply by copying the definition from ℝ², where the distance between x and y
is the length of the vector x − y (Fig. 7-2). Thus

Definition 7-4. The distance between two vectors x and y in a Euclidean
space is, by definition,

d(x, y) = ||x − y||.   (7-22)

But is this a reasonable definition of the term "distance"? In order to answer this
question we must first decide what properties we require of distance in general.
On this score mathematicians are in agreement, having decided as follows. The
distance between two points must be a non-negative real number which is zero if
and only if the points coincide. It must be independent of the order in which the
points are considered, and finally the triangle inequality, famous from plane
geometry, must be satisfied. Thus in order to justify using the term "distance" in
Definition 7-4 we must show that d(x, y) is a real number satisfying

d(x, y) ≥ 0,   (7-23)

d(x, y) = 0 if and only if x = y,   (7-24)

d(x, y) = d(y, x),   (7-25)

d(x, y) + d(y, z) ≥ d(x, z) for any three vectors x, y, z.   (7-26)

The first three of these properties follow immediately from the definition of
length and the axioms governing an inner product. The last, however, is not
quite so obvious. To prove it we first establish an inequality which is of some

importance in its own right.

Lemma 7-1. If x and y are arbitrary vectors in a Euclidean space, then

||x + y|| ≤ ||x|| + ||y||.   (7-27)

Proof.

||x + y|| = [(x + y) · (x + y)]^{1/2}
         = [(x · x) + 2(x · y) + (y · y)]^{1/2}
         ≤ [(x · x) + 2√(x · x) √(y · y) + (y · y)]^{1/2}   (by the Schwarz inequality)
         = [(√(x · x) + √(y · y))²]^{1/2}
         = √(x · x) + √(y · y)
         = ||x|| + ||y||. |

The triangle inequality follows at once from this result. Indeed,

||x − z|| = ||(x − y) + (y − z)|| ≤ ||x − y|| + ||y − z||,   (7-28)

which is precisely what we had to show.


Finally, we note that the distance function defined above also enjoys the follow-
ing agreeable properties:

d(ax, ay) = |a| d(x, y) for any real number a,   (7-29)

and

d(x + z, y + z) = d(x, y).   (7-30)

The proofs and geometric interpretations are left to the reader.
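Properties (7-23) through (7-30) can all be checked numerically as well; the sketch below (ours) tests the triangle inequality (7-26) and the translation invariance (7-30) on random triples of vectors in ℝ³.

```python
import math
import random

def dist(x, y):
    # the distance (7-22): the length of x - y
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

rng = random.Random(1)
for _ in range(100):
    x, y, z = ([rng.uniform(-5, 5) for _ in range(3)] for _ in range(3))
    assert dist(x, z) <= dist(x, y) + dist(y, z) + 1e-9   # (7-26)
    xt = [a + c for a, c in zip(x, z)]
    yt = [b + c for b, c in zip(y, z)]
    assert abs(dist(xt, yt) - dist(x, y)) < 1e-9          # (7-30)
print("all checks passed")
```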

EXERCISES

1. Find the length of each of the following vectors in ℝ⁴.

(a) (1, -2, 2, 0) (b) (i - V3, VX 1)


(c) (3,4,-3,1) (d)(f,
i,-|,-i)
(e) (2,-1,5,3)
2. Find the distance between each of the following pairs of points in ℝ⁴.

(a)x = (3,0, -1,5), y = (2,2,-1,3)


(b) x = (7, -4, 1, 3), y - (2, 1, -4, 8)
(c) x = (§,1,0,2), y = (if, -if)
(d)x = (1, -1,0,2), y = (2,1,1,0)
v.ejx =
(tj, 2> A l)j y = ( q> 4» -U 2">

3. Compute ||f|| for each of the following vectors in C[0, 1].
(a) f(x) = x          (b) f(x) = e^{x/2}
(c) f(x) = 1 − x²     (d) f(x) = sin πx/2
(e) f(x) = ln (x + 1)

4. Find the angle between each of the following pairs of vectors in ℝ³.

(a)x - (1,1,1), y = (i-1,1)


(b)x = (i,l, -2), y = (2,4,-8)
(c)x = (1, -1,0), y_= (2,-1,2)
(d)x = (0,V39,Vll), y = (1,5,0)
(e) x = (-3, -1, 0), y = (1, 2, - V5)
5. (a) Find the cosine of the angle between each of the following pairs of vectors in
𝒫₃ if the inner product is p · q = ∫₋₁¹ p(x)q(x) dx.

(i) 1, x     (ii) x, x²     (iii) x, 1 − x

(b) Repeat (a), this time using the inner product p · q = ∫₀¹ p(x)q(x) dx.
6. Use the law of cosines to establish that

cos θ = (x · y)/(||x|| ||y||)

for any pair of nonzero vectors x, y in ℝ² (see Fig. 7-3).

7. (a) Compute f · g for each of the following pairs of vectors in C[−π, π]:

(i) f(x) = sin mx, g(x) = sin nx,
(ii) f(x) = sin mx, g(x) = cos nx,
(iii) f(x) = cos mx, g(x) = cos nx,

where m and n are arbitrary non-negative integers.

(b) What can you say about the functions

1, sin x, cos x, sin 2x, cos 2x, . . .

in C[−π, π] on the basis of the results in (a)?


8. Prove that the Schwarz inequality becomes an equality if and only if x and y are
linearly dependent.

9. Let a₁, . . . , aₙ be positive real numbers. Prove that

(a₁ + · · · + aₙ)(1/a₁ + · · · + 1/aₙ) ≥ n².

[Hint: Use Cauchy's inequality.]


*10. Let a, b, c be positive real numbers such that a + b + c = 1. Use Cauchy's
inequality to prove that

(1/a − 1)(1/b − 1)(1/c − 1) ≥ 8.
11. Prove that the following inequality holds for any collection of real numbers
a₁, . . . , aₙ:

(a₁ + · · · + aₙ)² ≤ n(a₁² + · · · + aₙ²).

12. Let t be an arbitrary real number, and consider the inequality

0 ≤ (tx − y) · (tx − y),

valid for any pair of vectors x and y in a Euclidean space. Expand this inner product,
and derive the Schwarz inequality by examining the discriminant of the resulting
quadratic inequality in t. (This is another very popular way of deriving the Schwarz
inequality.)

13. Prove that distance as defined in (7-22) satisfies (7-23) to (7-25).

*14. Set x = z in (7-26) and use (7-24) and (7-25) to deduce (7-23), thus showing that
this relation is actually implied by the other three.

15. Show that ||ax|| = |a| ||x|| for all real numbers a.

16. Prove that

d(ax, ay) = |a| d(x, y)

for all real numbers a, and that

d(x + z, y + z) = d(x, y).

17. Use (7-27) to deduce that

||x ± y|| ≥ | ||x|| − ||y|| |

for any pair of vectors x, y in a Euclidean space.


18. Prove that ||x + y|| = ||x|| + ||y|| if and only if y = ax or x = ay for some
real number a ≥ 0.

*19. Let

(a₁₁  a₁₂)
(a₂₁  a₂₂)

be a 2 × 2 matrix whose entries are real numbers, and suppose that (aᵢⱼ) is so chosen
that (7-11) is an inner product on ℝ² (see Exercise 16 in the preceding section).
Use the Schwarz inequality to deduce that

a₁₂² ≤ a₁₁a₂₂.

Conversely, show that if (aᵢⱼ) satisfies this inequality, and a₁₂ = a₂₁, then (7-11)
defines an inner product on ℝ².

7-3 ORTHOGONALITY
Two vectors in a Euclidean space are said to be orthogonal or perpendicular if the
cosine of the angle between them is zero. Referring to Definition 7-3 we see that
the zero vector is orthogonal to everything, and, in general, that x and y are orthog-
onal if and only if x · y = 0. In a moment we shall generalize the notion of
orthogonality somewhat, but first we prove a particularly celebrated theorem.

Theorem 7-2. (Pythagoras.) Two vectors x and y in a Euclidean space
are orthogonal if and only if

||x + y||² = ||x||² + ||y||².

Proof.

||x + y||² = (x + y) · (x + y)
           = x · x + 2(x · y) + y · y
           = ||x||² + 2(x · y) + ||y||².

Thus ||x + y||² = ||x||² + ||y||² if and only if x · y = 0, as asserted. |
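A quick numerical confirmation of this computation (the example vectors are our own choice, not the text's):

```python
import math

def norm(v):
    # the length ||v|| of Definition 7-2, for vectors given as tuples
    return math.sqrt(sum(a * a for a in v))

x, y = (1, -1, 0), (2, 2, 3)                       # x . y = 2 - 2 + 0 = 0
s = [a + b for a, b in zip(x, y)]                  # x + y = (3, 1, 3)
print(norm(s) ** 2, norm(x) ** 2 + norm(y) ** 2)   # both approximately 19
```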

We have appended two figures in illustration of this result (Figs. 7-4 and 7-5).
The first is certainly one of the most familiar and well chosen diagrams in math-
ematics, and needs no comment. The second, on the other hand, has the negative
virtue of conveying almost no information at all, and should stand as a warning to
the student to exercise geometric restraint in interpreting statements concerning
orthogonality.

FIGURE 7-4

This said, we continue the discussion by giving the following definition.

Definition 7-5. A set of vectors x₁, x₂, . . . , xₖ, . . . in a Euclidean space
is said to be an orthogonal set if xᵢ ≠ 0 for all i, and

xᵢ · xⱼ = 0   (7-31)

whenever i ≠ j. If, in addition,

xᵢ · xᵢ = 1   (7-32)

for each i, the set is said to be orthonormal.

Thus an orthogonal set is a set of mutually perpendicular nonzero vectors, while
an orthonormal set is an orthogonal set in which each of the vectors is of unit
length. For economy of notation when discussing orthonormal sets, Eqs. (7-31)
and (7-32) are frequently combined by writing

xᵢ · xⱼ = δᵢⱼ,   where δᵢⱼ = 0 if i ≠ j, and δᵢⱼ = 1 if i = j.   (7-33)

The symbol δᵢⱼ introduced here is called the Kronecker delta.

The distinction between an orthogonal and an orthonormal set is really trifling,
for if we replace each vector x in an orthogonal set by the "normalized" vector
x/||x|| of unit length, the resulting set is obviously orthonormal. The only reason
for introducing orthonormal sets at all is for the convenience which sometimes
results from working with unit vectors.
Before giving examples, we call attention to two points in the above definition.
The first is that every vector in an orthogonal (or orthonormal) set is nonzero;
the second is that we have placed no restriction on the number of vectors in such
sets. In particular, an orthogonal set may contain an infinite number of vectors
(see Example 3 below). Such sets will occur repeatedly in Chapters 9 and 11.

FIGURE 7-5 (f = 1, g = sin x, and f + g = 1 + sin x on [−π, π])

Example 1. In ℝ³ the vectors (1, 0, 0), (0, 2, 0), (0, 0, −$) form an orthogonal
set, while the standard basis vectors (1, 0, 0), (0, 1, 0), (0, 0, 1) form an ortho-
normal set. More generally, the set consisting of the standard basis vectors in
ℝⁿ is orthonormal.

Example 2. We define a trigonometric polynomial of degree 2n + 1 to be an
expression of the form

f(x) = a₀/2 + a₁ cos x + a₂ cos 2x + · · · + aₙ cos nx
            + b₁ sin x + b₂ sin 2x + · · · + bₙ sin nx,   (7-34)

where a₀, . . . , bₙ are real numbers, and aₙ ≠ 0, or bₙ ≠ 0, or both. Let 𝒯ₙ
denote the set of all trigonometric polynomials of degree ≤ 2n + 1, together
with the zero polynomial. We make 𝒯ₙ a Euclidean space by defining addition
and scalar multiplication of trigonometric polynomials termwise, as with ordinary
polynomials, and an inner product by

f · g = ∫_{−π}^{π} f(x)g(x) dx.   (7-35)

The set of functions

1, cos x, sin x, . . . , cos nx, sin nx   (7-36)

is an orthogonal set in 𝒯ₙ since, for non-negative integers m and n,

∫_{−π}^{π} sin mx sin nx dx = 0,   if m ≠ n,

∫_{−π}^{π} sin mx cos nx dx = 0,   (7-37)

∫_{−π}^{π} cos mx cos nx dx = 0,   if m ≠ n.

To normalize this set we observe that

∫_{−π}^{π} 1² dx = 2π,
and   (7-38)
∫_{−π}^{π} sin² mx dx = ∫_{−π}^{π} cos² mx dx = π,   if m > 0.

Hence the functions

1/√(2π), cos x/√π, sin x/√π, . . . , cos nx/√π, sin nx/√π   (7-39)

form an orthonormal set in 𝒯ₙ.

Example 3. It follows from the preceding example that the (infinite) set

1, cos x, sin x, . . . , cos nx, sin nx, . . .

is orthogonal in C[−π, π].
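The relations (7-37) and (7-38) can be confirmed numerically; the sketch below (our own, not part of the text) approximates the inner product on C[−π, π] by a midpoint Riemann sum, which is extremely accurate for periodic integrands.

```python
import numpy as np

def inner(f, g, n=100_000):
    # inner product on C[-pi, pi], approximated by a midpoint Riemann sum
    h = 2 * np.pi / n
    xs = -np.pi + h * (np.arange(n) + 0.5)
    return h * np.sum(f(xs) * g(xs))

# spot checks of (7-37) and (7-38) for m = 2, n = 3
print(inner(lambda x: np.sin(2 * x), lambda x: np.sin(3 * x)))   # approx. 0
print(inner(lambda x: np.sin(2 * x), lambda x: np.cos(3 * x)))   # approx. 0
print(inner(lambda x: np.cos(2 * x), lambda x: np.cos(2 * x)))   # approx. pi
```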

The most important single property of orthogonal (and hence orthonormal)
sets is that each such set in a Euclidean space is linearly independent. To prove this
result in all generality we extend the definition of linear independence to include
infinite sets by agreeing that any such set is linearly independent if and only if
every one of its finite subsets is linearly independent in the earlier sense of the
term. Thus, for example, the vectors 1, x, x², . . . are linearly independent in the
space of polynomials 𝒫. (Proof?)

Theorem 7-3. Every orthogonal set of vectors in a Euclidean space V is
linearly independent.

Proof. Let S be an orthogonal set in V, and suppose

a₁x₁ + · · · + aₙxₙ = 0,   (7-40)

with x₁, . . . , xₙ in S. Then, for each index i, 1 ≤ i ≤ n,

(a₁x₁ + · · · + aₙxₙ) · xᵢ = 0 · xᵢ = 0.

But since xᵢ · xⱼ = 0 whenever i ≠ j, the above equation reduces to

aᵢ(xᵢ · xᵢ) = 0,

and since xᵢ · xᵢ ≠ 0, it follows that aᵢ = 0. Thus each of the coefficients in
(7-40) is zero, and the assertion follows from the test for linear independence. |

In particular, we can now state the following useful result.

Corollary 7-1. An orthogonal set is a basis for an n-dimensional Euclidean
space if and only if it contains n vectors.

Example 4. In 𝒫₃, with inner product

p · q = ∫₋₁¹ p(x)q(x) dx,

the polynomials 1, x, x² − ⅓ are mutually orthogonal, and hence are a basis for
this space.

Example 5. We saw above that the functions 1, cos x, sin x, . . . , cos nx, sin nx
are mutually orthogonal in 𝒯ₙ, the space of all trigonometric polynomials of
degree ≤ 2n + 1. Hence these functions are linearly independent in 𝒯ₙ. More-
over, since every vector in this space is a linear combination of these functions it
follows that they form a basis for 𝒯ₙ, and hence that dim 𝒯ₙ = 2n + 1.

Example 6. The orthogonality of the set of functions

1, cos x, sin x, . . . , cos nx, sin nx, . . .

in C[−π, π] implies that this set is also linearly independent. Combined with the
fact that any n + 1 vectors in an n-dimensional space are linearly dependent, we
conclude that C[−π, π] is infinite dimensional.

EXERCISES
1. Verify that the Pythagorean theorem holds for the orthogonal functions sin x, cos x
in C[−π, π].

*2. Let x₁, . . . , xₙ be mutually perpendicular vectors in a Euclidean space. Prove that

||x₁ + · · · + xₙ||² = ||x₁||² + · · · + ||xₙ||².

(This result is a generalized version of the Pythagorean theorem.)



3. Prove that the polynomials 1, x, x² − ⅓ are mutually orthogonal in 𝒫₃ with the
inner product defined as in Example 4 of the text.

4. Let x be a nonzero vector in a Euclidean space. Show that the vector x/||x|| is of
length 1.

5. Convert the orthogonal set of Exercise 3 above into an orthonormal set.

6. Find a polynomial of unit length in 𝒫₃ which is orthogonal to 1 and x². (Use the
inner product of Example 4 of the text.)

7. Find a polynomial of degree 3 which is orthogonal to 1, x, x² in 𝒫₄. (Use the
inner product of Example 4 of the text.)

8. Find a vector of unit length in ℝ³ which is orthogonal to the vectors x = (1, −1, 0),
y = (2, 1, −1).

9. Find a vector of length 2 in ℝ⁴ which is orthogonal to the vectors x = (1, 0, 3, 1),
y = (−1, 2, 1, 1), z = (2, −3, 0, −1), and has 0 as its second component.

10. Let x and y be linearly independent vectors in ℝ³. Prove that there exist precisely
two vectors of unit length which are orthogonal to x and y.

11. Find a linear combination of the functions eˣ and e⁻ˣ which is orthogonal to eˣ in
C[0, 1].

12. Let x and y be arbitrary vectors in a Euclidean space, and suppose that ||x|| = ||y||.
Prove that x + y and x − y are orthogonal. Interpret this result geometrically.
*13. Suppose x₁, . . . , xₙ is a finite orthonormal set in a Euclidean space V. Prove that
for any vector x in V

Σᵢ₌₁ⁿ (x · xᵢ)² ≤ ||x||².

[Hint: Set y = x − Σᵢ₌₁ⁿ (x · xᵢ)xᵢ, and compute ||y||².]

This inequality is a special case of Bessel's inequality which will be proved in general
in Section 8-4.
*14. Let x₁, . . . , xₙ be an orthonormal basis for a Euclidean space V.

(a) If x is any vector in V, prove that

||x||² = Σᵢ₌₁ⁿ (x · xᵢ)².

This result is known as Parseval's equality, and will be proved in more general terms
in Section 8-4.

(b) If x and y are any two vectors in V, prove that

x · y = Σᵢ₌₁ⁿ (x · xᵢ)(y · xᵢ).

(c) Prove, conversely, that if x₁, . . . , xₙ is an orthonormal set in a finite dimensional
Euclidean space V, and if the equality in (b) is valid for every pair of vectors x, y in V,
then x₁, . . . , xₙ is a basis for V.

7-4 ORTHOGONALIZATION

We now know that every orthogonal set of vectors in a Euclidean space is linearly
independent (Theorem 7-3). However, this result would be of only passing
interest were it not for the fact that a Euclidean space also contains "enough"
orthogonal vectors to enable us to replace a given linearly independent set by an
equivalent orthogonal one. More precisely, in this section we shall prove that any
(finite or infinite) linearly independent set 𝒳 in a Euclidean space can be con-
verted into an orthogonal set which spans the subspace S(𝒳). This process of
orthogonalizing a linearly independent set, as it is called, has a number of im-
portant and useful consequences, not the least of which are the computational
simplifications which result from working with orthogonal vectors.

FIGURE 7-6 (e₂ = x₂ − ae₁)

Rather than begin with the most general situation, we shall introduce the
orthogonalization process by two examples. The first is drawn from ℝ², where
we consider a pair of linearly independent vectors x₁, x₂. Then x₁ and x₂ form
a basis for ℝ², and in this case our problem becomes that of replacing x₁ and x₂
with an orthogonal basis e₁, e₂ constructed out of x₁ and x₂ in some reasonable
way. Figure 7-6 suggests the most natural solution of our problem; simply take
e₁ = x₁, and then let e₂ be the "component" of x₂ perpendicular to x₁. Thus
we write e₂ in the form

e₂ = x₂ − ae₁,

and then determine a so that the orthogonality condition e₂ · e₁ = 0 is satisfied.
This yields the equation

x₂ · e₁ − a(e₁ · e₁) = 0,

and hence the value of a is

a = (x₂ · e₁)/(e₁ · e₁).

With this, e₂ has been determined in terms of x₁ (= e₁) and x₂, and the basis
x₁, x₂ in ℝ² has been orthogonalized.



Example 1. If x₁ = (1, 1) and x₂ = (0, 1), then

a = ((0, 1) · (1, 1))/((1, 1) · (1, 1)) = 1/2,

and so

e₁ = (1, 1),   e₂ = (−1/2, 1/2).

Figure 7-7 shows that this is exactly the result one would expect on the basis of
our earlier remarks.

FIGURE 7-7 (x₁ = e₁ = (1, 1), x₂ = (0, 1), e₂ = (−1/2, 1/2))

As our second example we orthogonalize an arbitrary basis x₁, x₂, x₃ in ℝ³.
The procedure is essentially the same as that used above in ℝ², and is started by
choosing e₁ = x₁. The second step consists of determining e₂ according to the
pair of equations

e₂ · e₁ = 0,   e₂ = x₂ − ae₁,

which gives again

a = (x₂ · e₁)/(e₁ · e₁).

It is clear that e₂ is not the zero vector (why?), and also that e₁ and e₂ both
belong to the subspace of ℝ³ spanned by x₁ and x₂. Hence S(e₁, e₂) is a subspace
of S(x₁, x₂). Moreover, since the orthogonal vectors e₁, e₂ are linearly inde-
pendent, S(e₁, e₂) has the same dimension as S(x₁, x₂). Thus

S(e₁, e₂) = S(x₁, x₂).

Combined with the fact that x₁, x₂, x₃ form a basis for ℝ³, this equality implies
that x₃ does not belong to the subspace of ℝ³ spanned by e₁ and e₂. Referring to
Fig. 7-8, it is again geometrically clear that the orthogonalization process
ought to be completed by letting e₃ be the component of x₃ perpendicular to the
subspace S(e₁, e₂). Thus we set

e₃ = x₃ − a₁e₁ − a₂e₂,

FIGURE 7-8 (e₃ = x₃ − (a₁e₁ + a₂e₂))

and find a₁ and a₂ by means of the orthogonality conditions e₁ · e₂ = e₁ · e₃ =
e₂ · e₃ = 0. They yield the pair of equations

0 = x₃ · e₁ − a₁(e₁ · e₁)

and

0 = x₃ · e₂ − a₂(e₂ · e₂),

whence

a₁ = (x₃ · e₁)/(e₁ · e₁),   a₂ = (x₃ · e₂)/(e₂ · e₂).

This completes the orthogonalization of the basis x₁, x₂, x₃.

Example 2. Let x₁ = (1, 1, 0), x₂ = (0, 1, 0), x₃ = (1, 1, 1). Then e₁ = x₁,
e₂ = x₂ − ae₁, and e₃ = x₃ − a₁e₁ − a₂e₂, where a, a₁, and a₂ are found
from the equations

a = (x₂ · e₁)/(e₁ · e₁),   a₁ = (x₃ · e₁)/(e₁ · e₁),   a₂ = (x₃ · e₂)/(e₂ · e₂).

It follows that a = 1/2, a₁ = 1, a₂ = 0, and

e₁ = (1, 1, 0),   e₂ = (−1/2, 1/2, 0),   e₃ = (0, 0, 1)

(see Fig. 7-9).

FIGURE 7-9 (x₁ = e₁ = (1, 1, 0), x₂ = (0, 1, 0), x₃ = (1, 1, 1), e₃ = (0, 0, 1))

We can now quickly dispose of the general situation in which we are required
to orthogonalize an arbitrary set of linearly independent vectors x₁, x₂, . . .
in a Euclidean space. First set e₁ = x₁, and then let e₂ = x₂ − ae₁, where a is
so chosen that e₁ · e₂ = 0. This determines a as (x₂ · e₁)/(e₁ · e₁), and the
linear independence of x₁ and x₂ implies that e₂ ≠ 0. Furthermore, arguing as
above, we see that S(e₁, e₂) = S(x₁, x₂).

It remains to show that this process can be continued indefinitely step by step.*
To do so, suppose that we have already constructed an orthogonal set e₁, . . . , eₙ
out of x₁, . . . , xₙ so that S(e₁, . . . , eₙ) = S(x₁, . . . , xₙ). Then, to continue one
step further, set

eₙ₊₁ = xₙ₊₁ − a₁e₁ − a₂e₂ − · · · − aₙeₙ,

and determine a₁, . . . , aₙ so that eₙ₊₁ is orthogonal to each of e₁, . . . , eₙ.
This leads to the equations

xₙ₊₁ · e₁ − a₁(e₁ · e₁) = 0,

xₙ₊₁ · e₂ − a₂(e₂ · e₂) = 0,

. . .

and so to

a₁ = (xₙ₊₁ · e₁)/(e₁ · e₁),   a₂ = (xₙ₊₁ · e₂)/(e₂ · e₂),   . . . ,   aₙ = (xₙ₊₁ · eₙ)/(eₙ · eₙ),

which determines eₙ₊₁. Again the linear independence of x₁, . . . , xₙ₊₁ implies
that eₙ₊₁ ≠ 0, and, as before, we can show that S(e₁, . . . , eₙ₊₁) = S(x₁, . . . , xₙ₊₁).
(See Exercise 9 below.) Thus the orthogonalization process has been continued,
as required, and we can now state the following important result.
as required, and we can now state the following important result.

Theorem 7-4. Let x₁, x₂, . . . be a (finite or infinite) set of linearly inde-
pendent vectors in a Euclidean space V. Then there exists an orthogonal
set e₁, e₂, . . . in V such that for each integer n, S(e₁, . . . , eₙ) = S(x₁, . . . , xₙ).
Moreover, the eₙ can be chosen according to the rule

e₁ = x₁,   (7-41)

and

eₙ₊₁ = xₙ₊₁ − a₁e₁ − · · · − aₙeₙ,   (7-42)

where

a₁ = (xₙ₊₁ · e₁)/(e₁ · e₁),   a₂ = (xₙ₊₁ · e₂)/(e₂ · e₂),   . . .   (7-43)

* The knowledgeable reader will recognize that we are giving a proof by mathematical
induction at this point.

The method of orthogonalization described in this theorem is known as the
Gram-Schmidt orthogonalization process.
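The rule (7-41) through (7-43) translates directly into code. The sketch below (ours, not the text's) subtracts the projections one at a time — the so-called modified form of the process, which yields the same eᵢ in exact arithmetic since the earlier eⱼ are mutually orthogonal, and is better behaved numerically — and reproduces Example 2 above. No normalization is performed, exactly as in Theorem 7-4.

```python
import numpy as np

def gram_schmidt(vectors):
    # orthogonalize linearly independent vectors by (7-41)-(7-43)
    es = []
    for x in vectors:
        e = np.asarray(x, dtype=float)
        for prev in es:
            # remove the component of e along prev: a_i = (x . e_i)/(e_i . e_i)
            e = e - (e @ prev) / (prev @ prev) * prev
        es.append(e)
    return es

e1, e2, e3 = gram_schmidt([(1, 1, 0), (0, 1, 0), (1, 1, 1)])
print(e1, e2, e3)   # e1 = (1,1,0), e2 = (-1/2,1/2,0), e3 = (0,0,1), as in Example 2
```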

Example 3. In this example we apply our orthogonalization process to the infinite linearly independent set of vectors

    1, x, x^2, . . .

in the space of polynomials P with inner product defined by

    p · q = ∫_{-1}^{1} p(x)q(x) dx.                         (7-44)

The orthogonalization goes as follows:

(i) e_1 = 1;  e_1 · e_1 = ∫_{-1}^{1} dx = 2.

(ii) e_2 = x - a, where a = (1/2) ∫_{-1}^{1} x dx = 0. Thus

    e_2 = x;  e_2 · e_2 = ∫_{-1}^{1} x^2 dx = 2/3.

(iii) e_3 = x^2 - a_1 - a_2 x, where

    a_1 = (1/2) ∫_{-1}^{1} x^2 dx = 1/3,  a_2 = (3/2) ∫_{-1}^{1} x^3 dx = 0.

Thus

    e_3 = x^2 - 1/3;  e_3 · e_3 = ∫_{-1}^{1} (x^2 - 1/3)^2 dx = 8/45.

Continuing in this fashion we obtain the orthogonal sequence

    1,  x,  x^2 - 1/3,  x^3 - (3/5)x,  x^4 - (6/7)x^2 + 3/35,  . . .   (7-45)

in P.
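The orthogonality of the sequence (7-45) can be checked exactly, since the inner product (7-44) reduces to the fact that the integral of x^k over [-1, 1] is 2/(k + 1) for even k and 0 for odd k. The following sketch is our own check, not part of the text; polynomials are represented as coefficient lists, lowest degree first.

```python
from fractions import Fraction

def poly_inner(p, q):
    """Inner product (7-44): the integral of p(x)q(x) over [-1, 1],
    computed exactly from the coefficient lists p and q."""
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if (i + j) % 2 == 0:                  # odd powers integrate to 0
                total += Fraction(a) * Fraction(b) * Fraction(2, i + j + 1)
    return total

# The orthogonal sequence (7-45).
e = [
    [1],                                          # 1
    [0, 1],                                       # x
    [Fraction(-1, 3), 0, 1],                      # x^2 - 1/3
    [0, Fraction(-3, 5), 0, 1],                   # x^3 - (3/5)x
    [Fraction(3, 35), 0, Fraction(-6, 7), 0, 1],  # x^4 - (6/7)x^2 + 3/35
]
```

Every off-diagonal inner product vanishes, and e_3 · e_3 = 8/45, as computed in step (iii).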

When supplied with appropriate multiplicative constants these polynomials become the famous Legendre polynomials of analysis. (Also see Section 6-5.) We shall meet them again in Chapter 11, where they will be discussed in greater detail. At this point we merely ask the student to make a mental note of the fact that the Legendre polynomials form an orthogonal set in P when the inner product is defined by (7-44).

Theorem 7-4 has a number of useful consequences, the first of which, though obvious, is nonetheless useful.

Corollary 7-2. Every finite dimensional Euclidean space has an orthonormal basis.

This result will enable us to simplify many of our future computations. In particular, if e_1, . . . , e_n is an orthonormal basis in an n-dimensional Euclidean space V, and if

    x = α_1 e_1 + · · · + α_n e_n

is any vector in V, then since e_i · e_j = δ_ij,

    x · e_i = α_i

for each integer i, 1 ≤ i ≤ n. Thus every vector x in V can be written uniquely in the form

    x = (x · e_1)e_1 + · · · + (x · e_n)e_n,                (7-46)

and it follows that the coordinates of x with respect to an orthonormal basis in V are simply the various inner products of x with the basis vectors. When interpreted geometrically, these coordinates are just the lengths of the projections of x onto the coordinate axes, as shown in Fig. 7-10.

FIGURE 7-10

Finally, if

    x = α_1 e_1 + · · · + α_n e_n
and
    y = β_1 e_1 + · · · + β_n e_n

are any two vectors in V, their inner product is

    x · y = α_1 β_1 + · · · + α_n β_n.                      (7-47)

Verbally, this result may be expressed by saying that the inner product of two vectors in a finite dimensional Euclidean space is the sum of the products of their corresponding components when computed with respect to an orthonormal basis for the space. The reader ought to compare this result with Example 1 and Exercise 10 of Section 7-1.
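Equations (7-46) and (7-47) are easy to check numerically. The sketch below is our own illustration, not from the text: the orthonormal basis of R^3 is obtained by rotating the first two coordinate axes through 45 degrees.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

r = math.sqrt(2) / 2
basis = [(r, r, 0.0), (-r, r, 0.0), (0.0, 0.0, 1.0)]   # orthonormal basis of R^3

x = (1.0, 2.0, 3.0)
alphas = [dot(x, e) for e in basis]        # coordinates alpha_i = x . e_i, Eq. (7-46)

# Reconstruct x from its coordinates: x = sum of alpha_i e_i.
recon = [sum(a * e[k] for a, e in zip(alphas, basis)) for k in range(3)]

y = (0.0, 1.0, -1.0)
betas = [dot(y, e) for e in basis]
# Eq. (7-47): x . y equals the sum of products of corresponding components.
lhs, rhs = dot(x, y), dot(alphas, betas)
```

Both checks hold to rounding error, which is all one can ask of floating-point arithmetic.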

7-4 | ORTHOGONALIZATION 279

EXERCISES

1. Orthogonalize each of the following bases in R^3.
(a) (1, 1, 0), (-1, 1, 0), (-1, 1, 1)   (b) (£, 0, 2), (-1, 3, 1), (2, 1, 4)
(c) (2, -1, 1), (3, -2, £), (1, 0, 1)   (d) (1, 2, -1), (2, 0, 3), (0, 4, 1)
(e) (1, 0, 0), (0, 1, 0), (-1, 0, 1)

2. Continue the orthogonalization process in Example 3 above and show that e_4 = x^3 - (3/5)x and e_5 = x^4 - (6/7)x^2 + 3/35.

3. Orthogonalize each of the following bases in R^4.
(a) (1, 0, 0, 1), (-1, 0, 2, 1), (0, 1, 2, 0), (0, 0, -1, 1)
(b) (2, 0, i, -1), (0, 0, 3, 1), (1, 0, -1, 1), (2, 1, 0, -3)
(c) (1, 0, 0, 0), (1, 1, 0, 0), (1, 1, 1, 0), (1, 1, 1, 1)

4. (a) Orthogonalize the basis 1, x - 1, (x - 1)^2 in P_3 with inner product ∫_{-1}^{1} p(x)q(x) dx.
(b) Repeat (a), this time using the inner product ∫_0^1 p(x)q(x) dx.

5. Orthogonalize each of the following sets of vectors in C[0, 1].
(a) e^x, e^{-x}   (b) e^x, e^{2x}   (c) 1, 2x, e^x

6. At one point in this section the following argument was used: If W is a subspace of an n-dimensional vector space V, and dim W = n, then W = V. Prove this statement.

7. In P define p · q by

    p · q = a_0 b_0 + a_1 b_1 + · · · + a_n b_n,

where

    p(x) = a_0 + a_1 x + · · · + a_n x^n,
    q(x) = b_0 + b_1 x + · · · + b_n x^n.

(See Exercise 10, Section 7-1.) If a is an arbitrary constant, show that the polynomials

    1, x - a, (x - a)^2, . . . , (x - a)^n, . . .

are linearly independent in P, and then orthogonalize this set.

8. Orthogonalize the basis

    e'_1 = (1, 0, . . . , 0),
    e'_2 = (1, 1, 0, . . . , 0),
    . . .
    e'_n = (1, 1, . . . , 1),

in R^n.

9. In passing from step n to step n + 1 in the orthogonalization process (Theorem 7-4) we had the following situation:
(i) S(e_1, . . . , e_n) = S(x_1, . . . , x_n), with e_1, . . . , e_n mutually orthogonal, and
(ii) e_{n+1} = x_{n+1} - a_1 e_1 - · · · - a_n e_n, where a_1, . . . , a_n are computed according to (7-43).
It was then asserted that
(a) e_{n+1} ≠ 0, and
(b) S(e_1, . . . , e_{n+1}) = S(x_1, . . . , x_{n+1}).
Prove these statements.
10. Write out a proof of Eq. (7-47) of the text.

11. Let e_1, . . . , e_n be an orthogonal basis in a Euclidean space V.
(a) If x = α_1 e_1 + · · · + α_n e_n is an arbitrary vector in V, find the value of α_i in terms of x and e_i.
(b) Compute x · y for any pair of vectors x and y in V.

12. A vector x in a Euclidean space V is said to be perpendicular or orthogonal to a subspace W of V if x is orthogonal to every vector in W.
(a) Show that at each step in the Gram-Schmidt orthogonalization process e_{n+1} is orthogonal to the subspace S(x_1, . . . , x_n).
(b) Suppose e_1, . . . , e_n is a basis for W. Prove that x is orthogonal to W if and only if x is orthogonal to each of the e_i.

13. Find a vector of unit length in R^3 which is orthogonal to the subspace spanned by the vectors (1, 2, -1) and (-1, 0, 2). (See Exercise 12.)
2
14. (a) Orthogonalize the set of vectors 1, sin x, sin^2 x in C[-π, π].
(b) Use the result in (a) to find a unit vector which is orthogonal to the subspace of C[-π, π] spanned by 1, sin x. (See Exercise 12.)

15. (a) Let e_1, e_2 be an orthonormal set in a Euclidean space V, and let W be the subspace spanned by e_1 and e_2. If x is an arbitrary vector in V, prove that there exists precisely one vector y in W such that x - y is orthogonal to W (see Exercise 12).
(b) Find y if V = R^3, e_1 = (1, 0, 0), e_2 = (0, √2/2, √2/2), and x = (1, 1, 1).

16. Let p_0(x), p_1(x), . . . , p_n(x), . . . be the sequence of polynomials obtained by orthogonalizing the sequence 1, x, x^2, . . . in C[-1, 1].
(a) Prove that p_n(x) is a polynomial of degree n, for each n.
(b) Prove that the leading coefficient of p_n(x) is 1.
*(c) Prove that when n is even, p_n(x) contains only terms of even degree, and when n is odd, p_n(x) contains only terms of odd degree. [Hint: Use mathematical induction.]

17. Legendre polynomials. The Legendre polynomial P_n(x) may be computed by the following general formula, where n successively assumes the values 0, 1, 2, . . . :

    P_n(x) = ((2n)!/(2^n (n!)^2)) [ x^n - (n(n - 1)/(2(2n - 1))) x^{n-2}
             + (n(n - 1)(n - 2)(n - 3)/(2 · 4(2n - 1)(2n - 3))) x^{n-4} - · · · ].   (7-48)

(a) Show that the first five Legendre polynomials, P_0(x), . . . , P_4(x), are constant multiples of the polynomials listed in (7-45). (Recall that 0! = 1.)
(b) Write each of the following polynomials as a linear combination of Legendre polynomials.
(i) 3x^2 - 2x + 1   (ii) -5x^3 + 9x^2 - 3x - 2   (iii) ^x^3 + |x^2 - |x + 2



7-5 PERPENDICULAR PROJECTIONS; DISTANCE TO A SUBSPACE

Let W be a plane through the origin in R^3, and let x be an arbitrary point not on W. Then, if y denotes the perpendicular projection of x onto W, the distance from x to W is defined to be the length of the vector x - y, as shown in Fig. 7-11. Moreover, the vector y is characterized by the property that it is the unique vector in W such that x - y is perpendicular to W.*

FIGURE 7-11

In this section we generalize these familiar concepts to arbitrary Euclidean spaces. To do so, however, it turns out that the subspace W must be restricted in some way in order to make things manageable. The most obvious and elementary restriction is the requirement that W be finite dimensional, for then a vector d will be orthogonal to W if and only if d is orthogonal to each of the vectors in a basis for W (Exercise 12(b), Section 7-4). Thus, throughout the following discussion we shall assume that W is a finite dimensional subspace of V. Note, however, that we place no restriction on the dimension of V itself.

Our first objective is to establish the existence of perpendicular projections, which we do by proving the following theorem.
Our first objective is to establish the existence of perpendicular projections,
which we do by proving the following theorem.

Theorem 7-5. Let W be a finite dimensional subspace of a Euclidean space V, and let x be an arbitrary vector in V. Then x can be decomposed in precisely one way as

    x = y + d,                                              (7-49)

where y is a vector in W, and d is perpendicular to W.

Proof. Since W is finite dimensional, we can apply Corollary 7-2 and find an orthonormal basis e_1, . . . , e_n for W. Then, if it exists at all, the vector y of Eq. (7-49) must be of the form

    y = α_1 e_1 + · · · + α_n e_n,                          (7-50)

and it remains to show that the α_i can be so determined that the vector d = x - y is orthogonal to each of the basis vectors e_i.† But if we substitute (7-50) into

* A vector x is said to be perpendicular or orthogonal to a subspace W of a Euclidean space if x is orthogonal to every vector in W.
† Actually, we already know that this can be done, and that the answer is furnished by the (n + 1)st step in the Gram-Schmidt orthogonalization process applied to the linearly independent vectors e_1, . . . , e_n, x. The argument which follows is a repetition of that step.

(7-49) to obtain

    x = α_1 e_1 + · · · + α_n e_n + d,

and then apply the orthogonality conditions

    d · e_i = 0,  e_i · e_j = δ_ij,

we find that

    α_1 = x · e_1,  . . . ,  α_n = x · e_n.                 (7-51)

This set of equations determines y uniquely in terms of x and the basis vectors e_1, . . . , e_n as

    y = (x · e_1)e_1 + · · · + (x · e_n)e_n.                (7-52)

Finally, it is obvious that the vector d = x - y is perpendicular to W, and the proof is complete. |
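Once an orthonormal basis for W is in hand, (7-51) and (7-52) give the decomposition x = y + d directly. Here is a minimal sketch of ours (the helper names are invented), using the xy-plane in R^3 as W:

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def project(x, onb):
    """Perpendicular projection (7-52) of x onto span(onb);
    onb must be an orthonormal basis of the subspace W."""
    y = [0.0] * len(x)
    for e in onb:
        a = dot(x, e)                              # alpha_i = x . e_i, Eq. (7-51)
        y = [yi + a * ei for yi, ei in zip(y, e)]
    return y

onb = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]           # orthonormal basis of the xy-plane
x = (3.0, -1.0, 4.0)
y = project(x, onb)                                # the vector y of (7-49)
d = [xi - yi for xi, yi in zip(x, y)]              # the perpendicular component d
```

Here y = (3, -1, 0) and d = (0, 0, 4), and d is orthogonal to both basis vectors, as Theorem 7-5 promises.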

We now show that the vector d, which is called the component of x perpendicular to W, can be used to measure the distance from x to W. To do so, let z ≠ d be any vector in V such that x - z belongs to W. (One can view z as a vector from W to the point x; see Fig. 7-12.) Then, since z - d can be written as the difference of two vectors in W, it too belongs to W, and so is orthogonal to d. It now follows from the Pythagorean theorem that

    ||z||^2 = ||d||^2 + ||z - d||^2,

and hence, since ||z - d|| > 0, that ||z|| > ||d||. In other words, of all vectors z in V such that x - z belongs to W, d is the one whose length is smallest. This serves to justify

FIGURE 7-12

Definition 7-6. If W is a finite dimensional subspace of a Euclidean space V, and x is any vector in V, then the distance from x to W is the length of the component of x perpendicular to W.

In terms of this definition we can describe the perpendicular projection x - d of x onto W as that point in W which is "closest to" x, in the sense that if w is any other vector in W, then ||x - w|| is greater than ||d||.

Before we apply these results to a number of interesting special cases, we remark that the use of an orthonormal basis for W in the above computations was merely a matter of convenience, not necessity. In general, if e_1, . . . , e_n is an arbitrary basis for W, and d is the component of x perpendicular to W, then, because

x - d belongs to W, we can write

    x - d = α_1 e_1 + · · · + α_n e_n.

This time the requirement that d be orthogonal to each of the e_i leads to the system of linear equations

    (e_1 · e_1)α_1 + (e_1 · e_2)α_2 + · · · + (e_1 · e_n)α_n = e_1 · x,
    (e_2 · e_1)α_1 + (e_2 · e_2)α_2 + · · · + (e_2 · e_n)α_n = e_2 · x,   (7-53)
    . . . . . . . . .
    (e_n · e_1)α_1 + (e_n · e_2)α_2 + · · · + (e_n · e_n)α_n = e_n · x,

in the unknowns α_1, . . . , α_n, which must be solved in order to find d. Our earlier results guarantee that this system has a unique solution, since we know that d is uniquely determined by x and W.
Digressing for a moment, we recall that a system of n linear equations in n unknowns has a unique solution if and only if the determinant of its coefficients is different from zero.* Hence

    | e_1 · e_1   e_1 · e_2   · · ·   e_1 · e_n |
    | e_2 · e_1   e_2 · e_2   · · ·   e_2 · e_n |
    |   . . .                                   |  ≠ 0.     (7-54)
    | e_n · e_1   e_n · e_2   · · ·   e_n · e_n |

This determinant is known as the Gram determinant of the vectors e_1, . . . , e_n, and the above argument shows that the Gram determinant of n linearly independent vectors in a Euclidean space is always different from zero. The converse of this statement is also true, and furnishes a method for testing a (finite) set of vectors in a Euclidean space for linear dependence. We shall leave the proof of this fact to the reader, and let matters rest with a formal statement.

Theorem 7-6. The vectors e_1, . . . , e_n in a Euclidean space are linearly independent if and only if their Gram determinant is different from zero.
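Theorem 7-6 is easy to try on Exercise 20(a) below. The sketch is our own (restricted to three vectors in R^3 for brevity): it forms the matrix of inner products appearing in (7-54) and evaluates its determinant.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def det3(m):
    """Determinant of a 3 x 3 matrix (cofactor expansion along the first row)."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def gram_det(vectors):
    """Gram determinant (7-54): determinant of the matrix (e_i . e_j)."""
    return det3([[dot(u, v) for v in vectors] for u in vectors])

# Exercise 20(a): a nonzero Gram determinant means linear independence.
independent = gram_det([(2, -1, 1), (1, 2, 1), (-1, 2, 0)])
# A dependent set (the third vector is the sum of the first two) gives zero.
dependent = gram_det([(1, 0, 0), (0, 1, 0), (1, 1, 0)])
```

For the set of Exercise 20(a) the Gram determinant is nonzero, while the deliberately dependent set yields zero, in accordance with Theorem 7-6.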

One final comment is in order before we consider specific examples. In practice it is usually easier to find d by solving the system of equations (7-53) than by trying to use Formula (7-52). The reason for this is that (7-52) applies only when e_1, . . . , e_n is an orthonormal basis for W, and the construction of such a basis is a tedious job at best. In essence, the issue here is that one ought to attack a simple problem directly, rather than try to adapt it to fit some general formula. Each of the following examples illustrates this point.

* The student who is unfamiliar with determinants should omit this paragraph, and
resume reading after the statement of Theorem 7-6.

Example 1. The distance from a point to a line. Let x be any vector in a Euclidean space, and let L be the line determined by the nonzero vector y; i.e., L is the one-dimensional subspace spanned by y (Fig. 7-13). We propose to find the distance ||d|| from x to L.

FIGURE 7-13

Since y spans L, we can take y itself as a basis for L. This done, we must determine a so that the vector d = x - ay is orthogonal to y. Thus

    0 = d · y = x · y - a(y · y),

and

    a = (x · y)/(y · y).

It follows that

    ||d|| = (d · d)^{1/2} = [(x - ay) · (x - ay)]^{1/2}
          = [x · x - 2a(x · y) + a^2(y · y)]^{1/2}
          = [x · x - 2(x · y)^2/(y · y) + (x · y)^2/(y · y)]^{1/2},

and we have the formula

    ||d|| = [((x · x)(y · y) - (x · y)^2)/(y · y)]^{1/2}    (7-55)

for the distance from x to the line determined by the vector y ≠ 0 in any Euclidean space.
space.
2
This formula assumes a particularly simple form in (R . For if x = (x lt x 2 )
and y = (yi,y 2 ), then
1/2
\x\ + xl)(yl + yl) - (xij>i + x 2y 2 f
Idll =
+ y%z yi
2 i

1/2
\x\y\ - 2x y 2 x 2yx + x%y\
x
2, .2

2 2
yi + y2
i

1/2
(x x y 2 — x 2 yxf
2 2
yi + i

y2

which may be written


= 1*1^2 - x 2y x \
(7-56)
Idll
2
Vjf + y2

For example, the distance from the point (-1, 1) to the line (through the origin) determined by the point (1, 1) is

    ||d|| = |-1 - 1| / √2 = √2.

(See Fig. 7-14.)

FIGURE 7-14

In R^3 the formula for the distance from a point x = (x_1, x_2, x_3) to the line determined by the nonzero vector y = (y_1, y_2, y_3) is not nearly so simple as it is in R^2. A straightforward calculation starting from (7-55) yields

    ||d|| = [((x_1 y_2 - x_2 y_1)^2 + (x_1 y_3 - x_3 y_1)^2 + (x_2 y_3 - x_3 y_2)^2)/(y_1^2 + y_2^2 + y_3^2)]^{1/2}.   (7-57)

A similar formula can be established for R^n, but it is obviously easier to use (7-55) directly.
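Formula (7-55) needs no expansion into components and works in any R^n. A short sketch of ours:

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def dist_to_line(x, y):
    """Distance (7-55) from x to the line spanned by y != 0."""
    return math.sqrt((dot(x, x) * dot(y, y) - dot(x, y) ** 2) / dot(y, y))

# The worked example above: distance from (-1, 1) to the line through (1, 1).
d2 = dist_to_line((-1, 1), (1, 1))            # sqrt(2)
# A three-dimensional case; the expanded formula (7-57) gives the same value.
d3 = dist_to_line((1, 2, 2), (1, 0, 0))       # 2*sqrt(2)
```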

Example 2. Find the distance from the point x = (1, 3, 2) in R^3 to the plane (through the origin) determined by the vectors y_1 = (1, 0, 0) and y_2 = (1, 1, 1).

We must first determine α and β so that the vector

    d = x - α y_1 - β y_2

is orthogonal to y_1 and y_2. This leads to the pair of equations

    x · y_1 - α(y_1 · y_1) - β(y_2 · y_1) = 0,
    x · y_2 - α(y_1 · y_2) - β(y_2 · y_2) = 0.

Computing the values of the various inner products involved, we find that these equations become

    1 - α - β = 0,
    6 - α - 3β = 0,

and hence α = -3/2, β = 5/2. It follows that d = (0, 1/2, -1/2), and

    ||d|| = (1/4 + 1/4)^{1/2} = √2/2.
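The computation in Example 2 is just the 2 x 2 case of the system (7-53). The sketch below is our own; Cramer's rule solves the pair of equations.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Example 2: x = (1, 3, 2); the plane is spanned by y1 = (1, 0, 0), y2 = (1, 1, 1).
x, y1, y2 = (1, 3, 2), (1, 0, 0), (1, 1, 1)

# Orthogonality of d = x - a*y1 - b*y2 to y1 and y2 gives two linear equations.
g11, g12, r1 = dot(y1, y1), dot(y2, y1), dot(x, y1)
g21, g22, r2 = dot(y1, y2), dot(y2, y2), dot(x, y2)
det = g11 * g22 - g12 * g21
a = (r1 * g22 - g12 * r2) / det                  # -3/2, as in the text
b = (g11 * r2 - g21 * r1) / det                  #  5/2

d = [xi - a * u - b * v for xi, u, v in zip(x, y1, y2)]   # (0, 1/2, -1/2)
dist = math.sqrt(dot(d, d))                               # sqrt(2)/2
```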

Example 3. The Fourier coefficients of f(x). Let C[-π, π] be the space of continuous functions on the interval [-π, π] with the usual inner product

    f · g = ∫_{-π}^{π} f(x)g(x) dx,

and let T_n be the (2n + 1)-dimensional subspace of C[-π, π] consisting of all trigonometric polynomials of degree at most n. (See Example 2, Section 7-3.)

We propose to compute the perpendicular projection onto T_n of any function f in C[-π, π].

By (7-34) this projection is of the form

    T(x) = a_0/2 + a_1 cos x + · · · + a_n cos nx + b_1 sin x + · · · + b_n sin nx,

and we must determine values for the a_i and b_i so that

    d(x) = f(x) - T(x)

is orthogonal to each of the functions 1, cos x, sin x, . . . , cos nx, sin nx. But, by (7-37), these functions are mutually orthogonal in C[-π, π], and so, by taking inner products, we have

    ∫_{-π}^{π} f(x) dx - (a_0/2) ∫_{-π}^{π} dx = 0,
    ∫_{-π}^{π} f(x) cos x dx - a_1 ∫_{-π}^{π} cos^2 x dx = 0,
    ∫_{-π}^{π} f(x) sin x dx - b_1 ∫_{-π}^{π} sin^2 x dx = 0,
    . . . . . . . . .
    ∫_{-π}^{π} f(x) cos nx dx - a_n ∫_{-π}^{π} cos^2 nx dx = 0,
    ∫_{-π}^{π} f(x) sin nx dx - b_n ∫_{-π}^{π} sin^2 nx dx = 0.

Since

    ∫_{-π}^{π} dx = 2π,

and

    ∫_{-π}^{π} sin^2 mx dx = ∫_{-π}^{π} cos^2 mx dx = π

if m > 0, it follows that

    a_0 = (1/π) ∫_{-π}^{π} f(x) dx,
    a_1 = (1/π) ∫_{-π}^{π} f(x) cos x dx,    b_1 = (1/π) ∫_{-π}^{π} f(x) sin x dx,
    . . . . . . . . .                                       (7-58)
    a_n = (1/π) ∫_{-π}^{π} f(x) cos nx dx,   b_n = (1/π) ∫_{-π}^{π} f(x) sin nx dx.

These coefficients are known as the Fourier coefficients of the function f. (The student should now appreciate the reason for associating the factor 1/2 with the constant term of the trigonometric polynomials in T_n. It was done simply to ensure that each of the Fourier coefficients has the same constant before the integral, for without the 1/2 the formula for a_0 would have been (1/2π) ∫_{-π}^{π} f(x) dx.)

We have now shown that the trigonometric polynomial T whose coefficients are the Fourier coefficients of f is the best approximation in T_n to the function f, in the sense that of all the functions P belonging to T_n, T is the one which minimizes the integral

    ∫_{-π}^{π} [f(x) - P(x)]^2 dx.                          (7-59)

The value of this integral is often called the mean deviation (or mean square deviation) of P from f. In these terms, T is that trigonometric polynomial in T_n with minimum mean deviation from f.

These considerations lead one naturally to the problem of determining whether the mean deviation of T from f tends to zero as n → ∞. In other words, can f be approximated arbitrarily closely by trigonometric polynomials? This and related topics will be investigated in the chapters which follow.
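The integrals (7-58) can be approximated numerically. The following sketch is ours, not the text's (Simpson's rule stands in for exact integration); it computes the Fourier coefficients of f(x) = x, for which a_k = 0 and b_k = 2(-1)^(k+1)/k, the answer to Exercise 13 below.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson's rule (n even); ample accuracy for smooth integrands."""
    h = (b - a) / n
    s = f(a) + f(b)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * f(a + k * h)
    return s * h / 3

def fourier_coeffs(f, n):
    """Fourier coefficients (7-58) of f on [-pi, pi] through frequency n."""
    a = [simpson(lambda x, k=k: f(x) * math.cos(k * x), -math.pi, math.pi) / math.pi
         for k in range(n + 1)]
    b = [simpson(lambda x, k=k: f(x) * math.sin(k * x), -math.pi, math.pi) / math.pi
         for k in range(1, n + 1)]
    return a, b

a, b = fourier_coeffs(lambda x: x, 3)    # a near [0, 0, 0, 0]; b near [2, -1, 2/3]
```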

EXERCISES

1. Find the distance from each of the following points x in R^2 to the line (through the origin) determined by the point y.
(a) x = (1, 0); y = (1, 1)   (b) x = (-2, 1); y = (0, 1)
(c) x = (f, £); y = (-3, 4)   (d) x = (-2, 10); y = (1, |)

2. Find the distance from each of the following points x in R^3 to the line (through the origin) determined by the point y.
(a) x = (1, 1, 1); y = (1, 1, 0)   (b) x = (0, 1, 1); y = (1, 1, 0)
(c) x = (2, 1, -3); y = (-1, 2, 2)   (d) x = (-£, 1, 3/√3); y = (4, 3, 5√3)

3. Find the distance from each of the following points x in R^4 to the line (through the origin) determined by the point y.
(a) x = (0, 2, 1, 1); y = (1, 3, 1, 3)   (b) x = (-1, 3, 2, 3); y = (1, 1, 1, 1)
(c) x = (16, |, 3, f); y = (4, ±, 2, 1)   (d) x = (0, -1, 2, 2); y = (-1, 0, -2, 2)

4. Let x and x' be linearly independent vectors in R^2 such that ||x|| = ||x'||. Show that there are two lines through (0, 0) for which the distances from x and x' are equal, and that these lines are orthogonal.

5. There are two lines through the origin in R^2 such that the distance from (1, 7) is 5. Find them.

6. Find the locus of a point x in R^2 which is equidistant from each of two given distinct lines through the origin.

7. Find the distance from each of the following points x in R^3 to the plane (through the origin) determined by the given points y and z.
(a) x = (1, 2, 0); y = (-2, 0, -2); z = (1, 1, 6)
(b) x = (2, -1, 1); y = (4, -3, -1); z = (-1, *, -2)
(c) x = (1, -2, -1); y = (0, 0, -√21); z = (0, 2√21, -5√21)

8. Let L be the line through (0, 0, 0) and (10, 2, 11) in R^3. Find the points on L which are at a distance 3 from (2, -2, 1).

9. Find the coordinates of the point in R^3 which lies on the plane determined by the origin, (1, 2, 2), and (2, -2, 1) and is at a minimum distance from the point (3, 1, 1).

10. (a) Show that the area of the triangle in R^2 with vertices at (0, 0), (x_1, x_2), and (y_1, y_2) is (1/2)|x_1 y_2 - x_2 y_1|.
(b) More generally, show that the area of the triangle in R^n with vertices at 0, x, and y is (1/2)[(x · x)(y · y) - (x · y)^2]^{1/2}.

11. Find the perpendicular projection of each of the following vectors in C[-π, π] onto the indicated subspace W, and compute the distance from the vector to the subspace.
(a) f(x) = x, W = S(1, cos x, sin x)
(b) f(x) = cos^2 x, W = S(1, cos 2x)
(c) f(x) = x^2, W = S(1, cos x, cos 2x)

12. Repeat Exercise 11 for the following vectors and subspaces of C[-π, π].
(a) f(x) = x^2, W = S(sin x, sin 2x)
(b) f(x) = x^3, W = S(1, cos x, cos 2x)
(c) f(x) = x^3, W = S(sin x, sin 2x)

In Exercises 13-19 find the Fourier coefficients (for all values of n) for the given function in C[-π, π].

13. f(x) = x    14. f(x) = x^2    15. f(x) = sin^2 x
16. f(x) = x^m, m a positive integer    17. f(x) = |x|
18. f(x) = cos (x/2)    19. f(x) = e^x
*20. Use the Gram determinant to test the following sets of vectors in R^3 for linear independence.
(a) e_1 = (2, -1, 1), e_2 = (1, 2, 1), e_3 = (-1, 2, 0)
(b) e_1 = (ih, 2, -1), e_2 = (1, -1, 2), e_3 = (-2, 7, 4)
(c) e_1 = (if, 0), e_2 = (i^, h, -1), e_3 = (3, -£, £)

21. Find the perpendicular projection of each of the following vectors in C[-1, 1] onto the subspace spanned by the polynomials 1, x, x^2 - 1/3, and compute the distance from these vectors to the subspace in question.
(a) f(x) = x^n, n an integer   (b) f(x) = sin x
(c) f(x) = |x|   (d) f(x) = 3x^2

22. Let e_1, . . . , e_n be an orthonormal basis in a Euclidean space V, and let W be the m-dimensional subspace of V spanned by e_1, . . . , e_m, m < n.
(a) Find the perpendicular projection of an arbitrary vector x in V onto W.
(b) Find the component of x perpendicular to W.
(c) Find the distance from x to W.


23. Let X be an arbitrary nonempty subset of a Euclidean space V, and let X^⊥ (read "X perp") denote the set of all vectors in V which are orthogonal to every vector in X. Show that X^⊥ is a subspace of V. What is X^⊥ if X = V? If X contains only the zero vector?

24. With X and X^⊥ as in Exercise 23, let (X^⊥)^⊥ denote the set of all vectors in V which are orthogonal to every vector in X^⊥.
(a) Show that X is a subset of (X^⊥)^⊥, i.e., that every vector in X belongs to (X^⊥)^⊥.
(b) Show that (X^⊥)^⊥ = S(X) if V is finite dimensional. [Hint: Choose an orthonormal basis for S(X), and extend it to a basis for V.]
*25. (a) Let V consist of all infinite sequences

    s = {a_0, a_1, . . . , a_n, . . .}

of real numbers having only a finite number of nonzero entries, with addition and scalar multiplication defined termwise. If

    s = {a_0, a_1, . . . , a_n, . . .}  and  t = {b_0, b_1, . . . , b_n, . . .},

define

    s · t = Σ_{n=0}^{∞} a_n b_n.

Show that with this definition V becomes an infinite dimensional inner product space.
(b) Let W be the subspace of V consisting of all sequences of the form

    s = {a, a, 0, 0, . . .},

where a is an arbitrary real number. Show that W = (W^⊥)^⊥.

26. Let V be a finite dimensional Euclidean space, and let W be a subspace of V. Show that

    V = W + W^⊥,

and that W ∩ W^⊥ contains only the zero vector (see Exercise 22, Section 1-4).

27. Find W^⊥, where W is the subspace of R^3 spanned by the following vector or vectors.
(a) (1, 1, 0)   (b) (1, -1, 0), (0, 0, 1)   (c) (1, 2, -1)   (d) (1, 1, 1), (-1, -1, J)

28. Let Π be an arbitrary plane and x an arbitrary vector. Show that the distance from x to Π is the length of the perpendicular projection of x onto Π^⊥. (See Fig. 7-15.)

*29. Prove Theorem 7-6.

*7-6 THE METHOD OF LEAST SQUARES


One of the most important applications of the preceding material occurs in the theory of approximations. The problem here can be described most succinctly as the proper interpretation of experimental data, and a simple illustration is perhaps the best way of introducing the subject.

Suppose that we wish to determine the value of a certain physical constant c, such as the specific gravity of a given substance, and that an experimental method for measuring c is available. We then perform the experiment n times and obtain estimates x_1, . . . , x_n of c. In the absence of experimental errors each of the x_i would equal c, but in practice, of course, none of them will, and we are thus faced with the problem of finding the "best approximation" to c available from our experimental data.
To do so, we view the n experimental measurements as a vector x = (x_1, . . . , x_n) in R^n, and let y be the n-tuple (1, . . . , 1), so that

    cy = (c, . . . , c).

Then, if we interpret the term "best approximation" as distance in R^n (which, after all, is the most reasonable interpretation conceivable), and if we let c' denote this approximation, we must choose c' so that the vector c'y is as close as possible to x. In other words, c' is determined by the requirement that c'y be the perpendicular projection of x onto the one-dimensional subspace of R^n spanned by y (see Fig. 7-16). But, as we saw in the preceding section, this projection is

    ((x · y)/(y · y)) y,

so that

    c' = (x · y)/(y · y) = (x_1 + · · · + x_n)/n.

FIGURE 7-16

This, of course, is none other than the arithmetic average of the x_i, and we now have a theoretical interpretation of the popular practice of averaging separate (independent) measurements of the same quantity.
Appropriately generalized, the foregoing method will yield approximations to vectors as well as scalars. For simplicity we consider the case of a vector c = (c_1, c_2) in R^2, and a set of measurements

    x_1 = (x_{11}, x_{12}),
    x_2 = (x_{21}, x_{22}),
    . . .
    x_n = (x_{n1}, x_{n2})

of c. Again we wish to use the x_i to obtain the best possible approximation c' = (c'_1, c'_2) to c.

This time we view the experimental results as a vector

    x = (x_{11}, x_{21}, . . . , x_{n1}, x_{12}, . . . , x_{n2})

in R^{2n}, and let y_1 and y_2 be, respectively, the orthogonal vectors

    (1, . . . , 1, 0, . . . , 0)  and  (0, . . . , 0, 1, . . . , 1)

in R^{2n}. Proceeding as before, we take as our approximation the scalars c'_1, c'_2 such that c'_1 y_1 + c'_2 y_2 is the perpendicular projection of x onto the subspace S(y_1, y_2). Thus the vector x - (c'_1 y_1 + c'_2 y_2) must be perpendicular to y_1 and y_2, whence

    c'_1 = (x · y_1)/(y_1 · y_1)  and  c'_2 = (x · y_2)/(y_2 · y_2).

This gives

    c'_1 = (x_{11} + x_{21} + · · · + x_{n1})/n,  c'_2 = (x_{12} + x_{22} + · · · + x_{n2})/n,

and

    c' = (1/n)(x_1 + · · · + x_n).                          (7-60)

This vector may be familiar to some of our readers as the centroid of the vectors x_1, . . . , x_n. We note that it may be characterized as the vector in R^2 which minimizes the quantity

    Σ_{i=1}^{n} ||x_i - c'||^2                              (7-61)

(see Exercise 4 below).

In general, n experimental determinations x_1, . . . , x_n of a vector c in R^m can be handled in exactly the same way, and it is not difficult to show that Formulas (7-60) and (7-61) continue to describe the best approximation.
A related problem of this type occurs when one is given a scalar y which is known to depend linearly upon a scalar x, i.e.,

    y = cx,

and one attempts to determine the value of c experimentally. (For instance, y might be the displacement of a spring under a weight x, in which case c would be the spring constant.) In this case our experiments yield a set of measured values x_1, . . . , x_n of x and corresponding values y_1, . . . , y_n for y, which can be displayed as a system of n linear equations

    y_1 = c x_1,
    y_2 = c x_2,
    . . .                                                   (7-62)
    y_n = c x_n,

in the single unknown c. As a result of experimental errors, this system of equations will not be compatible (i.e., will not admit a unique solution), and our problem is to find the "best approximation" to c afforded by this data.

Again we pass to Euclidean n-space, R^n, and consider the vectors

    x = (x_1, . . . , x_n)  and  y = (y_1, . . . , y_n).

In this context our problem can be rephrased as follows: Find a scalar c' such that the vector c'x in the subspace S(x) is as close as possible to y. This, of course, requires that c' be so chosen that the vector c'x - y has the smallest possible length. Thus c' must minimize the quantity ||c'x - y||^2. But

    ||c'x - y||^2 = (c'x - y) · (c'x - y) = Σ_{i=1}^{n} (c'x_i - y_i)^2,

and we see that the best approximation to c is the scalar c' which minimizes the sum

    Σ_{i=1}^{n} (c'x_i - y_i)^2.                            (7-63)

For rather obvious reasons this method of approximation (and its generalization below) is called the method of least squares.

In the present example it is easy to compute the value of c' explicitly. Indeed, since c'x - y must be perpendicular to x, we have (c'x - y) · x = 0, and hence

    c' = (x · y)/(x · x),

or

    c' = (x_1 y_1 + · · · + x_n y_n)/(x_1^2 + · · · + x_n^2).   (7-64)
Example 1. Use the method of least squares to find the best approximation to c available from the equations

    2 = c,
    2 = 3c,
    5 = 4c,
    6 = 6c.

Here x = (1, 3, 4, 6) and y = (2, 2, 5, 6), so that

    c' = (2 + 6 + 20 + 36)/(1 + 9 + 16 + 36) = 32/31.
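Formula (7-64) in code, applied to Example 1 (a sketch of ours; the function name is invented):

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def least_squares_slope(x, y):
    """Best c in y = cx through the origin, Eq. (7-64): c = (x . y)/(x . x)."""
    return dot(x, y) / dot(x, x)

# Example 1: the incompatible system 2 = c, 2 = 3c, 5 = 4c, 6 = 6c.
c = least_squares_slope((1, 3, 4, 6), (2, 2, 5, 6))   # 32/31
```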

The type of approximation which has just been considered often arises as a problem in curve fitting. In this context one is asked to find a straight line y = c'x through the origin (in R^2) which best fits the points (x_1, y_1), . . . , (x_n, y_n). The accepted method of solution is to let Δ_i denote the difference c'x_i - y_i, as shown in Fig. 7-17, and then choose c' so that the sum of the squares of the Δ_i is a minimum.* But this sum is just the quantity given in (7-63), and the present problem is identical with the one solved above.
Finally, we consider the general situation in which a scalar y is an unknown linear combination of the scalars x_1, . . . , x_m; i.e.,

    y = c_1 x_1 + · · · + c_m x_m.

Again we imagine that n experiments have been performed (where n > m), and that the ith experiment has yielded the values x_{i1}, . . . , x_{im} and y_i, respectively, for x_1, . . . , x_m and y. Thus we have a system of linear equations

    y_1 = c_1 x_{11} + c_2 x_{12} + · · · + c_m x_{1m},
    y_2 = c_1 x_{21} + c_2 x_{22} + · · · + c_m x_{2m},
    . . .                                                   (7-65)
    y_n = c_1 x_{n1} + c_2 x_{n2} + · · · + c_m x_{nm},

in which the c_i are unknowns. In general, of course, (7-65) cannot be solved, and our problem becomes that of finding values c'_1, . . . , c'_m for the c_i which make the expressions standing on the right of these equations approximate y_1, . . . , y_n as closely as possible.

FIGURE 7-17

By now the method should be obvious. We consider the vectors

    x_1 = (x_{11}, x_{21}, . . . , x_{n1}),
    x_2 = (x_{12}, x_{22}, . . . , x_{n2}),
    . . .
    x_m = (x_{1m}, x_{2m}, . . . , x_{nm}),

and

    y = (y_1, . . . , y_n)

* It makes excellent sense to use the quantities Δ_i^2 for this purpose rather than the Δ_i themselves, since in the latter case a large positive difference would obliterate the effect of several small negative differences, contrary to the requirement that each point be given equal weight in the fitting process.

formed from the columns of (7-65), and let W be the subspace of R^n spanned by x_1, . . . , x_m (recall that m < n). We now make the assumption that our experiments were so designed that x_1, . . . , x_m are linearly independent, so that they form a basis for W. The c'_1, . . . , c'_m are then chosen so that the vector

    c'_1 x_1 + · · · + c'_m x_m

is the perpendicular projection of y onto W. Thus the c'_i must minimize the length of the vector

    (c'_1 x_1 + · · · + c'_m x_m) - y,

or, equivalently, they must minimize the quantity

    Σ_{i=1}^{n} [(c'_1 x_{i1} + · · · + c'_m x_{im}) - y_i]^2,   (7-66)

which is the square of the length of this vector.

In practice the c'_i are usually determined from the orthogonality relations

    [(c'_1 x_1 + · · · + c'_m x_m) - y] · x_i = 0,

i = 1, . . . , m. They yield the system of m linear equations

    (x_1 · x_1)c'_1 + (x_1 · x_2)c'_2 + · · · + (x_1 · x_m)c'_m = x_1 · y,
    (x_2 · x_1)c'_1 + (x_2 · x_2)c'_2 + · · · + (x_2 · x_m)c'_m = x_2 · y,
    . . . . . . . . .
    (x_m · x_1)c'_1 + (x_m · x_2)c'_2 + · · · + (x_m · x_m)c'_m = x_m · y,

in the m unknowns c'_1, . . . , c'_m, which, in slightly different notation, has already been discussed in the preceding section [see (7-53)]. These equations are called the normal equations for the approximation in question.

Example 2. Let y = c_1 x_1 + c_2 x_2, and suppose that as a result of four separate experimental determinations we have found the set of equations

    15 = c_1 + 2c_2,
    12 = 2c_1 + c_2,
    10 = c_1 + c_2,
     0 = c_1 - c_2.

Use the method of least squares to find the best approximation to c_1 and c_2.

In this case, m = 2, n = 4, and

    x_1 = (1, 2, 1, 1),    x_2 = (2, 1, 1, -1),    y = (15, 12, 10, 0).

7-6 THE METHOD OF LEAST SQUARES 295

(Note that x_1 and x_2 are linearly independent in R^4, as required.) Thus the normal equations for this approximation are

    7c_1' + 4c_2' = 49,
    4c_1' + 7c_2' = 52,

and c_1' = 45/11, c_2' = 56/11. These then are the experimentally determined values of c_1 and c_2, and we have

    11y = 45x_1 + 56x_2.
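The arithmetic of the normal equations is easy to check by machine. The following sketch (ours, not from the text) forms the inner products and right-hand sides for the example above and solves the resulting 2 × 2 system by Cramer's rule:

```python
# Columns of (7-65) for the example, and the data vector y.
x1 = [1, 2, 1, 1]
x2 = [2, 1, 1, -1]
y = [15, 12, 10, 0]

def dot(u, v):
    # Euclidean inner product of two vectors given as lists.
    return sum(a * b for a, b in zip(u, v))

# Coefficients of the normal equations: (x_i . x_j) and (x_i . y).
a11, a12, a22 = dot(x1, x1), dot(x1, x2), dot(x2, x2)
b1, b2 = dot(x1, y), dot(x2, y)

# Solve the 2x2 system by Cramer's rule.
det = a11 * a22 - a12 * a12
c1 = (b1 * a22 - b2 * a12) / det
c2 = (a11 * b2 - a12 * b1) / det
print(a11, a12, a22, b1, b2)  # 7 4 7 49 52
print(c1, c2)                 # 45/11 and 56/11, as in the text
```

The printed system matches the normal equations 7c_1' + 4c_2' = 49, 4c_1' + 7c_2' = 52 derived above.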

Example 3. Find the equation of the parabola in the xy-plane which passes through the origin of coordinates and has vertical axis, and which best fits the points (-1, 3), (1, 1), (2, 5) in the sense of the method of least squares.

The general equation of such a parabola is

    y = c_1 x^2 + c_2 x,

and the above data give the set of equations

    3 = c_1 - c_2,
    1 = c_1 + c_2,
    5 = 4c_1 + 2c_2.

Thus the least square approximation to c_1 and c_2 is obtained by solving the normal equations for the vectors

    x_1 = (1, 1, 4),    x_2 = (-1, 1, 2),    y = (3, 1, 5).

These equations are

    18c_1' + 8c_2' = 24,
     8c_1' + 6c_2' = 8,

and their solution is c_1' = 20/11, c_2' = -12/11. Hence the desired parabola is 11y = 20x^2 - 12x, and as can be seen in Fig. 7-18, this curve really does fit the given points extremely well.

FIGURE 7-18 (11y = 20x^2 - 12x)
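The same computation can be organized for any number of columns. A small sketch (ours; the helper name `normal_equations` is our own, introduced for illustration) builds the normal equations generically and applies them to Example 3's data:

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normal_equations(cols, y):
    # Gram matrix (x_i . x_j) and right-hand sides (x_i . y).
    m = len(cols)
    G = [[dot(cols[i], cols[j]) for j in range(m)] for i in range(m)]
    b = [dot(cols[i], y) for i in range(m)]
    return G, b

points = [(-1, 3), (1, 1), (2, 5)]
x1 = [x * x for x, _ in points]   # column multiplying c_1 in y = c_1 x^2 + c_2 x
x2 = [x for x, _ in points]       # column multiplying c_2
y = [v for _, v in points]

G, b = normal_equations([x1, x2], y)
print(G, b)   # [[18, 8], [8, 6]] [24, 8]

# Solve the 2x2 system by Cramer's rule.
det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
c1 = (b[0] * G[1][1] - b[1] * G[0][1]) / det
c2 = (G[0][0] * b[1] - G[1][0] * b[0]) / det
print(c1, c2)  # 20/11 and -12/11, the parabola 11y = 20x^2 - 12x
```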


EXERCISES

1. When an experimental approximation is made by replacing a vector x in an n-dimensional Euclidean space V by its perpendicular projection x' on a subspace of V, the quantity ||x - x'||^2/n is called the variance of the approximation. The magnitude of the variance of the approximation is taken as a measure of the consistency of the data used in obtaining it, and in judging between two experimental approximations the one which has the smaller variance is taken to be the more consistent. Suppose two students perform a series of five experiments to determine the value of a physical constant with the following results:

    Student A: 1.402  1.420  1.395  1.418  1.406
    Student B: 1.416  1.405  1.419  1.422  1.406

(a) Find the best approximation to the constant which can be made by each student.
(b) Which set of values is the more consistent?

2. Let (x_1, ..., x_m) be the vector whose components represent the results of a series of m experimental determinations of a physical constant, and let (x_{m+1}, ..., x_{m+n}) represent the results of n experimental determinations of the same constant. Show that the variances for the two separate series of experiments do not both exceed the variance for the single series of experiments (x_1, ..., x_{m+n}). Under what conditions will equality hold?

3. (a) Repeat Exercise 1 for the following two series of measurements

Student A: 16.02 15.99 16.12 16.07 16.10

Student B: 16.11 16.16 16.07 16.12 16.14

(b) Consider the sequence obtained by combining the two sets of data in (a). Find
the best approximation for this sequence, and compare its variance with the vari-
ances for the two approximations in (a).

4. Prove that the assertion made in the text concerning expression (7-61) is true.

5. The center of population of a geographical region is by definition the centroid of the residences of its inhabitants. Suppose that the population of a certain region is concentrated in four cities, with a negligible number of residents elsewhere, and suppose that the population and position (with respect to a Cartesian coordinate system in the plane) of each is as follows:

City A : 1 ,000,000 at (0, 0) City B : 600,000 at (10, 5)

CityC: 300,000 at (-5, 10) CityD: 100,000 at (0, -20)

Find the center of population.


6. Repeat Exercise 5 for the following data

City A: 2,000,000 at (0, 0) CityB: 500,000 at (6, 18)

City C: 300,000 at (-3, 15) City D: 200,000 at (0, - 12)

7. The center of mass, or center of gravity, of a system of k particles, each of mass m_i located at the point x_i, i = 1, 2, ..., k, in R^n, is defined to be the centroid of the k vectors m_i x_i. Find the center of mass of each of the following systems in R^3.

(a) m_1 = 2, x_1 = (1, 3, 0)          (b) m_1 = 1, x_1 = (1, 5, 0)
    m_2 = 3, x_2 = (-2, 4, 1)             m_2 = 1, x_2 = (2, -3, 3)
    m_3 = 1, x_3 = (0, 6, 2)              m_3 = 3, x_3 = (7, 0, 1)
    m_4 = 4, x_4 = (-1, -5, 3)            m_4 = 5, x_4 = (2, -3, 6)

(c) m_1 = 5, x_1 = (3, 1, -2)
    m_2 = 1, x_2 = (4, 0, 1)
    m_3 = 4, x_3 = (0, 0, 0)
    m_4 = 3, x_4 = (2, 2, 1)
    m_5 = 5, x_5 = (1, -1, 2)
    m_6 = 2, x_6 = (4, 1, 6)

8. Repeat Exercise 7 for the following data:


(a) m_1 = 1, x_1 = (1, 2, -3)
    m_2 = 2, x_2 = (3, -4, 5)
    m_3 = 3, x_3 = (-1, 0, 0)

(b) m_1 = 1, x_1 = (7, -5, 2)         (c) m_1 = 2, x_1 = (2, 3, -1)
    m_2 = 2, x_2 = (6, 3, 1)              m_2 = 5, x_2 = (4, -2, 1)
    m_3 = 5, x_3 = (-1, -1, 2)            m_3 = 4, x_3 = (6, -3, 2)
    m_4 = 8, x_4 = (2, 2, 0)              m_4 = 3, x_4 = (-1, -1, 0)
    m_5 = 4, x_5 = (3, 5, -1)             m_5 = 6, x_5 = (0, 0, 0)
    m_6 = 5, x_6 = (7, 1, 0)

9. (a) Show that the center of mass x (Exercise 7) of k particles, each of mass m_i located at the point x_i, i = 1, 2, ..., k, minimizes the quantity ||Mx - Σ_{i=1}^k m_i x_i||^2, where M = Σ_{i=1}^k m_i.

(b) Let x_1 = (0, 0, 0), x_2 = (1, -1, 3), x_3 = (1, 1, 5), x_4 = (2, -8, -10). Find a set of masses m_1, m_2, m_3, m_4 such that ||10x - Σ_{i=1}^4 m_i x_i||^2 is minimized by the vector x = (1, -1, 2).

10. Let m_1 = 4, x_1 = (0, 0, 0); m_2 = 3, x_2 = (1, 1, 1); m_3 = 2, x_3 = (1, 1, 0); m_4 = 1. Determine x_4 so that ||10x - Σ_{i=1}^4 m_i x_i||^2 is minimized by x = (1, 2, 1). (See Exercise 9(a).)
11. Let T be a linear transformation mapping R^n into itself, and let x be the centroid of k vectors x_1, ..., x_k in R^n. Show that T(x) is the centroid of the vectors T(x_1), ..., T(x_k).
12. Let T be the transformation which maps each vector in a Euclidean space V onto its
best approximation (in the sense of least squares) in a given finite dimensional sub-
space of V. Prove that T is a linear transformation. What are the null space and
image of T?
13. Find the line y = cx in R^2 which best fits each of the following sets of points. Sketch the graph of each of these lines, and plot the given points relative to the standard coordinate system in R^2.

(a) (5, 9), (10, 21), (15, 19) (b) (-4, -11), (1, 3), (5, 16), (10, 29)
(c) (-6, 10), (2, -2), (8, -11), (10, -16)
14. Repeat Exercise 13 using the following data.
(a) (-4, 9), (4, -10), (12, -25) (b) (-5, -16), (10, 31), (15, 31), (20, 40)
(c) (-4,-5), (8,9), (12,16), (20,24)

In Exercises 15-18 suppose that y = c_1 x_1 + c_2 x_2, and use the method of least squares to obtain an approximate formula for y from the given data.

15. x_11 = 1, x_12 = 0, y_1 = 2        16. x_11 = 1, x_12 = 0, y_1 = 1
    x_21 = 0, x_22 = 1, y_2 = 3            x_21 = 0, x_22 = 1, y_2 = 1
    x_31 = 1, x_32 = 1, y_3 = 2            x_31 = -1, x_32 = 0, y_3 = 3
                                           x_41 = 1, x_42 = -1, y_4 =

17. x_11 = 10, x_12 = 10, y_1 =
    x_21 = 10, x_22 = -10, y_2 = 19
    x_31 = -10, x_32 = 10, y_3 = -21
    x_41 = -10, x_42 = -10, y_4 =

18. x_11 = 3, x_12 = 4, y_1 =
    x_21 = 7, x_22 = 1, y_2 = 10
    x_31 = 1, x_32 = 1, y_3 = -5
    x_41 = 4, x_42 = 3, y_4 =

19. Find the best approximation to the formula y = c_1 x_1 + c_2 x_2 + c_3 x_3 from the following data:

    x_11 = 0, x_12 = 0, x_13 = 0, y_1 = 1
    x_21 = 1, x_22 = 1, x_23 = -1, y_2 = 2
    x_31 = 1, x_32 = -1, x_33 = 1, y_3 = -1
    x_41 = 0, x_42 = 1, x_43 = 1, y_4 = -2
20. Find the parabola through the origin in R^2 with vertical axis which, in the sense of least squares, best fits each of the following sets of points. Sketch the graph of each parabola and plot the points it approximates.

(a) (1, 2), (2, 5), (3, 9)    (b) (-1, -1), (1, 0), (3, -5)
(c) (-1, 4), (1, 2), (2, 10)
21. Repeat Exercise 20 for the following sets of points.

(a) (1, 4), (2, 5), (-1, -3) (b) (1, 3), (2, 7), (-1, 2), (-2, 8)

(c) (1,-1), (2,-3), (-2,1)


22. Find a cubic curve through the origin in R^2 which, in the sense of least squares, best fits each of the following sets of points. Sketch the graph of each curve obtained and plot the approximating points.

(a) (1, -2), (-1, 1), (2, -5), (3, -20) (b) (1, 2), (-1, 1), (2, 7), (-2, -4)

23. Find the fourth-degree equation in the form

    y = ax^4 + bx^2 + cx

which, in the sense of least squares, best fits the following points in R^2:

    (-2, 2), (-1, 1), (1, 2), (2, 1).

*7-7 AN APPLICATION TO LINEAR DIFFERENTIAL EQUATIONS

In Chapter 4 we learned how to obtain n distinct solutions of any nth-order homogeneous linear differential equation with constant coefficients from the roots of the auxiliary polynomial. At that time we also indicated how operator techniques could be used to prove the linear independence of these solutions in the real vector space C(I), where I is an arbitrary finite or infinite interval. We refrained from doing so, however, because such a proof is both tedious and uninteresting. But now that we have the notion of an inner product available, we shall give a particularly elegant proof of this result which has the added virtue of serving as an excellent example of the interplay between analysis and the theory of Euclidean spaces.
Specifically, we must prove that every set of functions of the form

    x^m e^{ax} sin bx    and    x^m e^{ax} cos bx,        (7-67)

where a and b are real numbers (b ≥ 0) and m is a non-negative integer, is linearly independent in the real vector space C(-∞, ∞).* To this end suppose that F is a (finite) linear combination of such functions, and that F(x) = 0. We must show that all of the coefficients in F are zero.

Our first step consists of rewriting F by grouping together those terms which contain the same exponential factor, so that

    F(x) = e^{a_1 x} P_1(x) + e^{a_2 x} P_2(x) + ... + e^{a_r x} P_r(x),        (7-68)

where a_1 > a_2 > ... > a_r, and each P_i(x) is a linear combination of functions of the form x^m sin bx and x^m cos bx. Next we rewrite each P_i(x) by grouping together those terms which contain the same power of x. This gives

    P_i(x) = T_i0(x) + T_i1(x) x + ... + T_is_i(x) x^{s_i},        (7-69)

in which the coefficients T_ij are expressions of the form

    T_ij(x) = Σ_{k=1}^n (α_k cos a_k x + β_k sin b_k x),        (7-70)

where the α_k, β_k are real numbers and the a_k, b_k are non-negative real numbers. In what follows we shall refer to expressions of this form as trigonometric sums, and we note that every such sum may be written as

    α_0/2 + Σ_{k=1}^n (α_k cos a_k x + β_k sin b_k x)

with a_k, b_k positive.

The essential step in proving that all of the coefficients of F are zero is furnished by the following lemma.

Lemma 7-2. If T is a trigonometric sum with the property that lim_{x→∞} T(x) = 0, then all of the coefficients of T are zero.

* Note that it is sufficient to consider linear independence in C(-∞, ∞) since by Theorem 3-2 (the uniqueness theorem) such a set of solutions is linearly independent in C(-∞, ∞) if and only if it is linearly independent in C(I) for every subinterval I of (-∞, ∞). (See Exercise 22, Section 4-3.)



Proof. Let 𝒯 be the real vector space consisting of all trigonometric sums, with the usual definitions of addition and scalar multiplication, and define an inner product on 𝒯 by

    f·g = lim_{t→∞} (1/t) ∫_0^t f(x)g(x) dx        (7-71)

for any pair of trigonometric sums f and g. It is easy to see that (7-71) does in fact define an inner product on 𝒯 provided that the limit in question always exists (see Exercise 1). To prove the existence of this limit we first observe that since integration is a linear operation (i.e., is performed term by term) it suffices to consider the case in which f and g are "monomials." In other words, we need only establish the existence of (7-71) when

    Case 1. f(x) = cos a_1 x, g(x) = cos a_2 x, a_1, a_2 ≥ 0;
    Case 2. f(x) = cos a_1 x, g(x) = sin b_1 x, a_1 ≥ 0, b_1 > 0;
    Case 3. f(x) = sin b_1 x, g(x) = sin b_2 x, b_1, b_2 > 0.

Suppressing most of the details, we obtain the following results.

Case 1.

    f·g = lim_{t→∞} (1/2t) [sin (a_1 - a_2)t / (a_1 - a_2) + sin (a_1 + a_2)t / (a_1 + a_2)]
        = 0,    if a_1 ≠ a_2,

    f·g = lim_{t→∞} [sin 2a_1 t / (4a_1 t) + 1/2] = 1/2,    if a_1 = a_2 ≠ 0,

    f·g = lim_{t→∞} (1/t) ∫_0^t dx = 1,    if a_1 = a_2 = 0.

Case 2.

    f·g = lim_{t→∞} (1/2t) [-cos (b_1 - a_1)t / (b_1 - a_1) - cos (b_1 + a_1)t / (b_1 + a_1)
                            + 1/(b_1 - a_1) + 1/(b_1 + a_1)]
        = 0,    if a_1 ≠ b_1,

    f·g = lim_{t→∞} (1/2t) [(1 - cos 2a_1 t) / (2a_1)] = 0,    if a_1 = b_1.

Case 3.

    f·g = lim_{t→∞} (1/2t) [sin (b_1 - b_2)t / (b_1 - b_2) - sin (b_1 + b_2)t / (b_1 + b_2)]
        = 0,    if b_1 ≠ b_2,

    f·g = lim_{t→∞} (1/2t) [t - sin 2b_1 t / (2b_1)] = 1/2,    if b_1 = b_2.

Hence the necessary limits exist, and (7-71) defines an inner product on 𝒯. Moreover, from the above computations we see that whenever a and b are positive real numbers the functions

    1,    cos ax,    sin bx

are orthogonal in 𝒯, and

    ‖1‖ = 1,    ‖cos ax‖ = 1/√2,    ‖sin bx‖ = 1/√2.        (7-72)
Now suppose that

    T(x) = α_0/2 + Σ_{k=1}^n (α_k cos a_k x + β_k sin b_k x)

is a trigonometric sum such that

    lim_{x→∞} T(x) = 0.

Making use of the orthogonality just established, together with the relations listed in (7-72), we have

    α_k = (T(x)·cos a_k x) / ‖cos a_k x‖^2 = 2 lim_{t→∞} (1/t) ∫_0^t T(x) cos a_k x dx,    k = 0, 1, ..., n,

    β_k = (T(x)·sin b_k x) / ‖sin b_k x‖^2 = 2 lim_{t→∞} (1/t) ∫_0^t T(x) sin b_k x dx,    k = 1, 2, ..., n,

and to prove the lemma we must show that all of these coefficients are zero.
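The averaged integrals that define these coefficients can be sampled numerically. In the sketch below (our own; the sample sum T, the truncation t = 1000, and the midpoint rule are illustrative choices, and a finite t only approximates the limit), the formulas recover the known coefficients of a concrete trigonometric sum:

```python
import math

def T(x):
    # A sample trigonometric sum: alpha_0 = 3, alpha_3 = 2, beta_5 = -0.5.
    return 1.5 + 2.0 * math.cos(3 * x) - 0.5 * math.sin(5 * x)

def avg_integral(f, t, n=100000):
    # (1/t) * integral of f over [0, t], midpoint rule with n panels.
    h = t / n
    return sum(f((i + 0.5) * h) for i in range(n)) * h / t

t = 1000.0
alpha0 = 2 * avg_integral(T, t)                                   # ~3.0
alpha3 = 2 * avg_integral(lambda x: T(x) * math.cos(3 * x), t)    # ~2.0
beta5 = 2 * avg_integral(lambda x: T(x) * math.sin(5 * x), t)     # ~-0.5
print(alpha0, alpha3, beta5)
```

The cross terms average out at the rate 1/t, which is why a large but finite t already comes close to the limiting values.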
Let us examine the α_k. By assumption, T(x) → 0 as x → ∞. Hence, given any real number ε > 0, there exists a real number r > 0 such that

    |T(x)| < ε    whenever x ≥ r.

Then

    α_k = 2 lim_{t→∞} (1/t) ∫_0^t T(x) cos a_k x dx
        = 2 lim_{t→∞} [(1/t) ∫_0^r T(x) cos a_k x dx + (1/t) ∫_r^t T(x) cos a_k x dx].

But

    lim_{t→∞} (1/t) ∫_0^r T(x) cos a_k x dx = 0,

while

    lim_{t→∞} (1/t) |∫_r^t T(x) cos a_k x dx| ≤ lim_{t→∞} (1/t) ∫_r^t |T(x) cos a_k x| dx
        ≤ lim_{t→∞} (1/t) ∫_r^t ε dx
        = lim_{t→∞} ε(t - r)/t
        = ε.

Hence |α_k| ≤ 2ε for every positive number ε, and it follows that α_k = 0. A similar argument shows that β_k = 0 for all k, and the lemma is proved. ∎

It is now relatively easy to establish our main result, which we state formally as

Theorem 7-7. The set of all functions of the form (7-67) is linearly independent in C(-∞, ∞).

Proof. Let F be a finite linear combination of these functions such that F(x) = 0. We rewrite F in the manner described above as

    F(x) = e^{a_1 x} P_1(x) + ... + e^{a_r x} P_r(x),        (7-73)

where a_1 > a_2 > ... > a_r, and

    P_i(x) = T_i0(x) + ... + T_is_i(x) x^{s_i}.        (7-74)

We shall assume that F has at least one nonzero coefficient, and deduce a contradiction.

Indeed, if such is the case there exists at least one integer i such that P_i has a nonzero coefficient. With i chosen as small as possible, we have

    F(x) = e^{a_i x} P_i(x) + e^{a_{i+1} x} P_{i+1}(x) + ... + e^{a_r x} P_r(x),

and hence

    P_i(x) = e^{-a_i x} F(x) - e^{(a_{i+1} - a_i)x} P_{i+1}(x) - ... - e^{(a_r - a_i)x} P_r(x).

Since F(x) = 0,

    lim_{x→∞} e^{-a_i x} F(x) = 0,

and since

    lim_{x→∞} e^{(a_j - a_i)x} P_j(x) = 0

whenever j > i, it follows that

    lim_{x→∞} P_i(x) = 0.

But P_i has at least one nonzero coefficient. Hence there exists at least one index j in (7-74) such that the trigonometric sum T_ij has a nonzero coefficient. If we choose j as large as possible, then

    P_i(x) = T_i0(x) + ... + T_ij(x) x^j,

whence

    P_i(x)/x^j = T_i0(x)/x^j + T_i1(x)/x^{j-1} + ... + T_{i,j-1}(x)/x + T_ij(x).

Since P_i(x)/x^j and each of the terms on the right-hand side other than T_ij(x) tend to zero as x → ∞, we conclude that

    lim_{x→∞} T_ij(x) = 0.

This, however, contradicts Lemma 7-2, since T_ij has a nonzero coefficient. It follows that all of the coefficients of F must be zero, and the theorem is proved. ∎

EXERCISES

1. Verify that (7-71) defines an inner product on the vector space 𝒯 of all trigonometric sums. In particular, if

    f(x) = α_0/2 + Σ_{k=1}^n (α_k cos a_k x + β_k sin b_k x),

show that

    f·f = α_0^2/4 + (1/2) Σ_{k=1}^n (α_k^2 + β_k^2).

2. Suppose that the roots of the auxiliary polynomial for a constant coefficient linear differential equation are 2, 2, 2 ± 2i, 2 ± 2i, 2 ± 3i, 2 ± 3i, 2 ± 3i, 1 ± i, 1 ± i. Write out F as in the text, and group the terms in the way described.

3. (a) In the proof of Theorem 7-7 we asserted that

    lim_{x→∞} e^{(a_j - a_i)x} P_j(x) = 0

whenever j > i. Prove the assertion.

(b) Prove that

    lim_{x→∞} T(x)/x^n = 0

whenever T is a trigonometric sum and n is a positive integer.


8
convergence in euclidean spaces *

8-1 SEQUENTIAL CONVERGENCE


In this chapter we prepare the way for the study of such topics as Fourier series
and boundary value problems by introducing the notion of convergence in Eu-
clidean spaces. As we shall see, the really interesting applications of this concept
occur in infinite dimensional spaces, but for the sake of completeness the basic
definition will be given without reference to the dimension of the underlying space.

Definition 8-1. A sequence {x_k} = {x_1, x_2, ...} of vectors in a Euclidean space V is said to converge to the vector x in V if and only if

    lim_{k→∞} ‖x_k - x‖ = 0.        (8-1)

In this case x is said to be the limit of {x_k}, and we denote the fact that the sequence {x_k} converges to x by writing

    lim_{k→∞} {x_k} = x.

We recall that the above definition is an abbreviation for the statement: {x_k} converges to x if and only if for each real number ε > 0, an integer K can be found such that

    ‖x_k - x‖ < ε        (8-2)

for all k > K. In general, of course, the integer K depends upon ε, and increases as ε approaches 0. Moreover, since the quantity ‖x_k - x‖ is just the distance from x_k to x, (8-2) asserts that {x_k} converges to x if and only if the distance from x_k to x approaches 0 as k becomes large. And this is precisely what intuition demands of sequential convergence.

* Although this chapter is logically self-contained, we assume that the reader is familiar
with the notions of sequential convergence and infinite series as studied in elementary
calculus. A review of this material can be found in Appendix I.

Before Definition 8-1 can be accepted as a reasonable description of convergence


we must prove that the limit of a convergent sequence of vectors is unique, in the
sense that a given sequence can never converge to more than one vector. This
we do by establishing

Lemma 8-1. If

    lim_{k→∞} {x_k} = x    and    lim_{k→∞} {x_k} = y,

then x = y.

Proof. Let ε > 0 be given. Then, by definition, there exists an integer K such that

    ‖x_k - x‖ < ε/2    and    ‖x_k - y‖ < ε/2

whenever k > K. Thus, by the triangle inequality,

    ‖x - y‖ ≤ ‖x - x_k‖ + ‖x_k - y‖ < ε.

Since ε > 0 was arbitrary, this inequality implies that ‖x - y‖ = 0. Hence x = y, as asserted. ∎

Example 1. Let {a_k} be a sequence of real numbers, viewed as vectors in R^1. Then {a_k} converges to the real number (i.e., vector) a if and only if

    lim_{k→∞} ‖a_k - a‖ = 0.

Recalling that the inner product in R^1 is just ordinary multiplication of real numbers, we find that

    ‖a_k - a‖ = [(a_k - a)·(a_k - a)]^{1/2} = [(a_k - a)^2]^{1/2} = |a_k - a|.

Hence {a_k} converges to a if and only if

    lim_{k→∞} |a_k - a| = 0.

But this equation is none other than the definition of sequential convergence given in elementary calculus, and it therefore follows that sequential convergence in R^1 is identical with the usual convergence of sequences of real numbers.
306 CONVERGENCE IN EUCLIDEAN SPACES I CHAP. 8

Example 2. Let e_1, ..., e_n be an orthonormal basis in R^n, and let {x_k} be a sequence of vectors in R^n. Then if

    x = α_1 e_1 + ... + α_n e_n

is an arbitrary vector in R^n, and if, for each integer k,

    x_k = α_1k e_1 + ... + α_nk e_n,

we have

    ‖x_k - x‖^2 = (α_1k - α_1)^2 + ... + (α_nk - α_n)^2.

Hence

    lim_{k→∞} ‖x_k - x‖ = 0

if and only if the sequences

    {α_1k} = {α_11, α_12, ...},
    {α_2k} = {α_21, α_22, ...},
    ..........
    {α_nk} = {α_n1, α_n2, ...}

converge (as ordinary sequences of real numbers) to α_1, ..., α_n, respectively.

FIGURE 8-1

For instance, if e_1 and e_2 are the standard basis vectors in R^2, then

    {x_k} = {e_1 + (-1)^{k+1} e_2} = {e_1 + e_2, e_1 - e_2, e_1 + e_2, ...}

does not converge because the sequence {1, -1, 1, ...} formed from the components of e_2 is not a convergent sequence of real numbers (see Fig. 8-1). On the other hand, the sequence

    {(1/2^{k-1}) e_1 + e_2}

converges to the vector e_2, since {1, 1/2, 1/4, ...} converges to 0, and {1, 1, 1, ...} converges to 1 (Fig. 8-2).
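The componentwise criterion is easy to watch numerically. A short sketch (ours, not part of the text) of the two sequences just discussed, with components written as tuples in the standard basis:

```python
def x_oscillating(k):
    # e1 + (-1)^(k+1) e2: components (1, +1, 1, -1, ...) pattern in e2.
    return (1.0, (-1.0) ** (k + 1))

def x_converging(k):
    # (1/2^(k-1)) e1 + e2: e1-component tends to 0, e2-component is 1.
    return (1.0 / 2 ** (k - 1), 1.0)

def dist(u, v):
    # Euclidean distance, i.e. the norm of u - v.
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5

# Distances from the candidate limit e2 = (0, 1) shrink:
d = [dist(x_converging(k), (0.0, 1.0)) for k in (1, 5, 10, 20)]
print(d)

# The oscillating sequence stays at distance 0 or 2 from e1 + e2:
d2 = [dist(x_oscillating(k), (1.0, 1.0)) for k in (1, 2, 3, 4)]
print(d2)  # [0.0, 2.0, 0.0, 2.0]
```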

The preceding examples show that the study of convergence in finite dimensional Euclidean spaces is essentially the same as the study of convergence of sequences of real numbers. However, in infinite dimensional spaces such as C[a, b] the situation becomes much more complex, and correspondingly more interesting. For then the type of convergence defined above is radically different from that studied in calculus under the name of pointwise convergence.* Indeed, we shall see momentarily that in a function space with an integral inner product the assertion that

    lim_{k→∞} ‖f_k - f‖ = lim_{k→∞} (∫_a^b [f_k(x) - f(x)]^2 dx)^{1/2} = 0

is not at all the same as saying that the sequence {f_k} converges to the function f at every point of [a, b]. In analysis such convergence is known as mean convergence, to emphasize that it is computed by integration, which, in a sense, is a generalized averaging process.

FIGURE 8-2

Example 3. The sequence of functions {x, x^2, x^3, ...} converges in the mean in C[-1, 1] to the zero function, since

    lim_{k→∞} ‖x^k - 0‖ = lim_{k→∞} (∫_{-1}^1 x^{2k} dx)^{1/2} = lim_{k→∞} (2/(2k + 1))^{1/2} = 0.

This notwithstanding, {x, x^2, x^3, ...} does not converge to zero at each point in the interval [-1, 1]. In fact, at x = 1 the sequence converges to 1, while at x = -1 it does not converge at all. (See Fig. 8-3.)
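A numerical sketch (ours) makes the contrast concrete: the mean norms ‖x^k − 0‖ shrink toward zero and agree with the closed form (2/(2k+1))^{1/2} computed above, while the endpoint values do not tend to zero at all.

```python
def mean_norm(k, n=2000):
    # Midpoint-rule approximation of (integral of x^(2k) over [-1, 1])^(1/2).
    h = 2.0 / n
    s = sum((-1.0 + (i + 0.5) * h) ** (2 * k) for i in range(n))
    return (s * h) ** 0.5

ks = (1, 5, 25, 100)
norms = [mean_norm(k) for k in ks]
print(norms)                                    # decreasing toward 0
print([(2 / (2 * k + 1)) ** 0.5 for k in ks])   # the exact values

# Pointwise, the endpoints misbehave:
print([1.0 ** k for k in (1, 5, 25)])           # always 1 at x = 1
print([(-1.0) ** k for k in (1, 2, 3)])         # oscillates at x = -1
```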

The example just given shows that mean convergence is different from pointwise convergence. The one which follows shows how different it really is.

FIGURE 8-3

Example 4. Let PC[a, b] denote the set of all piecewise continuous functions on [a, b]; that is, the set of all functions which are continuous everywhere on [a, b] except (possibly) at a finite number of points where they have jump discontinuities (see Section 5-1). In Section 9-2 we will prove that PC[a, b] can be regarded as a Euclidean space under the usual definitions of addition, scalar multiplication, and inner product, the last being defined by the formula

    f·g = ∫_a^b f(x)g(x) dx.

Accepting the truth of this assertion, let [a, b] be the unit interval [0, 1], and let I_1, I_2, ... be the sequence of subintervals

    I_1 = [0, 1],
    I_2 = [0, 1/2],    I_3 = [1/2, 1],
    I_4 = [0, 1/3],    I_5 = [1/3, 2/3],    I_6 = [2/3, 1],
    ..........

FIGURE 8-4

(See Fig. 8-4.) For each integer k > 0, let f_k(x) be the function in PC[0, 1] which has the value 1 when x is in I_k and 0 elsewhere on [0, 1] (Fig. 8-5), and consider the sequence {f_k}. (The function f_k is known as the characteristic function of the interval I_k.) We assert that {f_k} converges in the mean in PC[0, 1] to the zero function, but does not converge pointwise anywhere in [0, 1].

To prove mean convergence we must show that

    lim_{k→∞} ‖f_k - 0‖ = 0.

But

    ‖f_k - 0‖ = (∫_0^1 [f_k(x)]^2 dx)^{1/2},

FIGURE 8-5

and since f_k is identically 1 on I_k and is zero elsewhere, the value of this integral is just the square root of the length of I_k. However, the length of I_k tends to zero with increasing k, and it follows that

    lim_{k→∞} (∫_0^1 [f_k(x)]^2 dx)^{1/2} = 0,

as asserted.

Now let x_0 be a fixed point in [0, 1], and consider the sequence of real numbers {f_k(x_0)}. We contend that this sequence does not converge. For, by definition, f_k(x_0) is either 1 or 0 depending on whether x_0 is or is not in I_k. But the I_k were constructed in such a way that there exist arbitrarily large values of k for which f_k(x_0) = 0 and arbitrarily large values of k for which f_k(x_0) = 1. Thus {f_k(x_0)} contains both zeros and ones no matter how far out in the sequence we go, and hence does not converge.
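The construction of Example 4 can be simulated directly. In the sketch below (ours; exact rational endpoints are used so that membership tests are unambiguous), the mean norms shrink while the values at a fixed point keep oscillating between 0 and 1:

```python
from fractions import Fraction

def intervals(n):
    # The intervals I_1, I_2, ...: block j consists of the j intervals
    # [(i-1)/j, i/j], i = 1, ..., j, as in Example 4.
    out = []
    j = 1
    while len(out) < n:
        out.extend((Fraction(i - 1, j), Fraction(i, j)) for i in range(1, j + 1))
        j += 1
    return out[:n]

ivals = intervals(20)

# ||f_k - 0|| is the square root of the length of I_k, which tends to 0:
norms = [float(b - a) ** 0.5 for a, b in ivals]
print(norms[0], norms[9], norms[19])

# But at a fixed point x0 the values f_k(x0) oscillate between 0 and 1:
x0 = Fraction(1, 3)
values = [1 if a <= x0 <= b else 0 for a, b in ivals]
print(values)   # contains both 0's and 1's arbitrarily far out
```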

EXERCISES

Determine which of the following sequences {a_k} converge in R^1, and find the limit of each convergent sequence.

(-1)"
1. ak = 2. ak =
2k + 1

3. ak = 1 + (-1)* 4. ak =
k + 1

A: + 1
5. ak = 1 + e 6. ak
(k + 1)(* - 1)

2
fc
+1 + (_i)
fe
In A;
7. ak = i. ak =
2 fc
310 CONVERGENCE IN EUCLIDEAN SPACES | CHAP. 8

Determine which of the following sequences {x_k} converge in R^3, and find the limit of each convergent sequence. (In each case the vectors e_1, e_2, e_3 are an orthonormal basis for R^3.)

9. x fc =
(*>
+
[ 1 t_ J
ei + e2 + , ,

2
e3 10. x* = — ei + k
2
e2 + (-l)*e 3

11. x fc
= (1 - 2~ fc

)ei + ^ e2 + (In *)e 3

k In A: , 1

j^TT ei + T- +
,

12 - x* = e2 e3
*TnT

13. Let {x_k} be a convergent sequence of vectors in a Euclidean space. Prove that for every real number ε > 0 there exists an integer K such that

    ‖x_m - x_n‖ < ε

whenever m, n > K. [Hint: Use the triangle inequality.]

14. In Chapter 10 it will be shown that

    lim_{k→∞} ∫_{-π}^{π} f(x) sin kx dx = 0

for any function f in C[-π, π]. Use this fact to prove that the sequence {sin kx}, k = 1, 2, ..., does not converge in the mean in C[-π, π].

15. Prove that the sequence {(sin kx)/k} converges in the mean in C[-π, π] to the zero function. Prove that this sequence also converges to zero for each x in the interval [-π, π].
*16. (For students familiar with uniform convergence.) Let {f_k} be a sequence of functions in C[a, b], and suppose that {f_k} converges uniformly on [a, b] to a function f in C[a, b]; i.e., given any ε > 0 there exists an integer K such that |f_k(x) - f(x)| < ε for all k > K and all x in [a, b]. Prove that {f_k} also converges in the mean to f.

8-2 SEQUENCES AND SERIES


In this section we establish a number of elementary facts concerning convergence
in Euclidean spaces. The first asserts that limits of sums and scalar products be-
have entirely as expected, in the sense of

Lemma 8-2. Let {x_k} and {y_k} be convergent sequences in a Euclidean space with limits x and y, respectively. Then the sequence {αx_k + βy_k} is convergent for every pair of real numbers α, β, and

    lim_{k→∞} {αx_k + βy_k} = αx + βy.

The proof is an immediate consequence of Definition 8-1, and is left to the student as an exercise (see Exercise 1 below).

For our purposes, a much more important property of convergence is that given by

Theorem 8-1. Let {x_k} and {y_k} be convergent sequences in a Euclidean space with limits x and y, respectively. Then {x_k·y_k} is a convergent sequence of real numbers, and

    lim_{k→∞} {x_k·y_k} = x·y.

In short, the inner product is a continuous function on a Euclidean space.

Proof. We must show that

    lim_{k→∞} |(x_k·y_k) - (x·y)| = 0.

To this end we set u_k = x_k - x, v_k = y_k - y, and write

    |(x_k·y_k) - (x·y)| = |(u_k + x)·(v_k + y) - (x·y)|
                        = |(u_k·v_k) + (x·v_k) + (y·u_k)|
                        ≤ |u_k·v_k| + |x·v_k| + |y·u_k|,

the last step following from Eq. (7-27), applied to R^1. We now use the Schwarz inequality to deduce that

    |u_k·v_k| ≤ ‖u_k‖ ‖v_k‖,
    |x·v_k| ≤ ‖x‖ ‖v_k‖,
    |y·u_k| ≤ ‖y‖ ‖u_k‖.

Hence

    |(x_k·y_k) - (x·y)| ≤ ‖u_k‖ ‖v_k‖ + ‖x‖ ‖v_k‖ + ‖y‖ ‖u_k‖,

and since ‖u_k‖ → 0 and ‖v_k‖ → 0 as k → ∞, we have

    lim_{k→∞} |(x_k·y_k) - (x·y)| = 0. ∎
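The chain of estimates in this proof can be observed numerically. The following sketch (our own illustrative data in R^3) checks that |x_k·y_k − x·y| is dominated by the bound ‖u_k‖‖v_k‖ + ‖x‖‖v_k‖ + ‖y‖‖u_k‖ and shrinks as k grows:

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return dot(u, u) ** 0.5

x, y = (1.0, 2.0, -1.0), (0.5, 0.0, 3.0)

def xk(k):  # a sequence converging to x
    return tuple(c + 1.0 / k for c in x)

def yk(k):  # a sequence converging to y
    return tuple(c - 2.0 / k ** 2 for c in y)

for k in (1, 10, 100):
    u = tuple(a - b for a, b in zip(xk(k), x))   # u_k = x_k - x
    v = tuple(a - b for a, b in zip(yk(k), y))   # v_k = y_k - y
    gap = abs(dot(xk(k), yk(k)) - dot(x, y))
    bound = norm(u) * norm(v) + norm(x) * norm(v) + norm(y) * norm(u)
    print(k, gap, bound)
    assert gap <= bound + 1e-12   # the inequality from the proof
```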

Now that we have introduced sequential convergence in Euclidean spaces, we can also discuss convergence of infinite series in the same context. Thus, with each series Σ_{k=1}^∞ x_k in a Euclidean space we associate its sequence of partial sums

    {x_1, x_1 + x_2, x_1 + x_2 + x_3, ...},

and then take the convergence of this sequence as the criterion for the convergence of the series in question. This is the content of

Definition 8-2. An infinite series Σ_{k=1}^∞ x_k of vectors in a Euclidean space V is said to converge to the vector x in V if and only if the associated sequence of partial sums converges to x in the sense of Definition 8-1. If this is the case we write

    x = Σ_{k=1}^∞ x_k,        (8-3)

and say that x has been expanded as an infinite series. In greater detail, Σ_{k=1}^∞ x_k converges to x if and only if for each real number ε > 0, there exists an integer K such that

    ‖x - Σ_{k=1}^n x_k‖ < ε        (8-4)

whenever n > K.

Example. Let e_1, ..., e_n be an orthonormal basis for R^n, and suppose Σ_{k=1}^∞ x_k is an infinite series of vectors in R^n. Then

    Σ_{k=1}^∞ x_k = (Σ_{k=1}^∞ α_1k) e_1 + ... + (Σ_{k=1}^∞ α_nk) e_n,

where the α_ik are the coordinates of x_k with respect to the given basis, and it follows from Example 2 of the preceding section that Σ_{k=1}^∞ x_k converges to the vector x = α_1 e_1 + ... + α_n e_n if and only if

    α_1 = Σ_{k=1}^∞ α_1k,    ...,    α_n = Σ_{k=1}^∞ α_nk.

In particular, this result implies that the usual tests for the convergence of a series of real numbers, such as the comparison test, the ratio test, etc., can be applied to determine whether or not an infinite series in R^n converges. The only difference is that the tests must be applied n times, component by component.
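A quick numerical sketch (ours, with illustrative data) of the componentwise criterion: a series in R^2 whose e_1-components form a geometric series and whose e_2-components telescope. Each component series is summed on its own, exactly as the example prescribes.

```python
# Terms x_k = (1/2^k) e1 + (1/(k(k+1))) e2, k = 1, ..., 199.
terms = [(1.0 / 2 ** k, 1.0 / (k * (k + 1))) for k in range(1, 200)]

# Component-by-component partial sums.
s1 = sum(t[0] for t in terms)   # geometric: approaches 1
s2 = sum(t[1] for t in terms)   # telescoping: equals 1 - 1/200
print(s1, s2)
```

Both component sums converge, so the vector series converges (here, to a vector very close to e_1 + e_2).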

EXERCISES

1. (a) Prove Lemma 8-2.

(b) Let {α_k} and {β_k} be convergent sequences of real numbers with limits α and β, respectively, and let {x_k} and {y_k} be as in Lemma 8-2. Prove that {α_k x_k + β_k y_k} converges to αx + βy.

2. (a) Let {x_k} be a convergent sequence of vectors in a Euclidean space V, and suppose that lim_{k→∞} {x_k} = x. Prove that

    lim_{k→∞} [(x_k - x)·y] = 0

for every vector y in V.



(b) Let V be a finite dimensional Euclidean space, and let {x_k} be a sequence of vectors in V. Suppose that there exists a vector x in V such that

    lim_{k→∞} [(x_k - x)·y] = 0

for all y in V. Prove that {x_k} converges to x in the sense of Definition 8-1. [Hint: Choose an orthonormal basis in V.]

(c) Give an example to show that the result in (b) fails when V is infinite dimensional. [Hint: Let V be the space of all continuously differentiable functions on [-π, π] with the standard inner product, let x_k = cos kx, k = 1, 2, ..., and let x = 0.]

8-3 BASES IN INFINITE DIMENSIONAL EUCLIDEAN SPACES

In Section 7-4 we used the Gram-Schmidt orthogonalization process to prove that every finite dimensional Euclidean space has an orthonormal basis e_1, ..., e_n, and that every vector in such a space can be written uniquely in the form

    x = (x·e_1) e_1 + ... + (x·e_n) e_n.        (8-5)

But the Gram-Schmidt process can be applied equally well in an infinite dimensional Euclidean space, where it can be used to produce an infinite orthonormal set e_1, e_2, .... This fact suggests that an extended version of our earlier results may be within our grasp, and the remaining sections of the present chapter are devoted to realizing this suggestion.

Before we begin, we emphasize that the ideas we are about to develop are natural and inevitable consequences of the study of finite dimensional Euclidean spaces, and should be understood as such. This, however, is not to deny their importance or to suggest that they are being presented as an academic exercise in generalization. Quite the contrary; for, as we shall see, these ideas lie at the heart of much of analysis and its applications to mathematical physics.
The most obvious way in which to attempt to generalize (8-5) in the presence of an orthonormal set e_1, e_2, ... in an infinite dimensional space V is to replace its right-hand side by the infinite series

    Σ_{k=1}^∞ (x·e_k) e_k.        (8-6)

However, in the absence of any further information there is clearly no a priori


reason for supposing that this series converges, much less that it converges to x.*
Nevertheless, it is convenient to have a notation which expresses the fact that

* Such a series is sometimes called a formal series, to emphasize the fact that on the
face of things it is nothing more than an expression which actually may be devoid of
meaning.
314 CONVERGENCE IN EUCLIDEAN SPACES | CHAP. 8

(8-6) is deduced from x, albeit in a purely formal way. The one most commonly
used is

    x ~ Σ_{k=1}^∞ (x · e_k)e_k,                               (8-7)

the symbol ~ (which, by the way, is no relative of the one used earlier for equiva-
lence relations) being used to emphasize that the series in question may not con-
verge to x. Of course, if it does, we write

    x = Σ_{k=1}^∞ (x · e_k)e_k,                               (8-8)

and say that the series converges in the mean to x. In either case, the inner products
x · e_k are called the coordinates or (generalized) Fourier coefficients of x with
respect to the orthonormal set e_1, e_2, ....

It is clear that the Fourier coefficients of x depend upon the orthonormal set
with respect to which they are computed. Not quite so clear, but equally true, is

that (8-6) may converge in the mean to x for one orthonormal set but not for
another. The following examples illustrate both of these points.

Example 1. Compute the generalized Fourier coefficients of the function

    f(x) = x,   −π < x < π,

in C[−π, π] with respect to the orthonormal set

    sin x/√π,  sin 2x/√π,  sin 3x/√π,  ...                    (8-9)

(see Example 2, Section 7-3). In this case the coefficients are

    x · e_k = (1/√π) ∫_{−π}^{π} x sin kx dx,   k = 1, 2, ...,

and, using integration by parts, we obtain

    (1/√π) ∫_{−π}^{π} x sin kx dx
        = (1/√π) [ −(x cos kx)/k |_{−π}^{π} + (1/k) ∫_{−π}^{π} cos kx dx ]
        = −(2√π cos kπ)/k
        = 2√π/k,    k = 1, 3, 5, ...,
          −2√π/k,   k = 2, 4, 6, ....


Hence the Fourier coefficients of x with respect to (8-9) are (−1)^{k−1}(2√π/k),
and (8-7) becomes

    x ~ 2 ( sin x − (sin 2x)/2 + (sin 3x)/3 − ··· ).

In the next chapter we shall see that this series actually converges in the mean to
x, so that we are justified in writing

    x = 2 Σ_{k=1}^∞ (−1)^{k−1} (sin kx)/k.
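As a sanity check on the computation above, the integrals can be approximated numerically. The sketch below is our addition, with an arbitrary grid size: it confirms that (1/√π) ∫ x sin kx dx agrees with (−1)^{k−1} · 2√π/k for the first few k.

```python
import math

# Midpoint-rule check (ours) of the coefficients x . e_k computed above:
# (1/sqrt(pi)) * integral of x*sin(kx) over [-pi, pi] vs (-1)^(k-1)*2*sqrt(pi)/k.
N = 100_000
h = 2 * math.pi / N
xs = [-math.pi + (i + 0.5) * h for i in range(N)]

def coeff(k):
    return sum(x * math.sin(k * x) for x in xs) * h / math.sqrt(math.pi)

for k in (1, 2, 3, 4):
    predicted = (-1) ** (k - 1) * 2 * math.sqrt(math.pi) / k
    print(k, round(coeff(k), 6), round(predicted, 6))
```

The two columns agree to quadrature accuracy, alternating in sign and shrinking like 1/k, exactly as the closed form predicts.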

Example 2. If we replace (8-9) with the orthonormal set

    1/√(2π),  cos x/√π,  cos 2x/√π,  ...                      (8-10)

the generalized Fourier coefficients of x become

    (1/√(2π)) ∫_{−π}^{π} x dx   and   (1/√π) ∫_{−π}^{π} x cos kx dx,   k = 1, 2, ....

But all of these integrals are zero (see Exercise 1), and hence (8-7) becomes

    x ~ 0.

This time it is clear that we do not have an equality, since the function f(x) = x
is not the zero vector of C[−π, π].
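The vanishing of these coefficients is just the statement that x and x cos kx are odd functions on a symmetric interval. A quick numerical confirmation, ours rather than the book's:

```python
import math

# Each coefficient of f(x) = x with respect to the cosine set (8-10) is an
# integral of an odd function over the symmetric interval [-pi, pi], hence
# zero (our check; grid size arbitrary).
N = 100_000
h = 2 * math.pi / N
xs = [-math.pi + (i + 0.5) * h for i in range(N)]

c0 = sum(xs) * h / math.sqrt(2 * math.pi)
ck = [sum(x * math.cos(k * x) for x in xs) * h / math.sqrt(math.pi)
      for k in (1, 2, 3)]
print(abs(c0) < 1e-9, all(abs(c) < 1e-9 for c in ck))
```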

We now return to the general situation in which e_1, e_2, ... is an (infinite)
orthonormal set in V, and examine (8-6) a little more closely. The sum of the
first n terms of this series is

    (x · e_1)e_1 + ··· + (x · e_n)e_n,

which we recognize from Section 7-5 as the perpendicular projection of x onto
the subspace V_n of V spanned by the orthonormal vectors e_1, ..., e_n. Hence,
if w is any vector in V_n,

    ||x − Σ_{k=1}^n (x · e_k)e_k|| ≤ ||x − w||.               (8-11)

This simple observation furnishes the key to the proof of the following theorem.

Theorem 8-2. Let Σ_{k=1}^∞ a_k e_k be any infinite series which converges in the
mean to x; i.e.,

    x = Σ_{k=1}^∞ a_k e_k.                                    (8-12)

Then a_k = x · e_k for each integer k.

This theorem asserts that whenever x can be expressed as an infinite series the
coefficients of the series are uniquely determined and must be the Fourier coeffi-
cients of x with respect to the orthonormal set e_1, e_2, .... This, of course, is
the infinite dimensional analog of the uniqueness of (8-5) in the finite dimensional
case.

Proof. Since

    x = Σ_{k=1}^∞ a_k e_k,

we have

    lim_{n→∞} ||x − Σ_{k=1}^n a_k e_k|| = 0.

But Σ_{k=1}^n a_k e_k is a vector in V_n, and hence, by (8-11),

    ||x − Σ_{k=1}^n (x · e_k)e_k|| ≤ ||x − Σ_{k=1}^n a_k e_k||

for all n. Thus

    lim_{n→∞} ||x − Σ_{k=1}^n (x · e_k)e_k|| = 0,

from which it follows that

    x = Σ_{k=1}^∞ (x · e_k)e_k.

Subtracting this series from (8-12) gives

    Σ_{k=1}^∞ [a_k − (x · e_k)]e_k = 0,                       (8-13)

and we must now show that this equation implies a_k − (x · e_k) = 0 for all k.
But since the series in (8-13) converges in the mean to 0, we have

    0 = lim_{n→∞} ||Σ_{k=1}^n [a_k − (x · e_k)]e_k|| = lim_{n→∞} ( Σ_{k=1}^n [a_k − (x · e_k)]² )^{1/2}.

The desired conclusion now follows from the fact that each of the terms
[a_k − (x · e_k)]² in this sum is non-negative. ∎
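The finite dimensional content of Theorem 8-2 is easy to see concretely. In the sketch below, our example rather than the text's, a vector of R³ is built from prescribed coefficients a_k relative to an orthonormal basis, and each a_k is then recovered as the inner product x · e_k.

```python
import math

# Finite dimensional analog of Theorem 8-2 (our illustration): if
# x = a_1 e_1 + a_2 e_2 + a_3 e_3 for an orthonormal basis of R^3, then
# each a_k is recovered as the inner product x . e_k.
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# an orthonormal basis of R^3: two vectors rotated 45 degrees, plus the z-axis
s = 1 / math.sqrt(2)
e = [(s, s, 0.0), (-s, s, 0.0), (0.0, 0.0, 1.0)]

a = (2.0, -3.0, 0.5)                      # arbitrary coefficients
x = tuple(sum(a[k] * e[k][i] for k in range(3)) for i in range(3))

recovered = [dot(x, ek) for ek in e]
print([round(c, 10) for c in recovered])  # [2.0, -3.0, 0.5]
```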

In view of this result it seems reasonable to extend the meaning of the term
"basis" to infinite dimensional Euclidean spaces as follows.

Definition 8-3. An orthogonal set of vectors e_1, e_2, ... in a Euclidean
space V is said to be a basis for V if and only if each vector x in V can be
written uniquely in the form

    x = Σ_{k=1}^∞ a_k e_k.                                    (8-14)

(Remember that such an equality must be interpreted as asserting that
the series in question converges in the mean to x.) The coefficients a_k in
this series are called the generalized Fourier coefficients of x with respect to
the given basis, and the series itself is called the generalized Fourier series
expansion of x.

The reader should note that the vectors in a basis for V are only required to be
orthogonal, not orthonormal. Of course, any basis for V can be normalized in
the standard fashion, but it would be a hindrance to restrict the definition in this
way. It is also possible to define a basis for an infinite dimensional Euclidean
space by using a linearly independent set of vectors rather than an orthogonal one.
This would be somewhat closer to the spirit of the definition in Chapter 1, but
since any such set can be orthogonalized by the Gram-Schmidt process, it would
not result in any significant gain in generality.*
For all we know at the moment, we may never encounter an infinite dimensional
Euclidean space with a basis. This notwithstanding, we shall assemble a number
of results about such spaces pending the time when we finally meet one. However,
in the absence of an immediate example it is only fair to state that these results
are more than speculations in a vacuum since all of the Euclidean spaces which
appear in analysis do, in fact, have bases.
We begin by making a number of simple observations, the first of which we
state formally as follows:

Lemma 8-3. The zero vector is the only vector which is orthogonal to
every vector in a basis for a Euclidean space V.

This is an immediate consequence of the postulated uniqueness of series expan-
sions relative to a basis, and the fact that Σ_{k=1}^∞ a_k e_k converges in the mean to
zero whenever a_k = 0 for all k. Despite its simplicity, Lemma 8-3 is often useful,
particularly in showing that an orthogonal set is not a basis for a given Euclidean
*A word of warning. Definition 8-3 is not the only meaning assigned to the term
"basis" for infinite dimensional spaces. There is another in common use which is not
restricted to Euclidean spaces, but applies to arbitrary vector spaces as well. We mention
this point only to alert the student to the fact that when he encounters this term he must
check the definition to avoid serious errors.

space. For instance, when this result is combined with Example 2 above, it allows
us to assert that the set of functions in (8-10) is not a basis for C[−π, π].
An equally simple observation is that an orthonormal set e_1, e_2, ... is a basis
for a Euclidean space V if and only if every vector x in V can be written in the form

    x = Σ_{k=1}^∞ (x · e_k)e_k.

For then, by Theorem 8-2, this expression must be unique, and Definition 8-3 is
therefore satisfied.
And while on the subject of orthonormal bases, there is another point we would
do well to settle. Recall that if e_1, ..., e_n is such a basis in a finite dimensional
space, and if

    x = Σ_{k=1}^n (x · e_k)e_k   and   y = Σ_{k=1}^n (y · e_k)e_k

are any two vectors in this space, then

    x · y = Σ_{k=1}^n (x · e_k)(y · e_k).

That is, the inner product of x and y is the sum of the products of their cor-
responding components when computed with respect to an orthonormal basis.
We assert that this result is also valid for an orthonormal basis in an infinite
dimensional space V.
To prove this assertion, let

    x = Σ_{k=1}^∞ (x · e_k)e_k   and   y = Σ_{k=1}^∞ (y · e_k)e_k

be any pair of vectors in V, and for each positive integer n, set

    x_n = Σ_{k=1}^n (x · e_k)e_k   and   y_n = Σ_{k=1}^n (y · e_k)e_k.

Then {x_n} → x and {y_n} → y, and hence by Theorem 8-1,

    {x_n · y_n} → x · y.                                      (8-15)

But the fact that e_1, e_2, ... is an orthonormal set implies that

    x_n · y_n = Σ_{k=1}^n (x · e_k)(y · e_k),

and it follows at once from (8-15) that

    x · y = Σ_{k=1}^∞ (x · e_k)(y · e_k).                     (8-16)
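Formula (8-16) can be tried out on square-summable sequences, where the inner product is x · y = Σ x_k y_k and the coordinate vectors (1, 0, 0, ...), (0, 1, 0, ...), ... form an orthonormal basis. The numbers below are our own, chosen so the limit is known exactly: for x_k = 1/2^k and y_k = 1/3^k the component products (1/6)^k sum to 1/5.

```python
# Partial sums x_n . y_n of (8-16) for x = (1/2, 1/4, ...), y = (1/3, 1/9, ...):
# the products (1/6)^k sum to (1/6)/(1 - 1/6) = 1/5 (our sketch, not the text's).
partials = []
total = 0.0
for k in range(1, 31):
    total += (0.5 ** k) * ((1.0 / 3.0) ** k)   # x_k * y_k = (1/6)^k
    partials.append(total)

print(round(partials[-1], 12))  # -> 0.2
```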


EXERCISES

1. Verify that

    (1/√(2π)) ∫_{−π}^{π} x dx = 0   and   (1/√π) ∫_{−π}^{π} x cos kx dx = 0,   k = 1, 2, ....

2. Compute the Fourier coefficients of the function

    f(x) = x²,   −π < x < π,

in C[−π, π] with respect to the orthonormal set

    1/√(2π),  cos x/√π,  cos 2x/√π,  ...

and also with respect to the orthonormal set

    sin x/√π,  sin 2x/√π,  ....

3. Repeat Exercise 2 for the function e^x, −π < x < π.


4. Let e_1, e_2, ... be an infinite orthonormal set in a Euclidean space V, and suppose that

    Σ_{k=1}^∞ a_k e_k = 0.

Prove that a_k = 0 for all k.

8-4 BESSEL'S INEQUALITY: PARSEVAL'S EQUALITY


This section bears the title of two of the most important results in the theory of
infinite dimensional Euclidean spaces. As we shall see, both of them can be
interpreted as natural extensions of familiar statements about finite dimensional
spaces. This fact, however, should not be allowed to obscure the importance of
the generalizations themselves, and the reader is advised that they will ultimately

furnish the justification for most of our work with orthogonal sequences in func-
tion spaces.

Theorem 8-3. Let e_1, e_2, ... be an orthonormal set of vectors in an infinite
dimensional Euclidean space V, and let x be an arbitrary vector in V. Then

    Σ_{k=1}^∞ (x · e_k)² ≤ ||x||²   (Bessel's inequality).    (8-17)

Moreover, e_1, e_2, ... is a basis for V if and only if

    Σ_{k=1}^∞ (x · e_k)² = ||x||²   (Parseval's equality).    (8-18)

Proof. The proof rests upon computing the value of

    ||x − Σ_{k=1}^n (x · e_k)e_k||²

for any x in V and any integer n. Now

    ||x − Σ_{k=1}^n (x · e_k)e_k||²
        = ( x − Σ_{k=1}^n (x · e_k)e_k ) · ( x − Σ_{k=1}^n (x · e_k)e_k )
        = x · x − 2 Σ_{k=1}^n (x · e_k)(x · e_k) + ( Σ_{j=1}^n (x · e_j)e_j ) · ( Σ_{k=1}^n (x · e_k)e_k ).

(We have changed the index of summation on the first factor of the last term for
convenience in computation.) But since the e_k are orthonormal, e_j · e_k = δ_{jk},
and it follows that

    ( Σ_{j=1}^n (x · e_j)e_j ) · ( Σ_{k=1}^n (x · e_k)e_k ) = Σ_{j=1}^n Σ_{k=1}^n (x · e_j)(x · e_k)(e_j · e_k)
                                                           = Σ_{k=1}^n (x · e_k)².

Thus

    ||x − Σ_{k=1}^n (x · e_k)e_k||² = ||x||² − Σ_{k=1}^n (x · e_k)².      (8-19)

To prove Bessel's inequality, it suffices to note that

    0 ≤ ||x − Σ_{k=1}^n (x · e_k)e_k||² = ||x||² − Σ_{k=1}^n (x · e_k)²

for all x in V and all n. Hence

    Σ_{k=1}^n (x · e_k)² ≤ ||x||²

for all n, and the partial sums of the series Σ_{k=1}^∞ (x · e_k)² form a bounded non-
decreasing sequence of non-negative real numbers. By a well-known theorem
from calculus (see Appendix I) we conclude that this series converges and that

    Σ_{k=1}^∞ (x · e_k)² ≤ ||x||².

Thus (8-17) holds.
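Identity (8-19), the engine of this proof, is easy to verify numerically in a finite dimensional setting. Below, in our own small example, x is a vector in R⁴ and e_1, e_2 are the first two standard basis vectors.

```python
# Numerical check of identity (8-19) (our example): for a vector x in R^4
# and the first two standard basis vectors, ||x - sum (x.e_k)e_k||^2 equals
# ||x||^2 - sum (x.e_k)^2.
x = [3.0, -1.0, 2.0, 0.5]

proj = [x[0], x[1], 0.0, 0.0]            # (x.e1)e1 + (x.e2)e2
resid = [a - b for a, b in zip(x, proj)]

lhs = sum(c * c for c in resid)           # ||x - projection||^2
rhs = sum(c * c for c in x) - (x[0] ** 2 + x[1] ** 2)
print(lhs, rhs)  # 4.25 4.25
```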



To finish the proof we now suppose that e_1, e_2, ... is an orthonormal basis
for V. Then by Theorem 8-2 we know that

    x = Σ_{k=1}^∞ (x · e_k)e_k.

Hence

    lim_{n→∞} ||x − Σ_{k=1}^n (x · e_k)e_k|| = 0,

and it follows from (8-19) that

    lim_{n→∞} [ ||x||² − Σ_{k=1}^n (x · e_k)² ] = 0.

Thus

    ||x||² = Σ_{k=1}^∞ (x · e_k)²,

which is Parseval's equality. Finally, since the steps in this chain of reasoning are
reversible, we conclude that e_1, e_2, ... is a basis whenever Parseval's equality is
satisfied, and the proof is complete. ∎

In a finite dimensional Euclidean space the square of the length of any vector
is equal to the sum of the squares of the lengths of its components relative to an
orthonormal basis. Conversely, if this is true the set in question is an orthonormal
basis for the space. Parseval's equality asserts that this result also holds for infinite
dimensional spaces, and thus can be viewed as an infinite dimensional version of the
Pythagorean theorem. Its value lies in the fact that it provides an analytic tool
for determining whether an orthonormal set is a basis for a Euclidean space, and
as such is much easier to apply than the definition itself.
Bessel's inequality, whose geometric interpretation is now obvious, can be
used to obtain estimates of the magnitude of the Fourier coefficients of x with
respect to e_1, e_2, .... One of the most important of the many results of this sort
is given in the following corollary.

Corollary 8-1. If e_1, e_2, ... is an orthonormal set in an infinite dimensional
Euclidean space V, then

    lim_{k→∞} (x · e_k) = 0                                   (8-20)

for every x in V. That is, the Fourier coefficients of x tend to zero as k → ∞
for any orthonormal set in V.

Indeed, Bessel's inequality implies that Σ_{k=1}^∞ (x · e_k)² is a convergent series of
real numbers. Hence the individual terms in this series approach zero with in-
creasing k, which, in turn, implies (8-20).

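For the example of Section 8-3 both halves of Theorem 8-3 can be watched numerically; this computation is ours, not the book's. The function f(x) = x has coefficients c_k = (−1)^{k−1} · 2√π/k with respect to the sine set (8-9), so Σ c_k² = 4π Σ 1/k² = 4π · (π²/6) = 2π³/3, which is exactly ||f||² = ∫ x² dx. The partial sums stay below ||f||², as Bessel's inequality requires, and close the gap in the limit. (The sine set is not a basis for all of C[−π, π], as Example 2 showed, but it does capture the odd function x, so Parseval's equality holds for this particular f.)

```python
import math

# Bessel's inequality (8-17) and Parseval's equality (8-18) for f(x) = x
# with the sine set (8-9): c_k^2 = 4*pi/k^2 and ||f||^2 = 2*pi^3/3 (our check).
norm_sq = 2 * math.pi ** 3 / 3
partial = 0.0
for k in range(1, 100_001):
    partial += 4 * math.pi / k ** 2
    assert partial <= norm_sq     # Bessel: partial sums never exceed ||f||^2
print(norm_sq - partial < 1e-3)   # Parseval: the gap closes as more terms enter
```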

In applications one frequently wants to work with an orthogonal set in V rather
than an orthonormal one. To do so we must know which of the results proved
for orthonormal sets remain true for orthogonal sets. The fact of the matter is
that, save for minor notational changes, they all do. This statement finds its
justification in the following assertion, whose proof we leave as an exercise: An
orthogonal set e_1, e_2, ... is a basis in a Euclidean space V if and only if its asso-
ciated orthonormal set e_1/||e_1||, e_2/||e_2||, ... is a basis.
From this it follows that we can rephrase all of our earlier results in terms of
orthogonal sets. In particular, if e_1, e_2, ... is an arbitrary basis for V, Parseval's
equality becomes

    ||x||² = Σ_{k=1}^∞ (x · e_k)²/(e_k · e_k) = Σ_{k=1}^∞ (x · e_k)²/||e_k||²,       (8-21)

and, conversely, if (8-21) is satisfied for all x in V, then e_1, e_2, ... is a basis for V.
Finally, the series expansion of a vector in terms of an arbitrary basis is

    x = Σ_{k=1}^∞ [(x · e_k)/||e_k||²] e_k,

where the

    (x · e_k)/||e_k||²

are the generalized Fourier coefficients of x with respect to e_1, e_2, .... This
formula will be used repeatedly throughout the following chapters.

EXERCISES

1. Let e_1, e_2, ... be an orthogonal set in a Euclidean space V. Prove that this set is a
basis for V if and only if its associated orthonormal set is a basis.
2. Prove Formula (8-21).
3. Assume that the functions 1, sin x, cos x, ..., sin kx, cos kx, ... are a basis for the
Euclidean space C[−π, π]. (They are.) Find the Fourier coefficients of f in C[−π, π]
in terms of this basis. What is the series expansion of f?

*8-5 CLOSED SUBSPACES


Roughly speaking, a Euclidean space is an object in which the notions of linearity
and convergence are studied simultaneously. A particularly simple yet important
example of the way in which these ideas enrich and support one another is fur-
nished by the study of subspaces of Euclidean spaces. In this instance the notion
of convergence intervenes to produce what is known as a "closed" subspace, de-
fined as follows.

Definition 8-4. A subspace W of a Euclidean space V is said to be closed
in V if the limit of every convergent sequence of vectors in W belongs to W.

(In those cases where the context is clear we shall omit the reference to V, and
simply say that W is a closed subspace.)
As with every new definition, one's first thought is to produce examples. In
this case we can furnish a plentiful supply by examining finite dimensional spaces,
for there we can prove

Lemma 8-4. Every subspace W of a finite dimensional Euclidean space V
is closed.

Proof. If W is the trivial subspace of V it is obviously closed, and we are done.
Otherwise we can find an orthonormal basis e_1, ..., e_n for V whose first m vectors
e_1, ..., e_m are a basis for W.* It follows that an arbitrary vector x in V belongs
to W if and only if its last n − m components with respect to this basis are zero.
Now suppose that {x_k} is a sequence of vectors in W, and that {x_k} → x.
Then, by Example 2 of Section 8-1, we know that the sequences of real numbers
formed from the components of the x_k converge to the components of x. It
follows that the last n − m components of x are zero, and hence x belongs to W,
as asserted. ∎

More generally, it can be shown that every finite dimensional subspace of an
arbitrary Euclidean space is closed. The proof is almost identical with the one
just given, and has been left as an exercise.
This lemma tells us that the only interesting illustrations of Definition 8-4 are
to be found in infinite dimensional spaces, if any exist at all. They do, as the
following example shows.

Example. Let l² (read "little ell two") denote the set of all infinite sequences
of real numbers

    x = (x_1, x_2, x_3, ...)

with the property that

    Σ_{k=1}^∞ x_k² < ∞.

Then l² becomes a Euclidean space if, with x as above, α a real number, and
y = (y_1, y_2, y_3, ...), we define

    αx = (αx_1, αx_2, αx_3, ...),
    x + y = (x_1 + y_1, x_2 + y_2, x_3 + y_3, ...),
    x · y = x_1y_1 + x_2y_2 + x_3y_3 + ···
* This assertion can be proved at once by applying the Gram-Schmidt process to the
basis described in Theorem 1-7.

(see Exercise 2). Moreover, since l² contains the infinite orthonormal sequence

    e_1 = (1, 0, 0, ...),
    e_2 = (0, 1, 0, ...),
    e_3 = (0, 0, 1, ...),
    ...

it is infinite dimensional.
Now let W be the set of all vectors in l² which have only a finite number of
nonzero components. Then W is a subspace of l². (Proof?) However, W is not
closed in l², since the sequence

    x_1 = (1/2, 0, 0, ...),
    x_2 = (1/2, 1/4, 0, ...),
    x_3 = (1/2, 1/4, 1/8, 0, ...),
    ...

each of whose vectors belongs to W, converges to the vector
(1/2, 1/4, 1/8, ..., 1/2^k, ...), which is not in W.
(The Euclidean space l² is an example of what mathematicians call a Hilbert
space.)
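The convergence claimed in this example can be checked directly; the sketch below is ours. The squared distance from the limit x = (1/2, 1/4, 1/8, ...) to its nth truncation x_n is the tail sum Σ_{k>n} 4^{−k} = (1/3) · 4^{−n}, which tends to zero even though every x_n has only finitely many nonzero components.

```python
# ||x - x_n||^2 for x = (1/2, 1/4, 1/8, ...) truncated after n components
# (our numerical sketch of the text's example).
def tail_sq(n, terms=200):
    # sum of (1/2^k)^2 = 4^(-k) over k > n, truncated far out in the tail
    return sum(4.0 ** (-k) for k in range(n + 1, n + terms))

for n in (1, 5, 10):
    print(n, tail_sq(n), (4.0 ** (-n)) / 3)   # matches the closed form 4^(-n)/3
```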

Having shown that there exist subspaces which are not closed, and hence that
there is good reason for studying closed subspaces as entities in themselves, we
begin with the usual elementary observations. First, every Euclidean space V,
viewed as a subspace of itself, is closed, and hence every subspace of V (closed
or not) is contained in at least one closed subspace of V. Next, the intersection
of any collection of closed subspaces of V is itself a closed subspace. (Recall that
this intersection consists of those vectors which belong to every one of the sub-
spaces in question. Again the proof is deferred to the exercises.) Conclusion: If
W is an arbitrary subspace of a Euclidean space V, there exists a smallest closed
subspace of V containing W; namely, the intersection of all the closed subspaces
of V which contain W. This subspace is denoted W̄, and, for obvious reasons,
is called the closure of W in V. Finally, W̄ can be described as follows.

Lemma 8-5. If W is a subspace of a Euclidean space V, its closure (in V)
is the set of all vectors in V which are limits of sequences of vectors in W.

In other words, to get W̄ we simply adjoin to W all limits of sequences of vectors
in W. The proof of this statement is an obvious consequence of our definitions.
In specific cases this lemma is far too general to be of much use in finding W̄.
In fact, many of the deeper investigations of the theory of Euclidean spaces center
around the problem of determining W̄ for a given W. Unfortunately, most of
this work is highly technical, and beyond the scope of this book.

The sequence of observations leading to Lemma 8-5 admits a useful and natural
generalization. Suppose that instead of starting with a subspace of V we begin
with an arbitrary nonempty subset X of V. Then we can form the subspace S(X)
spanned by X, and follow this with the closure S̄(X) of S(X) in V, in which case
we say that S̄(X) is the closed subspace of V generated by X. Note that the sub-
space spanned by X coincides with the subspace generated by X if and only if
S(X) = S̄(X). This will always happen, of course, if V is finite dimensional (Lemma
8-4), or if X is a finite subset of V (Exercise 10). However, the above example
shows that when X is an infinite subset of an infinite dimensional space, S(X) will
in general be different from S̄(X). In Chapter 1 we gave a description of the vectors
belonging to S(X) in terms of the vectors belonging to X (see Theorem 1-1). When
combined with Lemma 8-5, that description allows us to assert that S̄(X) consists
of all vectors in V which are limits of linear combinations of vectors in X. Stated
more formally, we have

Lemma 8-6. Let X be a set of generators of a (necessarily closed) subspace
W of a Euclidean space V. Then, if ε is any positive real number, and x any
vector in W, there exists a finite linear combination Σ_{k=1}^N a_k x_k of vectors x_k
in X such that

    ||x − Σ_{k=1}^N a_k x_k|| < ε.

Finally, suppose that X is a set of generators for V itself, and that the vectors in X
are also linearly independent.* On the strength of our experience with finite dimen-
sional spaces the reader could hardly be blamed for assuming that every vector
x in V could then be written in the form

    x = Σ_{k=1}^∞ a_k x_k,

where the a_k are scalars, and the x_k belong to X. Surprisingly, this is not the case,
and an example of this apparently paradoxical situation can be found in Exercise
13 below. However, things improve remarkably when X is also an orthonormal (or
orthogonal) set, for then we can prove

Lemma 8-7. If e_1, e_2, ... is an orthonormal set of generators of a Euclidean
space V, then e_1, e_2, ... is a basis for V.

Proof. We must show that every vector x in V can be written in the form

    x = Σ_{k=1}^∞ (x · e_k)e_k.

* It can be shown that such a set always exists in any Euclidean space V. The proof,
however, is not easy.

To do so, let ε > 0 be given. Then, since the e_k generate V we can find a linear
combination Σ_{k=1}^N a_k e_k such that

    ||x − Σ_{k=1}^N a_k e_k|| < ε.

If n > N, it follows by two applications of (8-11) that

    ||x − Σ_{k=1}^n (x · e_k)e_k|| ≤ ||x − Σ_{k=1}^N (x · e_k)e_k|| ≤ ||x − Σ_{k=1}^N a_k e_k|| < ε.

Hence, by the definition of convergence,

    x = Σ_{k=1}^∞ (x · e_k)e_k,

and the lemma is proved. ∎

This result explains why orthonormal sets enjoy a special status in the theory of
infinite dimensional Euclidean spaces beyond what they have in finite dimensional
spaces. In the latter their use is simply a matter of convenience; in the former it
is often a matter of necessity. Since we shall be dealing almost exclusively with
orthonormal (and orthogonal) sets in the following chapters, it may be well to
summarize what we now know about them. Our results may be stated in the
following form.

Theorem 8-4. If e_1, e_2, ... is an orthonormal set in an infinite dimensional
Euclidean space V, then the following three statements are equivalent (i.e.,
any one implies the other two):

(i) e_1, e_2, ... is a basis for V;
(ii) e_1, e_2, ... is a linearly independent set of generators for V;
(iii) if x is any vector in V, then ||x||² = Σ_{k=1}^∞ (x · e_k)².

There is, of course, a corresponding theorem for orthogonal sets.

EXERCISES

1. Prove that every finite dimensional subspace W of a Euclidean space V is closed.
[Hint: If x is the limit of a sequence of vectors in W, apply Lemma 8-4 to the sub-
space of V spanned by W and x.]

2. Let l² be as defined in the example in the text, and let

    x = (x_1, x_2, ..., x_k, ...),   y = (y_1, y_2, ..., y_k, ...)

be arbitrary vectors in l².

(a) Prove that x + y belongs to l². [Hint: Use Eq. (7-27) to deduce that

    ( Σ_{k=1}^n (x_k + y_k)² )^{1/2} ≤ ( Σ_{k=1}^n x_k² )^{1/2} + ( Σ_{k=1}^n y_k² )^{1/2},

and then use the fact that x and y belong to l² to obtain an upper bound, independent
of n, for the right-hand side of this inequality.]

(b) Prove that x · y is defined. [Hint: For any two real numbers x, y, we have
xy = ½[(x + y)² − x² − y²]. Hence

    Σ_{k=1}^n x_k y_k = ½ ( Σ_{k=1}^n (x_k + y_k)² − Σ_{k=1}^n x_k² − Σ_{k=1}^n y_k² ).]

(c) Now complete the verification that l² is a Euclidean space.

3. (a) Prove that the subset W defined in the text is a subspace of l².

(b) Find the closure of W in l².

4. In the example above it was asserted that a certain sequence in l² converged to the
vector (1/2, 1/4, 1/8, ...). Prove this assertion.

5. Prove that the intersection of any (nonempty) collection of closed subspaces of a
Euclidean space V is closed.

6. Give an example of a Euclidean space W which is a subspace of the Euclidean
spaces V_1 and V_2, and which has the property that it is closed in V_1 but not in V_2.
(Thus in the definition of a closed subspace reference must be made to the space in
which the closure is taking place.)

7. Prove that the closure of W̄ is W̄ itself, for any subspace W of a Euclidean space V.
(This result is sometimes expressed by saying that "the closure of the closure is the
closure.")

8. Prove Lemma 8-5.

9. Prove that W is a closed subspace of V if and only if W = W̄. [Hint: See Exercise 7.]

10. Prove that S(X) is closed in V whenever X is a finite subset of V.

11. Suppose x is orthogonal to a subspace W of a Euclidean space V (i.e., x · y = 0 for
every y in W). Prove that x is orthogonal to W̄. [Hint: Use Theorem 8-1.]

*12. Let W be an arbitrary subspace of a Euclidean space V, and let W⊥ denote the set
of all vectors in V which are orthogonal to W (see Exercise 23, Section 7-5). Prove
that W⊥ is a closed subspace of V. [Hint: Let {x_k} be a sequence in W⊥ which con-
verges to x, and let y be any vector in W. To prove that x · y = 0, use the identity

    x · y = y · (x − x_k)

and the Schwarz inequality.]



13. Let l² be the Euclidean space defined above, and for each integer k ≥ 2 let x_k
be the vector in l² defined by

    x_k = (1, 0, ..., 0, 1, 0, ...),

where the ones occur in the first and kth positions. Let X denote the set consisting
of the vectors x_2, x_3, ....

(a) Show that X is a linearly independent set in l².

(b) Describe the subspace S(X). That is, give a rule which can be applied to test
whether or not a given vector in l² belongs to S(X).

(c) Show that the vector e_1 = (1, 0, 0, ...) belongs to S̄(X). [Hint: Compute the
value of

    ||e_1 − (1/(n − 1)) Σ_{k=2}^n x_k||,

and show that

    lim_{n→∞} ||e_1 − (1/(n − 1)) Σ_{k=2}^n x_k|| = 0.]

(d) Prove that S̄(X) = l². [Hint: Begin by showing that each of the vectors e_1,
e_2, ... as defined in the text belongs to S̄(X), and then show that these vectors are
a basis for l².]

(e) Show that e_1 cannot be written in the form Σ_{k=2}^∞ a_k x_k, thereby furnishing the
example mentioned just before Lemma 8-7.

14. Let {a_k}, k = 1, 2, ..., be a sequence of real numbers, and suppose that Σ_{k=1}^∞ a_k²
converges. Prove that Σ_{k=1}^∞ a_k/k also converges. Give an example to show that
the converse of this statement is false.
9
fourier series

9-1 INTRODUCTION
Although we now know a great deal about the general theory of orthogonal
series in infinite dimensional Euclidean spaces we have yet to produce a single
concrete example of such a series. To fill this gap in our knowledge we propose
to devote the next three chapters to a detailed study of several types of series
expansions in infinite dimensional function spaces. In each case we shall begin,
as of course we must, by exhibiting a basis for the particular space in which the
series is to be constructed. Once this has been done the series in question will
appear as a special version of our earlier results, and for this reason the following
discussion can be viewed as a collection of elaborate examples in illustration of
material that is already known.

At the same time, however, it is only fair to warn the reader that each of the
series which we shall construct is of significant importance in physics and applied
mathematics. Indeed, since 1822 when Jean Baptiste Fourier first solved the
problem of heat flow in solid bodies by means of those series which now bear his
name, this subject has grown until, at present, it is an entire branch of mathe-
matics and mathematical physics. Later in this book we shall examine some of
the applications of this theory, but first we discuss the series themselves.

9-2 THE SPACE OF PIECEWISE CONTINUOUS FUNCTIONS

Given the importance of the various series which we are about to consider, it is

clearly of some interest to phrase the following discussion so as to encompass as


wide a class of functions as possible. Unfortunately, most of the generalizations
which go beyond the space of continuous functions are highly technical and in-
accessible in a course at this level. Still, it is possible to extend our results to in-
clude piecewise continuous functions, and since this modest generalization is well
worth making, we shall devote the present section to showing how the set of all
such functions on a fixed interval can be made into a Euclidean space. For con-
venience we begin by recalling the definition of piecewise continuity.
330 FOURIER SERIES | CHAP. 9

Definition 9-1. A real valued function f is said to be piecewise continuous
on an interval [a, b] if

(i) f is defined and continuous at all but a finite number of points of [a, b],
and

(ii) the limits

    f(x_0^+) = lim_{h→0^+} f(x_0 + h),
    f(x_0^−) = lim_{h→0^+} f(x_0 − h)                         (9-1)

exist at each point x_0 in [a, b]. (Note that only one of these limits is relevant
if x_0 is an end point of [a, b].)

We remind the reader that the notation h → 0^+ means h approaches zero
through positive values only, and that the two limits appearing in (9-1) are
called, respectively, the right- and left-hand limits of f at x_0. When x_0 is a
point of continuity of f each of these limits is equal to the value of f at x_0,
and we then have

    f(x_0^+) = f(x_0^−) = f(x_0).

FIGURE 9-1

More generally, the requirement that both of these limits be finite everywhere in
[a, b] implies that the only discontinuities of f are "jump discontinuities" of the
type shown in Fig. 9-1. Moreover, the difference

    f(x_0^+) − f(x_0^−)

measures the magnitude of the jump of the function f at x_0.
FIGURE 9-2 FIGURE 9-3


9-2 I THE SPACE OF PIECEWISE CONTINUOUS FUNCTIONS 331

Thus, for example, the function

    f(x) = x,       0 ≤ x < 1,
           1 − x,   1 ≤ x ≤ 2,

(see Fig. 9-2) is piecewise continuous on the interval [0, 2] and has a jump dis-
continuity of magnitude −1 at x = 1, since f(1^+) = 0 while f(1^−) = 1. On
the other hand, the functions 1/x and sin 1/x fail to be piecewise continuous on
any interval containing the origin because of their behavior as x → 0. (See
Figs. 9-3 and 9-4.)

FIGURE 9-4
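The one-sided limits of the first example are easy to observe numerically; this small check is ours, not the book's. Sampling f just to the right and just to the left of x_0 = 1 shows the values settling on f(1^+) = 0 and f(1^−) = 1, a jump of −1.

```python
# The function of Fig. 9-2 and its one-sided limits at x0 = 1 (our check).
def f(x):
    return x if 0 <= x < 1 else 1 - x

for h in (0.1, 0.01, 0.001):
    print(h, f(1 + h), f(1 - h))   # f(1+h) = -h -> 0 and f(1-h) = 1-h -> 1
```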

The following facts concerning piecewise continuous functions are of particular


importance, and will be used repeatedly hereafter.

1. If f is piecewise continuous on [a, b], then

    ∫_a^b f(x) dx

exists, and is independent of whatever values (if any) f assumes at its points of
discontinuity. In particular, if f and g are identical everywhere in [a, b] save at
their points of discontinuity, then

    ∫_a^b f(x) dx = ∫_a^b g(x) dx.

2. If f and g are piecewise continuous on [a, b], then so is their product fg.
(See Exercise 11.) This, together with (1), implies that the integral of the product
of two piecewise continuous functions always exists.

3. Every continuous function on [a, b] is piecewise continuous.

This said, we now turn our attention to the problem of converting the set of
piecewise continuous functions on [a, b] into a Euclidean space. In view of the
fact that this set includes the continuous functions on [a, b], it is only reasonable
to require our construction to be so conceived that the resulting Euclidean space
has C[a, b] as a subspace. This, in turn, suggests that we define f · g by the formula

    f · g = ∫_a^b f(x)g(x) dx.                                (9-2)

But does (9-2) actually yield an inner product on the set of piecewise continuous functions on [a, b]? The unpleasant answer is no. To see what goes wrong, let n(x) be a function which is zero everywhere in [a, b] except at a finite number of points (see Fig. 9-5). Such a function is said to be a null function, and has the annoying property that

$$\int_a^b n(x)\,dx = 0$$

FIGURE 9-5

in spite of the fact that n is not the zero function. This, of course, vitiates using (9-2) as an inner product, since, by definition, the inner product of a nonzero vector with itself cannot be zero.
It is perfectly clear, however, that the above difficulty will disappear if we overlook the fact that a null function is not identically zero, and treat it as if it were. But then, to be consistent, we must also regard any two piecewise continuous functions as identical whenever they differ at only a finite number of points. And this is just what needs to be done to make (9-2) yield an inner product.
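Before checking the details, the identification can be seen at work numerically. The sketch below is not from the text; the function name and the midpoint-rule quadrature are my own choices. It evaluates the inner product (9-2) by integrating over each smooth piece separately, so the values of f and g at the finitely many exceptional points never enter the result.

```python
def pc_inner_product(f, g, breakpoints, n=2000):
    """Inner product (9-2) on PC[a, b], approximated piece by piece.

    `breakpoints` lists a = x0 < x1 < ... < xm = b, with f and g continuous
    on each open piece (xi, xi+1).  The composite midpoint rule samples only
    interior midpoints, so values at the breakpoints themselves contribute
    nothing: functions differing at finitely many points get the same result.
    """
    total = 0.0
    for left, right in zip(breakpoints, breakpoints[1:]):
        dx = (right - left) / n
        total += sum(
            f(left + (i + 0.5) * dx) * g(left + (i + 0.5) * dx) for i in range(n)
        ) * dx
    return total
```

In particular, a null function (zero except at finitely many points) has inner product zero with itself here, which is exactly why such functions must be identified with the zero function.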
As usual, there are a number of facts which have to be verified before such an assertion can be accepted. Among them we call attention to the need to ascertain that (9-2) respects our identification of functions. In other words, we must show that whenever f₁ is identified with f, and g₁ with g, then

$$\int_a^b f_1(x)g_1(x)\,dx = \int_a^b f(x)g(x)\,dx.$$

For only then will (9-2) unambiguously define an inner product on the set of piecewise continuous functions identified as above. This, however, is easy to prove, and has been left to the reader as an exercise (see Exercise 5).

Finally, to complete the argument, we observe that whenever f₁ is identified with f and g₁ with g, then f₁ + g₁ is identified with f + g, and αf₁ with αf for all real numbers α. Thus functional addition and multiplication by real numbers also respect our identification of functions, and it follows that the set of piecewise continuous functions on a fixed interval [a, b], identified as above, is a Euclidean space. We shall denote this space by PC[a, b], and assert without proof that it contains C[a, b] as a subspace (Exercise 2).

Now that we have rigorously constructed the space PC[a, b] and insisted that its vectors are collections of functions identified with one another, it is perfectly clear that we can ignore this fact and treat these vectors as though they were ordinary functions. This is precisely what we shall do henceforth, but always with the tacit understanding that the facts of the matter are as outlined above, and that all of our arguments can be rigorously restated, if necessary. As with all such abuses of terminology, this involves no real danger of error, and has the positive result of simplifying language and notation.

EXERCISES

1. For each of the following functions evaluate the right- and left-hand limits at all
points of discontinuity, and state whether or not the function is piecewise continuous
on [0, 2].

^ (o, x =
(a) /(*)=*>
i
0<x<l
n *

ib) fix) ={ l
\x - 2, l < x < 2 I- > < x < 2

_ 2 i i

{2 < < ^ 1
= 1 " *' ^ x < J
x -l '
(d) fix) f
°

1, 1 < x < 2
U - 1, 1 < x < 2

< x < i i x
(1, < x < 1
0, } < x < f (f) fix) = x - 1

h 2 < x < 2 l 2 *> 1 < x < 2

2. (a) Suppose a piecewise continuous function f is identified with a continuous function g in PC[a, b]. What can be said about the magnitudes of the jump discontinuities of f?

(b) Use the result in (a) to prove that C[a, b] is a subspace of PC[a, b].

3. Determine whether or not the following functions are piecewise continuous on [0, 1].

(a) f(x) = x sin(1/x) for 0 < x ≤ 1, and f(0) = 0

(b) f(x) = 1 if x is irrational, 0 if x is rational

(c) fix) = —j— -> ^-r < x < -!-> n = 0,1,2,...

4. Prove that f₁ and f₂ are identified in PC[a, b] if and only if f₁ − f₂ is a null function.
5. Let f and g be piecewise continuous on [a, b], and suppose that f₁ is identified with f, and g₁ with g. Prove that

$$\int_a^b f_1(x)g_1(x)\,dx = \int_a^b f(x)g(x)\,dx.$$

[Hint: Note that f₁g₁ − fg = f₁g₁ − f₁g + f₁g − fg.]


6. Verify the assertion that PC[a, b] is a Euclidean space.
7. Are the functions 1, x, x², ... linearly independent in PC[a, b]? Why?

8. Prove that the functions

$$1,\ \cos\frac{\pi x}{a},\ \sin\frac{\pi x}{a},\ \dots,\ \cos\frac{n\pi x}{a},\ \sin\frac{n\pi x}{a},\ \dots$$

are linearly independent in PC[−a, a].


9. Prove that ||f − g|| = 0 in PC[a, b] if and only if f − g is a null function.
*10. Let f₁ and f₂ be piecewise continuous on [a, b], and write f₁ ~ f₂ if and only if f₁ − f₂ is a null function. Prove that this defines an equivalence relation on the set of all piecewise continuous functions on [a, b], and that the equivalence classes it defines are the elements of PC[a, b]. (See Section 1-9.)

11. Let f and g be piecewise continuous on the interval [a, b], and let fg denote their product; that is, fg is the function defined by

$$fg(x) = f(x)g(x)$$

for all x in [a, b]. Prove that fg is piecewise continuous on [a, b], and thus deduce that the integral in (9-2) exists. [Hint: Use the fact that the product of continuous functions is continuous.]

9-3 EVEN AND ODD FUNCTIONS


The task of evaluating the integrals which arise in the study of orthogonal series
can often be simplified by exploiting the symmetry of the functions involved. This
technique is usually formalized by introducing the notions of even and odd
functions, as follows.

Definition 9-2. A function f, defined on an interval centered at the origin, is said to be even if

$$f(-x) = f(x) \qquad (9\text{-}3)$$

for all x in the domain of f, and odd if

$$f(-x) = -f(x). \qquad (9\text{-}4)$$

This, of course, is just another way of saying that a function is even if its graph is symmetric about the vertical axis (Fig. 9-6), and odd if its graph is symmetric

FIGURE 9-6 FIGURE 9-7



about the origin (Fig. 9-7). Thus, for integral values of n, xⁿ is even if n is even and odd if n is odd, a fact which helps explain the particular terminology used in this context.

The importance of even and odd functions for our work stems from the equalities

$$\int_{-a}^{a} f(x)\,dx = 2\int_0^{a} f(x)\,dx \qquad (9\text{-}5)$$

whenever f is even and integrable, and

$$\int_{-a}^{a} f(x)\,dx = 0 \qquad (9\text{-}6)$$

whenever f is odd and integrable. Both of these assertions are easy consequences of the above definition (see Exercise 2), and are also evident from the geometric interpretation of the definite integral as area.
An equally elementary observation is that the product of two functions is even
whenever both of the functions are even or both are odd, and is odd whenever
one of the functions is even and one odd. In short, the multiplication of even
and odd functions obeys the rules

(Even)(Even) = (Odd)(Odd) = Even,

(Even)(Odd) = (Odd)(Even) = Odd.

From this, and (9-6), we deduce that

$$\int_{-a}^{a} f(x)g(x)\,dx = 0$$

whenever f and g have opposite parity. Hence even and odd functions in PC[−a, a] are mutually orthogonal.

Example 1. The functions

1, cos x, cos 2x, ...

are even in PC[−a, a], and

sin x, sin 2x, ...

are odd. Thus

$$\int_{-a}^{a} f(x)\cos kx\,dx = 0$$

if f is odd, and

$$\int_{-a}^{a} f(x)\sin kx\,dx = 0$$

if f is even. The value of these results for computing Fourier coefficients is too obvious to need comment.

A somewhat less obvious property of even and odd functions is established in


the following lemma.

Lemma 9-1. Every function on the interval [−a, a] can be written in exactly one way as the sum of an even function and an odd function.

Proof. Let f be an arbitrary function on [−a, a], and set

$$f_E(x) = \frac{f(x) + f(-x)}{2}, \qquad f_O(x) = \frac{f(x) - f(-x)}{2}. \qquad (9\text{-}7)$$

It is trivial to verify that f_E is even, f_O odd, and that f = f_E + f_O. Thus f has at least one decomposition of the desired form, and it remains to show that this is the only such. To this end, suppose that we also had f = g_E + g_O, with g_E even and g_O odd. Then f_E + f_O = g_E + g_O, and

$$f_E - g_E = g_O - f_O.$$

But the difference of two even functions is even, and the difference of two odd functions is odd. Thus the function defined by the above identity is simultaneously even and odd, and so must be the zero function. In other words, f_E − g_E = g_O − f_O = 0, and it follows that g_E = f_E, g_O = f_O, as desired. ∎

The functions f_E and f_O defined in (9-7) are known respectively as the even and odd parts of f.

Example 2. If f(x) = e^x, then

$$f_E(x) = \frac{e^x + e^{-x}}{2} = \cosh x \qquad \text{and} \qquad f_O(x) = \frac{e^x - e^{-x}}{2} = \sinh x.$$

Thus the even and odd parts of the exponential function are the hyperbolic cosine and hyperbolic sine, respectively.
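Formula (9-7) translates directly into code; the following is a minimal sketch (the function names are mine, not the text's).

```python
import math

def even_part(f):
    """Return f_E, where f_E(x) = (f(x) + f(-x)) / 2, as in (9-7)."""
    return lambda x: (f(x) + f(-x)) / 2

def odd_part(f):
    """Return f_O, where f_O(x) = (f(x) - f(-x)) / 2, as in (9-7)."""
    return lambda x: (f(x) - f(-x)) / 2
```

Applied to math.exp these recover the hyperbolic cosine and sine of Example 2, and f = f_E + f_O holds at every point by construction.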

EXERCISES

1. Classify each of the following functions as even, odd, or neither even nor odd.

(a) tan x (b) / (c) ^-^-| (d) In |*|

.2
-1 -1
(e) 7 r^r? (0 sin x ve/ cos
(g) x (h)
(x + l)(;c - 7^
1) (x -f- l)(x - 1)

(i)
X(
^^ (J) /(W). /defined in [0, a]

2. Prove Eqs. (9-5) and (9-6) of the text.

3. Establish the multiplicative rules for even and odd functions given in the text.

4. Show that

rr fit) cos uit - x)dtdu = \ r r /co cos U (t - x> * rfw

for any piecewise continuous function f on the interval [−π, π].

5. Let f be differentiable in the interval [−a, a]. Prove that f′ is odd whenever f is even, and even whenever f is odd.
6. Let f be an integrable function defined in the interval [−a, a], and let

$$F(x) = \int_0^x f(t)\,dt, \qquad -a \le x \le a.$$

Prove that F is even if f is odd, and odd if f is even.

7. Decompose each of the following functions into its even and odd parts.
2

(a)
X
x — ,
1
(b) ——
x + 1
(c) x cos x - cos 2x (d)
x
_i_

— r
i
1

(e) > a,b,c constants, not all zero


ax 2 + bx + c

8. Find the even and odd parts of the function

$$f(x) = \sum_{k=0}^{\infty} a_k x^k, \qquad a_k \text{ constants.}$$

9. Find the even and odd parts of the function

$$f(x) = \frac{a_0}{2} + \sum_{k=1}^{\infty}\,(a_k\cos kx + b_k\sin kx), \qquad a_k,\ b_k \text{ constants.}$$

10. Let E[−a, a] denote the set of all even piecewise continuous functions on the interval [−a, a], and let O[−a, a] denote the set of all odd piecewise continuous functions on [−a, a]. Prove that E[−a, a] and O[−a, a] are subspaces of PC[−a, a].
11. Prove that the zero function is the only function which is simultaneously even and
odd on [—a, a].
12. Let p₀(x), p₁(x), ... be the sequence of polynomials obtained by applying the Gram-Schmidt orthogonalization process to 1, x, x², ... in [−1, 1] (see Example 3, Section 7-4). Show that p₂ₖ(x) is an even function, and that p₂ₖ₊₁(x) is an odd function for all values of k. [Hint: Use mathematical induction.]

13. Let f be piecewise continuous on [−a, a], and suppose that f is orthogonal to every even function in PC[−a, a]. Prove that f is odd. [Hint: Use Lemma 9-1.]

9-4 FOURIER SERIES


In this section we begin our study of orthogonal series by considering series
expansions relative to the functions

1, cos x, sin x, cos 2x, sin 2x, .... (9-8)



We have already seen that these functions are mutually orthogonal in PC[−π, π], and we shall prove shortly that they are a basis as well. Granting the truth of this fact, we can then use Formula (8-22) to write any piecewise continuous function f on the interval [−π, π] in the form

$$f(x) = \frac{f \cdot 1}{\|1\|^2}\,1 + \sum_{k=1}^{\infty}\left(\frac{f \cdot \cos kx}{\|\cos kx\|^2}\,\cos kx + \frac{f \cdot \sin kx}{\|\sin kx\|^2}\,\sin kx\right) \quad \text{(mean)} \qquad (9\text{-}9)$$

where the notation "(mean)" indicates that the series in question converges in the mean to f. But since

$$\|1\|^2 = \int_{-\pi}^{\pi} dx = 2\pi, \qquad \|\cos kx\|^2 = \int_{-\pi}^{\pi} \cos^2 kx\,dx = \pi, \qquad \|\sin kx\|^2 = \int_{-\pi}^{\pi} \sin^2 kx\,dx = \pi,$$

(9-9) may be rewritten

$$f(x) = \frac{a_0}{2} + \sum_{k=1}^{\infty}\,(a_k\cos kx + b_k\sin kx) \quad \text{(mean)}, \qquad (9\text{-}10)$$

where

$$a_k = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\cos kx\,dx, \qquad b_k = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\sin kx\,dx, \qquad (9\text{-}11)$$

for all k. This particular representation of f is known as its Fourier series expansion on the interval [−π, π], and the a_k and b_k are called the Fourier or Euler-Fourier coefficients of f.
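The formulas (9-11) are easy to approximate by machine, which makes them convenient for checking hand computation. The sketch below is mine (the names and the midpoint-rule quadrature are arbitrary choices, not the text's).

```python
import math

def fourier_coefficients(f, k, n=4000):
    """Approximate the Euler-Fourier coefficients (9-11) of f on [-pi, pi]
    with a composite midpoint rule on n sample points."""
    dx = 2 * math.pi / n
    xs = [-math.pi + (i + 0.5) * dx for i in range(n)]
    a_k = sum(f(x) * math.cos(k * x) for x in xs) * dx / math.pi
    b_k = sum(f(x) * math.sin(k * x) for x in xs) * dx / math.pi
    return a_k, b_k
```

For f(x) = x, for instance, it returns a_k ≈ 0 and b_k ≈ 2(−1)^(k+1)/k.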
Once again we emphasize that (9-10) must be read as asserting that the series in question converges in the mean to f, not that it converges pointwise in the sense that

$$f(x_0) = \frac{a_0}{2} + \sum_{k=1}^{\infty}\,(a_k\cos kx_0 + b_k\sin kx_0)$$

for all x₀ in [−π, π]. Indeed, since the value of f at x₀ can be changed arbitrarily without changing the value of its Fourier coefficients, this would be entirely too much to expect. Moreover, in view of Example 4 in Section 8-1, there is no a priori reason to expect that the Fourier series for f will converge to the value of f at so much as a single point in [−π, π]. But, surprisingly, whenever f is reasonably well-behaved, it converges to f(x) for all x. We shall have more to say on this point as soon as we have considered an example.

Example 1. Find the Fourier series expansion of the function

$$f(x) = \begin{cases} -1, & -\pi < x < 0, \\ 1, & 0 < x < \pi \end{cases}$$

(see Fig. 9-8).

In this case f is an odd function on [−π, π]. Hence so is f(x) cos kx, and (9-6) implies that a_k = 0 for all k. On the other hand, f(x) sin kx is even, and we therefore have

$$b_k = \frac{2}{\pi}\int_0^{\pi}\sin kx\,dx = \frac{2}{k\pi}(1 - \cos k\pi) = \begin{cases} \dfrac{4}{k\pi}, & k = 1, 3, 5, \dots, \\[1ex] 0, & k = 2, 4, 6, \dots \end{cases}$$

FIGURE 9-8

Hence the Fourier series expansion of f is

$$f(x) = \frac{4}{\pi}\left(\sin x + \frac{\sin 3x}{3} + \frac{\sin 5x}{5} + \cdots\right) = \frac{4}{\pi}\sum_{k=1}^{\infty}\frac{\sin(2k-1)x}{2k-1} \quad \text{(mean)}. \qquad (9\text{-}12)$$
In Fig. 9-9 we have sketched the graph of f and the sum of the first two terms of its Fourier series, which, as can be seen, already furnishes a fairly good approximation to f throughout the interval [−π, π]. This approximation improves considerably if we use the sum of the first four terms,

$$\frac{4}{\pi}\left[\sin x + \frac{\sin 3x}{3} + \frac{\sin 5x}{5} + \frac{\sin 7x}{7}\right]$$

(Fig. 9-10), and it is not difficult to see that it continues to improve as additional terms are considered. In so doing one quickly becomes convinced that the series actually converges pointwise to f everywhere in [−π, π] where f is defined. Moreover, when x = 0 and ±π, the series obviously converges to zero even though f is not defined at those points. Here then, in our very first example, we have come upon a Fourier series which converges pointwise in the interval [−π, π], and which represents the function from which it was derived at each point in the domain of that function.

FIGURE 9-9    FIGURE 9-10
It would be easy to multiply the number of such examples indefinitely until the reader became convinced (erroneously, as it turns out) that all Fourier series converge at each point in the interval [−π, π]. Instead, however, we prefer to cite a theorem which will account for this phenomenon within the framework of the general theory of Fourier series. For the present we content ourselves with a statement of the result in question, leaving any attempts at a proof to the next chapter.

Theorem 9-1. Let f be a piecewise smooth function in PC[−π, π], by which we mean that f has a piecewise continuous first derivative on [−π, π]. Then the Fourier series expansion for f converges pointwise everywhere in [−π, π], and has the value

$$\frac{f(x_0^+) + f(x_0^-)}{2} \qquad (9\text{-}13)$$

at each point x₀ in the interior of the interval, and

$$\frac{f(-\pi^+) + f(\pi^-)}{2} \qquad (9\text{-}14)$$

at ±π.

This theorem is one of the most important in the entire theory of Fourier series, and the student should make certain that he understands it thoroughly before going on. In particular, he should note that the expression

$$\frac{f(x_0^+) + f(x_0^-)}{2}$$

is none other than the average of the right- and left-hand limits of f at x₀, and is equal to f(x₀) whenever x₀ is a point of continuity of f. Hence the Fourier series expansion of a piecewise smooth function f in PC[−π, π] converges to f(x₀) whenever x₀ is a point of continuity of f. On the other hand, if f has a jump discontinuity at x₀, (9-13) implies that the Fourier series for f converges at x₀ to the value located at the midpoint of the jump, as shown in Fig. 9-11.

FIGURE 9-11

When these results are applied to Example 1 above they allow us to assert that the series

$$\frac{4}{\pi}\left[\sin x + \frac{\sin 3x}{3} + \frac{\sin 5x}{5} + \cdots\right] \qquad (9\text{-}15)$$

converges pointwise in the interval [−π, π] to

$$\begin{cases} -1 & \text{if } -\pi < x < 0, \\ 0 & \text{if } x = -\pi,\ 0,\ \pi, \\ 1 & \text{if } 0 < x < \pi. \end{cases}$$

Thus, for example, when x = π/2, the value of this series is 1, and we conclude that

$$1 = \frac{4}{\pi}\left(1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots\right)$$

or

$$\frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots.$$

Similarly, when x = π/4, (9-15) yields the numerical series

$$1 = \frac{4}{\pi}\left(\frac{\sqrt{2}}{2} + \frac{\sqrt{2}}{2 \cdot 3} - \frac{\sqrt{2}}{2 \cdot 5} - \frac{\sqrt{2}}{2 \cdot 7} + \cdots\right),$$

and we have a second representation of π/4 as

$$\frac{\pi}{4} = \frac{\sqrt{2}}{2}\left(1 + \frac{1}{3} - \frac{1}{5} - \frac{1}{7} + \cdots\right).$$
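These numerical consequences are easy to confirm by machine with partial sums of (9-15); the sketch below (naming mine) does so.

```python
import math

def square_wave_partial_sum(x, terms):
    """Sum the first `terms` terms of the series (9-15) at the point x."""
    return (4 / math.pi) * sum(
        math.sin((2 * k - 1) * x) / (2 * k - 1) for k in range(1, terms + 1)
    )
```

At x = π/2 the partial sums tend to 1, restating π/4 = 1 − 1/3 + 1/5 − ⋯, while at x = 0 every term vanishes, in agreement with the convergence of the series to 0 there.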
Before continuing, it may be appropriate to remark that there exist continuous functions whose Fourier series diverge at finitely many points in [−π, π]. Thus the requirement that f have a piecewise continuous first derivative is imposed in order to guarantee the pointwise convergence of its Fourier series. Incidentally, the problem of determining whether there exists a continuous function whose Fourier series diverges everywhere in [−π, π] is still unsolved. Needless to say, such a function, if it exists, will be rather bizarre, and is not likely to arise in applications.
By now the reader has undoubtedly observed that our description of the pointwise behavior of a Fourier series is incomplete. For if a trigonometric series

$$\frac{a_0}{2} + \sum_{k=1}^{\infty}\,(a_k\cos kx + b_k\sin kx) \qquad (9\text{-}16)$$

converges to the value K₀ when x = x₀, it will also converge to K₀ at all points of the form x₀ + 2πn, n an arbitrary integer. This, of course, is an immediate

FIGURE 9-12

consequence of the fact that the functions sin kx and cos kx are periodic, with 2π as a period, and allows us to make the following important observation: If (9-16) converges pointwise to a function f everywhere in [−π, π), then it actually converges on the entire x-axis to the function F obtained by repeating f successively along the x-axis in intervals of length 2π (see Fig. 9-12). It is obvious that the function F obtained in this way is periodic with 2π as a period, in the sense that

$$F(x + 2\pi) = F(x)$$

for all x. It is known as the periodic extension of f.
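A periodic extension is simple to implement: reduce x modulo 2π into [−π, π) and apply f there. The sketch below is mine (in particular the use of math.fmod and the naming are my choices).

```python
import math

def periodic_extension(f):
    """Return F, the 2*pi-periodic extension of f from [-pi, pi) to the line."""
    def F(x):
        y = math.fmod(x + math.pi, 2 * math.pi)  # fmod keeps the sign of x
        if y < 0:
            y += 2 * math.pi  # shift into [0, 2*pi)
        return f(y - math.pi)  # translate back into [-pi, pi)
    return F
```

For example, applying this to f(x) = x produces the familiar sawtooth wave.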


When these remarks are combined with Theorem 9-1, they yield

Theorem 9-2. The Fourier series expansion of a piecewise smooth function f in PC[−π, π] converges pointwise on the entire real line. Moreover, if F denotes the periodic extension of f, then the value of the series is F(x₀) when x₀ is a point of continuity of F, and

$$\frac{F(x_0^+) + F(x_0^-)}{2}$$

when x₀ is a jump discontinuity of F.

In particular, we note that the Fourier series for f will converge pointwise to a continuous function on the entire x-axis if and only if f is continuous on [−π, π] and f(−π) = f(π). For only then will F be free of jump discontinuities. We shall have occasion to use this fact later in the chapter.

At the risk of belaboring the obvious, we point out that Theorem 9-2 can be used to sketch the graph of the Fourier series of any piecewise smooth function f in PC[−π, π]. The procedure is as follows: First sketch the graph of the periodic extension F of f; then plot the midpoint of each jump discontinuity of F. The resulting picture, with these isolated points included, will be the graph of the series

FIGURE 9-13

in question. Thus, for instance, the graph of the series

$$\frac{4}{\pi}\left(\sin x + \frac{\sin 3x}{3} + \frac{\sin 5x}{5} + \cdots\right)$$

found in Example 1 appears as shown in Fig. 9-13.

Example 2. Find the Fourier series expansion of the function

$$f(x) = |x|, \qquad -\pi \le x \le \pi,$$

and sketch the graph of the series.

In this case f is an even function on [−π, π]. Hence b_k = 0 for all k, while, for k ≠ 0,

$$a_k = \frac{2}{\pi}\int_0^{\pi} x\cos kx\,dx = \frac{2}{\pi}\left[\frac{x\sin kx}{k}\bigg|_0^{\pi} - \frac{1}{k}\int_0^{\pi}\sin kx\,dx\right] = \frac{2}{\pi k^2}(\cos k\pi - 1).$$

Finally, when k = 0, we have

$$a_0 = \frac{2}{\pi}\int_0^{\pi} x\,dx = \pi,$$

and it follows that

$$|x| = \frac{\pi}{2} - \frac{4}{\pi}\left(\cos x + \frac{\cos 3x}{3^2} + \frac{\cos 5x}{5^2} + \cdots\right) \quad \text{(mean)}$$

in PC[−π, π].

FIGURE 9-14

The graph of this series is shown in Fig. 9-14, and, for comparison, we have sketched the graph of the sum of the first three terms of the series in Fig. 9-15.

FIGURE 9-15
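As a numerical check on Example 2 (a sketch of my own, not part of the text), partial sums of the series above reproduce |x| at points of [−π, π].

```python
import math

def abs_series_partial_sum(x, terms):
    """Partial sum of pi/2 - (4/pi)(cos x + cos 3x/3^2 + ...) from Example 2."""
    s = math.pi / 2
    for j in range(terms):
        n = 2 * j + 1  # only odd harmonics appear in this series
        s -= (4 / math.pi) * math.cos(n * x) / (n * n)
    return s
```

Setting x = 0 gives 0 = π/2 − (4/π)(1 + 1/3² + 1/5² + ⋯), that is, the identity π²/8 = 1 + 1/3² + 1/5² + ⋯ of Exercise 15.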

Example 3. Let g be the function in PC[−π, π] defined by

$$g(x) = \begin{cases} 0, & -\pi < x < 0, \\ 1, & 0 < x < \pi. \end{cases}$$

Then, with f as in Example 1,

$$g = \tfrac{1}{2}[1 + f] = \tfrac{1}{2} + \tfrac{1}{2}f,$$

and we conclude that the Fourier series expansion of g is

$$\frac{1}{2} + \frac{2}{\pi}\left(\sin x + \frac{\sin 3x}{3} + \frac{\sin 5x}{5} + \cdots\right).$$

FIGURE 9-16

The moral of this example is that a Fourier series can sometimes be found without recourse to integration. We refer the reader to Exercise 21 for a discussion of this technique, and to Fig. 9-16 for the graph of the series.
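The shortcut can be double-checked by doing the integration the example avoided. In this sketch (names and the midpoint-rule quadrature are mine), the sine coefficients of g computed directly from (9-11) come out to 2/kπ for odd k and 0 for even k, half the coefficients 4/kπ of Example 1, exactly as g = 1/2 + f/2 predicts.

```python
import math

def sine_coefficient(f, k, n=4000):
    """b_k of f on [-pi, pi] by formula (9-11), composite midpoint rule."""
    dx = 2 * math.pi / n
    xs = [-math.pi + (i + 0.5) * dx for i in range(n)]
    return sum(f(x) * math.sin(k * x) for x in xs) * dx / math.pi

def step(x):
    """The function g of Example 3: 0 on (-pi, 0), 1 on (0, pi)."""
    return 1.0 if x > 0 else 0.0
```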

EXERCISES

In Exercises 1-10 find the Fourier series expansion of the given function. Sketch the
graph of the series obtained, paying particular attention to its values at any points of
discontinuity.

1. fix) = x, —x < x < x 2./M- !•


1<*<°
l^, < X < X
3. f(x) = e^x, −π < x < π        4. f(x) = |sin x|

{-\X - *|, -T < X < -\ (i, -x < x < -\


5- /(*) = 0, -i < x < i
\ X — 2 — X ^ IT
j2, v i> 5 < -f < x
7. f(x) = (x - x)(x + x), 8. fix) = e
M , -x < x < x
— IT < X < IT
- x + x, —x < x < = x
9. fix)
— <
10. fix) , -it < x <
(x x, x < x
11. (a) Sketch the graph of the Fourier series expansion of the function

-1, -x < X <


IT X
fix) = < 1, - - < X < -

-1, < X < X,

over the intervals [2x, 3x] and [— 2x, 0].

(b) What is the value of the Fourier series for /when x = kir, k an integer? When

x = ilk + 1)
^,
k an integer ?

12. (a) Sketch the graph of the Fourier series expansion of the function

-1, ~T < X < -1,

fix) = <
|> -1 < X < 1,

1, 1 < X < T,

over the intervals [27r, 5t] and [— 37r, — t].


(b) Find the value of the Fourier series for this function at the following points: x = 1, x = π, x = −6, x = 3, x = 7, x = −5π.
13. (a) Sketch the graph of the Fourier series expansion of the function

2t
X + 7T, — T<X< — »

2T
- -j < x < 0,
fix)
0<x<3>
IT
'•
f < * <
(b) What is the value of the Fourier series for this function when x = kπ, k an integer? When x = (4k + 1)(π/3), k an integer?

14. (a) Find the Fourier series expansion of the function

$$f(x) = \begin{cases} 0, & -\pi < x < 0, \\ x^2, & 0 \le x < \pi, \end{cases}$$

and sketch the graph of the series obtained.

(b) Use this series to show that

$$\frac{\pi^2}{6} = 1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots.$$

15. (a) Find the Fourier series expansion of the function

l*, < X < T,

and sketch the graph of the series obtained,

(b) Use this series to show that

$$\frac{\pi^2}{8} = 1 + \frac{1}{3^2} + \frac{1}{5^2} + \frac{1}{7^2} + \cdots.$$

16. Find the Fourier series expansion of the function

$$f(x) = \begin{cases} 0, & -\pi < x < 0, \\ \cos x, & 0 < x < \pi, \end{cases}$$

and sketch the graph of the series obtained.


17. (a) Find the Fourier series expansion of the function

$$f(x) = \cos ax, \qquad -\pi < x < \pi,$$

for any real number a.

(b) Use this series to prove that

$$\cot a\pi = \frac{1}{\pi}\left(\frac{1}{a} + \sum_{k=1}^{\infty}\frac{2a}{a^2 - k^2}\right)$$

whenever a is not an integer. Justify the validity of your computations.


18. A function f, defined and continuous at all but a finite number of points in any closed interval of the x-axis, is said to be periodic if there exists a real number p > 0 such that

$$f(x + p) = f(x)$$

for all x in the domain of f. The smallest positive real number p with this property (if such exists) is called the fundamental period of f; otherwise p is simply called a period.

(a) Prove that if f is periodic with period p, then

$$f(x + kp) = f(x)$$

for all integral values of k.

(b) Give an example of a periodic function which does not have a fundamental period.

(c) Let f₁ and f₂ be periodic functions with fundamental periods p₁ and p₂, respectively. Prove that α₁f₁ + α₂f₂ is periodic for every pair of real numbers α₁ and α₂ if and only if p₁/p₂ is a rational number.

(d) Generalize the result in (c) to linear combinations of n periodic functions with fundamental periods p₁, ..., pₙ.

19. Determine which of the following functions are periodic, and find the fundamental
period of each if it has one. (See Exercise 18.)

(a) sin —
1TX
(b) 2 sin 3x — cos 2x (c) tan x
2

(d) (sin 2x)(cos x) (e) sin - (f) tan 2x + 3 sin irx

(g) f(x) = 0 if x is rational, 1 if x is irrational
*20. Let

$$f(x) = \frac{P(x)}{Q(x)},$$

where P(x) and Q(x) are polynomials and Q(0) ≠ 0. Prove that if f is periodic it is constant. [Hint: Let f(0) = a, and consider the equation P(x) − aQ(x) = 0.]
21. Let

$$\frac{a_0}{2} + \sum_{k=1}^{\infty}\,(a_k\cos kx + b_k\sin kx) \qquad\text{and}\qquad \frac{A_0}{2} + \sum_{k=1}^{\infty}\,(A_k\cos kx + B_k\sin kx)$$

be, respectively, the Fourier series expansions of the functions f and g in PC[−π, π]. Prove that the series

$$\frac{\alpha a_0 + \beta A_0}{2} + \sum_{k=1}^{\infty}\bigl[(\alpha a_k + \beta A_k)\cos kx + (\alpha b_k + \beta B_k)\sin kx\bigr]$$

is the Fourier series expansion of the function αf + βg for any pair of real numbers α and β.

22. Use Exercise 21 and the series found in Examples 1 and 2 and Exercise 1 above to
obtain the Fourier series expansions of the functions shown in Fig. 9-17.

FIGURE 9-17

23. It can be shown that the series

$$\frac{1}{2} + \frac{2}{\pi}\sum_{k=1}^{\infty}\frac{\sin(2k-1)x}{2k-1} \qquad\text{and}\qquad 2\sum_{k=1}^{\infty}\frac{(-1)^{k+1}}{k}\sin kx$$

are, respectively, the Fourier series expansions of the functions

$$g(x) = \begin{cases} 0, & -\pi < x < 0, \\ 1, & 0 < x < \pi, \end{cases} \qquad\text{and}\qquad h(x) = x, \quad -\pi < x < \pi.$$

Use these series, and Exercise 21, to find the Fourier series expansion of each of the following functions.

(a) f(x) =
—w < x <
\ f
< x < ir

(b) f(x) = x + π, −π < x < 0;  f(x) = x, 0 < x < π
(c) f(x) = −x, −π < x < 0;  f(x) = −x + 2π, 0 < x < π

24. What is the Fourier series expansion of 2 + 7 cos 3x − 4 sin 2x, considered as a function in PC[−π, π]?


25. Without resorting to integration, find the Fourier series expansion of each of the following functions in PC[−π, π].
2
(a) sin * (b) sin x cos x
3 2
(c) sin x (d) sin x ( cos -
J

(e) cos 3x cos 2.x (f) cos x


26. Show that Parseval's equality assumes the form

$$\frac{1}{\pi}\int_{-\pi}^{\pi} f(x)^2\,dx = \frac{a_0^2}{2} + \sum_{k=1}^{\infty}\,(a_k^2 + b_k^2)$$

with respect to the basis 1, cos x, sin x, ... in PC[−π, π], where the a_k and b_k are the Fourier coefficients of f.

27. (a) Find the Fourier series expansion of the function f(−x) in terms of the expansion of f(x).

(b) Use the series found above, together with the appropriate result from Section 9-3, to find the Fourier series expansions of f_E and f_O, the even and odd parts of f.

(c) Under the assumption that the functions 1, cos x, sin x, ... form a basis for PC[−π, π], use the results in (a) and (b) to find bases for the subspaces E[−π, π] and O[−π, π] of PC[−π, π].
28. Use the results in Exercise 27, together with Exercise 21 and the Fourier series expansion of e^x, −π < x < π, to find the Fourier series expansions of sinh x and cosh x.
*29. (a) Prove that the functions 1, cos x, sin x, ..., cos kx, sin kx, ... are orthogonal in PC[a, a + 2π] for any real number a.

(b) Under the assumption that the functions in (a) are a basis for PC[−π, π], prove that they also form a basis for PC[a, a + 2π] for any real number a. [Hint: Use Parseval's equality.]

9-5 SINE AND COSINE SERIES

In the examples of the last section we took advantage of the fact that the functions considered were even or odd to simplify the task of finding their Fourier series expansions. This technique can be exploited more often than one might expect, and is of sufficient importance to be brought out into the open.

Specifically, if f is an even function in PC[−π, π], then, for all values of k, f(x) cos kx is even, and f(x) sin kx is odd. Thus, by (9-5) and (9-6),

$$\int_{-\pi}^{\pi} f(x)\cos kx\,dx = 2\int_0^{\pi} f(x)\cos kx\,dx,$$

$$\int_{-\pi}^{\pi} f(x)\sin kx\,dx = 0,$$

and it follows that the Fourier series expansion of an even function in PC[−π, π]

involves only cosine terms and may be computed according to the formula

$$f(x) = \frac{a_0}{2} + \sum_{k=1}^{\infty} a_k\cos kx \quad \text{(mean)}, \qquad (9\text{-}17)$$

where

$$a_k = \frac{2}{\pi}\int_0^{\pi} f(x)\cos kx\,dx. \qquad (9\text{-}18)$$

A similar argument shows that the Fourier series expansion of an odd function in PC[−π, π] involves only sine terms, and may be computed according to the formula

$$f(x) = \sum_{k=1}^{\infty} b_k\sin kx \quad \text{(mean)}, \qquad (9\text{-}19)$$

where

$$b_k = \frac{2}{\pi}\int_0^{\pi} f(x)\sin kx\,dx. \qquad (9\text{-}20)$$

Actually these results are more than mere formulas. For if we combine them with the fact that the functions

1, cos x, sin x, cos 2x, sin 2x, ...

are a basis for PC[−π, π], they imply, in turn, that

1, cos x, cos 2x, ...

is a basis for the space of piecewise continuous even functions on [−π, π], and that

sin x, sin 2x, ...

is a basis for the space of piecewise continuous odd functions on [−π, π]. Indeed, this is the import of the assertion that the series in (9-17) and (9-19) converge in the mean.
In applications of the theory of Fourier series one frequently needs to obtain a series expansion for a piecewise continuous function f which is defined only on the interval [0, π]. One way of doing this is to extend f to the entire interval [−π, π] (where by this we mean that a function F is defined on [−π, π] in such a way that F coincides with f on [0, π]), and then expand F as a Fourier series. In those cases where f is reasonably well-behaved, Theorem 9-1 guarantees that the Fourier series expansion of F will be a good approximation to f on [0, π].

The crux of this method concerns the manner in which f is extended to [−π, π]. This, of course, can be done in any way whatsoever (so long as the resulting function belongs to PC[−π, π]), but the following two extensions are the most convenient and important. The first is the so-called even extension of f, denoted E_f, and defined by

$$E_f(x) = \begin{cases} f(x), & 0 \le x \le \pi, \\ f(-x), & -\pi \le x < 0, \end{cases} \qquad (9\text{-}21)$$

while the second is the odd extension of f, denoted O_f, and defined by

$$O_f(x) = \begin{cases} f(x), & 0 \le x \le \pi, \\ -f(-x), & -\pi \le x < 0. \end{cases} \qquad (9\text{-}22)$$

Figure 9-18 illustrates the even and odd extensions of a particular f, and furnishes visual evidence of the easily proved assertion that E_f is even and O_f is odd for any f. This being the case, we can use Formulas (9-17) through (9-20) to obtain the Fourier series expansions of E_f and O_f. They are

$$E_f(x) = \frac{a_0}{2} + \sum_{k=1}^{\infty} a_k\cos kx \quad \text{(mean)}, \qquad a_k = \frac{2}{\pi}\int_0^{\pi} f(x)\cos kx\,dx, \qquad (9\text{-}23)$$

and

$$O_f(x) = \sum_{k=1}^{\infty} b_k\sin kx \quad \text{(mean)}, \qquad b_k = \frac{2}{\pi}\int_0^{\pi} f(x)\sin kx\,dx. \qquad (9\text{-}24)$$

These series are called, respectively, the Fourier cosine and Fourier sine series expansions of f, a mild misuse of terminology which rarely causes any misunderstanding. Unfortunately, the somewhat benighted term "half-range expansion" is also used in this context.

FIGURE 9-18. E_f: the even extension of f. O_f: the odd extension of f.
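Formulas (9-21) and (9-22) in code (a sketch; the names are mine):

```python
def even_extension(f):
    """E_f of (9-21): agrees with f on [0, pi], reflects evenly onto [-pi, 0)."""
    return lambda x: f(x) if x >= 0 else f(-x)

def odd_extension(f):
    """O_f of (9-22): agrees with f on [0, pi], reflects oddly onto [-pi, 0)."""
    return lambda x: f(x) if x >= 0 else -f(-x)
```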

As an example, we compute the Fourier cosine and sine series for the function

$$f(x) = x^2, \qquad 0 \le x \le \pi.$$

Here

$$E_f(x) = x^2, \qquad -\pi \le x \le \pi,$$

and integration by parts yields, for k ≠ 0,

$$a_k = \frac{2}{\pi}\int_0^{\pi} x^2\cos kx\,dx = -\frac{4}{\pi k}\int_0^{\pi} x\sin kx\,dx = -\frac{4}{\pi k}\left[-\frac{x\cos kx}{k}\bigg|_0^{\pi} + \frac{1}{k}\int_0^{\pi}\cos kx\,dx\right] = \frac{4}{k^2}\cos k\pi.$$

Finally, when k = 0, we have

$$a_0 = \frac{2}{\pi}\int_0^{\pi} x^2\,dx = \frac{2\pi^2}{3},$$

and we have

$$E_f(x) = \frac{\pi^2}{3} - 4\left(\cos x - \frac{\cos 2x}{2^2} + \frac{\cos 3x}{3^2} - \frac{\cos 4x}{4^2} + \cdots\right) \quad \text{(mean)}.$$

To compute the series expansion of O_f we use (9-24) with f(x) = x². This gives

$$b_k = \frac{2}{\pi}\int_0^{\pi} x^2\sin kx\,dx = \frac{2}{\pi}\left[-\frac{x^2\cos kx}{k}\bigg|_0^{\pi} + \frac{2}{k}\int_0^{\pi} x\cos kx\,dx\right]$$

$$= -\frac{2\pi\cos k\pi}{k} + \frac{4}{\pi k}\left[\frac{x\sin kx}{k}\bigg|_0^{\pi} - \frac{1}{k}\int_0^{\pi}\sin kx\,dx\right] = -\frac{2\pi\cos k\pi}{k} + \frac{4}{\pi k^3}(\cos k\pi - 1)$$

$$= \begin{cases} \dfrac{2\pi}{k} - \dfrac{8}{\pi k^3}, & k = 1, 3, 5, \dots, \\[1.5ex] -\dfrac{2\pi}{k}, & k = 2, 4, 6, \dots \end{cases}$$

FIGURE 9-20

Thus

$$O_f(x) = 2\pi\left(\sin x - \frac{\sin 2x}{2} + \frac{\sin 3x}{3} - \cdots\right) - \frac{8}{\pi}\left(\sin x + \frac{\sin 3x}{3^3} + \frac{\sin 5x}{5^3} + \cdots\right) \quad \text{(mean)}.$$

The graphs of these two series are shown in Figs. 9-19 and 9-20, respectively.
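Formula (9-18) also gives a quick machine check of the computation above; in this sketch (naming and quadrature are mine), f(x) = x² reproduces a₀ = 2π²/3 and a_k = (4/k²) cos kπ.

```python
import math

def cosine_coefficient(f, k, n=5000):
    """a_k of the Fourier cosine series of f on [0, pi], formula (9-18),
    approximated with a composite midpoint rule."""
    dx = math.pi / n
    return (2 / math.pi) * sum(
        f((i + 0.5) * dx) * math.cos(k * (i + 0.5) * dx) for i in range(n)
    ) * dx
```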

EXERCISES
1. (a) Find the even and odd extensions of each of the following functions in PC[0, π], and sketch their graphs on the interval [−π, π].

(i) f(x) = 1    (ii) f(x) = x²    (iii) f(x) = π − x    (iv) f(x) = e^x

(b) Sketch the graphs of the Fourier sine and cosine series for each of the functions in (a).

2. Let f be an arbitrary function in PC[0, π]. Prove that E_f is an even function in PC[−π, π], and that O_f is odd.
3. Find the Fourier sine series expansion of the function

$$f(x) = \cos x, \qquad 0 < x < \pi,$$

and use this result to deduce that

$$\frac{\sqrt{2}\,\pi}{16} = \frac{1}{2^2 - 1} - \frac{3}{6^2 - 1} + \frac{5}{10^2 - 1} - \frac{7}{14^2 - 1} + \cdots.$$

4. Find the Fourier cosine series expansion of the function

$$f(x) = \sin x, \qquad 0 < x < \pi,$$

and sketch the graph of this series.

5. Find the Fourier sine series expansion of e^x, 0 < x < π.

6. Find the Fourier cosine series expansion of e^x, 0 < x < π.
7. Find the Fourier sine series expansion of the function

$$f(x) = \pi - x, \qquad 0 < x < \pi.$$

8. Let f be a function in PC[0, π] which is symmetric about the line x = π/2. Show that the only nonzero terms in the Fourier sine series expansion of f are of the form

$$B_k \sin kx, \qquad k \text{ odd},$$

and find a formula for B_k.


9. Let f be the function in PC[0, π] which is obtained by reflecting the function x², 0 ≤ x ≤ π/2, across the line x = π/2. Use the results of Exercise 8 to find the Fourier sine series expansion of f.
10. Find the Fourier sine series expansion of the function

$$f(x) = x^2 - \pi x, \qquad 0 < x < \pi.$$

11. Let f be a function in PC[0, π] with the property that

$$f\left(\frac{\pi}{2} + x\right) = -f\left(\frac{\pi}{2} - x\right), \qquad 0 \le x \le \pi/2.$$

Show that the only nonzero terms in the Fourier cosine series expansion of f are of the form

$$A_k \cos kx, \qquad k \text{ odd},$$

and find a formula for A_k.

12. Use the results of Exercise 11 to find the Fourier cosine series expansion of the
function
    f(x) = { 1,    0 < x < π/2,
           { −1,   π/2 < x < π.

13. Use the results of Exercise 11 to find the Fourier cosine series expansion of the
function

    f(x) = sin(x − π/2),   0 < x < π.



*14. (a) Prove that the functions

    sin x, sin 2x, …, sin kx, …

form an orthogonal set in PC[0, π].

(b) Use the fact that the functions 1, cos x, sin x, … form a basis for PC[−π, π] to prove that the set of functions in (a) is a basis for PC[0, π].

(c) Prove that the functions

    1, cos x, cos 2x, …

also form a basis for PC[0, π].

9-6 CHANGE OF INTERVAL


Up to this point we have dealt exclusively with functions on the intervals [−π, π] and [0, π]. For many purposes, however, this setting is too restrictive, and we now propose to generalize our results to an arbitrary interval [a, b]. But rather than begin at once with the most general case, it will prove simpler if we first consider intervals of the form [−p, p] and their associated Euclidean spaces PC[−p, p]. For here the situation can be handled with dispatch.
Indeed, it is all but obvious that the functions

    1, cos(πx/p), sin(πx/p), cos(2πx/p), sin(2πx/p), …   (9-25)

are mutually orthogonal in PC[−p, p] (Exercise 1 below).* Moreover, just as in the case where p = π, it can be shown that these functions are a basis for this space,

and hence that their associated orthogonal series (which, by the way, are still
called Fourier series) converge in the mean. And finally, with due allowance being
made for the length of the interval, all of our earlier remarks concerning pointwise
convergence are valid in this setting.
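The orthogonality asserted for the system (9-25) can be confirmed numerically. The following Python sketch is illustrative (the helper `inner` and the choice p = 2 are our own): it approximates the inner products in PC[−p, p] by a midpoint rule and checks that distinct basis functions are orthogonal, while each function cos(kπx/p) has squared norm p.

```python
import math

def inner(f, g, p, n=4000):
    # integral_{-p}^{p} f(x) g(x) dx by the midpoint rule
    h = 2.0 * p / n
    return h * sum(f(-p + (i + 0.5) * h) * g(-p + (i + 0.5) * h) for i in range(n))

p = 2.0
basis = [lambda x: 1.0]
for k in range(1, 4):
    basis.append(lambda x, k=k: math.cos(k * math.pi * x / p))
    basis.append(lambda x, k=k: math.sin(k * math.pi * x / p))

for i, u in enumerate(basis):
    for j, v in enumerate(basis):
        if i != j:
            assert abs(inner(u, v, p)) < 1e-9      # distinct functions are orthogonal
assert abs(inner(basis[1], basis[1], p) - p) < 1e-9  # ||cos(pi x/p)||^2 = p
```

Because the integrands are trigonometric polynomials of low degree, the equally spaced midpoint sums reproduce these integrals essentially to machine precision.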
To obtain formulas for the Fourier coefficients of a function in PC[−p, p] we note that

    ∫_{−p}^{p} dx = 2p,

and

    ∫_{−p}^{p} cos²(kπx/p) dx = ∫_{−p}^{p} sin²(kπx/p) dx = p.
Thus, by Formula (8-22),

    f(x) = a₀/2 + Σ_{k=1}^{∞} [ a_k cos(kπx/p) + b_k sin(kπx/p) ]   (mean),   (9-26)

* Actually, the entire issue comes down to making a change in scale on the x-axis by substituting πx/p for x in the functions used earlier.

where

    a_k = (1/p) ∫_{−p}^{p} f(x) cos(kπx/p) dx,
    b_k = (1/p) ∫_{−p}^{p} f(x) sin(kπx/p) dx,   (9-27)
for all k. And with this we are done.
The above discussion can easily be adapted to handle the Euclidean space PC[a, b]. Indeed, if we set 2p = b − a, so that [a, b] = [a, a + 2p], the functions in (9-25) are also a basis for PC[a, a + 2p]. This leads at once to the following formulas for computing the Fourier series expansion of a function f in PC[a, b].

    f(x) = a₀/2 + Σ_{k=1}^{∞} [ a_k cos(2kπx/(b − a)) + b_k sin(2kπx/(b − a)) ]   (mean),   (9-28)

where

    a_k = (2/(b − a)) ∫_a^b f(x) cos(2kπx/(b − a)) dx,
    b_k = (2/(b − a)) ∫_a^b f(x) sin(2kπx/(b − a)) dx,   (9-29)

for all k.

Example 1. Find the Fourier series expansion in PC[0, 1] of the function

    f(x) = x.

Here b − a = 1, and (9-29) becomes

    a_k = 2 ∫_0^1 x cos(2kπx) dx,
    b_k = 2 ∫_0^1 x sin(2kπx) dx.

Integration by parts now yields

    a₀ = 1,   a_k = 0, k ≠ 0,   b_k = −1/(kπ).
Hence
    f(x) = 1/2 − (1/π) [ sin 2πx + (sin 4πx)/2 + (sin 6πx)/3 + ⋯ ]   (mean).

The graph of this series is given in Fig. 9-21.
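The coefficients found in Example 1 can be verified directly from Formulas (9-29). A minimal Python sketch (the helper name `fourier_coeffs` and the midpoint grid are our own illustrative choices):

```python
import math

def fourier_coeffs(f, a, b, k, n=20000):
    # (a_k, b_k) from Formulas (9-29), midpoint rule on [a, b]
    L = b - a
    h = L / n
    ak = bk = 0.0
    for i in range(n):
        x = a + (i + 0.5) * h
        ak += f(x) * math.cos(2 * math.pi * k * x / L)
        bk += f(x) * math.sin(2 * math.pi * k * x / L)
    return 2 * h * ak / L, 2 * h * bk / L

a0, _ = fourier_coeffs(lambda x: x, 0.0, 1.0, 0)
assert abs(a0 - 1.0) < 1e-6                      # a_0 = 1
for k in range(1, 5):
    ak, bk = fourier_coeffs(lambda x: x, 0.0, 1.0, k)
    assert abs(ak) < 1e-6                        # a_k = 0 for k >= 1
    assert abs(bk + 1.0 / (k * math.pi)) < 1e-6  # b_k = -1/(k*pi)
```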

FIGURE 9-21

Example 2. Find the Fourier series expansion of the function f shown in Fig. 9-22.
In this case,

    f(x) = { x − 2,   2 < x < 3,
           { 4 − x,   3 < x < 4,

and Formulas (9-29) yield

    a_k = ∫_2^4 f(x) cos kπx dx
        = ∫_2^3 (x − 2) cos kπx dx + ∫_3^4 (4 − x) cos kπx dx,

    b_k = ∫_2^4 f(x) sin kπx dx
        = ∫_2^3 (x − 2) sin kπx dx + ∫_3^4 (4 − x) sin kπx dx.

Although these integrals can be evaluated directly, the computations can be


considerably simplified by invoking the following argument.
Let F denote the periodic extension of f to the entire x-axis (Fig. 9-23). Then the functions F(x) cos kπx and F(x) sin kπx are periodic with period 2, and we have

    ∫_a^{a+2} F(x) cos kπx dx = ∫_2^4 f(x) cos kπx dx,
    ∫_a^{a+2} F(x) sin kπx dx = ∫_2^4 f(x) sin kπx dx,   (9-30)

for any real number a. [At this point we are using the obvious fact that if g is piecewise continuous on (−∞, ∞) with period 2p, then

    ∫_a^{a+2p} g(x) dx = ∫_b^{b+2p} g(x) dx

for any pair of real numbers a, b.] We now set a = −1 in (9-30) to obtain

    a_k = ∫_{−1}^{1} F(x) cos kπx dx,
    b_k = ∫_{−1}^{1} F(x) sin kπx dx.


But on the interval [−1, 1], F coincides with the even function |x|. Hence b_k = 0 for all k, and a_k = 2 ∫_0^1 x cos kπx dx. Thus

    a₀ = 1,
    a_k = { −4/(k²π²),   k odd,
          { 0,           k even, k ≠ 0,

and the Fourier series expansion of f is

    f(x) = 1/2 − (4/π²) [ cos πx + (cos 3πx)/3² + (cos 5πx)/5² + ⋯ ]   (mean).
The student will find this technique useful in some of the following exercises.
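The periodic-extension shortcut can also be checked by brute force: integrating over [2, 4] directly must reproduce the coefficients of the even function |x| on [−1, 1]. A Python sketch (the helper names and grid size are illustrative):

```python
import math

def f(x):
    # the function of Example 2 on [2, 4]
    return x - 2 if x < 3 else 4 - x

def a_coeff(k, n=20000):
    # a_k = integral_2^4 f(x) cos(k*pi*x) dx  (here the factor 2/(b - a) = 1)
    h = 2.0 / n
    return h * sum(
        f(2 + (i + 0.5) * h) * math.cos(k * math.pi * (2 + (i + 0.5) * h))
        for i in range(n)
    )

assert abs(a_coeff(0) - 1.0) < 1e-6                         # a_0 = 1
for k in (1, 3, 5):
    assert abs(a_coeff(k) + 4 / (k * math.pi) ** 2) < 1e-6  # odd k: -4/(k^2 pi^2)
for k in (2, 4):
    assert abs(a_coeff(k)) < 1e-6                           # even k, k != 0: zero
```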

EXERCISES

1. Prove that the functions

    1, cos(πx/p), sin(πx/p), …, cos(kπx/p), sin(kπx/p), …

are mutually orthogonal in PC[−p, p].
2. Let f be piecewise continuous and periodic on the entire x-axis, and suppose that f has period 2p. Prove that

    ∫_a^{a+2p} f(x) dx = ∫_b^{b+2p} f(x) dx

for any pair of real numbers a, b.

3. Find the Fourier series expansion of the function in PC[−2, 2] defined by

    f(x) = { 0,     −2 < x < −1,
           { |x|,   −1 < x < 1,
           { 0,     1 < x < 2.

Sketch the graph of the series.

4. Find the Fourier series expansion of the function in PC[1, 3] defined by

    f(x) = { 2 − x,   1 < x < 2,
           { x − 2,   2 < x < 3.

5. Find the Fourier series expansion of sin x as a function in PC[0, π/2], and sketch the graph of the series.
6. Find the Fourier series expansion of cos x as a function in PC[π/2, 3π/2], and sketch the graph of the series.
7. Find the Fourier series expansion of the function

    f(x) = { 1,        8 < x < 9,
           { 10 − x,   9 < x < 10.



8. Find the Fourier series expansion of the function

    f(x) = { 1,       2 < x < 3,
           { 4 − x,   3 < x < 4,
           { x − 4,   4 < x < 5,
           { 1,       5 < x < 6.

9. Find a Fourier series which contains only sine terms and which converges pointwise to the function x − 1 for 1 < x < 2.

10. Find a Fourier series which contains only cosine terms and which converges pointwise to the function x − 1 for 1 < x < 2.

11. Find a Fourier series which contains only sine terms of odd "degree" and which converges pointwise to the function x² − 4 when 2 < x < 3. [Hint: See Exercise 8, Section 9-5.]
12. Let f be a piecewise continuous function of period 2π defined on the entire x-axis, and suppose that the Fourier series expansion of f is

    a₀/2 + Σ_{k=1}^{∞} (a_k cos kx + b_k sin kx).

Let the Fourier series expansion of f(x + π) be

    A₀/2 + Σ_{k=1}^{∞} (A_k cos kx + B_k sin kx).

Show that

    A_k = (−1)^k a_k   and   B_k = (−1)^k b_k.

13. Given that the Fourier series expansion of x, −π < x < π, is

    2 [ sin x − (sin 2x)/2 + (sin 3x)/3 − ⋯ ],

use Exercise 12 to find the Fourier series expansion of the function

    f(x) = { x + π,   −π < x < 0,
           { x − π,   0 < x < π.

14. Let f be piecewise continuous on (−∞, ∞), and assume that f(x + π) = −f(x) for all x where f is defined.

(a) Prove that f is periodic with period 2π.

(b) Show that the Fourier series for f has only terms of odd degree.

(c) Prove, conversely, that f(x) = −f(x + π) whenever the Fourier series for f has only terms of odd degree.

(d) What can be said about a function whose Fourier series contains only even terms? Why is this situation not particularly interesting?

Remark: Functions satisfying f(x) = −f(x + π) on (−∞, ∞) are said to possess half-wave symmetry and are of interest in electrical engineering.
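The claim of Exercise 14(b) is easy to illustrate numerically. In the Python sketch below the particular test function is our own choice: f(t) = sin t + 0.3 sin 3t + 0.1 cos 5t has half-wave symmetry (every term changes sign under t → t + π), and its computed Fourier coefficients of even degree vanish.

```python
import math

def f(t):
    # satisfies f(t + pi) = -f(t): every term has odd degree
    return math.sin(t) + 0.3 * math.sin(3 * t) + 0.1 * math.cos(5 * t)

def coeffs(k, n=20000):
    # Fourier coefficients (a_k, b_k) on [-pi, pi], midpoint rule
    h = 2 * math.pi / n
    a = b = 0.0
    for i in range(n):
        t = -math.pi + (i + 0.5) * h
        a += f(t) * math.cos(k * t)
        b += f(t) * math.sin(k * t)
    return h * a / math.pi, h * b / math.pi

for k in (2, 4, 6):
    a, b = coeffs(k)
    assert abs(a) < 1e-9 and abs(b) < 1e-9   # even-degree terms vanish
a1, b1 = coeffs(1)
assert abs(b1 - 1.0) < 1e-9 and abs(a1) < 1e-9
```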

*9-7 THE BASIS THEOREM†

We have already had occasion to remark that the entire theory of Fourier series ultimately rests upon the fact that the set of functions

    1, cos x, sin x, …, cos kx, sin kx, …

is a basis for the Euclidean space PC[−π, π]. As we shall see, this result can be made to depend upon one of the really fundamental theorems in analysis, the famous Weierstrass Approximation Theorem. This theorem, whose proof will be given in Section 10-8, is usually stated in one of two equivalent forms, the first of which involves trigonometric polynomials, the second ordinary polynomials. For our present purposes the statement involving trigonometric polynomials is the more appropriate.

Theorem 9-3. (Weierstrass Approximation Theorem.) Let f be a continuous function on the interval [−π, π], and suppose that f(−π) = f(π). Then, given any real number ε > 0, there exists a trigonometric polynomial

    T(x) = A₀ + Σ_{k=1}^{N(ε)} (A_k cos kx + B_k sin kx)

such that

    |T(x) − f(x)| < ε

for every x in [−π, π].
Descriptively, this theorem asserts that the graph of T(x) lies between the graphs of f + ε and f − ε throughout the entire interval [−π, π], as illustrated in Fig. 9-24. The notation N(ε) in the formula for T(x) is a reflection of the fact that in general the number of terms in this trigonometric polynomial will depend upon ε, increasing as ε becomes small.

Theorem 9-3 is often stated in the following terms: "Any continuous periodic function can be uniformly approximated by trigonometric polynomials," phraseology which requires some explanation. In the first place, we have already observed that any function f in PC[−π, π] can be extended to the entire real line in such a way that the extended function is periodic. Now if f happens to be continuous and if f(−π) = f(π), it is clear that the

FIGURE 9-24

† The discussion in this section uses the notion of a closed subspace defined in Section 8-5.

periodic extension F of f will be continuous on the whole real line (Fig. 9-25). It follows that a trigonometric polynomial which approximates such a function everywhere in [−π, π] will actually approximate the periodic extension of f. This accounts for the term "periodic" in the above statement.

FIGURE 9-25

As far as the word "uniform" is concerned, it will suffice for now to remark that it is used to call attention to the fact that the approximation of f by T holds for all x in the interval [−π, π].
We are now ready to prove that the set 1, cos x, sin x, …, which we shall denote by ℬ, is a basis for PC[−π, π]. To do so we introduce the subspace P[−π, π] consisting of all continuous functions in PC[−π, π] which satisfy the equation f(−π) = f(π), and establish two preliminary results concerning this subspace. The first is

Theorem 9-4. Every function in P[−π, π] belongs to S̄(ℬ), the closure of S(ℬ) in PC[−π, π].

Proof. Recall that S̄(ℬ) is, by definition, the subspace of PC[−π, π] consisting of those functions which belong to S(ℬ) or are limits in the mean of sequences of functions in S(ℬ), or both.* Thus to prove the theorem we must show that every f in P[−π, π] can be approximated arbitrarily closely in the mean by a trigonometric polynomial.

To accomplish this we apply the approximation theorem and, for each integer k > 0, find a trigonometric polynomial T_k such that

    |T_k(x) − f(x)| < 1/(k√(2π))

for all x in [−π, π]. Then

    ‖T_k − f‖ = [ ∫_{−π}^{π} (T_k(x) − f(x))² dx ]^{1/2} < [ ∫_{−π}^{π} dx/(2πk²) ]^{1/2} = 1/k.

* S(ℬ) is, of course, the set of all trigonometric polynomials.



It follows at once that the sequence {T_k}, k = 1, 2, …, so obtained converges in the mean to f, and the theorem is proved. |

The second result is as follows:

Theorem 9-5. Given any real number ε > 0, and any f in PC[−π, π], there exists a function g in P[−π, π] such that

    ‖f − g‖ < ε.

In other words, any function in PC[−π, π] can be approximated arbitrarily closely in the mean by a function in P[−π, π].

FIGURE 9-26

Proof. The function g is constructed from f by "mending the discontinuities" of f as suggested in Fig. 9-26. Specifically, suppose that f is a function in PC[−π, π] which has a single jump discontinuity at x₀. In this case, g is the continuous function obtained by redefining f near −π, x₀, and π as shown by the broken lines in Fig. 9-26. (Note that this has been done in such a way that g(−π) = g(π), so that g does indeed belong to P[−π, π].)

Calculating ‖f − g‖², we find that

    ‖f − g‖² = ∫_{−π}^{π} [f(x) − g(x)]² dx
             = ∫_{−π}^{−π+δ} [f(x) − g(x)]² dx + ∫_{x₀−δ}^{x₀+δ} [f(x) − g(x)]² dx
               + ∫_{π−δ}^{π} [f(x) − g(x)]² dx,

since f(x) = g(x) whenever x belongs to the intervals [−π + δ, x₀ − δ] and [x₀ + δ, π − δ]. Now let M be an upper bound for |f(x)| on [−π, π]; i.e., |f(x)| ≤ M for all x in [−π, π]. Then |f(x) − g(x)|² ≤ (2M)² everywhere on [−π, π], and it follows that

    ‖f − g‖² ≤ 4M²(δ + 2δ + δ) = 16M²δ.

To make this quantity less than any preassigned ε > 0, it obviously suffices to choose δ < ε/16M². Thus the theorem holds for functions with a single discontinuity. To give the proof for any f in PC[−π, π] just repeat this process at each point of discontinuity. |

Before stating the basis theorem, let us pause a moment to take stock of the situation. The preceding result tells us that any function in PC[−π, π] can be approximated arbitrarily closely in the mean by a function in P[−π, π]. But Theorem 9-4 asserts that every function in P[−π, π] can in turn be approximated arbitrarily closely in the mean by a trigonometric polynomial. Taken together these two statements ought to yield a proof of the fact that ℬ is a basis for PC[−π, π]. They do, in the following manner.

Theorem 9-6. ℬ is a basis for PC[−π, π].

Proof. We must show that every f in PC[−π, π] is the limit in the mean of a sequence of trigonometric polynomials. But by Theorem 9-5 we can find functions f_k in P[−π, π] such that

    ‖f − f_k‖ < 1/(2k),   k = 1, 2, …,

while by Theorem 9-4 we can find trigonometric polynomials T_k such that

    ‖f_k − T_k‖ < 1/(2k).

Hence, by the triangle inequality,

    ‖f − T_k‖ = ‖(f − f_k) + (f_k − T_k)‖
              ≤ ‖f − f_k‖ + ‖f_k − T_k‖
              < 1/(2k) + 1/(2k) = 1/k,

and it follows that the sequence {T_k} converges in the mean to f. |

For later purposes we also record the following easy consequence of the basis
theorem:

Corollary 9-1. (Parseval's Equality.) If f is any function in PC[−π, π], then

    (1/π) ∫_{−π}^{π} f(x)² dx = a₀²/2 + Σ_{k=1}^{∞} (a_k² + b_k²),   (9-31)

where the a_k and b_k are the Fourier coefficients of f.

Proof. Since the functions 1, cos x, sin x, … are a basis for PC[−π, π], Parseval's equality is satisfied, and we have

    ‖f‖² = (f · 1)²/‖1‖² + Σ_{k=1}^{∞} [ (f · cos kx)²/‖cos kx‖² + (f · sin kx)²/‖sin kx‖² ]   (9-32)

(Formula 8-21). But

    (f · 1)²/‖1‖² = ( ∫_{−π}^{π} f(x) dx )²/2π = (π/2) a₀²,

    (f · cos kx)²/‖cos kx‖² = ( ∫_{−π}^{π} f(x) cos kx dx )²/π = π a_k²,

    (f · sin kx)²/‖sin kx‖² = ( ∫_{−π}^{π} f(x) sin kx dx )²/π = π b_k²,

and (9-31) now follows at once from (9-32). |
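Parseval's equality can be watched concretely with f(x) = x, whose Fourier coefficients on [−π, π] are a_k = 0 and b_k = 2(−1)^{k+1}/k (a standard computation). The left side of (9-31) is then 2π²/3, and the right side is 4 Σ 1/k². A Python sketch (the truncation point N is an illustrative choice):

```python
import math

# left side of (9-31) for f(x) = x: (1/pi) * integral_{-pi}^{pi} x^2 dx = 2*pi^2/3
lhs = 2 * math.pi ** 2 / 3

# right side: a_k = 0 and b_k = 2*(-1)^(k+1)/k, so sum (a_k^2 + b_k^2) = 4 * sum 1/k^2
N = 100000
rhs = sum((2.0 / k) ** 2 for k in range(1, N + 1))

# the tail of the series is about 4/N, so agreement to ~1e-4 is expected here
assert abs(lhs - rhs) < 1e-4
```

Incidentally, this instance of Parseval's equality is equivalent to Euler's evaluation Σ 1/k² = π²/6.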

9-8 ORTHOGONAL SERIES IN TWO VARIABLES

With an eye to later applications we now sketch the theory of series expansions for functions of two variables. A corresponding theory also exists for functions of any finite number of variables, but since we shall have no occasion to use these results their formulation is left to the reader.

Our first task, of course, is to define a function space in which to conduct the discussion. This is accomplished by generalizing the notion of piecewise continuity to accommodate functions of two variables, and then using these functions to construct the two-dimensional analog of the space PC[a, b], as follows.
A function f is said to be piecewise continuous on a rectangle R in the plane if

(i) f is continuous everywhere in and on the boundary of R, with the possible exception of a finite number of points, or along a finite number of simple differentiable arcs, or both;* and

(ii) lim_{(x,y)→(x₀,y₀)} f(x, y) exists whenever (x₀, y₀) is a point of discontinuity of f, and (x, y) approaches (x₀, y₀) from the interior of any one of the regions into which R is separated by the arcs of discontinuity.
Any function which is continuous in and on the boundary of R is piecewise continuous, so that, in particular, such functions as sin(mx) sin(ny), sin(mx) cos(ny), etc., m and n integers, are piecewise continuous on any rectangle in the plane. However, the set of piecewise continuous functions is clearly larger than the set of continuous functions for any rectangle R. In Fig. 9-27 we have illustrated a rather general piecewise continuous function, in order to dispel any doubts about the nature of such functions. We also remark that it is quite legitimate to consider piecewise continuous functions in regions other than rectangles. The basic definition remains unchanged, except that we replace R by a region whose boundary

* A plane curve x = x(t), y = y(t) is said to be a differentiable arc if the functions x(t), y(t) have continuous derivatives with respect to t. A differentiable arc which does not intersect itself is said to be simple.

consists of a finite number of simple differentiable arcs. This notwithstanding, we shall restrict our attention to rectangular regions throughout this section. The reasons behind this prejudice in favor of rectangles will become apparent as soon as we state Theorem 9-7.

FIGURE 9-27

As with functions of a single variable, we agree to identify two piecewise continuous functions whenever they agree everywhere in R except at their points of discontinuity. This done, the totality of piecewise continuous functions on R becomes a Euclidean space under the usual operations of addition and scalar multiplication, and the integral inner product

    f · g = ∬_R f(x, y) g(x, y) dR.   (9-33)

(The student will recall that when R is the rectangle a < x < b, c < y < d, this double integral may be evaluated by computing either one of the iterated integrals

    ∫_a^b ∫_c^d f(x, y) g(x, y) dy dx   or   ∫_c^d ∫_a^b f(x, y) g(x, y) dx dy.)

We denote the resulting Euclidean space by PC(R).

Our next task is to find a basis for PC(R). And here the fact that R is a rectangle becomes important, for we can then reduce this problem to that of finding a basis for a Euclidean space of the type PC[a, b]. This is the content of

Theorem 9-7. Let {f_i(x)} and {g_j(y)} be orthogonal bases for the Euclidean spaces PC[a, b] and PC[c, d], respectively. Then the set of all products

    {f_i(x) g_j(y)},   i = 1, 2, …,   j = 1, 2, …,   (9-34)

is a basis for PC(R), where R is the rectangle a < x < b, c < y < d.



The proof of this theorem is given below, following the examples. First, however, we observe that the generalized Fourier coefficients of any function F in PC(R), computed with respect to the functions in (9-34), are

    α_ij = (F · f_i g_j) / (f_i g_j · f_i g_j),

or, in greater detail,

    α_ij = [ ∬_R F(x, y) f_i(x) g_j(y) dR ] / [ ∬_R f_i(x)² g_j(y)² dR ].   (9-35)
Thus the series expansion of F can be written as a double series

    Σ_{i,j=1}^{∞} α_ij f_i(x) g_j(y),   (9-36)

and Theorem 9-7 allows us to assert that this series converges in the mean to F. (The order of summation in such a series is a matter of indifference since the assertion that the functions f_i g_j are a basis for PC(R) is not affected by the order in which they are displayed.)

Example 1. Let R be the rectangle −π < x < π, −π < y < π. Then the set of functions

    sin mx sin ny,   sin mx cos qy,   cos px sin ny,   cos px cos qy,   (9-37)

where m and n range independently over the integers 1, 2, …, and p and q over the integers 0, 1, 2, …, is a basis for PC(R). More generally, the set of functions

    sin(mπx/a) sin(nπy/b),   sin(mπx/a) cos(qπy/b),
    cos(pπx/a) sin(nπy/b),   cos(pπx/a) cos(qπy/b)   (9-38)

is a basis for the Euclidean space of piecewise continuous functions on the rectangle −a < x < a, −b < y < b.

Example 2. Find the series expansion of the function

    F(x, y) = xy

in the rectangle −π < x < π, −π < y < π relative to the basis given in the preceding example.

Here we must evaluate the coefficients

    α_mn,   α_mq,   α_pn,   α_pq


of Formula (9-35) for the various functions in (9-37) and the given function F. But since x cos px and y cos qy are odd functions of x and y,

    ∫_{−π}^{π} ∫_{−π}^{π} (x cos px)(y sin ny) dx dy = 0,

    ∫_{−π}^{π} ∫_{−π}^{π} (x cos px)(y cos qy) dx dy = 0,

    ∫_{−π}^{π} ∫_{−π}^{π} (x sin mx)(y cos qy) dx dy = 0.

Thus all of the Fourier coefficients of F except the α_mn are zero. To evaluate them we note that

    ∫_{−π}^{π} sin² mx dx = ∫_{−π}^{π} sin² ny dy = π

for all positive values of m and n. Hence


-X
JX_„. _ T xy sin mx sin ny dx dy
J

J_ x J_ x sin
2
mx cos 2 ny dx dy

x sin mx dx y sin ny dy
TV*

=
~
A_
2
fx
j

sin mx dx f
I
J y sin ny dy.
IT Jo Jo
But

t sin kt dt = (— 1)
and it follows that

to+1 JT
(-D m (-1)
n
m-\-n
= (-D
mn
Thus
sin x sin 2y sin 2x sin 3;
xy = 4 sin x sin jy
1 -2 2-1
sin x sin 3y sin 2x sin 2^ sin 3x sin ^
H
To ' 2^2 ' yvi
= y* ^ OT+W sin mx sin «j
4

Series of this sort are called double Fourier series, and, as we shall see, arise in the
study of boundary value problems involving partial differential equations.
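Because the double integral for α_mn factors into a product of two one-dimensional integrals, a numerical spot check of the coefficient formula α_mn = 4(−1)^{m+n}/(mn) is cheap. A Python sketch (the midpoint-rule helper `int0pi` is our own illustrative device):

```python
import math

def int0pi(f, n=20000):
    # integral_0^pi f(t) dt by the midpoint rule
    h = math.pi / n
    return h * sum(f((i + 0.5) * h) for i in range(n))

for m in range(1, 4):
    for q in range(1, 4):
        # alpha_mq = (4/pi^2) * (int_0^pi x sin mx dx) * (int_0^pi y sin qy dy)
        amq = (4 / math.pi ** 2) \
            * int0pi(lambda x: x * math.sin(m * x)) \
            * int0pi(lambda y: y * math.sin(q * y))
        assert abs(amq - 4 * (-1) ** (m + q) / (m * q)) < 1e-6
```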
We now return to the proof of Theorem 9-7. For the sake of simplicity we shall assume that {f_i(x)} and {g_j(y)} are orthonormal bases for PC[a, b] and PC[c, d]. As we know, this involves no loss of generality, since an orthogonal set can always be normalized in the usual fashion. In this case the set of products {f_i(x) g_j(y)} is also orthonormal, and hence, of course, orthogonal, as asserted in the theorem. We leave the task of establishing this fact as an easy exercise, and turn to the problem of proving that these functions are a basis for PC(R).
By Theorem 8-3 we know that the set {f_i(x) g_j(y)} will be a basis if and only if it satisfies Parseval's equality. Thus it suffices to prove that

    ∬_R F(x, y)² dR = Σ_{i,j=1}^{∞} α_ij²   (9-39)

for any function F in PC(R).


We first consider the case in which F is continuous everywhere in R. Then, for each value of y in the interval [c, d], F(x, y) is a continuous function of x, and as such can be viewed as a member of PC[a, b]. But then we can apply Parseval's equality to it and the basis {f_i(x)} to obtain

    ∫_a^b F(x, y)² dx = Σ_{i=1}^{∞} (F · f_i)² = Σ_{i=1}^{∞} [ ∫_a^b F(x, y) f_i(x) dx ]².

Moreover, each of the integrals appearing in this equality is a continuous function of y. For convenience of notation we now set

    h_i(y) = ∫_a^b F(x, y) f_i(x) dx,

and rewrite the above equality as

    ∫_a^b F(x, y)² dx = Σ_{i=1}^{∞} h_i(y)².   (9-40)

We now call upon the result from the theory of infinite series which says that a series of positive continuous functions which converges pointwise on a closed interval to a continuous function may be integrated term-by-term.* Thus, integrating (9-40), we obtain

    ∫_c^d ∫_a^b F(x, y)² dx dy = Σ_{i=1}^{∞} ∫_c^d h_i(y)² dy.   (9-41)

* This result is a consequence of Dini's Theorem (see R. Courant, Calculus II, Interscience, New York), which guarantees that such a series is uniformly convergent and can be integrated term-by-term. (See Appendix I.)

But since {g_j(y)} is a basis for PC[c, d], Parseval's equality implies that

    ∫_c^d h_i(y)² dy = Σ_{j=1}^{∞} (h_i · g_j)² = Σ_{j=1}^{∞} [ ∫_c^d h_i(y) g_j(y) dy ]²

for each integer i. If we substitute these values in (9-41), and recall the definition of h_i(y), we find that

    ∬_R F(x, y)² dR = Σ_{i,j=1}^{∞} [ ∫_c^d h_i(y) g_j(y) dy ]²
                    = Σ_{i,j=1}^{∞} [ ∫_c^d ∫_a^b F(x, y) f_i(x) g_j(y) dx dy ]²
                    = Σ_{i,j=1}^{∞} α_ij².

Thus (9-39) holds, and the proof is complete in the case where F is continuous.

The proof when F is piecewise continuous, but not continuous, requires a more sophisticated version of the "mending of discontinuities" theorem of Section 9-7 to show that F can be approximated arbitrarily closely in the mean by a continuous function. Although the procedure necessary to prove this result is conceptually clear, its details are both complicated and unenlightening, and we therefore omit the argument. But once this fact has been accepted, the general conclusion follows from the continuous case proved above. |

As might be expected, questions concerning the pointwise convergence of double series are rather difficult, and particular orthogonal systems must be considered individually. However, we can state one theorem which, though somewhat restricted in scope, is sufficient to answer all such questions that arise in this book.

Theorem 9-8. Let R be the rectangle −π < x < π, −π < y < π, and suppose that F is continuous on R, and that

    ∂F/∂x,   ∂F/∂y,   and   ∂²F/∂x∂y

exist and are bounded everywhere in R. Then the double Fourier series for F converges pointwise to F everywhere in R.*

With obvious modifications this theorem remains true for any rectangle.

* See E. W. Hobson, The Theory of Functions of a Real Variable, Second Ed., Cam-
bridge University Press, 1921, 1926.

EXERCISES

1. (a) Verify that the set of functions {f_i(x) g_j(y)} of Theorem 9-7 is orthogonal in PC(R), and that this set is orthonormal whenever {f_i(x)} and {g_j(y)} are orthonormal.

(b) Find the norms of these functions in terms of the norms of the functions f_i(x) and g_j(y); i.e., find a formula for ‖f_i(x) g_j(y)‖ in terms of ‖f_i(x)‖ and ‖g_j(y)‖.

2. What are the norms of the functions in (9-37)? In (9-38)?


3. What is the form of the double Fourier series expansion of a function F if F(−x, y) = F(x, y) and F(x, −y) = F(x, y)? If F(−x, y) = −F(x, y) and F(x, −y) = −F(x, y)?

4. (a) Repeat Exercise 3 for a function F such that

    F(−x, y) = F(x, y)   and   F(x, −y) = −F(x, y).

(b) Repeat Exercise 3 for a function F such that

    F(−x, y) = −F(x, y)   and   F(x, −y) = F(x, y).

In each of the following exercises find the double Fourier series expansion of the given function in PC(R); R the rectangle −π < x < π, −π < y < π.

5. F(x, y) = x

6. F(x, y) = the function which is 1 when x and y are both positive or both negative, and −1 otherwise.

7. F(x, y) = sin²(x + y)    8. F(x, y) = e^{xy}

9. F(x, y) = xy²    10. F(x, y) = |xy|
10

CONVERGENCE OF FOURIER SERIES*

10-1 INTRODUCTION
In this chapter we shall investigate some of the convergence problems which arise in the study of Fourier series. Our first efforts will be devoted to proving the theorem cited in Section 9-4 describing the pointwise behavior of the Fourier series for a piecewise smooth periodic function. Once this has been done, we shall consider the more delicate (and interesting) problem of uniform convergence, and the related questions of term-by-term differentiability and integrability of Fourier series. Finally, we shall introduce the important notion of "summability" for infinite series, and use it to extend these results to arbitrary functions in PC[−π, π]. Throughout this discussion we shall assume that the reader is familiar with the notion of uniform convergence, and the results contained in Sections 1-3 and 1-4 of Appendix I. They will be essential in all that follows.

10-2 THE RIEMANN-LEBESGUE LEMMA


We begin the formal work of this chapter by establishing a result which, in addi-
tion to being essential for the proof of our first convergence theorem, is also of
considerable interest in itself.

Lemma 10-1. (The Riemann–Lebesgue lemma.) If g is piecewise continuous on [a, b], then

    lim_{λ→∞} ∫_a^b g(x) sin(λx) dx = lim_{λ→∞} ∫_a^b g(x) cos(λx) dx = 0.   (10-1)

The reader should note that (10-1) has already been established when λ → ∞ through the values 2πk/(b − a), k = 1, 2, … (see Corollary 8-1). The burden of the present assertion is that this result is still valid as λ tends continuously to infinity. Intuitively, of course, this is reasonable, since the positive and negative portions of the area under each of the curves g(x) sin λx and g(x) cos λx tend to cancel one another as λ → ∞.
* This chapter may be omitted in its entirety without loss of continuity.



Proof. Since the argument is similar for both functions we shall give the proof only for g(x) sin λx. Here if

    I(λ) = ∫_a^b g(x) sin(λx) dx,   (10-2)

and if ε > 0 is given, we must show that there exists a constant λ₀ such that |I(λ)| < ε for all λ > λ₀. To this end we assume for the moment that g is continuous, and make the substitution x = t + (π/λ) in (10-2) to obtain

    I(λ) = − ∫_{a−π/λ}^{b−π/λ} g(t + π/λ) sin(λt) dt,

or, reverting to the variable x,

    I(λ) = − ∫_{a−π/λ}^{b−π/λ} g(x + π/λ) sin(λx) dx.   (10-3)

Adding (10-2) and (10-3), we find that

    2I(λ) = − ∫_{a−π/λ}^{a} g(x + π/λ) sin(λx) dx + ∫_{b−π/λ}^{b} g(x) sin(λx) dx
            + ∫_a^{b−π/λ} [ g(x) − g(x + π/λ) ] sin(λx) dx.

Hence if M denotes the maximum value of the function |g| on [a, b], and if π/λ < b − a (which, of course, we may assume), then

    2|I(λ)| ≤ M ∫_{a−π/λ}^{a} |sin λx| dx + M ∫_{b−π/λ}^{b} |sin λx| dx
              + ∫_a^{b−π/λ} | g(x) − g(x + π/λ) | |sin λx| dx

            ≤ 2Mπ/λ + ∫_a^{b−π/λ} | g(x) − g(x + π/λ) | dx

(recall that |sin λx| ≤ 1 for all x). Thus

    |I(λ)| ≤ Mπ/λ + (1/2) ∫_a^{b−π/λ} | g(x) − g(x + π/λ) | dx.

To complete the proof in the case under consideration we now use the fact that g is uniformly continuous on [a, b] to find a constant λ₀ such that

    | g(x) − g(x + π/λ) | < ε/(b − a)

for all λ > λ₀ and all x in [a, b].* In addition, we suppose that λ₀ is chosen so that, at the same time, Mπ/λ < ε/2 whenever λ > λ₀. Then

    |I(λ)| < ε/2 + ε/2 = ε

for all λ > λ₀, as required. Finally, to establish the assertion for an arbitrary function in PC[a, b], we merely apply the above argument to each of its continuous pieces. |
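The lemma is easy to watch in action. In the Python sketch below the test function g(x) = x on [0, 1] and the grid size are our own choices; the integral ∫₀¹ x sin(λx) dx works out to −cos λ/λ + sin λ/λ², so its magnitude falls off like 1/λ.

```python
import math

def I(lam, n=200000):
    # integral_0^1 x sin(lam * x) dx, midpoint rule (fine grid for large lam)
    h = 1.0 / n
    return h * sum((i + 0.5) * h * math.sin(lam * (i + 0.5) * h) for i in range(n))

vals = [abs(I(lam)) for lam in (10.0, 100.0, 1000.0)]
assert vals[0] > vals[1] > vals[2]   # magnitudes shrink as lambda grows
assert vals[2] < 2.0 / 1000.0        # consistent with O(1/lambda) decay
```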

10-3 POINTWISE CONVERGENCE OF FOURIER SERIES

In this section we use the Riemann–Lebesgue lemma to establish our first convergence theorem. The main step in the proof consists of deriving a formula for the partial sums of an arbitrary Fourier series

    a₀/2 + Σ_{k=1}^{∞} (a_k cos kx + b_k sin kx),

where

    a_k = (1/π) ∫_{−π}^{π} f(x) cos kx dx,   b_k = (1/π) ∫_{−π}^{π} f(x) sin kx dx,

and f is any function in PC[−π, π]. The derivation goes as follows. Let

    S_n(x) = a₀/2 + Σ_{k=1}^{n} (a_k cos kx + b_k sin kx).

Then

    S_n(x) = (1/2π) ∫_{−π}^{π} f(t) dt
             + (1/π) Σ_{k=1}^{n} [ cos kx ∫_{−π}^{π} f(t) cos kt dt + sin kx ∫_{−π}^{π} f(t) sin kt dt ]

           = (1/π) ∫_{−π}^{π} f(t) [ 1/2 + Σ_{k=1}^{n} (cos kx cos kt + sin kx sin kt) ] dt

           = (1/π) ∫_{−π}^{π} f(t) [ 1/2 + Σ_{k=1}^{n} cos k(t − x) ] dt.

But by summing the trigonometric identity

    sin(k + 1/2)s − sin(k − 1/2)s = 2 sin(s/2) cos ks

* Recall that a continuous function on a closed interval is uniformly continuous.


(Theorem 1-13, Appendix I.)

as k runs from 1 to n, we find that

    sin(n + 1/2)s − sin(s/2) = 2 sin(s/2) Σ_{k=1}^{n} cos ks,

or

    1/2 + Σ_{k=1}^{n} cos ks = sin(n + 1/2)s / (2 sin(s/2)).   (10-4)

Hence

    S_n(x) = (1/π) ∫_{−π}^{π} f(t) · sin(n + 1/2)(t − x) / (2 sin((t − x)/2)) dt.

We now regard x as fixed, and make the change of variable s = t − x, to obtain

    S_n(x) = (1/π) ∫_{−π−x}^{π−x} f(x + s) · sin(n + 1/2)s / (2 sin(s/2)) ds.

Finally, if we now assume that f is periodic on (−∞, ∞) with period 2π (i.e., if we replace f by its periodic extension to the entire real line), then S_n(x) is also periodic with period 2π (Exercise 9), and we can write

    S_n(x) = (1/π) ∫_{−π}^{π} f(x + s) · sin(n + 1/2)s / (2 sin(s/2)) ds.   (10-5)

This is the desired result, which is known as Dirichlet's formula for S_n. Moreover, for future reference, we also note that when (10-4) is integrated from −π to π, we have

    ∫_{−π}^{π} sin(n + 1/2)s / (2 sin(s/2)) ds = π.   (10-6)
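Dirichlet's formula can be checked numerically against the partial sums computed directly from the coefficients. In the Python sketch below (our own illustration, with an illustrative grid size) f is the 2π-periodic sawtooth equal to x on (−π, π), whose partial sums are S_n(x) = Σ 2(−1)^{k+1} sin kx / k:

```python
import math

def saw(t):
    # 2*pi-periodic extension of f(x) = x on (-pi, pi)
    return (t + math.pi) % (2 * math.pi) - math.pi

def S_dirichlet(x, n, npts=40000):
    # right side of (10-5) by the midpoint rule (sample points avoid s = 0)
    h = 2 * math.pi / npts
    total = 0.0
    for i in range(npts):
        s = -math.pi + (i + 0.5) * h
        total += saw(x + s) * math.sin((n + 0.5) * s) / (2 * math.sin(s / 2))
    return h * total / math.pi

def S_direct(x, n):
    # the nth partial sum from the Fourier coefficients of the sawtooth
    return sum(2 * (-1) ** (k + 1) * math.sin(k * x) / k for k in range(1, n + 1))

for x in (0.5, 1.0, 2.0):
    assert abs(S_dirichlet(x, 5) - S_direct(x, 5)) < 5e-3
```

The two computations agree to a few parts in a thousand, the residual error coming from integrating across the sawtooth's jump.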

Now that these facts have been established we can easily prove

Theorem 10-1. Let f be piecewise continuous on (−∞, ∞), with period 2π, and suppose that f(x) = ½[f(x⁺) + f(x⁻)] for all x. Then the Fourier series expansion for f converges to f(x₀) at each point x₀ where f has a right- and left-hand derivative. In particular, if f is piecewise smooth its Fourier series converges to f(x) for all x.*

* To accommodate points x₀ where f has a jump discontinuity, we define the right- and left-hand derivatives of f to be, respectively,

    lim_{h→0⁺} [f(x₀ + h) − f(x₀⁺)]/h   and   lim_{h→0⁺} [f(x₀ − h) − f(x₀⁻)]/(−h),

provided these limits exist. The reader should note that these definitions reduce to the usual ones at those points where f is continuous. Moreover, if both of these limits exist and are equal, and if x₀ is a point of continuity of f, then f is differentiable at x₀, and the value of its derivative is the common value of the above limits. Finally, we recall that a function is said to be piecewise smooth if it has a piecewise continuous first derivative.

Proof. We begin by considering the case where x₀ is a point of continuity of f. Then f(x₀⁺) = f(x₀⁻) = f(x₀), and the assumption that f has a right- and left-hand derivative at x₀ requires that both of the limits

    lim_{h→0⁺} [f(x₀ + h) − f(x₀)]/h   and   lim_{h→0⁺} [f(x₀ − h) − f(x₀)]/(−h)

exist. We must show that the difference S_n(x₀) − f(x₀) tends to zero with increasing n. (Here, as always, S_n denotes the nth partial sum of the Fourier series for f.)
Now by (10-5) and (10-6),
/K sin (n + h)s
_J(x
r,
+
, x
s)
2sin(j/2)
* ,

jy y
k J- T 2 sin (s/2)

1 f(x + s) - f(x ) is
7T 7— s sin ^5
sin (« + ^)s g?5.

Moreover, since /has both a right- and left-hand derivative at x , the function

g (s) = /(*o + *)-/(*o) _i^_


( io-7)

is piecewise continuous on [— x, x] (see Exercise 1). Hence by the Riemann-


Lebesgue lemma, lim^^fiS^jto) — f(x )] = 0, as required.
To complete the proof it remains to treat the case where x₀ is a point of discontinuity of f. Here we must show that the series

$$\frac{a_0}{2}+\sum_{k=1}^{\infty}(a_k\cos kx_0+b_k\sin kx_0),$$

where a_k and b_k are the Fourier coefficients of f, converges to f(x₀) = ½[f(x₀⁺) + f(x₀⁻)]. But this is equivalent to showing that the Fourier series expansion of the function

$$G(x)=f(x+x_0)-f(x_0)$$

converges to zero at x = 0 (see Exercise 4). To do so we decompose G into its even and odd parts as

$$G=G_E+G_O,$$

and observe that the Fourier series for G_O, being a series of sines, converges to zero at x = 0. Hence it is sufficient to prove that the same is true of the Fourier series for G_E. However, it follows at once from the definition of G that G_E is continuous at x = 0, and has a right- and left-hand derivative there (see Exercise 4). Thus, by the continuous case treated above, the Fourier series for G_E converges to ½[G_E(0⁺) + G_E(0⁻)] = G_E(0) = 0. From this we conclude that the Fourier series for G, which is just the sum of the corresponding series for G_E and G_O, converges to zero at x = 0, and the proof is complete. ∎

376 CONVERGENCE OF FOURIER SERIES | CHAP. 10
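Theorem 10-1 can be illustrated numerically with the square wave f = −1 on (−π, 0), f = 1 on (0, π), whose Fourier series is the familiar expansion (4/π)Σ sin kx/k taken over odd k. At the jump x = 0 every partial sum equals the average value 0, while at an interior point the partial sums approach the function value. A minimal Python sketch in our own notation:

```python
import math

def square_partial(N, x):
    # partial sum of (4/pi) * sum over odd k <= N of sin(kx)/k,
    # the Fourier series of the square wave described above
    return (4 / math.pi) * sum(math.sin(k * x) / k for k in range(1, N + 1, 2))

# at the jump x = 0 every partial sum is exactly the average value 0
assert square_partial(999, 0.0) == 0.0

# at x = 1, a point where one-sided derivatives exist, the sums approach f(1) = 1
assert abs(square_partial(20001, 1.0) - 1.0) < 1e-2
```

The second assertion needs many terms because, away from jumps, the error for this discontinuous function decays only like 1/N.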

In Section 10-7 we will prove a far-reaching generalization of this result. In the meantime, however, we conclude our discussion of pointwise convergence by evaluating a certain improper integral which will be encountered shortly.

Lemma 10-2.

$$\int_0^{\infty}\frac{\sin x}{x}\,dx=\frac{\pi}{2}.$$

Proof. Let f be the function in PC[−π, π] defined by

$$f(x)=\begin{cases}\dfrac{\sin(x/2)}{x/2},&-\pi\le x\le\pi,\ x\neq0,\\[4pt]1,&x=0,\end{cases}$$

and let S_n denote the nth partial sum of the Fourier series for f. Then since f is piecewise smooth on [−π, π], Theorem 10-1 implies that S_n(x₀) → f(x₀) as n → ∞ for all x₀ in [−π, π]. Setting x₀ = 0, and using Formula (10-5), we therefore have

$$S_n(0)=\frac{1}{\pi}\int_{-\pi}^{\pi}\frac{\sin(s/2)}{s/2}\cdot\frac{\sin\left(n+\frac12\right)s}{2\sin(s/2)}\,ds=\frac{1}{\pi}\int_{-\pi}^{\pi}\frac{\sin\left(n+\frac12\right)s}{s}\,ds\to1$$

as n → ∞. We now use the fact that the integrand appearing in this expression is even to deduce that

$$\frac{2}{\pi}\int_0^{\pi}\frac{\sin\left(n+\frac12\right)s}{s}\,ds\to1$$

as n → ∞. Hence

$$\int_0^{(n+1/2)\pi}\frac{\sin x}{x}\,dx\to\frac{\pi}{2}$$

as n → ∞, and we conclude that

$$\int_0^{\infty}\frac{\sin x}{x}\,dx=\frac{\pi}{2},$$

provided the integral exists.


To settle this last point, we set

$$A_k=\int_{(k-1)\pi}^{k\pi}\frac{\sin x}{x}\,dx,$$

k = 1, 2, ….

FIGURE 10-1

Then, referring to Fig. 10-1, we see that the A_k measure the area between the x-axis and successive oscillations of the curve sin x/x. Thus the A_k alternate in sign, |A₁| > |A₂| > ⋯, and A_k → 0 as k → ∞ (see Exercise 2). From this it follows that the alternating series

$$A_1+A_2+A_3+\cdots$$

converges, and that

$$\left|\sum_{k=N+1}^{\infty}A_k\right|\le|A_{N+1}|.$$

Hence, if T is chosen so that (k − 1)π ≤ T < kπ, then for every n > k,

$$\left|\int_T^{n\pi}\frac{\sin x}{x}\,dx\right|\le\int_T^{k\pi}\frac{|\sin x|}{x}\,dx+\left|\sum_{j=k+1}^{n}A_j\right|\le|A_k|+|A_{k+1}|,$$

and since A_k → 0, the integral in question must converge.
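Lemma 10-2 can be checked by direct numerical quadrature. The sketch below (Simpson's rule; the step count and stopping index are our choices) evaluates the integral up to (n + ½)π, the stopping point used in the proof, and compares it with π/2:

```python
import math

def sinc_integral(upper, steps=200000):
    # composite Simpson approximation of  int_0^upper (sin x)/x dx,
    # with the integrand given its limiting value 1 at x = 0
    if steps % 2:
        steps += 1
    f = lambda x: 1.0 if x == 0.0 else math.sin(x) / x
    h = upper / steps
    total = f(0.0) + f(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3.0

val = sinc_integral((100 + 0.5) * math.pi)
assert abs(val - math.pi / 2) < 1e-3
```

Stopping at a half-integer multiple of π is deliberate: these are essentially the midpoints between successive crests of the partial integrals, so the tail error is especially small there.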

EXERCISES

1. Prove that the function g(s) defined by Formula (10-7) is piecewise continuous on [−π, π].

2. Let

$$A_k=\int_{(k-1)\pi}^{k\pi}\frac{\sin x}{x}\,dx,\qquad k=1,2,\ldots.$$

Prove that |A₁| > |A₂| > ⋯, and that A_k → 0 as k → ∞.

3. Show that

$$\left|\int_0^x\frac{\sin t}{t}\,dt\right|\le\int_0^{\pi}\frac{\sin t}{t}\,dt$$

for all x.

4. (a) Let a_k, b_k denote the Fourier coefficients of f. Prove that the series

$$\frac{a_0}{2}+\sum_{k=1}^{\infty}(a_k\cos kx_0+b_k\sin kx_0)$$

converges to f(x₀) if and only if

$$\frac{a_0}{2}-f(x_0)+\sum_{k=1}^{\infty}\left[(a_k\cos kx_0+b_k\sin kx_0)\cos kx+(b_k\cos kx_0-a_k\sin kx_0)\sin kx\right]$$

converges to zero at x = 0.

(b) Show that the second of the above series is the Fourier series expansion of the function G(x) = f(x + x₀) − f(x₀). [Hint: Make the change of variable t = x + x₀ in the formulas for the Fourier coefficients of G on [−π, π] and use the periodicity of f.]

(c) Show that

$$\lim_{x\to0^+}G_E(x)=\lim_{x\to0^+}G_E(-x)=G_E(0)=0,$$

where G_E is the even part of G, as in the proof of Theorem 10-1, thereby establishing the continuity of G_E at x = 0.

(d) Show that the right- and left-hand derivatives of G_E exist at x = 0. [Hint: Note that

$$\frac{G_E(h)-G_E(0)}{h}=\frac12\left[\frac{G(h)-G(0)}{h}+\frac{G(-h)-G(0)}{h}\right]=\frac12\left[\frac{f(x_0+h)-f(x_0)}{h}+\frac{f(x_0-h)-f(x_0)}{h}\right].$$

]

5. A continuous function f is said to satisfy a Lipschitz condition of order α if there exist positive constants M and α such that

$$|f(x_1)-f(x_2)|\le M|x_1-x_2|^{\alpha}$$

for all x₁ and x₂ in the domain of f.

(a) Show that each of the following functions satisfies a Lipschitz condition on the entire x-axis.

(i) f(x) = c, c a constant
(ii) f(x) = sin x
(iii) f(x) = sin² x
(iv) f the periodic extension to (−∞, ∞) of the function x² in PC[−π, π]
(v) f the periodic extension to (−∞, ∞) of the function |x³| in PC[−π, π]

(b) Let f be a piecewise continuous function of period 2π, and suppose that f satisfies a Lipschitz condition of order α on the entire x-axis. Prove that the Fourier coefficients of f then satisfy the inequalities

$$|a_k|\le\frac{M\pi^{\alpha}}{k^{\alpha}},\qquad|b_k|\le\frac{M\pi^{\alpha}}{k^{\alpha}}$$

for all k > 0. [Hint: Make the change of variable x = t + (π/k) in the formulas for a_k and b_k, and deduce that

$$a_k=\frac{1}{2\pi}\int_{-\pi}^{\pi}\left[f(x)-f\left(x+\frac{\pi}{k}\right)\right]\cos kx\,dx.$$

]

(c) Why is it uninteresting to permit α > 1?

6. (a) Prove that every continuously differentiable function f on a closed interval [a, b] satisfies a Lipschitz condition of order one on that interval. (See Exercise 5 above.) [Hint: Use the mean value theorem and set M = max |f′(x)| on [a, b].]

(b) Prove the following generalization of Theorem 10-1. Let f be piecewise continuous on (−∞, ∞) with period 2π, and suppose that

(i) f is continuous at the point x₀, and
(ii) f satisfies a Lipschitz condition of order α for all x in an interval about x₀.

Then the Fourier series for f converges to f(x₀) when x = x₀. [Hint: Follow the proof of Theorem 10-1.]
7. Let f be piecewise smooth on (−∞, ∞) with period 2π, and suppose that f(x) = ½[f(x⁺) + f(x⁻)] for all x. Prove that

$$f(x)=\frac{1}{\pi}\lim_{\lambda\to\infty}\int_{-\pi}^{\pi}f(x+s)\,\frac{\sin\lambda s}{s}\,ds.$$

[Hint: Show that

$$\int_{-\pi}^{\pi}f(x+s)\,\frac{\sin\lambda s}{s}\,ds=\int_0^{\pi}\frac{f(x+s)-f(x^+)}{s}\,\sin\lambda s\,ds+f(x^+)\int_0^{\pi}\frac{\sin\lambda s}{s}\,ds$$

$$+\int_{-\pi}^{0}\frac{f(x+s)-f(x^-)}{s}\,\sin\lambda s\,ds+f(x^-)\int_{-\pi}^{0}\frac{\sin\lambda s}{s}\,ds,$$

and then apply the Riemann–Lebesgue lemma and Lemma 10-2.]

8. Prove that

$$\frac12+\sum_{k=1}^{n}\cos kx=\frac{\sin\left(n+\frac12\right)x}{2\sin(x/2)}$$

as follows. Use Euler's formula

$$e^{ix}=\cos x+i\sin x$$

to deduce that

$$\frac12+\sum_{k=1}^{n}\cos kx$$

is the real part of

$$\frac12+\sum_{k=1}^{n}e^{ikx}.$$

Evaluate this expression by using the formula for the sum of the first n terms of a geometric series, and then find its real part.

9. Let f be piecewise continuous and periodic on (−∞, ∞), with period 2π. Prove that the same is true of the function

$$S_n(x)=\frac{1}{\pi}\int_{-\pi}^{\pi}f(x+s)\,\frac{\sin\left(n+\frac12\right)s}{2\sin(s/2)}\,ds.$$

10-4 UNIFORM CONVERGENCE OF FOURIER SERIES

Now that we have settled the question of pointwise convergence for the Fourier
series expansions of piecewise smooth functions, we propose to determine con-
ditions under which this convergence is uniform on a closed interval [a, b]. Here,
of course, we must impose additional hypotheses on the functions considered,
and at first sight one might expect them to be rather stringent. Surprisingly, how-
ever, we need only demand that the functions be continuous in order to guarantee
both uniform and absolute convergence; an assertion which we prove as

Theorem 10-2. Let f be a continuous function on (−∞, ∞) with period 2π, and suppose that f has a piecewise continuous first derivative. Then the Fourier series for f converges uniformly and absolutely to f on every closed interval of the x-axis.

Proof. Let

$$\frac{a_0}{2}+\sum_{k=1}^{\infty}(a_k\cos kx+b_k\sin kx)\qquad\text{and}\qquad\frac{a_0'}{2}+\sum_{k=1}^{\infty}(a_k'\cos kx+b_k'\sin kx)$$

be the Fourier series for f and f′, respectively. Then, since f is periodic with period 2π,

$$a_0'=\frac{1}{\pi}\int_{-\pi}^{\pi}f'(x)\,dx=\frac{1}{\pi}\left[f(\pi)-f(-\pi)\right]=0,$$

while, for k > 0,

$$a_k'=\frac{1}{\pi}\int_{-\pi}^{\pi}f'(x)\cos kx\,dx=\frac{1}{\pi}\left[f(x)\cos kx\Big|_{-\pi}^{\pi}+k\int_{-\pi}^{\pi}f(x)\sin kx\,dx\right]=\frac{k}{\pi}\int_{-\pi}^{\pi}f(x)\sin kx\,dx=kb_k,$$

and

$$b_k'=\frac{1}{\pi}\int_{-\pi}^{\pi}f'(x)\sin kx\,dx=\frac{1}{\pi}\left[f(x)\sin kx\Big|_{-\pi}^{\pi}-k\int_{-\pi}^{\pi}f(x)\cos kx\,dx\right]=-ka_k.$$

Moreover, by Bessel's inequality (Theorem 8-3), we have

$$\frac{(a_0')^2}{2}+\sum_{k=1}^{\infty}\left[(a_k')^2+(b_k')^2\right]\le\frac{1}{\pi}\int_{-\pi}^{\pi}f'(x)^2\,dx<\infty.$$

Thus

$$\sum_{k=1}^{\infty}k^2\left(a_k^2+b_k^2\right)<\infty,$$

and we conclude that the sequence $\{k\sqrt{a_k^2+b_k^2}\}$, k = 1, 2, …, belongs to the Euclidean space ℓ² of all "square summable" sequences of real numbers introduced in Section 8-5. But the sequence {1/k}, k = 1, 2, …, also belongs to ℓ². Hence the inner product of these two sequences exists, and it follows that the series

$$\sum_{k=1}^{\infty}\frac{1}{k}\left(k\sqrt{a_k^2+b_k^2}\right)=\sum_{k=1}^{\infty}\sqrt{a_k^2+b_k^2}$$

must converge.

Now, given any pair of real numbers a and b, the Cauchy–Schwarz inequality in ℝ², applied to the vectors ae₁ + be₂ and (cos kx)e₁ + (sin kx)e₂, where e₁, e₂ is the standard basis, implies that

$$|a\cos kx+b\sin kx|\le\sqrt{a^2+b^2}\,\sqrt{\cos^2kx+\sin^2kx}=\sqrt{a^2+b^2}$$

for all x. This allows us to compare the series

$$\frac{|a_0|}{2}+\sum_{k=1}^{\infty}|a_k\cos kx+b_k\sin kx|$$

with the convergent series of positive constants

$$\frac{|a_0|}{2}+\sum_{k=1}^{\infty}\sqrt{a_k^2+b_k^2},$$

and the Weierstrass M-test (Appendix I) implies that

$$\frac{a_0}{2}+\sum_{k=1}^{\infty}(a_k\cos kx+b_k\sin kx)$$

is uniformly and absolutely convergent on any closed interval of the x-axis. Finally, by Theorem 10-1, we know that this series converges pointwise to f, and the proof is complete. ∎
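Theorem 10-2 can be observed numerically. The triangle wave |x| on [−π, π] is continuous with a piecewise continuous derivative, and its Fourier series is the standard expansion π/2 − (4/π)Σ cos kx/k² over odd k; the sketch below (the sampling grid and truncation points are our choices) checks that the worst-case error over the whole period shrinks as more terms are taken, which is exactly what uniform convergence asserts:

```python
import math

def triangle_partial(N, x):
    # Fourier partial sum for f(x) = |x| on [-pi, pi]:
    # pi/2 - (4/pi) * sum over odd k <= N of cos(kx)/k^2
    s = math.pi / 2
    for k in range(1, N + 1, 2):
        s -= (4 / math.pi) * math.cos(k * x) / k**2
    return s

def sup_error(N, samples=400):
    # maximum absolute error over an evenly spaced grid on [-pi, pi]
    xs = [-math.pi + 2 * math.pi * i / samples for i in range(samples + 1)]
    return max(abs(triangle_partial(N, x) - abs(x)) for x in xs)

e10, e100 = sup_error(10), sup_error(100)
assert e100 < e10      # the worst-case error shrinks everywhere at once
assert e100 < 0.01
```

By contrast, for a discontinuous function such as the square wave the analogous sup-norm error never falls below a fixed positive amount, as the Gibbs phenomenon discussion below explains.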

The reader should note that the hypotheses of Theorem 10-2 merely require f′ to be piecewise continuous, and hence allow f′ to be undefined at isolated points of the x-axis. Thus the Fourier series expansion of a function such as the one shown in Fig. 10-2 converges uniformly and absolutely in every closed interval of the x-axis in spite of the fact that f′ does not exist at the points 2πn, n an integer. On the other hand, the theorem is certain to fail in any interval where f itself is discontinuous, since it is well known that the limit of a uniformly convergent sequence of continuous functions (in this case the S_k(x)) is continuous. Thus the above result is the best that can be expected if we demand uniform convergence in every closed interval. By relaxing this requirement, however, Theorem 10-2 can be generalized to include functions with jump discontinuities. In this case the result in question reads as follows.

Theorem 10-3. Let f be piecewise smooth and periodic on (−∞, ∞) with period 2π. Then the Fourier series for f converges uniformly to f in any closed interval of the x-axis which does not contain a point of discontinuity of f.

The proof is an easy consequence of the following lemma, which itself is a special case of the theorem.

Lemma 10-3. Let φ be the piecewise smooth, periodic function on (−∞, ∞) whose definition in the interval [−π, π] is

$$\varphi(x)=\begin{cases}-\dfrac{\pi+x}{2\pi},&-\pi\le x<0,\\[4pt]0,&x=0,\\[4pt]\dfrac{\pi-x}{2\pi},&0<x\le\pi\end{cases}\tag{10-8}$$

(see Fig. 10-3); note that φ has a jump of magnitude one at the origin. Then the Fourier series for φ converges pointwise to φ for all x, and the convergence is uniform on any closed interval which does not contain a point of the form 2πn, n an integer.

FIGURE 10-2    FIGURE 10-3



Granting the truth of this assertion, Theorem 10-3 is proved in the following manner.

Let x₁, x₂, …, x_m be the points in (−π, π) where f is discontinuous, and for each i, i = 1, …, m, let ω_i denote the magnitude of the jump discontinuity at x_i [that is, ω_i = f(x_i⁺) − f(x_i⁻)]. Then the function ω_iφ(x − x_i) also has jump discontinuities of magnitude ω_i at the points x_i + 2πn, but is continuous for all other values of x. Hence

$$f(x)-\omega_i\varphi(x-x_i)$$

is continuous both at the points where f is continuous, and at the points x_i + 2πn. In short, by subtracting ω_iφ(x − x_i) from f we have removed the discontinuities at the points x_i + 2πn without introducing any new points of discontinuity. We now repeat this process for each index i, to obtain the function

$$f(x)-\sum_{i=1}^{m}\omega_i\varphi(x-x_i),$$

which is piecewise smooth on (−∞, ∞), periodic with period 2π, and continuous everywhere except possibly at the points ±π, ±3π, …. [Such discontinuities will occur whenever f(−π) ≠ f(π).] To remove these last discontinuities we set ω_π = f(π⁺) − f(π⁻), and construct the function

$$F(x)=f(x)-\sum_{i=1}^{m}\omega_i\varphi(x-x_i)-\omega_\pi\varphi(x-\pi).$$

Then F satisfies the hypotheses of Theorem 10-2, and its Fourier series therefore converges uniformly to F in every closed interval of the x-axis. Moreover, Lemma 10-3 allows us to assert that the Fourier series for the function

$$\Phi(x)=\sum_{i=1}^{m}\omega_i\varphi(x-x_i)+\omega_\pi\varphi(x-\pi)$$

converges uniformly to Φ in any closed interval not containing a point of discontinuity of f. Hence the Fourier series for f, being the sum of the series for F and Φ, must also converge uniformly in any such interval, and the theorem is proved. ∎

To complete the argument, we now establish Lemma 10-3. Here we reason as follows.

A routine calculation reveals that the Fourier series for φ is

$$\frac{1}{\pi}\sum_{k=1}^{\infty}\frac{\sin kx}{k},\tag{10-9}$$

and hence to prove the lemma it suffices to show that this series converges uniformly on every closed subinterval of [−π, π] not containing the origin.

To this end we set

$$S_n(x)=\sin x+\frac{\sin 2x}{2}+\cdots+\frac{\sin nx}{n},$$

and let

$$T_n(x)=\sin x+\sin 2x+\cdots+\sin nx.$$

(Note that S_n is the nth partial sum of the Fourier series for the function πφ.) Then, since sin kx = T_k(x) − T_{k−1}(x) for all k > 1, we have

$$S_n(x)=T_1(x)+\frac{T_2(x)-T_1(x)}{2}+\cdots+\frac{T_n(x)-T_{n-1}(x)}{n},$$

or

$$S_n(x)=\frac{T_1(x)}{1\cdot2}+\frac{T_2(x)}{2\cdot3}+\cdots+\frac{T_{n-1}(x)}{(n-1)n}+\frac{T_n(x)}{n}.$$

Moreover, by summing the trigonometric identity

$$2\sin\frac{x}{2}\sin kx=\cos\left(k-\tfrac12\right)x-\cos\left(k+\tfrac12\right)x$$

as k runs from 1 to n, we find that

$$T_n(x)=\frac{\cos(x/2)-\cos\left(n+\frac12\right)x}{2\sin(x/2)},\qquad x\neq0,\tag{10-10}$$

from which it easily follows that

$$|T_n(x)|\le\frac{1}{|\sin(x/2)|}$$

for all n and all x ≠ 0 in [−π, π]. Thus, for n > m, and x ≠ 0 in [−π, π],

$$|S_n(x)-S_m(x)|=\left|\frac{T_m(x)}{m(m+1)}+\cdots+\frac{T_{n-1}(x)}{(n-1)n}+\frac{T_n(x)}{n}-\frac{T_m(x)}{m}\right|$$

$$\le\frac{1}{|\sin(x/2)|}\left[\frac{1}{m(m+1)}+\cdots+\frac{1}{(n-1)n}+\frac{1}{n}+\frac{1}{m}\right].$$

Now let x be restricted by the inequalities 0 < δ ≤ |x| ≤ π. Then

$$\left|\sin\frac{x}{2}\right|\ge\sin\frac{\delta}{2},$$

and we have

$$|S_n(x)-S_m(x)|\le\frac{1}{\sin(\delta/2)}\left[\frac{1}{m(m+1)}+\cdots+\frac{1}{(n-1)n}+\frac{1}{n}+\frac{1}{m}\right].$$

But since the series

$$\sum_{k=1}^{\infty}\frac{1}{k(k+1)}$$

is convergent, the quantity

$$\frac{1}{m(m+1)}+\cdots+\frac{1}{(n-1)n}+\frac{1}{n}+\frac{1}{m}$$

can be made arbitrarily small by taking m and n sufficiently large. Hence, given any ε > 0, there exists an integer N such that

$$|S_n(x)-S_m(x)|<\varepsilon$$

for all m, n > N, and all x with δ ≤ |x| ≤ π. This, of course, implies that

$$\sum_{k=1}^{\infty}\frac{\sin kx}{k}$$

is uniformly convergent whenever 0 < δ ≤ |x| ≤ π. The same is therefore true of the series for φ, and since the choice of δ in (0, π) was arbitrary, we are done. ∎
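The key estimate in the proof, the bound |T_n(x)| ≤ 1/|sin(x/2)| obtained from (10-10), is easy to test numerically. A sketch in our own notation:

```python
import math

def T(n, x):
    # T_n(x) = sin x + sin 2x + ... + sin nx
    return sum(math.sin(k * x) for k in range(1, n + 1))

# the bound derived from (10-10): |T_n(x)| <= 1/|sin(x/2)| for x != 0
for n in (1, 7, 50, 300):
    for x in (0.1, 0.5, 1.0, 2.0, 3.0, -1.3):
        assert abs(T(n, x)) <= 1 / abs(math.sin(x / 2)) + 1e-12
```

The point of the bound is that it is independent of n, which is what makes the Abel-summation estimate for |S_n(x) − S_m(x)| uniform on δ ≤ |x| ≤ π.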

We have already observed that the Fourier series for a piecewise smooth function cannot converge uniformly on any closed interval [a, b] containing a point of discontinuity of the function in its interior. To complete our discussion of uniform convergence, it remains to consider the behavior of Fourier series in the vicinity of such discontinuities. The diagrams in Fig. 10-4, depicting the function

$$f(x)=\begin{cases}-1,&-\pi<x<0,\\0,&x=-\pi,\,0,\,\pi,\\1,&0<x<\pi,\end{cases}$$

and several of the partial sums of its Fourier series, are typical of the situation which obtains.

FIGURE 10-4

From these diagrams it is apparent that the oscillations in the partial sums S_n of the Fourier series for f do not decrease at a uniform rate on the interval (0, π) as n → ∞. On the contrary, the oscillations toward either end of this interval remain rather large for all values of n, and though these exceptional oscillations move toward the ends of the interval as n increases, they do not die out in the process. This peculiar behavior of the partial sums of a Fourier series, wherein these sums seem to gather momentum before plunging across a jump discontinuity, is known as the Gibbs phenomenon, after the American mathematician and physicist J. W. Gibbs, who first discovered it. In the next section we shall analyze this phenomenon in some detail, and obtain a limiting value for the amplitude of the oscillations involved.

EXERCISES

Each of the following exercises refers to an arbitrary trigonometric series of the form

$$\frac{a_0}{2}+\sum_{k=1}^{\infty}(a_k\cos kx+b_k\sin kx)\tag{10-11}$$

which, initially, is not assumed to be the Fourier series expansion of a function in PC[−π, π].

1. Prove that whenever (10-11) converges in the mean in PC[−π, π] to a function f, it is the Fourier series for f.

2. Suppose that (10-11) converges uniformly on every closed interval of the x-axis. Prove that (10-11) then converges in the mean in PC[−π, π], and is the Fourier series expansion of its limit. (See Exercise 1.)

3. Suppose that

$$\sum_{k=1}^{\infty}\left(|a_k|+|b_k|\right)$$

converges. Prove that (10-11) converges uniformly and absolutely on every closed interval of the x-axis, and is the Fourier series expansion of its limit.
4. Prove that the conclusion of Exercise 3 holds whenever

$$|a_k|\le\frac{M}{k^p},\qquad|b_k|\le\frac{M}{k^p}$$

for all k > 0, where M and p > 1 are constants.

*10-5 THE GIBBS PHENOMENON


We begin our analysis of the Gibbs phenomenon by examining the behavior of the partial sums of the Fourier series for the periodic function φ defined in the interval [−π, π] by

$$\varphi(x)=\begin{cases}-\dfrac{\pi+x}{2},&-\pi\le x<0,\\[4pt]0,&x=0,\\[4pt]\dfrac{\pi-x}{2},&0<x\le\pi.\end{cases}\tag{10-12}$$

(See Fig. 10-5.) The series in question is

$$\sum_{k=1}^{\infty}\frac{\sin kx}{k},\tag{10-13}$$

FIGURE 10-5

and by our earlier results we know that this series converges to φ(x) for all x, the convergence being uniform on any closed interval not containing a point of the form 2πn, n an integer. Figure 10-6 shows the graph of φ on the interval [−π, π], together with the graph of the sum of the first six terms of its Fourier series, and again furnishes visual evidence of the fact that the partial sums of the series tend to "overshoot" the values of the function near a point of discontinuity.

FIGURE 10-6

Thus, the values of φ between any two jump discontinuities range through the interval (−π/2, π/2), while those of S_n, the nth partial sum of (10-13), range through a somewhat larger interval [−α_n, α_n]. The limiting value of α_n as n → ∞ determines what is known as the Gibbs interval for φ, and our first objective is to obtain a precise description of this interval.

To this end we observe that a real number y₀ will belong to the Gibbs interval for φ if and only if there exists an increasing sequence of positive integers {n_k}, k = 1, 2, …, and a sequence of real numbers {x_k} converging to zero, such that

$$\lim_{k\to\infty}S_{n_k}(x_k)=y_0.\tag{10-14}$$

Informally, (10-14) asserts that we can approach y₀ as closely as we wish by points which lie on the graphs of a certain subsequence {S_{n_k}} of the sequence of partial sums {S_n}. (See Fig. 10-7.) To find all points y₀ which satisfy this condition we consider the expression

$$\frac{x}{2}+S_n(x)=\frac{x}{2}+\sum_{k=1}^{n}\frac{\sin kx}{k}=\int_0^x\left[\frac12+\sum_{k=1}^{n}\cos kt\right]dt.$$

Then, by Formula (10-4), we have

$$\frac{x}{2}+S_n(x)=\int_0^x\frac{\sin\left(n+\frac12\right)t}{2\sin(t/2)}\,dt=\int_0^x\frac{\sin\left(n+\frac12\right)t}{t}\,dt+\frac12\int_0^x\left[\frac{1}{\sin(t/2)}-\frac{1}{t/2}\right]\sin\left(n+\tfrac12\right)t\,dt.$$

Thus

$$S_n(x)=-\frac{x}{2}+\int_0^x\frac{\sin\left(n+\frac12\right)t}{t}\,dt+\frac12\int_0^x\left[\frac{1}{\sin(t/2)}-\frac{1}{t/2}\right]\sin\left(n+\tfrac12\right)t\,dt.\tag{10-15}$$

We now make the substitution u = (n + ½)t in the first of these integrals to obtain

$$\int_0^x\frac{\sin\left(n+\frac12\right)t}{t}\,dt=\int_0^{(n+1/2)x}\frac{\sin u}{u}\,du.$$

Next, an easy application of l'Hôpital's rule shows that

$$\lim_{t\to0}\left[\frac{1}{\sin(t/2)}-\frac{1}{t/2}\right]=0,$$

and hence that the function

$$\frac{1}{\sin(t/2)}-\frac{1}{t/2}$$

FIGURE 10-7

* The reader who is so inclined may take this statement as a formal definition of the Gibbs interval for φ.


is continuous on the closed interval [0, π], provided it is assigned the value 0 when t = 0. Moreover, if

$$F(n;x)=\frac12\int_0^x\left[\frac{1}{\sin(t/2)}-\frac{1}{t/2}\right]\sin\left(n+\tfrac12\right)t\,dt,$$

it is not difficult to show that

$$\lim_{x\to0}F(n;x)=0$$

uniformly in n; i.e., given any ε > 0, a δ > 0 can be found such that for all n, |F(n; x)| < ε whenever |x| < δ. (See Exercise 1 below.) Thus (10-15) may be rewritten

$$S_n(x)=-\frac{x}{2}+\int_0^{(n+1/2)x}\frac{\sin t}{t}\,dt+F(n;x),$$

and it follows that if {x_k} is any sequence of real numbers which converges to zero, and if {n_k} is an increasing sequence of positive integers, both chosen so that {n_k x_k} has a limit, h, as k → ∞, then

$$\lim_{k\to\infty}S_{n_k}(x_k)=\int_0^{h}\frac{\sin t}{t}\,dt.$$

(Note that every real number h can be expressed as a limit of this form, and that in certain cases h may assume the values ±∞.) Hence every point of the form

$$\int_0^{h}\frac{\sin t}{t}\,dt,\qquad-\infty\le h\le\infty,$$

belongs to the Gibbs interval for φ.

Conversely, if y₀ belongs to the Gibbs interval for φ, there exist sequences {x_k} and {n_k} with the property that S_{n_k}(x_k) → y₀ as k → ∞. But then there exists a subsequence {n_{k′}x_{k′}} of {n_k x_k} which either converges or else diverges to ±∞.*

* This is a consequence of the famous Bolzano–Weierstrass theorem, which asserts that every bounded infinite set of real numbers has a limit point. For a proof see Buck, Advanced Calculus, 2nd ed., McGraw-Hill, New York, 1965.

If h denotes the limit of this subsequence, then (10-15) (in its amended form) implies that

$$y_0=\lim_{k\to\infty}S_{n_k}(x_k)=\lim_{k'\to\infty}S_{n_{k'}}(x_{k'})=\int_0^{h}\frac{\sin t}{t}\,dt,$$

and we have shown that y₀ can be expressed in the form

$$\int_0^{h}\frac{\sin t}{t}\,dt$$

for a suitable value of h. Hence we have proved the following theorem.

Theorem 10-4. The Gibbs interval for the function φ defined above is the set of all real numbers y of the form

$$y=\int_0^{h}\frac{\sin t}{t}\,dt\tag{10-16}$$

with −∞ ≤ h ≤ ∞.
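Theorem 10-4 can be probed numerically: taking x_k = h/k and n_k = k, so that n_k x_k → h, the values S_{n_k}(x_k) should approach ∫₀ʰ (sin t)/t dt. The sketch below (the truncation index and quadrature parameters are our choices; the integral is approximated by Simpson's rule) checks this for a few values of h:

```python
import math

def S(n, x):
    # nth partial sum of the series (10-13), sum_{k=1}^{n} sin(kx)/k
    return sum(math.sin(k * x) / k for k in range(1, n + 1))

def si(h, steps=20000):
    # composite Simpson approximation of int_0^h (sin t)/t dt
    f = lambda t: 1.0 if t == 0.0 else math.sin(t) / t
    dt = h / steps
    total = f(0.0) + f(h)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * dt)
    return total * dt / 3.0

# take x_k = h/k and n_k = k, so that n_k * x_k -> h
for h in (1.0, 2.0, math.pi):
    n = 4000
    assert abs(S(n, h / n) - si(h)) < 1e-3
```

Different choices of the limit h pick out different points of the Gibbs interval, exactly as the theorem describes.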

Actually, we can say even more than this. For, referring to Fig. 10-8, and using the interpretation of the integral as area under the curve, we see that (10-16) assumes its maximum value when h = π. Thus the Gibbs interval for φ is

$$\left[-\int_0^{\pi}\frac{\sin t}{t}\,dt,\ \int_0^{\pi}\frac{\sin t}{t}\,dt\right].$$

Moreover, since the value of

$$\int_0^{\pi}\frac{\sin t}{t}\,dt$$

is known to be (π/2)(1.178980…) = 1.851937…, it follows that the length of the Gibbs interval for φ is appreciably greater than the magnitude of the jump discontinuity in φ at x = 0. Once again this implies that the Fourier series for φ is not uniformly convergent in any open interval having a point of discontinuity of φ as one of its end points.
It is now a simple matter to pass to the case of an arbitrary piecewise smooth function. Indeed, if f is piecewise smooth on [−π, π] with a jump discontinuity at x₀, then, arguing as in the proof of Theorem 10-3, the function

$$g(x)=f(x)-\frac{f(x_0^+)-f(x_0^-)}{\pi}\,\varphi(x-x_0)$$

is continuous in some closed interval of the form [x₀ − δ, x₀ + δ] about x₀, and has a uniformly convergent Fourier series on that interval. But since

$$f(x)=g(x)+\frac{f(x_0^+)-f(x_0^-)}{\pi}\,\varphi(x-x_0),$$

we conclude that the Gibbs interval for f at x₀ must be the Gibbs interval for the function

$$\frac{f(x_0^+)-f(x_0^-)}{\pi}\,\varphi(x-x_0),$$

translated by the amount g(x₀) = ½[f(x₀⁺) + f(x₀⁻)]. Thus, on the vertical line x = x₀ the Gibbs interval for f consists of all points y such that

$$\left|y-\frac{f(x_0^+)+f(x_0^-)}{2}\right|\le\frac{|f(x_0^+)-f(x_0^-)|}{\pi}\int_0^{\pi}\frac{\sin t}{t}\,dt=\frac{|f(x_0^+)-f(x_0^-)|}{2}\,(1.178980\ldots).$$

In short, the length of the Gibbs interval for f at x₀ exceeds the magnitude of the jump discontinuity in f at that point by the factor 1.178980….
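The overshoot figure can be reproduced numerically. For the series (10-13) the first crest of S_n lies near x = π/n, and its height should approach ∫₀^π (sin t)/t dt ≈ 1.851937, giving the Gibbs ratio 2(1.851937)/π ≈ 1.178980 against the jump of magnitude π that φ has at the origin. A sketch (the value of n and the grid resolution are our choices):

```python
import math

def S(n, x):
    # nth partial sum of sum_{k=1}^{n} sin(kx)/k, the series (10-13)
    return sum(math.sin(k * x) / k for k in range(1, n + 1))

n = 500
# the first (and highest) crest of S_n lies near x = pi/n;
# scan a fine grid on (0, 4*pi/n] to locate it
peak = max(S(n, i * math.pi / (200 * n)) for i in range(1, 801))
si_pi = 1.851937  # int_0^pi (sin t)/t dt
assert abs(peak - si_pi) < 0.01

# phi jumps by pi at x = 0, so the ratio (Gibbs interval)/(jump)
# approaches 2 * si_pi / pi = 1.178980...
assert abs(2 * peak / math.pi - 1.178980) < 0.01
```

Raising n sharpens the agreement but, tellingly, never shrinks the overshoot itself; the crest only moves closer to the jump.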

EXERCISES

1. Let

$$F(n;x)=\frac12\int_0^x\left[\frac{1}{\sin(t/2)}-\frac{1}{t/2}\right]\sin\left(n+\tfrac12\right)t\,dt.$$

Prove that lim_{x→0} F(n; x) = 0 uniformly in n.

10-6 DIFFERENTIATION AND INTEGRATION OF FOURIER SERIES
With the results we now have available it is an easy matter to settle the basic
questions concerning the termwise differentiability and integrability of Fourier
series. The relevant theorems in this connection are as follows.

Theorem 10-5. (The differentiation theorem.) Let f be a continuous function on (−∞, ∞), with period 2π, and suppose that f has a piecewise continuous first derivative f′. Then the Fourier series for f′ can be obtained by differentiating the series for f term-by-term, and the differentiated series converges pointwise to f′(x) wherever f″ exists.

Proof. Let

$$\frac{a_0}{2}+\sum_{k=1}^{\infty}(a_k\cos kx+b_k\sin kx)\tag{10-17}$$

and

$$\frac{a_0'}{2}+\sum_{k=1}^{\infty}(a_k'\cos kx+b_k'\sin kx)\tag{10-18}$$

be, respectively, the Fourier series for f and f′. Then, as was shown in the proof of Theorem 10-2, a₀′ = 0, and, for k > 0,

$$a_k'=kb_k,\qquad b_k'=-ka_k.$$

Thus, (10-18) is

$$\sum_{k=1}^{\infty}k\,(b_k\cos kx-a_k\sin kx),$$

which is precisely the result obtained by differentiating (10-17) term-by-term. This proves the first statement in the theorem, while the second is a consequence of Theorem 10-1. ∎

From this, and Theorem 10-2, we immediately deduce

Corollary 10-1. Let f be continuous and periodic on (−∞, ∞), with period 2π, and suppose that f has a continuous first derivative and piecewise continuous second derivative. Then the Fourier series for f′ is uniformly and absolutely convergent on every closed interval of the x-axis, and can be obtained by differentiating the Fourier series for f term-by-term. More generally, if f, f′, …, f⁽ⁿ⁻¹⁾ are continuous, while f⁽ⁿ⁾ is piecewise continuous, then the Fourier series for f⁽ʲ⁾, j = 1, …, n − 1, converges uniformly and absolutely to f⁽ʲ⁾ on every closed interval of the x-axis, and can be obtained by differentiating the Fourier series for f term-by-term j times.

Turning to the integration of Fourier series we now prove

Theorem 10-6. Let f be a piecewise continuous function on (−∞, ∞) with period 2π, and let

$$\frac{a_0}{2}+\sum_{k=1}^{\infty}(a_k\cos kx+b_k\sin kx)$$

be the Fourier series for f. Then

$$\int_a^b f(x)\,dx=\frac{a_0}{2}(b-a)+\sum_{k=1}^{\infty}\frac{a_k(\sin kb-\sin ka)-b_k(\cos kb-\cos ka)}{k}.\tag{10-19}$$

In other words, the definite integral of f from a to b can be evaluated by integrating the Fourier series for f term-by-term.

Proof. Set

$$F(x)=\int_0^x\left[f(t)-\frac{a_0}{2}\right]dt.\tag{10-20}$$

Then F is continuous, has a piecewise continuous first derivative, and is periodic with period 2π (Exercise 1). Thus F can be expanded in an everywhere convergent Fourier series as

$$F(x)=\frac{A_0}{2}+\sum_{k=1}^{\infty}(A_k\cos kx+B_k\sin kx),$$

and an easy computation reveals that

$$A_k=-\frac{b_k}{k},\qquad B_k=\frac{a_k}{k},\qquad k\ge1.\tag{10-21}$$

Hence

$$F(x)=\frac{A_0}{2}+\sum_{k=1}^{\infty}\frac{a_k\sin kx-b_k\cos kx}{k},$$

and it follows that

$$\int_0^x f(t)\,dt=\frac{a_0}{2}x+\frac{A_0}{2}+\sum_{k=1}^{\infty}\frac{a_k\sin kx-b_k\cos kx}{k}.\tag{10-22}$$

We now use the fact that

$$\int_a^b f(t)\,dt=\int_0^b f(t)\,dt-\int_0^a f(t)\,dt$$

to deduce that

$$\int_a^b f(x)\,dx=\frac{a_0}{2}(b-a)+\sum_{k=1}^{\infty}\frac{a_k\sin kb-b_k\cos kb}{k}-\sum_{k=1}^{\infty}\frac{a_k\sin ka-b_k\cos ka}{k}.$$

Finally, since both of these series are absolutely convergent, the necessary rearrangement of terms leading to (10-19) can be effected, and we are done. ∎
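Formula (10-19) can be spot-checked with the square wave of Fig. 10-4, for which a_k = 0 and b_k = 4/(πk) for odd k (a standard expansion). Integrating over an interval inside (0, π), where f ≡ 1, the series should reproduce b − a. A Python sketch with our own names and truncation point:

```python
import math

def bcoef(k):
    # Fourier coefficients of the square wave: a_k = 0, b_k = 4/(pi*k) for odd k
    return 4 / (math.pi * k) if k % 2 else 0.0

def integral_by_series(a, b, N):
    # formula (10-19) truncated at N terms, with a_0 = 0 and a_k = 0
    total = 0.0
    for k in range(1, N + 1):
        total += -bcoef(k) * (math.cos(k * b) - math.cos(k * a)) / k
    return total

a, b = 0.3, 2.0          # interval inside (0, pi), where f = 1
assert abs(integral_by_series(a, b, 5000) - (b - a)) < 1e-3
```

Note that the integrated series converges like 1/k², faster than the original series for f, which is the practical payoff of Theorem 10-6.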

In much the same way we can also prove the following theorem.

Theorem 10-7. (The integration theorem.) Let f be an arbitrary function in PC[−π, π] with Fourier series

$$\frac{a_0}{2}+\sum_{k=1}^{\infty}(a_k\cos kx+b_k\sin kx).\tag{10-23}$$

Then the function

$$\int_0^x f(t)\,dt,\qquad-\pi\le x\le\pi,$$

has a Fourier series which converges pointwise for all x in the interval (−π, π), and

$$\int_0^x f(t)\,dt=\sum_{k=1}^{\infty}\frac{b_k}{k}+\sum_{k=1}^{\infty}\frac{-b_k\cos kx+\left[a_k+(-1)^{k+1}a_0\right]\sin kx}{k}.\tag{10-24}$$

Remark. The reader should note that since f is arbitrary in PC[−π, π], the series in (10-23) need not converge to f(x) everywhere in [−π, π]. This notwithstanding, the series for ∫₀ˣ f(t) dt does converge pointwise as asserted in the theorem.

Proof. Reasoning as in the proof of Theorem 10-6, we have

$$\int_0^x f(t)\,dt=\frac{a_0}{2}x+\frac{A_0}{2}+\sum_{k=1}^{\infty}\frac{a_k\sin kx-b_k\cos kx}{k},$$

−π ≤ x ≤ π. (See Formula 10-22.) We now set x = 0 and find that

$$\frac{A_0}{2}=\sum_{k=1}^{\infty}\frac{b_k}{k},\tag{10-25}$$

and hence that

$$\int_0^x f(t)\,dt=\frac{a_0}{2}x+\sum_{k=1}^{\infty}\frac{b_k}{k}+\sum_{k=1}^{\infty}\frac{a_k\sin kx-b_k\cos kx}{k}.$$

But when −π < x < π,

$$\frac{x}{2}=\sum_{k=1}^{\infty}(-1)^{k+1}\frac{\sin kx}{k}.$$

Thus

$$\int_0^x f(t)\,dt=a_0\sum_{k=1}^{\infty}(-1)^{k+1}\frac{\sin kx}{k}+\sum_{k=1}^{\infty}\frac{b_k}{k}+\sum_{k=1}^{\infty}\frac{a_k\sin kx-b_k\cos kx}{k},$$

an expression which is clearly equivalent to (10-24). ∎
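To see Theorem 10-7 in action, take f(x) = x on [−π, π], whose Fourier coefficients are a₀ = a_k = 0 and b_k = 2(−1)^{k+1}/k (a standard expansion). Formula (10-24) then collapses to Σ b_k(1 − cos kx)/k, which should equal ∫₀ˣ t dt = x²/2. A sketch (the truncation point is our choice):

```python
import math

def integral_by_series(x, N):
    # formula (10-24) for f(t) = t on [-pi, pi]: a_0 = a_k = 0,
    # b_k = 2(-1)^{k+1}/k, so the series is sum b_k (1 - cos kx)/k
    total = 0.0
    for k in range(1, N + 1):
        bk = 2 * (-1) ** (k + 1) / k
        total += bk * (1 - math.cos(k * x)) / k
    return total

for x in (-2.0, -0.5, 1.0, 2.5):
    assert abs(integral_by_series(x, 20000) - x * x / 2) < 1e-3
```

As the Remark above anticipates, the check succeeds even though the series (10-23) for f itself converges slowly (and fails entirely at x = ±π).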

EXERCISES

1. Verify that the function defined by Eq. (10-20) is periodic with period 2π, and compute the coefficients A_k and B_k (k > 0) of its Fourier series.

2. Let f be a function in PC[−π, π] with Fourier series

$$\frac{a_0}{2}+\sum_{k=1}^{\infty}(a_k\cos kx+b_k\sin kx).$$

Prove that

$$\int_{-\pi}^{x}f(t)\,dt=\frac{a_0(x+\pi)}{2}+\sum_{k=1}^{\infty}\frac{a_k\sin kx-b_k(\cos kx-\cos k\pi)}{k}$$

for −π ≤ x ≤ π.

3. Suppose that the function f considered in Theorem 10-7 is piecewise continuous on (−∞, ∞) with period 2π. Is Formula (10-24) then true for all x? Why?

4. Let f be a piecewise continuous odd function on (−∞, ∞) with period 2π. Prove that

$$\int_0^x f(t)\,dt=\sum_{k=1}^{\infty}\frac{b_k}{k}\left(1-\cos kx\right)$$

for all x.

5. Starting with the series

$$\frac{x}{2}=\sum_{k=1}^{\infty}(-1)^{k+1}\frac{\sin kx}{k},\qquad-\pi<x<\pi,$$

use Theorem 10-7 to prove that

$$\frac{x^2}{4}-\frac{\pi^2}{12}=\sum_{k=1}^{\infty}(-1)^{k}\frac{\cos kx}{k^2},$$

and that

$$\frac{x^3}{12}-\frac{\pi^2x}{12}=\sum_{k=1}^{\infty}(-1)^{k}\frac{\sin kx}{k^3}$$

for −π < x < π. See Exercises 14(b) and 15(b) of Section 9-4.

6. Show that the trigonometric series $\sum_{k=2}^{\infty}\frac{\sin kx}{\ln k}$ is not the Fourier series of any function in PC[−π, π]. [Hint: See (10-25).]

7. (a) Let {f_k}, k = 1, 2, …, be a sequence of functions in PC[−π, π] which converges in the mean to f. Prove that

$$\lim_{k\to\infty}\int_{-\pi}^{\pi}f_k(x)g(x)\,dx=\int_{-\pi}^{\pi}f(x)g(x)\,dx$$

for all g in PC[−π, π]. [Hint: Apply the Cauchy–Schwarz inequality to the functions f − f_k and g in PC[−π, π].]

(b) Use the result in (a) to deduce that if $\sum_{k=1}^{\infty}f_k$ converges in the mean to f in PC[−π, π], then

$$\int_{-\pi}^{\pi}f(x)g(x)\,dx=\sum_{k=1}^{\infty}\int_{-\pi}^{\pi}f_k(x)g(x)\,dx$$

for all g in PC[−π, π].

(c) Let f and g be piecewise continuous on (−∞, ∞) with period 2π, and let a_k, b_k and α_k, β_k be, respectively, the Fourier coefficients of f and g. Prove that

$$\frac{1}{\pi}\int_{-\pi}^{\pi}f(x)g(x)\,dx=\frac{a_0\alpha_0}{2}+\sum_{k=1}^{\infty}(a_k\alpha_k+b_k\beta_k).$$

10-7 SUMMABILITY OF FOURIER SERIES; FEJÉR'S THEOREM
We have already had occasion to remark that the Fourier series for an arbitrary function f in PC[−π, π] may be divergent for certain values of x. In such a case one would naturally be inclined to feel that the series in question was a rather poor approximation to f, and, in particular, that it would be impossible to determine f from its series in the absence of any additional information. One of the really remarkable properties of Fourier series is that this impression is entirely false, and that the value of f(x) can be found at all x where f is continuous even though the Fourier series for f is divergent in the usual sense. Needless to say, the method of "summation" which is used to accomplish this feat must be quite different from the standard one involving the partial sums of the series, and yet, at the same time, must yield the value which would be obtained by the method of partial sums whenever the latter do converge. In this section we shall introduce this new technique of summation, and then use it to prove Fejér's theorem, one of the most important results in the entire theory of Fourier series.

Let

$$a_0+a_1+a_2+\cdots\tag{10-26}$$

be an infinite series of constants with partial sums

$$s_k=a_0+a_1+\cdots+a_k,$$

k = 0, 1, 2, …. Then the sequence of arithmetic means associated with (10-26) is, by definition, the sequence {σ_k}, k = 1, 2, …, with

$$\sigma_k=\frac{s_0+s_1+\cdots+s_{k-1}}{k}\tag{10-27}$$

for all k > 0. If this sequence converges to a limit σ as k → ∞, that is, if

$$\lim_{k\to\infty}\sigma_k=\sigma,$$

we then say that (10-26) is summable by the method of arithmetic means, or Cesàro summable, and σ is called the Cesàro sum of the series.

Example 1. Consider the series

$$1-1+1-1+\cdots\tag{10-28}$$

with partial sums

$$s_0=1,\quad s_1=0,\quad s_2=1,\quad s_3=0,\ \ldots.$$

In this case the sequence of arithmetic means is

$$\left\{\frac{1}{1},\ \frac{1}{2},\ \frac{2}{3},\ \frac{2}{4},\ \frac{3}{5},\ \frac{3}{6},\ \ldots\right\},$$

with general terms

$$\sigma_{2k}=\frac12,\qquad\sigma_{2k+1}=\frac{k+1}{2k+1},$$

and has ½ as a limit. Thus (10-28) is Cesàro summable, with sum ½, even though the series diverges in the usual sense of the term.

Example 2. The geometric series

\[ \sum_{k=0}^{\infty} \frac{1}{2^k} = 1 + \frac{1}{2} + \frac{1}{4} + \cdots \]

with partial sums

\[ s_0 = 1, \quad s_1 = 1 + \frac{1}{2}, \quad \ldots, \quad s_k = 2 - \frac{1}{2^k}, \]

converges to the value 2. In addition, it is Cesàro summable to the same value,
since, by (10-27),

\[ \sigma_k = \frac{1}{k}\,(s_0 + s_1 + \cdots + s_{k-1}) = 2 - \frac{1}{k}\left(2 - \frac{1}{2^{k-1}}\right), \]

and

\[ \lim_{k \to \infty} \sigma_k = 2. \]
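Both examples are easy to reproduce numerically. The following sketch (plain Python; the helper name `arithmetic_means` is ours, not the text's) computes the means $\sigma_1, \sigma_2, \ldots$ of (10-27) in exact rational arithmetic:

```python
from fractions import Fraction

def arithmetic_means(terms):
    """Arithmetic means sigma_1, sigma_2, ... of a series, where
    sigma_k = (s_0 + s_1 + ... + s_{k-1}) / k, as in (10-27)."""
    means, s, total = [], 0, 0
    for k, a in enumerate(terms):
        s += a                                    # s is now the partial sum s_k
        total += s                                # total = s_0 + ... + s_k
        means.append(Fraction(total) / (k + 1))   # this is sigma_{k+1}
    return means

# Example 1: 1 - 1 + 1 - 1 + ...  (divergent, Cesaro sum 1/2)
m1 = arithmetic_means([(-1) ** k for k in range(100)])
# Example 2: the geometric series sum of 1/2^k (ordinary sum 2)
m2 = arithmetic_means([Fraction(1, 2 ** k) for k in range(60)])
```

Here `m1[-1]` is exactly $\tfrac{1}{2}$ (an even-indexed mean), while `m2[-1]` differs from 2 by roughly $2/60$, illustrating the slow but genuine convergence of the means.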

These examples suggest that the notion of Cesàro summability is a generalization
of the notion of ordinary convergence, in that the Cesàro sum of a convergent
series always exists and is equal to the ordinary sum of the series. We now prove
that this is the case by establishing

Theorem 10-8. If an infinite series

\[ a_0 + a_1 + a_2 + \cdots \qquad (10\text{-}29) \]

converges to the value $\sigma$, then the series is Cesàro summable, and its Cesàro
sum is $\sigma$.

Proof. Let $\epsilon > 0$ be given. Then since (10-29) converges to $\sigma$, there exists an
integer $N$ such that

\[ |s_n - \sigma| < \frac{\epsilon}{2} \]
398 CONVERGENCE OF FOURIER SERIES | CHAP. 10

for all $n > N$. We now consider the quantity

\[ \sigma_n - \sigma = \frac{1}{n}\,(s_0 + s_1 + \cdots + s_{n-1}) - \sigma = \frac{1}{n} \sum_{k=0}^{n-1} (s_k - \sigma). \]

Then for $n > N$, we have

\[ |\sigma_n - \sigma| = \left| \frac{1}{n} \sum_{k=0}^{N} (s_k - \sigma) + \frac{1}{n} \sum_{k=N+1}^{n-1} (s_k - \sigma) \right| \le \frac{1}{n} \sum_{k=0}^{N} |s_k - \sigma| + \frac{1}{n} \sum_{k=N+1}^{n-1} |s_k - \sigma|. \]

But by assumption, $|s_k - \sigma| < \epsilon/2$ for $k > N$. Hence

\[ \frac{1}{n} \sum_{k=N+1}^{n-1} |s_k - \sigma| < \frac{n - N - 1}{n} \cdot \frac{\epsilon}{2} < \frac{\epsilon}{2}. \]

Moreover, since $N$ is fixed, the quantity

\[ \frac{1}{n} \sum_{k=0}^{N} |s_k - \sigma| \]

can also be made less than $\epsilon/2$ by choosing $n$ sufficiently large, say $n > N'$, and
when this is done we have

\[ |\sigma_n - \sigma| < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon \]

whenever $n > \max\,[N, N']$. ∎

The method of Cesàro summability can, of course, be applied to an infinite
series of functions

\[ \sum_{k=0}^{\infty} a_k(x). \qquad (10\text{-}30) \]

In this case we say that the series is Cesàro summable at the point $x_0$ if the nu-
merical series

\[ \sum_{k=0}^{\infty} a_k(x_0) \]

is Cesàro summable in the sense of the above definition, and that (10-30) is uni-
formly (Cesàro) summable on an interval $[a, b]$ if the sequence $\{\sigma_k(x)\}$ of arithmetic
means associated with (10-30) converges uniformly on $[a, b]$ to a function $\sigma(x)$;
i.e., if, given any $\epsilon > 0$, an integer $N$ can be found such that $|\sigma_n(x) - \sigma(x)| < \epsilon$
for all $x$ in $[a, b]$, and all $n > N$. In these terms the proof of Theorem 10-8 also
serves to establish

Theorem 10-9. If the series

\[ \sum_{k=0}^{\infty} a_k(x) \]

converges uniformly to $\sigma(x)$ on an interval $[a, b]$, where the $a_k(x)$ belong to
$\mathrm{PC}[a, b]$, then the series is uniformly summable on $[a, b]$ to the same func-
tion $\sigma(x)$.

With these preliminaries out of the way, we now state the theorem which
justifies introducing the notion of Cesàro summability.

Theorem 10-10. (Fejér's theorem.) If $f$ is a continuous function on $(-\infty, \infty)$
with period $2\pi$, then its Fourier series is uniformly summable to $f$ on every
closed interval of the $x$-axis.

We have already stated that this theorem is one of the most important in the
theory of Fourier series, and before giving a proof we point out some of its far-
reaching consequences. In the first place, the conditions imposed here are not
sufficient even to guarantee that the Fourier series for $f$ converges pointwise, much
less that the convergence be uniform. In spite of this, Fejér's theorem asserts
that we can "sum" the series in question, and thereby recover the function $f$.
Moreover, this summability is sufficiently well behaved so as to proceed uniformly
on closed intervals, a truly remarkable fact. Hence one can legitimately say that
the Fourier series for a continuous periodic function in $\mathrm{PC}[-\pi, \pi]$ does serve to
determine the function from which it was derived. (At the end of this section we
shall generalize these results in the usual way to piecewise continuous functions as
well.)

Finally, we note that Fejér's theorem also implies the following fact.

Theorem 10-11. If a trigonometric series

\[ \frac{a_0}{2} + \sum_{k=1}^{\infty} (a_k \cos kx + b_k \sin kx) \qquad (10\text{-}31) \]

is known to be the Fourier series of a continuous function $f$ on $(-\infty, \infty)$,
and if this series converges in the usual sense when $x = x_0$, then the series
must converge to $f(x_0)$.

Proof. By Fejér's theorem, (10-31) is Cesàro summable to $f(x)$ for all $x$. If, in
addition, (10-31) converges pointwise when $x = x_0$, then, by Theorem 10-8, the
value of the series must be the same as its Cesàro sum, namely $f(x_0)$. ∎

At first sight Theorem 10-11 may seem to be stating the obvious, since the
reader has probably long since assumed that a Fourier series must converge to
the function from which it was obtained if it converges at all. Until now, however,
we have proved this fact only for piecewise smooth functions, and there is nothing
in our earlier results to prevent the Fourier series for a continuous function from
converging pointwise to an entirely different function. This, as we now see, is
impossible.
Turning to the proof of Fejér's theorem, we begin by establishing an elementary
lemma on approximating continuous functions, which, though obvious, does stand
in need of proof. The result in question asserts that every continuous function
on a closed interval can be uniformly approximated by a "broken line function,"
meaning a continuous function whose graph is made up of a finite number of line
segments as shown in Fig. 10-9. Since we shall have occasion to refer to this
result in our later work, we state it formally as follows.

[Figure 10-9]

Lemma 10-4. Let $f$ be a continuous function on a closed interval $[a, b]$. Then,
given any $\epsilon > 0$, there exists a broken line function $B$ on $[a, b]$ such that

\[ |f(x) - B(x)| < \epsilon \]

for all $x$ in $[a, b]$.

Proof. The proof follows the construction illustrated in Fig. 10-10, and goes as
follows. Since $f$ is uniformly continuous on $[a, b]$ there exists a $\delta > 0$ such that

\[ |f(x_1) - f(x_2)| < \frac{\epsilon}{2} \qquad (10\text{-}32) \]

whenever $|x_1 - x_2| < \delta$ (Theorem I-13, Appendix I). Moreover, since $[a, b]$
is of finite length we can find points

\[ a = x_0 < x_1 < \cdots < x_n = b \]

in $[a, b]$ such that each of the intervals $I_k = [x_{k-1}, x_k]$ has length less than $\delta$.
The function $B$ is now constructed by successively joining the points $(x_0, f(x_0))$,
$(x_1, f(x_1)), \ldots, (x_n, f(x_n))$ on the graph of $f$ by straight line segments.

[Figure 10-10]

Thus if $x$ belongs to the interval $I_k$,

\[ B(x) = f(x_{k-1}) + \frac{f(x_k) - f(x_{k-1})}{x_k - x_{k-1}}\,(x - x_{k-1}) \]

(Fig. 10-11), and we have

\[ |f(x) - B(x)| = \left| f(x) - f(x_{k-1}) - \frac{f(x_k) - f(x_{k-1})}{x_k - x_{k-1}}\,(x - x_{k-1}) \right| \le |f(x) - f(x_{k-1})| + |f(x_k) - f(x_{k-1})|\, \frac{|x - x_{k-1}|}{|x_k - x_{k-1}|}. \]

But by (10-32),

\[ |f(x) - f(x_{k-1})| < \frac{\epsilon}{2}, \qquad |f(x_k) - f(x_{k-1})| < \frac{\epsilon}{2}, \]

and since

\[ \frac{|x - x_{k-1}|}{|x_k - x_{k-1}|} \le 1, \]

we conclude that

\[ |f(x) - B(x)| < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon, \]

as required. ∎
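The construction in the proof is completely mechanical, and a short numerical sketch may help fix the ideas (plain Python; `broken_line` is our name for the function $B$, and uniform mesh points stand in for the $x_k$):

```python
import math

def broken_line(f, a, b, n):
    """Broken line function B interpolating f at the n+1 equally spaced
    points a = x_0 < x_1 < ... < x_n = b, as in the proof of Lemma 10-4."""
    xs = [a + (b - a) * k / n for k in range(n + 1)]
    ys = [f(x) for x in xs]
    def B(x):
        k = min(int((x - a) / (b - a) * n), n - 1)   # index of the interval containing x
        t = (x - xs[k]) / (xs[k + 1] - xs[k])
        return ys[k] + t * (ys[k + 1] - ys[k])
    return B

B = broken_line(math.sin, 0.0, math.pi, 50)
sup_err = max(abs(math.sin(x) - B(x))
              for x in (math.pi * i / 5000 for i in range(5001)))
# sup_err is well below 1e-3; refining the mesh drives it to 0 uniformly
```

Uniform continuity is what guarantees the same mesh works at every $x$; that is exactly the role of $\delta$ in the proof.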

Our next result is the key to the proof of Fejér's theorem, and is of sufficient
independent interest to be stated separately.

Lemma 10-5. Let $f$ be piecewise continuous on $(-\infty, \infty)$ with period $2\pi$,
and let $M$ denote the maximum value of $|f(x)|$ for all $x$. Then if $\sigma_n$ denotes
the $n$th arithmetic mean of the Fourier series for $f$,

\[ |\sigma_n(x)| \le M \]

for all $n$, and all $x$.

Proof. If $s_n(x)$ denotes the $n$th partial sum of the Fourier series for $f$, then by (10-5),

\[ s_n(x) = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x+s)\, \frac{\sin\,(n + \frac{1}{2})s}{2 \sin\,(s/2)}\, ds. \]

Hence

\[ \sigma_n(x) = \frac{1}{n\pi} \int_{-\pi}^{\pi} f(x+s) \left[ \sum_{k=0}^{n-1} \sin\,(k + \tfrac{1}{2})s \right] \frac{ds}{2 \sin\,(s/2)}. \]

But by summing the identity

\[ 2 \sin \frac{s}{2}\, \sin\,(k + \tfrac{1}{2})s = \cos ks - \cos\,(k+1)s \]

as $k$ runs from $0$ to $n - 1$, we obtain

\[ 2 \sin \frac{s}{2} \sum_{k=0}^{n-1} \sin\,(k + \tfrac{1}{2})s = 1 - \cos ns = 2 \sin^2 \frac{ns}{2}, \]

and it follows that

\[ \sigma_n(x) = \frac{1}{n\pi} \int_{-\pi}^{\pi} f(x+s)\, \frac{\sin^2\,(ns/2)}{2 \sin^2\,(s/2)}\, ds. \qquad (10\text{-}33) \]

Now when $f(x) \equiv 1$, it is clear that $s_k(x) = 1$ for all $k$ (consider the Fourier
series for $f$). Thus, in this case, $\sigma_k(x) = 1$ for all $k$, and (10-33) yields

\[ 1 = \frac{1}{n\pi} \int_{-\pi}^{\pi} \frac{\sin^2\,(ns/2)}{2 \sin^2\,(s/2)}\, ds. \]

Hence if $|f(x)| \le M$ for all $x$,

\[ |\sigma_n(x)| \le \frac{M}{n\pi} \int_{-\pi}^{\pi} \frac{\sin^2\,(ns/2)}{2 \sin^2\,(s/2)}\, ds = M, \]

as asserted. ∎
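Two facts drive this proof: the kernel in (10-33) is nonnegative, and its normalized integral is exactly 1. Both are easy to check numerically; the sketch below uses our own helper names and a simple midpoint rule (which happens to be exact here, since the kernel is a trigonometric polynomial).

```python
import math

def fejer_kernel(n, s):
    """sin^2(ns/2) / (2 sin^2(s/2)), the kernel appearing in (10-33)."""
    if abs(math.sin(s / 2)) < 1e-12:
        return n * n / 2.0          # removable singularity at s = 0
    return math.sin(n * s / 2) ** 2 / (2 * math.sin(s / 2) ** 2)

def normalized_integral(n, m=4096):
    """Midpoint-rule value of (1/(n pi)) * integral_{-pi}^{pi} of the kernel."""
    h = 2 * math.pi / m
    total = sum(fejer_kernel(n, -math.pi + (j + 0.5) * h) for j in range(m))
    return total * h / (n * math.pi)
```

For every $n$, `normalized_integral(n)` returns 1 to machine precision, and `fejer_kernel(n, s)` is never negative; together these give $|\sigma_n(x)| \le M$ exactly as in the proof.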

With these facts in hand, the proof of Fejér's theorem is all but obvious. Indeed,
let $f$ be continuous and periodic on $(-\infty, \infty)$ with period $2\pi$, and let $\epsilon > 0$ be
given. We must show that if $n$ is sufficiently large,

\[ |f(x) - \sigma_n(x)| < \epsilon \]

for all $x$, where $\sigma_n$ is the $n$th arithmetic mean of the Fourier series for $f$.
But by Lemma 10-4, we can find a continuous broken line function $B$, with
period $2\pi$, such that

\[ |f(x) - B(x)| < \frac{\epsilon}{3} \qquad (10\text{-}35) \]

for all $x$. Setting

\[ g(x) = f(x) - B(x), \]

so that $|g(x)| < \epsilon/3$ for all $x$, we let $\sigma_n'$ and $\sigma_n''$ denote, respectively, the $n$th arith-
metic means of the Fourier series for $g$ and $B$. Then since

\[ f(x) = g(x) + B(x), \]

we have

\[ \sigma_n(x) = \sigma_n'(x) + \sigma_n''(x) \]

for all $x$, and it follows that

\[ |f(x) - \sigma_n(x)| = |g(x) - \sigma_n'(x) + B(x) - \sigma_n''(x)| \le |g(x)| + |\sigma_n'(x)| + |B(x) - \sigma_n''(x)|. \]

But since $|g(x)| < \epsilon/3$ for all $x$, Lemma 10-5 implies that the same is true of
$|\sigma_n'(x)|$. Finally, since the Fourier series for $B$ converges uniformly to $B$ on every
closed interval of the $x$-axis, Theorem 10-9 allows us to conclude that its as-
sociated sequence of arithmetic means also converges uniformly to $B$. Hence
there exists an integer $N$ such that

\[ |B(x) - \sigma_n''(x)| < \frac{\epsilon}{3} \]

for all $n > N$ and all $x$, and we have

\[ |f(x) - \sigma_n(x)| < \frac{\epsilon}{3} + \frac{\epsilon}{3} + \frac{\epsilon}{3} = \epsilon, \]

which is precisely what had to be shown. ∎

Functions with jump discontinuities can be handled in much the same way, in
which case we have the following generalized version of Fejér's theorem.

Theorem 10-12. If $f$ is piecewise continuous on $(-\infty, \infty)$, with period $2\pi$,
then the Fourier series for $f$ is Cesàro summable for all $x$, with sum

\[ \frac{f(x^+) + f(x^-)}{2}. \]

Furthermore, this summability is uniform on any closed interval of the $x$-axis
not containing a point of discontinuity of $f$.

The proof has been left to the reader as an exercise (Exercise 8).
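This behavior is easy to watch in action. The sketch below (our own helper names) forms the Fejér means of the Fourier series of the square wave $f(x) = \operatorname{sign}(x)$ on $(-\pi, \pi)$, whose series is $(4/\pi)\sum_{j\ge 0} \sin\,(2j+1)x/(2j+1)$. The partial sums overshoot near the jump (the Gibbs phenomenon), while the means never leave $[-1, 1]$, in line with Lemma 10-5, and at the jump itself the means converge to the midpoint value 0.

```python
import math

def partial_sum(k, x):
    """s_k(x) for the square wave sign(x); only odd harmonics appear."""
    return (4 / math.pi) * sum(math.sin(m * x) / m for m in range(1, k + 1, 2))

def fejer_mean(n, x):
    """sigma_n(x) = (s_0(x) + ... + s_{n-1}(x)) / n."""
    return sum(partial_sum(k, x) for k in range(n)) / n

xs = [i * math.pi / 400 for i in range(1, 400)]        # sample points in (0, pi)
overshoot = max(partial_sum(40, x) for x in xs)        # > 1 (Gibbs overshoot)
fejer_max = max(fejer_mean(40, x) for x in xs)         # stays within [-1, 1]
midpoint = fejer_mean(200, 0.0)                        # at the jump: exactly 0
```

Away from the jump, e.g. at $x = \pi/2$, `fejer_mean(n, x)` approaches the value 1 of the function as $n$ grows, slowly but uniformly on closed subintervals of $(0, \pi)$.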

At this point we can truthfully say that the theory of Fourier series for piece-
wise continuous functions is complete. Indeed, we now know that every such
series is "summable," either in the standard fashion when the function involved
is piecewise smooth, or by the method of arithmetic means. Moreover, the series
will always converge pointwise in either the standard or Cesàro sense to the func-
tion from which it was derived (so long as the obvious proviso is made for points
of discontinuity), and this convergence will be uniform on any closed interval
in which the function is continuous. Truly, then, there is little more to be said.

At the same time, however, the reader should not be misled into thinking that
we have uttered the last word on the subject of Fourier series. This is far from
being the case. Nevertheless, at this point the direction of inquiry changes abruptly,
and addresses itself to the task of generalizing the above results to wider classes
of functions. Such generalizations do exist, but are based upon an entirely dif-
ferent type of integral than the one we have been using, and are beyond the reach
of an introductory text. Thus, as soon as we have settled the one point which still
remains outstanding (i.e., the Weierstrass approximation theorem) our discussion
will be complete, and we can turn our attention to other types of orthogonal
series, and their applications to physical problems.

EXERCISES

1. Let $\sum_{k=0}^{\infty} a_k$ and $\sum_{k=0}^{\infty} b_k$ be Cesàro summable, with sums $\sigma$ and $\tau$, respectively.
Prove that

\[ \sum_{k=0}^{\infty} (\alpha a_k + \beta b_k) \]

is Cesàro summable with sum $\alpha\sigma + \beta\tau$ for all real numbers $\alpha$, $\beta$. What does this imply
about the set of all Cesàro summable infinite series?

2. Let $\sum_{k=0}^{\infty} a_k$ be Cesàro summable, with sum $\sigma$. Prove that $\sum_{k=n}^{\infty} a_k$ is also Cesàro
summable for all $n > 0$, and find the sum of the series.

3. Find the Cesàro sum of each of the following series.

(a) $1 + 0 - 1 + 0 + 1 + 0 - 1 + \cdots$
(b) $1 + 0 + 0 - 1 + 0 + 0 + 1 + 0 + 0 - 1 + \cdots$
4. (a) Show that the series

\[ \sin x + \sin 2x + \sin 3x + \cdots \]

is Cesàro summable in the interval $(0, 2\pi)$. [Hint: Use Formula (10-10).]

(b) Show that the series

\[ \tfrac{1}{2} + \cos x + \cos 2x + \cos 3x + \cdots \]

is Cesàro summable to zero in the interval $(0, 2\pi)$. [Hint: Use Formula (10-4).]

5. Let $(a_0/2) + \sum_{k=1}^{\infty} (a_k \cos kx + b_k \sin kx)$ be the Fourier series expansion of a
function $f$ in $\mathrm{PC}[-\pi, \pi]$, and let $\{\sigma_k\}$ be the sequence of arithmetic means associated
with this series, that is, $\sigma_n = (s_0 + \cdots + s_{n-1})/n$. Prove that

\[ \frac{1}{\pi} \int_{-\pi}^{\pi} [\sigma_n(x) - f(x)]^2\, dx = \sum_{k=1}^{n-1} \left(\frac{k}{n}\right)^2 (a_k^2 + b_k^2) + \sum_{k=n}^{\infty} (a_k^2 + b_k^2). \]

6. Let $a_0 + a_1 + a_2 + \cdots$ be an infinite series with arithmetic means $\sigma_k$, $k = 1, 2, \ldots,$
and, for each $k$, set

\[ \tau_k = \frac{\sigma_1 + \sigma_2 + \cdots + \sigma_k}{k}. \]

The sequence $\{\tau_k\}$ obtained in this way is known as the sequence of second arithmetic
means associated with the given series, and the series is said to be summable to the
value $\tau$ by the method of second arithmetic means if $\lim_{k\to\infty} \tau_k = \tau$. (Higher orders
of summability by arithmetic means can also be defined.)

(a) Prove that the series

\[ 1 - 2 + 3 - 4 + \cdots \]

is not Cesàro summable, but is summable by the method of second arithmetic means.

10-8 | THE WEIERSTRASS APPROXIMATION THEOREM 405

(b) Prove that every series which is Cesàro summable is also summable to the same
value by the method of second arithmetic means.
7. Let $s_k$ and $\sigma_k$ denote, respectively, the $k$th partial sum and arithmetic mean of the
series

\[ a_0 + a_1 + a_2 + \cdots. \]

(a) Show that

\[ s_k = (k+1)\sigma_{k+1} - k\sigma_k, \qquad k \ge 0, \]

and use this result to prove that $\lim_{k\to\infty} s_k/k = 0$ whenever $a_0 + a_1 + a_2 + \cdots$ is
Cesàro summable.

(b) Use the result in (a) to deduce that the series

\[ 1^2 - 2^2 + 3^2 - 4^2 + \cdots \]

is not Cesàro summable.
8. Prove Theorem 10-12. [Hint: First show that it is sufficient to prove that

\[ \lim_{n\to\infty} \frac{1}{n\pi} \int_{0}^{\pi} [f(x+s) - f(x^+)]\, \frac{\sin^2\,(ns/2)}{2 \sin^2\,(s/2)}\, ds = 0 \]

and

\[ \lim_{n\to\infty} \frac{1}{n\pi} \int_{-\pi}^{0} [f(x+s) - f(x^-)]\, \frac{\sin^2\,(ns/2)}{2 \sin^2\,(s/2)}\, ds = 0 \]

for all $x$. Let $\epsilon > 0$ be given, and divide the integral appearing in the first of these
expressions into two parts as

\[ \frac{1}{n\pi} \int_{0}^{\delta} [f(x+s) - f(x^+)]\, \frac{\sin^2\,(ns/2)}{2 \sin^2\,(s/2)}\, ds + \frac{1}{n\pi} \int_{\delta}^{\pi} [f(x+s) - f(x^+)]\, \frac{\sin^2\,(ns/2)}{2 \sin^2\,(s/2)}\, ds, \]

where $\delta$ is chosen so that $|f(x+s) - f(x^+)| < \epsilon$ for all $s$ in the interval $(0, \delta]$.
Treat the second limit similarly.]

*10-8 THE WEIERSTRASS APPROXIMATION THEOREM

To establish the version of the Weierstrass approximation theorem cited in the
preceding chapter we must first show that every continuous periodic function on
$(-\infty, \infty)$ can be uniformly approximated by a "smooth" function. This is the
content of

Theorem 10-13. Let $f$ be continuous and periodic on $(-\infty, \infty)$ with period
$2\pi$, and let $\epsilon > 0$ be given. Then there exists a function $g$ which is continu-
ously differentiable and periodic on $(-\infty, \infty)$ with period $2\pi$, such that
$|f(x) - g(x)| < \epsilon$ for all $x$.

Proof. For each $\delta > 0$ let

\[ F_\delta(x) = \frac{1}{2\delta} \int_{-\delta}^{\delta} f(x+t)\, dt. \]

(Note that $F_\delta(x)$ is the average value of $f$ over the interval $x - \delta \le t \le x + \delta$.)
Then $F_\delta$ is continuous and periodic on $(-\infty, \infty)$ with period $2\pi$, and if we set
$u = x + t$, so that

\[ F_\delta(x) = \frac{1}{2\delta} \int_{x-\delta}^{x+\delta} f(u)\, du, \]

we find that

\[ F_\delta'(x) = \frac{1}{2\delta}\, [f(x+\delta) - f(x-\delta)] \]

(see Appendix I). Thus $F_\delta'$ is also continuous and periodic on $(-\infty, \infty)$.

Next, we observe that

\[ |F_\delta(x) - f(x)| = \left| \frac{1}{2\delta} \int_{x-\delta}^{x+\delta} f(u)\, du - \frac{1}{2\delta} \int_{x-\delta}^{x+\delta} f(x)\, du \right| = \left| \frac{1}{2\delta} \int_{x-\delta}^{x+\delta} [f(u) - f(x)]\, du \right| \le \frac{1}{2\delta} \int_{x-\delta}^{x+\delta} |f(u) - f(x)|\, du. \]

But since $f$ is continuous and periodic on $(-\infty, \infty)$, it is uniformly continuous,
and hence there exists a number $\delta = \delta(\epsilon)$ such that $|f(u) - f(x)| < \epsilon$ whenever
$|x - u| < \delta(\epsilon)$. We now set $g = F_{\delta(\epsilon)}$, and use the above inequality to deduce
that

\[ |g(x) - f(x)| \le \frac{1}{2\delta(\epsilon)} \int_{x-\delta(\epsilon)}^{x+\delta(\epsilon)} |f(u) - f(x)|\, du < \epsilon. \]

∎
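The smoothing operator $F_\delta$ is just a moving average, and its key estimate (it stays within $\epsilon$ of $f$ once $\delta$ is small) is easy to probe numerically. A sketch under our own naming, with the integral replaced by a midpoint rule:

```python
import math

def moving_average(f, delta, m=1000):
    """Approximate F_delta(x) = (1/(2 delta)) * integral_{-delta}^{delta} f(x+t) dt."""
    h = 2 * delta / m
    def F(x):
        return sum(f(x - delta + (j + 0.5) * h) for j in range(m)) * h / (2 * delta)
    return F

f = lambda x: abs(math.sin(x))       # continuous, 2*pi-periodic, corners at k*pi
F = moving_average(f, 0.01)
sup_err = max(abs(F(x) - f(x)) for x in (0.05 * k for k in range(126)))
# sup_err <= delta here, since |f(u) - f(x)| <= |u - x| for this f
```

Note that $F$ rounds off the corners of $|\sin x|$: at $x = \pi$ the average is about $\delta/2$ rather than 0, which is exactly the (small, controllable) price paid for differentiability.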

Theorem 10-14. (The Weierstrass approximation theorem for trigono-
metric polynomials.) If $f$ is continuous and periodic on $(-\infty, \infty)$ with
period $2\pi$, then, given any $\epsilon > 0$, there exists a trigonometric polynomial

\[ T_N(x) = A_0 + \sum_{k=1}^{N} (A_k \cos kx + B_k \sin kx) \]

such that

\[ |f(x) - T_N(x)| < \epsilon \]

for all $x$.

Proof. According to the preceding theorem we can find a function $g$ such that $g$
and $g'$ are continuous and periodic on $(-\infty, \infty)$ with period $2\pi$, and $|f(x) -
g(x)| < \epsilon/2$ for all $x$. Thus if $T_n$ denotes the trigonometric polynomial consist-
ing of the $n$th partial sum of the Fourier series for $g$, Theorem 10-2 implies that
there exists an integer $N$ such that $|g(x) - T_N(x)| < \epsilon/2$ for all $x$. Hence

\[ |f(x) - T_N(x)| \le |f(x) - g(x)| + |g(x) - T_N(x)| < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon, \]

and we are done. ∎



In the next chapter we shall have occasion to refer to the version of the Weier-
strass approximation theorem which uses ordinary polynomials, and hence we
now prove this result as well. Our argument is based on Theorem 10-14, and
begins with the following lemma.

Lemma 10-6. If

\[ T(x) = a_0 + \sum_{k=1}^{n} a_k \cos kx \qquad (10\text{-}36) \]

is a trigonometric polynomial involving only cosine terms, then there exist
constants $A_0, A_1, \ldots, A_n$ such that

\[ T(x) = A_0 + A_1 \cos x + \cdots + A_n \cos^n x. \]

Proof. When $n = 0$ or $n = 1$ there is nothing to prove. Thus we can proceed
by induction, assuming the validity of the lemma for all trigonometric poly-
nomials of degree $n - 1$ or less ($n > 1$). Let $T$ be given as in (10-36). Then

\[ T(x) = a_0 + \left( \sum_{k=1}^{n-1} a_k \cos kx \right) + a_n \cos nx, \]

and, by assumption, we can find constants $A_0, \ldots, A_{n-1}$ such that

\[ T(x) = A_0 + A_1 \cos x + \cdots + A_{n-1} \cos^{n-1} x + a_n \cos nx. \]

But

\[ \cos nx = 2 \cos\,[(n-1)x] \cos x - \cos\,(n-2)x, \]

and, applying the induction assumption once more, we can write $\cos\,(n-1)x$
and $\cos\,(n-2)x$ as polynomials involving powers of $\cos x$. This, of course,
implies that $T$ can be written in the desired form. ∎
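The induction in this proof is effectively an algorithm. Run forward, the identity $\cos nx = 2\cos x \cos\,(n-1)x - \cos\,(n-2)x$ generates the coefficients $A_k$ explicitly (these are, in fact, the coefficients of the Chebyshev polynomials, though the text does not need that name). A sketch with our own helper name:

```python
import math

def cos_power_coeffs(n):
    """Coefficients c[0..n] with cos(nx) = sum_k c[k] * (cos x)^k, built from
    cos nx = 2 cos x cos (n-1)x - cos (n-2)x, as in the proof of Lemma 10-6."""
    a, b = [1], [0, 1]                      # cos 0x = 1,  cos 1x = cos x
    if n == 0:
        return a
    for _ in range(n - 1):
        nxt = [0] + [2 * c for c in b]      # multiply cos(n-1)x by 2 cos x
        for k, c in enumerate(a):
            nxt[k] -= c                     # subtract cos(n-2)x
        a, b = b, nxt
    return b

c5 = cos_power_coeffs(5)                    # [0, 5, 0, -20, 0, 16]
check = sum(c * math.cos(0.7) ** k for k, c in enumerate(c5))  # equals cos(3.5)
```

For example, `cos_power_coeffs(5)` recovers the classical identity $\cos 5x = 16\cos^5 x - 20\cos^3 x + 5\cos x$.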

Theorem 10-15. Let $f$ be continuous on the interval $[-1, 1]$, and let $\epsilon > 0$
be given. Then there exists a polynomial $P$ such that

\[ |f(x) - P(x)| < \epsilon \]

for all $x$ in $[-1, 1]$.

Proof. Consider the function

\[ F(t) = f(\cos t) \]

for $-\pi \le t \le \pi$. Then $F$ is continuous, periodic, and even. Hence the $F_\delta$
approximations of Theorem 10-13 are also even, and it follows that their Fourier
series contain only cosine terms. Thus, by the preceding lemma, a partial sum

\[ T(t) = A_0 + A_1 \cos t + \cdots + A_n \cos^n t \]

of the Fourier series for some $F_\delta$ will satisfy

\[ |F(t) - T(t)| < \epsilon \]

for all $t$ in $[-\pi, \pi]$. Setting $x = \cos t$ yields $|f(x) - P(x)| < \epsilon$ for all $x$ in
$[-1, 1]$, with

\[ P(x) = A_0 + A_1 x + \cdots + A_n x^n. \]

∎

From here it is an easy step to our final result.

Theorem 10-16. (The Weierstrass approximation theorem for ordinary
polynomials.) Let $f$ be a continuous function on a closed interval $[a, b]$.
Then, given any $\epsilon > 0$, there exists a polynomial $P$ such that

\[ |f(x) - P(x)| < \epsilon \]

for all $x$ in $[a, b]$. In other words, a continuous function on a closed interval
can be uniformly approximated by polynomials.

The asserted result follows immediately from Theorem 10-15 and the fact
that the mapping

\[ f(x) \to f\!\left( \frac{b+a}{2} + \frac{b-a}{2}\, x \right) \]

establishes a one-to-one correspondence between $\mathcal{C}[-1, 1]$ and $\mathcal{C}[a, b]$.


11

orthogonal series of polynomials

11-1 INTRODUCTION
In this chapter we continue the study of series expansions in Euclidean spaces
by introducing three classic orthogonal series expressed in terms of polynomials.
At the moment, of course, it is not at all clear that there is anything to be gained
by using polynomials in place of trigonometric functions, and we shall have to
ask the reader to reserve judgment on this point until we come to the study of
boundary value problems. In fact, this entire chapter can be omitted without
serious prejudice until Chapter 13 has been read, and even then it will be possible
to continue with no more than a knowledge of Legendre polynomials and series
(Sections 11-2 through 11-4). Nevertheless, the reader who pursues this dis-
cussion to its conclusion will enhance his appreciation for the scope and subtlety
of the theory of orthogonal series, and will be that much better prepared for the
material which follows.
Before we begin, it may be appropriate to remark, once more, that our present
investigations are a natural outgrowth of the ideas developed in the study of
Euclidean spaces. Thus, although certain portions of the following discussion are
technically involved, the issues at stake are simple and familiar. The student who
keeps this point firmly in mind as he reads on should then be able to see the forest
while among the trees.

11-2 LEGENDRE POLYNOMIALS

In Section 7-4 we applied the Gram-Schmidt orthogonalization process to the
linearly independent set $1, x, x^2, \ldots$ in $\mathrm{PC}[-1, 1]$ to obtain an orthogonal
sequence of polynomials

\[ p_0(x), \ p_1(x), \ p_2(x), \ldots. \qquad (11\text{-}1) \]

As we shall see, these polynomials actually form a basis for $\mathrm{PC}[-1, 1]$, and were
it not for the complications involved in applying the orthogonalization process
we could proceed directly to the study of series expansions relative to (11-1). But
these complications do exist, and are serious enough to make it easier to start
410 ORTHOGONAL SERIES OF POLYNOMIALS CHAP. 11

[Figure 11-1: graphs of the first few Legendre polynomials on $[-1, 1]$.]

afresh with a slightly different polynomial basis. This, then, is the reason for
introducing the so-called Legendre polynomials: they are a basis for $\mathrm{PC}[-1, 1]$,
and are reasonably amenable to computations.

Definition 11-1. Let $\{P_n(x)\}$, $n = 0, 1, 2, \ldots,$ be the sequence of poly-
nomials defined as follows:

\[ P_0(x) = 1, \qquad (11\text{-}2) \]

and

\[ P_n(x) = \frac{1}{2^n n!}\, D^n (x^2 - 1)^n \qquad (11\text{-}3) \]

for $n > 0$. Then $P_n(x)$ is called the Legendre polynomial of degree $n$, and
(11-3) is known as Rodrigues' formula for these polynomials.*

It is clear from (11-3) that $P_n$ is a polynomial of degree $n$, and by direct computa-
tion we see that

\[ P_0(x) = 1, \qquad P_1(x) = x, \]
\[ P_2(x) = \tfrac{3}{2}x^2 - \tfrac{1}{2}, \qquad P_3(x) = \tfrac{5}{2}x^3 - \tfrac{3}{2}x, \]
\[ P_4(x) = \tfrac{35}{8}x^4 - \tfrac{30}{8}x^2 + \tfrac{3}{8}, \qquad P_5(x) = \tfrac{63}{8}x^5 - \tfrac{70}{8}x^3 + \tfrac{15}{8}x. \]

* This formula is also valid when $n = 0$, provided $D^0$ is interpreted as the identity
operator.
operator.
11-2 | LEGENDRE POLYNOMIALS 411

Moreover, since all of the powers of $x$ in $(x^2 - 1)^n$ are even, $P_{2n}$ contains only
even powers of $x$, while $P_{2n+1}$ contains only odd powers of $x$. (See Fig. 11-1.)
This phenomenon is already apparent in the above list, and will prove useful as
we continue.

Before going on to establish some of the basic properties of the Legendre
polynomials, we prove a general theorem concerning orthogonal sequences of
polynomials in $\mathrm{PC}[a, b]$ which will serve to relate the Legendre polynomials
to the polynomials in (11-1). The theorem we have in mind is an immediate conse-
quence of the following lemma.

Lemma 11-1. Let $\{R_n(x)\}$, $n = 0, 1, 2, \ldots,$ be an orthogonal sequence
of polynomials in $\mathrm{PC}[a, b]$ indexed by degree (i.e., $R_n$ is of degree $n$). Then
for each $n$, $R_n$ is orthogonal in $\mathrm{PC}[a, b]$ to every polynomial of degree $< n$.

Proof. Let $\mathcal{P}_m$ denote the $m$-dimensional subspace of $\mathrm{PC}[a, b]$ consisting of
all polynomials of degree $< m$, together with the zero polynomial. Then $R_0,
R_1, \ldots, R_{m-1}$ is an orthogonal basis for $\mathcal{P}_m$, and every polynomial $Q$ of degree
$< m$ can be written in the form

\[ Q = \alpha_0 R_0 + \cdots + \alpha_{m-1} R_{m-1}, \]

where

\[ \alpha_k = \frac{Q \cdot R_k}{\|R_k\|^2}, \qquad k = 0, \ldots, m-1. \]

Hence

\[ Q \cdot R_n = \alpha_0 (R_0 \cdot R_n) + \cdots + \alpha_{m-1} (R_{m-1} \cdot R_n), \]

and since $R_k \cdot R_n = 0$ if $k \ne n$, it follows that $Q \cdot R_n = 0$, as asserted. ∎

Theorem 11-1. Let $\{Q_n(x)\}$ and $\{R_n(x)\}$, $n = 0, 1, 2, \ldots,$ be orthogonal
sequences of polynomials in $\mathrm{PC}[a, b]$ indexed by degree. Then, for each $n$,
$Q_n$ and $R_n$ are scalar multiples of one another.

Proof. Since $Q_0, \ldots, Q_n$ is an orthogonal basis for $\mathcal{P}_{n+1}$,

\[ R_n(x) = \frac{R_n \cdot Q_0}{\|Q_0\|^2}\, Q_0(x) + \frac{R_n \cdot Q_1}{\|Q_1\|^2}\, Q_1(x) + \cdots + \frac{R_n \cdot Q_n}{\|Q_n\|^2}\, Q_n(x). \]

However, by the preceding lemma, $R_n$ is orthogonal to $Q_0, \ldots, Q_{n-1}$. Hence
this equation reduces to

\[ R_n(x) = \frac{R_n \cdot Q_n}{\|Q_n\|^2}\, Q_n(x), \]

and we are done. ∎


Among other things, Theorem 11-1 implies that, up to scalar multiples,
$\mathrm{PC}[-1, 1]$ contains only one orthogonal sequence of polynomials $\{R_n(x)\}$ indexed
by degree. Thus if each of the $R_n$ is chosen so that its leading coefficient is one,
the sequence in question is uniquely determined. But these two properties are
enjoyed by the polynomials in the orthogonal sequence (11-1) (see Exercise 16,
Section 7-4), and we therefore have

Corollary 11-1. The only orthogonal sequence of polynomials $\{p_n(x)\}$,
$n = 0, 1, 2, \ldots,$ in $\mathrm{PC}[-1, 1]$ with the property that $p_n$ has leading co-
efficient one and is of degree $n$ for each $n$ is the sequence obtained by or-
thogonalizing $1, x, x^2, \ldots.$

Finally, if we accept the assertion that the Legendre polynomials are mutually
orthogonal in $\mathrm{PC}[-1, 1]$, Theorem 11-1 implies that each $P_n$ is a constant multiple
of the corresponding polynomial in (11-1). This is, in fact, the case, and we
shall see later that

\[ P_n(x) = \frac{(2n)!}{2^n (n!)^2}\, p_n(x), \qquad (11\text{-}4) \]

a formula which gives a second possible definition of the Legendre polynomials.

EXERCISES

1. Use Rodrigues' formula to compute the values of $P_1, \ldots, P_5$.

2. Prove that $P_{2m}$ and $P_{2n+1}$ are orthogonal in $\mathrm{PC}[-1, 1]$ for all $m$ and $n$.

3. Use Rodrigues' formula to prove that $P_1$ is orthogonal to $P_n$ in $\mathrm{PC}[-1, 1]$ for all
$n \ne 1$.

4. Prove that if $\mathrm{PC}[-1, 1]$ has a basis $\mathcal{B}$ consisting of polynomials, then $\mathcal{B}$ must contain
exactly one polynomial of each degree. Hence deduce that up to scalar multiples the
sequence of polynomials given in (11-1) is the only polynomial basis for $\mathrm{PC}[-1, 1]$.

11-3 ORTHOGONALITY: THE RECURRENCE RELATION

Our first task is to prove that the Legendre polynomials are mutually orthogonal
in $\mathrm{PC}[-1, 1]$. To this end we begin by showing that $P_n$ is a solution of a certain
linear differential equation, a fact which in itself is of considerable importance.*
Thus let $(x^2 - 1)^n = w$, and let $w^{(k)}$ denote the $k$th derivative of $w$. Then

\[ w^{(1)} = 2nx(x^2 - 1)^{n-1}, \]

and, multiplying by $x^2 - 1$, we have

\[ (x^2 - 1)\, w^{(1)} - 2nxw = 0. \qquad (11\text{-}5) \]

* A direct proof of orthogonality can also be given along the lines suggested in Exer-
cise 5 below.

11-3 | ORTHOGONALITY: THE RECURRENCE RELATION 413

Repeated differentiation of (11-5) yields

\[ (x^2 - 1)\, w^{(2)} - 2x(n-1)\, w^{(1)} - 2nw = 0, \]
\[ (x^2 - 1)\, w^{(3)} - 2x(n-2)\, w^{(2)} - 2[n + (n-1)]\, w^{(1)} = 0, \]
\[ \vdots \]
\[ (x^2 - 1)\, w^{(k+2)} - 2x[n - (k+1)]\, w^{(k+1)} - 2[n + (n-1) + \cdots + (n-k)]\, w^{(k)} = 0. \]

But since

\[ n + (n-1) + \cdots + (n-k) = \frac{(2n-k)(k+1)}{2} \]

(Exercise 7), this last equation reduces to

\[ (x^2 - 1)\, w^{(k+2)} - 2x[n - (k+1)]\, w^{(k+1)} - (2n-k)(k+1)\, w^{(k)} = 0. \]

We now set $k = n$, and observe that, by definition,

\[ P_n = \frac{w^{(n)}}{2^n n!}. \]

Thus, if the above equation is multiplied by $-1/2^n n!$, it can be rewritten

\[ (1 - x^2)\, P_n'' - 2x P_n' + n(n+1) P_n = 0, \]

and we have proved

Theorem 11-2. The $n$th Legendre polynomial $P_n$ is a solution of the second-
order linear differential equation

\[ (1 - x^2)\, y'' - 2xy' + n(n+1)\, y = 0. \qquad (11\text{-}6) \]

It is not difficult to show that every polynomial solution of Equation (11-6),
which, by the way, is known as Legendre's equation of order $n$, is a constant
multiple of $P_n$ (see Example 2, Section 6-5). Thus (11-6) characterizes $P_n$ up to a
constant multiple. Many treatments of the theory of Legendre polynomials start
at this point, and define $P_n$ as the polynomial solution of Legendre's equation of
order $n$ which assumes the value one when $x = 1$ (cf. Theorem 11-5 below).

Using (11-6) it is easy to establish the orthogonality of the Legendre poly-
nomials in $\mathrm{PC}[-1, 1]$. Indeed, starting with the pair of equations

\[ (1 - x^2)\, P_n'' - 2x P_n' + n(n+1) P_n = 0, \]
\[ (1 - x^2)\, P_m'' - 2x P_m' + m(m+1) P_m = 0, \]

we multiply the first by $P_m$, the second by $P_n$, and subtract to get

\[ (1 - x^2)[P_m P_n'' - P_m'' P_n] - 2x[P_m P_n' - P_m' P_n] = P_m P_n [m(m+1) - n(n+1)]. \qquad (11\text{-}7) \]

But the left-hand side of this equation is just the derivative of

\[ (1 - x^2)[P_m P_n' - P_m' P_n], \]

and hence, integrating (11-7) from $-1$ to $1$, we obtain

\[ (1 - x^2)[P_m P_n' - P_m' P_n] \Big|_{-1}^{1} = [m(m+1) - n(n+1)] \int_{-1}^{1} P_m(x) P_n(x)\, dx. \]

Finally, since $1 - x^2$ vanishes at the upper and lower limits of integration, we have

\[ [m(m+1) - n(n+1)] \int_{-1}^{1} P_m(x) P_n(x)\, dx = 0, \]

and thus, if $m \ne n$,

\[ \int_{-1}^{1} P_m(x) P_n(x)\, dx = 0. \]

This completes the proof of

Theorem 11-3. The Legendre polynomials are mutually orthogonal in
$\mathrm{PC}[-1, 1]$.

We now anticipate the construction of series expansions relative to the Legendre
polynomials by computing $\|P_n\|$, $n = 0, 1, 2, \ldots.$ Rather than attack this problem
directly, we first derive an important formula known as the recurrence relation for
the Legendre polynomials, from which these values can be deduced without
difficulty.

Here we begin by considering the function $xP_n(x)$, which is obviously a poly-
nomial of degree $n + 1$. Thus, since $P_0, P_1, \ldots, P_{n+1}$ is an orthogonal basis
for the subspace $\mathcal{P}_{n+2}$ of $\mathrm{PC}[-1, 1]$, we have

\[ xP_n(x) = \sum_{k=0}^{n+1} \frac{(xP_n) \cdot P_k}{\|P_k\|^2}\, P_k(x). \]

But, by Lemma 11-1, $P_n$ is orthogonal to every polynomial of degree $< n$, so that

\[ (xP_n) \cdot P_k = P_n \cdot (xP_k) = 0 \]

whenever $k < n - 1$. Hence

\[ xP_n(x) = \frac{(xP_n) \cdot P_{n-1}}{\|P_{n-1}\|^2}\, P_{n-1}(x) + \frac{(xP_n) \cdot P_n}{\|P_n\|^2}\, P_n(x) + \frac{(xP_n) \cdot P_{n+1}}{\|P_{n+1}\|^2}\, P_{n+1}(x). \]

This equation can be simplified still further if we note that $xP_n(x)^2$ is an odd
function (it contains only odd powers of $x$), for it then follows that

\[ (xP_n) \cdot P_n = \int_{-1}^{1} x P_n(x)^2\, dx = 0. \]

This allows us to write

\[ xP_n = \alpha P_{n+1} + \beta P_{n-1}, \qquad (11\text{-}8) \]

where $\alpha$ and $\beta$ are real numbers which we now propose to determine.

In the first place, since $P_k = (1/2^k k!)\, D^k (x^2 - 1)^k$, the coefficient of $x^k$ in $P_k$ is

\[ \frac{1}{2^k k!}\, 2k(2k-1) \cdots [2k - (k-1)] = \frac{(2k)!}{2^k (k!)^2}. \qquad (11\text{-}9) \]

Secondly, since

\[ (x^2 - 1)^k = x^{2k} - k x^{2k-2} + \frac{k(k-1)}{2}\, x^{2k-4} - \cdots, \]

the coefficient of $x^{k-2}$ in $P_k$ is

\[ -\frac{k}{2^k k!}\, (2k-2)(2k-3) \cdots [2k - (k+1)] = -\frac{(2k-2)!}{2^k (k-1)!\,(k-2)!}. \qquad (11\text{-}10) \]

We now use (11-9) to compute the coefficients of $x^{n+1}$ on both sides of (11-8)
and equate the results, obtaining

\[ \frac{(2n)!}{2^n (n!)^2} = \alpha\, \frac{[2(n+1)]!}{2^{n+1} [(n+1)!]^2}. \]

This implies that

\[ \alpha = \frac{n+1}{2n+1}, \]

and (11-8) becomes

\[ xP_n = \frac{n+1}{2n+1}\, P_{n+1} + \beta P_{n-1}. \qquad (11\text{-}11) \]

To find $\beta$ we use (11-9) and (11-10) to compute the coefficients of $x^{n-1}$ on both
sides of (11-11), again equating results. This gives

\[ -\frac{(2n-2)!}{2^n (n-1)!\,(n-2)!} = -\frac{n+1}{2n+1} \cdot \frac{(2n)!}{2^{n+1}\, n!\,(n-1)!} + \beta\, \frac{[2(n-1)]!}{2^{n-1} [(n-1)!]^2}, \]

which, after a little arithmetic, yields

\[ \beta = \frac{n}{2n+1}. \]

We now substitute this value in (11-11) and solve the resulting expression for
$P_{n+1}$ to obtain

Theorem 11-4. The $(n+1)$st Legendre polynomial satisfies the identity

\[ P_{n+1} = \frac{2n+1}{n+1}\, x P_n - \frac{n}{n+1}\, P_{n-1} \qquad (11\text{-}12) \]

for all $n \ge 1$, and all $x$.

Remark. This result is also valid when $n = 0$ if we agree to set $P_{-1}(x) = 0$.

Equation (11-12) is known as the recurrence relation for the Legendre poly-
nomials, and can be used to deduce properties of $P_{n+1}$ from those of the two
immediately preceding polynomials. For example, given that $P_0(x) = 1$ and that
$P_1(x) = x$, we can compute $P_2$ from (11-12) by setting $n = 1$. This gives

\[ P_2(x) = \tfrac{3}{2}\, x^2 - \tfrac{1}{2}. \]

From this, and the known value of $P_1$, we find that

\[ P_3(x) = \tfrac{5}{3}\, x \left( \tfrac{3}{2}\, x^2 - \tfrac{1}{2} \right) - \tfrac{2}{3}\, x = \tfrac{5}{2}\, x^3 - \tfrac{3}{2}\, x. \]

As a somewhat more substantial application of the recurrence relation we
prove

Theorem 11-5. $P_n(1) = 1$ for all $n$.

Proof. The assertion obviously holds for $P_0$ and $P_1$. Moreover, if we assume that

\[ P_0(1) = \cdots = P_n(1) = 1, \qquad n \ge 1, \]

the recurrence relation gives

\[ P_{n+1}(1) = \frac{2n+1}{n+1} - \frac{n}{n+1} = 1. \]

The desired result now follows by mathematical induction. ∎

In much the same way it can be shown that $P_n(-1) = (-1)^n$.

We now use the recurrence relation to compute the value of ||P n ||. Again a
trick is needed, and this time it is furnished by the polynomial

p{x) = Pn {x) - ^Jp-!: */>„_!(*). (11-13)


.

11-3 | ORTHOGONALITY: THE RECURRENCE RELATION 417

A simple calculation using (11-9) reveals that this polynomial is of degree less
than n (see Exercise 8), and hence, by Lemma 11-1, is orthogonal to Pn in
(P6[— 1, 1]. Thus, if we multiply (1 1-13) by Pn and integrate from — 1 to 1 we get

i /.i

Pnipcfdx = 2n ~ l
/ xPn _i(x)Pn (jc) dx. (11-14)
—l n y_i

This is half of what we need. The other half is found by multiplying the recurrence
relation by Pn -i and then integrating. This gives

n +
, ,
1 J _i
,
Pn-l{xfdx = ^^
n -\- l
/
J_i
xPn _ l (x)Pn (x)dx.

Thus, by (11-14),

\[ \int_{-1}^{1} P_n(x)^2\,dx = \frac{2n-1}{2n+1}\int_{-1}^{1} P_{n-1}(x)^2\,dx, \]

or

\[ \|P_n\|^2 = \frac{2n-1}{2n+1}\,\|P_{n-1}\|^2, \qquad n = 1, 2, \ldots. \]

We now use the fact that ||P_0||² = 2 to deduce that

\[ \|P_1\|^2 = \tfrac{1}{3}\cdot 2, \qquad \|P_2\|^2 = \tfrac{3}{5}\cdot\tfrac{1}{3}\cdot 2, \qquad \ldots, \]

\[ \|P_n\|^2 = \frac{2n-1}{2n+1}\cdot\frac{2n-3}{2n-1}\cdot\frac{2n-5}{2n-3}\cdots\tfrac{1}{3}\cdot 2 = \frac{2}{2n+1}, \]

and we have proved

Theorem 11-6. ||P_n||² = 2/(2n + 1) for all n.
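Theorem 11-6 can be spot-checked by squaring the coefficient list of P_n and integrating term by term over [−1, 1]; with rational arithmetic the value 2/(2n + 1) comes out exactly. A Python sketch (our own illustration; names are our own):

```python
from fractions import Fraction

def legendre(n):
    # P_0, ..., P_n via the recurrence (11-12)
    P = [[Fraction(1)], [Fraction(0), Fraction(1)]]
    for k in range(1, n):
        nxt = [Fraction(0)] * (k + 2)
        for j, c in enumerate(P[k]):          # ((2k+1)/(k+1)) x P_k
            nxt[j + 1] += Fraction(2*k + 1, k + 1) * c
        for j, c in enumerate(P[k - 1]):      # - (k/(k+1)) P_{k-1}
            nxt[j] -= Fraction(k, k + 1) * c
        P.append(nxt)
    return P

def norm_sq(p):
    """Exact value of the integral of p(x)^2 over [-1, 1]."""
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(p):
            if (i + j) % 2 == 0:              # odd powers integrate to 0
                total += 2 * a * b / (i + j + 1)
    return total

for n, p in enumerate(legendre(6)):
    assert norm_sq(p) == Fraction(2, 2*n + 1)
```

Each cross term x^{i+j} contributes 2/(i + j + 1) when i + j is even and nothing otherwise, so the check is a finite exact computation.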

By now the reader may suspect that the list of formulas involving Legendre polynomials is almost endless. It is! But rather than continue the task of collecting them, interesting as that may be, we let matters rest where they are, and bring this discussion to a close with a theorem which will be of some importance in the following chapters.

Theorem 11-7. The Legendre polynomial of degree n has n distinct (real) roots between −1 and 1.

Proof. The assertion is obviously true when n = 0. Moreover, for n ≥ 1,

\[ \int_{-1}^{1} P_n(x)\,dx = \int_{-1}^{1} P_n(x)P_0(x)\,dx = 0, \]

418 ORTHOGONAL SERIES OF POLYNOMIALS | CHAP. 11

by orthogonality, and it follows that P_n must change sign at least once in the interval −1 < x < 1, and hence has a root in this interval. Now let x_1, x_2, …, x_m be the roots of P_n in (−1, 1), and consider the polynomial

\[ Q(x) = (x - x_1)(x - x_2)\cdots(x - x_m). \]

(Note that Q is of degree m, with 1 ≤ m ≤ n.) Then since P_n has no repeated roots (see Exercise 19), and since P_n and Q change sign at the same points in (−1, 1), their product is either positive or negative throughout the entire interval, and it follows that

\[ \int_{-1}^{1} P_n(x)Q(x)\,dx \neq 0. \]

However, by Lemma 11-1, P_n is orthogonal to every polynomial of degree less than n, and hence would be orthogonal to Q were m < n. The preceding inequality excludes that possibility, and forces us to conclude that m = n, as asserted. ∎
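Theorem 11-7 is easy to observe numerically: evaluating P_n on a fine grid of (−1, 1) and counting sign changes yields exactly n of them. A Python sketch (our own illustration; the odd grid size 9999 is chosen so that the root x = 0 of every odd-degree P_n does not land exactly on a grid point):

```python
def legendre_value(n, x):
    """Evaluate P_n(x) by the recurrence (11-12)."""
    if n == 0:
        return 1.0
    p_prev, p = 1.0, x
    for k in range(1, n):
        p_prev, p = p, ((2*k + 1)*x*p - k*p_prev) / (k + 1)
    return p

def sign_changes(n, steps=9999):
    # grid of steps+1 points on [-1, 1]; x = 0 is not a grid point
    xs = [-1 + 2*i/steps for i in range(steps + 1)]
    vals = [legendre_value(n, x) for x in xs]
    return sum(1 for a, b in zip(vals, vals[1:]) if a*b < 0)

print(sign_changes(5))  # 5: five distinct roots in (-1, 1)
```

Each strict sign change isolates one root, and since a degree-n polynomial can have at most n roots, the count n confirms the theorem for the tested cases.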

[Note: For convenience of reference we have summarized the essential information relating to the various polynomials studied in this chapter in tabular form at the end of the chapter.]

EXERCISES

1. Use the recurrence relation to compute P3 and P4.


2. Prove that

\[ P_n(-x) = (-1)^n P_n(x) \]

for all n, and use this result together with Theorem 11-5 to deduce that P_n(−1) = (−1)^n.

3. Compute P_5.
4. Prove that

\[ P_n(x) = \frac{(2n)!}{2^n (n!)^2}\,p_n(x), \]

where p_n is the nth polynomial in the sequence (11-1).


5. (a) Let (x² − 1)^n = w, and denote the kth derivative of w by w^{(k)}. Prove that w, w^{(1)}, …, w^{(n−1)} all vanish when x = ±1.
(b) Prove that

\[ \int_{-1}^{1} P_n(x)\,x^m\,dx = 0 \]

for every non-negative integer m < n. [Hint: Use integration by parts, and the result in (a).]
(c) Use the result in (b) to deduce that the Legendre polynomials are mutually orthogonal in PC[−1, 1].


6. (a) With w and w^{(n)} as in Exercise 5, use integration by parts to prove that

\[ \int_{-1}^{1} w^{(n)}w^{(n)}\,dx = (2n)!\int_{-1}^{1} (1-x)^n(1+x)^n\,dx. \]

(b) Prove that

\[ \int_{-1}^{1} (1-x)^n(1+x)^n\,dx = \frac{(n!)^2\,2^{2n+1}}{(2n)!\,(2n+1)}. \]

(c) Use the results in (a) and (b) to prove that

\[ \int_{-1}^{1} P_n(x)^2\,dx = \frac{2}{2n+1}. \]

7. Let k and n be non-negative integers with k < n. Prove that

\[ (n-k) + (n-k+1) + \cdots + n = \frac{(2n-k)(k+1)}{2}. \]

[Hint: Use the formula 1 + 2 + ⋯ + n = n(n + 1)/2.]

8. Prove that P_n(x) − [(2n − 1)/n]xP_{n−1}(x) is a polynomial of degree < n.

9. Use mathematical induction to prove that the nth derivative of the product of two functions is given by

\[ (uv)^{(n)} = \sum_{k=0}^{n} \binom{n}{k} u^{(k)}v^{(n-k)}, \]

where \(\binom{n}{k} = n!/[k!(n-k)!]\), and u^{(0)} = u, v^{(0)} = v.

10. (a) Write (x² − 1)^{n+1} = (x² − 1)(x² − 1)^n, apply Exercise 9, and then differentiate the result to obtain

\[ 2(n+1)P'_{n+1} = (x^2-1)P''_n + 2(n+2)xP'_n + (n+1)(n+2)P_n. \]

(b) Use the result just obtained and the differential equation for P_n to prove that

\[ P'_{n+1} = xP'_n + (n+1)P_n. \]

11. Differentiate the recurrence relation for P_{n+1}, and use the result obtained together with that of Exercise 10(b) to prove that

\[ xP'_n - P'_{n-1} = nP_n. \]

12. Prove that
(a) P'_{n+1} − P'_{n−1} = (2n + 1)P_n;
(b) (1 − x²)P'_n = nP_{n−1} − nxP_n.

13. Establish each of the following results.

\[ \text{(a)}\quad \int_{-1}^{1} xP_n(x)P_{n-1}(x)\,dx = \frac{2n}{4n^2-1}, \qquad n \ge 1 \]

\[ \text{(b)}\quad \int_{-1}^{1} P_n(x)P'_{n+1}(x)\,dx = 2, \qquad n \ge 0 \]

\[ \text{(c)}\quad \int_{-1}^{1} xP'_n(x)P_n(x)\,dx = \frac{2n}{2n+1}, \qquad n \ge 0 \]

14. (a) Use Exercise 12(b) and some other suitable identity to prove that

\[ P_n = xP_{n-1} + \frac{x^2-1}{n}\,P'_{n-1}. \]

(b) Replace n by n − 1 in the equation of Exercise 10(b), and square the result. Square the identity in 14(a), and use these two relations to establish

\[ \frac{1-x^2}{n^2}\,[P'_n]^2 + [P_n]^2 = \frac{1-x^2}{n^2}\,[P'_{n-1}]^2 + [P_{n-1}]^2 \]

for n ≥ 1.

15. Use the result of Exercise 14(b) to show that

\[ \frac{1-x^2}{n^2}\,[P'_n]^2 + [P_n]^2 \le 1 \]

whenever |x| ≤ 1 and n > 0. Now prove that

\[ |P_n(x)| \le 1 \]

for all n if |x| ≤ 1.

16. (a) Let x_0, x_1, …, x_m be m + 1 distinct numbers between −1 and +1. Find a polynomial Q_i(x) of degree m which has roots at x_0, …, x_{i−1}, x_{i+1}, …, x_m and takes the value a_i at x_i.
(b) Let Q(x) denote the sum of the m + 1 polynomials Q_i(x) obtained in (a). Show that Q(x_i) = a_i for 0 ≤ i ≤ m, and then prove that no other polynomial of degree less than or equal to m has this property.
17. Let F(x) be any polynomial of degree ≤ 2m + 1, and let x_0, …, x_m be the roots of P_{m+1}. Divide F(x) by P_{m+1}(x) to obtain

\[ F(x) = P_{m+1}(x)p(x) + R(x), \]

where p(x) is of degree ≤ m, and R(x) is either zero or of degree ≤ m.
(a) Prove that

\[ \int_{-1}^{1} F(x)\,dx = \int_{-1}^{1} R(x)\,dx. \]

(b) Prove that R(x) is the polynomial Q(x) constructed in Exercise 16(b) if a_i is taken as F(x_i), i = 0, 1, …, m.
(c) Prove that

\[ \int_{-1}^{1} F(x)\,dx = \sum_{i=0}^{m} l_i F(x_i), \]

where the constants l_0, l_1, …, l_m depend only on x_0, x_1, …, x_m, respectively, and not on any particular properties of F(x).
18. Prove that there exists a positive constant M such that

\[ \int_{-1}^{1} |P_n(x)|\,dx \le \frac{M}{\sqrt{n}} \]

for all n ≥ 1. [Hint: Use the Cauchy-Schwarz inequality.]

19. Prove that none of the Legendre polynomials has a repeated root. [Hint: Observe that any repeated root of P_n is also a root of P'_n, and then use the uniqueness theorem for initial-value problems for second-order linear differential equations.]

11-4 LEGENDRE SERIES


Now that we have established the orthogonality of the Legendre polynomials, it is only natural to ask if they form a basis for PC[−1, 1]. The answer is that they do, and a proof of this fact will be given toward the end of the present section. For the moment, however, let us accept the truth of this assertion, and consider the series expansion of an arbitrary function f in PC[−1, 1] computed with respect to the Legendre polynomials. By Formula (8-22), this series assumes the form

\[ \sum_{n=0}^{\infty} \frac{f \cdot P_n}{\|P_n\|^2}\,P_n(x), \]

and converges in the mean to f. Thus we are entitled to write

\[ f(x) = \sum_{n=0}^{\infty} a_n P_n(x) \quad (\text{mean}), \tag{11-16} \]

where, by Theorem 11-6,

\[ a_n = \frac{2n+1}{2}\int_{-1}^{1} f(x)P_n(x)\,dx, \qquad n = 0, 1, 2, \ldots. \tag{11-17} \]

A series of this type is known as the Legendre series expansion of the function f, and the a_n are called the Legendre or Fourier-Legendre coefficients of f.
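Formula (11-17) can be tried out on a polynomial, where all the integrals are exact. For f(x) = x³ the computation below gives a_1 = 3/5, a_3 = 2/5, and all other coefficients zero, i.e. x³ = (3/5)P_1(x) + (2/5)P_3(x) (compare Exercise 5 below). A Python sketch with rational arithmetic (our own illustration; names are our own):

```python
from fractions import Fraction

def legendre(n):
    # coefficient lists of P_0 ... P_n from the recurrence (11-12)
    P = [[Fraction(1)], [Fraction(0), Fraction(1)]]
    for k in range(1, n):
        nxt = [Fraction(0)] * (k + 2)
        for j, c in enumerate(P[k]):
            nxt[j + 1] += Fraction(2*k + 1, k + 1) * c
        for j, c in enumerate(P[k - 1]):
            nxt[j] -= Fraction(k, k + 1) * c
        P.append(nxt)
    return P

def poly_mul_integrate(p, q):
    """Exact integral over [-1, 1] of p(x) q(x)."""
    s = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if (i + j) % 2 == 0:              # odd powers drop out
                s += 2 * a * b / (i + j + 1)
    return s

f = [Fraction(0), Fraction(0), Fraction(0), Fraction(1)]   # f(x) = x^3
coeffs = [Fraction(2*n + 1, 2) * poly_mul_integrate(f, p)  # formula (11-17)
          for n, p in enumerate(legendre(5))]
print(coeffs[1], coeffs[3])  # 3/5 2/5
```

For a polynomial f the series terminates, so the finite list of coefficients already reconstructs f exactly.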

Example. Find the Legendre series expansion of the function

\[ f(x) = \begin{cases} -1, & -1 < x < 0, \\ \phantom{-}1, & \phantom{-}0 < x < 1. \end{cases} \]

Since f is an odd function on [−1, 1], so is f(x)P_{2n}(x) for all n, and we have

\[ a_{2n} = 0, \qquad n = 0, 1, 2, \ldots. \]

Moreover,

\[ \int_{-1}^{1} f(x)P_{2n+1}(x)\,dx = 2\int_{0}^{1} P_{2n+1}(x)\,dx, \]

and (11-17) yields

\[ a_{2n+1} = (4n+3)\int_{0}^{1} P_{2n+1}(x)\,dx. \]

To evaluate this integral we write the differential equation for P_n in the form

\[ [(1-x^2)P'_n]' + n(n+1)P_n = 0, \]

and integrate from 0 to 1 to obtain

\[ n(n+1)\int_{0}^{1} P_n(x)\,dx = -\Bigl[(1-x^2)P'_n(x)\Bigr]_0^1 = P'_n(0). \]

But, by Exercise 12(b) of the last section,

\[ P'_n(0) = nP_{n-1}(0), \qquad n > 0, \]

and we therefore have

\[ \int_{0}^{1} P_n(x)\,dx = \frac{P_{n-1}(0)}{n+1}, \qquad n > 0. \]

Hence

\[ a_{2n+1} = \frac{4n+3}{2n+2}\,P_{2n}(0), \qquad n = 0, 1, 2, \ldots. \]

Finally, since

\[ P_{2n}(0) = (-1)^n\,\frac{1\cdot 3\cdot 5\cdots(2n-1)}{2\cdot 4\cdot 6\cdots(2n)}, \qquad n \ge 1 \]

(Exercise 2 below), we have a_1 = 3/2 and

\[ a_{2n+1} = (-1)^n\,\frac{4n+3}{2(n+1)}\cdot\frac{1\cdot 3\cdot 5\cdots(2n-1)}{2\cdot 4\cdot 6\cdots(2n)}, \qquad n \ge 1, \]

so that

\[ f(x) = \tfrac{3}{2}P_1(x) - \tfrac{7}{8}P_3(x) + \tfrac{11}{16}P_5(x) - \tfrac{75}{128}P_7(x) + \cdots \]
\[ = \tfrac{3}{2}P_1(x) + \sum_{n=1}^{\infty} (-1)^n\,\frac{4n+3}{2(n+1)}\cdot\frac{1\cdot 3\cdot 5\cdots(2n-1)}{2\cdot 4\cdot 6\cdots(2n)}\,P_{2n+1}(x). \]
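The numerical values above can be confirmed exactly: for odd k, a_k = (2k + 1)∫₀¹ P_k(x) dx, and the integral is a finite sum of rationals once the coefficients of P_k are known. A Python check (our own illustration, not part of the text):

```python
from fractions import Fraction

def legendre(n):
    # P_0 ... P_n by the recurrence (11-12)
    P = [[Fraction(1)], [Fraction(0), Fraction(1)]]
    for k in range(1, n):
        nxt = [Fraction(0)] * (k + 2)
        for j, c in enumerate(P[k]):
            nxt[j + 1] += Fraction(2*k + 1, k + 1) * c
        for j, c in enumerate(P[k - 1]):
            nxt[j] -= Fraction(k, k + 1) * c
        P.append(nxt)
    return P

def coeff(k, P):
    """a_k for the odd step function: a_k = (2k+1) * integral of P_k over [0, 1]."""
    return (2*k + 1) * sum(c / Fraction(j + 1) for j, c in enumerate(P[k]))

P = legendre(7)
print(", ".join(str(coeff(k, P)) for k in (1, 3, 5, 7)))  # 3/2, -7/8, 11/16, -75/128
```

The exact values 3/2, −7/8, 11/16, −75/128 match the expansion displayed above term by term.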

In general, the theorems describing the convergence of Legendre series are similar to the corresponding results for Fourier series, and are proved in much the same way.* Without going into the details here (see Section 11-5), the basic result in this connection reads as follows.

Theorem 11-8. The Legendre series for a piecewise smooth function f in PC[−1, 1] converges pointwise everywhere in the (open) interval (−1, 1), and has the value

\[ \frac{f(x^+) + f(x^-)}{2} \]

at each point in the interval. Moreover, the convergence is uniform on any closed subinterval of (−1, 1) in which f is continuous.

This said, we now turn to the one major item of unfinished business: a proof of the fact that the Legendre polynomials form a basis for PC[−1, 1].† Again the most important step in the proof is furnished by the Weierstrass approximation theorem, phrased this time for ordinary polynomials.

Theorem 11-9. Let f be a continuous function on a closed interval [a, b], and let ε be a positive real number. Then there exists a polynomial p such that

\[ |f(x) - p(x)| < \varepsilon \]

for all x in [a, b].

More succinctly, Theorem 11-9 asserts that any continuous function on a closed interval can be uniformly approximated by polynomials. Properly interpreted, Fig. 9-24 furnishes a suitable illustration of this result, and a proof can be found in Section 10-8.

For simplicity we now let ℒ denote the set of all Legendre polynomials. One of the implications of Theorem 11-1 is that ℒ spans the same subspace of PC[−1, 1] as the set 1, x, x², …, from which it follows that S(ℒ) is the subspace of PC[−1, 1] consisting of all polynomials. Thus S̄(ℒ), the closure of S(ℒ) in

* In fact, it can be shown that the Legendre series for a function converges pointwise in the interval (−1, 1) if and only if its Fourier series does.

† The following argument makes use of the results in Section 9-7, and should be omitted by those who are unfamiliar with that material.

PC[−1, 1], must also contain all polynomials. This fact, combined with the Weierstrass approximation theorem and Theorem 9-5, will allow us to show that every function in PC[−1, 1] is the limit in the mean of a sequence of functions in S(ℒ), which, of course, will prove the basis theorem.

Indeed, if f is a continuous function in PC[−1, 1], Theorem 11-9 provides a sequence of polynomials {Q_k}, k = 1, 2, … (the subscript does not indicate the degree of the polynomial this time), such that

\[ |f(x) - Q_k(x)| < \frac{1}{\sqrt{8}\,k} \]

for all x in [−1, 1]. But then

\[ \|f - Q_k\| = \left(\int_{-1}^{1} [f(x) - Q_k(x)]^2\,dx\right)^{1/2} \le \left(\frac{2}{8k^2}\right)^{1/2} = \frac{1}{2k}, \]

and we conclude that {Q_k} converges in the mean to f. Hence S̄(ℒ) contains all continuous functions on [−1, 1].

Finally, let g be any function in PC[−1, 1]. If g is not continuous we apply Theorem 9-5 (modified in the obvious way) to obtain a sequence of continuous functions {g_k}, k = 1, 2, …, such that

\[ \|g - g_k\| < \frac{1}{2k}. \]

Since each g_k is continuous we can apply the preceding argument to find a polynomial Q_k such that

\[ \|g_k - Q_k\| < \frac{1}{2k}. \]

Hence

\[ \|g - Q_k\| = \|(g - g_k) + (g_k - Q_k)\| \le \|g - g_k\| + \|g_k - Q_k\| < \frac{1}{2k} + \frac{1}{2k} = \frac{1}{k}, \]

and the sequence of polynomials {Q_k} converges in the mean to g. Thus g also belongs to S̄(ℒ), and we have proved

Theorem 11-10. The Legendre polynomials form a basis for the Euclidean space PC[−1, 1].

EXERCISES

1. What is the formula for the Legendre series expansion of an even function in PC[−1, 1]? Of an odd function?

2. Prove that P_{2n+1}(0) = 0, n = 0, 1, 2, …, and that

\[ P_{2n}(0) = (-1)^n\,\frac{1\cdot 3\cdot 5\cdots(2n-1)}{2\cdot 4\cdot 6\cdots(2n)}, \qquad n = 1, 2, \ldots. \]

3. Find the Legendre series expansion of the function |x| in PC[−1, 1]. [Hint: Use integration by parts and Exercise 12(a) of the preceding section to evaluate the coefficients.]

4. Prove that P'_n = (2n − 1)P_{n−1} + (2n − 5)P_{n−3} + (2n − 9)P_{n−5} + ⋯, for all n ≥ 1, by expanding P'_n in a Legendre series.

5. Find the Legendre series expansion of each of the following functions in PC[−1, 1].
(a) x³  (b) x⁵ − x³ + 2  (c) 4x⁴ + 2x² − x.

6. Find the Legendre series expansion of the function

\[ f(x) = \begin{cases} 1, & -1 < x < 0, \\ 0, & \phantom{-}0 < x < 1. \end{cases} \]

*ll-5 CONVERGENCE OF LEGENDRE SERIES

We conclude our discussion of Legendre series by proving the convergence theorem cited in the preceding section. As with the proof of the corresponding result for Fourier series, we begin the argument by deriving a formula for the partial sums of the series in question.

Thus let f be an arbitrary function in PC[−1, 1], and let

\[ S_n(x) = \sum_{k=0}^{n} a_k P_k(x), \]

where

\[ a_k = \frac{2k+1}{2}\int_{-1}^{1} f(t)P_k(t)\,dt. \]

Then

\[ S_n(x) = \sum_{k=0}^{n}\left[\frac{2k+1}{2}\int_{-1}^{1} f(t)P_k(t)\,dt\right]P_k(x) = \int_{-1}^{1}\left[\sum_{k=0}^{n}\frac{2k+1}{2}\,P_k(t)P_k(x)\right]f(t)\,dt = \int_{-1}^{1} K_n(t, x)f(t)\,dt, \]

where

\[ K_n(t, x) = \sum_{k=0}^{n}\frac{2k+1}{2}\,P_k(t)P_k(x). \]

To put this expression in more manageable form, we rewrite the recurrence relation for the Legendre polynomials as

\[ (2k+1)xP_k(x) = (k+1)P_{k+1}(x) + kP_{k-1}(x), \]

and multiply by P_k(t). This gives

\[ (2k+1)xP_k(t)P_k(x) = (k+1)P_k(t)P_{k+1}(x) + kP_k(t)P_{k-1}(x). \tag{11-18} \]

We now interchange the roles of x and t in this expression and subtract (11-18) from the result to obtain

\[ (2k+1)(t-x)P_k(t)P_k(x) = (k+1)[P_{k+1}(t)P_k(x) - P_k(t)P_{k+1}(x)] - k[P_k(t)P_{k-1}(x) - P_{k-1}(t)P_k(x)]. \]

Finally, summing this identity as k runs from 0 to n, we find that

\[ (t-x)\sum_{k=0}^{n} (2k+1)P_k(t)P_k(x) = (n+1)[P_{n+1}(t)P_n(x) - P_n(t)P_{n+1}(x)], \]

or

\[ K_n(t, x) = \frac{n+1}{2}\cdot\frac{P_{n+1}(t)P_n(x) - P_n(t)P_{n+1}(x)}{t-x}, \tag{11-19} \]

a formula which is known as Christoffel's identity.
In particular, we now have

\[ S_n(x) = \frac{n+1}{2}\int_{-1}^{1}\frac{P_{n+1}(t)P_n(x) - P_n(t)P_{n+1}(x)}{t-x}\,f(t)\,dt. \tag{11-20} \]

Moreover, since S_n(x) = 1 for all n when f(x) = 1, (11-20) also implies that

\[ \frac{n+1}{2}\int_{-1}^{1}\frac{P_{n+1}(t)P_n(x) - P_n(t)P_{n+1}(x)}{t-x}\,dt = 1. \tag{11-21} \]
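Christoffel's identity collapses the n + 1 terms of K_n(t, x) into a single quotient, and the two sides are easy to compare numerically. A Python sketch (our own names; floating-point evaluation via the recurrence (11-12)):

```python
def legendre_values(n_max, x):
    """[P_0(x), ..., P_{n_max}(x)] via the recurrence (11-12)."""
    vals = [1.0, x]
    for k in range(1, n_max):
        vals.append(((2*k + 1)*x*vals[k] - k*vals[k - 1]) / (k + 1))
    return vals[:n_max + 1]

def kernel_sum(n, t, x):
    # K_n(t, x) as the defining sum
    Pt, Px = legendre_values(n, t), legendre_values(n, x)
    return sum((2*k + 1) / 2 * Pt[k] * Px[k] for k in range(n + 1))

def kernel_christoffel(n, t, x):
    # the single-quotient form (11-19); valid for t != x
    Pt, Px = legendre_values(n + 1, t), legendre_values(n + 1, x)
    return (n + 1) / 2 * (Pt[n + 1]*Px[n] - Pt[n]*Px[n + 1]) / (t - x)

n, t, x = 8, 0.37, -0.62
assert abs(kernel_sum(n, t, x) - kernel_christoffel(n, t, x)) < 1e-10
```

The agreement to roundoff for arbitrary sample points mirrors the telescoping argument above.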

The two formulas just derived allow us to express the difference between S_n(x) and f(x) in integral form. The following lemmas will enable us to show that this difference approaches zero with increasing n whenever f is reasonably well-behaved, and they therefore play the same role in the present argument that the Riemann-Lebesgue lemma played in the study of Fourier series.

Lemma 11-2. If g is an arbitrary function in PC[−1, 1], then

\[ \lim_{n\to\infty} n^{1/2}\int_{-1}^{1} g(x)P_n(x)\,dx = 0. \tag{11-22} \]

Proof. Applying Parseval's equality to the function g, we have

\[ \int_{-1}^{1} g(x)^2\,dx = \sum_{n=0}^{\infty} \frac{\left[\int_{-1}^{1} g(x)P_n(x)\,dx\right]^2}{\int_{-1}^{1} P_n(x)^2\,dx} = \sum_{n=0}^{\infty} \frac{2n+1}{2}\left[\int_{-1}^{1} g(x)P_n(x)\,dx\right]^2. \]

Thus the series is convergent, and it follows that

\[ \lim_{n\to\infty} \frac{2n+1}{2}\left[\int_{-1}^{1} g(x)P_n(x)\,dx\right]^2 = 0, \]

which, in turn, implies the desired result. ∎

Lemma 11-3. There exists a positive constant M such that

\[ |P_n(x)| \le \frac{M}{n^{1/2}(1-x^2)^{1/2}} \tag{11-23} \]

for all n ≥ 1, and all x in the interval (−1, 1).

The proof of this inequality is long and somewhat involved. Hence, rather than run the risk of obscuring our present argument, we have deferred the proof to the exercises, where sufficient directions have been given to enable the student to work the details through for himself.

Now let f be an arbitrary function in PC[−1, 1], and let x_0 be a point in the open interval (−1, 1) at which f is continuous and has a right- and left-hand derivative, i.e., a point where

\[ \lim_{t\to x_0^-} \frac{f(t)-f(x_0)}{t-x_0} \quad\text{and}\quad \lim_{t\to x_0^+} \frac{f(t)-f(x_0)}{t-x_0} \]


exist. Then (11-20) and (11-21) imply that

\[ S_n(x_0) - f(x_0) = \frac{n+1}{2}\int_{-1}^{1} \frac{f(t)-f(x_0)}{t-x_0}\,[P_{n+1}(t)P_n(x_0) - P_n(t)P_{n+1}(x_0)]\,dt \]
\[ = \frac{n+1}{2}\,P_n(x_0)\int_{-1}^{1} \frac{f(t)-f(x_0)}{t-x_0}\,P_{n+1}(t)\,dt - \frac{n+1}{2}\,P_{n+1}(x_0)\int_{-1}^{1} \frac{f(t)-f(x_0)}{t-x_0}\,P_n(t)\,dt. \]

But, by Lemma 11-3, there exists a constant M such that, for all n ≥ 1,

\[ \left|\frac{n+1}{2}\,P_n(x_0)\right| \le \frac{M}{2(1-x_0^2)^{1/2}}\cdot\frac{n+1}{n^{1/2}} \le \frac{M}{2(1-x_0^2)^{1/2}}\cdot\frac{2n}{n^{1/2}} = Kn^{1/2}, \]

where K = M/(1 − x_0²)^{1/2}.

Moreover (and this is the critical point in the argument), the assumption that f has a right- and left-hand derivative at x_0 implies that the function

\[ \frac{f(t)-f(x_0)}{t-x_0}, \qquad t \neq x_0, \]

belongs to PC[−1, 1]. Hence, by Lemma 11-2,

\[ \lim_{n\to\infty}\left|\frac{n+1}{2}\,P_n(x_0)\int_{-1}^{1} \frac{f(t)-f(x_0)}{t-x_0}\,P_{n+1}(t)\,dt\right| \le K\lim_{n\to\infty} n^{1/2}\left|\int_{-1}^{1} \frac{f(t)-f(x_0)}{t-x_0}\,P_{n+1}(t)\,dt\right| = 0. \]

Similarly,

\[ \lim_{n\to\infty} \frac{n+1}{2}\,P_{n+1}(x_0)\int_{-1}^{1} \frac{f(t)-f(x_0)}{t-x_0}\,P_n(t)\,dt = 0, \]

and we have therefore shown that

\[ \lim_{n\to\infty} |S_n(x_0) - f(x_0)| = 0, \]

completing the proof of

Theorem 11-11. The Legendre series for a function f in PC[−1, 1] converges to the value f(x_0) at each point x_0 in the open interval (−1, 1) where f is continuous and has a right- and left-hand derivative.

In addition, it can be shown that this convergence is uniform whenever f is continuous. Specifically, we have the following theorem.

Theorem 11-12. The Legendre series for a function f in PC[−1, 1] converges uniformly to f on any closed subinterval of (−1, 1) in which f is continuous and piecewise smooth.

We omit the proof.


To complete the argument leading to Theorem 11-8 we must now determine the behavior of the Legendre series at a point of discontinuity of f. To this end we first prove the following rather surprising result.

Lemma 11-4. Let x_0 be an arbitrary point in the open interval (−1, 1). Then

\[ \lim_{n\to\infty} \frac{n+1}{2}\int_{-1}^{x_0} \frac{P_{n+1}(t)P_n(x_0) - P_n(t)P_{n+1}(x_0)}{t-x_0}\,dt = \frac{1}{2} \]

and

\[ \lim_{n\to\infty} \frac{n+1}{2}\int_{x_0}^{1} \frac{P_{n+1}(t)P_n(x_0) - P_n(t)P_{n+1}(x_0)}{t-x_0}\,dt = \frac{1}{2}. \]

Proof. We begin by computing the Legendre series expansion of the function

\[ f(x) = \begin{cases} 1, & -1 < x < x_0, \\ 0, & \phantom{-}x_0 < x < 1. \end{cases} \]

Since

\[ P_n(x) = \frac{1}{2n+1}\,[P'_{n+1}(x) - P'_{n-1}(x)] \]

for all n ≥ 1 (Exercise 12(a), Section 11-3), the general coefficient in this series is

\[ a_n = \frac{2n+1}{2}\int_{-1}^{x_0} P_n(x)\,dx = \frac{1}{2}\int_{-1}^{x_0} [P'_{n+1}(x) - P'_{n-1}(x)]\,dx \]
\[ = \tfrac{1}{2}[P_{n+1}(x_0) - P_{n-1}(x_0)] - \tfrac{1}{2}[P_{n+1}(-1) - P_{n-1}(-1)] = \tfrac{1}{2}[P_{n+1}(x_0) - P_{n-1}(x_0)], \]

the last step following from the fact that P_n(−1) = (−1)^n. Moreover,

\[ a_0 = \frac{1}{2}\int_{-1}^{x_0} dx = \frac{x_0+1}{2}. \]

Hence the Legendre series for f is

\[ \frac{x_0+1}{2} + \frac{1}{2}\sum_{n=1}^{\infty} [P_{n+1}(x_0) - P_{n-1}(x_0)]\,P_n(x), \]

and it follows that the value of the nth partial sum of this series at the point x_0 is

\[ S_n(x_0) = \frac{x_0+1}{2} + \frac{1}{2}\sum_{k=1}^{n} [P_{k+1}(x_0) - P_{k-1}(x_0)]\,P_k(x_0) = \frac{1}{2} + \frac{1}{2}\,P_{n+1}(x_0)P_n(x_0). \]

But by (11-20),

\[ S_n(x_0) = \frac{n+1}{2}\int_{-1}^{x_0} \frac{P_{n+1}(t)P_n(x_0) - P_n(t)P_{n+1}(x_0)}{t-x_0}\,dt. \]

Hence, by (11-23),

\[ \lim_{n\to\infty} \frac{n+1}{2}\int_{-1}^{x_0} \frac{P_{n+1}(t)P_n(x_0) - P_n(t)P_{n+1}(x_0)}{t-x_0}\,dt = \frac{1}{2} + \frac{1}{2}\lim_{n\to\infty} P_{n+1}(x_0)P_n(x_0) = \frac{1}{2}. \]

This proves the first statement in the lemma, and the second follows by subtracting this result from (11-21). ∎

Now let f be any function in PC[−1, 1] with the property that

\[ f(x) = \frac{f(x^+) + f(x^-)}{2} \]

for all x in (−1, 1), and let x_0 be any point at which f has a right- and left-hand derivative. Then, if S_n(x_0) denotes the value of the nth partial sum of the Legendre series for f, (11-20) implies that

\[ S_n(x_0) = I_1 + I_2, \]

where

\[ I_1 = \frac{n+1}{2}\int_{-1}^{x_0} f(t)\,\frac{P_{n+1}(t)P_n(x_0) - P_n(t)P_{n+1}(x_0)}{t-x_0}\,dt, \]
\[ I_2 = \frac{n+1}{2}\int_{x_0}^{1} f(t)\,\frac{P_{n+1}(t)P_n(x_0) - P_n(t)P_{n+1}(x_0)}{t-x_0}\,dt. \]

Thus

\[ S_n(x_0) - f(x_0) = \left[I_1 - \frac{f(x_0^-)}{2}\right] + \left[I_2 - \frac{f(x_0^+)}{2}\right], \]

and to prove that

\[ \lim_{n\to\infty} [S_n(x_0) - f(x_0)] = 0, \]

it suffices to show separately that

\[ \lim_{n\to\infty}\left[I_1 - \frac{f(x_0^-)}{2}\right] = \lim_{n\to\infty}\left[I_2 - \frac{f(x_0^+)}{2}\right] = 0. \]

By the preceding lemma,

\[ \lim_{n\to\infty} \frac{n+1}{2}\int_{-1}^{x_0} \frac{P_{n+1}(t)P_n(x_0) - P_n(t)P_{n+1}(x_0)}{t-x_0}\,dt = \frac{1}{2}, \]

whence

\[ \lim_{n\to\infty}\left[I_1 - \frac{f(x_0^-)}{2}\right] = \lim_{n\to\infty} \frac{n+1}{2}\int_{-1}^{x_0} \frac{f(t)-f(x_0^-)}{t-x_0}\,[P_{n+1}(t)P_n(x_0) - P_n(t)P_{n+1}(x_0)]\,dt. \]

Now let

\[ g(t) = \begin{cases} \dfrac{f(t)-f(x_0^-)}{t-x_0}, & -1 < t < x_0, \\[6pt] 0, & \phantom{-}x_0 < t < 1. \end{cases} \]

Then, since f has a left-hand derivative at x_0, g is piecewise continuous on [−1, 1], and we have

\[ \lim_{n\to\infty}\left[I_1 - \frac{f(x_0^-)}{2}\right] = \lim_{n\to\infty} \frac{n+1}{2}\int_{-1}^{1} g(t)[P_{n+1}(t)P_n(x_0) - P_n(t)P_{n+1}(x_0)]\,dt. \]

We now repeat the argument leading to Theorem 11-11 to deduce that

\[ \lim_{n\to\infty}\left[I_1 - \frac{f(x_0^-)}{2}\right] = 0. \]

Finally, by introducing the function

\[ h(t) = \begin{cases} 0, & -1 < t < x_0, \\[2pt] \dfrac{f(t)-f(x_0^+)}{t-x_0}, & \phantom{-}x_0 < t < 1, \end{cases} \]

and reasoning as above, we also find that

\[ \lim_{n\to\infty}\left[I_2 - \frac{f(x_0^+)}{2}\right] = 0, \]

thereby proving

Theorem 11-13. The Legendre series for a function f in PC[−1, 1] converges to the value

\[ \frac{f(x_0^+) + f(x_0^-)}{2} \]

at each point x_0 in the open interval (−1, 1) where f has a right- and left-hand derivative.

EXERCISES

1. During the proof of Lemma 11-2 it was asserted that if

\[ \lim_{n\to\infty} \frac{2n+1}{2}\left[\int_{-1}^{1} g(x)P_n(x)\,dx\right]^2 = 0, \]

then

\[ \lim_{n\to\infty} n^{1/2}\int_{-1}^{1} g(x)P_n(x)\,dx = 0. \]

Why is this true?

2. At what point does the proof of Theorem 11-11 break down if f has a jump discontinuity at x_0?

The following exercises furnish a proof of Lemma 11-3.

3. For each integer n ≥ 0, let

\[ f_n(x) = \frac{1}{\pi}\int_{0}^{\pi} \bigl[x + (x^2-1)^{1/2}\cos\varphi\bigr]^n\,d\varphi. \]

(a) Show that f_0(x) = 1 and f_1(x) = x.
(b) Show that

\[ (n+1)f_{n+1}(x) - (2n+1)xf_n(x) + nf_{n-1}(x) \]
\[ = \frac{1}{\pi}\int_{0}^{\pi} \bigl\{-n(x^2-1)\sin^2\varphi + (x^2-1)^{1/2}\cos\varphi\,[x + (x^2-1)^{1/2}\cos\varphi]\bigr\}\bigl[x + (x^2-1)^{1/2}\cos\varphi\bigr]^{n-1}\,d\varphi. \]

(c) Use integration by parts with

\[ u = \bigl[x + (x^2-1)^{1/2}\cos\varphi\bigr]^n, \qquad dv = \cos\varphi\,d\varphi \]

to prove that

\[ \int_{0}^{\pi} (x^2-1)^{1/2}\cos\varphi\,\bigl[x + (x^2-1)^{1/2}\cos\varphi\bigr]^n\,d\varphi = \int_{0}^{\pi} n(x^2-1)\sin^2\varphi\,\bigl[x + (x^2-1)^{1/2}\cos\varphi\bigr]^{n-1}\,d\varphi. \]

(d) Use the results of (a), (b), and (c) to deduce that

\[ P_n(x) = \frac{1}{\pi}\int_{0}^{\pi} \bigl[x + (x^2-1)^{1/2}\cos\varphi\bigr]^n\,d\varphi. \]
4. (a) Rewrite the formula in Exercise 3(d) as

\[ P_n(x) = \frac{1}{\pi}\int_{0}^{\pi} \bigl[x + i(1-x^2)^{1/2}\cos\varphi\bigr]^n\,d\varphi, \]

where i = √−1, and show that

\[ |P_n(x)| \le \frac{1}{\pi}\int_{0}^{\pi} \bigl(\cos^2\varphi + x^2\sin^2\varphi\bigr)^{n/2}\,d\varphi. \]

[Recall that if z = a + bi, then |z| = (a² + b²)^{1/2}.]
(b) Use the result of (a) to prove that

\[ |P_n(x)| \le 1 \]

for all n and all x in (−1, 1).
(c) Show that the inequality in (a) can be rewritten

\[ |P_n(x)| \le \frac{2}{\pi}\int_{0}^{\pi/2} \bigl[1 - (1-x^2)\sin^2\varphi\bigr]^{n/2}\,d\varphi. \]

[Hint: The graph of sin φ is symmetric about the line φ = π/2.]
(d) Prove that sin φ ≥ 2φ/π for 0 ≤ φ ≤ π/2.
(e) Set u = (2/π)(1 − x²)^{1/2}φ, and use the inequality in (d) to deduce that

\[ 1 - (1-x^2)\sin^2\varphi \le 1 - u^2, \qquad 0 \le \varphi \le \pi/2. \]

Then use (c), together with the inequality 1 − u² ≤ e^{−u²}, to prove that

\[ |P_n(x)| \le \frac{2}{\pi}\int_{0}^{\pi/2} e^{-nu^2/2}\,d\varphi \le \frac{1}{n^{1/2}(1-x^2)^{1/2}}\int_{0}^{\infty} e^{-v^2/2}\,dv. \]

(f) Now use the fact that

\[ \int_{0}^{\infty} e^{-v^2/2}\,dv < \infty \]

to conclude that there exists a positive constant M such that

\[ |P_n(x)| \le \frac{M}{n^{1/2}(1-x^2)^{1/2}} \]

for all n ≥ 1 and all x in (−1, 1).

11-6 HERMITE POLYNOMIALS

We now know a great deal about the behavior of polynomials in the Euclidean space PC[−1, 1], all the way from some very special properties of the Legendre polynomials to the central result which states that the orthogonalized sequence obtained from 1, x, x², … is a basis for the space. In point of fact, however, the essential portions of our earlier discussion remain valid over any finite interval [a, b], and thus, for theoretical purposes at least, we need not consider the apparently more general space PC[a, b]. But when we come to study piecewise continuous functions on the entire real line, (−∞, ∞), the situation becomes much different, and requires special consideration.*

In the first place, if f and g are arbitrary piecewise continuous functions on (−∞, ∞), there is no guarantee that the improper integral

\[ \int_{-\infty}^{\infty} f(x)g(x)\,dx = \lim_{\substack{a\to\infty\\ b\to\infty}} \int_{-a}^{b} f(x)g(x)\,dx \tag{11-24} \]

converges. Indeed, (11-24) is undefined even when f and g are polynomials. Thus our usual definition of an inner product is not valid, and this, in turn, implies that if we wish to use the general theory of Euclidean spaces in this context we must either consider another set of functions, or another inner product, or both.

As our point of departure, let us insist that any function space which we consider contain all polynomials, and that its inner product be defined by means of an improper integral over the entire real line. This being the case, it is clear that we must introduce a weight function w into our integral in order to guarantee that

\[ \int_{-\infty}^{\infty} w(x)p(x)q(x)\,dx \]

exists for every pair of polynomials p, q. (See Example 3, Section 7-1.) This requirement suggests that w be piecewise continuous on the entire real line, and that it tend to zero as |x| → ∞. Moreover, it is convenient to require that lim_{|x|→∞} w(x)x^n = 0 for any positive integer n. (Why?) Our experience from calculus suggests that we try an exponential function (with negative exponent), and since we allow x to assume negative as well as positive values, the exponent should involve only even powers of x. Thus we are led to try the weight function

\[ w(x) = e^{-x^2/2}. \tag{11-25} \]

* A function is said to be piecewise continuous on (−∞, ∞) whenever it is piecewise continuous on every finite interval [a, b].
11-6 HERMITE POLYNOMIALS 435

(See Fig. 11-2.)* The following lemma guarantees that this choice will be successful.

Lemma 11-5. Let \( I_n = \int_{-\infty}^{\infty} e^{-x^2/2} x^n\,dx \), n = 0, 1, 2, …. Then

\[ I_{2n+1} = 0, \qquad I_{2n} = \frac{(2n)!}{2^n n!}\,\sqrt{2\pi} \]

for all n.

Proof. We begin by considering

\[ I_{2n+1} = \lim_{\substack{a\to\infty\\ b\to\infty}} \int_{-a}^{b} e^{-x^2/2}x^{2n+1}\,dx. \]

In this case the integrand is an odd function of x, and hence

\[ \int_{-a}^{b} e^{-x^2/2}x^{2n+1}\,dx = \int_{-a}^{a} e^{-x^2/2}x^{2n+1}\,dx + \int_{a}^{b} e^{-x^2/2}x^{2n+1}\,dx = \int_{a}^{b} e^{-x^2/2}x^{2n+1}\,dx. \]

But

\[ \int_{a}^{b} x^{2n+1}e^{-x^2/2}\,dx = \int_{a}^{b} x^{2n}\bigl(xe^{-x^2/2}\bigr)\,dx = \Bigl[-x^{2n}e^{-x^2/2}\Bigr]_a^b + 2n\int_{a}^{b} x^{2n-1}e^{-x^2/2}\,dx. \]

Thus

\[ I_{2n+1} = \lim_{\substack{a\to\infty\\ b\to\infty}} \Bigl\{\Bigl[-x^{2n}e^{-x^2/2}\Bigr]_a^b + 2n\int_{a}^{b} x^{2n-1}e^{-x^2/2}\,dx\Bigr\} = 2n\lim_{\substack{a\to\infty\\ b\to\infty}} \int_{a}^{b} x^{2n-1}e^{-x^2/2}\,dx = 2n\lim_{\substack{a\to\infty\\ b\to\infty}} \int_{-a}^{b} x^{2n-1}e^{-x^2/2}\,dx, \]

the last step holding because the integral of the odd function x^{2n−1}e^{−x²/2} over [−a, a] is zero,

* Some authors set w(x) = e^{−x²}. Except for the obvious modifications this necessitates, all of the results in this section still remain valid. However, the Hermite polynomials defined below then do not have 1 as their leading coefficient.
a 1

436 ORTHOGONAL SERIES OF POLYNOMIALS |


CHAP. 1

and it follows that I_{2n+1} = 2nI_{2n−1} for all n > 0. Moreover, by direct computation, we find that I_1 = 0. Hence I_{2n+1} = 0 for all n ≥ 0, as asserted.

To evaluate I_{2n} we must use the fact (proved in Exercise 3 below) that

\[ I_0 = \int_{-\infty}^{\infty} e^{-x^2/2}\,dx = \sqrt{2\pi}. \]

Then if n > 0, integrating by parts as before,

\[ I_{2n} = \lim_{\substack{a\to\infty\\ b\to\infty}}\Bigl\{\Bigl[-x^{2n-1}e^{-x^2/2}\Bigr]_{-a}^{b} + (2n-1)\int_{-a}^{b} x^{2n-2}e^{-x^2/2}\,dx\Bigr\} = (2n-1)I_{2n-2}. \]

Hence

\[ I_2 = (2-1)\sqrt{2\pi}, \]
\[ I_4 = (4-1)(2-1)\sqrt{2\pi}, \]
\[ \vdots \]
\[ I_{2n} = (2n-1)(2n-3)\cdots(2-1)\sqrt{2\pi}. \]

But

\[ (2n-1)(2n-3)\cdots(2-1) = \frac{(2n)!}{2^n n!} \quad\text{if } n > 0, \]

and the lemma is proved. ∎
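Lemma 11-5 can be confirmed numerically: for small n the tail of e^{−x²/2}x^{2n} beyond |x| = 12 is negligible, so composite Simpson's rule on [−12, 12] recovers I_{2n} to high accuracy. A Python sketch (our own illustration; names are our own):

```python
import math

def simpson(f, a, b, m=24000):
    """Composite Simpson's rule with m (even) subintervals."""
    h = (b - a) / m
    s = f(a) + f(b)
    s += 4 * sum(f(a + (2*i - 1)*h) for i in range(1, m//2 + 1))
    s += 2 * sum(f(a + 2*i*h) for i in range(1, m//2))
    return s * h / 3

for n in range(4):
    I2n = simpson(lambda x: math.exp(-x*x/2) * x**(2*n), -12.0, 12.0)
    exact = math.factorial(2*n) / (2**n * math.factorial(n)) * math.sqrt(2*math.pi)
    assert math.isclose(I2n, exact, rel_tol=1e-8)
```

The odd moments I_{2n+1} vanish by symmetry, so only the even ones need checking.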

From the way things are developing, it is obvious that we stand in imminent prospect of having to consider various integrals of the form

\[ \int_{-\infty}^{\infty} e^{-x^2/2} f(x)g(x)\,dx. \]

This being so, it will be well to dispose of convergence questions once and for all, which we do by proving

Lemma 11-6. Let f and g be piecewise continuous on (−∞, ∞), and suppose that

\[ \int_{-\infty}^{\infty} e^{-x^2/2} f(x)^2\,dx < \infty \quad\text{and}\quad \int_{-\infty}^{\infty} e^{-x^2/2} g(x)^2\,dx < \infty. \]

Then

(i) ∫_{−∞}^{∞} e^{−x²/2}[αf(x)]² dx < ∞ for all real α;

(ii) ∫_{−∞}^{∞} e^{−x²/2}[f(x) + g(x)]² dx < ∞;

(iii) |∫_{−∞}^{∞} e^{−x²/2} f(x)g(x) dx| < ∞.
11-6 HERMITE POLYNOMIALS 437

Proof. The first result is obvious, since

\[ \int_{-\infty}^{\infty} e^{-x^2/2}[\alpha f(x)]^2\,dx = \alpha^2\int_{-\infty}^{\infty} e^{-x^2/2} f(x)^2\,dx. \]

To prove (ii), we introduce the function max(f, g) which, for each x, is defined to be the larger of the two numbers f(x), g(x). It is clear that max(f, g) is piecewise continuous on (−∞, ∞) (see Fig. 11-3) and, in addition, that

\[ [f+g]^2 \le \bigl[|f| + |g|\bigr]^2 \le \bigl[2\max(|f|, |g|)\bigr]^2 = 4\max(f^2, g^2) \le 4[f^2 + g^2]. \]

Thus

\[ \int_{-\infty}^{\infty} e^{-x^2/2}[f(x)+g(x)]^2\,dx \le 4\int_{-\infty}^{\infty} e^{-x^2/2}f(x)^2\,dx + 4\int_{-\infty}^{\infty} e^{-x^2/2}g(x)^2\,dx. \]

By assumption these last two integrals are finite. Hence

\[ \int_{-\infty}^{\infty} e^{-x^2/2}[f(x)+g(x)]^2\,dx < \infty. \]

Finally, since

\[ f(x)g(x) = \frac{[f(x)+g(x)]^2}{2} - \frac{f(x)^2}{2} - \frac{g(x)^2}{2}, \]

we have

\[ \int_{-\infty}^{\infty} e^{-x^2/2}f(x)g(x)\,dx = \frac{1}{2}\int_{-\infty}^{\infty} e^{-x^2/2}[f(x)+g(x)]^2\,dx - \frac{1}{2}\int_{-\infty}^{\infty} e^{-x^2/2}f(x)^2\,dx - \frac{1}{2}\int_{-\infty}^{\infty} e^{-x^2/2}g(x)^2\,dx. \]

But we now know that all of these integrals are finite, and we are done. ∎

These technical details out of the way, we consider the set of all piecewise continuous functions on (−∞, ∞), identified in accordance with the convention in Section 9-2, which have the property that

\[ \int_{-\infty}^{\infty} e^{-x^2/2} f(x)^2\,dx < \infty. \tag{11-26} \]

This set will be denoted by d₂ʷ, the subscript serving as a reminder that f², rather than f, appears in (11-26), the superscript recalling that a weight function is involved, and d standing for "integrable." The functions in d₂ʷ are sometimes referred to as being "square integrable with respect to the weight function e^{−x²/2}."

Theorem 11-14. d₂ʷ is a Euclidean space under the usual definitions of addition and scalar multiplication of piecewise continuous functions, and inner product

\[ f \cdot g = \int_{-\infty}^{\infty} e^{-x^2/2} f(x)g(x)\,dx. \tag{11-27} \]

Proof. From Lemma 11-6 we know that f + g and αf belong to d₂ʷ whenever f and g do, and also that the integral defining f · g is finite. This disposes of all but the straightforward details in the proof, and these we leave to the reader. ∎

With this, we have realized our goal of constructing a function space on (−∞, ∞) which contains all polynomials, and also, by the way, all bounded piecewise continuous functions. [For if f is such a function, with |f(x)| ≤ M, then

\[ \int_{-\infty}^{\infty} e^{-x^2/2} f(x)^2\,dx \le M^2\int_{-\infty}^{\infty} e^{-x^2/2}\,dx = M^2\sqrt{2\pi} < \infty.] \]

Moreover, our patient insistence on keeping polynomials in view is now rewarded in the form of the following basic result.

Theorem 11-15. d₂ʷ is the set of all piecewise continuous functions on (−∞, ∞) which can be expressed as limits, in the mean, of sequences of polynomials. Hence, if f is an arbitrary function in d₂ʷ, there exists a sequence of polynomials {p_n}, n = 0, 1, 2, …, such that

\[ \lim_{n\to\infty} \|f - p_n\| = \lim_{n\to\infty}\left(\int_{-\infty}^{\infty} e^{-x^2/2}[f(x) - p_n(x)]^2\,dx\right)^{1/2} = 0. \]

The proof is difficult, and will be omitted.


From this point the discussion proceeds very much as with Legendre poly-
nomials. Indeed, the sequence 1, x 2 ... is linearly independent in #2 (Exercise
x, ,

6), and can be orthogonalized by the Gram-Schmidt process to produce a basis

Po(x), pi(x), ... , p n (x), . . .

in which pn is a polynomial of degree n. (This is one of the consequences of


Theorem 11-15.) But again this particular basis leads to computational diffi-

culties, and since Theorem 1 1-1 holds, mutatis mutandis, in this situation, we are
free to seek another basis composed of polynomials, one of each degree. This
time the most convenient basis consists of the so-called Hermite polynomials,
which are defined by the formula

Hn (x) = (-l)V 2/2 ~e-* 2/2 , n = 0, 1,2, . . . (11-28)

By direct computation we see that

H (x) = 1, H^x) = x,
H 2 (x) = x2 - 1, H 3 (x) = x 3 - 3x,

H 4 (x) = x4 - 6.x
2
+ 3, H 5 (x) = x 5
- lOx
3
+ 15*.
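Definition (11-28) can be implemented mechanically. If we write Dⁿe^{−x²/2} = q_n(x)e^{−x²/2}, then differentiating once more gives q_{n+1} = q_n′ − x q_n, and H_n = (−1)ⁿ q_n. The table above is reproduced by the following Python sketch (our own illustration; names are our own):

```python
def hermite(n):
    """Coefficient list [c0, c1, ...] of the Hermite polynomial H_n of
    (11-28), via H_n = (-1)^n q_n with q_{k+1} = q_k' - x q_k, q_0 = 1."""
    q = [1]                                   # q_0 = 1
    for _ in range(n):
        dq = [j*c for j, c in enumerate(q)][1:]          # q'
        xq = [0] + q                                     # x * q
        q = [a - b for a, b in
             zip(dq + [0]*(len(xq) - len(dq)), xq)]      # q' - x q
    return [(-1)**n * c for c in q]

print(hermite(4))  # [3, 0, -6, 0, 1]       i.e. x^4 - 6x^2 + 3
print(hermite(5))  # [0, 15, 0, -10, 0, 1]  i.e. x^5 - 10x^3 + 15x
```

Note that the leading coefficient is always 1, in agreement with the remark following the table.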
11-6 I HERMITE POLYNOMIALS 439

The fact that H_n is a polynomial of degree n with leading coefficient 1 can be determined by inspection of (11-28), or by the technique suggested in Exercise 5. An even more obvious remark is that H_{2n} is even, and H_{2n+1} odd, for all n.

Theorem 11-16. The Hermite polynomials are an orthogonal sequence in d₂ʷ, with

\[ \|H_n\|^2 = n!\sqrt{2\pi}. \tag{11-29} \]
Proof. Let

\[ J = \int_{-\infty}^{\infty} e^{-x^2/2} H_m(x)H_n(x)\,dx = (-1)^n\int_{-\infty}^{\infty} H_m(x)\,\frac{d^n}{dx^n}\bigl(e^{-x^2/2}\bigr)\,dx, \]

and assume that m ≤ n. Integrating by parts, we obtain

\[ J = (-1)^n\left[H_m(x)\,\frac{d^{n-1}}{dx^{n-1}}\bigl(e^{-x^2/2}\bigr)\right]_{-\infty}^{\infty} - (-1)^n\int_{-\infty}^{\infty} H'_m(x)\,\frac{d^{n-1}}{dx^{n-1}}\bigl(e^{-x^2/2}\bigr)\,dx. \]

But the first term in this quantity is the product of a polynomial and e^{−x²/2}, and hence vanishes as |x| → ∞. Thus

\[ J = (-1)^{n+1}\int_{-\infty}^{\infty} H'_m(x)\,\frac{d^{n-1}}{dx^{n-1}}\bigl(e^{-x^2/2}\bigr)\,dx. \]

We repeat this procedure a total of m times, and get

\[ J = (-1)^{n+m}\int_{-\infty}^{\infty} H_m^{(m)}(x)\,\frac{d^{n-m}}{dx^{n-m}}\bigl(e^{-x^2/2}\bigr)\,dx, \]

where H_m^{(m)} denotes the m-fold derivative of H_m. But H_m^{(m)} = m!, since H_m is a polynomial of degree m and leading coefficient 1. Hence

\[ J = (-1)^{n+m}\,m!\int_{-\infty}^{\infty} \frac{d^{n-m}}{dx^{n-m}}\bigl(e^{-x^2/2}\bigr)\,dx. \]

Now suppose m < n. Then

\[ J = (-1)^{n+m}\,m!\left[\frac{d^{n-m-1}}{dx^{n-m-1}}\bigl(e^{-x^2/2}\bigr)\right]_{-\infty}^{\infty} = 0, \]

since this quantity is also the product of a polynomial and e^{−x²/2}. Thus H_m and H_n are orthogonal in d₂ʷ whenever m ≠ n.

Finally, if m = n,

\[ J = (-1)^{2n}\,n!\int_{-\infty}^{\infty} e^{-x^2/2}\,dx = n!\sqrt{2\pi}, \]

and the theorem is proved. ∎



One of the principal reasons for the importance of the Hermite polynomials is that they appear as solutions of a certain linear differential equation. To derive this equation we first establish the following recurrence relation.

Theorem 11-17. The Hermite polynomials satisfy the identity

\[ H_{n+1} = xH_n - nH_{n-1}, \qquad n > 0. \tag{11-30} \]

(This identity is also valid for n = 0 if we agree that H_{−1} = 0.)

Proof. The proof is similar to the one given in Section 11-3 for the Legendre polynomials. Since xH_n(x) is a polynomial of degree n + 1, its Hermite series is of the form

\[ xH_n(x) = \sum_{k=0}^{n+1} \frac{xH_n \cdot H_k}{\|H_k\|^2}\,H_k(x). \tag{11-31} \]

By Lemma 1 1-1, which is also valid in the Euclidean space d™ we have


, xHn Hk =•

Hn xHk = if k < n — 1
• . Moreover,

xHn • Hn =
J
r— xe-
x2 ' 2
Hn {xf dx
=
since the integrand is an odd function (see Lemma 11-5). Thus (11-31) is of the
form
xHn (x) = H
a n n+1 (x) + /3 n ^n _i(x).
If we equate the coefficients of x n+l in this identity, and recall that the leading

coefiicient of H k is 1 , we see at once that an = 1

Rather than equate coefficients to obtain f3 n , we reason as follows. Since

_ xHn • Hn +\ _ j

-"n + 1 *
-"n + 1

for all n, it follows that

xHn -Hn+1 = \\Hn+1 \\


2
= V^r(n + 1)!.

But then
a — xHn
_ Hn _\ '

Pn
Hn _ Hn _\
i

s/I-k n\

V2tt (n - 1)!

Thus
xHn — Hn+ + i
nHn _i,
as asserted. |
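The recurrence of Theorem 11-17 can be used to regenerate the table of Hermite polynomials given earlier. Here is an illustrative sketch (not part of the text) that works on coefficient lists, with index k holding the coefficient of x^k:

```python
def hermite(n):
    """Coefficient lists for H_0, ..., H_n via H_{k+1} = x H_k - k H_{k-1}."""
    polys = [[1], [0, 1]]          # H_0 = 1, H_1 = x
    for k in range(1, n):
        xHk = [0] + polys[k]       # multiply H_k by x
        prev = polys[k - 1] + [0] * (len(xHk) - len(polys[k - 1]))
        polys.append([a - k * b for a, b in zip(xHk, prev)])
    return polys

polys = hermite(5)
print(polys[4])   # H_4 = x^4 - 6x^2 + 3   ->  [3, 0, -6, 0, 1]
print(polys[5])   # H_5 = x^5 - 10x^3 + 15x -> [0, 15, 0, -10, 0, 1]
```

The printed lists agree with the table at the start of the section, which is exactly the content of Exercise 7(a) below.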

To derive the differential equation we start with the defining formula for H_n:

    H_n(x) = (−1)^n e^{x²/2} (d^n/dx^n) e^{−x²/2}.

This may be rewritten

    (d^n/dx^n) e^{−x²/2} = (−1)^n e^{−x²/2} H_n(x),

and, when differentiated, gives

    (d^{n+1}/dx^{n+1}) e^{−x²/2} = (−1)^n [−xH_n(x) + H′_n(x)] e^{−x²/2}.

On the other hand,

    (d^{n+1}/dx^{n+1}) e^{−x²/2} = (−1)^{n+1} e^{−x²/2} H_{n+1}(x).

Thus

    H_{n+1} = xH_n − H′_n,        (11-32)

a relation which is sometimes useful in itself.
The recurrence relation and (11-32) together give

    H′_n − nH_{n−1} = 0,

or

    H′_{n+1} − (n + 1)H_n = 0.

We now differentiate (11-32) to get

    H′_{n+1} = xH′_n + H_n − H″_n,

and it follows that

    H″_n − xH′_n + nH_n = 0.

Thus we have proved

Theorem 11-18. The Hermite polynomial of degree n is a solution of the second-order linear differential equation

    y″ − xy′ + ny = 0.        (11-33)

As with the Legendre polynomials and their differential equation, it can be shown that H_n is the only polynomial solution of (11-33) with leading coefficient 1. We shall encounter this equation again in Chapter 13, when we discuss the Schrödinger wave equation from quantum mechanics.
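Theorem 11-18 can also be spot-checked mechanically. The sketch below (again an illustration, not from the book) verifies on coefficient lists that y″ − xy′ + ny annihilates the listed H_n:

```python
def deriv(p):
    """Coefficient list of p'."""
    return [k * p[k] for k in range(1, len(p))] or [0]

def times_x(p):
    """Coefficient list of x * p."""
    return [0] + p

def check_hermite_ode(p, n):
    """Return True if p'' - x p' + n p is the zero polynomial."""
    terms = [deriv(deriv(p)),
             [-c for c in times_x(deriv(p))],
             [n * c for c in p]]
    width = max(len(t) for t in terms)
    total = [sum(t[k] if k < len(t) else 0 for t in terms) for k in range(width)]
    return all(c == 0 for c in total)

# H_3 = x^3 - 3x and H_5 = x^5 - 10x^3 + 15x from the text's list.
print(check_hermite_ode([0, -3, 0, 1], 3))            # True
print(check_hermite_ode([0, 15, 0, -10, 0, 1], 5))    # True
```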

EXERCISES

1. Determine whether or not the following functions, defined on (−∞, ∞), belong to ℛ₂(−∞, ∞):

(a) e^{x²/4}    (b) x sin x    (c) |x|    (d) f(x) = n, n ≤ x < n + 1
(e) xe^{x²/8}    (f) x^a ln |x|    (g) f(x) = e^x if |x| ≥ 1, 0 if |x| < 1

2. Prove that lim_{|x|→∞} p(x)e^{−x²/2} = 0 for any polynomial p.

3. Let I = ∫_{−∞}^{∞} e^{−x²/2} dx. Then

    I² = (∫_{−∞}^{∞} e^{−x²/2} dx)(∫_{−∞}^{∞} e^{−y²/2} dy) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} e^{−(x²+y²)/2} dx dy.

Change to polar coordinates, evaluate this integral, and thus show that I = √(2π).

4. Prove that

    ∫_{−∞}^{∞} x^{2n} e^{−x²/2} dx = (2n − 1)(2n − 3) ··· (3)(1) √(2π),    n > 0.

[Hint: Use mathematical induction.]

5. Given that H_0(x) = 1, use (11-32) and mathematical induction to prove that H_n(x) is a polynomial of degree n with leading coefficient 1.

6. Prove that the functions 1, x, x², … are linearly independent in ℛ₂(−∞, ∞).

7. (a) Use the recurrence relation to verify that the polynomials listed in the text as H_2, …, H_5 are correct.
(b) Compute H_6 and H_7.


8. Prove that

    H_{2n+1}(0) = 0    and    H_{2n}(0) = (−1)^n · 1 · 3 · 5 ··· (2n − 1)

for all n.

In Exercises 9-11 we outline an alternate approach to the materials of this section.

9. Let ℒ₂ denote the set of all piecewise continuous functions f on (−∞, ∞), identified in the usual way, with the property that

    ∫_{−∞}^{∞} f(x)² dx < ∞.

(Such functions are said to be "square integrable" on (−∞, ∞).)
(a) If f and g belong to ℒ₂, set

    f·g = ∫_{−∞}^{∞} f(x)g(x) dx.

Prove that this defines an inner product on ℒ₂, and that with the usual definitions of addition and scalar multiplication ℒ₂ is a Euclidean space.


(b) Prove that ℒ₂ contains all functions of the form e^{−x²/4} p(x), where p is a polynomial.

10. With ℒ₂ as above, define the Hermite functions h_n, n = 0, 1, 2, …, by the formula

    h_n(x) = (−1)^n e^{x²/4} (d^n/dx^n) e^{−x²/2}.

(a) Prove that h_n(x) = e^{−x²/4} H_n(x), where H_n is the nth Hermite polynomial.
(b) Prove that h_m·h_n = 0 if m ≠ n, and h_n·h_n = n!√(2π), with respect to the inner product in ℒ₂. (The sequence {h_n}, n = 0, 1, 2, …, of Hermite functions actually forms a basis for ℒ₂.)

11. Show that the Hermite functions satisfy the same recurrence relation that the Hermite polynomials do.

12. Show that the functions h_n(x) = e^{−x²/4} H_n(x) satisfy the linear differential equation

    y″ + (n + ½ − x²/4)y = 0.
13. Let {e^{−x²/4} p_n(x)} and {e^{−x²/4} q_n(x)}, n = 0, 1, 2, …, be orthogonal sequences in ℒ₂ (see Exercise 9), and suppose that p_n and q_n are polynomials of degree n. Prove that for each n, p_n is a scalar multiple of q_n.

14. Prove that the sequence of functions

    (−1)^n e^{a²x²/4} (d^n/dx^n)(e^{−a²x²/2}),    n = 0, 1, 2, …,

a a nonzero real number, is an orthogonal sequence in ℒ₂ (see Exercise 9). (Actually, this sequence is a basis for ℒ₂.)

11-7 LAGUERRE POLYNOMIALS


The last of the various sets of orthogonal polynomials which we shall formally discuss, the so-called Laguerre polynomials, arises in the study of functions on the semi-infinite interval [0, ∞). In this setting we use the weight function e^{−x}, and let ℛ₂[0, ∞) denote the set of all piecewise continuous functions f on [0, ∞) with the property that

    ∫_0^∞ e^{−x} f(x)² dx

converges. It is then easy to show that ℛ₂[0, ∞) is a Euclidean space under the usual addition, scalar multiplication, and inner product, the last being defined by the formula

    f·g = ∫_0^∞ e^{−x} f(x)g(x) dx.

Moreover, since

    ∫_0^∞ e^{−x} x^n dx = n!,    n = 0, 1, 2, …        (11-34)

(Exercise 1), ℛ₂[0, ∞) contains all polynomials, and hence, as before, we have the following theorem.

Theorem 11-19. ℛ₂[0, ∞) is the set of all piecewise continuous functions on [0, ∞) which can be expressed as limits in the mean of sequences of polynomials.

From this it follows that any mutually orthogonal set of polynomials in ℛ₂[0, ∞), one of each degree, will be a basis for this space. One such set is the sequence {L_n(x)}, n = 0, 1, 2, …, of Laguerre polynomials, where

    L_n(x) = (−1)^n e^x (d^n/dx^n)(x^n e^{−x}).        (11-35)

Indeed, when the argument used to establish Theorem 11-16 is adapted to this situation, we find that

    L_m·L_n = 0, m ≠ n;    L_n·L_n = (n!)².        (11-36)
Furthermore, we have

Lemma 11-7. For each positive integer n, L_n is a polynomial of degree n with 1 as its leading coefficient and −n² as the coefficient of x^{n−1}.

Proof. By repeated differentiation we find that

    (d^n/dx^n)(x^n e^{−x}) = (−1)^n [x^n − n²x^{n−1} + ···] e^{−x}        (11-37)

(see Exercise 9, Section 11-3), where the terms omitted are of degree less than n − 1 in x. The lemma now follows by substituting this expression in (11-35). ▮

To derive the recurrence relation for the Laguerre polynomials we reason as follows. The polynomial xL_n(x) is of degree n + 1, and hence can be written in the form

    xL_n(x) = Σ_{k=0}^{n+1} a_k L_k(x),        (11-38)

where

    a_k = (xL_n)·L_k / ‖L_k‖².

But, by Lemma 11-1, (xL_n)·L_k = 0 for all k < n − 1. Hence (11-38) reduces to

    xL_n = a_{n−1}L_{n−1} + a_n L_n + a_{n+1}L_{n+1}.        (11-39)

We now equate the coefficients of x^{n+1} on both sides of this equation to deduce that a_{n+1} = 1. In other words,

    (xL_n)·L_{n+1} / ‖L_{n+1}‖² = 1,

and we have

    (xL_n)·L_{n+1} = ‖L_{n+1}‖² = [(n + 1)!]²;

see (11-36). Hence

    a_{n−1} = (xL_n)·L_{n−1} / ‖L_{n−1}‖² = (xL_{n−1})·L_n / [(n − 1)!]² = (n!)² / [(n − 1)!]² = n²,

and (11-39) can be rewritten

    xL_n = n²L_{n−1} + a_n L_n + L_{n+1}.

Finally, to find a_n we equate the coefficients of x^n on both sides of this expression and use Lemma 11-7. This gives a_n = 2n + 1, and we have proved

Theorem 11-20. The Laguerre polynomials satisfy the recurrence relation

    L_{n+1} + (2n + 1 − x)L_n + n²L_{n−1} = 0        (11-40)

for all n ≥ 1, and all x.

(Again this relation is valid when n = 0 if we set L_{−1} = 0.)

Starting with the value L_0(x) = 1 derived from (11-35), the above formula yields

    L_0(x) = 1,
    L_1(x) = x − 1,
    L_2(x) = x² − 4x + 2,
    L_3(x) = x³ − 9x² + 18x − 6,
    L_4(x) = x⁴ − 16x³ + 72x² − 96x + 24,
    L_5(x) = x⁵ − 25x⁴ + 200x³ − 600x² + 600x − 120.
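Solved for L_{n+1}, the recurrence (11-40) reads L_{n+1} = (x − 2n − 1)L_n − n²L_{n−1}, and the table above can be reproduced mechanically. A sketch (not part of the text), operating on coefficient lists with index k holding the coefficient of x^k:

```python
def laguerre(n):
    """Coefficient lists for L_0, ..., L_n via L_{k+1} = (x - 2k - 1) L_k - k^2 L_{k-1}."""
    polys = [[1], [-1, 1]]            # L_0 = 1, L_1 = x - 1
    for k in range(1, n):
        xLk = [0] + polys[k]          # multiply L_k by x
        width = len(xLk)
        pad = lambda p: p + [0] * (width - len(p))
        Lk, Lprev = pad(polys[k]), pad(polys[k - 1])
        polys.append([a - (2*k + 1)*b - k*k*c for a, b, c in zip(xLk, Lk, Lprev)])
    return polys

print(laguerre(5)[2])   # L_2 = x^2 - 4x + 2  ->  [2, -4, 1]
print(laguerre(5)[5])   # L_5               ->  [-120, 600, -600, 200, -25, 1]
```

This is in substance Exercise 2 below, carried out by machine.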

To complete our discussion it remains to derive the differential equation for L_n. Here we start by differentiating (11-35) to obtain

    L′_n(x) = (−1)^n e^x (d^{n+1}/dx^{n+1})(x^n e^{−x}) + L_n(x)
            = (−1)^n e^x (d^n/dx^n)(n x^{n−1} e^{−x} − x^n e^{−x}) + L_n(x)
            = (−1)^n n e^x (d^n/dx^n)(x^{n−1} e^{−x}).

Similarly,

    L′_{n−1}(x) = (−1)^{n−1} e^x (d^n/dx^n)(x^{n−1} e^{−x}) + L_{n−1}(x),

and we therefore have

    L′_n = nL_{n−1} − nL′_{n−1},    n ≥ 1.        (11-41)

But, by (11-40),

    L′_{n+1} + (2n + 1 − x)L′_n − L_n + n²L′_{n−1} = 0,

and since (11-41) implies that

    L′_{n+1} = (n + 1)L_n − (n + 1)L′_n    and    n²L′_{n−1} = n²L_{n−1} − nL′_n,

we find that

    xL′_n = nL_n + n²L_{n−1}.        (11-42)

We now use (11-41) again to deduce that n²L′_{n−1} + nL′_n = n²L_{n−1}, and hence, by (11-42), that

    n²L′_{n−1} + nL′_n = xL′_n − nL_n.        (11-43)

Finally, differentiating (11-42), we obtain

    xL″_n + L′_n = nL′_n + n²L′_{n−1},

or, by (11-43),

    xL″_n + L′_n = xL′_n − nL_n,

thereby proving

Theorem 11-21. The Laguerre polynomial of degree n is a solution of the second-order linear differential equation

    xy″ + (1 − x)y′ + ny = 0.        (11-44)

Equation (11-44) is known as Laguerre's equation of order n, and, as we shall see (Exercise 11, Section 15-3), L_n is the only polynomial solution of this equation with leading coefficient 1.
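As with the Hermite case, Theorem 11-21 admits a quick mechanical check (illustrative only, not from the book): verify on coefficient lists that xp″ + (1 − x)p′ + np vanishes for the polynomials listed earlier.

```python
def dpoly(p):
    """Coefficient list of p'."""
    return [k * p[k] for k in range(1, len(p))] or [0]

def satisfies_laguerre_ode(p, n):
    """True if x p'' + (1 - x) p' + n p is the zero polynomial (11-44)."""
    p1, p2 = dpoly(p), dpoly(dpoly(p))
    terms = [[0] + p2,                       # x p''
             p1,                             # p'
             [-c for c in [0] + p1],         # -x p'
             [n * c for c in p]]             # n p
    width = max(len(t) for t in terms)
    total = [sum(t[k] if k < len(t) else 0 for t in terms) for k in range(width)]
    return all(c == 0 for c in total)

print(satisfies_laguerre_ode([2, -4, 1], 2))          # L_2, True
print(satisfies_laguerre_ode([-6, 18, -9, 1], 3))     # L_3, True
```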

EXERCISES

1. Prove that

    ∫_0^∞ e^{−x} x^n dx = n!,    n = 0, 1, 2, … .

2. Verify the formulas for L_2, L_3, and L_4 given above.


3. Prove that ℛ₂[0, ∞) is a Euclidean space.

4. Prove that

    L_m·L_n = 0 if m ≠ n,    and    L_m·L_n = (n!)² if m = n.

5. Use the differential equation for the L_n to prove the orthogonality of these polynomials in ℛ₂[0, ∞). [Hint: Mimic the argument used in the case of the Legendre polynomials.]

In the following exercises we give the student the opportunity to explore for himself an
interesting variation of the material in this section.

6. Let a > −1 be a real number, and let

    L_n^{(a)}(x) = (−1)^n x^{−a} e^x (d^n/dx^n)(x^{n+a} e^{−x}),

n = 0, 1, 2, … . Show that L_n^{(a)} is a polynomial in x of degree n with leading coefficient 1 for all a and all n.
7. Prove that

    ∫_0^∞ x^a e^{−x} L_m^{(a)}(x) L_n^{(a)}(x) dx = 0

for m ≠ n, and hence deduce that {L_n^{(a)}(x)}, n = 0, 1, 2, …, is a mutually orthogonal sequence of polynomials in the space ℛ₂[0, ∞) with weight function x^a e^{−x}.

8. Prove that

    ‖L_n^{(a)}‖² = n! Γ(n + a + 1),

where Γ(a) denotes the gamma function. (See Exercises 25 and 26, Section 5-7.)

9. (a) Prove that the Laguerre polynomials {L_n^{(a)}} satisfy the recurrence relation

    L_{n+1}^{(a)} + (2n + a + 1 − x)L_n^{(a)} + n(n + a)L_{n−1}^{(a)} = 0.

(b) Compute the first four Laguerre polynomials L_n^{(2)}.

*10. Prove that L_n^{(a)} is a solution of the differential equation

    xy″ + (a + 1 − x)y′ + ny = 0.

*ll-8 GENERATING FUNCTIONS


We have seen that each of the various types of orthogonal polynomials encountered in this chapter can be characterized in several different ways: by means of a Rodrigues formula valid for all n, by a recurrence relation, or as polynomial solutions of certain linear differential equations. Each of these definitions has something to recommend it, and the particular one chosen is as much a matter of taste as anything else. In this section we shall introduce still another way of obtaining these polynomials, namely, by means of their so-called generating functions. Here the motivating idea is that the polynomials in question can be made to appear as the coefficients in the power series expansions of certain functions of two variables, and in this sense are "generated" by these functions. Although this method may appear highly artificial at first, particularly since we make scant effort to motivate the choice of generating function, it is not so at all. In fact, in many physical problems the orthogonal functions we have been considering arise out of their generating functions in a perfectly natural way, and for this reason alone the following discussion has much to recommend it.

I. The Legendre polynomials. Here we start with the function

    G(x, r) = 1 / (1 − 2xr + r²)^{1/2},        (11-45)

which we expand as a power series in r under the assumption that x and r are chosen so that the resulting series converges. This allows us to write

    G(x, r) = P_0(x) + P_1(x)r + P_2(x)r² + ··· = Σ_{n=0}^{∞} P_n(x) r^n,        (11-46)

where P_0, P_1, … are functions of x alone. These coefficients can be computed directly from (11-45) and (11-46) in the usual fashion, and it is not difficult to show that P_n must be a polynomial in x of degree n. For instance, setting r = 0 in (11-46) and noting that G(x, 0) = 1, we find that

    P_0(x) = 1.

For our present purposes, however, it is sufficient to observe that this series is uniformly and absolutely convergent whenever |2xr − r²| < 1, and hence, under these conditions, can be differentiated term-by-term with respect to r to yield

    ∂G/∂r = Σ_{n=1}^{∞} nP_n(x) r^{n−1}.        (11-47)

But, by (11-45),

    ∂G/∂r = (x − r) / (1 − 2xr + r²)^{3/2},

or

    (1 − 2xr + r²) ∂G/∂r + (r − x)G = 0.

Substituting (11-46) and (11-47) into this expression yields

    (1 − 2xr + r²) Σ_{n=1}^{∞} nP_n(x)r^{n−1} + (r − x) Σ_{n=0}^{∞} P_n(x)r^n = 0,
11-8 GENERATING FUNCTIONS 449

and it follows that

    Σ_{n=1}^{∞} nP_n(x)r^{n−1} − xP_0(x) − Σ_{n=1}^{∞} (2n + 1)xP_n(x)r^n + P_0(x)r + Σ_{n=1}^{∞} (n + 1)P_n(x)r^{n+1} = 0,

or

    Σ_{n=1}^{∞} nP_n(x)r^{n−1} − xP_0(x) − Σ_{n=1}^{∞} (2n + 1)xP_n(x)r^n + P_0(x)r + Σ_{n=2}^{∞} nP_{n−1}(x)r^n = 0.

Thus

    [P_1(x) − xP_0(x)] + [2P_2(x) − 3xP_1(x) + P_0(x)]r + Σ_{n=2}^{∞} [(n + 1)P_{n+1}(x) − (2n + 1)xP_n(x) + nP_{n−1}(x)] r^n = 0,

and we have

    P_1 − xP_0 = 0,
    (n + 1)P_{n+1} − (2n + 1)xP_n + nP_{n−1} = 0,    n ≥ 1.

But the second of these expressions is none other than the recurrence relation for the Legendre polynomials, and since P_0(x) = 1 and P_1(x) = x, we conclude that the coefficients in (11-46) are, in fact, the Legendre polynomials. Thus the function

    G(x, r) = 1 / (1 − 2xr + r²)^{1/2}

is a generating function for the Legendre polynomials.
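The identity just proved can be confirmed numerically. In the sketch below (an illustration, not part of the text), P_n(x) is evaluated at a fixed x from the recurrence (n + 1)P_{n+1} = (2n + 1)xP_n − nP_{n−1}, and the partial sum of Σ P_n(x)r^n is compared with the closed form:

```python
import math

def legendre_values(x, N):
    """P_0(x), ..., P_N(x) via the three-term recurrence."""
    vals = [1.0, x]
    for n in range(1, N):
        vals.append(((2*n + 1) * x * vals[n] - n * vals[n - 1]) / (n + 1))
    return vals

x, r, N = 0.7, 0.3, 40
series = sum(P * r**n for n, P in enumerate(legendre_values(x, N)))
closed = 1.0 / math.sqrt(1 - 2*x*r + r*r)
print(abs(series - closed) < 1e-12)   # True
```

Since |P_n(x)| ≤ 1 for |x| ≤ 1, the tail of the series is bounded by a geometric series in r, so 40 terms are far more than enough here.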
In physics, the function G(x, r) is encountered in the study of planetary motion and electrostatic potential, among other places. In fact, Legendre's original memoir on this subject, published in 1785, was devoted to the study of the gravitational attraction of "spheroids," and it is only fitting that we conclude our discussion of the Legendre polynomials with a brief indication of how this comes to pass. For various reasons we shall consider the case involving electrostatic potential, which, theoretically, is the same as the one treated by Legendre.

Specifically, we pose the problem of determining the potential at an arbitrary point Q of the plane arising from a pair of point charges of magnitudes +σ and −σ located as shown in Fig. 11-4. Since the potential at Q due to a single point charge σ located R units away is σ/R, the potential at Q in this case is

    V = σ/R − σ/R′.

But, referring to Fig. 11-4, we see that

    R² = r² + x² − 2rx cos θ,
    R′² = r² + x² − 2rx cos (π − θ),

from which it follows that

    1/R = (1/r) [1 − (2(x/r) cos θ − (x/r)²)]^{−1/2}

and

    1/R′ = (1/r) [1 − (2(x/r) cos (π − θ) − (x/r)²)]^{−1/2}.

Save for notation, each of these expressions is the generating function for the Legendre polynomials, and we therefore have

    1/R = (1/r) Σ_{n=0}^{∞} P_n(cos θ)(x/r)^n

and

    1/R′ = (1/r) Σ_{n=0}^{∞} P_n(cos (π − θ))(x/r)^n = (1/r) Σ_{n=0}^{∞} P_n(−cos θ)(x/r)^n,

provided that |2(x/r) cos θ − (x/r)²| < 1. Thus

    V = (σ/r) Σ_{n=0}^{∞} [P_n(cos θ) − P_n(−cos θ)](x/r)^n.

Finally, recalling that P_n is even if n is even, and odd if n is odd, the above expression reduces to

    V = (2σ/r) Σ_{n=0}^{∞} P_{2n+1}(cos θ)(x/r)^{2n+1}.

In elementary applications it is customary to approximate V by the first term in this series, in which case we have

    V ≈ (σ/r²)(2x) cos θ = (σd/r²) cos θ,

where d = 2x is the distance between the charges.



II. The Hermite polynomials. Here a generating function is

    G(x, r) = e^{xr − r²/2}.        (11-48)

Indeed, since the series

    e^z = 1 + z + z²/2! + ···

is uniformly and absolutely convergent for all z,

    G(x, r) = 1 + (xr − r²/2) + (1/2!)(xr − r²/2)² + ···
            = 1 + xr + (1/2!)(x² − 1)r² + ···
            = H_0(x) + H_1(x)r + (1/2!)H_2(x)r² + ···

for all x and all r. Differentiating this series term-by-term we find that

    ∂G/∂r = Σ_{n=0}^{∞} H_{n+1}(x) r^n/n!.

But (11-48) implies that

    ∂G/∂r = (x − r)G,

and hence that

    ∂G/∂r = (x − r) Σ_{n=0}^{∞} H_n(x) r^n/n!
          = xH_0(x) + Σ_{n=1}^{∞} [xH_n(x) − nH_{n−1}(x)] r^n/n!.

Thus

    Σ_{n=0}^{∞} H_{n+1}(x) r^n/n! = xH_0(x) + Σ_{n=1}^{∞} [xH_n(x) − nH_{n−1}(x)] r^n/n!,

and it follows that

    H_1 − xH_0 = 0,
    H_{n+1} − xH_n + nH_{n−1} = 0,    n ≥ 1,

as required.
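A numerical spot check of (11-48) along the same lines (again illustrative, not from the text): with H_n(x) evaluated from the recurrence of Theorem 11-17, the partial sums of Σ H_n(x)r^n/n! should reproduce e^{xr − r²/2}.

```python
import math

def hermite_values(x, N):
    """H_0(x), ..., H_N(x) via H_{n+1} = x H_n - n H_{n-1}."""
    vals = [1.0, x]
    for n in range(1, N):
        vals.append(x * vals[n] - n * vals[n - 1])
    return vals

x, r, N = 1.3, 0.4, 40
series = sum(H * r**n / math.factorial(n)
             for n, H in enumerate(hermite_values(x, N)))
print(abs(series - math.exp(x*r - r*r/2)) < 1e-12)   # True
```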

III. The Laguerre polynomials. In this case a generating function is

    G(x, r) = (1/(1 − r)) e^{−xr/(1−r)},        (11-49)

an assertion which is proved as follows.
The Taylor series expansion of G in powers of r can be written in the form

    G(x, r) = Σ_{n=0}^{∞} (−1)^n L_n(x) r^n/n!,        (11-50)

where L_n(x) is a function of x alone. Moreover, this series converges uniformly and absolutely for all x and all r with |r| < 1, since G(x, r) is the product of two functions whose series are thus convergent. Hence

    ∂G/∂r = Σ_{n=1}^{∞} (−1)^n L_n(x) r^{n−1}/(n − 1)!.

Moreover, by (11-49),

    (1 − r)² ∂G/∂r = (1 − x − r)G,

and we therefore have

    (1 − r)² Σ_{n=1}^{∞} (−1)^n L_n(x) r^{n−1}/(n − 1)! + (x − 1 + r) Σ_{n=0}^{∞} (−1)^n L_n(x) r^n/n! = 0.        (11-51)

An easy computation now reveals that

    L_{n+1} + (2n + 1 − x)L_n + n²L_{n−1} = 0,    n ≥ 1,

and since L_0(x) = 1 and L_1(x) = x − 1, we are done.
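Finally, (11-49) and (11-50) can be spot-checked the same way (a sketch, not from the text), evaluating L_n(x) pointwise from the recurrence of Theorem 11-20:

```python
import math

def laguerre_values(x, N):
    """L_0(x), ..., L_N(x) via L_{n+1} = (x - 2n - 1) L_n - n^2 L_{n-1}."""
    vals = [1.0, x - 1.0]
    for n in range(1, N):
        vals.append((x - 2*n - 1) * vals[n] - n*n * vals[n - 1])
    return vals

x, r, N = 0.5, 0.2, 60
series = sum((-1)**n * L * r**n / math.factorial(n)
             for n, L in enumerate(laguerre_values(x, N)))
closed = math.exp(-x*r / (1 - r)) / (1 - r)
print(abs(series - closed) < 1e-12)   # True
```

Note that (−1)^n L_n(x)/n! stays bounded for x ≥ 0, so the terms decay like r^n and the truncation error is negligible for |r| < 1.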

EXERCISES

1. Let

    G(x, r) = 1 / (1 − 2xr + r²)^{1/2}.

Prove that

(a) (1 − 2xr + r²) ∂G/∂x − rG = 0;
(b) (x − r) ∂G/∂x − r ∂G/∂r = 0;
(c) r ∂(rG)/∂r − (1 − rx) ∂G/∂x = 0.
2. (a) Substitute the series

    G(x, r) = Σ_{n=0}^{∞} P_n(x) r^n

into the identities in 1(b) and 1(c), and deduce in turn that

    nP_n(x) − xP′_n(x) + P′_{n−1}(x) = 0,    n ≥ 1,

and

    (n + 1)P_n(x) − P′_{n+1}(x) + xP′_n(x) = 0.

(b) Use these results to prove that

    P′_{n+1} − P′_{n−1} = (2n + 1)P_n

for all n ≥ 1.

3. Use the identities in the preceding exercise to derive the differential equation for P_n. [Hint: Replace n by n − 1 in the second identity in 2(a), and then substitute the value of P′_{n−1} found in 2(b). Differentiate, and again eliminate P′_{n−1}.]

4. Starting with the series

    G(x, r) = Σ_{n=0}^{∞} H_n(x) r^n/n!,

prove that

    H_n(x) = (−1)^n e^{x²/2} (d^n/dx^n) e^{−x²/2}.

[Hint: e^{xr−r²/2} = e^{x²/2} e^{−(x−r)²/2}.]

5. (a) With G as in Exercise 4, compute ∂G/∂x, and show that H′_n(x) = nH_{n−1}(x).
(b) Use the result in (a) and the recurrence formula for the H_n to obtain the differential equation for H_n.

6. Show that the recurrence relation for the Laguerre polynomials follows from (11-51).

7. Let the Laguerre polynomials be defined by their generating function

    G(x, r) = (1/(1 − r)) e^{−rx/(1−r)}.

(a) Prove that

    (1 − r) ∂G/∂x + rG = 0

and

    r(1 − r) ∂G/∂r − (x − 1 + r) ∂G/∂x = 0.

(b) Use these identities and the series expansion

    G(x, r) = Σ_{n=0}^{∞} (−1)^n L_n(x) r^n/n!

to prove that L_n is a solution of the differential equation

    xy″ + (1 − x)y′ + ny = 0.
1

454 ORTHOGONAL SERIES OF POLYNOMIALS CHAP. 1

8. (a) Starting with the formula

    L_n(x) = (−1)^n e^x (d^n/dx^n)(x^n e^{−x}),

show that

    L_n(x) = Σ_{k=0}^{n} (−1)^k [n!/(n − k)!]² x^{n−k}/k!.

*(b) Use the result in (a) to sum the series Σ_{n=0}^{∞} (−1)^n L_n(x) r^n/n!, and show that

    Σ_{n=0}^{∞} (−1)^n L_n(x) r^n/n! = (1/(1 − r)) e^{−xr/(1−r)}.

[Hint: Interchange the order of summation.]

*9. Prove that the function

    G(x, r) = (1/(1 − r)^{a+1}) e^{−rx/(1−r)}

is a generating function for the Laguerre polynomials L_n^{(a)}. (See Exercises 6 through 10 of the preceding section.)

A Table of Orthogonal Functions

The following table summarizes the results obtained in this chapter. For convenience of reference we have also included the corresponding information for Bessel functions (see Chapter 15).

LEGENDRE POLYNOMIALS P_n

Definition:             P_n(x) = (1/(2^n n!)) (d^n/dx^n)(x² − 1)^n
Recurrence relation*:   P_{−1} = 0, P_0 = 1;
                        (n + 1)P_{n+1} − (2n + 1)xP_n + nP_{n−1} = 0, n ≥ 0
Differential equation:  (1 − x²)y″ − 2xy′ + n(n + 1)y = 0
Generating function:    1/(1 − 2rx + r²)^{1/2}
Orthogonal in:          𝒫𝒞[−1, 1]
Orthogonality:          P_m·P_n = 0, m ≠ n;  P_n·P_n = 2/(2n + 1), m = n

* The index −1 is used as a subscript only when working with recurrence relations.

HERMITE POLYNOMIALS H_n

Definition:             H_n(x) = (−1)^n e^{x²/2} (d^n/dx^n) e^{−x²/2}
Recurrence relation*:   H_{−1} = 0, H_0 = 1;
                        H_{n+1} − xH_n + nH_{n−1} = 0, n ≥ 0
Differential equation:  y″ − xy′ + ny = 0
Generating function:    e^{xr − r²/2}
Orthogonal in:          ℛ₂(−∞, ∞); weight function e^{−x²/2}
Orthogonality:          H_m·H_n = 0, m ≠ n;  H_n·H_n = √(2π) n!, m = n

LAGUERRE POLYNOMIALS L_n

Definition:             L_n(x) = (−1)^n e^x (d^n/dx^n)(x^n e^{−x})
Recurrence relation*:   L_{−1} = 0, L_0 = 1;
                        L_{n+1} + (2n + 1 − x)L_n + n²L_{n−1} = 0, n ≥ 0
Differential equation:  xy″ + (1 − x)y′ + ny = 0
Generating function:    (1/(1 − r)) e^{−rx/(1−r)}
Orthogonal in:          ℛ₂[0, ∞); weight function e^{−x}
Orthogonality:          L_m·L_n = 0, m ≠ n;  L_n·L_n = (n!)², m = n

* The index −1 is used as a subscript only when working with recurrence relations.

LAGUERRE POLYNOMIALS L_n^{(a)}

Definition:             L_n^{(a)}(x) = (−1)^n x^{−a} e^x (d^n/dx^n)(x^{n+a} e^{−x}), a > −1
Recurrence relation*:   L_{−1}^{(a)} = 0, L_0^{(a)} = 1;
                        L_{n+1}^{(a)} + (2n + a + 1 − x)L_n^{(a)} + n(n + a)L_{n−1}^{(a)} = 0, n ≥ 0
Differential equation:  xy″ + (a + 1 − x)y′ + ny = 0
Generating function:    (1/(1 − r)^{a+1}) e^{−rx/(1−r)}
Orthogonal in:          ℛ₂[0, ∞); weight function x^a e^{−x}
Orthogonality:          L_m^{(a)}·L_n^{(a)} = 0, m ≠ n;  L_n^{(a)}·L_n^{(a)} = n! Γ(n + a + 1), m = n

BESSEL FUNCTIONS

Definition:             Defined by the differential equation
Recurrence relation:    xJ_{p+1} − 2pJ_p + xJ_{p−1} = 0, p an arbitrary real number
Differential equation:  x²y″ + xy′ + (x² − p²)y = 0
Generating function:    e^{(x/2)(t − 1/t)}
Orthogonality:          Too lengthy to be summarized here. See Section 15-9.

* The index −1 is used as a subscript only when working with recurrence relations.
12

boundary-value problems for


ordinary differential equations

12-1 DEFINITIONS AND EXAMPLES


By far the most important application of the ideas we have been considering in the past few chapters occurs in the study of boundary-value problems for second-order linear differential equations. Formally, such a problem consists of

(i) an equation of the type

    Ly = h,        (12-1)

in which L is a second-order linear differential operator defined on a (finite) interval [a, b], and h a function in 𝒞[a, b]; and

(ii) a pair of boundary or endpoint conditions of the form

    α₁y(a) + α₂y(b) + α₃y′(a) + α₄y′(b) = γ₁,
    β₁y(a) + β₂y(b) + β₃y′(a) + β₄y′(b) = γ₂,        (12-2)

where the αᵢ, βᵢ, and γᵢ are constants. The problem, of course, is to find all functions y in 𝒞²[a, b] which simultaneously satisfy (12-1) and (12-2).*
For instance, the equation

    y″ + y = 0        (12-3)

with boundary conditions

    y(0) = 0,    y(π) = 0        (12-4)

is a problem of this type on the interval [0, π]. To solve it we simply apply the boundary conditions to the general solution c₁ sin x + c₂ cos x of (12-3) to
* Boundary conditions of a different sort must be imposed if the interval is infinite.


An example of such a problem will be found in Section 13-8.

deduce that c₂ = 0 and that c₁ is arbitrary. Thus

    y = c sin x,

c an arbitrary constant, is the general solution of this particular problem.


As we shall see, boundary-value problems abound in physics and applied mathematics, and the solution of all but the simplest of them involves some type of orthogonal series. Indeed, the need to solve boundary-value problems was the stimulus which originally led to the invention of orthogonal series, and as we proceed it will become apparent that their study is a mathematical discipline in its own right, not, as one might expect, an adjunct to the general theory of differential equations.
In order to exclude certain trivial cases from the following discussion, we demand that at least one of the αᵢ and one of the βᵢ appearing in (12-2) be different from zero, and that the left-hand sides of these equations be linearly independent in the sense that they are not constant multiples of one another. Furthermore, to ensure that we are actually looking at a boundary-value problem, and not an initial-value problem, we also require that (12-2) contain nonzero terms involving each of the endpoints of the interval. Finally, we shall say that the given boundary conditions are homogeneous whenever γ₁ = γ₂ = 0. In this case the set of twice continuously differentiable functions on [a, b] which satisfy (12-2) is a subspace S of 𝒞²[a, b], and by viewing L as a linear transformation from S to 𝒞[a, b] the problem in question becomes one of solving a certain operator equation, to wit, the equation Ly = h, where h is a known function in 𝒞[a, b], and L: S → 𝒞[a, b]. Despite its simplicity this observation is important because it shows that the boundary conditions influence the problem only to the extent of determining the domain space for L. Strictly speaking, some symbol other than "L" ought to be used to represent the operator from S to 𝒞[a, b], since heretofore we have considered L as acting on all of 𝒞²[a, b]. (Recall that operators with different domains are different, even though the operators themselves are defined by the same formula.) However, such changes in notation would be as confusing as they are correct, and will therefore be avoided. But by the same token, it then becomes mandatory to specify the domain of the operator being considered whenever there is any possibility of confusion.
At this point we invoke a familiar argument to reduce the study of boundary-value problems with nonhomogeneous boundary conditions to the homogeneous case (see Section 2-9 and Exercise 8 below). Hence, unless otherwise stated, we shall assume from now on that all boundary conditions imposed are homogeneous.
The solutions of a boundary-value problem involving a linear differential operator L: S → 𝒞[a, b] are intimately related to the solutions of the equation

    Ly = λy,        (12-5)

where λ is an unknown parameter. In this setting we are required to find all values of λ for which (12-5) admits nontrivial solutions in S, and then find the solutions

corresponding to these λ. The reader should note that (12-5) may be rewritten

    (L − λI)y = 0,        (12-6)

where I denotes the identity transformation which sends each function in 𝒞[a, b] onto itself, or as

    a₂(x)y″ + a₁(x)y′ + [a₀(x) − λ]y = 0

if L = a₂(x)D² + a₁(x)D + a₀(x). Thus, for each value of λ, (12-5) is a homogeneous second-order linear differential equation, and the solution set of any (homogeneous) boundary-value problem involving this equation is the null space, in S, of the operator L − λI.

This having been said, we observe that the above problem can be rephrased in the language of linear algebra without specific reference to the nature of the operator L, as follows:

    Given a linear transformation L: S → 𝒱, where S is a subspace of 𝒱, find all values of λ for which the equation

        Lx = λx

    has nontrivial solutions; then find all solutions corresponding to these values of λ.

And this is the problem with which we shall begin our investigations. But first, an example.

Example. Solve the boundary-value problem

    −y″ = λy,
    y(0) = 0,    y(π) = 0.        (12-7)

Here S is the space of all twice continuously differentiable functions on [0, π] which vanish at the endpoints of the interval, and L is the second-order linear differential operator −D². (The minus sign has been introduced merely to simplify the final results. Without it the relevant values of λ would be negative.)
We distinguish three cases, according as λ = 0, λ < 0, λ > 0.*

Case 1: λ = 0. Here the differential equation has c₁ + c₂x as its general solution, and the boundary conditions imply that c₁ = c₂ = 0. Thus (12-7) has no nontrivial solutions when λ = 0.

Case 2: λ < 0. In this case y = c₁e^{√(−λ)x} + c₂e^{−√(−λ)x}, and the boundary conditions again yield y = 0.

* At this point we are tacitly assuming that λ must be real. This assumption will be justified later.

Case 3: λ > 0. Here the general solution of y″ + λy = 0 is

    y = c₁ sin √λ x + c₂ cos √λ x,        (12-8)

and the boundary conditions now yield the pair of equations

    c₂ = 0,    c₁ sin √λ π = 0.

Thus (12-7) admits nontrivial solutions if and only if

    sin √λ π = 0;

that is, if and only if λ assumes one of the values λ_n = n², n = 1, 2, … . Furthermore, for each of these values of λ the constant c₁ in (12-8) remains arbitrary, and it follows that the solution space corresponding to λ_n is the one-dimensional subspace of 𝒞[0, π] spanned by the function sin nx.
The numbers λ_n = n² which determine the cases in which (12-7) has nontrivial solutions are called the eigenvalues for this problem, and each nontrivial solution corresponding to the eigenvalue λ_n is called an eigenvector or eigenfunction belonging to λ_n. In the next section this terminology will be generalized to include a much wider class of problems, and for the moment we merely ask the reader to note that any set of eigenfunctions, such as sin x, sin 2x, sin 3x, …, one for each eigenvalue, is orthogonal in 𝒞[0, π]. This, as we shall see, is no accident, and when properly generalized will be of fundamental importance in the study of boundary-value problems.
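The orthogonality just remarked on is easy to confirm numerically. The sketch below (not part of the text) checks that the eigenfunctions sin nx vanish at both endpoints and that distinct eigenfunctions are orthogonal on [0, π]:

```python
import math

def inner(m, n, steps=2000):
    """Simpson approximation of the integral of sin(mx) sin(nx) over [0, pi]."""
    h = math.pi / steps
    f = lambda x: math.sin(m*x) * math.sin(n*x)
    s = f(0.0) + f(math.pi)
    for i in range(1, steps):
        s += (4 if i % 2 else 2) * f(i*h)
    return s * h / 3

# y = sin(nx) satisfies -y'' = n^2 y, and vanishes at both endpoints:
n = 3
assert abs(math.sin(n * 0.0)) < 1e-12 and abs(math.sin(n * math.pi)) < 1e-12

print(abs(inner(1, 2)) < 1e-9)              # True: distinct eigenfunctions
print(abs(inner(2, 2) - math.pi/2) < 1e-9)  # True: ||sin nx||^2 = pi/2
```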

EXERCISES

Find all solutions of each of the following boundary-value problems.

1. y″ + y = 0;  y(0) = 1, y(π) = −1
2. y″ + y = 0;  y(0) = 0, y′(π) = 0
3. y″ + 4y = sin 2x;  y(0) = 0, y(π) = 0
4. y″ + 5y = e^x;  y(0) = 0, y(π) = 0
5. y″ + 9y = 0;  y′(0) = 0, y′(π) = 0
6. y″ + 9y = cos 2x;  y′(0) = 0, y′(π) = 0
7. y″ + 9y = x;  y′(0) = 0, y′(π) = 0


8. Let

    Ly = h;
    B₁(y) = γ₁,    B₂(y) = γ₂,

be a boundary-value problem of the type described by Eqs. (12-1) and (12-2) above. Prove that the solution set of this problem consists of all functions in 𝒞²[a, b] of the form y_p + y_h, where y_p is a fixed solution of the problem, and y_h a solution of the associated homogeneous problem

    Ly = 0;
    B₁(y) = B₂(y) = 0.

9. Find all eigenvalues and eigenfunctions for the boundary-value problem

    y″ + λy = 0,
    y(0) = y(2π),    y′(0) = y′(2π).

(Assume that λ is real.)

10. Consider the boundary-value problem

    y″ + 4y′ + (4 + 9λ)y = 0,
    y(0) = 0,    y(a) = 0,

with λ real.
(a) Show that this problem has no nontrivial solutions for λ ≤ 0. [Hint: Consider the cases λ < 0 and λ = 0 separately.]
(b) Show that the only positive values of λ for which this problem has nontrivial solutions are

    λ_n = n²π²/(9a²),    n = 1, 2, 3, …,

and find the corresponding solutions.

12-2 EIGENVALUES AND EIGENVECTORS


In this section we consider the general problem of solving an operator equation of the form

    Lx = λx,        (12-9)

where L is a linear transformation from S to 𝒱, S a subspace of 𝒱, and λ an unknown parameter which can assume real or complex values. Technically, this problem is known as the eigenvalue problem for the operator L, and requires that we find all λ for which (12-9) has nontrivial solutions, and all solutions corresponding to these λ. Our primary interest, of course, is in the case where S and 𝒱 are Euclidean spaces, and L a second-order linear differential operator. Nevertheless, we shall refrain from imposing these restrictions until Section 12-4 in order that our preliminary results be valid for general linear transformations and arbitrary vector spaces.
As usual, we begin by introducing some terminology.

Definition 12-1. The values of λ for which Eq. (12-9) has nonzero solutions are called the eigenvalues (or characteristic values) of L, and for each eigenvalue λ₀ the nonzero vectors in S which satisfy the equation Lx = λ₀x are called the eigenvectors (or characteristic vectors) of L belonging to λ₀.

The reader should observe that, by definition, the zero vector is never an eigenvector for L. Furthermore, zero is an eigenvalue for L if and only if the equation Lx = 0 has nonzero solutions, i.e., if and only if L is not one-to-one. Failure to appreciate these simple distinctions is a frequent source of confusion.
If λ₀ is an eigenvalue for L, and x₀ an eigenvector belonging to λ₀, then

    L(αx₀) = αL(x₀) = α(λ₀x₀) = λ₀(αx₀)

for all real numbers α. Thus αx₀ is also an eigenvector belonging to λ₀ whenever α is different from zero. This, combined with the obvious fact that the sum of two eigenvectors belonging to λ₀ is again an eigenvector belonging to λ₀, yields

Lemma 12-1. The solution set of the equation Lx = λ₀x is a nontrivial subspace of S for each eigenvalue λ₀ of L.

In other words, the zero vector together with the eigenvectors for L which belong to λ₀ constitute a subspace of S (and hence, by implication, of 𝒱 as well). We shall denote this subspace by S_{λ₀}, and observe in passing that dim S_{λ₀} ≥ 1 for all λ₀. Geometrically, L acts on S_{λ₀} by "stretching" each of its vectors by the scalar factor λ₀, as indicated in Fig. 12-1.
FIGURE 12-1

We have just seen that Lx belongs to S_λ₀ for all x in S_λ₀. This fact is some-
times expressed by saying that S_λ₀ is "invariant" under L, in accordance with the
following definition.

Definition 12-2. Let L: S → V be a linear transformation, and suppose that S is a subspace of V. Then a subspace W of S is said to be invariant under L if and only if Lw belongs to W for all w in W.

We hasten to point out that there is nothing in this definition to imply that the
nonzero vectors in an invariant subspace for L need be eigenvectors for L. Indeed,
as we shall see momentarily, such a conclusion is false. Rather, the implication
goes the other way: the S_λ₀ are invariant subspaces for L consisting of vectors
with the special property that Lx = λ₀x.

Having introduced the notion of invariant subspace, we can now rephrase
the definition of an eigenvector for a linear transformation L: S → V to read as
follows: A nonzero vector x in S is an eigenvector for L if and only if the one-dimensional subspace of S spanned by x is invariant under L. This observation is
frequently useful in the search for eigenvectors.

Example 1. Let e₁, e₂ be the standard basis vectors in ℝ², and let L: ℝ² → ℝ² be reflection across the e₁-axis; that is,

Le₁ = e₁,  Le₂ = −e₂.

Then, from geometric considerations alone, it is clear that the only subspaces of
ℝ² which are invariant under L are (i) the trivial subspace, (ii) ℝ² itself, and (iii)
the two one-dimensional subspaces spanned by e₁ and e₂. By the remark made a
moment ago, the last two must be eigenspaces for L, and are the only such. Further,
it is obvious from the definition of L that these subspaces are associated with the
eigenvalues 1 and −1, respectively.

Example 2. If L: ℝ² → ℝ² is reflection across the origin, then every subspace
of ℝ² is invariant under L. In this case Lx = −x for all x, and it follows that −1
is the only eigenvalue for L, and that S₋₁ = ℝ². The reader should note that
here the invariant subspace associated with the eigenvalue is two-dimensional.
Example 3. Let L be a rotation of ℝ² about the origin through an angle θ.
Then, if θ is not an integral multiple of π, there are no one-dimensional invariant
subspaces, and L has no eigenvectors at all.

Example 4. Let S be the subspace of C[0, π] consisting of all twice continuously
differentiable functions y such that y(0) = y(π) = 0, and let L: S → C[0, π] be
the operator −D². Then, by the example in the previous section, L has an in-
finite sequence of eigenvalues

1, 4, 9, …, n², …

with associated eigenvectors cₙ sin nx, cₙ ≠ 0.

All this is simple enough, but hardly explains why these notions were introduced
in the first place. The following theorem furnishes a partial answer to this question,
and gives an indication of the importance of eigenvectors in the study of linear
transformations.

Theorem 12-1. Any set of eigenvectors belonging to distinct eigenvalues for a linear transformation L: S → V is linearly independent in S.

(Note that this result would fail if 0 were allowed to be an eigenvector. Thus
the prejudice against the zero vector found in Definition 12-1.)

Proof. The theorem is obviously true when applied to a single eigenvector. Beyond
this we reason by induction, as follows.
Assume that the theorem has been proved for every set of n − 1 eigenvectors
for L, n > 1, let x₁, …, xₙ be n eigenvectors belonging, respectively, to distinct
eigenvalues λ₁, …, λₙ, and let

α₁x₁ + ⋯ + αₙ₋₁xₙ₋₁ + αₙxₙ = 0.  (12-10)

Then, applying L to both sides of this equation, we have

α₁Lx₁ + ⋯ + αₙ₋₁Lxₙ₋₁ + αₙLxₙ = 0,

or

α₁(λ₁x₁) + ⋯ + αₙ₋₁(λₙ₋₁xₙ₋₁) + αₙ(λₙxₙ) = 0.  (12-11)

We now multiply (12-10) by λₙ and subtract the resulting equation from (12-11)
to obtain

α₁(λ₁ − λₙ)x₁ + ⋯ + αₙ₋₁(λₙ₋₁ − λₙ)xₙ₋₁ = 0.

But, by assumption, the vectors x₁, …, xₙ₋₁ are linearly independent. Hence
each of the coefficients in this expression vanishes, and since λᵢ − λₙ ≠ 0 when-
ever i ≠ n, we conclude that αᵢ = 0 for i = 1, …, n − 1. This, together with
(12-10), implies that αₙ is also zero, and we are done. ∎
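The content of Theorem 12-1 is easy to illustrate numerically. In the sketch below (an illustration only; the matrix is an invented example, not one from the text), a matrix with three distinct eigenvalues is diagonalized, and the collected eigenvectors are checked to form a linearly independent set, i.e. a matrix of full rank:

```python
import numpy as np

# An arbitrary 3x3 matrix with three distinct eigenvalues (2, 3, and -1).
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [0.0, 0.0, -1.0]])

evals, evecs = np.linalg.eig(A)           # columns of evecs are eigenvectors
assert len(set(np.round(evals, 8))) == 3  # the eigenvalues are distinct

# Theorem 12-1: eigenvectors belonging to distinct eigenvalues are
# linearly independent, so the matrix whose columns they form has full rank.
assert np.linalg.matrix_rank(evecs) == 3
```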

EXERCISES

1. Every vector space V has at least two subspaces which are invariant under a linear transformation L: V → V. What are they?

2. Let L be a linear transformation on a finite dimensional vector space V (i.e., L: V → V), and let S₁ be invariant under L.

(a) Prove that there exists a subspace S₂ of V such that every vector x in V can be written in one and only one way as

x = x₁ + x₂

with x₁ in S₁, x₂ in S₂. [Hint: Choose an appropriate basis for V.]

(b) Is the subspace S₂ found in (a) necessarily unique? Why?
(c) Prove, by example, that it may be impossible to choose the subspace S₂ in (a) so that S₂ is invariant under L. [Hint: Let V = 𝒫ₙ, the space of polynomials of degree ≤ n, and let L be the differentiation operator.]


3. Let S₁ and S₂ be invariant under a linear transformation L: S → V, S a subspace of V. Prove that the subspace of S spanned by S₁ and S₂ is also invariant under L.

4. (a) Show that the null space of a linear transformation L: V → V is invariant under L.
(b) Let L: V → V be a linear transformation with the property that L² = L (i.e., L is idempotent on V). Prove that the image of L is an invariant subspace of V.

5. Let S denote the subspace of C²[a, b] determined by a pair of homogeneous boundary conditions

α₁y(a) + α₂y′(a) = 0,
β₁y(b) + β₂y′(b) = 0,

and let L: S → C[a, b] be the second-order linear differential operator defined by

Ly = (py′)′ + qy,

where p is a function in C¹[a, b] with p(x) > 0 for all x in [a, b], and q is an arbitrary function in C[a, b]. Let y₁ and y₂ be a basis for the null space of L, viewed as an operator from C²[a, b] to C[a, b]. Prove that zero is an eigenvalue for L: S → C[a, b] if and only if the determinant

| α₁y₁(a) + α₂y₁′(a)   α₁y₂(a) + α₂y₂′(a) |
| β₁y₁(b) + β₂y₁′(b)   β₁y₂(b) + β₂y₂′(b) |

vanishes.

12-3 EIGENVECTORS IN FINITE DIMENSIONAL SPACES


We have seen that eigenvectors belonging to distinct eigenvalues for a linear
transformation are linearly independent. In the finite dimensional case this fact
immediately yields

Theorem 12-2. A linear transformation L mapping an n-dimensional vector space V into itself has at most n distinct eigenvalues. Moreover, when the number of distinct eigenvalues is equal to n, any complete set of eigenvectors, one for each eigenvalue, is a basis for V, and the matrix of L with respect to such a basis is

[ λ₁  0  ⋯  0  ]
[ 0  λ₂  ⋯  0  ]
[ ⋮   ⋮  ⋱  ⋮  ]
[ 0   0  ⋯  λₙ ]

with the eigenvalues on the main diagonal and zeros elsewhere.*

* Such a matrix is said to be in diagonal form.

Of course, such bases need not exist for a given L: V → V (see Examples 2 and
3 above). When they do, however, a number of pleasant things happen. For one,
we can then solve operator equations involving L, and rather efficiently too. The
following example will illustrate the technique.

Example 1. Let L be a linear transformation mapping ℝ³ into itself, and suppose
that L has distinct eigenvalues λ₁, λ₂, λ₃. Let e₁, e₂, e₃ be eigenvectors belonging
to these eigenvalues, and consider the equation

Lx = y,  (12-12)

y known, x unknown. Then, since the vectors e₁, e₂, e₃ are a basis for ℝ³, we have

x = x₁e₁ + x₂e₂ + x₃e₃,
y = y₁e₁ + y₂e₂ + y₃e₃,

and (12-12) can be written

L(x₁e₁ + x₂e₂ + x₃e₃) = y₁e₁ + y₂e₂ + y₃e₃.

Hence

(x₁λ₁)e₁ + (x₂λ₂)e₂ + (x₃λ₃)e₃ = y₁e₁ + y₂e₂ + y₃e₃,

and it follows that x₁, x₂, x₃ must be chosen so that

x₁λ₁ = y₁,  x₂λ₂ = y₂,  x₃λ₃ = y₃.

In particular, we see that (12-12) has a unique solution

x = (y₁/λ₁)e₁ + (y₂/λ₂)e₂ + (y₃/λ₃)e₃

whenever the λᵢ are different from zero. If, on the other hand, one of the λᵢ, say
λ₁, is zero, (12-12) has no solutions at all unless y₁ = 0. In the latter case the
equation x₁λ₁ = y₁ is satisfied for all values of x₁, and the solution set of (12-12)
then consists of all vectors of the form

x = x₁e₁ + (y₂/λ₂)e₂ + (y₃/λ₃)e₃,

with x₁ arbitrary.

The generalization of these results to n-dimensional spaces is obvious, and has
been left to the reader.

The technique introduced in the above example is known as the eigenvalue


method for solving an operator equation. Its success depends upon the existence
of enough eigenvectors for L to span V, and upon our ability to find them. But
both of these questions can be settled by computing the eigenvalues for L, and
thus we now address ourselves to this problem.
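The eigenvalue method can be sketched numerically as follows (an illustration only; the matrix and right-hand side are invented for the example): expand y relative to a basis of eigenvectors, divide each component by its eigenvalue, and reassemble x.

```python
import numpy as np

# An invented matrix with three distinct nonzero eigenvalues (3, 1, 2).
A = np.array([[3.0, 0.0, 0.0],
              [1.0, 1.0, 0.0],
              [0.0, 2.0, 2.0]])
y = np.array([3.0, 2.0, 6.0])

lam, E = np.linalg.eig(A)    # columns of E: eigenvectors e_1, e_2, e_3
c = np.linalg.solve(E, y)    # components of y relative to the eigenbasis
x = E @ (c / lam)            # in that basis, x_i = y_i / lambda_i

assert np.allclose(A @ x, y)  # x solves Lx = y
```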
In Section 2-9 we saw that an equation of the form Lx = y involving a linear
transformation mapping an n-dimensional vector space V into itself can be
written in terms of a basis for V as a system of linear equations

α₁₁x₁ + α₁₂x₂ + ⋯ + α₁ₙxₙ = y₁,
α₂₁x₁ + α₂₂x₂ + ⋯ + α₂ₙxₙ = y₂,
⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯
αₙ₁x₁ + αₙ₂x₂ + ⋯ + αₙₙxₙ = yₙ,

where

[ α₁₁ α₁₂ ⋯ α₁ₙ ]
[ α₂₁ α₂₂ ⋯ α₂ₙ ]
[  ⋮   ⋮      ⋮  ]
[ αₙ₁ αₙ₂ ⋯ αₙₙ ]

is the matrix of L, and x₁, …, xₙ and y₁, …, yₙ are the components of x and y,
all with respect to the chosen basis. In particular, this is true of the equation

Lx = λx,

which, as we have already observed, is equivalent to

(L − λI)x = 0,

where I is the identity transformation on V. Noting that L − λI can be represented by the matrix

[ α₁₁ − λ   α₁₂       ⋯  α₁ₙ     ]
[ α₂₁       α₂₂ − λ   ⋯  α₂ₙ     ]
[  ⋮                       ⋮      ]
[ αₙ₁       αₙ₂       ⋯  αₙₙ − λ ]

we therefore conclude that the eigenvalues for L are simply the values of λ for
which the system of homogeneous equations

(α₁₁ − λ)x₁ + α₁₂x₂ + ⋯ + α₁ₙxₙ = 0,
α₂₁x₁ + (α₂₂ − λ)x₂ + ⋯ + α₂ₙxₙ = 0,
⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯          (12-13)
αₙ₁x₁ + αₙ₂x₂ + ⋯ + (αₙₙ − λ)xₙ = 0

has nontrivial solutions. But this will occur if and only if the determinant of the
coefficients of (12-13) vanishes (see Appendix III), i.e., if and only if

| α₁₁ − λ   α₁₂       ⋯  α₁ₙ     |
| α₂₁       α₂₂ − λ   ⋯  α₂ₙ     |  = 0.  (12-14)
|  ⋮                       ⋮      |
| αₙ₁       αₙ₂       ⋯  αₙₙ − λ |

Thus the eigenvalues for L can be computed by solving (12-14) for λ, and since
the left-hand side of this equation is an nth degree polynomial in λ, this can even
be done by the methods of elementary algebra (at least for small values of n).
The polynomial appearing in (12-14) is known as the characteristic polynomial of
the linear transformation L, and the equation itself is called the characteristic
equation of L. As its name suggests, the characteristic polynomial is independent
of the particular basis used to compute it; a fact which is also proved in Ap-
pendix III.
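The route from matrix to characteristic polynomial to eigenvalues can be mirrored in a short sketch (the matrix is an invented example; note that `np.poly` returns the coefficients of det(λI − A), which differs from det(A − λI) only by the sign (−1)ⁿ and so has the same roots):

```python
import numpy as np

# An invented upper-triangular matrix, so its eigenvalues are 2, 3, 5.
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [0.0, 0.0, 5.0]])

coeffs = np.poly(A)       # coefficients of the characteristic polynomial
roots = np.roots(coeffs)  # its roots...

# ...are exactly the eigenvalues of A.
assert np.allclose(sorted(roots.real), sorted(np.linalg.eigvals(A).real))
```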

Example 2. Find the eigenvalues and eigenvectors for the linear transformation
L: ℝ² → ℝ², given that the matrix of L with respect to the standard basis e₁, e₂ is

[ 1 0 ]
[ 2 1 ]

In this case the characteristic equation of L is

| 1 − λ   0     |
| 2       1 − λ |  = 0,

or

(1 − λ)² = 0,

and it follows that λ = 1 is the only eigenvalue for L. Hence a nonzero vector
x = x₁e₁ + x₂e₂ will be an eigenvector for L if and only if Lx = x. Rewriting
this equation in matrix form as

[ 1 0 ] [ x₁ ]   [ x₁ ]
[ 2 1 ] [ x₂ ] = [ x₂ ],

we find that x₁ and x₂ must satisfy the equations

x₁ = x₁,
2x₁ + x₂ = x₂.

Thus x₁ must be zero, while x₂ is arbitrary, and the eigenvectors for L are of the
form x₂e₂, x₂ ≠ 0. Finally, the eigenspace is the one-dimensional subspace of ℝ²
spanned by e₂.

The reader is encouraged to interpret these results geometrically by viewing L
as a "shear" of the xy-plane parallel to the y-axis.

Example 3. Find the eigenvalues and eigenvectors of the linear transformation
L on ℝ³ whose matrix with respect to the standard basis e₁, e₂, e₃ is

[ 0  0  1 ]
[ 0 −1  0 ]
[ 2  2  1 ]

Since the characteristic polynomial of L is

| −λ    0       1     |
|  0   −1 − λ   0     |  = −(λ − 2)(λ + 1)²,
|  2    2       1 − λ |

the eigenvalues for L are λ = 2 and λ = −1. To find the associated eigenvectors we set x = x₁e₁ + x₂e₂ + x₃e₃, and solve the equation

Lx = λx  (12-15)

when λ = 2 and λ = −1.

In the first case, (12-15) becomes

[ 0  0  1 ] [ x₁ ]   [ 2x₁ ]
[ 0 −1  0 ] [ x₂ ] = [ 2x₂ ]
[ 2  2  1 ] [ x₃ ]   [ 2x₃ ],

and we have

x₃ = 2x₁,
−x₂ = 2x₂,
2x₁ + 2x₂ + x₃ = 2x₃.

Thus x₂ = 0, 2x₁ = x₃, and the relevant eigenvectors are x₁e₁ + 2x₁e₃, x₁ an
arbitrary nonzero constant. Here the associated invariant subspace, S₂, is the
one-dimensional subspace of ℝ³ spanned by e₁ + 2e₃.

Finally, a similar computation reveals that the eigenvectors belonging to the
eigenvalue −1 are of the form x₁e₁ − x₁e₃, x₁ an arbitrary nonzero constant.
In this case the associated invariant subspace is the one-dimensional subspace of
ℝ³ spanned by the vector e₁ − e₃.
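Assuming the matrix of Example 3 is the one reconstructed from its defining equations (x₃ = 2x₁, −x₂ = 2x₂, 2x₁ + 2x₂ + x₃ = 2x₃ for λ = 2), the stated eigenpairs can be checked directly:

```python
import numpy as np

# Matrix of Example 3, as reconstructed here from the equations in the text.
A = np.array([[0.0, 0.0, 1.0],
              [0.0, -1.0, 0.0],
              [2.0, 2.0, 1.0]])

# Eigenvalue 2 with eigenvector e1 + 2*e3:
v2 = np.array([1.0, 0.0, 2.0])
assert np.allclose(A @ v2, 2 * v2)

# Eigenvalue -1 with eigenvector e1 - e3:
vm1 = np.array([1.0, 0.0, -1.0])
assert np.allclose(A @ vm1, -vm1)
```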

Example 4. In the preceding section we invoked a geometric argument to prove
that any rotation of ℝ² through an angle θ ≠ nπ has no (real) eigenvalues or
eigenvectors. We now establish this fact algebraically, as follows.

The matrix of a rotation L of ℝ² with respect to the standard basis is

[ cos θ  −sin θ ]
[ sin θ   cos θ ]

where θ denotes the angle of rotation. Hence the characteristic equation of L is

| cos θ − λ   −sin θ    |
| sin θ        cos θ − λ |  = 0,

and we have

λ² − 2(cos θ)λ + 1 = 0.

Thus

λ = cos θ ± i sin θ,

and it follows that λ is real if and only if θ = nπ. Moreover, when this is the case,
λ assumes one of the values ±1, and has all of ℝ² as its invariant subspace.
Otherwise, λ is complex, and L has no eigenvectors.
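The same conclusion can be checked numerically for any particular angle (the value of θ below is arbitrary):

```python
import numpy as np

theta = 0.7   # any angle that is not an integral multiple of pi
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

lam = np.linalg.eigvals(R)
# lambda = cos(theta) +/- i sin(theta): both eigenvalues are complex,
# so the rotation has no real eigenvectors.
assert np.allclose(sorted(lam.imag), sorted([-np.sin(theta), np.sin(theta)]))
assert all(abs(l.imag) > 1e-12 for l in lam)
```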

EXERCISES

1. Find all eigenvalues and eigenvectors for the linear transformations on ℝ² defined
by the following matrices.

(a) 1 1 (b) 1 (c) 1 1 (d) 1 4

2 1 1 1 1

2. Find all eigenvalues for the linear transformations on ℝ³ defined by the following
matrices, and in each case find the eigenvectors belonging to the real eigenvalues.

(a) 2 1 (b) 1 2

-1 2 3 1 1

1 2 1 1

(c) -1 1 (d) 2

1 2 2 -1
-1 1 1

3. Repeat Exercise 2 for

(a) 1 1 (b) 1 1

1 2

1 1 1 -1
(c) 5 -6 -6 (d) 1 2 1

-1 4 2 1 2 1

3 -6 -4 1 2

4. Find the eigenvalues and eigenvectors for the linear transformations on ℝ⁴ defined
by the matrices

(a) 1 -1 (b) -1 2 1 3

1 1 -2 1

-1 1 2 1 2 -3
1 -1 4

5. Let L: ℝ³ → ℝ³ be the linear transformation whose matrix with respect to the
standard basis e₁, e₂, e₃ is

1 2

2 1_

Use the eigenvalue method to solve the equation Lx = y for x, given that

(a) y = 2e₁ + e₂; (b) y = e₁ + e₃; (c) y = 4e₁ − 2e₂ − 2e₃.

6. Repeat Exercise 5 when the matrix of L is

2 r
1 3

.0 2
and
(a) y = −2e₁ + e₂;
(b) y = 4e₁ + 4e₂ + 2e₃;
(c) y = e₁ + 9e₂ + 2e₃.

7. Prove that every linear transformation on an odd-dimensional vector space has at
least one real eigenvalue. Give a geometric interpretation of this result for linear
transformations on ℝ³.

8. An (n × n)-matrix is said to be triangular if all of the entries above (or below) the
main diagonal are zero. Prove that each of the diagonal entries in such a matrix is an
eigenvalue for the linear transformation on ℝⁿ defined by the matrix.

"9. (a) Let L be a linear transformation on a finite dimensional vector space V, and
let Xo be a real eigenvalue for L of multiplicity m, by which we mean that (X — Xo) TO
is a factor of the characteristic polynomial for L. Prove that the dimension of the
invariant subspace associated with Xo is a? most m. [Hint: Consider the character-
istic polynomial of the linear transformation obtained by restricting L to the sub-
space S Xo .]
(b) Give an example to show that this dimension can, in certain cases, be less than
m. [Hint: Consider the operator — D on the space of polynomials (P„.]

12-4 SYMMETRIC LINEAR TRANSFORMATIONS


In this section we begin the process of extending the eigenvalue method introduced
above to include operator equations defined on infinite dimensional Euclidean
spaces. Our objective is to isolate a class of linear operators whose eigenvectors
can be used to construct bases, and which, at the same time, includes the operators
arising in the study of boundary-value problems. Clearly, the first step in such a
program is to select a criterion which will guarantee that eigenvectors belonging
to distinct eigenvalues are orthogonal, and to this end we now introduce the notion
of a symmetric linear transformation, as follows.

Definition 12-3. Let L be a linear transformation from S to V, where S and V are Euclidean spaces, S a subspace of V. Then L is said to be symmetric with respect to the inner product on V if and only if

(Lx)·y = x·(Ly)  (12-16)

for all x and y in S.

Before giving any examples, we prove that this definition accomplishes our objective by establishing

Theorem 12-3. Every pair of eigenvectors belonging to distinct eigenvalues for a symmetric linear transformation L: S → V are orthogonal in V.

Proof. Let x₁ and x₂ be nonzero vectors in S, and suppose that Lx₁ = λ₁x₁,
Lx₂ = λ₂x₂, with λ₁ ≠ λ₂. Then

(Lx₁)·x₂ = λ₁(x₁·x₂),
x₁·(Lx₂) = λ₂(x₁·x₂),

and since (Lx₁)·x₂ = x₁·(Lx₂), it follows that

(λ₁ − λ₂)(x₁·x₂) = 0.

But, by assumption, λ₁ − λ₂ ≠ 0. Hence x₁·x₂ = 0, as asserted. ∎

Example 1. Let L be a symmetric linear transformation mapping a finite dimensional Euclidean space V into itself, and let e₁, …, eₙ be an orthonormal
basis in V. Then if

Leⱼ = α₁ⱼe₁ + ⋯ + αₙⱼeₙ,  j = 1, …, n,

we have

eᵢ·(Leⱼ) = eᵢ·(α₁ⱼe₁ + ⋯ + αₙⱼeₙ)
         = α₁ⱼ(eᵢ·e₁) + ⋯ + αᵢⱼ(eᵢ·eᵢ) + ⋯ + αₙⱼ(eᵢ·eₙ)
         = αᵢⱼ,

the last step following from the fact that

eᵢ·eⱼ = 0 if i ≠ j,  eᵢ·eⱼ = 1 if i = j.

On the other hand,

(Leᵢ)·eⱼ = (α₁ᵢe₁ + ⋯ + αₙᵢeₙ)·eⱼ
         = α₁ᵢ(e₁·eⱼ) + ⋯ + αⱼᵢ(eⱼ·eⱼ) + ⋯ + αₙᵢ(eₙ·eⱼ)
         = αⱼᵢ,

and the equality (Leᵢ)·eⱼ = eᵢ·(Leⱼ) implies that αᵢⱼ = αⱼᵢ for all i and all j.
Thus the matrix of L with respect to the basis e₁, …, eₙ is a symmetric matrix
in the sense that its rows and columns may be interchanged without changing the
matrix itself.

Conversely, suppose that the matrix of L is symmetric with respect to an orthonormal basis e₁, …, eₙ in V, and let

Leⱼ = α₁ⱼe₁ + ⋯ + αₙⱼeₙ,  j = 1, …, n.

Then

(Leᵢ)·eⱼ = αⱼᵢ = αᵢⱼ = eᵢ·(Leⱼ)

for all i and j, and hence if x and y are arbitrary vectors in V with

x = x₁e₁ + ⋯ + xₙeₙ,
y = y₁e₁ + ⋯ + yₙeₙ,

we have

(Lx)·y = (Σᵢ xᵢLeᵢ)·(Σⱼ yⱼeⱼ)
       = Σᵢ Σⱼ xᵢyⱼ [(Leᵢ)·eⱼ]
       = Σᵢ Σⱼ xᵢyⱼ [eᵢ·(Leⱼ)]
       = (Σᵢ xᵢeᵢ)·(Σⱼ yⱼLeⱼ)
       = x·(Ly),

the sums running from 1 to n. Thus L is symmetric on V, and we have proved

Theorem 12-4. A linear transformation on a finite dimensional Euclidean space V is symmetric if and only if the matrix of the transformation with respect to any orthonormal basis in V is a symmetric matrix.

Example 2. Let L: ℝ³ → ℝ³ be defined by the matrix

[ 1 0 1 ]
[ 0 1 0 ]
[ 1 0 1 ]

with respect to the standard basis e₁, e₂, e₃. Then, by the above theorem, L is a
symmetric linear transformation, and an easy computation reveals that its characteristic equation is

λ(λ − 1)(λ − 2) = 0.

Hence the eigenvalues for L are 0, 1, 2, and Theorems 12-2 and 12-3 imply that
any complete set of eigenvectors for L will be an orthogonal basis for ℝ³; a fact
which can be readily verified by direct computation. Finally, we note that the
matrix of L with respect to such a basis assumes the diagonal form

[ 0 0 0 ]
[ 0 1 0 ]
[ 0 0 2 ]
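Taking the matrix of Example 2 as reconstructed here, the promised orthogonal eigenbasis and the diagonal form can both be verified in a few lines (`eigh` is numpy's routine for symmetric matrices; it returns an orthonormal set of eigenvectors):

```python
import numpy as np

# Matrix of Example 2 (as reconstructed here): symmetric, so Theorem 12-3
# guarantees eigenvectors for distinct eigenvalues are orthogonal.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0]])

lam, Q = np.linalg.eigh(A)
assert np.allclose(sorted(lam), [0.0, 1.0, 2.0])    # eigenvalues 0, 1, 2
assert np.allclose(Q.T @ Q, np.eye(3))              # orthonormal eigenbasis
assert np.allclose(Q.T @ A @ Q, np.diag(lam))       # diagonal form
```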

Example 3. Let S denote the subspace of C[0, π] consisting of all twice
continuously differentiable functions y such that y(0) = y(π) = 0, and let
L: S → C[0, π] be the operator −D². Then, using integration by parts, we find
that if y₁ and y₂ belong to S,

(Ly₁)·y₂ = −∫₀^π y₁″(x)y₂(x) dx
         = −y₁′(x)y₂(x)|₀^π + ∫₀^π y₁′(x)y₂′(x) dx

and

y₁·(Ly₂) = −∫₀^π y₁(x)y₂″(x) dx
         = −y₁(x)y₂′(x)|₀^π + ∫₀^π y₁′(x)y₂′(x) dx.

But, by assumption, y₁(0) = y₁(π) = 0, y₂(0) = y₂(π) = 0. Thus

(Ly₁)·y₂ = ∫₀^π y₁′(x)y₂′(x) dx = y₁·(Ly₂),

and L is symmetric on S. Here again Theorem 12-3 applies, and we can assert
that any complete set of eigenvectors for L is an orthogonal set in C[0, π]. This
agrees with the results obtained in Section 12-1, where we found that the eigenvalues for L are the integers λₙ = n², n = 1, 2, …, and that the corresponding
eigenvectors (or eigenfunctions, as they are called in this case) are cₙ sin nx, cₙ an
arbitrary nonzero constant.
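The symmetry of −D² on S can also be observed numerically. The sketch below (an illustration; the two functions are chosen arbitrarily from S, and neither is an eigenfunction) approximates both inner products by the trapezoidal rule and finds them equal:

```python
import numpy as np

x = np.linspace(0.0, np.pi, 4001)
dx = x[1] - x[0]

def ip(f, g):
    """Trapezoidal approximation of the inner product of f and g on [0, pi]."""
    h = f * g
    return dx * (h.sum() - 0.5 * (h[0] + h[-1]))

# Two functions in S (both vanish at 0 and pi), with their second derivatives.
y1 = x * (np.pi - x); d2y1 = np.full_like(x, -2.0)   # y1'' = -2
y2 = np.sin(x);       d2y2 = -np.sin(x)              # y2'' = -sin x

lhs = ip(-d2y1, y2)   # (L y1) . y2
rhs = ip(y1, -d2y2)   # y1 . (L y2)

assert abs(lhs - rhs) < 1e-4   # symmetry of L = -D^2 on S
assert abs(lhs - 4.0) < 1e-4   # both inner products equal 4 exactly
```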

The reader may have noticed that all of the eigenvalues for the linear transformations considered in the last two examples were real. This, as it turns out, is
always true of symmetric linear transformations; a fact which we state formally as

Theorem 12-5. All of the eigenvalues for a symmetric linear transformation are real.

The proof of this result, though not difficult, would necessitate introducing complex inner product spaces, and is therefore omitted. The interested reader can
find the argument in any standard text on linear algebra.

EXERCISES

1. Let L be the symmetric linear transformation on ℝ² defined by the matrix

[ a b ]
[ b c ]

with respect to the standard basis.

(a) Show that L has two distinct real eigenvalues except in the trivial case where
b = 0 and a = c.
(b) Find a basis for ℝ² composed of eigenvectors for L in the nontrivial case.

2. Let L₁ and L₂ be symmetric linear transformations mapping V into itself. Prove that
the transformation L₁L₂ is symmetric if and only if L₁L₂ = L₂L₁.
3. Let S denote the subspace of C[0, π] consisting of all twice continuously differentiable
functions y such that y′(0) = y′(π) = 0, and let L: S → C[0, π] be the operator −D².
Prove that L is a symmetric linear transformation, and find its eigenvalues.

4. Determine whether or not the following linear transformations are symmetric on the
Euclidean space of polynomials with inner product p·q = ∫₋₁¹ p(x)q(x) dx.
(a) Lp(x) = xp(x); (b) Lp(x) = p′(x);
(c) Lp(x) = p(x + 1) − p(x); (d) Lp(x) = [p(x) − p(−x)]/2.

5. Let 𝓕 denote the set of all real-valued functions of an integer variable k,
−∞ < k < ∞, and let Δ²: 𝓕 → 𝓕 be defined by

(Δ²F)(k) = F(k + 1) − 2F(k) + F(k − 1)

for all F in 𝓕 and all k.

(a) Show that 𝓕 is a real vector space under the usual definitions of addition and
scalar multiplication, and that Δ² is a linear transformation on 𝓕.
(b) Find the null space of Δ².

*6. Let 𝓕₀ denote the space of all real-valued functions defined on the finite set of integers
0, 1, …, N, N + 1, and for each pair of functions F, G in 𝓕₀, set

F·G = Σ_{k=0}^{N+1} F(k)G(k).

(a) Prove that 𝓕₀ is a Euclidean space.

(b) Let 𝓕₁ be the subspace of 𝓕₀ consisting of those functions F for which F(0) =
F(N + 1) = 0, and let Δ² be defined as in Exercise 5 above. Prove that Δ² is a
symmetric linear transformation on 𝓕₁. [The symbols F(−1), F(N + 2), G(−1), and
G(N + 2) will appear in the computations, but no matter what values they are given,
the asserted result holds.]
(c) Find the eigenvalues and eigenvectors for the operator Δ² on 𝓕₁, and check by
direct computation that the eigenvectors belonging to distinct eigenvalues are orthogonal.

12-5 SELF-ADJOINT DIFFERENTIAL OPERATORS;


STURM-LIOUVILLE PROBLEMS

We have seen that the operator −D² becomes a symmetric linear transformation


when restricted to the space of twice continuously differentiable functions on
[0, 7r] which vanish at the endpoints of the interval. Such behavior is typical of a
large number of differential operators, and when properly generalized furnishes
the key to the study of boundary-value problems. To effect this generalization we
now introduce the class of self-adjoint linear differential operators.

Definition 12-4. A second-order linear differential operator L defined on an interval [a, b] is said to be in self-adjoint form if

L = D(p(x)D) + q(x),  (12-17)

where p is any function in C¹[a, b] such that p(x) > 0, or p(x) < 0, for all x in the open interval (a, b), and q is an arbitrary function in C[a, b].

Despite its appearance, (12-17) is sufficiently general to include all normal
second-order linear differential operators on [a, b]. For if L = a₂(x)D² +
a₁(x)D + a₀(x) is such an operator, then L can be written in self-adjoint form
by setting

p(x) = e^{∫[a₁(x)/a₂(x)] dx}

and

q(x) = (a₀(x)/a₂(x)) e^{∫[a₁(x)/a₂(x)] dx}

(see Section 6-3 and Exercise 1 below). Thus, without any real loss of generality,
we can (and shall) restrict ourselves to the study of self-adjoint operators, and to
differential equations of the form

d/dx (p(x) dy/dx) + q(x)y = h(x).  (12-18)

Finally, we note that the function p appearing as the leading coefficient in this
equation is allowed to vanish at the endpoints of [a, b]. This fact will be of some
importance later.
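As an illustration of the recipe (using Bessel's operator x²D² + xD + (x² − ν²) as the example, with ν an arbitrary order chosen for this sketch), the formulas give p(x) = e^{∫(1/x)dx} = x and q(x) = x − ν²/x. Since p′ = (a₁/a₂)p and q = (a₀/a₂)p, the self-adjoint form equals (p/a₂) times the original operator, which the sketch below checks pointwise for y = sin x:

```python
import math

nu = 2.0   # an arbitrary order for the example

def bessel_form(x, y, dy, d2y):
    """a2*y'' + a1*y' + a0*y for x^2 D^2 + x D + (x^2 - nu^2)."""
    return x**2 * d2y + x * dy + (x**2 - nu**2) * y

def self_adjoint_form(x, y, dy, d2y):
    """D(p D)y + q y with p(x) = x, q(x) = x - nu^2/x, from the recipe."""
    # (p y')' + q y = p y'' + p' y' + q y, with p = x and p' = 1
    return x * d2y + 1.0 * dy + (x - nu**2 / x) * y

for x in [0.5, 1.0, 2.0, 3.7]:
    y, dy, d2y = math.sin(x), math.cos(x), -math.sin(x)
    # the self-adjoint form is (p/a2) = 1/x times the original operator
    assert abs(self_adjoint_form(x, y, dy, d2y)
               - bessel_form(x, y, dy, d2y) / x) < 1e-12
```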
Our immediate objective is to determine conditions under which a self-adjoint
operator will be symmetric when viewed as a linear transformation from S to
C[a, b], S a subspace of C²[a, b] determined by a pair of homogeneous boundary
conditions

α₁y(a) + α₂y(b) + α₃y′(a) + α₄y′(b) = 0,
β₁y(a) + β₂y(b) + β₃y′(a) + β₄y′(b) = 0.  (12-19)

To this end we first prove the following lemma.



Lemma 12-2. (The Lagrange Identity.) If

L = D(p(x)D) + q(x)

is any self-adjoint linear differential operator on [a, b], and if y₁ and y₂ are twice differentiable on [a, b], then

y₁(Ly₂) − (Ly₁)y₂ = [p(y₁y₂′ − y₂y₁′)]′.  (12-20)

(As usual, the primes denote differentiation.)

Proof. We simply apply the definition of L, and rearrange terms, as follows:

y₁(Ly₂) − (Ly₁)y₂ = y₁[(py₂′)′ + qy₂] − y₂[(py₁′)′ + qy₁]
                  = y₁(py₂′)′ − y₂(py₁′)′
                  = y₁[py₂″ + p′y₂′] − y₂[py₁″ + p′y₁′]
                  = p′[y₁y₂′ − y₂y₁′] + p[y₁y₂″ − y₂y₁″]
                  = [p(y₁y₂′ − y₂y₁′)]′. ∎
Formula (12-20) can be written in much more suggestive form by integrating
from a to b. For then its left-hand side becomes y₁·(Ly₂) − (Ly₁)·y₂, and we
therefore have

y₁·(Ly₂) − (Ly₁)·y₂ = p(y₁y₂′ − y₂y₁′)|ₐᵇ,  (12-21)

from which we immediately deduce

Theorem 12-6. Let S be a subspace of C²[a, b] determined by boundary conditions of the type given in (12-19), and let L be any self-adjoint linear differential operator mapping S into C[a, b]. Then L will be symmetric with respect to the standard inner product on C[a, b] if and only if

p(y₁y₂′ − y₂y₁′)|ₐᵇ = 0  (12-22)

for all y₁ and y₂ in S; i.e., if and only if

p(b)[y₁(b)y₂′(b) − y₂(b)y₁′(b)] − p(a)[y₁(a)y₂′(a) − y₂(a)y₁′(a)] = 0.

Before giving any examples, we use Theorem 12-6 to determine several of the
more obvious and important boundary conditions which lead to symmetric
operators.

Case 1. p(a) = p(b) = 0. Here (12-22) is satisfied without restriction, and we
can set S = C²[a, b].

Case 2. Let S be the set of all y in C²[a, b] such that

α₁y(a) + α₂y′(a) = 0,
β₁y(b) + β₂y′(b) = 0,  (12-23)

with |α₁| + |α₂| ≠ 0 and |β₁| + |β₂| ≠ 0. (These last conditions are imposed
to force at least one of the α's and one of the β's to be different from zero.) Then,
if y₁ and y₂ are any two functions in S,

y₁(a)y₂′(a) − y₂(a)y₁′(a) = 0

and

y₁(b)y₂′(b) − y₂(b)y₁′(b) = 0.

Hence (12-22) vanishes on S, as desired.

A boundary condition of the form

α₁y(a) + α₂y′(a) = 0,

which involves the values of y and y′ at only a single point, is said to be unmixed.
In these terms the above argument asserts that a self-adjoint linear differential
operator is symmetric on every subspace of C²[a, b] described by a pair of unmixed
boundary conditions. (Note that only one such condition need be given if p vanishes
at a or at b.)

Case 3. Assume that p(a) = p(b), and let S be the subspace of C²[a, b] consisting of all y such that

y(a) = y(b),
y′(a) = y′(b).  (12-24)

Then (12-22) is obviously satisfied for all y₁ and y₂ in S, and L is again symmetric.
This is known as the case of periodic boundary conditions.

Example 1. Let S be the subspace of C²[0, π] consisting of all functions y satisfying the pair of unmixed boundary conditions

y(0) = y(π) = 0,

and let L = −D². Then, by Case 2, L is symmetric on S.

Example 2. By Case 3, the operator −D² is symmetric on the subspace of
C²[0, 2π] described by the periodic boundary conditions

y(0) = y(2π),
y′(0) = y′(2π).  (12-25)

To find its eigenvalues and eigenfunctions we again apply the given boundary
conditions to the general solution of

y″ + λy = 0.  (12-26)

The argument proceeds by cases, as with the example in Section 12-1.*

* Note that by Theorem 12-5 we need only consider real values of λ.



When λ < 0,

y = c₁e^{√(−λ) x} + c₂e^{−√(−λ) x},

and (12-25) implies that c₁ = c₂ = 0. Hence there are no negative eigenvalues.

When λ = 0, the general solution of (12-26) is c₁ + c₂x, and the boundary
conditions can be satisfied by setting c₂ = 0 and leaving c₁ arbitrary. Thus
λ = 0 is an eigenvalue, and its associated eigenfunctions are the constant functions on [0, 2π].

Finally, when λ > 0,

y = c₁ sin √λ x + c₂ cos √λ x,

and (12-25) leads to the pair of equations

c₁[1 − cos(2π√λ)] = −c₂ sin(2π√λ),
c₂[1 − cos(2π√λ)] = c₁ sin(2π√λ),

which can be satisfied with y ≢ 0 by setting √λ = 1, 2, 3, …. Thus the integers

1², 2², 3², …

are eigenvalues, and the invariant subspace associated with n² is the two-dimensional subspace of C[0, 2π] spanned by the functions sin nx and cos nx.
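The doubly degenerate spectrum 0, 1², 1², 2², 2², … can be seen in a finite-difference sketch of the periodic problem (an approximation for illustration; the grid size is arbitrary):

```python
import numpy as np

# Discretize -D^2 on [0, 2*pi] with periodic boundary conditions.
N = 400
h = 2 * np.pi / N
A = (2 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)) / h**2
A[0, -1] = A[-1, 0] = -1 / h**2   # wrap-around entries enforce periodicity

lam = np.sort(np.linalg.eigvalsh(A))
# The smallest eigenvalues approximate 0, 1, 1, 4, 4: lambda = 0 is simple,
# while each n^2 (n >= 1) has a two-dimensional eigenspace (sin nx, cos nx).
assert abs(lam[0]) < 1e-8
assert np.allclose(lam[1:5], [1.0, 1.0, 4.0, 4.0], atol=1e-2)
```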

Boundary-value problems involving self-adjoint linear differential operators
with mutually orthogonal eigenfunctions are usually referred to as Sturm-Liouville
problems, after the mathematicians J. C. F. Sturm and J. Liouville who first investigated them. Thus, each of the problems considered above was a Sturm-Liouville problem. More generally, a Sturm-Liouville problem, or system, is, by
definition, a second-order homogeneous linear differential equation of the form

d/dx (p(x) dy/dx) + [q(x) − λ]y = 0,

p and q as above, together with a pair of (homogeneous) boundary conditions
chosen in such a way that eigenfunctions belonging to distinct eigenvalues for the
operator

D(p(x)D) + q(x)

are orthogonal.* In the next chapter we shall see that such systems arise naturally
in the study of boundary-value problems involving partial differential equations,
and for this reason are extremely important in physics and applied mathematics.

* This terminology will be generalized in Section 12-8 to a somewhat wider class of problems.

EXERCISES

1. Prove that every normal second-order linear differential operator a₂(x)D² +
a₁(x)D + a₀(x) can be put in self-adjoint form by setting

p(x) = e^{∫[a₁(x)/a₂(x)] dx},

q(x) = (a₀(x)/a₂(x)) e^{∫[a₁(x)/a₂(x)] dx}.

2. Rewrite each of the following linear differential operators in self-adjoint form.

(a) D² + (1/x)D + 1, x > 0
(b) (cos x)D² + (sin x)D − 1, −π/2 < x < π/2
(c) x²D² + xD + (x² − p²), x > 0, p a real number
(d) (1 − x²)D² − 2xD + n(n + 1), −1 < x < 1, n a non-negative integer

3. Find all eigenvalues and eigenvectors for the Sturm-Liouville system

y″ + λy = 0,

y′(−π) = 0,  y′(π) = 0.

4. Repeat Exercise 3 for the Sturm-Liouville system

y″ + λy = 0,

y(0) = 0,  y′(π) = 0.

12-6 FURTHER EXAMPLES

In this section we consider a number of Sturm-Liouville problems which will be
encountered repeatedly in our later work, and which, for convenience of reference, we solve here, once and for all. In each case we shall limit ourselves to
finding eigenvalues and eigenfunctions, and postpone any discussion of how this
information is used to solve specific boundary-value problems involving the given
operator.

Example 1. Solve the Sturm-Liouville system

y″ + λy = 0,

y(0) = 0,  y(L) = 0.

This is a variant of a problem we have already considered many times over,
and this time we find that the constants

λₙ = n²π²/L²,  n = 1, 2, …,

and functions

yₙ(x) = sin(nπx/L),  n = 1, 2, …,

are complete sets of eigenvalues and eigenfunctions.

Example 2. Solve the Sturm-Liouville system

   y'' + λy = 0,
   y'(0) = 0,   y'(L) = 0.

A computation similar in all respects to the one used to solve Example 1 reveals that the eigenvalues for this problem are the non-negative constants

   λₙ = n²π²/L²,   n = 0, 1, 2, . . . ,

and that

   yₙ(x) = cos (nπx/L),   n = 0, 1, 2, . . . ,

is a complete set of eigenfunctions. The details are left to the reader.

Example 3. Solve the Sturm-Liouville system

   y'' + λy = 0,
   y(0) = 0,
   hy(L) + y'(L) = 0,

given that h and L are positive constants. (Note that the boundary conditions are unmixed, and that the problem falls under Case 2 above.)
As usual, we argue by cases, depending upon the algebraic sign of λ, and again find that there are no eigenvalues ≤ 0. On the other hand, when λ > 0,

   y = c₁ sin √λ x + c₂ cos √λ x,

and the first boundary condition implies that c₂ = 0. Thus it remains (if possible) to choose λ so that the function

   y = c₁ sin √λ x,

with c₁ ≠ 0, satisfies the equation hy(L) + y'(L) = 0. This, in turn, implies that λ must be chosen so that

   sin (√λ L) = −(√λ/h) cos (√λ L),                      (12-27)

an equation which we rewrite as

   tan μ = −μ/(hL)                                       (12-28)

FIGURE 12-2

by setting μ = √λ L. Although it is impossible to solve (12-27) explicitly for λ, its solutions can be visualized as arising, via (12-28), from the points of intersection of the graphs of the functions tan μ and −μ/(hL). As indicated in Fig. 12-2, there are infinitely many such points

   . . . , μ₋₂, μ₋₁, μ₀ = 0, μ₁, μ₂, . . .

located symmetrically across the origin. Thus the given problem has an infinite number of positive eigenvalues

   λₙ = μₙ²/L²,   n = 1, 2, . . . ,

with λₙ < λₙ₊₁ for all n, and lim_{n→∞} λₙ = ∞. [Also note that from the geometry of the situation we have lim_{n→∞} (μₙ₊₁ − μₙ) = π.] Finally, the functions

   yₙ(x) = sin (μₙx/L) = sin √λₙ x,   n = 1, 2, . . . ,

constitute a complete set of eigenfunctions for this problem.
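Although (12-28) cannot be solved in closed form, its roots are easy to approximate numerically. The Python sketch below is our own illustration, not part of the text (the sample values h = L = 1 are arbitrary choices): each positive root μₙ of tan μ = −μ/(hL) lies between the asymptote (2n − 1)π/2 and nπ, where f(μ) = tan μ + μ/(hL) changes sign, so plain bisection converges.

```python
import math

def eigen_mu(h, L, count):
    """Approximate the first `count` positive roots mu_n of
    tan(mu) = -mu/(h*L).  Root n lies in ((2n-1)pi/2, n*pi), where
    f(mu) = tan(mu) + mu/(h*L) runs from -infinity up to n*pi/(h*L) > 0."""
    roots = []
    for n in range(1, count + 1):
        f = lambda mu: math.tan(mu) + mu / (h * L)
        lo = (n - 0.5) * math.pi + 1e-9   # just past the asymptote of tan
        hi = n * math.pi                  # tan vanishes here, so f(hi) > 0
        for _ in range(100):              # plain bisection
            mid = 0.5 * (lo + hi)
            if f(lo) * f(mid) <= 0:
                hi = mid
            else:
                lo = mid
        roots.append(0.5 * (lo + hi))
    return roots

mus = eigen_mu(1.0, 1.0, 5)               # h = L = 1 (arbitrary sample)
lams = [mu ** 2 for mu in mus]            # lambda_n = mu_n^2 / L^2 with L = 1
```

With h = L = 1 the first root is μ₁ ≈ 2.0288, each μₙ satisfies the boundary relation sin μₙ + μₙ cos μₙ = 0, and successive differences μₙ₊₁ − μₙ approach π, as the geometry of Fig. 12-2 predicts.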

Example 4.* Solve the Sturm-Liouville problem

   d/dx[(1 − x²) dy/dx] + λy = 0                         (12-29)

on the interval [−1, 1].
The leading coefficient of this equation vanishes at the endpoints of the interval, and hence, by Case 1 above, the operator D[(1 − x²)D] is symmetric on all of C²[−1, 1]. Moreover, since (12-29) is the self-adjoint form of Legendre's equation (of order λ), our earlier results imply that the integers n(n + 1), n = 0, 1, 2, . . . ,

* This example should be omitted by anyone who is not familiar with the material in
Sections 11-2 through 11-4.

are eigenvalues, and that the Legendre polynomials

   P₀(x), P₁(x), P₂(x), . . .

are eigenfunctions for the problem.* To complete the discussion it remains to show that these polynomials are a complete set of eigenfunctions for (12-29). Here we argue as follows.
Were λ ≠ n(n + 1) an eigenvalue of (12-29), and y an eigenfunction belonging to λ, then y would be orthogonal in C[−1, 1] to all of the Pₙ. But, as we know, the Legendre polynomials are a basis for C[−1, 1], and hence, by Lemma 8-3, y = 0. Since this cannot be, no such eigenvalue exists.
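For readers who wish to see the orthogonality of the Pₙ concretely, the Python sketch below is our own illustration: it generates the Legendre polynomials by Bonnet's recurrence (k + 1)P₍ₖ₊₁₎ = (2k + 1)xPₖ − kP₍ₖ₋₁₎ and integrates numerically with Simpson's rule. Eigenfunctions belonging to the distinct eigenvalues 2·3 and 3·4 come out orthogonal on [−1, 1], while ∫ P₂² dx gives the familiar normalization 2/(2n + 1).

```python
def legendre(n, x):
    """P_n(x) via the Bonnet recurrence
    (k+1) P_{k+1} = (2k+1) x P_k - k P_{k-1}."""
    if n == 0:
        return 1.0
    p_prev, p = 1.0, x            # P_0 and P_1
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p

def simpson(f, a, b, m=2000):
    """Composite Simpson rule with m (even) subintervals."""
    h = (b - a) / m
    s = f(a) + f(b)
    for i in range(1, m):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3

# Eigenfunctions for the distinct eigenvalues 2*3 and 3*4 are orthogonal:
ip23 = simpson(lambda x: legendre(2, x) * legendre(3, x), -1.0, 1.0)
# Normalization: the integral of P_2^2 over [-1, 1] is 2/(2*2 + 1) = 2/5:
norm2 = simpson(lambda x: legendre(2, x) ** 2, -1.0, 1.0)
```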

EXERCISES

1. Verify that the eigenvalues and eigenvectors listed in Example 2 above are correct.

2. (a) Show that the boundary-value problem consisting of the fourth-order differential equation

   d⁴y/dx⁴ − ω²y = 0

and boundary conditions

   y(0) = y(1) = 0,   y'(0) = y'(1) = 0

has nontrivial solutions if and only if

   cos √ω = 1/cosh √ω.

(b) Use the technique introduced in Example 3 above to prove that the boundary-value problem in (a) has infinitely many non-negative eigenvalues ωₙ, n = 0, 1, 2, . . . . How do these eigenvalues behave as n → ∞?

(c) What is the general solution of the boundary-value problem in (a) corresponding to the eigenvalue ωₙ?

3. Let L denote the fourth-order linear differential operator D⁴, and let S denote the subspace of C⁴[a, b] consisting of all functions y such that

   y(a) = y''(a) = y(b) = y''(b) = 0.

(a) Prove that

   y₁(Ly₂) − y₂(Ly₁) = [y₁y₂''' − y₂y₁''' − y₁'y₂'' + y₂'y₁'']'

for all y₁ and y₂ in S.

(b) Use the result in (a) to prove that eigenfunctions belonging to distinct eigenvalues for the boundary-value problem L: S → C[a, b] are orthogonal.

* Note that at this point Theorem 12-6 allows us to assert, without further proof, that the Legendre polynomials are mutually orthogonal in PC[−1, 1].

12-7 BOUNDARY-VALUE PROBLEMS AND SERIES EXPANSIONS


We began this chapter by defining a boundary-value problem to be an operator equation of the form

   Ly = h                                                (12-30)

in which h is a known function in C[a, b], and L a second-order linear differential operator acting on a subspace S of C²[a, b] described by a pair of boundary conditions of the form (12-2). Since then we have said very little about such problems and have, instead, been off chasing eigenvalues and eigenfunctions. It is now time to justify this apparent digression by returning to our original problem and applying what we have learned to solve it.
The method we propose to use is a straightforward generalization to infinite-dimensional Euclidean spaces of the eigenvalue method introduced in Section 12-3, and thus depends upon the existence of an eigenfunction basis for C[a, b]. As yet, of course, we have no assurance that such a basis will exist, even when L is symmetric, but if it does we can argue as follows.

Let

   λ₀, λ₁, λ₂, . . .

be the eigenvalues for L, and let

   φ₀(x), φ₁(x), φ₂(x), . . .

be a complete set of eigenfunctions belonging to the λₙ. Then, since the φₙ are a basis for C[a, b], we have

   h(x) = ∑_{n=0}^∞ cₙφₙ(x),

where

   cₙ = (h · φₙ)/‖φₙ‖² = ∫_a^b h(x)φₙ(x) dx / ∫_a^b [φₙ(x)]² dx,

and the series converges in the mean to h. We now set

   y(x) = ∑_{n=0}^∞ αₙφₙ(x),                             (12-31)

with the αₙ unknown, and substitute in (12-30) to obtain

   L(∑_{n=0}^∞ αₙφₙ(x)) = ∑_{n=0}^∞ cₙφₙ(x).

Thus if L can be applied to (12-31) term-by-term, we have

   ∑_{n=0}^∞ αₙλₙφₙ(x) = ∑_{n=0}^∞ cₙφₙ(x)

(recall that Lφₙ = λₙφₙ), and it follows that (12-31) will be a solution of the given equation whenever
(i) the αₙ can be chosen so that

   λₙαₙ = cₙ

for all n, and

(ii) with these as the values of αₙ, the series

   ∑_{n=0}^∞ αₙφₙ(x)

defines a function in C²[a, b] whose first two derivatives can be computed by termwise differentiation.
It is clear that the first of these requirements can be met by setting αₙ = cₙ/λₙ so long as λₙ ≠ 0 for all n (i.e., so long as L is one-to-one). Furthermore, the resulting solution is then unique. If, on the other hand, one of the λₙ, say λ₀, is zero, the problem has no solution at all when c₀ ≠ 0, and an infinite number of solutions when c₀ = 0.

Unfortunately, no such simple analysis can be used to dispose of (ii), since here we must investigate the convergence of the series

   ∑_{n=0}^∞ (cₙ/λₙ)φₙ(x).

As we have already seen, this is a delicate problem whose solution depends upon the properties of the function h from which the cₙ are derived, and upon the particular orthogonal system φₙ appearing in the series. Thus different orthogonal
systems must be examined individually, and the best that can be said in general
is that the desired termwise differentiability will be possible whenever h is "suf-
ficiently smooth." (However, see Theorem 12-7 below.) In the absence of specific
information as to what degree of smoothness is "sufficient" in any given instance,
it is standard practice to proceed formally, as above, and then attempt to verify

that the resulting series has the required properties. This method will be illustrated
in some detail in the next chapter.

Example 1. Let S be the subspace of C²[0, π] described by the boundary conditions y(0) = y(π) = 0, and let L = −D². Then

   λₙ = n²,   φₙ(x) = sin nx,

n = 1, 2, . . . , and the φₙ are a basis for C[0, π]. (See Section 9-5.) Hence the boundary-value problem

   −y'' = h(x),
   y(0) = y(π) = 0                                       (12-32)

has the formal solution

   y(x) = ∑_{n=1}^∞ (cₙ/n²) sin nx,                      (12-33)

with

   cₙ = (2/π) ∫_0^π h(x) sin nx dx.

In this case the validity of (12-33) can be guaranteed by demanding that h be continuous and have a piecewise continuous first derivative on [0, π]. For then the Fourier sine series for h will converge uniformly and absolutely on every closed subinterval of (0, π), and the same will therefore be true of the series obtained by twice differentiating (12-33) term-by-term.*
As a concrete illustration, let h(x) = x. Then (12-32) becomes

   −y'' = x,
   y(0) = y(π) = 0,                                      (12-34)

and we have

   y(x) = ∑_{n=1}^∞ (cₙ/n²) sin nx,

where

   cₙ = (2/π) ∫_0^π x sin nx dx.

A routine calculation gives

   cₙ = (−1)ⁿ⁺¹ (2/n),

whence

   y(x) = 2 ∑_{n=1}^∞ (−1)ⁿ⁺¹ (sin nx)/n³.
Of course, (12-34) can also be solved in closed form by applying the given boundary conditions to the general solution of −y'' = x. Lest the reader feel that we have been somewhat dishonest in using Fourier series when this easier method was at hand, we point out that frequently no such option exists, and the only available solutions are those expressed as series in terms of eigenfunctions for the problem.

* The reader should note that (12-33) will satisfy the given boundary conditions and reduce to the value of h at 0 and π only if h(0) = h(π) = 0. Thus, in general, we can neither demand nor expect the solution of (12-32) to satisfy the differential equation on the closed interval [0, π].
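The agreement between the eigenfunction series and the closed-form solution can be checked numerically. The Python sketch below is our own illustration (the number of terms retained is an arbitrary choice): it compares partial sums of the series with x(π² − x²)/6, the solution of (12-34) obtained by direct integration.

```python
import math

def series_solution(x, terms=200):
    """Partial sum of y(x) = 2 * sum (-1)^(n+1) sin(nx)/n^3, the
    eigenfunction expansion of the solution of -y'' = x, y(0) = y(pi) = 0."""
    return 2 * sum((-1) ** (n + 1) * math.sin(n * x) / n ** 3
                   for n in range(1, terms + 1))

def closed_form(x):
    """Exact solution x*(pi^2 - x^2)/6, found by integrating -y'' = x twice
    and applying the boundary conditions."""
    return x * (math.pi ** 2 - x ** 2) / 6
```

Since the coefficients decay like 1/n³, two hundred terms already agree with the closed form to better than four decimal places across (0, π).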

The technique used in the above example was successful precisely because the operator −D², when restricted to S, had a sufficient number of mutually orthogonal eigenfunctions to allow us to construct an eigenfunction basis for C[0, π]. This immediately raises the problem of determining conditions which are sufficient to guarantee the existence of such a basis for an arbitrary self-adjoint linear differential operator acting on a given subspace of C²[a, b]. As might be expected, this is a very difficult problem, and any attempt to answer it, even for the symmetric operators introduced in Section 12-5, would carry us too far afield. Thus we let matters rest with the statement of the following theorem which, as it turns out, is adequate to handle most of the problems we shall discuss.

Theorem 12-7. Let L be a normal second-order linear differential operator defined on a closed interval [a, b], and let S be a subspace of C²[a, b] described by a pair of unmixed boundary conditions. Then L has an infinite sequence of real eigenvalues {λₙ}, n = 0, 1, 2, . . . , such that

   |λ₀| < |λ₁| < |λ₂| < · · ·

and

   lim_{n→∞} |λₙ| = ∞.

Moreover, the invariant subspaces of C[a, b] associated with the λₙ are all one-dimensional; any complete set of eigenfunctions for L, one for each eigenvalue, is a basis for C[a, b]; and the series expansion of any piecewise smooth function y on [a, b] relative to such a basis converges uniformly and absolutely to y on any closed subinterval in which y is continuous.*

EXERCISES
Find the formal series expansion of the solution of the boundary-value problems in Exercises 1-6 in terms of the eigenfunctions for the associated Sturm-Liouville system.

1. y'' = x(x − 2π),   y(0) = 0, y'(π) = 0
2. y'' = x² − π²,   y'(0) = 0, y(π) = 0
3. y'' = sin (πx/L),   y'(0) = 0, y(L) = 0
4. y'' = sin (πx/L),   y(0) = 0, y'(L) = 0

5. y'' = −x,        0 ≤ x ≤ π/2,
         x − π/2,   π/2 < x ≤ π,
   y(0) = 0, y(π) = 0.

6. y'' = −sin² x,   y(0) = y(π), y'(0) = y'(π).

*7. Use the technique introduced in this section to discuss the boundary-value problem

   y'' = −h(x),
   y(0) = y(2π),   y'(0) = y'(2π),

* For a proof, see E. L. Ince, Ordinary Differential Equations, Dover, New York, 1956.

given that h is a function in C[0, 2π]. [Hint: Consider the cases

   ∫_0^{2π} h(x) dx = 0   and   ∫_0^{2π} h(x) dx ≠ 0

separately.]
*8. By citing appropriate theorems in the text, verify the assertion made above concern-
ing the convergence of the series given in (12-33).

12-8 ORTHOGONALITY AND WEIGHT FUNCTIONS


The considerations of the preceding section admit an easy and important generalization to boundary-value problems involving a Sturm-Liouville system consisting of
(i) a second-order homogeneous linear differential equation of the form

   d/dx[p(x) dy/dx] + [q(x) − λr(x)]y = 0                (12-35)

defined on an interval [a, b], and
(ii) a pair of homogeneous boundary conditions which serve to determine the domain space for the operator

   L = D(p(x)D) + q(x).

As before, we assume that p and q belong, respectively, to C¹[a, b] and C[a, b], and that p(x) does not vanish in (a, b). In addition, we demand that r be a continuous non-negative function on [a, b] which vanishes at most finitely many times in the interval.


The values of X which such a problem admits nontrivial solutions are again
for
called eigenvaluesand the associated nontrivial solutions of (12-35) are called
eigenfunctions. Our task, of course, is to find all eigenvalues and eigenfunctions
once L and its domain space have been given, and, more generally, to investigate
the possibility of extending our earlier results to this setting.
We begin by noting that if λ₁ and λ₂ are distinct eigenvalues for (12-35), and if y₁ and y₂ are eigenfunctions belonging to these eigenvalues, then

   Ly₁ = λ₁r(x)y₁(x),
   Ly₂ = λ₂r(x)y₂(x),

and Lagrange's identity implies that

   (λ₁ − λ₂)r(x)y₁(x)y₂(x) = y₂(x)[Ly₁(x)] − y₁(x)[Ly₂(x)]
                           = {p(x)[y₂(x)y₁'(x) − y₁(x)y₂'(x)]}'.

Hence

   (λ₁ − λ₂) ∫_a^b r(x)y₁(x)y₂(x) dx = p(x)[y₂(x)y₁'(x) − y₁(x)y₂'(x)] |_a^b,   (12-36)

and it follows that

   ∫_a^b r(x)y₁(x)y₂(x) dx = 0                           (12-37)

whenever the boundary conditions are such that the expression on the right-hand side of (12-36) vanishes. Assuming this to be the case, (12-37) allows us to assert that the functions √r y₁ and √r y₂ are orthogonal in C[a, b], or, equivalently, that y₁ and y₂ are orthogonal in C[a, b] with respect to the weight function r. (See Example 3, Section 7-1.) This latter terminology has the advantage of banishing the cumbersome factor √r from the discussion of orthogonality, and amounts to redefining the inner product on C[a, b] to be

   f · g = ∫_a^b f(x)g(x)r(x) dx,                        (12-38)

a definition we know to be valid whenever r satisfies the conditions imposed above. And with this we have proved the following generalization of Theorem 12-6.

Theorem 12-8. Let L be a self-adjoint linear differential operator on an interval [a, b], let r be any weight function on [a, b], and let S be a subspace of C²[a, b] such that

   p(x)[y₁(x)y₂'(x) − y₂(x)y₁'(x)] |_a^b = 0

for every pair of functions y₁ and y₂ in S. Then any set of eigenfunctions belonging to distinct eigenvalues for the Sturm-Liouville problem

   Ly = λry

is orthogonal in C[a, b] when the inner product is computed with respect to the weight function r.

We call the reader's attention to the fact that in general the operator L will not be symmetric on S with respect to the weighted inner product defined by (12-38). Nevertheless, Theorem 12-8 asserts that eigenfunctions belonging to distinct eigenvalues are still orthogonal, and this is really what is needed to construct eigenfunction bases. Moreover, since the conditions required to ensure orthogonality here are the same as those imposed in Section 12-5, we see that the conclusion of Theorem 12-8 is assured whenever S is described by boundary conditions of Type 1, 2, or 3 of that section. And finally, if L: S → C[a, b] is both one-to-one and normal, and if r(x) > 0 for all x in [a, b], it can be shown that C[a, b] has a basis composed of eigenfunctions for L. Again we omit the proof.

Example. Find the eigenvalues and eigenfunctions for the Sturm-Liouville problem

   y'' + 4y' + (4 − 9λ)y = 0,
   y(0) = 0,   y(a) = 0.                                 (12-39)

In the first place, (12-39) can be rewritten in self-adjoint form as

   d/dx(e⁴ˣ dy/dx) + 4e⁴ˣy = λ(9e⁴ˣ)y,
   y(0) = 0,   y(a) = 0,

and therefore satisfies the hypotheses of Theorem 12-8. Thus eigenfunctions belonging to distinct eigenvalues for this problem will be mutually orthogonal in the Euclidean space C[0, a] with inner product computed relative to the weight function r(x) = 9e⁴ˣ. Furthermore, it is not difficult to show that the operator L = D(e⁴ˣD) + 4e⁴ˣ is a one-to-one linear transformation from the subspace described by the given boundary conditions to C[0, a]. (See Exercise 5, Section 12-2, and Lemma 12-3 below.) Hence, since L is normal and r(x) > 0 for all x in [0, a], the result cited a moment ago guarantees the existence of an eigenfunction basis for C[0, a]. To compute such a basis we argue as follows.

Case 1. λ > 0. Here the general solution of y'' + 4y' + (4 − 9λ)y = 0 is

   y = c₁e^{(−2+3√λ)x} + c₂e^{(−2−3√λ)x},

and the boundary conditions imply that

   c₁ + c₂ = 0,
   c₁e^{(−2+3√λ)a} + c₂e^{(−2−3√λ)a} = 0.

Thus c₁ = c₂ = 0, and (12-39) has no positive eigenvalues.

Case 2. λ = 0. This time the general solution of the equation is (c₁ + c₂x)e⁻²ˣ, and we again find that c₁ = c₂ = 0.

Case 3. λ < 0. Here

   y = e⁻²ˣ(c₁ sin 3√−λ x + c₂ cos 3√−λ x),              (12-40)

and the requirement that y(0) = y(a) = 0 yields

   c₂ = 0,
   c₁ sin 3√−λ a = 0.

Hence (12-39) has nontrivial solutions if and only if λ satisfies the equation sin 3√−λ a = 0, and it follows that the eigenvalues for this problem are

   λₙ = −n²π²/(9a²),   n = 1, 2, . . . .

To find a corresponding set of eigenfunctions we now set c₁ = 1, c₂ = 0, and λ = λₙ in (12-40), thereby obtaining the functions

   φₙ(x) = e⁻²ˣ sin (nπx/a),   n = 1, 2, . . . .
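The orthogonality asserted by Theorem 12-8 is easy to confirm numerically for this example. In the Python sketch below (our own illustration; a = 1 is an arbitrary choice) the weight 9e⁴ˣ exactly cancels the factor e⁻⁴ˣ contributed by the two eigenfunctions, so the weighted inner product reduces to 9 ∫_0^a sin(nπx/a) sin(mπx/a) dx, which is 0 for n ≠ m and 9a/2 for n = m.

```python
import math

A = 1.0                                    # the interval [0, a] with a = 1

def phi(n, x):
    """Eigenfunction phi_n(x) = e^(-2x) sin(n pi x / a)."""
    return math.exp(-2 * x) * math.sin(n * math.pi * x / A)

def weighted_ip(n, m, steps=4000):
    """Inner product (12-38) with weight r(x) = 9 e^(4x),
    computed by the trapezoid rule."""
    h = A / steps
    total = 0.0
    for i in range(steps + 1):
        x = i * h
        w = 0.5 if i in (0, steps) else 1.0
        total += w * phi(n, x) * phi(m, x) * 9 * math.exp(4 * x)
    return total * h
```

Distinct eigenvalues give a vanishing weighted inner product, while n = m returns 9a/2 = 4.5, exactly as the reduction above predicts.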



EXERCISES

Compute the eigenvalues and eigenfunctions for the boundary-value problems in Exercises 1-8, and in each case determine a Euclidean space in which a complete set of eigenfunctions for the given problem is an orthogonal set.

1. y'' + (1 + λ)y = 0;   y(0) = 0, y(π) = 0
2. y'' + (1 − λ)y = 0;   y'(0) = 0, y'(1) = 0
3. y'' + 2y' + (1 − λ)y = 0;   y(0) = 0, y(1) = 0
4. y'' − 4y' + (4 − λ)y = 0;   y(0) = 0, y(1) = 0
5. 4y'' − 4y' + (1 + λ)y = 0;   y(−1) = 0, y(1) = 0
6. y'' + (1 − λ)y = 0;   y(0) + y'(0) = 0, y(1) + y'(1) = 0
7. y'' + 2y' + (1 − λ)y = 0;   y'(0) = 0, y'(π) = 0
8. y'' − 3y' + 2(1 + λ)y = 0;   y(0) = 0, y(1) = 0

*9. Find all eigenvalues and eigenfunctions for the "singular" Sturm-Liouville problem

   x²y'' − xy' + (1 + λ)y = 0,

given that y(1) = 0 and lim_{x→0+} |y(x)| < ∞. How does the set of eigenvalues of this problem differ from those encountered earlier in this chapter? [Hint: Note that the given equation is an Euler equation, and recall that when its indicial polynomial has complex roots α ± βi, the solution space on (0, ∞) is spanned by the functions x^α sin (β ln x), x^α cos (β ln x).]

10. Repeat Exercise 9 for the boundary-value problem

   x²y'' + xy' − 9λy = 0,

given that y(1) = 0 and lim_{x→0+} |y(x)| < ∞.

*12-9 GREEN'S FUNCTIONS FOR BOUNDARY-VALUE PROBLEMS: AN EXAMPLE
In an earlier chapter we saw that the equation

   Lx = y,   L: S → V,                                   (12-41)

can be solved for x by "inverting the operator" whenever L is a one-to-one linear transformation mapping S onto V. For, under these conditions, there exists a linear transformation L⁻¹ from V to S such that L⁻¹L = I, I the identity map on S, and hence x = L⁻¹y. In the sections which follow we shall apply this technique to the case where (12-41) is a boundary-value problem involving a normal self-adjoint linear differential operator and unmixed boundary conditions, thereby obtaining a method for solving such problems which is independent of the elaborate theory of series expansions in Euclidean spaces.

The best place to begin, perhaps, is at the point where we left the discussion of initial-value problems in Chapter 4. As the reader will recall, we saw there that if

   L: C²[a, b] → C[a, b]

is a normal second-order linear differential operator, and if S is the subspace of C²[a, b] consisting of all functions y which satisfy the initial conditions

   y(x₀) = y₀,   y'(x₀) = y₁,

then L, restricted to S, has an inverse which can be expressed as an integral operator of the form

   L⁻¹h(x) = ∫_a^b K(x, t)h(t) dt.

Moreover, the function K(x, t), known as the Green's function for L for initial-value problems, can be constructed in a perfectly definite way from the coefficients of L and a basis for the null space of L in C²[a, b]. Our present objective is to prove that a similar construction is possible whenever S is a subspace of C²[a, b] determined by a pair of unmixed boundary conditions

   α₁y(a) + α₂y'(a) = 0,
   β₁y(b) + β₂y'(b) = 0,                                 (12-42)

and L is one-to-one when restricted to S.

Obviously, then, our first task is to devise a criterion which will guarantee the one-to-oneness of L. In other words, we must impose restrictions on the boundary conditions appearing in (12-42) which will ensure that the only solution of the equation

   Ly = 0                                                (12-43)

which belongs to S is the trivial solution y = 0. This can be done as follows.


Let y₁ and y₂ be a basis for the null space of L in C²[a, b], and let

   y(x) = c₁y₁(x) + c₂y₂(x)

be the general solution of (12-43). Then y(x) will be identically zero if and only if c₁ = c₂ = 0, and y(x) will belong to S if and only if

   α₁[c₁y₁(a) + c₂y₂(a)] + α₂[c₁y₁'(a) + c₂y₂'(a)] = 0,
   β₁[c₁y₁(b) + c₂y₂(b)] + β₂[c₁y₁'(b) + c₂y₂'(b)] = 0,

that is, if and only if

   c₁[α₁y₁(a) + α₂y₁'(a)] + c₂[α₁y₂(a) + α₂y₂'(a)] = 0,
   c₁[β₁y₁(b) + β₂y₁'(b)] + c₂[β₁y₂(b) + β₂y₂'(b)] = 0.     (12-44)

Viewing (12-44) as a pair of equations in the unknowns c₁ and c₂, it follows that y(x) ≡ 0 if and only if (12-44) has the unique solution c₁ = c₂ = 0. From this, and the elementary theory of systems of linear equations, we immediately deduce

Lemma 12-3. Let L be a normal second-order linear differential operator defined on an interval [a, b], let y₁ and y₂ be a basis for the null space of L in C²[a, b], and let S be the subspace of C²[a, b] determined by the unmixed boundary conditions in (12-42). Then L will be one-to-one when restricted to S if and only if the determinant

   | α₁y₁(a) + α₂y₁'(a)    α₁y₂(a) + α₂y₂'(a) |
   | β₁y₁(b) + β₂y₁'(b)    β₁y₂(b) + β₂y₂'(b) |          (12-45)

is different from zero.

Example. Let L = D², and let S be the subspace of C²[a, b] defined by

   y(a) = 0,   y(b) = 0.                                 (12-46)

Then, using the functions 1, x as a basis for the null space of L, the above determinant becomes

   | 1   a |
   | 1   b |  = b − a,

and it follows that L is one-to-one on S.

In this case we can also prove that L maps S onto C[a, b] by the simple expedient of constructing L⁻¹ directly from the formula for L. Indeed, since the equation Ly = h is simply y''(x) = h(x), two integrations yield

   y'(s) = ∫_a^s h(t) dt + c,

and

   y(x) = ∫_a^x [∫_a^s h(t) dt] ds + c(x − a) + d,       (12-47)

where c and d are arbitrary constants. Applying the given boundary conditions we find that d = 0 and that

   ∫_a^b [∫_a^s h(t) dt] ds + c(b − a) = 0.

Thus

   c = −(1/(b − a)) ∫_a^b [∫_a^s h(t) dt] ds,

and (12-47) becomes

   y(x) = ∫_a^x [∫_a^s h(t) dt] ds − ((x − a)/(b − a)) ∫_a^b [∫_a^s h(t) dt] ds.

We now use the unit step function

   u₀(s) = 0,   s < 0,
           1,   s > 0,

to rewrite this expression as

   y(x) = ∫_a^b u₀(x − s) [∫_a^s h(t) dt] ds − ((x − a)/(b − a)) ∫_a^b [∫_a^s h(t) dt] ds

        = ∫_a^b [∫_a^s h(t) (u₀(x − s) − (x − a)/(b − a)) dt] ds.

Using the unit step function again, we have

   y(x) = ∫_a^b [∫_a^b u₀(s − t) (u₀(x − s) − (x − a)/(b − a)) h(t) dt] ds

        = ∫_a^b [∫_a^b u₀(s − t) (u₀(x − s) − (x − a)/(b − a)) ds] h(t) dt,

and it follows that

   y(x) = ∫_a^b K(x, t)h(t) dt,                          (12-48)

where

   K(x, t) = ∫_a^b u₀(s − t) [u₀(x − s) − (x − a)/(b − a)] ds

           = (x − a)(t − b)/(b − a),   x < t,
             (x − b)(t − a)/(b − a),   x > t.            (12-49)

(See Exercise 3.) The function K(x, t) defined by this formula is known as the Green's function for the operator L = D² for the given boundary-value problem, and Eq. (12-48) can be read as the definition of L⁻¹: C[a, b] → S with

   L⁻¹h(x) = ∫_a^b K(x, t)h(t) dt.                       (12-50)
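Formula (12-49) is easy to test numerically: applying the integral operator (12-50) to a given h should reproduce the solution of y'' = h satisfying the boundary conditions. The Python sketch below is our own illustration (a = 0, b = 1 and the sample inputs are arbitrary choices); the integral is split at t = x, where K has a corner.

```python
def green(x, t, a=0.0, b=1.0):
    """Green's function (12-49) for L = D^2 with y(a) = y(b) = 0."""
    if x <= t:
        return (x - a) * (t - b) / (b - a)
    return (x - b) * (t - a) / (b - a)

def solve(h, x, a=0.0, b=1.0, steps=4000):
    """y(x) = integral of K(x, t) h(t) dt, split at the corner t = x,
    using the trapezoid rule on each smooth piece."""
    def trap(f, lo, hi):
        dt = (hi - lo) / steps
        total = 0.5 * (f(lo) + f(hi))
        for i in range(1, steps):
            total += f(lo + i * dt)
        return total * dt
    f = lambda t: green(x, t, a, b) * h(t)
    return trap(f, a, x) + trap(f, x, b)
```

With h(t) = 1 the operator returns x(x − 1)/2, the unique solution of y'' = 1, y(0) = y(1) = 0, and with h(t) = t it returns (x³ − x)/6.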

In the next section we shall prove that an analogous result is valid for any normal second-order linear differential operator L acting on S, so long as S is determined by a pair of unmixed boundary conditions and L is one-to-one when restricted to S.

EXERCISES

1. Determine which of the following operators are one-to-one when restricted to the given subspace S of C²[a, b].

   (a) L = D² + 4D + 4,   S: y(0) = 0, y(a) = 0
   (b) L = D² + 1,   S: y(−π) = 0, y(π) = 0
   (c) L = D² − 1,   S: y(0) + y'(0) = 0, y(a) = 0
   (d) L = D² − 1,   S: y(0) − y'(0) = 0, y(a) = 0
   (e) L = x²D² + xD + 1,   S: y(1) = 0, y(e^π) = 0

2. Show that if the condition in Lemma 12-3 is satisfied for a pair of functions y₁ and y₂, it will also be satisfied for every other pair of linearly independent functions of the form

   Y₁(x) = γ₁y₁(x) + γ₂y₂(x),
   Y₂(x) = γ₃y₁(x) + γ₄y₂(x).

3. Prove that

   ∫_a^b u₀(s − t)[u₀(x − s) − (x − a)/(b − a)] ds = (x − a)(t − b)/(b − a),   x < t,
                                                     (x − b)(t − a)/(b − a),   x > t,

where u₀ is the unit step function, and a ≤ x ≤ b, a ≤ t ≤ b.

4. Prove that the first derivative with respect to x of the Green's function K(x, t) constructed above has a jump discontinuity of magnitude 1 along the line x = t, but is continuous at all other points in the region a ≤ x ≤ b, a ≤ t ≤ b.

5. Let t₀ be a fixed point in the interval [a, b], and let K(x, t) be the Green's function constructed above. Prove that the function K(x, t₀) is a solution of the boundary-value problem

   D²y = 0,
   y(a) = y(b) = 0,

for all x ≠ t₀.

6. Use the technique introduced above to find the Green's function for the operator D² on the subspace of C²[0, 1] defined by the boundary conditions y(0) = y'(1) = 0.
7. Repeat Exercise 6 for the boundary conditions

   y(0) − y'(0) = 0,
   y(1) + y'(1) = 0.

8. Let S denote the subspace of C²[a, b] determined by the periodic boundary conditions

   y(a) = y(b),
   y'(a) = y'(b),

and let L = D(p(x)D) + q(x) be normal on [a, b]. Prove that L is one-to-one when restricted to S if and only if

   | y₁(a) − y₁(b)     y₂(a) − y₂(b)  |
   | y₁'(a) − y₁'(b)   y₂'(a) − y₂'(b) |  ≠ 0

whenever y₁ and y₂ are linearly independent solutions of the equation Ly = 0.

*12-10 GREEN'S FUNCTIONS FOR BOUNDARY-VALUE PROBLEMS: UNMIXED BOUNDARY CONDITIONS

Throughout this section we shall assume that L = D[p(x)D] + q(x) is a normal second-order linear differential operator defined on an interval [a, b], that S is a subspace of C²[a, b] determined by a pair of unmixed boundary conditions, and that L is one-to-one when restricted to S. As was stated above, we propose to show that such a transformation necessarily maps S onto C[a, b], and admits an inverse which can be expressed as an integral operator of the form

   L⁻¹h(x) = ∫_a^b K(x, t)h(t) dt

for all h in C[a, b]. In order to motivate the axiomatic definition of the function K(x, t) given below, we begin by presenting a heuristic argument which will simultaneously suggest the existence of L⁻¹, the validity of the above formula, and, most important of all, the correct determination of K(x, t).*


Assume that the equation

   Ly = h                                                (12-51)

describes the behavior of a physical system under the influence of a given input function h, and assume, in addition, that the response y(x) of the system at the point x due to the unit input

   φₜ(x) = 1   if x = t,
           0   if x ≠ t,

is K(x, t).† Then, in view of the linearity of L, it is reasonable to expect that the

* The following argument is essentially the one given by R. Courant and D. Hilbert in Methods of Mathematical Physics, Vol. I, Interscience Publishers, Inc., New York, 1953.
† For instance, (12-51) might be the equation of equilibrium of an elastic string stretched along the interval [a, b] and subjected to a continuously distributed force h = h(x). In that case, φₜ(x) represents a unit force applied at the point t, and K(x, t) the deflection of the string at x due to this force.


response of the system at x to a unit input applied continuously throughout the entire interval should be obtained by "summing" the responses due to unit inputs applied at each point of the interval, and hence should be of the form

   y(x) = ∫_a^b K(x, t) dt.

In the case of a general input function h, this reasoning leads to the formula

   y(x) = ∫_a^b K(x, t)h(t) dt,                          (12-52)

where the integrand is now viewed as the response at x to that portion of the input applied at the point t.

Granting the validity of these considerations, we can easily deduce a number of properties of the function K(x, t). In the first place, it is clear that K(x, t) must be defined and continuous for a ≤ x ≤ b, a ≤ t ≤ b, and, for each value of t, must satisfy whatever boundary conditions have been imposed on the problem. Moreover, by its very definition, K(x, t) is a solution of the equation

   Ly = φₜ(x),

and hence, for each fixed t₀ in [a, b], the function K(x, t₀) satisfies the homogeneous equation

   Ly = 0

for all x ≠ t₀. Finally, to determine the behavior of K(x, t₀) when x = t₀, let f_{t₀} denote the function which vanishes outside the interval |x − t₀| < ε, and which contributes a total input of one in the interval |x − t₀| < ε; that is,

   ∫_{t₀−ε}^{t₀+ε} f_{t₀}(x) dx = 1.

(We assume that ε has been chosen sufficiently small so that the interval [t₀ − ε, t₀ + ε] is contained in [a, b].) Then, if K̄(x, t₀) denotes the response of the system at x to f_{t₀}, we have

   L[K̄(x, t₀)] = f_{t₀}(x),

and it follows that

   ∫_{t₀−ε}^{t₀+ε} L[K̄(x, t₀)] dx = 1.                   (12-53)

But, by assumption, L = D[p(x)D] + q(x), whence (12-53) becomes

   ∫_{t₀−ε}^{t₀+ε} (d/dx)[p(x)K̄'(x, t₀)] dx + ∫_{t₀−ε}^{t₀+ε} q(x)K̄(x, t₀) dx = 1,

or

   p(x)K̄'(x, t₀) |_{t₀−ε}^{t₀+ε} + ∫_{t₀−ε}^{t₀+ε} q(x)K̄(x, t₀) dx = 1.      (12-54)

We now make the not unreasonable assumption that as ε → 0, K̄(x, t₀) → K(x, t₀) for all x, and that K̄'(x, t₀) → K'(x, t₀) for all x different from t₀. Then the continuity of q and K imply that the second term in (12-54) vanishes as ε → 0, while the first reduces to

   p(t₀) (d/dx)K(x, t₀) |_{x=t₀⁻}^{x=t₀⁺}.

Thus

   (d/dx)K(x, t₀) |_{x=t₀⁻}^{x=t₀⁺} = 1/p(t₀),            (12-55)

an equation which asserts that at the point t₀ the derivative of K(x, t₀) has a jump discontinuity of magnitude 1/p(t₀).

Although these considerations are admittedly nonrigorous, they do agree with the results obtained in the preceding section (see Exercises 4 and 5, Section 12-9), and serve to motivate the following definition.

Definition 12-5. A Green's function for the boundary-value problem L: S → C[a, b] described above is a function K(x, t) of two variables satisfying the following three conditions:

   (1) K(x, t) is defined and continuous for a ≤ x ≤ b, a ≤ t ≤ b, and, as a function of x, is twice continuously differentiable except when x = t;

   (2) for each fixed t₀ in [a, b], K(x, t₀) belongs to the subspace S (i.e., satisfies the boundary conditions imposed on the problem), and, in addition, is a solution of the equation Ly = 0, except at the point x = t₀;

   (3) (d/dx)K(x, t₀) |_{x=t₀⁻}^{x=t₀⁺} = 1/p(t₀).
With this as our definition, we now state the following basic theorem.

Theorem 12-9. If L: S — > Q[a, b] is a boundary-value problem of the type


described at the beginning of this section, and if L is a one-to-one mapping
ofS into e[a, b], then L maps S onto e[a, b], and for each h in Q[a, b] the
(necessarily unique) solution in S of the equation

Ly = h
is given by the formula
rb
y(x) = / K(x, t)h(t) dt,
Ja

where K(x, a Green's function for L. Moreover, K(x, t) is uniquely deter-


t) is

mined by the operator L and the boundary conditions which define the sub-
space §>.

We defer the proof of this result to the next section in favor of showing how the Green's function for L can be explicitly computed once L and S are known. Here we argue as follows.
Let y₁ and y₂ be solutions of the homogeneous equation Ly = 0 chosen so that y₁ satisfies the boundary condition imposed at x = a, and y₂ the boundary condition imposed at x = b; that is,

   α₁y₁(a) + α₂y₁'(a) = 0,
   β₁y₂(b) + β₂y₂'(b) = 0.

(See Exercise 1 for a proof that such solutions do, in fact, exist.) Then y₁ and y₂ are linearly independent in C²[a, b]. For otherwise, there would exist a constant c ≠ 0 such that y₂(x) = cy₁(x), and the function cy₁(x) would be a nontrivial solution of Ly = 0 satisfying both of the boundary conditions imposed on S. This, however, contradicts the assumption that L is one-to-one when restricted to S, and is therefore impossible.
We now use the fact that the Wronskian of y₁ and y₂ never vanishes on the interval [a, b] to find functions A₁(t) and A₂(t) such that

A₂(t)y₂(t) − A₁(t)y₁(t) = 0,
                                                    (12-56)
A₂(t)y₂′(t) − A₁(t)y₁′(t) = 1/p(t),

for all t in [a, b]. The first of these equations guarantees that for each t₀ in the interval (a, b) the curves A₁(t₀)y₁(x) and A₂(t₀)y₂(x) intersect at the point x = t₀ (see Fig. 12-3), while the second guarantees that the slope of A₂(t₀)y₂(x) differs from the slope of A₁(t₀)y₁(x) by 1/p(t₀) at x = t₀.

FIGURE 12-3

Thus the function

K(x, t) = A₁(t)y₁(x) for x ≤ t,  A₂(t)y₂(x) for x ≥ t,   (12-57)

satisfies the various conditions imposed in Definition 12-5, and is the Green's function for L. Finally, by solving (12-56) for A₁ and A₂ we obtain the formula

K(x, t) = y₁(x)y₂(t) / (p(t)[y₁(t)y₂′(t) − y₁′(t)y₂(t)]) for x ≤ t,
                                                                   (12-58)
K(x, t) = y₁(t)y₂(x) / (p(t)[y₁(t)y₂′(t) − y₁′(t)y₂(t)]) for x ≥ t.

Remark: The argument just given provides a rigorous proof of the existence
of a Green's function for the problem under consideration. Uniqueness will be
established in the next section.
BOUNDARY-VALUE PROBLEMS | CHAP. 12

Example. Let L = D², and let S be the subspace of C²[a, b] for which

y(a) = y(b) = 0.

Then, as we know, L is one-to-one when restricted to S, and Theorem 12-9 applies. To find the Green's function for L on S we set

y₁(x) = x − a,  y₂(x) = x − b

in (12-58), to obtain

K(x, t) = (x − a)(t − b)/(b − a) for x ≤ t,  (t − a)(x − b)/(b − a) for x ≥ t,

in agreement with the formula found in Section 12-9.
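This example can be spot-checked numerically. The sketch below fixes a = 0, b = 1 and uses the illustrative right-hand side h(t) = 1 (neither choice comes from the text); it evaluates y(x) = ∫₀¹ K(x, t)h(t) dt by the midpoint rule and compares the result with x(x − 1)/2, the solution of y″ = 1, y(0) = y(1) = 0.

```python
# Numerical check of the Green's-function formula for L = D^2 on [0, 1]
# with y(0) = y(1) = 0.  The kernel is Eq. (12-58) with y1 = x - a, y2 = x - b;
# the endpoints and the right-hand side h are illustrative choices only.

def K(x, t, a=0.0, b=1.0):
    # Green's function: (x - a)(t - b)/(b - a) for x <= t, symmetric form otherwise
    if x <= t:
        return (x - a) * (t - b) / (b - a)
    return (t - a) * (x - b) / (b - a)

def solve(h, x, n=2000, a=0.0, b=1.0):
    # y(x) = integral_a^b K(x, t) h(t) dt, approximated by the midpoint rule
    dt = (b - a) / n
    return sum(K(x, a + (i + 0.5) * dt) * h(a + (i + 0.5) * dt)
               for i in range(n)) * dt

# For h(t) = 1 the problem y'' = 1, y(0) = y(1) = 0 has exact solution x(x - 1)/2.
for x in (0.25, 0.5, 0.75):
    assert abs(solve(lambda t: 1.0, x) - x * (x - 1) / 2) < 1e-6
```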

EXERCISES

1. Prove that the equation Ly = 0, L as above, has a pair of solutions y₁ and y₂ such that

α₁y₁(a) + α₂y₁′(a) = 0,
β₁y₂(b) + β₂y₂′(b) = 0.

[Hint: Use the existence theorem for solutions of initial-value problems.]

2. Use the method of this section to obtain the Green's functions for the boundary-value problems of Exercises 6 and 7 of the preceding section.

3. (a) Find the Green's function for the boundary-value problem

y″ + k²y = 0,  y(0) = y(1) = 0.

(b) Use the result in (a) to find the solution of

y″ + k²y = h(x),  y(0) = y(1) = 0,

given that h belongs to C[0, 1].
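A candidate answer to Exercise 3(a) can be probed numerically before anything is proved. The sketch below assumes (this is not stated in the text) that y₁ = sin kx and y₂ = sin k(x − 1) are admissible choices in formula (12-58); for k = 1 and the illustrative right-hand side h = 1 it checks that y(x) = ∫₀¹ K(x, t)h(t) dt reproduces the closed-form solution of y″ + k²y = 1, y(0) = y(1) = 0.

```python
import math

# Hypothetical kernel for L = D^2 + k^2 on [0, 1], built from (12-58) with
# y1 = sin kx, y2 = sin k(x - 1); here p(t) = 1 and the Wronskian is k sin k.
k = 1.0

def K(x, t):
    w = k * math.sin(k)                      # p(t) * [y1 y2' - y1' y2] = k sin k
    if x <= t:
        return math.sin(k * x) * math.sin(k * (t - 1)) / w
    return math.sin(k * t) * math.sin(k * (x - 1)) / w

def y(x, n=2000):
    # y(x) = integral_0^1 K(x, t) * 1 dt, via the midpoint rule
    dt = 1.0 / n
    return sum(K(x, (i + 0.5) * dt) for i in range(n)) * dt

def y_exact(x):
    # closed-form solution of y'' + k^2 y = 1, y(0) = y(1) = 0
    C = (1 - math.cos(k)) / math.sin(k)
    return (1 - math.cos(k * x) - C * math.sin(k * x)) / k**2

for x in (0.25, 0.5, 0.75):
    assert abs(y(x) - y_exact(x)) < 1e-6
```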

*12-11 GREEN'S FUNCTIONS: A PROOF OF THE MAIN THEOREM

Continuing with the notation introduced above, we now show that Definition 12-5 does, in fact, uniquely characterize the Green's function for L. This is the content of

Lemma 12-4. Let L = D[p(x)D] + q(x) be a normal second-order linear differential operator on [a, b], and let H(x, t) and K(x, t) be two functions satisfying Definition 12-5. Then H(x, t) = K(x, t) for all x and t in [a, b].

Proof. Let t₀ be any point in [a, b], and set F(x) = H(x, t₀) − K(x, t₀). Then F and F′ are continuous for all x in [a, b], and

L(F) = L(H − K) = L(H) − L(K) = 0   (12-59)

for all x ≠ t₀.* Furthermore, by applying L to F and solving the identity

p(x)F″ + p′(x)F′ + q(x)F = 0

for F″, we conclude that F″ exists and is continuous on [a, b], and hence that F belongs to C²[a, b]. Finally, since H(x, t₀) and K(x, t₀) belong to the subspace S, the same is true of F, and (12-59) together with the fact that L is one-to-one when restricted to S now implies that F(x) = 0. Hence H(x, t₀) = K(x, t₀) for all x, and since t₀ was arbitrary in [a, b], the proof is complete. ∎

At this point we are in a position to assert that every boundary-value problem L: S → C[a, b] involving a normal second-order linear differential operator which is one-to-one on the subspace S has a unique Green's function K(x, t), and that this function is given by Formula (12-58). Thus to complete the proof of Theorem 12-9 we need but show that the function

y(x) = ∫ₐᵇ K(x, t)h(t) dt

belongs to S and satisfies the equation Ly = h, for every h in C[a, b].

To this end we write

y(x) = ∫ₐˣ K(x, t)h(t) dt + ∫ₓᵇ K(x, t)h(t) dt,

and differentiate to obtain

y′(x) = K(x, x)h(x) + ∫ₐˣ ∂K/∂x (x, t)h(t) dt − K(x, x)h(x) + ∫ₓᵇ ∂K/∂x (x, t)h(t) dt

      = ∫ₐᵇ ∂K/∂x (x, t)h(t) dt.

(See Appendix I.) Thus

α₁y(a) + α₂y′(a) = ∫ₐᵇ [α₁K(a, t) + α₂ ∂K/∂x (a, t)] h(t) dt = 0,

* Strictly speaking, F′ has a removable discontinuity at x = t₀, which we can ignore because the discontinuities in H and K cancel by subtraction.


the last step following from the fact that, as a function of x, K(x, t) belongs to S for each t in [a, b]. Similarly

β₁y(b) + β₂y′(b) = 0,

and y(x) does belong to S, as required.


Finally,

y″(x) = d/dx ∫ₐˣ ∂K/∂x (x, t)h(t) dt + d/dx ∫ₓᵇ ∂K/∂x (x, t)h(t) dt

      = ∫ₐˣ ∂²K/∂x² (x, t)h(t) dt + ∂K/∂x (x, x⁻)h(x) + ∫ₓᵇ ∂²K/∂x² (x, t)h(t) dt − ∂K/∂x (x, x⁺)h(x)

      = ∫ₐᵇ ∂²K/∂x² (x, t)h(t) dt + [∂K/∂x (x, x⁻) − ∂K/∂x (x, x⁺)] h(x).

But, referring to Fig. 12-4, we observe that the continuity of ∂K/∂x in triangle ABC implies

∂K/∂x (x, x⁻) = ∂K/∂x (x⁺, x).

Similarly, using triangle ABD, we obtain

∂K/∂x (x, x⁺) = ∂K/∂x (x⁻, x),

FIGURE 12-4

and it follows that

∂K/∂x (x, x⁻) − ∂K/∂x (x, x⁺) = ∂K/∂x (x⁺, x) − ∂K/∂x (x⁻, x) = 1/p(x),

the prescribed jump in ∂K/∂x at the point (x, x). Thus

L[y(x)] = p(x)y″(x) + p′(x)y′(x) + q(x)y(x)

        = h(x) + ∫ₐᵇ [p(x) ∂²K/∂x² (x, t) + p′(x) ∂K/∂x (x, t) + q(x)K(x, t)] h(t) dt

        = h(x) + ∫ₐᵇ L[K(x, t)]h(t) dt,

and since L[K(x, t)] = 0 for each t in [a, b],

L[y(x)] = h(x),

and we are done. ∎

As our final result on Green's functions, we now prove

Theorem 12-10. If K(x, t) is the Green's function for the boundary-value problem L: S → C[a, b] described above, then K(x, t) = K(t, x) for all x and all t.

Proof. Let s₀ and t₀ be fixed points in [a, b] (with t₀ < s₀), and set u = K(x, s₀), v = K(x, t₀). Then Lu = Lv = 0 for all x in [a, b] different from s₀ and t₀, and the Lagrange identity applied to u and v yields

d/dx [p(uv′ − u′v)] = 0.

We now integrate this expression from a to b, taking account of the discontinuities of u′ and v′ at s₀ and t₀. Writing W(x) for

p(x)[∂K/∂x (x, t₀)·K(x, s₀) − ∂K/∂x (x, s₀)·K(x, t₀)],

this gives

W(x) |ₐ^(t₀⁻) + W(x) |_(t₀⁺)^(s₀⁻) + W(x) |_(s₀⁺)^b = 0.

Thus

p(t₀)[∂K/∂x (t₀⁻, t₀)·K(t₀, s₀) − ∂K/∂x (t₀⁺, t₀)·K(t₀, s₀)]

+ p(s₀)[∂K/∂x (s₀⁺, s₀)·K(s₀, t₀) − ∂K/∂x (s₀⁻, s₀)·K(s₀, t₀)]

+ p(b)[∂K/∂x (b, t₀)·K(b, s₀) − ∂K/∂x (b, s₀)·K(b, t₀)]

− p(a)[∂K/∂x (a, t₀)·K(a, s₀) − ∂K/∂x (a, s₀)·K(a, t₀)] = 0.

But, using the known jumps in ∂K/∂x, this expression can be rewritten

−K(t₀, s₀) + K(s₀, t₀)

+ p(b)[∂K/∂x (b, t₀)·K(b, s₀) − ∂K/∂x (b, s₀)·K(b, t₀)]

− p(a)[∂K/∂x (a, t₀)·K(a, s₀) − ∂K/∂x (a, s₀)·K(a, t₀)] = 0.

Finally, using the boundary conditions

α₁K(a, t₀) = −α₂ ∂K/∂x (a, t₀),
α₁K(a, s₀) = −α₂ ∂K/∂x (a, s₀),

and

β₁K(b, t₀) = −β₂ ∂K/∂x (b, t₀),
β₁K(b, s₀) = −β₂ ∂K/∂x (b, s₀),

we see that the bracketed terms vanish. Thus K(s₀, t₀) = K(t₀, s₀), as asserted. ∎
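Theorem 12-10 can be spot-checked on the Green's function computed in the example of Section 12-10 (L = D², y(a) = y(b) = 0); the endpoints a = 0, b = 1 below are an arbitrary illustrative choice.

```python
# Symmetry check K(x, t) = K(t, x) for the kernel of the example L = D^2
# with y(0) = y(1) = 0 (endpoints chosen only for illustration).

def K(x, t, a=0.0, b=1.0):
    if x <= t:
        return (x - a) * (t - b) / (b - a)
    return (t - a) * (x - b) / (b - a)

for (x, t) in [(0.2, 0.9), (0.5, 0.1), (0.33, 0.67)]:
    assert abs(K(x, t) - K(t, x)) < 1e-12
```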
13

boundary-value problems for


partial differential equations:

the wave and heat equations

13-1 INTRODUCTION
Historically the theory of boundary-value problems grew out of the study of certain
partial differential equations encountered in classical physics, and many of the
ideas treated in this book originated in attempts to solve these problems. For this
reason, if for no other, any introduction to the subject of boundary-value problems
would be incomplete without a discussion of partial differential equations. But
there are, in fact, compelling reasons for pursuing this subject which are quite
unrelated to any feelings of historical nicety. And though these reasons are
bound to become obvious as the chapter unfolds, it may not be out of place to
mention some of them before we begin.
For one thing, this discussion will serve to unite the various results on eigen-
functions and orthogonal series expansions that have been obtained in the pre-
ceding chapters, bringing them into sharper focus, and reinforcing the point of
view which sees them as a unified body of mathematical thought. For another,
we will at last be in a position to consider nontrivial physical problems, and the
fact that they can be solved with comparative ease should increase the student's
appreciation for the power of the techniques we now have at hand. And finally,
this material will in its turn suggest further problems leading to new results in the
subject.

13-2 PARTIAL DIFFERENTIAL EQUATIONS

The definition of a linear differential operator given in Chapter 3 can easily be extended to include operators which involve partial differentiation. Such operators act on vector spaces whose members are functions of several variables, and their associated operator equations are known as linear partial differential equations. Thus if C¹(R) is the space of all continuously differentiable functions defined in a region R of the xy-plane, the most general first-order linear differential operator defined on C¹(R) has the form

L = a(x, y)Dₓ + b(x, y)D_y + c(x, y) = a(x, y) ∂/∂x + b(x, y) ∂/∂y + c(x, y),

where a(x, y), b(x, y), and c(x, y) are continuous everywhere in R. Here L may be viewed as a linear transformation from C¹(R) to C(R), the space of all functions continuous in R, and if h is any preassigned function in C(R), the equation Lu = h, u unknown, is a first-order (linear) partial differential equation.
It is clear that analogous, but more cumbersome, formulas can be given for linear differential operators (and equations) of higher order involving any number of variables. Moreover, it is equally clear that all of the standard facts pertaining to linearity continue to hold in this more general setting. This notwithstanding, the general theory of linear partial differential equations has very little in common with that of ordinary equations, for the simple reason that the solution space of every (homogeneous) linear partial differential equation is infinite-dimensional. For instance, it is not difficult to show that the general solution of the first-order equation

∂u/∂x + ∂u/∂y = 0   (13-1)

is u(x − y), where u is an everywhere differentiable but otherwise arbitrary function of a single variable. From this it follows that each of the functions

sin (x − y),  cos (x − y),  e^(x−y),  (x − y)ᵃ,  a > 1,

is a solution of (13-1), and it is clear that these functions are linearly independent in C(R) for any R. The fact that even so simple an equation as this has such a wealth of linearly independent solutions gives some indication of the difficulties which must be surmounted in the study of partial differential equations.


These remarks go far to explain our insistence upon treating partial differential equations strictly within the context of boundary-value problems. For there the difficulties just mentioned vanish, and all of the problems we shall consider do have unique solutions. But before we go on to describe these problems in detail we must define what is meant by a solution of a boundary-value problem involving a partial differential equation. Surprisingly, this is not so easy as it sounds, and thus, for the sake of simplicity, we shall give the definition only in the two-dimensional case with boundary conditions involving a single function. Once this has been done the reader should have no difficulty in extending the definition to higher-dimensional regions and more complicated boundary conditions.

In the particular case just mentioned the ingredients of a boundary-value problem are

(i) a two-dimensional plane region R with boundary B,
(ii) a partial differential equation defined everywhere in R, and
(iii) a function f defined on B.


As used here the word "region" is a technical term reserved to describe a connected subset of the plane each point of which can be surrounded by a circle lying entirely within the set in question.* Thus the upper half plane, an infinite or semi-infinite vertical strip of the plane, the interior of a rectangle, or the annulus between two concentric circles are all regions in this sense, and as such are typical of the two-dimensional regions in which boundary-value problems involving partial differential equations are defined. In addition, we shall assume henceforth that the boundary of each of the regions we consider is made up of a finite number of simple differentiable arcs in the sense of the definition given in Section 9-8.†


Whenever (i), (ii), (iii) are given, the problem, of course, is to find all functions u = u(x, y) which satisfy the differential equation in R and reduce to f on B. (Note that we do not require u to satisfy the differential equation on B.) But were this all that was required we could immediately solve the problem by letting u be any solution whatever of the differential equation in R and then redefining u on B to coincide with f. However, this clearly violates the spirit of the problem, and to ensure that it violates the letter as well we must impose some restriction which will guarantee that the values of u near B are related to the values of f on B. Although this can be done in a variety of ways, the constraining condition is usually taken to be one of the following:

a. For each point b₀ on B and each p in R, the limiting value of u(p) as p approaches b₀ along any smooth curve in R is f(b₀). (See Fig. 13-1.)

FIGURE 13-1    FIGURE 13-2

b. For each point b₀ on B and each p in R, the limiting value of u(p) as p approaches b₀ along any smooth curve in R which is normal to B at b₀ is f(b₀). (See Fig. 13-2.)


In practice the choice between (a) and (b) is reflected in the hypotheses which must be imposed on f to guarantee the existence of solutions of the desired type. Since the second of these conditions is less restrictive than the first, it allows theorems to be proved in greater generality and is therefore in wider use. It is the condition which we shall adopt in the following chapters.

* A subset R of the plane is said to be connected or pathwise connected if every pair of points in R can be joined by a smooth curve lying entirely in R.

† The boundary of a region R is, by definition, the set B of all points in the plane with the property that every circle centered at a point of B contains points in R and points not in R.

EXERCISES

1. (a) Give the formula for the most general linear differential operator L: C¹(R) → C(R) when R is a region of xyz-space.

(b) Give the formula for the most general linear differential operator L: C²(R) → C(R) when R is a region of the xy-plane.

2. Determine which of the following partial differential equations are linear.

[Equations (a)–(f) are illegible in the scanned source.]

3. (a) Show that u(x − y) is a solution of the partial differential equation uₓ + u_y = 0 whenever u is a differentiable function of a single variable.

(b) Let F(x, y) be a solution of uₓ + u_y = 0. Set p = x + y, q = x − y, and write F(x, y) as

F(x, y) = F((p + q)/2, (p − q)/2) = G(p, q).

Show that G is actually a function of q alone by computing ∂G/∂p, and then deduce that every solution of uₓ + u_y = 0 can be written in the form u(x − y), where u is a differentiable function of a single variable.
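The claim of Exercise 3(a) is easy to probe numerically: for any differentiable function w of one variable, u(x, y) = w(x − y) should satisfy uₓ + u_y = 0. The sketch below uses central differences and an arbitrary illustrative w (not taken from the text).

```python
import math

def w(z):
    # arbitrary differentiable function of a single variable (illustrative)
    return math.exp(-z * z) + 0.5 * math.sin(3 * z)

def u(x, y):
    return w(x - y)

h = 1e-6
for (x, y) in [(0.0, 0.0), (1.2, -0.7), (-2.0, 0.4)]:
    ux = (u(x + h, y) - u(x - h, y)) / (2 * h)   # central difference in x
    uy = (u(x, y + h) - u(x, y - h)) / (2 * h)   # central difference in y
    assert abs(ux + uy) < 1e-6                   # u_x + u_y = 0, Eq. (13-1)
```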

13-3 THE CLASSICAL PARTIAL DIFFERENTIAL EQUATIONS

Throughout the next three chapters we shall, with but one exception, be exclusively concerned with boundary-value problems involving various forms of the following second-order linear partial differential equations:

∂²u/∂x² + ∂²u/∂y² + ∂²u/∂z² = (1/a²) ∂²u/∂t²,  a > 0,   (13-2)

∂²u/∂x² + ∂²u/∂y² + ∂²u/∂z² = a² ∂u/∂t,   (13-3)

∂²u/∂x² + ∂²u/∂y² + ∂²u/∂z² = 0.   (13-4)
Each of these equations first arose in classical (i.e., Newtonian) physics as the
mathematical description of a particular type of physical system, and the first two
are still known by the names of the simplest systems they describe. Thus (13-2)
is called the wave equation, and (13-3) the heat equation. Equation (13-4), on the
other hand, is known as Laplace's equation in honor of one of the mathematicians
who first studied it. For the present we shall content ourselves with the observa-
tion that Laplace's equation can be viewed as the time independent version of the
heat equation, and shall turn our attention to (13-2) and (13-3).

FIGURE 13-3

As its name indicates, the wave equation furnishes a satisfactory mathematical description of certain vibrating physical systems. In particular, the so-called one-dimensional wave equation

∂²u/∂x² = (1/a²) ∂²u/∂t²,   (13-5)

in which a is a positive constant, is the differential equation governing the motion of a vibrating string of constant density. To see why this is so, consider a stretched elastic string of arbitrary length which is vibrating vertically in the xu-plane and whose position of rest lies along the x-axis (Fig. 13-3). Throughout this discussion we shall make the following simplifying assumptions:

(a) The amplitude of vibration of the string is small, and each point on the string moves only in a vertical direction.

(b) All frictional forces (both internal and external) may be neglected.

(c) The mass of the string per unit length is sufficiently small in comparison with the tension in the string that gravitational forces may be neglected.*

In Fig. 13-4 we have isolated a small segment of the string and have indicated by T and T′ the forces of tension acting on its endpoints. Since the string moves only in the vertical direction, the horizontal components of T and T′ must cancel, and we have

T cos α = T′ cos α′ = k,   (13-6)

where k denotes the constant horizontal tension in the string. On the other hand, the total force acting on this element of the string in the vertical direction is T′ sin α′ − T sin α. Hence, by Newton's second law,

T′ sin α′ − T sin α = ρ Δx ∂²u/∂t²,   (13-7)

FIGURE 13-4

* The question as to whether these assumptions are permissible from a physical point
of view and do not prejudice the solution obtained is one which must be settled by the
physicist in his laboratory. In most situations they are, in fact, physically acceptable.

FIGURE 13-5

where ρ denotes the mass per unit length of the string, Δx the length of the segment in question, and ∂²u/∂t² the acceleration of the segment at an appropriate point between x and x + Δx. Using (13-6), this equation may be rewritten

T′ sin α′ / (T′ cos α′) − T sin α / (T cos α) = (ρ Δx / k) ∂²u/∂t²,

or

tan α′ − tan α = (ρ Δx / k) ∂²u/∂t².

But tan α′ = ∂u/∂x evaluated at x + Δx, and tan α = ∂u/∂x evaluated at x. Hence

(1/Δx) [∂u/∂x |_(x+Δx) − ∂u/∂x |ₓ] = (ρ/k) ∂²u/∂t²,

and passing to the limit as Δx → 0, we obtain

∂²u/∂x² = (1/a²) ∂²u/∂t²,

where a = √(k/ρ).
The two-dimensional wave equation

∂²u/∂x² + ∂²u/∂y² = (1/a²) ∂²u/∂t²   (13-8)

arises in physics as the differential equation governing the motion of a thin flexible membrane of constant density which is tightly stretched and then fixed along its boundary, and which vibrates in the u-direction from its position of rest in the xy-plane. In this case the simplifying physical assumptions under which the equation is derived are as follows:

(a) the amplitude of vibration is small, and every point of the membrane moves only in the u-direction;

(b) all frictional and gravitational forces may be neglected;

(c) the tension per unit length in any direction is constant throughout the membrane.

To obtain the equation of motion under these assumptions we analyze the forces acting on the portion of the membrane shown in Fig. 13-5. If T denotes tension per unit length, then the vertical components of the forces acting along edges 1 and 2 of the indicated portion of the membrane are T Δx sin α₁ and T Δx sin α₂ for appropriate angles α₁ and α₂. But since the amplitude of deflection is small, we can replace sin α₁ and sin α₂ by tan α₁ and tan α₂, respectively.* Thus the total vertical force contributed by edges 1 and 2 is

T Δx (tan α₁ − tan α₂).

Similarly, edges 3 and 4 contribute a vertical force

T Δy (tan α₃ − tan α₄)

for appropriate angles α₃ and α₄. Thus, by Newton's second law,

T Δx (tan α₁ − tan α₂) + T Δy (tan α₃ − tan α₄) = ρ Δx Δy ∂²u/∂t²,   (13-9)

where ρ is the mass of the membrane per unit area, and ∂²u/∂t² is computed at some point in the region under consideration. But

tan α₁ = ∂u/∂y |_(x₁, y+Δy),    tan α₂ = ∂u/∂y |_(x₂, y),

tan α₃ = ∂u/∂x |_(x+Δx, y₂),    tan α₄ = ∂u/∂x |_(x, y₁),

where x₁ and x₂ lie between x and x + Δx, and y₁ and y₂ between y and y + Δy. Thus (13-9) may be rewritten

(1/Δy)[∂u/∂y |_(x₁, y+Δy) − ∂u/∂y |_(x₂, y)] + (1/Δx)[∂u/∂x |_(x+Δx, y₂) − ∂u/∂x |_(x, y₁)] = (ρ/T) ∂²u/∂t²,

* Compare

sin x = x − x³/3! + x⁵/5! − ⋯

with

tan x = x + x³/3 + 2x⁵/15 + ⋯.

Their difference is of the order of magnitude of x³/2, which is small if x is near zero.

and passing to the limit as Δx and Δy tend to zero, we obtain

∂²u/∂x² + ∂²u/∂y² = (1/a²) ∂²u/∂t²,

where a = √(T/ρ).

Finally, the three-dimensional wave equation arises, among other places, in that branch of physics which deals with electric and magnetic fields in space. In fact, by using Maxwell's equations from electromagnetic field theory it can be shown that each of the components of both the electric and magnetic field strengths in a region of space is governed by Eq. (13-2).
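Both derivations above replace sin α by tan α for small angles, on the strength of the footnote's estimate tan x − sin x ≈ x³/2. That estimate can be checked numerically: the ratio (tan x − sin x)/(x³/2) should tend to 1 as x → 0.

```python
import math

# Check that tan x - sin x behaves like x^3/2 for small x.
for x in (0.1, 0.05, 0.01):
    ratio = (math.tan(x) - math.sin(x)) / (x**3 / 2)
    assert abs(ratio - 1) < 0.02
```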
Next we consider the heat equation, and show that under appropriate assumptions it serves to describe the temperature distribution in material bodies as a function of position and time. To obtain the one-dimensional version of this equation we consider a slender homogeneous rod, lying along the x-axis, and insulated so that no heat can escape across its longitudinal surface. In addition, we make the simplifying assumption that the temperature in the rod is constant on each cross section perpendicular to the x-axis, and thus that the flow of heat in the rod takes place only in the x-direction.
Now, for empirical reasons it is assumed that the quantity of heat, ΔH, which flows across any cross section of the rod is proportional to the rate of change of the temperature u on that cross section. In other words,

ΔH = −k ∂u/∂x,  k > 0,   (13-10)

where the minus sign is introduced because heat flows in the direction opposite to the positive direction of ∂u/∂x. But it is also known that the amount of heat which accumulates in any portion of the rod is proportional to the product of its mass and the (average) time rate of change of temperature in that mass. Hence

ΔH = cm ∂u/∂t   (13-11)

for an appropriate positive constant c, known as the specific heat of the material in question.

To obtain the heat equation we now focus our attention on the portion of the rod between the points x and x + Δx. If ρ denotes the mass of the rod per unit length, then by (13-11) the amount of heat accumulating in this portion of the rod per unit time is

ΔH = cρ Δx ∂u/∂t,

where ∂u/∂t is computed at some point between x and x + Δx. But by (13-10) the amount of heat flowing into this portion of the rod across its two faces is

ΔH = −k [∂u/∂x |ₓ − ∂u/∂x |_(x+Δx)],

and since these two expressions must be equal, we have

(1/Δx)[∂u/∂x |_(x+Δx) − ∂u/∂x |ₓ] = (cρ/k) ∂u/∂t.

Taking the limit as Δx → 0, we obtain

∂²u/∂x² = a² ∂u/∂t,

where the constant cρ/k has been replaced by a² to emphasize that it is positive.*

An argument similar in almost all respects to the one just given can be used to show that the temperature distribution in a thin rectangular plate, insulated so that no heat flows across its faces, is governed by the two-dimensional heat equation

∂²u/∂x² + ∂²u/∂y² = a² ∂u/∂t.

We leave this derivation as an exercise.
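As a quick sanity check of the one-dimensional heat equation, one can verify numerically that a separated solution of the form u(x, t) = e^(−(n/a)²t) sin nx (an illustrative choice, anticipating the method of Section 13-4 rather than anything stated here) satisfies ∂²u/∂x² = a² ∂u/∂t.

```python
import math

# Finite-difference check that u(x, t) = exp(-(n/a)^2 t) sin(nx)
# satisfies u_xx = a^2 u_t; the values of a and n are illustrative.
a, n = 2.0, 3

def u(x, t):
    return math.exp(-(n / a) ** 2 * t) * math.sin(n * x)

h = 1e-4
for (x, t) in [(0.7, 0.2), (1.9, 1.0)]:
    uxx = (u(x + h, t) - 2 * u(x, t) + u(x - h, t)) / h**2   # second difference
    ut = (u(x, t + h) - u(x, t - h)) / (2 * h)               # central difference
    assert abs(uxx - a**2 * ut) < 1e-4
```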


Each of the physical problems described above must also be subjected to certain boundary and initial conditions before the future behavior of the system can be determined. In the case of the one-dimensional wave equation one is interested in finding solutions on a finite interval [0, L] under the assumption that at time t = 0 both u(x, 0) and u_t(x, 0) are known functions of x. Physically this corresponds to finding the equation of motion for a vibrating string of length L, given its initial displacement and initial velocity. Furthermore, if A denotes either of the endpoints x = 0 and x = L, then the boundary conditions imposed on the problem are always chosen from among the following:

1. u(A, t) = 0;
2. uₓ(A, t) = 0;
3. u(A, t) = (1/h)uₓ(A, t), h a constant.

The first of these conditions simply means that the string is held fixed at the endpoint A, while the second is known as a free-end condition. Here the string can move in a vertical direction at A, but is constrained to do so in such a way that it always remains horizontal there (see Fig. 13-6). The third condition says that at x = A the displacement of the string is proportional to its slope. But for small

FIGURE 13-6 FIGURE 13-7

* This equation is also known as the diffusion equation.



vibrations we know that uₓ(A, t) = tan α ≈ sin α, where α is the angle which the string makes with the horizontal at A. Thus u(A, t) ≈ (1/h) sin α, and the string behaves as though it were attached to the end of a spring as shown in Fig. 13-7. Indeed, the requirement that such a system be in equilibrium at a given displacement u(A, t) is T sin α = ku(A, t), where k denotes the spring constant, and (3) now follows by setting h = k/T. Finally, if the coupling constant h is very small we approach the free-end condition given in (2).
When solving the one-dimensional heat equation it is customary to start with a given initial temperature distribution u(x, 0) = f(x) along the conducting rod, and then choose the boundary conditions from among the following:

1. u(A, t) = k, k a constant;
2. uₓ(A, t) = 0;
3. uₓ(A, t) = hu(A, t), h a constant.

The first of these conditions means that the end of the rod at A is maintained at the constant temperature k, the second that the rod is insulated at A and neither gains nor loses heat at that end. This time the third condition may be read as asserting that the rate at which heat passes through the end at A is proportional to the temperature at A.

EXERCISES

1. Derive the two-dimensional heat equation

∂²u/∂x² + ∂²u/∂y² = a² ∂u/∂t

under the assumptions given in the text.

2. Show that the equation governing the temperature distribution in a thin homogeneous rod is

a² ∂u/∂t = ∂²u/∂x² + b g(x, t),  b a constant,

when heat is being generated in the rod (say by an electric current) at the rate g(x, t) per unit length.

3. Suppose that the assumptions under which the equation of motion of the vibrating string was derived are modified to include a retarding force due to air resistance which is proportional to the velocity of the string. Show that the equation governing the motion then becomes

∂²u/∂x² = b ∂u/∂t + (1/a²) ∂²u/∂t²

for an appropriate positive constant b.

4. Suppose that an external force of magnitude G(x, t) per unit length acts on a vibrating string. (This is the case of "forced vibrations.") Show that the equation governing the motion is

∂²u/∂x² = (1/a²) ∂²u/∂t² − (1/(a²ρ)) G(x, t),

where ρ is the (constant) density of the string.

In the following exercises we sketch a method for solving the one-dimensional wave equation discovered by the 18th-century French mathematician and philosopher Jean d'Alembert and called after him d'Alembert's solution of the wave equation.

5. Prove that the function

u(x, t) = f(x + at) + g(x − at)

is a solution of the one-dimensional wave equation whenever f and g are twice differentiable functions of a single variable.

6. (a) Let G(p, q) and its derivatives G_p, G_q, G_pp, G_pq, and G_qq be continuous throughout the pq-plane. Prove that there exist twice continuously differentiable functions G₁ and G₂ of a single variable such that

G(p, q) = G₁(p) + G₂(q)

if and only if G_pq = 0.

(b) Let F(x, t) be a solution of the one-dimensional wave equation, and suppose that F_xx, F_xt, F_tt are continuous. Set p = x + at, q = x − at, and, as in Exercise 3(b) of Section 13-2, rewrite F(x, t) in the form G(p, q). Prove that G_pq = 0, and then use the result in (a) to conclude that every twice continuously differentiable solution of the one-dimensional wave equation has the form

f(x + at) + g(x − at).

7. Let u(x, t) = f(x + at) + g(x − at) be any twice continuously differentiable solution of the one-dimensional wave equation, and note that

u(x, 0) = f(x) + g(x),

u_t(x, 0) = a[f′(x) − g′(x)].

Set

r(x) = f(x) + g(x),

s(x) = a[f′(x) − g′(x)],

and show that

u(x, t) = (1/2)[r(x + at) + r(x − at)] + (1/2a) ∫_(x−at)^(x+at) s(ξ) dξ.

(This is d'Alembert's solution of the wave equation.)
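D'Alembert's formula can be verified numerically on sample data. In the sketch below, f = sin and g = cos are hypothetical waveforms chosen purely for illustration; the formula built from r and s should reproduce the traveling-wave form u = f(x + at) + g(x − at).

```python
import math

# Numerical check of d'Alembert's solution (Exercise 7) for illustrative data.
a = 2.0
f, g = math.sin, math.cos

def r(x):
    return f(x) + g(x)                        # r = f + g

def s(x):
    return a * (math.cos(x) + math.sin(x))    # s = a*(f' - g'); f' = cos, g' = -sin

def dalembert(x, t, n=4000):
    # (1/2)[r(x+at) + r(x-at)] + (1/2a) * integral of s over [x-at, x+at]
    lo, hi = x - a * t, x + a * t
    dxi = (hi - lo) / n
    integral = sum(s(lo + (i + 0.5) * dxi) for i in range(n)) * dxi
    return 0.5 * (r(x + a * t) + r(x - a * t)) + integral / (2 * a)

for (x, t) in [(0.3, 0.1), (1.0, 0.5), (-2.0, 1.3)]:
    assert abs(dalembert(x, t) - (f(x + a * t) + g(x - a * t))) < 1e-6
```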

8. (a) Show that the physical units of the constant a appearing in the equation

∂²u/∂x² = (1/a²) ∂²u/∂t²

are those of velocity, i.e., length/time. [Hint: Recall that a = √(T/ρ), where the units of T and ρ are, respectively, force/length and mass/length, and then use Newton's second law.]

(b) Use the above result together with Exercise 6(b) to show that every twice continuously differentiable solution of the one-dimensional wave equation can be interpreted as the superposition of two traveling waves, one of which moves to the right and one to the left, both with velocity a.

FIGURE 13-8

9. Suppose that at time t = 0 a homogeneous string of infinite length is displaced as shown in Fig. 13-8 and then released from rest. Sketch the displacement of the string at time t = 1/2, t = 1, and t = 2, under the assumption that a = 1. [Hint: See Exercise 8(b).]

13-4 SEPARATION OF VARIABLES: THE ONE-DIMENSIONAL WAVE EQUATION

In this section we introduce the method of separation of variables for solving boundary-value problems involving partial differential equations. In essence this method consists of finding solutions of the equation which are products of functions of a single variable, and then combining these solutions in such a way that the given boundary conditions are satisfied. To illustrate how this is done we consider the one-dimensional wave equation

∂²u/∂x² = (1/a²) ∂²u/∂t²,   (13-12)

subject to the boundary conditions

u(0, t) = u(π, t) = 0,

u(x, 0) = f(x),   (13-13)

u_t(x, 0) = g(x),

where f and g are assumed known. Physically this problem consists of finding the equation of motion of an elastic string which is stretched along the x-axis from 0 to π, clamped at its endpoints, given initial position f(x), initial velocity g(x), and then allowed to vibrate freely.

We begin by seeking solutions of (13-12) of the form

u(x, t) = X(x)T(t),   (13-14)


where X and T are, respectively, functions of x and t alone. Furthermore, we demand that these solutions be such that u(0, t) = u(π, t) = 0 for all t > 0 (see 13-13). This, in turn, implies that X must satisfy the endpoint conditions X(0) = X(π) = 0, else (13-14) would yield T(t) = 0, and u(x, t) would then be the trivial solution of (13-12). If we now assume (as we must) that X and T are twice differentiable, then

∂²u/∂x² = X″T,  ∂²u/∂t² = XT″.

Substituting in (13-12), we obtain

X″T = (1/a²)XT″,

whence

X″/X = (1/a²)(T″/T)   (13-15)

whenever XT ≠ 0. At this point we make the crucial observation that the left-hand side of this expression is a function of x alone, while the right-hand side involves only t. Thus each is a constant, λ, and (13-15) is equivalent to the pair of ordinary linear differential equations

X″ − λX = 0,
                        (13-16)
T″ − λa²T = 0,

the first of which must satisfy the endpoint conditions X(0) = X(π) = 0.

To solve these equations we first note that the boundary-value problem

X″ − λX = 0,   X(0) = X(π) = 0

is essentially the one solved as Example 1 in Section 12-6. Thus we know that up
to multiplicative constants the only nontrivial solutions of this problem are

X_n(x) = sin nx,   n = 1, 2, …,   (13-17)

corresponding to the eigenvalues λ_n = −n². Moreover, when λ = −n² the
general solution of T″ − λa²T = 0 is

T_n(t) = A_n sin nat + B_n cos nat,   (13-18)

where A_n and B_n are arbitrary constants. Hence, forming the product of the
functions in (13-17) and (13-18), we see that each of the functions

u_n(x, t) = sin nx (A_n sin nat + B_n cos nat)   (13-19)

is a solution of the one-dimensional wave equation which vanishes when x = 0
and x = π.
518 THE WAVE AND HEAT EQUATIONS | CHAP. 13

This done, we now attempt to use these functions to construct a solution u(x, t)
of (13-12) such that

u(x, 0) = f(x),   u_t(x, 0) = g(x).   (13-20)

In general, of course, no one of the u_n(x, t) by itself will satisfy these conditions.
Neither, for that matter, will any finite sum of them, and it therefore appears that
the only possible choice is an infinite series of the form

u(x, t) = Σ_{n=1}^∞ u_n(x, t)
        = Σ_{n=1}^∞ sin nx (A_n sin nat + B_n cos nat)   (13-21)

for suitable values of A_n and B_n. Now when t = 0, this series reduces to

u(x, 0) = Σ_{n=1}^∞ B_n sin nx,   (13-22)

and we see that the first condition in (13-20) will be satisfied if the B_n are chosen
in such a way that (13-22) converges (pointwise) to the function f in the interval
[0, π]. But this is a familiar problem which, as we know, can be solved in either
of the following equivalent ways.
I. Let O_f denote the odd extension of f to the interval [−π, π] (see Section 9-5),
and let B_n be the nth Fourier coefficient of O_f. Then

B_n = (2/π) ∫₀^π f(x) sin nx dx,   (13-23)

and whenever f is sufficiently well behaved, the series obtained from (13-22)
using these values for the coefficients will converge to f everywhere in [0, π].

II. The functions sin nx, n = 1, 2, …, form a complete set of eigenfunctions
for the Sturm-Liouville problem

X″ − λX = 0,   X(0) = X(π) = 0,

and are an orthogonal basis for the space PC[0, π] (Theorem 12-7). Thus the
B_n in (13-22) may also be computed as the coefficients of the expansion of f in
terms of the eigenfunctions sin nx. The student should realize that the same series
is obtained in either case.
The technique for determining the A_n is much the same. We differentiate (13-21)
term-by-term with respect to t (under the assumption, of course, that this can be
done), and then set t = 0. Using the fact that u_t(x, 0) = g(x), we obtain

g(x) = Σ_{n=1}^∞ naA_n sin nx,   (13-24)

and it follows that naA_n must be the nth coefficient of the eigenfunction expansion
of g. Thus

A_n = (2/nπa) ∫₀^π g(x) sin nx dx,   (13-25)

and we are done.
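Although no such computation appears in the text, formulas (13-23) and (13-25) are easy to test numerically. The Python fragment below is our own illustration: it takes the hypothetical initial data f(x) = x(π − x), g(x) = 0 — chosen only because the coefficients are then known in closed form, B_n = 8/(πn³) for odd n and 0 for even n — evaluates (13-23) by the trapezoid rule, and checks that a partial sum of (13-22) reproduces f.

```python
import numpy as np

# Trapezoid rule on a fixed grid (written out by hand so the sketch
# does not depend on np.trapz, which was removed in NumPy 2.0).
def trapint(y, x):
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

x = np.linspace(0.0, np.pi, 20001)
f = x * (np.pi - x)                      # hypothetical initial position

def B(n):
    # B_n = (2/pi) * integral_0^pi f(x) sin(nx) dx, as in (13-23)
    return (2.0 / np.pi) * trapint(f * np.sin(n * x), x)

def B_exact(n):
    # closed form for this particular choice of f
    return 8.0 / (np.pi * n**3) if n % 2 == 1 else 0.0

coeff_errors = [abs(B(n) - B_exact(n)) for n in range(1, 8)]

# Partial sum of (13-22): with g = 0 the A_n vanish, so at t = 0 the
# series should already be close to f after a few dozen terms.
u0 = sum(B(n) * np.sin(n * x) for n in range(1, 40))
max_err = float(np.max(np.abs(u0 - f)))
```

The rapid n⁻³ decay of the B_n is what makes a forty-term partial sum adequate here; it reflects the smoothness hypotheses that Section 13-5 places on f.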


An excellent picture of the physical significance of this solution can be given
in the special case where g(x) = 0. Then

u(x, t) = Σ_{n=1}^∞ B_n sin nx cos nat,

and the components of the motion assume the simple form

u_n(x, t) = B_n sin nx cos nat,   n = 1, 2, ….

In this case the frequency of vibration of each of the components of u is an integral
multiple of the fundamental frequency ν₁ = a/2π of u₁.* This frequency determines what is known as the fundamental tone of the vibration, and its multiples
determine the overtones or harmonics.
The way in which the eigenvalues determine the frequencies of the various
modes of vibration of the string becomes strikingly clear if the u_n(x, t) are graphed
for various values of t, as in Fig. 13-9. For a rapidly moving string these figures
present a reasonably accurate picture of the fundamental vibration and its first
three harmonics. The reader should note that the different overtones are characterized by the appearance of nodes or stationary points, which may be regarded
as visual manifestations of the eigenvalues for this problem. In the general case
where g ≠ 0 the situation is much the same, but does not lend itself to such an
easy graphical realization.

FIGURE 13-9 [graphs of u_n(x, t), n = 1, …, 4, for various values of t]

* The frequency ν of the wave described by the function cos at is, by definition, a/2π.
Physically, ν may be interpreted as the number of waves which pass a given point per
unit time.

EXERCISES

In Exercises 1 through 5 find the solution u(x, t) of the one-dimensional wave equation
on the interval [0, L] subject to the endpoint conditions u(0, t) = u(L, t) = 0 and initial
conditions as given.

1. u(x, 0) = sin (πx/L),   u_t(x, 0) = 0

2. u(x, 0) = 0,   u_t(x, 0) = sin (πx/L)

3. u(x, 0) = x(L − x),   u_t(x, 0) = 0

4. u(x, 0) = 0,   u_t(x, 0) = { 0 for 0 < x < L/3 and 2L/3 < x < L;  1 for L/3 < x < 2L/3 }

5. u(x, 0) = Σ_{n=1}^N A_n sin (nπx/L),   u_t(x, 0) = Σ_{n=1}^N B_n sin (nπx/L)

6. If gravitational acceleration is taken into account, the motion of a vibrating string
is governed by the equation

∂²u/∂t² = a² ∂²u/∂x² − g.

(See Exercise 4 of the preceding section.) Find the time-independent solution of
this equation, and show how to use it to treat the general case.

7. Find the equation of motion of a vibrating string with fixed ends if it is released
from rest in the plucked position shown in Fig. 13-10. Sketch the position of the
string at times L/2a, L/a, 3L/2a, 2L/a.

FIGURE 13-10 FIGURE 13-11

8. Generalize the results of the preceding exercise to the case where the string is plucked
as shown in Fig. 13-11.

9. (a) Solve the one-dimensional wave equation subject to the conditions

u(0, t) = u(L, t) = 0,

u(x, 0) = 0,

u_t(x, 0) = { 0 for δ < |x − L/2| < L/2;  1/(2ρδ) for |x − L/2| < δ },

where ρ is the linear density of the string.


(b) Take the limit of the solution found in (a) as δ → 0, and interpret the result
physically.

10. (a) Show that the solution of the boundary-value problem

a² ∂²u/∂x² − ∂²u/∂t² = F(x, t),

u(0, t) = u(L, t) = 0,

u(x, 0) = f(x),

u_t(x, 0) = g(x),

is v(x, t) + w(x, t), where v(x, t) is a solution of

a² ∂²u/∂x² − ∂²u/∂t² = F(x, t)

such that v(0, t) = v(L, t) = 0, and w(x, t) is the solution of

a² ∂²u/∂x² − ∂²u/∂t² = 0

such that w(0, t) = w(L, t) = 0, w(x, 0) = f(x) − v(x, 0), w_t(x, 0) = g(x) − v_t(x, 0).

(b) Use the technique suggested in (a) to solve the boundary-value problem

a² ∂²u/∂x² − ∂²u/∂t² = sin x cos t,

u(0, t) = u(L, t) = 0,

u(x, 0) = f(x),

u_t(x, 0) = g(x).

11. Under suitable assumptions it can be shown that the torsional vibrations of a
homogeneous metallic shaft of uniform circular cross section are governed by the
partial differential equation

∂²φ/∂t² = a² ∂²φ/∂x²,

where a is a positive constant, and φ(x, t) is the angular displacement from equilibrium at time t of the cross section of the shaft at x. (See Fig. 13-12.) Assume a
shaft of length L with φ(x, 0) = f(x), φ_t(x, 0) = g(x).

FIGURE 13-12

(a) Find φ(x, t) if the ends of the shaft are clamped, i.e., if φ(0, t) = φ(L, t) = 0.

(b) Find φ(x, t) if φ_x(0, t) = φ_x(L, t) = 0. (These conditions obtain when the
ends of the shaft are free to twist and no torque is transmitted across them.)

(c) Find φ(x, t) if the shaft is clamped at x = 0 and free at x = L.



*13-5 THE WAVE EQUATION; VALIDITY OF THE SOLUTION


In the preceding section we saw that the series

u(x, t) = Σ_{n=1}^∞ sin nx (A_n sin nat + B_n cos nat)   (13-26)

provides a formal solution of the boundary-value problem

∂²u/∂x² = (1/a²) ∂²u/∂t²,

u(0, t) = u(π, t) = 0,

and that the initial conditions u(x, 0) = f(x) and u_t(x, 0) = g(x) can be formally
satisfied by choosing A_n and B_n as the coefficients in the orthogonal series expansions

Σ_{n=1}^∞ B_n sin nx = f(x)   and   Σ_{n=1}^∞ naA_n sin nx = g(x)   (13-27)

on [0, π]. To complete the discussion of this problem we now impose conditions
on f and g which are sufficient to guarantee the validity of these results. Our
choice in this respect will be guided by the various convergence theorems for
Fourier series proved in Chapter 10, and we shall assume that the reader is familiar
with these results.
The first step in our argument consists of the simple observation that with A_n
and B_n as above

u₁(x, t) = Σ_{n=1}^∞ B_n sin nx cos nat   (13-28)

is the formal solution of the boundary-value problem

∂²u/∂x² = (1/a²) ∂²u/∂t²,

u(0, t) = u(π, t) = 0,   (13-29)

u(x, 0) = f(x),   u_t(x, 0) ≡ 0,

and

u₂(x, t) = Σ_{n=1}^∞ A_n sin nx sin nat   (13-30)

is the formal solution of

∂²u/∂x² = (1/a²) ∂²u/∂t²,

u(0, t) = u(π, t) = 0,   (13-31)

u(x, 0) = 0,   u_t(x, 0) = g(x).


13-5 | THE WAVE EQUATION; VALIDITY OF THE SOLUTION 523

Thus (13-26) can be viewed as the sum of the solutions of two simpler problems,
and it suffices to direct our attention to them.
Beginning with the first, that is with (13-28) and (13-29), we recall that the
series for f given in (13-27) will converge (pointwise) to f(x) for each x in [0, π]
whenever

(i) f is continuous and f′ piecewise continuous on [0, π], and

(ii) f(0) = f(π) = 0.

Moreover, under these hypotheses, this series is actually uniformly convergent on
(−∞, ∞), where it represents the odd extension of f on [−π, π] repeated periodically along the entire x-axis. Denoting this extension by F, so that

F(x) = Σ_{n=1}^∞ B_n sin nx   (13-32)

for all x, we now use the identity

sin nx cos nat = ½ sin n(x − at) + ½ sin n(x + at)

to rewrite (13-28) as

u₁(x, t) = ½ Σ_{n=1}^∞ B_n sin n(x − at) + ½ Σ_{n=1}^∞ B_n sin n(x + at).   (13-33)

(The rearrangement of terms here is justified by the fact that under the hypotheses
in effect this series is absolutely convergent.) Thus

u₁(x, t) = ½[F(x − at) + F(x + at)],

and it now follows that

u₁(0, t) = ½[F(−at) + F(at)]
         = ½[−F(at) + F(at)]
         = 0,

u₁(π, t) = ½[F(π − at) + F(π + at)]
         = ½[F(−π − at) + F(π + at)]
         = ½[−F(π + at) + F(π + at)]
         = 0,

and

u₁(x, 0) = ½[F(x) + F(x)]
         = F(x)
         = f(x)

on [0, π]. Thus u₁ satisfies the first three boundary conditions.
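The passage from (13-28) to the form ½[F(x − at) + F(x + at)] uses only the identity sin nx cos nat = ½ sin n(x − at) + ½ sin n(x + at), which holds term by term and can be checked numerically on a partial sum. In the sketch below (an illustration of ours, not the authors'; the coefficients B_n and the value of a are arbitrary test data), F_N plays the role of F:

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.normal(size=10)              # hypothetical coefficients B_1..B_10
a, t = 2.0, 0.7
x = np.linspace(0.0, np.pi, 301)
n = np.arange(1, 11)[:, None]        # column of mode numbers

# Partial sum of (13-28): sum_n B_n sin nx cos nat
series = np.sum(B[:, None] * np.sin(n * x) * np.cos(n * a * t), axis=0)

# The same sum rewritten as (1/2)[F_N(x - at) + F_N(x + at)],
# where F_N is the N-term sine series standing in for F.
F_N = lambda s: np.sum(B[:, None] * np.sin(n * s), axis=0)
travelling = 0.5 * (F_N(x - a * t) + F_N(x + a * t))

max_diff = float(np.max(np.abs(series - travelling)))
```

Since the rearrangement is exact term by term, the two expressions agree to rounding error.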



To ensure that it satisfies the last as well, we now demand that

(iii) f′ be continuous on [0, π].*

Under this assumption it is easy to show that F′ exists and is continuous for all x
(see Exercise 3), whence

∂u₁/∂t = ½[−aF′(x − at) + aF′(x + at)]

and

∂u₁/∂t |_{t=0} = ½[−aF′(x) + aF′(x)] = 0,

as required.
Finally, to complete the argument and prove that u₁ is also a solution of the one-dimensional wave equation we impose the following additional restrictions on f:

(iv) f″ is continuous on [0, π], and

(v) f″(0) = f″(π) = 0.

For then, arguing as above (Exercise 3 again) we find that F″ is everywhere continuous, and that

∂²u₁/∂t² = (a²/2)[F″(x − at) + F″(x + at)],

∂²u₁/∂x² = ½[F″(x − at) + F″(x + at)].

Thus

∂²u₁/∂x² = (1/a²) ∂²u₁/∂t²,

and the proof is complete.


This done, we turn our attention to the formal solution u₂ of the boundary-value problem described in (13-31), and begin by assuming that

(i) g is continuous and g′ piecewise continuous on [0, π], and

(ii) g(0) = g(π) = 0.

Under these conditions we know that the Fourier sine series

g(x) = Σ_{n=1}^∞ C_n sin nx

* In particular, this means that f has a right-hand derivative f′_R at zero, a left-hand
derivative f′_L at π, and that

lim_{x→0+} f′(x) = f′_R(0),   lim_{x→π−} f′(x) = f′_L(π).

converges uniformly and absolutely to g(x) for all x in [0, π]. Since A_n = C_n/na,
we have

u₂(x, t) = (1/a) Σ_{n=1}^∞ (C_n/n) sin nx sin nat.   (13-34)

Thus if we tentatively allow the necessary term-by-term differentiation, we find that

∂u₂/∂t = Σ_{n=1}^∞ C_n sin nx cos nat

or, using the trigonometric identity introduced earlier in this section,

∂u₂/∂t = ½ Σ_{n=1}^∞ C_n sin n(x − at) + ½ Σ_{n=1}^∞ C_n sin n(x + at).   (13-35)

But the assumptions imposed on g imply that these two series are uniformly and
absolutely convergent on [0, π]. Hence so too are

Σ_{n=1}^∞ C_n sin nx cos nat   and   (1/a) Σ_{n=1}^∞ (C_n/n) sin nx sin nat,

and the term-by-term differentiation of (13-34) is therefore legitimate. Moreover,
if G denotes the odd periodic extension of g to the whole real line, then

∂u₂/∂t = ½[G(x − at) + G(x + at)],

and

u₂(x, t) = ½ ∫₀^t G(x − aτ) dτ + ½ ∫₀^t G(x + aτ) dτ,

or

u₂(x, t) = (1/2a) ∫_{x−at}^{x+at} G(s) ds.   (13-36)
t-

From this it follows at once that u₂(x, 0) = 0 and that

∂u₂/∂t |_{t=0} = g(x)

on [0, π]. Thus the function defined by (13-36) satisfies the initial conditions
prescribed in (13-31). Furthermore, since G is odd and periodic with period 2π,
it satisfies the end conditions u₂(0, t) = u₂(π, t) = 0 as well. Hence, to complete the argument we need only show that this function is also a solution of the
one-dimensional wave equation, and that its series expansion is

(1/a) Σ_{n=1}^∞ (C_n/n) sin nx sin nat,

as required by (13-34).

To this end we recall that

∂u₂/∂t = ½[G(x − at) + G(x + at)]
        = Σ_{n=1}^∞ C_n sin nx cos nat,

and that this series converges uniformly and absolutely to

½[G(x − at) + G(x + at)]

on [0, π]. Hence we can integrate term-by-term to obtain

u₂(x, t) = (1/2a) ∫_{x−at}^{x+at} G(s) ds
         = Σ_{n=1}^∞ C_n sin nx ∫₀^t cos nas ds
         = (1/a) Σ_{n=1}^∞ (C_n/n) sin nx sin nat,

as required. In addition, this series can be differentiated term-by-term with
respect to either variable, from which it follows that

∂u₂/∂x = (1/2a)[G(x + at) − G(x − at)],

∂u₂/∂t = ½[G(x + at) + G(x − at)].

Finally, to assure the existence of the second partials of u₂ we now assume that

(iii) g′ is continuous on [0, π].

This implies that G′ exists and is continuous on (−∞, ∞), and that

∂²u₂/∂x² = (1/2a)[G′(x + at) − G′(x − at)].

Thus

∂²u₂/∂x² = (1/a²) ∂²u₂/∂t²,

and we are done.
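The equality of the integral formula (13-36) with the series (13-34) can likewise be checked term by term: for a finite sum G_N(s) = Σ C_n sin ns one has ∫_{x−at}^{x+at} sin ns ds = (2/n) sin nx sin nat. A short numerical illustration of ours (the C_n and a below are arbitrary test data):

```python
import numpy as np

rng = np.random.default_rng(1)
C = rng.normal(size=8)               # hypothetical coefficients C_1..C_8
a, t = 1.5, 0.4
x = np.linspace(0.0, np.pi, 201)
n = np.arange(1, 9)[:, None]

# (1/2a) * integral of G_N over [x - at, x + at], done in closed form:
# the antiderivative of sin ns is -cos(ns)/n.
integral = np.sum(
    C[:, None] * (np.cos(n * (x - a * t)) - np.cos(n * (x + a * t))) / n,
    axis=0) / (2.0 * a)

# The series (13-34): (1/a) * sum_n (C_n / n) sin nx sin nat
series = np.sum(C[:, None] / n * np.sin(n * x) * np.sin(n * a * t),
                axis=0) / a

max_diff = float(np.max(np.abs(integral - series)))
```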


When taken together these two arguments furnish a proof of the following
theorem.

Theorem 13-1. The series

u(x, t) = Σ_{n=1}^∞ sin nx (A_n sin nat + B_n cos nat)

with

A_n = (2/nπa) ∫₀^π g(x) sin nx dx,

B_n = (2/π) ∫₀^π f(x) sin nx dx,

converges uniformly and absolutely to a solution of the boundary-value problem

∂²u/∂x² = (1/a²) ∂²u/∂t²,

u(0, t) = u(π, t) = 0,

u(x, 0) = f(x),   u_t(x, 0) = g(x),

whenever

(1) f, f′, and f″ are continuous on [0, π] with

f(0) = f″(0) = f(π) = f″(π) = 0,

and

(2) g and g′ are continuous on [0, π] with

g(0) = g(π) = 0.
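As a sanity check of our own devising, take the admissible data f(x) = sin x, g(x) = 0, which satisfy hypotheses (1) and (2); the series then collapses to u(x, t) = sin x cos at, and centered finite differences confirm that this function satisfies the wave equation and the endpoint conditions:

```python
import numpy as np

a = 3.0                                      # illustrative wave speed
u = lambda x, t: np.sin(x) * np.cos(a * t)   # the series, reduced to one term

h = 1e-4                                     # finite-difference step
x = np.linspace(0.2, 3.0, 60)
t = 0.9

u_xx = (u(x + h, t) - 2.0 * u(x, t) + u(x - h, t)) / h**2
u_tt = (u(x, t + h) - 2.0 * u(x, t) + u(x, t - h)) / h**2

# Residual of u_xx = (1/a^2) u_tt, and the conditions at x = 0, pi
residual = float(np.max(np.abs(u_xx - u_tt / a**2)))
endpoints = float(max(abs(u(0.0, 1.2)), abs(u(np.pi, 1.2))))
```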

Although Theorem 13-1 is sufficient for most purposes, it is clearly not strong
enough to handle every boundary-value problem involving the one-dimensional
wave equation that arises in practice. Perhaps the best known example which
falls outside the scope of this theorem is that of the "plucked string," and involves
finding the equation of motion of a string which is released from rest in the position
shown in Fig. 13-13. Here too a formal solution can be obtained by the method
of separation of variables (see Exercise 8, Section 13-4), but a much more careful
analysis than the one given above is required to treat those points where f′ and
f″ have discontinuities. We shall omit this discussion since it would carry us
further afield than we care to go, and content ourselves with the remark that in
most cases the formal solution can in fact be proved valid.*
FIGURE 13-13

* For instance, see Chapter 4 of H. Sagan, Boundary and Eigenvalue Problems in
Mathematical Physics, John Wiley & Sons, Inc., New York, 1961.

EXERCISES

1. Give a rigorous discussion modeled on the one appearing above to establish the
validity of the formal solution of the boundary-value problem

a² ∂²u/∂x² = ∂²u/∂t²,

u(0, t) = 0,   u_x(π, t) = 0

when

(a) u(x, 0) = f(x), u_t(x, 0) = 0;   (b) u(x, 0) = 0, u_t(x, 0) = g(x).

(Physically this is the problem of a torsionally vibrating bar with one fixed end and
one free end. See Exercise 11, Section 13-4.)

2. Repeat Exercise 1 for

a² ∂²u/∂x² = ∂²u/∂t²,

u_x(0, t) = 0,   u_x(π, t) = 0

when

(a) u(x, 0) = f(x), u_t(x, 0) = 0;   (b) u(x, 0) = 0, u_t(x, 0) = g(x).

(Physically this is the problem of a torsionally vibrating bar with both ends free.)

3. Let f be continuously differentiable on the interval [0, a), and suppose that

lim_{x→0+} f′(x) = f′_R(0),

where f′_R(0) denotes the right-hand derivative of f at zero, i.e.,

f′_R(0) = lim_{h→0+} [f(h) − f(0)]/h.

Assume that f(0) = 0.

(a) Prove that O_f, the odd extension of f to (−a, a), is continuously differentiable on
(−a, a).

(b) Show that O_f need not have a continuous second derivative at x = 0 even when f″ is
continuous on [0, a) and f″_R(0) exists. [Hint: Consider the function x + x².]

(c) Prove that O_f″ is continuous on (−a, a) whenever f is twice continuously differentiable on [0, a) and f″_R(0) = 0.

13-6 THE ONE-DIMENSIONAL HEAT EQUATION


In this section we shall use the method of separation of variables to solve boundary-value problems involving the one-dimensional heat equation

∂²u/∂x² = a² ∂u/∂t.   (13-37)

As our first example we consider (13-37) in conjunction with the boundary
conditions

u(0, t) = 0,

u_x(L, t) = −hu(L, t),   h a constant,   (13-38)

u(x, 0) = f(x).

Physically these equations describe a slender insulated rod of length L, with initial
temperature distribution f(x), whose left-hand end is kept at 0°, and which loses
(or gains) heat through its right end at a rate proportional to the temperature at
that end. The problem is to find the temperature u(x, t) in the rod as a function
of position and time.
We begin by seeking solutions of (13-37) of the form

u(x, t) = X(x)T(t)   (13-39)

in which X is twice differentiable, T once differentiable, and each is a function of
a single variable. In addition we demand that these solutions satisfy the given
boundary conditions u(0, t) = 0, u_x(L, t) = −hu(L, t). Substituting (13-39) into
(13-37) and dividing by XT we obtain

X″/X = a² (T′/T)

whenever XT ≠ 0, and it follows that

X″ − λX = 0,

T′ − (λ/a²)T = 0,   (13-40)

λ a constant. Moreover, it is easy to see that the only way in which XT can be
nontrivial and yet satisfy the required boundary conditions is for X itself to satisfy
those conditions. Hence we must begin by solving the Sturm-Liouville system

X″ − λX = 0;

X(0) = 0,
hX(L) + X′(L) = 0.

But this problem has already been discussed in Section 12-6, and we know that
its eigenfunctions are

X_n(x) = sin λ_n x,   n = 1, 2, …,

where X_n belongs to the eigenvalue −λ_n² obtained by solving a certain transcendental equation (see Eq. 12-27). Furthermore, for each of these values of λ the
general solution of T′ + (λ_n²/a²)T = 0 is

T_n(t) = A_n e^{−(λ_n/a)²t}.

Hence the only nontrivial solutions of (13-37) which are of the form XT and
which satisfy the first two boundary conditions in (13-38) are

u_n(x, t) = A_n sin (λ_n x) e^{−(λ_n²/a²)t},   n = 1, 2, …,   (13-41)

A_n an arbitrary constant.
It now remains to use these functions to construct a solution u(x, t) of (13-37)
which also satisfies the boundary condition u(x, 0) = f(x). To this end we form
the series

u(x, t) = Σ_{n=1}^∞ A_n e^{−(λ_n²/a²)t} sin (λ_n x)

and set t = 0 to obtain

u(x, 0) = Σ_{n=1}^∞ A_n sin λ_n x.

From this it follows that the A_n must be chosen as the coefficients of the series
expansion of f in terms of the eigenfunctions sin λ_n x.* Thus

A_n = ∫₀^L f(x) sin (λ_n x) dx / ∫₀^L sin² (λ_n x) dx,

and the series

u(x, t) = Σ_{n=1}^∞ [∫₀^L f(x) sin (λ_n x) dx / ∫₀^L sin² (λ_n x) dx] e^{−(λ_n²/a²)t} sin (λ_n x),   (13-42)

which converges in the mean in PC[0, L], is a formal solution of the given problem.
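The "certain transcendental equation" cited above arises from imposing hX(L) + X′(L) = 0 on X(x) = sin λx, namely h sin λL + λ cos λL = 0. The following sketch (ours, not the authors'; the values of L and h are arbitrary samples) locates the first few roots λ_n by bisection — for h > 0 the nth root lies in ((n − ½)π/L, nπ/L) — and confirms the orthogonality of the resulting eigenfunctions on [0, L], which is what justifies computing the A_n as expansion coefficients:

```python
import numpy as np

L, h = 1.0, 2.0                      # illustrative rod length and constant

def phi(lam):
    # boundary residual: h sin(lam L) + lam cos(lam L)
    return h * np.sin(lam * L) + lam * np.cos(lam * L)

def bisect(lo, hi, tol=1e-12):
    # standard bisection; phi changes sign exactly once on [lo, hi]
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if phi(lo) * phi(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

lams = [bisect((n - 0.5) * np.pi / L, n * np.pi / L) for n in (1, 2, 3)]
bc_residuals = [abs(phi(lam)) for lam in lams]

# Orthogonality of sin(lam_1 x) and sin(lam_2 x) on [0, L], trapezoid rule
x = np.linspace(0.0, L, 20001)
prod = np.sin(lams[0] * x) * np.sin(lams[1] * x)
inner = float(np.sum(0.5 * (prod[1:] + prod[:-1]) * np.diff(x)))
```

The orthogonality is exact because the system is self-adjoint; the quadrature merely confirms it to roundoff-plus-discretization accuracy.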

As our second example we solve the one-dimensional heat equation given that

u(0, t) = 100,

u_x(L, t) = −hu(L, t),   h a constant,   (13-43)

u(x, 0) = f(x).

Here the first boundary condition is nonhomogeneous, and we must therefore
begin by finding a particular solution ū(x, t) = X(x)T(t) of (13-37) such that
ū(0, t) = 100, ū_x(L, t) = −hū(L, t). As before, X and T will then be solutions of

X″ − λX = 0,

T′ − (λ/a²)T = 0,

* The reader should note that there is no possibility of using a Fourier series here.
The expansion must be given in terms of the sin λ_n x.

and by setting λ = 0 in these equations we immediately obtain XT = Ax + B, A and
B constants. The boundary conditions in effect imply that A = −100h/(1 + hL),
B = 100, and it follows that

ū(x, t) = Kx + 100,   K = −100h/(1 + hL),

is a solution with the desired properties.
This done, we now observe that the sum of ū and any solution of (13-37) which
satisfies the homogeneous boundary conditions u(0, t) = 0, u_x(L, t) = −hu(L, t)
will in turn satisfy the first two boundary conditions in (13-43). In particular, our
earlier results imply that

u(x, t) = (Kx + 100) + Σ_{n=1}^∞ A_n e^{−(λ_n²/a²)t} sin (λ_n x)   (13-44)

is such a solution, and the problem will be solved as soon as the A_n are chosen so
that u(x, 0) = f(x). But when t = 0, (13-44) becomes

u(x, 0) = (Kx + 100) + Σ_{n=1}^∞ A_n sin λ_n x.

Thus

A_n = ∫₀^L [f(x) − Kx − 100] sin (λ_n x) dx / ∫₀^L sin² (λ_n x) dx,

and we are done.
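The algebra leading to ū is short enough to verify directly: with ū = Ax + B, the condition ū(0, t) = 100 forces B = 100, while ū_x(L, t) = −hū(L, t) gives A(1 + hL) = −100h. A minimal check of our own (L and h are sample values):

```python
# Verify that u_bar(x) = Kx + 100 with K = -100h/(1 + hL) satisfies
# u_bar(0) = 100 and u_bar'(L) = -h * u_bar(L).
L, h = 1.0, 2.0                       # illustrative values
K = -100.0 * h / (1.0 + h * L)
u_bar = lambda x: K * x + 100.0

left_end = u_bar(0.0)                 # should be exactly 100
robin_residual = K + h * u_bar(L)     # u_bar'(L) + h*u_bar(L), should vanish
```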


In Section 13-9 we shall prove that these series are solutions of the boundary-value problems in question whenever f and f′ are piecewise continuous on [0, L],
and f(x) = ½[f(x⁺) + f(x⁻)] everywhere in the interval. (See Theorem 13-3.)

EXERCISES

In each of the following exercises find the solution u(x, t) of the one-dimensional heat
equation which satisfies the given boundary conditions.

1. u(0, t) = u(L, t) = 0;   u(x, 0) = { x, 0 < x < L/2;  L − x, L/2 < x < L }

2. u(0, t) = u(L, t) = 0;   u(x, 0) = k sin (πx/L), k a constant

3. u(0, t) = u(L, t) = 0;   u(x, 0) = x(L − x)

4. u(0, t) = u(L, t) = 0;   u(x, 0) = f(x)

5. u_x(0, t) = u_x(L, t) = 0;   u(x, 0) = k sin (πx/L), k a constant

532 THE WAVE AND HEAT EQUATIONS | CHAP. 1

6. u_x(0, t) = u_x(L, t) = 0;   u(x, 0) = x(L − x²)

7. u_x(0, t) = u_x(L, t) = 0;   u(x, 0) = f(x)

8. u(0, t) = u_x(L, t) = 0;   u(x, 0) = f(x)

9. u(0, t) = 100, u(L, t) = 0;   u(x, 0) = 100 cos (πx/2L)

10. u_x(0, t) = u_x(L, t) = k, k a constant;   u(x, 0) = 0

11. The temperature in a slender insulated rod of length L satisfies the endpoint conditions u(0, t) = 0, u(L, t) = 1, and initial condition u(x, 0) = sin (πx/L).

(a) Find the temperature in the rod as a function of position and time.
(b) What is the steady-state temperature in the rod (i.e., the temperature as t → ∞)?

12. Find the steady-state temperature in a slender insulated rod of length L given that

u(0, t) = k₀,   u(L, t) = k₁,   k₀ and k₁ constants,

u(x, 0) = f(x).

(See Exercise 11b.)

13. Suppose that the temperature in the rod described in Exercise 12 is allowed to
reach equilibrium, and that the temperature at the endpoints 0 and L is then suddenly
changed to 0 and 100, respectively. Find the new temperature distribution in the
rod as a function of x and t.

14. The equation governing the temperature distribution in a slender insulated rod in
which heat is being generated at a constant rate per unit length is

∂u/∂t = (1/a²) ∂²u/∂x² + k,

k a constant. (See Exercise 2, Section 13-3.) Solve this equation given that

u(0, t) = u(L, t) = 0;   u(x, 0) = f(x).

13-7 THE TWO-DIMENSIONAL HEAT EQUATION; BIORTHOGONAL SERIES

In this section we shall study the flow of heat in a thin rectangular plate R of
length L and width M, situated in the xy-plane as shown in Fig. 13-14. Under the
assumption that heat is neither gained nor lost across the faces of the plate, the
flow is two-dimensional, and is described by the equation

∂²u/∂x² + ∂²u/∂y² = a² ∂u/∂t.   (13-45)

FIGURE 13-14

We now propose to solve this equation in the presence of the following boundary
conditions:

u(x, 0, t) = 0,  u(x, M, t) = 0   (the temperature on the horizontal sides of the plate is held at 0);   (13-46)

u_x(0, y, t) = 0,  u_x(L, y, t) = 0   (the vertical sides are insulated and no heat flows across them);   (13-47)

u(x, y, 0) = f(x, y)   (the initial temperature distribution is known).   (13-48)

As usual we seek product solutions of the form

u(x, y, t) = X(x)Y(y)T(t)

which are nonzero in the region under consideration. Substituting this expression
in (13-45) and dividing by XYT, we obtain

X″/X + Y″/Y = a² T′/T.   (13-49)

Thus

Y″/Y = a² (T′/T) − X″/X = λ,   λ a constant,

and we have

Y″ − λY = 0.

The boundary conditions (13-46) imply that Y(0) = Y(M) = 0, and lead to the
familiar set of eigenvalues −n²π²/M² and eigenfunctions

Y_n(y) = A_n sin (nπy/M),   n = 1, 2, ….   (13-50)

Next we substitute these eigenvalues in (13-49), obtaining

X″/X = a² (T′/T) + n²π²/M²,   (13-51)

from which it follows that X″/X is also a constant, μ. Taking (13-47) into consideration we find that we must now solve the Sturm-Liouville system

X″ − μX = 0;   X′(0) = X′(L) = 0.

Leaving the details to the reader, we assert that in this case the eigenvalues are

μ₀ = 0,  μ₁ = −π²/L², …,  μ_m = −m²π²/L², …,

and that the eigenfunctions belonging to the μ_m are

X_m(x) = B_m cos (mπx/L),   m = 0, 1, 2, ….   (13-52)

For these values of μ, (13-51) becomes

T′/T = −(π²/a²)[(m/L)² + (n/M)²],

and

T = e^{−(π/a)²[(m/L)² + (n/M)²]t}.   (13-53)

(The constant of integration may be omitted without prejudice at this point.)


We now combine (13-50), (13-52), and (13-53) to conclude that each of the
functions

u_mn(x, y, t) = A_mn cos (mπx/L) sin (nπy/M) e^{−(π/a)²[(m/L)² + (n/M)²]t},   (13-54)

m = 0, 1, 2, …,   n = 1, 2, …,   A_mn arbitrary,

is a solution of the two-dimensional heat equation which also satisfies the boundary
conditions given in (13-46) and (13-47).
To complete the solution of the given problem we must now choose the A_mn
so that the (double) series

u(x, y, t) = Σ_{m=0}^∞ Σ_{n=1}^∞ u_mn(x, y, t)

satisfies the boundary condition u(x, y, 0) = f(x, y). This means that the A_mn
must be chosen so that

f(x, y) = Σ_{m=0}^∞ Σ_{n=1}^∞ A_mn cos (mπx/L) sin (nπy/M),

and hence must be the coefficients of the double Fourier series expansion of f
in the rectangular region R under consideration (see Section 9-8). Thus

A_0n = (2/LM) ∫₀^M ∫₀^L f(x, y) sin (nπy/M) dx dy,

and, when m ≠ 0,

A_mn = (4/LM) ∫₀^M ∫₀^L f(x, y) cos (mπx/L) sin (nπy/M) dx dy,

and u(x, y, t) has been completely determined.

In this case we shall omit the argument needed to establish the validity of these
computations. Suffice it to say that u(x, y, t) as determined above will in fact be

a solution of the given problem whenever the function f is "sufficiently" smooth
in R. Moreover, it can also be shown that this problem admits only one solution,
and hence our discussion is complete.
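Although the text omits the verification, the coefficient formulas above are easy to exercise numerically. The fragment below (an illustration of ours) applies them to the synthetic data f(x, y) = 3 cos(2πx/L) sin(πy/M) + 5 sin(3πy/M), for which A_21 = 3, A_03 = 5, and every other coefficient vanishes; L and M are arbitrary sample dimensions.

```python
import numpy as np

L, M = 2.0, 1.0                      # illustrative plate dimensions
x = np.linspace(0.0, L, 401)
y = np.linspace(0.0, M, 401)
X, Y = np.meshgrid(x, y, indexing="ij")

# Synthetic initial data with known expansion coefficients
f = (3.0 * np.cos(2 * np.pi * X / L) * np.sin(np.pi * Y / M)
     + 5.0 * np.sin(3 * np.pi * Y / M))

def trap2(F):
    # composite trapezoid rule over the rectangle [0, L] x [0, M]
    w = lambda k, dk: np.r_[0.5, np.ones(k - 2), 0.5] * dk
    return float(w(len(x), x[1] - x[0]) @ F @ w(len(y), y[1] - y[0]))

def A(m, n):
    # the coefficient formulas derived in the text: 2/LM when m = 0,
    # 4/LM otherwise
    c = (2.0 if m == 0 else 4.0) / (L * M)
    return c * trap2(f * np.cos(m * np.pi * X / L) * np.sin(n * np.pi * Y / M))

A21, A03, A11 = A(2, 1), A(0, 3), A(1, 1)
```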
This example is but one of an almost endless number of three-dimensional
boundary-value problems encountered in mathematical physics.* In this case
the solution was expressed as a double Fourier series, but from our earlier experience it is clear that one can describe physically meaningful boundary conditions
which lead to Sturm-Liouville systems with eigenfunctions other than the sine and
cosine functions encountered here. (This would have happened, for instance,
had we imposed the boundary conditions

u_x(0, y, t) = −hu(0, y, t)   and   u_x(L, y, t) = −hu(L, y, t)

on the vertical edges of the plate.) Such sets of eigenfunctions lead to the notions
of biorthogonal sets of functions and biorthogonal series expansions. The basic
theorem concerning biorthogonal sets in rectangular regions was given in Section
9-8, where it was shown that whenever {f_m} and {g_n} are orthogonal bases in
PC[a, b] and PC[c, d], respectively, {f_m g_n} is a (bi)orthogonal basis in PC(R), R
the rectangle a ≤ x ≤ b, c ≤ y ≤ d. The import of this theorem is now obvious:
it allows us to apply the theory of eigenfunction expansions in rectangular regions,
and use them to solve three-dimensional boundary-value problems such as the
one discussed above.

EXERCISES

1. Verify that the eigenvalues and eigenfunctions for the Sturm-Liouville problem

X″ − μX = 0,   X′(0) = X′(L) = 0

are as given in the text.

2. Solve the boundary-value problem discussed above when f(x, y) = sin² (πy/M).

3. Repeat Exercise 2 with f(x, y) = y(M − y) cos (πx/L).

4. Solve the two-dimensional heat equation in the rectangular region 0 < x < π,
0 < y < π given that

u(0, y, t) = u(π, y, t) = u(x, 0, t) = u(x, π, t) = 0,

u(x, y, 0) = f(x, y).

5. (a) Solve the partial differential equation

∂²u/∂x² + ∂²u/∂y² + ∂²u/∂z² = 0

* The three-dimensional region in question is the semi-infinite slab 0 < x < L,
0 < y < M, 0 < t.

in the region 0 < x < π, 0 < y < π, 0 < z < π under the boundary conditions

u(x, y, 0) = f(x, y),   u = 0 on all the other sides of the region.

(b) Generalize the result in (a) to the case where u assumes arbitrary boundary
values on all six sides of the region.

6. Solve the boundary-value problem

a² (∂²u/∂x² + ∂²u/∂y²) = ∂²u/∂t²,

u(0, y, t) = u(π, y, t) = 0,

u(x, 0, t) = u(x, π, t) = 0,

u(x, y, 0) = f(x, y),   u_t(x, y, 0) = 0.

7. Solve the boundary-value problem in Exercise 6 when the last boundary condition
is replaced by u_t(x, y, 0) = g(x, y).

*13-8 THE SCHRÖDINGER WAVE EQUATION


The second-order partial differential equation

∂²Ψ/∂x² + ∂²Ψ/∂y² + ∂²Ψ/∂z² − V(x, y, z)Ψ = k ∂Ψ/∂t,   (13-55)

k a constant, is encountered in quantum mechanics as the Schrödinger wave
equation for a single particle. Its eigenfunctions are among the most interesting
in applied mathematics, and their study has stimulated a great deal of modern
research in this field. Unfortunately most of these results are beyond the scope of
this book, and any attempt to solve (13-55) as it stands would be quite out
of place here. Instead we shall limit ourselves to the one-dimensional version of
this equation where V(x, y, z) = c²x², c a positive constant. Artificial as these
restrictions may seem, the problem is actually of considerable physical interest.†


Thus we consider the simplified equation

∂²Ψ/∂x² − c²x²Ψ = k ∂Ψ/∂t,   (13-56)

which we propose to solve in the region −∞ < x < ∞, t > 0, subject to the
restriction that the solutions tend to zero as |x| → ∞. Applying the method of
separation of variables with Ψ(x, t) = ψ(x)φ(t), we find that φ and ψ satisfy the
equations

φ′ + (λ/k)φ = 0,   (13-57)

and

ψ″ + (λ − c²x²)ψ = 0,   (13-58)

† For a discussion of the physical significance of (13-55) and the problem we are
about to consider, the reader is referred to any standard text on quantum mechanics.

λ a constant, the second of which is subject to the "boundary" condition
ψ(x) → 0 as |x| → ∞. (This equation is known as the amplitude equation for the
particle.)
The very fact that we are now dealing with functions on (−∞, ∞) suggests that
we attempt to solve this problem by using the Hermite polynomials H_n(x) defined
in Section 11-6. With this in mind we recall that for each non-negative integer n
the function e^{−x²/4} H_n(x) is a solution of the differential equation

y″ + (n + ½ − x²/4)y = 0   (13-59)

(see Exercise 12, Section 11-6). The strong similarity between this equation and
(13-58) leads us to seek a solution of the latter in the form

S(x) = e^{−(ax)²/4} H_n(ax)   (13-60)

for a suitable constant a. Differentiating (13-60) we obtain

S′(x) = e^{−(ax)²/4}[aH_n′(ax) − ½a²x H_n(ax)],

S″(x) = a² e^{−(ax)²/4}[H_n″(ax) − (ax)H_n′(ax) − ½(1 − (ax)²/2)H_n(ax)].

But since H_n is a solution of

y″ − xy′ + ny = 0,

we have

H_n″(ax) − (ax)H_n′(ax) + nH_n(ax) = 0,

and it follows that

S″(x) = a²[−n − ½ + (ax)²/4] S(x).

Thus

S″(x) + a²(n + ½ − (ax)²/4) S(x) = 0,

and we conclude that S(x) will be a solution of (13-58) if a = (4c²)^{1/4} = √(2c).
This, in turn, implies that the eigenvalues and eigenfunctions for the problem
under consideration are

λ_n = (2n + 1)c,   n = 0, 1, 2, …,   (13-61)

and

ψ_n(x) = e^{−cx²/2} H_n(√(2c) x).   (13-62)

(It can be shown that up to multiplicative constants these are the only eigen-
functions for this problem.) Moreover, with X n as above,

<PnV) — e ,

and we therefore conclude that the functions


2

*„(*,*) = A n e-^ ' 2)


Hn (V2cx)e-^ n+1)ct] '\
n = 0, 1, 2, . . . , An arbitrary, are solutions of (13-56). Finally, as with all of
the other problems discussed in this chapter, these functions can be used to con-
struct series solutions of (13-56) which satisfy initial conditions of the type
\p(x, 0) = f{x) provided / is a reasonably well behaved function. We leave the
details to the reader.
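The computation above can be spot-checked numerically. The sketch below is only a demo under stated assumptions: c is an arbitrary sample value, and NumPy's `hermite_e` polynomials stand in for the H_n of Section 11-6 because they satisfy the same equation y″ − xy′ + ny = 0. It verifies by central differences that ψ_n″ + (λ_n − c²x²)ψ_n ≈ 0, the form of the amplitude equation implied by (13-61) and (13-62).

```python
import numpy as np
from numpy.polynomial import hermite_e as He  # probabilists' H_n: y'' - x y' + n y = 0

c, n = 1.5, 3                   # c is an arbitrary sample value; n indexes the eigenfunction
lam = (2 * n + 1) * c           # eigenvalue (13-61)

def psi(x):
    # eigenfunction (13-62): e^{-c x^2/2} H_n(sqrt(2c) x)
    return np.exp(-c * x**2 / 2) * He.hermeval(np.sqrt(2 * c) * x, [0] * n + [1])

# second derivative by central differences
h = 1e-4
x = np.linspace(-2.0, 2.0, 9)
residual = (psi(x + h) - 2 * psi(x) + psi(x - h)) / h**2 + (lam - c**2 * x**2) * psi(x)
print(np.max(np.abs(residual)))   # essentially zero
```

The residual is dominated by finite-difference truncation error, so it shrinks with h² until roundoff takes over.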

*13-9 THE HEAT EQUATION; VALIDITY OF THE SOLUTION

In order to establish the validity of the formal computations in Section 13-6
we must introduce the notion of a uniformly bounded, monotonic nonincreasing
sequence of functions {G_n(t)} on an interval I. The relevant definitions are as
follows: {G_n(t)} is said to be

(i) monotonic nonincreasing if G_n(t) ≥ G_{n+1}(t) for all t in I and all n;

(ii) uniformly bounded if there exists a real number M such that |G_n(t)| ≤ M
for all t in I and all n. (The word "uniform" is used here because M simultaneously
bounds all of the G_n.)

Perhaps the simplest nontrivial example of a sequence with these properties is
{t^n}, n = 1, 2, ..., on the unit interval [0, 1]. In this case monotonicity is
obvious, and uniform boundedness follows from the inequality |t^n| ≤ 1 whenever
|t| ≤ 1. This last inequality actually shows that {t^n} is uniformly bounded on
the larger interval [−1, 1], though it is no longer monotonic there since the t^n
alternate in sign to the left of the origin. For our purposes a more pertinent
example is furnished by the sequence with

    G_n(t) = e^{−(λ_n²/a²)t},

λ_n as in Section 13-6. Indeed, since λ₁ < λ₂ < ···, G_n(t) ≥ G_{n+1}(t) for all
t ≥ 0, and {G_n(t)} is monotonic nonincreasing. In addition, |G_n(t)| ≤ e⁰ = 1
for all t ≥ 0, and hence the sequence is uniformly bounded on the semi-infinite
interval [0, ∞).


With this terminology in force, we now prove

Theorem 13-2. Let Σ_{n=1}^∞ F_n(x) be a uniformly convergent series of con-
tinuous functions on the interval a ≤ x ≤ b, and let {G_n(t)} be a uniformly
bounded, monotonic nonincreasing sequence of continuous functions on
c ≤ t ≤ d. Then the series

    Σ_{n=1}^∞ F_n(x)G_n(t)                                            (13-63)

is uniformly convergent in the rectangle a ≤ x ≤ b, c ≤ t ≤ d.

Proof. The proof is based upon the fact that an infinite series is uniformly con-
vergent if and only if its associated sequence of partial sums is uniformly a Cauchy
sequence (see Appendix I). Thus if

    s_k(x, t) = Σ_{n=1}^{k} F_n(x)G_n(t)

denotes the kth partial sum of (13-63), we will be done if we can show that for each
ε > 0 there exists an integer K such that whenever k > K, p > 0,

    |s_{k+p}(x, t) − s_k(x, t)| < ε                                   (13-64)

for all x and t in the prescribed rectangle.
To this end let

    σ_k(x) = Σ_{n=1}^{k} F_n(x)

be the kth partial sum of the uniformly convergent series Σ_{n=1}^∞ F_n(x). Then
F_k(x) = σ_k(x) − σ_{k−1}(x) for all k > 1, and we have*

    s_{k+p} − s_k = F_{k+p}G_{k+p} + ··· + F_{k+1}G_{k+1}
        = (σ_{k+p} − σ_{k+p−1})G_{k+p} + ···
            + (σ_{k+2} − σ_{k+1})G_{k+2} + (σ_{k+1} − σ_k)G_{k+1}
        = [(σ_{k+p} − σ_k) − (σ_{k+p−1} − σ_k)]G_{k+p} + ···
            + [(σ_{k+2} − σ_k) − (σ_{k+1} − σ_k)]G_{k+2} + (σ_{k+1} − σ_k)G_{k+1}
        = Σ_{n=1}^{p} (σ_{k+n} − σ_k)G_{k+n} − Σ_{n=1}^{p−1} (σ_{k+n} − σ_k)G_{k+n+1}
        = Σ_{n=1}^{p−1} (σ_{k+n} − σ_k)(G_{k+n} − G_{k+n+1}) + (σ_{k+p} − σ_k)G_{k+p}.

* This result is a modified form of the "summation by parts" identity

    a_{k+1}b_{k+1} − a₁b₁ = Σ_{n=2}^{k+1} a_n(b_n − b_{n−1}) + Σ_{n=1}^{k} b_n(a_{n+1} − a_n),

so called because of its similarity with the formula for integration by parts.

Thus

    |s_{k+p} − s_k| ≤ Σ_{n=1}^{p−1} |σ_{k+n} − σ_k| |G_{k+n} − G_{k+n+1}|
                      + |σ_{k+p} − σ_k| |G_{k+p}|.                    (13-65)

Now let ε > 0 be given, and let M be such that |G_n(t)| ≤ M for all n and all t
in the interval [c, d]. Then since Σ_{n=1}^∞ F_n(x) is uniformly convergent on [a, b],
we can find an integer K such that

    |σ_{k+n} − σ_k| < ε/3M

for all k > K, n > 0, and all x in [a, b]. Substituting this inequality in (13-65),
we obtain

    |s_{k+p} − s_k| < (ε/3M) [ Σ_{n=1}^{p−1} |G_{k+n} − G_{k+n+1}| + |G_{k+p}| ].   (13-66)

But since {G_n(t)} is monotonic nonincreasing, G_{k+n}(t) ≥ G_{k+n+1}(t) for all t, and

    Σ_{n=1}^{p−1} |G_{k+n} − G_{k+n+1}| = Σ_{n=1}^{p−1} (G_{k+n} − G_{k+n+1})
        = (G_{k+1} − G_{k+2}) + (G_{k+2} − G_{k+3}) + ··· + (G_{k+p−1} − G_{k+p})
        = G_{k+1} − G_{k+p}.

Hence

    Σ_{n=1}^{p−1} |G_{k+n} − G_{k+n+1}| + |G_{k+p}| ≤ 2M + M = 3M,

and (13-66) becomes

    |s_{k+p} − s_k| < (ε/3M)(3M) = ε,

which is precisely what had to be shown. |
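The summation-by-parts identity quoted in the footnote is purely algebraic, so it can be spot-checked on arbitrary data. In the sketch below (a demo only), the array entry a[i] plays the role of a_{i+1}.

```python
import numpy as np

rng = np.random.default_rng(0)
k = 7
a = rng.standard_normal(k + 1)   # a[i] stands for a_{i+1}, i = 0, ..., k
b = rng.standard_normal(k + 1)

lhs = a[k] * b[k] - a[0] * b[0]                                 # a_{k+1} b_{k+1} - a_1 b_1
rhs = sum(a[i] * (b[i] - b[i - 1]) for i in range(1, k + 1)) \
    + sum(b[i] * (a[i + 1] - a[i]) for i in range(0, k))
print(abs(lhs - rhs))            # agreement to machine precision
```

Both sides telescope to the same quantity, exactly as in the integration-by-parts analogy.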

With this result in hand we are ready to investigate the formal solution

    u(x, t) = Σ_{n=1}^∞ A_n sin (λ_n x) e^{−(λ_n²/a²)t}               (13-67)

of the boundary-value problem

    ∂²u/∂x² = a² ∂u/∂t,
    u(0, t) = 0,  u_x(L, t) = −hu(L, t),  t > 0,                      (13-68)
    u(x, 0) = f(x),  0 < x < L,

where the A_n are determined from the orthogonal series expansion

    f(x) = Σ_{n=1}^∞ A_n sin λ_n x.                                   (13-69)

In this case the restrictions needed to guarantee the validity of (13-67) are

(i) f and f′ piecewise continuous on [0, L], and

(ii) f(x) = ½[f(x⁺) + f(x⁻)] for all x in (0, L),

and we now assume that they are in force. The first step in the proof consists of
an argument to show that (13-67) is uniformly and absolutely convergent in the
region 0 ≤ x ≤ L, t ≥ t₀ > 0. We use the Weierstrass M-test (Appendix I), as
follows.
From the general theory of Sturm-Liouville series we know that (13-69) con-
verges (pointwise) to f(x) for all x in (0, L). Moreover, by Theorem 8-3 the series

    Σ_{n=1}^∞ A_n²

is convergent. Hence the sequence {|A_n|} is bounded, and there exists a positive
constant K such that

    |A_n| ≤ K

for all n. From this it follows that for any t ≥ t₀

    |A_n sin (λ_n x) e^{−(λ_n²/a²)t}| ≤ K e^{−(λ_n²/a²)t₀},

and the M-test will apply as soon as we have shown that

    Σ_{n=1}^∞ K e^{−(λ_n²/a²)t₀}                                      (13-70)

converges. To this end we note that

    (K e^{−(λ_{n+1}²/a²)t₀}) / (K e^{−(λ_n²/a²)t₀}) = e^{−(λ_{n+1}² − λ_n²)(t₀/a²)}.

But from the way in which the λ_n were determined in the preceding chapter it is
clear that λ_{n+1} − λ_n tends to a positive constant value as n → ∞. Hence

    lim_{n→∞} e^{−(λ_{n+1}² − λ_n²)(t₀/a²)} = 0,

and the ratio test implies that (13-70) converges.
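The claim about the spacing of the λ_n can be seen concretely. With the illustrative values L = h = 1 (assumptions of this demo only), the sketch below locates the roots of the eigenvalue condition h sin λL + λ cos λL = 0 by bisection and shows the gaps λ_{n+1} − λ_n approaching π/L.

```python
import numpy as np

L, h = 1.0, 1.0                       # illustrative values for the demo

def g(lam):
    # eigenvalue condition from the preceding chapter
    return h * np.sin(lam * L) + lam * np.cos(lam * L)

def eigenvalue(n):
    # the nth positive root lies in ((n - 1/2) pi / L, n pi / L); bisect there
    lo, hi = (n - 0.5) * np.pi / L, n * np.pi / L
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        lo, hi = (lo, mid) if g(lo) * g(mid) <= 0 else (mid, hi)
    return 0.5 * (lo + hi)

lams = np.array([eigenvalue(n) for n in range(1, 8)])
print(np.round(lams, 4))
print(np.round(np.diff(lams), 4))     # the gaps approach pi / L
```

The sign change of g on each bracketing interval follows from g = ±h at the left endpoint and g = ±nπ/L at the right one.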


We now differentiate (13-67) termwise with respect to t to obtain

    ∂u/∂t = −Σ_{n=1}^∞ A_n (λ_n²/a²) sin (λ_n x) e^{−(λ_n²/a²)t},     (13-71)

and repeat the above argument using the inequality

    |A_n (λ_n²/a²) sin (λ_n x) e^{−(λ_n²/a²)t}| ≤ K (λ_n²/a²) e^{−(λ_n²/a²)t₀}

to deduce that (13-71) is uniformly and absolutely convergent in the region
0 ≤ x ≤ L, t ≥ t₀ > 0. The same can also be said for the series

    ∂²u/∂x² = −Σ_{n=1}^∞ A_n λ_n² sin (λ_n x) e^{−(λ_n²/a²)t}         (13-72)

obtained by differentiating (13-67) twice with respect to x. Finally, (13-71) and
(13-72) imply that

    ∂²u/∂x² = a² ∂u/∂t,

and we have therefore proved that under the hypotheses imposed above the
function

    u(x, t) = Σ_{n=1}^∞ A_n sin (λ_n x) e^{−(λ_n²/a²)t}

is a solution of the one-dimensional heat equation in the region 0 < x < L,
t > 0.*
This brings us to the boundary conditions, the first two of which can be dis-
patched with ease, as follows. By the argument just given, (13-67) is uniformly
convergent in every (closed) region of the form 0 ≤ x ≤ L, t ≥ t₀ > 0, and
hence represents a continuous function there (Theorem 1-21, Appendix I). But
when x = 0, (13-67) reduces to zero, and it follows that

    u(0, t) = 0

for all t > 0. Moreover, since

    ∂u/∂x = Σ_{n=1}^∞ A_n λ_n cos (λ_n x) e^{−(λ_n²/a²)t}

for 0 ≤ x ≤ L, t ≥ t₀ > 0,

    u_x(L, t) = Σ_{n=1}^∞ A_n [λ_n cos (λ_n L)] e^{−(λ_n²/a²)t}

and

    −hu(L, t) = Σ_{n=1}^∞ A_n [−h sin (λ_n L)] e^{−(λ_n²/a²)t}.

But

    h sin (λ_n L) + λ_n cos (λ_n L) = 0

for all n (see Eq. 12-27), whence

    u_x(L, t) = −hu(L, t)

for all t > 0, as required.

* At this point the reader would do well to recall the remarks made in Section 13-2
concerning solutions of boundary-value problems, and, in particular, that they need not
satisfy the differential equation on the boundary of the region in question.


This brings us to the boundary condition

    u(x, 0) = f(x)

and the only troublesome step in the proof. At first sight it might seem that we
need only set t = 0 in (13-67) and watch the series reduce to (13-69) to complete
the argument. Unfortunately this will not do. The error in such a naive argu-
ment arises from the fact that we are not in a position to assert the continuity of
u(x, t) for t ≥ 0, and thus cannot guarantee that u(x₀, t) approaches f(x₀) as
(x₀, t) approaches (x₀, 0). To establish this fact we use Theorem 13-2, as follows.
The sequence

    {e^{−(λ_n²/a²)t}}

is monotonic nonincreasing and uniformly bounded on [0, ∞). Moreover, for
each x₀ in (0, L)

    Σ_{n=1}^∞ A_n sin (λ_n x₀)

is a convergent series of constants, and hence is uniformly convergent. We now
apply Theorem 13-2 to deduce that

    u(x₀, t) = Σ_{n=1}^∞ A_n sin (λ_n x₀) e^{−(λ_n²/a²)t}

is uniformly convergent for all t ≥ 0, and thus is a continuous function of t. It
now follows that

    lim_{t→0⁺} u(x₀, t) = Σ_{n=1}^∞ A_n sin (λ_n x₀) = f(x₀),

and we have proved the following theorem.


Theorem 13-3. The series

    u(x, t) = Σ_{n=1}^∞ A_n sin (λ_n x) e^{−(λ_n²/a²)t},

with

    A_n = ( ∫₀^L f(x) sin (λ_n x) dx ) / ( ∫₀^L sin² (λ_n x) dx ),

converges uniformly and absolutely in every region of the form 0 ≤ x ≤ L,
t ≥ t₀ > 0, and is a solution of the boundary-value problem

    ∂²u/∂x² = a² ∂u/∂t,
    u(0, t) = 0,  u_x(L, t) = −hu(L, t),  t > 0,
    u(x, 0) = f(x),  0 < x < L,

whenever f and f′ are piecewise continuous and

    f(x) = ½[f(x⁺) + f(x⁻)]

everywhere in [0, L].

Remark. The reader should appreciate that under the hypotheses imposed here
this theorem is valid only if the last boundary condition is interpreted to mean

    lim_{t→0⁺} u(x₀, t) = f(x₀)

for all x₀ in (0, L). However, when f is continuous on [0, L] and f(0) = f(L) = 0,
the series

    f(x) = Σ_{n=1}^∞ A_n sin λ_n x

is uniformly and absolutely convergent there, and

    u(x, t) = Σ_{n=1}^∞ A_n sin (λ_n x) e^{−(λ_n²/a²)t}

converges uniformly and absolutely for 0 ≤ x ≤ L, t ≥ 0. In this case u(x, t)
is continuous for 0 ≤ x ≤ L, t ≥ 0, and the last boundary condition is satisfied
in the stronger sense

    lim_{(x,t)→(x₀,0⁺)} u(x, t) = f(x₀)

for all x₀ in (0, L).
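A truncated version of the series in Theorem 13-3 is easy to evaluate numerically. The sketch below is a demo under stated assumptions: L = h = a = 1 and the sample data f(x) = x(L − x) are illustrative choices, not part of the theorem. It computes the λ_n by bisection and the A_n by quadrature, then checks that u(x, 0) reproduces f at an interior point and that the truncated sum satisfies u_x(L, t) = −hu(L, t) term by term.

```python
import numpy as np

L, h, a = 1.0, 1.0, 1.0                        # illustrative constants for the demo
f = lambda x: x * (L - x)                      # sample initial temperature

def g(lam):
    return h * np.sin(lam * L) + lam * np.cos(lam * L)

def eigenvalue(n):                             # nth root of g in ((n - 1/2) pi/L, n pi/L)
    lo, hi = (n - 0.5) * np.pi / L, n * np.pi / L
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        lo, hi = (lo, mid) if g(lo) * g(mid) <= 0 else (mid, hi)
    return 0.5 * (lo + hi)

def quad(y, x):                                # composite trapezoid rule
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2)

N = 30
lams = np.array([eigenvalue(n) for n in range(1, N + 1)])
xq = np.linspace(0.0, L, 4001)
A = np.array([quad(f(xq) * np.sin(l * xq), xq) / quad(np.sin(l * xq)**2, xq)
              for l in lams])

def u(x, t):
    return sum(A[n] * np.sin(lams[n] * x) * np.exp(-(lams[n]**2 / a**2) * t)
               for n in range(N))

def ux(x, t):                                  # termwise derivative in x
    return sum(A[n] * lams[n] * np.cos(lams[n] * x) * np.exp(-(lams[n]**2 / a**2) * t)
               for n in range(N))

print(u(0.5, 0.0), f(0.5))                     # partial sum vs. f at an interior point
print(ux(L, 0.1) + h * u(L, 0.1))              # residual of u_x(L, t) = -h u(L, t)
```

The boundary residual is tiny because each term satisfies the Robin condition exactly, by the identity h sin (λ_n L) + λ_n cos (λ_n L) = 0; only the initial condition is recovered approximately by a partial sum.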


EXERCISES

1. Give a rigorous discussion to establish the validity of the formal solution of the
boundary-value problem

    ∂²u/∂x² = a² ∂u/∂t,
    u(0, t) = u(L, t) = 0,  t > 0,
    u(x, 0) = f(x),  0 < x < L.

2. Repeat Exercise 1 for

    ∂²u/∂x² = a² ∂u/∂t,
    u_x(0, t) = u_x(L, t) = 0,  t > 0,
    u(x, 0) = f(x),  0 < x < L.

3. Prove that Theorem 13-2 remains valid if the hypothesis "nonincreasing" is replaced
by "nondecreasing."
14
boundary-value problems
for laplace's equation

14-1 INTRODUCTION
In this chapter we restrict our attention to boundary-value problems involving
Laplace's equation

    ∂²u/∂x² + ∂²u/∂y² + ∂²u/∂z² = 0                                   (14-1)

in rectangular, circular, and spherical regions. The student will observe that (14-1)
can be viewed as the special case of the heat equation in which du/dt = 0, and
with this interpretation in effect its solutions are called steady-state solutions since
they describe temperature distributions which are independent of time. Other
interpretations of (14-1) and its solutions are given in various exercises throughout
this chapter.
Before Eq. (14-1) can be solved in a circular or spherical region it must be
transformed to polar or spherical coordinates. The computations involved in
making these changes are somewhat lengthy, and so to avoid interrupting our
later work we dispose of them here and now.
The polar coordinate form of Laplace's equation is derived by introducing the
change of variables
    x = r cos θ,  y = r sin θ,                                        (14-2)

in

    ∂²u/∂x² + ∂²u/∂y² = 0,

as follows. From (14-2) and the chain rule for differentiation we obtain


    ∂u/∂r = cos θ ∂u/∂x + sin θ ∂u/∂y,
                                                                      (14-3)
    ∂u/∂θ = −r sin θ ∂u/∂x + r cos θ ∂u/∂y,

and since the determinant of this pair of equations is nonzero everywhere in the
punctured plane r ≠ 0, (14-3) can be solved for ∂u/∂x and ∂u/∂y whenever r ≠ 0.
This gives


    ∂u/∂x = cos θ ∂u/∂r − (1/r) sin θ ∂u/∂θ,
                                                                      (14-4)
    ∂u/∂y = sin θ ∂u/∂r + (1/r) cos θ ∂u/∂θ.

But these formulas are valid for any differentiable function u = u(x, y), and
hence can be applied to ∂u/∂x and ∂u/∂y themselves to obtain

    ∂²u/∂x² = cos θ ∂/∂r (∂u/∂x) − (1/r) sin θ ∂/∂θ (∂u/∂x),
                                                                      (14-5)
    ∂²u/∂y² = sin θ ∂/∂r (∂u/∂y) + (1/r) cos θ ∂/∂θ (∂u/∂y).

If the derivatives

    ∂/∂r (∂u/∂x),  ∂/∂θ (∂u/∂x),  ∂/∂r (∂u/∂y),  ∂/∂θ (∂u/∂y)

are now computed from (14-4), substituted in (14-5), and the resulting equations
added, we find that

    ∂²u/∂x² + ∂²u/∂y² = ∂²u/∂r² + (1/r) ∂u/∂r + (1/r²) ∂²u/∂θ²

(see Exercise 1). Hence Laplace's equation in polar coordinates is

    ∂²u/∂r² + (1/r) ∂u/∂r + (1/r²) ∂²u/∂θ² = 0.                       (14-6)

To convert (14-1) to spherical coordinates (r, φ, θ) we make the change of
variables

    x = r sin φ cos θ,
    y = r sin φ sin θ,
    z = r cos φ.

(See Fig. 14-1.) A computation entirely similar to the one given above now yields
the equality

    ∂²u/∂x² + ∂²u/∂y² + ∂²u/∂z²
        = (1/r²) ∂/∂r (r² ∂u/∂r) + (1/(r² sin φ)) ∂/∂φ (sin φ ∂u/∂φ)
          + (1/(r² sin² φ)) ∂²u/∂θ²,

and it follows that Laplace's equation in spherical coordinates is

    ∂/∂r (r² ∂u/∂r) + (1/sin φ) ∂/∂φ (sin φ ∂u/∂φ) + (1/sin² φ) ∂²u/∂θ² = 0.   (14-7)

Incidentally, (14-7) was the equation originally encountered by Laplace in his
study of gravitational attraction in three-space, and it was only later that he found
the version

    ∂²u/∂x² + ∂²u/∂y² + ∂²u/∂z² = 0

introduced earlier.
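The identity behind (14-6) can be sanity-checked by finite differences on any smooth test function; the particular u below is an arbitrary choice for this demo. Both sides of Δu = u_rr + (1/r)u_r + (1/r²)u_θθ are approximated at one point and compared.

```python
import numpy as np

def u(x, y):                          # arbitrary smooth test function (demo assumption)
    return np.exp(x) * np.sin(y) + x * y**3

up = lambda r, t: u(r * np.cos(t), r * np.sin(t))   # the same function in polar form

h = 1e-4
r0, t0 = 1.3, 0.7
x0, y0 = r0 * np.cos(t0), r0 * np.sin(t0)

# Cartesian Laplacian u_xx + u_yy by central differences
lap_xy = (u(x0 + h, y0) - 2 * u(x0, y0) + u(x0 - h, y0)
          + u(x0, y0 + h) - 2 * u(x0, y0) + u(x0, y0 - h)) / h**2

# polar expression u_rr + u_r / r + u_tt / r^2 from Eq. (14-6)
u_r = (up(r0 + h, t0) - up(r0 - h, t0)) / (2 * h)
u_rr = (up(r0 + h, t0) - 2 * up(r0, t0) + up(r0 - h, t0)) / h**2
u_tt = (up(r0, t0 + h) - 2 * up(r0, t0) + up(r0, t0 - h)) / h**2
lap_polar = u_rr + u_r / r0 + u_tt / r0**2

print(lap_xy, lap_polar)              # the two values agree
```

For this test function the exact Laplacian is 6xy, since e^x sin y is harmonic, which gives an independent check on both discretizations.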

EXERCISES

1. Complete the proof of the equality

    ∂²u/∂x² + ∂²u/∂y² = ∂²u/∂r² + (1/r) ∂u/∂r + (1/r²) ∂²u/∂θ².

2. Use the transformation formulas given in the text to derive Eq. (14-7) from (14-1).

14-2 LAPLACE'S EQUATION IN RECTANGULAR REGIONS

In this section we consider the equation

    ∂²u/∂x² + ∂²u/∂y² = 0                                             (14-8)

in a rectangular region of the form 0 ≤ x ≤ L, 0 ≤ y ≤ M, and begin by
imposing the boundary conditions

    u(0, y) = 0,  u(L, y) = 0,
                                                                      (14-9)
    u(x, M) = 0,  u(x, 0) = f(x).

In physical terms this problem requires that we find the steady-state temperature
in a thin rectangular plate three sides of which are held at 0°, and the fourth at
f(x), under the assumption that no heat is gained or lost across its faces.
We again find the solution by separating variables, and thus set u(x, y) = XY,
where X and Y are, respectively, functions of x and y alone. This implies that
u_xx = X″Y, u_yy = XY″, and hence (14-8) becomes

    Y″/Y = −X″/X
at those points where XY ≠ 0. This equation in turn is equivalent to the pair of
ordinary differential equations

    X″ + λX = 0,
                                                                      (14-10)
    Y″ − λY = 0,

λ a constant, and the given boundary conditions plus the requirement that
XY ≠ 0 imply

    X(0) = X(L) = 0,
    Y(M) = 0.

Hence we can again begin with the solutions of the Sturm-Liouville system

    X″ + λX = 0,
    X(0) = X(L) = 0,

that is, with the functions

    X_n(x) = A_n sin (nπx/L),  n = 1, 2, ...,

where X_n belongs to the eigenvalue λ_n = n²π²/L² and A_n is an arbitrary constant.
For these values of λ the second equation in (14-10) becomes

    Y″ − (n²π²/L²) Y = 0

and has the general solution

    Y_n(y) = B_n sinh (nπy/L) + C_n cosh (nπy/L),                     (14-12)

B_n and C_n arbitrary.* Since Y(M) = 0,

    B_n sinh (nπM/L) + C_n cosh (nπM/L) = 0,

and we can set

    B_n = −cosh (nπM/L),  C_n = sinh (nπM/L).

We now substitute these values in (14-12) and use the identity

    sinh α cosh β − cosh α sinh β = sinh (α − β),

valid for all α and β, to obtain

    Y_n(y) = sinh (nπ/L)(M − y).                                      (14-13)

* In this problem it is more convenient to work with hyperbolic functions than with
exponentials.
Thus each of the functions

    u_n(x, y) = A_n sin (nπx/L) sinh (nπ/L)(M − y),  n = 1, 2, ...,   (14-14)

is a solution of the two-dimensional Laplace equation which, in addition, satisfies
the boundary conditions

    u_n(0, y) = u_n(L, y) = u_n(x, M) = 0.

To find a solution which also satisfies the remaining boundary condition
u(x, 0) = f(x) we form the series

    u(x, y) = Σ_{n=1}^∞ u_n(x, y)
            = Σ_{n=1}^∞ A_n sin (nπx/L) sinh (nπ/L)(M − y).

When y = 0 this series becomes

    u(x, 0) = Σ_{n=1}^∞ A_n sinh (nπM/L) sin (nπx/L),

and will represent the function f on the interval (0, L) if the A_n sinh (nπM/L) are
the coefficients of the Fourier series expansion of the odd extension of f to [−L, L].
Thus

    A_n sinh (nπM/L) = (2/L) ∫₀^L f(x) sin (nπx/L) dx,

and

    u(x, y) = Σ_{n=1}^∞ [ (2/L) ∫₀^L f(s) sin (nπs/L) ds / sinh (nπM/L) ]
              × sin (nπx/L) sinh (nπ/L)(M − y).                       (14-15)

Once again it can be shown that this solution is valid whenever f is sufficiently
smooth (see Exercise 10), and that it is the only possible solution of the problem
in question.
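For f(x) = sin (πx/L) the integral in (14-15) kills every term but n = 1, leaving the closed form u(x, y) = sin (πx/L) sinh (π(M − y)/L) / sinh (πM/L). The sketch below (with illustrative L and M, a demo assumption) confirms the four boundary values and checks harmonicity at an interior point by finite differences.

```python
import numpy as np

L, M = 1.0, 2.0                       # illustrative rectangle for the demo

def u(x, y):
    # the n = 1 term of (14-15) when f(x) = sin(pi x / L)
    return (np.sin(np.pi * x / L)
            * np.sinh(np.pi * (M - y) / L) / np.sinh(np.pi * M / L))

x = np.linspace(0, L, 5)
print(u(0.0, 1.0), u(L, 1.0))                             # zero on the vertical sides
print(np.max(np.abs(u(x, M))))                            # zero on the top side
print(np.max(np.abs(u(x, 0.0) - np.sin(np.pi * x / L))))  # equals f on the bottom

# interior check of u_xx + u_yy = 0 by central differences
h = 1e-4
x0, y0 = 0.3, 0.8
lap = (u(x0 + h, y0) - 2 * u(x0, y0) + u(x0 - h, y0)
       + u(x0, y0 + h) - 2 * u(x0, y0) + u(x0, y0 - h)) / h**2
print(lap)                                                # ~ 0: u is harmonic
```
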
It is clear that the above argument can be applied to solve any boundary-value
problem for Laplace's equation in a rectangular region given that the solution is
to vanish on three sides of the rectangle. But if u₁, u₂, u₃, u₄ denote the solutions
corresponding to the four possible cases with one nonzero boundary condition,
the function

    u(x, y) = Σ_{k=1}^{4} u_k(x, y)

will be a solution assuming nonzero values on all four sides of the rectangle, and
it follows that we can actually solve all problems involving Laplace's equation
and boundary conditions of the form

    u(0, y) = f₁(y),  u(L, y) = f₂(y),
    u(x, 0) = f₃(x),  u(x, M) = f₄(x).

EXERCISES

In each of the following exercises find the solution of Laplace's equation in the rectangle
0 ≤ x ≤ L, 0 ≤ y ≤ M which satisfies the given boundary conditions.

1. u(0, y) = u(L, y) = 0,  u(x, M) = u(x, 0) = 2x(x − L)

2. u(L, y) = u(x, 0) = 0,

    u(x, M) = { x,      0 ≤ x ≤ L/2,
              { L − x,  L/2 ≤ x ≤ L,

    u(0, y) = { y,      0 ≤ y ≤ M/2,
              { M − y,  M/2 ≤ y ≤ M
3. u(0,y) = u(L,y) = 0,

(
L
0, < x <
2
u(x, M) = u(x, 0) = <
2 3L L2 L

4. u(x, 0) = u(0, y) = 0,  u(x, M) = x sin (πx/L),  u(L, y) = y sin (πy/M)

5. u(x, 0) = u(x, M) = x(x − L),  u(0, y) = u(L, y) = sin (πy/M)

6. Find the series solution of the equation u_xx + u_yy = 0 in 0 ≤ x ≤ L, 0 ≤ y ≤ M
given that

    u(0, y) = f₁(y),  u(L, y) = f₂(y),
    u(x, 0) = f₃(x),  u(x, M) = f₄(x).

7. Derive the formula for the steady-state temperature in a π × π square plate whose
faces are insulated so that no heat flows across them, under the assumption that

    u_x(0, y) = u_x(π, y) = u(x, π) = 0,  u(x, 0) = f(x).

8. Solve u_xx + u_yy = 0 in the region 0 ≤ x ≤ π, 0 ≤ y ≤ π, subject to the bound-
ary conditions

    u_x(π, y) = u_x(0, y) = u_y(x, π) = 0,  u(x, 0) = f(x).


9. Find the steady-state temperature in the infinite plate shown in Fig. 14-2 if the
vertical edges are kept at 0°, the edge on the x-axis at 100°, and if the temperature
throughout the plate is bounded.

FIGURE 14-2

*10. (This exercise assumes a knowledge of the material in Section 13-9.) When M =
L = π the boundary-value problem discussed in this section is

    ∂²u/∂x² + ∂²u/∂y² = 0,
    u(x, π) = u(0, y) = u(π, y) = 0,
    u(x, 0) = f(x),

and the formal solution found may be written

    u(x, y) = Σ_{n=1}^∞ A_n sin (nx) · (sinh n(π − y) / sinh nπ),

where Σ_{n=1}^∞ A_n sin nx is the Fourier series expansion of the odd extension of f to
the interval [−π, π].

(a) Show that the sequence of functions

    sinh n(π − y) / sinh nπ,  n = 1, 2, ...,

is monotone nondecreasing and uniformly bounded on 0 ≤ y ≤ π. (In fact,

    0 ≤ sinh n(π − y) / sinh nπ ≤ 1

for all n.)

(b) Assuming that f and f′ are piecewise continuous on 0 ≤ x ≤ π, show that
the formal solution given above is uniformly and absolutely convergent in the
rectangle 0 ≤ x ≤ π, y₀ ≤ y ≤ π for any y₀ > 0. [Hint: Use the methods of
Section 13-9, including that of Exercise 3.]

(c) Show that under the assumptions given in (b) the series obtained by twice
differentiating the formal solution term-by-term with respect to x and with respect
to y are also uniformly and absolutely convergent on 0 ≤ x ≤ π, y₀ ≤ y ≤ π.

(d) Using the results of (b) and (c), prove that the series solution u(x, y) satisfies
Laplace's equation in the rectangle 0 < x < π, 0 < y < π, and vanishes when
x = 0, when x = π, and when y = π.

(e) With f and f′ as above, prove that lim_{y→0⁺} u(x₀, y) = f(x₀) for each x₀ in
(0, π), and thus show that u(x, y) is a solution of the given boundary-value problem.

(f) Prove that the solution u(x, y) satisfies the boundary condition u(x, 0) = f(x) in
the stronger sense

    lim_{(x,y)→(x₀,0⁺)} u(x, y) = f(x₀)

for all x₀ in (0, π) whenever f is continuous and f′ piecewise continuous on [0, π].

14-3 LAPLACE'S EQUATION IN A CIRCULAR REGION;
THE POISSON INTEGRAL

As our next example we solve Laplace's equation in a circular region centered at
the origin. In this case we must use the polar coordinate form of the equation
which, as we have seen, is

    ∂²u/∂r² + (1/r) ∂u/∂r + (1/r²) ∂²u/∂θ² = 0.                       (14-16)

Here, in addition to the usual series solutions obtained by the method of separa-
tion of variables, there are certain exceptional solutions which depend upon only
one of the variables, and we begin by examining them.
First, if u is a function of r alone, (14-16) becomes

    d²u/dr² + (1/r) du/dr = 0,                                        (14-17)

or

    (1/r) d/dr (r du/dr) = 0,

and two integrations yield

    u = c ln (kr),                                                    (14-18)

where c and k > 0 are arbitrary constants. In this case the solutions are undefined
when r = 0, and hence are valid only in regions of the form r > 0, where they
provide solutions of (14-16) which are constant on circles centered at the origin.
On the other hand, if u is a function of θ alone, we have

    d²u/dθ² = 0,

and u = Aθ + B, where A and B are arbitrary constants. If we now demand
that u be single-valued (which is clearly the only type of solution of any physical
significance), then u(θ + 2π) = u(θ), A = 0, and it follows that (14-16) has no
nonconstant solutions which depend only upon the polar angle θ.
This said, we return to Eq. (14-16), which we now solve under the assumption
that u is single-valued, continuous, and assumes preassigned values on the boundary
of the circle. For convenience we shall work in the unit circle r ≤ 1, in which
case the boundary condition can be written

    u(1, θ) = f(θ),                                                   (14-19)

where f is a continuous function such that f(θ + 2π) = f(θ) for all θ.

Setting u(r, θ) = R(r)Θ(θ), (14-16) becomes

    R″Θ + (1/r)R′Θ + (1/r²)RΘ″ = 0,

and we have

    (r²R″ + rR′)/R = −Θ″/Θ = λ,   λ a constant.

This yields the pair of ordinary differential equations

    Θ″ + λΘ = 0,
                                                                      (14-20)
    r²R″ + rR′ − λR = 0,

the first of which is subjected to the periodic boundary conditions Θ(0) = Θ(2π).
Save for notation, this problem was solved as Example 2 in Section 12-5, where
we saw that its eigenvalues and eigenfunctions are

    λ_n = n²,  n = 0, 1, 2, ...,

and

    Θ_n(θ) = A_n cos nθ + B_n sin nθ.                                 (14-21)

Moreover, when λ = λ_n = n², the second equation in (14-20) becomes

    r²R″ + rR′ − n²R = 0,                                             (14-22)

which we recognize as an Euler equation whose solution space is spanned by the
functions r^n and r^{−n} when n ≠ 0, and 1 and ln r when n = 0 (see Section 4-8).
We reject the solutions r^{−n} and ln r because of their discontinuity at the origin,
and thus are left with

    R_n(r) = r^n.                                                     (14-23)

This done, we combine (14-21) and (14-23) to obtain the functions

    u_n(r, θ) = r^n (A_n cos nθ + B_n sin nθ),

each of which is continuous, satisfies Laplace's equation, and is periodic in θ with
period 2π. To satisfy the boundary condition u(1, θ) = f(θ), we now form the
series

    u(r, θ) = A₀/2 + Σ_{n=1}^∞ r^n (A_n cos nθ + B_n sin nθ),         (14-24)

and set r = 1. This gives

    u(1, θ) = A₀/2 + Σ_{n=1}^∞ (A_n cos nθ + B_n sin nθ),

and it follows that the constants A_n and B_n must be the Fourier coefficients of f; i.e.,

    A_n = (1/π) ∫_{−π}^{π} f(θ) cos nθ dθ,
                                                                      (14-25)
    B_n = (1/π) ∫_{−π}^{π} f(θ) sin nθ dθ.

Finally, we leave the reader the task of verifying that whenever f is sufficiently
smooth, (14-24) and (14-25) provide a continuous, single-valued solution of
Laplace's equation in the unit circle which assumes the prescribed values on the
boundary of the circle. The argument needed to establish these facts is outlined
in Exercise 7 below.
We have already observed that the solution of the above boundary-value prob-
lem can be interpreted as the steady-state temperature distribution in a circular
region of radius one, given the temperature on the boundary of the region. There
are, however, other physical interpretations of this solution (and, by implication,
of the problem leading to it) which are of considerable importance in more advanced
work. Most of them are derived from an integral form of the above solution,
which is obtained as follows. Let s denote the variable of integration in (14-25),
and substitute in (14-24). This gives

    u(r, θ) = (1/2π) ∫_{−π}^{π} f(s) ds
        + (1/π) Σ_{n=1}^∞ r^n [ ( ∫_{−π}^{π} f(s) cos ns ds ) cos nθ
        + ( ∫_{−π}^{π} f(s) sin ns ds ) sin nθ ].

Interchanging the order of integration and summation (an operation which can
be shown to be valid here), this expression may be rewritten

    u(r, θ) = (1/π) ∫_{−π}^{π} f(s) [ ½ + Σ_{n=1}^∞ r^n (cos ns cos nθ + sin ns sin nθ) ] ds
            = (1/π) ∫_{−π}^{π} f(s) [ ½ + Σ_{n=1}^∞ r^n cos n(s − θ) ] ds.

We now evaluate the sum

    ½ + Σ_{n=1}^∞ r^n cos n(s − θ)                                    (14-26)

appearing in this integral. The easiest way to go about this is to introduce complex
numbers, and set

    z = r[cos (s − θ) + i sin (s − θ)].

Then

    z^n = r^n [cos n(s − θ) + i sin n(s − θ)],

and it follows that (14-26) is the real part of

    ½ + Σ_{n=1}^∞ z^n.

But when |z| = r < 1, as it is here, the geometric series 1 + z + z² + ···
converges to 1/(1 − z). Hence

    ½ + Σ_{n=1}^∞ z^n = −½ + (1 + z + z² + ···)
        = −½ + 1/(1 − z)
        = −½ + 1 / (1 − [r cos (s − θ) + ir sin (s − θ)])
        = −½ + (1 − r cos (s − θ) + ir sin (s − θ)) / (1 − 2r cos (s − θ) + r²).

Taking the real part of this expression we have

    Re [−½ + 1/(1 − z)] = −½ + (1 − r cos (s − θ)) / (1 − 2r cos (s − θ) + r²)
                        = (1 − r²) / (2(1 − 2r cos (s − θ) + r²)),

and it follows that

    u(r, θ) = (1/2π) ∫_{−π}^{π} f(s) (1 − r²) / (1 − 2r cos (s − θ) + r²) ds.

This expression is known as the Poisson integral form of the solution of Laplace's
equation in the unit circle with boundary condition u(1, θ) = f(θ), and, as men-
tioned above, appears in advanced work in this subject.
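Since f(θ) = cos θ has the one-term series solution u(r, θ) = r cos θ, the Poisson integral must reproduce r cos θ exactly. The sketch below checks this with a periodic trapezoid rule, which is extremely accurate for smooth periodic integrands; the sample values of r and θ are arbitrary demo choices.

```python
import numpy as np

def poisson_u(f, r, th, m=2000):
    # u(r, th) = (1/2pi) * integral of f(s)(1 - r^2)/(1 - 2r cos(s - th) + r^2) ds
    s = np.linspace(-np.pi, np.pi, m, endpoint=False)
    kernel = (1 - r**2) / (1 - 2 * r * np.cos(s - th) + r**2)
    return float(np.mean(f(s) * kernel))   # uniform grid: mean = (1/2pi) * integral

r, th = 0.6, 1.1                           # arbitrary interior point
print(poisson_u(np.cos, r, th), r * np.cos(th))   # the two values agree
```

Taking f ≡ 1 recovers u ≡ 1, which is the mean-value property of harmonic functions in disguise.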
EXERCISES

1. Use the technique suggested in Example 3 of Section 4-8 to show that u = c ln (kr)
is the general solution of

    d²u/dr² + (1/r) du/dr = 0.

2. Verify that r^n and r^{−n} are solutions of

    r²R″ + rR′ − n²R = 0.

3. A thin homogeneous annular disk, having dimensions as shown in Fig. 14-3, is
insulated so that no heat can flow across its faces. Find the steady-state temperature
throughout the disk if the inner boundary is kept at 0°, while the outer is kept at
100°. At what points in the disk will the temperature be 50°?

FIGURE 14-3  FIGURE 14-4

4. If a doubly infinite straight wire carries a uniform static electric charge σ per unit
length (σ > 0), then the potential at a point P due to the charge on an element of
the wire of length dx is defined to be

    (1/R₁ − 1/R₂) σ dx,

where R₁ and R₂ are as shown in Fig. 14-4, and the distance OQ = 1. By experi-
ment it can be shown that the potential V(P) at P is the "sum" of the potentials due
to the various elements of length, i.e., that

    V(P) = 2σ ∫₀^∞ (1/R₁ − 1/R₂) dx.

Set OP = r, and evaluate this integral. Compare the result obtained with the
θ-independent solution of the two-dimensional Laplace equation.
5. Refer to Appendix I to obtain sufficient conditions on f(θ) to permit the necessary
interchange of the order of integration and summation in the derivation of the
Poisson integral.
6. Verify directly that the Poisson integral satisfies Laplace's equation when r < 1.
(See Appendix I for the necessary rule for differentiating the integral.)

*7. (This exercise assumes a knowledge of the material in Section 13-9.) Assume that
f(θ) and f′(θ) are piecewise continuous and periodic on [0, 2π] with period 2π, and
that

    f(θ) = ½[f(θ⁺) + f(θ⁻)]

everywhere in this interval. Prove that

    u(r, θ) = A₀/2 + Σ_{n=1}^∞ r^n (A_n cos nθ + B_n sin nθ),

where

    f(θ) = A₀/2 + Σ_{n=1}^∞ (A_n cos nθ + B_n sin nθ),

is a solution of the boundary-value problem discussed in this section. Reason as
follows:

(a) Show that |A_n cos nθ + B_n sin nθ| ≤ √(A_n² + B_n²) → 0 as n → ∞.

(b) Show that the solution series and all relevant series obtained from it by term-
wise differentiation converge uniformly and absolutely whenever r ≤ r₀ < 1. Now
conclude that the necessary termwise differentiations can be performed, and that the
solution series does satisfy Laplace's equation in the region r < 1.

(c) Show that lim_{r→1⁻} u(r, θ₀) = f(θ₀) for all θ₀ in [0, 2π].

(d) Prove that if f(θ) is also continuous, then the boundary condition is satisfied in
the stronger sense

    lim_{(r,θ)→(1,θ₀)} u(r, θ) = f(θ₀).

14-4 LAPLACE'S EQUATION IN A SPHERE;
SOLUTIONS INDEPENDENT OF θ

The remaining sections of this chapter will be given over to the study of Laplace's
equation in the unit sphere r ≤ 1 where, as we have seen, the equation can be
written

    ∂/∂r (r² ∂u/∂r) + (1/sin φ) ∂/∂φ (sin φ ∂u/∂φ) + (1/sin² φ) ∂²u/∂θ² = 0.   (14-28)

As usual we shall restrict our attention to solutions which are single-valued,
continuous, and bounded in the region under consideration, and which as a conse-
quence are periodic in θ with 2π as a period. With these assumptions in force it is
easy to show that (14-28) has no nonconstant solution involving only one of the
variables. Under less restrictive hypotheses, however, solutions of the latter
type do exist, but since they can be found without difficulty they have been left
to the exercises.
In this section we solve (14-28) under the assumptions that the solution is
independent of θ, and that u is known when r = 1; i.e., u(1, φ) = f(φ). Then
∂²u/∂θ² = 0, and (14-28) assumes the somewhat simpler form

    r² ∂²u/∂r² + 2r ∂u/∂r + cot φ ∂u/∂φ + ∂²u/∂φ² = 0.                (14-29)

We now apply the method of separation of variables, this time with u(r, φ) =
R(r)Φ(φ), to obtain the pair of equations

    r²R″ + 2rR′ − λR = 0,   0 ≤ r < 1,                                (14-30)

and

    Φ″ + cot φ Φ′ + λΦ = 0,   0 < φ < π,                              (14-31)

λ a constant, which can be solved as follows.
Set s = cos φ in (14-31). Then

    dΦ/dφ = −sin φ (dΦ/ds),
    d²Φ/dφ² = sin² φ (d²Φ/ds²) − cos φ (dΦ/ds),

and we have

    sin² φ (d²Φ/ds²) − 2 cos φ (dΦ/ds) + λΦ = 0,

or

    (1 − s²) (d²Φ/ds²) − 2s (dΦ/ds) + λΦ = 0.                         (14-32)

But this is none other than Legendre's equation, and hence the eigenvalues for the
problem under consideration are the integers λ_n = n(n + 1), n = 0, 1, 2, ...
(see Example 4, Section 12-6). Since the corresponding eigenfunctions for (14-32)
are the Legendre polynomials P_n(s), it follows that the functions

    Φ_n(φ) = P_n(cos φ),  n = 0, 1, 2, ...,                           (14-33)

are the eigenfunctions for (14-31) under the given boundary conditions.
Next we observe that (14-30) is an Euler equation. Moreover, when λ = λ_n =
n(n + 1), an easy computation shows that its solution space is spanned by the
pair of functions

    R_n(r) = r^n  and  R_n(r) = r^{−(n+1)},

and since the second of these solutions must be rejected because of its discontinuity
at the origin, we conclude that the relevant solutions of (14-29) are

    u_n(r, φ) = A_n r^n P_n(cos φ),  n = 0, 1, 2, ...,                (14-34)

where A_n is an arbitrary constant and P_n is the Legendre polynomial of degree n.*

* By definition, u₀(r, φ) has the value A₀ at the origin.


560 BOUNDARY- VALUE PROBLEMS FOR LAPLACE'S EQUATION | CHAP. 14

Finally, to satisfy the boundary condition w(l, <p) = f(<p) we form the series

00

<r, <p) = ^2 4 n r nPn (cos <p), (14-35)


w=0

and determine the A n so that

f(<p)
= Y,A n Pn (cos<p) (14-36)
n=0

for all <p in the interval [0, tt]. To this end we again set s = cos <p and rewrite
(14-36) as
00

/(cos
-1
s) = 2dP
n=0
n n (s),

inwhich form it is obvious that the A n must be chosen as the coefficients of the
-1
Legendre series expansion of /(cos s); i.e.,

In + 1
/(cos
i
s)Pn (s) ds
2 1
(14-37)
2n + 1
f(<p)Pn (cos <p) sin <p d<p.

Using these coefficients, (14-35) will converge in the mean to f(φ) when r = 1.
Moreover, as in all of the preceding examples, it can also be shown that this series
is uniformly and absolutely convergent, and twice differentiable term-by-term with
respect to each variable whenever f is sufficiently smooth (see Exercise 13). This,
of course, implies that (14-35) is a solution of the given boundary-value problem,
and since this problem has a unique solution (see Appendix IV), we are done.
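The computation just described is easy to test numerically. The sketch below is my own illustration, not part of the text: the helper names are arbitrary, and I take the boundary temperature to be f(φ) = cos^2 φ. Since cos^2 φ = (P_0(cos φ) + 2P_2(cos φ))/3, the coefficients (14-37) should come out as A_0 = 1/3, A_2 = 2/3, and zero otherwise.

```python
import math

def legendre(n, x):
    """P_n(x) via Bonnet's recurrence (k+1)P_{k+1} = (2k+1)x P_k - k P_{k-1}."""
    p0, p1 = 1.0, x
    if n == 0:
        return p0
    for k in range(1, n):
        p0, p1 = p1, ((2 * k + 1) * x * p1 - k * p0) / (k + 1)
    return p1

def simpson(g, a, b, m=2000):
    """Composite Simpson rule with an even number m of subintervals."""
    h = (b - a) / m
    s = g(a) + g(b)
    for i in range(1, m):
        s += (4.0 if i % 2 else 2.0) * g(a + i * h)
    return s * h / 3.0

def coefficient(f, n):
    """A_n = (2n+1)/2 * integral_{-1}^{1} f(arccos s) P_n(s) ds, as in (14-37)."""
    return (2 * n + 1) / 2.0 * simpson(lambda s: f(math.acos(s)) * legendre(n, s), -1.0, 1.0)

def u(r, phi, coeffs):
    """Partial sum of the series (14-35)."""
    return sum(a * r ** n * legendre(n, math.cos(phi)) for n, a in enumerate(coeffs))

f = lambda phi: math.cos(phi) ** 2                  # boundary values u(1, phi) = f(phi)
coeffs = [coefficient(f, n) for n in range(6)]      # A_0, ..., A_5
```

At r = 1 the partial sum reproduces the boundary data, in agreement with the mean-convergence statement above.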
Before going on, there is one rather interesting aspect of these results which
deserves closer attention. It concerns our apparent success in obtaining an
orthogonal series expansion for the function f(φ) defined on the surface of the
unit sphere, which, after all, is the import of the assertion that

        Σ_{n=0}^∞ A_n P_n(cos φ)

converges in the mean to f whenever the A_n are given by (14-37). One can, of
course, adopt the pedestrian point of view which sees this result as nothing more
than a statement of standard facts concerning Legendre series. However, a more
liberal and suggestive interpretation is possible by incorporating this result within
the context of orthogonal series expansions, as follows.
Let 𝒞_φ(S) denote the set of all real-valued continuous functions defined on the
surface of the unit sphere and dependent only on the variable φ (φ = colatitude
on the sphere). Clearly 𝒞_φ(S) is a real vector space under the usual definitions of

addition and scalar multiplication, and also becomes a Euclidean space if we set

        f · g = ∬_S f(φ)g(φ) dS,                                        (14-38)

where the integral in question is taken over all of S. Integrals of this sort are known
as surface integrals and are discussed in Appendix IV. For now we need know
only that they enjoy all of the standard properties of linearity (a fact which is
implicit in the statement that f · g is an inner product), and that they can be
evaluated as ordinary iterated integrals. Indeed, when S is the surface of the unit
sphere, it can be shown that

        ∬_S f(φ)g(φ) dS = ∫_0^{2π} ∫_0^π f(φ)g(φ) sin φ dφ dθ.                (14-39)

Thus (14-38) may be rewritten

        f · g = ∫_0^{2π} ∫_0^π f(φ)g(φ) sin φ dφ dθ

              = 2π ∫_0^π f(φ)g(φ) sin φ dφ,

and can be evaluated in a perfectly routine fashion.


Now let P_n(cos φ) denote the nth Legendre polynomial, viewed as a function
in 𝒞_φ(S). Then

        P_m · P_n = 2π ∫_0^π P_m(cos φ) P_n(cos φ) sin φ dφ,

and thus if we make the change of variable s = cos φ, we obtain

        P_m · P_n = 2π ∫_{-1}^{1} P_m(s) P_n(s) ds

                  = 0                if m ≠ n,
                  = 4π/(2m + 1)      if m = n.

                                                        FIGURE 14-5

* This equality can be motivated by observing that the integral on the left is taken
over the surface of the unit sphere, described by letting φ vary from 0 to π, θ from 0 to 2π.
Since an element of surface dS on a sphere of radius r is given by the expression

        dS = (r sin φ dθ)(r dφ) = r^2 sin φ dφ dθ,

(14-39) follows by setting r = 1. (See Fig. 14-5.)



n = 0        n = 1        n = 2        n = 3

FIGURE 14-6

(See Theorem 11-6.) Thus the functions P_n(cos φ), n = 0, 1, 2, . . . , are mutually
orthogonal in 𝒞_φ(S), and the series given in (14-36), with coefficients as in (14-37),
is the orthogonal series expansion of f in terms of the P_n(cos φ). From this point
of view the statement that this series converges in the mean to f is equivalent to
the assertion that the P_n(cos φ) form a basis for 𝒞_φ(S), a fact which is proved in
exactly the same way as the corresponding statement for the Legendre polynomials
in the space 𝒫𝒞[−1, 1].
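This orthogonality is easy to confirm by machine. The following sketch is mine, not the authors': it evaluates the inner product 2π ∫_0^π P_m(cos φ)P_n(cos φ) sin φ dφ by the trapezoid rule and checks it against 0 for m ≠ n and 4π/(2m + 1) for m = n.

```python
import math

def legendre(n, x):
    """P_n(x) via Bonnet's recurrence."""
    p0, p1 = 1.0, x
    if n == 0:
        return p0
    for k in range(1, n):
        p0, p1 = p1, ((2 * k + 1) * x * p1 - k * p0) / (k + 1)
    return p1

def inner(m, n, steps=4000):
    """The inner product P_m . P_n in C_phi(S):
    2*pi * integral_0^pi P_m(cos phi) P_n(cos phi) sin(phi) d(phi), trapezoid rule."""
    h = math.pi / steps
    total = 0.0
    for i in range(steps + 1):
        phi = i * h
        w = 0.5 if i in (0, steps) else 1.0
        c = math.cos(phi)
        total += w * legendre(m, c) * legendre(n, c) * math.sin(phi)
    return 2.0 * math.pi * total * h
```

For example, `inner(2, 3)` and `inner(2, 4)` are numerically zero, while `inner(3, 3)` matches 4π/7.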

The functions P_n(cos φ) appearing in this discussion are known as surface zonal
harmonics, or simply zonal harmonics.* The use of the term "harmonic" is ex-
plained by the fact that any solution of Laplace's equation, and thus, in particular,
P_n(cos φ), is said to be a harmonic function. The term "zonal" is applied because
the curves on the surface S of the unit sphere along which P_n(cos φ) vanishes are
parallel to the equator of S and thus separate S into "zones." Indeed, using the
fact that the Legendre polynomial of degree n has n roots x_1, . . . , x_n in the open
interval (−1, 1) (Theorem 11-7), it follows that the zeros of P_n(cos φ) consist
of the n parallels of latitude on the unit sphere given by φ_1 = cos^{-1} x_1, . . . ,
φ_n = cos^{-1} x_n, 0 < φ_i < π. These curves determine the zones alluded to in the name
zonal harmonic (see Fig. 14-6).

* Some authors reserve the latter term for the functions r^n P_n(cos φ).
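The zone boundaries themselves are easy to compute. The sketch below is my own illustration (the book gives no code): it locates the roots x_i of P_n by bisection and converts them to colatitudes φ_i = cos^{-1} x_i. For n = 3 the roots are 0 and ±√(3/5), so the zone boundaries sit at roughly 39.2°, 90°, and 140.8° of colatitude.

```python
import math

def legendre(n, x):
    """P_n(x) via Bonnet's recurrence."""
    p0, p1 = 1.0, x
    if n == 0:
        return p0
    for k in range(1, n):
        p0, p1 = p1, ((2 * k + 1) * x * p1 - k * p0) / (k + 1)
    return p1

def legendre_zeros(n, grid=2000):
    """The n roots of P_n in (-1, 1), located by bisection on sign changes."""
    xs = [-1.0 + 2.0 * i / grid for i in range(grid + 1)]
    roots = []
    for a, b in zip(xs, xs[1:]):
        fa, fb = legendre(n, a), legendre(n, b)
        if fa == 0.0:
            roots.append(a)                      # grid point happens to be a root
        elif fa * fb < 0.0:
            lo, hi = a, b
            for _ in range(60):                  # bisect to machine precision
                mid = 0.5 * (lo + hi)
                if legendre(n, lo) * legendre(n, mid) <= 0.0:
                    hi = mid
                else:
                    lo = mid
            roots.append(0.5 * (lo + hi))
    return roots

zeros = legendre_zeros(3)                        # x_1 < x_2 < x_3
zones = [math.acos(x) for x in reversed(zeros)]  # colatitudes phi_i, increasing
```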

EXERCISES

1. Verify that the solution space of the equation

        r^2 R″ + 2rR′ − λR = 0

   is spanned by the functions r^n and r^{−(n+1)} when λ = n(n + 1).

2. Find the general solution of Eq. (14-28) which depends only on r, and which
vanishes at infinity.

3. Show that (14-28) has no nonconstant single-valued solution which depends only
   on θ.
4. Find the general solution of (14-28) which depends only on <p.

5. Use the results of the preceding exercises to show that under the conditions assumed
in the text (14-28) has no nonconstant solutions which involve only one of the
variables r, <p, 6.
6. A semi-infinite wire coincides with the positive z-axis, and carries a static electric
   charge a per unit length (a > 0), while a similar wire, oppositely charged, coincides
   with the negative z-axis. Given that the potential at a point P in space due to the
   charge element a dz is a dz/R (see Fig. 14-7), perform the appropriate integration,
   and find the potential at P due to the entire (doubly infinite) wire. Compare your
   answer with the result obtained in Exercise 4.

7. Find a solution u(r, φ) of Laplace's equation in the region r > 1 given that
   u(1, φ) = f(φ).

8. Let u(r, <p) be the steady-state temperature in a spherical shell of inner radius a,

outer radius b. Find u(r, <p) if u(a, <p) = f(<p), u(b, <p) = 0.

9. Find the steady-state temperature in a homogeneous sphere of unit radius when
   the surface is maintained at the following temperatures.
   (a) cos^2 φ    (b) cos 2φ    (c) 2 sin^2 φ − 1    (d) sin^4 φ

10. Suppose that

        Σ_{k=1}^n ∂^2u/∂x_k^2 = 0,

    where u = u_1(x_1) · · · u_n(x_n). Prove that

        u_k″/u_k = λ_k,    λ_k a constant,    k = 1, . . . , n,

    and that λ_1 + · · · + λ_n = 0.

11. Find the steady-state temperature in a homogeneous hemisphere of radius 1, given
    that the temperature is independent of longitude, that the flat surface is kept at 0°,
    and that the remainder of the surface has temperature f(φ), where f(π/2) = 0.

12. Solve the problem in Exercise 11 when f(φ) = cos^3 φ − cos φ.

*13. Let f(φ) = ½[f(φ^+) + f(φ^−)] be piecewise continuous and periodic with period
     2π, and let f′(φ) be piecewise continuous for all φ.
     (a) Prove that the series

            u(r, φ) = Σ_{n=0}^∞ A_n r^n P_n(cos φ)

     constructed in the text is uniformly and absolutely convergent for r ≤ r_0 < 1,
     0 ≤ φ ≤ π. [Hint: Recall that under these assumptions the Legendre series for f
     converges in the mean to f, and hence that there exists a real number M such that
     |A_n| ≤ M for all n. Then use the fact that |P_n(x)| ≤ 1 for −1 ≤ x ≤ 1. (See
     Exercise 15, Section 11-3.)]
     (b) Given that |P_n^{(k)}(x)| ≤ n^{2k} for −1 ≤ x ≤ 1, where P_n^{(k)} denotes the kth
     derivative of P_n, prove that the series in (a) may be twice differentiated term-
     by-term with respect to either variable, and that the resulting series still converge
     uniformly and absolutely for r ≤ r_0 < 1, 0 ≤ φ ≤ π.
     (c) Use the result in (b) to show that this series is a solution of Laplace's equation
     in the region r < 1.
     (d) Prove that

            lim_{r→1^−} u(r, φ_0) = f(φ_0)

     for fixed φ_0.
     (e) Prove that

            lim_{r→1^−, φ→φ_0} u(r, φ) = f(φ_0)

     whenever Σ_{n=0}^∞ A_n P_n(cos φ) converges uniformly to f(φ) on the interval [0, π].

14-5 LAPLACE'S EQUATION; SPHERICAL HARMONICS


In this section we solve Laplace's equation in the spherical region r < 1 subject
only to the requirement that the solution be continuous, single-valued, and take
on preassigned values when r = 1. Thus we consider the equation

        ∂/∂r (r^2 ∂u/∂r) + (1/sin φ) ∂/∂φ (sin φ ∂u/∂φ) + (1/sin^2 φ) ∂^2u/∂θ^2 = 0        (14-40)

in conjunction with the boundary condition u(1, φ, θ) = f(φ, θ), where f is
continuous on the surface of the unit sphere.
Here we begin by seeking solutions of the form u(r, φ, θ) = R(r)Φ(φ)Θ(θ), and
substitute in (14-40) to obtain

        (r^2 R″ + 2rR′)/R + (Φ″ + cot φ Φ′)/Φ + (1/sin^2 φ)(Θ″/Θ) = 0.        (14-41)

Since the first term in this equation depends only on r, while the second is inde-
pendent of r, we have

        (r^2 R″ + 2rR′)/R = λ,    λ a constant,

or

        r^2 R″ + 2rR′ − λR = 0.                                        (14-42)

But for each value of λ the functions r^α, α = ½(−1 ± √(1 + 4λ)), are a basis
for the solution space of this equation (see Exercise 1). Thus if we demand that
α be an integer (which we do for the sake of simplicity) it follows that
λ = m(m + 1), m = 0, 1, 2, . . . , and that α = m, or α = −(m + 1).* Hence,
in this instance, the relevant solutions of (14-42) are

        R_m(r) = r^m,    m = 0, 1, 2, . . . ,                                (14-43)

the solutions r^{−(m+1)} being rejected for reasons of continuity.
Next, we use the chosen values of λ to rewrite (14-41) as

        m(m + 1) + (Φ″ + cot φ Φ′)/Φ + (1/sin^2 φ)(Θ″/Θ) = 0,

or as

        Θ″/Θ + [m(m + 1) sin^2 φ + ((Φ″ + cot φ Φ′)/Φ) sin^2 φ] = 0.        (14-44)

Thus Θ″/Θ is also constant, and the requirement that Θ be periodic with period
2π implies that

        Θ″/Θ = −n^2,    n = 0, 1, 2, . . . .

Hence the admissible values of Θ are

        Θ_n(θ) = A_n cos nθ + B_n sin nθ,                                (14-45)

A_n and B_n constants, n = 0, 1, 2, . . . .

Finally, when Θ″/Θ = −n^2, (14-44) becomes

        Φ″ + cot φ Φ′ + [m(m + 1) − n^2/sin^2 φ]Φ = 0,                        (14-46)

an equation which we now solve by means of the following ingenious argument.

* The reader who feels that we have been somewhat ruthless in discarding potential
values of λ should realize that we are only trying to find a single solution of (14-40) which
satisfies the given boundary conditions, and the fact that the above choices lead to such
a solution will furnish a posteriori justification for making them. Of course, the heart of
the matter lies in a uniqueness theorem which can be cited to prove that nothing is lost
by this line of reasoning.

Set s = cos φ. Then (14-46) becomes

        (1 − s^2) d^2Φ/ds^2 − 2s dΦ/ds + [m(m + 1) − n^2/(1 − s^2)]Φ = 0,        (14-47)

and in this form is reminiscent of Legendre's equation of order m. To exploit
this similarity we make the second change of variable

        Φ = (1 − s^2)^{n/2} w.

Then

        dΦ/ds = (1 − s^2)^{n/2} [dw/ds − (ns/(1 − s^2)) w],

        d^2Φ/ds^2 = (1 − s^2)^{n/2} [d^2w/ds^2 − (2ns/(1 − s^2)) dw/ds
                        + (n((n − 1)s^2 − 1)/(1 − s^2)^2) w],

and (14-47) becomes

        (1 − s^2) d^2w/ds^2 − 2(n + 1)s dw/ds + [m(m + 1) − n(n + 1)]w = 0.        (14-48)

On the other hand, we know that P_m, the Legendre polynomial of degree m,
satisfies the equation

        (1 − s^2)y″ − 2sy′ + m(m + 1)y = 0                                (14-49)

on [−1, 1]. Hence P_m^{(n)}, the nth derivative of P_m, satisfies the equation obtained
by differentiating (14-49) n times with respect to s. Performing these differentia-
tions we obtain, successively,

        (1 − s^2)y^{(3)} − 2(2s)y^{(2)} + [m(m + 1) − 2]y^{(1)} = 0,

        (1 − s^2)y^{(4)} − 3(2s)y^{(3)} + [m(m + 1) − 2(3)]y^{(2)} = 0,
        . . . . . . . . . . . . . . . . . . . . . . . . . . . .
        (1 − s^2)y^{(n+2)} − (n + 1)(2s)y^{(n+1)} + [m(m + 1) − n(n + 1)]y^{(n)} = 0,

and since the last of these equations is just another form of (14-48), it follows that
P_m^{(n)} is a solution of that equation on [−1, 1]. When rewritten in terms of φ we
find that

        Φ_mn(φ) = sin^n φ P_m^{(n)}(cos φ)                                (14-50)

is a nontrivial solution of (14-46) on [0, π] for each pair of non-negative integers
m and n, n ≤ m.*

* Since P_m is a polynomial of degree m, P_m^{(n)} will be nonzero only if n ≤ m.
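The chain of differentiations above can be verified mechanically. The sketch below is mine (the book gives no code): it builds the coefficient list of P_m by Bonnet's recurrence, differentiates it n times, and checks that w = P_m^{(n)} makes the left side of (14-48) vanish.

```python
def legendre_coeffs(m):
    """Coefficients of P_m in ascending powers, via Bonnet's recurrence."""
    p0, p1 = [1.0], [0.0, 1.0]
    if m == 0:
        return p0
    for k in range(1, m):
        xp1 = [0.0] + p1                      # coefficient list of x * P_k
        nxt = [((2 * k + 1) * xp1[i] - (k * p0[i] if i < len(p0) else 0.0)) / (k + 1)
               for i in range(len(xp1))]
        p0, p1 = p1, nxt
    return p1

def deriv(c):
    """Coefficient list of the derivative."""
    return [i * c[i] for i in range(1, len(c))] or [0.0]

def polyval(c, x):
    """Horner evaluation of an ascending-power coefficient list."""
    v = 0.0
    for a in reversed(c):
        v = v * x + a
    return v

def residual(m, n, s):
    """(1-s^2)w'' - 2(n+1)s w' + [m(m+1) - n(n+1)]w for w = P_m^{(n)}, eq. (14-48)."""
    w = legendre_coeffs(m)
    for _ in range(n):
        w = deriv(w)
    w1 = deriv(w)
    w2 = deriv(w1)
    return ((1 - s * s) * polyval(w2, s) - 2 * (n + 1) * s * polyval(w1, s)
            + (m * (m + 1) - n * (n + 1)) * polyval(w, s))
```

Taking n = 0 recovers Legendre's equation (14-49) itself.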

Combining the solutions found in (14-43), (14-45), and (14-50) we obtain the
functions

        r^m (A_mn cos nθ + B_mn sin nθ) sin^n φ P_m^{(n)}(cos φ),

                n = 0, 1, . . . , m,    m = 0, 1, 2, . . . ,
                A_mn, B_mn constants,

each of which is a solution of (14-40). To complete the discussion we now use
these functions to construct a solution u(r, φ, θ) which also satisfies the boundary
condition u(1, φ, θ) = f(φ, θ). The student will recognize the method.
Let 𝒞(S) denote the Euclidean space composed of all continuous functions
defined on the surface S of the unit sphere with inner product

        f · g = ∬_S f(φ, θ)g(φ, θ) dS

              = ∫_0^{2π} ∫_0^π f(φ, θ)g(φ, θ) sin φ dφ dθ,

and let

        u_mn(φ, θ) = cos (nθ) sin^n φ P_m^{(n)}(cos φ),
                                                                        (14-51)
        v_mn(φ, θ) = sin (nθ) sin^n φ P_m^{(n)}(cos φ),

where m and n are as above. These particular functions are known as spherical
harmonics, and for each fixed value of m the 2m + 1 functions

        u_m0, u_m1, . . . , u_mm, v_m1, . . . , v_mm

are called the spherical harmonics of order m. In the sections which follow we will
prove that these functions are a basis for 𝒞(S), and hence that f(φ, θ) can be
written in the form

        f(φ, θ) = Σ_{m=0}^∞ [ (A_m0/2) u_m0 + Σ_{n=1}^m (A_mn u_mn + B_mn v_mn) ],        (14-52)

where the series converges in the mean to f, and

        A_m0/2 = (f · u_m0)/‖u_m0‖^2
               = [∫_0^{2π} ∫_0^π f(φ, θ) sin φ P_m(cos φ) dφ dθ] / (u_m0 · u_m0),

        A_mn = (f · u_mn)/(u_mn · u_mn)
             = [∫_0^{2π} ∫_0^π f(φ, θ) cos (nθ) sin^{n+1} φ P_m^{(n)}(cos φ) dφ dθ] / (u_mn · u_mn),    n ≠ 0,

        B_mn = (f · v_mn)/(v_mn · v_mn)
             = [∫_0^{2π} ∫_0^π f(φ, θ) sin (nθ) sin^{n+1} φ P_m^{(n)}(cos φ) dφ dθ] / (v_mn · v_mn),    n ≠ 0.


As usual, when f is sufficiently well behaved this series is twice differentiable term-
by-term with respect to each variable, and the series

        u(r, φ, θ) = Σ_{m=0}^∞ r^m [ (A_m0/2) u_m0 + Σ_{n=1}^m (A_mn u_mn + B_mn v_mn) ]

is the solution of the given boundary-value problem.
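As a concrete check, the sketch below (my own code, with arbitrary helper names) computes the generalized Fourier coefficients as quotients of numerically evaluated inner products, exactly as in the formulas above. For the boundary function f(φ, θ) = sin φ cos θ, which is precisely the spherical harmonic u_11, every coefficient should vanish except A_11 = 1, so the solution is u(r, φ, θ) = r sin φ cos θ.

```python
import math

def legendre_coeffs(m):
    """Coefficients of P_m in ascending powers (Bonnet's recurrence)."""
    p0, p1 = [1.0], [0.0, 1.0]
    if m == 0:
        return p0
    for k in range(1, m):
        xp1 = [0.0] + p1
        p0, p1 = p1, [((2 * k + 1) * xp1[i] - (k * p0[i] if i < len(p0) else 0.0)) / (k + 1)
                      for i in range(len(xp1))]
    return p1

def pmn(m, n, x):
    """n-th derivative of P_m, evaluated at x."""
    c = legendre_coeffs(m)
    for _ in range(n):
        c = [i * c[i] for i in range(1, len(c))] or [0.0]
    v = 0.0
    for a in reversed(c):
        v = v * x + a
    return v

def sph(m, n, kind):
    """u_mn (kind 'u') or v_mn (kind 'v'), as defined in (14-51)."""
    trig = math.cos if kind == "u" else math.sin
    return lambda phi, theta: trig(n * theta) * math.sin(phi) ** n * pmn(m, n, math.cos(phi))

def inner(f, g, steps=200):
    """f . g on the unit sphere: trapezoid in phi, plain Riemann sum in theta."""
    hp, ht = math.pi / steps, 2 * math.pi / steps
    total = 0.0
    for i in range(steps + 1):
        phi = i * hp
        wp = 0.5 if i in (0, steps) else 1.0
        for j in range(steps):
            total += wp * f(phi, j * ht) * g(phi, j * ht) * math.sin(phi)
    return total * hp * ht

def coeff(f, m, n, kind="u"):
    """A_mn (kind 'u') or B_mn (kind 'v') as a quotient of inner products."""
    g = sph(m, n, kind)
    return inner(f, g) / inner(g, g)

f = lambda phi, theta: math.sin(phi) * math.cos(theta)     # f = u_11 on the sphere
```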

EXERCISES

1. (a) Find the general solution of

        r^2 R″ + 2rR′ − λR = 0

   on (0, ∞).
   (b) Prove that this equation has solutions of the form r^k, k a non-negative integer,
   if and only if λ = m(m + 1), m = 0, 1, 2, . . . .

2. Use mathematical induction to prove that the nth derivative of a solution of
   Legendre's equation of order m is a solution of

        (1 − x^2)y^{(n+2)} − 2(n + 1)xy^{(n+1)} + [m(m + 1) − n(n + 1)]y^{(n)} = 0.

3. Compute the spherical harmonics of orders 0, 1, and 2.

4. Make the substitution u(r, φ, θ) = r^n F(φ, θ) in (14-40) to obtain a partial differential
   equation for F(φ, θ), and then set s = cos φ to obtain

        ∂/∂s [(1 − s^2) ∂F/∂s] + (1/(1 − s^2)) ∂^2F/∂θ^2 + n(n + 1)F = 0.

Solve this equation by the method of separation of variables, and compare the
results with those in the text.

14-6 ORTHOGONALITY OF THE SPHERICAL


HARMONICS; LAPLACE SERIES

To complete the discussion of the last section we must show that the spherical
harmonics

        u_mn(φ, θ) = cos nθ sin^n φ P_m^{(n)}(cos φ),
                                                                        (14-53)
        v_mn(φ, θ) = sin nθ sin^n φ P_m^{(n)}(cos φ),

n = 0, 1, . . . , m; m = 0, 1, 2, . . . , are a basis for the Euclidean space 𝒞(S).
(We assume throughout that n ≠ 0 in v_mn.) As usual, the proof is divided into
two parts: a straightforward computation to establish orthogonality, and a
sequence of theorems culminating in the assertion that the u_mn, v_mn generate
𝒞(S). We begin with the proof of orthogonality, which in this case requires

that we show

        u_mn · u_rs = 0    whenever m ≠ r or n ≠ s (or both),

        v_mn · v_rs = 0    whenever m ≠ r or n ≠ s (or both),                (14-54)

        u_mn · v_rs = 0    for all m, n, r, s.

Recalling that inner products in 𝒞(S) are computed as twofold iterated integrals
according to the formula

        f · g = ∫_0^{2π} ∫_0^π f(φ, θ)g(φ, θ) sin φ dφ dθ,                (14-55)

we have

        u_mn · u_rs = ∫_0^{2π} cos nθ cos sθ dθ ∫_0^π sin^{n+s+1} φ P_m^{(n)}(cos φ)P_r^{(s)}(cos φ) dφ.

Thus if n ≠ s, u_mn · u_rs = 0 since the first integral in this expression vanishes. On
the other hand, if n = s, then m ≠ r and the equation u_mn · u_rs = 0 will be
satisfied if and only if

        ∫_0^π sin^{2n+1} φ P_m^{(n)}(cos φ)P_r^{(n)}(cos φ) dφ = 0,

or, making the substitution x = cos φ, if and only if

        ∫_{-1}^{1} (1 − x^2)^n P_m^{(n)}(x)P_r^{(n)}(x) dx = 0.                (14-56)
To establish this equality we recall that P_m^{(n)} and P_r^{(n)} satisfy the identities

        (1 − x^2)P_m^{(n+2)} − 2(n + 1)xP_m^{(n+1)} + [m(m + 1) − n(n + 1)]P_m^{(n)} = 0,

        (1 − x^2)P_r^{(n+2)} − 2(n + 1)xP_r^{(n+1)} + [r(r + 1) − n(n + 1)]P_r^{(n)} = 0,
                                                                        (14-57)

throughout the interval [−1, 1]. We now multiply the first of these expressions
by (1 − x^2)^n P_r^{(n)}, the second by (1 − x^2)^n P_m^{(n)}, and subtract to obtain

        (1 − x^2)^{n+1}[P_m^{(n+2)}P_r^{(n)} − P_r^{(n+2)}P_m^{(n)}]
                − 2(n + 1)x(1 − x^2)^n [P_m^{(n+1)}P_r^{(n)} − P_r^{(n+1)}P_m^{(n)}]
                        = [r(r + 1) − m(m + 1)](1 − x^2)^n P_m^{(n)}P_r^{(n)}.

But by setting P_m^{(n+1)}P_r^{(n)} − P_r^{(n+1)}P_m^{(n)} = y, the left-hand side of this identity
may be rewritten

        d/dx [(1 − x^2)^{n+1} y],

and we have

        d/dx [(1 − x^2)^{n+1} y] = [r(r + 1) − m(m + 1)](1 − x^2)^n P_m^{(n)}P_r^{(n)}.



Integrating from −1 to 1 gives

        [r(r + 1) − m(m + 1)] ∫_{-1}^{1} (1 − x^2)^n P_m^{(n)}P_r^{(n)} dx = 0,

and since m ≠ r,

        ∫_{-1}^{1} (1 − x^2)^n P_m^{(n)}P_r^{(n)} dx = 0,

as desired.
To complete the proof we observe that the above argument applies equally well
to the v_mn, and shows that v_mn · v_rs = 0 whenever m ≠ r or n ≠ s. Finally, the
vanishing of u_mn · v_rs for all m, n, r, s is an immediate consequence of the or-
thogonality of cos nθ and sin sθ in 𝒞[0, 2π], and we have

Theorem 14-1. The spherical harmonics are mutually orthogonal in the


Euclidean space of all continuous functions on the unit sphere.

For future use we take formal note of the following obvious consequence of this
result.

Corollary 14-1. For each fixed value of m the 2m + 1 functions

        r^m u_m0, r^m u_m1, . . . , r^m u_mm, r^m v_m1, . . . , r^m v_mm

are linearly independent in the space of all continuous functions defined in
the region r ≤ 1.

Indeed, were it possible to express one of these functions as a linear combination
of the others we would find by setting r = 1 that u_m0, . . . , u_mm, v_m1, . . . , v_mm
are linearly dependent in 𝒞(S). But this contradicts the known orthogonality of
these functions, and the corollary follows.
On the strength of Theorem 14-1 we are justified in introducing the formal series
expansion of a continuous function f on the unit sphere in terms of the u_mn, v_mn,
even though we cannot as yet assert that this series converges in the mean to f.
Using the notation of Section 8-3 we therefore have

        f ~ Σ_{m=0}^∞ [ (A_m0/2) u_m0 + Σ_{n=1}^m (A_mn u_mn + B_mn v_mn) ],        (14-58)

where A_m0/2, A_mn, B_mn are the generalized Fourier coefficients of f and are com-
puted according to the formulas

        A_m0/2 = (f · u_m0)/(u_m0 · u_m0),

        A_mn = (f · u_mn)/(u_mn · u_mn),    B_mn = (f · v_mn)/(v_mn · v_mn).        (14-59)

This series is known as the Laplace series expansion of f, and in the next section
we shall prove that it does in fact converge in the mean to f. But first we evaluate
the various inner products u_mn · u_mn and v_mn · v_mn appearing in (14-59). The
argument goes as follows.
From (14-53) and (14-55) we have

        u_mn · u_mn = ∫_0^{2π} cos^2 nθ dθ ∫_0^π sin^{2n+1} φ [P_m^{(n)}(cos φ)]^2 dφ,

        v_mn · v_mn = ∫_0^{2π} sin^2 nθ dθ ∫_0^π sin^{2n+1} φ [P_m^{(n)}(cos φ)]^2 dφ.
Thus if

        I_mn = ∫_{-1}^{1} (1 − x^2)^n [P_m^{(n)}(x)]^2 dx = ∫_0^π sin^{2n+1} φ [P_m^{(n)}(cos φ)]^2 dφ,

we see that

        u_m0 · u_m0 = 2πI_m0,
                                                                        (14-60)
        u_mn · u_mn = v_mn · v_mn = πI_mn,    n ≠ 0,

and it remains to find the value of I_mn for given integers m and n (n ≤ m). To
this end we observe that
        I_m0 = ∫_{-1}^{1} [P_m(x)]^2 dx = 2/(2m + 1)                        (14-61)

(see Section 11-3), and then use integration by parts to express I_{m,n+1} in terms
of I_mn by setting

        u = (1 − x^2)^{n+1} P_m^{(n+1)},    dv = P_m^{(n+1)} dx

in

        I_{m,n+1} = ∫_{-1}^{1} (1 − x^2)^{n+1} [P_m^{(n+1)}(x)]^2 dx.

This gives

        I_{m,n+1} = (1 − x^2)^{n+1} P_m^{(n+1)} P_m^{(n)} |_{-1}^{1}
                        − ∫_{-1}^{1} [(1 − x^2)P_m^{(n+2)} − 2(n + 1)xP_m^{(n+1)}](1 − x^2)^n P_m^{(n)} dx

                  = −∫_{-1}^{1} [(1 − x^2)P_m^{(n+2)} − 2(n + 1)xP_m^{(n+1)}](1 − x^2)^n P_m^{(n)} dx.

But since

        (1 − x^2)P_m^{(n+2)} − 2(n + 1)xP_m^{(n+1)} + [m(m + 1) − n(n + 1)]P_m^{(n)} = 0

(see 14-57), it follows that

        I_{m,n+1} = [m(m + 1) − n(n + 1)] ∫_{-1}^{1} (1 − x^2)^n [P_m^{(n)}(x)]^2 dx

                  = [m(m + 1) − n(n + 1)]I_mn.

Hence

        I_{m,n+1} = (m − n)(m + n + 1)I_mn,                                (14-62)

and an easy computation starting with the known value of I_m0 now yields

        I_mn = [(m + n)!/(m − n)!] · 2/(2m + 1),    0 ≤ n ≤ m                (14-63)

(see Exercise 1).
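Both the closed form (14-63) and the recurrence (14-62) behind it can be checked numerically. The sketch below is my own (the helper names are arbitrary): it evaluates I_mn by the trapezoid rule and compares it with the formula.

```python
import math

def pmn_coeffs(m, n):
    """Coefficient list (ascending powers) of P_m^{(n)}."""
    p0, p1 = [1.0], [0.0, 1.0]
    if m == 0:
        p1 = p0
    for k in range(1, m):
        xp1 = [0.0] + p1
        p0, p1 = p1, [((2 * k + 1) * xp1[i] - (k * p0[i] if i < len(p0) else 0.0)) / (k + 1)
                      for i in range(len(xp1))]
    for _ in range(n):
        p1 = [i * p1[i] for i in range(1, len(p1))] or [0.0]
    return p1

def I_quad(m, n, steps=20000):
    """I_mn = integral_{-1}^{1} (1 - x^2)^n [P_m^{(n)}(x)]^2 dx  (trapezoid rule)."""
    c = pmn_coeffs(m, n)
    h = 2.0 / steps
    total = 0.0
    for i in range(steps + 1):
        x = -1.0 + i * h
        w = 0.5 if i in (0, steps) else 1.0
        v = 0.0
        for a in reversed(c):
            v = v * x + a
        total += w * (1.0 - x * x) ** n * v * v
    return total * h

def I_formula(m, n):
    """Closed form (14-63): (m+n)!/(m-n)! * 2/(2m+1)."""
    return math.factorial(m + n) / math.factorial(m - n) * 2.0 / (2 * m + 1)
```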

Finally, if this result is substituted in (14-60) and (14-59) we obtain the formulas

        A_mn = [(2m + 1)(m − n)!]/[2π(m + n)!] ∫_0^{2π} ∫_0^π f(φ, θ) cos (nθ) sin^{n+1} φ P_m^{(n)}(cos φ) dφ dθ,

        B_mn = [(2m + 1)(m − n)!]/[2π(m + n)!] ∫_0^{2π} ∫_0^π f(φ, θ) sin (nθ) sin^{n+1} φ P_m^{(n)}(cos φ) dφ dθ,

for all m and n.

EXERCISES

1. Verify Formula (14-63).


2. Find the Laplace series expansion of the function

        f(φ, θ) = sin^2 φ cos^2 φ sin θ cos θ

   in the following two ways:
   (a) by direct application of the formulas in this section;
   (b) by writing

        f(φ, θ) = ½ cos^2 φ sin^2 φ sin 2θ,

   and expressing cos^2 φ as a sum of terms of the form

        d^2 P_k(cos φ)/d(cos φ)^2.

   [Hint: Note that

        cos^2 φ = (1/(4 · 3)) d^2(cos^4 φ)/d(cos φ)^2,

   and then write cos^4 φ as a Legendre series in cos φ.]

3. Write cos^2 φ cos 2θ as a series of spherical harmonics.

4. Prove that

cos <p sin <p sin cos

1 3
Pf (COS

6930
P^CCOS tp) + t^:
1540
*>) sin 30 sin (p

-^ Pj^cos *) - ^ Pieces *) - ^ P? (cos


}
<p) sin sin <p.

*5. (a) Expand the function

        f(φ, θ) = (1 − |cos φ|)(1 + cos 2θ)

    in a series of spherical harmonics.
    (b) Find the solution of the boundary-value problem of Section 14-5 when f(φ, θ)
    is the function in (a).

6. Solve Laplace's equation in the spherical shell with inner radius a and outer radius
   b (b > a), given that u = 0 for r = b, and u = f(φ, θ) for r = a. [Warning: Since
   the point r = 0 is not in the shell, both of the functions r^m and r^{−(m+1)} must be used.]

7. (a) Use the results of the preceding exercise to find a solution for the shell problem
   when u = 0 for r = a, and u = g(φ, θ) for r = b.
   (b) Solve the shell problem when u = f(φ, θ) for r = a, and u = g(φ, θ) for r = b.

*14-7 HARMONIC POLYNOMIALS AND THE BASIS THEOREM


To complete the discussion of the spherical harmonics we must show that these
functions generate the Euclidean space 𝒞(S). This result, however, is an easy corol-
lary of the fact that every function in 𝒞(S) can be uniformly approximated by a
finite linear combination of the u_mn, v_mn, where by this we mean that if we are
given any function f in 𝒞(S), and any real number ε > 0, there exists a finite linear
combination L(φ, θ) of the u_mn, v_mn such that

        |f(φ, θ) − L(φ, θ)| < ε

for all 0 ≤ φ ≤ π, 0 ≤ θ ≤ 2π. Indeed, the passage from such an approximation
to the assertion that the u_mn, v_mn generate 𝒞(S) is all but identical with the proof
of Theorem 9-6 and will be left to the reader. With this understanding, we turn
our attention to the following theorem.

Theorem 14-2. Any continuous function on the surface of the unit sphere
can be uniformly approximated by a finite linear combination of spherical
harmonics.

In broad outline, the proof consists of an argument to show that the spherical
harmonics can be replaced by certain polynomials in x, y, z, which are then used

to construct the desired approximation. The basic properties of the polynomials


in question are established in the following lemmas.

Lemma 14-1. For each non-negative integer m the functions r^m u_mn and
r^m v_mn are homogeneous polynomials of degree m when expressed in rectangu-
lar coordinates.*

Proof. We begin by rewriting u_mn and v_mn as polynomials in cos θ, sin θ, cos φ,
sin φ, as follows. From the formula

        cos nθ + i sin nθ = (cos θ + i sin θ)^n

we see that cos nθ is the real part, and sin nθ the imaginary part of the expansion
of (cos θ + i sin θ)^n. But, by the binomial theorem,

        (cos θ + i sin θ)^n = cos^n θ + a_1 cos^{n−1} θ (i sin θ)
                + a_2 cos^{n−2} θ (i sin θ)^2 + · · · + (i sin θ)^n,

where a_1, a_2, . . . are constants whose specific values we can ignore. Since
i^2 = −1, the real part of this expression is a sum of terms of the form

        c_k cos^{n−k} θ sin^k θ,                                        (14-64)

where c_k is a constant, and k an even integer. Similarly, the imaginary part of
(cos θ + i sin θ)^n is a sum of terms of the same form with k odd.
Next, we observe that the nth derivative of the Legendre polynomial P_m(x)
is a polynomial of degree m − n, composed of terms of the form d_j x^{m−n−2j}, d_j
a constant, j a non-negative integer. (Recall that the degree of every term in P_m(x)
is either even or odd according as m is even or odd.) Hence P_m^{(n)}(cos φ) is a poly-
nomial in cos φ, each of whose terms is of the form

        d_j cos^{m−n−2j} φ.                                                (14-65)

Combined with (14-64), this implies that each of the spherical harmonics u_mn,
v_mn can be decomposed into a sum with individual entries

        a_jk cos^{n−k} θ sin^k θ sin^n φ cos^{m−n−2j} φ

* A polynomial p(x, y, z) is homogeneous of degree m if and only if it can be written
in the form

        p(x, y, z) = Σ_{i+j+k=m} c_ijk x^i y^j z^k,

where the c_ijk are real numbers. Thus x^2 − 2y^2 + xz and 2xy − 3yz + z^2 are homo-
geneous of degree two, while xy^2 − 4xyz − 3z^3 is homogeneous of degree three. The
reader should note that according to this definition the zero polynomial is homogeneous
of every degree.

for suitable constants a_jk and integers j and k. Thus the functions r^m u_mn and
r^m v_mn have a similar decomposition, with a typical term appearing as

        a_jk r^m cos^{n−k} θ sin^k θ sin^n φ cos^{m−n−2j} φ
                = a_jk r^{2j} [r cos θ sin φ]^{n−k} [r sin θ sin φ]^k [r cos φ]^{m−n−2j}.
                                                                        (14-66)
To complete the proof we recall that

        x = r cos θ sin φ,    y = r sin θ sin φ,    z = r cos φ.

Thus when converted to rectangular coordinates, (14-66) becomes

        a_jk (x^2 + y^2 + z^2)^j x^{n−k} y^k z^{m−n−2j},

and is obviously a polynomial in x, y, z, each of whose terms is of degree m. |

For our purposes this lemma is important because it allows us to assert that the
equation

        ∂^2u/∂x^2 + ∂^2u/∂y^2 + ∂^2u/∂z^2 = 0                                (14-67)

has polynomial solutions which are homogeneous of degree m for each integer
m ≥ 0. Any such solution of Laplace's equation is called a homogeneous har-
monic polynomial of degree m, and the linearity of (14-67) implies that the set ℋ_m
of all homogeneous harmonic polynomials of degree m is a real vector space.
Moreover, it follows from Corollary 14-1 that the 2m + 1 functions

        r^m u_m0, r^m u_m1, . . . , r^m u_mm, r^m v_m1, . . . , r^m v_mm        (14-68)

are linearly independent in ℋ_m when expressed in rectangular coordinates. We
now propose to show that they are actually a basis for ℋ_m by proving

Lemma 14-2. For each integer m, the vector space ℋ_m is of dimension
2m + 1.
Proof. This is clearly true if m < 2. For m ≥ 2 let

        u(x, y, z) = Σ_{i+j+k=m} c_ijk x^i y^j z^k                        (14-69)

be an arbitrary homogeneous polynomial of degree m. Then

        ∂^2u/∂x^2 + ∂^2u/∂y^2 + ∂^2u/∂z^2

is homogeneous of degree m − 2, and hence is a sum of terms of the form
d_αβγ x^α y^β z^γ, where d_αβγ is a constant and α + β + γ = m − 2.

Moreover, if for fixed values of α, β, γ we add the coefficients of x^α y^β z^γ appearing
in the derivatives ∂^2u/∂x^2, ∂^2u/∂y^2, ∂^2u/∂z^2, we find that

        d_αβγ = (α + 2)(α + 1)c_{α+2,β,γ} + (β + 2)(β + 1)c_{α,β+2,γ}
                        + (γ + 2)(γ + 1)c_{α,β,γ+2}                        (14-70)

(see Exercise 1).

Now suppose that u(x, y, z) is harmonic. Then each of the d_αβγ appearing as
coefficients in the polynomial

        ∂^2u/∂x^2 + ∂^2u/∂y^2 + ∂^2u/∂z^2

must be zero, and it follows that

        (γ + 2)(γ + 1)c_{α,β,γ+2} = −(α + 2)(α + 1)c_{α+2,β,γ}
                        − (β + 2)(β + 1)c_{α,β+2,γ}.                        (14-71)

In other words, a homogeneous polynomial of degree m in x, y, z is harmonic if
and only if its coefficients satisfy (14-71) with α + β + γ = m − 2.
To interpret this result in terms of the vector space ℋ_m, we introduce the fol-
lowing triangular array formed from the coefficients of (14-70):

If this array is examined closely, it soon becomes apparent that the entries along
the two enclosed diagonals on the upper right determine, via (14-71), all of the
other coefficients in the triangle. Indeed, the coefficients along the uppermost
diagonal determine those along the third, which in turn determine those along
the fifth, etc., while the coefficients along the second diagonal determine, succes-
sively, those along the fourth, sixth, and so forth. Thus two harmonic polynomials
in ℋ_m are identical if and only if the 2m + 1 coefficients appearing along the upper
diagonals of their coefficient triangles are identical.


To conclude the proof of the lemma we now let Q_k, 0 ≤ k ≤ m, denote the
polynomial in ℋ_m determined by setting c_{m−k,k,0} = 1 and all other "special"
coefficients zero, and let R_k, 1 ≤ k ≤ m, denote the poly-
nomial in ℋ_m obtained by setting c_{m−k,k−1,1} = 1 and all other special coefficients
zero. (The coefficient triangles for these polynomials are those which have exactly
one nonzero entry on the two uppermost diagonals, that entry being 1.) The
above result implies that the 2m + 1 polynomials Q_0, Q_1, . . . , Q_m, R_1, . . . , R_m
are linearly independent and span ℋ_m, and we are done. |


,

Example 1. The most general homogeneous polynomial of degree 3 in x, y,
and z is a sum of ten terms

        p(x, y, z) = Σ_{i+j+k=3} c_ijk x^i y^j z^k,

and has the following array as coefficient triangle:

In this case an easy computation using (14-71) shows that the particular homoge-
neous harmonic polynomials constructed above are

        Q_0 = x^3 − 3xz^2,        R_1 = x^2 z − (1/3)z^3,

        Q_1 = x^2 y − yz^2,        R_2 = xyz,

        Q_2 = −xz^2 + xy^2,        R_3 = −(1/3)z^3 + y^2 z,

        Q_3 = −3yz^2 + y^3,

and the preceding lemma implies that these polynomials are a basis for the space
ℋ_3. Using them we can easily construct all homogeneous harmonic polynomials
of degree 3 in x, y, z.
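The harmonicity of these seven polynomials is easy to verify by machine. In the sketch below (my own illustration; the dictionary representation is arbitrary) a polynomial is a map from exponent triples (i, j, k) to coefficients, `laplacian` applies (14-67) term by term, and exact `Fraction` coefficients keep the check free of rounding.

```python
from fractions import Fraction

def laplacian(p):
    """Laplacian of a polynomial {(i, j, k): coeff} in x, y, z; zero terms are dropped."""
    out = {}
    for (i, j, k), c in p.items():
        exps = (i, j, k)
        for axis in range(3):
            e = exps[axis]
            if e >= 2:   # d^2/dx^2 of x^e is e(e-1)x^{e-2}, and similarly for y, z
                key = tuple(e2 - 2 if t == axis else e2 for t, e2 in enumerate(exps))
                out[key] = out.get(key, 0) + e * (e - 1) * c
    return {k: v for k, v in out.items() if v != 0}

third = Fraction(1, 3)
# the basis of H_3 found in Example 1
Q0 = {(3, 0, 0): 1, (1, 0, 2): -3}        # x^3 - 3xz^2
Q1 = {(2, 1, 0): 1, (0, 1, 2): -1}        # x^2 y - yz^2
Q2 = {(1, 2, 0): 1, (1, 0, 2): -1}        # xy^2 - xz^2
Q3 = {(0, 3, 0): 1, (0, 1, 2): -3}        # y^3 - 3yz^2
R1 = {(2, 0, 1): 1, (0, 0, 3): -third}    # x^2 z - z^3/3
R2 = {(1, 1, 1): 1}                       # xyz
R3 = {(0, 2, 1): 1, (0, 0, 3): -third}    # y^2 z - z^3/3
basis = [Q0, Q1, Q2, Q3, R1, R2, R3]
```

Each basis element has an identically zero Laplacian, and there are 2 · 3 + 1 = 7 of them, as Lemma 14-2 requires.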

With these results in hand we are now in a position to prove that every con-
tinuous function f(φ, θ) defined on the surface of the unit sphere can be uniformly
approximated by a linear combination of spherical harmonics (Theorem 14-2).
Once again we use the Weierstrass approximation theorem, which in this context
asserts that every continuous function on the closed unit sphere r ≤ 1 can be
uniformly approximated by polynomials.* Thus, if ε > 0 is given there exists a
polynomial p(x, y, z) such that

        |r f(φ, θ) − p(x, y, z)| < ε

throughout the solid sphere. In particular, we note that this approximation is
also valid on the surface S of the unit sphere where r = (x^2 + y^2 + z^2)^{1/2} = 1.

Now suppose p(x, y, z) is of degree m. Then if we restrict our attention to the
surface S we can use the identity z^2 = 1 − (x^2 + y^2) to write p(x, y, z) as a
linear combination of the monomials

        x^i y^j z^k,    k ≤ 1,    0 ≤ i + j + k ≤ m.                        (14-72)

Thus, on S, p(x, y, z) may be viewed as a member of the vector space 𝒱 consisting
of all polynomials in x, y, z which are linear combinations of the above monomials.
But these monomials are certainly linearly independent in 𝒱, and therefore are a
basis for 𝒱. Moreover, an easy counting argument shows that for each fixed value
of α, 0 ≤ α ≤ m, there are exactly 2α + 1 monomials of type (14-72) with
i + j + k = α. Hence, since

        Σ_{α=0}^m (2α + 1) = (m + 1)^2,

we have dim 𝒱 = (m + 1)^2.

On the other hand, for each fixed value of α, the 2α + 1 functions

        r^α u_α0, . . . , r^α u_αα, r^α v_α1, . . . , r^α v_αα                (14-73)

are homogeneous polynomials of degree α when expressed in rectangular co-
ordinates (Lemma 14-1). In addition, when restricted to S these polynomials
belong to the vector space 𝒱, and are linearly independent in 𝒱. Thus, if we let
α run from 0 to m we obtain in this way (m + 1)^2 linearly independent poly-
nomials in 𝒱, and therefore have a basis for 𝒱. It now follows that on the surface
of the sphere p(x, y, z) may be written as a linear combination of these polynomials.
And with this, Theorem 14-2 is proved, since when r = 1 the various functions in
(14-73) are none other than the spherical harmonics of order α.

* More generally, the three-dimensional version of the Weierstrass approximation


theorem states that every continuous function in a closed bounded region of three-space
can be uniformly approximated by polynomials.

EXERCISES

1. Verify Formula (14-70).

2. (a) Use the technique of Example 1 to find a basis for the space ℋ_2 of homogeneous
   harmonic polynomials of degree 2.
   (b) Determine which of the following polynomials belong to the space ℋ_2:

        (x − z)(x + y + z),    x^2 + y^2,    2x^2 + y^2 − xy + 2z^2,

        x(y − 2z) + y(x − 2z) + z(x − y),    y(y − 2x) − x(x + z).

3. Verify that the homogeneous polynomial

$$x^3 - xz^2 - 2xy^2$$

is harmonic, and express it as a linear combination of the basis for $\mathcal{H}_3$ found in Example 1.

4. Let $h_2 = xyz$ (see Example 1). Prove that

$$h_2 = \tfrac{1}{30}\, r^3 v_{3,2}.$$
5. Express each of the functions

$$r^3 u_{3,n} \ (n = 0, 1, 2, 3), \qquad r^3 v_{3,n} \ (n = 1, 2, 3)$$

in terms of $x, y, z$.

6. Let $V_n$ denote the vector space of all harmonic polynomials of degree $n$ ($n \ge 2$) in $x$ and $y$, and let each of the polynomials in $V_n$ be written in standard form as

$$P(x, y) = \sum_{k=0}^{n} c_{n-k,k}\, x^{n-k} y^k.$$

(a) Prove that $\dim V_n = 2$, and that the real and imaginary parts of $(x + iy)^n$ form a basis for $V_n$.

(b) Derive a formula for the coefficients of the polynomials in $V_n$ analogous to (14-70), and use it to prove that for each non-negative integer $j$ with $k + 2j \le n$,

$$c_{n-(k+2j),\,k+2j} = (-1)^j\, \frac{k!\,(n - k)!}{(k + 2j)!\,(n - k - 2j)!}\, c_{n-k,k}.$$

7. Use the result of Exercise 6(b) to find the value of each coefficient in $P(x, y)$ in terms of the coefficient $c_{n,0}$ or $c_{n-1,1}$.

8. A nonzero homogeneous harmonic polynomial in x and y is divisible by xy. Prove


that the degree of the polynomial is even. (See Exercises 6 and 7.)

9. (a) Find the homogeneous harmonic polynomial of degree 6 in $x$ and $y$ which contains the term $x^6$ but no term in $x^5 y$.

(b) Repeat (a) for the polynomial which contains the term $x^5 y$ but no term in $x^6$.

(c) Find the homogeneous harmonic polynomial of degree 6 in $x$ and $y$ which contains the terms $x^6 + 6x^5 y$.
580 BOUNDARY-VALUE PROBLEMS FOR LAPLACE'S EQUATION | CHAP. 14

10. Prove that the number of terms with nonzero coefficients in a homogeneous harmonic polynomial in $x$ and $y$ of degree $n \ge 2$ is $n/2$, $(n/2) + 1$, or $n + 1$ if $n$ is even, or $(n + 1)/2$ or $n + 1$ if $n$ is odd. (See Exercises 6 and 7.)

11. Use the technique introduced in this section to find a basis for the space $\mathcal{H}_4$.

12. (a) Prove that if $f(x, y, z)$ is a harmonic function in a sphere (or in all of three-space), then so is the function $g(x, y, z) = f(-x, y, z)$. Use this result to show that for fixed values of any two variables, the even and odd parts of $f(x, y, z)$ with respect to the third variable are harmonic whenever $f$ is harmonic.

(b) Use the above result to prove that whenever a polynomial in $\mathcal{H}_2$ is even (or odd) with respect to a particular variable, it is a linear combination of those members of the standard basis for $\mathcal{H}_2$ which are even (or odd) with respect to that variable.

(c) Show that a polynomial in $\mathcal{H}_2$ cannot be odd with respect to each of the three variables.

13. (a) Express each of the functions

$$r^2 u_{2,n} \ (n = 0, 1, 2), \qquad r^2 v_{2,n} \ (n = 1, 2)$$

in terms of the basis for $\mathcal{H}_2$ found in Exercise 2(a). [Hint: Use the properties noted in Exercise 12.]

(b) Use the result in (a) to express each of the basis polynomials found in Exercise 2(a) in terms of the basis consisting of $r^2 u_{2,n}$ ($n = 0, 1, 2$) and $r^2 v_{2,n}$ ($n = 1, 2$).

14. Let $P(x, y, z)$ be an arbitrary polynomial in $\mathcal{H}_3$. Prove that $\partial P/\partial x$ belongs to $\mathcal{H}_2$, and express this polynomial as a linear combination of the basis for $\mathcal{H}_2$ found in Exercise 2(a).

*15. Prove that the number of nonzero terms in $P_m(x_1, x_2, \ldots, x_k)$, the general homogeneous polynomial of degree $m \ge 1$ in $k \ge 1$ variables, is

$$\binom{m + k - 1}{k - 1}.$$

[Hint: For $k > 1$ write

$$P_m(x_1, x_2, \ldots, x_k) = \sum_{j=0}^{m} x_k^j\, Q_{m-j}(x_1, x_2, \ldots, x_{k-1}),$$

where each $Q_{m-j}$ is a homogeneous polynomial of degree $m - j$ in the variables $x_1, \ldots, x_{k-1}$. Then use induction on $k$, and the identity

$$\sum_{j=0}^{m} \binom{m - j + k - 2}{k - 2} = \binom{m + k - 1}{k - 1}.\Bigr]$$


*16. (a) Let $m$ be a positive integer, and set

$$P_m(x_1, x_2, \ldots, x_k) = \sum c(m_1, m_2, \ldots, m_k)\, x_1^{m_1} x_2^{m_2} \cdots x_k^{m_k},$$

where the $c(m_1, m_2, \ldots, m_k)$ are constants, and the summation extends over all non-negative integral values of $m_1, m_2, \ldots, m_k$ for which $m_1 + m_2 + \cdots + m_k = m$. Prove that $P_m$ is harmonic (i.e., is a solution of the partial differential equation $\sum_{i=1}^{k} \partial^2 u/\partial x_i^2 = 0$) if and only if

$$c(m_1, \ldots, m_{k-1}, m_k + 2) = -\sum_{i=1}^{k-1} \frac{(m_i + 1)(m_i + 2)}{(m_k + 1)(m_k + 2)}\, c(m_1, \ldots, m_{i-1}, m_i + 2, m_{i+1}, \ldots, m_k).$$

(b) Let $Q_m(x_1, \ldots, x_{k-1})$ be the polynomial consisting of those terms of $P_m(x_1, \ldots, x_k)$ in which the variable $x_k$ is absent, let $x_k Q_{m-1}(x_1, \ldots, x_{k-1})$ be the polynomial in which $x_k$ appears to the first degree, and suppose that $P_m$ is harmonic. Prove that the coefficients of $P_m$ are uniquely determined by the coefficients of $Q_m$ and $Q_{m-1}$.

*17. Let $X$ denote the set of all homogeneous harmonic polynomials of degree $m \ge 1$ in $k \ge 2$ variables $x_1, \ldots, x_k$ with the property that each polynomial in $X$ contains exactly one term (and that with coefficient 1) for which the exponent of $x_k$ is less than 2. Prove that $X$ is a basis for the space $\mathcal{H}_{m,k}$ of all homogeneous harmonic polynomials of degree $m$ in $k$ variables, and determine the dimension of this space.

*18. Can the following assertion be deduced from the results of this section?

Any harmonic function in the closed unit sphere $r \le 1$ can be uniformly approximated by harmonic polynomials.

Why?
19. Use Theorem 14-2 to prove that the spherical harmonics are a basis for the Euclidean space $\mathcal{C}(S)$.
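Several of these exercises (2, 3, and 11 in particular) reduce to checking that the Laplacian of a given polynomial vanishes identically. That check is easy to mechanize; the following Python sketch is our own illustration (the helper names `laplacian` and `is_harmonic` are ours), with a polynomial stored as a dictionary mapping exponent triples to coefficients:

```python
# Illustration (ours): a polynomial in x, y, z is stored as a dictionary
# mapping exponent triples (i, j, k) to coefficients; is_harmonic tests
# whether its Laplacian vanishes identically.

def laplacian(p):
    out = {}
    for (i, j, k), c in p.items():
        for axis, e in enumerate((i, j, k)):
            if e >= 2:                 # d^2/dt^2 of t^e is e(e-1) t^(e-2)
                key = list((i, j, k))
                key[axis] -= 2
                key = tuple(key)
                out[key] = out.get(key, 0) + c * e * (e - 1)
    return {k: v for k, v in out.items() if v != 0}

def is_harmonic(p):
    return laplacian(p) == {}

# xyz (Exercise 4) and x^3 - 2xy^2 - xz^2 (Exercise 3) are harmonic;
# x^2 + y^2 (Exercise 2b) is not.
assert is_harmonic({(1, 1, 1): 1})
assert is_harmonic({(3, 0, 0): 1, (1, 2, 0): -2, (1, 0, 2): -1})
assert not is_harmonic({(2, 0, 0): 1, (0, 2, 0): 1})
```
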
15

boundary-value problems
involving bessel functions*

15-1 INTRODUCTION

After the succession of examples in the preceding chapters the student has probably come to the conclusion that the techniques we have developed are adequate to solve any boundary-value problem involving the wave equation, heat equation, or Laplace's equation, at least when the underlying region and boundary conditions are reasonably simple. This, however, is false, as can be seen by considering Laplace's equation in cylindrical coordinates $(r, \theta, z)$, where $x = r\cos\theta$, $y = r\sin\theta$, $z = z$. For then

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2} = 0$$

becomes

$$\frac{\partial^2 u}{\partial r^2} + \frac{1}{r}\frac{\partial u}{\partial r} + \frac{1}{r^2}\frac{\partial^2 u}{\partial \theta^2} + \frac{\partial^2 u}{\partial z^2} = 0, \tag{15-1}$$

and if we attempt to find solutions of (15-1) which are independent of $\theta$ and have the form $u(r, z) = R(r)Z(z)$, we see that $R$ and $Z$ must satisfy the equations

$$Z'' - \lambda^2 Z = 0 \tag{15-2}$$

and

$$r^2 R'' + rR' + \lambda^2 r^2 R = 0, \qquad r > 0. \tag{15-3}$$

(The motive behind writing the constant as $-\lambda^2$ will become clear later.)

Although the first of these equations can be handled with ease, the second causes trouble because its leading coefficient vanishes at $r = 0$. Here the imaginative student might suggest using the power series method to solve (15-3) about $r = a$, $a > 0$, since its coefficients are analytic whenever $r \ne 0$. Unfortunately, this will not do, for the simple reason that the resulting series need not converge outside the interval $(0, 2a)$, whereas we seek solutions valid for all $r > 0$. (See Theorem 6-4.) Clearly, then, if we are to make any progress in solving this problem, or others of the same ilk, we must devise a method for studying the solutions of a linear differential equation near points where its leading coefficient vanishes. This is precisely what will be done in the sections which follow, where we introduce the celebrated method of Frobenius, generalizing the power series technique to a large number of non-normal linear differential equations. Once this has been done, we shall use the technique in question to study the solutions of Bessel's equation and the last important class of elementary boundary-value problems.

* The following discussion assumes a knowledge of the material in Chapter 6.

15-2 | REGULAR SINGULAR POINTS 583

EXERCISES

1. Use the coordinate transformations given above to show that Laplace's equation becomes

$$\frac{\partial^2 u}{\partial r^2} + \frac{1}{r}\frac{\partial u}{\partial r} + \frac{1}{r^2}\frac{\partial^2 u}{\partial \theta^2} + \frac{\partial^2 u}{\partial z^2} = 0$$

in cylindrical coordinates.

2. Assume that Laplace's equation in cylindrical coordinates has a solution of the form $u(r, z) = R(r)Z(z)$, and show that $R$ and $Z$ must satisfy Eqs. (15-2) and (15-3).

3. Show that the equation

$$r^2 R'' + rR' + \lambda^2 r^2 R = 0$$

can be transformed into Bessel's equation of order zero by an appropriate change of variable. (See Section 6-2.)

15-2 REGULAR SINGULAR POINTS

Let

$$p(x)y'' + q(x)y' + r(x)y = 0 \tag{15-4}$$

be a second-order homogeneous linear differential equation whose coefficients are analytic in an open interval $I$ of the $x$-axis, and suppose that $p(x_0) = 0$ for some $x_0$ in $I$. (Such a point is said to be a singular point for the equation.) Then the assumed analyticity of the coefficients of (15-4) implies that the function $p$ has a power series expansion

$$p(x) = \sum_{k=0}^{\infty} a_k (x - x_0)^k,$$

valid in some interval about the point $x_0$. Moreover, since $p(x_0) = 0$, the leading coefficient in this series must vanish, and there exists a positive integer $m$ such that

$$p(x) = \sum_{k=m}^{\infty} a_k (x - x_0)^k = (x - x_0)^m \sum_{k=m}^{\infty} a_k (x - x_0)^{k-m} = (x - x_0)^m p_1(x),$$

584 BOUNDARY-VALUE PROBLEMS INVOLVING BESSEL FUNCTIONS | CHAP. 15

where $p_1$ is analytic at $x_0$ and $p_1(x_0) \ne 0$. We now divide the coefficients of (15-4) by $p_1$ to obtain an equation of the form

$$(x - x_0)^m y'' + q_1(x)y' + r_1(x)y = 0, \tag{15-5}$$

whose coefficients are still analytic at $x_0$, and which is equivalent to (15-4) in an interval about that point. Thus, when studying the solutions of a second-order homogeneous linear differential equation about a singular point $x_0$ we can always write the equation in this special form.

It turns out that the behavior of the solutions of (15-5) near $x_0$ is strongly dependent upon the exponent $m$ and the value of $q_1$ at $x_0$, and leads to a classification of singular points as "regular" and "irregular" according to the following definition.

Definition 15-1. A point $x_0$ is said to be a regular singular point for a second-order homogeneous linear differential equation if and only if the equation can be written in the form

$$(x - x_0)^2 y'' + (x - x_0)q(x)y' + r(x)y = 0, \tag{15-6}$$

where $q$ and $r$ are analytic at $x_0$. All other singular points are said to be irregular.

Thus, among the singular points for the equation

$$x^2(x + 1)^3(x - 1)y'' + xy' - 2y = 0, \tag{15-7}$$

those at $0$ and $1$ are regular, since (15-7) may be rewritten

$$x^2 y'' + x\,\frac{1}{(x + 1)^3(x - 1)}\, y' - \frac{2}{(x + 1)^3(x - 1)}\, y = 0$$

and

$$(x - 1)^2 y'' + (x - 1)\,\frac{1}{x(x + 1)^3}\, y' - \frac{2(x - 1)}{x^2(x + 1)^3}\, y = 0$$

about $x = 0$ and $x = 1$, respectively. On the other hand, $-1$ is an irregular singular point for (15-7), because the coefficients of $y'$ and $y$ in

$$(x + 1)^2 y'' + (x + 1)\,\frac{1}{x(x + 1)^2(x - 1)}\, y' - \frac{2}{x^2(x + 1)(x - 1)}\, y = 0$$

are undefined at $x = -1$.
As the terminology suggests, regular singular points are relatively easy to handle. In fact, with appropriate modification, the method of undetermined coefficients introduced in Chapter 6 can be used to obtain a basis for the solution space of any second-order linear differential equation about a regular singular point. Not so when $x_0$ is an irregular singular point. Here the situation is far more complicated, and it is not difficult to exhibit equations which fail to have series solutions of any form about such a point (see Exercise 11 below). Fortunately, all of the boundary-value problems we shall encounter involve equations whose only singularities are regular, and hence we shall have no more to say about irregular singular points.

Finally, for simplicity, we shall limit the following discussion to singularities at the origin, in which case (15-6) becomes

$$x^2 y'' + xq(x)y' + r(x)y = 0.$$

The reader should appreciate that this involves no loss of generality, since the change of variable $u = x - x_0$ shifts a singularity from $x_0$ to $0$.

EXERCISES

Find and classify all of the singular points for the equations in Exercises 1 through 10.

1. $x^3(x^2 - 1)y'' - x(x + 1)y' - (x - 1)y = 0$
2. $(3x - 2)^2 xy'' + xy' - y = 0$
3. $(x^4 - 1)y'' + xy' = 0$
4. $(x + 1)^4(x - 1)^2 y'' - (x + 1)^3(x - 1)y' + y = 0$
5. $x^3(x - 1)y'' + (x - 1)y' + 2xy = 0$
6. $x^3(x + 1)^2 y'' - y = 0$
7. $x(1 - x)y'' + (1 - 5x)y' - 4y = 0$
8. Legendre's equation:

$$(1 - x^2)y'' - 2xy' + \lambda(\lambda + 1)y = 0$$

9. Bessel's equation:

$$x^2 y'' + xy' + (x^2 - \lambda^2)y = 0$$

10. Laguerre's equation:

$$xy'' + (1 - x)y' + \lambda y = 0$$

11. Prove that the equation

$$x^3 y'' + y = 0$$

has no nontrivial series solution of the form

$$\sum_{k=0}^{\infty} a_k x^{k+\nu}$$

for any real number $\nu$.



15-3 EXAMPLES OF SOLUTIONS ABOUT A REGULAR SINGULAR POINT

The simplest example of a second-order equation with a regular singular point at the origin is the Euler equation

$$x^2 y'' + axy' + by = 0,$$

$a$ and $b$ (real) constants, studied in Chapter 4. At that time we proved that the solution space of this equation is spanned by a pair of functions $y_1$ and $y_2$ constructed from the roots $\nu_1, \nu_2$ of the equation

$$\nu(\nu - 1) + a\nu + b = 0,$$

as follows. If $\nu_1 \ne \nu_2$,

$$y_1(x) = |x|^{\nu_1}, \qquad y_2(x) = |x|^{\nu_2};$$

if $\nu_1 = \nu_2 = \nu$,

$$y_1(x) = |x|^{\nu}, \qquad y_2(x) = |x|^{\nu} \ln|x|.*$$

Save for the fact that $y_1$ and $y_2$ here appear in closed form, these expressions are typical of the solutions of second-order homogeneous linear differential equations with a regular singular point at the origin. Indeed, we shall find that about $x = 0$ the solution space of such an equation is always spanned by a pair of functions which depend upon the roots of a polynomial equation of degree two, that these functions involve powers of $|x|$, and under certain circumstances a logarithmic term as well. The following example will illustrate these remarks, and introduce the technique which is used to handle the general case.

Example. Find the general solution of

$$x^2 y'' + x\left(x - \tfrac{1}{2}\right)y' + \tfrac{1}{2}y = 0 \tag{15-8}$$

on each of the intervals $(0, \infty)$ and $(-\infty, 0)$.

We begin by considering the interval $x > 0$, where we seek a solution of the form

$$y(x) = x^{\nu} \sum_{k=0}^{\infty} a_k x^k = \sum_{k=0}^{\infty} a_k x^{k+\nu}, \tag{15-9}$$

with $a_0 \ne 0$, and $\nu$ arbitrary. (This particular guess as to the form of $y$ is motivated by the results obtained for the Euler equation, and from that point of view is not

* Recall that when $z$ is complex, $|x|^z = e^{z \ln|x|}$. Hence if $\nu_1 = \alpha + \beta i$, $\nu_2 = \alpha - \beta i$, then

$$y_1(x) = |x|^{\alpha} \sin(\beta \ln|x|), \qquad y_2(x) = |x|^{\alpha} \cos(\beta \ln|x|).$$
15-3 | EXAMPLES OF SOLUTIONS ABOUT A REGULAR SINGULAR POINT 587

unreasonable.) We then have

$$y'(x) = \sum_{k=0}^{\infty} (k + \nu)a_k x^{k+\nu-1}, \qquad y''(x) = \sum_{k=0}^{\infty} (k + \nu)(k + \nu - 1)a_k x^{k+\nu-2},$$

and substitution in (15-8) yields

$$\sum_{k=0}^{\infty} (k + \nu)(k + \nu - 1)a_k x^{k+\nu} + \sum_{k=0}^{\infty} (k + \nu)a_k x^{k+\nu+1} - \frac{1}{2}\sum_{k=0}^{\infty} (k + \nu)a_k x^{k+\nu} + \frac{1}{2}\sum_{k=0}^{\infty} a_k x^{k+\nu} = 0.$$

But since

$$\sum_{k=0}^{\infty} (k + \nu)a_k x^{k+\nu+1} = \sum_{k=1}^{\infty} (k + \nu - 1)a_{k-1} x^{k+\nu},$$

the above expression may be rewritten

$$\left[\nu(\nu - 1) - \tfrac{1}{2}\nu + \tfrac{1}{2}\right]a_0 x^{\nu} + \sum_{k=1}^{\infty} \left\{\left[(k + \nu)(k + \nu - 1) - \tfrac{1}{2}(k + \nu) + \tfrac{1}{2}\right]a_k + (k + \nu - 1)a_{k-1}\right\} x^{k+\nu} = 0.$$

Thus if (15-9) is to be a solution of the given equation we must have

$$\nu(\nu - 1) - \tfrac{1}{2}\nu + \tfrac{1}{2} = 0, \tag{15-10}$$

and

$$\left[(k + \nu)(k + \nu - 1) - \tfrac{1}{2}(k + \nu) + \tfrac{1}{2}\right]a_k + (k + \nu - 1)a_{k-1} = 0, \qquad k \ge 1. \tag{15-11}$$

(Recall that we assumed $a_0 \ne 0$.)

Equation (15-10) determines the admissible values of $\nu$ for this problem as $\tfrac{1}{2}$ and $1$, and is known as the indicial equation associated with (15-8). We now set

$$I(\nu) = \nu(\nu - 1) - \tfrac{1}{2}\nu + \tfrac{1}{2},$$

and rewrite (15-10) and (15-11) as

$$I(\nu) = 0, \tag{15-12}$$

and

$$I(k + \nu)a_k + (k + \nu - 1)a_{k-1} = 0, \qquad k \ge 1. \tag{15-13}$$
In particular, when $\nu = \tfrac{1}{2}$, $I(k + \nu) = k\left(k - \tfrac{1}{2}\right)$, and (15-13) becomes

$$a_k = -\frac{1}{k}\, a_{k-1}.$$

Thus

$$a_1 = -a_0, \qquad a_2 = \frac{a_0}{2!}, \qquad a_3 = -\frac{a_0}{3!}, \qquad \ldots, \qquad a_k = (-1)^k\, \frac{a_0}{k!}.$$

Similarly, when $\nu = 1$,

$$a_k = -\frac{2}{2k + 1}\, a_{k-1},$$

and

$$a_1 = -\tfrac{2}{3}a_0, \qquad a_2 = \frac{2^2}{5 \cdot 3}\,a_0, \qquad a_3 = -\frac{2^3}{7 \cdot 5 \cdot 3}\,a_0, \qquad \ldots, \qquad a_k = (-1)^k\, \frac{2^k}{(2k + 1)(2k - 1)\cdots 5 \cdot 3}\, a_0.$$

We now set $a_0 = 1$ and substitute the above values in (15-9) to obtain the two series

$$y_1(x) = x^{1/2} \sum_{k=0}^{\infty} (-1)^k\, \frac{x^k}{k!} \tag{15-14}$$

and

$$y_2(x) = x \sum_{k=0}^{\infty} (-1)^k\, \frac{(2x)^k}{(2k + 1)(2k - 1)\cdots 5 \cdot 3}, \tag{15-15}$$

each of which formally satisfies (15-8) on the interval $(0, \infty)$.
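The computation just carried out can be cross-checked numerically. The following Python sketch (our own, not from the text) runs the recurrence (15-13) for both exponents with exact rational arithmetic and compares the resulting coefficients with the closed forms derived above:

```python
# Cross-check (ours) of the recurrence computation for equation (15-8).
from fractions import Fraction
import math

def I(nu):
    # indicial polynomial of (15-8): I(nu) = nu(nu - 1) - nu/2 + 1/2
    return nu * (nu - 1) - Fraction(1, 2) * nu + Fraction(1, 2)

def coeffs(nu, n):
    """a_0, ..., a_n from I(k + nu) a_k + (k + nu - 1) a_{k-1} = 0, a_0 = 1."""
    a = [Fraction(1)]
    for k in range(1, n + 1):
        a.append(-(k + nu - 1) * a[-1] / I(k + nu))
    return a

# nu = 1/2 reproduces a_k = (-1)^k / k!, i.e. the series (15-14)
a = coeffs(Fraction(1, 2), 6)
assert all(a[k] == Fraction((-1) ** k, math.factorial(k)) for k in range(7))

# nu = 1 reproduces a_1 = -2/3, a_2 = 4/15, a_3 = -8/105, i.e. the series (15-15)
b = coeffs(Fraction(1), 3)
assert b[1] == Fraction(-2, 3) and b[2] == Fraction(4, 15) and b[3] == Fraction(-8, 105)
```
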


Thus to complete the argument it remains to show that these series converge for all $x > 0$, and that the functions they define are linearly independent in $\mathcal{C}(0, \infty)$. Leaving the latter point as an exercise, we establish convergence by using the ratio test with

$$\rho_1 = \left|\frac{x^{k+3/2}}{(k + 1)!}\right| \div \left|\frac{x^{k+1/2}}{k!}\right| = \frac{x}{k + 1}$$

for the first series, and

$$\rho_2 = \left|\frac{2^{k+1} x^{k+2}}{(2k + 3)\cdots 5 \cdot 3}\right| \div \left|\frac{2^k x^{k+1}}{(2k + 1)\cdots 5 \cdot 3}\right| = \frac{2|x|}{2k + 3}$$

for the second. Then

$$\lim_{k \to \infty} \rho_1 = \lim_{k \to \infty} \rho_2 = 0$$

for all $x$, and the series do converge as desired. Thus the general solution of (15-8) on $(0, \infty)$ is

$$y(x) = c_1 y_1(x) + c_2 y_2(x),$$

where $c_1$ and $c_2$ are arbitrary constants.

Finally, to remove the restriction on the interval we observe that the above argument goes through without change if $x^{\nu}$ is replaced by $|x|^{\nu}$ in (15-9); i.e., by $(-x)^{\nu}$ for $x < 0$. Thus the general solution of (15-8) on any interval not containing the origin is

$$y(x) = c_1 |x|^{1/2} \sum_{k=0}^{\infty} (-1)^k\, \frac{x^k}{k!} + c_2 |x| \sum_{k=0}^{\infty} (-1)^k\, \frac{(2x)^k}{(2k + 1)(2k - 1)\cdots 5 \cdot 3}.$$

EXERCISES

1. Prove that the functions $y_1$ and $y_2$ defined by (15-14) and (15-15) are linearly independent in $\mathcal{C}(0, \infty)$. [Hint: Consider the behavior of $c_1 y_1(x) + c_2 y_2(x)$ as $x \to 0$.]

2. The equation

$$x^2 y'' + x(x - 4)y' + y = 0$$

has solutions which are defined for all $x$. What are they?

3. Find the indicial equation associated with the regular singular point at $x = 0$ for each of the following equations.

(a) $x^2 y'' + xy' - y = 0$
(b) $x^2 y'' - 2x(x + 1)y' + (x - 1)y = 0$
(c) $x^2 y'' - 2xy' + y = 0$
(d) $x^2 y'' - xy' + (x^2 - \lambda^2)y = 0$, $\lambda$ a constant
(e) $xy'' + (1 - x)y' + \lambda y = 0$, $\lambda$ a constant

4. Prove that the indicial equation associated with

$$x^2 y'' + xq(x)y' + r(x)y = 0$$

is

$$\nu(\nu - 1) + q(0)\nu + r(0) = 0$$

whenever $q$ and $r$ are polynomials.

5. Find the indicial equation associated with each of the regular singular points $x = 1$ and $x = -1$ for Legendre's equation

$$(1 - x^2)y'' - 2xy' + \lambda(\lambda + 1)y = 0.$$

Use the method introduced in this section to find two linearly independent solutions of the equations in Exercises 6 through 10. In each case verify that the solutions obtained converge whenever $|x| > 0$, and that they are linearly independent in $\mathcal{C}(0, \infty)$ and $\mathcal{C}(-\infty, 0)$.

6. $2x^2 y'' + xy' - y = 0$  7. $9x^2 y'' + 3x(x + 3)y' - (4x + 1)y = 0$
8. xy" + ^— ^ y' - y = 9. Sx 2y" - 2x(x - \)y' + (* + l)y =

10. $4x^2 y'' + x(2x - 1)y' + \tfrac{3}{4}y = 0$


11 Use the method of this section to prove that Laguerre's equation of order X,

xy" + (1 - x)y' + \y = 0,

has a solution which is analytic for all x, and which reduces to a polynomial when
X is a non-negative integer.

15-4 SOLUTIONS ABOUT A REGULAR SINGULAR POINT: THE GENERAL CASE

In this section we shall indicate how the technique introduced above can always be used to find at least one solution of

$$x^2 y'' + xq(x)y' + r(x)y = 0 \tag{15-16}$$

about the origin whenever $q$ and $r$ are analytic at $x = 0$. We again begin by letting $x$ be positive, in which case we seek a solution of the form

$$y(x) = x^{\nu} \sum_{k=0}^{\infty} a_k x^k \tag{15-17}$$

with $a_0 \ne 0$. Then

$$y'(x) = x^{\nu} \sum_{k=0}^{\infty} (k + \nu)a_k x^{k-1}, \qquad y''(x) = x^{\nu} \sum_{k=0}^{\infty} (k + \nu)(k + \nu - 1)a_k x^{k-2},$$

and substitution in (15-16) yields

$$\sum_{k=0}^{\infty} (k + \nu)(k + \nu - 1)a_k x^k + q(x)\sum_{k=0}^{\infty} (k + \nu)a_k x^k + r(x)\sum_{k=0}^{\infty} a_k x^k = 0. \tag{15-18}$$

15-4 | SOLUTIONS ABOUT A REGULAR SINGULAR POINT: GENERAL CASE 591

But since $q$ and $r$ are analytic at $x = 0$, we can write

$$q(x) = \sum_{k=0}^{\infty} q_k x^k, \qquad r(x) = \sum_{k=0}^{\infty} r_k x^k, \tag{15-19}$$

where both series converge in an interval $|x| < R_0$, $R_0 > 0$, centered at $x = 0$. Substituting (15-19) in (15-18), we have

$$\sum_{k=0}^{\infty} (k + \nu)(k + \nu - 1)a_k x^k + \left(\sum_{k=0}^{\infty} q_k x^k\right)\left(\sum_{k=0}^{\infty} (k + \nu)a_k x^k\right) + \left(\sum_{k=0}^{\infty} r_k x^k\right)\left(\sum_{k=0}^{\infty} a_k x^k\right) = 0,$$

and if we now carry out the indicated multiplications according to the formula given in Section 6-6 we obtain

$$\sum_{k=0}^{\infty} \left\{(k + \nu)(k + \nu - 1)a_k + \sum_{j=0}^{k} \left[(j + \nu)q_{k-j} + r_{k-j}\right]a_j\right\} x^k = 0.$$

Hence (15-17) will formally satisfy the given differential equation in the interval $0 < x < R_0$ if and only if

$$(k + \nu)(k + \nu - 1)a_k + \sum_{j=0}^{k} \left[(j + \nu)q_{k-j} + r_{k-j}\right]a_j = 0 \tag{15-20}$$

for all $k \ge 0$.

When $k = 0$, (15-20) reduces to

$$\nu(\nu - 1) + q_0\nu + r_0 = 0 \tag{15-21}$$

(recall that $a_0 \ne 0$), and when $k \ge 1$, (15-20) reduces to

$$\left[(k + \nu)(k + \nu - 1) + q_0(k + \nu) + r_0\right]a_k + \sum_{j=0}^{k-1} \left[(j + \nu)q_{k-j} + r_{k-j}\right]a_j = 0. \tag{15-22}$$

The first of these relations is known as the indicial equation associated with (15-16), and its roots, which determine the admissible values of $\nu$ in (15-17), are called the characteristic exponents of that equation. We direct the reader's attention to the fact that since $q_0$ and $r_0$ are, respectively, the constant terms in the series expansions of $q$ and $r$, (15-21) may be rewritten

$$\nu(\nu - 1) + q(0)\nu + r(0) = 0 \tag{15-23}$$

(cf. Eq. (15-19)). Thus when $q$ and $r$ have been explicitly given, the indicial equation can be obtained directly from (15-16) without undertaking the above computations.
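Since the indicial equation reads directly off $q(0)$ and $r(0)$, its roots can be computed at once. A small Python illustration (our own; the function name `indicial_roots` is an assumption, not the book's notation):

```python
# Illustration (ours): characteristic exponents from q(0) and r(0).
import cmath

def indicial_roots(q0, r0):
    """Roots of nu(nu - 1) + q0*nu + r0 = 0, i.e. nu^2 + (q0 - 1)nu + r0 = 0,
    ordered so that Re(nu1) >= Re(nu2)."""
    b, c = q0 - 1, r0
    d = cmath.sqrt(b * b - 4 * c)
    nu1, nu2 = (-b + d) / 2, (-b - d) / 2
    return (nu1, nu2) if nu1.real >= nu2.real else (nu2, nu1)

# Example (15-8): q(0) = -1/2, r(0) = 1/2 gives the exponents 1 and 1/2.
nu1, nu2 = indicial_roots(-0.5, 0.5)
assert abs(nu1 - 1) < 1e-12 and abs(nu2 - 0.5) < 1e-12

# Bessel's equation (Exercise 9, Section 15-2): q(0) = 1, r(0) = -lambda^2
# gives the exponents +/- lambda; here lambda = 2.
nu1, nu2 = indicial_roots(1.0, -4.0)
assert abs(nu1 - 2) < 1e-12 and abs(nu2 + 2) < 1e-12
```
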


To continue, we now set

$$I(\nu) = \nu(\nu - 1) + q_0\nu + r_0,$$

and let $\nu_1$ and $\nu_2$ denote the roots of the equation $I(\nu) = 0$. Moreover, for convenience we suppose that $\nu_1$ and $\nu_2$ have been labeled in such a way that $\mathrm{Re}(\nu_1) \ge \mathrm{Re}(\nu_2)$.* Then, when $\nu = \nu_1$, (15-22) becomes

$$I(k + \nu_1)a_k + \sum_{j=0}^{k-1} \left[(j + \nu_1)q_{k-j} + r_{k-j}\right]a_j = 0, \tag{15-24}$$

and, by the way in which $\nu_1$ was chosen, we know that $I(k + \nu_1) \ne 0$ for $k = 1, 2, \ldots$. Hence (15-24) can be solved for $a_k$ to yield the recurrence relation

$$a_k = -\frac{1}{I(k + \nu_1)} \sum_{j=0}^{k-1} \left[(j + \nu_1)q_{k-j} + r_{k-j}\right]a_j, \qquad k \ge 1, \tag{15-25}$$

which serves to determine all of the $a_k$ from $k = 1$ onward in terms of $a_0$. And with this we have succeeded in producing a formal solution of Eq. (15-16), valid in the interval $0 < x < R_0$. Moreover, if $x^{\nu}$ is replaced by $|x|^{\nu}$ throughout these computations, this result obviously holds in the interval $-R_0 < x < 0$ as well. Finally, it can be shown that the series obtained in this way always converges if $0 < |x| < R_0$.† Hence the function

$$y(x) = |x|^{\nu_1} \sum_{k=0}^{\infty} a_k x^k, \tag{15-26}$$

with $a_0$ arbitrary and the $a_k$ given by (15-25), is a solution of (15-16) when $0 < |x| < R_0$.
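The recurrence (15-25) is easily mechanized. The Python sketch below (our own illustration; the function name `frobenius` is ours) takes the power-series coefficients of $q$ and $r$ together with an exponent $\nu$, generates $a_0, \ldots, a_n$ exactly with rational arithmetic, and is checked against the example of Section 15-3:

```python
# Illustration (ours): the recurrence (15-25) for a Frobenius series.
from fractions import Fraction

def frobenius(q, r, nu, n):
    """Coefficients a_0, ..., a_n of |x|^nu * sum a_k x^k for
    x^2 y'' + x q(x) y' + r(x) y = 0; q and r are lists of power-series
    coefficients (treated as zero beyond their length), a_0 = 1."""
    def I(v):
        return v * (v - 1) + q[0] * v + r[0]
    qq = lambda i: q[i] if i < len(q) else 0
    rr = lambda i: r[i] if i < len(r) else 0
    a = [Fraction(1)]
    for k in range(1, n + 1):
        s = sum(((j + nu) * qq(k - j) + rr(k - j)) * a[j] for j in range(k))
        a.append(-s / I(k + nu))
    return a

# Example of Section 15-3: q(x) = x - 1/2, r(x) = 1/2; the larger
# exponent nu1 = 1 gives a_1 = -2/3, a_2 = 4/15, as in (15-15).
a = frobenius([Fraction(-1, 2), Fraction(1)], [Fraction(1, 2)], Fraction(1), 2)
assert a[1] == Fraction(-2, 3) and a[2] == Fraction(4, 15)
```
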

To complete the discussion we must now find a second solution of (15-16), linearly independent of the one just obtained. To this end we attempt to repeat the above argument using the second root $\nu_2$ of the indicial equation. Of course, if $\nu_1 = \nu_2$ we get nothing new. However, if $\nu_1 \ne \nu_2$, (15-24) becomes

$$I(k + \nu_2)a_k + \sum_{j=0}^{k-1} \left[(j + \nu_2)q_{k-j} + r_{k-j}\right]a_j = 0, \tag{15-27}$$

which can again be solved for $a_k$ provided $I(k + \nu_2) \ne 0$ for all $k \ge 1$. But when $k > 0$, $I(k + \nu_2) = 0$

* By $\mathrm{Re}(\nu)$ we understand the real part of the complex number $\nu$. Thus if $\nu = \alpha + \beta i$, $\mathrm{Re}(\nu) = \alpha$. The real part of a real number is, of course, the number itself.
† A proof of this fact can be found in the book by Coddington cited in the bibliography.
if and only if $k + \nu_2 = \nu_1$; i.e., if and only if $\nu_1 - \nu_2 = k$. [Recall that $\mathrm{Re}(\nu_1) \ge \mathrm{Re}(\nu_2)$.] Thus the above method will produce a second solution of (15-16) of the form

$$y(x) = |x|^{\nu_2} \sum_{k=0}^{\infty} a_k x^k, \qquad a_0 \ne 0, \tag{15-28}$$

valid for $0 < |x| < R_0$, whenever the roots of the indicial equation $I(\nu) = 0$ do not differ by an integer. In this case it is easy to show that the (particular) solutions $y_1$ and $y_2$ obtained by setting $a_0 = 1$ in (15-26) and (15-28) are linearly independent, and hence that the general solution of (15-16) in a neighborhood of the origin is

$$y(x) = c_1 y_1(x) + c_2 y_2(x),$$

where $c_1$ and $c_2$ are arbitrary constants. This, for instance, is precisely what happened in the example given in the preceding section.

EXERCISES

Find two linearly independent solutions about $x = 0$ for each of the following equations, and state for what values of $x$ these solutions are valid.

1. $x(x - 4)y'' + (x - 2)y' - 4y = 0$
2. $2x^2 y'' + 5xy' - 2y = 0$
3. $3x(x + 4)y'' - 8y' + y = 0$
4. $2xy'' + 3y' - \dfrac{1}{x - 1}\,y = 0$
5. $3x^2 y'' - \dfrac{2x^2}{x - 1}\,y' + \dfrac{7}{x - 1}\,y = 0$
6. $2x^2 y'' + x(x - 1)y' + 2y = 0$
7. $4x^2(x + 1)y'' + x(3x - 1)y' + y = 0$
8. $x^2 y'' + xy' + \left(x^2 - \tfrac{1}{4}\right)y = 0$
9. $3x^2 y'' + xy' - (1 + x)y = 0$
10. $2x^2 y'' + \dfrac{x(7x + 1)}{x + 1}\,y' + \dfrac{1}{x + 1}\,y = 0$

Compute the values of the coefficients $a_1$, $a_2$, $a_3$ in the series solutions of each of the following equations. (Assume $a_0 = 1$.)

11. $x^2 y'' + x(x + 1)y' + y = 0$
12. $16x^2 y'' - 4x(x - 4)y' - 2y = 0$
13. $x^2(x^2 - 1)y'' - xy' - 2y = 0$
14. $8x^2(x - 2)y'' + 2xy' - (\cos x)y = 0$

15. $x^2 y'' + xe^x y' + y = 0$

16. Let $y_1$ and $y_2$ denote the functions obtained from (15-26) and (15-28) by setting $a_0 = 1$. Prove that $y_1$ and $y_2$ are linearly independent in $\mathcal{C}(0, R_0)$ and $\mathcal{C}(-R_0, 0)$. [Hint: See Exercise 1, Section 15-3.]

*17. (a) Show that $x = 1$ is a regular singular point for Legendre's equation

$$(1 - x^2)y'' - 2xy' + \lambda(\lambda + 1)y = 0,$$

and find a solution about this point of the form

$$y(x) = |x - 1|^{\nu} \sum_{k=0}^{\infty} a_k (x - 1)^k.$$

(b) Determine the values of $x$ for which this solution is valid.

15-5 SOLUTIONS ABOUT A REGULAR SINGULAR POINT: THE EXCEPTIONAL CASES

To complete our study of solutions about regular singular points it remains to consider the case where the roots $\nu_1, \nu_2$ of the equation $I(\nu) = 0$ differ by an integer. Our experience with the Euler equation suggests that a solution involving a logarithmic term should arise when $\nu_1 = \nu_2$, and, as we shall see, this can also happen when $\nu_1 \ne \nu_2$. The following theorem gives a complete description of the situation, both in the general case treated above, and in each of the exceptional cases.

Theorem 15-1. Let

$$x^2 y'' + xq(x)y' + r(x)y = 0 \tag{15-29}$$

be a second-order homogeneous linear differential equation whose coefficients are analytic in the interval $|x| < R_0$ ($R_0 > 0$), let $\nu_1$ and $\nu_2$ be the roots of the indicial equation

$$\nu(\nu - 1) + q(0)\nu + r(0) = 0,$$

and suppose that $\nu_1$ and $\nu_2$ have been labeled so that $\mathrm{Re}(\nu_1) \ge \mathrm{Re}(\nu_2)$. Then (15-29) has two linearly independent solutions $y_1$ and $y_2$, valid for $0 < |x| < R_0$, whose form depends upon $\nu_1$ and $\nu_2$ as follows:

Case 1. $\nu_1 - \nu_2$ not an integer. Then

$$y_1(x) = |x|^{\nu_1} \sum_{k=0}^{\infty} a_k x^k, \qquad a_0 = 1,$$

$$y_2(x) = |x|^{\nu_2} \sum_{k=0}^{\infty} b_k x^k, \qquad b_0 = 1.$$

15-5 | SOLUTIONS ABOUT A REGULAR SINGULAR POINT: EXCEPTIONAL CASES 595

Case 2. $\nu_1 = \nu_2 = \nu$. Then

$$y_1(x) = |x|^{\nu} \sum_{k=0}^{\infty} a_k x^k, \qquad a_0 = 1,$$

$$y_2(x) = |x|^{\nu} \sum_{k=1}^{\infty} b_k x^k + y_1(x)\ln|x|.$$

Case 3. $\nu_1 - \nu_2$ a positive integer. Then

$$y_1(x) = |x|^{\nu_1} \sum_{k=0}^{\infty} a_k x^k, \qquad a_0 = 1,$$

$$y_2(x) = |x|^{\nu_2} \sum_{k=0}^{\infty} b_k x^k + c\,y_1(x)\ln|x|, \qquad b_0 = 1, \quad c \text{ a (fixed) constant.}*$$

Finally, the values of the constants appearing in each of these solutions can be determined directly from the differential equation by the method of undetermined coefficients.
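Deciding which case of Theorem 15-1 applies is a matter of inspecting the difference $\nu_1 - \nu_2$; a small Python helper (our own, purely illustrative) makes the trichotomy explicit:

```python
# Illustration (ours): classify the exponents per Theorem 15-1.
def frobenius_case(nu1, nu2, tol=1e-12):
    """Return 1, 2, or 3 for the case of Theorem 15-1 that applies to the
    characteristic exponents nu1, nu2 (given with Re(nu1) >= Re(nu2))."""
    d = nu1 - nu2
    if abs(d) < tol:
        return 2        # equal exponents: a logarithmic term always appears
    if abs(d.imag) < tol and abs(d.real - round(d.real)) < tol and d.real > 0:
        return 3        # exponents differ by a positive integer
    return 1            # two pure Frobenius series

assert frobenius_case(1.0 + 0j, 0.5 + 0j) == 1    # example (15-8)
assert frobenius_case(0.0 + 0j, 0.0 + 0j) == 2    # Euler-type double root
assert frobenius_case(2.0 + 0j, -2.0 + 0j) == 3   # Bessel with p = 2
```
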

In the remainder of this section we sketch the argument leading to the solution involving a logarithmic term when $\nu_1 = \nu_2$. The reasoning in the case where $\nu_1 - \nu_2$ is a positive integer is similar, but more complicated, and will not be given. The interested reader can find the details in the text by Coddington mentioned above.

We begin by repeating a portion of our earlier argument, and attempt to determine the $a_k$ in

$$x^{\nu} \sum_{k=0}^{\infty} a_k x^k \tag{15-30}$$

so that the resulting expression satisfies (15-29) for $0 < x < R_0$. This time, however, we also regard $\nu$ as a variable, and write

$$y(x, \nu) = x^{\nu} \sum_{k=0}^{\infty} a_k x^k. \tag{15-31}$$

Moreover, we assume from the outset that $a_0 = 1$. Then if $L$ denotes the linear differential operator $x^2 D^2 + xq(x)D + r(x)$, we have

$$Ly(x, \nu) = I(\nu)x^{\nu} + x^{\nu} \sum_{k=1}^{\infty} \left\{I(k + \nu)a_k + \sum_{j=0}^{k-1} \left[(j + \nu)q_{k-j} + r_{k-j}\right]a_j\right\} x^k, \tag{15-32}$$

* In certain instances it may happen that $c = 0$. Moreover, the various constants $a_0$ and $b_0$ appearing in these expressions can be assigned nonzero values different from 1 without affecting the validity of the results.

where the $q_{k-j}$ and $r_{k-j}$ are the coefficients in the power series expansions of $q$ and $r$ about the origin. [See (15-21) and (15-22).] We now use the recurrence relation

$$I(k + \nu)a_k + \sum_{j=0}^{k-1} \left[(j + \nu)q_{k-j} + r_{k-j}\right]a_j = 0$$

to determine $a_1, a_2, \ldots$ in terms of $\nu$ in such a way that every term but the first on the right-hand side of (15-32) vanishes. Denoting the resulting expressions by $a_k(\nu)$ and substituting in (15-31), we obtain a function

$$y_1(x, \nu) = x^{\nu}\left[1 + \sum_{k=1}^{\infty} a_k(\nu) x^k\right] \tag{15-33}$$

with the property that

$$Ly_1(x, \nu) = I(\nu)x^{\nu}. \tag{15-34}$$

(Recall that, by assumption, $a_0 = 1$.) But since $\nu_1$ is a double root of the indicial equation $I(\nu) = 0$, $I(\nu) = (\nu - \nu_1)^2$, and (15-34) may be written

$$Ly_1(x, \nu) = (\nu - \nu_1)^2 x^{\nu}. \tag{15-35}$$

Thus $Ly_1(x, \nu) = 0$ when $\nu = \nu_1$, and we conclude that the function $y_1(x, \nu_1)$ is a solution of the equation $Ly = 0$. This, of course, agrees with our earlier results.

The idea for obtaining a second solution in this case originates with the observation that when (15-35) is differentiated with respect to $\nu$ its right-hand side still vanishes when $\nu = \nu_1$. Indeed,

$$\frac{\partial}{\partial \nu}\left[(\nu - \nu_1)^2 x^{\nu}\right] = x^{\nu}(\nu - \nu_1)\left[2 + (\nu - \nu_1)\ln x\right].$$

But since

$$\frac{\partial}{\partial \nu}\left[Ly_1(x, \nu)\right] = L\left[\frac{\partial}{\partial \nu}\, y_1(x, \nu)\right]$$

(see Exercise 13), (15-35) implies that

$$L\left[\frac{\partial}{\partial \nu}\, y_1(x, \nu)\right] = 0$$

when $\nu = \nu_1$. Thus if we differentiate (15-33) term-by-term with respect to $\nu$ and then set $\nu = \nu_1$, the resulting expression will formally satisfy the equation $Ly = 0$ for $0 < x < R_0$. Denoting this expression by $y_2(x, \nu_1)$, we have

$$y_2(x, \nu_1) = \frac{\partial}{\partial \nu}\, y_1(x, \nu)\Big|_{\nu = \nu_1} = x^{\nu_1} \sum_{k=1}^{\infty} a_k'(\nu_1) x^k + y_1(x, \nu_1)\ln x,$$

which is precisely the form of the second solution given in the statement of Theorem 15-1 under Case 2.
15-6 | BESSEL'S EQUATION 597

EXERCISES

Find two linearly independent solutions on the positive $x$-axis for each of the equations in Exercises 1 through 10.

1. $x^2 y'' + x(x - 1)y' + (1 - x)y = 0$  2. $xy'' + (1 - x)y' - y = 0$
3. $x^2 y'' + 3xy' + (x + 1)y = 0$  4. $x^2 y'' + 2x^2 y' - 2y = 0$
5. $xy'' - (x + 3)y' + 2y = 0$  6. $xy'' + (2x + 3)y' + 4y = 0$
7. $xy'' + (x^3 - 1)y' + x^2 y = 0$  8. $x^2 y'' - 2x^2 y' + 2(2x - 1)y = 0$
9. $xy'' + (1 - x)y' + 3y = 0$  10. $x^2 y'' + x^2 y' + (3x - 2)y = 0$

11. Use Theorem 15-1 to determine the form of two linearly independent solutions about $x = 0$ for the equation

$$x^2 y'' + xy' + (x^2 - p^2)y = 0,$$

$p$ a real number. Do not compute the solutions.

12. Prove that the solutions $y_1$ and $y_2$ given in Cases 2 and 3 of Theorem 15-1 are linearly independent in $\mathcal{C}(0, R_0)$ and $\mathcal{C}(-R_0, 0)$.

13. Verify that

$$\frac{\partial}{\partial \nu}\left[Ly(x, \nu)\right] = L\left[\frac{\partial}{\partial \nu}\, y(x, \nu)\right]$$

when $y(x, \nu)$ is defined by (15-31), and

$$L = x^2 D^2 + xq(x)D + r(x).$$

15-6 BESSEL'S EQUATION

In this section we shall use the technique introduced above to find a basis for the solution space of the equation

$$x^2 y'' + xy' + (x^2 - p^2)y = 0 \tag{15-36}$$

about the point $x = 0$ under the assumption that $p$ is real. We recall (see Section 6-2) that (15-36) is known as Bessel's equation of order $p$, and, as we shall see, it arises in the study of boundary-value problems involving Laplace's equation and the wave equation.

Since the indicial equation associated with (15-36) is

$$\nu^2 - p^2 = 0, \tag{15-37}$$

and has roots $\pm p$, Theorem 15-1 guarantees that Bessel's equation of order $p$ possesses a solution of the form

$$y_1(x) = x^p \sum_{k=0}^{\infty} a_k x^k, \qquad p \ge 0,$$

valid for all $x$. To evaluate the $a_k$ we observe that

$$(x^2 - p^2)y_1(x) = x^p \sum_{k=2}^{\infty} a_{k-2} x^k - x^p \sum_{k=0}^{\infty} p^2 a_k x^k,$$

$$xy_1'(x) = x^p \sum_{k=0}^{\infty} (k + p)a_k x^k,$$

$$x^2 y_1''(x) = x^p \sum_{k=0}^{\infty} (k + p)(k + p - 1)a_k x^k.$$

Thus (15-36) implies that

$$(2p + 1)a_1 x + \sum_{k=2}^{\infty} \left[k(2p + k)a_k + a_{k-2}\right]x^k = 0,$$

and we therefore have

$$a_1 = 0, \qquad a_k = -\frac{a_{k-2}}{k(2p + k)}, \quad k \ge 2.$$

From this it immediately follows that

$$a_1 = a_3 = a_5 = \cdots = 0,$$

$$a_2 = -\frac{a_0}{2(2p + 2)}, \qquad a_4 = \frac{a_0}{2 \cdot 4(2p + 2)(2p + 4)},$$

and, in general,

$$a_{2k} = \frac{(-1)^k a_0}{2 \cdot 4 \cdot 6 \cdots (2k)(2p + 2)(2p + 4)\cdots(2p + 2k)} = \frac{(-1)^k a_0}{2^{2k} k!\,(p + 1)(p + 2)\cdots(p + k)}.$$

Hence

$$y_1(x) = a_0 \sum_{k=0}^{\infty} \frac{(-1)^k x^{2k+p}}{2^{2k} k!\,(p + 1)(p + 2)\cdots(p + k)}, \tag{15-38}$$

where $a_0$ is an arbitrary constant.

For various reasons, it turns out to be convenient to set

$$a_0 = \frac{1}{2^p\,\Gamma(p + 1)},$$
15-6 BESSEL'S EQUATION 599

where r denotes the well-known gamma function defined by

T(p) = f\- f- l x
dU p > 0. (15-39)
Jo

It can be shown that this integral converges for all p > 0, diverges to + oo when
p = 0, and has the values

r(D = l,

T(p + 1) = pT(p), p > 0,

(see Exercise 1). In particular, F(n + 1) = n\ whenever n is a positive integer,


and for this reason the gamma function is also known as the generalized factorial
function. Moreover, if we rewrite the identity T(p + 1) = pT(p) as

    Γ(p) = Γ(p + 1) / p,    (15-40)

we can use the values of Γ(p + 1) from (15-39) to assign meaning to Γ(p) for nonintegral, negative p. Indeed, as it stands the right-hand side of (15-40) can be read as the definition of Γ(p) for -1 < p < 0, since Γ(p + 1) is already defined in that interval. This done, we use (15-40) and the values just obtained to extend the definition of Γ(p) to the interval -2 < p < -1. Continuing in this fashion we obtain a real-valued function defined for all values of the independent variable p, save p = 0, -1, -2, .... The graph of the resulting function is shown in Fig. 15-1.
[FIGURE 15-1: the graph of the gamma function]
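The extension process just described is easy to mechanize. The sketch below (an illustration of ours, not from the text) pushes a negative nonintegral argument up into the region p > 0 by repeated use of (15-40) and then divides back down; math.gamma supplies the values on p > 0 and serves as an independent check.

```python
import math

def gamma_extended(p):
    # Γ(p) = Γ(p + 1)/p, applied repeatedly until the argument is positive.
    # Undefined at p = 0, -1, -2, ...
    if p > 0:
        return math.gamma(p)
    if p == int(p):
        raise ValueError("gamma is undefined at nonpositive integers")
    factor = 1.0
    while p <= 0:
        factor *= p          # accumulate the divisors coming from (15-40)
        p += 1
    return math.gamma(p) / factor
```

For example, Γ(-1/2) comes out as -2√π, in agreement with the graph's alternating branches.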

We now return to (15-38), and set a_0 = 1/[2^p Γ(p + 1)] to obtain the first of the two particular solutions needed to solve Bessel's equation. This solution is known as the Bessel function of order p of the first kind, and is denoted J_p. Specifically,

    J_p(x) = Σ_{k=0}^∞ (-1)^k x^{2k+p} / [2^{2k+p} k!(p + 1)(p + 2)···(p + k) Γ(p + 1)],

[FIGURE 15-2: the graphs of J_0, J_1, and J_2 on 0 ≤ x ≤ 10]

an expression which can be rewritten in simpler form as*

    J_p(x) = Σ_{k=0}^∞ [(-1)^k / (Γ(k + 1)Γ(p + k + 1))] (x/2)^{2k+p}.    (15-41)

In particular, when p = 0, we have

    J_0(x) = Σ_{k=0}^∞ [(-1)^k / (k!)^2] (x/2)^{2k},    (15-42)

and, more generally, when p is a non-negative integer n,

    J_n(x) = Σ_{k=0}^∞ [(-1)^k / (k!(n + k)!)] (x/2)^{2k+n}.    (15-43)

The graphs of J_0, J_1, and J_2 are sketched in Fig. 15-2.

* Note that

    Γ(p + 1){(p + 1)(p + 2)···(p + k)} = Γ(p + 2){(p + 2)···(p + k)}
                                       = ···
                                       = Γ(p + k){(p + k)}
                                       = Γ(p + k + 1).
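Formula (15-41) translates directly into a short program. The following Python sketch is ours, not the authors'; math.gamma plays the role of Γ, and the sum is simply truncated after a fixed number of terms.

```python
import math

def bessel_j(p, x, terms=40):
    # Partial sum of (15-41):  J_p(x) = Σ_k (-1)^k / (Γ(k+1)Γ(p+k+1)) (x/2)^{2k+p}
    total = 0.0
    for k in range(terms):
        total += (-1) ** k / (math.gamma(k + 1) * math.gamma(p + k + 1)) \
                 * (x / 2) ** (2 * k + p)
    return total
```

Since the series is entire in x, forty terms already reproduce the classical values J_0(1) ≈ 0.76520 and J_1(1) ≈ 0.44005 to many digits, and the first positive zero of J_0 near 2.40483 (visible in Fig. 15-2) checks out as well.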

To complete our discussion of Bessel's equation, it remains to find a second


solution, linearly independent of Jp . Again we use Theorem 15-1, dividing the
argument into cases depending upon the value of p.

Case 1. p > 0, 2p not an integer.* Here the roots of the indicial equation associated with (15-36) do not differ by an integer, and a second solution can be obtained by repeating the above argument with -p in place of p. Obviously this will lead to a series whose coefficients have the same form as before, and since the gamma function is defined for nonintegral negative values of its argument, the solution in question can be written

    J_{-p}(x) = Σ_{k=0}^∞ [(-1)^k / (Γ(k + 1)Γ(-p + k + 1))] (x/2)^{2k-p}    (15-44)

for x > 0. (For negative values of x we must replace x^{-p} by |x|^{-p}. From now on, however, we shall restrict our attention to the positive x-axis.) Finally, we observe that (15-44) is defined even when p is of the form n + 1/2, n an integer, and again yields a solution which is linearly independent of J_p. (See Exercises 6 and 11 below.) Hence we conclude that the general solution of Bessel's equation of order p is

    y(x) = c_1 J_p(x) + c_2 J_{-p}(x),

whenever p is not an integer.

Case 2. p = 0. Here Bessel's equation becomes

    xy'' + y' + xy = 0,    (15-45)

and its indicial equation has zero as a repeated root. Hence, by Theorem 15-1, we can find a second solution of the form

    K_0(x) = Σ_{k=1}^∞ b_k x^k + J_0(x) ln x,    (15-46)

with J_0 as above. To evaluate the b_k we note that

    x K_0(x) = Σ_{k=3}^∞ b_{k-2} x^{k-1} + x J_0(x) ln x,

    K_0'(x) = Σ_{k=1}^∞ k b_k x^{k-1} + J_0'(x) ln x + J_0(x)/x,

    x K_0''(x) = Σ_{k=2}^∞ k(k - 1) b_k x^{k-1} + x J_0''(x) ln x + 2J_0'(x) - J_0(x)/x.

* Note that the difference of the roots of (15-37) be an integer and only
will if if p is
an integer or half an integer.

Thus (15-45) yields

    b_1 + 4b_2 x + Σ_{k=3}^∞ [k^2 b_k + b_{k-2}] x^{k-1}
        + [x J_0''(x) + J_0'(x) + x J_0(x)] ln x + 2J_0'(x) = 0;

and since x J_0''(x) + J_0'(x) + x J_0(x) = 0, we have

    b_1 + 4b_2 x + Σ_{k=3}^∞ [k^2 b_k + b_{k-2}] x^{k-1} = -2J_0'(x).

Finally, by (15-42),

    J_0'(x) = Σ_{k=1}^∞ (-1)^k (2k / (2^{2k}(k!)^2)) x^{2k-1},

whence

    b_1 + 4b_2 x + Σ_{k=3}^∞ [k^2 b_k + b_{k-2}] x^{k-1} = Σ_{k=1}^∞ (-1)^{k+1} (4k / (2^{2k}(k!)^2)) x^{2k-1}.

To facilitate the evaluation of the b_k we now multiply this expression by x and split the series on the left into its even and odd parts to obtain

    b_1 x + Σ_{k=1}^∞ [(2k + 1)^2 b_{2k+1} + b_{2k-1}] x^{2k+1} + 4b_2 x^2
        + Σ_{k=2}^∞ [(2k)^2 b_{2k} + b_{2k-2}] x^{2k}
        = x^2 + Σ_{k=2}^∞ (-1)^{k+1} (4k / (2^{2k}(k!)^2)) x^{2k}.

Thus b_1 = b_3 = b_5 = ··· = 0, while

    4b_2 = 1    and    (2k)^2 b_{2k} + b_{2k-2} = (-1)^{k+1} (4k / (2^{2k}(k!)^2)),    k > 1.
Hence

    b_2 = 1/2^2,

    b_4 = -(1/(2^2·4^2))(1 + 1/2) = -(1/(2^4 (2!)^2))(1 + 1/2),

and, in general,

    b_{2k} = ((-1)^{k+1} / (2^{2k}(k!)^2))(1 + 1/2 + ··· + 1/k)    (15-47)

(see Exercise 4), and it follows that

    K_0(x) = Σ_{k=1}^∞ ((-1)^{k+1} / (k!)^2)(1 + 1/2 + ··· + 1/k)(x/2)^{2k} + J_0(x) ln x.
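The coefficients (15-47) are easy to check against the recurrence (2k)^2 b_{2k} + b_{2k-2} = (-1)^{k+1} 4k/(2^{2k}(k!)^2) from which they were obtained. A small Python sketch (ours, not the text's):

```python
import math

def b2k(k):
    # b_{2k} = (-1)^{k+1} (1 + 1/2 + ... + 1/k) / (2^{2k} (k!)^2),  k >= 1
    harmonic = sum(1.0 / j for j in range(1, k + 1))
    return (-1) ** (k + 1) * harmonic / (2 ** (2 * k) * math.factorial(k) ** 2)
```

Substituting the closed form back into the recurrence makes both sides agree to machine precision, which is exactly the content of Exercise 4.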

[FIGURE 15-3: the graph of Y_0]

In theoretical work with Bessel functions it is common practice to replace K_0 by a certain linear combination of J_0 and K_0. The resulting function is known as the Bessel function of order zero of the second kind, and is defined by the formula

    Y_0(x) = (2/π)[(ln(1/2) + γ)J_0(x) + K_0(x)],    (15-48)

where γ = 0.57721566..., and is known as Euler's constant.* The graph of Y_0 is shown in Fig. 15-3.

Case 3. p = n, an integer. This time the roots of (15-37) differ by 2n > 0, and Theorem 15-1 asserts that the second solution of Bessel's equation is of the form

    K_n(x) = Σ_{k=0}^∞ b_k x^{k-n} + c J_n(x) ln x,

where c is a constant. Here too the b k and c can be evaluated by the method of
undetermined coefficients, but the argument is now exceptionally long and involved.
Fortunately, we shall not have to use these functions in any of our later work,

* The constant γ is defined to be the sum of the series

    1 + Σ_{k=2}^∞ [1/k + ln((k - 1)/k)].

and thus can omit the argument in favor of the result. It is

    K_n(x) = -(1/2) Σ_{k=0}^{n-1} ((n - k - 1)!/k!)(x/2)^{2k-n}
             - (1/2) Σ_{k=0}^∞ (-1)^k ((H_k + H_{n+k})/(k!(n + k)!))(x/2)^{2k+n}
             + J_n(x) ln x,

where H_m = 1 + 1/2 + ··· + 1/m (and H_0 = 0).

Finally, we remark that it is customary to replace K_n by a linear combination of J_n and K_n, denoted Y_n, and called the Bessel function of order n of the second kind. It is defined by the formula

    Y_n(x) = -(1/π) Σ_{k=0}^{n-1} ((n - k - 1)!/k!)(x/2)^{2k-n} - (H_n/(π n!))(x/2)^n
             + (2/π)(ln(x/2) + γ)J_n(x)
             - (1/π) Σ_{k=1}^∞ (-1)^k ((H_k + H_{n+k})/(k!(n + k)!))(x/2)^{2k+n}.

EXERCISES

1. (a) Starting with the definition of Γ(p), prove that

    Γ(p + 1) = pΓ(p).

(b) Prove that Γ(1) = 1, and that lim_{p→0+} Γ(p) = +∞.

2. (a) Show that Γ(1/2) = √π. [Hint: Note that

    [Γ(1/2)]^2 = 4 ∫_0^∞ ∫_0^∞ e^{-(x^2+y^2)} dx dy,

and evaluate the integral by changing to polar coordinates.]


(b) Find the value of Γ(n/2), n an integer.

3. Discuss the behavior of J_p(x) as x → 0.

4. Prove Formula (15-47).

5. Prove that

    (d/dx) J_0(x) = -J_1(x).

6. (a) Prove that the functions J_p and J_{-p} are linearly independent in C(0, ∞) for all nonintegral values of p.

(b) Show that the function Y_0 defined in the text is a linear combination of J_0 and K_0.
7. Prove that

    J_{-n}(x) = (-1)^n J_n(x)

for all integers n.

8. (a) Prove that the general solution of

    y'' + ((1 - 2α)/x) y' + β^2 y = 0,    x > 0,

is

    y = x^α Z_α(βx),

where Z_α denotes the general solution of Bessel's equation of order α, and β is a (real) constant not zero.

(b) Use the result in (a) to find the general solution of

    y'' + (a/x) y' + by = 0,    a and b constants, b > 0,

and then show that

    J_{1/2}(x) = √(2/(πx)) sin x.

9. (a) Prove that the general solution of the equation u'' + x^2 u = 0 is

    u = √x Z_{1/4}(x^2/2),

where Z_{1/4} is the general solution of Bessel's equation of order 1/4.

(b) Solve the Riccati equation y' = x^2 + y^2. [Hint: Make the change of variable y = -u'/u and use the result in (a).]

10. (a) Prove that the function y = e^{αx} Z_p(βx) is the general solution of the differential equation

    y'' + (1/x - 2α) y' + (α^2 + β^2 - α/x - p^2/x^2) y = 0

whenever Z_p is the general solution of Bessel's equation of order p, and α and β are (real) constants, β ≠ 0.

(b) Use the result in (a) to find the general solution of

    xy'' + (2x + 1)y' + (5x + 1)y = 0.

*11. (a) Prove that the Wronskian of J_p and J_{-p} satisfies the differential equation

    (d/dx)[x W(J_p, J_{-p})] = 0.

[Hint: Use the Lagrange identity (Section 12-5).]

(b) Use the result in (a) to prove that W[J_p, J_{-p}] = c/x, where c is a constant, and then use the series expansions for J_p, J_{-p}, and their derivatives to show that

    c = -2 / (Γ(1 - p)Γ(p))

whenever p is not an integer. (Note that this argument also provides a proof of the fact that J_p and J_{-p} are linearly independent when p is not an integer.)

15-7 PROPERTIES OF BESSEL FUNCTIONS

Now that we have the series expansions for J_p and Y_p, we are in a position to derive a number of important formulas involving Bessel functions and their derivatives. The first two are immediate consequences of (15-41), and read as follows:

    (d/dx)[x^p J_p(x)] = x^p J_{p-1}(x),    (15-49)

    (d/dx)[x^{-p} J_p(x)] = -x^{-p} J_{p+1}(x),    (15-50)

for all p, positive, negative, or zero.


Indeed, by (15-41) and the identity T(p + k + \) = (p + k)T(p + k), we
have
-1
ArVXYpjA Y n -~
dx
r \\ A V
dx £^ 2^+Pk\T(k
n
( )

+ p + 1)
X
2k+2 P

_ ^ (-lf2(p + k) 2k+2p -i
^2 2 k+Pk\T(k
+p+ 1)

_ (~0 2k+(p-l)
- x p Y^
Lj 22k+ P -ik\T(k + p)

= x /p_i(x).

This proves (15-49) and, with obvious modifications, (15-50) as well. A similar
pair of formulas holds for Yp but we omit the proof.
,

When the derivatives appearing in (15-49) and (15-50) are expanded, these formulas become

    x J_p' + p J_p = x J_{p-1},    (15-51)
and
    x J_p' - p J_p = -x J_{p+1},    (15-52)

from which, by adding and subtracting, we immediately obtain

Theorem 15-2. The Bessel functions of the first kind satisfy the recurrence relations

    x J_{p+1} - 2p J_p + x J_{p-1} = 0,    (15-53)
and
    J_{p+1} + 2J_p' - J_{p-1} = 0.    (15-54)
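Both recurrence relations can be verified numerically from the series (15-41); in the sketch below (our addition), the derivative in (15-54) is approximated by a central difference, and the test values p = 1.3, x = 2.0 are arbitrary choices.

```python
import math

def J(p, x, terms=40):
    # partial sum of (15-41)
    return sum((-1) ** k / (math.gamma(k + 1) * math.gamma(p + k + 1))
               * (x / 2) ** (2 * k + p) for k in range(terms))

def dJ(p, x, h=1e-6):
    # central-difference approximation to J_p'(x), adequate for a rough check
    return (J(p, x + h) - J(p, x - h)) / (2 * h)
```

The purely algebraic relation (15-53) holds essentially to machine precision, while (15-54) is limited only by the accuracy of the finite difference.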

Example (Bessel functions of the first kind of half-integral order). When p = 1/2, (15-41) becomes

    J_{1/2}(x) = Σ_{k=0}^∞ ((-1)^k / (Γ(k + 1)Γ(k + 3/2))) (x/2)^{2k+1/2}.

But since

    Γ(3/2 + k) = (3/2)(5/2)···((2k + 1)/2) Γ(3/2) = ((3·5···(2k + 1))/2^k) Γ(3/2),

we find that

    J_{1/2}(x) = (√x / (√2 Γ(3/2))) Σ_{k=0}^∞ ((-1)^k / (2^k k! · 3·5···(2k + 1))) x^{2k}

        = (√x / (√2 Γ(3/2))) [1 - x^2/(2·3) + x^4/(2·4·3·5) - x^6/(2·4·6·3·5·7) + ···]

        = (1 / (√(2x) Γ(3/2))) [x - x^3/3! + x^5/5! - x^7/7! + ···]

        = (1 / (√(2x) Γ(3/2))) sin x.

Finally, referring to Exercise 2 of Section 15-6, we see that

    Γ(3/2) = √π / 2,

whence

    J_{1/2}(x) = √(2/(πx)) sin x.    (15-55)

A similar argument shows that

    J_{-1/2}(x) = √(2/(πx)) cos x,    (15-56)

and it now follows from (15-53) that every Bessel function of the first kind of half-integral order (i.e., of order n + 1/2, n an integer) can be expressed in finite form in terms of elementary functions. For instance,

    J_{3/2}(x) = -J_{-1/2}(x) + (1/x) J_{1/2}(x) = √(2/(πx)) (sin x / x - cos x),

    J_{5/2}(x) = -J_{1/2}(x) + (3/x) J_{3/2}(x) = √(2/(πx)) (3 sin x / x^2 - 3 cos x / x - sin x),

and so forth. In passing, we remark that the Bessel functions of half-integral order are the only Bessel functions with this special property, all others being transcendental functions which cannot be written in closed form in terms of elementary functions.*
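The closed forms (15-55) and (15-56), and the expression for J_{3/2} above, can all be confirmed numerically against the series (15-41). The following check is ours, not the text's; the test point x = 1.7 is arbitrary.

```python
import math

def J(p, x, terms=40):
    # partial sum of (15-41)
    return sum((-1) ** k / (math.gamma(k + 1) * math.gamma(p + k + 1))
               * (x / 2) ** (2 * k + p) for k in range(terms))

x = 1.7  # an arbitrary positive test point
j_half = math.sqrt(2 / (math.pi * x)) * math.sin(x)        # (15-55)
j_minus_half = math.sqrt(2 / (math.pi * x)) * math.cos(x)  # (15-56)
j_three_half = math.sqrt(2 / (math.pi * x)) * (math.sin(x) / x - math.cos(x))
```

All three closed forms agree with the series to roughly machine precision, which is a reassuring check on the gamma-function bookkeeping in the derivation above.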
The several formulas derived above are of crucial importance in solving boundary-value problems involving Bessel's equation. So too is the information concerning the general behavior of Bessel functions that was established in Section 6-2, and we therefore conclude this section by reviewing and sharpening some of those results.
First, we recall that every nontrivial solution of Bessel's equation of order p has infinitely many zeros on the positive x-axis, and that the distance between successive zeros approaches π as x approaches ∞. Thus both J_p and J_{-p} (or Y_p, as the case may be) are "oscillatory" functions, and by the Sonin-Polya theorem we can even assert that the magnitude of their oscillations decreases with increasing x. In short, the graphs of J_p and J_{-p} (or Y_p) have a damped oscillatory character; a fact which was borne out in the case of J_{n+1/2} by the results of the preceding example. Actually, this rough description of the behavior of Bessel functions can be made much more precise. For instance, it is not too difficult to show that every solution of Bessel's equation of order p can be written in the form

    y(x) = (A_p/√x) sin(x + ω_p) + r_p(x)/(x√x),

where A_p and ω_p are constants whose values depend upon p, and r_p is a function, again dependent on p, which is bounded as x → ∞. Thus, for large values of x, y differs very little from the damped sinusoidal function

    (A_p/√x) sin(x + ω_p),

and, in particular, it follows that y(x) → 0 as x → ∞.†

Finally, in preparation for our study of boundary-value problems, we prove two further lemmas on the zeros of Bessel functions.

Lemma 15-1. The zeros of J_p and J_{p+1} are distinct, and alternate on the positive x-axis.

Proof. By (15-52),

    x J_p'(x) = p J_p(x) - x J_{p+1}(x).

Thus if J_p(x_0) and J_{p+1}(x_0) were to vanish for some x_0 > 0, J_p'(x_0) would also

* By definition, the class of elementary functions consists of all rational functions (quotients of polynomials), trigonometric functions, exponential functions, and their inverses.

† For a proof of this fact see Chapter 8 of G. P. Tolstov, Fourier Series, Prentice-Hall, Englewood Cliffs, N.J., 1962.

vanish, and the uniqueness theorem for initial-value problems involving second-order linear differential equations would then imply that J_p ≡ 0. This is nonsense, and it therefore follows that the zeros of J_p and J_{p+1} are distinct.

To complete the proof, let λ_1 < λ_2 be consecutive positive zeros of J_p. Then by (15-52),

    J_{p+1}(λ_1) = -J_p'(λ_1)    and    J_{p+1}(λ_2) = -J_p'(λ_2).

But J_p'(λ_1) and J_p'(λ_2) have opposite signs, and the above equalities then imply that J_{p+1} must vanish at least once between λ_1 and λ_2. A similar argument using (15-51) shows that J_p must vanish between consecutive zeros of J_{p+1}, and we are done. ∎
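Lemma 15-1 can be watched in action: the sketch below (our addition) computes the first three positive zeros of J_0 and of J_1 by bisection and checks that they interlace. The brackets are read off from the graphs in Fig. 15-2.

```python
import math

def J(n, x, terms=60):
    # partial sum of (15-43)
    return sum((-1) ** k / (math.factorial(k) * math.factorial(n + k))
               * (x / 2) ** (2 * k + n) for k in range(terms))

def bisect(f, a, b, steps=80):
    fa = f(a)
    for _ in range(steps):
        m = 0.5 * (a + b)
        if fa * f(m) <= 0:
            b = m
        else:
            a, fa = m, f(m)
    return 0.5 * (a + b)

z0 = [bisect(lambda x: J(0, x), a, b) for a, b in [(2, 3), (5, 6), (8, 9)]]
z1 = [bisect(lambda x: J(1, x), a, b) for a, b in [(3, 4), (6.5, 7.5), (10, 11)]]
```

Merging the two lists gives an increasing sequence that alternates between zeros of J_0 and zeros of J_1, exactly as the lemma asserts.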

Lemma 15-2. The function

    F(x) = α J_p(x) + β x J_p'(x)

has infinitely many zeros on the positive x-axis for all values of p and all constants α and β.

Proof. If β = 0 the assertion has already been proved, while if α = 0 it is an immediate consequence of our results concerning the behavior of the zeros of J_p. Thus we need only consider the case where α ≠ 0 and β ≠ 0. Here we let λ_1 < λ_2 < ··· be the positive zeros of J_p. Then since J_p is positive in the interval (0, λ_1), negative in the interval (λ_1, λ_2), positive in (λ_2, λ_3), etc., we have

    J_p'(λ_1) < 0,    J_p'(λ_2) > 0,    J_p'(λ_3) < 0,    ....

Hence F(x) alternates in sign at the points λ_1, λ_2, ..., and therefore vanishes somewhere between each of them. ∎

EXERCISES

1. Prove that (d/dx)[x^{-p} J_p(x)] = -x^{-p} J_{p+1}(x) for all p.

2. Express J 3 and J4 in terms of Jo and J\.

3. Express J_5 in terms of J_0 and J_1.

4. Prove that

    J_{-1/2}(x) = √(2/(πx)) cos x.

5. Show that

    ∫ J_0(x) sin x dx = x J_0(x) sin x - x J_1(x) cos x + c

and that

    ∫ J_0(x) cos x dx = x J_0(x) cos x + x J_1(x) sin x + c,

c an arbitrary constant.

6. Prove that

    ∫ J_0(x) dx = J_1(x) + J_2(x)/x + 3J_3(x)/x^2 + ···
        + ((2n - 2)! J_n(x)) / (2^{n-1}(n - 1)! x^{n-1})
        + ((2n)! / (2^n n!)) ∫ (J_n(x)/x^n) dx.

7. Prove that

    x J_1(x) = 4 Σ_{n=1}^∞ (-1)^{n+1} n J_{2n}(x).

[Hint: Use the formula J_{n+1} + J_{n-1} = (2n/x)J_n.]

8. (a) Prove that

    ∫_0^∞ J_{n+1}(x) dx = ∫_0^∞ J_{n-1}(x) dx

for all positive integers n. [Hint: Integrate the recurrence relation J_{n-1} - J_{n+1} = 2J_n'.]

(b) Use the result in (a) to show that

    ∫_0^∞ J_n(x) dx = 1

for all integers n ≥ 1. (Remark. This result is also valid when n = 0.)

9. Use the recurrence relation J_{n-1} + J_{n+1} = (2n/x)J_n together with the result in Exercise 8(b) to show that

    ∫_0^∞ (J_n(x)/x) dx = 1/n

for all integers n > 0.

10. (a) Prove that

    [J_{p-1}(x)]^2 - [J_{p+1}(x)]^2 = (2p/x) (d/dx)[J_p(x)]^2.

(b) Use the result in (a) to show that

    ∫_0^x t [J_p(t)]^2 dt = 2 Σ_{k=0}^∞ (p + 2k + 1)[J_{p+2k+1}(x)]^2.
11. Show that

    J_2(x) = J_0(x) + 2J_0''(x).

12. Prove that

    2^k J_n^{(k)}(x) = J_{n-k}(x) - k J_{n-k+2}(x) + (k(k - 1)/2!) J_{n-k+4}(x) - ···
        + (-1)^k J_{n+k}(x),

where

    J_n^{(k)}(x) = (d^k/dx^k) J_n(x).

13. Prove that

    ∫ J_p(x) dx = 2 Σ_{k=0}^{n-1} J_{p+2k+1}(x) + ∫ J_{p+2n}(x) dx    for all n > 0.

14. (a) Prove that

    (d/dx)[x^t J_p(x)J_q(x)] = (t + p + q) x^{t-1} J_p(x)J_q(x)
        - x^t [J_p(x)J_{q+1}(x) + J_q(x)J_{p+1}(x)],

    (d/dx)[x^t J_{p+1}(x)J_{q+1}(x)] = (t - p - q - 2) x^{t-1} J_{p+1}(x)J_{q+1}(x)
        + x^t [J_p(x)J_{q+1}(x) + J_q(x)J_{p+1}(x)].

(b) Add the two identities in (a) and then integrate to deduce that

    ∫ (t + p + q) x^{t-1} J_p(x)J_q(x) dx + ∫ (t - p - q - 2) x^{t-1} J_{p+1}(x)J_{q+1}(x) dx
        = x^t [J_p(x)J_q(x) + J_{p+1}(x)J_{q+1}(x)] + c,

c an arbitrary constant.

(c) Use the result in (b) to prove that

    ∫ x^{2p+1} [J_p(x)]^2 dx = (x^{2(p+1)} / (2(2p + 1))) {[J_p(x)]^2 + [J_{p+1}(x)]^2} + C.

15. (a) Use the formula in Exercise 14(b) to prove that

    (p + q) ∫ (J_p(x)J_q(x)/x) dx - J_p(x)J_q(x) = 2J_{p+1}(x)J_{q+1}(x)
        + {[(p + 1) + (q + 1)] ∫ (J_{p+1}(x)J_{q+1}(x)/x) dx - J_{p+1}(x)J_{q+1}(x)}.

(b) Iterate the result in (a) to obtain the formula

    (p + q) ∫ (J_p(x)J_q(x)/x) dx = 2 Σ_{k=0}^n J_{p+k}(x)J_{q+k}(x) - J_p(x)J_q(x)
        - J_{p+n}(x)J_{q+n}(x) + (p + q + 2n) ∫ (J_{p+n}(x)J_{q+n}(x)/x) dx.

(c) Use the preceding formula to deduce that

    ∫ ([J_n(x)]^2/x) dx = -(1/(2n)) {[J_0(x)]^2 + [J_n(x)]^2 + 2 Σ_{k=1}^{n-1} [J_k(x)]^2} + c.

16. Discuss the behavior of the positive zeros of the function J_p'.

17. Show that if y = x^{-p} J_p(λx), then xy'' + (2p + 1)y' + λ^2 xy = 0.

18. (a) Prove that the Laplace transform of the function x^p J_p(λx) is given by the formula

    L[x^p J_p(λx)] = λ^p Γ(2p + 1) / (2^p Γ(p + 1)(s^2 + λ^2)^{(2p+1)/2})

for all non-negative values of x and λ and all non-negative integral p. [Hint: Show that y = x^p J_p(λx) is a solution of the equation

    xy'' + (1 - 2p)y' + λ^2 xy = 0,

and then deduce that

    dL[y]/ds + ((1 + 2p)s / (s^2 + λ^2)) L[y] = 0.

Solve this equation for L[y], and complete the proof by evaluating the constant of integration.]

(b) Use the result in (a) to show that

    L[J_n(λx)] = λ^n / ((s^2 + λ^2)^{1/2} [s + (s^2 + λ^2)^{1/2}]^n).

19. (a) Use the convolution theorem and Exercise 18(a) to show that

    ∫_0^x J_0(λ)J_0(x - λ) dλ = sin x.

(b) Solve the initial-value problem

    y'' + y = J_0(x);    y(0) = y'(0) = 0,

by Laplace transform methods, and then use the result to deduce that

    ∫_0^x sin(x - λ) J_0(λ) dλ = x J_1(x).

*15-8 THE GENERATING FUNCTION

In Section 15-6 we defined the function J_n, n a non-negative integer, to be the solution of Bessel's equation of order n whose series expansion about the origin converges for all x and has 1/(2^n n!) as its leading coefficient. Although this is certainly the most natural way of approaching the study of Bessel functions of integral order, it is also possible to define these functions by means of a generating function G(x, t), in which case J_n(x) appears as the nth coefficient in the series expansion of G developed in powers of t. (See Section 11-8.) To accomplish this we set

    G(x, t) = e^{(x/2)[t - (1/t)]},    t ≠ 0,    (15-57)

and use the well-known power series expansion for the exponential function to write

    e^{xt/2} = 1 + (x/2)t + (x^2/(2^2 2!))t^2 + ··· + (x^n/(2^n n!))t^n + ···,

    e^{-x/2t} = 1 - (x/2)(1/t) + (x^2/(2^2 2!))(1/t^2) - ··· + (-1)^n (x^n/(2^n n!))(1/t^n) + ···.

Hence

    G(x, t) = [Σ_{n=0}^∞ (x^n/(2^n n!)) t^n][Σ_{n=0}^∞ (-1)^n (x^n/(2^n n!)) t^{-n}],

and since each of these series is absolutely convergent for all x and all t ≠ 0 we can perform the indicated multiplication and rearrange terms to obtain

    G(x, t) = Σ_{n=-∞}^∞ J_n(x) t^n,

where the J_n(x) are functions of x alone. Moreover, an easy computation reveals that

    J_n(x) = (x/2)^n [1/n! - x^2/(2^2 1!(n + 1)!) + x^4/(2^4 2!(n + 2)!) - ···]
           = Σ_{k=0}^∞ ((-1)^k / (2^{2k+n} k!(k + n)!)) x^{2k+n}

for n ≥ 0, and that

    J_{-n}(x) = (-1)^n J_n(x)

for n > 0, thereby proving

Theorem 15-3. The function e^{(x/2)[t-(1/t)]} generates the Bessel functions of integral order of the first kind in the sense that

    e^{(x/2)[t-(1/t)]} = Σ_{n=-∞}^∞ J_n(x) t^n    (15-58)

for all x and all t ≠ 0.
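Theorem 15-3 invites a direct numerical experiment (ours, with arbitrarily chosen x and t): evaluate G(x, t) on the left directly, truncate the symmetric sum on the right, and compare.

```python
import math

def J(n, x, terms=40):
    # partial sum of (15-43)
    return sum((-1) ** k / (math.factorial(k) * math.factorial(n + k))
               * (x / 2) ** (2 * k + n) for k in range(terms))

def G(x, t):
    return math.exp((x / 2) * (t - 1 / t))

def G_series(x, t, N=25):
    # Σ_{n=-N}^{N} J_n(x) t^n, using J_{-n}(x) = (-1)^n J_n(x)
    s = J(0, x)
    for n in range(1, N + 1):
        s += J(n, x) * (t ** n + (-1) ** n * t ** (-n))
    return s
```

Because J_n(x) decays faster than any power of n for fixed x, twenty-five terms on each side of n = 0 already match the exponential to ten decimal places, even at negative t.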

(Incidentally, this result motivates the choice of the coefficient a_0 in the series expansion for J_n that was made in Section 15-6.)

Having proved this theorem it is impossible to resist the temptation to do something with it, since we are now but a step away from a number of important results. To obtain them we make the change of variable t = e^{iθ} in (15-57), and use the identity e^{iθ} = cos θ + i sin θ to write

    e^{(x/2)[t-(1/t)]} = e^{ix sin θ} = cos(x sin θ) + i sin(x sin θ).

Thus (15-58) becomes

    cos(x sin θ) + i sin(x sin θ) = Σ_{n=-∞}^∞ J_n(x)[cos nθ + i sin nθ],

and by equating real and imaginary parts and using the identity

    J_{-n}(x) = (-1)^n J_n(x),


we find that

    cos(x sin θ) = J_0(x) + 2 Σ_{n=1}^∞ J_{2n}(x) cos 2nθ,
                                                              (15-59)
    sin(x sin θ) = 2 Σ_{n=0}^∞ J_{2n+1}(x) sin (2n + 1)θ.*

Continuing, we now multiply the first of these formulas by cos 2kθ, the second by sin 2kθ, and integrate the resulting expressions term-by-term over the interval 0 ≤ θ ≤ π (an operation which is legitimate here) to obtain

    J_{2k}(x) = (1/π) ∫_0^π cos(x sin θ) cos 2kθ dθ,
                                                              (15-60)
    0 = (1/π) ∫_0^π sin(x sin θ) sin 2kθ dθ.

(See Exercise 1.) In exactly the same way we find that

    0 = (1/π) ∫_0^π cos(x sin θ) cos (2k + 1)θ dθ,
                                                              (15-61)
    J_{2k+1}(x) = (1/π) ∫_0^π sin(x sin θ) sin (2k + 1)θ dθ.

Finally, by adding these results and using the identity

    cos(nθ - x sin θ) = cos nθ cos(x sin θ) + sin nθ sin(x sin θ),

we deduce

Theorem 15-4. If n is a non-negative integer,

    J_n(x) = (1/π) ∫_0^π cos(nθ - x sin θ) dθ.    (15-62)

This formula is sometimes called Bessel's integral form for J_n.
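Since the integrand in (15-62) extends to a smooth periodic function of θ, even a plain trapezoidal rule reproduces J_n essentially to machine precision. The comparison below (our sketch) is against the series (15-43).

```python
import math

def J_series(n, x, terms=40):
    # partial sum of (15-43)
    return sum((-1) ** k / (math.factorial(k) * math.factorial(n + k))
               * (x / 2) ** (2 * k + n) for k in range(terms))

def J_integral(n, x, m=400):
    # (1/π) ∫_0^π cos(nθ - x sin θ) dθ by the trapezoidal rule  -- Eq. (15-62)
    h = math.pi / m
    s = 0.5 * (math.cos(0.0) + math.cos(n * math.pi))   # endpoint values
    for j in range(1, m):
        theta = j * h
        s += math.cos(n * theta - x * math.sin(theta))
    return s * h / math.pi
```

The two definitions agree for every small n and moderate x one cares to try, which is a satisfying confirmation of the whole generating-function argument.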

Actually, the argument we have just given proves more than this. For if x is held fixed, and a_k and b_k denote, respectively, the Fourier cosine coefficients of the function cos(x sin θ) and the Fourier sine coefficients of the function sin(x sin θ) on the interval 0 ≤ θ ≤ π, then (15-60) implies that

    J_{2k}(x) = a_{2k}/2,

and (15-61) that

    J_{2k+1}(x) = b_{2k+1}/2.

* The validity of these computations depends upon the fact that (15-58) is absolutely convergent for all t ≠ 0, real or complex.

But by Corollary 8-1, we know that

    lim_{k→∞} a_{2k} = 0,    lim_{k→∞} b_{2k+1} = 0;

whence the following theorem.

Theorem 15-5. For each fixed x ≥ 0, J_n(x) approaches zero as n → ∞.

EXERCISES

1. Deduce (15-60) from (15-59) under the assumption that termwise integration is legitimate.

2. Prove that |J_n(x)| ≤ 1 for all integers n, and all x.

3. Verify that the coefficient J_n(x) appearing in the series

    e^{(x/2)[t-(1/t)]} = Σ_{n=-∞}^∞ J_n(x) t^n

is as asserted in the text.

4. (a) Use the identity

    e^{(x/2)[t-(1/t)]} e^{(y/2)[t-(1/t)]} = e^{[(x+y)/2][t-(1/t)]}

to prove that

    J_n(x + y) = Σ_{k=-∞}^∞ J_k(x) J_{n-k}(y).

(b) Set n = 0 in the above formula and show that

    J_0(x + y) = J_0(x)J_0(y) + 2 Σ_{k=1}^∞ (-1)^k J_k(x)J_k(y).

(c) Prove that

    [J_0(x)]^2 + 2 Σ_{k=1}^∞ [J_k(x)]^2 = 1.

(d) Prove that

    J_n(2x) = Σ_{k=0}^n J_k(x)J_{n-k}(x) + 2 Σ_{k=1}^∞ (-1)^k J_k(x)J_{n+k}(x).

*5. (a) Show that

    Γ(x) = 2 ∫_0^∞ e^{-r^2} r^{2x-1} dr,    x > 0,

and then use this formula to prove that

    Γ(p)Γ(q) / Γ(p + q) = 2 ∫_0^{π/2} sin^{2p-1} θ cos^{2q-1} θ dθ

for all p > 0, q > 0.

(b) Use the result established in (a) to rewrite the series expansion for J_p as

    J_p(x) = ((x/2)^p / (Γ(1/2)Γ(p + 1/2))) Σ_{k=0}^∞ ((-1)^k x^{2k}/(2k)!) ∫_0^π sin^{2p} θ cos^{2k} θ dθ,

and then apply the Weierstrass M-test with

    π Σ_{k=0}^∞ x^{2k}/(2k)!

as the comparison series to show that

    J_p(x) = ((x/2)^p / (Γ(1/2)Γ(p + 1/2))) ∫_0^π sin^{2p} θ [Σ_{k=0}^∞ ((-1)^k (x cos θ)^{2k}/(2k)!)] dθ.

(c) Now prove that

    J_p(x) = ((x/2)^p / (Γ(1/2)Γ(p + 1/2))) ∫_0^π sin^{2p} θ cos(x cos θ) dθ.

(This result is known as the Poisson Integral Form for J_p.)

The following exercises introduce the so-called modified Bessel functions of integral order, I_n(x).

6. (a) Expand the function e^{(x/2)[t+(1/t)]} in powers of t as

    e^{(x/2)[t+(1/t)]} = Σ_{n=-∞}^∞ I_n(x) t^n,

and show that

    I_n(x) = (x^n/(2^n n!)) [1 + x^2/(2(2n + 2)) + x^4/(2·4(2n + 2)(2n + 4)) + ···]
           = Σ_{k=0}^∞ x^{n+2k} / (2^{n+2k} k!(n + k)!)

for all n ≥ 0, and that

    I_{-n}(x) = I_n(x).

(b) Prove that

    I_{2n}(-x) = I_{2n}(x),    I_{2n+1}(-x) = -I_{2n+1}(x),

and

    I_n(x) = (-i)^n J_n(ix)

for all integers n.

7. (a) Prove that

    (d/dx)[x^n I_n(x)] = x^n I_{n-1}(x),

    (d/dx)[x^{-n} I_n(x)] = x^{-n} I_{n+1}(x).

(b) Prove that the functions I_n satisfy the following recurrence relations:

    I_{n-1} - I_{n+1} = (2n/x) I_n,

    I_{n-1} + I_{n+1} = 2I_n',

    I_n' = I_{n-1} - (n/x) I_n,

    I_n' = I_{n+1} + (n/x) I_n.

8. Show that the modified Bessel function I_n(x) is a particular solution of the equation

    x^2 y'' + xy' - (x^2 + n^2)y = 0.

15-9 STURM-LIOUVILLE PROBLEMS FOR BESSEL'S EQUATION

In physical applications Bessel's equation usually arises in the "parametric" form

    x^2 y'' + xy' + (λ^2 x^2 - p^2)y = 0,    (15-63)

or

    (d/dx)(x dy/dx) + (λ^2 x - p^2/x) y = 0,    (15-64)

with λ a constant.* As such its solution space is spanned by the functions J_p(λx), and J_{-p}(λx) or Y_p(λx), depending upon the value of p. In this section we propose
to establish the existence and orthogonality of eigenfunctions for a number of
boundary- value problems involving Eq. (15-64) on the closed unit interval [0, 1]
under the assumption that X is a non-negative real number. Basically we shall
use the method developed in Chapter 12, but must modify our arguments in
certain particulars to accommodate the singularities in the equation and its solu-
tions at the origin. For simplicity we continue to restrict our attention to Bessel
functions of the first kind.
In the first place, the fact that the parameter appearing in (15-64) is multiplied by x implies that we must use the weighted inner product

    f · g = ∫_0^1 f(x)g(x) x dx    (15-65)

in PC[0, 1] throughout the following discussion. Moreover, in order to ensure that the functions J_p(λx), λ > 0, belong to this space we must demand that p be non-negative. For otherwise J_p(λx) is unbounded as x → 0, and is not piecewise continuous. Finally, since we shall be working on the interval [0, 1], and

* The reader can verify that (15-63) is a variant of Bessel's equation by making the change of variable s = λx in s^2 y'' + sy' + (s^2 - p^2)y = 0. Equation (15-64) is, of course, the self-adjoint form of (15-63).

since the leading coefficient of (15-64) vanishes when x = 0, we need only impose a boundary condition at the point x = 1.

This said, we now consider the Sturm-Liouville problem consisting of

    (d/dx)(x dy/dx) + (λ^2 x - p^2/x) y = 0,    (15-66)

p ≥ 0, λ ≥ 0, and the single unmixed boundary condition

    β_1 y(1) + β_2 y'(1) = 0,    (15-67)

|β_1| + |β_2| ≠ 0. By the results in Section 12-8 we can assert in advance that
eigenfunctions belonging to distinct eigenvalues for this problem are mutually orthogonal in PC[0, 1]. Furthermore, the eigenfunctions belonging to a positive eigenvalue λ will be the nonzero multiples of J_p(λx), while the eigenfunctions belonging to the eigenvalue λ = 0 (if 0 is an eigenvalue) will be the nonzero multiples of x^p, since in this case (15-66) reduces to the Euler equation

    x^2 y'' + xy' - p^2 y = 0.

Thus, under the hypotheses imposed upon p and λ we need only examine the functions x^p and J_p(λx), λ > 0, as potential eigenfunctions. And here we argue as follows.

Case 1. β_2 = 0. In this case (15-67) becomes

    y(1) = 0,    (15-68)

and it follows that J_p(λx) will be an eigenfunction if and only if J_p(λ) = 0. Hence the positive zeros λ_1 < λ_2 < ··· of J_p are eigenvalues and

    J_p(λ_k x),    k = 1, 2, ...,

are eigenfunctions. Finally, since x^p does not vanish when x = 1, λ = 0 is not an eigenvalue, and the above list of eigenfunctions is complete.

Case 2. β_1 = 0. Here (15-67) becomes

    y'(1) = 0,    (15-69)

and the positive zeros μ_1 < μ_2 < ··· of J_p' are now eigenvalues (see Lemma 15-2). In addition, the function x^p satisfies (15-69) when p = 0, and in that case λ = 0 is also an eigenvalue. Thus the eigenfunctions for this problem are

    J_p(μ_k x),    k = 1, 2, ...,    when p > 0,

and

    1, J_0(μ_k x),    k = 1, 2, ...,    when p = 0.

Case 3. For various reasons the case β_1 ≠ 0, β_2 ≠ 0 is not particularly interesting, and is usually replaced by the more general requirement that when x = 1, J_p(λx) satisfy the equation

    λ J_p'(λ) - h J_p(λ) = 0,    h a constant.    (15-70)

(As we shall see, this type of boundary condition arises in the study of heat flow in cylindrical regions, and is not as artificial as one might think.) Since (15-70) has a variable coefficient λ, it does not fall under any of the several types of boundary conditions discussed earlier, and therefore must be treated separately.

We begin by applying Lemma 15-2 to assert that (15-70) has infinitely many positive roots, ν_1 < ν_2 < ···, meaning, of course, that (15-70) is satisfied whenever λ = ν_k. Hence the functions

    J_p(ν_k x),    k = 1, 2, ...,

satisfy (15-66) and (15-70), and as a consequence are "eigenfunctions" for this problem. Moreover, reasoning as in Section 12-8, we find that these functions are mutually orthogonal in PC[0, 1] with respect to the weight function x. (See Exercise 2 below.) Thus, on the face of things, the situation here would appear to be identical with that discussed in each of the preceding cases. This, however, is not quite true, for it turns out that whenever p ≤ h, Eq. (15-70) has other roots in addition to the ν_k. Indeed, when p = h, Eq. (15-70) becomes

    λ J_p'(λ) - p J_p(λ) = 0,

which, by (15-52), can be rewritten

    λ J_{p+1}(λ) = 0,

and has λ = 0 as a root. When p < h the situation is much more complicated, and cannot possibly be treated with the tools we have available. Suffice it to say that the equation then admits a pair of imaginary roots ±iν̃, a phenomenon which does not occur with any of the boundary conditions we have considered heretofore. In the next section we will find that the existence of these additional roots introduces difficulties in the study of series expansions relative to the orthogonal set {J_p(ν_k x)}.
In view of the now obvious fact that we will soon be computing series expansions relative to the eigenfunctions found above, we conclude this section by evaluating their norms in PC[0, 1]. The basic formula for all of these computations is given in the following lemma.

Lemma 15-3. For all non-negative real numbers p and λ,

    ∫_0^1 [J_p(λx)]^2 x dx = (1/2)[J_p'(λ)]^2 + ((λ^2 - p^2)/(2λ^2)) [J_p(λ)]^2.    (15-71)

More generally,*

    ∫_0^b [J_p(λx)]^2 x dx = (b^2/2)[J_p'(λb)]^2 + ((λ^2 b^2 - p^2)/(2λ^2)) [J_p(λb)]^2.    (15-72)

Proof. We begin by multiplying Bessel's equation by 2y' to obtain

    2x^2 y'y'' + 2x(y')^2 + 2(x^2 - p^2)y'y = 0,

or

    [x^2(y')^2]' + 2(x^2 - p^2)y'y = 0.

Rewriting this equation as

    [x^2(y')^2]' + [(x^2 - p^2)y^2]' - 2xy^2 = 0,

we conclude that

    2x[J_p(x)]^2 = (d/dx){x^2[J_p'(x)]^2 + (x^2 - p^2)[J_p(x)]^2}.

Hence

    2 ∫_0^λ [J_p(x)]^2 x dx = x^2[J_p'(x)]^2 |_0^λ + (x^2 - p^2)[J_p(x)]^2 |_0^λ

        = λ^2[J_p'(λ)]^2 + (λ^2 - p^2)[J_p(λ)]^2;

the last step following from the fact that p J_p(0) = 0 for all p ≥ 0. The desired result now follows by setting x = λt in the above integral. ∎

When rewritten in terms of the inner product on PC[0, 1], Eq. (15-71) becomes

    ||J_p(λx)||^2 = (1/2)[J_p'(λ)]^2 + ((λ^2 - p^2)/(2λ^2)) [J_p(λ)]^2,    (15-73)

and yields the following important formulas.

Case 1. λ_k the kth positive zero of J_p:

    ||J_p(λ_k x)||^2 = (1/2)[J_p'(λ_k)]^2.    (15-74)

The reader should note that by using Formula (15-52) this result can also be written

    ||J_p(λ_k x)||^2 = (1/2)[J_{p+1}(λ_k)]^2.    (15-75)
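Formula (15-75), together with the orthogonality established above, can be spot-checked by quadrature. The sketch below is ours; it takes p = 0, finds the first two zeros of J_0 by bisection, and integrates with a composite Simpson rule.

```python
import math

def J(n, x, terms=60):
    return sum((-1) ** k / (math.factorial(k) * math.factorial(n + k))
               * (x / 2) ** (2 * k + n) for k in range(terms))

def bisect(f, a, b, steps=80):
    fa = f(a)
    for _ in range(steps):
        m = 0.5 * (a + b)
        if fa * f(m) <= 0:
            b = m
        else:
            a, fa = m, f(m)
    return 0.5 * (a + b)

def simpson(f, a, b, m=400):
    # composite Simpson rule, m even
    h = (b - a) / m
    s = f(a) + f(b)
    s += 4 * sum(f(a + (2 * j - 1) * h) for j in range(1, m // 2 + 1))
    s += 2 * sum(f(a + 2 * j * h) for j in range(1, m // 2))
    return s * h / 3

lam1 = bisect(lambda x: J(0, x), 2, 3)
lam2 = bisect(lambda x: J(0, x), 5, 6)
norm_sq = simpson(lambda x: J(0, lam1 * x) ** 2 * x, 0.0, 1.0)
inner = simpson(lambda x: J(0, lam1 * x) * J(0, lam2 * x) * x, 0.0, 1.0)
```

The weighted inner product of J_0(λ_1 x) with J_0(λ_2 x) comes out at the level of the quadrature error, while the squared norm matches (1/2)[J_1(λ_1)]^2.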

Case 2. μ_k the kth positive zero of J_p':

    ||J_p(μ_k x)||^2 = ((μ_k^2 - p^2)/(2μ_k^2)) [J_p(μ_k)]^2.    (15-76)

* This formula is applied when working on the interval [0, b].



In particular, when p = 0, we have

    ||J_0(μ_k x)||^2 = (1/2)[J_0(μ_k)]^2,    (15-77)

and

    ||1||^2 = ∫_0^1 x dx = 1/2.    (15-78)

Case 3. ν_k the kth positive zero of λJ_p'(λ) - hJ_p(λ):

    ||J_p(ν_k x)||^2 = ((h^2 + ν_k^2 - p^2)/(2ν_k^2)) [J_p(ν_k)]^2.    (15-79)

EXERCISES

1. Verify that Eq. (15-63) is Bessel's equation of order p in the variable λx.

2. Apply the argument given in Section 12-8 to prove that the functions J_p(ν_k x), k = 1, 2, ..., discussed in Case 3 above are mutually orthogonal in PC[0, 1] with respect to the weight function x.

3. Prove that x^p is orthogonal in PC[0, 1] to each of the functions J_p(ν_k x) when h = p in (15-70). What does this imply about the set of functions J_p(ν_k x), k = 1, 2, ...?

15-10 BESSEL SERIES OF THE FIRST AND SECOND KINDS

Now that we have established the existence of an infinite set of mutually orthogonal eigenfunctions for each of the boundary-value problems introduced above, we are faced with the task of determining whether these sets of functions are bases for PC[0, 1]. With but one exception the answer is yes, and in general the results in this connection are analogous to those we have encountered in similar situations in the past. This time, however, we can do no more than state the relevant theorems, since their proofs are far too difficult to be given here. The first and most important of them reads as follows.

Theorem 15-6. Let λ1 < λ2 < ⋯ be the positive zeros of Jp(x), and suppose that p ≥ 0. Then the functions Jp(λkx), k = 1, 2, ..., are a basis for PC[0, 1], and every function in this space can be written uniquely in the form

    f(x) = Σ_{k=1}^∞ ck Jp(λkx),    (15-80)

where the series in question converges in the mean to f, and

    ck = (2/[Jp+1(λk)]²) ∫₀¹ f(x)Jp(λkx) x dx.    (15-81)

622 BOUNDARY-VALUE PROBLEMS INVOLVING BESSEL FUNCTIONS | CHAP. 15

Moreover, if f is piecewise smooth this series converges pointwise to

    ½[f(x⁻) + f(x⁺)]

for each x in the interval (0, 1), and the convergence is uniform on every closed subinterval of (0, 1) which does not contain a point of discontinuity of f.

Remark. The series expansion given by (15-80) and (15-81) is also valid when −½ < p < 0, even though the Jp(λkx) do not belong to PC[0, 1] for these values of p.
The series described in this theorem is known as the Bessel or Fourier-Bessel series for f of the first kind with respect to the functions Jp(λkx). The reader should note that this result actually provides us with infinitely many different series of this kind, one for each admissible value of p.

Example 1. Find the Bessel series expansion with respect to Jp(λkx) for the function x^p, p ≥ 0.

In this case the coefficients in the series are given by the formula

    ck = (2/[Jp+1(λk)]²) ∫₀¹ x^{p+1} Jp(λkx) dx,

and can be evaluated by setting t = λkx and using Formula (15-49), as follows:

    ∫₀¹ x^{p+1}Jp(λkx) dx = (1/λk^{p+2}) ∫₀^{λk} t^{p+1}Jp(t) dt
                          = (1/λk^{p+2}) ∫₀^{λk} (d/dt)[t^{p+1}Jp+1(t)] dt
                          = (1/λk^{p+2}) [t^{p+1}Jp+1(t)]₀^{λk}
                          = Jp+1(λk)/λk.

Thus

    ck = 2/(λk Jp+1(λk)), k = 1, 2, ...,

and

    x^p = 2[ Jp(λ1x)/(λ1Jp+1(λ1)) + Jp(λ2x)/(λ2Jp+1(λ2)) + Jp(λ3x)/(λ3Jp+1(λ3)) + ⋯ ],    (15-82)

where the series converges in the mean in PC[0, 1], pointwise in (0, 1), and uniformly on any closed subinterval of (0, 1).

In particular, when p = 0, (15-82) yields the formula

    1 = 2 Σ_{k=1}^∞ J0(λkx)/(λk J1(λk)), 0 ≤ x < 1.    (15-83)
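Formula (15-83) can be spot-checked by computing the coefficients (15-81) for f(x) = 1 numerically and comparing them with the closed form 2/(λkJ1(λk)). The sketch below is our illustration (Python standard library only), not part of the original text.

```python
import math

def bessel_j(p, x, terms=40):
    # Power-series evaluation of J_p(x).
    return sum((-1) ** m * (x / 2) ** (2 * m + p)
               / (math.factorial(m) * math.gamma(m + p + 1)) for m in range(terms))

def bisect(f, a, b, tol=1e-12):
    # Bisection root finder; assumes a sign change on [a, b].
    fa = f(a)
    while b - a > tol:
        mid = 0.5 * (a + b)
        if fa * f(mid) <= 0:
            b = mid
        else:
            a, fa = mid, f(mid)
    return 0.5 * (a + b)

def simpson(f, a, b, n=2000):
    # Composite Simpson's rule.
    h = (b - a) / n
    return (f(a) + f(b) + sum((4 if i % 2 else 2) * f(a + i * h)
                              for i in range(1, n))) * h / 3

results = []
for lo, hi in [(2.0, 3.0), (5.0, 6.0), (8.0, 9.0)]:
    lam = bisect(lambda x: bessel_j(0, x), lo, hi)
    # (15-81) with f(x) = 1, dividing by the Case 1 norm (15-75)
    ck = simpson(lambda x: bessel_j(0, lam * x) * x, 0.0, 1.0) \
         / (0.5 * bessel_j(1, lam) ** 2)
    results.append((lam, ck, 2 / (lam * bessel_j(1, lam))))

for lam, ck, closed in results:
    print(round(lam, 6), round(ck, 8), round(closed, 8))
```

For each of the first three zeros, the numerically computed coefficient matches the closed form from (15-83).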

Example 2. Expand x², 0 ≤ x ≤ 1, as a series in J0(λkx).

Here

    ck = (2/[J1(λk)]²) ∫₀¹ x³ J0(λkx) dx,

and, reasoning as in the preceding example, we find that

    ∫₀¹ x³J0(λkx) dx = (1/λk⁴) ∫₀^{λk} t³J0(t) dt
                     = (1/λk⁴) ∫₀^{λk} t² (d/dt)[tJ1(t)] dt
                     = (1/λk⁴) { [t³J1(t)]₀^{λk} − 2 ∫₀^{λk} t²J1(t) dt }.

But

    ∫₀^{λk} t²J1(t) dt = −∫₀^{λk} t² (d/dt)[J0(t)] dt
                       = −λk²J0(λk) + 2 ∫₀^{λk} tJ0(t) dt
                       = 2 ∫₀^{λk} (d/dt)[tJ1(t)] dt        (since J0(λk) = 0)
                       = 2λk J1(λk),

and it follows that

    ∫₀¹ x³J0(λkx) dx = J1(λk)/λk − (4/λk³) J1(λk).

Thus

    ck = (2/J1(λk)) (1/λk − 4/λk³), k = 1, 2, ...,

and

    x² = 2 Σ_{k=1}^∞ ((λk² − 4)/(λk³ J1(λk))) J0(λkx), 0 ≤ x < 1.
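The integral evaluated in this example invites a numerical check. The sketch below (our addition, not the book's) compares Simpson's-rule values of ∫₀¹ x³J0(λkx) dx with (λk² − 4)J1(λk)/λk³ for the first two zeros of J0.

```python
import math

def bessel_j(p, x, terms=40):
    # Power-series evaluation of J_p(x).
    return sum((-1) ** m * (x / 2) ** (2 * m + p)
               / (math.factorial(m) * math.gamma(m + p + 1)) for m in range(terms))

def bisect(f, a, b, tol=1e-12):
    # Bisection root finder; assumes a sign change on [a, b].
    fa = f(a)
    while b - a > tol:
        mid = 0.5 * (a + b)
        if fa * f(mid) <= 0:
            b = mid
        else:
            a, fa = mid, f(mid)
    return 0.5 * (a + b)

def simpson(f, a, b, n=2000):
    # Composite Simpson's rule.
    h = (b - a) / n
    return (f(a) + f(b) + sum((4 if i % 2 else 2) * f(a + i * h)
                              for i in range(1, n))) * h / 3

vals = []
for lo, hi in [(2.0, 3.0), (5.0, 6.0)]:
    lam = bisect(lambda x: bessel_j(0, x), lo, hi)
    numeric = simpson(lambda x: x ** 3 * bessel_j(0, lam * x), 0.0, 1.0)
    closed = (lam ** 2 - 4) * bessel_j(1, lam) / lam ** 3
    vals.append((numeric, closed))
    print(round(lam, 6), round(numeric, 10), round(closed, 10))
```

Both evaluations of the integral agree, confirming the integration by parts carried out above.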

Turning now to the functions Jp(μkx), we have

Theorem 15-7. Let μ1 < μ2 < ⋯ be the positive zeros of Jp′(x), p ≥ 0. Then the functions

    Jp(μkx), k = 1, 2, ..., p > 0,

and

    1, J0(μkx), k = 1, 2, ...,

are bases for PC[0, 1], and series expansions relative to these bases converge in exactly the same way as the series described in the preceding theorem.

In particular, it now follows from Formulas (15-76) and (15-77) that a function f in PC[0, 1] can be written in the form

    f(x) = Σ_{k=1}^∞ ck Jp(μkx), p > 0,    (15-84)

with

    ck = (2μk²/((μk² − p²)[Jp(μk)]²)) ∫₀¹ f(x)Jp(μkx) x dx,    (15-85)

and

    f(x) = 2c0 + Σ_{k=1}^∞ ck J0(μkx),    (15-86)

with

    c0 = ∫₀¹ f(x) x dx,  ck = (2/[J0(μk)]²) ∫₀¹ f(x)J0(μkx) x dx, k > 0.    (15-87)

A series of this type is usually said to be a Bessel series expansion of f of the second kind, or a Dini series expansion of f.
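For f(x) ≡ 1 the Dini expansion (15-86)–(15-87) collapses to the single constant term, since ∫₀¹ J0(μkx) x dx = J1(μk)/μk vanishes at every zero μk of J1 = −J0′. The short check below is our illustration (Python standard library only), not part of the original text.

```python
import math

def bessel_j(p, x, terms=40):
    # Power-series evaluation of J_p(x).
    return sum((-1) ** m * (x / 2) ** (2 * m + p)
               / (math.factorial(m) * math.gamma(m + p + 1)) for m in range(terms))

def bisect(f, a, b, tol=1e-12):
    # Bisection root finder; assumes a sign change on [a, b].
    fa = f(a)
    while b - a > tol:
        mid = 0.5 * (a + b)
        if fa * f(mid) <= 0:
            b = mid
        else:
            a, fa = mid, f(mid)
    return 0.5 * (a + b)

def simpson(f, a, b, n=2000):
    # Composite Simpson's rule.
    h = (b - a) / n
    return (f(a) + f(b) + sum((4 if i % 2 else 2) * f(a + i * h)
                              for i in range(1, n))) * h / 3

mu1 = bisect(lambda x: bessel_j(1, x), 3.0, 4.0)   # first positive zero of J1 = -J0'
c0 = simpson(lambda x: 1.0 * x, 0.0, 1.0)          # = 1/2, so the constant term 2*c0 = 1
c1 = 2 / bessel_j(0, mu1) ** 2 * simpson(lambda x: bessel_j(0, mu1 * x) * x, 0.0, 1.0)
print(mu1, 2 * c0, c1)   # c1 vanishes: the Dini series of f = 1 reduces to the constant 1
```

This is consistent with (15-87): the constant function is already one of the basis elements for the second-kind expansion.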
Finally, a similar result holds for the functions Jp(νkx) discussed in Case 3 of the preceding section, provided we insist that p be greater than h. On the other hand, when p ≤ h, an additional function must be added to the Jp(νkx) to obtain a basis for PC[0, 1]. When p = h, the function in question is x^p, or any nonzero multiple of it (see Exercise 3, Section 15-9); but when p < h, it is considerably more complicated.* However, once this function is included with the Jp(νkx), the analog of Theorems 15-6 and 15-7 goes through unchanged.

* It is, in fact, Ip(ν0x), where ν0 is chosen so that iν0 is a root of the equation λJp′(λ) − hJp(λ) = 0, and Ip is the so-called "modified" Bessel function of order p of the first kind. (See Exercises 6 through 8 of Section 15-8.)

EXERCISES

1. Prove that for n > 1

    ∫₀¹ xⁿ J0(λx) dx = J1(λ)/λ + ((n − 1)/λ²) J0(λ) − ((n − 1)²/λ²) ∫₀¹ x^{n−2} J0(λx) dx.

2. Use the formula given in Exercise 1 to show that

    x = 2 Σ_{k=1}^∞ [ 1/(λk J1(λk)) − (1/(λk² [J1(λk)]²)) ∫₀¹ J0(λk t) dt ] J0(λkx),

for 0 < x < 1.

3. (a) Show that

    ∫₀¹ x J0(λx) J0(λkx) dx = λk J0(λ) J1(λk) / (λk² − λ²)

whenever λk is a zero of J0.

(b) Use the result in (a) to deduce that

    J0(λx) = 2J0(λ) Σ_{k=1}^∞ (λk/((λk² − λ²) J1(λk))) J0(λkx),

for 0 ≤ x < 1.

4. Prove that

    1 − x² = 8 Σ_{k=1}^∞ J0(λkx)/(λk³ J1(λk))

for 0 ≤ x ≤ 1.

5. Expand the function x^p, p > 0, in PC[0, 1] as a series involving the functions Jp(μkx).

6. Prove that

    ln x = −2 Σ_{k=1}^∞ J0(λkx)/(λk² [J1(λk)]²)

for 0 < x < 1. [Hint: Use integration by parts to evaluate ∫₀¹ x ln x J0(λkx) dx.]

7. (a) Prove that

    x^{p+1} = 4(p + 1) Σ_{k=1}^∞ Jp+1(λkx)/(λk² Jp+1(λk))

for all p ≥ 0, and all x in (0, 1).

(b) Use the result in (a) to show that

    Σ_{k=1}^∞ Jp+2(λkx)/(λk Jp+1(λk)) = 0

for 0 < x < 1. [Hint: Multiply the formula in (a) by x^{−(p+1)} and differentiate.]

15-11 LAPLACE'S EQUATION IN CYLINDRICAL COORDINATES

At this point it should be abundantly clear that we have assembled more than enough information to solve boundary-value problems involving Bessel's equation. Indeed, given what we now know, this is largely a matter of routine computation, and involves little that could not be left to the reader's imagination, and the exercises. But, in the interest of completeness, we shall devote the following pages to a brief discussion of several problems of this type, producing a formal series solution for each of them. Of course, once this has been done we are still faced with the task of determining conditions under which these series actually satisfy the problems they purport to solve. Here, however, the technical difficulties are formidable, and force us to be content with the vague statement that all of our results are valid whenever the functions involved are sufficiently smooth.
This said, we turn our attention to Laplace's equation in cylindrical regions, which we first propose to solve under the assumption that the solutions are independent of the polar angle θ. In this case the relevant version of Laplace's equation is

    ∂²u/∂r² + (1/r) ∂u/∂r + ∂²u/∂z² = 0    (15-88)

(see Section 15-1), and we remind the reader that its solutions can be interpreted as steady-state temperature distributions in the region in question.

Example 1. Solve Eq. (15-88) in the cylindrical region r ≤ 1, 0 ≤ z ≤ a (Fig. 15-4) under the assumption that

    u(1, z) = 0,
    u(r, a) = 0,    (15-89)
    u(r, 0) = f(r).

Applying the method of separation of variables with u(r, z) = R(r)Z(z) we obtain the equations

    R″ + (1/r)R′ + λ²R = 0,
    Z″ − λ²Z = 0,    (15-90)

λ a (positive) constant, and the endpoint conditions

    R(1) = 0,
    Z(a) = 0.    (15-91)

The first of these equations is Bessel's equation of order zero in the variable λr, and has

    R(r) = AJ0(λr) + BY0(λr),

FIGURE 15-4


15-11 | LAPLACE'S EQUATION IN CYLINDRICAL COORDINATES 627

A and B constants, as its general solution. The requirement that R be continuous at the origin forces us to set B = 0, while (15-91) implies that J0(λ) = 0. Thus the admissible values of λ are the positive zeros of J0, and we have shown that up to constant multiples R must be one of the functions

    Rk(r) = J0(λkr), k = 1, 2, ....

Moreover, when λ = λk the general solution of

    Z″ − λk²Z = 0, Z(a) = 0,

is

    Zk(z) = Ak sinh λk(a − z),

where Ak is an arbitrary constant (see p. 549). Thus the functions

    uk(r, z) = Ak sinh λk(a − z) J0(λkr), k = 1, 2, ...,

are solutions of (15-88) and satisfy the boundary conditions u(1, z) = u(r, a) = 0. To accommodate the remaining boundary condition we now form the series

    u(r, z) = Σ_{k=1}^∞ Ak sinh λk(a − z) J0(λkr),    (15-92)

set z = 0, and replace u(r, 0) by f(r) to obtain

    f(r) = Σ_{k=1}^∞ Ak sinh (λka) J0(λkr).

By our earlier results we know that this equation can be satisfied for any f in PC[0, 1] by letting Ak sinh (λka) be the kth coefficient in the series expansion of f relative to the functions J0(λkr). Hence

    Ak = (2/(sinh (λka) [J1(λk)]²)) ∫₀¹ f(r) J0(λkr) r dr,

and we are done.
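Each separated solution can be checked directly against (15-88). The sketch below is our illustration, not the book's; it uses the arbitrary height a = 2, approximates the cylindrical Laplacian of sinh λ1(a − z) · J0(λ1r) by central differences at an interior point, and confirms that both it and the boundary value at r = 1 are numerically zero.

```python
import math

def bessel_j(p, x, terms=40):
    # Power-series evaluation of J_p(x).
    return sum((-1) ** m * (x / 2) ** (2 * m + p)
               / (math.factorial(m) * math.gamma(m + p + 1)) for m in range(terms))

def bisect(f, a, b, tol=1e-12):
    # Bisection root finder; assumes a sign change on [a, b].
    fa = f(a)
    while b - a > tol:
        mid = 0.5 * (a + b)
        if fa * f(mid) <= 0:
            b = mid
        else:
            a, fa = mid, f(mid)
    return 0.5 * (a + b)

lam1 = bisect(lambda x: bessel_j(0, x), 2.0, 3.0)   # first positive zero of J0
a = 2.0                                             # cylinder height (arbitrary choice)

def u(r, z):
    # one separated solution: sinh(lam1*(a - z)) * J0(lam1*r)
    return math.sinh(lam1 * (a - z)) * bessel_j(0, lam1 * r)

r0, z0, h = 0.6, 0.7, 1e-3
u_rr = (u(r0 + h, z0) - 2 * u(r0, z0) + u(r0 - h, z0)) / h ** 2
u_r = (u(r0 + h, z0) - u(r0 - h, z0)) / (2 * h)
u_zz = (u(r0, z0 + h) - 2 * u(r0, z0) + u(r0, z0 - h)) / h ** 2
residual = u_rr + u_r / r0 + u_zz    # left side of (15-88)
print(residual)                      # near zero
print(u(1.0, 0.5))                   # J0(lam1) = 0 forces the boundary value to vanish
```

The finite-difference residual is at the level of the discretization error, as expected for an exact solution of (15-88).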

Example 2. Find the steady-state temperature in the cylindrical region described above, given that

    u(r, a) = 0,
    u_r(1, z) = hu(1, z), h a constant,    (15-93)
    u(r, 0) = f(r).

Physically these conditions assert that the base of the cylinder is maintained at a known temperature f(r), the top at zero, and that the lateral surface exchanges heat freely with the surrounding medium at a rate proportional to the temperature of the surface.

This time we must solve the pair of equations in (15-90) in the presence of the endpoint conditions

    R′(1) − hR(1) = 0,
    Z(a) = 0.    (15-94)

Reasoning as before, we find that up to constant multiples,

    R(r) = J0(λr),

and (15-94) now implies that λ must be a root of the equation

    λJ0′(λ) − hJ0(λ) = 0.

Again our earlier results allow us to assert the existence of infinitely many such roots ν1 < ν2 < ⋯, all of which are positive, and with them, solutions

    Rk(r) = J0(νkr), k = 1, 2, ....

Since the corresponding solutions of the equation involving Z are

    Zk(z) = Ak sinh νk(a − z),

we find that

    u(r, z) = Σ_{k=1}^∞ Ak sinh νk(a − z) J0(νkr),    (15-95)

where the Ak are chosen so that

    f(r) = Σ_{k=1}^∞ Ak sinh (νka) J0(νkr).

(Note that the existence of this series for any f in PC[0, 1] is guaranteed only if h < 0. Otherwise a more complicated series must be used.) Thus, by Formula (15-79), we have

    Ak = (2νk²/((h² + νk²)[J0(νk)]² sinh (νka))) ∫₀¹ f(r) J0(νkr) r dr,    (15-96)

and the solution is complete.

Example 3. As our final example of this type we again solve Eq. (15-88) in the region r ≤ 1, 0 ≤ z ≤ a, but now impose the boundary conditions

    u(r, 0) = 0,
    u(r, a) = 0,    (15-97)
    u(1, z) = f(z).


This time the method of separation of variables leads to the pair of equations

    Z″ + λ²Z = 0,
    R″ + (1/r)R′ − λ²R = 0,    (15-98)

and endpoint conditions Z(0) = Z(a) = 0. (Here the choice of sign on λ² is dictated by the requirement that the solutions vanish at the endpoints of the cylinder.) It follows at once that the only possible solutions of the first equation are

    Zk(z) = Ak sin (kπz/a), k = 1, 2, ...,

corresponding to the eigenvalues λk = kπ/a, and it remains to solve

    R″ + (1/r)R′ − λ²R = 0    (15-99)

for each of these values of λ. To this end we make the change of variable t = iλr (i = √−1), and rewrite (15-99) as

    t² (d²R/dt²) + t (dR/dt) + t²R = 0,

or

    d²R/dt² + (1/t)(dR/dt) + R = 0,

which we recognize as Bessel's equation of order zero in the variable t. Thus, up to constant multiples, the only solutions of (15-99) which are continuous at the origin are of the form

    R(r) = J0(iλr).

Using the formula for the series expansion of J0, we have

    J0(iλr) = Σ_{k=0}^∞ (−1)^k (iλr)^{2k}/(2^{2k}(k!)²) = Σ_{k=0}^∞ (λr)^{2k}/(2^{2k}(k!)²),

and it follows that J0(iλr) is a real-valued function after all. In view of this fact it is reasonable to adopt a notation which does not (misleadingly) involve i, and so we set

    I0(x) = Σ_{k=0}^∞ x^{2k}/(2^{2k}(k!)²).    (15-100)

The function I0 (which the student should regard as being defined by this series) is called the modified Bessel function of order zero of the first kind (see Exercise 6, Section 15-8), and in the present case allows us to write

    R(r) = I0(λr).

In particular, when λ = λk = kπ/a,

    Rk(r) = I0(kπr/a),

and we conclude that the solution of the boundary-value problem under discussion is of the form

    u(r, z) = Σ_{k=1}^∞ Ak I0(kπr/a) sin (kπz/a).    (15-101)

Finally, to determine the values of Ak we set r = 1 and u(1, z) = f(z) to obtain

    f(z) = Σ_{k=1}^∞ Ak I0(kπ/a) sin (kπz/a).

Thus the Ak are determined by the requirement that Ak I0(kπ/a) be the kth coefficient in the Fourier sine series expansion of f on the interval [0, a], and we have

    Ak = (2/(a I0(kπ/a))) ∫₀^a f(z) sin (kπz/a) dz.    (15-102)
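Formula (15-102) is easy to exercise numerically. In the sketch below (our illustration; the height a = 2 and the boundary data f(z) = sin(πz/a) are arbitrary choices) the single-mode data make every coefficient but A1 vanish, and the reconstructed boundary value u(1, z) reproduces f.

```python
import math

def simpson(f, a, b, n=2000):
    # Composite Simpson's rule.
    h = (b - a) / n
    return (f(a) + f(b) + sum((4 if i % 2 else 2) * f(a + i * h)
                              for i in range(1, n))) * h / 3

def bessel_i0(x, terms=40):
    # I0(x) as defined by the series (15-100)
    return sum((x / 2) ** (2 * k) / math.factorial(k) ** 2 for k in range(terms))

a = 2.0                                   # cylinder height (arbitrary choice)
f = lambda z: math.sin(math.pi * z / a)   # boundary data with a single sine mode

# Coefficients from (15-102) for k = 1, 2, 3
A = [2 / (a * bessel_i0(k * math.pi / a))
     * simpson(lambda z: f(z) * math.sin(k * math.pi * z / a), 0.0, a)
     for k in (1, 2, 3)]
print(A)   # only the k = 1 coefficient survives

# Reconstruct u(1, z) at z = a/4 from (15-101) and compare with f
z0 = a / 4
u_boundary = sum(A[k - 1] * bessel_i0(k * math.pi / a)
                 * math.sin(k * math.pi * z0 / a) for k in (1, 2, 3))
print(u_boundary, f(z0))
```

Because the sine modes are orthogonal on [0, a], A1·I0(π/a) equals 1 and the other coefficients are zero to within quadrature error.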

EXERCISES

1. Find the steady-state temperature in the cylindrical region r ≤ 1, 0 ≤ z ≤ a, given that the solution u is independent of θ, and

    u(r, 0) = f(r), u(r, a) = g(r), u(1, z) = 0.

2. Solve the problem in Exercise 1 under the boundary conditions

    u(r, 0) = f(r), u(r, a) = g(r), u(1, z) = 100.

3. Find the steady-state temperature distribution u = u(r, z) in the cylinder r ≤ 1, 0 ≤ z ≤ a, given that

    u(r, 0) = 100, u(r, a) = 0, u(1, z) = 0.

4. Find the steady-state temperature distribution u = u(r, z) in the cylindrical region r ≤ 1, 0 ≤ z ≤ a, given that

    u(r, 0) = f(r), u(r, a) = 100, u_r(1, z) = 0.



(The last boundary condition asserts that the lateral surface of the cylinder is insulated so that no heat flows across it.)

5. Discuss the nature of the solutions of Laplace's equation in the cylindrical region r ≤ 1, 0 ≤ z ≤ a, which depend upon only one of the variables r, θ, z. Which of them can describe steady-state temperature distributions in the cylinder?

15-12 THE VIBRATING CIRCULAR MEMBRANE


In this section we consider the problem of describing the motion of a vibrating circular membrane, such as the head of a drum, under the assumption that the membrane is held fixed on the boundary of the circle and given a definite displacement and velocity at time t = 0. In other words, we propose to solve the two-dimensional wave equation

    ∂²u/∂r² + (1/r) ∂u/∂r + (1/r²) ∂²u/∂θ² = (1/a²) ∂²u/∂t², a > 0,    (15-103)

in a circular region, which for convenience we take to be r ≤ 1, given that

    u(1, θ, t) = 0,
    u(r, θ, 0) = f(r, θ),    (15-104)
    u_t(r, θ, 0) = g(r, θ).

We again begin by seeking solutions for (15-103) which are independent of θ, in which case the equation and boundary conditions become

    ∂²u/∂r² + (1/r) ∂u/∂r = (1/a²) ∂²u/∂t², a > 0,    (15-105)

and

    u(1, t) = 0,
    u(r, 0) = f(r),    (15-106)
    u_t(r, 0) = g(r).

Arguing in the usual fashion we set u(r, t) = R(r)T(t), and find that (15-105) becomes

    (R″ + (1/r)R′)/R = (1/a²)(T″/T) = −λ²,

where λ is a positive constant. (The choice of sign here is forced upon us by the requirement that T be periodic.) Thus R and T must be solutions of the equations

    R″ + (1/r)R′ + λ²R = 0,    (15-107)
    T″ + λ²a²T = 0,    (15-108)



and, in addition, R must satisfy the boundary condition R(1) = 0. Starting with the general solution

    R(r) = AJ0(λr) + BY0(λr)

of (15-107), we first set B = 0 in order to ensure the continuity of R at the origin, and then apply the given boundary condition to deduce that, up to constant multiples, R must be one of the functions

    Rk(r) = J0(λkr), k = 1, 2, ...,

where λk is the kth positive zero of J0. Moreover, with these as the values of λ the solutions of (15-108) are

    Tk(t) = Ak cos (λkat) + Bk sin (λkat), k = 1, 2, ...,

Ak, Bk constants, and it follows that the functions

    uk(r, t) = [Ak cos (λkat) + Bk sin (λkat)] J0(λkr),

k = 1, 2, ..., satisfy (15-105) and the boundary condition u(1, t) = 0.

To find a solution u(r, t) which also satisfies the given initial conditions we set

    u(r, t) = Σ_{k=1}^∞ [Ak cos (λkat) + Bk sin (λkat)] J0(λkr),    (15-109)

and attempt to determine the Ak and Bk so that

    u(r, 0) = f(r),
    u_t(r, 0) = g(r).

Thus when t = 0, (15-109) must reduce to

    f(r) = Σ_{k=1}^∞ Ak J0(λkr),

and we find that

    Ak = (2/[J1(λk)]²) ∫₀¹ f(r) J0(λkr) r dr.    (15-110)

A similar argument applied to the series obtained by differentiating (15-109) term-by-term with respect to t yields

    Bk = (2/(λka [J1(λk)]²)) ∫₀¹ g(r) J0(λkr) r dr,    (15-111)

and the problem is solved.




We now turn to the general case with solutions dependent upon θ. Here we start by setting u(r, θ, t) = R(r)Θ(θ)T(t) in (15-103) to obtain

    (R″ + (1/r)R′)/R + (1/r²)(Θ″/Θ) = (1/a²)(T″/T) = −λ²,

where, as before, λ is a positive constant. From this we obtain the pair of equations

    T″ + λ²a²T = 0,
    R″ + (1/r)R′ + (λ² + (1/r²)(Θ″/Θ))R = 0,

the second of which we rewrite as

    (R″ + (1/r)R′ + λ²R)/((1/r²)R) = −Θ″/Θ = μ²,

where μ ≥ 0 is again a constant. (The choice of sign is dictated by the fact that Θ must be periodic with period 2π.) Thus we must solve the equations

    T″ + λ²a²T = 0,    (15-112)
    r²R″ + rR′ + (λ²r² − μ²)R = 0,    (15-113)
    Θ″ + μ²Θ = 0,    (15-114)

subject to the boundary condition

    R(1) = 0.    (15-115)

Starting with (15-114), we use the periodicity of Θ to deduce that the only admissible values of μ are

    μn = n, n = 0, 1, 2, ...,

and hence that Θ must be a linear combination of the functions

    cos nθ, sin nθ.

Furthermore, when μ = n, (15-113) is Bessel's equation of order n, and (15-115) implies that its solutions must be constant multiples of the function

    Rnk(r) = Jn(λnkr), k = 1, 2, ...,

where the λnk are the positive zeros of Jn. (We again reject solutions involving Yn because of their discontinuity at the origin.) Finally, when λ = λnk, (15-112) yields

    Tnk(t) = Ank cos (aλnkt) + Bnk sin (aλnkt),

where Ank and Bnk are arbitrary constants. Putting these results together, we conclude that the functions

    unk(r, θ, t) = [Ank cos (aλnkt) + Bnk sin (aλnkt)] cos (nθ) Jn(λnkr)

and

    vnk(r, θ, t) = [A*nk cos (aλnkt) + B*nk sin (aλnkt)] sin (nθ) Jn(λnkr),

    n = 0, 1, 2, ..., k = 1, 2, ...,

are solutions of (15-103) which satisfy the boundary condition u(1, θ, t) = 0. To complete the solution we now set

    u(r, θ, t) = Σ_{n=0}^∞ Σ_{k=1}^∞ [unk(r, θ, t) + vnk(r, θ, t)],    (15-116)

and determine the coefficients Ank, Bnk, A*nk, B*nk so that u(r, θ, 0) = f(r, θ) and u_t(r, θ, 0) = g(r, θ). Thus, when t = 0, (15-116) must reduce to

    f(r, θ) = Σ_{n=0}^∞ Σ_{k=1}^∞ [unk(r, θ, 0) + vnk(r, θ, 0)],

or

    f(r, θ) = Σ_{n=0}^∞ { [Σ_{k=1}^∞ Ank Jn(λnkr)] cos nθ + [Σ_{k=1}^∞ A*nk Jn(λnkr)] sin nθ }.    (15-117)

To evaluate these coefficients, we expand f(r, θ) in a Fourier series with respect to the variable θ, obtaining

    f(r, θ) = Σ_{n=0}^∞ [an(r) cos nθ + bn(r) sin nθ],    (15-118)

with

    a0(r) = (1/2π) ∫_{−π}^{π} f(r, θ) dθ,

    an(r) = (1/π) ∫_{−π}^{π} f(r, θ) cos nθ dθ,

    bn(r) = (1/π) ∫_{−π}^{π} f(r, θ) sin nθ dθ.

Comparing (15-117) and (15-118), we see that

    an(r) = Σ_{k=1}^∞ Ank Jn(λnkr), and bn(r) = Σ_{k=1}^∞ A*nk Jn(λnkr),


for all n. Hence the Ank and A*nk must be the coefficients in the Bessel series expansions of an(r) and bn(r) with respect to the functions Jn(λnkr), and we have

    Ank = (2/[Jn+1(λnk)]²) ∫₀¹ an(r) Jn(λnkr) r dr,

    A*nk = (2/[Jn+1(λnk)]²) ∫₀¹ bn(r) Jn(λnkr) r dr.

Thus

    A0k = (1/(π[J1(λ0k)]²)) ∫₀¹ ∫_{−π}^{π} f(r, θ) J0(λ0kr) r dθ dr,

    Ank = (2/(π[Jn+1(λnk)]²)) ∫₀¹ ∫_{−π}^{π} f(r, θ) cos (nθ) Jn(λnkr) r dθ dr,

    A*nk = (2/(π[Jn+1(λnk)]²)) ∫₀¹ ∫_{−π}^{π} f(r, θ) sin (nθ) Jn(λnkr) r dθ dr.

A similar computation starting with the series obtained by differentiating (15-116) term-by-term with respect to t reveals that

    B0k = (1/(aλ0kπ[J1(λ0k)]²)) ∫₀¹ ∫_{−π}^{π} g(r, θ) J0(λ0kr) r dθ dr,

    Bnk = (2/(aλnkπ[Jn+1(λnk)]²)) ∫₀¹ ∫_{−π}^{π} g(r, θ) cos (nθ) Jn(λnkr) r dθ dr,

    B*nk = (2/(aλnkπ[Jn+1(λnk)]²)) ∫₀¹ ∫_{−π}^{π} g(r, θ) sin (nθ) Jn(λnkr) r dθ dr,

and we are done.

EXERCISES

1. An object located at the point x = x0 starts from rest and moves along the x-axis under the action of a force directed toward the origin whose magnitude is proportional to the distance of the object from the origin and to its mass, and initially is m0x0. Determine the motion of the object if its mass m varies with time according to the formula m = m0(1 + t).

2. A uniform, flexible cable of length L is suspended vertically as shown in Fig. 15-5. At time t = 0 that portion of the cable between x = 0 and x = aL is given a uniform horizontal velocity v = f(x). Describe the subsequent behavior of the cable, given that its motion is governed by the equation

    ∂²y/∂t² = g (∂/∂x)(x ∂y/∂x),

where g is the acceleration due to gravity. [Note. The x-axis is directed upward with the cable suspended from the point (L, 0).]

FIGURE 15-5
3. Find the steady-state temperature distribution u(r, θ, z) in the cylindrical region r ≤ 1, 0 ≤ z ≤ a, given that

    u(1, θ, z) = 0, u(r, θ, a) = 0, u(r, θ, 0) = f(r, θ).

4. Find the temperature u(r, θ, t) in the two-dimensional region shown in Fig. 15-6, given that the boundary of the region is maintained at a temperature of 0°, and that at time t = 0 the interior is at a uniform temperature of 100°.

FIGURE 15-6   FIGURE 15-7   FIGURE 15-8

5. Find the equation of motion of a vibrating membrane of the shape shown in Fig. 15-7, under the assumption that the membrane is held fixed along its boundary and is released from rest at time t = 0 from a known position.

6. Solve Exercise 5 when the membrane is also given a known initial velocity.

7. Generalize the results of Exercises 5 and 6 to the case of an arbitrary wedge-shaped region with central angle α.

8. Find the steady-state temperature distribution in the semicylindrical region shown in Fig. 15-8, given that the temperature on the upper face is held at a known value f(r, θ), while that on all the remaining faces is zero.
APPENDIX I

infinite series

1-1 INTRODUCTION
The objective of this appendix is to provide a reasonably complete account of the material relating to the convergence of sequences and series that was used in the body of the text. From the standpoint of logical completeness this discussion ought to begin with a detailed study of the real number system, including its construction out of the rationals. This, however, is a lengthy undertaking which properly belongs in a text on advanced calculus. Hence, rather than attempt it here, we shall assume that the student has a working knowledge of the real number system, at least to the extent normally taught in a first course in calculus, and we shall base our discussion upon it.

Actually it is possible to give a rigorous, self-contained account of the theory of sequential convergence if one is willing to accept, as given, a set ℛ of objects called real numbers which can be added, subtracted, multiplied, and divided according to the familiar rules of arithmetic. In addition, it is necessary to assume (a) that ℛ contains the ordinary integers as a subset, (b) that ℛ is ordered by a relation < having all the properties usually associated with this symbol, and (c) that ℛ satisfies the so-called least upper bound principle which we now proceed to state.

Definition 1-1. A real number b is said to be an upper bound for a (nonempty) set S of real numbers if and only if s ≤ b for all s in S. If, in addition, no number smaller than b is an upper bound for S, then b is said to be a least upper bound (l.u.b.) for S. (The terms lower bound and greatest lower bound (g.l.b.) are defined similarly.)

In these terms the least upper bound principle (which, by the way, is actually a theorem concerning the real numbers) reads as follows:

Least upper bound principle. Every (nonempty) set of real numbers which is bounded from above has a least upper bound. (Again there is a companion statement concerning lower bounds which we omit.)

And once this statement has been accepted we are back on solid ground, where theorems can be proved and definitions given without further gaps in the reasoning.
638 INFINITE SERIES | APPENDIX I

Needless to say, we shall not attempt to give a complete treatment of the several topics mentioned above, since this would entail writing an entire text, or more, on advanced calculus. Neither shall we prove every assertion that is made as we progress, since this too would result in a labored discussion. We do, however, insist upon the fact that these proofs are now within our reach, and want only time and patience to present.

1-2 SEQUENTIAL CONVERGENCE

We assume that the reader is already familiar with the notion of a sequence {ak} of real numbers, which, we recall, is simply an ordered list

    {a1, a2, ..., ak, ...}

of real numbers indexed by the positive integers, or, more formally, a real-valued function F whose domain is the positive integers, and whose value F(k) at k is ak. Actually, there is no reason to insist that the indexing always begin with the subscript one, and when convenient we shall change it without comment.

This said, we now introduce the concept of sequential convergence, as follows.

Definition 1-2. A sequence {ak} of real numbers is said to converge to the number a if, given any ε > 0, there exists an integer K (depending in general upon ε) such that

    |ak − a| < ε    (1-1)

for all k > K. When this happens we say that a is the limit of {ak}, and write

    lim_{k→∞} ak = a, or {ak} → a.

If, on the other hand, no such number exists, {ak} is said to diverge.

Implicit in the statement of this definition is the assertion that the limit of a convergent sequence is unique. To see this, suppose that {ak} converges to a, and let a′ ≠ a. Then if ε = |a − a′|/3 and if K is chosen so that |ak − a| < ε for all k > K, the only entries in {ak} which do not lie in the interval (a − ε, a + ε) are among a1, a2, ..., aK, and it follows that {ak} does not converge to a′.

Having defined the notion of convergence, we now address ourselves to the


problem of determining whether a given sequence converges or not. For this
purpose, Definition 1-2 is manifestly unsatisfactory, since it requires us to find
the limit of the sequence before we can establish its convergence. Thus it is natural
to seek a convergence criterion which can be applied directly to the terms of the
sequence themselves. One, which is easily deduced from the least upper bound
property, reads as follows.
1-2 |
SEQUENTIAL CONVERGENCE 639

Theorem 1-1. A monotonically nondecreasing sequence converges if and only if it is bounded from above, while a monotonically nonincreasing sequence converges if and only if it is bounded from below.

(Recall that {ak} is said to be monotonically nondecreasing if a1 ≤ a2 ≤ a3 ≤ ⋯; monotonically nonincreasing if a1 ≥ a2 ≥ a3 ≥ ⋯.)

Proof. Let {ak} be monotonically nondecreasing. Then if {ak} is bounded from above it has a least upper bound a, and hence, given any ε > 0, there exists an integer K such that |a − aK| < ε. Since aK ≤ ak ≤ a for all k > K, it follows that |a − ak| < ε for k > K, and {ak} → a.

Conversely, if {ak} is not bounded from above, then for each real number a we can find an integer K such that a < aK. Setting ε = |a − aK|, we have |a − ak| ≥ ε for all k ≥ K. Thus {ak} does not converge to a, and since a was arbitrary, we conclude that {ak} diverges.

This proves the first assertion in the theorem and, with obvious modifications, the second as well. |

Using this result, it is now relatively easy to establish a criterion which will

enable us to test an arbitrary sequence for convergence merely by examining its


terms. But first, a definition.

Definition 1-3. A sequence {ak} is said to be a Cauchy sequence if for each ε > 0 there exists an integer K, depending upon ε, such that

    |am − an| < ε    (1-2)

for all m, n > K.
The convergence criterion we now propose to establish asserts that the class of
Cauchy sequences is identical with the class of convergent sequences. This is easily
the most important single result on sequential convergence.

Theorem 1-2. A sequence {ak } of real numbers is convergent if and only if


it is a Cauchy sequence.

Proof. Suppose that {ak} is convergent, with a as its limit. Then, given any ε > 0, we can find an integer K such that |a − ak| < ε/2 for all k > K. Thus, if m and n are both greater than K,

    |am − an| = |(am − a) + (a − an)|
             ≤ |am − a| + |a − an|
             < ε/2 + ε/2 = ε,

and {ak} is a Cauchy sequence.



Conversely, suppose that {ak} is a Cauchy sequence. Then, in particular, {ak} is bounded from above and from below. Indeed, if ε > 0 is given, and K is chosen so that |am − an| < ε for all m, n > K, then none of the ak can be greater than the largest number among a1, a2, ..., aK, aK+1 + ε, and none can be smaller than the smallest number among a1, a2, ..., aK, aK+1 − ε. This said, let

    b1 = l.u.b. of the sequence {a1, a2, ...},
    b2 = l.u.b. of the sequence {a2, a3, ...},
    . . .
    bk = l.u.b. of the sequence {ak, ak+1, ...}.

Then b1 ≥ b2 ≥ b3 ≥ ⋯, and each bk is at least as large as the greatest lower bound of {a1, a2, ...}. Hence {bk} is a monotonic nonincreasing sequence bounded from below, and therefore has a limit a (Theorem 1-1). We now propose to show that a is also the limit of {ak}. To this end, let ε > 0 be given, and let K1 be chosen so that |am − an| < ε/3 for all m, n > K1. (The existence of such an integer follows from the assumption that {ak} is a Cauchy sequence.) Let K2 be chosen so that |a − bk| < ε/3 for all k > K2, and let K be the larger of K1 and K2. Then, if k > K,

    |a − ak| ≤ |a − bk| + |bk − ak|
            < ε/3 + |bk − ak|.

But since bk is the least upper bound of {ak, ak+1, ...}, there exists an index p ≥ k such that |bk − ap| < ε/3. Hence

    |bk − ak| ≤ |bk − ap| + |ap − ak|
             < ε/3 + ε/3 = 2ε/3.

Combining these results we have

    |a − ak| < ε/3 + 2ε/3 = ε,

and it follows that {ak} converges to a, as asserted. |

Example 1. The sequence

    {1, 1/2, 1/3, ..., 1/k, ...}

is obviously a Cauchy sequence, and hence converges. Here, of course, we could just as easily have applied Definition 1-2, since it is clear that the sequence in question converges to zero.

Example 2. The sequence {1, −1, 1, −1, ...} is not a Cauchy sequence, and hence does not converge.
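The two examples can be mimicked numerically. The sketch below (our illustration, not part of the text) measures the spread of a tail of each sequence: for {1/k} it falls below 1/K, in line with Definition 1-3, while for {1, −1, 1, −1, ...} it stays at 2 no matter how far out we look.

```python
# Tail spread of a_k = 1/k past index K is below 1/K, so the Cauchy
# condition of Definition 1-3 can be met; for a_k = (-1)^k the spread
# remains 2 in every tail, so no choice of K works for small epsilon.
K = 1000
tail = [1 / k for k in range(K + 1, K + 500)]
spread = max(tail) - min(tail)

alternating = [(-1) ** k for k in range(1, 100)]
alt_spread = max(alternating) - min(alternating)

print(spread, alt_spread)
```

The first printed number is tiny and shrinks as K grows; the second is always 2.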

The argument given in this section can be summarized by saying that the least
upper bound principle implies that every Cauchy sequence of real numbers is
convergent. It is also possible to turn things around, and deduce the least upper
bound principle from Theorem 1-2. In short, these two facts concerning the real
number system are equivalent, and either can be taken as the starting point for
the study of infinite series, and, in fact, the entire theory of real valued functions
of a real variable. We omit the proof.
We conclude this section by proving an elementary computational theorem
which will be needed below.

Theorem 1-3. Let {ak} and {bk} be convergent sequences with

    lim_{k→∞} ak = a, lim_{k→∞} bk = b.

Then

(i) {αak + βbk} is convergent for all real numbers α and β, and

    lim_{k→∞} (αak + βbk) = αa + βb;

(ii) {akbk} is convergent, and

    lim_{k→∞} akbk = ab.

Proof. We leave to the reader the easy task of verifying that {αak} → αa whenever {ak} → a. This proved, (i) will follow as soon as we show that {ak} → a and {bk} → b imply {ak + bk} → a + b. Let ε > 0 be given. Then

    |(ak + bk) − (a + b)| ≤ |ak − a| + |bk − b|,

and, by assumption, we can find integers K1, K2 such that |ak − a| < ε/2 for all k > K1, |bk − b| < ε/2 for all k > K2. Thus if K is the larger of K1 and K2, we have

    |(ak + bk) − (a + b)| < ε/2 + ε/2 = ε

for all k > K, as required.


To prove (ii) we note that

\a k b k — ab\ = \a k b k — + akb — ab\


ak b
< \ttk\ \b k -b\ + \a k - a\. \b\

But since {ak } is convergent it is a Cauchy sequence, and therefore is bounded


from above and from below. (See the proof of Theorem 1-2.) Thus there exists

a positive constant M such that |ak| ≤ M for all k, and the above inequality can be written

    |akbk − ab| ≤ M|bk − b| + |b| |ak − a|.    (1-3)

Now let ε > 0 be given, and choose integers K1 and K2 such that

    |ak − a| < ε/(2|b|)

for all k > K1, and

    |bk − b| < ε/(2M)

for all k > K2. (If b = 0, the second term in (1-3) vanishes, and we need only choose K2.) Then with K the larger of K1 and K2, and k > K,

    |akbk − ab| < M(ε/(2M)) + |b|(ε/(2|b|)) = ε,

and we are done. |

1-3 INFINITE SERIES

In this section we review the elementary facts concerning infinite series of con-
stants, including several well-known tests for the convergence of such series.

Definition 1-4. Let

$\sum_{k=1}^{\infty} a_k = a_1 + a_2 + \cdots + a_k + \cdots \qquad (1\text{-}4)$

be an infinite series of real numbers, and let $\{s_k\}$ be the associated sequence of partial sums

$s_1 = a_1,$
$s_2 = a_1 + a_2,$
$\vdots$
$s_k = a_1 + a_2 + \cdots + a_k.$

Then (1-4) is said to converge to the value $a$ if and only if $\{s_k\}$ converges to $a$. In this case we write

$a = \sum_{k=1}^{\infty} a_k,$

and say that $a$ is the sum of the series. Otherwise, (1-4) is said to diverge.

Perhaps the most familiar example of a convergent infinite series is the geometric series

$a_0 + a_0 r + a_0 r^2 + \cdots, \qquad (1\text{-}5)$

whose ratio $r$ satisfies the inequality $-1 < r < 1$. Indeed, in this case

$s_0 = a_0,$
$s_1 = a_0(1 + r),$
$\vdots$
$s_k = a_0(1 + r + r^2 + \cdots + r^k).$

But, by the identity

$(1 + r + \cdots + r^k)(1 - r) = 1 - r^{k+1},$

we have

$s_k = a_0\,\frac{1 - r^{k+1}}{1 - r}, \qquad r \neq 1,$

and it follows that

$\lim_{k\to\infty} s_k = \frac{a_0}{1 - r},$

provided $|r| < 1$. On the other hand, if $|r| \ge 1$, the sequence $\{s_k\}$ is divergent, and hence so is (1-5).

Example 1. The real number $0.33333\ldots$ is the sum of the geometric series

$\frac{3}{10} + \frac{3}{10^2} + \frac{3}{10^3} + \cdots,$

whose ratio is $\frac{1}{10}$. In this case the formula given above for $\lim_{k\to\infty} s_k$ yields

$\frac{a_0}{1 - r} = \frac{3}{10} \cdot \frac{1}{1 - \frac{1}{10}} = \frac{1}{3},$

as expected.
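The closed-form expression for $s_k$ can be checked numerically. The sketch below is our own illustration, not part of the text; the helper name `partial_sum` is an assumption of this example. It sums the series of Example 1 directly and compares the result with $a_0(1 - r^{k+1})/(1-r)$ and with the limit $a_0/(1-r) = 1/3$.

```python
# Numerical check of s_k = a0*(1 - r**(k+1))/(1 - r) for the geometric
# series of Example 1 (a0 = 3/10, r = 1/10). Illustrative sketch only.

def partial_sum(a0, r, k):
    """Sum a0 + a0*r + ... + a0*r**k term by term."""
    return sum(a0 * r**j for j in range(k + 1))

a0, r = 3 / 10, 1 / 10
limit = a0 / (1 - r)                     # predicted sum a0/(1 - r)

for k in (1, 5, 20):
    closed = a0 * (1 - r**(k + 1)) / (1 - r)
    assert abs(partial_sum(a0, r, k) - closed) < 1e-12

assert abs(partial_sum(a0, r, 50) - limit) < 1e-12
assert abs(limit - 1 / 3) < 1e-12        # the decimal 0.333... equals 1/3
```

The same comparison works for any ratio with $|r| < 1$; for $|r| \ge 1$ the partial sums simply fail to settle down.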

The question of convergence or divergence for geometric series was settled in the most satisfactory way possible, namely by obtaining a simple closed-form expression for $s_k$ and letting $k$ tend to infinity. The existence of such a formula for $s_k$, and the ability to find the actual sum of the series, is something of an accident, however, and for this reason we now turn our attention toward deriving convergence tests which may be applied directly to the terms of a series. We begin by restating Theorem 1-2 in a form appropriate to series.

Theorem 1-4. The series $\sum_{k=1}^{\infty} a_k$ converges if and only if for each $\varepsilon > 0$ there exists an integer $K$, depending on $\varepsilon$, such that

$\left|\sum_{k=m}^{n} a_k\right| < \varepsilon \qquad (1\text{-}6)$

whenever $K < m \le n$.

In fact we note that the expression $\sum_{k=m}^{n} a_k$ is just the difference $s_n - s_{m-1}$ of the partial sums $s_n$ and $s_{m-1}$. Hence the theorem states that $\sum_{k=1}^{\infty} a_k$ converges if and only if its sequence of partial sums is a Cauchy sequence, and this is the content of Theorem 1-2. ∎

Theorem 1-4 provides a general convergence criterion for series, but unfor-
tunately it is difficult to apply in practice. We devote the remainder of this section,
therefore, to the derivation of several consequences of the theorem which, although
lacking the generality of Theorem 1-4, do provide convenient tests for convergence
in a large number of cases.

Theorem 1-5. If $\sum_{k=1}^{\infty} a_k$ converges, then $\lim_{n\to\infty} a_n = 0$.

Proof. Since

$|a_n| = |s_n - s_{n-1}|,$

the theorem is an immediate consequence of Theorem 1-4. ∎

It is useful to restate the last result in the following form: if $a_k$ does not tend to zero as $k \to \infty$, then $\sum_{k=1}^{\infty} a_k$ diverges. Thus each of the series

$\sum_{n=1}^{\infty} \frac{n}{1+n}, \qquad \sum_{n=1}^{\infty} \sin n, \qquad \sum_{n=1}^{\infty} \left(1 + \frac{1}{n}\right)^n$

diverges, the first because $\lim_{n\to\infty} n/(1+n) = 1$, the second because $\lim_{n\to\infty} \sin n$ does not exist, and the last because

$\lim_{n\to\infty} \left(1 + \frac{1}{n}\right)^n = e.$

It must be emphasized, however, that the converse of Theorem 1-5 does not hold, for, as we shall see presently, the so-called harmonic series

$\sum_{k=1}^{\infty} \frac{1}{k} = 1 + \frac{1}{2} + \frac{1}{3} + \cdots$

diverges, although $\lim_{k\to\infty} 1/k = 0$. Thus even though it is necessary that $\lim_{k\to\infty} a_k = 0$ in order that $\sum_{k=1}^{\infty} a_k$ converge, this condition is not sufficient. In the next three theorems we restrict our attention to series whose terms are positive.

Theorem 1-6. (Comparison test.) Let $\sum_{k=1}^{\infty} a_k$ and $\sum_{k=1}^{\infty} b_k$ be series with positive terms.

(1) If $\sum_{k=1}^{\infty} a_k$ converges and $b_k \le a_k$ for every $k$, then $\sum_{k=1}^{\infty} b_k$ also converges.

(2) If $\sum_{k=1}^{\infty} a_k$ diverges and $b_k \ge a_k$ for every $k$, then $\sum_{k=1}^{\infty} b_k$ also diverges.

Proof. (1) Since $b_k > 0$ for all $k$, the partial sums of $\sum_{k=1}^{\infty} b_k$ form a monotonically nondecreasing sequence. But this sequence is also bounded from above, since

$\sum_{k=1}^{n} b_k \le \sum_{k=1}^{n} a_k \le \sum_{k=1}^{\infty} a_k = S,$

where $S$ denotes the sum of the convergent series $\sum_{k=1}^{\infty} a_k$. Hence by Theorem 1-1 the sequence of partial sums of $\sum_{k=1}^{\infty} b_k$ converges.

(2) In this case $\sum_{k=1}^{n} b_k \ge \sum_{k=1}^{n} a_k$, and it follows that these partial sums are unbounded, since otherwise the series $\sum_{k=1}^{\infty} a_k$ would converge. ∎

Theorem 1-7. (Ratio test.) Let $\sum_{k=1}^{\infty} a_k$ be a series of positive terms and suppose that

$L = \lim_{k\to\infty} \frac{a_{k+1}}{a_k}$

exists. Then

(1) $\sum_{k=1}^{\infty} a_k$ converges if $L < 1$,

(2) $\sum_{k=1}^{\infty} a_k$ diverges if $L > 1$.

(Note that no assertion is made in case $L = 1$.)

Proof. (1) Suppose that $L < 1$ and that $r$ is chosen to be a fixed real number with $L < r < 1$. Then for sufficiently large values of $n$, say $n \ge N$, we have $a_{n+1}/a_n < r$. Thus

$a_{N+1} < r a_N,$
$a_{N+2} < r a_{N+1} < r^2 a_N,$
$\vdots$
$a_{N+k} < r^k a_N.$

But since $r < 1$, the geometric series $\sum_{k=0}^{\infty} a_N r^k$ converges. Thus by the comparison test so does

$\sum_{k=0}^{\infty} a_{N+k} = \sum_{k=N}^{\infty} a_k,$

and hence also the given series $\sum_{k=1}^{\infty} a_k$.

(2) The proof of this case is similar: this time $\sum_{k=1}^{\infty} a_k$ is compared with a divergent geometric series (ratio $r > 1$). We omit the details. ∎

Theorem 1-8. (Integral test.) Let $\sum_{k=1}^{\infty} a_k$ be a series of positive terms, and assume that there exists a function $f$, continuous and monotonically nonincreasing on $1 \le t < \infty$, such that $f(k) = a_k$ for $k = 1, 2, \ldots$. Then the series $\sum_{k=1}^{\infty} a_k$ and the improper integral $\int_1^{\infty} f(t)\,dt$ converge or diverge together.

Proof. Assume first that $\int_1^{\infty} f(t)\,dt$ converges. Then a reference to Fig. 1-1(a) makes it clear that

$\sum_{k=2}^{n+1} a_k \le \int_1^{n+1} f(t)\,dt \le \int_1^{\infty} f(t)\,dt.$

Thus the partial sums $\sum_{k=1}^{n} a_k$ of the given series form a monotonically nondecreasing sequence which is bounded above by the real number $a_1 + \int_1^{\infty} f(t)\,dt$. Convergence follows from Theorem 1-1.

FIGURE 1-1

If, on the other hand, the integral $\int_1^{\infty} f(t)\,dt$ diverges, then the partial sums $\sum_{k=1}^{n} a_k$ are unbounded, for in this case (see Fig. 1-1b)

$\sum_{k=1}^{n} a_k \ge \int_1^{n+1} f(t)\,dt,$

and the latter integral tends to infinity as $n \to \infty$. ∎

Example 2. The harmonic series

$\sum_{k=1}^{\infty} \frac{1}{k} = 1 + \frac{1}{2} + \frac{1}{3} + \cdots$

diverges, for the integral test may be applied in this case with $f(t) = 1/t$ to obtain

$\int_1^{\infty} \frac{dt}{t} = \lim_{a\to\infty} \int_1^{a} \frac{dt}{t} = \lim_{a\to\infty} (\ln a) = \infty.$

More generally, given any $p$-series $\sum_{k=1}^{\infty} 1/k^p$, $p$ a positive real number, we have

$\lim_{a\to\infty} \int_1^{a} t^{-p}\,dt = \lim_{a\to\infty} \left[\frac{t^{-p+1}}{-p+1}\right]_1^{a}, \qquad p \neq 1,$

and since this limit exists if and only if $p > 1$, it follows that a $p$-series $\sum_{k=1}^{\infty} 1/k^p$ converges if $p > 1$ and diverges if $p \le 1$. In particular the series

$\sum_{k=1}^{\infty} \frac{1}{k^2} \qquad\text{and}\qquad \sum_{k=1}^{\infty} \frac{1}{k^{1.01}}$

converge, while the series

$\sum_{k=1}^{\infty} \frac{1}{\sqrt{k}} \qquad\text{and}\qquad \sum_{k=1}^{\infty} \frac{1}{k^{0.99}}$

diverge. The $p$-series and the geometric series constitute useful classes of series for which the question of convergence is completely settled. By using these series together with the comparison test, a large number of additional examples may be treated.
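The $p$-series dichotomy can be watched numerically. The sketch below is our own construction (the helper `partial` and the sampling points are arbitrary choices): the partial sums of $\sum 1/k^2$ settle down, while those of $\sum 1/k$ track $\ln n$ and keep growing, just as the integral test predicts.

```python
# Numeric sketch of the p-series dichotomy from Example 2.
import math

def partial(p, n):
    return sum(1 / k**p for k in range(1, n + 1))

# p = 2: the tail past n lies below the integral of t**-2, i.e. below 1/n.
assert partial(2, 2000) - partial(2, 1000) < 1 / 1000

# p = 1: harmonic partial sums stay within a bounded gap of ln(n) ...
for n in (100, 10_000):
    assert 0.5 < partial(1, n) - math.log(n) < 1.0

# ... and therefore grow without bound, gaining about ln(100) here.
assert partial(1, 10_000) - partial(1, 100) > 4
```

The bounded gap in the middle assertion tends to Euler's constant, about 0.5772, which is exactly the rectangles-versus-curve comparison of Fig. 1-1.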

Example 3. The series

$\sum_{k=1}^{\infty} \frac{1}{k(k+1)} = \frac{1}{1\cdot 2} + \frac{1}{2\cdot 3} + \frac{1}{3\cdot 4} + \cdots$

converges by comparison with the series

$\sum_{k=1}^{\infty} \frac{1}{k^2},$

for $1/k(k+1) < 1/k^2$ for $k = 1, 2, \ldots$.
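Example 3's comparison can be confirmed numerically. One extra fact is used below only for checking and is not stated in the text: the partial sum of $\sum 1/k(k+1)$ telescopes to $1 - 1/(n+1)$.

```python
# Sketch: partial sums of sum 1/(k(k+1)) stay below those of sum 1/k**2,
# as the termwise comparison of Example 3 requires.
n = 1000
s_small = sum(1 / (k * (k + 1)) for k in range(1, n + 1))
s_large = sum(1 / k**2 for k in range(1, n + 1))

assert s_small <= s_large                        # termwise comparison
assert abs(s_small - (1 - 1 / (n + 1))) < 1e-9   # telescoping check
```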

Example 4. Consider the series

$\sum_{k=0}^{\infty} \frac{1}{k!} = 1 + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \cdots.$

Since

$\frac{1}{(k+1)!} = \frac{1}{1 \cdot 2 \cdot 3 \cdots (k+1)} \le \frac{1}{2^k},$

the given series converges by comparison with the geometric series $\sum_{k=0}^{\infty} \left(\frac{1}{2}\right)^k$.

Note that the ratio test may also be applied in this case, since

$L = \lim_{k\to\infty} \frac{a_{k+1}}{a_k} = \lim_{k\to\infty} \frac{1/(k+1)!}{1/k!} = \lim_{k\to\infty} \frac{1}{k+1} = 0.$

Example 5. Apply the ratio test to the series

$\sum_{k=1}^{\infty} \frac{k^k}{k!} = 1 + \frac{2^2}{2!} + \frac{3^3}{3!} + \cdots.$

Computing $\lim_{k\to\infty} a_{k+1}/a_k$, we have

$L = \lim_{k\to\infty} \frac{(k+1)^{k+1}}{(k+1)!} \cdot \frac{k!}{k^k} = \lim_{k\to\infty} \frac{(k+1)^k}{k^k} = \lim_{k\to\infty} \left(1 + \frac{1}{k}\right)^k = e > 1.$

Hence the series diverges.
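Example 5's ratio computation can be mirrored in code. To dodge overflow in $k^k$ and $k!$, the sketch below works with logarithms via `math.lgamma` — a device of our own, not something from the text. The assertions restate the algebra $a_{k+1}/a_k = (1 + 1/k)^k \to e$.

```python
# Ratio-test sketch for a_k = k**k / k!, done in logs to avoid overflow.
import math

def log_a(k):
    """log of k**k / k!  (math.lgamma(k + 1) equals log k!)."""
    return k * math.log(k) - math.lgamma(k + 1)

for k in (10, 100, 1000):
    log_ratio = log_a(k + 1) - log_a(k)
    assert abs(log_ratio - k * math.log(1 + 1 / k)) < 1e-8  # = log (1+1/k)**k
    assert log_ratio > 0                                    # every ratio exceeds 1

# the ratios climb toward e = 2.718..., so L = e > 1 and the series diverges
assert abs(math.exp(log_a(1001) - log_a(1000)) - math.e) < 0.01
```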

Example 6. For any $p$-series we have

$L = \lim_{k\to\infty} \frac{a_{k+1}}{a_k} = \lim_{k\to\infty} \frac{1/(k+1)^p}{1/k^p} = \lim_{k\to\infty} \left(\frac{k}{k+1}\right)^p = 1.$

Since such series converge if $p > 1$ and diverge if $p \le 1$, the ratio test cannot possibly give any information in the case $L = 1$ (see Theorem 1-7).

1-4 ABSOLUTE CONVERGENCE


The sequence of partial sums associated with a series whose terms are positive is monotonically nondecreasing, and hence the series converges or diverges according as the sequence is bounded above or not. For more general series, however, the question of convergence depends much more delicately on the magnitude and distribution of the positive and the negative terms. In this section we treat series which are absolutely convergent and series whose terms alternate in sign.

Definition 1-5. The series $\sum_{k=1}^{\infty} a_k$ is said to be absolutely convergent if the series $\sum_{k=1}^{\infty} |a_k|$ converges.

Theorem 1-9. If $\sum_{k=1}^{\infty} |a_k|$ converges, then so does $\sum_{k=1}^{\infty} a_k$. Briefly, absolute convergence implies convergence.

Proof. Under the assumption that $\sum_{k=1}^{\infty} |a_k|$ converges, we shall show that the partial sums of $\sum_{k=1}^{\infty} a_k$ form a Cauchy sequence. But this follows immediately from the relation

$\left|\sum_{k=m}^{n} a_k\right| \le \sum_{k=m}^{n} |a_k|, \qquad m \le n,$

because the right member can be made arbitrarily small by choosing $m$ sufficiently large. ∎

Since many important tests for convergence of series apply directly only to series whose terms are positive, Theorem 1-9 provides a way, sufficient for many applications, of applying these tests to arbitrary series. The ratio test, for instance, now takes the following form:

Theorem 1-10. (Ratio test.) Assume that the limit

$L = \lim_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right|$

exists. Then

(1) $\sum_{k=1}^{\infty} a_k$ converges if $L < 1$, and

(2) $\sum_{k=1}^{\infty} a_k$ diverges if $L > 1$.

If $L = 1$, no information is obtained.

Proof. If $L < 1$, Theorem 1-7 asserts that $\sum_{k=1}^{\infty} |a_k|$ converges; that is, $\sum_{k=1}^{\infty} a_k$ converges absolutely. Hence, by Theorem 1-9, $\sum_{k=1}^{\infty} a_k$ converges.

If, on the other hand,

$L = \lim_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right| > 1,$

then $|a_{n+1}| > |a_n|$ for sufficiently large $n$. Thus $a_n$ does not tend to zero as $n \to \infty$, and $\sum_{k=1}^{\infty} a_k$ diverges. ∎

Example 1. Determine the set of values of $x$ for which the series

$\sum_{k=1}^{\infty} (-1)^k k \left(\frac{x}{2}\right)^{3k}$

converges. Since $a_k = (-1)^k k (x/2)^{3k}$, we have

$\left|\frac{a_{n+1}}{a_n}\right| = \left|\frac{(-1)^{n+1}(n+1)(x/2)^{3n+3}}{(-1)^n\,n\,(x/2)^{3n}}\right| = \left|\frac{x}{2}\right|^3 \frac{n+1}{n}.$

Hence

$\lim_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right| = \left|\frac{x}{2}\right|^3 \lim_{n\to\infty} \frac{n+1}{n} = \left|\frac{x}{2}\right|^3,$

and the given series converges if $|x/2|^3 < 1$ and diverges if $|x/2|^3 > 1$. Thus the series converges for values of $x$ lying in the interval

$-2 < x < 2$

and diverges if $|x| > 2$. Finally, in order to determine the behavior of the series at the points $x = \pm 2$, we note that the general term becomes $\pm(-1)^k k$ in this case, and since this quantity does not tend to zero as $k \to \infty$, the series diverges at both points.

It is an unfortunate fact that a series may converge without converging absolutely, and for such series, usually referred to as conditionally convergent series, the tests for convergence which we have devised so far are of no use. This situation is illustrated, for example, by the alternating harmonic series

$\sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots,$

whose partial sums do approach a limit (see Theorem 1-11), despite the fact that the corresponding series of absolute values is divergent. This example is typical of alternating series, where we have the following useful criterion for convergence.

Theorem 1-11. If the terms of the series $\sum_{k=1}^{\infty} a_k$ alternate in sign and satisfy

(i) $|a_1| \ge |a_2| \ge |a_3| \ge \cdots$,

(ii) $\lim_{k\to\infty} |a_k| = 0$,

then the series converges. Moreover, if $S$ is the sum of the series and $S_n$ is the $n$th partial sum, then

$|S - S_n| \le |a_n|.$

Proof. We may assume that the terms $a_1, a_3, a_5, \ldots$ are all positive and the terms $a_2, a_4, a_6, \ldots$ are all negative. Then

$S_{2n+1} = \sum_{k=1}^{2n+1} a_k = \sum_{k=1}^{2n-1} a_k + (a_{2n} + a_{2n+1}) = S_{2n-1} + (a_{2n} + a_{2n+1}) \le S_{2n-1},$

since $a_{2n} + a_{2n+1} \le 0$. Likewise

$S_{2n} = \sum_{k=1}^{2n} a_k = \sum_{k=1}^{2n-2} a_k + (a_{2n-1} + a_{2n}) = S_{2n-2} + (a_{2n-1} + a_{2n}) \ge S_{2n-2},$

since $a_{2n-1} + a_{2n} \ge 0$. It follows that the odd-numbered partial sums form a nonincreasing sequence bounded below by $S_2$, and that the even-numbered partial sums form a nondecreasing sequence bounded above by $S_1$ (see Fig. 1-2). Thus each of these sequences possesses a limit, say

$\lim_{k\to\infty} S_{2k} = S_E \qquad\text{and}\qquad \lim_{k\to\infty} S_{2k+1} = S_O,$

where clearly $S_E \le S_O$. But in fact, since

$\lim_{k\to\infty} |S_{k+1} - S_k| = \lim_{k\to\infty} |a_{k+1}| = 0,$

we conclude that $S_E = S_O$, and the common limit $S$ is the desired sum of the series. Moreover, from the inequalities

$S_{2k} \le S \le S_{2l+1} \qquad \text{(for every } k, l\text{)},$

we find that

$|S - S_{2k}| \le |S_{2k-1} - S_{2k}| = |a_{2k}|,$

and

$|S - S_{2k+1}| \le |S_{2k+1} - S_{2k}| = |a_{2k+1}|,$

completing the proof of the theorem. ∎

$S_2 \quad S_4 \quad S_6 \;\cdots\; S_5 \quad S_3 \quad S_1$

FIGURE 1-2
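Theorem 1-11's error bound can be tested on the alternating harmonic series. Its sum is $\ln 2$ — a standard fact used below only for checking and not proved in the text; the helper name `s` is our own.

```python
# Sketch of Theorem 1-11 on 1 - 1/2 + 1/3 - ...  (sum = ln 2).
import math

def s(n):
    return sum((-1) ** (k + 1) / k for k in range(1, n + 1))

S = math.log(2)
for n in (1, 2, 9, 10, 99, 100):
    assert abs(S - s(n)) <= 1 / n          # the bound |S - S_n| <= |a_n|

# even partial sums increase toward S; odd ones decrease toward S (Fig. 1-2)
assert s(2) < s(4) < S < s(3) < s(1)
```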

Example 2. The ratio test may be applied to the series

$\sum_{k=0}^{\infty} \frac{(x+1)^k}{2k+1} \qquad (1\text{-}7)$

to show that it converges absolutely if $|x+1| < 1$ and diverges if $|x+1| > 1$; that is, the series converges absolutely for values of $x$ in the interval $-2 < x < 0$. At the endpoints of this interval the ratio test gives no information. However, if we substitute $x = -2$ and $x = 0$ into the series, we obtain

$\sum_{k=0}^{\infty} \frac{(-1)^k}{2k+1} \qquad\text{and}\qquad \sum_{k=0}^{\infty} \frac{1}{2k+1},$

respectively. The first of these series converges since it is an alternating series whose general term tends to zero (the absolute values tending monotonically to zero), while the second series diverges. Thus the given series (1-7) converges for values of $x$ in the interval $-2 \le x < 0$ and diverges for $x$ outside of this interval.
We close this section by stating without proof an important property of absolutely convergent series which is not shared by conditionally convergent series.

Theorem 1-12. If $\sum_{k=1}^{\infty} a_k$ is absolutely convergent and if $\sum_{k=1}^{\infty} b_k$ is any series obtained by rearranging the terms of $\sum_{k=1}^{\infty} a_k$, then $\sum_{k=1}^{\infty} b_k$ also converges absolutely and has the same sum.

1-5 BASIC NOTIONS FROM ELEMENTARY CALCULUS


Before considering sequences and series of functions in Section 1-6, we recall
some of the basic facts concerning continuous and differentiable functions to
which reference was made in the text. We shall give no proofs in this section
although such proofs can be readily based upon the deeper properties of the real
number system described in Section 1-1. For a complete discussion the reader is
referred to any text on advanced calculus (e.g., W. Kaplan, Advanced Calculus,
Addison-Wesley, 1952).

Definition 1-6. A real valued function $f$ defined on an interval $I$ of the $x$-axis is said to be continuous at a point $x_0$ in $I$ if for every $\varepsilon > 0$ there exists a $\delta > 0$, depending in general on $\varepsilon$ and $x_0$, such that $|f(x) - f(x_0)| < \varepsilon$ whenever $x$ is in $I$ and $|x - x_0| < \delta$. If $f$ is continuous at every point of $I$ we say that $f(x)$ is continuous on $I$.

This is the familiar notion of continuity basic to any elementary calculus course.
Not usually introduced at that level, however, is the following notion of uniform
continuity.

Definition 1-7. A function $f$ is said to be uniformly continuous on an interval $I$ if for every $\varepsilon > 0$ there exists a $\delta > 0$, depending in general on $\varepsilon$ but not on $x$, such that $|f(x_1) - f(x_2)| < \varepsilon$ whenever $x_1$, $x_2$ are in $I$ and $|x_1 - x_2| < \delta$.
It is clear that a function which is uniformly continuous on an interval is also

continuous on that interval. The following example shows, however, that the
converse is false.

Example 1. Let

$f(x) = 1/x, \qquad 0 < x < 1;$

let $x_0$ be any point in this interval, and let $\varepsilon > 0$ be given. If $x > x_0/2$, then

$\left|\frac{1}{x} - \frac{1}{x_0}\right| = \frac{|x - x_0|}{x x_0} < \frac{2}{x_0^2}\,|x - x_0|.$

If, further, $|x - x_0| < (x_0^2/2)\varepsilon$, then

$\left|\frac{1}{x} - \frac{1}{x_0}\right| < \varepsilon.$

Thus with

$\delta = \min\left(\frac{x_0}{2},\; \frac{x_0^2}{2}\,\varepsilon\right)$

the conditions of Definition 1-6 are satisfied, and $f(x) = 1/x$ is continuous at $x_0$. It follows that $f$ is continuous in the interval $0 < x < 1$.

FIGURE 1-3

However, the given value of $\delta$ depends on both $\varepsilon$ and $x_0$, and a glance at Fig. 1-3 convinces us that this is necessarily the case. For if any $\varepsilon > 0$ is given, the points $x'$, $x''$ in the figure can be made to fall as close to $x_0$ as desired by simply choosing $x_0$ sufficiently close to 0. Since $\delta$ must be chosen no larger than $|x_0 - x'|$, this shows that it cannot depend solely on $\varepsilon$. Thus $f$ is not uniformly continuous on $0 < x < 1$.

It is not hard to show, however, that the difficulty in this example is caused by the fact that the interval $0 < x < 1$ under consideration is not closed. This is a consequence of the following general theorem.

Theorem 1-13. If $f(x)$ is continuous on a closed interval $a \le x \le b$, then it is uniformly continuous on that interval.
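A small numeric sketch of Example 1 above ($f(x) = 1/x$); the particular values of $\varepsilon$, $\delta$, and the sample points are our own choices. A $\delta$ that works at $x_0 = 0.5$ fails at a point nearer $0$, so no single $\delta$ can serve the whole open interval.

```python
# f(x) = 1/x on (0, 1): the same delta stops working close to 0.
def f(x):
    return 1 / x

eps, delta = 0.5, 0.1

x0 = 0.5
assert abs(f(x0 + delta) - f(x0)) < eps    # delta = 0.1 suffices at x0 = 0.5

x1 = 0.05                                  # same delta, nearer the origin
assert abs(f(x1 + delta) - f(x1)) > eps    # |1/0.15 - 1/0.05| is about 13.3
```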

All of the above notions extend readily to functions of several variables. We


state them here for reference.

Definition 1-8. A real valued function $f(x_1, \ldots, x_n)$ defined in a region $\mathfrak{D}$ of $\mathbb{R}^n$ is continuous at the point $\mathbf{y} = (y_1, \ldots, y_n)$ of $\mathfrak{D}$ if for every $\varepsilon > 0$ there exists a $\delta > 0$, depending in general on $\varepsilon$ and $\mathbf{y}$, such that

$|f(\mathbf{x}) - f(\mathbf{y})| = |f(x_1, \ldots, x_n) - f(y_1, \ldots, y_n)| < \varepsilon$

whenever $\mathbf{x}$ is in $\mathfrak{D}$ and

$\|\mathbf{x} - \mathbf{y}\| = \sqrt{(x_1 - y_1)^2 + \cdots + (x_n - y_n)^2} < \delta.$

We say that $f$ is continuous in $\mathfrak{D}$ if it is continuous at every point of $\mathfrak{D}$.

Definition 1-9. A function $f(x_1, \ldots, x_n)$ is uniformly continuous in $\mathfrak{D}$ if for every $\varepsilon > 0$ there exists a $\delta > 0$, depending in general on $\varepsilon$ but not on $\mathbf{x}$, such that $|f(\mathbf{x}) - f(\mathbf{y})| < \varepsilon$ whenever $\mathbf{x}$, $\mathbf{y}$ are in $\mathfrak{D}$ and $\|\mathbf{x} - \mathbf{y}\| < \delta$.
654 INFINITE SERIES | APPENDIX I

If we consider functions defined on closed and bounded regions $\mathfrak{D}$, we have the following theorem.*

Theorem 1-14. If $f(x_1, \ldots, x_n)$ is continuous on a closed bounded region $\mathfrak{D}$, then it is uniformly continuous on $\mathfrak{D}$.

Two properties of continuous functions which are of the greatest interest from the
viewpoint of elementary calculus are given by the next two theorems.

Theorem 1-15. (The maximum-minimum property.) If $f$ is continuous on a closed interval $a \le x \le b$, then it assumes a maximum and a minimum value there. More generally, if $f(x_1, \ldots, x_n)$ is continuous on a closed, bounded region $\mathfrak{D}$ of $\mathbb{R}^n$, then there exist real numbers $m$, $M$ and points $\mathbf{y} = (y_1, \ldots, y_n)$, $\mathbf{z} = (z_1, \ldots, z_n)$ in $\mathfrak{D}$ such that

$m \le f(x_1, \ldots, x_n) \le M$

for every $\mathbf{x} = (x_1, \ldots, x_n)$ in $\mathfrak{D}$, and such that $f(\mathbf{y}) = m$ and $f(\mathbf{z}) = M$.

Theorem 1-16. (The intermediate value property.) If $f$ is continuous on the interval $a \le x \le b$ and $f(a) \neq f(b)$, then for any real number $M$ between $f(a)$ and $f(b)$ there is a point $x_0$, $a < x_0 < b$, such that $f(x_0) = M$. An analogous statement holds for functions of several variables.

While the student is certainly familiar with the maximum-minimum problem, he may fail to recognize the importance of Theorem 1-16. Naively, of course, this theorem may be used to assert the existence of roots of equations; for example, if $f(a) < 0$ and $f(b) > 0$, and if $f$ is continuous on $a \le x \le b$, then $f(x_0) = 0$ for some $x_0$ in this interval. But on a deeper level, Theorem 1-16 is a consequence of the basic properties of the real number system itself, and may be interpreted as connecting our intuitive notion of continuity with the abstract definition of continuity given in Definition 1-6.

Turning now to definitions and theorems relating to differentiation and integration, we first state

Definition 1-10. Let $f$ be defined on an open interval containing $x_0$. If

$\lim_{h\to 0} \frac{f(x_0 + h) - f(x_0)}{h}$

exists, it is called the derivative of $f$ at $x_0$ and is denoted by $f'(x_0)$ or $(d/dx)f(x_0)$. If $f$ has a derivative at every point of an open interval $a < x < b$, then $f$ is said to be differentiable on that interval.*

* A region $\mathfrak{D}$ of $\mathbb{R}^n$ is closed if it contains all of its limit points. It is bounded if there is a real number $M$ such that for every point $\mathbf{x} = (x_1, \ldots, x_n)$ of $\mathfrak{D}$,

$\|\mathbf{x}\| = \sqrt{x_1^2 + \cdots + x_n^2} \le M.$

It is a simple exercise to show that if/ is differentiable on an interval a < x < b,


then it is also continuous in that interval. We state for reference the somewhat
more important theorem.

Theorem 1-17. (The mean value theorem.) If $f$ is continuous in the closed interval $a \le x \le b$ and differentiable on the open interval $a < x < b$, then there exists a point $x_0$, $a < x_0 < b$, such that

$\frac{f(b) - f(a)}{b - a} = f'(x_0).$

The geometric content of this result is illustrated in Fig. 1-4. It states that there is at least one point in the open interval $a < x < b$ where the tangent to the curve $y = f(x)$ is parallel to the secant line connecting the points $(a, f(a))$, $(b, f(b))$.

FIGURE 1-4

Turning now to integration, we assume that the student already has some intuitive feeling for the definite integral $\int_a^b f(x)\,dx$ of a continuous function $f$. A detailed definition would be too lengthy to present here. Nevertheless, given a continuous function $f$, it is useful to recall the following terminology:

(1) $\int_a^b f(x)\,dx$ is called the definite integral of $f$ on the interval $a \le x \le b$, whereas

(2) if $x_0$ is in the interval $a \le x \le b$, then the function

$F(x) = \int_{x_0}^{x} f(t)\,dt, \qquad a \le x \le b,$

is called an indefinite integral of $f$ in $a \le x \le b$.

The basic connection between differentiation and integration is provided by the Fundamental Theorem of Calculus (Theorem 1-18).

* Later we shall need to extend this notion to include the endpoints of an interval as well.

Theorem 1-18. If $f$ is continuous in $a \le x \le b$ and

$F(x) = \int_{x_0}^{x} f(t)\,dt$

is an indefinite integral of $f$ in $a \le x \le b$, then $F$ is differentiable, and

$F'(x) = f(x).$

It follows almost immediately from the mean value theorem (Theorem 1-17) that two indefinite integrals of $f$ in $a \le x \le b$ differ by at most an additive constant. Thus we are led to the formula

$\int_a^b f(x)\,dx = F(b) - F(a),$

where $F$ is any "antiderivative" of $f$ on $a \le x \le b$.


Finally, we state two properties of integrals to which reference is made in the
text.

Theorem 1-19. If $f$ is continuous on $a \le x \le b$, then

$\left|\int_a^b f(x)\,dx\right| \le \int_a^b |f(x)|\,dx.$

Theorem 1-20. (Mean value theorem for integrals.) If $f$ is continuous for $a \le x \le b$, then there is an $x_0$ in the open interval $a < x < b$ such that

$\int_a^b f(x)\,dx = (b - a)\,f(x_0).$

Geometrically, $f(x_0)$ is the average height of $f$ on this interval.

1-6 SEQUENCES AND SERIES OF FUNCTIONS


There are three different notions of convergence studied in connection with
sequences and series of functions. Two of them, pointwise and uniform conver-
gence are treated in this section, and the third, mean convergence, is handled in
Chapter 8 of the text. We assume the elementary material of the preceding sec-
tions, especially that relating to sequences and series of constants.

Definition 1-11. A sequence $\{f_k(x)\}$ of functions, each defined on an interval $I$, is said to converge pointwise on $I$ if

$\lim_{k\to\infty} f_k(x_0)$

exists for each $x_0$ in $I$.

The following examples illustrate this definition and point out some of the reasons
for introducing a stronger type of convergence below.

Example 1. Let $f_k(x) = x^k$, $0 \le x \le 1$, $k = 1, 2, 3, \ldots$. Then for $0 \le x_0 < 1$ we have

$\lim_{k\to\infty} f_k(x_0) = \lim_{k\to\infty} x_0^k = 0,$

while, when $x_0 = 1$, $\lim_{k\to\infty} f_k(1) = 1$. Hence the given sequence converges pointwise on the interval $0 \le x \le 1$ to the function

$f(x) = \begin{cases} 0 & \text{if } 0 \le x < 1, \\ 1 & \text{if } x = 1. \end{cases}$

(See Fig. 1-5.) Note that whereas each of the $f_k$ is continuous on the entire interval $0 \le x \le 1$, the limit function $f$ is not continuous in this interval (namely, it is discontinuous at $x = 1$).

FIGURE 1-5 FIGURE 1-6

Example 2. Let $f_k(x)$ be the function defined on $0 \le x \le 2$ whose graph is indicated in Fig. 1-6. Clearly $f_k(0) = 0$ for every $k$, so $\lim_{k\to\infty} f_k(0) = 0$. Moreover, if $0 < x_0 \le 2$, then there is some value of $k$, say $K$, such that $2/K < x_0$, and hence $f_k(x_0) = 0$ for every $k \ge K$. Thus $\lim_{k\to\infty} f_k(x_0) = 0$, and we conclude that $\{f_k(x)\}$ converges pointwise to the function which is identically zero on the interval $0 \le x \le 2$. This time the sequence of continuous functions converges to a continuous limit. Nevertheless, the individual functions $f_k$, regardless of how far out they lie in the sequence, may differ from the limit function $f(x) = 0$ by large amounts. The members of the sequence, therefore, are not "approximations" to the limit of the sequence in the expected sense of the term. And the integrals of the members of the sequence reflect this peculiar behavior by failing to approach the integral of the limit function $f(x)$. In fact $\int_0^2 f(x)\,dx = 0$, whereas (see Fig. 1-6)

$\int_0^2 f_k(x)\,dx = \frac{1}{2} \cdot \frac{2}{k} \cdot k^2 = k,$

and these tend to infinity, not zero, as $k \to \infty$.
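Fig. 1-6 is not reproduced here; a triangular spike of height $k^2$ on $[0, 2/k]$ is one shape consistent with the text (pointwise limit $0$, areas growing without bound), and the sketch below assumes exactly that shape — both the function `f` and the midpoint-rule helper `area` are our own constructions.

```python
# Assumed shape for f_k (see lead-in): triangle of height k**2 on [0, 2/k].
def f(k, x):
    width = 2 / k
    half = width / 2
    if x <= 0 or x >= width:
        return 0.0
    if x <= half:
        return k**2 * x / half           # rising edge up to height k**2
    return k**2 * (width - x) / half     # falling edge back down to 0

# pointwise convergence to 0: for fixed x0 > 0, f_k(x0) = 0 once 2/k < x0
assert all(f(k, 0.5) == 0.0 for k in range(5, 50))

# but the areas (1/2 * base * height = k) tend to infinity, not to zero
def area(k, n=50_000):
    h = 2 / n                            # midpoint rule on [0, 2]
    return sum(f(k, (i + 0.5) * h) for i in range(n)) * h

for k in (2, 5, 10):
    assert abs(area(k) - k) < 0.05
```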

In order to eliminate this kind of behavior we now introduce a stronger kind of


convergence, called uniform convergence.

Definition 1-12. A sequence $\{f_k(x)\}$ is said to converge uniformly to the function $f$ on the interval $a \le x \le b$ if for every $\varepsilon > 0$ there is a positive integer $K$, depending on $\varepsilon$ but not on $x$, such that $|f_k(x) - f(x)| < \varepsilon$ whenever $k > K$ and $x$ is in the given interval.

It is certainly clear that if $\{f_k(x)\}$ converges uniformly to $f$ on $a \le x \le b$, then it also converges pointwise to $f$ on this interval. In the case of uniform convergence, however, one can choose $k$ so that

$f(x) - \varepsilon < f_k(x) < f(x) + \varepsilon$

for every $x$ in the interval $a \le x \le b$. Thus $f_k(x)$ approximates $f$ to within $\varepsilon$ over the entire interval, as shown in Fig. 1-7. We noted in Example 2 that in the case of pointwise convergence, none of the members of the sequence need approximate the limit function in this way.

FIGURE 1-7
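Definition 1-12 can be probed with the sequence $f_k(x) = x^k$ of Example 1 (the sampling grids below are our own choices): the largest distance to the pointwise limit tends to $0$ on $[0, 0.9]$, where the convergence is uniform, but stays near $1$ on points creeping toward $1$.

```python
# Sup-distance sketch for Definition 1-12 with f_k(x) = x**k (limit 0 on [0,1)).
def sup_dist(k, xs):
    return max(x**k for x in xs)         # distance to the zero limit function

grid = [0.9 * i / 1000 for i in range(1001)]       # sample of [0, 0.9]
assert sup_dist(200, grid) < 1e-9                  # 0.9**200 is about 7e-10

creep = [1 - 10.0 ** (-j) for j in range(1, 12)]   # points approaching 1
assert sup_dist(200, creep) > 0.99                 # no single K works on [0, 1)
```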

Theorem 1-21. If a sequence $\{f_k(x)\}$ of continuous functions converges uniformly to $f$ on $a \le x \le b$, then the limit function $f$ is also continuous on this interval.

Proof. We must show that for any $x_0$, $a \le x_0 \le b$, and any $\varepsilon > 0$, there is a $\delta > 0$ such that $|f(x) - f(x_0)| < \varepsilon$ whenever $a \le x \le b$ and $|x - x_0| < \delta$. Now

$|f(x) - f(x_0)| = |f(x) - f_k(x) + f_k(x) - f_k(x_0) + f_k(x_0) - f(x_0)| \le |f(x) - f_k(x)| + |f_k(x) - f_k(x_0)| + |f_k(x_0) - f(x_0)|. \qquad (1\text{-}8)$

But since the convergence is uniform, we can choose an integer $k$ so that the first and third terms on the right-hand side of (1-8) are each less than $\varepsilon/3$ for every $x$ in the interval $a \le x \le b$. Moreover, since the function $f_k$ thus chosen is continuous, we can also choose $\delta > 0$ such that $|f_k(x) - f_k(x_0)| < \varepsilon/3$ whenever $a \le x \le b$ and $|x - x_0| < \delta$. The desired conclusion is now immediate. ∎

Theorem 1-22. If $\{f_k(x)\}$ is a sequence of continuous functions which converges uniformly on $a \le x \le b$ to the (continuous) limit function $f$, then for every $x$ in this interval

$\lim_{k\to\infty} \int_a^x f_k(t)\,dt = \int_a^x f(t)\,dt,$

and this convergence is uniform on $a \le x \le b$.

Proof. We must show that for any $\varepsilon > 0$ there exists a $K$ such that

$\left|\int_a^x f_k(t)\,dt - \int_a^x f(t)\,dt\right| < \varepsilon$

whenever $k > K$ and $x$ is in the interval $a \le x \le b$. For this purpose, we use the uniform convergence of $\{f_k(x)\}$ to choose $K$ so that $|f_k(x) - f(x)| < \varepsilon/(b - a)$ whenever $k > K$ and $a \le x \le b$. Then

$\left|\int_a^x f_k(t)\,dt - \int_a^x f(t)\,dt\right| = \left|\int_a^x [f_k(t) - f(t)]\,dt\right| \le \int_a^x |f_k(t) - f(t)|\,dt \le \int_a^b |f_k(t) - f(t)|\,dt < (b - a)\,\frac{\varepsilon}{b - a} = \varepsilon$

for every $k > K$ and every $x$ in $a \le x \le b$. ∎

It would be useful to have a result similar to Theorem 1-22 which would apply to differentiation instead of integration. Unfortunately this is impossible, since there exist uniformly convergent sequences of differentiable functions whose limit function, although continuous, is nowhere differentiable.* The following theorem, however, does hold.

Theorem 1-23. If $\{f_k(x)\}$ is a sequence of continuously differentiable functions which converges pointwise to a limit $f$ for $a \le x \le b$, and if the sequence $\{f_k'(x)\}$ converges uniformly on $a \le x \le b$, then $f'(x)$ exists for all $x$ in the interval and

$f'(x) = \lim_{k\to\infty} f_k'(x).$

Proof. Let $g$ be the limit function to which $\{f_k'(x)\}$ converges uniformly. Then, applying Theorem 1-22, we have

$\int_a^x g(t)\,dt = \lim_{k\to\infty} \int_a^x f_k'(t)\,dt = \lim_{k\to\infty} [f_k(x) - f_k(a)] = f(x) - f(a),$

and it follows from the fundamental theorem of calculus that $f'(x) = g(x) = \lim_{k\to\infty} f_k'(x)$, as desired. ∎

* See, for example, Rudin, Principles of Mathematical Analysis, McGraw-Hill, 1964.



All of the above results may be recast in the context of series of functions rather than sequences. As usual we will write

$f(x) = \sum_{k=1}^{\infty} f_k(x), \qquad a \le x \le b, \qquad (1\text{-}9)$

and say that the series converges (pointwise) to $f$ if its associated sequence of partial sums converges pointwise to $f$. The series is said to converge uniformly if the same is true of its sequence of partial sums. In this case it follows immediately from Theorems 1-21 and 1-22 that if each term of the series is continuous, then their sum $f$ is also continuous, and

$\int_a^x f(t)\,dt = \sum_{k=1}^{\infty} \int_a^x f_k(t)\,dt.$

Finally, if each term of the series (1-9) is continuously differentiable, and if the series $\sum_{k=1}^{\infty} f_k'(x)$ obtained by differentiating each term is uniformly convergent, then

$f'(x) = \sum_{k=1}^{\infty} f_k'(x).$

The details of these statements are left to the reader.

In conclusion we establish the following important criterion for uniform convergence of series.

Theorem 1-24. (The Weierstrass M-test.) If $\sum_{k=1}^{\infty} M_k$ is a convergent series of positive real numbers, and if $\sum_{k=1}^{\infty} f_k(x)$ is a series of functions such that $|f_k(x)| \le M_k$ for every $k$ and every $x$ in the interval $a \le x \le b$, then $\sum_{k=1}^{\infty} f_k(x)$ is uniformly and absolutely convergent on $a \le x \le b$.

Proof. It follows from the comparison test that for any $x_0$ in the given interval the series $\sum_{k=1}^{\infty} f_k(x_0)$ converges absolutely. Thus the series converges pointwise to a limit function $f$ on $a \le x \le b$. Now

$\left|f(x) - \sum_{k=1}^{n} f_k(x)\right| = \left|\sum_{k=n+1}^{\infty} f_k(x)\right| \le \sum_{k=n+1}^{\infty} |f_k(x)| \le \sum_{k=n+1}^{\infty} M_k = \sum_{k=1}^{\infty} M_k - \sum_{k=1}^{n} M_k,$

and this latter expression tends to zero as $n \to \infty$. Since it is independent of $x$, the convergence of $\sum_{k=1}^{\infty} f_k(x)$ is uniform. ∎

Example 3. Consider the series

$\sum_{k=1}^{\infty} \frac{\sin k^2 x}{k^2} = \sin x + \frac{\sin 4x}{4} + \frac{\sin 9x}{9} + \cdots. \qquad (1\text{-}10)$

Since

$\left|\frac{\sin k^2 x}{k^2}\right| \le \frac{1}{k^2}$

for all $x$, and since $\sum_{k=1}^{\infty} 1/k^2$ converges, it follows from the Weierstrass M-test that the given series converges uniformly on $-\infty < x < \infty$. Let $f$ be the limit function; i.e.,

$f(x) = \sum_{k=1}^{\infty} \frac{\sin k^2 x}{k^2}.$

Then by Theorem 1-22,

$\int_0^x f(t)\,dt = \sum_{k=1}^{\infty} \int_0^x \frac{\sin k^2 t}{k^2}\,dt = \sum_{k=1}^{\infty} \frac{1 - \cos k^2 x}{k^4} = (1 - \cos x) + \frac{1 - \cos 4x}{16} + \frac{1 - \cos 9x}{81} + \cdots.$

If, on the other hand, we differentiate the terms of (1-10), we obtain the series

$\sum_{k=1}^{\infty} \cos k^2 x = \cos x + \cos 4x + \cdots, \qquad (1\text{-}11)$

which clearly does not converge for certain values of $x$.
1-7 POWER SERIES

Of particular importance among series of functions are the so-called power series

$\sum_{k=0}^{\infty} a_k x^k = a_0 + a_1 x + a_2 x^2 + \cdots,$

where $a_0, a_1, \ldots$ are constants. Such series enjoy special convergence properties which stem from the following theorem.

Theorem 1-25. If the power series $\sum_{k=0}^{\infty} a_k x^k$ converges for some value of $x$, say $x = x_0$, then it converges absolutely for every $x$ satisfying $|x| < |x_0|$, and it converges uniformly on every interval defined by $|x| \le |x_1| < |x_0|$.

Proof. Since $\sum_{k=0}^{\infty} a_k x_0^k$ converges, we know that $a_k x_0^k \to 0$ as $k \to \infty$, and hence that there exists a number $M$ such that $|a_k x_0^k| \le M$, $k = 0, 1, 2, \ldots$. Now if $|x| < |x_0|$, then we have

$|a_k x^k| \le M \left|\frac{x}{x_0}\right|^k.$

Thus, since the series $\sum_{k=0}^{\infty} M |x/x_0|^k$ is a convergent geometric series, it follows by the comparison test that $\sum_{k=0}^{\infty} a_k x^k$ converges absolutely on the interval $|x| < |x_0|$. In particular, for any fixed $x_1$ with $|x_1| < |x_0|$, the series $\sum_{k=0}^{\infty} |a_k x_1^k|$ converges. Thus for $|x| \le |x_1|$ we have $|a_k x^k| \le |a_k x_1^k|$, and the Weierstrass M-test implies that the series $\sum_{k=0}^{\infty} a_k x^k$ converges uniformly on $-|x_1| \le x \le |x_1|$. ∎

Now for any power series $\sum_{k=0}^{\infty} a_k x^k$, one of the following is certainly true:

(1) $\sum_{k=0}^{\infty} a_k x^k$ converges for every value of $x$.

(2) $\sum_{k=0}^{\infty} a_k x^k$ converges only for $x = 0$.

(3) $\sum_{k=0}^{\infty} a_k x^k$ converges for some nonzero value of $x$ but not for all values.

In the third case the set of positive numbers $x$ for which $\sum_{k=0}^{\infty} a_k x^k$ converges is bounded above, for otherwise, by the theorem, case (1) would apply. Letting $R$ be the least upper bound of this set, we conclude that $\sum_{k=0}^{\infty} a_k x^k$ converges if $|x| < R$ and diverges if $|x| > R$. Combining the above cases, we have

Theorem 1-26. For any power series $\sum_{k=0}^{\infty} a_k x^k$ there is a nonnegative number $R$ ($R = 0$ and $R = \infty$ are included), called the radius of convergence of the series, such that the series converges (absolutely) if $|x| < R$ and diverges if $|x| > R$. Moreover, if $R_1$ is any number such that $0 < R_1 < R$, then $\sum_{k=0}^{\infty} a_k x^k$ converges uniformly on the interval $-R_1 \le x \le R_1$.

Example 1. Consider $\sum_{k=0}^{\infty} (1/k!)\,x^k$. Applying the ratio test we have
$$\left|\frac{[1/(k+1)!]\,x^{k+1}}{(1/k!)\,x^k}\right| = \frac{|x|}{k+1},$$
and for any $x$ this ratio tends to zero as $k \to \infty$. Thus the given series converges absolutely in $-\infty < x < \infty$ and uniformly on every finite interval $-R_1 \le x \le R_1$.

Example 2. If for $\sum_{k=0}^{\infty} a_k x^k$ the limit
$$L = \lim_{k\to\infty} \left|\frac{a_{k+1}}{a_k}\right|$$
exists, then the radius of convergence of the series is $R = 1/L$. For in this case we obtain the ratio
$$\left|\frac{a_{k+1}x^{k+1}}{a_k x^k}\right| = \left|\frac{a_{k+1}}{a_k}\right|\,|x|,$$
and this tends to $L|x|$ as $k \to \infty$. Thus the series converges if $L|x| < 1$ and diverges if $L|x| > 1$; i.e., it converges if $|x| < R$ and diverges if $|x| > R$.
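The limit in Example 2 lends itself to a quick numerical check. In the sketch below, the helper `ratio_limit` and the sample coefficient sequences are our own choices, not the book's: for $a_k = 2^k$ the ratios are exactly 2 (so $R = 1/2$), while for $a_k = 1/k!$ they tend to 0 (so $R = \infty$, as in Example 1).

```python
import math

def ratio_limit(a, k):
    """The ratio |a_{k+1} / a_k| for a coefficient sequence given as a function of k."""
    return abs(a(k + 1) / a(k))

# a_k = 2^k: every ratio is exactly 2, so L = 2 and R = 1/L = 1/2.
geometric_ratios = [ratio_limit(lambda k: 2.0 ** k, k) for k in (5, 15, 25)]

# a_k = 1/k!: the ratio is 1/(k+1), which tends to 0, so R = infinity (Example 1).
factorial_ratios = [ratio_limit(lambda k: 1.0 / math.factorial(k), k) for k in (10, 100)]
```

Watching the ratios stabilize (or vanish) as $k$ grows is a practical stand-in for the limit $L$ when no closed form is available.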

Theorems concerning differentiation and integration of power series are easily obtained from Theorem 1-26.

Theorem 1-27. If the power series
$$F(x) = \sum_{k=0}^{\infty} a_k x^k$$
has radius of convergence $R$, then $\int_a^b F(x)\,dx$ exists for $-R < a \le b < R$ and
$$\int_a^b F(x)\,dx = \sum_{k=0}^{\infty} \int_a^b a_k x^k\,dx = \sum_{k=0}^{\infty} a_k\,\frac{b^{k+1} - a^{k+1}}{k+1}. \tag{1-12}$$

Proof. According to Theorem 1-26, the series converges uniformly on $a \le x \le b$. Hence (1-12) follows from the general results on integration of uniformly convergent series. ∎

Theorem 1-28. If the power series $F(x) = \sum_{k=0}^{\infty} a_k x^k$ has radius of convergence $R$, then $F'(x)$ exists on $-R < x < R$ and
$$F'(x) = \sum_{k=1}^{\infty} k a_k x^{k-1}. \tag{1-13}$$

Proof. According to the general theorem on differentiation of series, we need only show that the differentiated series (1-13) converges uniformly on every interval $-R_1 \le x \le R_1$ for which $R_1 < R$. For this purpose, choose $x_1$ such that $R_1 < |x_1| < R$. Then $\sum_{k=0}^{\infty} a_k x_1^k$ converges absolutely, and hence there is a number $M$ such that $|a_k x_1^k| \le M$ for all $k$. Then for $|x| \le R_1$ we have
$$|k a_k x^{k-1}| = k|a_k|\,|x|^{k-1} \le k|a_k|\,R_1^{k-1} \le k\,\frac{M}{|x_1|}\left(\frac{R_1}{|x_1|}\right)^{k-1}.$$
However, the series
$$\sum_{k=0}^{\infty} k\,\frac{M}{|x_1|}\left(\frac{R_1}{|x_1|}\right)^{k-1}$$
converges by the ratio test, and the uniform convergence of (1-13) now follows from the Weierstrass $M$-test. ∎

Stated informally, the two preceding theorems assert that a power series may be integrated or differentiated term by term without affecting the radius of convergence. Convergence at $|x| = R$ may be destroyed by differentiation or gained by integration, however, and this can only be ascertained by examining each series individually.

Example 3. Consider the series

(a) $\displaystyle\sum_{k=0}^{\infty} x^k = 1 + x + x^2 + x^3 + \cdots,$

(b) $\displaystyle\sum_{k=0}^{\infty} \frac{x^{k+1}}{k+1} = x + \frac{x^2}{2} + \frac{x^3}{3} + \frac{x^4}{4} + \cdots,$

(c) $\displaystyle\sum_{k=0}^{\infty} \frac{x^{k+2}}{(k+1)(k+2)} = \frac{x^2}{1\cdot 2} + \frac{x^3}{2\cdot 3} + \frac{x^4}{3\cdot 4} + \cdots.$

Each of these series has radius of convergence $R = 1$, since (a) is a geometric series and (b) and (c) are obtained from (a) by integration. We find, however, that series (a) diverges at both endpoints $x = 1$ and $x = -1$, series (b) converges at $x = -1$ but diverges at $x = 1$, and series (c) converges at both endpoints.
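The endpoint behavior in Example 3 can be observed directly from partial sums. This is an illustration of our own, not part of the text: at $x = -1$ the partial sums of series (b) settle toward $-\ln 2$ (the alternating harmonic series), while those of series (a) oscillate forever between 1 and 0.

```python
import math

def partial_sum_a(x, n):
    """Partial sum of series (a): 1 + x + x^2 + ... + x^n."""
    return sum(x ** k for k in range(n + 1))

def partial_sum_b(x, n):
    """Partial sum of series (b): x + x^2/2 + ... + x^(n+1)/(n+1)."""
    return sum(x ** (k + 1) / (k + 1) for k in range(n + 1))

# Series (b) converges at x = -1: the partial sums approach -1 + 1/2 - 1/3 + ... = -ln 2.
s_b = partial_sum_b(-1.0, 100000)

# Series (a) has no limit at x = -1: the partial sums alternate 1, 0, 1, 0, ...
s_a_even = partial_sum_a(-1.0, 100)   # 101 terms -> 1.0
s_a_odd = partial_sum_a(-1.0, 101)    # 102 terms -> 0.0
```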

There are a number of interesting implications of the theorem on differentiation of power series. If $\sum_{k=0}^{\infty} a_k x^k$ converges on $-R < x < R$, then it represents (converges to) a continuous function there:
$$F(x) = \sum_{k=0}^{\infty} a_k x^k.$$
Moreover, $F'(x)$ exists on this interval and
$$F'(x) = \sum_{k=1}^{\infty} k a_k x^{k-1}, \qquad -R < x < R.$$
Repeating the process of differentiation indefinitely, we obtain
$$F^{(n)}(x) = \sum_{k=n}^{\infty} k(k-1)(k-2)\cdots(k-n+1)\,a_k x^{k-n}$$
for $n = 1, 2, 3, \ldots$, and hence $F^{(n)}(0) = n!\,a_n$. With this we have proved

Theorem 1-29. If a function $F$ can be represented by a power series $\sum_{k=0}^{\infty} a_k x^k$ on the interval $-R < x < R$, then $F$ has derivatives of all orders on $-R < x < R$, and the coefficients of the series are uniquely determined by the relation $a_k = (1/k!)F^{(k)}(0)$, $a_0 = F(0)$.

The theorem asserts the uniqueness of the power series expansion of a function $F$ on a given interval $-R < x < R$. The existence of such a series is a more difficult problem, and we investigate it in the next section. We note, of course, that for a function $F$ to have a power series representation it is necessary that $F$ have derivatives of all orders. This condition rules out, for example, the function $\ln|x|$, which is not defined at $x = 0$, and the function $x^{8/3}$, for which the third derivative fails to exist at $x = 0$. Unfortunately, the existence of infinitely many derivatives of $F$ in $-R < x < R$ is not sufficient to assure the representability of $F$ by a power series.*


We close this section with several arithmetic results on power series.

Theorem 1-30. If
$$f(x) = \sum_{k=0}^{\infty} a_k x^k \qquad\text{and}\qquad g(x) = \sum_{k=0}^{\infty} b_k x^k$$
on $-R < x < R$, then for any constants $\alpha$, $\beta$ the series $\sum_{k=0}^{\infty} (\alpha a_k + \beta b_k)x^k$ has radius of convergence at least $R$ and represents the function $\alpha f(x) + \beta g(x)$ on $-R < x < R$.

The proof of the theorem is an immediate consequence of Theorem 1-3.

Theorem 1-31. If
$$f(x) = \sum_{k=0}^{\infty} a_k x^k \qquad\text{and}\qquad g(x) = \sum_{k=0}^{\infty} b_k x^k$$
on $-R < x < R$, then the series $\sum_{k=0}^{\infty} c_k x^k$, where
$$c_k = \sum_{i=0}^{k} a_i b_{k-i} = a_0 b_k + a_1 b_{k-1} + \cdots + a_k b_0, \tag{1-14}$$
converges to $fg$ in $-R < x < R$.

The reader will note that the coefficients (1-14) are exactly the ones obtained by "formal multiplication" of the given series, treating them as polynomials.
As a final comment, we note that so far we have been discussing power series $\sum_{k=0}^{\infty} a_k x^k$ whose interval of convergence is centered at the point $x = 0$. The entire discussion may be carried through equally well, however, for series of the form $\sum_{k=0}^{\infty} a_k (x - a)^k$. The interval of convergence for such a series is of the form $a - R < x < a + R$ (with or without the endpoints), and the radius of convergence $R$ is computed by the same techniques as before. In particular, if
$$f(x) = \sum_{k=0}^{\infty} a_k (x - a)^k$$
on $a - R < x < a + R$, then $f$ possesses infinitely many derivatives on this interval and $a_k = (1/k!)\,f^{(k)}(a)$.

* The function defined by
$$f(x) = \begin{cases} e^{-1/x^2}, & x \ne 0,\\ 0, & x = 0,\end{cases}$$
has derivatives of all orders on $-\infty < x < \infty$. However, $f^{(k)}(0) = 0$ for all $k$, and the series $\sum_{k=0}^{\infty} (1/k!)\,f^{(k)}(0)\,x^k$ converges on $-\infty < x < \infty$ to the function which is identically zero, not to $f$.

1-8 TAYLOR SERIES

In the preceding section we found that if a function $f$ can be expanded in a power series as
$$f(x) = \sum_{k=0}^{\infty} a_k (x - a)^k$$
in an interval $|x - a| < R$, then $f$ has derivatives of all orders, and
$$a_k = \frac{1}{k!}\,f^{(k)}(a), \qquad k = 0, 1, 2, \ldots.$$
We consider now the question of existence of such a power series representation for a given function $f$.

Theorem 1-32. (Taylor's formula with remainder.) Suppose that $f$ and its first $n + 1$ derivatives are defined and continuous on the interval $I$ defined by $|x - a| \le R$. Then for all $x$ in $I$ we have
$$f(x) = \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!}(x - a)^k + R_n(x), \tag{1-15}$$
where
$$R_n(x) = \frac{1}{n!}\int_a^x (x - t)^n f^{(n+1)}(t)\,dt. \tag{1-16}$$

Proof. We begin with the formula
$$f(x) - f(a) = \int_a^x f'(t)\,dt.$$
Transferring $f(a)$ to the right-hand side of this equation and integrating by parts [with $u = f'(t)$, $dv = dt$] we obtain*
$$f(x) = f(a) + \bigl[f'(t)(t - x)\bigr]_a^x - \int_a^x (t - x)\,f^{(2)}(t)\,dt,$$
or
$$f(x) = f(a) + f'(a)(x - a) + \int_a^x (x - t)\,f^{(2)}(t)\,dt.$$
Again we integrate by parts, letting $u = f^{(2)}(t)$ and $dv = (x - t)\,dt$, to obtain
$$f(x) = f(a) + f'(a)(x - a) + \left[-f^{(2)}(t)\,\frac{(x - t)^2}{2!}\right]_a^x + \int_a^x \frac{(x - t)^2}{2!}\,f^{(3)}(t)\,dt$$
$$= f(a) + f'(a)(x - a) + \frac{f^{(2)}(a)}{2!}(x - a)^2 + \int_a^x \frac{(x - t)^2}{2!}\,f^{(3)}(t)\,dt.$$
Continuing to integrate by parts (proof by mathematical induction), we arrive at the desired formulas after $n$ integrations. (Note that the integrations can be carried out so long as the integrands are continuous.) ∎

* Recall that the formula for integration by parts may be written in the form $\int u\,dv = u(v + c) - \int (v + c)\,du$, where $c$ is an arbitrary constant.
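Formulas (1-15) and (1-16) can be tested numerically for a particular function. In the sketch below, all choices are our own rather than the book's: $f = e^x$, $a = 0$, $n = 2$, and a simple midpoint-rule quadrature; the remainder computed from the integral (1-16) should agree with $e^x$ minus its degree-two Taylor polynomial.

```python
import math

def midpoint_integral(g, a, b, n=20000):
    """Composite midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def remainder_via_integral(x, n=2):
    """R_n(x) from (1-16) for f = exp, a = 0: (1/n!) * integral_0^x (x-t)^n e^t dt."""
    integrand = lambda t: (x - t) ** n * math.exp(t)
    return midpoint_integral(integrand, 0.0, x) / math.factorial(n)

x = 1.5
taylor_poly = 1.0 + x + x ** 2 / 2.0        # degree-2 Taylor polynomial of e^x at 0
remainder_direct = math.exp(x) - taylor_poly
remainder_integral = remainder_via_integral(x)
```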

Example 1. The function $f(x) = x^{8/3}$ has continuous derivatives through order two on the interval $-\infty < x < \infty$. Thus, when $a = 8$, Taylor's formula yields
$$x^{8/3} = f(8) + f'(8)(x - 8) + \int_8^x (x - t)\,f^{(2)}(t)\,dt = 256 + \frac{256}{3}(x - 8) + \frac{40}{9}\int_8^x (x - t)\,t^{2/3}\,dt.$$

If $f$ has derivatives of all orders at the point $a$, it is only natural to consider the infinite Taylor series
$$\sum_{k=0}^{\infty} \frac{f^{(k)}(a)}{k!}(x - a)^k, \tag{1-17}$$
and ask whether this series converges to $f$ in an interval $|x - a| < R$. Applying Theorem 1-32, we can assert that (1-17) converges to $f$ whenever $|x - a| < R$ if and only if
$$\lim_{k\to\infty} R_k(x) = 0$$
for every $x$ in the interval. Thus to settle this question it suffices to determine the behavior of $R_k(x)$ as $k \to \infty$. To this end the following result is particularly useful.

Theorem 1-33. If $f$ satisfies the conditions of Theorem 1-32, and if there exists a number $M$ such that $|f^{(n+1)}(x)| \le M$ for $|x - a| \le R$, then
$$|R_n(x)| \le M\,\frac{|x - a|^{n+1}}{(n + 1)!}. \tag{1-18}$$

Proof. Assume first that $x \ge a$; then
$$|R_n(x)| = \frac{1}{n!}\left|\int_a^x (x - t)^n f^{(n+1)}(t)\,dt\right| \le \frac{M}{n!}\int_a^x (x - t)^n\,dt = M\,\frac{(x - a)^{n+1}}{(n + 1)!}.$$
If, on the other hand, $x < a$, then
$$|R_n(x)| \le \frac{M}{n!}\int_x^a (t - x)^n\,dt = M\,\frac{(a - x)^{n+1}}{(n + 1)!}.$$
Combining these results yields (1-18). ∎

Example 2. If $f(x) = e^x$, Taylor's formula (about $a = 0$) yields
$$e^x = 1 + x + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!} + R_n(x),$$
with
$$R_n(x) = \frac{1}{n!}\int_0^x (x - t)^n e^t\,dt.$$
Since $|f^{(n+1)}(x)| = e^x \le e^R$ on the interval $|x| \le R$, Theorem 1-33 yields
$$|R_n(x)| \le e^R\,\frac{|x|^{n+1}}{(n + 1)!}, \qquad -R \le x \le R. \tag{1-19}$$
Thus for any $x$ in this interval
$$\lim_{n\to\infty} R_n(x) = 0.*$$
Since $R$ was arbitrary, it follows that
$$e^x = \sum_{k=0}^{\infty} \frac{x^k}{k!}, \tag{1-20}$$
i.e., that $e^x$ is represented by its Taylor series on the entire real line.

The estimate (1-19) may be used for computational purposes as well. For example, let us find the number of terms of (1-20) required to compute $e$ to an accuracy of seven decimal places. In this case we have (with $R = 1$)
$$e = e^1 = \sum_{k=0}^{n} \frac{1}{k!} + R_n(1)$$
and
$$|R_n(1)| \le \frac{e}{(n + 1)!} < \frac{3}{(n + 1)!}.$$
This latter expression may be made smaller than $10^{-7}$ by choosing $n = 10$.

* For fixed $x$, the series
$$\sum_{n=0}^{\infty} \frac{|x|^{n+1}}{(n + 1)!}$$
converges by the ratio test. Hence its general term tends to zero.
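The error estimate just derived is easy to check numerically. The following sketch (ours, not the book's) sums the series (1-20) through $n = 10$ and confirms that the bound $3/(n+1)!$ is below $10^{-7}$ and that the actual error is smaller still.

```python
import math

def partial_e(n):
    """Partial sum of (1-20) at x = 1: sum of 1/k! for k = 0..n."""
    return sum(1.0 / math.factorial(k) for k in range(n + 1))

n = 10
approx_e = partial_e(n)
error_bound = 3.0 / math.factorial(n + 1)   # |R_n(1)| <= e/(n+1)! < 3/(n+1)!
```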

Example 3. Since the derivatives of $\sin x$ and $\cos x$ are each bounded by $M = 1$ on $-\infty < x < \infty$, the remainder terms of their Taylor series are bounded by $|x|^{n+1}/(n + 1)!$. Thus in each case the Taylor series converges to the respective function on the interval $-\infty < x < \infty$. The reader can easily show that the resulting series are
$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots,$$
$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \cdots.$$

Example 4. Combining Example 2 with Theorem 1-30, we obtain
$$\sinh x = \tfrac{1}{2}(e^x - e^{-x}) = x + \frac{x^3}{3!} + \frac{x^5}{5!} + \cdots,$$
$$\cosh x = \tfrac{1}{2}(e^x + e^{-x}) = 1 + \frac{x^2}{2!} + \frac{x^4}{4!} + \cdots.$$
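The rapid convergence asserted in Examples 3 and 4 is easy to observe numerically. A minimal sketch (the truncation point and variable names are our own, not the book's): ten terms of each series already agree with the library functions to better than $10^{-10}$ at $x = 2$.

```python
import math

def sin_series(x, terms=10):
    """Partial sum of sin x = x - x^3/3! + x^5/5! - ...  (Example 3)."""
    return sum((-1.0) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
               for k in range(terms))

def cosh_series(x, terms=10):
    """Partial sum of cosh x = 1 + x^2/2! + x^4/4! + ...  (Example 4)."""
    return sum(x ** (2 * k) / math.factorial(2 * k) for k in range(terms))

x = 2.0
sin_error = abs(sin_series(x) - math.sin(x))
cosh_error = abs(cosh_series(x) - math.cosh(x))
```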

1-9 FUNCTIONS DEFINED BY INTEGRALS


The definition of functions by integrals is in many ways analogous to their defini-
tion by series and in this section we consider the problems of differentiating and
integrating such functions.
Let the function/(x, t) be defined and continuous on the rectangle R: a < s < b,
c < t < d. Then the integral jc f(s, i) dt exists for a < s < b and defines a
function F(s).

Theorem 1-34. Under the conditions stated above, the function

d
F(s) = [ f(s,t)dt
Jc

is continuous on a < s < b.

Proof. We have
$$|F(s + h) - F(s)| = \left|\int_c^d [f(s + h, t) - f(s, t)]\,dt\right| \le \int_c^d |f(s + h, t) - f(s, t)|\,dt.$$
Since $f(s, t)$ is uniformly continuous on the rectangle (see Theorem 1-14), we can, for any $\epsilon > 0$, find a $\delta > 0$ such that $|f(s + h, t) - f(s, t)| < \epsilon/(d - c)$ whenever $|h| < \delta$ and both $s$ and $s + h$ are in the given rectangle. Then $|h| < \delta$ implies that
$$|F(s + h) - F(s)| < \int_c^d \frac{\epsilon}{d - c}\,dt = \epsilon. \;∎$$

We next ask if $F(s)$ is differentiable and if the natural formula
$$F'(s) = \int_c^d \frac{\partial f}{\partial s}(s, t)\,dt$$
holds. With mild restrictions this turns out to be true.

Theorem 1-35. If $f(s, t)$ is continuous in $R$ and if $\partial f/\partial s$ exists and is continuous in $R$, then
$$F'(s) = \int_c^d \frac{\partial f}{\partial s}(s, t)\,dt.$$

Proof. We calculate
$$\frac{1}{h}\,[F(s + h) - F(s)] = \int_c^d \frac{1}{h}\,[f(s + h, t) - f(s, t)]\,dt.$$
By the mean value theorem
$$\frac{1}{h}\,[f(s + h, t) - f(s, t)] = \frac{\partial f}{\partial s}(s + \theta h, t),$$
where $\theta$ is some number between $0$ and $1$. Thus
$$F'(s) = \lim_{h\to 0} \frac{F(s + h) - F(s)}{h} = \lim_{h\to 0} \int_c^d \frac{\partial f}{\partial s}(s + \theta h, t)\,dt,$$
and since $\partial f/\partial s\,(s, t)$ was assumed to be continuous, Theorem 1-34 yields the desired result. ∎

The formula of Theorem 1-35 can be extended to allow variable limits on the
defining integral, as follows.

Theorem 1-36. Let $f(s, t)$ and $\partial f/\partial s$ be continuous on the rectangle $R$: $a \le s \le b$, $c \le t \le d$, and let $c(s)$ and $d(s)$ be continuously differentiable functions with range in the interval $c \le t \le d$ (see Fig. 1-8). Then, if
$$F(s) = \int_{c(s)}^{d(s)} f(s, t)\,dt,$$
we have
$$F'(s) = \int_{c(s)}^{d(s)} \frac{\partial f}{\partial s}(s, t)\,dt + f(s, d(s))\,d'(s) - f(s, c(s))\,c'(s).$$

[Figure 1-8: the rectangle $R$ with the curves $t = c(s)$ and $t = d(s)$.]



Proof. Let $G(s, u, v)$ be the function defined by
$$G(s, u, v) = \int_u^v f(s, t)\,dt.$$
Then $F(s) = G(s, c(s), d(s))$, and the chain rule for functions of several variables yields
$$F'(s) = \frac{\partial G}{\partial s}(s, c(s), d(s)) + \frac{\partial G}{\partial u}(s, c(s), d(s))\,c'(s) + \frac{\partial G}{\partial v}(s, c(s), d(s))\,d'(s). \tag{1-21}$$
But by the fundamental theorem of calculus
$$\frac{\partial G}{\partial v}(s, c(s), d(s)) = f(s, d(s)), \qquad \frac{\partial G}{\partial u}(s, c(s), d(s)) = -f(s, c(s)),$$
and application of Theorem 1-35 yields
$$\frac{\partial G}{\partial s}(s, c(s), d(s)) = \int_{c(s)}^{d(s)} \frac{\partial f}{\partial s}(s, t)\,dt.$$
These equations, together with (1-21), give the desired formula, known in the literature as Leibnitz's formula. ∎
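Leibnitz's formula is easily tested on an integrand whose integral is known in closed form. All choices below are our own: with $f(s, t) = st$, $c(s) = 0$, $d(s) = s$, we get $F(s) = s^3/2$ directly, and the formula must reproduce $F'(s) = 3s^2/2$.

```python
def leibniz_rhs(s):
    """Right-hand side of Leibnitz's formula for f(s,t) = s*t, c(s) = 0, d(s) = s.
    The partial derivative of f with respect to s is t, so the integral term is
    integral_0^s t dt = s^2/2."""
    integral_term = s ** 2 / 2.0
    upper_term = (s * s) * 1.0    # f(s, d(s)) * d'(s), with d(s) = s
    lower_term = (s * 0.0) * 0.0  # f(s, c(s)) * c'(s), with c(s) = 0
    return integral_term + upper_term - lower_term

def derivative_exact(s):
    """Direct derivative of F(s) = integral_0^s s*t dt = s^3/2."""
    return 3.0 * s ** 2 / 2.0

deviations = [abs(leibniz_rhs(s) - derivative_exact(s)) for s in (0.5, 1.0, 2.0)]
```

The two boundary terms are exactly the contribution of the moving limits; dropping them would leave only Theorem 1-35.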

We now turn our attention to improper integrals, and let $f$ be piecewise continuous on $0 \le x < \infty$ (see Section 9-2).

Definition 1-13. The improper integral $\int_0^\infty f(x)\,dx$ is said to converge if the limit
$$\lim_{B\to\infty} \int_0^B f(x)\,dx$$
exists. More precisely, the given integral is said to converge to $L$, and we write $L = \int_0^\infty f(x)\,dx$, if for every $\epsilon > 0$ there is a positive number $M$ (depending in general on $\epsilon$) such that
$$\left|L - \int_0^B f(x)\,dx\right| < \epsilon$$
whenever $B > M$.

In a similar fashion, $\int_{-\infty}^{\infty} f(x)\,dx$ is defined for a piecewise continuous function on $-\infty < x < \infty$ by the double limit
$$\lim_{\substack{A\to-\infty \\ B\to\infty}} \int_A^B f(x)\,dx.$$

Now if $\int_0^\infty f(s, t)\,dt$ exists for every value of $s$ in an interval $I$, then it defines a function
$$F(s) = \int_0^\infty f(s, t)\,dt$$
on this interval. The situation here is analogous to that which arose in defining a function as the pointwise limit (sum) of an infinite series; for example, the continuity of $f(s, t)$ on the region $0 \le t < \infty$, $s$ in $I$, does not imply the continuity of $F(s)$ on $I$. For this reason we extend the notion of uniform convergence, as follows.

Definition 1-14. The integral $\int_0^\infty f(s, t)\,dt$ is said to converge uniformly to $F(s)$ on $I$ if for every $\epsilon > 0$ there is a positive number $M$, depending in general on $\epsilon$ but not on $s$, such that
$$\left|F(s) - \int_0^B f(s, t)\,dt\right| < \epsilon$$
whenever $B > M$ and $s$ is in $I$.
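Definition 1-14 can be illustrated with $f(s, t) = e^{-st}$ on the interval $s \ge 1$, where $F(s) = 1/s$ and the tail $\int_B^\infty e^{-st}\,dt = e^{-sB}/s$ is at most $e^{-B}$ for every $s \ge 1$: a single cutoff $B$ works for all $s$ at once, which is exactly uniform convergence. A sketch (the sample integrand and cutoff are ours):

```python
import math

def tail(s, B):
    """|F(s) - integral_0^B e^{-st} dt| for F(s) = 1/s: the tail e^{-sB}/s."""
    return math.exp(-s * B) / s

B = 20.0
# The tail is largest at s = 1 and is bounded by e^{-B} for every s >= 1,
# so the single cutoff B serves uniformly on the whole interval.
tails = [tail(s, B) for s in (1.0, 1.5, 2.0, 5.0, 10.0)]
worst = max(tails)
```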

If $F(s) = \int_0^\infty f(s, t)\,dt$ uniformly on $I$, then the integral $\int_0^B f(s, t)\,dt$ may be viewed as approximating $F(s)$ on this interval (see the corresponding discussion concerning uniform convergence of series). Moreover, we have

Theorem 1-37. If $f(s, t)$ is continuous for $s$ in $I$ and $0 \le t < \infty$, and if $F(s) = \int_0^\infty f(s, t)\,dt$ converges uniformly on $I$, then $F(s)$ is continuous on $I$.

Proof. Given $\epsilon > 0$, choose $B > 0$ so that
$$\left|F(s) - \int_0^B f(s, t)\,dt\right| < \frac{\epsilon}{3}$$
for every $s$ in $I$. Then
$$|F(s + h) - F(s)| \le \left|F(s + h) - \int_0^B f(s + h, t)\,dt\right| + \left|\int_0^B f(s + h, t)\,dt - \int_0^B f(s, t)\,dt\right| + \left|\int_0^B f(s, t)\,dt - F(s)\right|$$
$$\le \frac{2\epsilon}{3} + \left|\int_0^B f(s + h, t)\,dt - \int_0^B f(s, t)\,dt\right|.$$
Now choose $\delta > 0$ so that the latter term is less than $\epsilon/3$ whenever $|h| < \delta$. (Theorem 1-34 provides such a $\delta$.) ∎



The problem of integrating a function of the form
$$F(s) = \int_0^\infty f(s, t)\,dt, \qquad s \text{ in } I,$$
forces us to consider the question of interchanging the order of integration in iterated integrals. We recall that for the finite rectangle $R$: $a \le s \le b$, $c \le t \le d$, we always have
$$\int_a^b \int_c^d f(s, t)\,dt\,ds = \int_c^d \int_a^b f(s, t)\,ds\,dt$$
(provided, of course, that $f$ is continuous). However, this result is in general false if the integrations are carried out over unbounded intervals. To examine this situation more closely, we must define what is meant by improper double integrals and explore their relation to the improper iterated integrals.

Definition 1-15. Let $R$ be the first quadrant of the $(s, t)$-plane. We say that the improper double integral
$$\iint_R f(s, t)\,ds\,dt$$
converges to $L$ if for every $\epsilon > 0$ there is a positive number $M$ such that
$$\left|L - \int_0^B \int_0^A f(s, t)\,ds\,dt\right| < \epsilon$$
whenever $A$, $B$ are both greater than $M$. (Analogous definitions are given when $R$ is a half-plane or the whole plane.)

Thus if we let
$$F(A, B) = \int_0^B \int_0^A f(s, t)\,ds\,dt,$$
then

(1) the double integral $\iint_R f(s, t)\,ds\,dt$ is the double limit
$$\lim_{A,B\to\infty} F(A, B);$$

(2) the iterated integrals are given by the iterated limits
$$\int_0^\infty \int_0^\infty f(s, t)\,ds\,dt = \lim_{B\to\infty} \lim_{A\to\infty} F(A, B), \qquad \int_0^\infty \int_0^\infty f(s, t)\,dt\,ds = \lim_{A\to\infty} \lim_{B\to\infty} F(A, B).$$

The question of equality of the three integrals is then just a special case of the corresponding problem for limits. Thus the following result is of interest in this context.


Theorem 1-38. Let the double limit $L = \lim_{A,B\to\infty} F(A, B)$ exist; i.e., assume that for every $\epsilon > 0$ there is an $M > 0$ such that $|L - F(A, B)| < \epsilon$ whenever $A > M$ and $B > M$.

(1) If $\lim_{A\to\infty} F(A, B) = L(B)$ exists for every $B$, then
$$\lim_{B\to\infty} L(B) = L.$$

(2) If $\lim_{B\to\infty} F(A, B) = L(A)$ exists for every $A$, then
$$\lim_{A\to\infty} L(A) = L.$$

Proof. It suffices to prove (1). Now
$$|L - L(B)| \le |L - F(A, B)| + |F(A, B) - L(B)|. \tag{1-22}$$
Thus, given $\epsilon > 0$, choose $M_1$ so that $|L - F(A, B)| < \epsilon/2$ whenever $A > M_1$ and $B > M_1$. Now if $B > M_1$ is held fixed, we can find an $M > M_1$ such that $|F(A, B) - L(B)| < \epsilon/2$ whenever $A > M$. Hence $|L - L(B)| < \epsilon$, and since this can be done for any $B > M_1$, the proof is finished. ∎

Restated in terms of integrals, Theorem 1-38 asserts that if the double integral $L = \iint_R f(s, t)\,ds\,dt$ exists, then the existence of $\int_0^\infty f(s, t)\,dt$ for every $s$ implies that the iterated integral $\int_0^\infty \int_0^\infty f(s, t)\,dt\,ds$ exists and equals $L$ (and similarly with the opposite order of integration). More interesting, however, is the converse problem of inferring the existence of the double integral from the existence of one of the iterated integrals. For this, the notion of uniform convergence enters once again, and we accordingly restate Definition 1-14 in a form appropriate to limits.

Definition 1-16. $F(A, B)$ converges uniformly to $L(B)$ as $A \to \infty$ if for every $\epsilon > 0$ there is a number $M > 0$ (which depends in general on $\epsilon$ but not on $B$) such that
$$|F(A, B) - L(B)| < \epsilon$$
whenever $A > M$.

We can now state the main result.

Theorem 1-39. Suppose that $F(A, B)$ converges uniformly to $L(B)$ as $A \to \infty$, and that $L = \lim_{B\to\infty} L(B)$ exists. Then

(1) $\lim_{A,B\to\infty} F(A, B) = L$, and

(2) if $\lim_{B\to\infty} F(A, B) = L(A)$ exists for every $A$, then $\lim_{A\to\infty} L(A) = L$.

Proof. (1) We have
$$|L - F(A, B)| \le |L - L(B)| + |L(B) - F(A, B)|.$$
Given $\epsilon > 0$, choose $M_1$ so that the first term on the right-hand side is less than $\epsilon/2$ whenever $B > M_1$, and choose $M_2$ (using uniform convergence) so that the second term is less than $\epsilon/2$ whenever $A > M_2$. With $M = \max\{M_1, M_2\}$ we have $|L - F(A, B)| < \epsilon$ whenever $A$, $B$ are both greater than $M$. (2) The second result now follows from Theorem 1-38. ∎

For the special case of improper iterated integrals, the above theorem takes the following form:

Theorem 1-40. If the integral $\int_0^\infty f(s, t)\,dt$ converges uniformly, and if the iterated integral
$$\int_0^\infty \int_0^\infty f(s, t)\,dt\,ds$$
exists, then the double integral also exists and satisfies
$$\iint_R f(s, t)\,ds\,dt = \int_0^\infty \int_0^\infty f(s, t)\,dt\,ds.$$
If, moreover, the integral $\int_0^\infty f(s, t)\,ds$ also exists, then we have
$$\int_0^\infty \int_0^\infty f(s, t)\,ds\,dt = \int_0^\infty \int_0^\infty f(s, t)\,dt\,ds.$$

Corollary. If $f(s, t)$ is continuous in $a \le s \le b$, $0 \le t < \infty$, and if
$$F(s) = \int_0^\infty f(s, t)\,dt, \qquad a \le s \le b,$$
the convergence to $F(s)$ being uniform on $a \le s \le b$, then
$$\int_a^b F(s)\,ds = \int_a^b \int_0^\infty f(s, t)\,dt\,ds = \int_0^\infty \int_a^b f(s, t)\,ds\,dt.$$

Proof. We need only apply the theorem to the function $\bar f(s, t)$ defined on $0 \le s < \infty$, $0 \le t < \infty$ by
$$\bar f(s, t) = \begin{cases} f(s, t), & \text{if } a \le s \le b,\\ 0, & \text{otherwise.}\end{cases}$$
For then $\int_0^\infty \bar f(s, t)\,dt$ converges uniformly on $0 \le s < \infty$ to the function
$$\bar F(s) = \begin{cases} F(s), & \text{if } a \le s \le b,\\ 0, & \text{otherwise,}\end{cases}$$
and
$$\int_0^\infty \bar f(s, t)\,ds = \int_a^b f(s, t)\,ds$$
exists. ∎

The above theorems treat the problem of integrating functions defined by improper integrals. A useful result concerning the differentiation of such functions is the following:

Theorem 1-41. If $\partial f/\partial s\,(s, t)$ is piecewise continuous on $a \le s \le b$ for each $t$, and if
$$F(s) = \int_0^\infty f(s, t)\,dt \qquad\text{and}\qquad \int_0^\infty \frac{\partial f}{\partial s}(s, t)\,dt$$
both converge uniformly on $a \le s \le b$, then
$$F'(s) = \int_0^\infty \frac{\partial f}{\partial s}(s, t)\,dt.$$

Proof. Let $H(s) = \int_0^\infty \partial f/\partial s\,(s, t)\,dt$. Then
$$\int_a^u H(s)\,ds = \int_a^u \int_0^\infty \frac{\partial f}{\partial s}(s, t)\,dt\,ds = \int_0^\infty \int_a^u \frac{\partial f}{\partial s}(s, t)\,ds\,dt = \int_0^\infty [f(u, t) - f(a, t)]\,dt = F(u) - F(a),$$
whence $H(u) = F'(u)$ by differentiation. ∎

One of the main examples in the text of a function defined by an integral is the Laplace transform
$$\mathcal{L}[f](s) = \int_0^\infty e^{-st} f(t)\,dt = F(s) \tag{1-23}$$
of a piecewise continuous function. Since $f$ need not be continuous, the uniform convergence of (1-23) (established below) is not sufficient to establish the continuity of $F(s)$. However, for functions $f$ of exponential order we have, for $h > 0$,
$$|F(s + h) - F(s)| = \left|\int_0^\infty \bigl(e^{-(s+h)t} - e^{-st}\bigr) f(t)\,dt\right| = \left|\int_0^\infty \bigl(e^{-ht} - 1\bigr) e^{-st} f(t)\,dt\right| \le \int_0^\infty \bigl(1 - e^{-ht}\bigr) e^{-st}\,|f(t)|\,dt.$$
Thus if $\alpha$, $M$ are any constants chosen so that $|f(t)| \le Me^{\alpha t}$, it follows for $s > \alpha$ that
$$|F(s + h) - F(s)| \le M \int_0^\infty \bigl(1 - e^{-ht}\bigr) e^{-(s - \alpha)t}\,dt = M\left[\frac{1}{s - \alpha} - \frac{1}{h + s - \alpha}\right],$$
and hence that $F(s + h) - F(s)$ tends to zero as $h \to 0$ through positive values. A slight modification of the argument yields the same result if $h \to 0$ through negative values, and therefore the continuity of $F(s)$ is established for $s > \alpha$. We have thus proved the following theorem.

Theorem 1-42. If $f$ is piecewise continuous on $0 \le t < \infty$ and is of exponential order, and if $\alpha_0$ is the greatest lower bound of the set of real numbers $\alpha$ for which $|f(t)| \le Me^{\alpha t}$ (for some constant $M$), then $\mathcal{L}[f]$ is continuous on $\alpha_0 < s < \infty$.*

Finally, we justify the formula
$$\frac{d^n}{ds^n} \int_0^\infty e^{-st} f(t)\,dt = (-1)^n \int_0^\infty t^n e^{-st} f(t)\,dt$$
given in Section 5-5 of the text by showing that each of the integrals
$$\int_0^\infty t^n e^{-st} f(t)\,dt, \qquad n = 0, 1, 2, \ldots, \tag{1-24}$$
converges uniformly on $a \le s < \infty$, where $a > \alpha_0$ (see Theorem 1-42 for the definition of $\alpha_0$). In fact, choosing $s_1$ ($\alpha_0 < s_1 < a$) and $M$ so that
$$|f(t)| \le Me^{s_1 t},$$
we have, for $s \ge a$,
$$\left|\int_A^\infty t^n e^{-st} f(t)\,dt\right| \le M \int_A^\infty t^n e^{-(s - s_1)t}\,dt \le M \int_A^\infty t^n e^{-(a - s_1)t}\,dt.$$
But the last expression tends to zero as $A \to \infty$ ($t^n$ is of exponential order), and since it does not depend on $s$, the uniform convergence of (1-24) on $a \le s < \infty$ is established. In view of Theorem 1-41, any number of differentiations of $\mathcal{L}[f](s)$ may be performed by differentiating under the integral sign.

* The number $\alpha_0$, sometimes called the order of $f$, is greater than or equal to the abscissa of convergence $s_0$ of $f$. As was shown in the text proper, however, we may have $s_0 < \alpha_0$.
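The differentiation formula justified above can be checked numerically for the simplest transform. All choices below are ours: for $f(t) = 1$ we have $\mathcal{L}[f](s) = 1/s$, so the derivative is $-1/s^2$, and this should match $-\int_0^\infty t e^{-st}\,dt$ computed by quadrature on a truncated interval.

```python
import math

def midpoint_integral(g, a, b, n=200000):
    """Composite midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

s = 2.0
# -integral_0^inf t e^{-st} dt, truncated at t = 40 (the discarded tail is negligible)
numeric = -midpoint_integral(lambda t: t * math.exp(-s * t), 0.0, 40.0)
exact = -1.0 / s ** 2   # d/ds of L[1](s) = 1/s
```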
APPENDIX II

lerch's theorem

Let $f(t)$ be defined on $0 \le t < \infty$, and be piecewise continuous on every finite interval $0 \le t \le A$. Assume, moreover, that $f(t)$ is of exponential order, i.e., that there exist constants $\alpha$ and $M$ such that $|f(t)| \le Me^{\alpha t}$, $0 \le t < \infty$. It is the purpose of this appendix to demonstrate the following theorem.

Theorem. (Lerch's theorem.) If $\mathcal{L}[f](s) = \int_0^\infty e^{-st} f(t)\,dt$ is identically zero for all $s > s_0$, $s_0$ some constant, then $f(t)$ is identically zero (except possibly at its points of discontinuity).

Proof. Let $\phi(s) = \int_0^\infty e^{-st} f(t)\,dt$, $s > s_0$. Then if
$$P(x) = \sum_{k=0}^{n} a_k x^k$$
is any polynomial with real coefficients, we have
$$\int_0^\infty e^{-st} P(e^{-t}) f(t)\,dt = \int_0^\infty e^{-st} \sum_{k=0}^{n} a_k e^{-kt} f(t)\,dt = \sum_{k=0}^{n} a_k \int_0^\infty e^{-st}\,[e^{-kt} f(t)]\,dt = \sum_{k=0}^{n} a_k\,\phi(s + k) = 0, \qquad s > s_0.$$
Making the change of variable $x = e^{-t}$, this last condition transforms to
$$\int_0^1 x^{s-1} P(x)\,f(-\ln x)\,dx = 0, \qquad s > s_0.$$

Now choose a fixed $s_1 > \max\{s_0, 1, \alpha + 1\}$. Then
$$x^{s_1 - 1}\,|f(-\ln x)| \le M x^{s_1 - 1} e^{\alpha(-\ln x)} = M x^{s_1 - (\alpha + 1)},$$
and it follows that the function
$$G(x) = x^{s_1 - 1} f(-\ln x), \qquad 0 < x \le 1,$$
tends to zero as $x \to 0$. Let us define $G(0) = 0$, thus making $G$ continuous at $x = 0$. Then $G$ is a function which is bounded in the interval $0 \le x \le 1$, has only "jump" discontinuities in this interval (although there may be infinitely many such discontinuities), and satisfies
$$\int_0^1 G(x)P(x)\,dx = 0$$
for every polynomial $P$. We shall deduce from these conditions that $G(x) = 0$ for $0 \le x \le 1$ (except possibly at its points of discontinuity). In fact, let us choose a complete orthogonal basis for the vector space $\mathcal{PC}[0, 1]$ with inner product $f \cdot g = \int_0^1 f(x)g(x)\,dx$.* Then any piecewise continuous function $g$ which satisfies
$$\int_0^1 g(x)P(x)\,dx = 0$$
for every polynomial must be identically zero (except where it is discontinuous), for it is orthogonal to every member of the chosen basis and hence must be the zero vector in $\mathcal{PC}[0, 1]$.

Except for the fact that $G$ may have infinitely many discontinuities, the proof would be complete. Fortunately, it is not difficult to show that a complete orthonormal basis for $\mathcal{PC}[0, 1]$ is also a complete basis for the slightly larger class of functions which, like $G$, may have infinitely many jump discontinuities but are bounded.† We thus conclude that $G$ is identically zero wherever it is continuous, and hence, since $G(x) = x^{s_1 - 1} f(-\ln x)$, the same must be true of $f$. ∎

* We may choose, for example, the even-numbered Legendre polynomials as such a basis. For if
$$\int_0^1 g(x)P_{2k}(x)\,dx = 0, \qquad k = 0, 1, 2, \ldots,$$
then for the even extension $\bar g$ of $g$ to $-1 \le x \le 1$ we have
$$\int_{-1}^1 \bar g(x)P_n(x)\,dx = 0, \qquad n = 0, 1, 2, \ldots.$$
Thus the piecewise continuous function $\bar g$ must be zero (except at its points of discontinuity), and hence the same is true of $g$.

† The function $G(x)$ is, in fact, piecewise continuous in every interval $A \le x \le 1$ (for $0 < A < 1$) and is continuous at $x = 0$.
APPENDIX III

determinants

III-1 INTRODUCTION
We shall present here a brief introduction to determinants, with sufficient attention given to their properties to permit the usual applications. A summary of important properties, together with examples, is presented in Section III-4. The reader who wishes only a reminder concerning methods of evaluating determinants or their application to systems of linear equations may turn immediately to that section.

We wish to define a real-valued function $D(\mathbf{a}_1, \ldots, \mathbf{a}_n)$, where $\mathbf{a}_1, \ldots, \mathbf{a}_n$ are vectors in $\mathbb{R}^n$, such that
$$D(\mathbf{a}_1, \ldots, \mathbf{a}_n) = 0$$
if and only if $\mathbf{a}_1, \ldots, \mathbf{a}_n$ are linearly dependent. Such a function, when defined, will be called an $n \times n$ determinant and will be denoted in the more familiar form
$$\begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \vdots & & & \vdots\\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{vmatrix},$$
where the columns of the matrix $(a_{ij})$ are the components of $\mathbf{a}_1, \ldots, \mathbf{a}_n$ relative to the standard basis vectors $\mathbf{e}_1, \ldots, \mathbf{e}_n$ of $\mathbb{R}^n$.

Definition III-1. A real-valued function $D(\mathbf{a}_1, \ldots, \mathbf{a}_n)$ defined for vectors $\mathbf{a}_1, \ldots, \mathbf{a}_n$ in $\mathbb{R}^n$ is called an $n \times n$ determinant (or a determinant of order $n$) if it satisfies the following three conditions.

I. $D$ is linear in each of its $n$ variables; i.e., for $i = 1, \ldots, n$ we have
$$D(\mathbf{a}_1, \ldots, \alpha\mathbf{a}_i' + \beta\mathbf{a}_i'', \ldots, \mathbf{a}_n) = \alpha\,D(\mathbf{a}_1, \ldots, \mathbf{a}_i', \ldots, \mathbf{a}_n) + \beta\,D(\mathbf{a}_1, \ldots, \mathbf{a}_i'', \ldots, \mathbf{a}_n)$$
for any real numbers $\alpha$, $\beta$ and any vectors $\mathbf{a}_i'$, $\mathbf{a}_i''$, $\mathbf{a}_j$ in $\mathbb{R}^n$.

II. If $\mathbf{a}_i = \mathbf{a}_j$ for some $i, j$ ($i \ne j$), then $D(\mathbf{a}_1, \ldots, \mathbf{a}_n) = 0$.

III. $D(\mathbf{e}_1, \ldots, \mathbf{e}_n) = 1$.


Condition II of the definition is a very special case of the desired connection between linear dependence and the vanishing of determinants. This same connection would demand, of course, that $D(\mathbf{e}_1, \ldots, \mathbf{e}_n)$ be different from zero. Condition III may therefore be regarded as normalizing the value of $D(\mathbf{a}_1, \ldots, \mathbf{a}_n)$. We shall see, in fact, that the three conditions of the definition serve to determine the function $D$ uniquely (Theorem III-3).

Example. If $n = 1$, we are concerned with a single vector $\mathbf{a}_1 = a_1\mathbf{e}_1$ of $\mathbb{R}^1$. Then by Conditions I and III, we have
$$D(\mathbf{a}_1) = D(a_1\mathbf{e}_1) = a_1\,D(\mathbf{e}_1) = a_1.$$
Similarly, if $n = 2$ and the vectors $\mathbf{a}_1 = a_{11}\mathbf{e}_1 + a_{21}\mathbf{e}_2$ and $\mathbf{a}_2 = a_{12}\mathbf{e}_1 + a_{22}\mathbf{e}_2$ are given, then I and II yield
$$D(\mathbf{a}_1, \mathbf{a}_2) = D(a_{11}\mathbf{e}_1 + a_{21}\mathbf{e}_2,\; a_{12}\mathbf{e}_1 + a_{22}\mathbf{e}_2)$$
$$= a_{11}\,D(\mathbf{e}_1,\; a_{12}\mathbf{e}_1 + a_{22}\mathbf{e}_2) + a_{21}\,D(\mathbf{e}_2,\; a_{12}\mathbf{e}_1 + a_{22}\mathbf{e}_2)$$
$$= a_{11}\,[a_{12}\,D(\mathbf{e}_1, \mathbf{e}_1) + a_{22}\,D(\mathbf{e}_1, \mathbf{e}_2)] + a_{21}\,[a_{12}\,D(\mathbf{e}_2, \mathbf{e}_1) + a_{22}\,D(\mathbf{e}_2, \mathbf{e}_2)]$$
$$= a_{11}a_{22}\,D(\mathbf{e}_1, \mathbf{e}_2) + a_{21}a_{12}\,D(\mathbf{e}_2, \mathbf{e}_1).$$
Moreover, by Condition III, $D(\mathbf{e}_1, \mathbf{e}_2) = 1$, and by Theorem III-2 below, $D(\mathbf{e}_2, \mathbf{e}_1) = -1$. Thus
$$D(\mathbf{a}_1, \mathbf{a}_2) = a_{11}a_{22} - a_{21}a_{12}.$$

Since the function $D$ is defined for $n$-tuples of vectors
$$\mathbf{a}_1 = \begin{pmatrix} a_{11}\\ \vdots\\ a_{n1} \end{pmatrix}, \quad \ldots, \quad \mathbf{a}_n = \begin{pmatrix} a_{1n}\\ \vdots\\ a_{nn} \end{pmatrix},$$
we could just as easily view it as defined on the set of $n \times n$ matrices
$$M = \begin{pmatrix} a_{11} & \cdots & a_{1n}\\ \vdots & & \vdots\\ a_{n1} & \cdots & a_{nn} \end{pmatrix}.$$
In this case we follow the usual custom of denoting its value by $\det M$ or by $|a_{ij}|$. Thus
$$\begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \vdots & & & \vdots\\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{vmatrix} = D(\mathbf{a}_1, \ldots, \mathbf{a}_n).$$
The results of the example above then take the more familiar forms
$$|a_1| = a_1 \qquad\text{and}\qquad \begin{vmatrix} a_{11} & a_{12}\\ a_{21} & a_{22} \end{vmatrix} = a_{11}a_{22} - a_{21}a_{12}.$$
These formulas will be generalized in the next section.



III-2 BASIC PROPERTIES OF DETERMINANTS


Let $D$ be a function satisfying Conditions I through III of Definition III-1.

Theorem III-1. If one of the vectors $\mathbf{a}_1, \ldots, \mathbf{a}_n$ is the zero vector, then $D(\mathbf{a}_1, \ldots, \mathbf{a}_n) = 0$.

Proof. Say $\mathbf{a}_i = \mathbf{0}$. Then $\mathbf{a}_i = 0 \cdot \mathbf{a}_i$, and by Condition I,
$$D(\mathbf{a}_1, \ldots, \mathbf{a}_i, \ldots, \mathbf{a}_n) = D(\mathbf{a}_1, \ldots, 0 \cdot \mathbf{a}_i, \ldots, \mathbf{a}_n) = 0 \cdot D(\mathbf{a}_1, \ldots, \mathbf{a}_i, \ldots, \mathbf{a}_n) = 0. \;∎$$

Theorem III-2. If the sequence of vectors $\mathbf{b}_1, \ldots, \mathbf{b}_n$ is obtained from $\mathbf{a}_1, \ldots, \mathbf{a}_n$ by interchanging $\mathbf{a}_i$ and $\mathbf{a}_j$ ($i < j$), then
$$D(\mathbf{b}_1, \ldots, \mathbf{b}_n) = -D(\mathbf{a}_1, \ldots, \mathbf{a}_n).$$

Proof. Replacing both $\mathbf{a}_i$ and $\mathbf{a}_j$ in $D(\mathbf{a}_1, \ldots, \mathbf{a}_i, \ldots, \mathbf{a}_j, \ldots, \mathbf{a}_n)$ by $\mathbf{a}_i + \mathbf{a}_j$, and applying Conditions I and II, we have
$$0 = D(\mathbf{a}_1, \ldots, \mathbf{a}_i + \mathbf{a}_j, \ldots, \mathbf{a}_i + \mathbf{a}_j, \ldots, \mathbf{a}_n)$$
$$= D(\mathbf{a}_1, \ldots, \mathbf{a}_i, \ldots, \mathbf{a}_i, \ldots, \mathbf{a}_n) + D(\mathbf{a}_1, \ldots, \mathbf{a}_i, \ldots, \mathbf{a}_j, \ldots, \mathbf{a}_n)$$
$$\quad + D(\mathbf{a}_1, \ldots, \mathbf{a}_j, \ldots, \mathbf{a}_i, \ldots, \mathbf{a}_n) + D(\mathbf{a}_1, \ldots, \mathbf{a}_j, \ldots, \mathbf{a}_j, \ldots, \mathbf{a}_n).$$
But the first and fourth terms of this sum vanish (by Condition II), and the sum of the second and third terms is therefore zero, as desired. ∎

By applying Theorem III-2 repeatedly to adjacent vectors in the list $\mathbf{a}_1, \ldots, \mathbf{a}_n$, we obtain the following corollary.

Corollary. If the sequence of vectors $\mathbf{b}_1, \ldots, \mathbf{b}_n$ is obtained from $\mathbf{a}_1, \ldots, \mathbf{a}_n$ by shifting one of the $\mathbf{a}_i$ $k$ places to the left or right, then
$$D(\mathbf{b}_1, \ldots, \mathbf{b}_n) = (-1)^k\,D(\mathbf{a}_1, \ldots, \mathbf{a}_n).$$

Now suppose that we take some permutation $\mathbf{e}_{p(1)}, \mathbf{e}_{p(2)}, \ldots, \mathbf{e}_{p(n)}$ of the standard basis vectors of $\mathbb{R}^n$.* Then, by successively interchanging pairs of vectors in this list, we can rearrange the vectors into the "natural" order $\mathbf{e}_1, \ldots, \mathbf{e}_n$, and thus by Theorem III-2 we have
$$D(\mathbf{e}_{p(1)}, \ldots, \mathbf{e}_{p(n)}) = \pm D(\mathbf{e}_1, \ldots, \mathbf{e}_n) = \pm 1, \tag{III-1}$$
the plus or minus sign being chosen according as the number of interchanges required for this rearrangement is even or odd. It is of course an essential fact (which we shall not prove here) that the number of interchanges is either always even or always odd for all possible ways of carrying out the above rearrangement. The permutation $p$ itself is accordingly said to be an even permutation or an odd permutation depending on which of these two possibilities holds.* Let us put
$$\sigma(p) = \begin{cases} +1 & \text{if } p \text{ is an even permutation,}\\ -1 & \text{if } p \text{ is an odd permutation.}\end{cases}$$
Then (III-1) becomes
$$D(\mathbf{e}_{p(1)}, \ldots, \mathbf{e}_{p(n)}) = \sigma(p). \tag{III-2}$$

* A permutation of the set $\{1, \ldots, n\}$ is just a one-to-one function $p$ mapping this set onto itself.

We are now in a position to "compute" the value of D(a 1 , . . . , a n ) for any n


n
vectors of (R . For if
n
ai = flue! + a 2 ie2 + • •
+ onl e n = ^ a^ey,

n
an = a ln e l + a 2n ^% + • • •
+ a nn e n =
3=1
^ a jn tj,

then repeated application of property I yields

n n n \

Z>(a 1? . . . , an ) =
Z
( y=i
a J'
ie J» S
y=i
a & e J> • • • ' £
j=i
a i**i
)
'

n / n n \

= 2
j i=i
a hi D \*h>
\
Yj
y=i
a i&i> • •

' 2
y=i
a '» e
>
/
n n j n \

= 2
y,=i
fl
J'ii 2
y 2 =i
a i22^(ei
\
1 ,
ey 2 , • • • , 2
j=i
fl y« e i)
'

= Z) a iii
Z) ^ 22
• • •

J2 a Jnn D{e h e J2 , , . . . , e jn )
3i=i h=i y n =i

n n
=
3l
S= S l 32=1
" "

3n=l
S a hi a h2 • ' ' a Jnn D{e h e J2 , , . . . , ey n ).

In this last sum, however, the only terms which are different from zero are those
for which the sequence j u ,jn is a permutation of the numbers 1, ...,«; for
. . .

only in these terms are there no repetitions among ey , . . . , ey . We may rewrite

* A well-known method of determining whether p is even or odd is to count the number I of inversions in the list p(1), p(2), . . . , p(n), i.e., the number of pairs p(i), p(j) of these integers for which i < j and p(i) > p(j). The permutation p is even or odd according as I is even or odd. For example, the list 3, 1, 2, 5, 4 contains three inversions; hence an odd number of interchanges is required to rearrange this list into the natural order.

684 DETERMINANTS | APPENDIX III

this sum, therefore, in the form (using III-2)

    D(a_1, . . . , a_n) = Σ_p σ(p) a_{p(1)1} a_{p(2)2} · · · a_{p(n)n},    (III-3)

where the notation indicates that the sum is extended over the n! permutations p of {1, . . . , n}.

The derivation of Eq. (III-3) from Properties I through III of Definition III-1 demonstrates that if there is a function D which satisfies these properties, then its values must be given by Eq. (III-3). Thus there is at most one function (for each n) satisfying the conditions of Definition III-1. Moreover, it is not difficult to show that the function defined by Eq. (III-3) does in fact satisfy I through III. We shall omit these details and merely summarize the results in the following theorem.

Theorem III-3. For each n, there is one and only one function D satisfying Properties I through III of Definition III-1. Its values are given by the formula

    D(a_1, . . . , a_n) = Σ_p σ(p) a_{p(1)1} a_{p(2)2} · · · a_{p(n)n}.    (III-4)

If we compare formula (III-4) with the matrix

    M = |a_11  a_12  · · ·  a_1n|
        |a_21  a_22  · · ·  a_2n|
        |        . . .          |
        |a_n1  a_n2  · · ·  a_nn|,

we note that the right-hand side of (III-4) consists of the sum of n! terms, each one a product of n factors chosen from among the entries of M in such a way that no two of these factors occur in the same row or the same column of M.

Example. We can also use (III-4) to obtain the result of the example in the preceding section. For the value of the 2 × 2 determinant is given by (III-4) as

    D(a_1, a_2) = |a_11  a_12|
                  |a_21  a_22| = σ(p_1) a_11 a_22 + σ(p_2) a_21 a_12 = a_11 a_22 − a_21 a_12,

the signs of σ(p_1) and σ(p_2) being determined by counting in their respective terms the number of inversions in the arrangement of (first) subscripts.

The 3 × 3 determinant D(a_1, a_2, a_3) may also be evaluated by direct application of Eq. (III-4), and in this case we obtain

    D(a_1, a_2, a_3) = |a_11  a_12  a_13|
                       |a_21  a_22  a_23|
                       |a_31  a_32  a_33|

    = a_11 a_22 a_33 + a_21 a_32 a_13 + a_31 a_12 a_23
      − a_31 a_22 a_13 − a_21 a_12 a_33 − a_11 a_32 a_23.

For practical purposes, Eq. (III-4) is of little use for determinants of order greater than three. Indeed the expansion of a 4 × 4 determinant would have 24 terms, that of a 10 × 10 determinant would have 10! = 3,628,800 terms, and so forth. The remainder of this section and the next are devoted, therefore, to properties of determinants which lead to simpler procedures for computing their values.
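The "computation" leading to (III-4) can nevertheless be carried out mechanically for small orders. The following sketch (the function names are ours, not the text's) evaluates a determinant straight from the permutation formula, using the inversion-counting rule of the footnote on page 683:

```python
from itertools import permutations

def sigma(p):
    """Sign of a permutation p, found by counting inversions:
    +1 if the number of pairs i < j with p[i] > p[j] is even, -1 if odd."""
    inv = sum(1 for i in range(len(p))
                for j in range(i + 1, len(p)) if p[i] > p[j])
    return 1 if inv % 2 == 0 else -1

def det_by_permutations(a):
    """Evaluate det(a) directly from formula (III-4):
    the sum over all permutations p of sigma(p) * a_{p(1)1} ... a_{p(n)n}."""
    n = len(a)
    total = 0
    for p in permutations(range(n)):
        term = sigma(p)
        for j in range(n):
            term *= a[p[j]][j]   # one factor from each column, row chosen by p
        total += term
    return total
```

As the text warns, this runs through all n! permutations, so it is practical only for small n; its value here is that it is a literal transcription of (III-4).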
From this point on we shall use in most cases the matrix notation

    |a_11  a_12  · · ·  a_1n|
    |a_21  a_22  · · ·  a_2n|
    |        . . .          |
    |a_n1  a_n2  · · ·  a_nn|

for the n × n determinant D(a_1, . . . , a_n). In conjunction with this notation we shall refer to the vectors

    a_j = (a_1j, a_2j, . . . , a_nj)',    j = 1, 2, . . . , n,

as the columns of the determinant and to the vectors

    (a_i1, a_i2, . . . , a_in),    i = 1, 2, . . . , n,

as its rows. Multiplying a column (row) by a real number a or adding two columns (rows) is to be interpreted, therefore, as performing these operations on the corresponding vectors. We shall allow ourselves the usual misuse of language which confuses a function with its values, and in this case we shall often speak of the determinant |a_ij| when we are really speaking about the matrix (a_ij) or about the value of the function D on this matrix. Context will always provide the exact meaning.

Theorem III-4. Let M = (a_ij) be an n × n matrix and let M' = (b_ij) be the transpose of M, i.e. the matrix whose columns are the rows of M (thus b_ij = a_ji). Then

    det M = det M'.

Proof. From (III-4) we have

    det M = Σ_p σ(p) a_{p(1)1} a_{p(2)2} · · · a_{p(n)n}

and

    det M' = Σ_p σ(p) b_{p(1)1} b_{p(2)2} · · · b_{p(n)n}
           = Σ_p σ(p) a_{1p(1)} a_{2p(2)} · · · a_{np(n)}.

Thus the n! products which enter into the expansion of det M are precisely those which occur in the expansion of det M', and we need only show that they occur with the same algebraic signs. For this purpose let

    a_{p(1)1} a_{p(2)2} · · · a_{p(n)n}    and    a_{1q(1)} a_{2q(2)} · · · a_{nq(n)}

be two corresponding products, i.e. products which differ only in the order of their factors. If we apply the permutation p to the second product (or more exactly to the subscripts of its factors) we obtain

    a_{p(1)p(q(1))} a_{p(2)p(q(2))} · · · a_{p(n)p(q(n))}.

But by definition of the transpose this must agree with the first product, and hence p(q(i)) = i for i = 1, 2, . . . , n. This implies that either p and q are both even permutations or else both are odd. Thus σ(p) = σ(q), and we are done. |

The last theorem may be stated informally by saying that the value of a determinant is unchanged if its rows and columns are interchanged. This result permits us to concentrate on just the columns of a determinant. Each theorem that we prove will remain true if the word "column" is replaced by "row" and vice versa.
Certain useful properties of determinants are immediate consequences of Definition III-1. For example, it follows from Property I that the value of a determinant is multiplied by the real number k if each of the entries in a single column or row is multiplied by k. We also easily obtain the following useful result.
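Both facts are easy to spot-check numerically. The sketch below uses a six-term rule for a 3 × 3 determinant (the helper names `det3` and `transpose` are ours) to confirm the transpose and scaling properties on a sample matrix:

```python
def det3(m):
    # six-term rule for a 3 x 3 determinant (the expansion of (III-4) for n = 3)
    (a, b, c), (d, e, f), (g, h, i) = m
    return a*e*i + d*h*c + g*b*f - g*e*c - d*b*i - a*h*f

def transpose(m):
    return [list(col) for col in zip(*m)]

m = [[2, -1, 3], [0, 1, -6], [4, 2, 7]]
assert det3(m) == det3(transpose(m))        # Theorem III-4: det M = det M'
scaled = [[5 * x for x in row] for row in m]  # every row multiplied by 5
assert det3(scaled) == 5**3 * det3(m)         # one factor of 5 per scaled row
```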

Theorem III-5. The value of a determinant is not changed by adding a multiple of the jth column (row) to the ith column (row) if i ≠ j.

Proof. This is an immediate consequence of Properties I and II, for

    D(a_1, . . . , a_i + k a_j, . . . , a_j, . . . , a_n)
      = D(a_1, . . . , a_i, . . . , a_j, . . . , a_n) + k D(a_1, . . . , a_j, . . . , a_j, . . . , a_n),

and the second term of the last expression vanishes. |
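Theorem III-5 can be tested on a concrete determinant; in this sketch (with a hypothetical `det3` helper implementing the six-term rule for n = 3) we add 5 times the first column to the third and check that the value is unchanged:

```python
def det3(m):
    # six-term rule for a 3 x 3 determinant
    (a, b, c), (d, e, f), (g, h, i) = m
    return a*e*i + d*h*c + g*b*f - g*e*c - d*b*i - a*h*f

m = [[2, -1, 3], [0, 1, -6], [4, 2, 7]]
m2 = [[r[0], r[1], r[2] + 5 * r[0]] for r in m]   # column 3 += 5 * column 1
assert det3(m) == det3(m2)                        # Theorem III-5: value unchanged
```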

Theorem III-6. If the vectors a_1, . . . , a_n are linearly dependent, then

    D(a_1, . . . , a_n) = |a_11 · · · a_1n|
                          |     . . .     |
                          |a_n1 · · · a_nn| = 0.

Proof. Suppose a_i = Σ_{j=1}^n c_j a_j, with c_i = 0. Then using Property I, we have

    D(a_1, . . . , a_i, . . . , a_n) = D(a_1, . . . , Σ_{j=1}^n c_j a_j, . . . , a_n)
                                     = Σ_j c_j D(a_1, . . . , a_j, . . . , a_n).

But in this sum, the ith term vanishes because c_i = 0, and each of the other terms also vanishes because in each case two of the vectors to which D is applied are equal. |

The last theorem provides half of the desired relationship between determinants and linear dependence. The converse is obtained in the next section, where we introduce one further topic essential for applications.

III-3 MINORS AND COFACTORS


Let us begin this section by evaluating D(e_1, a_2, . . . , a_n), where a_2, . . . , a_n are any n − 1 vectors of R^n. Applying Theorem III-3, we write the value of this determinant in the form

    |1  a_12 · · · a_1n|
    |0  a_22 · · · a_2n|
    |       . . .      | = Σ_p σ(p) a_{p(1)1} a_{p(2)2} · · · a_{p(n)n}.    (III-5)
    |0  a_n2 · · · a_nn|

But the only terms in this sum which are different from zero are those for which the permutation p satisfies p(1) = 1. Moreover, for every such permutation, the number of inversions in the sequence p(1), p(2), . . . , p(n) is the same as the number of inversions in the shorter sequence p(2), p(3), . . . , p(n). The value of (III-5), therefore, reduces to

    Σ_q σ(q) a_{q(2)2} a_{q(3)3} · · · a_{q(n)n},

where q ranges over all permutations of {2, 3, . . . , n}. That is,

    |1  a_12 · · · a_1n|
    |0  a_22 · · · a_2n|     |a_22 · · · a_2n|
    |       . . .      |  =  |     . . .     |    (III-6)
    |0  a_n2 · · · a_nn|     |a_n2 · · · a_nn|.

Now suppose we wish to evaluate the determinant

    D(a_1, . . . , a_{j−1}, e_i, a_{j+1}, . . . , a_n)

      = |a_11 · · · a_{1,j−1}  0  a_{1,j+1} · · · a_1n|
        |                 . . .                       |
        |a_i1 · · · a_{i,j−1}  1  a_{i,j+1} · · · a_in|    (III-7)
        |                 . . .                       |
        |a_n1 · · · a_{n,j−1}  0  a_{n,j+1} · · · a_nn|.

By applying the corollary to Theorem III-2, we may move the jth column j − 1 places to the left, thus multiplying the value of the determinant by (−1)^{j−1}. Then, applying this corollary once again (in conjunction with Theorem III-4), we move the ith row up i − 1 places, this time multiplying the value of the determinant by (−1)^{i−1}. The resulting determinant now has the form (III-5) and hence can be reduced to an (n − 1) × (n − 1) determinant by applying Eq. (III-6). The value of D = D(a_1, . . . , a_{j−1}, e_i, a_{j+1}, . . . , a_n) is thus given by

    D = (−1)^{i+j} M_ij,    (III-8)

where M_ij is the determinant obtained from (III-7) by deleting the ith row and the jth column. We give formal status to this new (n − 1) × (n − 1) determinant in the following definition.

Definition III-2. Let D be any n × n determinant. For each i, j (1 ≤ i, j ≤ n) let M_ij be the (n − 1) × (n − 1) determinant obtained from D by deleting the ith row and the jth column, and put A_ij = (−1)^{i+j} M_ij. Then A_ij is called the cofactor of the entry a_ij, and M_ij the minor determinant of this entry.

We are now in a position to prove the main theorem of this section.

Theorem III-7. For any n × n determinant D = |a_ij|, we have

    D = Σ_{i=1}^n a_ij A_ij,    j = 1, 2, . . . , n,    (III-9)

and

    D = Σ_{j=1}^n a_ij A_ij,    i = 1, 2, . . . , n.    (III-10)

Proof. Let j be fixed and write a_j in the form

    a_j = Σ_{i=1}^n a_ij e_i.

Applying Property I of Definition III-1 followed by Eq. (III-8), we have

    D = D(a_1, . . . , a_j, . . . , a_n)
      = Σ_{i=1}^n a_ij D(a_1, . . . , e_i, . . . , a_n)
      = Σ_{i=1}^n a_ij A_ij.

This proves (III-9), and (III-10) now follows from Theorem III-4, for if B_ij is the cofactor of b_ij in the transpose, then Σ_i b_ij B_ij = Σ_i a_ji A_ji = Σ_j a_ij A_ij. |

Equation (III-9) is often referred to as the expansion of D by cofactors of the (entries of the) jth column. Similarly (III-10) gives the expansion of D by cofactors of the ith row. Either of these formulas permits us to reduce an n × n determinant to a sum of n determinants, each of order n − 1. They provide, therefore, a very efficient method for evaluating arbitrary determinants, especially when used in conjunction with Theorem III-5. Before turning to examples, we obtain one important corollary of the last theorem, and apply these results to obtain the converse of Theorem III-6.
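Formulas (III-9) and (III-10) translate directly into a recursive procedure. The sketch below (our own naming, not the text's) expands along the first column, exactly as in (III-9) with j = 1:

```python
def det(m):
    """Determinant by cofactor expansion down the first column (III-9, j = 1)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for i in range(n):
        # minor M_{i1}: delete row i and column 1
        minor = [row[1:] for k, row in enumerate(m) if k != i]
        total += m[i][0] * (-1) ** i * det(minor)   # a_{i1} * A_{i1}
    return total
```

Each call spawns n smaller determinants, so the cost is still on the order of n!; the practical point of Theorem III-5 is to create zeros first so that most of these terms drop out.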

Corollary. For any n × n determinant D = |a_ij|, we have

    Σ_{i=1}^n a_ij A_ik = 0,    j, k = 1, . . . , n,  j ≠ k,    (III-11)

and

    Σ_{j=1}^n a_ij A_kj = 0,    i, k = 1, . . . , n,  i ≠ k.    (III-12)

Proof. Every term of (III-11) is unaltered if the kth column of D is changed. Hence we may set a_ik = a_ij, i = 1, 2, . . . , n. Then

    Σ_{i=1}^n a_ij A_ik = Σ_{i=1}^n a_ik A_ik.

But the latter sum vanishes since it is the expansion by cofactors of the kth column of a determinant whose jth and kth columns are equal. Similar reasoning applies to (III-12), using (III-10) in place of (III-9). |

Theorem III-8. A necessary and sufficient condition that a set of n vectors a_1, . . . , a_n in R^n be linearly dependent is that D(a_1, . . . , a_n) = 0. Thus an n × n matrix A is nonsingular if and only if det (A) ≠ 0.*

Proof. The necessity of the condition has already been established in Theorem III-6. To prove the sufficiency we must show, conversely, that if det (A) = 0, then the columns a_1, . . . , a_n of A are linearly dependent.

Consider first the case in which the cofactor A_pq of some entry a_pq of A is different from zero. Then (III-10) and (III-12) yield

    Σ_{j=1}^n a_ij A_pj = 0,    i = 1, 2, . . . , n,    (III-13)

and we conclude that

    Σ_{j=1}^n A_pj a_j = 0.

* A matrix A is called nonsingular if it has an inverse, i.e., if there is a matrix B such that AB = BA = I. The existence of such an inverse, usually denoted A^{−1}, is equivalent to the linear independence of the columns of A.



Since at least one of the coefficients (viz. A_pq) in this linear combination is not zero, the vectors a_1, . . . , a_n are linearly dependent, and we are done.

If the cofactor of every element of A is zero, then (except in the case in which a_j = 0 for all j, where the theorem is obviously true) there exists an integer r, 1 ≤ r ≤ n − 2, such that for some r × r submatrix M we have det (M) ≠ 0 but for every larger submatrix M' we have det (M') = 0.* Since the order of rows and columns is immaterial, we may assume that M lies in the first r rows and columns of A, i.e. that

    M = |a_11 · · · a_1r|
        |     . . .     |
        |a_r1 · · · a_rr|.

Now choose k so that r + 1 ≤ k ≤ n and let M' be the submatrix

    M' = |a_11 · · · a_1r  a_{1,r+1}|
         |         . . .            |
         |a_r1 · · · a_rr  a_{r,r+1}|
         |a_k1 · · · a_kr  a_{k,r+1}|.

We note that for j = 1, 2, . . . , r + 1, the cofactor of a_kj in M' is independent of k. Denote it by A_j. Then (III-10) and (III-12) once again yield

    Σ_{j=1}^{r+1} a_ij A_j = 0,    i = 1, . . . , r, or i = k.    (III-14)

Since k was chosen arbitrarily from {r + 1, . . . , n}, the relation (III-14) holds for i = 1, 2, . . . , n, and hence is equivalent to

    Σ_{j=1}^{r+1} A_j a_j = 0.

Since

    A_{r+1} = det (M) ≠ 0,

we thus conclude that a_1, . . . , a_{r+1} are linearly dependent and hence that the same is true of a_1, . . . , a_n. |
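A quick numerical illustration of Theorem III-8, again with a hypothetical six-term `det3` helper: if the third column is a combination of the first two, the determinant vanishes.

```python
def det3(m):
    # six-term rule for a 3 x 3 determinant
    (a, b, c), (d, e, f), (g, h, i) = m
    return a*e*i + d*h*c + g*b*f - g*e*c - d*b*i - a*h*f

# third column = column 1 + 2 * column 2, so the columns are dependent
m = [[1, 4, 1 + 2*4],
     [2, 5, 2 + 2*5],
     [3, 6, 3 + 2*6]]
assert det3(m) == 0
```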

III-4 SUMMARY AND EXAMPLES


To each n × n matrix (a_ij) we associate a real number called the determinant of the matrix (or determinant of order n) and denoted by

    |a_ij| = |a_11  a_12  · · ·  a_1n|
             |         . . .         |    (III-15)
             |a_n1  a_n2  · · ·  a_nn|.
* By an r × s submatrix of a given m × n matrix A, we mean a matrix obtained from A by deleting m − r rows and n − s columns.

The value of this determinant is given by

    |a_ij| = Σ_p σ(p) a_{p(1)1} a_{p(2)2} · · · a_{p(n)n},    (III-16)

where p ranges over all n! permutations of {1, . . . , n} and σ(p) is +1 or −1 according as the number of inversions in the sequence p(1), p(2), . . . , p(n) is even or odd (see footnote on page 683).


Except for determinants of small order, however, Formula (III-16) is of little practical use. In fact even a 4 × 4 determinant would already have 4! = 24 terms in its expansion, and the evaluation of a 100 × 100 determinant by means of this formula would be completely out of the question.*
Fortunately, quite efficient techniques exist for computing the value of |a_ij|. They are based on properties of determinants developed in the preceding sections, which we review here and illustrate through examples. In so doing, we recall that when we speak of the columns (rows) of the determinant (III-15) we are really referring to the column (row) vectors of the matrix (a_ij). This terminology is customary, though nonsensical, and fortunately leads to no great confusion in practice. Operations performed on columns or rows (for example, multiplying a column (row) by a constant or adding one column (row) to another) are understood as operations on vectors. For more formal statements of the following properties, the reader may refer to Sections III-1 through III-3.

Summary of properties of determinants

(1) If a single column (row) of |a_ij| is multiplied by the constant k, the value of the determinant is multiplied by k.

(2) If two columns (rows) of |a_ij| are interchanged, the value of the determinant is multiplied by −1.

(3) The value of the determinant |a_ij| is unchanged if a multiple of one column (row) is added to another column (row).

(4) The determinant |a_ij| vanishes if and only if its columns (rows) are linearly dependent. In particular it vanishes if two columns (rows) are equal or if one is a multiple of another.

(5) The value of the determinant |a_ij| is unchanged if its rows and columns are interchanged.

(6) If each entry in a fixed column of |a_ij| is multiplied by its cofactor† and the results are added, the sum is the value of the determinant; i.e., for any fixed j, 1 ≤ j ≤ n,

    |a_ij| = Σ_{i=1}^n a_ij A_ij.    (III-17)

The corresponding statement for rows is also true. Namely, for any fixed i, 1 ≤ i ≤ n,

    |a_ij| = Σ_{j=1}^n a_ij A_ij.    (III-18)

Equation (III-17) is called the expansion of |a_ij| by cofactors of the jth column and (III-18) the expansion of |a_ij| by cofactors of the ith row. One consequence of (4) and (6) is the following property:

(7) If each entry of a fixed column (row) of |a_ij| is multiplied by the cofactor of the corresponding entry of a second column (row), and the results are added, the sum is zero. More precisely,

    Σ_{i=1}^n a_ij A_ik = 0,    j, k = 1, . . . , n,  j ≠ k,
                                                             (III-19)
    Σ_{j=1}^n a_ij A_kj = 0,    i, k = 1, . . . , n,  i ≠ k.

* It has been estimated that the fastest modern electronic computer would require several centuries to compute and add the 100! terms of (III-16).
† See Definition III-2 for the meaning of the term cofactor.
Example 1. For determinants of order ≤ 3, it is not difficult to apply (III-16) directly. For the first-order determinant |a| we obtain, of course, |a| = a, and for determinants of orders 2 and 3 we obtain, respectively,

    |a_11  a_12|
    |a_21  a_22| = a_11 a_22 − a_21 a_12,

    |a_11  a_12  a_13|
    |a_21  a_22  a_23| = a_11 a_22 a_33 + a_31 a_12 a_23 + a_21 a_32 a_13
    |a_31  a_32  a_33|   − a_31 a_22 a_13 − a_21 a_12 a_33 − a_11 a_32 a_23.

The last of these formulas is frequently remembered by forming the six products indicated by the arrows in Fig. III-1 and assigning a plus or minus sign according

[FIGURE III-1]
as the arrow points downward or upward. Thus,

    |2  −1   3|
    |0   1  −6| = 2·1·7 + 4·(−1)·(−6) + 3·0·2
    |4   2   7|     − 4·1·3 − (−6)·2·2 − (−1)·0·7
                = 14 + 24 + 0 − 12 + 24 − 0 = 50,

while

    |3  −7|
    |4   1| = 3·1 − (−7)·4 = 3 + 28 = 31.

Example 2. The rule for expansion by cofactors may be used to evaluate the third-order determinant of Example 1. Expanding by cofactors of the first row, for example, we have

    |2  −1   3|
    |0   1  −6| = 2 |1  −6| + (−1)(−1) |0  −6| + 3 |0  1|
    |4   2   7|     |2   7|            |4   7|     |4  2|

    = 2(7 + 12) + (0 + 24) + 3(0 − 4)
    = 2·19 + 24 − 3·4 = 50.

Expanding instead by cofactors of the first column, we have

    |2  −1   3|
    |0   1  −6| = 2 |1  −6| + 4 |−1   3|
    |4   2   7|     |2   7|    | 1  −6|

    = 2(7 + 12) + 4(6 − 3) = 50.

Example 3. The expansion of a determinant by cofactors is particularly simple if most of the entries in some column or row are zero, and this suggests that we first apply Property (3) above to bring about this situation. As an illustration we evaluate the fourth-order determinant

    D = | 3   7   9  11|
        |11   5  −7  −2|
        | 6   2   4   9|
        | 8   0   2  −5|.

Multiplying the third column by −4 and adding it to the first column introduces a second zero into the fourth row. We then add twice the third column to the fourth column, introducing a −1 into the lower right corner. Thus

    D = |−33   7   9   29|     |−33   7   67   29|
        | 39   5  −7  −16|  =  | 39   5  −39  −16|
        |−10   2   4   17|     |−10   2   38   17|
        |  0   8   2   −1|     |  0   8    0   −1|,
the second determinant being obtained from the first by adding twice the fourth column to the third column. If we now expand by cofactors of the fourth row, we obtain

    D = (−1)^{4+4}(−1) |−33   7   67|      |−33   7   67|
                       | 39   5  −39|  = − | 39   5  −39|
                       |−10   2   38|      |−10   2   38|,

and the computation could be completed as in Example 1. We continue, however, by factoring a 2 from the third row with the aid of Property (1) and then adding multiples of this row to each of the first two rows. That is,

    D = −2 |−33   7   67|       |  2   0   −66|       |  2   0   −66|
           | 39   5  −39|  = −2 | 39   5   −39|  = −2 | 64   0  −134|
           | −5   1   19|       | −5   1    19|       | −5   1    19|.

We now expand by cofactors of the second column, obtaining

    D = (−2)(−1) | 2   −66|
                 |64  −134| = 2(−268 + 4224) = 2·3956 = 7912.
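The column and row reductions of Example 3 are exactly what a systematic elimination procedure does. The following sketch (our own formulation, not the text's) reduces the rows to triangular form using Properties (1)-(3), in exact rational arithmetic, and multiplies the pivots:

```python
from fractions import Fraction

def det_by_elimination(m):
    """Triangularize with row operations: swaps flip the sign (Property (2)),
    adding a multiple of one row to another changes nothing (Property (3)),
    and the determinant is then the signed product of the pivots."""
    a = [[Fraction(x) for x in row] for row in m]
    n, sign = len(a), 1
    for k in range(n):
        piv = next((i for i in range(k, n) if a[i][k] != 0), None)
        if piv is None:
            return Fraction(0)            # whole column zero: columns dependent
        if piv != k:
            a[k], a[piv] = a[piv], a[k]   # row interchange
            sign = -sign
        for i in range(k + 1, n):
            f = a[i][k] / a[k][k]
            a[i] = [x - f * y for x, y in zip(a[i], a[k])]
    prod = Fraction(sign)
    for k in range(n):
        prod *= a[k][k]
    return prod

# the fourth-order determinant of Example 3
print(det_by_elimination([[3, 7, 9, 11],
                          [11, 5, -7, -2],
                          [6, 2, 4, 9],
                          [8, 0, 2, -5]]))   # -> 7912
```

Unlike the n!-term expansion (III-16), this takes on the order of n^3 operations, which is why elimination-style reductions are the standard way to evaluate large determinants.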

As a final example, we illustrate the application of determinants to systems of


linear equations.

Example 4. Consider the system of equations

    a_11 x_1 + a_12 x_2 + · · · + a_1n x_n = b_1,
    a_21 x_1 + a_22 x_2 + · · · + a_2n x_n = b_2,
    . . .                                          (III-20)
    a_n1 x_1 + a_n2 x_2 + · · · + a_nn x_n = b_n,

and let

    D = |a_11  a_12  · · ·  a_1n|
        |         . . .         |    (III-21)
        |a_n1  a_n2  · · ·  a_nn|

be the determinant of coefficients of this system. If we multiply the first of these equations by A_11, the second by A_21, etc., and then add, we obtain

    (Σ_{i=1}^n a_i1 A_i1) x_1 + (Σ_{i=1}^n a_i2 A_i1) x_2 + · · · + (Σ_{i=1}^n a_in A_i1) x_n = Σ_{i=1}^n b_i A_i1.

But the coefficient of x_1 is then just the value D of |a_ij| itself, and the coefficients of x_2, . . . , x_n are (by Eq. III-19) all zero. Thus we have

    D x_1 = Σ_{i=1}^n b_i A_i1,

and if D ≠ 0 we have determined x_1. Of course, instead of A_i1 we could have used the cofactors of the jth column of (III-21). We would then have obtained the formulas

    D x_j = Σ_{i=1}^n b_i A_ij,    j = 1, 2, . . . , n,

for the x_j.

If D ≠ 0, it is easily verified that these x_j do satisfy (III-20). Moreover, the solution of (III-20) is unique if and only if the columns of (III-21) are linearly independent (see Theorem 2-12 in the text). But, by Property (4), this is equivalent to the nonvanishing of D. Thus we have the following theorem.

Theorem III-9. The system (III-20) of n linear equations in n unknowns has a unique solution if and only if the determinant of its coefficients is not zero. In this case, moreover, the solution is given by

    x_j = (Σ_{i=1}^n b_i A_ij) / D,    j = 1, 2, . . . , n.    (III-22)
It is worth noting that the expression in the numerator of (III-22) is just the expansion (by cofactors of the jth column) of the determinant obtained from the coefficient determinant |a_ij| by replacing the jth column by the vector (b_1, . . . , b_n)'; that is,

          |a_11 · · · b_1 · · · a_1n|
    x_j = |          . . .          | / D,    j = 1, 2, . . . , n.    (III-23)
          |a_n1 · · · b_n · · · a_nn|

When written in this form, (III-23) is called Cramer's rule for solving the given system of equations.

Example 5. The system

    3x − 5y = 14,
     x + 2y = 3

has coefficient determinant

    D = |3  −5|
        |1   2| = 6 + 5 = 11,

and hence has the solution

    x_1 = |14  −5|
          | 3   2| / 11 = (28 + 15)/11 = 43/11,

    x_2 = |3  14|
          |1   3| / 11 = (9 − 14)/11 = −5/11.
It is clear, of course, that the method of Examples 4 and 5 can be applied only to systems of equations for which the coefficient matrix is a square matrix, and for this reason the solutions given by (III-23) are largely of theoretical interest.
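Cramer's rule as used in Example 5 can be sketched for the 2 × 2 case (the function names are ours, and exact rationals are used so the answers come out as fractions):

```python
from fractions import Fraction

def det2(m):
    return m[0][0] * m[1][1] - m[1][0] * m[0][1]

def cramer2(a, b):
    """Solve a 2 x 2 system by Cramer's rule (III-23): replace the jth
    column of the coefficient determinant by b and divide by D."""
    d = Fraction(det2(a))
    x1 = det2([[b[0], a[0][1]], [b[1], a[1][1]]]) / d
    x2 = det2([[a[0][0], b[0]], [a[1][0], b[1]]]) / d
    return x1, x2

# the system of Example 5: 3x - 5y = 14, x + 2y = 3
print(cramer2([[3, -5], [1, 2]], [14, 3]))   # -> (Fraction(43, 11), Fraction(-5, 11))
```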

III-5 MULTIPLICATION OF DETERMINANTS

Consider the 2n × 2n determinant

    |e_ij| = |a_11 · · · a_1n   0   · · ·   0  |
             |            . . .               |
             |a_n1 · · · a_nn   0   · · ·   0  |
             |c_11 · · · c_1n  b_11 · · · b_1n |    (III-24)
             |            . . .               |
             |c_n1 · · · c_nn  b_n1 · · · b_nn |.

According to (III-16), its value is given by

    |e_ij| = Σ_p σ(p) e_{p(1)1} e_{p(2)2} · · · e_{p(2n)2n},    (III-25)

where p ranges over the (2n)! permutations of {1, 2, . . . , 2n}. Since e_ij = 0 for 1 ≤ i ≤ n, n + 1 ≤ j ≤ 2n, the only terms of (III-25) which are different from zero are those for which p satisfies the inequalities

        1 ≤ p(i) ≤ n     for 1 ≤ i ≤ n,
                                               (III-26)
    n + 1 ≤ p(i) ≤ 2n    for n + 1 ≤ i ≤ 2n.

Moreover, for such a permutation p, the number of inversions in the sequence p(1), p(2), . . . , p(2n) is the sum of the numbers of inversions in the sequences

    p(1), p(2), . . . , p(n)    and    p(n + 1), p(n + 2), . . . , p(2n).

Each term of (III-25), therefore, can be written as a product:

    σ(r)σ(s) a_{r(1)1} a_{r(2)2} · · · a_{r(n)n} b_{s(1)1} b_{s(2)2} · · · b_{s(n)n},

where r and s are permutations of {1, . . . , n} and σ(r)σ(s) = σ(p). Thus

    |e_ij| = Σ_p σ(p) e_{p(1)1} · · · e_{p(2n)2n}

           = Σ_r Σ_s σ(r)σ(s) a_{r(1)1} · · · a_{r(n)n} b_{s(1)1} · · · b_{s(n)n}

           = (Σ_r σ(r) a_{r(1)1} · · · a_{r(n)n}) (Σ_s σ(s) b_{s(1)1} · · · b_{s(n)n})

           = |a_ij| · |b_ij|,

and we have proved the following lemma.

    Lemma. Regardless of the values of the c_ij, the determinant (III-24) is the product of the two "sub-determinants" |a_ij| and |b_ij|.

Let us now choose c_ij = 0 if i ≠ j and c_ii = −1. Then we have

    |a_ij| · |b_ij| = |a_11 · · · a_1n    0   · · ·   0  |
                      |            . . .                 |
                      |a_n1 · · · a_nn    0   · · ·   0  |
                      | −1  · · ·   0   b_11 · · · b_1n  |    (III-27)
                      |            . . .                 |
                      |  0  · · ·  −1   b_n1 · · · b_nn  |.
Denoting the first n columns of (III-27) by C_1, . . . , C_n, and adding the sum

    Σ_{i=1}^n b_i1 C_i = b_11 C_1 + b_21 C_2 + · · · + b_n1 C_n

to the (n + 1)st column, (III-27) takes the form

    |a_ij| · |b_ij| = |a_11 · · · a_1n   Σ a_1i b_i1    0   · · ·   0  |
                      |                . . .                          |
                      |a_n1 · · · a_nn   Σ a_ni b_i1    0   · · ·   0  |
                      | −1  · · ·   0        0        b_12 · · · b_1n  |
                      |                . . .                          |
                      |  0  · · ·  −1        0        b_n2 · · · b_nn  |.

Similarly, adding Σ_{i=1}^n b_i2 C_i to the (n + 2)nd column, Σ_{i=1}^n b_i3 C_i to the

(n + 3)rd column, etc., we obtain finally the determinant

    |a_ij| · |b_ij| = |a_11 · · · a_1n   Σ a_1i b_i1 · · · Σ a_1i b_in |
                      |                . . .                          |
                      |a_n1 · · · a_nn   Σ a_ni b_i1 · · · Σ a_ni b_in |
                      | −1  · · ·   0        0      · · ·      0      |    (III-28)
                      |                . . .                          |
                      |  0  · · ·  −1        0      · · ·      0      |,

which, by repeated application of Property (2) of Section III-4, can be written

    |a_ij| · |b_ij| = (−1)^n | −1  · · ·   0        0      · · ·      0      |
                             |                . . .                          |
                             |  0  · · ·  −1        0      · · ·      0      |
                             |a_11 · · · a_1n   Σ a_1i b_i1 · · · Σ a_1i b_in |
                             |                . . .                          |
                             |a_n1 · · · a_nn   Σ a_ni b_i1 · · · Σ a_ni b_in |.

We now apply the lemma once again, obtaining

    |a_ij| · |b_ij| = (−1)^n | −1        |   | Σ a_1i b_i1 · · · Σ a_1i b_in |
                             |   . . .   | · |             . . .             |
                             |        −1 |   | Σ a_ni b_i1 · · · Σ a_ni b_in |.

Finally, noting that

    (−1)^n | −1        |
           |   . . .   | = (−1)^n (−1)^n = (−1)^{2n} = 1,
           |        −1 |

we obtain

    |a_ij| · |b_ij| = | Σ a_1i b_i1 · · · Σ a_1i b_in |
                      |             . . .             |    (III-29)
                      | Σ a_ni b_i1 · · · Σ a_ni b_in |.

Thus if we are given matrices

    A = |a_11 · · · a_1n|        B = |b_11 · · · b_1n|
        |     . . .     |,           |     . . .     |,
        |a_n1 · · · a_nn|            |b_n1 · · · b_nn|

the determinant on the right of (III-29) is seen to be the determinant of the matrix product of A and B, and we have

    Theorem III-10. If A and B are n × n matrices, then

        det (AB) = det (A) · det (B).

It follows from Theorem III-10, in particular, that if A is a nonsingular n × n matrix and A^{−1} is its inverse, then

    det (A^{−1}) = 1 / det (A).

For det (A^{−1}) · det (A) = det (A^{−1}A) = det (I) = 1.
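Theorem III-10 is easy to verify on sample matrices; here is a minimal 2 × 2 check (the helper names and the sample data are ours):

```python
def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def matmul2(a, b):
    # ordinary matrix product of two 2 x 2 matrices
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[3, -5], [1, 2]]
B = [[2, 1], [4, -3]]
assert det2(matmul2(A, B)) == det2(A) * det2(B)   # det(AB) = det(A) det(B)
```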

We conclude our discussion of determinants with one further application of Theorem III-10. Namely, let A = (a_ij) be any n × n matrix, and consider its characteristic polynomial*

    det (A − λI) = |a_11 − λ   a_12      · · ·   a_1n    |
                   |a_21       a_22 − λ  · · ·   a_2n    |
                   |               . . .                 |
                   |a_n1       a_n2      · · ·   a_nn − λ|.

Then for any nonsingular matrix P, we have

    det (PAP^{−1} − λI) = det (PAP^{−1} − λPIP^{−1})
                        = det (P(A − λI)P^{−1})
                        = det (P) · det (A − λI) · det (P^{−1})
                        = det (A − λI).

This proves the following theorem.

Theorem III-11. If A and B are similar matrices, i.e., if there exists a nonsingular matrix P such that B = PAP^{−1}, then A and B have the same characteristic polynomial.
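For a 2 × 2 matrix the characteristic polynomial is λ² − (trace)λ + det, so Theorem III-11 can be spot-checked by comparing traces and determinants of A and PAP^{−1}. The matrices below are our own sample data (P was chosen with det P = 1 so that its inverse has integer entries):

```python
def matmul2(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace_and_det(m):
    # for a 2 x 2 matrix, det(A - lambda I) = lambda^2 - (tr A) lambda + det A
    return m[0][0] + m[1][1], m[0][0] * m[1][1] - m[0][1] * m[1][0]

A    = [[2, 1], [0, 3]]
P    = [[1, 2], [1, 3]]           # det P = 1, so P is nonsingular
Pinv = [[3, -2], [-1, 1]]         # its inverse, with integer entries
B = matmul2(matmul2(P, A), Pinv)  # B = P A P^{-1}, similar to A
assert trace_and_det(A) == trace_and_det(B)
```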

It is not difficult to show that two n × n matrices are similar if and only if they represent the same linear transformation L: R^n → R^n (relative to possibly different bases). Since Theorem III-11 asserts that the characteristic polynomial of two such matrices is the same, it follows that the characteristic polynomial which is associated with L by choosing a matrix representation for L is independent of the particular matrix used. This allows us to conclude that the eigenvalues for L can be found by representing L in matrix form and solving the resulting characteristic equation for λ.*

* See Section 12-3.



APPENDIX IV

uniqueness theorems

IV-1 SURFACES IN R^3, SURFACE AREA
Our objective in this appendix is to prove the uniqueness of the solutions obtained
in Chapters 13 through 15 for various boundary-value problems involving partial
differential equations from mathematical physics. For this purpose it will be
necessary to introduce several concepts from vector field theory, among them the
notion of surface integral. However these ideas will be developed only to the
extent required for the stated goal.
A surface S in R^3 is said to be represented parametrically if it is the range of a function r: R^2 → R^3. In this case S is the set of points (x, y, z) in R^3 satisfying

    x = x(u, v),
    y = y(u, v),    (IV-1)
    z = z(u, v),

where x(u, v), y(u, v), z(u, v) are the coordinate functions of r and (u, v) lies in a region R_uv of R^2. Equations (IV-1) are called parametric equations for S. For example, the sphere of radius R in R^3 centered at the origin is represented parametrically by

    x = R cos u sin v,
    y = R sin u sin v,
    z = R cos v,

where (u, v) lies in R_uv: 0 ≤ u ≤ 2π, 0 ≤ v ≤ π (see Fig. IV-1).
A surface S may also be represented explicitly in the form

    z = f(x, y),    (x, y) in R_xy,    (IV-2)

or implicitly as the set of points satisfying an equation

    F(x, y, z) = 0.    (IV-3)

We are familiar with each of these forms in the case of the sphere described above. For example, the equation

    x^2 + y^2 + z^2 = R^2

[FIGURE IV-1]

defines this sphere implicitly, while the upper and lower hemispheres are explicitly represented by the respective equations

    z = √(R^2 − x^2 − y^2),    x^2 + y^2 ≤ R^2,
and
    z = −√(R^2 − x^2 − y^2),    x^2 + y^2 ≤ R^2.

It is usually the particular application at hand that determines which of the above representations is used.

We shall say that a given surface S is smooth if it has a unique tangent plane at each of its points P and if this tangent plane varies continuously as P ranges over the surface. In this case we can erect at each point P of S a vector N which is normal to S at P (i.e., perpendicular to the tangent plane at P). For the explicit representation (IV-2) of S, the property of smoothness amounts simply to the continuity of the partial derivatives ∂f/∂x and ∂f/∂y, and a normal vector N is then given by

    N = −(∂f/∂x) i − (∂f/∂y) j + k,    (IV-4)

where i, j, k are the standard basis vectors of R^3. For implicit and parametric representations, however, we must impose assumptions in addition to continuity of the partial derivatives involved. Thus (IV-3) yields a normal vector

    N = (∂F/∂x) i + (∂F/∂y) j + (∂F/∂z) k,    (IV-5)

provided that this vector does not vanish, while in the case of the parametric repre-
sentation (IV-1), the vectors

    ∂r/∂u = (∂x/∂u) i + (∂y/∂u) j + (∂z/∂u) k,
                                                    (IV-6)
    ∂r/∂v = (∂x/∂v) i + (∂y/∂v) j + (∂z/∂v) k
are tangent to the surface S and hence determine the tangent plane only if they are linearly independent. In the latter case a normal N to S at the point P = r(u, v) is orthogonal to each of the vectors (IV-6) and hence may be expressed in the form

    N = ∂r/∂u × ∂r/∂v = |  i       j       k    |
                        |∂x/∂u   ∂y/∂u   ∂z/∂u  |
                        |∂x/∂v   ∂y/∂v   ∂z/∂v  |

      = |∂y/∂u  ∂z/∂u| i + |∂z/∂u  ∂x/∂u| j + |∂x/∂u  ∂y/∂u| k    (IV-7)
        |∂y/∂v  ∂z/∂v|     |∂z/∂v  ∂x/∂v|     |∂x/∂v  ∂y/∂v|

[all partial derivatives evaluated at the point (u, v)].*

Example 1. For the sphere S defined parametrically by

    x = R cos u sin v,    y = R sin u sin v,    z = R cos v,

Eq. (IV-7) yields

    N = |      i                j              k     |
        |−R sin u sin v    R cos u sin v       0     |
        | R cos u cos v    R sin u cos v   −R sin v  |

      = −R^2 cos u sin^2 v i − R^2 sin u sin^2 v j − R^2 sin v cos v k
      = −R^2 sin v (cos u sin v i + sin u sin v j + cos v k).

Here N is easily seen to be the inward pointing normal at P, and ||N|| = R^2 sin v. Employing, instead, the explicit or implicit representations of this sphere, Eqs.

* For two vectors a_1 = x_1 i + y_1 j + z_1 k, a_2 = x_2 i + y_2 j + z_2 k in R^3, the vector product a_1 × a_2 is defined by

    a_1 × a_2 = (y_1 z_2 − z_1 y_2) i + (z_1 x_2 − x_1 z_2) j + (x_1 y_2 − y_1 x_2) k

              = | i    j    k  |
                |x_1  y_1  z_1 |
                |x_2  y_2  z_2 |.

It is easily verified that a_1 × a_2 is orthogonal to both a_1 and a_2 and that ||a_1 × a_2|| is twice the area of the triangle determined by a_1 and a_2. These properties, together with the fact that the vectors a_1, a_2, a_1 × a_2 (in that order) form a right-handed triple, serve to determine a_1 × a_2 uniquely.

FIGURE IV-2

(IV-4) and (IV-5) yield N in the respective forms

    N = (x / √(R^2 − x^2 − y^2)) i + (y / √(R^2 − x^2 − y^2)) j + k
or
    N = 2x i + 2y j + 2z k.
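The computation of Example 1 and the componentwise definition of the vector product in the footnote above can be checked numerically; the sketch below (the function names are ours) evaluates N = r_u × r_v at a sample point and confirms that its magnitude is R^2 sin v:

```python
from math import sin, cos, sqrt, isclose

def cross(a, b):
    """Vector product, componentwise as in the footnote:
    (y1 z2 - z1 y2, z1 x2 - x1 z2, x1 y2 - y1 x2)."""
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def sphere_normal(R, u, v):
    """N = r_u x r_v for x = R cos u sin v, y = R sin u sin v, z = R cos v."""
    r_u = (-R*sin(u)*sin(v),  R*cos(u)*sin(v),  0.0)
    r_v = ( R*cos(u)*cos(v),  R*sin(u)*cos(v), -R*sin(v))
    return cross(r_u, r_v)

R, u, v = 2.0, 0.7, 1.1
N = sphere_normal(R, u, v)
assert isclose(sqrt(sum(c*c for c in N)), R**2 * sin(v))   # ||N|| = R^2 sin v
```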

We now consider the notion of surface area. Here we begin with a smooth surface S and a parametric representation r = x(u, v) i + y(u, v) j + z(u, v) k as described above. To obtain an approximation to the area of S we subdivide the surface into a finite number of pieces ΔS_i by means of a network of intersecting curves corresponding to a grid of horizontal and vertical lines in the domain R_uv of r (see Fig. IV-2). The approximation is now found by replacing each of the ΔS_i by an appropriate portion of the tangent plane to S at a point in ΔS_i and taking area on the tangent plane as an approximation to area on the surface.

Specifically, let w and r(w) be as shown in Fig. IV-2 and let

    t_1 = (∂r/∂u)(w)    and    t_2 = (∂r/∂v)(w)

be the tangent vectors to S (Eq. IV-6) at the point r(w). Then if ΔS_i is the image under r of the indicated rectangle with sides Δu_i, Δv_i, the area of the parallelogram formed by the vectors

    t_1 Δu_i = (∂r/∂u) Δu_i,    t_2 Δv_i = (∂r/∂v) Δv_i

will, intuitively at least, provide a reasonable approximation to the area of ΔS_i whenever Δu_i and Δv_i are small. But, as was pointed out in the footnote on page 702, the area of this parallelogram is just the magnitude of the vector

    N Δu_i Δv_i = (t_1 × t_2) Δu_i Δv_i,


and hence, using (IV-7), we have the required approximation in the form

    ||N|| Δu_i Δv_i = { |∂y/∂u  ∂z/∂u|^2 + |∂z/∂u  ∂x/∂u|^2 + |∂x/∂u  ∂y/∂u|^2 }^{1/2} Δu_i Δv_i.    (IV-8)
                        |∂y/∂v  ∂z/∂v|     |∂z/∂v  ∂x/∂v|     |∂x/∂v  ∂y/∂v|

We now repeat the above computation for each of the ΔS_i, add the results, and take the limit as the number of pieces ΔS_i is allowed to increase in the usual fashion. The value of this limit (which can be shown to exist under the hypotheses in force) is, by definition, the area of S, and we therefore have

    a(S) = ∬_{R_uv} { |∂y/∂u  ∂z/∂u|^2 + |∂z/∂u  ∂x/∂u|^2 + |∂x/∂u  ∂y/∂u|^2 }^{1/2} du dv.    (IV-9)
                      |∂y/∂v  ∂z/∂v|     |∂z/∂v  ∂x/∂v|     |∂x/∂v  ∂y/∂v|

This formula is usually written in less cumbersome form as

    a(S) = ∬_{R_uv} [ (∂(y, z)/∂(u, v))^2 + (∂(z, x)/∂(u, v))^2 + (∂(x, y)/∂(u, v))^2 ]^{1/2} du dv,    (IV-10)

where
dX dy
d(x, y) du du
d(u, v)
dX dy
dv dv

etc. More generally, the symbol

    ∂(x_1, ..., x_n)/∂(u_1, ..., u_n)

is used to denote the n × n functional determinant

    | ∂x_1/∂u_1   ∂x_2/∂u_1   ···   ∂x_n/∂u_1 |
    | ∂x_1/∂u_2   ∂x_2/∂u_2   ···   ∂x_n/∂u_2 |
    |    ···          ···                ···   |
    | ∂x_1/∂u_n   ∂x_2/∂u_n   ···   ∂x_n/∂u_n |

and is known as the Jacobian determinant of the functions

    x_1(u_1, ..., u_n), ..., x_n(u_1, ..., u_n).
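As a quick numerical illustration (ours, not the text's), a 2 × 2 Jacobian determinant can be approximated by central differences and checked against a map whose Jacobian is known in closed form; the map x = u², y = uv below is an arbitrary choice with ∂(x, y)/∂(u, v) = 2u².

```python
import math

# Finite-difference check of a 2x2 Jacobian determinant d(x,y)/d(u,v).
# Illustrative map (our own choice): x = u^2, y = u*v, whose Jacobian
# determinant is (2u)(u) - (0)(v) = 2u^2.

def jacobian_det(x, y, u, v, h=1e-5):
    """Approximate d(x,y)/d(u,v) at (u, v) by central differences."""
    xu = (x(u + h, v) - x(u - h, v)) / (2 * h)
    xv = (x(u, v + h) - x(u, v - h)) / (2 * h)
    yu = (y(u + h, v) - y(u - h, v)) / (2 * h)
    yv = (y(u, v + h) - y(u, v - h)) / (2 * h)
    return xu * yv - xv * yu

x = lambda u, v: u * u
y = lambda u, v: u * v

approx = jacobian_det(x, y, 1.5, 2.0)
exact = 2 * 1.5 ** 2          # 2u^2 at u = 1.5
```

The same finite-difference scheme extends entrywise to the n × n case.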

The argument just given can easily be generalized to include integration of
scalar functions defined on surfaces or, as they are called, scalar fields. In this
case we begin with a real valued function F defined and continuous in a region of
ℝ³, and a smooth two-dimensional surface S in that region. Let S be subdivided
into a finite number of nonoverlapping pieces ΔS_i, i = 1, ..., n, and choose a
point P_i* in each of them. Then the surface integral of F on S, denoted

    ∬_S F dS,

is defined to be

    lim Σ_{i=1}^{n} F(P_i*) a(ΔS_i),

where the limit is taken in such a way that the diameter of each of the ΔS_i tends
to zero.* Here again it can be shown that this limit exists, and that its value is
independent of the manner in which S is subdivided and the P_i* chosen in ΔS_i.
Finally, to obtain a formula for evaluating this integral, let

    x = x(u, v),   y = y(u, v),   z = z(u, v)

be a parametric representation of S defined in a region R_uv of the uv-plane. Then,
following the argument which led to (IV-10), we find that

    ∬_S F dS = ∬_{R_uv} F(x(u,v), y(u,v), z(u,v)) [ (∂(y,z)/∂(u,v))² + (∂(z,x)/∂(u,v))²
               + (∂(x,y)/∂(u,v))² ]^(1/2) du dv,   (IV-11)

where the integral on the right is an ordinary double integral over R_uv. Integrals
of this type are encountered in physical problems dealing with surface distributions
of matter, where they admit interpretations as mass, moments, etc.

Example 2. Use Formula (IV-10) to calculate the surface area of a sphere S of
radius R. Actually we have already computed the value of

    ‖N‖ = [ (∂(y,z)/∂(u,v))² + (∂(z,x)/∂(u,v))² + (∂(x,y)/∂(u,v))² ]^(1/2)

* By definition, the diameter of ΔS_i is the least upper bound of the set of real numbers
‖P_1 − P_2‖ with P_1 and P_2 in ΔS_i.

in Example 1, where we found that ‖N‖ = R² sin v. Hence

    a(S) = R² ∬_{R_uv} sin v du dv = R² ∫_0^{2π} ∫_0^{π} sin v dv du = 4πR².
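The area just found can be checked numerically. The sketch below (our own, using only the fact that ‖N‖ = R² sin v from Example 1) approximates (IV-10) for the sphere by a midpoint Riemann sum over the parameter rectangle:

```python
import math

# Midpoint Riemann sum of ||N|| = R^2 sin v over 0 <= u <= 2*pi, 0 <= v <= pi,
# approximating a(S) in Formula (IV-10) for a sphere of radius R.

def sphere_area(R, n=200):
    du, dv = 2 * math.pi / n, math.pi / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * du
        for j in range(n):
            v = (j + 0.5) * dv
            total += R * R * math.sin(v) * du * dv
    return total

R = 2.0
approx = sphere_area(R)
exact = 4 * math.pi * R * R   # a(S) = 4*pi*R^2
```

With n = 200 the sum already agrees with 4πR² to better than three decimal places.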

Example 3. Find the total mass M of a hemispherical shell S of unit radius, given
that the density of S is proportional to the distance from the base of the hemisphere.
Let S be represented (explicitly) by z = √(1 − x² − y²), x² + y² ≤ 1. Then
parametric equations for S may be given in the form

    x = u,   y = v,   z = √(1 − u² − v²),

with u² + v² ≤ 1. Since the density of S is given by the scalar function ρ = kz,
k a (positive) constant, Formula (IV-11) yields

    M = ∬_S ρ dS = ∬_{u²+v²≤1} k√(1 − u² − v²) [ (∂(y,z)/∂(u,v))² + (∂(z,x)/∂(u,v))²
                    + (∂(x,y)/∂(u,v))² ]^(1/2) du dv.

But

    ∂(y,z)/∂(u,v) = u/√(1 − u² − v²),   ∂(z,x)/∂(u,v) = v/√(1 − u² − v²),   ∂(x,y)/∂(u,v) = 1,

whence

    M = k ∬_{u²+v²≤1} du dv = kπ.
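The cancellation in Example 3 can be verified numerically: after the Jacobians are combined, the integrand ρ √(1 + f_u² + f_v²) collapses to the constant k on the unit disc, so the mass is k times the disc's area. A grid-sum sketch (k = 2.5 is an arbitrary choice of the proportionality constant, not from the text):

```python
import math

# Riemann-sum check that the shell mass in Example 3 is k*pi: the combined
# integrand equals the constant k on the unit disc u^2 + v^2 < 1.

def shell_mass(k, n=1000):
    h = 2.0 / n
    total = 0.0
    for i in range(n):
        u = -1 + (i + 0.5) * h
        for j in range(n):
            v = -1 + (j + 0.5) * h
            if u * u + v * v < 1.0:
                total += k * h * h   # density * area element, after cancellation
    return total

k = 2.5
approx = shell_mass(k)
exact = k * math.pi
```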

The last example suggests that we simplify Formulas (IV-10) and (IV-11) for
surfaces S which are represented explicitly, as follows. If

    z = f(x, y),   (x, y) in R_xy,

defines S, then with

    x = u,   y = v,   z = f(u, v),   (u, v) in R_uv,

we have

    ∂(y,z)/∂(u,v) = −∂f/∂u,   ∂(z,x)/∂(u,v) = −∂f/∂v,   ∂(x,y)/∂(u,v) = 1,

and (IV-11) becomes

    ∬_S F dS = ∬_{R_xy} F(x, y, f(x, y)) √(1 + (∂f/∂x)² + (∂f/∂y)²) dx dy.

IV-2 SURFACE INTEGRALS OF VECTOR FIELDS


We come now to the problem of assigning a meaning to the integral of a con-
tinuous vector function, or vector field,

    F(x, y, z) = P(x, y, z)i + Q(x, y, z)j + R(x, y, z)k,

over a smooth surface S in ℝ³. To do so we use the integral of scalar functions
introduced in the preceding section, and tentatively define the required integral to be

    ∬_S (F · n) dS,   (IV-12)

where n = n(P) is a unit normal vector at the point P of S, chosen in such a way
that F · n is a continuous function on S (Fig. IV-3). Informally this requirement
means that we must designate one side of the surface as positive, and let n(P) be
the unit normal at P which points in that direction. For many surfaces such a
choice of positive direction is easily made. On the sphere (or any closed surface),
for example, we may choose n(P) to be the outward pointing normal. And on a
surface defined explicitly by

    z = f(x, y),   (x, y) in R_xy,

we may choose n(P) to be the upward normal

    n = ( −(∂f/∂x)i − (∂f/∂y)j + k ) / √(1 + (∂f/∂x)² + (∂f/∂y)²).
Unfortunately, there are smooth surfaces in ℝ³, so-called one-sided surfaces, for
which it is not possible to assign a positive direction. The most notorious example
of such a surface (and aside from trivial modifications the only one) is the Möbius
strip 𝔐, shown in Fig. IV-4.* It is clear that 𝔐 is genuinely one-sided in the

FIGURE IV-3   FIGURE IV-4

* August Ferdinand Möbius, German mathematician, 1790–1868.



sense that anyone who agreed to paint a single side of the strip would find that he
had actually contracted to paint the entire surface. More to the point is the ob-
servation that we can pass from one of the two unit normal vectors at P to the
other by moving the normal continuously along a smooth curve on 𝔐. Thus
there is no natural meaning that can be attached to the integration of F · n over 𝔐,
and we must accordingly exclude the Möbius strip and its relatives from further
consideration. To this end we introduce the following definition.

Definition IV-1. A smooth surface S in ℝ³ is said to be orientable if the
unit normal vectors at each point P of S return to their original positions
after traversing any smooth closed curve on S. If this property fails to hold,
S is said to be nonorientable. In short, a surface is orientable if it is two-
sided; nonorientable otherwise.

Let us consider, now, an orientable smooth surface S in ℝ³ with positive unit
normal n, and let

    F(x, y, z) = P(x, y, z)i + Q(x, y, z)j + R(x, y, z)k

be a continuous vector field on S. Then the scalar function F · n is also continuous
on S, and the surface integral of F on S is defined to be

    ∬_S (F · n) dS.

Moreover, if r = r(u, v) is a parametric representation of S with coordinate func-
tions x(u, v), y(u, v), z(u, v), and if the normal vector

    N = t_1 × t_2 = (∂(y,z)/∂(u,v)) i + (∂(z,x)/∂(u,v)) j + (∂(x,y)/∂(u,v)) k

of Eq. (IV-7) is a positive normal to S (i.e., points in the chosen positive direction
from S), then the surface integral of F on S can be evaluated as an ordinary double
integral over the domain R_uv of r by the formula

    ∬_S (F · n) dS = ∬_{R_uv} (F · n) ‖N‖ du dv

                   = ∬_{R_uv} (F · N) du dv

                   = ∬_{R_uv} [ P(x(u,v), y(u,v), z(u,v)) ∂(y,z)/∂(u,v)
                       + Q(x(u,v), y(u,v), z(u,v)) ∂(z,x)/∂(u,v)
                       + R(x(u,v), y(u,v), z(u,v)) ∂(x,y)/∂(u,v) ] du dv.   (IV-13)

Before considering examples, several remarks are in order. First, if the para-
metric representation of S is such that the vector N = t_1 × t_2 does not point
in the positive direction from S, the above formula must be modified to read

    ∬_S (F · n) dS = −∬_{R_uv} (F · N) du dv.   (IV-14)

Second, if S is represented explicitly by an equation of the form z = f(x, y)
defined in a region R_xy of the xy-plane, and if the upward direction from S is
chosen as the positive direction, then

    N = −(∂f/∂x) i − (∂f/∂y) j + k

is a positive normal to S, and we then have

    ∬_S (F · n) dS = ∬_{R_xy} [ −P(x, y, f(x, y)) ∂f/∂x − Q(x, y, f(x, y)) ∂f/∂y
                       + R(x, y, f(x, y)) ] dx dy.   (IV-15)

(This formula will prove quite useful later.) Finally, there is an alternative and
very convenient notation for the surface integral of F on S which can be obtained
by writing

    ∬_S (F · n) dS = ∬_S (Pi + Qj + Rk) · n dS

                   = ∬_S P(i · n) dS + ∬_S Q(j · n) dS + ∬_S R(k · n) dS.

For then

    i · n = cos γ_1,   j · n = cos γ_2,   k · n = cos γ_3,

where γ_1, γ_2, γ_3 are, respectively, the angles between n and the vectors i, j, k
(Fig. IV-5), and the quantities

    (i · n) dS,   (j · n) dS,   (k · n) dS

can be interpreted as the projections of an element of surface area dS onto the
three coordinate planes. This suggests that we set

    (i · n) dS = dy dz,   (j · n) dS = dz dx,   (k · n) dS = dx dy,

FIGURE IV-5

and write ∬_S (F · n) dS as

    ∬_S P dy dz + Q dz dx + R dx dy.   (IV-16)

The value of this expression is, of course, still given by (IV-13) or (IV-14).*

Example 1. Compute the value of

    ∬_S (F · n) dS

when F = xy²i + k, and S is the surface of the unit sphere in ℝ³ with the outward
direction chosen as positive.
Since S can be represented by the equations

    x = cos u sin v,   y = sin u sin v,   z = cos v,

with 0 ≤ u ≤ 2π, 0 ≤ v ≤ π, we obtain as in Example 1 of the last section

    N = | i        j        k      |
        | ∂x/∂u   ∂y/∂u   ∂z/∂u |
        | ∂x/∂v   ∂y/∂v   ∂z/∂v |

      = −(cos u sin² v i + sin u sin² v j + sin v cos v k).

But this vector is clearly an inward normal to S, and hence we must use (IV-14)
to evaluate ∬_S (F · n) dS. This gives

    ∬_S (F · n) dS = ∬_{R_uv} (sin² u cos u sin³ v i + k) · (cos u sin² v i
                       + sin u sin² v j + sin v cos v k) du dv

                   = ∫_0^{2π} ∫_0^{π} (sin² u cos² u sin⁵ v + sin v cos v) dv du

                   = ∫_0^{2π} sin² u cos² u du ∫_0^{π} sin⁵ v dv
                       + ∫_0^{2π} du ∫_0^{π} sin v cos v dv

                   = 4π/15.
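Example 1 can be checked numerically by summing −(F · N) du dv, the minus sign coming from (IV-14) because the computed N is the inward normal. A midpoint-sum sketch (our own):

```python
import math

# Numerical check of Example 1: flux of F = x*y^2 i + k outward through the
# unit sphere. N below is the inward normal, so (IV-14) supplies a minus sign.

def flux(n=400):
    du, dv = 2 * math.pi / n, math.pi / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * du
        for j in range(n):
            v = (j + 0.5) * dv
            # inward normal N = -(cos u sin^2 v, sin u sin^2 v, sin v cos v)
            nx = -math.cos(u) * math.sin(v) ** 2
            ny = -math.sin(u) * math.sin(v) ** 2
            nz = -math.sin(v) * math.cos(v)
            x = math.cos(u) * math.sin(v)
            y = math.sin(u) * math.sin(v)
            fx, fz = x * y * y, 1.0          # F = (x y^2, 0, 1)
            total += -(fx * nx + fz * nz) * du * dv
    return total

approx = flux()
exact = 4 * math.pi / 15
```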

* The reader should not take these remarks too seriously; they have been given only
to motivate introducing (IV-16). In more advanced work, however, the expression
P dy dz + Q dz dx + R dx dy is given independent status, and (IV-16) is defined to
be the integral of this quantity over the oriented surface S. (See Fleming, Functions of
Several Variables, Addison-Wesley, 1965.)

Example 2. Find the value of

    ∬_S xz dx dy

when S is the triangular surface in Fig. IV-6.
In this case S is the portion of the plane z = 1 − x − y above the triangular
region 0 ≤ x ≤ 1, 0 ≤ y ≤ 1 − x in the xy-plane, and (IV-15) yields

    ∬_S xz dx dy = ∫_0^1 ∫_0^{1−x} x(1 − x − y) dy dx = ∫_0^1 ½ x(1 − x)² dx = 1/24.
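A grid-sum sketch (our own) confirming the value 1/24 of Example 2:

```python
import math

# Riemann-sum check of Example 2: the integral of x*z = x*(1 - x - y) over
# the triangle 0 <= x <= 1, 0 <= y <= 1 - x equals 1/24.

def triangle_integral(n=800):
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        for j in range(n):
            y = (j + 0.5) * h
            if x + y < 1.0:
                total += x * (1 - x - y) * h * h
    return total

approx = triangle_integral()
exact = 1.0 / 24.0
```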

Example 3. Let F be the vector field associated with a time independent flow of
fluid in a region of 3-space [i.e., F(x, y, z) is the velocity vector of the fluid at the
point (x, y, z)], and let S be an orientable smooth surface lying in this region.
Then if F is continuous and n is the positive unit normal to S at a point P, the
quantity [F(P) · n] a(ΔS) is an approximation to the amount of fluid flowing per
unit time in the positive direction across a surface element containing P, i.e., to
the flux across ΔS in the positive direction. The usual limiting argument now
applies and allows us to assert that

    ∬_S (F · n) dS   (IV-17)

is the total flux crossing S in the positive direction.


In an alternative computation of this same quantity, let us consider the rate at
which fluid is leaving a small rectangular region B. For this purpose let

    F(x, y, z) = P(x, y, z)i + Q(x, y, z)j + R(x, y, z)k,

FIGURE IV-6   FIGURE IV-7


and suppose the box B to have edges of length Δx, Δy, Δz situated as shown in
Fig. IV-7. Then the net amount of fluid leaving B through the shaded sides is
(approximately)

    P(x_0 + Δx, y_0, z_0) Δy Δz − P(x_0, y_0, z_0) Δy Δz,

which, assuming continuous differentiability of F and using the mean value
theorem, can be rewritten

    [ P(x_0, y_0, z_0) + Δx (∂P/∂x)(x_0 + θ Δx, y_0, z_0) ] Δy Δz − P(x_0, y_0, z_0) Δy Δz
        = (∂P/∂x)(x_0 + θ Δx, y_0, z_0) Δx Δy Δz,

where 0 < θ < 1. Similar results express the rate of flow through the remaining
two pairs of parallel sides of B using ∂Q/∂y, ∂R/∂z instead of ∂P/∂x. Hence if
B is small, the total rate at which fluid is leaving B is given approximately by

    ( ∂P/∂x + ∂Q/∂y + ∂R/∂z ) Δx Δy Δz

[the partial derivatives being evaluated at x = (x_0, y_0, z_0)]. The quantity in
parentheses thus measures the rate per unit volume at which fluid is diverging
from the point x. It is accordingly called the divergence of the vector field F and
is denoted by div F; that is,

    div F = ∂P/∂x + ∂Q/∂y + ∂R/∂z.   (IV-18)
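Definition (IV-18) is easy to check numerically with central differences. The field F = (xy, yz, zx) below is an illustrative choice of ours (not from the text); its divergence is y + z + x.

```python
import math

# Central-difference approximation to div F = dP/dx + dQ/dy + dR/dz,
# checked on the illustrative field F = (x*y, y*z, z*x), for which
# div F = y + z + x.

def divergence(F, p, h=1e-5):
    x, y, z = p
    dP = (F(x + h, y, z)[0] - F(x - h, y, z)[0]) / (2 * h)
    dQ = (F(x, y + h, z)[1] - F(x, y - h, z)[1]) / (2 * h)
    dR = (F(x, y, z + h)[2] - F(x, y, z - h)[2]) / (2 * h)
    return dP + dQ + dR

F = lambda x, y, z: (x * y, y * z, z * x)
approx = divergence(F, (1.0, 2.0, 3.0))
exact = 2.0 + 3.0 + 1.0       # y + z + x at (1, 2, 3)
```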

If the fluid is incompressible, then div F must measure the rate (per unit volume)
at which fluid is being introduced at the point x. Therefore, if S is a closed surface
enclosing a region R, the integral

    ∭_R (div F) dx dy dz

represents the total amount of fluid introduced into the region R, and hence must
also represent the total flux across the boundary surface S. Comparison of this
result with (IV-17) leads to the equation

    ∬_S (F · n) dS = ∭_R (div F) dx dy dz.   (IV-19)

This result, which would be almost self-evident if one were willing to accept the
physical argument just given, is one of the basic relations in vector field theory.
It is known as the divergence theorem and is proved below.



One final remark before leaving this section. Thus far we have limited our
treatment of integration to smooth surfaces. There is no need to be quite this
restrictive, and in fact we shall want to integrate over surfaces such as the surface
of a cube, or the surface of a closed cylinder (Fig. IV-8). Although not smooth,
these surfaces are constructed from a finite number of nonoverlapping smooth
pieces and are accordingly called piecewise smooth surfaces. The various integrals
of these two sections are extended to a piecewise smooth surface S simply by
adding the integrals over the smooth pieces of S. The only question which presents
real difficulty in this connection is that of orientation. However, for a piecewise
smooth surface S which bounds a finite region of ℝ³, i.e., which has an inside and
an outside, the question of orientation can be settled by choosing the outward
normal for each of the smooth pieces of S. A general discussion of orientation of
piecewise smooth surfaces is quite involved and unnecessary for our present aims.
The reader is referred to a brief but excellent treatment in Protter and Morrey,
Modern Mathematical Analysis, Addison-Wesley, 1964, pp. 602–611, where it is
proved that the formulas which we have obtained in these two sections are in fact
independent of the particular parametrization used to derive them.

FIGURE IV-8

IV-3 THE DIVERGENCE THEOREM

In this section we shall establish the equality given in Eq. (IV-19) of the preceding
section, known in the literature as the divergence theorem, or as Gauss' theorem.*

Theorem IV-1. Let V be a region of 3-space whose boundary ∂V is a piece-
wise smooth two-dimensional closed surface, and let F be a continuously
differentiable vector field defined in and on the boundary of V. Then if n
denotes the unit outward normal to ∂V,

    ∬_{∂V} (F · n) dS = ∭_V (div F) dV.   (IV-20)

Proof. We begin by establishing (IV-20) when V is an "elementary" region of
the type shown in Fig. IV-9, and then pass to the general case by decomposing
more general regions into elementary ones. Moreover, to simplify matters even
further, we rewrite (IV-20) in scalar form as

    ∬_{∂V} P dy dz + Q dz dx + R dx dy = ∭_V ( ∂P/∂x + ∂Q/∂y + ∂R/∂z ) dV,

* Named in honor of the famous German mathematician Karl Friedrich Gauss,
1777–1855.

and observe that the desired result will follow by additivity if we can show that

    ∬_{∂V} P dy dz = ∭_V (∂P/∂x) dV,

    ∬_{∂V} Q dz dx = ∭_V (∂Q/∂y) dV,

    ∬_{∂V} R dx dy = ∭_V (∂R/∂z) dV.

And finally, since these equations are all of the same type, it obviously suffices
to establish just one of them.

FIGURE IV-9
This said, let V be an "elementary" region bounded above and below by smooth
surfaces S_2 and S_1 described, respectively, by the functions z = f_2(x, y) and
z = f_1(x, y), where (x, y) ranges over a region D in the xy-plane. Let S_3 denote
the lateral surface (if any) of V. Then, by Formula (IV-15),

    ∬_{S_2} R dx dy = ∬_{S_2} (Rk) · n dS = ∬_D R(x, y, f_2(x, y)) dA.

Similarly

    ∬_{S_1} R dx dy = −∬_D R(x, y, f_1(x, y)) dA

(where the minus sign occurs because the positive normal to S_1 points toward the
interior of V), and

    ∬_{S_3} R dx dy = ∬_{S_3} (Rk) · n dS = 0,

since n is orthogonal to k on S_3. Hence

    ∬_{∂V} R dx dy = ∬_D [ R(x, y, f_2(x, y)) − R(x, y, f_1(x, y)) ] dA

                   = ∬_D [ ∫_{f_1(x,y)}^{f_2(x,y)} (∂R/∂z)(x, y, z) dz ] dA = ∭_V (∂R/∂z) dV,

as required. Thus (IV-20) holds for "elementary" regions V.



To complete the proof we now assume that V can be decomposed into a finite
number of elementary regions V_1, ..., V_n as suggested in Fig. IV-10, and apply
the theorem to each of them in turn. Then, since the integrals over those portions
of the bounding surfaces common to a V_i and V_j cancel in pairs, we have

    ∬_{∂V} (F · n) dS = Σ_{i=1}^{n} ∬_{∂V_i} (F · n) dS = Σ_{i=1}^{n} ∭_{V_i} (div F) dV
                      = ∭_V (div F) dV,

and (IV-20) still holds. Finally, for the most general regions considered in the
statement of the theorem, we apply a limiting argument based on the case just
considered.* ∎

FIGURE IV-10

Example 1. Use the divergence theorem to evaluate the surface integral con-
sidered in Example 1 of Section IV-2, i.e., the integral

    ∬_S xy² dy dz + dx dy,

where S is the surface of the unit sphere x² + y² + z² = 1.
By (IV-20) we have

    ∬_S xy² dy dz + dx dy = ∭_V y² dV,

and, changing to spherical coordinates,

    ∬_S xy² dy dz + dx dy = ∫_0^{2π} ∫_0^{π} ∫_0^1 (r sin φ sin θ)² r² sin φ dr dφ dθ

                          = ∫_0^{2π} sin² θ dθ · ∫_0^{π} sin³ φ dφ · ∫_0^1 r⁴ dr

                          = π · (4/3) · (1/5) = 4π/15.

The economy of this method is too obvious to need comment.
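The agreement between the two computations can also be seen numerically: a grid sum of div F = y² over the unit ball reproduces the flux 4π/15 found in Section IV-2. A sketch of ours:

```python
import math

# Grid check that the volume integral of div F = y^2 over the unit ball is
# 4*pi/15, matching the surface integral of Example 1, Section IV-2.

def ball_integral(n=120):
    h = 2.0 / n
    total = 0.0
    for i in range(n):
        x = -1 + (i + 0.5) * h
        for j in range(n):
            y = -1 + (j + 0.5) * h
            for k in range(n):
                z = -1 + (k + 0.5) * h
                if x * x + y * y + z * z < 1.0:
                    total += y * y * h ** 3
    return total

approx = ball_integral()
exact = 4 * math.pi / 15
```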

* See O. D. Kellogg, Foundations of Potential Theory, Springer, Berlin, 1929.



Example 2. The divergence theorem can sometimes be used to evaluate surface
integrals even when the surface in question is not closed. As an illustration of how
this is done, consider

    ∬_S xy² dy dz + x dz dx + (x + z) dx dy

(Fig. IV-11), where S is the hemispherical surface z = √(1 − x² − y²), z ≥ 0.

FIGURE IV-11

To evaluate this integral let V denote the region bounded by S and the plane
z = 0, and let S_1 denote the closed disc x² + y² ≤ 1 in the xy-plane. Then we
write

    ∬_S xy² dy dz + x dz dx + (x + z) dx dy

        = ∬_{∂V} xy² dy dz + x dz dx + (x + z) dx dy

        + ∬_{S_1} xy² dy dz + x dz dx + (x + z) dx dy.

(The reader should note that the plus sign appearing here is not a misprint. Why?)
But by the divergence theorem, we have

    ∬_{∂V} xy² dy dz + x dz dx + (x + z) dx dy = ∭_V (y² + 1) dV

        = ∫_0^{2π} ∫_0^{π/2} ∫_0^1 [ (r sin φ sin θ)² + 1 ] r² sin φ dr dφ dθ

        = ∫_0^{2π} sin² θ dθ ∫_0^{π/2} sin³ φ dφ ∫_0^1 r⁴ dr
            + ∫_0^{2π} dθ ∫_0^{π/2} sin φ dφ ∫_0^1 r² dr

        = 2π/15 + 2π/3 = 4π/5.
Finally,

    ∬_{S_1} xy² dy dz + x dz dx + (x + z) dx dy = ∬_{S_1} [ xy² i + x j + (x + z)k ] · k dS

        = ∬_{x²+y²≤1, z=0} (x + z) dA = ∫_{−1}^{1} ∫_{−√(1−x²)}^{√(1−x²)} x dy dx

        = 2 ∫_{−1}^{1} x√(1 − x²) dx = 0,

and we have

    ∬_S xy² dy dz + x dz dx + (x + z) dx dy = 4π/5.
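A grid-sum sketch (ours) of the volume integral used in Example 2, confirming that ∭_V (y² + 1) dV over the half-ball z ≥ 0 equals 2π/15 + 2π/3 = 4π/5:

```python
import math

# Grid check of Example 2: the divergence theorem converts the closed-surface
# integral into the volume integral of y^2 + 1 over the unit half-ball z >= 0.

def half_ball_integral(n=120):
    h = 2.0 / n
    total = 0.0
    for i in range(n):
        x = -1 + (i + 0.5) * h
        for j in range(n):
            y = -1 + (j + 0.5) * h
            for k in range(n // 2):
                z = (k + 0.5) * h          # only z > 0 is sampled
                if x * x + y * y + z * z < 1.0:
                    total += (y * y + 1.0) * h ** 3
    return total

approx = half_ball_integral()
exact = 4 * math.pi / 5       # 2*pi/15 + 2*pi/3
```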

In Eq. (IV-18) of the preceding section we defined the divergence of a con-
tinuously differentiable vector field by the equation

    div F = ∂P/∂x + ∂Q/∂y + ∂R/∂z,

where P, Q, and R are the coordinate functions of F with respect to the standard
basis vectors in ℝ³. Although this quantity is defined in terms of a particular
coordinate system, its physical interpretation suggests that its value remains un-
changed under a change of basis. We are now in a position to prove this fact by
giving a description of div F which does not involve a coordinate system in ℝ³.

To this end we recall that if f is real valued and continuous in a region R of 3-space,
the mean-value theorem for integrals asserts the existence of a point (x*, y*, z*)
in R such that

    ∭_R f(x, y, z) dV = f(x*, y*, z*) V,

where V denotes the volume of R. Thus if x is any point in the interior of the
domain of F, and B_ε is the sphere of radius ε about x, then

    ∭_{B_ε} (div F) dV = div F(x*) V_ε,

with x* in B_ε, and V_ε the volume of B_ε. Hence, by the divergence theorem,

    div F(x*) = (1/V_ε) ∬_{∂B_ε} (F · n) dS,

and passing to the limit as ε → 0, we have

    div F(x) = lim_{ε→0} (1/V_ε) ∬_{∂B_ε} (F · n) dS.   (IV-21)
Besides providing us with a coordinate-free description of the divergence of a
vector field, this expression also allows us to recapture the physical interpretation
of div F(x) given in the preceding section. Indeed, if F is the velocity field of a
time independent flow of fluid,

    ∬_{∂B_ε} (F · n) dS

is, as we have seen, the flux crossing the surface of B_ε in the positive direction.
Hence, in the limit, (IV-21) measures the amount of fluid diverging from x per
unit volume.

IV-4 BOUNDARY- VALUE PROBLEMS REVISITED: UNIQUENESS THEOREMS


In this section we shall finally prove the long postponed uniqueness theorems for
boundary-value problems involving the wave equation, heat equation, and La-
place's equation. Each of these theorems is an easy consequence of the divergence
theorem, or, rather, of two simple corollaries of the divergence theorem known as
Green's first and second identities. However, before we can state these identities
in anything approaching readable form we must simplify our notation somewhat.


This can be done very effectively by introducing the symbolic vector ∇ (read
"del"), where

    ∇ = (∂/∂x) i + (∂/∂y) j + (∂/∂z) k.   (IV-22)

The rules for manipulating this symbol are perfectly simple and go as follows.
If ψ is a differentiable scalar function, then

    ∇ψ = (∂ψ/∂x) i + (∂ψ/∂y) j + (∂ψ/∂z) k,

a quantity usually referred to as the gradient of ψ, or grad ψ. Similarly, if
F = Pi + Qj + Rk is a differentiable vector field, then

    ∇ · F = ∂P/∂x + ∂Q/∂y + ∂R/∂z = div F,

and finally

    ∇ × F = | i      j      k    |
            | ∂/∂x   ∂/∂y   ∂/∂z |
            | P      Q      R    |

          = ( ∂R/∂y − ∂Q/∂z ) i + ( ∂P/∂z − ∂R/∂x ) j + ( ∂Q/∂x − ∂P/∂y ) k,

a quantity called the curl of F. Finally, carrying this notation to its logical con-
clusion, we set

    ∇² = ∂²/∂x² + ∂²/∂y² + ∂²/∂z²,

and write ∇ · ∇ = ∇². (The operator ∇² is sometimes called the Laplacian.)

Since ∇ · F is just another way of writing div F, the divergence theorem may be
rewritten in terms of ∇ as

    ∬_{∂V} (F · n) dS = ∭_V (∇ · F) dV.   (IV-23)

In particular, if u and v are twice continuously differentiable scalar functions in
and on the boundary of V, and if we set F = u∇v, we find that

    ∇ · F = u(∇²v) + (∇u) · (∇v)   (IV-24)

and that (IV-23) becomes

    ∬_{∂V} [ u(∇v) · n ] dS = ∭_V [ u(∇²v) + (∇u) · (∇v) ] dV.   (IV-25)

This is Green's first identity. To derive the second we simply interchange u and v
in (IV-25) and subtract to obtain

    ∬_{∂V} [ u(∇v) − v(∇u) ] · n dS = ∭_V [ u(∇²v) − v(∇²u) ] dV.   (IV-26)

This done, we now get on with the uniqueness theorems.

Theorem IV-2. Let S be a piecewise smooth closed surface in ℝ³ surrounding
a region V, and let φ be a real valued continuous function on S. Then there
exists at most one solution of the boundary-value problem

    ∇²u = 0 in V,

    u = φ on S.

Proof. Let u_1 and u_2 be two solutions of the given problem and set U = u_1 − u_2.
Then U is a solution of the boundary-value problem

    ∇²U = 0 in V,

    U = 0 on S,

and we will be done if we can show that U = 0 in V. To this end we note that when
u = v Green's first identity becomes

    ∬_{∂V} [ u(∇u) · n ] dS = ∭_V [ u(∇²u) + (∇u) · (∇u) ] dV.   (IV-27)

Thus, by setting u = U, we obtain

    ∭_V (∇U) · (∇U) dV = 0,

or

    ∭_V [ (∂U/∂x)² + (∂U/∂y)² + (∂U/∂z)² ] dV = 0;

and since the integrand appearing here is continuous and nonnegative everywhere
in V, we have

    (∂U/∂x)² + (∂U/∂y)² + (∂U/∂z)² = 0

in V. This, in turn, implies that

    ∂U/∂x = ∂U/∂y = ∂U/∂z = 0

everywhere in V, and it follows that U is constant in V. Finally, since U vanishes
identically on ∂V = S, U = 0 in V, as required.* ∎
We omit the two-dimensional version of this result, and go on to the corre-
sponding theorem for the heat equation.

Theorem IV-3. If S and V are as above, there exists at most one solution
of the three-dimensional heat equation

    a² ∂u/∂t = ∇²u

satisfying the boundary condition u(x, y, z, t) = F(x, y, z, t) on S, and the
initial condition u(x, y, z, 0) = G(x, y, z), where F and G are preassigned
continuous functions.

Proof. As in the preceding proof let U denote the difference between any two
solutions of this problem. Then U is a solution of the problem

    a² ∂U/∂t = ∇²U,

    U(x, y, z, t) = 0 on S,   U(x, y, z, 0) = 0,

and (IV-27) implies that

    ∭_V [ U(∇²U) + ‖∇U‖² ] dV = 0.

* Recall that, by definition, the values of U in V must approach the prescribed values
on S = ∂V as we approach S along any smooth curve in V. (See Section 13-2.)

But ∇²U = a² (∂U/∂t). Thus

    ∭_V [ a² U (∂U/∂t) + ‖∇U‖² ] dV = 0,

or

    ∭_V ∂/∂t ( ½ a² U² ) dV = −∭_V ‖∇U‖² dV.

Since U is continuously differentiable with respect to t, we can interchange the
order of integration and differentiation to obtain

    d/dt ∭_V ½ a² U² dV = −∭_V ‖∇U‖² dV.   (IV-28)

Now set

    I(t) = ∭_V ( ½ a² U² ) dV.

Then I(0) = 0, while I(t) ≥ 0 and I′(t) ≤ 0 for all t > 0. A straightforward
application of the mean value theorem then implies that I(t) ≡ 0. Indeed, from
the mean value theorem we deduce that

    I(t) − I(0) = t I′(θt),   0 < θ < 1,

or

    I(t) = t I′(θt),

and since I(t) ≥ 0 and I′(t) ≤ 0, we must have I(t) = 0. Thus the left-hand
side of (IV-28) vanishes, and it follows that

    ∭_V ‖∇U‖² dV = 0.

As before, this implies that U = 0 in V, and the proof is complete. ∎

(Note that this argument can be used equally well for any of the other boundary
conditions considered earlier.)
Finally we consider the uniqueness problem for the wave equation.

Theorem IV-4. With S and V as above, there exists at most one solution of
the three-dimensional wave equation

    a² ∂²u/∂t² = ∇²u

satisfying the boundary condition u(x, y, z, t) = F(x, y, z, t) on S, and the
initial conditions

    u(x, y, z, 0) = G(x, y, z),   ∂u/∂t (x, y, z, 0) = H(x, y, z)

in V, where F, G, and H are preassigned continuous functions.


Proof. Once again, let U denote the difference between any two solutions of the
given problem. Then U is a solution of the problem

    a² ∂²U/∂t² = ∇²U,

    U(x, y, z, t) = 0 on S,

    U(x, y, z, 0) = ∂U/∂t (x, y, z, 0) = 0 in V,

and hence (IV-25), with u = ∂U/∂t and v = U, yields

    ∭_V [ (∂U/∂t)(∇²U) + ∇(∂U/∂t) · ∇U ] dV = ∬_{∂V} [ (∂U/∂t) ∇U ] · n dS = 0.   (IV-29)

But

    (∂U/∂t)(∇²U) = a² (∂U/∂t)(∂²U/∂t²) = ∂/∂t [ ½ a² (∂U/∂t)² ]

and

    ∇(∂U/∂t) · ∇U = ∂/∂t ( ½ ‖∇U‖² ),

so that (IV-29) becomes

    d/dt ∭_V [ ½ a² (∂U/∂t)² + ½ ‖∇U‖² ] dV = 0.   (IV-30)

Thus the integral in (IV-30) does not depend on t, and since the initial conditions
require it to be zero when t = 0, we conclude that it is identically zero. This
implies that

    a² (∂U/∂t)² + ‖∇U‖² = 0,

and hence U must be a constant. Again using the initial conditions, this constant
must be zero, and we are done. ∎

As in the case of Theorem IV-2, we omit the lower dimensional versions of
Theorems IV-3 and IV-4.
recommendations for

further reading

LINEAR ALGEBRA

Finkbeiner, D., Introduction to Matrices and Linear Transformations; Freeman, San
Francisco, 1960.

Gel'fand, I., Lectures on Linear Algebra; Interscience, New York, 1961.

Halmos, P., Finite Dimensional Vector Spaces, 2nd Ed.; Van Nostrand, Princeton, 1958.

Hoffman, K., and R. Kunze, Linear Algebra; Prentice-Hall, Englewood Cliffs, N.J., 1961.

Nering, E., Linear Algebra and Matrix Theory; Wiley, New York, 1963.

Shilov, G., Introduction to the Theory of Linear Spaces; Prentice-Hall, Englewood Cliffs,
N.J., 1961.

DIFFERENTIAL EQUATIONS

Birkhoff, G., and G. C. Rota, Ordinary Differential Equations; Ginn, Boston, 1962.

Coddington, E., Introduction to Ordinary Differential Equations; Prentice-Hall, Engle-
wood Cliffs, N.J., 1961.

Pontryagin, L., Ordinary Differential Equations; Addison-Wesley, Reading, Mass., 1962.

Tricomi, F., Differential Equations; Hafner, New York, 1961.

Weinberger, H., A First Course in Partial Differential Equations; Blaisdell, New York,
1965.

Yoshida, K., Lectures on Differential and Integral Equations; Interscience, New York,
1960.

ORTHOGONAL FUNCTIONS AND SERIES EXPANSIONS

Churchill, R., Fourier Series and Boundary Value Problems, 2nd Ed.; McGraw-Hill,
New York, 1963.
Davis, H., Fourier Series and Orthogonal Polynomials; Allyn and Bacon, Boston, 1963.
Jackson, D., Fourier Series and Orthogonal Polynomials; Mathematical Association of
America (Carus Monograph), 1941.
Lebedev, N., Special Functions and Their Applications; Prentice-Hall, Englewood Cliffs,

N.J., 1965.

Rainville, E., Special Functions; Macmillan, New York, 1960.



Sansone, G., Orthogonal Functions, Revised English Ed.; Interscience, New York, 1959.
Tolstov, G., Fourier Series; Prentice-Hall, Englewood Cliffs, N.J., 1962.

APPLIED MATHEMATICS

Courant, R., and D. Hilbert, Methods of Mathematical Physics, Vol. I; Interscience,


New York, 1953.

Friedman, B., Principles and Techniques of Applied Mathematics; Wiley, New York, 1956.

Sobolev, S., Partial Differential Equations of Mathematical Physics; Addison-Wesley,
Reading, Mass., 1964.

ADVANCED CALCULUS AND VECTOR FIELD THEORY

Bartle, R., Elements of Real Analysis; Wiley, New York, 1964.

Buck, R., Advanced Calculus, 2nd Ed.; McGraw-Hill, New York, 1965.

Crowell, R., and R. Williamson, Calculus of Vector Functions; Prentice-Hall, Englewood
Cliffs, N.J., 1962.

Fleming, W., Functions of Several Variables; Addison-Wesley, Reading, Mass., 1965.
answers to odd-numbered exercises

Chapter 1

Section 1-1

1. (-1,3), (-3,9) \ 2' 3'' ^2' 3'

5. (−6, −3), (18, 9)   7. tan² x + 1,  −tan² x − 1

10x + 5 2x + 1
11. No; undefined at x = 1
+ X + x - 6

13. Yes 15. No; undefined at x =


17. No; discontinuous at x = 19. No; discontinuous at x =
21. (-3, -§) 23 f— ^ — 10 ^
-
25. i(l + sec
2
x) 27. In (x + 2)

Section 1-2

3. (a) (-1,5) (b) (0, J#) (c) (0, -2)


5. (a) + 2x 2 - 8x + 3 (b) 7 (c) f
3 - 4x 2 - i* + %
9. (a), (b), (d), (e), (g)
11. No; we cannot assert that x + 0 = x.

Section 1-4

1 (a) No; not closed under addition (b) Yes; satisfies subspace criterion
(c) Yes; satisfies subspace criterion (d) Yes; satisfies subspace criterion
(e) No ; not closed under addition
3. (a) Yes; satisfies subspace criterion
(b) No ; not closed under scalar multiplication
(c) Yes; satisfies subspace criterion (d) No; not closed under addition
(e) Yes; satisfies subspace criterion
7. (a), (b), (f)   9. (a), (f)   11. All constant functions
13. (a) x(αx + β)   (b) (x + 1)(αx + β)   (c) αx² + βx + γ
17. If 𝔛 consists of the two vectors (0, 1, 1), (0, −1, 1) and 𝔜 the two vectors (1, 0, 1),
(−1, 0, 1), then S(𝔛 ∩ 𝔜) is the trivial subspace of V, but S𝔛 ∩ S𝔜 comprises all
vectors of the form (0, 0, x₃). If 𝔛 consists of the two vectors (0, 0, 1), (0, 1, 1)
and 𝔜 the two vectors (0, 0, 1), (1, 0, 1), then S(𝔛 ∩ 𝔜) and S𝔛 ∩ S𝔜 each comprise
all vectors of the form (0, 0, x₃).

21. (c) W₁ ∩ W₂ contains only f(x) = 0.


Section 1-5

1. (a) All subsets which contain 1, 2, or 3 vectors


(b) All subsets which contain 1, 2, or 3 vectors
(c) All subsets which contain 1, 2, or 3 vectors and which do not include both
(1,1, 1) and (2, 2, 2)
(d) All subsets which contain 1, 2, or 3 vectors and which do not include (0, 0, 0)

3. No; (0, 2, -1) and (0, *, -£), or (0, \, -£) and (0, f, -±)
5. i = 1(1, 0) + 0(0, -1) = cos 0(cos 0, sin 0) - sin 0(-sin 0, cos 0)

= - (a, 0) + 0(0,0) = 1(1,1) - 1(0,1),


a
j = 0(1, 0) - 1(0,-1) = sin 0(cos 0, sin 0) + cos 0(-sin 0, cos 0)

= 0(a, 0) + - (0, 0) = 0(1, 1) + 1(0, 1),

i + j = 1(1, 0) - 1(0, -1) = (cos + sin 0)(cos 0, sin 0)

-f- (—sin + cos 0)(— sin 0, cos 0)

= -
a
(a, 0) + \ (0, 0) = 1(1, 1) + 0(0, 1),
p
a'i + 0'j = a'(l, 0) - 0'(O, -1) = (a' cos + 0' sin 0)(cos 0, sin 0)

+ (-a' sin + 0' cos 0)(-sin 0, cos 0)


a' 0'
+ ^(O,0) = -(a,O)
p a
= a'(l, 1) + (-a' + 00(0, 1)
7. (2, -2, 1, 3) = -1(1, 0, 0, 0) - 5(0, 1, 0, 0) - 2(0, 0, 1, 0) + 3(1, 1, 1, 1)
= 0(1, 1, 0, 0) + 3(0, 0, 1, 1) - 2(-l, 0, 1, 1) + 2(0, -1, 0, 1)
= 2(2,-1,0,1) + 0(1,3,2,0) + 1(0,-1,-1,0) + l(-2, 1,2,1)
= -1(1, -1, 2, 0) + 0(1, 1, 2, 0) + 3(3, 0, 0, 1) - 3(2, 1, -1, 0)
15. x2 = §(fx 2 - i) + |(1),
v3
a,
_ 3V5. V 3
S^S"*
-
_ S
2
Y\' _i
'
iWvl
10W
21. x = x(x - l)(x - 2) + 3x(x
3 - 1) + x,
x 3 + 3x - 1 = x(x - l)(x - 2) + 3x(x - 1) + 4x - 1

23. (a) x3 + 2x + 5 = + 2x + 5)
l(x 3
(b) x2 + 1 = K3* 2 + 2) + ^(6)
(c) 2* 3 - x2 + lOx + 2 = 2(* 3 + 2x + 5) - J(3x
2
+ 2) + 1(6*) - ^"(6)

Section 1-6

1. (a) -1,1,0 3. (a) (2,1,0)


(b) -3, 0, 1 (b) (4, 3, 3)
(c) 0,-1,1 (c) (-2,-1, -5)
(d) -f, -i,l (d) (7, |, 3)
(e) 6, -4, 2 (e) (6, 2, 0)

(0 -2, 1, 2 (0 (10, 6, 3)

5. One such basis is (-3, 0, 0, 0), (0, 1, 0, 0), (0, 0, 2, 0), (0, 0, 0, -1).

7. Yes; one such basis is (^, 0), (0, £).



Section 1-7

1. (a) 2 (b) 3

3. (a) One such basis is 1, x.


~1
(b) One such basis is 1, x, x2 , . . , xn .

5. (a) 3 (b) 2 7. 2

9. One such pair is x3 = (1, 0, 0, 0), x4 = (0, 1, 0, 0).

Section 1-8

1. A'B' and A"B", where, for example,


A' = (2, 1), B> = (4, 1), A" = (1, 2), B" = (3, 2)

3. ^'5' and A"B", where, for example,


A' = (2, -1), B' = (4, 1), A" = (1, 0), B" = (3, 2)

5. /4'Z?' and A"B", where, for example,


A' = (1, 2), 5' = (-2, -1), A" = (0, 3), 5" = (-3, 0)

7.

9. v(0£), where = (0, 0), E = (-£, -3)


11. (3,0) 13. (-3, -4) 15. (-13,-6)
17. v(OC), where O = (0, 0), C = (2, 2)

19. v(OC), where O = (0, 0), C = (-11, 5)

21. (x 3 + *2 — *i, y3 + y2 — y\)

23. \{OE), where = (0, 0), E = (x 2 - *i + x4 — *3, ^2 - yi + )>4 — }>3)

Section 1-9

3. m = 2: [0] = {2A:} = {0, ±2, ±4, ±6, .} . .

[1] = {2k + 1} = {±1, ±3, ±5, .} . .

m = 3: [0] = {3A:} = {0, ±3, ±6, ±9, .} . .

[1] = {3k + 1} = {. -8, -5, -2, 1, 4, 7,


. . , . .
.}

[2] = {3k + 2} = {. -7, -4, -1, 2, 5, 8,


. . , . . .}

m = 4: [0] = {4A:} = {0, ±4, ±8, ±12, .} . .

[1] = {4k + 1} = {. -11, -7, -3, 1, 5, 9,


. . , . .
.}

[2] = {4k + 2} = {0, ±2, ±6, ±10, .} . .

[3] = {4k + 3} = {. -9, -5, -1, 3, 7, 11,


.
.
, . .
.}

Chapter 2

Section 2-1

1 . Central reflection in the origin

3. Compression (toward the xi axis) by a factor £, followed by a magnification by a


factor 2

5. Rotation through 45° followed by magnification by a factor 2


7. Reflection in the line X2 = — *i 9. The whole plane is mapped on (0, 0)
728 ANSWERS TO ODD-NUMBERED EXERCISES |
CHAP. 2

11. Not linear 13. Linear 15. Linear

21. A(x) = ex, c an arbitrary real number

Section 2-2

5. (b) If 9C = {0}, then Ct(9C) is the space of all linear transformations from V i to V2
if SC = V 1, then G(9C) contains only the zero transformation.

Section 2-3

3. (a) LDy(x) = y(x) - y{a), DLy(x) = y(x)

2
n n -
(b) L D y(x) = y(x) y(a) + /(«)(* - a) + ^r <* ~ a> +
(n— 1), N

^ (n - 1)P '

Z> n L nj>0c) = y(x)

5. (a) x2£2 + 3xD + 1 (b) 2x£ 2 + (1 - 2x)D - 1

(c) 2xD 2
+ 2(1 - x)Z) - 1 (d) x 2 Z) 2 + 2x(l - x 2 )Z> - 6x 2
(e) x 2 Z) 2 + 2x(2 - x 2 )D + 2 - 4x 2

7. (a) 0, 0,
(b) 0,-6 sin 2x, 2e x
(c) -3 sin x - cos x, -2e x , 2 - 2x - 2x 2
2x
(d) 0, 0, 2e~
(e) Ax 2 Ax 2
,
- 21 x, e*(x
2
+ 2x - 2)

11. If B = O, then the subspace is V ; if B = /, then the subspace contains only O.

Section 2-4

1. (a) 9104) = 0; A~ x
(yi,y2) = (ivi, -&V2)
(b) 9104) is the line x2 = 0; there is no inverse.
(c) 9104) is the line xi + x2 = 0; there is no inverse.

y
(d) 9104) = 0; A-\yuy 2 ) = (^y^> -^~)
3. (a) 9104) comprises all constant polynomials; there is no inverse.

(b) 9104) = ; there is no inverse. {A is not onto.)


(c) 9104) comprises all constant polynomials; there is no inverse.
(d) 9104) = unless q(x) =
0, in which case 9104) = (P; there is no inverse.

(A is not onto.)

5. ai/32 — «2|Si 9*

Section 2-5

3. Representation of aA is I
a^j

Representation of A + B is
^+^ ft + ^
5.
Representation of

dim £((R 2
,
2
(R ) =
AB is
4
^ + ^ ^+^
2-7 ANSWERS TO ODD-NUMBERED EXERCISES 729

tic m 2-6
1 1 o" 1 1 1 -1 o"
1. (a) 1 1 (b) 1 -1 (c) 1 2 1

-1 1

3. (a) (1, h 4) (b) (1, -*, -4) (c ) V2» 4j


— Tlj)
"0 -1 "0 0"
3 6
-2 2 4 4 14
5. (a) (b) (C)
1 1 6 1 8
2 2 2

7. If (Bi = {ei, e2, . . . , e„} and ($>2 = {e{, e2, . .


.
, e^}, then

'tioi2 2-7
-2 1 -6
-11
20 -4
1. 3 7 -
3. 4 16
4 -8 8 L 3 3 .

2 1 1 8~
9 4 3 3
1 5 1 1
5.
3 2 2 6
35 7
. 18 18 i o

-i -2 -2
l 1
7. For standard bases,
1 1

0"

1 2
for bases »1, t»2,
1 2
1

11. If (Bx = {(e n ), (<? 12 ), (e 21 ), (e22 )}, CB 2 = {(e'u), (e[ 2 ), (e{ 3 ), (e'21 ), (e'22 ), (e23 )},

then [A:(S,u (B 2 ] = 1

2
-2
13. (b) -3 -4 15. mx)(x) = xv-3
2 -1
-2 2
730 ANSWERS TO ODD-NUMBERED EXERCISES | CHAP. 2

Section 2-8

18 -16 1

1. (a) 19 11 (b) Product undefined (c) 1

31 -25 1

8 2-4 12

(d) (3) (e)


4 1-26
-2 -h 1 -3
3. a 12 = «2i, or else an = 0:22 and a: 12 = ~ a 2i
i 1
1 1 2 2
5. (a)
5
4
5
1 (b) -1 1 -1
L 5 5j — 21 1
2
1 5 3
13 13 13
2 3 6
(c) 13 13 13
5 1 2
13 13 13

7. (a) Invertible if and only if a/3 7* 0;

_2_ 1

a/3 /3
inverse
1

(b) Invertible if and only if a/3 (/3 — a) 7* 0;

fl 1

a(/3 — a) a(/3 — a)
inverse
a _l
j8(j8 - a) 0O - a)
(c) Invertible if and only if a 9^ 0;

£ I
a a

inverse -1

1 _^ J_
.a a' a2
a "1 0~ "-1 0~
1 - a2 a, arbitrary except that /3 ^ 0; also and
a , fi
7 -ij |_
7 1

1 -1
7 arbitrary; and
1 -1
a
™2 also
11. a, arbitrary except that j8 5^ 0;
, j8
7

T^0 but otherwise arbitrary


3-2 ANSWERS TO ODD-NUMBERED EXERCISES 731

0'
2
2 1
13.

-1
-1 -6 1
15. (a)
£ -1* (b)
3
(c) ± 9
.2

17. One such pair is A =


1 1
5 -
r-i -f
[l lj |_
i ij

19. (b) An n X n diagonal matrix (a zj ) is invertible if and only if aii j* for


/= 1, 2, . .
. , n. The inverse is an n X n diagonal matrix (a'a) in which

a'u = I/an, i = 1, 2, ,n.

2
(d) The matrix is an n X 1 matrix with entries corresponding to the n 2 matrices
(eij) of the image basis. Those entries which correspond to (en), O22), (£33),
• • • ,
(e nn ) are each 1 and all other entries are 0.

Chapter 3

Section 3-1

1. (a) 6e 2x (b) 2 cos x — sin x


(c) 2 - 2x In x (d) 2e x (x sin x — 3 sin x — 2x cos x)
3. a = %, b = 0, c = —\
5. D(xD) = xD 2 + D, (xD)D = xD 2
7. (a) (aiD + l)(Z>iZ) + 1) = (a x + Z»i)Z> + 1, for given a x and 61
(b) D and Vx Z) are each linear on [0, 00) but D[Vx D] is undefined at x = 0.

9. (a) 3xZ> + 2 (b) (e x


+ e -*)Z) 2
(c) xD + Dx + 1
13. (a) (Z) - \)(D - 2) (b) (2Z) + 1)(Z> + 2)
(c) (2D + l) 2 (d) (£) + \)(D - 2) 2
(e) (D - \)(D + 2)(4D 2 + 1) _ (f) (Z) - 1)(D + \)(D 2 + 1)
(g) (D 2 + V2 Z) + 1)(£>
2 - V2 D + 1)
(h) [Z> - \]{D 2 - i(l + V5) Z> + 1][Z> 2 - i(l - V5) Z) + 1]
«
19. (a) ^ c-flfx* where c = 1, a = k(k - 1) • • •
(k - + i 1), / = 1, 2, 3, . . . , n

Section 3-2

1. (a) 2 (b) 3 (c) 3, 3 (d) 2 (e) 3, 3, 3

3. (b) y = cie ax
cosbx + C2e ax
sinbx

y = 6e aa: cos bx
<*(b + 1)
(c) e ax sin bx
b
) ) . —

732 ANSWERS TO ODD-NUMBERED EXERCISES CHAP. 3

5. (a) The given solutions are not linearly independent.


(b) y = c\ sin 3 x -f- C2

7. (a) y = c\ sinh x +
C2 cosh x
(b) j = cix 3 + C2X 3 ]nx
(c) y = ci sin 2x +
C2 cos 2x

+
1 + x
(d) y = c\x C2X In
1 - x

Section 3-3

1. y = —Q ' on (— oo , 0) or (0, oo

3. y =
sin x
on far < x < (k + 1)tt, k = 0, ±1, ±2, . . .

~x
5 . y = e + ce-^ l2) \ on (-00,00)
-(fi/£)
7. /
R
+ ce , on (-00, go)

x
= ce — (2x + l)e / n
9 _y ^\ <x
> on (— °o , 00
4(*2 + 1)
t

e{x - 1) + c
11. y = far < x < (k + 1>, A: = 0, ±1, ±2, . . .

x sin x

= + *2 *
.13. y
C ^ )
(<? + c)
'
° n (_i o) ° r ' (0, i}

sin x + In 1 — sin x\ + c
= -
1

15. y
(1 + sin x)2

on —- 2k 1
7r
^ x ^
< <
2k —+ -
1
ir,
,
k = n
U, ±1,
t ,
.,
±A
_1
17, ^ = (sin x In |csc 2x + cot 2x| c sin x) +
For a given c the solution is valid on any interval which does not include any point
far/2, k = 0, ±1, ±2, or at which the quantity in the parentheses vanishes;
. .
.
,

y = is also a solution on the intervals &7r/2 < x < (k + 1)tt/2.

2 3'2 2'3
19. y = e\\ + c(x + i)- ] , on (-00, 00)
1/3

21. y = 3
* +
_u 1 A 2
~ 6x\ _ 3 /x
2
- 12x\
+,1/3^
72x
(In x) 3
+ c

2 V In x / 2 V On x)2 / 4 V
on (0, l)and (1, 00)
2
23. y = {(1 - x)[c + £ln(-x + V* 2 - 1)] - Vx 2 - l} on (-00, -1); ,

_1 2
y = -{(1 - x)[c + £cos x] + Vl - * }
2 on (-1, 1); ,
2
y = {(x - l)[c + £ln (x + V* 2 - 1)1 - Vx 2 - l} on (1, 00); ,

also, y = is a solution on — 00 < x < °o

5
25. y = - 7=
x3 + 5x + cV|jc|

For a given c the solution is valid on any interval on which the denominator
nowhere vanishes; y = is also a solution, on (— <x> 00). ,
3-4 I ANSWERS TO ODD-NUMBERED EXERCISES 733

x2/2
27. (a) v = - - + - / e~ dx; on (-oo, 0) and (0, »)
X X J2
(b) y(l) = -6.3407, y'(l) = 6.9472

1 2x
29. y = 1 + —
31. y = x + —
1 x + ce J c In |x|

COS X — sin 2 x l\— li


33. y = -rrrrt 1 +
.

( ce ~ 2) 1
sin x

ce — 2
35. (b) (i) y also y = —1
1 — cex
;

3
my 2 \ce 12 * - 1/
'
also y

1
(iii) y = 1 + ; also y = 1
x + c
1 (2ce^ x + 3\ .

37.,' + ,
2
_ 1 =0,0 = 1 + ce2x _ x
> also v = 1 ; y = cie* + c 2e

= oe 2x + c 2 e 3x 41. (a) cie


-x + c 2 x*?~ x
39. (a) y
(b) y = cie~W 2)x + c 2 e x (b) cie (3 ' 2) * + c 2xe< 3/2) *
(c) y = cie _x + C2e 2x (c) cie
(3/2)x
+ c 2 xe (3/2) *
(d) y = cie- (5/4) * + c 2 e (4/3)x (d) cie
(1/6)x
+ c 2 xe (1/6) *
(e) y = cie~^ /2)x + c 2 e i<i/2)x (e) c e :
^ /2) *
(
+ c 2 xe^ l2)x

Section 3-4
1
1. y = -' on x >
x2

3. y = e~
x
- (,- 3/2 + 3,- 9/2 )e- (3/2)x on (-co, co) ,

1
5. y = tfx, on (— t, 0)
x sin x J _i/2 x

= 1 1 + sin x
7. y
sin x
1 + (1 + sin x) In , on (0, 7r)

x — 1

3(x + 1)

1L ' =
9 + \/3ir
sin
.

x H
, v3 — :
7r
cos x > on (0, 00 )
V^L 12 - 4
13. y = (1 + 21n2) - 21n|x|,on(-oo,0)

15. The equation + y = is of the first order on (—00, 00) but is not normal
xy'
on any which contains x = 0. If y(x) is any solution then the equation
interval
implies that y(0) = 0, and so if yo ^ there is no solution which satisfies the initial
condition y(0) = yo.
~ x o\
23. G is the linear operator which transforms h(x) into e Hx) J*^ h(x)e~ kx dx +y e Hx
734 ANSWERS TO ODD-NUMBERED EXERCISES |
CHAP. 3

Section 3-5

1. (a) Interval is (—oo, go).

(b) One representation of y is

y = ci 1 + c 2 e~ x + c^e 2x
• .

yx = 1, y 2 = \ - %e~* + \e
2x
(c) ,

^3 = -\ + &-' + fe 2x

3. (a) Interval is (—<*>, °o).


(b) One representation of >> is

y = c\e x + C2*e + C3e z cosx + c^sinx. a:

(c) >>i = 2e
x —
2xe x — e^cosx + e*sinx,
y 2 = —2e + 4xe + 2e cosx — S^sinx,
x x a:

x — x —
y3 = e 3xe e x cos x + 3e* sin x,
= x — x
y± xe e sin x
5. (a) Interval is (0, <»).
(b) One representation of y is

ci

x
I

2
'a
+
*' +
1
2
- ) c 2 (3x

x
2
2
).

1
1
(c)yi =-3 + 3x'
y2==
T-3x
Section 3-6

1. = -\2e x
W[l, e~ x , 2e 2x ]

3. ^[l,x, x 2 ...,x n ] = 1!
,

2! • 3! •
•;

5. ^[x 1/2 x 1/3 = -£*- 1/6


, ]

H^[ex e x sin x] = e cos x


x
7. ,

9. W[\, sin x, 1 — cos x] =


2 2 sin 3 x

11. WlVl - *2 , *] = 1/Vl - *2


«i(xV2'(x) — u 2 (x)u"(x)
23. ai(x) = —
Ul(x)l/2(x) - U 2 (x)u\(x)
u\(x)u'{(x) - u 2 (x)u'{(,x)
Q2(x)
Kl(x)w'2 (x) — U2(x)u'i{x)

25. (a) x 2y" - x(x + 2)y' + (x + 2)y =


(b) x 2 y" - 2xy' + 2y =
(c) /' + >- =
(d) (x cos x - sin x)y" + (x sin x)y' - (sin x)y =
(e)
2
x(l - In x)y" + xy' - y =

Section 3-7

1. (a) -1/x (b) -4/(1 - x 2)


(c) 2x 3 (d) -1
^l-coss (2/3)I(l+x3)l/2-2l/2]
(e) (f) e

M -1 +|ln
1 + x
3. Jce 5. -J--
2
7.
* - x
sin x 1
4-3 I ANSWERS TO ODD-NUMBERED EXERCISES 735

Chapter 4

Section 4-1

3. x2 - lax + (a
2
+ b 2) =
5. (a) (D + l)
2 (D
+ 2)
(b) (Z) - 1)(Z>
2
+ 1)
(c) (D + V5)(D - V5)(Z) 2 + 2D + 5)
(d) (D + 1)(Z) - 1)(Z> + 2)(Z> -2)
(e) (Z>
2
+ V2VT0 - 2 Z> + vlO )(Z> 2 - V2\/I0 - 2 Z) + %/TO )

Section 4-2

\. y — c\e~ 2x + C2e x
3. y = ae
3x/4
+ c 2 e' 5x/2
5. y = ci cos 2x + C2 sin 2a:
7. j> = cie"~
2x cos2x
+ C2e~ 2x sin 2x
9. >> = cie
x
cosx + C2£ x sinx
y = c ie~ cos Vl x + cie~ sin a/3 x
x x
11.
"

13. >> = cie


2V 3- + c 2 c v 3x/2

15. y = c 1 e (3/8)x cos—- x


V2
+ c 2 e (3/8)x sin

V2
x-

17. }>= (i - |*)*4 *


19. >>= 2 cos V2 x -f- 2 sin \/2 x

y = — §e- sin3jt
2x
21.

y = 3xe^
x
23.

25. y = \/2e
(V2/2)x (cos — x + sin — x)
29. (a) /' + 6/ + 9y = (b) /' - 2? + 5y =
(c) /' + 4/ + Ay = 4 (d) /' + Ay' + 3>> = 3x + 16
(e) /' + 9y = 3x
31. (b) y = ci cos (2x + C2)
33. (a) y = e cos 5a: + ^e x sin 5a:
x
" 50 *
(b) y = (4 - ;rV)e<
1+5i
+ (i + ^iV >* 1

Section 4-3

1. y = c\e x + c 2 e _x + c 3 ^ _3x 3. y = c\ + (c 2 + c 3 x)e~ (3/2)x


+ = + c 2 x + c 3 e~ x + c±e
x
5. y = cie" 2x c 2 e 2x + c 3 e~
(1/2)x
7. >> ci

9. y = (ci + C2A:) cos 3x + (c 3 + c±x) sin 3x

11. y = ci + c 2 A- + c 3 e-
(1/2)x
cos —
V3
x + c 4 e-
(1/2)x sin —
V^
x

13. y = (ci + C2X + c 3x 2 + ax 3 )ex


15. j = (ci + c 2x + c 3 x 2 )e _2x + c 4 cos V3 x + c 5 sin a/3 x
736 ANSWERS TO ODD-NUMBERED EXERCISES I CHAP. 4

17. (a) (D - l)
3 (b) D2 - 4Z> + 8 (c) (D 2 + l)
3

(d) Z)
2
(D + 2) (e) (D 2 + 4)
3
(f) (D 2 - 2D + 5)
3

(g) (Z)
2
+ l)
2

(h) (D - - 6D + 10) 3
3)
3 (Z> 2

(i) D(Z) 1) - - 2) 3 (Z> - 3) 4


2 (D

(j) D n+1 (D - l) n (Z) - 2)"- •••(£)- 1


A:)"-*" 1
"
1
• • • (Z) - h)

21. (d) The results of (a) and (b) show that the solutions are linearly independent in
C(— oo, oo ).

Section 4-4

{Note. Sometimes a simplified particular solution may be obtained by deleting from the
solution given by variation of parameters, or the use of Green's functions, any terms
which satisfy the homogeneous equation.)
1. y = (ci + In |cos x\) cos x + (c2 + x) sin x
2* — 1 o , O
3. y e 2x + (ci + C2X)e~ Zx
32

5. y = ie~ {1/2)x (—xsinx — 2 cos a; + c\ + c 2 x)


2 * - 14 - 3*- 2 *) + ci^- 5 + V37)x + c 2 e
- 5 - VF?)a!
>, = ^(7e
(
7.

y = ie 2:r (sinjc — 2cosx) + cie cos.x + C2e sinx


x r
9.

v p = x (ilnx - |)
3
11.

cos 2x
13- yp = T? hi|tanx| + (cos2x)ln
16 1_
(1 + cos 2x)|sin 2x\
2{x ' (x - l)

15. >>
p = —x 17. K(x, t) = U.e
l)
~ e- ]

x_< x(x - t)
19. #(*,/) = 2£> sin K* - 21. K(x,t)

23. K(x, t) = —-—r


1
- In
(1

(1
+
-
x)(l
x)(l
-
+ /)

2 2 2
25. If a 5^ or if co 9^ b —
- a , then

sin (W + <pi) sin (wt + ^2)


2^ Z>i D2
_ a< sin (ct — (pi) ,
sin (ct + <p 2 )

+r
2c £1 Z>2

where

c = (b
2 - a 2) 1 ' 2 , Di = [a
2
+ (w + c)
2
]
1/2
, £> 2 = [a
2
+ (« - c)
2
]
1 '2
,

and <pi, (P2 are determined by the relations

cos<£>i = (co + c)/D\, sin^i = a/Di, COS <P2 = (w — C)/Z>2,

sin <p2 = a/ D2, < <pi, <p2 < 2ir.

If a = and co
2 = b 2
, then

y = —
2 co
o (sin w/ — (at cos wf).
4-8 I ANSWERS TO ODD-NUMBERED EXERCISES 737

Section 4-5

1. y = (1*3 _ 1*2 + i x _ A) ex _(_ Cie -* + (C2 + c 3 x)e*


2x
y = _(i*3 + 3^2 + X + 3) + Cl + C2 * _|_ C3e
3
3 .

5 . j, = ^(4 cos x + 3 sin x) + ae


x
+ c 2 e 2 * + c 3 e _3a!
7. y = Jcosx + ci + C2e
_a:
+ C3C X

= + + C2 + C3X + c±e~
(t - C1
y e*
9. **
)
~ - - -
11. #(;t, = ex l
(x t) 1
*- J)
13. #(x, /) = - 2e
\\e*- 1 2 (*-»
+ e 3( ]

15. K(x, = sinh (x - t) - (x -

Section 4-6

l m
y = 2x + /afe-(
x2/2)
dx + k2
3. y = kix + k 2 2

5. y = *3/2 (A:ilnjc + £ 2) x > o

= e — /[ao^/ajCz)]^
^JX) /[« (x)/a jCi)]^
7. )>
I c + k
7 tfi(x)

9. Let y\ = Jo(x) ——t--2


x[7o(x)]
: lim yi
*_>o
= °o , but lim -

^olnx
—= 1

Section 4'-7

1. ^p - 23 e x _j_ x 2 _ 2x
_ 3- y P = £ cos x — \ sin x

-2
yP = (x - 3x + ^&)e*
2
5. JP = -|*e >
7.

2 3
9. y P = eh ( 899 + I 305 * + 675 * + 125x )
11. yP = ^_
(2374 - 2940* + 3200x - 1500a: + 625x )
2 3 4

y p = \{x - l)e 15. y v = 2 + 2x + xe~


x x
13.

17. y P = -(sin x + cos x) 19. >> p = ^(39* + 9x + 2x


2 3
+ 12xe*)

21. }> = (ci + c 2 x)x e + (c 3 + c 4 x + c 5 x )cosx + (c 6 +


P
2 2x + 2 c 7x
2
c 8 x )sinx

23. yp = en?-* + x 3
[(c 2 + c 3 x + c 4 x 2
)cosx + (c 5 + c 6 x + c 7 x 2
)sinx]

25. >>„ = x [(ci + c 2 x)e


2 x
+ (c 3 + c±x)e~ x + (c 5 + c 6x + c 7 x 2 + c 8x 3) cos x
+ ciox + cnx 2 + c 12x 3)sinx]
+ (eg

27. yp = x 3 (c! + c 2 x + c 3 x 2 )e x + (c 4 + c 5 x)cosx + (c 6 + c 7 x)sinx

yp = ci + c 2 x + xV [(c3 + c 4 x)cosx + (c 5 + c 6 x) sin x]


/2
29.
+ (C7 + c 8x + c 9x 2)cosx + (cio + ciix + ci 2 x 2)sinx
Section 4-8

1. y = ci\x\ + c 2 |x|- 2

3. y = ci sin (3 In |x|) + c 2 cos (3 In |x|)

5. y = ci\x\p + c 2 |x|-p, ifp 9^ 0;y = a+ c 2 ln|x|,if/? =


1. y = \x\~Hci + c 2 ln|x|) + c 3 |x| 7
1

738 ANSWERS TO ODD-NUMBERED EXERCISES |


CHAP. 4

9. y = alx]- 1 + 021x1^ + c 3 \x\-^


_1
11. y = a\x\ + C2|x| + C3 cos (In \x\) + C4 sin (In \x\)

13. Letd = (ai - l)


2 -
4a .

(i) If d > 0, setai = £[1 - ai - \/d],

a 2 = i[l
2U — ai
«1 T
+ V « J» then
V</]; «-"«-•" #(*,
-»vv-v, f)
»y — /- a —
'? 2
y/df
a
x [ln x — In /]
(ii) If
1
a = 0, set a = (1 - a)/2; then #(x, ?)
/a-l

(iii) If d < 0, set a = ~ ai


> 6 = —r— ; then
a

K(x, f) = — cos [Z>(ln x — In /)].

17. (a) (xZ) + 2)(xD - 1) (b) (xD + 3)(jcZ) - 3)


(c) (xD + 2
l) (xZ) - 7) (d) (xD - l)(xD - 2)(xD - 3)

(e) (x
2
D 2 + xD + l)(xZ) + 1)(*Z> - 1)

19. y = -£(ln *) cos (In jc


3
) 21. j> = £x

Section 4-9

(Numerical values are approximate)


1. (a) x = x e- [(ln2)/5600l< = xoe- 000012t
(b) t = 18,600

3. (a) x = (*o + 72.13A:)e-° 01386< - 72.13A:


= In (xo + 72.13*) - In (72.1 3fc)
~
( *
*
0.01386
5. 132.9 hours

7. (a) 9.00 years


(b) 221,000,000; 856,000,000

9. r = ro — kt\ evaporation complete when t = ro/k.

11. h = h cosV2g/Lt

Section 4-10

1- ' = 'o + ^e-™™ + f e-'ftoibt - /3),

where Y = V(* - aL) 2 + 6 2£ 2 > /? - a^ = 1" cos 0,

6L = F sin < j8, j8 < x.

3. (a) q = 0.03e
(c) 3.5676 volts
2 1/2
5. If R > 2(L/C)
1/2
, set A = [R - (4L/C)] ; then

[R/(2L)]t
q = 0.03e- cosh-, + -sinh-f
1/2 [RI{2L)]t
If* = 2(L/C) then q = , 0.03[1 + (R/2L)t]e- .
5-4 I ANSWERS TO ODD-NUMBERED EXERCISES 739

If R < 2 (^\, set A' = [(4L/C) - R2 ]


1 '2
; then

-[R/(2L)]t A' R . A'


q = 0.03e" cos-, sin-,
+K7
- uEC £C
7. (i) q = <?(0)cos + Vlc /(o)
- u 2 LC sin + 1 - co2£C
sin cof

Vlc 1 'LC

(ii) q = q(0) cos cor H /(0) + ^ sin«/-£ E


/cos«r

(Resonance in this case)

Chapter 5

Section 5-1

1. (a) Piecewise continuous on [0, <»), from definition

(b) Piecewise continuous on [0, <*>), from definition

(c) Not piecewise continuous on [0, °o), since function becomes infinite at t = 1

(d) Piecewise continuous on [0, <*>); note that

- 2 = t 1

}™+T2—^r-2 = SS-fi- t-2 3'

Note that the function, as given, is not continuous on [0, 00 ), since it is unde-
fined at t = 2.

(e) Not piecewise continuous on [0, 00), since it becomes infinite as t —>

7.1 9. -;

11. Km (t sin \/t) = 0; t sin \/t is piecewise continuous on [0, 00 ).

Section 5-2

1. £[t] = — ; so = 0; abscissa of convergence is 0.


s2

3. £[/](*) = - (1 - e~
s
), if s * 0; £[/](0) = 1; s = -«
5

5. £[sin atf] = ——
s2
-

+
— - ;s
a2
=

Section 5-4

1
cos a + 5 sin a
52 + 1

r n n— 1 n-2 „ 1
° U
* -•
3. n\
_n!s
I

(n - l)\s 2
I

(n - 2)!j3
"*" I

^I

l!^ n ^
I

j»+ij
2
s 2a
5
-s* - a2 ' s(s 2 + 4a2)
740 ANSWERS TO ODD-NUMBERED EXERCISES CHAP. 5

4s
13. -e
'
(s 2 + 4)2 a
a
17. If |/(0| < Ce \ then lim e~"f(t) = if s > a.
t—>00
21. y = e + 2e* 2<

y = -| + \e- + t qCOs2? +
3
23. %
§sin2f
25. y = -it + fe
2 3 - fe~ 2< '

27. y = -2 + 2 cos £f + sin £/

Section 5-5
72j(j - 9)
1. 3.
(5 -2)2+9 (j2 + 9)4

5.
(5 + 3) cos 4 - 2 sin 4
(^ + 3)2 + 4

7. If£[/] = <p(s), then £[te


2
'/'(/)] = - (5 - 2)-<p(s - 2) + ?(j - 2)
as
~ s/2 - -2t>
„ (3s + 2) e
11.
1 e

2,y2 s2 + 1

2
—n — —2s —3s
— e—
I 4s,
1
13.

s2
[1
yi
Ae + i /i
4e ]
15.
(s - 2)2 + 4

1 l + e~"\ 2(s-- 2)
17. ( 19.
-
j2 + 1 \1 - C-'V S[(S 2)2 + 1]2

8(5s - 1) 2(cos 1 - sin 1)

(s2 + 1)4 s3

„„ 3s
4
- 16s 3 + 96s - 108 1 e
2a
(2 sin a — 5a sin a — cos a)
23.
- ttt^ »x„ «,„ +
,

- + -
l)2[(s - + l) 2 l) 2
,

(s 3)2 1]3 5(s 5(5

/ __ ie/ + 120/ - 400s + 460 e


2a
(2sina - 5a sin a - cos a)
= -
5[(s - 3) 2 + 1? 5(s l) 2

25. 1 - e 27. ±
4 -
— ie
4e
-2t
"* —
- ite~
2'

1 n —1 at
31. i-e~
zt
-2t
sin5/
29. t e
(1 - D!
33. / sin t 35. § sin ? + fr cos t

(0 if r ^
37. V3 W2 (r)sin —l
(/ - 2) = i/^ J_
2,

(
, _ 2) if , > 2

39.
1
cosh
,
—-sin ——
t . t
— sinh
. ,
——cos—-
t t

V2 V2 V2 V2 a/2

= 1 if t ^ 1,
41. 1 + «i(0
2 if t > 1
5-7 |
ANSWERS TO ODD-NUMBERED EXERCISES 741

„,
43. —
3a 2
1
e
-at
- — 1

3a 2
<?
at/2
cos
V3
—- at
2
+
4\/3
-^pr- *
9a 2
at/2
sin
.


V3 at
2

Ae 1 —2t —3u
45. - ,
(e — e )

\
n —1 times

Section 5-6

1. y = \e - \e~
l
- \te~
l

3. y = %e- cosV2t l

+ ^e^siny/lt + t - f

~ j)
+ I E ("D^Wtl - e-
k(t
]

7 - y = "to cos 2 ' + M sin 2 ' - f tet + T5et + ie2t


9. y = te -2l + e-2« + J/2 ( /)[^-(«-2) - (, - 1)^-2(1-2)]

(re~ 2 ' + e- 2 '


if / < 2
[re~ 2 ^ e~ 2t + e
-«- 2 - >
(t - l) e
- 2( <- 2 >
if t > 2
-f-

'- x) - -»
11. y = 2sin2r + fe- (
fe
(

13. y = - 2te-< - 2ui(t)[l - re-c- )] + « 2 (0[1 - U -


1
1
l)e-«~ 2) ]
- 2te~* if t < 1,
{1-1 - 2te~< + 2re-('- if 1 < < 2, 1)
f

-2te-« + 2te-('-» - (t - l)e-('-2) if 2 < /

(-D
!7 -'°« = Z4^<
fc=0
2 2 *(A:!) 2
,2*

Section 5-7

-<
1. f/C - £)sinU£ 3. e + / - 1
Jo
6t at
e e
5. f[rcos/ + sin/] 7. ~ \{ a ^ £ ,e
af
if a = £
o — a
;

n O 2 2
_ (cos bt — cos at) ifa ?* b ; ^/sinar ifZ> = ±a
a2 £2
n_1
,, -« _ _
•'+ 1
,„ - Oe-tgWdt ,. /
11. ej f(t 17.
(« - D!
«, t ii 3tt 1 /, 9 \ „ _i s
-tan
T -j,ln^+ ?j-3tan
, ,

21. j 23. j
j

27. The minimum is between x = 1 and x = 2.

31. (a) V^ (b) 2


742 ANSWERS TO ODD-NUMBERED EXERCISES |


CHAP. 5

Section 5-8

1. K(t, & = (t - Qe 2 «-0


3. K(t,® = ie- 3(e -«>sin2(r -
5. *(/, = (t - &e«-V /&

7. K(t, e-
(< -e -^- e/2 V~3
cos^(r-{) + >/3e<-*'* sin ^(,-0

9.^(^) = ^j,-
V_2( ^ /2
sin^(f- f) + cosYC- *)

+ e
V2(«-f)/2
sin^(f - f)-cos^(/- f)

~ _1 -1)
11. y = hi(0O ' + i[c' - c- ('
]}

_ (0 if< f < 1,

|l _ t+ i[ t-i
e _ e
-(«-i)] if/ > 1

17. >> = |e-<'~'> - ^e (t ~ T)/2 - ^ sin / - ^ cos f

19. y = f^- + 2)
^e-(«-2) + ^-t + ^e4 '- 10 - Ate~ {

21. y = e'
/2
[^(/ - a) sin 3/ - -^ sin 3a sin 3(/ - a)]

+ e ('-«)/2[7 cos 3( r _ a) _ JLL s in 3(/ - fl)]

23. >> = / - a - f sin (t - a) + W-a ) cos (' ~ a)

Section 5-9

. t , i -t ,
2\/3 t/2 . V3,
3 2

+ ^e
'- 1)/2
i„-«-» _ i^-^cos^C/- l)
(
sin^O - 1)

V + y-* + ^e^sin^^ if ^ / ^ 1,

= <

fc' + fcr
l

+
2\/3 «/2
e' sm

—\/3
t + i
$e
-0-D -\e
tjt-D/2
cos — - 1)

V3 e o-i)/2 sin V3 (r _ >


+ 1) iff j

5. y = \ sin It + 6 M*72(0[2 cos f — sin 2t]

f^sin2/ if ^ t ^ tt/2,

If sin 2? + £ cos 1 — | sin It \l t > t/2

lm
— a/3 r- fk
+ fk
.
L
7. y = 2t -^ sin V 3
.

t 9. y = a cos a /— /
,

b A /-r- sin A /

13. (a) >>= e-%-1 + I*" 8 ')

(b) r = i In 3 = 0.27465
(c) \/3/27 = 0.064150 above position of equilibrium
6-3 ANSWERS TO ODD-NUMBERED EXERCISES 743

(d) Mass starts at time t — at the point 1 unit below the position of equilibrium
with velocity 10, and moves upward. It passes through the position of equi-
librium att = 5 In 3, and at / = ^ In 3 it reaches its maximum height, x/3/27,
above the position of equilibrium, after which it descends asymptotically
toward the position of equilibrium.
15. y = e-<"[a + {b + <ra)t]

-\tl(2m) \/4km - X2 2bm + Xa- . \/4km - X2


y =
,

17. e a cos t -\ sin - t 0.


2m \/\km - X2 2m

Chapter 6

Section 6-1

5. (b) No; if /(x) = on (0, oo) except possibly on a finite closed interval, then the
equation cannot have infinitely many solutions on (0, «> ).

Section 6-2
2
/ 1
1 — An
P \
7. (a) u" + (l + 4 J ) " = 0, [— °o 0) and
on (- , (0, <x>

(b) «" +
!+/?(/> + Dd ~ * )
« = 0, on (-oo,-l), (-1,1) or (1,00)
(1 - x2 ) 2
2 2
x + 2 + 4/(1 - x )
=
(c) u" + 4(1 - X2 )2 u 0, on (-oo,-l), (-1, l)or (l,oo)

(d) «" + (2/> + 1 - x )« = 0, on (-oo, oo)

Section 6-3

1. (a) ± + 6y = o, on (-oo,-l), (-1, l)or(l,oo)


dx

= on (— oo, oo)
^i'- 0,

0, on(-oo,-2 1/3 ),
(-2,1/3 ,2-1/3 )or(2
1/O 1/O 1/O
1/3
,oo)
1/2
3/2 (-In x) sin x
(d) (-lnx) ^ + ,

^ y, on (0, 1),
dx dx 2x
d_
n \l/2
(In x) sin

x
3/2 ffr
(lnx) >>, on (l,oo)
dx dx 2x
d_ 1 dy 2x
(e) + y = 0, on(— oo, — l)or (— 1, oo
dx x -+- 1 dx (x + D2
,2
2t 2
3. ?£+ (e - p )y = 0, on (-co,0) or (0,oo)
at*

5. (a) Every solution has infinitely many zeros on every interval of the form (— oo — a) ,

or (a, oo ), where a > 1, and does not oscillate on (—1, 0) or (0, 1).
(b) Every solution has infinitely many zeros on every interval of the form (— oo a) ,

and does not oscillate on (1, oo).


744 ANSWERS TO ODD-NUMBERED EXERCISES |
CHAP. 6

(c) Every solution has infinitely many zeros on every interval of the form (a, <x>
),

where a > 1, but does not oscillate on (— oo 0) or (0, 1). ,

7. (a) Every solution has infinitely many zeros on every interval of the form (a, oo ),

where a > 0, but does not oscillate on (— «> 0). ,

Section 6-5

= f-(-D*
2k
1-3' = aoyo + divr, vo Is oft)! x '

fc=0
00
(— 1) 2fc+l x
x , on ,(—oo, oo

k=0
(2k + 1)!

3*
= owo + = +

El
3. y ao'UJ'o 1
J^tti
^ ft![2 •
5 •
8 • • •
(3k - 1)]

00 1
3&+1 , >.

„ *![l-4-7---(3* + l)l * .«(—•-)


*=0
,

(-D
5. >> = tfi(x + x ) + 3a ^
A;=0
(2Jfc - 3)(2k - 1)
a:
2/fc

, on (—1, 1)

7. y = ai(x + §x'
- 5)!(2fc+ 1)
+ ao
L
1 + 9x
2
+ ^x4 + 3 ^
A=3
(-1)
fc (2fc

2*-3[A:!(ft - 3)!]
*
2fc

on
(-^
9. y = a yo + ai^i, where y = ^ (-D *
fc=0
"TuT"
£!3*
3k
3i
'

(-I)'
3
1
!
-St ft=0
(1 + 3 •
1)(1 + 3 •
2) • • •
(1 + 3*)
x , on (— oo , oo

»> " ^+ 1 1 Tftihn


6f^k\(2k- 1)
k ~ lx2k
+ S
£{(2* +
^^ <*^ 1)!
V3;
+1

1 6-1 2*
+ flo 1 — - Y^ riV
Kg}
'
; + a\x, on (—oo, oo
6^[k\(2k - \)

13. y = aoyo + aiyi + «23 2, where


;

10 19 - -
£ 1 (9A: 8)
• • •
3*
>>o = 1 + (3A:)!
* '

4j_13 -22 (9* -


E 5) st +l
• •

yi -x + (3* + 1)!

y2 ^ 2
+ 2E- *=i
7 •
16 •
25
(3*

+

(9*
2)!
- 2)
— x 3k+2 , on (—00,
.
00),
)

6-5 I
ANSWERS TO ODD-NUMBERED EXERCISES 745

15. y = a yo + aiyu where


yo = l +
Y, 2
2fc
[(0)(-l) - 1][(1)(0) - 1][(2)(1) - 1] • • • [(k - 1)(* - 2) - 1] 2k
*=i (2*)!

y\ = x +
2
*=i
[(!)(-!) - 4][(3)(1) - 4][(5)(3)
(2k +
-
1)!
4] • • •
[(2k - \)(2k - 3) - 4] 2 *+i

on (-1,1)

17. y = Jp + «oyo + aiyi, where


_ i 2 — j_ 3 4 L 5 _ 21 6
J^ — "S^
7 , ,

12* "+"12 8-*' •" 6 0-^ 51 20*


00
(2* - 7)!(2* + 1)(2* + 3) 2*

24(*-i)A:!(ik - 4)!

= 1 _^8* 2 4_1Z^
+ 128*
4 _ J^L * 6 _ cf ^- 7)!(2fc + D(2k-
yo

= x - x 3 5
1024
^ 24<*-»fc!(ife - 4)!

yi + £x , on (-2, 2)

m
19.
V^ ( i^+l
= ao2^(-\)
A: + 1
x
3fc
+.
„ «„
aix, on
/ -,1/3
(-2 ,2
l/3,
>;
k _ )

fc=0

+-^ X"^
+ V^ +
4 2fr+l
= -x 2
1 1 3 1
21. y
1
+-x
"
,

/4
+2^ . .

fl 2fcX
2fc ,

2^ »2fc+ix ,
,
where
fc=3 fc=2

2 3
a2 * = t^T [1 - 2(* - 2) + 2 (A: - 2)(* - 3) - 2 (fc - 2)(k - 3)(k - 4)
(2A:)!
2
+ • • •
+ (-2f- (k - 2)!],

a 2k +i = [1 - (2* - 3) + (2* - 3)(2* - 5) - (2k - 3)(2fc - 5)


(2 ^^_ 1);
fc_1
X (2k - 7) + • • •
+ (-l) (2A: - 3)(2k - 5)(2k - 7) • • •
(1)],

on (—oo, oo

23. There exists a solution which is a polynomial of degree zero (a nonzero constant)
if and only if 7 = 0; there exists a polynomial of degree 1 which is a solution if

and only if /3 +
7 = 0; there exists a solution which is a polynomial of degree
n > 2 if and only if n(n - 1) fin 7 = and either a = or k(k - 1) + + +
file + 7 5^ for all nonnegative integers £ which are less than n and differ from
n by an even integer.

25. (b) A basis is {yo, yi} where


*~ lx
i_l\S X ~ 2 >< X ~ - -
+ 1, (-D^
2 ( 4) • • • [X 2(A: 1)] j> fc
=
m
*
'0 1
mk _ * >

f. . *
fc

2 (X - 1) (X - 3)(X - 5)---[X - (2k - 1)] 2 *+i


*=*+L,(-l) ,

(2kTT)l
X '

k =l
on (—oo 5
oo )
746 ANSWERS TO ODD-NUMBERED EXERCISES |
CHAP. 7

Section 6-6

1. y = - i* 3 + j±oX 5 + j^* 6 +
1 • • • , on (-oo, oo)

3. y = — 1 — x + \x 2 — \x z + • • • , on (— 1, 1)

5. y = 1 + x + x 2 + jix 3 + on (-*, J) • • • ,

7. j> = 6x + i*
3
+ f* + 27)* +
4 5
on (-3, 3) • • • ,

9. y = \{x - 3) 2 + £>(x - 3) 5 - zh>(x - 3) 6 + T^(x - 3)


7
+ • • • , on (0,6)
11. y = \x 2 + -j^x 3 + ^x4 + 5
+ on (-00, 00) ^x • • • ,

13. x - \x z + i^x
5 - ^x 7 + • • • , on (-00, 00)

Chapter 7

Section 7-1

1. (a) -5 (b) -^ (c) (d) -10 (e)2


2xe — 4
3. (a) (b) i(e
2 - 1) (c) 3 - e (d)

5. (b)

9. (a) No;f- f = does not imply f(x)


• = on [a, b].

(b) Yes; Definition 7-1 applies. {Note. If [P(x)]


2 = on [a 1, 61], then
P(x) = on [a, 6].)

11. (b) a >

Section 7-2

1. (a) 3 (b) I (c) V35 (d)


^ (e) V39

3. (a)

V3
(b) VT^l (c)
2V30
-^- (d) —
V2
(e) V2(l - In 2)

5. (a) (i) (b) (i) V3/2


(ii) (ii) V15/4
(iii) -4 (iii) *
7. (a) (i) 7r, if m = n 9^ 0; otherwise
00
(iii) 2x if m = n = 0; w if m = ai 5^ 0; if m 9^ n.

(b) Orthogonal, that is, the cosine of the angle between any two distinct vectors
(functions) is zero.

Section 7-3

\/2 \/6 3yTo 2 _


2 ' 2 *' 4
7. /4(x 3 — |x), A any nonzero constant

9. ± — (1, 0, -1, 2) 11. ae x - - {e


2 - \)e~ x a arbitrary
,

7-5 |
ANSWERS TO ODD-NUMBERED EXERCISES 747

Section 7-4

1. (a) (1,1,0), (-1,1,0), (0,0,1)


(b)(i,0,2), ^(-20,51,5), ^(12,5,-3)
(c) (2, -1, 1), ^(4, -11, -19), ^(5, 7, -3)
(d) (1,2,-1), ±(13,2, 17), ff-(-6,5,4)
(e) (1,0,0), (0,1,0), (0,0,1)
3. (a) (1, 0, 0, 1), (-1, 0, 2, 1), K2, 3, 2, -2), f(-l, 2, -1, 1)
(b) K4, 0, 1, -2), ^-(-4, 0, 62, 23), ^(7, 0,-4, 12), (0, 1, 0, 0)

(c) (1, 0, 0, 0), (0,1,0,0), (0,0,1,0), (0,0,0,1)

5 '
w
(a) e*
'
e~* - -=
e2
2
- 1
ex (b) e
1
, e 2x
2(e 2
—++ +—
3(e
e

1)
1)
- ex

(c) 1, 2x - \, ex + 6(e - 3)x + 10 - 4e

, ,
11. (a) at
/ x
=
x—
— • e,
-
>
.

i = .,
„ „
1, 2, 3, . .
. , n
,,.
(b) x y = v^
}^

— (x • e*)(y
(tk
_,
• e*)

13. ±-^(4,-1,2)

15. (b) (1, 1, 1)

17. (a) /><>(*) = 1, Pi(x) = x, P 2 (x) = f (x 2 - *)


/>
3 (*) = f (x 3 - fx), i>4 (x) = fK*4 - fx 2 + &)
(b) (i) 2P 2 (x) - 2Pi(x) + 2Po(x),
(ii) -2P 3 (x) + 6P 2 (x) - 6Pi(x) +P (x),

(iii) 3P 3 (x) + 3P 2 (x) + £P (*)

Section 7-5

l. (a)

V2
(b) 2 (c)
C1
f± (d)
30V29
-^~
3. (a) l (b)

\/43
(c) &Vm5 ,

(d) 3

5. The line spanned by (4, 3) and that spanned by (3, —4)


7. (a) J§V3 (b) 4fyV497 (c) l

Q
y (11 4 19.\
- v 9 > 9> 9 /

11. (a) Projection, 2 sin x; ||d|| = x\/6x/3


(b) Projection, | + £ cos 2x; = ||d||
2
(c) Projection, £ir - 4 cos x + cos 2x; |[d[| = ^/5tt(8tt* + 2295)

13. an = 0;b n = (-l) n+1 --

15. ao = 1,«2 = — 2» fl n = Oforn ^ 0, 2;6„ = 0, all n

17. ao = ir,a n = -|-[(-l) n - l],whenn ^ 0;Z>„ =

io
ly -
fl
» -
<-nv
(n
2
+
-
1),
o ' bn ~ (-lr+w
(„2 +
-
l )7r
o
*

748 ANSWERS TO ODD-NUMBERED EXERCISES I CHAP. 8

21. (a) Projection, *+i=£


2(/i +
3(2/i
1)

+

1 +
2n
^ ^"
+ 3)
2(/i + 2)
]
x + 4(/i + ^
^|+ +t}*
3
1)(« 3)
- i);

1/2
lldll
= [1 + (-1)"]
v
- [1 - (-1)"]
12/i + 1 (n + 1)2(« + 3)2 L ~ '
J
"' (« + 2)2

(b) Projection, 3(sin 1 - cos l)x; ||d|| = W 22 sin 2 - 20

i 2
- = V6
(c) Projection, •
1 + if(x £); ||d||
24
(d) Projection, 1 + 3(x - i); ||d|| =
23. If 9C = V, then SC^ contains only 0; if 9C contains only 0, then XL = V.
27. (a) ay + /3z, where y and z are linearly independent and orthogonal to (1,1,0)
and a, j3 are arbitrary reals; for example:
a(l, -1,0) + 0(0,0,1)
(b) a(l, 1, 0), a an arbitrary real
(c) ay + jSz, where y and z are linearly independent and orthogonal to (1, 2, —1)
and a, are arbitrary reals; for example:

a(l, 0, 1) + 0(-2, 1, 0)

(d) a(l, —1, 0), a an arbitrary real

Section 7-6

1. (a) A: 1.4082; B: 1.4136


(b) 5's data (Variances 9.056 X 10~ 5 and 4.744 X 10~ 5 for A and 5, respectively)

3. (a) A: 16.06, variance 0.00236; B: 16.12, variance 0.00092; fi's results are the more
consistent,
(b) 16.09, variance 0.00254, greater than for either A's or Z?'s data alone
5. (2.25, 2)

7. (a) (-0.8, 0.4, 1.7) (b) (3.4, -1.3, 3.6) (c) (1.9, 0.4, 0.8)

9. (b) wi = W2 = 1, 5, /«3 = 3, /W4 = 1

13. (a) 54* - 35v = (b) 417* - 142y = (c) 26* + lly =
15. y = fxi + %X2 17. y = x\ — x2
19. y = %xi — \x2 — |*3 - (a) y = -£x 2 + ffx
(b) y =
= 9
2
-
2 -
M A
(c) ^ -WfX
37' ff;
"• v = _3y4
23 y 8*
i 15„2
* 1_
' 8 10

Chapter 8

Section 8-1

1. lim {flfc} = 3. {aj does not converge.

5. lim {afc} = 1 7. lim {a k } = 2

9. lim {x fc } = ei + e2 + e3 11. {xfc} does not converge.


. . .

9-3 ANSWERS TO ODD-NUMBERED EXERCISES 749

Section 8-3

3. With respect to the first set


—t
co = e
r
— —— e
;

ck = ;
2
—— ie - e~
T
), k = 1, 2, 3,
(i + k W^r
With respect to the second set

k
ck = (-l)
fc+1
-(/ - O, k = 1,2,3,
(1 + k^W^
Section 8-4

3. f(x) = y+ ^2 iak cos kx + b k sin kx), where

a*: = - / /(*) cos kx dx, k = 0, 1, 2, 3,


tt J -»

bk i/'
= - J fix) sin A:x t/xr, A: = 1, 2, 3,

Section 8-5

3. (b) / 2

13. (b) The vector y = iyi,y2,)>3, . . .) is in S(9C) if and only if for some integer
k > 2, yk+ i = yk+2 = yk +3 = • • = and yi = y 2 + ^3 + + yk • • • -

Chapter 9

Section 9-2

1. (a) /(I") = 1, /(1+) = -1; in (PC[0, 2]


(b) /(0+) does not exist; not in (PC[0, 2]
(c) /(1-) = 0, /(1+) = 1; in (PC[0, 2]
(d) /(1-)= /(1+) = 0; in 6[0, 2] and so in (Pe[0, 2]
+
(e) Ai~) = i, /G +) = /(!") = o»/(I ) = 4; in <wsiP. 2 ]
(f) /(l") does not exist; /(1+) = 2; not in (P6[0, 2]

3. (a) In C[0, 1] and so in (PC[0, 1]


(b) Not in (PC[0, 1] (Infinitely many discontinuities)
(c) Not in (PCfO, 1] (Infinitely many discontinuities)

7. Yes; a linear combination would vanish identically only if all the coefficients were
zero.

Section 9-3

1. (a) Odd (b) Even (c) Neither (d) Even (e) Odd
(f) Odd (g) Neither (h) Even (i) Neither (j) Even
750 ANSWERS TO ODD-NUMBERED EXERCISES | CHAP. 9

2 j
,

2x
7. (a) fE (x) =^^4>
x - 2 l
f (x) =
X^ - 1

1
(b) fE (x) = 1 _ x2
'
•—
Mx) =
1 _ X2
(c) fE (x) = — cos 2x, fo(x) = X COS X

(d) fE (x) = ^4» Mx) =


X + *
-
x2 1 X2
ax + c
(e) fE (x) = -
a2 x 4 + (lac b 2 )x2 + c2
bx
/o(*) = -
Cl 2 X* + (lac - b 2 )x 2 + c2

9. fE (x) = — -f- ^ a n cos «x, / (x) = ^ Z>„ sin mx

Section 9-4
fc+i
(-D kx
1-22] sin
k=X

3.
(-D* (cos Arx — k sin £x)
+ £2

1 1 fl « „+i2 1 n 1 n
f. . .

•^-2 + n=l Sm cos nx +


2.te 2
(-1) -H
n rm
cos-
2
5-sin- sm«x
n 2 ir 2

7. π²/3 + 4 Σ_{k=1}^∞ ((−1)^k/k²) cos kx

9. −2 Σ_{k=1}^∞ (1/k) sin kx

11. (a) On (2π, 5π/2) and (5π/2, 3π) the series converges to −1 and 1, respectively;
    at x = 5π/2 and x = 3π the series converges to zero. On [−2π, −3π/2) the
    series converges to 1, on (−3π/2, −π/2) to −1, and on (−π/2, 0] to 1. At
    x = −3π/2 and x = −π/2 the series converges to zero.
    (b) When x = kπ, the series converges to ±1, according as k is even or odd;
    when x = (2k + 1)π/2, the series converges to zero.

13. (b) When x = kπ, the series converges to π/3; when x = (4k + 1)π/3 the series
    converges to π/6 or π/3, according as k is or is not an integral multiple of 3.
k+l
-1) " 1 , , (-D—
15. (a) cos kx H sin kx
k 2 ir )
k=\ x
17. (a) cos ax = (sin aπ/π)[1/a + 2a Σ_{k=1}^∞ ((−1)^{k+1}/(k² − a²)) cos kx]
J

19. (a) Periodic; fundamental period 8    (b) Periodic; fundamental period 2π
    (c) Not periodic                      (d) Periodic; fundamental period 2π
    (e) Not periodic                      (f) Not periodic
    (g) Periodic, every rational number a period; no fundamental period

sin (2k — l)x 00


sin 2&X
23. (a)
-*X; fc=i
2/k - l 2
fci *

sin fcx
(c) T + 2 JP
25. (a) 1/2 − (1/2) cos 2x           (b) (1/2) sin 2x
    (c) (3/4) sin x − (1/4) sin 3x   (d) sin x + (1/2) sin 2x
    (e) (1/2) cos x + (1/2) cos 5x   (f) 3/8 + (1/2) cos 2x + (1/8) cos 4x
00

27. Let f(x) ~ a_0/2 + Σ_{k=1}^∞ (a_k cos kx + b_k sin kx); then

    (a) f(−x) ~ a_0/2 + Σ_{k=1}^∞ (a_k cos kx − b_k sin kx);

    (b) f_E(x) ~ a_0/2 + Σ_{k=1}^∞ a_k cos kx,    f_O(x) ~ Σ_{k=1}^∞ b_k sin kx.

    (c) 1, cos x, cos 2x, … and sin x, sin 2x, … are bases for E[−π, π] and O[−π, π],
    respectively.

Section 9-5

1. (i) Ef = 1, −π < x < π;            (ii) Ef = x², −π < x < π;

       Of = −1, −π < x < 0,                Of = −x², −π < x < 0,
             1,  0 < x < π                        x²,  0 < x < π

   (iii) Ef = π + x, −π < x < 0,      (iv) Ef = e^{−x}, −π < x < 0,
              π − x,  0 < x < π                e^x,    0 < x < π

         Of = −π − x, −π < x < 0,          Of = −e^{−x}, −π < x < 0,
               π − x,  0 < x < π                 e^x,     0 < x < π

3. (8/π) Σ_{k=1}^∞ (k/(4k² − 1)) sin 2kx

5. (2/π) Σ_{k=1}^∞ (k/(1 + k²))[1 + (−1)^{k+1} e^π] sin kx

7. 2 Σ_{k=1}^∞ (1/k) sin kx

9.4^;
i
i~\f
+l ~ sin (2k — 1 )x
^ (2* - l) 2 ir(2k - 1).

11. a_k = (2/π) ∫_0^π f(x) cos kx dx if k = 1, 3, 5, …;    a_k = 0 if k = 0, 2, 4, 6, …

Section 9-6

3. fix) = \ + 2 Ak cos
kirx
—r- » where

ifk = 1,5,9,
kir k 2 ir 2

ifk = 2,6,10,...,
Ak = < k 2 it 2

ifk = 3,7,11,.
kir k 2 ir 2

ifk = 4,8,12,16,...

5. Y] —— — (cos kx + 4fc sin kx), Q < x < -


7T 7T f— \6k 2 ' 1 2

7-i + s — -
(-D* 1
cos kirx
.
+ ,
— sm
1 - .
kirx , 8 < x < 10
k'lr* kir
A=l L

9. (2/π) Σ_{k=1}^∞ (1/k) sin kπx,    1 < x < 2

k+1
-l) 3ir . (2k - l)irx „
11.
- sm 2 < x < 3
(2k 1)2 (2A: - Dv
,

13. −2 Σ_{k=1}^∞ (1/k) sin kx,    −π < x < π

Section 9-8

1. (b) ‖f(x)g(y)‖ = ‖f(x)‖ ‖g(y)‖

3. If F(−x, y) = F(x, y) and F(x, −y) = F(x, y), then

   F(x, y) = Σ_{p,q=0}^∞ a_{pq} cos px cos qy.

   If F(−x, y) = −F(x, y) and F(x, −y) = −F(x, y), then

   F(x, y) = Σ_{m,n=1}^∞ a_{mn} sin mx sin ny.

5. 2 Σ_{m=1}^∞ (1/m) sin mx

7. 1/2 + 2 cos 2x cos 2y − (1/2) sin 2x sin 2y
2 oo
°o /
/- <
i \»™+l
\»»+l oo /
f 1 \»H
A -jm-\-q-\-l
2x
9. sin mx cos #y
wi,g=l

Chapter 10

Section 10-3
00

1. A_0 = 2 Σ_{k=1}^∞ b_k/k;    A_k = −b_k/k, k ≠ 0;    B_k = a_k/k

5. (a) |f(x_1) − f(x_2)| ≤ M|x_1 − x_2|^α, where

   (i) M > 0, α > 0;   (ii) M ≥ 1, α = 1;   (iii) M ≥ 2, α = 1;
   (iv) M ≥ 2π, α = 1;   (v) M ≥ 3π, α = 1.

   (c) Only constant functions satisfy such a condition.

Section 10-6

3. No; the formula x/2 = Σ_{k=1}^∞ (−1)^{k+1} (sin kx)/k is invalid outside (−π, π).
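A quick numerical illustration (ours, not from the text): outside (−π, π) the series converges to the 2π-periodic extension of x/2, so at such a point its sum differs from x/2.

```python
import math

def partial_sum(x, n):
    # n-term partial sum of sum_{k>=1} (-1)^(k+1) sin(kx)/k
    return sum((-1) ** (k + 1) * math.sin(k * x) / k for k in range(1, n + 1))

x = 4.0                                  # a point outside (-pi, pi)
s = partial_sum(x, 20000)
print(s, x / 2, (x - 2 * math.pi) / 2)   # the sum tracks (x - 2*pi)/2, not x/2
```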

Section 10-7

1. With obvious conventions it is a vector space.

3. (a) (b)

Chapter 11

Section 11-2

1. P_1(x) = x;    P_2(x) = (3/2)x² − 1/2;    P_3(x) = (5/2)x³ − (3/2)x;
   P_4(x) = (35/8)x⁴ − (15/4)x² + 3/8;    P_5(x) = (63/8)x⁵ − (35/4)x³ + (15/8)x
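The polynomials listed above can be checked against the three-term recurrence (k + 1)P_{k+1}(x) = (2k + 1)x P_k(x) − k P_{k−1}(x), with P_0 = 1 and P_1 = x. A sketch with exact rational coefficients (ascending powers of x); the recurrence route is our own shortcut, not the text's Gram-Schmidt construction.

```python
from fractions import Fraction

def legendre(n):
    # returns the coefficient list of P_n in ascending powers of x
    p0, p1 = [Fraction(1)], [Fraction(0), Fraction(1)]
    if n == 0:
        return p0
    for k in range(1, n):
        xp1 = [Fraction(0)] + p1            # multiply P_k by x (shift powers up)
        nxt = [(2 * k + 1) * a for a in xp1]
        for i, a in enumerate(p0):
            nxt[i] -= k * a                 # subtract k * P_{k-1}
        p0, p1 = p1, [a / (k + 1) for a in nxt]
    return p1

print(legendre(4))  # coefficients 3/8, 0, -15/4, 0, 35/8
```

So legendre(4) reproduces P_4(x) = (35/8)x⁴ − (15/4)x² + 3/8 exactly.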
Section 11-3

1. P_3(x) = (5/2)x³ − (3/2)x;    P_4(x) = (35/8)x⁴ − (15/4)x² + 3/8

3. P_6(x) = (231/16)x⁶ − (315/16)x⁴ + (105/16)x² − 5/16

Section 11-4

1. If f(x) is an even function in PC[−1, 1], then

   f(x) = Σ_{k=0}^∞ a_{2k} P_{2k}(x)    (mean),

   where a_{2k} = (4k + 1) ∫_0^1 f(x) P_{2k}(x) dx.

   If f(x) is an odd function in PC[−1, 1], then

   f(x) = Σ_{k=0}^∞ a_{2k+1} P_{2k+1}(x)    (mean),

   where a_{2k+1} = (4k + 3) ∫_0^1 f(x) P_{2k+1}(x) dx.

* Irl ^Vf n*+! (2 * - 2)!(4fc + 1)


p ^
_ _i < * < 1

5. (a) (3/5)P_1(x) + (2/5)P_3(x)


(b) 2PoW - j&Pi(x) + + ^3W ^sW
(c) ffPoW - Pi(jt) + ^Pi(x) + §fP 4 (x)

Section 11-5

1. n
1/2
g(x)Pn (x) dx (2 \
1/2
/2 -I- A 1/2
f
1

Section 11-6

1. All the functions belong to ^, except those in (a) and (g).

7. (b) H_6(x) = x⁶ − 15x⁴ + 45x² − 15,
       H_7(x) = x⁷ − 21x⁵ + 105x³ − 105x
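These are the Hermite polynomials normalized to leading coefficient 1 (the "probabilists'" convention), which satisfy the recurrence H_{n+1}(x) = x H_n(x) − n H_{n−1}(x) with H_0 = 1 and H_1 = x. A small evaluation sketch (ours):

```python
def hermite(n, x):
    # H_{k+1}(x) = x*H_k(x) - k*H_{k-1}(x), with H_0 = 1 and H_1 = x
    h0, h1 = 1.0, x
    if n == 0:
        return h0
    for k in range(1, n):
        h0, h1 = h1, x * h1 - k * h0
    return h1

x = 2.0
print(hermite(6, x), x**6 - 15 * x**4 + 45 * x**2 - 15)  # both give -11.0
```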

Section 11-7
( 2) 2
9. (b) Lo = 1, L? = x - 8x + 12,

4 2)
= x - 3, L3
2)
= x 3
- 15x
2
+ 60* - 60

Chapter 12

Section 12-1

1. y = cos x + c sin x        3. No solutions

5. y = c cos 3x               7. No solutions

9. λ_n = n², n = 0, 1, 2, …;    y_n = c_1 cos nx + c_2 sin nx

Section 12-2

1. T) and the trivial subspace

Section 12-3

1. Using the standard basis e_1, e_2:

   (a) λ = 1, with eigenvectors of the form x_1 e_1;
       λ = 2, with eigenvectors of the form x_1(e_1 + e_2)
   (b) λ = 0, with eigenvectors of the form x_2 e_2;
       λ = 1, with eigenvectors of the form x_1 e_1
   (c) λ = 0, with eigenvectors of the form x_1(e_1 − e_2);
       λ = 2, with eigenvectors of the form x_1(e_1 + e_2)
   (d) λ = 3, with eigenvectors of the form x_2(2e_1 + e_2);
       λ = −1, with eigenvectors of the form x_2(−2e_1 + e_2)

3. Using the standard basis e_1, e_2, e_3:

   (a) λ = 0, with eigenvectors of the form x_1(e_1 − e_3);
       λ = 1, with eigenvectors of the form x_2 e_2;
       λ = 2, with eigenvectors of the form x_1(e_1 + e_3)
   (b) λ = 2, with eigenvectors of the form x_2 e_2;
       λ = √2, with eigenvectors of the form x_1(e_1 + [√2 − 1]e_3);
       λ = −√2, with eigenvectors of the form x_1(e_1 − [1 + √2]e_3)
   (c) λ = 1, with eigenvectors of the form x_2(−3e_1 + e_2 − 3e_3);
       λ = 2, with eigenvectors of the form x_2(2e_1 + e_2) + x_3(2e_1 + e_3)
   (d) λ = 0, with eigenvectors of the form x_3(3e_1 − 2e_2 + e_3);
       λ = (5 + √5)/2, with eigenvectors of the form x_2(e_1 + e_2 + ((−1 + √5)/2)e_3);
       λ = (5 − √5)/2, with eigenvectors of the form x_2(e_1 + e_2 − ((1 + √5)/2)e_3)
5. (a) x = -|ei + + |e3 e2 (b) x = ^ei + |e 3
(c) x = -fei - 2e 2 + fe 3 -x

9. (b) λ = 0 is an eigenvalue of multiplicity n + 1.

    S_0 is the subspace of constant functions; dim S_0 = 1.

Section 12-4

1. (a) λ_1 = (1/2)[c + a + √((c − a)² + 4b²)],
       λ_2 = (1/2)[c + a − √((c − a)² + 4b²)]

   (b) [c − a + √((c − a)² + 4b²)]e_1 − 2b e_2,
       −2b e_1 + [c − a − √((c − a)² + 4b²)]e_2

   [Note. If b = 0, one of these vectors is the zero vector; in this case the required
   basis is {e_1, e_2}.]

3. λ = n², n = 0, 1, 2, …

5. (b) The null space is the set of all functions F(k) of the form

       F(k) = c_1 k + c_2,    k = 0, ±1, ±2, …,

   where c_1 and c_2 are arbitrary real constants.
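The closed form in 1(a) is what one obtains for the symmetric matrix [[a, b], [b, c]] (our reading of the operator; the sample entries below are arbitrary), and it is easy to confirm with a numerical eigensolver:

```python
import numpy as np

a, b, c = 1.0, 2.0, 4.0                      # arbitrary sample entries
disc = np.sqrt((c - a) ** 2 + 4 * b ** 2)
lam1 = 0.5 * (c + a + disc)                  # closed form from 1(a)
lam2 = 0.5 * (c + a - disc)

computed = np.linalg.eigvalsh(np.array([[a, b], [b, c]]))  # ascending order
print(computed, (lam2, lam1))
```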

Section 12-5

3. λ = n², n = 0, 1, 2, 3, …, with eigenfunctions of the form

   y = c cos nx    (c ≠ 0);

   λ = ((2n + 1)/2)², n = 0, 1, 2, 3, …, with eigenfunctions of the form

   y = c sin((2n + 1)x/2)    (c ≠ 0)

Section 12-7

1. y = (128/π) Σ_{n=0}^∞ (1/(2n + 1)⁵) sin((2n + 1)x/2)

3. No solution; λ_0 = 0, c ≠ 0
fc+1 fc+i
(-l) 4 1
— 1 + (-D
sin {2k )x + sin 2&x
5. y
_ _(2* - 1)% (2/t - 1)3
1
8A:3

7. If ∫_0^{2π} h(x) dx ≠ 0, there is no solution; if ∫_0^{2π} h(x) dx = 0, then

   y = c + Σ_{n=1}^∞ (a_n cos nx + b_n sin nx),

   where

   a_n = (1/π) ∫_0^{2π} h(x) cos nx dx,    b_n = (1/π) ∫_0^{2π} h(x) sin nx dx,

   and c is arbitrary.

Section 12-8

1. λ_n = n² − 1, φ_n(x) = sin nx, n = 1, 2, 3, …;  orthogonal in C[0, π]

3. λ_n = −n²π², φ_n(x) = e^{−x} sin nπx, n = 1, 2, 3, …;  orthogonal in C[0, 1] with
   respect to the weight function e^{2x}

5. λ_n = n²π²/4,  φ_n(x) = e^{x²/2} sin(nπx/2), n = 2, 4, 6, …,
                  φ_n(x) = e^{x²/2} cos(nπx/2), n = 1, 3, 5, …;
   orthogonal in C[−1, 1] with respect to the weight function e^{−x²}

7. λ_n = −n², φ_n(x) = e^{−x}(n cos nx + sin nx), n = 1, 2, 3, …, and λ = 1, φ(x) = 1;
   orthogonal in C[0, π] with respect to the weight function e^{2x}

9. Here the set S of eigenvalues is not a sequence but a region of the complex plane,
   comprising those points λ = ξ + iη such that −1 < ξ and |η| < 2(ξ + 1)^{1/2},
   with the addition of one boundary point, λ = −1. To obtain the eigenfunction
   φ(λ; x) corresponding to a given λ ≠ 0 of S, set λ = ρe^{iθ} (where ρ = [ξ² + η²]^{1/2},
   ξ = ρ cos θ, η = ρ sin θ, 0 ≤ θ < 2π) and let α + iβ = ρ^{1/2} e^{iθ/2} = λ^{1/2}; then on 0 < x,

   φ(λ; x) = x^{1+α}[cos(β ln x) + i sin(β ln x)] − x^{1−α}[cos(β ln x) − i sin(β ln x)];

   also, φ(0; x) = x ln x, 0 < x.
Section 12-9

1. (a) One-to-one (b) Not one-to-one (c) One-to-one


(d) One-to-one (e) Not one-to-one

7. K(x, t) = (1/3)(x + 1)(t − 2),  x < t,
             (1/3)(x − 2)(t + 1),  x > t
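Read this way, K behaves as the Green's function of y″ = h(x) with y(−1) = y(2) = 0 (the endpoints are our inference from the factors x + 1 and x − 2). A sketch that checks it by quadrature for h ≡ 1, where the solution should be y(x) = (x + 1)(x − 2)/2:

```python
def K(x, t):
    # kernel from Exercise 7, as read above
    return (x + 1) * (t - 2) / 3 if x < t else (x - 2) * (t + 1) / 3

def y(x, n=4000):
    # trapezoidal approximation of y(x) = ∫_{-1}^{2} K(x, t) * 1 dt
    h = 3.0 / n
    ts = [-1.0 + i * h for i in range(n + 1)]
    vals = [K(x, t) for t in ts]
    return h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))

for x in (-0.5, 0.0, 1.0):
    print(y(x), (x + 1) * (x - 2) / 2)
```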

Section 12-10
3. (a) K(x, t) = (1/k)(cot k sin kt − cos kt) sin kx,  x < t,
                 (1/k)(cot k sin kx − cos kx) sin kt,  x > t

   [Note. The constant k may be chosen as any real number except that k must
   not be an integral multiple of π, since then L = D² + k² is not one-to-one
   when restricted to S.]

   (b) y(x) = (1/k){(cot k sin kx − cos kx) ∫_0^x h(t) sin kt dt
              + (sin kx) ∫_x^1 h(t)(cot k sin kt − cos kt) dt}

Chapter 13

Section 13-2

1. (a) a(x, y, z)D_x + b(x, y, z)D_y + c(x, y, z)D_z + d(x, y, z)

   (b) a(x, y)D_xx + b(x, y)D_xy + c(x, y)D_yy + d(x, y)D_x + e(x, y)D_y + f(x, y)
Section 13-4

.
1.
.

u(x, t)
.

= cos—
«7raf
— sin—=- . n7rx
— »


^-Z> + (-D^'lcos — sm —
at 2 J^ i _ , i mrat . rmx
3. u(x,,) -
n=l

^
5. «(*,/) = 2 Ucos— + —
n=l N
&/v^
L
/ J

1

.
mrat

nir
,

wraf
ft. sin

.
.

nirx
^
/ittaA
/JTTflM
sin
. AI7TX

7. "(*,0 = ^i;-^n
v 2Z-j n T2 cos--sin
L2 L
n=l
. rmx
sin-yr-
n=l
When t = L/(2a) or t = 3L/(2a), the string is in the position of equilibrium
(y = 0); when t = L/a, it is in the position which is the reflection in the x-axis of
its initial position; when t = 2L/a, its position is identical with its initial position.

9. (a) u(x,y) =
2L v^ ~
-^ £ rm-~ n=l
1

2
sin
.

sin
.


rmb .

sin —
mrat
sin
.

rmi

(b) lim u(x,y) - -Ij-H " n T ,m T an


The vibration represented in (b) may be interpreted as that approximated by a
string when it is given a sharp blow at the point x = L/2.

11. (a) φ(x, t) = Σ_{n=1}^∞ (C_n cos(nπat/L) + D_n sin(nπat/L)) sin(nπx/L),

    where

    C_n = (2/L) ∫_0^L f(x) sin(nπx/L) dx,    D_n = (2/(nπa)) ∫_0^L g(x) sin(nπx/L) dx

    (b) φ(x, t) = (1/2)(A_0 + B_0 t) + Σ_{n=1}^∞ (A_n cos(nπat/L) + B_n sin(nπat/L)) cos(nπx/L),

    where

    A_n = (2/L) ∫_0^L f(x) cos(nπx/L) dx,    n = 0, 1, 2, …

    B_0 = (2/L) ∫_0^L g(x) dx,    B_n = (2/(nπa)) ∫_0^L g(x) cos(nπx/L) dx,    n = 1, 2, …

    (c) φ(x, t) = Σ_{n=1}^∞ [C_{2n−1} cos((2n − 1)πat/(2L)) + D_{2n−1} sin((2n − 1)πat/(2L))]
                  × sin((2n − 1)πx/(2L)),

    where

    C_{2n−1} = (2/L) ∫_0^L f(x) sin((2n − 1)πx/(2L)) dx,

    D_{2n−1} = (4/((2n − 1)πa)) ∫_0^L g(x) sin((2n − 1)πx/(2L)) dx
Section 13-6

, x
4L^-% 1
*
-u2m+i)2**tiL*a*] .
(2m + 1)ttx
1. m(x, = -jL
^2^
-i=U
(2m + 1)2
' J —


3. «Cx.O = 4 t TT^TIk
8
+
ir 3
e«»*W>
*" (2m'
l)3
sin
(2m
+
L
1)rx

5"(*.<> = -|l-2X;z^rT
=
e cos—
1

T *\
/-
7. u(x, t) = A_0/2 + Σ_{n=1}^∞ A_n e^{−(n²π²t)/(L²a²)} cos(nπx/L),

   where

   A_n = (2/L) ∫_0^L f(x) cos(nπx/L) dx,    n = 0, 1, 2, 3, …
-(n 2 * 2 t/L 2 a 2 ) nirx
sm
\ L) ir
*— n(4n 2
' - De ~r
X 2 (—1)" -(n 2 X
11
11.
/ \
(a) u(x, t)
/ ,n
= - + ,
- V"*
> e
*- 2 </a 2 L 2 ) •

sin -—
«7T*
(b) w(x, t) = -
L, IT •*~^
n—L
n L L

13. w(x,
lOOx
= — \-
2
- > ^ & + (-l)"
+1
(&i - 100)
e
_ (n 2,2 (/L 2 a2)
sin
. mrx

L 7r
-'— j
' n L

Section 13-7
2 \ oo ^-[(2fe + l) 2 T 2 t)/(a 2 Af 2
3. „(*, ,, ,) - ^-j- e
f o „ o

cos
T) g ^ + )y
)

sm ^—
00

5. (a) u(x, y, z) = Σ_{m,n=1}^∞ a_{mn} (tanh(π√(m² + n²)) cosh(√(m² + n²) z)
                      − sinh(√(m² + n²) z)) sin mx sin ny,

   where

   a_{mn} = 4/(π² tanh(π√(m² + n²))) ∫_0^π ∫_0^π f(x, y) sin mx sin ny dx dy
   (b) Let u_i(x, y, z), v_i(x, y, z), i = 1, 2, 3, be solutions corresponding respectively to
   the boundary value 0 on all but a single face, on which they have the values
   indicated:

   u_1(x, y, 0) = f_1(x, y),  u_2(0, y, z) = f_2(y, z),  u_3(x, 0, z) = f_3(x, z);
   v_1(x, y, π) = g_1(x, y),  v_2(π, y, z) = g_2(y, z),  v_3(x, π, z) = g_3(x, z).

   Then u_1 is the u of part (a), with f(x, y) replaced by f_1(x, y), and

   v_1 = Σ_{m,n=1}^∞ F_{mn} sinh(√(m² + n²) z) sin mx sin ny,

   where

   F_{mn} = 4/(π² sinh(π√(m² + n²))) ∫_0^π ∫_0^π g_1(x, y) sin mx sin ny dx dy.

   The formulas for u_2, u_3 are obtained from that for u_1 by permuting cyclically
   the subscripts 1, 2, 3 and the variables x, y, z. Similarly the formulas for v_2, v_3
   are obtained from that for v_1. The solution of the given boundary-value prob-
   lem is u = u_1 + u_2 + u_3 + v_1 + v_2 + v_3.


7. u(x, y, t) = Σ_{m,n=1}^∞ (E_{mn} cos(a√(m² + n²) t) + F_{mn} sin(a√(m² + n²) t)) sin mx sin ny,

   where

   E_{mn} = (4/π²) ∫_0^π ∫_0^π f(x, y) sin mx sin ny dx dy,

   F_{mn} = 4/(π²a√(m² + n²)) ∫_0^π ∫_0^π g(x, y) sin mx sin ny dx dy

Section 13-9

1. The formal solution is

   u(x, t) = Σ_{n=1}^∞ B_n e^{−(n²π²t)/(L²a²)} sin(nπx/L),

   where B_n = (2/L) ∫_0^L f(x) sin(nπx/L) dx.

Chapter 14

Section 14-2

1. u(x, y) = Σ_{m=0}^∞ c_m [sinh((2m + 1)π(M − y)/L) + sinh((2m + 1)πy/L)] sin((2m + 1)πx/L),

   where

   c_m = 16/((2m + 1)³π³ sinh((2m + 1)πM/L))
nir(M — v) rnry
= Zc„ ,
. , . .

3. w(x,}>) sinh h sinh-— sin


n=l
where
, n
(-1)
n
- COSy
/nr

nSu-s sinh (mcM/L) I 2

= {sinh (ttx/M) + sinh [tt(L - x)/M]} sin Ory/Af)


5. u(x,y)
sinh (ir£/M)
'

(2M + 1)tt(M - y) u (2m + Dwy


+ EC
m=0
»
.

sinh
, .

h sinh
.

X sin
(2m + 1)ttx

where
sr
(2m + 1)3^3 sinh [(2m + l}irM/L]

7. u(x, y) = (a_0/2)((π − y)/π) + Σ_{n=1}^∞ a_n (sinh n(π − y)/sinh nπ) cos nx,

   where

   a_n = (2/π) ∫_0^π f(x) cos nx dx

9. u(x, = >
1
cosh
,
(2m += l)iry
smh
.
,
(2m +
=
1)^
y)

X sin
(2m + 1)tx

Section 14-3

5. It is sufficient that f(θ) belong to PC[−π, π]. On each subinterval of continuity
   apply Theorem 1-15 to the functions f(θ) cos n(s − θ), Theorem 1-24 to the series

   Σ_n r^n f(θ) cos n(s − θ),

   Theorem 1-22 to the sequence of partial sums, and combine the results for the
   subintervals.

Section 14-4
00

7. u(r, φ) = Σ_{n=0}^∞ A_n r^{−(n+1)} P_n(cos φ),

   where A_n = ((2n + 1)/2) ∫_0^π f(φ) P_n(cos φ) sin φ dφ

9. (a) u(r, φ) = 1/3 + r²(cos²φ − 1/3)
   (b) u(r, φ) = −1/3 − (2/3)r² + 2r² cos²φ
   (c) u(r, φ) = 1/3 + (2/3)r² − 2r² cos²φ
   (d) u(r, φ) = 8/15 + (8/21)r²(1 − 3 cos²φ) + (1/35)r⁴(3 − 30 cos²φ + 35 cos⁴φ)

11. u(r, φ) = Σ_{k=0}^∞ B_k r^{2k+1} P_{2k+1}(cos φ),

    where B_k = (4k + 3) ∫_0^{π/2} f(φ) P_{2k+1}(cos φ) sin φ dφ

Section 14-5
1. (a) R = c_1 r^{(−1+√(1+4λ))/2} + c_2 r^{(−1−√(1+4λ))/2}    if λ > −1/4;
       R = r^{−1/2}(c_1 + c_2 ln r)    if λ = −1/4;
       R = r^{−1/2}(c_1 cos([(1/2)√(−(1 + 4λ))] ln r) + c_2 sin([(1/2)√(−(1 + 4λ))] ln r))
           if λ < −1/4

3. u_00 = 1    u_10 = cos φ    u_11 = cos θ sin φ

   v_11 = sin θ sin φ    u_20 = (3/2) cos²φ − 1/2    u_21 = 3 cos θ sin φ cos φ

   u_22 = 3 cos 2θ sin²φ    v_21 = 3 sin θ sin φ cos φ    v_22 = 3 sin 2θ sin²φ

Section 14-6

2 „ .1 .1 «22 1 °°
4A: + 1
1
3. COS
2
(p COS ~ 1
- «00 + -Z "20 +^ - «2fc,2
O 3 24
4f^2 k(k+ 1)(4*2 1)

5. (a) fbp, 0) ~
1
- woo ^
+ 2^ 22*+i(4A:2
72k+iuk2 _
(-D*(4A: +
ix(fc +
nrrt-u
1)
1)!]2
fc=l

{lk 2)! 2
X t M 2fc ,o - (2k - 2)\(2k + k + 2)u 2k 2,

(b) w(r, <p,d)


1
= - woo
^
+ 2^ 2 2*+i(4A:2 _
(-1) (£k
K
+
i)[(fc
l)r

+
2k

l)i]2

(2k 2)! 2
X t u 2k ,
- (2k - 2)l(2k + k + 2) M2fc ,2

»+ 1 r - (w+1)
7. (a) ii(r, <p, 0) = ]T
i=0
[r
ro
- a
2
] ^ m0"mO + £ v4 mn « mn + Bmnv n

where
,-TO +l
Slmn — Simnx!*, u s/ £,2m+l _ ^mn(g),
a 2"»+l
i

,m+l
"mn = Bmn (a, b; 5 '
_ ^mn(g),
£2»»+l a 2m+l
with
2m + 1 (m — n)l
A mn (g)
2tt (m + n)!
/•2t
rZ-K /i
rT
n+
X / / *fo 0) cos /i0 sin <p/C (cos (p) dip dd,
./o ^0
2m + 1 (m - n)!
Bm n(g) =
27r (m -f- «)•
*2t /•*

n+
X g((f, 6) sin nd sin <pP„T (cos <p) d<p dd
o Jo

(b) u(r,<p,6) = a
2_j \2 rno(r)UmO + 22 [«mn(r)««» + jS mn (r)i?mn ]}
'
m=0 * n=l
where
2m
a^nO) = 4n(i>, # J
/)[r" — /, +V-(m+1) + ^ m ] „(fl, b\ g)
m - 2m+1 {m+1)
X [r a r- ],

Pmn (r) = Bmn (b a;JJrm - b 2m +\~^+^


f +5 TO „(a, 6; *)
» _ 2«+i -(«+i)
x [r a r L
with notation as in part (a)

Section 14-7

3. Qo - 202
5. r³u_30 = z³ − (3/2)x²z − (3/2)y²z
   r³u_31 = −(3/2)x³ − (3/2)xy² + 6xz²        r³v_31 = −(3/2)x²y − (3/2)y³ + 6yz²
   r³u_32 = 15(x²z − y²z)                     r³v_32 = 30xyz
   r³u_33 = 15(x³ − 3xy²)                     r³v_33 = 15(3x²y − y³)

7. c_{n−2j,2j} = (−1)^j (n!/((2j)!(n − 2j)!)) c_{n,0},

   c_{n−2j−1,2j+1} = (−1)^j ((n − 1)!/((2j + 1)!(n − 2j − 1)!)) c_{n−1,1}

9. (a) x⁶ − 15x⁴y² + 15x²y⁴ − y⁶

   (b) x⁵y − (10/3)x³y³ + xy⁵
   (c) x⁶ + 6x⁵y − 15x⁴y² − 20x³y³ + 15x²y⁴ + 6xy⁵ − y⁶

11. x - 6x z + z
4 2 2 4
x 3 y - 3xyz 2
x 2y 2 — x 2 z 2 — y,2,2
2 2
z 1,4
j_ ^z-
+ ^3 _ 3xyz 2
^
4 _ 6>,2 Z
2
+ 24 X 3 Z7 _
2 VZ — ivr 3 xy 2z zr — j^xz'
3
y z — yz°
13. (a) r²u_20 = −(1/2)(x² − z²) − (1/2)(y² − z²)    (b) x² − z² = −r²u_20 + (1/6)r²u_22
        r²u_21 = 3xz                                     xy = (1/6)r²v_22
        r²u_22 = 3(x² − z²) − 3(y² − z²)                 y² − z² = −r²u_20 − (1/6)r²u_22
        r²v_21 = 3yz                                     xz = (1/3)r²u_21
        r²v_22 = 6xy                                     yz = (1/3)r²v_21

17. dim ℋ_{m,k} = C(m + k − 2, k − 2) + C(m + k − 3, k − 2)

Chapter 15

Section 15-1

3. Set x = Xr

Section 15-2

1. x = ±1 regular singular points; x = 0 an irregular singular point

3. x = ±1 regular singular points
5. x = 1 a regular singular point; x = 0 an irregular singular point
7. x = 0 and x = 1 regular singular points
9. x = 0 a regular singular point

Section 15-3

3. (a) ν² − 1 = 0        (b) ν² − 3ν − 1 = 0
   (c) ν² − 3ν + 1 = 0   (d) ν² − 2ν − λ² = 0
   (e) ν² = 0

5. ν = 0, in each case.
(-0*
7. yx = |x|
1/3
(l + £*), J2
1-1/3
E
fc=0
A:!3*(3A: - 5)(3/t - 2)

9.^ = ixi
i/2
^ 2 = ixi
i/4
2:
fc=0
fc!2
2fc
(4A: - 1)

11. y = a_0 [1 + Σ_{k=1}^∞ (−1)^k (λ(λ − 1)···(λ − k + 1)/(k!)²) x^k]

    If λ = 0, y = a_0; if λ is a positive integer, then

    y = a_0 Σ_{k=0}^λ (−1)^k (λ!/((λ − k)!(k!)²)) x^k.

Section 15-4
fc-i
-
E
00
2y 15
= \x\
1/2
^2a kx k
where ao = 1, a* = 4*-'- 1 «i>
1. yi
fc=0
,
/c(2A: + 1)
j^Q "

* ^ 1; < |*| < 4;

V2
00

^2a k xk , where ao = 1, ak =
J

_
^
z2~^T
/ — 8
a i> k ^ 1; \x\ < 4
k(2k l^
fc=0
00
k— i
(_1) "
3. yi = |x|
5/4
^
*=o
a kx
k
, where ao = 1, ak
4*+*A:(4A:
2(-l) 4 (4y +
+ 5) ^=0
y y
7) fl y,

A: ^ 1; < |x| < 4;


fc-1
(-D
v2 = X) ^^ wherefl o = 1, ak
u* =
-
2 2fc+iA:(4A: - 5)
i=0
fc=0
k ^ 1;
,

\x\
,

< 4
fc— i
2
5 ^l = * y^ akx
k
where ao = 1> ak = — X>.*s ;w< i 1 ;
-

k=0
J ,
3t(t + 1)
i-o
)t-i

V2 = \x\
2/3
J]
fc=0
flfcX*, where a = 1, a* = -
3^(3^ - 5) ^^
i=0
7 " 5 ** y '

A: ^ 1; < \x\ < 1


A; fc—

7. 71 = x ^2 a ^ fc

» where ao = 1, ak = .... ,

3)
/] (— l)'(4y + 3)ay,
fc=0
*^1;M<1;
V2 = Ixl
11 *
f^a k x\ where ao = 1, ak = Jtlil^
- (-!)>* A: ^ 1;
A:(4A: 3)
fc=0 ;=o
< |jc| < 1

9. yi = x + 2i ' \X < 00
+
\

• 10 13
^Jfc![7- • - •
(3A: 4) J

00
1—1/3
V2 = \X » < (jc| < °o
^A:![(-l)-2-5---(3A: - 4)]
fc

11. vi = ao = 1 give a\ = — § — f/, a2 = to + ^'> «3 = —two ~ T§o'J


i,

j>2 = — «o = 1 give a\ = — § + 3V, «2 = iV


/,
— ^5 a 3 = —jWo + T30 ;> 7 -

This yields y = ciyi + c 2V 2, where

vi = (1 - fx + ^x - tVo* +
2 3 •) cos (In [*|) • •

3
+ (£* - ^ro*
2
+ nhi* + • • •) sin ( ln W)>
v2 = (-£* + ^x - 2
y^o*
3
+ cos (ln \x\)
• • •)
3
+ d-f* + TO"* 2 " Tiro* + * * •> sin < ln l*D-

This solution is valid for < \x\ < °°

r ~ -4 + V2i
13. vi = V2/, ao = 1 giveai = 0, a2 = ^ ' «3 = 0;

4 -
= - r^V2
/- ~ /
= —V 2 = = =

a 1 give a\ 0, a2 ' «3 U.
j/
2 /,
.


This yields y = c\y>\ + C2V2, where

y/2 2
y\ = (1 - ix
2
+ • • •) cos (\/21n|x|) + (- ^x*
12
+ •J
sin(\/21n |*|),

2 2
- +
(^y^ +
/
y2 = •••)cos(v 21n|x|)+ (1 ±x • • •) sin (a/2 In |x|).

This solution is valid for < \x\ < 1.

15. vi = a = 1 give a\ = \(—2 - /), a 2 = ^(3 — /), as = q^o(67 +


i, 81/')»

2 = -1, «o = 1 give ai = |(-2 + /), «2 = ^(3 + /), «3 = 9^o(67


j/
- 81/).

This yields y = ciyi + C2V2 where

yi = (1 - fx + ^x + sffo* +
2 3
-)cos (In \x\) • •

2
+ (£* + fo* ~ imo^ 3 + •

'
sin 11
W).
v2 = (-^x - ^x + t^oX +
2 3 • • •) cos (In |x|)

+ - (i
fx + im*
2
+ sfk* 3 + • • sin d11 W)-
This solution is valid for < |jc| < 00

17. (a) v = gives >> = ^


k=0
a *(* ~ ^ >

where

a* =
^^E y=o
(-l)'" 2 ^+ *> - #** k = h
(JVote. It can be shown that, with a = 1, the formulas above imply that

[X(X + 1) - (A: - 1)A][X(X + 1) - (A: - 2)(* - 1)] • • • [X(X + 1) - • 1]


-i + S 2*(A!) 2

X (x - 1)"

so that if X is a non-negative integer n, then y is a polynomial of degree n.)


(b) -1 < x < 3. (IfX is a non-negative integer, then the solution is valid for all x.)

Section 15-5
(— 1) fc+i
1. v.i = x, y2 = 2^-jTTZ x + x In x ,

^-^[i + Ec-d*^*
fc+i Ly-iVy^* -i
= 2*- X)(-l) lnx
1
y2
_|_ x
(A!) 2

yi = 4!
A + 1 jfc+4
"
y2 = 1 + §x + 1
|*
2
5.
J] (A: + 4)! ,

3k+ 2
7. yi = x2 + £
fc=i
(-1)*
5 •
8 • • •
1

(3A + 2)
x , y2 -ti-^m 3/fc

2 S
9. yi = 1 -3x + ix - ix ,

4)! (A- - * 2 3
+ Hx - fiE^OkoT ^ +
3
y2 = Ix - \*x 2 1
(1 - 3x + fx - |x )ln*

11. If 2/7 is not an integer, Case I; if p = 0, Case II; if 2p is an integer other than zero,
Case III.

Section 15-6

3. If p = 0, J_p(x) → 1 as x → 0; if p > 0, J_p(x) → 0 as x → 0; if p < 0 and p is
   not an integer, then J_p(x) → ±∞ as x → 0.


2 2 2
_ + 2*
o
"
W* =
(u\ ^i/4(^ /2)
2xZ1/4 (x2/2)
Z'i /4 (x /2)

[Note. Every y of this form is a solution of y′ = x² + y² on every interval
on which x ≠ 0 and Z_{1/4}(x²/2) is defined and nonvanishing. Any given
solution y may be so represented by choosing the pair of arbitrary constants
in Z_{1/4}(x²/2) = x^{−1/2}u(x) in infinitely many different ways, corresponding to
the solutions u of the differential equation y(x) = −u′(x)/u(x).]

Section 15-7

„ r ,. (\2 192\ r/x A 72 , 384\ „, ,

19. (b) y = ℒ^{−1}[(1 + s²)^{−3/2}]

Section 15-9

3. If μ = p in Case III, the functions J_p(μ_k x), k = 1, 2, 3, …, do not form a basis
   for PC[0, 1].

Section 15-10

5. If p > 0,

   x^p = 2 Σ_{k=1}^∞ (μ_k J_{p+1}(μ_k)/((μ_k² − p²)[J_p(μ_k)]²)) J_p(μ_k x),    0 < x < 1,

   where μ_k is the kth positive zero of J_p′(x). If p = 0, then x^p = 1 and the Dini series
   contains a single nonvanishing term, 2c_0 = 1.
Section 15-11
00

1. u = Σ_{k=1}^∞ [A_k sinh λ_k(a − z) + B_k sinh λ_k z] J_0(λ_k r),

   where λ_k is the kth positive zero of J_0(λ), and

   A_k = 2/([J_1(λ_k)]² sinh λ_k a) ∫_0^1 f(r) J_0(λ_k r) r dr,

   B_k = 2/([J_1(λ_k)]² sinh λ_k a) ∫_0^1 g(r) J_0(λ_k r) r dr
00

3. « = 5^ ^4* sinh \ k (a — z) Jo(\ k r),


fc=i

where X* is the kth positive zero of /o(X) = and


200
=
Xfc/i(Xfc) sinh Xfca
/


5. u = c_1 ln r + c_2;  u = c_1 θ + c_2;  u = c_1 z + c_2. In the third situation u is a
   possible temperature distribution in the cylinder; in the first two situations this is
   true only in the special cases in which c_1 = 0.
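The λ_k above are the positive zeros of J_0. A self-contained sketch (ours) evaluates J_0 from its power series and brackets the first zero by bisection; the value 2.404825… is the standard tabulated one.

```python
def j0(x):
    # J_0(x) = sum_{m>=0} (-1)^m (x/2)^(2m) / (m!)^2
    term, total = 1.0, 1.0
    for m in range(1, 40):
        term *= -(x / 2) ** 2 / m ** 2
        total += term
    return total

def first_zero(lo=2.0, hi=3.0, steps=60):
    # bisection: j0(2) > 0 > j0(3), so the first positive root is bracketed
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if j0(lo) * j0(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

print(first_zero())  # ≈ 2.4048255577
```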

Section 15-12

1. x = XQ
n(l)/o(l + - J'o(X)Yo(\ + tj

/o(i)na) -Mi)Yo(i)
00 00

3. u(r, θ, z) = Σ_{n=0}^∞ Σ_{k=1}^∞ [A_{nk} cos nθ + B_{nk} sin nθ] J_n(λ_{nk} r) sinh λ_{nk}(a − z),

   where λ_{nk} is the kth positive zero of J_n(λ), and

   A_{0k} = 1/(π[J_1(λ_{0k})]² sinh λ_{0k} a) ∫_{−π}^{π} ∫_0^1 f(r, θ) J_0(λ_{0k} r) r dr dθ,

   A_{nk} = 2/(π[J_{n+1}(λ_{nk})]² sinh λ_{nk} a) ∫_{−π}^{π} ∫_0^1 f(r, θ) J_n(λ_{nk} r) r cos nθ dr dθ,  n > 0,

   B_{nk} = 2/(π[J_{n+1}(λ_{nk})]² sinh λ_{nk} a) ∫_{−π}^{π} ∫_0^1 f(r, θ) J_n(λ_{nk} r) r sin nθ dr dθ
00 00

5. u(r, θ, t) = Σ_{n=1}^∞ Σ_{k=1}^∞ A_{2n,k} J_{2n}(λ_{2n,k} r) cos(λ_{2n,k} at) sin 2nθ,

   where λ_{2n,k} is the kth positive zero of J_{2n}(λ) and

   A_{2n,k} = 8/(π[J_{2n+1}(λ_{2n,k})]²) ∫_0^{π/2} ∫_0^1 f(r, θ) J_{2n}(λ_{2n,k} r) r sin 2nθ dr dθ,

   with u(r, θ, 0) = f(r, θ)


00 00

7. u(r, θ, t) = Σ_{n=1}^∞ Σ_{k=1}^∞ [A_{nk} cos(λ_{nπ/α,k} at) + B_{nk} sin(λ_{nπ/α,k} at)]
               × J_{nπ/α}(λ_{nπ/α,k} r) sin(nπθ/α),

   where λ_{nπ/α,k} is the kth positive zero of J_{nπ/α}(λ) and

   A_{nk} = 4/(α[J_{(α+nπ)/α}(λ_{nπ/α,k})]²) ∫_0^α ∫_0^1 f(r, θ) J_{nπ/α}(λ_{nπ/α,k} r) r sin(nπθ/α) dr dθ,

   B_{nk} = 4/(αa λ_{nπ/α,k}[J_{(α+nπ)/α}(λ_{nπ/α,k})]²) ∫_0^α ∫_0^1 g(r, θ) J_{nπ/α}(λ_{nπ/α,k} r) r sin(nπθ/α) dr dθ,

   with u(r, θ, 0) = f(r, θ) and u_t(r, θ, 0) = g(r, θ)
index

Laplace transform of, 205, 611, 612


modified, 616, 617, 630
Abel's formula, 117
of order n of the second kind, 604
Addition
of order p of the first kind, 599
of functions in Q[a, b], 3
of order —p of the first kind, 601
of geometric vectors, 34
of order zero of the first kind, 600
of linear transformations, 45
of order zero of the second kind, 603
of matrices, 70
Poisson integral form for, 616
in a vector space, 6
recurrence relations for, 616
of vectors in 01™, 7
zeros of, 235, 608, 609
Airy's differential equation, 241
Bessel's equation
Amplitude equation, 537
of order p, 234, 597
Angle between vectors, 263
of order zero, 157, 250
Arc
zeros of solutions of, 235
differentiable, 364
Bessel series
simple, 364
of the first kind, 622
Arithmetic means, 396
of the second kind, 624
Bessel's inequality, 272, 319
Bilinear, 257
B Biorthogonal
series, 535
Basis
sets of functions, 535
in a Euclidean space, 317
Bolzano- Weierstrass theorem, 389
in ^(-oo, oo), 438
Boundary conditions
in *£[0, ex.
), 444
for heat equation, 513, 514
orthogonal, 271, 317
homogeneous, 458
orthonormal, 278
periodic, 478
in (pe[-7r,Tr], 338, 360
for second-order equations, 457
in (PC[-1, 1], 409, 410, 424
unmixed, 478
in (PC[0, 1], 621
for wave equation, 514
standard in 2fll m „, 71
Boundary-value problem
standard in (R n 22 ,
for ordinary differential equations, 457
in a vector space, 21
for partial differential equations, 506, 507
Bernoulli's equation, 98
Bessel functions
Bessel's integral form for, 614
differentiation formulas for, 606 Cancellation law, 12
generating function for, 613 Cauchy
of half-integral order, 606 inequality, 263
768 INDEX

product, 254 of order n, 380


sequence, 639 Vandermonde, 116, 137
Cauchy-Schwarz inequality, 262 Diagonal form, 465
Center of mass, 296 Differentiation as a linear transformation,
Cesàro summable, 396 Diffusion equation, 513
Characteristic Diffusion equation, 513
equation, 467 Dimension of a vector space, 23
exponent, 591 Dini series, 624
polynomial, 467, 699 Directed line segment, 32
values, 461 Dirichlet's formula, 374
vector, 461 Distance
ChristoffeFs identity, 426 to a line, 284
Circuit, electrical, 170 to a plane, 285
Closed subspace, 323 to a subspace, 282
Closure between vectors, 264
of a subset, 325 Divergence
of a subspace, 324 theorem, 713, 719
Cofactor, 688 of a vector field, 712, 719
Comparison test, 644 Domain of a linear transformation, 42
Components of a vector, 26 Dot product, 257
in (R 2 , 1

Convergence E
absolute, 648
conditional, 650 Eigenfunction, 460, 474
in a Euclidean space, 304 Eigenvalue, 460, 461
of integrals, 179, 671, 672, 673, 674 method, 466, 484
mean, 307, 314 Eigenvector, 460, 461
in (R", 306 Endpoint conditions, 457
pointwise, 306, 338, 656 Equivalence
of a sequence of real numbers, 305, 638 class, 38
of a series, 242, 311, 314, 642, 661 relation, 37
uniform, 658, 661, 672, 674 Euclidean space, 256
Continuity, 652, 653 Euler
uniform, 652, 653 equation, 161, 586
Convolution formula, 206 formula, 129
Coordinate axes, 26 Euler-Fourier coefficients, 338
Coordinates of a vector, 26 Even extension, 350
Coordinate system, 26 Exchange principle, 18
Cramer's rule, 695 Existence
Critical damping, 227 problem, 83
Curl of a vector field, 718 theorems, 104, 243

D Fejér's theorem, 399, 403


Definite integral, 655 Flux, 711
as a linear transformation, 44 Fourier coefficients, 287, 314, 317, 338
Delaying factor, 195 Fourier-Legendre coefficients, 421
Delay time, 171 Fourier series
Derivative, right- and left-hand, 374 cosine, 351
Determinant differentiation of, 391, 392
expansion of, 388 double, 367, 534
Gramm, 283 expansion, 338

generalized, 317, 322 Gramm determinant, 283


integration of, 392, 393 Gram-Schmidt orthogonalization
mean convergence of, 314, 338, 363 process, 277
pointwise convergence of, 340, 342, 369, Green's
374, 379 first identity, 719
sine, 351 functions, 141, 147, 149, 212, 494, 498
uniform convergence of, 380, 382 second identity, 719
Free-end condition, 513
Frequency, 519
Frobenius, method of, 583, 594 H
Function Half-life,168
analytic, 243 Half- wave symmetry, 359
Bessel (see Bessel) Harmonic
beta, 211 function, 562, 567
broken line, 400 motion, damped, 227
continuous, 652, 653 motion, simple, 168, 226
differentiable,654 polynomial, 574, 576
Dirac delta, 223 Heat equation, 509
even, 334 one-dimensional, 528
even extension of, 350 two-dimensional, 513, 532
even part of, 336 steady-state solutions of, 546
of exponential order, 180 Hermite
gamma (generalized factorial), 211, 599 equation, 250, 441
harmonic, 562 functions, 443
Hermite, 443 Hermite polynomials, 438, 537
null, 332 differential equation for, 441
odd, 334 norms of, 439
odd extension of, 351 orthogonality of, 439
odd part of, 336 recurrence relation for, 440
order of, 677 Homogeneous
periodic, 198, 347 linear differential equation, 91
periodic extension of, 342 operator equation, 81
piecewise continuous, 177, 330, 364 polynomial, 574
piecewise smooth, 340 polynomial; harmonic, 575
uniformly continuous, 652, 653 system of linear equations, 84
unit step, 194
Fundamental theorem of Calculus, 655

Image of a linear transformation, 42


Impedance, steady-state, 171, 174
Gamma function, 211, 599 Impulse, 222
Gauss's theorem, 719 Indefinite integral, as a linear
Generating function transformation, 44
for Bessel functions, 613 Indicial equation, 163, 587, 591
for. Hermite polynomials, 451 Infinite series

for Laguerre polynomials, 451 convergent, 311, 642


for Legendre polynomials, 448
Geometric vector, 33 in a Euclidean space, 311
Gibbs partial sums of, 311, 342
interval, 387, 390 of real numbers, 642
phenomenon, 386, 387 Initial conditions, 104

Gradient, 718 Initial-value problem, 103



Inner product inverse, 185


standard, in Q[a, b], 257, 258 of a periodic function, 198
ra
standard, in (R 257, 258
, table of, 228-230
weighted, 258 Laplacian, 718
Inner product space, 256 Least squares, method of, 292
Integral test, 645 Least upper bound principle, 637
Integration as a linear transformation, 44 Legendre equation, 246, 248, 413
Intermediate value property, 654 power series solution of, 248
Interval Legendre polynomials, 278, 280, 410,
closed, 3 483, 559
open, 3 differential equation for, 413
Invariant norms of, 417
of a second-order linear differential orthogonality of, 414, 418, 562
equation, 236 recurrence relation for, 417
subspace, 68, 462 Rodrigues' formula for, 410
Inverse roots of, 417
left, 59 Legendre series
of a linear transformation, 57 mean convergence of, 421
right, 59 pointwise convergence of, 423, 429,
Inversion, 683 432
Inverting the operator, method of, 83 uniform convergence of, 429
Isomorphism, 57 Leibnitz
formula, 150, 671
rule, 90
Length of a vector, 261
Jacobian determinant, 705 Lerch's theorem, 185, 678
Jump discontinuity, 330 Limits, right- and left-hand, 330
Linear combination of vectors, 14
Linear dependence, 19
K
Linear differential equation
Kirchhoff s laws, 170 associated homogeneous equation of, 92
Kronecker delta, 269 first-order, ordinary, 95
homogeneous, 91
nonhomogeneous, 91
normal, 91
Lagrange identity, 477 ordinary, of order n, 91
Laguerre equation, 446, 447, 590 partial, 505
Laguerre polynomials, 444, 447 self-adjoint form of, 237, 476
differential equation for, 446, 447 with constant coefficients, 129, 134
norms of, 444 Linear differential operator, 52, 86, 505
orthogonality of, 444 coefficients of, 86
recurrence relation for, 444 constant coefficient, 51, 88, 89, 126
Laplace equation equidimensional (Euler), 90
in circular regions, 553 factoring, 52, 88, 89, 126
in cylindrical coordinates, 582 order of, 86
in polar coordinates, 547 self-adjoint form of, 237, 476
in rectangular coordinates, 509, 546 Linear independence, 19
in rectangular regions, 548 Linear operator (see Linear
in spherical coordinates, 548 transformation)
in spherical regions, 558, 564 Linear transformation
Laplace series, 571 addition of, 45
Laplace transform definition of, 41
definition of, 179 domain of, 42

idempotent, 61
identity, 43
image of, 42
invertible, 57
matrix of, 64
minimum polynomial of, 72
nilpotent, 51
null space of, 55
one-to-one, 56
onto, 42
polynomials in, 51
powers of, 50
products of, 48
range of, 42
scalar multiples of, 47
symmetric, 471
zero, 43
Lipschitz condition, 378
Lower bound, 182, 637
greatest, 182, 637

M
Matrix
addition, 70
characteristic polynomial of, 699
definition of, 64
diagonal, 80, 465
identity, 66
of a linear transformation, 64
minimum polynomial of, 77
multiplication, 75
nilpotent, 79
nonsingular, 689, 699
principal diagonal of, 77
similarity, 699
symmetric, 472
transpose, 685
triangular, 471
zero, 66
Maximum-minimum property, 654
Mean convergence, 307, 314
Mean square deviation, 287
Mean value theorem
for derivatives, 655
for integrals, 656
Minor, 688
Möbius strip, 707
Momentum, 168
Multiplication
of linear transformations, 48
of matrices, 75

N
Nilpotent
linear transformation, 51
matrix, 79
Node, 519
Normal equations, 294
Normal vector to a surface, 701, 708

O
Odd extension, 351
Operator equation, 81
Orthogonal, 268
Orthogonalization, 277
Orthonormal, 268

P
Parallelogram law, 1
Parseval's equality, 272, 319, 322, 363
Particular solution, 81, 92
Partition of a set, 39
Pendulum, simple, 167
Permutation, 682
even, 683
odd, 683
Periodic
extension, 342
function, 198, 347
Perpendicular
projection, 282
vectors, 268
Phase angle, 171, 174
Piecewise continuous function, 177, 330, 364
Pointwise convergence, 306, 338, 656
Poisson integral form, 556, 616
Polynomial
characteristic, 467, 699
Hermite, 438, 537
Laguerre, 444, 447
Legendre, 278, 280, 410, 483, 559
minimum, 72, 77
space of, 8, 13
Population growth, 166
Positive definite, 257
Power series, 242, 661
differentiation of, 242, 662
integration of, 663
Product
of linear transformations, 48
of matrices, 75
of power series, 254
Projection, 61, 282
Pythagorean theorem, 268

R
Radius of convergence, 242, 662
Range of a linear transformation, 42
Ratio test, 645, 649
Real vector space, 1, 6 (see also Vector space)
Recurrence relation, 345, 417, 440, 444, 616
Reduction in order, method of, 154
Reflexive, 37
Region
bounded, 654
closed, 654
connected, 507
pathwise connected, 507
Resonance, 174, 221
Resonating frequency, 174
Riccati equation, 100, 605
Riemann-Lebesgue lemma, 371
Rodrigues' formula, 410

S
Scalar field, 705
Scalar multiplication
in C[a, b], 4
for linear transformations, 47
for matrices, 70
in R², 2
in Rⁿ, 7
Schrödinger wave equation, 536
Self-adjoint form, 237, 476
Separation of variables, method of, 516
Sequence
Cauchy, 639
convergent, 304, 306, 638
of functions, 655
monotonic nondecreasing, 639
monotonic nonincreasing, 538, 639
pointwise convergence of, 306, 338, 656
of real numbers, 638
uniform convergence of, 658
uniformly bounded, 538
of vectors, 304
Series
alternating, 650
Bessel, 622, 624
Dini, 624
Fourier, 317, 338 (see also Fourier series)
geometric, 642
harmonic, 644
Laplace, 571
Legendre, 421 (see also Legendre series)
p, 646
power, 242, 661 (see also Power series)
Shifting theorem
first, 193
second, 195
Singular point, 583
irregular, 584
regular, 584
Sonin-Pólya theorem, 238
Spanned, 14
Spherical harmonics, 567
Steady-state current, 171
Sturm comparison theorem, 232, 238
Sturm-Liouville problem, 479
Sturm separation theorem, 231
Subspace, 12
closed, 323
criterion, 12
generated by a set, 325
intersection, 14, 17
invariant, 68, 462
spanned by a set, 14
Subtraction of vectors, 8
Summable
by arithmetic means, 396
Cesàro, 396
uniformly Cesàro, 399
Surface
area, 704
explicit representation of, 700
implicit representation of, 700
integral, 705, 708
nonorientable, 708
orientable, 708
parametric equations for, 700
piecewise smooth, 713
smooth, 701
zonal harmonics, 562
Surface integral
of a scalar field, 705
of a vector field, 708
Symmetric, 37, 257
Symmetric matrix, 261
System of linear equations, 83, 84, 694
T
Taylor series, 667
Taylor's formula, 666
Test for linear independence, 20
Transient
charge, 173
current, 171
Transitive, 37
Transpose of a matrix, 685
Triangle inequality, 264
Trigonometric polynomial, 269

U
Undetermined coefficients, method of, 158, 244
Uniform continuity, 652, 653
Uniqueness problem, 83
Uniqueness theorem, 104, 718
Upper bound, least, 637

V
Variance, 296
Variation of parameters, 141
Vector
addition, 1, 3, 6
field, 707
product, 704
Vector space
of functions, 3, 13, 183, 332
of linear transformations, 47
of matrices, 70
of polynomials, 8, 13, 575
Vibrating spring, 218

W
Wave equation, 509
d'Alembert's solution of, 515
one-dimensional, 509, 516
Schrödinger, 510
two-dimensional, 510
Weierstrass
approximation theorem, 360, 406, 408, 578
M-test, 660
Weight function, 258
Wronskian, 111

Z
Zero vector, 6
Zonal harmonics, 562