
Notes on Numerical Linear Algebra
Dr. George W Benthien
December 9, 2006
E-mail: Benthien@cox.net
Contents
Preface 5
1 Mathematical Preliminaries 6
1.1 Matrices and Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.2 Vector Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.2.1 Linear Independence and Bases . . . . . . . . . . . . . . . . . . . . . . . 8
1.2.2 Inner Product and Orthogonality . . . . . . . . . . . . . . . . . . . . . . . 8
1.2.3 Matrices As Linear Transformations . . . . . . . . . . . . . . . . . . . . . 9
1.3 Derivatives of Vector Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.3.1 Newtons Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2 Solution of Systems of Linear Equations 11
2.1 Gaussian Elimination . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.1.1 The Basic Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.1.2 Row Pivoting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.1.3 Iterative Refinement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.2 Cholesky Factorization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.3 Elementary Unitary Matrices and the QR Factorization . . . . . . . . . . . . . . . 19
2.3.1 Gram-Schmidt Orthogonalization . . . . . . . . . . . . . . . . . . . . . . 19
2.3.2 Householder Reflections . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.3.3 Complex Householder Matrices . . . . . . . . . . . . . . . . . . . . . . . 22
2.3.4 Givens Rotations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.3.5 Complex Givens Rotations . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.3.6 QR Factorization Using Householder Reflectors . . . . . . . . . . . . . . . 28
2.3.7 Uniqueness of the Reduced QR Factorization . . . . . . . . . . . . . . . . 29
2.3.8 Solution of Least Squares Problems . . . . . . . . . . . . . . . . . . . . . 32
2.4 The Singular Value Decomposition . . . . . . . . . . . . . . . . . . . . . . . . . . 32
2.4.1 Derivation and Properties of the SVD . . . . . . . . . . . . . . . . . . . . 33
2.4.2 The SVD and Least Squares Problems . . . . . . . . . . . . . . . . . . . . 36
2.4.3 Singular Values and the Norm of a Matrix . . . . . . . . . . . . . . . . . . 39
2.4.4 Low Rank Matrix Approximations . . . . . . . . . . . . . . . . . . . . . . 39
2.4.5 The Condition Number of a Matrix . . . . . . . . . . . . . . . . . . . . . 41
2.4.6 Computation of the SVD . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
3 Eigenvalue Problems 44
3.1 Reduction to Tridiagonal Form . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
3.2 The Power Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
3.3 The Rayleigh Quotient . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
3.4 Inverse Iteration with Shifts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
3.5 Rayleigh Quotient Iteration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
3.6 The Basic QR Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
3.6.1 The QR Method with Shifts . . . . . . . . . . . . . . . . . . . . . . . . . 52
3.7 The Divide-and-Conquer Method . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
4 Iterative Methods 61
4.1 The Lanczos Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
4.2 The Conjugate Gradient Method . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
4.3 Preconditioning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
List of Figures
2.1 Householder reflection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.2 Householder reduction of a matrix to bidiagonal form. . . . . . . . . . . . . . . . 42
3.1 Graph of f(z) = 1 + 0.5/(δ_1 − z) + 0.5/(δ_2 − z) + 0.5/(δ_3 − z) + 0.5/(δ_4 − z) . . . . 58
3.2 Graph of f(z) = 1 + 0.5/(δ_1 − z) + 0.01/(δ_2 − z) + 0.5/(δ_3 − z) + 0.5/(δ_4 − z) . . . . 59
Preface
The purpose of these notes is to present some of the standard procedures of numerical linear algebra from the perspective of a user and not a computer specialist. You will not find extensive error analysis or programming details. The purpose is to give the user a general idea of what the numerical procedures are doing. You can find more extensive discussions in the references

• Applied Numerical Linear Algebra by J. Demmel, SIAM, 1997
• Numerical Linear Algebra by L. Trefethen and D. Bau, SIAM, 1997
• Matrix Computations by G. Golub and C. Van Loan, Johns Hopkins University Press, 1996

The notes are divided into four chapters. The first chapter presents some of the notation used in this paper and reviews some of the basic results of Linear Algebra. The second chapter discusses methods for solving linear systems of equations, the third chapter discusses eigenvalue problems, and the fourth discusses iterative methods. Of course we cannot discuss every possible method, so I have tried to pick out those that I believe are the most used. I have assumed that the user has some basic knowledge of linear algebra.
Chapter 1
Mathematical Preliminaries
In this chapter we will describe some of the notation that will be used in these notes and review
some of the basic results from Linear Algebra.
1.1 Matrices and Vectors
A matrix is a two-dimensional array of real or complex numbers arranged in rows and columns. If a matrix has m rows and n columns, we say that it is an m × n matrix. We denote the element in the i-th row and j-th column of A by a_{ij}. The matrix A is often written in the form

A = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix}.

We sometimes write A = (a_1, ..., a_n) where a_1, ..., a_n are the columns of A. A vector (or n-vector) is an n × 1 matrix. The collection of all n-vectors is denoted by R^n if the elements (components) are all real and by C^n if the elements are complex. We define the sum of two m × n matrices componentwise, i.e., the i,j entry of A + B is a_{ij} + b_{ij}. Similarly, we define the multiplication of a scalar α times a matrix A to be the matrix whose i,j component is α a_{ij}.

If A is a real matrix with components a_{ij}, then the transpose of A (denoted by A^T) is the matrix whose i,j component is a_{ji}, i.e., rows and columns are interchanged. If A is a matrix with complex components, then A^H is the matrix whose i,j-th component is the complex conjugate of the j,i-th component of A. We denote the complex conjugate of a by ā. Thus, (A^H)_{ij} = ā_{ji}. A real matrix A is said to be symmetric if A = A^T. A complex matrix A is said to be Hermitian if A = A^H. Notice that the diagonal elements of a Hermitian matrix must be real. The n × n matrix whose diagonal components are all one and whose off-diagonal components are all zero is called the identity matrix and is denoted by I.
If A is an m × k matrix and B is a k × n matrix, then the product AB is the m × n matrix with components given by

(AB)_{ij} = \sum_{r=1}^{k} a_{ir} b_{rj}.

The matrix product AB is only defined when the number of columns of A is the same as the number of rows of B. In particular, the product of an m × n matrix A and an n-vector x is given by

(Ax)_i = \sum_{k=1}^{n} a_{ik} x_k,   i = 1, ..., m.

It can be easily verified that IA = A if the number of columns in I equals the number of rows in A. It can also be shown that (AB)^T = B^T A^T and (AB)^H = B^H A^H. In addition, we have (A^T)^T = A and (A^H)^H = A.
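These rules are easy to check numerically. The following is a small sketch in NumPy (the matrices are invented for the example and are not part of the notes):

import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])        # 3 x 2
B = np.array([[1.0, 0.0, 2.0], [0.0, 1.0, 1.0]])          # 2 x 3
print(np.allclose((A @ B).T, B.T @ A.T))                  # (AB)^T = B^T A^T

C = np.array([[1 + 2j, 3.0], [0.0, 4 - 1j]])              # complex matrices
D = np.array([[2.0, 1j], [1 - 1j, 0.5]])
print(np.allclose((C @ D).conj().T, D.conj().T @ C.conj().T))   # (CD)^H = D^H C^H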
1.2 Vector Spaces
R^n and C^n together with the operations of addition and scalar multiplication are examples of a structure called a vector space. A vector space V is a collection of vectors for which addition and scalar multiplication are defined in such a way that the following conditions hold:

1. If x and y belong to V and α is a scalar, then x + y and αx belong to V.
2. x + y = y + x for any two vectors x and y in V.
3. x + (y + z) = (x + y) + z for any three vectors x, y, and z in V.
4. There is a vector 0 in V such that x + 0 = x for all x in V.
5. For each x in V there is a vector −x in V such that x + (−x) = 0.
6. (αβ)x = α(βx) for any scalars α, β and any vector x in V.
7. 1x = x for any x in V.
8. α(x + y) = αx + αy for any x and y in V and any scalar α.
9. (α + β)x = αx + βx for any x in V and any scalars α, β.

A subspace of a vector space V is a subset that is also a vector space in its own right.
1.2.1 Linear Independence and Bases
A set of vectors v_1, ..., v_r is said to be linearly independent if the only way we can have α_1 v_1 + ··· + α_r v_r = 0 is for α_1 = ··· = α_r = 0. A set of vectors v_1, ..., v_n is said to span a vector space V if every vector x in V can be written as a linear combination of the vectors v_1, ..., v_n, i.e., x = α_1 v_1 + ··· + α_n v_n. The set of all linear combinations of the vectors v_1, ..., v_r is a subspace denoted by <v_1, ..., v_r> and called the span of these vectors. If a set of vectors v_1, ..., v_n is linearly independent and spans V it is called a basis for V. If a vector space V has a basis consisting of a finite number of vectors, then the space is said to be finite dimensional. In a finite-dimensional vector space every basis has the same number of vectors. This number is called the dimension of the vector space. Clearly R^n and C^n have dimension n. Let e_k denote the vector in R^n or C^n that consists of all zeroes except for a one in the k-th position. It is easily verified that e_1, ..., e_n is a basis for either R^n or C^n.
1.2.2 Inner Product and Orthogonality
If x and y are two n-vectors, then the inner (dot) product x · y is the scalar value defined by x^H y. If the vector space is real we can replace x^H by x^T. The inner product x · y has the properties:

1. y · x = \overline{x · y}
2. x · (αy) = α(x · y)
3. x · (y + z) = x · y + x · z
4. x · x ≥ 0 and x · x = 0 if and only if x = 0.

Vectors x and y are said to be orthogonal if x · y = 0. A basis v_1, ..., v_n is said to be orthonormal if

v_i · v_j = \begin{cases} 0 & i ≠ j \\ 1 & i = j. \end{cases}

We define the norm ‖x‖ of a vector x by ‖x‖ = \sqrt{x · x} = \sqrt{|x_1|^2 + ··· + |x_n|^2}. The norm has the properties

1. ‖αx‖ = |α| ‖x‖
2. ‖x‖ = 0 implies that x = 0
3. ‖x + y‖ ≤ ‖x‖ + ‖y‖.

If v_1, ..., v_n is an orthonormal basis and x = α_1 v_1 + ··· + α_n v_n, then it can be shown that ‖x‖² = |α_1|² + ··· + |α_n|². The norm and inner product satisfy the inequality

|x · y| ≤ ‖x‖ ‖y‖.   (Cauchy inequality)
1.2.3 Matrices As Linear Transformations
An m × n matrix A can be considered as a mapping of the space R^n (C^n) into the space R^m (C^m) where the image of the n-vector x is the matrix-vector product Ax. This mapping is linear, i.e., A(x + y) = Ax + Ay and A(αx) = αAx. The range of A (denoted by Range(A)) is the space of all m-vectors y such that y = Ax for some n-vector x. It can be shown that the range of A is the space spanned by the columns of A. The null space of A (denoted by Null(A)) is the vector space consisting of all n-vectors x such that Ax = 0. An n × n square matrix A is said to be invertible if it is a one-to-one mapping of the space R^n (C^n) onto itself. It can be shown that a square matrix A is invertible if and only if the null space Null(A) consists of only the zero vector. If A is invertible, then the inverse A^{-1} of A is defined by A^{-1}y = x where x is the unique n-vector satisfying Ax = y. The inverse has the properties A A^{-1} = A^{-1} A = I and (AB)^{-1} = B^{-1} A^{-1}. We denote (A^{-1})^T = (A^T)^{-1} by A^{-T}.

If A is an m × n matrix, x is an n-vector, and y is an m-vector, then it can be shown that

(Ax) · y = x · (A^H y).
1.3 Derivatives of Vector Functions
The central idea behind differentiation is the local approximation of a function by a linear function. If f is a function of one variable, then the locus of points (x, f(x)) is a plane curve C. The tangent line to C at (x, f(x)) is the graphical representation of the best local linear approximation to f at x. We call this local linear approximation the differential. We represent this local linear approximation by the equation dy = f'(x) dx. If f is a function of two variables, then the locus of points (x, y, f(x, y)) represents a surface S. Here the best local linear approximation to f at (x, y) is graphically represented by the tangent plane to the surface S at the point (x, y, f(x, y)). We will generalize this idea of a local linear approximation to vector-valued functions of n variables. Let f be a function mapping n-vectors into m-vectors. We define the derivative Df(x) of f at the n-vector x to be the unique linear transformation (m × n matrix) satisfying

f(x + h) = f(x) + Df(x)h + o(‖h‖)   (1.1)

whenever such a transformation exists. Here the o notation signifies a function with the property

lim_{‖h‖→0} o(‖h‖)/‖h‖ = 0.

Thus, Df(x) is a linear transformation that locally approximates f.

We can also define a directional derivative δ_h f(x) in the direction h by

δ_h f(x) = lim_{λ→0} [f(x + λh) − f(x)]/λ = \frac{d}{dλ} f(x + λh) \Big|_{λ=0}   (1.2)

whenever the limit exists. This directional derivative is also referred to as the variation of f in the direction h. If Df(x) exists, then

δ_h f(x) = Df(x)h.

However, the existence of δ_h f(x) for every direction h does not imply the existence of Df(x). If we take h = e_i, then δ_h f(x) is just the partial derivative ∂f(x)/∂x_i.
1.3.1 Newton's Method
Newton's method is an iterative scheme for finding the zeroes of a smooth function f. If x is a guess, then we approximate f near x by

f(x + h) ≈ f(x) + Df(x)h.

If x + h is the zero of this linear approximation, then

h = −[Df(x)]^{-1} f(x)

or

x + h = x − [Df(x)]^{-1} f(x).   (1.3)

We can take x + h as an improved approximation to the nearby zero of f. If we keep iterating with equation (1.3), then the (k+1)-iterate x^{(k+1)} is related to the k-iterate x^{(k)} by

x^{(k+1)} = x^{(k)} − [Df(x^{(k)})]^{-1} f(x^{(k)}).   (1.4)
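The iteration (1.4) is easy to sketch in code. The following NumPy fragment is one possible implementation; the test function f and its Jacobian Df are invented here purely for illustration and are not taken from the notes.

import numpy as np

def newton(f, Df, x0, tol=1e-12, max_iter=50):
    """Newton iteration x_(k+1) = x_k - Df(x_k)^(-1) f(x_k), cf. equation (1.4)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        step = np.linalg.solve(Df(x), f(x))   # solve Df(x) d = f(x) rather than forming an inverse
        x = x - step
        if np.linalg.norm(step) < tol:
            break
    return x

# Illustrative system: f(x, y) = (x^2 + y^2 - 4, x*y - 1)
f  = lambda v: np.array([v[0]**2 + v[1]**2 - 4.0, v[0]*v[1] - 1.0])
Df = lambda v: np.array([[2*v[0], 2*v[1]], [v[1], v[0]]])
print(newton(f, Df, [2.0, 0.0]))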
Chapter 2
Solution of Systems of Linear Equations
2.1 Gaussian Elimination
Gaussian elimination is the standard way of solving a system of linear equations Ax = b when A is a square matrix with no special properties. The first known use of this method was in the Chinese text Nine Chapters on the Mathematical Art written between 200 BC and 100 BC. Here it was used to solve a system of three equations in three unknowns. The coefficients (including the right-hand-side) were written in tabular form and operations were performed on this table to produce a triangular form that could be easily solved. It is remarkable that this was done long before the development of matrix notation or even a notation for variables. The method was used by Gauss in the early 1800s to solve a least squares problem for determining the orbit of the asteroid Pallas. Using observations of Pallas taken between 1803 and 1809, he obtained a system of six equations in six unknowns which he solved by the method now known as Gaussian elimination. The concept of treating a matrix as an object and the development of an algebra for matrices were first introduced by Cayley [2] in the paper A Memoir on the Theory of Matrices.

In this paper we will first describe the basic method and show that it is equivalent to factoring the matrix A into the product of a lower triangular and an upper triangular matrix, i.e., A = LU. We will then introduce the method of row pivoting that is necessary in order to keep the method stable. We will show that row pivoting is equivalent to a factorization PA = LU or, equivalently, A = P^T LU, where P is the identity matrix with its rows permuted. Having obtained this factorization, the solution for a given right-hand-side b is obtained by solving the two triangular systems Ly = Pb and Ux = y by simple processes called forward and backward substitution.

There are a number of good computer implementations of Gaussian elimination with row pivoting. Matlab has a good implementation obtained by the call [L,U,P]=lu(A). Another good implementation is the LAPACK routine SGESV (DGESV, CGESV). It can be obtained in either Fortran or C from the site www.netlib.org.

We will end by showing how the accuracy of a solution can be improved by a process called iterative refinement.
2.1.1 The Basic Procedure
Gaussian elimination begins by producing zeroes below the diagonal in the first column, i.e.,

\begin{pmatrix} × & × & \cdots & × \\ × & × & \cdots & × \\ \vdots & \vdots & & \vdots \\ × & × & \cdots & × \end{pmatrix} \longrightarrow \begin{pmatrix} × & × & \cdots & × \\ 0 & × & \cdots & × \\ \vdots & \vdots & & \vdots \\ 0 & × & \cdots & × \end{pmatrix}.   (2.1)

If a_{ij} is the element of A in the i-th row and the j-th column, then the first step in the Gaussian elimination process consists of multiplying A on the left by the lower triangular matrix M_1 given by

M_1 = \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 \\ -a_{21}/a_{11} & 1 & 0 & \cdots & 0 \\ -a_{31}/a_{11} & 0 & 1 & & \vdots \\ \vdots & \vdots & & \ddots & 0 \\ -a_{n1}/a_{11} & 0 & \cdots & 0 & 1 \end{pmatrix},   (2.2)

i.e., zeroes are produced in the first column by adding appropriate multiples of the first row to the other rows. The next step is to produce zeroes below the diagonal in the second column, i.e.,

\begin{pmatrix} × & × & \cdots & × \\ 0 & × & \cdots & × \\ \vdots & \vdots & & \vdots \\ 0 & × & \cdots & × \end{pmatrix} \longrightarrow \begin{pmatrix} × & × & × & \cdots & × \\ 0 & × & × & \cdots & × \\ 0 & 0 & × & \cdots & × \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & × & \cdots & × \end{pmatrix}.   (2.3)

This can be obtained by multiplying M_1 A on the left by the lower triangular matrix M_2 given by

M_2 = \begin{pmatrix} 1 & 0 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 0 & 0 & \cdots & 0 \\ 0 & -a^{(1)}_{32}/a^{(1)}_{22} & 1 & 0 & \cdots & 0 \\ 0 & -a^{(1)}_{42}/a^{(1)}_{22} & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & & \ddots & 0 \\ 0 & -a^{(1)}_{n2}/a^{(1)}_{22} & 0 & \cdots & 0 & 1 \end{pmatrix}   (2.4)

where a^{(1)}_{ij} is the i,j-th element of M_1 A. Continuing in this manner, we can define lower triangular matrices M_3, ..., M_{n-1} so that M_{n-1} ··· M_1 A is upper triangular, i.e.,

M_{n-1} ··· M_1 A = U.   (2.5)

Taking the inverses of the matrices M_1, ..., M_{n-1}, we can write A as

A = M_1^{-1} M_2^{-1} ··· M_{n-1}^{-1} U.   (2.6)

Let

L = M_1^{-1} M_2^{-1} ··· M_{n-1}^{-1}.   (2.7)

Then it follows from equation (2.6) that

A = LU.   (2.8)
We will now show that L is lower triangular. Each of the matrices M_k can be written in the form

M_k = I − u^{(k)} e_k^T   (2.9)

where e_k is the vector whose components are all zero except for a one in the k-th position and u^{(k)} is a vector whose first k components are zero. The term u^{(k)} e_k^T is an n × n matrix whose elements are all zero except for those below the diagonal in the k-th column. In fact, the components of u^{(k)} are given by

u^{(k)}_i = \begin{cases} 0 & 1 ≤ i ≤ k \\ a^{(k-1)}_{ik}/a^{(k-1)}_{kk} & k < i \end{cases}   (2.10)

where a^{(k-1)}_{ij} is the i,j-th element of M_{k-1} ··· M_1 A. Since e_k^T u^{(k)} = u^{(k)}_k = 0, it follows that

(I − u^{(k)} e_k^T)(I + u^{(k)} e_k^T) = I + u^{(k)} e_k^T − u^{(k)} e_k^T − u^{(k)} e_k^T u^{(k)} e_k^T = I − u^{(k)} (e_k^T u^{(k)}) e_k^T = I,   (2.11)

i.e.,

M_k^{-1} = I + u^{(k)} e_k^T.   (2.12)

Thus, M_k^{-1} is the same as M_k except for a change of sign of the elements below the diagonal in column k. Combining equations (2.7) and (2.12), we obtain

L = (I + u^{(1)} e_1^T) ··· (I + u^{(n-1)} e_{n-1}^T) = I + u^{(1)} e_1^T + ··· + u^{(n-1)} e_{n-1}^T.   (2.13)

In this expression the cross terms dropped out since

u^{(i)} e_i^T u^{(j)} e_j^T = u^{(j)}_i u^{(i)} e_j^T = 0 for i < j.

Equation (2.13) implies that L is lower triangular and that the k-th column of L looks like the k-th column of M_k with the signs reversed on the elements below the diagonal, i.e.,

L = \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 \\ a_{21}/a_{11} & 1 & 0 & & 0 \\ a_{31}/a_{11} & a^{(1)}_{32}/a^{(1)}_{22} & 1 & & 0 \\ \vdots & \vdots & & \ddots & \vdots \\ a_{n1}/a_{11} & a^{(1)}_{n2}/a^{(1)}_{22} & \cdots & & 1 \end{pmatrix}.   (2.14)
Having the LU factorization given in equation (2.8), it is possible to solve the system of equations

Ax = LUx = b

for any right-hand-side b. If we let y = Ux, then y can be found by solving the triangular system Ly = b. Having y, x can be obtained by solving the triangular system Ux = y. Triangular systems are very easy to solve. For example, in the system Ux = y, the last equation can be solved for x_n (the only unknown in this equation). Having x_n, the next to the last equation can be solved for x_{n-1} (the only unknown left in this equation). Continuing in this manner we can solve for the remaining components of x. For the system Ly = b, we start by computing y_1 and then work our way down. Solving an upper triangular system is called back substitution. Solving a lower triangular system is called forward substitution.

To compute the LU factorization requires approximately n³/3 operations, where an operation consists of an addition and a multiplication. For each right-hand-side, solving the two triangular systems requires approximately n² operations. Thus, as far as solving systems of equations is concerned, having the LU factorization of A is just as good as having the inverse of A and is less costly to compute.
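A minimal sketch of forward and back substitution in NumPy (assuming the factors L and U are already in hand; the small example matrices below are made up for illustration):

import numpy as np

def forward_substitution(L, b):
    """Solve Ly = b for lower triangular L (nonzero diagonal assumed)."""
    n = L.shape[0]
    y = np.zeros(n)
    for i in range(n):
        y[i] = (b[i] - L[i, :i] @ y[:i]) / L[i, i]
    return y

def back_substitution(U, y):
    """Solve Ux = y for upper triangular U (nonzero diagonal assumed)."""
    n = U.shape[0]
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - U[i, i+1:] @ x[i+1:]) / U[i, i]
    return x

# Given A = LU, solve Ax = b by two triangular solves.
L = np.array([[1.0, 0.0], [0.5, 1.0]])
U = np.array([[4.0, 2.0], [0.0, 3.0]])
b = np.array([6.0, 9.0])
x = back_substitution(U, forward_substitution(L, b))
print(np.allclose(L @ U @ x, b))   # True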
2.1.2 Row Pivoting
There is one problem with Gaussian elimination that has yet to be addressed. It is possible for one of the diagonal elements a^{(k-1)}_{kk} that occur during Gaussian elimination to be zero or to be very small. This causes a problem since we must divide by this diagonal element. If one of the diagonals is exactly zero, the process obviously blows up. However, there can still be a problem if one of the diagonals is small. In this case large elements are produced in both the L and U matrices. These large entries lead to a loss of accuracy when there are subtractions involving these big numbers. This problem can occur even for well behaved matrices. To eliminate this problem we introduce row pivoting. In performing Gaussian elimination, it is not necessary to take the equations in the order they are given. Suppose we are at the stage where we are zeroing out the elements below the diagonal in the k-th column. We can interchange any of the rows from the k-th row on without changing the structure of the matrix. In row pivoting we find the largest in magnitude of the elements a^{(k-1)}_{kk}, a^{(k-1)}_{k+1,k}, ..., a^{(k-1)}_{nk} and interchange rows to bring that element to the (k,k) position. Mathematically we can perform this row interchange by multiplying on the left by the matrix P_k that is like the identity matrix with the appropriate rows interchanged. The matrix P_k has the property P_k P_k = I, i.e., P_k is its own inverse. With row pivoting equation (2.5) is replaced by

M_{n-1} P_{n-1} ··· M_2 P_2 M_1 P_1 A = U.   (2.15)
We can write this equation in the form

M_{n-1} (P_{n-1} M_{n-2} P_{n-1}^{-1})(P_{n-1} P_{n-2} M_{n-3} P_{n-2}^{-1} P_{n-1}^{-1}) ··· (P_{n-1} ··· P_2 M_1 P_2^{-1} ··· P_{n-1}^{-1})(P_{n-1} ··· P_1) A = U.   (2.16)

Define M'_{n-1} = M_{n-1} and

M'_k = P_{n-1} ··· P_{k+1} M_k P_{k+1}^{-1} ··· P_{n-1}^{-1},   k = 1, ..., n−2.   (2.17)

Then equation (2.16) can be written

(M'_{n-1} ··· M'_1)(P_{n-1} ··· P_1) A = U.   (2.18)

Note that multiplying by P_j on the left only modifies rows k+1 up to n. Similarly, multiplying by P_j^{-1} = P_j on the right only modifies columns k+1 up to n. Therefore,

M'_k = (P_{n-1} ··· P_{k+1})(I − u^{(k)} e_k^T)(P_{k+1} ··· P_{n-1}) = I − [(P_{n-1} ··· P_{k+1}) u^{(k)}] e_k^T (P_{k+1} ··· P_{n-1}) = I − û^{(k)} e_k^T   (2.19)

where û^{(k)} is like u^{(k)} except that the components k+1 to n are permuted by P_{n-1} ··· P_{k+1}. Since M'_k has the same form as M_k, it follows that the matrix L = (M'_1)^{-1} ··· (M'_{n-1})^{-1} is lower triangular. Thus, if we define P = P_{n-1} ··· P_1, equation (2.18) becomes

PA = LU.   (2.20)

Of course, in practice we don't need to explicitly construct the matrix P since the interchanges can be kept track of using a vector. To solve a system of equations Ax = b we replace the system by PAx = Pb and proceed as before.
It is also possible to do column interchanges as well as row interchanges, but this is seldom used in practice. By the construction of L all its elements are less than or equal to one in magnitude. The elements of U are usually not very large, but there are some peculiar cases where large entries can appear in U even with row pivoting. For example, consider the matrix

A = \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 & 1 \\ -1 & 1 & 0 & \cdots & 0 & 1 \\ -1 & -1 & 1 & & \vdots & \vdots \\ \vdots & & & \ddots & 0 & 1 \\ -1 & -1 & \cdots & -1 & 1 & 1 \\ -1 & -1 & -1 & \cdots & -1 & 1 \end{pmatrix}.

In the first step no pivoting is necessary, but the elements 2 through n in the last column are doubled. In the second step again no pivoting is necessary, but the elements 3 through n are doubled. Continuing in this manner we arrive at

U = \begin{pmatrix} 1 & & & & 1 \\ & 1 & & & 2 \\ & & 1 & & 4 \\ & & & \ddots & \vdots \\ & & & & 2^{n-1} \end{pmatrix}.

Although growth like this in the size of the elements of U is theoretically possible, there are no reports of this ever having happened in the solution of a real-world problem. In practice Gaussian elimination with row pivoting has proven to be very stable.
2.1.3 Iterative Refinement
If the solution of Ax = b is not sufficiently accurate, the accuracy can be improved by applying Newton's method to the function f(x) = Ax − b. If x^{(k)} is an approximate solution to f(x) = 0, then a Newton iteration produces an approximation x^{(k+1)} given by

x^{(k+1)} = x^{(k)} − [Df(x^{(k)})]^{-1} f(x^{(k)}) = x^{(k)} − A^{-1}(Ax^{(k)} − b).   (2.21)

An iteration step can be summarized as follows:

1. compute the residual r^{(k)} = Ax^{(k)} − b;
2. solve the system Ad^{(k)} = r^{(k)} using the LU factorization of A;
3. compute x^{(k+1)} = x^{(k)} − d^{(k)}.

The residual is usually computed in double precision. If the above calculations were carried out exactly, the answer would be obtained in one iteration as is always true when applying Newton's method to a linear function. However, because of roundoff errors, it may require more than one iteration to obtain the desired accuracy.
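One possible sketch of this refinement loop, reusing a single LU factorization through SciPy's lu_factor/lu_solve (the test matrix is illustrative only and not from the notes):

import numpy as np
from scipy.linalg import lu_factor, lu_solve

def iterative_refinement(A, b, num_steps=3):
    """Solve Ax = b, then repeat: r = Ax - b, solve Ad = r with the stored LU, x <- x - d."""
    lu_piv = lu_factor(A)              # one LU factorization with row pivoting
    x = lu_solve(lu_piv, b)            # initial solution
    for _ in range(num_steps):
        r = A @ x - b                  # residual (ideally accumulated in higher precision)
        d = lu_solve(lu_piv, r)        # correction from the same factorization
        x = x - d
    return x

A = np.array([[1e-8, 1.0], [1.0, 1.0]])
b = np.array([1.0, 2.0])
x = iterative_refinement(A, b)
print(np.linalg.norm(A @ x - b))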
2.2 Cholesky Factorization
Matrices that are Hermitian (A^H = A) and positive definite (x^H A x > 0 for all x ≠ 0) occur sufficiently often in practice that it is worth describing a variant of Gaussian elimination that is often used for this class of matrices. Recall that Gaussian elimination amounted to a factorization of a square matrix A into the product of a lower triangular matrix and an upper triangular matrix, i.e., A = LU. The Cholesky factorization represents a Hermitian positive definite matrix A by the product of a lower triangular matrix and its conjugate transpose, i.e., A = LL^H. Because of the symmetries involved, this factorization can be formed in roughly half the number of operations as are needed for Gaussian elimination.

Let us begin by looking at some of the properties of positive definite matrices. If e_i is the i-th column of the identity matrix and A = (a_{ij}) is positive definite, then a_{ii} = e_i^T A e_i > 0, i.e., the diagonal components of A are real and positive. Suppose X is a nonsingular matrix of the same size as the Hermitian, positive definite matrix A. Then

x^H (X^H A X) x = (Xx)^H A (Xx) > 0 for all x ≠ 0.

Thus, A Hermitian positive definite implies that X^H A X is Hermitian positive definite. Conversely, suppose X^H A X is Hermitian positive definite. Then

A = (XX^{-1})^H A (XX^{-1}) = (X^{-1})^H (X^H A X)(X^{-1}) is Hermitian positive definite.
Next we will show that the component of largest magnitude of a Hermitian positive definite matrix always lies on the diagonal. Suppose that |a_{kl}| = max_{i,j} |a_{ij}| and k ≠ l. If a_{kl} = |a_{kl}| e^{iθ_{kl}}, let α = e^{iθ_{kl}} and x = e_k − ᾱ e_l. Then

x^H A x = e_k^T A e_k − ᾱ e_k^T A e_l − α e_l^T A e_k + |α|² e_l^T A e_l = a_{kk} + a_{ll} − 2|a_{kl}| ≤ 0.

This contradicts the fact that A is positive definite. Therefore, max_{i,j} |a_{ij}| = max_i a_{ii}. Suppose we partition the Hermitian positive definite matrix A as follows

A = \begin{pmatrix} B & C^H \\ C & D \end{pmatrix}.

If y is a nonzero vector compatible with D, let x^H = (0, y^H). Then

x^H A x = (0, y^H) \begin{pmatrix} B & C^H \\ C & D \end{pmatrix} \begin{pmatrix} 0 \\ y \end{pmatrix} = y^H D y > 0,

i.e., D is Hermitian positive definite. Similarly, letting x^H = (y^H, 0), we can show that B is Hermitian positive definite.
We will now show that if A is a Hermitian, positive-definite matrix, then there is a unique lower triangular matrix L with positive diagonals such that A = LL^H. This factorization is called the Cholesky factorization. We will establish this result by induction on the dimension n. Clearly, the result is true for n = 1. For in this case we can take L = (√a_{11}). Suppose the result is true for matrices of dimension n − 1. Let A be a Hermitian, positive-definite matrix of dimension n. We can partition A as follows

A = \begin{pmatrix} a_{11} & w^H \\ w & K \end{pmatrix}   (2.22)

where w is a vector of dimension n−1 and K is an (n−1) × (n−1) matrix. It is easily verified that

A = \begin{pmatrix} a_{11} & w^H \\ w & K \end{pmatrix} = B^H \begin{pmatrix} 1 & 0 \\ 0 & K − \frac{w w^H}{a_{11}} \end{pmatrix} B   (2.23)

where

B = \begin{pmatrix} \sqrt{a_{11}} & \frac{w^H}{\sqrt{a_{11}}} \\ 0 & I \end{pmatrix}.   (2.24)

We will first show that the matrix B is invertible. If

Bx = \begin{pmatrix} \sqrt{a_{11}} & \frac{w^H}{\sqrt{a_{11}}} \\ 0 & I \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} \sqrt{a_{11}}\,x_1 + \frac{w^H x_2}{\sqrt{a_{11}}} \\ x_2 \end{pmatrix} = 0,

then x_2 = 0 and √a_{11} x_1 = 0, so x_1 = 0. Therefore, B is invertible. From our discussion at the beginning of this section it follows from equation (2.23) that the matrix

\begin{pmatrix} 1 & 0 \\ 0 & K − \frac{w w^H}{a_{11}} \end{pmatrix}

is Hermitian positive definite. By the results on the partitioning of a positive definite matrix, it follows that the matrix

K − \frac{w w^H}{a_{11}}

is Hermitian positive definite. By the induction hypothesis, there exists a unique lower triangular matrix L̃ with positive diagonals such that

K − \frac{w w^H}{a_{11}} = L̃ L̃^H.   (2.25)
Substituting equation (2.25) into equation (2.23), we get

A = B^H \begin{pmatrix} 1 & 0 \\ 0 & L̃ L̃^H \end{pmatrix} B = B^H \begin{pmatrix} 1 & 0 \\ 0 & L̃ \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & L̃^H \end{pmatrix} B = \begin{pmatrix} \sqrt{a_{11}} & 0 \\ \frac{w}{\sqrt{a_{11}}} & L̃ \end{pmatrix} \begin{pmatrix} \sqrt{a_{11}} & \frac{w^H}{\sqrt{a_{11}}} \\ 0 & L̃^H \end{pmatrix}   (2.26)

which is the desired factorization of A. To show uniqueness, suppose that

A = \begin{pmatrix} a_{11} & w^H \\ w & K \end{pmatrix} = \begin{pmatrix} l_{11} & 0 \\ v & L_1 \end{pmatrix} \begin{pmatrix} l_{11} & v^H \\ 0 & L_1^H \end{pmatrix}   (2.27)

is a Cholesky factorization of A. Equating components in equation (2.27), we see that l_{11}^2 = a_{11} and hence that l_{11} = √a_{11}. Also l_{11} v = w, or v = w/l_{11} = w/√a_{11}. Finally, v v^H + L_1 L_1^H = K, or L_1 L_1^H = K − w w^H/a_{11} = L̃ L̃^H. Since L̃ L̃^H is the unique factorization of the (n−1) × (n−1) Hermitian, positive-definite matrix K − w w^H/a_{11}, we see that the Cholesky factorization of A is unique. It now follows by induction that there is a unique Cholesky factorization of any Hermitian, positive-definite matrix.
The factorization in equation (2.23) is the basis for the computation of the Cholesky factorization. The matrix B^H is lower triangular. Since the matrix K − w w^H/a_{11} is positive definite, it can be factored in the same manner. Continuing in this manner until the center matrix becomes the identity matrix, we obtain lower triangular matrices L_1, ..., L_n such that

A = L_1 ··· L_n L_n^H ··· L_1^H.

Letting L = L_1 ··· L_n, we have the desired Cholesky factorization.

As was mentioned previously, the number of operations in the Cholesky factorization is about half the number in Gaussian elimination. Unlike Gaussian elimination the Cholesky method does not need pivoting in order to maintain stability. The Cholesky factorization can also be written in the form

A = LDL^H

where D is diagonal and L now has all ones on the diagonal.
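A straightforward (unoptimized) sketch of the Cholesky factorization in NumPy. It uses the familiar column-by-column loop rather than the recursive partitioning of the derivation above, and the example matrix is made up for illustration:

import numpy as np

def cholesky(A):
    """Return lower triangular L with A = L L^H for a Hermitian positive definite A."""
    A = np.array(A, dtype=complex)
    n = A.shape[0]
    L = np.zeros((n, n), dtype=complex)
    for j in range(n):
        d = A[j, j] - L[j, :j] @ L[j, :j].conj()
        L[j, j] = np.sqrt(d.real)                      # diagonal of L is real and positive
        for i in range(j + 1, n):
            L[i, j] = (A[i, j] - L[i, :j] @ L[j, :j].conj()) / L[j, j]
    return L

A = np.array([[4.0, 2 + 2j], [2 - 2j, 10.0]])          # Hermitian positive definite example
L = cholesky(A)
print(np.allclose(L @ L.conj().T, A))                   # True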
2.3 Elementary Unitary Matrices and the QR Factorization
In Gaussian elimination we saw that a square matrix A could be reduced to triangular form by multiplying on the left by a series of elementary lower triangular matrices. This process can also be expressed as a factorization A = LU where L is lower triangular and U is upper triangular. In least squares problems the number of rows m in A is usually greater than the number of columns n. The standard technique for solving least-squares problems of this type is to make use of a factorization A = QR where Q is an m × m unitary matrix and R has the form

R = \begin{pmatrix} R̃ \\ 0 \end{pmatrix}

with R̃ an n × n upper triangular matrix. The usual way of obtaining this factorization is to reduce the matrix A to triangular form by multiplying on the left by a series of elementary unitary matrices that are sometimes called Householder matrices (reflectors). We will show how to use this QR factorization to solve least squares problems. If Q̃ is the m × n matrix consisting of the first n columns of Q, then

A = Q̃ R̃.

This factorization is called the reduced QR factorization. Elementary unitary matrices are also used to reduce square matrices to a simplified form (Hessenberg or tridiagonal) prior to eigenvalue calculation.

There are several good computer implementations that use the Householder QR factorization to solve the least squares problem. The LAPACK routine is called SGELS (DGELS, CGELS). In Matlab the solution of the least squares problem is given by A\b. The QR factorization can be obtained with the call [Q,R]=qr(A).
2.3.1 Gram-Schmidt Orthogonalization
A reduced QR factorization can be obtained by an orthogonalization procedure known as the Gram-Schmidt process. Suppose we would like to construct an orthonormal set of vectors q_1, ..., q_n from a given linearly independent set of vectors a_1, ..., a_n. The process is recursive. At the j-th step we construct a unit vector q_j that is orthogonal to q_1, ..., q_{j-1} using

v_j = a_j − \sum_{i=1}^{j-1} (q_i^H a_j) q_i,   q_j = v_j/‖v_j‖.

The orthonormal basis constructed has the additional property

<q_1, ..., q_j> = <a_1, ..., a_j>,   j = 1, 2, ..., n.

If we consider a_1, ..., a_n as columns of a matrix A, then this process is equivalent to the matrix factorization A = Q̃ R̃ where Q̃ = (q_1, ..., q_n) and R̃ is upper triangular. Although the Gram-Schmidt process is very useful in theoretical considerations, it does not lead to a stable numerical procedure. In the next section we will discuss Householder reflectors, which lead to a more stable procedure for obtaining a QR factorization.
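A sketch of the Gram-Schmidt recursion written as a reduced QR factorization (the classical form, kept close to the formulas above; as just noted, Householder QR or the modified Gram-Schmidt variant is preferred in practice). The random test matrix is illustrative only:

import numpy as np

def gram_schmidt_qr(A):
    """Reduced QR via q_j = (a_j - sum_i (q_i^H a_j) q_i) / r_jj."""
    m, n = A.shape
    Q = np.zeros((m, n), dtype=A.dtype)
    R = np.zeros((n, n), dtype=A.dtype)
    for j in range(n):
        v = A[:, j].copy()
        for i in range(j):
            R[i, j] = Q[:, i].conj() @ A[:, j]
            v = v - R[i, j] * Q[:, i]
        R[j, j] = np.linalg.norm(v)
        Q[:, j] = v / R[j, j]
    return Q, R

A = np.random.rand(6, 3)
Q, R = gram_schmidt_qr(A)
print(np.allclose(Q @ R, A), np.allclose(Q.conj().T @ Q, np.eye(3)))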
2.3.2 Householder Reflections
Let us begin by describing the Householder reflectors. In this section we will restrict ourselves to real matrices. Afterwards we will see that there are a number of generalizations to the complex case. If v is a fixed vector of dimension m with ‖v‖ = 1, then the set of all vectors orthogonal to v is an (m−1)-dimensional subspace called a hyperplane. If we denote this hyperplane by H, then

H = {u : v^T u = 0}.   (2.28)

Here v^T denotes the transpose of v. If x is a point not on H, let x̂ denote the orthogonal projection of x onto H (see Figure 2.1). The difference x − x̂ must be orthogonal to H and hence a multiple of v, i.e.,

x − x̂ = αv or x̂ = x − αv.   (2.29)

[Figure 2.1: Householder reflection]

Since x̂ lies on H and v^T v = ‖v‖² = 1, we must have

v^T x̂ = v^T x − α v^T v = v^T x − α = 0.   (2.30)

Thus, α = v^T x and consequently

x̂ = x − (v^T x)v = x − v v^T x = (I − v v^T)x.   (2.31)

Define P = I − v v^T. Then P is a projection matrix that projects vectors orthogonally onto H. The projection x̂ is obtained by going a certain distance from x in the direction −v. Figure 2.1 suggests that the reflection x̃ of x across H can be obtained by going twice that distance in the same direction, i.e.,

x̃ = x − 2(v^T x)v = x − 2v v^T x = (I − 2v v^T)x.   (2.32)

With this motivation we define the Householder reflector Q by

Q = I − 2 v v^T,   ‖v‖ = 1.   (2.33)

An alternate form for the Householder reflector is

Q = I − \frac{2 u u^T}{‖u‖^2}   (2.34)

where here u is not restricted to be a unit vector. Notice that, in this form, replacing u by a multiple of u does not change Q. The matrix Q is clearly symmetric, i.e., Q^T = Q. Moreover,

Q^T Q = Q² = (I − 2v v^T)(I − 2v v^T) = I − 2v v^T − 2v v^T + 4 v v^T v v^T = I,   (2.35)

i.e., Q is an orthogonal matrix. As with all orthogonal matrices, Q preserves the norm of a vector, i.e.,

‖Qx‖² = (Qx)^T Qx = x^T Q^T Q x = x^T x = ‖x‖².   (2.36)

To reduce a matrix to one that is upper triangular it is necessary to zero out columns below a certain position. We will show how to construct a Householder reflector so that its action on a given vector x is a multiple of e_1, the first column of the identity matrix. To zero out a vector below row k we can use a matrix of the form

Q = \begin{pmatrix} I & 0 \\ 0 & Q̂ \end{pmatrix}

where I is the (k−1) × (k−1) identity matrix and Q̂ is an (m−k+1) × (m−k+1) Householder matrix. Thus, for a given vector x we would like to choose a vector u so that Qx is a multiple of the unit vector e_1, i.e.,

Qx = x − \frac{2(u^T x)}{‖u‖^2} u = βe_1.   (2.37)

Since Q preserves norms, we must have |β| = ‖x‖. Therefore, equation (2.37) becomes

Qx = x − \frac{2(u^T x)}{‖u‖^2} u = ±‖x‖ e_1.   (2.38)

It follows from equation (2.38) that u must be a multiple of the vector x ∓ ‖x‖ e_1. Since u can be replaced by a multiple of u without changing Q, we let

u = x ∓ ‖x‖ e_1.   (2.39)

It follows from the definition of u in equation (2.39) that

u^T x = ‖x‖² ∓ ‖x‖ x_1   (2.40)

and

‖u‖² = u^T u = ‖x‖² ∓ ‖x‖x_1 ∓ ‖x‖x_1 + ‖x‖² = 2(‖x‖² ∓ ‖x‖ x_1).   (2.41)

Therefore,

\frac{2(u^T x)}{‖u‖^2} = 1,   (2.42)

and hence Qx becomes

Qx = x − \frac{2(u^T x)}{‖u‖^2} u = x − u = ±‖x‖ e_1   (2.43)

as desired. From what has been discussed so far, either of the signs in equation (2.39) would produce the desired result. However, if x_1 is very large compared to the other components, then it is possible to lose accuracy through subtraction in the computation of u = x ∓ ‖x‖e_1. To prevent this we choose u to be

u = x + sign(x_1)‖x‖ e_1   (2.44)

where sign(x_1) is defined by

sign(x_1) = \begin{cases} 1 & x_1 ≥ 0 \\ −1 & x_1 < 0. \end{cases}   (2.45)

With this choice of u, equation (2.43) becomes

Qx = −sign(x_1)‖x‖ e_1.   (2.46)

In practice, u is often scaled so that u_1 = 1, i.e.,

u = \frac{x + sign(x_1)‖x‖ e_1}{x_1 + sign(x_1)‖x‖}.   (2.47)

With this choice of u,

‖u‖² = \frac{2‖x‖}{‖x‖ + |x_1|}.   (2.48)

The matrix Q applied to a general vector y is given by

Qy = y − 2\frac{u^T y}{‖u‖^2} u.   (2.49)
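A small sketch of equations (2.44)–(2.49) for real vectors (the test vector is arbitrary and chosen only for illustration):

import numpy as np

def householder_vector(x):
    """Return u (scaled so u[0] = 1) with (I - 2 u u^T/||u||^2) x = -sign(x1) ||x|| e_1."""
    x = np.asarray(x, dtype=float)
    sign_x1 = 1.0 if x[0] >= 0 else -1.0
    u = x.copy()
    u[0] = x[0] + sign_x1 * np.linalg.norm(x)
    return u / u[0]                      # scale so that the first component is one

def apply_reflector(u, y):
    """Apply Q = I - 2 u u^T / ||u||^2 to a vector y, cf. equation (2.49)."""
    return y - 2.0 * (u @ y) / (u @ u) * u

x = np.array([3.0, 4.0, 0.0])
u = householder_vector(x)
print(apply_reflector(u, x))             # approximately [-5, 0, 0]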
2.3.3 Complex Householder Matrices
There are several ways to generalize Householder matrices to the complex case. The most obvious is to let

U = I − 2\frac{u u^H}{‖u‖^2}

where the superscript H denotes conjugate transpose. It can be shown that a matrix of this form is both Hermitian (U = U^H) and unitary (U^H U = I). However, it is sometimes convenient to be able to construct a U such that U^H x is a real multiple of e_1. This is especially true when converting a Hermitian matrix to tridiagonal form prior to an eigenvalue computation. For in this case the tridiagonal matrix becomes a real symmetric matrix even when starting with a complex Hermitian matrix. Thus, it is not necessary to have a separate eigenvalue routine for the complex case. It turns out that there is no Hermitian unitary matrix U, as defined above, that is guaranteed to produce a real multiple of e_1. Therefore, linear algebra libraries such as LAPACK use elementary unitary matrices of the form

U = I − τ w w^H   (2.50)

where τ can be complex. These matrices are not in general Hermitian. If U is to be unitary, we must have

I = U^H U = (I − τ̄ w w^H)(I − τ w w^H) = I − (τ + τ̄ − |τ|² ‖w‖²) w w^H

and hence

|τ|² ‖w‖² = 2 Re(τ).   (2.51)

Notice that replacing w by w/μ and τ by |μ|² τ in equation (2.50) leaves U unchanged. Thus, a scaling of w can be absorbed in τ. We would like to choose w and τ so that

U^H x = x − τ̄ (w^H x) w = γ‖x‖ e_1   (2.52)

where γ = ±1. It can be seen from equation (2.52) that w must be proportional to the vector x − γ‖x‖ e_1. Since the factor of proportionality can be absorbed in τ, we choose

w = x − γ‖x‖ e_1.   (2.53)

Substituting this expression for w into equation (2.52), we get

U^H x = x − τ̄(w^H x)(x − γ‖x‖e_1) = (1 − τ̄ w^H x)x + τ̄ γ(w^H x)‖x‖ e_1 = γ‖x‖ e_1.   (2.54)

Thus, we must have

τ̄ (w^H x) = 1 or τ = \frac{1}{x^H w}.   (2.55)

This choice of τ gives

U^H x = γ‖x‖ e_1.

It follows from equation (2.53) that

x^H w = ‖x‖² − γ‖x‖ x̄_1   (2.56)

and

‖w‖² = (x^H − γ‖x‖e_1^T)(x − γ‖x‖e_1) = ‖x‖² − γ‖x‖x̄_1 − γ‖x‖x_1 + ‖x‖² = 2(‖x‖² − γ‖x‖ Re(x_1)).   (2.57)
Thus, it follows from equations (2.55)–(2.57) that

\frac{2 Re(τ)}{|τ|^2} = \frac{τ + τ̄}{τ τ̄} = \frac{1}{τ̄} + \frac{1}{τ} = x^H w + w^H x = (‖x‖² − γ‖x‖x̄_1) + (‖x‖² − γ‖x‖x_1) = 2‖x‖² − 2γ‖x‖ Re(x_1) = ‖w‖²,

i.e., the condition in equation (2.51) is satisfied. It follows that the matrix U defined by equation (2.50) is unitary when w is defined by equation (2.53) and τ is defined by equation (2.55). As before we choose γ to prevent the loss of accuracy due to subtraction in equation (2.53). In this case we choose γ = −sign(Re(x_1)). Thus, w becomes

w = x + sign(Re(x_1))‖x‖ e_1.   (2.58)

Let us define a real constant ν by

ν = sign(Re(x_1))‖x‖.   (2.59)

With this definition w becomes

w = x + νe_1.   (2.60)

It follows that

x^H w = ‖x‖² + ν x̄_1 = ν² + ν x̄_1 = ν(ν + x̄_1),   (2.61)

and hence

τ = \frac{1}{ν(ν + x̄_1)}.   (2.62)

In LAPACK w is scaled so that w_1 = 1, i.e.,

w = \frac{x + νe_1}{x_1 + ν}.   (2.63)

With this w, τ becomes

τ = \frac{|x_1 + ν|^2}{ν(ν + x̄_1)} = \frac{(x_1 + ν)(x̄_1 + ν)}{ν(ν + x̄_1)} = \frac{x_1 + ν}{ν}.   (2.64)

Clearly this τ satisfies the inequality

|τ − 1| = \frac{|x_1|}{|ν|} = \frac{|x_1|}{‖x‖} ≤ 1.   (2.65)

It follows from equation (2.64) that τ is real when x_1 is real. Thus, U is Hermitian when x_1 is real.
An alternate approach to defining a complex Householder matrix is to let

U = I − \frac{2 w w^H}{‖w‖^2}.   (2.66)

This U is Hermitian and

U^H U = \Bigl(I − \frac{2ww^H}{‖w‖^2}\Bigr)\Bigl(I − \frac{2ww^H}{‖w‖^2}\Bigr) = I − \frac{2ww^H}{‖w‖^2} − \frac{2ww^H}{‖w‖^2} + \frac{4‖w‖^2 ww^H}{‖w‖^4} = I,   (2.67)

i.e., U is unitary. We want to choose w so that

U^H x = Ux = x − \frac{2 w^H x}{‖w‖^2} w = γ‖x‖ e_1   (2.68)

where |γ| = 1. Multiplying equation (2.68) by x^H, we get

x^H U x = x^H U^H x = \overline{x^H U x} = γ‖x‖ x̄_1.   (2.69)

Since x^H U x is real, it follows that γ x̄_1 is real. If x_1 = |x_1| e^{iθ_1}, then γ must have the form

γ = ±e^{iθ_1}.   (2.70)

It follows from equation (2.68) that w must be proportional to the vector x ∓ e^{iθ_1}‖x‖ e_1. Since multiplying w by a constant factor doesn't change U, we take

w = x ∓ e^{iθ_1}‖x‖ e_1.   (2.71)

Again, to avoid accuracy problems, we choose the plus sign in the above formula, i.e.,

w = x + e^{iθ_1}‖x‖ e_1.   (2.72)

It follows from this definition that

‖w‖² = (x^H + e^{-iθ_1}‖x‖e_1^T)(x + e^{iθ_1}‖x‖e_1) = ‖x‖² + |x_1|‖x‖ + |x_1|‖x‖ + ‖x‖² = 2‖x‖(‖x‖ + |x_1|)   (2.73)

and

w^H x = (x^H + e^{-iθ_1}‖x‖e_1^T)x = ‖x‖² + e^{-iθ_1}‖x‖ x_1 = ‖x‖(‖x‖ + |x_1|).   (2.74)

Therefore,

\frac{2 w^H x}{‖w‖^2} = 1,   (2.75)

and hence

Ux = x − w = x − (x + e^{iθ_1}‖x‖e_1) = −e^{iθ_1}‖x‖ e_1.   (2.76)

This alternate form for the Householder matrix has the advantage that it is Hermitian and that the multiplier of w w^H is real. However, it can't in general map a given vector x into a real multiple of e_1. Both EISPACK and LINPACK use elementary unitary matrices similar to this. The LAPACK form is not Hermitian, involves a complex multiplier of w w^H, but can produce a real multiple of e_1 when acting on x. As stated before, this can be a big advantage when reducing matrices to triangular form prior to an eigenvalue computation.
2.3.4 Givens Rotations
Householder matrices are very good at producing long strings of zeroes in a row or column. Sometimes, however, we want to produce a zero in a matrix while altering as little of the matrix as possible. This is true when dealing with matrices that are very sparse (most of the elements are already zero) or when performing many operations in parallel. The Givens rotations can sometimes be used for this purpose. We will begin by considering the case where all matrices and vectors are real. The complex case will be considered in the next section.

The two-dimensional matrix

R = \begin{pmatrix} \cos θ & \sin θ \\ −\sin θ & \cos θ \end{pmatrix}

rotates a 2-vector through an angle θ. If we let c = cos θ and s = sin θ, then the matrix R can be written as

R = \begin{pmatrix} c & s \\ −s & c \end{pmatrix}

where c² + s² = 1. If x is a 2-vector, we can determine c and s so that Rx is a multiple of e_1. Since

Rx = \begin{pmatrix} c x_1 + s x_2 \\ −s x_1 + c x_2 \end{pmatrix},

R will have the desired property if c = x_1/\sqrt{x_1^2 + x_2^2} and s = x_2/\sqrt{x_1^2 + x_2^2}. In fact Rx = \sqrt{x_1^2 + x_2^2}\, e_1.

Givens matrices are an extension of this two-dimensional rotation to higher dimensions. For j > i, the Givens matrix G(i,j) is an m × m matrix that performs a rotation in the (i,j) coordinate plane. It can be obtained by replacing the (i,i) and (j,j) components of the m × m identity matrix by c, the (i,j) component by s and the (j,i) component by −s. It has the matrix form

G(i,j) = \begin{pmatrix} 1 & & & & & \\ & \ddots & & & & \\ & & c & \cdots & s & \\ & & \vdots & \ddots & \vdots & \\ & & −s & \cdots & c & \\ & & & & & 1 \end{pmatrix} \begin{matrix} \\ \\ \text{row } i \\ \\ \text{row } j \\ \\ \end{matrix}   (2.77)

where c² + s² = 1 (the entries c, ±s sit in rows and columns i and j). The matrix G(i,j) is clearly orthogonal. In terms of components

G(i,j)_{kl} = \begin{cases} 1 & k = l,\ k ≠ i \text{ and } k ≠ j \\ c & k = l,\ k = i \text{ or } k = j \\ s & k = i,\ l = j \\ −s & k = j,\ l = i \\ 0 & \text{otherwise.} \end{cases}   (2.78)

Multiplying a vector by G(i,j) only affects the i and j components. If y = G(i,j)x, then

y_k = \begin{cases} x_k & k ≠ i \text{ and } k ≠ j \\ c x_i + s x_j & k = i \\ −s x_i + c x_j & k = j. \end{cases}   (2.79)

Suppose we want to make y_j = 0. We can do this by setting

c = \frac{x_i}{\sqrt{x_i^2 + x_j^2}} and s = \frac{x_j}{\sqrt{x_i^2 + x_j^2}}.   (2.80)

With this choice for c and s, y becomes

y_k = \begin{cases} x_k & k ≠ i \text{ and } k ≠ j \\ \sqrt{x_i^2 + x_j^2} & k = i \\ 0 & k = j. \end{cases}   (2.81)

Multiplying a matrix A on the left by G(i,j) only alters rows i and j of A. Similarly, multiplying A on the right by G(i,j) only alters columns i and j of A.
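A sketch of the real Givens rotation defined by equations (2.79)–(2.80); the vector and indices below are illustrative only:

import numpy as np

def givens(xi, xj):
    """Return c, s so that [[c, s], [-s, c]] applied to (xi, xj) gives (r, 0)."""
    r = np.hypot(xi, xj)
    if r == 0.0:
        return 1.0, 0.0
    return xi / r, xj / r

def apply_givens(x, i, j, c, s):
    """Rotate components i and j of x, leaving the rest unchanged (cf. equation (2.79))."""
    y = x.copy()
    y[i] = c * x[i] + s * x[j]
    y[j] = -s * x[i] + c * x[j]
    return y

x = np.array([1.0, 2.0, 2.0])
c, s = givens(x[0], x[2])
print(apply_givens(x, 0, 2, c, s))       # third component becomes 0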
2.3.5 Complex Givens Rotations
For the complex case we replace R in the previous section by

R = \begin{pmatrix} c & s \\ −s̄ & c \end{pmatrix} where c is real.   (2.82)

It can be easily verified that R is unitary if and only if c and s satisfy

c² + |s|² = 1.

Given a 2-vector x, we want to choose R so that Rx is a multiple of e_1. For R unitary, we must have

Rx = γ‖x‖ e_1 where |γ| = 1.   (2.83)

Multiplying equation (2.83) by R^H, we get

x = R^H R x = γ‖x‖ R^H e_1 = γ‖x‖ \begin{pmatrix} c \\ s̄ \end{pmatrix}   (2.84)

or

c = \frac{x_1}{γ‖x‖} and s̄ = \frac{x_2}{γ‖x‖}.   (2.85)

We define sign(u) for u complex by

sign(u) = \begin{cases} u/|u| & u ≠ 0 \\ 1 & u = 0. \end{cases}   (2.86)

If c is to be real, γ must have the form

γ = ε sign(x_1),   ε = ±1.

With this choice of γ, c and s become

c = \frac{|x_1|}{ε‖x‖} and s̄ = \frac{x_2}{ε\,\mathrm{sign}(x_1)\,‖x‖}.   (2.87)

If we want the complex case to reduce to the real case when x_1 and x_2 are real, then we can choose ε = sign(Re(x_1)). As before, we can construct G(i,j) by replacing the (i,i) and (j,j) components of the identity matrix by c, the (i,j) component by s, and the (j,i) component by −s̄. In the expressions for c and s in equation (2.87), we replace x_1 by x_i, x_2 by x_j, and ‖x‖ by \sqrt{|x_i|^2 + |x_j|^2}.
2.3.6 QR Factorization Using Householder Reflectors
Let A be an m × n matrix with m > n. Let Q_1 be a Householder matrix that maps the first column of A into a multiple of e_1. Then Q_1 A will have zeroes below the diagonal in the first column. Now let

Q_2 = \begin{pmatrix} 1 & 0 \\ 0 & Q̂_2 \end{pmatrix}

where Q̂_2 is an (m−1) × (m−1) Householder matrix that will zero out the entries below the diagonal in the second column of Q_1 A. Continuing in this manner, we can construct Q_2, ..., Q_n so that

Q_n ··· Q_1 A = \begin{pmatrix} R̃ \\ 0 \end{pmatrix}   (2.88)

where R̃ is an n × n upper triangular matrix. The matrices Q_k have the form

Q_k = \begin{pmatrix} I & 0 \\ 0 & Q̂_k \end{pmatrix}   (2.89)

where Q̂_k is an (m−k+1) × (m−k+1) Householder matrix. If we define

Q^H = Q_n ··· Q_1 and R = \begin{pmatrix} R̃ \\ 0 \end{pmatrix},   (2.90)

then equation (2.88) can be written

Q^H A = R.   (2.91)

Moreover, since each Q_k is unitary, we have

Q^H Q = (Q_n ··· Q_1)(Q_1^H ··· Q_n^H) = I,   (2.92)

i.e., Q is unitary. Therefore, equation (2.91) can be written

A = QR.   (2.93)

Equation (2.93) is the desired factorization. The operations count for this factorization is approximately mn², where an operation is an addition and a multiplication. In practice it is not necessary to construct the matrix Q explicitly. Usually only the vectors defining each Q_k are saved.

If Q̃ is the matrix consisting of the first n columns of Q, then

A = Q̃ R̃   (2.94)

where Q̃ is an m × n matrix with orthonormal columns and R̃ is an n × n upper triangular matrix. The factorization in equation (2.94) is the reduced QR factorization.
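A compact sketch of this reduction in NumPy. For clarity it accumulates Q as a dense matrix, which production codes avoid (they store only the reflector vectors, as noted above); the random test matrix is purely illustrative:

import numpy as np

def householder_qr(A):
    """Factor A (m x n, m >= n) as A = Q R using Householder reflectors."""
    A = np.array(A, dtype=float)
    m, n = A.shape
    R = A.copy()
    Q = np.eye(m)
    for k in range(n):
        x = R[k:, k]
        normx = np.linalg.norm(x)
        if normx == 0.0:
            continue
        u = x.copy()
        u[0] += normx if x[0] >= 0 else -normx   # u = x + sign(x1)||x|| e1
        u = u / np.linalg.norm(u)
        # Apply Q_k = I - 2 u u^T to the trailing block of R and accumulate Q.
        R[k:, k:] -= 2.0 * np.outer(u, u @ R[k:, k:])
        Q[:, k:]  -= 2.0 * np.outer(Q[:, k:] @ u, u)
    return Q, R      # entries of R below the diagonal are zero only to roundoff

A = np.random.rand(5, 3)
Q, R = householder_qr(A)
print(np.allclose(Q @ R, A), np.allclose(Q.T @ Q, np.eye(5)))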
2.3.7 Uniqueness of the Reduced QR Factorization
In this section we will show that a matrix A of full rank has a unique reduced QR factorization if we require that the triangular matrix R has positive diagonals. All other reduced QR factorizations of A are simply related to this one with positive diagonals.

The reduced QR factorization can be written

A = (a_1, a_2, ..., a_n) = (q_1, q_2, ..., q_n) \begin{pmatrix} r_{11} & r_{12} & \cdots & r_{1n} \\ & r_{22} & \cdots & r_{2n} \\ & & \ddots & \vdots \\ & & & r_{nn} \end{pmatrix}.   (2.95)

If A has full rank, then all of the diagonal elements r_{jj} must be nonzero. Equating columns in equation (2.95), we get

a_j = \sum_{k=1}^{j} r_{kj} q_k = r_{jj} q_j + \sum_{k=1}^{j-1} r_{kj} q_k

or

q_j = \frac{1}{r_{jj}} \Bigl( a_j − \sum_{k=1}^{j-1} r_{kj} q_k \Bigr).   (2.96)

When j = 1 equation (2.96) reduces to

q_1 = \frac{a_1}{r_{11}}.   (2.97)

Since q_1 must have unit norm, it follows that

|r_{11}| = ‖a_1‖.   (2.98)

Equations (2.97) and (2.98) determine q_1 and r_{11} up to a factor having absolute value one, i.e., there is a d_1 with |d_1| = 1 such that

r_{11} = d_1 r̂_{11},   q_1 = q̂_1/d_1,   where r̂_{11} = ‖a_1‖ and q̂_1 = a_1/r̂_{11}.

For j = 2, equation (2.96) becomes

q_2 = \frac{1}{r_{22}} (a_2 − r_{12} q_1).

Since the columns q_1 and q_2 must be orthonormal, it follows that

0 = q_1^H q_2 = \frac{1}{r_{22}} (q_1^H a_2 − r_{12})

and hence that

r_{12} = q_1^H a_2 = d_1 q̂_1^H a_2.   (2.99)

Here we have used the fact that d̄_1 = 1/d_1. Since q_2 has unit norm, it follows that

1 = ‖q_2‖ = \frac{1}{|r_{22}|} ‖a_2 − r_{12} q_1‖ = \frac{1}{|r_{22}|} ‖a_2 − (d_1 q̂_1^H a_2) q̂_1/d_1‖ = \frac{1}{|r_{22}|} ‖a_2 − (q̂_1^H a_2) q̂_1‖

and hence that

|r_{22}| = ‖a_2 − (q̂_1^H a_2) q̂_1‖ ≡ r̂_{22}.

Therefore, there exists a scalar d_2 with |d_2| = 1 such that

r_{22} = d_2 r̂_{22} and q_2 = q̂_2/d_2 where q̂_2 = [a_2 − (q̂_1^H a_2) q̂_1]/r̂_{22}.

For j = 3, equation (2.96) becomes

q_3 = \frac{1}{r_{33}} (a_3 − r_{13} q_1 − r_{23} q_2).

Since the columns q_1, q_2 and q_3 must be orthonormal, it follows that

0 = q_1^H q_3 = \frac{1}{r_{33}} (q_1^H a_3 − r_{13}),   0 = q_2^H q_3 = \frac{1}{r_{33}} (q_2^H a_3 − r_{23})

and hence that

r_{13} = q_1^H a_3 = d_1 q̂_1^H a_3,   r_{23} = q_2^H a_3 = d_2 q̂_2^H a_3.

Since q_3 has unit norm, it follows that

1 = ‖q_3‖ = \frac{1}{|r_{33}|} ‖a_3 − r_{13} q_1 − r_{23} q_2‖ = \frac{1}{|r_{33}|} ‖a_3 − (q̂_1^H a_3) q̂_1 − (q̂_2^H a_3) q̂_2‖

and hence that

|r_{33}| = ‖a_3 − (q̂_1^H a_3) q̂_1 − (q̂_2^H a_3) q̂_2‖ ≡ r̂_{33}.

Therefore, there exists a scalar d_3 with |d_3| = 1 such that

r_{33} = d_3 r̂_{33} and q_3 = q̂_3/d_3   (2.100)

where q̂_3 = [a_3 − (q̂_1^H a_3) q̂_1 − (q̂_2^H a_3) q̂_2]/r̂_{33}. Continuing in this way we obtain the matrix Q̂ = (q̂_1, ..., q̂_n) with orthonormal columns and the triangular matrix

R̂ = \begin{pmatrix} r̂_{11} & r̂_{12} & \cdots & r̂_{1n} \\ & r̂_{22} & \cdots & r̂_{2n} \\ & & \ddots & \vdots \\ & & & r̂_{nn} \end{pmatrix}

such that A = Q̂ R̂ is the unique reduced QR factorization of A with R̂ having positive diagonal elements. If A = QR is any other reduced QR factorization of A, then

R = \begin{pmatrix} d_1 & & \\ & \ddots & \\ & & d_n \end{pmatrix} R̂

and

Q = Q̂ \begin{pmatrix} 1/d_1 & & \\ & \ddots & \\ & & 1/d_n \end{pmatrix} = Q̂ \begin{pmatrix} d̄_1 & & \\ & \ddots & \\ & & d̄_n \end{pmatrix}

where |d_1| = ··· = |d_n| = 1.
2.3.8 Solution of Least Squares Problems
In this section we will show how to use the QR factorization to solve the least squares problem. Consider the system of linear equations

Ax = b   (2.101)

where A is an m × n matrix with m > n. In general there is no solution to this system of equations. Instead we seek to find an x so that ‖Ax − b‖ is as small as possible. In view of the QR factorization, we have

‖Ax − b‖² = ‖QRx − b‖² = ‖Q(Rx − Q^H b)‖² = ‖Rx − Q^H b‖².   (2.102)

We can write Q in the partitioned form Q = (Q_1, Q_2) where Q_1 is an m × n matrix. Then

Rx − Q^H b = \begin{pmatrix} R̃x \\ 0 \end{pmatrix} − \begin{pmatrix} Q_1^H b \\ Q_2^H b \end{pmatrix} = \begin{pmatrix} R̃x − Q_1^H b \\ −Q_2^H b \end{pmatrix}.   (2.103)

It follows from equation (2.103) that

‖Rx − Q^H b‖² = ‖R̃x − Q_1^H b‖² + ‖Q_2^H b‖².   (2.104)

Combining equations (2.102) and (2.104), we get

‖Ax − b‖² = ‖R̃x − Q_1^H b‖² + ‖Q_2^H b‖².   (2.105)

It can be easily seen from this equation that ‖Ax − b‖ is minimized when x is the solution of the triangular system

R̃x = Q_1^H b   (2.106)

when such a solution exists. This is the standard way of solving least squares systems. Later we will discuss the singular value decomposition (SVD) that will provide even more information relative to the least squares problem. However, the SVD is much more expensive to compute than the QR decomposition.
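A sketch of equation (2.106) using NumPy's built-in QR (np.linalg.qr returns the reduced factorization by default); the data below are random and purely illustrative:

import numpy as np

A = np.random.rand(8, 3)
b = np.random.rand(8)

Q1, R1 = np.linalg.qr(A, mode='reduced')     # Q1 is 8 x 3, R1 is 3 x 3
x = np.linalg.solve(R1, Q1.T @ b)            # solve the triangular system R1 x = Q1^H b

print(np.allclose(x, np.linalg.lstsq(A, b, rcond=None)[0]))   # agrees with a library least-squares solver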
2.4 The Singular Value Decomposition
The Singular Value Decomposition (SVD) is one of the most important and probably one of the least well known of the matrix factorizations. It has many applications in statistics, signal processing, image compression, pattern recognition, weather prediction, and modal analysis, to name a few. It is also a powerful diagnostic tool. For example, it provides approximations to the rank and the condition number of a matrix as well as providing orthonormal bases for both the range and the null space of a matrix. It also provides optimal low rank approximations to a matrix. The SVD is applicable to both square and rectangular matrices. In this regard it provides a general solution to the least squares problem.

The SVD was first discovered by differential geometers in connection with the analysis of bilinear forms. Eugenio Beltrami [1] (1873) and Camille Jordan [10] (1874) independently discovered that the singular values of the matrix associated with a bilinear form comprise a complete set of invariants for the form under orthogonal substitutions. The first proof of the singular value decomposition for rectangular and complex matrices seems to be by Eckart and Young [5] in 1939. They saw it as a generalization of the principal axis transformation for Hermitian matrices.

We will begin by deriving the SVD and presenting some of its most important properties. We will then discuss its application to least squares problems and matrix approximation problems. Following this we will show how singular values can be used to determine the condition of a matrix (how close the rows or columns are to being linearly dependent). We will conclude with a brief outline of the methods used to compute the SVD. Most of the methods are modifications of methods used to compute eigenvalues and vectors of a square matrix. The details of the computational methods are beyond the scope of this presentation, but we will provide references for those interested.
2.4.1 Derivation and Properties of the SVD
Theorem 1. (Singular Value Decomposition) Let A be a nonzero m × n matrix. Then there exists an orthonormal basis u_1, ..., u_m of m-vectors, an orthonormal basis v_1, ..., v_n of n-vectors, and positive numbers σ_1, ..., σ_r such that

1. u_1, ..., u_r is a basis of the range of A
2. v_{r+1}, ..., v_n is a basis of the null space of A
3. A = \sum_{k=1}^{r} σ_k u_k v_k^H.

Proof: A^H A is a Hermitian n × n matrix that is positive semidefinite. Therefore, there is an orthonormal basis v_1, ..., v_n and nonnegative numbers σ_1², ..., σ_n² such that

A^H A v_k = σ_k² v_k,   k = 1, ..., n.   (2.107)

Since A is nonzero, at least one of the eigenvalues σ_k² must be positive. Let the eigenvalues be arranged so that σ_1² ≥ σ_2² ≥ ··· ≥ σ_r² > 0 and σ_{r+1}² = ··· = σ_n² = 0. Consider now the vectors Av_1, ..., Av_n. We have

(Av_i)^H (Av_j) = v_i^H A^H A v_j = σ_j² v_i^H v_j = 0,   i ≠ j,   (2.108)

i.e., Av_1, ..., Av_n are orthogonal. When i = j

‖Av_i‖² = v_i^H A^H A v_i = σ_i² v_i^H v_i = σ_i² > 0 for i = 1, ..., r and = 0 for i > r.   (2.109)

Thus, Av_{r+1} = ··· = Av_n = 0 and hence v_{r+1}, ..., v_n belong to the null space of A. Define u_1, ..., u_r by

u_i = (1/σ_i) A v_i,   i = 1, ..., r.   (2.110)

Then u_1, ..., u_r is an orthonormal set of vectors in the range of A that span the range of A. Thus, u_1, ..., u_r is a basis for the range of A. The dimension r of the range of A is called the rank of A. If r < m, we can extend the set u_1, ..., u_r of orthonormal vectors to an orthonormal basis u_1, ..., u_m of m-space using the Gram-Schmidt process. If x is an n-vector, we can write x in terms of the basis v_1, ..., v_n as

x = \sum_{k=1}^{n} (v_k^H x) v_k.   (2.111)

It follows from equations (2.110) and (2.111) that

Ax = \sum_{k=1}^{n} (v_k^H x) A v_k = \sum_{k=1}^{r} (v_k^H x) σ_k u_k = \sum_{k=1}^{r} σ_k u_k v_k^H x.   (2.112)

Since x in equation (2.112) was arbitrary, we must have

A = \sum_{k=1}^{r} σ_k u_k v_k^H.   (2.113)

The representation of A in equation (2.113) is called the singular value decomposition (SVD). If x belongs to the null space of A (Ax = 0), then it follows from equation (2.112) and the linear independence of the vectors u_1, ..., u_r that v_k^H x = 0 for k = 1, ..., r. It then follows from equation (2.111) that

x = \sum_{k=r+1}^{n} (v_k^H x) v_k,

i.e., v_{r+1}, ..., v_n span the null space of A. Since v_{r+1}, ..., v_n are orthonormal vectors belonging to the null space of A, they form a basis for the null space of A.

We will now express the SVD in matrix form. Define U = (u_1, ..., u_m), V = (v_1, ..., v_n), and S = diag(σ_1, ..., σ_r). If r < min(m, n), then the SVD can be written in the matrix form

A = U \begin{pmatrix} S & 0 \\ 0 & 0 \end{pmatrix} V^H.   (2.114)

If r = m < n, then the SVD can be written in the matrix form

A = U \begin{pmatrix} S & 0 \end{pmatrix} V^H.   (2.115)

If r = n < m, then the SVD can be written in the matrix form

A = U \begin{pmatrix} S \\ 0 \end{pmatrix} V^H.   (2.116)

If r = m = n, then the SVD can be written in the matrix form

A = U S V^H.   (2.117)

Generally we write the SVD in the form (2.114) with the understanding that some of the zero portions might collapse and disappear.
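A short numerical check of these statements with NumPy's SVD routine (the matrix below is an arbitrary example):

import numpy as np

A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])                  # 2 x 3, rank 2

U, sigma, Vh = np.linalg.svd(A)                  # full SVD
r = np.sum(sigma > 1e-12)                        # numerical rank

# Rebuild A from the sum A = sum_k sigma_k u_k v_k^H (equation (2.113)).
A_rebuilt = sum(sigma[k] * np.outer(U[:, k], Vh[k, :]) for k in range(r))
print(np.allclose(A_rebuilt, A))

# The columns v_(r+1), ..., v_n span the null space of A.
null_vectors = Vh[r:, :].conj().T
print(np.allclose(A @ null_vectors, 0.0))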
We next give a geometric interpretation of the SVD. For this purpose we will restrict ourselves to the real case. Let x be a point on the unit sphere, i.e., ‖x‖ = 1. Since u_1, ..., u_r is a basis for the range of A, there exist numbers y_1, ..., y_r such that

Ax = \sum_{k=1}^{r} y_k u_k = \sum_{k=1}^{r} σ_k (v_k^T x) u_k.

Therefore, y_k = σ_k (v_k^T x), k = 1, ..., r. Since the columns of V form an orthonormal basis, we have

x = \sum_{k=1}^{n} (v_k^T x) v_k.

Therefore,

‖x‖² = \sum_{k=1}^{n} (v_k^T x)² = 1.

It follows that

\frac{y_1^2}{σ_1^2} + ··· + \frac{y_r^2}{σ_r^2} = (v_1^T x)² + ··· + (v_r^T x)² ≤ 1.

Here equality holds when r = n. Thus, the image Ax of x lies on or interior to the hyperellipsoid with semiaxes σ_1 u_1, ..., σ_r u_r. Conversely, if y_1, ..., y_r satisfy

\frac{y_1^2}{σ_1^2} + ··· + \frac{y_r^2}{σ_r^2} ≤ 1,

we define

β² = 1 − \sum_{k=1}^{r} (y_k/σ_k)²

and

x = \sum_{k=1}^{r} \frac{y_k}{σ_k} v_k + β v_{r+1}.

Since v_{r+1} is in the null space of A and A v_k = σ_k u_k (k ≤ r), it follows that

Ax = \sum_{k=1}^{r} \frac{y_k}{σ_k} A v_k + β A v_{r+1} = \sum_{k=1}^{r} y_k u_k.

In addition,

‖x‖² = \sum_{k=1}^{r} \frac{y_k^2}{σ_k^2} + β² = 1.

Thus, we have shown that the image of the unit sphere ‖x‖ = 1 under the mapping A is the hyperellipsoid

\frac{y_1^2}{σ_1^2} + ··· + \frac{y_r^2}{σ_r^2} ≤ 1

relative to the basis u_1, ..., u_r. When r = n, equality holds and the image is the surface of the hyperellipsoid

\frac{y_1^2}{σ_1^2} + ··· + \frac{y_n^2}{σ_n^2} = 1.
2.4.2 The SVD and Least Squares Problems

In least squares problems we seek an $x$ that minimizes $\|Ax - b\|$. In view of the singular value decomposition, we have
$$\|Ax - b\|^2 = \left\| U \begin{pmatrix} S & 0 \\ 0 & 0 \end{pmatrix} V^H x - b \right\|^2 = \left\| U \left[ \begin{pmatrix} S & 0 \\ 0 & 0 \end{pmatrix} V^H x - U^H b \right] \right\|^2 = \left\| \begin{pmatrix} S & 0 \\ 0 & 0 \end{pmatrix} V^H x - U^H b \right\|^2. \tag{2.118}$$
If we define
$$y = \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = V^H x \tag{2.119}$$
$$\tilde b = \begin{pmatrix} \tilde b_1 \\ \tilde b_2 \end{pmatrix} = U^H b, \tag{2.120}$$
then equation (2.118) can be written
$$\|Ax - b\|^2 = \left\| \begin{pmatrix} S y_1 - \tilde b_1 \\ -\tilde b_2 \end{pmatrix} \right\|^2 = \|S y_1 - \tilde b_1\|^2 + \|\tilde b_2\|^2. \tag{2.121}$$
It is clear from equation (2.121) that $\|Ax - b\|$ is minimized when $y_1 = S^{-1} \tilde b_1$. Therefore, the $y$ that minimizes $\|Ax - b\|$ is given by
$$y = \begin{pmatrix} S^{-1} \tilde b_1 \\ y_2 \end{pmatrix}, \qquad y_2 \text{ arbitrary}. \tag{2.122}$$
In view of equation (2.119), the $x$ that minimizes $\|Ax - b\|$ is given by
$$x = V y = V \begin{pmatrix} S^{-1} \tilde b_1 \\ y_2 \end{pmatrix}, \qquad y_2 \text{ arbitrary}. \tag{2.123}$$
Since $V$ is unitary, it follows from equation (2.123) that
$$\|x\|^2 = \|S^{-1} \tilde b_1\|^2 + \|y_2\|^2.$$
Thus, there is a unique $x$ of minimum norm that minimizes $\|Ax - b\|$, namely the $x$ corresponding to $y_2 = 0$. This $x$ is given by
$$x = V \begin{pmatrix} S^{-1} \tilde b_1 \\ 0 \end{pmatrix} = V \begin{pmatrix} S^{-1} & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} \tilde b_1 \\ \tilde b_2 \end{pmatrix} = V \begin{pmatrix} S^{-1} & 0 \\ 0 & 0 \end{pmatrix} U^H b.$$
The matrix multiplying $b$ on the right-hand side of this equation is called the generalized inverse of $A$ and is denoted by $A^{+}$, i.e.,
$$A^{+} = V \begin{pmatrix} S^{-1} & 0 \\ 0 & 0 \end{pmatrix} U^H. \tag{2.124}$$
Thus, the minimum norm solution of the least squares problem is given by $x = A^{+} b$. The $n \times m$ matrix $A^{+}$ plays the same role in least squares problems that $A^{-1}$ plays in the solution of linear equations. We will now show that this definition of the generalized inverse gives the same result as the classical Moore-Penrose conditions.
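The construction of $x = A^{+}b$ translates directly into a few lines of code. The sketch below (Python/NumPy, offered only as an illustrative aside and not part of the original notes) forms the minimum-norm least squares solution from the reduced SVD as in equations (2.119)-(2.124); the tolerance used to decide the numerical rank is an arbitrary choice.

```python
import numpy as np

def pinv_solve(A, b, tol=1e-12):
    """Minimum-norm least squares solution x = A^+ b built from the SVD:
    invert the nonzero singular values and discard the rest."""
    U, s, Vh = np.linalg.svd(A, full_matrices=False)   # A = U @ diag(s) @ Vh
    b_tilde = U.conj().T @ b                           # b~ = U^H b
    r = np.sum(s > tol * s[0])                         # numerical rank
    y1 = b_tilde[:r] / s[:r]                           # y1 = S^{-1} b~_1
    return Vh[:r, :].conj().T @ y1                     # x = V [y1; 0]

# Example: a rank-deficient least squares problem
A = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])    # rank 1
b = np.array([1.0, 2.0, 3.0])
x = pinv_solve(A, b)
print(x, np.allclose(x, np.linalg.pinv(A) @ b))        # agrees with the built-in pseudoinverse
```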
Theorem 2. If $A$ has a singular value decomposition given by
$$A = U \begin{pmatrix} S & 0 \\ 0 & 0 \end{pmatrix} V^H,$$
then the matrix $X$ defined by
$$X = A^{+} = V \begin{pmatrix} S^{-1} & 0 \\ 0 & 0 \end{pmatrix} U^H$$
is the unique solution of the Moore-Penrose conditions:

1. $AXA = A$
2. $XAX = X$
3. $(AX)^H = AX$
4. $(XA)^H = XA$.
Proof:
$$AXA = U \begin{pmatrix} S & 0 \\ 0 & 0 \end{pmatrix} V^H V \begin{pmatrix} S^{-1} & 0 \\ 0 & 0 \end{pmatrix} U^H U \begin{pmatrix} S & 0 \\ 0 & 0 \end{pmatrix} V^H = U \begin{pmatrix} S & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} I & 0 \\ 0 & 0 \end{pmatrix} V^H = U \begin{pmatrix} S & 0 \\ 0 & 0 \end{pmatrix} V^H = A,$$
i.e., $X$ satisfies condition (1).
$$XAX = V \begin{pmatrix} S^{-1} & 0 \\ 0 & 0 \end{pmatrix} U^H U \begin{pmatrix} S & 0 \\ 0 & 0 \end{pmatrix} V^H V \begin{pmatrix} S^{-1} & 0 \\ 0 & 0 \end{pmatrix} U^H = V \begin{pmatrix} S^{-1} & 0 \\ 0 & 0 \end{pmatrix} U^H = X,$$
i.e., $X$ satisfies condition (2). Since
$$AX = U \begin{pmatrix} S & 0 \\ 0 & 0 \end{pmatrix} V^H V \begin{pmatrix} S^{-1} & 0 \\ 0 & 0 \end{pmatrix} U^H = U \begin{pmatrix} I & 0 \\ 0 & 0 \end{pmatrix} U^H$$
and
$$XA = V \begin{pmatrix} S^{-1} & 0 \\ 0 & 0 \end{pmatrix} U^H U \begin{pmatrix} S & 0 \\ 0 & 0 \end{pmatrix} V^H = V \begin{pmatrix} I & 0 \\ 0 & 0 \end{pmatrix} V^H,$$
it follows that both $AX$ and $XA$ are Hermitian, i.e., $X$ satisfies conditions (3) and (4). To show uniqueness let us suppose that both $X$ and $Y$ satisfy the Moore-Penrose conditions. Then
$$\begin{aligned}
X &= XAX &&\text{by (2)}\\
  &= X(AX)^H = X X^H A^H &&\text{by (3)}\\
  &= X X^H (AYA)^H = X X^H A^H Y^H A^H &&\text{by (1)}\\
  &= X X^H A^H (AY)^H = X X^H A^H A Y &&\text{by (3)}\\
  &= X (AX)^H A Y = X A X A Y &&\text{by (3)}\\
  &= X A Y &&\text{by (2)}\\
  &= X (AYA) Y &&\text{by (1)}\\
  &= X A (YA) Y = X A (YA)^H Y = X A A^H Y^H Y &&\text{by (4)}\\
  &= (XA)^H A^H Y^H Y = A^H X^H A^H Y^H Y &&\text{by (4)}\\
  &= (AXA)^H Y^H Y = A^H Y^H Y &&\text{by (1)}\\
  &= (YA)^H Y = Y A Y &&\text{by (4)}\\
  &= Y &&\text{by (2)}.
\end{aligned}$$
Thus, there is only one matrix $X$ satisfying the Moore-Penrose conditions.
2.4.3 Singular Values and the Norm of a Matrix

Let $A$ be an $m \times n$ matrix. By virtue of the SVD, we have
$$Ax = \sum_{k=1}^{r} \sigma_k (v_k^H x)\, u_k \quad \text{for any } n\text{-vector } x. \tag{2.125}$$
Since the vectors $u_1,\dots,u_r$ are orthonormal, we have
$$\|Ax\|^2 = \sum_{k=1}^{r} \sigma_k^2 |v_k^H x|^2 \le \sigma_1^2 \sum_{k=1}^{r} |v_k^H x|^2 \le \sigma_1^2 \|x\|^2. \tag{2.126}$$
The last inequality comes from the fact that $x$ has the expansion $x = \sum_{k=1}^{n} (v_k^H x)\, v_k$ in terms of the orthonormal basis $v_1,\dots,v_n$ and hence
$$\|x\|^2 = \sum_{k=1}^{n} |v_k^H x|^2.$$
Thus, we have
$$\|Ax\| \le \sigma_1 \|x\| \quad \text{for all } x. \tag{2.127}$$
Since $A v_1 = \sigma_1 u_1$, we have $\|A v_1\| = \sigma_1 = \sigma_1 \|v_1\|$. Hence,
$$\max_{x \ne 0} \frac{\|Ax\|}{\|x\|} = \sigma_1, \tag{2.128}$$
i.e., $A$ can't stretch the length of a vector by a factor greater than $\sigma_1$. One of the definitions of the norm of a matrix is
$$\|A\| = \sup_{x \ne 0} \frac{\|Ax\|}{\|x\|}. \tag{2.129}$$
It follows from equations (2.128) and (2.129) that $\|A\| = \sigma_1$ (the maximum singular value of $A$). If $A$ is of full rank ($r = n$), then it follows by a similar argument that
$$\min_{x \ne 0} \frac{\|Ax\|}{\|x\|} = \sigma_n.$$
If $A$ is an $m \times n$ matrix and $B$ is an $n \times p$ matrix, then for every $p$-vector $x$ we have
$$\|ABx\| \le \|A\|\,\|Bx\| \le \|A\|\,\|B\|\,\|x\|$$
and hence $\|AB\| \le \|A\|\,\|B\|$.
2.4.4 Low Rank Matrix Approximations

You can think of the rank of a matrix as a measure of redundancy. Matrices of low rank should have lots of redundancy and hence should be capable of being specified by fewer parameters than the total number of entries. For example, if the matrix consists of the pixel values of a digital image, then a lower rank approximation of this image should represent a form of image compression. We will make this concept more precise in this section.

One choice for a low rank approximation to $A$ is the matrix $A_k = \sum_{i=1}^{k} \sigma_i u_i v_i^H$ for $k < r$. $A_k$ is a truncated SVD expansion of $A$. Clearly
$$A - A_k = \sum_{i=k+1}^{r} \sigma_i u_i v_i^H. \tag{2.130}$$
Since the largest singular value of $A - A_k$ is $\sigma_{k+1}$, we have
$$\|A - A_k\| = \sigma_{k+1}. \tag{2.131}$$
Suppose $B$ is another $m \times n$ matrix of rank $k$. Then the null space $N$ of $B$ has dimension $n-k$. Let $w_1,\dots,w_{n-k}$ be a basis for $N$. The $n+1$ $n$-vectors $w_1,\dots,w_{n-k},v_1,\dots,v_{k+1}$ must be linearly dependent, i.e., there are constants $\alpha_1,\dots,\alpha_{n-k}$ and $\beta_1,\dots,\beta_{k+1}$, not all zero, such that
$$\sum_{i=1}^{n-k} \alpha_i w_i + \sum_{i=1}^{k+1} \beta_i v_i = 0.$$
Not all of the $\alpha_i$ can be zero since $v_1,\dots,v_{k+1}$ are linearly independent. Similarly, not all of the $\beta_i$ can be zero. Therefore, the vector $h$ defined by
$$h = \sum_{i=1}^{n-k} \alpha_i w_i = -\sum_{i=1}^{k+1} \beta_i v_i$$
is a nonzero vector that belongs to both $N$ and $\langle v_1,\dots,v_{k+1} \rangle$. By proper scaling, we can assume that $h$ is a vector with unit norm. Since $h$ belongs to $\langle v_1,\dots,v_{k+1} \rangle$, we have
$$h = \sum_{i=1}^{k+1} (v_i^H h)\, v_i. \tag{2.132}$$
Therefore,
$$\|h\|^2 = \sum_{i=1}^{k+1} |v_i^H h|^2. \tag{2.133}$$
Since $A v_i = \sigma_i u_i$ for $i = 1,\dots,r$, it follows from equation (2.132) that
$$Ah = \sum_{i=1}^{k+1} (v_i^H h)\, A v_i = \sum_{i=1}^{k+1} (v_i^H h)\, \sigma_i u_i. \tag{2.134}$$
Therefore,
$$\|Ah\|^2 = \sum_{i=1}^{k+1} |v_i^H h|^2 \sigma_i^2 \ge \sigma_{k+1}^2 \sum_{i=1}^{k+1} |v_i^H h|^2 = \sigma_{k+1}^2 \|h\|^2. \tag{2.135}$$
Since $h$ belongs to the null space $N$, we have
$$\|A - B\|^2 \ge \|(A - B)h\|^2 = \|Ah\|^2 \ge \sigma_{k+1}^2 \|h\|^2 = \sigma_{k+1}^2. \tag{2.136}$$
Combining equations (2.131) and (2.136), we obtain
$$\|A - B\| \ge \sigma_{k+1} = \|A - A_k\|. \tag{2.137}$$
Thus, $A_k$ is the rank $k$ matrix that is closest to $A$.
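As a concrete illustration, the truncated expansion $A_k$ is easy to form from a computed SVD. The sketch below (Python/NumPy, an illustrative aside rather than part of the notes) builds the best rank-$k$ approximation and checks that the approximation error equals $\sigma_{k+1}$.

```python
import numpy as np

def best_rank_k(A, k):
    """Best rank-k approximation A_k = sum_{i<=k} sigma_i u_i v_i^H (eq. 2.130)."""
    U, s, Vh = np.linalg.svd(A, full_matrices=False)
    Ak = U[:, :k] @ np.diag(s[:k]) @ Vh[:k, :]
    err = s[k] if k < len(s) else 0.0    # ||A - A_k|| = sigma_{k+1}  (eq. 2.131)
    return Ak, err

rng = np.random.default_rng(0)
A = rng.standard_normal((200, 100))
Ak, err = best_rank_k(A, 10)
print(err, np.linalg.norm(A - Ak, 2))    # the two numbers agree
```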
2.4.5 The Condition Number of a Matrix

Suppose $A$ is an $n \times n$ invertible matrix and $x$ is the solution of the system of equations $Ax = b$. We want to see how sensitive $x$ is to perturbations of the matrix $A$. Let $x + \delta x$ be the solution to the perturbed system $(A + \delta A)(x + \delta x) = b$. Expanding the left-hand side of this equation and neglecting the second order perturbations $\delta A\, \delta x$, we get
$$A\,\delta x + \delta A\, x = 0 \quad \text{or} \quad \delta x = -A^{-1}\, \delta A\, x. \tag{2.138}$$
It follows from equation (2.138) that
$$\|\delta x\| \le \|A^{-1}\|\,\|\delta A\|\,\|x\|$$
or
$$\frac{\|\delta x\|/\|x\|}{\|\delta A\|/\|A\|} \le \|A^{-1}\|\,\|A\|. \tag{2.139}$$
The quantity $\|A^{-1}\|\,\|A\|$ is called the condition number of $A$ and is denoted by $\kappa(A)$, i.e.,
$$\kappa(A) = \|A^{-1}\|\,\|A\|.$$
Thus, equation (2.139) can be written
$$\frac{\|\delta x\|/\|x\|}{\|\delta A\|/\|A\|} \le \kappa(A). \tag{2.140}$$
We have seen previously that $\|A\| = \sigma_1$, the largest singular value. Since $A^{-1}$ has the singular value decomposition $A^{-1} = V S^{-1} U^H$, it follows that $\|A^{-1}\| = 1/\sigma_n$. Therefore, the condition number is given by
$$\kappa(A) = \frac{\sigma_1}{\sigma_n}. \tag{2.141}$$
The condition number is sort of an aspect ratio of the hyperellipsoid that $A$ maps the unit sphere into.
2.4.6 Computation of the SVD

The methods for calculating the SVD are all variations of methods used to calculate eigenvalues and eigenvectors of Hermitian matrices. The most natural procedure would be to follow the derivation of the SVD and compute the squares of the singular values and the unitary matrix $V$ by solving the eigenproblem for $A^H A$. The $U$ matrix would then be obtained from $AV$. Unfortunately, this procedure is not very accurate due to the fact that the singular values of $A^H A$ are the squares of the singular values of $A$. As a result the ratio of largest to smallest singular value can be much larger for $A^H A$ than for $A$. There are, however, implicit methods that solve the eigenproblem for $A^H A$ without ever explicitly forming $A^H A$. Most of the SVD algorithms first reduce $A$ to bidiagonal form (all elements zero except the diagonal and first superdiagonal). This can be accomplished using Householder reflections alternately on the left and right as shown in Figure 2.2.
$$A_1 = U_1^H A = \begin{pmatrix} \times & \times & \times & \times \\ 0 & \times & \times & \times \\ 0 & \times & \times & \times \\ 0 & \times & \times & \times \\ 0 & \times & \times & \times \end{pmatrix} \rightarrow
A_2 = A_1 V_1 = \begin{pmatrix} \times & \times & 0 & 0 \\ 0 & \times & \times & \times \\ 0 & \times & \times & \times \\ 0 & \times & \times & \times \\ 0 & \times & \times & \times \end{pmatrix} \rightarrow$$
$$A_3 = U_2^H A_2 = \begin{pmatrix} \times & \times & 0 & 0 \\ 0 & \times & \times & \times \\ 0 & 0 & \times & \times \\ 0 & 0 & \times & \times \\ 0 & 0 & \times & \times \end{pmatrix} \rightarrow
A_4 = A_3 V_2 = \begin{pmatrix} \times & \times & 0 & 0 \\ 0 & \times & \times & 0 \\ 0 & 0 & \times & \times \\ 0 & 0 & \times & \times \\ 0 & 0 & \times & \times \end{pmatrix} \rightarrow$$
$$A_5 = U_3^H A_4 = \begin{pmatrix} \times & \times & 0 & 0 \\ 0 & \times & \times & 0 \\ 0 & 0 & \times & \times \\ 0 & 0 & 0 & \times \\ 0 & 0 & 0 & \times \end{pmatrix} \rightarrow
A_6 = U_4^H A_5 = \begin{pmatrix} \times & \times & 0 & 0 \\ 0 & \times & \times & 0 \\ 0 & 0 & \times & \times \\ 0 & 0 & 0 & \times \\ 0 & 0 & 0 & 0 \end{pmatrix}.$$
Figure 2.2: Householder reduction of a matrix to bidiagonal form.

Since the Householder reflections applied on the right don't try to zero all the elements to the right of the diagonal, they don't affect the zeros already obtained in the columns. We have seen that, even in the complex case, the Householder matrices can be chosen so that the resulting bidiagonal matrix is real. Notice also that when the number of rows $m$ is greater than the number of columns $n$, the reduction produces zero rows after row $n$. Similarly, when $n > m$, the reduction produces zero columns after column $m$. If we replace the products of the Householder reflections by the unitary matrices $\tilde U$ and $\tilde V$, the reduction to a bidiagonal $B$ can be written as
$$B = \tilde U^H A \tilde V \quad \text{or} \quad A = \tilde U B \tilde V^H. \tag{2.142}$$
If $B$ has the SVD $B = \hat U \Sigma \hat V^T$, then $A$ has the SVD
$$A = \tilde U (\hat U \Sigma \hat V^T) \tilde V^H = (\tilde U \hat U)\, \Sigma\, (\tilde V \hat V)^H = U \Sigma V^H,$$
where $U = \tilde U \hat U$ and $V = \tilde V \hat V$. Thus, it is sufficient to find the SVD of the real bidiagonal matrix $B$. Moreover, it is not necessary to carry along the zero rows or columns of $B$. For if the square portion $B_1$ of $B$ has the SVD $B_1 = U_1 \Sigma_1 V_1^T$, then
$$B = \begin{pmatrix} B_1 \\ 0 \end{pmatrix} = \begin{pmatrix} U_1 \Sigma_1 V_1^T \\ 0 \end{pmatrix} = \begin{pmatrix} U_1 & 0 \\ 0 & I \end{pmatrix} \begin{pmatrix} \Sigma_1 \\ 0 \end{pmatrix} V_1^T \tag{2.143}$$
or
$$B = (B_1,\ 0) = (U_1 \Sigma_1 V_1^T,\ 0) = U_1\, (\Sigma_1,\ 0) \begin{pmatrix} V_1 & 0 \\ 0 & I \end{pmatrix}^T. \tag{2.144}$$
Thus, it is sufficient to consider the computation of the SVD for a real, square, bidiagonal matrix $B$.

In addition to the implicit methods of finding the eigenvalues of $B^T B$, some methods look instead at the symmetric matrix $\begin{pmatrix} 0 & B^T \\ B & 0 \end{pmatrix}$. If the SVD of $B$ is $B = U \Sigma V^T$, then $\begin{pmatrix} 0 & B^T \\ B & 0 \end{pmatrix}$ has the eigenequation
$$\begin{pmatrix} 0 & B^T \\ B & 0 \end{pmatrix} \begin{pmatrix} V & V \\ U & -U \end{pmatrix} = \begin{pmatrix} V & V \\ U & -U \end{pmatrix} \begin{pmatrix} \Sigma & 0 \\ 0 & -\Sigma \end{pmatrix}. \tag{2.145}$$
In addition, the matrix $\begin{pmatrix} 0 & B^T \\ B & 0 \end{pmatrix}$ can be reduced to a real tridiagonal matrix $T$ by the relation
$$T = P^T \begin{pmatrix} 0 & B^T \\ B & 0 \end{pmatrix} P \tag{2.146}$$
where $P = (e_1, e_{n+1}, e_2, e_{n+2}, \dots, e_n, e_{2n})$ is a permutation matrix formed by a rearrangement of the columns $e_1, e_2, \dots, e_{2n}$ of the $2n \times 2n$ identity matrix. The matrix $P$ is unitary and is sometimes called the perfect shuffle since its operation on a vector mimics a perfect card shuffle of the components. The algorithms based on this double size symmetric matrix don't actually form the double size matrix, but make efficient use of the symmetries involved in this eigenproblem. For those interested in the details of the various SVD algorithms, I would refer you to the book by Demmel [4].

In Matlab the SVD can be obtained by the call [U,S,V]=svd(A). In LAPACK the general driver routines for the SVD are SGESVD, DGESVD, and CGESVD depending on whether the matrix is real single precision, real double precision, or complex.
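For completeness, a NumPy analogue of the Matlab call is shown below; this is only a usage sketch, not part of the original notes, and the test matrix is arbitrary.

```python
import numpy as np

A = np.array([[1.0, 2.0, 0.0],
              [0.0, 3.0, 4.0],
              [5.0, 0.0, 6.0],
              [7.0, 8.0, 0.0]])

U, s, Vh = np.linalg.svd(A)             # full SVD: A = U @ Sigma @ Vh
Sigma = np.zeros(A.shape)
Sigma[:len(s), :len(s)] = np.diag(s)
print(np.allclose(A, U @ Sigma @ Vh))   # True
print("sigma_max =", s[0], " kappa(A) =", s[0] / s[-1])   # norm and condition number, eq. (2.141)
```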
Chapter 3
Eigenvalue Problems
Eigenvalue problems occur quite often in physics. For example, in quantum mechanics eigenvalues correspond to certain energy states; in structural mechanics problems eigenvalues often correspond to resonance frequencies of the structure; and in time evolution problems eigenvalues are often related to the stability of the system.

Let $A$ be an $m \times m$ square matrix. A nonzero vector $x$ is an eigenvector of $A$ and $\lambda$ is its corresponding eigenvalue, if
$$Ax = \lambda x.$$
The set of vectors
$$V_\lambda = \{x : Ax = \lambda x\}$$
is a subspace called the eigenspace corresponding to $\lambda$. The equation $Ax = \lambda x$ is equivalent to $(A - \lambda I)x = 0$. If $\lambda$ is an eigenvalue, then the matrix $A - \lambda I$ is singular and hence
$$\det(A - \lambda I) = 0.$$
Thus, the eigenvalues of $A$ are roots of a polynomial equation of order $m$. This polynomial equation is called the characteristic equation of $A$. Conversely, if $p(z) = a_0 + a_1 z + \cdots + a_{n-1} z^{n-1} + a_n z^n$ is an arbitrary polynomial of degree $n$ ($a_n \ne 0$), then the matrix
$$\begin{pmatrix}
0 & & & & -a_0/a_n \\
1 & 0 & & & -a_1/a_n \\
 & 1 & \ddots & & -a_2/a_n \\
 & & \ddots & 0 & \vdots \\
 & & & 1 & -a_{n-1}/a_n
\end{pmatrix}$$
has $p(z) = 0$ as its characteristic equation.
In some problems an eigenvalue $\lambda$ might correspond to a multiple root of the characteristic equation. The multiplicity of the root $\lambda$ is called its algebraic multiplicity. The dimension of the space $V_\lambda$ is called its geometric multiplicity. If for some eigenvalue $\lambda$ of $A$ the geometric multiplicity of $\lambda$ does not equal its algebraic multiplicity, this eigenvalue is said to be defective. A matrix with one or more defective eigenvalues is said to be a defective matrix. An example of a defective matrix is the matrix
$$\begin{pmatrix} 2 & 1 & 0 \\ 0 & 2 & 1 \\ 0 & 0 & 2 \end{pmatrix}.$$
This matrix has the single eigenvalue 2 with algebraic multiplicity 3. However, the eigenspace corresponding to the eigenvalue 2 has dimension 1. All the eigenvectors are multiples of $e_1$. In these notes we will only consider eigenvalue problems involving Hermitian matrices ($A^H = A$). We will see that all such matrices are non-defective.
If $S$ is a nonsingular $m \times m$ matrix, then the matrix $S^{-1} A S$ is said to be similar to $A$. Since
$$\det(S^{-1} A S - \lambda I) = \det\bigl(S^{-1}(A - \lambda I)S\bigr) = \det(S^{-1})\det(A - \lambda I)\det(S) = \det(A - \lambda I),$$
it follows that $S^{-1} A S$ and $A$ have the same characteristic equation and hence the same eigenvalues.

It can be shown that a Hermitian matrix $A$ always has a complete set of orthonormal eigenvectors. If we form the unitary matrix $U$ whose columns are the eigenvectors belonging to this orthonormal set, then
$$AU = U\Lambda \quad \text{or} \quad U^H A U = \Lambda \tag{3.1}$$
where $\Lambda$ is a diagonal matrix whose diagonal entries are the eigenvalues. Thus, a Hermitian matrix is similar to a diagonal matrix. Since a diagonal matrix is clearly non-defective, it follows that all Hermitian matrices are non-defective.

If $e$ is a unit eigenvector of the Hermitian matrix $A$ and $\lambda$ is the corresponding eigenvalue, then $Ae = \lambda e$ and hence $\lambda = e^H A e$. It follows that $\bar\lambda = (e^H A e)^H = e^H A^H e = e^H A e = \lambda$, i.e., the eigenvalues of a Hermitian matrix are real.

It was shown by Abel, Galois and others in the nineteenth century that there can be no algebraic expression for the roots of a polynomial equation whose order is greater than four. Since eigenvalues are roots of the characteristic equation and since the roots of any polynomial are the eigenvalues of some matrix, there can be no purely algebraic method for computing eigenvalues. Thus, algorithms for finding eigenvalues must at some stage be iterative in nature. The methods to be discussed here first reduce the Hermitian matrix $A$ to a real, symmetric, tridiagonal matrix $T$ by means of a unitary similarity transformation. The eigenvalues of $T$ are then found using certain iterative procedures. The most common iterative procedures are the QR algorithm and the divide-and-conquer algorithm.
3.1 Reduction to Tridiagonal Form
The reduction to tridiagonal form can be done with Householder reflectors. I will illustrate the procedure with a $5 \times 5$ matrix $A$. We can zero out the elements in the first column from row three to the end using a Householder reflector of the form
$$U_1 = \begin{pmatrix} 1 & 0 \\ 0 & Q_1 \end{pmatrix}.$$
This reflector does not alter the elements of the first row. Thus, multiplying $U_1 A$ on the right by $U_1^H$ zeros out the elements of the first row from column three on and doesn't affect the first column. Hence,
$$U_1 A U_1^H = \begin{pmatrix}
\times & \times & 0 & 0 & 0 \\
\times & \times & \times & \times & \times \\
0 & \times & \times & \times & \times \\
0 & \times & \times & \times & \times \\
0 & \times & \times & \times & \times
\end{pmatrix}.$$
Moreover, the Householder reflector can be chosen so that the 12 element and the 21 element are real. We can continue in this manner to zero out the elements below the first subdiagonal and above the first superdiagonal. Furthermore, the Householder reflectors can be chosen so that the super- and subdiagonals are real. The diagonals of the resulting tridiagonal matrix are real since the transformations have preserved the Hermitian property. Collecting the products of the Householder reflectors into a unitary matrix $U$, we have
$$U A U^H = T \quad \text{or} \quad A = U^H T U$$
where $T$ is a real, symmetric, tridiagonal matrix. Since $A$ and $T$ are similar, they have the same eigenvalues. Thus, we only need eigenvalue routines for real symmetric matrices. In the following sections we will assume that the matrix $A$ is real and symmetric.
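The following sketch (Python/NumPy, not from the original notes) carries out this Householder reduction for a real symmetric matrix; it is written for clarity rather than efficiency and skips the blocked, symmetry-exploiting updates a library routine such as LAPACK's SSYTRD would use.

```python
import numpy as np

def tridiagonalize(A):
    """Reduce a real symmetric matrix to tridiagonal form T = U A U^T
    with Householder reflectors (clarity over efficiency)."""
    A = np.array(A, dtype=float)
    n = A.shape[0]
    U = np.eye(n)
    for k in range(n - 2):
        x = A[k+1:, k].copy()
        alpha = -np.sign(x[0] if x[0] != 0 else 1.0) * np.linalg.norm(x)
        v = x.copy()
        v[0] -= alpha
        if np.linalg.norm(v) == 0:
            continue                              # column already reduced
        v /= np.linalg.norm(v)
        H = np.eye(n)
        H[k+1:, k+1:] -= 2.0 * np.outer(v, v)     # Householder reflector
        A = H @ A @ H                             # similarity transformation
        U = H @ U
    return A, U      # here A is the tridiagonal T (up to roundoff)

A = np.array([[4., 1., 2., 2.],
              [1., 3., 0., 1.],
              [2., 0., 5., 3.],
              [2., 1., 3., 6.]])
T, U = tridiagonalize(A)
print(np.allclose(U @ A @ U.T, T))   # similarity holds, so the eigenvalues agree
```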
3.2 The Power Method
The power method is one of the oldest methods for obtaining the eigenvectors of a matrix. It is no longer used for this purpose because of its slow convergence, but it does underlie some of the practical algorithms. Let $v_1, v_2, \dots, v_n$ be an orthonormal basis of eigenvectors of the matrix $A$ and let $\lambda_1, \dots, \lambda_n$ be the corresponding eigenvalues. We will assume that the eigenvalues and eigenvectors are so ordered that
$$|\lambda_1| \ge |\lambda_2| \ge \cdots \ge |\lambda_n|.$$
We will assume further that $|\lambda_1| > |\lambda_2|$. Let $v$ be an arbitrary vector with $\|v\| = 1$. Then there exist constants $c_1, \dots, c_n$ such that
$$v = c_1 v_1 + \cdots + c_n v_n. \tag{3.2}$$
We will make the further assumption that $c_1 \ne 0$. Successively applying $A$ to equation (3.2), we obtain
$$A^k v = c_1 A^k v_1 + \cdots + c_n A^k v_n = c_1 \lambda_1^k v_1 + \cdots + c_n \lambda_n^k v_n. \tag{3.3}$$
You can see from equation (3.3) that the term $c_1 \lambda_1^k v_1$ will eventually dominate and thus $A^k v$, if properly scaled at each step to prevent overflow, will approach a multiple of the eigenvector $v_1$. This convergence can be slow if there are other eigenvalues close in magnitude to $\lambda_1$. The condition $c_1 \ne 0$ is equivalent to the condition
$$\langle v \rangle \cap \langle v_2, \dots, v_n \rangle = \{0\}.$$
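A minimal implementation of the scaled iteration just described might look as follows (Python/NumPy, added here only as an illustration; the stopping test is one arbitrary choice among many).

```python
import numpy as np

def power_method(A, tol=1e-10, max_iter=1000):
    """Scaled power iteration: returns an approximate dominant eigenpair."""
    n = A.shape[0]
    v = np.random.default_rng(0).standard_normal(n)
    v /= np.linalg.norm(v)
    lam = 0.0
    for _ in range(max_iter):
        w = A @ v
        lam_new = v @ w               # Rayleigh quotient (A symmetric)
        v = w / np.linalg.norm(w)     # rescale to prevent overflow
        if abs(lam_new - lam) < tol * abs(lam_new):
            break
        lam = lam_new
    return lam_new, v

A = np.array([[2., 1.], [1., 3.]])
lam, v = power_method(A)
print(lam)    # approaches the eigenvalue of largest magnitude
```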
3.3 The Rayleigh Quotient
The Rayleigh quotient of a vector $x$ is the real number
$$r(x) = \frac{x^T A x}{x^T x}.$$
If $x$ is an eigenvector of $A$ corresponding to the eigenvalue $\lambda$, then $r(x) = \lambda$. If $x$ is any nonzero vector, then
$$\begin{aligned}
\|Ax - \alpha x\|^2 &= (x^T A^T - \alpha x^T)(Ax - \alpha x) \\
&= x^T A^T A x - 2\alpha\, x^T A x + \alpha^2 x^T x \\
&= x^T A^T A x - 2\alpha\, r(x)\, x^T x + \alpha^2 x^T x + r^2(x)\, x^T x - r^2(x)\, x^T x \\
&= x^T A^T A x + x^T x\,\bigl(\alpha - r(x)\bigr)^2 - r^2(x)\, x^T x.
\end{aligned}$$
Thus, $\alpha = r(x)$ minimizes $\|Ax - \alpha x\|$. If $x$ is an approximate eigenvector, then $r(x)$ is an approximate eigenvalue.
3.4 Inverse Iteration with Shifts
For any $\mu$ that is not an eigenvalue of $A$, the matrix $(A - \mu I)^{-1}$ has the same eigenvectors as $A$ and has eigenvalues $(\lambda_j - \mu)^{-1}$ where $\{\lambda_j\}$ are the eigenvalues of $A$. Suppose $\mu$ is close to the eigenvalue $\lambda_i$. Then $(\lambda_i - \mu)^{-1}$ will be large compared to $(\lambda_j - \mu)^{-1}$ for $j \ne i$. If we apply power iteration to $(A - \mu I)^{-1}$, the process will converge to a multiple of the eigenvector $v_i$ corresponding to $\lambda_i$. This procedure is called inverse iteration with shifts. Although the power method is not used in practice, the inverse power method with shifts is frequently used to compute eigenvectors once an approximate eigenvalue has been obtained.
3.5 Rayleigh Quotient Iteration
The Rayleigh quotient can be used to obtain the shifts at each stage of inverse iteration. The procedure can be summarized as follows.

1. Choose a starting vector $v^{(0)}$ of unit magnitude.
2. Let $\lambda^{(0)} = (v^{(0)})^T A v^{(0)}$ be the corresponding Rayleigh quotient.
3. For $k = 1, 2, \dots$
   Solve $\bigl(A - \lambda^{(k-1)} I\bigr) w = v^{(k-1)}$ for $w$, i.e., compute $\bigl(A - \lambda^{(k-1)} I\bigr)^{-1} v^{(k-1)}$.
   Normalize $w$ to obtain $v^{(k)} = w/\|w\|$.
   Let $\lambda^{(k)} = (v^{(k)})^T A v^{(k)}$ be the corresponding Rayleigh quotient.

It can be shown that the convergence of Rayleigh quotient iteration is ultimately cubic. Cubic convergence triples the number of significant digits on each iteration.
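A compact version of this iteration is sketched below (Python/NumPy, illustrative only); the linear solve uses a dense factorization here, whereas a practical code would exploit the tridiagonal structure produced in Section 3.1.

```python
import numpy as np

def rayleigh_quotient_iteration(A, v0, tol=1e-12, max_iter=50):
    """Rayleigh quotient iteration for real symmetric A: inverse
    iteration with the Rayleigh quotient used as the shift."""
    v = v0 / np.linalg.norm(v0)
    lam = v @ A @ v
    for _ in range(max_iter):
        try:
            w = np.linalg.solve(A - lam * np.eye(A.shape[0]), v)
        except np.linalg.LinAlgError:     # shift hit an eigenvalue exactly
            break
        v = w / np.linalg.norm(w)
        lam_new = v @ A @ v
        if abs(lam_new - lam) < tol * abs(lam_new):
            lam = lam_new
            break
        lam = lam_new
    return lam, v

A = np.array([[2., 1., 0.], [1., 3., 1.], [0., 1., 4.]])
lam, v = rayleigh_quotient_iteration(A, np.array([1., 0., 0.]))
print(lam, np.linalg.norm(A @ v - lam * v))   # residual is tiny
```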
3.6 The Basic QR Method
The QR method was discovered independently by Francis [6] and Kublanovskaya [11] in 1961. It is one of the standard methods for finding eigenvalues. The discussion in this section is based largely on the paper Understanding the QR Algorithm by Watkins [13]. As before, we will assume that the matrix $A$ is real and symmetric. Therefore, there is an orthonormal basis $v_1, \dots, v_n$ such that $A v_j = \lambda_j v_j$ for each $j$. We will assume that the eigenvalues $\lambda_j$ are ordered so that $|\lambda_1| \ge |\lambda_2| \ge \cdots \ge |\lambda_n|$.

The QR algorithm can be summarized as follows:

1. Choose $A_0 = A$.
2. For $m = 1, 2, \dots$
   $A_{m-1} = Q_m R_m$  (QR factorization)
   $A_m = R_m Q_m$.
3. Stop when $A_m$ is approximately diagonal.
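Stated as code, the loop above is only a few lines (Python/NumPy, an illustrative aside; with no shifts or deflation the convergence can be slow).

```python
import numpy as np

def basic_qr_eigenvalues(A, sweeps=200):
    """Unshifted QR iteration: A_{m-1} = Q_m R_m, then A_m = R_m Q_m.
    For real symmetric A the iterates tend toward a diagonal matrix
    whose entries are the eigenvalues (assuming distinct moduli)."""
    Am = np.array(A, dtype=float)
    for _ in range(sweeps):
        Q, R = np.linalg.qr(Am)
        Am = R @ Q
    return np.sort(np.diag(Am))

A = np.array([[4., 1., 0.], [1., 3., 1.], [0., 1., 2.]])
print(basic_qr_eigenvalues(A))
print(np.sort(np.linalg.eigvalsh(A)))   # agrees to several digits
```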
It is probably not obvious what this algorithm has to do with eigenvalues. We will show that the QR method is a way of organizing simultaneous iteration, which in turn is a multivector generalization of the power method.

We can apply the power method to subspaces as well as to single vectors. Suppose $S$ is a $k$-dimensional subspace. We can compute the sequence of subspaces $S, AS, A^2 S, \dots$. Under certain conditions this sequence will converge to the subspace spanned by the eigenvectors $v_1, v_2, \dots, v_k$ corresponding to the $k$ largest eigenvalues of $A$. We will not provide a rigorous convergence proof, but we will attempt to make this result seem plausible. Assume that $|\lambda_k| > |\lambda_{k+1}|$ and define the subspaces
$$T = \langle v_1, \dots, v_k \rangle \qquad U = \langle v_{k+1}, \dots, v_n \rangle.$$
We will first show that all the null vectors of $A$ lie in $U$. Suppose $v$ is a null vector of $A$, i.e., $Av = 0$. We can expand $v$ in terms of the basis $v_1, \dots, v_n$ giving
$$v = c_1 v_1 + \cdots + c_k v_k + c_{k+1} v_{k+1} + \cdots + c_n v_n.$$
Thus,
$$Av = c_1 \lambda_1 v_1 + \cdots + c_k \lambda_k v_k + c_{k+1} \lambda_{k+1} v_{k+1} + \cdots + c_n \lambda_n v_n = 0.$$
Since the vectors $\{v_j\}$ are linearly independent and $|\lambda_1| \ge \cdots \ge |\lambda_k| > 0$, it follows that $c_1 = c_2 = \cdots = c_k = 0$, i.e., $v$ belongs to the subspace $U$. We will now make the additional assumption $S \cap U = \{0\}$. This assumption is analogous to the assumption $c_1 \ne 0$ in the power method. If $x$ is a nonzero vector in $S$, then we can write
$$\begin{aligned}
x = {}& c_1 v_1 + c_2 v_2 + \cdots + c_k v_k && \text{(component in } T\text{)}\\
 &{}+ c_{k+1} v_{k+1} + \cdots + c_n v_n. && \text{(component in } U\text{)}
\end{aligned}$$
Thus,
$$A^m x/\lambda_k^m = c_1 (\lambda_1/\lambda_k)^m v_1 + \cdots + c_{k-1} (\lambda_{k-1}/\lambda_k)^m v_{k-1} + c_k v_k + c_{k+1} (\lambda_{k+1}/\lambda_k)^m v_{k+1} + \cdots + c_n (\lambda_n/\lambda_k)^m v_n.$$
Since $x$ doesn't belong to $U$, at least one of the coefficients $c_1, \dots, c_k$ must be nonzero. Notice that the first $k$ terms on the right-hand side do not decrease in absolute value as $m \to \infty$ whereas the remaining terms approach zero. Thus, $A^m x$, if properly scaled, approaches the subspace $T$ as $m \to \infty$. In the limit $A^m S$ must approach a subspace of $T$. Since $S \cap U = \{0\}$, $A$ can have no null vectors in $S$. Thus, $A$ is invertible on $S$. It follows that all of the subspaces $A^m S$ have dimension $k$ and hence the limit can not be a proper subspace of $T$, i.e., $A^m S \to T$ as $m \to \infty$.
Numerically, we can't iterate on an entire subspace. Therefore, we pick a basis of this subspace and iterate on this basis. Let $q_1^{(0)}, \dots, q_k^{(0)}$ be a basis of $S$. Since $A$ is invertible on $S$, $A q_1^{(0)}, \dots, A q_k^{(0)}$ is a basis of $AS$. Similarly, $A^m q_1^{(0)}, \dots, A^m q_k^{(0)}$ is a basis of $A^m S$ for all $m$. Thus, in principle we can iterate on a basis of $S$ to obtain bases for $AS, A^2 S, \dots$. However, for large $m$ these bases become ill-conditioned since all the vectors tend to point in the direction of the eigenvector corresponding to the eigenvalue of largest absolute value. To avoid this we orthonormalize the basis at each step. Thus, given an orthonormal basis $q_1^{(m)}, \dots, q_k^{(m)}$ of $A^m S$, we compute $A q_1^{(m)}, \dots, A q_k^{(m)}$ and then orthonormalize these vectors (using something like the Gram-Schmidt process) to obtain an orthonormal basis $q_1^{(m+1)}, \dots, q_k^{(m+1)}$ of $A^{m+1} S$. This process is called simultaneous iteration. Notice that this process of orthonormalization has the property
$$\langle A q_1^{(m)}, \dots, A q_i^{(m)} \rangle = \langle q_1^{(m+1)}, \dots, q_i^{(m+1)} \rangle \quad \text{for } i = 1, \dots, k.$$
Let us consider now what happens when we apply simultaneous iteration to the complete set of orthonormal vectors $e_1, \dots, e_n$ where $e_k$ is the $k$-th column of the identity matrix. Let us define
$$S_k = \langle e_1, \dots, e_k \rangle, \qquad T_k = \langle v_1, \dots, v_k \rangle, \qquad U_k = \langle v_{k+1}, \dots, v_n \rangle$$
for $k = 1, 2, \dots, n-1$. We also assume that $S_k \cap U_k = \{0\}$ and $|\lambda_k| > |\lambda_{k+1}| > 0$ for each $1 \le k \le n-1$. It follows from our previous discussion that $A^m S_k \to T_k$ as $m \to \infty$. In terms of bases, the orthonormal vectors $q_1^{(m)}, \dots, q_n^{(m)}$ will converge to an orthonormal basis $q_1, \dots, q_n$ such that $T_k = \langle q_1, \dots, q_k \rangle$ for each $k = 1, \dots, n-1$. Each of the subspaces $T_k$ is invariant under $A$, i.e., $A T_k \subset T_k$. We will now look at a property of invariant subspaces. Suppose $T$ is an invariant subspace of $A$. Let $Q = (Q_1, Q_2)$ be an orthogonal matrix such that the columns of $Q_1$ form a basis of $T$. Then
$$Q^T A Q = \begin{pmatrix} Q_1^T A Q_1 & Q_1^T A Q_2 \\ Q_2^T A Q_1 & Q_2^T A Q_2 \end{pmatrix} = \begin{pmatrix} Q_1^T A Q_1 & 0 \\ 0 & Q_2^T A Q_2 \end{pmatrix},$$
i.e., the basis consisting of the columns of $Q$ block diagonalizes $A$. Let $Q$ be the matrix with columns $q_1, \dots, q_n$. Since each $T_k$ is invariant under $A$, the matrix $Q^T A Q$ has the block diagonal form
$$Q^T A Q = \begin{pmatrix} \Lambda_1 & 0 \\ 0 & \Lambda_2 \end{pmatrix} \quad \text{where } \Lambda_1 \text{ is } k \times k$$
for each $k = 1, \dots, n-1$. Therefore, $Q^T A Q$ must be diagonal. The diagonal entries are the eigenvalues of $A$. If we define $A_m = Q_m^T A Q_m$ where $Q_m = (q_1^{(m)}, \dots, q_n^{(m)})$, then $A_m$ will become approximately diagonal for large $m$.
We can summarize simultaneous iteration as follows:

1. We start with the orthogonal matrix $Q_0 = I$ whose columns form a basis of $n$-space.
2. For $m = 1, 2, \dots$ we compute
$$Z_m = A Q_{m-1} \qquad \text{Power iteration step} \tag{3.4a}$$
$$Z_m = Q_m R_m \qquad \text{Orthonormalize columns of } Z_m \tag{3.4b}$$
$$A_m = Q_m^T A Q_m \qquad \text{Test for diagonal matrix.} \tag{3.4c}$$
The QR algorithm is an efficient way to organize these calculations. Equations (3.4a) and (3.4b) can be combined to give
$$A Q_{m-1} = Q_m R_m. \tag{3.5}$$
Combining equations (3.4c) and (3.5), we get
$$A_{m-1} = Q_{m-1}^T A Q_{m-1} = Q_{m-1}^T (Q_m R_m) = (Q_{m-1}^T Q_m) R_m = \tilde Q_m R_m \tag{3.6}$$
where $\tilde Q_m = Q_{m-1}^T Q_m$. Equation (3.5) can be rewritten as
$$Q_m^T A Q_{m-1} = R_m. \tag{3.7}$$
Combining equations (3.4c) and (3.7), we get
$$A_m = Q_m^T A Q_m = (Q_m^T A Q_{m-1}) Q_{m-1}^T Q_m = R_m (Q_{m-1}^T Q_m) = R_m \tilde Q_m. \tag{3.8}$$
Equation (3.6) is a QR factorization of $A_{m-1}$. Equation (3.8) shows that $A_m$ has the same $Q$ and $R$ factors but with their order reversed. Thus, the QR algorithm generates the matrices $A_m$ recursively without having to compute $Z_m$ and $Q_m$ at each step. Note that the orthogonal matrices $\tilde Q_m$ and $Q_m$ satisfy the relation
$$\tilde Q_1 \tilde Q_2 \cdots \tilde Q_k = (Q_0^T Q_1)(Q_1^T Q_2) \cdots (Q_{k-1}^T Q_k) = Q_k.$$
We have now seen that the QR method can be considered as a generalization of the power method. We will see that the QR algorithm is also related to inverse power iteration. In fact we have the following duality result.
Theorem 3. If $A$ is an $n \times n$ symmetric nonsingular matrix and if $S$ and $S^\perp$ are orthogonal complementary subspaces, then $A^m S$ and $A^{-m} S^\perp$ are also orthogonal complements.
Proof. If $x$ and $y$ are $n$-vectors, then
$$x \cdot y = x^T y = x^T A^T (A^T)^{-1} y = (Ax)^T (A^T)^{-1} y = (Ax)^T A^{-1} y = Ax \cdot A^{-1} y.$$
Applying this result repeatedly, we obtain
$$x \cdot y = A^m x \cdot A^{-m} y.$$
It is clear from this relation that every element in $A^m S$ is orthogonal to every element in $A^{-m} S^\perp$. Let $q_1, \dots, q_k$ be a basis of $S$ and let $q_{k+1}, \dots, q_n$ be a basis of $S^\perp$. Then $A^m q_1, \dots, A^m q_k$ is a basis of $A^m S$ and $A^{-m} q_{k+1}, \dots, A^{-m} q_n$ is a basis of $A^{-m} S^\perp$. Suppose there exist scalars $c_1, \dots, c_n$ such that
$$c_1 A^m q_1 + \cdots + c_k A^m q_k + c_{k+1} A^{-m} q_{k+1} + \cdots + c_n A^{-m} q_n = 0. \tag{3.9}$$
Taking the dot product of this relation with $c_1 A^m q_1 + \cdots + c_k A^m q_k$, we obtain
$$\|c_1 A^m q_1 + \cdots + c_k A^m q_k\| = 0$$
and hence $c_1 A^m q_1 + \cdots + c_k A^m q_k = 0$. Since $A^m q_1, \dots, A^m q_k$ are linearly independent, it follows that $c_1 = c_2 = \cdots = c_k = 0$. In a similar manner we obtain $c_{k+1} = \cdots = c_n = 0$. Therefore, $A^m q_1, \dots, A^m q_k, A^{-m} q_{k+1}, \dots, A^{-m} q_n$ are linearly independent and hence form a basis for $n$-space. Thus, $A^m S$ and $A^{-m} S^\perp$ are orthogonal complements.
It can be seen from this theorem that performing power iteration on the subspaces $S_k$ is also performing inverse power iteration on $S_k^\perp$. Since
$$\langle q_1^{(m)}, \dots, q_k^{(m)} \rangle = \langle A^m e_1, \dots, A^m e_k \rangle,$$
Theorem 3 implies that
$$\langle q_{k+1}^{(m)}, \dots, q_n^{(m)} \rangle = \langle A^{-m} e_{k+1}, \dots, A^{-m} e_n \rangle.$$
For $k = n-1$ we have $\langle q_n^{(m)} \rangle = \langle A^{-m} e_n \rangle$. Thus, $q_n^{(m)}$ is the result at the $m$-th step of applying the inverse power method to $e_n$. It follows that $q_n^{(m)}$ should converge to an eigenvector corresponding to the smallest eigenvalue $\lambda_n$. Moreover, the element in the $n$-th row and $n$-th column of $A_m = Q_m^T A Q_m$ should converge to the smallest eigenvalue $\lambda_n$.

The convergence of the QR method, like that of the power method, can be quite slow. To make the method practical, the convergence is accelerated using shifts as in the inverse power method.
3.6.1 The QR Method with Shifts
Suppose we apply a shift $\mu_m$ at the $m$-th step, i.e., we replace $A$ by $A - \mu_m I$. Then the algorithm becomes

1. Set $A_0 = A$.
2. For $k = 1, 2, \dots$
   $A_{k-1} - \mu_k I = \tilde Q_k \tilde R_k$  (QR factorization)
   $A_k = \tilde R_k \tilde Q_k + \mu_k I$.
3. Deflate when an eigenvalue converges.
It follows from the QR factorization of $A_{k-1} - \mu_k I$ that
$$\tilde Q_k^T A_{k-1} \tilde Q_k - \mu_k I = \tilde Q_k^T (A_{k-1} - \mu_k I) \tilde Q_k = \tilde Q_k^T \tilde Q_k \tilde R_k \tilde Q_k = \tilde R_k \tilde Q_k. \tag{3.10}$$
Equation (3.10) implies that
$$A_k = \tilde Q_k^T A_{k-1} \tilde Q_k. \tag{3.11}$$
It follows by induction on equation (3.11) that
$$A_k = \tilde Q_k^T \cdots \tilde Q_1^T A \tilde Q_1 \cdots \tilde Q_k. \tag{3.12}$$
If we define
$$Q_k = \tilde Q_1 \cdots \tilde Q_k,$$
then equation (3.12) can be written
$$A_k = Q_k^T A Q_k. \tag{3.13}$$
Thus, each $A_k$ has the same eigenvalues as $A$.
Theorem 4. For each $k \ge 1$ we have the relation
$$(A - \mu_k I) \cdots (A - \mu_1 I) = \tilde Q_1 \cdots \tilde Q_k \tilde R_k \cdots \tilde R_1 = Q_k R_k$$
where $Q_k = \tilde Q_1 \cdots \tilde Q_k$ and $R_k = \tilde R_k \cdots \tilde R_1$.
Proof. For $k = 1$ the result is just the $k = 1$ step. Assume that the result holds for some $k$, i.e.,
$$(A - \mu_k I) \cdots (A - \mu_1 I) = Q_k R_k. \tag{3.14}$$
From the $k+1$ step we have
$$A_k - \mu_{k+1} I = \tilde Q_{k+1} \tilde R_{k+1}. \tag{3.15}$$
Combining equations (3.13) and (3.15), we get
$$A_k - \mu_{k+1} I = Q_k^T A Q_k - \mu_{k+1} I = Q_k^T (A - \mu_{k+1} I) Q_k = \tilde Q_{k+1} \tilde R_{k+1},$$
and hence
$$A - \mu_{k+1} I = Q_k \tilde Q_{k+1} \tilde R_{k+1} Q_k^T = Q_{k+1} \tilde R_{k+1} Q_k^T. \tag{3.16}$$
Combining equations (3.14) and (3.16), we get
$$(A - \mu_{k+1} I)(A - \mu_k I) \cdots (A - \mu_1 I) = Q_{k+1} \tilde R_{k+1} Q_k^T Q_k R_k = Q_{k+1} R_{k+1},$$
which is the result for $k+1$. This completes the proof by induction.
It follows from Theorem 4 that
$$(A - \mu_k I) \cdots (A - \mu_1 I)\, e_1 = Q_k R_k e_1.$$
Since $R_k$ is upper triangular, $Q_k R_k e_1$ is proportional to the first column of $Q_k$. Thus, the first column of $Q_k$, apart from a constant multiplier, is the result of applying the power method with shifts to $e_1$. Taking the inverse of the result in Theorem 4, we obtain
$$(A - \mu_1 I)^{-1} \cdots (A - \mu_k I)^{-1} = R_k^{-1} Q_k^T. \tag{3.17}$$
Since for each $j$ the factor $A - \mu_j I$ is symmetric, its inverse $(A - \mu_j I)^{-1}$ is also symmetric. Taking the transpose of equation (3.17), we get
$$(A - \mu_k I)^{-1} \cdots (A - \mu_1 I)^{-1} = Q_k \bigl(R_k^{-1}\bigr)^T. \tag{3.18}$$
Applying equation (3.18) to $e_n$, we get
$$(A - \mu_k I)^{-1} \cdots (A - \mu_1 I)^{-1} e_n = Q_k \bigl(R_k^{-1}\bigr)^T e_n.$$
Since $\bigl(R_k^{-1}\bigr)^T$ is lower triangular, $\bigl(R_k^{-1}\bigr)^T e_n$ is a multiple of the last column of $Q_k$. Therefore, the last column of $Q_k$, apart from a constant multiplier, is the result of applying the inverse power method with shifts to $e_n$. We have yet to say how the shifts are to be chosen. One choice is to choose $\mu_k$ to be the Rayleigh quotient corresponding to the last column of $Q_{k-1}$. This is readily available to us since, by equation (3.13), it is equal to the $(n,n)$ element of $A_{k-1}$. By our remarks on Rayleigh quotient iteration, we should expect cubic convergence to the eigenvalue $\lambda_n$. This choice of shifts generally leads to convergence, but there are a few matrices for which the process fails to converge. For example, consider the matrix
$$A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$$
The unshifted QR algorithm doesn't converge since
$$A = \tilde Q_1 \tilde R_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
\quad\Longrightarrow\quad
A_1 = \tilde R_1 \tilde Q_1 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = A.$$
Thus, all the iterates are equal to $A$. The Rayleigh quotient shift doesn't help since $a_{22} = 0$. A shift that does work all the time is the Wilkinson shift. This shift is obtained by considering the lower-rightmost $2 \times 2$ submatrix of $A_{k-1}$ and choosing $\mu_k$ to be the eigenvalue of this $2 \times 2$ submatrix that is closest to the $(n,n)$ element of $A_{k-1}$. When there is sufficient convergence to the eigenvalue $\lambda_n$, the off-diagonal elements in the last row and column of the $A_k$ matrices will be very small. We can deflate these matrices by removing the last row and column, and then $\lambda_{n-1}$ can be obtained using the deflated matrices. Continuing in this manner we can obtain all of the eigenvalues.
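A compact sketch of the shifted iteration with Wilkinson shifts and deflation is given below (Python/NumPy, for illustration only; a production code would work on the tridiagonal form and chase a bulge implicitly rather than form explicit QR factorizations).

```python
import numpy as np

def wilkinson_shift(A):
    """Eigenvalue of the trailing 2x2 block closest to A[-1, -1]."""
    a, b, c = A[-2, -2], A[-1, -2], A[-1, -1]
    d = (a - c) / 2.0
    denom = d + np.sign(d if d != 0 else 1.0) * np.hypot(d, b)
    return c - b * b / denom

def shifted_qr_eigenvalues(A, tol=1e-12, max_sweeps=1000):
    """QR iteration with Wilkinson shifts and deflation (symmetric A)."""
    A = np.array(A, dtype=float)
    eigs = []
    n = A.shape[0]
    sweeps = 0
    while n > 1 and sweeps < max_sweeps:
        if np.linalg.norm(A[n-1, :n-1]) <= tol * np.linalg.norm(A[:n, :n]):
            eigs.append(A[n-1, n-1])           # last row has decoupled: deflate
            n -= 1
            continue
        mu = wilkinson_shift(A[:n, :n])
        Q, R = np.linalg.qr(A[:n, :n] - mu * np.eye(n))
        A[:n, :n] = R @ Q + mu * np.eye(n)     # one shifted QR step
        sweeps += 1
    eigs.extend(np.diag(A[:n, :n]))
    return np.sort(np.array(eigs))

A = np.array([[0., 1.], [1., 0.]])   # the example that defeats the Rayleigh shift
print(shifted_qr_eigenvalues(A))     # [-1.  1.]
```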
Until recently, the QR method with shifts (or one of its variants) was the primary method for computing eigenvalues and eigenvectors. Recently a competitor has emerged called the divide-and-conquer algorithm.
3.7 The Divide-and-Conquer Method
The divide-and-conquer algorithm was first introduced by Cuppen [3] in 1981. As first introduced, the algorithm suffered from certain accuracy and stability problems. These were not overcome until a stable algorithm was introduced in 1993 by Gu and Eisenstat [8]. The divide-and-conquer algorithm is faster than the shifted QR algorithm if the size is greater than about 25 and both eigenvalues and eigenvectors are required. Let us begin by discussing the basic theory underlying the method. Let $T$ denote a symmetric tridiagonal matrix for which we desire the eigenvalues and eigenvectors, i.e., $T$ has the form
$$T = \begin{pmatrix}
a_1 & b_1 & & & & \\
b_1 & \ddots & \ddots & & & \\
 & \ddots & a_m & b_m & & \\
 & & b_m & a_{m+1} & \ddots & \\
 & & & \ddots & \ddots & b_{n-1} \\
 & & & & b_{n-1} & a_n
\end{pmatrix}. \tag{3.19}$$
The matrix $T$ can be split into the sum of two matrices as follows:
$$T = \begin{pmatrix} T_1 & 0 \\ 0 & T_2 \end{pmatrix} + b_m \begin{pmatrix} 0 \\ \vdots \\ 0 \\ 1 \\ 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}(0, \dots, 0, 1, 1, 0, \dots, 0) = \begin{pmatrix} T_1 & 0 \\ 0 & T_2 \end{pmatrix} + b_m v v^T \tag{3.20}$$
where $m$ is roughly one half of $n$, $T_1$ and $T_2$ are tridiagonal, and $v$ is the vector $v = e_m + e_{m+1}$. Here $T_1$ is the leading $m \times m$ block of $T$ with its $(m,m)$ entry replaced by $a_m - b_m$, and $T_2$ is the trailing block with its $(1,1)$ entry replaced by $a_{m+1} - b_m$. Suppose we have the following eigendecompositions of $T_1$ and $T_2$:
$$T_1 = Q_1 \Lambda_1 Q_1^T \qquad T_2 = Q_2 \Lambda_2 Q_2^T \tag{3.21}$$
where $\Lambda_1$ and $\Lambda_2$ are diagonal matrices of eigenvalues. Then $T$ can be written
$$T = \begin{pmatrix} T_1 & 0 \\ 0 & T_2 \end{pmatrix} + b_m v v^T
= \begin{pmatrix} Q_1 \Lambda_1 Q_1^T & 0 \\ 0 & Q_2 \Lambda_2 Q_2^T \end{pmatrix} + b_m v v^T
= \begin{pmatrix} Q_1 & 0 \\ 0 & Q_2 \end{pmatrix}\left[\begin{pmatrix} \Lambda_1 & 0 \\ 0 & \Lambda_2 \end{pmatrix} + b_m u u^T\right]\begin{pmatrix} Q_1^T & 0 \\ 0 & Q_2^T \end{pmatrix} \tag{3.22}$$
where
$$u = \begin{pmatrix} Q_1^T & 0 \\ 0 & Q_2^T \end{pmatrix} v.$$
Therefore, $T$ is similar to a matrix of the form $D + \mu u u^T$ where $D = \operatorname{diag}(d_1, \dots, d_n)$. Thus, it suffices to look at the eigenproblem for matrices of the form $D + \mu u u^T$. Let us assume first that $\lambda$ is an eigenvalue of $D + \mu u u^T$, but is not an eigenvalue of $D$. Let $x$ be an eigenvector of $D + \mu u u^T$ corresponding to $\lambda$. Then
$$(D + \mu u u^T)x = Dx + \mu (u^T x)\, u = \lambda x,$$
and hence
$$x = -\mu (u^T x)(D - \lambda I)^{-1} u. \tag{3.23}$$
Multiplying equation (3.23) by $u^T$ and collecting terms, we get
$$(u^T x)\Bigl[1 + \mu\, u^T (D - \lambda I)^{-1} u\Bigr] = (u^T x)\Biggl[1 + \mu \sum_{k=1}^{n} \frac{u_k^2}{d_k - \lambda}\Biggr] = 0. \tag{3.24}$$
Since $\lambda$ is not an eigenvalue of $D$, we must have $u^T x \ne 0$. Thus,
$$f(\lambda) = 1 + \mu \sum_{k=1}^{n} \frac{u_k^2}{d_k - \lambda} = 0. \tag{3.25}$$
Equation (3.25) is called the secular equation and $f(\lambda)$ is called the secular function. The eigenvalues of $D + \mu u u^T$ that are not eigenvalues of $D$ are roots of the secular equation. It follows from equation (3.23) that the eigenvector corresponding to the eigenvalue $\lambda$ is proportional to $(D - \lambda I)^{-1} u$. Figure 3.1 shows a plot of an example secular function.

The slope of $f(\lambda)$ is given by
$$f'(\lambda) = \mu \sum_{k=1}^{n} \frac{u_k^2}{(d_k - \lambda)^2}.$$
Thus, the slope (when it exists) is positive if $\mu > 0$ and negative if $\mu < 0$. Suppose the $d_i$ are such that $d_1 > d_2 > \cdots > d_n$ and that all the components of $u$ are nonzero. Then there must be a root between each pair $(d_i, d_{i+1})$. This gives $n-1$ roots. Since $f(\lambda) \to 1$ as $\lambda \to -\infty$ or as $\lambda \to \infty$, there is another root greater than $d_1$ if $\mu > 0$ and a root less than $d_n$ if $\mu < 0$. This gives $n$ roots. The only way the secular equation will have less than $n$ roots is if one or more of the components of $u$ are zero or if one or more of the $d_i$ are equal. Suppose $\lambda$ is a root of the secular equation. We will show that $x = (D - \lambda I)^{-1} u$ is an eigenvector of $D + \mu u u^T$ corresponding to the eigenvalue $\lambda$. Since $\lambda$ is a root of the secular equation, we have
$$f(\lambda) = 1 + \mu\, u^T (D - \lambda I)^{-1} u = 1 + \mu\, u^T x = 0$$
or $\mu\, u^T x = -1$. Since $x = (D - \lambda I)^{-1} u$, we have
$$(D - \lambda I)x = Dx - \lambda x = u \quad \text{or} \quad Dx - u = \lambda x.$$
It follows that
$$(D + \mu u u^T)x = Dx + \mu (u^T x)\, u = Dx - u = \lambda x$$
as was to be proved.
Figure 3.1: Graph of $f(\lambda) = 1 + \dfrac{.5}{1-\lambda} + \dfrac{.5}{2-\lambda} + \dfrac{.5}{3-\lambda} + \dfrac{.5}{4-\lambda}$
Let us now look at the special cases where there are less than $n$ roots of the secular equation. If $u_i = 0$, then
$$(D + \mu u u^T)e_i = De_i + \mu (u^T e_i)\, u = De_i + \mu u_i u = De_i = d_i e_i,$$
i.e., $e_i$ is an eigenvector of $D + \mu u u^T$ corresponding to the eigenvalue $d_i$.

If $d_i = d_j$ for $i \ne j$ and either $u_i$ or $u_j$ is nonzero, then the vector $x = \alpha e_i + \beta e_j$ is an eigenvector of $D$ corresponding to the eigenvalue $d_i$ for any $\alpha$ and $\beta$ that are not both zero. We can choose $\alpha$ and $\beta$ so that
$$u^T x = \alpha u_i + \beta u_j = 0.$$
For example, $\alpha = u_j$ and $\beta = -u_i$ would work. With this choice of $\alpha$ and $\beta$, the vector $x = \alpha e_i + \beta e_j$ is an eigenvector of $D + \mu u u^T$ corresponding to the eigenvalue $d_i$. In this way we can obtain $n$ eigenvalues and eigenvectors even when the secular equation has less than $n$ roots.

Finding Roots of the Secular Equation. The first thought would be to use Newton's method to find the roots of $f(\lambda)$. However, when one or more of the $u_i$ are small but not small enough to neglect, the function $f(\lambda)$ behaves pretty much like it would if the terms corresponding to the small $u_i$ were not present until $\lambda$ is very close to one of the corresponding $d_i$, where it abruptly approaches $\pm\infty$. Thus, almost any initial guess will lead away from the desired root. This is illustrated in Figure 3.2 where the .5 factor multiplying $1/(2-\lambda)$ in the previous example is replaced by 0.01. Notice that the curve is almost vertical at the zero crossing near 2.
Figure 3.2: Graph of $f(\lambda) = 1 + \dfrac{.5}{1-\lambda} + \dfrac{.01}{2-\lambda} + \dfrac{.5}{3-\lambda} + \dfrac{.5}{4-\lambda}$
To solve this problem, a modified form of Newton's method is used. Newton's method approximates the curve near the guess by the tangent line at the guess and then finds the place where this line crosses zero. Alternatively, we could approximate $f(\lambda)$ near the guess by another curve that is tangent to $f(\lambda)$ at the guess as long as we can find the nearby zero crossing of this curve. If we are looking for a root between $d_i$ and $d_{i+1}$, we could use a function of the form
$$g(\lambda) = c_1 + \frac{c_2}{d_i - \lambda} + \frac{c_3}{d_{i+1} - \lambda} \tag{3.26}$$
to approximate $f(\lambda)$. Once $c_1$, $c_2$ and $c_3$ are chosen, the roots of $g(\lambda)$ can be found by solving the quadratic equation
$$c_1 (d_i - \lambda)(d_{i+1} - \lambda) + c_2 (d_{i+1} - \lambda) + c_3 (d_i - \lambda) = 0. \tag{3.27}$$
Let us write $f(\lambda)$ as follows
$$f(\lambda) = 1 + \mu \psi_1(\lambda) + \mu \psi_2(\lambda) \tag{3.28}$$
where
$$\psi_1(\lambda) = \sum_{k=1}^{i} \frac{u_k^2}{d_k - \lambda} \qquad \text{and} \qquad \psi_2(\lambda) = \sum_{k=i+1}^{n} \frac{u_k^2}{d_k - \lambda}. \tag{3.29}$$
Notice that $\psi_1$ has only positive terms and $\psi_2$ has only negative terms for $d_{i+1} < \lambda < d_i$. If $\lambda_j$ is our initial guess, then we approximate $\psi_1$ near $\lambda_j$ by the function $g_1$ given by
$$g_1(\lambda) = \gamma_1 + \frac{\gamma_2}{d_i - \lambda} \tag{3.30}$$
where $\gamma_1$ and $\gamma_2$ are chosen so that
$$g_1(\lambda_j) = \psi_1(\lambda_j) \qquad \text{and} \qquad g_1'(\lambda_j) = \psi_1'(\lambda_j). \tag{3.31}$$
It is easily shown that $\gamma_1 = \psi_1(\lambda_j) - (d_i - \lambda_j)\psi_1'(\lambda_j)$ and $\gamma_2 = (d_i - \lambda_j)^2 \psi_1'(\lambda_j)$. Similarly, we approximate $\psi_2$ near $\lambda_j$ by the function $g_2$ given by
$$g_2(\lambda) = \gamma_3 + \frac{\gamma_4}{d_{i+1} - \lambda} \tag{3.32}$$
where $\gamma_3$ and $\gamma_4$ are chosen so that
$$g_2(\lambda_j) = \psi_2(\lambda_j) \qquad \text{and} \qquad g_2'(\lambda_j) = \psi_2'(\lambda_j). \tag{3.33}$$
Again it is easily shown that $\gamma_3 = \psi_2(\lambda_j) - (d_{i+1} - \lambda_j)\psi_2'(\lambda_j)$ and $\gamma_4 = (d_{i+1} - \lambda_j)^2 \psi_2'(\lambda_j)$. Putting these approximations together, we have the following approximation for $f$ near $\lambda_j$:
$$f(\lambda) \approx 1 + \mu g_1(\lambda) + \mu g_2(\lambda) = (1 + \mu\gamma_1 + \mu\gamma_3) + \frac{\mu\gamma_2}{d_i - \lambda} + \frac{\mu\gamma_4}{d_{i+1} - \lambda} \equiv c_1 + \frac{c_2}{d_i - \lambda} + \frac{c_3}{d_{i+1} - \lambda}. \tag{3.34}$$
This modified Newton's method generally converges very fast.
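To make the secular equation concrete, the sketch below (Python/SciPy, illustrative only) evaluates $f(\lambda)$ and locates the root in each interval $(d_{i+1}, d_i)$ with a simple bracketing solver; it deliberately substitutes robust bracketing for the tangent-type iteration just described, which is what a real divide-and-conquer code would use.

```python
import numpy as np
from scipy.optimize import brentq

def secular_roots(d, u, mu, eps=1e-10):
    """Roots of f(lam) = 1 + mu*sum(u_k^2/(d_k - lam)) for mu > 0 and
    distinct d sorted in decreasing order (simple bracketing version)."""
    f = lambda lam: 1.0 + mu * np.sum(u**2 / (d - lam))
    roots = []
    for i in range(len(d) - 1):                   # one root in each (d[i+1], d[i])
        roots.append(brentq(f, d[i + 1] + eps, d[i] - eps))
    hi = d[0] + mu * np.sum(u**2) + 1.0           # for mu > 0, one root beyond d[0]
    roots.append(brentq(f, d[0] + eps, hi))
    return np.sort(np.array(roots))

d = np.array([4.0, 3.0, 2.0, 1.0])
u = np.array([0.5, 0.5, 0.5, 0.5])
lam = secular_roots(d, u, mu=1.0)
print(lam)
print(np.sort(np.linalg.eigvalsh(np.diag(d) + 1.0 * np.outer(u, u))))  # same values
```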
Recursive Procedure. We have shown how the eigenvalues and eigenvectors of $T$ can be obtained from the eigenvalues and eigenvectors of the smaller matrices $T_1$ and $T_2$. The procedure we have applied to $T$ can also be applied to $T_1$ and $T_2$. Continuing in this manner we can reduce the original eigenproblem to the solution of a series of 1-dimensional eigenproblems and the solution of a series of secular equations. In practice the recursive procedure is not carried all the way down to 1-dimensional problems, but stops at some size where the QR method can be applied effectively. We saw previously that the eigenvector corresponding to the eigenvalue $\lambda$ is proportional to $(D - \lambda I)^{-1} u$ as in equation (3.23). There are a number of subtle issues involved in computing the eigenvectors this way when there are closely spaced pairs of eigenvalues. The interested reader should consult the book by Demmel [4] for a discussion of these issues.
Chapter 4
Iterative Methods
Direct methods for solving systems of equations $Ax = b$ or computing eigenvalues/eigenvectors of a matrix $A$ become very expensive when the size $n$ of $A$ becomes large. These methods generally involve order $n^3$ operations and order $n^2$ storage. For large problems iterative methods are often used. Each step of an iterative method generally involves the multiplication of the matrix $A$ by a vector $v$ to obtain $Av$. Since the matrix $A$ is not modified in this process it is often possible to take advantage of special structure of the matrix in forming $Av$. The special structure most often exploited is sparseness (many elements of $A$ zero). Taking advantage of the structure of $A$ can often drastically reduce the cost of each iteration. The cost of iterative methods also depends on the rate of convergence. Convergence is usually better when the matrix $A$ is well conditioned. Therefore, preconditioning of the matrix is often employed prior to the start of iteration. There are many iterative methods. In this section we will discuss only two: the Lanczos method for eigenproblems and the conjugate gradient method for equation solution.
4.1 The Lanczos Method
As before, we will restrict our attention here to real symmetric matrices. We saw previously that the power method is an iterative method whose $m$-th iterate $x^{(m)}$ is given by $x^{(m)} = A x^{(m-1)}$. Lanczos had the idea that better convergence could be obtained if we made use of all the iterates $x^{(0)}, A x^{(0)}, A^2 x^{(0)}, \dots, A^m x^{(0)}$ at the $m$-th step instead of just the final iterate $x^{(m)}$. The subspace generated by $x^{(0)}, A x^{(0)}, \dots, A^{m-1} x^{(0)}$ is called the $m$-th Krylov subspace and is denoted by $K_m$. Lanczos showed that you could generate an orthonormal basis $q_1, \dots, q_m$ of the Krylov subspace $K_m$ recursively. He then showed that the eigenproblem restricted to this subspace is equivalent to finding the eigenvalues/eigenvectors of the tridiagonal matrix $T_m = Q_m^T A Q_m$ where $Q_m$ is the matrix whose columns are $q_1, \dots, q_m$. As $m$ becomes larger some of the eigenvalues of $T_m$ converge to eigenvalues of $A$.
Let $q_1$ be defined by
$$q_1 = x^{(0)}/\|x^{(0)}\| \tag{4.1}$$
and let $q_2$ be given by
$$q_2 = r_1/\|r_1\| \qquad \text{where } r_1 = A q_1 - (q_1 \cdot A q_1)\, q_1. \tag{4.2}$$
It is easily verified that $r_1 \cdot q_1 = q_2 \cdot q_1 = 0$. We generate the remaining vectors $q_k$ recursively. Suppose $q_1, \dots, q_p$ have been generated. We form
$$r_p = A q_p - (q_p \cdot A q_p)\, q_p - (A q_p \cdot q_{p-1})\, q_{p-1} \tag{4.3}$$
$$q_{p+1} = r_p/\|r_p\|. \tag{4.4}$$
Clearly, $r_p \cdot q_p = r_p \cdot q_{p-1} = 0$ by construction. For $s \le p-2$ we have
$$r_p \cdot q_s = A q_p \cdot q_s = q_p \cdot A q_s. \tag{4.5}$$
But, it follows from equations (4.3)-(4.4) that
$$A q_s = r_s + (q_s \cdot A q_s)\, q_s + (A q_s \cdot q_{s-1})\, q_{s-1} = \|r_s\|\, q_{s+1} + (q_s \cdot A q_s)\, q_s + (A q_s \cdot q_{s-1})\, q_{s-1}. \tag{4.6}$$
Thus, $r_p \cdot q_s = q_p \cdot A q_s = 0$ since $A q_s$ is a linear combination of vectors $q_k$ with $k < p$. It follows that $q_{p+1}$ is orthogonal to all of the preceding $q_k$ vectors. We will now show that $q_1, \dots, q_m$ is a basis for the space $K_m$. It follows from equations (4.3) and (4.4) that
$$\langle x^{(0)} \rangle = \langle q_1 \rangle \qquad \text{and} \qquad \langle x^{(0)}, A x^{(0)} \rangle = \langle q_1, q_2 \rangle.$$
Suppose for some $k$ we have
$$\langle x^{(0)}, A x^{(0)}, \dots, A^{k-1} x^{(0)} \rangle = \langle q_1, q_2, \dots, q_k \rangle.$$
Then, $A^k x^{(0)}$ can be written as a linear combination of $A q_1, \dots, A q_k$. It follows from equations (4.3) and (4.4) that $A q_i$ can be written as a linear combination of $q_{i-1}$, $q_i$, $q_{i+1}$. Therefore, $A^k x^{(0)}$ can be written as a linear combination of $q_1, \dots, q_{k+1}$ and hence
$$\langle x^{(0)}, A x^{(0)}, \dots, A^k x^{(0)} \rangle = \langle q_1, q_2, \dots, q_{k+1} \rangle.$$
It follows by induction that $q_1, \dots, q_m$ is a basis for $K_m = \langle x^{(0)}, A x^{(0)}, \dots, A^{m-1} x^{(0)} \rangle$.

Define $\alpha_p = q_p \cdot A q_p$ and $\beta_p = A q_p \cdot q_{p-1}$. Then
$$\beta_p = A q_p \cdot q_{p-1} = q_p \cdot A q_{p-1} = q_p \cdot \bigl[\|r_{p-1}\|\, q_p + (q_{p-1} \cdot A q_{p-1})\, q_{p-1} + (A q_{p-1} \cdot q_{p-2})\, q_{p-2}\bigr] = \|r_{p-1}\|. \tag{4.7}$$
It follows from equations (4.3), (4.4), and (4.7) that
$$A q_p = \beta_{p+1}\, q_{p+1} + \alpha_p\, q_p + \beta_p\, q_{p-1}. \tag{4.8}$$
In view of equation (4.8), the matrix $T_m = Q_m^T A Q_m$ has the tridiagonal form
$$T_m = Q_m^T A Q_m = \begin{pmatrix}
\alpha_1 & \beta_2 & & & \\
\beta_2 & \alpha_2 & \beta_3 & & \\
 & \beta_3 & \ddots & \ddots & \\
 & & \ddots & \alpha_{m-1} & \beta_m \\
 & & & \beta_m & \alpha_m
\end{pmatrix}. \tag{4.9}$$
The original eigenvalue problem can be given a variational interpretation. Let a function $\rho$ be defined by
$$\rho(x) = \frac{Ax \cdot x}{x \cdot x}. \tag{4.10}$$
We will show that $\rho(x)$ is an eigenvalue of $A$ if and only if $x$ is a stationary point of $\rho$, i.e., $\delta_h \rho(x) = 0$ for all $h$. Since
$$\delta_h \rho(x) \equiv \frac{d}{d\lambda}\rho(x + \lambda h)\Big|_{\lambda=0} = \frac{(x \cdot x)(2Ax \cdot h) - (Ax \cdot x)(2x \cdot h)}{(x \cdot x)^2} = \frac{2}{x \cdot x}\left[Ax - \frac{Ax \cdot x}{x \cdot x}\, x\right] \cdot h, \tag{4.11}$$
we have
$$\delta_h \rho(x) = 0 \text{ for all } h \iff \left[Ax - \frac{Ax \cdot x}{x \cdot x}\, x\right] \cdot h = 0 \text{ for all } h \tag{4.12a}$$
$$\iff Ax = \frac{Ax \cdot x}{x \cdot x}\, x = \rho(x)\, x. \tag{4.12b}$$
Suppose in this variational principle we restrict both $x$ and $h$ to the subspace $K_m$. Then $x$ and $h$ can be expressed in the form $x = Q_m y$ and $h = Q_m w$ for some $y, w \in \mathbb{R}^m$. With these relations equation (4.12a) becomes
$$\left[A Q_m y - \frac{(A Q_m y) \cdot Q_m y}{Q_m y \cdot Q_m y}\, Q_m y\right] \cdot Q_m w = 0 \quad \text{for all } w$$
or
$$\left[(Q_m^T A Q_m) y - \frac{(Q_m^T A Q_m) y \cdot y}{y \cdot y}\, y\right] \cdot w = 0 \quad \text{for all } w. \tag{4.13}$$
Thus the variational principle restricted to $K_m$ leads to the reduced eigenvalue problem
$$T_m y = (Q_m^T A Q_m) y = \mu y. \tag{4.14}$$
It has been found that the extreme eigenvalues usually converge the fastest with this method. The biggest numerical problem with this method is that round-off errors cause the vectors $\{q_k\}$ generated in this way to tend to lose their orthogonality as the number of steps increases. It has been found that this loss of orthogonality increases rapidly whenever one of the eigenvalues of $T_m$ approaches an eigenvalue of $A$. There are a number of methods that counteract this loss of orthogonality by periodically reorthogonalizing the vectors $\{q_k\}$ based on the convergence of the eigenvalues.
We can give another way of looking at the Lanczos algorithm. Let $P_m$ denote the matrix whose columns are $x^{(0)}, A x^{(0)}, \dots, A^{m-1} x^{(0)}$. We will show that $P_m$ has a reduced QR factorization $P_m = Q_m R_m$ where $Q_m$ is the matrix occurring in the Lanczos method with columns $q_1, \dots, q_m$. We have shown previously that
$$\begin{aligned}
\langle x^{(0)} \rangle &= \langle q_1 \rangle \\
\langle x^{(0)}, A x^{(0)} \rangle &= \langle q_1, q_2 \rangle \\
&\ \ \vdots \\
\langle x^{(0)}, A x^{(0)}, \dots, A^{m-1} x^{(0)} \rangle &= \langle q_1, q_2, \dots, q_m \rangle.
\end{aligned}$$
We can express this result in matrix form as
$$P_m = (x^{(0)}, A x^{(0)}, \dots, A^{m-1} x^{(0)}) = (q_1, \dots, q_m) R_m = Q_m R_m \tag{4.15}$$
where $R_m$ is an upper triangular matrix. This is the reduced QR factorization that we set out to establish. Of course we don't want to determine $Q_m$ and $R_m$ directly since the matrix $P_m$ becomes poorly conditioned for large $m$.
4.2 The Conjugate Gradient Method
The conjugate gradient (CG) method is a widely used iterative method for solving a system of equations $Ax = b$ when $A$ is symmetric and positive definite. It was first introduced in 1952 by Hestenes and Stiefel [9]. Although this was not the original motivation, the CG method can be considered as a Krylov subspace method related to the Lanczos method. We assume that $q_1, q_2, \dots$ are orthonormal vectors generated using the Lanczos recursion starting with the initial vector $b$. As before we let $Q_k = (q_1, \dots, q_k)$ and $T_k = Q_k^T A Q_k$. Since $A$ is positive definite, we can define an $A$-norm by
$$\|x\|_A^2 = x^T A x. \tag{4.16}$$
We will show that each iterate $x_m$ in the CG method is the unique element of the Krylov subspace $K_m$ that minimizes the error $\|x - x_m\|_A$ where $x$ is the solution of $Ax = b$.

Let $r_k$ denote the residual $r_k = b - A x_k$. Since $q_1 = b/\|b\|$, it follows that
$$Q_k^T r_k = Q_k^T(b - A x_k) = Q_k^T b - Q_k^T A x_k = \begin{pmatrix} q_1^T \\ \vdots \\ q_k^T \end{pmatrix} b - T_k Q_k^T x_k = \|b\|\, e_1 - T_k Q_k^T x_k = T_k Q_k^T\bigl(\|b\|\, Q_k T_k^{-1} e_1 - x_k\bigr). \tag{4.17}$$
If $x_k$ is chosen to be
$$x_k = \|b\|\, Q_k T_k^{-1} e_1, \tag{4.18}$$
then $Q_k^T r_k = 0$, i.e., $r_k$ is orthogonal to each of the vectors $q_1, \dots, q_k$ and hence to every vector in $K_k$. It follows from equation (4.18) that $x_k$ is a linear combination of $q_1, \dots, q_k$ and hence is a member of $K_k$. If $\tilde x$ is an arbitrary element of $K_k$, then
$$\tilde x = x_k + z \quad \text{for some } z \text{ in } K_k.$$
Since $r_k$ is orthogonal to every vector in $K_k$, we have
$$\begin{aligned}
\|x - \tilde x\|_A^2 &= (x - \tilde x)^T A (x - \tilde x) \\
&= (x - x_k - z)^T A (x - x_k - z) \\
&= \|x - x_k\|_A^2 + \|z\|_A^2 - 2 z^T A (x - x_k) \\
&= \|x - x_k\|_A^2 + \|z\|_A^2 - 2 z^T r_k \\
&= \|x - x_k\|_A^2 + \|z\|_A^2. 
\end{aligned} \tag{4.19}$$
Thus $\|x - \tilde x\|_A^2$ is minimized for $z = 0$, i.e., when $\tilde x = x_k$. We will now develop a simple recursive method to generate the iterates $x_k$.
The matrix $T_k = Q_k^T A Q_k$ is also positive definite and hence has a Cholesky factorization
$$T_k = L_k D_k L_k^T \tag{4.20}$$
where $L_k$ is unit lower triangular and $D_k$ is diagonal with positive diagonals. Combining equations (4.18) and (4.20), we get
$$x_k = \|b\|\, Q_k \bigl(L_k^{-T} D_k^{-1} L_k^{-1}\bigr) e_1 = \tilde P_k y_k \tag{4.21}$$
where $\tilde P_k = Q_k L_k^{-T}$ and $y_k = \|b\|\, D_k^{-1} L_k^{-1} e_1$. We denote the columns of $\tilde P_k$ by $\tilde p_1, \dots, \tilde p_k$ and the components of $y_k$ by $\eta_1, \dots, \eta_k$. We will show that the columns of $\tilde P_{k-1}$ are $\tilde p_1, \dots, \tilde p_{k-1}$ and the components of $y_{k-1}$ are $\eta_1, \dots, \eta_{k-1}$. It follows from equation (4.20) and the definition of $\tilde P_k$ that
$$\tilde P_k^T A \tilde P_k = L_k^{-1} Q_k^T A Q_k L_k^{-T} = L_k^{-1} T_k L_k^{-T} = L_k^{-1}\bigl(L_k D_k L_k^T\bigr) L_k^{-T} = D_k.$$
Thus
$$\tilde p_i^T A \tilde p_j = 0 \quad \text{for all } i \ne j. \tag{4.22}$$
It is easy to see from equation (4.9) that $T_{k-1}$ is the leading $(k-1) \times (k-1)$ submatrix of $T_k$. Equation (4.20) can be written
$$T_k = \begin{pmatrix}
1 & & & \\
l_1 & \ddots & & \\
 & \ddots & \ddots & \\
 & & l_{k-1} & 1
\end{pmatrix}
\begin{pmatrix}
d_1 & & & \\
 & \ddots & & \\
 & & d_{k-1} & \\
 & & & d_k
\end{pmatrix}
\begin{pmatrix}
1 & & & \\
l_1 & \ddots & & \\
 & \ddots & \ddots & \\
 & & l_{k-1} & 1
\end{pmatrix}^T
= \begin{pmatrix} L_{k-1} & 0 \\ l_{k-1} e_{k-1}^T & 1 \end{pmatrix}
\begin{pmatrix} D_{k-1} & 0 \\ 0 & d_k \end{pmatrix}
\begin{pmatrix} L_{k-1} & 0 \\ l_{k-1} e_{k-1}^T & 1 \end{pmatrix}^T
= \begin{pmatrix} L_{k-1} D_{k-1} L_{k-1}^T & * \\ * & * \end{pmatrix}$$
where $*$ denotes terms that are not significant to the argument. Thus, $L_{k-1}$ and $D_{k-1}$ are the leading $(k-1) \times (k-1)$ submatrices of $L_k$ and $D_k$ respectively. Since $L_k$ has the form
$$L_k = \begin{pmatrix} L_{k-1} & 0 \\ * & 1 \end{pmatrix},$$
the inverse $L_k^{-1}$ must have the form
$$L_k^{-1} = \begin{pmatrix} L_{k-1}^{-1} & 0 \\ * & 1 \end{pmatrix}.$$
Therefore, it follows from the definition of $y_k$ that
$$y_k = \|b\|\, D_k^{-1} L_k^{-1} e_1
= \|b\| \begin{pmatrix} D_{k-1}^{-1} & 0 \\ 0 & 1/d_k \end{pmatrix}\begin{pmatrix} L_{k-1}^{-1} & 0 \\ * & 1 \end{pmatrix} e_1
= \|b\| \begin{pmatrix} D_{k-1}^{-1} L_{k-1}^{-1} & 0 \\ * & 1/d_k \end{pmatrix}\begin{pmatrix} e_1 \\ 0 \end{pmatrix}
= \begin{pmatrix} \|b\|\, D_{k-1}^{-1} L_{k-1}^{-1} e_1 \\ \eta_k \end{pmatrix}
= \begin{pmatrix} y_{k-1} \\ \eta_k \end{pmatrix}$$
(here $e_1$ inside the block form is a $(k-1)$-vector), i.e., $y_{k-1}$ consists of the first $k-1$ components of $y_k$. It follows from the definition of $\tilde P_k$ that
$$\tilde P_k = Q_k L_k^{-T} = (Q_{k-1},\ q_k)\begin{pmatrix} L_{k-1}^{-T} & * \\ 0 & 1 \end{pmatrix} = (Q_{k-1} L_{k-1}^{-T},\ \tilde p_k) = (\tilde P_{k-1},\ \tilde p_k),$$
i.e., $\tilde P_{k-1}$ consists of the first $k-1$ columns of $\tilde P_k$.
We now develop a recursion relation for $x_k$. It follows from equation (4.21) that
$$x_k = \tilde P_k y_k = (\tilde P_{k-1},\ \tilde p_k)\begin{pmatrix} y_{k-1} \\ \eta_k \end{pmatrix} = \tilde P_{k-1} y_{k-1} + \eta_k \tilde p_k = x_{k-1} + \eta_k \tilde p_k. \tag{4.23}$$
We now develop a recursion relation for $\tilde p_k$. It follows from the definition of $\tilde P_k$ that $\tilde P_k L_k^T = Q_k$ or
$$(\tilde p_1, \dots, \tilde p_k)\begin{pmatrix} 1 & l_1 & & \\ & \ddots & \ddots & \\ & & \ddots & l_{k-1} \\ & & & 1 \end{pmatrix} = (q_1, \dots, q_k). \tag{4.24}$$
Equating the $k$-th columns in equation (4.24), we get
$$l_{k-1}\, \tilde p_{k-1} + \tilde p_k = q_k \quad \text{or} \quad \tilde p_k = q_k - l_{k-1}\, \tilde p_{k-1}. \tag{4.25}$$
Next we develop a recursion relation for the residuals $r_k$. Multiplying equation (4.23) by $A$ and subtracting from $b$, we obtain
$$r_k = r_{k-1} - \eta_k A \tilde p_k. \tag{4.26}$$
Since $x_{k-1}$ belongs to $K_{k-1}$, it follows that $A x_{k-1}$ belongs to $K_k$. Since $b$ also belongs to $K_k$, it is clear that $r_{k-1} = b - A x_{k-1}$ is a member of $K_k$. Since $r_{k-1}$ and $q_k$ both belong to $K_k$ and both are orthogonal to $K_{k-1}$, they must be parallel. Thus,
$$q_k = \frac{r_{k-1}}{\|r_{k-1}\|}. \tag{4.27}$$
We now define $p_k$ by
$$p_k = \|r_{k-1}\|\, \tilde p_k. \tag{4.28}$$
Substituting equations (4.27) and (4.28) into equations (4.23), (4.26), and (4.25), we get
$$x_k = x_{k-1} + \frac{\eta_k}{\|r_{k-1}\|}\, p_k = x_{k-1} + \nu_k p_k \tag{4.29a}$$
$$r_k = r_{k-1} - \frac{\eta_k}{\|r_{k-1}\|}\, A p_k = r_{k-1} - \nu_k A p_k \tag{4.29b}$$
$$p_k = \|r_{k-1}\|\, q_k - l_{k-1}\frac{\|r_{k-1}\|}{\|r_{k-2}\|}\, p_{k-1} = r_{k-1} + \mu_k p_{k-1}. \tag{4.29c}$$
Here we have used the definitions
$$\nu_k = \frac{\eta_k}{\|r_{k-1}\|} \qquad \text{and} \qquad \mu_k = -l_{k-1}\frac{\|r_{k-1}\|}{\|r_{k-2}\|}.$$
Equations (4.29a), (4.29b), and (4.29c) are our three basic recursion relations. We next develop a formula for $\nu_k$. Since $r_{k-1} = \|r_{k-1}\|\, q_k$ and $r_k$ is orthogonal to $K_k$, multiplication of equation (4.29b) by $r_{k-1}^T$ gives
$$0 = r_{k-1}^T r_k = \|r_{k-1}\|^2 - \nu_k\, r_{k-1}^T A p_k.$$
Thus
$$\nu_k = \frac{\|r_{k-1}\|^2}{r_{k-1}^T A p_k}. \tag{4.30}$$
Multiplying equation (4.29c) by $p_k^T A$, we get
$$p_k^T A p_k = p_k^T A r_{k-1} + 0 = r_{k-1}^T A p_k. \tag{4.31}$$
Combining equations (4.30) and (4.31), we obtain the desired formula
$$\nu_k = \frac{\|r_{k-1}\|^2}{p_k^T A p_k}. \tag{4.32}$$
We next develop a formula for $\mu_k$. In view of equations (4.22) and (4.28), multiplication of equation (4.29c) by $p_{k-1}^T A$ gives
$$0 = p_{k-1}^T A p_k = p_{k-1}^T A r_{k-1} + \mu_k\, p_{k-1}^T A p_{k-1}$$
or
$$\mu_k = -\frac{p_{k-1}^T A r_{k-1}}{p_{k-1}^T A p_{k-1}}. \tag{4.33}$$
Multiplying equation (4.29b) by $r_k^T$, we get
$$r_k^T r_k = 0 - \nu_k\, r_k^T A p_k$$
or
$$\nu_k = -\frac{r_k^T r_k}{r_k^T A p_k} = -\frac{\|r_k\|^2}{r_k^T A p_k}. \tag{4.34}$$
Combining equations (4.32) and (4.34), we get
$$-\frac{\|r_k\|^2}{r_k^T A p_k} = \frac{\|r_{k-1}\|^2}{p_k^T A p_k}. \tag{4.35}$$
Evaluating equation (4.35) for $k$ replaced by $k-1$ and combining the result with equation (4.33), we obtain the desired formula
$$\mu_k = -\frac{p_{k-1}^T A r_{k-1}}{p_{k-1}^T A p_{k-1}} = \frac{\|r_{k-1}\|^2}{\|r_{k-2}\|^2}. \tag{4.36}$$
We can now summarize the CG algorithm:

1. Compute the initial values $x_0 = 0$, $r_0 = b$, and $p_1 = b$.
2. For $k = 1, 2, \dots$ compute
   $z = A p_k$  (save $A p_k$)
   $\nu_k = \|r_{k-1}\|^2 / p_k^T z$  (new step length)
   $x_k = x_{k-1} + \nu_k p_k$  (update approximation)
   $r_k = r_{k-1} - \nu_k z$  (new residual)
   $\mu_{k+1} = \|r_k\|^2 / \|r_{k-1}\|^2$  (improvement of residual)
   $p_{k+1} = r_k + \mu_{k+1} p_k$  (new search direction)
3. Stop when $\|r_k\|$ is small enough.

Notice that the algorithm at each step only involves one matrix-vector product, two dot products (by saving $\|r_k\|^2$ at each step), and three linear combinations of vectors. The storage required is only four vectors (current values of $z$, $r$, $x$, and $p$) in addition to the matrix $A$. As with all iterative methods, the convergence is fastest when the matrix $A$ is well conditioned. The convergence also depends on the distribution of eigenvalues.
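The summary above maps line for line onto code. The sketch below (Python/NumPy, illustrative only; in practice a sparse or matrix-free operator would be used in place of the dense array, and the stopping tolerance is an arbitrary choice) implements the iteration exactly as listed.

```python
import numpy as np

def conjugate_gradient(A, b, tol=1e-10, max_iter=None):
    """Conjugate gradient for symmetric positive definite A (steps 1-3 above)."""
    n = len(b)
    max_iter = max_iter or 10 * n
    x = np.zeros(n)
    r = b.copy()
    p = b.copy()
    rho = r @ r                       # saved ||r_{k-1}||^2
    for _ in range(max_iter):
        z = A @ p                     # the single matrix-vector product
        nu = rho / (p @ z)            # step length
        x += nu * p                   # update approximation
        r -= nu * z                   # new residual
        rho_new = r @ r
        if np.sqrt(rho_new) <= tol * np.linalg.norm(b):
            break
        mu = rho_new / rho            # improvement of residual
        p = r + mu * p                # new search direction
        rho = rho_new
    return x

# small SPD test problem
M = np.random.default_rng(2).standard_normal((50, 50))
A = M @ M.T + 50 * np.eye(50)
b = np.ones(50)
x = conjugate_gradient(A, b)
print(np.linalg.norm(A @ x - b))      # residual is tiny
```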
4.3 Preconditioning
The convergence of iterative methods often depends on the condition of the underlying matrix as well as the distribution of its eigenvalues. The convergence can often be improved by applying a preconditioner $M^{-1}$ to $A$, i.e., we consider the matrix $M^{-1} A$ in place of $A$. If we are solving a system of equations $Ax = b$, this system can be replaced by $M^{-1} A x = M^{-1} b$. The matrix $M^{-1} A$ might be better suited for an iterative method. Of course $M^{-1}$ must be fairly simple to compute, or the advantage might be lost. We often try to choose $M$ so that it approximates $A$ in some sense. If the original $A$ was symmetric and positive definite, we generally choose $M$ to be symmetric and positive definite. However, $M^{-1} A$ is generally not symmetric and positive definite even when both $A$ and $M$ are. If $M$ is symmetric and positive definite, then $M = E E^T$ for some $E$ (possibly obtained by a Cholesky factorization). The system of equations $Ax = b$ can be replaced by $(E^{-1} A E^{-T})\tilde x = E^{-1} b$ where $\tilde x = E^T x$. The matrix $E^{-1} A E^{-T}$ is symmetric and positive definite. Since
$$E^{-T}\bigl(E^{-1} A E^{-T}\bigr)E^{T} = M^{-1} A \qquad \text{(similarity transformation)},$$
$E^{-1} A E^{-T}$ has the same eigenvalues as $M^{-1} A$.

The choice of a good preconditioner is more of an art than a science. The following are some of
the ways $M$ might be chosen:

1. $M$ can be chosen to be the diagonal of $A$, i.e., $M = \operatorname{diag}(a_{11}, a_{22}, \dots, a_{nn})$.

2. $M$ can be chosen on the basis of an incomplete Cholesky or LU factorization of $A$. If $A$ is sparse, then the Cholesky factorization $A = LL^T$ will generally produce an $L$ that is not sparse. Incomplete Cholesky factorization uses Cholesky-like formulas, but only fills in those positions that are nonzero in the original $A$. If $\tilde L$ is the factor obtained in this manner, we take $M = \tilde L \tilde L^T$.

3. If a system of equations is obtained by a discretization of a differential or integral equation, it is sometimes possible to use a coarser discretization and interpolation to approximate the system obtained using a fine discretization.

4. If the underlying physical problem involves both short-range and long-range interactions, a preconditioner can sometimes be obtained by neglecting the long-range interactions.

5. If the underlying physical problem can be broken up into nonoverlapping domains, then a preconditioner might be obtained by neglecting interactions between domains. In this way $M$ becomes a block diagonal matrix.

6. Sometimes the inverse operator $A^{-1}$ can be expressed as a matrix power series. An approximate inverse can be obtained by truncating this series. For example, we might approximate $A^{-1}$ by a few terms of the Neumann series $A^{-1} = I + (I - A) + (I - A)^2 + \cdots$.

There are many more preconditioners designed for particular types of problems. The user should survey the literature to find a preconditioner appropriate to the problem at hand.
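As a small illustration of choice 1, a diagonal (Jacobi) preconditioner can be attached to the CG iteration of Section 4.2 with only a few extra lines. The sketch below (Python/NumPy) uses the standard preconditioned CG recurrence rather than the explicit $E^{-1}AE^{-T}$ transformation described above; it is an illustrative sketch, not a recommendation for any particular problem, and the test matrix is arbitrary.

```python
import numpy as np

def preconditioned_cg(A, b, M_inv_diag, tol=1e-10, max_iter=1000):
    """CG with a diagonal preconditioner M = diag(A); M_inv_diag holds the
    entries of M^{-1}. Only the extra solve z = M^{-1} r distinguishes
    this from plain CG."""
    x = np.zeros(len(b))
    r = b.copy()
    z = M_inv_diag * r
    p = z.copy()
    rho = r @ z
    for _ in range(max_iter):
        q = A @ p
        nu = rho / (p @ q)
        x += nu * p
        r -= nu * q
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        z = M_inv_diag * r
        rho_new = r @ z
        p = z + (rho_new / rho) * p
        rho = rho_new
    return x

A = np.diag(np.arange(1.0, 101.0)) + 0.01 * np.ones((100, 100))   # SPD, poorly scaled
b = np.ones(100)
x = preconditioned_cg(A, b, 1.0 / np.diag(A))
print(np.linalg.norm(A @ x - b))
```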
Bibliography
[1] Beltrami, E., Sulle Funzioni Bilineari, Giornale di Matematiche 11, pp. 98-106 (1873).
[2] Cayley, A., A Memoir on the Theory of Matrices, Phil. Trans. 148, pp. 17-37 (1858).
[3] Cuppen, J., A divide and conquer algorithm for the symmetric tridiagonal eigenproblem, Numer. Math. 36, pp. 177-195 (1981).
[4] Demmel, J. W., Applied Numerical Linear Algebra, SIAM (1997).
[5] Eckart, C. and Young, G., A Principal Axis Transformation for Non-Hermitian Matrices, Bull. Amer. Math. Soc. 45, pp. 118-121 (1939).
[6] Francis, J., The QR transformation: A unitary analogue to the LR transformation, parts I and II, Computer J. 4, pp. 256-272 and 332-345 (1961).
[7] Golub, G. and Van Loan, C., Matrix Computations, Johns Hopkins University Press (1996).
[8] Gu, M. and Eisenstat, S., A stable algorithm for the rank-1 modification of the symmetric eigenproblem, Computer Science Dept. Report YaleU/DCS/RR-967, Yale University (1993).
[9] Hestenes, M. and Stiefel, E., Methods of Conjugate Gradients for Solving Linear Systems, J. Res. Nat. Bur. Stand. 49, pp. 409-436 (1952).
[10] Jordan, C., Sur la réduction des formes bilinéaires, Comptes Rendus de l'Académie des Sciences, Paris 78, pp. 614-617 (1874).
[11] Kublanovskaya, V., On some algorithms for the solution of the complete eigenvalue problem, USSR Comp. Math. Phys. 3, pp. 637-657 (1961).
[12] Trefethen, L. and Bau, D., Numerical Linear Algebra, SIAM (1997).
[13] Watkins, D., Understanding the QR Algorithm, SIAM Review, vol. 24, No. 4 (1982).