
Linear Algebra: MAT 217

Lecture notes, Spring 2012


Michael Damron
compiled from lectures and exercises designed with Tasho Kaletha
Princeton University
Contents

1 Vector spaces
  1.1 Definitions
  1.2 Subspaces
  1.3 Linear independence and bases
  1.4 Exercises

2 Linear transformations
  2.1 Definitions and basic properties
  2.2 Range and nullspace, one-to-one, onto
  2.3 Isomorphisms and L(V, W)
  2.4 Matrices and coordinates
  2.5 Exercises

3 Dual spaces
  3.1 Definitions
  3.2 Annihilators
  3.3 Double dual
  3.4 Dual maps
  3.5 Exercises

4 Determinants
  4.1 Permutations
  4.2 Determinants: existence and uniqueness
  4.3 Properties of determinants
  4.4 Exercises

5 Eigenvalues
  5.1 Definitions and the characteristic polynomial
  5.2 Eigenspaces and the main diagonalizability theorem
  5.3 Exercises

6 Jordan form
  6.1 Generalized eigenspaces
  6.2 Primary decomposition theorem
  6.3 Nilpotent operators
  6.4 Existence and uniqueness of Jordan form, Cayley-Hamilton
  6.5 Exercises

7 Bilinear forms
  7.1 Definitions
  7.2 Symmetric bilinear forms
  7.3 Sesquilinear and Hermitian forms
  7.4 Exercises

8 Inner product spaces
  8.1 Definitions
  8.2 Orthogonality
  8.3 Adjoints
  8.4 Spectral theory of self-adjoint operators
  8.5 Normal and commuting operators
  8.6 Exercises
1 Vector spaces
1.1 Definitions

We begin with the definition of a vector space. (Keep in mind vectors in R^n or C^n.)
Definition 1.1.1. A vector space is a collection of two sets, V and F. The elements of F
(usually we take R or C) are called scalars and the elements of V are called vectors. For
each v, w ∈ V, there is a vector sum, v + w ∈ V, with the following properties.

0. There is one (and only one) vector called 0 (the zero vector) with the property
       v + 0 = v for all v ∈ V;
1. for each v ∈ V there is one (and only one) vector called -v with the property
       v + (-v) = 0 for all v ∈ V;
2. commutativity of vector sum:
       v + w = w + v for all v, w ∈ V;
3. associativity of vector sum:
       (v + w) + z = v + (w + z) for all v, w, z ∈ V.

Furthermore, for each v ∈ V and c ∈ F there is a scalar product cv ∈ V with the following
properties.

1. For all v ∈ V, 1v = v.
2. For all v ∈ V and c, d ∈ F, (cd)v = c(dv).
3. For all c ∈ F and v, w ∈ V, c(v + w) = cv + cw.
4. For all c, d ∈ F and v ∈ V, (c + d)v = cv + dv.
Here are some examples.

1. V = R^n, F = R. Addition is given by
       (v_1, . . . , v_n) + (w_1, . . . , w_n) = (v_1 + w_1, . . . , v_n + w_n)
   and scalar multiplication is given by
       c(v_1, . . . , v_n) = (cv_1, . . . , cv_n) .

2. Polynomials: take V to be all polynomials of degree up to n with real coefficients,
       V = {a_n x^n + ... + a_1 x + a_0 : a_i ∈ R for all i} ,
   and take F = R. Note the similarity to R^n.

3. Let S be any nonempty set and let V be the set of functions from S to C. Set F = C.
   If f_1, f_2 ∈ V set f_1 + f_2 to be the function given by
       (f_1 + f_2)(s) = f_1(s) + f_2(s) for all s ∈ S
   and if c ∈ C set cf_1 to be the function given by
       (cf_1)(s) = c(f_1(s)) .

4. Let V consist of the single matrix
       [ 1 1 ]
       [ 0 0 ]
   (or any other fixed object) with F = C. Define
       [ 1 1 ]   [ 1 1 ]   [ 1 1 ]
       [ 0 0 ] + [ 0 0 ] = [ 0 0 ]
   and
         [ 1 1 ]   [ 1 1 ]
       c [ 0 0 ] = [ 0 0 ] .
In general F is allowed to be a field.

Definition 1.1.2. A set F is called a field if for each a, b ∈ F there is an element ab ∈ F
and one a + b ∈ F such that the following hold.

1. For all a, b, c ∈ F we have (ab)c = a(bc) and (a + b) + c = a + (b + c).
2. For all a, b ∈ F we have ab = ba and a + b = b + a.
3. There exists an element 0 ∈ F such that for all a ∈ F, we have a + 0 = a; furthermore
   there is a non-zero element 1 ∈ F such that for all a ∈ F, we have 1a = a.
4. For each a ∈ F there is an element -a ∈ F such that a + (-a) = 0. If a ≠ 0 there
   exists an element a^{-1} such that aa^{-1} = 1.
5. For all a, b, c ∈ F,
       a(b + c) = ab + ac .
Here are some general facts.

1. For all c ∈ F, c0 = 0 (where 0 is the zero vector).

Proof.
    c0 = c(0 + 0) = c0 + c0
    c0 + (-(c0)) = (c0 + c0) + (-(c0))
    0 = c0 + (c0 + (-(c0)))
    0 = c0 + 0 = c0 .

Similarly one may prove that for all v ∈ V, 0v = 0.

2. For all v ∈ V, (-1)v = -v.

Proof.
    v + (-1)v = 1v + (-1)v
              = (1 + (-1))v
              = 0v
              = 0 .
However -v is the unique vector such that v + (-v) = 0. Therefore (-1)v = -v.
1.2 Subspaces

Definition 1.2.1. A subset W ⊆ V of a vector space is called a subspace if (W, F) with the
same operations is also a vector space.

Many of the rules for vector spaces follow directly by inheritance. For example, if
W ⊆ V then for all v, w ∈ W we have v + w = w + v. We actually only need to check a few:

A. 0 ∈ W.
B. For all w ∈ W the vector -w is also in W.
C. For all w ∈ W and c ∈ F, cw ∈ W.
D. For all v, w ∈ W, v + w ∈ W.

Theorem 1.2.2. W ⊆ V is a subspace if and only if it is nonempty and for all v, w ∈ W
and c ∈ F we have cv + w ∈ W.

Proof. Suppose that W is a subspace and let v, w ∈ W, c ∈ F. By (C) we have cv ∈ W. By
(D) we have cv + w ∈ W.
Conversely suppose that for all v, w ∈ W and c ∈ F we have cv + w ∈ W. Then we need
to show A-D.

A. Since W is nonempty choose w ∈ W. Let v = w and c = -1. This gives 0 = -w + w ∈ W.
B. Set v = w, w = 0 and c = -1.
C. Set v = w, w = 0 and c ∈ F.
D. Set c = 1.
Examples:

1. If V is a vector space then {0} is a subspace.

2. Take V = C^n. Let
       W = {(z_1, . . . , z_n) : ∑_{i=1}^n z_i = 0} .
   Then W is a subspace. (Exercise.)

3. Let V be the set of 2 × 2 matrices with real entries, with addition
       [ a_1 b_1 ]   [ a_2 b_2 ]   [ a_1 + a_2  b_1 + b_2 ]
       [ c_1 d_1 ] + [ c_2 d_2 ] = [ c_1 + c_2  d_1 + d_2 ]
   and scalar multiplication
         [ a_1 b_1 ]   [ c a_1  c b_1 ]
       c [ c_1 d_1 ] = [ c c_1  c d_1 ] .
   Is W a subspace, where W is the set of all such matrices with a + d = 1?
4. In R^n, lines and hyperplanes through the origin are subspaces.
Theorem 1.2.3. Suppose that C is a non-empty collection of subspaces of V. Then the
intersection
    W̃ = ∩_{W ∈ C} W
is a subspace.

Proof. Let c ∈ F and v, w ∈ W̃. We need to show that (a) W̃ ≠ ∅ and (b) cv + w ∈ W̃. The
first holds because each W is a subspace, so 0 ∈ W. Next, for all W ∈ C we have v, w ∈ W.
Then cv + w ∈ W. Therefore cv + w ∈ W̃.

Definition 1.2.4. If S ⊆ V is a subset of vectors let C_S be the collection of all subspaces
containing S. The span of S is defined as
    span(S) = ∩_{W ∈ C_S} W .
Since C_S is non-empty, span(S) is a subspace.

Question: What is span(∅)?
Definition 1.2.5. If
    v = a_1 w_1 + ... + a_k w_k
for scalars a_i ∈ F and vectors w_i ∈ V then we say that v is a linear combination of
w_1, . . . , w_k.

Theorem 1.2.6. If S ≠ ∅ then span(S) is equal to the set of all finite linear combinations
of elements of S.

Proof. Set
    S̃ = {all finite linear combinations of elements of S} .
We want to show that S̃ = span(S). First we show that S̃ ⊆ span(S). Let
    a_1 s_1 + ... + a_k s_k ∈ S̃
and let W be a subspace in C_S. Since s_i ∈ S for all i we have s_i ∈ W. By virtue of
W being a subspace, a_1 s_1 + ... + a_k s_k ∈ W. Since this is true for all W ∈ C_S, then
a_1 s_1 + ... + a_k s_k ∈ span(S). Therefore S̃ ⊆ span(S).
In the other direction, the set S̃ is itself a subspace and it contains S (exercise). Thus
S̃ ∈ C_S and so
    span(S) = ∩_{W ∈ C_S} W ⊆ S̃ .
Question: If W_1 and W_2 are subspaces, is W_1 ∪ W_2 a subspace? If it were, it would be the
smallest subspace containing W_1 ∪ W_2 and thus would be equal to span(W_1 ∪ W_2). But it is
not!

Proposition 1.2.7. If W_1 and W_2 are subspaces then span(W_1 ∪ W_2) = W_1 + W_2, where
    W_1 + W_2 = {w_1 + w_2 : w_1 ∈ W_1, w_2 ∈ W_2} .
Furthermore for each n ≥ 1,
    span(∪_{k=1}^n W_k) = W_1 + ... + W_n .

Proof. A general element w_1 + w_2 in W_1 + W_2 is a linear combination of elements in W_1 ∪ W_2,
so it is in the span of W_1 ∪ W_2. On the other hand, if a_1 v_1 + ... + a_n v_n is an element of the
span then we can split the v_i's into vectors from W_1 and those from W_2. For instance,
v_1, . . . , v_k ∈ W_1 and v_{k+1}, . . . , v_n ∈ W_2. Now a_1 v_1 + ... + a_k v_k ∈ W_1 and
a_{k+1} v_{k+1} + ... + a_n v_n ∈ W_2 by the fact that these are subspaces. Thus this linear
combination is equal to w_1 + w_2 for
    w_1 = a_1 v_1 + ... + a_k v_k and w_2 = a_{k+1} v_{k+1} + ... + a_n v_n .
The general case is an exercise.
Remark. 1. Span(S) is the smallest subspace containing S in the following sense: if
W is any subspace containing S then span(S) ⊆ W.
2. If S ⊆ T then Span(S) ⊆ Span(T).
3. If W is a subspace then Span(W) = W. Therefore Span(Span(S)) = Span(S).
1.3 Linear independence and bases

Now we move on to linear independence.

Definition 1.3.1. We say that vectors v_1, . . . , v_k ∈ V are linearly independent if whenever
    a_1 v_1 + ... + a_k v_k = 0 for scalars a_i ∈ F
then a_i = 0 for all i. Otherwise we say they are linearly dependent.

Lemma 1.3.2. Let S = {v_1, . . . , v_n} for n ≥ 1. Then S is linearly dependent if and only if
there exists v ∈ S such that v ∈ Span(S \ {v}).

Proof. Suppose first that S is linearly dependent and that n ≥ 2. Then there exist scalars
a_1, . . . , a_n ∈ F which are not all zero such that a_1 v_1 + ... + a_n v_n = 0. By reordering we may
assume that a_1 ≠ 0. Now
    v_1 = (-a_2/a_1) v_2 + ... + (-a_n/a_1) v_n .
So v_1 ∈ Span(S \ {v_1}).
If S is linearly dependent and n = 1 then there exists a nonzero a_1 such that a_1 v_1 = 0,
so v_1 = 0. Now v_1 ∈ Span(∅) = Span(S \ {v_1}).
Conversely, suppose there exists v ∈ S such that v ∈ Span(S \ {v}) and n ≥ 2. By
reordering we may suppose that v = v_1. Then there exist scalars a_2, . . . , a_n such that
    v_1 = a_2 v_2 + ... + a_n v_n .
But now we have
    (-1)v_1 + a_2 v_2 + ... + a_n v_n = 0 .
Since this is a nontrivial linear combination, S is linearly dependent.
If n = 1 and v_1 ∈ Span(S \ {v_1}) then v_1 ∈ Span(∅) = {0}, so that v_1 = 0. Now it is
easy to see that S is linearly dependent.
Examples:

1. Two vectors v_1, v_2 ∈ V are linearly dependent if and only if one is a scalar
   multiple of the other. By reordering, we may suppose v_1 = a v_2.

2. For three vectors this is not true anymore. The vectors (1, 1), (1, 0) and (0, 1) in R^2
   are linearly dependent since
       (1, 1) + (-1)(1, 0) + (-1)(0, 1) = (0, 0) .
   However none of these is a scalar multiple of another.

3. {0} is linearly dependent: 1 · 0 = 0.
Proposition 1.3.3. If S is a linearly independent set and T is a nonempty subset then T
is linearly independent.

Proof. Suppose that
    a_1 t_1 + ... + a_n t_n = 0
for vectors t_i ∈ T and scalars a_i ∈ F. Since each t_i ∈ S this is a linear combination
of elements of S. Since S is linearly independent all the coefficients must be zero. Thus
a_i = 0 for all i. This was an arbitrary linear combination of elements of T so T is linearly
independent.

Corollary 1.3.4. If S is linearly dependent and R contains S then R is linearly dependent.
In particular, any set containing 0 is linearly dependent.
Proposition 1.3.5. If S is linearly independent and v ∈ Span(S) then there exist unique
vectors v_1, . . . , v_n ∈ S and nonzero scalars a_1, . . . , a_n such that
    v = a_1 v_1 + ... + a_n v_n .

Proof. Suppose that S is linearly independent and there are two representations
    v = a_1 v_1 + ... + a_n v_n and v = b_1 w_1 + ... + b_k w_k .
Then split the vectors in {v_1, . . . , v_n} ∪ {w_1, . . . , w_k} into three sets: S_1 are those in the first
but not the second, S_2 are those in the second but not the first, and S_3 are those in both.
Then
    0 = v - v = ∑_{s_j ∈ S_1} a_j s_j + ∑_{s_j ∈ S_2} (-b_j) s_j + ∑_{s_j ∈ S_3} (a_j - b_j) s_j .
This is a linear combination of elements from S and by linear independence all coefficients are
zero. Thus both representations used the same vectors (in S_3) and with the same coefficients
and are thus the same.
Lemma 1.3.6 (Steinitz exchange). Let L = {v_1, . . . , v_k} be a linearly independent set in a
vector space V and let S = {w_1, . . . , w_m} be a spanning set; that is, Span(S) = V. Then

1. k ≤ m and
2. there exist m - k vectors s_1, . . . , s_{m-k} ∈ S such that
       Span(v_1, . . . , v_k, s_1, . . . , s_{m-k}) = V .

Proof. We will prove this by induction on k. For k = 0 it is obviously true (using the fact
that ∅ is linearly independent). Suppose it is true for k and we will prove it for k + 1.
In other words, let {v_1, . . . , v_{k+1}} be a linearly independent set. Then by the last lecture, since
{v_1, . . . , v_k} is linearly independent, we find k ≤ m and vectors s_1, . . . , s_{m-k} ∈ S with
    Span(v_1, . . . , v_k, s_1, . . . , s_{m-k}) = V .
Now since v_{k+1} ∈ V we can find scalars a_1, . . . , a_k and b_1, . . . , b_{m-k} in F such that
    v_{k+1} = a_1 v_1 + ... + a_k v_k + b_1 s_1 + ... + b_{m-k} s_{m-k} .        (1)
We claim that not all of the b_i's are zero. If this were the case then we would have
    v_{k+1} = a_1 v_1 + ... + a_k v_k, so
    a_1 v_1 + ... + a_k v_k + (-1)v_{k+1} = 0 ,
a contradiction to linear independence. Also this implies that k ≠ m, since otherwise the
linear combination (1) would contain no b_i's. Thus
    k + 1 ≤ m .
Suppose for example that b_1 ≠ 0. Then we could write
    b_1 s_1 = (-a_1)v_1 + ... + (-a_k)v_k + v_{k+1} + (-b_2)s_2 + ... + (-b_{m-k})s_{m-k}
    s_1 = (-a_1/b_1)v_1 + ... + (-a_k/b_1)v_k + (1/b_1)v_{k+1} + (-b_2/b_1)s_2 + ... + (-b_{m-k}/b_1)s_{m-k} .
In other words,
    s_1 ∈ Span(v_1, . . . , v_k, v_{k+1}, s_2, . . . , s_{m-k})
or said differently,
    V = Span(v_1, . . . , v_k, s_1, . . . , s_{m-k}) ⊆ Span(v_1, . . . , v_{k+1}, s_2, . . . , s_{m-k}) .
This completes the proof.
This lemma has loads of consequences.

Definition 1.3.7. A set B ⊆ V is called a basis for V if B is linearly independent and
Span(B) = V.

Corollary 1.3.8. If B_1 and B_2 are bases for V then they have the same number of elements.

Proof. If both sets are infinite, we are done. If B_1 is infinite and B_2 is finite then we may
choose |B_2| + 1 elements of B_1 that will be linearly independent. Since B_2 spans V, this
contradicts Steinitz. Similarly if B_1 is finite and B_2 is infinite.
Otherwise they are both finite. Since each spans V and is linearly independent, we apply
Steinitz twice to get |B_1| ≤ |B_2| and |B_2| ≤ |B_1|.
Definition 1.3.9. We define the dimension dim V to be the number of elements in a basis.
By the above, this is well-defined.

Remark. Each nonzero element of V has a unique representation in terms of a basis.

Theorem 1.3.10. Let V be a nonzero vector space and suppose that V is finitely generated;
that is, there is a finite set S ⊆ V such that V = Span(S). Then V has a finite basis:
dim V < ∞.

Proof. Let B be a minimal spanning subset of S. We claim that B is linearly independent.
If not, then by a previous result, there is a vector b ∈ B such that b ∈ Span(B \ {b}). It
then follows that
    V = Span(B) ⊆ Span(B \ {b}) ,
so B \ {b} is a spanning set, a contradiction.

Theorem 1.3.11 (1 subspace theorem). Let V be a finite dimensional vector space and W
be a nonzero subspace of V. Then dim W < ∞. If C = {w_1, . . . , w_k} is a basis for W then
there exists a basis B for V such that C ⊆ B.

Proof. It is an exercise to show that dim W < ∞. Write n for the dimension of V and
let B be a basis for V (with n elements). Since this is a spanning set and C is a linearly
independent set, there exist vectors b_1, . . . , b_{n-k} ∈ B such that
    B̃ := C ∪ {b_1, . . . , b_{n-k}}
is a spanning set. Note that B̃ is a spanning set with n = dim V elements. Thus
we will be done if we prove the following lemma.

Lemma 1.3.12. Let V be a vector space of dimension n ≥ 1 and S = {v_1, . . . , v_k} ⊆ V.

1. If k < n then S cannot span V.
2. If k > n then S cannot be linearly independent.
3. If k = n then S is linearly independent if and only if S spans V.

Proof. Let B be a basis of V. If S spans V, Steinitz gives that |B| ≤ |S|. This proves 1. If
S is linearly independent then again Steinitz gives |S| ≤ |B|. This proves 2.
Suppose that k = n and S is linearly independent. By Steinitz, we may add 0 vectors
from the set B to S to make S span V. Thus Span(S) = V. Conversely, suppose that
Span(S) = V. If S is linearly dependent then there exists s ∈ S such that s ∈ Span(S \ {s}).
Then S \ {s} is a set of smaller cardinality than S and spans V. But now this contradicts
Steinitz, using B as our linearly independent set and S \ {s} as our spanning set.
Corollary 1.3.13. If V is a vector space and W is a subspace then dim W ≤ dim V.
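For V = R^n these statements are easy to test numerically: a list of k vectors is linearly
independent exactly when the matrix with those vectors as columns has rank k, and it spans
R^n exactly when that rank is n. The notes themselves contain no code; the following is a
minimal NumPy sketch (illustration only, with made-up vectors and floating-point rank).

```python
import numpy as np

def is_independent(vectors):
    # Columns are linearly independent iff the rank equals the number of vectors.
    A = np.column_stack(vectors)
    return np.linalg.matrix_rank(A) == A.shape[1]

def spans(vectors, n):
    # The vectors span R^n iff the rank equals n.
    return np.linalg.matrix_rank(np.column_stack(vectors)) == n

v1, v2, v3 = np.array([1., 1.]), np.array([1., 0.]), np.array([0., 1.])
print(is_independent([v1, v2, v3]))  # False: k = 3 > n = 2 (Lemma 1.3.12, part 2)
print(is_independent([v2, v3]))      # True
print(spans([v2, v3], 2))            # True: k = n and independent (part 3)
```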
Theorem 1.3.14. Let W_1 and W_2 be subspaces of V (with dim V < ∞). Then
    dim W_1 + dim W_2 = dim(W_1 ∩ W_2) + dim(W_1 + W_2) .

Proof. This is easy if either is zero. Otherwise we argue as follows. Let
    {v_1, . . . , v_k}
be a basis for W_1 ∩ W_2. By the 1-subspace theorem, extend this to a basis of W_1:
    {v_1, . . . , v_k, w_1, . . . , w_{m_1}}
and also extend it to a basis for W_2:
    {v_1, . . . , v_k, w̃_1, . . . , w̃_{m_2}} .
We claim that
    B := {v_1, . . . , v_k, w_1, . . . , w_{m_1}, w̃_1, . . . , w̃_{m_2}}
is a basis for W_1 + W_2. It is not hard to see it is spanning.
To show linear independence, suppose
    a_1 v_1 + ... + a_k v_k + b_1 w_1 + ... + b_{m_1} w_{m_1} + c_1 w̃_1 + ... + c_{m_2} w̃_{m_2} = 0 .
Then
    c_1 w̃_1 + ... + c_{m_2} w̃_{m_2} = (-a_1)v_1 + ... + (-a_k)v_k + (-b_1)w_1 + ... + (-b_{m_1})w_{m_1} ∈ W_1 .
Also it is clearly in W_2. So it is in W_1 ∩ W_2. So we can find scalars ã_1, . . . , ã_k such that
    c_1 w̃_1 + ... + c_{m_2} w̃_{m_2} = ã_1 v_1 + ... + ã_k v_k, that is,
    (-ã_1)v_1 + ... + (-ã_k)v_k + c_1 w̃_1 + ... + c_{m_2} w̃_{m_2} = 0 .
This is a linear combination of basis elements (of W_2), so all c_i's are zero. A similar argument
gives all b_i's as zero. Thus we finally have
    a_1 v_1 + ... + a_k v_k = 0 .
But again this is a linear combination of basis elements so the a_i's are zero.
Theorem 1.3.15 (2 subspace theorem). Let V be a finite dimensional vector space and W_1
and W_2 be nonzero subspaces. There exists a basis of V that contains bases for W_1 and W_2.

Proof. The proof of the previous theorem shows that there is a basis for W_1 + W_2 which
contains bases for W_1 and W_2. Extend this to a basis for V.
1.4 Exercises

Notation:

1. If F is a field then define s_1 : F → F by s_1(x) = x. For integers n ≥ 2 define
   s_n : F → F by s_n(x) = x + s_{n-1}(x). Last, define the characteristic of F as
       char(F) = min{n : s_n(1) = 0} .
   If s_n(1) ≠ 0 for all n ≥ 1 then we set char(F) = 0.
Exercises:

1. The finite field F_p: For n ∈ N, let Z_n denote the set of integers mod n. That is, each
   element of Z/nZ is a subset of Z of the form d + nZ, where d ∈ Z. We define addition
   and multiplication on Z_n by
       (a + nZ) + (b + nZ) = (a + b) + nZ,
       (a + nZ) · (b + nZ) = (ab) + nZ.
   Show that these operations are well defined. That is, if a', b' ∈ Z are integers such that
   a + nZ = a' + nZ and b + nZ = b' + nZ, then (a' + nZ) + (b' + nZ) = (a + b) + nZ and
   (a' + nZ) · (b' + nZ) = (ab) + nZ. Moreover, show that these operations make Z_n into
   a field if and only if n is prime. In that case, one writes Z/pZ = F_p.

2. The finite field F_{p^n}: Let F be a finite field.
   (a) Show that char(F) is a prime number (in particular non-zero).
   (b) Write p for the characteristic of F and define
           F̃ = {s_n(1) : n = 1, . . . , p} ,
       where s_n is the function given in the notation section. Show that F̃ is a subfield
       of F, isomorphic to F_p.
   (c) We can consider F as a vector space over F̃. Vector addition and scalar
       multiplication are interpreted using the operations of F. Show that F has finite
       dimension.
   (d) Writing n for the dimension of F, show that |F| = p^n.

3. Consider R as a vector space over Q (using addition and multiplication of real numbers).
   Does this vector space have finite dimension?
4. Recall the definition of direct sum: if W_1 and W_2 are subspaces of a vector space V
   then we write W_1 ⊕ W_2 for the space W_1 + W_2 if W_1 ∩ W_2 = {0}. For k ≥ 3 we write
   W_1 ⊕ ... ⊕ W_k for the space W_1 + ... + W_k if for each i = 2, . . . , k, we have
       W_i ∩ (W_1 + ... + W_{i-1}) = {0} .
   Let S = {v_1, . . . , v_n} be a subset of nonzero vectors in a vector space V and for each
   k = 1, . . . , n write W_k = Span(v_k). Show that S is a basis for V if and only if
       V = W_1 ⊕ ... ⊕ W_n .
2 Linear transformations
2.1 Definitions and basic properties

We now move to linear transformations.

Definition 2.1.1. Let V and W be vector spaces over the same field F. A function T : V → W
is called a linear transformation if

1. for all v_1, v_2 ∈ V, T(v_1 + v_2) = T(v_1) + T(v_2) and
2. for all v ∈ V and c ∈ F, T(cv) = cT(v).

Remark. We only need to check that for all v_1, v_2 ∈ V and c ∈ F, T(cv_1 + v_2) = cT(v_1) + T(v_2).
Examples:

1. Let V = F^n and W = F^m, the vector spaces of n-tuples and m-tuples respectively.
   Any m × n matrix A defines a linear transformation L_A : F^n → F^m by
       L_A v = A · v .

2. Let V be a finite dimensional vector space (of dimension n) and fix an (ordered) basis
       β = {v_1, . . . , v_n}
   of V. Define T_β : V → F^n by
       T_β(v) = (a_1, . . . , a_n) ,
   where v = a_1 v_1 + ... + a_n v_n. Then T_β is linear. It is called the coordinate map for the
   ordered basis β.

Suppose that T : F^n → F^m is a linear transformation and x ∈ F^n. Then writing
x = (x_1, . . . , x_n),
    T(x) = T((x_1, . . . , x_n))
         = T(x_1 (1, 0, . . . , 0) + ... + x_n (0, . . . , 0, 1))
         = x_1 T((1, 0, . . . , 0)) + ... + x_n T((0, . . . , 0, 1)) .
Therefore we only need to know the values of T at the standard basis. This leads us to:
Theorem 2.1.2 (The slogan). Given V and W, vector spaces over F, let {v_1, . . . , v_n} be a
basis for V. If w_1, . . . , w_n are any vectors in W, there exists exactly one linear
transformation T : V → W such that
    T(v_i) = w_i for all i = 1, . . . , n .

Proof. We need to prove two things: (a) there is such a linear transformation and (b) there
cannot be more than one. Motivated by the above, we first prove (a).
Each v ∈ V has a unique representation
    v = a_1 v_1 + ... + a_n v_n .
Define T by
    T(v) = a_1 w_1 + ... + a_n w_n .
Note that by unique representations, T is well-defined. We claim that T is linear. Let
v, ṽ ∈ V and c ∈ F. We must show that T(cv + ṽ) = cT(v) + T(ṽ). If v = ∑_{i=1}^n a_i v_i and
ṽ = ∑_{i=1}^n ã_i v_i then we claim that the unique representation of cv + ṽ is
    cv + ṽ = (ca_1 + ã_1)v_1 + ... + (ca_n + ã_n)v_n .
Therefore
    T(cv + ṽ) = (ca_1 + ã_1)w_1 + ... + (ca_n + ã_n)w_n
              = c(a_1 w_1 + ... + a_n w_n) + (ã_1 w_1 + ... + ã_n w_n)
              = cT(v) + T(ṽ) .
Thus T is linear.
Now we show that T is unique. Suppose that T' is another linear transformation such
that
    T'(v_i) = w_i for all i = 1, . . . , n .
Then if v ∈ V write v = ∑_{i=1}^n a_i v_i. We have
    T'(v) = T'(a_1 v_1 + ... + a_n v_n)
          = a_1 T'(v_1) + ... + a_n T'(v_n)
          = a_1 w_1 + ... + a_n w_n = T(v) .
Since T(v) = T'(v) for all v, by definition T = T'.
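When V = F^n and W = F^m this construction is concrete: if the basis vectors v_1, . . . , v_n are
the columns of an invertible matrix B and the prescribed images w_1, . . . , w_n are the columns
of a matrix C, then the unique T of the theorem is multiplication by C B^{-1}. Here is a small
numerical sketch (not from the original notes; the matrices are made up for illustration).

```python
import numpy as np

B = np.array([[1., 1.],
              [0., 1.]])    # basis of R^2: v1 = (1, 0), v2 = (1, 1) as columns
C = np.array([[2., 0.],
              [3., 5.]])    # prescribed images: w1 = (2, 3), w2 = (0, 5) as columns

A = C @ np.linalg.inv(B)    # matrix of the unique linear T with T(v_i) = w_i

for i in range(2):          # check T(v_i) = w_i on each basis vector
    assert np.allclose(A @ B[:, i], C[:, i])
```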
2.2 Range and nullspace, one-to-one, onto

Definition 2.2.1. If T : V → W is linear we define the nullspace of T by
    N(T) = {v ∈ V : T(v) = 0} .
Here, 0 is the zero vector in W. We also define the range of T:
    R(T) = {w ∈ W : there exists v ∈ V s.t. T(v) = w} .

Proposition 2.2.2. If T : V → W is linear then N(T) is a subspace of V and R(T) is a
subspace of W.

Proof. First note that T(0) = 0. This holds from
    0 = 0 · T(v) = T(0v) = T(0) .
Therefore both spaces are nonempty.
Choose v_1, v_2 ∈ N(T) and c ∈ F. Then
    T(cv_1 + v_2) = cT(v_1) + T(v_2) = c0 + 0 = 0 ,
so that cv_1 + v_2 ∈ N(T). Therefore N(T) is a subspace of V. Also if w_1, w_2 ∈ R(T) and
c ∈ F then we may find v_1, v_2 ∈ V such that T(v_1) = w_1 and T(v_2) = w_2. Now
    T(cv_1 + v_2) = cT(v_1) + T(v_2) = cw_1 + w_2 ,
so that cw_1 + w_2 ∈ R(T). Therefore R(T) is a subspace of W.
Definition 2.2.3. Let S and T be sets and f : S → T a function.

1. f is one-to-one if it maps distinct points to distinct points. In other words, if s_1, s_2 ∈ S
   are such that s_1 ≠ s_2 then f(s_1) ≠ f(s_2). Equivalently, whenever f(s_1) = f(s_2) then
   s_1 = s_2.
2. f is onto if its range is equal to T. That is, for each t ∈ T there exists s ∈ S such that
   f(s) = t.

Theorem 2.2.4. Let T : V → W be linear. Then T is one-to-one if and only if N(T) = {0}.

Proof. Suppose T is one-to-one. We want to show that N(T) = {0}. Clearly 0 ∈ N(T) as it
is a subspace of V. If v ∈ N(T) then we have
    T(v) = 0 = T(0) .
Since T is one-to-one this implies that v = 0.
Suppose conversely that N(T) = {0}. If T(v_1) = T(v_2) then
    0 = T(v_1) - T(v_2) = T(v_1 - v_2) ,
so that v_1 - v_2 ∈ N(T). But the only vector in the nullspace is 0 so v_1 - v_2 = 0. This implies
that v_1 = v_2 and T is one-to-one.
We now want to give a theorem that characterizes one-to-one and onto linear maps in a
different way.

Theorem 2.2.5. Let T : V → W be linear.

1. T is one-to-one if and only if it maps linearly independent sets in V to linearly
   independent sets in W.
2. T is onto if and only if it maps spanning sets of V to spanning sets of W.

Proof. Suppose that T is one-to-one and that S is a linearly independent set in V. We will
show that T(S), defined by
    T(S) := {T(s) : s ∈ S} ,
is also linearly independent. If
    a_1 T(s_1) + ... + a_k T(s_k) = 0
for some s_i ∈ S and a_i ∈ F then
    T(a_1 s_1 + ... + a_k s_k) = 0 .
Therefore a_1 s_1 + ... + a_k s_k ∈ N(T). But T is one-to-one so N(T) = {0}. This gives
    a_1 s_1 + ... + a_k s_k = 0 .
Linear independence of S gives that the a_i's are zero. Thus T(S) is linearly independent.
Suppose conversely that T maps linearly independent sets to linearly independent sets.
If v is any nonzero vector in V then {v} is linearly independent. Therefore so is {T(v)}.
This implies that T(v) ≠ 0. Therefore N(T) = {0} and so T is one-to-one.
If T maps spanning sets to spanning sets then let w ∈ W. Let S be a spanning set of V,
so that consequently T(S) spans W. If w ∈ W we can write w = a_1 T(s_1) + ... + a_k T(s_k)
for a_i ∈ F and s_i ∈ S, so
    w = T(a_1 s_1 + ... + a_k s_k) ∈ R(T) ,
giving that T is onto.
For the converse suppose that T is onto and that S spans V. We claim that T(S) spans
W. To see this, let w ∈ W and note there exists v ∈ V such that T(v) = w. Write
    v = a_1 s_1 + ... + a_k s_k ,
so w = T(v) = a_1 T(s_1) + ... + a_k T(s_k) ∈ Span(T(S)). Therefore T(S) spans W.

Corollary 2.2.6. Let T : V → W be linear.

1. If V and W are finite dimensional, then T is an isomorphism (one-to-one and onto)
   if and only if T maps bases to bases.
2. If V is finite dimensional, then every basis of V is mapped to a spanning set of R(T).
3. If V is finite dimensional, then T is one-to-one if and only if T maps bases of V to
   bases of R(T).
Theorem 2.2.7 (Rank-nullity theorem). Let T : V → W be linear and V of finite dimension.
Then (writing rank(T) = dim R(T) and nullity(T) = dim N(T))
    rank(T) + nullity(T) = dim V .

Proof. Let
    {v_1, . . . , v_k}
be a basis for N(T). Extend it to a basis
    {v_1, . . . , v_k, v_{k+1}, . . . , v_n}
of V. Write N' = Span{v_{k+1}, . . . , v_n} and note that V = N(T) ⊕ N'.

Lemma 2.2.8. If N(T) and N' are complementary subspaces (that is, V = N(T) ⊕ N') then
T is one-to-one on N'.

Proof. If z_1, z_2 ∈ N' are such that T(z_1) = T(z_2) then z_1 - z_2 ∈ N(T). But it is in N' so it
is in N' ∩ N(T), which is only the zero vector. So z_1 = z_2.

We may view T as a linear transformation only on N'; call it T|_{N'}; in other words, T|_{N'}
is a linear transformation from N' to W that acts exactly as T does. By the corollary,
{T|_{N'}(v_{k+1}), . . . , T|_{N'}(v_n)} is a basis for R(T|_{N'}). Therefore {T(v_{k+1}), . . . , T(v_n)} is a
basis for R(T|_{N'}). By part two of the corollary, {T(v_1), . . . , T(v_n)} spans R(T). So
    R(T) = Span(T(v_1), . . . , T(v_n))
         = Span(T(v_{k+1}), . . . , T(v_n))
         = R(T|_{N'}) .
The second equality follows because the vectors T(v_1), . . . , T(v_k) are all zero and do not
contribute to the span (you can work this out as an exercise). Thus {T(v_{k+1}), . . . , T(v_n)} is
a basis for R(T) and
    rank(T) + nullity(T) = (n - k) + k = n = dim V .
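For T = L_A acting on R^n the theorem can be checked numerically: the rank is the number
of nonzero singular values of A, and the remaining right-singular vectors span N(L_A). A
short sketch with a made-up matrix (an illustration, not part of the notes):

```python
import numpy as np

A = np.array([[1., 2., 3.],
              [2., 4., 6.]])            # a rank-1 map from R^3 to R^2

n = A.shape[1]                          # dim V
rank = np.linalg.matrix_rank(A)

_, _, Vt = np.linalg.svd(A)             # rows of Vt past the rank span the nullspace
null_basis = Vt[rank:]
nullity = null_basis.shape[0]

assert np.allclose(A @ null_basis.T, 0) # these really are null vectors
assert rank + nullity == n              # rank(T) + nullity(T) = dim V
```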
2.3 Isomorphisms and L(V, W)

Definition 2.3.1. If S and T are sets and f : S → T is a function then a function g : T → S
is called an inverse function for f (written f^{-1}) if
    f(g(t)) = t and g(f(s)) = s for all t ∈ T, s ∈ S .

Fact: f : S → T has an inverse function if and only if f is one-to-one and onto. Furthermore
the inverse is one-to-one and onto. (Explain this.)

Theorem 2.3.2. If T : V → W is an isomorphism then the inverse map T^{-1} : W → V is
an isomorphism.

Proof. We have one-to-one and onto. We just need to show linear. Suppose that w_1, w_2 ∈ W
and c ∈ F. Then
    T(T^{-1}(cw_1 + w_2)) = cw_1 + w_2
and
    T(cT^{-1}(w_1) + T^{-1}(w_2)) = cT(T^{-1}(w_1)) + T(T^{-1}(w_2)) = cw_1 + w_2 .
However T is one-to-one, so
    T^{-1}(cw_1 + w_2) = cT^{-1}(w_1) + T^{-1}(w_2) .
The proof of the next lemma is in the homework.

Lemma 2.3.3. Let V and W be finite dimensional vector spaces with dim V = dim W. If
T : V → W is linear then T is one-to-one if and only if T is onto.

Example: The coordinate map is an isomorphism. For V of dimension n choose a basis
    β = {v_1, . . . , v_n}
and define T_β : V → F^n by T_β(v) = (a_1, . . . , a_n), where
    v = a_1 v_1 + ... + a_n v_n .
Then T_β is linear (check). To show one-to-one and onto we only need to check one (since
dim V = dim F^n). If (a_1, . . . , a_n) ∈ F^n then define v = a_1 v_1 + ... + a_n v_n. Now
    T_β(v) = (a_1, . . . , a_n) .
So T_β is onto.
The space of linear maps. Let V and W be vector spaces over the same field F. Define
    L(V, W) = {T : V → W | T is linear} .
We define addition and scalar multiplication as usual: for T, U ∈ L(V, W) and c ∈ F,
    (T + U)(v) = T(v) + U(v) and (cT)(v) = cT(v) .
This is a vector space (exercise).

Theorem 2.3.4. If dim V = n and dim W = m then dim L(V, W) = mn. Given bases
{v_1, . . . , v_n} and {w_1, . . . , w_m} of V and W, the set {T_{i,j} : 1 ≤ i ≤ n, 1 ≤ j ≤ m} is a basis
for L(V, W), where
    T_{i,j}(v_k) = w_j if i = k, and T_{i,j}(v_k) = 0 otherwise.

Proof. First, to show linear independence, suppose that
    ∑_{i,j} a_{i,j} T_{i,j} = 0_T ,
where the element on the right is the zero transformation. Then for each k = 1, . . . , n, apply
both sides to v_k:
    ∑_{i,j} a_{i,j} T_{i,j}(v_k) = 0_T(v_k) = 0 .
We then get
    0 = ∑_{j=1}^m ∑_{i=1}^n a_{i,j} T_{i,j}(v_k) = ∑_{j=1}^m a_{k,j} w_j .
But the w_j's form a basis, so a_{k,1} = ... = a_{k,m} = 0. This is true for all k so the T_{i,j}'s are
linearly independent.
To show spanning suppose that T : V → W is linear. Then for each i = 1, . . . , n, the
vector T(v_i) is in W, so we can write it in terms of the w_j's:
    T(v_i) = a_{i,1} w_1 + ... + a_{i,m} w_m .
Now define the transformation
    T̃ = ∑_{i,j} a_{i,j} T_{i,j} .
We claim that this equals T. To see this, we must only check on the basis vectors. For
k = 1, . . . , n,
    T(v_k) = a_{k,1} w_1 + ... + a_{k,m} w_m .
However,
    T̃(v_k) = ∑_{i,j} a_{i,j} T_{i,j}(v_k) = ∑_{j=1}^m ∑_{i=1}^n a_{i,j} T_{i,j}(v_k)
            = ∑_{j=1}^m a_{k,j} w_j = a_{k,1} w_1 + ... + a_{k,m} w_m .
2.4 Matrices and coordinates

Let T : V → W be a linear transformation and
    β = {v_1, . . . , v_n} and γ = {w_1, . . . , w_m}
bases for V and W, respectively.
We now build a matrix, which we label [T]_β^γ. (Using the column convention.)

1. Since T(v_1) ∈ W, we can write
       T(v_1) = a_{1,1} w_1 + ... + a_{m,1} w_m .
2. Put the entries a_{1,1}, . . . , a_{m,1} into the first column of [T]_β^γ.
3. Repeat for k = 1, . . . , n, writing
       T(v_k) = a_{1,k} w_1 + ... + a_{m,k} w_m ,
   and place the entries a_{1,k}, . . . , a_{m,k} into the k-th column.
Theorem 2.4.1. For each linear T : V → W and bases β and γ, there exists a unique m × n
matrix [T]_β^γ such that for all v ∈ V,
    [T]_β^γ [v]_β = [T(v)]_γ .

Proof. Let v ∈ V and write v = a_1 v_1 + ... + a_n v_n. Then
    T(v) = a_1 T(v_1) + ... + a_n T(v_n)
         = a_1 (a_{1,1} w_1 + ... + a_{m,1} w_m) + ... + a_n (a_{1,n} w_1 + ... + a_{m,n} w_m) .
Collecting terms,
    T(v) = (a_1 a_{1,1} + ... + a_n a_{1,n}) w_1 + ... + (a_1 a_{m,1} + ... + a_n a_{m,n}) w_m .
This gives the coordinates of T(v) in terms of γ:
    [T(v)]_γ = ( a_1 a_{1,1} + ... + a_n a_{1,n}, . . . , a_1 a_{m,1} + ... + a_n a_{m,n} ) written as a column
             = [T]_β^γ [v]_β ,
where [T]_β^γ is the m × n matrix whose (i, k) entry is a_{i,k}.
Suppose that A and B are two matrices such that for all v ∈ V,
    A[v]_β = [T(v)]_γ = B[v]_β .
Take v = v_k. Then [v]_β = e_k and A[v]_β is the k-th column of A (and similarly for B).
Therefore A and B have all the same columns. This means A = B.
Theorem 2.4.2. Given bases β and γ of V and W, the map T ↦ [T]_β^γ is an isomorphism.

Proof. The map is one-to-one. To show it is linear, let c ∈ F and T, U ∈ L(V, W). For each
k = 1, . . . , n,
    [cT + U]_β^γ [v_k]_β = [cT(v_k) + U(v_k)]_γ
                        = c[T(v_k)]_γ + [U(v_k)]_γ
                        = c[T]_β^γ [v_k]_β + [U]_β^γ [v_k]_β
                        = (c[T]_β^γ + [U]_β^γ)[v_k]_β .
However the left side is the k-th column of [cT + U]_β^γ and the right side is the k-th column
of c[T]_β^γ + [U]_β^γ.
Both spaces have the same dimension, so the map is also onto and thus an isomorphism.
Examples:

1. Change of coordinates: Let V be finite dimensional and β, β' two bases of V. How
   do we express coordinates in β in terms of those in β'? For each v ∈ V,
       [v]_{β'} = [Iv]_{β'} = [I]_β^{β'} [v]_β .
   We multiply by the matrix [I]_β^{β'}. (A numerical sketch appears after this list.)

2. Suppose that V, W and Z are vector spaces over the same field. Let T : V → W and
   U : W → Z be linear with β, γ and δ bases for V, W and Z.

   (a) UT : V → Z is linear. If v_1, v_2 ∈ V and c ∈ F then
       (UT)(cv_1 + v_2) = U(T(cv_1 + v_2)) = U(cT(v_1) + T(v_2)) = c(UT)(v_1) + (UT)(v_2) .
   (b) For each v ∈ V,
           [(UT)v]_δ = [U(T(v))]_δ = [U]_γ^δ [T(v)]_γ = [U]_γ^δ [T]_β^γ [v]_β .
       Therefore
           [UT]_β^δ = [U]_γ^δ [T]_β^γ .
   (c) If T is an isomorphism from V to W,
           Id = [I]_γ^γ = [T]_β^γ [T^{-1}]_γ^β .
       Similarly,
           Id = [I]_β^β = [T^{-1}]_γ^β [T]_β^γ .
       This implies that [T^{-1}]_γ^β = ([T]_β^γ)^{-1}.

3. To change coordinates back,
       [I]_{β'}^β = ([I]_β^{β'})^{-1} .
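Here is the numerical sketch promised in example 1. For V = R^n, if the vectors of β are
the columns of a matrix B and those of β' are the columns of B' (both written in standard
coordinates), then [v]_β = B^{-1} v and the change of coordinates matrix [I]_β^{β'} is B'^{-1} B.
The matrices below are made up purely for illustration.

```python
import numpy as np

B  = np.array([[1., 1.],
               [0., 1.]])       # beta  = {(1,0), (1,1)} as columns
Bp = np.array([[2., 0.],
               [0., 1.]])       # beta' = {(2,0), (0,1)} as columns

change = np.linalg.inv(Bp) @ B  # sends [v]_beta to [v]_beta'

v_beta = np.array([3., -1.])    # beta-coordinates of some vector v
v = B @ v_beta                  # v itself, in standard coordinates
assert np.allclose(Bp @ (change @ v_beta), v)   # the beta'-coordinates recover v
```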
Definition 2.4.3. An n × n matrix A is invertible if there exists an n × n matrix B such
that
    I = AB = BA .

Remark. If β, β' are bases of V then
    I = [I]_β^β = [I]_{β'}^β [I]_β^{β'} .
Therefore each change of basis matrix is invertible.

Now how do we relate the matrix of T with respect to different bases?
Theorem 2.4.4. Let V and W be finite-dimensional vector spaces over F with β, β' bases
for V and γ, γ' bases for W.

1. If T : V → W is linear then there exist invertible matrices P and Q such that
       [T]_{β'}^{γ'} = P [T]_β^γ Q .
2. If T : V → V is linear then there exists an invertible matrix P such that
       [T]_{β'}^{β'} = P^{-1} [T]_β^β P .

Proof. For part 1,
    [T]_{β'}^{γ'} = [I]_γ^{γ'} [T]_β^γ [I]_{β'}^β ,
and change of basis matrices are invertible. For part 2 (taking W = V, γ = β and γ' = β'),
    [T]_{β'}^{β'} = [I]_β^{β'} [T]_β^β [I]_{β'}^β = ([I]_{β'}^β)^{-1} [T]_β^β [I]_{β'}^β .

Definition 2.4.5. Two n × n matrices A and B are similar if there exists an n × n invertible
matrix P such that
    A = P^{-1} B P .
Theorem 2.4.6. Let A and B be n × n matrices with entries from F. If A and B are similar
then there exist an n-dimensional vector space V, a linear transformation T : V → V, and
bases β, β' such that
    A = [T]_β^β and B = [T]_{β'}^{β'} .

Proof. Since A and B are similar we may write B = P^{-1} A P for an invertible matrix P.
Define the linear transformation L_A : F^n → F^n by
    L_A(v) = A · v .
If we choose β to be the standard basis then
    [L_A]_β^β = A .
Next we will show that if β' = {p_1, . . . , p_n} consists of the columns of P then β' is a basis
and P = [I]_{β'}^β, where I : F^n → F^n is the identity map. If we prove this, then
P^{-1} = [I]_β^{β'} and so
    B = P^{-1} A P = [I]_β^{β'} [L_A]_β^β [I]_{β'}^β = [L_A]_{β'}^{β'} .
Why is β' a basis? Note that
    p_k = P e_k ,
so that if L_P is invertible then β' will be the image of a basis and thus a basis. But for all
v ∈ F^n,
    L_{P^{-1}} L_P v = P^{-1} P v = v
and
    L_P L_{P^{-1}} v = v .
So (L_P)^{-1} = L_{P^{-1}}. Finally, P = [I]_{β'}^β because the k-th column of [I]_{β'}^β is the
standard-coordinate vector of p_k, which is exactly the k-th column of P. This completes the
proof.

The moral: Similar matrices represent the same transformation but with respect to two
different bases. Any property of matrices that is invariant under conjugation can be viewed
as a property of the underlying transformation.
Example: Trace. Given an n × n matrix A with entries from F, define
    Tr A = ∑_{i=1}^n a_{i,i} .
Note that if P is another matrix (not necessarily invertible),
    Tr(AP) = ∑_{i=1}^n (AP)_{i,i} = ∑_{i=1}^n ∑_{l=1}^n a_{i,l} p_{l,i}
           = ∑_{l=1}^n ∑_{i=1}^n p_{l,i} a_{i,l} = ∑_{l=1}^n (PA)_{l,l} = Tr(PA) .
Therefore if P is invertible,
    Tr(P^{-1} A P) = Tr(A P P^{-1}) = Tr A .
This means that trace is invariant under conjugation. Thus if T : V → V is linear (and V is
finite dimensional) then Tr T can be defined as Tr [T]_β^β for any basis β.
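A quick numerical sanity check of this invariance (random matrices and floating point only,
as an illustration; a random Gaussian matrix is invertible with probability one):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
P = rng.standard_normal((4, 4))    # generically invertible

assert np.isclose(np.trace(np.linalg.inv(P) @ A @ P), np.trace(A))
```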


2.5 Exercises

Notation:

1. A group is a pair (G, ∗), where G is a set and ∗ : G × G → G is a function (usually
   called product) such that
   (a) there is an identity element e; that is, an element with the property
           e ∗ g = g ∗ e = g for all g ∈ G ,
   (b) for all g ∈ G there is an inverse element in G called g^{-1} such that
           g^{-1} ∗ g = g ∗ g^{-1} = e ,
   (c) and the operation is associative: for all g, h, k ∈ G,
           (g ∗ h) ∗ k = g ∗ (h ∗ k) .
   If the operation is commutative, that is, for all g, h ∈ G we have g ∗ h = h ∗ g, then we
   call G abelian.

2. If (G, ∗) is a group and H is a subset of G then we call H a subgroup of G if (H, ∗|_{H×H})
   is a group. Equivalently (and analogously to vector spaces and subspaces), H ⊆ G is
   a subgroup of G if and only if
   (a) for all h_1, h_2 ∈ H we have h_1 ∗ h_2 ∈ H and
   (b) for all h ∈ H, we have h^{-1} ∈ H.

3. If G and H are groups then a function φ : G → H is called a group homomorphism if
       φ(g_1 ∗ g_2) = φ(g_1) ∗ φ(g_2) for all g_1 and g_2 ∈ G .
   Note that the product on the left is in G whereas the product on the right is in H. We
   define the kernel of φ to be
       Ker(φ) = {g ∈ G : φ(g) = e_H} .
   Here, e_H refers to the identity element of H.

4. A group homomorphism φ : G → H is called
   a monomorphism (or injective, or one-to-one), if φ(g_1) = φ(g_2) implies g_1 = g_2;
   an epimorphism (or surjective, or onto), if φ(G) = H;
   an isomorphism (or bijective), if it is both injective and surjective.
   A group homomorphism φ : G → G is called an endomorphism of G. An endomorphism
   which is also an isomorphism is called an automorphism. The set of automorphisms
   of a group G is denoted by Aut(G).

5. Recall that, if F is a field (with operations + and ·) then (F, +) and (F \ {0}, ·) are
   abelian groups, with identity elements 0 and 1 respectively. If F and G are fields and
   φ : F → G is a function then we call φ a field homomorphism if
   (a) for all a, b ∈ F we have φ(a + b) = φ(a) + φ(b),
   (b) for all a, b ∈ F we have φ(ab) = φ(a)φ(b),
   (c) and φ(1_F) = 1_G. Here 1_F and 1_G are the multiplicative identities of F and G
       respectively.

6. If V and W are vector spaces over the same field F then we define their product to be
   the set
       V × W = {(v, w) : v ∈ V and w ∈ W} ,
   which becomes a vector space under the operations
       (v, w) + (v', w') = (v + v', w + w'), c(v, w) = (cv, cw) for v, v' ∈ V, w, w' ∈ W, c ∈ F.
   If you are familiar with the notion of an external direct sum, notice that the product
   of two vector spaces is the same as their external direct sum. The two notions cease
   being equivalent when one considers infinitely many factors/summands.
   If Z is another vector space over F then we call a function f : V × W → Z bilinear if
   (a) for each fixed v ∈ V, the function f_v : W → Z defined by
           f_v(w) = f((v, w))
       is a linear transformation as a function of w and
   (b) for each fixed w ∈ W, the function f_w : V → Z defined by
           f_w(v) = f((v, w))
       is a linear transformation as a function of v.
Exercises:

1. Suppose that G and H are groups and φ : G → H is a homomorphism.
   (a) Prove that if H' is a subgroup of H then the inverse image
           φ^{-1}(H') = {g ∈ G : φ(g) ∈ H'}
       is a subgroup of G. Deduce that Ker(φ) is a subgroup of G.
   (b) Prove that if G' is a subgroup of G, then the image of G' under φ,
           φ(G') = {φ(g) | g ∈ G'} ,
       is a subgroup of H.
   (c) Prove that φ is one-to-one if and only if Ker(φ) = {e_G}. (Here, e_G is the identity
       element of G.)

2. Prove that every field homomorphism is one-to-one.

3. Let V and W be finite dimensional vector spaces with dim V = n and dim W = m.
   Suppose that T : V → W is a linear transformation.
   (a) Prove that if n > m then T cannot be one-to-one.
   (b) Prove that if n < m then T cannot be onto.
   (c) Prove that if n = m then T is one-to-one if and only if T is onto.
4. Let F be a field, V, W be finite-dimensional F-vector spaces, and Z be any F-vector
   space. Choose a basis v_1, . . . , v_n of V and a basis w_1, . . . , w_m of W. Let
       {z_{i,j} : 1 ≤ i ≤ n, 1 ≤ j ≤ m}
   be any set of mn vectors from Z. Show that there is precisely one bilinear
   transformation f : V × W → Z such that
       f(v_i, w_j) = z_{i,j} for all i, j .

5. Let V be a vector space and T : V → V a linear transformation. Show that the
   following two statements are equivalent.
   (A) V = R(T) ⊕ N(T), where R(T) is the range of T and N(T) is the nullspace of T.
   (B) N(T) = N(T^2), where T^2 is T composed with itself.

6. Let V, W and Z be finite-dimensional vector spaces over a field F. If T : V → W and
   U : W → Z are linear transformations, prove that
       rank(UT) ≤ min{rank(U), rank(T)} .
   Prove also that if either of U or T is invertible, then the rank of UT is equal to the
   rank of the other one. Deduce that if P : V → V and Q : W → W are isomorphisms
   then the rank of QTP equals the rank of T.

7. Let V and W be finite-dimensional vector spaces over a field F and T : V → W be
   a linear transformation. Show that there exist ordered bases β of V and γ of W such
   that
       ([T]_β^γ)_{i,j} = 0 if i ≠ j, and 0 or 1 if i = j .

8. The purpose of this question is to show that the row rank of a matrix is equal to its
   column rank. Note that this is obviously true for a matrix of the form described in
   the previous exercise. Our goal will be to put an arbitrary matrix in this form without
   changing either its row rank or its column rank. Let A be an m × n matrix with entries
   from a field F.
   (a) Show that the column rank of A is equal to the rank of the linear transformation
       L_A : F^n → F^m defined by L_A(v) = A · v, viewing v as a column vector.
   (b) Use question 6 to show that if P and Q are invertible n × n and m × m matrices
       respectively then the column rank of QAP equals the column rank of A.
   (c) Show that the row rank of A is equal to the rank of the linear transformation
       R_A : F^m → F^n defined by R_A(v) = v · A, viewing v as a row vector.
   (d) Use question 6 to show that if P and Q are invertible n × n and m × m matrices
       respectively then the row rank of QAP equals the row rank of A.
   (e) Show that there exist n × n and m × m matrices P and Q respectively such that
       QAP has the form described in question 7. Deduce that the row rank of A equals
       the column rank of A.
9. Given an angle θ ∈ [0, 2π) let T_θ : R^2 → R^2 be the function which rotates a vector
   clockwise about the origin by an angle θ. Find the matrix of T_θ relative to the standard
   basis. You do not need to prove that T_θ is linear.

10. Given m ∈ R, define the line
        L_m = {(x, y) ∈ R^2 : y = mx} .
    (a) Let T_m be the function which maps a point in R^2 to its closest point in L_m. Find
        the matrix of T_m relative to the standard basis. You do not need to prove that
        T_m is linear.
    (b) Let R_m be the function which maps a point in R^2 to the reflection of this point
        about the line L_m. Find the matrix of R_m relative to the standard basis. You do
        not need to prove that R_m is linear.
    Hint for (a) and (b): first find the matrix relative to a carefully chosen basis
    and then perform a change of basis.

11. The quotient space V/W. Let V be an F-vector space, and W ⊆ V a subspace. A
    subset S ⊆ V is called a W-affine subspace of V if the following holds:
        for all s, s' ∈ S : s - s' ∈ W, and for all s ∈ S, w ∈ W : s + w ∈ S.
    (a) Let S and T be W-affine subspaces of V and c ∈ F. Put
            S + T := {s + t : s ∈ S, t ∈ T},  cT := {ct : t ∈ T} if c ≠ 0, and cT := W if c = 0 .
        Show that S + T and cT are again W-affine subspaces of V.
    (b) Show that the above operations define an F-vector space structure on the set of
        all W-affine subspaces of V.
        We will write V/W for the set of W-affine subspaces of V. We now know that it
        is a vector space. Note that the elements of V/W are subsets of V.
    (c) Show that if v ∈ V, then
            v + W := {v + w : w ∈ W}
        is a W-affine subspace of V. Show moreover that for any W-affine subspace S ⊆ V
        there exists a v ∈ V such that S = v + W.
    (d) Show that the map p : V → V/W defined by p(v) = v + W is linear and surjective.
    (e) Compute the nullspace of p and, if the dimension of V is finite, the dimension of
        V/W.
    A helpful way to think about the quotient space V/W is to think of it as being the
    vector space V, but with a new notion of equality of vectors. Namely, two vectors
    v_1, v_2 ∈ V are now seen as equal if v_1 - v_2 ∈ W. Use this point of view to find a
    solution for the following exercise. When you find it, use the formal definition given
    above to write your solution rigorously.

12. Let V and X be F-vector spaces, and f ∈ L(V, X). Let W be a subspace of V
    contained in N(f). Consider the quotient space V/W and the map p : V → V/W from
    the previous exercise.
    (a) Show that there exists a unique f̃ ∈ L(V/W, X) such that f = f̃ ∘ p.
    (b) Show that f̃ is injective if and only if W = N(f).
3 Dual spaces
3.1 Definitions

Consider the space F^n and write each vector as a column vector. When can we say that a
vector is zero? When all coordinates are zero. Further, we say that two vectors are the same
if all of their coordinates are the same. This motivates the definition of the coordinate maps
    e_i^* : F^n → F by e_i^*(v) = i-th coordinate of v .
Notice that each e_i^* is a linear function from F^n to F. Furthermore they are linearly
independent, so since the dimension of L(F^n, F) is n, they form a basis.
Last, it is clear that a vector v is zero if and only if e_i^*(v) = 0 for all i. This is true if
and only if f(v) = 0 for all f which are linear functions from F^n to F. This motivates the
following definition.

Definition 3.1.1. If V is a vector space over F we define the dual space V^* as the space of
linear functionals
    V^* = {f : V → F | f is linear} .
We can view this as the space L(V, F), where F is considered as a one-dimensional vector
space over itself.

Suppose that V is finite dimensional and f ∈ V^*. Then by the rank-nullity theorem,
either f ≡ 0 or N(f) is a (dim V - 1)-dimensional subspace of V. Conversely, you will show
in the homework that any (dim V - 1)-dimensional subspace W (that is, a hyperspace) is the
nullspace of some linear functional.

Definition 3.1.2. If β = {v_1, . . . , v_n} is a basis for V then we define the dual basis
β^* = {f_{v_1}, . . . , f_{v_n}} as the unique functionals satisfying
    f_{v_i}(v_j) = 1 if i = j, and 0 if i ≠ j .
From our proof of the dimension of L(V, W) we know that β^* is a basis of V^*.
Proposition 3.1.3. Given a basis β = {v_1, . . . , v_n} of V and dual basis β^* of V^* we can
write each f ∈ V^* as
    f = f(v_1) f_{v_1} + ... + f(v_n) f_{v_n} .
In other words, the coefficients for f in the dual basis are just f(v_1), . . . , f(v_n).

Proof. Given f ∈ V^*, we can write
    f = a_1 f_{v_1} + ... + a_n f_{v_n} .
To find the coefficients, we evaluate both sides at v_k. The left side is just f(v_k). The right
side is
    a_k f_{v_k}(v_k) = a_k .
Therefore a_k = f(v_k) and we are done.
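For V = F^n with basis β given by the columns of an invertible matrix B, the dual basis has
a concrete description: f_{v_i}(v) is the i-th entry of B^{-1} v, so the functionals
f_{v_1}, . . . , f_{v_n} are the rows of B^{-1}. A short sketch (matrix chosen only for illustration)
checking the defining property and the coefficient formula above:

```python
import numpy as np

B = np.array([[1., 2.],
              [0., 1.]])       # basis v1, v2 of R^2 as columns
dual = np.linalg.inv(B)        # row i is the dual functional f_{v_i}

assert np.allclose(dual @ B, np.eye(2))   # f_{v_i}(v_j) = 1 if i = j, else 0

f = np.array([[3., 7.]])       # the functional f(x, y) = 3x + 7y, as a row vector
coeffs = f @ B                 # (f(v_1), f(v_2))
assert np.allclose(coeffs @ dual, f)      # f = f(v_1) f_{v_1} + f(v_2) f_{v_2}
```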
3.2 Annihilators

We now study annihilators.

Definition 3.2.1. If S ⊆ V we define the annihilator of S as
    S° = {f ∈ V^* : f(v) = 0 for all v ∈ S} .

Theorem 3.2.2. Let V be a vector space and S ⊆ V.

1. S° is a subspace of V^* (although S does not have to be a subspace of V).
2. S° = (Span S)°.
3. If dim V < ∞ and U is a subspace of V then whenever {v_1, . . . , v_k} is a basis for U
   and {v_1, . . . , v_n} is a basis for V,
       {f_{v_{k+1}}, . . . , f_{v_n}} is a basis for U° .

Proof. First we show that S° is a subspace of V^*. Note that the zero functional obviously
sends every vector in S to zero, so 0 ∈ S°. If c ∈ F and f_1, f_2 ∈ S°, then for each v ∈ S,
    (cf_1 + f_2)(v) = cf_1(v) + f_2(v) = 0 .
So cf_1 + f_2 ∈ S° and S° is a subspace of V^*.
Next we show that S° = (Span S)°. To prove the forward inclusion, take f ∈ S°. Then
if v ∈ Span S we can write
    v = a_1 v_1 + ... + a_m v_m
for scalars a_i ∈ F and v_i ∈ S. Thus
    f(v) = a_1 f(v_1) + ... + a_m f(v_m) = 0 ,
so f ∈ (Span S)°. On the other hand if f ∈ (Span S)° then clearly f(v) = 0 for all v ∈ S
(since S ⊆ Span S). This completes the proof of item 2.
For the third item, we know that the functionals f_{v_{k+1}}, . . . , f_{v_n} are linearly independent.
Therefore we just need to show that they span U°. To do this, take f ∈ U°. We can write
f in terms of the dual basis {f_{v_1}, . . . , f_{v_n}}:
    f = a_1 f_{v_1} + ... + a_k f_{v_k} + a_{k+1} f_{v_{k+1}} + ... + a_n f_{v_n} .
Using the formula we have for the coefficients, we get a_j = f(v_j), which is zero for j ≤ k.
Therefore
    f = a_{k+1} f_{v_{k+1}} + ... + a_n f_{v_n}
and we are done.
Corollary 3.2.3. If V is finite dimensional and W is a subspace then
    dim V = dim W + dim W° .

Definition 3.2.4. For S' ⊆ V^* we define
    °(S') = {v ∈ V : f(v) = 0 for all f ∈ S'} .
In the homework you will prove similar properties for °(S').

Fact: v ∈ V is zero if and only if f(v) = 0 for all f ∈ V^*. One implication is easy. To prove
the other, suppose that v ≠ 0 and extend {v} to a basis for V. Then the dual basis has
the property that f_v(v) ≠ 0.

Proposition 3.2.5. If W ⊆ V is a subspace and V is finite-dimensional then °(W°) = W.

Proof. If w ∈ W then for all f ∈ W°, we have f(w) = 0, so w ∈ °(W°). Suppose conversely
that w ∈ V has f(w) = 0 for all f ∈ W°. If w ∉ W then build a basis {v_1, . . . , v_n} of V
such that {v_1, . . . , v_k} is a basis for W and v_{k+1} = w. Then by the previous theorem,
{f_{v_{k+1}}, . . . , f_{v_n}} is a basis for W°. However f_w(w) = 1 ≠ 0, which is a contradiction,
since f_w ∈ W°.
3.3 Double dual
Lemma 3.3.1. If v ∈ V is nonzero and dim V < ∞, there exists a linear functional f_v ∈ V^*
such that f_v(v) = 1. Therefore v = 0 if and only if f(v) = 0 for all f ∈ V^*.

Proof. Extend {v} to a basis of V and consider the dual basis. f_v is in this basis and
f_v(v) = 1.

For each v ∈ V we can define the evaluation map v̂ : V^* → F by
    v̂(f) = f(v) .

Theorem 3.3.2. Suppose that V is finite-dimensional. Then the map Λ : V → V^{**} given by
    Λ(v) = v̂
is an isomorphism.

Proof. First we show that if v ∈ V then Λ(v) ∈ V^{**}. Clearly v̂ maps V^* to F, but we just
need to show that v̂ is linear. If f_1, f_2 ∈ V^* and c ∈ F then
    v̂(cf_1 + f_2) = (cf_1 + f_2)(v) = cf_1(v) + f_2(v) = c v̂(f_1) + v̂(f_2) .
Therefore Λ(v) ∈ V^{**}. We now must show that Λ is linear and either one-to-one or onto
(since the dimension of V is equal to the dimension of V^{**}). First if v_1, v_2 ∈ V, c ∈ F then
we want to show that
    Λ(cv_1 + v_2) = cΛ(v_1) + Λ(v_2) .
Both sides are elements of V^{**} so we need to show they act the same on elements of V^*.
Let f ∈ V^*. Then
    Λ(cv_1 + v_2)(f) = f(cv_1 + v_2) = cf(v_1) + f(v_2) = cΛ(v_1)(f) + Λ(v_2)(f) = (cΛ(v_1) + Λ(v_2))(f) .
Finally to show one-to-one, we show that N(Λ) = {0}. If Λ(v) = 0 then for all f ∈ V^*,
    0 = Λ(v)(f) = f(v) .
This implies v = 0.
Theorem 3.3.3. Let V be finite dimensional.

1. If β = {v_1, . . . , v_n} is a basis for V then Λ(β) = {Λ(v_1), . . . , Λ(v_n)} is the double dual
   of this basis; that is, Λ(β) = (β^*)^*.
2. If W is a subspace of V then Λ(W) is equal to (W°)°.

Proof. Recall that the dual basis of β is β^* = {f_{v_1}, . . . , f_{v_n}}, where
    f_{v_i}(v_j) = 1 if i = j, and 0 if i ≠ j .
Since Λ is an isomorphism, Λ(β) is a basis of V^{**}. Now
    Λ(v_i)(f_{v_k}) = f_{v_k}(v_i) ,
which is 1 if i = k and 0 otherwise. This means Λ(β) = (β^*)^*.
Next if W is a subspace, let w ∈ W. Letting f ∈ W°,
    Λ(w)(f) = f(w) = 0 .
So Λ(w) ∈ (W°)°. However, since Λ is an isomorphism, Λ(W) is a subspace of (W°)°. But
they have the same dimension, so they are equal.
3.4 Dual maps

Definition 3.4.1. Let T : V → W be linear. We define the dual map T^* : W^* → V^* by
    T^*(g)(v) = g(T(v)) .

Theorem 3.4.2. Let V and W be finite dimensional and let β and γ be bases for V and W.
If T : V → W is linear, so is T^*. If β^* and γ^* are the dual bases, then
    [T^*]_{γ^*}^{β^*} = ([T]_β^γ)^t .

Proof. First we show that T^* is linear. If g_1, g_2 ∈ W^* and c ∈ F then for each v ∈ V,
    T^*(cg_1 + g_2)(v) = (cg_1 + g_2)(T(v)) = cg_1(T(v)) + g_2(T(v))
                      = cT^*(g_1)(v) + T^*(g_2)(v) = (cT^*(g_1) + T^*(g_2))(v) .
Next let β = {v_1, . . . , v_n}, γ = {w_1, . . . , w_m}, β^* = {f_{v_1}, . . . , f_{v_n}} and
γ^* = {g_{w_1}, . . . , g_{w_m}}, and write [T]_β^γ = (a_{i,j}). Recall from the lemma that for any
f ∈ V^* we have
    f = f(v_1) f_{v_1} + ... + f(v_n) f_{v_n} .
Therefore the coefficient of f_{v_i} for T^*(g_{w_k}) is
    T^*(g_{w_k})(v_i) = g_{w_k}(T(v_i)) = g_{w_k}(a_{1,i} w_1 + ... + a_{m,i} w_m) = a_{k,i} .
So
    T^*(g_{w_k}) = a_{k,1} f_{v_1} + ... + a_{k,n} f_{v_n} .
This is the k-th column of [T^*]_{γ^*}^{β^*}, which is therefore the k-th row of [T]_β^γ.
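For T = L_A : F^n → F^m and the standard bases, a functional g ∈ W^* can be identified with
a row vector: T^*(g)(v) = g(Av) = (gA)v, so T^* sends the row vector g to gA, and its
coordinate column is A^t times the column of g, in agreement with the theorem. A one-line
numerical check (made-up data, illustration only):

```python
import numpy as np

A = np.array([[1., 2., 0.],
              [0., 3., 4.]])   # T = L_A : R^3 -> R^2
g = np.array([5., -1.])        # a functional on R^2, as a row vector
v = np.array([1., 2., 3.])

assert np.isclose(g @ (A @ v), (g @ A) @ v)   # T*(g)(v) = g(T(v))
assert np.allclose(A.T @ g, g @ A)            # coordinates of T*(g) are A^t [g]
```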
Theorem 3.4.3. If V and W are finite dimensional and T : V → W is linear then
R(T^*) = (N(T))° and (R(T))° = N(T^*).

Proof. If g ∈ N(T^*) then T^*(g)(v) = 0 for all v ∈ V. If w ∈ R(T) then w = T(v) for
some v ∈ V. Then g(w) = g(T(v)) = T^*(g)(v) = 0. Thus g ∈ (R(T))°. If, conversely,
g ∈ (R(T))° then we would like to show that T^*(g)(v) = 0 for all v ∈ V. We have
    T^*(g)(v) = g(T(v)) = 0 ,
since T(v) ∈ R(T). This proves (R(T))° = N(T^*).
Next let f ∈ R(T^*). Then f = T^*(g) for some g ∈ W^*. If T(v) = 0, we have
f(v) = T^*(g)(v) = g(T(v)) = 0. Therefore f ∈ (N(T))°. This gives R(T^*) ⊆ (N(T))°. To
show the other direction, write dim V = n and dim W = m. Then
    dim R(T^*) = m - dim N(T^*)
and
    dim (N(T))° = n - dim N(T) .
However, by the first part, dim N(T^*) = dim (R(T))° = m - dim R(T), so
    dim R(T^*) = dim R(T) = n - dim N(T) = dim (N(T))° .
This gives the other inclusion.
3.5 Exercises

Notation:

1. Recall the definition of a bilinear function. Let F be a field, and V, W and Z be
   F-vector spaces. A function f : V × W → Z is called bilinear if
   (a) for each v ∈ V the function f_v : W → Z given by f_v(w) = f(v, w) is linear as a
       function of w and
   (b) for each w ∈ W the function f_w : V → Z given by f_w(v) = f(v, w) is linear as a
       function of v.
   When Z is the F-vector space F, one calls f a bilinear form.

2. Given a bilinear function f : V × W → Z, we define its left kernel and its right kernel
   as
       LN(f) = {v ∈ V : f(v, w) = 0 for all w ∈ W},
       RN(f) = {w ∈ W : f(v, w) = 0 for all v ∈ V} .
   More generally, for subspaces U ⊆ V and X ⊆ W we define their orthogonal complements
       U^{⊥_f} = {w ∈ W : f(u, w) = 0 for all u ∈ U},
       ^{⊥_f}X = {v ∈ V : f(v, x) = 0 for all x ∈ X} .
   Notice that LN(f) = ^{⊥_f}W and RN(f) = V^{⊥_f}.

Exercises:

1. Let V and W be vector spaces over a field F and let f : V × W → F be a bilinear
   form. For each v ∈ V, we denote by f_v the linear functional W → F given by
   f_v(w) = f(v, w). For each w ∈ W, we denote by f_w the linear functional V → F given
   by f_w(v) = f(v, w).
   (a) Show that the map
           φ : V → W^*, φ(v) = f_v
       is linear and its kernel is LN(f).
   (b) Analogously, show that the map
           ψ : W → V^*, ψ(w) = f_w
       is linear and its kernel is RN(f).
   (c) Assume now that V and W are finite-dimensional. Show that the map W^{**} → V^*
       given by composing ψ with the inverse of the canonical isomorphism W → W^{**}
       is equal to φ^*, the map dual to φ.
   (d) Assuming further dim(V) = dim(W), conclude that the following statements are
       equivalent:
       i. LN(f) = {0},
       ii. RN(f) = {0},
       iii. φ is an isomorphism,
       iv. ψ is an isomorphism.

2. Let V, W be finite-dimensional F-vector spaces. Denote by Λ_V and Λ_W the canonical
   isomorphisms V → V^{**} and W → W^{**}. Show that if T : V → W is linear then
       Λ_W^{-1} ∘ T^{**} ∘ Λ_V = T.

3. Let V be a finite-dimensional F-vector space. Show that any basis (φ_1, . . . , φ_n) of V^*
   is the dual basis to some basis (v_1, . . . , v_n) of V.

4. Let V be an F-vector space, and S' ⊆ V^* a subset. Recall the definition
       °S' = {v ∈ V : φ(v) = 0 for all φ ∈ S'} .
   Imitating a proof given in class, show the following:
   (a) °S' = °(span(S')).
   (b) Assume that V is finite-dimensional, let U' ⊆ V^* be a subspace, and let
       (φ_1, . . . , φ_n) be a basis for V^* such that (φ_1, . . . , φ_k) is a basis for U'. If
       (v_1, . . . , v_n) is the basis of V from the previous exercise, then (v_{k+1}, . . . , v_n) is a
       basis for °U'. In particular, dim(U') + dim(°U') = dim(V^*).

5. Let V be a finite-dimensional F-vector space, and U ⊆ V a hyperplane (that is, a
   subspace of V of dimension dim V - 1).
   (a) Show that there exists φ ∈ V^* with N(φ) = U.
   (b) Show that if ψ ∈ V^* is another functional with N(ψ) = U, then there exists
       c ∈ F with ψ = cφ.

6. (From Hoffman-Kunze)
   (a) Let A and B be n × n matrices with entries from a field F. Show that
       Tr(AB) = Tr(BA).
   (b) Let T : V → V be a linear transformation on a finite-dimensional vector space.
       Define the trace of T as the trace of the matrix of T, represented in some basis.
       Prove that the definition of trace does not depend on the basis thus chosen.
   (c) Prove that on the space of n × n matrices with entries from a field F, the trace
       function Tr is a linear functional. Show also that, conversely, if some linear
       functional g on this space satisfies g(AB) = g(BA) then g is a scalar multiple of
       the trace function.
4 Determinants
4.1 Permutations

Now we move to permutations. These will be used when we talk about the determinant.

Definition 4.1.1. A permutation on n letters is a function σ : {1, . . . , n} → {1, . . . , n}
which is a bijection.

The set of all permutations forms a group under composition. There are n! elements.
There are two main ways to write a permutation.

1. Row notation:
       1 2 3 4 5 6
       6 4 2 3 5 1
   Here we write the elements of {1, . . . , n} in the first row, in order. In the second row
   we write the elements they are mapped to, in order.

2. Cycle decomposition:
       (1 6)(2 4 3)(5)
   All cycles are disjoint. It is easier to compose permutations this way. Suppose σ is the
   permutation given above and σ' is the permutation
       σ' = (1 2 3 4 5)(6) .
   Then the product σσ' is (here we will apply σ' first)
       [(1 6)(2 4 3)(5)] [(1 2 3 4 5)(6)] = (1 4 5 6)(2)(3) .

It is a simple fact that each permutation has a cycle decomposition with disjoint cycles.

Definition 4.1.2. A transposition is a permutation that swaps two letters and fixes the
others. Removing the fixed letters, it looks like (i j) for i ≠ j. An adjacent transposition is
one that swaps neighboring letters.

Lemma 4.1.3. Every permutation can be written as a product of transpositions. (Not
necessarily disjoint.)

Proof. All we need to do is write a cycle as a product of transpositions. Note
    (a_1 a_2 ... a_k) = (a_1 a_k)(a_1 a_{k-1}) ... (a_1 a_3)(a_1 a_2) .

Definition 4.1.4. A pair of numbers (i, j) is an inversion pair for σ if i < j but σ(i) > σ(j).
Write N_inv(σ) for the number of inversion pairs of σ.
For example in the permutation (13)(245), also written as
1 2 3 4 5
3 4 1 5 2
,
we have inversion pairs (1, 3), (1, 5), (2, 3), (2, 5), (4, 5).
Lemma 4.1.5. Let $\sigma$ be a permutation and $\tau = (k\ k{+}1)$ be an adjacent transposition. Then $N_{\mathrm{inv}}(\tau\sigma) = N_{\mathrm{inv}}(\sigma) \pm 1$. If $\tau_1, \ldots, \tau_m$ are adjacent transpositions then
\[ N_{\mathrm{inv}}(\tau_1 \cdots \tau_m \sigma) - N_{\mathrm{inv}}(\sigma) \text{ is } \begin{cases} \text{even} & m \text{ even} \\ \text{odd} & m \text{ odd} \end{cases} . \]

Proof. Let $a < b$ in $\{1, \ldots, n\}$. If $\{\sigma(a), \sigma(b)\} = \{k, k+1\}$ then $\tau\sigma(a) - \tau\sigma(b) = -(\sigma(a) - \sigma(b))$, so $(a,b)$ is an inversion pair for $\tau\sigma$ if and only if it is not one for $\sigma$. We claim that in all other cases, the sign of $\tau\sigma(a) - \tau\sigma(b)$ is the same as the sign of $\sigma(a) - \sigma(b)$. If neither of $\sigma(a)$ and $\sigma(b)$ is in $\{k, k+1\}$ then $\tau\sigma(a) - \tau\sigma(b) = \sigma(a) - \sigma(b)$. The other cases are somewhat similar: if $\sigma(a) = k$ but $\sigma(b) > k+1$ then $\tau\sigma(a) - \tau\sigma(b) = k+1 - \sigma(b) < 0$ and $\sigma(a) - \sigma(b) = k - \sigma(b) < 0$. Keep going.

Therefore $\tau\sigma$ has exactly the same inversion pairs as $\sigma$ except for the pair $(a,b)$ with $\{\sigma(a), \sigma(b)\} = \{k, k+1\}$ (that is, $\sigma^{-1}(k)$ and $\sigma^{-1}(k+1)$ listed in increasing order), which switches status. This proves the first claim; the second follows by applying the first claim $m$ times.

Definition 4.1.6. Given a permutation $\sigma$ on $n$ letters, we say that $\sigma$ is even if it can be written as a product of an even number of transpositions and odd otherwise. This is called the signature (or sign) of a permutation:
\[ \mathrm{sgn}(\sigma) = \begin{cases} +1 & \text{if } \sigma \text{ is even} \\ -1 & \text{if } \sigma \text{ is odd} \end{cases} . \]
Theorem 4.1.7. If $\sigma$ can be written as a product of an even number of transpositions, it cannot be written as a product of an odd number of transpositions. In other words, the signature is well-defined.

Proof. Suppose that $\sigma = s_1 \cdots s_k$ and $\sigma = t_1 \cdots t_j$. We want to show that $k - j$ is even. In other words, if $k$ is odd, so is $j$, and if $k$ is even, so is $j$.

Now note that each transposition can be written as a product of an odd number of adjacent transpositions:
\[ (5\ 1) = (5\ 4)(4\ 3)(3\ 2)(1\ 2)(2\ 3)(3\ 4)(4\ 5) , \]
so write $s_1 \cdots s_k = \tilde{s}_1 \cdots \tilde{s}_{k'}$ and $t_1 \cdots t_j = \tilde{t}_1 \cdots \tilde{t}_{j'}$ as products of adjacent transpositions, where $k - k'$ is even and $j - j'$ is even.

We have $0 = N_{\mathrm{inv}}(\mathrm{id}) = N_{\mathrm{inv}}(\tilde{t}_{j'}^{-1} \cdots \tilde{t}_1^{-1} \sigma)$, which is $N_{\mathrm{inv}}(\sigma)$ plus an even number if $j'$ is even or an odd number if $j'$ is odd. This means that $N_{\mathrm{inv}}(\sigma) - j'$ is even. The same argument works for $k$, so $N_{\mathrm{inv}}(\sigma) - k'$ is even. Now
\[ j - k = (j - j') + (j' - N_{\mathrm{inv}}(\sigma)) + (N_{\mathrm{inv}}(\sigma) - k') + (k' - k) \]
is even.

Corollary 4.1.8. For any two permutations $\sigma$ and $\sigma'$,
\[ \mathrm{sgn}(\sigma\sigma') = \mathrm{sgn}(\sigma)\,\mathrm{sgn}(\sigma') . \]
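A brief Python sketch (not from the notes) may help make the inversion-count description of the sign concrete: by the proof of Theorem 4.1.7, $\mathrm{sgn}(\sigma) = (-1)^{N_{\mathrm{inv}}(\sigma)}$, and the multiplicativity of Corollary 4.1.8 can then be checked exhaustively on $S_4$.

    from itertools import permutations

    def sgn(p):                      # p is a tuple with p[i] = sigma(i+1)
        inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
        return -1 if inv % 2 else 1

    def compose(p, q):               # (p o q)(i), applying q first
        return tuple(p[q[i] - 1] for i in range(len(q)))

    S4 = list(permutations(range(1, 5)))
    assert all(sgn(compose(p, q)) == sgn(p) * sgn(q) for p in S4 for q in S4)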
4.2 Determinants: existence and uniqueness

Given $n$ vectors $v_1, \ldots, v_n$ in $\mathbb{R}^n$ we want to define something like the volume of the parallelepiped spanned by these vectors. What properties would we expect of a volume?

1. $\mathrm{vol}(e_1, \ldots, e_n) = 1$.
2. If two of the vectors $v_i$ are equal the volume should be zero.
3. For each $c > 0$, $\mathrm{vol}(cv_1, v_2, \ldots, v_n) = c\,\mathrm{vol}(v_1, \ldots, v_n)$. Same in other arguments.
4. For each $v_1'$, $\mathrm{vol}(v_1 + v_1', v_2, \ldots, v_n) = \mathrm{vol}(v_1, \ldots, v_n) + \mathrm{vol}(v_1', v_2, \ldots, v_n)$. Same in other arguments.
Using the motivating example of the volume, we define a multilinear function as follows.

Definition 4.2.1. If $V$ is an $n$-dimensional vector space over $F$ then define
\[ V^n = \{ (v_1, \ldots, v_n) : v_i \in V \text{ for all } i = 1, \ldots, n \} . \]
A function $f : V^n \to F$ is called multilinear if for each $i$ and vectors $v_1, \ldots, v_{i-1}, v_{i+1}, \ldots, v_n \in V$, the function $f_i : V \to F$ is linear, where
\[ f_i(v) = f(v_1, \ldots, v_{i-1}, v, v_{i+1}, \ldots, v_n) . \]
A multilinear function $f$ is called alternating if $f(v_1, \ldots, v_n) = 0$ whenever $v_i = v_j$ for some $i \neq j$.
Proposition 4.2.2. Let $f : V^n \to F$ be a multilinear function. If $F$ does not have characteristic two then $f$ is alternating if and only if for all $v_1, \ldots, v_n$ and $i < j$,
\[ f(v_1, \ldots, v_i, \ldots, v_j, \ldots, v_n) = -f(v_1, \ldots, v_j, \ldots, v_i, \ldots, v_n) . \]

Proof. Suppose that $f$ is alternating. Then
\begin{align*}
0 &= f(v_1, \ldots, v_i + v_j, \ldots, v_i + v_j, \ldots, v_n) \\
&= f(v_1, \ldots, v_i, \ldots, v_i + v_j, \ldots, v_n) + f(v_1, \ldots, v_j, \ldots, v_i + v_j, \ldots, v_n) \\
&= f(v_1, \ldots, v_i, \ldots, v_j, \ldots, v_n) + f(v_1, \ldots, v_j, \ldots, v_i, \ldots, v_n) .
\end{align*}
Conversely suppose that $f$ has the property above. Then if $v_i = v_j$,
\[ f(v_1, \ldots, v_i, \ldots, v_j, \ldots, v_n) = -f(v_1, \ldots, v_j, \ldots, v_i, \ldots, v_n) = -f(v_1, \ldots, v_i, \ldots, v_j, \ldots, v_n) . \]
Since $F$ does not have characteristic two, this forces $f(v_1, \ldots, v_i, \ldots, v_j, \ldots, v_n) = 0$.
Corollary 4.2.3. Let $f : V^n \to F$ be an $n$-linear alternating function. Then for each $\sigma \in S_n$,
\[ f(v_{\sigma(1)}, \ldots, v_{\sigma(n)}) = \mathrm{sgn}(\sigma)\, f(v_1, \ldots, v_n) . \]

Proof. Write $\sigma = \tau_1 \cdots \tau_k$ where the $\tau_i$'s are transpositions and $(-1)^k = \mathrm{sgn}(\sigma)$. Then, removing the last transposition and using the proposition,
\[ f(v_{\sigma(1)}, \ldots, v_{\sigma(n)}) = -f(v_{\tau_1 \cdots \tau_{k-1}(1)}, \ldots, v_{\tau_1 \cdots \tau_{k-1}(n)}) . \]
Applying this $k-1$ more times gives the corollary.
Theorem 4.2.4. Let $v_1, \ldots, v_n$ be a basis for $V$. There is at most one multilinear alternating function $f : V^n \to F$ such that $f(v_1, \ldots, v_n) = 1$.

Proof. Let $u_1, \ldots, u_n \in V$ and write
\[ u_k = a_{1,k} v_1 + \cdots + a_{n,k} v_n . \]
Then
\[ f(u_1, \ldots, u_n) = \sum_{i_1 = 1}^n a_{i_1,1}\, f(v_{i_1}, u_2, \ldots, u_n) = \sum_{i_1, \ldots, i_n = 1}^n a_{i_1,1} \cdots a_{i_n,n}\, f(v_{i_1}, \ldots, v_{i_n}) . \]
However whenever two $i_j$'s are equal, we get zero, so we can restrict the sum to all distinct $i_j$'s. So this is
\[ \sum_{\substack{i_1, \ldots, i_n = 1 \\ \text{distinct}}}^n a_{i_1,1} \cdots a_{i_n,n}\, f(v_{i_1}, \ldots, v_{i_n}) . \]
This can now be written as
\[ \sum_{\sigma \in S_n} a_{\sigma(1),1} \cdots a_{\sigma(n),n}\, f(v_{\sigma(1)}, \ldots, v_{\sigma(n)}) = \sum_{\sigma \in S_n} \mathrm{sgn}(\sigma)\, a_{\sigma(1),1} \cdots a_{\sigma(n),n}\, f(v_1, \ldots, v_n) = \sum_{\sigma \in S_n} \mathrm{sgn}(\sigma)\, a_{\sigma(1),1} \cdots a_{\sigma(n),n} . \]
So $f$ is completely determined by the coefficients $a_{i,k}$, and there is at most one such function.
We now find that if $\dim V = n$ and $f : V^n \to F$ is an $n$-linear alternating function with $f(v_1, \ldots, v_n) = 1$ for some fixed basis $v_1, \ldots, v_n$, then we have a specific form for $f$. Writing vectors $u_1, \ldots, u_n$ as
\[ u_k = a_{1,k} v_1 + \cdots + a_{n,k} v_n , \]
then we have
\[ f(u_1, \ldots, u_n) = \sum_{\sigma \in S_n} \mathrm{sgn}(\sigma)\, a_{\sigma(1),1} \cdots a_{\sigma(n),n} . \]
Now we would like to show that the formula above indeed does define an $n$-linear alternating function with the required property.

1. Alternating. Suppose that $u_i = u_j$ for some $i < j$. We will then split the set of permutations into two classes. Let $A = \{\sigma \in S_n : \sigma(i) < \sigma(j)\}$. Letting $\pi_{i,j} = (i\,j)$, associate to each $\sigma \in A$ the permutation $\sigma\pi_{i,j}$; these make up $S_n \setminus A$. Then
\begin{align*}
f(u_1, \ldots, u_n) &= \sum_{\sigma \in A} \mathrm{sgn}(\sigma)\, a_{\sigma(1),1} \cdots a_{\sigma(n),n} + \sum_{\sigma \in S_n \setminus A} \mathrm{sgn}(\sigma)\, a_{\sigma(1),1} \cdots a_{\sigma(n),n} \\
&= \sum_{\sigma \in A} \mathrm{sgn}(\sigma)\, a_{\sigma(1),1} \cdots a_{\sigma(n),n} + \sum_{\sigma \in A} \mathrm{sgn}(\sigma\pi_{i,j})\, a_{\sigma\pi_{i,j}(1),1} \cdots a_{\sigma\pi_{i,j}(n),n} .
\end{align*}
However, $\sigma\pi_{i,j}(k) = \sigma(k)$ when $k \neq i,j$ and $u_i = u_j$, so this equals
\[ \sum_{\sigma \in A} \big[ \mathrm{sgn}(\sigma) + \mathrm{sgn}(\sigma\pi_{i,j}) \big]\, a_{\sigma(1),1} \cdots a_{\sigma(n),n} = 0 . \]

2. Value 1 at the basis. Note that for $u_i = v_i$ for all $i$ we have
\[ a_{i,j} = \begin{cases} 1 & i = j \\ 0 & i \neq j \end{cases} . \]
Therefore the value of $f$ is
\[ \sum_{\sigma \in S_n} \mathrm{sgn}(\sigma)\, a_{\sigma(1),1} \cdots a_{\sigma(n),n} = \mathrm{sgn}(\mathrm{id})\, a_{1,1} \cdots a_{n,n} = 1 . \]

3. Multilinear. Write
\[ u_k = a_{1,k} v_1 + \cdots + a_{n,k} v_n \quad \text{for } k = 1, \ldots, n \]
and write $u = b_1 v_1 + \cdots + b_n v_n$. Now for $c \in F$,
\[ cu + u_1 = (cb_1 + a_{1,1}) v_1 + \cdots + (cb_n + a_{n,1}) v_n . \]
Then
\begin{align*}
f(cu + u_1, u_2, \ldots, u_n) &= \sum_{\sigma \in S_n} \mathrm{sgn}(\sigma)\, [cb_{\sigma(1)} + a_{\sigma(1),1}]\, a_{\sigma(2),2} \cdots a_{\sigma(n),n} \\
&= c \sum_{\sigma \in S_n} \mathrm{sgn}(\sigma)\, b_{\sigma(1)}\, a_{\sigma(2),2} \cdots a_{\sigma(n),n} + \sum_{\sigma \in S_n} \mathrm{sgn}(\sigma)\, a_{\sigma(1),1}\, a_{\sigma(2),2} \cdots a_{\sigma(n),n} \\
&= c f(u, u_2, \ldots, u_n) + f(u_1, \ldots, u_n) .
\end{align*}
This gives linearity in the first argument; the other arguments are handled in the same way.
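The explicit formula above (the sum over $S_n$) can be evaluated directly for small matrices. The Python sketch below is not part of the notes; it simply compares that sum with numpy's determinant on an arbitrary example matrix.

    import numpy as np
    from itertools import permutations

    def sgn(p):
        return -1 if sum(p[i] > p[j] for i in range(len(p)) for j in range(i + 1, len(p))) % 2 else 1

    def leibniz_det(A):
        n = A.shape[0]
        return sum(sgn(p) * np.prod([A[p[k], k] for k in range(n)]) for p in permutations(range(n)))

    A = np.array([[2., 1., 0.], [1., 3., 1.], [0., 1., 4.]])
    print(leibniz_det(A), np.linalg.det(A))   # both give 18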
4.3 Properties of determinants

Theorem 4.3.1. Let $f : V^n \to F$ be a multilinear alternating function and let $v_1, \ldots, v_n$ be a basis with $f(v_1, \ldots, v_n) \neq 0$. Then $u_1, \ldots, u_n$ is linearly dependent if and only if $f(u_1, \ldots, u_n) = 0$.

Proof. One direction is on the homework: suppose that $f(u_1, \ldots, u_n) = 0$ but that $u_1, \ldots, u_n$ is linearly independent. Then it is a basis, so we may write
\[ v_k = a_{1,k} u_1 + \cdots + a_{n,k} u_n . \]
By the same computation as above,
\[ f(v_1, \ldots, v_n) = f(u_1, \ldots, u_n) \sum_{\sigma \in S_n} \mathrm{sgn}(\sigma)\, a_{\sigma(1),1} \cdots a_{\sigma(n),n} = 0 , \]
which is a contradiction. Therefore $u_1, \ldots, u_n$ is linearly dependent.

Conversely, if $u_1, \ldots, u_n$ is linearly dependent then for $n \geq 2$ we can find some $u_j$ and scalars $a_i$ for $i \neq j$ such that
\[ u_j = \sum_{i \neq j} a_i u_i . \]
Now we have
\[ f(u_1, \ldots, u_j, \ldots, u_n) = \sum_{i \neq j} a_i f(u_1, \ldots, u_i, \ldots, u_n) = 0 , \]
since each term on the right has a repeated entry.
Definition 4.3.2. On the space $F^n$ we define $\det : (F^n)^n \to F$ as the unique alternating $n$-linear function that gives $\det(e_1, \ldots, e_n) = 1$. If $A$ is an $n \times n$ matrix then we define
\[ \det A = \det(a_1, \ldots, a_n) , \]
where $a_k$ is the $k$-th column of $A$.

Corollary 4.3.3. An $n \times n$ matrix $A$ over $F$ is invertible if and only if $\det A \neq 0$.

Proof. We have $\det A \neq 0$ if and only if the columns of $A$ are linearly independent. This is true if and only if $A$ is invertible.
We start with the multiplicative property of determinants.

Theorem 4.3.4. Let $A$ and $B$ be $n \times n$ matrices over a field $F$. Then
\[ \det(AB) = \det A \cdot \det B . \]

Proof. If $\det B = 0$ then $B$ is not invertible, so it cannot have full column rank. Therefore neither can $AB$ (by a homework problem). This means $\det(AB) = 0$ and we are done. Otherwise $\det B \neq 0$. Define a function $f : M_{n \times n}(F) \to F$ by
\[ f(A) = \frac{\det(AB)}{\det B} . \]
We claim that $f$ is $n$-linear and alternating as a function of the columns of $A$, and assigns the value 1 to the standard basis (that is, the identity matrix).
1. $f$ is alternating. If $A$ has two equal columns then its column rank is not full. Therefore neither can be the column rank of $AB$, and we have $\det(AB) = 0$. This implies $f(A) = 0$.
2. $f(I) = 1$. This is clear since $IB = B$.
3. $f$ is $n$-linear. This follows because $\det$ is.
But there is exactly one function satisfying the above. We find $f(A) = \det A$ and we are done.
For the rest of the lecture we will give further properties of determinants.

• $\det A = \det A^t$. This is on homework.

• $\det$ is alternating and $n$-linear as a function of rows.

• If $A$ is (a block matrix) of the form
\[ A = \begin{pmatrix} B & C \\ 0 & D \end{pmatrix} \]
then $\det A = \det B \cdot \det D$. This is also on homework.

• The determinant is unchanged if we add a multiple of one column (or row) to another. To show this, write a matrix $A$ as a collection of columns $(a_1, \ldots, a_n)$. For example, if we add a multiple of column 1 to column 2 we get
\[ \det(a_1, ca_1 + a_2, a_3, \ldots, a_n) = c \det(a_1, a_1, a_3, \ldots, a_n) + \det(a_1, a_2, a_3, \ldots, a_n) = \det(a_1, a_2, a_3, \ldots, a_n) . \]

• $\det(cA) = c^n \det A$.
We will now discuss cofactor expansion.

Definition 4.3.5. Let $A \in M_{n \times n}(F)$. For $1 \leq i, j \leq n$ define the $(i,j)$-minor of $A$ (written $A(i|j)$) to be the $(n-1) \times (n-1)$ matrix obtained from $A$ by removing the $i$-th row and the $j$-th column.
Theorem 4.3.6 (Laplace expansion). Let $A \in M_{n \times n}(F)$ for $n \geq 2$ and fix $j \in \{1, \ldots, n\}$. We have
\[ \det A = \sum_{i=1}^n (-1)^{i+j} A_{i,j} \det A(i|j) . \]

Proof. Let us begin by taking $j = 1$. Now write the column $a_1$ as
\[ a_1 = A_{1,1} e_1 + A_{2,1} e_2 + \cdots + A_{n,1} e_n . \]
Then we get
\[ \det A = \det(a_1, \ldots, a_n) = \sum_{i=1}^n A_{i,1} \det(e_i, a_2, \ldots, a_n) . \tag{2} \]
We now consider the term $\det(e_i, a_2, \ldots, a_n)$. This is the determinant of the matrix
\[ \begin{pmatrix} 0 & A_{1,2} & \cdots & A_{1,n} \\ \vdots & \vdots & & \vdots \\ 1 & A_{i,2} & \cdots & A_{i,n} \\ \vdots & \vdots & & \vdots \\ 0 & A_{n,2} & \cdots & A_{n,n} \end{pmatrix} . \]
Here, the first column is 0 except for a 1 in the $i$-th spot. We can now swap the $i$-th row to the top using $i-1$ adjacent transpositions $(1\,2) \cdots (i{-}1\ i)$. We are left with $(-1)^{i-1}$ times the determinant of the matrix
\[ \begin{pmatrix} 1 & A_{i,2} & \cdots & A_{i,n} \\ 0 & A_{1,2} & \cdots & A_{1,n} \\ \vdots & \vdots & & \vdots \\ 0 & A_{i-1,2} & \cdots & A_{i-1,n} \\ 0 & A_{i+1,2} & \cdots & A_{i+1,n} \\ \vdots & \vdots & & \vdots \\ 0 & A_{n,2} & \cdots & A_{n,n} \end{pmatrix} . \]
This is a block matrix of the form
\[ \begin{pmatrix} 1 & B \\ 0 & A(i|1) \end{pmatrix} . \]
By the remarks earlier, its determinant is equal to $1 \cdot \det A(i|1)$. Plugging this into formula (2), we get
\[ \det A = \sum_{i=1}^n (-1)^{i-1} A_{i,1} \det A(i|1) , \]
which equals $\sum_{i=1}^n (-1)^{i+1} A_{i,1} \det A(i|1)$, the claimed formula for $j = 1$.

If $j \neq 1$ then we perform $j-1$ adjacent column switches to get the $j$-th column to the first. This gives us a new matrix $\tilde{A}$. For this matrix, the formula holds. Compensating for the switches,
\[ \det A = (-1)^{j-1} \det \tilde{A} = (-1)^{j-1} \sum_{i=1}^n (-1)^{i-1} \tilde{A}_{i,1} \det \tilde{A}(i|1) = \sum_{i=1}^n (-1)^{i+j} A_{i,j} \det A(i|j) . \]
Check the last equality. This completes the proof.
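The expansion along the first column can be turned directly into a recursive computation. The Python sketch below is only an illustration (the test matrix is arbitrary), comparing it with numpy's determinant.

    import numpy as np

    # Cofactor (Laplace) expansion along the first column.
    def laplace_det(A):
        n = A.shape[0]
        if n == 1:
            return A[0, 0]
        total = 0.0
        for i in range(n):
            minor = np.delete(np.delete(A, i, axis=0), 0, axis=1)   # A(i|1)
            total += (-1) ** i * A[i, 0] * laplace_det(minor)
        return total

    A = np.array([[1., 2., 0.], [3., 1., 4.], [0., 2., 5.]])
    print(laplace_det(A), np.linalg.det(A))   # same value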
We have discussed determinants of matrices (and of $n$ vectors in $F^n$). We will now define the determinant of a transformation.

Definition 4.3.7. Let $V$ be a finite dimensional vector space over a field $F$. If $T : V \to V$ is linear, we define $\det T$ as $\det [T]_\beta^\beta$ for any basis $\beta$ of $V$.

Note that $\det T$ does not depend on the choice of basis. Indeed, if $\beta'$ is another basis,
\[ \det [T]_{\beta'}^{\beta'} = \det\big( [I]_\beta^{\beta'}\, [T]_\beta^\beta\, [I]_{\beta'}^\beta \big) = \det [T]_\beta^\beta . \]

• If $T$ and $U$ are linear transformations from $V$ to $V$ then $\det TU = \det T \cdot \det U$.

• $\det cT = c^{\dim V} \det T$.

• $\det T = 0$ if and only if $T$ is non-invertible.
4.4 Exercises

Notation:

1. $[n] = \{1, \ldots, n\}$ is the finite set of natural numbers between 1 and $n$;

2. $S_n$ is the set of all bijective maps $[n] \to [n]$;

3. For a sequence $k_1, \ldots, k_t$ of distinct elements of $[n]$, we denote by $(k_1\, k_2\, \ldots\, k_t)$ the element $\sigma$ of $S_n$ defined by
\[ \sigma(i) = \begin{cases} k_{s+1} & i = k_s,\ 1 \leq s < t \\ k_1 & i = k_t \\ i & i \notin \{k_1, \ldots, k_t\} \end{cases} \]
Elements of this form are called cycles (or $t$-cycles). Two cycles $(k_1 \ldots k_t)$ and $(l_1 \ldots l_s)$ are called disjoint if the sets $\{k_1, \ldots, k_t\}$ and $\{l_1, \ldots, l_s\}$ are disjoint.

4. Let $\sigma \in S_n$. A subset $\{k_1, \ldots, k_t\} \subseteq [n]$ is called an orbit of $\sigma$ if the following conditions hold:
   • For any $j \in \mathbb{N}$ there exists $1 \leq i \leq t$ such that $\sigma^j(k_1) = k_i$.
   • For any $1 \leq i \leq t$ there exists $j \in \mathbb{N}$ such that $k_i = \sigma^j(k_1)$.
   Here $\sigma^j$ is the product of $j$ copies of $\sigma$.

5. Let $V$ and $W$ be two vector spaces over an arbitrary field $F$, and $k \in \mathbb{N}$. Recall that a $k$-linear map $f : V^k \to W$ is called
   • alternating, if $f(v_1, \ldots, v_k) = 0$ whenever the vectors $(v_1, \ldots, v_k)$ are not distinct;
   • skew-symmetric, if $f(v_1, \ldots, v_k) = -f(v_{\tau(1)}, \ldots, v_{\tau(k)})$ for any transposition $\tau \in S_k$;
   • symmetric, if $f(v_1, \ldots, v_k) = f(v_{\tau(1)}, \ldots, v_{\tau(k)})$ for any transposition $\tau \in S_k$.

6. If $k$ and $n$ are positive integers such that $k \leq n$, the binomial coefficient $\binom{n}{k}$ is defined as
\[ \binom{n}{k} = \frac{n!}{k!(n-k)!} . \]
Note that this number is equal to the number of distinct subsets of size $k$ of a finite set of size $n$.

7. Given an $F$-vector space $V$, denote by $\mathrm{Alt}^k(V)$ the set of alternating $k$-linear forms (functions) on $V$; that is, the set of alternating $k$-linear maps $V^k \to F$.
Exercises:

1. Prove that composition of maps defines a group law on $S_n$. Show that this group is abelian only if $n \leq 2$.

2. Let $f : S_n \to \mathbb{Z}$ be a function which is multiplicative, i.e. $f(\sigma\tau) = f(\sigma)f(\tau)$. Show that $f$ must be one of the following three functions: $f(\sigma) = 0$, $f(\sigma) = 1$, $f(\sigma) = \mathrm{sgn}(\sigma)$.

3. (From Dummit-Foote) List explicitly the 24 permutations of degree 4 and state which are odd and which are even.

4. Let $k_1, \ldots, k_t$ be a sequence of distinct elements of $[n]$. Show that $\mathrm{sgn}((k_1 \ldots k_t)) = (-1)^{t-1}$.

5. Let $\sigma \in S_n$ be the element $(k_1 \ldots k_t)$ from the previous exercise, and let $\tau \in S_n$ be any element. Find a formula for the element $\tau\sigma\tau^{-1}$.

6. Let $\sigma = (k_1 \ldots k_t)$ and $\tau = (l_1 \ldots l_s)$ be disjoint cycles. Show that then $\sigma\tau = \tau\sigma$. One says that $\sigma$ and $\tau$ commute.

7. Let $\sigma \in S_n$. Show that $\sigma$ can be written as a product of disjoint (and hence, by the previous exercise, commuting) cycles.
Hint: Consider the orbits of $\sigma$.
8. If $G$ is a group and $S \subseteq G$ is a subset, define $\langle S \rangle$ to be the intersection of all subgroups of $G$ that contain $S$. (This is the subgroup of $G$ generated by $S$.)
   (a) Show that if $S$ is a subset of $G$ then $\langle S \rangle$ is a subgroup of $G$.
   (b) Let $S$ be a subset of $G$ and define
\[ \tilde{S} = S \cup \{ s^{-1} : s \in S \} . \]
   Show that
\[ \langle S \rangle = \{ a_1 \cdots a_k : k \geq 1 \text{ and } a_i \in \tilde{S} \text{ for all } i \} . \]

9. Prove that $S_n = \langle (1\,2),\ (1\,2\,\cdots\,n) \rangle$.
Hint: Use exercise 5.

10. Let $V$ be a finite-dimensional vector space over some field $F$, $W$ an arbitrary vector space over $F$, and $k > \dim(V)$. Show that every alternating $k$-linear function $V^k \to W$ is identically zero. Give an example (choose $F$, $V$, $W$, and $k$ as you wish, as long as $k > \dim(V)$) of a skew-symmetric $k$-linear function $V^k \to W$ which is not identically zero.

11. (From Hoffman-Kunze) Let $F$ be a field and $f : F^2 \times F^2 \to F$ be a 2-linear alternating function. Show that
\[ f\left( \begin{pmatrix} a \\ b \end{pmatrix}, \begin{pmatrix} c \\ d \end{pmatrix} \right) = (ad - bc)\, f(e_1, e_2) . \]
Find an analogous formula for $F^3$. Deduce from this the formula for the determinant of a $2 \times 2$ and a $3 \times 3$ matrix.

12. Let $V$ be an $n$-dimensional vector space over a field $F$. Suppose that $f : V^n \to F$ is an $n$-linear alternating function such that $f(v_1, \ldots, v_n) \neq 0$ for some basis $v_1, \ldots, v_n$ of $V$. Show that $f(u_1, \ldots, u_n) = 0$ implies that $u_1, \ldots, u_n$ is linearly dependent.
13. Let $V$ and $W$ be vector spaces over a field $F$, and $f : V \to W$ a linear map.
   (a) For $\omega \in \mathrm{Alt}^k(W)$ let $f^*\omega$ be the function on $V^k$ defined by
\[ [f^*\omega](v_1, \ldots, v_k) = \omega(f(v_1), \ldots, f(v_k)) . \]
   Show that $f^*\omega \in \mathrm{Alt}^k(V)$.
   (b) Show that in this way we obtain a linear map $f^* : \mathrm{Alt}^k(W) \to \mathrm{Alt}^k(V)$.
   (c) Show that, given a third vector space $X$ over $F$ and a linear map $g : W \to X$, one has $(g \circ f)^* = f^* \circ g^*$.
   (d) Show that if $f$ is an isomorphism, then so is $f^*$.
14. For $n \geq 2$, we call $M \in M_{n \times n}(F)$ a block upper-triangular matrix if there exists $k$ with $1 \leq k \leq n-1$ and matrices $A \in M_{k \times k}(F)$, $B \in M_{k \times (n-k)}(F)$ and $C \in M_{(n-k) \times (n-k)}(F)$ such that $M$ has the form
\[ \begin{pmatrix} A & B \\ 0 & C \end{pmatrix} . \]
That is, the elements of $M$ are given by
\[ M_{i,j} = \begin{cases} A_{i,j} & 1 \leq i \leq k,\ 1 \leq j \leq k \\ B_{i,j-k} & 1 \leq i \leq k,\ k < j \leq n \\ 0 & k < i \leq n,\ 1 \leq j \leq k \\ C_{i-k,j-k} & k < i \leq n,\ k < j \leq n \end{cases} . \]
We will show in this exercise that
\[ \det M = \det A \cdot \det C . \tag{3} \]
   (a) Show that if $\det C = 0$ then formula (3) holds.
   (b) Suppose that $\det C \neq 0$ and define a function $\tilde{A} \mapsto \delta(\tilde{A})$ for $\tilde{A} \in M_{k \times k}(F)$ by
\[ \delta(\tilde{A}) = [\det C]^{-1} \det \begin{pmatrix} \tilde{A} & B \\ 0 & C \end{pmatrix} . \]
   That is, $\delta(\tilde{A})$ is a scalar multiple of the determinant of the block upper-triangular matrix we get when we replace $A$ by $\tilde{A}$ and keep $B$ and $C$ fixed.
       i. Show that $\delta$ is $k$-linear as a function of the columns of $\tilde{A}$.
       ii. Show that $\delta$ is alternating and satisfies $\delta(I_k) = 1$, where $I_k$ is the $k \times k$ identity matrix.
       iii. Conclude that formula (3) holds when $\det C \neq 0$.
15. Suppose that $A \in M_{n \times n}(F)$ is upper-triangular; that is, $A_{i,j} = 0$ when $1 \leq j < i \leq n$. Show that $\det A = A_{1,1} A_{2,2} \cdots A_{n,n}$.

16. Let $A \in M_{n \times n}(F)$ be such that $A^k = 0$ for some $k \geq 1$. Show that $\det A = 0$.

17. Let $a_0, a_1, \ldots, a_n$ be distinct complex numbers. Write $M_n(a_0, \ldots, a_n)$ for the matrix
\[ \begin{pmatrix} 1 & a_0 & a_0^2 & \cdots & a_0^n \\ 1 & a_1 & a_1^2 & \cdots & a_1^n \\ \vdots & & & & \vdots \\ 1 & a_n & a_n^2 & \cdots & a_n^n \end{pmatrix} . \]
The goal of this exercise is to show that
\[ \det M_n(a_0, \ldots, a_n) = \prod_{0 \leq i < j \leq n} (a_j - a_i) . \tag{4} \]
We will argue by induction on $n$. (A brief numerical check of formula (4) follows this exercise.)
   (a) Show that if $n = 2$ then formula (4) holds.
   (b) Now suppose that $k \geq 2$ and that formula (4) holds for all $2 \leq n \leq k$. Show that it holds for $n = k+1$ by completing the following outline.
       i. Define the function $f : \mathbb{C} \to \mathbb{C}$ by $f(z) = \det M_n(z, a_1, \ldots, a_n)$. Show that $f$ is a polynomial of degree at most $n$.
       ii. Find all the zeros of $f$.
       iii. Show that the coefficient of $z^n$ is $(-1)^n \det M_{n-1}(a_1, \ldots, a_n)$.
       iv. Show that formula (4) holds for $n = k+1$, completing the proof.
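A quick numerical illustration of the Vandermonde formula (4), not part of the notes (the nodes below are arbitrary sample values):

    import numpy as np
    from itertools import combinations

    a = np.array([0.5, 1.0, 2.0, 3.0])                     # arbitrary distinct numbers
    M = np.vander(a, increasing=True)                      # rows (1, a_i, a_i^2, ...)
    prod = np.prod([a[j] - a[i] for i, j in combinations(range(len(a)), 2)])
    print(np.linalg.det(M), prod)                          # both give the same value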
18. Show that if $A \in M_{n \times n}(F)$ then $\det A = \det A^t$, where $A^t$ is the transpose of $A$.
19. Let $V$ be an $n$-dimensional $F$-vector space and $k \leq n$. The purpose of this problem is to show that
\[ \dim(\mathrm{Alt}^k(V)) = \binom{n}{k} , \]
by completing the following steps:
   (a) Let $W$ be a subspace of $V$ and let $B = (v_1, \ldots, v_n)$ be a basis for $V$ such that $(v_1, \ldots, v_k)$ is a basis for $W$. Show that
\[ p_{W,B} : V \to W, \qquad v_i \mapsto \begin{cases} v_i & i \leq k \\ 0 & i > k \end{cases} \]
   specifies a linear map with the property that $p_{W,B} \circ p_{W,B} = p_{W,B}$. Such a map (that is, a $T$ such that $T \circ T = T$) is called a projection.
   (b) With $W$ and $B$ as in the previous part, let $d_W$ be a non-zero element of $\mathrm{Alt}^k(W)$. Show that $[p_{W,B}]^* d_W$ is a non-zero element of $\mathrm{Alt}^k(V)$. (Recall this notation from exercise 13.)
   (c) Let $B = (v_1, \ldots, v_n)$ be a basis of $V$. Let $S_1, \ldots, S_t$ be subsets of $[n] = \{1, \ldots, n\}$. Assume that each $S_i$ has exactly $k$ elements and no two $S_i$'s are the same. Let
\[ W_i = \mathrm{Span}(v_j : j \in S_i) . \]
   For $i = 1, \ldots, t$, let $d_{W_i} \in \mathrm{Alt}^k(W_i)$ be non-zero. Show that the collection $\{ [p_{W_i,B}]^* d_{W_i} : 1 \leq i \leq t \}$ of elements of $\mathrm{Alt}^k(V)$ is linearly independent.
   (d) Show that the above collection is also generating (when the $S_i$ range over all $k$-element subsets of $[n]$), by taking an arbitrary $\omega \in \mathrm{Alt}^k(V)$, an arbitrary collection $u_1, \ldots, u_k$ of vectors in $V$, expressing each $u_i$ as a linear combination of $(v_1, \ldots, v_n)$ and plugging those linear combinations into $\omega$.
   In doing this, it may be helpful (although certainly not necessary) to assume that $d_{W_i}$ is the unique element of $\mathrm{Alt}^k(W_i)$ with $d_{W_i}(w_1, \ldots, w_k) = 1$, where $(w_1, \ldots, w_k)$ lists the basis vectors $v_j$ with $j \in S_i$.
20. Let $A \in M_{n \times n}(F)$ for some field $F$. Recall that if $1 \leq i, j \leq n$ then the $(i,j)$-th minor of $A$, written $A(i|j)$, is the $(n-1) \times (n-1)$ matrix obtained by removing the $i$-th row and $j$-th column from $A$. Define the cofactor
\[ C_{i,j} = (-1)^{i+j} \det A(i|j) . \]
Note that the Laplace expansion for the determinant can be written
\[ \det A = \sum_{i=1}^n A_{i,j} C_{i,j} . \]
(A short numerical sketch of parts (b)-(c) follows this exercise.)
   (a) Show that if $1 \leq j, k \leq n$ with $j \neq k$ then
\[ \sum_{i=1}^n A_{i,k} C_{i,j} = 0 . \]
   (b) Define the classical adjoint of $A$, written $\mathrm{adj}\,A$, by
\[ (\mathrm{adj}\,A)_{i,j} = C_{j,i} . \]
   Show that $(\mathrm{adj}\,A) A = (\det A) I$.
   (c) Show that $A(\mathrm{adj}\,A) = (\det A) I$ and deduce that if $A$ is invertible then
\[ A^{-1} = (\det A)^{-1}\, \mathrm{adj}\,A . \]
   Hint: begin by applying the result of the previous part to $A^t$.
   (d) Use the formula in the last part to find the inverses of the following matrices:
\[ \begin{pmatrix} 1 & 2 & 4 \\ 1 & 3 & 9 \\ 1 & 4 & 16 \end{pmatrix} , \qquad \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 1 \\ 6 & 0 & 1 & 1 \end{pmatrix} . \]
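The following Python sketch is not part of the notes; it just computes the classical adjoint numerically and checks $(\mathrm{adj}\,A)A = (\det A)I$ on the first matrix of part (d) as printed.

    import numpy as np

    def adjugate(A):
        n = A.shape[0]
        C = np.empty_like(A, dtype=float)
        for i in range(n):
            for j in range(n):
                minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
                C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
        return C.T                              # adj A is the transpose of the cofactor matrix

    A = np.array([[1., 2., 4.], [1., 3., 9.], [1., 4., 16.]])
    print(np.round(adjugate(A) @ A, 10))        # (det A) times the identity (det A = 2 here)
    print(np.linalg.inv(A) * np.linalg.det(A))  # agrees with adj A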
21. Consider a system of $n$ equations in $n$ variables with coefficients from a field $F$. We can write this as $AX = Y$ for an $n \times n$ matrix $A$, an $n \times 1$ matrix $X$ (with entries $x_1, \ldots, x_n$) and an $n \times 1$ matrix $Y$ (with entries $y_1, \ldots, y_n$). Given the matrices $A$ and $Y$ we would like to solve for $X$.
   (a) Show that
\[ (\det A)\, x_j = \sum_{i=1}^n (-1)^{i+j} y_i \det A(i|j) . \]
   (b) Show that if $\det A \neq 0$ then we have
\[ x_j = (\det A)^{-1} \det B_j , \]
   where $B_j$ is the $n \times n$ matrix obtained from $A$ by replacing the $j$-th column of $A$ by $Y$. This is known as Cramer's rule.
   (c) Solve the following systems of equations using Cramer's rule (a numerical sketch for the first system follows this exercise):
\[ \begin{cases} 2x - y + z = 3 \\ 2y - z = 1 \\ y - x = 1 \end{cases} \qquad \begin{cases} 2x - y + z - 2t = 5 \\ 2x + 2y - 3z + t = 1 \\ x + y - z = 1 \\ 4x - 3y + 2z - 3t = 8 \end{cases} \]
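Below is a small Python sketch of Cramer's rule applied to the first system in part (c). It is not from the notes, and the signs of the system are reconstructed from the extraction (read as $2x - y + z = 3$, $2y - z = 1$, $y - x = 1$).

    import numpy as np

    A = np.array([[2., -1., 1.], [0., 2., -1.], [-1., 1., 0.]])
    Y = np.array([3., 1., 1.])
    x = np.empty(3)
    for j in range(3):
        Bj = A.copy()
        Bj[:, j] = Y                     # replace the j-th column by Y
        x[j] = np.linalg.det(Bj) / np.linalg.det(A)
    print(x)                             # [1. 2. 3.], matching np.linalg.solve(A, Y)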
22. Find the determinants of the following matrices. In the first example, the entries are from $\mathbb{R}$ and in the second they are from $\mathbb{Z}_3$.
\[ \begin{pmatrix} 1 & 4 & 5 & 7 \\ 0 & 0 & 2 & 3 \\ 1 & 4 & 1 & 7 \\ 2 & 8 & 10 & 14 \end{pmatrix} , \qquad \begin{pmatrix} 1 & 1 & 0 & 0 & 0 \\ 1 & 1 & 1 & 0 & 0 \\ 0 & 1 & 1 & 1 & 0 \\ 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 1 & 1 \end{pmatrix} \]
5 Eigenvalues

5.1 Definitions and the characteristic polynomial

The simplest matrices are $\lambda I$ for $\lambda \in F$. These act just like the field $F$. What is the second simplest? A diagonal matrix; that is, a matrix $D$ that satisfies $D_{i,j} = 0$ if $i \neq j$.

Definition 5.1.1. Let $V$ be a finite dimensional vector space over $F$. A linear transformation $T$ is called diagonalizable if there exists a basis $\beta$ such that $[T]_\beta^\beta$ is diagonal.

Proposition 5.1.2. $T$ is diagonalizable if and only if there exists a basis $v_1, \ldots, v_n$ of $V$ and scalars $\lambda_1, \ldots, \lambda_n$ such that
\[ T(v_i) = \lambda_i v_i \quad \text{for all } i . \]
Proof. Suppose that $T$ is diagonalizable. Then there is a basis $\beta = \{v_1, \ldots, v_n\}$ such that $[T]_\beta^\beta$ is diagonal. Then, writing $D = [T]_\beta^\beta$,
\[ [Tv_k]_\beta = D [v_k]_\beta = D e_k = D_{k,k}\, e_k . \]
Now we can choose $\lambda_k = D_{k,k}$.

If the second condition holds then we see that $[T]_\beta^\beta$ is diagonal with entries $D_{i,j} = 0$ if $i \neq j$ and $D_{i,i} = \lambda_i$.
This motivates the following definition.

Definition 5.1.3. If $T : V \to V$ is linear then we call a nonzero vector $v$ an eigenvector of $T$ if there exists $\lambda \in F$ such that $T(v) = \lambda v$. In this case we call $\lambda$ the eigenvalue for $v$.

Theorem 5.1.4. If $\dim V < \infty$ and $T : V \to V$ is linear then the following are equivalent.
1. $\lambda$ is an eigenvalue of $T$ (for some eigenvector).
2. $T - \lambda I$ is not invertible.
3. $\det(T - \lambda I) = 0$.

Proof. If (1) holds then the eigenvector $v$ is a non-zero vector in the nullspace of $T - \lambda I$. Thus $T - \lambda I$ is not invertible. We already know that (2) and (3) are equivalent. If $T - \lambda I$ is not invertible then there is a non-zero vector in its nullspace. This vector is an eigenvector.

Definition 5.1.5. If $T : V \to V$ is linear and $\dim V = n$ then we define the characteristic polynomial $c : F \to F$ by
\[ c(\lambda) = \det(T - \lambda I) . \]
Note that $c(\lambda)$ does not depend on the choice of basis.
We can write $c$ in terms of the matrix:
\[ c(\lambda) = \det(T - \lambda I) = \det([T - \lambda I]_\beta^\beta) = \det([T]_\beta^\beta - \lambda\, \mathrm{Id}) . \]
Eigenvalues are exactly the roots of $c(\lambda)$.

Facts about $c(x)$.

1. $c$ is a polynomial of degree $n$. We can see this by analyzing each term in the definition of the determinant: set $B = A - xI$ (where $A = [T]_\beta^\beta$) and consider
\[ \mathrm{sgn}(\sigma)\, B_{1,\sigma(1)} \cdots B_{n,\sigma(n)} . \]
Each term $B_{i,\sigma(i)}$ is a polynomial of degree 0 or 1 in $x$. So the product has degree at most $n$. A sum of such polynomials is a polynomial of degree at most $n$.
In fact, the only term of degree $n$ is
\[ \mathrm{sgn}(\mathrm{id})\, B_{1,1} \cdots B_{n,n} = (A_{1,1} - x) \cdots (A_{n,n} - x) . \]
So the coefficient of $x^n$ is $(-1)^n$.
2. In the above description of $c(x)$, all terms corresponding to non-identity permutations have degree at most $n-2$. Therefore the degree $n-1$ term comes from $(A_{1,1} - x) \cdots (A_{n,n} - x)$ as well. It is
\[ (-1)^{n-1} x^{n-1} [A_{1,1} + \cdots + A_{n,n}] = (-1)^{n-1} x^{n-1}\, \mathrm{Tr}\,A . \]

3. Because $c(0) = \det A$,
\[ c(x) = (-1)^n \big( x^n - (\mathrm{Tr}\,A)\, x^{n-1} + \cdots + (-1)^n \det A \big) . \]
For $F = \mathbb{C}$ (or any field over which $c(x)$ splits), we can always write $c(x) = (-1)^n (x - \lambda_1) \cdots (x - \lambda_n)$. Thus the constant term of the product $(x - \lambda_1) \cdots (x - \lambda_n)$ is $(-1)^n \prod_i \lambda_i$, so
\[ c(x) = (-1)^n \Big( x^n - (\mathrm{Tr}\,A)\, x^{n-1} + \cdots + (-1)^n \prod_{i=1}^n \lambda_i \Big) . \]
We find $\det A = \prod_i \lambda_i$ in $\mathbb{C}$.
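These coefficient facts can be checked numerically. The Python sketch below is only an illustration; note that numpy's `np.poly(A)` returns the coefficients of the monic polynomial $\det(xI - A)$, which is $(-1)^n c(x)$ in the notation above (the example matrix is arbitrary and symmetric so its eigenvalues are real).

    import numpy as np

    A = np.array([[2., 1., 0.], [1., 3., 1.], [0., 1., 1.]])
    p = np.poly(A)                                  # [1, -Tr A, ..., (-1)^n det A]
    print(p[1], -np.trace(A))                       # coefficient of x^{n-1} is -Tr A
    print(p[-1], (-1) ** 3 * np.linalg.det(A))      # constant term is (-1)^n det A
    print(np.sum(np.linalg.eigvals(A)), np.trace(A))        # sum of eigenvalues = trace
    print(np.prod(np.linalg.eigvals(A)), np.linalg.det(A))  # product of eigenvalues = det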
Theorem 5.1.6. If $\dim V = n$ and $c(\lambda)$ has $n$ distinct roots then $T$ is diagonalizable. The converse is not true.

Proof. Write the eigenvalues as $\lambda_1, \ldots, \lambda_n$. For each $\lambda_i$ we have an eigenvector $v_i$. We claim that the $v_i$'s are linearly independent. This follows from the lemma:
Lemma 5.1.7. If $\lambda_1, \ldots, \lambda_k$ are $k$ distinct eigenvalues associated to eigenvectors $v_1, \ldots, v_k$ then $v_1, \ldots, v_k$ is linearly independent.

Proof. Suppose that
\[ a_1 v_1 + \cdots + a_k v_k = \vec{0} . \]
Take $T$ of both sides:
\[ a_1 \lambda_1 v_1 + \cdots + a_k \lambda_k v_k = \vec{0} . \]
Keep doing this, $k-1$ times in all, so we get the system of equations
\begin{align*}
a_1 v_1 + \cdots + a_k v_k &= \vec{0} \\
a_1 \lambda_1 v_1 + \cdots + a_k \lambda_k v_k &= \vec{0} \\
&\vdots \\
a_1 \lambda_1^{k-1} v_1 + \cdots + a_k \lambda_k^{k-1} v_k &= \vec{0}
\end{align*}
Write each $v_i$ as $[v_i]_\beta$ for some basis $\beta$. This is then equivalent to the matrix equation
\[ \begin{pmatrix} a_1 (v_1)_\beta & a_2 (v_2)_\beta & \cdots & a_k (v_k)_\beta \end{pmatrix} \begin{pmatrix} 1 & \lambda_1 & \cdots & \lambda_1^{k-1} \\ 1 & \lambda_2 & \cdots & \lambda_2^{k-1} \\ \vdots & & & \vdots \\ 1 & \lambda_k & \cdots & \lambda_k^{k-1} \end{pmatrix} = \begin{pmatrix} \vec{0} & \vec{0} & \cdots & \vec{0} \end{pmatrix} . \]
Here the left matrix is $n \times k$ and has $j$-th column equal to the column vector $a_j [v_j]_\beta$. But the middle matrix has nonzero determinant when the $\lambda_i$'s are distinct: its determinant is $\prod_{1 \leq i < j \leq k} (\lambda_j - \lambda_i)$. Therefore it is invertible. Multiplying both sides by its inverse, we find $a_i v_i = \vec{0}$ for all $i$. Since $v_i \neq \vec{0}$, it follows that $a_i = 0$ for all $i$.

Returning to the theorem: the $v_1, \ldots, v_n$ are then $n$ linearly independent vectors, hence a basis of eigenvectors, so $T$ is diagonalizable by Proposition 5.1.2.
5.2 Eigenspaces and the main diagonalizability theorem

Definition 5.2.1. If $\lambda \in F$ we define the eigenspace
\[ E_\lambda = N(T - \lambda I) = \{ v \in V : T(v) = \lambda v \} . \]

Note that $E_\lambda$ is a subspace even if $\lambda$ is not an eigenvalue. Furthermore,
\[ E_\lambda \neq \{0\} \text{ if and only if } \lambda \text{ is an eigenvalue of } T \]
and
\[ E_\lambda \text{ is } T\text{-invariant for all } \lambda \in F . \]
What this means is that if $v \in E_\lambda$ then so is $T(v)$:
\[ (T - \lambda I)(T(v)) = (T - \lambda I)(\lambda v) = \lambda (T - \lambda I) v = \vec{0} . \]
Definition 5.2.2. If $W_1, \ldots, W_k$ are subspaces of a vector space $V$ then we write
\[ W_1 \oplus \cdots \oplus W_k \]
for the sum space $W_1 + \cdots + W_k$ and say the sum is direct if
\[ W_j \cap [W_1 + \cdots + W_{j-1}] = \{0\} \quad \text{for all } j = 2, \ldots, k . \]
We also say the subspaces are independent.

Theorem 5.2.3. If $\lambda_1, \ldots, \lambda_k$ are distinct eigenvalues of $T$ then
\[ E_{\lambda_1} + \cdots + E_{\lambda_k} = E_{\lambda_1} \oplus \cdots \oplus E_{\lambda_k} . \]
Furthermore
\[ \dim\Big( \sum_{i=1}^k E_{\lambda_i} \Big) = \sum_{i=1}^k \dim E_{\lambda_i} . \]
Proof. The theorem follows directly from the following lemma, whose proof is in the homework.

Lemma 5.2.4. Let $W_1, \ldots, W_k$ be subspaces of $V$. The following are equivalent.
1. $W_1 + \cdots + W_k = W_1 \oplus \cdots \oplus W_k$.
2. Whenever $w_1 + \cdots + w_k = \vec{0}$ for $w_i \in W_i$ for all $i$, we have $w_i = \vec{0}$ for all $i$.
3. Whenever $\beta_i$ is a basis for $W_i$ for all $i$, the $\beta_i$'s are disjoint and $\beta := \bigcup_{i=1}^k \beta_i$ is a basis for $\sum_{i=1}^k W_i$.

So take $w_1 + \cdots + w_k = \vec{0}$ for $w_i \in E_{\lambda_i}$ for all $i$. Note that each nonzero $w_i$ is an eigenvector for the eigenvalue $\lambda_i$. Remove all the zero ones. If we are left with any nonzero ones, by the previous lemma (Lemma 5.1.7), they must be linearly independent. This would be a contradiction. So they are all zero.

For the second claim take bases $\beta_i$ of $E_{\lambda_i}$. By the lemma, $\bigcup_{i=1}^k \beta_i$ is a basis for $\sum_{i=1}^k E_{\lambda_i}$. This implies the claim.
Theorem 5.2.5 (Main diagonalizability theorem). Let $T : V \to V$ be linear and $\dim V < \infty$. The following are equivalent.
1. $T$ is diagonalizable.
2. $c(x)$ can be written as $(-1)^n (x - \lambda_1)^{n_1} \cdots (x - \lambda_k)^{n_k}$, where $n_i = \dim E_{\lambda_i}$ for all $i$.
3. $V = E_{\lambda_1} \oplus \cdots \oplus E_{\lambda_k}$, where $\lambda_1, \ldots, \lambda_k$ are the distinct eigenvalues of $T$.
Proof. Suppose first that $T$ is diagonalizable. Then there exists a basis $\beta$ of eigenvectors for $T$; that is, for which $[T]_\beta^\beta$ is diagonal. Clearly each diagonal element is an eigenvalue. For each $i$, call $n_i$ the number of entries on the diagonal that are equal to $\lambda_i$. Then $[T - \lambda_i I]_\beta^\beta$ has $n_i$ zeros on the diagonal. All other diagonal entries must be non-zero, so the nullspace has dimension $n_i$. In other words, $n_i = \dim E_{\lambda_i}$.

Suppose that condition 2 holds. Since $c$ is a polynomial of degree $n$ we must have
\[ \dim E_{\lambda_1} + \cdots + \dim E_{\lambda_k} = \dim V . \]
However since the $\lambda_i$'s are distinct the previous theorem gives that
\[ \dim \sum_{i=1}^k E_{\lambda_i} = \dim V . \]
In other words, $V = \sum_{i=1}^k E_{\lambda_i}$. The previous theorem implies that the sum is direct and the claim follows.

Suppose that condition 3 holds. Then take $\beta_i$ a basis for $E_{\lambda_i}$ for all $i$. Then $\beta = \bigcup_{i=1}^k \beta_i$ is a basis for $V$. We claim that $[T]_\beta^\beta$ is diagonal. This is because each vector in $\beta$ is an eigenvector. This proves 1 and completes the proof.
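As a purely numerical illustration (not from the notes), one can diagonalize a small matrix with distinct eigenvalues and verify the factorization $A = P D P^{-1}$, where the columns of $P$ are eigenvectors:

    import numpy as np

    A = np.array([[2., 1.], [1., 3.]])
    evals, P = np.linalg.eig(A)
    D = np.diag(evals)
    print(np.allclose(A, P @ D @ np.linalg.inv(P)))   # True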
5.3 Exercises

1. Let $V$ be an $F$-vector space and let $W_1, \ldots, W_k$ be subspaces of $V$. Recall the definition of the sum $\sum_{i=1}^k W_i$. It is the subspace of $V$ given by
\[ \{ w_1 + \cdots + w_k : w_i \in W_i \} . \]
Recall further that this sum is called direct, and written as $\bigoplus_{i=1}^k W_i$, if and only if for all $1 < i \leq k$ we have
\[ W_i \cap \Big( \sum_{j=1}^{i-1} W_j \Big) = \{0\} . \]
Show that the following statements are equivalent:
   (a) The sum $\sum_{i=1}^k W_i$ is direct.
   (b) For any collection $w_1, \ldots, w_k$ with $w_i \in W_i$ for all $i$, we have
\[ \sum_{i=1}^k w_i = 0 \implies \forall i : w_i = 0 . \]
   (c) If, for each $i$, $\beta_i$ is a basis of $W_i$, then the $\beta_i$'s are disjoint and their union $\beta = \bigcup_{i=1}^k \beta_i$ is a basis for the subspace $\sum_{i=1}^k W_i$.
   (d) For any $v \in \sum_{i=1}^k W_i$ there exist unique vectors $w_1, \ldots, w_k$ such that $w_i \in W_i$ for all $i$ and $v = \sum_{i=1}^k w_i$.
2. Let $V$ be an $F$-vector space. Recall that a linear map $p \in L(V, V)$ is called a projection if $p \circ p = p$.
   (a) Show that if $p$ is a projection, then so is $q = \mathrm{id}_V - p$, and we have $p \circ q = q \circ p = 0$.
   (b) Let $W_1, \ldots, W_k$ be subspaces of $V$ and assume that $V = \bigoplus_{i=1}^k W_i$. For $1 \leq t \leq k$, show that there is a unique element $p_t \in L(V, W_t)$ such that for any choice of vectors $w_1, \ldots, w_k$ such that $w_j \in W_j$ for all $j$,
\[ p_t(w_j) = \begin{cases} w_j & j = t \\ 0 & j \neq t \end{cases} . \]
   (c) Show that each $p_t$ defined in the previous part is a projection. Show furthermore that $\sum_{t=1}^k p_t = \mathrm{id}_V$ and that for $t \neq s$ we have $p_t \circ p_s = 0$.
   (d) Show conversely that if $p_1, \ldots, p_k \in L(V, V)$ are projections with the properties (a) $\sum_{i=1}^k p_i = \mathrm{id}_V$ and (b) $p_i \circ p_j = 0$ for all $i \neq j$, and if we put $W_i = R(p_i)$, then $V = \bigoplus_{i=1}^k W_i$.
3. Let $V$ be an $F$-vector space.
   (a) If $U \subseteq V$ is a subspace, $W$ is another $F$-vector space, and $f \in L(V, W)$, define $f|_U : U \to W$ by
\[ f|_U(u) = f(u) \quad \text{for all } u \in U . \]
   Show that the map $f \mapsto f|_U$ is a linear map $L(V, W) \to L(U, W)$. It is called the restriction map.
   (b) Let $f \in L(V, V)$ and let $U \subseteq V$ be an $f$-invariant subspace (that is, a subspace $U$ with the property that $f(u) \in U$ whenever $u \in U$). Observe that $f|_U \in L(U, U)$. If $W \subseteq V$ is another $f$-invariant subspace and $V = U \oplus W$, show that
\[ N(f) = N(f|_U) \oplus N(f|_W), \quad R(f) = R(f|_U) \oplus R(f|_W), \quad \det(f) = \det(f|_U) \det(f|_W) . \]
   (c) Let $f, g \in L(V, V)$ be two commuting endomorphisms, i.e. we have $f \circ g = g \circ f$. Show that $N(g)$ and $R(g)$ are $f$-invariant subspaces of $V$.

4. Consider the matrix
\[ A := \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} . \]
Show that there does not exist an invertible matrix $P \in M_{2 \times 2}(\mathbb{C})$ such that $PAP^{-1}$ is diagonal.

5. Let $V$ be a finite-dimensional $F$-vector space and $f \in L(V, V)$. Observe that for each natural number $k$ we have $N(f^k) \subseteq N(f^{k+1})$.
   (a) Show that there exists a natural number $k$ so that $N(f^k) = N(f^{k+1})$.
   (b) Show further that for all $l \geq k$ one has $N(f^l) = N(f^k)$.
6. Let $V$ be a finite-dimensional $F$-vector space and $f \in L(V, V)$.
   (a) Let $U \subseteq V$ be an $f$-invariant subspace and $\beta = (v_1, \ldots, v_n)$ a basis of $V$ such that $\beta' = (v_1, \ldots, v_k)$ is a basis for $U$. Show that
\[ [f]_\beta^\beta = \begin{pmatrix} A & B \\ 0 & C \end{pmatrix} \]
   with $A = [f|_U]_{\beta'}^{\beta'} \in M_{k \times k}(F)$, $B \in M_{k \times (n-k)}(F)$, and $C \in M_{(n-k) \times (n-k)}(F)$.
   (b) Let $U, W \subseteq V$ be $f$-invariant subspaces with $V = U \oplus W$. Let $\beta'$ be a basis for $U$, $\beta''$ a basis for $W$, and $\beta = \beta' \cup \beta''$. Show that
\[ [f]_\beta^\beta = \begin{pmatrix} A & 0 \\ 0 & C \end{pmatrix} \]
   with $A = [f|_U]_{\beta'}^{\beta'}$ and $C = [f|_W]_{\beta''}^{\beta''}$.

7. Let $V$ be a finite-dimensional $F$-vector space. Recall that an element $p \in L(V, V)$ with $p^2 = p$ is called a projection. On the other hand, an element $i \in L(V, V)$ with $i^2 = \mathrm{id}_V$ is called an involution.
   (a) Assume that $\mathrm{char}(F) \neq 2$. Show that the maps
\[ \{\text{Involutions on } V\} \to \{\text{Projections in } V\}, \qquad i \mapsto \tfrac{1}{2}(\mathrm{id}_V + i) \]
   and
\[ \{\text{Projections in } V\} \to \{\text{Involutions on } V\}, \qquad p \mapsto 2p - \mathrm{id}_V \]
   are mutually inverse bijections.
   (b) Show that if $p \in L(V, V)$ is a projection, then the only eigenvalues of $p$ are 0 and 1. Furthermore, $V = E_0(p) \oplus E_1(p)$ (the eigenspaces for $p$). That is, $p$ is diagonalizable.
   (c) Show that if $i \in L(V, V)$ is an involution, then the only eigenvalues of $i$ are $+1$ and $-1$. Furthermore, $V = E_{+1}(i) \oplus E_{-1}(i)$. That is, $i$ is diagonalizable.
Observe that projections and involutions are examples of diagonalizable endomorphisms which do not have $\dim(V)$-many distinct eigenvalues.
8. In this problem we will show that every endomorphism of a vector space over an algebraically closed field can be represented as an upper triangular matrix. This is a simpler result than (and is implied by) the Jordan canonical form, which we will cover in class soon.
We will argue by (strong) induction on the dimension of $V$. Clearly the result holds for $\dim V = 1$. So suppose that for some $k \geq 1$, whenever $\dim W \leq k$ and $U : W \to W$ is linear, we can find a basis of $W$ with respect to which the matrix of $U$ is upper-triangular. Further, let $V$ be a vector space of dimension $k+1$ over $F$ and $T : V \to V$ be linear.
   (a) Let $\lambda$ be an eigenvalue of $T$. Show that the dimension of $R := R(T - \lambda I)$ is strictly less than $\dim V$ and that $R$ is $T$-invariant.
   (b) Apply the inductive hypothesis to $T|_R$ to find a basis of $R$ with respect to which $T|_R$ is upper-triangular. Extend this to a basis for $V$ and complete the argument.

9. Let $A \in M_{n \times n}(F)$ be upper-triangular. Show that the eigenvalues of $A$ are the diagonal entries of $A$.

10. Let $A$ be the matrix
\[ A = \begin{pmatrix} 6 & 3 & 2 \\ 4 & 1 & 2 \\ 10 & 5 & 3 \end{pmatrix} . \]
   (a) Is $A$ diagonalizable over $\mathbb{R}$? If so, find a basis for $\mathbb{R}^3$ of eigenvectors of $A$.
   (b) Is $A$ diagonalizable over $\mathbb{C}$? If so, find a basis for $\mathbb{C}^3$ of eigenvectors of $A$.

11. For which values of $a, b, c \in \mathbb{R}$ is the following matrix diagonalizable over $\mathbb{R}$?
\[ \begin{pmatrix} 0 & 0 & 0 & 0 \\ a & 0 & 0 & 0 \\ 0 & b & 0 & 0 \\ 0 & 0 & c & 0 \end{pmatrix} \]

12. Let $V$ be a finite dimensional vector space over a field $F$ and let $T : V \to V$ be linear. Suppose that every subspace of $V$ is $T$-invariant. What can you say about $T$?

13. Let $V$ be a finite dimensional vector space over a field $F$ and let $T, U : V \to V$ be linear transformations.
   (a) Prove that if $I - TU$ is invertible then $I - UT$ is invertible and
\[ (I - UT)^{-1} = I + U(I - TU)^{-1} T . \]
   (b) Use the previous part to show that $TU$ and $UT$ have the same eigenvalues.

14. Let $A$ be the matrix
\[ \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{pmatrix} . \]
Find $A^n$ for all $n \geq 1$.
Hint: first diagonalize $A$. (A numerical sketch follows the exercise list.)
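Before moving on, here is a small numerical sketch of the last exercise. It is not part of the notes, and it reads the matrix as printed (all entries equal to 1); with that reading, diagonalizing shows $A^n = 3^{n-1} A$.

    import numpy as np

    A = np.ones((3, 3))
    evals, P = np.linalg.eig(A)          # eigenvalues 3, 0, 0
    for n in (1, 2, 3, 4):
        print(np.allclose(np.linalg.matrix_power(A, n), 3 ** (n - 1) * A))   # True each time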
6 Jordan form

6.1 Generalized eigenspaces

It is of course not always true that $T$ is diagonalizable. There can be a couple of reasons for that. First it may be that the roots of the characteristic polynomial do not lie in the field. For instance
\[ \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \]
has characteristic polynomial $x^2 + 1$. Even still it may be that the eigenvalues are in the field, but we still cannot diagonalize. On the homework you will see that the matrix
\[ \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \]
is not diagonalizable over $\mathbb{C}$ (although its eigenvalues are certainly in $\mathbb{C}$). So we resort to looking for a block diagonal matrix.

Suppose that we can show that
\[ V = W_1 \oplus \cdots \oplus W_k \]
for some subspaces $W_i$. Then we can choose a basis for $V$ made up of bases for the $W_i$'s. If the $W_i$'s are $T$-invariant then the matrix will be in block form.

Definition 6.1.1. Let $T : V \to V$ be linear. A subspace $W$ of $V$ is $T$-invariant if $T(w) \in W$ whenever $w \in W$.

Each eigenspace is $T$-invariant. If $w \in E_\lambda$ then
\[ (T - \lambda I) T(w) = (T - \lambda I)(\lambda w) = \lambda (T - \lambda I) w = \vec{0} , \]
so $T(w) \in E_\lambda$. Therefore the eigenspace decomposition is a $T$-invariant direct sum.

To find a general $T$-invariant direct sum we define generalized eigenspaces.

Definition 6.1.2. Let $T : V \to V$ be linear. If $\lambda \in F$ then the set
\[ \widetilde{E}_\lambda = \{ v \in V : (T - \lambda I)^k v = \vec{0} \text{ for some } k \} \]
is called the generalized eigenspace for $\lambda$.

This is a subspace. If $v, w \in \widetilde{E}_\lambda$ and $c \in F$ then there exist $k_v$ and $k_w$ such that
\[ (T - \lambda I)^{k_v} v = \vec{0} = (T - \lambda I)^{k_w} w . \]
Choosing $k = \max\{k_v, k_w\}$ we find
\[ (T - \lambda I)^k (cv + w) = \vec{0} . \]

Each generalized eigenspace is $T$-invariant. To see this, suppose that $(T - \lambda I)^k v = \vec{0}$. Then because $T$ commutes with $(T - \lambda I)^k$ we have
\[ (T - \lambda I)^k T v = T (T - \lambda I)^k v = \vec{0} . \]

To make sure the characteristic polynomial has roots we will take $F$ to be an algebraically closed field. That is, each nonconstant polynomial with coefficients in $F$ has a root in $F$.
6.2 Primary decomposition theorem

Theorem 6.2.1 (Primary decomposition theorem). Let $F$ be algebraically closed and $V$ a finite-dimensional vector space over $F$. If $T : V \to V$ is linear and $\lambda_1, \ldots, \lambda_k$ are the distinct eigenvalues of $T$ then
\[ V = \widetilde{E}_{\lambda_1} \oplus \cdots \oplus \widetilde{E}_{\lambda_k} . \]

Proof. We follow several steps.

Step 1. $c(x)$ has a root. Therefore $T$ has an eigenvalue. Let $\lambda_1$ be this value.

Step 2. Consider the generalized eigenspace $\widetilde{E}_{\lambda_1}$. We first show that there exists $k_1$ such that
\[ \widetilde{E}_{\lambda_1} = N\big( (T - \lambda_1 I)^{k_1} \big) . \]
Let $v_1, \ldots, v_m$ be a basis for $\widetilde{E}_{\lambda_1}$. Then for each $i$ there is an exponent $m_i$ such that $(T - \lambda_1 I)^{m_i} v_i = \vec{0}$. Choose $k_1 = \max\{m_1, \ldots, m_m\}$. Then $(T - \lambda_1 I)^{k_1}$ kills all the basis vectors and thus kills everything in $\widetilde{E}_{\lambda_1}$. Therefore $\widetilde{E}_{\lambda_1} \subseteq N\big( (T - \lambda_1 I)^{k_1} \big)$. The other direction is obvious.

Step 3. We now claim that
\[ V = R\big( (T - \lambda_1 I)^{k_1} \big) \oplus N\big( (T - \lambda_1 I)^{k_1} \big) = R\big( (T - \lambda_1 I)^{k_1} \big) \oplus \widetilde{E}_{\lambda_1} . \]
First we show that the intersection is only the zero vector. Suppose that $v$ is in the intersection. Then $(T - \lambda_1 I)^{k_1} v = \vec{0}$ and there exists $w$ such that $(T - \lambda_1 I)^{k_1} w = v$. Then $(T - \lambda_1 I)^{2k_1} w = \vec{0}$ so $w \in \widetilde{E}_{\lambda_1}$. Therefore
\[ v = (T - \lambda_1 I)^{k_1} w = \vec{0} . \]
By the rank-nullity theorem,
\[ \dim R\big( (T - \lambda_1 I)^{k_1} \big) + \dim N\big( (T - \lambda_1 I)^{k_1} \big) = \dim V . \]
By the 2-subspace (dimension) theorem,
\[ V = R\big( (T - \lambda_1 I)^{k_1} \big) + N\big( (T - \lambda_1 I)^{k_1} \big) , \]
and so it is a direct sum.
Step 4. Write $W_1 = R\big( (T - \lambda_1 I)^{k_1} \big)$ so that
\[ V = \widetilde{E}_{\lambda_1} \oplus W_1 . \]
These spaces are $T$-invariant. To show that, note that we already know $\widetilde{E}_{\lambda_1}$ is. For $W_1$, suppose that $w \in W_1$. Then there exists $u$ such that
\[ w = (T - \lambda_1 I)^{k_1} u . \]
So
\[ (T - \lambda_1 I)^{k_1} (T - \lambda_1 I) u = (T - \lambda_1 I) w . \]
Therefore $(T - \lambda_1 I) w \in W_1$ and thus $W_1$ is $(T - \lambda_1 I)$-invariant. If $w \in W_1$ then
\[ T w = (T - \lambda_1 I) w + \lambda_1 I w \in W_1 , \]
so $W_1$ is $T$-invariant.
Step 5. We now argue by induction and do the base case. Let $e(T)$ be the number of distinct eigenvalues of $T$. Note $e(T) \geq 1$.

We first assume $e(T) = 1$. In this case we write $\lambda_1$ for the eigenvalue and see
\[ V = \widetilde{E}_{\lambda_1} \oplus R\big( (T - \lambda_1 I)^{k_1} \big) = \widetilde{E}_{\lambda_1} \oplus W_1 . \]
We claim that the second space is only the zero vector. Otherwise we restrict $T$ to it to get an operator $T_{W_1}$. Then $T_{W_1}$ has an eigenvalue $\lambda$. So there is a nonzero vector $w \in W_1$ such that
\[ T w = T_{W_1} w = \lambda w , \]
so $w$ is an eigenvector for $T$. But $T$ has only one eigenvalue so $\lambda = \lambda_1$. This means that $w \in \widetilde{E}_{\lambda_1}$ and thus
\[ w \in \widetilde{E}_{\lambda_1} \cap W_1 = \{\vec{0}\} . \]
This is a contradiction, so
\[ V = \widetilde{E}_{\lambda_1} \oplus \{\vec{0}\} = \widetilde{E}_{\lambda_1} , \]
and we are done.
Step 6. Suppose the theorem is true for any transformation $U$ with $e(U) = k$ ($k \geq 1$). Then suppose that $e(T) = k+1$. Let $\lambda_1, \ldots, \lambda_{k+1}$ be the distinct eigenvalues of $T$ and decompose as before:
\[ V = \widetilde{E}_{\lambda_1} \oplus R\big( (T - \lambda_1 I)^{k_1} \big) = \widetilde{E}_{\lambda_1} \oplus W_1 . \]
Now restrict $T$ to $W_1$ and call it $T_{W_1}$.

Claim 6.2.2. $T_{W_1}$ has eigenvalues $\lambda_2, \ldots, \lambda_{k+1}$ with the same generalized eigenspaces as $T$: they are $\widetilde{E}_{\lambda_2}, \ldots, \widetilde{E}_{\lambda_{k+1}}$.
Once we show this we will be done: we will have $e(T_{W_1}) = k$ and so we can apply the inductive hypothesis:
\[ W_1 = \widetilde{E}_{\lambda_2} \oplus \cdots \oplus \widetilde{E}_{\lambda_{k+1}} , \]
so
\[ V = \widetilde{E}_{\lambda_1} \oplus \cdots \oplus \widetilde{E}_{\lambda_{k+1}} . \]

Proof. We first show that each of $\widetilde{E}_{\lambda_2}, \ldots, \widetilde{E}_{\lambda_{k+1}}$ is in $W_1$. For this we want a lemma and a definition:

Definition 6.2.3. If $p(x)$ is a polynomial with coefficients in $F$ and $T : V \to V$ is linear, where $V$ is a vector space over $F$, we define the transformation
\[ p(T) = a_n T^n + \cdots + a_1 T + a_0 I , \]
where $p(x) = a_n x^n + \cdots + a_1 x + a_0$.

Lemma 6.2.4. Suppose that $p(x)$ and $q(x)$ are two polynomials with coefficients in $F$. If they have no common root then there exist polynomials $a(x)$ and $b(x)$ such that
\[ a(x) p(x) + b(x) q(x) = 1 . \]

Proof. Homework.

Now choose $v \in \widetilde{E}_{\lambda_j}$ for some $j = 2, \ldots, k+1$. By the decomposition we can write $v = u + w$ where $u \in \widetilde{E}_{\lambda_1}$ and $w \in W_1$. We can now write $\widetilde{E}_{\lambda_1} = N\big( (T - \lambda_1 I)^{k_1} \big)$ and $\widetilde{E}_{\lambda_j} = N\big( (T - \lambda_j I)^{k_j} \big)$ and see
\[ \vec{0} = (T - \lambda_j I)^{k_j} v = (T - \lambda_j I)^{k_j} u + (T - \lambda_j I)^{k_j} w . \]
However $\widetilde{E}_{\lambda_1}$ and $W_1$ are $T$-invariant so they are $(T - \lambda_j I)^{k_j}$-invariant. This is a sum of vectors equal to zero, where one is in $\widetilde{E}_{\lambda_1}$ and the other is in $W_1$. Because these spaces direct sum to $V$ we know both vectors are zero. Therefore
\[ u \text{ satisfies } (T - \lambda_j I)^{k_j} u = \vec{0} = (T - \lambda_1 I)^{k_1} u . \]
In other words, $p(T) u = q(T) u = \vec{0}$, where $p(x) = (x - \lambda_j)^{k_j}$ and $q(x) = (x - \lambda_1)^{k_1}$. Since these polynomials have no root in common we can find $a(x)$ and $b(x)$ as in the lemma. Finally,
\[ u = \big( a(T) p(T) + b(T) q(T) \big) u = \vec{0} . \]
This implies that $v = w \in W_1$ and therefore all of $\widetilde{E}_{\lambda_2}, \ldots, \widetilde{E}_{\lambda_{k+1}}$ are in $W_1$.
Because of the above statement, we now know that all of $\lambda_2, \ldots, \lambda_{k+1}$ are eigenvalues of $T_{W_1}$. Furthermore if $\mu$ is an eigenvalue of $T_{W_1}$ then it is an eigenvalue of $T$. It cannot be $\lambda_1$, because then any eigenvector for $T_{W_1}$ with eigenvalue $\mu$ would have to be in $\widetilde{E}_{\lambda_1}$ but also in $W_1$, so it would be zero, a contradiction. Therefore the eigenvalues of $T_{W_1}$ are precisely $\lambda_2, \ldots, \lambda_{k+1}$.

Let $\widetilde{E}^{W_1}_{\lambda_j}$ be the generalized eigenspace for $T_{W_1}$ corresponding to $\lambda_j$. We want
\[ \widetilde{E}^{W_1}_{\lambda_j} = \widetilde{E}_{\lambda_j} , \quad j = 2, \ldots, k+1 . \]
If $w \in \widetilde{E}^{W_1}_{\lambda_j}$ then there exists $k$ such that $(T_{W_1} - \lambda_j I)^k w = \vec{0}$. But now on $W_1$, $(T_{W_1} - \lambda_j I)^k$ is the same as $(T - \lambda_j I)^k$, so
\[ (T - \lambda_j I)^k w = (T_{W_1} - \lambda_j I)^k w = \vec{0} , \]
so that $\widetilde{E}^{W_1}_{\lambda_j} \subseteq \widetilde{E}_{\lambda_j}$. To show the other inclusion, take $w \in \widetilde{E}_{\lambda_j}$. Since $\widetilde{E}_{\lambda_j} \subseteq W_1$, this implies that $w \in W_1$. Now since there exists $k$ such that $(T - \lambda_j I)^k w = \vec{0}$, we find
\[ (T_{W_1} - \lambda_j I)^k w = (T - \lambda_j I)^k w = \vec{0} , \]
and we are done. We find
\[ V = \widetilde{E}_{\lambda_1} \oplus \cdots \oplus \widetilde{E}_{\lambda_{k+1}} . \]
6.3 Nilpotent operators

Now we look at the operator $T$ on the generalized eigenspaces. We need only restrict $T$ to each generalized eigenspace to determine the action on all of $V$. So for this purpose we will assume that $T$ has only one generalized eigenspace: there exists $\lambda \in F$ such that
\[ V = \widetilde{E}_\lambda . \]
In other words, for each $v \in V$ there exists $k$ such that $(T - \lambda I)^k v = \vec{0}$. Recall we can then argue that there exists $k_\lambda$ such that
\[ V = N\big( (T - \lambda I)^{k_\lambda} \big) , \]
or, if $U = T - \lambda I$, $U^{k_\lambda} = 0$.

Definition 6.3.1. Let $U : V \to V$ be linear. We say that $U$ is nilpotent if there exists $k$ such that
\[ U^k = 0 . \]
The smallest $k$ for which $U^k = 0$ is called the degree of $U$.

The point of this section will be to give a structure theorem for nilpotent operators. It can be seen as a special case of Jordan form when all eigenvalues are zero. To prove this structure theorem, we will look at the nullspaces of powers of $U$. Note that if $k = \deg(U)$, then $N(U^k) = V$ but $N(U^{k-1}) \neq V$. We then get an increasing chain of subspaces
\[ N_0 \subseteq N_1 \subseteq \cdots \subseteq N_k , \quad \text{where } N_j = N(U^j) . \]
If $v \in N_j \setminus N_{j-1}$ then $U(v) \in N_{j-1} \setminus N_{j-2}$.

Proof. $v$ has the property that $U^j v = \vec{0}$ but $U^{j-1} v \neq \vec{0}$. Therefore $U^{j-1}(Uv) = \vec{0}$ but $U^{j-2}(Uv) \neq \vec{0}$.
Definition 6.3.2. If $W_1 \subseteq W_2$ are subspaces of $V$ then we say that $v_1, \ldots, v_m \in W_2$ are linearly independent mod $W_1$ if
\[ a_1 v_1 + \cdots + a_m v_m \in W_1 \text{ implies } a_i = 0 \text{ for all } i . \]

Lemma 6.3.3. Suppose that $\dim W_2 - \dim W_1 = l$ and $v_1, \ldots, v_m \in W_2 \setminus W_1$ are linearly independent mod $W_1$. Then
1. $m \leq l$ and
2. we can choose $l - m$ vectors $v_{m+1}, \ldots, v_l$ in $W_2 \setminus W_1$ such that $v_1, \ldots, v_l$ are linearly independent mod $W_1$.
Proof. It suffices to show that we can add just one vector. Let $w_1, \ldots, w_t$ be a basis for $W_1$. Then define
\[ X = \mathrm{Span}(w_1, \ldots, w_t, v_1, \ldots, v_m) . \]
Then this spanning set is linearly independent. Indeed, if
\[ a_1 w_1 + \cdots + a_t w_t + b_1 v_1 + \cdots + b_m v_m = \vec{0} , \]
then $b_1 v_1 + \cdots + b_m v_m \in W_1$, so all $b_i$'s are zero. Then
\[ a_1 w_1 + \cdots + a_t w_t = \vec{0} , \]
so all $a_i$'s are zero. Thus
\[ t + m = \dim X \leq \dim W_2 = t + l , \]
or $m \leq l$.

For the second part, if $m = l$, we are done. Otherwise $\dim X < \dim W_2$, so there exists $v_{m+1} \in W_2 \setminus X$. To show linear independence mod $W_1$, suppose that
\[ a_1 v_1 + \cdots + a_m v_m + a_{m+1} v_{m+1} = w \in W_1 . \]
If $a_{m+1} = 0$ then all the other $a_i$ are zero by hypothesis and we are done. Otherwise we can solve for $v_{m+1}$ and see it is in $X$. This is a contradiction.
Lemma 6.3.4. Suppose that for some $m$, $v_1, \ldots, v_p \in N_m \setminus N_{m-1}$ are linearly independent mod $N_{m-1}$. Then $U(v_1), \ldots, U(v_p)$ are linearly independent mod $N_{m-2}$.

Proof. Suppose that
\[ a_1 U(v_1) + \cdots + a_p U(v_p) = n \in N_{m-2} . \]
Then
\[ U(a_1 v_1 + \cdots + a_p v_p) = n . \]
Now
\[ U^{m-1}(a_1 v_1 + \cdots + a_p v_p) = U^{m-2}(n) = \vec{0} . \]
Therefore $a_1 v_1 + \cdots + a_p v_p \in N_{m-1}$. But these are linearly independent mod $N_{m-1}$, so we find that $a_i = 0$ for all $i$.
Now we do the following.

1. Write $d_m = \dim N_m - \dim N_{m-1}$. Starting at the top, choose
\[ \beta_k = \{ v^k_1, \ldots, v^k_{d_k} \} \]
which are linearly independent mod $N_{k-1}$.

2. Move down one level: write $v^{k-1}_i = U(v^k_i)$. Then $v^{k-1}_1, \ldots, v^{k-1}_{d_k}$ is linearly independent mod $N_{k-2}$, so $d_k \leq d_{k-1}$. By the lemma we can extend this to
\[ \beta_{k-1} = \{ v^{k-1}_1, \ldots, v^{k-1}_{d_k}, v^{k-1}_{d_k + 1}, \ldots, v^{k-1}_{d_{k-1}} \} , \]
a maximal linearly independent set mod $N_{k-2}$ in $N_{k-1} \setminus N_{k-2}$.

3. Repeat.

Note that $d_k + d_{k-1} + \cdots + d_1 = \dim V$. We claim that if $\beta_i$ is the set at level $i$ then
\[ \beta = \beta_1 \cup \cdots \cup \beta_k \]
is a basis for $V$. It suffices to show linear independence. For this, we use the following fact. If $W_1 \subseteq \cdots \subseteq W_k = V$ is a nested sequence of subspaces with $\gamma_i \subseteq W_i \setminus W_{i-1}$ linearly independent mod $W_{i-1}$, then $\gamma = \bigcup_i \gamma_i$ is linearly independent. (Check.)
We have shown the following result.

Definition 6.3.5. A chain of length $l$ for $U$ is a set $\{ v, U(v), U^2(v), \ldots, U^{l-1}(v) \}$ of non-zero vectors such that $U^l(v) = \vec{0}$.

Theorem 6.3.6. If $U : V \to V$ is linear and nilpotent ($\dim V < \infty$) then there exists a basis of $V$ consisting entirely of chains for $U$.
Let $U : V \to V$ be nilpotent. If $C = \{ U^{l-1}v, U^{l-2}v, \ldots, U(v), v \}$ is a chain then note that
\[ \mathrm{Span}(C) \text{ is } U\text{-invariant} . \]
Since $V$ has a basis of chains, say $C_1, \ldots, C_m$, we can write
\[ V = \mathrm{Span}(C_1) \oplus \cdots \oplus \mathrm{Span}(C_m) . \]
In our situation each $C_i$ is a basis for $\mathrm{Span}(C_i)$, so our matrix for $U$ is block diagonal. Each block corresponds to a chain. Then $U$ restricted to $\mathrm{Span}(C_i)$ has the following matrix with respect to $C_i$:
\[ \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 & 0 \\ 0 & 0 & 1 & \cdots & 0 & 0 \\ 0 & 0 & 0 & 1 & & \\ & & & \ddots & \ddots & \\ & & & & 0 & 1 \\ & & & & & 0 \end{pmatrix} . \]
Theorem 6.3.7 (Uniqueness of nilpotent form). Let $U : V \to V$ be linear and nilpotent with $\dim V < \infty$. Write
\[ l_i(\beta) = \# \text{ of (maximal) chains of length } i \text{ in } \beta . \]
Then if $\beta, \beta'$ are bases of $V$ consisting of chains for $U$ then
\[ l_i(\beta) = l_i(\beta') \quad \text{for all } i . \]

Proof. Write $K_i(\beta)$ for the set of elements $v$ of $\beta$ such that $U^i(v) = \vec{0}$ but $U^{i-1}(v) \neq \vec{0}$ (for $i = 1$ we only require $U(v) = \vec{0}$). Let $\tilde{l}_i(\beta)$ be the number of (maximal) chains in $\beta$ of length at least $i$.

We first claim that $\# K_i(\beta) = \tilde{l}_i(\beta)$ for all $i$. To see this note that for each chain $C$ of length at least $i$ there is a unique element $v \in C$ such that $v \in K_i(\beta)$. Conversely, for each $v \in K_i(\beta)$ there is a unique chain of length at least $i$ containing $v$.

Let $n_i$ be the dimension of $N(U^i)$. We claim that $n_i$ equals the number $m_i(\beta)$ of $v \in \beta$ such that $U^i(v) = \vec{0}$. Indeed, the set of such $v$'s is linearly independent and in $N(U^i)$, so $n_i \geq m_i(\beta)$. However all other $v$'s ($\dim V - m_i(\beta)$ of them) are mapped by $U^i$ to distinct elements of $\beta$ (since $\beta$ is made up of chains), so $\dim R(U^i) \geq \dim V - m_i(\beta)$, so $n_i \leq m_i(\beta)$.

Because $N(U^i)$ contains $N(U^{i-1})$ for all $i$ (here we take $N(U^0) = \{\vec{0}\}$), we have
\[ \tilde{l}_i(\beta) = \# K_i(\beta) = n_i - n_{i-1} . \]
Therefore
\[ l_i(\beta) = \tilde{l}_i(\beta) - \tilde{l}_{i+1}(\beta) = (n_i - n_{i-1}) - (n_{i+1} - n_i) . \]
The right side does not depend on $\beta$, and in fact the same argument shows it is equal to $l_i(\beta')$. This completes the proof.
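The counting formula at the end of the proof is easy to check numerically. The Python sketch below is only an illustration on a hand-built nilpotent matrix (one chain of length 3 and one of length 1): the number of chains of length exactly $i$ is $(n_i - n_{i-1}) - (n_{i+1} - n_i)$, with $n_i = \dim N(U^i)$.

    import numpy as np

    U = np.array([[0., 1., 0., 0.],
                  [0., 0., 1., 0.],
                  [0., 0., 0., 0.],
                  [0., 0., 0., 0.]])
    n = U.shape[0]
    def nullity(M):
        return M.shape[1] - np.linalg.matrix_rank(M)
    nl = [0] + [nullity(np.linalg.matrix_power(U, i)) for i in range(1, n + 2)]
    chains = {i: (nl[i] - nl[i - 1]) - (nl[i + 1] - nl[i]) for i in range(1, n + 1)}
    print(chains)   # {1: 1, 2: 0, 3: 1, 4: 0}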
6.4 Existence and uniqueness of Jordan form, Cayley-Hamilton

Definition 6.4.1. A Jordan block for $\lambda$ of size $l$ is the $l \times l$ matrix
\[ J_\lambda^l = \begin{pmatrix} \lambda & 1 & 0 & \cdots & 0 & 0 \\ 0 & \lambda & 1 & \cdots & 0 & 0 \\ & & \ddots & \ddots & & \\ & & & & \lambda & 1 \\ & & & & & \lambda \end{pmatrix} . \]

Theorem 6.4.2 (Jordan canonical form). Let $T : V \to V$ be linear with $\dim V < \infty$ and $F$ algebraically closed. Then there is a basis $\beta$ of $V$ such that $[T]_\beta^\beta$ is block diagonal with Jordan blocks.

Proof. First decompose $V = \widetilde{E}_{\lambda_1} \oplus \cdots \oplus \widetilde{E}_{\lambda_k}$. On each $\widetilde{E}_{\lambda_i}$, the operator $T - \lambda_i I$ is nilpotent. Each chain for $(T - \lambda_i I)|_{\widetilde{E}_{\lambda_i}}$ gives a block in the nilpotent decomposition. Then $T = (T - \lambda_i I) + \lambda_i I$ gives a Jordan block.

It helps to draw a picture at this point (of the sets of chains). We first decompose
\[ V = \widetilde{E}_{\lambda_1} \oplus \cdots \oplus \widetilde{E}_{\lambda_k} \]
and then
\[ \widetilde{E}_{\lambda_i} = \mathcal{C}^i_1 \oplus \cdots \oplus \mathcal{C}^i_{k_i} , \]
where each $\mathcal{C}^i_j$ is the span of a chain of generalized eigenvectors $C^i_j = \{v_1, \ldots, v_p\}$, with
\[ T(v_1) = \lambda_i v_1, \quad T(v_2) = \lambda_i v_2 + v_1, \quad \ldots, \quad T(v_p) = \lambda_i v_p + v_{p-1} . \]
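In practice a Jordan basis can also be computed symbolically. The Python/sympy sketch below is not part of the notes; the example matrix is arbitrary, and `jordan_form` returns matrices $P$ and $J$ with $J = P^{-1} A P$ block diagonal with Jordan blocks (here a $2 \times 2$ block for eigenvalue 2 and a $1 \times 1$ block for 3), the columns of $P$ being chains of generalized eigenvectors.

    from sympy import Matrix

    A = Matrix([[2, 1, 1],
                [0, 2, 0],
                [0, 0, 3]])
    P, J = A.jordan_form()
    print(J)                    # the Jordan form of A
    print(P.inv() * A * P)      # reproduces J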
Theorem 6.4.3 (Cayley-Hamilton). Let $T : V \to V$ be linear with $\dim V < \infty$ and $F$ algebraically closed. Then
\[ c(T) = 0 . \]

Remark. In fact the theorem holds even if $F$ is not algebraically closed, by doing a field extension.

Lemma 6.4.4. If $U : V \to V$ is linear and nilpotent with $\dim V = n < \infty$ then
\[ U^n = 0 . \]
Therefore if $T : V \to V$ is linear and $v \in \widetilde{E}_\lambda$ then
\[ (T - \lambda I)^{\dim \widetilde{E}_\lambda} v = \vec{0} . \]

Proof. Let $\beta$ be a basis of chains for $U$. Then the length of the longest chain is at most $n$, so $U^n$ kills every basis vector.
Lemma 6.4.5. Let $T : V \to V$ be linear with $\dim V < \infty$ and let $\beta$ be a basis such that $[T]_\beta^\beta$ is in Jordan form. For each eigenvalue $\lambda$, let $S_\lambda$ be the set of basis vectors corresponding to blocks for $\lambda$. Then
\[ \mathrm{Span}(S_\lambda) = \widetilde{E}_\lambda \quad \text{for each } \lambda . \]
Therefore if
\[ c(x) = \prod_{i=1}^k (\lambda_i - x)^{n_i} , \]
then $n_i = \dim \widetilde{E}_{\lambda_i}$ for each $i$.

Proof. Write $\lambda_1, \ldots, \lambda_k$ for the distinct eigenvalues of $T$. Let
\[ W_i = \mathrm{Span}(S_{\lambda_i}) . \]
We may assume that the blocks corresponding to $\lambda_1$ appear first, $\lambda_2$ appear second, and so on. Since $[T]_\beta^\beta$ is in block form, this means $V$ is a $T$-invariant direct sum
\[ W_1 \oplus \cdots \oplus W_k . \]
However for each $i$, $T - \lambda_i I$ restricted to $W_i$ is in nilpotent form. Thus $(T - \lambda_i I)^{\dim W_i} v = \vec{0}$ for each $v \in S_{\lambda_i}$. This means
\[ W_i \subseteq \widetilde{E}_{\lambda_i} \text{ for all } i, \quad \text{or} \quad \dim W_i \leq \dim \widetilde{E}_{\lambda_i} . \]
But $V = \widetilde{E}_{\lambda_1} \oplus \cdots \oplus \widetilde{E}_{\lambda_k}$, so $\sum_{i=1}^k \dim \widetilde{E}_{\lambda_i} = \dim V$. This gives that $\dim W_i = \dim \widetilde{E}_{\lambda_i}$ for all $i$, or $W_i = \widetilde{E}_{\lambda_i}$.

For the second claim, $n_i$ is the number of times that $\lambda_i$ appears on the diagonal; that is, the dimension of $\mathrm{Span}(S_{\lambda_i})$.
Proof of Cayley-Hamilton. We first factor
\[ c(x) = \prod_{i=1}^k (\lambda_i - x)^{n_i} , \]
where $n_i$ is called the algebraic multiplicity of the eigenvalue $\lambda_i$. Let $\beta$ be a basis such that $[T]_\beta^\beta$ is in Jordan form. If $v \in \beta$ is in a block corresponding to $\lambda_j$ then $v \in \widetilde{E}_{\lambda_j}$ and so $(T - \lambda_j I)^{\dim \widetilde{E}_{\lambda_j}} v = \vec{0}$ by the previous lemma. Now
\[ c(T) v = \Big[ \prod_{i=1}^k (\lambda_i I - T)^{n_i} \Big] v = \Big[ \prod_{i \neq j} (\lambda_i I - T)^{n_i} \Big] (\lambda_j I - T)^{n_j} v = \vec{0} \]
since $n_j = \dim \widetilde{E}_{\lambda_j}$. As this holds for every basis vector $v$, we conclude $c(T) = 0$.
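A quick numerical illustration of Cayley-Hamilton (not part of the notes): plugging a matrix into its characteristic polynomial gives the zero matrix. Here `np.poly(A)` gives the coefficients of $\det(xI - A)$, which has the same roots as $c(x)$, so vanishing at $A$ is equivalent.

    import numpy as np

    A = np.array([[2., 1., 0.], [0., 2., 1.], [1., 0., 3.]])
    coeffs = np.poly(A)                              # monic characteristic polynomial of A
    cA = sum(c * np.linalg.matrix_power(A, len(coeffs) - 1 - i) for i, c in enumerate(coeffs))
    print(np.allclose(cA, 0))                        # True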
Finally we have uniqueness of Jordan form.

Theorem 6.4.6. If $\beta$ and $\beta'$ are bases of $V$ for which the matrix of $T$ is in Jordan form, then the matrices are the same up to permutation of blocks.

Proof. First we note that the characteristic polynomial can be read off of the matrices and is invariant. This gives that the diagonal entries are the same, and the number of basis vectors corresponding to each eigenvalue is the same.

We see from the second lemma that if $\beta_i$ and $\beta'_i$ are the parts of the bases corresponding to blocks involving $\lambda_i$ then
\[ W_i := \mathrm{Span}(\beta_i) = \widetilde{E}_{\lambda_i} = \mathrm{Span}(\beta'_i) =: W'_i . \]
Restricting $T$ to $W_i$ and to $W'_i$ then gives the blocks for $\lambda_i$. But then $\beta_i$ and $\beta'_i$ are just bases of $\widetilde{E}_{\lambda_i}$ consisting of chains for $T - \lambda_i I$. The number of chains of each length is the same, and this is the number of blocks of each size.
6.5 Exercises

Notation:

1. If $F$ is a field then we write $F[X]$ for the set of polynomials with coefficients in $F$.

2. If $P, Q \in F[X]$ then we say that $P$ divides $Q$ and write $P|Q$ if there exists $S \in F[X]$ such that $Q = SP$.

3. If $P \in F[X]$ then we write $\deg(P)$ for the largest $k$ such that the coefficient of $X^k$ in $P$ is nonzero. We define the degree of the zero polynomial to be $-\infty$.

4. If $P \in F[X]$ then we say that $P$ is monic if the coefficient of $X^n$ is 1, where $n = \deg(P)$.

5. For a complex number $z$, we denote the complex conjugate by $\bar{z}$, i.e. if $z = a + ib$, with $a, b \in \mathbb{R}$, then $\bar{z} = a - ib$.

6. If $V$ is an $F$-vector space, recall the definition of $V \times V$: it is the $F$-vector space whose elements are
\[ V \times V = \{ (v_1, v_2) : v_1, v_2 \in V \} . \]
Vector addition is performed as $(v_1, v_2) + (v_3, v_4) = (v_1 + v_3, v_2 + v_4)$ and for $c \in F$, $c(v_1, v_2)$ is defined as $(cv_1, cv_2)$.
Exercises

1. (a) Show that for $P, Q \in F[X]$, one has $\deg(PQ) = \deg(P) + \deg(Q)$.
   (b) Show that for $P, D \in F[X]$ such that $D$ is nonzero, there exist $Q, R \in F[X]$ such that $P = QD + R$ and $\deg(R) < \deg(D)$.
   Hint: Use induction on $\deg(P)$.
   (c) Show that, for any $\lambda \in F$,
\[ P(\lambda) = 0 \iff (X - \lambda)\,|\,P . \]
   (d) Let $P \in F[X]$ be of the form $P(X) = a(X - \lambda_1)^{n_1} \cdots (X - \lambda_k)^{n_k}$ for some $a, \lambda_1, \ldots, \lambda_k \in F$ and natural numbers $n_1, \ldots, n_k$. Show that $Q \in F[X]$ divides $P$ if and only if $Q(X) = b(X - \lambda_1)^{m_1} \cdots (X - \lambda_k)^{m_k}$ for some $b \in F$ and natural numbers $m_i$ with $m_i \leq n_i$ (we allow $m_i = 0$).
   (e) Assume that $F$ is algebraically closed. Show that every $P \in F[X]$ is of the form $a(X - \lambda_1)^{n_1} \cdots (X - \lambda_k)^{n_k}$ for some $a, \lambda_1, \ldots, \lambda_k \in F$ and natural numbers $n_1, \ldots, n_k$ with $n_1 + \cdots + n_k = \deg(P)$. In this case we call the $\lambda_i$'s the roots of $P$.
2. Let $F$ be a field and suppose that $P, Q$ are nonzero polynomials in $F[X]$. Define the subset $\mathcal{S}$ of $F[X]$ as follows:
\[ \mathcal{S} = \{ AP + BQ : A, B \in F[X] \} . \]
   (a) Let $D \in \mathcal{S}$ be nonzero of minimal degree. Show that $D$ divides both $P$ and $Q$.
   Hint: Use part (b) of the previous problem.
   (b) Show that if $S \in F[X]$ divides both $P$ and $Q$ then $S$ divides $D$.
   (c) Conclude that there exists a unique monic polynomial $D \in F[X]$ satisfying the following conditions:
       i. $D$ divides $P$ and $Q$.
       ii. If $T \in F[X]$ divides both $P$ and $Q$ then $T$ divides $D$.
   (Such a polynomial is called the greatest common divisor (GCD) of $P$ and $Q$.)
   (d) Show that if $F$ is algebraically closed and $P$ and $Q$ are polynomials in $F[X]$ with no common root then there exist $A$ and $B$ in $F[X]$ such that
\[ AP + BQ = 1 . \]
3. Let $F$ be any field, $V$ be an $F$-vector space, $f \in L(V, V)$, and $W \subseteq V$ an $f$-invariant subspace.
   (a) Let $p : V \to V/W$ denote the natural map defined in Homework 8, problem 5. Show that there exists an element of $L(V/W, V/W)$, which we will denote by $f|_{V/W}$, such that $p \circ f = f|_{V/W} \circ p$. It is customary to express this fact using the following commutative diagram:
\[ \begin{array}{ccc} V & \xrightarrow{\ f\ } & V \\ \downarrow p & & \downarrow p \\ V/W & \xrightarrow{\ f|_{V/W}\ } & V/W \end{array} \]
   (b) Let $\beta'$ be a basis for $W$ and $\beta$ be a basis for $V$ which contains $\beta'$. Show that the image of $\beta'' := \beta \setminus \beta'$ under $p$ is a basis for $V/W$.
   (c) Let $\beta''$ be a subset of $V$ with the property that the restriction of $p$ to $\beta''$ (which is a map of sets $\beta'' \to V/W$) is injective and its image is a basis for $V/W$. Show that $\beta''$ is a linearly-independent set. Show moreover that if $\beta'$ is a basis for $W$, then $\beta := \beta' \cup \beta''$ is a basis for $V$.
   (d) Let $\beta$, $\beta'$, and $\beta''$ be as above. Show that
\[ [f]_\beta^\beta = \begin{pmatrix} A & B \\ 0 & C \end{pmatrix} \]
   with $A = [f|_W]_{\beta'}^{\beta'}$ and $C = [f|_{V/W}]_{p(\beta'')}^{p(\beta'')}$.
4. The minimal polynomial. Let $F$ be any field, $V$ a finite-dimensional $F$-vector space, and $f \in L(V, V)$.
   (a) Consider the subset $S \subseteq F[X]$ defined by
\[ S = \{ P \in F[X] : P(f) = 0 \} . \]
   Show that $S$ contains a nonzero element.
   (b) Let $M_f \in S$ be a monic non-zero element of minimal degree. Show that $M_f$ divides any other element of $S$. Conclude that $M_f$ as defined is unique. It is called the minimal polynomial of $f$.
   (c) Show that the roots of $M_f$ are precisely the eigenvalues of $f$ by completing the following steps.
       i. Suppose that $r \in F$ is such that $M_f(r) = 0$. Show that
\[ M_f(X) = Q(X)(X - r)^k \]
       for some positive integer $k$ and $Q \in F[X]$ such that $Q(r) \neq 0$. Prove also that $Q(f) \neq 0$.
       ii. Show that if $r \in F$ satisfies $M_f(r) = 0$ then $f - rI$ is not invertible and thus $r$ is an eigenvalue of $f$.
       iii. Conversely, let $\lambda$ be an eigenvalue of $f$ with eigenvector $v$. Show that if $P \in F[X]$ then
\[ P(f)v = P(\lambda)v . \]
       Conclude that $\lambda$ is a root of $M_f$.
   (d) Assume that $F$ is algebraically closed. For each eigenvalue $\lambda$ of $f$, express $\mathrm{mult}_\lambda(M_f)$ in terms of the Jordan form of $f$.
   (e) Assume that $F$ is algebraically closed. Show that $f$ is diagonalizable if and only if $\mathrm{mult}_\lambda(M_f) = 1$ for all eigenvalues $\lambda$ of $f$.
   (f) Assume that $F$ is algebraically closed. Under which circumstances does $M_f$ equal the characteristic polynomial of $f$?
5. If $T : V \to V$ is linear and $V$ is a finite-dimensional $F$-vector space with $F$ algebraically closed, we define the algebraic multiplicity of an eigenvalue $\lambda$ to be $a(\lambda)$, the dimension of the generalized eigenspace $\widetilde{E}_\lambda$. The geometric multiplicity of $\lambda$ is $g(\lambda)$, the dimension of the eigenspace $E_\lambda$. Finally, the index of $\lambda$ is $i(\lambda)$, the length of the longest chain of generalized eigenvectors in $\widetilde{E}_\lambda$.
Suppose that $\lambda$ is an eigenvalue of $T$ and $g = g(\lambda)$ and $i = i(\lambda)$ are given integers.
   (a) What is the minimal possible value for $a = a(\lambda)$?
   (b) What is the maximal possible value for $a$?
   (c) Show that $a$ can take any value between the answers for the above two questions.
   (d) What is the smallest dimension $n$ of $V$ for which there exist two linear transformations $T$ and $U$ from $V$ to $V$ with all of the following properties? (i) There exists $\lambda \in F$ which is the only eigenvalue of either $T$ or $U$, (ii) $T$ and $U$ are not similar transformations, and (iii) the geometric multiplicity of $\lambda$ for $T$ equals that of $U$, and similarly for the index.
6. Find the Jordan form for each of the following matrices over $\mathbb{C}$. Write the minimal polynomial and characteristic polynomial for each. To do this, first find the eigenvalues. Then, for each eigenvalue $\lambda$, find the dimensions of the nullspaces of $(A - \lambda I)^k$ for pertinent values of $k$ (where $A$ is the matrix in question). Use this information to deduce the block forms. (A short computational sketch of this recipe follows the list.)
\[ \text{(a)}\ \begin{pmatrix} 1 & 0 & 0 \\ 1 & 4 & 1 \\ 1 & 4 & 0 \end{pmatrix} \quad \text{(b)}\ \begin{pmatrix} 2 & 3 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 2 \end{pmatrix} \quad \text{(c)}\ \begin{pmatrix} 5 & 1 & 3 \\ 0 & 2 & 0 \\ 6 & 1 & 4 \end{pmatrix} \]
\[ \text{(d)}\ \begin{pmatrix} 3 & 0 & 0 \\ 4 & 2 & 0 \\ 5 & 0 & 2 \end{pmatrix} \quad \text{(e)}\ \begin{pmatrix} 2 & 3 & 2 \\ 1 & 4 & 2 \\ 0 & 1 & 3 \end{pmatrix} \quad \text{(f)}\ \begin{pmatrix} 4 & 1 & 0 \\ 1 & 2 & 0 \\ 1 & 1 & 3 \end{pmatrix} \]
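Here is a small Python sketch of the suggested recipe, not from the notes, applied to matrix (d) read as printed: compute the eigenvalues, then the nullities of $(A - \lambda I)^k$ to deduce the block sizes.

    import numpy as np

    A = np.array([[3., 0., 0.], [4., 2., 0.], [5., 0., 2.]])
    def nullity(M):
        return M.shape[1] - np.linalg.matrix_rank(M)
    for lam in (3.0, 2.0):
        dims = [nullity(np.linalg.matrix_power(A - lam * np.eye(3), k)) for k in (1, 2, 3)]
        print(lam, dims)   # 3.0 [1, 1, 1] and 2.0 [2, 2, 2]: only blocks of size 1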
7. (a) The characteristic polynomial of the matrix
\[ A = \begin{pmatrix} 7 & 1 & 2 & 2 \\ 1 & 4 & 1 & 1 \\ 2 & 1 & 5 & 1 \\ 1 & 1 & 2 & 8 \end{pmatrix} \]
   is $c(x) = (x - 6)^4$. Find an invertible matrix $S$ such that $S^{-1} A S$ is in Jordan form.
   (b) Find all complex matrices in Jordan form with characteristic polynomial
\[ c(x) = (i - x)^3 (2 - x)^2 . \]
8. Complexification of finite-dimensional real vector spaces. Let V be an R-vector space. Just as we can view R as a subset of C, we will be able to view V as a subset of a C-vector space. This will be useful because C is algebraically closed, so we can, for example, use the theory of Jordan form in V_C and bring it back (in a certain form) to V. We will give two constructions of the complexification; the first is more elementary and the second is the standard construction you will see in algebra.
We put V_C = V × V.
(a) Right now, V_C is only an R-vector space. We must define what it means to multiply vectors by complex scalars. For z ∈ C, z = a + ib with a, b ∈ R, and v = (v_r, v_i) ∈ V_C, we define the element zv ∈ V_C to be
(a v_r − b v_i, a v_i + b v_r).
Show that in this way, V_C becomes a C-vector space. This is the complexification of V. (A small computational sketch of this construction follows the problem.)
(b) We now show how to view V as a subset of V_C. Show that the map ι : V → V_C which maps v to (v, 0) is injective and R-linear. (Thus the set ι(V) is a copy of V sitting in V_C.)
(c) Show that dim_C(V_C) = dim_R(V). Conclude that V_C is equal to span_C(ι(V)). Conclude further that if v_1, ..., v_n is an R-basis for V, then ι(v_1), ..., ι(v_n) is a C-basis for V_C.
(d) Complex conjugation: We define the complex conjugation map c : V_C → V_C to be the map (v_r, v_i) ↦ (v_r, −v_i). Just as R (sitting inside of C) is invariant under complex conjugation, so will our copy of V (and its subspaces) be inside of V_C.
i. Prove that c² = 1 and ι(V) = {v ∈ V_C : c(v) = v}.
ii. Show that for all z ∈ C and v ∈ V_C, we have c(zv) = z̄ c(v). Maps with this property are called anti-linear.
iii. In the next two parts, we classify those subspaces of V_C that are invariant under c. Let W be a subspace of V. Show that the C-subspace of V_C spanned by ι(W) equals
{(w_1, w_2) ∈ V_C : w_1, w_2 ∈ W}
and is invariant under c.
iv. Show conversely that if a subspace W̃ of the C-vector space V_C is invariant under c, then there exists a subspace W ⊆ V such that W̃ = span_C(ι(W)).
v. Last, notice that the previous two parts told us the following: the subspaces of V_C which are invariant under conjugation are precisely those which are equal to W_C for subspaces W of the R-vector space V. Show that moreover, in that situation, the restriction of the complex conjugation map c : V_C → V_C to W_C is equal to the complex conjugation map defined for W_C (the latter map is defined intrinsically for W_C, i.e. without viewing it as a subspace of V_C).
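The construction in part (a) can be made concrete when V = Rⁿ: a pair (v_r, v_i) plays the role of v_r + i·v_i. The toy sketch below (Python with NumPy; the class name is ours and not part of the notes) implements the scalar multiplication and the conjugation map c of part (d).

    import numpy as np

    class Complexified:
        """An element (v_r, v_i) of V_C = V x V for V = R^n."""
        def __init__(self, v_r, v_i):
            self.v_r = np.asarray(v_r, dtype=float)
            self.v_i = np.asarray(v_i, dtype=float)

        def __rmul__(self, z):                    # z * (v_r, v_i) for z = a + ib
            a, b = z.real, z.imag
            return Complexified(a * self.v_r - b * self.v_i,
                                a * self.v_i + b * self.v_r)

        def conj(self):                           # the map c of part (d)
            return Complexified(self.v_r, -self.v_i)

        def __repr__(self):
            return f"({self.v_r}, {self.v_i})"

    v = Complexified([1.0, 2.0], [0.0, 0.0])      # iota(v) for v = (1, 2) in R^2
    print(1j * v)                                 # ([0. 0.], [1. 2.]): multiplying by i
    print((1j * v).conj())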
9. Let V be a finite dimensional R-vector space. For this exercise we use the notation of the previous one.
(a) Let W be another finite dimensional R-vector space, and let f ∈ L(V, W). Show that
f_C((v, w)) := (f(v), f(w))
defines an element f_C ∈ L(V_C, W_C).
(b) Show that for v ∈ V_C, we have f_C(c(v)) = c(f_C(v)). Show conversely that if f̃ ∈ L(V_C, W_C) has the property that f̃(c(v)) = c(f̃(v)) for all v ∈ V_C, then f̃ = f_C for some f ∈ L(V, W).
10. In this problem we will establish the real Jordan form. Let V be a vector space over R of dimension n < ∞. Let T : V → V be linear and T_C its complexification.
(a) If λ ∈ C is an eigenvalue of T_C, and Ẽ_λ is the corresponding generalized eigenspace, show that
c(Ẽ_λ) = Ẽ_λ̄.
(b) Show that the non-real eigenvalues of T_C come in pairs. In other words, show that we can list the distinct eigenvalues of T_C as
λ_1, ..., λ_r, μ_1, μ_2, ..., μ_{2m},
where for each j = 1, ..., r, λ_j = λ̄_j, and for each i = 1, ..., m, μ_{2i−1} = μ̄_{2i}.
(c) Because C is algebraically closed, the proof of Jordan form shows that
V_C = Ẽ_{λ_1} ⊕ ... ⊕ Ẽ_{λ_r} ⊕ Ẽ_{μ_1} ⊕ ... ⊕ Ẽ_{μ_{2m}}.
Using the previous two parts, show that for j = 1, ..., r and i = 1, ..., m, the subspaces of V_C
Ẽ_{λ_j} and Ẽ_{μ_{2i−1}} ⊕ Ẽ_{μ_{2i}}
are c-invariant.
(d) Deduce from the results of problem 6, homework 10 that there exist subspaces X_1, ..., X_r and Y_1, ..., Y_m of V such that for each j = 1, ..., r and i = 1, ..., m,
Ẽ_{λ_j} = Span_C(ι(X_j)) and Ẽ_{μ_{2i−1}} ⊕ Ẽ_{μ_{2i}} = Span_C(ι(Y_i)).
Show that
V = X_1 ⊕ ... ⊕ X_r ⊕ Y_1 ⊕ ... ⊕ Y_m.
(e) Prove that for each j = 1, ..., r, the transformation T − λ_j I restricted to X_j is nilpotent and thus we can find a basis β_j for X_j consisting entirely of chains for T − λ_j I.
(f) For each i = 1, ..., m, let
β̃_i = (v^i_1, w^i_1), ..., (v^i_{n_i}, w^i_{n_i})
be a basis of Ẽ_{μ_{2i−1}} consisting of chains for T_C − μ_{2i−1} I. Prove that
β_i = {v^i_1, w^i_1, ..., v^i_{n_i}, w^i_{n_i}}
is a basis for Y_i. Describe the form of the matrix representation of T restricted to Y_i relative to β_i.
(g) Gathering the previous parts, state and prove a version of Jordan form for linear transformations over finite-dimensional real vector spaces. Your version should be of the form "If T : V → V is linear then there exists a basis β of V such that [T]_β has the form ..."
7 Bilinear forms
7.1 Definitions
We now switch gears from Jordan form.
Definition 7.1.1. If V is a vector space over F, a function f : V × V → F is called a bilinear form if for fixed v ∈ V, f(v, w) is linear in w and for fixed w ∈ V, f(v, w) is linear in v.
Bilinear forms have matrix representations similar to those for linear transformations. Choose a basis β = {v_1, ..., v_n} for V and write
v = a_1 v_1 + ... + a_n v_n, w = b_1 v_1 + ... + b_n v_n.
Now
f(v, w) = Σ_{i=1}^n a_i f(v_i, w) = Σ_{i=1}^n Σ_{j=1}^n a_i b_j f(v_i, v_j).
Define an n × n matrix A by A_{i,j} = f(v_i, v_j). Then this is
Σ_{i=1}^n a_i ( Σ_{j=1}^n b_j A_{i,j} ) = Σ_{i=1}^n a_i (Ab)_i = a^t A b = [v]_β^t [f]_β [w]_β.
We have proved:

Theorem 7.1.2. If dim V < ∞ and f : V × V → F is a bilinear form then there exists a unique matrix [f]_β such that for all v, w ∈ V,
f(v, w) = [v]_β^t [f]_β [w]_β.
Furthermore the map f ↦ [f]_β is an isomorphism from Bil(V, F) to M_{n×n}(F).

Proof. We showed existence. To prove uniqueness, suppose that A is another such matrix. Then
A_{i,j} = e_i^t A e_j = [v_i]_β^t A [v_j]_β = f(v_i, v_j).

If β′ is another basis then, writing [I]_{β′}^{β} for the change-of-basis matrix (so that [v]_β = [I]_{β′}^{β} [v]_{β′}),
([v]_{β′})^t ( ([I]_{β′}^{β})^t [f]_β [I]_{β′}^{β} ) [w]_{β′} = ([I]_{β′}^{β} [v]_{β′})^t [f]_β [I]_{β′}^{β} [w]_{β′} = ([v]_β)^t [f]_β [w]_β = f(v, w).
Therefore
[f]_{β′} = ([I]_{β′}^{β})^t [f]_β [I]_{β′}^{β}.
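As a quick numerical illustration (not part of the notes): for V = R³ with a bilinear form f(v, w) = vᵗMw, the matrix with entries A_{i,j} = f(e_i, e_j) in the standard basis reproduces f exactly, as in Theorem 7.1.2. (Python with NumPy; the particular matrix M is just an example.)

    import numpy as np

    M = np.array([[2.0, 1.0, 0.0],
                  [1.0, 3.0, 1.0],
                  [0.0, 1.0, 4.0]])
    f = lambda v, w: v @ M @ w                    # a bilinear form on R^3

    e = np.eye(3)
    A = np.array([[f(e[i], e[j]) for j in range(3)] for i in range(3)])
    print(np.allclose(A, M))                      # True: [f]_beta recovers M

    v, w = np.array([1.0, -2.0, 3.0]), np.array([0.0, 1.0, 1.0])
    print(np.isclose(f(v, w), v @ A @ w))         # True: f(v, w) = [v]^t [f]_beta [w]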
Note that for fixed v ∈ V the map L_f(v) : V → F given by L_f(v)(w) = f(v, w) is a linear functional. So f gives a map v ↦ L_f(v) in L(V, V*).
Theorem 7.1.3. Denote by Bil(V, F) the set of bilinear forms on V. The map Φ_L : Bil(V, F) → L(V, V*) given by Φ_L(f) = L_f is an isomorphism.

Proof. If f, g ∈ Bil(V, F) and c ∈ F then
(Φ_L(cf + g)(v))(w) = (L_{cf+g}(v))(w) = (cf + g)(v, w) = cf(v, w) + g(v, w)
= cL_f(v)(w) + L_g(v)(w) = (cL_f(v) + L_g(v))(w)
= (cΦ_L(f)(v) + Φ_L(g)(v))(w).
Thus Φ_L(cf + g)(v) = cΦ_L(f)(v) + Φ_L(g)(v), which is the same as (cΦ_L(f) + Φ_L(g))(v). Therefore Φ_L(cf + g) = cΦ_L(f) + Φ_L(g), so Φ_L is linear.

Now Bil(V, F) has dimension n², because the map from the last theorem is an isomorphism onto M_{n×n}(F); the space L(V, V*) has dimension n² as well. Therefore we only need to show one-to-one or onto. To show one-to-one, suppose that Φ_L(f) = 0. Then for all v, L_f(v) = 0. In other words, for all v and w ∈ V, f(v, w) = 0. This means f = 0.

Remark. We can also define R_f(w) by R_f(w)(v) = f(v, w). Then the corresponding map Φ_R : f ↦ R_f is an isomorphism.

You will prove the following fact in homework. If β is a basis for V and β* is the dual basis, then for each f ∈ Bil(V, F),
[R_f]_β^{β*} = [f]_β.
Then we have
[L_f]_β^{β*} = ([f]_β)^t.
To see this, set g ∈ Bil(V, F) to be g(v, w) = f(w, v). Then for each v, w ∈ V,
([w]_β)^t [f]_β [v]_β = f(w, v) = g(v, w).
Taking the transpose on both sides,
([v]_β)^t ([f]_β)^t [w]_β = g(v, w),
so [g]_β = ([f]_β)^t. But L_f = R_g, so
([f]_β)^t = [R_g]_β^{β*} = [L_f]_β^{β*}.
Definition 7.1.4. If f ∈ Bil(V, F) then we define the rank of f to be the rank of R_f.

By the above remark, the rank equals the rank of either of the matrices [f]_β or [L_f]_β^{β*}. Therefore the rank of [f]_β does not depend on the choice of basis β.
For f ∈ Bil(V, F), define
N(f) = {v ∈ V : f(v, w) = 0 for all w ∈ V}.
This is just
{v ∈ V : L_f(v) = 0} = N(L_f).
But L_f is a map from V to V*, so we have
rank(f) = rank(L_f) = dim V − dim N(f).
7.2 Symmetric bilinear forms
Definition 7.2.1. A bilinear form f ∈ Bil(V, F) is called symmetric if f(v, w) = f(w, v) for all v, w ∈ V. f is called skew-symmetric if f(v, v) = 0 for all v ∈ V.

The matrix for a symmetric bilinear form is symmetric and the matrix for a skew-symmetric bilinear form is anti-symmetric. Furthermore, each symmetric matrix A gives a symmetric bilinear form:
f(v, w) = ([v]_β)^t A [w]_β.
Similarly for skew-symmetric matrices.

Lemma 7.2.2. If f ∈ Bil(V, F) is symmetric and char(F) ≠ 2 then f(v, w) = 0 for all v, w ∈ V if and only if f(v, v) = 0 for all v ∈ V.

Proof. One direction is clear. For the other, suppose that f(v, v) = 0 for all v ∈ V. Then
f(v + w, v + w) = f(v, v) + 2f(v, w) + f(w, w),
f(v − w, v − w) = f(v, v) − 2f(v, w) + f(w, w).
Therefore
0 = f(v + w, v + w) − f(v − w, v − w) = 4f(v, w).
Here 4f(v, w) means f(v, w) added to itself 3 times. If char(F) ≠ 2 then this implies f(v, w) = 0.

Remark. If char(F) = 2 then the above lemma is false. Take F = Z_2 with V = F² and f with matrix
[ 0 1 ]
[ 1 0 ].
Check that this has f(v, v) = 0 for all v but f(v, w) is clearly not 0 for all v, w ∈ V.

Definition 7.2.3. A basis β = {v_1, ..., v_n} of V is orthogonal for f ∈ Bil(V, F) if f(v_i, v_j) = 0 whenever i ≠ j. It is orthonormal if it is orthogonal and f(v_i, v_i) = 1 for all i.
Theorem 7.2.4 (Diagonalization of symmetric bilinear forms). Let f ∈ Bil(V, F) with char(F) ≠ 2 and dim V < ∞. If f is symmetric then V has an orthogonal basis for f.

Proof. We argue by induction on n = dim V. If n = 1 it is clear. Let us now suppose that the statement holds for all k < n for some n > 1 and show that it holds for n. If f(v, v) = 0 for all v then f is identically zero and thus we are done. Otherwise we can find some v ≠ 0 such that f(v, v) ≠ 0.

Define
W = {w ∈ V : f(v, w) = 0}.
Since this is the nullspace of L_f(v) and L_f(v) is a nonzero element of V*, it follows that W is (n−1)-dimensional. Because f restricted to W is still a symmetric bilinear form, we can find a basis β′ of W such that the matrix of f restricted to W in β′ is diagonal. Write β′ = {v_1, ..., v_{n−1}} and β = β′ ∪ {v}. Then we claim β is a basis for V: if
a_1 v_1 + ... + a_{n−1} v_{n−1} + av = 0,
then applying L_f(v) to both sides we find af(v, v) = 0, so a = 0. Linear independence then gives that the other a_i's are zero.

Now it is clear that [f]_β is diagonal. For i ≠ j which are both < n this follows because β′ is a basis for which f is diagonal on W. Otherwise one of i, j is n, and then the other vector is in W and so f(v_i, v_j) = 0.

In the basis β, f has a diagonal matrix. This says that for each symmetric matrix A we can find an invertible matrix S such that
S^t A S is diagonal.

In fact, if F is a field in which each element has a square root (like C) then we can make a new basis, replacing each element v of β such that f(v, v) ≠ 0 by v/√f(v, v) and leaving all elements such that f(v, v) = 0, to find a basis γ such that the representation of f is diagonal, with only 1's and 0's on the diagonal. The number of 1's equals the rank of f.

Therefore if f has full rank and each element of F has a square root, there exists an orthonormal basis of V for f.
Theorem 7.2.5 (Sylvester's law). Let f be a symmetric bilinear form on Rⁿ. There exists a basis β such that [f]_β is diagonal, with only 0's, 1's and −1's. Furthermore, the number of each is independent of the choice of basis that puts f into this form.

Proof. Certainly such a basis exists: just modify the construction above by dividing by √|f(v_i, v_i)| instead. So we show the other claim. Because the number of 0's is independent of the basis, we need only show the statement for the 1's.

For a basis β, let V_+(β) be the span of the v_i's such that f(v_i, v_i) > 0, and similarly for V_−(β) and V_0(β). Clearly
V = V_+(β) ⊕ V_−(β) ⊕ V_0(β).
Note that the number of 1's equals the dimension of V_+(β). Furthermore, for each nonzero v ∈ V_+(β) we have
f(v, v) = Σ_{i=1}^p a_i² f(v_i, v_i) > 0,
where v_1, ..., v_p are the basis vectors for V_+(β). A similar argument gives f(v, v) ≤ 0 for all v ∈ V_−(β) ⊕ V_0(β).

If β′ is another basis we also have
V = V_+(β′) ⊕ V_−(β′) ⊕ V_0(β′).
Suppose that dim V_+(β′) > dim V_+(β). Then
dim V_+(β′) + dim(V_−(β) ⊕ V_0(β)) > n,
so V_+(β′) intersects V_−(β) ⊕ V_0(β) in at least one non-zero vector, say v. Since v ∈ V_+(β′), f(v, v) > 0. However, since v ∈ V_−(β) ⊕ V_0(β), f(v, v) ≤ 0, a contradiction. Therefore dim V_+(β) = dim V_+(β′) and we are done.
The subspace V_0(β) is unique. We can define
N_L(f) = N(L_f), N_R(f) = N(R_f).
In the symmetric case these are equal, and we can define both to be N(f). We claim that
V_0(β) = N(f) for all β.
Indeed, if v ∈ V_0(β) then, because the basis β is orthogonal for f, f(v, v_j) = 0 for every basis vector v_j, so f(v, w) = 0 for all w ∈ V and v ∈ N(f). On the other hand,
dim V_0(β) = dim V − (dim V_+(β) + dim V_−(β)) = dim V − rank [f]_β = dim N(f).

However, the spaces V_+(β) and V_−(β) are not unique. Let us take f ∈ Bil(R², R) with matrix (in the standard basis)
[f]_β =
[ 1  0 ]
[ 0 −1 ].
Then f((a, b), (c, d)) = ac − bd. Take v_1 = (2, √3) and v_2 = (√3, 2). Then we get
f(v_1, v_1) = (2)(2) − (√3)(√3) = 1,
f(v_1, v_2) = (2)(√3) − (√3)(2) = 0,
f(v_2, v_2) = (√3)(√3) − (2)(2) = −1.
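Numerically, the signature promised by Sylvester's law can be read off from the signs of the eigenvalues of a real symmetric matrix, and it is unchanged under congruence A ↦ SᵗAS. The sketch below (Python with NumPy; not part of the notes) checks this for the example above.

    import numpy as np

    def signature(A, tol=1e-9):
        w = np.linalg.eigvalsh(A)                 # real eigenvalues of a symmetric matrix
        return (int((w > tol).sum()), int((w < -tol).sum()), int((np.abs(w) <= tol).sum()))

    A = np.diag([1.0, -1.0])                      # the form f((a,b),(c,d)) = ac - bd
    S = np.array([[2.0, np.sqrt(3)],              # columns are v_1 = (2, sqrt(3))
                  [np.sqrt(3), 2.0]])             # and v_2 = (sqrt(3), 2)
    print(signature(A))                           # (1, 1, 0)
    print(signature(S.T @ A @ S))                 # (1, 1, 0): the signature is unchanged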
7.3 Sesquilinear and Hermitian forms
One important example of a symmetric bilinear form on Rⁿ is
f(v, w) = v_1 w_1 + ... + v_n w_n.
In this case, √f(v, v) actually defines a good notion of length of vectors on Rⁿ (we will define precisely what this means later). In particular, we have f(v, v) ≥ 0 for all v. On Cⁿ, however, this is not true: if f is the bilinear form from above, then f((i, ..., i), (i, ..., i)) < 0. But if we define the form
f(v, w) = v_1 w̄_1 + ... + v_n w̄_n,
then it is true. This is not bilinear, but it is sesquilinear.

Definition 7.3.1. Let V be a finite dimensional complex vector space. A function f : V × V → C is called sesquilinear if
1. for each w ∈ V, the function v ↦ f(v, w) is linear, and
2. for each v ∈ V, the function w ↦ f(v, w) is anti-linear.
Anti-linearity means that f(v, cw_1 + w_2) = c̄ f(v, w_1) + f(v, w_2). The sesquilinear form f is called Hermitian if f(v, w) = conj(f(w, v)).

Note that if f is Hermitian, then f(v, v) = conj(f(v, v)), so f(v, v) ∈ R.
1. If f(v, v) ≥ 0 (respectively > 0) for all v ≠ 0 then f is positive semi-definite (respectively positive definite).
2. If f(v, v) ≤ 0 (respectively < 0) for all v ≠ 0 then f is negative semi-definite (respectively negative definite).
If f is a sesquilinear form and β is a basis then there is a matrix for f: as before, if
v = a_1 v_1 + ... + a_n v_n and w = b_1 v_1 + ... + b_n v_n,
then
f(v, w) = Σ_{i=1}^n a_i f(v_i, w) = Σ_{i=1}^n Σ_{j=1}^n a_i b̄_j f(v_i, v_j) = [v]_β^t [f]_β conj([w]_β).
The map w ↦ f(·, w) is a conjugate isomorphism from V to V*.
We have the polarization formula
4f(u, v) = f(u + v, u + v) − f(u − v, u − v) + if(u + iv, u + iv) − if(u − iv, u − iv).
From this we deduce that if f(v, v) = 0 for all v then f = 0.
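The polarization formula is easy to verify numerically for the standard Hermitian form on Cⁿ; the short check below (Python with NumPy) is only an illustration and not part of the notes.

    import numpy as np

    f = lambda u, v: np.vdot(v, u)                # f(u, v) = sum u_k conj(v_k)
    rng = np.random.default_rng(0)
    u = rng.normal(size=3) + 1j * rng.normal(size=3)
    v = rng.normal(size=3) + 1j * rng.normal(size=3)

    lhs = 4 * f(u, v)
    rhs = (f(u + v, u + v) - f(u - v, u - v)
           + 1j * f(u + 1j * v, u + 1j * v) - 1j * f(u - 1j * v, u - 1j * v))
    print(np.isclose(lhs, rhs))                   # True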
Theorem 7.3.2 (Sylvester for Hermitian forms). Let f be a Hermitian form on a finite-dimensional complex vector space V. There exists a basis β of V such that [f]_β is diagonal with only 0's, 1's and −1's. Furthermore the number of each does not depend on β, so long as the matrix is in diagonal form.

Proof. Same proof.
7.4 Exercises
Notation:
1. For all problems below, F is a field of characteristic different from 2, and V is a finite-dimensional F-vector space. We write Bil(V, F) for the vector space of bilinear forms on V, and Sym(V, F) for the subspace of symmetric bilinear forms.
2. If B is a bilinear form on V and W ⊆ V is any subspace, we define the restriction of B to W, written B|_W ∈ Bil(W, F), by B|_W(w_1, w_2) = B(w_1, w_2).
3. We call B ∈ Sym(V, F) non-degenerate if N(B) = {0}.

Exercises
1. Let l ∈ V*. Define a symmetric bilinear form B on V by B(v, w) = l(v)l(w). Compute the nullspace of B.
2. Let B be a symmetric bilinear form on V. Suppose that W ⊆ V is a subspace with the property that V = W ⊕ N(B). Show that B|_W is non-degenerate.
3. Let B be a symmetric bilinear form on V and char(F) ≠ 2. Suppose that W ⊆ V is a subspace such that B|_W is non-degenerate. Show that then V = W ⊕ W^⊥.
Hint: Use induction on dim(W).
4. Recall the isomorphism Φ : Bil(V, F) → L(V, V*) given by
Φ(B)(v)(w) = B(v, w), v, w ∈ V.
If β is a basis of V, and β* is the dual basis of V*, show that
[B]_β = ([Φ(B)]_β^{β*})^T.
5. Let n denote the dimension of V. Let d ∈ Alt_n(V) and B ∈ Sym(V, F) both be non-zero. We are going to show that there exists a constant c_{d,B} ∈ F with the property that for any vectors v_1, ..., v_n, w_1, ..., w_n ∈ V, the following identity holds:
det( (B(v_i, w_j))_{i,j=1}^n ) = c_{d,B} d(v_1, ..., v_n) d(w_1, ..., w_n),
by completing the following steps:
(a) Show that for fixed (v_1, ..., v_n), there exists a constant c_{d,B}(v_1, ..., v_n) ∈ F such that
det( (B(v_i, w_j))_{i,j=1}^n ) = c_{d,B}(v_1, ..., v_n) d(w_1, ..., w_n).
(b) We now let (v_1, ..., v_n) vary. Show that there exists a constant c_{d,B} ∈ F such that
c_{d,B}(v_1, ..., v_n) = c_{d,B} d(v_1, ..., v_n).
Show further that c_{d,B} = 0 precisely when B is degenerate.
6. The orthogonal group. Let B be a non-degenerate symmetric bilinear form on V. Consider
O(B) = {f ∈ L(V, V) : B(f(v), f(w)) = B(v, w) for all v, w ∈ V}.
(a) Show that if f ∈ O(B), then det(f) is either 1 or −1.
Hint: Use the previous exercise.
(b) Show that composition of maps makes O(B) into a group.
(c) Let V = R², B((x_1, x_2), (y_1, y_2)) = x_1 y_1 + x_2 y_2. Give a formula for the 2×2 matrices that belong to O(B).
7. The vector product. Assume that V is 3-dimensional. Let B ∈ Sym(V, F) be non-degenerate, and d ∈ Alt_3(V) be non-zero.
(a) Show that for any v, w ∈ V there exists a unique vector z ∈ V such that for all vectors x ∈ V the following identity holds: B(z, x) = d(v, w, x).
Hint: Consider the element d(v, w, ·) ∈ V*.
(b) We will denote the unique vector z from part (a) by v × w. Show that V × V → V, (v, w) ↦ v × w is bilinear and skew-symmetric.
(c) For f ∈ O(B), show that f(v × w) = det(f) · (f(v) × f(w)).
(d) Show that v × w is B-orthogonal to both v and w.
(e) Show that v × w = 0 precisely when v and w are linearly dependent.
8. Let V be a finite dimensional R-vector space. Recall its complexification V_C, defined in the exercises of the last chapter. It is a C-vector space with dim_C V_C = dim_R V. As an R-vector space, it equals V × V. We have the injection ι : V → V_C, v ↦ (v, 0). We also have the complex conjugation map c(v, w) = (v, −w).
(a) Let B be a bilinear form on V. Show that
B_C((v, w), (x, y)) := B(v, x) − B(w, y) + iB(v, y) + iB(w, x)
defines a bilinear form on V_C. Show that N(B_C) = N(B)_C. Show that B_C is symmetric if and only if B is.
(b) Show that for v, w ∈ V_C, we have B_C(c(v), c(w)) = conj(B_C(v, w)). Show conversely that any bilinear form B̃ on V_C with the property B̃(c(v), c(w)) = conj(B̃(v, w)) is equal to B_C for some bilinear form B on V.
(c) Let B be a symmetric bilinear form on V. Show that
B_H((v, w), (x, y)) := B(v, x) + B(w, y) − iB(v, y) + iB(w, x)
defines a Hermitian form on V_C. Show that N(B_H) = N(B)_C.
(d) Show that for v, w ∈ V_C, we have B_H(c(v), c(w)) = conj(B_H(v, w)). Show conversely that any Hermitian form B̃ on V_C with the property B̃(c(v), c(w)) = conj(B̃(v, w)) is equal to B_H for some symmetric bilinear form B on V.
9. Prove that if V is a finite-dimensional F-vector space with char(F) ≠ 2 and f is a nonzero skew-symmetric bilinear form (that is, a bilinear form such that f(v, w) = −f(w, v) for all v, w ∈ V) then there is no basis β for V such that [f]_β is upper triangular.
10. For each of the following real matrices A, find an invertible matrix S such that SᵗAS is diagonal.
[ 2  3  5 ]
[ 3  7 11 ]
[ 5 11 13 ]
and
[ 0 1 2 3 ]
[ 1 0 1 2 ]
[ 2 1 0 1 ]
[ 3 2 1 0 ].
Also find a complex matrix T such that TᵗAT is diagonal with only entries 0 and 1.
8 Inner product spaces
8.1 Definitions
We will be interested in positive definite Hermitian forms.

Definition 8.1.1. Let V be a complex vector space. A Hermitian form f is called an inner product (or scalar product) if f is positive definite. In this case we call V a (complex) inner product space.

An example is the standard dot product:
⟨u, v⟩ = u_1 v̄_1 + ... + u_n v̄_n.
It is customary to write an inner product f(u, v) as ⟨u, v⟩. In addition, we write ‖u‖ = √⟨u, u⟩. This is the norm induced by the inner product ⟨·, ·⟩. In fact, (V, d) is a metric space, using
d(u, v) = ‖u − v‖.
Properties of the norm. Let (V, ⟨·, ·⟩) be a complex inner product space.
1. For all c ∈ C, ‖cu‖ = |c|‖u‖.
2. ‖u‖ = 0 if and only if u = 0.
3. (Cauchy-Schwarz inequality) For u, v ∈ V,
|⟨u, v⟩| ≤ ‖u‖‖v‖.

Proof. If u or v is 0 then we are done. Otherwise, set w = u − (⟨u, v⟩/‖v‖²) v. Then
0 ≤ ⟨w, w⟩ = ⟨w, u⟩ − (conj(⟨u, v⟩)/‖v‖²)⟨w, v⟩.
However
⟨w, v⟩ = ⟨u, v⟩ − (⟨u, v⟩/‖v‖²)⟨v, v⟩ = 0,
so
0 ≤ ⟨w, u⟩ = ⟨u, u⟩ − ⟨u, v⟩ conj(⟨u, v⟩)/‖v‖²,
and therefore
0 ≤ ⟨u, u⟩ − |⟨u, v⟩|²/‖v‖², that is, |⟨u, v⟩|² ≤ ‖u‖²‖v‖².
Each step above is an equality except for the first inequality, so we have equality in Cauchy-Schwarz if and only if w = 0, that is, if and only if u and v are linearly dependent.

4. (Triangle inequality) For u, v ∈ V,
‖u + v‖ ≤ ‖u‖ + ‖v‖.
This is also written ‖u − v‖ ≤ ‖u − w‖ + ‖w − v‖.

Proof.
‖u + v‖² = ⟨u + v, u + v⟩ = ⟨u, u⟩ + ⟨u, v⟩ + conj(⟨u, v⟩) + ⟨v, v⟩
= ⟨u, u⟩ + 2Re⟨u, v⟩ + ⟨v, v⟩
≤ ‖u‖² + 2|⟨u, v⟩| + ‖v‖²
≤ ‖u‖² + 2‖u‖‖v‖ + ‖v‖² = (‖u‖ + ‖v‖)².
Taking square roots gives the result.
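Both inequalities are easy to test numerically for the standard inner product on Cⁿ; the following check (Python with NumPy) is only a sanity check, not part of the notes.

    import numpy as np

    rng = np.random.default_rng(1)
    u = rng.normal(size=4) + 1j * rng.normal(size=4)
    v = rng.normal(size=4) + 1j * rng.normal(size=4)
    inner = lambda a, b: np.vdot(b, a)            # <a, b> = sum a_k conj(b_k)
    norm = lambda a: np.sqrt(inner(a, a).real)

    print(abs(inner(u, v)) <= norm(u) * norm(v))  # Cauchy-Schwarz
    print(norm(u + v) <= norm(u) + norm(v))       # triangle inequality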
8.2 Orthogonality
Definition 8.2.1. Given a complex inner product space (V, ⟨·, ·⟩) we say that vectors u, v ∈ V are orthogonal if ⟨u, v⟩ = 0.

Theorem 8.2.2. Let v_1, ..., v_k be nonzero and pairwise orthogonal vectors in a complex inner product space. Then they are linearly independent.

Proof. Suppose that
a_1 v_1 + ... + a_k v_k = 0.
Taking the inner product with v_i gives
0 = ⟨0, v_i⟩ = ⟨Σ_{j=1}^k a_j v_j, v_i⟩ = a_i ‖v_i‖².
Therefore a_i = 0.

We begin with a method to transform a linearly independent set into an orthonormal set.
Theorem 8.2.3 (Gram-Schmidt). Let V be a complex inner product space and v_1, ..., v_k ∈ V be linearly independent. There exist u_1, ..., u_k such that
1. u_1, ..., u_k is orthonormal, and
2. for all j = 1, ..., k, Span(u_1, ..., u_j) = Span(v_1, ..., v_j).

Proof. We prove this by induction. If k = 1, we must have v_1 ≠ 0, so set u_1 = v_1/‖v_1‖. This gives ‖u_1‖ = 1, so that {u_1} is orthonormal and certainly the second condition holds.

If k ≥ 2, assume the statement holds for k − 1 vectors and find vectors u_1, ..., u_{k−1} as in the statement. Now to define u_k we set
w_k = v_k − [⟨v_k, u_1⟩u_1 + ... + ⟨v_k, u_{k−1}⟩u_{k−1}].
We claim that w_k is orthogonal to all the u_j's and is not zero. To check the first, let 1 ≤ j ≤ k−1 and compute
⟨w_k, u_j⟩ = ⟨v_k, u_j⟩ − [⟨v_k, u_1⟩⟨u_1, u_j⟩ + ... + ⟨v_k, u_{k−1}⟩⟨u_{k−1}, u_j⟩]
= ⟨v_k, u_j⟩ − ⟨v_k, u_j⟩⟨u_j, u_j⟩ = 0.
Second, if w_k were zero then we would have
v_k ∈ Span(u_1, ..., u_{k−1}) = Span(v_1, ..., v_{k−1}),
a contradiction to linear independence. Therefore we set u_k = w_k/‖w_k‖ and we see that u_1, ..., u_k is orthonormal and therefore linearly independent.

Furthermore note that by induction,
Span(u_1, ..., u_k) ⊆ Span(u_1, ..., u_{k−1}, v_k) ⊆ Span(v_1, ..., v_k).
Since the spaces on the left and right have the same dimension, they are equal.
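The proof is constructive, and the construction is easy to run. The sketch below (Python with NumPy, for the standard inner product on Cⁿ; not part of the notes) follows the formula for w_k literally.

    import numpy as np

    def gram_schmidt(vectors):
        inner = lambda a, b: np.vdot(b, a)        # <a, b> = sum a_k conj(b_k)
        ortho = []
        for v in vectors:
            w = v - sum(inner(v, u) * u for u in ortho)   # subtract the projections
            ortho.append(w / np.sqrt(inner(w, w).real))   # normalize
        return ortho

    vs = [np.array([1.0, 1.0, 0.0]),
          np.array([1.0, 0.0, 1.0]),
          np.array([0.0, 1.0, 1.0])]
    us = gram_schmidt(vs)
    print(np.round([[np.vdot(b, a) for b in us] for a in us], 6))   # identity matrix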
Corollary 8.2.4. If V is a finite-dimensional inner product space then V has an orthonormal basis.

What do vectors look like when represented in an orthonormal basis? Let β = {v_1, ..., v_n} be an orthonormal basis and let v ∈ V. Then
v = a_1 v_1 + ... + a_n v_n.
Taking the inner product with v_j on both sides gives a_j = ⟨v, v_j⟩, so
v = ⟨v, v_1⟩v_1 + ... + ⟨v, v_n⟩v_n.
Thus in this (orthonormal) case we can view the number ⟨v, v_i⟩ as the projection of v onto v_i. We can then find the norm of v easily:
‖v‖² = ⟨v, v⟩ = ⟨v, Σ_{i=1}^n ⟨v, v_i⟩v_i⟩ = Σ_{i=1}^n conj(⟨v, v_i⟩)⟨v, v_i⟩ = Σ_{i=1}^n |⟨v, v_i⟩|².
This is known as Parseval's identity.
Definition 8.2.5. If V is an inner product space and W is a subspace of V we define the orthogonal complement of W as
W^⊥ = {v ∈ V : ⟨v, w⟩ = 0 for all w ∈ W}.
Note that {0}^⊥ = V and V^⊥ = {0}.
If S ⊆ V then S^⊥ is always a subspace of V (even if S was not). Furthermore,
S^⊥ = (Span S)^⊥ and (S^⊥)^⊥ = Span S.
Theorem 8.2.6. Let V be a finite-dimensional inner product space with W a subspace. Then
V = W ⊕ W^⊥.

Proof. Let w_1, ..., w_k be a basis for W and extend it to a basis w_1, ..., w_n for V. Then perform Gram-Schmidt to get an orthonormal basis v_1, ..., v_n such that
Span(v_1, ..., v_j) = Span(w_1, ..., w_j) for all j = 1, ..., n.
In particular, v_1, ..., v_k is an orthonormal basis for W. We claim that v_{k+1}, ..., v_n is a basis for W^⊥. To see this, define W̃ to be the span of these vectors. Clearly W̃ ⊆ W^⊥. On the other hand,
W ∩ W^⊥ = {w ∈ W : ⟨w, w′⟩ = 0 for all w′ ∈ W} ⊆ {w ∈ W : ⟨w, w⟩ = 0} = {0}.
This means that dim W + dim W^⊥ ≤ n, or dim W^⊥ ≤ n − k. Since dim W̃ = n − k, we see they are equal.
This leads us to a definition.

Definition 8.2.7. Let V be a finite dimensional inner product space. If W is a subspace of V we write P_W : V → V for the operator
P_W(v) = w_1,
where v is written uniquely as w_1 + w_2 with w_1 ∈ W and w_2 ∈ W^⊥. P_W is called the orthogonal projection onto W.
Properties of orthogonal projection.
1. P_W is linear.
2. P_W² = P_W.
3. P_{W^⊥} = I − P_W.
4. For all v_1, v_2 ∈ V,
⟨P_W(v_1), v_2⟩ = ⟨P_W(v_1), P_W(v_2)⟩ + ⟨P_W(v_1), P_{W^⊥}(v_2)⟩
= ⟨P_W(v_1), P_W(v_2)⟩ + ⟨P_{W^⊥}(v_1), P_W(v_2)⟩ = ⟨v_1, P_W(v_2)⟩.

Alternatively one may define an orthogonal projection as a linear map with properties 2 and 4. That is, if T : V → V is linear with T² = T and ⟨T(v), w⟩ = ⟨v, T(w)⟩ for all v, w ∈ V, then (check this)
V = R(T) ⊕ N(T) where N(T) = (R(T))^⊥
and
T = P_{R(T)}.
Example (orthogonal projection onto a one-dimensional subspace). What we saw in the proof of V = W ⊕ W^⊥ is the following. If W is a subspace of a finite dimensional inner product space, there exists an orthonormal basis of V of the form β = β_1 ∪ β_2, where β_1 is an orthonormal basis of W and β_2 is an orthonormal basis of W^⊥.

Choose an orthonormal basis v_1, ..., v_n of V so that v_1 spans W and the other vectors span W^⊥. For any v ∈ V,
v = ⟨v, v_1⟩v_1 + ⟨v, v_2⟩v_2 + ... + ⟨v, v_n⟩v_n,
which is a representation of v in terms of W and W^⊥. Thus P_W(v) = ⟨v, v_1⟩v_1.

For any nonzero vector v′ we can define the orthogonal projection onto v′ as P_W, where W = Span{v′}. Then we choose w′ = v′/‖v′‖ as our first vector in the orthonormal basis and
P_{v′}(v) = P_W(v) = ⟨v, w′⟩w′ = (⟨v, v′⟩/‖v′‖²) v′.
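For the standard inner product on Cⁿ this last formula is one line of code; the sketch below (Python with NumPy; not part of the notes) also checks that the residual v − P_{v′}(v) is orthogonal to v′.

    import numpy as np

    def project_onto(v, v_prime):
        inner = lambda a, b: np.vdot(b, a)        # <a, b> = sum a_k conj(b_k)
        return (inner(v, v_prime) / inner(v_prime, v_prime)) * v_prime

    v = np.array([3.0, 4.0, 0.0])
    v_prime = np.array([1.0, 0.0, 0.0])
    p = project_onto(v, v_prime)
    print(p)                                      # [3. 0. 0.]
    print(np.vdot(v_prime, v - p))                # 0: the residual is orthogonal to v'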
Theorem 8.2.8. If V is a finite dimensional inner product space with W a subspace of V, then for each v ∈ V, P_W(v) is the closest vector in W to v (using the distance coming from ‖·‖). That is, for all w ∈ W,
‖v − P_W(v)‖ ≤ ‖v − w‖.

Proof. First we note that if w ∈ W and w′ ∈ W^⊥ then the Pythagorean theorem holds:
‖w + w′‖² = ⟨w + w′, w + w′⟩ = ⟨w, w⟩ + ⟨w′, w′⟩ = ‖w‖² + ‖w′‖².
We now take v ∈ V and w ∈ W and write v − w = P_W(v) + P_{W^⊥}(v) − w = (P_W(v) − w) + P_{W^⊥}(v). Applying Pythagoras,
‖v − w‖² = ‖P_W(v) − w‖² + ‖P_{W^⊥}(v)‖² ≥ ‖P_{W^⊥}(v)‖² = ‖P_W(v) − v‖².
8.3 Adjoints
Theorem 8.3.1. Let V be a finite-dimensional inner product space. If T : V → V is linear then there exists a unique linear transformation T* : V → V such that for all v, u ∈ V,
⟨T(v), u⟩ = ⟨v, T*(u)⟩.    (5)
We call T* the adjoint of T.

Proof. We will use the Riesz representation theorem.

Lemma 8.3.2 (Riesz). Let V be a finite-dimensional inner product space. For each f ∈ V* there exists a unique z ∈ V such that
f(v) = ⟨v, z⟩ for all v ∈ V.

Given this we define T* as follows. For u ∈ V we define the linear functional
f_{u,T} : V → C by f_{u,T}(v) = ⟨T(v), u⟩.
You can check this is indeed a linear functional. By Riesz, there exists a unique z ∈ V such that
f_{u,T}(v) = ⟨v, z⟩ for all v ∈ V.
We define this z to be T*(u). In other words, for a given u ∈ V, T*(u) is the unique vector in V with the property
⟨T(v), u⟩ = f_{u,T}(v) = ⟨v, T*(u)⟩ for all v ∈ V.
Because of this identity, we see that there exists a function T* : V → V such that for all u, v ∈ V, (5) holds. In other words, given T, we have a way of mapping a vector u ∈ V to another vector which we call T*(u). We need to know that this assignment is unique and linear.

Suppose that R : V → V is another function such that for all u, v ∈ V,
⟨T(v), u⟩ = ⟨v, R(u)⟩.
Then we see that
⟨v, T*(u) − R(u)⟩ = ⟨v, T*(u)⟩ − ⟨v, R(u)⟩ = ⟨T(v), u⟩ − ⟨v, R(u)⟩ = 0
for all u, v ∈ V. Choosing v = T*(u) − R(u) gives that
‖T*(u) − R(u)‖ = 0,
or T*(u) = R(u) for all u. This means T* = R.

To show linearity, let c ∈ C and u_1, u_2, v ∈ V. Then
⟨T(v), cu_1 + u_2⟩ = c̄⟨T(v), u_1⟩ + ⟨T(v), u_2⟩ = c̄⟨v, T*(u_1)⟩ + ⟨v, T*(u_2)⟩ = ⟨v, cT*(u_1) + T*(u_2)⟩.
This means that
⟨v, T*(cu_1 + u_2) − cT*(u_1) − T*(u_2)⟩ = 0
for all v ∈ V. Choosing v = T*(cu_1 + u_2) − cT*(u_1) − T*(u_2) gives that
T*(cu_1 + u_2) = cT*(u_1) + T*(u_2),
or T* is linear.
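For the standard inner product on Cⁿ the adjoint is realized concretely by the conjugate transpose of the matrix (this reappears as property 2 below). The following numerical check (Python with NumPy; not part of the notes) verifies the defining identity (5).

    import numpy as np

    rng = np.random.default_rng(2)
    T = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
    T_star = T.conj().T                           # candidate for the adjoint

    v = rng.normal(size=3) + 1j * rng.normal(size=3)
    w = rng.normal(size=3) + 1j * rng.normal(size=3)
    inner = lambda a, b: np.vdot(b, a)            # <a, b> = sum a_k conj(b_k)
    print(np.isclose(inner(T @ v, w), inner(v, T_star @ w)))   # True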
Properties of the adjoint.
1. T* : V → V is linear. To see this, if w_1, w_2 ∈ V and c ∈ F then for all v,
⟨T(v), cw_1 + w_2⟩ = c̄⟨T(v), w_1⟩ + ⟨T(v), w_2⟩ = c̄⟨v, T*(w_1)⟩ + ⟨v, T*(w_2)⟩ = ⟨v, cT*(w_1) + T*(w_2)⟩.
By uniqueness, T*(cw_1 + w_2) = cT*(w_1) + T*(w_2).

2. If β is an orthonormal basis of V then [T*]_β = conj(([T]_β)^t), the conjugate transpose of [T]_β.

Proof. If β is an orthonormal basis then, remembering that ⟨·, ·⟩ is a sesquilinear form, its matrix in the basis β is simply the identity. Therefore
⟨T(v), w⟩ = [T(v)]_β^t conj([w]_β) = ([T]_β [v]_β)^t conj([w]_β) = [v]_β^t ([T]_β)^t conj([w]_β).
On the other hand, for all v, w,
⟨v, T*(w)⟩ = [v]_β^t conj([T*(w)]_β) = [v]_β^t conj([T*]_β) conj([w]_β).
Therefore
[v]_β^t conj([T*]_β) conj([w]_β) = [v]_β^t ([T]_β)^t conj([w]_β).
Choosing v = v_i and w = v_j shows that all the entries of the matrices conj([T*]_β) and ([T]_β)^t are equal, which gives the claim.

3. (T + S)* = T* + S*.
Proof. If v, w ∈ V,
⟨(T + S)(v), w⟩ = ⟨T(v), w⟩ + ⟨S(v), w⟩ = ⟨v, T*(w)⟩ + ⟨v, S*(w)⟩.
This equals ⟨v, (T* + S*)(w)⟩.

4. (cT)* = c̄T*. This is similar.

5. (TS)* = S*T*.
Proof. For all v, w ∈ V,
⟨(TS)(v), w⟩ = ⟨T(S(v)), w⟩ = ⟨S(v), T*(w)⟩ = ⟨v, S*(T*(w))⟩.
This is ⟨v, (S*T*)(w)⟩.

6. (T*)* = T.
Proof. If v, w ∈ V,
⟨T*(v), w⟩ = conj(⟨w, T*(v)⟩) = conj(⟨T(w), v⟩) = ⟨v, T(w)⟩.
8.4 Spectral theory of self-adjoint operators
Definition 8.4.1. If V is an inner product space and T : V → V is linear, we say that T is
1. self-adjoint if T = T*;
2. skew-adjoint if T = −T*;
3. unitary if T is invertible and T^{-1} = T*;
4. normal if TT* = T*T.

Note that all of the operators above are normal. Also, orthogonal projections are self-adjoint. There is a useful analogy with the complex numbers: self-adjoint operators play the role of real numbers, skew-adjoint operators that of purely imaginary numbers, and unitary operators that of numbers on the unit circle.

Theorem 8.4.2. Let V be an inner product space and T : V → V linear with λ an eigenvalue of T.
1. If T is self-adjoint then λ is real.
2. If T is skew-adjoint then λ is purely imaginary.
3. If T is unitary then |λ| = 1 and |det T| = 1.

Proof. Let T : V → V be linear and λ an eigenvalue with eigenvector v.
1. Suppose T = T*. Then
λ‖v‖² = ⟨T(v), v⟩ = ⟨v, T(v)⟩ = λ̄‖v‖².
But v ≠ 0, so λ = λ̄.
2. Suppose that T = −T*. Define S = iT. Then S* = −iT* = (−i)(−T) = iT = S, so S is self-adjoint. Now iλ is an eigenvalue of S:
S(v) = (iT)(v) = iλv.
This means iλ is real, so λ is purely imaginary.
3. Suppose T* = T^{-1}. Then
|λ|²‖v‖² = ⟨T(v), T(v)⟩ = ⟨v, T*T(v)⟩ = ⟨v, T^{-1}T(v)⟩ = ‖v‖².
This means |λ| = 1. Furthermore, det T is the product of the eigenvalues, so |det T| = 1.
What do these operators look like? If β is an orthonormal basis for V then:
1. If T = T* then [T]_β = conj(([T]_β)^t).
2. If T = −T* then [T]_β = −conj(([T]_β)^t).

Lemma 8.4.3. Let V be a finite-dimensional inner product space. If T : V → V is linear, the following are equivalent.
1. T is unitary.
2. For all v ∈ V, ‖T(v)‖ = ‖v‖.
3. For all v, w ∈ V, ⟨T(v), T(w)⟩ = ⟨v, w⟩.

Proof. If T is unitary then ⟨T(v), T(v)⟩ = ⟨v, T^{-1}T(v)⟩ = ⟨v, v⟩. This shows 1 implies 2. If T preserves the norm then it also preserves the inner product, by the polarization identity. This proves 2 implies 3. To see that 3 implies 1, we take v, w ∈ V and see
⟨v, w⟩ = ⟨T(v), T(w)⟩ = ⟨v, T*T(w)⟩.
This implies that ⟨v, w − T*T(w)⟩ = 0 for all v. Taking v = w − T*T(w) gives that T*T(w) = w. Thus T must be invertible and T* = T^{-1}.

Furthermore, T is unitary if and only if T maps orthonormal bases to orthonormal bases. In particular, [T]_β has orthonormal columns whenever β is orthonormal. For an orthonormal basis β, the unitary operators are exactly those whose matrices relative to β have orthonormal columns.
We begin with a definition.

Definition 8.4.4. If V is a finite-dimensional inner product space and T : V → V is linear, we say that T is unitarily diagonalizable if there exists an orthonormal basis β of V such that [T]_β is diagonal.

Note that T is unitarily diagonalizable if and only if there exists a unitary operator U such that
U^{-1}TU is diagonal.

Theorem 8.4.5 (Spectral theorem). Let V be a finite-dimensional inner product space. If T : V → V is self-adjoint then T is unitarily diagonalizable.

Proof. We use induction on dim V = n. If n = 1 just choose a vector of norm 1. Otherwise suppose the statement is true for all dimensions less than k and we will show it for dimension k ≥ 2. Since T has an eigenvalue λ, it has an eigenvector v_1; choose v_1 with norm 1.

Let U_λ = T − λI. We claim that
V = N(U_λ) ⊕ R(U_λ).
To show this we need only prove that R(U_λ) = N(U_λ)^⊥. This will follow from a lemma:

Lemma 8.4.6. Let V be a finite-dimensional inner product space and U : V → V linear. Then
R(U) = N(U*)^⊥.

Proof. If w ∈ R(U), let z ∈ N(U*). There exists v ∈ V such that U(v) = w. Therefore
⟨w, z⟩ = ⟨U(v), z⟩ = ⟨v, U*(z)⟩ = 0.
Therefore R(U) ⊆ N(U*)^⊥. For the other containment, note that dim R(U) = dim R(U*) (since the matrix of U* in an orthonormal basis is just the conjugate transpose of that of U). Therefore
dim R(U) = dim R(U*) = dim V − dim N(U*) = dim N(U*)^⊥.

Now we apply the lemma. Note that since T is self-adjoint,
U_λ* = (T − λI)* = T* − λ̄I = T − λI = U_λ,
since λ ∈ R. Thus, using the lemma with U_λ,
V = N(U_λ) ⊕ N(U_λ)^⊥ = N(U_λ) ⊕ R(U_λ).
Note that these are T-invariant subspaces and dim R(U_λ) < k, since N(U_λ) contains the eigenvector v_1. Thus by induction there is an orthonormal basis β′ of R(U_λ) such that the matrix of T restricted to this space is diagonal. On N(U_λ) the operator T acts as λI, so any orthonormal basis β″ of N(U_λ) (for instance one containing v_1) also diagonalizes T there. Taking
β = β′ ∪ β″
gives an orthonormal basis of V such that [T]_β is diagonal.
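Numerically, the spectral theorem is what numpy.linalg.eigh computes for a Hermitian matrix: real eigenvalues together with an orthonormal basis of eigenvectors. The sketch below (not part of the notes) illustrates this.

    import numpy as np

    rng = np.random.default_rng(3)
    B = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
    T = B + B.conj().T                            # self-adjoint by construction

    eigenvalues, U = np.linalg.eigh(T)            # columns of U are orthonormal eigenvectors
    print(np.allclose(U.conj().T @ U, np.eye(4)))                 # U is unitary
    print(np.allclose(U.conj().T @ T @ U, np.diag(eigenvalues)))  # T is diagonalized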
Note that if T is skew-adjoint, then iT is self-adjoint, so we can find an orthonormal basis β such that [iT]_β is diagonal. This implies that T itself is diagonalized by β: its matrix is just −i[iT]_β.
8.5 Normal and commuting operators
Lemma 8.5.1. Let U, T : V → V be linear and F be algebraically closed. Write Ẽ^T_{λ_1}, ..., Ẽ^T_{λ_k} for the generalized eigenspaces of T. If T and U commute then
V = Ẽ^T_{λ_1} ⊕ ... ⊕ Ẽ^T_{λ_k}
is both a T-invariant and a U-invariant direct sum.

Proof. We need only show that the generalized eigenspaces of T are U-invariant. If v ∈ N((T − λ_i I)^m) then
(T − λ_i I)^m (U(v)) = U((T − λ_i I)^m v) = 0.
Theorem 8.5.2. Let U, T : V → V be linear and F algebraically closed. Suppose that T and U commute. Then
1. If T and U are diagonalizable then there exists a basis β such that both [T]_β and [U]_β are diagonal.
2. If V is an inner product space and T and U are self-adjoint then we can choose β to be orthonormal.

Proof. Suppose that T and U are diagonalizable. Then the direct sum of generalized eigenspaces for T is simply
E^T_{λ_1} ⊕ ... ⊕ E^T_{λ_k},
the direct sum of the eigenspaces. For each j, choose a Jordan basis β_j for U restricted to E^T_{λ_j} (this restriction makes sense by the lemma). Set β = β_1 ∪ ... ∪ β_k. These vectors are all eigenvectors for T, so [T]_β is diagonal. Further, [U]_β is in Jordan form. But since U is diagonalizable, its Jordan form is diagonal; by uniqueness of the Jordan form, [U]_β is diagonal.

If T and U are self-adjoint, the decomposition
V = E^T_{λ_1} ⊕ ... ⊕ E^T_{λ_k}
is orthogonal. For each j, choose an orthonormal basis β_j of E^T_{λ_j} consisting of eigenvectors of U (this is possible by the spectral theorem, since U is self-adjoint on E^T_{λ_j}). Now β = β_1 ∪ ... ∪ β_k is an orthonormal basis of eigenvectors for both T and U.
Theorem 8.5.3. Let V be a finite-dimensional inner product space. If T : V → V is linear then T is normal if and only if T is unitarily diagonalizable.

Definition 8.5.4. If V is an inner product space and T : V → V is linear, we write
T = (1/2)(T + T*) + (1/2)(T − T*) = T_1 + T_2
and call these operators the self-adjoint part and the skew-adjoint part of T, respectively.

Of course each part of a linear transformation can be unitarily diagonalized on its own; to prove the theorem we need to diagonalize them simultaneously.
Proof. If T is unitarily diagonalizable then, taking a unitary U such that T = U^{-1}DU for a diagonal matrix D and using U^{-1} = U*, we get
TT* = (U*DU)(U*DU)* = U*DUU* conj(D) U = U* D conj(D) U = U* conj(D) D U = (U*DU)*(U*DU) = T*T,
or T is normal.

Suppose now that T is normal. Then T_1 and T_2 commute (indeed T_1T_2 − T_2T_1 = (1/2)(T*T − TT*)). Note that T_1 is self-adjoint and so is iT_2. Since they commute, by Theorem 8.5.2 we can find an orthonormal basis β such that [T_1]_β and [iT_2]_β are diagonal. Now
[T]_β = [T_1]_β − i[iT_2]_β
is diagonal.
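The decomposition used in this proof is easy to examine numerically: T_1 = (T + T*)/2 is self-adjoint, T_2 = (T − T*)/2 is skew-adjoint, and T is normal exactly when the two parts commute. The sketch below (Python with NumPy; not part of the notes) builds a normal matrix and checks this.

    import numpy as np

    rng = np.random.default_rng(4)
    D = np.diag(rng.normal(size=3) + 1j * rng.normal(size=3))
    Q, _ = np.linalg.qr(rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3)))
    T = Q.conj().T @ D @ Q                        # unitarily diagonalizable, hence normal

    T1 = (T + T.conj().T) / 2                     # self-adjoint part
    T2 = (T - T.conj().T) / 2                     # skew-adjoint part
    print(np.allclose(T @ T.conj().T, T.conj().T @ T))   # T is normal
    print(np.allclose(T1 @ T2, T2 @ T1))                 # the two parts commute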
8.6 Exercises
Notation
1. If V is a vector space over R and ⟨·, ·⟩ : V × V → R is a positive-definite symmetric bilinear form, then we call ⟨·, ·⟩ a (real) inner product. The pair (V, ⟨·, ·⟩) is called a (real) inner product space.
2. If (V, ⟨·, ·⟩) is a real inner product space and S is a subset of V we say that S is orthogonal if ⟨v, w⟩ = 0 whenever v, w ∈ S are distinct. We say S is orthonormal if S is orthogonal and ⟨v, v⟩ = 1 for all v ∈ S.
3. If f is a symmetric bilinear form on a vector space V the orthogonal group is the set
O(f) = {T : V → V : f(T(u), T(v)) = f(u, v) for all u, v ∈ V}.

Exercises
1. Let V be a complex inner product space. Let T ∈ L(V, V) be such that T* = −T. We call such T skew-self-adjoint. Show that the eigenvalues of T are purely imaginary. Show further that V is the orthogonal direct sum of the eigenspaces of T. In other words, V is a direct sum of the eigenspaces and ⟨v, w⟩ = 0 if v and w are in distinct eigenspaces.
Hint: Construct from T a suitable self-adjoint operator and apply the known results from the lecture to that operator.
2. Let V be a complex inner product space, and T ∈ L(V, V).
(a) Show that T is unitary if and only if it maps orthonormal bases to orthonormal bases.
(b) Let β be an orthonormal basis of V. Show that T is unitary if and only if the columns of the matrix [T]_β form a set of orthonormal vectors in Cⁿ with respect to the standard Hermitian form (standard dot product).

3. Let (V, ⟨·, ·⟩) be a complex inner product space, and let φ be a Hermitian form on V (in addition to ⟨·, ·⟩). Show that there exists an orthonormal basis β of V such that [φ]_β is diagonal, by completing the following steps:
(a) Show that for each w ∈ V, there exists a unique vector, which we call Aw, in V with the property that for all v ∈ V,
φ(v, w) = ⟨v, Aw⟩.
(b) Show that the map A : V → V which sends a vector w ∈ V to the vector Aw just defined is linear and self-adjoint.
(c) Use the spectral theorem for self-adjoint operators to complete the problem.
4. Let (V, ⟨·, ·⟩) be a real inner product space.
(a) Define ‖·‖ : V → R by
‖v‖ = √⟨v, v⟩.
Show that for all v, w ∈ V,
|⟨v, w⟩| ≤ ‖v‖‖w‖.
(b) Show that ‖·‖ is a norm on V.
(c) Show that there exists an orthonormal basis of V.

5. Let (V, ⟨·, ·⟩) be a real inner product space and T : V → V be linear.
(a) Prove that for each f ∈ V* there exists a unique z ∈ V such that for all v ∈ V,
f(v) = ⟨v, z⟩.
(b) For each u ∈ V define f_{u,T} : V → R by
f_{u,T}(v) = ⟨T(v), u⟩.
Prove that f_{u,T} ∈ V*. Define T^t(u) to be the unique vector z ∈ V such that for all v ∈ V,
⟨T(v), u⟩ = ⟨v, T^t(u)⟩,
and show that T^t is linear.
(c) Show that if β is an orthonormal basis for V then
[T^t]_β = ([T]_β)^t.
6. Let (V, ⟨·, ·⟩) be a real inner product space and define the complexification of ⟨·, ·⟩, as in homework 11, by
⟨(v, w), (x, y)⟩_C = ⟨v, x⟩ + ⟨w, y⟩ − i⟨v, y⟩ + i⟨w, x⟩.
(a) Show that ⟨·, ·⟩_C is an inner product on V_C.
(b) Let T : V → V be linear.
i. Prove that (T_C)* = (T^t)_C.
ii. If T^t = T then we say that T is symmetric. Show in this case that T_C is Hermitian (self-adjoint).
iii. If T^t = −T then we say T is anti-symmetric. Show in this case that T_C is skew-adjoint.
iv. If T is invertible and T^t = T^{-1} then we say that T is orthogonal. Show in this case that T_C is unitary. Show that this is equivalent to
T ∈ O(⟨·, ·⟩),
where O(⟨·, ·⟩) is the orthogonal group for ⟨·, ·⟩.
7. Let (V, ⟨·, ·⟩) be a real inner product space and T : V → V be linear.
(a) Suppose that TT^t = T^tT. Show then that T_C is normal. In this case, we can find a basis β of V_C such that β is orthonormal (with respect to ⟨·, ·⟩_C) and [T_C]_β is diagonal. Define the subspaces
X_1, ..., X_r, Y_1, ..., Y_{2m}
of V as in problem 1, question 3. Show that these are mutually orthogonal; that is, if v, w are in different subspaces then ⟨v, w⟩ = 0.
(b) If T is symmetric, show that there exists an orthonormal basis β of V such that [T]_β is diagonal.
(c) If T is skew-symmetric, what is the form of the matrix of T in real Jordan form?
(d) A ∈ M_{2×2}(R) is called a rotation matrix if there exists θ ∈ [0, 2π) such that
A =
[ cos θ  −sin θ ]
[ sin θ   cos θ ].
If T is orthogonal, show that there exists a basis β of V such that [T]_β is block diagonal, and the blocks are either 2×2 rotation matrices or 1×1 matrices consisting of 1 or −1.
Hint: Use the real Jordan form.