
Math 221a

Gauss
Anil Nerode 2007
We begin the course with the study of the basic Gauss algorithm for solution
of any system of simultaneous linear equations in many variables. Such systems
arise in all branches of applied mathematics as "linear approximations" to non-
linear systems. They are the simplest to study and capture "local" properties
of the corresponding "non-linear system". That is why we study them first.
Also, understanding them will clarify many notions in the calculus of several
variables, which is the normal course succeeding this course.
Gauss's algorithm for solving systems of linear equations
We are interested in solving linear systems of m equations in n unknowns of
the form

a_11 x_1 + ... + a_1n x_n = b_1
  .   .   .   .   .
a_m1 x_1 + ... + a_mn x_n = b_m

by manipulating the matrix

[ a_11 . . a_1n b_1 ]
[   .  . .   .    . ]
[ a_m1 . . a_mn b_m ]
As we do so the matrices take on an inner life of their own!
The three row operations on equations or their matrices.
Type 1. Interchange two rows.
Type 2. Add c times one row to another.
Type 3. Multiply a row by a non-zero constant.
Note that each type of operation has an inverse (an operation which undoes
it) of the same type, and that row operations leave solution sets unchanged.
We give the algorithm before presenting a single example. The reason for this
is that the examples are easy to follow. After looking at them, the student
will say "I can do that", and indeed this is true. At that point, the student
is likely to skip reading the algorithm. But if you were to try to write a
computer program implementing the algorithm, the "I can do that" will not
help a lot. The algorithm below is explicit enough as a program specification
that someone who has read it with understanding can write a program executing
it easily. A huge number of programming errors in industrial settings are due
to not reading and understanding the program specification, ending up with a
program with unintended behavior. So learn to read mathematics accurately,
then go to the examples, then come back and read the mathematics again,
acquiring the skill of puzzling through mathematical statements till you have
full understanding.
Gauss's Algorithm

STAGE 1

1) Find the leftmost column j with a non-zero entry, and choose the upper-
most non-zero entry a in that column.
2) Interchange the row of that entry a with the first row.
3) Multiply the first row by 1/a.
4) Add to each row, other than the first, a suitable multiple of the first row
to produce all zeros in column j below the first row.

STAGE 2

1) Find the leftmost column k with a non-zero entry below the first row,
and choose the uppermost such non-zero entry b.
2) Interchange the row of that entry b with the second row.
3) Multiply the second row by 1/b.
4) Add to each row below the second a suitable multiple of the second row
to produce all zeros in column k below the second row.

LATER STAGES

Continue in the same manner till rows or columns are exhausted.

Note:
At the end of stage 1 we stopped changing the first row and column.
At the end of stage 2 we stopped changing the second row and column.
...

We end up with a so-called echelon form of the original matrix, from
which solutions and their properties can be read off.
Since row operations do not change the solution set, the echelon form has
the same solutions as the original system.
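The staged procedure above is explicit enough to program directly. Here is a
minimal Python sketch, not part of the original notes: it assumes the
augmented matrix is given as a list of rows of numbers, and it mutates that
list into echelon form.

```python
def echelon_form(M):
    """Row reduce M (a list of rows, modified in place) to row echelon form,
    following the staged procedure: leftmost eligible column, swap the
    uppermost non-zero entry up, scale it to 1, clear the column below."""
    m, n = len(M), len(M[0])
    row = 0                                    # first row of the current stage
    for col in range(n):
        if row == m:
            break
        # Step 1: uppermost non-zero entry in this column, at or below `row`.
        pivot = next((r for r in range(row, m) if M[r][col] != 0), None)
        if pivot is None:
            continue                           # column already all zeros here
        M[row], M[pivot] = M[pivot], M[row]    # Step 2: interchange rows
        a = M[row][col]
        M[row] = [x / a for x in M[row]]       # Step 3: multiply the row by 1/a
        for r in range(row + 1, m):            # Step 4: clear the column below
            c = M[r][col]
            if c != 0:
                M[r] = [x - c * y for x, y in zip(M[r], M[row])]
        row += 1
    return M
```

Each pass of the `for col` loop is one stage: after it, the rows and columns
handled so far are never changed again, exactly as in the note above.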
We have specified a specific order for the operations. This is to make grading
problems easy, and to give a wholly deterministic algorithm in which there are
no choices as to what step to take next. Neither you nor I will stick to this
routine, except for homework. We deviate from the Gauss algorithm by not
multiplying by 1/c to make the leading term 1. That step may be omitted if the
only purpose is to find a simple formula for the solutions. Afterwards, one can
multiply each row by the reciprocal of its leftmost term and we will have the
1s on the diagonal of Gauss's own echelon form.
Example. Start out with the equations

        x_2 + x_3 - x_4 = 0
x_1 +   x_2 + x_3 + x_4 = 6
2x_1 + 4x_2 + x_3 - 2x_4 = -1
3x_1 +  x_2 - 2x_3 + 2x_4 = 3
and the corresponding matrix

[ 0 1  1 -1 |  0 ]
[ 1 1  1  1 |  6 ]
[ 2 4  1 -2 | -1 ]
[ 3 1 -2  2 |  3 ]

The leftmost column with a non-zero entry is the first column. The highest
row with a non-zero entry in that column is the second. The entry in the first
column and second row is 1. The first operation is to interchange rows 1 and 2,
to bring this 1 to the top row.

[ 1 1  1  1 |  6 ]
[ 0 1  1 -1 |  0 ]
[ 2 4  1 -2 | -1 ]
[ 3 1 -2  2 |  3 ]

Next add (-2) times the first row to the third row to make the first non-zero
element of the third row zero.

[ 1 1  1  1 |   6 ]
[ 0 1  1 -1 |   0 ]
[ 0 2 -1 -4 | -13 ]
[ 3 1 -2  2 |   3 ]

Next add (-3) times the first row to the fourth row to make the first element
of the fourth row zero.

[ 1  1  1  1 |   6 ]
[ 0  1  1 -1 |   0 ]
[ 0  2 -1 -4 | -13 ]
[ 0 -2 -5 -1 | -15 ]

If we write down the corresponding system of equations, we have arranged it
so that x_1 occurs only in the first equation,

x_1 + x_2 + x_3 + x_4 = 6.
Now, holding the first row and the first column fixed (we have finished our
work there), repeat the same process starting with the second row and second
column. Namely, find the leftmost column with an entry below the first row
that is non-zero, and, if necessary, interchange rows so that it lands in the
second row. It already IS in the second row, so this step is not needed.

Add (-2) times the second row to the third so that the second entry of the
third row is zero.

[ 1  1  1  1 |   6 ]
[ 0  1  1 -1 |   0 ]
[ 0  0 -3 -2 | -13 ]
[ 0 -2 -5 -1 | -15 ]

Add 2 times the second row to the fourth row to make the second entry
of the fourth row zero.

[ 1 1  1  1 |   6 ]
[ 0 1  1 -1 |   0 ]
[ 0 0 -3 -2 | -13 ]
[ 0 0 -3 -3 | -15 ]
At this point, in terms of the equation interpretation, x_1 occurs only in the
first equation, and x_2 occurs only in the first equation and in the second
equation

x_2 + x_3 - x_4 = 0.

Now hold the first two rows and the first two columns fixed, and repeat the
process. Find the leftmost column with a non-zero entry below the first two
rows, and interchange rows so that it is in the third row. This is unnecessary
because the entry already there is non-zero.

Add (-1) times the third row to the fourth to assure that the third element
of the fourth row is zero.

[ 1 1  1  1 |   6 ]
[ 0 1  1 -1 |   0 ]
[ 0 0 -3 -2 | -13 ]
[ 0 0  0 -1 |  -2 ]
We now have that x_1 occurs only in the first equation, x_2 occurs only in the
first two equations, x_3 occurs only in the first three equations, and x_4 is
determined by the fourth equation, -x_4 = -2, that is, x_4 = 2.

The corresponding equations are

x_1 + x_2 + x_3 + x_4 = 6
      x_2 + x_3 - x_4 = 0
          -3x_3 - 2x_4 = -13
                 -x_4 = -2
Working from the bottom up: x_4 = 2; then -3x_3 = -13 + 2x_4 = -9, so
x_3 = 3; then x_2 = x_4 - x_3 = -1; and finally x_1 = 6 - x_2 - x_3 - x_4 = 2.
That is,

[ x_1 ]   [  2 ]
[ x_2 ] = [ -1 ]
[ x_3 ]   [  3 ]
[ x_4 ]   [  2 ]
This process is called back substitution.
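Back substitution itself is short enough to sketch in Python. This is a sketch
of my own (not the notes' notation), assuming a square upper-triangular
augmented system with non-zero diagonal entries.

```python
def back_substitute(U):
    """Solve an upper triangular augmented system U (n rows, n + 1 columns,
    non-zero diagonal entries) from the bottom row up."""
    n = len(U)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        # every unknown to the right of the diagonal is already known
        known = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (U[i][n] - known) / U[i][i]
    return x
```

Each step moves the already-solved unknowns to the right-hand side and divides
by the diagonal coefficient, which is exactly the hand computation above.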
But solutions are not always unique. We discuss a couple of row reduced
systems without unique solutions.
Example

[ 1 4 | 2 ]
[ 0 1 | 3 ]
[ 0 0 | 1 ]

We interpret this as representing three equations in two unknowns:

x_1 + 4x_2 = 2
       x_2 = 3
         0 = 1

If you row reduced a system and got this as the row reduced form, the last
equation shows there are no solutions whatever.
Example

[ 1 3 0 | 0 ]
[ 0 0 1 | 3 ]
[ 0 0 0 | 0 ]

This is also row reduced. We interpret this as representing three equations
in three unknowns:

x_1 + 3x_2 = 0
       x_3 = 3
         0 = 0

If you row reduced a system and got this,
the third equation is unneeded,
the second specifies that x_3 = 3,
the first says that x_1 = -3x_2.

So we conclude that for any value of x_2, we get one and only one solution,
with x_3 = 3 and x_1 = -3x_2. That is, the "general solution" is

[ -3x_2 ]
[  x_2  ]
[   3   ]

Geometrically this says the solution set consists of all points on a line in
the plane x_3 = 3.
If we set x_2 = 1, we see that

[ -3 ]
[  1 ]
[  3 ]

lies on the line of solutions. If we set x_2 = 0, we see that

[ 0 ]
[ 0 ]
[ 3 ]

lies on that line. A line is determined by two points. The set of all
solutions is the set of all points in Euclidean space on the line through
(-3, 1, 3) and (0, 0, 3).
This is a convenient time to introduce vector notation.

A 3-vector is a column of numbers

[ u ]
[ v ]
[ w ]

It is pictured, following Newton, as a directed line segment stretching from
the origin (0, 0, 0) to the point (u, v, w).

Scalar multiplication: If c is a particular number (scalar), the product

  [ u ]               [ cu ]
c [ v ] is defined as [ cv ]
  [ w ]               [ cw ]

For instance,

  [ 6 ]   [ 30 ]
5 [ 7 ] = [ 35 ]
  [ 8 ]   [ 40 ]

In the geometric interpretation, scalar multiplication by c stretches out the
line segment from the origin to (u, v, w) by a factor c, in the direction
indicated by the sign of c.
Vector addition:

[ u ]   [ x ]   [ u + x ]
[ v ] + [ y ] = [ v + y ]
[ w ]   [ z ]   [ w + z ]

For instance,

[ 1 ]   [ 4 ]   [ 5 ]
[ 2 ] + [ 5 ] = [ 7 ]
[ 3 ]   [ 6 ]   [ 9 ]

In the geometric interpretation, vector addition is interpreted by Newton's
parallelogram law. One completes the parallelogram which has as two of its
sides the line segments from the origin to (u, v, w) and to (x, y, z)
respectively. The corner point opposite to the origin in this parallelogram is
(u + x, v + y, w + z), corresponding to the vector sum of the two vectors.
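Both operations act componentwise, so they are one line each in code. A small
Python sketch (the function names are my own):

```python
def scale(c, v):
    """The scalar multiple c*v, computed componentwise."""
    return [c * x for x in v]

def add(v, w):
    """The vector sum v + w, computed componentwise."""
    return [x + y for x, y in zip(v, w)]

# The two worked examples above:
assert scale(5, [6, 7, 8]) == [30, 35, 40]
assert add([1, 2, 3], [4, 5, 6]) == [5, 7, 9]
```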
For the problem just given, the general solution can be written

[ -3x_2 ]   [ 0 ]       [ -3 ]
[  x_2  ] = [ 0 ] + x_2 [  1 ]
[   3   ]   [ 3 ]       [  0 ]

The general solution depends on one parameter, x_2.

That is, every solution is the vector sum of the specific solution (0, 0, 3)
plus some (any) scalar multiple of the fixed vector (-3, 1, 0).

The solutions form the unique line in the plane containing the two points
(0, 0, 3) and (-3, 1, 3). (Why?)
Example

[ 1 1 1 1 0 | 1 ]
[ 0 0 0 1 0 | 2 ]
[ 0 0 0 0 1 | 1 ]

This is already row reduced. The corresponding equations are

x_1 + x_2 + x_3 + x_4 = 1
                  x_4 = 2
                  x_5 = 1

We see that the solutions are given by

x_5 = 1,
x_4 = 2,
x_1 = 1 - x_2 - x_3 - x_4 = -1 - x_2 - x_3,

where x_2, x_3 can have any values. The general solution is

[ -1 - x_2 - x_3 ]
[       x_2      ]
[       x_3      ]
[        2       ]
[        1       ]
This can be written

[ -1 ]   [ -x_2 - x_3 ]
[  0 ]   [     x_2    ]
[  0 ] + [     x_3    ]
[  2 ]   [      0     ]
[  1 ]   [      0     ]

or as

[ -1 ]       [ -1 ]       [ -1 ]
[  0 ]       [  1 ]       [  0 ]
[  0 ] + x_2 [  0 ] + x_3 [  1 ]
[  2 ]       [  0 ]       [  0 ]
[  1 ]       [  0 ]       [  0 ]

This solution depends on two independent parameters, x_2 and x_3.
We say we are working in the 5-space of column vectors (a, b, c, d, e). In
that space the solutions form a plane through the three points

[ -1 ]   [ -1 ]   [ -1 ]        [ -1 ]   [ -1 ]
[  0 ]   [  0 ]   [  1 ]        [  0 ]   [  0 ]
[  0 ] , [  0 ] + [  0 ] , and  [  0 ] + [  1 ]
[  2 ]   [  2 ]   [  0 ]        [  2 ]   [  0 ]
[  1 ]   [  1 ]   [  0 ]        [  1 ]   [  0 ]
Gauss-Jordan Algorithm

Reduced Echelon Form

Let A be in row echelon form.
1) Go to the leftmost non-zero column j of A. In that column find the largest
index i with a_ij a leading 1, and subtract suitable multiples of that row
from all previous rows to make the entries in that column above a_ij equal
to 0.
2) Repeat until every column with a leading 1 has zeros above and below
that 1.
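The clearing pass can be sketched in Python. This sketch (mine, not the
notes') assumes the input is already in row echelon form with leading entries
equal to 1, as the algorithm requires.

```python
def reduced_echelon(E):
    """Given E already in row echelon form with leading entries 1, clear the
    entries above each leading 1, working from the lowest leading 1 upward."""
    m, n = len(E), len(E[0])
    for i in range(m - 1, 0, -1):
        # position of this row's leading 1 (None for an all-zero row)
        lead = next((j for j in range(n) if E[i][j] != 0), None)
        if lead is None:
            continue                          # a zero row: nothing to clear
        for r in range(i):                    # subtract multiples from rows above
            c = E[r][lead]
            if c != 0:
                E[r] = [x - c * y for x, y in zip(E[r], E[i])]
    return E
```

Applied to the first example below, this carries [1 1 5; 0 1 4; 0 0 1] to the
identity matrix, matching the hand computation.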
Example

Start with

[ 1 1 5 ]
[ 0 1 4 ]
[ 0 0 1 ]

Add (-1) times the second row to the first row, getting

[ 1 0 1 ]
[ 0 1 4 ]
[ 0 0 1 ]

Add (-4) times the third row to the second, getting

[ 1 0 1 ]
[ 0 1 0 ]
[ 0 0 1 ]

Add (-1) times the third row to the first, getting

[ 1 0 0 ]
[ 0 1 0 ]
[ 0 0 1 ]

This is in reduced row echelon form.
Example

Start with

[ 1 0 5 2 ]
[ 0 0 1 2 ]
[ 0 0 0 1 ]

Add (-5) times the second row to the first, getting

[ 1 0 0 -8 ]
[ 0 0 1  2 ]
[ 0 0 0  1 ]

Add 8 times the third row to the first, getting

[ 1 0 0 0 ]
[ 0 0 1 2 ]
[ 0 0 0 1 ]

Add (-2) times the third row to the second, getting

[ 1 0 0 0 ]
[ 0 0 1 0 ]
[ 0 0 0 1 ]
Example.

No matter what values we choose for a, b, c, d, the following matrices are
already in reduced row echelon form.

[ 1 0 0 a ]   [ 1 0 a 0 ]   [ 1 0 a c ]   [ 1 a 0 0 ]
[ 0 1 0 b ] , [ 0 1 a 0 ] , [ 0 1 b d ] , [ 0 0 1 0 ]
[ 0 0 1 c ]   [ 0 0 0 1 ]   [ 0 0 0 0 ]   [ 0 0 0 1 ]
Matrix algebra

A matrix is an m by n array of numbers: m rows, n columns. We write
such a matrix as A = (a_ij). The index i runs from 1 to m, indexing rows, and
the index j runs from 1 to n, indexing columns. We refer to this matrix as an
m by n matrix, or a matrix of size (m, n). Here are two 3 by 6 matrices:

[ a_11 a_12 a_13 a_14 a_15 a_16 ]
[ a_21 a_22 a_23 a_24 a_25 a_26 ]
[ a_31 a_32 a_33 a_34 a_35 a_36 ]

[ b_11 b_12 b_13 b_14 b_15 b_16 ]
[ b_21 b_22 b_23 b_24 b_25 b_26 ]
[ b_31 b_32 b_33 b_34 b_35 b_36 ]
Two m by n matrices can be added, entry by entry:

[ a_11 ... a_16 ]   [ b_11 ... b_16 ]   [ a_11 + b_11 ... a_16 + b_16 ]
[ a_21 ... a_26 ] + [ b_21 ... b_26 ] = [ a_21 + b_21 ... a_26 + b_26 ]
[ a_31 ... a_36 ]   [ b_31 ... b_36 ]   [ a_31 + b_31 ... a_36 + b_36 ]
They can be scalar multiplied:

  [ a_11 ... a_16 ]   [ ca_11 ... ca_16 ]
c [ a_21 ... a_26 ] = [ ca_21 ... ca_26 ]
  [ a_31 ... a_36 ]   [ ca_31 ... ca_36 ]
Row by Column Matrix Multiplication.

Given a row vector (a_1 ... a_n) of length n and a column vector
(b_1, ..., b_n) of length n, we define their product to be the number
a_1 b_1 + ... + a_n b_n. Thus

            [ 4 ]
( 1 2 3 )   [ 5 ]  =  1*4 + 2*5 + 3*6 = 4 + 10 + 18 = 32.
            [ 6 ]
Matrix Product

If A, B are matrices, we only form the matrix product AB in case A is m
by n and B is n by p, that is, only in case the rows of A are the same length
as the columns of B. In that case, the product AB is the m by p matrix whose
ij-th entry is the product of the i-th row of A by the j-th column of B.

Example

[ a b c ] [ g h ]   [ ag + bi + ck  ah + bj + cl ]
[ d e f ] [ i j ] = [ dg + ei + fk  dh + ej + fl ]
          [ k l ]

[ a b ] [ e f ]   [ ae + bg  af + bh ]
[ c d ] [ g h ] = [ ce + dg  cf + dh ]

Square matrices of the same size can be multiplied in either order, but the
results are usually different.

[ e f ] [ a b ]   [ ea + fc  eb + fd ]
[ g h ] [ c d ] = [ ga + hc  gb + hd ]

Here is a numerical example.

[ 1 2 ] [ 5 6 ]   [ 19 22 ]
[ 3 4 ] [ 7 8 ] = [ 43 50 ]

[ 5 6 ] [ 1 2 ]   [ 23 34 ]
[ 7 8 ] [ 3 4 ] = [ 31 46 ]
Thus matrices of the same size can be added and scalar multiplied:

(a_ij) + (b_ij) = (a_ij + b_ij)
c(a_ij) = (c a_ij).

The set M_mn of all m by n matrices is closed under these two operations.

We define the product AB = (a_ij)(b_jk) of two matrices A, B only in case
A is m by n and B is n by p, that is, only in case the number of columns of A
is equal to the number of rows of B. Here is the definition using summation
notation:

AB = (a_ij)(b_jk) = (a_i1 b_1k + ... + a_in b_nk) = (Σ_{j=1}^{n} a_ij b_jk) = (Σ_j a_ij b_jk)
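The summation formula translates directly into code. A Python sketch of my
own, assuming matrices are given as lists of rows:

```python
def matmul(A, B):
    """The product of an m by n matrix A and an n by p matrix B:
    the ik-th entry is the sum over j of A[i][j] * B[j][k]."""
    m, n, p = len(A), len(B), len(B[0])
    assert all(len(row) == n for row in A), "rows of A must match columns of B"
    return [[sum(A[i][j] * B[j][k] for j in range(n)) for k in range(p)]
            for i in range(m)]
```

On the numerical example above, `matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])`
returns `[[19, 22], [43, 50]]`, while multiplying in the other order returns
`[[23, 34], [31, 46]]`.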
Here are the laws that matrix multiplication obeys.

A(B + C) = AB + AC
(A + B)C = AC + BC
c(AB) = (cA)B = A(cB)

The hardest law to verify is the

Associative law for matrix multiplication

A(BC) = (AB)C
Proof.

Let A = (a_ij), B = (b_jk), C = (c_kl).

The ik-th entry of AB is Σ_j a_ij b_jk. The il-th entry of (AB)C is

Σ_k (Σ_j a_ij b_jk) c_kl = Σ_k (Σ_j a_ij b_jk c_kl).

The jl-th entry of BC is Σ_k b_jk c_kl. The il-th entry of A(BC) is

Σ_j a_ij (Σ_k b_jk c_kl) = Σ_j (Σ_k a_ij b_jk c_kl).

But these are the same terms added in a different order. So, due to the
commutative and associative laws for numbers,

A(BC) = (AB)C.
Einstein Convention

The Einstein convention regards any repeated index as being summed over.
He introduced writing

AB = (a_ij b_jk).

Physicists and engineers use this convention; most mathematicians use the
summation sign Σ.

Einstein would have written this proof as

(AB)C = (a_ij b_jk) c_kl = a_ij (b_jk c_kl) = A(BC),

leaving the explicit use of the associative and distributive laws for numbers
to the imagination.
The n by n identity matrix is the matrix I_n = (a_ij) with a_ij = 1 if i = j
and a_ij = 0 otherwise.

Example

[ 1 0 0 ]
[ 0 1 0 ] = I_3
[ 0 0 1 ]

These matrices satisfy:

I_n A = A if A has n rows.
A I_n = A if A has n columns.

Finally, the transpose A^t of A = (a_ij) is (a_ji), obtained by interchanging
the rows and columns of A.

The properties of transpose are

(A + B)^t = A^t + B^t
(aA)^t = a(A^t)
(AB)^t = B^t A^t
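The last law can be checked numerically. A small Python sketch (the helper
names are mine, not from the notes):

```python
def transpose(M):
    """Interchange the rows and columns of M."""
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    # row-by-column products, as defined earlier in the notes
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A, B = [[1, 2], [3, 4]], [[5, 6], [7, 8]]
# (AB)^t and B^t A^t agree:
assert transpose(matmul(A, B)) == matmul(transpose(B), transpose(A))
```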
Elementary Row Matrices

If I is an identity matrix and ρ is a row operation, then ρI is the row
matrix corresponding to that row operation.

Example: If

I = [ 1 0 0 ]
    [ 0 1 0 ]
    [ 0 0 1 ]

and ρ is the row operation of adding 3 times the first row to the second, then

ρI = [ 1 0 0 ]
     [ 3 1 0 ]
     [ 0 0 1 ]

is the corresponding row matrix.

Lemma: ρ(AB) = (ρA)B.

Proof. This is straightforward. Try a few examples.

That is, a row operation applied to a product AB of matrices gives the
same result as applying the row operation to the first matrix A and then
premultiplying B by the result.

When A is the identity matrix this says:

To apply a row operation ρ to a matrix, premultiply by the corresponding row
matrix. That is, ρA = (ρI)A.
Definition. A square matrix A is called invertible if there is a matrix B such
that AB = BA = I.

Since each elementary row operation has an inverse row operation, it follows
that every elementary row matrix has an inverse matrix, which is the row matrix
for the inverse operation.

Corollary. A row reduction by a series of elementary row operations can
be carried out by premultiplying by the corresponding row matrices in the same
order.
The product of invertible matrices is invertible too. The inverse is the
product of the corresponding inverses in the opposite order:

(AB)^{-1} = B^{-1} A^{-1}.

Combining all this, we conclude that if a square matrix can be row reduced
to the identity matrix, it has an inverse: the product of the row matrices
corresponding to the row operations used, in the same order.
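The lemma can be tried out in code, as its proof suggests. A Python sketch of
my own: build the row matrix for the example's Type 2 operation and check that
premultiplying agrees with applying the operation directly.

```python
def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def add_multiple(M, c, src, dst):
    """The Type 2 row operation: add c times row `src` to row `dst`."""
    out = [row[:] for row in M]
    out[dst] = [x + c * y for x, y in zip(out[dst], out[src])]
    return out

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

# The row matrix of the example: add 3 times row 1 to row 2 (0-indexed here).
E = add_multiple(identity(3), 3, 0, 1)
```

For any 3-row matrix A, `matmul(E, A)` equals `add_multiple(A, 3, 0, 1)`,
which is the statement ρA = (ρI)A.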
Lemma. Let C be the row echelon form of A. Let P be the invertible
matrix such that PA = C. (This P is the product of the row matrices reducing
A to C.) Then for any column vector B, the solutions X to AX = B are the
same as the solutions X to CX = PB.

Proof of Lemma: This is because the row operations applied to AX = B,
regarded as a set of non-homogeneous equations, do not change the solution
set.

Main Theorem.

Let A be an n by n matrix. Then the following two conditions are equivalent:

1) For every column matrix B, there exists a column matrix X such that
AX = B.
2) The row echelon form of A is the identity matrix.

To prove this we have to show that 2) implies 1) and that 1) implies 2).

2) implies 1)

Proof. By the lemma above, if the row echelon form C is the identity, then
CX = PB reads X = PB. So AX = B has the same solutions as X = PB, namely
X = PB.

1) implies 2)

Proof. This is the same as proving "(not 2) implies (not 1)". That is,
suppose that the row echelon form C = PA of A is not the identity matrix.
Then we must prove that there is a column vector B such that the equation
AX = B has no solution X. But this is the same, according to the lemma above,
as showing there is a column vector B such that CX = PB has no solution X.

Since C is square and row echelon and not the identity, the row echelon
process implies the last row is all zeros. (Why?)

Let D be any column vector with bottom entry non-zero, all other entries
zero.

The system CX = D has no solutions, since in the corresponding system of
equations the last equation has left side zero while the right side is
non-zero. We set B = P^{-1}D, and conclude that CX = PB = D has no solution X,
and therefore that AX = B has no solution X.
Corollary. If an n by n matrix A has a right inverse B, then B is also a left
inverse of A.

Proof.

B is called a right inverse of A if AB = I. We have to show that B is a left
inverse of A, namely that BA = I.

Now AB = I implies that for any column vector C, A(BC) = (AB)C = IC = C.
So AX = C always has a solution, X = BC.

Apply the theorem above, and conclude that A can be row reduced to the
identity matrix I. Row reduction implies there is an invertible matrix P such
that PA = I. Then

B = IB = (PA)B = P(AB) = PI = P.

So BA = PA = I as required.

Corollary. If an n by n matrix A has a left inverse B, then B is also a right
inverse of A.

Proof. If B is a left inverse of A, then BA = I. Then B has A as a right
inverse. By the previous corollary, A is also a left inverse of B. So AB = I.
This makes B a right inverse of A.

Proposition. A matrix A has at most one inverse which is two sided (both
left sided and right sided).

Proof. This is an exercise in the associative law. Suppose that both B and
C are two sided inverses of A. Then

AB = BA = I and AC = CA = I.

Then B = IB = (CA)B = C(AB) = CI = C.
Theorem. Let A be an n by n matrix. Then A row reduces (in reduced
echelon form) to the identity matrix if and only if A is invertible.

Proof:

If A has row echelon form the identity matrix, and P is the product
of the elementary row matrices used to reduce A to I, then PA = I. We have
shown above that if A has a left inverse P, then P is also a right inverse. So
A is invertible, with inverse P.

Conversely, if A is invertible, then for any B, AX = B can be solved
by X = A^{-1}B. So by the theorem above, A row reduces to the identity matrix.

Corollary. Every invertible matrix A is a product of elementary row
matrices.

Proof. If the row echelon form of A is the identity matrix, then

I = (ρ_k I) ... (ρ_1 I) A,

using the indicated row operations ρ_1, ..., ρ_k. Then A is invertible with
inverse (ρ_k I) ... (ρ_1 I). Also

A = (ρ_1 I)^{-1} ... (ρ_k I)^{-1}.

These inverses are elementary row matrices too. So A is a product of row
matrices.

Comment. This says that the concept of an invertible matrix is not all that
complicated; applying an invertible matrix to a vector amounts to applying a
series of row matrices to that vector.
Dependence, Independence, Spanning, Bases, Dimension

Many fundamental theorems about R^n can be derived from the main theorem
we established in the last section.

Definition. Vector v is a linear combination of vectors v_1, ..., v_k if there
exist scalars c_1, ..., c_k such that v = c_1 v_1 + ... + c_k v_k.

Definition. Vectors v_1, ..., v_k span R^n if for every vector v in R^n, v is
a linear combination of v_1, ..., v_k.

Definition. Vectors v_1, ..., v_k are independent (also called linearly
independent) if for all scalars c_1, ..., c_k, if c_1 v_1 + ... + c_k v_k = 0,
then c_1 = ... = c_k = 0.

Definition. Vectors v_1, ..., v_k are dependent (also called linearly
dependent) if there exist scalars c_1, ..., c_k, not all zero, such that
c_1 v_1 + ... + c_k v_k = 0.

Definition. Vectors v_1, ..., v_k are a basis for R^n if they are independent
and span R^n.

Theorem. Let v_1, ..., v_n be n vectors in R^n. Then v_1, ..., v_n are
independent if and only if v_1, ..., v_n span R^n.
Proof. Let

      [ a_11 ]             [ a_1n ]
v_1 = [   .  ] , ..., v_n = [   .  ]
      [ a_n1 ]             [ a_nn ]

Then v_1, ..., v_n span R^n if and only if for any vector B = (b_1, ..., b_n)
there exist numbers x_1, ..., x_n such that

    [ a_11 ]             [ a_1n ]   [ b_1 ]
x_1 [   .  ] + ... + x_n [   .  ] = [  .  ]
    [ a_n1 ]             [ a_nn ]   [ b_n ]

This is the same as

a_11 x_1 + ... + a_1n x_n = b_1
  .   .   .   .   .
a_n1 x_1 + ... + a_nn x_n = b_n

If

A = [ a_11 a_12 ... a_1n ]
    [   .    .        .  ]
    [ a_n1   .  ... a_nn ]

we see that v_1, ..., v_n span R^n if and only if for every vector B, there is
a vector X such that AX = B.

Now apply the "main theorem" of the previous section. Condition (1) of
that theorem has just been verified here. So condition (2) also holds. It says
that AX = 0 implies X = 0. But AX = 0 is equivalent to

a_11 x_1 + ... + a_1n x_n = 0
  .   .   .   .   .
a_n1 x_1 + ... + a_nn x_n = 0

which is equivalent to

    [ a_11 ]             [ a_1n ]   [ 0 ]
x_1 [   .  ] + ... + x_n [   .  ] = [ . ]
    [ a_n1 ]             [ a_nn ]   [ 0 ]

Thus condition (2) has been decoded. It says that whenever

    [ a_11 ]             [ a_1n ]   [ 0 ]
x_1 [   .  ] + ... + x_n [   .  ] = [ . ]
    [ a_n1 ]             [ a_nn ]   [ 0 ]

then x_1 = x_2 = ... = x_n = 0. This says that v_1, ..., v_n are independent.
Corollary. In R^n any n + 1 vectors are dependent.

Proof. We give a proof by contradiction. Suppose that v_1, ..., v_{n+1} were
independent. Then v_1, ..., v_n would be independent. By the theorem just
proved, v_1, ..., v_n span R^n. Therefore v_{n+1} is a linear combination of
v_1, ..., v_n. That is, there exist scalars c_1, ..., c_n such that
v_{n+1} = c_1 v_1 + ... + c_n v_n. So c_1 v_1 + ... + c_n v_n + (-1)v_{n+1} = 0.
So v_1, ..., v_{n+1} are dependent, a contradiction. So no independent
v_1, ..., v_{n+1} exist. It follows that in R^n there is no sequence of
distinct vectors of length greater than n that is independent. (Why?)
Corollary. No sequence v_1, ..., v_{n-1} of n - 1 vectors from R^n can span
R^n.

Proof. We give a proof by contradiction. Suppose that v_1, ..., v_{n-1} span
R^n. Cross out from the list v_1, ..., v_{n-1} the first entry that is a
linear combination of the other entries. This leaves us with a spanning set.
(Why?) Do this repeatedly until no element in the sequence is a linear
combination of the rest. The remaining sequence is independent. (Why?) It is
still a spanning set. It is now an independent spanning set. We have already
shown that every independent spanning set has n elements. Therefore
v_1, ..., v_{n-1} do not span R^n. Thus no set of fewer than n vectors spans
R^n.
Corollary. Every basis of R^n has precisely n members.

Proof. We have shown that all independent spanning sets in R^n have
exactly n elements, neither more nor fewer.

Definition. n is called the dimension (or linear dimension) of R^n.
Tests for Dependence, Independence, Spanning

Here are algorithms for testing for dependence, independence, and spanning.
They all reduce to writing the problem as one about solutions of systems of
linear equations. They are answered by row reduction.

Independence, Dependence

Let v_1, ..., v_k be the columns

[ a_11 ]        [ a_1k ]
[   .  ] , ..., [   .  ]
[ a_n1 ]        [ a_nk ]

We want to determine whether or not v_1, ..., v_k are independent. This is
the same as asking whether or not

    [ a_11 ]             [ a_1k ]   [ 0 ]
x_1 [   .  ] + ... + x_k [   .  ] = [ . ]
    [ a_n1 ]             [ a_nk ]   [ 0 ]

has only the one solution x_1 = ... = x_k = 0. This is the same as asking
whether or not

a_11 x_1 + ... + a_1k x_k = 0
  .   .   .   .   .
a_n1 x_1 + ... + a_nk x_k = 0

has only the solution x_1 = ... = x_k = 0. Row reduce and find out.
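"Row reduce and find out" can be mechanized: row reduce the matrix whose
columns are the given vectors and count the pivots; the vectors are
independent exactly when every column gets a pivot. A Python sketch (the
function name and the small tolerance are my choices, not from the notes):

```python
def are_independent(vectors):
    """Decide whether the given vectors (each a list of n numbers) are
    linearly independent, by row reducing the matrix having them as columns
    and checking that every column receives a pivot."""
    k, n = len(vectors), len(vectors[0])
    M = [[float(v[i]) for v in vectors] for i in range(n)]  # vectors as columns
    pivots = 0
    for col in range(k):
        r = next((r for r in range(pivots, n) if abs(M[r][col]) > 1e-12), None)
        if r is None:
            continue      # a free column: some non-trivial combination is zero
        M[pivots], M[r] = M[r], M[pivots]
        for rr in range(pivots + 1, n):
            c = M[rr][col] / M[pivots][col]
            M[rr] = [x - c * y for x, y in zip(M[rr], M[pivots])]
        pivots += 1
    return pivots == k
```

In exact arithmetic the tolerance test would be a plain comparison with zero;
in floating point some tolerance is needed.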
Spanning.

Let v_1, ..., v_k be the columns

[ a_11 ]        [ a_1k ]
[   .  ] , ..., [   .  ]
[ a_n1 ]        [ a_nk ]

We wish to determine whether or not v_1, ..., v_k span R^n, that is, whether
or not for every column vector B = (b_1, ..., b_n) there exist scalars
x_1, ..., x_k such that B = x_1 v_1 + ... + x_k v_k.

Setting

A = [ a_11 a_12 ... a_1k ]
    [   .    .        .  ]
    [ a_n1   .  ... a_nk ]

this is equivalent to asking whether or not AX = B has a solution X for every
B. This is the same as asking whether or not the system

a_11 x_1 + ... + a_1k x_k = b_1
  .   .   .   .   .
a_n1 x_1 + ... + a_nk x_k = b_n

has a solution for every right-hand side. Row reduce and find out.
Direct Algorithm for Matrix Inversion by Row Reduction

Let A be an n by n matrix, and let I be the n by n identity matrix.
Form the n by 2n matrix (A : I) with A simply placed to the left of I.
If this matrix row reduces to a reduced row echelon matrix of the form
(I : C), then A is invertible with inverse C. Otherwise, A is not invertible.
Example

[ 4 0 5 | 1 0 0 ]
[ 0 1 6 | 0 1 0 ]
[ 3 0 4 | 0 0 1 ]

Multiply the first row by 1/4:

[ 1 0 5/4 | 1/4 0 0 ]
[ 0 1  6  |  0  1 0 ]
[ 3 0  4  |  0  0 1 ]

Add (-3) times the first row to the third:

[ 1 0 5/4 |  1/4 0 0 ]
[ 0 1  6  |   0  1 0 ]
[ 0 0 1/4 | -3/4 0 1 ]

Multiply the third row by 4:

[ 1 0 5/4 | 1/4 0 0 ]
[ 0 1  6  |  0  1 0 ]
[ 0 0  1  | -3  0 4 ]

Add (-5/4) times the third row to the first:

[ 1 0 0 |  4 0 -5 ]
[ 0 1 6 |  0 1  0 ]
[ 0 0 1 | -3 0  4 ]

Add (-6) times the third row to the second:

[ 1 0 0 |  4 0  -5 ]
[ 0 1 0 | 18 1 -24 ]
[ 0 0 1 | -3 0   4 ]

So

[ 4 0 5 ]^{-1}   [  4 0  -5 ]
[ 0 1 6 ]      = [ 18 1 -24 ]
[ 3 0 4 ]        [ -3 0   4 ]
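The (A : I) procedure can be sketched in Python. This is a sketch of my own,
pivoting on the uppermost non-zero entry as in the notes and working in
floating point:

```python
def inverse(A):
    """Invert the square matrix A by row reducing the augmented matrix (A : I).
    Returns None when A is not invertible (some pivot cannot be found)."""
    n = len(A)
    M = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None                  # row reduction stops short of (I : C)
        M[col], M[pivot] = M[pivot], M[col]
        a = M[col][col]
        M[col] = [x / a for x in M[col]]          # scale the pivot to 1
        for r in range(n):
            if r != col and M[r][col] != 0:       # clear the rest of the column
                c = M[r][col]
                M[r] = [x - c * y for x, y in zip(M[r], M[col])]
    return [row[n:] for row in M]                 # the right half is C
```

On the 3 by 3 example above this reproduces the inverse found by hand.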
Example

Suppose the reduced echelon matrix is

[ 1 0 0  2 -1 | 1 ]
[ 0 1 0 -1  1 | 2 ]
[ 0 0 1  3  6 | 3 ]

with corresponding system of non-homogeneous equations

x_1 + 2x_4 - x_5 = 1
x_2 - x_4 + x_5 = 2
x_3 + 3x_4 + 6x_5 = 3

The columns without a leading 1 are the fourth, fifth, and sixth. These
correspond to the variables x_4, x_5 and to the right constant column, in
terms of which we express x_1, x_2, x_3 in writing out the general solution:

x_1 = -2x_4 + x_5 + 1
x_2 = x_4 - x_5 + 2
x_3 = -3x_4 - 6x_5 + 3

meaning by this that any values of x_4, x_5 can be assigned and the
corresponding values of x_1, x_2, x_3 give a solution.
In vector form, the general solution is

[ x_1 ]   [ 1 - 2x_4 + x_5  ]       [ -2 ]       [  1 ]   [ 1 ]
[ x_2 ]   [ 2 + x_4 - x_5   ]       [  1 ]       [ -1 ]   [ 2 ]
[ x_3 ] = [ 3 - 3x_4 - 6x_5 ] = x_4 [ -3 ] + x_5 [ -6 ] + [ 3 ]
[ x_4 ]   [       x_4       ]       [  1 ]       [  0 ]   [ 0 ]
[ x_5 ]   [       x_5       ]       [  0 ]       [  1 ]   [ 0 ]
Theorem. Suppose that AX = B is a matrix equation with fixed A, B, and
suppose X_0 is a particular solution to AX = B. Then the solutions of AX = B
are all of the form X_0 + Y, where Y is a solution of the (homogeneous)
system AY = 0. These are the only solutions.
Proof.

If Y is a solution to AY = 0, then by the distributive law for matrix
multiplication,

A(X_0 + Y) = AX_0 + AY = AX_0 + 0 = AX_0 = B.

Conversely, suppose that X is any solution to AX = B. Set Y = X - X_0. By the
same law,

AY = A(X - X_0) = AX - AX_0 = B - B = 0.
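The theorem can be checked numerically on the example of the preceding pages.
A Python sketch (matrix and vectors transcribed from that example, with the
homogeneous solution taken at x_4 = x_5 = 1):

```python
def matvec(A, x):
    """A times the column vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

A  = [[1, 0, 0, 2, -1],
      [0, 1, 0, -1, 1],
      [0, 0, 1, 3, 6]]
B  = [1, 2, 3]
X0 = [1, 2, 3, 0, 0]          # a particular solution of AX = B
Y  = [-1, 0, -9, 1, 1]        # a homogeneous solution: AY = 0
```

Then `matvec(A, X0)` gives B, `matvec(A, Y)` gives the zero vector, and their
sum X0 + Y again solves AX = B, as the theorem asserts.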
The example above bears this out.

The expression

    [ -2 ]       [  1 ]
    [  1 ]       [ -1 ]
x_4 [ -3 ] + x_5 [ -6 ]
    [  1 ]       [  0 ]
    [  0 ]       [  1 ]

is the general solution to

[ 1 0 0  2 -1 ]   [ x_1 ]   [ 0 ]
[ 0 1 0 -1  1 ]   [ x_2 ] = [ 0 ]
[ 0 0 1  3  6 ]   [ x_3 ]   [ 0 ]
                  [ x_4 ]
                  [ x_5 ]
The vector

[ 1 ]
[ 2 ]
[ 3 ]
[ 0 ]
[ 0 ]

is a particular solution to

[ 1 0 0  2 -1 ]   [ x_1 ]   [ 1 ]
[ 0 1 0 -1  1 ]   [ x_2 ] = [ 2 ]
[ 0 0 1  3  6 ]   [ x_3 ]   [ 3 ]
                  [ x_4 ]
                  [ x_5 ]