CLA Week3

Matrix norms
February 7, 2020
1 Induced Norms
Theorem 1.1. If $\|\cdot\|$ is an induced norm on $\mathbb{R}^{n\times n}$, then $\|Ax\| \le \|A\|\,\|x\|$ for all $x \in \mathbb{R}^n$, and this inequality is sharp.
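Proof. We know that
$$\|A\| = \max_{x \ne 0} \frac{\|Ax\|}{\|x\|}.$$
Therefore, $\|A\| \ge \frac{\|Ax\|}{\|x\|}$ for all $x \in \mathbb{R}^n$ with $\|x\| \ne 0$, which implies $\|Ax\| \le \|A\|\,\|x\|$ (as $\|x\|$ is positive valued). For $x = 0$ both sides are zero.
Let $x_{\max} \in \mathbb{R}^n$ be such that $\|A\| = \frac{\|Ax_{\max}\|}{\|x_{\max}\|}$.
$\implies \|A\|\,\|x_{\max}\| = \|Ax_{\max}\|$ $\therefore$ the inequality is sharp.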
Theorem 1.2. Induced norms are matrix norms.
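Proof. 1. If $\|A\| = 0$, then $A = 0$.
$\|A\| = 0 \implies \max_{x \ne 0} \frac{\|Ax\|}{\|x\|} = 0$. Since the maximum of a set of non-negative values is zero, all elements in the set must be zero. $\implies \frac{\|Ax\|}{\|x\|} = 0$ for all $x \ne 0$ $\implies \|Ax\| = 0$
$\implies Ax = 0$ for all $x \ne 0$, which is only possible when $A = 0$. Hence proved.
2. $\|\lambda A\| = |\lambda|\,\|A\|$
$$\|\lambda A\| = \max_{x \ne 0} \frac{\|\lambda Ax\|}{\|x\|} = |\lambda| \max_{x \ne 0} \frac{\|Ax\|}{\|x\|} = |\lambda|\,\|A\|.$$
Hence proved.
3. $\|A + B\| \le \|A\| + \|B\|$
$$\|A + B\| = \max_{x \ne 0} \frac{\|(A+B)x\|}{\|x\|} \le \max_{x \ne 0} \frac{\|Ax\| + \|Bx\|}{\|x\|} \le \max_{x \ne 0} \frac{\|Ax\|}{\|x\|} + \max_{x \ne 0} \frac{\|Bx\|}{\|x\|} = \|A\| + \|B\|$$
$\implies \|A + B\| \le \|A\| + \|B\|$. Hence proved. Therefore, an induced norm is a norm.
Now we need to show that,
4. $\|AB\| \le \|A\|\,\|B\|$
$$\|AB\| = \max_{x \ne 0} \frac{\|ABx\|}{\|x\|} \le \max_{x \ne 0} \frac{\|A\| \cdot \|Bx\|}{\|x\|} = \|A\| \cdot \max_{x \ne 0} \frac{\|Bx\|}{\|x\|} = \|A\|\,\|B\|$$
$\implies \|AB\| \le \|A\|\,\|B\|$. Hence proved. All induced norms are matrix norms.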
Theorem 1.3. $\|A\|_1 = \max_{1 \le j \le n} \sum_{i=1}^n |a_{ij}|$ (maximum absolute column sum)
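Proof.
$$\|Ax\|_1 = \left\| \left( \sum_{j=1}^n a_{1j}x_j,\ \sum_{j=1}^n a_{2j}x_j,\ \sum_{j=1}^n a_{3j}x_j,\ \ldots,\ \sum_{j=1}^n a_{nj}x_j \right)^T \right\|_1 = \sum_{i=1}^n \left| \sum_{j=1}^n a_{ij}x_j \right| \le \sum_{i=1}^n \sum_{j=1}^n |a_{ij}|\,|x_j|$$
$$\implies \|Ax\|_1 \le \sum_{j=1}^n |x_j| \sum_{i=1}^n |a_{ij}| \le \|x\|_1 \max_{1 \le j \le n} \left( \sum_{i=1}^n |a_{ij}| \right)$$
Therefore, $\frac{\|Ax\|_1}{\|x\|_1} \le \max_{1 \le j \le n} \sum_{i=1}^n |a_{ij}|$ for all $x \ne 0$,
$$\implies \|A\|_1 \le \max_{1 \le j \le n} \sum_{i=1}^n |a_{ij}| \tag{1}$$
Now, we need to show the converse inequality to prove that this is indeed an equality. Assume matrix $A$ attains its maximum absolute column sum at the $k$-th column. Let $x = e_k$, where $e_k$ is the standard basis vector with one at index $k$ and zero elsewhere.
$$\implies \|Ae_k\|_1 = \|(a_{1k}, a_{2k}, a_{3k}, \ldots, a_{nk})^T\|_1 = \sum_{i=1}^n |a_{ik}|$$
$$\implies \sum_{i=1}^n |a_{ik}| = \|Ae_k\|_1 \le \|A\|_1 \cdot 1$$
$$\implies \|A\|_1 \ge \max_{1 \le j \le n} \sum_{i=1}^n |a_{ij}| \tag{2}$$
By inequalities (1) and (2), $\|A\|_1 = \max_{1 \le j \le n} \sum_{i=1}^n |a_{ij}|$. Hence proved.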
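The two halves of this proof can be checked numerically on a small example. The following is a minimal sketch, assuming a matrix stored as a list of rows; the matrix entries and the helper names (`one_norm`, `vec_one_norm`, `matvec`) are illustrative choices and not part of the notes.

```python
def one_norm(A):
    """Induced 1-norm of a square matrix: the maximum absolute column sum."""
    n = len(A)
    return max(sum(abs(A[i][j]) for i in range(n)) for j in range(n))


def vec_one_norm(x):
    """Vector 1-norm: sum of absolute values of the entries."""
    return sum(abs(v) for v in x)


def matvec(A, x):
    """Matrix-vector product A x."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


# Illustrative matrix: column sums are |1| + |3| = 4 and |-2| + |4| = 6.
A = [[1, -2], [3, 4]]

# The maximum column sum is attained at the second column, so take x = e_k there.
e_k = [0, 1]
attained = vec_one_norm(matvec(A, e_k)) / vec_one_norm(e_k)

print(one_norm(A), attained)
```

Both printed values agree, which is the equality obtained from inequalities (1) and (2) for this particular matrix.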
Theorem 1.4. $\|A\|_\infty = \max_{1 \le i \le n} \sum_{j=1}^n |a_{ij}|$ (maximum absolute row sum)
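Proof.
$$\|Ax\|_\infty = \left\| \left( \sum_{j=1}^n a_{1j}x_j,\ \sum_{j=1}^n a_{2j}x_j,\ \sum_{j=1}^n a_{3j}x_j,\ \ldots,\ \sum_{j=1}^n a_{nj}x_j \right)^T \right\|_\infty$$
$$= \max_{1 \le i \le n} \left| \sum_{j=1}^n a_{ij}x_j \right| \le \max_{1 \le i \le n} \sum_{j=1}^n |a_{ij}|\,|x_j| \quad \text{(using the triangle inequality)}$$
$$\le \|x\|_\infty \max_{1 \le i \le n} \sum_{j=1}^n |a_{ij}| \implies \frac{\|Ax\|_\infty}{\|x\|_\infty} \le \max_{1 \le i \le n} \sum_{j=1}^n |a_{ij}|$$
$$\implies \|A\|_\infty \le \max_{1 \le i \le n} \sum_{j=1}^n |a_{ij}| \tag{3}$$
Now we need to show the converse inequality to prove equality. Let the maximum absolute row sum be attained at the $k$-th row of matrix $A$. Construct the vector $x$ such that
$$x = (\operatorname{sgn}(a_{k1}), \operatorname{sgn}(a_{k2}), \operatorname{sgn}(a_{k3}), \ldots, \operatorname{sgn}(a_{kn}))^T \tag{4}$$
where
$$\operatorname{sgn}(t) = \begin{cases} 1 & t > 0 \\ -1 & t < 0 \\ 0 & t = 0 \end{cases}$$
Now
$$\|Ax\|_\infty = \left\| \left( \sum_{j=1}^n a_{1j}\operatorname{sgn}(a_{kj}),\ \sum_{j=1}^n a_{2j}\operatorname{sgn}(a_{kj}),\ \ldots,\ \sum_{j=1}^n a_{nj}\operatorname{sgn}(a_{kj}) \right)^T \right\|_\infty$$
The $k$-th component is $\sum_{j=1}^n a_{kj}\operatorname{sgn}(a_{kj}) = \sum_{j=1}^n |a_{kj}|$, and no component exceeds the maximum absolute row sum in absolute value.
$$\implies \sum_{j=1}^n |a_{kj}| = \|Ax\|_\infty \le \|A\|_\infty \cdot 1 \quad (\|x\|_\infty = 1 \text{ for a nonzero matrix } A)$$
$$\implies \|A\|_\infty \ge \max_{1 \le i \le n} \sum_{j=1}^n |a_{ij}| \tag{5}$$
By inequalities (3) and (5), $\|A\|_\infty = \max_{1 \le i \le n} \sum_{j=1}^n |a_{ij}|$. Hence proved.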
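The sign-vector construction in (4) can be tried out directly. This is a small sketch under the assumption that the matrix is a list of rows; the matrix entries and helper names are hypothetical and chosen only for illustration.

```python
def inf_norm(A):
    """Induced infinity-norm of a matrix: the maximum absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)


def sgn(t):
    """Sign function: 1 for t > 0, -1 for t < 0, 0 for t == 0."""
    return (t > 0) - (t < 0)


def matvec(A, x):
    """Matrix-vector product A x."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


# Illustrative matrix: absolute row sums are 3 and 7.
A = [[1, -2], [3, -4]]

# Index k of the row with the largest absolute row sum.
k = max(range(len(A)), key=lambda i: sum(abs(a) for a in A[i]))

# Sign vector built from row k, as in equation (4); its infinity-norm is 1.
x = [sgn(a) for a in A[k]]
attained = max(abs(v) for v in matvec(A, x))

print(inf_norm(A), x, attained)
```

For this matrix the sign vector is (1, -1) and the value of the infinity-norm of Ax matches the maximum absolute row sum.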
Theorem 1.5. $\|A\|_2 = \left[\lambda_{\max}(A^TA)\right]^{1/2}$, where $\lambda_{\max}(A^TA)$ is the largest eigenvalue of $A^TA$ [spectral norm].
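Proof. Let $\lambda_1 \ge \lambda_2 \ge \lambda_3 \ge \ldots \ge \lambda_n \ge 0$ be the eigenvalues of $A^TA$. The corresponding eigenvectors can be chosen to form an orthonormal basis for $\mathbb{R}^n$ ($A^TA$ is symmetric).
Let $Z = \{Z_1, Z_2, Z_3, \ldots, Z_n\}$ be such an orthonormal basis for $\mathbb{R}^n$, with $A^TAZ_i = \lambda_i Z_i$. For any $x \in \mathbb{R}^n$,
$$x = \alpha_1 Z_1 + \alpha_2 Z_2 + \alpha_3 Z_3 + \ldots + \alpha_n Z_n$$
$$\|Ax\|_2^2 = x^TA^TAx = \left( \sum_{i=1}^n \alpha_i Z_i \right)^T A^TA \left( \sum_{i=1}^n \alpha_i Z_i \right) = \left( \sum_{i=1}^n \alpha_i Z_i \right)^T \left( \sum_{i=1}^n \alpha_i \lambda_i Z_i \right)$$
$$= \alpha_1^2\lambda_1 + \alpha_2^2\lambda_2 + \alpha_3^2\lambda_3 + \ldots + \alpha_n^2\lambda_n \le \lambda_{\max}(\alpha_1^2 + \alpha_2^2 + \alpha_3^2 + \ldots + \alpha_n^2) = \lambda_{\max}\|x\|_2^2$$
$$\implies \frac{\|Ax\|_2^2}{\|x\|_2^2} \le \lambda_{\max}$$
$$\implies \|A\|_2 \le \lambda_{\max}^{1/2} \tag{6}$$
Converse inequality:
Let $x = Z_1$. Since $\lambda_1$ is the maximum eigenvalue, $A^TAZ_1 = \lambda_1 Z_1$.
$$\implies \|AZ_1\|_2^2 = Z_1^TA^TAZ_1 = \lambda_1 Z_1^TZ_1$$
$$\implies \frac{\|AZ_1\|_2^2}{\|Z_1\|_2^2} = \lambda_1 \implies \lambda_{\max} \le \|A\|_2^2$$
$$\implies \|A\|_2 \ge \lambda_{\max}^{1/2} \tag{7}$$
By inequalities (6) and (7), $\|A\|_2 = \lambda_{\max}^{1/2}$. Hence proved.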
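For a 2x2 matrix the largest eigenvalue of $A^TA$ has a closed form, so the theorem can be evaluated by hand or in a few lines of code. The sketch below assumes a 2x2 matrix given as a list of two rows; the function names and the example matrix are illustrative, not taken from the notes.

```python
import math


def gram(A):
    """Return A^T A for a 2x2 matrix A given as [[a, b], [c, d]]."""
    (a, b), (c, d) = A
    return [[a * a + c * c, a * b + c * d],
            [a * b + c * d, b * b + d * d]]


def lambda_max_sym2(M):
    """Largest eigenvalue of a symmetric 2x2 matrix [[p, q], [q, r]]."""
    p, q, r = M[0][0], M[0][1], M[1][1]
    return (p + r) / 2 + math.sqrt(((p - r) / 2) ** 2 + q * q)


def spectral_norm2(A):
    """2-norm of a 2x2 matrix as the square root of lambda_max(A^T A)."""
    return math.sqrt(lambda_max_sym2(gram(A)))


# Illustrative matrix: A^T A = [[25, 20], [20, 25]] with eigenvalues 45 and 5.
A = [[3.0, 0.0], [4.0, 5.0]]
print(spectral_norm2(A))  # equals sqrt(45) up to rounding
```

The closed form is specific to the 2x2 case; for larger matrices an eigenvalue routine would be needed instead.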
Corollary 1.1. If $A$ is a symmetric positive semi-definite matrix such that $A = C^TC$, then $\|A\|_2 = \|C\|_2^2$.
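Proof. Using Theorem 1.5,
$$\|C\|_2 = \left[\lambda_{\max}(C^TC)\right]^{1/2} = \left[\lambda_{\max}(A)\right]^{1/2} \tag{8}$$
$$\|A\|_2 = \left[\lambda_{\max}(A^TA)\right]^{1/2} = \left[\lambda_{\max}(A^2)\right]^{1/2} = \left[\lambda_{\max}(A)^2\right]^{1/2}$$
$\left[\lambda_{\max}(A)^2\right]^{1/2} = \lambda_{\max}(A)$ (as $A$ is positive semi-definite). Using equation (8), $\|A\|_2 = \|C\|_2^2$. Hence proved.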

Theorem 1.6. $\|A\|_F = \left[\operatorname{trace}(A^TA)\right]^{1/2}$
Proof. Exercise.
Theorem 1.7. $\|A\|_2 \le \|A\|_F \le \sqrt{n}\,\|A\|_2$
Proof. Exercise.
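Before attempting the two exercises, the statements of Theorems 1.6 and 1.7 can be sanity-checked on a concrete matrix. The following sketch assumes a 2x2 matrix (so $n = 2$) and uses the closed-form largest eigenvalue of $A^TA$; the helper names and matrix entries are hypothetical, and a numerical check is of course not a proof.

```python
import math


def frobenius_norm(A):
    """Frobenius norm; trace(A^T A) equals the sum of the squared entries."""
    return math.sqrt(sum(a * a for row in A for a in row))


def spectral_norm2(A):
    """2-norm of a 2x2 matrix via the largest eigenvalue of A^T A."""
    (a, b), (c, d) = A
    p, q, r = a * a + c * c, a * b + c * d, b * b + d * d
    return math.sqrt((p + r) / 2 + math.sqrt(((p - r) / 2) ** 2 + q * q))


# Illustrative matrix with n = 2.
A = [[3.0, 0.0], [4.0, 5.0]]
n = 2

two = spectral_norm2(A)
fro = frobenius_norm(A)

# Theorem 1.7: ||A||_2 <= ||A||_F <= sqrt(n) * ||A||_2
holds = two <= fro <= math.sqrt(n) * two
print(two, fro, math.sqrt(n) * two, holds)
```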
Theorem 1.8. Let $A \in \mathbb{R}^{n\times n}$ be a symmetric matrix, then
$$\|A\|_2 = \max_{\|x\|_2 = 1} |\langle Ax, x\rangle|$$
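Proof. As $A$ is symmetric, $A = UDU^T$ by the spectral theorem for symmetric matrices, and the eigenvectors of $A$ form an orthonormal basis for $\mathbb{R}^n$.
$\|A\|_2 = |\lambda_{\max}|$ by Theorem 1.5, since $A^TA = A^2$ has eigenvalues $\lambda_i^2$ (the same argument as in Corollary 1.1). Here $\lambda_{\max}$ denotes the eigenvalue of $A$ of largest absolute value.
$$|\langle Ax, x\rangle| \le \|Ax\|_2\,\|x\|_2 \le \|A\|_2 \cdot 1 \quad (\|x\|_2 = 1)$$
$$\implies \max_{\|x\|_2 = 1} |\langle Ax, x\rangle| \le \|A\|_2 \tag{9}$$
Let $x = v_1$, where $v_1$ belongs to the orthonormal basis formed by eigenvectors of $A$ and satisfies $Av_1 = \lambda_{\max} v_1$. Then
$$|\langle Ax, x\rangle| = |\langle \lambda_{\max} v_1, v_1\rangle| = |\lambda_{\max}| \cdot 1 = \|A\|_2$$
$$\therefore \|A\|_2 = \max_{\|x\|_2 = 1} |\langle Ax, x\rangle|$$
Hence proved.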
Theorem 1.9. Let $A \in \mathbb{R}^{n\times n}$ be a symmetric positive semi-definite matrix, then
1. $\lambda_{\max}(A) = \max_{\|x\|_2 = 1} \langle Ax, x\rangle$
2. $\lambda_{\min}(A) = \min_{\|x\|_2 = 1} \langle Ax, x\rangle$
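Proof. 1. $\lambda_{\max}(A) = \|A\|_2 = \max_{\|x\|_2 = 1} \langle Ax, x\rangle$ by Corollary 1.1 and Theorem 1.8. The absolute value in Theorem 1.8 can be dropped because $\langle Ax, x\rangle \ge 0$ for a positive semi-definite matrix.
2. $\langle Ax, x\rangle = x^TAx = x^TUDU^Tx$ (by the spectral theorem for symmetric matrices)
$$\implies \langle Ax, x\rangle = (U^Tx)^T D\, (U^Tx) \ge \lambda_{\min}\|U^Tx\|_2^2$$
Proposition 1.1. $\|U^Tx\|_2^2 = \|x\|_2^2$ (orthogonal matrices preserve length)
Proof. $\|U^Tx\|_2^2 = x^TUU^Tx = x^Tx = \|x\|_2^2$ ($U^T = U^{-1}$ by definition for orthogonal matrices).
$$\implies \langle Ax, x\rangle \ge \lambda_{\min} \tag{10}$$
for $\|x\|_2 = 1$.
Take $y$ such that $Ay = \lambda_{\min} y$ and $\|y\|_2 = 1$. Now,
$$\langle Ay, y\rangle = y^TAy = \lambda_{\min} y^Ty = \lambda_{\min}\|y\|_2^2 = \lambda_{\min}$$
so the lower bound (10) is attained, and
$$\implies \lambda_{\min}(A) = \min_{\|x\|_2 = 1} \langle Ax, x\rangle \tag{11}$$
Hence proved.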
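In two dimensions the unit vectors can be swept by angle, which gives a rough numerical picture of Theorem 1.9. The sketch below is an illustration only: the matrix is a hypothetical symmetric positive definite example with eigenvalues 1 and 3, and the sampling resolution `N` is an arbitrary choice, so the extremes are approximations in general.

```python
import math


def quad_form(A, x):
    """Return <Ax, x> for a matrix A (list of rows) and a vector x."""
    Ax = [sum(a * v for a, v in zip(row, x)) for row in A]
    return sum(u * v for u, v in zip(Ax, x))


# Illustrative symmetric positive definite matrix with eigenvalues 1 and 3.
A = [[2.0, 1.0], [1.0, 2.0]]

# Sample unit vectors x = (cos t, sin t) on the unit circle.
N = 360
values = [
    quad_form(A, (math.cos(2 * math.pi * k / N), math.sin(2 * math.pi * k / N)))
    for k in range(N)
]

approx_lambda_max = max(values)
approx_lambda_min = min(values)
print(approx_lambda_max, approx_lambda_min)
```

For this matrix $\langle Ax, x\rangle = 2 + \sin(2t)$, so the sampled maximum and minimum land on the eigenvalues 3 and 1 up to rounding.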
Definition 1.1. Absolute and Relative Error:
If the scalar $\hat{x}$ is an approximation of the scalar $x$, then the Absolute Error is given by $|\hat{x} - x|$ and the Relative Error is given by $\frac{|\hat{x} - x|}{|x|}$. If $|\hat{x}| \ne 0$, then the Relative Error is also given by $\frac{|\hat{x} - x|}{|\hat{x}|}$. If we use $\|\cdot\|$ instead of $|\cdot|$, it is known as the normwise relative/absolute error.
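These definitions translate directly into code. The following is a minimal sketch; the function names are hypothetical, the sample values are made up for illustration, and the normwise version assumes the infinity-norm, which is only one of the possible choices of $\|\cdot\|$.

```python
def absolute_error(x_hat, x):
    """Absolute error |x_hat - x| of a scalar approximation."""
    return abs(x_hat - x)


def relative_error(x_hat, x):
    """Relative error |x_hat - x| / |x|; undefined when x == 0."""
    if x == 0:
        raise ValueError("relative error is undefined for x == 0")
    return abs(x_hat - x) / abs(x)


def normwise_relative_error(x_hat, x):
    """Normwise relative error using the infinity-norm (an assumed choice)."""
    numerator = max(abs(a - b) for a, b in zip(x_hat, x))
    denominator = max(abs(b) for b in x)
    if denominator == 0:
        raise ValueError("normwise relative error is undefined for x == 0")
    return numerator / denominator


print(absolute_error(2.5, 2.0))                       # 0.5
print(relative_error(2.5, 2.0))                       # 0.25
print(normwise_relative_error([1.5, 2.0], [1.0, 2.0]))  # 0.25
```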
2 Sensitivity of Linear Systems

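Consider a linear system
$$Ax = b$$
where $A = \begin{pmatrix} 1000 & 999 \\ 999 & 998 \end{pmatrix}$ and $b = \begin{pmatrix} 1999 \\ 1997 \end{pmatrix}$.
Solving the linear system gives $x = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$.
A slight perturbation is given to $b$,
$$b = \begin{pmatrix} 1998.99 \\ 1997.01 \end{pmatrix}, \qquad A = \begin{pmatrix} 1000 & 999 \\ 999 & 998 \end{pmatrix}$$
The solution of this new linear system is $x = \begin{pmatrix} 20.97 \\ -18.99 \end{pmatrix}$, a drastic change!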
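The two solutions quoted above can be reproduced with a short computation. This sketch uses Cramer's rule for a 2x2 system, which is simply one convenient way to solve it here and is not claimed to be the method used in the notes; the function name `solve2` is hypothetical.

```python
def solve2(A, b):
    """Solve a 2x2 linear system A x = b by Cramer's rule."""
    (a11, a12), (a21, a22) = A
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("matrix is singular")
    return [(b[0] * a22 - a12 * b[1]) / det,
            (a11 * b[1] - a21 * b[0]) / det]


A = [[1000, 999], [999, 998]]

x = solve2(A, [1999, 1997])            # original right-hand side
x_hat = solve2(A, [1998.99, 1997.01])  # perturbed right-hand side

print(x)      # the original solution (1, 1)
print(x_hat)  # close to (20.97, -18.99), up to floating-point rounding
```

A change of 0.01 in each entry of b moves each entry of the solution by roughly 20.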
−18.99
2.1 Geometric Intuition

The initial linear system of equations is given by
$$1000x_1 + 999x_2 = 1999$$
$$999x_1 + 998x_2 = 1997$$
and after perturbation,
$$1000x_1 + 999x_2 = 1998.99$$
$$999x_1 + 998x_2 = 1997.01$$
Plotting the initial system of equations on a graph, we get Figure 1. We can see that the lines almost completely overlap each other, and intersect at (1, 1) as expected.
Figure 1
Now let us plot the perturbed system of equations to get Figure 2. We can see that the intersection point has drastically changed and is at (20.97, −18.99).
Figure 2
The drastic change in the intersection point is a result of the lines being extremely close to each other, close enough that a minute change in the value of $b$ results in a completely different solution. This is a way to understand this behavior geometrically. Next we shall see how to formally measure the sensitivity of linear systems.
Definition 2.1. Condition Number
For an invertible matrix $A$, the condition number of $A$ with respect to a norm $\|\cdot\|$ is denoted by $\kappa(A)$ and is defined as
$$\kappa(A) = \|A\|\,\|A^{-1}\|$$
$$1 \le \kappa(A) \le \infty$$
Theorem 2.1. Properties of Condition Number
1. $\kappa(A) = \kappa(A^{-1})$
2. $\kappa(A) = \kappa(cA)$ for all $c \ne 0$, $c \in \mathbb{R}$
3. $\kappa(A) \ge 1$
Proof. Exercise.
Remark 2.1. Let $A \in \mathbb{R}^{n\times n}$. Then,
1. If $A$ is singular, its condition number is defined to be $\infty$.
2. In general there is no relationship between condition number and determinant.
Example 2.1. For $A = \begin{pmatrix} \alpha & 0 \\ 0 & \alpha \end{pmatrix}$ with $\alpha \ne 0$, $\kappa(A) = 1$, but $\det(A) = \alpha^2$.
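The contrast in Example 2.1 can be made concrete by putting the scaled identity next to the nearly singular matrix from the start of this section. The sketch below assumes the infinity-norm as the underlying norm and the value $\alpha = 10^{-3}$; both are illustrative choices, and the helper names are hypothetical.

```python
def inf_norm(A):
    """Induced infinity-norm: maximum absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)


def det2(A):
    """Determinant of a 2x2 matrix."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]


def inverse2(A):
    """Inverse of an invertible 2x2 matrix."""
    d = det2(A)
    if d == 0:
        raise ValueError("matrix is singular")
    return [[A[1][1] / d, -A[0][1] / d],
            [-A[1][0] / d, A[0][0] / d]]


def cond_inf(A):
    """Condition number kappa(A) = ||A|| ||A^{-1}|| in the infinity-norm."""
    return inf_norm(A) * inf_norm(inverse2(A))


alpha = 1e-3
scaled_identity = [[alpha, 0.0], [0.0, alpha]]
nearly_singular = [[1000.0, 999.0], [999.0, 998.0]]

# Tiny determinant, yet perfectly conditioned.
print(det2(scaled_identity), cond_inf(scaled_identity))
# Determinant of magnitude 1, yet condition number in the millions.
print(det2(nearly_singular), cond_inf(nearly_singular))
```

The first matrix has a determinant of about $10^{-6}$ and condition number 1, while the second has determinant $-1$ and condition number $1999^2 = 3996001$ in the infinity-norm.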
Theorem 2.2. Let $A$ be non-singular, let $b \ne 0$, and let $x$ and $\hat{x} = x + \Delta x$ be the solutions of $Ax = b$ and $A\hat{x} = b + \delta b$. Then,
$$\frac{\|\Delta x\|}{\|x\|} \le \kappa(A)\,\frac{\|\delta b\|}{\|b\|}$$
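Proof.
$$Ax = b \tag{12}$$
$$A(x + \Delta x) = b + \delta b \tag{13}$$
Subtracting equation (12) from (13), we get $A\Delta x = \delta b$, and as $A$ is invertible,
$$\Delta x = A^{-1}\delta b$$
$$\implies \|\Delta x\| = \|A^{-1}\delta b\| \le \|A^{-1}\|\,\|\delta b\| \tag{14}$$
Now,
$$\|Ax\| = \|b\| \implies \|b\| \le \|A\|\,\|x\|$$
$$\frac{1}{\|b\|} \ge \frac{1}{\|A\|} \cdot \frac{1}{\|x\|} \tag{15}$$
Multiplying inequalities (14) and (15) we get,
$$\frac{\|\Delta x\|}{\|x\|} \le \kappa(A)\,\frac{\|\delta b\|}{\|b\|}$$
Hence proved.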
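The bound can be evaluated for the example at the start of this section. The sketch below assumes the infinity-norm and hard-codes the inverse of A, which is exact here because $\det(A) = -1$; the helper names are hypothetical.

```python
def inf_norm_vec(v):
    """Vector infinity-norm: largest absolute entry."""
    return max(abs(t) for t in v)


def inf_norm_mat(A):
    """Induced matrix infinity-norm: maximum absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)


def matvec(A, x):
    """Matrix-vector product A x."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


A = [[1000.0, 999.0], [999.0, 998.0]]
A_inv = [[-998.0, 999.0], [999.0, -1000.0]]  # exact inverse, since det(A) = -1
b = [1999.0, 1997.0]
delta_b = [-0.01, 0.01]
x = [1.0, 1.0]

# Delta x = A^{-1} delta_b, as in the proof.
delta_x = matvec(A_inv, delta_b)

kappa = inf_norm_mat(A) * inf_norm_mat(A_inv)
lhs = inf_norm_vec(delta_x) / inf_norm_vec(x)
rhs = kappa * inf_norm_vec(delta_b) / inf_norm_vec(b)

print(kappa, lhs, rhs)
```

Both sides come out at about 19.99, so for this particular perturbation the bound of Theorem 2.2 is essentially attained in the infinity-norm.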
Remark 2.2. If we perturb the coefficient matrix $A$ instead, we can also bound the error in the solution. Note that the perturbed matrix need not be invertible.
Theorem 2.3. Let $A$ be an invertible matrix. If $\frac{\|\Delta A\|}{\|A\|} < \frac{1}{\kappa(A)}$, then $A + \Delta A$ is invertible.
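Proof. If $A + \Delta A$ is singular, then $(A + \Delta A)x = 0$ for some $x \ne 0$ [as the dimension of the nullspace is $> 0$ for singular matrices]
$$\implies Ax = -\Delta A x \implies x = -A^{-1}\Delta A x$$
$$\implies \|x\| \le \|A^{-1}\|\,\|\Delta A\|\,\|x\| \implies 1 \le \|A^{-1}\|\,\|\Delta A\|$$
$$\implies 1 \le \kappa(A)\frac{\|\Delta A\|}{\|A\|} \implies \frac{1}{\kappa(A)} \le \frac{\|\Delta A\|}{\|A\|}$$
$\therefore$ by contraposition, if $\frac{\|\Delta A\|}{\|A\|} < \frac{1}{\kappa(A)}$, then $A + \Delta A$ is invertible. Hence proved.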
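Theorem 2.4. Let $A$ be an invertible matrix. If $x$ and $\hat{x} = x + \Delta x$ are the solutions to $Ax = b$ and $(A + \Delta A)\hat{x} = b$, and $\frac{\|\Delta A\|}{\|A\|} < \frac{1}{\kappa(A)}$, then
$$\frac{\|\Delta x\|}{\|x\|} \le \frac{\kappa(A)\frac{\|\Delta A\|}{\|A\|}}{1 - \kappa(A)\frac{\|\Delta A\|}{\|A\|}}$$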
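Proof.
$$(A + \Delta A)\hat{x} = b$$
Substituting $\hat{x} = x + \Delta x$ and using $Ax = b$,
$$\implies A\Delta x + \Delta A\hat{x} = 0 \implies A\Delta x = -\Delta A\hat{x}$$
$$\Delta x = -A^{-1}\Delta A\hat{x} \implies \|\Delta x\| \le \|A^{-1}\|\,\|\Delta A\|\,\|\hat{x}\|$$
$$\implies \|\Delta x\| \le \|A^{-1}\|\,\|\Delta A\|\,(\|x\| + \|\Delta x\|)$$
$$\implies (1 - \|A^{-1}\|\,\|\Delta A\|)\,\|\Delta x\| \le \|A^{-1}\|\,\|\Delta A\|\,\|x\|$$
Since $\|A^{-1}\|\,\|\Delta A\| = \kappa(A)\frac{\|\Delta A\|}{\|A\|} < 1$, we can divide by $1 - \kappa(A)\frac{\|\Delta A\|}{\|A\|} > 0$ to get
$$\frac{\|\Delta x\|}{\|x\|} \le \frac{\kappa(A)\frac{\|\Delta A\|}{\|A\|}}{1 - \kappa(A)\frac{\|\Delta A\|}{\|A\|}}$$
Hence proved.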
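A small numerical experiment illustrates the bound of Theorem 2.4. The sketch below assumes the infinity-norm, a hypothetical well-conditioned 2x2 matrix, and a small diagonal perturbation; all values and helper names are chosen only for illustration.

```python
def inf_norm_vec(v):
    """Vector infinity-norm: largest absolute entry."""
    return max(abs(t) for t in v)


def inf_norm_mat(A):
    """Induced matrix infinity-norm: maximum absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)


def inverse2(A):
    """Inverse of an invertible 2x2 matrix."""
    d = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    if d == 0:
        raise ValueError("matrix is singular")
    return [[A[1][1] / d, -A[0][1] / d],
            [-A[1][0] / d, A[0][0] / d]]


def matvec(A, x):
    """Matrix-vector product A x."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


A = [[2.0, 1.0], [1.0, 3.0]]
dA = [[0.01, 0.0], [0.0, -0.01]]   # perturbation of the coefficient matrix
b = [3.0, 4.0]

x = matvec(inverse2(A), b)                      # solves A x = b
A_pert = [[A[i][j] + dA[i][j] for j in range(2)] for i in range(2)]
x_hat = matvec(inverse2(A_pert), b)             # solves (A + dA) x_hat = b
dx = [x_hat[i] - x[i] for i in range(2)]

kappa = inf_norm_mat(A) * inf_norm_mat(inverse2(A))
r = kappa * inf_norm_mat(dA) / inf_norm_mat(A)  # must be < 1 for the theorem

lhs = inf_norm_vec(dx) / inf_norm_vec(x)
rhs = r / (1 - r)
print(r, lhs, rhs)
```

Here the hypothesis $r < 1$ holds comfortably, and the observed relative change in the solution stays below the bound.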
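Theorem 2.5. Let $A$ be an invertible matrix. If $Ax = b$ and
$$(A + \Delta A)(x + \Delta x) = b + \Delta b; \quad b + \Delta b \ne 0$$
then, with $\hat{x} = x + \Delta x$,
$$\frac{\|\Delta x\|}{\|\hat{x}\|} \le \kappa(A)\left( \frac{\|\Delta A\|}{\|A\|} + \frac{\|\Delta b\|}{\|b + \Delta b\|} + \frac{\|\Delta A\|}{\|A\|}\,\frac{\|\Delta b\|}{\|b + \Delta b\|} \right)$$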
Proof. Exercise.
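Theorem 2.6. Let $A$ be an invertible matrix and $\frac{\|\Delta A\|}{\|A\|} < \frac{1}{\kappa(A)}$. If $Ax = b$ and
$$(A + \Delta A)(x + \Delta x) = b + \Delta b; \quad b \ne 0$$
then,
$$\frac{\|\Delta x\|}{\|x\|} \le \frac{\kappa(A)\left( \frac{\|\Delta A\|}{\|A\|} + \frac{\|\Delta b\|}{\|b\|} \right)}{1 - \kappa(A)\frac{\|\Delta A\|}{\|A\|}}$$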
Proof. Exercise.
3 Geometric meaning of condition number

Definition 3.1. The maximum and minimum magnification are defined by
• $\operatorname{maxmag}(A) = \max_{\|x\| = 1} \|Ax\|$
• $\operatorname{minmag}(A) = \min_{\|x\| = 1} \|Ax\|$
Theorem 3.1. Let $A$ be an invertible matrix, then
• $\operatorname{maxmag}(A) = \frac{1}{\operatorname{minmag}(A^{-1})}$
• $\operatorname{minmag}(A) = \frac{1}{\operatorname{maxmag}(A^{-1})}$
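Proof. $\operatorname{maxmag}(A) = \max_{\|x\| = 1} \|Ax\|$
Let,
$$Ax = y \implies A^{-1}y = x$$
$$\left\| A\frac{x}{\|x\|} \right\| = \left\| A\frac{A^{-1}y}{\|x\|} \right\| = \frac{\|y\|}{\|A^{-1}y\|} = \frac{1}{\left\| A^{-1}\frac{y}{\|y\|} \right\|}$$
As $x$ ranges over all nonzero vectors, so does $y$, since $A$ is invertible.
$$\implies \max_{y \ne 0} \frac{1}{\left\| A^{-1}\frac{y}{\|y\|} \right\|} = \frac{1}{\operatorname{minmag}(A^{-1})} \tag{16}$$
$$\implies \min_{y \ne 0} \frac{1}{\left\| A^{-1}\frac{y}{\|y\|} \right\|} = \frac{1}{\operatorname{maxmag}(A^{-1})} \tag{17}$$
By equations (16) and (17),
$$\operatorname{maxmag}(A) = \frac{1}{\operatorname{minmag}(A^{-1})}$$
$$\operatorname{minmag}(A) = \frac{1}{\operatorname{maxmag}(A^{-1})}$$
Hence proved.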
Remark 3.1. Let $A$ be an invertible matrix, then by the previous theorem,
$$\kappa(A) = \frac{\operatorname{maxmag}(A)}{\operatorname{minmag}(A)}$$
The condition number captures the behavior of the unit ball under transformation by the matrix $A$.
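The ratio in Remark 3.1 can be estimated by pushing sampled unit vectors through A and recording how much they are stretched. The sketch below assumes the 2-norm, a hypothetical diagonal matrix that maps the unit circle to an ellipse with semi-axes 3 and 0.5, and an arbitrary sampling resolution `N`; for a general matrix the sampled extremes are only approximations.

```python
import math


def norm2(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(t * t for t in v))


def matvec(A, x):
    """Matrix-vector product A x."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


# Illustrative matrix: stretches the first axis by 3, shrinks the second by 2.
A = [[3.0, 0.0], [0.0, 0.5]]

# Sample unit vectors on the unit circle and measure the magnification ||Ax||.
N = 360
magnifications = [
    norm2(matvec(A, [math.cos(2 * math.pi * k / N), math.sin(2 * math.pi * k / N)]))
    for k in range(N)
]

maxmag = max(magnifications)
minmag = min(magnifications)
kappa = maxmag / minmag
print(maxmag, minmag, kappa)
```

For this matrix the sampled values give a maximum magnification of 3, a minimum of 0.5, and hence a condition number of about 6 in the 2-norm.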