
Chapter 8

Least Squares
8.1 The problem of least squares

We have learnt how to solve linear systems of the form Ax = b; in terms of distance, when we try to solve such a system we are looking for vectors x that make the distance between Ax and b equal to 0: ||Ax − b|| = 0 (or, to avoid square roots, ||Ax − b||^2 = 0). If the solution set of Ax = b is empty, then we might be interested in finding a vector x_0 that makes ||Ax_0 − b||^2 as small as possible. Such a vector is called a least squares solution of the system Ax = b. This term arises from the fact that ||Ax − b|| is the square root of a sum of squares.

Given an n × m matrix A and a vector b ∈ R^n, a vector x_0 ∈ R^m is called a least squares solution of Ax = b if and only if for all x ∈ R^m

$$\|A x_0 - b\|^2 \le \|A x - b\|^2 .$$

Notice that if x_0 solves the system Ax = b then it is a least squares solution.
Least squares problems arise in many situations; the most typical one is the following. Suppose that a certain physical experiment provides measured data points (x_1, y_1), ..., (x_n, y_n), corresponding to the physical magnitudes X and Y, that theoretically satisfy the law Y = α + βX, i.e., the plot of their relationship is a straight line. The measured data will then yield a linear system of the form

$$\begin{aligned} y_1 &= \alpha + \beta x_1 \\ &\;\;\vdots \\ y_n &= \alpha + \beta x_n \end{aligned}$$
that can be written by means of the augmented matrix as

$$\left(\begin{array}{cc|c} 1 & x_1 & y_1 \\ \vdots & \vdots & \vdots \\ 1 & x_n & y_n \end{array}\right),$$

or as Ax = b, where:

$$A = \begin{pmatrix} 1 & x_1 \\ \vdots & \vdots \\ 1 & x_n \end{pmatrix}; \qquad x = \begin{pmatrix} \alpha \\ \beta \end{pmatrix}; \qquad b = \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix}.$$

Experimental data (with unavoidable noise) like the ones shown in the figure below
will normally lead to inconsistent systems (i.e., there is no straight line that passes
through all the points), and then we will seek a least squares solution.

[Figure: experimental measurements; physical magnitude Y plotted against physical magnitude X.]

How do we do this? The following section shows the way.

8.1.1 Geometrical interpretation of the least squares solution

It is clear that a least squares solution x_0 of the system Ax = b will be such that the vector Ax_0 lies in the column space of A, C(A), and that Ax_0 is the point of C(A) closest to b. Obviously, the projection of b on the column space of A will yield Ax_0.
[Figure: the vector b and its projection Ax_0 onto the plane C(A); other vectors Ax_1, Ax_2 in C(A) lie farther from b.]

Suppose then that x_0 satisfies Ax_0 = Proj_{C(A)}(b). Such a projection satisfies that b − Proj_{C(A)}(b) is orthogonal to C(A), and thus b − Ax_0 is orthogonal to each column A_j of A:

$$A_j \cdot (b - A x_0) = 0, \qquad j = 1, \ldots, m,$$

or

$$A_j^t (b - A x_0) = 0, \qquad j = 1, \ldots, m,$$

which can be written in a single expression as

$$A^t (b - A x_0) = 0 \iff A^t A x_0 = A^t b .$$
This proves the following theorem.


Theorem: The set of least squares solutions of Ax = b coincides with the non-empty set of solutions of A^t A x = A^t b.

The matrix equation A^t A x = A^t b is usually referred to as the normal equations for x_0. Then, to find the least squares solution of Ax = b, the first step is to (left-)multiply both sides by A^t. Thus we get

1 x1

1 ... 1

..

...
.

x1 . . . x n
1 xn

...

xn

n
X

...

i=1

{z i=1

{z

y
1

...

yn

i=1

At A

yi
xi

i=1

n
X

xi yi
x2i

n
X

xi
i=1

1
=

x1

n
X

At b

If the matrix A^t A is invertible, then the system A^t A x = A^t b has a unique solution given by:

$$x = (A^t A)^{-1} A^t b .$$
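For instance, a minimal Python sketch of this line fit (the sample measurements x_data, y_data are made up for illustration; numpy is assumed available) could be:

```python
import numpy as np

# Hypothetical noisy measurements of the law Y = alpha + beta * X
x_data = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_data = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

# Design matrix: a column of ones (for alpha) and the x-values (for beta)
A = np.column_stack([np.ones_like(x_data), x_data])

# Solve the normal equations  A^t A x = A^t b  for x = (alpha, beta)
alpha, beta = np.linalg.solve(A.T @ A, A.T @ y_data)
print(f"alpha = {alpha:.4f}, beta = {beta:.4f}")
```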
The straight line Y = α + βX whose coefficients α and β are found by least squares is shown in the figure below:

[Figure: experimental measurements and the fitted least squares line; physical magnitude Y plotted against physical magnitude X.]

Example: Find the least squares solution of Ax = b, where

$$A = \begin{pmatrix} 1 & -6 \\ 1 & -2 \\ 1 & 1 \\ 1 & 7 \end{pmatrix} \qquad \text{and} \qquad b = \begin{pmatrix} -1 \\ 2 \\ 1 \\ 6 \end{pmatrix}.$$

We compute the solution as:

$$x = (A^t A)^{-1} A^t b = \Bigg( \underbrace{\begin{pmatrix} 1 & 1 & 1 & 1 \\ -6 & -2 & 1 & 7 \end{pmatrix} \begin{pmatrix} 1 & -6 \\ 1 & -2 \\ 1 & 1 \\ 1 & 7 \end{pmatrix}}_{A^t A} \Bigg)^{\!-1} \underbrace{\begin{pmatrix} 1 & 1 & 1 & 1 \\ -6 & -2 & 1 & 7 \end{pmatrix} \begin{pmatrix} -1 \\ 2 \\ 1 \\ 6 \end{pmatrix}}_{A^t b} = \begin{pmatrix} 4 & 0 \\ 0 & 90 \end{pmatrix}^{\!-1} \begin{pmatrix} 8 \\ 45 \end{pmatrix} = \begin{pmatrix} 2 \\ 1/2 \end{pmatrix},$$

which is the least squares solution of the problem.
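The result can be checked numerically; a small Python sketch (assuming numpy) solves the same system both via the normal equations and via numpy's built-in least squares routine:

```python
import numpy as np

A = np.array([[1.0, -6.0],
              [1.0, -2.0],
              [1.0,  1.0],
              [1.0,  7.0]])
b = np.array([-1.0, 2.0, 1.0, 6.0])

# Solve the normal equations A^t A x = A^t b
x0 = np.linalg.solve(A.T @ A, A.T @ b)
print(x0)                      # [2.  0.5]

# Cross-check with numpy's built-in least squares solver
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
print(x_lstsq)                 # [2.  0.5]
```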

Notice that ker(A^t A) = ker(A) and C(A^t A) = C(A^t). Notice as well the following important result:

Theorem: Given an n × m matrix A with linearly independent columns, let A = QR be a QR factorization of A. Then, the equation Ax = b has a unique least squares solution given by

$$x_0 = R^{-1} Q^t b .$$
Obviously, the following result also holds:

Theorem: Given an n × m matrix A whose columns form an orthonormal set of vectors in R^n, the solution to the least squares problem is

$$x_0 = A^t b .$$
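A brief Python sketch of the QR route for the previous example (np.linalg.qr returns the reduced factorization by default):

```python
import numpy as np

A = np.array([[1.0, -6.0], [1.0, -2.0], [1.0, 1.0], [1.0, 7.0]])
b = np.array([-1.0, 2.0, 1.0, 6.0])

# Reduced QR factorization: A = QR, Q with orthonormal columns, R upper triangular
Q, R = np.linalg.qr(A)

# Least squares solution x0 = R^{-1} Q^t b, via the triangular system R x = Q^t b
x0 = np.linalg.solve(R, Q.T @ b)
print(x0)  # [2.  0.5], matching the normal equations result
```

Solving the triangular system R x = Q^t b is numerically preferable to forming (A^t A)^{-1} explicitly.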

8.2 Approximation of functions

8.2.1 Approximation by polynomials

In many applications it is necessary to approximate a continuous function in terms of functions from some special types of approximating sets. Most commonly we approximate by a polynomial of degree n or less, using orthonormal bases.

Example: Consider the function f(x) = e^x on the interval [0, 1]. Let us approximate it by a linear function. The concept of orthogonality is based on the inner product. It is easy to verify that the following definition provides an inner product in the set of continuous functions defined on [a, b], C[a, b]:

$$f \cdot g = \int_a^b f(x) g(x)\, dx .$$

Then let us find an orthonormal basis of polynomials of degree 1 or less. The polynomial p_0(x) = 1 has length given by

$$\|p_0(x)\| = \sqrt{\int_0^1 (1)^2\, dx} = 1 .$$

Now consider a polynomial of degree 1, p_1(x) = ax + b, orthogonal to p_0(x) and with length also 1. On the one hand we have

$$p_0(x) \cdot p_1(x) = \int_0^1 1 \cdot (ax + b)\, dx = \frac{a}{2} + b = 0 \;\implies\; a = -2b .$$

On the other hand, the length of p_1(x) is:

$$\|p_1(x)\| = \sqrt{\int_0^1 \big(b(1 - 2x)\big)^2\, dx} = \sqrt{b^2 \cdot \tfrac{1}{3}} = 1,$$

so b = √3 and a = −2√3. Thus we have an orthonormal basis:

$$B = \{\, p_0(x) = 1,\; p_1(x) = \sqrt{3}\,(1 - 2x) \,\}.$$

[Figure: the orthogonal basis vectors p_0(x) = 1 and p_1(x) = √3(1 − 2x) of P_1.]

Obviously, we cannot write e^x = c_0 p_0(x) + c_1 p_1(x) for x ∈ [0, 1] (the system with unknowns c_0 and c_1 is inconsistent), but we can approximate e^x (find the least squares solution). Such an approximation is given by the orthogonal projection of the vector e^x on P_1:

$$P_{P_1}(e^x) = c_0 p_0(x) + c_1 p_1(x) = \|P_{p_0(x)}(e^x)\|\, p_0(x) + \|P_{p_1(x)}(e^x)\|\, p_1(x),$$

and since p_0(x) and p_1(x) are unit vectors we have:

$$\|P_{p_0(x)}(e^x)\| = p_0(x) \cdot e^x = \int_0^1 e^x\, dx = e - 1,$$

and

$$\|P_{p_1(x)}(e^x)\| = p_1(x) \cdot e^x = \int_0^1 \sqrt{3}\,(1 - 2x)\, e^x\, dx \overset{\text{by parts}}{=} \sqrt{3}\,(e - 3).$$

Thus, the best linear approximation of e^x is

$$f(x) = (e - 1) + \sqrt{3}\,(e - 3)\cdot\sqrt{3}\,(1 - 2x) = 4e - 10 + (18 - 6e)\,x$$

(see the figure below).




[Figure: least squares approximation of e^x on [0, 1], showing the graphs of e^x and f(x).]
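This computation can be verified numerically; a small Python sketch (assuming scipy is available for the quadrature):

```python
import numpy as np
from scipy.integrate import quad

# Orthonormal basis of P_1 on [0, 1] for the inner product f.g = integral of f*g
p0 = lambda x: 1.0
p1 = lambda x: np.sqrt(3.0) * (1.0 - 2.0 * x)

# Projection coefficients c_k = (e^x) . p_k
c0, _ = quad(lambda x: np.exp(x) * p0(x), 0.0, 1.0)
c1, _ = quad(lambda x: np.exp(x) * p1(x), 0.0, 1.0)
print(c0, np.e - 1)                   # both approximately 1.71828
print(c1, np.sqrt(3) * (np.e - 3))    # both approximately -0.48795

# Best linear approximation f(x) = c0*p0(x) + c1*p1(x) = (4e - 10) + (18 - 6e)x
f = lambda x: c0 * p0(x) + c1 * p1(x)
print(f(0.0), 4 * np.e - 10)          # intercepts agree
```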

8.2.2 Approximation by trigonometric polynomials

Trigonometric polynomials are used to approximate periodic functions. By a trigonometric polynomial of degree n we mean a function of the form

$$t_n(x) = \frac{a_0}{2} + \sum_{k=1}^{n} (a_k \cos kx + b_k \sin kx).$$

If we consider continuous functions in C[−π, π], we can verify that for f, g the expression

$$f \cdot g = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x)\, g(x)\, dx$$

defines an inner product in the vector space C[−π, π], and according to this definition, the set

$$B = \Big\{ \tfrac{1}{\sqrt{2}},\ \cos x,\ \cos 2x,\ \ldots,\ \cos nx,\ \sin x,\ \sin 2x,\ \ldots,\ \sin nx \Big\}$$

is an orthonormal set with respect to the previous inner product. To approximate a continuous 2π-periodic function f(x) by trigonometric polynomials of degree n or less, we have to compute the projection of f(x) on each of the orthonormal vectors in B, which is given by the corresponding dot product; therefore:
$$\frac{a_0}{2} = \underbrace{\Big( f \cdot \tfrac{1}{\sqrt{2}} \Big)}_{\text{coefficient}}\; \underbrace{\tfrac{1}{\sqrt{2}}}_{\text{vector}} \;\overset{\substack{\text{inner product} \\ \text{properties}}}{=}\; \frac{1}{2}\,(f \cdot 1) \;\implies\; a_0 = f \cdot 1 = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x)\, dx,$$

and

$$a_k = f \cdot \cos kx = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos kx\, dx, \qquad b_k = f \cdot \sin kx = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin kx\, dx,$$

for k = 1, ..., n, which are the coefficients that determine the best (least squares) approximation of f in the vector space C[−π, π]. The a_k's and b_k's turn out to be the well-known Fourier coefficients of the 2π-periodic function f.
Example: Let us find the second, third and fourth order Fourier approximations of the function f(x) = x on [−π, π]. First we compute the coefficients

$$a_0 = \frac{1}{\pi} \int_{-\pi}^{\pi} x\, dx = 0 \quad \text{(odd function)},$$

$$a_k = \frac{1}{\pi} \int_{-\pi}^{\pi} x \cos kx\, dx = 0 \quad \text{(odd function)},$$

$$b_k = \frac{1}{\pi} \int_{-\pi}^{\pi} x \sin kx\, dx \overset{\text{by parts}}{=} \frac{2(-1)^{k+1}}{k}.$$

Then the least squares n-th order approximation of x on [−π, π] is

$$t_n(x) = \sum_{k=1}^{n} \frac{2(-1)^{k+1}}{k} \sin kx .$$

In particular, we have:

$$t_2(x) = 2 \sin x - \sin 2x \quad \text{(second order)};$$

$$t_3(x) = 2 \sin x - \sin 2x + \frac{2}{3} \sin 3x \quad \text{(third order)};$$

$$t_4(x) = 2 \sin x - \sin 2x + \frac{2}{3} \sin 3x - \frac{1}{2} \sin 4x \quad \text{(fourth order)}.$$

[Figures: least squares approximations of x on [−π, π], showing x together with t_2(x), t_3(x) and t_4(x).]
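Finally, a small Python sketch (assuming scipy for the quadrature) checking these Fourier coefficients against the closed form 2(−1)^{k+1}/k:

```python
import numpy as np
from scipy.integrate import quad

f = lambda x: x

# b_k = (1/pi) * integral_{-pi}^{pi} x sin(kx) dx; closed form: 2(-1)^(k+1)/k
for k in range(1, 5):
    bk, _ = quad(lambda x: f(x) * np.sin(k * x), -np.pi, np.pi)
    print(k, bk / np.pi, 2 * (-1) ** (k + 1) / k)   # numeric vs. closed form

# Fourth order approximation t_4 evaluated at a sample point
t4 = lambda x: sum(2 * (-1) ** (k + 1) / k * np.sin(k * x) for k in range(1, 5))
print(t4(1.0))
```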
