Least Squares
8.1 Least squares solutions
We have learnt how to solve linear systems of the form Ax = b; in terms of distance, when we try to solve such a system we are looking for vectors x that make the distance between Ax and b equal to 0: ‖Ax − b‖ = 0 (or, to avoid square roots, ‖Ax − b‖² = 0). If the solution set of Ax = b is empty, then we might be interested in finding a vector x₀ that makes ‖Ax₀ − b‖² as small as possible. Such a vector is called a least squares solution of the system Ax = b. This term arises from the fact that ‖Ax − b‖ is the square root of a sum of squares.
Given an n × m matrix A and a vector b ∈ Rⁿ, a vector x₀ ∈ Rᵐ is called a least squares solution of Ax = b if and only if for all x ∈ Rᵐ

‖Ax₀ − b‖² ≤ ‖Ax − b‖² .
Notice that if x₀ solves the system Ax = b, then it is a least squares solution.
Least squares problems arise in many situations; the most typical one is the following. Suppose that a certain physical experiment provides measured data points (x₁, y₁), . . . , (xₙ, yₙ), corresponding to the physical magnitudes X and Y, that theoretically satisfy the law Y = α + βX, i.e., the plot of their relationship is a straight line. The measured data will then yield a linear system of the form

y₁ = α + βx₁
⋮
yₙ = α + βxₙ
that can be written by means of the augmented matrix as

[ 1  x₁ | y₁ ]
[ ⋮   ⋮ |  ⋮ ]
[ 1  xₙ | yₙ ] ,
or as Ax = b, where:

A = [ 1  x₁ ]
    [ ⋮   ⋮ ]
    [ 1  xₙ ] ,

x = [ α ]
    [ β ] ,

b = [ y₁ ]
    [ ⋮  ]
    [ yₙ ] .
Experimental data (with unavoidable noise) like the ones shown in the figure below
will normally lead to inconsistent systems (i.e., there is no straight line that passes
through all the points), and then we will seek a least squares solution.
[Figure: experimental measurements of the physical magnitude Y against the physical magnitude X.]
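As an illustration, here is a minimal numpy sketch, with made-up data points (not taken from the text), that assembles A and b from measured pairs (xᵢ, yᵢ) and computes a least squares fit:

```python
import numpy as np

# Hypothetical measurements (x_i, y_i), made up for illustration,
# roughly following a law y = alpha + beta*x plus noise
xs = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
ys = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

# One equation y_i = alpha + beta*x_i per data point:
# the column of ones multiplies alpha, the column of x_i multiplies beta
A = np.column_stack([np.ones_like(xs), xs])
b = ys

# Least squares solution x0 = (alpha, beta) minimizing ||A x - b||^2
x0, residuals, rank, _ = np.linalg.lstsq(A, b, rcond=None)
print("alpha, beta =", x0)
```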
8.1.1 The normal equations
It is clear that a least squares solution x₀ of the system Ax = b is such that the vector Ax₀ lies in the column space of A, C(A), and that Ax₀ is the point of C(A) closest to b. Hence the projection of b onto the column space of A yields Ax₀.
[Figure: the vector b and its projection Ax₀ onto the plane C(A); other vectors Ax₁, Ax₂ in C(A) lie farther from b.]
Suppose then that x₀ satisfies Ax₀ = Proj_{C(A)}(b). Such a projection satisfies that b − Proj_{C(A)}(b) is orthogonal to C(A), and thus b − Ax₀ is orthogonal to each column Aⱼ of A:

Aⱼ · (b − Ax₀) = 0 ,   j = 1, . . . , m ,

or

Aⱼᵗ (b − Ax₀) = 0 ,   j = 1, . . . , m ,

which, written in matrix form, gives the normal equations

AᵗAx₀ = Aᵗb .
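The normal equations can be solved directly in code; a small sketch, reusing the made-up data from the previous snippet, that checks the result against numpy's general least squares routine:

```python
import numpy as np

# Same hypothetical data as in the sketch above
A = np.column_stack([np.ones(5), np.array([0.0, 1.0, 2.0, 3.0, 4.0])])
b = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

# Normal equations: (A^t A) x0 = A^t b
x0 = np.linalg.solve(A.T @ A, A.T @ b)

# Agrees with the generic least squares solver
assert np.allclose(x0, np.linalg.lstsq(A, b, rcond=None)[0])
print(x0)
```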
For the straight-line fit above, multiplying Aᵗ (whose rows are (1, . . . , 1) and (x₁, . . . , xₙ)) by A and by b gives

AᵗA = [   n       Σᵢ xᵢ  ]
      [ Σᵢ xᵢ    Σᵢ xᵢ²  ] ,

Aᵗb = [ Σᵢ yᵢ   ]
      [ Σᵢ xᵢyᵢ ] ,

where all the sums run from i = 1 to n, so the normal equations AᵗAx₀ = Aᵗb reduce to a 2 × 2 linear system in the unknowns α and β.
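The same 2 × 2 system can be assembled straight from these sums; a short numpy sketch with the same made-up data:

```python
import numpy as np

# Hypothetical data again; the 2 x 2 system is built from the sums above
xs = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
ys = np.array([1.1, 2.9, 5.2, 6.8, 9.1])
n = len(xs)

AtA = np.array([[n, xs.sum()],
                [xs.sum(), (xs ** 2).sum()]])
Atb = np.array([ys.sum(), (xs * ys).sum()])

alpha, beta = np.linalg.solve(AtA, Atb)
print(alpha, beta)
```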
[Figure: the experimental measurements for the example that follows, physical magnitude Y against physical magnitude X.]
Example: for the measured data shown in the figure above, we compute the least squares solution as

x₀ = (AᵗA)⁻¹ Aᵗ b ,

forming AᵗA and Aᵗb from the data as above and inverting the 2 × 2 matrix AᵗA.
Notice that ker(AᵗA) = ker(A) and C(AᵗA) = C(Aᵗ). Notice as well the following important result:
Theorem: Given an n × m matrix A with linearly independent columns, let A = QR be a QR factorization of A. Then the equation Ax = b has a unique least squares solution, given by

x₀ = R⁻¹ Qᵗ b .
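A quick sketch of this formula using numpy's (reduced) QR factorization, again on made-up data; the upper triangular system is solved rather than inverting R explicitly:

```python
import numpy as np

# Hypothetical data; numpy's qr returns the reduced factorization by default
A = np.column_stack([np.ones(5), np.array([0.0, 1.0, 2.0, 3.0, 4.0])])
b = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

Q, R = np.linalg.qr(A)            # Q: n x m with orthonormal columns, R: m x m upper triangular
x0 = np.linalg.solve(R, Q.T @ b)  # x0 = R^{-1} Q^t b
print(x0)
```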
Obviously, the following result also holds:
Theorem: Given an n × m matrix A whose columns form an orthonormal set of vectors in Rⁿ, the solution to the least squares problem is

x₀ = Aᵗ b .
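A small check of this fact, where the orthonormal columns are produced (for illustration only) from the QR factorization of a random matrix:

```python
import numpy as np

rng = np.random.default_rng(0)

# An n x m matrix with orthonormal columns (taken from a QR factorization
# of an arbitrary random matrix) and an arbitrary right-hand side
A, _ = np.linalg.qr(rng.standard_normal((6, 3)))
b = rng.standard_normal(6)

# With orthonormal columns, the least squares solution is simply A^t b
x0 = A.T @ b
assert np.allclose(x0, np.linalg.lstsq(A, b, rcond=None)[0])
print(x0)
```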
8.2 Approximation of functions
8.2.1 Approximation by polynomials
In the vector space C[a, b] of continuous functions on an interval [a, b],

⟨f, g⟩ = ∫_a^b f(x) g(x) dx

defines an inner product; below we take [a, b] = [0, 1].
Then let us find an orthonormal basis of polynomials of degree 1 or less. The polynomial

p₀(x) = 1

has length given by

‖p₀(x)‖ = √( ∫_0^1 (1)² dx ) = 1 .

Next we look for a polynomial p₁(x) = ax + b of degree 1 orthogonal to p₀(x):

⟨p₀, p₁⟩ = ∫_0^1 1 · (ax + b) dx = a/2 + b = 0   ⟹   a = −2b ,

so p₁(x) = b(1 − 2x), and normalizing,

‖p₁(x)‖ = √( ∫_0^1 (b(1 − 2x))² dx ) = √( b²/3 ) = 1   ⟹   b = √3 ,  a = −2√3 .

The orthonormal basis obtained is therefore { 1, √3(1 − 2x) }.
[Figure: the orthonormal polynomials p₀(x) = 1 and p₁(x) = √3(1 − 2x) on [0, 1].]
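A short numerical check, using scipy's quad for the integrals, that p₀ and p₁ are indeed orthonormal with respect to this inner product:

```python
import numpy as np
from scipy.integrate import quad

def inner(f, g):
    # Inner product <f, g> = integral of f(x) g(x) over [0, 1]
    return quad(lambda x: f(x) * g(x), 0.0, 1.0)[0]

p0 = lambda x: 1.0
p1 = lambda x: np.sqrt(3.0) * (1.0 - 2.0 * x)

print(inner(p0, p0), inner(p1, p1), inner(p0, p1))  # approximately 1, 1, 0
```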
The best approximation of f(x) = eˣ by polynomials of degree 1 or less on [0, 1] is its projection on the subspace P₁ spanned by p₀ and p₁:

P_{P₁}(eˣ) = c₀ p₀(x) + c₁ p₁(x) = ‖P_{p₀(x)}(eˣ)‖ p₀(x) + ‖P_{p₁(x)}(eˣ)‖ p₁(x) ,

and since p₀(x) and p₁(x) are unit vectors we have:

‖P_{p₀(x)}(eˣ)‖ = p₀(x) · eˣ = ∫_0^1 eˣ dx = e − 1 ,

and

‖P_{p₁(x)}(eˣ)‖ = p₁(x) · eˣ = ∫_0^1 √3(1 − 2x) eˣ dx = √3(e − 3)   (integrating by parts).
[Figure: f(x) = eˣ and its best approximation by a polynomial of degree 1 on [0, 1].]
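The coefficients c₀ = e − 1 and c₁ = √3(e − 3) can be verified numerically; a sketch, again relying on scipy's quad, that also evaluates the resulting degree-1 approximation of eˣ:

```python
import numpy as np
from scipy.integrate import quad

inner = lambda f, g: quad(lambda x: f(x) * g(x), 0.0, 1.0)[0]

p0 = lambda x: 1.0
p1 = lambda x: np.sqrt(3.0) * (1.0 - 2.0 * x)

c0 = inner(np.exp, p0)   # should equal e - 1
c1 = inner(np.exp, p1)   # should equal sqrt(3) * (e - 3)
print(c0, np.e - 1)
print(c1, np.sqrt(3) * (np.e - 3))

# Best degree-1 approximation of e^x on [0, 1]
approx = lambda x: c0 * p0(x) + c1 * p1(x)
print(approx(0.5), np.exp(0.5))
```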
8.2.2 Approximation by trigonometric polynomials
Trigonometric polynomials are used to approximate periodic functions. By a trigonometric polynomial of degree n we mean a function of the form
t_n(x) = a₀/2 + Σ_{k=1}^{n} (aₖ cos kx + bₖ sin kx) .
Recall that

⟨f, g⟩ = (1/π) ∫_{−π}^{π} f(x) g(x) dx

defines an inner product in the vector space C[−π, π], and according to this definition, the set

B = { 1/√2, cos x, cos 2x, . . . , cos nx, sin x, sin 2x, . . . , sin nx }
is an orthonormal set with respect to the previous inner product. To approximate a continuous 2π-periodic function f(x) by trigonometric polynomials of degree n or less, we have to compute the projection of f(x) on each of the orthonormal vectors in B, which is given by the corresponding dot product; therefore:
P_{1/√2}(f) = ( f · (1/√2) ) (1/√2) = (1/2)(f · 1) = (1/2) · (1/π) ∫_{−π}^{π} f(x) dx = a₀/2 ,

where

a₀ = (1/π) ∫_{−π}^{π} f(x) dx ,

and

aₖ = f · cos kx = (1/π) ∫_{−π}^{π} f(x) cos kx dx ,

bₖ = f · sin kx = (1/π) ∫_{−π}^{π} f(x) sin kx dx ,
for k = 1, . . . , n, which are the coefficients that determine the best (least squares) approximation of f in the vector space C[−π, π]. The aₖ's and bₖ's turn out to be the well-known Fourier coefficients of the 2π-periodic function f.
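These coefficients are straightforward to approximate numerically for any continuous 2π-periodic f; a sketch using scipy's quad, with fourier_coefficients as a hypothetical helper name, sanity-checked on f(x) = cos x:

```python
import numpy as np
from scipy.integrate import quad

def fourier_coefficients(f, n):
    # a_k = (1/pi) * integral of f(x) cos(kx) over [-pi, pi], k = 0, ..., n
    # b_k = (1/pi) * integral of f(x) sin(kx) over [-pi, pi], k = 1, ..., n
    a = [quad(lambda x, k=k: f(x) * np.cos(k * x), -np.pi, np.pi)[0] / np.pi
         for k in range(n + 1)]
    b = [quad(lambda x, k=k: f(x) * np.sin(k * x), -np.pi, np.pi)[0] / np.pi
         for k in range(1, n + 1)]
    return a, b

# Sanity check with f(x) = cos x: a_1 = 1 and every other coefficient vanishes
a, b = fourier_coefficients(np.cos, 3)
print(np.round(a, 6), np.round(b, 6))
```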
Example: Let us find the second, third and fourth order Fourier approximations of the function f(x) = x on [−π, π]. First we compute the coefficients
a₀ = (1/π) ∫_{−π}^{π} x dx = 0   (odd function),

aₖ = (1/π) ∫_{−π}^{π} x cos kx dx = 0   (odd function),

bₖ = (1/π) ∫_{−π}^{π} x sin kx dx = 2(−1)^{k+1}/k   (integrating by parts),

so that

t_n(x) = Σ_{k=1}^{n} ( 2(−1)^{k+1}/k ) sin kx .

In particular, we have:
t₂(x) = 2 sin x − sin 2x ;
t₃(x) = 2 sin x − sin 2x + (2/3) sin 3x ;
t₄(x) = 2 sin x − sin 2x + (2/3) sin 3x − (1/2) sin 4x .

[Figure: the second, third and fourth order Fourier approximations t₂(x), t₃(x) and t₄(x) of f(x) = x on [−π, π].]
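A small sketch evaluating these partial sums for f(x) = x, using the closed-form coefficients bₖ = 2(−1)^{k+1}/k obtained above:

```python
import numpy as np

def t_n(x, n):
    # Fourier approximation of f(x) = x on [-pi, pi]: every a_k vanishes and
    # b_k = 2*(-1)**(k+1)/k, so t_n(x) = sum_k b_k sin(kx)
    return sum(2 * (-1) ** (k + 1) / k * np.sin(k * x) for k in range(1, n + 1))

x = 1.0
for n in (2, 3, 4):
    print(n, t_n(x, n))  # partial sums approach f(x) = x = 1.0 as n grows
```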