
Multivariate Optimization Overview

Problem definition
  The unconstrained optimization problem is a generalization of the line search problem

Algorithms
  Cyclic coordinate method
  Steepest descent
  Conjugate gradient algorithms
  PARTAN
  Newton's method
  Levenberg-Marquardt

Concise, subjective summary
  Akin to a blind person trying to find their way to the bottom of a valley in a multidimensional landscape
  We want to reach the bottom with the minimum number of cane taps
  Also vaguely similar to taking core samples for oil prospecting

Problem Definition

Find a vector a* such that

    a* = argmin_a f(a)

Note that there are no constraints on a.

Example: Find the vector of coefficients (w ∈ R^(p×1)) that minimizes the average absolute error of a linear model (a sketch follows).
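As a concrete instance of the example above, here is a minimal MATLAB sketch of that objective; the data matrix X, the targets t, and the sample size are hypothetical:

% Sketch: average absolute error of a linear model (hypothetical data).
% The unconstrained problem is w* = argmin_w mean(|X*w - t|).
X = randn(100,3);                  % 100 examples, p = 3 coefficients
t = X*[1; -2; 0.5] + 0.1*randn(100,1);
f = @(w) mean(abs(X*w - t));       % objective to minimize over w
f([0; 0; 0])                       % evaluate at a starting vector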

Example 1: Optimization Problem

[Figures: contour, quiver, and surface plots of the example objective f(a) for a_1, a_2 ∈ [-5, 5]]

Example 1: MATLAB Code


function [] = OptimizationProblem();
% ==============================================================================
% User-Specified Parameters
% ==============================================================================
x = -5:0.05:5;
y = -5:0.05:5;
% ==============================================================================
% Evaluate the Function
% ==============================================================================
[X,Y] = meshgrid(x,y);
[Z,G] = OptFn(X,Y);
functionName   = 'OptimizationProblem';
fileIdentifier = fopen([functionName '.tex'],'w');
% ==============================================================================
% Contour Map
% ==============================================================================
figure;
FigureSet(2,'Slides');
contour(x,y,Z,50);
xlabel('a_1');
ylabel('a_2');
zoom on;
AxisSet(8);
fileName = sprintf('%s-%s',functionName,'Contour');
print(fileName,'-depsc');
fprintf(fileIdentifier,'%%==============================================================================\n');
fprintf(fileIdentifier,'\\newslide\n');
fprintf(fileIdentifier,'\\stepcounter{exc}\n');
fprintf(fileIdentifier,'\\slideheading{Example \\arabic{exc}: Optimization Problem}\n');
fprintf(fileIdentifier,'%%==============================================================================\n');
fprintf(fileIdentifier,'\\includegraphics[scale=1]{Matlab/%s}\n',fileName);
fprintf(fileIdentifier,'\n');
% ==============================================================================
% Quiver Map
% ==============================================================================
figure;
FigureSet(1,'Slides');
axis([-5 5 -5 5]);
contour(x,y,Z,50);
h = get(gca,'Children');
set(h,'LineWidth',0.2);
hold on;
xCoarse = -5:0.5:5;
yCoarse = -5:0.5:5;
[X,Y] = meshgrid(xCoarse,yCoarse);
[ZCoarse,GCoarse] = OptFn(X,Y);
nr  = length(xCoarse); % Number of coarse grid points
dzx = GCoarse(1:nr,1:nr);
dzy = GCoarse(nr+(1:nr),1:nr);
quiver(xCoarse,yCoarse,dzx,dzy);
hold off;
xlabel('a_1');
ylabel('a_2');
zoom on;
AxisSet(8);
fileName = sprintf('%s-%s',functionName,'Quiver');
print(fileName,'-depsc');
fprintf(fileIdentifier,'%%==============================================================================\n');
fprintf(fileIdentifier,'\\newslide\n');
fprintf(fileIdentifier,'\\slideheading{Example \\arabic{exc}: Optimization Problem}\n');
fprintf(fileIdentifier,'%%==============================================================================\n');
fprintf(fileIdentifier,'\\includegraphics[scale=1]{Matlab/%s}\n',fileName);
fprintf(fileIdentifier,'\n');
% ==============================================================================
% 3D Maps
% ==============================================================================
figure;
set(gcf,'Renderer','zbuffer');
FigureSet(1,'Slides');
h = surf(x,y,Z);
set(h,'LineStyle','None');
xlabel('a_1');
ylabel('a_2');
shading interp;
grid on;
AxisSet(8);
hl = light('Position',[0,0,30]);
set(hl,'Style','Local');
set(h,'BackFaceLighting','unlit');
material dull
for c1 = 1:3
    switch c1
        case 1,
            view(45,10);
        case 2,
            view(-55,22);
        case 3,
            view(-131,10);
        otherwise,
            error('Not implemented.');
    end
    fileName = sprintf('%s-%s%d',functionName,'Surface',c1);
    print(fileName,'-depsc');
    fprintf(fileIdentifier,'%%==============================================================================\n');
    fprintf(fileIdentifier,'\\newslide\n');
    fprintf(fileIdentifier,'\\slideheading{Example \\arabic{exc}: Optimization Problem}\n');
    fprintf(fileIdentifier,'%%==============================================================================\n');
    fprintf(fileIdentifier,'\\includegraphics[scale=1]{Matlab/%s}\n',fileName);
    fprintf(fileIdentifier,'\n');
end
% ==============================================================================
% List the MATLAB Code
% ==============================================================================
fprintf(fileIdentifier,'%%==============================================================================\n');
fprintf(fileIdentifier,'\\newslide\n');
fprintf(fileIdentifier,'\\slideheading{Example \\arabic{exc}: MATLAB Code}\n');
fprintf(fileIdentifier,'%%==============================================================================\n');
fprintf(fileIdentifier,'\t\\matlabcode{Matlab/%s.m}\n',functionName);
fclose(fileIdentifier);

Global Optimization?

In general, all optimization algorithms find a local minimum in as few steps as possible
There are also global optimization algorithms based on ideas such as
  Evolutionary computing
  Genetic algorithms
  Simulated annealing
None of these guarantee convergence in a finite number of iterations
All require a lot of computation

Optimization Comments

Ideally, when we construct models we should favor those which can be optimized with few shallow local minima and reasonable computation
Graphically, you can think of the function to be minimized as the elevation in a complicated high-dimensional landscape
The problem is to find the lowest point
The most common approach is to go downhill
The gradient points in the most uphill direction
The steepest downhill direction is the opposite of the gradient
Most optimization algorithms use a line search algorithm
The methods mostly differ only in the way that the direction of descent is generated
Most of the theory of these algorithms is based on quadratic surfaces
Near local minima, this is a good approximation
Note that the functions should (must) have continuous gradients (almost) everywhere

Optimization Algorithm Outline

The basic steps of these algorithms are as follows (a sketch of the loop follows the list)
1. Pick a starting vector a
2. Find the direction of descent, d
3. Move in that direction until a minimum is found:
       α* := argmin_α f(a + αd)
       a := a + α*d
4. Loop to 2 until convergence
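A minimal MATLAB sketch of this outline, assuming the OptFn and LineSearch conventions used in the example code later in these notes (OptFn returns the function value and gradient; LineSearch returns the step size along d); the starting point and parameters are illustrative:

% Sketch: generic descent loop (OptFn/LineSearch conventions assumed from
% the example code below; starting point and parameters are illustrative).
x = -3; y = 1;                        % 1. pick a starting vector
for cnt = 1:25                        % 4. loop until convergence
    [z,g] = OptFn(x,y);
    d = -g/norm(g);                   % 2. direction of descent (steepest)
    b = LineSearch([x y]',d,0.01,30); % 3. move to the minimum along d
    x = x + b*d(1);
    y = y + b*d(2);
end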

Cyclic Coordinate Method

1. For i = 1 to p,
       a_i := argmin_α f([a_1, a_2, ..., a_{i-1}, α, a_{i+1}, ..., a_p])
2. Loop to 1 until convergence

+ Each line search can be performed semi-globally to avoid shallow local minima
+ Simple to implement
+ Can be used with nominal variables
+ f(a) can be discontinuous
+ No gradient required
- Very slow compared to gradient-based optimization algorithms
- Usually only practical when the number of parameters, p, is small

There are modified versions with faster convergence (a sketch of one cycle follows)
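A minimal sketch of one cycle for p = 2, under the same OptFn/LineSearch assumptions as the example code below; b0 = -1 and ls = 30 mirror that code:

% Sketch: one cycle of the cyclic coordinate method for p = 2 (assumes the
% LineSearch convention of the example code below).
ab = [-3; 1];                    % current parameter vector (column)
for i = 1:2
    d = zeros(2,1);
    d(i) = 1;                    % search along coordinate i only
    b = LineSearch(ab,d,-1,30);  % semi-global line search (b0 = -1, ls = 30)
    ab = ab + b*d;
end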

Example 2: Cyclic Coordinate Method

[Figure: contour plot of the cyclic coordinate search path over X, Y ∈ [-5, 5]]

Example 2: Cyclic Coordinate Method

[Figure: close-up contour plot of the search path near the minimum]

Example 2: Cyclic Coordinate Method

[Figure: function value versus iteration]

Example 2: Cyclic Coordinate Method

[Figure: Euclidean position error versus iteration]

Example 2: Relevant MATLAB Code

function [] = CyclicCoordinate();
% clear all;
close all;

ns = 26;
x  = -3;
y  =  1;
b0 = -1;
ls = 30;

a = zeros(ns,2);
f = zeros(ns,1);

[z,dzx,dzy] = OptFn(x,y);
a(1,:) = [x y];
f(1)   = z;
for cnt = 2:ns,
    if rem(cnt,2)==1,
        d = [1 0]'; % Along x direction
    else
        d = [0 1]'; % Along y direction
    end;
    [b,fmin] = LineSearch([x y]',d,b0,ls);
    x = x + b*d(1);
    y = y + b*d(2);
    a(cnt,:) = [x y];
    f(cnt)   = fmin;
end;
[x,y] = meshgrid(0+(-0.01:0.001:0.01),3+(-0.01:0.001:0.01));
[z,dzx,dzy] = OptFn(x,y);
[zopt,id1] = min(z);
[zopt,id2] = min(zopt);
id1 = id1(id2);
xopt = x(id1,id2);
yopt = y(id1,id2);
[x,y] = meshgrid(1.883+(-0.02:0.001:0.02),-2.963+(-0.02:0.001:0.02));
[z,dzx,dzy] = OptFn(x,y);
[zopt2,id1] = min(z);
[zopt2,id2] = min(zopt2);
id1 = id1(id2);
xopt2 = x(id1,id2);
yopt2 = y(id1,id2);
figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(-5:0.1:5,-5:0.1:5);
z = OptFn(x,y);
contour(x,y,z,50);
h = get(gca,'Children');
set(h,'LineWidth',0.2);
axis('square');
hold on;
h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2);
set(h(2),'LineWidth',0.6);
h = plot(xopt,yopt,'kx',xopt,yopt,'rx');
set(h(1),'LineWidth',1.5);
set(h(2),'LineWidth',0.5);
set(h(1),'MarkerSize',5);
set(h(2),'MarkerSize',4);
hold off;
xlabel('X');
ylabel('Y');
zoom on;
AxisSet(8);
print -depsc CyclicCoordinateContourA;
figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(-1.5+(-2:0.05:2),-1.5+(-2:0.05:2));
[z,dzx,dzy] = OptFn(x,y);
contour(x,y,z,75);
h = get(gca,'Children');
set(h,'LineWidth',0.2);
axis('square');
hold on;
h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2);
set(h(2),'LineWidth',0.6);
hold off;
xlabel('X');
ylabel('Y');
zoom on;
AxisSet(8);
print -depsc CyclicCoordinateContourB;
figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
xerr = (sum(((a - ones(ns,1)*[xopt2 yopt2])').^2)').^(1/2);
h = plot(k-1,xerr,'b');
set(h(1),'Marker','.');
set(h,'MarkerSize',6);
xlabel('Iteration');
ylabel('Euclidean Position Error');
xlim([0 ns-1]);
ylim([0 xerr(1)]);
grid on;
set(gca,'Box','Off');
AxisSet(8);
print -depsc CyclicCoordinatePositionError;
figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
h = plot(k-1,f,'b',[0 ns],zopt*[1 1],'r',[0 ns],zopt2*[1 1],'g');
set(h(1),'Marker','.');
set(h,'MarkerSize',6);
xlabel('Iteration');
ylabel('Function Value');
ylim([0 f(1)]);
xlim([0 ns-1]);
grid on;
set(gca,'Box','Off');
AxisSet(8);
print -depsc CyclicCoordinateErrorLinear;

Steepest Descent

The gradient of the function f(a) is defined as the vector of partial derivatives:

    ∇_a f(a) ≜ [ ∂f(a)/∂a_1   ∂f(a)/∂a_2   ...   ∂f(a)/∂a_p ]^T

It can be shown that the gradient, ∇_a f(a), points in the direction of maximum ascent
The negative of the gradient, -∇_a f(a), points in the direction of maximum descent
A vector d is a direction of descent if there exists a δ such that f(a + αd) < f(a) for all 0 < α < δ
It can also be shown that d is a direction of descent iff (∇_a f(a))^T d < 0 (a numerical check follows below)
The algorithm of steepest descent uses d = -∇_a f(a)
The most fundamental of all algorithms for minimizing a continuously differentiable function

Steepest Descent

+ Very stable algorithm
- Can converge very slowly once near the local minima where the surface is approximately quadratic
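The iff condition above is easy to check numerically; a minimal sketch for a hypothetical quadratic f(a) = 0.5 a^T Q a, whose gradient is Qa:

% Sketch: verify that (grad f)'*d < 0 for the steepest descent direction,
% using a hypothetical quadratic f(a) = 0.5*a'*Q*a with gradient Q*a.
Q = [3 1; 1 2];
gradf = @(a) Q*a;        % exact gradient of the quadratic
a = [2; -1];
d = -gradf(a);           % steepest descent direction
gradf(a)'*d              % negative, so d is a direction of descent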

Example 3: Steepest Descent

[Figure: contour plot of the search path over X, Y ∈ [-5, 5]]

Example 3: Steepest Descent

[Figure: close-up contour plot of the zig-zag path near the local minimum]

Example 3: Steepest Descent

[Figure: Euclidean position error versus iteration]

Example 3: Steepest Descent

[Figure: function value versus iteration]

Example 3: Relevant MATLAB Code

function [] = SteepestDescent();
% clear all;
close all;

ns = 26;
x  = -3;
y  =  1;
b0 = 0.01;
ls = 30;

a = zeros(ns,2);
f = zeros(ns,1);
[z,g] = OptFn(x,y);
a(1,:) = [x y];
f(1)   = z;
d      = -g/norm(g);
for cnt = 2:ns,
    [b,fmin] = LineSearch([x y]',d,b0,ls);
    x = x + b*d(1);
    y = y + b*d(2);
    [z,g] = OptFn(x,y);
    d = -g;
    d = d/norm(d);
    a(cnt,:) = [x y];
    f(cnt)   = z;
end;
[x,y] = meshgrid(0+(-0.01:0.001:0.01),3+(-0.01:0.001:0.01));
[z,dzx,dzy] = OptFn(x,y);
[zopt,id1] = min(z);
[zopt,id2] = min(zopt);
id1 = id1(id2);
xopt = x(id1,id2);
yopt = y(id1,id2);
[x,y] = meshgrid(1.883+(-0.02:0.001:0.02),-2.963+(-0.02:0.001:0.02));
[z,dzx,dzy] = OptFn(x,y);
[zopt2,id1] = min(z);
[zopt2,id2] = min(zopt2);
id1 = id1(id2);
xopt2 = x(id1,id2);
yopt2 = y(id1,id2);
[zopt zopt2]
figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(-5:0.1:5,-5:0.1:5);
z = OptFn(x,y);
contour(x,y,z,50);
h = get(gca,'Children');
set(h,'LineWidth',0.2);
axis('square');
hold on;
h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2);
set(h(2),'LineWidth',0.6);
h = plot(xopt,yopt,'kx',xopt,yopt,'rx');
set(h(1),'LineWidth',1.5);
set(h(2),'LineWidth',0.5);
set(h(1),'MarkerSize',5);
set(h(2),'MarkerSize',4);
hold off;
xlabel('X');
ylabel('Y');
zoom on;
AxisSet(8);
print -depsc SteepestDescentContourA;
figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(-1.6+(-0.5:0.01:0.5),-1.7+(-0.5:0.01:0.5));
z = OptFn(x,y);
contour(x,y,z,75);
h = get(gca,'Children');
set(h,'LineWidth',0.2);
axis('square');
hold on;
h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2);
set(h(2),'LineWidth',0.6);
hold off;
xlabel('X');
ylabel('Y');
zoom on;
AxisSet(8);
print -depsc SteepestDescentContourB;
figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
xerr = (sum(((a - ones(ns,1)*[xopt2 yopt2])').^2)').^(1/2);
h = plot(k-1,xerr,'b');
set(h(1),'Marker','.');
set(h,'MarkerSize',6);
xlabel('Iteration');
ylabel('Euclidean Position Error');
xlim([0 ns-1]);
ylim([0 xerr(1)]);
grid on;
set(gca,'Box','Off');
AxisSet(8);
print -depsc SteepestDescentPositionError;
figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
h = plot(k-1,f,'b',[0 ns],zopt*[1 1],'r',[0 ns],zopt2*[1 1],'g');
set(h(1),'Marker','.');
set(h,'MarkerSize',6);
xlabel('Iteration');
ylabel('Function Value');
ylim([0 f(1)]);
xlim([0 ns-1]);
grid on;
set(gca,'Box','Off');
AxisSet(8);
print -depsc SteepestDescentErrorLinear;

Conjugate Gradient Algorithms

1. Take a steepest descent step
2. For i = 2 to p
       α := argmin_α f(a + αd)
       a := a + αd
       g_i := ∇f(a)
       β := (g_i^T g_i) / (g_{i-1}^T g_{i-1})
       d := -g_i + β d_{i-1}
3. Loop to 1 until convergence

Based on quadratic approximations of f
Called the Fletcher-Reeves method (a sketch of the update follows)
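A minimal sketch of the Fletcher-Reeves update, under the same OptFn/LineSearch assumptions as the example code below; the starting point and parameters are illustrative:

% Sketch: Fletcher-Reeves conjugate gradient (OptFn/LineSearch conventions
% assumed from the example code below; parameters are illustrative).
x = -3; y = 1; b0 = 0.01; ls = 30;
[z,g] = OptFn(x,y);
d = -g/norm(g);                 % first step: steepest descent
for i = 2:10
    b = LineSearch([x y]',d,b0,ls);
    x = x + b*d(1);
    y = y + b*d(2);
    go = g;                     % previous gradient
    [z,g] = OptFn(x,y);
    beta = (g'*g)/(go'*go);     % Fletcher-Reeves beta
    d = -g + beta*d;            % new conjugate direction
end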

Example 4: Fletcher-Reeves Conjugate Gradient

[Figure: contour plot of the search path over X, Y ∈ [-5, 5]]

Example 4: Fletcher-Reeves Conjugate Gradient

[Figure: close-up contour plot of the path near the minimum at X ≈ 1.9, Y ≈ -3.0]

Example 4: Fletcher-Reeves Conjugate Gradient

[Figure: Euclidean position error versus iteration]

Example 4: Fletcher-Reeves Conjugate Gradient

[Figure: function value versus iteration]

Example 4: Relevant MATLAB Code

function [] = FletcherReeves();
% clear all;
close all;

ns = 26;
x  = -3;
y  =  1;
b0 = 0.01;
ls = 30;

a = zeros(ns,2);
f = zeros(ns,1);
[z,g] = OptFn(x,y);
a(1,:) = [x y];
f(1)   = z;
d = -g/norm(g); % First direction
for cnt = 2:ns,
    [b,fmin] = LineSearch([x y]',d,b0,ls);
    x = x + b*d(1);
    y = y + b*d(2);
    go = g; % Old gradient
    [z,g] = OptFn(x,y);
    beta = (g'*g)/(go'*go);
    d = -g + beta*d;
    a(cnt,:) = [x y];
    f(cnt)   = z;
end;
[x,y] = meshgrid(0+(-0.01:0.001:0.01),3+(-0.01:0.001:0.01));
[z,dzx,dzy] = OptFn(x,y);
[zopt,id1] = min(z);
[zopt,id2] = min(zopt);
id1 = id1(id2);
xopt = x(id1,id2);
yopt = y(id1,id2);
[x,y] = meshgrid(1.883+(-0.02:0.001:0.02),-2.963+(-0.02:0.001:0.02));
[z,dzx,dzy] = OptFn(x,y);
[zopt2,id1] = min(z);
[zopt2,id2] = min(zopt2);
id1 = id1(id2);
xopt2 = x(id1,id2);
yopt2 = y(id1,id2);
figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(-5:0.1:5,-5:0.1:5);
z = OptFn(x,y);
contour(x,y,z,50);
h = get(gca,'Children');
set(h,'LineWidth',0.2);
axis('square');
hold on;
h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2);
set(h(2),'LineWidth',0.6);
h = plot(xopt,yopt,'kx',xopt,yopt,'rx');
set(h(1),'LineWidth',1.5);
set(h(2),'LineWidth',0.5);
set(h(1),'MarkerSize',5);
set(h(2),'MarkerSize',4);
hold off;
xlabel('X');
ylabel('Y');
zoom on;
AxisSet(8);
print -depsc FletcherReevesContourA;
figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(1.5:0.01:2.5,-3.5:0.01:-2.5);
z = OptFn(x,y);
contour(x,y,z,75);
h = get(gca,'Children');
set(h,'LineWidth',0.2);
axis('square');
hold on;
h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2);
set(h(2),'LineWidth',0.6);
hold off;
xlabel('X');
ylabel('Y');
zoom on;
AxisSet(8);
print -depsc FletcherReevesContourB;
figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
xerr = (sum(((a - ones(ns,1)*[xopt2 yopt2])').^2)').^(1/2);
h = plot(k-1,xerr,'b');
set(h(1),'Marker','.');
set(h,'MarkerSize',6);
xlabel('Iteration');
ylabel('Euclidean Position Error');
xlim([0 ns-1]);
ylim([0 xerr(1)]);
grid on;
set(gca,'Box','Off');
AxisSet(8);
print -depsc FletcherReevesPositionError;
figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
h = plot(k-1,f,'b',[0 ns],zopt*[1 1],'r',[0 ns],zopt2*[1 1],'g');
set(h(1),'Marker','.');
set(h,'MarkerSize',6);
xlabel('Iteration');
ylabel('Function Value');
ylim([0 f(1)]);
xlim([0 ns-1]);
grid on;
set(gca,'Box','Off');
AxisSet(8);
print -depsc FletcherReevesErrorLinear;

Conjugate Gradient Algorithms Continued

There is also a variant called Polak-Ribiere where

    β := ((g_i - g_{i-1})^T g_i) / (g_{i-1}^T g_{i-1})

+ Only requires the gradient
+ Converges in a finite number of steps when f(a) is quadratic and perfect line searches are used
- Less stable numerically than steepest descent
- Sensitive to inexact line searches
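In the example code below this changes only the beta line of the Fletcher-Reeves loop; a one-line sketch under the same assumptions:

beta = ((g - go)'*g)/(go'*go); % Polak-Ribiere (Fletcher-Reeves uses (g'*g)/(go'*go))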

Example 5: Polak-Ribiere Conjugate Gradient

[Figure: contour plot of the search path over X, Y ∈ [-5, 5]]

Example 5: Polak-Ribiere Conjugate Gradient

[Figure: close-up contour plot of the path near the minimum at X ≈ 1.9, Y ≈ -3.0]

Example 5: Polak-Ribiere Conjugate Gradient

[Figure: Euclidean position error versus iteration]

Example 5: Polak-Ribiere Conjugate Gradient

[Figure: function value versus iteration]

Example 5: MATLAB Code

function [] = PolakRibiere();
% clear all;
close all;

ns = 26;
x  = -3;
y  =  1;
b0 = 0.01;
ls = 30;

a = zeros(ns,2);
f = zeros(ns,1);
[z,g] = OptFn(x,y);
a(1,:) = [x y];
f(1)   = z;
d = -g/norm(g); % First direction
for cnt = 2:ns,
    [b,fmin] = LineSearch([x y]',d,b0,ls);
    x = x + b*d(1);
    y = y + b*d(2);
    go = g; % Old gradient
    [z,g] = OptFn(x,y);
    beta = ((g - go)'*g)/(go'*go);
    d = -g + beta*d;
    a(cnt,:) = [x y];
    f(cnt)   = z;
end;
[x,y] = meshgrid(0+(-0.01:0.001:0.01),3+(-0.01:0.001:0.01));
[z,dzx,dzy] = OptFn(x,y);
[zopt,id1] = min(z);
[zopt,id2] = min(zopt);
id1 = id1(id2);
xopt = x(id1,id2);
yopt = y(id1,id2);
[x,y] = meshgrid(1.883+(-0.02:0.001:0.02),-2.963+(-0.02:0.001:0.02));
[z,dzx,dzy] = OptFn(x,y);
[zopt2,id1] = min(z);
[zopt2,id2] = min(zopt2);
id1 = id1(id2);
xopt2 = x(id1,id2);
yopt2 = y(id1,id2);
figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(-5:0.1:5,-5:0.1:5);
z = OptFn(x,y);
contour(x,y,z,50);
h = get(gca,'Children');
set(h,'LineWidth',0.2);
axis('square');
hold on;
h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2);
set(h(2),'LineWidth',0.6);
h = plot(xopt,yopt,'kx',xopt,yopt,'rx');
set(h(1),'LineWidth',1.5);
set(h(2),'LineWidth',0.5);
set(h(1),'MarkerSize',5);
set(h(2),'MarkerSize',4);
hold off;
xlabel('X');
ylabel('Y');
zoom on;
AxisSet(8);
print -depsc PolakRibiereContourA;
figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(1.5:0.01:2.5,-3.5:0.01:-2.5);
z = OptFn(x,y);
contour(x,y,z,75);
h = get(gca,'Children');
set(h,'LineWidth',0.2);
axis('square');
hold on;
h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2);
set(h(2),'LineWidth',0.6);
hold off;
xlabel('X');
ylabel('Y');
zoom on;
AxisSet(8);
print -depsc PolakRibiereContourB;
figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
xerr = (sum(((a - ones(ns,1)*[xopt2 yopt2])').^2)').^(1/2);
h = plot(k-1,xerr,'b');
set(h(1),'Marker','.');
set(h,'MarkerSize',6);
xlabel('Iteration');
ylabel('Euclidean Position Error');
xlim([0 ns-1]);
ylim([0 xerr(1)]);
grid on;
set(gca,'Box','Off');
AxisSet(8);
print -depsc PolakRibierePositionError;
figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
h = plot(k-1,f,'b',[0 ns],zopt*[1 1],'r',[0 ns],zopt2*[1 1],'g');
set(h(1),'Marker','.');
set(h,'MarkerSize',6);
xlabel('Iteration');
ylabel('Function Value');
ylim([0 f(1)]);
xlim([0 ns-1]);
grid on;
set(gca,'Box','Off');
AxisSet(8);
print -depsc PolakRibiereErrorLinear;

Parallel Tangents (PARTAN)

1. First gradient step
       d := -∇f(a)
       α := argmin_α f(a + αd)
       s_p := αd
       a := a + s_p
2. Gradient step
       d_g := -∇f(a)
       α := argmin_α f(a + αd_g)
       s_g := αd_g
       a := a + s_g
3. Conjugate step
       d_p := s_p + s_g
       α := argmin_α f(a + αd_p)
       s_p := αd_p
       a := a + s_p
4. Loop to 2 until convergence

PARTAN Concept

[Diagram: PARTAN search path through points a_0, a_1, ..., a_7]

First two steps are steepest descent
Thereafter, each iteration consists of two steps (a sketch follows the example figure)
1. Search along the direction
       d_i = a_i - a_{i-2}
   where a_i is the current point and a_{i-2} is the point from two steps ago
2. Search in the direction of the negative gradient
       d_i = -∇f(a_i)

Example 6: PARTAN

[Figure: contour plot of the PARTAN search path over X, Y ∈ [-5, 5]]
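A minimal sketch of one PARTAN iteration (gradient step, then conjugate step), under the same OptFn/LineSearch assumptions as the example code below; xa, ya hold the point from two steps ago, and all values are illustrative:

% Sketch: one PARTAN iteration (OptFn/LineSearch conventions assumed from
% the example code below; xa, ya hold the point from two steps ago).
x = -3; y = 1; xa = x; ya = y; b0 = 0.01; ls = 30; % illustrative values
[z,g] = OptFn(x,y);
d = -g/norm(g);                    % gradient step
bg = LineSearch([x y]',d,b0,ls);
xg = x + bg*d(1);
yg = y + bg*d(2);
d = [xg - xa; yg - ya];            % conjugate step along a_i - a_{i-2}
d = d/norm(d);
bp = LineSearch([xg yg]',d,b0,ls);
x = xg + bp*d(1);
y = yg + bp*d(2);
xa = xg; ya = yg;                  % anchor point for the next iteration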

Example 6: PARTAN

[Figure: close-up contour plot of the path near the minimum at X ≈ 1.9, Y ≈ -3.0]

Example 6: PARTAN

[Figure: function value versus iteration]

Example 6: PARTAN

[Figure: Euclidean position error versus iteration]

Example 6: MATLAB Code

function [] = Partan();
% clear all;
close all;

ns = 26;
x  = -3;
y  =  1;
b0 = 0.01;
ls = 30;

a = zeros(ns,2);
f = zeros(ns,1);

[z,g] = OptFn(x,y);
a(1,:) = [x y];
f(1)   = z;
xa     = x;
ya     = y;

% First step - substitute for a conjugate step
d = -g/norm(g); % First direction
[bp,fmin] = LineSearch([x y]',d,b0,100);
x = x + bp*d(1); % Stand-in for a conjugate step
y = y + bp*d(2);
a(2,:) = [x y];
f(2)   = fmin;

cnt = 2;
while cnt < ns,
    % Gradient step
    [z,g] = OptFn(x,y);
    d = -g/norm(g); % Direction
    [bg,fmin] = LineSearch([x y]',d,b0,ls);
    xg = x + bg*d(1);
    yg = y + bg*d(2);
    cnt = cnt + 1;
    a(cnt,:) = [xg yg];
    f(cnt)   = OptFn(xg,yg);
    fprintf('G : %d %5.3f\n',cnt,f(cnt));
    if cnt == ns,
        break;
    end;
    % Conjugate step
    d = [xg - xa yg - ya]';
    if norm(d) ~= 0,
        d = d/norm(d);
        [bp,fmin] = LineSearch([xg yg]',d,b0,ls);
    else
        bp = 0;
    end;
    if bp > 0, % Line search in conjugate direction was successful
        fprintf('P : ');
        x = xg + bp*d(1);
        y = yg + bp*d(2);
    else
        % Could not move - do another gradient update
        cnt = cnt + 1;
        a(cnt,:) = a(cnt-1,:);
        f(cnt)   = f(cnt-1);
        if cnt == ns,
            break;
        end;
        fprintf('G2: ');
        [z,g] = OptFn(xg,yg);
        d = -g/norm(g); % Direction
        [bp,fmin] = LineSearch([xg yg]',d,b0,ls);
        x = xg + bp*d(1);
        y = yg + bp*d(2);
    end;
    % Update anchor point
    xa = xg;
    ya = yg;
    cnt = cnt + 1;
    a(cnt,:) = [x y];
    f(cnt)   = OptFn(x,y);
    fprintf('%d %5.3f\n',cnt,f(cnt));
end;
[x,y] = meshgrid(0+(-0.01:0.001:0.01),3+(-0.01:0.001:0.01));
[z,dzx,dzy] = OptFn(x,y);
[zopt,id1] = min(z);
[zopt,id2] = min(zopt);
id1 = id1(id2);
xopt = x(id1,id2);
yopt = y(id1,id2);
[x,y] = meshgrid(1.883+(-0.02:0.001:0.02),-2.963+(-0.02:0.001:0.02));
[z,dzx,dzy] = OptFn(x,y);
[zopt2,id1] = min(z);
[zopt2,id2] = min(zopt2);
id1 = id1(id2);
xopt2 = x(id1,id2);
yopt2 = y(id1,id2);
figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(-5:0.1:5,-5:0.1:5);
z = OptFn(x,y);
contour(x,y,z,50);
h = get(gca,'Children');
set(h,'LineWidth',0.2);
axis('square');
hold on;
h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2);
set(h(2),'LineWidth',0.6);
h = plot(xopt,yopt,'kx',xopt,yopt,'rx');
set(h(1),'LineWidth',1.5);
set(h(2),'LineWidth',0.5);
set(h(1),'MarkerSize',5);
set(h(2),'MarkerSize',4);
hold off;
xlabel('X');
ylabel('Y');
zoom on;
AxisSet(8);
print -depsc PartanContourA;
figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(1.5:0.01:2.5,-3.5:0.01:-2.5);
z = OptFn(x,y);
contour(x,y,z,75);
h = get(gca,'Children');
set(h,'LineWidth',0.2);
axis('square');
hold on;
h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2);
set(h(2),'LineWidth',0.6);
hold off;
xlabel('X');
ylabel('Y');
zoom on;
AxisSet(8);
print -depsc PartanContourB;
figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
xerr = (sum(((a - ones(ns,1)*[xopt2 yopt2])').^2)').^(1/2);
h = plot(k-1,xerr,'b');
set(h(1),'Marker','.');
set(h,'MarkerSize',6);
xlabel('Iteration');
ylabel('Euclidean Position Error');
xlim([0 ns-1]);
ylim([0 xerr(1)]);
grid on;
set(gca,'Box','Off');
AxisSet(8);
print -depsc PartanPositionError;
figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
h = plot(k-1,f,'b',[0 ns],zopt*[1 1],'r',[0 ns],zopt2*[1 1],'g');
set(h(1),'Marker','.');
set(h,'MarkerSize',6);
xlabel('Iteration');
ylabel('Function Value');
ylim([0 f(1)]);
xlim([0 ns-1]);
grid on;
set(gca,'Box','Off');
AxisSet(8);
print -depsc PartanErrorLinear;

PARTAN Pros and Cons

+ For quadratic functions, converges in a finite number of steps
+ Easier to implement than 2nd order methods
+ Can be used with a large number of parameters
+ Each (composite) step is at least as good as steepest descent
+ Tolerant of inexact line searches
- Each (composite) step requires two line searches

Newton's Method

    a_{k+1} = a_k - H(a_k)^{-1} ∇f(a_k)

where ∇f(a_k) is the gradient and H(a_k) is the Hessian of f(a),

    H(a_k) ≜ [ ∂²f(a)/∂a_1²      ∂²f(a)/∂a_1∂a_2   ...   ∂²f(a)/∂a_1∂a_p
               ∂²f(a)/∂a_2∂a_1   ∂²f(a)/∂a_2²      ...   ∂²f(a)/∂a_2∂a_p
               ...               ...               ...   ...
               ∂²f(a)/∂a_p∂a_1   ∂²f(a)/∂a_p∂a_2   ...   ∂²f(a)/∂a_p²    ]

Based on a quadratic approximation of the function f(a)
If f(a) is quadratic, converges in one step
If H(a) is positive-definite, the problem is well defined near local minima where f(a) is nearly quadratic
(A sketch of a safeguarded Newton step follows the example figure)

Example 7: Newton's with Steepest Descent Safeguard

[Figure: contour plot of the search path over X, Y ∈ [-5, 5]]
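A minimal sketch of the safeguarded step for a hypothetical objective with hand-coded derivatives; the function f(a) = a_1^4 + a_1 a_2 + a_2^2 is purely illustrative:

% Sketch: one Newton step with a steepest-descent safeguard for the
% hypothetical objective f(a) = a1^4 + a1*a2 + a2^2.
a = [2; -1];
g = [4*a(1)^3 + a(2); a(1) + 2*a(2)];  % gradient
H = [12*a(1)^2, 1; 1, 2];              % Hessian
d = -H\g;                              % Newton direction
if d'*g > 0                            % not a direction of descent?
    d = -g;                            % revert to steepest descent
end
a = a + d;                             % take the step (unit step size)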

Example 7: Newton's with Steepest Descent Safeguard

[Figure: close-up contour plot of the search path]

Example 7: Newton's with Steepest Descent Safeguard

[Figure: function value versus iteration]

Example 7: Newton's with Steepest Descent Safeguard

[Figure: Euclidean position error versus iteration]

Example 7: Relevant MATLAB Code

function [] = Newtons();
% clear all;
close all;

ns = 100;
x  = -3; % Starting x
y  =  1; % Starting y
b0 =  1;

a = zeros(ns,2);
f = zeros(ns,1);

[z,g,H] = OptFn(x,y);
a(1,:) = [x y];
f(1)   = z;

for cnt = 2:ns,
    d = -inv(H)*g;
    if d'*g > 0, % Revert to steepest descent if not a direction of descent
        % fprintf('(%2d of %2d) Min. Eig:%5.3f Reverting...\n',cnt,ns,min(eig(H)));
        d = -g;
    end;
    d = d/norm(d);
    [b,fmin] = LineSearch([x y]',d,b0,100);
    % a(cnt,:) = (a(cnt-1,:)' - inv(H)*g)'; % Pure Newton's method
    x = x + b*d(1);
    y = y + b*d(2);
    [z,g,H] = OptFn(x,y);
    a(cnt,:) = [x y];
    f(cnt)   = z;
end;
[x,y] = meshgrid(0+(-0.01:0.001:0.01),3+(-0.01:0.001:0.01));
[z,dzx,dzy] = OptFn(x,y);
[zopt,id1] = min(z);
[zopt,id2] = min(zopt);
id1 = id1(id2);
xopt = x(id1,id2);
yopt = y(id1,id2);
[x,y] = meshgrid(1.883+(-0.02:0.001:0.02),-2.963+(-0.02:0.001:0.02));
[z,dzx,dzy] = OptFn(x,y);
[zopt2,id1] = min(z);
[zopt2,id2] = min(zopt2);
id1 = id1(id2);
xopt2 = x(id1,id2);
yopt2 = y(id1,id2);
figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(-5:0.1:5,-5:0.1:5);
z = OptFn(x,y);
contour(x,y,z,50);
h = get(gca,'Children');
set(h,'LineWidth',0.2);
axis('square');
hold on;
h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2);
set(h(2),'LineWidth',0.6);
h = plot(xopt,yopt,'kx',xopt,yopt,'rx');
set(h(1),'LineWidth',1.5);
set(h(2),'LineWidth',0.5);
set(h(1),'MarkerSize',5);
set(h(2),'MarkerSize',4);
hold off;
xlabel('X');
ylabel('Y');
zoom on;
AxisSet(8);
print -depsc NewtonsContourA;
figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(1.0+(-1:0.02:1),-2.4+(-1:0.02:1));
z = OptFn(x,y);
contour(x,y,z,75);
h = get(gca,'Children');
set(h,'LineWidth',0.2);
axis('square');
hold on;
h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2);
set(h(2),'LineWidth',0.6);
hold off;
xlabel('X');
ylabel('Y');
zoom on;
AxisSet(8);
print -depsc NewtonsContourB;
figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
xerr = (sum(((a - ones(ns,1)*[xopt2 yopt2])').^2)').^(1/2);
h = plot(k-1,xerr,'b');
set(h(1),'Marker','.');
set(h,'MarkerSize',6);
xlabel('Iteration');
ylabel('Euclidean Position Error');
xlim([0 ns-1]);
ylim([0 xerr(1)]);
grid on;
set(gca,'Box','Off');
AxisSet(8);
print -depsc NewtonsPositionError;
figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
h = plot(k-1,f,'b',[0 ns],zopt*[1 1],'r',[0 ns],zopt2*[1 1],'g');
set(h(1),'Marker','.');
set(h,'MarkerSize',6);
xlabel('Iteration');
ylabel('Function Value');
ylim([0 f(1)]);
xlim([0 ns-1]);
grid on;
set(gca,'Box','Off');
AxisSet(8);
print -depsc NewtonsErrorLinear;

Newton's Method Pros and Cons

    a_{k+1} = a_k - H(a_k)^{-1} ∇f(a_k)

+ Very fast convergence near local minima
- Not guaranteed to converge (may actually diverge)
- Requires the p × p Hessian
- Requires a p × p matrix inverse that uses O(p³) operations

Levenberg-Marquardt

1. Determine if μ_k I + H(a_k) is positive definite. If not, μ_k := 4μ_k and repeat.
2. Solve the following equation for a_{k+1} (a sketch follows):
       [μ_k I + H(a_k)] (a_{k+1} - a_k) = -∇f(a_k)
3. Compute
       r_k := (f(a_k) - f(a_{k+1})) / (q(a_k) - q(a_{k+1}))
   where q(a) is the quadratic approximation of f(a) based on f(a_k), ∇f(a_k), and H(a_k)
4. If r_k < 0.25, then μ_{k+1} := 4μ_k
   If r_k > 0.75, then μ_{k+1} := μ_k/2
   If r_k ≤ 0, then a_{k+1} := a_k
5. If not converged, k := k + 1 and loop to 1.
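A minimal sketch of steps 1-2, mirroring the example code below (eta plays the role of μ; g and H are assumed already evaluated at the current point a, a 2-vector as in the example):

% Sketch: Levenberg-Marquardt update, steps 1-2 (names mirror the example
% code below; g and H are assumed evaluated at the current point a).
eta = 1e-4;
while min(eig(eta*eye(2) + H)) <= 0  % 1. force positive definiteness
    eta = 4*eta;
end
aNew = a - (eta*eye(2) + H)\g;       % 2. solve [eta*I + H](aNew - a) = -g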

Levenberg-Marquardt Comments

Similar to Newton's method
Has safety provisions for regions where the quadratic approximation is inappropriate
Compare
    Newton's:  a_{k+1} = a_k - H(a_k)^{-1} ∇f(a_k)
    LM:        [μ_k I + H(a_k)] (a_{k+1} - a_k) = -∇f(a_k)
If μ = 0, these are equivalent
If μ → ∞, a_{k+1} → a_k
μ is chosen to ensure that the smallest eigenvalue of μI + H(a_k) is positive and sufficiently large

Example 8: Levenberg-Marquardt Conjugate Gradient

[Figure: contour plot of the search path over X, Y ∈ [-5, 5]]

Example 8: Levenberg-Marquardt Conjugate Gradient

[Figure: close-up contour plot of the path near the minimum at X ≈ 1.9, Y ≈ -3.0]

Example 8: Levenberg-Marquardt Conjugate Gradient

[Figure: function value versus iteration]

Example 8: Levenberg-Marquardt Conjugate Gradient

[Figure: Euclidean position error versus iteration]

Example 8: Relevant MATLAB Code

function [] = LevenbergMarquardt();
% clear all;
close all;

ns  = 26;
x   = -3; % Starting x
y   =  1; % Starting y
eta = 0.0001;

a = zeros(ns,2);
f = zeros(ns,1);

[zn,g,H] = OptFn(x,y);
a(1,:) = [x y];
f(1)   = zn;
ap = [x y]'; % Previous point

for cnt = 2:ns,
    [zn,g,H] = OptFn(x,y);
    while min(eig(eta*eye(2)+H)) < 0,
        eta = eta*4;
    end;
    a(cnt,:) = (ap - inv(eta*eye(2)+H)*g)';
    x = a(cnt,1);
    y = a(cnt,2);
    zo = zn; % Old function value
    zn = OptFn(x,y);
    xd = (a(cnt,:)' - ap);
    qo = zo;
    qn = zn + g'*xd + 0.5*xd'*H*xd;
    if qo == qn, % Test for convergence
        x = a(cnt,1);
        y = a(cnt,2);
        a(cnt:ns,:) = ones(ns-cnt+1,1)*[x y];
        f(cnt:ns,:) = OptFn(x,y);
        break;
    end;
    r = (zo - zn)/(qo - qn);
    if r < 0.25,
        eta = eta*4;
    elseif r > 0.50, % 0.75 is recommended, but much slower
        eta = eta/2;
    end;
    if zn > zo, % Back up
        a(cnt,:) = a(cnt-1,:);
    else
        ap = a(cnt,:)';
    end;
    x = a(cnt,1);
    y = a(cnt,2);
    a(cnt,:) = [x y];
    f(cnt)   = OptFn(x,y);
    % disp([cnt a(cnt,:) f(cnt) r eta])
end;
[x,y] = meshgrid(0+(-0.01:0.001:0.01),3+(-0.01:0.001:0.01));
[z,dzx,dzy] = OptFn(x,y);
[zopt,id1] = min(z);
[zopt,id2] = min(zopt);
id1 = id1(id2);
xopt = x(id1,id2);
yopt = y(id1,id2);
[x,y] = meshgrid(1.883+(-0.02:0.001:0.02),-2.963+(-0.02:0.001:0.02));
[z,dzx,dzy] = OptFn(x,y);
[zopt2,id1] = min(z);
[zopt2,id2] = min(zopt2);
id1 = id1(id2);
xopt2 = x(id1,id2);
yopt2 = y(id1,id2);
figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(-5:0.1:5,-5:0.1:5);
z = OptFn(x,y);
contour(x,y,z,50);
h = get(gca,'Children');
set(h,'LineWidth',0.2);
axis('square');
hold on;
h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2);
set(h(2),'LineWidth',0.6);
h = plot(xopt,yopt,'kx',xopt,yopt,'rx');
set(h(1),'LineWidth',1.5);
set(h(2),'LineWidth',0.5);
set(h(1),'MarkerSize',5);
set(h(2),'MarkerSize',4);
hold off;
xlabel('X');
ylabel('Y');
zoom on;
AxisSet(8);
print -depsc LevenbergMarquardtContourA;
figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(1.5:0.01:2.5,-3.5:0.01:-2.5);
z = OptFn(x,y);
contour(x,y,z,75);
h = get(gca,'Children');
set(h,'LineWidth',0.2);
axis('square');
hold on;
h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2);
set(h(2),'LineWidth',0.6);
hold off;
xlabel('X');
ylabel('Y');
zoom on;
AxisSet(8);
print -depsc LevenbergMarquardtContourB;
figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
xerr = (sum(((a - ones(ns,1)*[xopt2 yopt2])').^2)').^(1/2);
h = plot(k-1,xerr,'b');
set(h(1),'Marker','.');
set(h,'MarkerSize',6);
xlabel('Iteration');
ylabel('Euclidean Position Error');
xlim([0 ns-1]);
ylim([0 xerr(1)]);
grid on;
set(gca,'Box','Off');
AxisSet(8);
print -depsc LevenbergMarquardtPositionError;
figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
h = plot(k-1,f,'b',[0 ns],zopt*[1 1],'r',[0 ns],zopt2*[1 1],'g');
set(h(1),'Marker','.');
set(h,'MarkerSize',6);
xlabel('Iteration');
ylabel('Function Value');
ylim([0 f(1)]);
xlim([0 ns-1]);
grid on;
set(gca,'Box','Off');
AxisSet(8);
print -depsc LevenbergMarquardtErrorLinear;

Levenberg-Marquardt Pros and Cons

    [μ_k I + H(a_k)] (a_{k+1} - a_k) = -∇f(a_k)

Many equivalent formulations
+ No line search required
+ Can be used with approximations to the Hessian
+ Extremely fast convergence (2nd order)
- Requires gradient and Hessian (or approximate Hessian)
- Requires O(p³) operations for each solution to the key equation

Optimization Algorithm Summary

Algorithm             Convergence   Stable   ∇f(a)   H(a)   LS
Cyclic Coordinate     Slow          Y        N       N      Y
Steepest Descent      Slow          Y        Y       N      Y
Conjugate Gradient    Fast          N        Y       N      Y
PARTAN                Fast          Y        Y       N      Y
Newton's Method       Very Fast     N        Y       Y      N
Levenberg-Marquardt   Very Fast     Y        Y       Y      N
