
Math 523: Numerical Analysis I

Solution of Homework 4. Numerical Optimization

Problem 1. Program the steepest descent and Newton’s methods using the backtracking line
search algorithm (using either the Wolfe conditions or the Goldstein conditions). Use them to
minimize the Rosenbrock function

F(x, y) = 100(y − x^2)^2 + (1 − x)^2.

Set the initial step size to 1 and print out the step size at each iteration of your algorithms. Test
your algorithms with two initial guesses: (1.2, 1.2) and (−1.2, 1). What happens if you choose step
sizes always equal to 1.0? What are the exact minimizer and minimal value? What are the
convergence rates of these algorithms?
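For reference, both methods need the gradient of F, and Newton's method also needs the Hessian; direct differentiation gives

∇F(x, y) = ( −400x(y − x^2) − 2(1 − x),  200(y − x^2) ),
∇^2 F(x, y) = [ 1200x^2 − 400y + 2,  −400x;  −400x,  200 ].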

Problem 2. Implement the BFGS quasi-Newton method with a line search satisfying the Wolfe
conditions. Check the condition y_k^T s_k > 0 at each iteration. Use your code to minimize
the Rosenbrock function in Problem 1.
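For reference, with s_k = x_{k+1} − x_k and y_k = ∇F(x_{k+1}) − ∇F(x_k), the BFGS update of the approximate Hessian is

B_{k+1} = B_k − (B_k s_k s_k^T B_k) / (s_k^T B_k s_k) + (y_k y_k^T) / (y_k^T s_k).

The curvature condition y_k^T s_k > 0 is what keeps B_{k+1} positive definite, which is why it is checked at every iteration.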

Problem 3. Test your algorithms from the previous two problems, with and without line search,
for minimizing the function
F(x, y) = x^4 + y^2.
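Note that the minimizer here is (0, 0), where the Hessian diag(12x^2, 2) becomes singular, so the usual quadratic convergence of Newton's method is not guaranteed: in the x-component the pure Newton step is

x_{k+1} = x_k − 4x_k^3 / (12x_k^2) = (2/3) x_k,

which is only linear convergence.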

Problem 4. Minimize the following function

F(x, y) = x^4/4 − x^2 + 2x + (y − 1)^2

with the pure Newton's method (α = 1). Explain why the pure Newton's method does not
converge for this problem (Hint: examine the Hessian matrix). What happens if you use the line
search step length selection rules?

Solution. In this homework, we mainly focus on applying different minimization techniques to
some simple test objective functions. From our numerical tests, we can draw some preliminary
conclusions:

• When Newton's method converges, it converges very fast (quadratic convergence
asymptotically).

• Newton's method requires second-order derivatives, which can be difficult, if not impossible,
to obtain. Furthermore, storing the second derivatives requires O(n^2) storage, where n
is the number of variables of the objective function. The steepest descent method and
quasi-Newton methods can be used instead.

• The quasi-Newton method is a good compromise between convergence speed and
complexity. It usually converges fast, and sometimes converges even without step-length
control. The drawback is the high storage requirement.

• The steepest descent method usually does not converge without step-length control unless
we fix the step length α to be sufficiently small. It is a low-complexity, low-storage
method. It is the method of last resort in the MATLAB function fminunc (unconstrained
minimization).

• The initial guess is extremely important for Newton-like methods. How to find a “good
enough” initial guess is an interesting question to explore.

• For non-convex functions, like the one in Problem 4, the Hessian matrix might not always
be positive definite. This is crucial for Newton-like methods because if the Hessian is
not positive definite, the Newton direction might not even be a descent direction, as
discussed in class (the Hessian for Problem 4 is worked out after this list).

• The performance of line search algorithms depends on many parameters, such as the
condition(s) you choose and the constants in those conditions. Different line search
conditions can give different performance for your descent direction methods.
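To make the Problem 4 remark concrete: for F(x, y) = x^4/4 − x^2 + 2x + (y − 1)^2 the Hessian is

∇^2 F(x, y) = [ 3x^2 − 2,  0;  0,  2 ],

which is indefinite whenever x^2 < 2/3, so in that region the pure Newton step need not be a descent direction.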

Here is the MATLAB code for a general descent direction method:

function [xmin, xhist, steplength] = ...
descentdirect(f, grad_f, hessian_f, x0, maxit, tol, dir, line)
%--------------------------------------------------------------------------
% General descent direction method
%--------------------------------------------------------------------------
% f:          objective function handle
% grad_f:     gradient of f
% hessian_f:  Hessian matrix of f
% x0:         initial guess
% maxit:      maximum number of iterations
% tol:        tolerance on the gradient norm
% dir:        1 = steepest descent, 2 = Newton, 3 = quasi-Newton (BFGS)
% line:       1 = fixed step alpha = 1, 2 = line search (Wolfe conditions)
% xmin:       computed minimizer
% xhist:      history of iterates
% steplength: step length used at each iteration
%--------------------------------------------------------------------------

%--------------------------------------------------------------------------
% Parameter
%--------------------------------------------------------------------------
c_1 = 10^-4; c_2 = 0.9;   % constants in the Wolfe conditions
%c_1 = 0.4; c_2 = 0.6;
alpha_max = 16;           % maximum step length for the line search

%--------------------------------------------------------------------------
% Initialization
%--------------------------------------------------------------------------
i = 1; x_k = x0; stop = 1; xsize = length(x0); B_k = eye(xsize);
xhist = zeros(xsize,maxit); xhist(:,i) = x_k;
steplength = zeros(maxit,1);

%--------------------------------------------------------------------------
while stop && i < maxit

%--------------------------------------------------------------------
% Search Direction
%--------------------------------------------------------------------
if (dir == 1) % Steepest descent direction
p_k = steepdir(grad_f,x_k);
elseif (dir == 2) % Newton's direction
p_k = newtondir(grad_f,hessian_f,x_k);
elseif (dir == 3) % Quasi-Newton direction
p_k = qnewtondir(grad_f,B_k,x_k);
end

%--------------------------------------------------------------------
% Step Length
%--------------------------------------------------------------------
if (line == 1) % No stepsize control
alpha = 1;
elseif (line == 2) % Wolfe condition
alpha = linesearch(f, grad_f, p_k, x_k, c_1, c_2, alpha_max);
end
steplength(i) = alpha;

%--------------------------------------------------------------------
% Update
%--------------------------------------------------------------------
x_old = x_k;
x_k = x_k + alpha * p_k;

i = i + 1;
xhist(:,i) = x_k;
if (norm(grad_f(x_k)) < tol) || (norm(x_k - x_old) < 1e-12)
stop = 0;
end

%--------------------------------------------------------------------
% Updating Quasi-Newton matrix
%--------------------------------------------------------------------
if (dir == 3) % Quasi-Newton direction
B_k = bfgs(x_k,x_old,B_k,grad_f);
end

end

xmin = x_k;
steplength = steplength(1:i-1);
xhist = xhist(:,1:i);

%--------------------------------------------------------------------
function p_k = steepdir(grad_f,x_k)
% Steepest descent direction

p_k = - feval(grad_f, x_k);

%--------------------------------------------------------------------
function p_k = newtondir(grad_f,hessian_f,x_k)
% Newton's direction

grad_f_k = feval(grad_f, x_k);
hessian_f_k = feval(hessian_f, x_k);
p_k = - hessian_f_k\grad_f_k;

%--------------------------------------------------------------------
function p_k = qnewtondir(grad_f,B_k,x_k)
% Quasi-Newton direction

grad_f_k = feval(grad_f, x_k);
p_k = -B_k\grad_f_k;

%--------------------------------------------------------------------
function B_k = bfgs(x_k,x_old,B_k,grad_f)
% BFGS update of the approximate Hessian B_k

s_k = x_k - x_old;
y_k = feval(grad_f, x_k) - feval(grad_f, x_old);

% Skip the update if the curvature condition y_k'*s_k > 0 fails,
% so that B_k stays positive definite
if y_k' * s_k <= 0
    return
end

B_k = B_k - (B_k * s_k * s_k' * B_k) / (s_k' * B_k * s_k) + ...
      (y_k * y_k') / (y_k' * s_k);

%--------------------------------------------------------------------
function B_k = sr1(x_k,x_old,B_k,grad_f)
% SR1 update of the approximate Hessian B_k

s_k = x_k - x_old;
y_k = feval(grad_f, x_k) - feval(grad_f, x_old);

% Skip the update if the curvature condition y_k'*s_k > 0 fails,
% as in the BFGS update
if y_k' * s_k <= 0
    return
end

B_k = B_k + ((y_k - B_k * s_k) * (y_k - B_k * s_k)') / ...
      ((y_k - B_k * s_k)' * s_k);
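The helper linesearch called in descentdirect is not listed above. A minimal backtracking sketch, assuming the same argument list, is given below; it enforces only the sufficient-decrease (Armijo) part of the Wolfe conditions and accepts c_2 solely to match the call signature, so it is an illustration rather than the original routine.

function alpha = linesearch(f, grad_f, p_k, x_k, c_1, c_2, alpha_max)
% Backtracking line search enforcing the sufficient-decrease condition
% f(x_k + alpha*p_k) <= f(x_k) + c_1*alpha*grad_f(x_k)'*p_k.
% c_2 (curvature constant) is unused in this simplified sketch.

alpha = alpha_max;
f_k   = feval(f, x_k);
g_k   = feval(grad_f, x_k);
slope = g_k' * p_k;                     % directional derivative along p_k

while feval(f, x_k + alpha * p_k) > f_k + c_1 * alpha * slope
    alpha = alpha / 2;                  % shrink the step
    if alpha < 1e-12                    % give up on pathologically small steps
        break
    end
end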

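As a usage example for Problem 1, the Rosenbrock function, its gradient, and its Hessian can be supplied as anonymous function handles; the handles and the parameter values below (maxit = 200, tol = 1e-8) are illustrative choices, not part of the original listing.

% Rosenbrock function, gradient, and Hessian as anonymous functions
f         = @(x) 100*(x(2) - x(1)^2)^2 + (1 - x(1))^2;
grad_f    = @(x) [-400*x(1)*(x(2) - x(1)^2) - 2*(1 - x(1)); ...
                  200*(x(2) - x(1)^2)];
hessian_f = @(x) [1200*x(1)^2 - 400*x(2) + 2, -400*x(1); ...
                  -400*x(1), 200];

% Newton's method (dir = 2) with the Wolfe line search (line = 2)
x0 = [-1.2; 1];
[xmin, xhist, steplength] = ...
    descentdirect(f, grad_f, hessian_f, x0, 200, 1e-8, 2, 2);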