
Math 523: Numerical Analysis I

Solution of Homework 4. Numerical Optimization

Problem 1. Program the steepest descent and Newton’s methods using the backtracking line
search algorithm (using either the Wolfe conditions or the Goldstein conditions). Use them to
minimize the Rosenbrock function

F(x, y) = 100(y − x^2)^2 + (1 − x)^2.

Set the initial step size to 1 and print out the step size at each iteration of your algorithms. Test
your algorithms with two initial guesses: (1.2, 1.2) and (−1.2, 1). What happens if you choose step
sizes always equal to 1.0? What are the exact minimizer and minimal value? What are the
convergence rates of these algorithms?
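For reference, both methods need the gradient of F, and Newton's method also needs the Hessian; direct differentiation gives

∇F(x, y) = ( −400x(y − x^2) − 2(1 − x),  200(y − x^2) ),
∇^2 F(x, y) = [ 1200x^2 − 400y + 2,  −400x;  −400x,  200 ].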

Problem 2. Implement the BFGS quasi-Newton method with a line search satisfying the Wolfe
conditions. Check the condition y_k^T s_k > 0 at each iteration. Use your code to minimize
the Rosenbrock function in Problem 1.
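For reference, with s_k = x_{k+1} − x_k and y_k = ∇F(x_{k+1}) − ∇F(x_k), the BFGS update of the approximate Hessian is

B_{k+1} = B_k − (B_k s_k s_k^T B_k) / (s_k^T B_k s_k) + (y_k y_k^T) / (y_k^T s_k).

The curvature condition y_k^T s_k > 0 is what keeps B_{k+1} positive definite, which is why it is checked at every iteration.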

Problem 3. Test your algorithms from the previous two problems, with and without line search,
for minimizing the function
F(x, y) = x^4 + y^2.
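Note that the minimizer here is (0, 0), where the Hessian diag(12x^2, 2) becomes singular, so the usual quadratic convergence of Newton's method is not guaranteed: in the x-component the pure Newton step is

x_{k+1} = x_k − 4x_k^3 / (12x_k^2) = (2/3) x_k,

which is only linear convergence.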

Problem 4. Minimize the following function

F(x, y) = x^4/4 − x^2 + 2x + (y − 1)^2

with the pure Newton's method (α = 1). Explain why the pure Newton's method does not
converge for this problem (Hint: examine the Hessian matrix). What happens if you use the line
search step length selection rules?

Solution. In this homework, we mainly focus on applying different minimization techniques to
some simple test objective functions. From our numerical tests, we can draw some preliminary
conclusions:

• When Newton's method converges, it converges very fast (quadratic convergence
asymptotically).

• Newton's method requires second-order derivatives, which can be difficult, if not impossible,
to obtain. Furthermore, storing the second derivatives requires O(n^2) storage, where n
is the number of variables of the objective function. The steepest descent method and
quasi-Newton methods can be used instead.

• The quasi-Newton method is a good compromise between convergence speed and
complexity. It usually converges fast, and sometimes converges even without step-length
control. The drawback is the high storage requirement.

• The steepest descent method usually does not converge without step-length control unless
we fix the step length α to be sufficiently small. It is a low-complexity, low-storage
method. It is the method of last resort in the MATLAB function fminunc (unconstrained
minimization).

• The initial guess is extremely important for Newton-like methods. How to find a “good
enough” initial guess is an interesting question to explore.

• For non-convex functions, like the one in Problem 4, the Hessian matrix might not always
be positive definite. This is crucial for Newton-like methods because if the Hessian is
not positive definite, the Newton direction might not even be a descent direction, as
discussed in class (the Hessian for Problem 4 is worked out after this list).

• The performance of line search algorithms depends on many parameters, such as the
condition(s) you choose and the constants in those conditions. Different line search
conditions can give different performance for your descent direction methods.
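To make the Problem 4 remark concrete: for F(x, y) = x^4/4 − x^2 + 2x + (y − 1)^2 the Hessian is

∇^2 F(x, y) = [ 3x^2 − 2,  0;  0,  2 ],

which is indefinite whenever x^2 < 2/3, so in that region the pure Newton step need not be a descent direction.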

Here is the MATLAB code for a general descent direction method:

function [xmin, xhist, steplength] = ...
descentdirect(f, grad_f, hessian_f, x0, maxit, tol, dir, line)
%--------------------------------------------------------------------------
% General descent direction method
%--------------------------------------------------------------------------
% f:          objective function handle
% grad_f:     gradient of f
% hessian_f:  Hessian matrix of f
% x0:         initial guess
% maxit:      maximum number of iterations
% tol:        tolerance on the gradient norm
% dir:        1 = steepest descent, 2 = Newton, 3 = quasi-Newton (BFGS)
% line:       1 = fixed step alpha = 1, 2 = line search (Wolfe conditions)
% xmin:       computed minimizer
% xhist:      history of iterates
% steplength: step length used at each iteration
%--------------------------------------------------------------------------

%--------------------------------------------------------------------------
% Parameter
%--------------------------------------------------------------------------
c_1 = 10^-4; c_2 = 0.9;   % constants in the Wolfe conditions
%c_1 = 0.4; c_2 = 0.6;
alpha_max = 16;           % maximum step length for the line search

%--------------------------------------------------------------------------
% Initialization
%--------------------------------------------------------------------------
i = 1; x_k = x0; stop = 1; xsize = length(x0); B_k = eye(xsize);
xhist = zeros(xsize,maxit); xhist(:,i) = x_k;
steplength = zeros(maxit,1);

%--------------------------------------------------------------------------
while stop && i < maxit

%--------------------------------------------------------------------
% Search Direction
%--------------------------------------------------------------------
if (dir == 1) % Steepest descent direction
p_k = steepdir(grad_f,x_k);
elseif (dir == 2) % Newton's direction
p_k = newtondir(grad_f,hessian_f,x_k);
elseif (dir == 3) % Quasi-Newton direction
p_k = qnewtondir(grad_f,B_k,x_k);
end

%--------------------------------------------------------------------
% Step Length
%--------------------------------------------------------------------
if (line == 1) % No stepsize control
alpha = 1;
elseif (line == 2) % Wolfe condition
alpha = linesearch(f, grad_f, p_k, x_k, c_1, c_2, alpha_max);
end
steplength(i) = alpha;

%--------------------------------------------------------------------
% Update
%--------------------------------------------------------------------
x_old = x_k;
x_k = x_k + alpha * p_k;

i = i + 1;
xhist(:,i) = x_k;
if (norm(grad_f(x_k)) < tol) || (norm(x_k - x_old) < 1e-12)
stop = 0;
end

%--------------------------------------------------------------------
% Updating Quasi-Newton matrix
%--------------------------------------------------------------------
if (dir == 3) % Quasi-Newton direction
B_k = bfgs(x_k,x_old,B_k,grad_f);
end

end

xmin = x_k;
steplength = steplength(1:i-1);
xhist = xhist(:,1:i);

%--------------------------------------------------------------------
function p_k = steepdir(grad_f,x_k)
% Steepest descent direction

p_k = - feval(grad_f, x_k);

%--------------------------------------------------------------------
function p_k = newtondir(grad_f,hessian_f,x_k)
% Newton's direction

grad_f_k = feval(grad_f, x_k);
hessian_f_k = feval(hessian_f, x_k);
p_k = - hessian_f_k\grad_f_k;

%--------------------------------------------------------------------
function p_k = qnewtondir(grad_f,B_k,x_k)
% Quasi-Newton direction

grad_f_k = feval(grad_f, x_k);
p_k = -B_k\grad_f_k;

%--------------------------------------------------------------------
function B_k = bfgs(x_k,x_old,B_k,grad_f)
% BFGS update of the approximate Hessian B_k

s_k = x_k - x_old;
y_k = feval(grad_f, x_k) - feval(grad_f, x_old);

% Skip the update if the curvature condition y_k'*s_k > 0 fails,
% so that B_k stays positive definite
if y_k' * s_k <= 0
    return
end

B_k = B_k - (B_k * s_k * s_k' * B_k) / (s_k' * B_k * s_k) + ...
      (y_k * y_k') / (y_k' * s_k);

%--------------------------------------------------------------------
function B_k = sr1(x_k,x_old,B_k,grad_f)
% SR1 update of the approximate Hessian B_k

s_k = x_k - x_old;
y_k = feval(grad_f, x_k) - feval(grad_f, x_old);

% Skip the update if the curvature condition y_k'*s_k > 0 fails,
% as in the BFGS update
if y_k' * s_k <= 0
    return
end

B_k = B_k + ((y_k - B_k * s_k) * (y_k - B_k * s_k)') / ...
      ((y_k - B_k * s_k)' * s_k);
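The helper linesearch called in descentdirect is not listed above. A minimal backtracking sketch, assuming the same argument list, is given below; it enforces only the sufficient-decrease (Armijo) part of the Wolfe conditions and accepts c_2 solely to match the call signature, so it is an illustration rather than the original routine.

function alpha = linesearch(f, grad_f, p_k, x_k, c_1, c_2, alpha_max)
% Backtracking line search enforcing the sufficient-decrease condition
% f(x_k + alpha*p_k) <= f(x_k) + c_1*alpha*grad_f(x_k)'*p_k.
% c_2 (curvature constant) is unused in this simplified sketch.

alpha = alpha_max;
f_k   = feval(f, x_k);
g_k   = feval(grad_f, x_k);
slope = g_k' * p_k;                     % directional derivative along p_k

while feval(f, x_k + alpha * p_k) > f_k + c_1 * alpha * slope
    alpha = alpha / 2;                  % shrink the step
    if alpha < 1e-12                    % give up on pathologically small steps
        break
    end
end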

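As a usage example for Problem 1, the Rosenbrock function, its gradient, and its Hessian can be supplied as anonymous function handles; the handles and the parameter values below (maxit = 200, tol = 1e-8) are illustrative choices, not part of the original listing.

% Rosenbrock function, gradient, and Hessian as anonymous functions
f         = @(x) 100*(x(2) - x(1)^2)^2 + (1 - x(1))^2;
grad_f    = @(x) [-400*x(1)*(x(2) - x(1)^2) - 2*(1 - x(1)); ...
                  200*(x(2) - x(1)^2)];
hessian_f = @(x) [1200*x(1)^2 - 400*x(2) + 2, -400*x(1); ...
                  -400*x(1), 200];

% Newton's method (dir = 2) with the Wolfe line search (line = 2)
x0 = [-1.2; 1];
[xmin, xhist, steplength] = ...
    descentdirect(f, grad_f, hessian_f, x0, 200, 1e-8, 2, 2);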