
Section-2 Roots of Equations

(These notes are based mainly on the 6th Edition of the textbook
Numerical Methods for Engineers by S.C. Chapra and R.P. Canale.)

This section discusses some advanced root-finding methods (including roots of polynomials) and the rate of
convergence. In numerical solutions, the equation to be solved is typically arranged into the
form f(x) = 0. The root-finding problem involves finding a root (or solution) of an equation of the form
f(x) = 0. Therefore, the root of the equation f(x) = 0 is the value of x that makes f(x) equal to zero. In other words,
we call a number r satisfying f(r) = 0 a zero of the function f and a root of the equation f(x) = 0. Hence,
saying "zero of f" is equivalent to saying "root of f(x) = 0". There are two main types of numerical methods for
root finding; these are bracketing and open methods.
Bracketing Methods
Remember that the bisection and the false-position methods are the two bracketing methods which apply the
following algorithm:
1) Choose a lower initial guess x_l and an upper initial guess x_u such that f(x_l) f(x_u) < 0.
2) The root estimate x_r is calculated as:

x_r = (x_l + x_u)/2   (for the bisection method)

x_r = x_u - f(x_u)(x_l - x_u) / (f(x_l) - f(x_u))   (for the false-position method)

3) IF f(x_l) f(x_r) > 0, then set x_l = x_r and return to step (2) for a new root estimate;
ELSE, set x_u = x_r and return to step (2) for a new root estimate.
4) END the computations when the error criterion is satisfied (i.e. when f(x_r) ≈ 0).
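As a concrete illustration, the following minimal MATLAB sketch implements the algorithm above with the false-position formula (the function name, argument list and tolerance test are our own choices, not from the textbook; replacing the x_r line with xr = (xl + xu)/2 gives the bisection variant).

function xr = falsepos(f, xl, xu, tol, maxit)
% Bracketing root finder using the false-position formula.
% f: function handle; [xl, xu]: initial bracket with f(xl)*f(xu) < 0.
if f(xl)*f(xu) > 0, error('Initial guesses do not bracket a root'); end
for k = 1:maxit
    xr = xu - f(xu)*(xl - xu)/(f(xl) - f(xu));   % step (2)
    if abs(f(xr)) < tol, return; end             % step (4): f(xr) ~ 0
    if f(xl)*f(xr) > 0
        xl = xr;                                 % step (3): root lies in [xr, xu]
    else
        xu = xr;                                 % step (3): root lies in [xl, xr]
    end
end
end

For example, >> falsepos(@(x) x.^2-2, 0, 2, 1e-10, 100) returns 1.4142..., the positive root of x^2 - 2.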
The false-position method generally converges to the root faster than the bisection method since it utilizes the
following graphical insight: if f(x_u) is much closer to zero than f(x_l), it is likely that the root is closer to x_u than to x_l.
However, in some cases this approach does not work efficiently, as the false-position method can exhibit one-sidedness, i.e. one of the bounds (x_l or x_u) can be stuck for a considerable number of successive iterations
(see Example 5.6 in the textbook). For the false-position method, two strategies to increase the rate of
convergence are:
a) Switching to the bisection method when there is no change in one of the bounds twice in a row.
b) Halving the function value at the stagnant bound when there is no change in one of the bounds twice in a
row. This procedure is called the Modified False-Position Method. (See Figure 5.15 in the textbook for an
algorithm of the modified false-position method.) The graphical depiction of the modified false-position
method is given below in Figure 2.1, in which the function value at the stagnant bound is halved
successively to increase the convergence rate.

Fig. 2.1 Graphical depiction of the modified false-position method

Open methods
Open methods (e.g. the fixed-point iteration, Newton-Raphson and Secant methods) usually converge much
faster than bracketing methods, but they sometimes diverge.
- It can be shown that when the fixed-point iteration method converges, the error is roughly proportional to,
and less than, the error at the previous step. Thus, fixed-point iteration has linear convergence.
- It can be shown that when the Newton-Raphson method converges, the error is roughly proportional to the
square of the error at the previous step. Thus, the Newton-Raphson method has quadratic convergence.
The Newton-Raphson Method
Let x_i be an approximation for x_r such that f(x_i) ≈ 0 and |x_r - x_i| is small. Note that x_r is the exact value of
the root of f(x), i.e. f(x_r) = 0. Apply the Taylor series expansion of f(x_r) about x_i to get:

f(x_r) = 0 = f(x_i) + f'(x_i)(x_r - x_i) + (f''(x_i)/2!)(x_r - x_i)^2 + ...

Dropping all the second and higher-order terms leads to: 0 ≈ f(x_i) + f'(x_i)(x_r - x_i). This is a reasonable
approximation if x_i is close enough to x_r. Solving for x_r gives x_r ≈ x_i - f(x_i)/f'(x_i). The Newton-Raphson
Method says that if we start with an initial guess of x_i for the root, an improved estimate x_{i+1} is given by:

x_{i+1} = x_i - f(x_i)/f'(x_i)

The above equation can be applied successively to obtain improved estimates (x_{i+1}, x_{i+2}, etc.) for the
actual root of interest. To ensure convergence, the initial guess should be sufficiently close to the actual root.
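As a minimal MATLAB sketch of the iteration (variable names are ours; the derivative is supplied as a second function handle):

f  = @(x) x.^2 - 2;              % function whose root is sought
fp = @(x) 2*x;                   % its derivative f'(x)
x  = 1.5;                        % initial guess, close to the root
for k = 1:20
    dx = f(x)/fp(x);
    x  = x - dx;                 % x_{i+1} = x_i - f(x_i)/f'(x_i)
    if abs(dx) < 1e-12, break; end
end
x                                % converges to sqrt(2) = 1.414213562...

The quadratic convergence is visible in practice: the number of correct digits roughly doubles at each iteration.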
The Secant Method
In this method, the derivative term f'(x_i) of the Newton-Raphson method is approximated by a backward finite divided
difference as follows:

f'(x_i) ≈ (f(x_{i-1}) - f(x_i)) / (x_{i-1} - x_i)

x_{i+1} = x_i - f(x_i)(x_{i-1} - x_i) / (f(x_{i-1}) - f(x_i))

Note that two initial guesses x_{i-1} and x_i are required to find the root estimate x_{i+1}. The initial guesses do not
have to bracket the root. In the Secant method, a sequential approach is used, i.e. in the first iteration x_0 and x_1
are used to find x_2, and in the second iteration x_1 and x_2 are used to find the next root estimate x_3, and so on.
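A minimal MATLAB sketch of the secant iteration (variable names are ours):

f  = @(x) x.^2 - 2;
x0 = 1;  x1 = 2;                 % two initial guesses (need not bracket the root)
for k = 1:30
    x2 = x1 - f(x1)*(x0 - x1)/(f(x0) - f(x1));   % secant update
    x0 = x1;  x1 = x2;           % sequential approach: shift the pair forward
    if abs(f(x1)) < 1e-12, break; end
end
x1                               % approx sqrt(2)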

Fig. 2.2 (a) Secant method, (b) Inverse Quadratic Interpolation (x_{i+1} is the root estimate)

Inverse Quadratic Interpolation


In this method, a parabola is fitted to three points (x_{i-2}, y_{i-2}), (x_{i-1}, y_{i-1}), (x_i, y_i), and these three points are
on a function f(x) whose roots are to be found. The intersection of this parabola with the x-axis is the root
estimate x_{i+1}. Note that a parabola in x might not intersect the x-axis if it has complex roots, as shown in the
figure below. The remedy is to fit the three points with a parabola in y instead of using a parabola in x. A
parabola in y (i.e. x = g(y)) is a sideways parabola which always intersects the x-axis. Using the Lagrange
form of the 2nd-order interpolating polynomial, g(y) can be written as:

g(y) = [(y - y_{i-1})(y - y_i)] / [(y_{i-2} - y_{i-1})(y_{i-2} - y_i)] x_{i-2}
     + [(y - y_{i-2})(y - y_i)] / [(y_{i-1} - y_{i-2})(y_{i-1} - y_i)] x_{i-1}
     + [(y - y_{i-2})(y - y_{i-1})] / [(y_i - y_{i-2})(y_i - y_{i-1})] x_i

The root estimate x_{i+1} is found when y is set to zero in the above equation, i.e. x_{i+1} = g(0). Note that if by
any chance two of the y-values are the same, then the method fails.
As in the Secant method, a sequential approach is used, i.e. x_{i-2}, x_{i-1}, x_i are used to find the root estimate x_{i+1}, and
in the next iteration x_{i-1}, x_i, x_{i+1} are used to find the new root estimate x_{i+2}.
Example1: In order to find a root of f(x), fit a parabola that goes through three points on f(x), which are
(1,2), (2,1) and (4,5).
Solution1: As shown in the figure below, two different parabolas can be fitted to the given points. It can be
determined that these parabolas are y = h(x) = x^2 - 4x + 5 and x = p(y) = 0.5y^2 - 2.5y + 4. See that h(x) is
no good since it does not cut the x-axis, but p(y) does. Then, p(y) must be used to calculate the root estimate,
which is p(0) = 4. The parabolas can be defined and plotted in Matlab as follows.
>> h=@(x) x.^2-4*x+5;
>> p=@(y) 0.5*y.^2-2.5*y+4;
>> x=-1:0.01:6; y=-2:0.01:6;
>> figure, plot(x,h(x), p(y),y)
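The root estimate can also be obtained directly from the Lagrange form of g(y) given above, without solving for the parabola's coefficients; a short sketch using the three points of Example 1 (variable names are ours):

% One inverse-quadratic-interpolation step: evaluate x = g(y) at y = 0.
x = [1 2 4];  y = [2 1 5];                       % the three given points
xr = x(1)*(0 - y(2))*(0 - y(3))/((y(1) - y(2))*(y(1) - y(3))) + ...
     x(2)*(0 - y(1))*(0 - y(3))/((y(2) - y(1))*(y(2) - y(3))) + ...
     x(3)*(0 - y(1))*(0 - y(2))/((y(3) - y(1))*(y(3) - y(2)))
% xr = 4, matching p(0) above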

Matlab's fzero function


The fzero function basically finds the zeros of a function of a single variable. When fzero is used with a
single initial guess, the algorithm first finds an interval (or bracket) where the function changes sign. Then, it
employs a combination of bisection, secant, and inverse quadratic interpolation methods to find the root.
Whenever possible, the algorithm uses the open methods (secant and inverse quadratic interpolation) which
can provide quick convergence to the root. However, if the root estimate falls outside the bracket, the reliable
bisection method is applied to produce an estimate inside the bracket. In a typical fzero procedure, the
bisection method dominates at first but as the root is approached, the procedure switches to the open
methods.

Example2: Find a real root (or zero) of the polynomial function f(x) = x^3 - 4.5x^2 + 6.25x - 3.125 using
fzero. Try an initial guess of 2.
Solution2a: >> f=@(x) x.^3-4.5*x.^2+6.25*x-3.125; xsol=fzero(f,2)

xsol =
2.500000000000000

Solution2b: Write a function m-file to define the equation as follows.

function p=polyeqn(x)
p=x.^3-4.5*x.^2+6.25*x-3.125;

Then in the command window, type either of the following:

>> xsol=fzero(@polyeqn,2)

>> xsol=fzero('polyeqn',2)

Both produce

xsol =
2.500000000000000

Systems of Nonlinear Equations


A system of n nonlinear equations with n unknowns can be expressed as

f_1(x_1, x_2, ..., x_n) = 0,  f_2(x_1, x_2, ..., x_n) = 0,  ...,  f_n(x_1, x_2, ..., x_n) = 0

where each function f_i maps a vector x = [x_1 x_2 ... x_n]^T of the n-dimensional space R^n into the real
line R. For this system of n nonlinear equations with n unknowns, we can define a function F mapping R^n
into R^n as

F(x_1, x_2, ..., x_n) = [f_1(x_1, x_2, ..., x_n)  f_2(x_1, x_2, ..., x_n)  ...  f_n(x_1, x_2, ..., x_n)]^T

Hence, we can express this system of n nonlinear equations with n unknowns in the concise form F(x) = 0,
where 0 is a column vector of zeros.

Solving Systems of Nonlinear Equations using Newton's Method


Consider a system of n nonlinear equations with n unknowns:

f_1(x_1, ..., x_n) = 0,  f_2(x_1, ..., x_n) = 0,  ...,  f_n(x_1, ..., x_n) = 0

Thus, f_i(x_1, ..., x_n) = 0 for i = 1, 2, ..., n. Let's apply Newton's Method to this given system.

Define x_k = [x_1^(k) ... x_n^(k)]^T, which corresponds to the values of the latest root estimates (or the initial
guesses). Then, x_{k+1} = [x_1^(k+1) ... x_n^(k+1)]^T corresponds to the values of the new root estimates.

The function values at x_k can be expressed by

F(x_k) = [f_1(x_k)  f_2(x_k)  ...  f_n(x_k)]^T

Apply the first-order Taylor series (multi-variable version) for each nonlinear equation to get:

f_1(x_{k+1}) ≈ f_1(x_k) + (∂f_1/∂x_1)Δx_1 + (∂f_1/∂x_2)Δx_2 + ... + (∂f_1/∂x_n)Δx_n
f_2(x_{k+1}) ≈ f_2(x_k) + (∂f_2/∂x_1)Δx_1 + (∂f_2/∂x_2)Δx_2 + ... + (∂f_2/∂x_n)Δx_n
...
f_n(x_{k+1}) ≈ f_n(x_k) + (∂f_n/∂x_1)Δx_1 + (∂f_n/∂x_2)Δx_2 + ... + (∂f_n/∂x_n)Δx_n

All the partial derivatives are evaluated at x_k. Set F(x_{k+1}) (i.e. the left-hand side of the above equations) to
zero so as to solve for the new root estimates x_{k+1}. Finally, the following equation is obtained, where J(x_k) is
the Jacobian matrix of the system. Recall that Δx = x_{k+1} - x_k = [Δx_1 ... Δx_n]^T.

F(x_k) + J(x_k)(x_{k+1} - x_k) = 0,  i.e.  x_{k+1} = x_k - J(x_k)^{-1} F(x_k)

where

J(x_k) = [∂f_1/∂x_1  ∂f_1/∂x_2  ...  ∂f_1/∂x_n
          ∂f_2/∂x_1  ∂f_2/∂x_2  ...  ∂f_2/∂x_n
          ...
          ∂f_n/∂x_1  ∂f_n/∂x_2  ...  ∂f_n/∂x_n]

Note that the evaluation of the inverse of J(x_k) is computationally expensive. Therefore, the above system of
equations is written in an alternative form as follows:

J(x_k)(x_{k+1} - x_k) = -F(x_k),  i.e.  J(x_k) Δx = -F(x_k)

The equation J(x_k) Δx = -F(x_k) can be solved by Gauss Elimination to find Δx, and then x_{k+1} is
obtained by adding Δx to x_k. Gauss Elimination is the preferred way of solving a linear system of equations
in terms of computation time and numerical accuracy (see Matlab's help file for the function inv).
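A minimal MATLAB sketch of this iteration (function name and arguments are ours), assuming handles F and J that return the residual vector and the Jacobian matrix; note that the backslash operator solves the linear system by Gauss elimination instead of forming the inverse:

function x = newton_sys(F, J, x, tol, maxit)
% Newton's Method for the system F(x) = 0 with Jacobian J(x).
for k = 1:maxit
    dx = -J(x)\F(x);             % solve J(x_k)*dx = -F(x_k) by Gauss elimination
    x  = x + dx;                 % x_{k+1} = x_k + dx
    if norm(F(x)) < tol, return; end
end
end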

Continuation
In Newton's Method, we need a good initial guess to find the roots. The set of equations we want to solve
can be expressed as F(x*; λ) = 0, where x* is the exact vector of roots for the parameter λ in the problem
and 0 is a column vector of zeros. Suppose we calculated the root x*_0 for the parameter value λ_0, i.e.
F(x*_0; λ_0) = 0. However, we would also like to find the root for a different parameter value λ_f. If λ_f ≈ λ_0,
then x*_0 would be a good initial guess. But if λ_f is not sufficiently close to λ_0, then we should apply the
technique called continuation. In this technique, we form a sequence of λ's as follows:

λ_i = λ_0 + (λ_f - λ_0) i/N,   i = 1, 2, ..., N

Using Newton's Method, we solve F(x; λ_1) = 0 with x*_0 as the initial guess, and if N is large enough,
then λ_1 ≈ λ_0 and Newton's Method should converge quickly. Then, we solve F(x; λ_2) = 0 using x*_1 as the
initial guess. This procedure is applied repeatedly until we reach the parameter value λ_f. The technique
given here can be generalised for a system of equations with many parameters.
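A sketch of the continuation loop on a toy scalar problem (all names are ours; newton_sys is the sketch given earlier, which also works for a scalar equation):

% Solve x^2 - lam = 0, moving lam from lam0 = 1 (known root x = 1) to lamf = 9.
F = @(x,lam) x^2 - lam;   J = @(x,lam) 2*x;
lam0 = 1;  lamf = 9;  x = 1;     % x holds the root for the current parameter
N = 4;                           % number of continuation steps
for i = 1:N
    lam = lam0 + (lamf - lam0)*i/N;                              % next parameter value
    x = newton_sys(@(z) F(z,lam), @(z) J(z,lam), x, 1e-12, 50);  % warm start from previous root
end
x                                % approx 3, the root for lam = lamf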
Example3: A single-cylinder engine model, composed of three moving links, is shown below. The link
lengths are given as a = 4 and b = 8. Calculate numerically the values of the joint variables s and φ when
the crank angle is θ = 30°. Apply Newton's Method. The initial guesses for s and φ are
s_0 = 10 and φ_0 = 150°, respectively.

Solution3: The kinematic equations for the given mechanism can be written as:

a cos(θ) = s + b cos(φ)  and  a sin(θ) = b sin(φ)

Put the above two equations into the form:

f_1(x) = a cos(θ) - s - b cos(φ) = 0  and  f_2(x) = a sin(θ) - b sin(φ) = 0

Note that as θ is specified, the vector of variables becomes x = [x_1 x_2]^T = [s φ]^T. Then,

J(x) = [∂f_1/∂s, ∂f_1/∂φ; ∂f_2/∂s, ∂f_2/∂φ] = [-1, b sin(φ); 0, -b cos(φ)]

Notice that x_0 = [s_0 φ_0]^T. To start the iteration, we use the initial guesses given as x_0 = [10 150π/180]^T. The
improved (or new) root estimates can be calculated as x_{k+1} = x_k - J(x_k)^{-1} F(x_k). The iterations can be stopped
when, for instance, ||F(x_k)||_2 < ε, where ε is a stopping criterion. Note that ||F(x_k)||_2 is the 2-norm of
F(x_k), and the 2-norm of a vector can be calculated by using the function norm in Matlab.
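For this example, the Newton iteration might be coded as follows (a sketch; variable names are ours, and the Jacobian is the one derived above):

a = 4;  b = 8;  th = 30*pi/180;                  % given link lengths and crank angle
F = @(x) [a*cos(th) - x(1) - b*cos(x(2));        % f1(s, phi)
          a*sin(th) - b*sin(x(2))];              % f2(s, phi)
J = @(x) [-1,  b*sin(x(2));                      % Jacobian derived above
           0, -b*cos(x(2))];
x = [10; 150*pi/180];                            % initial guesses s0, phi0 (radians)
for k = 1:50
    x = x - J(x)\F(x);                           % Newton update
    if norm(F(x)) < 1e-12, break; end            % stop when ||F(x)||_2 < eps
end
x                                % s approx 11.2101, phi approx 2.8889 rad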

Definition: Consider a vector x = [x_1 x_2 ... x_n]^T with n components.

1) The 2-norm (or l2-norm or Euclidean norm) of x is ||x||_2 = (Σ_{i=1}^{n} x_i^2)^{1/2}

2) The ∞-norm (or max-norm) of x is equal to the maximum component (in absolute value):
||x||_∞ = max_{1≤i≤n} |x_i|

3) The 1-norm (or l1-norm) of x is ||x||_1 = Σ_{i=1}^{n} |x_i|

These different norms can be calculated by the function norm in Matlab; please read the help files.
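For instance:

>> v = [3; -4; 0];
>> [norm(v), norm(v,inf), norm(v,1)]   % 2-norm, inf-norm, 1-norm
ans =
     5     4     7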

Solving Systems of Nonlinear Equations using Matlab's fsolve function


Matlab's fsolve function utilises optimization algorithms to solve systems of nonlinear equations.
Example4: Solve Example 3 using the function fsolve.
Solution4: First, compose a function m-file to define the equations, i.e. define F(x).

function F=pistoneqn(x)
global a b th
F=[a*cos(th)-x(1)-b*cos(x(2));
a*sin(th)-b*sin(x(2))
];

Then, use the initial guess x0 and type the following statements in the command window. (Note that the
angles must be in radians.) In the statements below, th corresponds to the angle θ.
>> global a b th; th=30*pi/180; a=4; b=8;
>> x0=[10; 150*pi/180]; [x,Fval]=fsolve(@pistoneqn,x0)
Optimization terminated: first-order optimality is less than options.TolFun.

x =
11.210068316000402
2.888912398711557
Fval =
1.0e-008 *
-0.792012944117459
0.204371120027247

The statement [x,Fval]=fsolve(@pistoneqn,x0) returns the solution x and the value Fval of the function
pistoneqn at the solution x.
See that the solution is x(1) = s = 11.210068316000402 and x(2) = φ = 2.888912398711557 rad ≈ 165.52°.
Note that there is an exact (analytical) solution, which is given as follows:

φ = π - sin^{-1}(a sin(θ)/b),   s = a cos(θ) - b cos(φ)

The exact values of s and φ for θ = 30° are s = 11.210068307552588 and φ = 2.888912398447714.

Example5: Matrix equations can also be solved using fsolve. Solve the equation

X^2 + 4I = 4X + [4 3; 5 6]

where I is the (2×2) identity matrix and X is the unknown matrix.

Solution5: Write a function m-file by putting the equation into the form F(X) = 0
function F=matrixeqn1(X)
F=X^2-4*X+4*eye(2)-[4 3;5 6];

Then, use an initial guess X0 to apply fsolve.


>> X0=[1 2;3 4]; [X,Fval]=fsolve(@matrixeqn1,X0)
Optimization terminated: first-order optimality is less than options.TolFun.
X =
   2.499999916939480   1.500000027054891
   2.500000063549487   3.499999988251996
Fval =
  1.0e-006 *
   0.079900946303724  -0.088103007556128
  -0.109922341451352   0.127717447284681

Show that three other solutions are also possible, as given below. Try different initial guesses to find them.

X = [1/4 -3/4; -5/4 -1/4],   X = [3/2 -3/2; -5/2 1/2],   X = [15/4 3/4; 5/4 17/4]

Quasi-Newton Methods for Systems of Nonlinear Equations (Broyden's Method)


Newton's Method is powerful since it has quadratic convergence, but it has three shortcomings. The first one is
that good initial guesses are required to ensure convergence to the desired root. The second one is the
computational effort required to evaluate the exact partial derivatives in the Jacobian matrix. In some cases,
the analytical (i.e. exact) partial derivatives may not be available. The third shortcoming is the computational
effort required to solve a linear system of equations for each iteration. The amount of computational effort is
extensive if there is a large number of equations in the system.
If the exact evaluation of the partial derivatives in the Jacobian matrix is not practical (i.e. expensive), we
can use finite-difference approximations of the partial derivatives. For example, we can write

∂f_i/∂x_j (x) ≈ (f_i(x + h e_j) - f_i(x)) / h

where h is the step size, which must be small, and e_j is the column vector whose only nonzero element is a 1 in
the j-th coordinate. In Newton's method, if we approximate the Jacobian matrix using such finite-difference
approximations, then the resulting method is called the Finite-difference Newton's method. However, the
evaluation of each function can be expensive, and we still have to solve the linear system involving the
approximate Jacobian matrix.
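A minimal MATLAB sketch of such a finite-difference Jacobian (function name and arguments are ours):

function A = fdjac(F, x, h)
% Forward-difference approximation of the Jacobian of F at x.
% Column j is (F(x + h*ej) - F(x))/h, with ej the j-th unit vector.
n  = numel(x);
F0 = F(x);                       % evaluate F once at the base point
A  = zeros(n);
for j = 1:n
    e = zeros(n,1);  e(j) = 1;   % unit vector in the j-th coordinate
    A(:,j) = (F(x + h*e) - F0)/h;
end
end

Note that one full Jacobian approximation costs n + 1 evaluations of F, which is one reason the Quasi-Newton methods below try to avoid it.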
In order to circumvent the shortcomings mentioned in the previous paragraphs, algorithms called Quasi-Newton
methods were devised. Quasi-Newton methods replace the Jacobian matrix in Newton's method with
an approximate Jacobian matrix which is updated at each iteration. This is very useful if the Jacobian matrix
is difficult to calculate. The most well-known Quasi-Newton method is Broyden's method, which utilises a
generalisation of the Secant method to systems of nonlinear equations. In Quasi-Newton methods, the
quadratic convergence of Newton's method is lost, but the number of arithmetic calculations is significantly
reduced.
Broyden's Method
Recall the Secant method for a single nonlinear equation, which replaces the derivative term f'(x_k) of the
Newton-Raphson method by the backward finite-divided difference

f'(x_k) ≈ (f(x_k) - f(x_{k-1})) / (x_k - x_{k-1})

The equation f'(x_k)(x_k - x_{k-1}) = f(x_k) - f(x_{k-1}) is called the secant equation, and if we regard f'(x_k) as an
unknown, then it can be found uniquely. The Secant method can be generalised to a system of n nonlinear
equations with n unknowns expressed in the form F(x) = 0. Remembering the idea behind Newton's
method, the secant equation for the nonlinear system F(x) = 0 can be written as follows.

A_k (x_k - x_{k-1}) = F(x_k) - F(x_{k-1})   (2.1)

In Eqn.(2.1), A_k is an approximation for the Jacobian matrix. However, the secant equation (Eqn.(2.1)) does
not determine A_k uniquely, since Eqn.(2.1) is a system of n equations with n^2 unknowns (there are n^2
elements of A_k to be determined).
In order to start Broyden's Method, we use an initial guess x_0 for the solution of the system F(x) = 0. We
calculate the next approximation (i.e. next root estimate) x_1 using Newton's method, as shown in Eqn.(2.2).

x_1 = x_0 - J(x_0)^{-1} F(x_0)   (2.2)

If the calculation of J(x_0) is computationally expensive, we can approximate the partial derivatives using
finite differences. Now, let A_0 = J(x_0). For the next steps (i.e. iterations), we do not want to use the exact
values of the Jacobian matrix. At each iteration, we would like to use an approximation for the Jacobian
matrix which can be calculated cheaply. As indicated in Eqn.(2.1), we would like to use A_k in place of J(x_k)
so that we can obtain a cheap solution of F(x) = 0, as given in Eqn.(2.3).

x_{k+1} = x_k - A_k^{-1} F(x_k)   (2.3)

In order to determine A_k uniquely, an extra condition is needed. In a convergent numerical solution, the
Jacobian matrices from successive iterations are close to each other, so we should be able to update
cheaply an approximate Jacobian matrix from iteration to iteration. Consider two successive approximate
Jacobian matrices A_{k-1} and A_k (see Eqn.(2.3)). Broyden's method ensures that A_k is as close as possible to
A_{k-1}, by minimising the Frobenius norm ||A_k - A_{k-1}||_F, while satisfying the secant equation (or condition)
given in (2.1). Applying these conditions provides a unique solution for A_k, as given in Eqn.(2.4).

A_k = A_{k-1} + ((y_k - A_{k-1} s_k) s_k^T) / (s_k^T s_k)   (2.4)

In Eqn.(2.4), y_k = F(x_k) - F(x_{k-1}) and s_k = x_k - x_{k-1}. Eqn.(2.4) is called the Broyden update formula. Thus,
once we obtain A_k from Eqn.(2.4), we use Eqn.(2.3) to calculate x_{k+1} for k ≥ 1. Some authors advise
resetting A_k to J(x_k) at some steps without increasing the computational time considerably.
Note: For an m×n matrix A, the Frobenius norm is given by ||A||_F = (Σ_{i=1}^{m} Σ_{j=1}^{n} |a_{ij}|^2)^{1/2}

Notice that applying Eqn.(2.3) still requires the solution of an n×n linear system of equations, but we can
avoid this requirement by employing a matrix inversion formula of Sherman and Morrison, as described
below. Consider an n×n matrix B which has the form B = A + uv^T, where u and v are given column
vectors in R^n. uv^T is the outer product of u and v, thus (uv^T)_{ij} = u_i v_j. The modification of A to obtain B
is called a rank-one update, since the matrix uv^T has rank one, as every column of uv^T is a scalar multiple
of u. Notice that Eqn.(2.4), which is the Broyden update formula, has the form A_k = A_{k-1} + uv^T, where
u = (y_k - A_{k-1} s_k)/(s_k^T s_k) and v = s_k.
The Sherman-Morrison formula says that if A is a nonsingular square matrix, and u and v are column vectors
such that 1 + v^T A^{-1} u ≠ 0, then A + uv^T is nonsingular and

(A + uv^T)^{-1} = A^{-1} - (A^{-1} u v^T A^{-1}) / (1 + v^T A^{-1} u)   (2.5)

Notice that the Sherman-Morrison formula applies to our problem when we let A = A_{k-1},
u = (y_k - A_{k-1} s_k)/(s_k^T s_k) and v = s_k in Eqn.(2.5). Then we can write

A_k^{-1} = A_{k-1}^{-1} + ((s_k - A_{k-1}^{-1} y_k)(s_k^T A_{k-1}^{-1})) / (s_k^T A_{k-1}^{-1} y_k)   (2.6)

Therefore, Eqn.(2.6) enables us to calculate A_k^{-1} directly from A_{k-1}^{-1} without needing to compute the matrix
inverse in Eqn.(2.3) at each iteration.
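Putting Eqns.(2.2), (2.3) and (2.6) together, Broyden's method might be sketched in MATLAB as follows (function name and arguments are ours; J0 is the initial Jacobian, obtained exactly or by finite differences):

function x = broyden(F, J0, x0, tol, maxit)
% Broyden's method; the inverse approximate Jacobian is carried along
% and updated by the Sherman-Morrison form, Eqn.(2.6).
Ainv = inv(J0);                          % A0^{-1}, inverted once at the start
x    = x0 - Ainv*F(x0);                  % first step is a Newton step, Eqn.(2.2)
xold = x0;  Fold = F(x0);  Fx = F(x);
for k = 1:maxit
    if norm(Fx, inf) < tol, return; end
    s = x - xold;  y = Fx - Fold;                          % s_k and y_k
    Ainv = Ainv + ((s - Ainv*y)*(s'*Ainv))/(s'*Ainv*y);    % Eqn.(2.6)
    xold = x;  Fold = Fx;
    x  = x - Ainv*Fx;                                      % Eqn.(2.3)
    Fx = F(x);
end
end

Example 6 below follows exactly this procedure.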

Verification of the Sherman-Morrison formula

We need to demonstrate that (A + uv^T)(A + uv^T)^{-1} = (A + uv^T)^{-1}(A + uv^T) = I, where I is the identity
matrix. Let's work out (A + uv^T)(A + uv^T)^{-1}:

(A + uv^T)(A + uv^T)^{-1} = (A + uv^T)[A^{-1} - (A^{-1} u v^T A^{-1}) / (1 + v^T A^{-1} u)]

= I + uv^T A^{-1} - (uv^T A^{-1} + u(v^T A^{-1} u)v^T A^{-1}) / (1 + v^T A^{-1} u)

See that v^T A^{-1} u is a scalar, so the numerator of the fraction equals u(1 + v^T A^{-1} u)v^T A^{-1}. Then,

= I + uv^T A^{-1} - uv^T A^{-1} = I

Show also that (A + uv^T)^{-1}(A + uv^T) = I.

Example6: Consider the following system of nonlinear equations.

3x_1 = cos(x_2 x_3) + 0.5
x_1^2 + sin(x_3) + 1.06 = 81(x_2 + 0.1)^2
e^{-x_1 x_2} + (10π - 3)/3 = -20x_3

Solve this system using Broyden's Method with an initial guess of x_0 = [0.1 0.1 -0.1]^T. Apply the
Sherman-Morrison formula for matrix inversion.
Solution6: First of all, we have to put the equations in the form F(x) = 0:

F(x) = [3x_1 - cos(x_2 x_3) - 0.5
        x_1^2 - 81(x_2 + 0.1)^2 + sin(x_3) + 1.06
        e^{-x_1 x_2} + 20x_3 + (10π - 3)/3] = 0

For the first step, we will use the exact J(x_0), therefore we need to determine the Jacobian matrix J(x)
analytically. For this purpose, we can use the functions of the Symbolic Math Toolbox of Matlab. The Matlab
functions that we will use are syms, jacobian, double, subs and inv. In the Matlab command window,
type the following statements.
>> syms f1 f2 f3 x1 x2 x3
>> f1=3*x1-cos(x2*x3)-0.5;
>> f2=x1^2-81*(x2+0.1)^2+sin(x3)+1.06;
>> f3=exp(-x1*x2)+20*x3+(10*pi-3)/3;
>> J=jacobian([f1, f2, f3], [x1, x2, x3])
J =
[               3,   x3*sin(x2*x3),   x2*sin(x2*x3)]
[            2*x1, - 162*x2 - 81/5,         cos(x3)]
[ -x2*exp(-x1*x2), -x1*exp(-x1*x2),              20]

The matrix J is equal to J(x). Then, we set A_0 = J(x_0) by writing the statements below in Matlab.

>> x0=[0.1 0.1 -0.1]';
>> A0=double(subs(J,[x1;x2;x3],x0))

A0 =
   3.000000000000000   0.000999983333417  -0.000999983333417
   0.200000000000000 -32.399999999999999   0.995004165278026
  -0.099004983374917  -0.099004983374917  20.000000000000000

In the first step, we apply Eqn.(2.2), which is x_1 = x_0 - J(x_0)^{-1} F(x_0). In the statements below, F, F0,
x1, A0 and A0iv equal F(x), F(x_0), x_1, A_0 and A_0^{-1}, respectively.

>> syms F, F=[f1;f2;f3];


>> F0=double(subs(F,[x1;x2;x3],x0))
F0 =
-1.199950000416665
-2.269833416646828
8.462025345715146
>> x1=x0-inv(A0)*F0
x1 =
0.499869672926428
0.019466848537418
-0.521520471935831
>> A0iv=inv(A0)
A0iv =
   0.333333183973692   0.000010238518631   0.000016157012988
   0.002108606838122  -0.030868825519714   0.001535835927053
   0.001660520446129  -0.000152757694651   0.050007682751761

The next step is to calculate x_2 using x_2 = x_1 - A_1^{-1} F(x_1). We determine A_1^{-1} using the Sherman-Morrison
formula (Eqn.(2.6)):

A_1^{-1} = A_0^{-1} + ((s_1 - A_0^{-1} y_1)(s_1^T A_0^{-1})) / (s_1^T A_0^{-1} y_1)

where s_1 = x_1 - x_0 and y_1 = F(x_1) - F(x_0). Using Eqn.(2.3) together with Eqn.(2.6), we continue
our iterations until ||F(x_k)||_∞ < ε, where ε is a stopping criterion. The given system of equations is
solved with this stopping criterion and the results are given below.
At iteration no.1 the root estimates xr=[x1;x2;...;xn] are:
4.9986967293e-01
1.9466848537e-02
-5.2152047194e-01
The corresponding inf-norm of F(x) at xr is 3.4439e-01
At iteration no.2 the root estimates xr=[x1;x2;...;xn] are:
4.9998637546e-01
8.7378392993e-03
-5.2317457440e-01
The corresponding inf-norm of F(x) at xr is 1.4738e-01
At iteration no.3 the root estimates xr=[x1;x2;...;xn] are:
5.0000659706e-01
8.6727355579e-04
-5.2357234149e-01
The corresponding inf-norm of F(x) at xr is 1.4081e-02


At iteration no.4 the root estimates xr=[x1;x2;...;xn] are:


5.0000032872e-01
3.9528275306e-05
-5.2359768538e-01
The corresponding inf-norm of F(x) at xr is 6.3921e-04
At iteration no.5 the root estimates xr=[x1;x2;...;xn] are:
5.0000000157e-01
1.9354397511e-07
-5.2359877006e-01
The corresponding inf-norm of F(x) at xr is 3.1291e-06
At iteration no.6 the root estimates xr=[x1;x2;...;xn] are:
5.0000000000e-01
5.3464000412e-13
-5.2359877560e-01
The corresponding inf-norm of F(x) at xr is 1.6337e-11

