Nonlinear equations

From Sutherland_wiki
Revision as of 21:42, 20 October 2011 by 00033394 (talk | contribs)
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to: navigation, search

Introduction

Nonlinear equations arise ubiquitously in science and engineering applications. There are several key distinctions between linear and nonlinear equations:

Number of roots (solutions)
For linear systems of equations there is a single unique solution for a well-posed system.
For nonlinear equaitons there may be zero to many roots
Solution Approach
For linear systems, an exact solutions exist if system is well-posed
For nonlinear systems, exact solutions are occasionally possible, but not in general. Solution methods are iterative.


Residual Form

Typically when solving a nonlinear equation we solve for its roots - i.e. where the equation equals zero. In some cases we want to solve an equation for where f(x)=\alpha. In such a case, we must rewrite the equation in residual form:

Original Equation Residual Form
f(x)=\alpha r(x)=f(x)-\alpha

For example, if we wanted to solve y=x^2\sin(x) for the value of x where y=3, we would solve for the roots of the function r(x)=x^2\sin(x)-3.


Solving a Single Nonlinear Equation

Nonlinear categorization.png

There are two general classes of techniques for solving nonlinear equations:

  • Closed domain methods - These methods look for a root inside a specified interval by iteratively shrinking the interval containing the root. These methods are typically more robust than open domain methods, but are somewhat slower to converge, and require that you be able to bracket the root.
  • Open domain methods - These methods look for a root near a specified initial guess, but are not constrained in the region that they search for a root. These methods typically converge faster than closed domain methods, if the initial guess is "close enough" to the root.



Closed Domain Methods

Bisection

The bisection technique is fairly straightforward. Given an interval x=[a,b] that contains the root f(x)=0, we proceed as follows:

  1. Set c=(a+b)/2. We now have two intervals: [a,c] and [c,b].
  2. calculate f(c)
  3. choose the interval containing a sign change. In other words: if f(a) \cdot f(c) < 0 then we know a sign change occured on the interval [a,c] and we select that interval. Otherwise we select the interval [c,b]
  4. Return to step 1.

The following Matlab code implements the bisection algorithm:

function x = bisect( fname, a, b, tol )
% function x = bisect( fname, a, b, tol )
% fname - the name of the function we are solving
% a - lower bound of interval containing the root
% b - upper bound of interval containing the root
% tol - solution tolerance (absolute)
  err = 10*tol;
  while err>tol
   fa = feval(fname,a);
   c = 0.5*(a+b);
   fc = feval(fname,c);
   err = abs(fc);
   if fc*fa<0
      % f(c) is of opposite sign from f(a).
      % This means that a and c now bracket
      % the root. Set up for the next pass.
      b = c;
   else
      % f(c) is of same sign as f(a). Thus, b and
      % c bracket the root. Set up for next pass.
      a = c;
   end end
  % set the return value x=0.5*(a+b);
Example
Solve y=\exp(-x)\sin(5x) for y=0.5. Look for a root on the interval [0,0.4].

We proceed as follows:

  1. Write the equation in residual form: r(x) = 0.5-\exp(-x)\,\sin(5x)
  2. Set a=0, b=0.4. Calculate r(a)=0.5, r(b)=-0.1095.
  3. Set c=(a+b)/2 = 0.2
  4. evaluate r(c)=-0.19
  5. Choose the sub-interval containing a sign change. This is the interval [a,c].
  6. Reset the interval: [a,b] = [0,0.2]
  7. return to step 3.

The plot for the first iteration is shown below, and the results for the first 9 iterations are also shown.

Iteration a b c residual
Bisect demo iter0.png 1 0.0 -0.4 0.2 -0.19
2 0.0 0.2 0.1 0.066
3 0.1 0.2 0.15 -0.087
4 0.1 0.15 0.125 -0.016
5 0.1 0.125 0.1125 0.023
6 0.1125 0.125 0.1187 0.0032
7 0.1187 0.1250 0.1219 -0.0067
8 0.1187 0.1219 0.1203 -0.0018
9 0.1187 0.1203 0.1195 0.00069

Regula Falsi

The Regula Falsi method is a simple variation on the bisection method that is meant to improve convergence. Rather than bisecting the interval into two equal halves (setting c=(a+b)/2), we draw a line connecting f(a) and f(b) and set c to the value of x where the line crosses zero. Specifically,

  1. Calculate the slope of the line connecting f(a) and f(b), m = \frac{f(b)-f(a)}{b-a}
  2. Calculate c = b - f(b)/m
  3. Choose the interval where the sign change occurs
  4. return to step 1.

The figure below illustrates this for the same function we considered previously, r(x) = 0.5 - \sin(5x)\cdot\exp(-x) and the first several iterations are shown.

Iteration a b c residual
Regulafalsi demo iter0.png 1 0.0 -0.4 0.3281 -0.22
2 0.0 0.3281 0.2283 -0.224
3 0.0 0.2283 0.1578 -0.11
4 0.0 0.1578 0.1302 -0.032
5 0.0 0.1302 0.1124 -0.0082
6 0.0 0.1224 0.1204 -0.0020
7 0.0 0.1204 0.1199 -0.00409

Comparing the results here with those from the bisection method, we see that the Regula Falsi method converges faster.


Open Domain Methods

Open domain methods do not require an interval that bounds the solution. Rather, they require an initial guess for the solution and refine that guess until convergence is achieved. Here we consider the secant method and Newton's method.

Secant Method

The secant method requires two initial guesses for the solution. In contrast to the closed domain methods, these guesses do not need to bracket the root. Given two guesses a and b, the algorithm is

  1. Calculate the slope of the line connecting f(b) and f(a), m=\frac{f(b)-f(a)}{b-a}.
  2. Calculate the point where this line is zero: c=b-f(b)/m.
  3. Check for convergence. If we are not converged, set a=b,

b=c and return to step 1.

On the function we considered previously, r(x)=0.5-\sin(5x)\exp(-x), with initial guesses of 0 and 0.4, the following table shows the iteration:

Iteration a b c residual
1 0.0000 0.4000 0.3281 -0.22
2 0.4000 0.3281 0.4722 0.061
3 0.3281 0.4722 0.4407 -0.019
4 0.4722 0.4407 0.4482 -0.00068

The results here illustrate an important point. We converged to a different answer than we did using the bisection and regula falsi techniques! Try initial guesses of 0.0 and 0.1 and see what happens then...


Newton's Method

Newton's method is a bit more sophisticated than the secant method. It is also more robust and converges faster. It requires a single guess, but it also requies the function's derivative.

Newton's method is derived from a Taylor series expansion of a function. It gives the following result:

x_{k+1} = x_k + \frac{ f(x_k) }{ f^\prime(x_k) }

where x_k and x_{k+1} are the guesses for the root at iteration k and k+1 respectively.

We have two options to evaluate f^\prime(x):

  • Use an analytic expression. This is best but not always possible.
  • Use numerical differentiation. In this case, we only need the function itself and we can approximate its derivative.

Returning to our example of solving

r(x)=0.5-\sin(5x)\exp(-x),

we have

r^\prime(x) = \sin(5x)\exp(-x) - 5\cos(5x)\exp(-x).

If we use an initial guess of x=0.2 then the values for x are:

Convergence for initial guess x=0.2
Iteration (k) xk r(x) r′(x)
0 0.2000 -1.89e-01 -1.52e+00
1 0.0759 1.56e-01 -1.52e+00
2 0.1154 1.38e-02 -3.96e+00
3 0.1197 1.74e-04 -3.25e+00
Convergence for initial guess x=0.4
Iteration (k) xk r(x) r′(x)
0 0.4000 -1.10e-01 2.00e+00
1 0.4546 1.56e-02 2.00e+00
2 0.4485 1.45e-04 2.53e+00

Solving a System of Nonlinear Equations

When solving systems of nonlinear equations we are looking for the intersection of the equations. We can write a system of nonlinear equations as


\mathbf{f}(\mathbf{x})=\mathbf{0}\Longleftrightarrow\begin{array}{ccc}
f_{1}\left(x_{1},x_{2},\cdots x_{n}\right) & = & 0\\
f_{2}\left(x_{1},x_{2},\cdots x_{n}\right) & = & 0\\
\vdots & \vdots & \vdots\\
f_{n}\left(x_{1},x_{2},\cdots x_{n}\right) & = & 0\end{array}

Newton's Method for Systems of Equations

For a system of nonlinear equations, Newton's method is written in matrix form as


\left[\mathbf{J}\right]\left(\Delta\mathbf{x}\right)=-\mathbf{f}(\mathbf{x})

or


\left[\begin{array}{cccc}
J_{11} & J_{12} & \cdots & J_{1n}\\
J_{21} & J_{22} & \cdots & J_{2n}\\
\vdots & \vdots & \ddots & \vdots\\
J_{n1} & J_{n2} & \cdots & J_{nn}\end{array}\right]\left(\begin{array}{c}
\Delta x_{1}\\
\Delta x_{2}\\
\vdots\\
\Delta x_{n}\end{array}\right)=-\left(\begin{array}{c}
f_{1}(x_{1},x_{2},\cdots x_{n})\\
f_{2}(x_{1},x_{2},\cdots x_{n})\\
\vdots\\
f_{n}(x_{1},x_{2},\cdots x_{n})\end{array}\right)

Here, \Delta \mathbf{x} represents the change in all of the variables at this iteration. The [\mathbf{J}] matrix is called the Jacobian matrix and is defined as


 J_{ij}=\frac{\partial f_{i}}{\partial x_{j}}

In other words, it is the partial derivative of the ith function with respect to the jth independent variable.

Given a guess for the solution, the algorithm for Newton's method is

  1. Calculate the elements of the Jacobian matrix, J_{ij}=\frac{\partial f_{i}}{\partial x_{j}}.
  2. Calculate the function values at the current value for \mathbf{x}, \mathbf{f}(\mathbf{x})
  3. Solve the system of equations \left[\mathbf{J}\right]\left(\Delta\mathbf{x}\right)=-\mathbf{f}(\mathbf{x}) for \Delta\mathbf{x}
  4. Update the solution by adding \Delta\mathbf{x} to the current value of \mathbf{x}.
  5. Check for convergence. If not converged then go to step 1.


Example: Solve the system of equations


 \begin{cases}
   x^2 + y^2 =1 \\
   y = x^3-x+1
 \end{cases}

We can rewrite these as


\begin{cases}
 f_1(x,y) = x^2 + y^2 - 1 \\
 f_2(x,y) = y - x^3 + x - 1
\end{cases}

We need to be able to calculate the Jacobian. We do this as:


 J=
 \left[\begin{array}{cc}
  \frac{\partial f_{1}}{\partial x} & \frac{\partial f_{1}}{\partial y}\\
  \frac{\partial f_{2}}{\partial x} & \frac{\partial f_{2}}{\partial y}
 \end{array}\right]
 =
 \left[\begin{array}{cc}
  2x & 2y\\
 -3x^{2}+1 & 1
 \end{array}\right]

Now we are ready to solve the system of equations.

Intial guess: x=1.0, y=1.0
Iteration x y Residual L2 Norm
1 0.8333 0.6667 1.60e-01
2 0.7625 0.6687 3.11e-02
3 0.7476 0.6697 7.36e-03
4 0.7448 0.6683 1.40e-03
5 0.7443 0.6680 2.53e-04
6 0.7442 0.6680 4.49e-05
Intial guess: x=-1.0, y=-1.0
Iteration x y Residual L2 Norm
1 -2.5000 -2.0000 1.37e+01
2 -1.7790 0.6730 4.39e+00
3 -1.6207 -1.5069 3.90e+00
4 -1.3919 -0.0618 9.72e-01
5 -2.0829 -3.6297 1.67e+01
10 -1.1864 0.5305 6.89e-01
15 -2.2152 -4.0203 2.04e+01
20 -1.4098 -0.3367 1.10e+00
25 0.9635 0.8885 7.19e-01
30 0.7443 0.6680 1.79e-04
31 0.7442 0.6680 3.18e-05
Intial guess: x=0.0, y=0.0
Iteration x y Residual L2 Norm
1 0.5000 0.5000 5.15e-01
2 0.7143 0.5714 1.81e-01
3 0.7424 0.6651 6.73e-03
4 0.7439 0.6677 7.50e-04
5 0.7441 0.6679 1.32e-04
6 0.7442 0.6680 2.34e-05


The Matlab file to generate these results can be downloaded here.
NonlinSysDemo2 plot.png
The figure to the right shows a plot of these two functions.

Note that there are two solutions to the system of equations, but in the results above we only found one of the two roots. The root at (x,y)=(0,1) was not found. You can verify that this is a root by substituting it into the original equations and seeing that it satifies both equations. Note that if we evaluate the Jacobian at (0,1), we obtain J=\left[\begin{smallmatrix} 2 & 2\\ 1 &
1\end{smallmatrix}\right]. This matrix cannot be inverted since the second row is a multiple of the first. This illustrates problems that Newton's method can run into.


Solution Approaches: Single Nonlinear Equation

Excel

To solve a nonlinear equation in Excel, we have to options:

  • Goal Seek is a simple way to solve a single nonlinear equation.
  • Solver is a more powerful way to solve a nonlinear equation. It can also perform minimization, maximization, and can solve systems of nonlinear equations as well.


Goal Seek

Warn.jpg
This section is a stub and needs to be expanded.
If you can provide information or finish this section you're welcome to do so and then remove this message afterwards.

Solver

Warn.jpg
This section is a stub and needs to be expanded.
If you can provide information or finish this section you're welcome to do so and then remove this message afterwards.


Matlab

To solve a single nonlinear equation in Matlab, we use the fzero function. If, however, we are solving for the roots of a polynomial, we can use the roots function. This will solve for all of the polynomial roots (both real and imaginary).

ROOTS

The roots function can be used to obtain all of the roots (both real and imaginary) of a polynomial
  xroots = roots( coefs );
  • coefs are the polynomial coefficients, in descending order.
  • xroots is a vector containing all of the polynomial roots.

For example, consider the quadratic equation y=3x^2-2x+1. From the quadratic formula, we can identify the roots as


  x = \frac{1 \pm i\sqrt{2}}{3} where i\equiv\sqrt{-1} is the imaginary number.
Using the roots function we have
 x = roots( [3 -2 1] );
which gives
 x =
  0.3333 + 0.4714i
  0.3333 - 0.4714i

This is equivalent to the answer above obtained via the quadratic formula.


FZERO

In Matlab, fzero is the most flexible way to find the roots of a nonlinear equation. Its syntax is slightly different depending on the type of function we are using:

  • For a function stored in an m-file named myFunction.m we use
 x = fzero( 'myFunction', xguess );
 x = fzero( fun, xguess );
  • For a built in function (like sin, exp, etc.)
 x = fzero( @sin, xguess );

Note that fzero can only find real roots. Therefore, if we tried to use it on the quadratic function y=3x^2-2x+1, it would fail. To demonstrate its usage, let's solve for the roots of the function

 y=3x^2-2x-1

From the quadratic formula, or by factoring this equation, we know that its roots should be x=-1/3 and x=1.

To use fzero to solve this, we could use an anonymous function:

 f = @(x)(3*x.^2-2*x-1); x1 = fzero( f, -2 ); % start looking for the root at -2.0 x2 = fzero( f, 2 ); % start looking for the root at 2.0

This results in x1=-0.33333 and x2=1. Note that here we found the two roots by using different initial guesses. Of course it helps to know where the roots are so that you can supply decent initial guesses!

To solve this using a m-file, first create the function you will use: {| border="1" cellpadding="5" cellspacing="0" |+ style="background:lightgrey" | myQuadratic.m |-

|
 function y = myQuadratic(x) y = 3*x.^2 - 2*x -1
|} Save this in a file myQuadratic.m and then use fzero:
 x1 = fzero( 'myQuadratic', -2.0 ); % start looking for the root at -2.0 x2 = fzero( 'myQuadratic', 2.0 ); % start looking for the root at 2.0


FMINSEARCH

Occasionally we want to maximize or minimize a function. Matlab provides a tool called fminsearch to do this. It searches for the minimum of a function near a starting guess. As with fzero, you use this in different ways depending on what kind of function you are minimizing:

 x = fminsearch( 'myFun', xo ); % when using a function in an m-file
 x = fminsearch( myFun, xo ); % when using an anonymous function
 x = fminsearch( @myFun, xo ); % when using a built-in function
Warn.jpg
This section is a stub and needs to be expanded.
If you can provide information or finish this section you're welcome to do so and then remove this message afterwards.



Solution Approaches: System of Nonlinear Equations

Warn.jpg
This section is a stub and needs to be expanded.
If you can provide information or finish this section you're welcome to do so and then remove this message afterwards.

Excel

Solver...

Matlab

fsolve