Nonlinear equations
Revision as of 14:02, 5 August 2009

== Introduction ==

Nonlinear equations arise ubiquitously in science and engineering applications. There are several key distinctions between linear and nonlinear equations:

; Number of roots (solutions)
: For linear systems of equations there is a single unique solution for a well-posed system.
: For nonlinear equations there may be anywhere from zero to many roots.

; Solution Approach
: For linear systems, an exact solution exists if the system is well-posed.
: For nonlinear systems, exact solutions are occasionally possible, but not in general; solution methods are iterative.


== Residual Form ==

Typically when solving a nonlinear equation we solve for its roots - i.e. where the equation equals zero. In some cases we want to solve an equation for the point where <math>f(x)=\alpha</math>. In such a case, we must rewrite the equation in ''residual form'':

:{| border="1" cellpadding="5" cellspacing="0" style="text-align:center"
|-
| Original Equation || Residual Form
|-
| <math>f(x)=\alpha</math> || <math>r(x)=f(x)-\alpha</math>
|}

For example, if we wanted to solve <math>y=x^2\sin(x)</math> for the value of <math>x</math> where <math>y=3</math>, we would solve for the roots of the function <math>r(x)=x^2\sin(x)-3</math>.

== Solving a Single Nonlinear Equation ==

[[Image:Nonlinear_categorization.png]]

There are two general classes of techniques for solving nonlinear equations:

* [[#Closed_Domain_Methods|Closed domain methods]] - These methods look for a root inside a specified interval by iteratively shrinking the interval containing the root. These methods are typically more robust than [[#Open_Domain_Methods|open domain methods]], but are somewhat slower to converge, and require that you be able to bracket the root.
* [[#Open_Domain_Methods|Open domain methods]] - These methods look for a root near a specified initial guess, but are not constrained in the region that they search for a root. These methods typically converge faster than closed domain methods, if the initial guess is "close enough" to the root.



=== Closed Domain Methods ===
{{Stub|section}}


==== Bisection ====

The bisection technique is fairly straightforward. Given an interval <math>x=[a,b]</math> that contains the root of <math>f(x)=0</math>, we proceed as follows:

# Set <math>c=(a+b)/2</math>. We now have two intervals: <math>[a,c]</math> and <math>[c,b]</math>.
# Calculate <math>f(c)</math>.
# Choose the interval containing a sign change. In other words: if <math>f(a) \cdot f(c) < 0</math> then we know a sign change occurred on the interval <math>[a,c]</math>, and we select that interval. Otherwise we select the interval <math>[c,b]</math>.
# Return to step 1.

The following Matlab code implements the bisection algorithm:

<source lang="matlab">
function x = bisect( fname, a, b, tol )
% function x = bisect( fname, a, b, tol )
%
% fname - the name of the function we are solving
% a     - lower bound of interval containing the root
% b     - upper bound of interval containing the root
% tol   - solution tolerance (absolute)

err = 10*tol;
while err>tol
   fa = feval(fname,a);
   c  = 0.5*(a+b);
   fc = feval(fname,c);
   err = abs(fc);
   if fc*fa<0
      % f(c) is of opposite sign from f(a).
      % This means that a and c now bracket
      % the root. Set up for the next pass.
      b = c;
   else
      % f(c) is of same sign as f(a). Thus, b and
      % c bracket the root. Set up for next pass.
      a = c;
   end
end

% set the return value
x = 0.5*(a+b);
</source>
;Example:
: Solve <math>y=\exp(-x)\sin(5x)</math> for <math>y=0.5</math>. Look for a root on the interval [0,0.4].

We proceed as follows:

# Write the equation in residual form: <math>r(x) = 0.5-\exp(-x)\sin(5x)</math>.
# Set <math>a=0</math>, <math>b=0.4</math>. Calculate <math>r(a)=0.5</math>, <math>r(b)=-0.1095</math>.
# Set <math>c=(a+b)/2 = 0.2</math>.
# Evaluate <math>r(c)=-0.19</math>.
# Choose the sub-interval containing a sign change. This is the interval <math>[a,c]</math>.
# Reset the interval: <math>[a,b] = [0,0.2]</math>.
# Return to step 3.

The plot for the first iteration is shown below, and the results for the first 9 iterations are also shown.

{| cellpadding="4" cellspacing="0" style="text-align:center"
!
! style="background:#efefef;" | Iteration
! style="background:#efefef;" | a
! style="background:#efefef;" | b
! style="background:#efefef;" | c
! style="background:#efefef;" | residual
|-
| rowspan="9" | [[Image:bisect_demo_iter0.png|350px]]
|| 1 || 0.0 || 0.4 || 0.2 || -0.19
|-
| 2 || 0.0 || 0.2 || 0.1 || 0.066
|-
| 3 || 0.1 || 0.2 || 0.15 || -0.087
|-
| 4 || 0.1 || 0.15 || 0.125 || -0.016
|-
| 5 || 0.1 || 0.125 || 0.1125 || 0.023
|-
| 6 || 0.1125 || 0.125 || 0.1187 || 0.0032
|-
| 7 || 0.1187 || 0.1250 || 0.1219 || -0.0067
|-
| 8 || 0.1187 || 0.1219 || 0.1203 || -0.0018
|-
| 9 || 0.1187 || 0.1203 || 0.1195 || 0.00069
|}
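The iterations in the table can be reproduced with a few lines of code. The sketch below is in Python rather than the article's Matlab, purely for illustration; the helper names are chosen here and are not part of the article:

```python
import math

def r(x):
    # residual form of exp(-x)*sin(5x) = 0.5
    return 0.5 - math.exp(-x) * math.sin(5 * x)

def bisect(f, a, b, tol=1e-4):
    # assumes f(a) and f(b) have opposite signs on entry
    while True:
        c = 0.5 * (a + b)
        if abs(f(c)) <= tol:
            return c
        if f(a) * f(c) < 0:
            b = c   # sign change is in [a, c]
        else:
            a = c   # sign change is in [c, b]

root = bisect(r, 0.0, 0.4)   # first midpoint is 0.2 with r = -0.19, as in the table
```

With the absolute residual tolerance of 1e-4 the iteration continues a few steps past the table above, ending near x = 0.1197.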


==== Regula Falsi ====

The Regula Falsi method is a simple variation on the [[#Bisection|bisection]] method that is meant to improve convergence. Rather than bisecting the interval into two equal halves (setting <math>c=(a+b)/2</math>), we draw a line connecting <math>f(a)</math> and <math>f(b)</math> and set <math>c</math> to the value of <math>x</math> where the line crosses zero. Specifically,

# Calculate the slope of the line connecting <math>f(a)</math> and <math>f(b)</math>: <math>m = \frac{f(b)-f(a)}{b-a}</math>.
# Calculate <math>c = b - f(b)/m</math>.
# Choose the interval where the sign change occurs.
# Return to step 1.

The figure below illustrates this for the same function we considered previously, <math>r(x) = 0.5 - \sin(5x)\cdot\exp(-x)</math>, and the first several iterations are shown.

{| cellpadding="4" cellspacing="0" style="text-align:center"
!
! style="background:#efefef;" | Iteration
! style="background:#efefef;" | a
! style="background:#efefef;" | b
! style="background:#efefef;" | c
! style="background:#efefef;" | residual
|-
| rowspan="7" | [[Image:regulafalsi_demo_iter0.png|350px]]
|| 1 || 0.0 || 0.4 || 0.3281 || -0.22
|-
| 2 || 0.0 || 0.3281 || 0.2283 || -0.224
|-
| 3 || 0.0 || 0.2283 || 0.1578 || -0.11
|-
| 4 || 0.0 || 0.1578 || 0.1302 || -0.032
|-
| 5 || 0.0 || 0.1302 || 0.1224 || -0.0082
|-
| 6 || 0.0 || 0.1224 || 0.1204 || -0.0020
|-
| 7 || 0.0 || 0.1204 || 0.1199 || -0.00048
|}

Comparing the results here with those from the bisection method, we see that the Regula Falsi method converges faster.
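The update rule is compact enough to sketch directly (again in Python for illustration; the function and variable names follow the steps above and are chosen here):

```python
import math

def r(x):
    return 0.5 - math.sin(5 * x) * math.exp(-x)

def regula_falsi(f, a, b, tol=1e-4, max_iter=50):
    # assumes f(a) and f(b) have opposite signs on entry
    c = b
    for _ in range(max_iter):
        m = (f(b) - f(a)) / (b - a)   # slope of the line through f(a), f(b)
        c = b - f(b) / m              # where that line crosses zero
        if abs(f(c)) <= tol:
            break
        if f(a) * f(c) < 0:
            b = c
        else:
            a = c
    return c

root = regula_falsi(r, 0.0, 0.4)   # first c is 0.3281, as in the table
```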


=== Open Domain Methods ===

Open domain methods do not require an interval that bounds the solution. Rather, they require an initial guess for the solution and refine that guess until convergence is achieved. Here we consider the secant method and Newton's method.

==== Secant Method ====

The secant method requires two initial guesses for the solution. In contrast to the [[#Closed_Domain_Methods|closed domain methods]], these guesses do not need to bracket the root. Given two guesses <math>a</math> and <math>b</math>, the algorithm is:

# Calculate the slope of the line connecting <math>f(b)</math> and <math>f(a)</math>: <math>m=\frac{f(b)-f(a)}{b-a}</math>.
# Calculate the point where this line is zero: <math>c=b-f(b)/m</math>.
# Check for convergence. If we are not converged, set <math>a=b</math>, <math>b=c</math>, and return to step 1.

On the function we considered previously, <math>r(x)=0.5-\sin(5x)\exp(-x)</math>, with initial guesses of 0 and 0.4, the following table shows the iteration:

{| cellpadding="5" cellspacing="0" style="text-align:center"
! style="background:#efefef;" | Iteration
! style="background:#efefef;" | a
! style="background:#efefef;" | b
! style="background:#efefef;" | c
! style="background:#efefef;" | residual
|-
| 1 || 0.0000 || 0.4000 || 0.3281 || -0.22
|-
| 2 || 0.4000 || 0.3281 || 0.4722 || 0.061
|-
| 3 || 0.3281 || 0.4722 || 0.4407 || -0.019
|-
| 4 || 0.4722 || 0.4407 || 0.4482 || -0.00068
|}

The results here illustrate an important point. We converged to a different answer than we did using the bisection and regula falsi techniques! Try initial guesses of 0.0 and 0.1 and see what happens then...
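The same experiment is easy to repeat. A minimal sketch (in Python rather than the article's Matlab; names chosen here for illustration):

```python
import math

def r(x):
    return 0.5 - math.sin(5 * x) * math.exp(-x)

def secant(f, a, b, tol=1e-6, max_iter=50):
    c = b
    for _ in range(max_iter):
        m = (f(b) - f(a)) / (b - a)   # slope through the two latest guesses
        c = b - f(b) / m              # where that line crosses zero
        if abs(f(c)) <= tol:
            break
        a, b = b, c                   # shift the guesses; no bracketing required
    return c

root_a = secant(r, 0.0, 0.4)   # ~0.448, the root the table above converges to
root_b = secant(r, 0.0, 0.1)   # ~0.120, the root that bisection found
```

Because the guesses need not bracket a root, different starting pairs can land on different roots, exactly the behavior described above.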


==== Newton's Method ====

Newton's method is a bit more sophisticated than the secant method. It is also more robust and converges faster. It requires only a single guess, but it also requires the function's derivative.

Newton's method is derived from a Taylor series expansion of a function. It gives the following result:
:<math>x_{k+1} = x_k - \frac{ f(x_k) }{ f^\prime(x_k) }</math>
where <math>x_k</math> and <math>x_{k+1}</math> are the guesses for the root at iterations <var>k</var> and <var>k+1</var>, respectively.

We have two options to evaluate <math>f^\prime(x)</math>:

* Use an analytic expression. This is best, but not always possible.
* Use [[Numerical_Differentiation|numerical differentiation]]. In this case, we only need the function itself and we can approximate its derivative.

Returning to our example of solving
:<math>r(x)=0.5-\sin(5x)\exp(-x)</math>,
we have
:<math>r^\prime(x) = \sin(5x)\exp(-x) - 5\cos(5x)\exp(-x)</math>.
If we use an initial guess of <var>x=0.2</var> then the values for <var>x</var> are:

{| cellpadding="10" cellspacing="0" align="center"
|-
| valign="top" |
{| cellpadding="5" cellspacing="0" style="text-align:center"
|+ Convergence for initial guess <var>x=0.2</var>
! style="background:#efefef;" | Iteration (<var>k</var>)
! style="background:#efefef;" | <var>x<sub>k</sub></var>
! style="background:#efefef;" | <var>r(x)</var>
! style="background:#efefef;" | <var>r&prime;(x)</var>
|-
| 0 || 0.2000 || -1.89e-01 || -1.52e+00
|-
| 1 || 0.0759 || 1.56e-01 || -1.52e+00
|-
| 2 || 0.1154 || 1.38e-02 || -3.96e+00
|-
| 3 || 0.1197 || 1.74e-04 || -3.25e+00
|}
| valign="top" |
{| cellpadding="5" cellspacing="0" style="text-align:center"
|+ Convergence for initial guess <var>x=0.4</var>
! style="background:#efefef;" | Iteration (<var>k</var>)
! style="background:#efefef;" | <var>x<sub>k</sub></var>
! style="background:#efefef;" | <var>r(x)</var>
! style="background:#efefef;" | <var>r&prime;(x)</var>
|-
| 0 || 0.4000 || -1.10e-01 || 2.00e+00
|-
| 1 || 0.4546 || 1.56e-02 || 2.00e+00
|-
| 2 || 0.4485 || 1.45e-04 || 2.53e+00
|}
|}
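Both tables can be reproduced with the analytic derivative (a Python sketch for illustration; the article's own work is in Matlab):

```python
import math

def r(x):
    return 0.5 - math.sin(5 * x) * math.exp(-x)

def rprime(x):
    # analytic derivative of r, as given above
    return math.sin(5 * x) * math.exp(-x) - 5 * math.cos(5 * x) * math.exp(-x)

def newton(f, fprime, x, tol=1e-6, max_iter=50):
    for _ in range(max_iter):
        if abs(f(x)) <= tol:
            break
        x = x - f(x) / fprime(x)   # Newton update (note the minus sign)
    return x

x_left  = newton(r, rprime, 0.2)   # ~0.1197, as in the left table
x_right = newton(r, rprime, 0.4)   # ~0.4484, as in the right table
```

Note that, like the secant method, the two starting guesses converge to two different roots.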


== Solving a System of Nonlinear Equations ==
{{Stub|section}}

=== Newton's Method ===
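Although this section is still a stub, the idea generalizes directly: for a system <math>\mathbf{r}(\mathbf{x})=0</math>, the scalar derivative is replaced by the Jacobian matrix, and each iteration solves a linear system for the update. The sketch below (Python, with a hypothetical 2x2 example chosen purely for demonstration) solves <math>x^2+y^2=1</math>, <math>x=y</math>:

```python
def residual(x, y):
    # hypothetical example system: x^2 + y^2 - 1 = 0 and x - y = 0
    return (x * x + y * y - 1.0, x - y)

def jacobian(x, y):
    # analytic Jacobian of the system above
    return ((2 * x, 2 * y), (1.0, -1.0))

def newton_system(x, y, tol=1e-10, max_iter=50):
    for _ in range(max_iter):
        r1, r2 = residual(x, y)
        if max(abs(r1), abs(r2)) <= tol:
            break
        (a, b), (c, d) = jacobian(x, y)
        det = a * d - b * c
        # solve J * [dx, dy]^T = -[r1, r2]^T by Cramer's rule (2x2 only)
        dx = (-r1 * d + r2 * b) / det
        dy = (-a * r2 + c * r1) / det
        x, y = x + dx, y + dy
    return x, y

x, y = newton_system(1.0, 0.5)
# both components converge to 1/sqrt(2) ~ 0.7071
```

For larger systems the 2x2 Cramer's-rule solve would be replaced by a general linear solver.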

== Solution Approaches: Single Nonlinear Equation ==

=== Excel ===

To solve a nonlinear equation in Excel, we have two options:

* [[#Goal_Seek|Goal Seek]] is a simple way to solve a single nonlinear equation.
* [[#Solver|Solver]] is a more powerful way to solve a nonlinear equation. It can also perform minimization, maximization, and can solve systems of nonlinear equations as well.


==== Goal Seek ====
{{Stub|section}}

==== Solver ====
{{Stub|section}}


=== Matlab ===

To solve a single nonlinear equation in Matlab, we use the [[#FZERO|<tt>fzero</tt>]] function. If, however, we are solving for the roots of a polynomial, we can use the [[#ROOTS|<tt>roots</tt>]] function. This will solve for ''all'' of the polynomial roots (both real and imaginary).

==== ROOTS ====

The <tt>roots</tt> function can be used to obtain all of the roots (both real and imaginary) of a polynomial:
<source lang="matlab">
   xroots = roots( coefs );
</source>
* <tt>coefs</tt> are the polynomial coefficients, in ''descending'' order.
* <tt>xroots</tt> is a vector containing all of the polynomial roots.

For example, consider the quadratic equation <math>y=3x^2-2x+1</math>. From the [http://en.wikipedia.org/wiki/Quadratic_formula#Quadratic_formula quadratic formula], we can identify the roots as
:<math>x = \frac{1 \pm i\sqrt{2}}{3}</math>
where <math>i\equiv\sqrt{-1}</math> is the imaginary number.

Using the <tt>roots</tt> function we have
<source lang="matlab">
  x = roots( [3 -2 1] );
</source>
which gives
   x =
   0.3333 + 0.4714i
   0.3333 - 0.4714i

This is equivalent to the answer above obtained via the quadratic formula.
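As a cross-check, the quadratic formula gives the same pair (sketched here with Python's standard complex-math library rather than Matlab):

```python
import cmath

a, b, c = 3.0, -2.0, 1.0
disc = cmath.sqrt(b * b - 4 * a * c)   # sqrt(-8): purely imaginary
r1 = (-b + disc) / (2 * a)
r2 = (-b - disc) / (2 * a)
# r1, r2 are approximately 0.3333 +/- 0.4714i, matching roots([3 -2 1])
```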


==== FZERO ====

In Matlab, <tt>fzero</tt> is the most flexible way to find the roots of a nonlinear equation. Its syntax is slightly different depending on the type of function we are using:

* For a function stored in an m-file named <tt>myFunction.m</tt> we use
:<source lang="matlab">
x = fzero( 'myFunction', xguess );
</source>
* For an [[Matlab_Functions#Anonymous_Functions|anonymous function]] named <tt>fun</tt>
:<source lang="matlab">
x = fzero( fun, xguess );
</source>
* For a built-in function (like <tt>sin</tt>, <tt>exp</tt>, etc.)
:<source lang="matlab">
x = fzero( @sin, xguess );
</source>

Note that <tt>fzero</tt> can only find real roots. Therefore, if we tried to use it on the quadratic function <math>y=3x^2-2x+1</math>, it would fail. To demonstrate its usage, let's solve for the roots of the function
:<math>y=3x^2-2x-1</math>

From the quadratic formula, or by factoring this equation, we know that its roots should be <math>x=-1/3</math> and <math>x=1</math>.

To use <tt>fzero</tt> to solve this, we could use an anonymous function:

<source lang="matlab">
f  = @(x)(3*x.^2-2*x-1);
x1 = fzero( f, -2 );  % start looking for the root at -2.0
x2 = fzero( f, 2 );   % start looking for the root at 2.0
</source>

This results in <tt>x1=-0.33333</tt> and <tt>x2=1</tt>. Note that here we found the two roots by using different initial guesses. Of course it helps to know where the roots are so that you can supply decent initial guesses!

To solve this using an m-file, first create the function you will use:

{| border="1" cellpadding="5" cellspacing="0"
|+ style="background:lightgrey" | myQuadratic.m
|-
|<source lang="matlab">
function y = myQuadratic(x)
y = 3*x.^2 - 2*x - 1;
</source>
|}

Save this in a file <tt>myQuadratic.m</tt> and then use <tt>fzero</tt>:
<source lang="matlab">
x1 = fzero( 'myQuadratic', -2.0 );  % start looking for the root at -2.0
x2 = fzero( 'myQuadratic',  2.0 );  % start looking for the root at 2.0
</source>


==== FMINSEARCH ====

Occasionally we want to maximize or minimize a function. Matlab provides a tool called <tt>fminsearch</tt> to do this. It searches for the minimum of a function near a starting guess. As with [[#FZERO|<tt>fzero</tt>]], you use it in different ways depending on what kind of function you are minimizing:

<source lang="matlab">
  x = fminsearch( 'myFun', xo );  % when using a function in an m-file
  x = fminsearch( myFun,   xo );  % when using an anonymous function
  x = fminsearch( @myFun,  xo );  % when using a built-in function
</source>
{{Stub|section}}



== Solution Approaches: System of Nonlinear Equations ==
{{Stub|section}}

=== Excel ===

Solver...

=== Matlab ===

<tt>fsolve</tt>