In numerical analysis, Steffensen's method is a root-finding technique named after Johan Frederik Steffensen which is similar to Newton's method. Steffensen's method also achieves quadratic convergence, but without using derivatives as Newton's method does.


Simple description

The simplest form of the formula for Steffensen's method occurs when it is used to find a zero of a real function ~ f ~; that is, to find the real value ~ x_\star ~ that satisfies ~ f(x_\star) = 0 ~. Near the solution ~ x_\star ~, the function ~ f ~ is supposed to approximately satisfy ~ -1 < f'(x_\star) < 0 ~; this condition makes ~ f ~ adequate as a correction-function for ~ x ~ for finding its ''own'' solution, although it is not strictly required for the method to work. For some functions, Steffensen's method can work even if this condition is not met, but in such a case, the starting value ~ x_0 ~ must be ''very'' close to the actual solution ~ x_\star ~, and convergence to the solution may be slow.

Given an adequate starting value ~ x_0 ~, a sequence of values ~ x_0, \, x_1, \, x_2, \dots, \, x_n, \dots ~ can be generated using the formula below. When it works, each value in the sequence is much closer to the solution ~ x_\star ~ than the prior value. The value ~ x_n ~ from the current step generates the value ~ x_{n+1} ~ for the next step, via this formula:

: x_{n+1} = x_n - \frac{f(x_n)}{g(x_n)} ~ for ~ n = 0, 1, 2, 3, \dots ~;

where the slope function ~ g(x) ~ is a composite of the original function ~ f ~ given by the following formula:

: g(x) = \frac{f\bigl(x + f(x)\bigr)}{f(x)} - 1 ~

or perhaps more clearly,

: g(x) = \frac{f(x + h) - f(x)}{h} ~

where ~ h = f(x) ~ is a step-size between the last iteration point, ~ x ~, and an auxiliary point located at ~ x + h ~.

The function ~ g ~ is the average value for the slope of the function ~ f ~ between the last sequence point ~ \left( x, y \right) = \bigl( x_n, \, f\left( x_n \right) \bigr) ~ and the auxiliary point ~ \bigl( x, y \bigr) = \bigl( x_n + h, \, f\left( x_n + h \right) \bigr) ~, with the step ~ h = f(x_n) ~. It is also called the first-order divided difference of ~ f ~ between those two points.

It is only for the purpose of finding ~ h ~ for this auxiliary point that the value of the function ~ f ~ must be an adequate correction to get closer to its own solution, and for that reason fulfill the requirement that ~ -1 < f'(x_\star) < 0 ~. For all other parts of the calculation, Steffensen's method only requires the function ~ f ~ to be continuous and to actually have a nearby solution. Several modest modifications of the step ~ h ~ in the formula for the slope ~ g ~, such as rescaling it by a modest constant factor, exist to accommodate functions ~ f ~ that do not quite meet the requirement.
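The iteration above can be sketched in a few lines of Python. This is an illustrative implementation, not taken from a reference text: the function name, the tolerance, and the test function ~ f(x) = \sin(x) - x/2 ~ (chosen because it satisfies ~ -1 < f'(x_\star) < 0 ~ near its positive root) are all arbitrary choices for demonstration.

```python
import math

def steffensen(f, x0, tol=1e-12, max_iter=50):
    """Find a root of f by Steffensen's method: no derivatives required."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if fx == 0.0:                # Already exactly at a root.
            return x
        h = fx                       # Step size h = f(x) to the auxiliary point.
        g = (f(x + h) - fx) / h      # First-order divided difference (slope g).
        if g == 0.0:
            raise ZeroDivisionError("slope g vanished; try another x0")
        x_next = x - fx / g          # Newton-like update with g in place of f'.
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("no convergence within max_iter steps")

# Positive root of sin(x) - x/2 = 0, near x = 1.8955.
root = steffensen(lambda x: math.sin(x) - x / 2, 2.0)
```

Each pass costs two evaluations of `f` (at `x` and at the auxiliary point `x + h`) but no derivative, which is the trade-off discussed in the next section.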


Advantages and drawbacks

The main advantage of Steffensen's method is that it has quadratic convergence like Newton's method – that is, both methods find roots to an equation ~ f ~ just as 'quickly'. In this case ''quickly'' means that for both methods, the number of correct digits in the answer doubles with each step. But the formula for Newton's method requires evaluation of the function's derivative ~ f' ~ as well as the function ~ f ~, while Steffensen's method only requires ~ f ~ itself. This is important when the derivative is not easily or efficiently available.

The price for the quick convergence is the double function evaluation: Both ~ f(x_n) ~ and ~ f(x_n + h) ~ must be calculated, which might be time-consuming if ~ f ~ is a complicated function. For comparison, the secant method needs only one function evaluation per step. The secant method increases the number of correct digits by "only" a factor of roughly 1.6 per step, but one can do twice as many steps of the secant method within a given time. Since the secant method can carry out twice as many steps in the same time as Steffensen's method, when both algorithms succeed, the secant method actually converges faster than Steffensen's method in practical use: The secant method achieves a factor of about ~ (1.6)^2 \approx 2.6 ~ times as many digits for every two steps (two function evaluations), compared to Steffensen's factor of 2 for every one step (two function evaluations).

Similar to most other iterative root-finding algorithms, the crucial weakness in Steffensen's method is the choice of the starting value ~ x_0 ~. If the value of ~ x_0 ~ is not 'close enough' to the actual solution ~ x_\star ~, the method may fail, and the sequence of values ~ x_0, \, x_1, \, x_2, \, x_3, \dots ~ may either flip-flop between two extremes, or diverge to infinity, or both.


Derivation using Aitken's delta-squared process

The version of Steffensen's method implemented in the MATLAB code shown below can be found using Aitken's delta-squared process for accelerating convergence of a sequence. To compare the following formulae to the formulae in the section above, notice that ~ x_n = p \, - \, p_n ~. This method assumes starting with a linearly convergent sequence and increases the rate of convergence of that sequence. If the signs of ~ p_n, \, p_{n+1}, \, p_{n+2} ~ agree and ~ p_n ~ is 'sufficiently close' to the desired limit of the sequence ~ p ~, we can assume the following:

: \frac{p_{n+1} - p}{p_n - p} ~\approx~ \frac{p_{n+2} - p}{p_{n+1} - p} ~

then

: (p_{n+1} - p)^2 ~\approx~ (\, p_{n+2} - p \,)\,(\, p_n - p \,) ~

so

: p_{n+1}^2 - 2 \, p_{n+1} \, p + p^2 ~\approx~ p_{n+2} \; p_n - (\, p_n + p_{n+2} \,) \, p + p^2

and hence

: (\, p_{n+2} - 2 \, p_{n+1} + p_n \,) \, p ~\approx~ p_{n+2} \, p_n - p_{n+1}^2 ~.

Solving for the desired limit of the sequence ~ p ~ gives:

: p ~\approx~ \frac{p_{n+2} \, p_n - p_{n+1}^2}{p_{n+2} - 2 \, p_{n+1} + p_n} ~=~ \frac{p_n \, (\, p_{n+2} - 2 \, p_{n+1} + p_n \,) - (\, p_{n+1}^2 - 2 \, p_n \, p_{n+1} + p_n^2 \,)}{p_{n+2} - 2 \, p_{n+1} + p_n}

: =~ p_n - \frac{(\, p_{n+1} - p_n \,)^2}{p_{n+2} - 2 \, p_{n+1} + p_n} ~,

which results in the more rapidly convergent sequence:

: p ~\approx~ p_{n+3} ~=~ p_n - \frac{(\, p_{n+1} - p_n \,)^2}{p_{n+2} - 2 \, p_{n+1} + p_n} ~.
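The final formula can be applied repeatedly, feeding each accelerated value back in as the new ~ p_n ~. The following sketch is illustrative rather than canonical: the function name, the tolerance, and the iteration function ~ g(x) = e^{-x} ~ (whose fixed point is near 0.5671) are arbitrary choices for demonstration.

```python
import math

def steffensen_aitken(g, p0, tol=1e-12, max_iter=100):
    """Accelerate the fixed-point iteration p -> g(p) with Aitken's delta-squared."""
    for _ in range(max_iter):
        p1 = g(p0)                        # Two ordinary fixed-point steps ...
        p2 = g(p1)
        denom = p2 - 2.0 * p1 + p0
        if denom == 0.0:                  # Sequence already (numerically) converged.
            return p2
        p = p0 - (p1 - p0) ** 2 / denom   # ... then one Aitken extrapolation.
        if abs(p - p0) < tol:
            return p
        p0 = p                            # Restart from the accelerated value.
    raise RuntimeError("no convergence within max_iter steps")

# Fixed point of g(x) = exp(-x), i.e. the solution of x = exp(-x), near 0.5671.
fp = steffensen_aitken(lambda x: math.exp(-x), 0.5)
```

The plain iteration ~ p \to e^{-p} ~ converges only linearly; the acceleration makes each restart roughly double the number of correct digits.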


Code example


In Matlab

Here is the source for an implementation of Steffensen's method in MATLAB.

function Steffensen(f, p0, tol)
% This function takes as inputs: a fixed-point iteration function, f,
% an initial guess to the fixed point, p0, and a tolerance, tol.
% The fixed-point iteration function is assumed to be input as an
% inline function.
% This function will calculate and return the fixed point, p,
% that makes the expression f(x) = p true to within the desired
% tolerance, tol.

format compact  % This shortens the output.
format long     % This prints more decimal places.

for i = 1:1000
    % Do a large, but finite, number of iterations, so that if the
    % method fails to converge we won't be stuck in an infinite loop.
    p1 = f(p0);  % Calculate the next two guesses for the fixed point.
    p2 = f(p1);
    % Use Aitken's delta-squared method to find a better approximation to p0.
    p = p0 - (p1 - p0)^2/(p2 - 2*p1 + p0)
    if abs(p - p0) < tol  % Met the tolerance: stop iterating.
        break
    end
    p0 = p;
end

if abs(p - p0) > tol  % If we fail to meet the tolerance, we output a
                      % message of failure.
    'failed to converge in 1000 iterations.'
end


In Python

Here is the source for an implementation of Steffensen's method in Python.

from typing import Callable, Iterator

Func = Callable[[float], float]


def g(f: Func, x: float, fx: float) -> Func:
    """First-order divided difference function.

    Arguments:
        f: Function input to g
        x: Point at which to evaluate g
        fx: Function f evaluated at x
    """
    return lambda x: f(x + fx) / fx - 1


def steff(f: Func, x: float) -> Iterator[float]:
    """Steffensen algorithm for finding roots.

    This generator yields the x_{n+1} value first; then, each time the
    generator iterates, it yields the next value x_{n+2}, x_{n+3}, ...
    of the sequence.

    Arguments:
        f: Function whose root we are searching for
        x: Starting value; on pass n of the loop, x is x_n
    """
    while True:
        fx = f(x)
        if fx == 0:
            break  # Already exactly at a root.
        gx = g(f, x, fx)(x)
        if gx == 0:
            break  # Divided difference vanished; cannot continue.
        x = x - fx / gx  # Update to x_{n+1}
        yield x  # Yield value
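A short driver for a generator of this shape might look like the following. The snippet is self-contained (it re-declares a minimal version of the generator so it runs on its own), and the target ~ f(x) = x^2 - 2 ~, whose root is ~ \sqrt{2\,} ~, is an arbitrary example.

```python
from itertools import islice

def steff(f, x):
    """Minimal Steffensen generator (same scheme as the implementation above)."""
    while True:
        fx = f(x)
        if fx == 0:                  # Exactly at a root: stop iterating.
            break
        gx = f(x + fx) / fx - 1      # First-order divided difference at x.
        if gx == 0:
            break
        x = x - fx / gx              # Update to the next iterate.
        yield x

# Take at most 10 iterates toward the root of x^2 - 2 (i.e. sqrt(2)).
iterates = list(islice(steff(lambda x: x * x - 2, 1.5), 10))
root = iterates[-1]
```

Because the generator runs forever when it keeps improving, `islice` (or an explicit tolerance check in the consuming loop) is what bounds the work.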


Generalization

Steffensen's method can also be used to find an input ~ x = x_\star ~ for a different kind of function ~ F ~ that produces output the same as its input: ~ x_\star = F(x_\star) ~ for the special value ~ x_\star ~. Solutions like ~ x_\star ~ are called ''fixed points''. Many of these functions can be used to find their own solutions by repeatedly recycling the result back as input, but the rate of convergence can be slow, or the function can fail to converge at all, depending on the individual function. Steffensen's method accelerates this convergence, to make it quadratic.

For orientation, the root function ~ f ~ and the fixed-point function ~ F ~ are simply related by ~ F(x) = x + \varepsilon \, f(x) ~, where ~ \varepsilon ~ is some scalar constant small enough in magnitude to make ~ F ~ stable under iteration, but large enough for the non-linearity of the function ~ f ~ to be appreciable. (All issues of a more general Banach space vs. basic real numbers are momentarily ignored for the sake of the comparison.)

This method for finding fixed points of a real-valued function has been generalised for functions ~ F : X \to X ~ that map a Banach space ~ X ~ onto itself, or even more generally functions ~ F : X \to Y ~ that map from one Banach space ~ X ~ into another Banach space ~ Y ~. The generalized method assumes that a family of bounded linear operators ~ \{\, L(u, v) : u, v \in X \,\} ~ associated with ~ u ~ and ~ v ~ can be devised that (locally) satisfies the condition

: F\left( u \right) - F\left( v \right) = L\left( u, v \right) \, \bigl[\, u - v \,\bigr] ~. \qquad (*)

''If division is possible'' in the Banach space, the linear operator ~ L ~ can be obtained from

: L\left( u, v \right) = \bigl[\, F\left( u \right) - F\left( v \right) \,\bigr] \, \bigl[\, u - v \,\bigr]^{-1} ~,

which may provide some insight. (The quotient form is shown for orientation; it is ''not'' required ''per se''.) Expressed in this way, the linear operator ~ L ~ can be more easily seen to be an elaborate version of the divided difference ~ g ~ discussed in the first section, above. Note that the division is not necessary; the only requirement is that the operator ~ L ~ satisfy the equation marked (*).

For the basic real-number function ~ f ~ given in the first section, the function simply takes in and puts out real numbers, and the function ~ g ~ is a ''divided difference''. In the generalized form here, the operator ~ L ~ is the analogue of a divided difference for use in the Banach space. The operator ~ L ~ is roughly equivalent to a matrix whose entries are all functions of vector arguments ~ u ~ and ~ v ~.

Steffensen's method is then very similar to Newton's method, except that it uses the divided difference ~ L \bigl( \, F\left( x \right), \, x \,\bigr) ~ instead of the derivative ~ F'(x) ~. Note that for arguments ~ x ~ close to some fixed point ~ x_\star ~, fixed-point functions ~ F ~ and their linear operators ~ L ~ meeting the marked condition satisfy ~ F'(x) \approx L \bigl( \, F\left( x \right), \, x \,\bigr) \approx I ~, where ~ I ~ is the identity operator.

The generalized iteration formula is

: x_{n+1} = x_n + \Bigl[\, I - L\bigl( F\left( x_n \right), x_n \bigr) \,\Bigr]^{-1} \, \Bigl[\, F\left( x_n \right) - x_n \,\Bigr] ~,

for ~ n = 1, \, 2, \, 3, \dots ~. If the operator ~ L ~ satisfies

: \Bigl\| \, L \left( u, v \right) - L \left( x, y \right) \, \Bigr\| \le k \, \biggl( \Bigl\| u - x \Bigr\| + \Bigl\| v - y \Bigr\| \biggr) ~

for some constant ~ k ~, then the method converges quadratically to a fixed point of ~ F ~ if the initial approximation ~ x_0 ~ is 'sufficiently close' to the desired solution ~ x_\star ~ that satisfies ~ x_\star = F(x_\star) ~.

