In floating-point arithmetic, the Sterbenz lemma or Sterbenz's lemma is a theorem giving conditions under which floating-point differences are computed exactly. It is named after Pat H. Sterbenz, who published a variant of it in 1974. The Sterbenz lemma applies to IEEE 754, the most widely used floating-point number system in computers.
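The exactness claim can be checked empirically. The following is a minimal sketch in Python, whose `float` is IEEE 754 binary64 on common platforms. It assumes the hypothesis used in the proof, namely that x lies between y/2 and 2y for positive x and y. The helper names `in_sterbenz_range` and `difference_is_exact` are hypothetical, introduced only for this illustration; exact rational arithmetic from the standard `fractions` module serves as the reference.

```python
from fractions import Fraction


def in_sterbenz_range(x: float, y: float) -> bool:
    """True if y/2 <= x <= 2*y, evaluated exactly with rationals."""
    fx, fy = Fraction(x), Fraction(y)
    return fy / 2 <= fx <= 2 * fy


def difference_is_exact(x: float, y: float) -> bool:
    """True if the floating-point result x - y equals the real difference."""
    return Fraction(x - y) == Fraction(x) - Fraction(y)


# Inside the range: the subtraction commits no rounding error.
print(in_sterbenz_range(0.3, 0.2), difference_is_exact(0.3, 0.2))
# prints: True True

# Outside the range: 1 - 2**-60 is not representable and rounds to 1.0.
print(in_sterbenz_range(1.0, 2.0**-60), difference_is_exact(1.0, 2.0**-60))
# prints: False False
```

Note that 0.3 and 0.2 are themselves rounded when the literals are parsed; the check only concerns the subtraction of the two stored values.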

Proof

Let

\beta

be the radix of the floating-point system and

p

the precision. Consider several easy cases first: * If

x

is zero then

x - y = -y

, and if

y

is zero then

x - y = x

, so the result is trivial because floating-point negation is always exact. * If

x = y

the result is zero and thus exact. * If

x < 0

then we must also have

y/2 \leq x < 0

, so

y < 0

. In this case,

x - y = -(-x - -y)

, so the result follows from the theorem restricted to

x, y \geq 0

. * If

x \leq y

, we can write

x - y = -(y - x)

with

x/2 \leq y \leq 2 x

, so the result follows from the theorem restricted to

x \geq y

. For the rest of the proof, assume

0 < y < x \leq 2 y

without loss of generality. Write

x, y > 0

in terms of their positive integral significands

s_x, s_y \leq \beta^p - 1

and minimal exponents

e_x, e_y

\begin{align}
  x &= s_x \cdot \beta^{e_x - p + 1} \\
  y &= s_y \cdot \beta^{e_y - p + 1}
\end{align}

Note that

x

and

y

may be subnormal—we do not assume

s_x, s_y \geq \beta^{p - 1}

. The subtraction gives:

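\begin{align}
  x - y
  &= s_x \cdot \beta^{e_x - p + 1}
     - s_y \cdot \beta^{e_y - p + 1} \\
  &= s_x \beta^{e_x - e_y} \cdot \beta^{e_y - p + 1}
     - s_y \cdot \beta^{e_y - p + 1} \\
  &= (s_x \beta^{e_x - e_y} - s_y) \cdot \beta^{e_y - p + 1}.
\end{align}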

Let

s' = s_x \beta^{e_x - e_y} - s_y

. Since

0 < y < x

we have: *

e_y \leq e_x

, so

e_x - e_y \geq 0

, from which we can conclude

\beta^{e_x - e_y}

is an integer and therefore so is

s' = s_x \beta^{e_x - e_y} - s_y

; and *

x - y > 0

, so

s' > 0

. Further, since

x \leq 2 y

, we have

x - y \leq y

, so that

s' \cdot \beta^{e_y - p + 1} = x - y \leq y = s_y \cdot \beta^{e_y - p + 1}

which implies that

0 < s' \leq s_y \leq \beta^p - 1.

Hence

x - y = s' \cdot \beta^{e_y - p + 1},
  \quad \text{for} \quad
  0 < s' \leq \beta^p - 1,

so

x - y

is a floating-point number. Note: Even if

x

and

y

are normal, ''i.e.'',

s_x, s_y \geq \beta^{p - 1}

, we cannot prove that

s' \geq \beta^{p - 1}

and therefore cannot prove that

x - y

is also normal. For example, with

e_{\min}

denoting the minimum exponent of normal numbers, the difference of the two smallest positive normal floating-point numbers

x = (\beta^{p - 1} + 1) \cdot \beta^{e_{\min} - p + 1}

and

y = \beta^{p - 1} \cdot \beta^{e_{\min} - p + 1}

is

x - y = 1 \cdot \beta^{e_{\min} - p + 1}

which is necessarily subnormal. In floating-point number systems without subnormal numbers, such as CPUs in nonstandard flush-to-zero mode instead of the standard gradual underflow, the Sterbenz lemma does not apply.
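The quantities used in the proof can be computed concretely for binary64, where the radix is 2 and the precision is 53. The sketch below assumes Python 3.9 or later (for `math.nextafter`) and gradual underflow. The names `decompose`, `proof_quantities`, `P` and `Q_MIN` are hypothetical, and `q` stands for the scaled exponent e - p + 1 of the proof. It rebuilds the integer s' from two inputs, checks the bound derived above, and then reproduces the example of the two smallest positive normal numbers.

```python
import math
import sys
from fractions import Fraction

P = 53          # precision of binary64 in bits
Q_MIN = -1074   # e_min - p + 1 for binary64, with e_min = -1022


def decompose(x: float) -> tuple[int, int]:
    """Return (s, q) with x == s * 2**q and 0 < s <= 2**P - 1.

    q is the smallest admissible scaled exponent; x must be positive and finite.
    """
    _, e = math.frexp(x)       # x == m * 2**e with 0.5 <= m < 1
    q = max(e - P, Q_MIN)      # clamp so that subnormals get a small significand
    s = Fraction(x) / Fraction(2) ** q
    assert s.denominator == 1  # s is an integer by construction
    return int(s), q


def proof_quantities(x: float, y: float) -> tuple[int, int]:
    """For 0 < y < x <= 2*y, return (s_prime, q_y) with x - y == s_prime * 2**q_y."""
    s_x, q_x = decompose(x)
    s_y, q_y = decompose(y)
    s_prime = s_x * 2 ** (q_x - q_y) - s_y     # an integer because q_x >= q_y
    assert 0 < s_prime <= s_y <= 2 ** P - 1    # the bound derived in the proof
    assert x - y == math.ldexp(s_prime, q_y)   # the computed difference is exact
    return s_prime, q_y


y = sys.float_info.min            # smallest positive normal number, 2**-1022
x = math.nextafter(y, math.inf)   # the next representable number above it
s_prime, q_y = proof_quantities(x, y)
print(s_prime, q_y)               # prints: 1 -1074
```

Here s' is 1, far below 2**(P - 1), so the exact difference is the smallest positive subnormal number. On hardware running in flush-to-zero mode the same subtraction would not return this value.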

Relation to catastrophic cancellation

The Sterbenz lemma may be contrasted with the phenomenon of catastrophic cancellation: * The Sterbenz lemma asserts that if

x

and

y

are sufficiently close floating-point numbers then their difference

x - y

is computed exactly by floating-point arithmetic

x \ominus y = \operatorname{fl}(x - y)

, with no rounding needed. * The phenomenon of catastrophic cancellation is that if

\tilde x

and

\tilde y

are approximations to true numbers

x

and

y

(whether the approximations arise from prior rounding error, series truncation, physical uncertainty, or anything else), then the relative error of the difference

\tilde x - \tilde y

from the desired difference

x - y

is inversely proportional to

x - y

. Thus, the closer

x

and

y

are, the worse

\tilde x - \tilde y

may be as an approximation to

x - y

, even if the subtraction itself is computed exactly. In other words, the Sterbenz lemma shows that subtracting nearby floating-point numbers is exact, but if the numbers you have are approximations then even their exact difference may be far off from the difference of numbers you wanted to subtract.
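The contrast can be made concrete with a small experiment. This is a sketch under stated assumptions: the "true" values `true_x` and `true_y` are hypothetical, chosen so that `true_x` is not a binary64 number, and the approximations are simply their nearest binary64 neighbours. All errors are measured exactly with the standard `fractions` module.

```python
from fractions import Fraction

# Hypothetical true quantities; true_x is not representable in binary64.
true_x = 1 + Fraction(1, 2**53) + Fraction(1, 2**60)
true_y = Fraction(1)

# Approximations: float() of a Fraction rounds to the nearest binary64 number.
approx_x = float(true_x)   # rounds up to 1 + 2**-52
approx_y = float(true_y)   # exactly 1.0

computed = approx_x - approx_y

# The subtraction itself commits no error, as the Sterbenz lemma predicts.
subtraction_is_exact = Fraction(computed) == Fraction(approx_x) - Fraction(approx_y)

# Relative errors of the input and of the result, computed exactly.
input_rel_err = abs(Fraction(approx_x) - true_x) / true_x
output_rel_err = abs(Fraction(computed) - (true_x - true_y)) / (true_x - true_y)

print(subtraction_is_exact)    # prints: True
print(float(input_rel_err))    # below 2**-53, about 1.1e-16
print(output_rel_err)          # prints: 127/129
```

The input is accurate to better than one part in 2**53, yet the result is off by about 98% of the desired difference, even though the subtraction introduced no error of its own.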

Use in numerical analysis

The Sterbenz lemma is instrumental in proving theorems on error bounds in numerical analysis of floating-point algorithms. For example, Heron's formula

A = \sqrt{s (s - a) (s - b) (s - c)}

for the area of a triangle with side lengths

a

,

b

, and

c

, where

s = (a + b + c)/2

is the semi-perimeter, may give poor accuracy for long narrow triangles if evaluated directly in floating-point arithmetic. However, for

a \geq b \geq c

, the alternative formula

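A = \frac{1}{4} \sqrt{\bigl(a + (b + c)\bigr) \bigl(c - (a - b)\bigr) \bigl(c + (a - b)\bigr) \bigl(a + (b - c)\bigr)}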

can be proven, with the help of the Sterbenz lemma, to have low forward error for all inputs.
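The difference between the two evaluations shows up on a needle-like triangle. The sketch below is illustrative only: the function names `heron_naive` and `heron_stable` are hypothetical, and the rearranged product is the commonly cited form attributed to Kahan, assumed here to be the alternative formula meant above. The parentheses must be evaluated exactly as written. For a valid triangle with sides sorted in decreasing order, a is at most b + c and hence at most 2b, so the inner difference a - b falls within the range covered by the Sterbenz lemma.

```python
import math


def heron_naive(a: float, b: float, c: float) -> float:
    """Heron's formula evaluated directly from the semi-perimeter."""
    s = (a + b + c) / 2
    return math.sqrt(s * (s - a) * (s - b) * (s - c))


def heron_stable(a: float, b: float, c: float) -> float:
    """Rearranged formula; requires the sides sorted so that a >= b >= c."""
    a, b, c = sorted((a, b, c), reverse=True)
    if c - (a - b) < 0:
        raise ValueError("side lengths violate the triangle inequality")
    product = (a + (b + c)) * (c - (a - b)) * (c + (a - b)) * (a + (b - c))
    return 0.25 * math.sqrt(product)


# Needle-like triangle: two sides of length 1 and a very short third side.
a, b, c = 1.0, 1.0, 2.0**-53

naive = heron_naive(a, b, c)     # s rounds to 1.0, so s - a is 0 and the area collapses
stable = heron_stable(a, b, c)   # agrees with the true area, roughly c / 2

print(naive)                     # prints: 0.0
print(stable == 2.0**-54)        # prints: True
```

The true area is c/2 times the square root of 1 - c**2/4, which rounds to 2**-54 in binary64, so the rearranged formula returns the correctly rounded result here while the direct evaluation loses every significant digit.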
