Octuple-precision floating-point format

In computing, octuple precision is a binary floating-point-based computer number format that occupies 32 bytes (256 bits) in computer memory. This 256-bit octuple precision is for applications requiring results in higher than quadruple precision. This format is rarely (if ever) used and very few environments support it.


IEEE 754 octuple-precision binary floating-point format: binary256

In its 2008 revision, the IEEE 754 standard specifies a binary256 format among the ''interchange formats'' (it is not a basic format), as having:
* Sign bit: 1 bit
* Exponent width: 19 bits
* Significand precision: 237 bits (236 explicitly stored)

The format is written with an implicit lead bit with value 1 unless the exponent is all zeros. Thus only 236 bits of the significand appear in the memory format, but the total precision is 237 bits (approximately 71 decimal digits: log10(2^237) ≈ 71.34). The bits are laid out as a sign bit, followed by the 19 exponent bits and the 236 explicitly stored significand bits.
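
As a rough illustration (not part of the standard's text), the following Python sketch derives the figures quoted above from the three field widths; the variable names are arbitrary.

import math

# binary256 field widths as specified above
SIGN_BITS = 1
EXPONENT_BITS = 19
STORED_SIGNIFICAND_BITS = 236              # explicitly stored
PRECISION = STORED_SIGNIFICAND_BITS + 1    # 237, counting the implicit leading bit

total_bits = SIGN_BITS + EXPONENT_BITS + STORED_SIGNIFICAND_BITS
exponent_bias = 2 ** (EXPONENT_BITS - 1) - 1   # 262143
decimal_digits = PRECISION * math.log10(2)     # log10(2**237)

print(total_bits)                  # 256
print(exponent_bias)               # 262143
print(round(decimal_digits, 2))    # 71.34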


Exponent encoding

The octuple-precision binary floating-point exponent is encoded using an offset binary representation, with the zero offset being 262143; this offset is also known as the exponent bias in the IEEE 754 standard.
* Emin = −262142
* Emax = 262143
* Exponent bias = 3FFFF₁₆ = 262143

Thus, as defined by the offset binary representation, in order to get the true exponent the offset of 262143 has to be subtracted from the stored exponent. The stored exponents 00000₁₆ and 7FFFF₁₆ are interpreted specially.

The minimum strictly positive (subnormal) value is 2^−262378 ≈ 2.2480 × 10^−78984 and has a precision of only one bit. The minimum positive normal value is 2^−262142 ≈ 2.4824 × 10^−78913. The maximum representable value is 2^262144 − 2^261907 ≈ 1.6113 × 10^78913.
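
To make the bias arithmetic concrete, here is a small Python sketch (the helper name classify_exponent is ours, written for illustration only) that interprets a 19-bit stored exponent field as described above.

BIAS = 0x3FFFF           # 262143, the exponent bias
EXP_ALL_ONES = 0x7FFFF   # reserved for infinities and NaNs
EXP_ALL_ZEROS = 0x00000  # reserved for zeros and subnormal numbers

def classify_exponent(stored):
    """Interpret a 19-bit stored exponent field of a binary256 value."""
    if stored == EXP_ALL_ZEROS:
        return "zero or subnormal (true exponent fixed at Emin = -262142)"
    if stored == EXP_ALL_ONES:
        return "infinity or NaN"
    return "normal number, true exponent = %d" % (stored - BIAS)

print(classify_exponent(0x3FFFF))  # true exponent 0, e.g. the value 1.0
print(classify_exponent(0x00001))  # true exponent -262142 (Emin)
print(classify_exponent(0x7FFFE))  # true exponent 262143 (Emax)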


Octuple-precision examples

These examples are given in bit ''representation'', in hexadecimal, of the floating-point value. This includes the sign, (biased) exponent, and significand.

0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000₁₆ = +0
8000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000₁₆ = −0
7fff f000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000₁₆ = +infinity
ffff f000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000₁₆ = −infinity
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0001₁₆ = 2^−262142 × 2^−236 = 2^−262378 ≈ 2.24800708647703657297018614776265182597360918266100276294348974547709294462 × 10^−78984 (smallest positive subnormal number)
0000 0fff ffff ffff ffff ffff ffff ffff ffff ffff ffff ffff ffff ffff ffff ffff₁₆ = 2^−262142 × (1 − 2^−236) ≈ 2.4824279514643497882993282229138717236776877060796468692709532979137875392 × 10^−78913 (largest subnormal number)
0000 1000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000₁₆ = 2^−262142 ≈ 2.48242795146434978829932822291387172367768770607964686927095329791378756168 × 10^−78913 (smallest positive normal number)
7fff efff ffff ffff ffff ffff ffff ffff ffff ffff ffff ffff ffff ffff ffff ffff₁₆ = 2^262143 × (2 − 2^−236) ≈ 1.61132571748576047361957211845200501064402387454966951747637125049607182699 × 10^78913 (largest normal number)
3fff efff ffff ffff ffff ffff ffff ffff ffff ffff ffff ffff ffff ffff ffff ffff₁₆ = 1 − 2^−237 ≈ 0.999999999999999999999999999999999999999999999999999999999999999999999995472 (largest number less than one)
3fff f000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000₁₆ = 1 (one)
3fff f000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0001₁₆ = 1 + 2^−236 ≈ 1.00000000000000000000000000000000000000000000000000000000000000000000000906 (smallest number larger than one)

By default, 1/3 rounds down like double precision, because of the odd number of bits in the significand. So the bits beyond the rounding point are 0101..., which is less than 1/2 of a unit in the last place (ulp).
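
The encodings above can be checked mechanically. The following Python sketch (decode_binary256 is a hypothetical helper written for illustration, covering finite values only) decodes a 256-bit pattern exactly using rational arithmetic.

from fractions import Fraction

def decode_binary256(hex_string):
    """Decode a 64-hex-digit binary256 pattern into an exact Fraction.
    Finite values only; infinities and NaNs are not handled."""
    bits = int(hex_string.replace(" ", ""), 16)
    sign = bits >> 255                       # 1 sign bit
    stored_exp = (bits >> 236) & 0x7FFFF     # 19 exponent bits
    frac = bits & ((1 << 236) - 1)           # 236 stored significand bits
    if stored_exp == 0:                      # zero or subnormal: no implicit 1
        value = Fraction(frac, 1 << 236) * Fraction(2) ** -262142
    else:                                    # normal: implicit leading 1
        value = (1 + Fraction(frac, 1 << 236)) * Fraction(2) ** (stored_exp - 262143)
    return -value if sign else value

one = "3fff f000" + " 0000" * 14
assert decode_binary256(one) == 1                          # the encoding of 1
tiny = "0000" + " 0000" * 14 + " 0001"
assert decode_binary256(tiny) == Fraction(1, 2 ** 262378)  # smallest subnormal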


Implementations

Octuple precision is rarely implemented, since the need for it is extremely rare. Apple Inc. had an implementation of addition, subtraction, and multiplication of octuple-precision numbers with a 224-bit two's complement significand and a 32-bit exponent. One can use general arbitrary-precision arithmetic libraries to obtain octuple (or higher) precision, but specialized octuple-precision implementations may achieve higher performance.
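
As one illustration of the library route, the third-party Python package mpmath can be set to a 237-bit working precision; this matches the binary256 significand but not its 19-bit exponent range or its subnormal behaviour, so it is only a partial emulation.

from mpmath import mp, mpf   # third-party package; assumed to be installed

mp.prec = 237                # 237-bit significand, as in binary256

third = mpf(1) / 3
print(mp.dps)                # roughly 71 decimal digits of working precision
print(third)                 # 1/3 to that precision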


Hardware support

There is no known hardware implementation of octuple precision.


See also

* IEEE 754
* ISO/IEC 10967, Language-independent arithmetic
* Primitive data type
* Scientific notation

