In computing, half precision (sometimes called FP16) is a binary floating-point computer number format that occupies 16 bits (two bytes in modern computers) in computer memory. It is intended for storage of floating-point values in applications where higher precision is not essential, in particular image processing and neural networks.

Almost all modern uses follow the IEEE 754-2008 standard, where the 16-bit base-2 format is referred to as binary16, and the exponent uses 5 bits. This can express values in the range ±65,504, with the smallest value above 1 being 1 + 1/1024. Depending on the computer, half precision can be over an order of magnitude faster than double precision, e.g. 550 PFLOPS for half precision vs 37 PFLOPS for double precision on one cloud provider.


History

Several earlier 16-bit floating-point formats have existed, including that of Hitachi's HD61810 DSP of 1982, Scott's WIF, and the 3dfx Voodoo Graphics processor. ILM (Industrial Light & Magic) was searching for an image format that could handle a wide dynamic range, but without the hard drive and memory cost of single- or double-precision floating point. The hardware-accelerated programmable shading group led by John Airey at SGI (Silicon Graphics) invented the s10e5 data type in 1997 as part of the 'bali' design effort. This is described in a SIGGRAPH 2000 paper (see section 4.3) and further documented in US patent 7518615. It was popularized by its use in the open-source OpenEXR image format. Nvidia and Microsoft defined the half datatype in the Cg language, released in early 2002, and implemented it in silicon in the GeForce FX, released in late 2002. Since then, support for 16-bit floating-point math in graphics cards has become very common. The F16C extension, introduced in 2012, allows x86 processors to convert half-precision floats to and from single-precision floats with a machine instruction.


IEEE 754 half-precision binary floating-point format: binary16

The IEEE 754 standard specifies a binary16 as having the following format:
* Sign bit: 1 bit
* Exponent width: 5 bits
* Significand precision: 11 bits (10 explicitly stored)

The format is laid out as follows, from the most significant bit to the least: the sign bit (bit 15), the exponent field (bits 14 to 10), and the stored significand bits (bits 9 to 0).

The format is assumed to have an implicit lead bit with value 1 unless the exponent field is stored with all zeros. Thus only 10 bits of the significand appear in the memory format, but the total precision is 11 bits. In IEEE 754 parlance, there are 10 bits of significand, but there are 11 bits of significand precision ($\log_{10}(2^{11}) \approx 3.311$ decimal digits, or 4 digits ± slightly less than 5 units in the last place).
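The field layout can be sketched in a few lines of Python. This is only an illustration of the bit widths listed above; the function names `split_binary16` and `significand_bits` are hypothetical and not part of any standard or library.

```python
def split_binary16(bits: int) -> tuple:
    """Split a 16-bit pattern into (sign, exponent field, fraction field)."""
    if not 0 <= bits <= 0xFFFF:
        raise ValueError("expected a 16-bit unsigned integer")
    sign = bits >> 15                # 1 bit
    exponent = (bits >> 10) & 0x1F   # 5 bits, still in stored (biased) form
    fraction = bits & 0x3FF          # the 10 explicitly stored significand bits
    return sign, exponent, fraction


def significand_bits(exponent: int, fraction: int) -> int:
    """Return the 11-bit significand as an integer.

    The implicit leading 1 is added unless the exponent field is all zeros.
    (An all-ones exponent field is a special case and is not handled here.)
    """
    implicit = 0 if exponent == 0 else 1
    return (implicit << 10) | fraction


print(split_binary16(0x3C00))           # (0, 15, 0)
print(bin(significand_bits(15, 0)))     # 0b10000000000
```

The second call shows the 11th bit that never appears in memory: the stored fraction is zero, but the effective significand has its leading bit set.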


Exponent encoding

The half-precision binary floating-point exponent is encoded using an offset-binary representation, with the zero offset being 15; this offset is known as the exponent bias in the IEEE 754 standard.
* $E_{\min} = 00001_2 - 01111_2 = -14$
* $E_{\max} = 11110_2 - 01111_2 = 15$
* Exponent bias $= 01111_2 = 15$

Thus, as defined by the offset-binary representation, the true exponent is obtained by subtracting the offset of 15 from the stored exponent.

The stored exponents $00000_2$ and $11111_2$ are interpreted specially: $00000_2$ encodes zero and the subnormal numbers, and $11111_2$ encodes infinities and NaNs.

The minimum strictly positive (subnormal) value is $2^{-24} \approx 5.96 \times 10^{-8}$. The minimum positive normal value is $2^{-14} \approx 6.10 \times 10^{-5}$. The maximum representable value is $(2 - 2^{-10}) \times 2^{15} = 65504$.
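As a rough sketch of how the bias is applied, the following Python function decodes a binary16 bit pattern by hand. The name `decode_binary16` is hypothetical; the treatment of the all-zeros and all-ones exponent fields follows the usual IEEE 754 conventions (subnormals, and infinities and NaNs, respectively).

```python
import math

BIAS = 15  # exponent bias for binary16


def decode_binary16(bits: int) -> float:
    """Decode a 16-bit pattern as an IEEE 754 binary16 value."""
    sign = -1.0 if bits >> 15 else 1.0
    exponent = (bits >> 10) & 0x1F
    fraction = bits & 0x3FF
    if exponent == 0x1F:
        # All-ones exponent: infinity if the fraction is zero, NaN otherwise.
        return sign * math.inf if fraction == 0 else math.nan
    if exponent == 0:
        # All-zeros exponent: zero or subnormal, no implicit leading 1,
        # and the exponent is fixed at Emin = -14.
        return sign * (fraction / 1024) * 2.0 ** -14
    # Normal number: implicit leading 1, true exponent = stored - bias.
    return sign * (1 + fraction / 1024) * 2.0 ** (exponent - BIAS)


print(decode_binary16(0x0001) == 2.0 ** -24)  # True (smallest subnormal)
print(decode_binary16(0x0400) == 2.0 ** -14)  # True (smallest normal)
print(decode_binary16(0x7BFF))                # 65504.0 (largest finite value)
```

The three patterns reproduce the limits quoted above: the smallest subnormal, the smallest normal, and the largest finite value.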


Half precision examples

These examples are given in bit representation of the floating-point value. This includes the sign bit, (biased) exponent, and significand. By default, 1/3 rounds down, as it does for double precision, because of the odd number of bits in the significand. The bits beyond the rounding point are ... which is less than 1/2 of a unit in the last place.
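A few bit representations can be produced with Python's standard `struct` module, whose `e` format code packs a float as binary16. The helper name `half_bits` and the choice of sample values are assumptions made for illustration.

```python
import struct


def half_bits(x: float) -> str:
    """Return the binary16 encoding of x as 'sign exponent fraction' bit groups."""
    (bits,) = struct.unpack("<H", struct.pack("<e", x))
    s = f"{bits:016b}"
    return f"{s[0]} {s[1:6]} {s[6:]}"


print(half_bits(1.0))      # 0 01111 0000000000
print(half_bits(-2.0))     # 1 10000 0000000000
print(half_bits(65504.0))  # 0 11110 1111111111
print(half_bits(1 / 3))    # 0 01101 0101010101
```

Decoding the last pattern gives a value slightly below 1/3, consistent with the statement that 1/3 rounds down.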


Precision limitations

65519 is the largest integer that rounds to a finite value (65504); 65520 and larger round to infinity. This holds for round-to-nearest-even; other rounding modes change the cutoff.
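The cutoff can be checked with a small Python sketch. One caveat: Python's `struct` module raises `OverflowError` for values that would round past 65504 rather than returning an infinity, so the hypothetical helper `round_to_half` below maps that error to the infinity that round-to-nearest-even would produce.

```python
import math
import struct


def round_to_half(x: float) -> float:
    """Round x to the nearest binary16 value (ties to even), returned as a float."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses values that round beyond the largest finite binary16;
        # IEEE 754 round-to-nearest-even yields an infinity in that case.
        return math.copysign(math.inf, x)


print(round_to_half(65519.0))  # 65504.0
print(round_to_half(65520.0))  # inf
print(round_to_half(2049.0))   # 2048.0 (a tie, resolved to the even significand)
```

The last line illustrates a second limitation: above 2048, consecutive integers are no longer all representable.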


ARM alternative half-precision

ARM processors support (via a floating-point control register bit) an "alternative half-precision" format, which does away with the special case for an exponent value of 31 ($11111_2$). It is almost identical to the IEEE format, but there is no encoding for infinity or NaNs; instead, an exponent of 31 encodes normalized numbers in the range 65536 to 131008.
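A minimal sketch of decoding the alternative format follows. The function name `decode_arm_alternative` is hypothetical, and the sketch assumes that zero and subnormal encodings (exponent field 0) behave as in the IEEE format; only the exponent-31 case differs.

```python
def decode_arm_alternative(bits: int) -> float:
    """Decode a 16-bit pattern in the ARM alternative half-precision format."""
    sign = -1.0 if bits >> 15 else 1.0
    exponent = (bits >> 10) & 0x1F
    fraction = bits & 0x3FF
    if exponent == 0:
        # Assumed identical to IEEE binary16: zero or subnormal.
        return sign * (fraction / 1024) * 2.0 ** -14
    # Every nonzero exponent, including 31, encodes a normalized number.
    return sign * (1 + fraction / 1024) * 2.0 ** (exponent - 15)


print(decode_arm_alternative(0x7C00))  # 65536.0  (would be +infinity in IEEE binary16)
print(decode_arm_alternative(0x7FFF))  # 131008.0 (would be a NaN in IEEE binary16)
```

The two outputs are the endpoints of the extra range gained by giving up infinities and NaNs.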


Uses of half precision

This format is used in several computer graphics environments to store pixels, including MATLAB, OpenEXR, JPEG XR, GIMP, OpenGL, Cg, Direct3D, and D3DX. The advantage over 8-bit or 16-bit integers is that the increased dynamic range allows more detail to be preserved in the highlights and shadows of images, and the linear representation of intensity makes calculations easier. The advantage over 32-bit single-precision floating point is that it requires half the storage and bandwidth (at the expense of precision and range).

Hardware and software for machine learning or neural networks tend to use half precision: such applications usually do a large amount of calculation, but don't require a high level of precision.

If the hardware has instructions to compute half-precision math, it is often faster than single or double precision. If the system has SIMD instructions that can handle multiple floating-point numbers within one instruction, half precision can be twice as fast by operating on twice as many numbers simultaneously. However, if there is no hardware support, math must be done by emulation, or by conversion to single or double precision and then back, and is therefore slower.
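The storage trade-off can be illustrated with Python's standard `struct` module. The sample data below is arbitrary and chosen only for illustration; the sketch shows the size halving and the accompanying loss of precision, not any speed difference.

```python
import struct

# Arbitrary sample data: 1000 values between 0 and 99.9.
values = [0.1 * i for i in range(1000)]

# Pack the same values as half precision ('e') and single precision ('f').
as_half = struct.pack(f"<{len(values)}e", *values)
as_single = struct.pack(f"<{len(values)}f", *values)

print(len(as_half), len(as_single))  # 2000 4000

# Converting back shows the price paid in precision.
restored = struct.unpack(f"<{len(values)}e", as_half)
worst = max(abs(a - b) for a, b in zip(values, restored))
print(worst)  # worst-case rounding error over the sample
```

For values in the range 64 to 128, adjacent half-precision numbers are 1/16 apart, so the worst-case error in this sample cannot exceed 1/32.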


Hardware support

Several versions of the ARM architecture have support for half precision. Support for half precision in the x86 instruction set is specified in the AVX-512_FP16 instruction set extension, implemented in the Intel Sapphire Rapids processor.


See also

* bfloat16 floating-point format: Alternative 16-bit floating-point format with 8 bits of exponent and 7 bits of mantissa
* Minifloat: small floating-point formats
* IEEE 754: IEEE standard for floating-point arithmetic (IEEE 754)
* ISO/IEC 10967, Language Independent Arithmetic
* Primitive data type
* RGBE image format
* Power Management Bus § Linear11 Floating Point Format


Further reading


Khronos Vulkan signed 16-bit floating point format


External links



* Survey of Floating-Point Formats
* OpenEXR site
* Half precision constants from D3DX
* OpenGL treatment of half precision
* Fast Half Float Conversions
* Analog Devices variant (four-bit exponent)
* C source code to convert between IEEE double, single, and half precision
* Java source code for half-precision floating-point conversion

