A floating-point unit (FPU), numeric processing unit (NPU), colloquially math coprocessor, is a part of a

computer A computer is a machine that can be Computer programming, programmed to automatically Execution (computing), carry out sequences of arithmetic or logical operations (''computation''). Modern digital electronic computers can perform generic set ...

system specially designed to carry out operations on

floating-point In computing, floating-point arithmetic (FP) is arithmetic on subsets of real numbers formed by a ''significand'' (a Sign (mathematics), signed sequence of a fixed number of digits in some Radix, base) multiplied by an integer power of that ba ...

numbers. Typical operations are

addition Addition (usually signified by the Plus and minus signs#Plus sign, plus symbol, +) is one of the four basic Operation (mathematics), operations of arithmetic, the other three being subtraction, multiplication, and Division (mathematics), divis ...

subtraction Subtraction (which is signified by the minus sign, –) is one of the four Arithmetic#Arithmetic operations, arithmetic operations along with addition, multiplication and Division (mathematics), division. Subtraction is an operation that repre ...

multiplication Multiplication is one of the four elementary mathematical operations of arithmetic, with the other ones being addition, subtraction, and division (mathematics), division. The result of a multiplication operation is called a ''Product (mathem ...

, division, and

square root In mathematics, a square root of a number is a number such that y^2 = x; in other words, a number whose ''square'' (the result of multiplying the number by itself, or y \cdot y) is . For example, 4 and −4 are square roots of 16 because 4 ...

. Modern designs generally include a fused multiply-add instruction, which was found to be very common in real-world code. Some FPUs can also perform various

transcendental function In mathematics, a transcendental function is an analytic function that does not satisfy a polynomial equation whose coefficients are functions of the independent variable that can be written using only the basic operations of addition, subtraction ...

s such as

exponential Exponential may refer to any of several mathematical topics related to exponentiation, including: * Exponential function, also: **Matrix exponential, the matrix analogue to the above *Exponential decay, decrease at a rate proportional to value * Ex ...

trigonometric Trigonometry () is a branch of mathematics concerned with relationships between angles and side lengths of triangles. In particular, the trigonometric functions relate the angles of a right triangle with ratios of its side lengths. The field ...

calculations, but the accuracy can be low, so some systems prefer to compute these functions in software. Floating-point operations were originally handled in

software Software consists of computer programs that instruct the Execution (computing), execution of a computer. Software also includes design documents and specifications. The history of software is closely tied to the development of digital comput ...

in early computers. Over time, manufacturers began to provide standardized floating-point libraries as part of their software collections. Some machines, those dedicated to scientific processing, would include specialized hardware to perform some of these tasks with much greater speed. The introduction of

microcode In processor design, microcode serves as an intermediary layer situated between the central processing unit (CPU) hardware and the programmer-visible instruction set architecture of a computer. It consists of a set of hardware-level instructions ...

in the 1960s allowed these instructions to be included in the system's

instruction set architecture In computer science, an instruction set architecture (ISA) is an abstract model that generally defines how software controls the CPU in a computer or a family of computers. A device or program that executes instructions described by that ISA, ...

(ISA). Normally these would be decoded by the microcode into a series of instructions that were similar to the libraries, but on those machines with an FPU, they would instead be routed to that unit, which would perform them much faster. This allowed floating-point instructions to become universal while the floating-point hardware remained optional; for instance, on the

PDP-11 The PDP–11 is a series of 16-bit minicomputers originally sold by Digital Equipment Corporation (DEC) from 1970 into the late 1990s, one of a set of products in the Programmed Data Processor (PDP) series. In total, around 600,000 PDP-11s of a ...

one could add the floating-point processor unit at any time using plug-in

expansion card In computing, an expansion card (also called an expansion board, adapter card, peripheral card or accessory card) is a printed circuit board that can be inserted into an electrical connector, or expansion slot (also referred to as a bus sl ...

s. The introduction of the

microprocessor A microprocessor is a computer processor (computing), processor for which the data processing logic and control is included on a single integrated circuit (IC), or a small number of ICs. The microprocessor contains the arithmetic, logic, a ...

in the 1970s led to a similar evolution as the earlier

mainframe A mainframe computer, informally called a mainframe or big iron, is a computer used primarily by large organizations for critical applications like bulk data processing for tasks such as censuses, industry and consumer statistics, enterpris ...

s and

minicomputer A minicomputer, or colloquially mini, is a type of general-purpose computer mostly developed from the mid-1960s, built significantly smaller and sold at a much lower price than mainframe computers . By 21st century-standards however, a mini is ...

s. Early

microcomputer A microcomputer is a small, relatively inexpensive computer having a central processing unit (CPU) made out of a microprocessor. The computer also includes memory and input/output (I/O) circuitry together mounted on a printed circuit board (P ...

systems performed floating point in software, typically in a vendor-specific library included in

ROM Rom, or ROM may refer to: Biomechanics and medicine * Risk of mortality, a medical classification to estimate the likelihood of death for a patient * Rupture of membranes, a term used during pregnancy to describe a rupture of the amniotic sac * ...

. Dedicated single-chip FPUs began to appear late in the decade, but they remained rare in real-world systems until the mid-1980s, and using them required software to be re-written to call them. As they became more common, the software libraries were modified to work like the microcode of earlier machines, performing the instructions on the main CPU if needed, but offloading them to the FPU if one was present. By the late 1980s,

semiconductor manufacturing Semiconductor device fabrication is the process used to manufacture semiconductor devices, typically integrated circuits (ICs) such as microprocessors, microcontrollers, and memories (such as Random-access memory, RAM and flash memory). It is a ...

had improved to the point where it became possible to include an FPU with the main CPU, resulting in designs like the

i486 The Intel 486, officially named i486 and also known as 80486, is a microprocessor introduced in 1989. It is a higher-performance follow-up to the i386, Intel 386. It represents the fourth generation of binary compatible CPUs following the Inte ...

and

68040 The Motorola 68040 ("''sixty-eight-oh-forty''") is a 32-bit microprocessor in the Motorola 68000 series, released in 1990. It is the successor to the 68030 and is followed by the 68060, skipping the 68050. In keeping with general Motorola ...

. These designs were known as an "integrated FPU"s, and from the mid-1990s, FPUs were a standard feature of most CPU designs except those designed as low-cost as

embedded processor An embedded system is a specialized computer system—a combination of a computer processor, computer memory, and input/output peripheral devices—that has a dedicated function within a larger mechanical or electronic system. It is em ...

s. In modern designs, a single CPU will typically include several

arithmetic logic unit In computing, an arithmetic logic unit (ALU) is a Combinational logic, combinational digital circuit that performs arithmetic and bitwise operations on integer binary numbers. This is in contrast to a floating-point unit (FPU), which operates on ...

s (ALUs) and several FPUs, reading many instructions at the same time and routing them to the various units for parallel execution. By the 2000s, even embedded processors generally included an FPU as well.

History

In 1954, the

IBM 704 The IBM 704 is the model name of a large digital computer, digital mainframe computer introduced by IBM in 1954. Designed by John Backus and Gene Amdahl, it was the first mass-produced computer with hardware for floating-point arithmetic. The I ...

had floating-point arithmetic as a standard feature, one of its major improvements over its predecessor the

IBM 701 The IBM 701 Electronic Data Processing Machine, known as the Defense Calculator while in development, was IBM’s first commercial scientific computer and its first series production mainframe computer, which was announced to the public on May 2 ...

. This was carried forward to its successors the 709, 7090, and 7094. In 1963, Digital announced the

PDP-6 The PDP-6, short for Programmed Data Processor model 6, is a computer developed by Digital Equipment Corporation (DEC) during 1963 and first delivered in the summer of 1964. It was an expansion of DEC's existing 18-bit systems to use a 36-bit da ...

, which had floating point as a standard feature. In 1963, the GE-235 featured an "Auxiliary Arithmetic Unit" for floating point and double-precision calculations. Historically, some systems implemented

floating point In computing, floating-point arithmetic (FP) is arithmetic on subsets of real numbers formed by a ''significand'' (a signed sequence of a fixed number of digits in some base) multiplied by an integer power of that base. Numbers of this form ...

with a

coprocessor A coprocessor is a computer processor used to supplement the functions of the primary processor (the CPU). Operations performed by the coprocessor may be floating-point arithmetic, graphics, signal processing, string processing, cryptography or ...

rather than as an integrated unit (but now in addition to the CPU, e.g.

GPUs A graphics processing unit (GPU) is a specialized electronic circuit designed for digital image processing and to accelerate computer graphics, being present either as a discrete video card or embedded on motherboards, mobile phones, personal ...

that are coprocessors not always built into the CPUhave FPUs as a rule, while first generations of GPUs did not). This could be a single

integrated circuit An integrated circuit (IC), also known as a microchip or simply chip, is a set of electronic circuits, consisting of various electronic components (such as transistors, resistors, and capacitors) and their interconnections. These components a ...

, an entire

circuit board A printed circuit board (PCB), also called printed wiring board (PWB), is a laminated sandwich structure of conductive and insulating layers, each with a pattern of traces, planes and other features (similar to wires on a flat surface) ...

or a cabinet. Where floating-point calculation hardware has not been provided, floating-point calculations are done in software, which takes more processor time, but avoids the cost of the extra hardware. For a particular computer architecture, the floating-point unit instructions may be

emulated In computing, an emulator is hardware or software that enables one computer system (called the ''host'') to behave like another computer system (called the ''guest''). An emulator typically enables the host system to run software or use perip ...

by a library of software functions; this may permit the same

object code In computing, object code or object module is the product of an assembler or compiler In computing, a compiler is a computer program that Translator (computing), translates computer code written in one programming language (the ''source'' ...

to run on systems with or without floating-point hardware. Emulation can be implemented on any of several levels: in the CPU as

, as an

operating system An operating system (OS) is system software that manages computer hardware and software resources, and provides common daemon (computing), services for computer programs. Time-sharing operating systems scheduler (computing), schedule tasks for ...

function, or in user-space code. When only integer functionality is available, the

CORDIC CORDIC, short for coordinate rotation digital computer, is a simple and efficient algorithm to calculate trigonometric functions, hyperbolic functions, square roots, multiplications, divisions, and exponentials and logarithms In ma ...

methods are most commonly used for

evaluation. In most modern computer architectures, there is some division of floating-point operations from

integer An integer is the number zero (0), a positive natural number (1, 2, 3, ...), or the negation of a positive natural number (−1, −2, −3, ...). The negations or additive inverses of the positive natural numbers are referred to as negative in ...

operations. This division varies significantly by architecture; some have dedicated floating-point registers, while some, like

Intel x86 x86 (also known as 80x86 or the 8086 family) is a family of complex instruction set computer (CISC) instruction set architectures initially developed by Intel, based on the 8086 microprocessor and its 8-bit-external-bus variant, the 8088. The ...

, go as far as independent clocking schemes. CORDIC routines have been implemented in Intel x87 coprocessors (

8087 The Intel 8087, announced in 1980, was the first floating-point coprocessor for the 8086 line of microprocessors. The purpose of the chip was to speed up floating-point arithmetic operations, such as addition, subtraction, multiplication, di ...

, 80287, 80387) up to the

80486 The Intel 486, officially named i486 and also known as 80486, is a microprocessor introduced in 1989. It is a higher-performance follow-up to the Intel 386. It represents the fourth generation of binary compatible CPUs following the 8086 of 19 ...

microprocessor series, as well as in the

Motorola 68881 The Motorola 68881 and Motorola 68882 are floating-point units (FPUs) used in some computer systems in conjunction with Motorola's 32-bit 68020 or 68030 microprocessors. These coprocessors are external chips, designed before floating point math ...

and 68882 for some kinds of floating-point instructions, mainly as a way to reduce the

gate A gate or gateway is a point of entry to or from a space enclosed by walls. The word is derived from Proto-Germanic language, Proto-Germanic ''*gatan'', meaning an opening or passageway. Synonyms include yett (which comes from the same root w ...

counts (and complexity) of the FPU subsystem. Floating-point operations are often pipelined. In earlier

superscalar A superscalar processor (or multiple-issue processor) is a CPU that implements a form of parallelism called instruction-level parallelism within a single processor. In contrast to a scalar processor, which can execute at most one single in ...

architectures without general

out-of-order execution In computer engineering, out-of-order execution (or more formally dynamic execution) is an instruction scheduling paradigm used in high-performance central processing units to make use of instruction cycles that would otherwise be wasted. In t ...

, floating-point operations were sometimes pipelined separately from integer operations. The modular architecture of Bulldozer microarchitecture uses a special FPU named FlexFPU, which uses

simultaneous multithreading Simultaneous multithreading (SMT) is a technique for improving the overall efficiency of superscalar CPUs with hardware multithreading. SMT permits multiple independent threads of execution to better use the resources provided by modern proces ...

. Each physical integer core, two per module, is single-threaded, in contrast with Intel's Hyperthreading, where two virtual simultaneous threads share the resources of a single physical core.

Floating-point library

Some floating-point hardware only supports the simplest operations: addition, subtraction, and multiplication. But even the most complex floating-point hardware has a finite number of operations it can supportfor example, no FPUs directly support

arbitrary-precision arithmetic In computer science, arbitrary-precision arithmetic, also called bignum arithmetic, multiple-precision arithmetic, or sometimes infinite-precision arithmetic, indicates that calculations are performed on numbers whose digits of precision are po ...

. When a CPU is executing a program that calls for a floating-point operation that is not directly supported by the hardware, the CPU uses a series of simpler floating-point operations. In systems without any floating-point hardware, the CPU emulates it using a series of simpler

fixed-point arithmetic In computing, fixed-point is a method of representing fractional (non-integer) numbers by storing a fixed number of digits of their fractional part. Dollar amounts, for example, are often stored with exactly two fractional digits, represen ...

operations that run on the integer

. The software that lists the necessary series of operations to emulate floating-point operations is often packaged in a floating-point

library A library is a collection of Book, books, and possibly other Document, materials and Media (communication), media, that is accessible for use by its members and members of allied institutions. Libraries provide physical (hard copies) or electron ...

Integrated FPUs

In some cases, FPUs may be specialized, and divided between simpler floating-point operations (mainly addition and multiplication) and more complicated operations, like division. In some cases, only the simple operations may be implemented in hardware or

, while the more complex operations are implemented as software. In some current architectures, the FPU functionality is combined with

SIMD Single instruction, multiple data (SIMD) is a type of parallel computer, parallel processing in Flynn's taxonomy. SIMD describes computers with multiple processing elements that perform the same operation on multiple data points simultaneousl ...

units to perform SIMD computation; an example of this is the augmentation of the x87 instructions set with SSE instruction set in the

x86-64 x86-64 (also known as x64, x86_64, AMD64, and Intel 64) is a 64-bit extension of the x86 instruction set architecture, instruction set. It was announced in 1999 and first available in the AMD Opteron family in 2003. It introduces two new ope ...

architecture used in newer Intel and AMD processors.

Add-on FPUs

Several models of the

, such as the PDP-11/45, PDP-11/34a, PDP-11/44, and PDP-11/70, supported an add-on floating-point unit to support floating-point instructions. The PDP-11/60, MicroPDP-11/23 and several

VAX VAX (an acronym for virtual address extension) is a series of computers featuring a 32-bit instruction set architecture (ISA) and virtual memory that was developed and sold by Digital Equipment Corporation (DEC) in the late 20th century. The V ...

models could execute floating-point instructions without an add-on FPU (the MicroPDP-11/23 required an add-on microcode option), and offered add-on accelerators to further speed the execution of those instructions. In the 1980s, it was common in

IBM PC The IBM Personal Computer (model 5150, commonly known as the IBM PC) is the first microcomputer released in the List of IBM Personal Computer models, IBM PC model line and the basis for the IBM PC compatible ''de facto'' standard. Released on ...

/compatible

microcomputers A microcomputer is a small, relatively inexpensive computer having a central processing unit (CPU) made out of a microprocessor. The computer also includes memory and input/output (I/O) circuitry together mounted on a printed circuit board (P ...

for the FPU to be entirely separate from the

CPU A central processing unit (CPU), also called a central processor, main processor, or just processor, is the primary processor in a given computer. Its electronic circuitry executes instructions of a computer program, such as arithmetic, log ...

, and typically sold as an optional add-on. It would only be purchased if needed to speed up or enable math-intensive programs. The IBM PC, XT, and most compatibles based on the 8088 or 8086 had a socket for the optional 8087 coprocessor. The AT and

80286 The Intel 80286 (also marketed as the iAPX 286 and often called Intel 286) is a 16-bit microprocessor that was introduced on February 1, 1982. It was the first 8086-based CPU with separate, non-multiplexed address and data buses and also the fi ...

-based systems were generally socketed for the 80287, and 80386/80386SX-based machinesfor the 80387 and 80387SX respectively, although early ones were socketed for the 80287, since the 80387 did not exist yet. Other companies manufactured co-processors for the Intel x86 series. These included

Cyrix Cyrix Corporation was a microprocessor developer that was founded in 1988 in Richardson, Texas, as a specialist supplier of floating point units for 286 and 386 microprocessors. The company was founded by Tom Brightman and Jerry Rogers. Ter ...

and

Weitek Weitek Corporation was an American Microprocessor, chip-design company that originally focused on floating-point units for a number of commercial Central processing unit, CPU designs. During the early to mid-1980s, Weitek designs could be found ...

Acorn Computers Acorn Computers Ltd. was a British computer company established in Cambridge, England in 1978 by Hermann Hauser, Christopher Curry (businessman), Chris Curry and Andy Hopper. The company produced a number of computers during the 1980s with asso ...

opted for the WE32206 to offer single,

double Double, The Double or Dubble may refer to: Mathematics and computing * Multiplication by 2 * Double precision, a floating-point representation of numbers that is typically 64 bits in length * A double number of the form x+yj, where j^2=+1 * A ...

and

extended precision Extended precision refers to floating-point number formats that provide greater precision than the basic floating-point formats. Extended-precision formats support a basic format by minimizing roundoff and overflow errors in intermediate value ...

to its

ARM In human anatomy, the arm refers to the upper limb in common usage, although academically the term specifically means the upper arm between the glenohumeral joint (shoulder joint) and the elbow joint. The distal part of the upper limb between ...

powered

Archimedes Archimedes of Syracuse ( ; ) was an Ancient Greece, Ancient Greek Greek mathematics, mathematician, physicist, engineer, astronomer, and Invention, inventor from the ancient city of Syracuse, Sicily, Syracuse in History of Greek and Hellenis ...

range, introducing a gate array to interface the ARM2 processor with the WE32206 to support the additional ARM floating-point instructions. Acorn later offered the FPA10 coprocessor, developed by ARM, for various machines fitted with the ARM3 processor. Coprocessors were available for the

Motorola 68000 family The Motorola 68000 series (also known as 680x0, m68000, m68k, or 68k) is a family of 32-bit complex instruction set computer (CISC) microprocessors. During the 1980s and early 1990s, they were popular in personal computers and workstations and w ...

, the 68881 and 68882. These were common in

Motorola 68020 The Motorola 68020 is a 32-bit microprocessor from Motorola, released in 1984. A lower-cost version was also made available, known as the 68EC020. In keeping with naming practices common to Motorola designs, the 68020 is usually referred to as t ...

/ 68030-based

workstation A workstation is a special computer designed for technical or computational science, scientific applications. Intended primarily to be used by a single user, they are commonly connected to a local area network and run multi-user operating syste ...

s, like the

Sun-3 Sun-3 is a series of UNIX computer workstations and servers produced by Sun Microsystems, launched on September 9, 1985. The Sun-3 series are VMEbus-based systems similar to some of the earlier Sun-2 series, but using the Motorola 68020 mic ...

series. They were also commonly added to higher-end models of Apple

Macintosh Mac is a brand of personal computers designed and marketed by Apple Inc., Apple since 1984. The name is short for Macintosh (its official name until 1999), a reference to the McIntosh (apple), McIntosh apple. The current product lineup inclu ...

and Commodore

Amiga Amiga is a family of personal computers produced by Commodore International, Commodore from 1985 until the company's bankruptcy in 1994, with production by others afterward. The original model is one of a number of mid-1980s computers with 16-b ...

series, but unlike IBM PC-compatible systems, sockets for adding the coprocessor were not as common in lower-end systems. There are also add-on FPU coprocessor units for

microcontroller A microcontroller (MC, uC, or μC) or microcontroller unit (MCU) is a small computer on a single integrated circuit. A microcontroller contains one or more CPUs (processor cores) along with memory and programmable input/output peripherals. Pro ...

units (MCUs/μCs)/

single-board computer A single-board computer (SBC) is a complete computer built on a single circuit board, with microprocessor(s), memory, input/output (I/O) and other features required of a functional computer. Single-board computers are commonly made as demonst ...

(SBCs), which serve to provide floating-point

arithmetic Arithmetic is an elementary branch of mathematics that deals with numerical operations like addition, subtraction, multiplication, and division. In a wider sense, it also includes exponentiation, extraction of roots, and taking logarithms. ...

capability. These add-on FPUs are host-processor-independent, possess their own programming requirements ( operations,

instruction set In computer science, an instruction set architecture (ISA) is an abstract model that generally defines how software controls the CPU in a computer or a family of computers. A device or program that executes instructions described by that ISA, s ...

s, etc.) and are often provided with their own

integrated development environment An integrated development environment (IDE) is a Application software, software application that provides comprehensive facilities for software development. An IDE normally consists of at least a source-code editor, build automation tools, an ...

s (IDEs).

History

Floating-point library

Integrated FPUs

Add-on FPUs

See also

References

Further reading