The Intel 8087, announced in 1980, was the first x87 floating-point coprocessor for the 8086 line of microprocessors. The purpose of the 8087 was to speed up computations for floating-point arithmetic, such as addition, subtraction, multiplication,

division Division or divider may refer to: Mathematics *Division (mathematics), the inverse of multiplication *Division algorithm, a method for computing the result of mathematical division Military *Division (military), a formation typically consisting ...

, and

square root In mathematics, a square root of a number is a number such that ; in other words, a number whose ''square'' (the result of multiplying the number by itself, or  ⋅ ) is . For example, 4 and −4 are square roots of 16, because . ...

. It also computed transcendental functions such as

exponential Exponential may refer to any of several mathematical topics related to exponentiation, including: *Exponential function, also: **Matrix exponential, the matrix analogue to the above *Exponential decay, decrease at a rate proportional to value *Expo ...

, logarithmic or

trigonometric Trigonometry () is a branch of mathematics that studies relationships between side lengths and angles of triangles. The field emerged in the Hellenistic world during the 3rd century BC from applications of geometry to astronomical studies. ...

calculations, and besides floating-point it could also operate on large binary and decimal integers. The performance enhancements were from approximately 20% to over 500%, depending on the specific application. The 8087 could perform about 50,000 FLOPS using around 2.4 watts. Only arithmetic operations benefited from installation of an 8087; computers used only with such applications as word processing, for example, would not benefit from the extra expense (around $150) and power consumption of an 8087. Intel_8087_die

The 8087 was an advanced IC for its time, pushing the limits of manufacturing technology of the period. Initial yields were extremely low. The sales of the 8087 received a significant boost when IBM included a coprocessor socket on the IBM PC motherboard. Due to a shortage of chips, IBM did not actually offer the 8087 as an option for the PC until it had been on the market for six months. Development of the 8087 led to the IEEE 754-1985 standard for floating-point arithmetic. There were later x87 coprocessors for the

80186 The Intel 80186, also known as the iAPX 186, or just 186, is a microprocessor and microcontroller introduced in 1982. It was based on the Intel 8086 and, like it, had a 16-bit external data bus multiplexed with a 20-bit address bus. The 80188 ...

(not used in PC-compatibles),

80286 The Intel 80286 (also marketed as the iAPX 286 and often called Intel 286) is a 16-bit microprocessor that was introduced on February 1, 1982. It was the first 8086-based CPU with separate, non- multiplexed address and data buses and also the ...

, 80386, and

80386SX The Intel 386, originally released as 80386 and later renamed i386, is a 32-bit microprocessor introduced in 1985. The first versions had 275,000 transistors x86 x86 (also known as 80x86 or the 8086 family) is a family of complex instruction set computer (CISC) instruction set architectures initially developed by Intel based on the Intel 8086 microprocessor and its 8088 variant. The 8086 was intr ...

processors did not use a separate floating-point coprocessor; floating-point functions were provided integrated with the processor. Internally, the chip lacked a hardware multiplier and implemented calculations using the

CORDIC CORDIC (for "coordinate rotation digital computer"), also known as Volder's algorithm, or: Digit-by-digit method Circular CORDIC (Jack E. Volder), Linear CORDIC, Hyperbolic CORDIC (John Stephen Walther), and Generalized Hyperbolic CORDIC (GH C ...

algorithm.

Design and development

Intel had previously manufactured the 8231 ''Arithmetic processing unit'', and the 8232 ''Floating Point Processor''. These were designed for use with

8080 The Intel 8080 (''"eighty-eighty"'') is the second 8-bit microprocessor designed and manufactured by Intel. It first appeared in April 1974 and is an extended and enhanced variant of the earlier 8008 design, although without binary compatibil ...

or similar processors and used an 8-bit data bus. They were interfaced to a host system either through programmed I/O or a DMA controller. The 8087 was initially conceived by Bill Pohlman, the engineering manager at Intel who oversaw the development of the 8086 chip. Bill took steps to be sure that the 8086 chip could support a yet-to-be-developed math chip. In 1977 Pohlman got the go ahead to design the 8087 math chip. Bruce Ravenel was assigned as architect, and John Palmer was hired to be co-architect and mathematician for the project. The two came up with a revolutionary design with 64 bits of mantissa and 16 bits of exponent for the longest-format real number, with a stack architecture CPU and eight 80-bit stack registers, with a computationally rich instruction set. The design solved a few outstanding known problems in numerical computing and numerical software: rounding-error problems were eliminated for 64-bit operands, and numerical mode conversions were solved for all 64-bit numbers. Palmer credited William Kahan's writings on floating point as a significant influence on their design. The 8087 design initially met a cool reception in Santa Clara due to its aggressive design. Eventually, the design was assigned to Intel Israel, and Rafi Nave was assigned to lead the implementation of the chip. Palmer, Ravenel and Nave were awarded patents for the design. Robert Koehler and John Bayliss were also awarded a patent for the technique where some instructions with a particular bit pattern were offloaded to the coprocessor. The 8087 had 45,000 transistors and was manufactured as a 3 μm depletion-load HMOS circuit. It worked in tandem with the 8086 or 8088 and introduced about 60 new instructions. Most 8087 assembly

mnemonic A mnemonic ( ) device, or memory device, is any learning technique that aids information retention or retrieval (remembering) in the human memory for better understanding. Mnemonics make use of elaborative encoding, retrieval cues, and imag ...

s begin with F, such as FADD, FMUL, FCOM and so on, making them easily distinguishable from 8086 instructions. The binary encodings for all 8087 instructions begin with the bit pattern 11011, decimal 27, the same as the

ASCII ASCII ( ), abbreviated from American Standard Code for Information Interchange, is a character encoding standard for electronic communication. ASCII codes represent text in computers, telecommunications equipment, and other devices. Because ...

character ESC, although in the higher-order bits of a byte; similar instruction prefixes are also sometimes referred to as " escape codes". The instruction mnemonic assigned by Intel for these coprocessor instructions is "ESC". When the 8086 or 8088 CPU executed the ESC instruction, if the second byte (the ModR/M byte) specified a memory operand, the CPU would execute a bus cycle to read one word from the memory location specified in the instruction (using any 8086 addressing mode), but it would not store the read operand into any CPU register or perform any operation on it; the 8087 would observe the bus and decode the instruction stream in sync with the 8086, recognizing the coprocessor instructions meant for itself. For an 8087 instruction with a memory operand, if the instruction called for the operand to be read, the 8087 would take the word of data read by the main CPU from the data bus. If the operand to be read was longer than one word, the 8087 would also copy the address from the address bus; then, after completion of the data read cycle driven by the CPU, the 8087 would immediately use DMA to take control of the bus and transfer the additional bytes of the operand itself. If an 8087 instruction with a memory operand called for that operand to be written, the 8087 would ignore the read word on the data bus and just copy the address, then request DMA and write the entire operand, in the same way that it would read the end of an extended operand. In this way, the main CPU maintained general control of the bus and bus timing, while the 8087 handled all other aspects of execution of coprocessor instructions, except for brief DMA periods when the 8087 would take over the bus to read or write operands to/from its own internal registers. As a consequence of this design, the 8087 could only operate on operands taken either from memory or from its own registers, and any exchange of data between the 8087 and the 8086 or 8088 was only through RAM. The main CPU program continued to execute while the 8087 executed an instruction; from the perspective of the main 8086 or 8088 CPU, a coprocessor instruction took only as long as the processing of the opcode and any memory operand cycle (2 clock cycles for no operand, 8 clock cycles plus the EA calculation time to 12 clock cyclesfor a memory operand lus 4 more clock cycles on an 8088 to transfer the second byte of the operand word), after which the CPU would begin executing the next instruction of the program. Thus, a system with an 8087 was capable of true parallel processing, performing one operation in the integer ALU of the main CPU while at the same time performing a floating-point operation in the 8087 coprocessor. Since the 8086 or 8088 exclusively controlled the instruction flow and timing and had no direct access to the internal status of the 8087, and because the 8087 could execute only one instruction at a time, programs for the combined 8086/8087 or 8088/8087 system had to ensure that the 8087 had time to complete the last instruction issued to it before it was issued another one. The WAIT instruction (of the main CPU) was provided for this purpose, and most assemblers implicitly asserted a WAIT instruction before each instance of most floating-point coprocessor instructions. (It is not necessary to use a WAIT instruction before an 8087 operation if the program uses other means to ensure that enough time elapses between the issuance of timing-sensitive 8087 instructions so that the 8087 can never receive such an instruction before it completes the previous one. It is also not necessary, if a WAIT is used, that it immediately precede the next 8087 instruction.) The WAIT instruction waited for the −TEST input pin of the 8086/8088 to be asserted (low), and this pin was connected to the BUSY pin of the 8087 in all systems that had an 8087 (so TEST was asserted when BUSY was deasserted). Because the instruction prefetch queues of the 8086 and 8088 make the time when an instruction is executed not always the same as the time it is fetched, a coprocessor such as the 8087 cannot determine when an instruction for itself is the next instruction to be executed purely by watching the CPU bus. The 8086 and 8088 have two queue status signals connected to the coprocessor to allow it to synchronize with the CPU's internal timing of execution of instructions from its prefetch queue. The 8087 maintains its own identical prefetch queue, from which it reads the coprocessor opcodes that it actually executes. Because the 8086 and 8088 prefetch queues have different sizes and different management algorithms, the 8087 determines which type of CPU it is attached to by observing a certain CPU bus line when the system is reset, and the 8087 adjusts its internal instruction queue accordingly. The redundant duplication of prefetch queue hardware in the CPU and the coprocessor is inefficient in terms of power usage and total die area, but it allowed the coprocessor interface to use very few dedicated IC pins, which was important. At the time when the 8086, which defined the coprocessor interface, was introduced, IC packages with more than 40 pins were rare, expensive, and wrangled with problems such as excessive lead capacitance, a major limiting factor for signalling speeds. The coprocessor operation codes are encoded in 6 bits across 2 bytes, beginning with the escape sequence: ┌───────────┬───────────┐ │ 1101 1xxx │ mmxx xrrr │ └───────────┴───────────┘ The first three "x" bits are the first three bits of the floating-point opcode. Then two "m" bits, then the latter half three bits of the floating-point opcode, followed by three "r" bits. The "m" and "r" bits specify the addressing-mode information. Application programs had to be written to make use of the special floating-point instructions. At run time, software could detect the coprocessor and use it for floating-point operations. When detected absent, similar floating-point functions had to be calculated in software, or the whole coprocessor could be emulated in software for more precise numerical compatibility.

Registers

The x87 family does not use a directly addressable register set such as the main registers of the x86 processors; instead, the x87 registers form an eight-level deep stack structure ranging from st0 to st7, where st0 is the top. The x87 instructions operate by pushing, calculating, and popping values on this stack. However, dyadic operations such as FADD, FMUL, FCMP, and so on may either ''implicitly'' use the topmost st0 and st1 or may use st0 together with an ''explicit'' memory operand or register; the st0 register may thus be used as an accumulator (i.e. as a combined destination and left operand) and can also be exchanged with any of the eight stack registers using an instruction called FXCH st''X'' (codes D9C8–D9CF_h). This makes the x87 stack usable as seven freely addressable registers plus an accumulator. This is especially applicable on superscalar x86 processors (

Pentium Pentium is a brand used for a series of x86 architecture-compatible microprocessors produced by Intel. The original Pentium processor from which the brand took its name was first released on March 22, 1993. After that, the Pentium II and P ...

of 1993 and later), where these exchange instructions are optimized down to a zero-clock penalty.

IEEE floating-point standard

When Intel designed the 8087, it aimed to make a standard floating-point format for future designs. An important aspect of the 8087 from a historical perspective was that it became the basis for the IEEE 754 floating-point standard. The 8087 did not implement the eventual IEEE 754 standard in all its details, as the standard was not finished until 1985, but the 80387 did. The 8087 provided two basic 32/ 64-bit floating-point data types and an additional extended 80-bit internal temporary format (that could also be stored in memory) to improve accuracy over large and complex calculations. Apart from this, the 8087 offered an 80-bit/18-digit packed BCD ( binary-coded decimal) format and 16-, 32-, and 64-bit integer data types.

Infinity

The 8087 handles infinity values by either affine closure or projective closure (selected by the status register). With affine closure, positive and negative infinities are treated as different values. With projective closure, infinity is treated as an unsigned representation for very small or very large numbers. These two methods of handling infinity were incorporated into the draft version of the IEEE 754 floating-point standard. However, projective closure ( projectively extended real number system) was dropped from the later formal issue of IEEE 754-1985. The 80287 retained projective closure as an option, but the 80387 and subsequent floating-point processors (including the 80187) only supported affine closure.

Coprocessor interface

The 8087 differed from subsequent Intel coprocessors in that it was directly connected to the address and data buses. The 8087 looked for instructions that commenced with the "11011" sequence and acted on them, immediately requesting DMA from the main CPU as necessary to access memory operands longer than one word (16 bits), then immediately releasing bus control back to the main CPU. The coprocessor ''did not'' hold up execution of the program until the coprocessor instruction was complete, and the program had to explicitly synchronize the two processors, as explained above (in the " Design and development" section). There was a potential crash problem if the coprocessor instruction failed to decode to one that the coprocessor understood. Intel's later coprocessors did not connect to the buses in the same way, but received instructions through the main processor I/O ports. This yielded an execution time penalty, but the potential crash problem was avoided because the main processor would ignore the instruction if the coprocessor refused to accept it. The 8087 was able to detect whether it was connected to an 8088 or an 8086 by monitoring the data bus during the reset cycle. The 8087 was, in theory, capable of working concurrently while the 8086/8 processes additional instructions. In practice, there was the potential for program failure if the coprocessor issued a new instruction before the last one had completed. The assembler would automatically insert an FWAIT instruction after every coprocessor opcode, forcing the 8086/8 to halt execution until the 8087 signalled that it had finished. This limitation was removed from later designs.

Models and second sources

Intel 8087 coprocessors were fabricated in two variants: one with ceramic side-brazed DIP (CerDIP) and one in hermetic DIP (PDIP), and were designed to operate in the following temperature ranges: * C, D, QC and QD prefixes: 0 °C to +70 °C (commercial use). * LC, LD, TC and TD prefixes: −40 °C to +85 °C (industrial use). * MC and MD prefixes: −55 °C to +125 °C (military use). All models of the 8087 had a 40-pin DIP package and operated on 5 volts, consuming around 2.4 watts. Unlike later Intel coprocessors, the 8087 had to run at the same clock speed as the main processor. Suffixes on the part number identified the clock speed: File:Intel C8087.jpg, The original 5 MHz Intel C8087 File:Intel 8087.jpg, The original 5 MHz Intel D8087 File:Ic-photo-Intel--D8087-2-(8086-FPU).png, An 8 MHz Intel D8087-2 File:Ic-photo-intel-C8087-3.png, An Intel 5 MHz C8087-3 The part was second-sourced by

AMD Advanced Micro Devices, Inc. (AMD) is an American multinational semiconductor company based in Santa Clara, California, that develops computer processors and related technologies for business and consumer markets. While it initially manufactur ...

as AMD 8087 and by Cyrix as Cyrix 8087. The clone K1810WM87 of the 8087 was produced in the

Soviet Union The Soviet Union,. officially the Union of Soviet Socialist Republics. (USSR),. was a List of former transcontinental countries#Since 1700, transcontinental country that spanned much of Eurasia from 1922 to 1991. A flagship communist state, ...

Successors

Just as the 8088 and 8086 processors were superseded by later parts, so was the 8087 superseded. Other Intel coprocessors were the 80287, 80387, and the 80187. Starting with the 80486, the later Intel processors did not use a separate floating-point coprocessor; virtually all included it on the main processor die, with the significant exception of the 80486SX, which was a modified 80486DX with the FPU disabled. The 80487 was in fact a full-blown

80486DX The Intel 486, officially named i486 and also known as 80486, is a microprocessor. It is a higher-performance follow-up to the Intel 386. The i486 was introduced in 1989. It represents the fourth generation of binary compatible CPUs following the ...

chip with an extra pin. When installed, it disabled the 80486SX CPU. The

, and later processors include floating-point functionality on the CPU core.

References

Bibliography

External links

Intel 80x87 math coprocessors at cpu-collection.de

Coprocessor.info: 8087 math coprocessor history information and pictures

Datasheet for the Intel 8087 Math Coprocessor
* {{Authority control 80087 Floating point Coprocessors