HOME

TheInfoList



OR:

The AMD Am29000, commonly shortened to 29k, is a family of 32-bit
RISC In computer engineering, a reduced instruction set computer (RISC) is a computer designed to simplify the individual instructions given to the computer to accomplish tasks. Compared to the instructions given to a complex instruction set comput ...
microprocessor A microprocessor is a computer processor where the data processing logic and control is included on a single integrated circuit, or a small number of integrated circuits. The microprocessor contains the arithmetic, logic, and control circu ...
s and
microcontroller A microcontroller (MCU for ''microcontroller unit'', often also MC, UC, or μC) is a small computer on a single VLSI integrated circuit (IC) chip. A microcontroller contains one or more CPUs (processor cores) along with memory and programmable i ...
s developed and fabricated by
Advanced Micro Devices Advanced Micro Devices, Inc. (AMD) is an American multinational semiconductor company based in Santa Clara, California, that develops computer processors and related technologies for business and consumer markets. While it initially manufact ...
(AMD). Based on the seminal
Berkeley RISC Berkeley RISC is one of two seminal research projects into reduced instruction set computer (RISC) based microprocessor design taking place under the Defense Advanced Research Projects Agency '' Very Large Scale Integration'' (VLSI) VLSI Project. ...
, the 29k added a number of significant improvements. They were, for a time, the most popular RISC chips on the market, widely used in
laser printer Laser printing is an electrostatic digital printing process. It produces high-quality text and graphics (and moderate-quality photographs) by repeatedly passing a laser beam back and forth over a negatively-charged cylinder called a "drum" to d ...
s from a variety of manufacturers. Developed since 1984-1985, announced in March 1987 and released in May 1988, the initial Am29000 was followed by several versions, ending with the Am29040 in 1995. The 29050 was notable for being early to feature a
floating point unit Floating may refer to: * a type of dental work performed on horse teeth * use of an isolation tank * the guitar-playing technique where chords are sustained rather than scratched * ''Floating'' (play), by Hugh Hughes * Floating (psychological phe ...
capable of executing one multiply–add operation per cycle. AMD was designing a
superscalar A superscalar processor is a CPU that implements a form of parallelism called instruction-level parallelism within a single processor. In contrast to a scalar processor, which can execute at most one single instruction per clock cycle, a sup ...
version until late 1995, when AMD dropped the development of the 29k because the design team was transferred to support the PC (
x86 x86 (also known as 80x86 or the 8086 family) is a family of complex instruction set computer (CISC) instruction set architectures initially developed by Intel based on the Intel 8086 microprocessor and its 8088 variant. The 8086 was introd ...
) side of the business. What remained of AMD's embedded business was realigned towards the embedded 186 family of
80186 The Intel 80186, also known as the iAPX 186, or just 186, is a microprocessor and microcontroller introduced in 1982. It was based on the Intel 8086 and, like it, had a 16-bit external data bus multiplexed with a 20-bit address bus. The 801 ...
derivatives. By then the majority of AMD's resources were concentrated on their high-performance x86 processors for desktop PCs, using many of the ideas and individual parts of the 29k designs to produce the
AMD K5 The K5 is AMD's first x86 processor to be developed entirely in-house. Introduced in March 1996, its primary competition was Intel's Pentium microprocessor. The K5 was an ambitious design, closer to a Pentium Pro than a Pentium regarding techni ...
.


Design

The 29000 evolved from the same
Berkeley RISC Berkeley RISC is one of two seminal research projects into reduced instruction set computer (RISC) based microprocessor design taking place under the Defense Advanced Research Projects Agency '' Very Large Scale Integration'' (VLSI) VLSI Project. ...
design that also led to the Sun SPARC,
Intel i960 Intel's i960 (or 80960) was a RISC-based microprocessor design that became popular during the early 1990s as an embedded microcontroller. It became a best-selling CPU in that segment, along with the competing AMD 29000. In spite of its success, ...
,
ARM In human anatomy, the arm refers to the upper limb in common usage, although academically the term specifically means the upper arm between the glenohumeral joint (shoulder joint) and the elbow joint. The distal part of the upper limb between th ...
and
RISC-V RISC-V (pronounced "risk-five" where five refers to the number of generations of RISC architecture that were developed at the University of California, Berkeley since 1981) is an open standard instruction set architecture (ISA) based on estab ...
. One design element used in some of the Berkeley RISC-derived designs is the concept of
register window In computer engineering, register windows are a feature which dedicates registers to a subroutine by dynamically aliasing a subset of internal registers to fixed, programmer-visible registers. Register windows are implemented to improve the perf ...
s, a technique used to speed up
procedure call In computer programming, a function or subroutine is a sequence of program instructions that performs a specific task, packaged as a unit. This unit can then be used in programs wherever that particular task should be performed. Functions may ...
s significantly. The idea is to use a large set of registers as a stack, loading local data into a set of registers during a call, and marking them "dead" when the procedure returns. Values being returned from the routines would be placed in the "global page", the top eight registers in the SPARC (for instance). The competing early RISC design from
Stanford University Stanford University, officially Leland Stanford Junior University, is a private research university in Stanford, California. The campus occupies , among the largest in the United States, and enrolls over 17,000 students. Stanford is consider ...
, the Stanford MIPS, also looked at this concept but decided that improved compilers could make more efficient use of general purpose registers than a hard-wired window. In the original Berkeley design, SPARC, and i960, the windows were fixed in size. A routine using only one local variable would still use up eight registers on the SPARC, wasting this expensive resource. It was here that the 29000 differed from these earlier designs, using a variable window size. In this example only two registers would be used, one for the local variable, another for the
return address In postal mail, a return address is an explicit inclusion of the address of the person sending the message. It provides the recipient (and sometimes authorized intermediaries) with a means to determine how to respond to the sender of the message i ...
. It also added more registers, including the same 128 registers for the procedure stack, but adding another 64 for global access. In comparison, the SPARC had 128 registers in total, and the global set was a standard window of eight. This change resulted in much better register use in the 29000 under a wide variety of workloads. The 29000 also extended the register window stack with an in-memory (and in theory, in-cache) stack. When the window filled the calls would be pushed off the end of the register stack into memory, restored as required when the routine returned. Generally, the 29000's register usage was considerably more advanced than competing designs based on the Berkeley concepts. Another difference with the Berkeley design is that the 29000 included no special-purpose condition code register. Any register could be used for this purpose, allowing the conditions to be easily saved at the expense of complicating some code. A Branch Target Cache (512 bytes on the 29000 and 1024 bytes on the 29050) stored sets of 4 or 2 sequential instructions found at the branch target address, reducing the instruction fetch latency during taken branches—the 29000 did not include any branch prediction system so there was a delay if a branch was taken. The buffer mitigated this by storing four or two instructions from the target address of the branch, which could be run instantly while the fetch buffer was re-filled with new instructions from memory.


Versions

The first Am29000 was released in 1988, including a built-in MMU but
floating point In computing, floating-point arithmetic (FP) is arithmetic that represents real numbers approximately, using an integer with a fixed precision, called the significand, scaled by an integer exponent of a fixed base. For example, 12.345 can be ...
support was offloaded to the Am29027 FPU. Units with failed MMU or Branch Target Cache were sold as the Am29005. In 1991 the line was extended with the Am29030 and Am29035, which included an 8  KB or 4 KB of instruction cache, respectively. By then the Am29050 had also become available, without on-chip cache but featuring a
floating-point unit In computing, floating-point arithmetic (FP) is arithmetic that represents real numbers approximately, using an integer with a fixed precision, called the significand, scaled by an integer exponent of a fixed base. For example, 12.345 can b ...
with fully pipelined
multiply–accumulate operation In computing, especially digital signal processing, the multiply–accumulate (MAC) or multiply-add (MAD) operation is a common step that computes the product of two numbers and adds that product to an accumulator. The hardware unit that performs ...
s, a larger 1 KB Branch Target Cache with a claimed 80% hit rate, and better-pipelined load operations sped up by a 4-entry TLB-like Physical Address Cache. Though it is not a
superscalar A superscalar processor is a CPU that implements a form of parallelism called instruction-level parallelism within a single processor. In contrast to a scalar processor, which can execute at most one single instruction per clock cycle, a sup ...
processor, it permits a floating-point operation and an integer operation to complete at the same cycle. The integer and floating-point sides each have an own write port to the registers. It contained 428,000 transistors on a 1-micron process with a 0.8-micron effective channel length and was available at 20, 25, 33, and 40 MHz. Later the Am29040 was released at 33, 40, and 50 MHz, being like the Am29030 except for featuring a 4 KB data cache, a multiplication unit, and a few other enhancements. The 119 mm² Am29040 contained 1.2 million transistors on a 0.7-micron process. A
superscalar A superscalar processor is a CPU that implements a form of parallelism called instruction-level parallelism within a single processor. In contrast to a scalar processor, which can execute at most one single instruction per clock cycle, a sup ...
version of 29K was being designed, but canceled in favor of x86. It was codenamed ''Jaguar'', and was described in November 1994 and August 1995. It was an advanced design, capable of four-way dispatch into six
reservation station A unified reservation station, also known as unified scheduler, is a decentralized feature of the microarchitecture of a CPU that allows for register renaming, and is used by the Tomasulo algorithm for dynamic instruction scheduling. Reservat ...
s and
speculative Speculative may refer to: In arts and entertainment *Speculative art (disambiguation) *Speculative fiction, which includes elements created out of human imagination, such as the science fiction and fantasy genres **Speculative Fiction Group, a Per ...
out-of-order execution of instructions, with four-way retire. The register file permitted four reads and two writes at once. The caches for instructions and data were 8 KB each. Loads from cache could bypass stores. It had no on-chip FPU due to cost reasons and the target market. It was expected to attain 100 MHz frequency on a 0.4-micron process. AMD used the unreleased 29K microarchitecture as the basis of the K5 series of
x86 x86 (also known as 80x86 or the 8086 family) is a family of complex instruction set computer (CISC) instruction set architectures initially developed by Intel based on the Intel 8086 microprocessor and its 8088 variant. The 8086 was introd ...
-compatible processors. The ALUs were carried over, as was the
re-order buffer A re-order buffer (ROB) is a hardware unit used in an extension to the Tomasulo algorithm to support out-of-order and speculative instruction execution. The extension forces instructions to be committed in-order. The buffer is a circular buffer ...
with a slight modification. The FPU was taken from the 29050, but extended to 80 bits precision. The K5 translated the x86 instructions into "RISC-OPs" upon decoding, aided by the predecode information held of the cached instructions. AMD claimed that the superscalar 29K would have only a slightly lower performance than the K5, but much lower cost due to the size difference. The Honeywell 29KII is a CPU based on the AMD 29050, and it was extensively used in real-time avionics. Image:AMD_Am29000_die.JPG, Am29000 Image:AMD_Am29030_die.jpg, Am29030 Image:AMD_Am29040_die.JPG, Am29040 Image:AMD_Am29050_die.JPG, Am29050


See also

*
List of AMD Am2900 and Am29000 families Advanced Micro Devices (AMD) had a number of product lines with the part numbers beginning with "29". These families were generally not related to one another. The Am29(F, BL, DL, DS)xxx family contains a variety of flash memories, an