The Alpha 21064 is a microprocessor developed and fabricated by Digital Equipment Corporation that implemented the Alpha (introduced as the Alpha AXP) instruction set architecture (ISA). It was introduced as the DECchip 21064 before it was renamed in 1994. The 21064 is also known by its code name, EV4. It was announced in February 1992 with volume availability in September 1992. The 21064 was the first commercial implementation of the Alpha ISA, and the first microprocessor from Digital to be available commercially. It was succeeded by a derivative, the Alpha 21064A, in October 1993. This last version was replaced by the Alpha 21164 in 1995.


History

The first Alpha processor was a test chip codenamed EV3. It was fabricated using Digital's 1.0-micrometre (μm) CMOS-3 process, lacked a floating point unit, and had only 1 KB caches. The test chip was used to confirm the operation of the aggressive circuit design techniques and, along with simulators and emulators, to bring up firmware and the various operating systems that the company supported. The EV3 was used in the Alpha Demonstration Unit (ADU), a multiprocessor system used by Digital to develop software for the Alpha platform before the availability of EV4 parts.

The production chip, codenamed EV4, was fabricated using Digital's 0.75 μm CMOS-4 process. Dirk Meyer and Edward McLellan were the micro-architects: McLellan designed the issue logic while Meyer designed the other major blocks. Jim Montanaro led the circuit implementation.

The 21064 was unveiled at the 39th International Solid-State Circuits Conference (ISSCC) in mid-February 1992. It was announced on 25 February 1992, with a 150 MHz sample introduced on the same day. It was priced at $3,375 in quantities of 100, $1,650 in quantities between 100 and 1,000, and $1,560 for quantities over 1,000. Volume shipments began in September 1992.

In early February 1993, the price of the 150 MHz version was reduced to $1,096 from $1,559 in quantities greater than 1,000. On 25 February 1993, a 200 MHz version was introduced, with sample kits available, priced at $3,495. In volume, it was priced at $1,231 per unit in quantities greater than 10,000. Volume orders were accepted in June 1993, with shipments in August 1993. The price of the 150 MHz version was reduced in response: the sample kit was reduced to $1,690 from $3,375, effective in April 1993, and the volume price was reduced to $853 from $1,355 per unit in quantities greater than 10,000, effective in July 1993.

With the introduction of the Alpha 21066 and the Alpha 21068 on 10 September 1993, Digital adjusted the positioning of the existing 21064s and introduced a 166 MHz version priced at $499 per unit in quantities of 5,000. The price of the 150 MHz version was reduced to $455 per unit in quantities of 5,000. On 6 June 1994, the price of the 200 MHz version was reduced by 31% to $544 to position it against the 60 MHz Pentium, and the price of the 166 MHz version was reduced by 19% to $404 per unit in quantities of 5,000, effective on 3 July 1994.

The Alpha 21064 was fabricated at Digital's Hudson, Massachusetts and South Queensferry, Scotland facilities.


Users

The 21064 was mostly used in high-end computers such as workstations and servers. Users included:
* Aspen Systems in its Alpine workstations
* Carrera Computers in its Hercules 150, Hercules 200, and Pantera II workstations
* Cray Research, which used the 150 MHz 21064 in its Cray T3D supercomputers
* Digital, in its DECpc AXP 150 entry-level workstations, DEC 2000 AXP entry-level servers, DEC 3000 AXP workstations and entry-level servers, DEC 4000 AXP mid-range servers and DEC 7000/10000 AXP high-end servers
* Encore Computer, in its Infinity R/T high-end real-time computer


Performance

The 21064 was the highest performing microprocessor from its introduction until 1993, when International Business Machines (IBM) introduced the multi-chip POWER2. It then remained the highest performing single-chip microprocessor, a position it held until the 275 MHz 21064A was introduced in October 1993.


Description

The Alpha 21064 is a superpipelined dual-issue superscalar microprocessor that executes instructions in-order. It is capable of issuing up to two instructions every clock cycle to four functional units: an integer unit, a floating-point unit (FPU), an address unit, and a branch unit. The integer pipeline is seven stages long, and the floating-point pipeline ten stages. The first four stages of both pipelines are identical and are implemented by the I-box.


I-box

The I-box is the control unit; it fetches, decodes, and issues instructions and controls the pipeline. During stage one, two instructions are fetched from the I-cache. Branch prediction is performed by logic in the I-box during stage two, using either static or dynamic prediction. Static prediction examines the sign bit of the displacement field of a branch instruction and predicts the branch as taken if the sign bit indicates a backwards branch (that is, if the sign bit is 1). Dynamic prediction examines an entry in the 2,048-entry by 1-bit branch history table; if the entry contains 1, the branch is predicted as taken. With dynamic prediction, branch prediction is approximately 80% accurate for most programs. The branch misprediction penalty is four cycles.

The instructions are decoded during stage three. During stage four, the I-box checks whether the resources required by the two instructions are available. If so, the instructions are issued, provided they can be paired. Which instructions can be paired is determined by the number of read and write ports in the integer register file. The 21064 can issue: an integer operate with a floating-point operate, any load/store instruction with any operate instruction, an integer operate with an integer branch, or a floating-point operate with a floating-point branch. Two combinations are not permitted: an integer operate with a floating-point store, and a floating-point operate with an integer store. If only one of the two instructions can be issued, the first four stages are stalled until the remaining instruction is issued. The first four stages are also stalled if no instruction can be issued due to resource unavailability, dependencies, or similar conditions.

The I-box contains two translation lookaside buffers (TLBs) for translating virtual addresses to physical addresses. These TLBs are referred to as ''instruction translation buffers'' (ITBs). The ITBs cache recently used page table entries for the instruction stream. An eight-entry ITB is used for 8 KB pages and a four-entry ITB for 4 MB pages. Both ITBs are fully associative and use a not-last-used replacement algorithm.
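
The two branch prediction schemes used in stage two can be illustrated with a minimal Python sketch. It is not a description of the actual circuit: the function and class names are hypothetical, and the way the table is indexed (low-order bits of the instruction's word address) and updated (overwritten with the resolved outcome) are assumptions that this article does not confirm.

```python
BHT_ENTRIES = 2048  # 2,048 entries of 1 bit each, as stated in the text


def static_predict_taken(displacement: int) -> bool:
    """Static prediction: a negative displacement (sign bit 1) means a
    backwards branch, which is predicted as taken."""
    return displacement < 0


class OneBitHistoryTable:
    """Dynamic prediction with one history bit per entry.

    Assumptions (not from the text): the table is indexed by low-order
    bits of the instruction's word address, and each bit is overwritten
    with the actual outcome once the branch resolves.
    """

    def __init__(self, entries: int = BHT_ENTRIES) -> None:
        self.entries = entries
        self.bits = [0] * entries

    def _index(self, pc: int) -> int:
        # Alpha instructions are 4 bytes wide, so drop the two low bits.
        return (pc >> 2) % self.entries

    def predict_taken(self, pc: int) -> bool:
        # An entry containing 1 predicts the branch as taken.
        return self.bits[self._index(pc)] == 1

    def update(self, pc: int, taken: bool) -> None:
        self.bits[self._index(pc)] = 1 if taken else 0
```

Under these assumptions, a loop branch that is taken nine times and then falls through is mispredicted on its first and last iterations when the table starts cleared, since a single history bit only remembers the most recent outcome.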


Execution

The register files are read during stage four, and execution begins during stage five for all instructions. From stage five onward, the pipelines cannot be stalled.
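
Because the pipelines cannot stall after stage four, the decision about which two instructions may issue together has to be made in the I-box beforehand. The pairing rules listed in the I-box section can be written as a small predicate. This is only a sketch: the `Kind` classes and the function name are hypothetical, the rules are treated as independent of instruction order, and any combination the text does not mention is assumed to be disallowed.

```python
from enum import Enum, auto


class Kind(Enum):
    """Coarse instruction classes used only for this illustration."""
    INT_OP = auto()
    FP_OP = auto()
    INT_LOAD = auto()
    FP_LOAD = auto()
    INT_STORE = auto()
    FP_STORE = auto()
    INT_BRANCH = auto()
    FP_BRANCH = auto()


OPERATES = {Kind.INT_OP, Kind.FP_OP}
MEMORY = {Kind.INT_LOAD, Kind.FP_LOAD, Kind.INT_STORE, Kind.FP_STORE}

# The two load/store-with-operate combinations the text rules out.
FORBIDDEN = {
    frozenset({Kind.INT_OP, Kind.FP_STORE}),
    frozenset({Kind.FP_OP, Kind.INT_STORE}),
}

# Operate-with-operate and operate-with-branch pairs the text allows.
ALLOWED = {
    frozenset({Kind.INT_OP, Kind.FP_OP}),
    frozenset({Kind.INT_OP, Kind.INT_BRANCH}),
    frozenset({Kind.FP_OP, Kind.FP_BRANCH}),
}


def can_dual_issue(a: Kind, b: Kind) -> bool:
    """Return True if the two instruction classes may issue together."""
    pair = frozenset({a, b})
    if pair in FORBIDDEN:
        return False
    if pair in ALLOWED:
        return True
    # Any load/store with any operate, minus the forbidden pairs above.
    # Assumption: everything else (for example two integer operates) is
    # not pairable.
    return len(pair) == 2 and bool(pair & MEMORY) and bool(pair & OPERATES)
```

For example, `can_dual_issue(Kind.FP_LOAD, Kind.INT_OP)` is true, while `can_dual_issue(Kind.INT_OP, Kind.FP_STORE)` is false.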


Integer unit

The integer unit is responsible for executing integer instructions. It consists of the integer register file (IRF) and the E-box. The IRF contains thirty-two 64-bit registers and has four read ports and two write ports that are equally divided between the integer unit and the branch unit. The E-box contains an adder, a logic unit, a barrel shifter, and a multiplier. Except for multiply, shift, and byte manipulation instructions, most integer instructions are completed by the end of stage five and thus have a latency of one cycle. The barrel shifter is pipelined, but shift and byte manipulation instructions are not completed until the end of stage six, and thus have a latency of two cycles. The multiplier was not pipelined in order to save die area (McLellan 1993, p. 42); multiply instructions therefore have a variable latency of 19 to 23 cycles depending on the operands. In stage seven, integer instructions write their results to the IRF.


Address unit

The address unit, also known as the "A-box", executes load and store instructions. To enable the address unit and integer unit to operate in parallel, the address unit has its own displacement adder, which it uses to calculate virtual addresses, instead of using the adder in the integer unit (McLellan 1993, p. 43). A 32-entry fully associative translation lookaside buffer (TLB) is used to translate virtual addresses into physical addresses. This TLB is referred to as the ''data translation buffer'' (DTB). The 21064 implements a 43-bit virtual address and a 34-bit physical address, and is therefore capable of addressing 8 TB of virtual memory and 16 GB of physical memory. Data from store instructions is buffered in a four-entry by 32-byte write buffer. The write buffer improves performance by merging data from adjacent stores, which reduces the number of writes on the system bus, and by temporarily delaying stores, which allows loads to be serviced sooner because the system bus is used less often.
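
The addressable memory figures follow directly from the address widths. The short Python check below shows the arithmetic; the constant and function names are chosen for illustration only, and the page-offset calculation assumes the 8 KB page size mentioned for the instruction translation buffers.

```python
VIRTUAL_ADDRESS_BITS = 43
PHYSICAL_ADDRESS_BITS = 34

GB = 1 << 30  # binary gigabyte
TB = 1 << 40  # binary terabyte


def addressable_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses for a given address width."""
    return 1 << address_bits


virtual_tb = addressable_bytes(VIRTUAL_ADDRESS_BITS) // TB    # 8
physical_gb = addressable_bytes(PHYSICAL_ADDRESS_BITS) // GB  # 16

# With 8 KB pages, the low 13 bits of an address select a byte within
# the page, leaving 30 bits of virtual page number to be translated.
PAGE_BYTES = 8 * 1024
page_offset_bits = PAGE_BYTES.bit_length() - 1                   # 13
virtual_page_bits = VIRTUAL_ADDRESS_BITS - page_offset_bits      # 30
```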


Floating-point unit

The floating-point unit consists of the floating-point register file (FRF) and the F-box (Dobberpuhl 1992, p. 36). The FRF contains thirty-two 64-bit registers and has three read ports and two write ports. The F-box contains a floating-point pipeline and a non-pipelined divide unit which retires one bit per cycle. In stage four, the floating-point register file is read and the data is formatted into fraction, exponent, and sign. For add instructions, the adder calculates the exponent difference, and a predictive leading-one or leading-zero detector, which uses the input operands to normalize the result, is started. For multiply instructions, a 3× multiple of the multiplicand is generated. In stages five and six, alignment or a normalization shift and sticky-bit calculations are performed for adds and subtracts. Multiplies are performed in a pipelined, two-way interleaved array which uses a radix-8 Booth algorithm (Dobberpuhl 1992, p. 38). In stage eight, final addition is performed in parallel with rounding. Floating-point instructions write their results to the FRF in stage ten. Instructions executed in the pipeline have a six-cycle latency. Single-precision (32-bit) and double-precision (64-bit) divides, which are executed in the non-pipelined divide unit, have a latency of 31 and 61 cycles, respectively.
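
The reason a 3× multiple of the multiplicand is prepared ahead of the multiplier array can be seen from a small integer model of radix-8 Booth recoding. The sketch below is a textbook formulation in Python, not the F-box datapath: it works on plain non-negative integers rather than floating-point fractions, ignores the two-way interleaving, and uses hypothetical function names.

```python
def booth_radix8_digits(multiplier: int, bits: int) -> list[int]:
    """Recode a non-negative multiplier into radix-8 Booth digits.

    Each digit lies in -4..4 and is derived from three multiplier bits
    plus the top bit of the previous group.
    """
    if not 0 <= multiplier < (1 << bits):
        raise ValueError("multiplier does not fit in the given width")
    groups = (bits + 3) // 3  # enough groups that the final top bit is 0
    digits = []
    prev = 0  # bit just below the current group
    for i in range(groups):
        b0 = (multiplier >> (3 * i)) & 1
        b1 = (multiplier >> (3 * i + 1)) & 1
        b2 = (multiplier >> (3 * i + 2)) & 1
        digits.append(-4 * b2 + 2 * b1 + b0 + prev)
        prev = b2
    return digits


def multiply_radix8(multiplicand: int, multiplier: int, bits: int) -> int:
    """Multiply by summing one Booth-selected multiple per digit."""
    x = multiplicand
    # 1x, 2x and 4x are shifts; 3x needs a real addition, which is why
    # it is generated once, before the partial products are summed.
    three_x = x + (x << 1)
    multiples = [0, x, x << 1, three_x, x << 2]
    product = 0
    for i, digit in enumerate(booth_radix8_digits(multiplier, bits)):
        partial = multiples[abs(digit)] << (3 * i)
        product += partial if digit >= 0 else -partial
    return product
```

Radix-8 recoding handles three multiplier bits per partial product, so a 53-bit fraction needs 18 partial products in this model instead of 53.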


Caches

The 21064 has two on-die primary caches: an 8 KB data cache (known as the D-cache) using a write-through write policy and an 8 KB instruction cache (known as the I-cache). Both caches are direct-mapped for single-cycle access and have a 32-byte line size. The caches are built with six-transistor static random access memory (SRAM) cells that have an area of 98 μm². The caches are 1,024 cells wide by 66 cells tall, with the top two rows used for redundancy. An optional external secondary cache, known as the B-cache, with capacities of 128 KB to 16 MB is supported. The B-cache operates at one-third to one-sixteenth of the internal clock frequency, or 12.5 to 66.67 MHz at 200 MHz. The B-cache is direct-mapped and has a default line size of 128 bytes that could be configured to be larger. The B-cache is accessed via the system bus.
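
The primary cache geometry can be checked with a little arithmetic. The Python sketch below derives the number of lines and the address fields of an 8 KB direct-mapped cache with 32-byte lines, and confirms that a 1,024 by 66 cell array with two redundant rows holds 8 KB of data. The names are illustrative, one bit per cell is assumed, and the sketch does not say which address bits (virtual or physical) the hardware uses, since the text does not state it.

```python
CACHE_BYTES = 8 * 1024
LINE_BYTES = 32

LINES = CACHE_BYTES // LINE_BYTES          # 256 lines
OFFSET_BITS = LINE_BYTES.bit_length() - 1  # 5 bits select a byte in a line
INDEX_BITS = LINES.bit_length() - 1        # 8 bits select the line


def split_address(addr: int) -> tuple[int, int, int]:
    """Split an address into (tag, index, offset) for a direct-mapped cache."""
    offset = addr & (LINE_BYTES - 1)
    index = (addr >> OFFSET_BITS) & (LINES - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


# Data array capacity: 1,024 columns by 66 rows, two rows redundant.
# Assumption: each SRAM cell stores one bit; tag storage is not counted.
ARRAY_COLUMNS = 1024
ARRAY_ROWS = 66
REDUNDANT_ROWS = 2
data_bytes = ARRAY_COLUMNS * (ARRAY_ROWS - REDUNDANT_ROWS) // 8  # 8192
```

In a direct-mapped cache, two addresses that are exactly 8 KB apart share the same index and differ only in the tag, so they compete for the same line.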


External interface

The external interface is a 128-bit data bus that operates at half to one-eighth of the internal clock rate, or 25 to 100 MHz at 200 MHz. The width of the bus is configurable; systems using the 21064 can have a 64-bit external interface instead. The external interface also includes a 34-bit address bus.
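
The bus clock range quoted above is simply the core clock divided by the configured ratio. The following Python sketch reproduces those figures and adds a theoretical peak transfer rate. Two things here are assumptions rather than facts from this article: that every integer divisor from 2 to 8 is selectable (only the end points are given), and that the bus moves one full-width transfer per bus clock, which makes the bandwidth figure an upper bound at best. The function names are hypothetical.

```python
def bus_clock_mhz(core_mhz: float, divisor: int) -> float:
    """External bus clock for a given core clock and divisor.

    Assumption: any integer divisor from 2 to 8 is allowed; the text
    only states the range "half to one-eighth".
    """
    if not 2 <= divisor <= 8:
        raise ValueError("divisor must be between 2 and 8")
    return core_mhz / divisor


def peak_bandwidth_mb_per_s(core_mhz: float, divisor: int,
                            bus_bits: int = 128) -> float:
    """Upper bound on transfer rate in millions of bytes per second.

    Assumption: one full-width transfer completes every bus clock.
    """
    return bus_clock_mhz(core_mhz, divisor) * bus_bits / 8
```

At 200 MHz with a divisor of 2, this model gives a 100 MHz bus and an upper bound of 1,600 million bytes per second for the 128-bit configuration, or half that for the 64-bit configuration.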


Fabrication

The 21064 contains 1.68 million transistors (Dobberpuhl 1992, p. 35). The original EV4 was fabricated by Digital in its CMOS-4 process, which has a 0.75 μm feature size and three levels of aluminium interconnect. The EV4 measures 13.9 mm by 16.8 mm, for an area of 233.52 mm². The later EV4S was fabricated in CMOS-4S, a 10% optical shrink of CMOS-4 with a 0.675 μm feature size. This version measures 12.4 mm by 15.0 mm, for an area of 186 mm². The 21064 uses a 3.3-volt (V) power supply. The EV4 dissipates a maximum of 30 W at 200 MHz. The EV4S dissipates a maximum of 21.0 W at 150 MHz, 22.5 W at 166 MHz, and 27.0 W at 200 MHz.
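
The die areas and the shrunk feature size can be reproduced from the quoted dimensions. The Python sketch below is a plain arithmetic check with illustrative names; it also compares the actual area ratio with the 0.81 that a uniform 10% linear shrink would give.

```python
def die_area_mm2(width_mm: float, height_mm: float) -> float:
    """Die area in square millimetres, rounded to two decimals."""
    return round(width_mm * height_mm, 2)


ev4_area = die_area_mm2(13.9, 16.8)   # EV4 in CMOS-4
ev4s_area = die_area_mm2(12.4, 15.0)  # EV4S in CMOS-4S

# A 10% optical shrink scales linear dimensions by 0.9.
LINEAR_SCALE = 0.9
ev4s_feature_um = round(0.75 * LINEAR_SCALE, 3)

# An ideal uniform shrink would scale area by 0.9 squared.
ideal_area_ratio = round(LINEAR_SCALE ** 2, 2)
actual_area_ratio = ev4s_area / ev4_area
```

The actual ratio comes out slightly below the ideal 0.81, at just under 0.80, so the quoted EV4S dimensions are a little smaller than a pure 10% scaling of the EV4 die would predict.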


Package

The 21064 is packaged in a 431-pin alumina-ceramic pin grid array (PGA) measuring 61.72 mm by 61.72 mm. Of the 431 pins, 291 are for signals and 140 are for power and ground. The heatsink is directly attached to the package, secured by nuts attached to two studs protruding from the tungsten heat spreader.


Derivatives


Alpha 21064A

The Alpha 21064A, introduced as the DECchip 21064A and code-named EV45, is a further development of the Alpha 21064 introduced in October 1993. It operated at clock frequencies of 200, 225, 233, 275 and 300 MHz. The 225 MHz model was replaced by the 233 MHz model on 6 July 1994, which at introduction was priced at US$788 in quantities of 5,000, 10% less than the 225 MHz model it replaced. On the same day, the price of the 275 MHz model was reduced by 25% to US$1,083 in quantities of 5,000. The 300 MHz model was announced and sampled on 2 October 1995 and shipped in December 1995. There was also one model, the 21064A-275-PC, that was restricted to running Windows NT or operating systems that use the Windows NT memory management model.

The 21064A succeeded the original 21064 as the high-end Alpha microprocessor. It subsequently saw the most use in high-end systems. Users included:
* Digital in some models of its DEC 3000 AXP, DEC 4000 AXP and DEC 7000/10000 AXP systems
* Aspen Systems in its Alpine workstation
* BTG, which used a 275 MHz model in its Action AXP275 RISC PC
* Carrera Computers in its Cobra AXP 275 workstation
* NekoTech, which used a 275 MHz model overclocked by 5% to 289 MHz in its Mach 2-289-T workstation
* Network Appliance (now NetApp), which used a 275 MHz model in its storage systems

The 21064A had a number of microarchitectural improvements over the 21064. The primary caches were improved in two ways: the capacity of the I-cache and D-cache was doubled from 8 KB to 16 KB, and parity protection was added to the cache tag and cache data arrays. Floating-point divides have a lower latency due to an improved divider that retires two bits per cycle on average. Branch prediction was improved by a larger 4,096-entry by 2-bit branch history table (BHT).

The 21064A contains 2.8 million transistors and measures 14.5 mm by 10.5 mm, for an area of 152.25 mm². It was fabricated by Digital in its fifth-generation CMOS process, CMOS-5, a 0.5 μm process with four levels of aluminium interconnect.
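
One common way to use two history bits per entry is a saturating counter, sketched below in Python for a 4,096-entry table. This is an illustration of the general technique only: the article states the table's size but not its update policy, so the counter scheme, the initial state, the indexing by word address, and all names here are assumptions.

```python
class TwoBitHistoryTable:
    """Branch history table with a 2-bit saturating counter per entry.

    Assumptions (not confirmed by the text): counters start at 1
    ("weakly not taken"), values 2 and 3 predict taken, and the table is
    indexed by low-order bits of the instruction's word address.
    """

    def __init__(self, entries: int = 4096) -> None:
        self.entries = entries
        self.counters = [1] * entries

    def _index(self, pc: int) -> int:
        # Alpha instructions are 4 bytes wide, so drop the two low bits.
        return (pc >> 2) % self.entries

    def predict_taken(self, pc: int) -> bool:
        return self.counters[self._index(pc)] >= 2

    def update(self, pc: int, taken: bool) -> None:
        i = self._index(pc)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)
```

With a counter of this kind, a single not-taken outcome at the end of a loop does not flip the prediction, so the branch is still predicted as taken the next time the loop is entered.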


Alpha 21066

The Alpha 21066, introduced as the DECchip 21066 and code-named LCA4 (''Low Cost Alpha''), is a low-cost variant of the Alpha 21064. Samples were introduced on 10 September 1993, with volume shipments in early 1994. At the time of introduction, the 166 MHz Alpha 21066 was priced at US$385 in quantities of 5,000. A 100 MHz model, intended for embedded systems, also existed; sampling began in late 1994, with volume shipments in the third quarter of 1995. The ''Microprocessor Report'' recognized the Alpha 21066 as the first microprocessor with an integrated PCI controller.

The Alpha 21066 was intended for use in low-cost applications, specifically personal computers running Windows NT. Digital used various models of the Alpha 21066 in its Multia clients, AXPpci 33 original equipment manufacturer (OEM) motherboards and AXPvme single board computers. Outside of Digital, users included Aspen Systems in its Alpine workstation, Carrera Computers in its Pantera I workstation, NekoTech, which used a 166 MHz model in its Mach 1-166 personal computer, and Parsys in its TransAlpha TA9000 Series supercomputers.

Due to the process shrink, it was able to include features that were desirable in cost-sensitive embedded systems. These features include an on-die B-cache and memory controller with ECC support, a functionally limited graphics accelerator supporting up to 8 MB of VRAM for implementing a framebuffer, a PCI controller and a phase locked loop
(PLL) clock generator for multiplying a 33 MHz external clock signal to the desired internal clock frequency. The memory controller supported 64 KB to 2 MB of B-cache and 2 to 512 MB of memory. The ECC implementation was capable of detecting 1-, 2- and 4-bit errors and correcting 1-bit errors. To reduce cost, the Alpha 21066 has a 64-bit system bus, which reduced the number of pins and thus the size of the package. The reduced width of the system bus also reduced bandwidth and thus performance by 20%, which was deemed acceptable. The 21066 contained 1.75 million transistors and measured 17.0 by 12.3 mm, for an area of 209.1 mm2. It was fabricated in CMOS-4S, a 0.675 μm process with three levels of interconnect. The 21066 was packaged in a 287-pin CPGA measuring 57.404 by 57.404 mm.
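The distinction between correcting a 1-bit error and merely detecting a 2-bit error can be illustrated with a textbook extended Hamming code. The sketch below is a generic example over 4 data bits, not the code used in the 21066: the source does not describe the actual code, its word width, or how 4-bit errors were detected, and none of that is modelled here. The function names `encode` and `decode` are illustrative.

```python
def encode(data_bits):
    """Encode 4 data bits as an 8-bit extended Hamming (SECDED) codeword."""
    if len(data_bits) != 4 or any(bit not in (0, 1) for bit in data_bits):
        raise ValueError("expected four bits, each 0 or 1")
    code = [0] * 8
    # Data bits occupy the positions whose index is not a power of two.
    code[3], code[5], code[6], code[7] = data_bits
    # Parity bit p gives even parity over every position whose index has bit p set.
    for p in (1, 2, 4):
        code[p] = sum(code[i] for i in range(1, 8) if i & p) % 2
    # Position 0 holds an overall parity bit covering the other seven positions.
    code[0] = sum(code[1:]) % 2
    return code


def decode(code):
    """Return (data_bits, status); data_bits is None if the error is uncorrectable."""
    if len(code) != 8:
        raise ValueError("expected an 8-bit codeword")
    # XOR of the indices of all set bits: zero for a valid codeword,
    # otherwise the index of the flipped bit when exactly one bit is wrong.
    syndrome = 0
    for i in range(1, 8):
        if code[i]:
            syndrome ^= i
    odd_overall = sum(code) % 2 == 1
    fixed = list(code)
    if odd_overall:
        # Odd overall parity: treat as a single-bit error and flip it back.
        # A syndrome of 0 here means the overall parity bit itself was hit.
        fixed[syndrome] ^= 1
        status = "corrected"
    elif syndrome:
        # Even overall parity with a nonzero syndrome: two bits are wrong.
        return None, "double-bit error detected"
    else:
        status = "ok"
    return [fixed[3], fixed[5], fixed[6], fixed[7]], status


if __name__ == "__main__":
    word = encode([1, 0, 1, 1])
    one_flip = list(word)
    one_flip[5] ^= 1
    two_flips = list(one_flip)
    two_flips[2] ^= 1
    print(decode(word))
    print(decode(one_flip))
    print(decode(two_flips))
```

The overall parity bit is what separates the two cases: a single flipped bit makes the total parity odd and the syndrome points at it, while two flipped bits leave the total parity even but the syndrome nonzero, so the decoder can only report the error.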


Alpha 21066A

The Alpha 21066A, code-named LCA45, is a low-cost variant of the Alpha 21064A. It was announced on 14 November 1994, with samples of 100 and 233 MHz models introduced on the same day. Both models were shipped in March 1995. When announced, the 100 and 233 MHz models were priced at $175 and $360, respectively, in quantities of 5,000. A 266 MHz model was later made available.

The 21066A was second-sourced by Mitsubishi Electric as the M36066A. It was the first Alpha microprocessor to be fabricated by the company. Parts rated at 100 and 233 MHz were announced in November 1994. At the time of the announcement, engineering samples were set for December 1994, commercial samples for July 1995 and volume quantities for September 1995. The 233 MHz part was priced at $490 in quantities of 1,000.

Although it was based on the 21064A, the 21066A did not have the 16 KB instruction and data caches. A feature specific to the 21066A was power management: the microprocessor's internal clock frequency could be adjusted by software. Digital used various models of the 21066A in their products which had previously used the 21066. Outside of Digital, Tadpole Technology used a 233 MHz model in their ALPHAbook 1 notebook.

The 21066A contained 1.8 million transistors on a die measuring 14.8 by 10.9 mm, for an area of 161.32 mm². It was fabricated in Digital's fifth-generation CMOS process, CMOS-5, a 0.5 μm process with three levels of interconnect. Mitsubishi Electric fabricated the M36066A in its own 0.5 μm three-level-metal process.
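The die figures quoted for the 21066 and 21066A can be checked with a few lines of arithmetic. The sketch below uses only the dimensions and feature sizes given in the text. The "ideal" ratio is a rough assumption that area scales with the square of the feature size; it is shown for comparison only, since the 21066A is a different design with more transistors rather than a pure shrink.

```python
# Die dimensions in millimetres and process feature sizes in micrometres,
# as quoted in the text.
DIES = {
    "21066": {"width_mm": 17.0, "height_mm": 12.3, "feature_um": 0.675},
    "21066A": {"width_mm": 14.8, "height_mm": 10.9, "feature_um": 0.5},
}


def die_area_mm2(die):
    """Area of a rectangular die in square millimetres."""
    return die["width_mm"] * die["height_mm"]


areas = {name: die_area_mm2(die) for name, die in DIES.items()}

# Ratio of the two quoted areas.
actual_ratio = areas["21066A"] / areas["21066"]

# Assumption for comparison only: area scaling with the square of feature size.
ideal_ratio = (DIES["21066A"]["feature_um"] / DIES["21066"]["feature_um"]) ** 2

if __name__ == "__main__":
    for name, area in areas.items():
        print(f"{name}: {area:.2f} mm^2")
    print(f"quoted area ratio (21066A / 21066): {actual_ratio:.3f}")
    print(f"ideal square-law ratio:             {ideal_ratio:.3f}")
```

The quoted ratio comes out noticeably larger than the square-law figure, which is consistent with the 21066A not being a straight shrink of the 21066.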


Alpha 21068

The Alpha 21068, introduced as the DECchip 21068, is a version of the 21066 positioned for embedded systems. It was identical to the 21066 but had a lower clock rate to reduce power dissipation and cost. Samples were introduced on 10 September 1993 with volume shipments in early 1994. It operated at 66 MHz and had a 9 W maximum power dissipation. At the time of introduction, the 21068 was priced at US$221 each in quantities of 5,000. On 6 June 1994, Digital announced that it was cutting the price by 16% to US$186, effective on 3 July 1994. The Alpha 21068 was used by Digital in their AXPpci 33 motherboard and the AXPvme 64 and 64LC single-board computers.
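The price cut quoted for the 21068 is consistent with the two prices given. A minimal check, using only the figures in the text (the helper name `discounted_price` is illustrative):

```python
def discounted_price(list_price, cut_percent):
    """Price after reducing list_price by cut_percent percent."""
    return list_price * (1 - cut_percent / 100)


OLD_PRICE_USD = 221   # price at introduction, per unit in quantities of 5,000
NEW_PRICE_USD = 186   # price effective 3 July 1994
QUOTED_CUT_PERCENT = 16

# Applying the quoted 16% cut to the original price.
computed_price = discounted_price(OLD_PRICE_USD, QUOTED_CUT_PERCENT)

# Working backwards from the two quoted prices to the implied percentage.
implied_cut_percent = (OLD_PRICE_USD - NEW_PRICE_USD) / OLD_PRICE_USD * 100

if __name__ == "__main__":
    print(f"price after a 16% cut: ${computed_price:.2f}")
    print(f"implied cut from quoted prices: {implied_cut_percent:.2f}%")
```

Both directions agree after rounding to whole dollars and whole percentage points.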


Alpha 21068A

The Alpha 21068A, introduced as the DECchip 21068A, is a variant of the Alpha 21066A for embedded systems. It operated at a clock frequency of 100 MHz.


Chipsets

Initially, there was no standard chipset for the 21064 and 21064A. Digital's computers used custom application-specific integrated circuits (ASICs) to interface the microprocessor to the system. As this raised development cost for third parties who wished to develop Alpha-based products, Digital developed a standard chipset, the DECchip 21070 (''Apecs''), for original equipment manufacturers (OEMs).

There were two models of the 21070, the DECchip 21071 and the DECchip 21072. The 21071 was intended for workstations whereas the 21072 was intended for high-end workstations or low-end uniprocessor servers. The two models differed in memory subsystem features: the 21071 had a 64-bit memory bus and supported 8 MB to 2 GB of parity-protected memory whereas the 21072 had a 128-bit memory bus and supported 16 MB to 4 GB of ECC-protected memory.

The chipset consisted of three chip designs: the COMANCHE B-cache and memory controller, the DECADE data slice, and the EPIC PCI controller. The DECADE chips implemented the data paths in 32-bit slices, and therefore the 21071 had two such chips while the 21072 had four. The EPIC chip had a 32-bit path to the DECADE chips.

The 21070 was introduced on 10 January 1994 with samples available (Digital Equipment Corporation 1994). Volume shipments began in mid-1994. In quantities of 5,000, the 21071 was priced at $90 and the 21072 at $120. 21070 users included Carrera Computers for its Pantera workstations and Digital in some models of its AlphaStations and uniprocessor AlphaServers.
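The relationship between memory bus width and the number of DECADE chips follows directly from the 32-bit slice width. The sketch below tabulates the two models using the figures from the text. The class and field names are illustrative and do not correspond to any DEC documentation.

```python
from dataclasses import dataclass

# Each DECADE chip implements a 32-bit slice of the data path.
SLICE_WIDTH_BITS = 32


@dataclass(frozen=True)
class ChipsetModel:
    """Memory subsystem parameters of one 21070 chipset model."""
    name: str
    memory_bus_bits: int
    min_memory_mb: int
    max_memory_mb: int
    protection: str

    @property
    def decade_chips(self) -> int:
        """Number of 32-bit DECADE slices needed to cover the memory bus."""
        return self.memory_bus_bits // SLICE_WIDTH_BITS


MODELS = [
    ChipsetModel("21071", 64, 8, 2 * 1024, "parity"),
    ChipsetModel("21072", 128, 16, 4 * 1024, "ECC"),
]

if __name__ == "__main__":
    for model in MODELS:
        print(
            f"{model.name}: {model.memory_bus_bits}-bit bus, "
            f"{model.decade_chips} DECADE chips, "
            f"{model.min_memory_mb} MB to {model.max_memory_mb} MB, "
            f"{model.protection}-protected"
        )
```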


References

* ''Alpha 21064 and 21064A Microprocessors Hardware Reference Manual'', June 1996. Order number: EC-Q92UC-TE. Digital Equipment Corporation.
* Apiki, Steve; Grehan, Rick (March 1995). "Fastest NT Workstations". ''Byte''.
* Bhandarkar, Dileep. "Alpha Implementations". ''IEEE Computer Society Technical Committee on Computer Architecture Newsletter'', December 1995.
* Computergram (25 February 1992). "DEC Reveals More On Alpha, Challenges Hewlett-Packard's Precision Architecture RISC". ''Computer Business Review''.
* Computergram (26 February 1992). "DEC Describes Its Alpha RISC, Kubota Discloses Its Plans". ''Computer Business Review''.
* Computergram (7 June 1994). "DEC slashes Alpha AXP Chip Prices by up to 31%". ''Computer Business Review''.
* Computergram (13 September 1993). "DEC adds Alphas for Personal Computers, Control". ''Computer Business Review''.
* Computergram (11 January 1994). "Microprocessor Report's Annual Chip Awards Declare Motorola 88110 the Part least likely to...". ''Computer Business Review''.
* Computergram (11 November 1994). "Mitsubishi Electric Is Ready To Sample Its First Alpha At Last". ''Computer Business Review''.
* Computergram (25 November 1994). "Mitsubishi's First Alpha Provides The Same Functionality As DEC's 21066A". ''Computer Business Review''.
* Digital Equipment Corporation (10 January 1994). "Digital Introduces PCI-Based System Logic Chipsets For Alpha AXP 21064 Microprocessors And Announces The Industry's First PCI To PCI Bridge Chip". Press release.
* Dobberpuhl, Daniel W.; Witek, Richard T.; et al. "A 200-MHz 64-bit Dual-issue CMOS Microprocessor". ''Digital Technical Journal'', Volume 4, Number 4, Special Issue 1992, pp. 35–50.
* Gwennap, Linley (12 September 1994). "Digital Leads the Pack with 21164". ''Microprocessor Report'', Volume 8, Number 12.
* Krause, Reinhardt (13 September 1993). "DEC unveils two Alphas in PCI, embedded drive". ''Electronic News''.
* Krause, Reinhardt (18 October 1993). "DEC readies 225/275MHz Alphas". ''Electronic News''.
* Krause, Reinhardt (21 November 1994). "Alpha partners roll 233MHz 21066A". ''Electronic News''.
* McKinney, Dina L.; et al. "Digital's DECchip 21066: The First Cost-focused Alpha AXP Chip". ''Digital Technical Journal'', Volume 6, Number 1, Winter 1994, pp. 66–77.
* McLellan, Edward (June 1993). "The Alpha AXP Architecture and 21064 Processor". ''IEEE Micro''. pp. 36–47.
* Ryan, Bob; Thompson, Tom (January 1994). "RISC Grows Up". ''Byte''.


Further reading

* "DEC Enters Microprocessor Business with Alpha" (4 March 1992). ''Microprocessor Report'', Volume 6, Number 3.
* "DEC's Alpha Architecture Premiers" (4 March 1992). ''Microprocessor Report'', Volume 6, Number 3.
* "Digital Plans Broad Alpha Processor Family" (18 November 1992). ''Microprocessor Report'', Volume 6, Number 3.
* "Digital Reveals PCI Chip Sets For Alpha" (12 July 1993). ''Microprocessor Report'', Volume 7, Number 9.
* "Alpha Hits Low End with Digital's 21066" (13 September 1993). ''Microprocessor Report'', Volume 7, Number 12.
* Bhandarkar, Dileep P. (1995). ''Alpha Architecture and Implementations''. Digital Press.
* Fox, Thomas F. (1994). "The design of high-performance microprocessors at Digital". ''Proceedings of the 31st Annual ACM-IEEE Design Automation Conference''. pp. 586–591.
* Gronowski, Paul E.; et al. (May 1998). "High-performance microprocessor design". ''IEEE Journal of Solid-State Circuits'' 33 (5): pp. 676–686.


See also

* AlphaVM: A full DEC Alpha system emulator running on Windows or Linux. It contains a high-performance emulator of the Alpha CPU.