The NetBurst microarchitecture, called P68 inside Intel, was the successor to the P6 microarchitecture in the x86 family of central processing units (CPUs) made by Intel. The first CPU to use this architecture was the Willamette-core Pentium 4, released on November 20, 2000, and the first of the Pentium 4 CPUs; all subsequent Pentium 4 and Pentium D variants were also based on NetBurst. In mid-2001, Intel released the ''Foster'' core, which was also based on NetBurst, thus switching the Xeon CPUs to the new architecture as well. Pentium 4-based Celeron CPUs also use the NetBurst architecture.
NetBurst was replaced by the Core microarchitecture, which is based on P6 and was released in July 2006.
Technology
The NetBurst microarchitecture includes features such as Hyper-threading, Hyper Pipelined Technology, Rapid Execution Engine, Execution Trace Cache, and the replay system, all of which were introduced for the first time in this microarchitecture; some never appeared again afterwards.
Hyper-threading
Hyper-threading is Intel's proprietary simultaneous multithreading (SMT) implementation, used to improve parallelization of computations (doing multiple tasks at once) performed on x86 processors. Intel introduced it with NetBurst processors in 2002 and later reintroduced it in the Nehalem microarchitecture after its absence from the Core 2.
Quad-Pumped Front-Side Bus
The Northwood and Willamette cores feature an external front-side bus (FSB) that runs at 100 MHz and performs four data transfers per clock cycle, giving an effective speed of 400 MHz. Later revisions of the Northwood core, along with the Prescott core (and derivatives), have an effective 800 MHz front-side bus (200 MHz quad-pumped).
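The quad-pumping arithmetic above can be sketched in a few lines. The function names are hypothetical, and the 64-bit data-bus width used for the bandwidth figure is an assumption that is not stated in the text.

```python
def effective_transfer_rate_mts(base_clock_mhz: int, transfers_per_clock: int = 4) -> int:
    """Effective rate in megatransfers per second for a multi-pumped bus."""
    return base_clock_mhz * transfers_per_clock


def peak_bandwidth_mb_per_s(base_clock_mhz: int,
                            bus_width_bits: int = 64,
                            transfers_per_clock: int = 4) -> int:
    """Theoretical peak bandwidth in MB/s.

    bus_width_bits=64 is an assumption made for this sketch.
    """
    rate = effective_transfer_rate_mts(base_clock_mhz, transfers_per_clock)
    return rate * bus_width_bits // 8  # bits per transfer -> bytes per transfer


if __name__ == "__main__":
    for base_clock in (100, 200):  # the two base clocks named in the text
        print(base_clock,
              effective_transfer_rate_mts(base_clock),
              peak_bandwidth_mb_per_s(base_clock))
```

The "400 MHz" and "800 MHz" figures are therefore transfer rates rather than clock frequencies: the bus clock itself stays at 100 MHz or 200 MHz.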
Hyper-Pipelined Technology
The Willamette and Northwood cores contain a 20-stage instruction pipeline. This is a significant increase in the number of stages compared to the Pentium III, which had only 10 stages in its pipeline. The Prescott core increased the length of the pipeline to 31 stages. A drawback of longer pipelines is that more stages must be flushed and refilled in the event of a branch misprediction, which increases the penalty of each misprediction. To address this issue, Intel devised the Rapid Execution Engine and invested heavily in its branch prediction technology, which Intel claims reduces branch mispredictions by 33% over the Pentium III. In practice, the longer pipeline lowered the number of instructions per clock (IPC) executed, and clock speeds high enough to offset the lost performance could not be reached because power consumption and heat increased more than expected.
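The trade-off between pipeline depth and misprediction rate can be illustrated with a toy cycles-per-instruction model. This is a minimal sketch, not a measurement: it assumes that a misprediction costs about one full pipeline refill, and the base CPI, branch fraction, and Pentium III misprediction rate are hypothetical values chosen only for illustration.

```python
def effective_cpi(pipeline_stages: int,
                  mispredict_rate: float,
                  base_cpi: float = 1.0,
                  branch_fraction: float = 0.2) -> float:
    """Toy model: CPI = base CPI + stall cycles per instruction.

    Assumption: each mispredicted branch costs roughly `pipeline_stages`
    cycles, because the whole pipeline has to be refilled.
    """
    penalty_cycles = pipeline_stages
    return base_cpi + branch_fraction * mispredict_rate * penalty_cycles


# Hypothetical misprediction rate for the 10-stage design.
P3_MISPREDICT_RATE = 0.05
# Intel's claimed 33% reduction in mispredictions, applied to that rate.
NETBURST_MISPREDICT_RATE = P3_MISPREDICT_RATE * (1 - 0.33)

if __name__ == "__main__":
    print("10 stages:", effective_cpi(10, P3_MISPREDICT_RATE))
    print("20 stages:", effective_cpi(20, NETBURST_MISPREDICT_RATE))
    print("31 stages:", effective_cpi(31, NETBURST_MISPREDICT_RATE))
```

Whatever inputs are chosen, doubling the penalty while cutting mispredictions by a third multiplies the stall term by about 1.34, so in this model the better predictor does not fully pay for the deeper pipeline.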
Rapid Execution Engine
With this technology, the two arithmetic logic units (ALUs) in the core of the CPU are double-pumped, meaning that they operate at twice the core clock frequency. For example, in a 3.8 GHz processor, the ALUs effectively operate at 7.6 GHz. This is intended to make up for the low IPC count; it also considerably enhances the integer performance of the CPU. Intel also replaced the high-speed barrel shifter with a shift/rotate execution unit that operates at the same frequency as the CPU core. The downside is that certain instructions are much slower (relatively and absolutely) than before, making optimization for multiple target CPUs difficult. Shift and rotate operations are one example: they suffer from the lack of a barrel shifter, which was present on every x86 CPU beginning with the i386, including the main competitor processor, Athlon.
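The double-pumping example above is simple arithmetic, sketched here with hypothetical function names. The cycle-time helper only converts the resulting frequency into a period and says nothing about the real latency of any instruction.

```python
def alu_clock_ghz(core_clock_ghz: float, pump_factor: int = 2) -> float:
    """Effective ALU clock when the ALUs run at pump_factor times the core clock."""
    return core_clock_ghz * pump_factor


def alu_cycle_time_ns(core_clock_ghz: float, pump_factor: int = 2) -> float:
    """Length of one ALU clock period in nanoseconds (1 / frequency in GHz)."""
    return 1.0 / alu_clock_ghz(core_clock_ghz, pump_factor)


if __name__ == "__main__":
    core = 3.8  # GHz, the example used in the text
    print(alu_clock_ghz(core), alu_cycle_time_ns(core))
```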
Execution Trace Cache
Within the L1 cache of the CPU, Intel incorporated its Execution Trace Cache. It stores decoded micro-operations, so that when an instruction is executed again, the CPU reads the decoded micro-ops directly from the trace cache instead of fetching and decoding the instruction a second time, thereby saving considerable time. Moreover, the micro-ops are cached along their predicted path of execution, which means that when the CPU fetches them from the cache, they are already in the correct order of execution. Intel later introduced a similar but simpler concept with Sandy Bridge, called the micro-operation cache (UOP cache).
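The idea of caching decoded micro-ops along the predicted path can be sketched as a small software model. Everything here is hypothetical: the toy program, its addresses, and the micro-op names are made up, and the real hardware structure is far more involved.

```python
from typing import Dict, List, Optional, Tuple

# Toy program: address -> (micro-ops, predicted next address or None to end the trace).
PROGRAM: Dict[int, Tuple[List[str], Optional[int]]] = {
    0x10: (["cmp"], 0x12),
    0x12: (["jcc"], 0x40),          # branch predicted taken: the trace continues at 0x40
    0x14: (["add"], 0x16),          # fall-through path, not on the predicted path
    0x40: (["load", "add"], None),  # one instruction that decodes into two micro-ops
}


class ToyTraceCache:
    """Caches micro-op sequences keyed by the address where a trace starts."""

    def __init__(self) -> None:
        self._traces: Dict[int, List[str]] = {}
        self.decoded_instructions = 0  # counts how often the decoder had to run

    def _build_trace(self, start: int) -> List[str]:
        trace: List[str] = []
        address: Optional[int] = start
        while address is not None:
            micro_ops, address = PROGRAM[address]
            self.decoded_instructions += 1
            trace.extend(micro_ops)  # stored in predicted execution order
        return trace

    def fetch(self, start: int) -> List[str]:
        if start not in self._traces:  # miss: decode once and remember the trace
            self._traces[start] = self._build_trace(start)
        return self._traces[start]     # hit: no fetch or decode needed


if __name__ == "__main__":
    cache = ToyTraceCache()
    print(cache.fetch(0x10), cache.decoded_instructions)
    print(cache.fetch(0x10), cache.decoded_instructions)
```

On the second `fetch` the decoder counter does not change, which is the saving the paragraph above describes; the trace also skips the fall-through instruction at 0x14 because it follows the predicted branch target.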
Replay system
The replay system is a subsystem within the Intel Pentium 4 processor that catches operations that have been mistakenly sent for execution by the processor's scheduler. Operations caught by the replay system are then re-executed in a loop until the conditions necessary for their proper execution have been fulfilled.
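The re-execution loop can be modelled in a few lines. This is a toy sketch under stated assumptions: each operation is given a hypothetical cycle at which its inputs become valid, every pending operation is attempted once per cycle, and the names and cycle numbers are invented for illustration.

```python
from typing import Dict, List, Tuple


def run_with_replay(ops: List[Tuple[str, int]],
                    max_cycles: int = 100) -> Tuple[Dict[str, int], Dict[str, int]]:
    """Simulate speculative dispatch with replay.

    ops: (name, ready_cycle) pairs; an op executes correctly only if it is
    attempted at or after its ready_cycle (an assumption of this model).
    Returns (cycle at which each op completed, number of attempts per op).
    """
    pending = list(ops)
    attempts = {name: 0 for name, _ in ops}
    completed_at: Dict[str, int] = {}
    cycle = 0
    while pending and cycle < max_cycles:
        replay_queue = []
        for name, ready_cycle in pending:
            attempts[name] += 1               # the op was sent for execution
            if cycle >= ready_cycle:
                completed_at[name] = cycle    # conditions fulfilled: it retires
            else:
                replay_queue.append((name, ready_cycle))  # caught, goes around again
        pending = replay_queue
        cycle += 1
    return completed_at, attempts


if __name__ == "__main__":
    done, tries = run_with_replay([("add", 0), ("load_use", 3)])
    print(done, tries)
```

In this run the hypothetical `load_use` operation is attempted four times before it succeeds, which shows why replays consume execution slots without producing results.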
Branch prediction hints
The Intel NetBurst architecture allows branch prediction hints to be inserted into the code to indicate whether the static prediction should be taken or not taken; this feature was abandoned in later Intel processors. According to Intel, NetBurst's branch prediction algorithm is 33% better than the one in P6.
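The text does not say how the hints are encoded. Intel's IA-32 documentation describes them as one-byte instruction prefixes placed before a conditional jump: 0x2E for "not taken" and 0x3E for "taken". The sketch below assumes those values and a short conditional jump (opcodes 0x70 to 0x7F with an 8-bit displacement); the helper name is hypothetical and the encoding should be checked against the manual before use.

```python
HINT_NOT_TAKEN = 0x2E  # assumed prefix value: statically predict not taken
HINT_TAKEN = 0x3E      # assumed prefix value: statically predict taken


def hinted_short_jcc(opcode: int, rel8: int, predict_taken: bool) -> bytes:
    """Encode a short conditional jump with a branch hint prefix in front of it."""
    if not 0x70 <= opcode <= 0x7F:
        raise ValueError("expected a short Jcc opcode (0x70-0x7F)")
    if not -128 <= rel8 <= 127:
        raise ValueError("displacement does not fit in 8 bits")
    prefix = HINT_TAKEN if predict_taken else HINT_NOT_TAKEN
    return bytes([prefix, opcode, rel8 & 0xFF])  # two's-complement displacement


if __name__ == "__main__":
    # JE (0x74) forward by 16 bytes, hinted as taken.
    print(hinted_short_jcc(0x74, 0x10, predict_taken=True).hex())
    # JNE (0x75) back by 2 bytes, hinted as not taken.
    print(hinted_short_jcc(0x75, -2, predict_taken=False).hex())
```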
Scaling-up issues
Despite these enhancements, the NetBurst architecture created obstacles for engineers trying to scale up its performance. With this microarchitecture, Intel planned to attain clock speeds of 10 GHz, but as clock speeds rose, Intel faced increasing problems keeping power dissipation within acceptable limits. Intel reached a speed barrier of 3.8 GHz in November 2004 but encountered problems trying to achieve even that. Intel abandoned NetBurst in 2006 after the heat problems became unacceptable and then developed the Core microarchitecture, inspired by the P6 core used from the Pentium Pro to the ''Tualatin'' Pentium III-S, and most directly by the Pentium M.
Revisions
Intel replaced the original ''Willamette'' core with a redesigned version of the NetBurst microarchitecture called ''Northwood'' in January 2002. The ''Northwood'' design combined an increased cache size, a smaller 130 nm fabrication process, and Hyper-threading (although initially all models but the 3.06 GHz model had this feature disabled) to produce a more modern, higher-performing version of the NetBurst microarchitecture.
In February 2004, Intel introduced ''Prescott'', a more radical revision of the microarchitecture. The ''Prescott'' core was produced on a 90 nm process and included several major design changes: an even larger cache (from 512 KB in the ''Northwood'' to 1 MB, and 2 MB in Prescott 2M), a much deeper instruction pipeline (31 stages as compared to 20 in the ''Northwood''), a heavily improved branch predictor, the introduction of the SSE3 instructions, and later, the implementation of Intel Extended Memory 64 Technology (EM64T), Intel's branding for its compatible implementation of the x86-64 64-bit extension of the x86 architecture (as with hyper-threading, all ''Prescott'' chips branded Pentium 4 HT have hardware to support this feature, but it was initially only enabled on the high-end Xeon processors, before being officially introduced in processors with the Pentium trademark). Power consumption and heat dissipation also became major issues with ''Prescott'', which quickly became the hottest-running, and most power-hungry, of Intel's single-core x86 and x86-64 processors. Power and heat concerns prevented Intel from releasing a Prescott clocked above 3.8 GHz, or a mobile version of the core clocked above 3.46 GHz.
Intel also released a dual-core processor based on the NetBurst microarchitecture, branded Pentium D. The first Pentium D core, codenamed ''Smithfield'', is actually two Prescott cores on a single die; the later ''Presler'' consists of two ''Cedar Mill'' cores on two separate dies (''Cedar Mill'' being the 65 nm die-shrink of ''Prescott'').
Roadmap
Successor
Intel had NetBurst-based successors in development called Tejas and Jayhawk, with between 40 and 50 pipeline stages, but ultimately decided to replace NetBurst with the Core microarchitecture, released in July 2006; this successor was more directly derived from the Pentium Pro (P6 microarchitecture). August 8, 2008 marked the end of Intel NetBurst-based processors.
The reason for NetBurst's abandonment was the severe heat problems caused by high clock speeds. While some Core- and Nehalem-based processors have higher TDPs, most of those processors are multi-core, so each core gives off a fraction of the maximum TDP, and the highest-clocked Core-based single-core processors give off a maximum of 27 W of heat. The fastest-clocked desktop Pentium 4 processors (single-core) had TDPs of 115 W, compared to 88 W for the fastest-clocked mobile versions. With the introduction of new steppings, however, TDPs for some models were eventually lowered.
The Nehalem microarchitecture, the successor to the Core microarchitecture, was supposed to be an evolution of NetBurst according to Intel roadmaps dating back to 2000. Nehalem reimplements certain features of NetBurst, including the Hyper-Threading technology first introduced in the 3.06 GHz ''Northwood'' core, and L3 cache, first implemented on a consumer processor in the ''Gallatin'' core used in the Pentium 4 Extreme Edition.
NetBurst-based chips
* Celeron (NetBurst)
* Celeron D
* Pentium 4
* Pentium 4 Extreme Edition
* Pentium D
* Pentium Extreme Edition
* Xeon, from 2001 through 2006
See also
* Megahertz myth
* List of Intel CPU microarchitectures
* List of Intel Celeron processors (NetBurst-based)
* List of Intel Pentium 4 processors
* List of Intel Pentium D processors
* List of Intel Xeon processors (NetBurst-based)
* Tick–tock model
External links
The Microarchitecture of the Pentium 4 Processor