AMD Steamroller Family 15h is a microarchitecture developed by AMD for AMD APUs, which succeeded Piledriver in the beginning of 2014 as the third-generation Bulldozer-based microarchitecture.
Like their predecessors, Steamroller APUs continue to use two-core modules, while aiming for greater levels of parallelism.
Microarchitecture
''Steamroller'' retains the two-core module design of ''Bulldozer'' and ''Piledriver'', called clustered multi-thread (CMT), meaning that one module is marketed as a dual-core processor.
The focus of ''Steamroller'' is greater parallelism. Improvements center on independent instruction decoders for each core within a module, 25% more maximum-width dispatches per thread, better instruction schedulers, an improved perceptron branch predictor, larger and smarter caches, up to 30% fewer instruction cache misses, a branch misprediction rate reduced by 20%, a dynamically resizable L2 cache, a micro-operations queue, more internal register resources, and an improved memory controller.
AMD estimated that these improvements would increase instructions per cycle (IPC) by up to 30% compared to the first-generation Bulldozer core, while maintaining Piledriver's high clock rates with decreased power consumption.
The final result was a 9% single-threaded IPC improvement and an 18% multi-threaded IPC improvement over Piledriver.
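How the IPC figures above translate into delivered performance can be sketched as follows. This is a minimal illustration, assuming throughput is simply IPC multiplied by clock frequency; the function names and the `clock_ratio` parameter are hypothetical and not taken from AMD material.

```python
def relative_throughput(ipc_gain: float, clock_ratio: float = 1.0) -> float:
    """Throughput relative to the predecessor, modelled as IPC x clock.

    ipc_gain    -- fractional IPC improvement (0.09 means 9%)
    clock_ratio -- new clock divided by old clock (1.0 means equal clocks)
    """
    return (1.0 + ipc_gain) * clock_ratio


def break_even_clock_ratio(ipc_gain: float) -> float:
    """Clock ratio at which the IPC gain is exactly cancelled out."""
    return 1.0 / (1.0 + ipc_gain)


# IPC gains over Piledriver quoted in the text, at equal clock rates.
single_threaded = relative_throughput(0.09)
multi_threaded = relative_throughput(0.18)

print(f"single-threaded: {single_threaded:.2f}x")
print(f"multi-threaded:  {multi_threaded:.2f}x")
print(f"break-even clock ratio (single-threaded): {break_even_clock_ratio(0.09):.3f}")
```

Under this simple model, an IPC gain only yields a net speed-up if the clock rate does not fall below the break-even ratio, which is why maintaining the predecessor's clock rates matters.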
Steamroller, the microarchitecture for CPUs, and Graphics Core Next, the microarchitecture for GPUs, are paired together in the APU lines to support features specified in Heterogeneous System Architecture.
History
In 2011, AMD announced a third-generation Bulldozer-based line of processors for 2013, with ''Next Generation Bulldozer'' as the working title, using the 28 nm manufacturing process.
On 21 September 2011, leaked AMD slides indicated that this third generation of Bulldozer core was codenamed ''Steamroller''.
In January 2014, the first ''Kaveri'' APUs became available.
From May 2015 to March 2016, new APUs were launched as Kaveri-refresh (codenamed Godavari).
Features
APU features table
Processors
APU lines
# Kaveri A-series APU
#* Desktop budget and mainstream markets (FM2+): The ''Trinity'' / ''Richland'' APU line was replaced in January 2014 by the ''Kaveri'' APU line, as the third generation of A10, A8, A6 and A4 series for the desktop market. The top-of-the-line model in 2014 was the quad-core A10-7850K APU, with a 3.7 GHz core frequency and 4 MB of L2 cache, incorporating a 720 MHz GPU with 512 stream processors and over 856 GFLOPS of total processing power. In 2015 and 2016, new models with two to four enhanced ''Steamroller B'' cores were released as Kaveri-refresh / Godavari. The A10-7890K, the new top-of-the-line model, features an increased core frequency of 4.1 GHz and an 866 MHz GPU.
#* Two or four CPU cores based on the Steamroller microarchitecture
#* Socket FM2+ only (Socket FM2 is ''not'' supported), with support for PCIe 3.0
#* DDR3 dual-channel (2x64-bit) memory controller
#* AMD Heterogeneous System Architecture (HSA) 2.0
#* SIP blocks: Unified Video Decoder, Video Coding Engine, TrueAudio
#* Three to eight Compute Units (CUs) based on the revised GCN 2nd gen microarchitecture; 1 Compute Unit (CU) consists of 64 Unified Shader Processors : 4 Texture Mapping Units (TMUs) : 1 Render Output Unit (ROP)
#* AMD Eyefinity for up to 4 monitors, 4K Ultra HD support, DisplayPort 1.2 support
#* Select models support AMD Hybrid Graphics by using a Radeon R7 240 or R7 250 discrete graphics card.
#* Integrated custom ARM Cortex-A5 co-processor with TrustZone Security Extensions
# Berlin APU - canceled
#* Announced by AMD in 2013, the ''Berlin'' APUs were targeted at the enterprise and server markets, featuring four ''Steamroller'' cores, up to 512 stream processors, and support for ECC memory.
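The Kaveri figures listed above can be cross-checked with a short calculation. The sketch below derives the GPU resource counts from the per-CU ratio and estimates peak throughput for the A10-7850K. The two FLOPs-per-cycle constants are assumptions that the text does not state (one fused multiply-add per shader per cycle, and eight floating-point operations per CPU core per cycle), so the result is an estimate rather than AMD's published derivation.

```python
# Figures taken from the Kaveri list above (A10-7850K).
COMPUTE_UNITS = 8          # top of the "three to eight" range
SHADERS_PER_CU = 64
TMUS_PER_CU = 4
ROPS_PER_CU = 1
GPU_CLOCK_GHZ = 0.720
CPU_CORES = 4
CPU_CLOCK_GHZ = 3.7

# Assumptions, not stated in the text:
GPU_FLOPS_PER_SHADER_PER_CYCLE = 2   # one fused multiply-add per cycle
CPU_FLOPS_PER_CORE_PER_CYCLE = 8     # assumed peak rate per core


def gpu_resources(compute_units: int) -> dict:
    """Scale the per-CU ratio (64 shaders : 4 TMUs : 1 ROP) to a whole GPU."""
    return {
        "shaders": compute_units * SHADERS_PER_CU,
        "tmus": compute_units * TMUS_PER_CU,
        "rops": compute_units * ROPS_PER_CU,
    }


def peak_gflops(units: int, flops_per_cycle: int, clock_ghz: float) -> float:
    """Theoretical peak: execution units x FLOPs per cycle x clock in GHz."""
    return units * flops_per_cycle * clock_ghz


resources = gpu_resources(COMPUTE_UNITS)
gpu_gflops = peak_gflops(resources["shaders"], GPU_FLOPS_PER_SHADER_PER_CYCLE, GPU_CLOCK_GHZ)
cpu_gflops = peak_gflops(CPU_CORES, CPU_FLOPS_PER_CORE_PER_CYCLE, CPU_CLOCK_GHZ)
total_gflops = gpu_gflops + cpu_gflops

print(resources)
print(f"GPU {gpu_gflops:.1f} + CPU {cpu_gflops:.1f} = {total_gflops:.1f} GFLOPS")
```

With eight CUs the sketch yields the 512 stream processors quoted for the A10-7850K, and under the stated assumptions the combined peak comes to roughly 856 GFLOPS, in line with the quoted figure.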
FX lines (discontinued)
In November 2013, AMD confirmed it would not update the FX series in 2014: the Socket AM3+ version would not be refreshed, nor would the series receive a ''Steamroller'' version with a new socket.
AMD did, however, release the Kaveri-based FX-770K for desktop and FX-7600P for mobile, which are essentially APUs with their integrated graphics disabled, similar to the Athlon X4 FM2+ line. These parts were released to OEMs only.
Server lines (canceled)
AMD's server roadmaps for 2014 showed:
* ''Berlin'' APU - quad-core x86 Steamroller architecture (as described above) for 1 Processor (1P) compute and media clusters
* ''Berlin'' CPU - quad-core x86 Steamroller architecture for 1P web and enterprise services clusters
* ''Seattle'' CPU - 4/8 core AArch64 Cortex-A57 architecture (Opteron A1100) for 1P web and enterprise services clusters
* ''Warsaw'' CPU - up to 16 core x86 Piledriver (2nd gen Bulldozer) architecture (Opteron 6338P and 6370P) for 2P/4P servers
However, plans for ''Steamroller'' Opteron products were canceled, likely due to the poor energy efficiency achieved in this generation of the ''Bulldozer'' architecture. Energy efficiency improved greatly in the following generation, ''Excavator'', which exceeded ''Jaguar'' in performance per watt and approximately doubled performance per watt over ''Steamroller'' (for example, 20.74 pt/W versus 10.85 pt/W when comparing similar mobile APUs using rough, arbitrary metrics).
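The "approximately doubled" claim can be checked directly from the two example figures. This is a minimal sketch; the helper name `efficiency_ratio` is hypothetical, and the inputs are only the rough points-per-watt values quoted above.

```python
def efficiency_ratio(new_pts_per_watt: float, old_pts_per_watt: float) -> float:
    """Return how many times more efficient the newer part is."""
    if old_pts_per_watt <= 0:
        raise ValueError("baseline efficiency must be positive")
    return new_pts_per_watt / old_pts_per_watt


# Example figures from the text: Excavator versus Steamroller mobile APUs.
ratio = efficiency_ratio(20.74, 10.85)
print(f"Excavator / Steamroller: {ratio:.2f}x")  # prints 1.91x
```

A ratio of about 1.91 is consistent with describing the improvement as roughly a doubling.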