The SSE5 (short for Streaming SIMD Extensions version 5) was a
SIMD
Single instruction, multiple data (SIMD) is a type of parallel computer, parallel processing in Flynn's taxonomy. SIMD describes computers with multiple processing elements that perform the same operation on multiple data points simultaneousl ...
instruction set extension proposed by
AMD
Advanced Micro Devices, Inc. (AMD) is an American multinational corporation and technology company headquartered in Santa Clara, California and maintains significant operations in Austin, Texas. AMD is a hardware and fabless company that de ...
on August 30, 2007 as a supplement to the 128-bit
SSE core instructions in the
AMD64
x86-64 (also known as x64, x86_64, AMD64, and Intel 64) is a 64-bit extension of the x86 instruction set. It was announced in 1999 and first available in the AMD Opteron family in 2003. It introduces two new operating modes: 64-bit mode an ...
architecture.
AMD chose not to implement SSE5 as originally proposed. In May 2009, AMD replaced SSE5 with three smaller instruction set extensions named as
XOP,
FMA4, and
F16C, which retain the proposed functionality of SSE5, but encode the instructions differently for better compatibility with Intel's proposed
AVX instruction set.
The three SSE5-derived instruction sets were introduced in the
Bulldozer
A bulldozer or dozer (also called a crawler) is a large tractor equipped with a metal #Blade, blade at the front for pushing material (soil, sand, snow, rubble, or rock) during construction work. It travels most commonly on continuous tracks, ...
processor core, released in October 2011 on a
32 nm
The "32 nm" node is the step following the "45 nm" process in CMOS (MOSFET) semiconductor device fabrication. "32-nanometre" refers to the average half-pitch (i.e., half the distance between identical features) of a memory cell at this technolo ...
process.
Compatibility
AMD's SSE5 extension bundle does not include the full set of
Intel
Intel Corporation is an American multinational corporation and technology company headquartered in Santa Clara, California, and Delaware General Corporation Law, incorporated in Delaware. Intel designs, manufactures, and sells computer compo ...
's
SSE4
SSE4 (Streaming SIMD Extensions 4) is a SIMD CPU instruction set used in the Intel Core microarchitecture and AMD K10 (K8L). It was announced on September 27, 2006, at the Fall 2006 Intel Developer Forum, with vague details in a white paper;< ...
instructions, making it a competitor to SSE4 rather than a successor.
SSE5 enhancements
The proposed SSE5 instruction set consisted of 170 instructions (including 46 base instructions), many of which are designed to improve single-threaded performance. Some SSE5 instructions are
3-operand instructions, the use of which will increase the average number of
instructions per cycle
In computer architecture, instructions per cycle (IPC), commonly called instructions per clock, is one aspect of a processor's performance: the average number of instructions executed for each clock cycle. It is the multiplicative inverse of c ...
achievable by
x86
x86 (also known as 80x86 or the 8086 family) is a family of complex instruction set computer (CISC) instruction set architectures initially developed by Intel, based on the 8086 microprocessor and its 8-bit-external-bus variant, the 8088. Th ...
code.
Selected new instructions include:
*
Fused multiply–accumulate (FMACxx) instructions
* Integer
multiply–accumulate (IMAC, IMADC) instructions
* Permutation (PPERM, PERMPx) and conditional move (PCMOV) instructions
* Precision control, rounding, and conversion instructions
AMD claimed SSE5 would provide dramatic performance improvements, particularly in
high-performance computing
High-performance computing (HPC) is the use of supercomputers and computer clusters to solve advanced computation problems.
Overview
HPC integrates systems administration (including network and security knowledge) and parallel programming into ...
(HPC),
multimedia
Multimedia is a form of communication that uses a combination of different content forms, such as Text (literary theory), writing, Sound, audio, images, animations, or video, into a single presentation. T ...
, and
computer security
Computer security (also cybersecurity, digital security, or information technology (IT) security) is a subdiscipline within the field of information security. It consists of the protection of computer software, systems and computer network, n ...
applications, including a 5x performance gain for
AES encryption
The Advanced Encryption Standard (AES), also known by its original name Rijndael (), is a specification for the encryption of electronic data established by the U.S. National Institute of Standards and Technology (NIST) in 2001.
AES is a variant ...
and a 30% performance gain for the
discrete cosine transform
A discrete cosine transform (DCT) expresses a finite sequence of data points in terms of a sum of cosine functions oscillating at different frequency, frequencies. The DCT, first proposed by Nasir Ahmed (engineer), Nasir Ahmed in 1972, is a widely ...
(DCT) used for example in video processing.
2009 revision
The SSE5 specification included a proposed extension to the general coding scheme of
x86
x86 (also known as 80x86 or the 8086 family) is a family of complex instruction set computer (CISC) instruction set architectures initially developed by Intel, based on the 8086 microprocessor and its 8-bit-external-bus variant, the 8088. Th ...
instructions in order to allow instructions to have more than two operands. In 2008,
Intel
Intel Corporation is an American multinational corporation and technology company headquartered in Santa Clara, California, and Delaware General Corporation Law, incorporated in Delaware. Intel designs, manufactures, and sells computer compo ...
announced their planned
AVX instruction set which proposed a different way of coding instructions with more than two operands. The two proposed coding schemes, SSE5 and AVX, are mutually incompatible, although the AVX scheme has certain advantages over the SSE5 scheme: most importantly, AVX has plenty of space for future extensions, including larger vector sizes.
In May 2009, AMD published a revised specification for the planned future instructions. This revision changes the coding scheme to make it compatible with the AVX scheme, but with a differing prefix byte in order to avoid overlap between instructions introduced by AMD and instructions introduced by Intel.
The revised instruction set no longer carries the name SSE5, which has been criticized for being misleading, but most of the instructions in the new revision are functionally identical to the original SSE5 specification—only the way the instructions are coded differs. The planned additions to the AMD instruction set consists of three subsets:
#
XOP: Integer vector
multiply–accumulate instructions, integer vector horizontal addition, integer vector compare, shift and rotate instructions, byte permutation and conditional move instructions, floating point fraction extraction.
#
FMA4: Floating-point vector
multiply–accumulate.
#
F16C:
Half-precision floating-point conversion.
Both XOP and FMA4 are removed in newer AMD processors using the
Zen microarchitecture.
See also
*
x86 instruction listings
*
Fused multiply–add
References
External links
A New SSE Instruction Set: AMD Announces SSE5 AnandTech
''AnandTech'' was an online computer hardware magazine owned by Future plc. It was founded in April 1997 by then-14-year-old Anand Lal Shimpi, who was CEO and editor-in-chief until August 2014, with Ryan Smith replacing him as editor-in-chief. ...
AMD64 Architecture Programmer’s Manual Volume 6: 128-Bit and 256-Bit XOP and FMA4 InstructionsAMD and Intel incompatible - What to do? AMD Developer Forums
{{Multimedia extensions, state=uncollapsed
X86 instructions
SIMD computing
AMD technologies