CLMUL instruction set
   HOME

TheInfoList



OR:

Carry-less Multiplication (CLMUL) is an extension to the
x86 x86 (also known as 80x86 or the 8086 family) is a family of complex instruction set computer (CISC) instruction set architectures initially developed by Intel based on the Intel 8086 microprocessor and its 8088 variant. The 8086 was introd ...
instruction set used by
microprocessor A microprocessor is a computer processor where the data processing logic and control is included on a single integrated circuit, or a small number of integrated circuits. The microprocessor contains the arithmetic, logic, and control circu ...
s from
Intel Intel Corporation is an American multinational corporation and technology company headquartered in Santa Clara, California. It is the world's largest semiconductor chip manufacturer by revenue, and is one of the developers of the x86 seri ...
and
AMD Advanced Micro Devices, Inc. (AMD) is an American multinational semiconductor company based in Santa Clara, California, that develops computer processors and related technologies for business and consumer markets. While it initially manufactur ...
which was proposed by Intel in March 2008 and made available in the Intel Westmere processors announced in early 2010. Mathematically, the instruction implements multiplication of polynomials over the
finite field In mathematics, a finite field or Galois field (so-named in honor of Évariste Galois) is a field that contains a finite number of elements. As with any field, a finite field is a set on which the operations of multiplication, addition, subtr ...
GF(2) where the bitstring a_0a_1\ldots a_ represents the polynomial a_0 + a_1X + a_2X^2 + \cdots + a_X^. The CLMUL instruction also allows a more efficient implementation of the closely related multiplication of larger finite fields GF(2''k'') than the traditional instruction set. One use of these instructions is to improve the speed of applications doing block cipher encryption in
Galois/Counter Mode In cryptography, Galois/Counter Mode (GCM) is a mode of operation for symmetric-key cryptographic block ciphers which is widely adopted for its performance. GCM throughput rates for state-of-the-art, high-speed communication channels can be achiev ...
, which depends on finite field GF(2''k'') multiplication. Another application is the fast calculation of CRC values, including those used to implement the
LZ77 LZ77 and LZ78 are the two lossless data compression algorithms published in papers by Abraham Lempel and Jacob Ziv in 1977 and 1978. They are also known as LZ1 and LZ2 respectively. These two algorithms form the basis for many variations includin ...
sliding window A sliding window protocol is a feature of packet-based data transmission protocols. Sliding window protocols are used where reliable in-order delivery of packets is required, such as in the data link layer (OSI layer 2) as well as in the Transm ...
DEFLATE algorithm in
zlib zlib ( or "zeta-lib", ) is a software library used for data compression. zlib was written by Jean-loup Gailly and Mark Adler and is an abstraction of the DEFLATE compression algorithm used in their gzip file compression program. zlib is also ...
and
pngcrush pngcrush is a free and open-source command-line utility for optimizing PNG image files. It reduces the size of the file losslessly that is, the resulting "crushed" image will have the same quality as the source image. The main purpose of png ...
. ARMv8 also has a version of CLMUL. SPARC calls their version XMULX, for "XOR multiplication".


New instructions

The instruction computes the 128-bit carry-less product of two 64-bit values. The destination is a 128-bit XMM register. The source may be another XMM register or memory. An immediate operand specifies which halves of the 128-bit operands are multiplied.
Mnemonics A mnemonic ( ) device, or memory device, is any learning technique that aids information retention or retrieval (remembering) in the human memory for better understanding. Mnemonics make use of elaborative encoding, retrieval cues, and imagery ...
specifying specific values of the immediate operand are also defined: A EVEX vectorized version (VPCLMULQDQ) is seen in
AVX-512 AVX-512 are 512-bit extensions to the 256-bit Advanced Vector Extensions SIMD instructions for x86 instruction set architecture (ISA) proposed by Intel in July 2013, and implemented in Intel's Xeon Phi x200 (Knights Landing) and Skylake-X CPUs; thi ...
.


CPUs with CLMUL instruction set

*
Intel Intel Corporation is an American multinational corporation and technology company headquartered in Santa Clara, California. It is the world's largest semiconductor chip manufacturer by revenue, and is one of the developers of the x86 seri ...
** Westmere processor (March 2010). **
Sandy Bridge Sandy Bridge is the codename for Intel's 32 nm microarchitecture used in the second generation of the Intel Core processors (Core i7, i5, i3). The Sandy Bridge microarchitecture is the successor to Nehalem and Westmere microarchitecture ...
processor ** Ivy Bridge processor ** Haswell processor ** Broadwell processor (with increased throughput and lower latency) ** Skylake (and later) processor **
Goldmont Goldmont is a microarchitecture for low-power Atom, Celeron and Pentium branded processors used in systems on a chip (SoCs) made by Intel. They allow only one thread per core. The ''Apollo Lake'' platform with 14 nm Goldmont core was un ...
processor *
AMD Advanced Micro Devices, Inc. (AMD) is an American multinational semiconductor company based in Santa Clara, California, that develops computer processors and related technologies for business and consumer markets. While it initially manufactur ...
: ** Jaguar-based processors and newer ** Puma-based processors and newer ** "Heavy Equipment" processors *** Bulldozer-based processors *** Piledriver-based processors *** Steamroller-based processors *** Excavator-based processors and newer **
Zen Zen ( zh, t=禪, p=Chán; ja, text= 禅, translit=zen; ko, text=선, translit=Seon; vi, text=Thiền) is a school of Mahayana Buddhism that originated in China during the Tang dynasty, known as the Chan School (''Chánzong'' 禪宗), and ...
processors **
Zen+ Zen+ is the codename for a computer processor microarchitecture by AMD. It is the successor to the first gen Zen microarchitecture, first released in April 2018, powering the second generation of Ryzen processors, known as Ryzen 2000 for mainstre ...
processors ** Zen2 (and later) processors The presence of the CLMUL instruction set can be checked by testing one of the CPU feature bits.


See also

*
Finite field arithmetic In mathematics, finite field arithmetic is arithmetic in a finite field (a field containing a finite number of elements) contrary to arithmetic in a field with an infinite number of elements, like the field of rational numbers. There are infinitel ...
*
AES instruction set An Advanced Encryption Standard instruction set is now integrated into many processors. The purpose of the instruction set is to improve the speed and security of applications performing encryption and decryption using Advanced Encryption Standard ...
*
FMA3 instruction set The FMA instruction set is an extension to the 128 and 256-bit Streaming SIMD Extensions instructions in the x86 microprocessor instruction set to perform fused multiply–add (FMA) operations."FMA3 and FMA4 are not instruction sets, they are ind ...
*
FMA4 instruction set The FMA instruction set is an extension to the 128 and 256-bit Streaming SIMD Extensions instructions in the x86 microprocessor instruction set to perform fused multiply–add (FMA) operations."FMA3 and FMA4 are not instruction sets, they are i ...
* AVX instruction set


References

{{Multimedia extensions, state=uncollapsed X86 architecture X86 instructions