XOR Swap Algorithm
   HOME

TheInfoList



OR:

In
computer programming Computer programming or coding is the composition of sequences of instructions, called computer program, programs, that computers can follow to perform tasks. It involves designing and implementing algorithms, step-by-step specifications of proc ...
, the exclusive or swap (sometimes shortened to XOR swap) is an
algorithm In mathematics and computer science, an algorithm () is a finite sequence of Rigour#Mathematics, mathematically rigorous instructions, typically used to solve a class of specific Computational problem, problems or to perform a computation. Algo ...
that uses the
exclusive or Exclusive or, exclusive disjunction, exclusive alternation, logical non-equivalence, or logical inequality is a logical operator whose negation is the logical biconditional. With two inputs, XOR is true if and only if the inputs differ (on ...
bitwise operation In computer programming, a bitwise operation operates on a bit string, a bit array or a binary numeral (considered as a bit string) at the level of its individual bits. It is a fast and simple action, basic to the higher-level arithmetic operatio ...
to swap the values of two variables without using the temporary variable which is normally required. The algorithm is primarily a novelty and a way of demonstrating properties of the ''exclusive or'' operation. It is sometimes discussed as a
program optimization In computer science, program optimization, code optimization, or software optimization is the process of modifying a software system to make some aspect of it work more efficiently or use fewer resources. In general, a computer program may be op ...
, but there are almost no cases where swapping via ''exclusive or'' provides benefit over the standard, obvious technique.


The algorithm

Conventional swapping requires the use of a temporary storage variable. Using the XOR swap algorithm, however, no temporary storage is needed. The algorithm is as follows: X := Y XOR X; // XOR the values and store the result in X Y := X XOR Y; // XOR the values and store the result in Y X := Y XOR X; // XOR the values and store the result in X Since XOR is a
commutative operation In mathematics, a binary operation is commutative if changing the order of the operands does not change the result. It is a fundamental property of many binary operations, and many mathematical proofs depend on it. Perhaps most familiar as a pr ...
, either X XOR Y or Y XOR X can be used interchangeably in any of the foregoing three lines. Note that on some architectures the first operand of the XOR instruction specifies the target location at which the result of the operation is stored, preventing this interchangeability. The algorithm typically corresponds to three machine-code instructions, represented by corresponding pseudocode and assembly instructions in the three rows of the following table: In the above System/370 assembly code sample, R1 and R2 are distinct
register Register or registration may refer to: Arts, entertainment, and media Music * Register (music), the relative "height" or range of a note, melody, part, instrument, etc. * ''Register'', a 2017 album by Travis Miller * Registration (organ), ...
s, and each operation leaves its result in the register named in the first argument. Using x86 assembly, values X and Y are in registers eax and ebx (respectively), and places the result of the operation in the first register. In RISC-V assembly, value X and Y are in registers x10 and x11, and places the result of the operation in the first operand. However, in the pseudocode or high-level language version or implementation, the algorithm fails if ''x'' and ''y'' use the same storage location, since the value stored in that location will be zeroed out by the first XOR instruction, and then remain zero; it will not be "swapped with itself". This is ''not'' the same as if ''x'' and ''y'' have the same values. The trouble only comes when ''x'' and ''y'' use the same storage location, in which case their values must already be equal. That is, if ''x'' and ''y'' use the same storage location, then the line: X := X XOR Y sets ''x'' to zero (because ''x'' = ''y'' so X XOR Y is zero) ''and'' sets ''y'' to zero (since it uses the same storage location), causing ''x'' and ''y'' to lose their original values.


Proof of correctness

The
binary operation In mathematics, a binary operation or dyadic operation is a rule for combining two elements (called operands) to produce another element. More formally, a binary operation is an operation of arity two. More specifically, a binary operation ...
XOR over bit strings of length N exhibits the following properties (where \oplus denotes XOR): * L1.
Commutativity In mathematics, a binary operation is commutative if changing the order of the operands does not change the result. It is a fundamental property of many binary operations, and many mathematical proofs depend on it. Perhaps most familiar as a p ...
: A \oplus B = B \oplus A * L2.
Associativity In mathematics, the associative property is a property of some binary operations that rearranging the parentheses in an expression will not change the result. In propositional logic, associativity is a Validity (logic), valid rule of replaceme ...
: (A \oplus B) \oplus C = A \oplus (B \oplus C) * L3. Identity exists: there is a bit string, 0, (of length ''N'') such that A \oplus 0 = A for any A * L4. Each element is its own
inverse Inverse or invert may refer to: Science and mathematics * Inverse (logic), a type of conditional sentence which is an immediate inference made from another conditional sentence * Additive inverse, the inverse of a number that, when added to the ...
: for each A, A \oplus A = 0. Suppose that we have two distinct registers R1 and R2 as in the table below, with initial values ''A'' and ''B'' respectively. We perform the operations below in sequence, and reduce our results using the properties listed above.


Linear algebra interpretation

As XOR can be interpreted as binary addition and a pair of bits can be interpreted as a vector in a two-dimensional
vector space In mathematics and physics, a vector space (also called a linear space) is a set (mathematics), set whose elements, often called vector (mathematics and physics), ''vectors'', can be added together and multiplied ("scaled") by numbers called sc ...
over the
field with two elements (also denoted \mathbb F_2, or \mathbb Z/2\mathbb Z) is the finite field with two elements. is the Field (mathematics), field with the smallest possible number of elements, and is unique if the additive identity and the multiplicative identity ...
, the steps in the algorithm can be interpreted as multiplication by 2×2 matrices over the field with two elements. For simplicity, assume initially that ''x'' and ''y'' are each single bits, not bit vectors. For example, the step: X := X XOR Y which also has the implicit: Y := Y corresponds to the matrix \left(\begin1 & 1\\0 & 1\end\right) as :\begin1 & 1\\0 & 1\end \beginx\\y\end = \beginx+y\\y\end. The sequence of operations is then expressed as: : \begin1 & 1\\0 & 1\end \begin1 & 0\\1 & 1\end \begin1 & 1\\0 & 1\end = \begin0 & 1\\1 & 0\end (working with binary values, so 1 + 1 = 0), which expresses the
elementary matrix In mathematics, an elementary matrix is a square matrix obtained from the application of a single elementary row operation to the identity matrix. The elementary matrices generate the general linear group when is a field. Left multiplication (p ...
of switching two rows (or columns) in terms of the transvections (shears) of adding one element to the other. To generalize to where X and Y are not single bits, but instead bit vectors of length ''n'', these 2×2 matrices are replaced by 2''n''×2''n''
block matrices In mathematics, a block matrix or a partitioned matrix is a matrix that is interpreted as having been broken into sections called blocks or submatrices. Intuitively, a matrix interpreted as a block matrix can be visualized as the original matrix w ...
such as \left(\beginI_n & I_n\\0 & I_n\end\right). These matrices are operating on ''values,'' not on ''variables'' (with storage locations), hence this interpretation abstracts away from issues of storage location and the problem of both variables sharing the same storage location.


Code example

A C function that implements the XOR swap algorithm: void xor_swap(int *x, int *y) The code first checks if the addresses are distinct and uses a guard clause to exit the function early if they are equal. Without that check, if they were equal, the algorithm would fold to a triple *x ^= *x resulting in zero. The XOR swap algorithm can also be defined with a macro: #define XORSWAP_UNSAFE(a, b) \ ((a) ^= (b), (b) ^= (a), \ (a) ^= (b)) /* Doesn't work when a and b are the same object - assigns zero \ (0) to the object in that case */ #define XORSWAP(a, b) \ ((&(a)

&(b)) ? (a) /* Check for distinct addresses */ \ : XORSWAP_UNSAFE(a, b))


Reasons for avoidance in practice

On modern
CPU architecture In computer science and computer engineering, computer architecture is a description of the structure of a computer system made from component parts. It can sometimes be a high-level description that ignores details of the implementation. At a mo ...
s, the XOR technique can be slower than using a temporary variable to do swapping. At least on recent x86 CPUs, both by AMD and Intel, moving between registers regularly incurs zero latency. (This is called MOV-elimination.) Even if there is not any architectural register available to use, the XCHG instruction will be at least as fast as the three XORs taken together. Another reason is that modern CPUs strive to execute instructions in parallel via
instruction pipeline In computer engineering, instruction pipelining is a technique for implementing instruction-level parallelism within a single processor. Pipelining attempts to keep every part of the processor busy with some instruction by dividing incoming Mac ...
s. In the XOR technique, the inputs to each operation depend on the results of the previous operation, so they must be executed in strictly sequential order, negating any benefits of
instruction-level parallelism Instruction-level parallelism (ILP) is the Parallel computing, parallel or simultaneous execution of a sequence of Instruction set, instructions in a computer program. More specifically, ILP refers to the average number of instructions run per st ...
.


Aliasing

The XOR swap is also complicated in practice by
aliasing In signal processing and related disciplines, aliasing is a phenomenon that a reconstructed signal from samples of the original signal contains low frequency components that are not present in the original one. This is caused when, in the ori ...
. If an attempt is made to XOR-swap the contents of some location with itself, the result is that the location is zeroed out and its value lost. Therefore, XOR swapping must not be used blindly in a high-level language if aliasing is possible. This issue does not apply if the technique is used in assembly to swap the contents of two registers. Similar problems occur with
call by name In a programming language, an evaluation strategy is a set of rules for evaluating expressions. The term is often used to refer to the more specific notion of a ''parameter-passing strategy'' that defines the kind of value that is passed to the ...
, as in Jensen's Device, where swapping i and A /code> via a temporary variable yields incorrect results due to the arguments being related: swapping via temp = i; i = A A = temp changes the value for i in the second statement, which then results in the incorrect i value for A /code> in the third statement.


Variations

The underlying principle of the XOR swap algorithm can be applied to any operation meeting criteria L1 through L4 above. Replacing XOR by addition and subtraction gives various slightly different, but largely equivalent, formulations. For example: void add_swap(unsigned int* x, unsigned int* y) Unlike the XOR swap, this variation requires that the underlying processor or programming language uses a method such as
modular arithmetic In mathematics, modular arithmetic is a system of arithmetic operations for integers, other than the usual ones from elementary arithmetic, where numbers "wrap around" when reaching a certain value, called the modulus. The modern approach to mo ...
or
bignum In computer science, arbitrary-precision arithmetic, also called bignum arithmetic, multiple-precision arithmetic, or sometimes infinite-precision arithmetic, indicates that calculations are performed on numbers whose numerical digit, digits of p ...
s to guarantee that the computation of X + Y cannot cause an error due to
integer overflow In computer programming, an integer overflow occurs when an arithmetic operation on integers attempts to create a numeric value that is outside of the range that can be represented with a given number of digits – either higher than the maximu ...
. Therefore, it is seen even more rarely in practice than the XOR swap. However, the implementation of AddSwap above in the C programming language always works even in case of integer overflow, since, according to the C standard, addition and subtraction of unsigned integers follow the rules of
modular arithmetic In mathematics, modular arithmetic is a system of arithmetic operations for integers, other than the usual ones from elementary arithmetic, where numbers "wrap around" when reaching a certain value, called the modulus. The modern approach to mo ...
, i. e. are done in the
cyclic group In abstract algebra, a cyclic group or monogenous group is a Group (mathematics), group, denoted C_n (also frequently \Z_n or Z_n, not to be confused with the commutative ring of P-adic number, -adic numbers), that is Generating set of a group, ge ...
\mathbb Z/2^s\mathbb Z where s is the number of bits of unsigned int. Indeed, the correctness of the algorithm follows from the fact that the formulas (x + y) - y = x and (x + y) - ((x + y) - y) = y hold in any
abelian group In mathematics, an abelian group, also called a commutative group, is a group in which the result of applying the group operation to two group elements does not depend on the order in which they are written. That is, the group operation is commu ...
. This generalizes the proof for the XOR swap algorithm: XOR is both the addition and subtraction in the abelian group (\mathbb Z/2\mathbb Z)^ (which is the
direct sum The direct sum is an operation between structures in abstract algebra, a branch of mathematics. It is defined differently but analogously for different kinds of structures. As an example, the direct sum of two abelian groups A and B is anothe ...
of ''s'' copies of \mathbb Z/2\mathbb Z). This doesn't hold when dealing with the signed int type (the default for int). Signed integer overflow is an undefined behavior in C and thus modular arithmetic is not guaranteed by the standard, which may lead to incorrect results. The sequence of operations in AddSwap can be expressed via matrix multiplication as: : \begin1 & -1\\0 & 1\end \begin1 & 0\\1 & -1\end \begin1 & 1\\0 & 1\end = \begin0 & 1\\1 & 0\end


Application to register allocation

On architectures lacking a dedicated swap instruction, because it avoids the extra temporary register, the XOR swap algorithm is required for optimal
register allocation In compiler optimization, register allocation is the process of assigning local automatic variables and Expression (computer science), expression results to a limited number of processor registers. Register allocation can happen over a basic bloc ...
. This is particularly important for
compilers In computing, a compiler is a computer program that translates computer code written in one programming language (the ''source'' language) into another language (the ''target'' language). The name "compiler" is primarily used for programs tha ...
using
static single assignment form In compiler design, static single assignment form (often abbreviated as SSA form or simply SSA) is a type of intermediate representation (IR) where each variable is assigned exactly once. SSA is used in most high-quality optimizing compilers for ...
for register allocation; these compilers occasionally produce programs that need to swap two registers when no registers are free. The XOR swap algorithm avoids the need to reserve an extra register or to spill any registers to main memory. The addition/subtraction variant can also be used for the same purpose. This method of register allocation is particularly relevant to
GPU A graphics processing unit (GPU) is a specialized electronic circuit designed for digital image processing and to accelerate computer graphics, being present either as a discrete video card or embedded on motherboards, mobile phones, personal ...
shader compilers. On modern GPU architectures, spilling variables is expensive due to limited memory bandwidth and high memory latency, while limiting register usage can improve performance due to dynamic partitioning of the
register file A register file is an array of processor registers in a central processing unit (CPU). The instruction set architecture of a CPU will almost always define a set of registers which are used to stage data between memory and the functional units on ...
. The XOR swap algorithm is therefore required by some GPU compilers.


See also

*
Symmetric difference In mathematics, the symmetric difference of two sets, also known as the disjunctive union and set sum, is the set of elements which are in either of the sets, but not in their intersection. For example, the symmetric difference of the sets \ and ...
*
XOR linked list An XOR linked list is a type of data structure used in computer programming. It takes advantage of the bitwise XOR operation to decrease storage requirements for doubly linked lists by storing the composition of both addresses in one field. While ...
*
Feistel cipher In cryptography, a Feistel cipher (also known as Luby–Rackoff block cipher) is a symmetric structure used in the construction of block ciphers, named after the German-born physicist and cryptographer Horst Feistel, who did pioneering resear ...
(the XOR swap algorithm is a degenerate form of a Feistel cipher)


Notes


References

{{Reflist Algorithms Articles with example C code Binary arithmetic fr:Échange (informatique)#En utilisant l.27op.C3.A9ration XOR pl:Zamiana wartości zmiennych