middle-endian
   HOME

TheInfoList



OR:

In computing, endianness, also known as byte sex, is the order or sequence of bytes of a word of digital data in computer memory. Endianness is primarily expressed as big-endian (BE) or little-endian (LE). A big-endian system stores the most significant byte of a word at the smallest memory address and the
least significant byte In computing, bit numbering is the convention used to identify the bit positions in a binary number. Bit significance and indexing In computing, the least significant bit (LSB) is the bit position in a binary integer representing the binary 1 ...
at the largest. A little-endian system, in contrast, stores the least-significant byte at the smallest address. Bi-endianness is a feature supported by numerous computer architectures that feature switchable endianness in data fetches and stores or for instruction fetches. Other orderings are generically called middle-endian or mixed-endian. Endianness may also be used to describe the order in which the bits are transmitted over a communication channel, e.g., big-endian in a communications channel transmits the most significant bits first. Bit-endianness is seldom used in other contexts.


Etymology

Danny Cohen introduced the terms ''big-endian'' and ''little-endian'' into computer science for data ordering in an
Internet Experiment Note An Internet Experiment Note (IEN) is a sequentially numbered document in a series of technical publications issued by the participants of the early development work groups that created the precursors of the modern Internet. After DARPA began the ...
published in 1980. Also published at '' IEEE Computer''
October 1981 issue
The adjective ''endian'' has its origin in the writings of 18th century Anglo-Irish writer Jonathan Swift. In the 1726 novel ''
Gulliver's Travels ''Gulliver's Travels'', or ''Travels into Several Remote Nations of the World. In Four Parts. By Lemuel Gulliver, First a Surgeon, and then a Captain of Several Ships'' is a 1726 prose satire by the Anglo-Irish writer and clergyman Jonathan ...
'', he portrays the conflict between sects of Lilliputians divided into those breaking the shell of a boiled egg from the big end or from the little end. He called them the ''Big-Endians'' and the ''Little-Endians''. Cohen makes the connection to ''Gulliver's Travels'' explicit in the appendix to his 1980 note.


Overview

Computers store information in various-sized groups of binary bits. Each group is assigned a number, called its ''address'', that the computer uses to access that data. On most modern computers, the smallest data group with an address is eight bits long and is called a byte. Larger groups comprise two or more bytes, for example, a
32-bit In computer architecture, 32-bit computing refers to computer systems with a processor, memory, and other major system components that operate on data in 32-bit units. Compared to smaller bit widths, 32-bit computers can perform large calculation ...
word contains four bytes. There are two possible ways a computer could number the individual bytes in a larger group, starting at either end. Both types of endianness are in widespread use in digital electronic engineering. The initial choice of endianness of a new design is often arbitrary, but later technology revisions and updates perpetuate the existing endianness to maintain backward compatibility. Internally, any given computer will work equally well regardless of what endianness it uses since its hardware will consistently use the same endianness to both store and load its data. For this reason, programmers and computer users normally ignore the endianness of the computer they are working with. However, endianness can become an issue when moving data external to the computer – as when transmitting data between different computers, or a programmer investigating internal computer bytes of data from a memory dump – and the endianness used differs from expectation. In these cases, the endianness of the data must be understood and accounted for. These two diagrams show how two computers using different endianness store a 32-bit (four byte) integer with the value of . In both cases, the integer is broken into four bytes, , , , and , and the bytes are stored in four sequential byte locations in memory, starting with the memory location with address ''a'', then ''a + 1'', ''a + 2'', and ''a + 3''. The difference between big and little endian is the order of the four bytes of the integer being stored. The left-side diagram shows a computer using big-endian. This starts the storing of the integer with the ''most''-significant byte, , at address ''a'', and ends with the ''least''-significant byte, , at address ''a + 3''. The right-side diagram shows a computer using little-endian. This starts the storing of the integer with the ''least''-significant byte, , at address ''a'', and ends with the ''most''-significant byte, , at address ''a + 3''. Since each computer uses its same endianness to both store and retrieve the integer, the results will be the same for both computers. Issues may arise when memory is addressed by bytes instead of integers, or when memory contents are transmitted between computers with different endianness. Big-endianness is the dominant ordering in networking protocols, such as in the internet protocol suite, where it is referred to as network order, transmitting the most significant byte first. Conversely, little-endianness is the dominant ordering for processor architectures ( x86, most ARM implementations, base RISC-V implementations) and their associated memory.
File format A file format is a standard way that information is encoded for storage in a computer file. It specifies how bits are used to encode information in a digital storage medium. File formats may be either proprietary or free. Some file formats ...
s can use either ordering; some formats use a mixture of both or contain an indicator of which ordering is used throughout the file. The styles of little- and big-endian may also be used more generally to characterize the ordering of any representation, e.g. the digits in a numeral system or the sections of a date. Numbers in
positional notation Positional notation (or place-value notation, or positional numeral system) usually denotes the extension to any base of the Hindu–Arabic numeral system (or decimal system). More generally, a positional system is a numeral system in which the ...
are generally written with their digits in left-to-right big-endian order, even in right-to-left scripts. Similarly, programming languages use big-endian digit ordering for numeric literals.


Basics

Computer memory consists of a sequence of storage cells (smallest addressable units); in machines that support byte addressing, those units are called '' bytes''. Each byte is identified and accessed in hardware and software by its memory address. If the total number of bytes in memory is ''n'', then addresses are enumerated from 0 to ''n'' − 1. Computer programs often use data structures or fields that may consist of more data than can be stored in one byte. In the context of this article where its type cannot be arbitrarily complicated, a "field" consists of a consecutive sequence of bytes and represents a "simple data value" which – at least potentially – can be manipulated by ''one'' single hardware instruction. On most systems, the address of a multi-byte simple data value is the address of its first byte (the byte with the lowest address). Another important attribute of a byte being part of a "field" is its "significance". These attributes of the parts of a field play an important role in the sequence the bytes are accessed by the computer hardware, more precisely: by the low-level algorithms contributing to the results of a computer instruction.


Numbers

Positional number systems (mostly base 2, or less often base 10) are the predominant way of representing and particularly of manipulating integer data by computers. In pure form this is valid for moderate sized non-negative integers, e.g. of C data type
unsigned Unsigned can refer to: * An unsigned artist is a musical artist or group not attached or signed to a record label ** Unsigned Music Awards, ceremony noting achievements of unsigned artists ** Unsigned band web, online community * Similarly, the c ...
. In such a number system, the ''value'' of a digit which it contributes to the whole number is determined not only by its value as a single digit, but also by the position it holds in the complete number, called its significance. These positions can be mapped to memory mainly in two ways: * decreasing numeric significance with increasing memory addresses (or increasing time), known as ''big-endian'' and * increasing numeric significance with increasing memory addresses (or increasing time), known as ''little-endian''. The integer data that are directly supported by the
computer hardware Computer hardware includes the physical parts of a computer, such as the computer case, case, central processing unit (CPU), Random-access memory, random access memory (RAM), Computer monitor, monitor, Computer mouse, mouse, Computer keyboard, ...
have a fixed width of a low power of 2, e.g. 8 bits ≙ 1 byte, 16 bits ≙ 2 bytes, 32 bits ≙ 4 bytes, 64 bits ≙ 8 bytes, 128 bits ≙ 16 bytes. The low-level access sequence to the bytes of such a field depends on the operation to be performed. The least-significant byte is accessed first for
addition Addition (usually signified by the Plus and minus signs#Plus sign, plus symbol ) is one of the four basic Operation (mathematics), operations of arithmetic, the other three being subtraction, multiplication and Division (mathematics), division. ...
,
subtraction Subtraction is an arithmetic operation that represents the operation of removing objects from a collection. Subtraction is signified by the minus sign, . For example, in the adjacent picture, there are peaches—meaning 5 peaches with 2 taken ...
and
multiplication Multiplication (often denoted by the cross symbol , by the mid-line dot operator , by juxtaposition, or, on computers, by an asterisk ) is one of the four elementary mathematical operations of arithmetic, with the other ones being additi ...
. The most-significant byte is accessed first for division and
comparison Comparison or comparing is the act of evaluating two or more things by determining the relevant, comparable characteristics of each thing, and then determining which characteristics of each are similar to the other, which are different, and t ...
. See . For
floating-point In computing, floating-point arithmetic (FP) is arithmetic that represents real numbers approximately, using an integer with a fixed precision, called the significand, scaled by an integer exponent of a fixed base. For example, 12.345 can b ...
numbers, see .


Text

When character (text) strings are to be compared with one another, e.g. in order to support some mechanism like
sorting Sorting refers to ordering data in an increasing or decreasing manner according to some linear relationship among the data items. # ordering: arranging items in a sequence ordered by some criterion; # categorizing: grouping items with similar pro ...
, this is very frequently done lexicographically where a single positional element (character) also has a positional value. Lexicographical comparison means almost everywhere: first character ranks highest – as in the telephone book. Integer numbers written as text are always represented most significant digit first in memory, which is similar to big-endian, independently of text direction.


Hardware

Many historical and extant processors use a big-endian memory representation, either exclusively or as a design option. Other processor types use little-endian memory representation; others use yet another scheme called '' middle-endian'', ''mixed-endian'' or ''
PDP-11 The PDP-11 is a series of 16-bit minicomputers sold by Digital Equipment Corporation (DEC) from 1970 into the 1990s, one of a set of products in the Programmed Data Processor (PDP) series. In total, around 600,000 PDP-11s of all models were sold, ...
-endian''. Some instruction sets feature a setting which allows for switchable endianness in data fetches and stores, instruction fetches, or both. This feature can improve performance or simplify the logic of networking devices and software. The word ''bi-endian'', when said of hardware, denotes the capability of the machine to compute or pass data in either endian format. Dealing with data of different endianness is sometimes termed the ''NUXI problem''. This terminology alludes to the byte order conflicts encountered while adapting UNIX, which ran on the mixed-endian PDP-11, to a big-endian IBM Series/1 computer. Unix was one of the first systems to allow the same code to be compiled for platforms with different internal representations. One of the first programs converted was supposed to print out , but on the Series/1 it printed instead. The
IBM System/360 The IBM System/360 (S/360) is a family of mainframe computer systems that was announced by IBM on April 7, 1964, and delivered between 1965 and 1978. It was the first family of computers designed to cover both commercial and scientific applica ...
uses big-endian byte order, as do its successors System/370,
ESA/390 The IBM System/390 is a discontinued mainframe product family implementing the ESA/390, the fifth generation of the System/360 instruction set architecture. The first computers to use the ESA/390 were the Enterprise System/9000 (ES/9000) ...
, and
z/Architecture z/Architecture, initially and briefly called ESA Modal Extensions (ESAME), is IBM's 64-bit complex instruction set computer (CISC) instruction set architecture, implemented by its mainframe computers. IBM introduced its first z/Architecture-b ...
. The PDP-10 uses big-endian addressing for byte-oriented instructions. The
IBM Series/1 The IBM Series/1 is a 16-bit minicomputer, introduced in 1976, that in many respects competed with other minicomputers of the time, such as the PDP-11 from Digital Equipment Corporation and similar offerings from Data General and HP. The Seri ...
minicomputer uses big-endian byte order. The Datapoint 2200 used simple bit-serial logic with little-endian to facilitate
carry propagation An adder, or summer, is a digital circuit that performs addition of numbers. In many computers and other kinds of processors adders are used in the arithmetic logic units (ALUs). They are also used in other parts of the processor, where they are ...
. When Intel developed the 8008 microprocessor for Datapoint, they used little-endian for compatibility. However, as Intel was unable to deliver the 8008 in time, Datapoint used a medium-scale integration equivalent, but the little-endianness was retained in most Intel designs, including the
MCS-48 The MCS-48 microcontroller series, Intel's first microcontroller, was originally released in 1976. Its first members were 8048, 8035 and 8748. The 8048 is probably the most prominent member of the family. Initially, this family was produced us ...
and the 8086 and its x86 successors. The DEC Alpha,
Atmel AVR AVR is a family of microcontrollers developed since 1996 by Atmel, acquired by Microchip Technology in 2016. These are modified Harvard architecture 8-bit Reduced instruction set computer, RISC single-chip microcontrollers. AVR was one of the f ...
, VAX, the MOS Technology 6502 family (including Western Design Center
65802 The W65C816S (also 65C816 or 65816) is an 8/16-bit microprocessor (MPU) developed and sold by the Western Design Center (WDC). Introduced in 1985, the W65C816S is an enhanced version of the WDC 65C02 8-bit MPU, itself a CMOS enhancement of the ...
and
65C816 The W65C816S (also 65C816 or 65816) is an 8/16-bit microprocessor (MPU) developed and sold by the Western Design Center (WDC). Introduced in 1985, the W65C816S is an enhanced version of the WDC 65C02 8-bit MPU, itself a CMOS enhancement of the ve ...
), the Zilog Z80 (including
Z180 The Zilog Z180 eight-bit processor is a successor of the Z80 CPU. It is compatible with the large base of software written for the Z80. The Z180 family adds higher performance and integrated peripheral functions like clock generator, 16-bit coun ...
and eZ80), the Altera Nios II, and many other processors and processor families are also little-endian. The Motorola 6800 / 6801, the
6809 The Motorola 6809 ("''sixty-eight-oh-nine''") is an 8-bit microprocessor with some 16-bit features. It was designed by Motorola's Terry Ritter and Joel Boney and introduced in 1978. Although source compatible with the earlier Motorola 6800, the ...
and the
68000 series The Motorola 68000 series (also known as 680x0, m68000, m68k, or 68k) is a family of 32-bit complex instruction set computer (CISC) microprocessors. During the 1980s and early 1990s, they were popular in personal computers and workstations and w ...
of processors used the big-endian format. The Intel
8051 The Intel MCS-51 (commonly termed 8051) is a single chip microcontroller (MCU) series developed by Intel in 1980 for use in embedded systems. The architect of the Intel MCS-51 instruction set was John H. Wharton. Intel's original versions were po ...
, unlike other Intel processors, expects 16-bit addresses for LJMP and LCALL in big-endian format; however, xCALL instructions store the return address onto the stack in little-endian format. SPARC historically used big-endian until version 9, which is bi-endian. Similarly early IBM POWER processors were big-endian, but the
PowerPC PowerPC (with the backronym Performance Optimization With Enhanced RISC – Performance Computing, sometimes abbreviated as PPC) is a reduced instruction set computer (RISC) instruction set architecture (ISA) created by the 1991 Apple Inc., App ...
and
Power ISA Power ISA is a reduced instruction set computer (RISC) instruction set architecture (ISA) currently developed by the OpenPOWER Foundation, led by IBM. It was originally developed by IBM and the now-defunct Power.org industry group. Power IS ...
descendants are now bi-endian. The
ARM architecture ARM (stylised in lowercase as arm, formerly an acronym for Advanced RISC Machines and originally Acorn RISC Machine) is a family of reduced instruction set computer (RISC) instruction set architectures for computer processors, configured ...
was little-endian before version 3 when it became bi-endian.


Newer architectures

The IA-32 and x86-64 instruction set architectures use the little-endian format. Other instruction set architectures that follow this convention, allowing only little-endian mode, include Nios II,
Andes Technology Andes Technology Corporation is a Taiwanese supplier of 32/64-bit embedded CPU cores and a founding Premier member of RISC-V International Association.  It focuses on the embedded market and delivers CPU cores with integrated development environmen ...
NDS32, and Qualcomm Hexagon. Solely big-endian architectures include the IBM
z/Architecture z/Architecture, initially and briefly called ESA Modal Extensions (ESAME), is IBM's 64-bit complex instruction set computer (CISC) instruction set architecture, implemented by its mainframe computers. IBM introduced its first z/Architecture-b ...
and OpenRISC. Some instruction set architectures are "bi-endian" and allow running software of either endianness; these include
Power ISA Power ISA is a reduced instruction set computer (RISC) instruction set architecture (ISA) currently developed by the OpenPOWER Foundation, led by IBM. It was originally developed by IBM and the now-defunct Power.org industry group. Power IS ...
, SPARC, ARM AArch64, C-Sky, and RISC-V. IBM AIX and
IBM i IBM i (the ''i'' standing for ''integrated'') is an operating system developed by IBM for IBM Power Systems. It was originally released in 1988 as OS/400, as the sole operating system of the IBM AS/400 line of systems. It was renamed to i5/OS in ...
run in big-endian mode on bi-endian Power ISA; Linux originally ran in big-endian mode, but by 2019, IBM had transitioned to little-endian mode for Linux to ease the porting of Linux software from x86 to Power. SPARC has no relevant little-endian deployment, as both Oracle Solaris and Linux run in big-endian mode on bi-endian SPARC systems, and can be considered big-endian in practice. ARM, C-Sky, and RISC-V have no relevant big-endian deployments, and can be considered little-endian in practice.


Bi-endianness

Some architectures (including ARM versions 3 and above,
PowerPC PowerPC (with the backronym Performance Optimization With Enhanced RISC – Performance Computing, sometimes abbreviated as PPC) is a reduced instruction set computer (RISC) instruction set architecture (ISA) created by the 1991 Apple Inc., App ...
,
Alpha Alpha (uppercase , lowercase ; grc, ἄλφα, ''álpha'', or ell, άλφα, álfa) is the first letter of the Greek alphabet. In the system of Greek numerals, it has a value of one. Alpha is derived from the Phoenician letter aleph , whic ...
, SPARC V9, MIPS, Intel i860, PA-RISC, SuperH SH-4 and IA-64) feature a setting which allows for switchable endianness in data fetches and stores, instruction fetches, or both. This feature can improve performance or simplify the logic of networking devices and software. The word ''bi-endian'', when said of hardware, denotes the capability of the machine to compute or pass data in either endian format. Many of these architectures can be switched via software to default to a specific endian format (usually done when the computer starts up); however, on some systems, the default endianness is selected by hardware on the motherboard and cannot be changed via software (e.g. the Alpha, which runs only in big-endian mode on the
Cray T3E The Cray T3E was Cray Research's second-generation massively parallel supercomputer architecture, launched in late November 1995. The first T3E was installed at the Pittsburgh Supercomputing Center in 1996. Like the previous Cray T3D, it was a ful ...
). Note that the term ''bi-endian'' refers primarily to how a processor treats data accesses. Instruction accesses (fetches of instruction words) on a given processor may still assume a fixed endianness, even if data accesses are fully bi-endian, though this is not always the case, such as on Intel's IA-64-based Itanium CPU, which allows both. Note, too, that some nominally bi-endian CPUs require motherboard help to fully switch endianness. For instance, the 32-bit desktop-oriented
PowerPC PowerPC (with the backronym Performance Optimization With Enhanced RISC – Performance Computing, sometimes abbreviated as PPC) is a reduced instruction set computer (RISC) instruction set architecture (ISA) created by the 1991 Apple Inc., App ...
processors in little-endian mode act as little-endian from the point of view of the executing programs, but they require the motherboard to perform a 64-bit swap across all 8 byte lanes to ensure that the little-endian view of things will apply to I/O devices. In the absence of this unusual motherboard hardware, device driver software must write to different addresses to undo the incomplete transformation and also must perform a normal byte swap. Some CPUs, such as many PowerPC processors intended for embedded use and almost all SPARC processors, allow per-page choice of endianness. SPARC processors since the late 1990s (SPARC v9 compliant processors) allow data endianness to be chosen with each individual instruction that loads from or stores to memory. The
ARM architecture ARM (stylised in lowercase as arm, formerly an acronym for Advanced RISC Machines and originally Acorn RISC Machine) is a family of reduced instruction set computer (RISC) instruction set architectures for computer processors, configured ...
supports two big-endian modes, called ''BE-8'' and ''BE-32''. CPUs up to ARMv5 only support BE-32 or word-invariant mode. Here any naturally aligned 32-bit access works like in little-endian mode, but access to a byte or 16-bit word is redirected to the corresponding address and unaligned access is not allowed. ARMv6 introduces BE-8 or byte-invariant mode, where access to a single byte works as in little-endian mode, but accessing a 16-bit, 32-bit or (starting with ARMv8) 64-bit word results in a byte swap of the data. This simplifies unaligned memory access as well as memory-mapped access to registers other than 32 bit. Many processors have instructions to convert a word in a register to the opposite endianness, that is, they swap the order of the bytes in a 16-, 32- or 64-bit word. All the individual bits are not reversed though. Recent Intel x86 and x86-64 architecture CPUs have a MOVBE instruction ( Intel Core since generation 4, after Atom), which fetches a big-endian format word from memory or writes a word into memory in big-endian format. These processors are otherwise thoroughly little-endian. There are also devices which use different formats in different places. For instance, the BQ27421 Texas Instruments battery gauge uses the little-endian format for its registers and the big-endian format for its random-access memory. This behavior does not seem to be modifiable.


Floating point

Although many processors use little-endian storage for all types of data (integer, floating point), there are a number of hardware architectures where
floating-point In computing, floating-point arithmetic (FP) is arithmetic that represents real numbers approximately, using an integer with a fixed precision, called the significand, scaled by an integer exponent of a fixed base. For example, 12.345 can b ...
numbers are represented in big-endian form while integers are represented in little-endian form. There are ARM processors that have mixed-endian floating-point representation for double-precision numbers: each of the two 32-bit words is stored as little-endian, but the most significant word is stored first. VAX floating point stores little-endian 16-bit words in big-endian order. Because there have been many floating-point formats with no network standard representation for them, the XDR standard uses big-endian IEEE 754 as its representation. It may therefore appear strange that the widespread
IEEE 754 The IEEE Standard for Floating-Point Arithmetic (IEEE 754) is a technical standard for floating-point arithmetic established in 1985 by the Institute of Electrical and Electronics Engineers (IEEE). The standard addressed many problems found i ...
floating-point standard does not specify endianness. Theoretically, this means that even standard IEEE floating-point data written by one machine might not be readable by another. However, on modern standard computers (i.e., implementing IEEE 754), one may safely assume that the endianness is the same for floating-point numbers as for integers, making the conversion straightforward regardless of data type. Small embedded systems using special floating-point formats may be another matter, however.


Variable-length data

Most instructions considered so far contain the size (lengths) of their operands within the
operation code In computing, an opcode (abbreviated from operation code, also known as instruction machine code, instruction code, instruction syllable, instruction parcel or opstring) is the portion of a machine language instruction that specifies the operat ...
. Frequently available operand lengths are 1, 2, 4, 8, or 16 bytes. But there are also architectures where the length of an operand may be held in a separate field of the instruction or with the operand itself, e.g. by means of a
word mark __notoc__ A wordmark, word mark, or logotype, is usually a distinct text-only typographic treatment of the name of a company, institution, or product name used for purposes of identification and branding. Examples can be found in the graphic iden ...
. Such an approach allows operand lengths up to 256 bytes or larger. The data types of such operands are character strings or BCD. Machines able to manipulate such data with one instruction (e.g. compare, add) include the IBM 1401,
1410 Year 1410 ( MCDX) was a common year starting on Wednesday (link will display the full calendar) of the Julian calendar. Events January–December * March 25 – The first of the Yongle Emperor's campaigns against the Mongols is ...
,
1620 Events January–June * February 4 – Prince Bethlen Gabor signs a peace treaty with Ferdinand II, Holy Roman Emperor. * May 17 – The first merry-go-round is seen at a fair (Philippapolis, Turkey). * June 3 – The ...
,
System/360 The IBM System/360 (S/360) is a family of mainframe computer systems that was announced by IBM on April 7, 1964, and delivered between 1965 and 1978. It was the first family of computers designed to cover both commercial and scientific applica ...
, System/370,
ESA/390 The IBM System/390 is a discontinued mainframe product family implementing the ESA/390, the fifth generation of the System/360 instruction set architecture. The first computers to use the ESA/390 were the Enterprise System/9000 (ES/9000) ...
, and
z/Architecture z/Architecture, initially and briefly called ESA Modal Extensions (ESAME), is IBM's 64-bit complex instruction set computer (CISC) instruction set architecture, implemented by its mainframe computers. IBM introduced its first z/Architecture-b ...
, all of them of type big-endian.


Simplified access to part of a field

On most systems, the address of a multi-byte value is the address of its first byte (the byte with the lowest address); little-endian systems of that type have the property that, for sufficiently low data values, the same value can be read from memory at different lengths without using different addresses (even when
alignment Alignment may refer to: Archaeology * Alignment (archaeology), a co-linear arrangement of features or structures with external landmarks * Stone alignment, a linear arrangement of upright, parallel megalithic standing stones Biology * Structu ...
restrictions are imposed). For example, a 32-bit memory location with content can be read at the same address as either
8-bit In computer architecture, 8-bit Integer (computer science), integers or other Data (computing), data units are those that are 8 bits wide (1 octet (computing), octet). Also, 8-bit central processing unit (CPU) and arithmetic logic unit (ALU) arc ...
(value = 4A),
16-bit 16-bit microcomputers are microcomputers that use 16-bit microprocessors. A 16-bit register can store 216 different values. The range of integer values that can be stored in 16 bits depends on the integer representation used. With the two mos ...
(004A),
24-bit Notable 24-bit machines include the CDC 924 – a 24-bit version of the CDC 1604, CDC lower 3000 series, SDS 930 and SDS 940, the ICT 1900 series, the Elliott 4100 series, and the Datacraft minicomputers/Harris H series. The term SWORD is ...
(00004A), or
32-bit In computer architecture, 32-bit computing refers to computer systems with a processor, memory, and other major system components that operate on data in 32-bit units. Compared to smaller bit widths, 32-bit computers can perform large calculation ...
(0000004A), all of which retain the same numeric value. Although this little-endian property is rarely used directly by high-level programmers, it is occasionally employed by code optimizers as well as by
assembly language In computer programming, assembly language (or assembler language, or symbolic machine code), often referred to simply as Assembly and commonly abbreviated as ASM or asm, is any low-level programming language with a very strong correspondence be ...
programmers. In more concrete terms, identities like this are the equivalent of the following C code returning ''true'' on most little-endian systems: union u = ; puts(u.u8

u.u16 && u.u8

u.u32 && u.u8

u.u64 ? "true" : "false");
While not allowed by C++, such
type punning In computer science, a type punning is any programming technique that subverts or circumvents the type system of a programming language in order to achieve an effect that would be difficult or impossible to achieve within the bounds of the formal ...
code is allowed as "implementation-defined" by the C11 standard and commonly used in code interacting with hardware. On the other hand, in some situations it may be useful to obtain an approximation of a multi-byte or multi-word value by reading only its most significant portion instead of the complete representation; a big-endian processor may read such an approximation using the same base address that would be used for the full value. Simplifications of this kind are of course not portable across systems of different endianness.


Calculation order

Some operations in
positional number system Positional notation (or place-value notation, or positional numeral system) usually denotes the extension to any base of the Hindu–Arabic numeral system (or decimal system). More generally, a positional system is a numeral system in which the ...
s have a natural or preferred order in which the elementary steps are to be executed. This order may affect their performance on small-scale byte-addressable processors and
microcontroller A microcontroller (MCU for ''microcontroller unit'', often also MC, UC, or μC) is a small computer on a single VLSI integrated circuit (IC) chip. A microcontroller contains one or more CPUs (processor cores) along with memory and programmable i ...
s. However, high-performance processors usually fetch multi-byte operands from memory in the same amount of time they would have fetched a single byte, so the complexity of the hardware is not affected by the byte ordering. Addition, subtraction, and multiplication start at the least significant digit position and propagate the carry to the subsequent more significant position. On most systems, the address of a multi-byte value is the address of its first byte (the byte with the lowest address). The implementation of these operations is marginally simpler using little-endian machines where this first byte contains the least significant digit. Comparison and division start at the most significant digit and propagate a possible carry to the subsequent less significant digits. For fixed-length numerical values (typically of length 1,2,4,8,16), the implementation of these operations is marginally simpler on big-endian machines. Some big-endian processors (e.g. the IBM System/360 and its successors) contain hardware instructions for lexicographically comparing varying length character strings. The normal data transport by an assignment statement is in principle independent of the endianness of the processor.


Middle-endian

Numerous other orderings, generically called ''middle-endian'' or ''mixed-endian'', are possible. The
PDP-11 The PDP-11 is a series of 16-bit minicomputers sold by Digital Equipment Corporation (DEC) from 1970 into the 1990s, one of a set of products in the Programmed Data Processor (PDP) series. In total, around 600,000 PDP-11s of all models were sold, ...
is in principle a 16-bit little-endian system. The instructions to convert between floating-point and integer values in the optional floating-point processor of the PDP-11/45, PDP-11/70, and in some later processors, stored 32-bit "double precision integer long" values with the 16-bit halves swapped from the expected little-endian order. The UNIX C compiler used the same format for 32-bit long integers. This ordering is known as ''PDP-endian''. A way to interpret this endianness is that it stores a 32-bit integer as two little-endian 16-bit words, with a big-endian word ordering: The 16-bit values here refer to their numerical values, not their actual layout. Segment descriptors of IA-32 and compatible processors keep a 32-bit base address of the segment stored in little-endian order, but in four nonconsecutive bytes, at relative positions 2, 3, 4 and 7 of the descriptor start.


Endian dates

Dates can be represented with different endianness by the ordering of the year, month and day. For example, September 13, 2002 can be represented as: * little-endian date (day, month, year), * middle-endian dates (month, day, year), * big-endian date (year, month, day), as with ISO 8601 In
date and time notation in the United States Date and time notation in the United States differs from that used in nearly all other countries. It is inherited from one historical branch of Date and time notation in the United Kingdom, conventions from the United Kingdom. American styles of ...
, dates are middle-endian and differ from date formats worldwide.


Byte addressing

When memory bytes are printed sequentially from left to right (e.g. in a
hex dump In computing, a hex dump is a hexadecimal view (on screen or paper) of computer data, from memory or from a computer file or storage device. Looking at a hex dump of data is usually done in the context of either debugging, reverse engineering or ...
), little-endian representation of integers has the significance increasing from left to right. In other words, it appears backwards when visualized, which can be counter-intuitive. This behavior arises, for example, in FourCC or similar techniques that involve packing characters into an integer, so that it becomes a sequences of specific characters in memory. Let's define the notation as simply the result of writing the characters in hexadecimal ASCII and appending to the front, and analogously for shorter sequences (a C multicharacter literal, in Unix/MacOS style): ' J o h n ' hex 4A 6F 68 6E ---------------- -> 0x4A6F686E On big-endian machines, the value appears left-to-right, coinciding with the correct string order for reading the result: But on a little-endian machine, one would see: Middle-endian machines complicate this even further; for example, on the
PDP-11 The PDP-11 is a series of 16-bit minicomputers sold by Digital Equipment Corporation (DEC) from 1970 into the 1990s, one of a set of products in the Programmed Data Processor (PDP) series. In total, around 600,000 PDP-11s of all models were sold, ...
, the 32-bit value is stored as two 16-bit words in big-endian, with the characters in the 16-bit words being stored in little-endian:


Byte swapping

Byte-swapping consists of rearranging bytes to change endianness. Many compilers provide built-ins that are likely to be compiled into native processor instructions (/), such as . Software interfaces for swapping include: * Standard network endianness functions (from/to BE, up to 32-bit). Windows has a 64-bit extension in . * BSD and Glibc functions (from/to BE and LE, up to 64-bit). * macOS macros (from/to BE and LE, up to 64-bit). Some
CPU A central processing unit (CPU), also called a central processor, main processor or just processor, is the electronic circuitry that executes instructions comprising a computer program. The CPU performs basic arithmetic, logic, controlling, and ...
instruction sets provide native support for endian byte swapping, such as ( x86 -
486 __NOTOC__ Year 486 ( CDLXXXVI) was a common year starting on Wednesday (link will display the full calendar) of the Julian calendar. At the time, it was known as the Year of the Consulship of Basilius and Longinus (or, less frequently, year 12 ...
and later), and (
ARMv6 ARM (stylised in lowercase as arm, formerly an acronym for Advanced RISC Machines and originally Acorn RISC Machine) is a family of reduced instruction set computer (RISC) instruction set architectures for computer processors, configured ...
and later). Some compilers have built-in facilities for byte swapping. For example, the Intel Fortran compiler supports the non-standard specifier when opening a file, e.g.: . Other compilers have options for generating code that globally enables the conversion for all file IO operations. This permits the reuse of code on a system with the opposite endianness without code modification.


Logic design

Hardware description languages (HDLs) used to express digital logic often support arbitrary endianness, with arbitrary granularity. For example, in SystemVerilog, a word can be defined as little endian or big endian: logic 1:0little_endian; // bit 0 is the least significant bit logic :31big_endian; // bit 31 is the least significant bit logic :37:0] mixed; // each byte is little-endian, but bytes are packed in big-endian order.


Files and filesystems

The recognition of endianness is important when reading a file or filesystem created on a computer with different endianness. Fortran sequential unformatted files created with one endianness usually cannot be read on a system using the other endianness because Fortran usually implements a storage record, record (defined as the data written by a single Fortran statement) as data preceded and succeeded by count fields, which are integers equal to the number of bytes in the data. An attempt to read such a file using Fortran on a system of the other endianness results in a run-time error, because the count fields are incorrect. Unicode text can optionally start with a byte order mark (BOM) to signal the endianness of the file or stream. Its code point is U+FEFF. In UTF-32 for example, a big-endian file should start with ; a little-endian should start with . Application binary data formats, such as MATLAB ''.mat'' files, or the ''.bil'' data format, used in topography, are usually endianness-independent. This is achieved by storing the data always in one fixed endianness or carrying with the data a switch to indicate the endianness. An example of the former is the binary
XLS file Microsoft Excel is a spreadsheet developed by Microsoft for Windows, macOS, Android and iOS. It features calculation or computation capabilities, graphing tools, pivot tables, and a macro programming language called Visual Basic for Appli ...
format that is portable between Windows and Mac systems and always little-endian, requiring the Mac application to swap the bytes on load and save when running on a big-endian Motorola 68K or PowerPC processor. TIFF image files are an example of the second strategy, whose header instructs the application about endianness of their internal binary integers. If a file starts with the signature it means that integers are represented as big-endian, while means little-endian. Those signatures need a single 16-bit word each, and they are
palindrome A palindrome is a word, number, phrase, or other sequence of symbols that reads the same backwards as forwards, such as the words ''madam'' or ''racecar'', the date and time ''11/11/11 11:11,'' and the sentence: "A man, a plan, a canal – Panam ...
s (that is, they read the same forwards and backwards), so they are endianness independent. stands for Intel and stands for Motorola, the respective
CPU A central processing unit (CPU), also called a central processor, main processor or just processor, is the electronic circuitry that executes instructions comprising a computer program. The CPU performs basic arithmetic, logic, controlling, and ...
providers of the
IBM PC The IBM Personal Computer (model 5150, commonly known as the IBM PC) is the first microcomputer released in the IBM PC model line and the basis for the IBM PC compatible de facto standard. Released on August 12, 1981, it was created by a team ...
compatibles (Intel) and Apple Macintosh platforms (Motorola) in the 1980s. Intel CPUs are little-endian, while Motorola 680x0 CPUs are big-endian. This explicit signature allows a TIFF reader program to swap bytes if necessary when a given file was generated by a TIFF writer program running on a computer with a different endianness. As a consequence of its original implementation on the Intel 8080 platform, the operating system-independent
File Allocation Table File Allocation Table (FAT) is a file system developed for personal computers. Originally developed in 1977 for use on floppy disks, it was adapted for use on hard disks and other devices. It is often supported for compatibility reasons by ...
(FAT) file system is defined with little-endian byte ordering, even on platforms using another endianness natively, necessitating byte-swap operations for maintaining the FAT.
ZFS ZFS (previously: Zettabyte File System) is a file system with volume management capabilities. It began as part of the Sun Microsystems Solaris operating system in 2001. Large parts of Solaris – including ZFS – were published under an open ...
, which combines a filesystem and a
logical volume manager In computer storage, logical volume management or LVM provides a method of allocating space on mass-storage devices that is more flexible than conventional partitioning schemes to store volumes. In particular, a volume manager can concatenate ...
, is known to provide adaptive endianness and to work with both big-endian and little-endian systems.


Networking

Many
IETF RFC A Request for Comments (RFC) is a publication in a series from the principal technical development and standards-setting bodies for the Internet, most prominently the Internet Engineering Task Force (IETF). An RFC is authored by individuals or g ...
s use the term ''network order'', meaning the order of transmission for bits and bytes ''over the wire'' in network protocols. Among others, the historic RFC 1700 (also known as Internet standard STD 2) has defined the network order for protocols in the Internet protocol suite to be big-endian, hence the use of the term for big-endian byte order. However, not all protocols use big-endian byte order as the network order. The Server Message Block (SMB) protocol uses little-endian byte order. In CANopen, multi-byte parameters are always sent
least significant byte In computing, bit numbering is the convention used to identify the bit positions in a binary number. Bit significance and indexing In computing, the least significant bit (LSB) is the bit position in a binary integer representing the binary 1 ...
first (little-endian). The same is true for
Ethernet Powerlink Ethernet Powerlink is a real-time protocol for standard Ethernet. It is an open protocol managed by the Ethernet POWERLINK Standardization Group (EPSG). It was introduced by Austrian automation company B&R in 2001. This protocol has nothing to ...
. The Berkeley sockets API defines a set of functions to convert 16-bit and 32-bit integers to and from network byte order: the (host-to-network-short) and (host-to-network-long) functions convert 16-bit and 32-bit values respectively from machine (''host'') to network order; the and functions convert from network to host order. These functions may be a
no-op In computer science, a NOP, no-op, or NOOP (pronounced "no op"; short for no operation) is a machine language instruction and its assembly language mnemonic, programming language statement, or computer protocol command that does nothing. Mac ...
on a big-endian system. While the high-level network protocols usually consider the byte (mostly meant as '' octet'') as their atomic unit, the lowest network protocols may deal with ordering of bits within a byte.


Bit endianness

Bit numbering is a concept similar to endianness, but on a level of bits, not bytes. Bit endianness or bit-level endianness refers to the transmission order of bits over a serial medium. The bit-level analogue of little-endian (least significant bit goes first) is used in RS-232,
HDLC High-Level Data Link Control (HDLC) is a bit-oriented code-transparent synchronous data link layer protocol developed by the International Organization for Standardization (ISO). The standard for HDLC is ISO/IEC 13239:2002. HDLC provides both c ...
, Ethernet, and USB. Some protocols use the opposite ordering (e.g. Teletext, I2C, SMBus, PMBus, and SONET and SDHCf. Sec. 2.1 Bit Transmission o
draft-ietf-pppext-sonet-as-00 "Applicability Statement for PPP over SONET/SDH"
/ref>), and ARINC 429 uses one ordering for its label field and the other ordering for the remainder of the frame. Usually, there exists a consistent view to the bits irrespective of their order in the byte, such that the latter becomes relevant only on a very low level. One exception is caused by the feature of some
cyclic redundancy check A cyclic redundancy check (CRC) is an error-detecting code commonly used in digital networks and storage devices to detect accidental changes to digital data. Blocks of data entering these systems get a short ''check value'' attached, based on t ...
s to detect ''all'' burst errors up to a known length, which would be spoiled if the bit order is different from the byte order on serial transmission. Apart from serialization, the terms ''bit endianness'' and ''bit-level endianness'' are seldom used, as computer architectures where each individual bit has a unique address are rare. Individual bits or bit fields are accessed via their numerical value or, in high-level programming languages, assigned names, the effects of which, however, may be machine dependent or lack software portability.


Notes


References

{{Reflist Computer memory Data transmission Metaphors Software wars