Intel hexadecimal object file format, Intel hex format or Intellec Hex is a file format that conveys binary information in ASCII text form, making it possible to store on non-binary media such as paper tape, punch cards, etc., to display on text terminals or be printed on line-oriented printers.
The format is commonly used for programming microcontrollers, EPROMs, and other types of programmable logic devices and hardware emulators. In a typical application, a compiler or assembler converts a program's source code (such as in C or assembly language) to machine code and outputs it into an object or executable file in hexadecimal (or binary) format. In some applications, the Intel hex format is also used as a container format holding packets of stream data.
Common file extensions used for the resulting files are .HEX or .H86.
The HEX file is then read by a programmer to write the machine code into a PROM, or is transferred to the target system for loading and execution.
There are various tools to convert files from hexadecimal to binary format (e.g. HEX2BIN), and vice versa (e.g. OBJHEX, OH, OHX, BIN2HEX).
History
The Intel hex format was originally designed for Intel's Intellec Microcomputer Development Systems (MDS) in 1973 in order to load and execute programs from paper tape. It was also used to specify memory contents to Intel for ROM production, which previously had to be encoded in the much less efficient BNPF (Begin-Negative-Positive-Finish) format.
In 1973, Intel's "software group" consisted only of Bill Byerly and Kenneth Burgett, with Gary Kildall as an external consultant doing business as Microcomputer Applications Associates (MAA); Kildall went on to found Digital Research in 1974.
Beginning in 1975, the format was utilized by Intellec Series II ISIS-II systems supporting diskette drives, with files using the file extension HEX.
Many PROM and EPROM programming devices accept this format.
Format
Intel HEX consists of lines of ASCII text that are separated by line feed or carriage return characters or both. Each text line contains uppercase hexadecimal characters that encode multiple binary numbers. The binary numbers may represent data, memory addresses, or other values, depending on their position in the line and the type and length of the line. Each text line is called a ''record''.
Record structure
A record (line of text) consists of six fields (parts) that appear in order from left to right:
# ''Start code'', one character, an ASCII colon ':'. All characters preceding this symbol in a record should be ignored. In fact, very early versions of the specification even asked for a minimum of 25 NUL characters to precede the first record and follow the last one, owing to the format's origins as a paper tape format which required some tape lead-in and lead-out for handling. However, as this was a little-known part of the specification, not all software copes with it correctly. It allows other related information to be stored in the same file (and even the same line), a facility used by various software development utilities to store symbol tables or additional comments, and by third-party extensions that use other characters as start code, such as the digits '0'..'9' by Intel and Keil, and other symbols by Mostek and TDL.
By convention, '' is often used for comments.
Neither of these extensions may contain any ':' characters as part of the payload.
# ''Byte count'', two hex digits (one hex digit pair), indicating the number of bytes (hex digit pairs) in the data field. The maximum byte count is 255 (0xFF). The values of 8 (0x08), 16 (0x10) and 32 (0x20) are commonly used byte counts. Not all software copes with counts larger than 16.
# ''Address'', four hex digits, representing the 16-bit beginning memory address offset of the data. The physical address of the data is computed by adding this offset to a previously established base address, thus allowing memory addressing beyond the 64 kilobyte limit of 16-bit addresses. The base address, which defaults to zero, can be changed by various types of records. Base addresses and address offsets are always expressed as big-endian values.
# ''Record type'' (see record types below), two hex digits, 00 to 05, defining the meaning of the data field.
# ''Data'', a sequence of ''n'' bytes of data, represented by 2''n'' hex digits. Some records omit this field (''n'' equals zero). The meaning and interpretation of data bytes depends on the application. (4-bit data has to be stored in either the lower or the upper half of the bytes, that is, one byte holds only one addressable data item.)
# ''Checksum'', two hex digits, a computed value that can be used to verify the record has no errors.
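The field layout above can be sketched as a small parser. This is only an illustrative sketch, not a reference implementation: the names `Record` and `parse_record` are hypothetical, the sample record is chosen purely for illustration, and the checksum is extracted but deliberately not verified here.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical container for the fields that follow the start code."""
    byte_count: int
    address: int      # 16-bit offset, stored big-endian in the text
    record_type: int
    data: bytes
    checksum: int


def parse_record(line: str) -> Record:
    """Split one Intel HEX record into its fields (checksum is not verified)."""
    start = line.find(":")
    if start < 0:
        raise ValueError("no start code ':' found")
    # Everything before the colon is ignored; trailing line terminators are stripped.
    raw = bytes.fromhex(line[start + 1:].strip())
    if len(raw) < 5:
        raise ValueError("record too short")
    byte_count = raw[0]
    # 1 count byte + 2 address bytes + 1 type byte + data + 1 checksum byte
    if len(raw) != byte_count + 5:
        raise ValueError("byte count does not match record length")
    return Record(
        byte_count=byte_count,
        address=int.from_bytes(raw[1:3], "big"),
        record_type=raw[3],
        data=raw[4:4 + byte_count],
        checksum=raw[-1],
    )


rec = parse_record("ignored text :0300300002337A1E\r\n")
print(rec)
```

Decoding the whole record with `bytes.fromhex` keeps the sketch short; a stricter parser might additionally reject lowercase digits or stray characters after the checksum.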
Checksum calculation
A record's checksum byte is the two's complement of the least significant byte (LSB) of the sum of all decoded byte values in the record preceding the checksum. It is computed by summing the decoded byte values and extracting the LSB of the sum (i.e., the data checksum), and then calculating the two's complement of the LSB (e.g., by inverting its bits and adding one).
For example, in the case of the record :0300300002337A1E, the sum of the decoded byte values is 03 + 00 + 30 + 00 + 02 + 33 + 7A = E2, which has LSB value E2. The two's complement of E2 is 1E, which is the checksum byte appearing at the end of the record.
The validity of a record can be checked by computing its checksum and verifying that the computed checksum equals the checksum appearing in the record; an error is indicated if the checksums differ. Since the record's checksum byte is the two's complement (and therefore the additive inverse) of the data checksum, this process can be reduced to summing all decoded byte values, including the record's checksum, and verifying that the LSB of the sum is zero. When applied to the preceding example, this method produces the following result: 03 + 00 + 30 + 00 + 02 + 33 + 7A + 1E = 100, which has LSB value 00.
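The two procedures described in this section can be sketched in a few lines. The function names `checksum` and `is_valid` are hypothetical, and the sample bytes are the decoded fields of the record :0300300002337A1E without its checksum byte.

```python
def checksum(record_bytes: bytes) -> int:
    """Two's complement of the least significant byte of the byte sum."""
    return (-sum(record_bytes)) & 0xFF


def is_valid(record_bytes_with_checksum: bytes) -> bool:
    """A record is consistent when all bytes, checksum included, sum to 0 mod 256."""
    return sum(record_bytes_with_checksum) & 0xFF == 0


# Byte count, address, record type and data of the sample record.
body = bytes.fromhex("0300300002337A")
print(f"sum = {sum(body):X}, checksum = {checksum(body):02X}")
print(is_valid(body + bytes([checksum(body)])))
```

Masking with `0xFF` performs the "extract the LSB" step, and negating before masking is equivalent to inverting the bits and adding one.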
Text line terminators
Intel HEX records are usually separated by one or more ASCII line termination characters so that each record appears alone on a text line. This enhances readability by visually delimiting the records and it also provides padding between records that can be used to improve machine parsing efficiency. However, the line termination characters are optional, as the ':' is used to detect the start of a record.
Programs that create HEX records typically use line termination characters that conform to the conventions of their operating systems. For example, Linux programs use a single LF (line feed, hex value 0A) character to terminate lines, whereas Windows programs use a CR (carriage return, hex value 0D) followed by a LF.
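Because the colon alone marks the start of a record, a reader does not need to rely on line terminators at all. The following sketch illustrates one way to do this; the name `split_records` is hypothetical, and using the byte count to find the end of each record is an assumption of this example rather than something the format prescribes.

```python
import re

_HEX_RUN = re.compile(r"[0-9A-Fa-f]*")


def split_records(text: str) -> list:
    """Return the records in `text`, whatever terminators (if any) separate them."""
    records = []
    # Text before the first colon is skipped; each later chunk starts a record.
    for chunk in text.split(":")[1:]:
        digits = _HEX_RUN.match(chunk).group(0)
        if len(digits) < 10:
            raise ValueError("record too short")
        # Byte count + address + type + checksum take 5 bytes, i.e. 10 hex digits.
        length = 2 * (int(digits[:2], 16) + 5)
        if len(digits) < length:
            raise ValueError("record truncated")
        records.append(":" + digits[:length])
    return records


unix = ":0300300002337A1E\n:00000001FF\n"
windows = ":0300300002337A1E\r\n:00000001FF\r\n"
unterminated = ":0300300002337A1E:00000001FF"
print(split_records(unix) == split_records(windows) == split_records(unterminated))
```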
Record types
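Intel HEX has six standard record types:
* 00, Data: the data bytes to be stored, starting at the record's address.
* 01, End Of File: marks the end of the file; it carries no data.
* 02, Extended Segment Address: a 16-bit segment base that is multiplied by 16 and added to the addresses of subsequent data records.
* 03, Start Segment Address: the initial CS:IP register contents for 80x86 processors.
* 04, Extended Linear Address: the upper 16 bits of the 32-bit address of subsequent data records.
* 05, Start Linear Address: a 32-bit start (entry point) address.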
Other record types have been used for variants, including a type for 'blinky' messages / transmission protocol container by Wayne and Layne; types for block start, block end, padded data, custom data and other data by the BBC/Micro:bit Educational Foundation; and types for data in the code segment, data segment, stack segment and extra segment, as well as for the paragraph addresses of the absolute code, data, stack and extra segments, by Digital Research.
Named formats
The original 4-bit/8-bit ''Intellec Hex Paper Tape Format'' and ''Intellec Hex Computer Punched Card Format'' in 1973/1974 supported only one record type, 00.
This was expanded around 1975 to also support record type 01.
Sometimes called ''symbolic hexadecimal format'', it could include an optional header containing a symbol table for symbolic debugging; all characters in a record preceding the colon are ignored.
Around 1978, Intel introduced the new record types 02 and 03 (to add support for the segmented address space of the then-new 8086/8088 processors) in their ''Extended Intellec Hex Format''.
Special names are sometimes used to denote the formats of HEX files that employ specific subsets of record types. For example:
* I8HEX (aka HEX-80) files use only record types 00 and 01
* I16HEX (aka HEX-86) files use only record types 00 through 03
* I32HEX (aka HEX-386) files use only record types 00, 01, 04, and 05
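As a rough illustration of these subsets, the sketch below maps the set of record types found in a file to the narrowest matching name. The function `named_format` is hypothetical; note that a file using only types 00 and 01 also satisfies the I16HEX and I32HEX subsets, and this sketch simply reports the smallest one.

```python
I8HEX_TYPES = {0x00, 0x01}
I16HEX_TYPES = {0x00, 0x01, 0x02, 0x03}
I32HEX_TYPES = {0x00, 0x01, 0x04, 0x05}


def named_format(record_types: set) -> str:
    """Return the narrowest named format whose record-type subset covers the input."""
    if record_types <= I8HEX_TYPES:
        return "I8HEX"
    if record_types <= I16HEX_TYPES:
        return "I16HEX"
    if record_types <= I32HEX_TYPES:
        return "I32HEX"
    return "mixed or non-standard"


print(named_format({0x00, 0x01}))
print(named_format({0x00, 0x01, 0x04}))
```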
File example
This example shows a file that has four data records followed by an end-of-file record:
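A file of that shape, with four 16-byte data records at addresses 0100 to 013F followed by an end-of-file record, is embedded in the sketch below and decoded into a memory image. The `load` function and the `SAMPLE` constant are hypothetical names; the sketch assumes one record per line, handles only record types 00, 01, 02 and 04, and does not model address wrap-around at segment boundaries.

```python
SAMPLE = """\
:10010000214601360121470136007EFE09D2190140
:100110002146017E17C20001FF5F16002148011928
:10012000194E79234623965778239EDA3F01B2CAA7
:100130003F0156702B5E712B722B732146013421C7
:00000001FF
"""


def load(text: str) -> dict:
    """Decode Intel HEX text into a {physical address: byte value} mapping."""
    memory = {}
    base = 0  # base address, defaults to zero
    for line in text.splitlines():
        start = line.find(":")
        if start < 0:
            continue  # no record on this line
        raw = bytes.fromhex(line[start + 1:].strip())
        if len(raw) < 5 or len(raw) != raw[0] + 5:
            raise ValueError("malformed record")
        if sum(raw) & 0xFF:
            raise ValueError("checksum mismatch")
        offset = int.from_bytes(raw[1:3], "big")
        record_type = raw[3]
        data = raw[4:-1]
        if record_type == 0x00:    # data
            for i, value in enumerate(data):
                memory[base + offset + i] = value
        elif record_type == 0x01:  # end of file
            break
        elif record_type == 0x02:  # extended segment address
            base = int.from_bytes(data, "big") << 4
        elif record_type == 0x04:  # extended linear address
            base = int.from_bytes(data, "big") << 16
        # Types 03 and 05 carry start addresses, not memory contents; skipped here.
    return memory


image = load(SAMPLE)
print(len(image), hex(min(image)), hex(max(image)))
```

Storing bytes in a dictionary keyed by address keeps sparse images cheap; a converter to a flat binary file would additionally need a fill value for the gaps.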
Variants
Besides Intel's own extension, several third parties have also defined variants and extensions of the Intel hex format, including Digital Research (as in the so-called "Digital Research hex format"), Zilog, Mostek, TDL, Texas Instruments, Microchip, c't, Wayne and Layne, and BBC/Micro:bit Educational Foundation (with its "Universal Hex Format"). These can have information on program entry points and register contents, a swapped byte order in the data fields, fill values for unused areas, fuse bits, and other differences.
The Digital Research hex format for 8086 processors supports segment information by adding record types to distinguish between code, data, stack, and extra segments.
Most assemblers for CP/M-80 (and also XASM09 for the Motorola 6809) don't use record type 01h to indicate the end of a file, but use a zero-length data type 00h entry instead. This eases the concatenation of multiple hex files.
Texas Instruments defines a variant where addresses are based on the bit-width of a processor's registers, not bytes.
Microchip defines variants INHX8S (INHX8L, INHX8H), INHX8M, INHX16 (INHX16M) and INHX32 for their PIC microcontrollers.
Alfred Arnold's cross-macro-assembler AS, Werner Hennig-Roleff's 8051-emulator SIM51, and Matthias R. Paul's cross-converter BINTEL are also known to define extensions to the Intel hex format.
See also
* Binary-to-text encoding, a survey and comparison of encoding algorithms
* Text-based protocol
* MOS Technology file format
* Motorola S-record hex format
* Tektronix hex format
* Texas Instruments TI-TXT (TI Text)
* Intel Micro Computer Set (MCS)
* Object file (typically binary, but sometimes also in Intel hex format)
Further reading
* "ADuC70xx Serial Download Protocol" (Application Note). Revision C. Norwood, Massachusetts, USA: Analog Devices. 2016. AN-724. https://www.analog.com/media/en/technical-documentation/evaluation-documentation/an-724.pdf (archived 2023-10-05 at https://web.archive.org/web/20231005181920/https://www.analog.com/media/en/technical-documentation/evaluation-documentation/an-724.pdf; retrieved 2023-10-05). (8 pages)
External links
* binex, a converter between Intel HEX and binary for Windows.
* SRecord, a converter between Intel HEX and binary for Linux, C++ source code.
* kk_ihex, an open source C library for reading and writing Intel HEX.
* libgis, an open source C library that converts Intel HEX, Motorola S-Record and Atmel Generic files.
* bincopy, a Python package for manipulating Intel HEX files.
* SwiftIntelHex, a Swift package to parse Intel HEX files for iOS and macOS.