The Portable Executable (PE) format is a file format for executables, object code, DLLs and others used in 32-bit and 64-bit versions of Windows operating systems. The PE format is a data structure that encapsulates the information necessary for the Windows OS loader to manage the wrapped executable code. This includes dynamic library references for linking, API export and import tables, resource management data and thread-local storage (TLS) data. On NT operating systems, the PE format is used for EXE, DLL, SYS (device driver), MUI and other file types. The Unified Extensible Firmware Interface (UEFI) specification states that PE is the standard executable format in EFI environments.
On Windows NT operating systems, PE currently supports the x86-32, x86-64 (AMD64/Intel 64), IA-64, ARM and ARM64 instruction set architectures (ISAs). Prior to Windows 2000, Windows NT (and thus PE) supported the MIPS, Alpha, and PowerPC ISAs. Because PE is used on Windows CE, it continues to support several variants of the MIPS, ARM (including Thumb), and SuperH ISAs.
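The target ISA of a given image is recorded in the Machine field of its COFF header. The following Python sketch shows one way to read that field from the raw bytes of a file. The function name `read_machine` and the display strings are illustrative assumptions, and only a subset of the Machine values from Microsoft's PE/COFF specification is listed.

```python
import struct

# Values of the COFF header's Machine field (subset), as given in
# Microsoft's PE/COFF specification. The display names are our own.
MACHINE_NAMES = {
    0x014C: "x86-32 (I386)",
    0x8664: "x86-64 (AMD64)",
    0x0200: "IA-64",
    0x01C0: "ARM",
    0x01C2: "ARM Thumb",
    0x01C4: "ARM Thumb-2 (ARMNT)",
    0xAA64: "ARM64",
    0x0166: "MIPS R4000",
    0x0184: "Alpha",
    0x01F0: "PowerPC",
    0x01A2: "SuperH SH3",
    0x01A6: "SuperH SH4",
}


def read_machine(data: bytes) -> str:
    """Return a readable name for the ISA a PE image targets (hypothetical helper)."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew, at offset 0x3C of the DOS header, holds the PE header offset.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # Machine is the first 16-bit word after the 4-byte PE signature.
    (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
    return MACHINE_NAMES.get(machine, f"unknown (0x{machine:04X})")
```

Given the contents of a file, for example `read_machine(open(path, "rb").read())`, the function returns one of the names above or an "unknown" string carrying the raw value.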
Analogous formats to PE are ELF (used in Linux and most other versions of Unix) and Mach-O (used in macOS and iOS).
History
Microsoft migrated to the PE format from the 16-bit NE formats with the introduction of the Windows NT 3.1 operating system. All later versions of Windows, including Windows 95/98/ME and the Win32s addition to Windows 3.1x, support the file structure. The format has retained limited legacy support to bridge the gap between DOS-based and NT systems. For example, PE/COFF headers still include a DOS executable program, which is by default a DOS stub that displays a message like "This program cannot be run in DOS mode" (or similar), though it can be a full-fledged DOS version of the program (a later notable case being the Windows 98 SE installer). This constitutes a form of fat binary. PE also continues to serve the changing Windows platform. Some extensions include the .NET PE format (see below), a version with 64-bit address space support called PE32+, and a specification for Windows CE.
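As a rough illustration of the legacy DOS part and of the PE32/PE32+ distinction, the Python sketch below inspects the raw bytes of an image. The helper name `describe_headers` and the returned dictionary keys are assumptions made for this example. The offsets (e_lfanew at 0x3C, a 20-byte COFF header, and the optional-header magic values 0x10B and 0x20B) follow Microsoft's PE/COFF specification.

```python
import struct

# Magic values at the start of the optional header.
OPTIONAL_MAGIC = {0x10B: "PE32", 0x20B: "PE32+"}


def describe_headers(data: bytes) -> dict:
    """Summarize the DOS part and the PE flavor of an image (hypothetical helper)."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ (DOS) executable")
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)  # e_lfanew
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("no PE signature at e_lfanew")
    # Everything between the 64-byte DOS header and the PE signature is the
    # DOS program, which is usually just the small stub.
    dos_program = data[0x40:pe_offset]
    # The optional header follows the 4-byte signature and 20-byte COFF header.
    (magic,) = struct.unpack_from("<H", data, pe_offset + 24)
    return {
        "dos_program_size": len(dos_program),
        "has_dos_message": b"cannot be run in DOS mode" in dos_program,
        "format": OPTIONAL_MAGIC.get(magic, f"unknown (0x{magic:03X})"),
    }
```

A large `dos_program_size` together with a missing stub message would hint at an image carrying a full DOS program rather than the default stub, although this heuristic is only a guess and not part of any specification.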
Technical details
Layout
A PE file consists of a number of headers and sections that tell the dynamic linker how to map the file into memory. An executable image consists of several different regions, each of which requires different memory protection, so the start of each section must be aligned to a page boundary. For instance, typically the ''.text'' section (which holds program code) is mapped as execute/read-only, and the ''.data'' section (holding global variables) is mapped as no-execute/read-write. However, to avoid wasting space, the different sections are not page-aligned on disk. Part of the job of the dynamic linker is to map each section to memory individually and assign the correct permissions to the resulting regions, according to the instructions found in the headers.
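The difference between on-disk and in-memory alignment can be sketched in Python as follows. The function names and the simplified section tuples are assumptions made for this example. The alignment values 0x200 (file) and 0x1000 (memory page) are common defaults rather than fixed rules, and the three flag constants are the section characteristics defined in the PE/COFF specification.

```python
# Section characteristic flags from the PE/COFF specification.
IMAGE_SCN_MEM_EXECUTE = 0x20000000
IMAGE_SCN_MEM_READ = 0x40000000
IMAGE_SCN_MEM_WRITE = 0x80000000


def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment (a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)


def protection(characteristics: int) -> str:
    """Render a section's memory flags as an 'rwx'-style string."""
    return "".join(
        letter if characteristics & flag else "-"
        for letter, flag in (
            ("r", IMAGE_SCN_MEM_READ),
            ("w", IMAGE_SCN_MEM_WRITE),
            ("x", IMAGE_SCN_MEM_EXECUTE),
        )
    )


def layout(sections, file_alignment=0x200, section_alignment=0x1000):
    """Assign file offsets and RVAs to (name, size, characteristics) tuples.

    Simplifying assumption: the headers occupy exactly one file-alignment
    unit on disk and one page in memory.
    """
    file_offset, rva, result = file_alignment, section_alignment, []
    for name, size, characteristics in sections:
        result.append((name, file_offset, rva, protection(characteristics)))
        file_offset += align_up(size, file_alignment)   # tight packing on disk
        rva += align_up(size, section_alignment)        # page granularity in memory
    return result
```

For a 0x1234-byte ''.text'' section followed by ''.data'', the second section starts at file offset 0x1600 but at relative virtual address 0x3000, which shows how the page-aligned memory image is larger than the file.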
Import table
One section of note is the ''import address table'' (IAT), which is used as a lookup table when the application calls a function in a different module. Imports can be specified by ordinal or by name. Because a compiled program cannot know the memory location of the libraries it depends upon, an indirect jump is required whenever an API call is made. As the dynamic linker loads modules and joins them together, it writes actual addresses into the IAT slots, so that they point to the memory locations of the corresponding library functions. Though this adds an extra jump over the cost of an intra-module call, resulting in a performance penalty, it provides a key benefit: the loader only has to modify the pages that hold the IAT, so the number of memory pages that become private copy-on-write copies is minimized, saving memory and disk I/O time. If the compiler knows ahead of time that a call will be inter-module (via a dllimport attribute), it can produce more optimized code that simply results in an indirect call opcode.
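The binding step can be modeled with a toy Python sketch. It is not how the Windows loader is implemented. The `Module` class, `bind_imports` and `call_import` are hypothetical names, and Python callables stand in for function addresses.

```python
class Module:
    """A loaded library whose exports are reachable by name or by ordinal."""

    def __init__(self, name, exports):
        self.name = name
        # exports: list of (ordinal, export_name, callable)
        self.by_name = {n: f for _, n, f in exports}
        self.by_ordinal = {o: f for o, _, f in exports}


def bind_imports(imports, loaded):
    """Fill an IAT-like list with one resolved target per import.

    imports: list of (module_name, key), where key is a str (import by
    name) or an int (import by ordinal).
    """
    iat = []
    for module_name, key in imports:
        module = loaded[module_name]
        table = module.by_ordinal if isinstance(key, int) else module.by_name
        iat.append(table[key])
    return iat


def call_import(iat, slot, *args):
    """Model of the indirect call: fetch the target from the slot, then call it."""
    return iat[slot](*args)
```

Only the `iat` list is written during binding. The calling code refers to slots by index and never changes, which mirrors why the code pages of a real image can stay shared between processes.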
Relocations
PE files normally do not contain position-independent code. Instead they are compiled to a preferred ''base address'', and all addresses emitted by the compiler/linker are fixed ahead of time. If a PE file cannot be loaded at its preferred address (because it is already taken by something else), the operating system will ''rebase'' it. This involves recalculating every absolute address and modifying the code to use the new values. The loader does this by comparing the preferred and actual load addresses and calculating a delta value. The delta is then added to each absolute address that was computed against the preferred base, yielding the address of the same location at the actual base. The locations that need this adjustment are listed in the base relocation table, and the delta is added, as needed, to the value already stored at each of them. The resulting code is now private to the process and no longer shareable, so many of the memory-saving benefits of DLLs are lost in this scenario. It also slows down loading of the module significantly. For this reason rebasing is to be avoided wherever possible, and the DLLs shipped by Microsoft have base addresses pre-computed so as not to overlap. In the no-rebase case PE therefore has the advantage of very efficient code, but in the presence of rebasing the memory usage hit can be expensive. This contrasts with ELF, which uses fully position-independent code and a global offset table, trading off execution time in favor of lower memory usage.
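The rebasing arithmetic can be sketched in Python as below. The function name `apply_relocations` is an assumption, and the image is modeled as a byte array indexed by relative virtual address. The block layout (a page RVA and block size followed by 16-bit entries with a 4-bit type and a 12-bit offset) and the three type constants follow the base relocation format in the PE/COFF specification. All other relocation types are simply ignored here.

```python
import struct

IMAGE_REL_BASED_ABSOLUTE = 0   # padding entry, nothing to patch
IMAGE_REL_BASED_HIGHLOW = 3    # 32-bit absolute address
IMAGE_REL_BASED_DIR64 = 10     # 64-bit absolute address


def apply_relocations(image: bytearray, reloc: bytes,
                      preferred_base: int, actual_base: int) -> None:
    """Patch, in place, every absolute address listed in a base relocation table."""
    delta = actual_base - preferred_base
    pos = 0
    while pos + 8 <= len(reloc):
        page_rva, block_size = struct.unpack_from("<II", reloc, pos)
        if block_size < 8:
            break  # malformed or terminating block
        for entry_pos in range(pos + 8, pos + block_size, 2):
            (entry,) = struct.unpack_from("<H", reloc, entry_pos)
            kind, offset = entry >> 12, entry & 0x0FFF
            target = page_rva + offset
            if kind == IMAGE_REL_BASED_HIGHLOW:
                (value,) = struct.unpack_from("<I", image, target)
                struct.pack_into("<I", image, target,
                                 (value + delta) & 0xFFFFFFFF)
            elif kind == IMAGE_REL_BASED_DIR64:
                (value,) = struct.unpack_from("<Q", image, target)
                struct.pack_into("<Q", image, target,
                                 (value + delta) & 0xFFFFFFFFFFFFFFFF)
            # IMAGE_REL_BASED_ABSOLUTE and unhandled types are skipped.
        pos += block_size
```

When the image loads at its preferred base the delta is zero and nothing needs to be written, which is the case in which the pages remain shareable.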
.NET, metadata, and the PE format
In a .NET executable, the PE code section contains a stub that invokes the CLR virtual machine startup entry, _CorExeMain or _CorDllMain in mscoree.dll, much like it was in Visual Basic executables. The virtual machine then makes use of the .NET metadata present, the root of which, IMAGE_COR20_HEADER (also called "CLR header"), is pointed to by the IMAGE_DIRECTORY_ENTRY_COMHEADER entry in the PE header's data directory. IMAGE_COR20_HEADER strongly resembles PE's optional header, essentially playing its role for the CLR loader.
The CLR-related data, including the root structure itself, is typically contained in the common code section, .text. It is composed of a few directories: metadata, embedded resources, strong names and a few for native-code interoperability. The metadata directory is a set of tables that list all the distinct .NET entities in the assembly, including types, methods, fields, constants and events, as well as references between them and to other assemblies.
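One way to tell a .NET image from a native one is to look at that data directory entry. The Python sketch below does this on the raw file bytes. The function name `clr_header_location` is an assumption. The directory index 14 and the optional-header offsets for PE32 and PE32+ are taken from the PE/COFF specification, and the sketch does not go on to parse IMAGE_COR20_HEADER itself.

```python
import struct

IMAGE_DIRECTORY_ENTRY_COMHEADER = 14  # index of the CLR header entry


def clr_header_location(data: bytes):
    """Return (rva, size) of the CLR header, or None for a native image."""
    if data[:2] != b"MZ":
        raise ValueError("not a PE image")
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)  # e_lfanew
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE image")
    optional = pe_offset + 24            # after signature + COFF header
    (magic,) = struct.unpack_from("<H", data, optional)
    if magic == 0x10B:                   # PE32
        count_offset = optional + 92
    elif magic == 0x20B:                 # PE32+
        count_offset = optional + 108
    else:
        raise ValueError("unknown optional header magic")
    # NumberOfRvaAndSizes precedes the array of 8-byte directory entries.
    (count,) = struct.unpack_from("<I", data, count_offset)
    if count <= IMAGE_DIRECTORY_ENTRY_COMHEADER:
        return None
    entry = count_offset + 4 + 8 * IMAGE_DIRECTORY_ENTRY_COMHEADER
    rva, size = struct.unpack_from("<II", data, entry)
    return (rva, size) if rva and size else None
```

The returned value is a relative virtual address, so reading the header from a file on disk would additionally require translating it through the section table.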
Use on other operating systems
The PE format is also used by ReactOS, as ReactOS is intended to be binary-compatible with Windows. It has also historically been used by a number of other operating systems, including SkyOS and BeOS R3. However, both SkyOS and BeOS eventually moved to ELF.
As the Mono development platform intends to be binary compatible with the Microsoft .NET Framework, it uses the same PE format as the Microsoft implementation. The same goes for Microsoft's own cross-platform .NET Core.
On x86(-64) Unix-like operating systems, Windows binaries (in PE format) can be executed with Wine. The HX DOS Extender also uses the PE format for native DOS 32-bit binaries, plus it can, to some degree, execute existing Windows binaries in DOS, thus acting like an equivalent of Wine for DOS.
On IA-32 and x86-64 Linux one can also run Windows DLLs under loadlibrary.
Mac OS X 10.5 has the ability to load and parse PE files, but is not binary compatible with Windows.
UEFI and EFI firmware use Portable Executable files as well as the Windows ABI x64 calling convention for applications.
See also
* EXE
* Executable and Linkable Format
* Mach-O
* a.out
* Comparison of executable file formats
* Executable compression
* ar (Unix), since all COFF libraries use that same format
* Application virtualization
External links
* PE Format (latest online document)
* Microsoft Portable Executable and Common Object File Format Specification (revision 9.3, .docx format)
* Microsoft Portable Executable and Common Object File Format Specification (revision 6.0, .doc format)
* The original Portable Executable article by Matt Pietrek (MSDN Magazine, March 1994)
* Part I. An In-Depth Look into the Win32 Portable Executable File Format by Matt Pietrek (MSDN Magazine, February 2002)
* Part II. An In-Depth Look into the Win32 Portable Executable File Format by Matt Pietrek (MSDN Magazine, March 2002)
* The .NET File Format by Daniel Pistelli
* Ero Carrera's blog describing the PE header and how to walk through
* PE Internals provides an easy way to learn the Portable Executable File Format
* PE Explorer