An object file is a computer file containing
object code
In computing, object code or object module is the product of a compiler.
In a general sense object code is a sequence of statements or instructions in a computer language, usually a machine code language (i.e., binary) or an intermediate lang ...
, that is,
machine code
In computer programming, machine code is any low-level programming language, consisting of machine language instructions, which are used to control a computer's central processing unit (CPU). Each instruction causes the CPU to perform a ver ...
output of an
assembler
Assembler may refer to:
Arts and media
* Nobukazu Takemura, avant-garde electronic musician, stage name Assembler
* Assemblers, a fictional race in the ''Star Wars'' universe
* Assemblers, an alternative name of the superhero group Champions of A ...
or
compiler
In computing, a compiler is a computer program that translates computer code written in one programming language (the ''source'' language) into another language (the ''target'' language). The name "compiler" is primarily used for programs that ...
. The object code is usually
relocatable, and not usually directly
executable
In computing, executable code, an executable file, or an executable program, sometimes simply referred to as an executable or binary, causes a computer "to perform indicated tasks according to encoded instructions", as opposed to a data fil ...
. There are various formats for object files, and the same machine code can be packaged in different object file formats. An object file may also work like a
shared library
In computer science, a library is a collection of non-volatile memory, non-volatile resources used by computer programs, often for software development. These may include configuration data, documentation, help data, message templates, Code r ...
.
In addition to the object code itself, object files may contain
metadata used for linking or debugging, including: information to resolve symbolic cross-references between different modules,
relocation information,
stack unwinding
In computer science, a call stack is a stack data structure that stores information about the active subroutines of a computer program. This kind of stack is also known as an execution stack, program stack, control stack, run-time stack, or mach ...
information,
comments
Comment may refer to:
* Comment (linguistics) or rheme, that which is said about the topic (theme) of a sentence
* Bernard Comment (born 1960), Swiss writer and publisher
Computing
* Comment (computer programming), explanatory text or informa ...
, program
symbols
A symbol is a mark, sign, or word that indicates, signifies, or is understood as representing an idea, object, or relationship. Symbols allow people to go beyond what is known or seen by creating linkages between otherwise very different co ...
, debugging or
profiling information. Other metadata may include the date and time of compilation, the compiler name and version, and other identifying information.
The term "object program" dates from at least the 1950s:
A computer programmer generates object code with a
compiler
In computing, a compiler is a computer program that translates computer code written in one programming language (the ''source'' language) into another language (the ''target'' language). The name "compiler" is primarily used for programs that ...
or
assembler
Assembler may refer to:
Arts and media
* Nobukazu Takemura, avant-garde electronic musician, stage name Assembler
* Assemblers, a fictional race in the ''Star Wars'' universe
* Assemblers, an alternative name of the superhero group Champions of A ...
. For example, under
Linux
Linux ( or ) is a family of open-source Unix-like operating systems based on the Linux kernel, an operating system kernel first released on September 17, 1991, by Linus Torvalds. Linux is typically packaged as a Linux distribution, which i ...
, the
GNU Compiler Collection
The GNU Compiler Collection (GCC) is an optimizing compiler produced by the GNU Project supporting various programming languages, hardware architectures and operating systems. The Free Software Foundation (FSF) distributes GCC as free sof ...
compiler will generate files with a .o extension which use the
ELF
An elf () is a type of humanoid supernatural being in Germanic mythology and folklore. Elves appear especially in North Germanic mythology. They are subsequently mentioned in Snorri Sturluson's Icelandic Prose Edda. He distinguishes "ligh ...
format. Compilation on
Windows
Windows is a group of several proprietary graphical operating system families developed and marketed by Microsoft. Each family caters to a certain sector of the computing industry. For example, Windows NT for consumers, Windows Server for ...
generates files with a .obj extension which use the
COFF
The Common Object File Format (COFF) is a format for executable, object code, and shared library computer files used on Unix systems. It was introduced in Unix System V, replaced the previously used a.out format, and formed the basis for ext ...
format. A linker is then used to combine the object code into one executable program or library pulling in precompiled system libraries as needed.
Object file formats
There are many different object file formats; originally each type of computer had its own unique format, but with the advent of
Unix
Unix (; trademarked as UNIX) is a family of multitasking, multiuser computer operating systems that derive from the original AT&T Unix, whose development started in 1969 at the Bell Labs research center by Ken Thompson, Dennis Ritchie, a ...
and other
portable
Portable may refer to:
General
* Portable building, a manufactured structure that is built off site and moved in upon completion of site and utility work
* Portable classroom, a temporary building installed on the grounds of a school to provide a ...
operating system
An operating system (OS) is system software that manages computer hardware, software resources, and provides common daemon (computing), services for computer programs.
Time-sharing operating systems scheduler (computing), schedule tasks for ef ...
s, some formats, such as
ELF
An elf () is a type of humanoid supernatural being in Germanic mythology and folklore. Elves appear especially in North Germanic mythology. They are subsequently mentioned in Snorri Sturluson's Icelandic Prose Edda. He distinguishes "ligh ...
and
COFF
The Common Object File Format (COFF) is a format for executable, object code, and shared library computer files used on Unix systems. It was introduced in Unix System V, replaced the previously used a.out format, and formed the basis for ext ...
, have been defined and used on different kinds of systems. It is possible for the same format to be used both as
linker
Linker or linkers may refer to:
Computing
* Linker (computing), a computer program that takes one or more object files generated by a compiler or generated by an assembler and links them with libraries, generating an executable program or shar ...
input and output, and thus as the
library
A library is a collection of materials, books or media that are accessible for use and not just for display purposes. A library provides physical (hard copies) or digital access (soft copies) materials, and may be a physical location or a vi ...
and
executable
In computing, executable code, an executable file, or an executable program, sometimes simply referred to as an executable or binary, causes a computer "to perform indicated tasks according to encoded instructions", as opposed to a data fil ...
file format. Some formats can contain machine code for different processors, with the correct one chosen by the operating system when the program is loaded.
Some systems make a distinction between formats which are directly executable and formats which require processing by the linker. For example,
OS/360 and successors
OS/360, officially known as IBM System/360 Operating System, is a discontinued batch processing operating system developed by IBM for their then-new System/360 mainframe computer, announced in 1964; it was influenced by the earlier IBSYS/IBJO ...
call the first format a ''load module'' and the second an ''object module''. In this case the files have entirely different formats.
The design and/or choice of an object file format is a key part of overall system design. It affects the performance of the linker and thus
programmer
A computer programmer, sometimes referred to as a software developer, a software engineer, a programmer or a coder, is a person who creates computer programs — often for larger computer software.
A programmer is someone who writes/creates ...
turnaround while a program is being developed. If the format is used for executables, the design also affects the time programs take to
begin running, and thus the
responsiveness
Responsiveness as a concept of computer science refers to the specific ability of a system or functional unit to complete assigned tasks within a given time. For example, it would refer to the ability of an artificial intelligence system to unde ...
for users.
Absolute files
Many early computers, or small
microcomputers
A microcomputer is a small, relatively inexpensive computer having a central processing unit (CPU) made out of a microprocessor. The computer also includes memory and input/output (I/O) circuitry together mounted on a printed circuit board (P ...
, support only an absolute object format. Programs are not relocatable; they need to be assembled or compiled to execute at specific, predefined addresses. The file contains no relocation or linkage information. These files can be loaded into read/write memory, or stored in
read-only memory
Read-only memory (ROM) is a type of non-volatile memory used in computers and other electronic devices. Data stored in ROM cannot be electronically modified after the manufacture of the memory device. Read-only memory is useful for storing s ...
. For example, the
Motorola 6800
The 6800 ("''sixty-eight hundred''") is an 8-bit microprocessor designed and first manufactured by Motorola in 1974. The MC6800 microprocessor was part of the M6800 Microcomputer System (latter dubbed ''68xx'') that also included serial and paral ...
MIKBUG monitor contains a routine to read an absolute object file (
SREC Format) from
paper tape
Five- and eight-hole punched paper tape
Paper tape reader on the Harwell computer with a small piece of five-hole tape connected in a circle – creating a physical program loop
Punched tape or perforated paper tape is a form of data storage ...
.
DOS
DOS is shorthand for the MS-DOS and IBM PC DOS family of operating systems.
DOS may also refer to:
Computing
* Data over signalling (DoS), multiplexing data onto a signalling channel
* Denial-of-service attack (DoS), an attack on a communicat ...
COM files are a more recent example of absolute object files.
Segmentation
Most object file formats are structured as separate sections of data, each section containing a certain type of data. These sections are known as "segments" due to the term "
memory segment
Memory segmentation is an operating system memory management technique of division of a computer's primary memory into segments or sections. In a computer system using segmentation, a reference to a memory location includes a value that identifi ...
", which was previously a common form of
memory management
Memory management is a form of resource management applied to computer memory. The essential requirement of memory management is to provide ways to dynamically allocate portions of memory to programs at their request, and free it for reuse when ...
. When a program is loaded into memory by a
loader
Loader can refer to:
* Loader (equipment)
* Loader (computing)
** LOADER.EXE, an auto-start program loader optionally used in the startup process of Microsoft Windows ME
* Loader (surname)
* Fast loader
* Speedloader
* Boot loader
** LOADER ...
, the loader allocates various regions of memory to the program. Some of these regions correspond to sections of the object file, and thus are usually known by the same names. Others, such as the stack, only exist at run time. In some cases,
relocation is done by the loader (or linker) to specify the actual memory addresses. However, for many programs or architectures, relocation is not necessary, due to being handled by the
memory management unit
A memory management unit (MMU), sometimes called paged memory management unit (PMMU), is a computer hardware unit having all memory references passed through itself, primarily performing the translation of virtual memory addresses to physical ...
or by
position-independent code
In computing, position-independent code (PIC) or position-independent executable (PIE) is a body of machine code that, being placed somewhere in the primary memory, executes properly regardless of its absolute address. PIC is commonly used for ...
. On some systems the segments of the object file can then be copied (paged) into memory and executed, without needing further processing. On these systems, this may be done ''lazily'', that is, only when the segments are referenced during execution, for example via a
memory-mapped file
A memory-mapped file is a segment of virtual memory that has been assigned a direct byte-for-byte correlation with some portion of a file or file-like resource. This resource is typically a file that is physically present on disk, but can also b ...
backed by the object file.
Types of data supported by typical object file formats:
* Header (descriptive and control information)
*
Code segment
In computing, a code segment, also known as a text segment or simply as text, is a portion of an object file or the corresponding section of the program's virtual address space that contains executable instructions.
Segment
The term "segment ...
("text segment", executable code)
*
Data segment
In computing, a data segment (often denoted .data) is a portion of an object file or the corresponding address space of a program that contains initialized static variables, that is, global variables and static local variables. The size of this seg ...
(initialized
static variable
In computer programming, a static variable is a variable that has been allocated "statically", meaning that its lifetime (or "extent") is the entire run of the program. This is in contrast to shorter-lived automatic variables, whose storage is ...
s)
* Read-only data segment (''
rodata
In computing, a data segment (often denoted .data) is a portion of an object file or the corresponding address space of a program that contains initialized static variables, that is, global variables and static local variables. The size of this seg ...
,'' initialized static
constants)
*
BSS segment
In computer programming, the block starting symbol (abbreviated to .bss or bss) is the portion of an object file, executable, or assembly language code that contains statically allocated variables that are declared but have not been assigned a va ...
(uninitialized static data, both variables and constants)
* External definitions and references for linking
*
Relocation information
*
Dynamic linking
In computing, a dynamic linker is the part of an operating system that loads and links the shared libraries needed by an executable when it is executed (at " run time"), by copying the content of libraries from persistent storage to RAM, filli ...
information
*
Debugging
In computer programming and software development, debugging is the process of finding and resolving ''bugs'' (defects or problems that prevent correct operation) within computer programs, software, or systems.
Debugging tactics can involve in ...
information
Segments in different object files may be combined by the linker according to rules specified when the segments are defined. Conventions exist for segments shared between object files; for instance, in
DOS
DOS is shorthand for the MS-DOS and IBM PC DOS family of operating systems.
DOS may also refer to:
Computing
* Data over signalling (DoS), multiplexing data onto a signalling channel
* Denial-of-service attack (DoS), an attack on a communicat ...
there are
different memory models that specify the names of special segments and whether or not they may be combined.
Debugging information may either be an integral part of the object file format, as in
COFF
The Common Object File Format (COFF) is a format for executable, object code, and shared library computer files used on Unix systems. It was introduced in Unix System V, replaced the previously used a.out format, and formed the basis for ext ...
, or a
semi-independent format which may be used with several object formats, such as
stabs
stabs (sometimes written STABS) is a debugging data format for storing information about computer programs for use by symbolic and source-level debuggers. (The information is stored in symbol table strings; hence the name "stabs".) Cygnus Sup ...
or
DWARF
Dwarf or dwarves may refer to:
Common uses
*Dwarf (folklore), a being from Germanic mythology and folklore
* Dwarf, a person or animal with dwarfism
Arts, entertainment, and media Fictional entities
* Dwarf (''Dungeons & Dragons''), a humanoid ...
.
The
GNU Project
The GNU Project () is a free software, mass collaboration project announced by Richard Stallman on September 27, 1983. Its goal is to give computer users freedom and control in their use of their computers and Computer hardware, computing devi ...
's
Binary File Descriptor library
The Binary File Descriptor library (BFD) is the GNU Project's main mechanism for the portable manipulation of object files in a variety of formats. , it supports approximately 50 file formats for some 25 instruction set architectures.
History
W ...
(BFD library) provides a common
API
An application programming interface (API) is a way for two or more computer programs to communicate with each other. It is a type of software interface, offering a service to other pieces of software. A document or standard that describes how ...
for the manipulation of object files in a variety of formats.
References
Further reading
*
{{executables
Executable file formats
Compiler construction
Computer libraries
Programming language implementation