HOME

TheInfoList



OR:

In
computing Computing is any goal-oriented activity requiring, benefiting from, or creating computing machinery. It includes the study and experimentation of algorithmic processes, and development of both hardware and software. Computing has scientific, e ...
, a linker or link editor is a computer system program that takes one or more
object file An object file is a computer file containing object code, that is, machine code output of an assembler or compiler. The object code is usually relocatable, and not usually directly executable. There are various formats for object files, and the ...
s (generated by a
compiler In computing, a compiler is a computer program that translates computer code written in one programming language (the ''source'' language) into another language (the ''target'' language). The name "compiler" is primarily used for programs that ...
or an
assembler Assembler may refer to: Arts and media * Nobukazu Takemura, avant-garde electronic musician, stage name Assembler * Assemblers, a fictional race in the ''Star Wars'' universe * Assemblers, an alternative name of the superhero group Champions of A ...
) and combines them into a single
executable In computing, executable code, an executable file, or an executable program, sometimes simply referred to as an executable or binary, causes a computer "to perform indicated tasks according to encoded instruction (computer science), instructi ...
file,
library A library is a collection of materials, books or media that are accessible for use and not just for display purposes. A library provides physical (hard copies) or digital access (soft copies) materials, and may be a physical location or a vir ...
file, or another "object" file. A simpler version that writes its
output Output may refer to: * The information produced by a computer, see Input/output * An output state of a system, see state (computer science) * Output (economics), the amount of goods and services produced ** Gross output in economics, the value of ...
directly to
memory Memory is the faculty of the mind by which data or information is encoded, stored, and retrieved when needed. It is the retention of information over time for the purpose of influencing future action. If past events could not be remembered, ...
is called the ''loader'', though loading is typically considered a separate process.


Overview

Computer programs typically are composed of several parts or modules; these parts/modules do not need to be contained within a single
object file An object file is a computer file containing object code, that is, machine code output of an assembler or compiler. The object code is usually relocatable, and not usually directly executable. There are various formats for object files, and the ...
, and in such cases refer to each other by means of
symbols A symbol is a mark, sign, or word that indicates, signifies, or is understood as representing an idea, object, or relationship. Symbols allow people to go beyond what is known or seen by creating linkages between otherwise very different conc ...
as addresses into other modules, which are mapped into memory addresses when linked for execution. While the process of linking is meant to ultimately combine these independent parts, there are many good reasons to develop those separately at the
source Source may refer to: Research * Historical document * Historical source * Source (intelligence) or sub source, typically a confidential provider of non open-source intelligence * Source (journalism), a person, publication, publishing institute o ...
-level. Among these reasons are the ease of organizing several smaller pieces over a
monolithic A monolith is a monument or natural feature consisting of a single massive stone or rock. Monolith or monolithic may also refer to: Architecture * Monolithic architecture, a style of construction in which a building is carved, cast or excavated ...
whole and the ability to better define the purpose and responsibilities of each individual piece, which is essential for managing complexity and increasing long-term maintainability in
software architecture Software architecture is the fundamental structure of a software system and the discipline of creating such structures and systems. Each structure comprises software elements, relations among them, and properties of both elements and relations. ...
. Typically, an object file can contain three kinds of symbols: * defined "external" symbols, sometimes called "public" or "entry" symbols, which allow it to be called by other modules, * undefined "external" symbols, which reference other modules where these symbols are defined, and * local symbols, used internally within the object file to facilitate relocation. For most compilers, each object file is the result of compiling one input source code file. When a program comprises multiple object files, the linker combines these files into a unified executable program, resolving the symbols as it goes along. Linkers can take objects from a collection called a ''
library A library is a collection of materials, books or media that are accessible for use and not just for display purposes. A library provides physical (hard copies) or digital access (soft copies) materials, and may be a physical location or a vir ...
'' or
runtime library In computer programming, a runtime library is a set of low-level routines used by a compiler to invoke some of the behaviors of a runtime environment, by inserting calls to the runtime library into compiled executable binary. The runtime enviro ...
. Most linkers do not include the whole library in the output; they include only the files that are referenced by other object files or libraries. Library linking may thus be an iterative process, with some referenced modules requiring additional modules to be linked, and so on. Libraries exist for diverse purposes, and one or more system libraries are usually linked in by default. The linker also takes care of arranging the objects in a program's
address space In computing, an address space defines a range of discrete addresses, each of which may correspond to a network host, peripheral device, disk sector, a memory cell or other logical or physical entity. For software programs to save and retrieve st ...
. This may involve ''relocating'' code that assumes a specific
base address In computing, a base address is an address serving as a reference point ("base") for other addresses. Related addresses can be accessed using an ''addressing scheme''. Under the ''relative addressing'' scheme, to obtain an absolute address, the r ...
into another base. Since a compiler seldom knows where an object will reside, it often assumes a fixed base location (for example,
zero 0 (zero) is a number representing an empty quantity. In place-value notation Positional notation (or place-value notation, or positional numeral system) usually denotes the extension to any base of the Hindu–Arabic numeral system (or ...
). Relocating machine code may involve re-targeting of absolute jumps, loads and stores. The executable output by the linker may need another relocation pass when it is finally loaded into memory (just before execution). This pass is usually omitted on hardware offering
virtual memory In computing, virtual memory, or virtual storage is a memory management technique that provides an "idealized abstraction of the storage resources that are actually available on a given machine" which "creates the illusion to users of a very l ...
: every program is put into its own address space, so there is no conflict even if all programs load at the same base address. This pass may also be omitted if the executable is a
position independent In computing, position-independent code (PIC) or position-independent executable (PIE) is a body of machine code that, being placed somewhere in the primary memory, executes properly regardless of its absolute address. PIC is commonly used for ...
executable. On some
Unix Unix (; trademarked as UNIX) is a family of multitasking, multiuser computer operating systems that derive from the original AT&T Unix, whose development started in 1969 at the Bell Labs research center by Ken Thompson, Dennis Ritchie, and ot ...
variants, such as
SINTRAN III Sintran III is a real-time, multitasking, multi-user operating system used with Norsk Data minicomputers from 1974. Unlike its predecessors Sintran I and II, it was written entirely by Norsk Data, in Nord Programming Language (Nord PL, NPL), ...
, the process performed by a linker (assembling object files into a program) was called '' loading'' (as in loading executable code onto a file). Additionally, in some operating systems, the same program handles both the jobs of linking and loading a program (
dynamic linking In computing, a dynamic linker is the part of an operating system that loads and links the shared libraries needed by an executable when it is executed (at "run time"), by copying the content of libraries from persistent storage to RAM, filling ...
).


Dynamic linking

Many
operating system An operating system (OS) is system software that manages computer hardware, software resources, and provides common services for computer programs. Time-sharing operating systems schedule tasks for efficient use of the system and may also in ...
environments allow dynamic linking, deferring the resolution of some undefined symbols until a program is run. That means that the executable code still contains undefined symbols, plus a list of objects or libraries that will provide definitions for these. Loading the program will load these objects/libraries as well, and perform a final linking. This approach offers two advantages: * Often-used libraries (for example the standard system libraries) need to be stored in only one location, not duplicated in every single executable file, thus saving limited
memory Memory is the faculty of the mind by which data or information is encoded, stored, and retrieved when needed. It is the retention of information over time for the purpose of influencing future action. If past events could not be remembered, ...
and
disk Disk or disc may refer to: * Disk (mathematics), a geometric shape * Disk storage Music * Disc (band), an American experimental music band * ''Disk'' (album), a 1995 EP by Moby Other uses * Disk (functional analysis), a subset of a vector sp ...
space. * If a bug in a library function is corrected by replacing the library or
performance A performance is an act of staging or presenting a play, concert, or other form of entertainment. It is also defined as the action or process of carrying out or accomplishing an action, task, or function. Management science In the work place ...
is improved, all programs using it dynamically will benefit from the correction after restarting them. Programs that included this function by static linking would have to be re-linked first. There are also disadvantages: * Known on the
Windows Windows is a group of several proprietary graphical operating system families developed and marketed by Microsoft. Each family caters to a certain sector of the computing industry. For example, Windows NT for consumers, Windows Server for serv ...
platform as "
DLL hell In computing, DLL Hell is a term for the complications that arise when one works with dynamic-link libraries (DLLs) used with Microsoft Windows operating systems, particularly legacy 16-bit editions, which all run in a single memory space. DLL Hel ...
", an incompatible updated library will break executables that depended on the behavior of the previous version of the library if the newer version is not correctly
backward compatible Backward compatibility (sometimes known as backwards compatibility) is a property of an operating system, product, or technology that allows for interoperability with an older legacy system, or with input designed for such a system, especially i ...
. * A program, together with the libraries it uses, might be certified (e.g. as to correctness, documentation requirements, or performance) as a package, but not if components can be replaced (this also argues against automatic OS updates in critical systems; in both cases, the OS and libraries form part of a ''qualified'' environment). Contained or virtual environments may further allow
system administrators A system administrator, or sysadmin, or admin is a person who is responsible for the upkeep, configuration, and reliable operation of computer systems, especially multi-user computers, such as servers. The system administrator seeks to ensu ...
to mitigate or trade-off these individual pros and cons.


Static linking

Static linking is the result of the linker copying all library routines used in the program into the executable image. This may require more disk space and memory than dynamic linking, but is more portable, since it does not require the presence of the
library A library is a collection of materials, books or media that are accessible for use and not just for display purposes. A library provides physical (hard copies) or digital access (soft copies) materials, and may be a physical location or a vir ...
on the system where it runs. Static linking also prevents "DLL hell", since each program includes exactly the versions of library routines that it requires, with no conflict with other programs. A program using just a few routines from a library does not require the entire library to be installed.


Relocation

As the compiler has no information on the layout of objects in the final output, it cannot take advantage of shorter or more efficient instructions that place a requirement on the address of another object. For example, a jump instruction can reference an absolute address or an offset from the current location, and the offset could be expressed with different lengths depending on the distance to the target. By first generating the most conservative instruction (usually the largest relative or absolute variant, depending on platform) and adding ''relaxation hints'', it is possible to substitute shorter or more efficient instructions during the final link. In regard to jump optimizations this is also called ''automatic jump-sizing''. This step can be performed only after all input objects have been read and assigned temporary addresses; the linker relaxation pass subsequently reassigns addresses, which may in turn allow more potential relaxations to occur. In general, the substituted sequences are shorter, which allows this process to always converge on the best solution given a fixed order of objects; if this is not the case, relaxations can conflict, and the linker needs to weigh the advantages of either option. While instruction relaxation typically occurs at link-time, inner-module relaxation can already take place as part of the optimizing process at
compile-time In computer science, compile time (or compile-time) describes the time window during which a computer program is compiled. The term is used as an adjective to describe concepts related to the context of program compilation, as opposed to concept ...
. In some cases, relaxation can also occur at
load-time In computing, computer systems a loader is the part of an operating system that is responsible for loading computer program, programs and Library (computing), libraries. It is one of the essential stages in the process of starting a program, as ...
as part of the relocation process or combined with
dynamic dead-code elimination In compiler theory, dead-code elimination (also known as DCE, dead-code removal, dead-code stripping, or dead-code strip) is a compiler optimization to remove code which does not affect the program results. Removing such code has several benefits: ...
techniques.


Linkage editor

In IBM
System/360 The IBM System/360 (S/360) is a family of mainframe computer systems that was announced by IBM on April 7, 1964, and delivered between 1965 and 1978. It was the first family of computers designed to cover both commercial and scientific applica ...
mainframe A mainframe computer, informally called a mainframe or big iron, is a computer used primarily by large organizations for critical applications like bulk data processing for tasks such as censuses, industry and consumer statistics, enterprise ...
environments such as
OS/360 OS/360, officially known as IBM System/360 Operating System, is a discontinued batch processing operating system developed by IBM for their then-new System/360 mainframe computer, announced in 1964; it was influenced by the earlier IBSYS/IBJOB ...
, including
z/OS z/OS is a 64-bit operating system for IBM z/Architecture mainframes, introduced by IBM in October 2000. It derives from and is the successor to OS/390, which in turn was preceded by a string of MVS versions.Starting with the earliest: * O ...
for the
z/Architecture z/Architecture, initially and briefly called ESA Modal Extensions (ESAME), is IBM's 64-bit complex instruction set computer (CISC) instruction set architecture, implemented by its mainframe computers. IBM introduced its first z/Architecture-b ...
mainframes, this type of program is known as a ''linkage editor''. As the name implies a linkage ''editor'' has the additional capability of allowing the addition, replacement, and/or deletion of individual program sections. Operating systems such as OS/360 have format for executable load-modules containing supplementary data about the component sections of a program, so that an individual program section can be replaced, and other parts of the program updated so that relocatable addresses and other references can be corrected by the linkage editor, as part of the process. One advantage of this is that it allows a program to be maintained without having to keep all of the intermediate object files, or without having to re-compile program sections that haven't changed. It also permits program updates to be distributed in the form of small files (originally card decks), containing only the object module to be replaced. In such systems, object code is in the form and format of 80-byte punched-card images, so that updates can be introduced into a system using that medium. In later releases of OS/360 and in subsequent systems, load-modules contain additional data about versions of components modules, to create a traceable record of updates. It also allows one to add, change, or remove an
overlay Overlay may refer to: Computers *Overlay network, a computer network which is built on top of another network *Hardware overlay, one type of video overlay that uses memory dedicated to the application *Another term for exec, replacing one process ...
structure from an already linked load module. The term "linkage editor" should not be construed as implying that the program operates in a user-interactive mode like a text editor. It is intended for batch-mode execution, with the editing commands being supplied by the user in sequentially organized files, such as
punched card A punched card (also punch card or punched-card) is a piece of stiff paper that holds digital data represented by the presence or absence of holes in predefined positions. Punched cards were once common in data processing applications or to di ...
s,
DASD A direct-access storage device (DASD) (pronounced ) is a secondary storage device in which "each physical record has a discrete location and a unique address". The term was coined by IBM to describe devices that allowed random access to data, t ...
, or
magnetic tape Magnetic tape is a medium for magnetic storage made of a thin, magnetizable coating on a long, narrow strip of plastic film. It was developed in Germany in 1928, based on the earlier magnetic wire recording from Denmark. Devices that use magne ...
, and tapes were often used during the initial installation of the OS. ''Linkage editing'' ( IBM nomenclature) or ''consolidation'' or ''collection'' ( ICL nomenclature) refers to the ''linkage editor's'' or ''consolidator's'' act of combining the various pieces into a relocatable binary, whereas the loading and relocation into an absolute binary at the target address is normally considered a separate step.


Common implementations

On Unix and Unix-like systems, the linker is known as "ld". Origins of the name "ld" are "LoaDer" and "Link eDitor". The term "loader" was used to describe the process of loading external symbols from other programs during the process of linking.


GNU linker

The GNU linker (or GNU ld) is the
GNU Project The GNU Project () is a free software, mass collaboration project announced by Richard Stallman on September 27, 1983. Its goal is to give computer users freedom and control in their use of their computers and computing devices by collaborati ...
's
free software Free software or libre software is computer software distributed under terms that allow users to run the software for any purpose as well as to study, change, and distribute it and any adapted versions. Free software is a matter of liberty, no ...
implementation of the Unix command ld. GNU ld runs the linker, which creates an executable file (or a library) from object files created during compilation of a software project. A ''linker script'' may be passed to GNU ld to exercise greater control over the linking process. The GNU linker is part of the
GNU Binary Utilities The GNU Binary Utilities, or , are a set of programming tools for creating and managing binary programs, object files, libraries, profile data, and assembly source code. Tools They were originally written by programmers at Cygnus Solutions. ...
(binutils). Two versions of ld are provided in binutils: the traditional GNU ld based on bfd, and a "streamlined" ELF-only version called
gold Gold is a chemical element with the symbol Au (from la, aurum) and atomic number 79. This makes it one of the higher atomic number elements that occur naturally. It is a bright, slightly orange-yellow, dense, soft, malleable, and ductile met ...
. The command-line and linker script syntaxes of GNU ld is the ''de facto'' standard in much of the
Unix-like A Unix-like (sometimes referred to as UN*X or *nix) operating system is one that behaves in a manner similar to a Unix system, although not necessarily conforming to or being certified to any version of the Single UNIX Specification. A Unix-li ...
world. The
LLVM LLVM is a set of compiler and toolchain technologies that can be used to develop a front end for any programming language and a back end for any instruction set architecture. LLVM is designed around a language-independent intermediate represen ...
project's linker, ', is designed to be drop-in compatible, and may be used directly with the GNU compiler. Another drop-in replacement, mold, is a highly parallelized and faster alternative which is also supported by GNU tools.


See also

*
Binary File Descriptor library The Binary File Descriptor library (BFD) is the GNU Project's main mechanism for the portable manipulation of object files in a variety of formats. , it supports approximately 50 file formats for some 25 instruction set architectures. History Wh ...
(libbfd) *
Compile and go system In computer programming, a compile and go system, compile, load, and go system, assemble and go system, or load and go system is a programming language processor in which the compiler, compilation, assembler (computer programming), assembly, or link ...
*
DLL hell In computing, DLL Hell is a term for the complications that arise when one works with dynamic-link libraries (DLLs) used with Microsoft Windows operating systems, particularly legacy 16-bit editions, which all run in a single memory space. DLL Hel ...
*
Direct binding Direct binding is a feature of the linker and dynamic linker on Solaris and OpenSolaris. It provides a method to allow libraries to directly bind symbols to other libraries, rather than weakly bind to them and leave the dynamic linker to figure ...
*
Dynamic binding Dynamic binding may refer to: * Dynamic binding (computing), also known as late binding *Dynamic scoping in programming languages * Dynamic binding (chemistry) See also *Dynamic dispatch *Dynamic linking In computing, a dynamic linker is the par ...
*
Dynamic dead code elimination In compiler theory, dead-code elimination (also known as DCE, dead-code removal, dead-code stripping, or dead-code strip) is a compiler optimization to remove code which does not affect the program results. Removing such code has several benefits: ...
*
Dynamic dispatch In computer science, dynamic dispatch is the process of selecting which implementation of a polymorphic operation (method or function) to call at run time. It is commonly employed in, and considered a prime characteristic of, object-oriented ...
*
Dynamic library In computing, a dynamic linker is the part of an operating system that loads and links the shared libraries needed by an executable when it is executed (at "run time"), by copying the content of libraries from persistent storage to RAM, filling ...
*
Dynamic linker In computing, a dynamic linker is the part of an operating system that loads and links the shared libraries needed by an executable when it is executed (at "run time"), by copying the content of libraries from persistent storage to RAM, filling ...
*
Dynamic loading Dynamic loading is a mechanism by which a computer program can, at run time, load a library (or other binary) into memory, retrieve the addresses of functions and variables contained in the library, execute those functions or access those varia ...
*
Dynamic-link library Dynamic-link library (DLL) is Microsoft's implementation of the shared library concept in the Microsoft Windows and OS/2 operating systems. These libraries usually have the file extension DLL, OCX (for libraries containing ActiveX controls), o ...
*
External variable In the C programming language, an external variable is a variable defined outside any function block. On the other hand, a local (automatic) variable is a variable defined inside a function block. Definition, declaration and the extern keywor ...
*
Library A library is a collection of materials, books or media that are accessible for use and not just for display purposes. A library provides physical (hard copies) or digital access (soft copies) materials, and may be a physical location or a vir ...
*
Loader Loader can refer to: * Loader (equipment) * Loader (computing) ** LOADER.EXE, an auto-start program loader optionally used in the startup process of Microsoft Windows ME * Loader (surname) * Fast loader * Speedloader * Boot loader ** LOADER.COM ...
*
Name decoration In compiler construction, name mangling (also called name decoration) is a technique used to solve various problems caused by the need to resolve unique names for programming entities in many modern programming languages. It provides a way of e ...
*
Prelinking In computing, prebinding, also called prelinking, is a method for optimizing application load times by resolving library symbols prior to launch. Background Most computer programs consist of code that requires external Shared library, shared li ...
(prebinding) * Relocation * Smart linking *
Static library In computer science, a static library or statically-linked library is a set of routines, external functions and variables which are resolved in a caller at compile-time and copied into a target application by a compiler, linker, or binder, produci ...
*
Gold (linker) In software engineering, gold is a linker for ELF files. It became an official GNU package and was added to binutils in March 2008 and first released in binutils version 2.19. gold was developed by Ian Lance Taylor and a small team at Google. The ...


References


Further reading

* * * * Code

ftp://ftp.iecc.com/pub/linker/] Errata
https://archive.today/20200114224817/https://linker.iecc.com/ 2020-01-14 -->
* (19 pages) *


External links


Ian Lance Justin's ''Linkers'' blog entries

Linkers and Loaders
a
Linux Journal ''Linux Journal'' (''LJ'') is an American monthly technology magazine originally published by Specialized System Consultants, Inc. (SSC) in Seattle, Washington since 1994. In December 2006 the publisher changed to Belltown Media, Inc. in Houston ...
article by Sandeep Grover
Another Listing of Where to Get a Complete Collection of Free Tools for Assembly Language Development



LLD - The LLVM Linker
* {{Authority control Compilers Computer libraries Programming language implementation Utility software types