The archiver, also known simply as ar, is a Unix utility that maintains groups of files as a single archive file. Today, ar is generally used only to create and update static library files that the link editor or linker uses, and to generate .deb packages for the Debian family; it can be used to create archives for any purpose, but has been largely replaced by tar for purposes other than static libraries.
An implementation of ar is included as one of the GNU Binutils.
In the Linux Standard Base (LSB), ar has been deprecated and is expected to disappear in a future release of that standard. The rationale provided was that "the LSB does not include software development utilities nor does it specify .o and .a file formats."
File format details
The ar format has never been standardized; modern archives are based on a common format with two main variants, BSD and System V (initially known as COFF, and used as well by GNU, ELF, and Windows). Historically there have been other variants, including those of Version 6 Unix (V6), V7, AIX (small and big), and Coherent, which all vary significantly from the common format.
Debian ".deb" archives use the common format.
An ar file begins with a global header, followed by a header and data section for each file stored within the ar file.
Each data section is 2-byte aligned. If it would end on an odd offset, a newline ('\n', 0x0A) is used as filler.
File signature
The file signature is a single field containing the magic ASCII string "!<arch>" followed by a single LF control character (0x0A).
File header
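Each file stored in an ar archive includes a file header to store information about the file. In the common format the header is 60 bytes long: a 16-byte file identifier, a 12-byte decimal modification timestamp (in seconds), a 6-byte decimal owner ID, a 6-byte decimal group ID, an 8-byte octal file mode, a 10-byte decimal file size in bytes, and the two ending characters 0x60 0x0A. Numeric values are encoded in ASCII and all values are right-padded with ASCII spaces (0x20).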
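As an illustration of the common header layout, the following Python sketch decodes one 60-byte header. The names `ArHeader` and `parse_header` are hypothetical and chosen for this example only; the field widths assumed here are 16, 12, 6, 6, 8, 10 and 2 bytes.

```python
import struct
from typing import NamedTuple

# Assumed common-format layout: name, mtime, uid, gid, mode, size, terminator.
HEADER_STRUCT = struct.Struct("=16s12s6s6s8s10s2s")  # 60 bytes in total


class ArHeader(NamedTuple):
    name: str
    mtime: int
    uid: int
    gid: int
    mode: int
    size: int


def _num(field: bytes, base: int = 10) -> int:
    """Decode a space-padded ASCII number; an empty field is treated as 0."""
    text = field.decode("ascii").strip()
    return int(text, base) if text else 0


def parse_header(raw: bytes) -> ArHeader:
    """Parse one 60-byte member header (hypothetical helper)."""
    name, mtime, uid, gid, mode, size, end = HEADER_STRUCT.unpack(raw)
    if end != b"`\n":  # ending characters 0x60 0x0A
        raise ValueError("bad header terminator")
    return ArHeader(
        name.decode("ascii").rstrip(" "),
        _num(mtime),
        _num(uid),
        _num(gid),
        _num(mode, 8),  # the mode field is octal
        _num(size),
    )
```

The name is returned exactly as stored; variant-specific conventions such as a trailing "/" are left to the caller.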
As the headers only include printable ASCII characters and line feeds, an archive containing only text files therefore still appears to be a text file itself.
The members are aligned to even byte boundaries. "Each archive file member begins on an even byte boundary; a newline is inserted between files if necessary. Nevertheless, the size given reflects the actual size of the file exclusive of padding."
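The alignment rule can be sketched as a small Python generator that walks the members of an in-memory archive. The function name `iter_members` is hypothetical, and the sketch assumes the common format with a 60-byte header whose size field occupies bytes 48 to 57.

```python
def iter_members(data: bytes):
    """Yield (raw_name, payload) for each member of a common-format archive."""
    if data[:8] != b"!<arch>\n":
        raise ValueError("not an ar archive")
    pos = 8
    while pos + 60 <= len(data):
        header = data[pos:pos + 60]
        name = header[:16].decode("ascii").rstrip(" ")
        size = int(header[48:58].decode("ascii").strip())
        start = pos + 60
        yield name, data[start:start + size]
        # The size excludes padding, so skip one filler byte after odd sizes.
        pos = start + size + (size & 1)
```

Because the recorded size excludes the filler byte, the payload is returned without it and only the position is rounded up to the next even offset.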
Due to the limitations of file name length and format, both the GNU and BSD variants devised different methods of storing long filenames. A description of these extensions is found in libbfd. Although the common format does not suffer from the year 2038 problem, many implementations of the ar utility do and may need to be modified in the future to correctly handle timestamps in excess of 2147483647.
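A brief Python sketch of why the format itself is not affected: the timestamp is a 12-character decimal ASCII field, so it can hold values beyond 2147483647, whereas a signed 32-bit integer (as an implementation might use internally) cannot. The helper names are hypothetical.

```python
import struct

INT32_MAX = 2**31 - 1  # 2147483647


def encode_mtime(timestamp: int) -> bytes:
    """Format a timestamp as a 12-byte, space-padded decimal field."""
    field = b"%-12d" % timestamp
    if len(field) != 12:
        raise ValueError("timestamp does not fit in 12 characters")
    return field


def fits_signed_32(timestamp: int) -> bool:
    """Report whether the value survives storage in a signed 32-bit integer."""
    try:
        struct.pack(">i", timestamp)
        return True
    except struct.error:
        return False
```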
Depending on the format, many ar implementations include a global symbol table (aka armap, directory or index) for fast linking without needing to scan the whole archive for a symbol. POSIX recognizes this feature, and requires ar implementations to have an option for updating it. Most implementations put it at the first file entry.
BSD variant
BSD ar stores filenames right-padded with ASCII spaces. This causes issues with spaces inside filenames.
4.4BSD ar stores extended filenames by placing the string "#1/" followed by the file name length in the file name field, and storing the real filename in front of the data section.
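The "#1/" convention might be decoded as in the following Python sketch. The function name `bsd_member_name` is hypothetical. The sketch assumes the caller passes the complete data section as sized by the header, and it strips trailing NUL bytes from the name on the assumption that some implementations pad the stored name.

```python
def bsd_member_name(name_field: bytes, data: bytes):
    """Return (filename, payload) for a BSD-style member.

    name_field is the raw 16-byte name field and data is the member's
    full data section, which begins with the real filename when the
    name field has the form "#1/<length>".
    """
    text = name_field.decode("ascii").rstrip(" ")
    if text.startswith("#1/"):
        length = int(text[3:])
        # Assumption: the stored name may be NUL-padded.
        name = data[:length].rstrip(b"\0").decode("utf-8")
        return name, data[length:]
    return text, data
```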
The BSD ar utility traditionally does not handle the building of a global symbol lookup table, and delegates this task to a separate utility named ranlib, which inserts an architecture-specific file named __.SYMDEF as the first archive member. Some descendants put a space and "SORTED" after the name to indicate a sorted version. A 64-bit variant called __.SYMDEF_64 exists on Darwin.
Since POSIX added the requirement for the -s option as a replacement for ranlib, however, newer BSD ar implementations have been rewritten to have this feature. FreeBSD in particular abandoned the SYMDEF table format and adopted the System V style table.
System V (or GNU) variant
System V ar uses a '/' character (0x2F) to mark the end of the filename; this allows for the use of spaces without the use of an extended filename. Multiple extended filenames are stored in the data section of a member named "//", and later headers refer to this record. A header references an extended filename by storing a "/" followed by a decimal offset to the start of the filename in the extended filename data section. The "//" member itself is simply a list of the long filenames, each separated by one or more LF characters. Note that the decimal offsets count characters, not lines or strings, within the "//" member. This member is usually the second entry of the archive, after the symbol table, which is always the first.
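The name resolution described above can be sketched in Python as follows. The function name `sysv_member_name` is hypothetical. Stripping a trailing "/" from entries in the "//" data is an assumption based on how GNU ar writes them.

```python
def sysv_member_name(name_field: bytes, long_names: bytes = b"") -> str:
    """Resolve a System V style name field to a filename.

    long_names is the data section of the "//" member, if present.
    """
    text = name_field.decode("ascii").rstrip(" ")
    if text in ("/", "//"):
        return text  # special members: symbol table and long-name table
    if text.startswith("/") and text[1:].isdigit():
        offset = int(text[1:])  # character offset into the "//" data
        end = long_names.find(b"\n", offset)
        if end == -1:
            end = len(long_names)
        # Assumption: entries may carry a trailing "/" before the LF.
        return long_names[offset:end].decode("ascii").rstrip("/")
    # Short names are terminated by "/", so embedded spaces survive.
    return text[:-1] if text.endswith("/") else text
```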
System V ar uses the special filename "/" to denote that the following data entry contains a symbol lookup table, which is used in ar libraries to speed up access. This symbol table is built in three parts which are recorded together as contiguous data.
# A 32-bit big endian integer, giving the number of entries in the table.
# A set of 32-bit big endian integers. One for each symbol, recording the position within the archive of the header for the file containing this symbol.
# A set of zero-terminated strings. Each is a symbol name, and occurs in the same order as the list of positions in part 2.
Some System V systems do not use the format described above for the symbol lookup table. For operating systems such as HP-UX 11.0, this information is stored in a data structure based on the SOM file format.
The special file "/" is not terminated with a specific sequence; the end is assumed once the last symbol name has been read.
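A minimal Python sketch of reading the three-part, 32-bit symbol table from the data section of the "/" member. The function name `parse_sysv_symtab` is hypothetical.

```python
import struct


def parse_sysv_symtab(data: bytes):
    """Return a list of (symbol_name, header_offset) pairs."""
    # Part 1: 32-bit big-endian entry count.
    (count,) = struct.unpack_from(">I", data, 0)
    # Part 2: one 32-bit big-endian archive offset per symbol.
    offsets = struct.unpack_from(f">{count}I", data, 4)
    # Part 3: zero-terminated names, in the same order as the offsets.
    # There is no end marker, so reading stops after `count` names.
    names = data[4 + 4 * count:].split(b"\0")[:count]
    return [(name.decode("ascii"), off) for name, off in zip(names, offsets)]
```

Each returned offset is the position within the archive of the member header that defines the symbol, so a linker can seek straight to that member.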
To overcome the 4 GiB file size limit, some operating systems such as Solaris 11.2 and GNU use a variant lookup table. Instead of 32-bit integers, 64-bit integers are used in the symbol lookup tables. The string "/SYM64/" instead of "/" is used as the identifier for this table.
Windows variant
The Windows (PE/COFF) variant is based on the SysV/GNU variant. The first entry "/" has the same layout as the SysV/GNU symbol table. The second entry is another "/", a Microsoft ECOFF extension that stores an extended symbol cross-reference table. This one is sorted and uses little-endian integers. The third entry is the optional "//" long name data as in SysV/GNU.
Thin archive
The versions of ar in GNU binutils and Elfutils have an additional "thin archive" format with the magic number "!<thin>". A thin archive only contains a symbol table and references to the files. The file format is essentially a System V format archive where every file is stored without the data sections. Every filename is stored as a "long" filename and they are to be resolved as if they were symbolic links.
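A reader can tell the two kinds of archive apart from the first eight bytes alone, as in this Python sketch. The function name `archive_kind` is hypothetical, and the sketch assumes both magic strings are followed by a single LF.

```python
REGULAR_MAGIC = b"!<arch>\n"
THIN_MAGIC = b"!<thin>\n"


def archive_kind(data: bytes) -> str:
    """Classify a buffer as a 'regular' or 'thin' archive, else 'unknown'."""
    if data.startswith(REGULAR_MAGIC):
        return "regular"
    if data.startswith(THIN_MAGIC):
        return "thin"
    return "unknown"
```

For a thin archive, a reader would then resolve each member name to a path on disk instead of reading a data section from the archive itself.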
Example usage
To create an archive from the files class1.o, class2.o, and class3.o, the following command would be used:
ar rcs libclass.a class1.o class2.o class3.o
Unix linkers, usually invoked through the C compiler cc, can read ar files and extract object files from them, so if libclass.a is an archive containing class1.o, class2.o and class3.o, then
cc main.c libclass.a
or (if libclass.a is placed in a standard library path)
cc main.c -lclass
or (during linking)
ld ... main.o -lclass ...
is the same as:
cc main.c class1.o class2.o class3.o
See also
*.deb
*Archive formats
*List of Unix commands
External links
* ''The 32-bit PA-RISC Run-time Architecture Document, HP-UX 11.0 Version 1.0,'' Hewlett-Packard, 1997. See ''Chapter 4: Relocatable Libraries'' (devresource.hp.com).