Reproducible builds, also known as deterministic compilation, is a process of compiling software which ensures the resulting binary code can be reproduced. Source code compiled using deterministic compilation will always output the same binary.
Reproducible builds can act as part of a chain of trust; the source code can be signed, and deterministic compilation can prove that the binary was compiled from trusted source code.
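As a rough illustration of how such a check might look in practice, the sketch below compares the SHA-256 digests of two independently built artifacts. The function names `sha256_of` and `builds_match` are hypothetical and chosen only for this example; real verification workflows typically also check signatures on the source and on the published digests.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def builds_match(first: Path, second: Path) -> bool:
    """Return True when two build outputs are bit-for-bit identical."""
    return sha256_of(first) == sha256_of(second)
```

If two parties build the same signed source and obtain matching digests, each has some evidence that the distributed binary corresponds to that source.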
Methods
For the compilation process to be deterministic, the input to the compiler must be the same, regardless of the build environment used. This typically involves normalizing variables that may change, such as the order of input files, timestamps, locales, and paths.
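A minimal sketch of this kind of normalization, assuming the build output is a plain tar archive assembled from in-memory files: the hypothetical helper `deterministic_tar` sorts the member names and pins the timestamp, ownership, and mode of every entry. It does not address locales or embedded paths, which need separate handling.

```python
import io
import tarfile
from typing import Mapping


def deterministic_tar(files: Mapping[str, bytes], mtime: int = 0) -> bytes:
    """Pack files into a tar archive whose bytes depend only on the inputs."""
    buffer = io.BytesIO()
    # USTAR avoids extended headers that could carry extra metadata.
    with tarfile.open(fileobj=buffer, mode="w", format=tarfile.USTAR_FORMAT) as archive:
        for name in sorted(files):  # fixed member order
            data = files[name]
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            info.mtime = mtime  # fixed timestamp instead of the current time
            info.mode = 0o644
            info.uid = info.gid = 0
            info.uname = info.gname = ""
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()
```

Calling the helper twice with the same contents, supplied in any order, should yield identical bytes.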
Additionally, the compilers must not introduce non-determinism themselves. This sometimes happens when using hash tables with a random hash seed value. It can also happen when using the address of variables, because addresses vary under address space layout randomization (ASLR).
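The hash-seed effect can be observed with CPython itself, which randomizes string hashing per process unless the `PYTHONHASHSEED` environment variable is set. The sketch below is only an illustration of the phenomenon, not a description of any particular compiler; `run_with_seed` and `CHILD` are names made up for this example.

```python
import os
import subprocess
import sys

# The child prints a set of strings; its iteration order depends on string hashes.
CHILD = "print(list({'alpha', 'beta', 'gamma', 'delta'}))"


def run_with_seed(seed: str) -> str:
    """Run a child interpreter with the given PYTHONHASHSEED and return its output."""
    env = dict(os.environ, PYTHONHASHSEED=seed)
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

With a fixed seed, repeated runs print the elements in the same order. With the seed set to `random`, the order may differ from one run to the next, which is the kind of variation that can leak into build output.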
Build systems, such as Bazel and Gitian, can be used to automate deterministic build processes.
History
The GNU Project used reproducible builds in the early 1990s. Changelogs from 1992 indicate the ongoing effort.
One of the older projects to promote reproducible builds is the Bitcoin project with Gitian. Later, in 2013, the Tor project started using Gitian for its reproducible builds.
In July 2013, the Debian project started implementing reproducible builds across its entire package archive.
By July 2017, more than 90% of the packages in the repository had been proven to build reproducibly.
In November 2018, the Reproducible Builds project joined the Software Freedom Conservancy.
F-Droid uses reproducible builds to provide a guarantee that the distributed APKs use the claimed free source code.
The Tails portable operating system uses reproducible builds and explains to others how to verify their distribution.
NixOS claimed a 100% reproducible build in June 2021.
Challenges
According to the Reproducible Builds project, timestamps are "the biggest source of reproducibility issues. Many build tools record the current date and time... and most archive formats will happily record modification times on top of their own timestamps." They recommend that "it is better to use a date that is relevant to the source code instead of the build: old software can always be built later" if it is reproducible. They identify several ways to modify build processes to do this:
* Set the SOURCE_DATE_EPOCH environment variable to the number of seconds since January 1, 1970 (UTC), using a date taken from the source code. Tools that support this environment variable will use its value (when set) instead of the current date and time.
* Post-process output to remove timestamps or normalize them. The tool strip-nondeterminism can often help do this.
* Use a library like libfaketime to intercept requests for the current time of day and provide a controlled response.
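A build script that embeds a date could honor the first of these options roughly as follows. This is a sketch under the assumption that the script is written in Python; `build_timestamp` is a hypothetical helper, and the optional `environ` parameter exists only to make the function easy to exercise without changing the real environment.

```python
import os
import time
from datetime import datetime, timezone
from typing import Mapping, Optional


def build_timestamp(environ: Optional[Mapping[str, str]] = None) -> datetime:
    """Return the build date, preferring SOURCE_DATE_EPOCH over the wall clock."""
    env = os.environ if environ is None else environ
    raw = env.get("SOURCE_DATE_EPOCH")
    # Fall back to the current time only when the variable is not set.
    seconds = int(raw) if raw is not None else int(time.time())
    # Always format in UTC so the local time zone cannot affect the output.
    return datetime.fromtimestamp(seconds, tz=timezone.utc)
```

For example, `build_timestamp({"SOURCE_DATE_EPOCH": "0"}).isoformat()` gives `1970-01-01T00:00:00+00:00` regardless of when or where the build runs.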
In some cases other changes must be made to make a build process reproducible. For example, some data structures do not guarantee a stable order in each execution. A typical solution is to modify the build process to specify a sorted output from those structures.
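A small sketch of that approach, assuming the build step writes out a list of symbol names collected in a set: the hypothetical function `render_symbol_list` sorts the names before emitting them, so the output no longer depends on the iteration order of the underlying structure.

```python
from typing import Iterable


def render_symbol_list(symbols: Iterable[str]) -> str:
    """Emit one unique symbol per line in sorted order, independent of input order."""
    return "".join(f"{name}\n" for name in sorted(set(symbols)))
```

Python's default string sort compares code points rather than using locale collation, so the result should also be unaffected by the locale of the build machine.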
See also
* Bootstrappable builds
External links
reproducible-builds.org
Debian Reproducible Builds