Piper is a
centralized version control system used by
Google
for its internal software development. Originally designed for
Linux
, it has supported
Microsoft Windows
and
macOS
since October 2012.
Scale
Since its founding, Google has used a central
codebase
shared by its developers. For over 10 years, Google relied on a single
Perforce
instance, using proprietary caching for scalability. This mode of operation was kept as Google grew, but the need for further scaling led to the development of Piper. Google's version control "is an extreme case": as of 2016, the repository stored 86
terabytes
of data comprising two billion
lines of code in nine million files (two
orders of magnitude
more than in the
Linux kernel
repository). 25 thousand developers contributed 16 thousand changes daily, with an additional 24 thousand
daily commits made by bots. Daily read requests are measured in billions.
Architecture
Piper uses regular
Google Cloud storage solutions, originally
Bigtable
and later
Spanner
, distributed across 10 data centers worldwide and replicated using the
Paxos protocol.
Use
When using Piper, developers apply changes to a local copy of files, similar to a ''working copy'' of
Subversion
, ''local clone'' of
Git
, or a ''client'' of Perforce. Updates made by other developers can be
pulled from the central repository and merged into the local code. Commits are allowed only after code review.
Typical use involves Clients in the Cloud (CitC). This system uses a cloud backend and a local
FUSE filesystem to create the illusion of changes overlaid on top of the full repository. This approach enables seamless browsing and the use of standard Unix tools without explicit synchronization operations, which keeps the local copy very small (the average local copy holds fewer than ten files). All file writes are mapped to
snapshots, so previous states of the code can be restored without explicit snapshotting. Because CitC is always connected, it makes it easy to switch computers and to share modified code with other developers, the automated
build system
and testing tools. As a result, the majority of Google developers practice
trunk-based development
with no personal
branches
; branches are mostly used for releases.
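The overlay behaviour described for CitC can be illustrated with a small, purely hypothetical Python model. It is not Google's implementation and uses no FUSE or cloud backend; the class `OverlayWorkspace` and its method names are assumptions made for this sketch. It only shows the idea that reads fall through to the full repository, writes stay in a small local overlay, and each write implicitly records a snapshot that can be restored later.

```python
from dataclasses import dataclass, field


@dataclass
class OverlayWorkspace:
    """Hypothetical model of a CitC-style workspace (illustration only)."""

    base: dict[str, str]  # read-only view of the full repository
    overlay: dict[str, str] = field(default_factory=dict)  # modified files only
    snapshots: list[dict[str, str]] = field(default_factory=list)

    def read(self, path: str) -> str:
        # Locally modified files shadow the repository; all others fall through.
        if path in self.overlay:
            return self.overlay[path]
        return self.base[path]

    def write(self, path: str, content: str) -> int:
        # Each write records a copy of the overlay, so earlier states can be
        # restored without an explicit snapshot command.
        self.overlay[path] = content
        self.snapshots.append(dict(self.overlay))
        return len(self.snapshots) - 1  # id of the snapshot just taken

    def restore(self, snapshot_id: int) -> None:
        # Roll the overlay back to the state captured by an earlier write.
        self.overlay = dict(self.snapshots[snapshot_id])

    def local_size(self) -> int:
        # Number of files actually held locally (the modified ones).
        return len(self.overlay)


repo = {"a/main.py": "print('v1')", "b/util.py": "x = 1"}
ws = OverlayWorkspace(base=repo)
first = ws.write("a/main.py", "print('v2')")
ws.write("b/util.py", "x = 2")
ws.restore(first)  # undo the second write only
```

In this model the repository dictionary is never modified, and the local state contains only the files the developer has touched, which mirrors why a real CitC workspace can stay very small.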
Security
Most of the codebase is visible to all developers; sensitive individual files (less than 1% as of 2016) are access-controlled. All operations with Piper are logged, and accidentally committed files can be purged.
Open-source clone
Piper is
proprietary software
. Mega, a
Git
-compatible
open-source
clone of Piper, is available on
GitHub
. It supports trunk-based development,
Conventional Commits and
code owners.