In software development, distributed version control (also known as distributed revision control) is a form of version control in which the complete codebase, including its full history, is mirrored on every developer's computer.
Compared to centralized version control, this enables automatic management of branching and merging, speeds up most operations (except pushing and pulling), improves the ability to work offline, and does not rely on a single location for backups.
Git, the world's most popular version control system, is a distributed version control system.
In 2010, software development author Joel Spolsky described distributed version control systems as "possibly the biggest advance in software development technology in the past ten years".
Distributed vs. centralized
Distributed version control systems (DVCS) use a peer-to-peer approach to version control, as opposed to the client–server approach of centralized systems. Distributed revision control synchronizes repositories by transferring patches from peer to peer. There is no single central version of the codebase; instead, each user has a working copy and the full change history.
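The peer-to-peer synchronization described above can be illustrated with a minimal sketch. It is a toy model, not how any particular DVCS is implemented: the `Commit` and `Repository` classes, the `pull_from` method, and the commit ids are all hypothetical names chosen for illustration, and each repository is reduced to a dictionary of commits.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Commit:
    """One recorded change; `parent` is the id of the previous commit, if any."""
    id: str
    parent: Optional[str]
    message: str


@dataclass
class Repository:
    """A peer's full copy: every commit it knows about, keyed by id."""
    owner: str
    commits: Dict[str, Commit] = field(default_factory=dict)

    def commit(self, id: str, parent: Optional[str], message: str) -> Commit:
        # Recording a change is purely local: no other peer is contacted.
        new_commit = Commit(id, parent, message)
        self.commits[id] = new_commit
        return new_commit

    def pull_from(self, other: "Repository") -> List[str]:
        """Copy every commit that `other` has and this peer lacks."""
        missing = [cid for cid in other.commits if cid not in self.commits]
        for cid in missing:
            self.commits[cid] = other.commits[cid]
        return sorted(missing)


alice = Repository("alice")
bob = Repository("bob")

alice.commit("a1", None, "initial import")
bob.pull_from(alice)                     # bob now mirrors alice's history
alice.commit("a2", "a1", "fix typo")     # both peers keep working
bob.commit("b1", "a1", "add feature")    # independently, without a network

# Communication happens only when the peers choose to share changes.
transferred_to_alice = alice.pull_from(bob)
transferred_to_bob = bob.pull_from(alice)
```

After the two exchanges both peers hold the same three commits, and neither copy is more authoritative than the other. Reconciling the two parallel lines of work (merging) is deliberately left out of the sketch.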
Advantages of DVCS (compared with centralized systems) include:
* Allows users to work productively when not connected to a network.
* Common operations (such as commits, viewing history, and reverting changes) are faster for DVCS, because there is no need to communicate with a central server. With DVCS, communication is necessary only when sharing changes with other peers.
* Allows private work, so users can commit their changes even for early drafts they do not want to publish.
* Working copies effectively function as remote backups, which avoids relying on one physical machine as a single point of failure.
* Allows various development models to be used, such as using development branches or a Commander/Lieutenant model.
* Permits centralized control of the "release version" of the project.
* On FOSS projects, it is much easier to create a project fork from a project that is stalled because of leadership conflicts or design disagreements.
Disadvantages of DVCS (compared with centralized systems) include:
* Initial checkout of a repository is slower than checkout in a centralized version control system, because all branches and revision history are copied to the local machine by default.
* Lack of the locking mechanisms that are part of most centralized VCSs and that still play an important role for non-mergeable binary files, such as graphic assets or complex single-file binary or XML packages (e.g. office documents, PowerBI files, SQL Server Data Tools BI packages).
* Additional storage is required for every user to have a complete copy of the codebase history.
* Increased exposure of the code base since every participant has a locally vulnerable copy.
Some originally centralized systems now offer some distributed features. For example, Subversion is able to do many operations with no network. Team Foundation Server and Visual Studio Team Services now host both centralized and distributed version control repositories by hosting Git.
Similarly, some distributed systems now offer features that mitigate the issues of checkout times and storage costs, such as the Virtual File System for Git developed by Microsoft to work with very large codebases, which exposes a virtual file system that downloads files to local storage only as they are needed.
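The idea of downloading files only as they are needed can be sketched as follows. This is an illustration of on-demand fetching in general, not of how the Virtual File System for Git is implemented: the `LazyWorkingCopy` class, its methods, and the file names are hypothetical, and the "server" is just an in-memory dictionary.

```python
from typing import Dict, List


class LazyWorkingCopy:
    """Toy stand-in for a virtualized checkout (all names are hypothetical)."""

    def __init__(self, remote_files: Dict[str, bytes]) -> None:
        self._remote = remote_files          # pretend this lives on a server
        self._local: Dict[str, bytes] = {}   # contents actually stored locally
        self.downloads: List[str] = []       # log of paths fetched so far

    def listdir(self) -> List[str]:
        # The full tree is visible without downloading any file contents.
        return sorted(self._remote)

    def read(self, path: str) -> bytes:
        # Fetch on first access only; later reads are served from local storage.
        if path not in self._local:
            self._local[path] = self._remote[path]
            self.downloads.append(path)
        return self._local[path]


remote = {
    "README": b"hello",
    "src/main.c": b"int main(void) { return 0; }",
    "assets/big.bin": b"\0" * 1024,
}
working_copy = LazyWorkingCopy(remote)
first_read = working_copy.read("README")
second_read = working_copy.read("README")   # no second download
```

In this sketch the large `assets/big.bin` entry is listed but never transferred, which is the effect that reduces checkout time and local storage for very large codebases.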
Work model
The distributed model is generally better suited for large projects with partly independent developers, such as the Linux kernel project, because developers can work independently and submit their changes for merge (or rejection). The distributed model flexibly allows adopting custom source code contribution workflows. The integrator workflow is the most widely used. In the centralized model, developers must serialize their work, to avoid problems with different versions.
Central and branch repositories
In a truly distributed project, such as Linux, every contributor maintains their own version of the project, with different contributors hosting their own respective versions and pulling in changes from other users as needed, resulting in a general consensus emerging from multiple different nodes. This also makes the process of "forking" easy, as all that is required is for one contributor to stop accepting pull requests from other contributors and let the codebases gradually grow apart.
This arrangement, however, can be difficult to maintain, resulting in many projects choosing to shift to a paradigm in which one contributor's repository is the universal "upstream", a repository from which changes are almost always pulled. Under this paradigm, development is somewhat recentralized, as every project now has a central repository that is informally considered the official repository, managed collectively by the project maintainers. While distributed version control systems make it easy for new developers to "clone" a copy of any other contributor's repository, in a central model, new developers always clone the central repository to create identical local copies of the codebase. Under this system, code changes in the central repository are periodically synchronized with the local repository, and once development is done, the change should be integrated into the central repository as soon as possible.
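The clone, synchronize, and integrate cycle of the central model can be sketched as below. The sketch makes strong simplifying assumptions that are not in the text above: history is a flat list of commit ids rather than a graph, a push is accepted only when it extends the central history, and `rebase_onto` simply replays local-only commits on top of the upstream history. The `Repo` class and all method names are hypothetical.

```python
from typing import List, Optional


class Repo:
    """Toy repository with a linear history (a deliberate simplification)."""

    def __init__(self, history: Optional[List[str]] = None) -> None:
        self.history: List[str] = list(history or [])

    def clone(self) -> "Repo":
        # A new developer starts from an identical copy of the central history.
        return Repo(self.history)

    def commit(self, commit_id: str) -> None:
        self.history.append(commit_id)

    def is_prefix_of(self, other: "Repo") -> bool:
        # True when `other` contains all of our history, in the same order.
        return other.history[: len(self.history)] == self.history

    def push(self, upstream: "Repo") -> bool:
        """Integrate local work; accepted only if it extends upstream."""
        if upstream.is_prefix_of(self):
            upstream.history = list(self.history)
            return True
        return False

    def rebase_onto(self, upstream: "Repo") -> None:
        """Synchronize: take upstream's history, then replay local-only commits."""
        local_only = [c for c in self.history if c not in upstream.history]
        self.history = list(upstream.history) + local_only


central = Repo(["c1"])
dev_a = central.clone()
dev_b = central.clone()

dev_a.commit("a1")
dev_b.commit("b1")

a_pushed = dev_a.push(central)        # central is still at c1, so this works
b_rejected = dev_b.push(central)      # central has moved on; dev_b is stale
dev_b.rebase_onto(central)            # synchronize with the central repository
b_pushed = dev_b.push(central)
```

The rejected push shows why integrating early matters in this model: the longer a local copy drifts from the central repository, the more synchronization work precedes the next successful integration. Real systems reconcile diverging histories with merges or rebases over a commit graph, which this sketch does not attempt.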
Organizations utilizing this centralized pattern often choose to host the central repository on a third-party service like GitHub, which not only offers more reliable uptime than self-hosted repositories, but can also add centralized features like issue trackers and continuous integration.
Pull requests
Contributions to a source code repository that uses a distributed version control system are commonly made by means of a pull request, also known as a merge request.
The contributor requests that the project maintainer pull the source code change, hence the name "pull request". The maintainer has to merge the pull request if the contribution should become part of the source base.
The developer creates a pull request to notify maintainers of a new change; a comment thread is associated with each pull request. This allows for focused discussion of code changes. Submitted pull requests are visible to anyone with repository access. A pull request can be accepted or rejected by maintainers.
Once the pull request is reviewed and approved, it is merged into the repository. Depending on the established workflow, the code may need to be tested before being included in an official release. Therefore, some projects contain a special branch for merging untested pull requests.
Other projects run an automated test suite on every pull request, using a continuous integration tool such as Travis CI, and the reviewer checks that any new code has appropriate test coverage.
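The pull request lifecycle described in this section can be sketched as a small state model. This is an illustrative assumption about one possible workflow, not the API of any hosting service: `PullRequest`, `Status`, `review`, `try_merge`, and the `run_tests` callback (a stand-in for a continuous integration run) are all hypothetical names.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List


class Status(Enum):
    OPEN = "open"
    MERGED = "merged"
    REJECTED = "rejected"


@dataclass
class PullRequest:
    title: str
    changes: List[str]                                   # proposed additions
    comments: List[str] = field(default_factory=list)    # the comment thread
    approved: bool = False
    status: Status = Status.OPEN


def review(pr: PullRequest, approve: bool, comment: str) -> None:
    """A maintainer comments on the request and accepts or rejects it."""
    pr.comments.append(comment)
    pr.approved = approve
    if not approve:
        pr.status = Status.REJECTED


def try_merge(pr: PullRequest, mainline: List[str],
              run_tests: Callable[[List[str]], bool]) -> bool:
    """Merge only an open, approved request whose result passes the tests."""
    if pr.status is not Status.OPEN or not pr.approved:
        return False
    if not run_tests(mainline + pr.changes):
        return False        # stays open so the contributor can fix it
    mainline.extend(pr.changes)
    pr.status = Status.MERGED
    return True


def run_tests(code: List[str]) -> bool:
    # Hypothetical stand-in for a CI run: fails if a "broken" change is present.
    return "broken" not in code


mainline = ["init"]

good = PullRequest("Add parser", ["parser"])
review(good, approve=True, comment="Looks good")
good_merged = try_merge(good, mainline, run_tests)

bad = PullRequest("Quick hack", ["broken"])
review(bad, approve=True, comment="Approved pending CI")
bad_merged = try_merge(bad, mainline, run_tests)
```

Here the second request is approved by a reviewer but still blocked by the automated tests, which corresponds to the projects that run a test suite on every pull request rather than merging untested changes into a special branch.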
History
The first open-source DVCSs included Arch, Monotone, and Darcs. However, open-source DVCSs were never very popular until the release of Git and Mercurial.
BitKeeper was used in the development of the Linux kernel from 2002 to 2005.
The development of Git, now the world's most popular version control system, was prompted by the decision of the company that made BitKeeper to rescind the free license that Linus Torvalds and some other Linux kernel developers had previously taken advantage of.