History
Torvalds started developing Git in April 2005 after the free ...
Naming
Torvalds sarcastically quipped about the name ''git'' (which means "unpleasant person" in ...
Characteristics
Design
Git's design is a synthesis of Torvalds's experience with Linux in maintaining a large distributed development project, along with his intimate knowledge of file-system performance gained from the same project and the urgent need to produce a working system in short order. These influences led to the following implementation choices:
; Strong support for non-linear development: Git supports rapid branching and merging, and includes specific tools for visualizing and navigating a non-linear development history. In Git, a core assumption is that a change will be merged more often than it is written, as it is passed around to various reviewers. In Git, branches are very lightweight: a branch is only a reference to one commit.
; Distributed development: Like ...
... git gc.
; Periodic explicit object packing: Git stores each newly created object as a separate file. Although individually compressed, this takes up a great deal of space and is inefficient. This is solved by the use of ''packs'' that store a large number of objects delta-compressed among themselves in one file (or network byte stream) called a ''packfile''. Packs are compressed using the git gc command. For data integrity, both the packfile and its index have an ... git fsck command.
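To see why compressing each object on its own is wasteful, the following minimal sketch compares compressing several near-identical file revisions one at a time with compressing them as a single stream. Python's zlib is only a stand-in here: Git's packfiles use their own delta encoding, and the sample data and the names `versions`, `loose_size` and `packed_size` are assumptions made up for illustration.

```python
import zlib

# Ten hypothetical revisions of one file: each revision appends a single line.
base = "".join(f"line {i}: some fairly repetitive source text\n" for i in range(40))
versions = [(base + f"change number {n}\n").encode() for n in range(10)]

# "Loose" storage: every revision is compressed on its own.
loose_size = sum(len(zlib.compress(v)) for v in versions)

# "Packed" storage (stand-in only): compress all revisions as one stream, so
# the compressor can reuse what the revisions have in common.
packed_size = len(zlib.compress(b"".join(versions)))

print(f"separately compressed: {loose_size} bytes")
print(f"compressed together:   {packed_size} bytes")
```

The second figure comes out smaller because the shared content is stored once, which is the same intuition behind storing objects delta-compressed among themselves.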
Another property of Git is that it snapshots directory trees of files. The earliest systems for tracking versions of source code, ...
Downsides
These implicit revision relationships have some significant consequences:
* It is slightly more costly to examine the change history of one file than the whole project. To obtain a history of changes affecting a given file, Git must walk the global history and then determine whether each change modified that file. This method of examining history does, however, let Git produce with equal efficiency a single history showing the changes to an arbitrary set of files. For example, a subdirectory of the source tree plus an associated global header file is a very common case.
* Renames are handled implicitly rather than explicitly. A common complaint with CVS is that it uses the name of a file to identify its revision history, so moving or renaming a file is not possible without either interrupting its history or renaming the history and thereby making the history inaccurate. Most post-CVS revision-control systems solve this by giving a file a unique long-lived name (analogous to an ...
Merging strategies
Git implements several merging strategies; a non-default strategy can be selected at merge time:
* ''resolve'': the traditional three-way merge algorithm.
* ''recursive'': This is the default when pulling or merging one branch, and is a variant of the three-way merge algorithm.
* ''octopus'': This is the default when merging more than two heads.
Data structures
Git's primitives are not inherently a source-code management system. Torvalds explains: ...
From this initial design approach, Git has developed the full set of features expected of a traditional SCM, with features mostly being created as needed, then refined and extended over time.
Commands
Frequently used Git commands include:
* git init, which is used to create a Git repository.
* git clone [URL], which ''clones'', or duplicates, a Git repository from an external URL.
* git add [file], which adds a file to Git's ''staging area'' (files about to be committed).
* git commit -m [commit message], which ''commits'' the staged files (so they are now part of the repository's history).
A ''.gitignore'' file may be created in a Git repository as a plain text file. The files listed in the ''.gitignore'' file will ''not'' be tracked by Git. This feature can be used to ignore files with keys or passwords, various extraneous files, and large files (which GitHub will refuse to upload).
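As a rough illustration of how ignore patterns select files, the sketch below matches paths against a short pattern list. It is a deliberate simplification and not Git's actual matcher: it handles only plain glob patterns (compared against the file name) and trailing-slash directory patterns, and the function name `is_ignored` and the sample patterns are hypothetical.

```python
from fnmatch import fnmatchcase

def is_ignored(path, patterns):
    """Return True if a relative path matches any simplified ignore pattern."""
    parts = path.split("/")
    for pattern in patterns:
        if pattern.endswith("/"):
            # Directory pattern: ignore anything below a directory of that name.
            if any(fnmatchcase(part, pattern[:-1]) for part in parts[:-1]):
                return True
        elif fnmatchcase(parts[-1], pattern):
            # Plain glob: compare against the file name only.
            return True
    return False

# Hypothetical .gitignore contents, one pattern per line.
patterns = ["*.log", "secrets.env", "build/"]
for candidate in ["app/debug.log", "secrets.env", "build/out.bin", "src/main.c"]:
    print(candidate, "->", "ignored" if is_ignored(candidate, patterns) else "not ignored")
```

Real ''.gitignore'' files support further syntax that this sketch leaves out on purpose.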
Git references
Every object in the Git database that is not referred to may be cleaned up by using a garbage collection command or automatically. An object may be referenced by another object or an explicit reference. Git has different types of references. The commands to create, move, and delete references vary. The command git show-ref lists all references. Some types are:
* ''heads'': refers to an object locally,
* ''remotes'': refers to an object which exists in a remote repository,
* ''stash'': refers to an object not yet committed,
* ''meta'': ''e.g.'', a configuration in a bare repository or user rights; the refs/meta/config namespace was introduced retrospectively and is used by Gerrit,
* ''tags'': see above.
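The rule that an object survives only while something refers to it can be sketched as a graph walk starting from the references. The toy object database below is entirely hypothetical (short made-up IDs instead of real hashes), and Git's actual garbage collection applies additional rules that are not modelled here.

```python
def reachable(objects, refs):
    """Collect every object ID reachable from the given references."""
    seen = set()
    stack = list(refs.values())
    while stack:
        oid = stack.pop()
        if oid in seen:
            continue
        seen.add(oid)
        stack.extend(objects[oid])  # follow links to other objects
    return seen

# Hypothetical toy database: object ID -> IDs of the objects it refers to.
objects = {
    "c2": ["c1", "t2"],  # commit c2 -> parent commit c1 and tree t2
    "c1": ["t1"],        # commit c1 -> tree t1
    "t2": ["b1", "b2"],  # tree t2 -> two blobs
    "t1": ["b1"],
    "b1": [],
    "b2": [],
    "b9": [],            # blob that nothing refers to
}
refs = {"refs/heads/master": "c2", "refs/tags/v1.0": "c1"}

# Objects that no reference leads to are candidates for clean-up.
unreferenced = sorted(set(objects) - reachable(objects, refs))
print(unreferenced)
```

Deleting a reference can therefore make a whole chain of objects collectable at once, since reachability is computed from the references alone.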
Implementations
Git (the main implementation in C) is primarily developed on Linux, although it also supports most major operating systems, including the BSDs (DragonFly BSD, FreeBSD, NetBSD, and OpenBSD), Solaris, macOS, and Windows.
The first Windows port of Git was primarily a Linux-emulation framework that hosts the Linux version. Installing Git under Windows creates a similarly named Program Files directory containing the Mingw-w64 port of the GNU Compiler Collection, Perl 5, MSYS2 (itself a fork of Cygwin, a Unix-like emulation environment for Windows) and various other Windows ports or emulations of Linux utilities and libraries. Currently, native Windows builds of Git are distributed as 32- and 64-bit installers. The official Git website currently maintains a build of Git for Windows, still using the MSYS2 environment.
The JGit implementation of Git is a pure Java software library, designed to be embedded in any Java application. JGit is used in the Gerrit code-review tool, and in EGit, a Git client for the Eclipse IDE.
Go-git is an open-source implementation of Git written in pure Go. It is currently used to back projects such as a SQL interface for Git code repositories and tools providing encryption for Git.
Dulwich is an implementation of Git written in pure Python, with support for CPython 3.6 and later and for PyPy.
The libgit2 implementation of Git is an ANSI C software library with no other dependencies, which can be built on multiple platforms, including Windows, Linux, macOS, and BSD. It has bindings for many programming languages, including Ruby, Python, and Haskell.
JS-Git is a JavaScript implementation of a subset of Git.
GameOfTrees is an open-source implementation of Git for the OpenBSD project.
Git server
As Git is a distributed version control system, it can be used as a server out of the box. It is shipped with a built-in command, git daemon, which starts a simple TCP server running on the Git protocol. Dedicated Git HTTP servers help (amongst other features) by adding access control, displaying the contents of a Git repository via web interfaces, and managing multiple repositories. Already existing Git repositories can be cloned and shared to be used by others as a centralized repo. A repository can also be accessed via remote shell just by having the Git software installed and allowing a user to log in. Git servers typically listen on TCP port 9418.
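Messages exchanged over the Git protocol are framed as "pkt-lines": four hexadecimal digits giving the total length (prefix included), followed by the payload. The sketch below builds such a frame; it only constructs bytes and does not open a connection, and the repository path and host name in the sample request are hypothetical placeholders.

```python
GIT_DAEMON_PORT = 9418  # the port Git servers typically listen on

def pkt_line(payload: bytes) -> bytes:
    """Frame a payload as a pkt-line: 4 hex digits of total length, then data."""
    length = len(payload) + 4  # the length prefix counts its own four bytes
    if length > 65520:
        raise ValueError("payload too large for a single pkt-line")
    return f"{length:04x}".encode("ascii") + payload

# Hypothetical request a client might send first when fetching over git://.
request = pkt_line(b"git-upload-pack /project.git\0host=example.com\0")
print(request)
```

Because every frame announces its own size, a receiver can read exactly four bytes, parse the length, and then read the rest of the message without scanning for a terminator.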
Open source
* Hosting the Git server using the Git Binary.
* Gerrit, a Git server configurable to support code reviews and provide access via SSH, an integrated Apache MINA or OpenSSH, or an integrated Jetty web server. Gerrit provides integration for LDAP, Active Directory, OpenID, OAuth, Kerberos/GSSAPI, and X.509 HTTPS client certificates. As of Gerrit 3.0, all configuration is stored in Git repositories, and no database is required to run. Gerrit has a pull-request feature implemented in its core but lacks a GUI for it.
* Phabricator, a spin-off from Facebook. As Facebook primarily uses Mercurial, Git support is not as prominent.
* RhodeCode Community Edition (CE), supporting Git, Mercurial and Subversion with an AGPLv3 license.
* Kallithea, supporting both Git and Mercurial, developed in Python under the GPL license.
* External projects like gitolite, which provide scripts on top of Git software to provide fine-grained access control.
* There are several other FLOSS solutions for self-hosting, including Gogs; Gitea, a fork of Gogs; and Forgejo, which is, in turn, a fork of Gitea. Gogs, as well as the two aforementioned derivatives of it, is developed using the Go language. All three solutions are made available under the MIT license.
Git server as a service
There are many offerings of Git repositories as a service. The most popular are GitHub, SourceForge, Bitbucket and GitLab.
Graphical interfaces
Git GUI clients offer a graphical user interface (GUI) to simplify interaction with Git repositories.
These GUIs provide visual representations of project history, including branches, commits, and file changes. They also streamline actions like staging changes, creating commits, and managing branches. Visual diff tools help resolve merge conflicts arising from concurrent development.
Git comes with a Tcl/Tk GUI, which allows users to perform actions such as creating and amending commits, creating and merging branches, and interacting with remote repositories.
In addition to the official GUI, many third-party interfaces exist that provide similar features.
GUI clients make Git easier to learn and use, improving workflow efficiency and reducing errors.
Adoption
The Eclipse Foundation reported in its annual community survey that Git had become the most widely used source-code management tool, with 42.9% of professional software developers reporting in 2014 that they use Git as their primary source-control system, compared with 36.3% in 2013 and 32% in 2012; for Git responses excluding use of GitHub, the figures were 33.3% in 2014, 30.3% in 2013, 27.6% in 2012 and 12.8% in 2011. Open-source directory Open Hub reports a similar uptake among open-source projects.
Stack Overflow has included version control in its annual developer survey in 2015 (16,694 responses), 2017 (30,730 responses), 2018 (74,298 responses) and 2022 (71,379 responses). Git was the overwhelming favorite of responding developers in these surveys, with usage reported as high as 93.9% in 2022.
Version control systems used by responding developers:
The UK IT jobs website itjobswatch.co.uk reports that as of late September 2016, 29.27% of UK permanent software development job openings cited Git, ahead of 12.17% for Microsoft Team Foundation Server, 10.60% for Subversion, 1.30% for Mercurial, and 0.48% for Visual SourceSafe.
Extensions
There are many ''Git extensions'', like Git LFS, which started as an extension to Git in the GitHub community and is now widely used by other repositories. Extensions are usually independently developed and maintained by different people, but at some point in the future, a widely used extension can be merged with Git.
Other open-source Git extensions include:
* git-annex, a distributed file synchronization system based on Git
* git-flow, a set of Git extensions to provide high-level repository operations for Vincent Driessen's branching model
* git-machete, a repository organizer & tool for automating rebase/merge/pull/push operations
Microsoft developed the Virtual File System for Git (VFS for Git; formerly Git Virtual File System or GVFS) extension to handle the size of the Windows source-code tree as part of their 2017 migration from Perforce. VFS for Git allows cloned repositories to use placeholders whose contents are downloaded only once a file is accessed.
Conventions
Git can be used in a variety of different ways, but some conventions are commonly adopted.
* The command to create a local repo, ''git init'', creates a branch named ''master''. Often it is used as the integration branch for merging changes into. Since the default upstream remote is named ''origin'', the default remote branch is ''origin/master''. Some tools such as GitHub and GitLab create a default branch named ''main'' instead. Also, users can add and delete branches and choose any branch for integrating.
* Pushed commits generally are not overwritten, but are ''reverted'' by committing another change which reverses an earlier commit. This prevents shared commits from being invalid because the commit on which they are based does not exist in the remote. If the commits contain sensitive information, they should be removed, which involves a more complex procedure to rewrite history.
* The ''git-flow'' workflow and naming conventions are often adopted to distinguish feature-specific unstable histories (feature/*), unstable shared histories (develop), production-ready histories (main), and emergency patches to released products (hotfix).
* A ''pull request'', a.k.a. ''merge request'', is a request by a user to merge a branch into another branch. Git does not itself provide for pull requests, but they are a common feature of Git cloud services. The underlying function of a pull request is no different from that of an administrator of a repository pulling changes from another remote (the repository that is the source of the pull request). However, the pull request itself is a ticket managed by the hosting server, which performs these actions; it is not a feature of Git itself.
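The branch-naming convention listed above can be expressed as a small lookup. This is a hypothetical helper rather than part of Git or git-flow; treating ''master'' like ''main'' and accepting a `hotfix/` prefix are assumptions added for illustration.

```python
def classify_branch(name):
    """Map a branch name to its role under the git-flow-style convention."""
    if name.startswith("feature/"):
        return "feature-specific unstable history"
    if name == "develop":
        return "unstable shared history"
    if name in ("main", "master"):  # assumption: both default names count
        return "production-ready history"
    if name == "hotfix" or name.startswith("hotfix/"):  # assumption: prefix form
        return "emergency patch"
    return "no convention applies"

for branch in ["feature/login-form", "develop", "main", "hotfix/1.2.1", "scratch"]:
    print(f"{branch}: {classify_branch(branch)}")
```

A check like this could, for example, be run by a hosting service or a hook to decide which branches accept direct pushes.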
Security
Git does not provide access-control mechanisms, but was designed for operation with other tools that specialize in access control.
On 17 December 2014, an exploit was found affecting the Windows and macOS versions of the Git client. An attacker could perform arbitrary code execution on a target computer with Git installed by creating a malicious Git tree (directory) named ''.git'' (a directory in Git repositories that stores all the data of the repository) in a different case (such as .GIT or .Git, needed because Git does not allow the all-lowercase version of ''.git'' to be created manually) with malicious files in the ''.git/hooks'' subdirectory (a folder with executable files that Git runs) on a repository that the attacker made or on a repository that the attacker can modify. If a Windows or Mac user ''pulls'' (downloads) a version of the repository with the malicious directory, then switches to that directory, the .git directory will be overwritten (due to the case-insensitive trait of the Windows and Mac filesystems) and the malicious executable files in ''.git/hooks'' may be run, which results in the attacker's commands being executed. An attacker could also modify the ''.git/config'' configuration file, which allows the attacker to create malicious Git aliases (aliases for Git commands or external commands) or modify extant aliases to execute malicious commands when run. The vulnerability was patched in version 2.2.1 of Git, released on 17 December 2014, and announced the next day.
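The core of the problem is that a name such as .GIT is distinct from ''.git'' to Git but identical to it on a case-insensitive filesystem. A minimal sketch of the kind of check that closes this gap follows; the function name is hypothetical, and the sketch covers only the upper/lower-case variants described above, not everything the actual patch may check.

```python
def has_disguised_git_dir(path):
    """Return True if any path component equals '.git' when case is ignored."""
    return any(part.lower() == ".git" for part in path.split("/"))

# Hypothetical paths as they might appear in a tree received from a remote.
for path in [".GIT/hooks/post-checkout", "src/.Git/config", "docs/git.txt"]:
    print(path, "->", "reject" if has_disguised_git_dir(path) else "ok")
```

Rejecting such entries before they are written to disk prevents the checkout from ever touching the real repository directory.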
Git version 2.6.1, released on 29 September 2015, contained a patch for a security vulnerability (CVE-2015-7545) that allowed arbitrary code execution. The vulnerability was exploitable if an attacker could convince a victim to clone a specific URL, as the arbitrary commands were embedded in the URL itself. An attacker could use the exploit via a man-in-the-middle attack if the connection was unencrypted, as they could redirect the user to a URL of their choice. Recursive clones were also vulnerable since they allowed the controller of a repository to specify arbitrary URLs via the .gitmodules file.
Git uses SHA-1 hashes internally. Linus Torvalds has responded that the hash was mostly to guard against accidental corruption, and the security a cryptographically secure hash gives was just an accidental side effect, with the main security being signing elsewhere. Since a demonstration of the SHAttered attack against Git in 2017, Git has used a SHA-1 variant resistant to this attack. A plan for a hash function transition has been in development since February 2020.
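As an illustration of how these hashes identify content, the sketch below computes the ID of a file's contents (a "blob") the way Git does: the SHA-1 is taken over a short header (the object type, the size in bytes and a NUL byte) followed by the data. The function name `blob_id` is hypothetical, and plain `hashlib.sha1` is used here on the assumption that the inputs are ordinary, non-colliding content.

```python
import hashlib

def blob_id(content: bytes) -> str:
    """Compute the SHA-1 object ID Git assigns to a blob with this content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()

print(blob_id(b""))                # ID of the empty blob
print(blob_id(b"test content\n"))  # ID of a one-line file
```

Because the ID depends only on the bytes stored, identical content always maps to the same object, and any accidental corruption changes the ID.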
Trademark
"Git" is a registered word trademark of Software Freedom Conservancy under US500000085961336 since 2015-02-03.
See also
* Comparison of source-code-hosting facilities
* Comparison of version-control software
* List of version-control software