In cryptography, SHA-1 (Secure Hash Algorithm 1) is a cryptographically broken but still widely used hash function which takes an input and produces a 160-bit (20-byte) hash value known as a message digest, typically rendered as 40 hexadecimal digits. It was designed by the United States National Security Agency, and is a U.S. Federal Information Processing Standard.
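The digest sizes quoted above can be checked with a minimal sketch. It assumes Python's standard `hashlib` module; the input `b"abc"` is only an illustrative message.

```python
import hashlib

# Hash an illustrative message and inspect the size of the digest.
digest = hashlib.sha1(b"abc").digest()   # raw bytes
hex_digest = digest.hex()                # hexadecimal rendering

print(len(digest) * 8)   # digest length in bits
print(len(digest))       # digest length in bytes
print(len(hex_digest))   # number of hexadecimal digits
print(hex_digest)
```

Each byte is rendered as two hexadecimal digits, which is why a 20-byte digest prints as 40 characters.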
Since 2005, SHA-1 has not been considered secure against well-funded opponents; as of 2010 many organizations have recommended its replacement. NIST formally deprecated use of SHA-1 in 2011 and disallowed its use for digital signatures in 2013, and declared that it should be phased out by 2030. Chosen-prefix attacks against SHA-1 are practical. As such, it is recommended to remove SHA-1 from products as soon as possible and instead use SHA-2 or SHA-3. Replacing SHA-1 is urgent where it is used for digital signatures.
All major web browser vendors ceased acceptance of SHA-1 SSL certificates in 2017. In February 2017, CWI Amsterdam and Google announced they had performed a collision attack against SHA-1, publishing two dissimilar PDF files which produced the same SHA-1 hash. However, SHA-1 is still secure for HMAC.
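As an illustration of the HMAC use case, the following is a minimal sketch built on Python's standard `hmac` and `hashlib` modules. The helper names `sha1_hmac_tag` and `verify_tag`, and the key and message values, are hypothetical and chosen only for this example.

```python
import hashlib
import hmac

def sha1_hmac_tag(key: bytes, message: bytes) -> bytes:
    """Return the HMAC-SHA-1 tag of message under key (20 bytes)."""
    return hmac.new(key, message, hashlib.sha1).digest()

def verify_tag(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(sha1_hmac_tag(key, message), tag)

# Illustrative values only.
key = b"shared secret key"
tag = sha1_hmac_tag(key, b"transfer 10 units")
print(verify_tag(key, b"transfer 10 units", tag))    # genuine message
print(verify_tag(key, b"transfer 9999 units", tag))  # tampered message
```

The collision attacks described in this article target the bare hash; HMAC mixes a secret key into both an inner and an outer hash invocation, so a colliding pair of messages found offline does not directly yield a forged tag.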
Microsoft discontinued SHA-1 code signing support for Windows Update on August 7, 2020.
Development
SHA-1 produces a message digest based on principles similar to those used by Ronald L. Rivest of MIT in the design of the MD2, MD4 and MD5 message digest algorithms, but generates a larger hash value (160 bits vs. 128 bits).
SHA-1 was developed as part of the U.S. Government's Capstone project. The original specification of the algorithm was published in 1993 under the title ''Secure Hash Standard'', FIPS PUB 180, by U.S. government standards agency NIST (National Institute of Standards and Technology). This version is now often named ''SHA-0''. It was withdrawn by the NSA shortly after publication and was superseded by the revised version, published in 1995 in FIPS PUB 180-1 and commonly designated ''SHA-1''. SHA-1 differs from SHA-0 only by a single bitwise rotation in the message schedule of its compression function. According to the NSA, this was done to correct a flaw in the original algorithm which reduced its cryptographic security, but they did not provide any further explanation. Publicly available techniques did indeed demonstrate a compromise of SHA-0, in 2004, before SHA-1 in 2017 (''see §Attacks'').
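The single-rotation difference can be made concrete with a compact, unoptimized sketch of the algorithm. The function name `sha_160` and its `schedule_rotation` parameter are hypothetical, introduced only for illustration: a value of 1 rotates each expanded message word left by one bit (SHA-1), while 0 omits the rotation (the SHA-0 behaviour). The initial values and round constants are those of the published standard; the code is meant for study, not production use.

```python
import hashlib
import struct

MASK = 0xFFFFFFFF

def rotl(x: int, n: int) -> int:
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK

def sha_160(message: bytes, schedule_rotation: int = 1) -> str:
    """schedule_rotation=1 gives SHA-1; schedule_rotation=0 gives SHA-0."""
    h = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0]
    # Padding: a 1 bit, zeros up to 56 mod 64 bytes, then the 64-bit length.
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)
    padded += struct.pack(">Q", len(message) * 8)
    for off in range(0, len(padded), 64):
        w = list(struct.unpack(">16I", padded[off:off + 64]))
        # Message schedule: the only place where SHA-0 and SHA-1 differ.
        for t in range(16, 80):
            x = w[t - 3] ^ w[t - 8] ^ w[t - 14] ^ w[t - 16]
            w.append(rotl(x, schedule_rotation))
        a, b, c, d, e = h
        for t in range(80):
            if t < 20:
                f, k = (b & c) | (~b & d), 0x5A827999
            elif t < 40:
                f, k = b ^ c ^ d, 0x6ED9EBA1
            elif t < 60:
                f, k = (b & c) | (b & d) | (c & d), 0x8F1BBCDC
            else:
                f, k = b ^ c ^ d, 0xCA62C1D6
            temp = (rotl(a, 5) + f + e + k + w[t]) & MASK
            a, b, c, d, e = temp, a, rotl(b, 30), c, d
        h = [(x + y) & MASK for x, y in zip(h, (a, b, c, d, e))]
    return "".join(f"{x:08x}" for x in h)

# The SHA-1 variant should agree with the standard library implementation.
print(sha_160(b"abc") == hashlib.sha1(b"abc").hexdigest())
# The SHA-0 variant produces a different digest for the same input.
print(sha_160(b"abc", schedule_rotation=0))
```

Everything outside the message schedule loop is shared between the two variants, which is why results obtained against SHA-0 were taken as a warning sign for SHA-1.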
Applications
Cryptography
SHA-1 forms part of several widely used security applications and protocols, including TLS and SSL, PGP, SSH, S/MIME, and IPsec. Those applications can also use MD5; both MD5 and SHA-1 are descended from MD4.
SHA-1 and SHA-2 are the hash algorithms required by law for use in certain U.S. government applications, including use within other cryptographic algorithms and protocols, for the protection of sensitive unclassified information. FIPS PUB 180-1 also encouraged adoption and use of SHA-1 by private and commercial organizations. SHA-1 is being retired from most government uses; the U.S. National Institute of Standards and Technology said, "Federal agencies ''should'' stop using SHA-1 for...applications that require collision resistance as soon as practical, and must use the SHA-2 family of hash functions for these applications after 2010" (emphasis in original), though that was later relaxed to allow SHA-1 to be used for verifying old digital signatures and time stamps.
A prime motivation for the publication of the Secure Hash Algorithm was the Digital Signature Standard, in which it is incorporated.
The SHA hash functions have been used as the basis of the SHACAL block ciphers.
Data integrity
Revision control systems such as Git, Mercurial, and Monotone use SHA-1, not for security, but to identify revisions and to ensure that the data has not changed due to accidental corruption.
Linus Torvalds said about Git:
:If you have disk corruption, if you have DRAM corruption, if you have any kind of problems at all, Git will notice them. It's not a question of ''if'', it's a guarantee. You can have people who try to be malicious. They won't succeed. ... Nobody has been able to break SHA-1, but the point is the SHA-1, as far as Git is concerned, isn't even a security feature. It's purely a consistency check. The security parts are elsewhere, so a lot of people assume that since Git uses SHA-1 and SHA-1 is used for cryptographically secure stuff, they think that, Okay, it's a huge security feature. It has nothing at all to do with security, it's just the best hash you can get. ...
:I guarantee you, if you put your data in Git, you can trust the fact that five years later, after it was converted from your hard disk to DVD to whatever new technology and you copied it along, five years later you can verify that the data you get back out is the exact same data you put in. ...
:One of the reasons I care is for the kernel, we had a break in on one of the BitKeeper sites where people tried to corrupt the kernel source code repositories.
However, Git does not require the second preimage resistance of SHA-1 as a security feature, since it will always prefer to keep the earliest version of an object in case of collision, preventing an attacker from surreptitiously overwriting files.
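To make the use of SHA-1 as a content identifier concrete, here is a minimal sketch of how a file's contents map to a Git object name in a repository that uses the SHA-1 object format. The function name `git_blob_id` is hypothetical; the header layout (`blob`, a space, the decimal content length, and a NUL byte) reflects Git's object encoding as the editor understands it and is not taken from the text above.

```python
import hashlib

def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object name Git would assign to a blob with this content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()

# The empty file has a well-known object name.
print(git_blob_id(b""))
# Any change to the content, even a single byte, yields a different name.
print(git_blob_id(b"hello\n") != git_blob_id(b"hellp\n"))
```

Because the name is derived from the content itself, accidental corruption of a stored object shows up as a mismatch between the object's name and its recomputed hash.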
Cryptanalysis and validation
For a hash function for which ''L'' is the number of bits in the message digest, finding a message that corresponds to a given message digest can always be done using a brute force search in approximately 2^L evaluations. This is called a preimage attack and may or may not be practical depending on ''L'' and the particular computing environment. However, a ''collision'', consisting of finding two different messages that produce the same message digest, requires on average only about 2^(L/2) evaluations using a birthday attack. Thus the strength of a hash function is usually compared to a symmetric cipher of half the message digest length. SHA-1, which has a 160-bit message digest, was originally thought to have 80-bit strength.
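The gap between preimage and collision cost can be observed on a deliberately weakened hash. The sketch below truncates SHA-1 to its top 24 bits (an assumption made purely so the search finishes quickly) and looks for two messages with the same truncated digest; the names `truncated_sha1` and `birthday_collision` are hypothetical.

```python
import hashlib
from itertools import count

def truncated_sha1(message: bytes, bits: int) -> int:
    """Return the top `bits` bits of the SHA-1 digest as an integer."""
    full = int.from_bytes(hashlib.sha1(message).digest(), "big")
    return full >> (160 - bits)

def birthday_collision(bits: int):
    """Hash successive counter strings until two share a truncated digest."""
    seen = {}
    for i in count():
        msg = str(i).encode("ascii")
        tag = truncated_sha1(msg, bits)
        if tag in seen:
            return seen[tag], msg, i + 1   # colliding pair and number of hashes
        seen[tag] = msg

m1, m2, tries = birthday_collision(24)
print(m1, m2, tries)
```

With a 24-bit digest a collision is expected after a number of evaluations on the order of 2^12, far fewer than the roughly 2^24 evaluations a brute-force preimage search would need. The same square-root relationship is what gives an ideal 160-bit hash only 80-bit collision strength.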
Some of the applications that use cryptographic hashes, like password storage, are only minimally affected by a collision attack. Constructing a password that works for a given account requires a preimage attack, as well as access to the hash of the original password, which may or may not be trivial. Reversing password encryption (e.g. to obtain a password to try against a user's account elsewhere) is not made possible by the attacks. (However, even a secure password hash can't prevent brute-force attacks on weak passwords.)
In the case of document signing, an attacker could not simply fake a signature from an existing document: the attacker would have to produce a pair of documents, one innocuous and one damaging, and get the private key holder to sign the innocuous document. There are practical circumstances in which this is possible; until the end of 2008, it was possible to create forged SSL certificates using an MD5 collision.
Due to the block and iterative structure of the algorithms and the absence of additional final steps, all SHA functions (except SHA-3) are vulnerable to length-extension and partial-message collision attacks. These attacks allow an attacker to forge a message signed only by a keyed hash, SHA(''message'' || ''key'') or SHA(''key'' || ''message''), by extending the message and recalculating the hash without knowing the key. A simple improvement to prevent these attacks is to hash twice: SHAd(''message'') = SHA(SHA(0^b || ''message'')), where 0^b, the zero block, has a length equal to the block size of the hash function.
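The hash-twice countermeasure can be sketched as follows, reading it as an outer hash applied to the digest of a zero block followed by the message. The function name `sha1_d` is hypothetical, and the use of `hashlib.sha1().block_size` (64 bytes for SHA-1) to size the zero block is an assumption of this sketch.

```python
import hashlib

def sha1_d(message: bytes) -> bytes:
    """Double SHA-1: hash a zero block plus the message, then hash the result."""
    zero_block = b"\x00" * hashlib.sha1().block_size   # one full input block
    inner = hashlib.sha1(zero_block + message).digest()
    return hashlib.sha1(inner).digest()

print(sha1_d(b"example message").hex())
```

The outer hash hides the internal state reached after processing the message, so an attacker who sees the output cannot resume the computation to append data. For keyed authentication, HMAC is the standard construction and is usually the better choice.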
SHA-0
At CRYPTO 98, two French researchers, Florent Chabaud and Antoine Joux, presented an attack on SHA-0: collisions can be found with complexity 2^61, fewer than the 2^80 for an ideal hash function of the same size.
In 2004,
Biham and Chen found near-collisions for SHA-0 – two messages that hash to nearly the same value; in this case, 142 out of the 160 bits are equal. They also found full collisions of SHA-0 reduced to 62 out of its 80 rounds.
Subsequently, on 12 August 2004, a collision for the full SHA-0 algorithm was announced by Joux, Carribault, Lemuet, and Jalby. This was done by using a generalization of the Chabaud and Joux attack. Finding the collision had complexity 2^51 and took about 80,000 processor-hours on a supercomputer with 256 Itanium 2 processors (equivalent to 13 days of full-time use of the computer).
On 17 August 2004, at the Rump Session of CRYPTO 2004, preliminary results were announced by Wang, Feng, Lai, and Yu, about an attack on MD5, SHA-0 and other hash functions. The complexity of their attack on SHA-0 is 2^40, significantly better than the attack by Joux ''et al.''
In February 2005, an attack by Xiaoyun Wang, Yiqun Lisa Yin, and Hongbo Yu was announced which could find collisions in SHA-0 in 2^39 operations.
Another attack in 2008 applying the boomerang attack brought the complexity of finding collisions down to 2^33.6, which was estimated to take 1 hour on an average PC from the year 2008.
In light of the results for SHA-0, some experts suggested that plans for the use of SHA-1 in new cryptosystems should be reconsidered. After the CRYPTO 2004 results were published, NIST announced that they planned to phase out the use of SHA-1 by 2010 in favor of the SHA-2 variants.
Attacks
In early 2005, Vincent Rijmen and Elisabeth Oswald published an attack on a reduced version of SHA-1 (53 out of 80 rounds) which finds collisions with a computational effort of fewer than 2^80 operations.
In February 2005, an attack by Xiaoyun Wang, Yiqun Lisa Yin, and Hongbo Yu was announced. The attacks can find collisions in the full version of SHA-1, requiring fewer than 2^69 operations. (A brute-force search would require 2^80 operations.)
The authors write: "In particular, our analysis is built upon the original differential attack on SHA-0, the near collision attack on SHA-0, the multiblock collision techniques, as well as the message modification techniques used in the collision search attack on MD5. Breaking SHA-1 would not be possible without these powerful analytical techniques." The authors have presented a collision for 58-round SHA-1, found with 2^33 hash operations. The paper with the full attack description was published in August 2005 at the CRYPTO conference.
In an interview, Yin states that, "Roughly, we exploit the following two weaknesses: One is that the file preprocessing step is not complicated enough; another is that certain math operations in the first 20 rounds have unexpected security problems."
On 17 August 2005, an improvement on the SHA-1 attack was announced on behalf of Xiaoyun Wang, Andrew Yao and Frances Yao at the CRYPTO 2005 Rump Session, lowering the complexity required for finding a collision in SHA-1 to 2^63.
On 18 December 2007 the details of this result were explained and verified by Martin Cochran.
Christophe De Cannière and Christian Rechberger further improved the attack on SHA-1 in "Finding SHA-1 Characteristics: General Results and Applications," receiving the Best Paper Award at ASIACRYPT 2006. A two-block collision for 64-round SHA-1 was presented, found using unoptimized methods with 2^35 compression function evaluations. Since this attack requires the equivalent of about 2^35 evaluations, it is considered to be a significant theoretical break. Their attack was extended further to 73 rounds (of 80) in 2010 by Grechnikov. In order to find an actual collision in the full 80 rounds of the hash function, however, tremendous amounts of computer time are required. To that end, a collision search for SHA-1 using the volunteer computing platform BOINC began August 8, 2007, organized by the Graz University of Technology. The effort was abandoned May 12, 2009 due to lack of progress.
At the Rump Session of CRYPTO 2006, Christian Rechberger and Christophe De Cannière claimed to have discovered a collision attack on SHA-1 that would allow an attacker to select at least parts of the message.
In 2008, an attack methodology by Stéphane Manuel reported hash collisions with an estimated theoretical complexity of 2^51 to 2^57 operations. However, he later retracted that claim after finding that local collision paths were not actually independent, and finally quoted, as the most efficient, a collision vector that was already known before this work.
Cameron McDonald, Philip Hawkes and Josef Pieprzyk presented a hash collision attack with claimed complexity 2^52 at the Rump Session of Eurocrypt 2009. However, the accompanying paper, "Differential Path for SHA-1 with complexity ''O''(2^52)" has been withdrawn due to the authors' discovery that their estimate was incorrect.
One attack against SHA-1 was by Marc Stevens, with an estimated cost of $2.77M (2012) to break a single hash value by renting CPU power from cloud servers. Stevens developed this attack in a project called HashClash, implementing a differential path attack. On 8 November 2010, he claimed he had a fully working near-collision attack against full SHA-1 with an estimated complexity equivalent to 2^57.5 SHA-1 compressions. He estimated this attack could be extended to a full collision with a complexity around 2^61.
The SHAppening
On 8 October 2015, Marc Stevens, Pierre Karpman, and Thomas Peyrin published a freestart collision attack on SHA-1's compression function that requires only 2^57 SHA-1 evaluations. This does not directly translate into a collision on the full SHA-1 hash function (where an attacker is ''not'' able to freely choose the initial internal state), but undermines the security claims for SHA-1. In particular, it was the first time that an attack on full SHA-1 had been ''demonstrated''; all earlier attacks were too expensive for their authors to carry them out. The authors named this significant breakthrough in the cryptanalysis of SHA-1 ''The SHAppening''.
The method was based on their earlier work, as well as the auxiliary paths (or boomerangs) speed-up technique from Joux and Peyrin, and used high performance/cost efficient GPU cards from NVIDIA. The collision was found on a 16-node cluster with a total of 64 graphics cards. The authors estimated that a similar collision could be found by buying US$2,000 of GPU time on EC2.
The authors estimated that the cost of renting enough EC2 CPU/GPU time to generate a full collision for SHA-1 at the time of publication was between US$75K and US$120K, and noted that this was well within the budget of criminal organizations, not to mention national intelligence agencies. As such, the authors recommended that SHA-1 be deprecated as quickly as possible.
SHAttered – first public collision
On 23 February 2017, the CWI (Centrum Wiskunde & Informatica) and Google announced the ''SHAttered'' attack, in which they generated two different PDF files with the same SHA-1 hash in roughly 2^63.1 SHA-1 evaluations. This attack is about 100,000 times faster than brute forcing a SHA-1 collision with a birthday attack, which was estimated to take 2^80 SHA-1 evaluations. The attack required "the equivalent processing power of 6,500 years of single-CPU computations and 110 years of single-GPU computations".
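The "about 100,000 times faster" figure follows from the two exponents quoted above. A short check, using only those two numbers (the variable names are illustrative):

```python
# Work factors quoted in the text, expressed as base-2 exponents.
birthday_exponent = 80.0     # generic birthday attack on a 160-bit hash
shattered_exponent = 63.1    # SHAttered collision attack

# Ratio of the two work factors: 2^80 / 2^63.1 = 2^16.9
speedup = 2 ** (birthday_exponent - shattered_exponent)
print(f"speedup is about {speedup:,.0f}x")
```

The ratio comes out slightly above 10^5, consistent with the rounded figure in the text.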
Birthday-Near-Collision Attack – first practical chosen-prefix attack
On 24 April 2019 a paper by Gaëtan Leurent and Thomas Peyrin presented at Eurocrypt 2019 described an enhancement to the previously best chosen-prefix attack in Merkle–Damgård–like digest functions based on Davies–Meyer block ciphers. With these improvements, this method is capable of finding chosen-prefix collisions in approximately 2^68 SHA-1 evaluations. This is approximately 1 billion times faster (and now usable for many targeted attacks, thanks to the possibility of choosing a prefix, for example malicious code or faked identities in signed certificates) than the previous attack's 2^77.1 evaluations (but without chosen prefix, which was impractical for most targeted attacks because the found collisions were almost random) and is fast enough to be practical for resourceful attackers, requiring approximately $100,000 of cloud processing. This method is also capable of finding chosen-prefix collisions in the MD5 function, but at a complexity of 2^46.3 does not surpass the prior best available method at a theoretical level (2^39), though potentially at a practical level (≤2^49).
This attack has a memory requirement of 500+ GB.
On 5 January 2020 the authors published an improved attack. In this paper they demonstrate a chosen-prefix collision attack with a complexity of 2^63.4 that, at the time of publication, would cost 45k USD per generated collision.
Official validation
Implementations of all FIPS-approved security functions can be officially validated through the CMVP program, jointly run by the National Institute of Standards and Technology (NIST) and the Communications Security Establishment (CSE). For informal verification, a package to generate a high number of test vectors is made available for download on the NIST site; the resulting verification, however, does not replace the formal CMVP validation, which is required by law for certain applications.
There are over 2000 validated implementations of SHA-1, with 14 of them capable of handling messages with a length in bits not a multiple of eight (see SHS Validation List).
Examples and pseudocode
Example hashes
These are examples of SHA-1 message digests in hexadecimal and in Base64 binary to ASCII text encoding.
* SHA1("The quick brown fox jumps over the lazy dog")
** Outputted hexadecimal: 2fd4e1c67a2d28fced849ee1bb76e7391b93eb12
** Outputted Base64 binary to ASCII text encoding: L9ThxnotKPzthJ7hu3bnORuT6xI=
Even a small change in the message will, with overwhelming probability, result in many bits changing due to the avalanche effect. For example, changing dog to cog produces a hash with different values for 87 of the 160 bits:
* SHA1("The quick brown fox jumps over the lazy cog")
** Outputted hexadecimal: de9f2c7fd25e1b3afad3e85a0bd17d9b100db4b3
** Outputted Base64 binary to ASCII text encoding: 3p8sf9JeGzr60+haC9F9mxANtLM=
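The avalanche effect for this pair of messages can be measured directly by XOR-ing the two digests and counting the set bits. The following is a minimal sketch using Python's standard hashlib module; the helper name differing_bits is illustrative and not part of any library:

```python
import hashlib


def differing_bits(first: bytes, second: bytes) -> int:
    """Count the bit positions in which the SHA-1 digests of two messages differ."""
    d1 = int.from_bytes(hashlib.sha1(first).digest(), "big")
    d2 = int.from_bytes(hashlib.sha1(second).digest(), "big")
    # XOR leaves a 1 in every position where the digests disagree.
    return bin(d1 ^ d2).count("1")


dog = b"The quick brown fox jumps over the lazy dog"
cog = b"The quick brown fox jumps over the lazy cog"
print(differing_bits(dog, cog), "of 160 bits differ")
```

For an ideal hash function roughly half of the 160 output bits would be expected to change for any one-character edit.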
The hash of the zero-length string is:
* SHA1("")
** Outputted hexadecimal: da39a3ee5e6b4b0d3255bfef95601890afd80709
** Outputted Base64 binary to ASCII text encoding: 2jmj7l5rSw0yVb/vlWAYkK/YBwk=
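The example digests above can be reproduced with Python's standard hashlib and base64 modules. This is a minimal sketch; the helper name sha1_digests is illustrative:

```python
import base64
import hashlib


def sha1_digests(message: bytes) -> tuple[str, str]:
    """Return the SHA-1 digest of a message as (hexadecimal, Base64) strings."""
    digest = hashlib.sha1(message).digest()  # 20 raw bytes (160 bits)
    return digest.hex(), base64.b64encode(digest).decode("ascii")


for text in ("The quick brown fox jumps over the lazy dog",
             "The quick brown fox jumps over the lazy cog",
             ""):
    hex_form, b64_form = sha1_digests(text.encode("ascii"))
    print(repr(text), hex_form, b64_form)
```

Both encodings represent the same 20 bytes: the hexadecimal form uses 40 characters, the Base64 form 28 characters including padding.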
SHA-1 pseudocode
Pseudocode for the SHA-1 algorithm follows:
''Note 1: All variables are unsigned 32-bit quantities and wrap modulo 2^32 when calculating, except for''
''ml, the message length, which is a 64-bit quantity, and''
''hh, the message digest, which is a 160-bit quantity.''
''Note 2: All constants in this pseudo code are in big endian.''
''Within each word, the most significant byte is stored in the leftmost byte position''
''Initialize variables:''
h0 = 0x67452301
h1 = 0xEFCDAB89
h2 = 0x98BADCFE
h3 = 0x10325476
h4 = 0xC3D2E1F0
ml = message length in bits (always a multiple of the number of bits in a character).
''Pre-processing:''
append the bit '1' to the message e.g. by adding 0x80 if message length is a multiple of 8 bits.
append 0 ≤ k < 512 bits '0', such that the resulting message length in ''bits''
   is congruent to −64 ≡ 448 (mod 512)
append ml, the original message length in bits, as a 64-bit big-endian integer.
Thus, the total length is a multiple of 512 bits.
''Process the message in successive 512-bit chunks:''
break message into 512-bit chunks
for each chunk
    break chunk into sixteen 32-bit big-endian words w[i], 0 ≤ i ≤ 15
    ''Message schedule: extend the sixteen 32-bit words into eighty 32-bit words:''
    for i from 16 to 79
        ''Note 3: SHA-0 differs by not having this leftrotate.''
        w[i] = (w[i-3] xor w[i-8] xor w[i-14] xor w[i-16]) leftrotate 1
    ''Initialize hash value for this chunk:''
    a = h0
    b = h1
    c = h2
    d = h3
    e = h4
    ''Main loop:''
    for i from 0 to 79
        if 0 ≤ i ≤ 19 then
            f = (b and c) xor ((not b) and d)
            k = 0x5A827999
        else if 20 ≤ i ≤ 39
            f = b xor c xor d
            k = 0x6ED9EBA1
        else if 40 ≤ i ≤ 59
            f = (b and c) xor (b and d) xor (c and d)
            k = 0x8F1BBCDC
        else if 60 ≤ i ≤ 79
            f = b xor c xor d
            k = 0xCA62C1D6
        temp = (a leftrotate 5) + f + e + k + w[i]
        e = d
        d = c
        c = b leftrotate 30
        b = a
        a = temp
    ''Add this chunk's hash to result so far:''
    h0 = h0 + a
    h1 = h1 + b
    h2 = h2 + c
    h3 = h3 + d
    h4 = h4 + e
''Produce the final hash value (big-endian) as a 160-bit number:''
hh = (h0 leftshift 128) or (h1 leftshift 96) or (h2 leftshift 64) or (h3 leftshift 32) or h4
The number hh is the message digest, which can be written in hexadecimal (base 16).
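The pseudocode above translates fairly directly into Python. The following is an illustrative sketch for byte-oriented messages only (lengths that are a multiple of 8 bits); the names sha1_hex and _rotl are chosen here for illustration, and the code is meant for study rather than production use:

```python
import struct

MASK32 = 0xFFFFFFFF  # emulate 32-bit wrap-around arithmetic


def _rotl(x: int, n: int) -> int:
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK32


def sha1_hex(message: bytes) -> str:
    """Return the SHA-1 digest of a byte string as 40 hexadecimal digits."""
    h0, h1, h2, h3, h4 = 0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0
    ml = len(message) * 8  # original message length in bits

    # Pre-processing: append 0x80, pad with zeros to 448 bits (mod 512),
    # then append the length as a 64-bit big-endian integer.
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)
    padded += struct.pack(">Q", ml)

    for offset in range(0, len(padded), 64):
        # Sixteen big-endian 32-bit words, extended to eighty.
        w = list(struct.unpack(">16I", padded[offset:offset + 64]))
        for i in range(16, 80):
            w.append(_rotl(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1))

        a, b, c, d, e = h0, h1, h2, h3, h4
        for i in range(80):
            if i < 20:
                f, k = (b & c) ^ (~b & d), 0x5A827999
            elif i < 40:
                f, k = b ^ c ^ d, 0x6ED9EBA1
            elif i < 60:
                f, k = (b & c) ^ (b & d) ^ (c & d), 0x8F1BBCDC
            else:
                f, k = b ^ c ^ d, 0xCA62C1D6
            temp = (_rotl(a, 5) + f + e + k + w[i]) & MASK32
            e, d, c, b, a = d, c, _rotl(b, 30), a, temp

        # Add this chunk's result to the running hash state.
        h0 = (h0 + a) & MASK32
        h1 = (h1 + b) & MASK32
        h2 = (h2 + c) & MASK32
        h3 = (h3 + d) & MASK32
        h4 = (h4 + e) & MASK32

    return "%08x%08x%08x%08x%08x" % (h0, h1, h2, h3, h4)


print(sha1_hex(b"The quick brown fox jumps over the lazy dog"))
```

Because Python integers are unbounded, every addition and rotation is masked to 32 bits explicitly, which plays the role of the "wrap modulo 2^32" rule in Note 1.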
The chosen constant values used in the algorithm were assumed to be nothing up my sleeve numbers:
* The four round constants k are 2^30 times the square roots of 2, 3, 5 and 10. However they were incorrectly rounded to the nearest integer instead of being rounded to the nearest odd integer, with equilibrated proportions of zero and one bits. As well, choosing the square root of 10 (which is not a prime) made it a common factor for the two other chosen square roots of primes 2 and 5, with possibly usable arithmetic properties across successive rounds, reducing the strength of the algorithm against finding collisions on some bits.
* The first four starting values for h0 through h3 are the same as in the MD5 algorithm, and the fifth (for h4) is similar. However they were not properly verified for being resistant against inversion of the first few rounds to infer possible collisions on some bits, usable by multiblock differential attacks.
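The relationship between the round constants and the square roots of 2, 3, 5 and 10 can be checked numerically. The sketch below assumes the scaled square roots are reduced to integers by taking the integer part (truncation), which is the reading that reproduces all four published values; the helper name scaled_sqrt is illustrative:

```python
import math

# Round constants from the pseudocode, keyed by the number whose square root is used.
ROUND_CONSTANTS = {2: 0x5A827999, 3: 0x6ED9EBA1, 5: 0x8F1BBCDC, 10: 0xCA62C1D6}


def scaled_sqrt(n: int) -> int:
    """Integer part of 2**30 * sqrt(n), computed exactly with integer arithmetic."""
    # sqrt(n * 2**60) == 2**30 * sqrt(n); math.isqrt returns the floor.
    return math.isqrt(n << 60)


for n, k in ROUND_CONSTANTS.items():
    print(f"n={n:2d}  computed={scaled_sqrt(n):#010x}  published={k:#010x}")
```

Using math.isqrt avoids floating-point rounding, so the comparison is exact.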
Instead of the formulation from the original FIPS PUB 180-1 shown, the following equivalent expressions may be used to compute f in the main loop above:
''Bitwise choice between ''c'' and ''d'', controlled by ''b''.''
(0 ≤ i ≤ 19): f = d xor (b and (c xor d))                ''(alternative 1)''
(0 ≤ i ≤ 19): f = (b and c) or ((not b) and d)           ''(alternative 2)''
(0 ≤ i ≤ 19): f = (b and c) xor ((not b) and d)          ''(alternative 3)''
(0 ≤ i ≤ 19): f = vec_sel(d, c, b)                       ''(alternative 4)''
''Bitwise majority function.''
(40 ≤ i ≤ 59): f = (b and c) or (d and (b or c))          ''(alternative 1)''
(40 ≤ i ≤ 59): f = (b and c) or (d and (b xor c))         ''(alternative 2)''
(40 ≤ i ≤ 59): f = (b and c) xor (d and (b xor c))        ''(alternative 3)''
(40 ≤ i ≤ 59): f = (b and c) xor (b and d) xor (c and d)  ''(alternative 4)''
(40 ≤ i ≤ 59): f = vec_sel(c, b, c xor d)                 ''(alternative 5)''
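The equivalence of these expressions can be confirmed exhaustively. Because every operator involved acts on each bit position independently, it is enough to test each operand as all-zeros and all-ones, which covers the eight rows of the truth table. The sketch below omits the vec_sel variants, which rely on a vector-select intrinsic; the function names are illustrative:

```python
from itertools import product

MASK32 = 0xFFFFFFFF


def choice_variants(b: int, c: int, d: int) -> list[int]:
    """Alternatives 1-3 for rounds 0-19 (bitwise choice)."""
    not_b = ~b & MASK32
    return [
        d ^ (b & (c ^ d)),
        (b & c) | (not_b & d),
        (b & c) ^ (not_b & d),
    ]


def majority_variants(b: int, c: int, d: int) -> list[int]:
    """Alternatives 1-4 for rounds 40-59 (bitwise majority)."""
    return [
        (b & c) | (d & (b | c)),
        (b & c) | (d & (b ^ c)),
        (b & c) ^ (d & (b ^ c)),
        (b & c) ^ (b & d) ^ (c & d),
    ]


def all_agree(variants) -> bool:
    """True if every variant yields the same word for all eight truth-table rows."""
    return all(
        len(set(variants(b, c, d))) == 1
        for b, c, d in product((0, MASK32), repeat=3)
    )


print("choice variants agree:  ", all_agree(choice_variants))
print("majority variants agree:", all_agree(majority_variants))
```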
It was also shown that for the rounds 32–79 the computation of:
w[i] = (w[i-3] xor w[i-8] xor w[i-14] xor w[i-16]) leftrotate 1
can be replaced with:
w[i] = (w[i-6] xor w[i-16] xor w[i-28] xor w[i-32]) leftrotate 2
This transformation keeps all operands 64-bit aligned and, by removing the dependency of w[i] on w[i-3], allows efficient SIMD implementation with a vector length of 4 like x86 SSE instructions.
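The replacement follows from applying the original recurrence twice: rotation distributes over XOR, and the intermediate terms cancel in pairs. A minimal sketch that checks the two schedules against each other on random input blocks; the function names and the fixed seed are illustrative choices:

```python
import random

MASK32 = 0xFFFFFFFF


def rotl(x: int, n: int) -> int:
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK32


def schedule_standard(block: list[int]) -> list[int]:
    """Message schedule as given in the pseudocode (rounds 16-79)."""
    w = list(block)
    for i in range(16, 80):
        w.append(rotl(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1))
    return w


def schedule_alternative(block: list[int]) -> list[int]:
    """Same schedule, using the rewritten recurrence for rounds 32-79."""
    w = list(block)
    for i in range(16, 32):
        w.append(rotl(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1))
    for i in range(32, 80):
        w.append(rotl(w[i - 6] ^ w[i - 16] ^ w[i - 28] ^ w[i - 32], 2))
    return w


rng = random.Random(0)  # fixed seed so the check is repeatable
blocks = [[rng.getrandbits(32) for _ in range(16)] for _ in range(100)]
print(all(schedule_standard(b) == schedule_alternative(b) for b in blocks))
```

The rewritten form cannot be used before round 32 because it relies on every word it expands having itself been produced by the recurrence.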
Comparison of SHA functions
In the table below, ''internal state'' means the "internal hash sum" after each compression of a data block.
Implementations
Below is a list of cryptography libraries that support SHA-1:
* Botan
* Bouncy Castle
* cryptlib
* Crypto++
* Libgcrypt
* Mbed TLS
* Nettle
* LibreSSL
* OpenSSL
* GnuTLS
Hardware acceleration is provided by the following processor extensions:
* Intel SHA extensions: Available on some Intel and AMD x86 processors.
* VIA PadLock
* IBM z/Architecture: Available since 2003 as part of the Message-Security-Assist Extension (IBM z/Architecture Principles of Operation, publication number SA22-7832; see KIMD and KLMD instructions in Chapter 7).
See also
* Comparison of cryptographic hash functions
* Hash function security summary
* International Association for Cryptologic Research
* Secure Hash Standard
External links
* Official NIST site for the Secure Hash Standard
* FIPS 180-4: Secure Hash Standard (SHS)
* (with sample C implementation)
* Interview with Yiqun Lisa Yin concerning the attack on SHA-1
* Explanation of the successful attacks on SHA-1 (3 pages, 2006)
* b
Christof Paar