In
cryptography
Cryptography, or cryptology (from grc, , translit=kryptós "hidden, secret"; and ''graphein'', "to write", or ''-logia'', "study", respectively), is the practice and study of techniques for secure communication in the presence of adver ...
, a one-way compression function is a function that transforms two fixed-length inputs into a fixed-length output. The transformation is
"one-way", meaning that it is difficult given a particular output to compute inputs which compress to that output. One-way compression functions are not related to conventional
data compression
In information theory, data compression, source coding, or bit-rate reduction is the process of encoding information using fewer bits than the original representation. Any particular compression is either lossy or lossless. Lossless compression ...
algorithms, which instead can be inverted exactly (lossless compression) or approximately (lossy compression) to the original data.
One-way compression functions are for instance used in the
Merkle–Damgård construction
In cryptography, the Merkle–Damgård construction or Merkle–Damgård hash function is a method of building collision-resistant cryptographic hash functions from collision-resistant one-way compression functions. Goldwasser, S. and Bellare, M ...
inside
cryptographic hash function
A cryptographic hash function (CHF) is a hash algorithm (a map of an arbitrary binary string to a binary string with fixed size of n bits) that has special properties desirable for cryptography:
* the probability of a particular n-bit output re ...
s.
One-way compression functions are often built from
block ciphers.
Some methods to turn any normal block cipher into a one-way compression function are Davies–Meyer, Matyas–Meyer–Oseas, Miyaguchi–Preneel (single-block-length compression functions) and MDC-2/Meyer–Schilling, MDC-4, Hirose (double-block-length compression functions). These methods are described in detail further down. (
MDC-2
In cryptography, MDC-2 (Modification Detection Code 2, sometimes called Meyer–Schilling, standardized in ISO 10118-2) is a cryptographic hash function. MDC-2 is a hash function based on a block cipher with a proof of security in the ideal-ciph ...
is also the name of a hash function patented by
IBM.)
Compression
A compression function mixes two fixed length inputs and produces a single fixed length output of the same size as one of the inputs. This can also be seen as that the compression function transforms one large fixed-length input into a shorter, fixed-length output.
For instance, ''input A'' might be 128 bits, ''input B'' 128 bits and they are compressed together to a single output of 128 bits. This is equivalent to having a single 256-bit input compressed to a single output of 128 bits.
Some compression functions do not compress by half, but instead by some other factor. For example, ''input A'' might be 256 bits, and ''input B'' 128 bits, which are compressed to a single output of 128 bits. That is, a total of 384 input bits are compressed together to 128 output bits.
The mixing is done in such a way that full
avalanche effect
In cryptography, the avalanche effect is the desirable property of cryptographic algorithms, typically block ciphers and cryptographic hash functions, wherein if an input is changed slightly (for example, flipping a single bit), the output changes ...
is achieved. That is, every output bit depends on every input bit.
One-way
A
one-way function
In computer science, a one-way function is a function that is easy to compute on every input, but hard to invert given the image of a random input. Here, "easy" and "hard" are to be understood in the sense of computational complexity theory, s ...
is a function that is easy to compute but hard to invert. A one-way compression function (also called hash function) should have the following properties:
* Easy to compute: If you have some input(s), it is easy to calculate the output.
* Preimage-resistance: If an attacker only knows the output it should be infeasible to calculate an input. In other words, given an output
, it should be unfeasible to calculate an input
such that
.
* Second preimage-resistance: Given an input
whose output is
, it should be infeasible to find another input
that has the same output
, i.e.
.
*
Collision-resistance: It should be hard to find any two different inputs that compress to the same output i.e. an attacker should not be able to find a pair of messages
such that
. Due to the
birthday paradox
In probability theory, the birthday problem asks for the probability that, in a set of randomly chosen people, at least two will share a birthday. The birthday paradox is that, counterintuitively, the probability of a shared birthday exceeds 5 ...
(see also
birthday attack) there is a 50% chance a collision can be found in time of about
where
is the number of bits in the hash function's output. An attack on the hash function thus should not be able to find a collision with less than about
work.
Ideally one would like the "infeasibility" in preimage-resistance and second preimage-resistance to mean a work of about
where
is the number of bits in the hash function's output. However, particularly for second preimage-resistance this is a difficult problem.
The Merkle–Damgård construction
A common use of one-way compression functions is in the Merkle–Damgård construction inside cryptographic hash functions. Most widely used hash functions, including
MD5,
SHA-1
In cryptography, SHA-1 (Secure Hash Algorithm 1) is a cryptographically broken but still widely used hash function which takes an input and produces a 160-bit (20- byte) hash value known as a message digest – typically rendered as 40 hexadec ...
(which is deprecated) and
SHA-2
SHA-2 (Secure Hash Algorithm 2) is a set of cryptographic hash functions designed by the United States National Security Agency (NSA) and first published in 2001. They are built using the Merkle–Damgård construction, from a one-way compression ...
use this construction.
A hash function must be able to process an arbitrary-length message into a fixed-length output. This can be achieved by breaking the input up into a series of equal-sized blocks, and operating on them in sequence using a one-way compression function. The compression function can either be specially designed for hashing or be built from a block cipher. The last block processed should also be
length padded, which is crucial to the security of this construction.
When length padding (also called MD-strengthening) is applied, attacks cannot find collisions faster than the birthday paradox (
,
being the block size in bits) if the used function
is collision-resistant.
[Ivan Damgård]
A design principle for hash functions
In Gilles Brassard, editor, CRYPTO, volume 435 of LNCS, pages 416–427. Springer, 1989.[Ralph Merkle]
One way hash functions and DES
In Gilles Brassard, editor, CRYPTO, volume 435 of LNCS, pages 428–446. Springer, 1989. Hence, the Merkle–Damgård hash construction reduces the problem of finding a proper hash function to finding a proper compression function.
A second preimage attack (given a message
an attacker finds another message
to satisfy
can be done according to Kelsey and Schneier
[John Kelsey and Bruce Schneier]
Second preimages on ''n''-bit hash functions for much less than 2''n'' work
In Ronald Cramer, editor, EUROCRYPT, volume 3494 of LNCS, pages 474–490. Springer, 2005. for a
-message-block message in time
. Note that the complexity of this attack reaches a minimum of
for long messages when
and approaches
when messages are short.
Construction from block ciphers
One-way compression functions are often built from block ciphers.
Block ciphers take (like one-way compression functions) two fixed size inputs (the
key
Key or The Key may refer to:
Common meanings
* Key (cryptography), a piece of information that controls the operation of a cryptography algorithm
* Key (lock), device used to control access to places or facilities restricted by a lock
* Key (map ...
and the
plaintext
In cryptography, plaintext usually means unencrypted information pending input into cryptographic algorithms, usually encryption algorithms. This usually refers to data that is transmitted or stored unencrypted.
Overview
With the advent of comp ...
) and return one single output (the
ciphertext
In cryptography, ciphertext or cyphertext is the result of encryption performed on plaintext using an algorithm, called a cipher. Ciphertext is also known as encrypted or encoded information because it contains a form of the original plaintext ...
) which is the same size as the input plaintext.
However, modern block ciphers are only partially one-way. That is, given a plaintext and a ciphertext it is infeasible to find a key that encrypts the plaintext to the ciphertext. But, given a ciphertext and a key a matching plaintext can be found simply by using the block cipher's decryption function. Thus, to turn a block cipher into a one-way compression function some extra operations have to be added.
Some methods to turn any normal block cipher into a one-way compression function are Davies–Meyer, Matyas–Meyer–Oseas, Miyaguchi–Preneel (single-block-length compression functions) and MDC-2, MDC-4, Hirose (double-block-length compressions functions).
Single-block-length compression functions output the same number of bits as processed by the underlying block cipher. Consequently, double-block-length compression functions output twice the number of bits.
If a block cipher has a
block size of say 128 bits single-block-length methods create a hash function that has the block size of 128 bits and produces a hash of 128 bits. Double-block-length methods make hashes with double the hash size compared to the block size of the block cipher used. So a 128-bit block cipher can be turned into a 256-bit hash function.
These methods are then used inside the Merkle–Damgård construction to build the actual hash function. These methods are described in detail further down.
Using a block cipher to build the one-way compression function for a hash function is usually somewhat slower than using a specially designed one-way compression function in the hash function. This is because all known secure constructions do the
key scheduling for each block of the message. Black, Cochran and Shrimpton have shown that it is impossible to construct a one-way compression function that makes only one call to a block cipher with a fixed key. In practice reasonable speeds are achieved provided the key scheduling of the selected block cipher is not a too heavy operation.
But, in some cases it is easier because a single implementation of a block cipher can be used for both a block cipher and a hash function. It can also save
code
In communications and information processing, code is a system of rules to convert information—such as a letter, word, sound, image, or gesture—into another form, sometimes shortened or secret, for communication through a communication ...
space in very tiny
embedded system
An embedded system is a computer system—a combination of a computer processor, computer memory, and input/output peripheral devices—that has a dedicated function within a larger mechanical or electronic system. It is ''embedded'' as ...
s like for instance
smart card
A smart card, chip card, or integrated circuit card (ICC or IC card) is a physical electronic authentication device, used to control access to a resource. It is typically a plastic credit card-sized card with an embedded integrated circuit (IC) c ...
s or
nodes in cars or other machines.
Therefore, the hash-rate or rate gives a glimpse of the efficiency of a hash function based on a certain compression function. The rate of an iterated hash function outlines the ratio between the number of block cipher operations and the output. More precisely, the rate represents the ratio between the number of processed bits of input
, the output bit-length
of the block cipher, and the necessary block cipher operations
to produce these
output bits. Generally, the usage of fewer block cipher operations results in a better overall performance of the entire hash function, but it also leads to a smaller hash-value which could be undesirable. The rate is expressed by the formula:
The hash function can only be considered secure if at least the following conditions are met:
* The block cipher has no special properties that distinguish it from ideal ciphers, such as weak keys or keys that lead to identical or related encryptions (fixed points or key-collisions).
* The resulting hash size is big enough. According to the
birthday attack a
security level of 2
80 (generally assumed to be infeasible to compute today) is desirable thus the hash size should be at least 160 bits.
* The last block is properly length padded prior to the hashing. (See
Merkle–Damgård construction
In cryptography, the Merkle–Damgård construction or Merkle–Damgård hash function is a method of building collision-resistant cryptographic hash functions from collision-resistant one-way compression functions. Goldwasser, S. and Bellare, M ...
.) Length padding is normally implemented and handled internally in specialised hash functions like
SHA-1
In cryptography, SHA-1 (Secure Hash Algorithm 1) is a cryptographically broken but still widely used hash function which takes an input and produces a 160-bit (20- byte) hash value known as a message digest – typically rendered as 40 hexadec ...
etc.
The constructions presented below: Davies–Meyer, Matyas–Meyer–Oseas, Miyaguchi–Preneel and Hirose have been shown to be secure under the
black-box
In science, computing, and engineering, a black box is a system which can be viewed in terms of its inputs and outputs (or transfer characteristics), without any knowledge of its internal workings. Its implementation is "opaque" (black). The te ...
analysis.
[S. Hirose, ]
Some Plausible Constructions of Double-Block-Length Hash Functions
'. In: Robshaw, M. J. B. (ed.) FSE 2006, LNCS, vol. 4047, pp. 210–225, Springer, Heidelberg 2006. The goal is to show that any attack that can be found is at most as efficient as the
birthday attack under certain assumptions. The black-box model assumes that a block cipher is used that is randomly chosen from a set containing all appropriate block ciphers. In this model an attacker may freely encrypt and decrypt any blocks, but does not have access to an implementation of the block cipher. The encryption and decryption function are represented by oracles that receive a pair of either a plaintext and a key or a ciphertext and a key. The oracles then respond with a randomly chosen plaintext or ciphertext, if the pair was asked for the first time. They both share a table for these triplets, a pair from the query and corresponding response, and return the record, if a query was received for the second time. For the proof there is a collision finding algorithm that makes randomly chosen queries to the oracles. The algorithm returns 1, if two responses result in a collision involving the hash function that is built from a compression function applying this block cipher (0 else). The probability that the algorithm returns 1 is dependent on the number of queries which determine the security level.
Davies–Meyer
The Davies–Meyer single-block-length compression function feeds each block of the message (
) as the key to a block cipher. It feeds the previous hash value (
) as the plaintext to be encrypted. The output ciphertext is then also
XORed
Exclusive or or exclusive disjunction is a Logical connective, logical operation that is true if and only if its arguments differ (one is true, the other is false).
It is Table of logic symbols, symbolized by the prefix operator J and by the ...
(⊕) with the previous hash value (
) to produce the next hash value (
). In the first round when there is no previous hash value it uses a constant pre-specified initial value (
).
In
mathematical notation
Mathematical notation consists of using symbols for representing operations, unspecified numbers, relations and any other mathematical objects, and assembling them into expressions and formulas. Mathematical notation is widely used in mathematic ...
Davies–Meyer can be described as:
:
The scheme has the rate (k is the keysize):
:
If the block cipher uses for instance 256-bit keys then each message block (
) is a 256-bit chunk of the message. If the same block cipher uses a block size of 128 bits then the input and output hash values in each round is 128 bits.
Variations of this method replace XOR with any other group operation, such as addition on 32-bit unsigned integers.
A notable property of the Davies–Meyer construction is that even if the underlying block cipher is totally secure, it is possible to compute
fixed points for the construction: for any
, one can find a value of
such that
: one just has to set
. This is a property that
random function
In probability theory and related fields, a stochastic () or random process is a mathematical object usually defined as a Indexed family, family of random variables. Stochastic processes are widely used as mathematical models of systems and phen ...
s certainly do not have. So far, no practical attack has been based on this property, but one should be aware of this "feature". The fixed-points can be used in a second preimage attack (given a message
, attacker finds another message
to satisfy
of Kelsey and Schneier
for a
-message-block message in time
. If the construction does not allow easy creation of fixed points (like Matyas–Meyer–Oseas or Miyaguchi–Preneel) then this attack can be done in
time. Note that in both cases the complexity is above
but below
when messages are long and that when messages get shorter the complexity of the attack approaches
.
The security of the Davies–Meyer construction in the Ideal Cipher Model was first proven by R. Winternitz.
Matyas–Meyer–Oseas
The Matyas–Meyer–Oseas single-block-length one-way compression function can be considered the dual (the opposite) of Davies–Meyer.
It feeds each block of the message (
) as the plaintext to be encrypted. The output ciphertext is then also XORed (⊕) with the same message block (
) to produce the next hash value (
). The previous hash value (
) is fed as the key to the block cipher. In the first round when there is no previous hash value it uses a constant pre-specified initial value (
).
If the block cipher has different block and key sizes the hash value (
) will have the wrong size for use as the key. The cipher might also have other special requirements on the key. Then the hash value is first fed through the function
to be converted/padded to fit as key for the cipher.
In mathematical notation Matyas–Meyer–Oseas can be described as:
:
The scheme has the rate:
:
A second preimage attack (given a message
an attacker finds another message
to satisfy
) can be done according to Kelsey and Schneier
for a
-message-block message in time
. Note that the complexity is above
but below
when messages are long, and that when messages get shorter the complexity of the attack approaches
.
Miyaguchi–Preneel
The Miyaguchi–Preneel single-block-length one-way compression function is an extended variant of Matyas–Meyer–Oseas. It was independently proposed by
Shoji Miyaguchi and
Bart Preneel.
It feeds each block of the message (
) as the plaintext to be encrypted. The output ciphertext is then XORed (⊕) with the same message block (
) and then also XORed with the previous hash value (
) to produce the next hash value (
). The previous hash value (
) is fed as the key to the block cipher. In the first round when there is no previous hash value it uses a constant pre-specified initial value (
).
If the block cipher has different block and key sizes the hash value (
) will have the wrong size for use as the key. The cipher might also have other special requirements on the key. Then the hash value is first fed through the function
to be converted/padded to fit as key for the cipher.
In mathematical notation Miyaguchi–Preneel can be described as:
:
The scheme has the rate:
:
The roles of
and
may be switched, so that
is encrypted under the key
, thus making this method an extension of Davies–Meyer instead.
A second preimage attack (given a message
an attacker finds another message
to satisfy
) can be done according to Kelsey and Schneier
for a
-message-block message in time
. Note that the complexity is above
but below
when messages are long, and that when messages get shorter the complexity of the attack approaches
.
Hirose
The Hirose
double-block-length one-way compression function consists of a block cipher plus a permutation
. It was proposed by Shoichi Hirose in 2006 and is based on a work
[M. Nandi, ''Towards optimal double-length hash functions'', In: Proceedings of the 6th International Conference on Cryptology in India (INDOCRYPT 2005), Lecture Notes in Computer Science 3797, pages 77–89, 2005.] by
Mridul Nandi.
It uses a block cipher whose key length
is larger than the block length
, and produces a hash of size
. For example, any of the
AES candidates with a 192- or 256-bit key (and 128-bit block).
Each round accepts a portion of the message
that is
bits long, and uses it to update two
-bit state values
and
.
First,
is concatenated with
to produce a key
. Then the two feedback values are updated according to:
*
*
is an arbitrary fixed-point-free permutation on an
-bit value, typically defined as
for an arbitrary non-zero constant
(all ones may be a convenient choice).
Each encryption resembles the standard Davies–Meyer construction. The advantage of this scheme over other proposed double-block-length schemes is that both encryptions use the same key, and thus key scheduling effort may be shared.
The final output is
. The scheme has the rate
relative to encrypting the message with the cipher.
Hirose also provides a proof in the Ideal Cipher Model.
Sponge construction
The
sponge construction
In cryptography, a sponge function or sponge construction is any of a class of algorithms with finite state (computer science), internal state that take an input bit stream of any length and produce an output bit stream of any desired length. Spon ...
can be used to build one-way compression functions.
See also
*
Whirlpool
A whirlpool is a body of rotating water produced by opposing currents or a current running into an obstacle. Small whirlpools form when a bath or a sink is draining. More powerful ones formed in seas or oceans may be called maelstroms ( ). ''Vo ...
A cryptographic hash function built using the Miyaguchi–Preneel construction and a block cipher similar to
Square
In Euclidean geometry, a square is a regular quadrilateral, which means that it has four equal sides and four equal angles (90-degree angles, π/2 radian angles, or right angles). It can also be defined as a rectangle with two equal-length adj ...
and
AES.
*
CBC-MAC
In cryptography, a cipher block chaining message authentication code (CBC-MAC) is a technique for constructing a message authentication code (MAC) from a block cipher. The message is encrypted with some block cipher algorithm in cipher block cha ...
,
OMAC, and
PMAC Methods to turn block ciphers into
message authentication codes (MACs).
References
Citations
Sources
*
{{DEFAULTSORT:One-Way Compression Function
Cryptographic hash functions
Cryptographic primitives