Badger is a
Message Authentication Code (MAC) based on the idea of
universal hashing
In mathematics and computing, universal hashing (in a randomized algorithm or data structure) refers to selecting a hash function at random from a family of hash functions with a certain mathematical property (see definition below). This guarantees ...
and was developed by Boesgaard, Scavenius, Pedersen, Christensen, and Zenner.
It is constructed by strengthening the ∆-universal hash family MMH using an ϵ-almost strongly universal (ASU) hash function family after the application of ENH (see below), where the value of ϵ is
.
Since Badger is a MAC function based on the
universal hash
In mathematics and computing, universal hashing (in a randomized algorithm or data structure) refers to selecting a hash function at random from a family of hash functions with a certain mathematical property (see definition below). This guarantees ...
function approach, the conditions needed for the security of Badger are the same as those for other universal hash functions such as
UMAC
In cryptography, a message authentication code based on universal hashing, or UMAC, is a type of message authentication code (MAC) calculated choosing a hash function from a class of hash functions according to some secret (random) process and ap ...
.
Introduction
The Badger MAC processes a message of length up to
bits and returns an
authentication tag of length
bits, where
. According to the
security" \n\n\nsecurity.txt is a proposed standard for websites' security information that is meant to allow security researchers to easily report security vulnerabilities. The standard prescribes a text file called \"security.txt\" in the well known locat ...
needs, user can choose the value of
, that is the number of parallel
hash tree
Hash, hashes, hash mark, or hashing may refer to:
Substances
* Hash (food), a coarse mixture of ingredients
* Hash, a nickname for hashish, a cannabis product
Hash mark
*Hash mark (sports), a marking on hockey rinks and gridiron football fiel ...
s in Badger. One can choose larger values of ''u'', but those values do not influence further the security of MAC. The
algorithm
In mathematics and computer science, an algorithm () is a finite sequence of rigorous instructions, typically used to solve a class of specific problems or to perform a computation. Algorithms are used as specifications for performing ...
uses a 128-bit key and the limited message length to be processed under this key is
.
The key setup has to be run only once per key in order to run the Badger
algorithm
In mathematics and computer science, an algorithm () is a finite sequence of rigorous instructions, typically used to solve a class of specific problems or to perform a computation. Algorithms are used as specifications for performing ...
under a given key, since the resulting internal state of the MAC can be saved to be used with any other message that will be processed later.
ENH
Hash families can be combined in order to obtain new hash families. For the ϵ-AU, ϵ-A∆U, and ϵ-ASU families, the latter are contained in the former. For instance, an A∆U family is also an AU family, an ASU is also an A∆U family, and so forth. On the other hand, a stronger family can be reduced to a weaker one, as long as a performance gain can be reached. A method to reduce ∆-universal hash function to
universal hash
In mathematics and computing, universal hashing (in a randomized algorithm or data structure) refers to selecting a hash function at random from a family of hash functions with a certain mathematical property (see definition below). This guarantees ...
functions will be described in the following.
Theorem 2
[
Let be an ϵ-AΔU hash family from a set ''A'' to a set ''B''. Consider a message . Then the family ''H'' consisting of the functions is ϵ-AU.
If , then the ]probability
Probability is the branch of mathematics concerning numerical descriptions of how likely an event is to occur, or how likely it is that a proposition is true. The probability of an event is a number between 0 and 1, where, roughly speakin ...
that
is at most ϵ,
since is an ϵ-A∆U family. If but , then the probability
Probability is the branch of mathematics concerning numerical descriptions of how likely an event is to occur, or how likely it is that a proposition is true. The probability of an event is a number between 0 and 1, where, roughly speakin ...
is trivially 0.
The proof for Theorem 2 was described in [
The ENH-family is constructed based on the ]universal hash
In mathematics and computing, universal hashing (in a randomized algorithm or data structure) refers to selecting a hash function at random from a family of hash functions with a certain mathematical property (see definition below). This guarantees ...
family NH (which is also used in UMAC
In cryptography, a message authentication code based on universal hashing, or UMAC, is a type of message authentication code (MAC) calculated choosing a hash function from a class of hash functions according to some secret (random) process and ap ...
):
:
Where '' means 'addition modulo ', and . It is a -A∆U hash family.
Lemma 1[
The following version of NH is -A∆U:
:
Choosing w=32 and applying Theorem 1, one can obtain the -AU function family ENH, which will be the basic building block of the badger MAC:
:
where all arguments are 32-bits long and the output has 64-bits.
]
Construction
Badger is constructed using the strongly universality hash family and can be described as
: [
where an -AU universal function family ''H*'' is used to hash messages of any size onto a fixed size and an -ASU function family ''F'' is used to guarantee the strong universality of the overall construction. ''NH'' and ''ENH'' are used to construct ''H*''. The maximum input size of the function family ''H*'' is and the output size is 128 bits, split into 64 bits each for the message and the hash. The collision probability for the ''H*''-function ranges from to . To construct the strongly universal function family ''F'', the ∆-universal hash family MMH* is transformed into a strongly universal hash family by adding another key.
]
Two steps on Badger
There are two steps that have to be executed for every message: processing phase and finalize phase.
Processing phase[
In this phase, the data is hashed to a 64-bit string. A core function : is used in this processing phase, that hashes a 128-bit string to a 64-bit string as follows:
:
for any ''n'', means addition modulo . Given a ''2n''-bit string ''x'', ''L(x)'' means least significant ''n'' bits, and ''U(x)'' means most significant ''n'' bits.
A message can be processed by using this function. Denote level_key i] by .
Pseudo-code of the processing phase is as follow.
L = , M,
if L = 0
M^1 = ⋯ = M^u = 0
Go to finalization
r = L mod 64
if r≠0:
M = 0^(64-r)∥M
for i = 1 to u:
M^i = M
v^' = max
for j = 1 to v^':
divide M^i into 64-bit blocks, M^i = m_t^i∥⋯∥m_1^i
if t is even:
M^i = h(k_j^i, m_t^i, m_(t-1)^i) ∥⋯∥ h(k_j^i, m_2^i, m_1^i)
else
M^i = m_t^i∥h(k_j^i, m_(t-1)^i, m_(t-2)^i) ∥⋯∥ h(k_j^i, m_2^i, m_1^i)
Finalize phase][
In this phase, the 64-string resulting from the processing phase is transformed into the desired MAC tag. This finalization phase uses the Rabbit (cipher), Rabbit stream cipher and uses both key setup and IV setup by taking the finalization key final_key i] as .
Pseudo-code of the finalization phase
RabbitKeySetup(K)
RabbitIVSetup(N)
for i = 1 to u:
Q^i =0^7∥L∥M^i
divide Q^i into 27-bit blocks, Q^i=q_5^i∥⋯∥q_1^i
S^i =(Σ_(j=1)^5 (q_j^i K_j^i))+K_6^i mod p
S = S^u∥⋯∥S^1
S = S ⨁ RabbitNextbit(u∙32)
return S
Notation:
From the pseudocode above, ''k'' denotes the key in the Rabbit Key Setup(K) which initializes Rabbit with the 128-bit key ''k''. ''M'' denotes the message to be hashed and , ''M'', denotes the length of the message in bits. q_i denotes a message ''M'' that is divided into ''i'' blocks. For the given ''2n''-bit string ''x'' then L(''x'') and U(''x'') respectively denoted its least significant n bits and most significant ''n'' bits.
]
Performance
Boesgard, Christensen and Zenner report the performance of Badger measured on a 1.0 GHz Pentium III
The Pentium III (marketed as Intel Pentium III Processor, informally PIII or P3) brand refers to Intel's 32-bit x86 desktop and mobile CPUs based on the sixth-generation P6 microarchitecture introduced on February 28, 1999. The brand's initial ...
and on a 1.7 GHz Pentium 4
Pentium 4 is a series of single-core CPUs for desktops, laptops and entry-level servers manufactured by Intel. The processors were shipped from November 20, 2000 until August 8, 2008. The production of Netburst processors was active from 200 ...
processor.[ The speed-optimized versions were programmed in assembly language inlined in C and compiled using the Intel ]C++
C++ (pronounced "C plus plus") is a high-level general-purpose programming language created by Danish computer scientist Bjarne Stroustrup as an extension of the C programming language, or "C with Classes". The language has expanded significan ...
7.1 compiler.
The following table presents Badger's properties for various restricted message lengths. "Memory req." denotes the amount of memory required to store the internal state including key material and the inner state of the Rabbit (cipher), Rabbit stream cipher . "Setup" denotes the key setup, and "Fin." denotes finalization with IV-setup.
MMH (Multilinear Modular Hashing)
The name MMH stands for Multilinear-Modular-Hashing. Applications in Multimedia
Multimedia is a form of communication that uses a combination of different content forms such as text, audio, images, animations, or video into a single interactive presentation, in contrast to tradit ...
are for example to verify the integrity
Integrity is the practice of being honest and showing a consistent and uncompromising adherence to strong moral and ethical principles and values.
In ethics, integrity is regarded as the honesty and truthfulness or accuracy of one's actions. In ...
of an on-line multimedia title. The performance of MMH is based on the improved support of integer scalar product
In mathematics, the dot product or scalar productThe term ''scalar product'' means literally "product with a scalar as a result". It is also used sometimes for other symmetric bilinear forms, for example in a pseudo-Euclidean space. is an alge ...
s in modern microprocessors.
MMH uses single precision scalar products as its most basic operation. It consists of a (modified) inner product
In mathematics, an inner product space (or, rarely, a Hausdorff pre-Hilbert space) is a real vector space or a complex vector space with an operation called an inner product. The inner product of two vectors in the space is a scalar, often ...
between the message and a key modulo a prime
A prime number (or a prime) is a natural number greater than 1 that is not a product of two smaller natural numbers. A natural number greater than 1 that is not prime is called a composite number. For example, 5 is prime because the only ways ...
. The construction of MMH works in the finite field
In mathematics, a finite field or Galois field (so-named in honor of Évariste Galois) is a field that contains a finite number of elements. As with any field, a finite field is a set on which the operations of multiplication, addition, subtr ...
for some prime integer .
MMH*
MMH* involves a construction of a family of hash functions
A hash function is any function that can be used to map data of arbitrary size to fixed-size values. The values returned by a hash function are called ''hash values'', ''hash codes'', ''digests'', or simply ''hashes''. The values are usually ...
consisting of multilinear functions on for some positive integer . The family MMH* of functions from to is defined as follows.
:
where ''x, m'' are vectors, and the functions are defined as follows.
: = =
In the case of MAC, is a message and is a key where and .
MMH* should satisfy the security requirements of a MAC, enabling say Ana and Bob to communicate in an authenticated way. They have a secret key . Say Charles listens to the conversation between Ana and Bob and wants to change the message into his own message to Bob which should pass as a message from Ana. So, his message and Ana's message will differ in at least one bit (e.g. ).
Assume that Charles knows that the function is of the form and he knows Ana's message but he does not know the key ''x'' then the probability that Charles can change the message or send his own message can be explained by the following theorem.
Theorem 1:The family MMH* is ∆-universal.
Proof:
Take , and let be two different messages. Assume without loss of generality
''Without loss of generality'' (often abbreviated to WOLOG, WLOG or w.l.o.g.; less commonly stated as ''without any loss of generality'' or ''with no loss of generality'') is a frequently used expression in mathematics. The term is used to indicat ...
that . Then for any choice of , there is
:
To explain the theorem above, take for prime represent the field as . If one takes an element in , let say then the probability that is
:
So, what one actually needs to compute is
:
But,
:
From the proof above, is the collision
In physics, a collision is any event in which two or more bodies exert forces on each other in a relatively short time. Although the most common use of the word ''collision'' refers to incidents in which two or more objects collide with great fo ...
probability
Probability is the branch of mathematics concerning numerical descriptions of how likely an event is to occur, or how likely it is that a proposition is true. The probability of an event is a number between 0 and 1, where, roughly speakin ...
of the attacker in 1 round, so on average verification queries will suffice to get one message accepted. To reduce the collision probability, it is necessary to choose large ''p'' or to concatenate such MACs using independent keys so that the collision probability] becomes . In this case the number of keys are increased by a factor of and the output is also increased by .
MMH*32
Halevi and Krawczyk[ construct a variant called . The construction works with 32-bit ]integers
An integer is the number zero (), a positive natural number (, , , etc.) or a negative integer with a minus sign ( −1, −2, −3, etc.). The negative numbers are the additive inverses of the corresponding positive numbers. In the language ...
and with the prime
A prime number (or a prime) is a natural number greater than 1 that is not a product of two smaller natural numbers. A natural number greater than 1 that is not prime is called a composite number. For example, 5 is prime because the only ways ...
integer . Actually the prime ''p'' can be chosen to be any prime which satisfies