# Secure digest functions

People know that digital signatures are very important in information security. The security of digital signatures depends on the cryptographic strength of the underlying hash functions. Hash functions also have many other applications in cryptography such as data integrity, group signature, e-cash and many other cryptographic protocols. The use of hash functions in these applications not only ensure the security, but also greatly improve the efficiency. Nowadays, there are two widely used hash functions — MD5 and SHA-1.

MD5 is a hash function designed by Ron Rivest as a strengthened version of MD4. Since its publication, some weaknesses has been found. In 1993, B. den Boer and A. Bosselaers found a kind of pseudo-collision for MD5 which consists of the same message with two different sets of initial values. This attack discloses the weak avalanche in the most significant bit for all the chaining variables in MD5. In the rump session of Eurocrypt’96, H. Dobbertin presented a semi free-start collision which consists of two different 512-bit messages with a chosen initial value.

# MD5 Algorithms

In order to conveniently describe the general structure of MD5, we first recall the iteration process for hash functions. Generally a hash function is iterated by a compression function X = f(Z) which compresses l-bit message block Z to s-bit hash value X where l>s. For MD5, l = 512, and s = 128. The iterating method is usually called the MerkleDamgard meta-method. For a padded message M with multiples of l-bit length, the iterating process is as follows:

Here M = (M0, M2, ··· , Mt−1), and H0 = IV0 is the initial value for the hash function. In the above iterating process, we omit the padding method because it has no influence on our attack. The following is to describe the compression function for MD5. For each 512-bit block Mi of the padded message M, divide Mi into 32-bit words.

The compression algorithm for Mi has four rounds, and each round has 16 operations. Four successive step operations are as follows:

# SHA-1 Algorithm

The hash function SHA-1 takes a message of length less than 264 bits and produces a 160-bit hash value. The input message is padded and then processed in 512-bit blocks in the Damgard/Merkle iterative structure. Each iteration invokes a so-called compression function which takes a 160-bit chaining value and a 512-bit message block and outputs another 160-bit chaining value. The initial chaining value (called IV) is a set of fixed constants, and the final chaining value is the hash of the message. In what follows, we describe the compression function of SHA-1. For each 512-bit block of the padded message, divide it into 16 32-bit words, (m0, m1, …., m15). The message words are first expanded as follows:

The expanded message words are then processed in four rounds, each consisting of 20 steps. The step function is defined as follows.

The initial chaining value

is defined as: (0x67452301, 0xefcdab89, 0x98badcfe, 0x10325476, 0xc3d2e1f0) Each round employs a different Boolean function fi and constant ki, which is summarized in Table 1.