Verifying the results of:
Collisions for Hash Functions MD4, MD5, HAVAL-128 and RIPEMD
by Xiaoyun Wang, Dengguo Feng, Xuejia Lai, Hongbo Yu
(available from http://eprint.iacr.org/2004/199/)
An earlier version of the paper by Wang et al. (Aug 16) contained a flaw
which led to questions about whether the attack is real. The bug has
now (Aug 17, 2004) been fixed.
Note that the paper contains no information about the methods and
algorithms used in the attack. My moderately educated guess is that
both Joux and the Chinese group have used the "neutral bit" and other
techniques of Biham and Chen ("Near-Collisions of SHA-0", Crypto 2004)
to improve older attacks. These are exciting times in hash function
cryptanalysis!
The new paper provides at least one real collision for the MD5 function
-- which I have extracted from it. You can now easily check it too.
Consider these 128-byte files, which differ in only six bytes (in fact
their Hamming distance is only six bits, too):
file1.dat:
00000000 d1 31 dd 02 c5 e6 ee c4 69 3d 9a 06 98 af f9 5c
00000010 2f ca b5 87 12 46 7e ab 40 04 58 3e b8 fb 7f 89
00000020 55 ad 34 06 09 f4 b3 02 83 e4 88 83 25 71 41 5a
00000030 08 51 25 e8 f7 cd c9 9f d9 1d bd f2 80 37 3c 5b
00000040 96 0b 1d d1 dc 41 7b 9c e4 d8 97 f4 5a 65 55 d5
00000050 35 73 9a c7 f0 eb fd 0c 30 29 f1 66 d1 09 b1 8f
00000060 75 27 7f 79 30 d5 5c eb 22 e8 ad ba 79 cc 15 5c
00000070 ed 74 cb dd 5f c5 d3 6d b1 9b 0a d8 35 cc a7 e3
MD5(file1.dat) = a4c0d35c95a63a805915367dcfe6b751
file2.dat:
00000000 d1 31 dd 02 c5 e6 ee c4 69 3d 9a 06 98 af f9 5c
00000010 2f ca b5 07 12 46 7e ab 40 04 58 3e b8 fb 7f 89
00000020 55 ad 34 06 09 f4 b3 02 83 e4 88 83 25 f1 41 5a
00000030 08 51 25 e8 f7 cd c9 9f d9 1d bd 72 80 37 3c 5b
00000040 96 0b 1d d1 dc 41 7b 9c e4 d8 97 f4 5a 65 55 d5
00000050 35 73 9a 47 f0 eb fd 0c 30 29 f1 66 d1 09 b1 8f
00000060 75 27 7f 79 30 d5 5c eb 22 e8 ad ba 79 4c 15 5c
00000070 ed 74 cb dd 5f c5 d3 6d b1 9b 0a 58 35 cc a7 e3
MD5(file2.dat) = a4c0d35c95a63a805915367dcfe6b751
Once you have downloaded these files you can easily verify (in UNIX shell)
that this is indeed a collision for MD5:
$ cmp file1.dat file2.dat
file1.dat file2.dat differ: char 20, line 1
$ md5sum file1.dat
a4c0d35c95a63a805915367dcfe6b751 file1.dat
$ md5sum file2.dat
a4c0d35c95a63a805915367dcfe6b751 file2.dat
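If the files are not at hand, the same check can be sketched in Python
directly from the hex dumps above. This is only a minimal sketch, assuming
Python 3 with MD5 available in hashlib. The names FILE1_HEX,
FLIPPED_OFFSETS, build_pair and hamming_distance are made up for this
illustration. The second message is derived from the first by flipping the
top bit of the six bytes where the two dumps differ.

```python
import hashlib

# Contents of file1.dat, transcribed from the hex dump above.
FILE1_HEX = """
d1 31 dd 02 c5 e6 ee c4 69 3d 9a 06 98 af f9 5c
2f ca b5 87 12 46 7e ab 40 04 58 3e b8 fb 7f 89
55 ad 34 06 09 f4 b3 02 83 e4 88 83 25 71 41 5a
08 51 25 e8 f7 cd c9 9f d9 1d bd f2 80 37 3c 5b
96 0b 1d d1 dc 41 7b 9c e4 d8 97 f4 5a 65 55 d5
35 73 9a c7 f0 eb fd 0c 30 29 f1 66 d1 09 b1 8f
75 27 7f 79 30 d5 5c eb 22 e8 ad ba 79 cc 15 5c
ed 74 cb dd 5f c5 d3 6d b1 9b 0a d8 35 cc a7 e3
"""

# Zero-based offsets at which the file2.dat dump differs from file1.dat.
# In every case only the most significant bit of the byte changes.
FLIPPED_OFFSETS = (0x13, 0x2D, 0x3B, 0x53, 0x6D, 0x7B)


def build_pair():
    """Return (file1, file2) as bytes objects."""
    m1 = bytes.fromhex("".join(FILE1_HEX.split()))
    m2 = bytearray(m1)
    for offset in FLIPPED_OFFSETS:
        m2[offset] ^= 0x80  # flip the top bit
    return m1, bytes(m2)


def hamming_distance(a, b):
    """Number of bit positions in which two equal-length strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


file1, file2 = build_pair()
digest1 = hashlib.md5(file1).hexdigest()
digest2 = hashlib.md5(file2).hexdigest()

print("files differ:    ", file1 != file2)
print("Hamming distance:", hamming_distance(file1, file2))
print("MD5(file1.dat) = ", digest1)
print("MD5(file2.dat) = ", digest2)
```

The first differing offset, 0x13, is byte 20 when counting from one, which
agrees with the cmp output above.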
This clearly shows that the resistance of MD5 against collision attacks
is significantly lower than the 2^64 operations suggested by its 128-bit
digest. Since the attack allows free selection of IV, these attacks mean
that MD5 should not be used for any serious cryptographic purpose.
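The 2^64 figure is the generic birthday bound: for an n-bit digest, a
brute-force collision search needs roughly 2^(n/2) hash evaluations. As a
rough illustration only, the sketch below runs such a search against MD5
truncated to 24 bits, where a collision typically turns up after a few
thousand attempts. It assumes Python 3 with hashlib; the helper names
truncated_md5 and birthday_collision are made up for this example, and the
search has nothing to do with the method used by Wang et al.

```python
import hashlib


def truncated_md5(data, nbytes):
    """First nbytes bytes of the MD5 digest of data."""
    return hashlib.md5(data).digest()[:nbytes]


def birthday_collision(nbytes):
    """Hash the strings "0", "1", "2", ... until two truncated digests match.

    Returns (first_message, second_message, number_of_hash_evaluations).
    """
    seen = {}  # truncated digest -> message that produced it
    counter = 0
    while True:
        message = str(counter).encode("ascii")
        tag = truncated_md5(message, nbytes)
        if tag in seen:
            return seen[tag], message, counter + 1
        seen[tag] = message
        counter += 1


# 3 bytes = 24 bits, so the birthday bound is around 2^12 evaluations.
first, second, tries = birthday_collision(3)
print("colliding inputs:", first, second)
print("hash evaluations:", tries)
```

Scaling the same search to the full 128-bit digest would take on the order
of 2^64 evaluations, which is the figure the published collision undercuts.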
Note that because MD5 is a chained hash function, you can generate an
infinite number of new collisions from these by a simple process of
concatenation:
$ echo 'Hello, World!' > hello.txt
$ cat file1.dat hello.txt | md5sum
158701224aef36986648d9f0dfb0ca3c -
$ cat file2.dat hello.txt | md5sum
158701224aef36986648d9f0dfb0ca3c -
Here the text "Hello, World!" has simply been appended to each of the
two colliding files.
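The reason this works is that both files are exactly two 64-byte MD5
blocks long, so after processing them the internal chaining state is the
same, and any common suffix is hashed from that same state. A minimal
sketch of this, assuming Python 3 with hashlib (FILE1_HEX, FLIPPED_OFFSETS
and md5_with_suffix are names invented for this example):

```python
import hashlib

# file1.dat from the hex dump earlier in this note.
FILE1_HEX = """
d1 31 dd 02 c5 e6 ee c4 69 3d 9a 06 98 af f9 5c
2f ca b5 87 12 46 7e ab 40 04 58 3e b8 fb 7f 89
55 ad 34 06 09 f4 b3 02 83 e4 88 83 25 71 41 5a
08 51 25 e8 f7 cd c9 9f d9 1d bd f2 80 37 3c 5b
96 0b 1d d1 dc 41 7b 9c e4 d8 97 f4 5a 65 55 d5
35 73 9a c7 f0 eb fd 0c 30 29 f1 66 d1 09 b1 8f
75 27 7f 79 30 d5 5c eb 22 e8 ad ba 79 cc 15 5c
ed 74 cb dd 5f c5 d3 6d b1 9b 0a d8 35 cc a7 e3
"""

# file2.dat differs from file1.dat only in the top bit of these bytes.
FLIPPED_OFFSETS = (0x13, 0x2D, 0x3B, 0x53, 0x6D, 0x7B)

file1 = bytes.fromhex("".join(FILE1_HEX.split()))
file2 = bytearray(file1)
for offset in FLIPPED_OFFSETS:
    file2[offset] ^= 0x80
file2 = bytes(file2)

# Hash each colliding file once; copy() lets us reuse the internal state.
state1 = hashlib.md5(file1)
state2 = hashlib.md5(file2)


def md5_with_suffix(state, suffix):
    """Digest of (already hashed data + suffix), leaving state untouched."""
    h = state.copy()
    h.update(suffix)
    return h.hexdigest()


# echo adds a trailing newline, hence the "\n" in the first suffix.
suffixes = [b"Hello, World!\n", b"", b"\x00" * 1000, b"any other text"]
for suffix in suffixes:
    d1 = md5_with_suffix(state1, suffix)
    d2 = md5_with_suffix(state2, suffix)
    print(d1, d2, d1 == d2)
```

Every suffix in the list yields a new pair of distinct messages with the
same MD5 digest.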
Cheers,
-mjos
This file was written by Markku-Juhani O. Saarinen <mjos@iki.fi> on Aug 17, 2004.