# Two Different Files with the Same MD5 Value

Verifying the results of:

Collisions for Hash Functions MD4, MD5, HAVAL-128 and RIPEMD
by Xiaoyun Wang, Dengguo Feng, Xuejia Lai, Hongbo Yu

(available from http://eprint.iacr.org/2004/199/)

An earlier version of the paper by Wang et al. (Aug 16) contained a flaw,
which led to questions about whether or not the attack was real. The bug
has now (Aug 17, 2004) been fixed.

Note that the paper contains no information about the methods and
algorithms used in the attack. My moderately educated guess is that
both Joux and the Chinese group have used the "neutral bit" and other
techniques of Biham and Chen ("Near-Collisions of SHA-0", Crypto 2004)
to improve older attacks. These are exciting times in hash function
cryptanalysis!

The new paper provides at least one real collision for the MD5 function,
which I have extracted from it. You can now easily check it too.

Consider these 128-byte files, which differ in only six bytes (in fact
their Hamming distance is only six bits, too, since each differing byte
differs only in its most significant bit):

file1.dat:

00000000  d1 31 dd 02 c5 e6 ee c4  69 3d 9a 06 98 af f9 5c
00000010  2f ca b5 87 12 46 7e ab  40 04 58 3e b8 fb 7f 89
00000020  55 ad 34 06 09 f4 b3 02  83 e4 88 83 25 71 41 5a
00000030  08 51 25 e8 f7 cd c9 9f  d9 1d bd f2 80 37 3c 5b
00000040  96 0b 1d d1 dc 41 7b 9c  e4 d8 97 f4 5a 65 55 d5
00000050  35 73 9a c7 f0 eb fd 0c  30 29 f1 66 d1 09 b1 8f
00000060  75 27 7f 79 30 d5 5c eb  22 e8 ad ba 79 cc 15 5c
00000070  ed 74 cb dd 5f c5 d3 6d  b1 9b 0a d8 35 cc a7 e3

MD5(file1.dat) = a4c0d35c95a63a805915367dcfe6b751

file2.dat:

00000000  d1 31 dd 02 c5 e6 ee c4  69 3d 9a 06 98 af f9 5c
00000010  2f ca b5 07 12 46 7e ab  40 04 58 3e b8 fb 7f 89
00000020  55 ad 34 06 09 f4 b3 02  83 e4 88 83 25 f1 41 5a
00000030  08 51 25 e8 f7 cd c9 9f  d9 1d bd 72 80 37 3c 5b
00000040  96 0b 1d d1 dc 41 7b 9c  e4 d8 97 f4 5a 65 55 d5
00000050  35 73 9a 47 f0 eb fd 0c  30 29 f1 66 d1 09 b1 8f
00000060  75 27 7f 79 30 d5 5c eb  22 e8 ad ba 79 4c 15 5c
00000070  ed 74 cb dd 5f c5 d3 6d  b1 9b 0a 58 35 cc a7 e3

MD5(file2.dat) = a4c0d35c95a63a805915367dcfe6b751
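
If you would rather not retype the bytes by hand, the two dumps above can be turned back into files and compared with a short script. The following is a minimal Python sketch and not part of the original note; the names `hamming` and `write_pair` are made up for illustration, and the hex strings are simply the dump rows with the offset column dropped.

```python
import hashlib
import os

# Rows of the two hex dumps above, offset column removed.
FILE1_HEX = """
d1 31 dd 02 c5 e6 ee c4  69 3d 9a 06 98 af f9 5c
2f ca b5 87 12 46 7e ab  40 04 58 3e b8 fb 7f 89
55 ad 34 06 09 f4 b3 02  83 e4 88 83 25 71 41 5a
08 51 25 e8 f7 cd c9 9f  d9 1d bd f2 80 37 3c 5b
96 0b 1d d1 dc 41 7b 9c  e4 d8 97 f4 5a 65 55 d5
35 73 9a c7 f0 eb fd 0c  30 29 f1 66 d1 09 b1 8f
75 27 7f 79 30 d5 5c eb  22 e8 ad ba 79 cc 15 5c
ed 74 cb dd 5f c5 d3 6d  b1 9b 0a d8 35 cc a7 e3
"""
FILE2_HEX = """
d1 31 dd 02 c5 e6 ee c4  69 3d 9a 06 98 af f9 5c
2f ca b5 07 12 46 7e ab  40 04 58 3e b8 fb 7f 89
55 ad 34 06 09 f4 b3 02  83 e4 88 83 25 f1 41 5a
08 51 25 e8 f7 cd c9 9f  d9 1d bd 72 80 37 3c 5b
96 0b 1d d1 dc 41 7b 9c  e4 d8 97 f4 5a 65 55 d5
35 73 9a 47 f0 eb fd 0c  30 29 f1 66 d1 09 b1 8f
75 27 7f 79 30 d5 5c eb  22 e8 ad ba 79 4c 15 5c
ed 74 cb dd 5f c5 d3 6d  b1 9b 0a 58 35 cc a7 e3
"""

# Strip all whitespace before decoding so this works on older Python 3 too.
file1 = bytes.fromhex("".join(FILE1_HEX.split()))
file2 = bytes.fromhex("".join(FILE2_HEX.split()))


def hamming(a, b):
    """Return (offsets of differing bytes, total number of differing bits)."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    offsets = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    bits = sum(bin(x ^ y).count("1") for x, y in zip(a, b))
    return offsets, bits


def write_pair(directory="."):
    """Write file1.dat and file2.dat so the shell commands can be run on them."""
    for name, data in (("file1.dat", file1), ("file2.dat", file2)):
        with open(os.path.join(directory, name), "wb") as fh:
            fh.write(data)


if __name__ == "__main__":
    offsets, bits = hamming(file1, file2)
    print("differing byte offsets:", [hex(i) for i in offsets])
    print("differing bits:", bits)
    for name, data in (("file1.dat", file1), ("file2.dat", file2)):
        print(name, hashlib.md5(data).hexdigest())
```

Calling `write_pair()` drops both files into the current directory, after which `cmp` and `md5sum` can be run on them directly.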

Once you have downloaded these files you can easily verify (in a UNIX
shell) that this is indeed a collision for MD5:

$ cmp file1.dat file2.dat
file1.dat file2.dat differ: char 20, line 1
$ md5sum file1.dat
a4c0d35c95a63a805915367dcfe6b751  file1.dat
$ md5sum file2.dat
a4c0d35c95a63a805915367dcfe6b751  file2.dat

This clearly shows that the resistance of MD5 against collision attacks
is significantly lower than the 2^64 indicated by its 128-bit digest.
Since the attack allows free selection of the IV, these attacks mean that
MD5 should not be used for any serious cryptographic purpose.

Note that because MD5 is a chained hash function, you can generate an
infinity of new collisions from these by a simple process of
concatenation:

$ echo 'Hello, World!' > hello.txt
$ cat file1.dat hello.txt | md5sum
158701224aef36986648d9f0dfb0ca3c  -
$ cat file2.dat hello.txt | md5sum
158701224aef36986648d9f0dfb0ca3c  -

Here the text "Hello, World!" has simply been added at the end of the
previous collisions.
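
The same concatenation check can be done without a shell. Below is a minimal Python sketch, not part of the original note, that appends an arbitrary suffix to both colliding files and compares the digests. The helper names `md5_hex` and `extend_pair` are made up for illustration. The second file is rebuilt from the first by flipping the top bit at the six offsets where the two dumps differ. The trick relies on both files being exactly two 64-byte MD5 blocks long: the chaining state agrees after the second block, and everything hashed after that point is identical.

```python
import hashlib

# file1.dat from the hex dump above, as one contiguous hex string.
FILE1 = bytes.fromhex(
    "d131dd02c5e6eec4693d9a0698aff95c"
    "2fcab58712467eab4004583eb8fb7f89"
    "55ad340609f4b30283e488832571415a"
    "085125e8f7cdc99fd91dbdf280373c5b"
    "960b1dd1dc417b9ce4d897f45a6555d5"
    "35739ac7f0ebfd0c3029f166d109b18f"
    "75277f7930d55ceb22e8adba79cc155c"
    "ed74cbdd5fc5d36db19b0ad835cca7e3"
)

# Offsets where file2.dat differs from file1.dat in the dumps above;
# at each of them only the most significant bit (0x80) is flipped.
FLIPPED = (0x13, 0x2D, 0x3B, 0x53, 0x6D, 0x7B)
FILE2 = bytes(b ^ 0x80 if i in FLIPPED else b for i, b in enumerate(FILE1))


def md5_hex(data):
    """MD5 digest of data as a lowercase hex string."""
    return hashlib.md5(data).hexdigest()


def extend_pair(a, b, suffix):
    """Digests of a + suffix and b + suffix, i.e. `cat x hello.txt | md5sum`."""
    return md5_hex(a + suffix), md5_hex(b + suffix)


if __name__ == "__main__":
    suffix = b"Hello, World!\n"  # echo appends a trailing newline
    print("plain:   ", md5_hex(FILE1), md5_hex(FILE2))
    d1, d2 = extend_pair(FILE1, FILE2, suffix)
    print("extended:", d1, d2)
    print("still colliding:", d1 == d2)
```

Any other suffix can be passed to `extend_pair` in place of the greeting; the suffix only has to be the same for both files.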

Cheers,
-mjos

This file was written by Markku-Juhani O. Saarinen <mjos@iki.fi> on Aug 17, 2004.