The Salsa20 core
The Salsa20 core is a function from 64-byte strings to 64-byte strings: the Salsa20 core reads a 64-byte string x and produces a 64-byte string Salsa20(x).
Translator's note: the pseudorandom bytes produced by the core are used mainly as a keystream that is xor'ed with a message to encrypt it.
The Salsa20 stream cipher has a separate page. The Salsa20 stream cipher uses the Salsa20 core to encrypt data.
The Rumba20 compression function has a separate page. The Rumba20 compression function uses the Salsa20 core to compress a 192-byte string to a 64-byte string.
I originally introduced the Salsa20 core as the "Salsa20 hash function," but this terminology turns out to confuse people who think that "hash function" means "collision-resistant compression function." The Salsa20 core does not compress and is not collision-resistant. If you want a collision-resistant compression function, look at Rumba20. (I wonder what the same people think of the FNV hash function, perfect hash functions, universal hash functions, etc.)
Translator's note: a function is collision-resistant if it is computationally infeasible to find two different inputs that produce the same output. This does not mean that such inputs do not exist.
History: I introduced Salsa20 in March 2005. It is a refinement of Salsa10, which I introduced in November 2004.
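Translator's note: ChaCha20 makes small changes to the core so that bits diffuse faster. Each ChaCha quarter-round updates each of its four words twice, and each input word of the quarter-round has a chance to affect each output word.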
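To make the comparison concrete, here is a minimal sketch of the ChaCha quarter-round as published in RFC 8439, section 2.1. The function and helper names are illustrative and do not come from this page. Each of the four words is written twice, whereas the Salsa20 quarter-round writes each word once.

```c
#include <stdint.h>

/* 32-bit left rotation; valid for 0 < k < 32. */
static uint32_t rotl32(uint32_t v, unsigned k)
{
    return (v << k) | (v >> (32 - k));
}

/* ChaCha quarter-round on four words, in place (illustrative name). */
static void chacha_quarter_round(uint32_t *a, uint32_t *b,
                                 uint32_t *c, uint32_t *d)
{
    *a += *b; *d ^= *a; *d = rotl32(*d, 16);
    *c += *d; *b ^= *c; *b = rotl32(*b, 12);
    *a += *b; *d ^= *a; *d = rotl32(*d, 8);
    *c += *d; *b ^= *c; *b = rotl32(*b, 7);
}
```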
Definition of the Salsa20 core
The 64-byte input x to Salsa20 is viewed in little-endian form as 16 words x0, x1, x2, ..., x15 in {0,1,...,2^32-1}. These 16 words are fed through 320 invertible modifications, where each modification changes one word. The resulting 16 words are added to the original x0, x1, x2, ..., x15 respectively modulo 2^32, producing, in little-endian form, the 64-byte output Salsa20(x).
Translator's note: each word is an unsigned 4-byte integer stored in little-endian byte order, which is why its value lies in {0,1,...,2^32-1}.
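The word view and the final addition can be sketched in C as follows. This is a minimal illustration, assuming the hypothetical helper names load32_le, store32_le, block_to_words and feedforward, none of which appear on the original page. The 320 modifications that sit between the two steps are left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Read 4 bytes as one little-endian 32-bit word. */
static uint32_t load32_le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Write one 32-bit word as 4 little-endian bytes. */
static void store32_le(uint8_t *p, uint32_t w)
{
    p[0] = (uint8_t)w;
    p[1] = (uint8_t)(w >> 8);
    p[2] = (uint8_t)(w >> 16);
    p[3] = (uint8_t)(w >> 24);
}

/* View the 64-byte input as the 16 words x0, x1, ..., x15. */
static void block_to_words(uint32_t x[16], const uint8_t in[64])
{
    for (size_t i = 0; i < 16; ++i)
        x[i] = load32_le(in + 4 * i);
}

/* Final step: add the modified words z to the original words x.
   uint32_t arithmetic wraps, so the sum is taken modulo 2^32. */
static void feedforward(uint8_t out[64], const uint32_t z[16],
                        const uint32_t x[16])
{
    for (size_t i = 0; i < 16; ++i)
        store32_le(out + 4 * i, z[i] + x[i]);
}
```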
Each modification involves xor'ing into one word a rotated version of the sum of two other words modulo 2^32. Thus the 320 modifications involve, overall, 320 additions, 320 xor's, and 320 rotations. The rotations are all by constant distances.
Translator's note: each modification changes one word z. Two other words x and y are added modulo 2^32, the sum is rotated left by a constant distance, and the result is xor'ed into z.
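A single modification can be sketched as below. The names rotl32 and modify are chosen here for illustration and are not part of the original page.

```c
#include <stdint.h>

/* 32-bit left rotation; valid for 0 < k < 32. */
static uint32_t rotl32(uint32_t v, unsigned k)
{
    return (v << k) | (v >> (32 - k));
}

/* One modification: xor into z the sum x + y (modulo 2^32),
   rotated left by the constant distance k. Returns the new z. */
static uint32_t modify(uint32_t z, uint32_t x, uint32_t y, unsigned k)
{
    return z ^ rotl32(x + y, k);
}
```

Because xor is its own inverse, applying modify a second time with the same x, y and k restores the original z. This is why each modification is invertible.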
The entire series of modifications is a series of 10 identical double-rounds. Each double-round is a series of 2 rounds. Each round is a set of 4 parallel quarter-rounds. Each quarter-round modifies 4 words.
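The nesting described above (10 double-rounds, each of 2 rounds, each of 4 quarter-rounds, each of 4 modifications) gives 10 x 2 x 4 x 4 = 320 modifications. The sketch below makes that count explicit with a table-driven loop. The table layout and the names QR_INDEX, DIST, double_round and salsa20_rounds_sketch are assumptions made for this illustration. The word indices follow the Salsa20 specification, where the first round of a double-round works on columns of the 4x4 word matrix and the second on rows. The final addition of the input words is not included.

```c
#include <stdint.h>

/* 32-bit left rotation; valid for 0 < k < 32. */
static uint32_t rotl32(uint32_t v, unsigned k)
{
    return (v << k) | (v >> (32 - k));
}

/* Word indices (a, b, c, d) of the 4 quarter-rounds in each of
   the 2 rounds that make up one double-round. */
static const int QR_INDEX[2][4][4] = {
    { { 0,  4,  8, 12}, { 5,  9, 13,  1}, {10, 14,  2,  6}, {15,  3,  7, 11} },
    { { 0,  1,  2,  3}, { 5,  6,  7,  4}, {10, 11,  8,  9}, {15, 12, 13, 14} },
};

/* Rotation distances of the 4 modifications in a quarter-round. */
static const unsigned DIST[4] = {7, 9, 13, 18};

/* Apply one double-round in place.
   Returns the number of word modifications performed. */
static int double_round(uint32_t x[16])
{
    int count = 0;
    for (int r = 0; r < 2; ++r) {
        for (int q = 0; q < 4; ++q) {
            const int *w = QR_INDEX[r][q];
            /* Modification m changes word w[(m+1)%4] using the sum
               of words w[m] and w[(m+3)%4]. */
            for (int m = 0; m < 4; ++m) {
                x[w[(m + 1) % 4]] ^=
                    rotl32(x[w[m]] + x[w[(m + 3) % 4]], DIST[m]);
                ++count;
            }
        }
    }
    return count;
}

/* Apply all 10 double-rounds in place; returns the total count. */
static int salsa20_rounds_sketch(uint32_t x[16])
{
    int count = 0;
    for (int i = 0; i < 10; ++i)
        count += double_round(x);
    return count;
}
```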
The complete function is defined as follows:
In each quarter-round on words (a, b, c, d), where <<< denotes a 32-bit left rotation:
b ^= (a+d) <<< 7; c ^= (b+a) <<< 9; d ^= (c+b) <<< 13; a ^= (d+c) <<< 18;
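The quarter-round can be written as a small C function. This is a sketch for illustration only: the names rotl32 and salsa20_quarter_round and the pointer-based signature are assumptions, not part of the original page.

```c
#include <stdint.h>

/* 32-bit left rotation; valid for 0 < k < 32. */
static uint32_t rotl32(uint32_t v, unsigned k)
{
    return (v << k) | (v >> (32 - k));
}

/* Salsa20 quarter-round on four words, in place.
   Each statement is one modification, and each uses the word
   updated by the statement before it. */
static void salsa20_quarter_round(uint32_t *a, uint32_t *b,
                                  uint32_t *c, uint32_t *d)
{
    *b ^= rotl32(*a + *d, 7);
    *c ^= rotl32(*b + *a, 9);
    *d ^= rotl32(*c + *b, 13);
    *a ^= rotl32(*d + *c, 18);
}
```

Starting from (a, b, c, d) = (1, 0, 0, 0), the four words become (0x08008145, 0x00000080, 0x00010200, 0x20500000), which matches the quarter-round example in the Salsa20 specification.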
// The R(a,b) macro is a 32-bit left rotation (rotl32). C has no rotation operator, although most CPUs provide a rotate instruction.
// uint32 is assumed to be an unsigned type of exactly 32 bits, for example: typedef uint32_t uint32;
#define R(a,b) (((a) << (b)) | ((a) >> (32 - (b))))
void salsa20_word_specification(uint32 out[16],uint32 in[16])
{
  int i;
  uint32 x[16];
  for (i = 0;i < 16;++i) x[i] = in[i];
  // This loop runs 10 times, one double-round per iteration.
  for (i = 20;i > 0;i -= 2) {
    // The rotation distances are always 7, 9, 13, 18. Every 4 statements form one quarter-round;
    // for example, the first 4 statements modify only x[4], x[8], x[12], x[0].
    x[ 4] ^= R(x[ 0]+x[12], 7);  x[ 8] ^= R(x[ 4]+x[ 0], 9);
    x[12] ^= R(x[ 8]+x[ 4],13);  x[ 0] ^= R(x[12]+x[ 8],18);
    x[ 9] ^= R(x[ 5]+x[ 1], 7);  x[13] ^= R(x[ 9]+x[ 5], 9);
    x[ 1] ^= R(x[13]+x[ 9],13);  x[ 5] ^= R(x[ 1]+x[13],18);
    x[14] ^= R(x[10]+x[ 6], 7);  x[ 2] ^= R(x[14]+x[10], 9);
    x[ 6] ^= R(x[ 2]+x[14],13);  x[10] ^= R(x[ 6]+x[ 2],18);
    x[ 3] ^= R(x[15]+x[11], 7);  x[ 7] ^= R(x[ 3]+x[15], 9);
    x[11] ^= R(x[ 7]+x[ 3],13);  x[15] ^= R(x[11]+x[ 7],18);
    x[ 1] ^= R(x[ 0]+x[ 3], 7);  x[ 2] ^= R(x[ 1]+x[ 0], 9);
    x[ 3] ^= R(x[ 2]+x[ 1],13);  x[ 0] ^= R(x[ 3]+x[ 2],18);
    x[ 6] ^= R(x[ 5]+x[ 4], 7);  x[ 7] ^= R(x[ 6]+x[ 5], 9);
    x[ 4] ^= R(x[ 7]+x[ 6],13);  x[ 5] ^= R(x[ 4]+x[ 7],18);
    x[11] ^= R(x[10]+x[ 9], 7);  x[ 8] ^= R(x[11]+x[10], 9);
    x[ 9] ^= R(x[ 8]+x[11],13);  x[10] ^= R(x[ 9]+x[ 8],18);
    x[12] ^= R(x[15]+x[14], 7);  x[13] ^= R(x[12]+x[15], 9);
    x[14] ^= R(x[13]+x[12],13);  x[15] ^= R(x[14