Hash Functions and Password Security

Hashing Passwords

What is a hash function?

It’s an algorithm that maps an input of arbitrary length to an output of fixed length. This value is known as the HASH, FINGERPRINT or DIGEST. The output is not guaranteed to be unique to its input, as the section on collisions explains.
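To make the "arbitrary length in, fixed length out" idea concrete, here is a minimal sketch using SHA-256 from Python's standard `hashlib` module. The helper name `sha256_hex` and the sample inputs are only illustrative choices, not something from the original article.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 DIGEST of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()

# Inputs of very different lengths (0 bytes up to 1 MB)...
samples = [b"", b"a", b"hello world", b"x" * 1_000_000]

for data in samples:
    digest = sha256_hex(data)
    # ...always give a 256-bit DIGEST, i.e. 64 hexadecimal characters.
    print(f"input length {len(data):>7} -> digest length {len(digest)}: {digest[:16]}...")
```

Whatever the input size, the printed digest length stays the same, and hashing the same bytes twice gives the same DIGEST.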

It is usually used to verify the integrity of data. In fact, digital signature algorithms are applied to the DIGEST and not to the entire document.
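An integrity check of this kind can be sketched as follows: we keep the DIGEST of the original data and later recompute it to see whether the data changed. The function names and the sample messages are hypothetical, and SHA-256 is just one reasonable choice of HASH function.

```python
import hashlib
import hmac

def digest(data: bytes) -> bytes:
    """Compute the SHA-256 DIGEST of data."""
    return hashlib.sha256(data).digest()

def is_intact(data: bytes, expected_digest: bytes) -> bool:
    """Recompute the DIGEST and compare it with the one we kept earlier."""
    return hmac.compare_digest(digest(data), expected_digest)

original = b"Pay Alice 100 EUR"
published_digest = digest(original)  # stored or published alongside the data

print(is_intact(original, published_digest))              # unchanged data
print(is_intact(b"Pay Alice 900 EUR", published_digest))  # tampered data
```

The first check succeeds and the second fails, because changing even one character produces a different DIGEST.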

What are collisions?

Ideally, every input of a HASH function would be mapped to a different output (DIGEST), but this cannot always be true: inputs can be arbitrarily long while the output has a fixed length, so there are more possible inputs than outputs. When two different messages produce the same DIGEST, we have found a collision. That’s not all: for each message there are infinitely many other messages that collide with it.

So what?

The security of HASH functions is based on the fact that it’s very hard to find collisions, even when the hashed message is known. This is fundamental. Suppose we have digitally signed a document: someone who knows it could try to compute a variant that is disadvantageous for us and that collides, that is to say, produces the same DIGEST. Since the signature covers only the DIGEST, it would be valid for the variant too.

So when using a HASH function we have to be sure it’s computationally infeasible to find a collision. How much work an attacker needs is estimated with the birthday paradox.

What is the Birthday paradox?

It is based on the question: “How many people do I need to consider for the probability that at least 2 of them were born on the same day to be greater than 50%?”

So we have to consider pairs of people. Given n people, we can compute the total number of pairs using the simple combination formula:

$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$$

where n indicates the number of people and k the size of the group, in our case a pair, so k = 2. For pairs the formula reduces to n(n-1)/2.

With 57 people the probability that two of them were born on the same day is 99%, considering that we have 1596 pairs and a year has 365 days. So, skipping the calculations, the answer to the question is 23 people (253 pairs).
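For readers who would rather not skip the calculations, the numbers can be checked with a short sketch. It assumes 365 equally likely birthdays and ignores leap years; the function names are illustrative.

```python
import math

def pairs(n: int) -> int:
    """Number of distinct pairs among n people: C(n, 2) = n * (n - 1) / 2."""
    return math.comb(n, 2)

def shared_birthday_probability(n: int, days: int = 365) -> float:
    """Probability that at least two of n people share a birthday.

    Computed as 1 minus the probability that all n birthdays are distinct,
    assuming every day of the year is equally likely.
    """
    if n > days:
        return 1.0
    p_all_distinct = 1.0
    for i in range(n):
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct

for n in (22, 23, 57):
    print(f"{n} people: {pairs(n)} pairs, "
          f"P(shared birthday) = {shared_birthday_probability(n):.4f}")
```

The probability first exceeds 50% at 23 people (253 pairs) and reaches about 99% at 57 people (1596 pairs).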

Birthday paradox in hash functions

The same thinking can be applied to HASH functions: it’s known that we have a probability greater than 50% of finding a collision after trying about 2^(n/2) inputs, where n stands for the number of bits composing the DIGEST.

[Table: number of DIGEST bits and the corresponding number of values to consider.]
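The relation between DIGEST size and the number of values to try can be tabulated with a few lines of code. The digest sizes listed here are common ones picked for illustration and are not necessarily the rows of the original table; 2^(n/2) is the order of magnitude, not an exact threshold.

```python
def birthday_bound(bits: int) -> int:
    """Order of magnitude of inputs needed for a ~50% collision chance: 2^(bits/2)."""
    return 2 ** (bits // 2)

print(f"{'DIGEST bits':>11}  {'values to consider':>20}")
for bits in (64, 128, 160, 256, 512):
    bound = birthday_bound(bits)
    # Show both the power of two and its approximate decimal size.
    print(f"{bits:>11}  2^{bits // 2:<3} ~ {bound:.2e}")
```

Every extra bit pair in the DIGEST doubles the work, which is why the figures grow so quickly from 64-bit to 256-bit outputs.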

Birthday attack

It consists of computing about 2^(n/2) variants of the original document to find a collision. That’s why it’s important to use at least a 256-bit DIGEST.
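A birthday attack is easy to try on a deliberately weakened HASH. The sketch below keeps only the top 24 bits of SHA-256 (a toy digest chosen so the search finishes quickly, not a real algorithm) and generates document variants until two of them collide. The document text and function names are made up for the example.

```python
import hashlib

def truncated_hash(data: bytes, bits: int = 24) -> int:
    """Toy DIGEST: the top `bits` bits of SHA-256 (assumes 0 < bits <= 256)."""
    full = int.from_bytes(hashlib.sha256(data).digest(), "big")
    return full >> (256 - bits)

def birthday_attack(bits: int = 24):
    """Generate document variants until two of them share a truncated DIGEST.

    Returns the two colliding variants and how many variants were hashed.
    """
    seen = {}  # truncated digest -> first variant that produced it
    counter = 0
    while True:
        variant = f"I owe you 10 EUR. (variant {counter})".encode()
        h = truncated_hash(variant, bits)
        if h in seen:
            return seen[h], variant, counter + 1
        seen[h] = variant
        counter += 1

first, second, tried = birthday_attack(bits=24)
print(f"collision after {tried} variants (2^12 = {2 ** 12})")
print(first)
print(second)
```

With a 24-bit digest the collision typically shows up after a few thousand variants, in line with 2^(24/2) = 4096, rather than the roughly 16 million a naive estimate of 2^24 would suggest. The same search against a full 256-bit DIGEST would need on the order of 2^128 variants.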

Authentication and Hashing

Hashing is well suited to storing passwords because the transformation is one-way (infeasible to reverse in practice) and deterministic.

A deterministic function is a function that, given the same input, always produces the same output. Obviously this is a must in authentication: if the same password did not always produce the same HASH, we could not compare it with the stored one at login.

So when saving user credentials we store the username and the hashed password in the DB. When the user logs in, we hash the password sent and compare it to the HASH stored for the provided username. If the two match, we have a valid login.
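The register-and-login flow described above can be sketched like this. The `users` dictionary is a hypothetical stand-in for the DB, and plain SHA-256 is used only to keep the flow easy to follow; the rest of the article explains why a fast HASH on its own is a poor choice for real password storage.

```python
import hashlib
import hmac

# Hypothetical in-memory stand-in for the DB table: username -> password HASH.
users = {}

def hash_password(password: str) -> bytes:
    """Hash the password (plain SHA-256, for illustrating the flow only)."""
    return hashlib.sha256(password.encode("utf-8")).digest()

def register(username: str, password: str) -> None:
    """Store the username with the HASH of the password, never the password itself."""
    users[username] = hash_password(password)

def login(username: str, password: str) -> bool:
    """Hash the submitted password and compare it with the stored HASH."""
    stored = users.get(username)
    if stored is None:
        return False
    # compare_digest runs in constant time, which avoids leaking timing information.
    return hmac.compare_digest(hash_password(password), stored)

register("alice", "correct horse battery staple")
print(login("alice", "correct horse battery staple"))  # matching password
print(login("alice", "wrong password"))                # non-matching password
print(login("bob", "anything"))                        # unknown user
```

Only the first login succeeds. Determinism is what makes the comparison work: the same password always hashes to the same stored value.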

Should I use them to store passwords?

The short answer is yes, but…

Recently it has been advised to avoid plain HASH functions for storing passwords: they are designed to be fast, and that speed reduces password safety. For example, modern hardware can compute billions of SHA-256 hashes per second. Instead of a fast function, we need one that is deliberately slow at hashing passwords, to bring attackers almost to a halt.

It’s common to use password-hashing functions like bcrypt (Blowfish-crypt), which is an adaptive function: its cost factor, which sets the number of rounds, can be increased to make it slower, so it remains resistant to brute-force attacks even as computing power increases.
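bcrypt itself is not part of Python's standard library, so the sketch below swaps in PBKDF2-HMAC-SHA256 from `hashlib` to show the same adaptive idea: an iteration count plays the role of bcrypt's cost factor, and a random salt is stored next to each HASH. The default iteration count and the function names are assumptions for illustration, not recommendations from the article.

```python
import hashlib
import hmac
import os

def hash_password(password: str, iterations: int = 200_000) -> tuple:
    """Derive a slow, salted HASH; returns (salt, iterations, derived_key).

    Raising `iterations` makes every guess more expensive for an attacker.
    """
    salt = os.urandom(16)  # random per-password salt
    derived = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, iterations, derived

def verify_password(password: str, salt: bytes, iterations: int, expected: bytes) -> bool:
    """Repeat the derivation with the stored salt and cost, then compare."""
    derived = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(derived, expected)

salt, iterations, stored = hash_password("hunter2")
print(verify_password("hunter2", salt, iterations, stored))  # correct password
print(verify_password("hunter3", salt, iterations, stored))  # wrong password
```

Because the salt and the iteration count are stored with the HASH, the cost can be raised for new passwords later without breaking existing ones. With a real bcrypt library the structure would be similar, with the cost factor taking the place of `iterations`.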

Conclusions

Let’s recap what we’ve learned in this article:

The core purpose of hashing is to create a fingerprint of data to assess data integrity.
Hashing functions take arbitrary inputs and transform them into outputs of a fixed length.
Hashing alone is not sufficient to protect passwords against mass exploitation; it’s safer to add cryptographic salts.
MD5 and SHA-1 have been reported as vulnerable due to collisions. The SHA-2 family stands as a better option.
The SHA family is not ideal for storing passwords because it’s very fast and therefore vulnerable to brute-force attacks; it’s better to use functions like bcrypt.

Translated from: https://medium.com/swlh/hashing-birthday-and-passwords-254756df55b7