mysql 字符串hash,MySQL:UNIQUE文本字段使用附加的HASH字段

In my MySQL DB I have a table defined like:

CREATE TABLE `mytablex_cs` (

`id` mediumint(8) unsigned NOT NULL AUTO_INCREMENT,

`tag` varchar(6) COLLATE utf8_bin NOT NULL DEFAULT '',

`value` text COLLATE utf8_bin NOT NULL,

PRIMARY KEY (`id`),

KEY `kt` (`tag`),

KEY `kv` (`value`(200))

) ENGINE=MyISAM AUTO_INCREMENT=7 DEFAULT CHARSET=utf8 COLLATE=utf8_bin

I need to implement a UNIQUE constraint (key) on the value field.

I know that is not yet possible to define a unique index on the entire value for a blob or text field, but there is a ticket(?) open to implement such feature (see this page) where it has been suggested to create a unique key using a hash like it is already implemented for other fields.

Now I would like to use a similar approach adding to the table another field that will contain the hash and creating a unique key on this field.

I gave a look to possible ways to create this hash and, since I would like to avoid collisions (I need to insert several millions of entries), it seems that the RIPEMD-160 algorithm is the best one, even if a quick search gave me several similar solutions that use SHA256 or even SHA1 and MD5.

I totally lack of knowledge in cryptography, so what are the down sides of choosing this approach?

Another question I have is: which algorithm is currently used by MySQL to create the hash?

解决方案

Lets look at your requirements:

You need to ensure that a value field is unique. The value field is a text column and due to the nature of it there is no way to create a unique index on the value field(for now). So using a extra field which is the hash of the field value is your only real option here.

Advantages to this approach:

Easy to calculate the hash.

It is extremely rare to create a duplicate hash for two different values so your hash values are almost gauranteed to be unqiue.

Hashes are normally some numeric value(expressed as hexdecimal) that can be efficiently indexed.

The hashes wont take up a lot of space, different hashing function return different length hashes so play around with the different algorithms and test them to find one that suits your need.

Disadvantages of this approach:

Extra field to cater for during INSERTS and UPDATES i.e. there is more work to do.

If you already have data in the table and this is in production you will have to update the current data and hopefully you dont have duplicates already. Also it will take time to run the update. Thus it might be tricky to apply the change in a already working system.

Hashing functions are CPU intensive and can have a negative impact on CPU usage.

I assume you understand what a hash function does and conceptually how it works.

MySQL supports as far as I know MD5, SHA, SHA1 and SHA2 hashing functions. Most if not all of these should be sufficient for just hashing. Some functions like MD5 has some issues when used in cryptography applications i.e. when using it in PKI as a signature algorithm etc. However these issues should not be that important when you decide on using it to create a unique value as it is not really being applied in a cryptography context here.

To use the MySQL hashing functions you can try the following examples:

SELECT MD5('1234')

SELECT SHA('1234')

SELECT SHA1('1234')

SELECT SHA2('1234',224);

As with everythig new you should try all the approaches and find the one that will be most successfull in your case.

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
"No module named Crypto.Hash" 的报错意味着在Python中找不到名为 "Crypto.Hash" 的模块。为了解决这个问题,你可以按照以下方法之一进行操作: 1. 确保你已经正确安装了Crypto库。Crypto库不是Python的内置模块,所以你需要先下载安装它。可以通过运行命令 "pip install pycryptodome" 来安装。 2. 检查你的代码是否正确导入了Crypto模块中的Hash子模块。在使用Hash子模块之前,你需要使用import语句将其导入到你的代码中。确保你的导入语句是正确的,例如 "from Crypto.Hash import *" 或者 "import Crypto.Hash"。 3. 如果你已经安装了pycryptodome但仍然遇到问题,尝试修改文件夹名称。根据引用中的建议,你可以在Python安装目录下找到名为 "crypto" 的文件夹,并将其改名为 "Crypto"。 综上所述,你可以尝试按照以上方法解决"No module named Crypto.Hash"的问题。<span class="em">1</span><span class="em">2</span><span class="em">3</span> #### 引用[.reference_title] - *1* [下载python中Crypto库报错:ModuleNotFoundError: No module named ‘Crypto’的解决](https://download.csdn.net/download/weixin_38644780/12869337)[target="_blank" data-report-click={"spm":"1018.2226.3001.9630","extra":{"utm_source":"vip_chatgpt_common_search_pc_result","utm_medium":"distribute.pc_search_result.none-task-cask-2~all~insert_cask~default-1-null.142^v93^chatsearchT3_2"}}] [.reference_item style="max-width: 50%"] - *2* *3* [No module named 'Crypto' 解决方案](https://blog.csdn.net/DanielJackZ/article/details/104531615)[target="_blank" data-report-click={"spm":"1018.2226.3001.9630","extra":{"utm_source":"vip_chatgpt_common_search_pc_result","utm_medium":"distribute.pc_search_result.none-task-cask-2~all~insert_cask~default-1-null.142^v93^chatsearchT3_2"}}] [.reference_item style="max-width: 50%"] [ .reference_list ]

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值