hash function & unordered_map & map

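std::map is implemented with a red-black tree (a self-balancing binary search tree). A type can therefore be used as a map key only if it is copyable and assignable and can be ordered, that is, it overloads operator< (or a custom comparator is supplied).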

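Note that this operator< must define a strict weak ordering: two keys x and y are treated as equal (equivalent) exactly when !(x<y) && !(y<x) is true. std::map never calls operator==, so the condition for two objects being equal must also be expressed through operator<.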

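std::unordered_map, by contrast, requires the key to be copyable and assignable, to be hashable (through a std::hash specialization or a function object that overloads operator()), and to be comparable for equality (operator==), which is needed to handle hash collisions.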

When to use map versus unordered_map:
see link: http://kariddi.blogspot.hk/2012/07/c11-unorderedmap-vs-map.html

So, is the new unordered_map worth it? Well, for integer keys the G++ implementation showed pretty good performance. The hashing for integer numbers is lightning fast (probably it is skipped completely and the integer itself is used as the hash value, but I didn’t check). Using string keys g++ unordered_map showed some performance problems, at least with the examples I used. The problems were mitigated by increasing the bucket count of the map, but at the cost of an increased memory footprint. Overall, for non-integer keys, the std::map implementation in g++ 4.7.1 libstdc++ seems more robust and less dependent on how the key hash values collide than std::unordered_map. Std::map also comes with the added bonus of being ordered. Those who thought that std::map would have been completely replaced by std::unordered_map for all the usages that didn’t require the items to be ordered may have remained disappointed … at least for now.

When the key is an integer: unordered_map
When the key is a string: map
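The quoted benchmark mentions that a larger bucket count eased the problems with string keys. A minimal sketch of how that can be requested through the standard `rehash` member follows; the helper name `buckets_after_rehash` is hypothetical, and whether this actually helps depends on the keys and the standard library implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical helper: asks for at least `requested` buckets up front
// and reports how many the container actually allocated.
inline std::size_t buckets_after_rehash(std::size_t requested) {
    std::unordered_map<std::string, int> counts;
    counts.rehash(requested);  // afterwards bucket_count() >= requested
    counts["example"] = 1;
    return counts.bucket_count();
}
```

More buckets generally mean fewer keys per bucket, at the price of the extra memory the quote warns about.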

Commonly used hash algorithms for strings include the following:

Reproduced from: http://www.cse.yorku.ca/~oz/hash.html

djb2
djb2 is one of the best string hash functions i know. it has excellent distribution and speed on many different sets of keys and table sizes.

this algorithm (k=33) was first reported by dan bernstein many years ago in comp.lang.c. another version of this algorithm (now favored by bernstein) uses xor: hash(i) = hash(i - 1) * 33 ^ str[i]; the magic of number 33 (why it works better than many other constants, prime or not) has never been adequately explained.

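    unsigned long
    hash(const unsigned char *str)
    {
        unsigned long hash = 5381;  /* initial value */
        int c;

        /* extra parentheses: the assignment is intentional; loop ends at '\0' */
        while ((c = *str++))
            hash = ((hash << 5) + hash) + c; /* hash * 33 + c */

        return hash;
    }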

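Redis, the well-known in-memory database, used this hash function in its early versions; newer releases have switched to other algorithms such as SipHash.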
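To connect djb2 back to unordered_map, here is a minimal sketch that wraps it in a hash functor for std::string keys, together with the xor variant mentioned above. The names `Djb2Hash`, `Djb2XorHash` and `demo_djb2_map` are hypothetical. Unlike the C version, these loops run over the full std::string, so an embedded '\0' does not end the hash early.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// djb2 wrapped as a hash functor (illustrative name).
struct Djb2Hash {
    std::size_t operator()(const std::string& s) const {
        unsigned long hash = 5381;
        for (unsigned char c : s)
            hash = ((hash << 5) + hash) + c;  // hash * 33 + c
        return static_cast<std::size_t>(hash);
    }
};

// The xor variant: hash(i) = hash(i - 1) * 33 ^ str[i].
struct Djb2XorHash {
    std::size_t operator()(const std::string& s) const {
        unsigned long hash = 5381;
        for (unsigned char c : s)
            hash = ((hash << 5) + hash) ^ c;  // (hash * 33) ^ c
        return static_cast<std::size_t>(hash);
    }
};

inline void demo_djb2_map() {
    // The third template argument replaces std::hash<std::string>.
    std::unordered_map<std::string, int, Djb2Hash> word_count;
    ++word_count["map"];
    ++word_count["map"];
    ++word_count["unordered_map"];
    assert(word_count.size() == 2);
    assert(word_count["map"] == 2);
}
```

For a single character the arithmetic is easy to follow by hand: hashing "a" gives 5381 * 33 + 97 = 177670 with the additive version.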

sdbm
this algorithm was created for sdbm (a public-domain reimplementation of ndbm) database library. it was found to do well in scrambling bits, causing better distribution of the keys and fewer splits. it also happens to be a good general hashing function with good distribution. the actual function is hash(i) = hash(i - 1) * 65599 + str[i]; what is included below is the faster version used in gawk. [there is even a faster, duff-device version] the magic constant 65599 was picked out of thin air while experimenting with different constants, and turns out to be a prime. this is one of the algorithms used in berkeley db (see sleepycat) and elsewhere.

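    static unsigned long
    sdbm(const unsigned char *str)
    {
        unsigned long hash = 0;
        int c;

        /* extra parentheses: the assignment is intentional; loop ends at '\0' */
        while ((c = *str++))
            hash = c + (hash << 6) + (hash << 16) - hash; /* hash * 65599 + c */

        return hash;
    }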
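The shift form and the formula hash(i) = hash(i - 1) * 65599 + str[i] are the same computation, because 65599 = 65536 + 64 - 1 = (1 << 16) + (1 << 6) - 1. A minimal sketch that checks this, using the hypothetical function names `sdbm_shift` and `sdbm_mul`:

```cpp
#include <cassert>
#include <string>

// Shift-and-subtract form, as in the gawk version.
inline unsigned long sdbm_shift(const std::string& s) {
    unsigned long hash = 0;
    for (unsigned char c : s)
        hash = c + (hash << 6) + (hash << 16) - hash;
    return hash;
}

// Direct form: hash * 65599 + c.
inline unsigned long sdbm_mul(const std::string& s) {
    unsigned long hash = 0;
    for (unsigned char c : s)
        hash = hash * 65599UL + c;
    return hash;
}

inline void demo_sdbm_equivalence() {
    // Unsigned arithmetic wraps around, so both forms agree even on overflow.
    const std::string samples[] = {"", "a", "hello", "unordered_map vs map"};
    for (const std::string& s : samples)
        assert(sdbm_shift(s) == sdbm_mul(s));
}
```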

lose lose
This hash function appeared in K&R (1st ed) but at least the reader was warned: “This is not the best possible algorithm, but it has the merit of extreme simplicity.” This is an understatement; It is a terrible hashing algorithm, and it could have been much better without sacrificing its “extreme simplicity.” [see the second edition!] Many C programmers use this function without actually testing it, or checking something like Knuth’s Sorting and Searching, so it stuck. It is now found mixed with otherwise respectable code, eg. cnews. sigh.

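    unsigned long
    hash(const unsigned char *str)
    {
        unsigned int hash = 0;
        int c;

        /* extra parentheses: the assignment is intentional; loop ends at '\0' */
        while ((c = *str++))
            hash += c; /* plain sum of the bytes */

        return hash;
    }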
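One concrete reason the function is weak: it only sums the bytes, so any two strings made of the same characters in a different order collide. A minimal sketch, with the hypothetical names `lose_lose` and `djb2_for_comparison`:

```cpp
#include <cassert>
#include <string>

// Sum of the bytes, as in the "lose lose" function.
inline unsigned long lose_lose(const std::string& s) {
    unsigned long hash = 0;
    for (unsigned char c : s)
        hash += c;
    return hash;
}

// djb2, included only to contrast the behavior on the same inputs.
inline unsigned long djb2_for_comparison(const std::string& s) {
    unsigned long hash = 5381;
    for (unsigned char c : s)
        hash = ((hash << 5) + hash) + c;  // hash * 33 + c
    return hash;
}

inline void demo_anagram_collision() {
    // Same characters, different order: the sums are identical.
    assert(lose_lose("abc") == lose_lose("cba"));
    assert(lose_lose("listen") == lose_lose("silent"));
    // djb2 depends on character position, so these two differ.
    assert(djb2_for_comparison("abc") != djb2_for_comparison("cba"));
}
```

Used as the hash for an unordered_map, such collisions would put all anagrams in the same bucket.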