C++: some problems with string keys in std::map and std::unordered_map

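I've recently been learning some C++ basics and ran into problems when using strings as keys in std::map and std::unordered_map. I don't know whether I'm using them wrong or whether something else is going on; if anyone familiar with this can point out my mistake, I'd appreciate it.

At first glance it looks like the hash computation for strings goes wrong internally? But I can't bring myself to believe that. The standard library has been through many years of use and I've only just started with it, so I think it is 100% a problem with how I'm using it.

So here is how I tested it.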


First, insert the data:

using _t = std::map<const char*, int>;
...

    _t* m = new _t();

    std::string key_name = "this is key";
    m->insert(_t::value_type(key_name.c_str(), 99)); // insert key "this is key" with value 99


Then look it up:

    _t::const_iterator ret = m->find(key_name.c_str());
    if (ret != m->end()) {
        std::cout << "Found the key : " << (*ret).first << ", value : " << (*ret).second << "\n";
    } else {
        std::cout << "Not found the key : " << "this is key" << "\n";
    }

The result goes into the first branch, Found the key, so no problem there.


However, if I replace key_name.c_str() with the string literal "this is key", there is a problem:

    _t::const_iterator ret = m->find("this is key");
    if (ret != m->end()) {
        std::cout << "Found the key : " << (*ret).first << ", value : " << (*ret).second << "\n";
    } else {
        std::cout << "Not found the key : " << "this is key" << "\n";
    }

The result goes into the second branch, Not found the key, which is wrong.


So I wondered whether the terminating character was the problem, and I printed every character of both strings to compare them:

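void print_chars(const char* title, const char* chars) {
    std::cout << title << ": ";
    // Print each character with its numeric code. Checking for the terminator
    // first also keeps an empty string from being read past its end.
    while (*chars != 0) {
        std::cout << *chars << "(" << (unsigned int)(*chars) << ")" << ",";
        ++chars;
    }
    std::cout << "\\0";
    std::cout << '\n';
}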

...
    print_chars("In [main funcs], key_name.c_str() :    ", key_name.c_str());
    print_chars("In [main funcs],const char* :          ", "this is key");

/* Output:
In [main funcs], key_name.c_str() :    : t(116),h(104),i(105),s(115), (32),i(105),s(115), (32),k(107),e(101),y(121),\0
In [main funcs],const char* :          : t(116),h(104),i(105),s(115), (32),i(105),s(115), (32),k(107),e(101),y(121),\0
*/

The two outputs are exactly the same.


So is the key_name.c_str() approach always fine? It is not: once I pass key_name into a function as a parameter and call key_name.c_str() there, map->find still fails to find the key.

void printf_map_find(_t* m, std::string key_name) {
    print_chars("In [printf_map_find funs]", key_name.c_str());
    _t::const_iterator ret = m->find(key_name.c_str());
    if (ret != m->end()) {
        std::cout << "Found the key : " << (*ret).first << ", value : " << (*ret).second << "\n";
    } else {
        std::cout << "Not found the key : " << key_name.c_str() << "\n";
    }
}

printf_map_find(m, key_name);

/* Output:
In [printf_map_find funs]: t(116),h(104),i(105),s(115), (32),i(105),s(115), (32),k(107),e(101),y(121),\0
Not found the key : this is key
*/

So key_name.c_str() can no longer find the entry either, even though the printed characters are still identical to before.


Could this be a problem with the ordered std::map? Let me try the unordered std::unordered_map.

using _uo_t = std::unordered_map<const char*, int>;

void printf_unordered_map_find(_uo_t* m, std::string key_name) {
    print_chars("In [printf_unordered_map_find funs]", key_name.c_str());
    _uo_t::const_iterator ret = m->find(key_name.c_str());
    if (ret != m->end()) {
        std::cout << "Found the key : " << (*ret).first << ", value : " << (*ret).second << "\n";
    } else {
        std::cout << "Not found the key : " << key_name.c_str() << "\n";
    }
}
...

    std::cout << " === testing std:::unordered_map find ===\n";
    std::cout << "  ## using std::string.c_str() for std::unordered_map's key ##\n";
    _uo_t* m_uo = new _uo_t();
    m_uo->insert(_uo_t::value_type(key_name.c_str(), 99));

    print_chars("In [main funcs], test in unordered_map", key_name.c_str());
    _uo_t::const_iterator ret_uo = m_uo->find(key_name.c_str());
    if (ret_uo != m_uo->end()) {
        std::cout << "Found the key : " << (*ret_uo).first << ", value : " << (*ret_uo).second << "\n";
    } else {
        std::cout << "Not found the key : " << key_name.c_str() << "\n";
    }
    std::cout << "---------------------\n";
    printf_unordered_map_find(m_uo, key_name);

/* Output:
 === testing std:::unordered_map find ===
  ## using std::string.c_str() for std::unordered_map's key ##
In [main funcs], test in unordered_map: t(116),h(104),i(105),s(115), (32),i(105),s(115), (32),k(107),e(101),y(121),\0
Found the key : this is key, value : 99
---------------------
In [printf_unordered_map_find funs]: t(116),h(104),i(105),s(115), (32),i(105),s(115), (32),k(107),e(101),y(121),\0
Not found the key : this is key
*/

I tested it, and the result is the same as with std::map above.


In the end, I simply stopped using const char* as the key and used an int instead, namely the hash of the const char* string.

using _uo_hash_t = std::unordered_map<int, int>;

int getHash(const char* chars) {
    int hash = 0;
    while((*chars) != 0) {
        hash = hash * 31 + (*chars++);
    }
    return hash;
}

int getHash(std::string str) {
    return getHash(str.c_str());
}

void printf_unordered_map_find_by_hash(_uo_hash_t* m, std::string key_name) {
    std::cout << "In [printf_unordered_map_find_by_hash funcs]\n";
    
    int hash_code = getHash(key_name);
    std::cout << key_name << " ==> hash code : " << hash_code << "\n";

    _uo_hash_t::const_iterator ret = m->find(hash_code);
    if (ret != m->end()) {
        std::cout << "Found the key(hash) : " << (*ret).first << " original char* : " << key_name.c_str() << ", value : " << (*ret).second << "\n";
    } else {
        // ret == m->end() here and must not be dereferenced; print the hash that was searched for
        std::cout << "Not found key(hash) : " << hash_code << " original char* : " << key_name.c_str() << "\n";
    }
}
...
    std::cout << " === testing std:::unordered_map hash key find ===\n";
    std::cout << "  ## using hash(std::string.c_str()) for std::unordered_map's key ##\n";
    _uo_hash_t* m_uo_hash = new _uo_hash_t();
    m_uo_hash->insert(_uo_hash_t::value_type(getHash(key_name.c_str()), 99));
    _uo_hash_t::const_iterator ret_uo_hash = m_uo_hash->find(getHash(key_name.c_str()));
    if (ret_uo_hash != m_uo_hash->end()) {
        std::cout << "Found the key(hash) : " << (*ret_uo_hash).first << ", original char* : " << key_name.c_str() << ", value : " << (*ret_uo_hash).second << "\n";
    } else {
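        // ret_uo_hash == end() here and must not be dereferenced; print the hash that was searched for
        std::cout << "Not found key(hash) : " << getHash(key_name.c_str()) << ", original char* : " << key_name.c_str() << "\n";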
    }
    std::cout << "---------------------\n";
    printf_unordered_map_find_by_hash(m_uo_hash, key_name);
...

/* Output:
 === testing std:::unordered_map hash key find ===
  ## using hash(std::string.c_str()) for std::unordered_map's key ##
Found the key(hash) : -2046255573 original char* : this is key, value : 99
---------------------
In [printf_unordered_map_find_by_hash funcs]
this is key ==> hash code : -2046255573
Found the key(hash) : -2046255573 original char* : this is key, value : 99
*/

As the output shows, both lookups are Found.

Full test code

#include<iostream>
#include<map>
#include<unordered_map>
#include<utility>
#include<string>

using _t = std::map<const char*, int>;
using _uo_t = std::unordered_map<const char*, int>;
using _uo_hash_t = std::unordered_map<int, int>;

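void print_chars(const char* title, const char* chars) {
    std::cout << title << ": ";
    // Print each character with its numeric code. Checking for the terminator
    // first also keeps an empty string from being read past its end.
    while (*chars != 0) {
        std::cout << *chars << "(" << (unsigned int)(*chars) << ")" << ",";
        ++chars;
    }
    std::cout << "\\0";
    std::cout << '\n';
}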

void printf_map_find(_t* m, std::string key_name) {
    print_chars("In [printf_map_find funs]", key_name.c_str());
    _t::const_iterator ret = m->find(key_name.c_str());
    if (ret != m->end()) {
        std::cout << "Found the key : " << (*ret).first << ", value : " << (*ret).second << "\n";
    } else {
        std::cout << "Not found the key : " << key_name.c_str() << "\n";
    }
}

void printf_unordered_map_find(_uo_t* m, std::string key_name) {
    print_chars("In [printf_unordered_map_find funs]", key_name.c_str());
    _uo_t::const_iterator ret = m->find(key_name.c_str());
    if (ret != m->end()) {
        std::cout << "Found the key : " << (*ret).first << ", value : " << (*ret).second << "\n";
    } else {
        std::cout << "Not found the key : " << key_name.c_str() << "\n";
    }
}

int getHash(const char* chars) {
    int hash = 0;
    while((*chars) != 0) {
        hash = hash * 31 + (*chars++);
    }
    return hash;
}

int getHash(std::string str) {
    return getHash(str.c_str());
}

void printf_unordered_map_find_by_hash(_uo_hash_t* m, std::string key_name) {
    std::cout << "In [printf_unordered_map_find_by_hash funcs]\n";
    
    int hash_code = getHash(key_name);
    std::cout << key_name << " ==> hash code : " << hash_code << "\n";

    _uo_hash_t::const_iterator ret = m->find(hash_code);
    if (ret != m->end()) {
        std::cout << "Found the key(hash) : " << (*ret).first << " original char* : " << key_name.c_str() << ", value : " << (*ret).second << "\n";
    } else {
        // ret == m->end() here and must not be dereferenced; print the hash that was searched for
        std::cout << "Not found key(hash) : " << hash_code << " original char* : " << key_name.c_str() << "\n";
    }
}

int main() {

    _t* m = new _t();
    //
    // testing insert
    //
    std::cout << " === testing std::map insert ===\n";
    m->insert(std::pair<const char*, int>("test1", 1));
    m->insert(std::pair<const char*, int>("test2", 2));
    m->insert(_t::value_type("test3", 3));

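    // Note: a map iterator is bidirectional and does not overload operator+, so `it + i` does not work
    // (a vector iterator is random-access and does overload operator+).
    // A map iterator does implement operator++, so the map can be traversed as below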
    _t::iterator it = m->begin();
    for (; it != m->end(); it++) {
        std::cout << it->first << ":" << it->second << "\n";
    }

    //
    // testing find
    //

    //
    // Test std::map with const char* as the key: lookups fail across function calls
    //
    std::cout << " === testing std::map find ===\n";

    std::cout << "  ## using std::string.c_str() for std::map's key ##\n";
    std::string key_name = "this is key";
    m->insert(_t::value_type(key_name.c_str(), 99));

    // key_name.c_str() prints the same characters in main as it does inside printf_map_find
    print_chars("In [main funcs], key_name.c_str() :    ", key_name.c_str());
    print_chars("In [main funcs],const char* :          ", "this is key");
    // _t::const_iterator ret = m->find("this is key"); // note: a string literal is not found here either
    _t::const_iterator ret = m->find(key_name.c_str()); // but a lookup with key_name.c_str() that does not cross a function call does find it
    if (ret != m->end()) {
        std::cout << "Found the key : " << (*ret).first << ", value : " << (*ret).second << "\n";
    } else {
        std::cout << "Not found the key : " << key_name.c_str() << "\n";
    }

    std::cout << "---------------------\n";
    printf_map_find(m, key_name);

    std::cout << "\n\n";

    //
    // std::unordered_map (the unordered hash map) has the same problem; maybe I don't understand unordered_map's internals well enough
    //
    std::cout << " === testing std:::unordered_map find ===\n";
    std::cout << "  ## using std::string.c_str() for std::unordered_map's key ##\n";
    _uo_t* m_uo = new _uo_t();
    m_uo->insert(_uo_t::value_type(key_name.c_str(), 99));

    print_chars("In [main funcs], test in unordered_map", key_name.c_str());
    _uo_t::const_iterator ret_uo = m_uo->find(key_name.c_str());
    if (ret_uo != m_uo->end()) {
        std::cout << "Found the key : " << (*ret_uo).first << ", value : " << (*ret_uo).second << "\n";
    } else {
        std::cout << "Not found the key : " << key_name.c_str() << "\n";
    }
    std::cout << "---------------------\n";
    printf_unordered_map_find(m_uo, key_name);

    std::cout << "\n\n";

    //
    // Using the hash code as the key avoids the string problem above
    //
    std::cout << " === testing std:::unordered_map hash key find ===\n";
    std::cout << "  ## using hash(std::string.c_str()) for std::unordered_map's key ##\n";
    _uo_hash_t* m_uo_hash = new _uo_hash_t();
    m_uo_hash->insert(_uo_hash_t::value_type(getHash(key_name.c_str()), 99));
    _uo_hash_t::const_iterator ret_uo_hash = m_uo_hash->find(getHash(key_name.c_str()));
    if (ret_uo_hash != m_uo_hash->end()) {
        std::cout << "Found the key(hash) : " << (*ret_uo_hash).first << ", original char* : " << key_name.c_str() << ", value : " << (*ret_uo_hash).second << "\n";
    } else {
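        // ret_uo_hash == end() here and must not be dereferenced; print the hash that was searched for
        std::cout << "Not found key(hash) : " << getHash(key_name.c_str()) << ", original char* : " << key_name.c_str() << "\n";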
    }
    std::cout << "---------------------\n";
    printf_unordered_map_find_by_hash(m_uo_hash, key_name);

    return 0;
}

/* Output:
 === testing std::map insert ===
test1:1
test2:2
test3:3
 === testing std::map find ===
  ## using std::string.c_str() for std::map's key ##
In [main funcs], key_name.c_str() :    : t(116),h(104),i(105),s(115), (32),i(105),s(115), (32),k(107),e(101),y(121),\0
In [main funcs],const char* :          : t(116),h(104),i(105),s(115), (32),i(105),s(115), (32),k(107),e(101),y(121),\0
Found the key : this is key, value : 99
---------------------
In [printf_map_find funs]: t(116),h(104),i(105),s(115), (32),i(105),s(115), (32),k(107),e(101),y(121),\0
Not found the key : this is key


 === testing std:::unordered_map find ===
  ## using std::string.c_str() for std::unordered_map's key ##
In [main funcs], test in unordered_map: t(116),h(104),i(105),s(115), (32),i(105),s(115), (32),k(107),e(101),y(121),\0
Found the key : this is key, value : 99
---------------------
In [printf_unordered_map_find funs]: t(116),h(104),i(105),s(115), (32),i(105),s(115), (32),k(107),e(101),y(121),\0
Not found the key : this is key


 === testing std:::unordered_map hash key find ===
  ## using hash(std::string.c_str()) for std::unordered_map's key ##
Found the key(hash) : -2046255573 original char* : this is key, value : 99
---------------------
In [printf_unordered_map_find_by_hash funcs]
this is key ==> hash code : -2046255573
Found the key(hash) : -2046255573 original char* : this is key, value : 99
*/

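Summary

I'm not familiar enough with C++ yet, so it must be that I'm using it wrong.

When I have time I'll go read the C++ standard library code in the Windows SDK. Right now it's too painful to read: first, I'm not familiar with it; second, the style is so reader-hostile that I'm speechless. It's said to be open source, but I suspect these partially open-sourced files have been processed in some way, because the readability is really poor...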
