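I've recently been learning some C++ basics and ran into a problem when using strings as keys in std::map and std::unordered_map. I can't tell whether I'm using them wrong or something else is going on; if anyone who knows this well can point out my mistake, I'd appreciate it.
At first glance it looks as if the hash computation for strings is broken. But I can't bring myself to believe that: the standard library has been in use for many years and I've only just started with it, so I think the problem is 100% in how I'm using it.
So here is how I tested.
First, insert the data: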
using _t = std::map<const char*, int>;
...
_t* m = new _t();
std::string key_name = "this is key";
m->insert(_t::value_type(key_name.c_str(), 99)); // insert key "this is key" with value 99
Then look it up:
_t::const_iterator ret = m->find(key_name.c_str());
if (ret != m->end()) {
std::cout << "Found the key : " << (*ret).first << ", value : " << (*ret).second << "\n";
} else {
std::cout << "Not found the key : " << "this is key" << "\n";
}
The code takes the first branch, Found the key, which is correct.
Then, if I replace key_name.c_str() with the string literal "this is key", it fails:
_t::const_iterator ret = m->find("this is key");
if (ret != m->end()) {
std::cout << "Found the key : " << (*ret).first << ", value : " << (*ret).second << "\n";
} else {
std::cout << "Not found the key : " << "this is key" << "\n";
}
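The code takes the second branch, Not found the key, which is wrong.
So I suspected a problem with the terminating character and printed every character of both strings to compare them: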
void print_chars(const char* title, const char* chars) {
std::cout << title << ": ";
while(true) {
std::cout << *chars << "(" << (unsigned int)(*chars) << ")" << ",";
if (*(++chars) == 0) {
std::cout << "\\0";
break;
}
}
std::cout << '\n';
}
...
print_chars("In [main funcs], key_name.c_str() : ", key_name.c_str());
print_chars("In [main funcs],const char* : ", "this is key");
/* Output:
In [main funcs], key_name.c_str() : : t(116),h(104),i(105),s(115), (32),i(105),s(115), (32),k(107),e(101),y(121),\0
In [main funcs],const char* : : t(116),h(104),i(105),s(115), (32),i(105),s(115), (32),k(107),e(101),y(121),\0
*/
The two outputs are identical.
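The character dump only compares what the pointers point to, not the pointers themselves. With const char* as the key type, std::map orders its keys with std::less<const char*>, which compares addresses, so two pointers to identical text are still two different keys. Here is a small sketch that makes the addresses visible; the helper names same_contents, same_address and print_address are made up for this illustration:

```cpp
#include <cstring>
#include <iostream>
#include <string>

// Hypothetical helper: true when both C strings hold the same characters.
bool same_contents(const char* a, const char* b) {
    return std::strcmp(a, b) == 0;
}

// Hypothetical helper: true when both pointers refer to the same address.
bool same_address(const char* a, const char* b) {
    return a == b;
}

// Hypothetical helper: prints the address a C string lives at, then its text.
void print_address(const char* title, const char* chars) {
    // Casting to const void* makes operator<< print the address itself
    // instead of the characters it points to.
    std::cout << title << ": " << static_cast<const void*>(chars)
              << " -> \"" << chars << "\"\n";
}
```

Calling print_address on key_name.c_str() and on the literal "this is key" should show the same text at two different addresses, which would explain why the lookup with the literal fails.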
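So is key_name.c_str() always safe to use? It is not: once I pass key_name into a function as a parameter and call map->find with key_name.c_str() inside that function, the lookup fails again.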
void printf_map_find(_t* m, std::string key_name) {
print_chars("In [printf_map_find funs]", key_name.c_str());
_t::const_iterator ret = m->find(key_name.c_str());
if (ret != m->end()) {
std::cout << "Found the key : " << (*ret).first << ", value : " << (*ret).second << "\n";
} else {
std::cout << "Not found the key : " << key_name.c_str() << "\n";
}
}
printf_map_find(m, key_name);
/* Output:
In [printf_map_find funs]: t(116),h(104),i(105),s(115), (32),i(105),s(115), (32),k(107),e(101),y(121),\0
Not found the key : this is key
*/
So key_name.c_str() no longer finds the entry either, even though the printed characters are the same as before.
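A likely explanation, consistent with every result so far: printf_map_find takes std::string key_name by value, so inside the function key_name.c_str() points into a copy of the string, at a different address from the pointer stored in the map. Since std::map<const char*, int> compares addresses, the lookup misses. Two common ways around this, sketched under the assumption that keys should match by their characters: use std::string as the key type, or keep const char* and supply a comparator based on std::strcmp. CStrLess, the three aliases and found_in_copy are hypothetical names, not part of the original test:

```cpp
#include <cstring>
#include <map>
#include <string>

// Hypothetical comparator: orders C strings by their characters
// rather than by the address they are stored at.
struct CStrLess {
    bool operator()(const char* a, const char* b) const {
        return std::strcmp(a, b) < 0;
    }
};

using by_pointer_t  = std::map<const char*, int>;            // compares addresses
using by_contents_t = std::map<const char*, int, CStrLess>;  // compares characters
using by_string_t   = std::map<std::string, int>;            // owns and compares characters

// key_name is passed by value, as in printf_map_find, so key_name.c_str()
// points into a fresh copy of the caller's string.
template <typename Map>
bool found_in_copy(const Map& m, std::string key_name) {
    return m.find(key_name.c_str()) != m.end();
}
```

With the const char* variant the map still stores only the pointer, so the string it points to has to outlive the map entry. std::map<std::string, int> avoids that concern because it keeps its own copy of the key.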
Could it be a problem with the ordered std::map? Let me try the unordered std::unordered_map:
using _uo_t = std::unordered_map<const char*, int>;
void printf_unordered_map_find(_uo_t* m, std::string key_name) {
print_chars("In [printf_unordered_map_find funs]", key_name.c_str());
_uo_t::const_iterator ret = m->find(key_name.c_str());
if (ret != m->end()) {
std::cout << "Found the key : " << (*ret).first << ", value : " << (*ret).second << "\n";
} else {
std::cout << "Not found the key : " << key_name.c_str() << "\n";
}
}
...
std::cout << " === testing std:::unordered_map find ===\n";
std::cout << " ## using std::string.c_str() for std::unordered_map's key ##\n";
_uo_t* m_uo = new _uo_t();
m_uo->insert(_uo_t::value_type(key_name.c_str(), 99));
print_chars("In [main funcs], test in unordered_map", key_name.c_str());
_uo_t::const_iterator ret_uo = m_uo->find(key_name.c_str());
if (ret_uo != m_uo->end()) {
std::cout << "Found the key : " << (*ret_uo).first << ", value : " << (*ret_uo).second << "\n";
} else {
std::cout << "Not found the key : " << key_name.c_str() << "\n";
}
std::cout << "---------------------\n";
printf_unordered_map_find(m_uo, key_name);
/* Output:
=== testing std:::unordered_map find ===
## using std::string.c_str() for std::unordered_map's key ##
In [main funcs], test in unordered_map: t(116),h(104),i(105),s(115), (32),i(105),s(115), (32),k(107),e(101),y(121),\0
Found the key : this is key, value : 99
---------------------
In [printf_unordered_map_find funs]: t(116),h(104),i(105),s(115), (32),i(105),s(115), (32),k(107),e(101),y(121),\0
Not found the key : this is key
*/
I ran this and the result is the same as with std::map above.
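The same reasoning seems to apply here: for a const char* key, std::unordered_map uses std::hash<const char*> and std::equal_to<const char*>, and both work on the pointer value rather than on the characters. Below is a minimal sketch of a content-based alternative. CStrHash, CStrEqual, the aliases and uo_found_in_copy are hypothetical names, and the multiply-by-31 hash is only an illustrative choice:

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <unordered_map>

// Hypothetical hasher: hashes the characters, not the address.
// std::size_t is unsigned, so overflow simply wraps around.
struct CStrHash {
    std::size_t operator()(const char* s) const {
        std::size_t h = 0;
        for (; *s != '\0'; ++s) {
            h = h * 31 + static_cast<unsigned char>(*s);
        }
        return h;
    }
};

// Hypothetical equality: two keys are equal when their characters match.
struct CStrEqual {
    bool operator()(const char* a, const char* b) const {
        return std::strcmp(a, b) == 0;
    }
};

using uo_by_contents_t = std::unordered_map<const char*, int, CStrHash, CStrEqual>;
using uo_by_string_t   = std::unordered_map<std::string, int>;

// key_name is passed by value, as in printf_unordered_map_find.
template <typename Map>
bool uo_found_in_copy(const Map& m, std::string key_name) {
    return m.find(key_name.c_str()) != m.end();
}
```

Of the two, std::unordered_map<std::string, int> is usually the simpler choice, since the standard library already provides std::hash<std::string> and the map owns its keys.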
In the end, I simply stopped using const char* as the key and used an int instead, namely the hash of the const char* string:
using _uo_hash_t = std::unordered_map<int, int>;
int getHash(const char* chars) {
int hash = 0;
while((*chars) != 0) {
hash = hash * 31 + (*chars++);
}
return hash;
}
int getHash(std::string str) {
return getHash(str.c_str());
}
void printf_unordered_map_find_by_hash(_uo_hash_t* m, std::string key_name) {
std::cout << "In [printf_unordered_map_find_by_hash funcs]\n";
int hash_code = getHash(key_name);
std::cout << key_name << " ==> hash code : " << hash_code << "\n";
_uo_hash_t::const_iterator ret = m->find(hash_code);
if (ret != m->end()) {
std::cout << "Found the key(hash) : " << (*ret).first << " original char* : " << key_name.c_str() << ", value : " << (*ret).second << "\n";
} else {
std::cout << "Not found key(hash) : " << hash_code << " original char* : " << key_name.c_str() << "\n"; // ret == end() here, so it must not be dereferenced
}
}
...
std::cout << " === testing std:::unordered_map hash key find ===\n";
std::cout << " ## using hash(std::string.c_str()) for std::unordered_map's key ##\n";
_uo_hash_t* m_uo_hash = new _uo_hash_t();
m_uo_hash->insert(_uo_hash_t::value_type(getHash(key_name.c_str()), 99));
_uo_hash_t::const_iterator ret_uo_hash = m_uo_hash->find(getHash(key_name.c_str()));
if (ret_uo_hash != m_uo_hash->end()) {
std::cout << "Found the key(hash) : " << (*ret_uo_hash).first << ", original char* : " << key_name.c_str() << ", value : " << (*ret_uo_hash).second << "\n";
} else {
std::cout << "Not found key(hash) : " << getHash(key_name.c_str()) << ", original char* : " << key_name.c_str() << "\n"; // ret_uo_hash == end() here, so it must not be dereferenced
}
std::cout << "---------------------\n";
printf_unordered_map_find_by_hash(m_uo_hash, key_name);
...
/* Output:
=== testing std:::unordered_map hash key find ===
## using hash(std::string.c_str()) for std::unordered_map's key ##
Found the key(hash) : -2046255573 original char* : this is key, value : 99
---------------------
In [printf_unordered_map_find_by_hash funcs]
this is key ==> hash code : -2046255573
Found the key(hash) : -2046255573 original char* : this is key, value : 99
*/
As the output shows, both lookups report Found.
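One caveat with using the hash itself as the key: two different strings can produce the same hash, and then they compete for one map entry. With the multiply-by-31 scheme, "Aa" and "BB" both hash to 2112 (65*31+97 and 66*31+66). Also, getHash accumulates in a signed int. The negative value in the output suggests that it overflowed, which is undefined behaviour for signed integers in C++. The sketch below uses std::uint32_t so that wraparound is well defined; hash_map_t, get_hash_u32 and insert_by_hash are hypothetical names:

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>

using hash_map_t = std::unordered_map<std::uint32_t, int>;

// Same multiply-by-31 scheme as getHash, but on an unsigned type so that
// overflow wraps around instead of being undefined behaviour.
std::uint32_t get_hash_u32(const std::string& s) {
    std::uint32_t hash = 0;
    for (unsigned char c : s) {
        hash = hash * 31u + c;
    }
    return hash;
}

// Inserts value under the hash of key. Returns false when an entry with the
// same hash already exists, which happens for the same key or a colliding one.
bool insert_by_hash(hash_map_t& m, const std::string& key, int value) {
    return m.insert(hash_map_t::value_type(get_hash_u32(key), value)).second;
}
```

Inserting "Aa" and then "BB" through insert_by_hash leaves only one entry in the map, so a lookup for "BB" would return the value stored for "Aa". Keying the map on std::string avoids this, because the container then compares the full key after hashing.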
Full test code
#include<iostream>
#include<map>
#include<unordered_map>
#include<utility>
#include<string>
using _t = std::map<const char*, int>;
using _uo_t = std::unordered_map<const char*, int>;
using _uo_hash_t = std::unordered_map<int, int>;
void print_chars(const char* title, const char* chars) {
std::cout << title << ": ";
while(true) {
std::cout << *chars << "(" << (unsigned int)(*chars) << ")" << ",";
if (*(++chars) == 0) {
std::cout << "\\0";
break;
}
}
std::cout << '\n';
}
void printf_map_find(_t* m, std::string key_name) {
print_chars("In [printf_map_find funs]", key_name.c_str());
_t::const_iterator ret = m->find(key_name.c_str());
if (ret != m->end()) {
std::cout << "Found the key : " << (*ret).first << ", value : " << (*ret).second << "\n";
} else {
std::cout << "Not found the key : " << key_name.c_str() << "\n";
}
}
void printf_unordered_map_find(_uo_t* m, std::string key_name) {
print_chars("In [printf_unordered_map_find funs]", key_name.c_str());
_uo_t::const_iterator ret = m->find(key_name.c_str());
if (ret != m->end()) {
std::cout << "Found the key : " << (*ret).first << ", value : " << (*ret).second << "\n";
} else {
std::cout << "Not found the key : " << key_name.c_str() << "\n";
}
}
int getHash(const char* chars) {
int hash = 0;
while((*chars) != 0) {
hash = hash * 31 + (*chars++);
}
return hash;
}
int getHash(std::string str) {
return getHash(str.c_str());
}
void printf_unordered_map_find_by_hash(_uo_hash_t* m, std::string key_name) {
std::cout << "In [printf_unordered_map_find_by_hash funcs]\n";
int hash_code = getHash(key_name);
std::cout << key_name << " ==> hash code : " << hash_code << "\n";
_uo_hash_t::const_iterator ret = m->find(hash_code);
if (ret != m->end()) {
std::cout << "Found the key(hash) : " << (*ret).first << " original char* : " << key_name.c_str() << ", value : " << (*ret).second << "\n";
} else {
std::cout << "Not found key(hash) : " << hash_code << " original char* : " << key_name.c_str() << "\n"; // ret == end() here, so it must not be dereferenced
}
}
int main() {
_t* m = new _t();
//
// testing insert
//
std::cout << " === testing std::map insert ===\n";
m->insert(std::pair<const char*, int>("test1", 1));
m->insert(std::pair<const char*, int>("test2", 2));
m->insert(_t::value_type("test3", 3));
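// Note: unlike a vector iterator, a map iterator does not overload operator+, so `it + i` does not work
// (vector iterators do overload operator+)
// map iterators do implement operator++, though, so the map can be traversed like this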
_t::iterator it = m->begin();
for (; it != m->end(); it++) {
std::cout << it->first << ":" << it->second << "\n";
}
//
// testing find
//
//
// Test std::map with const char* as the key: lookups fail across function calls
//
std::cout << " === testing std::map find ===\n";
std::cout << " ## using std::string.c_str() for std::map's key ##\n";
std::string key_name = "this is key";
m->insert(_t::value_type(key_name.c_str(), 99));
// Printing key_name.c_str() in main and inside printf_map_find gives the same characters
print_chars("In [main funcs], key_name.c_str() : ", key_name.c_str());
print_chars("In [main funcs],const char* : ", "this is key");
// _t::const_iterator ret = m->find("this is key"); // note: a string literal is not found here either
_t::const_iterator ret = m->find(key_name.c_str()); // but looking up key_name.c_str() without crossing a function call does succeed
if (ret != m->end()) {
std::cout << "Found the key : " << (*ret).first << ", value : " << (*ret).second << "\n";
} else {
std::cout << "Not found the key : " << key_name.c_str() << "\n";
}
std::cout << "---------------------\n";
printf_map_find(m, key_name);
std::cout << "\n\n";
//
// std::unordered_map (an unordered hash map) shows the same problem; perhaps I don't understand its internals well enough
//
std::cout << " === testing std:::unordered_map find ===\n";
std::cout << " ## using std::string.c_str() for std::unordered_map's key ##\n";
_uo_t* m_uo = new _uo_t();
m_uo->insert(_uo_t::value_type(key_name.c_str(), 99));
print_chars("In [main funcs], test in unordered_map", key_name.c_str());
_uo_t::const_iterator ret_uo = m_uo->find(key_name.c_str());
if (ret_uo != m_uo->end()) {
std::cout << "Found the key : " << (*ret_uo).first << ", value : " << (*ret_uo).second << "\n";
} else {
std::cout << "Not found the key : " << key_name.c_str() << "\n";
}
std::cout << "---------------------\n";
printf_unordered_map_find(m_uo, key_name);
std::cout << "\n\n";
//
// Using the hash code as the key avoids the string problem above
//
std::cout << " === testing std:::unordered_map hash key find ===\n";
std::cout << " ## using hash(std::string.c_str()) for std::unordered_map's key ##\n";
_uo_hash_t* m_uo_hash = new _uo_hash_t();
m_uo_hash->insert(_uo_hash_t::value_type(getHash(key_name.c_str()), 99));
_uo_hash_t::const_iterator ret_uo_hash = m_uo_hash->find(getHash(key_name.c_str()));
if (ret_uo_hash != m_uo_hash->end()) {
std::cout << "Found the key(hash) : " << (*ret_uo_hash).first << ", original char* : " << key_name.c_str() << ", value : " << (*ret_uo_hash).second << "\n";
} else {
std::cout << "Not found key(hash) : " << getHash(key_name.c_str()) << ", original char* : " << key_name.c_str() << "\n"; // ret_uo_hash == end() here, so it must not be dereferenced
}
std::cout << "---------------------\n";
printf_unordered_map_find_by_hash(m_uo_hash, key_name);
return 0;
}
/* Output:
=== testing std::map insert ===
test1:1
test2:2
test3:3
=== testing std::map find ===
## using std::string.c_str() for std::map's key ##
In [main funcs], key_name.c_str() : : t(116),h(104),i(105),s(115), (32),i(105),s(115), (32),k(107),e(101),y(121),\0
In [main funcs],const char* : : t(116),h(104),i(105),s(115), (32),i(105),s(115), (32),k(107),e(101),y(121),\0
Found the key : this is key, value : 99
---------------------
In [printf_map_find funs]: t(116),h(104),i(105),s(115), (32),i(105),s(115), (32),k(107),e(101),y(121),\0
Not found the key : this is key
=== testing std:::unordered_map find ===
## using std::string.c_str() for std::unordered_map's key ##
In [main funcs], test in unordered_map: t(116),h(104),i(105),s(115), (32),i(105),s(115), (32),k(107),e(101),y(121),\0
Found the key : this is key, value : 99
---------------------
In [printf_unordered_map_find funs]: t(116),h(104),i(105),s(115), (32),i(105),s(115), (32),k(107),e(101),y(121),\0
Not found the key : this is key
=== testing std:::unordered_map hash key find ===
## using hash(std::string.c_str()) for std::unordered_map's key ##
Found the key(hash) : -2046255573 original char* : this is key, value : 99
---------------------
In [printf_unordered_map_find_by_hash funcs]
this is key ==> hash code : -2046255573
Found the key(hash) : -2046255573 original char* : this is key, value : 99
*/
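Summary
I'm not familiar enough with C++ yet, so this must be a mistake in how I'm using it.
When I have time, I'll read the C++ standard library source that ships with the Windows SDK. Right now it is painful to read: first, I'm not familiar with it, and second, the coding style is so unfriendly to human readers that I'm left speechless. It is said to be open source, but I suspect the files that were released have been processed in some way, because the readability really is that poor.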