[Casual post] A hand-written Hash_map ten times faster than std::map and std::unordered_map

Latest recommended article published on 2023-01-14 20:42:49

八宝咸鱼

Category: Data Structures. Tags: algorithms, c++, data structures, stl

Article link: https://blog.csdn.net/qq_42468226/article/details/117356720

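This post is mostly idle chatter, though the source code is included.
The hash_map here is an improved version of the hash table source code I published earlier.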

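The test writes 2 million entries 100 times, that is, 200 million overwrite-or-append operations.
In the first column, the first window shows the timing of the STL containers and the second window shows the timing of the hand-written Hash_map.
As can be seen, the speed differs by a factor of ten.
The window at the bottom checks whether the data was written successfully.
If even one entry failed to be written, the program would exit immediately and print a failure message.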

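This hash_map is faster than the STL containers because its hash collision rate is low: collisions are kept down by reducing each key modulo a well-chosen prime.
This approach is not guaranteed to carry over to every situation.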
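To make the collision claim concrete, here is a minimal sketch that counts how many sequential integer keys land in a bucket that is already occupied when the index is computed as `(key % modulus) % bucketCount`, the same formula HashFunction uses further down. The helper name `CountCollisions` and the vector-based counting are assumptions made for illustration; they are not part of the original source.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns how many of the keys 0..keyCount-1 fall into a
// bucket that already holds an earlier key, using the index formula
// (key % modulus) % bucketCount.
std::size_t CountCollisions(unsigned keyCount, unsigned modulus, std::size_t bucketCount)
{
    std::vector<unsigned> load(bucketCount, 0);   // number of keys seen per bucket
    std::size_t collisions = 0;
    for (unsigned key = 0; key < keyCount; ++key)
    {
        std::size_t index = (key % modulus) % bucketCount;
        if (load[index]++ != 0)                   // bucket was already occupied
            ++collisions;
    }
    return collisions;
}
```

With the settings used in this post (2097150 sequential keys, modulus 1572869), keys from 1572869 upward wrap around and share a bucket with a smaller key, so `CountCollisions(2097150, 1572869, 2097150)` reports 524281 colliding keys, and no chain grows beyond two nodes. Non-sequential keys may behave quite differently.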

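Within the size ranges covered by the prime table, the hash_map makes as few checks as possible, so its speed is close to that of a plain array.
It combines array-like write speed with the fast lookup of a hash table.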

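The largest value in the bottom-right window is 2 million because the same 2 million keys are written on every pass: only the first pass appends data, and every later pass overwrites it.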
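The append-versus-overwrite split can be checked with a small counter. The sketch below uses `std::unordered_map` rather than the author's Hash_map so that it stays self-contained; `WriteCounts` and `CountWrites` are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>

// Hypothetical result type: how many writes created a new entry and how many
// replaced the data of an existing key.
struct WriteCounts
{
    std::size_t appended = 0;
    std::size_t overwritten = 0;
};

// Writes the keys 0..keyCount-1 into a table `passes` times and classifies each write.
WriteCounts CountWrites(int keyCount, int passes)
{
    std::unordered_map<int, int> table;
    WriteCounts counts;
    for (int pass = 0; pass < passes; ++pass)
        for (int key = 0; key < keyCount; ++key)
        {
            auto result = table.insert({ key, key });   // .second is true only for a new key
            if (result.second)
                ++counts.appended;
            else
            {
                result.first->second = key;             // existing key: overwrite its data
                ++counts.overwritten;
            }
        }
    return counts;
}
```

For example, `CountWrites(1000, 10)` yields 1000 appends and 9000 overwrites: every pass after the first only overwrites.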

Improved Hash_map source code


#include <cstddef>    // size_t, NULL
#include <iostream>
#include <windows.h>  // GetTickCount


using uint = unsigned int;
using hash = uint;    // bucket index type
using BOOL = int;     // status code type (same underlying type as the Windows BOOL)

template <typename keyType, typename dataType>
class Hash_node
{
    keyType key;
    dataType data;
    Hash_node* nPoint;
public:
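    // The defaults value-initialize key and data, so the node also works for non-integer types.
    Hash_node(const keyType& key = keyType{}, const dataType& data = dataType{}, Hash_node* nPoint = nullptr) :key(key), data(data), nPoint(nPoint)
    {

    }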

    Hash_node GetObj(void) const
    {
        return *this;   // returns a copy of this node
    }

    const keyType& GetKey(void) const
    {
        return this->key;
    }

    dataType& GetData(void)
    {
        return this->data;
    }

    Hash_node*& GetPoint(void)
    {
        return this->nPoint;
    }
};

template <typename keyType, typename dataType, size_t mapSize>
class Hash_map
{
    hash hashValue;
    Hash_node<keyType, dataType>** map{ new Hash_node<keyType, dataType>*[mapSize] {nullptr} };

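    // Maps mapSize to an index into the prime table used by CalcHashModle.
    // The ranges are half-open so that exact powers of two are covered as well.
    BOOL CalcMapSize(void) const
    {
        size_t tmp = mapSize;
        if (tmp >= 32 && tmp < 64)
            return 0;
        if (tmp >= 64 && tmp < 128)
            return 1;
        if (tmp >= 128 && tmp < 256)
            return 2;
        if (tmp >= 1048576 && tmp < 2097152)
            return 3;
        return 0;   // sizes outside the listed ranges fall back to the smallest prime
    }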

    hash CalcHashModle(void) const
    {
        hash moudle{ 0 };
        switch (CalcMapSize())
        {
        case 0:
            moudle = 53;
            break;
        case 1:
            moudle = 97;
            break;
        case 2:
            moudle = 193;
            break;
        case 3:
            moudle = 1572869;
            break;
        default:
            moudle = 53;
            break;
        }
        return moudle;
    }

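    // Reduces the key with the prime modulus, then with mapSize so the index stays in range.
    // The result is also cached in hashValue for the caller.
    hash HashFunction(const keyType& key)
    {
        hash moudle = this->CalcHashModle();

        this->hashValue = ((hash)key % moudle) % mapSize;
        return this->hashValue;
    }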

public:
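    // The key and data parameters are unused; they are kept so existing call sites still compile.
    Hash_map(const keyType& key = keyType{}, const dataType& data = dataType{}) :hashValue()
    {
    }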

    ~Hash_map()
    {
        if (!map)
            return;
        this->RemoveAll();
        delete[]map;
    }

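    // Returns 1 when a new node is appended and 0 when an existing key is overwritten.
    BOOL Push(const keyType& key, const dataType& data)
    {
        this->HashFunction(key);

        // hnpp points at the link being examined: the bucket slot or a node's next pointer.
        Hash_node<keyType, dataType>** hnpp{ &this->map[hashValue] };
        while (*hnpp)
        {
            if ((*hnpp)->GetKey() == key)
            {
                (*hnpp)->GetData() = data;
                return 0;   // existing key: data overwritten
            }

            hnpp = &(*hnpp)->GetPoint();
        }

        // The loop ends only at an empty link, so the new node is attached right there.
        *hnpp = new Hash_node<keyType, dataType>{ key,data };
        return 1;   // new key: data appended
    }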

    // Returns 2 when key and data both match, 1 when only the key matches, 0 when the key is absent.
    BOOL FindItem(const keyType& key, const dataType& data)
    {
        this->HashFunction(key);

        Hash_node<keyType, dataType>** hnpp{ &this->map[hashValue] };
        while (*hnpp)
        {
            if ((*hnpp)->GetData() == data && (*hnpp)->GetKey() == key)
            {
                std::cout << "Find key & data :" << key << "|" << data << std::endl;
                return 2;
            }

            if ((*hnpp)->GetKey() == key)
            {
                std::cout << "Find key" << std::endl;
                return 1;
            }

            hnpp = &(*hnpp)->GetPoint();
        }

        std::cout << "defeated" << std::endl;
        return 0;  // key not found
    }

    void RemoveAll(void)
    {
        if (!map)
            return;

        for (size_t i = 0; i < mapSize; ++i)
        {
            auto q = map[i];
            map[i] = nullptr;   // clear the bucket so a later call cannot free the chain twice
            while (q)
            {
                auto ptr = q;
                q = q->GetPoint();
                delete ptr;
            }
        }
    }

};

int main(void)
{
    // Time 100 passes of writes over the same 2097150 keys.
    auto t = GetTickCount();
    Hash_map<int, int, 2097150>hash_map{ 15,99 };
    for (int i = 0; i < 100; ++i)
        for (int j = 0; j < 2097150; ++j)
            hash_map.Push(j, j);
    std::cout << "Hash_map:" << (GetTickCount() - t) / 1000.f << std::endl;

    // Verify that every key can be found again; stop at the first miss.
    for (int i = 0; i < 2097150; ++i)
        if (!hash_map.FindItem(i, i))
        {
            std::cout << "Write failed for key: " << i << std::endl;
            return -1;
        }

    return 0;
}

STL container test code

#include <iostream>
#include <Windows.h>
#include <unordered_map>
#include <map>

int main(void)
{
    auto t = GetTickCount();
    std::unordered_map<int, int>hash_map;
    std::map<int, int>map;
    for (int i = 0; i < 100; ++i)
        for (int j = 0; j < 2097150; ++j)
            hash_map[j] = j;
    std::cout << "STLHash_map:" << ( GetTickCount() - t ) / 1000.f << std::endl;
    t = GetTickCount();
    for (int i = 0; i < 100; ++i)
        for (int j = 0; j < 2097150; ++j)
            map[j] = j;
    std::cout << "STLMap:" << ( GetTickCount() - t) / 1000.f << std::endl;

    return 0;
}

Both programs perform the same number of operations and write the same data.
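Because GetTickCount is Windows-only and fairly coarse, the same measurement can also be sketched portably with `std::chrono::steady_clock`. The function template `TimeWrites` below is a hypothetical helper, not part of the original test code; it assumes the container supports `table[key] = value`, as `std::map` and `std::unordered_map` do.

```cpp
#include <cassert>
#include <chrono>
#include <map>
#include <unordered_map>

// Hypothetical helper: writes the keys 0..keyCount-1 into `table` for the given
// number of passes and returns the elapsed wall-clock time in seconds.
template <typename Table>
double TimeWrites(Table& table, int keyCount, int passes)
{
    auto start = std::chrono::steady_clock::now();
    for (int pass = 0; pass < passes; ++pass)
        for (int key = 0; key < keyCount; ++key)
            table[key] = key;   // appends on the first pass, overwrites afterwards
    std::chrono::duration<double> elapsed = std::chrono::steady_clock::now() - start;
    return elapsed.count();
}
```

Calling `TimeWrites(table, 2097150, 100)` on a `std::unordered_map<int, int>` and then on a `std::map<int, int>` reproduces the workload of the listing above. The numbers depend heavily on compiler, optimization level, and machine, so they are only comparable when measured in the same build.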

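Besides using a well-chosen prime, which reduces hash collisions and spreads the keys more widely, the improved hash_map is also faster because its write path is built on a pointer to a pointer, which saves at least two further checks.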
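The pointer-to-pointer idea can be shown on a single chain. In this minimal sketch, `link` always points at the pointer that would have to change for an insertion, whether that is the chain head or a node's `next` field, so the empty-chain case and the append-at-tail case need no separate branches. The names `Node`, `InsertOrAssign`, `Length`, and `FreeList` are assumptions made for illustration and simplify the author's templated Hash_node to `int` keys and data.

```cpp
#include <cassert>
#include <cstddef>

// Simplified stand-in for Hash_node with int key and data.
struct Node
{
    int key;
    int data;
    Node* next;
};

// Returns true when a new node was appended, false when an existing key was overwritten.
bool InsertOrAssign(Node*& head, int key, int data)
{
    Node** link = &head;                 // the pointer we may need to rewrite
    while (*link)
    {
        if ((*link)->key == key)
        {
            (*link)->data = data;        // existing key: overwrite in place
            return false;
        }
        link = &(*link)->next;           // advance to the next link
    }
    *link = new Node{ key, data, nullptr };   // empty link found: attach the new node
    return true;
}

// Counts the nodes in the chain.
std::size_t Length(const Node* head)
{
    std::size_t n = 0;
    for (; head; head = head->next)
        ++n;
    return n;
}

// Frees every node and resets the head to nullptr.
void FreeList(Node*& head)
{
    while (head)
    {
        Node* next = head->next;
        delete head;
        head = next;
    }
}
```

A single-pointer version would typically need one check for an empty chain and another to remember the previous node; here both disappear because the loop walks the links rather than the nodes.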

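The source code is not guaranteed to suit every project.
Please adapt it to your own needs.
By default the code is generic (templated).
Template parameter 1: key type; template parameter 2: data type; template parameter 3: mapSize.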

To keep it efficient, extend the CalcMapSize and CalcHashModle functions yourself with suitable primes for the sizes you need.
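One possible way to extend the two functions is to replace the range checks and the switch with a single table lookup. The sketch below is an assumption about how that could look, not the author's implementation: `kPrimes` and `PickModulus` are hypothetical names, and the table values come from a commonly circulated list of hash-table primes (it contains the four values used in the source above), so verify them before relying on them.

```cpp
#include <cassert>
#include <cstddef>

// Assumed prime table, each entry roughly twice the previous one.
// 53, 97, 193 and 1572869 are the values used in the source above.
constexpr unsigned kPrimes[] = {
    53, 97, 193, 389, 769, 1543, 3079, 6151, 12289, 24593,
    49157, 98317, 196613, 393241, 786433, 1572869
};

// Hypothetical replacement for CalcMapSize + CalcHashModle: returns the largest
// table entry that does not exceed mapSize, or the smallest entry when mapSize
// is below every entry.
unsigned PickModulus(std::size_t mapSize)
{
    unsigned chosen = kPrimes[0];
    for (unsigned prime : kPrimes)
        if (prime <= mapSize)
            chosen = prime;     // the table is ascending, so the last match is the largest
    return chosen;
}
```

For the sizes covered by the original ranges this picks the same values, for example `PickModulus(2097150)` gives 1572869, while sizes in between (such as 1000, which gets 769) no longer fall back to 53.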
