实现哈希桶（空间利用率较高的哈希表）

最新推荐文章于 2023-05-09 12:56:25 发布

weixin_34302561

最新推荐文章于 2023-05-09 12:56:25 发布

阅读量470

点赞数

文章标签：数据结构与算法

原文链接：http://blog.51cto.com/helloleex/1771542

版权

前面几篇博客已经写过了哈希表的闭散列法，也写过哈希表的应用，在这里就不赘述。今天我们要实现的是一个哈希桶。什么哈希桶呢？

哈希桶：哈希桶就是盛放不同key链表的容器（即是哈希表），在这里我们可以把每个key的位置看作是一个孔，孔里放了一个链表
相信大家可以看出来，使用一个数组来存放记录方法的哈希冲突太多，基于载荷因子的束缚，空间利用率不高，在需要节省空间的情况下，我们可以用哈希桶来处理哈希冲突。
哈希桶是使用一个顺序表来存放具有相同哈希值的key的链表的头节点，利用这个头节点可以找到其它的key。

#pragma once
#include<iostream>
#include<string>
#include<vector>

using namespace std;

template<class T>
struct DefaultFunc
{
	size_t operator()(const T& data)
	{
		return (size_t)data;
	}
};

struct StringFunc
{
	size_t operator()(const string& str)
	{
		size_t sum = 0;
		for (size_t i = 0; i < str.size(); ++i)
		{
			sum += str[i];
		}
		return sum;
	}
};

template<class K, class V>
struct HashBucketNode
{
	K _key;
	V _value;
	HashBucketNode<K, V>* _next;
	HashBucketNode(const K& key, const V& value)
		:_key(key)
		, _value(value)
		, _next(NULL)
	{}
};

template<class K, class V, class FuncModle = DefaultFunc<K>>
class HashBucket
{
	typedef HashBucketNode<K, V> Node;
public:
	HashBucket();
	//~HashBucket();
	HashBucket(const HashBucket<K, V, FuncModle>& h);
	HashBucket<K, V, FuncModle>& operator=(HashBucket<K, V, FuncModle> h);
	bool Insert(const K& key, const V& value);
	Node* Find(const K& key);
	bool Remove(const K& key);
	//bool Alter(const K& key);//用Find实现
	void Print();
protected:
	void _CheckExpand();//检查是否需要扩容
	size_t _GetNextPrime(size_t size);//从素数表中得到比当前素数大的素数
	size_t _HashFunc(const K& key,size_t mod);//哈希函数
protected:
	vector<Node*> _table;
	size_t _size;//记录的个数
};

得到哈希桶的结构以后，我们来实现几个比较重要的函数：

（一）bool Insert(const K& key, const V& value)

插入函数是最难实现的，它涉及到是否需要扩容。为插入函数我们写了两个内部函数void _CheckExpand(); 和 size_t _GetNextPrime(size_t size);

template<class K, class V, class FuncModle>
bool HashBucket<K, V, FuncModle>::Insert(const K& key, const V& value)
{
	_CheckExpand();
	//使用头插法插入
	size_t index = _HashFunc(key,_table.size());
	Node* tmp=new Node(key, value);//一定要用new出结点
	if (_table[index] == NULL)
	{
		_table[index] = tmp;
	}
	else
	{
		//不为NULL则使用头插法插入新结点
		tmp->_next = _table[index];
		_table[index] = tmp;
	}
	_size++;
	return true;
}
template<class K, class V, class FuncModle>
size_t  HashBucket<K, V, FuncModle>::_GetNextPrime(size_t size)
{
	const int _PrimeSize = 28;
	static const unsigned long _PrimeList[_PrimeSize] =
	{
		53ul, 97ul, 193ul, 389ul, 769ul,
		1543ul, 3079ul, 6151ul, 12289ul, 24593ul,
		49157ul, 98317ul, 196613ul, 393241ul, 786433ul,
		1572869ul, 3145739ul, 6291469ul, 12582917ul, 25165843ul,
		50331653ul, 100663319ul, 201326611ul, 402653189ul, 805306457ul,
		1610612741ul, 3221225473ul, 4294967291ul
	};
	for (int i = 0; i < _PrimeSize; ++i)
	{
		if (_PrimeList[i] > size)
			return _PrimeList[i];
	}
	assert(false);
	return 4294967291ul;
}
template<class K, class V, class FuncModle>
void HashBucket<K, V, FuncModle>::_CheckExpand()
{
	if (_size == _table.size())
	{
		size_t newsize = _GetNextPrime(_size);//从素数表中取出下一个素数
		if (newsize == _size)
			return;
		vector<Node*> newtable;
		newtable.resize(newsize);
		for (size_t i = 0; i < _size; ++i)
		{
			size_t index = _HashFunc(_table[i]->_key,newsize);
			if (_table[i])
			{
				Node* head = _table[i];//保存头节点
				newtable[index] = head;//摘下vector的第一个指针为新表的头节点
				_table[i] = _table[i]->_next;
				while (_table[i])//依次摘结点
				{
					Node* tmp = _table[i];
					_table[i] = _table[i]->_next;
					head->_next = tmp;
					head = head->_next;
				}
			}
			else
				newtable[index] = NULL;
			_table[i] = NULL;
		}
		swap(_table, newtable);
	}
}

在扩容的函数中我们使用了素数表，有大师证明Mod素数表中的素数可以减少哈希冲突，其实我也感觉很奇妙。

在调用哈希函数HashFunc的时候一定要注意，我们用key取模一定模的是vector当前的容量。

在插入的时候使用头插法是很高效的！！！

（二）Node* Find(const K& key);

查找函数返回一个结点的指针方便我们在外部更改数据。

emplate<class K, class V, class FuncModle>
HashBucketNode<K,V>* HashBucket<K, V, FuncModle>::Find(const K& key)
{
	size_t index = _HashFunc(key, _table.size());
	while (_table[index])
	{
		if (_table[index]->_key == key)
		{
			return _table[index];
		}
		_table[index] = _table[index]->_next;
	}
	return NULL;
}

（三）bool Remove(const K& key);

我们在写删除节点函数时最好别调用Find函数去查找要删除的结点，如果要删除的结点是头节点或者尾节点的话就无法修改要删除指针的前一个指针的_next域；

template<class K, class V, class FuncModle>
bool HashBucket<K, V, FuncModle>::Remove(const K& key)
{
	//不能用find找到，然后删。
	size_t index = _HashFunc(key,_table.size());
	if (_table[index] == NULL)
		return false;
	Node* cur = _table[index];
	if (cur->_key==key)
	{
		Node* del = cur;
		_table[index] = cur->_next;
		delete del;
		_size--;
		return true;
	}
	else
	{
		Node* prev = NULL;
		while (cur)
		{
			if (cur->_key == key)
			{
				prev->_next = cur->_next;
				delete cur;
				_size--;
				return true;
			}
			prev = cur;
			cur = cur->_next;
		}
		return false;
	}
}

其他的函数比较的简单，在这里我就不写出来了；

转载于:https://blog.51cto.com/helloleex/1771542

weixin_34302561

关注

0
点赞
踩
1

收藏

觉得还不错? 一键收藏
0
评论
实现哈希桶（空间利用率较高的哈希表）

前面几篇博客已经写过了哈希表的闭散列法，也写过哈希表的应用，在这里就不赘述。今天我们要实现的是一个哈希桶。什么哈希桶呢？哈希桶：哈希桶就是盛放不同key链表的容器（即是哈希表），在这里我们可以把每个key的位置看作是一个孔，孔里放了一个链表相信大家可以看出来，使用一个数组来存放记录方法的哈希冲突太多，基于载荷因子的束缚，空间利用率不高，在需要节省空间的情况下，我们可以用哈...
复制链接

扫一扫