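A hash collision is a common problem in hash tables. It occurs when the hash function maps two different keys to the same hash value, so that both keys compete for the same position in the table. This article covers four main ways of handling collisions: separate chaining, open addressing, rehashing, and coalesced chaining.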
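As a minimal illustration of the definition above, the sketch below applies a modulo hash (the kind used throughout this article) to a hypothetical 7-bucket table. The helper names bucketOf and collides and the sample keys are assumptions chosen for illustration only.

```cpp
#include <cassert>

// Hypothetical helper: maps a non-negative key to a bucket index by modulo.
int bucketOf(int key, int bucketCount)
{
    assert(key >= 0 && bucketCount > 0);
    return key % bucketCount;
}

// Two distinct keys collide when they map to the same bucket.
bool collides(int a, int b, int bucketCount)
{
    return a != b && bucketOf(a, bucketCount) == bucketOf(b, bucketCount);
}
```

With 7 buckets, the keys 10 and 17 both map to bucket 3, so collides(10, 17, 7) is true, while 10 and 20 map to buckets 3 and 6 and do not collide.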
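Four ways to handle hash collisions
1. Separate Chaining
Separate chaining is one of the most common ways to handle hash collisions. Each bucket of the hash table maintains a linked list, and every element that hashes to a bucket is stored in that bucket's list.
Advantages
- Simple to implement; insertion and deletion are straightforward.
- When the load factor of the table is low, the chains are usually short, so lookups are fast.
Code example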
#include <cstddef>
#include <iostream>
#include <list>
#include <vector>

// Hash table of int keys that resolves collisions with separate chaining.
class HashTable
{
public:
    explicit HashTable(int size) : table(size) {}

    // Append the key to the chain of its bucket.
    void insert(int key)
    {
        int index = hashFunction(key);
        table[index].push_back(key);
    }

    // Walk the bucket's chain looking for the key.
    bool search(int key) const
    {
        int index = hashFunction(key);
        for (int k : table[index])
        {
            if (k == key)
            {
                return true;
            }
        }
        return false;
    }

    // Erase every occurrence of the key from its bucket's chain.
    void remove(int key)
    {
        int index = hashFunction(key);
        table[index].remove(key);
    }

    // Print one line per bucket, followed by the keys in its chain.
    void display() const
    {
        for (std::size_t i = 0; i < table.size(); i++)
        {
            std::cout << i;
            for (int k : table[i])
            {
                std::cout << " --> " << k;
            }
            std::cout << std::endl;
        }
    }

private:
    std::vector<std::list<int>> table;

    // Simple modulo hash; keys are assumed to be non-negative.
    int hashFunction(int key) const
    {
        return key % table.size();
    }
};

int main()
{
    HashTable ht(7);
    ht.insert(10);
    ht.insert(20);
    ht.insert(15);
    ht.insert(7);
    std::cout << "Hash Table contents:" << std::endl;
    ht.display();
    std::cout << "Search for 10: " << (ht.search(10) ? "Found" : "Not Found") << std::endl;
    std::cout << "Search for 5: " << (ht.search(5) ? "Found" : "Not Found") << std::endl;
    ht.remove(20);
    std::cout << "Hash Table contents after removing 20:" << std::endl;
    ht.display();
    return 0;
}
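Code explanation
- The HashTable class holds a vector in which each element is a linked list; the lists absorb collisions.
- insert computes the key's hash and appends the key to the corresponding list.
- search looks for the key in the corresponding list.
- remove deletes the key from the corresponding list.
- display prints the contents of the hash table.
- hashFunction is a simple modulo operation that computes the hash value.
Output: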
Hash Table contents:
0 --> 7
1 --> 15
2
3 --> 10
4
5
6 --> 20
Search for 10: Found
Search for 5: Not Found
Hash Table contents after removing 20:
0 --> 7
1 --> 15
2
3 --> 10
4
5
6
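The keys used in the example above (10, 20, 15, 7) all fall into different buckets of a 7-bucket table, so no chain grows beyond one element. The sketch below uses keys chosen to collide and also computes the load factor mentioned earlier. buildChains and loadFactor are hypothetical helper names, not part of the HashTable class, and keys are assumed to be non-negative.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <vector>

// Distribute keys into buckets with a modulo hash, chaining the collisions.
std::vector<std::list<int>> buildChains(const std::vector<int>& keys, int bucketCount)
{
    assert(bucketCount > 0);
    std::vector<std::list<int>> buckets(bucketCount);
    for (int key : keys)
    {
        buckets[key % bucketCount].push_back(key); // colliding keys share one list
    }
    return buckets;
}

// Load factor = number of stored keys / number of buckets.
double loadFactor(const std::vector<std::list<int>>& buckets)
{
    std::size_t count = 0;
    for (const auto& chain : buckets)
    {
        count += chain.size();
    }
    return static_cast<double>(count) / buckets.size();
}
```

Calling buildChains({10, 17, 24, 20}, 7) puts 10, 17, and 24 into the chain of bucket 3 and 20 into bucket 6; the load factor is 4/7. A lookup for 24 then has to walk three list nodes, which is why long chains hurt performance.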
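2. Open Addressing
Open addressing is another way to handle hash collisions. All keys are stored in the table array itself; when a collision occurs, a probe sequence (for example linear probing, quadratic probing, or double hashing) is followed until a free slot is found.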
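The three probing strategies differ only in how far each attempt moves from the home slot. The following sketch lists the slots each strategy would examine. The enum, the function name, and the choice of 1 + key % (size - 1) as the second hash are assumptions made for illustration; keys are assumed to be non-negative.

```cpp
#include <cassert>
#include <vector>

// Probe strategies named in the text.
enum class Probe { Linear, Quadratic, DoubleHash };

// Returns the first `count` slots examined for `key` in a table of `size` slots.
// A prime `size` lets the double-hashing sequence visit every slot.
std::vector<int> probeSequence(int key, int size, int count, Probe strategy)
{
    assert(key >= 0 && size > 1 && count >= 0);
    const int home = key % size;
    const int step = 1 + key % (size - 1); // second hash, always in [1, size - 1]
    std::vector<int> slots;
    for (int i = 0; i < count; ++i)
    {
        int offset = 0;
        switch (strategy)
        {
        case Probe::Linear:
            offset = i; // home, home + 1, home + 2, ...
            break;
        case Probe::Quadratic:
            offset = i * i; // home, home + 1, home + 4, ...
            break;
        case Probe::DoubleHash:
            offset = i * step; // stride depends on the key
            break;
        }
        slots.push_back((home + offset) % size);
    }
    return slots;
}
```

For key 10 in a 7-slot table, the first four probes are 3, 4, 5, 6 (linear), 3, 4, 0, 5 (quadratic), and 3, 1, 6, 4 (double hashing). Key 17 shares the home slot 3, but under double hashing its sequence is 3, 2, 1, 0, so the two keys part ways after the first probe.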
Advantages
- No extra storage is needed for linked lists.
- When the load factor is low, lookups are fast.
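To put rough numbers on "low load factor", the sketch below evaluates Knuth's classical approximations for linear probing, which assume uniformly distributed hashes: about (1 + 1/(1 - a)) / 2 probes for a successful search and (1 + 1/(1 - a)^2) / 2 for an unsuccessful one, where a is the load factor. The function names are illustrative, and the results are estimates rather than measurements of the code in this article.

```cpp
#include <cassert>

// Approximate expected probes for a successful search with linear probing.
double expectedProbesHit(double loadFactor)
{
    assert(loadFactor >= 0.0 && loadFactor < 1.0);
    return 0.5 * (1.0 + 1.0 / (1.0 - loadFactor));
}

// Approximate expected probes for an unsuccessful search (or an insertion).
double expectedProbesMiss(double loadFactor)
{
    assert(loadFactor >= 0.0 && loadFactor < 1.0);
    const double freeFraction = 1.0 - loadFactor;
    return 0.5 * (1.0 + 1.0 / (freeFraction * freeFraction));
}
```

At a load factor of 0.5 the estimates are about 1.5 and 2.5 probes; at 0.9 they rise to about 5.5 and 50.5, which is why open-addressing tables are usually kept well below full.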
Code example
#include <cstddef>
#include <iostream>
#include <vector>

// Hash table of non-negative int keys using open addressing with linear probing.
// The value -1 marks an empty slot, so -1 itself cannot be stored as a key.
class HashTable
{
public:
    explicit HashTable(int size) : table(size, -1) {}

    void insert(int key)
    {
        int index = hashFunction(key);
        int originalIndex = index;
        while (table[index] != -1)
        {
            index = (index + 1) % table.size(); // try the next slot, wrapping around
            if (index == originalIndex)
            {
                std::cout << "Hash Table is full, cannot insert key " << key << std::endl;
                return;
            }
        }
        table[index] = key;
    }

    bool search(int key) const
    {
        int index = hashFunction(key);
        int originalIndex = index;
        while (table[index] != -1) // an empty slot ends the probe sequence
        {
            if (table[index] == key)
            {
                return true;
            }
            index = (index + 1) % table.size();
            if (index == originalIndex)
            {
                break;
            }
        }
        return false;
    }

    void remove(int key)
    {
        int index = hashFunction(key);
        int originalIndex = index;
        while (table[index] != -1)
        {
            if (table[index] == key)
            {
                table[index] = -1;
                // Re-insert the keys that follow in the same cluster; otherwise the
                // new gap would stop search() before it reaches them.
                index = (index + 1) % table.size();
                while (table[index] != -1)
                {
                    int moved = table[index];
                    table[index] = -1;
                    insert(moved);
                    index = (index + 1) % table.size();
                }
                return;
            }
            index = (index + 1) % table.size();
            if (index == originalIndex)
            {
                break;
            }
        }
        std::cout << "Key " << key << " not found" << std::endl;
    }

    void display() const
    {
        for (std::size_t i = 0; i < table.size(); i++)
        {
            std::cout << i << ": " << table[i] << std::endl;
        }
    }

private:
    std::vector<int> table;

    // Simple modulo hash; keys are assumed to be non-negative.
    int hashFunction(int key) const
    {
        return key % table.size();
    }
};

int main()
{
    HashTable ht(7);
    ht.insert(10);
    ht.insert(20);
    ht.insert(15);
    ht.insert(7);
    std::cout << "Hash Table contents:" << std::endl;
    ht.display();
    std::cout << "Search for 10: " << (ht.search(10) ? "Found" : "Not Found") << std::endl;
    std::cout << "Search for 5: " << (ht.search(5) ? "Found" : "Not Found") << std::endl;
    ht.remove(20);
    std::cout << "Hash Table contents after removing 20:" << std::endl;
    ht.display();
    return 0;
}
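Code explanation
- The HashTable class holds a single vector that stores the keys; -1 marks an empty slot.
- insert computes the key's hash and uses linear probing to find the next free slot.
- search uses linear probing to look for the key.
- remove uses linear probing to find the key and marks its slot as empty (-1).
- display prints the contents of the hash table.
- hashFunction is a simple modulo operation that computes the hash value.
Output: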
Hash Table contents:
0: 7
1: 15
2: -1
3: 10
4: -1
5: -1
6: 20
Search for 10: Found
Search for 5: Not Found
Hash Table contents after removing 20:
0: 7
1: 15
2: -1
3: 10
4: -1
5: -1
6: -1
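3. Rehashing
In rehashing, a collision is resolved by applying a different hash function to compute a new position, and this is repeated until an empty slot is found. The variant shown here, in which a second hash function sets the probe step, is usually called double hashing in English-language texts, where "rehashing" more often means rebuilding the table at a larger size. It reduces clustering and so improves the performance of the hash table.
Advantages
- Effectively reduces clustering, because keys that collide at one slot follow different probe sequences.
- Copes better with higher load factors than linear probing, although lookups still slow down as the table approaches full.
Code example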
#include <iostream>
#include <vector>

// Hash table of non-negative int keys that resolves collisions with a second hash
// function (double hashing). The value -1 marks an empty slot.
class HashTable
{
public:
    explicit HashTable(int size) : table(size, -1), size(size) {}

    void insert(int key)
    {
        // Try up to `size` probe positions; i = 0 is the home slot.
        for (int i = 0; i < size; i++)
        {
            int index = hashFunction(key, i);
            if (table[index] == -1)
            {
                table[index] = key;
                return;
            }
        }
        std::cout << "Hash Table is full, cannot insert key " << key << std::endl;
    }

    bool search(int key) const
    {
        for (int i = 0; i < size; i++)
        {
            int index = hashFunction(key, i);
            if (table[index] == -1)
            {
                return false; // an empty slot ends the probe sequence
            }
            if (table[index] == key)
            {
                return true;
            }
        }
        return false;
    }

    // Caveat: resetting a slot to -1 can cut the probe sequence of another key that
    // was stored past this slot; a separate "deleted" marker would avoid that.
    void remove(int key)
    {
        for (int i = 0; i < size; i++)
        {
            int index = hashFunction(key, i);
            if (table[index] == -1)
            {
                break;
            }
            if (table[index] == key)
            {
                table[index] = -1;
                return;
            }
        }
        std::cout << "Key " << key << " not found" << std::endl;
    }

    void display() const
    {
        for (int i = 0; i < size; i++)
        {
            std::cout << i << ": " << table[i] << std::endl;
        }
    }

private:
    std::vector<int> table;
    int size;

    // i-th probe position: the home slot key % size plus i steps of the second hash
    // 1 + key % (size - 1). A prime `size` lets the sequence reach every slot.
    int hashFunction(int key, int i) const
    {
        return (key % size + i * (1 + key % (size - 1))) % size;
    }
};

int main()
{
    HashTable ht(7);
    ht.insert(10);
    ht.insert(20);
    ht.insert(15);
    ht.insert(7);
    std::cout << "Hash Table contents:" << std::endl;
    ht.display();
    std::cout << "Search for 10: " << (ht.search(10) ? "Found" : "Not Found") << std::endl;
    std::cout << "Search for 5: " << (ht.search(5) ? "Found" : "Not Found") << std::endl;
    ht.remove(20);
    std::cout << "Hash Table contents after removing 20:" << std::endl;
    ht.display();
    return 0;
}
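Code explanation
- The HashTable class holds a single vector that stores the keys; -1 marks an empty slot.
- insert computes the key's hash and, on a collision, rehashes to find the next free slot.
- search follows the same sequence of rehashed positions to look for the key.
- remove follows the same sequence to find the key and marks its slot as empty (-1). An emptied slot can end the probe sequence of another key that was stored past it, so implementations that rely on deletion usually write a separate "deleted" marker (a tombstone) instead.
- display prints the contents of the hash table.
- hashFunction computes the i-th probe position: the home slot key % size plus i steps of the second hash 1 + key % (size - 1).
Output: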
Hash Table contents:
0: 7
1: 15
2: -1
3: 10
4: -1
5: -1
6: 20
Search for 10: Found
Search for 5: Not Found
Hash Table contents after removing 20:
0: 7
1: 15
2: -1
3: 10
4: -1
5: -1
6: -1
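The remove method above resets a slot to -1. With colliding keys this can hide a key stored later on the same probe sequence, because search stops at the first empty slot. A common remedy is a tombstone: a distinct marker that means "deleted, keep probing". The sketch below is one possible way to do this, assuming non-negative keys and a prime table size; TombstoneTable, kEmpty, and kDeleted are illustrative names, not part of the article's class.

```cpp
#include <cassert>
#include <vector>

constexpr int kEmpty = -1;   // slot has never held a key
constexpr int kDeleted = -2; // tombstone: a key was removed here

class TombstoneTable
{
public:
    explicit TombstoneTable(int size) : table(size, kEmpty), size(size)
    {
        assert(size > 1);
    }

    bool insert(int key)
    {
        if (search(key))
        {
            return true; // already stored
        }
        for (int i = 0; i < size; ++i)
        {
            int index = probe(key, i);
            if (table[index] == kEmpty || table[index] == kDeleted)
            {
                table[index] = key; // tombstones may be reused
                return true;
            }
        }
        return false; // no free slot on this key's probe sequence
    }

    bool search(int key) const
    {
        for (int i = 0; i < size; ++i)
        {
            int index = probe(key, i);
            if (table[index] == kEmpty)
            {
                return false; // only a never-used slot ends the search
            }
            if (table[index] == key)
            {
                return true;
            }
        }
        return false;
    }

    bool remove(int key)
    {
        for (int i = 0; i < size; ++i)
        {
            int index = probe(key, i);
            if (table[index] == kEmpty)
            {
                return false;
            }
            if (table[index] == key)
            {
                table[index] = kDeleted; // keep the probe sequence intact
                return true;
            }
        }
        return false;
    }

private:
    std::vector<int> table;
    int size;

    // Same probe formula as hashFunction(key, i) in the example above.
    int probe(int key, int i) const
    {
        return (key % size + i * (1 + key % (size - 1))) % size;
    }
};
```

In a 7-slot table, 10 and 17 share the home slot 3, so 17 ends up on its second probe position. After remove(10), search(17) still succeeds because the tombstone in slot 3 does not stop the probe. The trade-off is that tombstones accumulate, so long-lived tables are typically rebuilt from time to time.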
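4. Coalesced Chaining
Coalesced chaining (also known as coalesced hashing) combines features of separate chaining and open addressing. The chains are stored inside the hash table's own array: a colliding key is placed in a free slot of the array and linked to the end of its chain by index.
Advantages
- Fewer memory allocations, since no list node is allocated per element, which can improve performance.
- Combines the strengths of separate chaining and open addressing.
Code example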
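#include <iostream>
#include <vector>

// Hash table of non-negative int keys using coalesced chaining.
// table[i] holds a key or -1 (empty); next[i] is the index of the next slot in the
// same chain, or -1 at the end of a chain.
class HashTable
{
public:
    explicit HashTable(int size) : table(size, -1), next(size, -1), size(size) {}

    void insert(int key)
    {
        int index = hashFunction(key);
        if (table[index] == -1)
        {
            table[index] = key; // home slot is free
        }
        else
        {
            // Walk to the end of the chain that passes through the home slot.
            int current = index;
            while (next[current] != -1)
            {
                current = next[current];
            }
            int newIndex = findNextFreeSlot();
            if (newIndex != -1)
            {
                table[newIndex] = key;
                next[current] = newIndex;
            }
            else
            {
                std::cout << "Hash Table is full, cannot insert key " << key << std::endl;
            }
        }
    }

    bool search(int key) const
    {
        int index = hashFunction(key);
        while (index != -1)
        {
            if (table[index] == key)
            {
                return true;
            }
            index = next[index];
        }
        return false;
    }

    void remove(int key)
    {
        int index = hashFunction(key);
        int prev = -1;
        while (index != -1 && table[index] != key)
        {
            prev = index;
            index = next[index];
        }
        if (index == -1)
        {
            std::cout << "Key " << key << " not found" << std::endl;
            return;
        }
        // Cut the chain in front of the removed slot.
        if (prev != -1)
        {
            next[prev] = -1;
        }
        // Clear the removed slot and every slot after it, remembering the keys that
        // followed. Simply unlinking one slot could strand keys whose chain started
        // at (or passed through) that slot.
        std::vector<int> followers;
        int current = next[index];
        table[index] = -1;
        next[index] = -1;
        while (current != -1)
        {
            int following = next[current];
            followers.push_back(table[current]);
            table[current] = -1;
            next[current] = -1;
            current = following;
        }
        // Re-insert the followers so each is reachable from its own home slot again.
        for (int k : followers)
        {
            insert(k);
        }
    }

    void display() const
    {
        for (int i = 0; i < size; i++)
        {
            std::cout << i << ": " << table[i] << " (next: " << next[i] << ")" << std::endl;
        }
    }

private:
    std::vector<int> table;
    std::vector<int> next;
    int size;

    // Simple modulo hash; keys are assumed to be non-negative.
    int hashFunction(int key) const
    {
        return key % size;
    }

    // Linear scan for the first empty slot; returns -1 when the table is full.
    int findNextFreeSlot() const
    {
        for (int i = 0; i < size; i++)
        {
            if (table[i] == -1)
            {
                return i;
            }
        }
        return -1;
    }
};

int main()
{
    HashTable ht(7);
    ht.insert(10);
    ht.insert(20);
    ht.insert(15);
    ht.insert(7);
    std::cout << "Hash Table contents:" << std::endl;
    ht.display();
    std::cout << "Search for 10: " << (ht.search(10) ? "Found" : "Not Found") << std::endl;
    std::cout << "Search for 5: " << (ht.search(5) ? "Found" : "Not Found") << std::endl;
    ht.remove(20);
    std::cout << "Hash Table contents after removing 20:" << std::endl;
    ht.display();
    return 0;
}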
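Code explanation
- The HashTable class holds two vectors: one stores the keys, the other stores the index of the next node in each chain.
- insert computes the key's hash and, on a collision, stores the key in a free slot of the array and links it into the chain.
- search follows the chain through the array to look for the key.
- remove follows the chain through the array to find and delete the key.
- display prints the contents of the hash table together with the chain links.
- hashFunction computes the hash value.
- findNextFreeSlot finds the next free slot.
Output: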
Hash Table contents:
0: 7 (next: -1)
1: 15 (next: -1)
2: -1 (next: -1)
3: 10 (next: -1)
4: -1 (next: -1)
5: -1 (next: -1)
6: 20 (next: -1)
Search for 10: Found
Search for 5: Not Found
Hash Table contents after removing 20:
0: 7 (next: -1)
1: 15 (next: -1)
2: -1 (next: -1)
3: 10 (next: -1)
4: -1 (next: -1)
5: -1 (next: -1)
6: -1 (next: -1)
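Because the keys 10, 20, 15, and 7 all have different home slots in a 7-slot table, every next link in the run above stays at -1. The sketch below inserts keys that do collide, to show how chains form and merge inside the array. It follows the same insertion rule as the class above (first free slot, scanning from index 0); CoalescedTable and buildCoalesced are hypothetical names used only for this illustration, and keys are assumed to be non-negative.

```cpp
#include <cassert>
#include <vector>

struct CoalescedTable
{
    std::vector<int> keys; // -1 marks an empty slot
    std::vector<int> next; // index of the next slot in the chain, or -1
};

// Inserts the keys in order. A colliding key takes the first free slot and is
// linked to the end of the chain that passes through its home slot.
CoalescedTable buildCoalesced(const std::vector<int>& input, int size)
{
    assert(size > 0);
    CoalescedTable t{std::vector<int>(size, -1), std::vector<int>(size, -1)};
    for (int key : input)
    {
        int home = key % size;
        if (t.keys[home] == -1)
        {
            t.keys[home] = key;
            continue;
        }
        int tail = home;
        while (t.next[tail] != -1)
        {
            tail = t.next[tail];
        }
        int freeSlot = 0;
        while (freeSlot < size && t.keys[freeSlot] != -1)
        {
            ++freeSlot;
        }
        if (freeSlot == size)
        {
            break; // table full; the remaining keys are dropped in this sketch
        }
        t.keys[freeSlot] = key;
        t.next[tail] = freeSlot;
    }
    return t;
}
```

For the input {10, 17, 24, 0} and size 7, key 10 takes its home slot 3, 17 and 24 collide with it and land in slots 0 and 1, and key 0 finds its home slot 0 already taken by 17, so it is appended to the same chain in slot 2. The result is the single chain 3 -> 0 -> 1 -> 2: two logically separate buckets have coalesced, which is what gives the method its name.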