算法基础系列第二章——哈希表

最新推荐文章于 2024-04-23 14:26:21 发布

杨枝

最新推荐文章于 2024-04-23 14:26:21 发布

阅读量619

点赞数 5

分类专栏：算法基础文章标签：算法散列表数据结构

本文链接：https://blog.csdn.net/weixin_52621323/article/details/121606041

版权

算法基础专栏收录该内容

14 篇文章 21 订阅

订阅专栏

PAT甲级真题1078

原题描述(PAT甲级真题1078)

The task of this problem is simple: insert a sequence of distinct positive integers into a hash table, and output the positions of the input numbers. The hash function is defined to be “H(key) = key % TSize” where TSize is the maximum size of the hash table. Quadratic probing (with positive increments only) is used to solve the collisions.

Note that the table size is better to be prime. If the maximum size given by the user is not prime, you must re-define the table size to be the smallest prime number which is larger than the size given by the user.

Input Specification:

Each input file contains one test case. For each case, the first line contains two positive numbers: MSize (<=10 ^ 4) and N (<=MSize) which are the user-defined table size and the number of input numbers, respectively. Then N distinct positive integers are given in the next line. All the numbers in a line are separated by a space.

Output Specification:

For each test case, print the corresponding positions (index starts from 0) of the input numbers in one line. All the numbers in a line are separated by a space, and there must be no extra space at the end of the line. In case it is impossible to insert the number, print “-” instead.

Sample Input:
4 4
10 6 4 15
Sample Output:
0 1 4 -

//译文

将一个由若干个不同正整数构成的整数序列插入到一个哈希表中，然后输出输入数字的位置。

哈希函数定义为 H(key)=key%TSize，其中 TSize 是哈希表的最大大小。

利用只具有正增量的二次探测法来解决冲突。

注意，哈希表的大小最好是素数，如果用户给出的最大大小不是素数，则必须将表大小重新定义为大于用户给出的大小的最小素数。

输入格式
第一行包含两个整数 MSize 和 N，分别表示用户定义的表的大小以及输入数字的数量。

第二行包含 N 个不同的正整数，数字之间用空格隔开。

输出格式
在一行中，输出每个输入数字的相应位置（索引从 0 开始），数字之间用空格隔开，行尾不得有多余空格。

如果无法插入某个数字，则输出 -。

数据范围
1≤MSize≤104,
1≤N≤MSize,
输入数字均在 [1,105] 范围内。

输入样例：
4 4
10 6 4 15
输出样例：
0 1 4 -

源代码

#include <iostream>
#include <cstring>
#include <algorithm>

using namespace std;
const int N = 10000 + 10;
const int null = 0x3f3f3f3f;
int h[N];
int n,m;
int size_prime;

//求所给哈希表长度的最小质数
bool is_prime(int n)
{
    if(n < 2) return false;
    else 
    {
        for(int i = 2; i * i <= n ;i++)
        {
            if(n % i == 0) return false;
        }
        return true;
    }
}

int find(int x,int tsize)
{
    //求哈希值的核心
    int start = (x % tsize + tsize) % tsize;
    int k = start;
    int idx = 1;//计数器
    bool flag = true;
    while (h[k] != null && h[k] != x && h[k] != -1)
    {   
        if(idx == tsize)
        {
            flag =false;
            break;
        }

        k = (start + idx * idx) % tsize;
        idx ++;
    }
        if(!flag) k = -1;
        else return k;
}

int main()
{
    cin >> m >> n;
    size_prime = m;
    while(!is_prime(size_prime)) ++ size_prime;

    memset(h,-1,sizeof h);
    fill(h,h+size_prime,null);

    for(int i= 0; i <n;i++)
    {
        int x;
        cin >> x;
        int k = find(x,size_prime);
        if(k == -1) cout <<'-' <<' ';
        else
        {
            h[k] = x;
            cout << k <<' ';
        }
    }
    return 0;
}

解题思路

第一个要理解的点

这道题是明确说出了，会出现冲突的

何为冲突？反映在哈希表上就是，传入值进行匹配哈希值，经过计算以后，发现多个数的哈希值k一样，因为哈希表的基础模型是数组，所以当出现冲突的时候，后面的值就需要重新计算哈希值。本题也很直接的给出了利用只具有正增量的二次探测法来解决冲突。

第二个要注意的点

题目要求了：“哈希表的大小最好是素数，如果用户给出的最大大小不是素数，则必须将表大小重新定义为大于用户给出的大小的最小素数。”

bool is_prime(int n)
{
    if(n < 2) return false;
    else 
    {
        for(int i = 2; i * i <= n ;i++)
        {
            if(n % i == 0) return false;
        }
        return true;
    }
}

/*********调用*********/
while(!is_prime(size_prime)) ++ size_prime;

第三个需要做的铺垫 --> 初始化问题

1≤MSize≤104,
1≤N≤MSize,
输入数字均在 [1,105] 范围内。
/***********************************/
 *const int N = 10000 + 10;        *
 *const int null = 0x3f3f3f3f;     *
 *int h[N];                        *
/***********************************/

首先需要为这个为了满足最大范围要求而创建哈希数组初始化

memset(h,-1,sizeof h);

其次是为用户定义的表的大小MSize在处理为合适的素数以后，由这个大小MSize在最大的哈希数组h[N]中所需要的空间初始化

fill(h,h+size_prime,null);

拓1——关于menset函数

函数介绍

void *memset(void *s, int ch, size_t n);

函数解释：
将s中当前位置后面的 n 个字节（typedef unsigned int size_t ）用 ch 替换并返回 s 。

函数作用：
作用是在一段内存块中填充某个给定的值，它是对较大的结构体或数组进行清零操作的一种最快方法 [1]。

函数原型：
memset()函数原型是extern void *memset(void *buffer, int c, int count) buffer：为指针或是数组,c：是赋给buffer的值,count：是buffer的长度.

常见错误

memset函数按字节对内存块进行初始化，所以不能用它将int数组初始化为0和-1之外的其他值（除非该值高字节和低字节相同）。

拓2——fill函数

fill函数的作用是：将一个区间的元素都赋予val值。
函数参数：fill(first,last,val);//first为容器的首迭代器，last为容器的末迭代器，val为将要替换的值。

开放寻址法构建 find(int key , int TSize)

在第一步和第二步理解到位的基础上就可以开始构建这个核心的find()函数了。

int find(int x,int TSize)
{
    //求哈希值的核心
    int start = (x % TSize + TSize) % TSize;
    int k = start;
    int idx = 1;//计数器
    bool flag = true;

    while (h[k] != null && h[k] != x && h[k] != -1)
    {
        if(idx == TSize)
        {
            flag =false;//标记这轮寻址插入失败
            break;
        }

        // 重新求新的哈希值
        k = (start + idx * idx) % TSize;
        idx ++;
    }
        if(!flag) k = -1;
        else return k;
}