PAT (Advanced Level) Practice 1145 Hashing - Average Search Time (25 分) 凌宸1642_input a number n as the number of all numbers (int-CSDN博客

本文链接：https://blog.csdn.net/lingchen1120/article/details/119814743

PAT (Advanced Level) Practice 1145 Hashing - Average Search Time (25 分) 凌宸1642

题目描述：

The task of this problem is simple: insert a sequence of distinct positive integers into a hash table first. Then try to find another sequence of integer keys from the table and output the average search time (the number of comparisons made to find whether or not the key is in the table). The hash function is defined to be H(key)=key%TSize where TSize is the maximum size of the hash table. Quadratic probing (with positive increments only) is used to solve the collisions.

Note that the table size is better to be prime. If the maximum size given by the user is not prime, you must re-define the table size to be the smallest prime number which is larger than the size given by the user.

译：这个问题的任务很简单：首先将不同的正整数序列插入到哈希表中。然后尝试从表中找到另一个整数键序列并输出平均搜索时间（为查找键是否在表中而进行的比较次数）。散列函数定义为 H(key) = key % TSize 其中 TSize 是散列表的最大大小。二次探测（仅具有正增量）用于解决冲突。

请注意，表大小最好是素数。如果用户给出的最大大小不是素数，则必须重新定义表大小为大于用户给出大小的最小素数。

Input Specification (输入说明):

Each input file contains one test case. For each case, the first line contains 3 positive numbers: MSize, N, and M, which are the user-defined table size, the number of input numbers, and the number of keys to be found, respectively. All the three numbers are no more than 104. Then N distinct positive integers are given in the next line, followed by M positive integer keys in the next line. All the numbers in a line are separated by a space and are no more than 10⁵.

译：每个输入文件包含一个测试用例。对于每种情况，第一行包含 3 个正数：MSize、N 和 M，分别是用户定义的表大小、输入数字的数量和要查找的键的数量。这三个数字都不超过 104。然后在下一行给出 N 个不同的正整数，然后在下一行给出 M 个正整数键。一行中的所有数字以空格分隔，且不超过 10⁵。

output Specification (输出说明):

For each test case, in case it is impossible to insert some number, print in a line X cannot be inserted. where X is the input number. Finally print in a line the average search time for all the M keys, accurate up to 1 decimal place.

译：对于每个测试用例，如果无法插入某个数字，则打印 X cannot be inserted.。其中 X 是输入数字。最后一行打印所有M键的平均搜索时间，精确到小数点后一位。

Sample Input (样例输入):

4 5 4
10 6 4 15 11
11 4 15 2

Sample Output (样例输出):

15 cannot be inserted.
2.8

The Idea：

首先得了解什么是解决哈希冲突的 [正向二次探测法]

https://zhidao.baidu.com/question/648138432445712725.html)。
然后其实就是模拟其过程，这里本菜鸡也不太明白的是，为什么样例中的 15 ，查找了 msize + 1 次？。

The Codes:

#include<bits/stdc++.h>
using namespace std ;
int msize , n , m , t , ans= 0 , pos ; 
unordered_map<int , int> mp ;
bool isPrime(int x){
	for(int i = 2 ; i * i <= x ; i ++)
		if(x % i == 0) return false ;
	return true ;
}
int main(){
	cin >> msize >> n >> m ;
	while(!isPrime(msize)) msize ++ ; 
	vector<int> hash(msize , -1) ;
	for(int i = 0 ; i < n ; i ++){
		cin >> t ;
		bool flag = false ;
		int j = 0 ;
		for( ; j < msize ; j ++){
			pos = (t + j * j) % msize ;
			if(hash[pos] == -1){
				hash[pos] = t ;
				mp[t] = j + 1 ;
				flag = true ;
				break ;
			}
		}
		if(!flag) mp[t] = msize + 1 ; // 与刚好第 msize次找到加以区分
	}
	for(int i = 0 ; i < m ; i ++){
		cin >> t ;
		if(mp.count(t)){
			if(mp[t] == msize + 1){
				ans += msize + 1 ;
				cout << t << " cannot be inserted." << endl ; 
			}else ans += mp[t] ;
		}else {
			for(int j = 0; j <= msize ; j ++){ // 也就是不理解的地方在这：为啥 j 的范围是 [0 , msize] 因为 msize 根本没有实际意义 msize % msize === 0 % msize
				ans ++ ;
				pos = (t + j * j) % msize ;
				if(hash[pos] == -1) break ;
			}	
		}
	}
	printf("%.1f\n" , ans * 1.0 / m) ;
	return 0 ;
}