Cuckoo Hashing: A Bipartite Matching Problem

梁天超

Category: ACM. Tags: ACM, bipartite matching

Article link: https://blog.csdn.net/LTianchao/article/details/17383203

Problem Description

Description

One of the most fundamental data structure problems is the dictionary problem: given a set D of words you want to be able to quickly determine if any given query string q is present in the dictionary D or not. Hashing is a well-known solution for the problem. The idea is to create a function h : Σ* → [0..n-1] from all strings to the integer range 0, 1, .., n-1, i.e. you describe a fast deterministic program which takes a string as input and outputs an integer between 0 and n-1. Next you allocate an empty hash table T of size n and for each word w in D, you set T[h(w)] = w. Thus, given a query string q, you only need to calculate h(q) and see if T[h(q)] equals q, to determine if q is in the dictionary. Seems simple enough, but aren't we forgetting something? Of course, what if two words in D map to the same location in the table? This phenomenon, called collision, happens fairly often (remember the Birthday paradox: in a class of 24 pupils there is more than 50% chance that two of them share birthday). On average you will only be able to put roughly √n-sized dictionaries into the table without getting collisions, quite poor space usage!
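The birthday-paradox figure quoted above can be checked numerically. The following is a minimal sketch, assuming keys are hashed uniformly and independently; the function name `collisionProbability` is hypothetical and not part of the problem statement.

```cpp
// Probability that k keys hashed uniformly at random into n slots
// produce at least one collision: 1 - prod_{i=0}^{k-1} (n - i) / n.
double collisionProbability(int k, int n)
{
    double noCollision = 1.0;
    for (int i = 0; i < k; ++i)
        noCollision *= static_cast<double>(n - i) / n;
    return 1.0 - noCollision;
}
```

With n = 365 the probability already exceeds 50% at k = 23, so the claim for a class of 24 pupils holds.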

A stronger variant is Cuckoo Hashing. The idea is to use two hash functions h1 and h2. Thus each string maps to two positions in the table. A query string q is now handled as follows: you compute both h1(q) and h2(q), and if T[h1(q)] = q, or T[h2(q)] = q, you conclude that q is in D. The name "Cuckoo Hashing" stems from the process of creating the table. Initially you have an empty table. You iterate over the words d in D, and insert them one by one. If T[h1(d)] is free, you set T[h1(d)] = d. Otherwise if T[h2(d)] is free, you set T[h2(d)] = d. If both are occupied however, just like the cuckoo with other birds' eggs, you evict the word r in T[h1(d)] and set T[h1(d)] = d. Next you put r back into the table in its alternative place (and if that entry was already occupied you evict that word and move it to its alternative place, and so on). Of course, we may end up in an infinite loop here, in which case we need to rebuild the table with other choices of hash functions. The good news is that this will not happen with great probability even if D contains up to n/2 words!
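The insertion procedure described above can be sketched as follows. This is only an illustration under stated assumptions: words are identified by their index, `table[k]` holds the index of the word in slot k (or -1 if empty), and the name `cuckooInsert` and the `maxKicks` cap on evictions are hypothetical. The cap stands in for detecting the infinite loop mentioned in the statement.

```cpp
#include <utility>
#include <vector>

// Insert `word` into `table` following the eviction rule from the statement.
// h1[w] and h2[w] are the two candidate slots of word w.
// Returns false if more than maxKicks evictions would be needed; in that
// case one word is left without a slot and the table must be rebuilt.
bool cuckooInsert(std::vector<int>& table,
                  const std::vector<int>& h1,
                  const std::vector<int>& h2,
                  int word, int maxKicks)
{
    if (table[h1[word]] == -1) { table[h1[word]] = word; return true; }
    if (table[h2[word]] == -1) { table[h2[word]] = word; return true; }

    int cur = word;       // word currently looking for a slot
    int pos = h1[word];   // slot it is about to take over
    for (int kick = 0; kick < maxKicks; ++kick)
    {
        std::swap(cur, table[pos]);   // cur takes the slot, old occupant is evicted
        // The evicted word moves to its alternative slot.
        pos = (pos == h1[cur]) ? h2[cur] : h1[cur];
        if (table[pos] == -1) { table[pos] = cur; return true; }
    }
    return false;
}
```

Note that the problem itself only asks whether some valid placement exists, so a solution does not have to simulate this process.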

Input

On the first line of input is a single positive integer 1 ≤ t ≤ 50 specifying the number of test cases to follow. Each test case begins with two positive integers 1 ≤ m ≤ n ≤ 10000 on a line of itself, m telling the number of words in the dictionary and n the size of the hash table in the test case. Next follow m lines of which the ith describes the ith word di in the dictionary D by two non negative integers h1(di) and h2(di) less than n giving the two hash function values of the word di. The two values may be identical.

Output

For each test case there should be exactly one line of output either containing the string "successful hashing" if it is possible to insert all words in the given order into the table, or the string "rehash necessary" if it is impossible.

Sample Input

Sample Output

successful hashing
rehash necessary

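The problem statement is wordy. In short: m numbers have to be placed into n positions, each number has two candidate positions, and the question is whether all of them can be placed.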

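My first idea was to search for an assignment 1(k1), 2(k2), ..., m(km), where i(ki) means that the i-th number goes into its ki-th candidate position (ki = 1 or 2). At first glance this looked like textbook backtracking, so easy.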

#include <cstdio>
#include <cstring>

const int M = 10002;

int p1[M], p2[M];   // the two candidate positions of each number
bool mark[M];       // mark[k] is true while position k is occupied
int m, n;
bool flag;          // set once a complete placement has been found

// Try both candidate positions for number `step`, undoing each choice afterwards.
void backtrack(int step)
{
    if (flag)
        return;
    if (step == m)
    {
        flag = true;
        return;
    }
    if (!mark[p1[step]])
    {
        mark[p1[step]] = true;
        backtrack(step + 1);
        mark[p1[step]] = false;
    }
    if (!mark[p2[step]])
    {
        mark[p2[step]] = true;
        backtrack(step + 1);
        mark[p2[step]] = false;
    }
}

int main()
{
    int t;
    scanf("%d", &t);
    while (t--)
    {
        flag = false;
        memset(mark, 0, sizeof(mark));
        scanf("%d%d", &m, &n);
        for (int i = 0; i < m; i++)
            scanf("%d%d", &p1[i], &p2[i]);
        backtrack(0);
        if (flag)
            printf("successful hashing\n");
        else
            printf("rehash necessary\n");
    }
    return 0;
}

Unsurprisingly, this timed out. Backtracking takes exponential time: in the worst case it tries up to 2^m assignments.
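To see the blow-up concretely, here is a small sketch that counts the recursive calls of the same backtracking search on an artificial instance. The names `BacktrackCounter` and `countCallsOnHardInstance` are hypothetical, and the instance is constructed for illustration: k numbers that each have two private positions, followed by two numbers that compete for a single position, so no placement exists.

```cpp
#include <vector>

// Instrumented copy of the backtracking search that counts its calls.
struct BacktrackCounter
{
    std::vector<int> p1, p2;   // candidate positions of each number
    std::vector<bool> mark;    // occupied positions
    long long calls = 0;
    bool found = false;

    void run(int step)
    {
        ++calls;
        if (found) return;
        if (step == static_cast<int>(p1.size())) { found = true; return; }
        if (!mark[p1[step]])
        {
            mark[p1[step]] = true;
            run(step + 1);
            mark[p1[step]] = false;
        }
        if (!mark[p2[step]])
        {
            mark[p2[step]] = true;
            run(step + 1);
            mark[p2[step]] = false;
        }
    }
};

// Build the instance described above and return the number of calls made.
long long countCallsOnHardInstance(int k)
{
    BacktrackCounter bc;
    for (int i = 0; i < k; ++i)
    {
        bc.p1.push_back(2 * i);
        bc.p2.push_back(2 * i + 1);
    }
    for (int j = 0; j < 2; ++j)   // two numbers, one shared position 2k
    {
        bc.p1.push_back(2 * k);
        bc.p2.push_back(2 * k);
    }
    bc.mark.assign(2 * k + 1, false);
    bc.run(0);
    return bc.calls;
}
```

On this instance the call count is 2^(k+2) - 1, so each additional number roughly doubles the work, which is hopeless for m up to 10000.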

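This is in fact a bipartite matching problem. I do not fully understand the theory of bipartite matching myself, but the code is fairly intuitive and can be understood by following the problem statement step by step.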

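Note that every time a point is added, we check whether it can be placed successfully; if it cannot, there is no need to continue. link[i] is the point currently stored at position i. The used array is reset each time a point is added and records the positions visited during that insertion. Within one insertion the same position must not be visited twice, otherwise the search would loop forever.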

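#include <cstdio>
#include <cstring>
#include <iostream>
using namespace std;
#define M 10002

int arr[M][2];  // arr[a][0] and arr[a][1] are the two candidate positions of point a
bool used[M];   // used[j]: position j was already visited during the current insertion
int link[M];    // link[j] is the point stored at position j, or -1 if it is free

int t, n, m;

// Try to place point a. If a candidate position is taken, recursively try to
// move its current occupant to that occupant's other position.
bool dfs(int a)
{
	for(int i=0; i<2; i++)
	{
		int j = arr[a][i];
		if(!used[j])
		{
			used[j] = true;
			if(link[j]==-1||dfs(link[j]))
			{
				link[j] = a;
				return true;
			}
		}
	}
	return false;
}

int main()
{
	int i;
	cin>>t;
	while(t--)
	{
		memset(link,-1,sizeof(link));
		scanf("%d%d",&m,&n);
		for(i=0; i<m; i++)
			scanf("%d%d",&arr[i][0],&arr[i][1]);
		for(i=0; i<m; i++)
		{
			// Reset the visited positions before each insertion.
			memset(used,false,sizeof(used));
			if(!dfs(i))
				break;  // point i cannot be placed, so stop early
		}
		if(i==m)
			cout<<"successful hashing"<<endl;
		else cout<<"rehash necessary"<<endl;
	}
	return 0;
}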
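As a cross-check, the same question can also be answered without matching, using a different technique: union-find. This is not the method used above, only a sketch under the following view of the problem: treat each position as a vertex and each point as an edge between its two candidate positions. All points can be placed exactly when no connected component has more edges than vertices, that is, when every component contains at most one cycle. The function name `canPlaceAll` is hypothetical.

```cpp
#include <numeric>
#include <utility>
#include <vector>

// Returns true if every point can get its own position.
// n is the number of positions; each pair holds the two candidate positions
// of one point (they may be identical).
bool canPlaceAll(int n, const std::vector<std::pair<int, int>>& points)
{
    std::vector<int> parent(n);
    std::iota(parent.begin(), parent.end(), 0);
    std::vector<bool> hasCycle(n, false);   // per component root

    // Find the component root, with path halving.
    auto find = [&](int x) {
        while (parent[x] != x)
        {
            parent[x] = parent[parent[x]];
            x = parent[x];
        }
        return x;
    };

    for (const auto& p : points)
    {
        int a = find(p.first), b = find(p.second);
        if (a == b)
        {
            // The edge closes a cycle; a second cycle means too many points.
            if (hasCycle[a]) return false;
            hasCycle[a] = true;
        }
        else
        {
            // Merging two components that both already contain a cycle fails.
            if (hasCycle[a] && hasCycle[b]) return false;
            parent[a] = b;
            hasCycle[b] = hasCycle[a] || hasCycle[b];
        }
    }
    return true;
}
```

For example, three points with candidates (0,1), (1,2), (2,0) fit into 3 positions, while five points with candidates (2,3), (3,1), (1,2), (5,1), (2,5) cannot be placed because they share only four distinct positions.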
