CSU GPLT Campus Round L3-1 Hashing - Hard Version

xiyue_jiang

Categories: GPLT, Graphs, Topological sort. Tags: simulation, topological sort

Article link: https://blog.csdn.net/xiyue_jiang/article/details/88659983

Hashing - Hard Version

L3-1 Hashing - Hard Version (30 points)

Given a hash table of size N, we can define a hash function H(x)=x%N. Suppose that the linear probing is used to solve collisions, we can easily obtain the status of the hash table with a given sequence of input numbers.

However, now you are asked to solve the reversed problem: reconstruct the input sequence from the given status of the hash table. Whenever there are multiple choices, the smallest number is always taken.

Input Specification:

Each input file contains one test case. For each test case, the first line contains a positive integer N (≤1000), which is the size of the hash table. The next line contains N integers, separated by a space. A negative integer represents an empty cell in the hash table. It is guaranteed that all the non-negative integers are distinct in the table.

Output Specification:

For each test case, print a line that contains the input sequence, with the numbers separated by a space. Notice that there must be no extra space at the end of each line.

Sample Input:

11
33 1 13 12 34 38 27 22 32 -1 21

Sample Output:

1 13 12 21 33 34 38 27 22 32

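Problem summary

Given the final state of a hash table, work backwards to an input sequence that produces it. If several input sequences produce the same table, output the smallest one: at every step where more than one number could come next, take the smallest.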
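To make the forward direction concrete, here is a minimal sketch of how a table is built with $H(x) = x \% N$ and linear probing. The function name `buildTable` and the use of `-1` as the empty marker are assumptions made for this illustration only; they are not part of the problem statement.

```cpp
#include <vector>

// Build a hash table of size n with H(x) = x % n and linear probing.
// Empty cells are marked with -1 (an illustrative choice).
// Assumes non-negative keys and at most n of them.
std::vector<int> buildTable(const std::vector<int>& keys, int n) {
    std::vector<int> table(n, -1);
    for (int x : keys) {
        int pos = x % n;                 // home slot of x
        while (table[pos] >= 0) {        // occupied: probe the next cell cyclically
            pos = (pos + 1) % n;
        }
        table[pos] = x;
    }
    return table;
}
```

Feeding the sample output `1 13 12 21 33 34 38 27 22 32` into `buildTable` with `n = 11` gives back the sample input table, which is a handy way to check a candidate answer.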

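Key points

An empty cell is marked by any negative integer. Assuming that an empty cell is always $-1$ leads to wrong answers.
The problem can be solved by plain simulation, or, more generally, by modelling it as a topological sort.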

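Analysis

Take the given hash table and annotate each element with its remainder modulo the table size:

Element	33	1	13	12	34	38	27	22	32	-1	21
Index	0	1	2	3	4	5	6	7	8	9	10
Remainder	0	1	2	1	1	5	5	0	10	/	10
Distance from the hit element	0	0	0	2	3	0	1	7	9	/	0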

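From the table it is easy to see that every element sitting in its own hash slot (a hit element, written $e_h$) must appear in the original sequence before every other element with the same remainder. For an element that missed its slot (written $e_l$), linear probing guarantees that when $e_l$ was placed in its cell, there was no empty cell between $e_h$ and $e_l$: every cell from the hash slot up to the one just before $e_l$ was already occupied.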
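The distance row of the table can be recomputed mechanically. The sketch below is only an illustration: the names `probeDistance` and `allDistances` are hypothetical, and reporting empty cells as `-1` is an assumption of this example.

```cpp
#include <vector>

// Cyclic distance from the home slot (val % n) to the cell where val is stored.
int probeDistance(int pos, int val, int n) {
    return (pos - val % n + n) % n;
}

// Distance for every cell of the table.
// Empty cells (any negative value) are reported as -1.
std::vector<int> allDistances(const std::vector<int>& table) {
    int n = static_cast<int>(table.size());
    std::vector<int> d(n, -1);
    for (int i = 0; i < n; ++i) {
        if (table[i] >= 0) {
            d[i] = probeDistance(i, table[i], n);
        }
    }
    return d;
}
```

A distance of 0 marks a hit element; a distance of $k > 0$ means that $k$ cells had to be full before the element could land where it did.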

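This observation is already enough to derive the original sequence for the sample table by simulation.

Simulation approach

Because the problem asks for the smallest of all possible sequences, smaller elements should be output as early as possible. So first sort all elements in ascending order to get an ordered list $list$. Then process the elements one at a time, with $e$ as the current element: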

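1. Check whether $e$ is a hit. If it is, output it directly and move $e$ to the next element of $list$. If it is not, find the hit element $e_h$ with the same remainder as $e$ in the hash table, count the number $d$ of elements lying between the two, and go to step 2.
2. Scan the elements $e_t$ that follow $e$ in $list$, in order, repeating the scan $d$ times. If $e_t$ is a hit, or the element just before $e_t$ in the hash table has already been output, then also check whether $e_t$ lies between $e_h$ and $e$; if it does, output $e_t$. After the $d$ scans, output $e$. This loop does the job of "output smaller elements as early as possible".
3. Move $e$ to the next element of $list$ that has not been output and go back to step 1. When every element of $list$ has been output, stop.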
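For comparison, here is a sketch of a simpler greedy simulation. It is not the three-step procedure above but a plainer variant of the same idea: repeatedly output the smallest remaining key whose probe path is already filled. The function name `reconstructBySimulation` is hypothetical, and the worst case is roughly $O(n^3)$, so treat it as an illustration rather than a tuned solution.

```cpp
#include <algorithm>
#include <vector>

// Greedy simulation: repeatedly output the smallest remaining key whose probe
// path (home slot up to, but excluding, its own cell) is already filled.
std::vector<int> reconstructBySimulation(const std::vector<int>& table) {
    int n = static_cast<int>(table.size());
    std::vector<int> order;               // indices of non-empty cells
    for (int i = 0; i < n; ++i) {
        if (table[i] >= 0) order.push_back(i);
    }
    // Visit cells in ascending order of the key they hold.
    std::sort(order.begin(), order.end(),
              [&](int a, int b) { return table[a] < table[b]; });

    std::vector<bool> placed(n, false);   // placed[i]: key in cell i already output
    std::vector<int> result;
    while (result.size() < order.size()) {
        bool progressed = false;
        for (int idx : order) {
            if (placed[idx]) continue;
            // Every cell from the home slot to the one before idx must be placed.
            int p = table[idx] % n;
            bool ready = true;
            while (p != idx) {
                if (!placed[p]) { ready = false; break; }
                p = (p + 1) % n;
            }
            if (ready) {
                placed[idx] = true;
                result.push_back(table[idx]);
                progressed = true;
                break;                    // restart from the smallest key
            }
        }
        if (!progressed) break;           // inconsistent table: nothing can be placed
    }
    return result;
}
```

Restarting the scan from the smallest key after every output is what keeps the result minimal: a small key that just became ready is always considered before any larger one.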

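Topological sort approach

The rule found above can be restated in graph terms: each element becomes a vertex, and a directed edge goes from the hit element $e_h$ to each non-hit element $e_l$ with the same remainder. More generally, every element stored between the hash slot of $e_l$ and $e_l$ itself must be placed before $e_l$, so the "distance from the hit element" in the table can be read as the in-degree of the vertex. Each time a vertex is output, the in-degree of each of its successors drops by one, and once the in-degree of a successor (that is, of some $e_l$ belonging to an $e_h$) reaches $0$, that vertex can be output. This achieves exactly what the loop in step 2 of the simulation approach does.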

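Several vertices may have in-degree $0$ at the same time, and the problem asks for smaller elements to be output first, so the vertices with in-degree zero are kept in an ascending (min-first) priority queue.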
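As a small illustration of the data structure, `std::priority_queue` is a max-heap by default, and passing `std::greater<int>` as the comparator turns it into a min-heap. The helper name `drainAscending` is made up for this sketch.

```cpp
#include <functional>
#include <queue>
#include <vector>

// Drain a min-heap: values come out in ascending order regardless of push order.
std::vector<int> drainAscending(const std::vector<int>& values) {
    std::priority_queue<int, std::vector<int>, std::greater<int>> heap;
    for (int v : values) {
        heap.push(v);
    }
    std::vector<int> out;
    while (!heap.empty()) {
        out.push_back(heap.top());        // top() is the smallest remaining value
        heap.pop();
    }
    return out;
}
```

In the sample, the hit elements 33, 1, 13, 38 and 21 all start with in-degree zero, and a heap like this one hands them out smallest first.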

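Let $\text{dist}(e)$ be the distance from an element $e$ to the hit element with the same remainder ($0$ if $e$ is itself a hit), $\text{pos}(e)$ the index of $e$ in the hash table, $\text{val}(e)$ the value of $e$, and $n$ the capacity of the hash table. Then:
$\text{dist}(e) = (\text{pos}(e) - \text{val}(e) \% n + n) \% n$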
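As a quick check of the formula on the sample table ($n = 11$), take $32$, stored at index $8$, and $12$, stored at index $3$:

$\text{dist}(32) = (8 - 32 \% 11 + 11) \% 11 = (8 - 10 + 11) \% 11 = 9$
$\text{dist}(12) = (3 - 12 \% 11 + 11) \% 11 = (3 - 1 + 11) \% 11 = 2$

The $+ n$ term matters in the first case, where the probe wrapped around the end of the table and $\text{pos}(e) - \text{val}(e) \% n$ is negative.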
The answer is then computed as follows:

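1. Push every hit element into the priority queue.
2. If the queue is not empty, pop the front element $e_{top}$ and output it.
3. Check every other element $e$ whose in-degree is greater than $0$. If the cyclic distance from $e_{top}$ to $e$, that is $(\text{pos}(e) - \text{pos}(e_{top}) + n) \% n$, is not greater than $\text{dist}(e)$, decrease the in-degree of $e$ by one. If the in-degree of $e$ is $0$ afterwards, push it into the queue. Go back to step 2.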
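The same steps can also be sketched with an explicit edge list, which makes the graph visible at the cost of up to $n^2$ stored edges. This is an illustrative variant under that assumption, not the accepted program; the function name `reconstructByTopoSort` is hypothetical.

```cpp
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Kahn's algorithm with a min-heap. succ[p] lists the cells that can only be
// filled after cell p; indeg[i] equals dist(e) for the key stored in cell i.
std::vector<int> reconstructByTopoSort(const std::vector<int>& table) {
    int n = static_cast<int>(table.size());
    std::vector<std::vector<int>> succ(n);
    std::vector<int> indeg(n, 0);
    for (int i = 0; i < n; ++i) {
        if (table[i] < 0) continue;       // any negative value is an empty cell
        // Every cell on the probe path from the home slot to i - 1 precedes i.
        for (int p = table[i] % n; p != i; p = (p + 1) % n) {
            succ[p].push_back(i);
            ++indeg[i];
        }
    }

    using Item = std::pair<int, int>;     // (key, cell index), ordered by key
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> ready;
    for (int i = 0; i < n; ++i) {
        if (table[i] >= 0 && indeg[i] == 0) ready.push({table[i], i});
    }

    std::vector<int> result;
    while (!ready.empty()) {
        int cell = ready.top().second;
        ready.pop();
        result.push_back(table[cell]);
        for (int next : succ[cell]) {
            if (--indeg[next] == 0) ready.push({table[next], next});
        }
    }
    return result;
}
```

Storing the cell index next to the key avoids searching the table for the position of each popped value.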

Once the problem is recast as a topological sort, it becomes much clearer.

Code

Topological sort version (Accepted)

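#include <bits/stdc++.h>
using namespace std;
const int maxn = 1009;

// Min-heap of values whose in-degree has dropped to zero.
priority_queue<int, vector<int>, greater<int> > q;
vector<int> vec;    // table contents by index (negative = empty cell)
int inOrder[maxn];  // remaining in-degree of the element stored at each index
int arr[maxn];

// Cyclic distance from the home slot (val % n) to currPos.
int dist(int currPos, int val, int n){
    int pos = val % n;
    int d = (currPos - pos + n) % n;
    return d;
}

int main(){
#ifdef TEST
    freopen("test.txt", "r", stdin);
#endif // TEST

    int n;
    while(cin >> n){
        memset(inOrder, 0, sizeof(inOrder));
        vec.clear(); // reset state in case the input holds more than one case
        for(int i = 0; i < n; i++){
            int t;
            cin >> t;
            arr[i] = t;
            vec.push_back(t);
            if(t >= 0){
                int d = dist(i, t, n);
                inOrder[i] = d;
                if(inOrder[i] == 0) // hit element: ready from the start
                    q.push(t);
            }
        }
        bool firstOut = true;
        while(!q.empty()){
            int tmp = q.top();
            q.pop();
            if(firstOut){
                cout << tmp;
                firstOut = false;
            }
            else{
                cout << " " << tmp;
            }

            int tmpPos = find(vec.begin(), vec.end(), tmp) - vec.begin();

            // Walk the table cyclically from tmpPos. A cell depends on tmp when
            // tmp lies on its probe path, i.e. its offset from tmpPos is at most
            // its dist. The cell of tmp itself drops to -1, marking it as done.
            for(int i = tmpPos; i < tmpPos + n; i++){
                int nextNode = vec[i % n];
                if(dist(i % n, nextNode, n) >= (i - tmpPos + n) % n){
                    if(inOrder[i % n] > -1)
                        inOrder[i % n]--;
                    if(inOrder[i % n] == 0)
                        q.push(nextNode);
                }
            }
        }
        cout << "\n";
    }

    return 0;
}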

Simulation version (fails the largest test case)

#include <bits/stdc++.h>
using namespace std;
const int maxn = 1e3+9;

int n;
bool vis[maxn];
int arr[maxn];
map<int, int> positionMap;
vector<int> vec;

bool firstOut = true;
void print(vector<int>::iterator it){
    if(firstOut){
        cout << *it;
        firstOut = false;
    }
    else{
        cout << " " << *it;
    }
    vis[positionMap[*it]] = 1;
    vec.erase(it);
}
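// An element may be output if it is a hit, or if the cell just before it
// (cyclically) has already been handled.
bool okToPop(vector<int>::iterator it){
    int pos = positionMap[*it];
    int prev = pos - 1;
    prev = prev >= 0 ? prev : n - 1;
    return vis[prev] || (*it % n == pos);
}
// Number of not-yet-output cells between the home slot of v and v's own cell,
// found by walking backwards from v's cell to its home slot.
int dist(int v){
    int pos = positionMap[v];
    int cnt = 0;
    int k = n;
    while(k--){
        if(!vis[pos])
            cnt++;
        if(pos == v%n)
            break;
        pos -= 1;
        if(pos < 0) pos += n;
    }
    return cnt - 1; // exclude v itself
}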
void debug(){
    cout << endl;
    cout << "debug:  ";
    for(auto i : vec)
        cout << i << " ";
    cout << endl;
}

int main(){
#ifdef TEST
    freopen("test.txt", "r", stdin);
#endif // TEST

    memset(vis, 0, sizeof(vis));
    cin >> n;
    for(int i = 0; i < n; i++){
        int k;
        cin >> k;
        if(k >= 0){
            positionMap[k] = i;
            vec.push_back(k);
        }
        else{
            vis[i] = 1; // empty cells count as already handled
        }
    }
    sort(vec.begin(), vec.end());

    while(!vec.empty()){
        auto it = vec.begin(); // smallest element not yet output
        if(*it % n == positionMap[*it]){ // hash hit
            print(it);
        }
        else{ // hash fail
            int cnt = dist(*it);
            // Output cnt larger elements before *it. Note that okToPop does not
            // check whether the element lies between the home slot and *it, and
            // this loop never ends if no remaining element qualifies.
            while(cnt){
                for(auto i = it+1; i != vec.end(); i++){
                    if(okToPop(i)){
                        print(i);
                        cnt--;
                        break;
                    }
                }
            }
            print(it);
        }
    }
    cout << "\n";

    return 0;
}

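Comparing the two, the more abstract topological sort is easier to implement and its logic is clearer. I had worked out the relationship between the hash table and the possible original sequences early on, yet the simulation still took me more than two hours to implement. Part of that is rust from a long break in practice; the other part is that I did not think broadly enough to take the extra step and abstract the problem as a topological sort. More practice needed!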
