HDU 2473 Junk-Mail Filter (union-find with node deletion)

Most recently recommended article published on 2021-10-09 19:49:02

无语_

Categories: HDU, Algorithms. Tags: union-find

Article link: https://blog.csdn.net/a88770202/article/details/51741132

Junk-Mail Filter

Time Limit: 15000/8000 MS (Java/Others) Memory Limit: 32768/32768 K (Java/Others)
Total Submission(s): 8687 Accepted Submission(s): 2753

Problem Description
Recognizing junk mails is a tough task. The method used here consists of two steps:
1) Extract the common characteristics from the incoming email.
2) Use a filter matching the set of common characteristics extracted to determine whether the email is a spam.

We want to extract the set of common characteristics from the N sample junk emails available at the moment, and thus having a handy data-analyzing tool would be helpful. The tool should support the following kinds of operations:

a) “M X Y”, meaning that we think that the characteristics of spam X and Y are the same. Note that the relationship defined here is transitive, so relationships (other than the one between X and Y) need to be created if they are not present at the moment.

b) “S X”, meaning that we think spam X had been misidentified. Your tool should remove all relationships that spam X has when this command is received; after that, spam X will become an isolated node in the relationship graph.

Initially no relationships exist between any pair of the junk emails, so the number of distinct characteristics at that time is N.
Please help us keep track of any necessary information to solve our problem.

Input
There are multiple test cases in the input file.
Each test case starts with two integers, N and M (1 ≤ N ≤ 10⁵, 1 ≤ M ≤ 10⁶), the number of email samples and the number of operations. M lines follow, each line is one of the two formats described above.
Two successive test cases are separated by a blank line. A case with N = 0 and M = 0 indicates the end of the input file, and should not be processed by your program.

Output
For each test case, please print a single integer, the number of distinct common characteristics, to the console. Follow the format as indicated in the sample below.

Sample Input
5 6
M 0 1
M 1 2
M 1 3
S 1
M 1 2
S 3

3 1
M 1 2

0 0

Sample Output
Case #1: 3
Case #2: 2

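I spent a long time on the delete-node operation for union-find and went through countless blog posts, but apart from one post that included diagrams, none of them explained it in much detail. Now that I understand it a little, I want to write down my own take.
The problem has two operations, M (merge) and S (separate). The former is clearly simple; the latter is a bit harder to understand. Since at first I could not even make sense of the first sample, let me walk through the first test case of the problem.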

Operation	Resulting sets
M 0 1	{0 1} {2} {3} {4}
M 1 2	{0 1 2} {3} {4}
M 1 3	{0 1 2 3} {4}
S 1	{0 2 3} {1} {4}
M 1 2	{0 1 2 3} {4}
S 3	{0 1 2} {3} {4}

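The statement says "Your tool should remove all relationships that spam X has when this command is received", which can be misread as erasing every edge attached to X. It is easier to think of the sets as bubbles: M fuses bubbles and S splits one off. A split always sends out exactly one element, but a fusion is not limited to the two elements named. Take {0 2 3} and {4 5 6}: picking any one element from each side and merging them fuses the two sets into {0 2 3 4 5 6}. In other words, although M A B names only a single pair, it actually unites the whole set containing A with the whole set containing B. That explains the first sample: after 1 leaves, 2 pulls 1 back in, and then 3 is split off. The result is 3 groups, as the table above shows.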
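The bubble picture can be checked with a brute-force simulation. The following is only a sketch for verifying small cases, not the approach used for the real limits (each merge is O(N)); `BubbleSim`, `label`, and `firstSampleAnswer` are names made up for this illustration.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Brute-force reference: label[i] is the id of the bubble that email i lives in.
struct BubbleSim {
    std::vector<int> label;
    int nextLabel;  // first label that has never been used

    explicit BubbleSim(int n) : label(n), nextLabel(n) {
        for (int i = 0; i < n; ++i) label[i] = i;  // every email starts alone
    }

    // "M x y": fuse the whole bubble of y into the bubble of x.
    void merge(int x, int y) {
        int from = label[y], to = label[x];
        if (from == to) return;
        for (int &l : label)
            if (l == from) l = to;
    }

    // "S x": x alone leaves its bubble and receives a brand-new label.
    void split(int x) { label[x] = nextLabel++; }

    // Number of distinct bubbles.
    int count() const {
        return static_cast<int>(std::set<int>(label.begin(), label.end()).size());
    }
};

// Replays the first sample of the problem and returns the number of bubbles.
int firstSampleAnswer() {
    BubbleSim s(5);
    s.merge(0, 1);
    s.merge(1, 2);
    s.merge(1, 3);
    s.split(1);
    s.merge(1, 2);
    s.split(3);
    return s.count();
}
```

`firstSampleAnswer()` returns 3, matching the last row of the table.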
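The key question is how to implement S. Many other blog posts have already explained that simply resetting the node's parent does not work, so a different idea is needed. An ordinary union-find merges array elements, and those elements never change; they always stay within the range of the largest id. It tracks the person only.
Deletion needs a different view: instead of merging people, merge whoever currently occupies each position, so it tracks the position and not the person. In this problem, that means replacing the person who was split off: if that id is pulled back in later, it is a different entity, yet it stands in the position that the split-off person used to hold. Take the following data, for example: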
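To see concretely why resetting the parent alone fails, here is a small sketch of a plain union-find with a deliberately broken delete. `NaiveDSU`, `naiveDetach`, and the two helper functions are hypothetical names used only for this demonstration.

```cpp
#include <cassert>
#include <vector>

struct NaiveDSU {
    std::vector<int> parent;

    explicit NaiveDSU(int n) : parent(n) {
        for (int i = 0; i < n; ++i) parent[i] = i;
    }

    int find(int x) {
        if (parent[x] != x) parent[x] = find(parent[x]);  // path compression
        return parent[x];
    }

    // Attach the root of b under the root of a.
    void unite(int a, int b) {
        a = find(a);
        b = find(b);
        if (a != b) parent[b] = a;
    }

    // Broken "delete": only resets x's own parent pointer.
    void naiveDetach(int x) { parent[x] = x; }

    bool same(int a, int b) { return find(a) == find(b); }
};

// Failure 1: x is the root, so resetting its parent changes nothing.
bool naiveDeleteLeavesRootConnected() {
    NaiveDSU d(3);
    d.unite(0, 1);     // parent[1] = 0
    d.unite(0, 2);     // parent[2] = 0
    d.naiveDetach(0);  // 0 was already its own parent
    return d.same(0, 1);  // still reported as connected
}

// Failure 2: x is an inner node, so its child leaves together with it.
bool naiveDeleteDragsChildAlong() {
    NaiveDSU d(3);
    d.unite(1, 2);     // parent[2] = 1
    d.unite(0, 1);     // parent[1] = 0, chain is 2 -> 1 -> 0
    d.naiveDetach(1);  // parent[1] = 1, but 2 still points at 1
    return d.same(1, 2) && !d.same(0, 2);  // expected {0 2} {1}, got {0} {1 2}
}
```

Both helpers return true, meaning both wrong outcomes really occur: a deleted root stays connected to its children, and a deleted inner node takes its subtree with it.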
6 7
M 0 1
M 1 2
M 2 3
S 3
M 3 4
M 4 5
M 0 3

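In the end, position 3 is taken over by node 6 (N=6 in the problem means the ids range over 0~N-1, so 6 is the first unused id).
From then on, sending 3 to merge really sends 6 to merge, but the result is still counted under position 3.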
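The replacement idea can be sketched on its own as follows, replaying the 6 7 data above. This is a minimal illustration and not the full solution; `DeletableDSU`, `slot` (the role `vir` plays in the full code), `maxDeletes`, and `workedExampleAnswer` are names assumed for the sketch.

```cpp
#include <cassert>
#include <set>
#include <vector>

struct DeletableDSU {
    std::vector<int> parent;  // parent pointers over original and spare nodes
    std::vector<int> slot;    // slot[i]: node currently standing in for email i
    int n;                    // number of emails (positions)
    int nextFree;             // next spare node id, starts at n

    // maxDeletes bounds how many spare nodes can ever be needed.
    DeletableDSU(int emails, int maxDeletes)
        : parent(emails + maxDeletes), slot(emails), n(emails), nextFree(emails) {
        for (int i = 0; i < emails + maxDeletes; ++i) parent[i] = i;
        for (int i = 0; i < emails; ++i) slot[i] = i;
    }

    int find(int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];  // path halving
            x = parent[x];
        }
        return x;
    }

    // "M a b": merge the nodes that currently occupy positions a and b.
    void merge(int a, int b) {
        int ra = find(slot[a]), rb = find(slot[b]);
        if (ra != rb) parent[rb] = ra;
    }

    // "S x": abandon the old node in its tree and give x a fresh one.
    void detach(int x) {
        assert(nextFree < static_cast<int>(parent.size()));
        slot[x] = nextFree++;
    }

    // Count distinct roots over the current occupants of all positions.
    int countSets() {
        std::set<int> roots;
        for (int i = 0; i < n; ++i) roots.insert(find(slot[i]));
        return static_cast<int>(roots.size());
    }
};

// Replays the 6 7 example and returns the number of sets at the end.
int workedExampleAnswer() {
    DeletableDSU d(6, 7);
    d.merge(0, 1);
    d.merge(1, 2);
    d.merge(2, 3);
    d.detach(3);    // slot[3] becomes 6
    d.merge(3, 4);  // really merges node 6 with node 4
    d.merge(4, 5);
    d.merge(0, 3);  // really merges node 0 with node 6
    return d.countSets();
}
```

`workedExampleAnswer()` returns 1: the old node 3 is still hanging in the tree of {0 1 2}, but no position refers to it any more, so it does not affect the count.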

Code:

#include<cstring>
#include<cstdio>
using namespace std;
#define MM(x) memset(x,0,sizeof(x))
const int N=1100010;
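int pre[N],ran[N];//pre[x]: parent of node x; ran[x]: number of elements in the set rooted at x
int vir[N],mark[N];//vir[i]: node currently occupying position i; mark[r]: root r has already been counted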
void init()
{
    for (int i=0; i<N; i++)
    {
        pre[i]=i;
        ran[i]=1;
        vir[i]=i;
    }
    MM(mark);
}
int find(int n)
{
    if(n!=pre[n])
        return pre[n]=find(pre[n]);
    return pre[n];
}
void joint(int a,int b)
{
    int fa=find(a),fb=find(b);
    if(fa!=fb)
    {
        if(ran[fa]>=ran[fb])
        {
            ran[fa]+=ran[fb];
            pre[fb]=fa;
            ran[fb]=0;
        }
        else
        {
            ran[fb]+=ran[fa];
            pre[fa]=fb;
            ran[fa]=0;
        }
    }
}
int main(void)
{
    int n,m,i,j,a,b,c,k,cas=0;
    char ops[5];
    while (~scanf("%d%d",&n,&m)&&(n||m))
    {
        init();
        k=n;
        for (i=0; i<m; i++)
        {
            scanf("%s",ops);
            if(ops[0]=='M')
            {
                scanf("%d%d",&a,&b);
                joint(vir[a],vir[b]);//merge the nodes currently occupying these two positions
            }
            else
            {
                scanf("%d",&c);
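                ran[find(vir[c])]--;//the set holding c's current node loses one member (look up vir[c], not c)
                vir[c]=k;//position c is now occupied by a fresh, unrelated node
                pre[k]=k;//the fresh node starts out as its own root
                k++;//advance to the next unused node id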
            }
        }
        int r=0;
        for (i=0; i<n; i++)
        {
            int f=find(vir[i]);
            if(!mark[f])
            {
                r++;
                mark[f]=1;
            }   
        }
        printf("Case #%d: %d\n",++cas,r);
    }
    return 0;
}