HDU - 4416: Good Article Good sentence (suffix array)

Problem link: http://acm.split.hdu.edu.cn/showproblem.php?pid=4416



In middle school, teachers used to encourage us to pick up pretty sentences so that we could apply those sentences in our own articles. One of my classmates, ZengXiao Xian, wanted to get sentences which are different from those of others, because he thought the distinct pretty sentences might help him a lot to get a high score for his article. 
Assume that all of the sentences came from some articles. ZengXiao Xian intended to pick from Article A. The number of his classmates is n. The i-th classmate picked from Article Bi. Now ZengXiao Xian wants to know how many different sentences he could pick from Article A which don't belong to any of his classmates' articles. To simplify the problem, ZengXiao Xian wants to know how many different strings are substrings of string A but not substrings of any string Bi. Of course, you will help him, won't you? 
Input
The first line contains an integer T, the number of test data. 
For each test data 
The first line contains an integer n, the number of classmates. 
The second line is the string A. Each of the next n lines contains a string; the i-th of them is Bi. 
The length of string A does not exceed 100,000 characters, the total length of all strings Bi does not exceed 100,000, and all strings consist only of lowercase characters 'a' to 'z'. 
Output
For each case, print the case number and the number of substrings that ZengXiao Xian can find.
Sample Input
3
2
abab
ab
ba
1
aaa
bbb
2
aaaa
aa
aaa
Sample Output
Case 1: 3
Case 2: 3
Case 3: 1

Problem summary:

Given a string A and n strings B1..Bn, count the distinct substrings of A that are not a substring of any Bi.
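To make the task concrete, here is a minimal brute-force sketch of the counting definition. The function name `countBrute` is hypothetical and not part of the original post; it is far too slow for the real limits and is only meant as a reference for checking small cases such as the samples.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): brute-force reference.
// Counts distinct substrings of a that occur in none of the strings in bs.
// Only suitable for tiny inputs: it enumerates O(|a|^2) substrings.
long long countBrute(const std::string& a, const std::vector<std::string>& bs) {
    std::set<std::string> seen;  // distinct substrings of a examined so far
    long long count = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        for (std::size_t len = 1; i + len <= a.size(); ++len) {
            std::string sub = a.substr(i, len);
            if (!seen.insert(sub).second) continue;  // duplicate inside a
            bool inB = false;
            for (const std::string& b : bs) {
                if (b.find(sub) != std::string::npos) { inB = true; break; }
            }
            if (!inB) ++count;
        }
    }
    return count;
}
```

For the first sample, `countBrute("abab", {"ab", "ba"})` counts `aba`, `bab` and `abab`.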

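Solution:

Substring problems naturally suggest a suffix array. Concatenate A and the n strings Bi, separated by pairwise distinct special characters, then build the suffix array and the height array. If sa[i] is less than the length of A, the suffix at rank i starts inside A; otherwise it starts inside some Bi (or at a separator). Two quantities are needed: 1) the substrings that occur both in A and in some Bi, and 2) the substrings that are repeated inside A. For the suffix of A starting at position i, how many of its prefixes also occur in B? That is its LCP with the nearest B suffix in rank order. Sweep the ranks once forward and once backward, and for each A suffix keep the larger of the two values. Then handle the repeats inside A: that is the LCP of two A suffixes that are adjacent in rank. For each A suffix the code keeps the maximum of these values in pos[], and the answer is the number of all substrings of A, len*(len+1)/2, minus the sum of pos[], which removes both the substrings shared with B and the repeats inside A. Mind the details, which are marked in the code.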
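The counting steps above can be sketched on a naively built suffix array. This is an illustration under stated assumptions, not the post's implementation: the function name `countWithSortedSuffixes` is hypothetical, A is assumed non-empty, and the suffixes are sorted by direct comparison (quadratic in the worst case), so it only suits small inputs. The encoding (letters as 1..26, separators starting at 30, a final 0) mirrors the listing that follows.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): same counting idea,
// but with a naive suffix sort instead of the doubling algorithm.
long long countWithSortedSuffixes(const std::string& a,
                                  const std::vector<std::string>& bs) {
    // Build the text: A, then each B preceded by its own unique separator.
    std::vector<int> text;
    for (char c : a) text.push_back(c - 'a' + 1);
    int sep = 30;
    for (const std::string& b : bs) {
        text.push_back(sep++);
        for (char c : b) text.push_back(c - 'a' + 1);
    }
    text.push_back(0);  // unique smallest terminator
    const int n = static_cast<int>(text.size());
    const int lenA = static_cast<int>(a.size());

    // Naive suffix array: sort start positions by comparing the suffixes.
    std::vector<int> sa(n);
    for (int i = 0; i < n; ++i) sa[i] = i;
    std::sort(sa.begin(), sa.end(), [&](int x, int y) {
        return std::lexicographical_compare(text.begin() + x, text.end(),
                                            text.begin() + y, text.end());
    });

    // height[i] = LCP of the suffixes ranked i-1 and i.
    std::vector<int> height(n, 0);
    for (int i = 1; i < n; ++i) {
        int x = sa[i - 1], y = sa[i], k = 0;
        while (x + k < n && y + k < n && text[x + k] == text[y + k]) ++k;
        height[i] = k;
    }

    const int INF = 0x3f3f3f3f;
    std::vector<int> pos(lenA, 0);  // prefixes of each A suffix to discard
    int run = INF;
    // Forward sweep: LCP with the nearest non-A suffix ranked earlier.
    for (int i = 0; i < n; ++i) {
        if (sa[i] < lenA) {
            run = std::min(run, height[i]);
            pos[sa[i]] = std::max(pos[sa[i]], run);
        } else {
            run = INF;
        }
    }
    // Backward sweep: LCP with the nearest non-A suffix ranked later.
    run = INF;
    for (int i = n - 2; i >= 0; --i) {
        if (sa[i] < lenA) {
            run = std::min(run, height[i + 1]);
            pos[sa[i]] = std::max(pos[sa[i]], run);
        } else {
            run = INF;
        }
    }
    // Repeats inside A: LCP of two A suffixes adjacent in rank.
    for (int i = 1; i < n; ++i) {
        if (sa[i] < lenA && sa[i - 1] < lenA) {
            pos[sa[i - 1]] = std::max(pos[sa[i - 1]], height[i]);
        }
    }

    long long ans = static_cast<long long>(lenA) * (lenA + 1) / 2;
    for (int v : pos) ans -= v;
    return ans;
}
```

For A = abab with B = {ab, ba}, pos ends up as 2, 2, 2, 1 for the suffixes starting at positions 0 to 3, so the result is 10 - 7 = 3, matching the first sample.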



#include <iostream>
#include <cstdio>
#include <algorithm>
#include <cmath>
#include <cstring>
#define INF 0x3f3f3f3f
using namespace std;
const int MAXN  = 3e5+100;
int wa[MAXN],wb[MAXN],wv[MAXN];
int Ws[MAXN];
int sa[MAXN],Rank[MAXN],height[MAXN];
int cmp(int *r,int a,int b,int l){
    return r[a]==r[b] && r[a+l]==r[b+l];
}
void SA(int *r,int n,int m){
    int *x = wa,*y=wb;
    for(int i = 0;i<m;i++) Ws[i] = 0;
    for(int i = 0;i<n;i++)++Ws[x[i]=r[i]];
    for(int i = 1;i<m;i++)Ws[i]+=Ws[i-1];
    for(int i = n-1;i>=0;i--)sa[--Ws[x[i]]] = i;
    int p = 1;
    for(int j = 1;p<n;j<<=1,m=p){
        p = 0;
        for(int i = n-j;i<n;i++)y[p++] = i;
        for(int i = 0;i<n;i++)if(sa[i]>=j)y[p++]=sa[i]-j;
        for(int i = 0;i<n;i++)wv[i] = x[y[i]];
        for(int i = 0;i<m;++i)Ws[i] = 0;
        for(int i = 0;i<n;i++)++Ws[wv[i]];
        for(int i = 1;i<m;i++)Ws[i] +=Ws[i-1];
        for(int i = n-1;i>=0;--i)sa[--Ws[wv[i]]] = y[i];
        swap(x,y);
        x[sa[0]] = 0;
        p = 1;
        for(int i =1;i<n;i++){
            x[sa[i]] = cmp(y,sa[i-1],sa[i],j)?p-1:p++;
        }
    }
    for(int i = 1;i<n;i++)Rank[sa[i]] = i;
    int k = 0;
    for(int i = 0;i<n-1;height[Rank[i++]] = k){
        if(k)--k;
        for(int j =sa[Rank[i]-1];r[i+k]==r[j+k];++k);
    }
}
char s[100005];
int r[MAXN],pos[MAXN];
int main(){
    int T,w = 0;
    scanf("%d",&T);
    while(T--){
        int n;
        // top: length of the concatenated text so far; z: next separator value (letters use 1..26)
        int top = 0,z = 30;
        scanf("%d",&n);
        scanf("%s",s);
        int len = strlen(s);
        for(int i  = 0;i<len;i++){
            r[top++] = s[i]-'a'+1;
        }
        for(int i = 0;i<n;i++){
            r[top++] = z++;
            scanf("%s",s);
            int le = strlen(s);
            for(int j = 0;j<le;j++){
                r[top++] = s[j]-'a'+1;
            }
        }
        r[top] = 0;
        SA(r,top+1,z+10);
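        // Forward sweep: temp is the LCP of the current A suffix with the nearest non-A suffix ranked before it
        int temp = INF;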
        memset(pos,0,sizeof(pos));
        for(int i = 0;i<=top;i++){
            if(sa[i]<len){
                temp = min(temp,height[i]);
                pos[sa[i]] = max(temp,pos[sa[i]]);
            }
            else temp = INF;
        }
        // Backward sweep: the same, against the nearest non-A suffix ranked after it
        temp = INF;
        for(int i = top-1;i>=0;i--){
            if(sa[i]<len){
                temp = min(temp,height[i+1]);
                pos[sa[i]] = max(temp,pos[sa[i]]);
            }
            else temp = INF;
        }
        for(int i = 1;i<=top;i++){
            if(sa[i]<len&&sa[i-1]<len){
                pos[sa[i-1]] = max(pos[sa[i-1]],height[i]);
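                /// A detail: this update has to go to pos[sa[i-1]], not to pos[sa[i]],
                /// because after the backward sweep pos[sa[i]] may already be larger than height[i], e.g. in the rank order (B:ab)(A:abc)(A:abcabc)(B:abcad), where "A:" marks a suffix of A,
                /// whereas pos[sa[i-1]] was updated from min(temp,height[i]) and is therefore at most height[i].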
            }
        }
        long long ans = (long long)len*(len+1)>>1;
        for(int i = 0;i<=top;i++){
            ans-=pos[i];
        }
        printf("Case %d: %lld\n",++w,ans);
    }
    return 0;
}


 

