HDU 3336 Count the string: Data Structures + Extended KMP

黄金四叶草

Category: ACM-Strings

Article link: https://blog.csdn.net/fans_ac/article/details/52003758

Count the string

Time Limit:1000MS Memory Limit:32768KB 64bit IO Format:%I64d & %I64u

Description

It is well known that AekdyCoin is good at string problems as well as number theory problems. When given a string s, we can write down all the non-empty prefixes of this string. For example:
s: "abab"
The prefixes are: "a", "ab", "aba", "abab"
For each prefix, we can count the times it matches in s. So we can see that prefix "a" matches twice, "ab" matches twice too, "aba" matches once, and "abab" matches once. Now you are asked to calculate the sum of the match times for all the prefixes. For "abab", it is 2 + 2 + 1 + 1 = 6.
The answer may be very large, so output the answer mod 10007.

Input

The first line is a single integer T, indicating the number of test cases.
For each case, the first line is an integer n (1 <= n <= 200000), which is the length of string s. A line follows giving the string s. The characters in the strings are all lower-case letters.

Output

For each case, output only one number: the sum of the match times for all the prefixes of s mod 10007.

Sample Input

2
4
abab 
8
abababab

Sample Output

6 
20
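
Before looking at the solution, the definition in the Description can be checked directly with a brute-force sketch. This is only a reference for small inputs (it is far too slow for n = 200000), and the function name `countPrefixMatchesBrute` is a hypothetical helper, not something from the problem or the original solution.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of the original solution).
// For every non-empty prefix of s, count how many times it occurs in s
// (overlapping occurrences included) and return the total modulo 10007.
int countPrefixMatchesBrute(const std::string& s) {
    const int MOD = 10007;
    int total = 0;
    for (std::size_t len = 1; len <= s.size(); ++len) {
        const std::string prefix = s.substr(0, len);
        // Restart the search one position after each hit so that
        // overlapping matches are counted as well.
        for (std::size_t pos = s.find(prefix); pos != std::string::npos;
             pos = s.find(prefix, pos + 1)) {
            total = (total + 1) % MOD;
        }
    }
    return total;
}
```

Calling `countPrefixMatchesBrute("abab")` gives 6 and `countPrefixMatchesBrute("abababab")` gives 20, the two sample answers.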

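Approach: the problem statement is quite clear: count how many times each prefix occurs in the original string and output the sum of those counts.
At first glance this looks like an application of the failure function (the next array) from the KMP algorithm.
On closer inspection, though, it does not seem that simple: the plain failure function does not appear to give the counts directly, because a naive tally counts many occurrences more than once.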
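
For comparison, the double counting can be avoided with a small recurrence over the failure function. This is a commonly used alternative and not the approach taken in this post; the sketch below assumes the convention that `fail[i]` is the length of the longest proper border of the first i characters, and the names `countPrefixMatchesByFailure`, `fail` and `cnt` are hypothetical.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not the method used in this post).
// fail[i]: length of the longest proper border of s[0..i-1].
// cnt[i]:  number of prefixes of s that end exactly at index i-1, i.e. the
//          prefix of length i itself plus every prefix on its border chain.
int countPrefixMatchesByFailure(const std::string& s) {
    const int MOD = 10007;
    const int n = static_cast<int>(s.size());
    std::vector<int> fail(n + 1, 0), cnt(n + 1, 0);
    for (int i = 1, k = 0; i < n; ++i) {
        while (k > 0 && s[i] != s[k]) k = fail[k];
        if (s[i] == s[k]) ++k;
        fail[i + 1] = k;
    }
    int total = 0;
    for (int i = 1; i <= n; ++i) {
        // Every occurrence is attributed to the position where it ends,
        // so nothing is counted twice.
        cnt[i] = (cnt[fail[i]] + 1) % MOD;
        total = (total + cnt[i]) % MOD;
    }
    return total;
}
```

For "abab" the `cnt` values are 1, 1, 2, 2, which again sum to 6.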

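I only learned after asking others that this problem is best approached with the extended KMP algorithm:
1. Use extended KMP to compute the extend array; extend[i] is the number of characters starting at position i that match the prefix of the string.
2. The array can be computed straight from a template; being comfortable with ordinary KMP is basically enough to follow it.
3. The answer is simply the sum of the non-zero values in the extend array.
Example: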

    
    下标 0 1 2 3 4 5 6 7 8
字符串 a b a b a b a b \n
失配/next数组 0 0 0 1 2 3 4 5 6
extend数组 8 0 6 0 4 0 2 0 0
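The table above shows the difference between the two: the extend array records how many characters starting at position i match the prefix of the string.
Once that definition is firmly in mind, the solution follows.
A string has exactly as many non-empty prefixes as its length.
So if x characters starting at position i match the prefix, the length-x substring starting at i contains x prefixes of the original string, each beginning at i (lengths 1 through x), and each one is an occurrence to count.
Summing all the non-zero values therefore gives the answer (8 + 6 + 4 + 2 = 20 for "abababab", matching the sample).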
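
The definition above can be written down literally as a quadratic sketch, which is handy for checking a faster implementation on small strings. The names `naiveExtend` and `sumExtend` are hypothetical, and this is a naive stand-in for illustration, not the linear-time algorithm.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: extend[i] = length of the longest common prefix of
// s[i..] and s, computed by direct comparison (O(n^2) in the worst case).
std::vector<int> naiveExtend(const std::string& s) {
    std::vector<int> extend(s.size(), 0);
    for (std::size_t i = 0; i < s.size(); ++i) {
        std::size_t j = 0;
        while (i + j < s.size() && s[i + j] == s[j]) ++j;
        extend[i] = static_cast<int>(j);
    }
    return extend;
}

// Sum of all extend values modulo 10007 (zeros contribute nothing).
int sumExtend(const std::string& s) {
    const int MOD = 10007;
    int total = 0;
    for (int v : naiveExtend(s)) total = (total + v % MOD) % MOD;
    return total;
}
```

For "abababab" this produces the extend row of the table, 8 0 6 0 4 0 2 0, and a sum of 20.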

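#include<iostream>
#include<cstring>
#include<cstdio>
using namespace std;
const int N = 200005;
int Next[N]; // Next[i]: length of the longest common prefix of T[i..] and T
// Extended KMP on T itself: fills Next[] for every start position.
void getNext(char *T) {
    int i,length = strlen(T);
    Next[0] = length; // the whole string matches itself
    for(i = 0; i<length-1 && T[i]==T[i+1]; i++); // Next[1] by direct comparison
    Next[1] = i;
    int a = 1; // start of the match that currently reaches furthest to the right
    for(int k = 2; k < length; k++) {
        int p = a+Next[a]-1, L = Next[k-a]; // p: rightmost matched index; L: value at the mirrored position
        if((k-1)+L>=p) { // mirrored match reaches p (or k is past p): keep comparing
            int j = (p-k+1)>0? (p-k+1) : 0;
            while(k+j<length && T[k+j]==T[j]) j++; // compare T[p+1..] against T[p-k+1..]
            Next[k] = j, a = k;
        } else Next[k] = L; // mirrored match ends before p: reuse it unchanged
    }
}
int main() {
    int T ;
    char t[N];
    int n ;
    scanf("%d",&T);
    while(T--) {
        scanf("%d%s",&n,t); // length n, then the string s
        getNext(t);
        int ans = 0 ;
        for(int i=0;i<n;i++){
            ans=(ans+Next[i])%10007 ; // Next[i] = number of prefixes that occur starting at i
        }
        printf("%d\n",ans%10007);
    }
    return 0;
}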
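
The same array is often written in the "Z-function" form with an explicit window [l, r) for the match that reaches furthest right. The sketch below is that common formulation using std::vector, not the code from this post; `zFunction` and `countPrefixMatches` are hypothetical names, and z[0] is set to the string length to match the convention used here.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: z[i] = length of the longest common prefix of
// s[i..] and s, computed in linear time.
std::vector<int> zFunction(const std::string& s) {
    const int n = static_cast<int>(s.size());
    std::vector<int> z(n, 0);
    if (n == 0) return z;
    z[0] = n; // convention of this post: the whole string matches itself
    for (int i = 1, l = 0, r = 0; i < n; ++i) {
        // Inside the window, reuse the value at the mirrored position i - l,
        // capped so that it does not run past the window.
        if (i < r) z[i] = std::min(r - i, z[i - l]);
        // Extend by direct comparison beyond what is already known.
        while (i + z[i] < n && s[z[i]] == s[i + z[i]]) ++z[i];
        // Move the window if this match reaches further right.
        if (i + z[i] > r) { l = i; r = i + z[i]; }
    }
    return z;
}

// Answer to the problem: sum of all z values modulo 10007.
int countPrefixMatches(const std::string& s) {
    const int MOD = 10007;
    int total = 0;
    for (int v : zFunction(s)) total = (total + v) % MOD;
    return total;
}
```

Here `l` plays the role of `a` in the code above and `r - 1` the role of `p`; the two versions compute the same values.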


Here is a nice template, written by some expert online. I find it handy and am sharing it here; the next time extend is needed, it can be used as is.

// C/C++ template
#include<iostream>
#include<cstring>
#include<cstdio>
using namespace std;
const int N = 101010;
// Named Next rather than next: with "using namespace std", a global "next"
// is ambiguous with std::next on C++11 and later compilers.
int Next[N],extand[N];
void getnext(char *T){ // Next[i]: longest common prefix of the suffix T[i..] and T
    int i,length = strlen(T);
    Next[0] = length;
    for(i = 0;i<length-1 && T[i]==T[i+1]; i++);
    Next[1] = i;
    int a = 1;
    for(int k = 2; k < length; k++){
        int p = a+Next[a]-1, L = Next[k-a];
        if( (k-1)+L >= p ){
            int j = (p-k+1)>0? (p-k+1) : 0;
            while(k+j<length && T[k+j]==T[j]) j++; // compare T[p+1..] against T[p-k+1..]
            Next[k] = j, a = k;
        }
        else Next[k] = L;
    }
}
void getextand(char *S,char *T){ // extand[i]: longest common prefix of the suffix S[i..] and T
    memset(Next,0,sizeof(Next));
    getnext(T);
    int Slen = strlen(S), Tlen = strlen(T), a = 0;
    int MinLen = Slen>Tlen?Tlen:Slen;
    while(a<MinLen && S[a]==T[a]) a++;
    extand[0] = a, a = 0;
    for(int k = 1; k < Slen; k++){
        int p = a+extand[a]-1, L = Next[k-a];
        if( (k-1)+L >= p ){
            int j = (p-k+1)>0? (p-k+1) : 0;
            while(k+j<Slen && j<Tlen && S[k+j]==T[j] ) j++;
            extand[k] = j;a = k;
        }
        else extand[k] = L;
    }
}

int main(){
    char s[N],t[N];
    while(~scanf("%s %s",s,t)){ // text string, then pattern string
        getextand(s,t);
        int tlen = strlen(t), slen = strlen(s);
        // Next[]: longest common prefix of every suffix of t with t itself
        for(int i = 0; i < tlen; i++)
            printf("%d ",Next[i]);
        puts("");
        // extand[]: longest common prefix of every suffix of s with t
        for(int i = 0; i < slen; i++)
            printf("%d ",extand[i]);
        puts("");
    }
    return 0;
}
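
One typical use of the two-string array is locating every occurrence of the pattern in the text: T occurs at position k of S exactly when the common prefix at k covers all of T. The sketch below illustrates only that idea; to stay short it uses a naive quadratic stand-in for getextand, and the names `naiveExtand` and `findOccurrences` are hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical naive stand-in for getextand(S, T):
// ext[k] = length of the longest common prefix of S[k..] and T.
std::vector<int> naiveExtand(const std::string& S, const std::string& T) {
    std::vector<int> ext(S.size(), 0);
    for (std::size_t k = 0; k < S.size(); ++k) {
        std::size_t j = 0;
        while (k + j < S.size() && j < T.size() && S[k + j] == T[j]) ++j;
        ext[k] = static_cast<int>(j);
    }
    return ext;
}

// Start positions of all (possibly overlapping) occurrences of T in S.
std::vector<int> findOccurrences(const std::string& S, const std::string& T) {
    std::vector<int> hits;
    if (T.empty()) return hits; // an empty pattern is not treated as a match here
    const std::vector<int> ext = naiveExtand(S, T);
    for (std::size_t k = 0; k < ext.size(); ++k) {
        if (ext[k] == static_cast<int>(T.size())) hits.push_back(static_cast<int>(k));
    }
    return hits;
}
```

With the template above, the same test would be `extand[k] == Tlen` after calling `getextand(s, t)`.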
