hdu 3336 Count the string 用心写的题解

最新推荐文章于 2020-09-15 15:18:49 发布

fanesemyk

最新推荐文章于 2020-09-15 15:18:49 发布

阅读量525

点赞数 3

分类专栏：字符串

本文链接：https://blog.csdn.net/fanesemyk/article/details/53086534

版权

字符串专栏收录该内容

5 篇文章 0 订阅

订阅专栏

Problem Description
It is well known that AekdyCoin is good at string problems as well as number theory problems. When given a string s, we can write down all the non-empty prefixes of this string. For example:
s: “abab”
The prefixes are: “a”, “ab”, “aba”, “abab”
For each prefix, we can count the times it matches in s. So we can see that prefix “a” matches twice, “ab” matches twice too, “aba” matches once, and “abab” matches once. Now you are asked to calculate the sum of the match times for all the prefixes. For “abab”, it is 2 + 2 + 1 + 1 = 6.
The answer may be very large, so output the answer mod 10007.

Input
The first line is a single integer T, indicating the number of test cases.
For each case, the first line is an integer n (1 <= n <= 200000), which is the length of string s. A line follows giving the string s. The characters in the strings are all lower-case letters.

Output
For each case, output only one number: the sum of the match times for all the prefixes of s mod 10007.

Sample Input
1
4
abab

Sample Output
6

Author
foreverlin@HNU

Source
HDOJ Monthly Contest – 2010.03.06

Recommend
lcy

首先题目大意很好搞懂，就是给一个字符串s，求它的所有前缀的匹配次数。这道题做了两个星期，特此来写个题解。
首先最智障的方法，对每一个前缀都kmp一次求匹配次数。这么做当然会超时，但能给我们一些思考。
然后我们试着优化算法。第一个比较容易想到的优化是剪枝：当s[I]的前缀在s中的匹配次数为1时，则对于任何j>I,s【j】的前缀在s中的匹配次数都是1.但是这个剪枝遇到形如’aaaaa’字符串就失效了。

知识补充：自匹配：对一个字符串，前k个字符和后k个字符一样，就称其构成了一个长度为k的自匹配。
真前缀：前缀不算自己。

接着继续优化。这里贴一篇我觉得很好的博文的内容：(以下内容为转载）（斜体加粗内容为补充）
要理解next数组的含义：next【i】表示s【i】的真前缀中，最长自匹配的长度，具体见下面例子。
根据next数组，找到一个匹配的后缀ans＋1，接着对next回溯一直到为0，相当于把子串里蕴含的前缀子串也找出来，举个例子：
s a b a b a
i 0 1 2 3 4 5
next[i] -1 0 0 1 2 3
cnt[0]=0;(无真前缀）
cnt[1]=1;(自身）
cnt[2]=1;（自身）
cnt[3]=2;（“aba”中“a”构成了自匹配）
cnt[4]=2;（“abab”中“ab”构成了自匹配）
cnt[5]=3;（“ababa”中“a”“aba”构成了自匹配）
ans=9;

具体的，拿next【5】这个字符举个例子。首先，next【5】==3，说明了s【5】的一个真前缀：s【0】到s【4】中，有一个长度为3的最长自匹配。也就是s【0】==s【2】，s【1】==s【3】，s【2】==s【4】。那么很显然，s【0】到s【2】这个长度为3的子串，也就是‘aba’，在s【2】到s【4】出现了一次。

那么，接下来，也是这道题最关键的部分，我们如何算出答案呢？我们考虑对于每一个字符，都对它的真前缀做一个扫描，计算所有以这个字符前一个字符（因为是真前缀）结尾的匹配个数，记作cnt【i】。或者简单的说，cnt【i】就是它的真前缀中自匹配的个数。注意，这里的匹配指的是所有的可行的匹配。比如，对于字符串“ababa”，在考虑cnt【4】时，就要计算”a” “ab” “aba” “abab”分别的，以s【3】也就是b结尾的匹配个数。最后，我们把这些匹配个数一加，就是所求的匹配个数了。具体的，就是cnt【4】=0+1+0+1=2
那么这个和next表又有什么关系呢？还是拿上面的例子。对于next【5】==3，我们知道了在s【5】真前缀中有一个长度为3的最长自匹配。那么，我们通过next【3】==1知道，在s【3】真前缀中有一个长度为1的最长自匹配。那么想象一下，这个长度为1的自匹配是否也出现在了s【5】的真前缀中并且以s【4】结尾？

答案是肯定的。具体的，“aba”是“ababa”的一个最长自匹配，“a”是“aba”的一个最长自匹配。那么，“a”是“ababa”的一个自匹配。

或者说，“a”是cnt【5】（虚构的哨兵）中的一项。
这时候我们再翻回去看一下cnt【】，是不是懂了呢？
我们就可以得到如下代码：

for(int i = 1; i <= len; i++)  
        {  
            int tmp = next[i];  
            while(tmp)  
            {  
                ans = (ans + 1) % MOD;  
                tmp = next[tmp];  
            }  
        }

但是，聪明的同学会发现，这段代码其实仍有可能超时。它的复杂度就是答案。
我们可以将其改进成为dp

dp[0]=0;
        for(int i=1;i<=n;i++)
        {
            dp[i]=(dp[N[i]]+1)%10007;
            sum=(sum+dp[i])%10007;
        }

这就是这个程序的主体框架了。
最后，感谢卿爷，感谢TCgogogo的博客，他的博客让我彻底搞懂这道题：

http://blog.csdn.net/tc_to_top/article/details/43730479

，感谢邓俊辉老师的数据结构公开课和数据结构教材。

fanesemyk

关注

3
点赞
踩
3

收藏

觉得还不错? 一键收藏
0
评论
hdu 3336 Count the string 用心写的题解

Problem Description It is well known that AekdyCoin is good at string problems as well as number theory problems. When given a string s, we can write down all the non-empty prefixes of this string. Fo
复制链接

扫一扫