Consider the string s to be the infinite wraparound string of "abcdefghijklmnopqrstuvwxyz", so s will look like this: "...zabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcd....".
Now we have another string p. Your job is to find out how many unique non-empty substrings of p are present in s. In particular, your input is the string p and you need to output the number of different non-empty substrings of p in the string s.
Note: p consists of only lowercase English letters and the size of p might be over 10000.
Example 1:
Input: "a"
Output: 1
Explanation: Only the substring "a" of string "a" is in the string s.
Example 2:
Input: "cac"
Output: 2
Explanation: There are two substrings "a", "c" of string "cac" in the string s.
Example 3:
Input: "zab"
Output: 6
Explanation: There are six substrings "z", "a", "b", "za", "ab", "zab" of string "zab" in the string s.
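As a sanity check on the examples above, here is a minimal brute-force sketch. It assumes only that a string occurs in s exactly when each character is the alphabetical successor of the one before it (with 'z' followed by 'a'). The names BruteForceCheck, isWraparound and countUnique are hypothetical and chosen for illustration; this is far too slow for p of length 10000 and is meant only for checking small inputs.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper class for illustration; not the solution to submit.
class BruteForceCheck {
    // True when every character is the successor of the one before it ('z' wraps to 'a').
    static boolean isWraparound(String t) {
        for (int i = 1; i < t.length(); i++) {
            if ((t.charAt(i - 1) - 'a' + 1) % 26 != t.charAt(i) - 'a') {
                return false;
            }
        }
        return true;
    }

    // Counts the distinct non-empty substrings of p that occur in s.
    static int countUnique(String p) {
        Set<String> seen = new HashSet<>();
        for (int i = 0; i < p.length(); i++) {
            for (int j = i + 1; j <= p.length(); j++) {
                String t = p.substring(i, j);
                if (!isWraparound(t)) {
                    break; // any longer substring starting at i fails as well
                }
                seen.add(t);
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(countUnique("a"));   // 1
        System.out.println(countUnique("cac")); // 2
        System.out.println(countUnique("zab")); // 6
    }
}
```

The set removes duplicates directly, which is exactly the work a faster approach has to avoid doing explicitly.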
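Problem analysis: At first I was stuck on how to enumerate all substrings, which leads to an inclusion-exclusion problem because the same substring can occur several times. I soon realized this was the wrong direction: the problem becomes easy once the overlap is avoided. The idea I ended up with works on suffixes: for each of the 26 letters, find the maximum length of a consecutive substring of p (one that follows the wraparound alphabet order) ending with that letter; the sum of these lengths is the answer. Why is this the answer? First, the maximum length is exactly the number of suffixes of that longest substring. Take a concrete example: abcbcdcde
The longest consecutive substrings ending with a, b, c, d, e are a, ab, abc, bcd, cde respectively, so the answer is 1 + 2 + 3 + 3 + 3 = 12. The five substrings all end with different characters, so all of their suffixes are distinct from one another. It remains to show that nothing is missed. A direct proof is awkward, so argue by contradiction: suppose some valid substring t was missed, and let its last character be '?'. Then '?' must be one of a, b, c, d, e. A consecutive string is fully determined by its last character and its length, and t cannot be longer than the longest consecutive substring ending with '?', so t must be one of that substring's suffixes, which contradicts the assumption.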
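To make the counting argument concrete, the following sketch lists every suffix of the five longest runs from the abcbcdcde example. The runs are hardcoded from the text rather than computed, and the names SuffixEnumeration and suffixesOf are hypothetical. It only illustrates that the suffixes are pairwise distinct and that there are 12 of them.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper class for illustration only.
class SuffixEnumeration {
    // Collects every non-empty suffix of every run; a set would drop duplicates if any existed.
    static Set<String> suffixesOf(String[] runs) {
        Set<String> out = new LinkedHashSet<>();
        for (String run : runs) {
            for (int start = 0; start < run.length(); start++) {
                out.add(run.substring(start));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Longest consecutive substrings of "abcbcdcde" ending with a, b, c, d, e.
        String[] runs = {"a", "ab", "abc", "bcd", "cde"};
        Set<String> all = suffixesOf(runs);
        System.out.println(all);        // [a, ab, b, abc, bc, c, bcd, cd, d, cde, de, e]
        System.out.println(all.size()); // 12
    }
}
```

Because the set ends up with exactly 1 + 2 + 3 + 3 + 3 = 12 entries, no suffix was produced twice, which matches the distinctness claim in the argument.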
public class Solution {
    public int findSubstringInWraproundString(String p) {
        int ans = 0, tmp = 0;
        // dp[c] = length of the longest consecutive run in p that ends with letter 'a' + c
        int[] dp = new int[26];
        for (int i = 0; i < p.length(); i++) {
            char cur = p.charAt(i);
            // Extend the run when cur directly follows the previous character ('z' wraps to 'a')
            if (i > 0 && (p.charAt(i - 1) - 'a' + 1) % 26 == cur - 'a') {
                tmp++;
            } else {
                tmp = 1;
            }
            dp[cur - 'a'] = Math.max(dp[cur - 'a'], tmp);
        }
        // Each run of length k ending with a letter contributes k distinct substrings
        for (int i = 0; i < 26; i++) {
            ans += dp[i];
        }
        return ans;
    }
}