[Algorithm Notes] Suffix Arrays: From Beginner to Advanced

Suffix arrays

For a string $S$, we write $n$ for its length and number its positions starting from $1$. $S(l, r)$ denotes the substring that starts at position $l$ and ends at position $r$. Equality and inequality operators on strings compare them lexicographically, and addition of strings means concatenating them in order.

What is a suffix array?

A suffix array, usually abbreviated as $sa[]$ in OI (Olympiad in Informatics) circles, is an array built for a string $S$ such that $\forall\, 1 \le i < n,\ S(sa[i], n) < S(sa[i + 1], n)$. In other words, $sa[i]$ is the starting position of the $i$-th smallest suffix. For example, for the string $\text{ababa}$, $sa[] = \{5, 3, 1, 4, 2\}$. The doubling method builds the suffix array in $O(n \log n)$ time. Another fast construction is the $\mathtt{DC3}$ algorithm, which runs in linear time. The doubling method is the more elegant of the two and its code is shorter, so this article covers only doubling.
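To make the definition concrete, here is a minimal sketch that builds the suffix array by sorting the suffixes directly. The function name `naive_suffix_array` is made up for this illustration, and the approach is far slower than doubling; it is only meant for checking small inputs by hand.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <string>
#include <vector>

// Naive construction: sort the start positions by comparing the suffixes directly.
// Returns a 1-indexed array sa[1..n]; sa[0] is unused.
// Worst case O(n^2 log n), so this is only a reference for small inputs.
std::vector<int> naive_suffix_array(const std::string& s) {
    int n = static_cast<int>(s.size());
    std::vector<int> sa(n + 1, 0);
    std::iota(sa.begin() + 1, sa.end(), 1);  // positions 1..n
    std::sort(sa.begin() + 1, sa.end(), [&](int x, int y) {
        // compare S(x, n) with S(y, n); the std::string itself is 0-indexed
        return s.compare(x - 1, std::string::npos, s, y - 1, std::string::npos) < 0;
    });
    return sa;
}
```

For the string ababa this returns the positions 5, 3, 1, 4, 2 in `sa[1..5]`, matching the example above.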

Building the suffix array by doubling

Idea

Define the rank array $rnk[]$: $rnk[i]$ is the rank of the suffix starting at $i$, that is $S(i, n)$, among all suffixes of $S$. For example, for the string $\text{ababa}$, $rnk[] = \{3, 5, 2, 4, 1\}$. Clearly $\forall\, 1 \le i \le n,\ sa[rnk[i]] = i$. The idea of doubling is to take the rank array $rnk_k$ of the strings $S(i, \min\{n, i + 2^{k} - 1\})\ (1 \le i \le n)$ and derive from it the rank array $rnk_{k + 1}$ of the strings $S(i, \min\{n, i + 2^{k + 1} - 1\})\ (1 \le i \le n)$.
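The relation $sa[rnk[i]] = i$ says that $rnk[]$ is the inverse permutation of $sa[]$. A small sketch of that conversion, with the helper name `rank_from_sa` chosen only for this example:

```cpp
#include <cassert>
#include <vector>

// Given a 1-indexed suffix array sa[1..n] (sa[0] unused), build rnk[1..n]
// with rnk[sa[i]] = i, which is equivalent to sa[rnk[i]] = i.
std::vector<int> rank_from_sa(const std::vector<int>& sa) {
    int n = static_cast<int>(sa.size()) - 1;
    std::vector<int> rnk(n + 1, 0);
    for (int i = 1; i <= n; i++) {
        assert(1 <= sa[i] && sa[i] <= n);  // sa must be a permutation of 1..n
        rnk[sa[i]] = i;
    }
    return rnk;
}
```

Applied to the suffix array of ababa (5, 3, 1, 4, 2) it produces the ranks 3, 5, 2, 4, 1.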

Observe that $S(i, i + 2^k - 1) + S(i + 2^k, i + 2^{k + 1} - 1) = S(i, i + 2^{k + 1} - 1)$. So sorting all pairs $(rnk_k[i], rnk_k[i + 2^k])$ yields $sa_{k + 1}$. The sort can be a radix sort: sort by the second key first, then stably by the first key. One detail: when $i + 2^k > n$, treat $rnk_k[i + 2^k]$ as $-\infty$, so that an empty second half sorts before everything else. Once $sa_{k + 1}$ is known, $rnk_{k + 1}$ follows from it. Note that we cannot simply set $rnk_{k + 1}[sa_{k + 1}[i]] = i$, because values of $rnk_{k + 1}$ may repeat, so it is not necessarily a permutation. Instead we derive $rnk_{k + 1}[sa_{k + 1}[i + 1]]$ from $rnk_{k + 1}[sa_{k + 1}[i]]$: we only need to check whether the strings of length $2^{k + 1}$ starting at these two positions are identical, which amounts to comparing their pairs. There are $O(\log n)$ rounds of $O(n)$ work each, so the total running time is $O(n \log n)$.
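The pair-sorting step can be sketched compactly if the radix sort is replaced by `std::sort`. This swap is a simplification made for the example, not the method described in this article: it costs an extra log factor, giving $O(n \log^2 n)$. The function name `doubling_suffix_array` is hypothetical, and an empty second half gets rank 0 instead of $-\infty$, which works because real ranks start at 1.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <string>
#include <utility>
#include <vector>

// Prefix doubling with std::sort on (first key, second key) pairs.
// Returns the 1-indexed suffix array sa[1..n]; sa[0] is unused.
std::vector<int> doubling_suffix_array(const std::string& s) {
    int n = static_cast<int>(s.size());
    std::vector<int> sa(n + 1, 0), rnk(n + 1, 0), nxt(n + 1, 0);
    std::iota(sa.begin() + 1, sa.end(), 1);
    // initial ranks: by the first character only, ties allowed, all values >= 1
    for (int i = 1; i <= n; i++) rnk[i] = static_cast<unsigned char>(s[i - 1]) + 1;
    for (int len = 1;; len <<= 1) {
        // key of position i: (rank of the first half, rank of the second half or 0 if empty)
        auto key = [&](int i) {
            return std::make_pair(rnk[i], i + len <= n ? rnk[i + len] : 0);
        };
        std::sort(sa.begin() + 1, sa.end(), [&](int x, int y) { return key(x) < key(y); });
        // re-rank: equal keys share a rank, so the ranks need not be a permutation yet
        if (n >= 1) nxt[sa[1]] = 1;
        for (int i = 2; i <= n; i++)
            nxt[sa[i]] = nxt[sa[i - 1]] + (key(sa[i]) != key(sa[i - 1]));
        rnk = nxt;
        // stop once all ranks are distinct (len >= n is only a safeguard)
        if (n == 0 || rnk[sa[n]] == n || len >= n) break;
    }
    return sa;
}
```

The new ranks are written to a separate array `nxt` so that `key` keeps reading the ranks of the previous round while the current round is being re-ranked.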

Implementation

$\text{sigma}$ is $\vert \Sigma \vert$, the size of the alphabet.

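#include <cstdio>
#include <algorithm>
using namespace std;

// The declarations below are not part of the original snippet; maxn and sigma are assumed bounds.
const int maxn = 1e6, sigma = 26;
int n, a[maxn + 3], sa[maxn + 3], rnk[maxn + 3], cnt[maxn + 3], t_1[maxn + 3], t_2[maxn + 3];
char s[maxn + 3];

// Stable counting sort: takes the positions in a[] and writes them to b[] ordered by key k[].
void radix_sort(int a[maxn + 3], int b[maxn + 3], int k[maxn + 3]) {
	fill(cnt + 1, cnt + n + 1, 0);
	for (int i = 1; i <= n; i++) cnt[k[i]]++;
	for (int i = 1; i <= n; i++) cnt[i] += cnt[i - 1];
	for (int i = n; i; i--) b[cnt[k[a[i]]]--] = a[i];
}

void suffix_sort() {
	for (int i = 1; i <= n; i++) rnk[i] = a[i];
	// all characters distinct: the single-character ranks already form a permutation
	if (cnt[sigma] == n) { for (int i = 1; i <= n; i++) sa[rnk[i]] = i; return; }
	for (int k = 1; k < n; k <<= 1) {
		for (int i = 1; i <= n; i++) {
			t_1[i] = (i + k <= n ? rnk[i + k] : 0) + 1; // second key; an empty second half gets 1
			t_2[i] = rnk[i];                             // first key
		}
		for (int i = 1; i <= n; i++) sa[i] = i;
		radix_sort(sa, rnk, t_1); // sort by the second key (rnk is reused as scratch space)
		radix_sort(rnk, sa, t_2); // then stably by the first key
		rnk[sa[1]] = 1;
		for (int i = 2, x, y; i <= n; i++) {
			x = sa[i], y = sa[i - 1];
			// equal pairs share a rank
			rnk[x] = rnk[y] + (t_1[x] != t_1[y] || t_2[x] != t_2[y]);
		}
		if (rnk[sa[n]] == n) return; // all ranks distinct: sa is final
	}
}

int main() {
	scanf("%d %s", &n, s + 1);
	// compress the lowercase letters to ranks 1..(number of distinct letters)
	for (int i = 1; i <= n; i++) {
		a[i] = s[i] - 'a' + 1;
		cnt[a[i]] = 1;
	}
	for (int i = 1; i <= sigma; i++) cnt[i] += cnt[i - 1];
	for (int i = 1; i <= n; i++) a[i] = cnt[a[i]];
	suffix_sort();
	return 0;
}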

The height array

What is the height array?

$height[]$, abbreviated as $hei[]$, is defined as follows: $hei[i]$ is the longest common prefix (LCP) of $S(sa[i], n)$ and $S(sa[i + 1], n)$. The LCP of two strings is the length of the longest prefix they share. For example, for the string $\text{ababa}$, $hei[] = \{1, 3, 0, 2\}$.
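The definition can be evaluated directly by comparing neighbouring suffixes character by character. The sketch below does exactly that; `naive_height` is a name invented for this example, and it runs in $O(n^2)$ in the worst case, so it is useful only as a reference for small inputs.

```cpp
#include <cassert>
#include <string>
#include <vector>

// hei[i] = LCP(S(sa[i], n), S(sa[i + 1], n)) for 1 <= i < n, by direct comparison.
// sa is 1-indexed (sa[0] unused); the result is 1-indexed too, with hei[0] unused.
std::vector<int> naive_height(const std::string& s, const std::vector<int>& sa) {
    int n = static_cast<int>(s.size());
    assert(static_cast<int>(sa.size()) == n + 1);
    if (n == 0) return {};
    std::vector<int> hei(n, 0);
    for (int i = 1; i < n; i++) {
        int x = sa[i] - 1, y = sa[i + 1] - 1, k = 0;  // 0-indexed start positions
        while (x + k < n && y + k < n && s[x + k] == s[y + k]) k++;
        hei[i] = k;
    }
    return hei;
}
```

With the suffix array 5, 3, 1, 4, 2 of ababa, the entries `hei[1..4]` come out as 1, 3, 0, 2.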

Computing the height array in linear time

Idea

Let $hei'[i] = hei[rnk[i] - 1]$, that is, the LCP of $S(i, n)$ and the suffix ranked immediately before it. It is not hard to see that $hei'[i] \ge hei'[i - 1] - 1$ (if this is unclear, work through a small string on paper). With this property we can go from $hei'[i]$ to $hei'[i + 1]$ by brute force: start from $\max(0, hei'[i] - 1)$ and extend character by character, which yields all of $hei[]$. The value drops by at most $1$ per step and never exceeds $n$, so the total number of character comparisons is $O(n)$, i.e. amortized $O(1)$ per position.
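For readers who prefer an argument to a drawing, here is a short justification of the inequality. It takes $hei'[i]$ to mean the LCP of $S(i, n)$ with the suffix ranked immediately before it (and $0$ for the smallest suffix), and it uses the fact that the LCP of two suffixes is the minimum of the LCPs of neighbouring suffixes between them in sorted order.

```latex
\begin{aligned}
&\text{Let } S(j, n) \text{ be the suffix ranked just before } S(i-1, n),
  \text{ and let } h = hei'[i-1] = \mathrm{LCP}(S(j, n), S(i-1, n)). \\
&\text{If } h \le 1 \text{, there is nothing to prove, since } hei'[i] \ge 0 \ge h - 1. \\
&\text{If } h \ge 2 \text{: } S(j, n) < S(i-1, n) \text{ and both start with the same character, so dropping it gives } \\
&\qquad S(j+1, n) < S(i, n) \quad\text{and}\quad \mathrm{LCP}(S(j+1, n), S(i, n)) = h - 1. \\
&\text{Every suffix ranked between } S(j+1, n) \text{ and } S(i, n) \text{ shares at least } h - 1 \text{ characters with } S(i, n), \\
&\qquad \text{in particular the one ranked just before } S(i, n). \\
&\text{Hence } hei'[i] \ge h - 1 = hei'[i-1] - 1.
\end{aligned}
```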

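Implementation

// Fills hei[1..n-1]. Relies on a[0] = a[n + 1] = 0 differing from every symbol, and on sa[0] = 0.
void get_height() {
	// k enters iteration i holding hei'[i - 1]; the result is stored at hei[rnk[i] - 1]
	for (int i = 1, j, k = 0; i <= n; hei[rnk[i++] - 1] = k) {
		// j: start of the suffix ranked just before S(i, n); resume matching from max(0, k - 1)
		for (j = sa[rnk[i] - 1], k = max(0, k - 1); a[i + k] == a[j + k]; k++);
	}
}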

Worked examples

By now you have a working understanding of suffix arrays and the $height$ array. Let us look at some example problems.

Example 1: LCP queries

Problem

Given a string, answer many queries of the form $\text{LCP}(S(x, n), S(y, n))$.

Idea

If $rnk[x] < rnk[y]$, then $\text{LCP}(S(x, n), S(y, n))$ equals $\min_{i = rnk[x]}^{rnk[y] - 1} hei[i]$, so each query is a range minimum query (RMQ) over $hei[]$. A sparse table, for instance, answers each query in $O(1)$ after $O(n \log n)$ preprocessing.
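One possible way to organise the RMQ is a sparse table over the height array. The sketch below assumes 1-indexed `rnk[1..n]` and `hei[1..n-1]` are already available; the struct name `LcpQuery` and its members are chosen for this example only.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Sparse table over hei[1..n-1] answering LCP(S(x, n), S(y, n)).
// hei[i] is the LCP of the suffixes ranked i and i + 1; index 0 of both inputs is unused.
struct LcpQuery {
    int n;
    std::vector<int> rnk, lg;
    std::vector<std::vector<int>> mn;  // mn[k][i] = min(hei[i .. i + 2^k - 1])

    LcpQuery(const std::vector<int>& rnk_, const std::vector<int>& hei)
        : n(static_cast<int>(rnk_.size()) - 1), rnk(rnk_), lg(n + 1, 0) {
        assert(n >= 1 && static_cast<int>(hei.size()) >= n);  // needs hei[1..n-1]
        for (int i = 2; i <= n; i++) lg[i] = lg[i / 2] + 1;   // floor(log2(i))
        mn.push_back(std::vector<int>(hei.begin(), hei.begin() + n));
        for (int k = 1; (1 << k) <= n - 1; k++) {
            mn.push_back(std::vector<int>(n, 0));
            for (int i = 1; i + (1 << k) - 1 <= n - 1; i++)
                mn[k][i] = std::min(mn[k - 1][i], mn[k - 1][i + (1 << (k - 1))]);
        }
    }

    int lcp(int x, int y) const {
        if (x == y) return n - x + 1;  // a suffix compared with itself
        int l = rnk[x], r = rnk[y];
        if (l > r) std::swap(l, r);
        r--;                           // the answer is min(hei[l .. r])
        int k = lg[r - l + 1];
        return std::min(mn[k][l], mn[k][r - (1 << k) + 1]);
    }
};
```

For ababa, with ranks 3, 5, 2, 4, 1 and heights 1, 3, 0, 2, the query `lcp(1, 3)` compares ababa with aba and returns 3.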

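Example 2: Musical Themes

Problem

Problem link: Musical Themes

Given a sequence of numbers, find the largest $k$ such that there are two contiguous subsequences of length $k$ that do not overlap and whose differences between adjacent elements agree at every position.

Idea

First replace the sequence by its difference array. Then binary search on the answer; the question becomes whether two equal, non-overlapping substrings of length $k$ exist.

Sort all suffixes and lay them out as a table. For example, with $k = 2$ and the string $\text{aabaaba}$:

| Start | Suffix | LCP with next suffix |
|---|---|---|
| 7 | a | 1 |
| 4 | aaba | 4 |
| 1 | aabaaba | 1 |
| 5 | aba | 3 |
| 2 | abaaba | 0 |
| 6 | ba | 2 |
| 3 | baaba | N/A |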

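We split the table into connected blocks such that inside each block every LCP is $\ge k$. For the example above:

| Block | Start | Suffix | LCP with next suffix in block |
|---|---|---|---|
| 1 | 7 | a | N/A |
| 2 | 4 | aaba | 4 |
| 2 | 1 | aabaaba | N/A |
| 3 | 5 | aba | 3 |
| 3 | 2 | abaaba | N/A |
| 4 | 6 | ba | 2 |
| 4 | 3 | baaba | N/A |

Now consider each block separately. If a block contains two suffixes whose starting positions differ by at least $k$ (it suffices to compare the smallest and the largest start in the block), those two occurrences satisfy the requirement. The total running time is $O(n \log n)$.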
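The check for one candidate $k$ can be sketched as a single pass over the sorted suffixes, tracking the smallest and largest start in the current block. This follows the simplified formulation in the text (two equal substrings of length $k$ whose starts differ by at least $k$); the function name `has_disjoint_repeat` is hypothetical, and the extra conditions of the actual Musical Themes task are not modelled.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Decide whether some substring of length k occurs twice without overlapping.
// sa[1..n] and hei[1..n-1] are 1-indexed; hei[i] = LCP of the suffixes ranked i and i + 1.
bool has_disjoint_repeat(const std::vector<int>& sa, const std::vector<int>& hei, int k) {
    int n = static_cast<int>(sa.size()) - 1;
    assert(k >= 1);
    if (n < 1) return false;
    int lo = sa[1], hi = sa[1];  // smallest and largest start in the current block
    for (int i = 2; i <= n; i++) {
        if (hei[i - 1] >= k) {   // same block as the previous suffix
            lo = std::min(lo, sa[i]);
            hi = std::max(hi, sa[i]);
            if (hi - lo >= k) return true;
        } else {                 // LCP drops below k: a new block starts here
            lo = hi = sa[i];
        }
    }
    return false;
}
```

Because the predicate is monotone in $k$, it can be plugged into a binary search over $k$, with each check costing $O(n)$.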

Example 3: [NOI 2015] 品酒大会

Problem

Problem link: [NOI 2015] 品酒大会

Given a string and an array $A[]$, for every $0 \le r < n$ compute the number of pairs $i < j$ with $\text{LCP}(i, j) \ge r$, where $\text{LCP}(i, j)$ is the LCP of the suffixes starting at $i$ and $j$, together with $\max A_i \times A_j$ over those pairs $(i, j)$.

Idea

Claim $1$: $\max A_i \times A_j = \max\{\max A_i \times \text{second\_max}\,A_i,\ \min A_i \times \text{second\_min}\,A_i\}$. This is immediate: the best product uses either the two largest values or the two smallest (most negative) ones.
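Claim 1 can be illustrated with a few lines of code. The sketch below sorts a copy of the values for brevity, although a single pass that tracks the two largest and two smallest values would do; `max_pair_product` is a name made up for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Largest product a[i] * a[j] over i != j, using only the two largest
// and the two smallest values (the latter matter when both are negative).
long long max_pair_product(const std::vector<long long>& a) {
    assert(a.size() >= 2);
    std::vector<long long> v(a);
    std::sort(v.begin(), v.end());
    std::size_t m = v.size();
    return std::max(v[m - 1] * v[m - 2], v[0] * v[1]);
}
```

For the values -5, -4, 1, 2 the two smallest win: the result is 20, not 2.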

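As in the previous problem, we cut the sorted suffixes into connected blocks. Enumerate $r$ from large to small and, for the blocks with $\text{LCP} \ge r$, maintain the values we need (block size, the two largest and the two smallest $A$) while merging adjacent blocks. Every block is an interval of ranks, so there are only $n - 1$ merges in total and this step takes $O(n)$ time once the suffix array has been built (the code below performs the merges with a union-find).

Implementation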

// luogu-judger-enable-o2
#include <cstdio>
#include <vector>
#include <numeric>
#include <algorithm>
using namespace std;

typedef long long llong;
const int maxn = 3e5, inf = 1e9 + 1, alpha = 26;
const llong infl = 1e18 + inf;
int n, a[maxn + 3], fa[maxn + 3], sz[maxn + 3], mn[maxn + 3][2], mx[maxn + 3][2];
llong c_ans, c_res = -infl, ans[maxn + 3], res[maxn + 3];
char s[maxn + 3];
vector<int> vec[maxn + 3];

namespace suffix_array {
    int sa[maxn + 3], rnk[maxn + 3], cnt[maxn + 3], tmp_1[maxn + 3], tmp_2[maxn + 3], ht[maxn + 3];
    void radix_sort(int from[], int key[], int to[]) {
        fill(cnt + 1, cnt + n + 1, 0);
        for (int i = 1; i <= n; i++) cnt[key[i]]++;
        for (int i = 2; i <= n; i++) cnt[i] += cnt[i - 1];
        for (int i = n; i; i--) to[cnt[key[from[i]]]--] = from[i];
    }
    void suffix_sort() {
        fill(cnt + 1, cnt + alpha + 1, 0);
        for (int i = 1; i <= n; i++) cnt[int(s[i] - 'a' + 1)] = 1;
        for (int i = 2; i <= alpha; i++) cnt[i] += cnt[i - 1];
        for (int i = 1; i <= n; i++) rnk[i] = cnt[int(s[i] - 'a' + 1)];
        if (cnt[alpha] == n) { for (int i = 1; i <= n; i++) sa[rnk[i]] = i; return; }
        for (int k = 1; k < n; k <<= 1) {
            for (int i = 1; i <= n; i++) {
                tmp_1[i] = rnk[i];
                tmp_2[i] = (i + k <= n ? rnk[i + k] : 0) + 1;
            }
            iota(sa + 1, sa + n + 1, 1);
            radix_sort(sa, tmp_2, rnk);
            radix_sort(rnk, tmp_1, sa);
            rnk[sa[1]] = 1;
            for (int i = 2, x, y; i <= n; i++) {
                x = sa[i], y = sa[i - 1];
                rnk[x] = rnk[y] + (tmp_1[x] != tmp_1[y] || tmp_2[x] != tmp_2[y]);
            }
            if (rnk[sa[n]] == n) return;
        }
    }
    void get_height() {
        for (int i = 1, j, k = 0; i <= n; ht[rnk[i++] - 1] = k) {
            for (j = sa[rnk[i] - 1], k = max(0, k - 1); s[i + k] == s[j + k]; k++);
        }
    }
}

using namespace suffix_array;

int find(int x) {
    return fa[x] == x ? x : fa[x] = find(fa[x]);
}

llong func(int x) {
    return 1ll * x * (x - 1) / 2;
}

void solve(int a, int b) {
    a = sa[a], b = sa[b];
    a = find(a), b = find(b);
    c_ans -= func(sz[a]), c_ans -= func(sz[b]);
    fa[b] = a, sz[a] += sz[b], sz[b] = 0;
    c_ans += func(sz[a]);
    if (mx[b][0] >= mx[a][0]) {
        mx[a][1] = mx[a][0];
        mx[a][0] = mx[b][0];
    } else if (mx[b][0] >= mx[a][1]) {
        mx[a][1] = mx[b][0];
    }
    if (mx[b][1] >= mx[a][0]) {
        mx[a][1] = mx[a][0];
        mx[a][0] = mx[b][1];
    } else if (mx[b][1] >= mx[a][1]) {
        mx[a][1] = mx[b][1];
    }
    if (mn[b][0] <= mn[a][0]) {
        mn[a][1] = mn[a][0];
        mn[a][0] = mn[b][0];
    } else if (mn[b][0] <= mn[a][1]) {
        mn[a][1] = mn[b][0];
    }
    if (mn[b][1] <= mn[a][0]) {
        mn[a][1] = mn[a][0];
        mn[a][0] = mn[b][1];
    } else if (mn[b][1] <= mn[a][1]) {
        mn[a][1] = mn[b][1];
    }
    c_res = max(c_res, 1ll * mx[a][0] * mx[a][1]);
    c_res = max(c_res, 1ll * mn[a][0] * mn[a][1]);
}

int main() {
    scanf("%d %s", &n, s + 1);
    for (int i = 1; i <= n; i++) {
        scanf("%d", &a[i]);
    }
    suffix_sort();
    get_height();
    for (int i = 1; i <= n; i++) {
        fa[i] = i, sz[i] = 1;
        mx[i][0] = a[i], mn[i][0] = a[i];
        mx[i][1] = -inf, mn[i][1] = inf;
    }
    for (int i = 1; i < n; i++) {
        vec[ht[i]].push_back(i);
    }
    for (int i = n - 1; ~i; i--) {
        for (int j: vec[i]) {
            solve(j, j + 1);
        }
        ans[i] = c_ans;
        if (c_ans) res[i] = c_res;
    }
    for (int i = 0; i < n; i++) {
        printf("%lld %lld\n", ans[i], res[i]);
    }
    return 0;
}

Example 4: [NOI 2016] 优秀的拆分

Problem

Problem link: [NOI 2016] 优秀的拆分

Over all substrings of $S$, count the total number of ways to split a substring in the form $AABB$, with $A$ and $B$ non-empty. For example, $\text{aabaabaa}$ can be split in two ways, so it contributes $+2$ to the answer.

Idea

The answer equals $\sum_i \text{end}[i] \cdot \text{begin}[i + 1]$, where $\text{end}[i]$ is the number of substrings of the form $AA$ that end at position $i$, and $\text{begin}[i]$ is the number of substrings of the form $AA$ that begin at position $i$.
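For testing a fast solution on small inputs, the two arrays can be filled straight from their definitions. The following brute force is only a reference sketch with roughly cubic running time; the names `count_aabb_bruteforce`, `sq_begin` and `sq_end` are invented for this example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Brute-force count of AABB splits over all substrings, straight from the definitions.
// sq_end[i]   = number of squares AA that end at position i (1-indexed)
// sq_begin[i] = number of squares AA that begin at position i
long long count_aabb_bruteforce(const std::string& s) {
    int n = static_cast<int>(s.size());
    std::vector<long long> sq_begin(n + 2, 0), sq_end(n + 2, 0);
    for (int p = 1; p <= n; p++) {                  // start of the square
        for (int k = 1; p + 2 * k - 1 <= n; k++) {  // |A| = k
            // compare S(p, p + k - 1) with S(p + k, p + 2k - 1); the std::string is 0-indexed
            if (s.compare(p - 1, k, s, p - 1 + k, k) == 0) {
                sq_begin[p]++;
                sq_end[p + 2 * k - 1]++;
            }
        }
    }
    long long ans = 0;
    for (int i = 1; i < n; i++) ans += sq_end[i] * sq_begin[i + 1];
    return ans;
}
```

On aabb it finds the single split aa + bb, and on cccccc it counts 5.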

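How do we compute these two arrays? Enumerate the length $k$ of $A$ and place a key point at every $k$-th character of the string (positions $k, 2k, 3k, \dots$). Every $AA$ with $|A| = k$ must cover two consecutive key points, and for those two key points the longest common prefix plus the longest common suffix (both counting the key point itself and both capped at $k$) must be $> k$. The $AA$ strings that cover a given pair of key points have starting positions that form an interval, so we compute that interval and add one to the corresponding ranges of both arrays using difference arrays. The running time is $\sum_{i = 1}^{n} \frac{n}{i} = O(n \log n)$.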
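The key-point enumeration can be sketched on its own if the LCP and LCS queries are answered by direct character comparison. That replacement is an assumption made to keep the example short: it makes the sketch quadratic in the worst case, whereas the suffix array with a sparse table gives $O(1)$ per query. The function name `count_aabb_keypoints` is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Counts AABB splits with the key-point technique; LCP/LCS are computed naively here.
long long count_aabb_keypoints(const std::string& str) {
    int n = static_cast<int>(str.size());
    std::string s = " " + str;  // shift to 1-indexed positions
    // common extension forwards / backwards from x and y, counting the start position itself
    auto lcp = [&](int x, int y) {
        int k = 0;
        while (std::max(x, y) + k <= n && s[x + k] == s[y + k]) k++;
        return k;
    };
    auto lcs = [&](int x, int y) {
        int k = 0;
        while (std::min(x, y) - k >= 1 && s[x - k] == s[y - k]) k++;
        return k;
    };
    std::vector<long long> L(n + 2, 0), R(n + 2, 0);  // difference arrays for begin[] and end[]
    for (int k = 1; 2 * k <= n; k++) {
        for (int i = k, j = 2 * k; j <= n; i += k, j += k) {  // consecutive key points
            int a = std::min(lcs(i, j), k), b = std::min(lcp(i, j), k);
            if (a + b >= k + 1) {
                // squares of half-length k covering i and j start in [i - a + 1, j + b - 2k]
                L[i - a + 1]++, L[j + b - 2 * k + 1]--;
                // and therefore end in [i - a + 2k, j + b - 1]
                R[i - a + 2 * k]++, R[j + b]--;
            }
        }
    }
    long long ans = 0;
    for (int i = 1; i <= n; i++) L[i] += L[i - 1], R[i] += R[i - 1];
    for (int i = 1; i < n; i++) ans += R[i] * L[i + 1];
    return ans;
}
```

Capping both extensions at $k$ ensures that each square is counted at exactly one pair of key points.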

Implementation

There are quite a few details.

// luogu-judger-enable-o2
#include <cstdio>
#include <cstring>
#include <numeric>
#include <algorithm>
using namespace std;

typedef long long llong;
const int maxn = 3e4, logn = 15, alpha = 26;
int T, n, log_2[maxn + 3], L[maxn + 3], R[maxn + 3];

struct suffix_array {
    int sa[maxn + 3], rnk[maxn + 3], cnt[maxn + 3], tmp_1[maxn + 3], tmp_2[maxn + 3], ht[maxn + 3][logn + 3];
    char s[maxn + 3];
    void radix_sort(int from[], int key[], int to[]) {
        fill(cnt + 1, cnt + n + 1, 0);
        for (int i = 1; i <= n; i++) cnt[key[i]]++;
        for (int i = 2; i <= n; i++) cnt[i] += cnt[i - 1];
        for (int i = n; i; i--) to[cnt[key[from[i]]]--] = from[i];
    }
    void suffix_sort() {
        fill(cnt + 1, cnt + alpha + 1, 0);
        for (int i = 1; i <= n; i++) cnt[int(s[i] - 'a' + 1)] = 1;
        for (int i = 2; i <= alpha; i++) cnt[i] += cnt[i - 1];
        for (int i = 1; i <= n; i++) rnk[i] = cnt[int(s[i] - 'a' + 1)];
        if (cnt[alpha] == n) { for (int i = 1; i <= n; i++) sa[rnk[i]] = i; return; }
        for (int k = 1; k < n; k <<= 1) {
            for (int i = 1; i <= n; i++) {
                tmp_1[i] = rnk[i];
                tmp_2[i] = (i + k <= n ? rnk[i + k] : 0) + 1;
            }
            iota(sa + 1, sa + n + 1, 1);
            radix_sort(sa, tmp_2, rnk);
            radix_sort(rnk, tmp_1, sa);
            rnk[sa[1]] = 1;
            for (int i = 2, x, y; i <= n; i++) {
                x = sa[i], y = sa[i - 1];
                rnk[x] = rnk[y] + (tmp_1[x] != tmp_1[y] || tmp_2[x] != tmp_2[y]);
            }
            if (rnk[sa[n]] == n) return;
        }
    }
    void get_height() {
        for (int i = 1, j, k = 0; i <= n; ht[rnk[i++] - 1][0] = k) {
            for (j = sa[rnk[i] - 1], k = max(0, k - 1); s[i + k] == s[j + k]; k++);
        }
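        // Sparse table over ht[1..n-1]. The level with 2^k == n - 1 is built as well,
        // so a query spanning the whole range never reads an entry left over from an earlier test.
        for (int k = 1; 1 << k <= n - 1; k++) {
            for (int i = 1, j = (1 << (k - 1)) + 1; j < n; i++, j++) {
                ht[i][k] = min(ht[i][k - 1], ht[j][k - 1]);
            }
        }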
    }
    int lcp(int a, int b) {
        if (a == b) return n - a + 1;
        a = rnk[a], b = rnk[b];
        if (a > b) swap(a, b);
        b--;
        int x = log_2[b - a + 1];
        return min(ht[a][x], ht[b - (1 << x) + 1][x]);
    }
} sa_1, sa_2;

int lcp(int a, int b) {
    return sa_1.lcp(a, b);
}

int lcs(int a, int b) {
    return sa_2.lcp(n - a + 1, n - b + 1);
}

int main() {
    scanf("%d", &T);
    for (int i = 2; i <= maxn; i++) {
        log_2[i] = log_2[i / 2] + 1;
    }
    while (T--) {
        scanf("%s", sa_1.s + 1);
        n = strlen(sa_1.s + 1);
        for (int i = 1; i <= n; i++) {
            sa_2.s[i] = sa_1.s[n - i + 1];
        }
        sa_1.suffix_sort();
        sa_1.get_height();
        sa_2.suffix_sort();
        sa_2.get_height();
        fill(L + 1, L + n + 1, 0);
        fill(R + 1, R + n + 1, 0);
        for (int k = 1; k < n / 2; k++) {
            for (int i = k, j = 2 * k, a, b; j <= n; i += k, j += k) {
                a = lcs(i, j), b = lcp(i, j);
                a = min(a, k), b = min(b, k);
                if (a + b >= k + 1) {
                    L[i - a + 1]++, L[(j + b - 2 * k) + 1]--;
                    R[i - a + 2 * k]++, R[(j + b - 1) + 1]--;
                }
            }
        }
        for (int i = 1; i <= n; i++) {
            L[i] += L[i - 1], R[i] += R[i - 1];
        }
        llong ans = 0;
        for (int i = 1; i < n; i++) {
            ans += 1ll * R[i] * L[i + 1];
        }
        printf("%lld\n", ans);
    }
    return 0;
}

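Example 5: [SCOI 2012] 喵星球上的点名

Problem

Problem link: [SCOI 2012] 喵星球上的点名

There are $n$ people, each with two strings: a surname and a given name. There are $q$ roll calls, each calling one string; a person is called if and only if their surname contains the called string or their given name contains it. Report how many people each roll call reaches and, at the end, how many times each person was called.

Idea

First join each person's surname and given name with a special character in between, so that the two no longer need separate treatment. Then concatenate all name strings and all query strings in order, separated by pairwise distinct special characters, and build the suffix array of this long string.

For each query, start from the rank of the suffix $x$ that begins at the query string and extend to the left and right, obtaining an interval $[l, r]$ on $sa[]$ such that $\forall\, l \le i \le r,\ \text{LCP}(x, sa[i]) \ge$ the length of the query string. Then every person in this interval is called. (In the code below the separator placed after a query is smaller than every symbol that occurs in a name string, so all matching name suffixes rank after $x$ and only the right end has to be extended.)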
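The extension step can be sketched with a plain linear scan over the height array. This is a simplified stand-in chosen for the example; a sparse table with binary lifting finds the same interval in $O(\log n)$ per query. The function name `extend_interval` and its parameters are hypothetical.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Given the rank p of a suffix and a required length m, return the maximal rank interval
// [l, r] containing p in which every suffix shares a prefix of length >= m with it.
// hei[i] = LCP of the suffixes ranked i and i + 1 (1-indexed, hei[1..n-1], hei[0] unused).
std::pair<int, int> extend_interval(const std::vector<int>& hei, int n, int p, int m) {
    assert(1 <= p && p <= n && static_cast<int>(hei.size()) >= n);
    int l = p, r = p;
    while (l > 1 && hei[l - 1] >= m) l--;  // extend to the left
    while (r < n && hei[r] >= m) r++;      // extend to the right
    return {l, r};
}
```

For aabaaba, whose height array is 1, 4, 1, 3, 0, 2, starting from rank 1 with $m = 1$ gives the interval $[1, 5]$, which is exactly the set of suffixes that begin with a.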

The problem is now: given a sequence, answer queries that ask how many distinct values occur in an interval $[l, r]$, and at the end output for each value how many query intervals contain it. Let $nxt[i]$ be the position of the next occurrence of $A_i$ after position $i$. The first question asks how many $l \le i \le r$ satisfy $nxt[i] > r$; the second asks, for each position $i$, how many queries have $i \in [l, r]$ and $nxt[i] > r$. Both can be answered offline with a binary indexed tree (Fenwick tree). The total running time is $O(n \log n)$.
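The first of the two questions, counting distinct values in a range, can be sketched as follows. The sweep keeps a 1 only at the latest occurrence of each value, which is the same condition as $nxt[i] > r$ for the current right endpoint $r$. The function name `count_distinct` is made up, and `std::map` is used for the last-occurrence lookup purely for brevity.

```cpp
#include <cassert>
#include <map>
#include <utility>
#include <vector>

// For each query [l, r] (1-indexed, inclusive) return the number of distinct values in a[l..r].
// a[0] is unused. Offline: sweep r from left to right with a Fenwick tree over positions.
std::vector<int> count_distinct(const std::vector<int>& a,
                                const std::vector<std::pair<int, int>>& queries) {
    int n = static_cast<int>(a.size()) - 1;
    std::vector<int> bit(n + 1, 0), ans(queries.size(), 0);
    auto add = [&](int i, int v) { for (; i <= n; i += i & -i) bit[i] += v; };
    auto sum = [&](int i) { int s = 0; for (; i > 0; i -= i & -i) s += bit[i]; return s; };
    // bucket the query indices by right endpoint
    std::vector<std::vector<int>> by_r(n + 1);
    for (int q = 0; q < static_cast<int>(queries.size()); q++) {
        assert(1 <= queries[q].first && queries[q].first <= queries[q].second &&
               queries[q].second <= n);
        by_r[queries[q].second].push_back(q);
    }
    std::map<int, int> last;  // value -> position of its latest occurrence so far
    for (int r = 1; r <= n; r++) {
        auto it = last.find(a[r]);
        if (it != last.end()) add(it->second, -1);  // the older occurrence no longer counts
        last[a[r]] = r;
        add(r, 1);
        for (int q : by_r[r]) ans[q] = sum(r) - sum(queries[q].first - 1);
    }
    return ans;
}
```

For the sequence 1, 2, 1, 3, 2 the query $[2, 5]$ sees the values 2, 1, 3 and returns 3.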

Implementation

Mind the maximum length of the concatenated string.

// luogu-judger-enable-o2
#include <cstdio>
#include <vector>
#include <numeric>
#include <algorithm>
using namespace std;

const int maxn = 5e4, maxm = 1e5, maxk = 2 * maxn + 3 * maxm, logk = 18, alpha = 1e4 + maxn + maxm + 2;
int cnt[maxk + 3], sa[maxk + 3], rnk[maxk + 3], tmp_1[maxk + 3], tmp_2[maxk + 3], ht[maxk + 3][logk + 3];
int n, m, len, str[maxk + 3], bel_1[maxk + 3], bel_2[maxk + 3], pos[maxm + 3], q_len[maxm + 3], bit[maxk + 3], ans[maxm + 3];

int add_to_str(int id) {
    int k; scanf("%d", &k);
    for (int i = 1; i <= k; i++) {
        scanf("%d", &str[++len]);
        str[len] += n + m + 2, bel_1[len] = id;
    }
    return k;
}

void radix_sort(int from[], int key[], int to[], int n) {
    fill(cnt + 1, cnt + n + 1, 0);
    for (int i = 1; i <= n; i++) cnt[key[i]]++;
    for (int i = 2; i <= n; i++) cnt[i] += cnt[i - 1];
    for (int i = n; i; i--) to[cnt[key[from[i]]]--] = from[i];
}

void suffix_sort(int s[], int n) {
    for (int i = 1; i <= n; i++) cnt[s[i]] = 1;
    for (int i = 2; i <= alpha; i++) cnt[i] += cnt[i - 1];
    for (int i = 1; i <= n; i++) rnk[i] = cnt[s[i]];
    if (cnt[alpha] == n) { for (int i = 1; i <= n; i++) sa[rnk[i]] = i; return; }
    for (int k = 1; k < n; k <<= 1) {
        for (int i = 1; i <= n; i++) {
            tmp_1[i] = rnk[i];
            tmp_2[i] = (i + k <= n ? rnk[i + k] : 0) + 1;
        }
        iota(sa + 1, sa + n + 1, 1);
        radix_sort(sa, tmp_2, rnk, n);
        radix_sort(rnk, tmp_1, sa, n);
        rnk[sa[1]] = 1;
        for (int i = 2, x, y; i <= n; i++) {
            x = sa[i], y = sa[i - 1];
            rnk[x] = rnk[y] + (tmp_1[x] != tmp_1[y] || tmp_2[x] != tmp_2[y]);
        }
        if (rnk[sa[n]] == n) return;
    }
}

void get_height(int s[], int n) {
    for (int i = 1, j, k = 0; i <= n; ht[rnk[i++] - 1][0] = k) {
        for (j = sa[rnk[i] - 1], k = max(0, k - 1); s[i + k] == s[j + k]; k++);
    }
}

struct query {
    int id, lb;
    query(int id = 0, int lb = 0): id(id), lb(lb) {}
};

vector<query> vec_1[maxk + 3], vec_2[maxk + 3];

void add(int x, int y) {
    for (int i = x; i <= len; i += i & -i) {
        bit[i] += y;
    }
}

int sum(int x) {
    int y = 0;
    for (int i = x; i; i ^= i & -i) {
        y += bit[i];
    }
    return y;
}

int main() {
    scanf("%d %d", &n, &m);
    for (int i = 1; i <= n; i++) {
        add_to_str(i), str[++len] = n + m + 1;
        add_to_str(i), str[++len] = m + i;
    }
    for (int i = 1; i <= m; i++) {
        pos[i] = len + 1, q_len[i] = add_to_str(0);
        str[++len] = i;
    }
    suffix_sort(str, len);
    get_height(str, len);
    for (int k = 1; 1 << k < len - 1; k++) {
        for (int i = 1, j = (1 << (k - 1)) + 1; j < len; i++, j++) {
            ht[i][k] = min(ht[i][k - 1], ht[j][k - 1]);
        }
    }
    for (int i = 1; i <= len; i++) {
        bel_2[i] = bel_1[sa[i]];
    }
    for (int i = 1; i <= m; i++) {
        int l = rnk[pos[i]], r = l;
        for (int j = logk; ~j; j--) {
            if (ht[r][j] >= q_len[i]) {
                r += 1 << j;
            }
        }
        vec_1[r].push_back(query(i, l));
        vec_2[l].push_back(query(1, l));
        vec_2[r + 1].push_back(query(-1, l));
    }
    fill(tmp_1 + 1, tmp_1 + n + 1, 0);
    for (int i = 1, x; i <= len; i++) {
        x = bel_2[i];
        if (x) {
            if (tmp_1[x]) add(tmp_1[x], -1);
            tmp_1[x] = i, add(i, 1);
        }
        for (query q: vec_1[i]) {
            ans[q.id] = sum(i) - sum(q.lb - 1);
        }
    }
    for (int i = 1; i <= m; i++) {
        printf("%d\n", ans[i]);
    }
    fill(ans + 1, ans + n + 1, 0);
    fill(bit + 1, bit + len + 1, 0);
    fill(tmp_1 + 1, tmp_1 + n + 1, 0);
    for (int i = 1, x; i <= len; i++) {
        for (query j: vec_2[i]) {
            add(j.lb, j.id);
        }
        x = bel_2[i];
        if (x) {
            ans[x] += sum(i) - sum(tmp_1[x]);
            tmp_1[x] = i;
        }
    }
    for (int i = 1; i <= n; i++) {
        printf("%d%c", ans[i], " \n"[i == n]);
    }
    return 0;
}