Lexicographical Substring Search
SPOJ - SUBLEX
Little Daniel loves to play with strings! He always finds different ways to have fun with strings! Knowing that, his friend Kinan decided to test his skills so he gave him a string S and asked him Q questions of the form:
If all distinct substrings of string S were sorted lexicographically, which one will be the K-th smallest?
After knowing the huge number of questions Kinan will ask, Daniel figured out that he can't do this alone. Daniel, of course, knows your exceptional programming skills, so he asked you to write him a program which given S will answer Kinan's questions.
Example:
S = "aaa" (without quotes)
The substrings of S are "a", "a", "a", "aa", "aa", "aaa". The sorted list of distinct substrings will be:
"a", "aa", "aaa".
Input
The first line contains Kinan's string S (no more than 90000 characters long). It contains only lowercase letters of the English alphabet. The second line contains a single integer Q (Q <= 500), the number of questions Daniel will be asked. Each of the next Q lines contains a single integer K (0 < K < 2^31).
Output
Output consists of Q lines, the i-th contains a string which is the answer to the i-th asked question.
Example
Input:
aaa
2
2
3
Output:
aa
aaa
Edited: Some input files contain garbage at the end. Do not process it.
My Solution
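Problem summary: given a string of length at most 90000, each query asks for the k-th smallest (in lexicographic order) among all of its distinct substrings. There are at most 500 queries.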
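To make the definition concrete, here is a brute-force sketch that is not part of the submitted solution. It collects every substring into a std::set, which removes duplicates and keeps the elements in lexicographic order. The function name sorted_distinct_substrings is made up for this illustration, and the approach is only practical for tiny strings.

```cpp
#include <set>
#include <string>
#include <vector>

// Brute force for illustration only: O(n^2) substrings, each up to O(n) long.
// std::set both deduplicates and orders the substrings lexicographically.
std::vector<std::string> sorted_distinct_substrings(const std::string &s)
{
    std::set<std::string> seen;
    for (std::size_t i = 0; i < s.size(); ++i)
        for (std::size_t len = 1; i + len <= s.size(); ++len)
            seen.insert(s.substr(i, len));
    return std::vector<std::string>(seen.begin(), seen.end());
}
```

For S = "aaa" this returns "a", "aa", "aaa", so the answer for K = 2 is element 1 of the returned vector (0-based), i.e. "aa".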
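Suffix array
I originally picked this as a suffix automaton problem, but after reading it I felt a suffix array would be more convenient.
I submitted it and found that my code was the fastest submission so far for this problem on vj, which was a bit scary.
Happy Y ( ^ _ ^ ) Y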
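First, run the doubling algorithm on the string s to build sa and height.
Then sort the queries by v[j].q.
Then, for each i (1 <= i <= n), do
cnt += (n - sa[i]) - height[i]; the suffix starting at sa[i] contributes (n - sa[i]) - height[i] new distinct substrings (its prefixes longer than height[i]), so after this update the whole suffix is the cnt-th smallest distinct substring.
So while cnt >= v[ptr].q, set v[ptr].ansi = sa[i] and v[ptr].anslen = (n - sa[i]) - (cnt - v[ptr].q), then ptr++, and repeat until cnt >= v[ptr].q no longer holds.
Then move on to the next suffix.
Once all answers are found, sort the array v back by ind and output s.substr(v[i].ansi, v[i].anslen).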
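As a reference for what sa and height contain, the following sketch builds both arrays the slow way, by sorting suffix start positions and comparing neighbours character by character. It is not the doubling algorithm used in the solution. The names naive_sa and naive_height are made up for this illustration, and the arrays here are 0-based with no sentinel, whereas the solution's arrays are shifted by one because of the trailing '\0'.

```cpp
#include <algorithm>
#include <numeric>
#include <string>
#include <vector>

// sa[i] = start index of the i-th smallest suffix (0-based, no sentinel).
std::vector<int> naive_sa(const std::string &s)
{
    std::vector<int> sa(s.size());
    std::iota(sa.begin(), sa.end(), 0);
    std::sort(sa.begin(), sa.end(), [&s](int a, int b) {
        // Compare the suffix starting at a with the suffix starting at b.
        return s.compare(a, std::string::npos, s, b, std::string::npos) < 0;
    });
    return sa;
}

// height[i] = length of the longest common prefix of the suffixes
// at sa[i - 1] and sa[i]; height[0] is defined as 0.
std::vector<int> naive_height(const std::string &s, const std::vector<int> &sa)
{
    std::vector<int> height(sa.size(), 0);
    for (std::size_t i = 1; i < sa.size(); ++i) {
        std::size_t a = sa[i - 1], b = sa[i];
        int k = 0;
        while (a + k < s.size() && b + k < s.size() && s[a + k] == s[b + k]) ++k;
        height[i] = k;
    }
    return height;
}
```

For "banana" the sorted suffixes are a, ana, anana, banana, na, nana, so naive_sa gives 5 3 1 0 4 2 and naive_height gives 0 1 3 0 0 2. Summing (n - sa[i]) - height[i] over all i yields 15, the number of distinct substrings.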
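The previous fastest time was 60ms, and my own program took only 10ms, setting a new record. The problem is not very hard, but setting a record still made me happy ☺☺.
Complexity: O(n log n)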
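The sort, sweep and restore steps described above can be sketched on their own as follows. This is a simplified illustration, not the submitted program: the suffix array is built by plain sorting instead of doubling, the arrays are 0-based with no sentinel, and answer_queries is a made-up name. A query whose k exceeds the number of distinct substrings gets an empty string here, which is an assumption of this sketch.

```cpp
#include <algorithm>
#include <numeric>
#include <string>
#include <vector>

std::vector<std::string> answer_queries(const std::string &s,
                                        const std::vector<long long> &ks)
{
    const int n = static_cast<int>(s.size());

    // Naive suffix array: sort start positions by comparing whole suffixes.
    std::vector<int> sa(n);
    std::iota(sa.begin(), sa.end(), 0);
    std::sort(sa.begin(), sa.end(), [&s](int a, int b) {
        return s.compare(a, std::string::npos, s, b, std::string::npos) < 0;
    });

    // Visit the queries in increasing k, remembering their original slots.
    std::vector<int> order(ks.size());
    std::iota(order.begin(), order.end(), 0);
    std::sort(order.begin(), order.end(),
              [&ks](int a, int b) { return ks[a] < ks[b]; });

    std::vector<std::string> ans(ks.size()); // stays "" if k is too large
    long long cnt = 0;                       // distinct substrings seen so far
    std::size_t ptr = 0;
    for (int i = 0; i < n && ptr < order.size(); ++i) {
        // h = LCP with the previous suffix in sorted order (the height value).
        int h = 0;
        if (i > 0)
            while (sa[i] + h < n && sa[i - 1] + h < n &&
                   s[sa[i] + h] == s[sa[i - 1] + h]) ++h;
        cnt += (n - sa[i]) - h;
        // The whole suffix is now the cnt-th substring; shorter prefixes
        // of it answer the queries with slightly smaller k.
        while (ptr < order.size() && ks[order[ptr]] <= cnt) {
            long long len = (n - sa[i]) - (cnt - ks[order[ptr]]);
            ans[order[ptr]] = s.substr(sa[i], static_cast<std::size_t>(len));
            ++ptr;
        }
    }
    return ans;
}
```

With s = "aaa" and queries 2 and 3 this returns "aa" and "aaa", matching the sample. Writing each answer into its original slot plays the role of the second sort by ind in the full program.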
#include <iostream>
#include <cstdio>
#include <string>
#include <cstring>
#include <algorithm>
using namespace std;
typedef long long LL;
const int maxn = 9e4 + 8; // remember that this sometimes needs to be doubled
int sa[maxn], height[maxn];
int _rank[maxn], t1[maxn], t2[maxn], c[maxn];
string s;
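// Builds sa[] with the doubling algorithm, then height[] in one linear pass.
// n is the string length plus one: the trailing '\0' of s acts as the sentinel.
// m is the alphabet size used by the first counting sort.
inline void get_sa(const int &n, int m)
{
    int i, k, *x = t1, *y = t2, p, j;
    // Counting sort of all suffixes by their first character.
    for(i = 0; i < m; i++) c[i] = 0;
    for(i = 0; i < n; i++) ++ c[x[i] = s[i]];
    for(i = 1; i < m; i++) c[i] += c[i - 1];
    for(i = n - 1; i >= 0; i--) sa[-- c[x[i]]] = i;
    for(k = 1; k <= n; k <<= 1){
        // y: suffixes ordered by their second key (the rank at offset k).
        p = 0;
        for(i = n - k; i < n; i++) y[p ++] = i;
        for(i = 0; i < n; i++) if(sa[i] >= k) y[p ++] = sa[i] - k;
        // Stable counting sort by the first key.
        for(i = 0; i < m; i++) c[i] = 0;
        for(i = 0; i < n; i++) ++ c[x[y[i]]];
        for(i = 1; i < m; i++) c[i] += c[i - 1];
        for(i = n - 1; i >= 0; i--) sa[--c[x[y[i]]]] = y[i];
        // Recompute ranks for prefixes of length 2k.
        swap(x, y), p = 1, x[sa[0]] = 0;
        for(i = 1; i < n; i++)
            x[sa[i]] = (y[sa[i-1]] == y[sa[i]] && y[sa[i-1]+k] == y[sa[i]+k]) ? p - 1 : p ++;
        if(p >= n) break; // all ranks are distinct, sorting is done
        m = p;
    }
    // height[r] = LCP of the suffixes ranked r - 1 and r.
    k = 0;
    for(i = 0; i < n; i++) _rank[sa[i]] = i;
    for(i = 0; i < n; i++){
        if(k) --k;
        if(!_rank[i]) continue;
        j = sa[_rank[i] - 1];
        while(s[i + k] == s[j + k]) k++;
        height[_rank[i]] = k;
    }
}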
// Debug helper: prints the suffixes in sorted order.
inline void print(const int &n)
{
    for(int i = 1; i <= n; i++){ // sa and height are indexed 1..n (sa[0] is the sentinel)
        //cout << i << " : " << _rank[sa[i]] << " " << sa[i] << endl;
        for(int j = sa[i]; j < n; j++){ // the string itself is indexed 0..n-1
            cout << s[j];
        }
        cout << endl;
    }
    cout << endl;
}
struct p{
    int q, ansi, anslen, ind;
} v[maxn];
inline bool cmpq(const p &a, const p &b)
{
    return a.q < b.q;
}
inline bool cmpind(const p &a, const p &b)
{
    return a.ind < b.ind;
}
int main()
{
#ifdef LOCAL
    freopen("14.in", "r", stdin);
    //freopen("14.out", "w", stdout);
#endif // LOCAL
    ios::sync_with_stdio(false); cin.tie(0);
    LL n, q, i, ptr = 0, cnt = 0;
    cin >> s >> q;
    n = s.size();
    get_sa(n+1, 256); // the trailing '\0' is included as a sentinel, so sa[0] == n
    //print(n);
    for(i = 0; i < q; i++){
        cin >> v[i].q;
        v[i].ind = i; // remember the input order
    }
    sort(v, v + q, cmpq);
    for(i = 1; i <= n; i++){
        cnt += (n - sa[i]) - height[i];
        // Check the bound first so v[ptr] is only read for real queries.
        while(ptr < q && v[ptr].q <= cnt){
            v[ptr].ansi = sa[i], v[ptr].anslen = (n - sa[i]) - (cnt - v[ptr].q);
            ptr++;
        }
    }
    sort(v, v + q, cmpind); // restore the input order
    for(i = 0; i < q; i++) cout << s.substr(v[i].ansi, v[i].anslen) << "\n";
    return 0;
}
------from ProLights