Codeforces 666E Forensic Examination (Suffix Automaton + Segment Tree Merging)

Mogician_Evian

Categories: suffix automaton, segment tree, binary lifting. Tags: segment tree merging, suffix automaton

Article link: https://blog.csdn.net/Mogician_Evian/article/details/79344642

E. Forensic Examination

The country of Reberland is the archenemy of Berland. Recently the authorities of Berland arrested a Reberlandian spy who tried to bring leaflets intended for agitational propaganda to Berland illegally. Most of the leaflets contain substrings of the Absolutely Inadmissible Swearword and maybe even the whole word.

Berland legal system uses the difficult algorithm in order to determine the guilt of the spy. The main part of this algorithm is the following procedure.

All the m leaflets that are brought by the spy are numbered from 1 to m. After that it’s needed to get the answer to q queries of the following kind: “In which leaflet in the segment of numbers [l, r] the substring of the Absolutely Inadmissible Swearword [p_l, p_r] occurs more often?”.

The expert wants you to automate that procedure because this time texts of leaflets are too long. Help him!

Input

The first line contains the string s (1 ≤ |s| ≤ 5·10^5), the Absolutely Inadmissible Swearword. The string s consists of only lowercase English letters.

The second line contains the only integer m (1 ≤ m ≤ 5·10^4), the number of texts of leaflets for expertise.

Each of the next m lines contains the only string t_i, the text of the i-th leaflet. The sum of lengths of all leaflet texts doesn’t exceed 5·10^4. The text of the leaflets consists of only lowercase English letters.

The next line contains integer q (1 ≤ q ≤ 5·10^5), the number of queries for expertise.

Finally, each of the last q lines contains four integers l, r, p_l, p_r (1 ≤ l ≤ r ≤ m, 1 ≤ p_l ≤ p_r ≤ |s|), where |s| is the length of the Absolutely Inadmissible Swearword.

Output

Print q lines. The i-th of them should contain two integers: the number of the text with the most occurrences and the number of occurrences of the substring [p_l, p_r] of the string s. If there are several text numbers print the smallest one.

input

suffixtree
3
suffixtreesareawesome
cartesiantreeisworsethansegmenttree
nyeeheeheee
2
1 2 1 10
1 3 9 10

output

1 1
3 4

Problem summary:
Given a string $s$ and $m$ strings $p_1, \dots, p_m$, answer $q$ queries. Each query asks in which of the strings $p_l, \dots, p_r$ the substring $s[x \dots y]$ of $s$ occurs the most times.
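For checking a fast solution on small inputs, a naive reference can be handy. The sketch below is not part of the original solution; the names `countOccurrences` and `bruteQuery` are made up for illustration, and it simply tries every start position, so it is only suitable for tiny tests.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Count (possibly overlapping) occurrences of pat in text by trying every start.
int countOccurrences(const std::string& text, const std::string& pat) {
    if (pat.empty() || pat.size() > text.size()) return 0;
    int cnt = 0;
    for (std::size_t i = 0; i + pat.size() <= text.size(); ++i)
        if (text.compare(i, pat.size(), pat) == 0) ++cnt;
    return cnt;
}

// Answer one query naively. l, r, x, y are 1-based and inclusive, as in the statement.
// Returns {leaflet index, occurrence count}; ties go to the smallest index,
// and a count of zero reports l.
std::pair<int, int> bruteQuery(const std::string& s, const std::vector<std::string>& p,
                               int l, int r, int x, int y) {
    std::string pat = s.substr(x - 1, y - x + 1);
    int bestId = l, bestCnt = 0;
    for (int i = l; i <= r; ++i) {
        int c = countOccurrences(p[i - 1], pat);
        if (c > bestCnt) { bestCnt = c; bestId = i; }  // strict '>' keeps the smallest index
    }
    return {bestId, bestCnt};
}
```

On the sample above, `bruteQuery(s, p, 1, 3, 9, 10)` looks for "ee" in all three leaflets and picks the third one.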

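To count occurrences of a substring, build a suffix automaton. Because there are several strings, the size of the $Right$ set has to be maintained separately for each of them, so build the suffix automaton of the string $s+@+p_1+@+\dots+p_m$. For every node, maintain a segment tree that records how many times the substrings represented by that node occur in each $p_i$; it is enough to insert into the segment tree directly each time a new $np$ node is added.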
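The idea of tagging each new state with its leaflet can be tried out at small scale. The following is a simplified sketch, not the solution from this post: it replaces the per-node segment tree with a `std::map` and pushes the counts up the suffix links in one pass, which is far too slow and memory-hungry for the real limits. The names `TaggedSam`, `extend`, `propagate`, `count` and `buildAll` are assumptions made for illustration.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <map>
#include <numeric>
#include <string>
#include <vector>

struct TaggedSam {
    struct State {
        int len = 0, link = -1;
        std::array<int, 27> next;   // 26 letters plus one separator symbol
        std::map<int, int> cnt;     // leaflet index -> number of end positions
        State() { next.fill(-1); }
    };
    std::vector<State> st;
    int last = 0;
    TaggedSam() : st(1) {}

    // Append symbol c; tag is the leaflet index (0 for s and for separators).
    void extend(int c, int tag) {
        int cur = (int)st.size();
        st.emplace_back();
        st[cur].len = st[last].len + 1;
        if (tag) st[cur].cnt[tag] = 1;  // this state owns one end position
        int p = last;
        while (p != -1 && st[p].next[c] == -1) { st[p].next[c] = cur; p = st[p].link; }
        if (p == -1) {
            st[cur].link = 0;
        } else {
            int q = st[p].next[c];
            if (st[q].len == st[p].len + 1) {
                st[cur].link = q;
            } else {
                State copy = st[q];     // clone transitions and link
                copy.len = st[p].len + 1;
                copy.cnt.clear();       // a clone owns no end position itself
                int cl = (int)st.size();
                st.push_back(copy);
                while (p != -1 && st[p].next[c] == q) { st[p].next[c] = cl; p = st[p].link; }
                st[q].link = st[cur].link = cl;
            }
        }
        last = cur;
    }

    // Add every state's counts to its suffix link, longest states first.
    void propagate() {
        std::vector<int> order(st.size());
        std::iota(order.begin(), order.end(), 0);
        std::sort(order.begin(), order.end(),
                  [&](int a, int b) { return st[a].len > st[b].len; });
        for (int v : order)
            if (st[v].link >= 0)
                for (const auto& kv : st[v].cnt) st[st[v].link].cnt[kv.first] += kv.second;
    }

    // Occurrences of pat (lowercase letters only) inside leaflet tag.
    int count(const std::string& pat, int tag) const {
        int v = 0;
        for (char ch : pat) { v = st[v].next[ch - 'a']; if (v == -1) return 0; }
        auto it = st[v].cnt.find(tag);
        return it == st[v].cnt.end() ? 0 : it->second;
    }
};

// Build the automaton of s + sep + p_1 + sep + ... + p_m and propagate the counts.
TaggedSam buildAll(const std::string& s, const std::vector<std::string>& p) {
    TaggedSam sam;
    for (char ch : s) sam.extend(ch - 'a', 0);
    for (std::size_t i = 0; i < p.size(); ++i) {
        sam.extend(26, 0);
        for (char ch : p[i]) sam.extend(ch - 'a', (int)i + 1);
    }
    sam.propagate();
    return sam;
}
```

Because a substring of $s$ never contains the separator, an occurrence that ends inside $p_i$ lies entirely inside $p_i$, so counting end positions per leaflet is enough.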

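Once the automaton is built, follow the $parent$ tree bottom-up and merge each child's segment tree into its parent. A persistent segment tree makes this possible.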
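A minimal sketch of such a non-destructive merge is given below, assuming a dynamically allocated tree where node 0 stands for "empty". The struct name `MergeTree` and its members are hypothetical. One difference from the listing in this post: when one side is empty, this sketch returns the other subtree as is instead of copying its root, which is only safe because no tree is ever modified in place.

```cpp
#include <vector>

struct MergeTree {
    struct Node { int ls = 0, rs = 0, mx = 0, id = 0; };
    std::vector<Node> t;
    MergeTree() : t(1) {}  // node 0 is the empty tree (all counts zero)

    int newNode() { t.emplace_back(); return (int)t.size() - 1; }

    // Recompute max and its index from the children; ties prefer the left child.
    void pull(int o) {
        int a = t[o].ls, b = t[o].rs;
        if (t[a].mx >= t[b].mx) { t[o].mx = t[a].mx; t[o].id = t[a].id; }
        else                    { t[o].mx = t[b].mx; t[o].id = t[b].id; }
    }

    // Persistent point update: returns a new root where position k grew by d (d > 0).
    int add(int p, int l, int r, int k, int d) {
        int o = newNode();
        t[o] = t[p];
        if (l == r) { t[o].mx += d; t[o].id = k; return o; }
        int mid = (l + r) / 2;
        if (k <= mid) { int c = add(t[o].ls, l, mid, k, d); t[o].ls = c; }
        else          { int c = add(t[o].rs, mid + 1, r, k, d); t[o].rs = c; }
        pull(o);
        return o;
    }

    // Merge a and b into fresh nodes; both inputs stay valid afterwards.
    int merge(int a, int b, int l, int r) {
        if (!a || !b) return a | b;  // share the surviving subtree unchanged
        int o = newNode();
        if (l == r) { t[o].mx = t[a].mx + t[b].mx; t[o].id = l; return o; }
        int mid = (l + r) / 2;
        int lc = merge(t[a].ls, t[b].ls, l, mid);
        int rc = merge(t[a].rs, t[b].rs, mid + 1, r);
        t[o].ls = lc; t[o].rs = rc;
        pull(o);
        return o;
    }
};
```

After `int c = st.merge(a, b, 1, m);` the roots `a` and `b` still describe the children's own counts, which is what later queries on those states rely on.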

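For the queries, record in advance the position in the automaton of every prefix of $s$. For each query, use binary lifting on the $parent$ tree to find the node corresponding to the substring, then query the maximum over the interval $[l,r]$ directly in that node's segment tree.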
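The binary lifting step can be sketched in isolation. The code below assumes the parent tree is given as a `parent` array with a sentinel node 0 (its own parent, length 0) and that `maxLen[v]` is the longest string length of state `v`; the struct name `Lifting` is made up for this example.

```cpp
#include <vector>

struct Lifting {
    int LOG;
    std::vector<std::vector<int>> up;  // up[j][v] = 2^j-th ancestor of v, 0 if none
    std::vector<int> len;              // len[v] = longest length of state v, len[0] = 0

    Lifting(const std::vector<int>& parent, const std::vector<int>& maxLen)
        : LOG(1), len(maxLen) {
        int n = (int)parent.size();
        while ((1 << LOG) < n) ++LOG;
        up.assign(LOG, std::vector<int>(n, 0));
        up[0] = parent;
        // Level j only depends on level j-1, so no particular vertex order is needed.
        for (int j = 1; j < LOG; ++j)
            for (int v = 0; v < n; ++v)
                up[j][v] = up[j - 1][up[j - 1][v]];
    }

    // Highest ancestor of v (possibly v itself) whose length is still >= need.
    // Lengths decrease strictly towards the root, so greedy jumps are valid.
    int find(int v, int need) const {
        for (int j = LOG - 1; j >= 0; --j)
            if (len[up[j][v]] >= need) v = up[j][v];
        return v;
    }
};
```

Starting from the state of the prefix $s[1 \dots y]$ and calling `find` with `need = y - x + 1` lands on the state that represents $s[x \dots y]$.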

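The total time complexity is $O(n\log^2 n)$. Note that merging segment trees must always allocate new nodes, because each child's tree is still queried on its own afterwards.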

This problem also has a pitfall: when the number of occurrences is $0$, output the left endpoint of the query directly.
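The zero case is easy to get wrong on a dynamically allocated tree, where missing subtrees mean "all zeros". Here is a small sketch of a range-maximum query that handles it, assuming node 0 is the empty tree; `MaxTree`, `add` and `query` are illustrative names, and unlike the solution's tree this one is updated in place to keep the example short.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

struct MaxTree {
    struct Node { int ls = 0, rs = 0, mx = 0, id = 0; };
    std::vector<Node> t;
    MaxTree() : t(1) {}  // node 0 is the empty tree

    // In-place point update (d > 0); returns the node covering [l, r].
    int add(int o, int l, int r, int k, int d) {
        if (!o) { t.emplace_back(); o = (int)t.size() - 1; }
        if (l == r) { t[o].mx += d; t[o].id = k; return o; }
        int mid = (l + r) / 2;
        if (k <= mid) { int c = add(t[o].ls, l, mid, k, d); t[o].ls = c; }
        else          { int c = add(t[o].rs, mid + 1, r, k, d); t[o].rs = c; }
        int a = t[o].ls, b = t[o].rs;
        if (t[a].mx >= t[b].mx) { t[o].mx = t[a].mx; t[o].id = t[a].id; }
        else                    { t[o].mx = t[b].mx; t[o].id = t[b].id; }
        return o;
    }

    // Returns {count, index} for [x, y]; [l, r] must overlap [x, y].
    std::pair<int, int> query(int o, int l, int r, int x, int y) const {
        // An all-zero subtree reports the leftmost index of its overlap with [x, y].
        if (t[o].mx == 0) return {0, std::max(l, x)};
        if (x <= l && r <= y) return {t[o].mx, t[o].id};
        int mid = (l + r) / 2;
        std::pair<int, int> best(-1, 0);
        if (x <= mid) best = query(t[o].ls, l, mid, x, y);
        if (y > mid) {
            std::pair<int, int> rt = query(t[o].rs, mid + 1, r, x, y);
            if (rt.first > best.first) best = rt;  // strict '>' keeps the smaller index on ties
        }
        return best;
    }
};
```

With counts of 1 at positions 2 and 4 in a tree over $[1,5]$, querying $[3,3]$ yields a count of 0 together with index 3, the left endpoint of the query.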

Code:

#include<stdio.h>
#include<iostream>
#include<algorithm>
#include<cstring>
#define N 1200000   // upper bound on automaton states, about 2*(|s|+sum|t_i|+m)
#define M 20000000  // size of the segment tree node pool
using namespace std;
typedef pair<int,int> par;
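namespace Seg
{
    // rt[v]: segment tree root of automaton state v; node 0 is the empty tree.
    // Max[p]: largest count below p; id[p]: smallest leaflet index reaching it.
    int rt[N],ls[M],rs[M],Max[M],id[M],tot;
    // Allocate a new node as a copy of p.
    int CP(int p)
    {
        int o=++tot;
        ls[o]=ls[p];
        rs[o]=rs[p];
        Max[o]=Max[p];
        id[o]=id[p];
        return o;
    }
    // Recompute Max/id of p from its children; ties go to the left child.
    void UD(int p)
    {
        Max[p]=max(Max[ls[p]],Max[rs[p]]);
        if(Max[ls[p]]<Max[rs[p]])id[p]=id[rs[p]];
        else id[p]=id[ls[p]];
    }
    // Persistent point update: add d at position k and return the new root.
    int ADD(int p,int l,int r,int k,int d)
    {
        int o=CP(p);
        if(l==r)return Max[o]+=d,id[o]=k,o;
        int mid=(l+r)>>1;
        if(k<=mid)ls[o]=ADD(ls[o],l,mid,k,d);
        else rs[o]=ADD(rs[o],mid+1,r,k,d);
        UD(o);return o;
    }
    // Maximum over [x,y] as (count, index). An all-zero subtree reports the
    // leftmost index of its overlap with [x,y], so a zero answer prints l.
    par GS(int p,int l,int r,int x,int y)
    {
        if(Max[p]==0)return par(0,max(l,x));
        if(x<=l&&y>=r)return par(Max[p],id[p]);
        int mid=(l+r)>>1;par t1=par(-1,0),t2=par(-1,0);
        if(x<=mid&&y>=l)t1=GS(ls[p],l,mid,x,y);
        if(x<=r&&y>mid)t2=GS(rs[p],mid+1,r,x,y);
        if(t1.first>=t2.first)return t1;
        return t2;
    }
    // Merge p1 and p2 into newly allocated nodes, leaving both inputs intact.
    int Merge(int p1,int p2,int l,int r)
    {
        if(!p1)return CP(p2);
        if(!p2)return CP(p1);
        int o=++tot,mid=(l+r)>>1;
        if(l==r)return Max[o]=Max[p1]+Max[p2],id[o]=l,o;
        ls[o]=Merge(ls[p1],ls[p2],l,mid);
        rs[o]=Merge(rs[p1],rs[p2],mid+1,r);
        UD(o);return o;
    }
}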
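char s[N];
// Adjacency list of the parent tree: LA = head, NE = next edge, EN = edge target.
// S is the highest binary lifting level.
int TOT,LA[N],NE[N],EN[N],S=21;
// son: transitions (26 is the separator), pra: suffix link, Max: longest length of a state,
// en[i]: state of the prefix s[0..i], fa[x][i]: 2^i-th ancestor of x in the parent tree.
int m,q,tot=1,rt=1,las=1,son[N][27],pra[N],Max[N],en[N],fa[N][22];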
void ADD(int x,int y)
{
    TOT++;
    EN[TOT]=y;
    NE[TOT]=LA[x];
    LA[x]=TOT;
}
int NP(int x)
{
    Max[++tot]=x;
    return tot;
}
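// Append character t; ty is the leaflet index of this position (0 for s and separators).
void Ins(int t,int ty)
{
    int p=las,np,q,nq;
    np=NP(Max[p]+1);
    // Record one end position for leaflet ty in the new state's tree.
    if(ty)Seg::rt[np]=Seg::ADD(Seg::rt[np],1,m,ty,1);
    while(p&&!son[p][t])son[p][t]=np,p=pra[p];
    if(!p)pra[np]=rt;
    else
    {
        q=son[p][t];
        if(Max[q]==Max[p]+1)pra[np]=q;
        else
        {
            // Clone q with the shorter length; the clone starts with an empty tree.
            nq=NP(Max[p]+1);
            memcpy(son[nq],son[q],sizeof(son[q]));
            pra[nq]=pra[q];
            pra[q]=pra[np]=nq;
            while(son[p][t]==q)son[p][t]=nq,p=pra[p];
        }
    }
    las=np;
}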
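// Fill the binary lifting table of x, then merge every child's tree into x (bottom-up).
void DFS(int x,int f)
{
    int i,y;fa[x][0]=f;
    for(i=1;i<=S;i++)fa[x][i]=fa[fa[x][i-1]][i-1];
    for(i=LA[x];i;i=NE[i])
    {
        DFS(EN[i],x);
        Seg::rt[x]=Seg::Merge(Seg::rt[x],Seg::rt[EN[i]],1,m);
    }
}
// Highest ancestor of x whose longest length is still at least le,
// i.e. the state that contains the suffix of length le.
int Find(int x,int le)
{
    for(int i=S;i>=0;i--)if(Max[fa[x][i]]>=le)x=fa[x][i];
    return x;
}
// Answer one query; x and y are 0-based positions in s.
void Query(int l,int r,int x,int y)
{
    int p=Find(en[y],y-x+1);
    par t=Seg::GS(Seg::rt[p],1,m,l,r);
    printf("%d %d\n",t.second,t.first);
}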
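int main()
{
    int i,j,L,l,r,x,y;
    scanf("%s\n%d",s,&m);
    L=strlen(s);
    // Insert s first and remember the state reached after each prefix.
    for(i=0;i<L;i++)Ins(s[i]-'a',0),en[i]=las;
    for(i=1;i<=m;i++)
    {
        scanf("\n%s",s);
        // A separator (character 26) goes before every leaflet; leaflet i is tagged with i.
        L=strlen(s);Ins(26,0);
        for(j=0;j<L;j++)Ins(s[j]-'a',i);
    }
    // Build the parent tree from the suffix links, then merge the trees bottom-up.
    for(i=1;i<=tot;i++)ADD(pra[i],i);
    DFS(1,0);
    scanf("%d",&q);
    while(q--)
    {
        scanf("%d%d%d%d",&l,&r,&x,&y);
        Query(l,r,x-1,y-1);   // convert to 0-based positions in s
    }
    return 0;
}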