AC Automaton

linjiayang2016

Published 2019-10-24 20:34:49

Article link: https://blog.csdn.net/linjiayang2016/article/details/102730926

Premise

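Suppose we are given several pattern strings S and several text strings T, and for each text we want the number of pattern occurrences in it. With a plain AC automaton the running time cannot be guaranteed: at every text position the matcher has to follow fail pointers repeatedly, so there is no guarantee that each node is reached only once. The worst-case time complexity is therefore $\text T(n)=\Theta\left(\sum|S|\times\sum |T|+\sum|T|\right)$.

However, we only need the total number of occurrences. In other words, every occurrence contributes the same amount to the answer no matter which pattern produced it, so we do not need to tell the patterns apart. We can therefore build the fail tree and, with a single depth-first search, compute for every node the number of end markers on the path from that node to the root. When accumulating the answer we simply add this precomputed number for the current node instead of following fail pointers, which lowers the time complexity to $\Theta\left(\sum|S|+\sum|T|\right)$, treating the alphabet size as a constant. The code below computes the same value during the BFS that builds the fail pointers, so no separate search is needed.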
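The precomputation described above can be sketched as a recurrence over the fail tree. The notation is shorthand introduced here for illustration: $d(u)$ is the end-marker count stored at node $u$, $f(u)$ is the number of end markers on the fail-tree path from $u$ to the root, and $q_i$ is the automaton state after reading the first $i$ characters of a text $T$ (the names $d$ and $f$ mirror the fields of the same names in the code).

$$
f(\text{root})=d(\text{root}),\qquad f(u)=d(u)+f(\operatorname{fail}(u)),\qquad \operatorname{ans}(T)=\sum_{i=1}^{|T|} f(q_i)
$$

Because $\operatorname{fail}(u)$ is always strictly shallower than $u$ in the trie, processing the nodes in BFS order guarantees that $f(\operatorname{fail}(u))$ is known before $f(u)$ is computed. As a small check, for the patterns a, aa, aaa the values of $f$ at those three nodes are 1, 2, 3, and the text aaaa visits them in the order a, aa, aaa, aaa, giving $1+2+3+3=9$ occurrences.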

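#include <cstring> // memset
#include <cstddef> // NULL

// One trie node. d: number of patterns ending exactly at this node;
// f: number of patterns ending at this node or anywhere on its fail chain.
struct node{
	static const int maxk=27; // alphabet size; a character c is mapped to c-'a'
	node *ch[maxk],*fail; int f,d;
	node(node *fl=NULL):fail(fl),f(0),d(0){memset(ch,0,sizeof ch);}
	void clear(){memset(ch,0,sizeof ch);fail=NULL;d=f=0;}
};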
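struct AC{
	static const int maxn=100010; // upper bound on the number of trie nodes (BFS queue size)
	node *q[maxn],*rt,*srt; // q: BFS queue, rt: root, srt: sentinel node above the root
	AC(){clear();}
	void clear(){
		// Every transition of srt leads to rt and rt fails to srt,
		// so build() needs no special case for the root.
		srt=new node();
		srt->fail=srt;
		rt=new node(srt);
		for(int i=0;i<node::maxk;i++)
			srt->ch[i]=rt;
	}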
	// Insert pattern s; v is added to its end marker (use v=1 to count occurrences).
	void insert(const char *s,int v){
		node *t=rt;
		for(;*s;s++){
			int x=*s-'a';
			if(!t->ch[x]) t->ch[x]=new node();
			t=t->ch[x];
		}
		t->d+=v;
	}
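	int ed; // number of nodes placed in q by the last build()
	// Build fail pointers, fill in missing transitions and compute f, all in one BFS.
	void build(node *Rt=NULL){
		if(!Rt) Rt=rt; // default: start from the root
		int st=0;ed=0;
		q[ed++]=Rt;
		while(st!=ed){
			node *t=q[st++];
			for(int i=0;i<node::maxk;i++)
				if(t->ch[i]){
					t->ch[i]->fail=t->fail->ch[i];
					q[ed++]=t->ch[i];
					// fail is shallower, so its f is already final:
					// accumulate end markers along the whole fail chain
					t->ch[i]->f=t->ch[i]->fail->f+t->ch[i]->d;
				} else t->ch[i]=t->fail->ch[i];
		}
	}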
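	// Total number of pattern occurrences in s; build() must have been called first.
	int match(const char *s){
		int ret=0;
		for(node *u=rt;*s;++s)
			u=u->ch[*s-'a'],ret+=u->f; // no fail jumps: f already covers the fail chain
		return ret;
	}
};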
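For readers who want to experiment with the counting idea on its own, here is a minimal self-contained sketch of the same technique using index-based arrays instead of pointers. It is not the author's code: the type name `CountingAutomaton`, the member names `go`, `fail`, `cnt` and the restriction to lowercase letters are all assumptions made for this illustration.

```cpp
#include <cassert>
#include <queue>
#include <string>
#include <vector>

// Hypothetical illustration type; node 0 is the root.
struct CountingAutomaton {
    static constexpr int K = 26;          // assumption: patterns and texts use 'a'..'z' only
    std::vector<std::vector<int>> go;     // transition table; 0 means "no child" before build()
    std::vector<int> fail;                // fail pointer of each node
    std::vector<long long> cnt;           // end markers at the node; after build(), along its fail chain

    CountingAutomaton() { new_node(); }

    int new_node() {
        go.push_back(std::vector<int>(K, 0));
        fail.push_back(0);
        cnt.push_back(0);
        return static_cast<int>(go.size()) - 1;
    }

    void insert(const std::string& s) {
        int u = 0;
        for (char c : s) {
            assert(c >= 'a' && c <= 'z');
            int x = c - 'a';
            if (!go[u][x]) {
                int v = new_node();       // create first: new_node() may reallocate go
                go[u][x] = v;
            }
            u = go[u][x];
        }
        ++cnt[u];
    }

    void build() {
        std::queue<int> q;
        for (int x = 0; x < K; ++x)
            if (go[0][x]) q.push(go[0][x]);   // depth-1 nodes fail to the root
        while (!q.empty()) {
            int u = q.front();
            q.pop();
            cnt[u] += cnt[fail[u]];           // fail[u] is shallower, so its count is final
            for (int x = 0; x < K; ++x) {
                int v = go[u][x];
                if (v) {
                    fail[v] = go[fail[u]][x];
                    q.push(v);
                } else {
                    go[u][x] = go[fail[u]][x]; // fill in the missing transition
                }
            }
        }
    }

    // Total number of pattern occurrences in t; call build() first.
    long long match(const std::string& t) const {
        long long total = 0;
        int u = 0;
        for (char c : t) {
            assert(c >= 'a' && c <= 'z');
            u = go[u][c - 'a'];
            total += cnt[u];                  // one addition per character, no fail jumps
        }
        return total;
    }
};
```

A typical use is to call `insert` once per pattern, then `build()` once, then `match` for each text. With the patterns he, she, his, hers and the text ushers, `match` counts she, he and hers.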