AC自动机典例集

AC自动机模板代码:

#include<iostream>
#include<cctype>
#include<cstring>
#include<cstdio>
#include<string>
#include<queue>
#include<algorithm>

using namespace std;
const int N = 1e6 + 6;
int n;

namespace AC {
int tr[N][26], tot;
int e[N], fail[N];
void insert(char *s) {
  int u = 0;
  for (int i = 1; s[i]; i++) {
    if (!tr[u][s[i] - 'a']) tr[u][s[i] - 'a'] = ++tot;
    u = tr[u][s[i] - 'a'];
  }
  e[u]++;
}
queue<int> q;
void build() {
  for (int i = 0; i < 26; i++)
    if (tr[0][i]) q.push(tr[0][i]);
  while (q.size()) {
    int u = q.front();
    q.pop();
    for (int i = 0; i < 26; i++) {
      if (tr[u][i])
        fail[tr[u][i]] = tr[fail[u]][i], q.push(tr[u][i]);
      else
        tr[u][i] = tr[fail[u]][i];
    }
  }
}
int query(char *t) {
  int u = 0, res = 0;
  for (int i = 1; t[i]; i++) {
    u = tr[u][t[i] - 'a'];  // 转移
    for (int j = u; j && e[j] != -1; j = fail[j]) {
      res += e[j], e[j] = -1;
    }
  }
  return res;
}
}  // namespace AC

char s[N];
int main() {
  scanf("%d", &n);
  for (int i = 1; i <= n; i++) scanf("%s", s + 1), AC::insert(s);
  scanf("%s", s + 1);
  AC::build();
  printf("%d", AC::query(s));
  return 0;
}
/*
6
he
her
hers
i
she
his
ushersheishis
0
*/

Family View(HDU-5880)

Description

Steam is a digital distribution platform developed by Valve Corporation offering digital rights management (DRM), multiplayer gaming and social networking services. A family view can help you to prevent your children access to some content which are not suitable for them.

Take an MMORPG game as an example, given a sentence T, and a list of forbidden words {P}, your job is to use ‘*’ to subsititute all the characters, which is a part of the substring matched with at least one forbidden word in the list (case-insensitive).

For example, T is: “I love Beijing’s Tiananmen, the sun rises over Tiananmen. Our great leader Chairman Mao, he leades us marching on.”

And {P} is: {“tiananmen”, “eat”}

The result should be: “I love Beijing’s *********, the sun rises over ******. Our gr leader Chairman Mao, he leades us marching on.”

Input

The first line contains the number of test cases. For each test case:
The first line contains an integer n, represneting the size of the forbidden words list P. Each line of the next n lines contains a forbidden words Pi (1≤|Pi|≤1000000,∑|Pi|≤1000000) where Pi only contains lowercase letters.

The last line contains a string T (|T|≤1000000).

Output

For each case output the sentence in a line.

Sample Input

1
3
trump
ri
o
Donald John Trump (born June 14, 1946) is an American businessman, television personality, author, politician, and the Republican Party nominee for President of the United States in the 2016 election. He is chairman of The Trump Organization, which is the principal holding company for his real estate ventures and other business interests.

Sample Output

D*nald J*hn ***** (b*rn June 14, 1946) is an Ame**can businessman, televisi*n pers*nality, auth*r, p*litician, and the Republican Party n*minee f*r President *f the United States in the 2016 electi*n. He is chairman *f The ***** *rganizati*n, which is the p**ncipal h*lding c*mpany f*r his real estate ventures and *ther business interests.

思路:

大致思路是根据样例提供的子串构建字典树上AC自动机,需要用数组记录子串出现的最后一个节点及其所对应的子串的长度。之后用被检测串匹配这些子串,进而用*号来修改母串即可
值得注意的细节是用结构体或者namespacce来实现,之后是fail指针一定要逐个赋值,

AC代码:

#include<iostream>
#include<cctype>
#include<cstring>
#include<cstdio>
#include<string>
#include<queue>
#include<algorithm>

using namespace std;
const int maxn = 1000010;

struct node
{
	int fail[maxn], trie[maxn][26], e[maxn];
	int root, tot;

	int newnode()
	{
		for (int i = 0; i < 26; ++i)trie[tot][i] = -1;
		tot++;
		return tot - 1;
	}
	void init()
	{
		tot = 0;
		memset(e, 0, sizeof e);
		root = newnode();
	}
	void insert(char s[])
	{
		int p = root, len = strlen(s);
		for (int i = 0; i < len; ++i) {
			int c = s[i] - 'a';
			if (trie[p][c] == -1)
				trie[p][c] = newnode();
			p = trie[p][c];
		}
		e[p] = len;
	}
	void build()
	{
		queue<int>q;
		int i;
		for (i = 0; i < 26; ++i) {
			if (trie[root][i] != -1) {
				fail[trie[root][i]] = root;
				q.push(trie[root][i]);
			}
			else
				trie[root][i] = root;
		}
		while (!q.empty()) {
			int now = q.front();
			q.pop();
			for (i = 0; i < 26; ++i) {
				if (~trie[now][i]) {
					fail[trie[now][i]] = trie[fail[now]][i];
					q.push(trie[now][i]);
				}
				else
					trie[now][i] = trie[fail[now]][i];
			}
		}
	}
	void query(char s[])
	{
		int len = strlen(s);
		int i, p = root, j, k;
		for (i = 0; i < len; ++i) {
			int c = tolower(s[i]) - 'a';
			p = trie[p][c];
			for (j = p; j; j = fail[j]) {
				if (e[j]) {
					for (k = i; k > i - e[j]; k--)
						s[k] = '*';
				}
			}
		}
	}
}AC;


char str[maxn];
int main()
{
	ios::sync_with_stdio(0);
	cin.tie(0); cout.tie(0);
	int T, n;
	cin >> T;
	while (T--)
	{
		AC.init();
		cin >> n;
		while (n--)
		{
			cin >> str;
			AC.insert(str);
		}
		AC.build();
		while (getchar() != '\n');
		cin.getline(str, maxn);
		AC.query(str);
		cout << str << endl;
	}
	return 0;
}
/*
2
3
trump
ri
o
Donald John Trump (born June 14, 1946) is an American businessman, television personality, author, politician, and the Republican Party nominee for President of the United States in the 2016 election. He is chairman of The Trump Organization, which is the principal holding company for his real estate ventures and other business interests.
4
hers
his
she
is
ushersheishis
*/
  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
Python AC自动机是一个用于字符串匹配的算法,它可以高效地在一段文本中查找多个预定义的模式。它的实现可以使用多种库,其中包括ac自动机python和ahocorasick-python。 ac自动机python是一个对标准的ac自动机算法进行了完善和优化的实现,适用于主流的Python发行版,包括Python2和Python3。它提供了更准确的结果,并且可以通过pip进行安装,具体的安装方法可以参考官方文档或者使用pip install命令进行安装。 ahocorasick-python是另一个实现AC自动机的库,它也可以用于Python2和Python3。你可以通过官方网站或者GitHub源码获取更多关于该库的信息和安装指南。 对于AC自动机的使用,一个常见的例子是在一段包含m个字符的文章中查找n个单词出现的次数。要了解AC自动机,需要有关于模式树(字典树)Trie和KMP模式匹配算法的基础知识。AC自动机的算法包括三个步骤:构造一棵Trie树,构造失败指针和模式匹配过程。在构造好AC自动机后,可以使用它来快速地在文本中查找预定义的模式,并统计它们的出现次数。<span class="em">1</span><span class="em">2</span><span class="em">3</span> #### 引用[.reference_title] - *1* [ahocorasick-python:AC自动机python的实现,并进行了优化。 主要修复了 查询不准确的问题](https://download.csdn.net/download/weixin_42122986/18825869)[target="_blank" data-report-click={"spm":"1018.2226.3001.9630","extra":{"utm_source":"vip_chatgpt_common_search_pc_result","utm_medium":"distribute.pc_search_result.none-task-cask-2~all~insert_cask~default-1-null.142^v93^chatsearchT3_2"}}] [.reference_item style="max-width: 50%"] - *2* *3* [Python实现多模匹配——AC自动机](https://blog.csdn.net/zichen_ziqi/article/details/104246446)[target="_blank" data-report-click={"spm":"1018.2226.3001.9630","extra":{"utm_source":"vip_chatgpt_common_search_pc_result","utm_medium":"distribute.pc_search_result.none-task-cask-2~all~insert_cask~default-1-null.142^v93^chatsearchT3_2"}}] [.reference_item style="max-width: 50%"] [ .reference_list ]
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值