Problem
Given a set of pattern strings and a text, you have to find whether any of the patterns is a substring of the
text. If any of the pattern strings can be found in the text, then print ‘yes’, otherwise ‘no’ (without quotes).
But, unfortunately, that's not what is asked here.
The problem described above requires an input file generator. The generator generates a text of
length L by choosing L characters randomly. The probability of choosing each character is given a priori
and is independent of the other choices.
Now, given a set of patterns, calculate the probability of a valid program generating “no”.
Input
The first line contains an integer T, the number of test cases. Each case starts with an integer K, the
number of pattern strings. The next K lines each contain a pattern string. These are followed by an integer N, the
number of valid characters. The next N lines each contain a character and the probability of selecting that
character, pi. Then comes an integer L, the length of the generated text. The generated text can consist
only of the valid characters given above.
There will be a blank line after each test case.
Output
For each test case, output the test case number and the probability of getting a “no”.
Constraints:
• T ≤ 50
• K ≤ 20
• Length of each pattern string is between 1 and 20
• Each pattern string consists of only alphanumeric characters (a to z, A to Z, 0 to 9)
• Valid characters are all alphanumeric characters
• ∑pi = 1
• L ≤ 100
Sample Input
2
1
a
2
a 0.5
b 0.5
2
2
ab
ab
2
a 0.2
b 0.8
2
Sample Output
Case #1: 0.250000
Case #2: 0.840000
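Problem summary
Given k pattern strings and n characters, where each character has its own probability of appearing.
A string of length L is built from these n characters; find the probability that it contains none of the k patterns.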
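Idea
This looks like a DP problem.
My first thought was to let dp[i] be the probability that positions i through L of the string can be filled in validly.
But i alone is not enough as a state: whether the rest is valid also depends on the few characters just before position i.
So I turned to a data structure: the trie.
Modify the AC automaton (Aho-Corasick) so that flag records whether the current node, its fail, its fail's fail, and so on, is the end of some pattern.
The strings along a node's fail chain are all suffixes of the string spelled from the root to that node. So even if the node itself is not the end of a pattern, if a node on its fail chain is, then extending the text built so far into this node is not allowed either.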
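A small case shows why the flag has to be inherited along fail links. With patterns "abcd" and "bc", the node for "abc" ends no pattern, yet its fail node is "bc", which does, so any text that reaches "abc" already contains "bc". The sketch below is an illustration under simplifying assumptions: it is array-based, handles only 'a' to 'z', and the names `Automaton`, `insert`, `build`, `node`, and `bad` are hypothetical rather than taken from the solution further down.

```cpp
#include <queue>
#include <string>
#include <vector>

// Hypothetical array-based Aho-Corasick automaton over 'a'..'z' only
// (the original solution uses pointers and 62 alphanumeric characters).
struct Automaton {
    std::vector<std::vector<int>> next;  // trie edges, -1 when absent
    std::vector<int> fail;               // failure links
    std::vector<bool> bad;               // a pattern ends here or on the fail chain

    Automaton() { newNode(); }  // node 0 is the root

    int newNode() {
        next.emplace_back(26, -1);
        fail.push_back(0);
        bad.push_back(false);
        return (int)next.size() - 1;
    }

    void insert(const std::string& s) {
        int u = 0;
        for (char ch : s) {
            int c = ch - 'a';
            if (next[u][c] < 0) { int v = newNode(); next[u][c] = v; }
            u = next[u][c];
        }
        bad[u] = true;
    }

    // BFS from the root. A node's fail link is always shallower, so its
    // `bad` flag is already final when the node inherits it.
    void build() {
        std::queue<int> q;
        q.push(0);
        while (!q.empty()) {
            int u = q.front(); q.pop();
            for (int c = 0; c < 26; ++c) {
                int v = next[u][c];
                if (v < 0) continue;
                if (u == 0) {
                    fail[v] = 0;  // depth-1 nodes fall back to the root
                } else {
                    int f = fail[u];
                    while (f != 0 && next[f][c] < 0) f = fail[f];
                    fail[v] = next[f][c] >= 0 ? next[f][c] : 0;
                }
                bad[v] = bad[v] || bad[fail[v]];
                q.push(v);
            }
        }
    }

    // Follows trie edges only; returns -1 if `s` is not a path in the trie.
    int node(const std::string& s) const {
        int u = 0;
        for (char ch : s) { u = next[u][ch - 'a']; if (u < 0) return -1; }
        return u;
    }
};
```

After inserting "abcd" and "bc" and calling `build()`, `bad[node("abc")]` is true while `bad[node("ab")]` is false, because only the former has a pattern end ("bc") on its fail chain.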
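Then run a memoized search whose state is the current node together with the number of characters still to be generated.
Each transition enumerates every possible next character and combines the results with the law of total probability.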
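The search can be written as a recurrence. The notation here is introduced for illustration only: f(u, d) is the probability that d further characters can be generated from node u without reaching a flagged node, δ(u, c) is the node reached from u on character c after following fail links, and p_c is the probability of character c.

```latex
f(u, 0) = 1, \qquad
f(u, d) = \sum_{\substack{c \,:\, p_c > 0 \\ \mathrm{flag}(\delta(u, c)) = 0}}
          p_c \, f\bigl(\delta(u, c),\, d - 1\bigr) \quad (d \ge 1)
```

The answer is f(root, L). In the first sample the only unflagged move from the root is 'b', which returns to the root, so f(root, 2) = 0.5 · f(root, 1) = 0.5 · 0.5 · f(root, 0) = 0.25.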
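Correctness
The argument mirrors ordinary AC automaton matching.
At every step of the memoized search, the current node is the longest suffix of the text built so far that is present in the trie.
If appending some character makes the text invalid, this can be detected by following the fail links repeatedly, which is exactly what the propagated flag records.
If appending the character is allowed, the search continues from the node for the longest suffix of the new text that is present in the trie.
Enumerating the next character one by one covers every case exactly once, and the state has no aftereffect: the future depends only on the current node and the number of characters left.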
Code
(Note: at first I actually forgot to call getFail() in main, which nearly drove me crazy...)
#include<cstdio>
#include<iostream>
#include<cstring>
using namespace std;
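// One trie / automaton node. Children are indexed by get(): 0-9, A-Z, a-z.
struct trie{
    trie *ch[63],*fail;
    int flag,id;   // flag: a pattern ends here or somewhere on the fail chain
    void clear(){
        flag=0;
        fail=NULL;
        for(int i=0;i<63;i++)
            ch[i]=NULL;
    }
}pool[500],*root;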
int cnt,tot;
// Map an alphanumeric character to an index in [0, 61].
int get(char s){
    if(s<='9') return s-'0';
    else if(s<='Z') return s-'A'+10;
    return s-'a'+36;
}
// Insert one pattern into the trie and mark its last node.
void add(char s[]){
    int len=strlen(s),id;
    trie *p=root;
    for(int i=0;i<len;i++){
        id=get(s[i]);
        if(!p->ch[id]){
            pool[++cnt].clear();
            p->ch[id]=&pool[cnt];
            p->ch[id]->id=++tot;
        }
        p=p->ch[id];
    }
    p->flag=1;
}
trie *q[500];
// BFS over the trie: set each node's fail link and inherit flag from it.
void getFail(){
    int head=0,tail=0;
    trie *p,*tmp;
    q[tail++]=root;
    while(head<tail){
        tmp=q[head++];
        for(int i=0;i<63;i++) {
            if(!tmp->ch[i]) continue;
            p=tmp->fail;
            while(p && !p->ch[i])
                p=p->fail;
            tmp->ch[i]->fail= p?p->ch[i]:root;
            // the fail node is shallower, so its flag is already final
            tmp->ch[i]->flag |= tmp->ch[i]->fail->flag;
            q[tail++]=tmp->ch[i];
        }
    }
}
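// vis caches dfs results shifted by +1.0, so that 0 always means "not computed",
// even when the probability itself is exactly 0.
double vis[500][105],pr[65];
// Probability that dep more characters can be appended from node p
// without ever reaching a flagged node.
double dfs(trie *p,int dep){
    if(vis[p->id][dep]) return vis[p->id][dep]-1.0;
    if(dep==0) { vis[p->id][dep]=2.0; return 1.0; }
    double ret=0;
    for(int i=0;i<63;i++)
        if(pr[i]!=0.0){
            trie *tmp=p;
            while(tmp && !tmp->ch[i]) tmp=tmp->fail;
            tmp= tmp? tmp->ch[i] : root;
            if(!tmp->flag) ret+=dfs(tmp,dep-1)*pr[i];
        }
    vis[p->id][dep]=ret+1.0;
    return ret;
}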
int main()
{
    int T,K,L,n,i,j;
    char s[25];
    scanf("%d",&T);
    root=&pool[++cnt];
    for(j=0;j<T;j++){
        // reset per-test state: probabilities, memo table, node pool
        memset(pr,0,sizeof(pr));
        memset(vis,0,sizeof(vis));
        cnt=1; tot=0;
        pool[cnt].clear();
        scanf("%d",&K);
        for(i=0;i<K;i++)
            scanf("%s",s),add(s);
        scanf("%d",&n);
        getFail();
        for(i=0;i<n;i++) {
            scanf("%s",s);
            scanf("%lf",&pr[get(s[0])]);
        }
        scanf("%d",&L);
        printf("Case #%d: %lf\n",j+1,dfs(root,L));
    }
    return 0;
}