POJ - 3691 DNA repair

Time Limit: 2000MS Memory Limit: 65536KB 64bit IO Format: %I64d & %I64u

Description

Biologists finally invent techniques of repairing DNA that contains segments causing kinds of inherited diseases. For the sake of simplicity, a DNA is represented as a string containing characters 'A', 'G' , 'C' and 'T'. The repairing techniques are simply to change some characters to eliminate all segments causing diseases. For example, we can repair a DNA "AAGCAG" to "AGGCAC" to eliminate the initial causing disease segments "AAG", "AGC" and "CAG" by changing two characters. Note that the repaired DNA can still contain only characters 'A', 'G', 'C' and 'T'.

You are to help the biologists repair a DNA by changing the least number of characters.

Input

The input consists of multiple test cases. Each test case starts with a line containing one integer N (1 ≤ N ≤ 50), which is the number of DNA segments causing inherited diseases.
The following N lines give N non-empty strings of length not greater than 20 containing only characters in "AGCT", which are the DNA segments causing inherited diseases.
The last line of the test case is a non-empty string of length not greater than 1000 containing only characters in "AGCT", which is the DNA to be repaired.

The last test case is followed by a line containing a single zero.

Output

For each test case, print a line containing the test case number (beginning with 1) followed by the number of characters which need to be changed. If it's impossible to repair the given DNA, print -1.

Sample Input

2
AAA
AAG
AAAG
2
A
TG
TGAATG
4
A
G
C
T
AGT
0

Sample Output

Case 1: 1
Case 2: 4
Case 3: -1




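1. Building the trie

This is clearly a search over a trie, but the bare trie underlying an AC automaton cannot be used as-is: a bare trie only avoids missing matches if the search is restarted from every position of the text. For example, build the trie from

            A-B-C

            B-C

When searching the string "ABCD" for all patterns, starting only from the first position walks straight to the C on A-B-C and overlooks B-C. Restarting the search from every position avoids this, but that is too slow for this problem.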
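The missed match can be reproduced with a minimal sketch of a bare trie. The names `BareTrie`, `matchesFrom` and `matchesAll` are hypothetical and introduced only for this illustration; the example uses the A-B-C / B-C patterns from the text rather than DNA letters.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical bare trie (no fail links), only to illustrate the problem.
struct BareTrie {
    std::vector<std::map<char, int>> next;  // next[node][letter] -> child
    std::vector<bool> isEnd;                // true if a pattern ends here

    BareTrie() : next(1), isEnd(1, false) {}  // node 0 is the root

    void add(const std::string& pattern) {
        int now = 0;
        for (char c : pattern) {
            auto it = next[now].find(c);
            if (it == next[now].end()) {
                int id = static_cast<int>(next.size());
                next.emplace_back();
                isEnd.push_back(false);
                next[now][c] = id;
                now = id;
            } else {
                now = it->second;
            }
        }
        isEnd[now] = true;
    }

    // Walk down from the root starting at text[start]; stop at a dead end.
    int matchesFrom(const std::string& text, std::size_t start) const {
        int now = 0, found = 0;
        for (std::size_t i = start; i < text.size(); ++i) {
            auto it = next[now].find(text[i]);
            if (it == next[now].end()) break;
            now = it->second;
            if (isEnd[now]) ++found;
        }
        return found;
    }

    // Restart from every position: correct, but O(|text| * max pattern length).
    int matchesAll(const std::string& text) const {
        int found = 0;
        for (std::size_t i = 0; i < text.size(); ++i) found += matchesFrom(text, i);
        return found;
    }
};
```

With the patterns "ABC" and "BC" added, `matchesFrom("ABCD", 0)` sees only "ABC", while `matchesAll("ABCD")` also finds "BC" at the cost of restarting the walk at every position.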


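This can be implemented as follows. While the fail array is being computed (the fail array itself is not really needed later on), once fail[B] on A-B-C is known, the B-C branch is effectively connected to the B on A-B-C: whenever a node lacks a child, that child pointer is redirected to the corresponding child of its fail node, i.e.

             if (!tr[now].ch[i])

                       tr[now].ch[i]=tr[fail[now]].ch[i];

The flag must be updated as well: when fail[C] on A-B-C is pointed at the C on B-C, the flag of node C on A-B-C should be increased by the flag of the C on B-C, i.e. +1.

With this in place, a lookup only ever moves forward, and a single pass counts how many disease segments occur in the DNA.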
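The construction described above can be sketched as a small self-contained class. This is an illustration under assumptions, not the listing further down: the names `Automaton`, `build` and `countMatches` are hypothetical, and the letter order "ATGC" simply mirrors the 0..3 encoding used in the full program.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <string>
#include <vector>

// Hypothetical sketch: trie plus fail links, with missing children redirected
// and flags accumulated along fail links, as described in the text.
struct Automaton {
    std::vector<std::vector<int>> ch;  // ch[node][letter], 0 means "no child yet"
    std::vector<int> fail, flag;       // flag[node] = number of patterns ending here

    Automaton() { newNode(); }         // node 0 is the root

    int newNode() {
        ch.push_back(std::vector<int>(4, 0));
        fail.push_back(0);
        flag.push_back(0);
        return static_cast<int>(ch.size()) - 1;
    }

    static int code(char c) {          // A, T, G, C -> 0..3
        const std::string letters = "ATGC";
        std::size_t p = letters.find(c);
        assert(p != std::string::npos);
        return static_cast<int>(p);
    }

    void add(const std::string& pattern) {
        int now = 0;
        for (char c : pattern) {
            int k = code(c);
            if (!ch[now][k]) { int id = newNode(); ch[now][k] = id; }
            now = ch[now][k];
        }
        ++flag[now];
    }

    void build() {
        std::queue<int> q;
        for (int k = 0; k < 4; ++k)
            if (ch[0][k]) q.push(ch[0][k]);        // first level keeps fail = 0
        while (!q.empty()) {
            int now = q.front(); q.pop();
            for (int k = 0; k < 4; ++k) {
                int child = ch[now][k];
                if (child) {
                    fail[child] = ch[fail[now]][k];
                    flag[child] += flag[fail[child]];  // inherit shorter matches
                    q.push(child);
                } else {
                    ch[now][k] = ch[fail[now]][k];     // redirect the missing child
                }
            }
        }
    }

    // One forward pass; counts every occurrence of every pattern.
    int countMatches(const std::string& text) const {
        int now = 0, total = 0;
        for (char c : text) {
            now = ch[now][code(c)];
            total += flag[now];
        }
        return total;
    }
};
```

For the patterns "AGC" and "GC" (the DNA-letter analogue of A-B-C and B-C), a single walk over "AGCT" reports both matches, because the node for "AGC" has inherited the flag of the node for "GC".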


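2. The DP

Because the DNA can be modified, the trie should be complete, i.e. every node has 4 children: each child either is a real trie node that is part of a disease segment, or is an edge that the BFS linked to another branch.

    a. If the next node has flag!=0, it ends a disease segment, so the transition is not allowed; just continue.

    b. If the next node has flag=0, the transition may update the DP. Since any character of the DNA may be changed, all four letters are tried at every step, and only a node with flag!=0 blocks a transition.

See the code for the exact DP recurrence; it is straightforward.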
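Written out, the recurrence might be stated as below. The notation is introduced here for illustration and is not the author's: $s_i$ is the $i$-th character of the DNA (0-indexed), $L$ its length, $\delta(j,k)$ the node reached from node $j$ on letter $k$ (that is, `tr[j].ch[k]` after the BFS), and $[\cdot]$ is 1 when the condition holds and 0 otherwise. In the code, $\infty$ is represented by 9999.

```latex
\[
dp[0][0] = 0, \qquad dp[0][j] = \infty \quad (j \neq 0)
\]
\[
dp[i+1][\delta(j,k)] = \min\bigl(dp[i+1][\delta(j,k)],\; dp[i][j] + [s_i \neq k]\bigr)
\quad \text{for each letter } k \text{ with } \mathrm{flag}[\delta(j,k)] = 0
\]
\[
\text{answer} = \min_{j} dp[L][j], \quad \text{or } -1 \text{ if that minimum is } \infty
\]
```

Here $dp[i][j]$ is the minimum number of changed characters such that the first $i$ characters have been processed, the automaton is at node $j$, and no flagged node has been entered.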












#include<iostream>
#include<algorithm>
#include<stdio.h>
#include<queue>
#include<string.h>
using namespace std;
char t[25],s[3000];
queue<int>q;
struct node{
    int ch[5];
    void init()//reset all child pointers
    {
        for (int w=0;w<=3;w++)
        ch[w]=0;
    }
}tr[1050];//standard trie storage: at most 50*20 nodes plus the root
int T,n,ans,tot,fail[5000],flag[5000],vis[5000];
void add(char x[])
{
    int now=0;
    int len=strlen(x);
    for (int i=0;i<len;i++)
    {
        int tmp=x[i]-'0';
        if (!tr[now].ch[tmp])
        {
            tot++,tr[now].ch[tmp]=tot;
            //there are multiple test cases, so reset flag, fail and the children whenever a new node is created
            flag[tot]=0;
            fail[tot]=0;
            vis[tot]=0;
            tr[tot].init();
        }
        now=tr[now].ch[tmp];
    }
    flag[now]++;
}
inline void bfs()
{
    for (int i=0;i<=3;i++)
    if (tr[0].ch[i]) q.push(tr[0].ch[i]);
    //push the first level up front, because the fail of every first-level node must be 0
    //if they are not pushed here, the loop below goes wrong
    while(!q.empty())
    {
        int now=q.front();q.pop();
        for (int i=0;i<4;i++)
        if (tr[now].ch[i])
        {
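            fail[tr[now].ch[i]]=tr[fail[now]].ch[i];
            flag[tr[now].ch[i]]+=flag[fail[tr[now].ch[i]]];
            //the step mentioned above: a node inherits the flag of its fail node (try tracing it by hand)
            //as for the trie-graph optimization used here, read the blog post I mentioned and think through why no while() loop that keeps following fail is needed to check whether ch[i] exists
            q.push(tr[now].ch[i]);
        }
        else
        tr[now].ch[i]=tr[fail[now]].ch[i];//note how flag is handled here
        //without a child, a lookup would have to consult tr[fail[now]].ch[i]; building an edge between the two nodes up front means lookups never need to check whether a child exists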

    }
}

int dp[2050][2050];//dp[i][j]: fewest changes after i characters, standing on node j
int main()
{
    int a=0;
    int n;
    while(cin>>n)//also stops cleanly if the input ends without the final 0
    {
        if (n==0) return 0;
        memset(vis,0,sizeof(vis));
        memset(flag,0,sizeof(flag));
        tot=0;tr[0].init();

        for (int i=1;i<=n;i++)
        {
            cin>>t;
            int len=strlen(t);
            //encode A,T,G,C as '0'..'3' so that add() can index children directly
            for (int p=0;p<len;p++)
                if (t[p]=='A')
                    t[p]='0';
                else if (t[p]=='T')
                    t[p]='1';
                else if (t[p]=='G')
                    t[p]='2';
                else if (t[p]=='C')
                    t[p]='3';
            add(t);
        }
        cin>>s;
        int len=strlen(s);
        for (int i=0;i<len;i++)
            if (s[i]=='A')
                s[i]='0';
            else if (s[i]=='T')
                s[i]='1';
            else if (s[i]=='G')
                s[i]='2';
            else if (s[i]=='C')
                s[i]='3';

        bfs();
        //9999 acts as infinity: the real answer never exceeds 1000
        for (int i=0;i<=2000;i++)
            for (int j=0;j<=2000;j++)
                dp[i][j]=9999;
        dp[0][0]=0;
        ans=9999;
        for (int i=0;i<len;i++)
            for (int j=0;j<=tot;j++)
                for (int k=0;k<=3;k++)
                {
                    //never step onto a node where a disease segment ends
                    if (flag[tr[j].ch[k]]!=0)    continue;
                    if (s[i]==k+'0')
                        dp[i+1][tr[j].ch[k]]=min(dp[i+1][tr[j].ch[k]],dp[i][j]);//keep the character
                    else
                        dp[i+1][tr[j].ch[k]]=min(dp[i+1][tr[j].ch[k]],dp[i][j]+1);//change it to k
                }

        for (int i=0;i<=tot;i++)
            ans=min(ans,dp[len][i]);
        a++;
        cout<<"Case "<<a<<": ";
        if (ans==9999)  cout<<-1<<endl;
        else cout<<ans<<endl;
    }
}





