HDU 1560 DNA sequence (state compression + search)

JOY酷酷

Article link: https://blog.csdn.net/baidu_27438681/article/details/52238692

Description
The twenty-first century is a biology-technology developing century. We know that a gene is made of DNA. The nucleotide bases from which DNA is built are A(adenine), C(cytosine), G(guanine), and T(thymine). Finding the longest common subsequence between DNA/Protein sequences is one of the basic problems in modern computational molecular biology. But this problem is a little different. Given several DNA sequences, you are asked to make a shortest sequence from them so that each of the given sequence is the subsequence of it.

For example, given "ACGT","ATGC","CGTT" and "CAGT", you can make a sequence in the following way. It is the shortest but may be not the only one.

Input
The first line is the test case number t. Then t test cases follow. In each case, the first line is an integer n ( 1<=n<=8 ) represents number of the DNA sequences. The following k lines contain the k sequences, one per line. Assuming that the length of any sequence is between 1 and 5.
Output
For each test case, print a line containing the length of the shortest sequence that can be made from these sequences.
Sample Input
1
4
ACGT
ATGC
CGTT
CAGT
Sample Output
8

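Problem: given N strings, each of length at most 5, construct a string such that every one of the N strings is a subsequence of it. Output the minimum length of such a string.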
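To make the requirement concrete, here is a minimal sketch of the subsequence condition. The helper names `isSubsequence`, `coversAll`, and `checkSample` are hypothetical and do not appear in the original solution, and the candidate string "ACAGTGCT" is just one string that happens to contain all four sample sequences.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: true when `pattern` can be obtained from `text`
// by deleting zero or more characters (order preserved).
bool isSubsequence(const std::string& pattern, const std::string& text) {
    std::size_t matched = 0;  // number of leading characters of pattern found so far
    for (char c : text) {
        if (matched < pattern.size() && pattern[matched] == c) ++matched;
    }
    return matched == pattern.size();
}

// Hypothetical helper: true when every input sequence is a subsequence of candidate.
bool coversAll(const std::vector<std::string>& seqs, const std::string& candidate) {
    for (const std::string& s : seqs) {
        if (!isSubsequence(s, candidate)) return false;
    }
    return true;
}

// Usage sketch on the sample input.
void checkSample() {
    const std::vector<std::string> sample = {"ACGT", "ATGC", "CGTT", "CAGT"};
    assert(coversAll(sample, "ACAGTGCT"));  // a length-8 string containing all four
    assert(!coversAll(sample, "ACGT"));     // "ATGC" is not a subsequence of "ACGT"
}
```

The greedy left-to-right match used here is the same idea the search relies on: for each input string it is enough to remember how many leading characters have been matched so far.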

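Approach: each position of the constructed string has 4 possible characters. Run a BFS over states, where a state records, for every input string, how many of its leading characters have already been matched, and prune any state that has been seen before.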
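The state packing can be sketched on its own. The following is an illustration under a few assumptions, not the post's exact code: `encodeState`, `decodeState`, and `advance` are hypothetical names, and the strings are 0-indexed here while the solution below stores them 1-indexed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Pack per-string progress (each value 0..5) into one base-6 integer.
// Digit i (weight 6^i) is the number of matched leading characters of string i.
int encodeState(const std::vector<int>& progress) {
    int state = 0, weight = 1;
    for (int matched : progress) {
        assert(matched >= 0 && matched <= 5);
        state += matched * weight;
        weight *= 6;
    }
    return state;
}

// Unpack a base-6 state into the progress of each of the n strings.
std::vector<int> decodeState(int state, int n) {
    std::vector<int> progress(n);
    for (int i = 0; i < n; ++i) {
        progress[i] = state % 6;
        state /= 6;
    }
    return progress;
}

// One BFS transition: append character c to the string being built.
// Every input string whose next unmatched character equals c advances by one.
int advance(int state, const std::vector<std::string>& seqs, char c) {
    std::vector<int> progress = decodeState(state, static_cast<int>(seqs.size()));
    for (std::size_t i = 0; i < seqs.size(); ++i) {
        int matched = progress[i];
        if (matched < static_cast<int>(seqs[i].size()) && seqs[i][matched] == c) {
            ++progress[i];
        }
    }
    return encodeState(progress);
}
```

With at most 8 strings of length at most 5, there are at most 6^8 = 1,679,616 distinct states, which is why a plain visited array is enough for pruning.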

#include <iostream>
#include <stdio.h>
#include <cmath>
#include <algorithm>
#include <iomanip>
#include <cstdlib>
#include <string.h>
#include <vector>
#include <queue>
#include <stack>
#include <ctype.h>
using namespace std;

int vis[2000005];  // marks whether a state has already been seen, used for pruning
char s[10][10];  // the input strings (stored 1-indexed)
int len[10];   // length of each string
int p[10];   // p[i] = 6^i
char temp[4]={'A','C','G','T'};

void init()   // precompute the powers of 6
{
    p[0]=1;
    for(int i=1;i<10;i++)
    {
        p[i]=p[i-1]*6;
    }
}

struct node
{
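    int step;   // number of steps, i.e. the length of the string built so far
    int zt;    // state: for each input string, how many of its leading characters are already a subsequence of the built string
    // At most 8 strings of length at most 5, so every digit is 0-5 and the base-6 state is below 6^8.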
    node() {}
    node(int Step, int Zt)
    {
        step=Step;
        zt=Zt;
    }
};

int main()
{
    init();
    int t;
    scanf("%d",&t);
    while(t--)
    {
        int n;
        scanf("%d",&n);
        int res=0;   // the target state to reach
        for(int i=1;i<=n;i++)
        {
            scanf("%s",s[i]+1);
            len[i]=(int)strlen(s[i]+1);
            res+=len[i]*p[i-1];
        }
        memset(vis,0,sizeof(vis));
        queue<node> q;
        q.push(node(0,0));
        vis[0]=1;
        int ans=0;
        while(!q.empty())
        {
            node k=q.front();
            q.pop();
            if(k.zt==res)
            {
                ans=k.step;
                break;
            }
            node l;
            for(int i=0;i<4;i++)
            {
                l.zt=0;
                l.step=k.step+1;
                int kk=k.zt;
                for(int j=1;j<=n;j++)
                {
                    int x=kk%6;
                    kk/=6;
                    if(x==len[j]||s[j][x+1]!=temp[i])
                    {
                        l.zt+=x*p[j-1];
                    }
                    else
                    {
                        l.zt+=(x+1)*p[j-1];
                    }
                }
                if(vis[l.zt]==0)   // prune: a state seen before is never enqueued again, otherwise the queue causes MLE
                {
                    q.push(l);
                    vis[l.zt]=1;
                }
            }
        }
        cout<<ans<<endl;
    }
    return 0;
}
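
For experimenting without standard input, the same BFS can be wrapped in a function. This is a sketch under some assumptions rather than the submitted code: `shortestSupersequenceLength` is a hypothetical name, the strings are 0-indexed, and the visited array is sized 6^n for the actual n instead of the fixed 2000005.

```cpp
#include <cassert>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Returns the length of the shortest string over {A, C, G, T} that contains
// every sequence in seqs as a subsequence, or -1 if no such string exists
// (for example, when a sequence contains a character outside A/C/G/T).
int shortestSupersequenceLength(const std::vector<std::string>& seqs) {
    const int n = static_cast<int>(seqs.size());
    assert(n >= 1 && n <= 8);

    std::vector<int> pow6(n + 1, 1);  // pow6[i] = 6^i
    for (int i = 1; i <= n; ++i) pow6[i] = pow6[i - 1] * 6;

    int goal = 0;  // state in which every string is fully matched
    for (int i = 0; i < n; ++i) {
        assert(seqs[i].size() <= 5);
        goal += static_cast<int>(seqs[i].size()) * pow6[i];
    }

    std::vector<char> visited(pow6[n], 0);
    std::queue<std::pair<int, int> > q;  // (state, length built so far)
    q.push(std::make_pair(0, 0));
    visited[0] = 1;

    const char bases[4] = {'A', 'C', 'G', 'T'};
    while (!q.empty()) {
        const std::pair<int, int> cur = q.front();
        q.pop();
        if (cur.first == goal) return cur.second;  // BFS: first hit is the shortest

        for (char c : bases) {
            int next = 0, rest = cur.first;
            for (int i = 0; i < n; ++i) {
                int matched = rest % 6;  // progress of string i
                rest /= 6;
                if (matched < static_cast<int>(seqs[i].size()) && seqs[i][matched] == c) {
                    ++matched;
                }
                next += matched * pow6[i];
            }
            if (!visited[next]) {  // prune states that were already reached
                visited[next] = 1;
                q.push(std::make_pair(next, cur.second + 1));
            }
        }
    }
    return -1;
}
```

Calling it with the four sample sequences should reproduce the sample answer, since BFS explores states in order of increasing length.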
