DNA sequence
Time Limit: 15000/5000 MS (Java/Others) Memory Limit: 32768/32768 K (Java/Others)
Total Submission(s): 4652 Accepted Submission(s): 2228
Problem Description
The twenty-first century is a century of developing biotechnology. We know that a gene is made of DNA. The nucleotide bases from which DNA is built are A (adenine), C (cytosine), G (guanine), and T (thymine). Finding the longest common subsequence between DNA/protein sequences is one of the basic problems in modern computational molecular biology. But this problem is a little different. Given several DNA sequences, you are asked to make a shortest sequence from them so that each of the given sequences is a subsequence of it.
For example, given "ACGT", "ATGC", "CGTT" and "CAGT", you can make a sequence in the following way. It is the shortest, but it may not be the only one.
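A candidate answer can be checked mechanically with a subsequence test. The sketch below is not part of the original problem statement: the helper names `isSubsequence`, `coversAll` and `checkExample` are made up for illustration, and the length-8 candidate "ACAGTGCT" is just one string that happens to contain all four example sequences.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns true if `s` can be obtained from `t` by deleting characters.
// Two-pointer scan; hypothetical helper, not from the original post.
bool isSubsequence(const std::string& s, const std::string& t) {
    std::size_t i = 0;
    for (std::size_t j = 0; j < t.size() && i < s.size(); ++j) {
        if (s[i] == t[j]) ++i;  // next character of s found, advance in s
    }
    return i == s.size();
}

// Returns true if every string in `seqs` is a subsequence of `candidate`.
bool coversAll(const std::vector<std::string>& seqs, const std::string& candidate) {
    for (const std::string& s : seqs) {
        if (!isSubsequence(s, candidate)) return false;
    }
    return true;
}

// Usage sketch: the four example strings and one assumed length-8 candidate.
void checkExample() {
    std::vector<std::string> seqs = {"ACGT", "ATGC", "CGTT", "CAGT"};
    assert(coversAll(seqs, "ACAGTGCT"));
}
```

A check like this only confirms that a candidate is valid. It says nothing about whether a shorter one exists, which is what the search has to establish.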
Input
The first line contains the number of test cases t, followed by t test cases. In each case, the first line is an integer n (1<=n<=8), the number of DNA sequences. The following n lines contain the n sequences, one per line. The length of each sequence is between 1 and 5.
Output
For each test case, print a line containing the length of the shortest sequence that can be made from these sequences.
Sample Input
1
4
ACGT
ATGC
CGTT
CAGT
Sample Output
8
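Problem analysis:
This problem is solved with the IDA* algorithm. The general idea is to first fix a candidate shortest length depth and run a depth-first search that goes no deeper than depth. If the search finishes without finding an answer, depth is increased and the search is repeated, until an answer is found. Inside the dfs, we check whether the current state can still lead to a valid answer under the current depth, and prune the branch if it cannot. For details, see this blog post.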
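The pruning test can be isolated into a few lines. The following is a minimal sketch under assumed names (`remainingLowerBound`, `shouldPrune`, `len`, `matched`, `depthLimit` are illustrative, not from the original post). It uses the estimate "longest unmatched remainder among all strings", which never overestimates, because each appended character can advance any single string by at most one position.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Lower bound on the characters still needed: the longest unmatched
// remainder among all strings. len[i] is the length of string i and
// matched[i] is how many of its leading characters are already covered.
int remainingLowerBound(const std::vector<int>& len, const std::vector<int>& matched) {
    assert(len.size() == matched.size());
    int bound = 0;
    for (std::size_t i = 0; i < len.size(); ++i) {
        bound = std::max(bound, len[i] - matched[i]);
    }
    return bound;
}

// A branch is cut when even this optimistic estimate exceeds the limit.
// `used` is the number of characters placed so far.
bool shouldPrune(int used, int depthLimit,
                 const std::vector<int>& len, const std::vector<int>& matched) {
    return used + remainingLowerBound(len, matched) > depthLimit;
}
```

For the sample, all four strings have length 4, so the bound at the root is 4, which is why starting the depth limit at the longest input length is safe.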
Code implementation:
#include<iostream>
#include<cstdio>
#include<algorithm>
#include<cstring>
#include<string>
#include<vector>
#include<stack>
#include<cstdlib>
#include<cmath>
#include<map>
#include<queue>
//#include <bits/stdc++.h>
using namespace std;
const int INF = 0x3f3f3f3f;
#define LL long long
#define pf printf
#define sf(n) scanf("%d", &n)
#define sff(a,b) scanf("%d %d", &a, &b)
#define sfff(a,b,c) scanf("%d %d %d", &a, &b, &c)
#define ms(i,j) memset(i,j,sizeof(i))
char a[10][6],d[4]= {'A','G','C','T'};
int len[10],now[10],depth,maxl,ans,n,T;
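void dfs(int l,int nlen[])
{
    if(l>depth)
        return;
    int left=0,t=0; //left: lower bound on the number of characters that still must be appended
    for(int i=0; i<n; i++)
    {
        t=len[i]-nlen[i];
        left=max(left,t);
    }
    if(left==0) //every string is fully matched
    {
        ans=l;
        return;
    }
    if(l+left>depth) //prune: cannot finish within the current depth limit
        return;
    for(int i=0; i<4; i++)
    {
        int pos[10]= {0}; //pos[j]: number of characters of string j matched so far
        int flag=0; //flag: does appending d[i] advance at least one string?
        for(int j=0; j<n; j++) //if not, there is no point searching further down this branch
        {
            if(d[i]==a[j][nlen[j]])
            {
                pos[j]=nlen[j]+1;
                flag=1;
            }
            else
                pos[j]=nlen[j];
        }
        if(flag)
            dfs(l+1,pos);
    }
}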
int main()
{
    sf(T);
    while(T--)
    {
        sf(n);
        maxl=0;
        ans=-1;
        ms(now,0);
        for(int i=0; i<n; i++)
        {
            scanf("%s",a[i]);
            len[i]=strlen(a[i]);
            maxl=max(maxl,len[i]);
        }
        depth=maxl; //the answer is at least the length of the longest input
        while(1)
        {
            dfs(0,now);
            if(ans!=-1)
                break;
            depth++; //iterative deepening
        }
        pf("%d\n",ans);
    }
    return 0;
}