DNA sequence
Problem Description
The twenty-first century is a century of developing biotechnology. We know that a gene is made of DNA. The nucleotide bases from which DNA is built are A (adenine), C (cytosine), G (guanine), and T (thymine). Finding the longest common subsequence between DNA/protein sequences is one of the basic problems in modern computational molecular biology. But this problem is a little different. Given several DNA sequences, you are asked to make a shortest sequence from them so that each of the given sequences is a subsequence of it.
For example, given "ACGT", "ATGC", "CGTT" and "CAGT", you can make a sequence of length 8 that contains all four as subsequences. It is the shortest, but it may not be the only one.
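To make the definition concrete, here is a minimal sketch of a subsequence check. The helper names isSubsequence and isCommonSupersequence, and the candidate string "ACAGTGCT" used in the note after the code, are illustrative assumptions and are not part of the problem statement.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns true if `small` can be obtained from `big` by deleting characters.
bool isSubsequence(const std::string& small, const std::string& big) {
    std::size_t matched = 0;  // number of characters of `small` matched so far
    for (char c : big) {
        if (matched < small.size() && small[matched] == c) ++matched;
    }
    return matched == small.size();
}

// Returns true if every sequence in `seqs` is a subsequence of `candidate`.
bool isCommonSupersequence(const std::vector<std::string>& seqs,
                           const std::string& candidate) {
    assert(!seqs.empty());  // the problem guarantees n >= 1
    for (const std::string& s : seqs) {
        if (!isSubsequence(s, candidate)) return false;
    }
    return true;
}
```

For instance, isCommonSupersequence({"ACGT", "ATGC", "CGTT", "CAGT"}, "ACAGTGCT") should return true, which is consistent with the sample answer of 8.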
Input
The first line is the number of test cases t. Then t test cases follow. In each case, the first line is an integer n (1<=n<=8), the number of DNA sequences. The following n lines contain the n sequences, one per line. The length of each sequence is between 1 and 5.
Output
For each test case, print a line containing the length of the shortest sequence that can be made from these sequences.
Sample Input
1
4
ACGT
ATGC
CGTT
CAGT
Sample Output
8
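The answer can be bracketed before any search: it is at least the length of the longest input sequence, and at most the sum of all lengths, since concatenating the inputs always yields a valid supersequence. A rough sketch of these bounds follows; the names Bounds and supersequenceBounds are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper type: inclusive range that must contain the answer.
struct Bounds {
    int lower;  // length of the longest input sequence
    int upper;  // total length of all input sequences (plain concatenation)
};

Bounds supersequenceBounds(const std::vector<std::string>& seqs) {
    assert(!seqs.empty());  // the problem guarantees n >= 1
    Bounds b{0, 0};
    for (const std::string& s : seqs) {
        b.lower = std::max(b.lower, static_cast<int>(s.size()));
        b.upper += static_cast<int>(s.size());
    }
    return b;
}
```

For the sample this gives a lower bound of 4 and an upper bound of 16, and the expected answer 8 falls inside that range. The lower bound is a natural starting depth for an iterative-deepening search, and the upper bound guarantees that such a search terminates.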
The code is as follows:
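#include <cstdio>
#include <cstring>
#include <algorithm>
using namespace std;

char str[10][10];                       // input sequences (n <= 8, length <= 5)
int t, n, deep;                         // test count, sequence count, current depth limit
int ans;                                // answer length, -1 while not found
char DNA[4] = {'A', 'T', 'C', 'G'};

// index: characters placed so far; len[i]: matched prefix length of str[i].
void dfs(int index, int len[])
{
    if (index > deep) return;
    // Longest unmatched suffix: a lower bound on the characters still needed.
    int maxx = 0;
    for (int i = 0; i < n; ++i)
        maxx = max((int)strlen(str[i]) - len[i], maxx);
    if (maxx == 0)                      // every sequence is fully matched
    {
        ans = index;
        return;
    }
    if (index + maxx > deep) return;    // prune: cannot finish within the limit
    for (int i = 0; i < 4; ++i)
    {
        int flag = 0;
        int pos[10];
        for (int j = 0; j < n; ++j)
        {
            if (str[j][len[j]] == DNA[i])
            {
                flag = 1;
                pos[j] = len[j] + 1;
            }
            else pos[j] = len[j];
        }
        // Only append a base that advances at least one sequence.
        if (flag) dfs(index + 1, pos);
        if (ans != -1) return;
    }
}

int main()
{
    for (scanf("%d", &t); t; --t)
    {
        deep = 0;
        scanf("%d", &n);
        for (int i = 0; i < n; ++i)
        {
            scanf("%s", str[i]);
            deep = max(deep, (int)strlen(str[i]));
        }
        ans = -1;
        int pos[10] = {0};
        while (1)
        {
            dfs(0, pos);
            if (ans != -1) break;
            ++deep; // iterative deepening
        }
        printf("%d\n", ans);
    }
    return 0;
}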
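For testing outside an online judge, the same idea (iterative deepening, with the longest unmatched suffix as a lower bound on the remaining length) can be wrapped in a function that takes the sequences as arguments instead of reading them with scanf. This is only a sketch under that assumption; searchWithinLimit and shortestSupersequenceLength are hypothetical names, not part of the original solution.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Depth-limited search. `used` characters have been placed so far and pos[i]
// characters of seqs[i] are already matched. Returns true if every sequence
// can be completed using at most `limit` characters in total.
bool searchWithinLimit(const std::vector<std::string>& seqs,
                       const std::vector<int>& pos, int used, int limit) {
    int remaining = 0;  // longest unmatched suffix, a lower bound on extra characters
    for (std::size_t i = 0; i < seqs.size(); ++i) {
        remaining = std::max(remaining, static_cast<int>(seqs[i].size()) - pos[i]);
    }
    if (remaining == 0) return true;             // everything is matched
    if (used + remaining > limit) return false;  // cannot finish within the limit

    const std::string bases = "ACGT";
    for (char base : bases) {
        std::vector<int> next = pos;
        bool advanced = false;
        for (std::size_t i = 0; i < seqs.size(); ++i) {
            if (pos[i] < static_cast<int>(seqs[i].size()) && seqs[i][pos[i]] == base) {
                ++next[i];
                advanced = true;
            }
        }
        // Appending a base that advances no sequence can never help.
        if (advanced && searchWithinLimit(seqs, next, used + 1, limit)) return true;
    }
    return false;
}

// Raises the depth limit one step at a time, starting from the longest input.
// The first limit that succeeds is the minimum length.
int shortestSupersequenceLength(const std::vector<std::string>& seqs) {
    assert(!seqs.empty());  // the problem guarantees n >= 1
    int limit = 0;
    for (const std::string& s : seqs) {
        limit = std::max(limit, static_cast<int>(s.size()));
    }
    const std::vector<int> start(seqs.size(), 0);
    while (!searchWithinLimit(seqs, start, 0, limit)) ++limit;
    return limit;
}
```

Calling shortestSupersequenceLength with the four sample sequences should return 8, matching the sample output. Copying the position vector at each step keeps the sketch simple; with at most 8 sequences of length at most 5, the extra cost is negligible.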