DNA sequence
The twenty-first century is the century of biotechnology. We know that a gene is made of DNA. The nucleotide bases from which DNA is built are A (adenine), C (cytosine), G (guanine), and T (thymine). Finding the longest common subsequence between DNA/protein sequences is one of the basic problems in modern computational molecular biology. But this problem is a little different. Given several DNA sequences, you are asked to make a shortest sequence from them so that each of the given sequences is a subsequence of it.
For example, given “ACGT”, “ATGC”, “CGTT” and “CAGT”, you can make a sequence in the following way. It is a shortest one, but it may not be the only one.
Input
The first line is the number of test cases t. Then t test cases follow. In each case, the first line is an integer n (1<=n<=8), the number of DNA sequences. The following n lines contain the n sequences, one per line. The length of every sequence is between 1 and 5.
Output
For each test case, print a line containing the length of the shortest sequence that can be made from these sequences.
Sample Input
1
4
ACGT
ATGC
CGTT
CAGT
Sample Output
8
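To make the sample concrete, here is a minimal sketch of a checker that tests whether a candidate string contains every given sequence as a subsequence. The names isSubsequence and coversAll are illustrative only, and the candidate "ACAGTGCT" used in the note below is one length-8 string checked by hand; it is not necessarily the one shown in the original problem figure.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Returns true if `sub` can be obtained from `full` by deleting characters.
bool isSubsequence(const std::string& sub, const std::string& full) {
    std::size_t matched = 0;  // number of letters of `sub` matched so far
    for (char c : full) {
        if (matched < sub.size() && sub[matched] == c) ++matched;
    }
    return matched == sub.size();
}

// Returns true if every sequence in `seqs` is a subsequence of `candidate`.
bool coversAll(const std::vector<std::string>& seqs, const std::string& candidate) {
    for (const std::string& s : seqs) {
        if (!isSubsequence(s, candidate)) return false;
    }
    return true;
}
```

Calling coversAll({"ACGT", "ATGC", "CGTT", "CAGT"}, "ACAGTGCT") returns true, which shows that length 8 is achievable for the sample. It does not by itself show that no shorter string exists; that is what the search is for.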
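At first I wanted to construct the sequence with a rather clumsy method. Only after reading other people's solutions did I realize that the construction process is already described in the problem statement.
Approach
This problem can clearly be solved with IDA*. A variable depth limits how deep the search may go, and every iteration restarts the search from level 0. In this problem, depth can be read as the length of the sequence to be constructed, and the shortest sequence is built by consuming letters from the given subsequences one position at a time.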
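The idea can be sketched compactly as follows. This is a self-contained variant written for illustration, assuming the input contains only the letters A, C, G and T; the function names remaining, dfs and shortestLength are hypothetical, and the state is passed as a vector instead of being kept in global arrays.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Lower bound on the letters still needed: the longest unmatched suffix.
int remaining(const std::vector<std::string>& seqs, const std::vector<int>& pos) {
    int best = 0;
    for (std::size_t i = 0; i < seqs.size(); ++i)
        best = std::max(best, static_cast<int>(seqs[i].size()) - pos[i]);
    return best;
}

// Depth-limited search: with `used` letters already placed, can every
// sequence be completed without exceeding `limit` letters in total?
bool dfs(const std::vector<std::string>& seqs, const std::vector<int>& pos,
         int used, int limit) {
    int need = remaining(seqs, pos);
    if (need == 0) return true;             // every sequence fully matched
    if (used + need > limit) return false;  // prune: bound exceeds the limit
    for (char c : std::string("ACGT")) {
        std::vector<int> next = pos;
        bool advanced = false;
        for (std::size_t i = 0; i < seqs.size(); ++i) {
            if (next[i] < static_cast<int>(seqs[i].size()) && seqs[i][next[i]] == c) {
                ++next[i];  // appending c consumes one letter of sequence i
                advanced = true;
            }
        }
        if (advanced && dfs(seqs, next, used + 1, limit)) return true;
    }
    return false;
}

// Iterative deepening: raise the limit until the search succeeds.
int shortestLength(const std::vector<std::string>& seqs) {
    assert(!seqs.empty());
    std::vector<int> start(seqs.size(), 0);
    int limit = remaining(seqs, start);  // the answer is at least the longest input
    while (!dfs(seqs, start, 0, limit)) ++limit;
    return limit;
}
```

For the sample input, shortestLength({"ACGT", "ATGC", "CGTT", "CAGT"}) returns 8. Copying the position vector at each step trades a little speed for not having to restore state on backtracking.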
The code is as follows
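#include<iostream>
#include<string>
#include<cstring>
#include<algorithm>
using namespace std;
string seq[10];   // input sequences (named seq because data is ambiguous with std::data in C++17)
int len[10]={0};  // length of each input sequence
int pos[10]={0};  // number of letters of each sequence already matched
int n,m;          // n: number of test cases, m: number of sequences in this case
int depth=0;      // current depth limit, i.e. candidate length of the answer
char s[5]="ACGT";
bool ida(int now);
int left();
int main()
{
    cin>>n;
    while(n--)
    {
        memset(len,0,sizeof(len));
        memset(pos,0,sizeof(pos));
        depth=0;
        cin>>m;
        for(int i=0;i<m;i++)
        {
            cin>>seq[i];
            len[i]=seq[i].size();
            depth=max(depth,len[i]); // the answer is at least as long as the longest input sequence
        }
        while(1)
        {
            if(ida(0))break;
            depth++; // no solution within this limit, so allow one more letter
        }
        cout<<depth<<endl;
    }
    return 0;
}
int left() // lower bound: the largest number of letters any sequence still needs
{
    int l=0;
    for(int i=0;i<m;i++)
    {
        l=max(l,len[i]-pos[i]);
    }
    return l;
}
bool ida(int now) // now is the number of letters placed so far
{
    if(now+left()>depth)return false; // letters placed plus the lower bound exceed depth: prune
    if(!left())return true;           // nothing left to match: construction is complete
    int temp[10];                     // snapshot of pos for backtracking
    for(int i=0;i<m;i++)
    {
        temp[i]=pos[i];
    }
    for(int i=0;i<4;i++)
    {
        int flag=0; // set when appending s[i] advances at least one sequence
        for(int j=0;j<m;j++)
        {
            int t=pos[j];
            if(t<len[j]&&seq[j][t]==s[i]){
                flag=1;
                pos[j]++;
            }
        }
        if(flag){
            if(ida(now+1))return true;
            for(int u=0;u<m;u++) // restore pos when backtracking
            {
                pos[u]=temp[u];
            }
        }
    }
    return false;
}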
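Because n is at most 8 and every sequence has at most 5 letters, there are at most 6^8 = 1,679,616 combinations of matched positions, so the answer can also be cross-checked with a plain breadth-first search over those states. The sketch below swaps IDA* for BFS and is not part of the solution above; the function name shortestLengthBfs and the base-6 state encoding are choices made for this example.

```cpp
#include <cassert>
#include <queue>
#include <string>
#include <vector>

// BFS over tuples of matched positions, each tuple packed into one base-6 integer.
// Returns the length of a shortest common supersequence, or -1 if the goal is
// unreachable (for example, when a sequence contains a letter outside "ACGT").
int shortestLengthBfs(const std::vector<std::string>& seqs) {
    assert(seqs.size() <= 8);
    const int n = static_cast<int>(seqs.size());
    std::vector<int> weight(n);  // weight[i] = 6^i, the place value of sequence i
    int total = 1;
    for (int i = 0; i < n; ++i) {
        assert(seqs[i].size() <= 5);
        weight[i] = total;
        total *= 6;
    }
    int goal = 0;  // state in which every sequence is fully matched
    for (int i = 0; i < n; ++i) goal += weight[i] * static_cast<int>(seqs[i].size());

    std::vector<int> dist(total, -1);  // letters used to reach each state
    std::queue<int> q;
    dist[0] = 0;
    q.push(0);
    while (!q.empty()) {
        int state = q.front();
        q.pop();
        if (state == goal) return dist[state];
        for (char c : std::string("ACGT")) {
            int next = state;
            for (int i = 0; i < n; ++i) {
                int p = (state / weight[i]) % 6;  // matched prefix length of sequence i
                if (p < static_cast<int>(seqs[i].size()) && seqs[i][p] == c) next += weight[i];
            }
            if (next != state && dist[next] < 0) {
                dist[next] = dist[state] + 1;
                q.push(next);
            }
        }
    }
    return -1;
}
```

On the sample input this returns 8, matching the expected output. BFS visits each state at most once, at the cost of a distance table with up to about 1.7 million entries, whereas IDA* needs almost no memory.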