Time Limit: 1000MS | Memory Limit: 65536K
Total Submissions: 7922 | Accepted: 3141
Description
Let x and y be two strings over some finite alphabet A. We would like to transform x into y allowing only operations given below:
- Deletion: a letter in x is missing in y at a corresponding position.
- Insertion: a letter in y is missing in x at a corresponding position.
- Change: letters at corresponding positions are distinct.
Certainly, we would like to minimize the number of all possible operations.
Illustration
A G T A A G T * A G G C
| | |       |   |   | |
A G T * C * T G A C G C
Deletion: * in the bottom line
Insertion: * in the top line
Change: when the letters at the top and bottom are distinct
This tells us that to transform x = AGTCTGACGC into y = AGTAAGTAGGC we would be required to perform 5 operations (2 changes, 2 deletions and 1 insertion). If we want to minimize the number of operations, we should do it like this:
A G T A A G T A G G C
| | |     |   |   | |
A G T C T G * A C G C
and 4 moves would be required (3 changes and 1 deletion).
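The operation counts quoted for the two alignments can be checked mechanically. The following is a minimal sketch, not part of the original problem: the struct `AlignmentCost` and the function `count_alignment_ops` are hypothetical names, and the two rows are assumed to be passed without spaces, with `*` marking a gap.

```c
#include <string.h>

/* Operation tally for a two-row alignment (hypothetical helper type). */
typedef struct {
    int changes;     /* both rows hold letters, but they differ */
    int deletions;   /* '*' in the bottom row */
    int insertions;  /* '*' in the top row */
} AlignmentCost;

/* Counts the operations implied by an alignment.
   top and bottom are the two rows written without spaces;
   they are assumed to have the same length. */
AlignmentCost count_alignment_ops(const char *top, const char *bottom)
{
    AlignmentCost cost = {0, 0, 0};
    size_t len = strlen(top);

    for (size_t k = 0; k < len && bottom[k] != '\0'; k++) {
        if (bottom[k] == '*')
            cost.deletions++;
        else if (top[k] == '*')
            cost.insertions++;
        else if (top[k] != bottom[k])
            cost.changes++;
        /* equal letters cost nothing */
    }
    return cost;
}
```

Passing the rows `"AGTAAGT*AGGC"` and `"AGT*C*TGACGC"` gives 2 changes, 2 deletions and 1 insertion; the rows `"AGTAAGTAGGC"` and `"AGTCTG*ACGC"` give 3 changes and 1 deletion, matching the totals of 5 and 4 in the text.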
In this problem we would always consider strings x and y to be fixed, such that the number of letters in x is m and the number of letters in y is n, where n ≥ m.
Assign 1 as the cost of an operation performed. Otherwise, assign 0 if there is no operation performed.
Write a program that would minimize the number of possible operations to transform any string x into a string y.
Input
The input consists of the strings x and y prefixed by their respective lengths, which are within 1000.
Output
An integer representing the minimum number of possible operations to transform any string x into a string y.
Sample Input
10 AGTCTGACGC 11 AGTAAGTAGGC
Sample Output
4
Analysis: (reference: http://blog.csdn.net/kuaisuzhuceh/article/details/8680799?reload)
First, some conventions:
① The first DNA string is written as the array A[1...n], and the second DNA string as the array B[1...m].
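Restating the problem:
The original problem is equivalent to the following: given two arrays A[1...n] and B[1...m], what is the minimum number of steps needed to turn B[1...m] into A[1...n], where each step inserts one character, deletes one character, or changes one character?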
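Designing the state:
We use a two-dimensional array dp[i][j] as the state. dp[i][j] is the shortest distance between the prefix A[1...i] of A[1...n] and the prefix B[1...j] of B[1...m], that is, the number of operations (insert, delete, change) needed to turn B[1...j] into A[1...i].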
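State transition equation:
Three moves can lead into the state defined above. Consider A[i] and B[j]:
① Insert a nucleotide (a character) ch after B[j], with ch == A[i]. B[1...j] must then already have been turned into A[1...i-1], which takes dp[i - 1][j] steps, so dp[i][j] = dp[i - 1][j] + 1.
② Delete B[j]. The remaining B[1...j-1] must then be turned into A[1...i], which takes dp[i][j - 1] steps, so dp[i][j] = dp[i][j - 1] + 1.
③ Change B[j] into A[i]. If B[j] already equals A[i], the change costs 0 steps; if B[j] != A[i], it costs 1 step. So dp[i][j] = dp[i - 1][j - 1] + (A[i] == B[j] ? 0 : 1).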
Decision:
The decision is simple: take the minimum of the three transitions above.
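The three transitions and the "take the minimum" decision translate almost literally into a top-down, memoized recursion. This is only a sketch for illustration, not the bottom-up table the write-up goes on to use. The names `edit_distance_memo`, `solve`, `min3` and `MAXLEN` are hypothetical, the base cases assume that an empty prefix is as many operations away as the other prefix is long, and the strings are assumed to hold at most 1000 letters.

```c
#include <string.h>

#define MAXLEN 1001                /* assumption: at most 1000 letters per string */

static int memo[MAXLEN][MAXLEN];   /* -1 marks a state not yet computed */
static const char *A;              /* target string, A[1..n] in the text */
static const char *B;              /* source string, B[1..m] in the text */

static int min3(int a, int b, int c)
{
    int mn = a;
    if (b < mn) mn = b;
    if (c < mn) mn = c;
    return mn;
}

/* dp(i, j): operations needed to turn B[1..j] into A[1..i].
   C strings are 0-based, so A[i] in the text is A[i - 1] here. */
static int solve(int i, int j)
{
    if (i == 0) return j;          /* A-prefix empty: delete all j letters */
    if (j == 0) return i;          /* B-prefix empty: insert all i letters */
    if (memo[i][j] >= 0) return memo[i][j];

    int ins = solve(i - 1, j) + 1;                                   /* case 1 */
    int del = solve(i, j - 1) + 1;                                   /* case 2 */
    int chg = solve(i - 1, j - 1) + (A[i - 1] == B[j - 1] ? 0 : 1);  /* case 3 */

    memo[i][j] = min3(ins, del, chg);
    return memo[i][j];
}

int edit_distance_memo(const char *a, const char *b)
{
    int n = (int)strlen(a), m = (int)strlen(b);

    /* Reset only the part of the table this call can touch. */
    for (int i = 0; i <= n; i++)
        for (int j = 0; j <= m; j++)
            memo[i][j] = -1;

    A = a;
    B = b;
    return solve(n, m);
}
```

For the sample strings, `edit_distance_memo("AGTCTGACGC", "AGTAAGTAGGC")` returns 4. The recursion depth is at most n + m, which stays small for strings of this size.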
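Handling the boundaries:
Getting the boundaries right is very important. Here that means initializing dp[0][0...m] and dp[0...n][0]. dp[0][j] is the case where A is the empty string and B is a string of length j; clearly, turning B into A means deleting j nucleotides, so dp[0][j] = j. How to initialize dp[0...n][0] is left for you to work out; the reasoning is the same.
Code:
```c
#include <stdio.h>

#define N 1100

char s1[N], s2[N];
int d[N][N];   /* d[i][j]: edit distance between s1[0..i-1] and s2[0..j-1] */

/* Smallest of three integers. */
int min3(int a, int b, int c)
{
    int mn = a;
    if (mn > b) mn = b;
    if (mn > c) mn = c;
    return mn;
}

int main(void)
{
    int i, j, len1, len2;

    /* Each test case has the form: len1 s1 len2 s2. */
    while (scanf("%d %1099s %d %1099s", &len1, s1, &len2, s2) == 4) {
        /* Boundaries: an empty prefix is i (or j) operations away. */
        for (i = 0; i <= len1; i++) d[i][0] = i;
        for (j = 0; j <= len2; j++) d[0][j] = j;

        for (i = 1; i <= len1; i++) {
            for (j = 1; j <= len2; j++) {
                d[i][j] = min3(d[i][j - 1] + 1,
                               d[i - 1][j] + 1,
                               d[i - 1][j - 1] + (s1[i - 1] == s2[j - 1] ? 0 : 1));
            }
        }
        printf("%d\n", d[len1][len2]);
    }
    return 0;
}
```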
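Each cell of the table reads only the row above it and the cell to its left, so the full N×N array is not strictly necessary. The sketch below is an optional variant, not the author's code: it keeps two rows and initializes the same boundaries described above. The names `edit_distance_two_rows` and `MAXLEN` are hypothetical, and strings are assumed to hold at most 1000 letters.

```c
#include <string.h>

#define MAXLEN 1001   /* assumption: at most 1000 letters per string */

/* Edit distance between a and b using two rows of the DP table. */
int edit_distance_two_rows(const char *a, const char *b)
{
    static int prev[MAXLEN], curr[MAXLEN];
    int n = (int)strlen(a), m = (int)strlen(b);

    /* Row 0: the empty prefix of a against every prefix of b. */
    for (int j = 0; j <= m; j++)
        prev[j] = j;

    for (int i = 1; i <= n; i++) {
        curr[0] = i;   /* column 0: prefix of a against the empty prefix of b */
        for (int j = 1; j <= m; j++) {
            int best = prev[j - 1] + (a[i - 1] != b[j - 1]);  /* change or match */
            if (prev[j] + 1 < best) best = prev[j] + 1;       /* d[i-1][j] + 1 */
            if (curr[j - 1] + 1 < best) best = curr[j - 1] + 1; /* d[i][j-1] + 1 */
            curr[j] = best;
        }
        memcpy(prev, curr, (size_t)(m + 1) * sizeof(int));
    }
    return prev[m];
}
```

With lengths up to 1000 the full table already fits comfortably in the 65536K limit, so this variant mainly matters if the same recurrence is reused on much longer strings.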