Characteristics of dynamic programming:
1. Optimal substructure
An optimal solution to a problem contains optimal solutions to its subproblems.
2. Overlapping subproblems
A recursive solution contains only a small number of distinct subproblems, each repeated many times.
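To make the second property concrete, here is a minimal C sketch that counts how often a plain recursion is called, using the LCS recurrence covered in the next section. The names `lcs_naive`, `naive_calls`, and `count_naive_calls` are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <string.h>

static long naive_calls = 0;   /* hypothetical counter, for illustration only */

/* Plain recursion on the prefixes x[0..i-1] and y[0..j-1], with no table. */
static int lcs_naive(const char *x, const char *y, int i, int j) {
    naive_calls++;
    if (i == 0 || j == 0)
        return 0;
    if (x[i - 1] == y[j - 1])
        return lcs_naive(x, y, i - 1, j - 1) + 1;
    int up = lcs_naive(x, y, i - 1, j);
    int left = lcs_naive(x, y, i, j - 1);
    return up > left ? up : left;
}

/* Returns how many recursive calls the whole problem needed. */
long count_naive_calls(const char *x, const char *y) {
    assert(x != NULL && y != NULL);
    naive_calls = 0;
    lcs_naive(x, y, (int)strlen(x), (int)strlen(y));
    return naive_calls;
}
```

For the strings ABCBDAB and BDCABA there are only 8 × 7 = 56 distinct (i, j) subproblems, yet the plain recursion makes more calls than that, because the same subproblems are solved repeatedly.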
Longest common subsequence
1. Memoization algorithm
Pseudocode
    LCS(x, y, i, j)  // ignoring base cases
        if c[i][j] = NULL
            then if x[i] = y[j]
                then c[i][j] = LCS(x, y, i-1, j-1) + 1
                else c[i][j] = max{LCS(x, y, i-1, j), LCS(x, y, i, j-1)}
        return c[i][j]
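The pseudocode above leaves out the base case and the table initialization. A minimal C sketch that fills in both could look as follows; the bound `MAXLEN`, the marker `UNSET`, and the wrapper `lcs_length_memo` are assumptions made for this example, and indices are shifted to C's 0-based strings.

```c
#include <assert.h>
#include <string.h>

#define MAXLEN 64      /* assumed upper bound on input length */
#define UNSET  (-1)    /* plays the role of NULL in the pseudocode */

static int memo[MAXLEN + 1][MAXLEN + 1];

/* LCS length of the prefixes x[0..i-1] and y[0..j-1]. */
static int lcs_memo(const char *x, const char *y, int i, int j) {
    if (i == 0 || j == 0)              /* base case: an empty prefix */
        return 0;
    if (memo[i][j] == UNSET) {         /* not computed yet */
        if (x[i - 1] == y[j - 1]) {
            memo[i][j] = lcs_memo(x, y, i - 1, j - 1) + 1;
        } else {
            int up = lcs_memo(x, y, i - 1, j);
            int left = lcs_memo(x, y, i, j - 1);
            memo[i][j] = up > left ? up : left;
        }
    }
    return memo[i][j];
}

/* Hypothetical wrapper: clears the table, then solves the full problem. */
int lcs_length_memo(const char *x, const char *y) {
    int m = (int)strlen(x), n = (int)strlen(y);
    assert(m <= MAXLEN && n <= MAXLEN);
    for (int i = 0; i <= m; i++)
        for (int j = 0; j <= n; j++)
            memo[i][j] = UNSET;
    return lcs_memo(x, y, m, n);
}
```

Calling `lcs_length_memo("ABCBDAB", "BDCABA")` returns 4. Each table entry is computed at most once, so the work is proportional to the table size m × n.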
2. Dynamic programming
|   | Ø | A | G | C | A | T |
|---|---|---|---|---|---|---|
| Ø | 0 | 0 | 0 | 0 | 0 | 0 |
| G | 0 | 0 | 1 | 1 | 1 | 1 |
| A | 0 | 1 | 1 | 1 | 2 | 2 |
| C | 0 | 1 | 1 | 2 | 2 | 2 |
    function LCSLength(X[1..m], Y[1..n])
        C = array(0..m, 0..n)
        for i := 0..m
            C[i,0] := 0
        for j := 0..n
            C[0,j] := 0
        for i := 1..m
            for j := 1..n
                if X[i] = Y[j]
                    C[i,j] := C[i-1,j-1] + 1
                else
                    C[i,j] := max(C[i,j-1], C[i-1,j])
        return C[m,n]
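    #include <stdio.h>

    int c[50][50];  /* c[i][j] = LCS length of x[0..i-1] and y[0..j-1] */

    /* Fill the table c bottom-up; m and n are the lengths of x and y. */
    void LCSlength(char x[], char y[], int m, int n) {
        int i, j;
        for (i = 0; i <= m; i++)
            c[i][0] = 0;   /* empty prefix of y */
        for (j = 0; j <= n; j++)
            c[0][j] = 0;   /* empty prefix of x */
        for (i = 0; i < m; i++)
            for (j = 0; j < n; j++) {
                if (x[i] == y[j]) c[i+1][j+1] = c[i][j] + 1;
                else if (c[i+1][j] > c[i][j+1]) c[i+1][j+1] = c[i+1][j];
                else c[i+1][j+1] = c[i][j+1];
            }
    }

    /* Write one LCS of x and y into lcs; the caller adds the terminating '\0'. */
    void LCS(char *lcs, char *x, char *y, int m, int n) {
        int i, j, k;
        LCSlength(x, y, m, n);
        i = m - 1;
        j = n - 1;
        k = c[m][n] - 1;   /* fill lcs from the back */
        while (i >= 0 && j >= 0) {
            if (x[i] == y[j]) {
                lcs[k--] = x[i];
                i--;
                j--;
            }
            else if (c[i][j+1] > c[i+1][j]) i--;   /* larger value is in the row above */
            else j--;
        }
    }

    int main() {
        char x[7] = {'A','B','C','B','D','A','B'};
        char y[6] = {'B','D','C','A','B','A'};
        char lcs[6];
        LCS(lcs, x, y, 7, 6);
        lcs[c[7][6]] = '\0';
        printf("%d %s\n", c[7][6], lcs);
        return 0;
    }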
Recursive backtracking to read out one LCS
Pseudocode
    function backtrack(C[0..m,0..n], X[1..m], Y[1..n], i, j)
        if i = 0 or j = 0
            return ""
        else if X[i] = Y[j]
            return backtrack(C, X, Y, i-1, j-1) + X[i]
        else
            if C[i,j-1] > C[i-1,j]
                return backtrack(C, X, Y, i, j-1)
            else
                return backtrack(C, X, Y, i-1, j)
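The recursive read-out above can be sketched in C as follows. String concatenation is replaced by writing into a caller-supplied buffer; `MAXLEN`, `fill_table`, and the wrapper `one_lcs` are hypothetical names chosen for this example.

```c
#include <assert.h>
#include <string.h>

#define MAXLEN 64   /* assumed upper bound on input length */

static int C[MAXLEN + 1][MAXLEN + 1];

/* Bottom-up fill: C[i][j] is the LCS length of x[0..i-1] and y[0..j-1]. */
static void fill_table(const char *x, const char *y, int m, int n) {
    for (int i = 0; i <= m; i++) C[i][0] = 0;
    for (int j = 0; j <= n; j++) C[0][j] = 0;
    for (int i = 1; i <= m; i++)
        for (int j = 1; j <= n; j++) {
            if (x[i - 1] == y[j - 1])
                C[i][j] = C[i - 1][j - 1] + 1;
            else
                C[i][j] = C[i][j - 1] > C[i - 1][j] ? C[i][j - 1] : C[i - 1][j];
        }
}

/* Writes one LCS of the (i, j) prefixes to the front of out and
   returns how many characters it wrote. */
static int backtrack(const char *x, const char *y, int i, int j, char *out) {
    if (i == 0 || j == 0)
        return 0;
    if (x[i - 1] == y[j - 1]) {
        int len = backtrack(x, y, i - 1, j - 1, out);
        out[len] = x[i - 1];           /* append the matched symbol */
        return len + 1;
    }
    if (C[i][j - 1] > C[i - 1][j])
        return backtrack(x, y, i, j - 1, out);
    return backtrack(x, y, i - 1, j, out);
}

/* Hypothetical wrapper: out must hold at least min(m, n) + 1 characters. */
int one_lcs(const char *x, const char *y, char *out) {
    int m = (int)strlen(x), n = (int)strlen(y);
    assert(m <= MAXLEN && n <= MAXLEN);
    fill_table(x, y, m, n);
    int len = backtrack(x, y, m, n, out);
    out[len] = '\0';
    return len;
}
```

Which LCS is returned when several exist depends on how ties between `C[i][j-1]` and `C[i-1][j]` are broken; this sketch follows the pseudocode and moves to `(i-1, j)` on a tie.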
Backtracking to read out all LCSs
Pseudocode
    function backtrackAll(C[0..m,0..n], X[1..m], Y[1..n], i, j)
        if i = 0 or j = 0
            return {""}
        else if X[i] = Y[j]
            return {Z + X[i] for all Z in backtrackAll(C, X, Y, i-1, j-1)}
        else
            R := {}
            if C[i,j-1] ≥ C[i-1,j]
                R := backtrackAll(C, X, Y, i, j-1)
            if C[i-1,j] ≥ C[i,j-1]
                R := R ∪ backtrackAll(C, X, Y, i-1, j)
            return R
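C has no built-in set type, so the following sketch emulates the pseudocode above differently: it walks every optimal path through the table, builds the candidate from back to front in a shared buffer, and uses a duplicate check in place of set union. The bounds `MAXLEN` and `MAXRES` and the names `walk`, `all_lcs`, and `lcs_result` are assumptions made for this example.

```c
#include <assert.h>
#include <string.h>

#define MAXLEN 32   /* assumed upper bound on input length */
#define MAXRES 64   /* assumed upper bound on the number of distinct LCSs */

static int C[MAXLEN + 1][MAXLEN + 1];
static char results[MAXRES][MAXLEN + 1];
static int nresults;

/* Bottom-up fill: C[i][j] is the LCS length of x[0..i-1] and y[0..j-1]. */
static void fill_table(const char *x, const char *y, int m, int n) {
    for (int i = 0; i <= m; i++) C[i][0] = 0;
    for (int j = 0; j <= n; j++) C[0][j] = 0;
    for (int i = 1; i <= m; i++)
        for (int j = 1; j <= n; j++) {
            if (x[i - 1] == y[j - 1])
                C[i][j] = C[i - 1][j - 1] + 1;
            else
                C[i][j] = C[i][j - 1] > C[i - 1][j] ? C[i][j - 1] : C[i - 1][j];
        }
}

/* Store s unless it is already stored (stands in for set union). */
static void add_unique(const char *s) {
    for (int r = 0; r < nresults; r++)
        if (strcmp(results[r], s) == 0)
            return;
    assert(nresults < MAXRES);
    strcpy(results[nresults++], s);
}

/* A match at (i, j) is the C[i][j]-th symbol of the LCS, so it is written
   to buf[C[i][j] - 1]; buf is complete once a border of the table is hit. */
static void walk(const char *x, const char *y, int i, int j, char *buf) {
    if (i == 0 || j == 0) {
        add_unique(buf);
        return;
    }
    if (x[i - 1] == y[j - 1]) {
        buf[C[i][j] - 1] = x[i - 1];
        walk(x, y, i - 1, j - 1, buf);
        return;
    }
    if (C[i][j - 1] >= C[i - 1][j]) walk(x, y, i, j - 1, buf);
    if (C[i - 1][j] >= C[i][j - 1]) walk(x, y, i - 1, j, buf);
}

/* Hypothetical wrapper: collects all distinct LCSs, returns how many. */
int all_lcs(const char *x, const char *y) {
    int m = (int)strlen(x), n = (int)strlen(y);
    char buf[MAXLEN + 1];
    assert(m <= MAXLEN && n <= MAXLEN);
    fill_table(x, y, m, n);
    buf[C[m][n]] = '\0';
    nresults = 0;
    walk(x, y, m, n, buf);
    return nresults;
}

/* Accessor for the r-th stored result of the last all_lcs call. */
const char *lcs_result(int r) {
    assert(r >= 0 && r < nresults);
    return results[r];
}
```

For ABCBDAB and BDCABA this collects the three distinct LCSs BCBA, BCAB, and BDAB. The number of optimal paths can grow exponentially with the input length, so the sketch is only meant for small inputs.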
Related problems:
1. Shortest common supersequence
u is a common supersequence of x and y if and only if both x and y are subsequences of u.
Given two sequences X = <x1,...,xm> and Y = <y1,...,yn>, a shortest common supersequence (scs) of X and Y is a shortest sequence U = <u1,...,uk> such that both X and Y are subsequences of U.
Given an lcs Z of X and Y, inserting the non-lcs symbols of X and Y into Z while preserving the symbol order yields an scs U.
Relation to LCS
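A well-known consequence of that construction is the length identity |SCS(X, Y)| = m + n - |LCS(X, Y)|: the lcs symbols are shared, and every other symbol of X and Y appears once. The C sketch below builds one scs by walking the LCS table backwards; `MAXLEN`, `fill_table`, `scs`, and `is_subsequence` are hypothetical names for this example.

```c
#include <assert.h>
#include <string.h>

#define MAXLEN 64   /* assumed upper bound on input length */

static int C[MAXLEN + 1][MAXLEN + 1];

/* Bottom-up fill: C[i][j] is the LCS length of x[0..i-1] and y[0..j-1]. */
static void fill_table(const char *x, const char *y, int m, int n) {
    for (int i = 0; i <= m; i++) C[i][0] = 0;
    for (int j = 0; j <= n; j++) C[0][j] = 0;
    for (int i = 1; i <= m; i++)
        for (int j = 1; j <= n; j++) {
            if (x[i - 1] == y[j - 1])
                C[i][j] = C[i - 1][j - 1] + 1;
            else
                C[i][j] = C[i][j - 1] > C[i - 1][j] ? C[i][j - 1] : C[i - 1][j];
        }
}

/* Returns 1 if s is a subsequence of u, else 0. */
int is_subsequence(const char *s, const char *u) {
    while (*s && *u) {
        if (*s == *u) s++;
        u++;
    }
    return *s == '\0';
}

/* Writes one scs of x and y into out (size >= m + n + 1), returns its length. */
int scs(const char *x, const char *y, char *out) {
    int m = (int)strlen(x), n = (int)strlen(y);
    assert(m <= MAXLEN && n <= MAXLEN);
    fill_table(x, y, m, n);
    int len = m + n - C[m][n];          /* the length identity */
    int i = m, j = n, k = len;
    out[k] = '\0';
    while (i > 0 || j > 0) {            /* fill out from the back */
        if (i > 0 && j > 0 && x[i - 1] == y[j - 1]) {
            out[--k] = x[i - 1];        /* shared lcs symbol, emitted once */
            i--; j--;
        } else if (j == 0 || (i > 0 && C[i - 1][j] >= C[i][j - 1])) {
            out[--k] = x[i - 1];        /* non-lcs symbol of x */
            i--;
        } else {
            out[--k] = y[j - 1];        /* non-lcs symbol of y */
            j--;
        }
    }
    return len;
}
```

For ABCBDAB and BDCABA (lcs length 4) the result has length 7 + 6 - 4 = 9, and `is_subsequence` can be used to confirm that both inputs are subsequences of it.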
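2. Edit distance / Levenshtein distance
Edit distance, also called Levenshtein distance, is the minimum number of edit operations needed to transform one string into another. The permitted operations are substituting one character for another, inserting a character, and deleting a character.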
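The definition above leads to a table-filling scheme very similar to the one used for LCS. The following is a minimal C sketch with unit cost for all three operations; the bound `MAXLEN` and the function name `levenshtein` are assumptions made for this example.

```c
#include <assert.h>
#include <string.h>

#define MAXLEN 64   /* assumed upper bound on input length */

/* D[i][j] = edit distance between s[0..i-1] and t[0..j-1]. */
int levenshtein(const char *s, const char *t) {
    static int D[MAXLEN + 1][MAXLEN + 1];
    int m = (int)strlen(s), n = (int)strlen(t);
    assert(m <= MAXLEN && n <= MAXLEN);
    for (int i = 0; i <= m; i++) D[i][0] = i;   /* delete all i characters */
    for (int j = 0; j <= n; j++) D[0][j] = j;   /* insert all j characters */
    for (int i = 1; i <= m; i++)
        for (int j = 1; j <= n; j++) {
            int sub = D[i - 1][j - 1] + (s[i - 1] != t[j - 1]);  /* substitute or keep */
            int del = D[i - 1][j] + 1;                           /* delete s[i-1] */
            int ins = D[i][j - 1] + 1;                           /* insert t[j-1] */
            int best = sub < del ? sub : del;
            D[i][j] = best < ins ? best : ins;
        }
    return D[m][n];
}
```

For example, `levenshtein("kitten", "sitting")` returns 3: two substitutions and one insertion.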
When only insertion and deletion are allowed (no substitution), or when a substitution costs twice as much as an insertion or deletion, the edit distance is:
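The standard identity for this insertion/deletion-only distance is d'(X, Y) = m + n - 2 * |LCS(X, Y)|, where m and n are the lengths of X and Y: every symbol of X outside a longest common subsequence is deleted, and every such symbol of Y is inserted. A minimal C sketch under that identity follows; `MAXLEN`, `lcs_length`, and `indel_distance` are hypothetical names for this example.

```c
#include <assert.h>
#include <string.h>

#define MAXLEN 64   /* assumed upper bound on input length */

/* Bottom-up LCS length of x and y. */
static int lcs_length(const char *x, const char *y, int m, int n) {
    static int C[MAXLEN + 1][MAXLEN + 1];
    for (int i = 0; i <= m; i++) C[i][0] = 0;
    for (int j = 0; j <= n; j++) C[0][j] = 0;
    for (int i = 1; i <= m; i++)
        for (int j = 1; j <= n; j++) {
            if (x[i - 1] == y[j - 1])
                C[i][j] = C[i - 1][j - 1] + 1;
            else
                C[i][j] = C[i][j - 1] > C[i - 1][j] ? C[i][j - 1] : C[i - 1][j];
        }
    return C[m][n];
}

/* Edit distance when only insertions and deletions are allowed. */
int indel_distance(const char *x, const char *y) {
    int m = (int)strlen(x), n = (int)strlen(y);
    assert(m <= MAXLEN && n <= MAXLEN);
    return m + n - 2 * lcs_length(x, y, m, n);
}
```

For ABCBDAB and BDCABA, whose lcs has length 4, this gives 7 + 6 - 2 * 4 = 5.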