#include <algorithm>
#include <string>
#include <vector>
using namespace std;

class Solution {
public:
    int minDistance(string word1, string word2) {
        int l1 = word1.length();
        int l2 = word2.length();
        // dp[i][j] = minimum number of edits that turn word1[0..i-1] into word2[0..j-1].
        // Heap-allocated table (variable-length arrays are not standard C++).
        vector<vector<int>> dp(l1 + 1, vector<int>(l2 + 1, 0));
        // Base cases: converting to or from an empty prefix.
        for (int i = 0; i <= l1; i++) dp[i][0] = i;
        for (int j = 0; j <= l2; j++) dp[0][j] = j;
        for (int i = 1; i <= l1; i++) {
            for (int j = 1; j <= l2; j++) {
                if (word1[i - 1] == word2[j - 1]) {
                    // Last characters match: no extra operation needed.
                    dp[i][j] = dp[i - 1][j - 1];
                } else {
                    // Cheapest of deletion, insertion, and replacement.
                    dp[i][j] = min(dp[i - 1][j] + 1, dp[i][j - 1] + 1);
                    dp[i][j] = min(dp[i - 1][j - 1] + 1, dp[i][j]);
                }
            }
        }
        return dp[l1][l2];
    }
};
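In the previous post we covered distance metrics commonly used in machine learning. One of them was edit distance: the minimum number of steps needed to transform one string into another, where the allowed operations are replace, delete, and insert.
I had a go at it myself with dynamic programming; the idea is much the same as for the longest common subsequence problem. Here is an explanation that I found very clear: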
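Let dp[i][j] be the minimum number of operations needed to convert word1[0..i - 1] to word2[0..j - 1]. The base cases are dp[i][0] = i and dp[0][j] = j: converting to or from an empty string takes one deletion or insertion per character.
For the general case, compare word1[i - 1] and word2[j - 1]. If they are equal, then no more operation is needed and dp[i][j] = dp[i - 1][j - 1]. Well, what if they are not equal?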
If they are not equal, we need to consider three cases:
- Replace word1[i - 1] by word2[j - 1] (dp[i][j] = dp[i - 1][j - 1] + 1 (for replacement));
- Delete word1[i - 1], once word1[0..i - 2] has been converted to word2[0..j - 1] (dp[i][j] = dp[i - 1][j] + 1 (for deletion));
- Insert word2[j - 1] after word1[0..i - 1], once word1[0..i - 1] has been converted to word2[0..j - 2], so that word1[0..i - 1] + word2[j - 1] = word2[0..j - 1] (dp[i][j] = dp[i][j - 1] + 1 (for insertion)).
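To make the three cases concrete, here is a minimal sketch that fills the same table and then walks back from dp[n][m] to recover which operation was chosen at each step. The function name `editScript` and the wording of the step strings are assumptions made for illustration; they are not part of the original solution.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: returns one shortest edit script as readable steps,
// ordered by position from left to right.
std::vector<std::string> editScript(const std::string& a, const std::string& b) {
    const int n = static_cast<int>(a.size());
    const int m = static_cast<int>(b.size());

    // Fill the table exactly as the recurrence describes.
    std::vector<std::vector<int>> dp(n + 1, std::vector<int>(m + 1, 0));
    for (int i = 0; i <= n; ++i) dp[i][0] = i;
    for (int j = 0; j <= m; ++j) dp[0][j] = j;
    for (int i = 1; i <= n; ++i) {
        for (int j = 1; j <= m; ++j) {
            dp[i][j] = (a[i - 1] == b[j - 1])
                ? dp[i - 1][j - 1]
                : 1 + std::min({dp[i - 1][j - 1], dp[i - 1][j], dp[i][j - 1]});
        }
    }

    // Walk back from the bottom-right cell, naming the case that produced each value.
    std::vector<std::string> steps;
    int i = n, j = m;
    while (i > 0 || j > 0) {
        if (i > 0 && j > 0 && a[i - 1] == b[j - 1]) {
            --i; --j;  // characters match: no operation
        } else if (i > 0 && j > 0 && dp[i][j] == dp[i - 1][j - 1] + 1) {
            steps.push_back(std::string("replace ") + a[i - 1] + " with " + b[j - 1]);
            --i; --j;
        } else if (i > 0 && dp[i][j] == dp[i - 1][j] + 1) {
            steps.push_back(std::string("delete ") + a[i - 1]);
            --i;
        } else {
            steps.push_back(std::string("insert ") + b[j - 1]);
            --j;
        }
    }
    std::reverse(steps.begin(), steps.end());
    return steps;
}
```

The number of steps returned always equals dp[n][m]. When several scripts are equally short, this sketch prefers replacement, then deletion, then insertion; that tie-breaking order is an arbitrary choice.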
Putting these together, we now have:
- dp[i][0] = i;
- dp[0][j] = j;
- dp[i][j] = dp[i - 1][j - 1], if word1[i - 1] = word2[j - 1];
- dp[i][j] = min(dp[i - 1][j - 1] + 1, dp[i - 1][j] + 1, dp[i][j - 1] + 1), otherwise.
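Since each dp[i][j] in the recurrence above depends only on the current row and the one before it, the full table is not strictly needed. The following is a sketch of that idea, keeping two rows at a time; the name `minDistanceTwoRows` is hypothetical, and this is one possible variant rather than the solution from the post.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical variant: same recurrence, but only rows i-1 (prev) and i (cur) are stored.
int minDistanceTwoRows(const std::string& word1, const std::string& word2) {
    const int n = static_cast<int>(word1.size());
    const int m = static_cast<int>(word2.size());

    std::vector<int> prev(m + 1), cur(m + 1);
    for (int j = 0; j <= m; ++j) prev[j] = j;  // dp[0][j] = j

    for (int i = 1; i <= n; ++i) {
        cur[0] = i;  // dp[i][0] = i
        for (int j = 1; j <= m; ++j) {
            if (word1[i - 1] == word2[j - 1]) {
                cur[j] = prev[j - 1];  // dp[i - 1][j - 1]
            } else {
                // replacement, deletion, insertion
                cur[j] = 1 + std::min({prev[j - 1], prev[j], cur[j - 1]});
            }
        }
        std::swap(prev, cur);  // the row just computed becomes "previous"
    }
    return prev[m];  // holds dp[n][m] after the final swap (or dp[0][m] if n == 0)
}
```

This keeps the running time at O(l1 * l2) while reducing memory to O(l2). The trade-off is that the sequence of operations can no longer be recovered, because earlier rows are overwritten.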