# Problem

### Edit Distance


Given two words word1 and word2, find the minimum number of steps required to convert word1 to word2. Each operation counts as one step.

You have the following 3 operations permitted on a word:

a) Insert a character
b) Delete a character
c) Replace a character

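# Solution

## Approach

1. Understand what edit distance is
2. Model the problem in terms of edit distance
3. Since this problem belongs to the DP series, skip the preliminary checks and go straight to asking whether DP can solve it

### Edit distance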

In computer science, edit distance is a way of quantifying how dissimilar two strings (e.g., words) are to one another by counting the minimum number of operations required to transform one string into the other. Edit distances find applications in natural language processing, where automatic spelling correction can determine candidate corrections for a misspelled word by selecting words from a dictionary that have a low distance to the word in question. In bioinformatics, it can be used to quantify the similarity of macromolecules such as DNA, which can be viewed as strings of the letters A, C, G and T.

Given two strings a and b on an alphabet Σ (e.g. the set of ASCII characters, the set of bytes [0..255], etc.), the edit distance d(a, b) is the minimum-weight series of edit operations that transforms a into b. One of the simplest sets of edit operations is that defined by Levenshtein in 1966:

1. Insertion of a single symbol. If a = uv, then inserting the symbol x produces uxv. This can also be denoted ε→x, using ε to denote the empty string.
2. Deletion of a single symbol changes uxv to uv (x→ε).
3. Substitution of a single symbol x for a symbol y ≠ x changes uxv to uyv (x→y).

The Levenshtein distance between "kitten" and "sitting" is 3. A minimal edit script that transforms the former into the latter is:

1. kitten → sitten (substitution of "s" for "k")
2. sitten → sittin (substitution of "i" for "e")
3. sittin → sitting (insertion of "g" at the end).
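The three steps above can be replayed mechanically. The following is a minimal sketch, not part of the original solution: the class name `EditScriptDemo` and its helpers `substitute`, `insert`, and `replay` are hypothetical names introduced only for illustration.

```java
class EditScriptDemo {
    // Replace the character at index i with c.
    static String substitute(String s, int i, char c) {
        return s.substring(0, i) + c + s.substring(i + 1);
    }

    // Insert c so that it ends up at index i.
    static String insert(String s, int i, char c) {
        return s.substring(0, i) + c + s.substring(i);
    }

    // Apply the three listed edits to "kitten".
    static String replay() {
        String s = "kitten";
        s = substitute(s, 0, 's'); // kitten -> sitten
        s = substitute(s, 4, 'i'); // sitten -> sittin
        s = insert(s, 6, 'g');     // sittin -> sitting
        return s;
    }

    public static void main(String[] args) {
        System.out.println(replay()); // prints sitting
    }
}
```

Each helper call corresponds to exactly one edit operation, so three calls match the distance of 3.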

LCS distance (insertions and deletions only) gives a different distance and minimal edit script:

1. delete k at 0
2. insert s at 0
3. delete e at 4
4. insert i at 4
5. insert g at 6

for a total cost/distance of 5 operations.
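To check the figure of 5, here is a small sketch of an insert/delete-only distance. It assumes unit cost per operation; the class name `LcsDistance` and method `lcsDistance` are hypothetical names chosen for this example.

```java
class LcsDistance {
    // Distance when only insertions and deletions are allowed (no substitution).
    static int lcsDistance(String a, String b) {
        int n = a.length(), m = b.length();
        int[][] d = new int[n + 1][m + 1];
        for (int i = 0; i <= n; i++) d[i][0] = i; // delete i characters
        for (int j = 0; j <= m; j++) d[0][j] = j; // insert j characters
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                if (a.charAt(i - 1) == b.charAt(j - 1)) {
                    // Matching characters cost nothing.
                    d[i][j] = d[i - 1][j - 1];
                } else {
                    // Otherwise delete from a or insert from b.
                    d[i][j] = 1 + Math.min(d[i - 1][j], d[i][j - 1]);
                }
            }
        }
        return d[n][m];
    }

    public static void main(String[] args) {
        System.out.println(lcsDistance("kitten", "sitting")); // prints 5
    }
}
```

Without substitution, each of the two mismatched characters costs a delete plus an insert, which is why the result is 5 rather than 3.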

## DP solution

### DP implementation

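The basic algorithm is the Wagner–Fischer algorithm. Using Levenshtein's original operations with unit cost, the edit distance between $a = a_1 \ldots a_m$ and $b = b_1 \ldots b_n$ is given by $d_{m,n}$, defined by the recurrence

$$
d_{i,0} = i \quad (0 \le i \le m), \qquad d_{0,j} = j \quad (0 \le j \le n),
$$

$$
d_{i,j} = \min
\begin{cases}
d_{i-1,j} + 1 & \text{(deletion)} \\
d_{i,j-1} + 1 & \text{(insertion)} \\
d_{i-1,j-1} + [a_i \ne b_j] & \text{(substitution or match)}
\end{cases}
\qquad (1 \le i \le m,\ 1 \le j \le n),
$$

where $[a_i \ne b_j]$ is 1 if the two characters differ and 0 otherwise.

This algorithm can be generalized to handle transpositions by adding an additional term to the minimization in the recursive clause.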
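One common way to add such a transposition term is the optimal string alignment variant, sketched below under the assumption of unit costs. It is not required by the problem above, and the class name `OsaDistance` and method `distance` are hypothetical names used only for illustration.

```java
class OsaDistance {
    // Edit distance with an extra term for swapping two adjacent characters.
    static int distance(String a, String b) {
        int n = a.length(), m = b.length();
        int[][] d = new int[n + 1][m + 1];
        for (int i = 0; i <= n; i++) d[i][0] = i;
        for (int j = 0; j <= m; j++) d[0][j] = j;
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int best = Math.min(d[i - 1][j - 1] + cost,
                        Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1));
                // Additional term: the last two characters are swapped.
                if (i > 1 && j > 1
                        && a.charAt(i - 1) == b.charAt(j - 2)
                        && a.charAt(i - 2) == b.charAt(j - 1)) {
                    best = Math.min(best, d[i - 2][j - 2] + 1);
                }
                d[i][j] = best;
            }
        }
        return d[n][m];
    }

    public static void main(String[] args) {
        System.out.println(distance("ab", "ba")); // prints 1
    }
}
```

This variant never edits the same substring twice, so it can differ from the unrestricted Damerau–Levenshtein distance: for "ca" and "abc" it returns 3.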

The following Java solution implements this DP with unit costs:

```java
public class Solution {
    public int minDistance(String word1, String word2) {
        int len1 = word1.length();
        int len2 = word2.length();

        // Converting to or from an empty string takes one step per character.
        if (len1 == 0) return len2;
        if (len2 == 0) return len1;

        // dp[i][j] = edit distance between the first i chars of word1
        // and the first j chars of word2.
        int[][] dp = new int[len1 + 1][len2 + 1];

        for (int i = 0; i <= len1; i++)
            dp[i][0] = i; // delete all i characters
        for (int j = 0; j <= len2; j++)
            dp[0][j] = j; // insert all j characters

        for (int i = 1; i <= len1; i++) {
            for (int j = 1; j <= len2; j++) {
                // Candidates: dp[i-1][j-1] + cost (substitute or match),
                // dp[i][j-1] + 1 (insert), dp[i-1][j] + 1 (delete).
                int cost = word1.charAt(i - 1) == word2.charAt(j - 1) ? 0 : 1;
                dp[i][j] = Math.min(dp[i - 1][j - 1] + cost,
                        Math.min(dp[i][j - 1] + 1, dp[i - 1][j] + 1));
            }
        }
        return dp[len1][len2];
    }
}
```
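Because each cell depends only on the current and previous rows, the table can be reduced to two rows. The sketch below is an optional space optimization, not the solution above; the class name `EditDistanceTwoRows` is a hypothetical name for illustration.

```java
class EditDistanceTwoRows {
    // Same recurrence as the full table, keeping only two rows in memory.
    static int minDistance(String word1, String word2) {
        int len1 = word1.length(), len2 = word2.length();
        int[] prev = new int[len2 + 1]; // row i-1
        int[] curr = new int[len2 + 1]; // row i
        for (int j = 0; j <= len2; j++) prev[j] = j;
        for (int i = 1; i <= len1; i++) {
            curr[0] = i;
            for (int j = 1; j <= len2; j++) {
                int cost = word1.charAt(i - 1) == word2.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(prev[j - 1] + cost,
                        Math.min(curr[j - 1] + 1, prev[j] + 1));
            }
            // Swap the rows so the row just filled becomes the previous one.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[len2];
    }

    public static void main(String[] args) {
        System.out.println(minDistance("kitten", "sitting")); // prints 3
    }
}
```

Time stays O(len1 × len2), while extra space drops from O(len1 × len2) to O(len2). The trade-off is that the edit script can no longer be recovered by backtracking through the table.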