Favorite Algorithm(s) - Levenshtein distance

String Matching: Levenshtein distance

  • Purpose: to find the smallest number of single-character edits needed to convert one string into the other
  • Intuition behind the method: each edit is the replacement, addition or deletion of a character in a string
  • Steps

Step  Description

1     Set n to be the length of s.
      Set m to be the length of t.
      If n = 0, return m and exit.
      If m = 0, return n and exit.
      Construct a matrix d containing 0..m rows and 0..n columns.

2     Initialize the first row to 0..n.
      Initialize the first column to 0..m.

3     Examine each character of s (i from 1 to n).

4     Examine each character of t (j from 1 to m).

5     If s[i] equals t[j], the cost is 0.
      If s[i] doesn't equal t[j], the cost is 1.

6     Set cell d[i,j] of the matrix (column i, row j) equal to the minimum of:
      a. The cell immediately to the left plus 1: d[i-1,j] + 1.
      b. The cell immediately above plus 1: d[i,j-1] + 1.
      c. The cell diagonally above and to the left plus the cost: d[i-1,j-1] + cost.

7     After the iteration steps (3, 4, 5, 6) are complete, the distance is found in cell d[n,m].
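The steps above can be sketched in Python roughly as follows. This is a minimal illustration, not a tuned implementation: the function name `levenshtein` is my own choice, and the matrix is stored as a list of lists `d[i][j]` with `i` running over s and `j` over t, mirroring the d[i,j] notation rather than the printed row/column layout. Python strings are 0-indexed, so s[i] in the steps becomes `s[i - 1]` in the code.

```python
def levenshtein(s: str, t: str) -> int:
    """Edit distance between s and t, following steps 1 to 7."""
    # Step 1: lengths and trivial cases.
    n, m = len(s), len(t)
    if n == 0:
        return m
    if m == 0:
        return n

    # Step 1 (continued): an (n + 1) x (m + 1) matrix, d[i][j].
    d = [[0] * (m + 1) for _ in range(n + 1)]

    # Step 2: the borders hold 0..n and 0..m.
    for i in range(n + 1):
        d[i][0] = i
    for j in range(m + 1):
        d[0][j] = j

    # Steps 3 and 4: visit every pair of characters.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Step 5: cost is 0 for a match, 1 otherwise.
            cost = 0 if s[i - 1] == t[j - 1] else 1
            # Step 6: minimum over the three neighbouring cells.
            d[i][j] = min(
                d[i - 1][j] + 1,         # one step back in s
                d[i][j - 1] + 1,         # one step back in t
                d[i - 1][j - 1] + cost,  # diagonal neighbour plus cost
            )

    # Step 7: the distance is in the last cell.
    return d[n][m]


print(levenshtein("GUMBO", "GAMBOL"))  # 2
```

The full matrix takes (n + 1) * (m + 1) cells, which is fine for short strings like the ones used in this post.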

  • Example

This section shows how the Levenshtein distance is computed when the source string is "GUMBO" and the target string is "GAMBOL".

Steps 1 and 2

        G  U  M  B  O
     0  1  2  3  4  5
  G  1
  A  2
  M  3
  B  4
  O  5
  L  6

Steps 3 to 6, when i = 1

        G  U  M  B  O
     0  1  2  3  4  5
  G  1  0
  A  2  1
  M  3  2
  B  4  3
  O  5  4
  L  6  5

Steps 3 to 6, when i = 2

        G  U  M  B  O
     0  1  2  3  4  5
  G  1  0  1
  A  2  1  1
  M  3  2  2
  B  4  3  3
  O  5  4  4
  L  6  5  5

Steps 3 to 6, when i = 3

        G  U  M  B  O
     0  1  2  3  4  5
  G  1  0  1  2
  A  2  1  1  2
  M  3  2  2  1
  B  4  3  3  2
  O  5  4  4  3
  L  6  5  5  4

Steps 3 to 6, when i = 4

        G  U  M  B  O
     0  1  2  3  4  5
  G  1  0  1  2  3
  A  2  1  1  2  3
  M  3  2  2  1  2
  B  4  3  3  2  1
  O  5  4  4  3  2
  L  6  5  5  4  3

Steps 3 to 6, when i = 5

        G  U  M  B  O
     0  1  2  3  4  5
  G  1  0  1  2  3  4
  A  2  1  1  2  3  4
  M  3  2  2  1  2  3
  B  4  3  3  2  1  2
  O  5  4  4  3  2  1
  L  6  5  5  4  3  2

Step 7

The distance is in the lower right-hand corner of the matrix, i.e. 2. This matches the intuition that "GUMBO" can be transformed into "GAMBOL" by substituting "A" for "U" and appending "L" (one substitution and one insertion = 2 changes).
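To see which edits the final number stands for, one option is to walk back through the filled matrix from the last cell to the first. The sketch below is an illustration of that idea and goes beyond the steps listed in this post: the function name `edit_script` and the wording of the operation strings are my own, and an optimal edit sequence is generally not unique, so this returns just one of them (it prefers the diagonal move when several moves are possible).

```python
def edit_script(s: str, t: str) -> list[str]:
    """Return one minimal list of edits that turns s into t."""
    n, m = len(s), len(t)

    # Fill the matrix as in steps 1 to 6 (d[i][j]: i over s, j over t).
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i
    for j in range(m + 1):
        d[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if s[i - 1] == t[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)

    # Walk back from d[n][m] to d[0][0], recording the move taken at each cell.
    ops = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            cost = 0 if s[i - 1] == t[j - 1] else 1
            if d[i][j] == d[i - 1][j - 1] + cost:
                if cost:  # a diagonal move with cost 1 is a substitution
                    ops.append(f"substitute {s[i - 1]} -> {t[j - 1]}")
                i, j = i - 1, j - 1
                continue
        if i > 0 and d[i][j] == d[i - 1][j] + 1:
            ops.append(f"delete {s[i - 1]}")
            i -= 1
        else:
            ops.append(f"insert {t[j - 1]}")
            j -= 1

    ops.reverse()  # edits were collected from the end of the strings
    return ops


print(edit_script("GUMBO", "GAMBOL"))  # ['substitute U -> A', 'insert L']
```

The number of recorded edits always equals the distance, because every move other than a matching diagonal step adds exactly 1.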

 

Reposted from: https://www.cnblogs.com/postmodernist/p/5177424.html
