Levenshtein's edit distance

According to Wikipedia, the Levenshtein distance is a string metric for measuring the difference between two sequences. Informally, the Levenshtein distance between two words is the minimum number of single-character edits (insertions, deletions, or substitutions) required to change one word into the other.

Mathematically, the Levenshtein distance between two strings a and b (of lengths |a| and |b|) is given by \operatorname{lev}_{a,b}(|a|,|b|), where

\qquad\operatorname{lev}_{a,b}(i,j) = \begin{cases}
  0 &, i=j=0 \\
  i &, j = 0 \text{ and } i > 0 \\
  j &, i = 0 \text{ and } j > 0 \\
  \min \begin{cases}
          \operatorname{lev}_{a,b}(i-1,j) + 1 \\
          \operatorname{lev}_{a,b}(i,j-1) + 1 \\
          \operatorname{lev}_{a,b}(i-1,j-1) + [a_i \neq b_j]
       \end{cases} &, \text{ else}
\end{cases}

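Here \operatorname{lev}_{a,b}(i,j) is the distance between the first i characters of a and the first j characters of b, and [a_i \neq b_j] is the indicator function, equal to 0 when a_i = b_j and 1 otherwise. The first element of the minimum corresponds to a deletion (from a to b), the second to an insertion, and the third to a match or mismatch, depending on whether the respective symbols are the same.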
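
To make the recurrence concrete, here is a small worked evaluation. The strings a = "ab" and b = "ac" are chosen only for illustration and are not from the definition above. The base cases give \operatorname{lev}_{a,b}(0,0) = 0, \operatorname{lev}_{a,b}(1,0) = 1, \operatorname{lev}_{a,b}(2,0) = 2, \operatorname{lev}_{a,b}(0,1) = 1 and \operatorname{lev}_{a,b}(0,2) = 2; the remaining entries follow from the minimum, listed in the same order as in the definition (deletion, insertion, match or mismatch):

```latex
\begin{aligned}
\operatorname{lev}_{a,b}(1,1) &= \min\{\,1 + 1,\ 1 + 1,\ 0 + [\mathrm{a} \neq \mathrm{a}]\,\} = \min\{2, 2, 0\} = 0 \\
\operatorname{lev}_{a,b}(1,2) &= \min\{\,2 + 1,\ 0 + 1,\ 1 + [\mathrm{a} \neq \mathrm{c}]\,\} = \min\{3, 1, 2\} = 1 \\
\operatorname{lev}_{a,b}(2,1) &= \min\{\,0 + 1,\ 2 + 1,\ 1 + [\mathrm{b} \neq \mathrm{a}]\,\} = \min\{1, 3, 2\} = 1 \\
\operatorname{lev}_{a,b}(2,2) &= \min\{\,1 + 1,\ 1 + 1,\ 0 + [\mathrm{b} \neq \mathrm{c}]\,\} = \min\{2, 2, 1\} = 1
\end{aligned}
```

The result \operatorname{lev}_{a,b}(2,2) = 1 matches the intuition that a single substitution (b to c) turns "ab" into "ac".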

 

Below is an implementation in Java.

 

package test.LevenshteinDistance;

public class LevenshteinDistance {

	/**
	 * Computes and prints the distance for a sample pair of words.
	 *
	 * @param args unused
	 */
	public static void main(String[] args) {
		String s = "kitten";
		String t = "sitting";
		
		LevenshteinDistance ld = new LevenshteinDistance();
		
		//int recurlen = ld.recurLevenDistance(s, t);
		int dynlen = ld.dynLevenDistance(s, t);
		//System.out.println("The distance is : " + recurlen);
		System.out.println("The distance is : " + dynlen);
	}
	
	// Smallest of three values. The strict comparisons used previously
	// (a < b && a < c, then b < a && b < c) fell through to c whenever
	// a == b, e.g. minimum(1, 1, 5) returned 5 instead of 1.
	public int minimum(int a, int b, int c){
		return Math.min(Math.min(a, b), c);
	}
	
	// Compute the distance recursively, following the definition directly.
	// The same subproblems are solved repeatedly, so the running time is
	// exponential; use this only for short strings.
	public int recurLevenDistance(String s, String t){
		int slen = s.length();
		int tlen = t.length();
		
		// Base cases: if one string is empty, the distance is the length of the other.
		if(slen == 0) return tlen;
		if(tlen == 0) return slen;
		
		// Match or mismatch: align the last characters of s and t.
		int sub = recurLevenDistance(s.substring(0, slen - 1), t.substring(0, tlen - 1)) 
				+ ((s.charAt(slen - 1) == t.charAt(tlen - 1)) ? 0 : 1);
		
		// Deletion: drop the last character of s.
		int del = recurLevenDistance(s.substring(0, slen - 1), t) + 1;
		// Insertion: append the last character of t.
		int ins = recurLevenDistance(s, t.substring(0, tlen - 1)) + 1;
		
		return minimum(del, ins, sub);
	}
	
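	// Compute the distance bottom-up with dynamic programming:
	// distance[i][j] is the distance between the first i characters of s
	// and the first j characters of t.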
	public int dynLevenDistance(String s, String t){
		int[][] distance = new int[s.length() + 1][t.length() + 1];
		
		int i, j;
		for(i = 0; i <= s.length(); i++)
			distance[i][0] = i;
		for(j = 0; j <= t.length(); j++)
			distance[0][j] = j;
		
		for(i = 1; i <= s.length(); i++){
			for(j = 1; j <= t.length(); j++){
				distance[i][j] = minimum(distance[i - 1][j] + 1,
						distance[i][j - 1] + 1,
						distance[i - 1][j - 1] + ((s.charAt(i - 1) == t.charAt(j - 1)) ? 0 : 1));
			}
		}
		
		return distance[s.length()][t.length()];
	}
}
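
The dynamic-programming method above keeps the whole (|s| + 1) × (|t| + 1) table, but each row depends only on the row before it. As a minimal sketch of that observation, and not part of the original listing, the variant below keeps just two rows. The class name `TwoRowLevenshtein` and the method name `distance` are hypothetical, chosen for illustration.

```java
// Hypothetical helper class, not part of the original listing.
final class TwoRowLevenshtein {

    // Returns the Levenshtein distance between s and t,
    // using two arrays of length |t| + 1 instead of a full table.
    static int distance(String s, String t) {
        int[] prev = new int[t.length() + 1]; // row i - 1 of the table
        int[] curr = new int[t.length() + 1]; // row i of the table

        // Row 0: the empty prefix of s becomes t[0..j) with j insertions.
        for (int j = 0; j <= t.length(); j++) {
            prev[j] = j;
        }

        for (int i = 1; i <= s.length(); i++) {
            // Column 0: s[0..i) becomes the empty string with i deletions.
            curr[0] = i;
            for (int j = 1; j <= t.length(); j++) {
                int cost = (s.charAt(i - 1) == t.charAt(j - 1)) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(prev[j] + 1,      // deletion
                                 curr[j - 1] + 1), // insertion
                        prev[j - 1] + cost);       // match or mismatch
            }
            // The row just filled becomes the previous row for the next i.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        // After the final swap, prev holds the last completed row.
        return prev[t.length()];
    }
}
```

Calling `TwoRowLevenshtein.distance("kitten", "sitting")` gives 3, the same value as the full-table method, since only the storage changes and not the recurrence.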

