Analyzing text similarity and difference in Java
2018-11-22
package util;

/**
 * Computes the difference/similarity between two texts.
 * The larger the returned number, the more the two strings differ.
 *
 * @author Administrator
 */
public class Distance {

    public static void main(String[] args) {
        Distance dis = new Distance();
        String s1 = "比干缘何落马死,勾践因此可吞吴省长市长共商议,抓好米袋菜篮子饭店是事实上是个 规范的炎热他依然额头";
        String s2 = "比干缘何落马死勾践因此可吞吴省长市长共商议,抓好米袋分舵是否但是发斯蒂芬斯蒂芬的说法都是菜篮子";
        // System.out.println((s1.length()));
        System.out.println(dis.LD(s2, s1));
    }

    // ****************************
    // Get minimum of three values
    // ****************************
    private int Minimum(int a, int b, int c) {
        int mi = a;
        if (b < mi) {
            mi = b;
        }
        if (c < mi) {
            mi = c;
        }
        return mi;
    }

    // *****************************
    // Compute Levenshtein distance
    // *****************************
    public int LD(String s, String t) {
        int[][] d; // matrix
        int n; // length of s
        int m; // length of t
        int i; // iterates through s
        int j; // iterates through t
        char s_i; // ith character of s
        char t_j; // jth character of t
        int cost; // cost

        // Step 1: an empty string is as far from the other as the other is long
        n = s.length();
        m = t.length();
        if (n == 0) {
            return m;
        }
        if (m == 0) {
            return n;
        }
        d = new int[n + 1][m + 1];

        // Step 2: initialize the first column and the first row
        for (i = 0; i <= n; i++) {
            d[i][0] = i;
        }
        for (j = 0; j <= m; j++) {
            d[0][j] = j;
        }

        // Step 3: walk through each character of s
        for (i = 1; i <= n; i++) {
            s_i = s.charAt(i - 1);

            // Step 4: walk through each character of t
            for (j = 1; j <= m; j++) {
                t_j = t.charAt(j - 1);

                // Step 5: substitution is free when the characters match
                if (s_i == t_j) {
                    cost = 0;
                } else {
                    cost = 1;
                }

                // Step 6: cheapest of deletion, insertion, substitution
                d[i][j] = Minimum(d[i - 1][j] + 1, d[i][j - 1] + 1,
                        d[i - 1][j - 1] + cost);
            }
        }

        // Step 7: the bottom-right cell holds the distance
        return d[n][m];
    }
}
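The class above returns only the raw edit distance, so a larger number means the strings differ more. The sketch below shows one possible way to turn that distance into a similarity score between 0 and 1, assuming it is normalized by the length of the longer string. The class name TextSimilarity and the methods distance and similarity are hypothetical and not from the original post. It also keeps only two rows of the matrix at a time, which is a memory-saving variation and not the full-matrix approach used above.

```java
/** Hypothetical helper for illustration; not part of the original Distance class. */
final class TextSimilarity {

    private TextSimilarity() {}

    /** Levenshtein distance, keeping only the previous and current matrix rows. */
    static int distance(String s, String t) {
        int n = s.length();
        int m = t.length();
        if (n == 0) return m;
        if (m == 0) return n;

        int[] prev = new int[m + 1];
        int[] curr = new int[m + 1];
        for (int j = 0; j <= m; j++) {
            prev[j] = j; // row 0: distance from the empty prefix of s
        }

        for (int i = 1; i <= n; i++) {
            curr[0] = i; // column 0: distance to the empty prefix of t
            char si = s.charAt(i - 1);
            for (int j = 1; j <= m; j++) {
                int cost = (si == t.charAt(j - 1)) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(prev[j] + 1, curr[j - 1] + 1),
                        prev[j - 1] + cost);
            }
            // Swap the rows so the current row becomes the previous one.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[m];
    }

    /**
     * Assumed normalization: 1 - distance / max(length).
     * Returns 1.0 for identical strings (including two empty strings).
     */
    static double similarity(String s, String t) {
        int maxLen = Math.max(s.length(), t.length());
        if (maxLen == 0) {
            return 1.0;
        }
        return 1.0 - (double) distance(s, t) / maxLen;
    }

    public static void main(String[] args) {
        System.out.println(distance("kitten", "sitting")); // prints 3
        System.out.println(similarity("kitten", "sitting"));
    }
}
```

Both versions compare strings with charAt, that is, by UTF-16 code unit, so a character outside the Basic Multilingual Plane counts as two positions. For the Chinese sample strings in the post this makes no difference.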