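This is not original work: I found this ready-made code online, but I need to modify it, so I am saving a copy here first.
Every word in str1 is compared with every word in str2 in a nested loop, and the pairwise scores are averaged.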
import edu.sussex.nlp.jws.JWS;
import edu.sussex.nlp.jws.Lin;

public class TestExamples {
    private String str1;
    private String str2;
    // Directory of the local WordNet installation; adjust to your setup.
    private String dir = "C:/WordNet/";
    // Loads WordNet 2.1 and sets up the similarity measures.
    private JWS ws = new JWS(dir, "2.1");

    public TestExamples(String str1, String str2) {
        this.str1 = str1;
        this.str2 = str2;
    }

    // Averages the Lin score over every (word of str1, word of str2) pair.
    public double getSimilarity() {
        String[] strs1 = splitString(str1);
        String[] strs2 = splitString(str2);
        double sum = 0.0;
        for (String s1 : strs1) {
            for (String s2 : strs2) {
                double sc = maxScoreOfLin(s1, s2);
                sum += sc;
                // The message reads: "Currently computing: s1 VS s2, similarity: sc"
                System.out.println("当前计算: " + s1 + " VS " + s2 + " 的相似度为:" + sc);
            }
        }
        return sum / (strs1.length * strs2.length);
    }

    // Splits on runs of whitespace so repeated spaces do not produce empty words.
    private String[] splitString(String str) {
        return str.trim().split("\\s+");
    }

    // Scores the pair as nouns first; falls back to verbs if the noun score is 0.
    private double maxScoreOfLin(String str1, String str2) {
        Lin lin = ws.getLin();
        double sc = lin.max(str1, str2, "n");
        if (sc == 0) {
            sc = lin.max(str1, str2, "v");
        }
        return sc;
    }

    public static void main(String[] args) {
        String s1 = "was born in";
        String s2 = "birth place is";
        TestExamples sm = new TestExamples(s1, s2);
        System.out.println(sm.getSimilarity());
    }
}
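Since the plan is to modify this code, it may help to have the averaging loop separated from JWS so it can be run without a WordNet installation. The following is only a minimal sketch under that assumption: `PairwiseAverage`, `average`, and `tokenize` are hypothetical names that are not part of JWS, and the exact-match scorer is a stand-in for the Lin measure. It also returns 0 for an empty phrase, which is an assumption of this sketch rather than behavior of the code above.

```java
import java.util.function.ToDoubleBiFunction;

// Hypothetical helper, not part of JWS: averages a pluggable word-pair score
// over every (word1, word2) combination of two phrases.
final class PairwiseAverage {

    static double average(String phrase1, String phrase2,
                          ToDoubleBiFunction<String, String> scorer) {
        String[] words1 = tokenize(phrase1);
        String[] words2 = tokenize(phrase2);
        if (words1.length == 0 || words2.length == 0) {
            return 0.0; // assumption: an empty phrase is treated as similarity 0
        }
        double sum = 0.0;
        for (String w1 : words1) {
            for (String w2 : words2) {
                sum += scorer.applyAsDouble(w1, w2);
            }
        }
        return sum / (words1.length * words2.length);
    }

    // Splits on runs of whitespace; a blank string yields no tokens.
    static String[] tokenize(String phrase) {
        String trimmed = phrase.trim();
        return trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    public static void main(String[] args) {
        // Stand-in scorer: 1.0 for identical words (ignoring case), 0.0 otherwise.
        ToDoubleBiFunction<String, String> exactMatch =
                (a, b) -> a.equalsIgnoreCase(b) ? 1.0 : 0.0;
        System.out.println(average("was born in", "born in Paris", exactMatch));
    }
}
```

To plug the real measure back in, the scorer could be a lambda that wraps the noun-then-verb lookup from `maxScoreOfLin`.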
The output is as follows (each 当前计算 line reads "Currently computing: X VS Y, similarity: score"):
Loading modules
set up:
... finding noun and verb <roots>
... calculating IC <roots> ...
... ICFinder
... DepthFinder
... PathFinder
... JiangAndConrath
... Lin
... Resnik
... Path
... WuAndPalmer
... Adapted Lesk : all relations
... Adapted Lesk (1)
... Adapted Lesk (2)
... HirstAndStOnge
... LeacockAndChodorow
... calculating depths of <roots> ...
Java WordNet::Similarity using WordNet 2.1 : loaded
当前计算: was VS birth 的相似度为:0.0
当前计算: was VS place 的相似度为:0.0
当前计算: was VS is 的相似度为:0.0
当前计算: born VS birth 的相似度为:0.0
当前计算: born VS place 的相似度为:0.0
当前计算: born VS is 的相似度为:0.0
当前计算: in VS birth 的相似度为:0.2853640399251109
当前计算: in VS place 的相似度为:0.3983913320550497
当前计算: in VS is 的相似度为:0.0
0.07597281910890673
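As a check on the last line, the final value is just the mean of the nine pairwise scores printed above, of which only two are nonzero. The symbols $W_1$, $W_2$, and $s(w_1, w_2)$ are notation introduced here for the two word lists and the pairwise score; they do not appear in the code.

```latex
\[
\mathrm{sim}
  = \frac{1}{|W_1|\,|W_2|} \sum_{w_1 \in W_1} \sum_{w_2 \in W_2} s(w_1, w_2)
  = \frac{0.2853640399251109 + 0.3983913320550497}{3 \times 3}
  = \frac{0.6837553719801606}{9}
  \approx 0.0759728
\]
```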