JWS Synonym Lookup

Latest recommended article published on 2021-02-27 01:37:27

Lycoris_

This code is not my own work. I found it ready-made online, but I need to modify it, so I am saving a copy here first.

Every word in str1 is compared with every word in str2 in a nested loop, and the scores are averaged.

import edu.sussex.nlp.jws.JWS;
import edu.sussex.nlp.jws.Lin;


public class TestExamples {

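    private String str1;
    private String str2;
    // Root directory of the local WordNet installation; adjust as needed.
    private String dir = "C:/WordNet/";
    // JWS instance bound to WordNet 2.1.
    private JWS    ws = new JWS(dir, "2.1");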
    
    public TestExamples(String str1,String str2){
        this.str1=str1;
        this.str2=str2;
    }
    
    // Averages the Lin similarity over every word pair (s1 from str1, s2 from str2).
    public double getSimilarity(){
        String[] strs1 = splitString(str1);
        String[] strs2 = splitString(str2);
        double sum = 0.0;
        for(String s1 : strs1){
            for(String s2: strs2){
                double sc= maxScoreOfLin(s1,s2);
                sum+= sc;
                System.out.println("Current pair: "+s1+" VS "+s2+", similarity: "+sc);
            }
        }
        // Mean over all strs1.length * strs2.length pairs.
        return sum /(strs1.length * strs2.length);
    }
    
    // Splits on whitespace; trim() and \\s+ avoid empty tokens from extra spaces.
    private String[] splitString(String str){
        return str.trim().split("\\s+");
    }
    
    // Highest Lin score for the pair as nouns; falls back to verbs when that is 0.
    private double maxScoreOfLin(String w1,String w2){
        Lin lin = ws.getLin();
        double sc = lin.max(w1, w2, "n");
        if(sc==0){
            sc = lin.max(w1, w2, "v");
        }
        return sc;
    }
    
    public static void main(String[] args){
        String s1="was born in";
        String s2="birth place is";
        TestExamples sm= new TestExamples(s1, s2);
        System.out.println(sm.getSimilarity());
    }
}

The output is as follows:
Loading modules
set up:
... finding noun and verb <roots>
... calculating IC <roots> ...
... ICFinder
... DepthFinder
... PathFinder
... JiangAndConrath
... Lin
... Resnik
... Path
... WuAndPalmer
... Adapted Lesk : all relations
... Adapted Lesk (1)
... Adapted Lesk (2)
... HirstAndStOnge
... LeacockAndChodorow
... calculating depths of <roots> ...

Java WordNet::Similarity using WordNet 2.1 : loaded

Current pair: was VS birth, similarity: 0.0
Current pair: was VS place, similarity: 0.0
Current pair: was VS is, similarity: 0.0
Current pair: born VS birth, similarity: 0.0
Current pair: born VS place, similarity: 0.0
Current pair: born VS is, similarity: 0.0
Current pair: in VS birth, similarity: 0.2853640399251109
Current pair: in VS place, similarity: 0.3983913320550497
Current pair: in VS is, similarity: 0.0
0.07597281910890673
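
To make it clearer where the final number comes from, here is a minimal sketch of the same pairwise averaging without JWS or WordNet. The class name PairwiseAverageSketch and the stubScore method are hypothetical stand-ins that I made up for illustration: stubScore simply replays the two non-zero scores from the output above and returns 0.0 for every other pair. It is not part of JWS and is only meant to illustrate the arithmetic.

```java
import java.util.Map;
import java.util.function.ToDoubleBiFunction;

public class PairwiseAverageSketch {

    // Averages scorer(w1, w2) over every word pair, mirroring getSimilarity().
    static double averageSimilarity(String a, String b,
                                    ToDoubleBiFunction<String, String> scorer) {
        String[] wordsA = a.trim().split("\\s+");
        String[] wordsB = b.trim().split("\\s+");
        double sum = 0.0;
        for (String w1 : wordsA) {
            for (String w2 : wordsB) {
                sum += scorer.applyAsDouble(w1, w2);
            }
        }
        return sum / (wordsA.length * wordsB.length);
    }

    // Hypothetical stand-in for the JWS call: replays the two non-zero
    // scores from the output above and returns 0.0 for all other pairs.
    static double stubScore(String w1, String w2) {
        Map<String, Double> known = Map.of(
                "in|birth", 0.2853640399251109,
                "in|place", 0.3983913320550497);
        return known.getOrDefault(w1 + "|" + w2, 0.0);
    }

    public static void main(String[] args) {
        double avg = averageSimilarity("was born in", "birth place is",
                PairwiseAverageSketch::stubScore);
        System.out.println(avg);
    }
}
```

Only two of the nine pairs have a non-zero score, so the sum of roughly 0.684 is divided by 9, which gives the final value of about 0.076. When modifying the original code, one option worth considering is to average only over pairs that actually receive a score, since the seven zero pairs pull the result down considerably.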