Calling the Stanford Parser from Java in Eclipse

Reposted from http://blog.sina.com.cn/s/blog_8af106960101a64w.html

Official site of the Stanford Parser (a syntactic parser): http://nlp.stanford.edu/software/lex-parser.shtml#Download

==================================================================================

Download:

  Official site -> Download Stanford Parser version 3.2.0 -> stanford-parser-full-2013-06-20.zip

==================================================================================

Unzip: stanford-parser-full-2013-06-20.zip -> stanford-parser-full

  Then, inside that folder:

  -> unzip stanford-parser.jar -> stanford-parser folder

  -> unzip stanford-parser-3.2.0-models.jar -> stanford-parser-3.2.0-models folder -> copy englishPCFG.ser.gz from

  edu\stanford\nlp\models\lexparser in that folder into the project's source folder.

==================================================================================

Usage:

  Reference 1: http://sbp810050504.blog.51cto.com/2799422/778398

  Reference 2: http://blog.sina.com.cn/s/blog_59e0c16f0100ufsv.html

  First, put stanford-parser.jar into the project's lib folder, then right-click it and choose Build Path -> Add to Build Path. Only stanford-parser.jar needs to be added.

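  Reference 1 also adds englishPCFG.ser.gz to Referenced Libraries. That is wrong!!! It caused an error on my machine, so do not add it.

  Also, at first I had to use an absolute path when initializing the parser, which again differs from the references!

  Update: copying englishPCFG.ser.gz into the project's source folder is enough.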

  Project layout:

  stanfordpatser (project name)
   src (code)
     stanfordparser (package)
       Parser.java (class)
   lib
     stanford-parser.jar
   source
     englishPCFG.ser.gz
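
A relative path such as source/englishPCFG.ser.gz is resolved against the JVM working directory, which for a class launched from Eclipse is normally the project root. The following is a minimal sketch, using only the JDK, for checking where that path actually points before handing it to the parser. The class name ModelPathCheck and the helper resolve are hypothetical names made up for this illustration and are not part of the Stanford Parser.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ModelPathCheck {
    // Hypothetical helper: resolve a relative path against a base directory.
    static Path resolve(String baseDir, String relativePath) {
        return Paths.get(baseDir).resolve(relativePath).toAbsolutePath().normalize();
    }

    public static void main(String[] args) {
        // "user.dir" holds the JVM working directory.
        String workingDir = System.getProperty("user.dir");
        Path model = resolve(workingDir, "source/englishPCFG.ser.gz");

        System.out.println("Looking for the model at: " + model);
        System.out.println(Files.isRegularFile(model) ? "found" : "missing");
    }
}
```

If this prints "missing", the model file is probably not in the source folder of the project, or the program is being started from a different working directory.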

==================================================================================

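Code in my own .java file

  package stanfordparser;

  import java.io.StringReader;
  import java.util.List;
  import edu.stanford.nlp.ling.Word;
  import edu.stanford.nlp.parser.lexparser.LexicalizedParser;
  import edu.stanford.nlp.process.PTBTokenizer;

  public class Parser {
    public static void main(String[] args) {
      // LexicalizedParser lp = LexicalizedParser.loadModel("...\\stanford-parser-full\\englishPCFG.ser.gz"); // absolute path on my machine
      // A relative path is enough
      LexicalizedParser lp = LexicalizedParser.loadModel("source/englishPCFG.ser.gz");

      String subsen = "One beer later and I'm walking down the street smoking a cig with them";
      // Tokenize the raw sentence, then parse the token list
      PTBTokenizer<Word> ptb = PTBTokenizer.newPTBTokenizer(new StringReader(subsen));
      List<Word> words = ptb.tokenize();
      System.out.println(lp.parse(words));
    }
  }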
=================

Result:

  (ROOT (S (NP (NP (CD One) (NN beer) (RB later)) (CC and) (NP (PRP I))) (VP (VBP 'm) (VP (VBG walking) (PRT (RP down)) (NP (NP (DT the) (NN street)) (VP (VBG smoking) (NP (DT a) (NN cig)) (PP (IN with) (NP (PRP them)))))))))
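
The result is a bracketed tree in which every innermost pair of parentheses has the form (TAG word). As a rough illustration of how to read it, here is a small JDK-only sketch that pulls those innermost pairs out of a bracketed string with a regular expression. It is a text-level shortcut and not the parser's own API; the class name BracketLeaves and the method taggedWords are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BracketLeaves {
    // An innermost node looks like "(TAG word)": no spaces or parentheses inside either part.
    private static final Pattern PRETERMINAL =
        Pattern.compile("\\(([^\\s()]+) ([^\\s()]+)\\)");

    // Hypothetical helper: returns the leaves as "word/TAG" strings, in sentence order.
    static List<String> taggedWords(String bracketed) {
        List<String> result = new ArrayList<>();
        Matcher m = PRETERMINAL.matcher(bracketed);
        while (m.find()) {
            result.add(m.group(2) + "/" + m.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        // The first noun phrase of the result above.
        String np = "(NP (NP (CD One) (NN beer) (RB later)) (CC and) (NP (PRP I)))";
        System.out.println(taggedWords(np));
        // prints [One/CD, beer/NN, later/RB, and/CC, I/PRP]
    }
}
```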

==================================================================================

The bundled sample, ParserDemo.java

import java.io.StringReader;
import java.util.List;

import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.ling.Sentence;
import edu.stanford.nlp.parser.lexparser.LexicalizedParser;
import edu.stanford.nlp.process.CoreLabelTokenFactory;
import edu.stanford.nlp.process.PTBTokenizer;
import edu.stanford.nlp.process.TokenizerFactory;
import edu.stanford.nlp.trees.*;

class ParserDemo {

  public static void main(String[] args) {
    // LexicalizedParser lp = LexicalizedParser.loadModel("D:\\my download\\Parser\\Stanford Parser\\stanford-parser-full\\englishPCFG.ser.gz");

    // A relative path is enough
    LexicalizedParser lp = LexicalizedParser.loadModel("source/englishPCFG.ser.gz");
    demoAPI(lp);
  }

  public static void demoAPI(LexicalizedParser lp) {
    // Block 1: this option shows parsing a list of correctly tokenized words
    String[] sent = { "This", "is", "an", "easy", "sentence", "." };
    List<CoreLabel> rawWords = Sentence.toCoreLabelList(sent);
    Tree parse = lp.apply(rawWords);
    parse.pennPrint();
    System.out.println();

    // Block 2: this option shows loading and using an explicit tokenizer
    String sent2 = "This is another sentence.";
    TokenizerFactory<CoreLabel> tokenizerFactory =
      PTBTokenizer.factory(new CoreLabelTokenFactory(), "");
    List<CoreLabel> rawWords2 =
      tokenizerFactory.getTokenizer(new StringReader(sent2)).tokenize();
    parse = lp.apply(rawWords2);

    TreebankLanguagePack tlp = new PennTreebankLanguagePack();
    GrammaticalStructureFactory gsf = tlp.grammaticalStructureFactory();
    GrammaticalStructure gs = gsf.newGrammaticalStructure(parse);
    List<TypedDependency> tdl = gs.typedDependenciesCCprocessed();
    System.out.println(tdl);

    // for (TypedDependency tdl1 : tdl) {
    //   System.out.println(tdl1);        // the full dependency, e.g. nsubj(sentence-4, This-1)
    //   System.out.println(tdl1.gov());  // the governor, e.g. sentence-4
    //   System.out.println(tdl1.dep());  // the dependent, e.g. This-1
    //   System.out.println(tdl1.reln()); // the relation, e.g. nsubj
    // }

    System.out.println();

    // Block 3: print the tree and the collapsed typed dependencies
    TreePrint tp = new TreePrint("penn,typedDependenciesCollapsed");
    tp.printTree(parse);
  }
}

Output:

Loading parser from serialized file D:\my download\Parser\Stanford Parser\stanford-parser-full\englishPCFG.ser.gz ... done [1.6 sec].

Loading parser from serialized file source/englishPCFG.ser.gz ... done [1.4 sec].
(ROOT
  (S
    (NP (DT This))
    (VP (VBZ is)
      (NP (DT an) (JJ easy) (NN sentence)))
    (. .)))

[nsubj(sentence-4, This-1), cop(sentence-4, is-2), det(sentence-4, another-3), root(ROOT-0, sentence-4)]

(ROOT
  (S
    (NP (DT This))
    (VP (VBZ is)
      (NP (DT another) (NN sentence)))
    (. .)))

nsubj(sentence-4, This-1)
cop(sentence-4, is-2)
det(sentence-4, another-3)
root(ROOT-0, sentence-4)
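
Each dependency line above has the shape relation(governor-index, dependent-index). If only the printed text is available (for example in a log file), the parts can be recovered with a regular expression. The sketch below uses only the JDK; the class name DependencyLine and the method split are hypothetical, and it assumes the plain word-number form shown above, without any extra marks after the index.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DependencyLine {
    // relation(governor-index, dependent-index), e.g. nsubj(sentence-4, This-1)
    private static final Pattern DEP =
        Pattern.compile("(\\w+)\\((.+)-(\\d+), (.+)-(\\d+)\\)");

    // Hypothetical helper: returns {relation, governor, governorIndex, dependent, dependentIndex}.
    static String[] split(String line) {
        Matcher m = DEP.matcher(line.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a dependency line: " + line);
        }
        return new String[] { m.group(1), m.group(2), m.group(3), m.group(4), m.group(5) };
    }

    public static void main(String[] args) {
        String[] lines = {
            "nsubj(sentence-4, This-1)",
            "cop(sentence-4, is-2)",
            "det(sentence-4, another-3)",
            "root(ROOT-0, sentence-4)"
        };
        for (String line : lines) {
            String[] p = split(line);
            System.out.println(p[0] + ": " + p[1] + "(" + p[2] + ") -> " + p[3] + "(" + p[4] + ")");
        }
    }
}
```

Inside the program itself this is unnecessary, because the TypedDependency objects in the commented-out loop of demoAPI already expose gov(), dep() and reln().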

 
