Shortest subtree between two nodes in a dependency tree (Java)

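Use StanfordParser to parse a sentence into a dependency tree, then find the shortest path between two given nodes. baseDependencies yields a tree structure in which every node has exactly one parent; collapsedDependencies yields a graph structure in which a node may have several parents.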
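The tree-versus-graph distinction can be checked mechanically: in a tree, no dependent appears under two different governors. The following is a minimal sketch under that assumption. `SingleParentCheck` and `everyNodeHasOneParent` are hypothetical names, and each edge is a plain `{governor, dependent}` pair of token indices rather than a real `TypedDependency`.

```java
import java.util.HashMap;
import java.util.Map;

class SingleParentCheck {

    /**
     * Returns true when no dependent index is attached to more than one
     * distinct governor. Each edge is {governorIndex, dependentIndex}.
     */
    static boolean everyNodeHasOneParent(int[][] edges) {
        Map<Integer, Integer> parentOf = new HashMap<>();
        for (int[] edge : edges) {
            int governor = edge[0];
            int dependent = edge[1];
            // putIfAbsent returns the previously stored governor, or null if none.
            Integer seen = parentOf.putIfAbsent(dependent, governor);
            if (seen != null && seen != governor) {
                return false; // a second, different parent: graph, not tree
            }
        }
        return true;
    }
}
```

If the check fails, a path search has to cope with more than one route between two nodes, which is why a general graph search is a safer choice than walking up to a common ancestor.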

Premise: given a sentence, StanfordParser parses it into a dependency tree; for two given nodes, find the shortest dependency tree that connects them.
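One way to make the premise concrete, independent of the parser, is to treat each dependency as an undirected edge between token indices and run a breadth-first search. The sketch below is not the `getShortestDependentPath` implementation from this post. `DependencyPathSketch`, `Edge`, and `shortestPath` are hypothetical names, and the edges in `main` are a toy tree, not real parser output.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DependencyPathSketch {

    /** Hypothetical stand-in for a typed dependency: governor index -> dependent index. */
    static final class Edge {
        final int governor;
        final int dependent;

        Edge(int governor, int dependent) {
            this.governor = governor;
            this.dependent = dependent;
        }
    }

    /** Returns the token indices on a shortest path from start to goal, or an empty list. */
    static List<Integer> shortestPath(List<Edge> edges, int start, int goal) {
        // Build an undirected adjacency list: a path may go up or down the tree.
        Map<Integer, List<Integer>> adjacency = new HashMap<>();
        for (Edge e : edges) {
            adjacency.computeIfAbsent(e.governor, k -> new ArrayList<>()).add(e.dependent);
            adjacency.computeIfAbsent(e.dependent, k -> new ArrayList<>()).add(e.governor);
        }

        // Breadth-first search, remembering the node each node was reached from.
        Map<Integer, Integer> previous = new HashMap<>();
        Deque<Integer> queue = new ArrayDeque<>();
        previous.put(start, start);
        queue.add(start);
        while (!queue.isEmpty()) {
            int current = queue.poll();
            if (current == goal) {
                break;
            }
            for (int next : adjacency.getOrDefault(current, Collections.emptyList())) {
                if (!previous.containsKey(next)) {
                    previous.put(next, current);
                    queue.add(next);
                }
            }
        }

        if (!previous.containsKey(goal)) {
            return Collections.emptyList(); // the two nodes are not connected
        }

        // Walk back from goal to start, then reverse.
        List<Integer> path = new ArrayList<>();
        for (int node = goal; node != start; node = previous.get(node)) {
            path.add(node);
        }
        path.add(start);
        Collections.reverse(path);
        return path;
    }

    public static void main(String[] args) {
        // Toy tree: 2 governs 1 and 4; 4 governs 3 and 5.
        List<Edge> edges = Arrays.asList(
                new Edge(2, 1), new Edge(2, 4), new Edge(4, 3), new Edge(4, 5));
        System.out.println(shortestPath(edges, 1, 5)); // prints [1, 2, 4, 5]
    }
}
```

Breadth-first search is used here because it also works on the collapsed representation, where a node may have several parents and more than one route can exist. With real parser output, the indices would come from each dependency's governor and dependent words.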

package StanfordParser;

import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.ling.IndexedWord;
import edu.stanford.nlp.parser.lexparser.LexicalizedParser;
import edu.stanford.nlp.process.CoreLabelTokenFactory;
import edu.stanford.nlp.process.PTBTokenizer;
import edu.stanford.nlp.process.Tokenizer;
import edu.stanford.nlp.process.TokenizerFactory;
import edu.stanford.nlp.trees.GrammaticalRelation;
import edu.stanford.nlp.trees.GrammaticalStructure;
import edu.stanford.nlp.trees.GrammaticalStructureFactory;
import edu.stanford.nlp.trees.Tree;
import edu.stanford.nlp.trees.TreebankLanguagePack;
import edu.stanford.nlp.trees.TypedDependency;

public class shortestDependentTree{

    public static void main(String args[]){
         String parserModel = "/home/ubuntu/nlpTools/stanford-parser-full-2016-10-31/stanford-parser-3.7.0-models/edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz";
         LexicalizedParser lp = LexicalizedParser.loadModel(parserModel);
         String sent = "In mature human B cells, BMP-6 inhibited cell growth, and rapidly induced phosphorylation of Smad1/5/8 followed by an upregulation of Id1.";
         shortestDependentTree sdt = new shortestDependentTree();
         Collection<TypedDependency> parse_result = sdt.demoAPI(lp, sent);
         // Given two nodes, find the shortest path between them
         Boolean is_first = true;
         //sdt.getShortestDependentPath("In", 1, "Smad1/5/8", 17, parse_result, is_first);       
         //sdt.getShortestDependentPath("cells", 5, "growth", 10, parse_result, is_first);
         sdt.getShortestDependentPath("BMP-6", 7, "inhibited", 8, parse_result, is_first);
    }

    public Collection<TypedDependency> demoAPI(LexicalizedParser lp, String sent) {
        // This option shows loading and using an explicit tokenizer
        //String sent2 = "This is another sentence.";
        TokenizerFactory<CoreLabel> tokenizerFactory = PTBTokenizer.factory(new CoreLabelTokenFactory(), "");
        Tokenizer<CoreLabel> tok = tokenizerFactory.getTokenizer(new StringReader(sent));
        List<CoreLabel> rawWords2 = tok.tokenize();
        Tree parse = lp.apply(rawWords2);

        TreebankLanguagePack tlp = lp.treebankLanguagePack(); // PennTreebankLanguagePack for English
        GrammaticalStructureFactory gsf = tlp.grammaticalStructureFactory();
        GrammaticalStructure gs = gsf.newGrammaticalStructure(parse);
        //List<TypedDependency> tdl = 