Shortest Subtree Between Two Nodes in a Dependency Tree (Java)

Article link: https://blog.csdn.net/appleml/article/details/62419302

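This post uses StanfordParser to parse a sentence into a dependency tree and then finds the shortest path between two given nodes. baseDependencies yields a tree structure in which every node has a unique parent; collapsedDependencies yields a graph structure in which a node may have more than one parent.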
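The tree-versus-graph distinction can be checked mechanically. The following is a minimal sketch, not part of the original post: it assumes each dependency is reduced to a hypothetical `{governor, dependent}` pair of token indices, and the two edge lists are made-up toy data rather than real parser output.

```java
import java.util.HashMap;
import java.util.Map;

class ParentCountSketch {

    // Returns true when no dependent is attached to two different governors,
    // which is the property the text attributes to the tree-shaped output.
    static boolean everyNodeHasSingleParent(int[][] edges) {
        Map<Integer, Integer> parentOf = new HashMap<>();
        for (int[] e : edges) {
            // e[0] = governor index, e[1] = dependent index (hypothetical encoding)
            Integer previous = parentOf.putIfAbsent(e[1], e[0]);
            if (previous != null && previous != e[0]) {
                return false; // this dependent already has a different parent
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int[][] treeLike = {{2, 1}, {2, 3}};           // toy data
        int[][] graphLike = {{2, 1}, {2, 3}, {4, 3}};  // node 3 has two parents
        System.out.println(everyNodeHasSingleParent(treeLike));
        System.out.println(everyNodeHasSingleParent(graphLike));
    }
}
```

When the check fails, a path search has to treat the structure as a general graph rather than walking up to a common ancestor.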


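Premise: given a sentence, StanfordParser parses it into a dependency tree; for two given nodes, find the shortest dependency subtree that connects them.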
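Before the full listing, the core idea can be sketched independently of the parser. This is one possible approach and not necessarily the one used in the post's own `getShortestDependentPath`: treat every dependency as an undirected edge and run a breadth-first search from one node to the other. The `{governor, dependent}` index pairs and the edge list are hypothetical toy data.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ShortestPathSketch {

    // Breadth-first search over the dependencies viewed as undirected edges.
    // Returns the node indices from start to goal, or an empty list if unreachable.
    static List<Integer> shortestPath(int[][] edges, int start, int goal) {
        Map<Integer, List<Integer>> neighbors = new HashMap<>();
        for (int[] e : edges) {
            neighbors.computeIfAbsent(e[0], k -> new ArrayList<>()).add(e[1]);
            neighbors.computeIfAbsent(e[1], k -> new ArrayList<>()).add(e[0]);
        }

        Map<Integer, Integer> cameFrom = new HashMap<>(); // node -> predecessor on the path
        Deque<Integer> queue = new ArrayDeque<>();
        cameFrom.put(start, start);
        queue.add(start);
        while (!queue.isEmpty()) {
            int current = queue.poll();
            if (current == goal) {
                break;
            }
            for (int next : neighbors.getOrDefault(current, Collections.emptyList())) {
                if (!cameFrom.containsKey(next)) {
                    cameFrom.put(next, current);
                    queue.add(next);
                }
            }
        }
        if (!cameFrom.containsKey(goal)) {
            return Collections.emptyList();
        }

        // Walk the predecessor links back from goal to start, then reverse.
        List<Integer> path = new ArrayList<>();
        for (int node = goal; node != start; node = cameFrom.get(node)) {
            path.add(node);
        }
        path.add(start);
        Collections.reverse(path);
        return path;
    }

    public static void main(String[] args) {
        int[][] edges = {{3, 1}, {3, 2}, {3, 5}, {5, 4}}; // toy data, not parser output
        System.out.println(shortestPath(edges, 1, 4)); // prints [1, 3, 5, 4]
    }
}
```

Because the search ignores edge direction, the same sketch works whether the input is tree-shaped or a graph in which some node has several parents.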

package StanfordParser;

import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.ling.IndexedWord;
import edu.stanford.nlp.parser.lexparser.LexicalizedParser;
import edu.stanford.nlp.process.CoreLabelTokenFactory;
import edu.stanford.nlp.process.PTBTokenizer;
import edu.stanford.nlp.process.Tokenizer;
import edu.stanford.nlp.process.TokenizerFactory;
import edu.stanford.nlp.trees.GrammaticalRelation;
import edu.stanford.nlp.trees.GrammaticalStructure;
import edu.stanford.nlp.trees.GrammaticalStructureFactory;
import edu.stanford.nlp.trees.Tree;
import edu.stanford.nlp.trees.TreebankLanguagePack;
import edu.stanford.nlp.trees.TypedDependency;

public class shortestDependentTree{

    public static void main(String[] args) {
         String parserModel = "/home/ubuntu/nlpTools/stanford-parser-full-2016-10-31/stanford-parser-3.7.0-models/edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz";
         LexicalizedParser lp = LexicalizedParser.loadModel(parserModel);
         String sent = "In mature human B cells, BMP-6 inhibited cell growth, and rapidly induced phosphorylation of Smad1/5/8 followed by an upregulation of Id1.";
         shortestDependentTree sdt = new shortestDependentTree();
         Collection<TypedDependency> parse_result = sdt.demoAPI(lp, sent);
         // Given two nodes, find the shortest path between them
         Boolean is_first = true;
         //sdt.getShortestDependentPath("In", 1, "Smad1/5/8", 17, parse_result, is_first);       
         //sdt.getShortestDependentPath("cells", 5, "growth", 10, parse_result, is_first);
         sdt.getShortestDependentPath("BMP-6", 7, "inhibited", 8, parse_result, is_first);
    }

    public Collection<TypedDependency> demoAPI(LexicalizedParser lp, String sent) {
        // This option shows loading and using an explicit tokenizer
        //String sent2 = "This is another sentence.";
        TokenizerFactory<CoreLabel> tokenizerFactory = PTBTokenizer.factory(new CoreLabelTokenFactory(), "");
        Tokenizer<CoreLabel> tok = tokenizerFactory.getTokenizer(new StringReader(sent));
        List<CoreLabel> rawWords2 = tok.tokenize();
        Tree parse = lp.apply(rawWords2);

        TreebankLanguagePack tlp = lp.treebankLanguagePack(); // PennTreebankLanguagePack for English
        GrammaticalStructureFactory gsf = tlp.grammaticalStructureFactory();
        GrammaticalStructure gs = gsf.newGrammaticalStructure(parse);
        //List<TypedDependency> tdl =