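When parsing with the Stanford Parser (SD), the default is to read the input from a file and write the result to the output stream, as follows:
Enter the following at the cmd command prompt: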
java -mx150m -cp "*;" edu.stanford.nlp.parser.lexparser.LexicalizedParser -outputFormat "penn,typedDependencies" edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz input.txt
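SD can also read its input from the standard input stream instead of a file, by passing - in place of the file name; the output can still be redirected to a file if needed.
Enter the following at the cmd command prompt: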
java -mx150m -cp "*;" edu.stanford.nlp.parser.lexparser.LexicalizedParser -outputFormat "penn,typedDependencies" edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz -
Once the parser model has been loaded, the program waits for the sentences to analyze, as shown below:
G:\Bioinformatics\TextMining\stanford-parser>java -mx150m -cp "*;" edu.stanford.nlp.parser.lexparser.LexicalizedParser -outputFormat "penn,typedDependencies" edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz -
Loading parser from serialized file edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz ... done [1.1 sec].
Parsing file: -
The first matrix was generated from pseudo-random draws from a Gaussian distribution.
The second matrix was generated to precisely match the conditions that NMF models.
Parsing [sent. 1 len. 13]: The first matrix was generated from pseudo-random draws from a Gaussian distribution .
(ROOT
  (S
    (NP (DT The) (JJ first) (NN matrix))
    (VP (VBD was)
      (VP (VBN generated)
        (PP (IN from)
          (NP (JJ pseudo-random) (NNS draws)))
        (PP (IN from)
          (NP (DT a) (NNP Gaussian) (NN distribution)))))
    (. .)))
det(matrix-3, The-1)
amod(matrix-3, first-2)
nsubjpass(generated-5, matrix-3)
auxpass(generated-5, was-4)
root(ROOT-0, generated-5)
amod(draws-8, pseudo-random-7)
prep_from(generated-5, draws-8)
det(distribution-12, a-10)
nn(distribution-12, Gaussian-11)
prep_from(generated-5, distribution-12)
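The typed dependency lines above follow a regular relation(governor-index, dependent-index) pattern, so they can be picked apart with a regular expression. The following is a minimal sketch in Python, not part of the Stanford Parser itself; the names Dependency and parse_dependency are made up for illustration, and the pattern assumes the plain-text format shown in this post.

```python
import re
from typing import NamedTuple, Optional


class Dependency(NamedTuple):
    """One typed dependency line, e.g. det(matrix-3, The-1)."""
    relation: str
    governor: str
    governor_index: int
    dependent: str
    dependent_index: int


# relation(governor-N, dependent-M); words may themselves contain hyphens,
# so the index is taken to be the digits after the LAST hyphen of each word.
# Optional trailing apostrophes after an index are tolerated (assumption).
_DEP_LINE = re.compile(r"^([\w:]+)\((.+)-(\d+)'*, (.+)-(\d+)'*\)$")


def parse_dependency(line: str) -> Optional[Dependency]:
    """Return a Dependency for a typed-dependency line, or None otherwise."""
    match = _DEP_LINE.match(line.strip())
    if match is None:
        return None  # e.g. a line of the bracketed tree or a blank line
    relation, governor, gov_idx, dependent, dep_idx = match.groups()
    return Dependency(relation, governor, int(gov_idx), dependent, int(dep_idx))
```

Lines that do not look like a dependency, such as the bracketed tree lines, simply yield None, so the function can be applied to every line of the parser output.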
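Attentive readers will notice that the last sentence has not been processed yet. According to the FAQ, the last sentence is only parsed once the input stream is closed, or when the -sentences newline option is used (see the previous post for details on -sentences newline).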
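Closing the input stream can also be done from a script rather than by hand. The sketch below, in Python, reuses the command line from this post and hands the text to the parser through subprocess.run; its input= argument writes the text to the child's standard input and then closes it, which is the condition the FAQ describes. The function names build_command and parse_text, and the parser_dir parameter, are assumptions made for illustration, and the snippet requires Python 3.7 or later and a Java runtime on the PATH.

```python
import os
import subprocess
from typing import List

PARSER_CLASS = "edu.stanford.nlp.parser.lexparser.LexicalizedParser"
MODEL_PATH = "edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz"


def build_command(output_format: str = "penn,typedDependencies") -> List[str]:
    """Build the same java command used in this post, reading from stdin ("-")."""
    # "*;" on Windows, "*:" elsewhere: all jars in the working directory.
    classpath = "*" + os.pathsep
    return [
        "java", "-mx150m", "-cp", classpath, PARSER_CLASS,
        "-outputFormat", output_format, MODEL_PATH, "-",
    ]


def parse_text(text: str, parser_dir: str) -> str:
    """Send text to the parser on stdin and return what it prints on stdout.

    parser_dir is the directory holding the Stanford Parser jars
    (hypothetical parameter; adjust it to your own installation).
    """
    completed = subprocess.run(
        build_command(),
        input=text,           # written to stdin, which is then closed
        capture_output=True,  # collect stdout and stderr
        text=True,            # work with str instead of bytes
        cwd=parser_dir,
        check=True,           # raise CalledProcessError on a non-zero exit
    )
    return completed.stdout
```

Calling parse_text with the two example sentences and the stanford-parser directory should return the output for both sentences in one go, because the stream is closed as soon as the text has been written.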
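Here we close the input stream (in a Windows cmd window this is typically done by pressing Ctrl+Z and then Enter), and the result is as follows: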
-
Parsing [sent. 2 len. 14]: The second matrix was generated to precisely match the conditions that NMF models .
(ROOT
  (S
    (NP (DT The) (JJ second) (NN matrix))
    (VP (VBD was)
      (VP (VBN generated)
        (S
          (VP (TO to)
            (VP
              (ADVP (RB precisely))
              (VB match)
              (NP (DT the) (NNS conditions))
              (NP (DT that) (NNP NMF) (NNS models)))))))
    (. .)))
det(matrix-3, The-1)
amod(matrix-3, second-2)
nsubjpass(generated-5, matrix-3)
xsubj(match-8, matrix-3)
auxpass(generated-5, was-4)
root(ROOT-0, generated-5)
aux(match-8, to-6)
advmod(match-8, precisely-7)
xcomp(generated-5, match-8)
det(conditions-10, the-9)
iobj(match-8, conditions-10)
det(models-13, that-11)
nn(models-13, NMF-12)
dobj(match-8, models-13)
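The bracketed (penn) trees in the output can likewise be read back into a nested structure for further processing. What follows is a small sketch in Python under the assumption that the tree text is well-formed and balanced, as in the examples above; read_tree and leaves are illustrative names, not functions of the Stanford Parser.

```python
import re
from typing import List, Union

# A tree is either a bare token or a list: [label, child, child, ...].
Tree = Union[str, List["Tree"]]

_TOKEN = re.compile(r"\(|\)|[^\s()]+")


def read_tree(text: str) -> Tree:
    """Read one bracketed tree such as "(NP (DT The) (NN matrix))"."""
    tokens = _TOKEN.findall(text)
    position = 0

    def read_node() -> Tree:
        nonlocal position
        token = tokens[position]
        position += 1
        if token != "(":
            return token  # a label or a word
        node: List[Tree] = []
        while tokens[position] != ")":
            node.append(read_node())
        position += 1  # skip the closing ")"
        return node

    tree = read_node()
    if position != len(tokens):
        raise ValueError("unexpected text after the end of the tree")
    return tree


def leaves(tree: Tree) -> List[str]:
    """Collect the words of a tree from left to right."""
    if isinstance(tree, str):
        return []
    # A preterminal looks like [tag, word], e.g. ["DT", "The"].
    if len(tree) == 2 and all(isinstance(part, str) for part in tree):
        return [tree[1]]
    words: List[str] = []
    for child in tree[1:]:  # tree[0] is the node label
        words.extend(leaves(child))
    return words
```

Because the reader splits on whitespace and parentheses only, the tree may span several lines, exactly as the parser prints it.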