Reference: http://www.it165.net/pro/html/201611/76661.html
Environment: Ubuntu 16.04, with PyCharm as the IDE.
1. Part-of-speech tagging of English sentences
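Steps: download stanford-postagger-2015-12-09.zip from the Stanford NLP website and extract it, which gives the folder stanford-postagger-2015-12-09. Create the folder nlpTools/stanfordNLTK under the home directory, then copy the models folder and stanford-postagger-3.6.0.jar from stanford-postagger-2015-12-09 into stanfordNLTK. Then run the following code: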
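from nltk.tag import StanfordPOSTagger

# Paths follow the folder layout from the steps above (home directory of user "ubuntu").
# Folder and file names are case-sensitive on Linux, so they must match exactly.
model_filename = '/home/ubuntu/nlpTools/stanfordNLTK/models/english-bidirectional-distsim.tagger'
path_to_jar = '/home/ubuntu/nlpTools/stanfordNLTK/stanford-postagger-3.6.0.jar'


def getPos(sent):
    # Build the tagger (it runs the Stanford jar through Java) and tag the whitespace-split tokens
    eng_tagger = StanfordPOSTagger(model_filename, path_to_jar)
    print(eng_tagger.tag(sent.split()))


if __name__ == '__main__':
    getPos('I am a student.')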
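A wrong path is an easy mistake in this setup, so it can help to check that the model and jar are where the code expects them before running the tagger. The following is only a minimal sketch: the helper name missing_tagger_files is made up for illustration and is not part of NLTK, and the folder layout is the one assumed in the steps above.

```python
import os


def missing_tagger_files(base_dir, model_name, jar_name):
    """Return the expected tagger files that are absent under base_dir.

    Hypothetical helper for illustration; not part of NLTK.
    """
    expected = [
        os.path.join(base_dir, 'models', model_name),
        os.path.join(base_dir, jar_name),
    ]
    return [path for path in expected if not os.path.isfile(path)]


if __name__ == '__main__':
    # Assumed layout from the steps above; adjust to your own home directory.
    base = os.path.expanduser('~/nlpTools/stanfordNLTK')
    missing = missing_tagger_files(
        base,
        'english-bidirectional-distsim.tagger',
        'stanford-postagger-3.6.0.jar',
    )
    for path in missing:
        print('missing:', path)
```

If nothing is printed, both files were found at the assumed locations.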
2. Dependency parsing of English sentences (the parse results are full of errors, and I still don't quite understand