# Stanford's named entity recognition toolkit (via NLTK)
from nltk.tag import StanfordNERTagger

sn = StanfordNERTagger('F://VEN/stanford-ner-2020-11-17/classifiers/english.muc.7class.distsim.crf.ser.gz',
                       path_to_jar='F://VEN/stanford-ner-2020-11-17/stanford-ner.jar')
# Run named entity recognition, one sentence per call
ne_annotated_sentences = [sn.tag(sent) for sent in tokenized_sentences]
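Calling the Stanford toolkit once per sentence starts a new JVM on every call, which is far too slow.
Improvement: collect all the sentences in a list and pass that list to sn.tag_sents() in a single call.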
ne_annotated_sentences = sn.tag_sents(tokenized_sentences)
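
To make the cost difference concrete, here is a minimal sketch that runs without Java. `CountingTagger` is a hypothetical stand-in, not part of NLTK: it only counts how many times an external process would be launched, under the assumption described above that every `tag` call launches once while `tag_sents` launches once for the whole batch. The example sentences and the dummy "O" tags are illustrative only.

```python
class CountingTagger:
    """Hypothetical stand-in for StanfordNERTagger; counts simulated JVM launches."""

    def __init__(self):
        self.launches = 0

    def tag_sents(self, sentences):
        # One simulated JVM launch per call, regardless of batch size.
        self.launches += 1
        # Every token gets the dummy tag "O"; a real tagger would predict labels.
        return [[(token, "O") for token in sent] for sent in sentences]

    def tag(self, tokens):
        # Tagging a single sentence is treated as a batch of one.
        return self.tag_sents([tokens])[0]


tokenized_sentences = [
    ["Stanford", "is", "in", "California"],
    ["NLTK", "wraps", "the", "tagger"],
]

# Per-sentence calls: one launch for every sentence.
per_sentence = CountingTagger()
slow_result = [per_sentence.tag(sent) for sent in tokenized_sentences]

# Batched call: a single launch for the whole list.
batched = CountingTagger()
fast_result = batched.tag_sents(tokenized_sentences)

print(per_sentence.launches, batched.launches)
```

Both approaches return the same structure (a list of sentences, each a list of `(token, tag)` pairs), so switching to the batched call should not require changes to downstream code.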