To fix an error like the one below: the cause is a missing Java class argument to StanfordSegmenter, an undocumented change in the newer NLTK release.
from nltk.tokenize import StanfordSegmenter

segmenter = StanfordSegmenter(
    # java_class must now be passed explicitly; for the Chinese CRF models it is:
    java_class="edu.stanford.nlp.ie.crf.CRFClassifier",
    path_to_sihan_corpora_dict="E:/NLP/NLP_code/Installation/base/stanford-segmenter-2017-06-09/data",
    path_to_model="E:/NLP/NLP_code/Installation/base/stanford-segmenter-2017-06-09/data/pku.gz",
    path_to_dict="E:/NLP/NLP_code/Installation/base/stanford-segmenter-2017-06-09/data/dict-chris6.ser.gz")
res = segmenter.segment(u"北海已成为中国对外开放中升起的一颗明星")
print(res)

Console output:

C:\Users\lybroman\AppData\Local\Programs\Python\Python36-32\python.exe D:/programming/leetcode/test.py
D:/programming/leetcode/test.py:3: DeprecationWarning: The StanfordTokenizer will be deprecated in version 3.2.5. Please use nltk.parse.corenlp.CoreNLPTokenizer instead.
  path_to_sihan_corpora_dict="E:/NLP/NLP_code/Installation/base/stanford-segmenter-2017-06-09/data", path_t
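The DeprecationWarning points at the CoreNLP server API as the long-term replacement. A minimal sketch of that route, using only the standard library: it assumes you have started a CoreNLP server with the Chinese models locally (the `http://localhost:9000` URL and the `segment` / `corenlp_props` helper names here are illustrative, not from the original post):

```python
import json
from urllib import parse, request

def corenlp_props(lang="zh"):
    # Request properties asking the CoreNLP server for tokenization
    # (word segmentation) and sentence splitting only, as JSON.
    return {"annotators": "tokenize,ssplit",
            "pipelineLanguage": lang,
            "outputFormat": "json"}

def segment(text, url="http://localhost:9000"):
    # POST raw UTF-8 text to a running CoreNLP server and collect the tokens.
    query = parse.urlencode({"properties": json.dumps(corenlp_props())})
    req = request.Request(url + "/?" + query, data=text.encode("utf-8"))
    with request.urlopen(req) as resp:
        doc = json.loads(resp.read().decode("utf-8"))
    return [tok["word"] for sent in doc["sentences"] for tok in sent["tokens"]]
```

With a server running, `segment(u"北海已成为中国对外开放中升起的一颗明星")` returns the segmented word list; no server means an `URLError`, so make sure the Java process is up first.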