Running the named entity recognition example from Chapter 6 of 《python 数据挖掘 概念、方法与实践》; the Python 3 code is as follows:
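import nltk
import pprint

filename = 'lkmlEmailsReduced.txt'
with open(filename, 'r', encoding='utf8') as sampleFile:
    text = sampleFile.read()

en = {}
try:
    # Load the pre-trained Punkt sentence tokenizer and split the text into sentences
    sent_detector = nltk.data.load('tokenizers/punkt/english.pickle')
    sentences = sent_detector.tokenize(text.strip())
except LookupError as err:
    # Assumed minimal handler so the excerpt parses; nltk.data.load raises
    # LookupError when the requested resource is not installed
    print(err)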
The following error occurs:
Resource 'tokenizers/punkt/english.pickle' not found. Please
use the NLTK Downloader to obtain the resource: >>>
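The message indicates that NLTK could not find the Punkt tokenizer data in any of its search directories. A minimal sketch of checking for a resource and downloading only the missing package might look like the following. The helper name `ensure_nltk_resource` is hypothetical, and the package name 'punkt' is an assumption inferred from the resource path in the error; depending on the installed NLTK version, a different package name may be required.

```python
def ensure_nltk_resource(resource_path, package_name, find=None, download=None):
    """Download an NLTK data package only if its resource is missing.

    Returns True if the resource was already installed, False if a download
    was triggered. `find` and `download` default to nltk.data.find and
    nltk.download; they are parameters only so stand-ins can be passed in.
    """
    if find is None or download is None:
        import nltk  # imported lazily so stand-in callables work without NLTK
        find = find or nltk.data.find
        download = download or nltk.download
    try:
        find(resource_path)      # raises LookupError if the resource is absent
        return True
    except LookupError:
        download(package_name)   # fetches just this package, no GUI needed
        return False
```

With this sketch, calling `ensure_nltk_resource('tokenizers/punkt', 'punkt')` once before `nltk.data.load(...)` would be the intended use. It avoids downloading every NLTK package when only the tokenizer data is missing.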
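Cause 1:
The following step was not performed:
1. Run the NLTK downloader:
import nltk
nltk.download()
- Once this runs successfully, the NLTK Downloader window opens; click "all", then change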