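Problem
When using NLTK, the stopwords resource cannot be found.
Solution
Find a publicly available copy of the resource online, download the file, and unzip it into one of the directories NLTK searches.
- Check the search paths: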
from nltk import data
print(data.path)
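Not every directory in that list necessarily exists on disk. A minimal sketch for checking which ones do, using only the standard library; the helper name split_by_existence is hypothetical and not part of NLTK:

```python
from pathlib import Path


def split_by_existence(paths):
    """Split candidate directories into (existing, missing) lists."""
    existing, missing = [], []
    for p in paths:
        # is_dir() is False for paths that are absent or are plain files.
        (existing if Path(p).is_dir() else missing).append(str(p))
    return existing, missing
```

Calling split_by_existence(data.path) should show which of the search directories still need to be created.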
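Unzip the downloaded file into one of the directories printed by data.path, creating any folders that do not exist:
- None of those directories existed on my machine when I unzipped the file, so I located the /Users/mac/opt/anaconda3 folder;
- create an nltk_data folder in that directory;
- then create a corpora folder inside nltk_data and move the stopwords folder into it.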
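The manual steps above can also be scripted. This is only a sketch under assumptions: the function name install_corpus_zip is made up for illustration, and the downloaded archive is assumed to contain a top-level stopwords/ folder.

```python
import zipfile
from pathlib import Path


def install_corpus_zip(zip_path, nltk_data_dir):
    """Extract a downloaded corpus archive into <nltk_data_dir>/corpora."""
    corpora_dir = Path(nltk_data_dir) / "corpora"
    # Create nltk_data and corpora in one go if they are missing.
    corpora_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        # Assumption: the archive holds a top-level stopwords/ folder,
        # so this yields <nltk_data_dir>/corpora/stopwords/...
        archive.extractall(corpora_dir)
    return corpora_dir
```

On my machine the call would look like install_corpus_zip("stopwords.zip", "/Users/mac/opt/anaconda3/nltk_data"), where the first argument is wherever the downloaded file was saved.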
Process log
This section records how I worked through the problem.
from nltk.corpus import stopwords
stop = stopwords.words('english')
Error message:
---------------------------------------------------------------------------
LookupError Traceback (most recent call last)
~/opt/anaconda3/lib/python3.7/site-packages/nltk/corpus/util.py in __load(self)
85 try:
---> 86 root = nltk.data.find('{}/{}'.format(self.subdir, zip_name))
87 except LookupError:
~/opt/anaconda3/lib/python3.7/site-packages/nltk/data.py in find(resource_name, paths)
700 resource_not_found = '\n%s\n%s\n%s\n' % (sep, msg, sep)
--> 701 raise LookupError(resource_not_found)
702
LookupError:
**********************************************************************
Resource stopwords not found.
Please use the NLTK Downloader to obtain the resource:
>>> import nltk
>>> nltk.download('stopwords')
For more information see: https://www.nltk.org/data.html
Attempted to load corpora/stopwords.zip/stopwords/
Searched in:
- '/Users/mac/nltk_data'
- '/Users/mac/opt/anaconda3/nltk_data'
- '/Users/mac/opt/anaconda3/share/nltk_data'
- '/Users/mac/opt/anaconda3/lib/nltk_data'
- '/usr/share/nltk_data'
- '/usr/local/share/nltk_data'
- '/usr/lib/nltk_data'
- '/usr/local/lib/nltk_data'
**********************************************************************
During handling of the above exception, another exception occurred:
LookupError Traceback (most recent call last)
<ipython-input-18-a8339b7b1fb5> in <module>
1 from nltk.corpus import stopwords
----> 2 stop = stopwords.words('english')
3 train['stopwords']=train['tweet'].apply(lambda x: len([x for x in x.split() if x in stop]))
4 train[['tweet','stopwords']].head()
~/opt/anaconda3/lib/python3.7/site-packages/nltk/corpus/util.py in __getattr__(self, attr)
121 raise AttributeError("LazyCorpusLoader object has no attribute '__bases__'")
122
--> 123 self.__load()
124 # This looks circular, but its not, since __load() changes our
125 # __class__ to something new:
~/opt/anaconda3/lib/python3.7/site-packages/nltk/corpus/util.py in __load(self)
86 root = nltk.data.find('{}/{}'.format(self.subdir, zip_name))
87 except LookupError:
---> 88 raise e
89
90 # Load the corpus.
~/opt/anaconda3/lib/python3.7/site-packages/nltk/corpus/util.py in __load(self)
81 else:
82 try:
---> 83 root = nltk.data.find('{}/{}'.format(self.subdir, self.__name))
84 except LookupError as e:
85 try:
~/opt/anaconda3/lib/python3.7/site-packages/nltk/data.py in find(resource_name, paths)
699 sep = '*' * 70
700 resource_not_found = '\n%s\n%s\n%s\n' % (sep, msg, sep)
--> 701 raise LookupError(resource_not_found)
702
703
LookupError:
**********************************************************************
Resource stopwords not found.
Please use the NLTK Downloader to obtain the resource:
>>> import nltk
>>> nltk.download('stopwords')
For more information see: https://www.nltk.org/data.html
Attempted to load corpora/stopwords
Searched in:
- '/Users/mac/nltk_data'
- '/Users/mac/opt/anaconda3/nltk_data'
- '/Users/mac/opt/anaconda3/share/nltk_data'
- '/Users/mac/opt/anaconda3/lib/nltk_data'
- '/usr/share/nltk_data'
- '/usr/local/share/nltk_data'
- '/usr/lib/nltk_data'
- '/usr/local/lib/nltk_data'
**********************************************************************
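The traceback shows that NLTK looks for corpora/stopwords (or a corpora/stopwords.zip archive) under each directory in "Searched in". The sketch below is a simplified stand-in for that lookup, not NLTK's actual implementation; the name locate_resource is hypothetical. It can help confirm whether manually placed files are in a location that would be found.

```python
from pathlib import Path


def locate_resource(search_paths, resource="corpora/stopwords"):
    """Return the first matching folder or .zip under search_paths, else None."""
    for base in search_paths:
        folder = Path(base) / resource
        # Sibling archive, e.g. corpora/stopwords.zip next to corpora/stopwords
        archive = folder.with_name(folder.name + ".zip")
        if folder.is_dir():
            return folder
        if archive.is_file():
            return archive
    return None
```

If locate_resource returns None for the directories listed in the error, the stopwords folder is not yet in a place the search would reach.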
Following the error message, I ran the suggested download code, but it still failed:
import nltk
nltk.download('stopwords')
Next I tried downloading directly from the website: http://www.nltk.org/nltk_data/
The page could not be found; the site may be blocked by the firewall.