1. Project repository and detailed installation instructions
https://github.com/lancopku/pkuseg-python
2. Installation and import
pip install spacy_pkuseg
import spacy_pkuseg as pkuseg
pku_news = pkuseg.pkuseg(model_name='news', user_dict='mydic.txt',
                         postag=False)
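The call above expects a user dictionary file named mydic.txt. A minimal sketch of preparing such a file follows, assuming the one-word-per-line UTF-8 format described in the pkuseg README. The helper names write_user_dict and segment_with_user_dict are hypothetical and are not part of spacy_pkuseg.

```python
from pathlib import Path


def write_user_dict(words, path="mydic.txt"):
    """Write a user dictionary: one word per line, UTF-8 encoded.

    Assumption: this matches the plain-text format pkuseg expects.
    Blank entries are skipped.
    """
    cleaned = [w.strip() for w in words if w.strip()]
    Path(path).write_text("\n".join(cleaned) + "\n", encoding="utf-8")
    return path


def segment_with_user_dict(text, dict_path="mydic.txt"):
    """Load the 'news' model with the user dictionary and segment text.

    The import is done lazily so that write_user_dict can be used
    even when spacy_pkuseg is not installed.
    """
    import spacy_pkuseg as pkuseg

    seg = pkuseg.pkuseg(model_name="news", user_dict=dict_path, postag=False)
    return seg.cut(text)
```

For example, write_user_dict(["北京大学", "分词"]) creates mydic.txt in the current directory; segment_with_user_dict then loads the model with that dictionary. The first load of the 'news' model downloads it from GitHub.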
3. Common issues
3.1. Load timeout
(1) Problem
Loading the model fails with: ReadTimeout: HTTPSConnectionPool(host='github.com', port=443): Read timed out. (read timeout=5)
(2) Cause and solution
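Access to GitHub from networks in mainland China is too slow, so the model download exceeds the 5-second read timeout shown in the error. The fix is to raise the timeout. Following the error message, open lib\site-packages\spacy_pkuseg\download.py under the Python installation path and, in
def _download_url_to_file(url, dst, hash_prefix, progress):
    if requests_available:
        u = urlopen(url, stream=True, timeout=5)
change the argument timeout=5 to timeout=50.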
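The manual edit can also be scripted. The following is only a sketch, assuming download.py still contains the literal timeout=5; the helper names bump_timeout and patch_download_py are hypothetical. It saves a backup copy next to the original file before writing. Reinstalling or upgrading the package will overwrite the change.

```python
import re
from importlib.util import find_spec
from pathlib import Path


def bump_timeout(source, old=5, new=50):
    """Return source with every `timeout=<old>` replaced by `timeout=<new>`."""
    # \b keeps values such as timeout=50 and names such as read_timeout intact.
    pattern = re.compile(r"\btimeout\s*=\s*%d\b" % old)
    return pattern.sub("timeout=%d" % new, source)


def patch_download_py(new=50):
    """Locate the installed spacy_pkuseg/download.py and raise its timeout."""
    spec = find_spec("spacy_pkuseg")
    if spec is None or spec.origin is None:
        raise RuntimeError("spacy_pkuseg is not installed in this environment")
    target = Path(spec.origin).parent / "download.py"
    original = target.read_text(encoding="utf-8")
    patched = bump_timeout(original, new=new)
    if patched != original:
        # Keep a backup so the change can be reverted by hand.
        backup = target.with_name(target.name + ".bak")
        backup.write_text(original, encoding="utf-8")
        target.write_text(patched, encoding="utf-8")
    return target
```

Calling patch_download_py() once in the environment where spacy_pkuseg is installed applies the change; bump_timeout can be tried on a string first to check what would be replaced.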
(3) Related posts
https://blog.csdn.net/weixin_44792660/article/details/128742902