-
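Environment:
OS: CentOS 7.x
Python version: 3.9.0
This guide installs the full version, hanlp[full]. The slim version ran into quite a few problems that I could not find a solution for.
Official installation guide: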
https://hanlp.hankcs.com/install.html#_2-x-%E6%9C%AC%E5%9C%B0%E7%89%88
Documentation:
https://hanlp.hankcs.com/docs/
-
For installing Python itself, see:
https://blog.csdn.net/liuxiaoming1109/article/details/128814108?spm=1001.2014.3001.5501
-
Installation
pip install hanlp[full]
Error 1:
Collecting hanlp
  Using cached hanlp-2.1.0b45-py3-none-any.whl (647 kB)
Collecting hanlp-common>=0.0.19
  Using cached hanlp_common-0.0.19.tar.gz (28 kB)
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error

  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [1 lines of output]
      ERROR: Can not execute `setup.py` since setuptools is not available in the build environment.
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
In short: setup.py cannot be executed because setuptools is not available in the build environment.
Solution:
pip install --upgrade setuptools
Error 2:
ERROR: Exception:
Traceback (most recent call last):
  File "/usr/local/python3/lib/python3.9/site-packages/pip/_vendor/urllib3/response.py", line 437, in _error_catcher
    yield
  File "/usr/local/python3/lib/python3.9/site-packages/pip/_vendor/urllib3/response.py", line 560, in read
    data = self._fp_read(amt) if not fp_closed else b""
  File "/usr/local/python3/lib/python3.9/site-packages/pip/_vendor/urllib3/response.py", line 526, in _fp_read
    return self._fp.read(amt) if amt is not None else self._fp.read()
  File "/usr/local/python3/lib/python3.9/site-packages/pip/_vendor/cachecontrol/filewrapper.py", line 90, in read
    data = self.__fp.read(amt)
  File "/usr/local/python3/lib/python3.9/http/client.py", line 458, in read
    n = self.readinto(b)
  File "/usr/local/python3/lib/python3.9/http/client.py", line 502, in readinto
    n = self.fp.readinto(b)
  File "/usr/local/python3/lib/python3.9/socket.py", line 704, in readinto
    return self._sock.recv_into(b)
  File "/usr/local/python3/lib/python3.9/ssl.py", line 1241, in recv_into
    return self.read(nbytes, buffer)
  File "/usr/local/python3/lib/python3.9/ssl.py", line 1099, in read
    return self._sslobj.read(len, buffer)
socket.timeout: The read operation timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
Solution:
This is a network timeout while pip downloads the packages.
Install through a mirror instead:
pip install -i https://pypi.douban.com/simple/ hanlp[full]
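The mirror install can also be driven from Python, which makes it easy to pass a longer timeout together with the mirror. The following is only a sketch and not part of the original steps: the helper names build_pip_command and install, and the 120-second value in the usage note, are assumptions made for illustration. The options -i and --default-timeout are standard pip options.

```python
import subprocess
import sys


def build_pip_command(package, index_url=None, timeout=None):
    """Build a `python -m pip install ...` command as an argument list."""
    cmd = [sys.executable, "-m", "pip", "install"]
    if index_url is not None:
        cmd += ["-i", index_url]                    # use a mirror instead of the default index
    if timeout is not None:
        cmd += ["--default-timeout", str(timeout)]  # socket timeout in seconds
    cmd.append(package)
    return cmd


def install(package, index_url=None, timeout=None):
    """Run pip in a subprocess; raises CalledProcessError if pip fails."""
    subprocess.run(build_pip_command(package, index_url, timeout), check=True)
```

Calling install("hanlp[full]", index_url="https://pypi.douban.com/simple/", timeout=120) would then run the same command as above with a longer timeout. Using sys.executable ensures the packages go into the interpreter that is running the script.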
-
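Change the default environment variables
HANLP_HOME sets the directory where HanLP stores the data it downloads:
cd /home
mkdir hanlp
export HANLP_HOME=/home/hanlp
If resources download slowly, you can set a download mirror: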
export HANLP_URL=https://od.hankcs.com/hanlp/data/
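If editing the shell environment is inconvenient, the same variables can be set from Python. This is a minimal sketch under the assumption that HanLP reads HANLP_HOME and HANLP_URL from the process environment, so it should run before `import hanlp`. The helper name configure_hanlp_env is hypothetical.

```python
import os


def configure_hanlp_env(home, url=None):
    """Create the data directory and set the HanLP variables for this process."""
    os.makedirs(home, exist_ok=True)   # like mkdir, but no error if it already exists
    os.environ["HANLP_HOME"] = home
    if url is not None:
        os.environ["HANLP_URL"] = url  # optional download mirror
    return home
```

For the setup above this would be configure_hanlp_env("/home/hanlp", "https://od.hankcs.com/hanlp/data/"). Unlike an export placed in a shell profile, this only affects the current Python process and its children.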
-
Test the installation
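Before loading any model, it can be worth confirming that the packages are visible to the interpreter at all. The sketch below uses only the standard library (importlib.metadata, available since Python 3.8); the helper name installed_version is an assumption for illustration.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report on the two packages this guide depends on.
for name in ("hanlp", "setuptools"):
    print(name, installed_version(name))
```

A None next to setuptools would point back to Error 1, and a None next to hanlp means the pip install did not complete in this interpreter.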
Tokenization demo
import hanlp
# Tokenization
tokenizer = hanlp.load('PKU_NAME_MERGED_SIX_MONTHS_CONVSEG')
print(tokenizer('HanLP是面向生产环境的自然语言处理工具包。'))
Result
['HanLP', '是', '面向', '生产', '环境', '的', '自然', '语言', '处理', '工具包', '。']
Structured output demo
import hanlp
HanLP = hanlp.load(hanlp.pretrained.mtl.CLOSE_TOK_POS_NER_SRL_DEP_SDP_CON_ELECTRA_SMALL_ZH)
word = HanLP([
    'HanLP是面向生产环境的自然语言处理工具包。',
    '晓美焰来到北京立方庭参观自然语义科技公司。',
    '徐先生还具体帮助他确定了把画雄鹰、松鼠和麻雀作为主攻目标。',
    '剑桥分析公司多位高管对卧底记者说,他们确保了唐纳德·特朗普在总统大选中获胜。',
    '萨哈夫说,伊拉克将同联合国销毁伊拉克大规模杀伤性武器特别委员会继续保持合作。',
])
# pretty_print() prints the analysis itself, so call it on the result rather than assigning its return value
word.pretty_print()
Result (output not shown)
Partial output demo: run only selected tasks
import hanlp
HanLP = hanlp.load(hanlp.pretrained.mtl.CLOSE_TOK_POS_NER_SRL_DEP_SDP_CON_ELECTRA_SMALL_ZH)
# Run only the tasks listed in `tasks`
word = HanLP(['HanLP是面向生产环境的自然语言处理工具包。'], tasks=['tok/coarse', 'sdp', 'pos/ctb', 'ner/msra'])
print(word)
Result (output not shown)
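To work with such a result in code rather than just printing it, the per-task fields can be combined. The sketch below assumes the result behaves like a dictionary keyed by task name, with one list per input sentence. The helper tokens_with_tags is hypothetical, and fake_doc is a hand-written placeholder rather than real HanLP output, used only so the example runs without downloading a model.

```python
def tokens_with_tags(doc, tok_key="tok/coarse", pos_key="pos/ctb"):
    """Pair each token with its tag, sentence by sentence."""
    return [list(zip(toks, tags)) for toks, tags in zip(doc[tok_key], doc[pos_key])]


# Hand-written placeholder, NOT real HanLP output: one sentence, four tokens.
fake_doc = {
    "tok/coarse": [["HanLP", "是", "工具包", "。"]],
    "pos/ctb": [["NR", "VC", "NN", "PU"]],
}
print(tokens_with_tags(fake_doc))
```

With a real result, passing `word` in place of fake_doc should work the same way as long as both tasks were requested in `tasks`.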
Installing HanLP locally
First published on 2023-02-06 14:29:43