Installing HanLP Locally

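  1. Environment:
    OS: CentOS 7.x
    Python version: 3.9.0

    This guide installs the full version, hanlp[full]. The slim version ran into a number of problems for which I could not find a solution.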

    Official installation guide:

    https://hanlp.hankcs.com/install.html#_2-x-%E6%9C%AC%E5%9C%B0%E7%89%88
    

    Documentation:

    https://hanlp.hankcs.com/docs/
    
  2. For installing Python, see:

    https://blog.csdn.net/liuxiaoming1109/article/details/128814108?spm=1001.2014.3001.5501
    
  3. Installation

    pip install hanlp[full] 
    

    Error 1:

    Collecting hanlp
    Using cached hanlp-2.1.0b45-py3-none-any.whl (647 kB)
    Collecting hanlp-common>=0.0.19
      Using cached hanlp_common-0.0.19.tar.gz (28 kB)
      Preparing metadata (setup.py) ... error
      error: subprocess-exited-with-error
      
      × python setup.py egg_info did not run successfully.
      │ exit code: 1
      ╰─> [1 lines of output]
          ERROR: Can not execute `setup.py` since setuptools is not available in the build environment.
          [end of output]
      
      note: This error originates from a subprocess, and is likely not a problem with pip.
    error: metadata-generation-failed
    
    × Encountered error while generating package metadata.
    ╰─> See above for output.
    
    note: This is an issue with the package mentioned above, not pip.
    

    Cause: `setup.py` cannot be executed because setuptools is not available in the build environment.
    Solution:
    pip install --upgrade setuptools
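
    To confirm the fix took effect, it can help to check whether setuptools is importable by the same interpreter that pip runs under. The sketch below uses only the standard library; `has_module` is a hypothetical helper name introduced here for illustration, not part of pip or HanLP.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if this interpreter can locate a top-level module called `name`."""
    return importlib.util.find_spec(name) is not None


# `has_module` is an illustrative helper, not a pip or HanLP API.
print("setuptools available:", has_module("setuptools"))
```

    If this prints False after the upgrade, pip and the interpreter running the check are probably not the same Python installation.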

    Error 2:

    ERROR: Exception:
    Traceback (most recent call last):
      File "/usr/local/python3/lib/python3.9/site-packages/pip/_vendor/urllib3/response.py", line 437, in _error_catcher
        yield
      File "/usr/local/python3/lib/python3.9/site-packages/pip/_vendor/urllib3/response.py", line 560, in read
        data = self._fp_read(amt) if not fp_closed else b""
      File "/usr/local/python3/lib/python3.9/site-packages/pip/_vendor/urllib3/response.py", line 526, in _fp_read
        return self._fp.read(amt) if amt is not None else self._fp.read()
      File "/usr/local/python3/lib/python3.9/site-packages/pip/_vendor/cachecontrol/filewrapper.py", line 90, in read
        data = self.__fp.read(amt)
      File "/usr/local/python3/lib/python3.9/http/client.py", line 458, in read
        n = self.readinto(b)
      File "/usr/local/python3/lib/python3.9/http/client.py", line 502, in readinto
        n = self.fp.readinto(b)
      File "/usr/local/python3/lib/python3.9/socket.py", line 704, in readinto
        return self._sock.recv_into(b)
      File "/usr/local/python3/lib/python3.9/ssl.py", line 1241, in recv_into
        return self.read(nbytes, buffer)
      File "/usr/local/python3/lib/python3.9/ssl.py", line 1099, in read
        return self._sslobj.read(len, buffer)
    socket.timeout: The read operation timed out
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
    

    Solution:
    This is a network timeout while pip downloads the packages.
    Install from a mirror instead:

    pip install -i https://pypi.douban.com/simple/ hanlp[full]
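
    When the install is scripted, the same command can be assembled in Python. This is a minimal sketch: `build_pip_install` is a hypothetical helper, and the 120-second value passed to pip's `--timeout` option is an arbitrary illustration rather than something the original setup used.

```python
import sys
from typing import List, Optional


def build_pip_install(package: str,
                      index_url: Optional[str] = None,
                      timeout: Optional[int] = None) -> List[str]:
    """Build a `pip install` command as an argument list for subprocess."""
    cmd = [sys.executable, "-m", "pip", "install"]
    if index_url:
        cmd += ["-i", index_url]            # alternative package index (mirror)
    if timeout is not None:
        cmd += ["--timeout", str(timeout)]  # socket timeout in seconds
    cmd.append(package)
    return cmd


cmd = build_pip_install("hanlp[full]",
                        index_url="https://pypi.douban.com/simple/",
                        timeout=120)  # illustrative value, an assumption
print(" ".join(cmd))
# To actually run the install: subprocess.run(cmd, check=True)
```

    Passing the arguments as a list avoids shell quoting issues with the square brackets in `hanlp[full]`.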
    
  4. Override the default environment variables

    cd /home
    mkdir hanlp
    export HANLP_HOME=/home/hanlp
    

    If resource downloads are slow, you can point HanLP at a download mirror:

    export HANLP_URL=https://od.hankcs.com/hanlp/data/
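
    An `export` only lasts for the current shell session. As an alternative, the same variables can be set from Python before HanLP is imported. The sketch below assumes HanLP reads `HANLP_HOME` and `HANLP_URL` from the process environment at import or load time; `configure_hanlp_env` is a hypothetical helper name.

```python
import os
from typing import Optional


def configure_hanlp_env(home: str, url: Optional[str] = None) -> None:
    """Create the data directory and set HANLP_HOME (and optionally HANLP_URL)."""
    os.makedirs(home, exist_ok=True)
    os.environ["HANLP_HOME"] = home
    if url:
        os.environ["HANLP_URL"] = url


# Call this before `import hanlp` (assumption: the variables are read at import/load time).
# configure_hanlp_env("/home/hanlp", "https://od.hankcs.com/hanlp/data/")
```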
    
  5. Test the installation
    Word segmentation demo

    import hanlp
    # Word segmentation
    tokenizer = hanlp.load('PKU_NAME_MERGED_SIX_MONTHS_CONVSEG')
    print(tokenizer('HanLP是面向生产环境的自然语言处理工具包。'))
    

    Output
    ['HanLP', '是', '面向', '生产', '环境', '的', '自然', '语言', '处理', '工具包', '。']
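
    If the demo fails before any model is loaded, it can be useful to confirm which HanLP version pip actually installed. A minimal sketch using the standard library's `importlib.metadata`; `installed_version` is a hypothetical helper name.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


print("hanlp:", installed_version("hanlp"))
```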

    Structured output demo

    import hanlp
    HanLP = hanlp.load(hanlp.pretrained.mtl.CLOSE_TOK_POS_NER_SRL_DEP_SDP_CON_ELECTRA_SMALL_ZH)
    # pretty_print() prints the analysis itself, so its return value is not kept
    HanLP(['HanLP是面向生产环境的自然语言处理工具包。', '晓美焰来到北京立方庭参观自然语义科技公司。',
           '徐先生还具体帮助他确定了把画雄鹰、松鼠和麻雀作为主攻目标。',
           '剑桥分析公司多位高管对卧底记者说,他们确保了唐纳德·特朗普在总统大选中获胜。',
           '萨哈夫说,伊拉克将同联合国销毁伊拉克大规模杀伤性武器特别委员会继续保持合作。']).pretty_print()
    

    Output
    [Two screenshots of the pretty-printed output; images not available]

    Selected-task output demo

    import hanlp
    HanLP = hanlp.load(hanlp.pretrained.mtl.CLOSE_TOK_POS_NER_SRL_DEP_SDP_CON_ELECTRA_SMALL_ZH)
    word=HanLP(['HanLP是面向生产环境的自然语言处理工具包。'], tasks=['tok/coarse', 'sdp', 'pos/ctb', 'ner/msra'])
    print(word)
    

    Output
    [Screenshot of the printed result; image not available]
