Installing Elasticsearch Plugins (HanLP Analyzer v7.3.2; Other Plugins to Come)

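Installing the analyzer offline

(As of this writing, the common Chinese analyzers rank as follows by segmentation accuracy, from highest to lowest: HanLP > ansj > jieba > IK > Smart Chinese Analysis)

This guide is based on Elasticsearch 7.3.2.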

Project site: this is the plugin developed so that HanLP can be used with ES
Official download page
1. Download the analyzer
[elasticsearch@test1 download]$ pwd
/home/elasticsearch/download
[elasticsearch@test1 download]$ wget https://github.com/KennFalcon/elasticsearch-analysis-hanlp/releases/download/v7.3.2/elasticsearch-analysis-hanlp-7.3.2.zip
2. Install into ES
2.1 Install the HanLP plugin on a single node
[elasticsearch@test1 elasticsearch-7.3.2]$ ./bin/elasticsearch-plugin install file:///home/elasticsearch/download/elasticsearch-analysis-hanlp-7.3.2.zip
-> Downloading file:///home/elasticsearch/download/elasticsearch-analysis-hanlp-7.3.2.zip
[=================================================] 100%
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@     WARNING: plugin requires additional permissions     @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
* java.io.FilePermission <<ALL FILES>> read,write,delete
* java.lang.RuntimePermission getClassLoader
* java.lang.RuntimePermission setContextClassLoader
* java.net.SocketPermission * connect,resolve
* java.util.PropertyPermission * read,write
See http://docs.oracle.com/javase/8/docs/technotes/guides/security/permissions.html
for descriptions of what these permissions allow and the associated risks.

Continue with installation? [y/N]y
-> Installed analysis-hanlp
[elasticsearch@test1 elasticsearch-7.3.2]$
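
Elasticsearch only installs a plugin whose plugin-descriptor.properties declares the same elasticsearch.version as the running ES release, so it can help to read that value back after the install. The following is a minimal sketch; plugin_es_version is a hypothetical helper name, not something shipped with ES or the plugin.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the elasticsearch.version declared in a
# plugin-descriptor.properties file (path passed as the first argument).
plugin_es_version() {
  local descriptor="$1"
  # Print only the value of the elasticsearch.version line.
  sed -n 's/^elasticsearch\.version=//p' "$descriptor"
}
```

For the layout used in this post it could be called as plugin_es_version /home/elasticsearch/deploy/elasticsearch-7.3.2/plugins/analysis-hanlp/plugin-descriptor.properties, and the printed value compared with the ES version.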

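① Configuration changes
Edit the hanlp.properties file under elasticsearch-7.3.2/config/analysis-hanlp and
set its root property to the directory that contains the plugin's data directory,
that is, the absolute path of elasticsearch-7.3.2/plugins/analysis-hanlp.
(That directory also holds plugin-descriptor.properties, in which elasticsearch.version must equal your ES version, e.g. 7.3.2.)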

# Root path of data (must be an absolute path)
root=/home/elasticsearch/deploy/elasticsearch-7.3.2/plugins/analysis-hanlp/
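
The same edit can be scripted instead of done by hand. This is a sketch under a few assumptions: set_hanlp_root is a hypothetical helper name, GNU sed is available (for sed -i), and the path contains neither "|" nor "&".

```shell
#!/usr/bin/env bash
# Hypothetical helper: set the root property in hanlp.properties.
#   $1 = path to hanlp.properties
#   $2 = absolute path of the plugins/analysis-hanlp directory
set_hanlp_root() {
  local props_file="$1" plugin_dir="$2"
  # Normalise to exactly one trailing slash.
  plugin_dir="${plugin_dir%/}/"
  if grep -q '^root=' "$props_file"; then
    # Rewrite the existing root line in place (GNU sed).
    sed -i "s|^root=.*|root=${plugin_dir}|" "$props_file"
  else
    # No active root line yet: append one.
    printf 'root=%s\n' "$plugin_dir" >> "$props_file"
  fi
}
```

Keeping the value on a line of its own matters because a Java properties file treats everything after "=" as the value, including any "#" that follows it.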

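② Edit the jvm.options file under elasticsearch-7.3.2/config and append the following as its last line (use an absolute path, with no leading space and no trailing comment):

-Djava.security.policy=/home/elasticsearch/deploy/elasticsearch-7.3.2/plugins/analysis-hanlp/plugin-security.policy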
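
To avoid adding the option twice when the step is repeated, the append can be made idempotent. A minimal sketch, where add_jvm_option is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Hypothetical helper: append a JVM option to jvm.options only if that
# exact line is not already present.
#   $1 = path to jvm.options
#   $2 = the full option line
add_jvm_option() {
  local jvm_file="$1" option="$2"
  # -x matches whole lines, -F treats the option as a fixed string,
  # and -- stops grep from parsing the leading "-" as a flag.
  if grep -qxF -- "$option" "$jvm_file"; then
    return 0
  fi
  # If the file does not end with a newline, add one first so the
  # option lands on a line of its own.
  if [ -s "$jvm_file" ] && [ -n "$(tail -c 1 "$jvm_file")" ]; then
    printf '\n' >> "$jvm_file"
  fi
  printf '%s\n' "$option" >> "$jvm_file"
}
```

With the paths from this post, the call would take config/jvm.options as the first argument and the -Djava.security.policy=... line above as the second.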
2.2 Install the plugin across the cluster with ansible-playbook
[elasticsearch@test1 deploy]$ cat > setup-plugins.yml << leo
# Usage: ansible-playbook -i hosts.ini setup-plugins.yml
---
- hosts: servers
  tasks:
    - name: Create the download directory
      shell: 'mkdir -p /home/elasticsearch/download/'

    - name: 'Upload the plugin archive'
      # Copy the local file to the remote servers
      copy:
        src: '{{ item.src }}'
        dest: '{{ item.dest }}'
      with_items:
        - { src: '/home/elasticsearch/download/elasticsearch-analysis-hanlp-{{ version }}.zip', dest: '/home/elasticsearch/download/elasticsearch-analysis-hanlp-{{ version }}.zip' }

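    - name: Install the HanLP plugin
      # --batch accepts the permission prompt so the task can run unattended;
      # running in the foreground ensures the later tasks see the installed files
      shell: '{{ deploy_dir }}/elasticsearch-{{ version }}/bin/elasticsearch-plugin install --batch file:///home/elasticsearch/download/elasticsearch-analysis-hanlp-{{ version }}.zip'
      args:
        # Skip this task when the plugin directory already exists
        creates: '{{ deploy_dir }}/elasticsearch-{{ version }}/plugins/analysis-hanlp'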

    - name: Update the jvm.options configuration file
      lineinfile:
        # /home/elasticsearch/deploy/elasticsearch-7.3.2/config/jvm.options
        dest: '{{ deploy_dir }}/elasticsearch-{{ version }}/config/jvm.options'
        line: '{{ item.value }}'
        regexp: '^{{ item.value }}.*'
        state: present
      # Define the item list; the module above runs once per item
      with_items:
          # Append this line to the file: -Djava.security.policy=/home/elasticsearch/deploy/elasticsearch-7.3.2/plugins/analysis-hanlp/plugin-security.policy
        - { value: '-Djava.security.policy={{ deploy_dir }}/elasticsearch-{{ version }}/plugins/analysis-hanlp/plugin-security.policy' }

    - name: Update the hanlp.properties configuration file
      lineinfile:
        # /home/elasticsearch/deploy/elasticsearch-7.3.2/config/analysis-hanlp/hanlp.properties
        dest: '{{ deploy_dir }}/elasticsearch-{{ version }}/config/analysis-hanlp/hanlp.properties'
        line: '{{ item.key }}={{ item.value }}'
        regexp: '^{{ item.key }}.*'
        state: present
      # Define the item list; the module above runs once per item
      with_items:
          # Set an absolute path: root=/home/elasticsearch/deploy/elasticsearch-7.3.2/plugins/analysis-hanlp/
        - { key: 'root', value: '{{ deploy_dir }}/elasticsearch-{{ version }}/plugins/analysis-hanlp/' }
leo

[elasticsearch@test1 deploy]$
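
The playbook expects an inventory with a servers group and two variables, deploy_dir and version. The original post does not show hosts.ini, so the following is only a sketch of what it might contain: the values are inferred from the paths used earlier, and 192.168.180.47 is the only host address that appears in this post.

```shell
#!/usr/bin/env bash
# Sketch of an inventory for setup-plugins.yml (assumed contents, not
# taken from the original post). The quoted delimiter keeps the shell
# from expanding anything inside the here-document.
cat > hosts.ini << 'EOF'
[servers]
192.168.180.47
# list the remaining cluster nodes here, one per line

[servers:vars]
deploy_dir=/home/elasticsearch/deploy
version=7.3.2
EOF
```

The playbook can then be run with ansible-playbook -i hosts.ini setup-plugins.yml, as noted in its first comment line.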
3. Restart ES
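
The post does not say how ES is started, so the restart depends on your setup. As one possibility, the sketch below assumes ES runs as a daemon with a PID file (the -d and -p options of bin/elasticsearch); restart_es and the es.pid location are hypothetical choices for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: restart a node that was started with
#   bin/elasticsearch -d -p <pid_file>
#   $1 = ES home directory
#   $2 = PID file (optional, defaults to <ES home>/es.pid)
restart_es() {
  local es_home="$1"
  local pid_file="${2:-$1/es.pid}"
  if [ -f "$pid_file" ]; then
    local pid
    pid="$(cat "$pid_file")"
    # SIGTERM lets the node shut down cleanly.
    kill "$pid" 2>/dev/null || true
    # Wait until the old process has exited.
    while kill -0 "$pid" 2>/dev/null; do
      sleep 1
    done
  fi
  # Start the node again in the background and record its PID.
  "$es_home/bin/elasticsearch" -d -p "$pid_file"
}
```

In a cluster, restarting one node at a time keeps the cluster available while the plugin is loaded on each node.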
4. Verify the installation
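
Before calling _analyze, it may be worth confirming that the plugin is registered at all. bin/elasticsearch-plugin list prints one installed plugin name per line; the sketch below wraps that in a small check, where plugin_installed is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the named plugin appears in the
# output of "elasticsearch-plugin list".
#   $1 = ES home directory
#   $2 = plugin name, e.g. analysis-hanlp
plugin_installed() {
  local es_home="$1" name="$2"
  "$es_home/bin/elasticsearch-plugin" list | grep -qxF -- "$name"
}
```

On a running node, a request to http://192.168.180.47:9200/_cat/plugins?v gives a similar view per node over HTTP.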
curl -X GET "http://192.168.180.47:9200/_analyze" -H "Content-Type: application/json" -d '
{
    "text": "我们大家的中华人民共和国"
}'

# Result when no analyzer is specified (default analyzer)
{
  "tokens": [
    {
      "token": "我",
      "start_offset": 0,
      "end_offset": 1,
      "type": "<IDEOGRAPHIC>",
      "position": 0
    },
    {
      "token": "们",
      "start_offset": 1,
      "end_offset": 2,
      "type": "<IDEOGRAPHIC>",
      "position": 1
    },
    {
      "token": "大",
      "start_offset": 2,
      "end_offset": 3,
      "type": "<IDEOGRAPHIC>",
      "position": 2
    },
    {
      "token": "家",
      "start_offset": 3,
      "end_offset": 4,
      "type": "<IDEOGRAPHIC>",
      "position": 3
    },
    {
      "token": "的",
      "start_offset": 4,
      "end_offset": 5,
      "type": "<IDEOGRAPHIC>",
      "position": 4
    },
    {
      "token": "中",
      "start_offset": 5,
      "end_offset": 6,
      "type": "<IDEOGRAPHIC>",
      "position": 5
    },
    {
      "token": "华",
      "start_offset": 6,
      "end_offset": 7,
      "type": "<IDEOGRAPHIC>",
      "position": 6
    },
    {
      "token": "人",
      "start_offset": 7,
      "end_offset": 8,
      "type": "<IDEOGRAPHIC>",
      "position": 7
    },
    {
      "token": "民",
      "start_offset": 8,
      "end_offset": 9,
      "type": "<IDEOGRAPHIC>",
      "position": 8
    },
    {
      "token": "共",
      "start_offset": 9,
      "end_offset": 10,
      "type": "<IDEOGRAPHIC>",
      "position": 9
    },
    {
      "token": "和",
      "start_offset": 10,
      "end_offset": 11,
      "type": "<IDEOGRAPHIC>",
      "position": 10
    },
    {
      "token": "国",
      "start_offset": 11,
      "end_offset": 12,
      "type": "<IDEOGRAPHIC>",
      "position": 11
    }
  ]
}

Testing the analyzer

curl -X GET "http://192.168.180.47:9200/_analyze" -H "Content-Type: application/json" -d '
{
    "text": "我们大家的中华人民共和国",
    "analyzer": "hanlp"
}'
# Result with the hanlp analyzer
{
  "tokens": [
    {
      "token": "我们",
      "start_offset": 0,
      "end_offset": 2,
      "type": "rr",
      "position": 0
    },
    {
      "token": "大家",
      "start_offset": 2,
      "end_offset": 4,
      "type": "rr",
      "position": 1
    },
    {
      "token": "的",
      "start_offset": 4,
      "end_offset": 5,
      "type": "ude1",
      "position": 2
    },
    {
      "token": "中华人民共和国",
      "start_offset": 5,
      "end_offset": 12,
      "type": "ns",
      "position": 3
    }
  ]
}

