Preface
I previously wrote about deploying Elasticsearch and Kibana with Docker, but Elasticsearch's built-in analyzers clearly fall short for Chinese word segmentation. In practice, we usually install an analyzer plugin such as the IK analyzer or the Jieba analyzer. This post documents my own hands-on installation of the IK analyzer.
Installation Steps
- Obtain the IK analyzer package; the matching version is available from the official GitHub releases or from Maven Central. Entering the address below in a browser downloads it directly. Remember to substitute your own version number: the IK analyzer version must match the ES version.

  https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.3.0/elasticsearch-analysis-ik-6.3.0.zip
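Instead of a browser, the same download can be scripted. A minimal sketch, where `ES_VERSION` is an assumed variable you set to your own Elasticsearch version:

```shell
# Build the release URL from the ES version (must match your ES version exactly)
ES_VERSION=6.3.0
IK_URL="https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v${ES_VERSION}/elasticsearch-analysis-ik-${ES_VERSION}.zip"
echo "$IK_URL"
# wget "$IK_URL"   # uncomment to actually download
```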
- Once the IK analyzer package has finished downloading, locate the plugin directory that was volume-mapped when deploying with Docker (mine is /opt/volumn/es/sigle/plugins). Move the package into plugins, unzip it into a dedicated subdirectory, and then remove the zip archive:

```shell
# Extract the package into its own directory
[root@Genterator plugins]# unzip elasticsearch-analysis-ik-7.17.1.zip -d ./elasticsearch-analysis-ik
Archive:  elasticsearch-analysis-ik-7.17.1.zip
   creating: ./elasticsearch-analysis-ik/config/
  inflating: ./elasticsearch-analysis-ik/config/stopword.dic
  inflating: ./elasticsearch-analysis-ik/config/extra_main.dic
  inflating: ./elasticsearch-analysis-ik/config/quantifier.dic
  inflating: ./elasticsearch-analysis-ik/config/extra_single_word.dic
  inflating: ./elasticsearch-analysis-ik/config/IKAnalyzer.cfg.xml
  inflating: ./elasticsearch-analysis-ik/config/surname.dic
  inflating: ./elasticsearch-analysis-ik/config/extra_single_word_low_freq.dic
  inflating: ./elasticsearch-analysis-ik/config/extra_single_word_full.dic
  inflating: ./elasticsearch-analysis-ik/config/preposition.dic
  inflating: ./elasticsearch-analysis-ik/config/extra_stopword.dic
  inflating: ./elasticsearch-analysis-ik/config/suffix.dic
  inflating: ./elasticsearch-analysis-ik/config/main.dic
  inflating: ./elasticsearch-analysis-ik/plugin-descriptor.properties
  inflating: ./elasticsearch-analysis-ik/plugin-security.policy
  inflating: ./elasticsearch-analysis-ik/elasticsearch-analysis-ik-7.17.1.jar
  inflating: ./elasticsearch-analysis-ik/httpclient-4.5.2.jar
  inflating: ./elasticsearch-analysis-ik/httpcore-4.4.4.jar
  inflating: ./elasticsearch-analysis-ik/commons-logging-1.2.jar
  inflating: ./elasticsearch-analysis-ik/commons-codec-1.9.jar
[root@Genterator plugins]# ll
total 4400
drwxr-xr-x 3 root root     244 Jun  3 19:33 elasticsearch-analysis-ik
-rw-r--r-- 1 root root 4504811 Jun  3 19:32 elasticsearch-analysis-ik-7.17.1.zip

# Remove the archive
[root@Genterator plugins]# rm -rf elasticsearch-analysis-ik-7.17.1.zip
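Before restarting, it can be worth sanity-checking the extracted layout. A small helper sketch; the expected file names are taken from the unzip listing, while `check_ik_plugin` itself is a hypothetical name:

```shell
# Verify an extracted IK plugin directory looks complete before restarting ES.
check_ik_plugin() {
  dir="$1"
  [ -f "$dir/plugin-descriptor.properties" ] || { echo "missing plugin-descriptor.properties"; return 1; }
  [ -d "$dir/config" ] || { echo "missing config/ dictionary directory"; return 1; }
  echo "plugin layout OK"
}
# Usage: check_ik_plugin /opt/volumn/es/sigle/plugins/elasticsearch-analysis-ik
```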
- Restart the Elasticsearch container with `docker restart [container_name_or_id]`. After the restart, the IK analyzer should be in effect:

```shell
[root@Genterator plugins]# docker restart es
es
```
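As an optional check that the plugin was actually loaded after the restart (not part of the original steps), you can list installed plugins from Kibana Dev Tools:

```
GET /_cat/plugins?v
```

The response should include an `analysis-ik` entry for each node.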
IK ships with two analyzers:
- ik_max_word: splits the text at the finest granularity, producing as many terms as possible
- ik_smart: splits at the coarsest granularity; terms that have already been split out will not be reused as part of other terms
```
# Finest granularity
GET /_analyze
{
  "analyzer": "ik_max_word",
  "text": "中国人民共和国"
}

# Coarsest granularity
GET /_analyze
{
  "analyzer": "ik_smart",
  "text": "中国人民共和国"
}
```
- Set the analyzer for an index

Before adding an analyzer, the index must first be closed; reopen it once the analyzer has been added.

```
# Close the index
POST /blog/_close

# Define the analyzer
PUT /blog/_settings
{
  "analysis": {
    "analyzer": {
      "ik": {
        "tokenizer": "ik_max_word"
      }
    }
  }
}

# Reopen the index
POST /blog/_open
```
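With the custom `ik` analyzer defined on the index, you would typically point a field at it in the mapping. A sketch, assuming the blog index has a text field named content (the field name here is hypothetical):

```
# Apply the custom "ik" analyzer to a (hypothetical) "content" field
PUT /blog/_mapping
{
  "properties": {
    "content": {
      "type": "text",
      "analyzer": "ik"
    }
  }
}

# Check how the index's "ik" analyzer tokenizes text
GET /blog/_analyze
{
  "analyzer": "ik",
  "text": "中国人民共和国"
}
```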