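Part 2: Installing the Chinese Analyzer (IK Analysis) on Elasticsearch 6

The previous post left us with a working Elasticsearch 6.2.4 installation; here we continue by installing the IK analyzer.

I. Installation

    The version compatibility table (from the project's GitHub page):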

IK version    ES version
master        6.x -> master
6.2.4         6.2.4
6.1.3         6.1.3
5.6.8         5.6.8
5.5.3         5.5.3
5.4.3         5.4.3
5.3.3         5.3.3
5.2.2         5.2.2
5.1.2         5.1.2
1.10.6        2.4.6
1.9.5         2.3.5
1.8.1         2.2.1
1.7.0         2.1.1
1.5.0         2.0.0
1.2.6         1.0.0
1.2.5         0.90.x
1.1.3         0.20.x
1.0.0         0.16.2 -> 0.19.0
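
From 5.x onward the table pairs each IK release with the identically numbered Elasticsearch release, so the release URL can be derived from the Elasticsearch version. A minimal sketch, assuming that pattern holds for your version; `ik_release_url` is a hypothetical helper name, not something shipped with IK or Elasticsearch:

```shell
# Build the IK release URL for a given Elasticsearch version.
# Assumption: IK version == ES version (true for the 5.x/6.x rows of the table);
# for older rows look up the IK version by hand.
ik_release_url() {
  local es_version="$1"
  printf 'https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v%s/elasticsearch-analysis-ik-%s.zip\n' \
    "$es_version" "$es_version"
}

# Print the URL for the version used in this post.
ik_release_url 6.2.4
```

The printed URL is the one used in both installation methods that follow.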

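  1. Offline installation:

   (1) Download the release package from the address below (check that the version number matches your Elasticsearch version)
https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.2.4/elasticsearch-analysis-ik-6.2.4.zip
   (2) Unzip it into the plugins directory of the Elasticsearch installation (the transcript assumes the zip has already been saved there)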
[payment@localhost elasticsearch-6.2.4]$ cd plugins/
[payment@localhost plugins]$ pwd
/home/payment/elasticSearch/elasticsearch-6.2.4/plugins
[payment@localhost plugins]$ unzip elasticsearch-analysis-ik-6.2.4.zip
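
The same step can be done without ever placing the archive inside plugins/. This is only a sketch: `install_ik_offline` is a hypothetical helper, both paths are supplied by you, and the motivation is an assumption on our part, namely that a leftover zip file in plugins/ can be mistaken for a broken plugin by some Elasticsearch versions at startup.

```shell
# Unpack an already downloaded IK zip into <es_home>/plugins,
# leaving the zip file itself wherever it was downloaded.
install_ik_offline() {
  local es_home="$1"
  local zip_file="$2"
  # Refuse to continue if either path is wrong.
  [ -d "$es_home/plugins" ] || { echo "no plugins directory under $es_home" >&2; return 1; }
  [ -f "$zip_file" ] || { echo "zip not found: $zip_file" >&2; return 1; }
  # Same result as running unzip inside plugins/, as in the transcript above.
  unzip -q "$zip_file" -d "$es_home/plugins"
}

# Example call (paths taken from this post; the /tmp location is an assumption):
# install_ik_offline /home/payment/elasticSearch/elasticsearch-6.2.4 /tmp/elasticsearch-analysis-ik-6.2.4.zip
```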

   2. Online installation (recommended):

[payment@gameServer elasticsearch-6.2.4]$ pwd
/home/payment/elasticSearch/elasticsearch-6.2.4
[payment@gameServer elasticsearch-6.2.4]$ 
[payment@gameServer elasticsearch-6.2.4]$ ./bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.2.4/elasticsearch-analysis-ik-6.2.4.zip
-> Downloading https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.2.4/elasticsearch-analysis-ik-6.2.4.zip
[=================================================] 100%   
-> Installed analysis-ik
[payment@gameServer elasticsearch-6.2.4]$ 

 II. Restart the Elasticsearch service

    1. Stop the service:

[payment@gameServer elasticsearch-6.2.4]$ ps -ef|grep elasticsearch
payment  27352     1  0 10:50 pts/0    00:00:39 /usr/local/java/jdk1.8.0_161//bin/java -Xms1g -Xmx1g -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+AlwaysPreTouch -Xss1m -Djava.awt.headless=true -Dfile.encoding=UTF-8 -Djna.nosys=true -XX:-OmitStackTraceInFastThrow -Dio.netty.noUnsafe=true -Dio.netty.noKeySetOptimization=true -Dio.netty.recycler.maxCapacityPerThread=0 -Dlog4j.shutdownHookEnabled=false -Dlog4j2.disable.jmx=true -Djava.io.tmpdir=/tmp/elasticsearch.oFTj99LA -XX:+HeapDumpOnOutOfMemoryError -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime -Xloggc:logs/gc.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=32 -XX:GCLogFileSize=64m -Des.path.home=/home/payment/elasticSearch/elasticsearch-6.2.4 -Des.path.conf=/home/payment/elasticSearch/elasticsearch-6.2.4/config -cp /home/payment/elasticSearch/elasticsearch-6.2.4/lib/* org.elasticsearch.bootstrap.Elasticsearch -d
payment  29017 26594  0 13:10 pts/0    00:00:00 grep elasticsearch
[payment@gameServer elasticsearch-6.2.4]$ 
[payment@gameServer elasticsearch-6.2.4]$ 
[payment@gameServer elasticsearch-6.2.4]$ kill -9 27352
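
`kill -9` (SIGKILL) gives the node no chance to shut down cleanly. A gentler variant is sketched below, under the assumption that the Elasticsearch process shuts itself down when it receives a plain `kill` (SIGTERM); `stop_gracefully` is a hypothetical helper name:

```shell
# Send SIGTERM to a process and wait up to <timeout> seconds for it to exit.
# Returns 0 once the process is gone, 1 if it could not be signalled
# or is still running when the timeout expires.
stop_gracefully() {
  local pid="$1"
  local timeout="${2:-30}"
  kill "$pid" 2>/dev/null || return 1        # default signal is TERM
  while [ "$timeout" -gt 0 ]; do
    kill -0 "$pid" 2>/dev/null || return 0   # signal 0 only probes for existence
    sleep 1
    timeout=$((timeout - 1))
  done
  return 1
}

# Example with the PID from the ps output above:
# stop_gracefully 27352 60 || kill -9 27352
```

Falling back to `kill -9` only after the timeout keeps the forced kill as a last resort.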

    2. Start Elasticsearch

[payment@gameServer elasticsearch-6.2.4]$ pwd
/home/payment/elasticSearch/elasticsearch-6.2.4
[payment@gameServer elasticsearch-6.2.4]$ ./bin/elasticsearch -d && tail -f logs/elasticsearch.log
[2018-06-06T13:12:28,029][INFO ][o.e.d.DiscoveryModule    ] [SdEluaQ] using discovery type [zen]
[2018-06-06T13:12:28,536][INFO ][o.e.n.Node               ] initialized
[2018-06-06T13:12:28,536][INFO ][o.e.n.Node               ] [SdEluaQ] starting ...
[2018-06-06T13:12:28,711][INFO ][o.e.t.TransportService   ] [SdEluaQ] publish_address {172.17.63.15:9300}, bound_addresses {172.17.63.15:9300}
[2018-06-06T13:12:28,721][INFO ][o.e.b.BootstrapChecks    ] [SdEluaQ] bound or publishing to a non-loopback address, enforcing bootstrap checks
[2018-06-06T13:12:31,765][INFO ][o.e.c.s.MasterService    ] [SdEluaQ] zen-disco-elected-as-master ([0] nodes joined), reason: new_master {SdEluaQ}{SdEluaQkTfi1p-yRtlxHSA}{IYnq99tLTjKcjGSXoxTS5w}{172.17.63.15}{172.17.63.15:9300}
[2018-06-06T13:12:31,769][INFO ][o.e.c.s.ClusterApplierService] [SdEluaQ] new_master {SdEluaQ}{SdEluaQkTfi1p-yRtlxHSA}{IYnq99tLTjKcjGSXoxTS5w}{172.17.63.15}{172.17.63.15:9300}, reason: apply cluster state (from master [master {SdEluaQ}{SdEluaQkTfi1p-yRtlxHSA}{IYnq99tLTjKcjGSXoxTS5w}{172.17.63.15}{172.17.63.15:9300} committed version [1] source [zen-disco-elected-as-master ([0] nodes joined)]])
[2018-06-06T13:12:31,782][INFO ][o.e.h.n.Netty4HttpServerTransport] [SdEluaQ] publish_address {172.17.63.15:9200}, bound_addresses {172.17.63.15:9200}
[2018-06-06T13:12:31,782][INFO ][o.e.n.Node               ] [SdEluaQ] started
[2018-06-06T13:12:31,921][INFO ][o.e.g.GatewayService     ] [SdEluaQ] recovered [0] indices into cluster_state
[2018-06-06T13:13:42,980][INFO ][o.e.n.Node               ] [] initializing ...
[2018-06-06T13:13:43,141][INFO ][o.e.e.NodeEnvironment    ] [SdEluaQ] using [1] data paths, mounts [[/ (rootfs)]], net usable_space [402.8gb], net total_space [442.7gb], types [rootfs]
[2018-06-06T13:13:43,141][INFO ][o.e.e.NodeEnvironment    ] [SdEluaQ] heap size [990.7mb], compressed ordinary object pointers [true]
[2018-06-06T13:13:43,143][INFO ][o.e.n.Node               ] node name [SdEluaQ] derived from node ID [SdEluaQkTfi1p-yRtlxHSA]; set [node.name] to override
[2018-06-06T13:13:43,143][INFO ][o.e.n.Node               ] version[6.2.4], pid[29196], build[ccec39f/2018-04-12T20:37:28.497551Z], OS[Linux/2.6.32-696.28.1.el6.x86_64/amd64], JVM[Oracle Corporation/Java HotSpot(TM) 64-Bit Server VM/1.8.0_161/25.161-b12]
[2018-06-06T13:13:43,143][INFO ][o.e.n.Node               ] JVM arguments [-Xms1g, -Xmx1g, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -XX:+AlwaysPreTouch, -Xss1m, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djna.nosys=true, -XX:-OmitStackTraceInFastThrow, -Dio.netty.noUnsafe=true, -Dio.netty.noKeySetOptimization=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Dlog4j.shutdownHookEnabled=false, -Dlog4j2.disable.jmx=true, -Djava.io.tmpdir=/tmp/elasticsearch.vXQsyXAG, -XX:+HeapDumpOnOutOfMemoryError, -XX:+PrintGCDetails, -XX:+PrintGCDateStamps, -XX:+PrintTenuringDistribution, -XX:+PrintGCApplicationStoppedTime, -Xloggc:logs/gc.log, -XX:+UseGCLogFileRotation, -XX:NumberOfGCLogFiles=32, -XX:GCLogFileSize=64m, -Des.path.home=/home/payment/elasticSearch/elasticsearch-6.2.4, -Des.path.conf=/home/payment/elasticSearch/elasticsearch-6.2.4/config]
[2018-06-06T13:13:43,782][INFO ][o.e.p.PluginsService     ] [SdEluaQ] loaded module [aggs-matrix-stats]
[2018-06-06T13:13:43,782][INFO ][o.e.p.PluginsService     ] [SdEluaQ] loaded module [analysis-common]
[2018-06-06T13:13:43,782][INFO ][o.e.p.PluginsService     ] [SdEluaQ] loaded module [ingest-common]
[2018-06-06T13:13:43,782][INFO ][o.e.p.PluginsService     ] [SdEluaQ] loaded module [lang-expression]
[2018-06-06T13:13:43,782][INFO ][o.e.p.PluginsService     ] [SdEluaQ] loaded module [lang-mustache]
[2018-06-06T13:13:43,782][INFO ][o.e.p.PluginsService     ] [SdEluaQ] loaded module [lang-painless]
[2018-06-06T13:13:43,782][INFO ][o.e.p.PluginsService     ] [SdEluaQ] loaded module [mapper-extras]
[2018-06-06T13:13:43,782][INFO ][o.e.p.PluginsService     ] [SdEluaQ] loaded module [parent-join]
[2018-06-06T13:13:43,782][INFO ][o.e.p.PluginsService     ] [SdEluaQ] loaded module [percolator]
[2018-06-06T13:13:43,782][INFO ][o.e.p.PluginsService     ] [SdEluaQ] loaded module [rank-eval]
[2018-06-06T13:13:43,782][INFO ][o.e.p.PluginsService     ] [SdEluaQ] loaded module [reindex]
[2018-06-06T13:13:43,782][INFO ][o.e.p.PluginsService     ] [SdEluaQ] loaded module [repository-url]
[2018-06-06T13:13:43,783][INFO ][o.e.p.PluginsService     ] [SdEluaQ] loaded module [transport-netty4]
[2018-06-06T13:13:43,783][INFO ][o.e.p.PluginsService     ] [SdEluaQ] loaded module [tribe]
[2018-06-06T13:13:43,783][INFO ][o.e.p.PluginsService     ] [SdEluaQ] loaded plugin [analysis-ik]
[2018-06-06T13:13:46,137][INFO ][o.e.d.DiscoveryModule    ] [SdEluaQ] using discovery type [zen]
[2018-06-06T13:13:46,605][INFO ][o.e.n.Node               ] initialized
[2018-06-06T13:13:46,605][INFO ][o.e.n.Node               ] [SdEluaQ] starting ...
[2018-06-06T13:13:46,770][INFO ][o.e.t.TransportService   ] [SdEluaQ] publish_address {172.17.63.15:9300}, bound_addresses {172.17.63.15:9300}
[2018-06-06T13:13:46,778][INFO ][o.e.b.BootstrapChecks    ] [SdEluaQ] bound or publishing to a non-loopback address, enforcing bootstrap checks
[2018-06-06T13:13:49,828][INFO ][o.e.c.s.MasterService    ] [SdEluaQ] zen-disco-elected-as-master ([0] nodes joined), reason: new_master {SdEluaQ}{SdEluaQkTfi1p-yRtlxHSA}{OJnGIoaBRDaK0mBJRTarMQ}{172.17.63.15}{172.17.63.15:9300}
[2018-06-06T13:13:49,835][INFO ][o.e.c.s.ClusterApplierService] [SdEluaQ] new_master {SdEluaQ}{SdEluaQkTfi1p-yRtlxHSA}{OJnGIoaBRDaK0mBJRTarMQ}{172.17.63.15}{172.17.63.15:9300}, reason: apply cluster state (from master [master {SdEluaQ}{SdEluaQkTfi1p-yRtlxHSA}{OJnGIoaBRDaK0mBJRTarMQ}{172.17.63.15}{172.17.63.15:9300} committed version [1] source [zen-disco-elected-as-master ([0] nodes joined)]])
[2018-06-06T13:13:49,853][INFO ][o.e.h.n.Netty4HttpServerTransport] [SdEluaQ] publish_address {172.17.63.15:9200}, bound_addresses {172.17.63.15:9200}
[2018-06-06T13:13:49,861][INFO ][o.e.n.Node               ] [SdEluaQ] started
[2018-06-06T13:13:49,973][INFO ][o.e.g.GatewayService     ] [SdEluaQ] recovered [0] indices into cluster_state
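The command above starts Elasticsearch and follows the startup log. The first ten lines (timestamped 13:12) are the tail of the log from the earlier run; the new startup begins at 13:13:42.
   The log shows that the analysis plugin was loaded:
loaded plugin [analysis-ik]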
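
Besides reading the log, the installed plugins can be listed and checked for `analysis-ik`. A small sketch; `has_ik_plugin` is a hypothetical helper that only inspects whatever plugin listing is piped into it:

```shell
# Succeeds (exit status 0) when the plugin listing on stdin mentions analysis-ik.
has_ik_plugin() {
  grep -q 'analysis-ik'
}

# Either of these listings can be piped in (host and paths as used in this post):
# ./bin/elasticsearch-plugin list | has_ik_plugin && echo "IK is installed"
# curl -s 'http://172.17.63.15:9200/_cat/plugins' | has_ik_plugin && echo "IK is loaded"
```

The first command checks what is installed on disk; the second asks the running node, so it also confirms that the restart picked the plugin up.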

III. Check the analyzer

   Run a test analysis:

[root@gameServer ~]# curl -XGET 'http://172.17.63.15:9200/_analyze?pretty' -H 'Content-Type:application/json' -d'
{
  "analyzer": "ik_smart",
  "text": "听说看这篇博客的哥们最帅、姑娘最美"
}'
{
  "tokens" : [
    {
      "token" : "听说",
      "start_offset" : 0,
      "end_offset" : 2,
      "type" : "CN_WORD",
      "position" : 0
    },
    {
      "token" : "看",
      "start_offset" : 2,
      "end_offset" : 3,
      "type" : "CN_CHAR",
      "position" : 1
    },
    {
      "token" : "这篇",
      "start_offset" : 3,
      "end_offset" : 5,
      "type" : "CN_WORD",
      "position" : 2
    },
    {
      "token" : "博客",
      "start_offset" : 5,
      "end_offset" : 7,
      "type" : "CN_WORD",
      "position" : 3
    },
    {
      "token" : "的",
      "start_offset" : 7,
      "end_offset" : 8,
      "type" : "CN_CHAR",
      "position" : 4
    },
    {
      "token" : "哥们",
      "start_offset" : 8,
      "end_offset" : 10,
      "type" : "CN_WORD",
      "position" : 5
    },
    {
      "token" : "最",
      "start_offset" : 10,
      "end_offset" : 11,
      "type" : "CN_CHAR",
      "position" : 6
    },
    {
      "token" : "帅",
      "start_offset" : 11,
      "end_offset" : 12,
      "type" : "CN_CHAR",
      "position" : 7
    },
    {
      "token" : "姑娘",
      "start_offset" : 13,
      "end_offset" : 15,
      "type" : "CN_WORD",
      "position" : 8
    },
    {
      "token" : "最美",
      "start_offset" : 15,
      "end_offset" : 17,
      "type" : "CN_WORD",
      "position" : 9
    }
  ]
}
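Explanation (source: GitHub):

What is the difference between ik_max_word and ik_smart?
ik_max_word: splits the text at the finest granularity. For example, it splits “中华人民共和国国歌” into “中华人民共和国, 中华人民, 中华, 华人, 人民共和国, 人民, 人, 民, 共和国, 共和, 和, 国国, 国歌”, exhausting every possible combination;
ik_smart: splits the text at the coarsest granularity. For example, it splits “中华人民共和国国歌” into “中华人民共和国, 国歌”.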
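
To see the difference on your own node, the same `_analyze` request can be sent once per analyzer. A minimal sketch: `analyze_body` and `analyze_with` are hypothetical helper names, the default host is the one used in this post, and the text is assumed to contain no double quotes or backslashes because no JSON escaping is performed.

```shell
# Address of the node; override by exporting ES_URL before sourcing this.
ES_URL="${ES_URL:-http://172.17.63.15:9200}"

# Print an _analyze request body for the given analyzer and text.
# Assumption: the text needs no JSON escaping.
analyze_body() {
  printf '{"analyzer": "%s", "text": "%s"}\n' "$1" "$2"
}

# Send the request to the node and pretty-print the tokens.
analyze_with() {
  curl -s -XGET "$ES_URL/_analyze?pretty" \
    -H 'Content-Type: application/json' \
    -d "$(analyze_body "$1" "$2")"
}

# Compare the two analyzers on the same sentence:
# analyze_with ik_max_word "中华人民共和国国歌"
# analyze_with ik_smart    "中华人民共和国国歌"
```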
