Pinyin Analysis in Elasticsearch

Plugin source: https://github.com/medcl/elasticsearch-analysis-pinyin
Download: https://github.com/medcl/elasticsearch-analysis-pinyin/releases/download/v6.5.4/elasticsearch-analysis-pinyin-6.5.4.zip

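Unzip the pinyin analysis plugin into a directory of your choice on the host.

When creating the container, mount that directory into the Elasticsearch plugins directory: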

-v /usr/local/elasticsearch/pinyin:/usr/share/elasticsearch/plugins/pinyin

Example container creation commands:

docker create --name es-node01 --net host -v /usr/local/elasticsearch/pinyin:/usr/share/elasticsearch/plugins/pinyin -v /usr/local/elasticsearch/node01/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml -v /usr/local/elasticsearch/node01/jvm.options:/usr/share/elasticsearch/config/jvm.options -v /usr/local/elasticsearch/ik:/usr/share/elasticsearch/plugins/ik -v /usr/local/elasticsearch/node01/data:/usr/share/elasticsearch/data elasticsearch:6.5.4 
docker create --name es-node02 --net host -v /usr/local/elasticsearch/pinyin:/usr/share/elasticsearch/plugins/pinyin -v /usr/local/elasticsearch/node02/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml -v /usr/local/elasticsearch/node02/jvm.options:/usr/share/elasticsearch/config/jvm.options -v /usr/local/elasticsearch/ik:/usr/share/elasticsearch/plugins/ik -v /usr/local/elasticsearch/node02/data:/usr/share/elasticsearch/data elasticsearch:6.5.4 
docker create --name es-node03 --net host -v /usr/local/elasticsearch/pinyin:/usr/share/elasticsearch/plugins/pinyin -v /usr/local/elasticsearch/node03/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml -v /usr/local/elasticsearch/node03/jvm.options:/usr/share/elasticsearch/config/jvm.options -v /usr/local/elasticsearch/ik:/usr/share/elasticsearch/plugins/ik -v /usr/local/elasticsearch/node03/data:/usr/share/elasticsearch/data elasticsearch:6.5.4
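
The three commands above differ only in the node name. As a rough sketch, they can be generated from one template; the names `BASE`, `ES_HOME`, and `docker_create_command` are my own and are not part of Docker or the original setup.

```python
# Sketch: generate the `docker create` command for each node.
# BASE, ES_HOME and docker_create_command are illustrative names only.
BASE = "/usr/local/elasticsearch"
ES_HOME = "/usr/share/elasticsearch"
IMAGE = "elasticsearch:6.5.4"


def docker_create_command(node: str) -> str:
    """Return the docker create command for one node, e.g. "node01"."""
    mounts = [
        # pinyin plugin, shared by all nodes
        (f"{BASE}/pinyin", f"{ES_HOME}/plugins/pinyin"),
        # per-node configuration files
        (f"{BASE}/{node}/elasticsearch.yml", f"{ES_HOME}/config/elasticsearch.yml"),
        (f"{BASE}/{node}/jvm.options", f"{ES_HOME}/config/jvm.options"),
        # ik plugin, shared by all nodes
        (f"{BASE}/ik", f"{ES_HOME}/plugins/ik"),
        # per-node data directory
        (f"{BASE}/{node}/data", f"{ES_HOME}/data"),
    ]
    volume_args = " ".join(f"-v {host}:{container}" for host, container in mounts)
    return f"docker create --name es-{node} --net host {volume_args} {IMAGE}"


for node in ("node01", "node02", "node03"):
    print(docker_create_command(node))
```

Because the plugin directories are mounted from the same host path on every node, the plugin only has to be unzipped once.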

 

Test (the official example)

Create the index:

PUT http://192.168.142.128:9200/medcl/
{
    "settings" : {
        "analysis" : {
            "analyzer" : {
                "pinyin_analyzer" : {
                    "tokenizer" : "my_pinyin"
                    }
            },
            "tokenizer" : {
                "my_pinyin" : {
                    "type" : "pinyin",
                    "keep_separate_first_letter" : false,
                    "keep_full_pinyin" : true,
                    "keep_original" : true,
                    "limit_first_letter_length" : 16,
                    "lowercase" : true,
                    "remove_duplicated_term" : true
                }
            }
        }
    }
}
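
Creating an index is a PUT request with the settings as a JSON body. A minimal sketch using only Python's standard library follows; the helper name `build_request` is my own, and the host and port are the ones used in this post. Nothing is sent until the request object is passed to `urllib.request.urlopen`.

```python
import json
import urllib.request

ES = "http://192.168.142.128:9200"  # node address used in this post

INDEX_SETTINGS = {
    "settings": {
        "analysis": {
            "analyzer": {"pinyin_analyzer": {"tokenizer": "my_pinyin"}},
            "tokenizer": {
                "my_pinyin": {
                    "type": "pinyin",
                    "keep_separate_first_letter": False,
                    "keep_full_pinyin": True,
                    "keep_original": True,
                    "limit_first_letter_length": 16,
                    "lowercase": True,
                    "remove_duplicated_term": True,
                }
            },
        }
    }
}


def build_request(method: str, path: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON request against the cluster."""
    return urllib.request.Request(
        url=f"{ES}{path}",
        data=json.dumps(body, ensure_ascii=False).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method=method,
    )


create_index = build_request("PUT", "/medcl/", INDEX_SETTINGS)
# To actually send it: urllib.request.urlopen(create_index)
```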

Parameter descriptions:

 

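  • keep_first_letter: when enabled, the first letters are joined into one term, e.g. 刘德华 > ldh. Default: true.
  • keep_separate_first_letter: when enabled, each first letter is kept as a separate term, e.g. 刘德华 > l,d,h. Default: false. Note: query results may be too fuzzy because these single-letter terms are very frequent.
  • keep_full_pinyin: when enabled, the full pinyin of each character is kept, e.g. 刘德华 > [liu,de,hua]. Default: true.
  • keep_original: when enabled, the original input is kept as well. Default: false.
  • limit_first_letter_length: the maximum length of the first_letter result. Default: 16.
  • lowercase: lowercase non-Chinese letters. Default: true.
  • remove_duplicated_term: when enabled, duplicate terms are removed to save index space, e.g. de的 > de. Default: false. Note: position-related queries may be affected.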
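
To make the options above concrete, here is a toy model of how they combine. It is not the plugin's implementation: `PINYIN` is a hypothetical three-entry lookup table, `toy_pinyin_tokens` is an invented name, the `lowercase` option is left out, and the real plugin may emit tokens in a different order.

```python
# Illustrative only: a toy model of the options above, NOT the plugin's code.
# PINYIN is a hypothetical lookup table; the real plugin ships a full dictionary.
PINYIN = {"刘": "liu", "德": "de", "华": "hua"}


def toy_pinyin_tokens(
    text,
    keep_first_letter=True,
    keep_separate_first_letter=False,
    keep_full_pinyin=True,
    keep_original=False,
    limit_first_letter_length=16,
    remove_duplicated_term=False,
):
    """Return the terms this toy model would produce for `text`."""
    syllables = [PINYIN[ch] for ch in text if ch in PINYIN]
    tokens = []
    if keep_separate_first_letter:
        tokens += [s[0] for s in syllables]  # l, d, h
    if keep_full_pinyin:
        tokens += syllables  # liu, de, hua
    if keep_original:
        tokens.append(text)  # 刘德华
    if keep_first_letter and syllables:
        first_letters = "".join(s[0] for s in syllables)
        tokens.append(first_letters[:limit_first_letter_length])  # ldh
    if remove_duplicated_term:
        tokens = list(dict.fromkeys(tokens))  # drop repeats, keep first occurrence
    return tokens


# Plugin defaults
print(toy_pinyin_tokens("刘德华"))  # ['liu', 'de', 'hua', 'ldh']
# Options used by the my_pinyin tokenizer in this post
print(toy_pinyin_tokens("刘德华", keep_original=True, remove_duplicated_term=True))
# ['liu', 'de', 'hua', '刘德华', 'ldh']
```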

 

Test the analyzer:

POST http://192.168.142.128:9202/medcl/_analyze
{
  "text": ["刘德华"],
  "analyzer": "pinyin_analyzer"
}
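
The `_analyze` API returns a JSON object with a `tokens` array. The sketch below pulls out just the token strings; `extract_tokens` is my own helper name, and `sample_response` is a hypothetical, abbreviated response used only to exercise the helper, not output captured from the plugin.

```python
import json

# Request body for the _analyze call (same content as above).
analyze_body = {"text": ["刘德华"], "analyzer": "pinyin_analyzer"}


def extract_tokens(response: dict) -> list:
    """Return the token strings from an _analyze response, in order."""
    return [entry["token"] for entry in response.get("tokens", [])]


# Hypothetical, abbreviated response; real entries also carry offsets and a type.
sample_response = {
    "tokens": [
        {"token": "liu", "position": 0},
        {"token": "de", "position": 1},
        {"token": "hua", "position": 2},
    ]
}

print(json.dumps(analyze_body, ensure_ascii=False))
print(extract_tokens(sample_response))  # ['liu', 'de', 'hua']
```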

 

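I ran into an error at this point:

It was an out-of-memory error.

Fix: edit jvm.options and increase the heap values after -Xms and -Xmx until the node runs normally: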

-Xms256m
-Xmx256m
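
As a small sketch of that edit, the helper below rewrites both heap lines in the text of a jvm.options file. The name `set_heap_size` and the value `512m` are assumptions chosen for illustration; pick a size that fits your host. Lines that do not start with `-Xms` or `-Xmx`, such as comments, are left untouched.

```python
import re


def set_heap_size(jvm_options: str, size: str) -> str:
    """Set both -Xms and -Xmx to `size` (e.g. "512m") in jvm.options text."""
    return re.sub(
        r"^-Xm([sx])\S+",  # a line starting with -Xms... or -Xmx...
        lambda m: f"-Xm{m.group(1)}{size}",
        jvm_options,
        flags=re.MULTILINE,
    )


before = "# heap settings\n-Xms256m\n-Xmx256m\n"
print(set_heap_size(before, "512m"))
```

The container has to be restarted for the new heap size to take effect, since jvm.options is read when the JVM starts.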

 

Test searching data:

Create the mapping

PUT http://192.168.142.128:9200/medcl/folks/_mapping
{
  "folks": {
    "properties": {
      "name": {
        "type": "keyword",
        "fields": {
          "pinyin": {
            "type": "text",
            "store": false,
            "term_vector": "with_offsets",
            "analyzer": "pinyin_analyzer",
            "boost": 10
          }
        }
      }
    }
  }
}

Insert a document

POST http://192.168.142.128:9200/medcl/folks/andy
{"name":"刘德华"}

Query

POST http://192.168.142.128:9200/medcl/folks/_search
{
  "query": {
    "match": {
      "name.pinyin": "liudehua"
    }
  }
}
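
A search response lists matching documents under `hits.hits`, each with its `_source`. The sketch below builds the match query and reads the `name` field from each hit; `match_query` and `hit_names` are my own helper names, and `sample_response` is a hypothetical response shaped like an Elasticsearch 6.x result, not output captured from this cluster.

```python
import json


def match_query(field: str, text: str) -> dict:
    """Build a match query body for one field."""
    return {"query": {"match": {field: text}}}


def hit_names(response: dict) -> list:
    """Return the `name` field of every hit in a search response."""
    return [hit["_source"]["name"] for hit in response["hits"]["hits"]]


# Hypothetical response for illustration only.
sample_response = {
    "hits": {
        "total": 1,
        "hits": [
            {"_index": "medcl", "_type": "folks", "_id": "andy", "_source": {"name": "刘德华"}}
        ],
    }
}

print(json.dumps(match_query("name.pinyin", "liudehua")))
print(hit_names(sample_response))  # ['刘德华']
```

The same helper builds queries for other inputs to try, for example `match_query("name.pinyin", "ldh")`.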

 

The result is as follows:

 

 
