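Mixed-index Java code: pinyin, Chinese, and first-letter mixed search with Elasticsearch

In real search requirements, Chinese text often has to be searchable by pinyin, by first letters, or by a mixture of Chinese characters, pinyin, and first letters.

For example, to support pinyin search over the Chinese text "广发聚财信用", the keywords a user might type include "广发", "聚财", "guangfa", "gfjc", "guangfajucai", "guangfjc", "gfajcai", "广发juc", and other mixed forms. This article uses elasticsearch-analysis-lc-pinyin to support searches that mix Chinese and pinyin, as well as full-pinyin and first-letter searches.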
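To make the space of mixed keywords concrete, here is a minimal Python sketch that enumerates, for a short phrase, every combination of the character itself, its full pinyin, and its first letter. The PINYIN table is hand-written for this example only and is an assumption; the plugin has its own dictionary and logic, and this is not its code.

```python
from itertools import product

# Hand-written pinyin for the example characters only (an assumption for
# illustration; a real analyzer would rely on a full dictionary).
PINYIN = {"广": "guang", "发": "fa", "聚": "ju", "财": "cai", "信": "xin", "用": "yong"}


def forms(ch):
    """Return the three ways one character can be typed:
    the character itself, its full pinyin, and its first letter."""
    full = PINYIN[ch]
    return [ch, full, full[0]]


def mixed_keywords(text):
    """Enumerate every keyword obtained by picking one form per character."""
    return {"".join(choice) for choice in product(*(forms(ch) for ch in text))}


if __name__ == "__main__":
    for keyword in sorted(mixed_keywords("广发")):
        print(keyword)
```

Calling mixed_keywords("广发聚财") yields a set that contains keywords such as "gfjc", "guangfjc", "gfajcai", and "广发juc" from the list above.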

The plugin was developed against elasticsearch 1.4.5. Whether later versions are supported is something you will need to test yourself.

I. Install the plugin

Git repository: http://git.oschina.net/music_code_m/elasticsearch-analysis-lc-pinyin

1. Download the source with git

git clone https://git.oschina.net/music_code_m/elasticsearch-analysis-lc-pinyin.git

2. Build

Go to the elasticsearch-analysis-lc-pinyin root directory and run the following command

062d24dcf5251846d7d0adf370979a14.png

3. Install

When the build finishes, the archive elasticsearch-analysis-lc-pinyin-1.4.5.zip is generated in the releases folder

1c5638388c53de6b6935dcd0829da7ec.png

Copy this file into the es plugins directory and unzip it

c4699e52dafd4b3a38b7f30f001f0057.png

Edit elasticsearch.yml in the es config directory to declare the pinyin plugin by appending the following to the end of the file

4b1b121c97d2055aea252eb356b47c53.png

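Then restart es and the installation is done.

The configuration above defines the lc_pinyin analyzers, where

lc_index: the analyzer to specify when indexing documents

lc_search: the analyzer to specify when searching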
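Why two analyzers are needed can be sketched with a toy model in Python. It assumes that index-time analysis stores several tokens per character position, and that search-time analysis splits the typed keyword into tokens that are then matched as a phrase. The function names, the PINYIN table, and the segmentation strategy are all hypothetical; this illustrates the idea and is not the plugin's implementation, and it ignores the slop setting.

```python
# Hand-written pinyin for the example only (assumption, not the plugin's dictionary).
PINYIN = {"广": "guang", "发": "fa", "聚": "ju", "财": "cai", "信": "xin", "用": "yong"}


def index_tokens(text):
    """Toy index-time analysis: one position per character, holding the
    character, its full pinyin, and its first letter."""
    return [{ch, PINYIN[ch], PINYIN[ch][0]} for ch in text]


def search_tokens(query, vocabulary):
    """Toy search-time analysis: split the query into known tokens,
    trying the longest candidate first and backtracking on failure.

    Returns a list of tokens, or None if the query cannot be segmented."""
    if not query:
        return []
    for end in range(len(query), 0, -1):
        head = query[:end]
        if head in vocabulary:
            rest = search_tokens(query[end:], vocabulary)
            if rest is not None:
                return [head] + rest
    return None


def phrase_match(query_tokens, positions):
    """True if the query tokens occur at consecutive positions (no slop)."""
    n = len(query_tokens)
    return any(
        all(tok in positions[start + k] for k, tok in enumerate(query_tokens))
        for start in range(len(positions) - n + 1)
    )


if __name__ == "__main__":
    doc = index_tokens("广发聚财信用")
    vocab = set().union(*doc)
    for q in ["guangfa", "guang发ju财", "j财xy", "gfjc"]:
        tokens = search_tokens(q, vocab)
        print(q, tokens, phrase_match(tokens, doc))
```

In this model the index side expands each character into every form a user might type, while the search side only has to cut the keyword into pieces, which is one plausible reason for keeping the two analyzers separate.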

Now we can test how the analyzers behave.

1) First, create an index

1b3b6eadc70febd0e1e71ef366d3d2ee.png

2) Create the mapping

6fb59ae1afb4989a50f271e9900df38c.png

3) Index a document

b3ec1ef2ffae8c3115d9e98a3f254223.png
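Before querying, it can help to look at the tokens an analyzer emits. The Python sketch below builds a request for the _analyze endpoint of Elasticsearch 1.x, with the analyzer name as a URL parameter and the text as the request body. It assumes the index is named "index" (as in the search URLs in this article) and that lc_index and lc_search are resolvable on that index. The helper names are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_analyze_request(text, analyzer, index="index",
                          base_url="http://localhost:9200"):
    """Build (but do not send) an _analyze request.

    The analyzer name travels in the query string and the text to
    analyze is sent as the raw request body."""
    params = urlencode({"analyzer": analyzer, "pretty": "true"})
    url = f"{base_url}/{index}/_analyze?{params}"
    return Request(url, data=text.encode("utf-8"), method="POST")


def analyze(text, analyzer):
    """Send the request and return the raw response text.

    Requires a running node with the plugin installed."""
    with urlopen(build_analyze_request(text, analyzer)) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    req = build_analyze_request("广发聚财信用", "lc_index")
    print(req.get_method(), req.full_url)
```

Running analyze("广发聚财信用", "lc_index") and analyze("guang发ju财", "lc_search") against a live node shows what each side of the match actually sees.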

4) Build the query

Search: guangfa

curl -XPOST 'http://localhost:9200/index/_search?pretty' -d '
{
  "query": {
    "match": {
      "content": {
        "query": "guangfa",
        "analyzer": "lc_search",
        "type": "phrase",
        "slop": 1,
        "zero_terms_query": "NONE"
      }
    }
  },
  "highlight": {
    "pre_tags": [
      "",
      ""
    ],
    "post_tags": [
      "",
      ""
    ],
    "fields": {
      "content": {}
    }
  }
}'

Response:

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 4.841117,
    "hits" : [ {
      "_index" : "index",
      "_type" : "fulltext",
      "_id" : "1",
      "_score" : 4.841117,
      "_source": {"content":"广发聚财信用"},
      "highlight" : {
        "content" : [ "广发聚财信用" ]
      }
    } ]
  }
}
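The searches in this article differ only in the keyword, so the request body can be generated. The following Python sketch builds the same match query for any keyword. The function name is hypothetical, and the highlight tags are omitted so that Elasticsearch falls back to its defaults.

```python
import json


def build_search_body(keyword, field="content", analyzer="lc_search"):
    """Build the phrase-style match query used in this article.

    Only the keyword changes between searches; the field, analyzer,
    slop, and zero_terms_query values are copied from the curl example."""
    return {
        "query": {
            "match": {
                field: {
                    "query": keyword,
                    "analyzer": analyzer,
                    "type": "phrase",
                    "slop": 1,
                    "zero_terms_query": "NONE",
                }
            }
        },
        # No pre_tags/post_tags: the default highlight tags are used.
        "highlight": {"fields": {field: {}}},
    }


if __name__ == "__main__":
    for kw in ["guangfa", "guang发ju财", "j财xy", "gfjc"]:
        print(json.dumps(build_search_body(kw), ensure_ascii=False))
```

Each printed line can be passed to curl with -d, or sent with any HTTP client, as the body of a POST to /index/_search.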

Search: guang发ju财

curl -XPOST 'http://localhost:9200/index/_search?pretty' -d '
{
  "query": {
    "match": {
      "content": {
        "query": "guang发ju财",
        "analyzer": "lc_search",
        "type": "phrase",
        "slop": 1,
        "zero_terms_query": "NONE"
      }
    }
  },
  "highlight": {
    "pre_tags": [
      "",
      ""
    ],
    "post_tags": [
      "",
      ""
    ],
    "fields": {
      "content": {}
    }
  }
}'

{
  "took" : 15,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 3.6822338,
    "hits" : [ {
      "_index" : "index",
      "_type" : "fulltext",
      "_id" : "1",
      "_score" : 3.6822338,
      "_source":{"content":"广发聚财信用"},
      "highlight" : {
        "content" : [ "广发聚财信用" ]
      }
    } ]
  }
}

Search: j财xy

curl -XPOST 'http://localhost:9200/index/_search?pretty' -d '
{
  "query": {
    "match": {
      "content": {
        "query": "j财xy",
        "analyzer": "lc_search",
        "type": "phrase",
        "slop": 1,
        "zero_terms_query": "NONE"
      }
    }
  },
  "highlight": {
    "pre_tags": [
      "",
      ""
    ],
    "post_tags": [
      "",
      ""
    ],
    "fields": {
      "content": {}
    }
  }
}'

{
  "took" : 22,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 6.682234,
    "hits" : [ {
      "_index" : "index",
      "_type" : "fulltext",
      "_id" : "1",
      "_score" : 6.682234,
      "_source":{"content":"广发聚财信用"},
      "highlight" : {
        "content" : [ "广发聚财信用" ]
      }
    } ]
  }
}

Search: gfjc

curl -XPOST 'http://localhost:9200/index/_search?pretty' -d '
{
  "query": {
    "match": {
      "content": {
        "query": "gfjc",
        "analyzer": "lc_search",
        "type": "phrase",
        "slop": 1,
        "zero_terms_query": "NONE"
      }
    }
  },
  "highlight": {
    "pre_tags": [
      "",
      ""
    ],
    "post_tags": [
      "",
      ""
    ],
    "fields": {
      "content": {}
    }
  }
}'

{
  "took" : 6,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 6.682234,
    "hits" : [ {
      "_index" : "index",
      "_type" : "fulltext",
      "_id" : "1",
      "_score" : 6.682234,
      "_source":{"content":"广发聚财信用"},
      "highlight" : {
        "content" : [ "广发聚财信用" ]
      }
    } ]
  }
}
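On the client side, the interesting parts of these responses are the id, the score, and the highlighted fragments of each hit. The Python sketch below pulls them out. SAMPLE_RESPONSE mirrors the response above, written as strict JSON, and the function name is hypothetical.

```python
import json

# Mirrors the response shown above for the "gfjc" search.
SAMPLE_RESPONSE = """
{
  "took": 6,
  "timed_out": false,
  "_shards": {"total": 5, "successful": 5, "failed": 0},
  "hits": {
    "total": 1,
    "max_score": 6.682234,
    "hits": [{
      "_index": "index",
      "_type": "fulltext",
      "_id": "1",
      "_score": 6.682234,
      "_source": {"content": "广发聚财信用"},
      "highlight": {"content": ["广发聚财信用"]}
    }]
  }
}
"""


def summarize_hits(response_text, field="content"):
    """Return (id, score, highlighted fragments) for every hit.

    A hit without a highlight section yields an empty fragment list."""
    data = json.loads(response_text)
    return [
        (hit["_id"], hit["_score"], hit.get("highlight", {}).get(field, []))
        for hit in data["hits"]["hits"]
    ]


if __name__ == "__main__":
    for doc_id, score, fragments in summarize_hits(SAMPLE_RESPONSE):
        print(doc_id, score, fragments)
```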
