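- Using the pinyin analyzer
  - Download the pinyin analysis plugin: https://github.com/medcl/elasticsearch-analysis-pinyin
  - Unzip it and enter the elasticsearch-analysis-pinyin directory
  - In pom.xml, change the es version to the version you are running
  - Run mvn package on the command line to build the plugin
  - Go to elasticsearch-analysis-pinyin-master\target\releases and unzip elasticsearch-analysis-pinyin-7.7.0.zip
  - Copy the unzipped files into plugins/pinyin under the es installation directory
  - Restart es
  - If anything fails, fix the reported errors and repeat the package, unzip, copy, and restart steps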
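Once es is back up, one quick way to confirm that the plugin was loaded is the cat plugins API. This check is not part of the original steps; it is a minimal sketch that assumes you have Kibana Dev Tools or an equivalent console pointed at the cluster.

```console
GET /_cat/plugins?v
```

Each node should list the pinyin plugin along with its version. If it is missing, the plugin directory or the version in pom.xml is the first thing to recheck.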
  - Index settings
PUT /book
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "pinyin_analyzer": {
            "tokenizer": "my_pinyin"
          }
        },
        "tokenizer": {
          "my_pinyin": {
            "type": "pinyin",
            "keep_none_chinese": false,
            "keep_full_pinyin": false,
            "keep_joined_full_pinyin": true,
            "keep_none_chinese_in_joined_full_pinyin": true,
            "keep_first_letter": false,
            "keep_none_chinese_in_first_letter": false,
            "none_chinese_pinyin_tokenize": false
          }
        }
      }
    }
  }
}
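Options used in the settings above:
keep_none_chinese: false. Non-Chinese letters and digits are not kept in the result.
keep_full_pinyin: false. Turns off per-character output such as 刘德华 -> liu, de, hua.
keep_joined_full_pinyin: true. 刘德华 -> liudehua
keep_none_chinese_in_joined_full_pinyin: true. 刘德华2016 -> liudehua2016
keep_first_letter: when true, 刘德华 -> ldh (set to false above).
keep_none_chinese_in_first_letter: when true, 刘德华2016 -> ldh2016 (set to false above).
none_chinese_pinyin_tokenize: false. This option made no practical difference here.
Other options and their defaults:
keep_separate_first_letter: keeps the first letters as separate terms, e.g. 刘德华 -> l, d, h. Default: false.
keep_full_pinyin: includes the full pinyin of each character, e.g. 刘德华 -> [liu, de, hua]. Default: true.
limit_first_letter_length: maximum length of the first_letter result. Default: 16.
lowercase: lowercases non-Chinese letters. Default: true.
keep_none_chinese: keeps non-Chinese letters and digits in the result. Default: true.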
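To see what these options actually produce, the analyzer can be run directly through the _analyze API on the book index. This is a small sketch that assumes the index was created with the settings above; the sample text is the one used in the option notes.

```console
GET /book/_analyze
{
  "analyzer": "pinyin_analyzer",
  "text": "刘德华2016"
}
```

The tokens in the response show directly how a change to any keep_* option affects the output, which is easier than reasoning about the option combinations by hand.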
  - Field mapping
POST /book/_mapping
{
  "properties": {
    "title": {
      "type": "text",
      "fields": {
        "keyword": {
          "type": "keyword",
          "ignore_above": 256
        },
        "sort": {
          "type": "text",
          "analyzer": "pinyin_analyzer",
          "fielddata": true
        }
      }
    },
    "author": {
      "type": "text",
      "fields": {
        "keyword": {
          "type": "keyword",
          "ignore_above": 256
        },
        "sort": {
          "type": "text",
          "analyzer": "pinyin_analyzer",
          "fielddata": true
        }
      }
    }
  }
}
Note: Only text fields support the analyzer mapping parameter, so the sort sub-field has to be of type text. Sorting on a text field additionally requires "fielddata": true on that field, which is why it is set on the sort sub-fields above.
  - Search
GET /book/_search
{
  "query": {
    "match": {
      "title": "测试"
    }
  },
  "from": 0,
  "size": 20,
  "sort": {
    "title.sort": "asc"
  }
}
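To try the sorted search, the index needs a few documents. The request below is only an illustration: the titles and authors are made-up sample values, not data from the original setup.

```console
POST /book/_bulk?refresh=true
{ "index": {} }
{ "title": "测试一", "author": "刘德华" }
{ "index": {} }
{ "title": "测试二", "author": "张学友" }
```

Re-running the search afterwards should return both documents, ordered by the pinyin form of the title.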
- Pinyin search
  - Install the ik analyzer (https://github.com/medcl/elasticsearch-analysis-ik) following the same steps as for the pinyin analyzer
  - Restart es once the installation is complete
  - Index settings
PUT /book
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "ik_smart_pinyin": {
            "type": "custom",
            "tokenizer": "ik_smart",
            "filter": ["my_pinyin_filter"]
          },
          "ik_max_word_pinyin": {
            "type": "custom",
            "tokenizer": "ik_max_word",
            "filter": ["my_pinyin_filter"]
          },
          "pinyin_analyzer": {
            "tokenizer": "my_pinyin_tokenizer"
          }
        },
        "tokenizer": {
          "my_pinyin_tokenizer": {
            "type": "pinyin",
            "keep_first_letter": false,
            "keep_full_pinyin": false,
            "keep_joined_full_pinyin": true,
            "keep_none_chinese_in_first_letter": true,
            "none_chinese_pinyin_tokenize": false,
            "lowercase": true,
            "with_tone_number": true
          }
        },
        "filter": {
          "my_pinyin_filter": {
            "type": "pinyin",
            "keep_first_letter": false,
            "keep_full_pinyin": false,
            "keep_joined_full_pinyin": true,
            "keep_none_chinese_in_first_letter": true,
            "none_chinese_pinyin_tokenize": false,
            "lowercase": true,
            "with_tone_number": true
          }
        }
      }
    }
  }
}
  - Field mapping
POST /book/_mapping
{
  "properties": {
    "title": {
      "type": "text",
      "analyzer": "ik_max_word_pinyin",
      "fields": {
        "keyword": {
          "type": "keyword",
          "ignore_above": 256
        }
      }
    },
    "author": {
      "type": "text",
      "analyzer": "ik_max_word_pinyin",
      "fields": {
        "keyword": {
          "type": "keyword",
          "ignore_above": 256
        }
      }
    }
  }
}
This configuration supports both Chinese and pinyin search, but it cannot sort by pinyin.
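A pinyin search against this mapping could look like the sketch below. The first request inspects the tokens that ik_max_word_pinyin produces for a Chinese term; the second searches the title field with a pinyin string. The text 测试 and the query ceshi are illustrative values only, and whether a given pinyin string matches depends on how ik segments the indexed text.

```console
GET /book/_analyze
{
  "analyzer": "ik_max_word_pinyin",
  "text": "测试"
}

GET /book/_search
{
  "query": {
    "match": {
      "title": "ceshi"
    }
  }
}
```

Comparing the tokens from the first request with the query string is a quick way to explain an unexpected miss.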
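- Using the ICU analyzer
  - Install the plugin
    - From the es installation directory, list the installed plugins: ./bin/elasticsearch-plugin list
    - From the es installation directory, install the plugin: ./bin/elasticsearch-plugin install analysis-icu
    - Restart es
  - Index settings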
PUT /book
POST /book/_mapping
{
  "properties": {
    "title": {
      "type": "text",
      "analyzer": "icu_analyzer",
      "fields": {
        "keyword": {
          "type": "keyword",
          "ignore_above": 256
        },
        "sort": {
          "type": "icu_collation_keyword",
          "index": false,
          "language": "zh",
          "country": "CN"
        }
      }
    },
    "author": {
      "type": "text",
      "analyzer": "icu_analyzer",
      "fields": {
        "keyword": {
          "type": "keyword",
          "ignore_above": 256
        },
        "sort": {
          "type": "icu_collation_keyword",
          "index": false,
          "language": "zh",
          "country": "CN"
        }
      }
    }
  }
}
  - Search
GET /book/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match_all": {}
        }
      ]
    }
  },
  "from": 0,
  "size": 10,
  "sort": [
    {
      "title.sort": "asc"
    }
  ]
}
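The search above uses match_all. Since the point of the ICU setup is to get analyzed search and collation-based sorting on the same field, a combined request might look like the following sketch. The query text 测试 is just the sample value from the earlier pinyin search, not something specific to ICU.

```console
GET /book/_search
{
  "query": {
    "match": {
      "title": "测试"
    }
  },
  "sort": [
    { "title.sort": "asc" }
  ]
}
```

Here the match runs against the title field analyzed by icu_analyzer, while the ordering comes from the title.sort collation sub-field.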
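- When both analyzed (tokenized) search and pinyin sorting are needed, use the ICU analyzer
  1. From the es installation directory, list the installed plugins: ./bin/elasticsearch-plugin list
  2. From the es installation directory, install the plugin: ./bin/elasticsearch-plugin install analysis-icu
  3. Restart es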