效果展示
在搜索框根据拼音首字母进行提示
拼音分词器
和IK中文分词器一样的用法,按照下面的顺序执行。
# 进入容器内部
docker exec -it elasticsearch /bin/bash
# 在线下载并安装
./bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-pinyin/releases/download/v7.12.1/elasticsearch-analysis-pinyin-7.12.1.zip
#退出
exit
#重启容器
docker restart elasticsearch
重启完成之后进行拼音分词可以看见每个字都有,以及整个词语首字母组合成的一个。
自定义分词器
只用默认的功能还远远不够。
先用ik进行分词,再用拼音分词器分
PUT /test
{
"settings": {
"analysis": {
"analyzer": {
"my_analyzer": {
"tokenizer": "ik_max_word",
"filter": "py"
}
},
"filter": {
"py": {
"type": "pinyin",
"keep_full_pinyin": false,
"keep_joined_full_pinyin": true,
"keep_original": true,
"limit_first_letter_length": 16,
"remove_duplicated_term": true,
"none_chinese_pinyin_tokenize": false
}
}
}
},
"mappings": {
"properties": {
"name":{
"type": "text",
"analyzer": "my_analyzer"
}
}
}
}
在test这份索引库当中再次测试就可以看见既有中文也有拼音分词了。
POST /test/_analyze
{
"text":["北岭山脚鼠鼠"],
"analyzer": "my_analyzer"
}
但是这里还会有问题,用中文搜索时会把同音字也一起搜索到
指定搜索时和创建时用不同的分词器
在上面的语句里面加上了一条
"search_analyzer": "ik_smart"
POST /test/_doc/1
{
"id": 1,
"name": "狮子"
}
POST /test/_doc/2
{
"id": 2,
"name": "虱子"
}
GET /test/_search
{
"query": {
"match": {
"name": "掉入狮子笼咋办"
}
}
}
结果如下
DSL实现自动补全查询
查询补全语法
数据准备
// 自动补全的索引库
PUT test2
{
"mappings": {
"properties": {
"title":{
"type": "completion"
}
}
}
}
// 示例数据
POST test2/_doc
{
"title": ["Sony", "WH-1000XM3"]
}
POST test2/_doc
{
"title": ["SK-II", "PITERA"]
}
POST test2/_doc
{
"title": ["Nintendo", "switch"]
}
查询语句
// 自动补全查询
GET /test2/_search
{
"suggest": {
"title_suggest": {
"text": "s", // 关键字
"completion": {
"field": "title", // 补全字段
"skip_duplicates": true, // 跳过重复的
"size": 10 // 获取前10条结果
}
}
}
}
酒店数据自动补全
修改酒店索引库数据结构
DELETE /hotel
# 酒店数据索引库
PUT /hotel
{
"settings": {
"analysis": {
"analyzer": {
"text_anlyzer": {
"tokenizer": "ik_max_word",
"filter": "py"
},
"completion_analyzer": {
"tokenizer": "keyword",
"filter": "py"
}
},
"filter": {
"py": {
"type": "pinyin",
"keep_full_pinyin": false,
"keep_joined_full_pinyin": true,
"keep_original": true,
"limit_first_letter_length": 16,
"remove_duplicated_term": true,
"none_chinese_pinyin_tokenize": false
}
}
}
},
"mappings": {
"properties": {
"id":{
"type": "keyword"
},
"name":{
"type": "text",
"analyzer": "text_anlyzer",
"search_analyzer": "ik_smart",
"copy_to": "all"
},
"address":{
"type": "keyword",
"index": false
},
"price":{
"type": "integer"
},
"score":{
"type": "integer"
},
"brand":{
"type": "keyword",
"copy_to": "all"
},
"city":{
"type": "keyword"
},
"starName":{
"type": "keyword"
},
"business":{
"type": "keyword",
"copy_to": "all"
},
"location":{
"type": "geo_point"
},
"pic":{
"type": "keyword",
"index": false
},
"all":{
"type": "text",
"analyzer": "text_anlyzer",
"search_analyzer": "ik_smart"
},
"suggestion":{
"type": "completion",
"analyzer": "completion_analyzer"
}
}
}
}
先删除再重新创建一个
然后在HotelDoc这个实体类里面新增一个字段suggestion,这个字段是由现有的字段组成放进去。
private List<String> suggestion;this.suggestion= Arrays.asList(this.brand,this.business);
然后重新执行之前的批量插入的语句
再次测试搜索可以看见搜索得到的结果里面多出了品牌和商圈信息。
但是这里business字段有可能是由多个的,要进行切割。
修改HotelDoc上面的构造方法的代码
if(this.business.contains("、")){
//business有多个值,需要切割
String[] arr = this.business.split("、");
//添加元素
this.suggestion=new ArrayList<>();
this.suggestion.add(this.brand);
Collections.addAll(this.suggestion,arr);
}else {
this.suggestion = Arrays.asList(this.brand, this.business);
}
再次插入数据可以看见多个词条已经分开了。
进行搜索测试
搜索所有以h开头的词条
RestAPI实现自动补全
请求组装+响应解析
@Test
void testSuggest() throws IOException {
//1.准备request
SearchRequest request = new SearchRequest("hotel");
//2.准备DSl
request.source().suggest(new SuggestBuilder().addSuggestion(
"suggestion",
SuggestBuilders.completionSuggestion("suggestion")
.prefix("h")
.skipDuplicates(true)
.size(10)
));
//3.发起请求
SearchResponse response = client.search(request, RequestOptions.DEFAULT);
//4.解析结果
Suggest suggest= response.getSuggest();
//4.1根据补全查询名称,获取补全结果
CompletionSuggestion suggestions = suggest.getSuggestion("suggestion");
//4.2获取options
List<CompletionSuggestion.Entry.Option> options = suggestions.getOptions();
//4.3遍历
for (CompletionSuggestion.Entry.Option option : options) {
String text = option.getText().toString();
System.out.println(text);
}
}
实现搜索框自动补全
Controller中
@GetMapping("suggestion")
public List<String>getSuggestion(@RequestParam("key")String prefix){
return hotelService.getSuggestions(prefix);
}
Service中
@Override
public List<String> getSuggestions(String prefix) {
try {
//1.准备request
SearchRequest request = new SearchRequest("hotel");
//2.准备DSl
request.source().suggest(new SuggestBuilder().addSuggestion(
"suggestion",
SuggestBuilders.completionSuggestion("suggestion")
.prefix(prefix)
.skipDuplicates(true)
.size(10)
));
//3.发起请求
SearchResponse response = client.search(request, RequestOptions.DEFAULT);
//4.解析结果
Suggest suggest= response.getSuggest();
//4.1根据补全查询名称,获取补全结果
CompletionSuggestion suggestions = suggest.getSuggestion("suggestion");
//4.2获取options
List<CompletionSuggestion.Entry.Option> options = suggestions.getOptions();
//4.3遍历
List<String>list=new ArrayList<>(options.size());
for (CompletionSuggestion.Entry.Option option : options) {
String text = option.getText().toString();
list.add(text);
}
return list;
} catch (IOException e) {
throw new RuntimeException(e);
}
}
效果演示
成功根据提示进行查询