Elasticsearch 5.x: An Introduction to Query Suggestions and Suggesters
Reference: http://www.cnblogs.com/leeSmall/p/9206646.html
Reference (key source): https://elasticsearch.cn/article/142
Reference (official docs): https://www.elastic.co/guide/en/elasticsearch/reference/current/search-suggesters-completion.html
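I. Introduction to Query Suggestions
1. What are query suggestions?
Query suggestions give users a better search experience. They mainly cover two features: spell checking, and automatic suggestion of query terms (autocomplete).
[Figure: spell checking]
[Figure: automatic suggestion of query terms (autocomplete)]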
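2. The query suggestion API in ES
Query suggestions also go through the _search endpoint. In the DSL, the suggest node defines the suggestions to request.
Example 1: defining a single suggestion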
POST twitter/_search
{
"query" : {
"match": {
"message": "tring out Elasticsearch"
}
},
"suggest" : { <!-- defines the suggestion request -->
"my-suggestion" : { <!-- name of one suggestion -->
"text" : "tring out Elasticsearch", <!-- text to get suggestions for -->
"term" : { <!-- use the term suggester -->
"field" : "message" <!-- field to draw suggested terms from -->
}
}
}
}
Example 2: defining multiple suggestions
POST _search
{
  "suggest": {
    "my-suggest-1" : {
      "text" : "tring out Elasticsearch",
      "term" : {
        "field" : "message"
      }
    },
    "my-suggest-2" : {
      "text" : "kmichy",
      "term" : {
        "field" : "user"
      }
    }
  }
}
Example 3: multiple suggestions can share one global text
POST _search
{
  "suggest": {
    "text" : "tring out Elasticsearch",
    "my-suggest-1" : {
      "term" : {
        "field" : "message"
      }
    },
    "my-suggest-2" : {
      "term" : {
        "field" : "user"
      }
    }
  }
}
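II. Introduction to Suggesters
1. Term suggester
The term suggester analyzes the input text into tokens and runs a fuzzy lookup for each token to produce term suggestions. By default, a token that already exists in the index gets no suggestions; for a token that does not exist, the fuzzy matches are sorted and a limited number of them are returned as suggestions.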
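The fuzzy lookup described above can be pictured as ranking dictionary terms by their edit distance to the misspelled token. The following is a minimal sketch of that idea only, not Elasticsearch's implementation: it uses plain Levenshtein distance as a stand-in, and the class name EditDistanceSketch, the in-memory dictionary, and the maxEdits parameter are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class EditDistanceSketch {

    // Classic dynamic-programming Levenshtein distance between two strings.
    public static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Returns dictionary terms within maxEdits of the token, closest first.
    // A token that is already in the dictionary gets no suggestions,
    // mirroring the default behaviour described in the text.
    public static List<String> suggest(String token, List<String> dictionary, int maxEdits) {
        List<String> out = new ArrayList<>();
        if (dictionary.contains(token)) {
            return out;
        }
        for (String term : dictionary) {
            if (distance(token, term) <= maxEdits) {
                out.add(term);
            }
        }
        out.sort((x, y) -> {
            int byDistance = Integer.compare(distance(token, x), distance(token, y));
            return byDistance != 0 ? byDistance : x.compareTo(y);
        });
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical dictionary standing in for the terms of an indexed field.
        List<String> dictionary = Arrays.asList("trying", "out", "elasticsearch", "string");
        System.out.println(suggest("tring", dictionary, 2)); // misspelled token
        System.out.println(suggest("out", dictionary, 2));   // token that exists
    }
}
```

In this toy dictionary both "string" and "trying" are one edit away from "tring", which shows why a term-level suggester alone cannot always pick the intended word.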
Common suggestion options include text, field, analyzer, size, sort, and suggest_mode.
Example 1:
POST twitter/_search
{
"query" : {
"match": {
"message": "tring out Elasticsearch"
}
},
"suggest" : { <!-- defines the suggestion request -->
"my-suggestion" : { <!-- name of one suggestion -->
"text" : "tring out Elasticsearch", <!-- text to get suggestions for -->
"term" : { <!-- use the term suggester -->
"field" : "message" <!-- field to draw suggested terms from -->
}
}
}
}
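2. Phrase suggester
The phrase suggester builds on the term suggester and additionally weighs the relationship between the terms, for example whether they occur together in the indexed source text, how close they are to each other, and how frequent they are.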
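To make the "relationship between terms" concrete, here is a minimal sketch that scores candidate phrases by how often their adjacent word pairs occur together in a small corpus. It is an illustration under simplifying assumptions, not the phrase suggester's actual language model: the class BigramPhraseSketch, the raw bigram-count score, and the sample corpus are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class BigramPhraseSketch {
    // How many times each adjacent word pair was seen in the corpus.
    private final Map<String, Integer> bigramCounts = new HashMap<>();

    public BigramPhraseSketch(List<String> corpus) {
        for (String sentence : corpus) {
            String[] words = tokenize(sentence);
            for (int i = 0; i + 1 < words.length; i++) {
                bigramCounts.merge(words[i] + " " + words[i + 1], 1, Integer::sum);
            }
        }
    }

    // Lowercase and split on whitespace; a stand-in for a real analyzer.
    private static String[] tokenize(String text) {
        return text.toLowerCase(Locale.ROOT).trim().split("\\s+");
    }

    // Sum of corpus counts over the phrase's adjacent word pairs.
    public int score(String phrase) {
        String[] words = tokenize(phrase);
        int total = 0;
        for (int i = 0; i + 1 < words.length; i++) {
            total += bigramCounts.getOrDefault(words[i] + " " + words[i + 1], 0);
        }
        return total;
    }

    // The candidate with the highest score; the first one wins ties.
    public String best(List<String> candidates) {
        String bestPhrase = null;
        int bestScore = -1;
        for (String candidate : candidates) {
            int s = score(candidate);
            if (s > bestScore) {
                bestScore = s;
                bestPhrase = candidate;
            }
        }
        return bestPhrase;
    }

    public static void main(String[] args) {
        BigramPhraseSketch model = new BigramPhraseSketch(Arrays.asList(
                "trying out Elasticsearch",
                "trying out Elasticsearch",
                "string theory"));
        // Two corrections of "tring out elasticsearch" that are equally close per term.
        System.out.println(model.best(Arrays.asList(
                "string out elasticsearch",
                "trying out elasticsearch")));
    }
}
```

Because "trying out" appears in the corpus and "string out" does not, the second candidate scores higher even though both corrections are a single edit away from "tring".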
Example:
POST twitter/_search
{
  "query" : {
    "match": {
      "message": "tring out Elasticsearch"
    }
  },
  "suggest" : {
    "my-suggestion" : {
      "text" : "tring out Elasticsearch",
      "phrase" : {
        "field" : "message"
      }
    }
  }
}
Result:
{
"took": 30,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 2,
"max_score": 1.113083,
"hits": [
{
"_index": "twitter",
"_type": "tweet",
"_id": "4",
"_score": 1.113083,
"_source": {
"user": "kimchy",
"postDate": "2018-07-23T07:29:57.653Z",
"message": "trying out Elasticsearch"
}
},
{
"_index": "twitter",
"_type": "tweet",
"_id": "7",
"_score": 0.98382175,
"_source": {
"user": "yuchen20",
"postDate": "2018-07-23T08:12:05.604Z",
"message": "trying out Elasticsearch"
}
}
]
},
"suggest": { <!-- suggestions -->
"my-suggestion": [
{
"text": "tring out Elasticsearch",
"offset": 0,
"length": 23,
"options": [
{
"text": "trying out elasticsearch",
"score": 0.5118434
}
]
}
]
}
}
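3. Completion suggester (autocomplete)
This suggester is designed for the autocomplete scenario, where a query has to be sent to the backend for every character the user types. When users type quickly, this puts a demanding requirement on backend response time. For that reason it uses a different data structure from the two suggesters above: instead of relying on the inverted index, the analyzed data is encoded as an FST (finite state transducer) and stored alongside the index. For an index in the open state, ES loads the whole FST into memory, which makes prefix lookups extremely fast. However, an FST can only serve prefix lookups, and that is the main limitation of the completion suggester.
Official documentation: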
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-suggesters-completion.html
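The prefix-only behaviour can be illustrated with a much simpler in-memory structure. The sketch below uses a sorted set as a stand-in for the FST, so it shows the lookup semantics rather than the real encoding; the class PrefixLookupSketch and its methods are hypothetical, and the lowercasing is an assumption that mimics a simple analyzer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.TreeSet;

public class PrefixLookupSketch {
    // Sorted set of lowercased inputs; entries sharing a prefix sit next to each other.
    private final TreeSet<String> inputs = new TreeSet<>();

    public void add(String input) {
        inputs.add(input.toLowerCase(Locale.ROOT));
    }

    // Returns every stored input that starts with the prefix, in sorted order.
    public List<String> complete(String prefix) {
        String p = prefix.toLowerCase(Locale.ROOT);
        List<String> matches = new ArrayList<>();
        // Start at the first entry >= prefix and stop at the first non-match.
        for (String candidate : inputs.tailSet(p, true)) {
            if (!candidate.startsWith(p)) {
                break;
            }
            matches.add(candidate);
        }
        return matches;
    }

    public static void main(String[] args) {
        PrefixLookupSketch index = new PrefixLookupSketch();
        index.add("lucene solr");
        index.add("lucene so cool");
        index.add("lucene elasticsearch");
        System.out.println(index.complete("lucene s")); // inputs starting with the prefix
        System.out.println(index.complete("solr"));     // "solr" occurs mid-input only
    }
}
```

The second call returns nothing because "solr" never appears at the start of a stored input, which is the limitation the paragraph above describes.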
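Example 1:
To use autocomplete, the field that provides completion suggestions needs a dedicated mapping with the field type completion. Set the mapping first:
PUT index
{
  "mappings": {
    "completion": {
      "properties": {
        "title": {
          "type": "text",
          "analyzer": "ik_smart"
        },
        "title_suggest": {
          "type": "completion",
          "analyzer": "ik_smart",
          "search_analyzer": "ik_smart"
        }
      }
    }
  }
}
The key part is title_suggest: this is the field that completion searches will run against later. Its type must be completion, and the analyzer should be chosen to suit your data.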
Index the data:
POST /index/completion/_bulk
{ "index" : { } }
{ "title": "背景天安门广场大学", "title_suggest": "背景天安门广场大学" }
{ "index" : { } }
{ "title": "北京天安门", "title_suggest": "北京天安门" }
{ "index" : { } }
{ "title": "北京鸟巢", "title_suggest": "北京鸟巢" }
{ "index" : { } }
{ "title": "奥林匹克公园", "title_suggest": "奥林匹克公园" }
{ "index" : { } }
{ "title": "奥林匹克森林公园", "title_suggest": "奥林匹克森林公园" }
{ "index" : { } }
{ "title": "北京奥林匹克公园", "title_suggest": "北京奥林匹克公园" }
{ "index" : { } }
{ "title": "北京奥林匹克公园", "title_suggest": { "input": "我爱中国", "weight": 100 } }
When indexing, you can give the suggest field a weight to raise its position in the ranking.
Completion search:
POST /index/completion/_search
{
  "size": 0,
  "suggest": {
    "blog-suggest": {
      "prefix": "北京",
      "completion": {
        "field": "title_suggest"
      }
    }
  }
}
Result:
{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 0,
    "max_score": 0,
    "hits": []
  },
  "suggest": {
    "blog-suggest": [
      {
        "text": "北京",
        "offset": 0,
        "length": 2,
        "options": [
          {
            "text": "北京天安门",
            "_index": "index",
            "_type": "completion",
            "_id": "AWSRo_hn9K_aupETR6FR",
            "_score": 1,
            "_source": {
              "title": "北京天安门",
              "title_suggest": "北京天安门"
            }
          },
          {
            "text": "北京奥林匹克公园",
            "_index": "index",
            "_type": "completion",
            "_id": "AWSRo_hn9K_aupETR6FV",
            "_score": 1,
            "_source": {
              "title": "北京奥林匹克公园",
              "title_suggest": "北京奥林匹克公园"
            }
          },
          {
            "text": "北京鸟巢",
            "_index": "index",
            "_type": "completion",
            "_id": "AWSRo_hn9K_aupETR6FS",
            "_score": 1,
            "_source": {
              "title": "北京鸟巢",
              "title_suggest": "北京鸟巢"
            }
          }
        ]
      }
    ]
  }
}
Example 2:
Create the mapping:
PUT music
{
"mappings": {
"docc" : {
"properties" : {
"suggest" : {
"type" : "completion"
},
"title" : {
"type": "keyword"
}
}
}
}
}
input specifies the input terms; weight specifies the ranking value (optional):
PUT music/docc/1?refresh
{
"suggest" : {
"input": [ "Nevermind", "Nirvana" ],
"weight" : 34
}
}
Specifying a different weight for each input:
PUT music/docc/1?refresh
{
"suggest" : [
{
"input": "Nevermind",
"weight" : 10
},
{
"input": "Nirvana",
"weight" : 3
}
]
}
Index a second document with the same inputs:
PUT music/docc/2?refresh
{
  "suggest" : {
    "input": [ "Nevermind", "Nirvana" ],
    "weight" : 20
  }
}
Query suggestions by prefix:
POST music/_search?pretty
{
"suggest": {
"song-suggest" : {
"prefix" : "nir",
"completion" : {
"field" : "suggest"
}
}
}
}
To deduplicate the suggestions, set "skip_duplicates": true. This option is supported in 6.x but not in 5.x.
POST music/_search?pretty
{
  "suggest": {
    "song-suggest" : {
      "prefix" : "nir",
      "completion" : {
        "field" : "suggest",
        "skip_duplicates": true
      }
    }
  }
}
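On 5.x, where skip_duplicates is unavailable, the same effect can be approximated on the client after the response arrives. The following is a minimal sketch of that post-processing, assuming the options have already been read into simple text and weight pairs and that the highest-weighted copy of each text is the one to keep; the SkipDuplicatesSketch and Option classes are hypothetical and not part of the Elasticsearch API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SkipDuplicatesSketch {

    // Hypothetical holder for one suggestion option: its text and its weight.
    public static final class Option {
        public final String text;
        public final int weight;

        public Option(String text, int weight) {
            this.text = text;
            this.weight = weight;
        }
    }

    // Orders options by weight, highest first, and optionally keeps only
    // the first (highest-weighted) option for each distinct text.
    public static List<Option> rank(List<Option> options, boolean skipDuplicates) {
        List<Option> sorted = new ArrayList<>(options);
        sorted.sort((a, b) -> Integer.compare(b.weight, a.weight));
        if (!skipDuplicates) {
            return sorted;
        }
        Set<String> seen = new HashSet<>();
        List<Option> unique = new ArrayList<>();
        for (Option option : sorted) {
            if (seen.add(option.text)) {
                unique.add(option);
            }
        }
        return unique;
    }

    public static void main(String[] args) {
        // "Nirvana" suggested by two documents with different weights.
        List<Option> options = Arrays.asList(new Option("Nirvana", 3), new Option("Nirvana", 20));
        for (Option option : rank(options, true)) {
            System.out.println(option.text + " (weight " + option.weight + ")");
        }
    }
}
```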
Suggestion documents can also store phrases:
PUT music/docc/3?refresh
{
  "suggest" : {
    "input": [ "lucene solr", "lucene so cool", "lucene elasticsearch" ],
    "weight" : 20
  }
}
PUT music/docc/4?refresh
{
  "suggest" : {
    "input": [ "lucene solr cool", "lucene elasticsearch" ],
    "weight" : 10
  }
}
Query:
POST music/_search?pretty
{
"suggest": {
"song-suggest" : {
"prefix" : "lucene s",
"completion" : {
"field" : "suggest"
}
}
}
}
III. Java API
## Elasticsearch 5.x: Query Suggestions with the Java API
Reference: http://www.mamicode.com/info-detail-2347270.html
package com.youlan.es.util;
import java.util.concurrent.ExecutionException;
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.rest.RestStatus;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.elasticsearch.search.suggest.*;
import org.elasticsearch.search.suggest.completion.CompletionSuggestion;
import org.elasticsearch.search.suggest.phrase.PhraseSuggestion;
import org.elasticsearch.search.suggest.term.TermSuggestion;
public class SuggestDemo {
private static Logger logger = LogManager.getRootLogger();
// Spell checking (English)
public static void termSuggest(TransportClient client) {
// 1. Create the search request
SearchRequest searchRequest = new SearchRequest("twitter");
// 2. Build the request body with SearchSourceBuilder; its methods cover every kind of query.
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
sourceBuilder.size(0);
// Add the suggestion: a term suggestion on the "message" field
SuggestionBuilder<?> termSuggestionBuilder =
SuggestBuilders.termSuggestion("message").text("tring out Elticsearch"); // text typed into the search box
SuggestBuilder suggestBuilder = new SuggestBuilder();
suggestBuilder.addSuggestion("suggest_user", termSuggestionBuilder);
sourceBuilder.suggest(suggestBuilder);
searchRequest.source(sourceBuilder);
try {
// 3. Send the request
SearchResponse searchResponse = client.search(searchRequest).get();
// 4. Handle the response
// Check the status of the search result
if(RestStatus.OK.equals(searchResponse.status())) {
// Read the suggestions
Suggest suggest = searchResponse.getSuggest();
TermSuggestion termSuggestion = suggest.getSuggestion("suggest_user");
for (TermSuggestion.Entry entry : termSuggestion.getEntries()) {
logger.info("text: " + entry.getText().string());
for (TermSuggestion.Entry.Option option : entry) {
String suggestText = option.getText().string(); // the suggested text
logger.info(" suggest option : " + suggestText);
}
}
}
} catch (InterruptedException | ExecutionException e) {
logger.error(e);
}
/*
"suggest": {
"my-suggestion": [
{
"text": "tring",
"offset": 0,
"length": 5,
"options": [
{
"text": "trying",
"score": 0.8,
"freq": 2
}
]
},
{
"text": "out",
"offset": 6,
"length": 3,
"options": []
},
{
"text": "elasticsearch",
"offset": 10,
"length": 13,
"options": []
}
]
}*/
}
// Phrase suggestion
public static void phraseSuggest(TransportClient client){
// 1. Create the search request
SearchRequest searchRequest = new SearchRequest("twitter");
// 2. Build the request body
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
sourceBuilder.size(0);
SuggestionBuilder<?> phraseSuggestBuilder = SuggestBuilders.phraseSuggestion("message").text("tring out");
SuggestBuilder suggestBuilder = new SuggestBuilder();
suggestBuilder.addSuggestion("my-suggestion",phraseSuggestBuilder);
sourceBuilder.suggest(suggestBuilder);
searchRequest.source(sourceBuilder);
try {
// 3. Send the request
SearchResponse searchResponse = client.search(searchRequest).get();
// 4. Handle the response
// Check the status of the search result
if (RestStatus.OK.equals(searchResponse.status())){
// Read the suggestions
Suggest suggest = searchResponse.getSuggest();
PhraseSuggestion phraseSuggestion = suggest.getSuggestion("my-suggestion");
for (PhraseSuggestion.Entry entry:phraseSuggestion){
logger.info("text:"+entry.getText().string());
for (PhraseSuggestion.Entry.Option option:entry){
String suggestText = option.getText().string();
logger.info(" suggest option :"+suggestText);
}
}
}
} catch (InterruptedException | ExecutionException e) {
logger.error("Request failed", e);
}
}
// Autocomplete
public static void completionSuggester(TransportClient client) {
// 1. Create the search request
SearchRequest searchRequest = new SearchRequest("music");
// 2. Build the request body with SearchSourceBuilder; its methods cover every kind of query.
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
sourceBuilder.size(0);
// Add the suggestion: a completion (autocomplete) suggestion, equivalent to the request below
/*POST music/_search?pretty
{
"suggest": {
"song-suggest" : {
"prefix" : "lucene s",
"completion" : {
"field" : "suggest" ,
"skip_duplicates": true
}
}
}
}*/
SuggestionBuilder<?> completionSuggestionBuilder =
SuggestBuilders.completionSuggestion("suggest").prefix("lucene s");
// .skipDuplicates(true) deduplicates results, available in 6.x only
SuggestBuilder suggestBuilder = new SuggestBuilder();
suggestBuilder.addSuggestion("song-suggest", completionSuggestionBuilder);
sourceBuilder.suggest(suggestBuilder);
searchRequest.source(sourceBuilder);
try {
// 3. Send the request
SearchResponse searchResponse = client.search(searchRequest).get();
// 4. Handle the response
// Check the status of the search result
if(RestStatus.OK.equals(searchResponse.status())) {
// Read the suggestions
Suggest suggest = searchResponse.getSuggest();
CompletionSuggestion completionSuggestion = suggest.getSuggestion("song-suggest");
for (CompletionSuggestion.Entry entry : completionSuggestion.getEntries()) {
logger.info("text: " + entry.getText().string());
for (CompletionSuggestion.Entry.Option option : entry) {
String suggestText = option.getText().string();
logger.info(" suggest option : " + suggestText);
}
}
}
} catch (InterruptedException | ExecutionException e) {
logger.error(e);
}
// Result:
// {
// "took": 7,
// "timed_out": false,
// "_shards": {
// "total": 5,
// "successful": 5,
// "skipped": 0,
// "failed": 0
// },
// "hits": {
// "total": 0,
// "max_score": 0,
// "hits": []
// },
// "suggest": {
// "song-suggest": [
// {
// "text": "lucene s",
// "offset": 0,
// "length": 8,
// "options": [
// {
// "text": "lucene so cool",
// "_index": "music",
// "_type": "docc",
// "_id": "3",
// "_score": 20,
// "_source": {
// "suggest": {
// "input": [
// "lucene solr",
// "lucene so cool",
// "lucene elasticsearch"
// ],
// "weight": 20
// }
// }
// },
// {
// "text": "lucene solr cool",
// "_index": "music",
// "_type": "docc",
// "_id": "4",
// "_score": 10,
// "_source": {
// "suggest": {
// "input": [
// "lucene solr cool",
// "lucene elasticsearch"
// ],
// "weight": 10
// }
// }
// }
// ]
// }
// ]
// }
// }
}
public static void main(String[] args) {
// EsClient is the author's own connection helper; its source is not shown here.
EsClient esClient = new EsClient();
try (TransportClient client = esClient.getConnection()) {
logger.info("---------------- Spell checking: termSuggest ----------------------");
termSuggest(client);
logger.info("------------------ Phrase suggestion: phraseSuggest --------------------");
phraseSuggest(client);
logger.info("------------------ Autocomplete: completionSuggester --------------------");
completionSuggester(client);
} catch (Exception e) {
logger.error(e);
}
}
}