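Introduction
A search template decouples the query definition from the application that runs it. An index alias encapsulates the real index name and likewise decouples clients from it. The Suggest API splits the input text into terms, looks up similar terms in the specified index field, and returns them.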
SearchTemplate
The example below stores a template that runs a match_phrase query against the title field, with q as its parameter:
POST /_scripts/movies
{
"script": {
"lang": "mustache",
"source": {
"_source": [
"title"
],
"size": 20,
"query": {
"bool": {
"must": {
"match_phrase": {
"title": "{{q}}"
}
}
}
}
}
}
}
To use it, pass the parameters to the stored template, referenced by its id:
POST movies/_search/template
{
"id": "movies",
"params": {
"q": "Safe Passage"
}
}
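To make the substitution step concrete, here is a minimal Python sketch of what filling in the {{q}} placeholder amounts to. It is a simplified stand-in for the mustache engine, not how Elasticsearch implements it; render_template and TEMPLATE_SOURCE are hypothetical names introduced for illustration, and only plain {{name}} placeholders are handled.

```python
import json

# Template body copied from the stored script above, kept as a JSON string
# so that placeholders can be substituted textually.
TEMPLATE_SOURCE = json.dumps({
    "_source": ["title"],
    "size": 20,
    "query": {"bool": {"must": {"match_phrase": {"title": "{{q}}"}}}},
})


def render_template(source: str, params: dict) -> dict:
    """Replace each {{name}} placeholder with its JSON-escaped value.

    Hypothetical helper: it only covers simple placeholders, not
    mustache sections or functions.
    """
    rendered = source
    for name, value in params.items():
        # json.dumps adds surrounding quotes; strip them because the
        # placeholder already sits inside a JSON string literal.
        escaped = json.dumps(str(value))[1:-1]
        rendered = rendered.replace("{{" + name + "}}", escaped)
    return json.loads(rendered)


query = render_template(TEMPLATE_SOURCE, {"q": "Safe Passage"})
print(json.dumps(query, indent=2))
```

The caller only supplies params; the shape of the query stays in one place, which is the decoupling the template is meant to provide.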
IndexAlias
The example below gives an index the alias movies-today and attaches a filter that keeps only documents whose rating field is ≥ 10:
POST _aliases
{
"actions": [
{
"add": {
"index": "movies-2020-07-13",
"alias": "movies-today",
"filter": {
"range": {
"rating": {
"gte": 10
}
}
}
}
}
]
}
The movies-2020-07-13 index must already exist when the alias is added, so insert the data beforehand:
POST movies-2020-07-13/_doc/1
{
"name": "n1",
"rating":11
}
POST movies-2020-07-13/_doc/2
{
"name": "n2",
"rating":9
}
Then simply query the alias. Because of the filter, only the document with rating 11 is returned:
POST movies-today/_search
{
"query": {"match_all": {}}
}
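The effect of a filtered alias can be sketched in a few lines of Python. This is only a simulation over the two documents from the example, assuming the alias behaves like a match_all query combined with the range filter. matches_range and search_alias are hypothetical helpers, not part of any Elasticsearch client.

```python
# The two documents indexed into movies-2020-07-13 above.
DOCS = {
    "1": {"name": "n1", "rating": 11},
    "2": {"name": "n2", "rating": 9},
}

# The filter attached to the movies-today alias.
ALIAS_FILTER = {"range": {"rating": {"gte": 10}}}


def matches_range(doc: dict, range_filter: dict) -> bool:
    """Check a document against a range filter (gte/gt/lte/lt only)."""
    checks = {
        "gte": lambda value, bound: value >= bound,
        "gt": lambda value, bound: value > bound,
        "lte": lambda value, bound: value <= bound,
        "lt": lambda value, bound: value < bound,
    }
    for field, bounds in range_filter["range"].items():
        if field not in doc:
            return False
        for op, bound in bounds.items():
            if not checks[op](doc[field], bound):
                return False
    return True


def search_alias(docs: dict, alias_filter: dict) -> list:
    """Simulate match_all through the alias: ids of docs that pass the filter."""
    return [doc_id for doc_id, doc in docs.items()
            if matches_range(doc, alias_filter)]


visible_ids = search_alias(DOCS, ALIAS_FILTER)
print(visible_ids)
```

Clients that query movies-today never see document 2, and they do not need to know the dated index name behind the alias.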
SuggestAPI
ES7 has four suggesters in total: Term/Phrase Suggester and Completion/Context Suggester.
Term Suggester
First insert some test data:
POST _bulk
{"index": {"_index": "article", "_id": 1}}
{"body": "lucene is very cool"}
{"index": {"_index": "article", "_id": 2}}
{"body": "ElasticSearch is built on top of lucene"}
{"index": {"_index": "article", "_id": 3}}
{"body": "ElasticSearch rocks"}
{"index": {"_index": "article", "_id": 4}}
{"body": "Elastic is the corporation of ELK stack"}
{"index": {"_index": "article", "_id": 5}}
{"body": "ELK stack rocks"}
{"index": {"_index": "article", "_id": 6}}
{"body": "Elastic is rock solid"}
Then write the request body and add a suggester. Here it asks for suggestions in missing mode for the text luece rock:
POST article/_search
{
"size": 20,
"query": {"match": {
"body": "luece rock"
}},
"suggest": {
"term-suggestion": {
"text": "luece rock",
"term": {
"suggest_mode": "missing",
"field": "body"
}
}
}
}
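There are three suggest modes: missing (suggest only for input terms that are not in the index), popular (suggest only terms that occur in more documents than the input term), and always (suggest regardless of whether the input term is in the index). The term rock is already in the index, so in missing mode it gets no suggestions, and the suggest part of the response for the example above looks like this: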
"suggest" : {
"term-suggestion" : [
{
"text" : "luece",
"offset" : 0,
"length" : 5,
"options" : [
{
"text" : "lucene",
"score" : 0.6,
"freq" : 4
}
]
},
{
"text" : "rock",
"offset" : 6,
"length" : 4,
"options" : [ ]
}
]
}
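The three modes can be summarized as a small decision function. The sketch below is an illustration under stated assumptions, not Elasticsearch's implementation: DOC_FREQ holds document frequencies counted by hand from the six test documents (lowercased), and should_suggest is a hypothetical helper that only decides whether a given candidate may be offered for a given input term.

```python
# Document frequencies of a few "body" terms, counted by hand from the
# six test documents above (assumption: terms are lowercased).
DOC_FREQ = {"lucene": 2, "rock": 1, "rocks": 2, "built": 1}


def should_suggest(term: str, candidate: str, mode: str, doc_freq: dict) -> bool:
    """Decide whether `candidate` may be suggested for `term` in `mode`."""
    term_freq = doc_freq.get(term, 0)
    cand_freq = doc_freq.get(candidate, 0)
    if cand_freq == 0 or candidate == term:
        # A suggestion has to be a different term that exists in the index.
        return False
    if mode == "missing":
        # Only input terms that are absent from the index get suggestions.
        return term_freq == 0
    if mode == "popular":
        # Only candidates found in more documents than the input term.
        return cand_freq > term_freq
    if mode == "always":
        return True
    raise ValueError(f"unknown suggest_mode: {mode}")


for mode in ("missing", "popular", "always"):
    print(mode,
          should_suggest("luece", "lucene", mode, DOC_FREQ),
          should_suggest("rock", "rocks", mode, DOC_FREQ))
```

Under these assumptions luece qualifies for a suggestion in every mode, while rock qualifies only in popular and always mode, which is consistent with the empty options list for rock in the response above.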
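If rock above is changed to hock, however, no suggestion is returned for it either: prefix_length defaults to 1, so a candidate must share its first character with the input term, and hock and rock differ in exactly that character. Adding the prefix_length field and setting it to 0 solves this: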
POST article/_search
{
"size": 20,
"query": {"match": {
"body": "luece builf hock"
}},
"suggest": {
"term-suggestion": {
"text": "luece builf hock",
"term": {
"suggest_mode": "missing",
"field": "body",
"prefix_length": 0
}
}
}
}
The suggest field of the response is:
"suggest" : {
"term-suggestion" : [
{
"text" : "luece",
"offset" : 0,
"length" : 5,
"options" : [
{
"text" : "lucene",
"score" : 0.6,
"freq" : 2
}
]
},
{
"text" : "builf",
"offset" : 6,
"length" : 5,
"options" : [
{
"text" : "built",
"score" : 0.8,
"freq" : 1
}
]
},
{
"text" : "hock",
"offset" : 12,
"length" : 4,
"options" : [
{
"text" : "rock",
"score" : 0.75,
"freq" : 1
}
]
}
]
}
Phrase Suggester
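The phrase suggester adds extra logic on top of the term suggester. For example, max_errors limits how many terms of the input may be treated as misspellings and corrected, and confidence sets a threshold for the returned suggestions (the higher the threshold, the fewer results are returned). Highlighting can also be enabled, with custom highlight tags: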
POST article/_search
{
"suggest": {
"my_suggestion": {
"text": "lucne and elasticsear rodk very well",
"phrase": {
"field": "body",
"max_errors": 3,
"confidence": 1,
"direct_generator": [
{"field": "body", "suggest_mode": "missing"}
],
"highlight": {
"pre_tag": "<em>",
"post_tag": "</em>"
}
}
}
}
}
The suggest part of the response is:
"suggest" : {
"my_suggestion" : [
{
"text" : "lucne and elasticsear rodk very well",
"offset" : 0,
"length" : 36,
"options" : [
{
"text" : "lucene and elasticsearch rock very well",
"highlighted" : "<em>lucene</em> and <em>elasticsearch rock</em> very well",
"score" : 1.6991E-4
},
{
"text" : "lucene and elasticsearch rocks very well",
"highlighted" : "<em>lucene</em> and <em>elasticsearch rocks</em> very well",
"score" : 1.6991E-4
},
{
"text" : "lucene and elasticsearch rodk very well",
"highlighted" : "<em>lucene</em> and <em>elasticsearch</em> rodk very well",
"score" : 1.393378E-4
}
]
}
]
}
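Two of the options used above, highlight and confidence, can be illustrated with a short Python sketch. It is not the phrase suggester's algorithm: highlight is a hypothetical helper that assumes both phrases have the same number of whitespace-separated tokens, and filter_by_confidence assumes a candidate is kept when its score exceeds confidence times the score of the input phrase. The input score of 1.0e-4 is made up for illustration, since the response does not report it.

```python
def highlight(original: str, corrected: str,
              pre_tag: str = "<em>", post_tag: str = "</em>") -> str:
    """Wrap each run of adjacent changed tokens in the given tags."""
    old_tokens = original.split()
    new_tokens = corrected.split()
    if len(old_tokens) != len(new_tokens):
        raise ValueError("sketch assumes equal token counts")
    parts, run = [], []
    for old, new in zip(old_tokens, new_tokens):
        if old != new:
            run.append(new)           # extend the current run of corrections
            continue
        if run:                       # an unchanged token closes the run
            parts.append(pre_tag + " ".join(run) + post_tag)
            run = []
        parts.append(new)
    if run:
        parts.append(pre_tag + " ".join(run) + post_tag)
    return " ".join(parts)


def filter_by_confidence(options: list, input_score: float,
                         confidence: float = 1.0) -> list:
    """Keep options scoring above confidence * input_score (assumed rule)."""
    threshold = confidence * input_score
    return [option for option in options if option["score"] > threshold]


TEXT = "lucne and elasticsear rodk very well"
OPTIONS = [  # texts and scores copied from the response above
    {"text": "lucene and elasticsearch rock very well", "score": 1.6991e-4},
    {"text": "lucene and elasticsearch rocks very well", "score": 1.6991e-4},
    {"text": "lucene and elasticsearch rodk very well", "score": 1.393378e-4},
]
INPUT_SCORE = 1.0e-4  # hypothetical value, not taken from the response

for option in filter_by_confidence(OPTIONS, INPUT_SCORE, confidence=1.5):
    print(highlight(TEXT, option["text"]))
```

Under these assumptions, raising confidence from 1 to 1.5 drops the third option, which leaves rodk uncorrected, and the remaining two are highlighted in the same way as in the response.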
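CompletionSuggester
The completion suggester provides autocomplete functionality.
Before using it, define a mapping on the index that marks the field to complete on as type completion. Note that PUT article fails if the article index from the earlier examples still exists, so delete that index first or use a different index name: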
PUT article
{
"mappings": {
"properties": {
"body": {
"type": "completion"
}
}
}
}
Then insert the data and run a completion query. Only the prefix and the field to complete on need to be specified:
POST article/_search
{
"suggest": {
"YOUR_SUGGESTION": {
"prefix": "e",
"completion": {
"field": "body"
}
}
}
}
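As a rough mental model, a completion lookup is a prefix search over the whole input string. The Python sketch below emulates that with a sorted list and binary search. It assumes case-insensitive matching and ignores weights and ranking, and build_index and complete are hypothetical helpers. The inputs are the six body values from the earlier bulk request.

```python
from bisect import bisect_left

BODIES = [
    "lucene is very cool",
    "ElasticSearch is built on top of lucene",
    "ElasticSearch rocks",
    "Elastic is the corporation of ELK stack",
    "ELK stack rocks",
    "Elastic is rock solid",
]


def build_index(inputs: list) -> list:
    """Sorted (lowercased, original) pairs so prefixes can be binary-searched."""
    return sorted((text.lower(), text) for text in inputs)


def complete(index: list, prefix: str, size: int = 5) -> list:
    """Return up to `size` inputs whose lowercased form starts with `prefix`."""
    key = prefix.lower()
    pos = bisect_left(index, (key,))  # first entry that is >= the prefix
    results = []
    while (pos < len(index) and len(results) < size
           and index[pos][0].startswith(key)):
        results.append(index[pos][1])
        pos += 1
    return results


index = build_index(BODIES)
print(complete(index, "e"))
print(complete(index, "luc"))
print(complete(index, "rock"))
```

The last call finds nothing even though rock occurs inside several inputs: matching starts at the beginning of the input, not at every word. This is part of why completion is fast and precise but has low recall.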
ContextSuggester
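This is an extension of the completion suggester that lets the search take additional context into account. ES supports two kinds of context: category (an arbitrary string) and geo (a geographic location).
Implementing a context suggester takes three steps: customize the mapping, index the data together with its context, and run a suggest query that supplies the context.
An example follows. First set the mapping on the index so that one field has type completion and declares its contexts: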
PUT comments
PUT comments/_mapping
{
"properties": {
"comment_autocomplete": {
"type": "completion",
"contexts": [
{
"type": "category",
"name": "comment_category"
}
]
}
}
}
Then insert the data, giving each document's completion field a sample input and the corresponding context:
POST comments/_doc
{
"comment": "I love the star war movie",
"comment_autocomplete": {
"input": ["star wars"],
"contexts": {
"comment_category": "movies"
}
}
}
POST comments/_doc
{
"comment": "Where can I find a Starbucks",
"comment_autocomplete": {
"input": ["starbucks"],
"contexts": {
"comment_category": "coffee"
}
}
}
Finally run the query, specifying the prefix to complete, the completion field to use, and the context:
POST comments/_search
{
"suggest": {
"YOUR_SUGGESTION": {
"prefix": "sta",
"completion": {
"field": "comment_autocomplete",
"contexts": {
"comment_category": "movies"
}
}
}
}
}
The suggest field of the response is shown below. ES returned the entry that matches both the input prefix and the context:
"suggest" : {
"YOUR_SUGGESTION" : [
{
"text" : "sta",
"offset" : 0,
"length" : 3,
"options" : [
{
"text" : "star wars",
"_index" : "comments",
"_type" : "_doc",
"_id" : "JHZvRnMBVFEAERRHgcsw",
"_score" : 1.0,
"_source" : {
"comment" : "I love the star war movie",
"comment_autocomplete" : {
"input" : [
"star wars"
],
"contexts" : {
"comment_category" : "movies"
}
}
},
"contexts" : {
"comment_category" : [
"movies"
]
}
}
]
}
]
}
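The extra step that the context adds can be sketched as a second filter on top of the prefix match. The Python below is a simulation over the two example documents, assuming case-insensitive prefix matching. ENTRIES and suggest_with_context are hypothetical names, not part of Elasticsearch.

```python
# Completion inputs and category contexts of the two indexed comments.
ENTRIES = [
    {"input": "star wars", "category": "movies"},
    {"input": "starbucks", "category": "coffee"},
]


def suggest_with_context(entries: list, prefix: str, category: str) -> list:
    """Inputs that start with `prefix` and carry the requested category."""
    key = prefix.lower()
    return [
        entry["input"] for entry in entries
        if entry["category"] == category
        and entry["input"].lower().startswith(key)
    ]


print(suggest_with_context(ENTRIES, "sta", "movies"))
print(suggest_with_context(ENTRIES, "sta", "coffee"))
```

Both inputs share the prefix sta, so without the category the two would be indistinguishable. The context is what narrows the result to star wars in the response above.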
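Comparison of the completion suggester with the phrase and term suggesters in precision, recall, and performance:
Precision: Completion > Phrase > Term
Recall: Term > Phrase > Completion
Performance: Completion > Phrase > Term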