Elasticsearch documentation: Compound queries

songtaiwu

Last modified 2024-08-14 15:39:24

Tags: elasticsearch, big data, search engine

First published 2024-08-14 15:00:51

Article link: https://blog.csdn.net/songtaiwu/article/details/141189073

Boolean query | Elasticsearch Guide [8.15] | Elastic

Boolean query

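A bool query matches documents that match a boolean combination of other queries. It maps to the Lucene BooleanQuery and is built from one or more boolean clauses, each with a typed occurrence. The occurrence types are: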

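must	The clause (query) must appear in matching documents and contributes to the score.
filter	The clause (query) must appear in matching documents, but unlike must its score is ignored. Filter clauses run in filter context, so scoring is skipped and the clauses are considered for caching.
should	The clause (query) should appear in matching documents.
must_not	The clause (query) must not appear in matching documents. These clauses run in filter context, so scoring is skipped and the clauses are considered for caching. Because scoring is skipped, every document gets a score of 0.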

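The bool query takes a "more matches is better" approach: the scores of every must and should clause that a document matches are added together to produce the document's final _score.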
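The additive scoring described above can be illustrated with a toy model. This is only a sketch: the per-clause scores are made-up numbers, and the helper `bool_score` is hypothetical. In a real cluster each clause score comes from Lucene's similarity function.

```python
# Toy model of "more matches is better". The numbers are invented for
# illustration; real clause scores are computed by Lucene.
def bool_score(clause_scores):
    """Sum the scores of the must/should clauses that matched.

    clause_scores maps a clause label to its score, or to None when the
    clause did not match the document.
    """
    return sum(score for score in clause_scores.values() if score is not None)


# Both documents match the must clause; doc_b matches one more should clause.
doc_a = {"must user.id": 1.2, "should env1": 0.5, "should deployed": None}
doc_b = {"must user.id": 1.2, "should env1": 0.5, "should deployed": 0.7}

score_a = bool_score(doc_a)
score_b = bool_score(doc_b)
```

The document that matches the extra should clause ends up with the higher total, which is why should clauses are useful for ranking even when they are not required.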

POST _search
{
  "query": {
    "bool" : {
      "must" : {
        "term" : { "user.id" : "kimchy" }
      },
      "filter": {
        "term" : { "tags" : "production" }
      },
      "must_not" : {
        "range" : {
          "age" : { "gte" : 10, "lte" : 20 }
        }
      },
      "should" : [
        { "term" : { "tags" : "env1" } },
        { "term" : { "tags" : "deployed" } }
      ],
      "minimum_should_match" : 1,
      "boost" : 1.0
    }
  }
}
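The request above combines all four occurrence types. As a rough mental model of which documents it selects, here is an in-memory sketch. The helpers `term`, `in_range`, and `bool_matches` and the sample documents are hypothetical, fields are flattened to plain dictionary keys, and scoring is deliberately left out.

```python
def term(field, value):
    """Predicate: the field equals value, or contains it when it is a list."""
    def check(doc):
        actual = doc.get(field)
        if isinstance(actual, list):
            return value in actual
        return actual == value
    return check


def in_range(field, gte, lte):
    """Predicate: the field exists and lies within [gte, lte]."""
    def check(doc):
        actual = doc.get(field)
        return actual is not None and gte <= actual <= lte
    return check


def bool_matches(doc, must=(), filter_=(), must_not=(), should=(),
                 minimum_should_match=0):
    # must and filter clauses are all required (they differ only in scoring).
    required = all(clause(doc) for clause in (*must, *filter_))
    # A single matching must_not clause excludes the document.
    excluded = any(clause(doc) for clause in must_not)
    # Count how many optional clauses matched.
    optional_hits = sum(1 for clause in should if clause(doc))
    return required and not excluded and optional_hits >= minimum_should_match


query = dict(
    must=[term("user.id", "kimchy")],
    filter_=[term("tags", "production")],
    must_not=[in_range("age", 10, 20)],
    should=[term("tags", "env1"), term("tags", "deployed")],
    minimum_should_match=1,
)

docs = {
    "ok": {"user.id": "kimchy", "tags": ["production", "env1"], "age": 30},
    "excluded_by_age": {"user.id": "kimchy", "tags": ["production", "env1"], "age": 15},
    "no_should_match": {"user.id": "kimchy", "tags": ["production"], "age": 30},
    "wrong_user": {"user.id": "other", "tags": ["production", "deployed"], "age": 30},
}

results = {name: bool_matches(doc, **query) for name, doc in docs.items()}
```

Only the first sample document passes every condition: the others fail the must_not range, the minimum_should_match threshold, or the must term respectively.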

Scoring with bool.filter

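Queries listed under the filter element are not scored, so their score is returned as 0. The score of a hit comes only from a scoring query specified alongside the filter. The three queries below illustrate this.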

First query: every returned document gets a score of 0, because no scoring query is specified.

{
	"query": {
		"bool": {
			"filter": [
				{
					"match": {
						"scenery": "大栅栏"
					}
				}
			]
		}
	}
}

{
	"took": 2,
	"timed_out": false,
	"_shards": {
		"total": 1,
		"successful": 1,
		"skipped": 0,
		"failed": 0
	},
	"hits": {
		"total": {
			"value": 1,
			"relation": "eq"
		},
		"max_score": 0,
		"hits": [
			{
				"_index": "app_mark",
				"_type": "_doc",
				"_id": "20",
				"_score": 0,
				"_source": {
					"scenery": "大栅栏街道前门西大街前门西大街辅路-道路停车位",
					"description": "",
					"add_time": "2024-08-14T13:43:55+08:00"
				}
			}
		]
	}
}

Second query: the bool query adds a match_all clause under must, which assigns a score of 1.0 to every document.

{
	"query": {
		"bool": {
			"filter": [
				{
					"match": {
						"scenery": "大栅栏"
					}
				}
			],
			"must" : {
				"match_all": {}
			}
		}
	}
}

{
	"took": 4,
	"timed_out": false,
	"_shards": {
		"total": 1,
		"successful": 1,
		"skipped": 0,
		"failed": 0
	},
	"hits": {
		"total": {
			"value": 1,
			"relation": "eq"
		},
		"max_score": 1,
		"hits": [
			{
				"_index": "app_mark",
				"_type": "_doc",
				"_id": "20",
				"_score": 1,
				"_source": {
					"scenery": "大栅栏街道前门西大街前门西大街辅路-道路停车位",
					"description": "",
					"add_time": "2024-08-14T13:43:55+08:00"
				}
			}
		]
	}
}

Third query: a constant_score query, which behaves exactly like the second query above. The constant_score query assigns a score of 1.0 to every document matched by the filter.

{
	"query": {
		"constant_score": {
			"filter": {
				"match": {
					"scenery": "大栅栏"
				}
			}
		}
	}
}

{
	"took": 3,
	"timed_out": false,
	"_shards": {
		"total": 1,
		"successful": 1,
		"skipped": 0,
		"failed": 0
	},
	"hits": {
		"total": {
			"value": 1,
			"relation": "eq"
		},
		"max_score": 1,
		"hits": [
			{
				"_index": "app_mark",
				"_type": "_doc",
				"_id": "20",
				"_score": 1,
				"_source": {
					"scenery": "大栅栏街道前门西大街前门西大街辅路-道路停车位",
					"description": "",
					"add_time": "2024-08-14T13:43:55+08:00"
				}
			}
		]
	}
}
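The three responses above differ only in _score. A toy model of that behaviour, assuming hypothetical helpers `score_bool` and `score_constant` that stand in for what Elasticsearch computes internally:

```python
def score_bool(scoring_clause_scores, filter_matched):
    """Score of a bool query: filter decides matching, must/should decide score."""
    if not filter_matched:
        return None  # the document is not a hit at all
    # With no scoring clause the sum is empty, so the score is 0.
    return float(sum(scoring_clause_scores))


def score_constant(filter_matched, boost=1.0):
    """Score of a constant_score query: every filter match gets the boost."""
    return boost if filter_matched else None


first = score_bool([], filter_matched=True)      # filter only
second = score_bool([1.0], filter_matched=True)  # filter + must match_all
third = score_constant(filter_matched=True)      # constant_score, default boost
```

In this model the second and third queries agree because match_all contributes 1.0 and the default boost of constant_score is also 1.0. Passing a different boost to constant_score would change the score without changing which documents match.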

Named queries

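Every query accepts a _name in its top-level definition. You can use named queries to track which queries matched each returned document. When named queries are used, each hit in the response includes a matched_queries property.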

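For example, the query below looks for documents whose scenery field contains "解放" or whose description field contains "测试". A hit that matches the first clause lists first in matched_queries, a hit that matches the second clause lists second, and a hit that matches both lists both first and second.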

{
	"query": {
		"bool": {
			"should": [
				{
					"match": {
						"scenery": {
							"query": "解放",
							"_name": "first"
						}
					}
				},
				{
					"match": {
						"description": {
							"query": "测试",
							"_name": "second"
						}
					}
				}
			]
		}
	}
}

Example response:

{
	"took": 4,
	"timed_out": false,
	"_shards": {
		"total": 1,
		"successful": 1,
		"skipped": 0,
		"failed": 0
	},
	"hits": {
		"total": {
			"value": 2,
			"relation": "eq"
		},
		"max_score": 1.0880616,
		"hits": [
			{
				"_index": "app_mark",
				"_type": "_doc",
				"_id": "16",
				"_score": 1.0880616,
				"_source": {
					"scenery": "解放街道紫禁城(天元金都店)天元金都",
					"description": "测试描述1",
					"add_time": "2024-08-13T12:01:35+08:00"
				},
				"matched_queries": [
					"first",
					"second"
				]
			},
			{
				"_index": "app_mark",
				"_type": "_doc",
				"_id": "17",
				"_score": 0.7392724,
				"_source": {
					"scenery": "解放街道紫禁城(天元金都店)天元金都",
					"description": "",
					"add_time": "2024-08-13T13:25:22+08:00"
				},
				"matched_queries": [
					"first"
				]
			}
		]
	}
}
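On the client side, matched_queries can be used to group hits by the clause that produced them. The sketch below works on a response already parsed into a Python dictionary. The helper `ids_by_matched_query` is hypothetical, and the sample data is a trimmed copy of the response above.

```python
from collections import defaultdict


def ids_by_matched_query(response):
    """Map each query name to the _id values of the hits that matched it."""
    grouped = defaultdict(list)
    for hit in response["hits"]["hits"]:
        # Hits without named-query matches simply have no matched_queries key.
        for name in hit.get("matched_queries", []):
            grouped[name].append(hit["_id"])
    return dict(grouped)


# Trimmed copy of the example response: only the fields the helper reads.
response = {
    "hits": {
        "hits": [
            {"_id": "16", "matched_queries": ["first", "second"]},
            {"_id": "17", "matched_queries": ["first"]},
        ]
    }
}

grouped = ids_by_matched_query(response)
```

Here document 16 appears under both names and document 17 only under first, mirroring the matched_queries arrays in the response.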