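Elasticsearch Search Template Example
Elasticsearch search templates are powerful: they let you parameterize complex queries, which is very helpful when queries are built around user-facing application scenarios.
This article records an example from a real project; I hope it helps.
1. Requirements
Count the number of users per unit of time. A user who generates several events within the same time unit must be counted only once.
The event index has three fields, user_id, user_name, and create_time: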
PUT /event_index
{
"mappings": {
"_doc": {
"properties": {
"user_id": {
"type": "keyword"
},
"create_time": {
"type": "date",
"format": "yyyy-MM-dd HH:mm:ss"
},
"user_name": {
"type": "text"
}
}
}
}
}
Next, add a few documents to event_index:
POST /event_index/_doc
{
"user_id": "1",
"user_name": "format1",
"create_time": "2015-11-07 12:00:00"
}
POST /event_index/_doc
{
"user_id": "2",
"user_name": "format2",
"create_time": "2015-11-07 13:30:00"
}
POST /event_index/_doc
{
"user_id": "3",
"user_name": "format3",
"create_time": "2015-11-07 13:30:00"
}
POST /event_index/_doc
{
"user_id": "1",
"user_name": "format1",
"create_time": "2015-11-07 13:50:00"
}
POST /event_index/_doc
{
"user_id": "1",
"user_name": "format1",
"create_time": "2015-11-07 13:55:00"
}
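2. Defining the search template
Define the search template with the ID stats. It uses two aggregations, Date Histogram and Cardinality, with the Cardinality aggregation nested inside the Date Histogram. In the template, {{name}} marks a parameter; its value is passed in when the template is called:
POST _scripts/stats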
{
"script": {
"lang": "mustache",
"source": {
"query": {
"bool": {
"must": [
{
"range": {
"create_time": {
"gte": "{{start_date}}",
"lte": "{{end_date}}"
}
}
}
]
}
},
"size": 0,
"aggs": {
"stats_data": {
"date_histogram": {
"field": "create_time",
"interval": "{{interval}}"
},
"aggs": {
"time": {
"cardinality": {
"field": "user_id"
}
}
}
}
}
}
}
}
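The Cardinality aggregation works like DISTINCT in SQL: within each bucket, every user_id is counted once. Elasticsearch computes this count approximately, which matters only when the number of distinct values is large.
The Date Histogram aggregation groups documents into time buckets; its interval property sets the bucket width, such as hour or day.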
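To make the parameter mechanism concrete, here is a minimal Python sketch of how a {{name}} placeholder is filled in before the query runs. It is only an illustration: the render function and the PLACEHOLDER pattern are hypothetical helpers written for this article, they handle plain {{name}} variables only, and they are not how Elasticsearch implements Mustache.

```python
import json
import re

# Simplified stand-in for Mustache: only plain {{name}} placeholders are handled.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template: str, params: dict) -> str:
    """Replace each {{name}} with the matching value from params."""
    def substitute(match):
        name = match.group(1)
        # Like Mustache, a missing variable renders as an empty string.
        return str(params.get(name, ""))
    return PLACEHOLDER.sub(substitute, template)


# The date_histogram fragment of the stats template, kept as a string.
TEMPLATE = '{"date_histogram": {"field": "create_time", "interval": "{{interval}}"}}'

rendered = render(TEMPLATE, {"interval": "hour"})
query = json.loads(rendered)
print(query["date_histogram"]["interval"])  # hour
```

The same stored template therefore serves hourly and daily statistics; only the params object changes between calls.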
3. Querying with the template
Run the query through the template, passing in the template parameters, with hour as the interval:
GET _search/template
{
"id": "stats",
"params": {
"start_date": "2015-11-07 00:00:00",
"end_date": "2015-11-07 23:59:59",
"interval": "hour"
}
}
Response:
{
"took": 3,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 5,
"max_score": 0,
"hits": []
},
"aggregations": {
"stats_data": {
"buckets": [
{
"key_as_string": "2015-11-07 12:00:00",
"key": 1446897600000,
"doc_count": 1,
"time": {
"value": 1
}
},
{
"key_as_string": "2015-11-07 13:00:00",
"key": 1446901200000,
"doc_count": 4,
"time": {
"value": 3
}
}
]
}
}
}
Between 12:00 and 13:00 there is 1 document from 1 user. Between 13:00 and 14:00 there are 4 documents from 3 users.
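These numbers can be checked by hand. The following Python sketch recomputes the per-bucket document count and distinct user count for the five sample events. It assumes exact set-based counting, whereas the Cardinality aggregation is approximate, and the names bucket_start and distinct_users are made up for this illustration.

```python
from collections import defaultdict
from datetime import datetime

# The five sample events indexed earlier: (user_id, create_time).
EVENTS = [
    ("1", "2015-11-07 12:00:00"),
    ("2", "2015-11-07 13:30:00"),
    ("3", "2015-11-07 13:30:00"),
    ("1", "2015-11-07 13:50:00"),
    ("1", "2015-11-07 13:55:00"),
]

FORMAT = "%Y-%m-%d %H:%M:%S"


def bucket_start(ts: datetime, interval: str) -> datetime:
    """Round a timestamp down to the start of its hour or day."""
    if interval == "hour":
        return ts.replace(minute=0, second=0)
    if interval == "day":
        return ts.replace(hour=0, minute=0, second=0)
    raise ValueError(f"unsupported interval: {interval}")


def distinct_users(events, interval):
    """Return {bucket: (doc_count, distinct_user_count)} in time order."""
    docs = defaultdict(int)
    users = defaultdict(set)
    for user_id, created in events:
        key = bucket_start(datetime.strptime(created, FORMAT), interval)
        docs[key] += 1
        users[key].add(user_id)  # a set keeps each user once per bucket
    return {k.strftime(FORMAT): (docs[k], len(users[k])) for k in sorted(docs)}


print(distinct_users(EVENTS, "hour"))
# {'2015-11-07 12:00:00': (1, 1), '2015-11-07 13:00:00': (4, 3)}
```

User 1 appears twice in the 13:00 bucket but is counted once, which is the behavior the requirement asks for.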
Query again, this time with day as the interval:
GET _search/template
{
"id": "stats",
"params": {
"start_date": "2015-11-07 00:00:00",
"end_date": "2015-11-07 23:59:59",
"interval": "day"
}
}
Response:
{
"took": 4,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 5,
"max_score": 0,
"hits": []
},
"aggregations": {
"stats_data": {
"buckets": [
{
"key_as_string": "2015-11-07 00:00:00",
"key": 1446854400000,
"doc_count": 5,
"time": {
"value": 3
}
}
]
}
}
}
On 11-07 there are 5 documents from 3 users.
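4. Summary
This article used one example to show how Elasticsearch search templates work, combining two aggregations. Search templates simplify application code and encapsulate complex business logic; see the official documentation for more details.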
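As an illustration of how little the application needs to know, here is a minimal Python sketch that calls the stats template over HTTP using only the standard library. The cluster address http://localhost:9200, the absence of authentication, and the helper names build_body and run_stats are all assumptions for this sketch, not part of the original project.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: local cluster without security


def build_body(start_date, end_date, interval, template_id="stats"):
    """Build the _search/template request body for the stored template."""
    return {
        "id": template_id,
        "params": {
            "start_date": start_date,
            "end_date": end_date,
            "interval": interval,
        },
    }


def run_stats(start_date, end_date, interval, index="event_index"):
    """Call the template and return (bucket, distinct user count) pairs."""
    body = json.dumps(build_body(start_date, end_date, interval)).encode("utf-8")
    request = urllib.request.Request(
        f"{ES_URL}/{index}/_search/template",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        result = json.load(response)
    buckets = result["aggregations"]["stats_data"]["buckets"]
    return [(b["key_as_string"], b["time"]["value"]) for b in buckets]


# Show the request body only; run_stats needs a reachable cluster.
print(json.dumps(build_body("2015-11-07 00:00:00", "2015-11-07 23:59:59", "day")))
```

The application passes three values and reads back a list of pairs; the query and aggregation structure stay inside the stored template, so they can be changed without touching the calling code.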