01.elasticsearch metric aggregation 查询


elasticsearch的aggregate查询现在越来越丰富了,目前总共有4类。

  1. metric aggregation: 主要是min,max,avg,sum,percetile 等单个统计指标的查询,同时,可以用作bucket agg的子聚合查询,但是本身不能包含子查询
  2. bucket aggregation: 主要是类似group by的查询操作,而且可以含有子查询
  3. matrix aggregation: 使用多个字段的值进行计算从而产生一个多维矩阵
  4. pipline aggregation: 主要是能够在其他的aggregation进行一些附加的处理来增强数据

本篇就主要学习metric aggregation

1. 数据准备

演唱会的票信息
GET seats1028/_search

{
"play" : "Auntie Jo",   # 演唱会名称
"date" : "2018-11-6",  # 时间
"theatre" : "Skyline",  # 地点
"sold" : false,      # 这个票是否已经卖出
"actors" : [         # 演员
	"Jo Hangum",
	"Jon Hittle",
	"Rob Kettleman",
	"Laura Conrad",
	"Simon Hower",
	"Nora Blue"
        ],
"number" : 8,   #可以使用的人数(团体座位)
"datetime" : 1541497200000,
"price" : 8321,    # 票价
"tip" : 17.5,      # 优惠
"time" : "5:40PM"
}

总共有3w条这样的数据

2. metric aggregation分类

1 . Avg Aggregation : 求query出来的结果的average 值
2 . Weighted Avg Aggregation: 带权重的average值,可以选取另一个字段的值作为权重值
3 . Max Aggregation: 求query出来的结果的max值
4 . Min Aggregation: 求query出来的结果的min值
5 . Sum Aggregation: 求query出来的结果的sum值
6 . Value Count Aggregation: query出来的结果的某个field的的值的个数,注意可能这个field的值是一个数组则这个一个doc可能就贡献了多个value count
7 . Cardinality Aggregation: 某个filed的value去重后的count, 可以理解为value_count做了去重
8 . Percentiles Aggregation: 百分比求值,对于大小的数值,求百分位,比如响应时间的99分位,95分位等对应的具体响应时间是多少
9 . Percentile Ranks Aggregation: 某个具体的响应时间在总体中所处的分位值
10. Stats Aggregation: 上面value_count,min,max,sum,avg的快捷方式
11. Extended Stats Aggregation: 在stats的基础上增加了平方和、方差、标准差、平均值加/减两个标准差的区间
12. Top Hits Aggregation: 一般是嵌套查询,用在term查询当中,返回每个bucket的topN
13. Geo Bounds Aggregation: 地理位置的边界聚合
14. Geo Centroid Aggregation:
15. Scripted Metric Aggregation: 使用script的聚合
16. Median Absolute Deviation Aggregation: 平方差聚合

metric agg可以用作bucket agg的子聚合查询,但是metric agg 本身不能包含子查询

3.使用样例

1 . Avg Aggregation : 求query出来的结果的average 值

GET seats1028/_search
{
  "size": 0,
  "aggs": {
    "avg_price": {
      "avg": {
        "field": "price"
      }
    }
  }
}

返回

{
  "took" : 8,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 10000,
      "relation" : "gte"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "avg_price" : {
      "value" : 4995.498812220319
    }
  }
}

2 . Weighted Avg Aggregation: 带权重的average值,可以选取另一个字段的值作为权重值

GET seats1028/_search
{
  "size": 0,
  "aggs": {
    "avg_price": {
      "weighted_avg": {
        "value": {
          "field": "price"
        },
        "weight": {
          "field": "number"
        }
      }
    }
  }
}

返回
{
  "took" : 14,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 10000,
      "relation" : "gte"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "avg_price" : {
      "value" : 4981.713482182667
    }
  }
}


3 . Max Aggregation: 求query出来的结果的max值

GET seats1028/_search
{
  "size": 0,
  "aggs": {
    "max_price": {
      "max": {
        "field": "price"
      }
    }
  }
}
返回

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 10000,
      "relation" : "gte"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "max_price" : {
      "value" : 9999.0
    }
  }
}


4 . Min Aggregation: 求query出来的结果的min值

GET seats1028/_search
{
  "size": 0,
  "aggs": {
    "min_price": {
      "min": {
        "field": "price"
      }
    }
  }
}

返回

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 10000,
      "relation" : "gte"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "min_price" : {
      "value" : 0.0
    }
  }
}


5 . Sum Aggregation: 求query出来的结果的sum值

GET seats1028/_search
{
  "size": 0,
  "aggs": {
    "sum_price": {
      "sum": {
        "field": "price"
      }
    }
  }
}

返回

{
  "took" : 8,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 10000,
      "relation" : "gte"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "sum_price" : {
      "value" : 1.80847048E8
    }
  }
}

6 . Value Count Aggregation: query出来的结果的某个field的的值的个数,注意可能这个field的值是一个数组则这个一个doc可能就贡献了多个value count

GET seats1028/_search
{
  "size": 0,
  "aggs": {
    "count_price": {
      "value_count": {
        "field": "price"
      }
    }
  }
}


返回
{
  "took" : 13,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 10000,
      "relation" : "gte"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "count_price" : {
      "value" : 36202
    }
  }
}


7 . Cardinality Aggregation: 某个filed的value去重后的count, 可以理解为value_count做了去重

GET seats1028/_search
{
  "size": 0,
  "aggs": {
    "unique_count_price": {
      "cardinality": {
        "field": "price"
      }
    }
  }
}


返回
{
  "took" : 13,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 10000,
      "relation" : "gte"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "unique_count_price" : {
      "value" : 9591
    }
  }
}


8 . Percentiles Aggregation: 百分比求值,对于大小的数值,求百分位,比如响应时间的99分位,95分位等对应的具体响应时间是多少

GET seats1028/_search
{
  "size": 0,
  "aggs": {
    "percentile_price": {
      "percentiles": {
        "field": "price"
      }
    }
  }
}


返回
{
  "took" : 56,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 10000,
      "relation" : "gte"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "percentile_price" : {
      "values" : {
        "1.0" : 100.22,
        "5.0" : 500.1716280451575,
        "25.0" : 2478.398741375389,
        "50.0" : 4990.070945393942,
        "75.0" : 7509.41570777247,
        "95.0" : 9487.07620155039,
        "99.0" : 9894.284278074867
      }
    }
  }
}


9 . Percentile Ranks Aggregation: 某个具体的响应时间在总体中所处的分位值

GET seats1028/_search
{
  "size": 0,
  "aggs": {
    "rank_price": {
      "percentile_ranks": {
        "field": "price",
        "values": [
          1000,
          2500
        ]
      }
    }
  }
}


返回
{
  "took" : 15,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 10000,
      "relation" : "gte"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "rank_price" : {
      "values" : {
        "1000.0" : 9.948376390210644,
        "2500.0" : 25.172259605722964
      }
    }
  }
}


10. Stats Aggregation: 上面value_count,min,max,sum,avg的快捷方式

GET seats1028/_search
{
  "size": 0,
  "aggs": {
    "stats_price": {
      "stats": {
        "field": "price"
      }
    }
  }
}


返回
{
  "took" : 9,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 10000,
      "relation" : "gte"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "stats_price" : {
      "count" : 36202,
      "min" : 0.0,
      "max" : 9999.0,
      "avg" : 4995.498812220319,
      "sum" : 1.80847048E8
    }
  }
}


11. top_hit

这个一般用于嵌套的子查询,比如下面的查询按照row划分bucket,然后找出每个bucket中的price最高的两个


GET seats1105/_search
{
  "size": 1,
  "query": {
    "match_all": {}
  },
  "aggs": {
    "row_term": {
      "terms": {
        "field": "row",
        "size": 10
      },
      "aggs": {
        "top_price": {
          "top_hits": {
            "size": 2,
            "sort": [
              {
                "price": {
                  "order": "desc"
                }
              }
            ]
          }
        }
      }
    }
  }
}

返回

  "aggregations" : {
    "row_term" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : 2,
          "doc_count" : 5796,
          "top_price" : {
            "hits" : {
              "total" : {
                "value" : 5796,
                "relation" : "eq"
              },
              "max_score" : null,
              "hits" : [
                {
                  "_index" : "seats1105",
                  "_type" : "_doc",
                  "_id" : "1957",
                  "_score" : null,
                  "_source" : {
                    "price" : 9998
                  },
                  "sort" : [
                    9998
                  ]
                },
                {
                  "_index" : "seats1105",
                  "_type" : "_doc",
                  "_id" : "7105",
                  "_score" : null,
                  "_source" : {
                    "price" : 9997
                  },
                  "sort" : [
                    9997
                  ]
                }
              ]
            }
          }
        },
        {
          "key" : 3,
          "doc_count" : 5796,
          "top_price" : {
            "hits" : {
              "total" : {
                "value" : 5796,
                "relation" : "eq"
              },
              "max_score" : null,
              "hits" : [
                {
                  "_index" : "seats1105",
                  "_type" : "_doc",
                  "_id" : "3993",
                  "_score" : null,
                  "_source" : {
                    "price" : 9999
                  },
                  "sort" : [
                    9999
                  ]
                },
                {
                  "_index" : "seats1105",
                  "_type" : "_doc",
                  "_id" : "6656",
                  "_score" : null,
                  "_source" : {
                    "price" : 9999
                  },
                  "sort" : [
                    9999
                  ]
                }
              ]
            }
          }
        },
        {
          "key" : 1,
          "doc_count" : 5791,
          "top_price" : {
            "hits" : {
              "total" : {
                "value" : 5791,
                "relation" : "eq"
              },
              "max_score" : null,
              "hits" : [
                {
                  "_index" : "seats1105",
                  "_type" : "_doc",
                  "_id" : "4321",
                  "_score" : null,
                  "_source" : {
                    "price" : 9999
                  },
                  "sort" : [
                    9999
                  ]
                },
                {
                  "_index" : "seats1105",
                  "_type" : "_doc",
                  "_id" : "8380",
                  "_score" : null,
                  "_source" : {
                    "price" : 9999
                  },
                  "sort" : [
                    9999
                  ]
                }
              ]
            }
          }
        }
    ]
  }

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值