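Aggregation is the feature that Elasticsearch (ES) provides, alongside search, for running statistical analysis over the data stored in ES. It supports several kinds of analysis; the main types are:
- Metric: metric analysis, such as computing a maximum or minimum value.
- Bucket: bucketing, similar to GROUP BY in SQL.
- Pipeline: pipeline analysis, which runs a further analysis on the results of a previous aggregation.
- Matrix: matrix analysis.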
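1 Metric analysis
Metric aggregations compute metrics over the data. They fall into two categories:
- Single-value analysis, which outputs one result: max, min, sum, avg, cardinality
- Multi-value analysis, which outputs several results: stats, extended stats, percentile, percentile rank, top hits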
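To make the single-value versus multi-value distinction concrete, here is a minimal local sketch in plain Python. It does not call ES: the `ages` list and the `stats` helper are illustrative assumptions, and the helper only mirrors the fields that the stats aggregation reports (count, min, max, avg, sum).

```python
# Illustrative values only; nothing here is read from Elasticsearch.
ages = [18, 28, 22, 23, 18, 26]


def stats(values):
    """Local stand-in for a multi-value metric: several results at once."""
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "avg": sum(values) / len(values),
        "sum": sum(values),
    }


single = max(ages)   # single-value analysis: one number
multi = stats(ages)  # multi-value analysis: a dict of numbers

print(single)
print(multi)
```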
# Prepare the sample data
POST test_aggregation_index/doc/_bulk
{"index":{"_id":"1"}}
{"username":"alfred way","job":"java engineer","age":18,"birth":"1990-01-02","isMarried":false,"salary":10000}
{"index":{"_id":"2"}}
{"username":"tom","job":"java senior engineer","age":28,"birth":"1980-05-07","isMarried":true,"salary":30000}
{"index":{"_id":"3"}}
{"username":"lee","job":"ruby engineer","age":22,"birth":"1985-08-07","isMarried":false,"salary":15000}
{"index":{"_id":"4"}}
{"username":"Nick","job":"web engineer","age":23,"birth":"1989-08-07","isMarried":false,"salary":8000}
{"index":{"_id":"5"}}
{"username":"Niko","job":"web engineer","age":18,"birth":"1994-08-07","isMarried":false,"salary":5000}
{"index":{"_id":"6"}}
{"username":"Michell","job":"ruby engineer","age":26,"birth":"1987-08-07","isMarried":false,"salary":12000}
1.1 Metric min
Returns the minimum value of a numeric field.
# Minimum age
GET test_aggregation_index/doc/_search
{
  "size": 0,
  "aggs": {
    "min_age": {
      "min": {
        "field": "age"
      }
    }
  }
}
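The min, max, avg, and sum requests in this section differ only in the aggregation name and type, so the body can be generated rather than typed out. A minimal sketch, assuming the body is built in Python before being sent with whatever client you use; `metric_agg_body` is a hypothetical helper, not part of any ES client.

```python
import json


def metric_agg_body(name, agg_type, field):
    """Build a search body that runs one metric aggregation on one field."""
    return {
        "size": 0,  # return no hits, only the aggregation result
        "aggs": {name: {agg_type: {"field": field}}},
    }


body = metric_agg_body("min_age", "min", "age")
print(json.dumps(body, indent=2))
```

Calling `metric_agg_body("max_age", "max", "age")` would produce the body used for the maximum in the same way.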
1.2 Metric max
Returns the maximum value of a numeric field.
# Maximum age
GET test_aggregation_index/doc/_search
{
  "size": 0,
  "aggs": {
    "max_age": {
      "max": {
        "field": "age"
      }
    }
  }
}
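As a sanity check on the result, the same maximum can be computed locally from the six sample documents. This is a sketch in plain Python: the `docs` list repeats only the `username` and `age` fields from the bulk request, and nothing here talks to ES.

```python
# username and age copied from the sample documents in the bulk request.
docs = [
    {"username": "alfred way", "age": 18},
    {"username": "tom", "age": 28},
    {"username": "lee", "age": 22},
    {"username": "Nick", "age": 23},
    {"username": "Niko", "age": 18},
    {"username": "Michell", "age": 26},
]

max_age = max(d["age"] for d in docs)
# Extra local lookup for illustration: who has that age?
oldest = [d["username"] for d in docs if d["age"] == max_age]

print(max_age, oldest)
```

The max aggregation itself reports only the value; the `oldest` lookup is an addition made here for illustration.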
1.3 Metric avg
Returns the average value of a numeric field.
# Average age
GET test_aggregation_index/doc/_search
{
  "size": 0,
  "aggs": {
    "avg_age": {
      "avg": {
        "field": "age"
      }
    }
  }
}
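The average is the sum of the field values divided by the number of documents that have a value; by default, documents missing the field are ignored. A minimal local sketch of that rule, assuming plain Python dicts in place of indexed documents; `local_avg` and the extra document without an `age` are hypothetical and exist only to show the skip.

```python
def local_avg(docs, field):
    """Average a field over the documents that actually contain it."""
    values = [d[field] for d in docs if field in d]
    return sum(values) / len(values) if values else None


docs = [{"age": a} for a in (18, 28, 22, 23, 18, 26)]
# Hypothetical document with no age field; it does not affect the average.
docs.append({"username": "no-age-example"})

avg_age = local_avg(docs, "age")
print(avg_age)  # 22.5
```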
1.4 Metric sum
Returns the sum of a numeric field.
# Total of all ages
GET test_aggregation_index/doc/_search
{
  "size": 0,
  "aggs": {
    "sum_age": {
      "sum": {
        "field": "age"
      }
    }
  }
}
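Single-value metric results come back under `aggregations`, keyed by the name given in the request, with the number in a `value` field. A sketch of reading it, assuming the response has already been parsed into a Python dict; the `response` below is a hand-built stand-in computed from the sample ages, not captured ES output.

```python
ages = [18, 28, 22, 23, 18, 26]

# Hand-written stand-in for the relevant part of a search response.
response = {"aggregations": {"sum_age": {"value": float(sum(ages))}}}

# The key "sum_age" is the aggregation name chosen in the request body.
total = response["aggregations"]["sum_age"]["value"]
print(total)  # 135.0
```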