Elasticsearch Aggregations (Part 8)

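Pipeline aggregations

Cumulative sum aggregation

A parent pipeline aggregation that computes the cumulative sum of a specified metric in a parent histogram (or date_histogram) aggregation: the cumulative sum for a bucket is the total of that metric over all preceding buckets plus the bucket itself. The specified metric must be numeric, and the enclosing histogram must have min_doc_count set to 0 (the default for histogram aggregations).

Syntax

The cumulative_sum syntax is: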

{
  "cumulative_sum": {
    "buckets_path": "the_sum"
  }
}

The following snippet computes the cumulative sum of the total monthly sales:

curl -X POST "localhost:9200/sales/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "size": 0,
  "aggs": {
    "sales_per_month": {
      "date_histogram": {
        "field": "date",
        "calendar_interval": "month"
      },
      "aggs": {
        "sales": {
          "sum": {
            "field": "price"
          }
        },
        "cumulative_sales": {
          "cumulative_sum": {
            "buckets_path": "sales" 
          }
        }
      }
    }
  }
}
'

Response:

{
   "took": 11,
   "timed_out": false,
   "_shards": ...,
   "hits": ...,
   "aggregations": {
      "sales_per_month": {
         "buckets": [
            {
               "key_as_string": "2015/01/01 00:00:00",
               "key": 1420070400000,
               "doc_count": 3,
               "sales": {
                  "value": 550.0
               },
               "cumulative_sales": {
                  "value": 550.0
               }
            },
            {
               "key_as_string": "2015/02/01 00:00:00",
               "key": 1422748800000,
               "doc_count": 2,
               "sales": {
                  "value": 60.0
               },
               "cumulative_sales": {
                  "value": 610.0
               }
            },
            {
               "key_as_string": "2015/03/01 00:00:00",
               "key": 1425168000000,
               "doc_count": 2,
               "sales": {
                  "value": 375.0
               },
               "cumulative_sales": {
                  "value": 985.0
               }
            }
         ]
      }
   }
}

Take the 2015/03/01 00:00:00 bucket in the example above: its cumulative sum is 550.0 + 60.0 + 375.0 = 985.0.
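
The running total above can be reproduced outside Elasticsearch. The following is a minimal Python sketch that uses only the three monthly values from the sample response; the variable names are illustrative and not part of any Elasticsearch API.

```python
from itertools import accumulate

# Monthly "sales" values taken from the sample response above.
monthly_sales = [550.0, 60.0, 375.0]

# Each bucket's cumulative sum is the running total up to and including it.
cumulative_sales = list(accumulate(monthly_sales))

print(cumulative_sales)  # [550.0, 610.0, 985.0]
```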

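Derivative aggregation

The derivative aggregation is not covered here.

Extended stats bucket aggregation

A sibling pipeline aggregation that computes a variety of statistics across all buckets of a specified metric in a sibling aggregation. The specified metric must be numeric, and the sibling aggregation must be a multi-bucket aggregation.

Syntax

The extended_stats_bucket syntax is: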

{
  "extended_stats_bucket": {
    "buckets_path": "the_sum"
  }
}

The following snippet computes the extended statistics for the monthly sales buckets:

curl -X POST "localhost:9200/sales/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "size": 0,
  "aggs": {
    "sales_per_month": {
      "date_histogram": {
        "field": "date",
        "calendar_interval": "month"
      },
      "aggs": {
        "sales": {
          "sum": {
            "field": "price"
          }
        }
      }
    },
    "stats_monthly_sales": {
      "extended_stats_bucket": {
        "buckets_path": "sales_per_month>sales" 
      }
    }
  }
}
'

Response:

{
   "took": 11,
   "timed_out": false,
   "_shards": ...,
   "hits": ...,
   "aggregations": {
      "sales_per_month": {
         "buckets": [
            {
               "key_as_string": "2015/01/01 00:00:00",
               "key": 1420070400000,
               "doc_count": 3,
               "sales": {
                  "value": 550.0
               }
            },
            {
               "key_as_string": "2015/02/01 00:00:00",
               "key": 1422748800000,
               "doc_count": 2,
               "sales": {
                  "value": 60.0
               }
            },
            {
               "key_as_string": "2015/03/01 00:00:00",
               "key": 1425168000000,
               "doc_count": 2,
               "sales": {
                  "value": 375.0
               }
            }
         ]
      },
      "stats_monthly_sales": {
         "count": 3,
         "min": 60.0,
         "max": 550.0,
         "avg": 328.3333333333333,
         "sum": 985.0,
         "sum_of_squares": 446725.0,
         "variance": 41105.55555555556,
         "variance_population": 41105.55555555556,
         "variance_sampling": 61658.33333333334,
         "std_deviation": 202.74505063146563,
         "std_deviation_population": 202.74505063146563,
         "std_deviation_sampling": 248.3109609609156,
         "std_deviation_bounds": {
           "upper": 733.8234345962646,
           "lower": -77.15676792959795,
           "upper_population" : 733.8234345962646,
           "lower_population" : -77.15676792959795,
           "upper_sampling" : 824.9552552551645,
           "lower_sampling" : -168.28858858849787
         }
      }
   }
}
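
To see where the figures in stats_monthly_sales come from, the sketch below recomputes them in Python from the three bucket values. It is an illustration under assumptions, not Elasticsearch's implementation: the function name extended_stats is hypothetical, and the bounds are taken as avg ± sigma × std_deviation with sigma = 2, which is what the sample numbers are consistent with.

```python
import math
import statistics

# Bucket values from the sample response above.
values = [550.0, 60.0, 375.0]

def extended_stats(values, sigma=2.0):
    """Hypothetical helper; needs at least two values for the sampling variance."""
    avg = sum(values) / len(values)
    var_pop = statistics.pvariance(values)   # divides by n
    var_samp = statistics.variance(values)   # divides by n - 1
    std_pop = math.sqrt(var_pop)
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "avg": avg,
        "sum": sum(values),
        "sum_of_squares": sum(v * v for v in values),
        "variance_population": var_pop,
        "variance_sampling": var_samp,
        "std_deviation_population": std_pop,
        "std_deviation_sampling": math.sqrt(var_samp),
        # Assumed bounds: mean plus/minus sigma population standard deviations.
        "upper_population": avg + sigma * std_pop,
        "lower_population": avg - sigma * std_pop,
    }

stats = extended_stats(values)
for name, value in stats.items():
    print(name, value)
```

Up to floating-point rounding, the printed values match the response, for example a population variance of about 41105.56 and an upper bound of about 733.82.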

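Max bucket aggregation

A sibling pipeline aggregation that identifies the bucket(s) with the maximum value of a specified metric in a sibling aggregation and outputs both the value and the key(s) of the bucket(s). The specified metric must be numeric, and the sibling aggregation must be a multi-bucket aggregation.

Syntax

The max_bucket syntax is: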

{
  "max_bucket": {
    "buckets_path": "the_sum"
  }
}

The following snippet computes the maximum of the total monthly sales:

curl -X POST "localhost:9200/sales/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "size": 0,
  "aggs": {
    "sales_per_month": {
      "date_histogram": {
        "field": "date",
        "calendar_interval": "month"
      },
      "aggs": {
        "sales": {
          "sum": {
            "field": "price"
          }
        }
      }
    },
    "max_monthly_sales": {
      "max_bucket": {
        "buckets_path": "sales_per_month>sales" 
      }
    }
  }
}
'

The response might look like this:

{
   "took": 11,
   "timed_out": false,
   "_shards": ...,
   "hits": ...,
   "aggregations": {
      "sales_per_month": {
         "buckets": [
            {
               "key_as_string": "2015/01/01 00:00:00",
               "key": 1420070400000,
               "doc_count": 3,
               "sales": {
                  "value": 550.0
               }
            },
            {
               "key_as_string": "2015/02/01 00:00:00",
               "key": 1422748800000,
               "doc_count": 2,
               "sales": {
                  "value": 60.0
               }
            },
            {
               "key_as_string": "2015/03/01 00:00:00",
               "key": 1425168000000,
               "doc_count": 2,
               "sales": {
                  "value": 375.0
               }
            }
         ]
      },
      "max_monthly_sales": {
          "keys": ["2015/01/01 00:00:00"], 
          "value": 550.0
      }
   }
}
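
The keys field is a list because several buckets can share the maximum value. The Python sketch below mimics that behaviour on a trimmed copy of the sample buckets; max_bucket here is a hypothetical local function, not a client-library call.

```python
# Trimmed copy of the buckets from the sample response above.
buckets = [
    {"key_as_string": "2015/01/01 00:00:00", "sales": {"value": 550.0}},
    {"key_as_string": "2015/02/01 00:00:00", "sales": {"value": 60.0}},
    {"key_as_string": "2015/03/01 00:00:00", "sales": {"value": 375.0}},
]

def max_bucket(buckets, metric):
    """Return the maximum metric value and the keys of every bucket that has it."""
    best = max(b[metric]["value"] for b in buckets)
    keys = [b["key_as_string"] for b in buckets if b[metric]["value"] == best]
    return {"keys": keys, "value": best}

max_monthly_sales = max_bucket(buckets, "sales")
print(max_monthly_sales)
```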

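Min bucket aggregation

Works like the max bucket aggregation, except that it identifies the bucket(s) with the minimum value.

Moving function aggregation

Given an ordered series of data, the moving function aggregation slides a window across the data and lets the user specify a custom script that is executed on each window of data. For convenience, a number of common functions are predefined, such as min/max and moving averages.

Syntax

The moving_fn syntax is: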

{
  "moving_fn": {
    "buckets_path": "the_sum",
    "window": 10,
    "script": "MovingFunctions.min(values)"
  }
}

A moving_fn aggregation must be embedded inside a histogram or date_histogram aggregation. It can be embedded like any other metric aggregation:

curl -X POST "localhost:9200/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "size": 0,
  "aggs": {
    "my_date_histo": {                  
      "date_histogram": {
        "field": "date",
        "calendar_interval": "1M"
      },
      "aggs": {
        "the_sum": {
          "sum": { "field": "price" }   
        },
        "the_movfn": {
          "moving_fn": {
            "buckets_path": "the_sum",  
            "window": 10,
            "script": "MovingFunctions.unweightedAvg(values)"
          }
        }
      }
    }
  }
}
'

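A moving average is built by first specifying a histogram or date_histogram over a field. You can then optionally add numeric metrics, such as a sum, inside that histogram. Finally, the moving_fn is embedded inside the histogram, and its buckets_path parameter is used to "point" at one of the sibling metrics inside the histogram.

An example response from the above aggregation might look like this: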

{
   "took": 11,
   "timed_out": false,
   "_shards": ...,
   "hits": ...,
   "aggregations": {
      "my_date_histo": {
         "buckets": [
             {
                 "key_as_string": "2015/01/01 00:00:00",
                 "key": 1420070400000,
                 "doc_count": 3,
                 "the_sum": {
                    "value": 550.0
                 },
                 "the_movfn": {
                    "value": null
                 }
             },
             {
                 "key_as_string": "2015/02/01 00:00:00",
                 "key": 1422748800000,
                 "doc_count": 2,
                 "the_sum": {
                    "value": 60.0
                 },
                 "the_movfn": {
                    "value": 550.0
                 }
             },
             {
                 "key_as_string": "2015/03/01 00:00:00",
                 "key": 1425168000000,
                 "doc_count": 2,
                 "the_sum": {
                    "value": 375.0
                 },
                 "the_movfn": {
                    "value": 305.0
                 }
             }
         ]
      }
   }
}
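
The null, 550.0 and 305.0 values in the_movfn follow from the window covering only the buckets before the current one. Here is a minimal Python sketch of that behaviour, assuming the window excludes the current bucket (which is what the sample response implies) and using None to stand in for JSON null. The function name is illustrative.

```python
def moving_unweighted_avg(values, window):
    """Average of up to `window` values that precede each position."""
    results = []
    for i in range(len(values)):
        # Window: at most `window` buckets before bucket i, excluding bucket i.
        win = values[max(0, i - window):i]
        results.append(sum(win) / len(win) if win else None)
    return results

the_sum = [550.0, 60.0, 375.0]  # per-bucket sums from the sample response
the_movfn = moving_unweighted_avg(the_sum, window=10)
print(the_movfn)  # [None, 550.0, 305.0]
```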
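
Custom user scripting

The moving function aggregation lets the user specify an arbitrary script that defines custom logic. The script is invoked each time a new window of data is collected, and the window's values are passed to the script in the values variable. The script should then perform some calculation and emit a single double as the result. Emitting null is not permitted, although NaN and +/- Inf are allowed.

For example, this script simply returns the first value from the window, or NaN if no values are available: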

curl -X POST "localhost:9200/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "size": 0,
  "aggs": {
    "my_date_histo": {
      "date_histogram": {
        "field": "date",
        "calendar_interval": "1M"
      },
      "aggs": {
        "the_sum": {
          "sum": { "field": "price" }
        },
        "the_movavg": {
          "moving_fn": {
            "buckets_path": "the_sum",
            "window": 10,
            "script": "return values.length > 0 ? values[0] : Double.NaN"
          }
        }
      }
    }
  }
}
'
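
The contract between the aggregation and the script (one call per window, one double returned) can be sketched in Python as below. This is only an analogy: moving_fn and first_value are hypothetical local functions, the window is again assumed to exclude the current bucket, and the real script runs inside Elasticsearch.

```python
import math

def moving_fn(values, window, script):
    """Call `script` once per bucket with the values of the preceding window."""
    return [script(values[max(0, i - window):i]) for i in range(len(values))]

def first_value(values):
    # Python counterpart of: values.length > 0 ? values[0] : Double.NaN
    return values[0] if len(values) > 0 else math.nan

result = moving_fn([550.0, 60.0, 375.0], window=10, script=first_value)
print(result)  # [nan, 550.0, 550.0]
```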

Built-in functions

  • max()
  • min()
  • sum()
  • stdDev()
  • unweightedAvg()
  • linearWeightedAvg()
  • ewma()
  • holt()
  • holtWinters()

These functions are available from the MovingFunctions namespace, for example MovingFunctions.max().
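
For intuition, the simplest of these functions can be approximated in plain Python as follows. This is a sketch, not the Elasticsearch source: the function names are made up, and the empty-window results (NaN for max, min and the average, 0.0 for the sum) reflect how the documentation describes them, so they are worth verifying against your version.

```python
import math

def _clean(values):
    """Drop missing entries (None or NaN) before computing."""
    return [v for v in values if v is not None and not math.isnan(v)]

def moving_max(values):
    cleaned = _clean(values)
    return max(cleaned) if cleaned else math.nan

def moving_min(values):
    cleaned = _clean(values)
    return min(cleaned) if cleaned else math.nan

def moving_sum(values):
    return float(sum(_clean(values)))

def unweighted_avg(values):
    cleaned = _clean(values)
    return sum(cleaned) / len(cleaned) if cleaned else math.nan

window = [550.0, None, 60.0]
print(moving_max(window), moving_min(window), moving_sum(window), unweighted_avg(window))
```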

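Moving percentiles aggregation

Given an ordered series of percentiles, the moving percentiles aggregation slides a window across those percentiles and lets the user compute the cumulative percentile.

This is conceptually very similar to the moving function pipeline aggregation, except that it works on the percentile sketches instead of the actual bucket values.

Syntax

A moving_percentiles aggregation looks like this: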

{
  "moving_percentiles": {
    "buckets_path": "the_percentile",
    "window": 10
  }
}

A moving_percentiles aggregation must be embedded inside a histogram or date_histogram aggregation. It can be embedded like any other metric aggregation:

curl -X POST "localhost:9200/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "size": 0,
  "aggs": {
    "my_date_histo": {
      "date_histogram": {
        "field": "date",
        "calendar_interval": "1M"
      },
      "aggs": {
        "the_percentile": {
          "percentiles": {
            "field": "price",
            "percents": [ 1.0, 99.0 ]
          }
        },
        "the_movperc": {
          "moving_percentiles": {
            "buckets_path": "the_percentile",   
            "window": 10
          }
        }
      }
    }
  }
}
'

The response might look like this:

{
   "took": 11,
   "timed_out": false,
   "_shards": ...,
   "hits": ...,
   "aggregations": {
      "my_date_histo": {
         "buckets": [
             {
                 "key_as_string": "2015/01/01 00:00:00",
                 "key": 1420070400000,
                 "doc_count": 3,
                 "the_percentile": {
                     "values": {
                       "1.0": 150.0,
                       "99.0": 200.0
                     }
                 }
             },
             {
                 "key_as_string": "2015/02/01 00:00:00",
                 "key": 1422748800000,
                 "doc_count": 2,
                 "the_percentile": {
                     "values": {
                       "1.0": 10.0,
                       "99.0": 50.0
                     }
                 },
                 "the_movperc": {
                   "values": {
                     "1.0": 150.0,
                     "99.0": 200.0
                   }
                 }
             },
             {
                 "key_as_string": "2015/03/01 00:00:00",
                 "key": 1425168000000,
                 "doc_count": 2,
                 "the_percentile": {
                    "values": {
                      "1.0": 175.0,
                      "99.0": 200.0
                    }
                 },
                 "the_movperc": {
                    "values": {
                      "1.0": 10.0,
                      "99.0": 200.0
                    }
                 }
             }
         ]
      }
   }
}

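Percentiles bucket aggregation

A sibling pipeline aggregation that calculates percentiles across all buckets of a specified metric in a sibling aggregation. The specified metric must be numeric, and the sibling aggregation must be a multi-bucket aggregation.

Syntax

A percentiles_bucket aggregation looks like this: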

{
  "percentiles_bucket": {
    "buckets_path": "the_sum"
  }
}

The following snippet computes the percentiles of the total monthly sales:

curl -X POST "localhost:9200/sales/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "size": 0,
  "aggs": {
    "sales_per_month": {
      "date_histogram": {
        "field": "date",
        "calendar_interval": "month"
      },
      "aggs": {
        "sales": {
          "sum": {
            "field": "price"
          }
        }
      }
    },
    "percentiles_monthly_sales": {
      "percentiles_bucket": {
        "buckets_path": "sales_per_month>sales", 
        "percents": [ 25.0, 50.0, 75.0 ]         
      }
    }
  }
}
'

The response might look like this:

{
   "took": 11,
   "timed_out": false,
   "_shards": ...,
   "hits": ...,
   "aggregations": {
      "sales_per_month": {
         "buckets": [
            {
               "key_as_string": "2015/01/01 00:00:00",
               "key": 1420070400000,
               "doc_count": 3,
               "sales": {
                  "value": 550.0
               }
            },
            {
               "key_as_string": "2015/02/01 00:00:00",
               "key": 1422748800000,
               "doc_count": 2,
               "sales": {
                  "value": 60.0
               }
            },
            {
               "key_as_string": "2015/03/01 00:00:00",
               "key": 1425168000000,
               "doc_count": 2,
               "sales": {
                  "value": 375.0
               }
            }
         ]
      },
      "percentiles_monthly_sales": {
        "values" : {
            "25.0": 375.0,
            "50.0": 375.0,
            "75.0": 550.0
         }
      }
   }
}
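
Note that the 25th and 50th percentiles are both 375.0: the results are actual bucket values rather than interpolated ones. The Python sketch below reproduces the sample output under an assumption, namely that each percentile is mapped to a position in the sorted values by rounding half up. Treat it as an illustration of why the numbers look this way, not as the exact server-side algorithm.

```python
import math

def percentiles_bucket(values, percents):
    """Pick an existing data point for each percentile, without interpolation."""
    data = sorted(values)
    result = {}
    for p in percents:
        # Assumed rule: round half up (Python's round() rounds half to even).
        index = math.floor((p / 100.0) * (len(data) - 1) + 0.5)
        result[p] = data[index]
    return result

monthly_sales = [550.0, 60.0, 375.0]  # bucket values from the sample response
percentiles_monthly_sales = percentiles_bucket(monthly_sales, [25.0, 50.0, 75.0])
print(percentiles_monthly_sales)  # {25.0: 375.0, 50.0: 375.0, 75.0: 550.0}
```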

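Stats bucket aggregation

A sibling pipeline aggregation that computes a variety of statistics across all buckets of a specified metric in a sibling aggregation. The specified metric must be numeric, and the sibling aggregation must be a multi-bucket aggregation.

Syntax

A stats_bucket aggregation looks like this: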

{
  "stats_bucket": {
    "buckets_path": "the_sum"
  }
}

The following snippet computes statistics for the monthly sales:

curl -X POST "localhost:9200/sales/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "size": 0,
  "aggs": {
    "sales_per_month": {
      "date_histogram": {
        "field": "date",
        "calendar_interval": "month"
      },
      "aggs": {
        "sales": {
          "sum": {
            "field": "price"
          }
        }
      }
    },
    "stats_monthly_sales": {
      "stats_bucket": {
        "buckets_path": "sales_per_month>sales" 
      }
    }
  }
}
'

The response might look like this:

{
   "took": 11,
   "timed_out": false,
   "_shards": ...,
   "hits": ...,
   "aggregations": {
      "sales_per_month": {
         "buckets": [
            {
               "key_as_string": "2015/01/01 00:00:00",
               "key": 1420070400000,
               "doc_count": 3,
               "sales": {
                  "value": 550.0
               }
            },
            {
               "key_as_string": "2015/02/01 00:00:00",
               "key": 1422748800000,
               "doc_count": 2,
               "sales": {
                  "value": 60.0
               }
            },
            {
               "key_as_string": "2015/03/01 00:00:00",
               "key": 1425168000000,
               "doc_count": 2,
               "sales": {
                  "value": 375.0
               }
            }
         ]
      },
      "stats_monthly_sales": {
         "count": 3,
         "min": 60.0,
         "max": 550.0,
         "avg": 328.3333333333333,
         "sum": 985.0
      }
   }
}

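Sum bucket aggregation

A sibling pipeline aggregation that computes the sum across all buckets of a specified metric in a sibling aggregation. The specified metric must be numeric, and the sibling aggregation must be a multi-bucket aggregation.

Syntax

A sum_bucket aggregation looks like this: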

{
  "sum_bucket": {
    "buckets_path": "the_sum"
  }
}

The following snippet computes the sum of all the total monthly sales:

curl -X POST "localhost:9200/sales/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "size": 0,
  "aggs": {
    "sales_per_month": {
      "date_histogram": {
        "field": "date",
        "calendar_interval": "month"
      },
      "aggs": {
        "sales": {
          "sum": {
            "field": "price"
          }
        }
      }
    },
    "sum_monthly_sales": {
      "sum_bucket": {
        "buckets_path": "sales_per_month>sales" 
      }
    }
  }
}
'

The response might look like this:

{
   "took": 11,
   "timed_out": false,
   "_shards": ...,
   "hits": ...,
   "aggregations": {
      "sales_per_month": {
         "buckets": [
            {
               "key_as_string": "2015/01/01 00:00:00",
               "key": 1420070400000,
               "doc_count": 3,
               "sales": {
                  "value": 550.0
               }
            },
            {
               "key_as_string": "2015/02/01 00:00:00",
               "key": 1422748800000,
               "doc_count": 2,
               "sales": {
                  "value": 60.0
               }
            },
            {
               "key_as_string": "2015/03/01 00:00:00",
               "key": 1425168000000,
               "doc_count": 2,
               "sales": {
                  "value": 375.0
               }
            }
         ]
      },
      "sum_monthly_sales": {
          "value": 985.0
      }
   }
}
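
In the sibling aggregations above, buckets_path uses ">" to separate the name of the multi-bucket aggregation from the name of the metric inside it. The Python sketch below walks a trimmed copy of the sample response in the same way and then sums the values. Elasticsearch resolves the path on the server; resolve_buckets_path is a hypothetical helper that handles only this simple two-part form.

```python
# Trimmed copy of the "aggregations" section of the sample response above.
response_aggs = {
    "sales_per_month": {
        "buckets": [
            {"key_as_string": "2015/01/01 00:00:00", "sales": {"value": 550.0}},
            {"key_as_string": "2015/02/01 00:00:00", "sales": {"value": 60.0}},
            {"key_as_string": "2015/03/01 00:00:00", "sales": {"value": 375.0}},
        ]
    }
}

def resolve_buckets_path(aggs, path):
    """Collect one metric value per bucket for a path like 'agg_name>metric_name'."""
    multi_bucket_agg, metric = path.split(">")
    return [b[metric]["value"] for b in aggs[multi_bucket_agg]["buckets"]]

values = resolve_buckets_path(response_aggs, "sales_per_month>sales")
sum_monthly_sales = {"value": sum(values)}
print(sum_monthly_sales)  # {'value': 985.0}
```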