select COUNT(brand) ----------- Metric: a statistical calculation over a set of documents
from cars
GROUP by brand ----------- Bucket: a set of documents that satisfy a condition
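As a rough sketch, the SQL analogy above can be mirrored in plain Python: grouping forms the buckets, and counting is the metric applied to each bucket. The cars list and its values are made-up sample data, not taken from any real index.

```python
from collections import Counter

# Hypothetical sample documents (illustrative only).
cars = [
    {"brand": "audi"},
    {"brand": "bmw"},
    {"brand": "audi"},
]

# Bucket: the documents sharing one brand. Metric: the count inside each bucket.
count_by_brand = Counter(doc["brand"] for doc in cars)
print(dict(count_by_brand))
```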
____________________________________________________________________
"aggregations": { ----------- keyword at the same level as query (can be shortened to "aggs")
"<aggregation_name>":{ ----------- custom name for the aggregation
"<aggregation_type>" :{ ----------- type of the aggregation (e.g. terms, avg)
<aggregation_body>
}
[,"meta":{[<meta_data_body>]}]
[,"aggregations":{[<sub_aggregation>]+}] ----------- sub-aggregations
}
}
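One way to read the template above is to fill it in programmatically. The sketch below builds a request body as a Python dict; build_agg and the aggregation names jobs and avg_salary are hypothetical, and "aggs" is used as the short form of "aggregations". It only assembles JSON and does not contact a cluster.

```python
import json

def build_agg(name, agg_type, body, sub_aggs=None):
    """Return {name: {agg_type: body[, "aggs": sub_aggs]}} following the template."""
    agg = {agg_type: body}
    if sub_aggs:
        agg["aggs"] = sub_aggs  # nested sub-aggregations, same shape as the parent
    return {name: agg}

# Hypothetical example: bucket by job, then average the salary inside each bucket.
avg_salary = build_agg("avg_salary", "avg", {"field": "salary"})
request_body = {
    "size": 0,
    "aggs": build_agg("jobs", "terms", {"field": "job.keyword"}, avg_salary),
}
print(json.dumps(request_body, indent=2))
```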
____________________________________________________________________
Metric Aggregation
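- Single-value analysis: outputs a single result
min, max, avg, sum
Cardinality (similar to distinct count)
- Multi-value analysis: outputs several results
stats, extended stats
percentile, percentile rank
top hits (the top-ranked sample documents)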
POST employees/_search
{
"size": 0,
"aggs": {
"min_salary": {
"min": {
"field": "salary"
}
}
}
}
POST employees/_search
{
"size": 0,
"aggs": {
"max_salary": {
"max": {
"field": "salary"
}
},
"min_salary": {
"min": {
"field": "salary"
}
},
"avg_salary": {
"avg": {
"field": "salary"
}
}
}
}
POST employees/_search
{
"size": 0,
"aggs": {
"stats_salary": {
"stats": {
"field": "salary"
}
}
}
}
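To illustrate the difference between single-value and multi-value metrics, here is a minimal sketch of what a stats-style summary reports in one pass: count, min, max, avg and sum. The salaries list is made-up sample data and the stats function is a plain-Python stand-in, not Elasticsearch code.

```python
def stats(values):
    """Compute the five numbers a stats-style summary reports."""
    total = sum(values)
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "avg": total / len(values),
        "sum": total,
    }

salaries = [9000, 12000, 21000]  # hypothetical sample data
salary_stats = stats(salaries)
print(salary_stats)
```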
____________________________________________________________________
Bucket
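Some common Bucket Aggregations
- Terms
- Numeric types
Range / Date Range
Histogram / Date Histogram
- Nesting is supported
A field needs doc_values or fielddata enabled before a Terms Aggregation can run on it
keyword fields support doc_values by default
text fields need fielddata enabled in the Mapping; buckets are then built from the analyzed tokens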
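The difference between bucketing a keyword field and a text field can be sketched in plain Python. The tokenizer here (lowercase plus whitespace split) is only a rough stand-in for a real analyzer, and the job values are made up.

```python
from collections import Counter

jobs = ["Java Programmer", "Java Senior Programmer", "DBA"]  # hypothetical values

# keyword-style: the whole value is one term.
keyword_buckets = Counter(jobs)

# text-style: every analyzed token is a term. set() makes each document
# count once per token, like a per-bucket document count.
text_buckets = Counter(tok for job in jobs for tok in set(job.lower().split()))

print(keyword_buckets)
print(text_buckets)
```

With the keyword-style grouping each full job title is its own bucket, while the text-style grouping produces buckets such as java and programmer that span several titles.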
--- term aggregation ---
// Terms aggregation on job: on the text field the buckets are the analyzed terms (use "job.keyword" for whole values)
POST employees/_search
{
"size": 0,
"aggs": {
"mamj_jobs": {
"terms": {
"field": "job" //"job.keyword"
}
}
}
}
// Count the distinct values (cardinality)
POST employees/_search
{
"size": 0,
"aggs": {
"mamj_cardinate": {
"cardinality": {
"field": "job" //"job.keyword"
}
}
}
}
// Specify size: for each job type, return the details of the 3 oldest employees
POST employees/_search
{
"size": 0,
"aggs": {
"mmj_jobs": {
"terms": {
"field": "job.keyword"
},
"aggs": {
"old_employee": {
"top_hits": {
"size": 3,
"sort":[
{
"age":{
"order":"desc"
}
}
]
}
}
}
}
}
}
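The request above nests top_hits inside a terms bucket. A rough plain-Python equivalent, assuming made-up employee records and a hypothetical helper top_hits_per_job, looks like this:

```python
from collections import defaultdict

# Hypothetical sample documents.
employees = [
    {"name": "A", "job": "DBA", "age": 41},
    {"name": "B", "job": "DBA", "age": 35},
    {"name": "C", "job": "DBA", "age": 29},
    {"name": "D", "job": "DBA", "age": 52},
    {"name": "E", "job": "QA", "age": 30},
]

def top_hits_per_job(docs, size=3):
    """Bucket by job, then keep the `size` oldest documents in each bucket."""
    buckets = defaultdict(list)
    for doc in docs:
        buckets[doc["job"]].append(doc)
    return {
        job: sorted(members, key=lambda d: d["age"], reverse=True)[:size]
        for job, members in buckets.items()
    }

oldest = top_hits_per_job(employees)
print([d["name"] for d in oldest["DBA"]])
```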
// When performance matters, set eager_global_ordinals on the keyword field
PUT my_index
{
"mappings": {
"properties": {
"foo":{
"type":"keyword",
"eager_global_ordinals":true
}
}
}
}
--- Range & Histogram aggregation ---
Bucket documents by ranges of a numeric field
In a Range Aggregation, each bucket can be given a custom key
POST employees/_search
{
"size": 0,
"aggs": {
"salary_range": {
"range": {
"field": "salary",
"ranges": [
{
"to": 10000
},
{
"from": 10000,
"to": 20000
},
{
"key": ">20000", // custom key
"from": 20000
}
]
}
}
}
}
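A minimal sketch of how values fall into range buckets, assuming the usual convention that "from" is inclusive and "to" is exclusive. The labels "<10000" and "10000-20000" are hypothetical keys added for readability (the request above only sets a key on the last range), and the salaries are made up.

```python
ranges = [
    {"key": "<10000", "to": 10000},
    {"key": "10000-20000", "from": 10000, "to": 20000},
    {"key": ">20000", "from": 20000},
]

def range_buckets(values, ranges):
    """Count values per range; "from" is inclusive, "to" is exclusive."""
    counts = {r["key"]: 0 for r in ranges}
    for v in values:
        for r in ranges:
            lo = r.get("from", float("-inf"))
            hi = r.get("to", float("inf"))
            if lo <= v < hi:
                counts[r["key"]] += 1
    return counts

salary_counts = range_buckets([9000, 10000, 19999, 20000, 35000], ranges)
print(salary_counts)
```

Note that a salary of exactly 20000 lands in the ">20000" bucket under this convention, because the lower bound is inclusive.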
# Salary Histogram: salaries from 0 to 100,000, bucketed in intervals of 5000
POST employees/_search
{
"size": 0,
"aggs": {
"salary_histogram": {
"histogram": {
"field": "salary",
"interval": 5000,
"extended_bounds": {
"min": 0,
"max": 100000
}
}
}
}
}
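The bucketing rule behind a fixed-interval histogram can be sketched as: bucket key = floor(value / interval) * interval. The function below is a plain-Python approximation that also pre-creates empty buckets across a min/max span, imitating what extended_bounds is used for above; the function name and salaries are hypothetical.

```python
import math

def histogram(values, interval, min_bound, max_bound):
    """Count values per fixed-width bucket, keeping empty buckets in [min, max]."""
    counts = {}
    # Pre-create empty buckets so gaps in the data still appear in the result.
    key = math.floor(min_bound / interval) * interval
    while key <= max_bound:
        counts[key] = 0
        key += interval
    for v in values:
        bucket_key = math.floor(v / interval) * interval
        counts[bucket_key] = counts.get(bucket_key, 0) + 1
    return counts

salary_hist = histogram([4999, 5000, 9999, 23000], 5000, 0, 100000)
print({k: c for k, c in salary_hist.items() if c})
```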
Bucket + Metric Aggregation
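A Bucket aggregation can be analyzed further by adding sub-aggregations; a sub-aggregation can be either a Bucket or a Metric aggregation
# Multi-level nesting: bucket by job type, then by gender, then compute salary statistics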
POST employees/_search
{
"size": 0,
"aggs": {
"job_gender_stats": {
"terms": {
"field": "job.keyword"
},
"aggs": {
"gender_stats": {
"terms": {
"field": "gender"
},
"aggs": {
"salary_stats": {
"stats": {
"field": "salary"
}
}
}
}
}
}
}
}
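The two-level nesting above (job, then gender, then salary stats) can be sketched in plain Python as nested grouping followed by a per-group summary. The employee records are made-up sample data.

```python
from collections import defaultdict

# Hypothetical sample documents.
employees = [
    {"job": "DBA", "gender": "male", "salary": 20000},
    {"job": "DBA", "gender": "male", "salary": 30000},
    {"job": "DBA", "gender": "female", "salary": 25000},
    {"job": "QA", "gender": "female", "salary": 18000},
]

# First level: job bucket. Second level: gender bucket. Leaf: list of salaries.
nested = defaultdict(lambda: defaultdict(list))
for doc in employees:
    nested[doc["job"]][doc["gender"]].append(doc["salary"])

# Metric at the innermost level: a stats-style summary per (job, gender) pair.
job_gender_stats = {
    job: {
        gender: {
            "count": len(s),
            "min": min(s),
            "max": max(s),
            "avg": sum(s) / len(s),
            "sum": sum(s),
        }
        for gender, s in by_gender.items()
    }
    for job, by_gender in nested.items()
}
print(job_gender_stats["DBA"]["male"])
```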