Elasticsearch: Aggregation Analysis (Part 1)

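Aggregations that run on query results
An aggregation is computed over the documents that match the query. The request below counts the most common tags among the groups whose location matches Denver.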

GET /get-together/group/_search
{
  "query": {
    "match": {
      "location": "Denver"
    }
  }, 
  "aggs": {
    "top_tags": {
      "terms": {
        "field": "tags.keyword"
      }
    }
  }
}
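To make the scoping concrete, here is a small Python sketch of what the request above computes: the query selects the hits first, and the terms aggregation then counts tags over those hits only. The sample documents and the `top_tags` helper are hypothetical, and the substring test is only a rough stand-in for a real `match` query.

```python
from collections import Counter

# Hypothetical sample documents standing in for the get-together groups.
groups = [
    {"location": "Denver, Colorado", "tags": ["big data", "solr"]},
    {"location": "Denver, Colorado", "tags": ["big data", "elasticsearch"]},
    {"location": "Boulder, Colorado", "tags": ["hadoop", "big data"]},
]


def top_tags(docs, location_word, size=10):
    """Count tags over the documents that match the query, not over all documents."""
    # Rough stand-in for the match query: case-insensitive substring test.
    matched = [d for d in docs if location_word.lower() in d["location"].lower()]
    # A terms bucket counts documents, so each tag is counted once per document.
    counts = Counter(tag for d in matched for tag in set(d["tags"]))
    # Buckets are returned with the largest document count first.
    return len(matched), counts.most_common(size)


total, buckets = top_tags(groups, "Denver")
print(total, buckets)
```

The Boulder group is excluded by the query, so its tags never reach the aggregation.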

Metric aggregations
These compute statistics such as the average, maximum, and minimum.

GET /get-together/event/_search?pretty
{
  "aggs": {
    "attendees_stats": {
      "stats": {
        "script": "doc['attendees'].values.length"
      }
    }
  }
}

Query result (a count of 0 with null values means the aggregation collected no values):

"aggregations": {
    "attendees_stats": {
      "count": 0,
      "min": null,
      "max": null,
      "avg": null,
      "sum": null
    }
  }
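The response above has the shape of a `stats` result. As a rough illustration of how those five numbers relate to each other, the following Python sketch computes them over the number of attendees per event. The sample events and the `stats` helper are assumptions made for this example, and the empty case simply mirrors the response shown above.

```python
def stats(values):
    """Return count, min, max, avg, and sum for an iterable of numbers."""
    values = list(values)
    if not values:
        # Mirrors the empty response shown above: nothing was collected.
        return {"count": 0, "min": None, "max": None, "avg": None, "sum": None}
    total = sum(values)
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "avg": total / len(values),
        "sum": total,
    }


# Hypothetical events; the script in the request yields len(attendees) per event.
events = [
    {"attendees": ["Lee", "Troy", "Daniel"]},
    {"attendees": ["Greg", "Mike"]},
]
attendee_counts = [len(e["attendees"]) for e in events]

print(stats(attendee_counts))
print(stats([]))
```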

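Percentiles: the request below asks for the value that 80% of the observations do not exceed, and the value that 99% of them do not exceed.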

GET get-together/event/_search
{
  "aggs": {
    "attendees_percentiles": {
      "percentiles": {
        "script": "doc['attendees.keyword'].values.length",
        "percents": [80,99]
      }
    }
  }
}
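A percentile can be pictured as a position in the sorted list of values. The sketch below uses the exact nearest-rank method on a hypothetical list of attendee counts. Elasticsearch itself computes approximate percentiles (TDigest by default), so real responses can differ slightly from this simplified version.

```python
import math


def percentile(values, p):
    """Nearest-rank percentile: the smallest value with at least p% of values at or below it."""
    ordered = sorted(values)
    # Rank is 1-based; clamp to 1 so very small p still returns the minimum.
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical attendee counts for ten events.
attendee_counts = [1, 2, 2, 3, 3, 3, 4, 4, 5, 10]

p80 = percentile(attendee_counts, 80)
p99 = percentile(attendee_counts, 99)
print(p80, p99)
```

With this data, 80% of the events have at most 4 attendees, while the 99th percentile is pulled up to the single outlier of 10.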

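Percentile ranks: the inverse of percentiles. For each given value, the request returns the percentage of observations at or below it. Here it reports, per province, the share of requests with a latency of at most 200 and at most 1000.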

GET /website/logs/_search
{
    "size":0,
    "aggs":{
        "group_by_province":{
            "terms":{
                "field":"province"
            },
            "aggs":{
                "latency_percentile_ranks":{
                    "percentile_ranks":{
                        "field":"latency",
                        "values":[
                            200,
                            1000
                        ]
                    }
                }
            }
        }
    }
}

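Multi-bucket aggregations
Terms aggregation
The request below groups documents by tag and counts how many documents carry each tag, with the buckets sorted alphabetically by term. (From Elasticsearch 6.0 on, the `_term` sort key is deprecated in favor of `_key`.)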

GET get-together/group/_search
{
  "aggs": {
    "tags": {
      "terms": {
        "field": "tags.keyword",
        "order": {
          "_term": "asc"
        }
      }
    }
  }
}

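Range aggregation
Counts how many events have fewer than 4 attendees, how many have at least 4 but fewer than 6, and how many have 6 or more. In each range, `from` is inclusive and `to` is exclusive.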

GET get-together/event/_search
{
  "aggs": {
    "attendees_breakdown": {
      "range": {
        "script": "doc['attendees.keyword'].values.length",
        "ranges": [
          {
            "to": 4
          },
          {
            "from": 4,
            "to": 6
          },
          {
            "from": 6
          }
        ]
      }
    }
  }
}
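The boundary handling is easy to get wrong, so here is a minimal Python sketch of the bucketing rule: `from` is inclusive, `to` is exclusive, and a missing bound is open-ended. The attendee counts are hypothetical, and the bucket keys are simplified compared with the keys Elasticsearch generates.

```python
def range_buckets(values, ranges):
    """Count values per range; 'from' is inclusive and 'to' is exclusive."""
    result = {}
    for r in ranges:
        lo, hi = r.get("from"), r.get("to")
        key = f"{'*' if lo is None else lo}-{'*' if hi is None else hi}"
        result[key] = sum(
            1
            for v in values
            if (lo is None or v >= lo) and (hi is None or v < hi)
        )
    return result


# Hypothetical number of attendees per event.
attendee_counts = [3, 4, 5, 6, 2, 6]
ranges = [{"to": 4}, {"from": 4, "to": 6}, {"from": 6}]

buckets = range_buckets(attendee_counts, ranges)
print(buckets)
```

An event with exactly 4 attendees lands in the middle bucket, and one with exactly 6 lands in the last.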

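date_range aggregation
Groups documents by date ranges. The request below splits the events into those before 2013-07 and those from 2013-07 onward.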

GET get-together/event/_search
{
  "aggs": {
    "dates_breakdown": {
      "date_range": {
        "field": "date",
        "format": "YYYY-MM", 
        "ranges": [
          {
            "to": "2013-07"
          },
          {
            "from": "2013-07"
          }
        ]
      }
    }
  }
}

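Histogram aggregation
Similar to the range aggregation, but instead of listing each range you define a fixed interval and the buckets are generated from it.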

GET get-together/event/_search
{
  "aggs": {
    "attendees_histogram": {
      "histogram": {
        "script": "doc['attendees.keyword'].values.length",
        "interval": 1
      }
    }
  }
}
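The bucket key of a histogram is the value rounded down to a multiple of the interval. The Python sketch below shows that rule on hypothetical data. It also fills empty buckets between the smallest and largest key, on the assumption that `min_doc_count` is 0, which is the default in recent Elasticsearch versions.

```python
import math


def histogram(values, interval):
    """Bucket numbers by fixed interval; the key is the value rounded down to the interval."""
    counts = {}
    for v in values:
        key = math.floor(v / interval) * interval
        counts[key] = counts.get(key, 0) + 1
    if not counts:
        return {}
    # Assumption: min_doc_count is 0, so gaps between min and max are kept as empty buckets.
    filled = {}
    key, last = min(counts), max(counts)
    while key <= last:
        filled[key] = counts.get(key, 0)
        key += interval
    return filled


print(histogram([2, 3, 3, 5], 1))
print(histogram([1, 2, 3, 7], 2))
```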

Date histogram
Groups documents by the date field in one-month buckets and counts the documents in each bucket.

GET get-together/event/_search
{
  "aggs": {
    "event_dates": {
      "date_histogram": {
        "field": "date",
        "interval": "1M"
      }
    }
  }
}
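A date histogram with a monthly interval amounts to truncating each date to its month and counting. The following Python sketch does that for a few hypothetical event dates. The date format is an assumption, and unlike Elasticsearch this simplified version does not emit empty months.

```python
from collections import Counter
from datetime import datetime


def monthly_counts(dates):
    """Count dates per calendar month, keyed as YYYY-MM and sorted chronologically."""
    keys = (
        datetime.strptime(d, "%Y-%m-%dT%H:%M").strftime("%Y-%m")
        for d in dates
    )
    return dict(sorted(Counter(keys).items()))


# Hypothetical event dates.
event_dates = ["2013-07-03T19:00", "2013-06-05T18:00", "2013-07-20T10:00"]

per_month = monthly_counts(event_dates)
print(per_month)
```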

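Nested aggregations
Aggregations can be nested inside buckets. The request below first groups by tag, then by month within each tag, and finally splits each month into groups with fewer than 3 members and groups with 3 or more.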

GET get-together/group/_search
{
  "aggs": {
    "top_tags": {
      "terms": {
        "field": "tags.keyword"
      },
      "aggs": {
        "groups_per_month": {
          "date_histogram": {
            "field": "created_on",
            "interval": "1M"
          },
          "aggs": {
            "number_of_members": {
              "range": {
                "script": "doc['members.keyword'].values.length",
                "ranges": [
                  {
                    "to": 3
                  },{
                    "from": 3
                  }
                ]
              }
            }
          }
        }
      }
    }
  }
}
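The three levels can be read as nested dictionaries: tag, then month, then member-count range. The Python sketch below builds that structure from hypothetical group documents. Field values and the `nested_buckets` helper are assumptions for illustration, and empty months between buckets are omitted for brevity.

```python
from collections import defaultdict

# Hypothetical group documents.
groups = [
    {"tags": ["big data", "solr"], "created_on": "2012-06-15", "members": ["Lee", "Mike"]},
    {"tags": ["big data"], "created_on": "2012-06-20", "members": ["Greg", "Troy", "Daniel"]},
    {"tags": ["solr"], "created_on": "2013-03-01", "members": ["Radu", "Matthew", "Roy", "Andrew"]},
]


def nested_buckets(docs):
    """Group by tag, then by creation month, then by member-count range."""
    tree = defaultdict(lambda: defaultdict(lambda: {"*-3": 0, "3-*": 0}))
    for doc in docs:
        month = doc["created_on"][:7]  # truncate the date to YYYY-MM
        size_key = "*-3" if len(doc["members"]) < 3 else "3-*"
        for tag in set(doc["tags"]):
            tree[tag][month][size_key] += 1
    return {tag: dict(months) for tag, months in tree.items()}


result = nested_buckets(groups)
print(result)
```

Each inner level only sees the documents that fell into its parent bucket, which is the same scoping rule the request relies on.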

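Filter plus aggregation
The query keeps only the documents whose price is greater than or equal to 1200, and the average is then computed over those documents.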

GET /tvs/sales/_search
{
  "size": 0,
  "query": {
    "constant_score": {
      "filter": {
        "range": {
          "price": {
            "gte": 1200
          }
        } 
      }
    }
  },
  "aggs": {
    "single_avg_price": {
      "avg": {
        "field": "price"
      }
    }
  }
}

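Filtering inside a bucket
In the request below, the recent_150d aggregation filters documents by sold_date and then averages their price. Unlike a condition placed in the query, this filter affects only the result set of recent_150d.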

GET /tvs/sales/_search
{
  "size": 0,
  "query": {
    "term": {
      "brand": {
        "value": "长虹"
      }
    }
  },
  "aggs": {
    "recent_150d": {
      "filter": {
        "range": {
          "sold_date": {
            "gte": "now-150d"
          }
        }
      },
      "aggs": {
        "recent_150d_avg_price": {
          "avg": {
            "field": "price"
          }
        }
      }
    }
  }
}
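The difference in scope between the query and a filter aggregation can be sketched in a few lines of Python. The sales records, the fixed `today` value, and the `search` helper are all hypothetical. The point is only that the query narrows everything, while the filter narrows just its own bucket.

```python
from datetime import date, timedelta

# Hypothetical sales records.
sales = [
    {"brand": "长虹", "price": 2000, "sold_date": date(2017, 5, 1)},
    {"brand": "长虹", "price": 1000, "sold_date": date(2016, 10, 1)},
    {"brand": "长虹", "price": 3000, "sold_date": date(2017, 3, 15)},
    {"brand": "小米", "price": 2500, "sold_date": date(2017, 5, 20)},
]


def search(docs, brand, today, days=150):
    # The query applies to the hits and to every aggregation.
    hits = [d for d in docs if d["brand"] == brand]
    # The filter aggregation narrows only its own bucket ("now-150d").
    cutoff = today - timedelta(days=days)
    recent = [d for d in hits if d["sold_date"] >= cutoff]
    prices = [d["price"] for d in recent]
    return {
        "hits_total": len(hits),
        "recent_150d": {
            "doc_count": len(recent),
            "recent_150d_avg_price": sum(prices) / len(prices) if prices else None,
        },
    }


response = search(sales, "长虹", today=date(2017, 6, 1))
print(response)
```

Three documents match the query, but only two of them fall inside the 150-day window, so the average covers those two.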

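Single-bucket aggregations
1. global: creates one bucket that contains every document in the indices and types being searched, regardless of the query.
2. filter: restricts the documents that are aggregated. The request below aggregates only the events whose date is after 2013-07-01.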

GET get-together/event/_search
{
  "query": {
    "match": {
      "title": "elasticsearch"
    }
  },
  "aggs": {
    "since_july": {
      "filter": {
        "range": {
          "date": {
            "gt": "2013-07-01T00:00"
          }
        }
      },
      "aggs": {
        "description_cloud": {
          "terms": {
            "field": "description"
          }
        }
      }
    }
  }
}

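In the response, hits.total counts every document that matches the query, whereas the figures under aggregations cover only the matching documents that also pass the filter.
3. missing: counts the documents that have no value for the given field.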

GET get-together/event/_search
{
  "aggs": {
    "event_dates": {
      "date_histogram": {
        "field": "date",
        "interval": "1M"
      }
    },
    "missing_date":{
      "missing": {
        "field": "date"
      }
    }
  }
}
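To compare the three single-bucket aggregations side by side, the Python sketch below computes the document count each would report for one search. The events and the `single_buckets` helper are hypothetical, and the title test is a crude stand-in for a `match` query.

```python
# Hypothetical events; the last one has no date field.
events = [
    {"title": "Elasticsearch intro", "date": "2013-06-20T18:00"},
    {"title": "Elasticsearch at scale", "date": "2013-07-15T19:00"},
    {"title": "Hadoop night", "date": "2013-08-01T19:00"},
    {"title": "Elasticsearch hack day"},
]


def single_buckets(docs, word, since):
    """Document counts for the global, filter, and missing buckets of one search."""
    hits = [d for d in docs if word in d["title"].lower()]
    return {
        "hits_total": len(hits),
        # global: ignores the query and sees every document.
        "global": len(docs),
        # filter: query hits that also pass the date filter
        # (ISO timestamps of equal format compare correctly as strings).
        "filter": sum(1 for d in hits if d.get("date", "") > since),
        # missing: query hits that have no date value at all.
        "missing": sum(1 for d in hits if "date" not in d),
    }


counts = single_buckets(events, "elasticsearch", "2013-07-01T00:00")
print(counts)
```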

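Top hits: after grouping, return only some documents and some fields from each bucket. The request below returns, for each user, only the title field of the first 5 documents.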

GET /website/blogs/_search 
{
  "size": 0, 
  "aggs": {
    "group_by_username": {
      "terms": {
        "field": "userInfo.userName.keyword"
      },
      "aggs": {
        "top_blogs": {
          "top_hits": {
            "_source": {
              "include": "title"
            }, 
            "size": 5
          }
        }
      }
    }
  }
}
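As a rough Python equivalent, the sketch below groups hypothetical blog posts by user name and keeps at most `size` entries per user, stripped down to the title. It takes documents in input order, whereas `top_hits` sorts by relevance score unless told otherwise, so treat it as an illustration of the output shape only.

```python
from collections import defaultdict

# Hypothetical blog documents.
blogs = [
    {"title": "First post", "content": "...", "userInfo": {"userName": "alice"}},
    {"title": "Second post", "content": "...", "userInfo": {"userName": "alice"}},
    {"title": "Hello", "content": "...", "userInfo": {"userName": "bob"}},
    {"title": "Third post", "content": "...", "userInfo": {"userName": "alice"}},
]


def top_titles(docs, size=5):
    """Per user, keep at most `size` documents and only their title field."""
    buckets = defaultdict(list)
    for doc in docs:
        user = doc["userInfo"]["userName"]
        if len(buckets[user]) < size:
            # Source filtering: drop everything except the title.
            buckets[user].append({"title": doc["title"]})
    return dict(buckets)


per_user = top_titles(blogs, size=2)
print(per_user)
```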

Reference: "Elasticsearch in Action" (《ElasticSearch实战》)
