ElasticSearch聚合和OR条件查询

在使用ElasticSearch中经常会遇到统计、查询需求,实现类似sql分组计算、条件查询的语法,ES在这些方面都支持的较不错,用起来也比较方便。笔者就自己开发中用到的Java API举例如下。

分组聚合查询

实现类似select avg(executionTime) from t WHERE executionDate between 'A' AND 'B' group by executionDate ORDER BY executionDate ASC的查询后聚合。ES分组计算的逻辑是先将executionDate分组,然后在每个分组内在进行聚合,在API中是两个聚合aggregation包含的逻辑关系。

  • HTTP请求
    POST请求体

    {
    “from”:0,
    “size”:2,
    “query”: {
    “range”: {
    “executionDate”: {
    “gte”:“2019-01-20”,
    “lte”:“2019-01-25”
    }
    }
    },
    “aggregations”:{
    “executionDateGroup”:{
    “terms”:{
    “field”:“executionDate”
    },
    “aggregations”:{
    “executionTimeAvg”:{
    “avg”:{
    “field”:“executionTime”
    }
    }
    }
    }

    }
    }

POST返回结果

{
    "took": 1510,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": 52,
        "max_score": 1,
        "hits": [
            //....
        ]
    },
    "aggregations": {
        "executionDateGroup": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 2,
            "buckets": [
                {
                    "key": 1548425669000,
                    "key_as_string": "2019-01-25 14:14:29",
                    "doc_count": 6,
                    "executionTimeAvg": {
                        "value": 72.33333333333333
                    }
                },
                {
                    "key": 1548425670000,
                    "key_as_string": "2019-01-25 14:14:30",
                    "doc_count": 6,
                    "executionTimeAvg": {
                        "value": 67.66666666666667
                    }
                },
                //.....
            ]
        }
    }
}

JAVA代码API其实与上面HTTP请求一样封装和获取数据,整体思路变化不大。
代码示例

    //1.先对executionDate分组,取名executionDateGroup,即实现group by executionDate,如果不设置size,则聚合返回默认10组数据
    TermsAggregationBuilder groupTerms = AggregationBuilders.terms("executionDateGroup").field("executionDate").size(Integer.MAX_VALUE);
    //设置排序 true为正序、flase为倒序,实现ORDER BY executionDate ASC
    groupTerms.order(BucketOrder.key(true));
    
    //2.聚合avg(executionTime),取名executionTimeAvg
    AvgAggregationBuilder timeAvg = AggregationBuilders.avg("executionTimeAvg").field("executionTime");
    
    //3.两个aggregation父子关系
    groupTerms.subAggregation(timeAvg);
    
    //查询条件
    BoolQueryBuilder queryBuilder = QueryBuilders.boolQuery();
    RangeQueryBuilder rangeDateQuery = QueryBuilders.rangeQuery("executionDate").gte(DateFormatUtils.format(fromDate, "yyyy-MM-dd HH:mm:ss"))
            .lte(DateFormatUtils.format(toDate, "yyyy-MM-dd HH:mm:ss"));
    queryBuilder.must(rangeDateQuery);
    //如果方法名不为空,则添加查询条件
    if (StringUtils.isNotBlank(condition.getMethodName())) {
        queryBuilder.must(new TermQueryBuilder("methodName", condition.getMethodName()));
    }
    SearchResponse searchRes = esClient.prepareSearch(CommonConstants.EHR_EXCUTION_AOP_LOG_INDEX)
            .setTypes(CommonConstants.EHR_EXCUTION_AOP_LOG_TYPE)
            .setQuery(queryBuilder)
            .addAggregation(groupTerms)  //设置聚合
            .addSort("executionDate", SortOrder.ASC)
            .get();
    
    // 获取结果,提取结果的顺序与aggregation组合关系一致,先提取外层executionDateGroup,再提取多个分组的子层executionTimeAvg
    Aggregation executionDateGroup = searchRes.getAggregations().get("executionDateGroup");
    Terms timeAvgTerms = null;
    if (executionDateGroup instanceof Terms) {
        //外层aggregation
        timeAvgTerms = (Terms) executionDateGroup;
        List<? extends Terms.Bucket> buckets = timeAvgTerms.getBuckets();
        for (Terms.Bucket elem : buckets) {
            //子级aggregation
            InternalAvg executionTimeAvg = (InternalAvg) elem.getAggregations().get("executionTimeAvg");
            ExecutionInfoDto executionInfoDto = new ExecutionInfoDto();
            executionInfoDto.setExecutionDate(elem.getKeyAsString());
            executionInfoDto.setExecutionTime((int)executionTimeAvg.getValue());
            executionInfoDtoList.add(executionInfoDto);
        }
    }
OR条件查询

ES条件查询中有MUST/MUSTNOT/SHOULD逻辑,其中MUST/MUSTNOT与sql中的AND/NOT理解和用法基本一致,但SHOULD则与sql中的OR有些不一样,在ES中如果要表示OR查询,则需要配合MUST一起使用,即MUST(SHOULD A, SHOULD B),表示A OR B
HTTP POST请求体

{
    "query": {
        "bool": {
            "must": {
                //or条件组装
                "bool" : { 
                    "should": [
                        { "match": { "about": "music" }},
                        { "match": { "about": "climb" }} ] 
                }
            },
            "must": {
                "match": { "first_nale": "John" }
            },
            "must_not": {
                "match": {"last_name": "Smith" }
            }
        }
    }
}

JAVA代码示例

    BoolQueryBuilder pinyinQuery = QueryBuilders.boolQuery();
    if (!keyword.contains("-")) {
        pinyinQuery.should(QueryBuilders.matchQuery("userName.pinyin", pinyin));
    }
    pinyinQuery.should(QueryBuilders.termQuery("userId", keyword));
          
    BoolQueryBuilder queryEs = QueryBuilders.boolQuery()
            .must(pinyinQuery)
            .must(new TermQueryBuilder("officeStatus", "1"));
    // 构造highlight
    HighlightBuilder hiBuilder= new HighlightBuilder();
    hiBuilder.preTags("<h2>").postTags("</h2>").field("userName.pinyin");
    SearchResponse scrollRes =
            client.prepareSearch(ES_INDEX_NAME)
                    .setTypes(ES_INDEX_USER_TYPE)
                    .setQuery(queryEs)
                    .highlighter(hiBuilder)
                    .setScroll("10s")
                    .setSize(1000)
                    .get();
  • 1
    点赞
  • 5
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值