Elasticsearch series: aggregating on multiple fields in ES (select A,B,COUNT(*) from table group by A,B)

Ajekseg

Category: java. Tags: elasticsearch, big data, search engine, spring, operations

Article link: https://blog.csdn.net/ajekseg/article/details/126660878

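This article shows how to run a multi-field aggregation query in Elasticsearch that groups on the combination of the SEX and PROF fields and counts each group. The correct approach is to concatenate the fields with a Groovy script and aggregate on the result. The wrong approach is to aggregate on each field separately, which produces results that are not combined as intended. The same effect can be achieved through the Java API, but Groovy script support must be enabled.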

**Aggregating on multiple fields in ES: select A,B,COUNT(*) from table group by A,B**

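Suppose we have the following table (男 = male, 女 = female; 副教授 = associate professor, 讲师 = lecturer, 助教 = teaching assistant):

| NAME | SEX | PROF |
| --- | --- | --- |
| 李诚 | 男 | 副教授 |
| 张旭 | 男 | 讲师 |
| 王萍 | 女 | 助教 |
| 刘冰 | 女 | 助教 |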

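We want the equivalent of: select SEX,PROF,COUNT(*) from table group by SEX,PROF

1. The correct answer:

Edit the elasticsearch.yml configuration file, add the following two settings, and restart the ES cluster. These settings only apply to versions that still ship the Groovy engine (Groovy scripting was deprecated in Elasticsearch 5.0 and removed in 6.0).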

script.engine.groovy.inline.aggs: on

script.engine.groovy.inline.search: on

{
    "size": 0,
    "query": {
        "match_all": {}
    },
    "aggs": {
        "sexprof": {
            "terms": {
                "script": {
                    "inline": "doc['SEX.keyword'].value +'-split-'+ doc['PROF.keyword'].value "
                }
            }
        }
    }
}

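Java API:

// Inline Groovy script that joins SEX and PROF into a single bucket key
Script script = new Script(ScriptType.INLINE, "groovy", "doc['SEX.keyword'].value+'-split-'+doc['PROF.keyword'].value", new HashMap<String, Object>());

// Terms aggregation named "sexprof" that buckets on the script's result
TermsAggregationBuilder callTypeTeamAgg = AggregationBuilders.terms("sexprof").script(script);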
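The script string is easy to mistype once more than two fields are involved. As a minimal sketch, the same expression could be generated from a list of field names. The `ConcatScriptBuilder` class and its `build` method are hypothetical helpers introduced here for illustration, not part of the Elasticsearch API, and the sketch assumes field names and the separator contain no single quotes.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class ConcatScriptBuilder {
    // Hypothetical helper: builds "doc['A'].value+'<sep>'+doc['B'].value+..."
    static String build(List<String> fields, String separator) {
        if (fields.isEmpty()) {
            throw new IllegalArgumentException("at least one field is required");
        }
        // A single quote would terminate the string literals inside the script.
        if (separator.contains("'")) {
            throw new IllegalArgumentException("separator must not contain a single quote");
        }
        for (String field : fields) {
            if (field.contains("'")) {
                throw new IllegalArgumentException("field must not contain a single quote: " + field);
            }
        }
        return fields.stream()
                .map(field -> "doc['" + field + "'].value")
                .collect(Collectors.joining("+'" + separator + "'+"));
    }

    public static void main(String[] args) {
        // Prints the same script source that is passed to the Script constructor above.
        System.out.println(build(Arrays.asList("SEX.keyword", "PROF.keyword"), "-split-"));
    }
}
```

The returned string can be passed as the script source in place of the hand-written literal. Pick a separator that cannot occur inside any field value, otherwise the combined key cannot be split back unambiguously.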

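With this approach the SEX and PROF columns come back together in a single key, separated by "-split-", and you split the key yourself after receiving the result. The result looks roughly like this (irrelevant information omitted):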

{
    "aggregations": {
        "sexprof": {
            "doc_count_error_upper_bound": 5,
            "sum_other_doc_count": 379,
            "buckets": [
                {
                    "key": "女-split-助教",
                    "doc_count": 2
                },
                {
                    "key": "男-split-讲师",
                    "doc_count": 1
                },
                {
                    "key": "男-split-副教授",
                    "doc_count": 1
                }
            ]
        }
    }
}
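Splitting the combined keys back into columns could look like the sketch below. It assumes the buckets have already been read into a map of key to doc_count (how you extract them depends on your client version), and `CompositeKeySplitter` is a hypothetical helper name. The `main` method uses ASCII stand-in values rather than the table's real data.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

class CompositeKeySplitter {
    // Must match the separator used in the aggregation script.
    static final String SEPARATOR = "-split-";

    // Splits one bucket key into its field values; limit -1 keeps trailing empty values.
    static String[] splitKey(String key) {
        return key.split(Pattern.quote(SEPARATOR), -1);
    }

    // Turns {combined key -> doc_count} into rows of [SEX, PROF, COUNT].
    static List<String[]> toRows(Map<String, Long> buckets) {
        List<String[]> rows = new ArrayList<>();
        for (Map.Entry<String, Long> bucket : buckets.entrySet()) {
            String[] parts = splitKey(bucket.getKey());
            if (parts.length != 2) {
                // Happens if a field value itself contains the separator.
                throw new IllegalArgumentException("unexpected key: " + bucket.getKey());
            }
            rows.add(new String[] {parts[0], parts[1], String.valueOf(bucket.getValue())});
        }
        return rows;
    }

    public static void main(String[] args) {
        // Stand-in buckets (assumed values, for illustration only).
        Map<String, Long> buckets = new LinkedHashMap<>();
        buckets.put("F" + SEPARATOR + "assistant", 2L);
        buckets.put("M" + SEPARATOR + "lecturer", 1L);
        for (String[] row : toRows(buckets)) {
            System.out.println(String.join(",", row));
        }
    }
}
```

Applied to the response above, the key "女-split-助教" with doc_count 2 would become the row 女, 助教, 2, which is the shape the SQL query returns.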

2. The wrong answer:

{
    "query": {
        "match_all": {}
    },
    "aggs": {
        "sex": {
            "terms": {
                "field": "SEX.keyword"
            }
        },
        "prof": {
            "terms": {
                "field": "PROF.keyword"
            }
        }
    }
}

The result looks roughly like this (irrelevant information omitted). Each field is counted separately, which is clearly not what we want:

{
    "aggregations": {
        "sex": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
                {
                    "key": "女",
                    "doc_count": 2
                },
                {
                    "key": "男",
                    "doc_count": 2
                }
            ]
        },
        "prof": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
                {
                    "key": "助教",
                    "doc_count": 2
                },
                {
                    "key": "副教授",
                    "doc_count": 1
                },
                {
                    "key": "讲师",
                    "doc_count": 1
                }
            ]
        }
    }
}