Two Cases for Count Aggregation on a Field Inside an Array

Latest recommended article published 2023-03-15 20:50:41

goxingman


Category: es

Article link: https://blog.csdn.net/goxingman/article/details/116585144


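1. Count a value once per doc, no matter how many times it appears in that doc's array: map the array as a plain object, not nested

Example: if you aggregate on the id inside files in the data below, id 22 gets a count of 1 even though it appears twice, because the aggregation counts docs, not array elements.

Scenario: suppose each doc is a hobby and the array holds people's names; this aggregation then tells you how many hobbies each person has.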

"_source": {
	"id": 1,
	"files": [{
			"id": "22",
			"filename": "方法"
		},
		{
			"id": "22",
			"filename": "方哈"
		}
	]
}

Corresponding mapping:

{
	"test1": {
		"mappings": {
			"test1": {
				"properties": {
					"files": {
						"properties": {
							"filename": {
								"type": "text",
								"fields": {
									"keyword": {
										"type": "keyword",
										"ignore_above": 256
									}
								}
							},
							"id": {
								"type": "keyword"
							}
						}
					},
					"id": {
						"type": "keyword"
					}
				}
			}
		}
	}
}

Code: this is just a plain terms aggregation. Examples are easy to find online, so none is given here.
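To make the counting rule of this first case concrete, here is a minimal plain-Java sketch that simulates it. It does not call Elasticsearch; the class `DocCountSketch`, the method `countDocsPerId`, and the second document with id 33 are all hypothetical names and data introduced only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Plain-Java simulation (not Elasticsearch code) of the counting rule for a
// non-nested object array: each value is counted once per document.
class DocCountSketch {

    // Each inner list holds the files.id values of one document.
    static Map<String, Long> countDocsPerId(List<List<String>> docs) {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (List<String> fileIds : docs) {
            // Duplicates inside one document collapse, so the doc counts once.
            Set<String> distinctIds = new LinkedHashSet<>(fileIds);
            for (String id : distinctIds) {
                counts.merge(id, 1L, Long::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<List<String>> docs = List.of(
                List.of("22", "22"),   // the sample document from this section
                List.of("22", "33"));  // hypothetical second document
        System.out.println(countDocsPerId(docs)); // prints {22=2, 33=1}
    }
}
```

With only the sample document as input, the result is 22 -> 1, which matches the behaviour described above.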

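2. Count a value as many times as it appears in a doc's array: map the array as nested. The same data is used as the example.

Example: if you aggregate on the id inside files in the data below, id 22 appears twice, so it gets a count of 2. The aggregation counts the array elements carrying that id, not docs.

Scenario: suppose each doc is an application form and the array lists the formats of the materials submitted with it; this aggregation gives the number of submitted materials per format across all applications.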

"_source": {
	"id": 1,
	"files": [{
			"id": "22",
			"filename": "方法"
		},
		{
			"id": "22",
			"filename": "方哈"
		}
	]
}

Corresponding mapping:

{
	"test1": {
		"mappings": {
			"test1": {
				"properties": {
					"files": {
						"type": "nested",
						"properties": {
							"filename": {
								"type": "text",
								"fields": {
									"keyword": {
										"type": "keyword",
										"ignore_above": 256
									}
								}
							},
							"id": {
								"type": "keyword"
							}
						}
					},
					"id": {
						"type": "keyword"
					}
				}
			}
		}
	}
}

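Code:

    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import org.elasticsearch.search.aggregations.AggregationBuilders;
    import org.elasticsearch.search.aggregations.Aggregations;
    import org.elasticsearch.search.aggregations.bucket.nested.NestedAggregationBuilder;
    import org.elasticsearch.search.aggregations.bucket.nested.ParsedNested;
    import org.elasticsearch.search.aggregations.bucket.terms.ParsedStringTerms;
    import org.elasticsearch.search.aggregations.bucket.terms.Terms;
    import org.elasticsearch.search.aggregations.bucket.terms.TermsAggregationBuilder;
    import org.springframework.data.elasticsearch.core.aggregation.AggregatedPage;
    import org.springframework.data.elasticsearch.core.query.NativeSearchQueryBuilder;

    // Nested terms aggregation. path is the path down to the array (e.g. "files");
    // field is the path down to the field to aggregate (e.g. "files.id").
    private <T> Map<String, Long> getAgg(String path, String field, Class<T> clazz) {
        String KEY1 = "key1"; // name of the inner terms aggregation
        String KEY2 = "key2"; // name of the outer nested aggregation
        Map<String, Long> typeCount = new HashMap<>();
        NativeSearchQueryBuilder nsq = new NativeSearchQueryBuilder();
        TermsAggregationBuilder testCount = AggregationBuilders.terms(KEY1).field(field);
        NestedAggregationBuilder test = AggregationBuilders.nested(KEY2, path).subAggregation(testCount);
        nsq.addAggregation(test);
        // Query with the class passed in instead of a hard-coded entity type.
        AggregatedPage<T> res = elasticsearchRestTemplate.queryForPage(nsq.build(), clazz);

        Aggregations aggregations = res.getAggregations();
        ParsedNested nested = aggregations.get(KEY2);
        // ParsedStringTerms assumes the aggregated field is a keyword (string) field.
        ParsedStringTerms fileType = nested.getAggregations().get(KEY1);
        List<? extends Terms.Bucket> buckets = fileType.getBuckets();
        for (Terms.Bucket bucket : buckets) {
            // Inside a nested aggregation, doc count is the number of nested objects.
            typeCount.put(bucket.getKeyAsString(), bucket.getDocCount());
        }
        return typeCount;
    }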
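For contrast with the first case, the counting rule of the nested case can be simulated in plain Java as well. This is only a sketch of the rule, not of how Elasticsearch implements it; the class `NestedCountSketch` and the method `countNestedObjectsPerId` are hypothetical names used for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java simulation (not Elasticsearch code) of the counting rule for a
// nested array: every array element is counted, including duplicates.
class NestedCountSketch {

    // Each inner list holds the files.id values of one document.
    static Map<String, Long> countNestedObjectsPerId(List<List<String>> docs) {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (List<String> fileIds : docs) {
            // No de-duplication: each array element contributes one count.
            for (String id : fileIds) {
                counts.merge(id, 1L, Long::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // The sample document from this section: two files, both with id "22".
        List<List<String>> docs = List.of(List.of("22", "22"));
        System.out.println(countNestedObjectsPerId(docs)); // prints {22=2}
    }
}
```

The only difference from the first case is the missing de-duplication step, which is why the same sample document yields 2 here and 1 there.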
