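Weighted average aggregation: weighted_avg
Here is a summary of how the average is computed when the aggregation uses weights:
1. First decide on the value field and the weight field; both must be numeric.
2. The weight field:
If all weights are equal, the result matches a regular average.
If all weights are 0, the average is null.
If the weights differ, the result is computed with the weighted-average formula given later in this post.
Enough talk; let's go straight to the code and verify this.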
1. Create the index
// Create the index
PUT /demo_avg_test
{
"mappings": {
"properties": {
"lesson":{
"type": "text"
},
"scores": {
"type": "double"
},
"weighttest": {
"type": "integer"
}
}
}
}
2. Import the documents, either one at a time or in bulk
- Single-document import
// Import documents one at a time
PUT /demo_avg_test/_doc/1
{
"id":1,
"lesson":"语言",
"scores":10,
"weighttest":20
}
PUT /demo_avg_test/_doc/2
{
"id":2,
"lesson":"历史",
"scores":20,
"weighttest":40
}
PUT /demo_avg_test/_doc/3
{
"id":3,
"lesson":"数学",
"scores":50,
"weighttest":100
}
- Bulk import
// Import the documents in bulk
PUT /_bulk
{"index":{"_index":"demo_avg_test","_id":1}}
{"lesson":"语言","scores":10,"weighttest":20}
{"index":{"_index":"demo_avg_test","_id":2}}
{"lesson":"历史","scores":20,"weighttest":40}
{"index":{"_index":"demo_avg_test","_id":3}}
{"lesson":"数学","scores":50,"weighttest":100}
3. Query everything to verify the data
// Query all documents
GET /demo_avg_test/_search
{
"query": {"match_all": {}}
}
4. Compute the weighted average with an aggregation
# Weighted average aggregation query
GET /demo_avg_test/_search
{
"size": 3,
"aggs": {
"weighted_type": {
"weighted_avg": {
"value": {
"field": "scores"
},
"weight": {
"field": "weighttest"
}
}
}
}
}
Notes:
- Value field (the one being averaged): scores
- Weight field: weighttest
How the average score is calculated:
Weighted average = ∑(value * weight) / ∑(weight)
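The formula above can be sketched in plain Python to check the three cases in this post by hand. This is only an illustration: `weighted_avg` is a hypothetical helper, not an Elasticsearch API, and returning `None` when the weights sum to 0 is an assumption made to mirror the null result shown in this post.

```python
from typing import Optional, Sequence


def weighted_avg(values: Sequence[float], weights: Sequence[float]) -> Optional[float]:
    """Return sum(value * weight) / sum(weight), or None when the weights sum to 0."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        # Assumption: imitate the null reported when every weight is 0.
        return None
    return sum(v * w for v, w in zip(values, weights)) / total_weight


scores = [10, 20, 50]
print(weighted_avg(scores, [20, 40, 100]))    # weights differ
print(weighted_avg(scores, [0, 0, 0]))        # all weights are 0
print(weighted_avg(scores, [100, 100, 100]))  # all weights are equal
```

The values and weights are the ones from the bulk-imported documents, so the three calls correspond to results 1, 2 and 3.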
1) When the weights differ
# Result 1
{
"took" : 281,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "demo_avg_test",
"_type" : "_doc",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"lesson" : "语言",
"scores" : 10,
"weighttest" : 20
}
},
{
"_index" : "demo_avg_test",
"_type" : "_doc",
"_id" : "2",
"_score" : 1.0,
"_source" : {
"lesson" : "历史",
"scores" : 20,
"weighttest" : 40
}
},
{
"_index" : "demo_avg_test",
"_type" : "_doc",
"_id" : "3",
"_score" : 1.0,
"_source" : {
"lesson" : "数学",
"scores" : 50,
"weighttest" : 100
}
}
]
},
"aggregations" : {
"weighted_type" : {
"value" : 37.5
}
}
}
Average score: 37.5
(10*20 + 20*40 + 50*100) / (20 + 40 + 100) = 6000 / 160 = 37.5
2) When all weights are 0, the average is null
# Result 2
{
"took" : 719,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "demo_avg_test",
"_type" : "_doc",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"lesson" : "语言",
"scores" : 10,
"weighttest" : 0
}
},
{
"_index" : "demo_avg_test",
"_type" : "_doc",
"_id" : "2",
"_score" : 1.0,
"_source" : {
"lesson" : "历史",
"scores" : 20,
"weighttest" : 0
}
},
{
"_index" : "demo_avg_test",
"_type" : "_doc",
"_id" : "3",
"_score" : 1.0,
"_source" : {
"lesson" : "数学",
"scores" : 50,
"weighttest" : 0
}
}
]
},
"aggregations" : {
"weighted_type" : {
"value" : null
}
}
}
Average score: null
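The post does not show how the weights were changed before this run. One possible way, assumed here rather than taken from the original, is the `_update_by_query` API with a small Painless script. The sketch below only builds that request body in Python and prints it, so it can be pasted into the Console; the helper name `build_weight_reset_body` and the parameter `new_weight` are made up for illustration.

```python
import json


def build_weight_reset_body(new_weight: int) -> dict:
    """Build an _update_by_query body that sets weighttest on every document."""
    return {
        "query": {"match_all": {}},
        "script": {
            "lang": "painless",
            # Passing the value through params keeps the script source constant.
            "source": "ctx._source.weighttest = params.new_weight",
            "params": {"new_weight": new_weight},
        },
    }


body = build_weight_reset_body(0)
print("POST /demo_avg_test/_update_by_query")
print(json.dumps(body, indent=2))
```

Calling the helper with 100 instead of 0 would produce the body for the equal-weights case.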
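3) When all weights are equal
# Result 3
# If every weight is set to the same value, the weight behaves like the constant 1
Below, every weight is set to 100, so the average is (10+20+50)/3 = 26.6666666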
{
"took" : 296,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "demo_avg_test",
"_type" : "_doc",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"lesson" : "语言",
"scores" : 10,
"weighttest" : 100
}
},
{
"_index" : "demo_avg_test",
"_type" : "_doc",
"_id" : "2",
"_score" : 1.0,
"_source" : {
"lesson" : "历史",
"scores" : 20,
"weighttest" : 100
}
},
{
"_index" : "demo_avg_test",
"_type" : "_doc",
"_id" : "3",
"_score" : 1.0,
"_source" : {
"lesson" : "数学",
"scores" : 50,
"weighttest" : 100
}
}
]
},
"aggregations" : {
"weighted_type" : {
"value" : 26.666666666666668
}
}
}
Average: about 26.67, the same as the plain average (10+20+50)/3
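Why equal weights give the plain average: with a common weight w, the numerator is w * ∑(value) and the denominator is n * w, so w cancels. A quick check in Python, as a sketch using the three scores from this post and a few arbitrarily chosen constant weights:

```python
import math
from statistics import mean

scores = [10, 20, 50]
plain_mean = mean(scores)  # (10 + 20 + 50) / 3

for w in (1, 5, 100):
    weights = [w] * len(scores)
    weighted = sum(s * x for s, x in zip(scores, weights)) / sum(weights)
    # The common factor w cancels out of numerator and denominator.
    assert math.isclose(weighted, plain_mean)
    print(w, weighted)
```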
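That's it for now. The weighted average aggregation can also be driven by scripts, and there are several cases around missing values and default parameters; I'll cover those later.
The official documentation offers many kinds of metrics aggregations: max, min, avg, sum, and so on.
I came across this one while reading the docs today, found it interesting, tried it myself, and am writing it down here.
Keep going! Advance one step every day without end; no effort is wasted, and it all reaches the sea eventually.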