es主要解决简单大量数据的搜索问题,不擅长进行数据的汇总聚合计算。因为es是分布式搜索,对于数据的汇总聚合并且排序存在误差的问题。我们进行问题的复现、问题的原因和解决的方法。
1.es 汇总排序问题剖析
1.1 创建erp信息索引
put os_erp_data
{
"settings": {
"number_of_shards": "2",
"number_of_replicas": "1"
},
"mappings": {
"os_erp_data_type": {
"properties": {
"erp_id": {
"type": "keyword"
},
"order_d7_sale_qtty": {
"type": "integer"
}
}
}
}
}
1.2 索引插入文档
分片1
os_erp_data/os_erp_data_type?routing=a
{"erp_id":"erp_14","order_d7_sale_qtty":22491}
{"erp_id":"erp_8","order_d7_sale_qtty":21632}
{"erp_id":"erp_4","order_d7_sale_qtty":21502}
{"erp_id":"erp_15","order_d7_sale_qtty":21234}
{"erp_id":"erp_26","order_d7_sale_qtty":20731}
{"erp_id":"erp_10","order_d7_sale_qtty":20306}
{"erp_id":"erp_17","order_d7_sale_qtty":19942}
{"erp_id":"erp_9","order_d7_sale_qtty":19418}
{"erp_id":"erp_25","order_d7_sale_qtty":19191}
{"erp_id":"erp_16","order_d7_sale_qtty":18797}
{"erp_id":"erp_6","order_d7_sale_qtty":18306}
{"erp_id":"erp_3","order_d7_sale_qtty":18166}
{"erp_id":"erp_22","order_d7_sale_qtty":17669}
{"erp_id":"erp_24","order_d7_sale_qtty":16971}
{"erp_id":"erp_27","order_d7_sale_qtty":16911}
{"erp_id":"erp_18","order_d7_sale_qtty":16758}
{"erp_id":"erp_23","order_d7_sale_qtty":16527}
{"erp_id":"erp_13","order_d7_sale_qtty":15705}
{"erp_id":"erp_7","order_d7_sale_qtty":15251}
{"erp_id":"erp_11","order_d7_sale_qtty":15019}
{"erp_id":"erp_12","order_d7_sale_qtty":14387}
{"erp_id":"erp_2","order_d7_sale_qtty":14329}
{"erp_id":"erp_30","order_d7_sale_qtty":14023}
{"erp_id":"erp_5","order_d7_sale_qtty":13421}
{"erp_id":"erp_29","order_d7_sale_qtty":13309}
{"erp_id":"erp_1","order_d7_sale_qtty":12574}
{"erp_id":"erp_28","order_d7_sale_qtty":12189}
{"erp_id":"erp_19","order_d7_sale_qtty":11673}
{"erp_id":"erp_21","order_d7_sale_qtty":11460}
{"erp_id":"erp_20","order_d7_sale_qtty":10576}
分片2
os_erp_data?routing=b
{"erp_id":"erp_19","order_d7_sale_qtty":168589}
{"erp_id":"erp_21","order_d7_sale_qtty":164705}
{"erp_id":"erp_16","order_d7_sale_qtty":162088}
{"erp_id":"erp_9","order_d7_sale_qtty":161579}
{"erp_id":"erp_8","order_d7_sale_qtty":160459}
{"erp_id":"erp_28","order_d7_sale_qtty":159775}
{"erp_id":"erp_15","order_d7_sale_qtty":158124}
{"erp_id":"erp_26","order_d7_sale_qtty":156609}
{"erp_id":"erp_24","order_d7_sale_qtty":156208}
{"erp_id":"erp_11","order_d7_sale_qtty":153976}
{"erp_id":"erp_4","order_d7_sale_qtty":153479}
{"erp_id":"erp_23","order_d7_sale_qtty":152833}
{"erp_id":"erp_12","order_d7_sale_qtty":152052}
{"erp_id":"erp_20","order_d7_sale_qtty":150718}
{"erp_id":"erp_29","order_d7_sale_qtty":150320}
{"erp_id":"erp_17","order_d7_sale_qtty":149352}
{"erp_id":"erp_10","order_d7_sale_qtty":148473}
{"erp_id":"erp_2","order_d7_sale_qtty":147812}
{"erp_id":"erp_5","order_d7_sale_qtty":147791}
{"erp_id":"erp_3","order_d7_sale_qtty":146158}
{"erp_id":"erp_6","order_d7_sale_qtty":145604}
{"erp_id":"erp_7","order_d7_sale_qtty":145439}
{"erp_id":"erp_18","order_d7_sale_qtty":144984}
{"erp_id":"erp_13","order_d7_sale_qtty":144784}
{"erp_id":"erp_14","order_d7_sale_qtty":144004}
{"erp_id":"erp_27","order_d7_sale_qtty":143564}
{"erp_id":"erp_30","order_d7_sale_qtty":140984}
{"erp_id":"erp_22","order_d7_sale_qtty":140309}
{"erp_id":"erp_25","order_d7_sale_qtty":133879}
{"erp_id":"erp_1","order_d7_sale_qtty":133233}
1.3 问题复现
os_erp_data/_search
{
"size":0,
"aggs":{
"erp_name":{
"terms":{
"field":"erp_id",
"size" :5,
"order":{
"order_d7_sale_qtty_sum":"desc"
}
}
,
"aggs":{
"order_d7_sale_qtty_sum":{
"sum":{
"field":"order_d7_sale_qtty"
}
}}
}
}
}
总结:
- 在聚合排序的操作中,实际上是每个分片自身先做汇总排序,然后将每个分片的前17名放在一起再次聚合,再排序,将排序后的前5条记录作为结果返回;
- 为什么用每个分片的前17名?这是用官方给出的算式得来的,地址是:Terms Aggregation | Elasticsearch Reference [6.1] | Elastic ,如下图:
shard_size =size*1.5+10。
1.4 调整 shard_size 参数,设置过大会有性能问题
os_erp_data/_search
{
"size":0,
"aggs":{
"erp_name":{
"terms":{
"field":"erp_id",
"size" :5,
"shard_size": 28,
"order":{
"order_d7_sale_qtty_sum":"desc"
}
}
,
"aggs":{
"order_d7_sale_qtty_sum":{
"sum":{
"field":"order_d7_sale_qtty"
}
}}
}
}
}
两种解决方式:调整 shard_size 大小 和 需要统计排序的数据放到一个分片中。
大量的数据实时聚合计算,建议更换为 clickhouse,更适合做这些事情。