libwebsockets Source Internals Analysis_ElasticSearch Aggregations GroupBy: Implementation Source Code Analysis

Latest recommended article published on 2024-05-14 13:24:04

金融四十人论坛



Article link: https://blog.csdn.net/weixin_30868873/article/details/113086030


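Preparation

To make debugging easier, I configured the index as follows:

{
  "mappings": {
    "my_type": {
      "properties": {
        "newtype": {
          "type": "string",
          "index": "not_analyzed"
        },
        "num": {
          "type": "integer"
        }
      }
    }
  },
  "settings": {
    "index": {
      "number_of_shards": 1,
      "number_of_replicas": 0
    }
  }
}

With this configuration there is only one shard, which makes the code easy to step through in an IDE. It is a handy trick for reading the source.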

Data

{
  "user": "kimchy",
  "post_date": "2009-11-15T14:12:12",
  "newtype": "abc",
  "message": "trying out Elasticsearch",
  "num": 10
}

Query

Assume the following query:

{
  "from": 0,
  "size": 0,
  "_source": {
    "includes": [ "AVG" ],
    "excludes": []
  },
  "aggregations": {
    "newtype": {
      "terms": {
        "field": "newtype",
        "size": 200
      },
      "aggregations": {
        "AVG(num)": {
          "avg": {
            "field": "num"
          }
        }
      }
    }
  }
}

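Its semantics are similar to this SQL statement:

SELECT avg(num) FROM twitter GROUP BY newtype

That is, group by the newtype field and then compute the average of num. In our real business systems, this kind of statistics request is also the most common one.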
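To make the group-by-then-average semantics concrete, here is a minimal in-memory sketch in Java. It is not Elasticsearch code: the class GroupByAvgSketch, the Doc type and the method avgNumByNewtype are hypothetical names introduced only for illustration, and the sample documents in main are made up.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory stand-in for the query above; not Elasticsearch code. */
final class GroupByAvgSketch {

    /** A document reduced to the two fields the aggregation touches. */
    static final class Doc {
        final String newtype;
        final int num;

        Doc(String newtype, int num) {
            this.newtype = newtype;
            this.num = num;
        }
    }

    /** Rough equivalent of: SELECT avg(num) FROM twitter GROUP BY newtype */
    static Map<String, Double> avgNumByNewtype(List<Doc> docs) {
        // Per group: [0] = running sum of num, [1] = document count.
        Map<String, long[]> acc = new LinkedHashMap<>();
        for (Doc d : docs) {
            long[] sumAndCount = acc.computeIfAbsent(d.newtype, k -> new long[2]);
            sumAndCount[0] += d.num;
            sumAndCount[1] += 1;
        }
        // Turn each (sum, count) pair into an average.
        Map<String, Double> result = new LinkedHashMap<>();
        for (Map.Entry<String, long[]> e : acc.entrySet()) {
            result.put(e.getKey(), (double) e.getValue()[0] / e.getValue()[1]);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Doc> docs = Arrays.asList(
                new Doc("abc", 10), new Doc("abc", 20), new Doc("xyz", 5));
        System.out.println(avgNumByNewtype(docs)); // prints {abc=15.0, xyz=5.0}
    }
}
```

The sketch keeps a running sum and count per group instead of collecting all values, which is the same reason an average can be computed in a single pass over the matching documents.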

The Phase concept

During a search, ES splits the whole query into several phases, roughly as follows:

QueryPhase
rescorePhase
suggestPhase
aggregationPhase
FetchPhase

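For full-text search there may also be a DFSPhase.

As an aside, when Spark SQL is combined with ES, the step that hurts response time the most is actually fetching the original source.

These phases are not arranged as a chain. Instead, one phase invokes another from within. This is obvious in the source code, as the following snippet shows: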

    // Create the AggregationContext that the aggregations need;
    // it holds the individual Aggregators.
    aggregationPhase.preProcess(searchContext);
    // The actual query, and the aggregation work as well, is done in this step.
    boolean rescore = execute(searchContext, searchContext.searcher());
    // For a full-text search that needs scoring.
    if (rescore) { // only if we do a regular search
        rescorePhase.execute(searchContext);
    }
    suggestPhase.execute(searchContext);
    // Retrieve the aggregation results.
    aggregationPhase.execute(searchContext);
}
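The "one phase calls another" structure can be sketched as follows. This is a simplified, hypothetical model rather than the Elasticsearch classes: Context, Phase, named and the trace strings are assumptions made for illustration, and only the order of the calls mirrors the excerpt above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of nested phases; the call order mirrors the excerpt, the types do not. */
final class NestedPhaseSketch {

    /** Shared per-request state; here it only records the order of calls. */
    static final class Context {
        final List<String> trace = new ArrayList<>();
        final boolean needsRescore;

        Context(boolean needsRescore) {
            this.needsRescore = needsRescore;
        }
    }

    interface Phase {
        void execute(Context ctx);
    }

    /** A phase that only records its own name. */
    static Phase named(String name) {
        return ctx -> ctx.trace.add(name);
    }

    /** The query phase owns the other phases and invokes them itself (no external chain). */
    static final class QueryPhase implements Phase {
        private final Phase rescorePhase = named("rescore");
        private final Phase suggestPhase = named("suggest");
        private final Phase aggregationPhase = named("aggregation.execute");

        @Override
        public void execute(Context ctx) {
            ctx.trace.add("aggregation.preProcess"); // set up aggregators before searching
            ctx.trace.add("query+collect");          // the search itself; collection happens here
            if (ctx.needsRescore) {
                rescorePhase.execute(ctx);
            }
            suggestPhase.execute(ctx);
            aggregationPhase.execute(ctx);           // build results from what was collected
        }
    }

    public static void main(String[] args) {
        Context ctx = new Context(false);
        new QueryPhase().execute(ctx);
        // prints [aggregation.preProcess, query+collect, suggest, aggregation.execute]
        System.out.println(ctx.trace);
    }
}
```

The point of the sketch is that the caller only ever invokes the query phase; the ordering of the other phases is decided inside it.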

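Aggregation-related concepts

To understand how aggregation is actually implemented, you need to know the aggregator-related concepts in ES. There are roughly five:

AggregatorFactory (a typical factory pattern): responsible for creating Aggregator instances
Aggregator (the class that supplies the collector and implements the actual aggregation logic)
Aggregations (the aggregation results)
PipelineAggregator (further processes the aggregation results)
Nesting of Aggregators: in the example, AvgAggregator operates on the relevant data per bucket, using the buckets produced by GlobalOrdinalsStringTermsAggregator. This nested structure is also ...
A Bucket is essentially the numeric representation of the group-by field. Representing it as a number can ...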
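As a rough illustration of buckets as numbers and of a nested aggregator, the sketch below assigns each distinct term an int ordinal and lets a child average aggregator index its arrays by that ordinal. It is a simplified, hypothetical model: the classes here only borrow the names TermsAggregator and AvgAggregator, the HashMap-style ordinal assignment is an assumption standing in for however ES derives its ordinals, and the factory and collector plumbing are left out.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of bucket ordinals and nested aggregators; not Elasticsearch code. */
final class BucketOrdinalSketch {

    /** Child aggregator: receives a value together with the bucket ordinal of its parent. */
    static final class AvgAggregator {
        private long[] sums = new long[4];
        private long[] counts = new long[4];

        void collect(int value, int bucketOrd) {
            if (bucketOrd >= sums.length) { // grow so that the ordinal is a valid index
                int newLen = Math.max(bucketOrd + 1, sums.length * 2);
                sums = Arrays.copyOf(sums, newLen);
                counts = Arrays.copyOf(counts, newLen);
            }
            sums[bucketOrd] += value;
            counts[bucketOrd]++;
        }

        double avg(int bucketOrd) {
            return (double) sums[bucketOrd] / counts[bucketOrd];
        }
    }

    /** Parent aggregator: turns each term into an int ordinal and forwards to the child. */
    static final class TermsAggregator {
        private final Map<String, Integer> ordinals = new LinkedHashMap<>();
        private final AvgAggregator sub = new AvgAggregator();

        void collect(String term, int num) {
            Integer ord = ordinals.get(term);
            if (ord == null) {
                ord = ordinals.size(); // next free ordinal: 0, 1, 2, ...
                ordinals.put(term, ord);
            }
            sub.collect(num, ord);
        }

        /** Ordinals are resolved back to terms only when the final result is built. */
        Map<String, Double> buildResult() {
            Map<String, Double> out = new LinkedHashMap<>();
            for (Map.Entry<String, Integer> e : ordinals.entrySet()) {
                out.put(e.getKey(), sub.avg(e.getValue()));
            }
            return out;
        }
    }

    public static void main(String[] args) {
        TermsAggregator terms = new TermsAggregator();
        terms.collect("abc", 10);
        terms.collect("xyz", 4);
        terms.collect("abc", 30);
        System.out.println(terms.buildResult()); // prints {abc=20.0, xyz=4.0}
    }
}
```

In this sketch the child never sees the term string at all; it works purely on array slots, which is one plausible benefit of representing buckets as numbers.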
