broker druid 查询_Druid官方文档翻译：groupBy查询

最新推荐文章于 2023-08-18 20:40:12 发布

王克丹

最新推荐文章于 2023-08-18 20:40:12 发布

阅读量323

点赞数

文章标签： broker druid 查询

本文链接：https://blog.csdn.net/weixin_31173113/article/details/111925074

版权

groupBy Queries

These types of queries take a groupBy query object and return an array of JSON objects where each object represents a grouping asked for by the query.

Note: If you are doing aggregations with time as your only grouping, or an ordered groupBy over a single dimension, consider Timeseries and TopN queries as well as groupBy. Their performance may be better in some cases. See Alternatives below for more details.

这些类型的查询接受groupBy查询对象并返回JSON对象数组，其中每个对象表示查询所请求的分组。

注意：如果将时间作为唯一的分组进行聚合，或者在单个维度上进行有序的groupBy查询，请考虑Timeseries和TopN查询。在某些情况下，他们的表现可能更好。有关详细信息，请参阅下面的备选方案。

An example groupBy query object is shown below:

{

"queryType": "groupBy",

"dataSource": "sample_datasource",

"granularity": "day",

"dimensions": ["country", "device"],

"limitSpec": { "type": "default", "limit": 5000, "columns": ["country", "data_transfer"] },

"filter": {

"type": "and",

"fields": [

{ "type": "selector", "dimension": "carrier", "value": "AT&T" },

{ "type": "or",

"fields": [

{ "type": "selector", "dimension": "make", "value": "Apple" },

{ "type": "selector", "dimension": "make", "value": "Samsung" }

]

}

]

"aggregations": [

{ "type": "longSum", "name": "total_usage", "fieldName": "user_count" },

{ "type": "doubleSum", "name": "data_transfer", "fieldName": "data_transfer" }

"postAggregations": [

{ "type": "arithmetic",

"name": "avg_usage",

"fn": "/",

"fields": [

{ "type": "fieldAccess", "fieldName": "data_transfer" },

{ "type": "fieldAccess", "fieldName": "total_usage" }

]

}

"intervals": [ "2012-01-01T00:00:00.000/2012-01-03T00:00:00.000" ],

"having": {

"type": "greaterThan",

"aggregation": "total_usage",

"value": 100

}

Following are main parts to a groupBy query:propertydescriptionrequired?queryTypeThis String should always be “groupBy”; this is the first thing Druid looks at to figure out how to interpret the queryyes

dataSourceA String or Object defining the data source to query, very similar to a table in a relational database. See DataSource for more information.yes

dimensionsA JSON list of dimensions to do the groupBy over; or see DimensionSpec for ways to extract dimensions.yes

limitSpecno

havingno

granularityDefines the granularity of the query. See Granularitiesyes

filterno

aggregationsno

postAggregationsno

intervalsA JSON Object representing ISO-8601 Intervals. This defines the time ranges to run the query over.yes

subtotalsSpecA JSON array of arrays to return additional result sets for groupings of subsets of top level dimensions. It is described later in more detail.no

contextAn additional JSON Object which can be used to specify certain flags.no

To pull it all together, the above query would return n*m data points, up to a maximum of 5000 points, where n is the cardinality of the country dimension, m is the cardinality of the devicedimension, each day between 2012-01-01 and 2012-01-03, from the sample_datasource table. Each data point contains the (long) sum of total_usage if the value of the data point is greater than 100, the (double) sum of data_transfer and the (double) result of total_usage divided by data_transfer for the filter set for a particular grouping of country and device.

The output looks like this:

[

{

"version" : "v1",

"timestamp" : "2012-01-01T00:00:00.000Z",

"event" : {

"country" : ,

"device" : ,

"total_usage" : ,

"data_transfer" :,

"avg_usage" :

}

{

"version" : "v1",

"timestamp" : "2012-01-01T00:00:12.000Z",

"event" : {

"dim1" : ,

"dim2" : ,

"sample_name1" : ,

"sample_name2" :,

"avg_usage" :

}

...

]

Behavior on multi-value dimensions

groupBy queries can group on multi-value dimensions. When grouping on a multi-value dimension, all values from matching rows will be used to generate one group per value. It’s possible for a query to return more groups than there are rows. For example, a groupBy on the dimension tags with filter "t1" AND "t3" would match only row1, and generate a result with three groups: t1, t2, and t3. If you only need to include values that match your filter, you can use a filtered dimensionSp

最低0.47元/天解锁文章

王克丹

关注

0
点赞
踩
0

收藏

觉得还不错? 一键收藏
0
评论
broker druid 查询_Druid官方文档翻译：groupBy查询

groupBy QueriesThese types of queries take a groupBy query object and return an array of JSON objects where each object represents a grouping asked for by the query.Note: If you are doing aggregations...
复制链接

扫一扫