The Rollup Feature in Apache Druid

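This article walks through loading the sample data into Apache Druid and configuring a rollup ingestion spec that aggregates rows by srcIP and dstIP at minute granularity, producing a row count, a packet total, and a byte total. It shows the settings in rollup-index.json, how the ingestion task is submitted, and the resulting query output.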

1. Load the sample data

Read the data from quickstart/tutorial/rollup-data.json and create a data source named rollup-tutorial.

rollup-data.json contains network flow records, one JSON object per line. Its contents are as follows:

[root@bigdata001 quickstart]# cd ..
[root@bigdata001 apache-druid-0.22.1]# cat quickstart/tutorial/rollup-data.json 
{"timestamp":"2018-01-01T01:01:35Z","srcIP":"1.1.1.1", "dstIP":"2.2.2.2","packets":20,"bytes":9024}
{"timestamp":"2018-01-01T01:01:51Z","srcIP":"1.1.1.1", "dstIP":"2.2.2.2","packets":255,"bytes":21133}
{"timestamp":"2018-01-01T01:01:59Z","srcIP":"1.1.1.1", "dstIP":"2.2.2.2","packets":11,"bytes":5780}
{"timestamp":"2018-01-01T01:02:14Z","srcIP":"1.1.1.1", "dstIP":"2.2.2.2","packets":38,"bytes":6289}
{"timestamp":"2018-01-01T01:02:29Z","srcIP":"1.1.1.1", "dstIP":"2.2.2.2","packets":377,"bytes":359971}
{"timestamp":"2018-01-01T01:03:29Z","srcIP":"1.1.1.1", "dstIP":"2.2.2.2","packets":49,"bytes":10204}
{"timestamp":"2018-01-02T21:33:14Z","srcIP":"7.7.7.7", "dstIP":"8.8.8.8","packets":38,"bytes":6289}
{"timestamp":"2018-01-02T21:33:45Z","srcIP":"7.7.7.7", "dstIP":"8.8.8.8","packets":123,"bytes":93999}
{"timestamp":"2018-01-02T21:35:45Z","srcIP":"7.7.7.7", "dstIP":"8.8.8.8","packets":12,"bytes":2818}
[root@bigdata001 apache-druid-0.22.1]#

rollup-index.json, shown in full after this list, makes the following settings:

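  1. "rollup" : true enables rollup.
  2. The dimensions block lists the srcIP and dstIP fields, which act as the grouping (group by) columns.
  3. The metricsSpec block defines the metric columns, equivalent to count(*) as count, sum(packets) as packets, sum(bytes) as bytes.
  4. "queryGranularity" : "minute" sets minute-level granularity. Before grouping, each input row's timestamp is truncated to the start of its minute, and the truncated timestamp is then used as a grouping column alongside the dimensions.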
[root@bigdata001 apache-druid-0.22.1]# 
[root@bigdata001 apache-druid-0.22.1]# cat quickstart/tutorial/rollup-index.json 
{
  "type" : "index_parallel",
  "spec" : {
    "dataSchema" : {
      "dataSource" : "rollup-tutorial",
      "timestampSpec": {
        "column": "timestamp",
        "format": "iso"
      },
      "dimensionsSpec" : {
        "dimensions" : [
          "srcIP",
          "dstIP"
        ]
      },
      "metricsSpec" : [
        { "type" : "count", "name" : "count" },
        { "type" : "longSum", "name" : "packets", "fieldName" : "packets" },
        { "type" : "longSum", "name" : "bytes", "fieldName" : "bytes" }
      ],
      "granularitySpec" : {
        "type" : "uniform",
        "segmentGranularity" : "week",
        "queryGranularity" : "minute",
        "intervals" : ["2018-01-01/2018-01-03"],
        "rollup" : true
      }
    },
    "ioConfig" : {
      "type" : "index_parallel",
      "inputSource" : {
        "type" : "local",
        "baseDir" : "quickstart/tutorial",
        "filter" : "rollup-data.json"
      },
      "inputFormat" : {
        "type" : "json"
      },
      "appendToExisting" : false
    },
    "tuningConfig" : {
      "type" : "index_parallel",
      "maxRowsPerSegment" : 5000000,
      "maxRowsInMemory" : 25000
    }
  }
}
[root@bigdata001 apache-druid-0.22.1]#
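
To make the effect of these settings concrete, the following is a minimal Python sketch that imitates what rollup does to the nine sample rows: truncate each timestamp to the minute, group by (truncated timestamp, srcIP, dstIP), then count rows and sum packets and bytes. It is only an illustration of the grouping logic, not how Druid implements ingestion; the function name rollup and the RAW constant are hypothetical names chosen for this example.

```python
import json
from collections import defaultdict
from datetime import datetime

# The nine rows of quickstart/tutorial/rollup-data.json, copied from above.
RAW = """\
{"timestamp":"2018-01-01T01:01:35Z","srcIP":"1.1.1.1", "dstIP":"2.2.2.2","packets":20,"bytes":9024}
{"timestamp":"2018-01-01T01:01:51Z","srcIP":"1.1.1.1", "dstIP":"2.2.2.2","packets":255,"bytes":21133}
{"timestamp":"2018-01-01T01:01:59Z","srcIP":"1.1.1.1", "dstIP":"2.2.2.2","packets":11,"bytes":5780}
{"timestamp":"2018-01-01T01:02:14Z","srcIP":"1.1.1.1", "dstIP":"2.2.2.2","packets":38,"bytes":6289}
{"timestamp":"2018-01-01T01:02:29Z","srcIP":"1.1.1.1", "dstIP":"2.2.2.2","packets":377,"bytes":359971}
{"timestamp":"2018-01-01T01:03:29Z","srcIP":"1.1.1.1", "dstIP":"2.2.2.2","packets":49,"bytes":10204}
{"timestamp":"2018-01-02T21:33:14Z","srcIP":"7.7.7.7", "dstIP":"8.8.8.8","packets":38,"bytes":6289}
{"timestamp":"2018-01-02T21:33:45Z","srcIP":"7.7.7.7", "dstIP":"8.8.8.8","packets":123,"bytes":93999}
{"timestamp":"2018-01-02T21:35:45Z","srcIP":"7.7.7.7", "dstIP":"8.8.8.8","packets":12,"bytes":2818}
"""


def rollup(lines):
    """Imitate minute-granularity rollup (hypothetical helper, not a Druid API)."""
    groups = defaultdict(lambda: {"count": 0, "packets": 0, "bytes": 0})
    for line in lines:
        row = json.loads(line)
        ts = datetime.strptime(row["timestamp"], "%Y-%m-%dT%H:%M:%SZ")
        # queryGranularity = minute: drop the seconds part of the timestamp.
        minute = ts.replace(second=0, microsecond=0).strftime("%Y-%m-%dT%H:%M:00.000Z")
        # The truncated timestamp plus the dimensions form the grouping key.
        agg = groups[(minute, row["srcIP"], row["dstIP"])]
        agg["count"] += 1                   # {"type": "count"}
        agg["packets"] += row["packets"]    # longSum over packets
        agg["bytes"] += row["bytes"]        # longSum over bytes
    return [
        {"__time": key[0], "srcIP": key[1], "dstIP": key[2], **agg}
        for key, agg in sorted(groups.items())
    ]


rolled = rollup(RAW.splitlines())
for r in rolled:
    print(r)
```

Under these assumptions the nine input rows collapse into five grouped rows, one per distinct (minute, srcIP, dstIP) combination.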

Run the task from the command line:

[root@bigdata001 apache-druid-0.22.1]# 
[root@bigdata001 apache-druid-0.22.1]# bin/post-index-task --file quickstart/tutorial/rollup-index.json --url http://bigdata003:9081
Beginning indexing data for rollup-tutorial
Task started: index_parallel_rollup-tutorial_iaodppod_2022-03-31T14:34:26.961Z
Task log:     http://bigdata003:9081/druid/indexer/v1/task/index_parallel_rollup-tutorial_iaodppod_2022-03-31T14:34:26.961Z/log
Task status:  http://bigdata003:9081/druid/indexer/v1/task/index_parallel_rollup-tutorial_iaodppod_2022-03-31T14:34:26.961Z/status
Task index_parallel_rollup-tutorial_iaodppod_2022-03-31T14:34:26.961Z still running...
Task index_parallel_rollup-tutorial_iaodppod_2022-03-31T14:34:26.961Z still running...
Task index_parallel_rollup-tutorial_iaodppod_2022-03-31T14:34:26.961Z still running...
Task finished with status: SUCCESS
Completed indexing data for rollup-tutorial. Now loading indexed data onto the cluster...
[root@bigdata001 apache-druid-0.22.1]#
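
The task URLs printed above suggest that the script talks to the /druid/indexer/v1/task endpoint. As a rough sketch of the same submission without the helper script, the Python below builds an HTTP POST request carrying the spec as JSON. The function names build_task_request and submit are hypothetical, the base URL is simply the one passed to --url above, and the assumption that the server replies with a JSON body is not something this article verifies.

```python
import json
import urllib.request


def build_task_request(spec: dict, base_url: str) -> urllib.request.Request:
    """Build (but do not send) a POST request that submits an ingestion spec."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/druid/indexer/v1/task",
        data=json.dumps(spec).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit(request: urllib.request.Request) -> dict:
    """Send the request; assumes the server answers with a JSON document."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (not executed here because it needs a running cluster):
#   with open("quickstart/tutorial/rollup-index.json") as f:
#       spec = json.load(f)
#   print(submit(build_task_request(spec, "http://bigdata003:9081")))
```

Keeping request construction separate from sending makes it possible to inspect the URL and payload before anything reaches the cluster.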

2. Query the data source

dsql> select * from "rollup-tutorial";
┌──────────────────────────┬────────┬───────┬─────────┬─────────┬─────────┐
│ __time                   │ bytes  │ count │ dstIP   │ packets │ srcIP   │
├──────────────────────────┼────────┼───────┼─────────┼─────────┼─────────┤
│ 2018-01-01T01:01:00.000Z │  35937 │     3 │ 2.2.2.2 │     286 │ 1.1.1.1 │
│ 2018-01-01T01:02:00.000Z │ 366260 │     2 │ 2.2.2.2 │     415 │ 1.1.1.1 │
│ 2018-01-01T01:03:00.000Z │  10204 │     1 │ 2.2.2.2 │      49 │ 1.1.1.1 │
│ 2018-01-02T21:33:00.000Z │ 100288 │     2 │ 8.8.8.8 │     161 │ 7.7.7.7 │
│ 2018-01-02T21:35:00.000Z │   2818 │     1 │ 8.8.8.8 │      12 │ 7.7.7.7 │
└──────────────────────────┴────────┴───────┴─────────┴─────────┴─────────┘
Retrieved 5 rows in 1.82s.

dsql>
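
One consequence of the result above is that the data source now holds 5 rows, while 9 events were ingested; the original event count survives only in the count column. The short Python sketch below checks this arithmetic against the query result. The ROLLED constant is just the table above retyped by hand, and the variable names are illustrative.

```python
# (__time, bytes, count, packets) for each row of the query result above.
ROLLED = [
    ("2018-01-01T01:01:00.000Z", 35937, 3, 286),
    ("2018-01-01T01:02:00.000Z", 366260, 2, 415),
    ("2018-01-01T01:03:00.000Z", 10204, 1, 49),
    ("2018-01-02T21:33:00.000Z", 100288, 2, 161),
    ("2018-01-02T21:35:00.000Z", 2818, 1, 12),
]

stored_rows = len(ROLLED)                               # rows kept after rollup
ingested_events = sum(count for _, _, count, _ in ROLLED)  # sum of the count column
total_bytes = sum(b for _, b, _, _ in ROLLED)
total_packets = sum(p for _, _, _, p in ROLLED)

print(f"stored rows:     {stored_rows}")
print(f"ingested events: {ingested_events}")
print(f"stored/ingested: {stored_rows / ingested_events:.2f}")
print(f"total bytes:     {total_bytes}")
print(f"total packets:   {total_packets}")
```

In SQL terms, this means that counting the original events on a rolled-up data source would use SUM("count") rather than COUNT(*), since COUNT(*) counts the stored rows.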