Druid at Pulsar

Author: Xiaoming Zhang

A glance at Pulsar and Druid

Pulsar is an open source project from eBay that consists of two parts: Pulsar pipeline and Pulsar reporting. Pulsar pipeline is a streaming framework that distributes more than 8 billion events every day, and Pulsar reporting is responsible for storing, querying and visualizing this data. Druid is part of Pulsar reporting.

This article gives an introduction to Druid, takes a short deep dive into it, and shows the role it plays in Pulsar reporting.


Druid components introduction

Druid is an open source analytics data store designed for business intelligence (online analytical processing, OLAP) queries on event data.

Druid features (from the official website):

1. Sub-second queries

Supports multidimensional filtering and aggregation, and is able to target exactly the data a query needs.

2. Real-time ingestion

Supports streaming data ingestion and offers insights on events immediately after they occur.

3. Scalable

Able to deal with trillions of events in total and millions of events per second.

4. Highly available

Backs SaaS (software as a service) deployments that need to be up all the time; scaling up or down does not lose data.

5. Designed for analytics

Supports many filters, aggregators and query types, and allows new functionality to be plugged in.

Supports approximate algorithms for cardinality estimation, and for histogram and quantile calculations.
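To make the filtering and aggregation features more concrete, here is a minimal sketch that builds a Druid native timeseries query as a Python dictionary. The data source name `pulsar_event`, the `country` dimension and the `count` metric are borrowed from the segment metadata shown in this article; the filter value, the interval and the helper name `build_timeseries_query` are assumptions made only for illustration.

```python
import json


def build_timeseries_query(data_source, interval, dimension, value):
    """Build a Druid native timeseries query as a plain dict (illustrative helper)."""
    return {
        "queryType": "timeseries",
        "dataSource": data_source,
        "granularity": "hour",
        "intervals": [interval],
        # Keep only events whose dimension equals the given value.
        "filter": {"type": "selector", "dimension": dimension, "value": value},
        # Sum the ingested "count" metric to get the number of events.
        "aggregations": [
            {"type": "longSum", "name": "events", "fieldName": "count"}
        ],
    }


query = build_timeseries_query(
    "pulsar_event",
    "2015-11-18T06:00:00.000Z/2015-11-18T07:00:00.000Z",  # assumed interval
    "country",
    "US",  # assumed filter value
)
print(json.dumps(query, indent=2))
```

The resulting JSON is the kind of body that would be posted to a broker node; how it is sent depends on the deployment and is not shown here.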

 

A glance at the Druid structure of Pulsar reporting:




The cluster receives about 10 billion events per day, and peak traffic is about 200k events per second.

Each machine in our cluster has 128 GB of memory, and each historical node has more than 6 TB of disk.
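As a quick sanity check on these traffic figures, the short calculation below compares the average event rate implied by the daily volume with the stated peak. It uses only the two numbers quoted above, rounded as in the text; the variable names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60

events_per_day = 10_000_000_000  # "about 10 billion events per day"
peak_per_second = 200_000        # "about 200k" events per second at peak

# Average rate if the daily volume were spread evenly over the day.
average_per_second = events_per_day / SECONDS_PER_DAY
peak_to_average = peak_per_second / average_per_second

print(f"average: {average_per_second:,.0f} events/s")
print(f"peak/average: {peak_to_average:.2f}")
```

The average works out to roughly 116k events per second, so the peak is about 1.7 times the average load.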

 

Druid at a glance:



Brief introduction to all nodes:

Real-time

A real-time node indexes incoming data, and the indexed data can be queried immediately. Real-time nodes build the data up into segments, and after a period of time each segment is handed over to a historical node.



An example of a real-time segment: 2015-11-18T06:00:00.000Z_2015-11-18T07:00:00.000Z, which is stored in the folder of the schema you defined. All segments are named in this format.
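The naming format above can be sketched in a few lines of Python. This is only an illustration: it assumes hourly segment granularity and UTC timestamps, as in the example name, and the helper names `fmt` and `hourly_segment_name` are hypothetical rather than part of Druid.

```python
from datetime import datetime, timedelta, timezone


def fmt(ts):
    # ISO-8601 with millisecond precision and a trailing Z, as in the example name.
    return ts.strftime("%Y-%m-%dT%H:%M:%S.") + f"{ts.microsecond // 1000:03d}Z"


def hourly_segment_name(event_time):
    """Return the start_end name of the hourly bucket containing event_time."""
    utc = event_time.astimezone(timezone.utc)
    # Truncate to the start of the hour (assumed hourly granularity).
    start = utc.replace(minute=0, second=0, microsecond=0)
    end = start + timedelta(hours=1)
    return f"{fmt(start)}_{fmt(end)}"


name = hourly_segment_name(datetime(2015, 11, 18, 6, 42, 17, tzinfo=timezone.utc))
print(name)  # 2015-11-18T06:00:00.000Z_2015-11-18T07:00:00.000Z
```

An event arriving at 06:42:17 UTC therefore falls into the segment that covers 06:00 to 07:00.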

Here is the segment information in MySQL:

Id: pulsar_event_2014-09-15T05:00:00.000-07:00_2014-09-15T06:00:00.000-07:00_2014-09-15T05:00:00.000-07:00_1
dataSource: pulsar_event
created_date: 2014-09-15T09:37:30.231-07:00
start: 2014-09-15T05:00:00.000-07:00
end: 2014-09-15T06:00:00.000-07:00
partitioned: 1
version: 2014-09-15T05:00:00.000-07:00
used: 0
payload: {"dataSource":"pulsar_event","interval":"2014-09-15T05:00:00.000-07:00/2014-09-15T06:00:00.000-07:00","version":"2014-09-15T05:00:00.000-07:00","loadSpec":{"type":"hdfs","path":"hdfs://xxxx/20140915T050000.000-0700_20140915T060000.000-0700/2014-09-15T05_00_00.000-07_00/1/index.zip"},"dimensions":"browserfamily,browserversion,city,continent,country,deviceclass,devicefamily,eventtype,guid,js_ev_type,linespeed,osfamily,osversion,page,region,sessionid,site,tenant,timestamp,uid","metrics":"count","shardSpec":{"type":
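The Id value in the row above packs several fields into one string. As a rough sketch, it can be split back into its parts as follows; this assumes every Id has the five-part layout seen in this row (data source, start, end, version, partition number), and the names `SegmentId` and `parse_segment_id` are hypothetical.

```python
from typing import NamedTuple


class SegmentId(NamedTuple):
    data_source: str
    start: str
    end: str
    version: str
    partition: int


def parse_segment_id(segment_id):
    # Split from the right so underscores inside the data source name survive.
    data_source, start, end, version, partition = segment_id.rsplit("_", 4)
    return SegmentId(data_source, start, end, version, int(partition))


seg = parse_segment_id(
    "pulsar_event_2014-09-15T05:00:00.000-07:00_2014-09-15T06:00:00.000-07:00"
    "_2014-09-15T05:00:00.000-07:00_1"
)
print(seg.data_source, seg.start, seg.end, seg.partition)
```

Splitting from the right matters here because the data source name `pulsar_event` itself contains an underscore, while the timestamps do not.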
