As shown in the figure, the nodes to set up are:
- MiddleManager: worker node that executes submitted tasks and handles data ingestion (streaming and batch)
- Coordinator: manages segments: loads new segments, drops outdated ones, manages segment replication, and balances segment load
- Overlord: receives tasks and coordinates task assignment
- Broker: routes queries (SQL and native API)
- Historical: serves queries over segments pulled from deep storage
Storage choices:
- Metadata storage: MySQL
- Deep storage: HDFS
Coordination between components: ZooKeeper

My environment:
- Java 1.8
- Hadoop 3.1.1
- ZooKeeper 3.4.6
- MySQL 5.7
- Druid 0.13.0
Directory structure:
- `DISCLAIMER`, `LICENSE`, and `NOTICE` files
- `bin/*`: scripts useful for this quickstart
- `conf/*`: template configurations for a clustered setup
- `extensions/*`: core Druid extensions
- `hadoop-dependencies/*`: Druid Hadoop dependencies
- `lib/*`: core Druid jars and dependencies
- `quickstart/*`: configuration files, sample data, and other files for the quickstart tutorials
The directory we will use most is `conf/druid`; all of our configuration lives there. Its layout is as follows:
```
├── conf
│   ├── druid
│   │   ├── broker
│   │   │   ├── jvm.config
│   │   │   └── runtime.properties
│   │   ├── _common
│   │   │   ├── common.runtime.properties
│   │   │   └── log4j2.xml
│   │   ├── coordinator
│   │   │   ├── jvm.config
│   │   │   └── runtime.properties
│   │   ├── historical
│   │   │   ├── jvm.config
│   │   │   └── runtime.properties
│   │   ├── middleManager
│   │   │   ├── jvm.config
│   │   │   └── runtime.properties
│   │   └── overlord
│   │       ├── jvm.config
│   │       └── runtime.properties
```
Here,
- broker
- coordinator
- historical
- middleManager
- overlord
are the configuration directories for the individual services. In each of them, the `runtime.properties` file holds the service-specific settings and the `jvm.config` file holds the JVM launch options. The `_common` directory holds configuration shared across the whole Druid cluster.
Now let's start building.

- Edit the `conf/druid/_common/common.runtime.properties` file

```
vim conf/druid/_common/common.runtime.properties
```

Its contents are as follows:
```
#
# Extensions
#
druid.extensions.loadList=["druid-hdfs-storage","mysql-metadata-storage"]

#
# Logging
#
# Log all runtime properties on startup. Disable to avoid logging properties on startup:
druid.startup.logging.logProperties=true

#
# Zookeeper
#
druid.zk.service.host=slave2.hadoop:2181
druid.zk.paths.base=/druid

#
# Metadata storage
#
# For MySQL (make sure to include the MySQL JDBC driver on the classpath):
druid.metadata.storage.type=mysql
druid.metadata.storage.connector.connectURI=jdbc:mysql://master:3306/druid?characterencoding=utf-8
druid.metadata.storage.connector.user=root
druid.metadata.storage.connector.password=root

#
# Deep storage
#
# For HDFS (make sure to include the HDFS extension and that your Hadoop config files are on the classpath):
druid.storage.type=hdfs
druid.storage.storageDirectory=hdfs://slave2.hadoop:8020/druid/segments

#
# Indexing service logs
#
# For HDFS (make sure to include the HDFS extension and that your Hadoop config files are on the classpath):
druid.indexer.logs.type=hdfs
druid.indexer.logs.directory=hdfs://slave2.hadoop:8020/druid/indexing-logs

#
# Service discovery
#
druid.selectors.indexing.serviceName=druid/overlord
druid.selectors.coordinator.serviceName=druid/coordinator

#
# Monitoring
#
druid.monitoring.monitors=["org.apache.druid.java.util.metrics.JvmMonitor"]
druid.emitter=logging
druid.emitter.logging.logLevel=info

# Storage type of double columns
# omitting this will lead to indexing double as float at the storage layer
druid.indexing.doubleStorage=double

#
# Enable Druid SQL
#
druid.sql.enable=true
```
The settings that need to be changed are:

```
# Extensions to load
druid.extensions.loadList=["druid-hdfs-storage","mysql-metadata-storage"]

# Deep storage type and path
druid.storage.type=hdfs
druid.storage.storageDirectory=hdfs://slave2.hadoop:8020/druid/segments

# Indexing log storage type and path
druid.indexer.logs.type=hdfs
druid.indexer.logs.directory=hdfs://slave2.hadoop:8020/druid/indexing-logs

# MySQL connection info
druid.metadata.storage.type=mysql
druid.metadata.storage.connector.connectURI=jdbc:mysql://master:3306/druid?characterencoding=utf-8
druid.metadata.storage.connector.user=root
druid.metadata.storage.connector.password=root

# ZooKeeper service address
druid.zk.service.host=slave2.hadoop:2181
druid.zk.paths.base=/druid

# Enable Druid SQL
druid.sql.enable=true
```
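After editing, a quick grep can confirm that every key we changed is actually present in the file. This is only a convenience sketch; the relative path assumes you run it from the Druid root directory.

```shell
# Sketch: confirm each edited key actually appears in the properties file.
check_props() {
  local file="$1"; shift
  local key missing=0
  for key in "$@"; do
    # match the key at the start of a line (leading whitespace allowed)
    grep -q "^[[:space:]]*${key}" "$file" 2>/dev/null || { echo "missing: ${key}"; missing=1; }
  done
  return "$missing"
}

if check_props conf/druid/_common/common.runtime.properties \
     druid.extensions.loadList druid.storage.type druid.indexer.logs.type \
     druid.metadata.storage.type druid.zk.service.host druid.sql.enable; then
  echo "all required keys present"
fi
```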
- Add the Hadoop configuration files and the MySQL JDBC jar
  - Copy the four files core-site.xml, hdfs-site.xml, mapred-site.xml, and yarn-site.xml from Hadoop's etc/hadoop/ directory into Druid's conf/druid/_common/ directory.
  - Name the MySQL JDBC jar in the mysql-connector-java-xxxx.jar form and copy it into the extensions/mysql-metadata-storage directory. Then create the database in MySQL:

```
CREATE DATABASE druid DEFAULT CHARACTER SET utf8mb4;
```
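The two copy steps above can be scripted. A minimal sketch, assuming `HADOOP_HOME` and `DRUID_HOME` point at your Hadoop and Druid installations (the defaults below are guesses based on the paths used in this post):

```shell
# Sketch: stage Hadoop client configs and the MySQL JDBC driver into Druid.
HADOOP_HOME=${HADOOP_HOME:-/opt/hadoop-3.1.1}
DRUID_HOME=${DRUID_HOME:-/root/apache-druid-0.13.0-incubating}

# copy the four Hadoop client config files into Druid's _common directory
stage_hadoop_confs() {
  local f
  for f in core-site.xml hdfs-site.xml mapred-site.xml yarn-site.xml; do
    cp "$HADOOP_HOME/etc/hadoop/$f" "$DRUID_HOME/conf/druid/_common/"
  done
}

# $1 = path to the driver jar, already named mysql-connector-java-<version>.jar
stage_mysql_driver() {
  cp "$1" "$DRUID_HOME/extensions/mysql-metadata-storage/"
}
```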
- Configure coordinator

The default configuration is fine.

runtime.properties:

```
# Service name
druid.service=druid/coordinator
# Service port
druid.plaintextPort=8081

druid.coordinator.startDelay=PT30S
druid.coordinator.period=PT30S
```

jvm.config:

```
-server
-Xms3g
-Xmx3g
-Duser.timezone=UTC
-Dfile.encoding=UTF-8
-Djava.io.tmpdir=var/tmp
-Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager
-Dderby.stream.error.file=var/druid/derby.log
```
- Configure overlord

The default configuration is fine.

runtime.properties:

```
# Service name
druid.service=druid/overlord
# Service port
druid.plaintextPort=8090

druid.indexer.queue.startDelay=PT30S
druid.indexer.runner.type=remote
druid.indexer.storage.type=metadata
```

jvm.config:

```
-server
-Xms3g
-Xmx3g
-Duser.timezone=UTC
-Dfile.encoding=UTF-8
-Djava.io.tmpdir=var/tmp
-Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager
```
- Configure historical

runtime.properties:

```
# Service name
druid.service=druid/historical
# Service port
druid.plaintextPort=9083

# HTTP server threads
druid.server.http.numThreads=25

# Processing threads and buffers
druid.processing.buffer.sizeBytes=536870912
druid.processing.numThreads=7

# Segment storage
druid.segmentCache.locations=[{"path":"var/druid/segment-cache","maxSize":130000000000}]
druid.server.maxSize=130000000000
```

jvm.config:

```
-server
-Xms8g
-Xmx8g
-XX:MaxDirectMemorySize=6144m
-Duser.timezone=UTC
-Dfile.encoding=UTF-8
-Djava.io.tmpdir=var/tmp
-Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager
```
Note that we must ensure

-XX:MaxDirectMemorySize (6144m) > druid.processing.buffer.sizeBytes (536,870,912) * (druid.processing.numMergeBuffers (2) + druid.processing.numThreads (7) + 1) = 5120m

or the service will fail to start.
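This sizing rule is easy to check mechanically. A small sketch, with the numbers taken from the historical config above (`druid.processing.numMergeBuffers` is not set here, so its default of 2 is assumed):

```shell
# Sketch: check -XX:MaxDirectMemorySize against Druid's processing buffers.
BUFFER_BYTES=536870912   # druid.processing.buffer.sizeBytes
NUM_THREADS=7            # druid.processing.numThreads
NUM_MERGE_BUFFERS=2      # druid.processing.numMergeBuffers (default)
MAX_DIRECT_MB=6144       # from -XX:MaxDirectMemorySize=6144m

# required direct memory = buffer size * (merge buffers + threads + 1)
REQUIRED_MB=$(( BUFFER_BYTES / 1024 / 1024 * (NUM_MERGE_BUFFERS + NUM_THREADS + 1) ))
echo "required=${REQUIRED_MB}MB configured=${MAX_DIRECT_MB}MB"
# prints: required=5120MB configured=6144MB
if [ "$MAX_DIRECT_MB" -gt "$REQUIRED_MB" ]; then
  echo "OK: direct memory is sufficient"
else
  echo "FAIL: increase -XX:MaxDirectMemorySize"
fi
```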
- Configure middleManager

runtime.properties:

```
# Service name
druid.service=druid/middleManager
# Service port
druid.plaintextPort=8091

# Number of tasks per middleManager
druid.worker.capacity=3

# Task launch parameters
druid.indexer.runner.javaOpts=-server -Xmx2g -XX:MaxDirectMemorySize=3072m -Duser.timezone=UTC -Dfile.encoding=UTF-8 -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager -Dhdp.version=3.0.1.0-187 -Dhadoop.mapreduce.job.classloader=true
druid.indexer.task.baseTaskDir=var/druid/task

# HTTP server threads
druid.server.http.numThreads=25

# Processing threads and buffers on Peons
druid.indexer.fork.property.druid.processing.buffer.sizeBytes=536870912
druid.indexer.fork.property.druid.processing.numThreads=2

# Hadoop indexing
druid.indexer.task.hadoopWorkingPath=/tmp/druid-indexing
```

jvm.config:

```
-server
-Xms3g
-Xmx3g
-XX:MaxDirectMemorySize=3072m
-Duser.timezone=UTC
-Dfile.encoding=UTF-8
-Djava.io.tmpdir=var/tmp
-Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager
```
Note that we must ensure

-XX:MaxDirectMemorySize (3072m) > druid.processing.buffer.sizeBytes (536,870,912) * (druid.processing.numMergeBuffers (2) + druid.processing.numThreads (2) + 1) = 2560m

or the service will fail to start. In addition, `druid.indexer.runner.javaOpts` in runtime.properties should include `-XX:MaxDirectMemorySize=3072m` so the spawned peon tasks also have enough direct memory; otherwise tasks will crash at runtime.

Pay special attention to:
- `-Dhdp.version=3.0.1.0-187`: pins the Hadoop (HDP) version
- `-Dhadoop.mapreduce.job.classloader=true`: prevents jar conflicts
- Configure broker

runtime.properties:

```
# Service name
druid.service=druid/broker
# Service port
druid.plaintextPort=8082

# HTTP server threads
druid.broker.http.numConnections=5
druid.server.http.numThreads=25

# Processing threads and buffers
druid.processing.buffer.sizeBytes=536870912
druid.processing.numThreads=7

# Query cache
druid.broker.cache.useCache=true
druid.broker.cache.populateCache=true
druid.cache.type=local
druid.cache.sizeInBytes=2000000000
```

jvm.config:

```
-server
-Xms24g
-Xmx24g
-XX:MaxDirectMemorySize=6144m
-Duser.timezone=UTC
-Dfile.encoding=UTF-8
-Djava.io.tmpdir=var/tmp
-Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager
```
Note that we must ensure

-XX:MaxDirectMemorySize (6144m) > druid.processing.buffer.sizeBytes (536,870,912) * (druid.processing.numMergeBuffers (2) + druid.processing.numThreads (7) + 1) = 5120m

or the service will fail to start.
- After finishing the configuration, distribute the configured Druid directory to every node:

```
rsync -za /root/apache-druid-0.13.0-incubating/ slave3:/root/apache-druid-0.13.0-incubating/
```
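With more than one target node, the rsync can be wrapped in a loop. The node list below is an assumption; substitute your own hostnames. The function only prints the commands (a dry run), so you can review them before piping the output to `sh`:

```shell
# Sketch: print one rsync command per target node (dry run).
DRUID_DIR=/root/apache-druid-0.13.0-incubating

distribute() {
  local node
  for node in "$@"; do
    echo "rsync -za ${DRUID_DIR}/ ${node}:${DRUID_DIR}/"
  done
}

distribute slave1 slave2 slave3
```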
- Start the corresponding service on each server node:

  - coordinator

```
nohup java `cat conf/druid/coordinator/jvm.config | xargs` -cp "conf/druid/_common:conf/druid/coordinator:lib/*" org.apache.druid.cli.Main server coordinator &
```

  - overlord

```
nohup java `cat conf/druid/overlord/jvm.config | xargs` -cp "conf/druid/_common:conf/druid/overlord:lib/*" org.apache.druid.cli.Main server overlord &
```

  - historical

```
nohup java `cat conf/druid/historical/jvm.config | xargs` -cp "conf/druid/_common:conf/druid/historical:lib/*" org.apache.druid.cli.Main server historical &
```

  - middleManager

```
nohup java `cat conf/druid/middleManager/jvm.config | xargs` -cp "conf/druid/_common:conf/druid/middleManager:lib/*" org.apache.druid.cli.Main server middleManager &
```

  - broker

```
nohup java `cat conf/druid/broker/jvm.config | xargs` -cp "conf/druid/_common:conf/druid/broker:lib/*" org.apache.druid.cli.Main server broker &
```
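The five commands differ only in the service name, so they can be generated by one helper. A sketch, assuming it is run from the Druid root directory; it prints each command rather than launching it, so you can review the output before piping it to `sh`:

```shell
# Sketch: build the launch command for any Druid service (dry run).
druid_cmd() {
  local svc="$1" jvm_opts
  # jvm.config holds one JVM flag per line; join them onto a single line
  jvm_opts=$(cat "conf/druid/${svc}/jvm.config" 2>/dev/null | xargs)
  echo "nohup java ${jvm_opts} -cp \"conf/druid/_common:conf/druid/${svc}:lib/*\" org.apache.druid.cli.Main server ${svc} &"
}

for svc in coordinator overlord historical middleManager broker; do
  druid_cmd "$svc"
done
```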
- Test whether the services started correctly
  - Visit the coordinator node at host:8081
  - Visit the overlord node at host:8090, as shown in the figure below
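Besides eyeballing the web consoles, the `/status` endpoint that each Druid service exposes can be probed from the shell. A sketch; `master` here is a placeholder for whichever hosts actually run each service:

```shell
# Sketch: probe a Druid service's /status endpoint and report UP/DOWN.
check_console() {
  local host="$1" port="$2"
  if curl -sf -o /dev/null --max-time 5 "http://${host}:${port}/status"; then
    echo "${host}:${port} UP"
  else
    echo "${host}:${port} DOWN"
  fi
}

check_console master 8081   # coordinator
check_console master 8090   # overlord
```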