Installing single-node ClickHouse on Docker Desktop, and building a ClickHouse cluster with docker-compose.yml (4 nodes, 2 shards, 2 replicas)
I. Download the ClickHouse images
Open Windows PowerShell (as Administrator) and run (from any directory):
docker search clickhouse
Pull the images:
docker pull yandex/clickhouse-server
docker pull yandex/clickhouse-client
II. Single-node ClickHouse installation
1. Prepare the ClickHouse mount directories
Pick a base path to mount and create three folders under it: etc / log / data, serving as the config directory, the log directory, and the data directory respectively (a PowerShell sketch follows).
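A minimal sketch, assuming D:\clickhouse as the base path (any path works; substitute your own):
# create the three mount directories in one go
New-Item -ItemType Directory -Path D:\clickhouse\etc, D:\clickhouse\log, D:\clickhouse\data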
2. Copy out the config files
Start a throwaway ClickHouse container:
docker run --rm -d --name=temp yandex/clickhouse-server
Copy out the two config files:
docker cp temp:/etc/clickhouse-server/config.xml D:/your-path/etc/config.xml
docker cp temp:/etc/clickhouse-server/users.xml D:/your-path/etc/users.xml
Remove the throwaway container (it was started with --rm, so stopping it also removes it):
docker stop temp
3. Adjust the configuration
Open config.xml and find:
<!-- <timezone>Europe/Moscow</timezone> -->
Change it to:
<timezone>Asia/Shanghai</timezone>
Then find:
<!-- <listen_host>0.0.0.0</listen_host> -->
Change it to:
<listen_host>::</listen_host>
Also add the following (it pulls in the metrika.xml that the cluster setup in Part III will use):
<include_from>/etc/clickhouse-server/metrika.xml</include_from>
4. Plan the ClickHouse ports
TCP port | HTTP port |
---|---|
9001 | 8124 |
Open Windows PowerShell (as Administrator) and run (from any directory):
netsh interface ipv4 show excludedportrange protocol=tcp
Hyper-V reserves ranges of ports, so check them ahead of time to make sure the ClickHouse ports you pick do not fall inside a reserved range.
Then check whether a port is already held by another process:
netstat -aon|findstr "port"
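For example, a quick check of the planned TCP port 9001 (substitute each of your planned ports in turn):
netstat -aon|findstr "9001"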
5. Start ClickHouse
Breakdown of the command:
Keep the container running across restarts: docker run --restart always -d
Name the container: --name chserver
Raise the file-handle limit: --ulimit nofile=262144:262144
Mount the config directory: --volume=D:/your-path/etc:/etc/clickhouse-server/
Mount the log directory: --volume=D:/your-path/log:/var/log/clickhouse-server/
Mount the data directory: --volume=D:/your-path/data:/var/lib/clickhouse/
Map the ports: -p 9001:9000 -p 8124:8123
Image: yandex/clickhouse-server
Putting it all together:
docker run --restart always -d --name chserver --ulimit nofile=262144:262144 --volume=D:/your-path/etc:/etc/clickhouse-server/ --volume=D:/your-path/log:/var/log/clickhouse-server/ --volume=D:/your-path/data:/var/lib/clickhouse/ -p 9001:9000 -p 8124:8123 yandex/clickhouse-server
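Once the container is up, a quick sanity check (a minimal sketch; the server image bundles clickhouse-client):
docker exec chserver clickhouse-client --query "SELECT version()"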
III. Building the ZooKeeper + ClickHouse cluster
1. Cluster layout
(Diagram: ZooKeeper + ClickHouse cluster layout)
ZooKeeper provides coordination services for the ClickHouse cluster; it is what makes ClickHouse's sharding and replication work.
Directory layout for the ZooKeeper + ClickHouse cluster:
temp
├── docker-compose.yml
├── zookeeper-cluster
│   ├── zookeeper-01
│   │   ├── conf
│   │   │   └── zoo.cfg
│   │   ├── data
│   │   └── datalog
│   ├── zookeeper-02
│   │   ├── conf
│   │   │   └── zoo.cfg
│   │   ├── data
│   │   └── datalog
│   └── zookeeper-03
│       ├── conf
│       │   └── zoo.cfg
│       ├── data
│       └── datalog
└── clickhouse-cluster
    ├── ch01-01
    │   ├── etc
    │   │   ├── config.xml
    │   │   ├── users.xml
    │   │   ├── metrika.xml
    │   │   └── macros.xml
    │   ├── data
    │   └── log
    ├── ch02-02
    │   ├── etc
    │   │   ├── config.xml
    │   │   ├── users.xml
    │   │   ├── metrika.xml
    │   │   └── macros.xml
    │   ├── data
    │   └── log
    ├── ch02-03
    │   ├── etc
    │   │   ├── config.xml
    │   │   ├── users.xml
    │   │   ├── metrika.xml
    │   │   └── macros.xml
    │   ├── data
    │   └── log
    └── ch01-04
        ├── etc
        │   ├── config.xml
        │   ├── users.xml
        │   ├── metrika.xml
        │   └── macros.xml
        ├── data
        └── log
2. Port planning
Same procedure as the port planning for the single-node install above.
2.1 Exclude the port ranges reserved by Hyper-V
Command: netsh interface ipv4 show excludedportrange protocol=tcp
2.2 Plan the mapped ports
Use: netstat -aon|findstr "port"
to check whether each planned port is already in use on the host; any free port will do.
Bonus command: tasklist|findstr "PID" shows which process is holding a given port.
ZooKeeper default ports:
Port | Purpose |
---|---|
2181 | client port |
2888 | leader/follower communication port |
3888 | election port |
8080 | admin.serverPort |
Planned port mappings for the ZooKeeper nodes:
Node | client port | LF communication port | election port | admin.serverPort |
---|---|---|---|---|
zookeeper-01 | 2187 | 2877 | 3877 | 7897 |
zookeeper-02 | 2188 | 2878 | 3878 | 7898 |
zookeeper-03 | 2189 | 2879 | 3879 | 7899 |
ClickHouse default ports:
Port | Purpose |
---|---|
9000 | TCP port |
8123 | HTTP port |
Planned port mappings for the ClickHouse nodes (as above, avoid ports already in use):
Node | TCP port | HTTP port |
---|---|---|
ch01-01 | 9001 | 8821 |
ch02-02 | 9002 | 8822 |
ch02-03 | 9003 | 8823 |
ch01-04 | 9004 | 8824 |
3. Cluster directory structure and config files
3.1 The zookeeper-cluster directory
Here a temp directory is created on the D: drive to hold zookeeper-cluster and clickhouse-cluster.
Under zookeeper-cluster create three directories, zookeeper-01 / zookeeper-02 / zookeeper-03, each containing data / datalog / conf.
3.2 zookeeper-cluster config files
Under each conf directory, create a text file named zoo.cfg:
# heartbeat interval, in milliseconds
tickTime=30000
# max heartbeats the initial leader/follower connection may take
initLimit=10
# max heartbeats a leader/follower request-response may take
syncLimit=5
# zookeeper data directory
dataDir=/data
# zookeeper transaction log directory
dataLogDir=/datalog
# client connection port; change to this node's client port
clientPort=2187
# number of snapshots to retain
autopurge.snapRetainCount=3
# purge interval in hours; 0 disables purging
autopurge.purgeInterval=1
maxClientCnxns=60
standaloneEnabled=true
admin.enableServer=true
quorumListenOnAllIPs=true
# change to this node's admin.serverPort
admin.serverPort=7897
# cluster membership: server address:LF communication port:election port
server.1=zookeeper-01:2877:3877
server.2=zookeeper-02:2878:3878
server.3=zookeeper-03:2879:3879
Two settings in zoo.cfg must be changed per node: clientPort and admin.serverPort.
Copy this zoo.cfg to zookeeper-02 and zookeeper-03 and change those two settings, as in the sketch below.
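A minimal copy sketch in PowerShell, assuming the D:\temp layout from section 3.1:
Copy-Item D:\temp\zookeeper-cluster\zookeeper-01\conf\zoo.cfg D:\temp\zookeeper-cluster\zookeeper-02\conf\
Copy-Item D:\temp\zookeeper-cluster\zookeeper-01\conf\zoo.cfg D:\temp\zookeeper-cluster\zookeeper-03\conf\
# then edit clientPort and admin.serverPort in each copy per the port table above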
3.3 The clickhouse-cluster directory
Under clickhouse-cluster create a directory for each of the 4 nodes, each containing data / etc / log.
3.4 clickhouse-cluster config files
Into one node's etc directory, copy the config.xml and users.xml from the single-node install above.
Create a text file named macros.xml:
<yandex>
    <macros>
        <layer>01</layer>
        <shard>01</shard>
        <replica>ch01-01</replica>
    </macros>
</yandex>
Tag reference:
Tag | Purpose |
---|---|
< layer> | Two-level sharding identifier; this is a single cluster, so set it to 01. |
< shard> | Shard number; replicas of the same shard share the same number. |
< replica> | Unique identifier of the replica, named ch{shard}-{ck_num} here. |
Copy it to the remaining nodes and adjust the shard number and replica name per the sharding plan; ch02-02's version is shown below as an example.
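For instance, on ch02-02 (shard 02, node 2), macros.xml becomes:
<yandex>
    <macros>
        <layer>01</layer>
        <shard>02</shard>
        <replica>ch02-02</replica>
    </macros>
</yandex>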
Then create a text file named metrika.xml:
<yandex>
    <clickhouse_remote_servers>
        <cluster_1>
            <shard>
                <weight>1</weight>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>ch01-01</host>
                    <port>9000</port>
                </replica>
                <replica>
                    <host>ch01-04</host>
                    <port>9000</port>
                </replica>
            </shard>
            <shard>
                <weight>1</weight>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>ch02-02</host>
                    <port>9000</port>
                </replica>
                <replica>
                    <host>ch02-03</host>
                    <port>9000</port>
                </replica>
            </shard>
        </cluster_1>
    </clickhouse_remote_servers>
    <zookeeper-servers>
        <node index="1">
            <host>zookeeper-01</host>
            <port>2187</port>
        </node>
        <node index="2">
            <host>zookeeper-02</host>
            <port>2188</port>
        </node>
        <node index="3">
            <host>zookeeper-03</host>
            <port>2189</port>
        </node>
    </zookeeper-servers>
    <networks>
        <ip>::/0</ip>
    </networks>
    <clickhouse_compression>
        <case>
            <min_part_size>10000000000</min_part_size>
            <min_part_size_ratio>0.01</min_part_size_ratio>
            <method>lz4</method>
        </case>
    </clickhouse_compression>
</yandex>
Tag reference:
Tag | Purpose |
---|---|
< cluster_1> | Cluster identifier; freely nameable, and referenced later when creating distributed tables. |
< weight> | Shard weight, i.e. the probability that a written row lands on this shard; all shards here use weight 1. |
< internal_replication> | Whether to write data to only one replica. Defaults to false (write to all replicas), which with replicated tables can cause duplicates and inconsistency, so it is true here: the distributed table writes to a single replica and leaves synchronization to the replicated tables and ZooKeeper. |
< replica> | Declares one replica within a shard. |
Note: make sure the indentation in this file is correct.
Copy it to the remaining nodes as-is; no changes needed. The section names are not arbitrary, as shown below.
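The stock config.xml in this image references these sections through incl attributes, roughly like the following lines (verify against your own copy of config.xml):
<remote_servers incl="clickhouse_remote_servers" />
<zookeeper incl="zookeeper-servers" optional="true" />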
4. docker-compose.yml
In the temp root directory, create a text file named docker-compose.yml:
version: '3'
services:
  zookeeper-01:
    image: zookeeper
    ports:
      - 2187:2187
      - 2877:2877
      - 3877:3877
    container_name: zookeeper-01
    hostname: zookeeper-01
    environment:
      ZOO_MY_ID: 1
    volumes:
      - D:/temp/zookeeper-cluster/zookeeper-01/conf/zoo.cfg:/conf/zoo.cfg
      - D:/temp/zookeeper-cluster/zookeeper-01/data/:/data/
      - D:/temp/zookeeper-cluster/zookeeper-01/datalog/:/datalog/
    networks:
      default:
        ipv4_address: 172.12.0.2
  zookeeper-02:
    image: zookeeper
    ports:
      - 2188:2188
      - 2878:2878
      - 3878:3878
    container_name: zookeeper-02
    hostname: zookeeper-02
    environment:
      ZOO_MY_ID: 2
    volumes:
      - D:/temp/zookeeper-cluster/zookeeper-02/conf/zoo.cfg:/conf/zoo.cfg
      - D:/temp/zookeeper-cluster/zookeeper-02/data/:/data/
      - D:/temp/zookeeper-cluster/zookeeper-02/datalog/:/datalog/
    networks:
      default:
        ipv4_address: 172.12.0.3
  zookeeper-03:
    image: zookeeper
    ports:
      - 2189:2189
      - 2879:2879
      - 3879:3879
    container_name: zookeeper-03
    hostname: zookeeper-03
    environment:
      ZOO_MY_ID: 3
    volumes:
      - D:/temp/zookeeper-cluster/zookeeper-03/conf/zoo.cfg:/conf/zoo.cfg
      - D:/temp/zookeeper-cluster/zookeeper-03/data/:/data/
      - D:/temp/zookeeper-cluster/zookeeper-03/datalog/:/datalog/
    networks:
      default:
        ipv4_address: 172.12.0.4
  ch01-01:
    image: yandex/clickhouse-server
    hostname: ch01-01
    container_name: ch01-01
    ports:
      - 9001:9000
      - 8821:8123
    volumes:
      - D:/temp/clickhouse-cluster/ch01-01/etc/config.xml:/etc/clickhouse-server/config.xml
      - D:/temp/clickhouse-cluster/ch01-01/etc/users.xml:/etc/clickhouse-server/users.xml
      - D:/temp/clickhouse-cluster/ch01-01/etc/metrika.xml:/etc/clickhouse-server/metrika.xml
      - D:/temp/clickhouse-cluster/ch01-01/etc/macros.xml:/etc/clickhouse-server/config.d/macros.xml
      - D:/temp/clickhouse-cluster/ch01-01/data/:/var/lib/clickhouse/
      - D:/temp/clickhouse-cluster/ch01-01/log/:/var/log/clickhouse-server/
    ulimits:
      nofile:
        soft: 262144
        hard: 262144
    depends_on:
      - zookeeper-01
      - zookeeper-02
      - zookeeper-03
    networks:
      default:
        ipv4_address: 172.12.0.5
  ch02-02:
    image: yandex/clickhouse-server
    hostname: ch02-02
    container_name: ch02-02
    ports:
      - 9002:9000
      - 8822:8123
    volumes:
      - D:/temp/clickhouse-cluster/ch02-02/etc/config.xml:/etc/clickhouse-server/config.xml
      - D:/temp/clickhouse-cluster/ch02-02/etc/users.xml:/etc/clickhouse-server/users.xml
      - D:/temp/clickhouse-cluster/ch02-02/etc/metrika.xml:/etc/clickhouse-server/metrika.xml
      - D:/temp/clickhouse-cluster/ch02-02/etc/macros.xml:/etc/clickhouse-server/config.d/macros.xml
      - D:/temp/clickhouse-cluster/ch02-02/data/:/var/lib/clickhouse/
      - D:/temp/clickhouse-cluster/ch02-02/log/:/var/log/clickhouse-server/
    ulimits:
      nofile:
        soft: 262144
        hard: 262144
    depends_on:
      - zookeeper-01
      - zookeeper-02
      - zookeeper-03
    networks:
      default:
        ipv4_address: 172.12.0.6
  ch02-03:
    image: yandex/clickhouse-server
    hostname: ch02-03
    container_name: ch02-03
    ports:
      - 9003:9000
      - 8823:8123
    volumes:
      - D:/temp/clickhouse-cluster/ch02-03/etc/config.xml:/etc/clickhouse-server/config.xml
      - D:/temp/clickhouse-cluster/ch02-03/etc/users.xml:/etc/clickhouse-server/users.xml
      - D:/temp/clickhouse-cluster/ch02-03/etc/metrika.xml:/etc/clickhouse-server/metrika.xml
      - D:/temp/clickhouse-cluster/ch02-03/etc/macros.xml:/etc/clickhouse-server/config.d/macros.xml
      - D:/temp/clickhouse-cluster/ch02-03/data/:/var/lib/clickhouse/
      - D:/temp/clickhouse-cluster/ch02-03/log/:/var/log/clickhouse-server/
    ulimits:
      nofile:
        soft: 262144
        hard: 262144
    depends_on:
      - zookeeper-01
      - zookeeper-02
      - zookeeper-03
    networks:
      default:
        ipv4_address: 172.12.0.7
  ch01-04:
    image: yandex/clickhouse-server
    hostname: ch01-04
    container_name: ch01-04
    ports:
      - 9004:9000
      - 8824:8123
    volumes:
      - D:/temp/clickhouse-cluster/ch01-04/etc/config.xml:/etc/clickhouse-server/config.xml
      - D:/temp/clickhouse-cluster/ch01-04/etc/users.xml:/etc/clickhouse-server/users.xml
      - D:/temp/clickhouse-cluster/ch01-04/etc/metrika.xml:/etc/clickhouse-server/metrika.xml
      - D:/temp/clickhouse-cluster/ch01-04/etc/macros.xml:/etc/clickhouse-server/config.d/macros.xml
      - D:/temp/clickhouse-cluster/ch01-04/data/:/var/lib/clickhouse/
      - D:/temp/clickhouse-cluster/ch01-04/log/:/var/log/clickhouse-server/
    ulimits:
      nofile:
        soft: 262144
        hard: 262144
    depends_on:
      - zookeeper-01
      - zookeeper-02
      - zookeeper-03
    networks:
      default:
        ipv4_address: 172.12.0.8
networks:
  default:
    external:
      name: zk-3-ch-3
Note: never use the Tab key for indentation in this file, only spaces. When you are done, you can check the YAML on a validator site, or validate it locally as shown below.
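docker-compose itself can sanity-check the file (run from the directory containing docker-compose.yml; with --quiet it prints nothing unless there are errors):
docker-compose config --quiet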
5. Configure the local hosts file
Add:
# Added by: zookeeper cluster
172.12.0.2 zookeeper-01
172.12.0.3 zookeeper-02
172.12.0.4 zookeeper-03
# Added by: clickhouse cluster
172.12.0.5 ch01-01
172.12.0.6 ch02-02
172.12.0.7 ch02-03
172.12.0.8 ch01-04
6. Start it up
Open PowerShell (as Administrator):
# Step 1: enter the directory containing docker-compose.yml
cd D:\temp\
# Create the bridge network with its subnet and gateway
docker network create --driver bridge --subnet 172.12.0.0/25 --gateway 172.12.0.1 zk-3-ch-3
# Bring up the cluster
docker-compose up -d
Docker Desktop will pop up a notification asking, roughly, whether to allow the mounts; just click to accept it.
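To confirm that all seven containers are running:
docker-compose ps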
6.1 A small extra
Check each ZooKeeper node's logs to see whether the cluster came up successfully and which node is the leader:
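The logs can be read straight from the containers:
docker logs zookeeper-01
docker logs zookeeper-02
docker logs zookeeper-03
Sample output: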
# zookeeper-01 state: following - broadcast
2021-02-01 13:14:34,568 [myid:1] - INFO [QuorumPeer[myid=1](plain=0.0.0.0:2187)(secure=disabled):QuorumPeer@863] - Peer state changed: following - broadcast
# zookeeper-02 state: following - broadcast
2021-02-01 13:14:34,611 [myid:2] - INFO [QuorumPeer[myid=2](plain=0.0.0.0:2188)(secure=disabled):QuorumPeer@863] - Peer state changed: following - broadcast
# zookeeper-03 state: leading - broadcast
2021-02-01 13:14:34,532 [myid:3] - INFO [QuorumPeer[myid=3](plain=0.0.0.0:2189)(secure=disabled):QuorumPeer@863] - Peer state changed: leading - broadcast
Some WARN entries may appear in the logs during startup. My read is that the ZooKeeper nodes do not start in truly parallel fashion: as soon as one node is up it immediately looks for the other configured nodes, hence the WARNs. Once all nodes are up they stop appearing (if you know a clean fix, please let me know).
IV. Verifying sharding and replication in the ClickHouse cluster
I'm using DBeaver here; a sketch of the verification SQL follows.
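Until this part is finished, here is a minimal verification sketch you can run from DBeaver or clickhouse-client. The table names test_local and test_all are hypothetical and assume the default database; cluster_1 and the {layer}/{shard}/{replica} macros come from the configs above.
-- the cluster should show 4 rows: 2 shards x 2 replicas
SELECT cluster, shard_num, replica_num, host_name
FROM system.clusters
WHERE cluster = 'cluster_1';
-- replicated local table, created on every node; the macros expand per node
CREATE TABLE test_local ON CLUSTER cluster_1
(
    id UInt32,
    name String
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{layer}-{shard}/test_local', '{replica}')
ORDER BY id;
-- distributed table routing reads/writes across the two shards
CREATE TABLE test_all ON CLUSTER cluster_1 AS test_local
ENGINE = Distributed(cluster_1, default, test_local, rand());
-- write through the distributed table, then read back from any node
INSERT INTO test_all VALUES (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd');
SELECT * FROM test_all;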
Under construction -------------------------------------------------------------
Summary
The images used in this article were the latest available as of 2021-01-22, so there may be a few small issues; latest versions, you know how it is.