ClickHouse Multi-Instance Deployment

This article draws on http://fuxkdb.com/2020/05/02/2020-05-02-ClickHouse多实例部署/ (thanks to the original author!).

Multi-instance deployment is not recommended in production: a single ClickHouse query can already use multiple CPU cores, so this setup is only suitable for test environments.

The cluster deployment relationships are as follows:

centos-1: node1 = shard 1, replica 1 (tcp 9000); node4 = shard 2, replica 2 (tcp 9002)
centos-2: node2 = shard 2, replica 1 (tcp 9000); node5 = shard 3, replica 2 (tcp 9002)
centos-3: node3 = shard 3, replica 1 (tcp 9000); node6 = shard 1, replica 2 (tcp 9002)

[Figure: logical structure of the 3-shard / 2-replica cluster]

Edit /etc/hosts on all three hosts and add the following entries:

172.16.120.10 centos-1
172.16.120.11 centos-2
172.16.120.12 centos-3

Installing dependencies

JDK

Download OpenJDK:
wget https://github.com/AdoptOpenJDK/openjdk8-binaries/releases/download/jdk8u242-b08/OpenJDK8U-jdk_x64_linux_hotspot_8u242b08.tar.gz
tar -zxvf OpenJDK8U-jdk_x64_linux_hotspot_8u242b08.tar.gz -C /usr/local/
Create a symlink:
ln -s /usr/local/jdk8u242-b08 /usr/local/java
Configure environment variables:
#vi ~/.bashrc
 
export JAVA_HOME=/usr/local/java
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH
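
To confirm the JDK is picked up, a quick check (the expected version string assumes the 8u242 build above):
source ~/.bashrc && java -version
# expect: openjdk version "1.8.0_242"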

ZooKeeper

ZooKeeper 3.6.0 has a known bug, so the stable 3.4.14 release is used instead.

Download the package:
wget https://downloads.apache.org/zookeeper/zookeeper-3.4.14/zookeeper-3.4.14.tar.gz
tar -zxvf zookeeper-3.4.14.tar.gz -C /usr/local/
Create a symlink:
ln -s /usr/local/zookeeper-3.4.14 /usr/local/zookeeper
Configure environment variables:
#vi ~/.bashrc
 
export ZOOKEEPER_HOME=/usr/local/zookeeper
export PATH=$PATH:$ZOOKEEPER_HOME/bin
Edit the configuration file:
cd /usr/local/zookeeper/conf
 
 
# per the official recommendations:
#https://clickhouse.tech/docs/en/operations/tips/#zookeeper
#vim zoo.cfg
 
 
tickTime=2000
initLimit=30000
syncLimit=10
maxClientCnxns=2000
maxSessionTimeout=60000000
dataDir=/data/zookeeper/data
dataLogDir=/data/zookeeper/logs
autopurge.snapRetainCount=10
autopurge.purgeInterval=1
preAllocSize=131072
snapCount=3000000
leaderServes=yes
clientPort=2181
 
 
 
The server-list section of zoo.cfg is different on each of the three nodes:
# centos-1
server.1=0.0.0.0:2888:3888
server.2=172.16.120.11:2888:3888
server.3=172.16.120.12:2888:3888
  
  
# centos-2
server.1=172.16.120.10:2888:3888
server.2=0.0.0.0:2888:3888
server.3=172.16.120.12:2888:3888
  
  
# centos-3
server.1=172.16.120.10:2888:3888
server.2=172.16.120.11:2888:3888
server.3=0.0.0.0:2888:3888
Create the data and log directories:
mkdir -p /data/zookeeper/{data,logs}
mkdir -p /usr/local/zookeeper/logs
Set the myid on each node:
# centos-1
echo "1">/data/zookeeper/data/myid
  
# centos-2
echo "2">/data/zookeeper/data/myid
  
# centos-3
echo "3">/data/zookeeper/data/myid
Configuring ZooKeeper logging

By default ZooKeeper writes its log to a single file and never cleans it up, so after a while the log becomes very large.

1. zookeeper-env.sh: create a new zookeeper-env.sh file in the ./conf directory and set its permissions with sudo chmod 755 zookeeper-env.sh.
#cat conf/zookeeper-env.sh
 
#!/usr/bin/env bash
# tip: custom configuration file; do not amend the zkEnv.sh file itself
# change the log dir and switch output to a rolling file
 
ZOO_LOG_DIR="/usr/local/zookeeper/logs"
ZOO_LOG4J_PROP="INFO,ROLLINGFILE"
2. log4j.properties: change how the logs are written:
zookeeper.root.logger=INFO, ROLLINGFILE
#zookeeper.root.logger=INFO, CONSOLE
 
 
# Max log file size
log4j.appender.ROLLINGFILE.MaxFileSize=128MB
# limit the number of backup files
log4j.appender.ROLLINGFILE.MaxBackupIndex=10
Configuring the JVM that runs ZooKeeper

Create a new java.env file in the ./conf directory and set its permissions with sudo chmod 755 java.env; it is mainly used to configure GC logging, heap size (RAM), and so on.

#!/usr/bin/env bash
# configure reasonable JVM parameters
# note: this file is sourced in by the shell, so export is not needed
# set the java classpath
#CLASSPATH=""
# set JVM start parameters; the JVMFLAGS variable can also be set
SERVER_JVMFLAGS="-Xms1024m -Xmx2048m $JVMFLAGS"
Start the ZooKeeper service (on all nodes):
# zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
Verify ZooKeeper:
# zkServer.sh status
 
#bin/zkCli.sh -server 127.0.0.1:2181
Connecting to 127.0.0.1:2181
 
 
[zk: 127.0.0.1:2181(CONNECTED) 0]
[zk: 127.0.0.1:2181(CONNECTED) 0] ls /
[zookeeper]
 
[zk: 127.0.0.1:2181(CONNECTED) 1] create /zk_test mydata
Created /zk_test
 
[zk: 127.0.0.1:2181(CONNECTED) 2] ls /
[zk_test, zookeeper]
 
[zk: 127.0.0.1:2181(CONNECTED) 3] get /zk_test
mydata
 
[zk: 127.0.0.1:2181(CONNECTED) 4] set /zk_test junk
[zk: 127.0.0.1:2181(CONNECTED) 5] get /zk_test
junk
 
[zk: 127.0.0.1:2181(CONNECTED) 6] delete /zk_test
[zk: 127.0.0.1:2181(CONNECTED) 7] ls /
[zookeeper]
 
[zk: 127.0.0.1:2181(CONNECTED) 8]
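
Besides zkCli.sh, ZooKeeper 3.4's four-letter-word commands give a quick health check (they are enabled by default on 3.4.14; nc is assumed to be installed):
echo ruok | nc 127.0.0.1 2181    # expect: imok
echo mntr | nc 127.0.0.1 2181    # metrics, including zk_server_state (leader/follower)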

Installing ClickHouse

Installation via yum

yum install yum-utils
rpm --import https://repo.clickhouse.tech/CLICKHOUSE-KEY.GPG
yum-config-manager --add-repo https://repo.clickhouse.tech/rpm/stable/x86_64
 
 
yum install clickhouse-server clickhouse-client

Offline installation via RPM

Check the ClickHouse source tags and pick the most recent long-term-support release.


Download the following packages from the official repository:
https://repo.yandex.ru/clickhouse/rpm/stable/x86_64/clickhouse-client-21.8.12.29-2.noarch.rpm 
https://repo.yandex.ru/clickhouse/rpm/stable/x86_64/clickhouse-common-static-21.8.12.29-2.x86_64.rpm
https://repo.yandex.ru/clickhouse/rpm/stable/x86_64/clickhouse-common-static-dbg-21.8.12.29-2.x86_64.rpm
https://repo.yandex.ru/clickhouse/rpm/stable/x86_64/clickhouse-server-21.8.12.29-2.noarch.rpm
https://repo.yandex.ru/clickhouse/rpm/stable/x86_64/clickhouse-test-21.8.12.29-2.noarch.rpm
Install the ClickHouse RPMs:
rpm -ivh /soft/clickhouse/*.rpm

Creating directories

Create directories on centos-1:
mkdir -p /data/clickhouse/{node1,node4}/{data,tmp,logs}
  
Create directories on centos-2:
mkdir -p /data/clickhouse/{node2,node5}/{data,tmp,logs}
  
Create directories on centos-3:
mkdir -p /data/clickhouse/{node3,node6}/{data,tmp,logs}

Creating the configuration files

Each instance gets a server parameter file (config.xml), an accounts file (users.xml), and a shard/cluster file (metrika.xml).

The node1 configuration files are as follows.

config.xml
<?xml version="1.0"?>
<yandex>
    <!-- Logging -->
    <logger>
        <level>warning</level>
        <log>/data/clickhouse/node1/logs/clickhouse.log</log>
        <errorlog>/data/clickhouse/node1/logs/error.log</errorlog>
        <size>500M</size>
        <count>5</count>
    </logger>
    <!-- Local node information -->
    <http_port>8123</http_port>
    <tcp_port>9000</tcp_port>
    <interserver_http_port>9009</interserver_http_port>
    <interserver_http_host>centos-1</interserver_http_host>
    <!-- This host's domain name or IP -->
    <!-- Local settings -->
    <listen_host>0.0.0.0</listen_host>
    <max_connections>2048</max_connections>
    <keep_alive_timeout>3</keep_alive_timeout>
    <max_concurrent_queries>64</max_concurrent_queries>
    <uncompressed_cache_size>4294967296</uncompressed_cache_size>
    <mark_cache_size>5368709120</mark_cache_size>
    <path>/data/clickhouse/node1/</path>
    <tmp_path>/data/clickhouse/node1/tmp/</tmp_path>
    <users_config>/data/clickhouse/node1/users.xml</users_config>
    <access_control_path>/data/clickhouse/node1/access/</access_control_path>
    <default_profile>default</default_profile>
    <query_log>
        <database>system</database>
        <table>query_log</table>
        <partition_by>toMonday(event_date)</partition_by>
        <flush_interval_milliseconds>7500</flush_interval_milliseconds>
    </query_log>
    <query_thread_log>
        <database>system</database>
        <table>query_thread_log</table>
        <partition_by>toMonday(event_date)</partition_by>
        <flush_interval_milliseconds>7500</flush_interval_milliseconds>
    </query_thread_log>
    <prometheus>
        <endpoint>/metrics</endpoint>
        <port>8001</port>
        <metrics>true</metrics>
        <events>true</events>
        <asynchronous_metrics>true</asynchronous_metrics>
    </prometheus>
    <default_database>default</default_database>
    <timezone>Asia/Shanghai</timezone>
    <!-- Cluster-related settings -->
    <remote_servers incl="clickhouse_remote_servers" />
    <zookeeper incl="zookeeper-servers" optional="true" />
    <macros incl="macros" optional="true" />
    <builtin_dictionaries_reload_interval>3600</builtin_dictionaries_reload_interval>
    <max_session_timeout>3600</max_session_timeout>
    <default_session_timeout>300</default_session_timeout>
    <merge_tree>
        <parts_to_delay_insert>300</parts_to_delay_insert>
        <parts_to_throw_insert>600</parts_to_throw_insert>
        <max_delay_to_insert>2</max_delay_to_insert>
    </merge_tree>
    <max_table_size_to_drop>0</max_table_size_to_drop>
    <max_partition_size_to_drop>0</max_partition_size_to_drop>
    <distributed_ddl>
        <!-- Path in ZooKeeper to queue with DDL queries -->
        <path>/clickhouse/task_queue/ddl</path>
    </distributed_ddl>
    <include_from>/data/clickhouse/node1/metrika.xml</include_from>
</yandex>

users.xml
<?xml version="1.0"?>
<yandex>
    <profiles>
        <default>
            <!-- Adjust to the actual memory of your machine -->
            <max_memory_usage>54975581388</max_memory_usage>
            <max_memory_usage_for_all_queries>61847529062</max_memory_usage_for_all_queries>
            <max_bytes_before_external_group_by>21474836480</max_bytes_before_external_group_by>
            <max_bytes_before_external_sort>21474836480</max_bytes_before_external_sort>
            <use_uncompressed_cache>0</use_uncompressed_cache>
            <load_balancing>random</load_balancing>
            <distributed_aggregation_memory_efficient>1</distributed_aggregation_memory_efficient>
            <max_threads>8</max_threads>
            <log_queries>1</log_queries>
        </default>
        <readonly>
            <max_threads>8</max_threads>
            <max_memory_usage>54975581388</max_memory_usage>
            <max_memory_usage_for_all_queries>61847529062</max_memory_usage_for_all_queries>
            <max_bytes_before_external_group_by>21474836480</max_bytes_before_external_group_by>
            <max_bytes_before_external_sort>21474836480</max_bytes_before_external_sort>
            <use_uncompressed_cache>0</use_uncompressed_cache>
            <load_balancing>random</load_balancing>
            <readonly>1</readonly>
            <distributed_aggregation_memory_efficient>1</distributed_aggregation_memory_efficient>
            <log_queries>1</log_queries>
        </readonly>
    </profiles>
    <quotas>
        <default>
            <interval>
                <duration>3600</duration>
                <queries>0</queries>
                <errors>0</errors>
                <result_rows>0</result_rows>
                <read_rows>0</read_rows>
                <execution_time>0</execution_time>
            </interval>
        </default>
    </quotas>
    <users>
        <default>
            <password_sha256_hex>328afba155145b9ab4dadc378a38f830107b1c48ae64a37690b163408b47c52c</password_sha256_hex>
            <access_management>1</access_management>
            <networks>
                <ip>::/0</ip>
            </networks>
            <profile>default</profile>
            <quota>default</quota>
        </default>
        <ch_ro>
            <password_sha256_hex>328afba155145b9ab4dadc378a38f830107b1c48ae64a37690b163408b47c52c</password_sha256_hex>
            <networks>
                <ip>::/0</ip>
            </networks>
            <profile>readonly</profile>
            <quota>default</quota>
        </ch_ro>
    </users>
</yandex>

metrika.xml
<?xml version="1.0"?>
<yandex>
    <!-- ClickHouse cluster nodes -->
    <clickhouse_remote_servers>
        <ch_cluster_all>
            <!-- Shard 1 -->
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>centos-1</host>
                    <port>9000</port>
                    <user>default</user>
                    <password>t5P9rh5M</password>
                </replica>
                <!-- Replica set 1 -->
                <replica>
                    <host>centos-3</host>
                    <port>9002</port>
                    <user>default</user>
                    <password>t5P9rh5M</password>
                </replica>
            </shard>
            <!-- Shard 2 -->
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>centos-2</host>
                    <port>9000</port>
                    <user>default</user>
                    <password>t5P9rh5M</password>
                </replica>
                <!-- Replica set 2 -->
                <replica>
                    <host>centos-1</host>
                    <port>9002</port>
                    <user>default</user>
                    <password>t5P9rh5M</password>
                </replica>
            </shard>
            <!-- Shard 3 -->
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>centos-3</host>
                    <port>9000</port>
                    <user>default</user>
                    <password>t5P9rh5M</password>
                </replica>
                <!-- Replica set 3 -->
                <replica>
                    <host>centos-2</host>
                    <port>9002</port>
                    <user>default</user>
                    <password>t5P9rh5M</password>
                </replica>
            </shard>
        </ch_cluster_all>
    </clickhouse_remote_servers>
    <!-- ZooKeeper settings -->
    <zookeeper-servers>
        <node index="1">
            <host>centos-1</host>
            <port>2181</port>
        </node>
        <node index="2">
            <host>centos-2</host>
            <port>2181</port>
        </node>
        <node index="3">
            <host>centos-3</host>
            <port>2181</port>
        </node>
    </zookeeper-servers>
    <macros>
        <layer>01</layer>
        <shard>01</shard>
        <!-- Shard number -->
        <replica>node1</replica>
        <!-- Replica name of this node -->
    </macros>
    <networks>
        <ip>::/0</ip>
    </networks>
    <!-- Compression settings -->
    <clickhouse_compression>
        <case>
            <min_part_size>10000000000</min_part_size>
            <min_part_size_ratio>0.01</min_part_size_ratio>
            <method>lz4</method>
            <!-- lz4 compresses faster than zstd but uses more disk space -->
        </case>
    </clickhouse_compression>
</yandex>

The node2 configuration files are as follows:

config.xml
<?xml version="1.0"?>
<yandex>
    <!-- Logging -->
    <logger>
        <level>warning</level>
        <log>/data/clickhouse/node2/logs/clickhouse.log</log>
        <errorlog>/data/clickhouse/node2/logs/error.log</errorlog>
        <size>500M</size>
        <count>5</count>
    </logger>
    <!-- Local node information -->
    <http_port>8123</http_port>
    <tcp_port>9000</tcp_port>
    <interserver_http_port>9009</interserver_http_port>
    <interserver_http_host>centos-2</interserver_http_host>
    <!-- This host's domain name or IP -->
    <!-- Local settings -->
    <listen_host>0.0.0.0</listen_host>
    <max_connections>2048</max_connections>
    <keep_alive_timeout>3</keep_alive_timeout>
    <max_concurrent_queries>64</max_concurrent_queries>
    <uncompressed_cache_size>4294967296</uncompressed_cache_size>
    <mark_cache_size>5368709120</mark_cache_size>
    <path>/data/clickhouse/node2/</path>
    <tmp_path>/data/clickhouse/node2/tmp/</tmp_path>
    <users_config>/data/clickhouse/node2/users.xml</users_config>
    <access_control_path>/data/clickhouse/node2/access/</access_control_path>
    <default_profile>default</default_profile>
    <query_log>
        <database>system</database>
        <table>query_log</table>
        <partition_by>toMonday(event_date)</partition_by>
        <flush_interval_milliseconds>7500</flush_interval_milliseconds>
    </query_log>
    <query_thread_log>
        <database>system</database>
        <table>query_thread_log</table>
        <partition_by>toMonday(event_date)</partition_by>
        <flush_interval_milliseconds>7500</flush_interval_milliseconds>
    </query_thread_log>
    <prometheus>
        <endpoint>/metrics</endpoint>
        <port>8001</port>
        <metrics>true</metrics>
        <events>true</events>
        <asynchronous_metrics>true</asynchronous_metrics>
    </prometheus>
    <default_database>default</default_database>
    <timezone>Asia/Shanghai</timezone>
    <!-- Cluster-related settings -->
    <remote_servers incl="clickhouse_remote_servers" />
    <zookeeper incl="zookeeper-servers" optional="true" />
    <macros incl="macros" optional="true" />
    <builtin_dictionaries_reload_interval>3600</builtin_dictionaries_reload_interval>
    <max_session_timeout>3600</max_session_timeout>
    <default_session_timeout>300</default_session_timeout>
    <merge_tree>
        <parts_to_delay_insert>300</parts_to_delay_insert>
        <parts_to_throw_insert>600</parts_to_throw_insert>
        <max_delay_to_insert>2</max_delay_to_insert>
    </merge_tree>
    <max_table_size_to_drop>0</max_table_size_to_drop>
    <max_partition_size_to_drop>0</max_partition_size_to_drop>
    <distributed_ddl>
        <!-- Path in ZooKeeper to queue with DDL queries -->
        <path>/clickhouse/task_queue/ddl</path>
    </distributed_ddl>
    <include_from>/data/clickhouse/node2/metrika.xml</include_from>
</yandex>
users.xml
<?xml version="1.0"?>
<yandex>
    <profiles>
        <default>
            <!-- Adjust to the actual memory of your machine -->
            <max_memory_usage>54975581388</max_memory_usage>
            <max_memory_usage_for_all_queries>61847529062</max_memory_usage_for_all_queries>
            <max_bytes_before_external_group_by>21474836480</max_bytes_before_external_group_by>
            <max_bytes_before_external_sort>21474836480</max_bytes_before_external_sort>
            <use_uncompressed_cache>0</use_uncompressed_cache>
            <load_balancing>random</load_balancing>
            <distributed_aggregation_memory_efficient>1</distributed_aggregation_memory_efficient>
            <max_threads>8</max_threads>
            <log_queries>1</log_queries>
        </default>
        <readonly>
            <max_threads>8</max_threads>
            <max_memory_usage>54975581388</max_memory_usage>
            <max_memory_usage_for_all_queries>61847529062</max_memory_usage_for_all_queries>
            <max_bytes_before_external_group_by>21474836480</max_bytes_before_external_group_by>
            <max_bytes_before_external_sort>21474836480</max_bytes_before_external_sort>
            <use_uncompressed_cache>0</use_uncompressed_cache>
            <load_balancing>random</load_balancing>
            <readonly>1</readonly>
            <distributed_aggregation_memory_efficient>1</distributed_aggregation_memory_efficient>
            <log_queries>1</log_queries>
        </readonly>
    </profiles>
    <quotas>
        <default>
            <interval>
                <duration>3600</duration>
                <queries>0</queries>
                <errors>0</errors>
                <result_rows>0</result_rows>
                <read_rows>0</read_rows>
                <execution_time>0</execution_time>
            </interval>
        </default>
    </quotas>
    <users>
        <default>
            <password_sha256_hex>328afba155145b9ab4dadc378a38f830107b1c48ae64a37690b163408b47c52c</password_sha256_hex>
            <access_management>1</access_management>
            <networks>
                <ip>::/0</ip>
            </networks>
            <profile>default</profile>
            <quota>default</quota>
        </default>
        <ch_ro>
            <password_sha256_hex>328afba155145b9ab4dadc378a38f830107b1c48ae64a37690b163408b47c52c</password_sha256_hex>
            <networks>
                <ip>::/0</ip>
            </networks>
            <profile>readonly</profile>
            <quota>default</quota>
        </ch_ro>
    </users>
</yandex>
metrika.xml
<?xml version="1.0"?>
<yandex>
    <!-- ClickHouse cluster nodes -->
    <clickhouse_remote_servers>
        <ch_cluster_all>
            <!-- Shard 1 -->
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>centos-1</host>
                    <port>9000</port>
                    <user>default</user>
                    <password>t5P9rh5M</password>
                </replica>
                <!-- Replica set 1 -->
                <replica>
                    <host>centos-3</host>
                    <port>9002</port>
                    <user>default</user>
                    <password>t5P9rh5M</password>
                </replica>
            </shard>
            <!-- Shard 2 -->
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>centos-2</host>
                    <port>9000</port>
                    <user>default</user>
                    <password>t5P9rh5M</password>
                </replica>
                <!-- Replica set 2 -->
                <replica>
                    <host>centos-1</host>
                    <port>9002</port>
                    <user>default</user>
                    <password>t5P9rh5M</password>
                </replica>
            </shard>
            <!-- Shard 3 -->
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>centos-3</host>
                    <port>9000</port>
                    <user>default</user>
                    <password>t5P9rh5M</password>
                </replica>
                <!-- Replica set 3 -->
                <replica>
                    <host>centos-2</host>
                    <port>9002</port>
                    <user>default</user>
                    <password>t5P9rh5M</password>
                </replica>
            </shard>
        </ch_cluster_all>
    </clickhouse_remote_servers>
    <!-- ZooKeeper settings -->
    <zookeeper-servers>
        <node index="1">
            <host>centos-1</host>
            <port>2181</port>
        </node>
        <node index="2">
            <host>centos-2</host>
            <port>2181</port>
        </node>
        <node index="3">
            <host>centos-3</host>
            <port>2181</port>
        </node>
    </zookeeper-servers>
    <macros>
        <layer>01</layer>
        <shard>02</shard>
        <!-- Shard number -->
        <replica>node2</replica>
        <!-- Replica name of this node -->
    </macros>
    <networks>
        <ip>::/0</ip>
    </networks>
    <!-- Compression settings -->
    <clickhouse_compression>
        <case>
            <min_part_size>10000000000</min_part_size>
            <min_part_size_ratio>0.01</min_part_size_ratio>
            <method>lz4</method>
            <!-- lz4 compresses faster than zstd but uses more disk space -->
        </case>
    </clickhouse_compression>
</yandex>

The node3 configuration files are as follows:

config.xml
<?xml version="1.0"?>
<yandex>
    <!-- Logging -->
    <logger>
        <level>warning</level>
        <log>/data/clickhouse/node3/logs/clickhouse.log</log>
        <errorlog>/data/clickhouse/node3/logs/error.log</errorlog>
        <size>500M</size>
        <count>5</count>
    </logger>
    <!-- Local node information -->
    <http_port>8123</http_port>
    <tcp_port>9000</tcp_port>
    <interserver_http_port>9009</interserver_http_port>
    <interserver_http_host>centos-3</interserver_http_host>
    <!-- This host's domain name or IP -->
    <!-- Local settings -->
    <listen_host>0.0.0.0</listen_host>
    <max_connections>2048</max_connections>
    <keep_alive_timeout>3</keep_alive_timeout>
    <max_concurrent_queries>64</max_concurrent_queries>
    <uncompressed_cache_size>4294967296</uncompressed_cache_size>
    <mark_cache_size>5368709120</mark_cache_size>
    <path>/data/clickhouse/node3/</path>
    <tmp_path>/data/clickhouse/node3/tmp/</tmp_path>
    <users_config>/data/clickhouse/node3/users.xml</users_config>
    <access_control_path>/data/clickhouse/node3/access/</access_control_path>
    <default_profile>default</default_profile>
    <query_log>
        <database>system</database>
        <table>query_log</table>
        <partition_by>toMonday(event_date)</partition_by>
        <flush_interval_milliseconds>7500</flush_interval_milliseconds>
    </query_log>
    <query_thread_log>
        <database>system</database>
        <table>query_thread_log</table>
        <partition_by>toMonday(event_date)</partition_by>
        <flush_interval_milliseconds>7500</flush_interval_milliseconds>
    </query_thread_log>
    <prometheus>
        <endpoint>/metrics</endpoint>
        <port>8001</port>
        <metrics>true</metrics>
        <events>true</events>
        <asynchronous_metrics>true</asynchronous_metrics>
    </prometheus>
    <default_database>default</default_database>
    <timezone>Asia/Shanghai</timezone>
    <!-- Cluster-related settings -->
    <remote_servers incl="clickhouse_remote_servers" />
    <zookeeper incl="zookeeper-servers" optional="true" />
    <macros incl="macros" optional="true" />
    <builtin_dictionaries_reload_interval>3600</builtin_dictionaries_reload_interval>
    <max_session_timeout>3600</max_session_timeout>
    <default_session_timeout>300</default_session_timeout>
    <merge_tree>
        <parts_to_delay_insert>300</parts_to_delay_insert>
        <parts_to_throw_insert>600</parts_to_throw_insert>
        <max_delay_to_insert>2</max_delay_to_insert>
    </merge_tree>
    <max_table_size_to_drop>0</max_table_size_to_drop>
    <max_partition_size_to_drop>0</max_partition_size_to_drop>
    <distributed_ddl>
        <!-- Path in ZooKeeper to queue with DDL queries -->
        <path>/clickhouse/task_queue/ddl</path>
    </distributed_ddl>
    <include_from>/data/clickhouse/node3/metrika.xml</include_from>
</yandex>
users.xml
<?xml version="1.0"?>
<yandex>
    <profiles>
        <default>
            <!-- Adjust to the actual memory of your machine -->
            <max_memory_usage>54975581388</max_memory_usage>
            <max_memory_usage_for_all_queries>61847529062</max_memory_usage_for_all_queries>
            <max_bytes_before_external_group_by>21474836480</max_bytes_before_external_group_by>
            <max_bytes_before_external_sort>21474836480</max_bytes_before_external_sort>
            <use_uncompressed_cache>0</use_uncompressed_cache>
            <load_balancing>random</load_balancing>
            <distributed_aggregation_memory_efficient>1</distributed_aggregation_memory_efficient>
            <max_threads>8</max_threads>
            <log_queries>1</log_queries>
        </default>
        <readonly>
            <max_threads>8</max_threads>
            <max_memory_usage>54975581388</max_memory_usage>
            <max_memory_usage_for_all_queries>61847529062</max_memory_usage_for_all_queries>
            <max_bytes_before_external_group_by>21474836480</max_bytes_before_external_group_by>
            <max_bytes_before_external_sort>21474836480</max_bytes_before_external_sort>
            <use_uncompressed_cache>0</use_uncompressed_cache>
            <load_balancing>random</load_balancing>
            <readonly>1</readonly>
            <distributed_aggregation_memory_efficient>1</distributed_aggregation_memory_efficient>
            <log_queries>1</log_queries>
        </readonly>
    </profiles>
    <quotas>
        <default>
            <interval>
                <duration>3600</duration>
                <queries>0</queries>
                <errors>0</errors>
                <result_rows>0</result_rows>
                <read_rows>0</read_rows>
                <execution_time>0</execution_time>
            </interval>
        </default>
    </quotas>
    <users>
        <default>
            <password_sha256_hex>328afba155145b9ab4dadc378a38f830107b1c48ae64a37690b163408b47c52c</password_sha256_hex>
            <access_management>1</access_management>
            <networks>
                <ip>::/0</ip>
            </networks>
            <profile>default</profile>
            <quota>default</quota>
        </default>
        <ch_ro>
            <password_sha256_hex>328afba155145b9ab4dadc378a38f830107b1c48ae64a37690b163408b47c52c</password_sha256_hex>
            <networks>
                <ip>::/0</ip>
            </networks>
            <profile>readonly</profile>
            <quota>default</quota>
        </ch_ro>
    </users>
</yandex>
metrika.xml
<?xml version="1.0"?>
<yandex>
    <!-- ClickHouse cluster nodes -->
    <clickhouse_remote_servers>
        <ch_cluster_all>
            <!-- Shard 1 -->
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>centos-1</host>
                    <port>9000</port>
                    <user>default</user>
                    <password>t5P9rh5M</password>
                </replica>
                <!-- Replica set 1 -->
                <replica>
                    <host>centos-3</host>
                    <port>9002</port>
                    <user>default</user>
                    <password>t5P9rh5M</password>
                </replica>
            </shard>
            <!-- Shard 2 -->
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>centos-2</host>
                    <port>9000</port>
                    <user>default</user>
                    <password>t5P9rh5M</password>
                </replica>
                <!-- Replica set 2 -->
                <replica>
                    <host>centos-1</host>
                    <port>9002</port>
                    <user>default</user>
                    <password>t5P9rh5M</password>
                </replica>
            </shard>
            <!-- Shard 3 -->
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>centos-3</host>
                    <port>9000</port>
                    <user>default</user>
                    <password>t5P9rh5M</password>
                </replica>
                <!-- Replica set 3 -->
                <replica>
                    <host>centos-2</host>
                    <port>9002</port>
                    <user>default</user>
                    <password>t5P9rh5M</password>
                </replica>
            </shard>
        </ch_cluster_all>
    </clickhouse_remote_servers>
    <!-- ZooKeeper settings -->
    <zookeeper-servers>
        <node index="1">
            <host>centos-1</host>
            <port>2181</port>
        </node>
        <node index="2">
            <host>centos-2</host>
            <port>2181</port>
        </node>
        <node index="3">
            <host>centos-3</host>
            <port>2181</port>
        </node>
    </zookeeper-servers>
    <macros>
        <layer>01</layer>
        <shard>03</shard>
        <!-- Shard number -->
        <replica>node3</replica>
        <!-- Replica name of this node -->
    </macros>
    <networks>
        <ip>::/0</ip>
    </networks>
    <!-- Compression settings -->
    <clickhouse_compression>
        <case>
            <min_part_size>10000000000</min_part_size>
            <min_part_size_ratio>0.01</min_part_size_ratio>
            <method>lz4</method>
            <!-- lz4 compresses faster than zstd but uses more disk space -->
        </case>
    </clickhouse_compression>
</yandex>

The node4 configuration files are as follows:

config.xml
<?xml version="1.0"?>
<yandex>
    <!-- Logging -->
    <logger>
        <level>warning</level>
        <log>/data/clickhouse/node4/logs/clickhouse.log</log>
        <errorlog>/data/clickhouse/node4/logs/error.log</errorlog>
        <size>500M</size>
        <count>5</count>
    </logger>
    <!-- Local node information -->
    <http_port>8124</http_port>
    <tcp_port>9002</tcp_port>
    <interserver_http_port>9010</interserver_http_port>
    <interserver_http_host>centos-1</interserver_http_host>
    <!-- This host's domain name or IP -->
    <!-- Local settings -->
    <listen_host>0.0.0.0</listen_host>
    <max_connections>2048</max_connections>
    <keep_alive_timeout>3</keep_alive_timeout>
    <max_concurrent_queries>64</max_concurrent_queries>
    <uncompressed_cache_size>4294967296</uncompressed_cache_size>
    <mark_cache_size>5368709120</mark_cache_size>
    <path>/data/clickhouse/node4/</path>
    <tmp_path>/data/clickhouse/node4/tmp/</tmp_path>
    <users_config>/data/clickhouse/node4/users.xml</users_config>
    <access_control_path>/data/clickhouse/node4/access/</access_control_path>
    <default_profile>default</default_profile>
    <query_log>
        <database>system</database>
        <table>query_log</table>
        <partition_by>toMonday(event_date)</partition_by>
        <flush_interval_milliseconds>7500</flush_interval_milliseconds>
    </query_log>
    <query_thread_log>
        <database>system</database>
        <table>query_thread_log</table>
        <partition_by>toMonday(event_date)</partition_by>
        <flush_interval_milliseconds>7500</flush_interval_milliseconds>
    </query_thread_log>
    <prometheus>
        <endpoint>/metrics</endpoint>
        <port>8002</port>
        <metrics>true</metrics>
        <events>true</events>
        <asynchronous_metrics>true</asynchronous_metrics>
    </prometheus>
    <default_database>default</default_database>
    <timezone>Asia/Shanghai</timezone>
    <!-- Cluster-related settings -->
    <remote_servers incl="clickhouse_remote_servers" />
    <zookeeper incl="zookeeper-servers" optional="true" />
    <macros incl="macros" optional="true" />
    <builtin_dictionaries_reload_interval>3600</builtin_dictionaries_reload_interval>
    <max_session_timeout>3600</max_session_timeout>
    <default_session_timeout>300</default_session_timeout>
    <merge_tree>
        <parts_to_delay_insert>300</parts_to_delay_insert>
        <parts_to_throw_insert>600</parts_to_throw_insert>
        <max_delay_to_insert>2</max_delay_to_insert>
    </merge_tree>
    <max_table_size_to_drop>0</max_table_size_to_drop>
    <max_partition_size_to_drop>0</max_partition_size_to_drop>
    <distributed_ddl>
        <!-- Path in ZooKeeper to queue with DDL queries -->
        <path>/clickhouse/task_queue/ddl</path>
    </distributed_ddl>
    <include_from>/data/clickhouse/node4/metrika.xml</include_from>
</yandex>
users.xml
<?xml version="1.0"?>
<yandex>
    <profiles>
        <default>
            <!-- Adjust to the actual memory of your machine -->
            <max_memory_usage>54975581388</max_memory_usage>
            <max_memory_usage_for_all_queries>61847529062</max_memory_usage_for_all_queries>
            <max_bytes_before_external_group_by>21474836480</max_bytes_before_external_group_by>
            <max_bytes_before_external_sort>21474836480</max_bytes_before_external_sort>
            <use_uncompressed_cache>0</use_uncompressed_cache>
            <load_balancing>random</load_balancing>
            <distributed_aggregation_memory_efficient>1</distributed_aggregation_memory_efficient>
            <max_threads>8</max_threads>
            <log_queries>1</log_queries>
        </default>
        <readonly>
            <max_threads>8</max_threads>
            <max_memory_usage>54975581388</max_memory_usage>
            <max_memory_usage_for_all_queries>61847529062</max_memory_usage_for_all_queries>
            <max_bytes_before_external_group_by>21474836480</max_bytes_before_external_group_by>
            <max_bytes_before_external_sort>21474836480</max_bytes_before_external_sort>
            <use_uncompressed_cache>0</use_uncompressed_cache>
            <load_balancing>random</load_balancing>
            <readonly>1</readonly>
            <distributed_aggregation_memory_efficient>1</distributed_aggregation_memory_efficient>
            <log_queries>1</log_queries>
        </readonly>
    </profiles>
    <quotas>
        <default>
            <interval>
                <duration>3600</duration>
                <queries>0</queries>
                <errors>0</errors>
                <result_rows>0</result_rows>
                <read_rows>0</read_rows>
                <execution_time>0</execution_time>
            </interval>
        </default>
    </quotas>
    <users>
        <default>
            <password_sha256_hex>328afba155145b9ab4dadc378a38f830107b1c48ae64a37690b163408b47c52c</password_sha256_hex>
            <access_management>1</access_management>
            <networks>
                <ip>::/0</ip>
            </networks>
            <profile>default</profile>
            <quota>default</quota>
        </default>
        <ch_ro>
            <password_sha256_hex>328afba155145b9ab4dadc378a38f830107b1c48ae64a37690b163408b47c52c</password_sha256_hex>
            <networks>
                <ip>::/0</ip>
            </networks>
            <profile>readonly</profile>
            <quota>default</quota>
        </ch_ro>
    </users>
</yandex>
metrika.xml
<?xml version="1.0"?>
<yandex>
    <!-- ClickHouse cluster nodes -->
    <clickhouse_remote_servers>
        <ch_cluster_all>
            <!-- Shard 1 -->
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>centos-1</host>
                    <port>9000</port>
                    <user>default</user>
                    <password>t5P9rh5M</password>
                </replica>
                <!-- Replica set 1 -->
                <replica>
                    <host>centos-3</host>
                    <port>9002</port>
                    <user>default</user>
                    <password>t5P9rh5M</password>
                </replica>
            </shard>
            <!-- Shard 2 -->
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>centos-2</host>
                    <port>9000</port>
                    <user>default</user>
                    <password>t5P9rh5M</password>
                </replica>
                <!-- Replica set 2 -->
                <replica>
                    <host>centos-1</host>
                    <port>9002</port>
                    <user>default</user>
                    <password>t5P9rh5M</password>
                </replica>
            </shard>
            <!-- Shard 3 -->
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>centos-3</host>
                    <port>9000</port>
                    <user>default</user>
                    <password>t5P9rh5M</password>
                </replica>
                <!-- Replica set 3 -->
                <replica>
                    <host>centos-2</host>
                    <port>9002</port>
                    <user>default</user>
                    <password>t5P9rh5M</password>
                </replica>
            </shard>
        </ch_cluster_all>
    </clickhouse_remote_servers>
    <!-- ZooKeeper settings -->
    <zookeeper-servers>
        <node index="1">
            <host>centos-1</host>
            <port>2181</port>
        </node>
        <node index="2">
            <host>centos-2</host>
            <port>2181</port>
        </node>
        <node index="3">
            <host>centos-3</host>
            <port>2181</port>
        </node>
    </zookeeper-servers>
    <macros>
        <layer>01</layer>
        <shard>02</shard>
        <!-- Shard number -->
        <replica>node4</replica>
    </macros>
    <networks>
        <ip>::/0</ip>
    </networks>
    <!-- Compression settings -->
    <clickhouse_compression>
        <case>
            <min_part_size>10000000000</min_part_size>
            <min_part_size_ratio>0.01</min_part_size_ratio>
            <method>lz4</method>
            <!-- lz4 compresses faster than zstd but uses more disk space -->
        </case>
    </clickhouse_compression>
</yandex>

The node5 configuration files are as follows:

config.xml
<?xml version="1.0"?>
<yandex>
    <!-- Logging -->
    <logger>
        <level>warning</level>
        <log>/data/clickhouse/node5/logs/clickhouse.log</log>
        <errorlog>/data/clickhouse/node5/logs/error.log</errorlog>
        <size>500M</size>
        <count>5</count>
    </logger>
    <!-- Local node information -->
    <http_port>8124</http_port>
    <tcp_port>9002</tcp_port>
    <interserver_http_port>9010</interserver_http_port>
    <interserver_http_host>centos-2</interserver_http_host>
    <!-- This host's domain name or IP -->
    <!-- Local settings -->
    <listen_host>0.0.0.0</listen_host>
    <max_connections>2048</max_connections>
    <keep_alive_timeout>3</keep_alive_timeout>
    <max_concurrent_queries>64</max_concurrent_queries>
    <uncompressed_cache_size>4294967296</uncompressed_cache_size>
    <mark_cache_size>5368709120</mark_cache_size>
    <path>/data/clickhouse/node5/</path>
    <tmp_path>/data/clickhouse/node5/tmp/</tmp_path>
    <users_config>/data/clickhouse/node5/users.xml</users_config>
    <access_control_path>/data/clickhouse/node5/access/</access_control_path>
    <default_profile>default</default_profile>
    <query_log>
        <database>system</database>
        <table>query_log</table>
        <partition_by>toMonday(event_date)</partition_by>
        <flush_interval_milliseconds>7500</flush_interval_milliseconds>
    </query_log>
    <query_thread_log>
        <database>system</database>
        <table>query_thread_log</table>
        <partition_by>toMonday(event_date)</partition_by>
        <flush_interval_milliseconds>7500</flush_interval_milliseconds>
    </query_thread_log>
    <prometheus>
        <endpoint>/metrics</endpoint>
        <port>8002</port>
        <metrics>true</metrics>
        <events>true</events>
        <asynchronous_metrics>true</asynchronous_metrics>
    </prometheus>
    <default_database>default</default_database>
    <timezone>Asia/Shanghai</timezone>
    <!-- Cluster-related settings -->
    <remote_servers incl="clickhouse_remote_servers" />
    <zookeeper incl="zookeeper-servers" optional="true" />
    <macros incl="macros" optional="true" />
    <builtin_dictionaries_reload_interval>3600</builtin_dictionaries_reload_interval>
    <max_session_timeout>3600</max_session_timeout>
    <default_session_timeout>300</default_session_timeout>
    <merge_tree>
        <parts_to_delay_insert>300</parts_to_delay_insert>
        <parts_to_throw_insert>600</parts_to_throw_insert>
        <max_delay_to_insert>2</max_delay_to_insert>
    </merge_tree>
    <max_table_size_to_drop>0</max_table_size_to_drop>
    <max_partition_size_to_drop>0</max_partition_size_to_drop>
    <distributed_ddl>
        <!-- Path in ZooKeeper to queue with DDL queries -->
        <path>/clickhouse/task_queue/ddl</path>
    </distributed_ddl>
    <include_from>/data/clickhouse/node5/metrika.xml</include_from>
</yandex>
users.xml
<?xml version="1.0"?>
<yandex>
    <profiles>
        <default>
            <!-- Adjust to the actual memory of your machine -->
            <max_memory_usage>54975581388</max_memory_usage>
            <max_memory_usage_for_all_queries>61847529062</max_memory_usage_for_all_queries>
            <max_bytes_before_external_group_by>21474836480</max_bytes_before_external_group_by>
            <max_bytes_before_external_sort>21474836480</max_bytes_before_external_sort>
            <use_uncompressed_cache>0</use_uncompressed_cache>
            <load_balancing>random</load_balancing>
            <distributed_aggregation_memory_efficient>1</distributed_aggregation_memory_efficient>
            <max_threads>8</max_threads>
            <log_queries>1</log_queries>
        </default>
        <readonly>
            <max_threads>8</max_threads>
            <max_memory_usage>54975581388</max_memory_usage>
            <max_memory_usage_for_all_queries>61847529062</max_memory_usage_for_all_queries>
            <max_bytes_before_external_group_by>21474836480</max_bytes_before_external_group_by>
            <max_bytes_before_external_sort>21474836480</max_bytes_before_external_sort>
            <use_uncompressed_cache>0</use_uncompressed_cache>
            <load_balancing>random</load_balancing>
            <readonly>1</readonly>
            <distributed_aggregation_memory_efficient>1</distributed_aggregation_memory_efficient>
            <log_queries>1</log_queries>
        </readonly>
    </profiles>
    <quotas>
        <default>
            <interval>
                <duration>3600</duration>
                <queries>0</queries>
                <errors>0</errors>
                <result_rows>0</result_rows>
                <read_rows>0</read_rows>
                <execution_time>0</execution_time>
            </interval>
        </default>
    </quotas>
    <users>
        <default>
            <password_sha256_hex>328afba155145b9ab4dadc378a38f830107b1c48ae64a37690b163408b47c52c</password_sha256_hex>
            <access_management>1</access_management>
            <networks>
                <ip>::/0</ip>
            </networks>
            <profile>default</profile>
            <quota>default</quota>
        </default>
        <ch_ro>
            <password_sha256_hex>328afba155145b9ab4dadc378a38f830107b1c48ae64a37690b163408b47c52c</password_sha256_hex>
            <networks>
                <ip>::/0</ip>
            </networks>
            <profile>readonly</profile>
            <quota>default</quota>
        </ch_ro>
    </users>
</yandex>
metrika.xml
<?xml version="1.0"?>
<yandex>
    <!-- ClickHouse cluster nodes -->
    <clickhouse_remote_servers>
        <ch_cluster_all>
            <!-- Shard 1 -->
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>centos-1</host>
                    <port>9000</port>
                    <user>default</user>
                    <password>t5P9rh5M</password>
                </replica>
                <!-- Replica set 1 -->
                <replica>
                    <host>centos-3</host>
                    <port>9002</port>
                    <user>default</user>
                    <password>t5P9rh5M</password>
                </replica>
            </shard>
            <!-- Shard 2 -->
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>centos-2</host>
                    <port>9000</port>
                    <user>default</user>
                    <password>t5P9rh5M</password>
                </replica>
                <!-- Replica set 2 -->
                <replica>
                    <host>centos-1</host>
                    <port>9002</port>
                    <user>default</user>
                    <password>t5P9rh5M</password>
                </replica>
            </shard>
            <!-- Shard 3 -->
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>centos-3</host>
                    <port>9000</port>
                    <user>default</user>
                    <password>t5P9rh5M</password>
                </replica>
                <!-- Replica set 3 -->
                <replica>
                    <host>centos-2</host>
                    <port>9002</port>
                    <user>default</user>
                    <password>t5P9rh5M</password>
                </replica>
            </shard>
        </ch_cluster_all>
    </clickhouse_remote_servers>
    <!-- ZooKeeper settings -->
    <zookeeper-servers>
        <node index="1">
            <host>centos-1</host>
            <port>2181</port>
        </node>
        <node index="2">
            <host>centos-2</host>
            <port>2181</port>
        </node>
        <node index="3">
            <host>centos-3</host>
            <port>2181</port>
        </node>
    </zookeeper-servers>
    <macros>
        <layer>01</layer>
        <shard>03</shard>
        <!-- Shard number -->
        <replica>node5</replica>
        <!-- Replica name of this node -->
    </macros>
    <networks>
        <ip>::/0</ip>
    </networks>
    <!-- Compression settings -->
    <clickhouse_compression>
        <case>
            <min_part_size>10000000000</min_part_size>
            <min_part_size_ratio>0.01</min_part_size_ratio>
            <method>lz4</method>
            <!-- lz4 compresses faster than zstd but uses more disk space -->
        </case>
    </clickhouse_compression>
</yandex>

The node6 configuration files are as follows:

config.xml
<?xml version="1.0"?>
<yandex>
    <!-- Logging -->
    <logger>
        <level>warning</level>
        <log>/data/clickhouse/node6/logs/clickhouse.log</log>
        <errorlog>/data/clickhouse/node6/logs/error.log</errorlog>
        <size>500M</size>
        <count>5</count>
    </logger>
    <!-- Local node information -->
    <http_port>8124</http_port>
    <tcp_port>9002</tcp_port>
    <interserver_http_port>9010</interserver_http_port>
    <interserver_http_host>centos-3</interserver_http_host>
    <!-- This host's domain name or IP -->
    <!-- Local settings -->
    <listen_host>0.0.0.0</listen_host>
    <max_connections>2048</max_connections>
    <keep_alive_timeout>3</keep_alive_timeout>
    <max_concurrent_queries>64</max_concurrent_queries>
    <uncompressed_cache_size>4294967296</uncompressed_cache_size>
    <mark_cache_size>5368709120</mark_cache_size>
    <path>/data/clickhouse/node6/</path>
    <tmp_path>/data/clickhouse/node6/tmp/</tmp_path>
    <users_config>/data/clickhouse/node6/users.xml</users_config>
    <access_control_path>/data/clickhouse/node6/access/</access_control_path>
    <default_profile>default</default_profile>
    <query_log>
        <database>system</database>
        <table>query_log</table>
        <partition_by>toMonday(event_date)</partition_by>
        <flush_interval_milliseconds>7500</flush_interval_milliseconds>
    </query_log>
    <query_thread_log>
        <database>system</database>
        <table>query_thread_log</table>
        <partition_by>toMonday(event_date)</partition_by>
        <flush_interval_milliseconds>7500</flush_interval_milliseconds>
    </query_thread_log>
    <prometheus>
        <endpoint>/metrics</endpoint>
        <port>8002</port>
        <metrics>true</metrics>
        <events>true</events>
        <asynchronous_metrics>true</asynchronous_metrics>
    </prometheus>
    <default_database>default</default_database>
    <timezone>Asia/Shanghai</timezone>
    <!-- Cluster-related settings -->
    <remote_servers incl="clickhouse_remote_servers" />
    <zookeeper incl="zookeeper-servers" optional="true" />
    <macros incl="macros" optional="true" />
    <builtin_dictionaries_reload_interval>3600</builtin_dictionaries_reload_interval>
    <max_session_timeout>3600</max_session_timeout>
    <default_session_timeout>300</default_session_timeout>
    <merge_tree>
        <parts_to_delay_insert>300</parts_to_delay_insert>
        <parts_to_throw_insert>600</parts_to_throw_insert>
        <max_delay_to_insert>2</max_delay_to_insert>
    </merge_tree>
    <max_table_size_to_drop>0</max_table_size_to_drop>
    <max_partition_size_to_drop>0</max_partition_size_to_drop>
    <distributed_ddl>
        <!-- Path in ZooKeeper to queue with DDL queries -->
        <path>/clickhouse/task_queue/ddl</path>
    </distributed_ddl>
    <include_from>/data/clickhouse/node6/metrika.xml</include_from>
</yandex>
users.xml
<?xml version="1.0"?>
<yandex>
    <profiles>
        <default>
            <!-- Adjust to the actual memory of your machine -->
            <max_memory_usage>54975581388</max_memory_usage>
            <max_memory_usage_for_all_queries>61847529062</max_memory_usage_for_all_queries>
            <max_bytes_before_external_group_by>21474836480</max_bytes_before_external_group_by>
            <max_bytes_before_external_sort>21474836480</max_bytes_before_external_sort>
            <use_uncompressed_cache>0</use_uncompressed_cache>
            <load_balancing>random</load_balancing>
            <distributed_aggregation_memory_efficient>1</distributed_aggregation_memory_efficient>
            <max_threads>8</max_threads>
            <log_queries>1</log_queries>
        </default>
        <readonly>
            <max_threads>8</max_threads>
            <max_memory_usage>54975581388</max_memory_usage>
            <max_memory_usage_for_all_queries>61847529062</max_memory_usage_for_all_queries>
            <max_bytes_before_external_group_by>21474836480</max_bytes_before_external_group_by>
            <max_bytes_before_external_sort>21474836480</max_bytes_before_external_sort>
            <use_uncompressed_cache>0</use_uncompressed_cache>
            <load_balancing>random</load_balancing>
            <readonly>1</readonly>
            <distributed_aggregation_memory_efficient>1</distributed_aggregation_memory_efficient>
            <log_queries>1</log_queries>
        </readonly>
    </profiles>
    <quotas>
        <default>
            <interval>
                <duration>3600</duration>
                <queries>0</queries>
                <errors>0</errors>
                <result_rows>0</result_rows>
                <read_rows>0</read_rows>
                <execution_time>0</execution_time>
            </interval>
        </default>
    </quotas>
    <users>
        <default>
            <password_sha256_hex>328afba155145b9ab4dadc378a38f830107b1c48ae64a37690b163408b47c52c</password_sha256_hex>
            <access_management>1</access_management>
            <networks>
                <ip>::/0</ip>
            </networks>
            <profile>default</profile>
            <quota>default</quota>
        </default>
        <ch_ro>
            <password_sha256_hex>328afba155145b9ab4dadc378a38f830107b1c48ae64a37690b163408b47c52c</password_sha256_hex>
            <networks>
                <ip>::/0</ip>
            </networks>
            <profile>readonly</profile>
            <quota>default</quota>
        </ch_ro>
    </users>
</yandex>
metrika.xml
<?xml version="1.0"?>
<yandex>
    <!-- ClickHouse cluster nodes -->
    <clickhouse_remote_servers>
        <ch_cluster_all>
            <!-- Shard 1 -->
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>centos-1</host>
                    <port>9000</port>
                    <user>default</user>
                    <password>t5P9rh5M</password>
                </replica>
                <!-- Replica set 1 -->
                <replica>
                    <host>centos-3</host>
                    <port>9002</port>
                    <user>default</user>
                    <password>t5P9rh5M</password>
                </replica>
            </shard>
            <!-- Shard 2 -->
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>centos-2</host>
                    <port>9000</port>
                    <user>default</user>
                    <password>t5P9rh5M</password>
                </replica>
                <!-- Replica set 2 -->
                <replica>
                    <host>centos-1</host>
                    <port>9002</port>
                    <user>default</user>
                    <password>t5P9rh5M</password>
                </replica>
            </shard>
            <!-- Shard 3 -->
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>centos-3</host>
                    <port>9000</port>
                    <user>default</user>
                    <password>t5P9rh5M</password>
                </replica>
                <!-- Replica set 3 -->
                <replica>
                    <host>centos-2</host>
                    <port>9002</port>
                    <user>default</user>
                    <password>t5P9rh5M</password>
                </replica>
            </shard>
        </ch_cluster_all>
    </clickhouse_remote_servers>
    <!-- ZooKeeper settings -->
    <zookeeper-servers>
        <node index="1">
            <host>centos-1</host>
            <port>2181</port>
        </node>
        <node index="2">
            <host>centos-2</host>
            <port>2181</port>
        </node>
        <node index="3">
            <host>centos-3</host>
            <port>2181</port>
        </node>
    </zookeeper-servers>
    <macros>
        <layer>01</layer>
        <shard>01</shard>
        <!-- Shard number -->
        <replica>node6</replica>
        <!-- Replica name of this node -->
    </macros>
    <networks>
        <ip>::/0</ip>
    </networks>
    <!-- Compression settings -->
    <clickhouse_compression>
        <case>
            <min_part_size>10000000000</min_part_size>
            <min_part_size_ratio>0.01</min_part_size_ratio>
            <method>lz4</method>
            <!-- lz4 compresses faster than zstd but uses more disk space -->
        </case>
    </clickhouse_compression>
</yandex>

Generating user passwords

Password generation is documented at:

https://clickhouse.tech/docs/en/operations/settings/settings_users/

Run the following in a shell:

PASSWORD=$(base64 < /dev/urandom | head -c8); echo "$PASSWORD"; echo -n "$PASSWORD" | sha256sum | tr -d '-' 

Output:

t5P9rh5M (the plaintext password)
328afba155145b9ab4dadc378a38f830107b1c48ae64a37690b163408b47c52c (its SHA-256 hash, i.e. the password_sha256_hex value)

Changing ownership

cd /data && chown -R clickhouse:clickhouse clickhouse

Process supervision

The instances are managed with systemd.

Using node1 as an example:

# vim /etc/systemd/system/clickhouse_node1.service
 
 
[Unit]
Description=ClickHouse Server (analytic DBMS for big data)
Requires=network-online.target
After=network-online.target
 
[Service]
#Type=simple
Type=forking
User=clickhouse
Group=clickhouse
Restart=always
RestartSec=30
RuntimeDirectory=clickhouse-server
#ExecStart=/usr/bin/clickhouse-server --config=/etc/clickhouse-server/config.xml --pid-file=/run/clickhouse-server/clickhouse-server.pid
ExecStart=/usr/bin/clickhouse-server --daemon --config=/data/clickhouse/node1/config.xml --pid-file=/data/clickhouse/node1/clickhouse-server.pid
#PIDFile=/data/clickhouse/node1/clickhouse-server.pid
LimitCORE=infinity
LimitNOFILE=500000
CapabilityBoundingSet=CAP_NET_ADMIN CAP_IPC_LOCK CAP_SYS_NICE
 
[Install]
WantedBy=multi-user.target

Use the unit above as a template to create the systemd unit files for node1 through node6 (see the sketch below).
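
A minimal sketch with sed, assuming the node1 unit above is the template and each instance's config.xml lives under its own /data/clickhouse/nodeN/ directory:
# on centos-1; use "2 5" on centos-2 and "3 6" on centos-3
for n in 1 4; do
  sed "s/node1/node${n}/g" /etc/systemd/system/clickhouse_node1.service \
    > /etc/systemd/system/clickhouse_node${n}.service
done
systemctl daemon-reload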

Starting the ClickHouse services

On centos-1:
systemctl start clickhouse_node1.service
systemctl start clickhouse_node4.service
 
 
On centos-2:
systemctl start clickhouse_node2.service
systemctl start clickhouse_node5.service
 
 
On centos-3:
systemctl start clickhouse_node3.service
systemctl start clickhouse_node6.service
 
 
Verify that the following ports are listening:
netstat -anlp|grep 9000    (ClickHouse TCP port)
netstat -anlp|grep 9002    (ClickHouse TCP port)
netstat -anlp|grep 8123    (ClickHouse HTTP port)
netstat -anlp|grep 8124    (ClickHouse HTTP port)
netstat -anlp|grep 9009    (ClickHouse interserver replication port)
netstat -anlp|grep 9010    (ClickHouse interserver replication port)
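
The HTTP interface can also be probed directly; a running ClickHouse answers /ping with "Ok." (hostnames per the layout above):
curl http://centos-1:8123/ping    # node1
curl http://centos-1:8124/ping    # node4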

 
 
Logging in:
clickhouse-client -u default --password xxxxxx --port 9000 -hcentos-1 --query="show databases"
Verify the cluster status:
[root@vm1 ~]# clickhouse-client -u default --password t5P9rh5M --port 9000 -hcentos-1 --query="select cluster,shard_num,replica_num,host_name,port,user from system.clusters"
ch_cluster_all	1	1	centos-1	9000	default
ch_cluster_all	1	2	centos-3	9002	default
ch_cluster_all	2	1	centos-2	9000	default
ch_cluster_all	2	2	centos-1	9002	default
ch_cluster_all	3	1	centos-3	9000	default
ch_cluster_all	3	2	centos-2	9002	default
Next, look at each instance's shard information (the macro variables). Run the following on each machine (node1 shown here):
vm1 :) select * from system.macros;

SELECT *
FROM system.macros

Query id: dfe3400f-cbce-4657-9ed1-a4250b58748c

┌─macro───┬─substitution─┐
│ layer   │ 01           │
│ replica │ node1        │
│ shard   │ 01           │
└─────────┴──────────────┘

3 rows in set. Elapsed: 0.003 sec
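
To check all six instances in one pass, a small loop works (hostName() merely labels each answer):
for h in centos-1 centos-2 centos-3; do
  for p in 9000 9002; do
    clickhouse-client -u default --password t5P9rh5M -h $h --port $p \
      --query="select hostName() h, macro, substitution from system.macros"
  done
done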

Testing that the cluster works

Create the local table

-- create the local table t_local
CREATE TABLE t_local on cluster  ch_cluster_all 
(
    EventDate DateTime,
    CounterID UInt32,
    UserID UInt32
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{layer}-{shard}/t_local', '{replica}')
PARTITION BY toYYYYMM(EventDate)
ORDER BY (CounterID, EventDate, intHash32(UserID))
SAMPLE BY intHash32(UserID);
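
Once the DDL completes, replica health can be checked from any instance via system.replicas; with the 3-shard/2-replica layout above, each instance hosting t_local should report two total and two active replicas:
SELECT database, table, is_leader, total_replicas, active_replicas
FROM system.replicas
WHERE table = 't_local';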

Create the distributed table

-- create the distributed table t_all
CREATE TABLE t_all on cluster ch_cluster_all
(
    EventDate DateTime,
    CounterID UInt32,
    UserID UInt32
)
ENGINE = Distributed(ch_cluster_all, default, t_local, CounterID);

Distributed engine parameters:
A distributed table is conventionally suffixed with _all.
The first parameter is the cluster name.
The second parameter is the database name.
The third parameter is the underlying local table.
The fourth parameter (optional) is the sharding key: during writes, the distributed table routes rows to each node's local table according to this key.

Verifying sharding and replication

Insert some test data:
INSERT INTO t_all VALUES ('2019-01-16 00:00:00', 1, 1),('2019-02-10 00:00:00',2, 2),('2019-03-10 00:00:00',3, 3),('2021-11-11 00:00:00',4, 4)
[root@vm1 ~]# clickhouse-client -u default --password t5P9rh5M --port 9000 -hcentos-1 --query="select * from default.t_all"
2019-03-10 00:00:00     3       3
2019-02-10 00:00:00     2       2
2021-11-11 00:00:00     4       4
2019-01-16 00:00:00     1       1

[root@vm1 ~]# clickhouse-client -u default --password t5P9rh5M --port 9000 -hcentos-1 --query="select * from default.t_local"
2019-03-10 00:00:00     3       3

[root@vm1 ~]# clickhouse-client -u default --password t5P9rh5M --port 9000 -hcentos-2 --query="select * from default.t_local"
2021-11-11 00:00:00     4       4
2019-01-16 00:00:00     1       1

[root@vm1 ~]# clickhouse-client -u default --password t5P9rh5M --port 9000 -hcentos-3 --query="select * from default.t_local"
2019-02-10 00:00:00     2       2

[root@vm1 ~]# clickhouse-client -u default --password t5P9rh5M --port 9002 -hcentos-1 --query="select * from default.t_local"
2021-11-11 00:00:00     4       4
2019-01-16 00:00:00     1       1

[root@vm1 ~]# clickhouse-client -u default --password t5P9rh5M --port 9002 -hcentos-2 --query="select * from default.t_local"
2019-02-10 00:00:00     2       2

[root@vm1 ~]# clickhouse-client -u default --password t5P9rh5M --port 9002 -hcentos-3 --query="select * from default.t_local"
2019-03-10 00:00:00     3       3

The queries above show that the inserted rows have been sharded and replicated: the cluster works as intended.

Passwordless login, MySQL-style

The MySQL CLI can log in without typing credentials when the following is added to my.cnf:

[client]
user=xx
password=xxx

clickhouse-client can do the same.

According to the official documentation, just add the following to the client config file:

# vim ~/.clickhouse-client/config.xml 

<config>
    <user>username</user>
    <password>password</password>
    <secure>False</secure>
</config>
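
With that file in place, the client connects without prompting, e.g.:
clickhouse-client -h centos-1 --port 9000 --query="select currentUser()"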

Distributed DDL operations

By default, CREATE, DROP, ALTER, and RENAME only take effect on the server where the statement is executed. In a cluster you can add the ON CLUSTER clause so the statement takes effect across the whole cluster.

A distributed table is only a query engine; it stores no data itself. At query time it forwards the SQL to every shard in the cluster, then processes and aggregates the partial results before returning them to the client. Because of this, ClickHouse requires the aggregated result to fit in the memory of the node hosting the distributed table, which under normal conditions it does. The distributed table can be created on every instance or only on some of them, matching how your application queries the cluster; creating it on several instances is recommended, so that when one node goes down the table can still be queried on the others. Its DDL looks like this:

A distributed table is usually named after its local table with an _all suffix. The first engine argument is the cluster tag name defined in the metrika.xml configuration above (ch_cluster_all in this article), and a trailing rand() sharding key would scatter inserted rows randomly across the replicas. In production, it is advisable to have clients bucket the data themselves and insert into the local tables at random, and to query through the distributed table only: the distributed table is read-only, the local replicated tables are write-only.
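
As an illustration of the rand() variant (the name t_all_rand is made up for this sketch):

-- same schema as t_all, but rows are routed to a random shard on insert
CREATE TABLE t_all_rand on cluster ch_cluster_all
(
    EventDate DateTime,
    CounterID UInt32,
    UserID UInt32
)
ENGINE = Distributed(ch_cluster_all, default, t_local, rand());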

For example, create a distributed table:

-- create the distributed table t_all
CREATE TABLE t_all on cluster ch_cluster_all
(
    EventDate DateTime,
    CounterID UInt32,
    UserID UInt32
)
ENGINE = Distributed(ch_cluster_all, default, t_local, CounterID);

The general form of the Distributed table engine is shown below (ClickHouse table engines will be covered in detail in a later article):

Distributed(cluster_name, database_name, table_name[, sharding_key])

The parameters mean the following:

  • cluster_name: the cluster name, matching the custom name in the cluster configuration.
  • database_name: the database name.
  • table_name: the table name.
  • sharding_key: optional; the key used for sharding. During writes, the distributed table routes rows to the local tables on each node according to this key.

Tips:

Distributed tables use a check-at-read mechanism, so there is no strict ordering requirement between creating the distributed table and the local tables.

Equally worth noting, the statement above uses ON CLUSTER distributed DDL, which creates a Distributed table on every shard node of the cluster, so read and write requests against all shards can be initiated from any one of them.

Once the distributed table above has been created, you will see the new table on every machine.

Next, create the local tables, using the macro variables:

-- create the local table t_local
CREATE TABLE t_local on cluster  ch_cluster_all 
(
    EventDate DateTime,
    CounterID UInt32,
    UserID UInt32
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{layer}-{shard}/t_local', '{replica}')
PARTITION BY toYYYYMM(EventDate)
ORDER BY (CounterID, EventDate, intHash32(UserID))
SAMPLE BY intHash32(UserID);

ReplicatedMergeTree parameters

  • zoo_path: the table's path in ZooKeeper; in this example, /clickhouse/tables/{layer}-{shard}/t_local.
  • replica_name: the table's replica name in ZooKeeper; in this example, {replica}.
    Note: {layer}, {shard}, and {replica} are all variables configured in each node's metrika.xml.
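
On node1, for example, the macros expand the path to /clickhouse/tables/01-01/t_local and the replica name to node1. The registration can be inspected through the system.zookeeper table (it requires a path predicate); given the layout above, shard 01 should list the replicas node1 and node6:
SELECT name FROM system.zookeeper
WHERE path = '/clickhouse/tables/01-01/t_local/replicas';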

FAQ

Should you write to the local tables or to the distributed table?

    If your expected data volume is small (under about a million new rows per day), writing to either the distributed table or the local tables is fine. If you write to the local tables, make sure every write opens a new connection and each connection writes roughly the same amount of data.

    If the volume is large (more than a million rows per day, with insert concurrency above 10), write to the local tables.

    Insert about 500k rows per batch, and never more than 1M. In general, ClickHouse is not like MySQL, which prefers small transactions: for 10M rows, MySQL favors roughly 1000 small transactions of about 10k rows each, while ClickHouse favors about 20 inserts of 500k rows each. This is dictated by how the MergeTree engine works: frequent small inserts create too many data parts for the background merges to keep up with.

    Also, AP is not TP. TP systems use long-lived connections and connection pools to avoid the overhead of establishing new connections; AP workloads do not need these optimizations, because analytics by nature has no high-concurrency small SQL.
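
As a concrete sketch of large-batch loading (the CSV file name is hypothetical):
# one big INSERT per file instead of many small ones
clickhouse-client -u default --password t5P9rh5M -h centos-1 --port 9000 \
  --query="INSERT INTO default.t_local FORMAT CSV" < /tmp/batch_500k.csv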