ClickHouse Distributed Storage and Backup Configuration

百战天王

Last modified 2022-04-21 17:31:36

Category: clickhouse. Tags: distributed

First published 2021-04-09 10:58:27

Article link: https://blog.csdn.net/wzp1986/article/details/115541317

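Contents

Defining shards and replicas in the configuration file
Creating the primary and replica tables
Creating the distributed table
Summary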

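Defining shards and replicas in the configuration file

Assume a cluster of two nodes that act as each other's backup. On every node, the primary replica of a shard lives in the primary database and the secondary replica of the other shard lives in the replica database.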

<yandex>
    <timezone>Asia/Shanghai</timezone>
    <interserver_http_host>10.0.1.1</interserver_http_host>
    <zookeeper>
        <node index="1">
            <host>10.0.0.1</host>
            <port>2181</port>
        </node>
        <node index="2">
            <host>10.0.0.2</host>
            <port>2181</port>
        </node>
        <node index="3">
            <host>10.0.0.3</host>
            <port>2181</port>
        </node>
    </zookeeper>
    <remote_servers>
        <default_cluster>
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <default_database>primary</default_database>
                    <host>10.0.1.1</host>
                    <port>9000</port>
                </replica>               
                <replica>
                    <default_database>replica</default_database>
                    <host>10.0.1.2</host>
                    <port>9000</port>
                </replica>
            </shard>
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <default_database>primary</default_database>
                    <host>10.0.1.2</host>
                    <port>9000</port>
                </replica>
                <replica>
                    <default_database>replica</default_database>
                    <host>10.0.1.1</host>
                    <port>9000</port>
                </replica>
            </shard>
        </default_cluster>
    </remote_servers>
    <macros>
        <!-- Replica name used by this node, plus which shard it holds the primary replica of and which shard it holds the secondary replica of -->
        <replica_name>10.0.1.1</replica_name>
        <primary_shard>01</primary_shard>
        <replica_shard>02</replica_shard>
    </macros>
    <merge_tree>
        <!-- For more settings, see: https://github.com/ClickHouse/ClickHouse/blob/master/src/Storages/MergeTree/MergeTreeSettings.h -->
        <replicated_deduplication_window>0</replicated_deduplication_window>
    </merge_tree>
</yandex>

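replicated_deduplication_window controls how many recent data blocks have their hash values kept in ZooKeeper. When it is set to 0, blocks are no longer checked for duplicates during primary-to-secondary replication.

The merge_tree section sets server-wide defaults and should be in place when the server is initialized. To use a different value for a particular table afterwards, specify it in that table's own SETTINGS clause at creation time: https://clickhouse.tech/docs/en/operations/settings/merge-tree-settings/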
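
As a rough sketch of such a table-level override, the statement below sets replicated_deduplication_window for one table only. The table name t_event_dedup, the three-column schema and the value 100 are assumptions made for illustration; they are not part of the original setup.

```sql
-- Hypothetical table: overrides the server-wide merge_tree default for this table only.
CREATE TABLE primary.t_event_dedup (
    event_id    String,
    event_time  DateTime64(3),
    event_code  String
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{primary_shard}/t_event_dedup', '{replica_name}')
PARTITION BY toYYYYMMDD(event_time)
ORDER BY (event_id, event_code)
-- Illustrative value: keep hashes for the 100 most recent inserted blocks.
SETTINGS replicated_deduplication_window = 100;
```

Other tables keep the value from the merge_tree section of the configuration file.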

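Creating the primary and replica tables

The CREATE TABLE statements use macros variables, so the same statements can be run unchanged on every node; each node then records its replication state under a different znode in ZooKeeper.
primary_shard says which shard this node's primary database accepts writes for;
replica_shard says which shard this node's replica database holds the backup of.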

-- Primary replica table, in the primary database
CREATE TABLE primary.t_event (
    event_id                String,
    event_time              DateTime64(3),
    event_code              String,
    ...
  )
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{primary_shard}/t_event', '{replica_name}')
PARTITION BY toYYYYMMDD(event_time)
ORDER BY (event_id, event_code);
 
-- Secondary replica table, in the replica database
CREATE TABLE replica.t_event (
    event_id                String,
    event_time              DateTime64(3),
    event_code              String,
    ...
  )
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{replica_shard}/t_event', '{replica_name}')
PARTITION BY toYYYYMMDD(event_time)
ORDER BY (event_id, event_code);

macros on node 1:

    <macros>
        <!-- Node 1 takes writes for shard 01 and holds the backup of shard 02 -->
        <replica_name>10.0.1.1</replica_name>
        <primary_shard>01</primary_shard>
        <replica_shard>02</replica_shard>
    </macros>

macros on node 2:

    <macros>
        <!-- Node 2 takes writes for shard 02 and holds the backup of shard 01 -->
        <replica_name>10.0.1.2</replica_name>
        <primary_shard>02</primary_shard>
        <replica_shard>01</replica_shard>
    </macros>
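
To check how the macros were expanded on a given node, the system tables can be queried along these lines. This is a sketch that assumes both t_event tables have already been created on the node.

```sql
-- Macros this node loaded from its configuration file.
SELECT macro, substitution
FROM system.macros;

-- ZooKeeper path and replica name that each replicated table ended up with.
SELECT database, table, zookeeper_path, replica_name
FROM system.replicas
WHERE table = 't_event';
```

With the macros above, node 1 should report /clickhouse/tables/01/t_event for primary.t_event and /clickhouse/tables/02/t_event for replica.t_event, and node 2 the reverse.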

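Creating the distributed table

Leave the second argument of the Distributed engine empty; when it accesses a replica, it then uses the database named in that replica's default_database setting. The fourth argument is the sharding key; rand() means rows are spread across the shards at random.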

CREATE TABLE default.t_event AS primary.t_event ENGINE = Distributed(default_cluster, '', t_event, rand())
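
Because the empty database argument relies on each replica's default_database, it can help to confirm what the server actually loaded from remote_servers. A minimal check, assuming the cluster name default_cluster from the configuration above:

```sql
-- One row per replica; default_database should read primary or replica
-- exactly as configured for each host.
SELECT cluster, shard_num, replica_num, host_address, port, default_database
FROM system.clusters
WHERE cluster = 'default_cluster'
ORDER BY shard_num, replica_num;
```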

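For queries and aggregations, reading default.t_event on any node returns data from the whole cluster; the Distributed engine dispatches the query to the nodes that hold each shard.
For writes, insert into each node's primary.t_event table, and the ReplicatedMergeTree engine takes care of primary-to-secondary replication. Java programs use BalancedClickhouseDataSource to connect to several nodes and write to one of them at random.
When one node goes down, queries still return all the data, and all new data is written to the primary table of the surviving shard.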
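
The read and write paths described above might look like this in practice. The row values are made up, and the INSERT assumes the columns elided from the table definition can take their default values.

```sql
-- Write path: run on one node against its local primary table.
INSERT INTO primary.t_event (event_id, event_time, event_code)
VALUES ('e-0001', '2021-04-09 10:58:27.000', 'LOGIN');

-- Read path: run on any node; the Distributed table queries one replica of each shard.
SELECT event_code, count() AS events
FROM default.t_event
GROUP BY event_code
ORDER BY events DESC;
```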

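Summary

1. All three CREATE TABLE statements above must be run on every node.
2. The Distributed engine of the distributed table only handles distributed queries; the internal_replication mechanism, which is meant to guarantee data redundancy, is not actually used.
3. Data replication and redundancy are actually handled by the ReplicatedMergeTree engine, which copies the data received by the primary database to the replica database on the other node.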