This document analyzes the problem of ClickHouse tables going readonly or losing their ZooKeeper (zk) metadata, and presents corresponding solutions. Drawing on a close study of the problem and on practical operational experience, it summarizes effective methods for handling these situations so that ClickHouse tables keep running normally and data remains safe.
ClickHouse is a high-performance distributed columnar database that excels at processing large volumes of data. In practice, however, tables can enter a readonly state or their ZooKeeper metadata can be lost, which disrupts normal operation and data consistency. Understanding and resolving these problems is therefore essential for keeping a ClickHouse deployment stable and reliable.
ClickHouse 22.x readonly table recovery procedure:
The ClickHouse version must be newer than 21.7 (the script below relies on SYSTEM RESTORE REPLICA).
# When tables are readonly, manually delete the ZooKeeper or ClickHouse Keeper metadata first, then run the script (if many tables are readonly, you can delete data/app/clickhouse/coordination directly)
For older versions, see https://kb.altinity.com/altinity-kb-setup-and-maintenance/altinity-kb-zookeeper/altinity-kb-recovering-from-complete-metadata-loss-in-zookeeper/
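Before deleting any Keeper metadata, it is worth confirming how many replicas are actually readonly. A minimal sketch, using the same credentials and port as the recovery script and leaving the actual client call commented out as a dry run:

```shell
# Query used to check for readonly replicas on a node; run it against each
# host before any destructive step on the coordination directory.
check_query="SELECT count() FROM system.replicas WHERE is_readonly"
echo "checking: $check_query"
# clickhouse-client -u default --password 123456 --port 18101 -h 192.168.0.110 --query "$check_query"
```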
#!/bin/bash
# IP addresses and ports of the ClickHouse nodes
hosts=("192.168.0.110" "192.168.0.111" "192.168.0.112")
ports=("18101")

for host in "${hosts[@]}"; do
  for port in "${ports[@]}"; do
    query_result=$(clickhouse-client -u default --password 123456 --port "$port" -h "$host" --query "SELECT database, table FROM system.replicas WHERE is_readonly;")
    # No readonly tables on this node, skip it
    if [ -z "$query_result" ]; then
      echo "$host:$port has no readonly tables, skipping"
      continue
    fi
    # Process the result line by line; clickhouse-client output is tab-separated
    while IFS=$'\t' read -r database table; do
      # Build the recovery statements for this table
      new_sql=("SYSTEM RESTART REPLICA $database.$table;" "SYSTEM RESTORE REPLICA $database.$table;")
      # Run each statement and print its result
      for sql in "${new_sql[@]}"; do
        echo "Results for $database.$table:"
        clickhouse-client -u default --password 123456 --port "$port" -h "$host" --query "$sql"
        echo ""
      done
    done <<< "$query_result"
  done
done
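If only one table is affected, the loop above reduces to a two-statement manual repair. A minimal sketch: the database and table names (`mydb.events`) are placeholders, and the clickhouse-client call is left commented out as a dry run:

```shell
#!/bin/bash
# Hypothetical names -- substitute your own readonly table here.
database="mydb"
table="events"
# RESTART re-initializes the replica's ZooKeeper session; RESTORE recreates
# the lost metadata and attaches the local parts.
new_sql=("SYSTEM RESTART REPLICA ${database}.${table};" \
         "SYSTEM RESTORE REPLICA ${database}.${table};")
for sql in "${new_sql[@]}"; do
  echo "would run: $sql"
  # clickhouse-client -u default --password 123456 --port 18101 -h 127.0.0.1 --query "$sql"
done
```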
ClickHouse 20.x batch script:
# ClickHouse working directory
CLICKHOUSE_WORKING_FOLDER=/data/app/clickhouse
# ClickHouse data directory
CLICKHOUSE_DATA_FOLDER=/data/appData/clickhouse
# should be same disk as CLICKHOUSE_WORKING_FOLDER! (otherwise we can't use hardlinks)
# recovery toolset directory
CLICKHOUSE_TOOLSET_FOLDER=/data/clickhouse-zookeeper-recovery-master
# OS user that runs ClickHouse
CLICKHOUSE_WORKING_USER=commonuser
# backup directory
BACKUP_FOLDER="${CLICKHOUSE_TOOLSET_FOLDER}/backup2024-01-02"
# clickhouse-client command
# if you need some adjustments - like username/password/port/listened host or some parameter - adjust it here.
CLICKHOUSE_CLIENT="${CLICKHOUSE_WORKING_FOLDER}/bin/clickhouse-client -h 127.0.0.1 -u default --port 18101 --password Rootmaster@777 --max_query_size=10000000"
# clickhouse-extract-from-config command
CLICKHOUSE_EXTRACT_FROM_CONFIG="${CLICKHOUSE_WORKING_FOLDER}/bin/clickhouse-extract-from-config --config-file /data/app/clickhouse/etc/clickhouse-server/config.xml"
# for replicated tables we should use data only
# from single replica (others will replicate)
# otherwise we will have duplicated data
# if the last character of the hostname is 1, we are on the master replica.
HOSTNAME_SHORT=$(hostname -s)
MASTER_REPLICA=$( [ "${HOSTNAME_SHORT: -1}" == "1" ] && echo 'true' || echo 'false' )
### TODO: expose settings above via command-line args
### those normally should not be changed
# table metadata directory
METADATA_FOLDER="${CLICKHOUSE_DATA_FOLDER}/metadata"
# table data directory
DATA_FOLDER="${CLICKHOUSE_DATA_FOLDER}/data"
# backup of the table metadata directory
BACKUP_METADATA_FOLDER="${BACKUP_FOLDER}/metadata"
# backup of the table data directory
BACKUP_DATA_FOLDER="${BACKUP_FOLDER}/data"
# we do mv instead of rm -rf (just in case), that folder is used as trashbin
TRASHBIN_FOLDER="${CLICKHOUSE_TOOLSET_FOLDER}/trashbin_$(date +%Y%m%d_%H%M%S)"
# we will put some tmp files there
TMP_FOLDER="${CLICKHOUSE_TOOLSET_FOLDER}/tmp"
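The master-replica convention in the configuration above (the replica whose short hostname ends in "1" supplies the data; the others re-replicate from it) can be factored into a small function. A minimal sketch; the function name `is_master_replica` is ours, not part of the toolset:

```shell
# Returns "true" when the given short hostname ends in "1" (the replica
# chosen as the data source for replicated tables), "false" otherwise.
is_master_replica() {
  local short_hostname="$1"
  # ${var: -1} extracts the last character (note the space before -1)
  [ "${short_hostname: -1}" == "1" ] && echo 'true' || echo 'false'
}

is_master_replica "ck-node1"   # prints: true
is_master_replica "ck-node2"   # prints: false
```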
# Command sequence
cd /data/clickhouse-zookeeper-recovery-master
# create a local backup
./toolset.sh create_local_backup
# reset the node
./toolset.sh reset_node
# show status
./toolset.sh show_status
# recover non-replicated tables
./toolset.sh recover_non_replicated
# refill replicated tables
./toolset.sh refill_replicated_tables
# recreate Kafka tables
./toolset.sh recreate_kafka_tables
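The six steps above can be chained so the run aborts on the first failure instead of silently continuing to the next step. A minimal sketch, with the actual toolset call left commented out as a dry run:

```shell
#!/bin/bash
set -e   # abort on the first failing step
# cd /data/clickhouse-zookeeper-recovery-master   # run from the toolset directory
steps=(create_local_backup reset_node show_status \
       recover_non_replicated refill_replicated_tables recreate_kafka_tables)
for step in "${steps[@]}"; do
  echo "running: ./toolset.sh $step"
  # ./toolset.sh "$step"   # uncomment on the node being recovered
done
```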