ClickHouse备份和恢复
clickhouse-backup
ClickHouse备份工具clickhouse-backup
github地址:https://github.com/AlexAkulov/clickhouse-backup/
centos x86_64可以直接下载
如果网络无法访问github也可以从我的网盘下载:
https://url50.ctfile.com/d/35034150-46598783-edcbc5?p=8173 (访问密码:8173)
安装:
[root@localhost installs]# rpm -ivh clickhouse-backup-2.1.2-1.x86_64.rpm
Verifying... ################################# [100%]
Preparing... ################################# [100%]
Updating / installing...
1:clickhouse-backup-2.1.2-1 ################################# [100%]
[root@localhost installs]#
复制配置文件
[root@localhost installs]# cp /etc/clickhouse-backup/config.yml.example /etc/clickhouse-backup/config.yml
[root@localhost installs]# vim /etc/clickhouse-backup/config.yml
general:
remote_storage: none
max_file_size: 0
disable_progress_bar: true
backups_to_keep_local: 7 # 本地备份保留的时长 默认为 0 永久保存
......
clickhouse:
username: default
password: ""
host: localhost
port: 9000
disk_mapping: {}
skip_tables: # 这里是不需要备份的库
- system.*
- INFORMATION_SCHEMA.*
- information_schema.*
- _temporary_and_external_tables.*
- your_databases.*
创建备份
查看帮助
[root@localhost ~]# clickhouse-backup --help
NAME:
clickhouse-backup - Tool for easy backup of ClickHouse with cloud support
USAGE:
clickhouse-backup <command> [-t, --tables=<db>.<table>] <backup_name>
VERSION:
2.1.2
DESCRIPTION:
Run as 'root' or 'clickhouse' user
COMMANDS:
tables List list of tables, exclude skip_tables
create Create new backup
create_remote Create and upload
upload Upload backup to remote storage
list List list of backups
download Download backup from remote storage
restore Create schema and restore data from backup
restore_remote Download and restore
delete Delete specific backup
default-config List default config
print-config List current config
clean Remove data in 'shadow' folder from all `path` folders available from `system.disks`
clean_remote_broken Remove all broken remote backups
watch Run infinite loop which create full + incremental backup sequence to allow efficient backup sequences
server Run API server
help, h Shows a list of commands or help for one command
GLOBAL OPTIONS:
--config FILE, -c FILE Config FILE name. (default: "/etc/clickhouse-backup/config.yml") [$CLICKHOUSE_BACKUP_CONFIG]
--help, -h show help
--version, -v print the version
查看需要备份的库和表
[root@localhost ~]# clickhouse-backup tables
2022/11/24 11:16:36.201640 info clickhouse connection prepared: tcp://localhost:9000 run ping logger=clickhouse
2022/11/24 11:16:36.202915 info clickhouse connection open: tcp://localhost:9000 logger=clickhouse
2022/11/24 11:16:36.202940 info SELECT count() FROM system.settings WHERE name = 'show_table_uuid_in_table_create_query_if_not_nil' logger=clickhouse
2022/11/24 11:16:36.205906 info SELECT name FROM system.databases WHERE engine IN ('MySQL','PostgreSQL') logger=clickhouse
2022/11/24 11:16:36.207618 info SELECT countIf(name='data_path') is_data_path_present, countIf(name='data_paths') is_data_paths_present, countIf(name='uuid') is_uuid_present, countIf(name='create_table_query') is_create_table_query_present, countIf(name='total_bytes') is_total_bytes_present FROM system.columns WHERE database='system' AND table='tables' logger=clickhouse
2022/11/24 11:16:36.210662 info SELECT database, name, engine , data_paths , uuid , create_table_query , coalesce(total_bytes, 0) AS total_bytes FROM system.tables WHERE is_temporary = 0 SETTINGS show_table_uuid_in_table_create_query_if_not_nil=1 logger=clickhouse
2022/11/24 11:16:36.219075 info SELECT sum(bytes_on_disk) as size FROM system.parts WHERE active AND database='test_database' AND table='user' GROUP BY database, table logger=clickhouse
2022/11/24 11:16:36.221559 info SELECT value FROM `system`.`build_options` where name='VERSION_INTEGER' logger=clickhouse
2022/11/24 11:16:36.223326 info SELECT * FROM system.disks; logger=clickhouse
AA.test01 24.76MiB default
AA.test02 33.75MiB default
test_database.user 0B default
2022/11/24 11:16:36.224745 info clickhouse connection closed logger=clickhouse
创建备份
[root@localhost ~]# clickhouse-backup create
2022/11/24 11:31:17.140176 info clickhouse connection prepared: tcp://localhost:9000 run ping logger=clickhouse
2022/11/24 11:31:17.142176 info clickhouse connection open: tcp://localhost:9000 logger=clickhouse
2022/11/24 11:31:17.142198 info SELECT name, engine FROM system.databases WHERE name NOT IN ('system', 'INFORMATION_SCHEMA', 'information_schema', '_temporary_and_external_tables') logger=clickhouse
2022/11/24 11:31:17.146616 info SHOW CREATE DATABASE `AA` logger=clickhouse
2022/11/24 11:31:17.149589 info SHOW CREATE DATABASE `default` logger=clickhouse
2022/11/24 11:31:17.152777 info SHOW CREATE DATABASE `test_database` logger=clickhouse
2022/11/24 11:31:17.155632 info SELECT count() FROM system.settings WHERE name = 'show_table_uuid_in_table_create_query_if_not_nil' logger=clickhouse
2022/11/24 11:31:17.157411 info SELECT name FROM system.databases WHERE engine IN ('MySQL','PostgreSQL') logger=clickhouse
2022/11/24 11:31:17.160184 info SELECT countIf(name='data_path') is_data_path_present, countIf(name='data_paths') is_data_paths_present, countIf(name='uuid') is_uuid_present, countIf(name='create_table_query') is_create_table_query_present, countIf(name='total_bytes') is_total_bytes_present FROM system.columns WHERE database='system' AND table='tables' logger=clickhouse
2022/11/24 11:31:17.164619 info SELECT database, name, engine , data_paths , uuid , create_table_query , coalesce(total_bytes, 0) AS total_bytes FROM system.tables WHERE is_temporary = 0 SETTINGS show_table_uuid_in_table_create_query_if_not_nil=1 logger=clickhouse
2022/11/24 11:31:17.177065 info SELECT sum(bytes_on_disk) as size FROM system.parts WHERE active AND database='test_database' AND table='user' GROUP BY database, table logger=clickhouse
2022/11/24 11:31:17.178942 info SELECT toUInt8(count()) udf_presents FROM system.columns WHERE database='system' AND table='functions' AND name='create_query' logger=clickhouse
2022/11/24 11:31:17.182630 info SELECT name, create_query FROM system.functions WHERE create_query!='' logger=clickhouse
2022/11/24 11:31:17.185649 info SELECT value FROM `system`.`build_options` where name='VERSION_INTEGER' logger=clickhouse
2022/11/24 11:31:17.187439 info SELECT * FROM system.disks; logger=clickhouse
2022/11/24 11:31:17.190192 info ALTER TABLE `AA`.`test01` FREEZE WITH NAME '85a9651d0c3b4b56b0c99ecebba0fd8b'; logger=clickhouse
2022/11/24 11:31:17.195510 info done backup=2022-11-24T03-31-17 logger=backuper operation=create table=AA.test01
2022/11/24 11:31:17.195561 info ALTER TABLE `AA`.`test02` FREEZE WITH NAME '16fb2695377b4807bafea4788ca75e3c'; logger=clickhouse
2022/11/24 11:31:17.207155 info done backup=2022-11-24T03-31-17 logger=backuper operation=create table=AA.test02
2022/11/24 11:31:17.207299 info ALTER TABLE `test_database`.`user` FREEZE WITH NAME '5fc753558db24f9682f3846579047940'; logger=clickhouse
2022/11/24 11:31:17.212913 info done backup=2022-11-24T03-31-17 logger=backuper operation=create table=test_database.user
2022/11/24 11:31:17.212949 info SELECT value FROM `system`.`build_options` where name='VERSION_DESCRIBE' logger=clickhouse
2022/11/24 11:31:17.216027 info done backup=2022-11-24T03-31-17 duration=76ms logger=backuper operation=create
2022/11/24 11:31:17.216231 info clickhouse connection closed logger=clickhouse
列出现有的备份
[root@localhost ~]# clickhouse-backup list
2022/11/24 11:32:07.135670 info clickhouse connection prepared: tcp://localhost:9000 run ping logger=clickhouse
2022/11/24 11:32:07.137734 info clickhouse connection open: tcp://localhost:9000 logger=clickhouse
2022/11/24 11:32:07.137755 info SELECT value FROM `system`.`build_options` where name='VERSION_INTEGER' logger=clickhouse
2022/11/24 11:32:07.140801 info SELECT * FROM system.disks; logger=clickhouse
2022-11-23T09-59-08 33.76MiB 23/11/2022 09:59:08 local
2022-11-23T10-11-00 58.53MiB 23/11/2022 10:11:00 local
2022-11-24T03-31-17 58.53MiB 24/11/2022 03:31:17 local
2022/11/24 11:32:07.142366 info clickhouse connection closed logger=clickhouse
可以看到一个名为2022-11-24T03-31-17
的备份。此备份存储路径为/var/lib/clickhouse/backup
,备份名称默认为当前时间戳,但是可以通过-name指定备份名称。备份包含两个目录:metadata目录和shadow目录,前者包含重新创建表架构所需的DDL语句,后者包含alter table ... freeze
操作结果的数据。
恢复数据
模拟删除备份过的表:
[root@localhost ~]# clickhouse-client
ClickHouse client version 21.11.5.33 (official build).
Connecting to localhost:9000 as user default.
Connected to ClickHouse server version 21.11.5 revision 54450.
localhost :) use AA
USE AA
Query id: 2f00131b-fd76-49a3-91c1-560a4a0d4490
Ok.
0 rows in set. Elapsed: 0.000 sec.
localhost :) show tables;
SHOW TABLES
Query id: 4b0d78a5-e12a-43c6-8ca3-be58f4c0bc35
┌─name───┐
│ test01 │
│ test02 │
└────────┘
2 rows in set. Elapsed: 0.001 sec.
localhost :) drop table test01
DROP TABLE test01
Query id: dec5c657-af69-4986-85b6-5a717ef35604
Ok.
0 rows in set. Elapsed: 0.000 sec.
localhost :) drop table test02
DROP TABLE test02
Query id: f510d3a9-f635-4646-8c4a-9179ffdb8995
Ok.
0 rows in set. Elapsed: 0.000 sec.
[root@localhost ~]# clickhouse-backup restore 2022-11-24T03-31-17
2022/11/24 11:42:17.863344 info clickhouse connection prepared: tcp://localhost:9000 run ping logger=clickhouse
2022/11/24 11:42:17.865137 info clickhouse connection open: tcp://localhost:9000 logger=clickhouse
2022/11/24 11:42:17.865161 info SELECT value FROM `system`.`build_options` where name='VERSION_INTEGER' logger=clickhouse
2022/11/24 11:42:17.868301 info SELECT * FROM system.disks; logger=clickhouse
2022/11/24 11:42:17.869598 info CREATE DATABASE IF NOT EXISTS `AA` ENGINE = Atomic with args [[]] logger=clickhouse
2022/11/24 11:42:17.872942 info CREATE DATABASE IF NOT EXISTS `default` ENGINE = Atomic with args [[]] logger=clickhouse
2022/11/24 11:42:17.875816 info CREATE DATABASE IF NOT EXISTS `test_database` ENGINE = Atomic with args [[]] logger=clickhouse
2022/11/24 11:42:17.878021 info SELECT engine FROM system.databases WHERE name = 'AA' logger=clickhouse
2022/11/24 11:42:17.880392 info DROP TABLE IF EXISTS `AA`.`test01` NO DELAY logger=clickhouse
2022/11/24 11:42:17.881065 info SELECT engine FROM system.databases WHERE name = 'AA' logger=clickhouse
2022/11/24 11:42:17.882949 info DROP TABLE IF EXISTS `AA`.`test02` NO DELAY logger=clickhouse
2022/11/24 11:42:17.884660 info SELECT engine FROM system.databases WHERE name = 'test_database' logger=clickhouse
2022/11/24 11:42:17.887765 info DROP TABLE IF EXISTS `test_database`.`user` NO DELAY logger=clickhouse
2022/11/24 11:42:17.889459 info CREATE DATABASE IF NOT EXISTS `AA` logger=clickhouse
......
2022/11/24 11:42:17.892529 warn can't create table 'AA.test01': code: 57, message: Directory for table data store/711/7119354a-81db-4e2b-b119-354a81db2e2b/ already exists, will try again backup=2022-11-24T03-31-17 operation=restore
......
......
2022/11/24 11:42:17.895110 warn can't create table 'AA.test02': code: 57, message: Directory for table data store/c4b/c4ba3366-6c54-497a-84ba-33666c54397a/ already exists, will try again backup=2022-11-24T03-31-17 operation=restore
2022/11/24 11:42:17.895127 info CREATE DATABASE IF NOT EXISTS `test_database` logger=clickhouse
......
......
......
2022/11/24 11:42:17.902769 info clickhouse connection closed logger=clickhouse
2022/11/24 11:42:17.902791 error can't create table `AA`.`test01`: code: 57, message: Directory for table data store/711/7119354a-81db-4e2b-b119-354a81db2e2b/ already exists after 3 times, please check your schema dependencies
如果有上述的错误信息可以先恢复表结构,然后再恢复数据
参数:--schema
只还原表结构,--data
只还原数据,--table=dbname.table
备份或还原特定表
# 恢复结构
[root@localhost ~]# clickhouse-backup restore --schema 2022-11-24T03-31-17
2022/11/24 11:45:41.253234 info clickhouse connection prepared: tcp://localhost:9000 run ping logger=clickhouse
2022/11/24 11:45:41.254727 info clickhouse connection open: tcp://localhost:9000 logger=clickhouse
2022/11/24 11:45:41.254749 info SELECT value FROM `system`.`build_options` where name='VERSION_INTEGER' logger=clickhouse
2022/11/24 11:45:41.257530 info SELECT * FROM system.disks; logger=clickhouse
2022/11/24 11:45:41.258862 info CREATE DATABASE IF NOT EXISTS `AA` ENGINE = Atomic with args [[]] logger=clickhouse
2022/11/24 11:45:41.262204 info CREATE DATABASE IF NOT EXISTS `default` ENGINE = Atomic with args [[]] logger=clickhouse
2022/11/24 11:45:41.265545 info CREATE DATABASE IF NOT EXISTS `test_database` ENGINE = Atomic with args [[]] logger=clickhouse
2022/11/24 11:45:41.266979 info SELECT engine FROM system.databases WHERE name = 'AA' logger=clickhouse
2022/11/24 11:45:41.268992 info DROP TABLE IF EXISTS `AA`.`test01` NO DELAY logger=clickhouse
2022/11/24 11:45:41.270966 info SELECT engine FROM system.databases WHERE name = 'AA' logger=clickhouse
2022/11/24 11:45:41.272546 info DROP TABLE IF EXISTS `AA`.`test02` NO DELAY logger=clickhouse
2022/11/24 11:45:41.274426 info SELECT engine FROM system.databases WHERE name = 'test_database' logger=clickhouse
2022/11/24 11:45:41.277202 info DROP TABLE IF EXISTS `test_database`.`user` NO DELAY logger=clickhouse
2022/11/24 11:45:41.279613 info CREATE DATABASE IF NOT EXISTS `AA` logger=clickhouse
......
......
2022/11/24 11:45:41.293221 info CREATE DATABASE IF NOT EXISTS `test_database` logger=clickhouse
......
......
2022/11/24 11:45:41.299983 info done backup=2022-11-24T03-31-17 operation=restore
2022/11/24 11:45:41.300017 info clickhouse connection closed logger=clickhouse
# 恢复数据
[root@localhost ~]# clickhouse-backup restore --data 2022-11-24T03-31-17
2022/11/24 11:46:02.008705 info clickhouse connection prepared: tcp://localhost:9000 run ping logger=clickhouse
2022/11/24 11:46:02.010653 info clickhouse connection open: tcp://localhost:9000 logger=clickhouse
2022/11/24 11:46:02.010673 info SELECT value FROM `system`.`build_options` where name='VERSION_INTEGER' logger=clickhouse
2022/11/24 11:46:02.012639 info SELECT * FROM system.disks; logger=clickhouse
2022/11/24 11:46:02.014091 info CREATE DATABASE IF NOT EXISTS `AA` ENGINE = Atomic with args [[]] logger=clickhouse
2022/11/24 11:46:02.017339 info CREATE DATABASE IF NOT EXISTS `default` ENGINE = Atomic with args [[]] logger=clickhouse
2022/11/24 11:46:02.020124 info CREATE DATABASE IF NOT EXISTS `test_database` ENGINE = Atomic with args [[]] logger=clickhouse
2022/11/24 11:46:02.022195 info SELECT count() FROM system.settings WHERE name = 'show_table_uuid_in_table_create_query_if_not_nil' logger=clickhouse
2022/11/24 11:46:02.024477 info SELECT name FROM system.databases WHERE engine IN ('MySQL','PostgreSQL') logger=clickhouse
2022/11/24 11:46:02.026035 info SELECT countIf(name='data_path') is_data_path_present, countIf(name='data_paths') is_data_paths_present, countIf(name='uuid') is_uuid_present, countIf(name='create_table_query') is_create_table_query_present, countIf(name='total_bytes') is_total_bytes_present FROM system.columns WHERE database='system' AND table='tables' logger=clickhouse
2022/11/24 11:46:02.029620 info SELECT database, name, engine , data_paths , uuid , create_table_query , coalesce(total_bytes, 0) AS total_bytes FROM system.tables WHERE is_temporary = 0 SETTINGS show_table_uuid_in_table_create_query_if_not_nil=1 logger=clickhouse
2022/11/24 11:46:02.038140 info SELECT sum(bytes_on_disk) as size FROM system.parts WHERE active AND database='AA' AND table='test01' GROUP BY database, table logger=clickhouse
2022/11/24 11:46:02.040321 info SELECT sum(bytes_on_disk) as size FROM system.parts WHERE active AND database='AA' AND table='test02' GROUP BY database, table logger=clickhouse
2022/11/24 11:46:02.043058 info SELECT sum(bytes_on_disk) as size FROM system.parts WHERE active AND database='test_database' AND table='user' GROUP BY database, table logger=clickhouse
2022/11/24 11:46:02.047140 info ALTER TABLE `AA`.`test01` ATTACH PART 'all_1_1_0' logger=clickhouse
2022/11/24 11:46:02.049536 info done backup=2022-11-24T03-31-17 operation=restore table=AA.test01
2022/11/24 11:46:02.052962 info ALTER TABLE `AA`.`test02` ATTACH PART '1970_1_1_0' logger=clickhouse
2022/11/24 11:46:02.055558 info ALTER TABLE `AA`.`test02` ATTACH PART '2020_2_2_0' logger=clickhouse
2022/11/24 11:46:02.056601 info ALTER TABLE `AA`.`test02` ATTACH PART '2021_3_3_0' logger=clickhouse
2022/11/24 11:46:02.058794 info ALTER TABLE `AA`.`test02` ATTACH PART '2022_4_4_0' logger=clickhouse
2022/11/24 11:46:02.060960 info done backup=2022-11-24T03-31-17 operation=restore table=AA.test02
2022/11/24 11:46:02.060982 info done backup=2022-11-24T03-31-17 operation=restore table=test_database.user
2022/11/24 11:46:02.061006 info done backup=2022-11-24T03-31-17 duration=39ms operation=restore
2022/11/24 11:46:02.061022 info done backup=2022-11-24T03-31-17 operation=restore
2022/11/24 11:46:02.061054 info clickhouse connection closed logger=clickhouse
clickhouse-backup的api文档
确保/var/lib/clickhouse/backup
的权限是clickhouse:clickhouse
,否则可能会导致数据损坏。
远程备份:
较新版本才支持,需要设置config里的s3相关配置;
上传到远程存储:sudo clickhouse-backup upload xxxx;
从远程存储下载:sudo clickhouse-backup download xxx;
保存周期:
backups_to_keep_local
,保存到本地的存储周期,单位为天,
backups_to_keep_remote
,远程存储的保存周期,单位为天,
两者值若为0,表示永不删除。
原文链接:https://blog.csdn.net/qq_37475168/article/details/127933734
在原本基础上做了些许删减和润色