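etcd Backup and Restore Strategy in Kubernetes
I. Backup
Backup methods:
1. Script-based backup, as shown below (currently the recommended method)
Back up every etcd node, and stagger the backup times across the nodes.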
#!/bin/bash
ETCDCTL_PATH='/usr/local/bin/etcdctl' # path to the etcdctl binary
ENDPOINTS='https://10.0.0.11:2379' # etcd endpoint, usually the master node address
ETCD_DATA_DIR="/var/lib/etcd" # etcd data directory (only used by the v2 backup below)
BACKUP_DIR="/etcd_backup/etcd-$(date +%Y-%m-%d_%H:%M:%S)" # backup directory, one per run
ETCDCTL_CERT="/etc/kubernetes/pki/etcd/peer.crt" # adjust for each environment
ETCDCTL_KEY="/etc/kubernetes/pki/etcd/peer.key" # adjust for each environment
ETCDCTL_CA_FILE="/etc/kubernetes/pki/etcd/ca.crt" # adjust for each environment
[ ! -d "$BACKUP_DIR" ] && mkdir -p "$BACKUP_DIR"
#export ETCDCTL_API=2;$ETCDCTL_PATH backup --data-dir $ETCD_DATA_DIR --backup-dir $BACKUP_DIR # backup for API version 2
#sleep 3
{
export ETCDCTL_API=3;$ETCDCTL_PATH --endpoints="$ENDPOINTS" snapshot save "$BACKUP_DIR/snapshot.db" \
--cacert="$ETCDCTL_CA_FILE" \
--cert="$ETCDCTL_CERT" \
--key="$ETCDCTL_KEY"
} > /dev/null
sleep 3
# Keep the 10 newest backup directories and delete the older ones
cd "$BACKUP_DIR/.." && ls -1dt etcd-* | tail -n +11 | xargs -r rm -rf
Add a cron job so the backup runs on a schedule:
00 01 * * * /bin/bash /etcd/etcd_backup.sh > /etcd/etcd_backup.log 2>&1
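The advice to stagger backup times across etcd nodes can be sketched as a small helper that shifts the cron start time per node. The function name cron_line, the node index (0, 1, 2, ...) and the 20-minute gap are illustrative assumptions, not part of the original setup; the script and log paths are copied from the cron entry above.

```shell
#!/bin/sh
# Hypothetical helper: print a cron entry whose start time is shifted per node.
# $1 = node index (0 for the first etcd node, 1 for the second, ...)
# $2 = gap between nodes in minutes (assumed default: 20)
cron_line() {
    node_index=$1
    gap_minutes=${2:-20}
    offset=$(( node_index * gap_minutes ))
    # The first node starts at 01:00, matching the cron entry above.
    hour=$(( (1 + offset / 60) % 24 ))
    minute=$(( offset % 60 ))
    printf '%02d %02d * * * /bin/bash /etcd/etcd_backup.sh > /etcd/etcd_backup.log 2>&1\n' \
        "$minute" "$hour"
}

# Print the entries for a three-node cluster.
for i in 0 1 2; do
    cron_line "$i"
done
```

Each node would then install only its own line, so that no two members are taking a snapshot at the same moment.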
Sample output from a successful run of the script:
[root@csp1 etcd]# sh etcd_backup.sh
{"level":"info","ts":1634109231.4035132,"caller":"snapshot/v3_snapshot.go:110","msg":"created temporary db file","path":"/root/etcd/backup/etcd-2021-10-13_15:13:51/snapshot.db.part"}
{"level":"warn","ts":"2021-10-13T15:13:51.411+0800","caller":"clientv3/retry_interceptor.go:116","msg":"retry stream intercept"}
{"level":"info","ts":1634109231.4116611,"caller":"snapshot/v3_snapshot.go:121","msg":"fetching snapshot","endpoint":"https://172.16.63.31:2379"}
{"level":"info","ts":1634109231.7625456,"caller":"snapshot/v3_snapshot.go:134","msg":"fetched snapshot","endpoint":"https://172.16.63.31:2379","took":0.35894897}
{"level":"info","ts":1634109231.7626708,"caller":"snapshot/v3_snapshot.go:143","msg":"saved","path":"/root/etcd/backup/etcd-2021-10-13_15:13:51/snapshot.db"}
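A "saved" log line only says the file was written, so it can be worth checking the snapshot before relying on it. The following is a minimal sketch, assuming etcdctl is on the PATH; the function name verify_snapshot is hypothetical. It uses the etcdctl snapshot status subcommand, which newer etcd releases are moving to the separate etcdutl tool, so the exact command may differ by version.

```shell
#!/bin/sh
# Hypothetical helper: sanity-check a snapshot file.
# $1 = path to snapshot.db
verify_snapshot() {
    snapshot=$1
    # -s is true only if the file exists and is not empty.
    if [ ! -s "$snapshot" ]; then
        echo "snapshot missing or empty: $snapshot" >&2
        return 1
    fi
    if command -v etcdctl >/dev/null 2>&1; then
        # Prints hash, revision, total keys and total size of the snapshot.
        ETCDCTL_API=3 etcdctl snapshot status "$snapshot" --write-out=table
    else
        echo "etcdctl not found; only checked that the file is non-empty" >&2
    fi
}

# Example call (the path follows the sample output above):
# verify_snapshot /root/etcd/backup/etcd-2021-10-13_15:13:51/snapshot.db
```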
II. etcd Data Restore
Check whether etcd is healthy:
ETCDCTL_API=3 etcdctl --endpoints https://10.0.0.11:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/peer.crt \
--key=/etc/kubernetes/pki/etcd/peer.key \
endpoint health
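The same endpoint and TLS flags are repeated in every command in this section. As a possible shortcut, etcdctl (API v3) also reads its flags from ETCDCTL_-prefixed environment variables, so they can be exported once per shell session. The values below are copied from the command above. Once a variable is exported, the matching command-line flag should be left off, since etcdctl is expected to reject a flag that shadows its environment variable.

```shell
#!/bin/sh
# Values copied from the health check above; adjust for each environment.
export ETCDCTL_API=3
export ETCDCTL_ENDPOINTS="https://10.0.0.11:2379"
export ETCDCTL_CACERT="/etc/kubernetes/pki/etcd/ca.crt"
export ETCDCTL_CERT="/etc/kubernetes/pki/etcd/peer.crt"
export ETCDCTL_KEY="/etc/kubernetes/pki/etcd/peer.key"

# Only run the checks where etcdctl is actually installed.
if command -v etcdctl >/dev/null 2>&1; then
    etcdctl endpoint health
    etcdctl endpoint status --write-out=table
fi
```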
Stop the kubelet service on the cluster's master nodes:
systemctl stop kubelet
List the etcd members:
ETCDCTL_API=3 etcdctl --endpoints https://10.0.0.11:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/peer.crt \
--key=/etc/kubernetes/pki/etcd/peer.key \
member list
# Output returned
57965df5add21941, started, csp1, https://10.0.0.11:2380, https://10.0.0.11:2379, false
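The third and fourth columns of this output (member name and peer URL) are the values that the restore step later needs for --name and --initial-cluster. As a sketch, they can be turned into an --initial-cluster string with awk. The function name initial_cluster_from_members is hypothetical, and the parsing assumes the default comma-separated output format shown above.

```shell
#!/bin/sh
# Hypothetical helper: read "member list" lines on stdin and print
# name=peerURL pairs joined by commas.
initial_cluster_from_members() {
    awk -F', ' '{ printf "%s%s=%s", sep, $3, $4; sep = "," } END { print "" }'
}

# Example with the single member shown above.
echo '57965df5add21941, started, csp1, https://10.0.0.11:2380, https://10.0.0.11:2379, false' \
    | initial_cluster_from_members
```

For the member above this prints csp1=https://10.0.0.11:2380.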
View the etcd pod's YAML manifest:
kubectl -n kube-system get pod etcd-master01 -o yaml
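The etcd data directory is /var/lib/etcd/, which the container mounts from /var/lib/etcd/ on the host, so deleting that directory on the host wipes the etcd container's data. The snapshot restore command will not write into an existing, non-empty data directory, so the old one has to be cleared before restoring.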
[root@master01 ~]# rm -rf /var/lib/etcd/
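Because rm -rf cannot be undone, a more cautious variant is to rename the old data directory instead of deleting it, and remove it only once the restored cluster has been verified. This is a sketch of that alternative, not the procedure from the original text; the function name set_aside and the .bak-<timestamp> suffix are assumptions.

```shell
#!/bin/sh
# Hypothetical helper: move a directory out of the way and print the new name.
# $1 = directory to set aside, e.g. /var/lib/etcd/
set_aside() {
    dir=${1%/}   # drop a trailing slash, if any
    if [ ! -d "$dir" ]; then
        echo "no such directory: $dir" >&2
        return 1
    fi
    backup="${dir}.bak-$(date +%Y%m%d%H%M%S)"
    mv "$dir" "$backup" && echo "$backup"
}

# Example call on the master node:
# set_aside /var/lib/etcd/
```

The renamed copy stays on the same filesystem, so enough free disk space for both the old and the restored data is needed.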
Restore the etcd data:
export ETCDCTL_API=3
# The member name and peer URL must be identical in --name,
# --initial-advertise-peer-urls and --initial-cluster; adjust for each environment.
etcdctl snapshot restore /root/etcd/backup/snapshot.db \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/peer.crt \
--key=/etc/kubernetes/pki/etcd/peer.key \
--name=master01 \
--data-dir=/var/lib/etcd \
--skip-hash-check \
--initial-advertise-peer-urls=https://10.0.0.11:2380 \
--initial-cluster=master01=https://10.0.0.11:2380
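etcd expects the name=peerURL pair of the member being restored to appear in --initial-cluster, so a mistyped IP address or a stray space in either flag is an easy way to make the restore fail. A small pre-flight check can catch this before the command is run. This is a sketch: the function name peer_in_cluster is hypothetical, and it handles the simple case of one peer URL per member.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if "name=peer_url" is one of the
# comma-separated entries in the initial-cluster string.
# $1 = member name, $2 = advertised peer URL, $3 = initial-cluster value
peer_in_cluster() {
    name=$1
    peer_url=$2
    initial_cluster=$3
    # Wrap both sides in commas so that only whole entries can match.
    case ",$initial_cluster," in
        *",$name=$peer_url,"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example: the first call matches, the second has a mistyped IP address.
peer_in_cluster master01 https://10.0.0.11:2380 master01=https://10.0.0.11:2380 \
    && echo "flags are consistent"
peer_in_cluster master01 https://10.0.0.11:2380 master01=https://10.0.0.111:2380 \
    || echo "peer URL not found in initial cluster"
```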
Check whether the data was restored:
[root@master01 ~]# ls /var/lib/etcd/
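The kubelet was stopped earlier in this procedure, so it presumably has to be started again before the etcd static pod picks up the restored data. That final step is not spelled out in the original text, so the sketch below is an assumption: it starts the kubelet and then repeats the health check from the beginning of this section until it passes. The helper names wait_until and post_restore_check, and the 30 attempts at 5-second intervals, are illustrative.

```shell
#!/bin/sh
# Hypothetical helper: retry a command until it succeeds.
# $1 = maximum attempts, $2 = delay in seconds, rest = command to run
wait_until() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while :; do
        "$@" && return 0
        [ "$i" -ge "$attempts" ] && return 1
        i=$((i + 1))
        sleep "$delay"
    done
}

# Hypothetical wrapper: restart the kubelet, then wait for etcd to report healthy.
post_restore_check() {
    export ETCDCTL_API=3
    systemctl start kubelet || return 1
    wait_until 30 5 etcdctl --endpoints https://10.0.0.11:2379 \
        --cacert=/etc/kubernetes/pki/etcd/ca.crt \
        --cert=/etc/kubernetes/pki/etcd/peer.crt \
        --key=/etc/kubernetes/pki/etcd/peer.key \
        endpoint health
}

# Run on the restored master node:
# post_restore_check
```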