Temporary fix
I. The certificate has expired and login fails
Version: v2.3.6
Applies to 2.3+
[root@localhost ~]# docker ps -a | grep 8443
17a1d1cf97fc harbor.jettech.com/rancher/rancher:v2.3.6 "entrypoint.sh" About an hour ago Up About an hour 0.0.0.0:80->80/tcp, 0.0.0.0:8443->443/tcp heuristic_shannon
[root@localhost ~]# docker exec -it 17a1d1cf97fc bash
root@17a1d1cf97fc:/var/lib/rancher# ls
k3s management-state
root@17a1d1cf97fc:/var/lib/rancher# cp -r k3s/server/tls k3s/server/tls_bak
root@17a1d1cf97fc:/var/lib/rancher# rm -f k3s/server/tls/*.crt # delete the old certificates
root@17a1d1cf97fc:/var/lib/rancher# exit
[root@localhost ~]# docker restart 17a1d1cf97fc # restarting the container regenerates the certificates
II. Check the certificate expiry dates
root@17a1d1cf97fc:/var/lib/rancher/k3s/server/tls# for i in $(ls /var/lib/rancher/k3s/server/tls/*.crt);do openssl x509 -enddate -noout -in $i; done
notAfter=Jan 21 07:06:21 2023 GMT
notAfter=Jan 21 07:06:21 2023 GMT
notAfter=Jan 19 07:06:21 2032 GMT
notAfter=Jan 21 07:06:21 2023 GMT
notAfter=Jan 21 07:06:21 2023 GMT
notAfter=Jan 21 07:06:21 2023 GMT
notAfter=Jan 21 07:06:21 2023 GMT
notAfter=Jan 19 07:06:21 2032 GMT
notAfter=Jan 19 07:06:21 2032 GMT
notAfter=Jan 21 07:06:21 2023 GMT
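The `notAfter=` lines above can also be turned into a number of days remaining. The following is a minimal sketch, not part of the original procedure: the function name `days_left` is hypothetical, and it assumes an English/C locale so that `%b` matches month names such as `Jan`.

```python
from datetime import datetime

def days_left(not_after_line, now):
    """Return the whole days between `now` and the date in a 'notAfter=...' line."""
    value = not_after_line.strip().split("=", 1)[1]
    # openssl pads single-digit days with an extra space; collapse it
    value = " ".join(value.split())
    expiry = datetime.strptime(value, "%b %d %H:%M:%S %Y GMT")
    return (expiry - now).days

if __name__ == "__main__":
    # Two of the lines printed by the openssl loop above
    lines = [
        "notAfter=Jan 21 07:06:21 2023 GMT",
        "notAfter=Jan 19 07:06:21 2032 GMT",
    ]
    now = datetime.utcnow()
    for line in lines:
        print(line, "->", days_left(line, now), "days left")
```

A negative result means the certificate has already expired.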
III. If the certificates are about to expire, they can be renewed from within Rancher
Applies to version 2.4+
1. Exec into the rancher server container
kubectl --insecure-skip-tls-verify -n kube-system delete secrets k3s-serving
kubectl --insecure-skip-tls-verify delete secret serving-cert -n cattle-system
rm -f /var/lib/rancher/k3s/server/tls/dynamic-cert.json
2. Restart rancher-server
docker restart rancher_id
3. Run the following command to refresh the settings
curl --insecure -sfL https://server-url/v3
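Long-term solution
A scheduled Jenkins job checks the certificate expiry dates. If a certificate expires within the next 3 months, a notification is sent through a WeCom (企业微信) robot.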
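The "expires within 3 months" rule can be sketched as a small, testable function. This is an illustration only: the names `add_months` and `should_notify` are hypothetical, and clamping the day to the end of the target month (for example Nov 30 plus 3 months giving Feb 28) is an assumption made here, not something the original scripts specify.

```python
import calendar
from datetime import date

def add_months(day, months):
    """Return `day` shifted forward by `months` calendar months."""
    # Count months from zero so that December does not wrap to month 0
    total = day.year * 12 + (day.month - 1) + months
    year, month = divmod(total, 12)
    month += 1
    # Clamp the day to the last day of the target month
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(day.day, last_day))

def should_notify(expiry, today, months=3):
    """True if the certificate expires on or before today + `months`."""
    return expiry <= add_months(today, months)

if __name__ == "__main__":
    today = date(2021, 12, 18)
    print(should_notify(date(2022, 3, 18), today))  # expires exactly 3 months out
    print(should_notify(date(2032, 1, 19), today))  # long-lived certificate
```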
Three scripts are needed in total:
check_rancher_cert_time.groovy
timestamps {
    /**
     * Check the rancher certificate expiry dates on 192.168.1.2 every Monday around 9:00
     */
    properties([
        pipelineTriggers([
            cron('H 9 * * 1')
        ])
    ])
    node('master') {
        rancherIP = "192.168.1.2"
        rancherSshUser = "admin"
        rancherSshPwd = "123456"
        scriptDir = "${WORKSPACE}/devops/shell/k8s" // directory containing the scripts
        vxUrl = "https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key=8d4432c9-d15c-4803-879a-73xdasfads32" // WeCom robot webhook API
        try {
            stage("git pull") {
                gitClone()
            }
            stage("check ${rancherIP} rancher cert time") {
                sh """
                    sshpass -p ${rancherSshPwd} scp -o StrictHostKeyChecking=no ${scriptDir}/{check_rancher_cert.sh,check_cert_time.py} ${rancherSshUser}@${rancherIP}:.
                    sshpass -p ${rancherSshPwd} ssh -o StrictHostKeyChecking=no ${rancherSshUser}@${rancherIP} "sudo bash check_rancher_cert.sh '${vxUrl}'"
                """
            }
        } catch (e) {
            throw e
        } finally {
        }
    }
}
def gitClone(){
    sh """
        if [[ -d devops ]];then
            cd devops && git pull
        else
            git clone --depth=1 -b devops https://lvhy@192.168.2.3/devops/devops.git devops
        fi
    """
}
check_rancher_cert.sh
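#!/bin/bash
# vim:sw=4:ts=4:et
<<INFO
AUTHOR:运维@小兵
DATE:2021-12-18
DESCRIBE:Check the rancher certificate expiry dates and notify through WeCom three months before a certificate expires
SYSTEM:CentOS 7.6.1810
WARNING:warning notes
MODIFY:
INFO
set -e
WORKDIR=$(cd "$(dirname "$0")";pwd)
BEFORE_MONTH=3 # notify three months before a certificate expires
EXCEED_CERT_PATH="${WORKDIR}/exceed_cert_file.txt" # report file listing the expiring certificates
TIME=$(date "+%Y-%m-%d %H:%M")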
Check_Env(){
    if [[ ! -f ${WORKDIR}/check_cert_time.py ]];then
        echo "ERROR:${WORKDIR}/check_cert_time.py Not Found"
        exit 1
    fi
}
# Collect the expiring certificates and write them to ${EXCEED_CERT_PATH}
Get_Exceed_Cert(){
    local docker_root_dir=$(docker info | grep 'Docker Root Dir:' | awk -F': ' '{print $2}') # docker storage path
    [[ ! -d ${docker_root_dir} ]] && echo "ERROR:Docker Root Dir:${docker_root_dir} Not Found" && exit 1
    local cert_path=$(find "${docker_root_dir}" -name 'serving-kube-apiserver.crt' -type f)
    # Exactly one match is expected; the variable is quoted so that wc -l counts one line per match
    if [[ -z "${cert_path}" || $(echo "${cert_path}" | wc -l) -ne 1 ]];then
        echo "ERROR:${cert_path} Is Error" && exit 1
    fi
    local cert_dir=${cert_path%/*} # directory containing the certificates
    [[ ! -d ${cert_dir} ]] && echo "ERROR:Cert Dir:${cert_dir} Not Found" && exit 1
    echo "Cert dir:${cert_dir}"
    cd "${cert_dir}"
    echo "=====================now time:${TIME}=====================" > "${EXCEED_CERT_PATH}"
    for name in *.crt
    do
        local cert_time_info=$(openssl x509 -enddate -noout -in "${name}") # e.g. notAfter=May 26 06:27:49 2022 GMT
        # The unquoted echo is intentional: it collapses the double space openssl prints before single-digit days
        local cert_exceed_time=$(echo ${cert_time_info} | awk -F'[ =]' '{printf"%s %s %s\n",$2,$3,$5}') # expiry date, e.g. May 26 2022
        python "${WORKDIR}/check_cert_time.py" "${EXCEED_CERT_PATH}" "${name}" ${BEFORE_MONTH} "${cert_exceed_time}"
    done
    # python ${WORKDIR}/check_cert_time.py ${EXCEED_CERT_PATH} "a.crt" ${BEFORE_MONTH} "Mar 18 2022"
    # python ${WORKDIR}/check_cert_time.py ${EXCEED_CERT_PATH} "b.crt" ${BEFORE_MONTH} "Mar 18 2022"
}
# WeCom notification
Vx_Notice(){
    local vx_url=$1
    local exceed_time=$2
    local host_ip=$(ip addr |awk '/inet /' |sed -n '2p' |awk -F' ' '{print $2}' |awk -F'/' '{print $1}')
    curl "${vx_url}" \
        -H 'Content-Type: application/json' \
        -d '{"msgtype": "text",
            "text": {
                "content": "'"${host_ip}"' The following Rancher certificates will expire on '"${exceed_time}"'\nPlease Check '"${EXCEED_CERT_PATH}"'",
                "mentioned_mobile_list":["@all"]}
            }'
}
# Check the certificates
Check_Cert_Time(){
    echo "INFO:Begin Check Rancher Cert Exceed Time..."
    Check_Env
    Get_Exceed_Cert
    [[ ! -f ${EXCEED_CERT_PATH} ]] && echo "ERROR:${EXCEED_CERT_PATH} Not Found" && exit 1
    # More than one line means at least one certificate was appended after the header line
    if [[ $(cat "${EXCEED_CERT_PATH}" | wc -l) -gt 1 ]];then
        if [[ -n $1 ]];then
            local vx_url=$1
            if ! echo "${vx_url}" | grep "https://qyapi.weixin.qq.com/" &> /dev/null;then
                echo "ERROR:Vx Url ${vx_url} Is Error" && exit 1
            fi
            local exceed_time=$(cat "${EXCEED_CERT_PATH}" | awk -F'[ :]' '/crt/{print $3}' | sed -n '1p')
            Vx_Notice "${vx_url}" "${exceed_time}"
        fi
        echo -e "\033[33mWARN:The following certificates will expire in ${BEFORE_MONTH} months\033[0m"
        cat "${EXCEED_CERT_PATH}"
        exit 1
    else
        echo "INFO:Rancher Cert Is Ok" | tee -a "${EXCEED_CERT_PATH}" && exit 0
    fi
}
[[ $# -gt 1 ]] && echo "ERROR:Invalid Param!!!,Please Execute:bash $0 <vx_url>" && exit 1
Check_Cert_Time "$1"
check_cert_time.py
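#!/usr/bin/env python
# -*- coding:utf-8 -*-
# @FileName :check_cert_time.py
# @Time :2021/12/17
# @Author :运维@小兵
# @Function :Check a certificate's expiry date; if it expires within n months, append the certificate name and expiry date to a file
# @Execute :python check_cert_time.py <expiring-cert report file> <cert file> <months ahead> <cert expiry date>
from datetime import datetime
import sys
# Validate the command-line arguments
def check_env():
    ex = Exception('Invalid Param!!! eg:python %s <report file> <cert file> <months ahead> <cert expiry date>' % sys.argv[0])
    if len(sys.argv) != 5:
        raise ex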
def get_date_month(mon=0):
    '''
    param mon: number of months to add to the current date
    return: YYYY-MM-DD
    '''
    now = datetime.now() # current time
    # Date n months from now; months are counted from zero so that December yields 12 rather than 0
    months = now.year * 12 + (now.month - 1) + mon
    last_y = months // 12
    last_m = months % 12 + 1
    # Zero-pad month and day so the caller can compare the value as an 8-digit YYYYMMDD integer
    last_date = '%04d-%02d-%02d' % (last_y, last_m, now.day)
    return last_date
# Convert the date taken from openssl's GMT output (e.g. "May 26 2022") to YYYY-MM-DD
def trans_gmt(gmt_time):
    GMT_FORMAT = '%b %d %Y'
    standard_time = datetime.strptime(gmt_time, GMT_FORMAT)
    standard_time = standard_time.strftime("%Y-%m-%d") # convert <class 'datetime.datetime'> to str
    return standard_time
# Check the k8s certificate expiry date
def check_k8s_cert():
    exceed_cert_file = sys.argv[1] # report file for expiring certificates
    cert_file = sys.argv[2] # certificate name
    after_mon = int(sys.argv[3]) # n months ahead
    cert_exceed_time = sys.argv[4] # certificate expiry date (GMT format)
    cert_exceed_time = trans_gmt(cert_exceed_time)
    cert_exceed_time = int(cert_exceed_time.replace('-', '')) # convert to an integer, e.g. 20220318
    after_mon_time = get_date_month(after_mon)
    after_mon_time = int(after_mon_time.replace('-', '')) # the date n months from now
    if cert_exceed_time <= after_mon_time:
        with open(exceed_cert_file, 'a') as f:
            f.write("%s expires:%s\n" % (cert_file, cert_exceed_time))
            # print('WARN: certificate %s expires on %s' % (cert_file,cert_exceed_time))
if __name__ == '__main__':
    try:
        check_env()
        check_k8s_cert()
    except Exception as e:
        print('ERROR:%s' % e)
        sys.exit(1) # non-zero exit so the calling shell script (set -e) notices the failure
Finally, create a pipeline job in Jenkins that runs the Groovy script.
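The notification body that `Vx_Notice` assembles by hand inside a shell string can also be built with a JSON library, which avoids quoting problems when a value contains spaces or quotes. The sketch below is an assumption-laden illustration: the field names (`msgtype`, `text`, `content`, `mentioned_mobile_list`) are taken from the curl call in `check_rancher_cert.sh`, while the function name `build_text_payload` and the sample values are hypothetical. It only builds the payload and does not send it.

```python
import json

def build_text_payload(host_ip, expiry, report_path):
    """Build the JSON body for a WeCom robot text message."""
    content = (
        "%s The following Rancher certificates will expire on %s\n"
        "Please Check %s" % (host_ip, expiry, report_path)
    )
    body = {
        "msgtype": "text",
        "text": {
            "content": content,
            "mentioned_mobile_list": ["@all"],
        },
    }
    # json.dumps escapes quotes and newlines, which the hand-built shell string cannot do
    return json.dumps(body)

if __name__ == "__main__":
    # Sample values for illustration only
    print(build_text_payload("192.168.1.2", "20220318", "/root/exceed_cert_file.txt"))
```

The resulting string could be passed to `curl -d` or posted with any HTTP client.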
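Option 3. Adjust the server date
Disable the time synchronization service on the server, set the system date back to a point within the certificate's validity period, then open the UI and renew the certificates.
You can set the node time manually, moving it back to a date before the certificates expired. Because the Agent only communicates with the K8S master and the Rancher Server, if the Rancher Server certificate has not expired, only the time on the K8S master nodes needs to be adjusted.
Commands to adjust the time, run on the physical host:
# Disable NTP sync, otherwise the time will be corrected automatically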
timedatectl set-ntp false
# Set the node time
timedatectl set-time '2019-01-01 00:00:00'
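Then upgrade Rancher Server and rotate the certificates by following the certificate rotation steps. Once the rotation is complete, turn time synchronization back on: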
timedatectl set-ntp true
Check the certificate validity period:
openssl x509 -in /etc/kubernetes/ssl/kube-apiserver.pem -noout -dates
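Before changing the system clock, it can help to confirm that the date you plan to set really falls inside the certificate's validity window. The following is a minimal sketch that parses the two lines printed by `openssl x509 -noout -dates`; the function names and the sample dates are hypothetical, and an English/C locale is assumed for `%b`.

```python
from datetime import datetime

DATE_FORMAT = "%b %d %H:%M:%S %Y GMT"

def validity_window(dates_output):
    """Parse 'notBefore=...' and 'notAfter=...' lines into two datetimes."""
    fields = dict(
        line.strip().split("=", 1)
        for line in dates_output.strip().splitlines()
    )
    not_before = datetime.strptime(fields["notBefore"], DATE_FORMAT)
    not_after = datetime.strptime(fields["notAfter"], DATE_FORMAT)
    return not_before, not_after

def within_validity(candidate, dates_output):
    """True if `candidate` lies inside the certificate's validity window."""
    not_before, not_after = validity_window(dates_output)
    return not_before <= candidate <= not_after

if __name__ == "__main__":
    # Sample output for illustration only; use the real output of the openssl command
    sample = "notBefore=Jan 21 07:06:21 2022 GMT\nnotAfter=Jan 21 07:06:21 2023 GMT"
    print(within_validity(datetime(2022, 6, 1), sample))
```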
Option 4. Rotate the certificates following the official documentation
# Enter the server container
docker exec -it rancher /bin/sh
kubectl --insecure-skip-tls-verify -n kube-system delete secrets k3s-serving
kubectl --insecure-skip-tls-verify delete secret serving-cert -n cattle-system
rm -f /var/lib/rancher/k3s/server/tls/dynamic-cert.json
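# Restart the server container
docker restart rancher
# Run the following command to refresh the settings
curl --insecure -sfL https://localhost:8443/v3
# Restart the server container
docker restart rancher
Then open the UI and renew the certificates.
Because the certificates have changed, the corresponding tokens change as well. After the cluster certificate update is complete, the Pods that connect to the API server must be recreated so that they obtain new tokens: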
cattle-system/cattle-cluster-agent
cattle-system/cattle-node-agent
cattle-system/kube-api-auth
ingress-nginx/nginx-ingress-controller
kube-system/canal
kube-system/kube-dns
kube-system/kube-dns-autoscaler
Other application Pods
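One way to work through the list above is to match running Pod names against the listed workloads and generate the delete commands for review. This is a sketch under stated assumptions: it assumes Pod names begin with the workload name followed by a hyphen, the function name `delete_commands` is hypothetical, and the Pod names in the example are invented. It only prints commands and does not call kubectl.

```python
TARGETS = [
    "cattle-system/cattle-cluster-agent",
    "cattle-system/cattle-node-agent",
    "cattle-system/kube-api-auth",
    "ingress-nginx/nginx-ingress-controller",
    "kube-system/canal",
    "kube-system/kube-dns",
    "kube-system/kube-dns-autoscaler",
]

def delete_commands(pods, targets=TARGETS):
    """Return kubectl delete commands for pods that belong to a target workload.

    `pods` is a list of 'namespace/pod-name' strings.
    """
    commands = []
    for pod in pods:
        namespace, name = pod.split("/", 1)
        for target in targets:
            target_ns, workload = target.split("/", 1)
            # Assumption: a pod of workload X is named X-<suffix>
            if namespace == target_ns and name.startswith(workload + "-"):
                commands.append("kubectl -n %s delete pod %s" % (namespace, name))
                break
    return commands

if __name__ == "__main__":
    # Invented pod names for illustration
    running = [
        "cattle-system/cattle-cluster-agent-5d4f7c-abcde",
        "kube-system/canal-x7k2p",
        "default/my-app-1",
    ]
    for command in delete_commands(running):
        print(command)
```

Deleting a Pod managed by a Deployment or DaemonSet causes the controller to recreate it, and the new Pod picks up the new token.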