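Case description:
In a general-purpose machine environment, a KingbaseES V8R6 cluster uses ssh to establish mutual trust between nodes. After the system password of the kingbase user expired, the cluster failed to start. The symptom is shown in the figure below:
Applicable version: KingbaseES V8R6
I. Problem Analysis
1. Analyzing the cluster startup process
As shown in the figure below, running 'sh -x sys_monitor.sh start' traces the cluster startup. During startup, the kingbase user connects to each node over ssh to start the database service: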
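II. Reproducing the Problem
1. Changing the password validity period of the kingbase user
# The default maximum password age for the kingbase user is 99999 days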
[root@node201 ~]# chage -l kingbase
Last password change : Aug 25, 2023
Password expires : never
Password inactive : never
Account expires : never
Minimum number of days between password change : 0
Maximum number of days between password change : 99999
Number of days of warning before password expires : 7
# Change the maximum password age of the kingbase user to 1 day
[root@node201 ~]# chage -M 1 kingbase
[root@node201 ~]# chage -l kingbase
Last password change : Aug 25, 2023
Password expires : Aug 26, 2023
Password inactive : never
Account expires : never
Minimum number of days between password change : 0
Maximum number of days between password change : 1
Number of days of warning before password expires : 7
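The expiry date shown above can also be read by a script rather than by eye. The following is a minimal sketch, not part of the original procedure: the function name password_expires_field is hypothetical, and it assumes the English (C locale) field label "Password expires" that appears in the chage -l output above.

```shell
#!/bin/sh
# Hypothetical helper: read `chage -l` output on stdin and print the value of
# the "Password expires" field (for example "never" or "Aug 26, 2023").
# Assumes the English field label; run chage with LANG=C to get it.
password_expires_field() {
    awk -F':' '/^Password expires/ {
        sub(/^[ \t]+/, "", $2)   # trim leading whitespace from the value
        sub(/[ \t]+$/, "", $2)   # trim trailing whitespace from the value
        print $2
    }'
}

# Example (run as root, or as the user being inspected):
#   LANG=C chage -l kingbase | password_expires_field
```

Any value other than "never" means the password will eventually expire and the ssh trust between nodes is at risk on that date.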
2. Changing the system time
[root@node201 ~]# date
Tue Oct 17 13:58:27 CST 2023
[root@node201 ~]# date 101513582023
Sun Oct 15 13:58:00 CST 2023
[root@node201 ~]# date
Sun Oct 15 13:58:01 CST 2023
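3. Testing ssh mutual trust
As shown below, once the kingbase user's password has expired, the ssh login no longer completes non-interactively; it stops at a forced password change, so the passwordless connection fails: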
[kingbase@node201 bin]$ ssh node201
You are required to change your password immediately (password aged)
Last login: Tue Oct 17 14:04:22 2023
WARNING: Your password has expired.
You must change your password now and login again!
Changing password for user kingbase.
Changing password for kingbase.
(current) UNIX password:
New password:
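The same test can be run non-interactively, which is closer to how a startup script uses ssh. The sketch below is an illustration under stated assumptions: check_ssh_trust and the SSH_BIN override are hypothetical names, while BatchMode and ConnectTimeout are standard OpenSSH client options. With BatchMode=yes, ssh never prompts, so a forced password change is expected to show up as a non-zero exit status instead of a hanging prompt.

```shell
#!/bin/sh
# Hypothetical helper: verify that passwordless ssh to a host still works.
# SSH_BIN is an assumed override hook (defaults to ssh) so the check can be
# exercised without a real remote host.
check_ssh_trust() {
    host=$1
    # BatchMode=yes disables all prompts; `true` is a harmless remote command.
    if "${SSH_BIN:-ssh}" -o BatchMode=yes -o ConnectTimeout=5 "$host" true \
            </dev/null >/dev/null 2>&1; then
        echo "OK $host"
    else
        echo "FAIL $host"
        return 1
    fi
}

# Example, run as the kingbase user:
#   check_ssh_trust node201
```

Running this for every cluster node as the kingbase user before starting the cluster should reveal a broken trust relationship early.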
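4. Testing cluster startup
As shown in the figure below, when the kingbase user connects to a node over ssh to start the database service, the connection fails and the database service does not start:
Database startup command: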
III. Solution
After the system password validity period of the kingbase user was changed, the cluster started normally.
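The original does not show the exact command used for the fix. One plausible way, sketched here as an assumption, is to set the maximum password age back to the default of 99999 days with chage -M, the same option used in the reproduction step. The function name reset_password_max_age and the APPLY switch are hypothetical; by default the helper only prints the command so that it can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper: reset a user's maximum password age with `chage -M`.
# APPLY is an assumed safety switch: unless APPLY=1, the command is only
# printed and nothing is changed. Applying it requires root privileges.
reset_password_max_age() {
    user=$1
    max_days=${2:-99999}   # 99999 is the default shown by chage -l earlier
    if [ "${APPLY:-0}" = "1" ]; then
        chage -M "$max_days" "$user"
    else
        echo "chage -M $max_days $user"
    fi
}

# Example:
#   reset_password_max_age kingbase            # dry run, prints the command
#   APPLY=1 reset_password_max_age kingbase    # apply as root
```

After applying the change, chage -l kingbase can be used to confirm the new expiry, and the ssh test from the reproduction step should succeed again before the cluster is restarted.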
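IV. Summary
In a general-purpose machine environment, if a password validity period is configured for the system users kingbase and root, their passwords must be changed before they expire so that ssh mutual trust keeps working. For production environments that require password protection policies for system users, consider using the securecmdd tool instead of ssh to establish trusted communication between cluster nodes.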