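Keywords:
KingbaseES, cluster, installation and deployment, Kingbase (人大金仓)
1. Cluster Deployment
This chapter describes the procedure for installing the KingbaseES cluster software.
1.1 Environment Preparation
1.1.1 Deployment Planning
This section covers the items to plan before installing the cluster.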
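Planning item | Recommended value | Description |
Number of servers | At least two. | A cluster needs at least one primary and one standby node. Keeping the cluster to no more than 8 nodes is recommended. |
Synchronous nodes | At least one synchronous node. | Plan at least one synchronous node. Choose one or more synchronous standby nodes depending on the high-availability requirements of the customer scenario. |
IP addresses | One IP address per server; if a VIP is needed, plan one additional IP. | IP address planning. |
Storage capacity | Size the data directory at 1.5 times the business data volume. | Confirm that the disk space is reasonable for the business requirements. |
Directory layout | Give the data file directory its own directory; other database-related directories need not be separated. | Deploying the data directory apart from other database-related directories reduces the impact a single directory can have on the whole database system. |
Mutual trust | Configure mutual trust for the root user | The mutual-trust method affects how the cluster is deployed |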
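The storage rule in the table (data directory sized at 1.5 times the business data) can be turned into a small calculation. The following is a minimal sketch; the function name `plan_data_dir_gb` and the choice to round up to a whole GB are illustrative assumptions, not part of the KingbaseES tooling.

```bash
#!/usr/bin/env bash
# Planned data-directory size = 1.5 x business data size (rule from the planning table).
# Uses integer arithmetic and rounds up to the next whole GB (rounding is an assumption).
plan_data_dir_gb() {
    local business_gb=$1
    echo $(( (business_gb * 3 + 1) / 2 ))
}

# Example: 200 GB of business data
plan_data_dir_gb 200   # prints 300
```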
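1.1.2 Hardware Requirements
This section lists the minimum server requirements for installing a Kingbase cluster. Adjust them to your actual workload.
Item | Configuration |
Memory |
|
CPU |
|
Disk |
|
Network |
|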
1.1.3 Software Requirements
1.1.3.1 Operating-System Package Requirements
Item | Configuration |
Operating-system commands | cat, grep, awk, ls, mkdir, scp, sed, echo, service, chmod, chown, systemctl, test, which, cp, mv, tar, file, arping, ping, ssh, vim, unzip, tc, ip, java |
JDK version | JDK 1.6 or later |
Browser | Chrome |
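Before starting, it can help to confirm that every command in the table is present on each node. The sketch below is one way to do that; `missing_commands` is a hypothetical helper name, and the command list is copied from the table above.

```bash
#!/usr/bin/env bash
# Print every command from the argument list that cannot be found on this host.
missing_commands() {
    local cmd
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

# Command list copied from the table above.
REQUIRED_CMDS=(cat grep awk ls mkdir scp sed echo service chmod chown systemctl
               test which cp mv tar file arping ping ssh vim unzip tc ip java)

missing=$(missing_commands "${REQUIRED_CMDS[@]}")
if [ -n "$missing" ]; then
    # Unquoted on purpose so the names print on one line.
    echo "Missing commands:" $missing
else
    echo "All required commands are available."
fi
```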
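1.1.3.2 Operating-System Service Requirements
Item | Configuration |
selinux | Disabled automatically during deployment. |
iptables | Disabling it is recommended. If it is not disabled, open the ports the cluster uses. |
Ports | The server must have at least ports 8890, 54321, and 22 open. |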
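If iptables stays enabled, the three ports from the table need to be opened. The sketch below only prints candidate rules so they can be reviewed before anything is applied; the rule form (`-I INPUT ... -j ACCEPT`) and the helper name `print_iptables_rules` are assumptions, and your site may manage the firewall differently.

```bash
#!/usr/bin/env bash
# Ports named in the table above: sys_securecmdd (8890), database (54321), ssh (22).
CLUSTER_PORTS=(8890 54321 22)

# Print (do not execute) one iptables rule per port so the rules can be reviewed first.
print_iptables_rules() {
    local port
    for port in "$@"; do
        echo "iptables -I INPUT -p tcp --dport ${port} -j ACCEPT"
    done
}

print_iptables_rules "${CLUSTER_PORTS[@]}"
```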
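1.1.3.3 Operating-System Parameter Requirements
During installation, the KingbaseES cluster automatically changes the following operating-system parameters to their recommended values:
Parameter | Value set for the cluster | Description |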
kernel.sem | 5010 64128000 50100 1280 | |
kernel.shmmni | 8192 | |
vm.overcommit_memory | 2 | |
vm.overcommit_ratio | 90 | |
fs.file-max | 7672460 | |
net.ipv4.ip_local_port_range | 9000 65500 | |
ulimit settings | * soft nofile 655360 | |
shopt | - | |
GSSAPIAuthentication | no | |
UseDNS | no | |
UsePAM | yes |
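To see how a node's current kernel parameters compare with the values in the table, something like the following could be used. It is a sketch only: the helper names are illustrative, the recommended values are copied from the table, and only the parameters readable through `sysctl -n` are covered.

```bash
#!/usr/bin/env bash
# Collapse tabs and repeated spaces so "9000<TAB>65500" and "9000 65500" compare equal.
normalize_ws() {
    local words
    read -r -a words <<< "$1"
    echo "${words[*]}"
}

# Args: parameter name, current value, recommended value.
compare_param() {
    if [ "$(normalize_ws "$2")" = "$(normalize_ws "$3")" ]; then
        echo "OK $1"
    else
        echo "DIFF $1 current='$(normalize_ws "$2")' recommended='$3'"
    fi
}

# Compare each kernel parameter from the table with the value on this host.
check_sysctl() {
    local key value
    while IFS='=' read -r key value; do
        compare_param "$key" "$(sysctl -n "$key" 2>/dev/null)" "$value"
    done <<'EOF'
kernel.sem=5010 64128000 50100 1280
kernel.shmmni=8192
vm.overcommit_memory=2
vm.overcommit_ratio=90
fs.file-max=7672460
net.ipv4.ip_local_port_range=9000 65500
EOF
}

# Run the comparison only when sysctl is available on this host.
if command -v sysctl >/dev/null 2>&1; then
    check_sysctl
fi
```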
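1.1.3.4 Operating-System and CPU Architecture Requirements
CPU architecture | Operating system | Version |
X86 (Intel & AMD) | Windows | Windows 7 Ultimate |
X86 (Intel & AMD) | CentOS | 7.5 and later |
X86 (Intel & AMD) | Linx (凝思) | GNU/Linux 6.0.80 (jessie) |
X86 (Intel & AMD) | UOS | VERSION="20 SP1" |
X86 (Intel & AMD) | KylinSec (麒麟信安) | KylinSec V3.5.1 |
X86 (Intel & AMD) | Kylin (麒麟) | Kylin V10-SP1-2203, Desktop Edition |
X86 (Intel & AMD) | openEuler | openEuler-22.03-LTS-SP1 |
X86 (Hygon & Zhaoxin) | UOS | VERSION="20 SP1" |
X86 (Hygon & Zhaoxin) | Kylin | release V10 (Tercel) |
MIPS64 (Loongson 3B3000 & Loongson 3B4000) | UOS | version=20 |
MIPS64 (Loongson 3B3000 & Loongson 3B4000) | Kylin | VERSION="V10 (Azalea)" |
Loongarch64 (Loongson 3C5000L) | Kylin | release V10 (SP1) |
aarch64 (Phytium 2000+) | Kylin | |
aarch64 (Phytium 2000+) | UOS V20 | version=20 |
aarch64 (Phytium S2500) | Kylin | release V10 (SP1) |
aarch64 (Phytium S2500) | UOS | version=20 |
ARM (Kunpeng 920) | UOS | version=20 |
ARM (Kunpeng 920) | Kylin | release V10 (SP1) |
SW64 (Sunway 3231) | UOS | version=20 |
SW64 (Sunway 3231) | Kylin | Release V10 (SP1) |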
1.1.4 Create the User and Group
As root, create the cluster installation user kingbase. Run this on every node of the cluster:
[root@node1]# useradd kingbase
[root@node1]# id kingbase
uid=1000(kingbase) gid=1000(kingbase) groups=1000(kingbase)
1.1.5 Create the Cluster Installation Path
As root, create the cluster installation path. Run this on every node of the cluster:
[root@node1 root_cmd]# mkdir -p /opt/cluster
[root@node1 root_cmd]# chown -R kingbase:kingbase /opt/cluster
1.2 Cluster Installation
1.2.1 Obtaining the Cluster Software Packages
1.2.1.1 Install the Standalone Database
For this step, see the "Database Software Installation Guide for Linux" (《基于Linux系统的数据库软件安装指南》).
1.2.1.2 Obtain the Cluster Deployment Files
1. Obtain the following files from the ${install_dir}/ClientTools/guitools/DeployTools/zip/ directory:
Package name | Notes |
db.zip | Database server archive |
cluster_install.sh | Deployment script |
install.conf | Deployment configuration file |
trust_cluster.sh | Script that configures SSH mutual trust |
license.dat | Database license file |
securecmdd.zip | Securecmdd package |
2. On the primary node of the cluster, create a directory for the packages (the path can be customized):
[root@node1]# mkdir -p /home/kingbase/software
3. Upload the packages
Upload the files listed above to the /home/kingbase/software directory on the primary node of the cluster.
4. Change the file ownership and permissions
[root@node1]# chown -R kingbase:kingbase /home/kingbase/software
[root@node1]# chmod -R 775 /home/kingbase/software
[root@node1]# ls -ltr /home/kingbase/software/
total 746884
-rwxrwxr-x 1 kingbase kingbase 252570 May 28 10:31 cluster_install.sh
-rwxrwxr-x 1 kingbase kingbase 327132813 May 28 11:00 db.zip
-rwxrwxr-x 1 kingbase kingbase 19527 May 28 10:31 install.conf
-rwxrwxr-x 1 kingbase kingbase 2911 May 28 11:04 license.dat
-rwxrwxr-x 1 kingbase kingbase 2594641 May 28 10:59 securecmdd.zip
-rwxrwxr-x 1 kingbase kingbase 9682 May 28 10:31 trust_cluster.sh
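A quick way to confirm that all six files arrived in the upload directory is sketched below. The helper name `missing_files` is hypothetical; the file names come from the package table in step 1.

```bash
#!/usr/bin/env bash
# File names from the package table above.
CLUSTER_FILES=(db.zip cluster_install.sh install.conf trust_cluster.sh license.dat securecmdd.zip)

# Print each expected file that is absent from the given directory.
missing_files() {
    local dir=$1 f
    for f in "${CLUSTER_FILES[@]}"; do
        [ -f "$dir/$f" ] || echo "$f"
    done
}

# No output means every file is in place.
missing_files /home/kingbase/software
```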
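1.2.2 Parameter File Configuration (install.conf)
1.2.2.1 Sample Parameters
The following is a parameter file template for a cluster with one primary and one standby node. Modify it to match your environment.
Template information:
Primary node IP | 10.12.11.192 |
Standby node IP | 10.12.11.193 |
NIC name on primary and standby nodes | ens192 |
VIP | 10.12.11.200 |
Parameter template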
[install]
on_bmj=0
all_ip=(10.12.11.192 10.12.11.193)
witness_ip=""
production_ip=()
local_disaster_recovery_ip=()
remote_disaster_recovery_ip=()
install_dir="/opt/cluster"
zip_package="/home/kingbase/software/db.zip"
license_file=(license.dat)
db_user="system"
db_port="54321"
db_mode="oracle"
db_auth="scram-sha-256"
db_case_sensitive="yes"
db_checksums="yes"
archive_mode="always"
encoding="UTF8"
locale="zh_CN.UTF-8"
other_db_init_options=""
sync_security_guc="no"
tcp_keepalives_idle="2"
tcp_keepalives_interval="2"
tcp_keepalives_count="3"
tcp_user_timeout="9000"
connection_timeout="10"
wal_sender_timeout="30000"
wal_receiver_timeout="30000"
trusted_servers="10.12.11.192,10.12.11.193"
running_under_failure_trusted_servers='on'
use_exist_data=0
data_directory=""
waldir=''
virtual_ip="10.12.11.200"
ignore_vip_failure='off'
net_device=(ens192)
net_device_ip=(10.12.11.192 10.12.11.193)
ipaddr_path="/sbin"
arping_path="/usr/sbin"
ping_path="/usr/bin"
install_with_root=0
super_user="root"
execute_user="kingbase"
deploy_by_sshd=1
use_scmd=1
reconnect_attempts="10"
reconnect_interval="6"
recovery="standby"
ssh_port="22"
scmd_port="8890"
use_ssl=0
auto_cluster_recovery_level='1'
use_check_disk='off'
synchronous=''
sync_nodes=()
potential_nodes=()
async_nodes=()
sync_in_same_location=0
failover_need_server_alive='off'
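After editing install.conf it is easy to lose track of what was actually set. The sketch below reads individual keys back for review. It assumes the file holds one `key=value` setting per line, as in the template, and does not handle trailing comments; `conf_get` and the `CONF` variable are hypothetical names.

```bash
#!/usr/bin/env bash
# Print the value of one key from install.conf with quotes and parentheses stripped.
# Only the first "key=value" line for the key is used.
conf_get() {
    local file=$1 key=$2
    sed -n "s/^[[:space:]]*${key}=//p" "$file" | head -n 1 | tr -d "\"'()"
}

# Path used earlier in this guide; override with CONF=/path/to/install.conf.
CONF=${CONF:-/home/kingbase/software/install.conf}
if [ -f "$CONF" ]; then
    for key in all_ip virtual_ip net_device install_dir db_port; do
        echo "$key = $(conf_get "$CONF" "$key")"
    done
fi
```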
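1.2.2.2 Parameter Description
Parameter | Default / recommended value | Notes |
on_bmj | 0 | Required. Indicates the cluster server type. |
all_ip | | Required when deploying a read/write-splitting cluster. Accepts IPv4 addresses, IPv6 addresses, or hostnames, and all nodes must use the same type. Entries are separated by spaces; the first entry becomes the cluster's primary node by default. |
witness_ip | | Optional. The IP of the device that hosts the witness node. Only one IPv4 address, IPv6 address, or hostname can be configured, and it must be of the same type as the entries in all_ip. |
production_ip | | Must not be configured when deploying a read/write-splitting cluster. |
local_disaster_recovery_ip | | Must not be configured when deploying a read/write-splitting cluster. |
remote_disaster_recovery_ip | | Must not be configured when deploying a read/write-splitting cluster. |
install_dir | | Required. The software installation path. The path must be a directory that does not yet exist, the installation directory must be consistent with that of the original cluster, and the string must not end with the / character. |
zip_package | | Required. The path of the compressed package. |
license_file | | Optional. The name of the license file; separate multiple licenses with spaces. If not configured, the bundled trial license is used. |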
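db_user | system | Required. The user name used to initialize the database. |
db_password | 12345678ab | Optional. The password of the database initialization user. If not configured, the default password 12345678ab is used. |
db_port | 54321 | Required. The database port specified at initialization. |
db_mode | oracle | Required. The database compatibility mode specified at initialization. |
db_auth | scram-sha-256 | Required. The default authentication method for local connections. |
db_case_sensitive | yes | Required. Whether the database is case-sensitive. |
db_checksums | yes | Optional. Whether to enable checksums that detect data page corruption. |
archive_mode | always | Optional. Whether to enable WAL archiving. |
encoding | UTF8 | Optional. The character set of the database at initialization. The default is "UTF8"; if not configured, the database reads the default from the locale command. |
locale | zh_CN.UTF-8 | Optional. The character-set rule for collation at initialization. The default is "zh_CN.UTF-8"; if not configured, the database reads the default from the locale command. |
other_db_init_options | | Optional. User-defined options added at database initialization. |
sync_security_guc | no | Optional. Whether to enable synchronization of security parameters. |
tcp_keepalives_idle | 2 | Once a connection has been idle for longer than this value, a keepalive message is sent to probe whether the connection is still healthy. Effective only when the SO_KEEPALIVE socket option is enabled. The default is 2 seconds. |
tcp_keepalives_interval | 75 | The interval after which a keepalive message not yet acknowledged by the client is retransmitted. The default is 75 seconds. |
tcp_keepalives_count | 3 | The number of consecutive keepalives sent without an ACK reply before the connection is considered broken and is closed. The default is 3. When tcp_user_timeout is enabled, this behavior is overridden and the connection is not closed even after the count is reached. |
tcp_user_timeout | 9000 | How long transmitted data may remain unacknowledged before the TCP connection is forcibly closed. A value without a unit is interpreted as milliseconds. The default is 9000 milliseconds. |
connection_timeout | 10 | The connection timeout between cluster nodes, which is also the timeout for connecting to the database. The default is 10 seconds. |
wal_sender_timeout | 30000 | The timeout for the WAL sender while waiting to send. If the wait exceeds this value, the streaming replication connection is dropped. The default is 30000 milliseconds; 0 disables the parameter. |
wal_receiver_timeout | 30000 | The timeout for the WAL receiver process while waiting to receive. If the wait exceeds this value, the connection is dropped. The default is 30000 milliseconds; 0 disables the parameter. |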
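trusted_servers | | Required. The trusted gateways for the cluster deployment. Separate multiple gateways with ","; spaces are not allowed. The entries must be of the same IP type as those in all_ip. |
running_under_failure_trusted_servers | on | Optional. Whether a database node keeps running normally after it can no longer ping the trusted gateways. |
use_exist_data | 0 | Optional. Whether to deploy the cluster on an existing standalone data directory. |
data_directory | | Optional. The full path of the database data directory. The default path is ${install_dir}/kingbase/data. |
waldir | | Optional. The full path where the database WAL logs are stored. By default they are stored under data/sys_wal. |
virtual_ip | | Optional. The database virtual IP (VIP) used by the cluster. Only IPv4 is supported. The mask length defaults to 24 and must match the actual mask length of the NIC given in net_device. |
ignore_vip_failure | off | Optional. Whether a failed VIP operation affects the continued execution of command-line commands. |
net_device | | Optional, but required if [virtual_ip] is configured. The name of the NIC that carries the VIP. Separate the NIC names of the nodes with spaces, in the same order as all_ip. If the nodes share the same NIC name, write it only once. |
net_device_ip | | Optional, but required if [virtual_ip] is configured. The physical IP address of the NIC that carries the VIP. Separate the IPs of the nodes with spaces. The value must be IPv4 and in the same order as all_ip. net_device_ip may be any entry in the list of physical IP addresses on the NIC that carries the VIP; the default is the first one, and no distinction is made between virtual and physical IPs. |
ipaddr_path | /sbin | Optional. The path of the ip command. The default is "/sbin". |
arping_path | | Optional. The path of the arping command. The default is the bin directory under the database installation directory. |
ping_path | /bin | Optional. The path of the ping command. The default is "/bin". |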
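super_user | root | Optional. The operating-system superuser used for cluster deployment. The default is root. |
execute_user | kingbase | Optional. The regular operating-system user used for cluster deployment. The default is kingbase. |
deploy_by_sshd | 1 | Optional. The communication method used during cluster deployment. |
use_scmd | 1 | Optional. The communication method the cluster depends on at run time. |
reconnect_attempts | 10 | Optional. The number of detection retries when the database fails. |
reconnect_interval | 6 | Optional. The interval between detection retries when the database fails. |
recovery | standby | Optional. The automatic recovery mode for cluster failures. |
ssh_port | 22 | Optional. The sshd/ssh communication port. The default is 22. |
scmd_port | 8890 | Optional. The sys_securecmdd/sys_securecmd communication port. The default is 8890. |
use_ssl | 0 | Optional. Whether the cluster uses SSL for communication between nodes. |
auto_cluster_recovery_level | 1 | Optional. Whether multi-level automatic failure recovery is enabled for the cluster. |
use_check_disk | off | Optional. Whether disk-failure failover is enabled for the cluster. |
synchronous | | Optional. The synchronous/asynchronous mode of the cluster. |
sync_nodes | 0 | Optional. Takes effect only when synchronous is set to custom. Lists the hosts of the synchronous nodes, separated by spaces. Every node must come from all_ip. |
potential_nodes | 0 | Optional. Takes effect only when synchronous is set to custom. Lists the hosts of the synchronous candidate nodes, separated by spaces. Every node must come from all_ip. |
async_nodes | 0 | Optional. Takes effect only when synchronous is set to custom. Lists the hosts of the asynchronous nodes, separated by spaces. Every node must come from all_ip. |
sync_in_same_location | 0 | Optional. In a two-site, three-center cluster, whether synchronous mode takes effect only within the local center. |
failover_need_server_alive | off | Optional. In a two-site, three-center cluster that supports cross-center failover, whether a check is performed. |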
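Several rules in the table are simple enough to check before running the installer. The sketch below covers three of them. The function names are hypothetical, and the one-entry-per-node rule for net_device_ip is an inference from the requirement that its order match all_ip.

```bash
#!/usr/bin/env bash
# Each function returns 0 when the rule from the parameter table holds.

# install_dir must not end with "/".
valid_install_dir() { [ -n "$1" ] && [ "${1%/}" = "$1" ]; }

# trusted_servers: comma-separated, with no spaces.
valid_trusted_servers() { [ -n "$1" ] && [[ "$1" != *[[:space:]]* ]]; }

# Two space-separated lists have the same number of entries
# (assumption: net_device_ip carries one IP per node in all_ip).
same_count() {
    local a b
    read -r -a a <<< "$1"
    read -r -a b <<< "$2"
    [ "${#a[@]}" -eq "${#b[@]}" ]
}

# Values from the sample template.
valid_install_dir "/opt/cluster" && echo "install_dir OK"
valid_trusted_servers "10.12.11.192,10.12.11.193" && echo "trusted_servers OK"
same_count "10.12.11.192 10.12.11.193" "10.12.11.192 10.12.11.193" && echo "net_device_ip OK"
```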
1.2.3 Cluster Setup
1.2.3.1 Modify the Operating-System Parameters
- Copy the securecmdd.zip package to the /home/kingbase/software directory on every standby node
- Log in to each cluster node as root and run:
[root@node1 software]# cd /home/kingbase/software/
[root@node1 software]# unzip securecmdd.zip
[root@node1 software]# chown -R kingbase:kingbase /home/kingbase/software/
[root@node1 software]# cd securecmdd/root_cmd/
[root@node1 root_cmd]# sh root_env_init.sh kingbase
The operation log is as follows:
[root@node1 software]# cd /home/kingbase/software/
[root@node1 software]# unzip securecmdd.zip
Archive: securecmdd.zip
creating: securecmdd/
creating: securecmdd/root_cmd/
inflating: securecmdd/root_cmd/root_env_init.sh
inflating: securecmdd/root_cmd/root_env_check.sh
inflating: securecmdd/root_cmd/arping
creating: securecmdd/lib/
inflating: securecmdd/lib/libcrypto.so.1.1
inflating: securecmdd/lib/libssl.so.1.1
creating: securecmdd/bin/
inflating: securecmdd/bin/sys_securecmd
inflating: securecmdd/bin/sys_secureftp
inflating: securecmdd/bin/sys_HAscmdd.sh
inflating: securecmdd/bin/sys_securecmdd
creating: securecmdd/share/
inflating: securecmdd/share/sys_HAscmdd.conf
inflating: securecmdd/share/key_file
inflating: securecmdd/share/securecmdd_config
inflating: securecmdd/share/securecmdd.service
inflating: securecmdd/share/securecmd_config
inflating: securecmdd/share/accept_hosts
[root@node1 software]# chown -R kingbase:kingbase /home/kingbase/software/
[root@node1 software]# cd securecmdd/root_cmd/
[root@node1 root_cmd]# sh root_env_init.sh kingbase
[Tue May 28 14:51:29 CST 2024] [INFO] change UsePAM ...
[Tue May 28 14:51:29 CST 2024] [INFO] change UsePAM ... Done
[Tue May 28 14:51:29 CST 2024] [INFO] change ulimit ...
[Tue May 28 14:51:29 CST 2024] [INFO] change ulimit ... Done
[Tue May 28 14:51:29 CST 2024] [INFO] change kernel.sem ...
[Tue May 28 14:51:29 CST 2024] [INFO] change kernel.sem ... Done
[Tue May 28 14:51:29 CST 2024] [INFO] no need to change "/etc/profile"
[Tue May 28 14:51:29 CST 2024] [INFO] stop selinux ...
[Tue May 28 14:51:29 CST 2024] [INFO] stop selinux ... Done
[Tue May 28 14:51:29 CST 2024] [INFO] change RemoveIPC ...
[Tue May 28 14:51:29 CST 2024] [INFO] change RemoveIPC ... Done
[Tue May 28 14:51:29 CST 2024] [INFO] change DefaultTasksAccounting ...
[Tue May 28 14:51:29 CST 2024] [INFO] change DefaultTasksAccounting ... Done
[Tue May 28 14:51:29 CST 2024] [INFO] chmod /sbin/ip ...
[Tue May 28 14:51:29 CST 2024] [INFO] chmod /sbin/ip ... Done
[Tue May 28 14:51:29 CST 2024] [INFO] copy /opt/kes/bin/arping ...
[Tue May 28 14:51:29 CST 2024] [INFO] copy /opt/kes/bin/arping ... Done
[Tue May 28 14:51:29 CST 2024] [INFO] chmod /opt/kes/bin/arping ...
[Tue May 28 14:51:29 CST 2024] [INFO] chmod /opt/kes/bin/arping ... Done
[Tue May 28 14:51:29 CST 2024] [INFO] chmod /usr/bin/crontab ...
[Tue May 28 14:51:29 CST 2024] [INFO] chmod /usr/bin/crontab ... Done
[Tue May 28 14:51:29 CST 2024] [INFO] configuration to take effect ...
[Tue May 28 14:51:29 CST 2024] [INFO] configuration to take effect ... Done
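With several standby nodes, copying securecmdd.zip by hand gets repetitive. The sketch below loops over the standbys. It assumes root can reach each standby over ssh/scp, takes the standby list from the sample install.conf, and by default only prints the commands; `push_securecmdd` and the `RUN` switch are illustrative names.

```bash
#!/usr/bin/env bash
# Standby node list is an assumption taken from the sample install.conf; adjust it.
STANDBY_NODES=(10.12.11.193)
SOFTWARE_DIR=/home/kingbase/software

# With RUN unset, commands are only printed (dry run). Export RUN="" to execute them.
RUN=${RUN-echo}

push_securecmdd() {
    local node
    for node in "$@"; do
        # Make sure the target directory exists, then copy the package.
        $RUN ssh "root@${node}" "mkdir -p ${SOFTWARE_DIR}"
        $RUN scp "${SOFTWARE_DIR}/securecmdd.zip" "root@${node}:${SOFTWARE_DIR}/"
    done
}

push_securecmdd "${STANDBY_NODES[@]}"
```

The unzip and root_env_init.sh steps shown above still need to be run on each node afterwards.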
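1.2.3.2 Mutual-Trust Configuration
Deploying a KingbaseES cluster requires mutual trust for the kingbase user. This section describes how to configure it.
Log in to the primary node of the cluster as the kingbase user and run:
[kingbase@node1 ~]$ cd /home/kingbase/software/
[kingbase@node1 software]$ sh trust_cluster.sh
The operation log is as follows: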
[kingbase@node1 software]$ sh trust_cluster.sh
[INFO] set password-free only between kingbase
Generating public/private rsa key pair.
Your identification has been saved in /home/kingbase/.ssh/id_rsa.
Your public key has been saved in /home/kingbase/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:dVGMybqu6wqdmUN6W4ba4rLH7RoqBsBeglU0PmxonhE kingbase@node1
The key's randomart image is:
+---[RSA 2048]----+
| Eo+ ..=. |
| .= . +.. |
|o.+ = ... |
|o= = . ... |
|o = . S . |
|.. + = . |
|. .+.O o. |
|....+*.= . |
|..o=++=o+o |
+----[SHA256]-----+
Warning: Permanently added '10.12.11.192' (ECDSA) to the list of known hosts.
known_hosts 100% 345 316.9KB/s 00:00
id_rsa 100% 1675 2.2MB/s 00:00
id_rsa.pub 100% 396 533.7KB/s 00:00
authorized_keys 100% 396 746.2KB/s 00:00
Warning: Permanently added '10.12.11.193' (ECDSA) to the list of known hosts.
kingbase@10.12.11.193's password:
known_hosts 100% 519 359.8KB/s 00:00
id_rsa 100% 1675 2.5MB/s 00:00
id_rsa.pub 100% 396 678.5KB/s 00:00
authorized_keys 100% 396 756.2KB/s 00:00
connect to "10.12.11.192" from current node by 'ssh' kingbase:0..... OK
connect to "10.12.11.193" from "10.12.11.192" by 'ssh' kingbase->kingbase:0 .... OK
connect to "10.12.11.193" from current node by 'ssh' kingbase:0..... OK
connect to "10.12.11.192" from "10.12.11.193" by 'ssh' kingbase->kingbase:0 .... OK
check ssh connection success!
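The trust script reports its own connection checks, but the result can also be confirmed independently later on. The sketch below is one possible check, not part of the KingbaseES tooling: it relies on the OpenSSH `BatchMode` option, which makes ssh fail rather than prompt for a password. `check_trust` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# BatchMode=yes makes ssh fail instead of prompting for a password,
# so a zero exit status means key-based login works for that node.
check_trust() {
    local user=$1 node
    shift
    for node in "$@"; do
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "${user}@${node}" true 2>/dev/null; then
            echo "${node}: OK"
        else
            echo "${node}: FAILED"
        fi
    done
}

# Example (node IPs from the sample install.conf), run as the kingbase user:
# check_trust kingbase 10.12.11.192 10.12.11.193
```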
1.2.3.3 Building the Cluster
Log in to the primary node of the cluster as the kingbase user and run:
[kingbase@node1 software]$ cd /home/kingbase/software
[kingbase@node1 software]$ sh cluster_install.sh
The operation log is as follows:
[kingbase@node1 ~]$ sh cluster_install.sh
[CONFIG_CHECK] will deploy the cluster of DG
[CONFIG_CHECK] file format is correct ... OK
[CONFIG_CHECK] encoding: UTF8 OK
[CONFIG_CHECK] locale: zh_CN.UTF-8 OK
[CONFIG_CHECK] the number of net_device matches the length of all_ip or the number of net_device is 1 ... OK
[CONFIG_CHECK] the number of license_num matches the length of all_ip or the number of license_num is 1 ... OK
[RUNNING] check if the host can be reached from current node and between all nodes by ssh ...
[RUNNING] success connect to "10.12.11.192" from current node by 'ssh' ... OK
[RUNNING] success connect to "10.12.11.192" from "10.12.11.192" by 'ssh' ... OK
[RUNNING] success connect to "10.12.11.193" from "10.12.11.192" by 'ssh' ... OK
[RUNNING] success connect to "10.12.11.193" from current node by 'ssh' ... OK
[RUNNING] success connect to "10.12.11.192" from "10.12.11.193" by 'ssh' ... OK
[RUNNING] success connect to "10.12.11.193" from "10.12.11.193" by 'ssh' ... OK
[RUNNING] chmod /usr/bin/ping ...
[RUNNING] chmod /usr/bin/ping ... Done
[RUNNING] ping access rights OK
[RUNNING] check if the virtual ip "10.12.11.200" already exist ...
[RUNNING] there is no "10.12.11.200" on any host, OK
[RUNNING] check the [net_device_ip] on dev [net_device] ...
[RUNNING] 10.12.11.192 on host "10.12.11.192" on dev "ens192" ..... OK
[RUNNING] 10.12.11.193 on host "10.12.11.193" on dev "ens192" ..... OK
[RUNNING] check the db is running or not...
[RUNNING] the db is not running on "10.12.11.192:54321" ..... OK
[RUNNING] the db is not running on "10.12.11.193:54321" ..... OK
[RUNNING] check the sys_securecmdd is running or not...
[RUNNING] the sys_securecmdd is not running on "10.12.11.192:8890" ..... OK
[RUNNING] the sys_securecmdd is not running on "10.12.11.193:8890" ..... OK
[RUNNING] check if the install dir (create dir and check it's owner/permission) ...
[RUNNING] check if the install dir (create dir and check it's owner/permission) on "10.12.11.192" ... OK
[RUNNING] check if the install dir (create dir and check it's owner/permission) on "10.12.11.193" ... OK
[RUNNING] check if the dir "/opt/cluster/kingbase" is already exist ...
[RUNNING] the dir "/opt/cluster/kingbase" is not exist on "10.12.11.192" ..... OK
[RUNNING] the dir "/opt/cluster/kingbase" is not exist on "10.12.11.193" ..... OK
[RUNNING] check the data directory (create it and check whether it is empty) ...
[RUNNING] when use_exist_data=0, create the empty data directory on "10.12.11.192" ..... OK
[RUNNING] when use_exist_data=0, create the empty data directory on "10.12.11.193" ..... OK
[RUNNING] install without root, skip system check
[INSTALL] create the install dir "/opt/cluster/kingbase" on every host ...
[INSTALL] success to create the install dir "/opt/cluster/kingbase" on "10.12.11.192" ..... OK
[INSTALL] success to create the install dir "/opt/cluster/kingbase" on "10.12.11.193" ..... OK
[INSTALL] success to access the zip_package "/home/kingbase/kingbase-server-V009R001B0514-linux-x86_64.tar" on "10.12.11.192" ..... OK
[INSTALL] decompress the "/home/kingbase/kingbase-server-V009R001B0514-linux-x86_64.tar" to "/opt/cluster/kingbase/__tmp_decompress__"
[INSTALL] success to recreate the tmp dir "/opt/cluster/kingbase/__tmp_decompress__" on "10.12.11.192" ..... OK
[INSTALL] success to decompress the "/home/kingbase/kingbase-server-V009R001B0514-linux-x86_64.tar" to "/opt/cluster/kingbase/__tmp_decompress__" on "10.12.11.192"..... OK
[INSTALL] scp the dir "/opt/cluster/kingbase/__tmp_decompress__" to "/opt/cluster/kingbase" on all host
[INSTALL] try to copy the install dir "/opt/cluster/kingbase" to "10.12.11.192" .....
[INSTALL] success to scp the install dir "/opt/cluster/kingbase" to "10.12.11.192" ..... OK
[INSTALL] try to copy the install dir "/opt/cluster/kingbase" to "10.12.11.193" .....
[INSTALL] success to scp the install dir "/opt/cluster/kingbase" to "10.12.11.193" ..... OK
[INSTALL] remove the dir "/opt/cluster/kingbase/__tmp_decompress__"
[INSTALL] change the auth of bin directory on 10.12.11.192 ...
[INSTALL] change the auth of bin directory on 10.12.11.193 ...
[RUNNING] ip access check ..... OK
[RUNNING] arping access check ..... OK
[INSTALL] check license_file ...
[INSTALL] success to access license_file on 10.12.11.192: /opt/cluster/kingbase/bin/license.dat
[INSTALL] check license_file ...
[INSTALL] success to access license_file on 10.12.11.193: /opt/cluster/kingbase/bin/license.dat
[INSTALL] set the archive_command to "exit 0" and the archive dir is NULL
[INSTALL] the archive dir is NULL, not do archive ...
[INSTALL] create the dir "etc" "log" on all host
[RUNNING] config sys_securecmdd and start it ...
[RUNNING] config the sys_securecmdd port to 8890 ...
[RUNNING] success to config the sys_securecmdd port on 10.12.11.192 ... OK
successfully initialized the sys_securecmdd, please use "/opt/cluster/kingbase/bin/sys_HAscmdd.sh start" to start the sys_securecmdd
[RUNNING] success to config sys_securecmdd on 10.12.11.192 ... OK
[RUNNING] success to start sys_securecmdd on 10.12.11.192 ... OK
[RUNNING] config sys_securecmdd and start it ...
[RUNNING] config the sys_securecmdd port to 8890 ...
[RUNNING] success to config the sys_securecmdd port on 10.12.11.193 ... OK
successfully initialized the sys_securecmdd, please use "/opt/cluster/kingbase/bin/sys_HAscmdd.sh start" to start the sys_securecmdd
[RUNNING] success to config sys_securecmdd on 10.12.11.193 ... OK
[RUNNING] success to start sys_securecmdd on 10.12.11.193 ... OK
[RUNNING] check if the host can be reached between all nodes by scmd ...
[RUNNING] success connect to "10.12.11.192" from "10.12.11.192" by '/opt/cluster/kingbase/bin/sys_securecmd' ... OK
[RUNNING] success connect to "10.12.11.193" from "10.12.11.192" by '/opt/cluster/kingbase/bin/sys_securecmd' ... OK
[RUNNING] success connect to "10.12.11.192" from "10.12.11.193" by '/opt/cluster/kingbase/bin/sys_securecmd' ... OK
[RUNNING] success connect to "10.12.11.193" from "10.12.11.193" by '/opt/cluster/kingbase/bin/sys_securecmd' ... OK
[INSTALL] begin to init the database on "10.12.11.192" ...
The database cluster will be initialized with locales
COLLATE: zh_CN.UTF-8
CTYPE: zh_CN.UTF-8
MESSAGES: C
MONETARY: zh_CN.UTF-8
NUMERIC: zh_CN.UTF-8
TIME: zh_CN.UTF-8
The files belonging to this database system will be owned by user "kingbase".
This user must also own the server process.
The default text search configuration will be set to "simple".
The comparision of strings is case-sensitive.
Data page checksums are enabled.
fixing permissions on existing directory /opt/cluster/kingbase/data ... ok
creating subdirectories ... initdb: could not find suitable text search configuration for locale "zh_CN.UTF-8"
ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... PRC
creating configuration files ... ok
Begin setup encrypt device
initializing the encrypt device ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
create security database ... ok
load security database ... ok
syncing data to disk ... ok
Success. You can now start the database server using:
/opt/cluster/kingbase/bin/sys_ctl -D /opt/cluster/kingbase/data -l logfile start
[INSTALL] end to init the database on "10.12.11.192" ... OK
[INSTALL] wirte the kingbase.conf on "10.12.11.192" ...
[INSTALL] wirte the kingbase.conf on "10.12.11.192" ... OK
[INSTALL] wirte the es_rep.conf on "10.12.11.192" ...
[INSTALL] wirte the es_rep.conf on "10.12.11.192" ... OK
[INSTALL] wirte the sys_hba.conf on "10.12.11.192" ...
[INSTALL] wirte the sys_hba.conf on "10.12.11.192" ... OK
[INSTALL] wirte the .encpwd on every host
[INSTALL] write the repmgr.conf on every host
[INSTALL] write the repmgr.conf on "10.12.11.192" ...
[INSTALL] write the repmgr.conf on "10.12.11.192" ... OK
[INSTALL] write the repmgr.conf on "10.12.11.193" ...
[INSTALL] write the repmgr.conf on "10.12.11.193" ... OK
[INSTALL] start up the database on "10.12.11.192" ...
[INSTALL] /opt/cluster/kingbase/bin/sys_ctl -w -t 60 -l /opt/cluster/kingbase/logfile -D /opt/cluster/kingbase/data start
waiting for server to start.... done
server started
[INSTALL] start up the database on "10.12.11.192" ... OK
[INSTALL] create the database "esrep" and user "esrep" for repmgr ...
CREATE DATABASE
CREATE ROLE
GRANT
GRANT ROLE
[INSTALL] create the database "esrep" and user "esrep" for repmgr ... OK
[INSTALL] register the primary on "10.12.11.192" ...
[INFO] connecting to primary database...
[NOTICE] attempting to install extension "repmgr"
[NOTICE] "repmgr" extension successfully installed
[NOTICE] PING 10.12.11.200 (10.12.11.200) 56(84) bytes of data.
--- 10.12.11.200 ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 1999ms
[WARNING] ping host"10.12.11.200" failed
[DETAIL] average RTT value is not greater than zero
[INFO] loadvip result: 1, arping result: 1
[NOTICE] node (ID: 1) acquire the virtual ip 10.12.11.200 success
[NOTICE] primary node record (ID: 1) registered
[INSTALL] register the primary on "10.12.11.192" ... OK
[INSTALL] clone and start up the standby ...
clone the standby on "10.12.11.193" ...
/opt/cluster/kingbase/bin/repmgr -h 10.12.11.192 -U esrep -d esrep -p 54321 --fast-checkpoint --upstream-node-id 1 standby clone
[NOTICE] destination directory "/opt/cluster/kingbase/data" provided
[INFO] connecting to source node
[DETAIL] connection string is: host=10.12.11.192 user=esrep port=54321 dbname=esrep
[DETAIL] current installation size is 87 MB
[NOTICE] checking for available walsenders on the source node (2 required)
[NOTICE] checking replication connections can be made to the source server (2 required)
[INFO] checking and correcting permissions on existing directory "/opt/cluster/kingbase/data"
[INFO] creating replication slot as user "esrep"
[NOTICE] starting backup (using sys_basebackup)...
[INFO] executing: /opt/cluster/kingbase/bin/sys_basebackup -l "repmgr base backup" -D /opt/cluster/kingbase/data -h 10.12.11.192 -p 54321 -U esrep -c fast -X stream -S repmgr_slot_2
[NOTICE] standby clone (using sys_basebackup) complete
[NOTICE] you can now start your Kingbase server
[HINT] for example: sys_ctl -D /opt/cluster/kingbase/data start
[HINT] after starting the server, you need to register this standby with "repmgr standby register"
clone the standby on "10.12.11.193" ... OK
start up the standby on "10.12.11.193" ...
/opt/cluster/kingbase/bin/sys_ctl -w -t 60 -l /opt/cluster/kingbase/logfile -D /opt/cluster/kingbase/data start
waiting for server to start.... done
server started
start up the standby on "10.12.11.193" ... OK
register the standby on "10.12.11.193" ...
[INFO] connecting to local node "node2" (ID: 2)
[INFO] connecting to primary database
[INFO] standby registration complete
[NOTICE] standby node "node2" (ID: 2) successfully registered
[INSTALL] register the standby on "10.12.11.193" ... OK
[INSTALL] start up the whole cluster ...
2024-05-28 15:19:56 Ready to start all DB ...
2024-05-28 15:19:56 begin to start DB on "[10.12.11.192]".
2024-05-28 15:19:57 DB on "[10.12.11.192]" already started, connect to check it.
2024-05-28 15:19:58 DB on "[10.12.11.192]" start success.
2024-05-28 15:19:58 Try to ping trusted_servers on host 10.12.11.192 ...
2024-05-28 15:20:01 Try to ping trusted_servers on host 10.12.11.193 ...
2024-05-28 15:20:04 begin to start DB on "[10.12.11.193]".
2024-05-28 15:20:05 DB on "[10.12.11.193]" already started, connect to check it.
2024-05-28 15:20:07 DB on "[10.12.11.193]" start success.
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | LSN_Lag | Connection string
----+-------+---------+-----------+----------+----------+----------+----------+---------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
1 | node1 | primary | * running | | default | 100 | 1 | | host=10.12.11.192 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=2 keepalives_interval=2 keepalives_count=3 tcp_user_timeout=9000
2 | node2 | standby | running | node1 | default | 100 | 1 | 0 bytes | host=10.12.11.193 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=2 keepalives_interval=2 keepalives_count=3 tcp_user_timeout=9000
2024-05-28 15:20:07 The primary DB is started.
2024-05-28 15:20:12 Success to load virtual ip [10.12.11.200] on primary host [10.12.11.192].
2024-05-28 15:20:12 Try to ping vip on host 10.12.11.192 ...
2024-05-28 15:20:15 Try to ping vip on host 10.12.11.193 ...
2024-05-28 15:20:18 begin to start repmgrd on "[10.12.11.192]".
[2024-05-28 15:20:19] [NOTICE] using provided configuration file "/opt/cluster/kingbase/bin/../etc/repmgr.conf"
[2024-05-28 15:20:19] [NOTICE] redirecting logging output to "/opt/cluster/kingbase/log/hamgr.log"
2024-05-28 15:20:21 repmgrd on "[10.12.11.192]" start success.
2024-05-28 15:20:21 begin to start repmgrd on "[10.12.11.193]".
[2024-05-28 15:16:22] [NOTICE] using provided configuration file "/opt/cluster/kingbase/bin/../etc/repmgr.conf"
[2024-05-28 15:16:22] [NOTICE] redirecting logging output to "/opt/cluster/kingbase/log/hamgr.log"
2024-05-28 15:20:24 repmgrd on "[10.12.11.193]" start success.
ID | Name | Role | Status | Upstream | repmgrd | PID | Paused? | Upstream last seen
----+-------+---------+-----------+----------+---------+-------+---------+--------------------
1 | node1 | primary | * running | | running | 18615 | no | n/a
2 | node2 | standby | running | node1 | running | 25597 | no | 2 second(s) ago
[2024-05-28 15:20:30] [NOTICE] redirecting logging output to "/opt/cluster/kingbase/log/kbha.log"
[2024-05-28 15:16:37] [NOTICE] redirecting logging output to "/opt/cluster/kingbase/log/kbha.log"
2024-05-28 15:20:39 Done.
[INSTALL] start up the whole cluster ... OK
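Once the installer finishes, the node status table can be checked again at any time. The sketch below counts the nodes whose Status column reads "running". It assumes the table layout shown in the log and that `repmgr cluster show` in the installation's bin directory prints that table; depending on the environment it may also need `-f` with the path to repmgr.conf. `count_running_nodes` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Count rows whose Status column (4th "|"-separated field) contains "running".
count_running_nodes() {
    awk -F'|' '$4 ~ /running/ { n++ } END { print n + 0 }'
}

# Example with two rows in the same layout as the status table in the log above.
count_running_nodes <<'EOF'
 ID | Name  | Role    | Status    | Upstream | Location
----+-------+---------+-----------+----------+----------
 1  | node1 | primary | * running |          | default
 2  | node2 | standby |   running | node1    | default
EOF

# Against a live cluster (path taken from the log; treat it as an assumption):
# /opt/cluster/kingbase/bin/repmgr cluster show | count_running_nodes
```

For the one-primary, one-standby sample, a result of 2 means both nodes report as running.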