一、相关环境
数据库版本:V008R006C008B0014
单机 | IP | data目录 | ||
node1 | 192.168.40.111 | /dbdata/data/data_v8 | ||
集群 | ||||
node1 | 192.168.40.111 | /dbdata/data/data | ||
node2 | 192.168.40.112 | /dbdata/data/data | ||
VIP | 192.168.40.66 |
二.原单机升级准备
1.查看数据信息
[kingbase@node1 bin]$ ./ksql test system
输入 "help" 来获取帮助信息.
test=# \dt
关联列表
架构模式 | 名称 | 类型 | 拥有者
----------+------+--------+--------
public | ckj | 数据表 | system
(1 行记录)
test=# select * from ckj;
id
----
1
2
3
(3 行记录)
test=#
2.正常关闭数据库服务
[kingbase@node1 bin]$ ./sys_ctl stop -D /dbdata/data/data_v8/
等待服务器进程关闭 .... 完成
服务器进程已经关闭
[kingbase@node1 bin]$
三、集群部署
略。。。。
注:默认安装(图形化)时,无法执行数据data目录,可以登集群部署完成后,修改repmgr.conf中data_directory参数。
四、数据迁移
1.正常停止集群
[kingbase@node1 bin]$ ./sys_monitor.sh stop
2023-12-26 16:51:09 Ready to stop all DB ...
2023-12-26 16:51:16 begin to stop repmgrd on "[192.168.40.111]".
2023-12-26 16:51:18 repmgrd on "[192.168.40.111]" stop success.
2023-12-26 16:51:18 begin to stop repmgrd on "[192.168.40.112]".
2023-12-26 16:51:19 repmgrd on "[192.168.40.112]" stop success.
2023-12-26 16:51:19 begin to stop DB on "[192.168.40.112]".
等待服务器进程关闭 .... 完成
服务器进程已经关闭
2023-12-26 16:51:20 DB on "[192.168.40.112]" stop success.
2023-12-26 16:51:20 begin to stop DB on "[192.168.40.111]".
等待服务器进程关闭 .... 完成
服务器进程已经关闭
2023-12-26 16:51:21 DB on "[192.168.40.111]" stop success.
2023-12-26 16:51:21 Done.
[kingbase@node1 bin]$
2.备份集群data目录(主备)
[kingbase@node1 data]$ mv data data_bak
[kingbase@node1 data]$ pwd
/dbdata/data
[kingbase@node1 data]$
3.将单机数据库data目录拷贝到集群的主备节点
[kingbase@node1 data]$ scp -r data_v8 192.168.40.111:/dbdata/data/
[kingbase@node1 data]$ scp -r data_v8 192.168.40.112:/dbdata/data/
[kingbase@node1 data]$ cp -r data_v8 data
[kingbase@node2 data]$ cp -r data_v8 data
4.备节点创建standby.signal文件
[kingbase@node2 data]$ pwd
/dbdata/data/data
[kingbase@node2 data]$ touch standby.signal
[kingbase@node2 data]$
5.所有节点创建kingbase.auto.conf、kingbase.conf、es_rep.conf等文件
[kingbase@node1 data]$ cp kingbase.auto.conf kingbase.auto.conf_bak
[kingbase@node1 data]$ cp ../data_bak/kingbase.auto.conf ./
[kingbase@node1 data]$ cp ../data_bak/kingbase.conf ./
[kingbase@node1 data]$ cp ../data_bak/es_rep.conf ./
[kingbase@node1 data]$ cp ../data_bak/sys_hba.conf ./
[kingbase@node1 data]$ cp ../data_bak/sys_ident.conf ./
五、创建流复制
1.启动主库,创建复制槽
[kingbase@node1 bin]$ ./sys_ctl start -D /dbdata/data/data
等待服务器进程启动 ....2023-12-26 17:06:51.371 CST [182628] LOG: sepapower extension initialized
2023-12-26 17:06:51.374 CST [182628] LOG: starting KingbaseES V008R006C008B0014 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28), 64-bit
2023-12-26 17:06:51.374 CST [182628] LOG: listening on IPv4 address "0.0.0.0", port 54321
2023-12-26 17:06:51.375 CST [182628] LOG: listening on IPv6 address "::", port 54321
2023-12-26 17:06:51.376 CST [182628] LOG: listening on Unix socket "/tmp/.s.KINGBASE.54321"
2023-12-26 17:06:51.457 CST [182628] LOG: redirecting log output to logging collector process
2023-12-26 17:06:51.457 CST [182628] HINT: Future log output will appear in directory "sys_log".
完成
服务器进程已经启动
[kingbase@node1 bin]$ ./ksql test system
输入 "help" 来获取帮助信息.
test=# select sys_create_physical_replication_slot('repmgr_slot_2');
sys_create_physical_replication_slot
--------------------------------------
(repmgr_slot_2,)
(1 行记录)
test=# select * from sys_replication_slots ;
slot_name | plugin | slot_type | datoid | database | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn
---------------+--------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+---------------------
repmgr_slot_2 | | physical | | | f | f | | | | |
(1 行记录)
test=#
2.启动备库,并且查看流复制信息
[kingbase@node2 bin]$ ./sys_ctl start -D /dbdata/data/data
等待服务器进程启动 ....2023-12-26 17:09:12.136 CST [117137] LOG: sepapower extension initialized
2023-12-26 17:09:12.139 CST [117137] LOG: starting KingbaseES V008R006C008B0014 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28), 64-bit
2023-12-26 17:09:12.139 CST [117137] LOG: listening on IPv4 address "0.0.0.0", port 54321
2023-12-26 17:09:12.139 CST [117137] LOG: listening on IPv6 address "::", port 54321
2023-12-26 17:09:12.147 CST [117137] LOG: listening on Unix socket "/tmp/.s.KINGBASE.54321"
2023-12-26 17:09:12.227 CST [117137] LOG: redirecting log output to logging collector process
2023-12-26 17:09:12.227 CST [117137] HINT: Future log output will appear in directory "sys_log".
完成
服务器进程已经启动
注:启动后备库日志有如下报错
2023-12-26 17:13:14.255 CST,,,117254,,658a992a.1ca06,1,,2023-12-26 17:13:14 CST,,0,FATAL,XX000,"could not connect to the primary server: FATAL: password authentication failed for user ""esrep""
password retrieved from file ""/home/kingbase/.encpwd""",,,,,,,,,""
可以在主库手动创建esrep用户,创建时候因为同步为sync,所以会卡主,直接crtl+c中断,用户也可以创建成功
test=# create user esrep with superuser password 'Kingbaseha110';
^CCancel request sent
WARNING: canceling wait for synchronous replication due to user request
描述: The transaction has already committed locally, but might not have been replicated to the standby.
CREATE ROLE
此时查看主备两端是否都存在esrep用户,且流复制是否正常
test=# select * from sys_stat_replication;
-[ RECORD 1 ]----+------------------------------
pid | 182738
usesysid | 16387
usename | esrep
application_name | node2
client_addr | 192.168.40.112
client_hostname |
client_port | 45332
backend_start | 2023-12-26 17:13:42.866469+08
backend_xmin |
state | streaming
sent_lsn |
write_lsn | 0/20008F0
flush_lsn | 0/20008F0
replay_lsn | 0/20008F0
write_lag |
flush_lag |
replay_lag |
sync_priority | 1
sync_state | quorum
reply_time | 2023-12-26 17:20:41.875443+08
六、配置repmgr集群管理
1.创建esrep库(主库)
test=# create database esrep;
CREATE DATABASE
test=#
2.创建repmgr extension
#确认repmgr组件已添加
[kingbase@node1 bin]$ cat /dbdata/data/data/kingbase.conf |grep repmgr
shared_preload_libraries = 'repmgr,liboracle_parser, synonym, plsql, force_view, kdb_flashback,plugin_debugger, plsql_plugin_debugger, plsql_plprofiler, kdb_ora_expr, sepapower, dblink, sys_kwr, sys_spacequota, sys_stat_statements, backtrace, kdb_utils_function, auto_bmr, sys_squeeze, src_restrict'
#创建repmgr插件
test=# create extension repmgr;
CREATE EXTENSION
test=#
3.注册主备库到repmgr集群
3.1 注册primary节点
[kingbase@node1 bin]$ ./repmgr primary register --force
[INFO] connecting to primary database...
[NOTICE] attempting to install extension "repmgr"
[NOTICE] "repmgr" extension successfully installed
[NOTICE] PING 192.168.40.66 (192.168.40.66) 56(84) bytes of data.
--- 192.168.40.66 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 1020ms
[WARNING] ping host"192.168.40.66" failed
[DETAIL] average RTT value is not greater than zero
[INFO] loadvip result: 1, arping result: 1
[NOTICE] node (ID: 1) acquire the virtual ip 192.168.40.66/24 success
[NOTICE] primary node record (ID: 1) registered
[kingbase@node1 bin]$
[kingbase@node1 bin]$ ./repmgr cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | LSN_Lag | Connection string
----+-------+---------+-----------+----------+----------+----------+----------+---------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1 | node1 | primary | * running | | default | 100 | 1 | | host=192.168.40.111 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3 tcp_user_timeout=9000
[kingbase@node1 bin]$
3.2 注册standby节点
[kingbase@node2 bin]$ ./repmgr standby register --force
[INFO] connecting to local node "node2" (ID: 2)
[INFO] connecting to primary database
[WARNING] --upstream-node-id not supplied, assuming upstream node is primary (node ID: 1)
[INFO] standby registration complete
[NOTICE] standby node "node2" (ID: 2) successfully registered
[kingbase@node2 bin]$
3.3 查看集群状态
[kingbase@node1 bin]$ ./repmgr cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | LSN_Lag | Connection string
----+-------+---------+-----------+----------+----------+----------+----------+---------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1 | node1 | primary | * running | | default | 100 | 1 | | host=192.168.40.111 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3 tcp_user_timeout=9000
2 | node2 | standby | running | node1 | default | 100 | 1 | 0 bytes | host=192.168.40.112 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3 tcp_user_timeout=9000
[kingbase@node1 bin]$
七、验证集群以及数据
1.重启集群
[kingbase@node1 bin]$ ./sys_monitor.sh restart
2023-12-26 17:31:47 Ready to stop all DB ...
2023-12-26 17:31:53 begin to stop repmgrd on "[192.168.40.111]".
2023-12-26 17:31:54 repmgrd on "[192.168.40.111]" already stopped.
2023-12-26 17:31:54 begin to stop repmgrd on "[192.168.40.112]".
2023-12-26 17:31:55 repmgrd on "[192.168.40.112]" already stopped.
2023-12-26 17:31:55 begin to stop DB on "[192.168.40.112]".
......
2023-12-26 17:32:31 repmgrd on "[192.168.40.112]" start success.
ID | Name | Role | Status | Upstream | repmgrd | PID | Paused? | Upstream last seen
----+-------+---------+-----------+----------+---------+--------+---------+--------------------
1 | node1 | primary | * running | | running | 185133 | no | n/a
2 | node2 | standby | running | node1 | running | 118947 | no | 1 second(s) ago
[2023-12-26 17:32:35] [NOTICE] redirecting logging output to "/home/kingbase/cluster/TMR/KingbaseES/kingbase/log/kbha.log"
[2023-12-26 17:33:06] [NOTICE] redirecting logging output to "/home/kingbase/cluster/TMR/KingbaseES/kingbase/log/kbha.log"
2023-12-26 17:32:40 Done.
2.查看repmgrd进程状态(所有节点)
[kingbase@node1 bin]$ ps aux | grep repmgr
kingbase 185133 0.1 0.1 44368 4540 ? S 17:32 0:00 /home/kingbase/cluster/TMR/KingbaseES/kingbase/bin/repmgrd -d -v -f /home/kingbase/cluster/TMR/KingbaseES/kingbase/bin/../etc/repmgr.conf
kingbase 185610 0.1 0.2 117960 6412 ? Sl 17:32 0:00 /home/kingbase/cluster/TMR/KingbaseES/kingbase/bin/kbha -A daemon -f /home/kingbase/cluster/TMR/KingbaseES/kingbase/bin/../etc/repmgr.conf
kingbase 185969 0.0 0.0 213152 888 pts/1 S+ 17:33 0:00 grep repmgr
[kingbase@node1 bin]$
3.验证数据
[kingbase@node1 bin]$ ./ksql test system
用户 system 的口令:
输入 "help" 来获取帮助信息.
test=# \dt
关联列表
架构模式 | 名称 | 类型 | 拥有者
----------+------+--------+--------
public | ckj | 数据表 | system
(1 行记录)
test=# select * from ckj;
id
----
1
2
3
(3 行记录)
test=#