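Case description:
Figure: KingbaseES server process structure.
KingbaseES uses a client/server model. When the KingbaseES main process accepts a client connection, it creates a new server process for that connection. This server process (backend process) handles the database requests of that client and exits when the connection is closed. For example, a client that connects and runs a query is served by its own kingbase server process.
The corresponding server process (backend process) is visible at the operating system level.
When a client finishes its work and disconnects normally, the corresponding kingbase server process also ends. When a client exits abnormally, however, the kingbase server process on the database side may not end and keeps holding database resources. This case describes how to terminate such a backend process manually. A backend process can be terminated either with database tools or with the operating system kill command, but the methods differ in their impact on the database.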
Applicable versions:
KingbaseES V8R3/R6
System architecture:
I. Client access
[kingbase@node1 bin]$ ./ksql -h 192.168.8.201 -U system -W prod
Password:
ksql (V8.0)
Type "help" for help.
prod=# select count(*) from t1;
 count
--------
 100000
(1 row)
II. Methods for terminating a backend process
1. Using pg_terminate_backend(pid)
Tips: The function pg_terminate_backend() sends a SIGTERM signal to the process.
# Query the state of the backend process
prod=# select * from sys_stat_activity where client_addr='192.168.8.200';
 datid | datname |  pid  | usesysid | usename | application_name |  client_addr  | client_hostname | client_port |         backend_start         | xact_start |          query_start          |         state_change          | wait_event_type | wait_event | state | backend_xid | backend_xmin |          query           |  backend_type
-------+---------+-------+----------+---------+------------------+---------------+-----------------+-------------+-------------------------------+------------+-------------------------------+-------------------------------+-----------------+------------+-------+-------------+--------------+--------------------------+----------------
 16385 | prod    | 17100 |       10 | system  | kingbase_*&+_    | 192.168.8.200 |                 |       57476 | 2022-11-29 15:04:57.131987+08 |            | 2022-11-29 15:05:05.526379+08 | 2022-11-29 15:05:05.539018+08 | Client          | ClientRead | idle  |             |              | select count(*) from t1; | client backend
(1 row)
# Terminate the backend process by its pid
prod=# select pg_terminate_backend(17100);
 pg_terminate_backend
----------------------
 t
(1 row)
# The process has been terminated
prod=# select * from sys_stat_activity where client_addr='192.168.8.200';
 datid | datname | pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | xact_start | query_start | state_change | wait_event_type | wait_event | state | backend_xid | backend_xmin | query | backend_type
-------+---------+-----+----------+---------+------------------+-------------+-----------------+-------------+---------------+------------+-------------+--------------+-----------------+------------+-------+-------------+--------------+-------+--------------
(0 rows)
# As shown below, the database processes are normal and the corresponding backend process was terminated safely
[root@node2 sys_log]# ps -ef |grep kingbase
kingbase 13089     1  0 14:39 ?        00:00:00 /opt/Kingbase/ES/V8R6_054/KESRealPro/V008R006C005B0054/Server/bin/kingbase -D /db/kingbase/v8r6_054/data
kingbase 13090 13089  0 14:39 ?        00:00:00 kingbase: logger
kingbase 16474 13089  0 14:59 ?        00:00:00 kingbase: checkpointer
kingbase 16475 13089  0 14:59 ?        00:00:00 kingbase: background writer
kingbase 16476 13089  0 14:59 ?        00:00:00 kingbase: walwriter
kingbase 16477 13089  0 14:59 ?        00:00:00 kingbase: autovacuum launcher
kingbase 16478 13089  0 14:59 ?        00:00:00 kingbase: stats collector
kingbase 16479 13089  0 14:59 ?        00:00:00 kingbase: ksh writer
kingbase 16480 13089  0 14:59 ?        00:00:00 kingbase: ksh collector
kingbase 16481 13089  0 14:59 ?        00:00:00 kingbase: kwr collector
kingbase 16482 13089  0 14:59 ?        00:00:00 kingbase: logical replication launcher
kingbase 17100 13089  0 15:04 ?        00:00:00 kingbase: system prod 192.168.8.200(57476) idle
kingbase 17212 12416  0 15:05 pts/0    00:00:00 ./ksql -U system test
kingbase 17214 13089  0 15:05 ?        00:00:00 kingbase: system prod [local] idle
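The lookup in sys_stat_activity and the call to pg_terminate_backend() shown in this subsection can be combined into one statement. The following is a minimal sketch rather than a vendor-recommended procedure: the helper name build_terminate_sql is hypothetical, the filter on state = 'idle' is a cautious choice made for this example, and it assumes that pg_backend_pid() is available in your KingbaseES version and that ksql accepts a -c option for passing a statement.

```shell
#!/bin/sh
# Hypothetical helper: print one SQL statement that terminates every idle
# backend connected from a given client address.
# $1: client IP address; only hex digits, dots and colons are accepted so
#     that the value can be embedded in the SQL text safely.
build_terminate_sql() {
  addr=$1
  case $addr in
    ''|*[!0-9A-Fa-f.:]*)
      echo "invalid address: $addr" >&2
      return 1
      ;;
  esac
  # pid <> pg_backend_pid() keeps the session that runs the statement alive
  # (assumes pg_backend_pid() exists in this KingbaseES version).
  printf "select pid, pg_terminate_backend(pid) from sys_stat_activity where client_addr = '%s' and state = 'idle' and pid <> pg_backend_pid();\n" "$addr"
}

# Print the statement for the client address used in this article.
build_terminate_sql 192.168.8.200

# Assumed invocation (the -c option is an assumption, not taken from the article):
#   ./ksql -h 192.168.8.201 -U system -W prod -c "$(build_terminate_sql 192.168.8.200)"
```

Printing the statement first makes it possible to review which sessions would be affected before running it against the database.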
2. Using the operating system kill pid command
# Query the state of the backend process
prod=# select * from sys_stat_activity where client_addr='192.168.8.200';
 datid | datname |  pid  | usesysid | usename | application_name |  client_addr  | client_hostname | client_port |         backend_start         | xact_start |          query_start          |         state_change          | wait_event_type | wait_event | state | backend_xid | backend_xmin |          query           |  backend_type
-------+---------+-------+----------+---------+------------------+---------------+-----------------+-------------+-------------------------------+------------+-------------------------------+-------------------------------+-----------------+------------+-------+-------------+--------------+--------------------------+----------------
 16385 | prod    | 18424 |       10 | system  | kingbase_*&+_    | 192.168.8.200 |                 |       57480 | 2022-11-29 15:15:26.813035+08 |            | 2022-11-29 15:15:28.912910+08 | 2022-11-29 15:15:28.922719+08 | Client          | ClientRead | idle  |             |              | select count(*) from t1; | client backend
(1 row)
# Run kill pid at the operating system level to end the process
[root@node2 sys_log]# kill 18424
# As shown below, the database processes are normal and the corresponding backend process was killed safely
[root@node2 sys_log]# ps -ef |grep kingbase
kingbase 13089     1  0 14:39 ?        00:00:00 /opt/Kingbase/ES/V8R6_054/KESRealPro/V008R006C005B0054/Server/bin/kingbase -D /db/kingbase/v8r6_054/data
kingbase 13090 13089  0 14:39 ?        00:00:00 kingbase: logger
kingbase 16474 13089  0 14:59 ?        00:00:00 kingbase: checkpointer
kingbase 16475 13089  0 14:59 ?        00:00:00 kingbase: background writer
kingbase 16476 13089  0 14:59 ?        00:00:00 kingbase: walwriter
kingbase 16477 13089  0 14:59 ?        00:00:00 kingbase: autovacuum launcher
kingbase 16478 13089  0 14:59 ?        00:00:00 kingbase: stats collector
kingbase 16479 13089  0 14:59 ?        00:00:00 kingbase: ksh writer
kingbase 16480 13089  0 14:59 ?        00:00:00 kingbase: ksh collector
kingbase 16481 13089  0 14:59 ?        00:00:00 kingbase: kwr collector
kingbase 16482 13089  0 14:59 ?        00:00:00 kingbase: logical replication launcher
kingbase 17212 12416  0 15:05 pts/0    00:00:00 ./ksql -U system test
kingbase 17214 13089  0 15:05 ?        00:00:00 kingbase: system prod [local] idle
# The database view no longer contains a record for this backend process
prod=# select * from sys_stat_activity where client_addr='192.168.8.200';
 datid | datname | pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | xact_start | query_start | state_change | wait_event_type | wait_event | state | backend_xid | backend_xmin | query | backend_type
-------+---------+-----+----------+---------+------------------+-------------+-----------------+-------------+---------------+------------+-------------+--------------+-----------------+------------+-------+-------------+--------------+-------+--------------
(0 rows)
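A plain kill pid sends SIGTERM by default, the same signal that the tip in subsection 1 attributes to pg_terminate_backend(). SIGTERM can be caught, so the receiving process gets a chance to clean up before it exits. The sketch below illustrates this with a stand-in worker written in shell; it is not a KingbaseES backend, and the names log and worker are made up for the example.

```shell
#!/bin/sh
# Illustrative only: a stand-in worker, not a real KingbaseES backend.
log=$(mktemp)

(
  # On SIGTERM: record the cleanup, then exit with status 0.
  trap 'echo "cleanup done" >> "$log"; exit 0' TERM
  echo "ready" >> "$log"
  while :; do sleep 1; done
) &
worker=$!

# Wait until the worker has installed its handler before signalling it.
while ! grep -q ready "$log"; do sleep 1; done

# Same signal as plain `kill <pid>` or `kill -15 <pid>`.
kill -TERM "$worker"
wait "$worker"
status=$?

result=$(cat "$log")
rm -f "$log"
echo "worker exit status: $status"
echo "$result"
```

Because the handler runs, the worker exits with status 0 and leaves its cleanup line in the log. SIGKILL, by contrast, cannot be caught, so no handler would run.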
3. Using operating system kill -15 pid and database sys_ctl kill TERM pid
1) Operating system kill -15 pid
# Query the state of the backend process
test=# select * from sys_stat_activity where client_addr='192.168.8.200';
 datid | datname |  pid  | usesysid | usename | application_name |  client_addr  | client_hostname | client_port |         backend_start         | xact_start |          query_start          |         state_change          | wait_event_type | wait_event | state | backend_xid | backend_xmin |          query           |  backend_type
-------+---------+-------+----------+---------+------------------+---------------+-----------------+-------------+-------------------------------+------------+-------------------------------+-------------------------------+-----------------+------------+-------+-------------+--------------+--------------------------+----------------
 16385 | prod    | 22955 |       10 | system  | kingbase_*&+_    | 192.168.8.200 |                 |       57498 | 2022-11-29 15:44:29.993305+08 |            | 2022-11-29 15:44:32.090913+08 | 2022-11-29 15:44:32.100617+08 | Client          | ClientRead | idle  |             |              | select count(*) from t1; | client backend
(1 row)
# Run kill -15 pid at the operating system level to end the process
[kingbase@node2 bin]$ kill -15 22955
# As shown below, the database processes are normal and the corresponding backend process was killed safely
[kingbase@node2 bin]$ ps -ef |grep kingbase
kingbase 13089     1  0 14:39 ?        00:00:00 /opt/Kingbase/ES/V8R6_054/KESRealPro/V008R006C005B0054/Server/bin/kingbase -D /db/kingbase/v8r6_054/data
kingbase 22197 13089  0 15:38 ?        00:00:00 kingbase: checkpointer
kingbase 22198 13089  0 15:38 ?        00:00:00 kingbase: background writer
kingbase 22199 13089  0 15:38 ?        00:00:00 kingbase: walwriter
kingbase 22200 13089  0 15:38 ?        00:00:00 kingbase: autovacuum launcher
kingbase 22201 13089  0 15:38 ?        00:00:00 kingbase: stats collector
kingbase 22202 13089  0 15:38 ?        00:00:00 kingbase: ksh writer
kingbase 22203 13089  0 15:38 ?        00:00:00 kingbase: ksh collector
kingbase 22204 13089  0 15:38 ?        00:00:00 kingbase: kwr collector
kingbase 22205 13089  0 15:38 ?        00:00:00 kingbase: logical replication launcher
kingbase 22444 12416  0 15:40 pts/0    00:00:00 ./ksql -U system test
kingbase 22445 13089  0 15:40 ?        00:00:00 kingbase: system test [local] idle
# The database view no longer contains a record for this backend process
test=# select * from sys_stat_activity where client_addr='192.168.8.200';
 datid | datname | pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | xact_start | query_start | state_change | wait_event_type | wait_event | state | backend_xid | backend_xmin | query | backend_type
-------+---------+-----+----------+---------+------------------+-------------+-----------------+-------------+---------------+------------+-------------+--------------+-----------------+------------+-------+-------------+--------------+-------+--------------
(0 rows)
2) Database sys_ctl kill TERM pid
# Query the state of the backend process
test=# select * from sys_stat_activity where client_addr='192.168.8.200';
 datid | datname |  pid  | usesysid | usename | application_name |  client_addr  | client_hostname | client_port |         backend_start         | xact_start |          query_start          |         state_change          | wait_event_type | wait_event | state | backend_xid | backend_xmin |          query           |  backend_type
-------+---------+-------+----------+---------+------------------+---------------+-----------------+-------------+-------------------------------+------------+-------------------------------+-------------------------------+-----------------+------------+-------+-------------+--------------+--------------------------+----------------
 16385 | prod    | 22443 |       10 | system  | kingbase_*&+_    | 192.168.8.200 |                 |       57494 | 2022-11-29 15:40:42.804868+08 |            | 2022-11-29 15:40:44.972533+08 | 2022-11-29 15:40:44.985340+08 | Client          | ClientRead | idle  |             |              | select count(*) from t1; | client backend
(1 row)
# Kill the process with the database command
[kingbase@node2 bin]$ ./sys_ctl kill TERM 22443
# As shown below, the database processes are normal and the corresponding backend process was killed safely
[kingbase@node2 bin]$ ps -ef |grep kingbase
kingbase 13089     1  0 14:39 ?        00:00:00 /opt/Kingbase/ES/V8R6_054/KESRealPro/V008R006C005B0054/Server/bin/kingbase -D /db/kingbase/v8r6_054/data
kingbase 22197 13089  0 15:38 ?        00:00:00 kingbase: checkpointer
kingbase 22198 13089  0 15:38 ?        00:00:00 kingbase: background writer
kingbase 22199 13089  0 15:38 ?        00:00:00 kingbase: walwriter
kingbase 22200 13089  0 15:38 ?        00:00:00 kingbase: autovacuum launcher
kingbase 22201 13089  0 15:38 ?        00:00:00 kingbase: stats collector
kingbase 22202 13089  0 15:38 ?        00:00:00 kingbase: ksh writer
kingbase 22203 13089  0 15:38 ?        00:00:00 kingbase: ksh collector
kingbase 22204 13089  0 15:38 ?        00:00:00 kingbase: kwr collector
kingbase 22205 13089  0 15:38 ?        00:00:00 kingbase: logical replication launcher
kingbase 22444 12416  0 15:40 pts/0    00:00:00 ./ksql -U system test
kingbase 22445 13089  0 15:40 ?        00:00:00 kingbase: system test [local] idle
# The database view no longer contains a record for this backend process
test=# select * from sys_stat_activity where client_addr='192.168.8.200';
 datid | datname | pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | xact_start | query_start | state_change | wait_event_type | wait_event | state | backend_xid | backend_xmin | query | backend_type
-------+---------+-----+----------+---------+------------------+-------------+-----------------+-------------+---------------+------------+-------------+--------------+-----------------+------------+-------+-------------+--------------+-------+--------------
(0 rows)
4. Using operating system kill -3 pid and database sys_ctl kill QUIT pid
1) Operating system kill -3 pid
# Query the state of the backend process
prod=# select * from sys_stat_activity where client_addr='192.168.8.200';
 datid | datname |  pid  | usesysid | usename | application_name |  client_addr  | client_hostname | client_port |         backend_start         | xact_start |          query_start          |         state_change          | wait_event_type | wait_event | state | backend_xid | backend_xmin |                                query                                |  backend_type
-------+---------+-------+----------+---------+------------------+---------------+-----------------+-------------+-------------------------------+------------+-------------------------------+-------------------------------+-----------------+------------+-------+-------------+--------------+---------------------------------------------------------------------+----------------
 16385 | prod    | 18666 |       10 | system  | kingbase_*&+_    | 192.168.8.200 |                 |       57486 | 2022-11-29 15:17:34.726155+08 |            | 2022-11-29 15:17:34.736202+08 | 2022-11-29 15:17:34.740584+08 | Client          | ClientRead | idle  |             |              | select setting from pg_settings where name='enable_upper_colname'   | client backend
(1 row)
# Run kill -3 pid at the operating system level to end the process
[root@node2 sys_log]# kill -3 18666
# As shown below, in addition to the corresponding backend process, the other background auxiliary processes were also terminated and restarted
[root@node2 sys_log]# ps -ef |grep kingbase
kingbase 13089     1  0 14:39 ?        00:00:00 /opt/Kingbase/ES/V8R6_054/KESRealPro/V008R006C005B0054/Server/bin/kingbase -D /db/kingbase/v8r6_054/data
kingbase 13090 13089  0 14:39 ?        00:00:00 kingbase: logger
kingbase 17212 12416  0 15:05 pts/0    00:00:00 ./ksql -U system test
kingbase 18795 13089  0 15:18 ?        00:00:00 kingbase: startup
# Check the corresponding sys_log (terminating the backend process also causes the database auxiliary processes to restart)
prod=# select * from sys_stat_activity where client_addr='192.168.8.200';
WARNING:  terminating connection because of crash of another server process
DETAIL:  The kingbase has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and repeat your command.
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
2022-11-29 15:18:06.741 CST [18666] WARNING:  terminating connection because of crash of another server process
2022-11-29 15:18:06.741 CST [18666] DETAIL:  The kingbase has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2022-11-29 15:18:06.741 CST [18666] HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2022-11-29 15:18:06.742 CST [13089] LOG:  server process (PID 18666) exited with exit code 2
2022-11-29 15:18:06.742 CST [13089] DETAIL:  Failed process was running: select setting from pg_settings where name='enable_upper_colname'
2022-11-29 15:18:06.742 CST [13089] LOG:  terminating any other active server processes
2022-11-29 15:18:06.743 CST [17214] WARNING:  terminating connection because of crash of another server process
2022-11-29 15:18:06.743 CST [17214] DETAIL:  The kingbase has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2022-11-29 15:18:06.743 CST [17214] HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2022-11-29 15:18:06.744 CST [16477] WARNING:  terminating connection because of crash of another server process
2022-11-29 15:18:06.744 CST [16477] DETAIL:  The kingbase has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2022-11-29 15:18:06.744 CST [16477] HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2022-11-29 15:18:06.745 CST [13089] LOG:  all server processes terminated; reinitializing
2022-11-29 15:18:06.823 CST [18795] LOG:  database system was interrupted; last known up at 2022-11-29 14:59:17 CST
2022-11-29 15:19:29.897 CST [18795] LOG:  database system was not properly shut down; automatic recovery in progress
2022-11-29 15:19:30.063 CST [18795] LOG:  redo starts at 0/22935D8
2022-11-29 15:19:30.063 CST [18795] LOG:  redo wal segment count 2
2022-11-29 15:19:30.063 CST [18795] LOG:  invalid record length at 0/2293608: wanted 24, got 0
2022-11-29 15:19:30.063 CST [18795] LOG:  redo done at 0/22935D8
2022-11-29 15:19:30.739 CST [13089] LOG:  database system is ready to accept connections
# After the database service has restarted
[root@node2 sys_log]# ps -ef |grep kingbase
kingbase 13089     1  0 14:39 ?        00:00:00 /opt/Kingbase/ES/V8R6_054/KESRealPro/V008R006C005B0054/Server/bin/kingbase -D /db/kingbase/v8r6_054/data
kingbase 13090 13089  0 14:39 ?        00:00:00 kingbase: logger
kingbase 17212 12416  0 15:05 pts/0    00:00:00 ./ksql -U system test
kingbase 18953 13089  0 15:19 ?        00:00:00 kingbase: checkpointer
kingbase 18954 13089  0 15:19 ?        00:00:00 kingbase: background writer
kingbase 18955 13089  0 15:19 ?        00:00:00 kingbase: walwriter
kingbase 18957 13089  0 15:19 ?        00:00:00 kingbase: stats collector
kingbase 18958 13089  0 15:19 ?        00:00:00 kingbase: ksh writer
kingbase 18959 13089  0 15:19 ?        00:00:00 kingbase: ksh collector
kingbase 18960 13089  0 15:19 ?        00:00:00 kingbase: kwr collector
--- As shown above, using kill -3 pid to terminate a backend process puts the database at great risk.
2) Database sys_ctl kill QUIT pid
# Query the state of the backend process
test=# select * from sys_stat_activity where client_addr='192.168.8.200';
 datid | datname |  pid  | usesysid | usename | application_name |  client_addr  | client_hostname | client_port |         backend_start         | xact_start |          query_start          |         state_change          | wait_event_type | wait_event | state | backend_xid | backend_xmin |          query           |  backend_type
-------+---------+-------+----------+---------+------------------+---------------+-----------------+-------------+-------------------------------+------------+-------------------------------+-------------------------------+-----------------+------------+-------+-------------+--------------+--------------------------+----------------
 16385 | prod    | 21894 |       10 | system  | kingbase_*&+_    | 192.168.8.200 |                 |       57490 | 2022-11-29 15:36:10.034020+08 |            | 2022-11-29 15:36:13.902728+08 | 2022-11-29 15:36:13.917841+08 | Client          | ClientRead | idle  |             |              | select count(*) from t1; | client backend
(1 row)
# Run the database command sys_ctl kill QUIT to terminate the backend process
[kingbase@node2 bin]$ ./sys_ctl kill QUIT 21894
# As shown below, in addition to the corresponding backend process, the other background auxiliary processes were also terminated and restarted
[kingbase@node2 bin]$ ps -ef |grep kingbase
kingbase 13089     1  0 14:39 ?        00:00:00 /opt/Kingbase/ES/V8R6_054/KESRealPro/V008R006C005B0054/Server/bin/kingbase -D /db/kingbase/v8r6_054/data
kingbase 13090 13089  0 14:39 ?        00:00:00 kingbase: logger
kingbase 21895 12416  0 15:36 pts/0    00:00:00 ./ksql -U system test
kingbase 22071 13089  0 15:37 ?        00:00:00 kingbase: startup
# Check the corresponding sys_log (terminating the backend process also causes the database auxiliary processes to restart)
test=# select * from sys_stat_activity where client_addr='192.168.8.200';
WARNING:  terminating connection because of crash of another server process
DETAIL:  The kingbase has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and repeat your command.
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
2022-11-29 15:37:13.828 CST [21894] WARNING:  terminating connection because of crash of another server process
2022-11-29 15:37:13.828 CST [21894] DETAIL:  The kingbase has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2022-11-29 15:37:13.828 CST [21894] HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2022-11-29 15:37:13.829 CST [13089] LOG:  server process (PID 21894) exited with exit code 2
2022-11-29 15:37:13.829 CST [13089] DETAIL:  Failed process was running: select count(*) from t1;
2022-11-29 15:37:13.829 CST [13089] LOG:  terminating any other active server processes
2022-11-29 15:37:13.830 CST [21896] WARNING:  terminating connection because of crash of another server process
2022-11-29 15:37:13.830 CST [21896] DETAIL:  The kingbase has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2022-11-29 15:37:13.830 CST [21896] HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2022-11-29 15:37:13.831 CST [18956] WARNING:  terminating connection because of crash of another server process
2022-11-29 15:37:13.831 CST [18956] DETAIL:  The kingbase has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2022-11-29 15:37:13.831 CST [18956] HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2022-11-29 15:37:13.833 CST [13089] LOG:  all server processes terminated; reinitializing
2022-11-29 15:37:13.933 CST [22071] LOG:  database system was interrupted; last known up at 2022-11-29 15:19:30 CST
2022-11-29 15:38:29.154 CST [22071] LOG:  database system was not properly shut down; automatic recovery in progress
2022-11-29 15:38:29.232 CST [22071] LOG:  redo starts at 0/2293680
2022-11-29 15:38:29.232 CST [22071] LOG:  redo wal segment count 2
2022-11-29 15:38:29.232 CST [22071] LOG:  invalid record length at 0/22936B0: wanted 24, got 0
2022-11-29 15:38:29.232 CST [22071] LOG:  redo done at 0/2293680
2022-11-29 15:38:29.767 CST [13089] LOG:  database system is ready to accept connections
--- As shown above, using sys_ctl kill QUIT pid to terminate a backend process puts the database at great risk.
5. Using operating system kill -9 pid
# Query the state of the backend process
prod=# select * from sys_stat_activity where client_addr='192.168.8.200';
 datid | datname |  pid  | usesysid | usename | application_name  |  client_addr  | client_hostname | client_port |         backend_start         |          xact_start           |          query_start          |         state_change          | wait_event_type |     wait_event      | state  | backend_xid | backend_xmin |              query               |         backend_type
-------+---------+-------+----------+---------+-------------------+---------------+-----------------+-------------+-------------------------------+-------------------------------+-------------------------------+-------------------------------+-----------------+---------------------+--------+-------------+--------------+----------------------------------+------------------------------
 16385 | prod    | 14114 |       10 | system  | kingbase_*&+_     | 192.168.8.200 |                 |       57472 | 2022-11-29 14:48:06.372451+08 |                               | 2022-11-29 14:50:16.618969+08 | 2022-11-29 14:50:16.627572+08 | Client          | ClientRead          | idle   |             |              | select count(*) from t1;         | client backend
(1 row)
# Run kill -9 pid at the operating system level to end the process
[root@node2 ~]# kill -9 14114
# As shown below, in addition to the corresponding backend process, the other background auxiliary processes were also terminated and restarted
[root@node2 ~]# ps -ef |grep kingbase
.......
kingbase 13089     1  0 14:39 ?        00:00:00 /opt/Kingbase/ES/V8R6_054/KESRealPro/V008R006C005B0054/Server/bin/kingbase -D /db/kingbase/v8r6_054/data
kingbase 13090 13089  0 14:39 ?        00:00:00 kingbase: logger
kingbase 14010 12416  0 14:47 pts/0    00:00:00 ./ksql -U system test
kingbase 15651 13089  0 14:57 ?        00:00:00 kingbase: startup
# Check the corresponding sys_log (terminating the backend process also causes the database auxiliary processes to restart)
2022-11-29 14:58:00.093 CST [13089] LOG:  server process (PID 14114) was terminated by signal 9: Killed
2022-11-29 14:58:00.093 CST [13089] DETAIL:  Failed process was running: select count(*) from t1;
2022-11-29 14:58:00.093 CST [13089] LOG:  terminating any other active server processes
2022-11-29 14:58:00.093 CST [14602] WARNING:  terminating connection because of crash of another server process
2022-11-29 14:58:00.093 CST [14602] DETAIL:  The kingbase has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2022-11-29 14:58:00.093 CST [14602] HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2022-11-29 14:58:00.095 CST [13095] WARNING:  terminating connection because of crash of another server process
2022-11-29 14:58:00.095 CST [13095] DETAIL:  The kingbase has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2022-11-29 14:58:00.095 CST [13095] HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2022-11-29 14:58:00.384 CST [13089] LOG:  all server processes terminated; reinitializing
2022-11-29 14:58:00.464 CST [15651] LOG:  database system was interrupted; last known up at 2022-11-29 14:53:42 CST
2022-11-29 14:59:16.706 CST [15651] LOG:  database system was not properly shut down; automatic recovery in progress
2022-11-29 14:59:16.806 CST [15651] LOG:  redo starts at 0/2293488
2022-11-29 14:59:16.806 CST [15651] LOG:  redo wal segment count 2
2022-11-29 14:59:16.806 CST [15651] LOG:  invalid record length at 0/2293560: wanted 24, got 0
2022-11-29 14:59:16.806 CST [15651] LOG:  redo done at 0/2293530
2022-11-29 14:59:17.563 CST [13089] LOG:  database system is ready to accept connections
--- As shown above, using kill -9 pid to terminate a backend process puts the database at great risk.
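The sys_log excerpts in subsections 4 and 5 all contain the same main-process message, "all server processes terminated; reinitializing", which marks a forced restart of every server process. As a rough way to check how often this has happened, one could count that message in a log file. The sketch below assumes the log uses the message text quoted in this article; the function name count_crash_restarts and the sample file are made up for the example.

```shell
#!/bin/sh
# Hypothetical helper: count forced restarts of all server processes
# recorded in a sys_log file.
# $1: path to the log file.
count_crash_restarts() {
  grep -c 'all server processes terminated; reinitializing' "$1"
}

# Small sample built from three log lines quoted in this article.
sample=$(mktemp)
cat > "$sample" <<'EOF'
2022-11-29 14:58:00.093 CST [13089] LOG:  server process (PID 14114) was terminated by signal 9: Killed
2022-11-29 14:58:00.384 CST [13089] LOG:  all server processes terminated; reinitializing
2022-11-29 14:59:17.563 CST [13089] LOG:  database system is ready to accept connections
EOF

restarts=$(count_crash_restarts "$sample")
rm -f "$sample"
echo "forced restarts found: $restarts"
```

On a real system, pass the path of the current sys_log file instead of the sample.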
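III. Summary
An abnormal backend process in a KingbaseES database can be terminated manually, but the method should be chosen with its risk to the database in mind:
1) Relatively safe methods: pg_terminate_backend(pid); kill pid; kill -15 pid; sys_ctl kill TERM pid.
2) Unsafe methods: kill -3 pid; sys_ctl kill QUIT pid; kill -9 pid.
Note: Never use kill -9. SIGKILL cannot be handled by a signal handler, so the OS stops the process immediately. When the Kingbase parent process finds that a child process exited abnormally, it stops all processes, releases shared memory, allocates shared memory again, and starts all processes again. The effect is equivalent to an abnormal restart: startup needs time for redo, which may interrupt service for several minutes. Do not use kill -9 unless that consequence is acceptable.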
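The classification in this summary can be turned into a small guard that refuses the unsafe signals. This is only a sketch under the assumptions of this article: the function names classify_signal and safe_kill are hypothetical, and the safe/unsafe split simply mirrors the two lists above.

```shell
#!/bin/sh
# Hypothetical helper: classify a signal argument according to the summary.
# Accepts forms such as "", -15, TERM, SIGTERM, -3, QUIT, -9, KILL.
classify_signal() {
  sig=${1#-}       # drop a leading dash, if any
  sig=${sig#SIG}   # drop an optional SIG prefix
  case $sig in
    ''|15|TERM)    echo safe ;;     # plain kill, kill -15, sys_ctl kill TERM
    3|QUIT|9|KILL) echo unsafe ;;   # kill -3, sys_ctl kill QUIT, kill -9
    *)             echo unknown ;;
  esac
}

# Hypothetical wrapper: send SIGTERM to a pid, refusing any other request.
# $1: pid, $2: optional signal (defaults to TERM).
safe_kill() {
  pid=$1
  requested=${2:-TERM}
  if [ "$(classify_signal "$requested")" != safe ]; then
    echo "refusing to send $requested to $pid" >&2
    return 1
  fi
  kill -s TERM "$pid"
}

# Show how the signals used in this article are classified.
for s in -15 TERM -3 QUIT -9; do
  printf '%s -> %s\n' "$s" "$(classify_signal "$s")"
done
```

A call such as safe_kill 22955 would behave like kill -15 22955, while safe_kill 22955 -9 would be rejected before any signal is sent.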