A Look into the Raft Protocol of the DM8 Monitor


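Preface

In database cluster design, split-brain is an old but classic scenario. Early database cluster architectures usually relied on an arbiter to help determine the primary and standby roles from the previous state, and from that decide how to switch over. A single-arbiter design, however, leads to a dual-primary problem when the primary and the standby end up in two isolated network islands.
For this class of problem, early cluster solutions such as MySQL's MHA probe the lost node from multiple nodes over multiple paths to establish its state as reliably as possible, and abandon the switchover when the state cannot be established. This avoids dual primaries, but it also gives up automatic switchover in certain scenarios.
Later architectures introduced various variants and improvements built on Paxos as the underlying consensus protocol. Among them, Raft is the most widely used for ensuring consistency in distributed scenarios. Its main improvements are stronger guarantees on log ordering and a strengthened LEADER role, which keeps fluctuations within a term from affecting the service layer. DM8 has also introduced Raft into its confirm monitor layer, and this post uses a series of tests to try to understand how it works.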


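Environment

This test uses a 2-node DW with 3 monitors, sharing three machines, to examine how the monitor's Raft protocol works. The layout is as follows:

Internal IP    External IP      Port     Purpose
10.30.5.17     192.168.56.7     52141    DMWATCHER_1
-              -                8341     MONITOR_3
10.30.5.18     192.168.56.8     52142    DMWATCHER_2
-              -                8340     MONITOR_2
10.30.5.24     192.168.56.24    8339     MONITOR_1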

Monitor configuration

MONITOR_1

MON_DW_CONFIRM = 1
MON_LOG_PATH = /opt/dw/log
MON_LOG_INTERVAL = 60
MON_LOG_FILE_SIZE = 32
MON_LOG_SPACE_LIMIT = 0

MON_INST_NUM = 3
MON_HB_INTERVAL = 60
MON_BRO_INTERVAL = 100
MON_VOTE_INTERVAL = 100
MON_ID = 1
MON_MID = 45614

[GRP1]
MON_INST_OGUID = 453331
MON_DW_IP = 192.168.56.7:52141
MON_DW_IP = 192.168.56.8:52142

[MON1]
MON_HOST = 192.168.56.24
MON_PORT = 8339
MON_INST_ID = 1

[MON2]
MON_HOST = 192.168.56.8
MON_PORT = 8340
MON_INST_ID = 2

[MON3]
MON_HOST = 192.168.56.7
MON_PORT = 8341
MON_INST_ID = 3

MONITOR_2

MON_DW_CONFIRM = 1
MON_LOG_PATH = /opt/rt_02/DAMENG
MON_LOG_INTERVAL = 60
MON_LOG_FILE_SIZE = 32
MON_LOG_SPACE_LIMIT = 0

MON_INST_NUM = 3
MON_HB_INTERVAL = 60
MON_BRO_INTERVAL = 100
MON_VOTE_INTERVAL = 100
MON_ID = 2
MON_MID = 45614

[GRP1]
MON_INST_OGUID = 453331
MON_DW_IP = 192.168.56.7:52141
MON_DW_IP = 192.168.56.8:52142

[MON1]
MON_HOST = 192.168.56.24
MON_PORT = 8339
MON_INST_ID = 1

[MON2]
MON_HOST = 192.168.56.8
MON_PORT = 8340
MON_INST_ID = 2

[MON3]
MON_HOST = 192.168.56.7
MON_PORT = 8341
MON_INST_ID = 3

MONITOR_3

MON_DW_CONFIRM = 1
MON_LOG_PATH = /opt/rt_01/DAMENG
MON_LOG_INTERVAL = 60
MON_LOG_FILE_SIZE = 32
MON_LOG_SPACE_LIMIT = 0

MON_INST_NUM = 3
MON_HB_INTERVAL = 60
MON_BRO_INTERVAL = 100
MON_VOTE_INTERVAL = 100
MON_ID = 3
MON_MID = 45614

[GRP1]
MON_INST_OGUID = 453331
MON_DW_IP = 192.168.56.7:52141
MON_DW_IP = 192.168.56.8:52142

[MON1]
MON_HOST = 192.168.56.24
MON_PORT = 8339
MON_INST_ID = 1

[MON2]
MON_HOST = 192.168.56.8
MON_PORT = 8340
MON_INST_ID = 2

[MON3]
MON_HOST = 192.168.56.7
MON_PORT = 8341
MON_INST_ID = 3

A single monitor

First, start MONITOR_1 and observe its behavior

[dmdba@dmdw0 config]$ /opt/dw/dmdbms/bin/dmmonitor path=dmmonitor.ini 
[monitor]         2022-04-09 07:28:14: DMMONITOR[4.0] V8
[monitor]         2022-04-09 07:28:14: DMMONITOR[4.0] IS READY.

show monitor
[monitor]         2022-04-09 07:29:35: The monitor is not LEADER

show state
2022-04-09 07:29:38 
#--------------------------------------------------------------------------------#
GET MONITOR STATE FROM MONITOR SYSTEM, THE FIRST LINE IS SELF INFO.
MON_BRO_INTERVAL: 99 ms, MON_VOTE_INTERVAL: 3018 ms

MON_NAME       MON_STATE      ID             MON_ROLE       MON_IP                   MON_PORT       
MON1           Active         1              CANDIDATE      192.168.56.24            8339           
MON2           Active         2              NOT LEADER     192.168.56.8             8340           
MON3           Active         3              NOT LEADER     192.168.56.7             8341           
#--------------------------------------------------------------------------------#

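With only one node started, Raft's requirement of winning more than half of the votes cannot be met, so this node can only stay in the CANDIDATE state. In addition, once Raft is in use, a non-LEADER node cannot issue any command other than show state, so it naturally has no switchover capability either. After startup, node 1 produces a raft log like the following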

[dmdba@dmdw0 log]$ vi dm_raft\[mon1_45614\]_202204.log 
2022-04-09 07:28:14.616 [INFO] raft P0000004554 T0000000000000004554  ECS EP XSITE POOL : guid [48003] HB interval [60s]
2022-04-09 07:28:14.618 [INFO] raft P0000004554 T0000000000000004554  ECS AP XSITE POOL : guid [520993] HB interval [60s]
2022-04-09 07:28:17.155 [INFO] raft P0000004554 T0000000000000004561  raft[1] election starting: 2450 2547, term: 305, currentIdx: 20253
2022-04-09 07:28:17.155 [INFO] raft P0000004554 T0000000000000004561  raft[1] becoming candidate
2022-04-09 07:28:17.155 [INFO] raft P0000004554 T0000000000000004560  raft[1] sending requestVote to 3, currentTerm: 306, last_index: 20253, last_term: 305
2022-04-09 07:28:17.156 [INFO] raft P0000004554 T0000000000000004559  raft[1] sending requestVote to 2, currentTerm: 306, last_index: 20253, last_term: 305
2022-04-09 07:28:17.156 [ERROR] raft P0000004554 T0000000000000004560  Can't connect to DM server on '192.168.56.7' port(8341) errno(111)
2022-04-09 07:28:17.156 [ERROR] raft P0000004554 T0000000000000004559  Can't connect to DM server on '192.168.56.8' port(8340) errno(111)
2022-04-09 07:28:17.156 [INFO] raft P0000004554 T0000000000000004560 
......

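The log shows some typical Raft election information:
1. The cluster has no LEADER, so new election rounds keep being started. The term advances, and vote requests are sent to the other nodes in the cluster.
2. Because there is no LEADER, nothing is written to the log, so the index does not advance.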
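The two observations above can be checked mechanically. The following is a minimal sketch that pulls the term and currentIdx out of "election starting" lines; the two sample lines are copied from node 1's raft log in this post, and the regular expression is an assumption about the line layout based only on those samples, not a documented DM8 log format.

```python
import re

# Two "election starting" lines copied from node 1's raft log in this post.
LOG = """\
2022-04-09 07:28:17.155 [INFO] raft P0000004554 T0000000000000004561  raft[1] election starting: 2450 2547, term: 305, currentIdx: 20253
2022-04-09 07:42:51.573 [INFO] raft P0000005252 T0000000000000005259  raft[1] election starting: 2447 2548, term: 470, currentIdx: 20253
"""

# Assumed line layout, inferred from the samples above.
ELECTION = re.compile(
    r"raft\[(\d+)\] election starting: .*?term: (\d+), currentIdx: (\d+)"
)


def elections(text):
    """Return (node_id, term, current_idx) for every election start found."""
    found = []
    for line in text.splitlines():
        match = ELECTION.search(line)
        if match:
            found.append(tuple(int(group) for group in match.groups()))
    return found


rows = elections(LOG)
terms = [term for _, term, _ in rows]
indexes = {idx for _, _, idx in rows}
print(rows)
print("term advanced:", terms[-1] > terms[0])
print("index unchanged:", len(indexes) == 1)
```

Between the two samples the term moves from 305 to 470 while currentIdx stays at 20253, which matches the description of repeated elections without any log writes.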

Check the local ports

[root@dmdw0 ~]# netstat -an|grep 52142
[root@dmdw0 ~]# netstat -an|grep 52141

No connection to the DW has been established, so a non-LEADER node does not actually connect to the DW

Try capturing packets on the port

[root@dmdw0 ~]# tcpdump -s0 -e -nn -vvv -i enp0s8 port 8339 -X -xx
tcpdump: listening on enp0s8, link-type EN10MB (Ethernet), capture size 262144 bytes

At this point there is no traffic at all on this port. This is explained in the later tests

Two monitors

Start MONITOR_2 and observe what happens

show state     
2022-04-09 07:43:15 
#--------------------------------------------------------------------------------#
GET MONITOR STATE FROM MONITOR SYSTEM, THE FIRST LINE IS SELF INFO.
MON_BRO_INTERVAL: 99 ms, MON_VOTE_INTERVAL: 1821 ms

MON_NAME       MON_STATE      ID             MON_ROLE       MON_IP                   MON_PORT       
MON2           Active         2              FOLLOWER       192.168.56.8             8340           
MON1           Active         1              LEADER         192.168.56.24            8339           
MON3           Active         3              NOT LEADER     192.168.56.7             8341           
#--------------------------------------------------------------------------------#

This node has become a FOLLOWER, and MONITOR_1 has become the LEADER

Node 2 raft log

[root@dmdsc1 log]# vi dm_raft\[mon2_45614\]_202204.log 
2022-04-09 07:42:50.043 [INFO] raft P0000003132 T0000000000000003132  ECS EP XSITE POOL : guid [926484] HB interval [60s]
2022-04-09 07:42:50.045 [INFO] raft P0000003132 T0000000000000003132  ECS AP XSITE POOL : guid [632227] HB interval [60s]
2022-04-09 07:42:51.553 [INFO] raft P0000003132 T0000000000000003136  xmal_cache_esite site(0x7fed64002098) site_type(1) esite_guid(688974)
2022-04-09 07:42:51.553 [INFO] raft P0000003132 T0000000000000003136  xmal_ep2ap_conn_process success, inout_type(1) esite_guid(688974) asite(22557168238924) asite_type(0)
2022-04-09 07:42:51.554 [INFO] raft P0000003132 T0000000000000003136  xmal_ep2ap_conn_process success, inout_type(0) esite_guid(688974) asite(22557168238924) asite_type(0)
2022-04-09 07:42:51.555 [INFO] raft P0000003132 T0000000000000003147  raft[2] raft_process_request_vote from node[1]
2022-04-09 07:42:51.555 [INFO] raft P0000003132 T0000000000000003147  raft[2] becoming follower
2022-04-09 07:42:51.555 [INFO] raft P0000003132 T0000000000000003147  raft[2] node request vote: 1 replying: granted
2022-04-09 07:42:51.656 [ERROR] raft P0000003132 T0000000000000003147  raft[2] AppendEntries no log at prev_idx 20253
2022-04-09 07:42:51.755 [INFO] raft P0000003132 T0000000000000003147  raft[2] raft_process_snapshot from node[1]
2022-04-09 07:42:51.756 [INFO] raft P0000003132 T0000000000000003147  raft[2] becoming follower
2022-04-09 07:42:51.756 [INFO] raft P0000003132 T0000000000000003147  raft[2] node: 1 snapshot replying: 471
2022-04-09 07:42:54.659 [INFO] raft P0000003132 T0000000000000003147  Extend rflog from 10 to 20

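Here, after handling the vote request sent by node 1, node 2 becomes a FOLLOWER and tries to fill in its missing log. In the log this shows up as an AppendEntries failure at prev_idx 20253 followed by a snapshot from node 1. Since there was no LEADER before this point, the index had not advanced, so this step completes right away and the node settles as a FOLLOWER

Node 1 raft log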

2022-04-09 07:42:51.573 [INFO] raft P0000005252 T0000000000000005259  raft[1] election starting: 2447 2548, term: 470, currentIdx: 20253
2022-04-09 07:42:51.573 [INFO] raft P0000005252 T0000000000000005259  raft[1] becoming candidate
2022-04-09 07:42:51.573 [INFO] raft P0000005252 T0000000000000005258  raft[1] sending requestVote to 3, currentTerm: 471, last_index: 20253, last_term: 305
2022-04-09 07:42:51.573 [INFO] raft P0000005252 T0000000000000005257  raft[1] sending requestVote to 2, currentTerm: 471, last_index: 20253, last_term: 305
2022-04-09 07:42:51.573 [ERROR] raft P0000005252 T0000000000000005258  Can't connect to DM server on '192.168.56.7' port(8341) errno(111)
2022-04-09 07:42:51.574 [INFO] raft P0000005252 T0000000000000005258  xlnk_ep2ap_conn_create fail,code(-650), site(0x7fe244002098) site_type(0) conn_type(1) address(192.168.56.7:8341) guid(688974) fail_lnk(nth:0, type:OUT)
2022-04-09 07:42:51.575 [INFO] raft P0000005252 T0000000000000005257  xlnk_ep2ap_conn_create success, site(22557168238924) site_type(0) conn_type(0) address(192.168.56.8:8340) guid(688974) n_lnk(1)
2022-04-09 07:42:51.575 [INFO] raft P0000005252 T0000000000000005257  xmal_cache_asite site(0x7fe240002098) site_type(0) address(192.168.56.8:8340) guid(688974)
2022-04-09 07:42:51.576 [INFO] raft P0000005252 T0000000000000005257  raft[1] node[2] responded to requestvote status: granted
2022-04-09 07:42:51.576 [INFO] raft P0000005252 T0000000000000005257  raft[1] becoming leader term: 471, bro_timeout: 99
2022-04-09 07:42:51.678 [ERROR] raft P0000005252 T0000000000000005258  Can't connect to DM server on '192.168.56.7' port(8341) errno(111)
2022-04-09 07:42:51.678 [INFO] raft P0000005252 T0000000000000005258  xlnk_ep2ap_conn_create fail,code(-650), site(0x7fe244002098) site_type(0) conn_type(1) address(192.168.56.7:8341) guid(688974) fail_lnk(nth:0, type:OUT)
2022-04-09 07:42:51.678 [WARNING] raft P0000005252 T0000000000000005257  raft[1] send AppendEntries(res_index: 20241) failed. Retry
2022-04-09 07:42:51.779 [ERROR] raft P0000005252 T0000000000000005258  Can't connect to DM server on '192.168.56.7' port(8341) errno(111)
2022-04-09 07:42:51.779 [INFO] raft P0000005252 T0000000000000005258  xlnk_ep2ap_conn_create fail,code(-650), site(0x7fe244002098) site_type(0) conn_type(1) address(192.168.56.7:8341) guid(688974) fail_lnk(nth:0, type:OUT)

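Until it could connect to node 2 or node 3, node 1 kept starting elections and requesting votes. Once it connected to node 2, it obtained enough votes to be elected LEADER and started sending log entries to the FOLLOWER to bring it up to date.

Having become the LEADER, node 1 established connections to the DW to obtain the state of the database cluster nodes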
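The vote counting behind this election can be illustrated with a small sketch. It assumes the standard Raft rule that a candidate votes for itself and needs strictly more than half of the cluster; treating MON_INST_NUM as the cluster size used for this count is an assumption of the example, not something confirmed by DM8 documentation.

```python
MON_INST_NUM = 3  # value from the dmmonitor.ini files in this post


def has_quorum(votes_granted: int, cluster_size: int) -> bool:
    """Standard Raft majority rule: strictly more than half of the cluster."""
    return votes_granted > cluster_size // 2


# Only MONITOR_1 is running: it has its own vote and nothing else.
print(has_quorum(1, MON_INST_NUM))
# MONITOR_2 is up and grants its vote: own vote plus one.
print(has_quorum(2, MON_INST_NUM))
```

With one vote out of three the check fails, which corresponds to node 1 staying in CANDIDATE; with two it passes, which corresponds to the "becoming leader" line that follows node 2's granted reply.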

[root@dmdw0 log]# netstat -an|grep 52142
tcp        0      0 192.168.56.24:55558     192.168.56.8:52142      ESTABLISHED
[root@dmdw0 log]# netstat -an|grep 52141
tcp        0      0 192.168.56.24:37838     192.168.56.7:52141      ESTABLISHED

Node 2, as a FOLLOWER, does not connect to the DW

[root@dmdsc1 ~]# netstat -an|grep 52142
tcp6       0      0 :::52142                :::*                    LISTEN     
tcp6       0      0 192.168.56.8:52142      192.168.56.24:55558     ESTABLISHED  // watcher answering the LEADER's connection
tcp6       0      0 10.30.5.18:52142        10.30.5.17:39666        ESTABLISHED  // connection with the peer over MAL
[root@dmdsc1 ~]# netstat -an|grep 52141

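From this we can infer that only the LEADER node actually carries out the confirm monitor's work, in the same way as the single-monitor mode used before Raft was introduced. The role of the non-LEADER nodes is examined next by looking at the messages on their ports

The LEADER node connects to the destination port configured for the FOLLOWER node in the MONITOR configuration. So what is being transmitted?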
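The comparison between the two netstat outputs can be sketched as a small filter that keeps only ESTABLISHED sockets whose remote port is a watcher port. The sample text is copied from the netstat output in this post; the function name and the column positions are assumptions of this example and only cover the `netstat -an` layout shown here.

```python
DW_PORTS = {52141, 52142}  # MON_DW_IP ports from dmmonitor.ini

# netstat -an output copied from node 1 (LEADER) and node 2 (FOLLOWER).
LEADER_NETSTAT = """\
tcp        0      0 192.168.56.24:55558     192.168.56.8:52142      ESTABLISHED
tcp        0      0 192.168.56.24:37838     192.168.56.7:52141      ESTABLISHED
"""
FOLLOWER_NETSTAT = """\
tcp6       0      0 :::52142                :::*                    LISTEN
tcp6       0      0 192.168.56.8:52142      192.168.56.24:55558     ESTABLISHED
tcp6       0      0 10.30.5.18:52142        10.30.5.17:39666        ESTABLISHED
"""


def outbound_to(netstat_text, ports):
    """Return (local, remote) pairs for ESTABLISHED sockets whose REMOTE port is in ports."""
    hits = []
    for line in netstat_text.splitlines():
        fields = line.split()
        if len(fields) < 6 or fields[5] != "ESTABLISHED":
            continue
        local, remote = fields[3], fields[4]
        remote_port = int(remote.rsplit(":", 1)[1])
        if remote_port in ports:
            hits.append((local, remote))
    return hits


print(outbound_to(LEADER_NETSTAT, DW_PORTS))
print(outbound_to(FOLLOWER_NETSTAT, DW_PORTS))
```

On node 2 the sockets that mention 52142 all have it as the local port, so they belong to the watcher accepting connections rather than to the monitor connecting out. The filter works on ports only and cannot tell which process owns a socket.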

[root@dmdsc1 log]# netstat -an|grep 8340
tcp6       0      0 :::8340                 :::*                    LISTEN     
tcp6       0      0 192.168.56.8:8340       192.168.56.24:37338     ESTABLISHED
tcp6       0      0 192.168.56.8:8340       192.168.56.24:37336     ESTABLISHED

The LEADER node's own port still sees no messages

[root@dmdw0 ~]# tcpdump -s0 -e -nn -vvv -i enp0s3 port 8339 -X -xx -w 1
tcpdump: listening on enp0s3, link-type EN10MB (Ethernet), capture size 262144 bytes
0 packets captured
0 packets received by filter
0 packets dropped by kernel

The FOLLOWER's port, on the other hand, does see traffic

[root@dmdsc1 log]# tcpdump -s0 -e -nn -vvv -i enp0s3 port 8340 -X -xx -w 1
tcpdump: listening on enp0s3, link-type EN10MB (Ethernet), capture size 262144 bytes
128 packets captured
148 packets received by filter
0 packets dropped by kernel

Since the exact wire format is unknown, only a packet containing readable text is shown here

08:49:33.797459 08:00:27:77:7d:83 > 08:00:27:31:de:84, ethertype IPv4 (0x0800), length 2032: (tos 0x0, ttl 64, id 372, offset 0, flags [DF], proto TCP (6), length 2018)
    192.168.56.24.37336 > 192.168.56.8.8340: Flags [P.], cksum 0xf945 (incorrect -> 0x674c), seq 418:2384, ack 1, win 58, options [nop,nop,TS val 30975863 ecr 35426473], length 1966
	0x0000:  4500 07e2 0174 4000 4006 4031 c0a8 3818  E....t@.@.@1..8.
	0x0010:  c0a8 3808 91d8 2094 ce5f b1bd 0cc0 f514  ..8......_......
	0x0020:  8018 003a f945 0000 0101 080a 01d8 a777  ...:.E.........w
	0x0030:  021c 90a9 ae07 0000 6f00 0100 0000 0000  ........o.......
	0x0040:  cb00 0000 0000 8f82 1174 2eb2 0000 d701  .........t......
	0x0050:  0000 0000 0000 0100 0000 006e 0000 0000  ...........n....
	0x0060:  0000 d701 0000 0000 0000 0100 0000 006e  ...............n
	0x0070:  0000 0000 0000 016e 0000 0000 0000 d701  .......n........
	0x0080:  0000 0000 0000 0000 0000 5407 0000 4752  ..........T...GR
	0x0090:  5031 0000 0000 2d73 7d56 e27f 0000 0000  P1....-s}V......
	0x00a0:  0000 0000 0000 0000 0000 3807 0000 fe2d  ..........8....-
	0x00b0:  00d3 ea06 0047 5250 3100 0000 0000 0000  .....GRP1.......
	0x00c0:  0000 0000 0079 3dd4 0132 0700 0000 0000  .....y=..2......
	0x00d0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
	0x00e0:  0000 0000 0000 0000 0000 0a00 4752 5031  ............GRP1
	0x00f0:  5f52 545f 3031 3230 3232 2d30 342d 3039  _RT_012022-04-09
	0x0100:  2030 383a 3439 3a33 3320 0000 0000 0100  .08:49:33.......
	0x0110:  0000 0000 0000 0000 0000 0000 6400 0000  ............d...
	0x0120:  0100 0102 0a00 3c00 0000 0a00 0100 1800  ......<.........
	0x0130:  2f6f 7074 2f72 745f 3031 2f44 414d 454e  /opt/rt_01/DAMEN
	0x0140:  472f 646d 2e69 6e69 0000 1c00 2f6f 7074  G/dm.ini..../opt
	0x0150:  2f64 7363 2f64 6d64 626d 732f 6269 6e2f  /dsc/dmdbms/bin/
	0x0160:  646d 7365 7276 6572 0000 0000 0000 0000  dmserver........
	0x0170:  0100 0000 0000 0000 0000 0100 0000 32c4  ..............2.
	0x0180:  5062 0000 0000 0000 0000 0300 0000 ffff  Pb..............
	0x0190:  ffff 7d00 5072 696d 6172 7920 696e 7374  ..}.Primary.inst
	0x01a0:  616e 6365 2847 5250 315f 5254 5f30 3129  ance(GRP1_RT_01)
	0x01b0:  2061 7263 6820 7374 6174 7573 2074 6f20  .arch.status.to.
	0x01c0:  696e 7374 616e 6365 2847 5250 315f 5254  instance(GRP1_RT
	0x01d0:  5f30 3229 2069 7320 5641 4c49 442c 2072  _02).is.VALID,.r
	0x01e0:  6563 6f76 6572 7920 6f66 2069 6e73 7461  ecovery.of.insta
	0x01f0:  6e63 6528 4752 5031 5f52 545f 3032 2920  nce(GRP1_RT_02).
	0x0200:  6973 206e 6f74 206e 6563 6573 7361 7279  is.not.necessary
	0x0210:  2101 000a 0047 5250 315f 5254 5f30 3101  !....GRP1_RT_01.
	0x0220:  0400 8d7d 0000 0000 002c 2300 0000 0000  ...}.....,#.....
	0x0230:  00a4 cb00 0000 0000 002d 2300 0000 0000  .........-#.....
	0x0240:  00a5 cb00 0000 0000 0000 0000 00ff ffff  ................
	0x0250:  ffff ffff ff0c 0031 3932 2e31 3638 2e35  .......192.168.5
	0x0260:  362e 3702 0001 0000 0147 5250 315f 5254  6.7......GRP1_RT
	0x0270:  5f30 3200 0000 0000 0047 5250 315f 5254  _02......GRP1_RT
	0x0280:  5f30 3200 0000 0000 0001 0000 0000 0000  _02.............
	0x0290:  0000 0000 0000 0000 4500 7365 6e64 2061  ........E.send.a
	0x02a0:  7263 6820 746f 2073 6974 6528 4752 5031  rch.to.site(GRP1
	0x02b0:  5f52 545f 3032 2920 7375 6363 6573 732c  _RT_02).success,
	0x02c0:  2062 6567 696e 206c 736e 3a35 3231 3333  .begin.lsn:52133
	0x02d0:  2c20 656e 6420 6c73 6e3a 3532 3133 3340  ,.end.lsn:52133@
	0x02e0:  0000 009f 0600 0000 0000 007c 3e10 0000  ...........|>...
	0x02f0:  0000 00f9 0b00 0000 0000 0064 611c 0000  ...........da...
	0x0300:  0000 00e6 1600 0000 0000 0007 d450 6200  .............Pb.
	0x0310:  0000 004e 0500 0004 8802 00a5 cb00 0000  ...N............
	0x0320:  0000 0000 8100 0000 0000 0040 0000 0000  ...........@....
	0x0330:  0000 007d 2701 0000 0000 0004 0200 0001  ...}'...........
	0x0340:  0000 0095 0200 0000 0000 00a5 cb00 0000  ................
	0x0350:  0000 00a5 cb00 0000 0000 001c d850 6200  .............Pb.
	0x0360:  0000 001c d850 6200 0000 0000 0040 002a  .....Pb......@.*
	0x0370:  b795 3dfd 6851 00bb 62c6 2b92 7b4e ec20  ..=.hQ..b.+.{N..
	0x0380:  dc67 094d 0469 fdb7 02e6 896c 23b5 c146  .g.M.i.....l#..F
	0x0390:  45ab 35fc 16e9 8926 f4ac c137 6323 63bb  E.5....&...7c#c.
	0x03a0:  ccc9 3055 9818 9460 7027 a8b8 4f49 0000  ..0U...`p'..OI..
	0x03b0:  0001 0100 0001 0001 08ef 9626 fdf8 403f  ...........&..@?
	0x03c0:  0100 fdf8 403f 0000 2c23 0000 0000 0000  ....@?..,#......
	0x03d0:  a4cb 0000 0000 0000 0000 4a03 0000 0000  ..........J.....
	0x03e0:  0009 000c 0047 5250 315f 5254 5f30 315f  .....GRP1_RT_01_
	0x03f0:  3101 0000 0000 0000 00e6 0704 0600 0000  1...............
	0x0400:  0000 00e8 0300 4752 5031 5f52 545f 3031  ......GRP1_RT_01
	0x0410:  0000 0000 0000 4752 5031 5f52 545f 3031  ......GRP1_RT_01
	0x0420:  0000 0000 0000 fdf8 403f fdf8 403f 0100  ........@?..@?..
	0x0430:  ab0e 0000 0000 0000 9d53 0000 0000 0000  .........S......
	0x0440:  0c00 4752 5031 5f52 545f 3031 5f32 0200  ..GRP1_RT_01_2..
	0x0450:  0000 0000 0000 e607 0406 0932 2200 0000  ...........2"...
	0x0460:  e803 0047 5250 315f 5254 5f30 3100 0000  ...GRP1_RT_01...
	0x0470:  0000 0047 5250 315f 5254 5f30 3100 0000  ...GRP1_RT_01...
	0x0480:  0000 00fd f840 3ffd f840 3f01 0098 1100  .....@?..@?.....
	0x0490:  0000 0000 00b7 8e00 0000 0000 000c 0047  ...............G
	0x04a0:  5250 315f 5254 5f30 315f 3303 0000 0000  RP1_RT_01_3.....
	0x04b0:  0000 00e6 0704 0611 0315 0000 00e8 0301  ................
	0x04c0:  4752 5031 5f52 545f 3031 0000 0000 0000  GRP1_RT_01......
	0x04d0:  4752 5031 5f52 545f 3031 0000 0000 0000  GRP1_RT_01......
	0x04e0:  fdf8 403f fdf8 403f 0100 c311 0000 0000  ..@?..@?........
	0x04f0:  0000 af94 0000 0000 0000 0c00 4752 5031  ............GRP1
	0x0500:  5f52 545f 3032 5f34 0400 0000 0000 0000  _RT_02_4........
	0x0510:  e607 0406 1108 3200 0000 e803 0147 5250  ......2......GRP
	0x0520:  315f 5254 5f30 3100 0000 0000 0047 5250  1_RT_01......GRP
	0x0530:  315f 5254 5f30 3200 0000 0000 00fd f840  1_RT_02........@
	0x0540:  3fb4 9d53 7d01 0033 1200 0000 0000 0005  ?..S}..3........
	0x0550:  9a00 0000 0000 000c 0047 5250 315f 5254  .........GRP1_RT
	0x0560:  5f30 315f 3505 0000 0000 0000 00e6 0704  _01_5...........
	0x0570:  0611 0b1b 0000 00e8 0301 4752 5031 5f52  ..........GRP1_R
	0x0580:  545f 3032 0000 0000 0000 4752 5031 5f52  T_02......GRP1_R
	0x0590:  545f 3031 0000 0000 0000 b49d 537d fdf8  T_01........S}..
	0x05a0:  403f 0100 6412 0000 0000 0000 839f 0000  @?..d...........
	0x05b0:  0000 0000 0c00 4752 5031 5f52 545f 3031  ......GRP1_RT_01
	0x05c0:  5f36 0600 0000 0000 0000 e607 0406 1120  _6..............
	0x05d0:  0d00 0000 e803 0147 5250 315f 5254 5f30  .......GRP1_RT_0
	0x05e0:  3100 0000 0000 0047 5250 315f 5254 5f30  1......GRP1_RT_0
	0x05f0:  3100 0000 0000 00fd f840 3ffd f840 3f01  1........@?..@?.
	0x0600:  0002 1400 0000 0000 004e a700 0000 0000  .........N......
	0x0610:  000c 0047 5250 315f 5254 5f30 325f 3707  ...GRP1_RT_02_7.
	0x0620:  0000 0000 0000 00e6 0704 0615 3310 0000  ............3...
	0x0630:  00e8 0301 4752 5031 5f52 545f 3031 0000  ....GRP1_RT_01..
	0x0640:  0000 0000 4752 5031 5f52 545f 3032 0000  ....GRP1_RT_02..
	0x0650:  0000 0000 fdf8 403f b49d 537d 0100 d516  ......@?..S}....
	0x0660:  0000 0000 0000 03af 0000 0000 0000 0c00  ................
	0x0670:  4752 5031 5f52 545f 3031 5f38 0800 0000  GRP1_RT_01_8....
	0x0680:  0000 0000 e607 0407 081a 2100 0000 e803  ..........!.....
	0x0690:  0147 5250 315f 5254 5f30 3200 0000 0000  .GRP1_RT_02.....
	0x06a0:  0047 5250 315f 5254 5f30 3100 0000 0000  .GRP1_RT_01.....
	0x06b0:  00b4 9d53 7dfd f840 3f01 0060 1a00 0000  ...S}..@?..`....
	0x06c0:  0000 00b5 b800 0000 0000 000c 0047 5250  .............GRP
	0x06d0:  315f 5254 5f30 315f 3909 0000 0000 0000  1_RT_01_9.......
	0x06e0:  00e6 0704 0907 181d 0000 00e8 0301 4752  ..............GR
	0x06f0:  5031 5f52 545f 3031 0000 0000 0000 4752  P1_RT_01......GR
	0x0700:  5031 5f52 545f 3031 0000 0000 0000 fdf8  P1_RT_01........
	0x0710:  403f fdf8 403f 0100 961c 0000 0000 0000  @?..@?..........
	0x0720:  c9bf 0000 0000 0000 0000 0200 0101 7700  ..............w.
	0x0730:  0100 012e b200 0000 0000 0032 3032 322d  ...........2022-
	0x0740:  3034 2d30 3920 3037 3a34 323a 3531 2000  04-09.07:42:51..
	0x0750:  0000 003a 3a66 6666 663a 3139 322e 3136  ...::ffff:192.16
	0x0760:  382e 3536 2e32 3400 0000 0000 0000 0000  8.56.24.........
	0x0770:  0000 0000 0000 0000 0000 0000 0000 0000  ................
	0x0780:  0000 0000 0000 0000 0000 0000 0000 0000  ................
	0x0790:  0000 0012 0044 4d4d 4f4e 4954 4f52 5b34  .....DMMONITOR[4
	0x07a0:  2e30 5d20 5638 0a00 0000 002f 0049 6e73  .0].V8...../.Ins
	0x07b0:  7461 6e63 6528 4752 5031 5f52 545f 3031  tance(GRP1_RT_01
	0x07c0:  2920 6973 2061 6c72 6561 6479 2069 6e20  ).is.already.in.
	0x07d0:  4f70 656e 2073 7461 7475 7321 0000 0000  Open.status!....
	0x07e0:  0000     

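Broadly, the LEADER is sending the FOLLOWER the latest state of the database cluster nodes that it obtained from the DW

A brief digression

Right after the standard packet header, at the start of the body of this readable payload, there is some interesting information, shown below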
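Readable fragments like the ones in the dump can be pulled out of a raw payload in the same spirit as the `strings` utility. This is a minimal sketch: the bytes are three rows copied from offsets 0x0190 to 0x01b0 of the capture in this post, and the helper name and the minimum run length of 6 are arbitrary choices of this example.

```python
import re

# Bytes copied from offsets 0x0190-0x01bf of the captured packet.
HEX = (
    "ffff 7d00 5072 696d 6172 7920 696e 7374 "
    "616e 6365 2847 5250 315f 5254 5f30 3129 "
    "2061 7263 6820 7374 6174 7573 2074 6f20"
)


def printable_runs(payload: bytes, min_len: int = 6):
    """Return runs of printable ASCII (space through tilde) of at least min_len bytes."""
    pattern = re.compile(rb"[ -~]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(payload)]


payload = bytes.fromhex(HEX)
for text in printable_runs(payload):
    print(repr(text))
```

Short accidental matches such as the single `}` byte at the start are dropped by the length threshold, leaving only the status message text.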

                                      5407 0000 4752  ..........T...GR
	0x0090:  5031 0000 0000 2d73 7d56 e27f 0000 0000  P1....-s}V......
	0x00a0:  0000 0000 0000 0000 0000 3807 0000 fe2d  ..........8....-
	0x00b0:  00d3 ea06 0047 5250 3100 0000 0000 0000  .....GRP1.......
	0x00c0:  0000 0000 0079 3dd4 0132 0700 0000 0000  .....y=..2......

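From this we can see that MAL includes the following information at the start of the protocol body:
47525031: the GROUP SECTION NAME in MAL, GRP1
d3ea06: the MON_INST_OGUID in MAL, 453331 (little endian)
So the MAL protocol identifies a node by the GROUP NAME and INST_OGUID together, carried at the head of the protocol data.

Three monitors

Start monitor 3 and observe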
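The two byte sequences can be decoded directly to confirm the reading above. This is a small sketch using only the bytes quoted from the capture; treating the OGUID as a 4-byte field when encoding it back is an assumption based on the zero byte that follows d3ea06 in the dump.

```python
# Bytes quoted from the packet body.
group_bytes = bytes.fromhex("47525031")
oguid_bytes = bytes.fromhex("d3ea06")

group_name = group_bytes.decode("ascii")
oguid = int.from_bytes(oguid_bytes, byteorder="little")

print(group_name)
print(oguid)

# Going the other way: MON_INST_OGUID from dmmonitor.ini, encoded as an
# assumed 4-byte little-endian integer.
print((453331).to_bytes(4, byteorder="little").hex())
```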

show state
2022-04-09 08:39:28 
#--------------------------------------------------------------------------------#
GET MONITOR STATE FROM MONITOR SYSTEM, THE FIRST LINE IS SELF INFO.
MON_BRO_INTERVAL: 99 ms, MON_VOTE_INTERVAL: 2559 ms

MON_NAME       MON_STATE      ID             MON_ROLE       MON_IP                   MON_PORT       
MON3           Active         3              FOLLOWER       192.168.56.7             8341           
MON1           Active         1              LEADER         192.168.56.24            8339           
MON2           Active         2              NOT LEADER     192.168.56.8             8340           
#--------------------------------------------------------------------------------#

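Interestingly, once monitor 3 is started, its show state output lists monitor 2 as NOT LEADER, while monitor 3 itself is a FOLLOWER

Node 2's raft log shows nothing related to this; the role simply changed to NOT LEADER. How this role is defined and determined still needs further study and is not yet clear, but it does not affect the description of the Raft election flow

Node 3 raft log

This is node 3's own raft log; it is in fact also a follower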
2022-04-09 08:33:50.891 [INFO] raft P0000008170 T0000000000000008170  ECS EP XSITE POOL : guid [940642] HB interval [60s]
2022-04-09 08:33:50.893 [INFO] raft P0000008170 T0000000000000008170  ECS AP XSITE POOL : guid [456120] HB interval [60s]
2022-04-09 08:33:50.974 [INFO] raft P0000008170 T0000000000000008174  xmal_cache_esite site(0x7fe994002098) site_type(1) esite_guid(688974)
2022-04-09 08:33:50.974 [INFO] raft P0000008170 T0000000000000008174  xmal_ep2ap_conn_process success, inout_type(1) esite_guid(688974) asite(22557168268799) asite_type(0)
2022-04-09 08:33:50.975 [INFO] raft P0000008170 T0000000000000008174  xmal_ep2ap_conn_process success, inout_type(0) esite_guid(688974) asite(22557168268799) asite_type(0)
2022-04-09 08:33:50.976 [INFO] raft P0000008170 T0000000000000008186  raft[3] becoming follower
2022-04-09 08:33:50.976 [ERROR] raft P0000008170 T0000000000000008186  raft[3] AppendEntries no log at prev_idx 26290
2022-04-09 08:33:51.073 [ERROR] raft P0000008170 T0000000000000008186  raft[3] AppendEntries no log at prev_idx 26290
2022-04-09 08:33:51.172 [ERROR] raft P0000008170 T0000000000000008186  raft[3] AppendEntries no log at prev_idx 26290
2022-04-09 08:33:51.272 [ERROR] raft P0000008170 T0000000000000008186  raft[3] AppendEntries no log at prev_idx 26290
2022-04-09 08:33:51.371 [ERROR] raft P0000008170 T0000000000000008186  raft[3] AppendEntries no log at prev_idx 26290
2022-04-09 08:33:51.471 [ERROR] raft P0000008170 T0000000000000008186  raft[3] AppendEntries no log at prev_idx 26290
2022-04-09 08:33:51.570 [ERROR] raft P0000008170 T0000000000000008186  raft[3] AppendEntries no log at prev_idx 26290
2022-04-09 08:33:51.669 [ERROR] raft P0000008170 T0000000000000008186  raft[3] AppendEntries no log at prev_idx 26290
2022-04-09 08:33:51.768 [ERROR] raft P0000008170 T0000000000000008186  raft[3] AppendEntries no log at prev_idx 26290
2022-04-09 08:33:51.867 [ERROR] raft P0000008170 T0000000000000008186  raft[3] AppendEntries no log at prev_idx 26290
2022-04-09 08:33:51.967 [ERROR] raft P0000008170 T0000000000000008186  raft[3] AppendEntries no log at prev_idx 26290
2022-04-09 08:33:52.067 [ERROR] raft P0000008170 T0000000000000008186  raft[3] AppendEntries no log at prev_idx 26290
2022-04-09 08:33:52.166 [ERROR] raft P0000008170 T0000000000000008186  raft[3] AppendEntries no log at prev_idx 26290
2022-04-09 08:33:52.266 [ERROR] raft P0000008170 T0000000000000008186  raft[3] AppendEntries no log at prev_idx 26290
2022-04-09 08:33:52.365 [ERROR] raft P0000008170 T0000000000000008186  raft[3] AppendEntries no log at prev_idx 26290
2022-04-09 08:33:52.465 [ERROR] raft P0000008170 T0000000000000008186  raft[3] AppendEntries no log at prev_idx 26290
2022-04-09 08:33:52.566 [ERROR] raft P0000008170 T0000000000000008186  raft[3] AppendEntries no log at prev_idx 26290
2022-04-09 08:33:52.666 [ERROR] raft P0000008170 T0000000000000008186  raft[3] AppendEntries no log at prev_idx 26290
2022-04-09 08:33:52.765 [INFO] raft P0000008170 T0000000000000008186  raft[3] raft_process_snapshot from node[1]
2022-04-09 08:33:52.765 [INFO] raft P0000008170 T0000000000000008186  raft[3] becoming follower
2022-04-09 08:33:52.766 [INFO] raft P0000008170 T0000000000000008186  raft[3] node: 1 snapshot replying: 471
2022-04-09 08:33:57.667 [INFO] raft P0000008170 T0000000000000008186  Extend rflog from 10 to 20
2022-04-09 09:33:24.967 [INFO] raft P0000008170 T0000000000000008177  raft[3] election starting: 2559 954977, term: 471, currentIdx: 31490
2022-04-09 09:33:24.967 [INFO] raft P0000008170 T0000000000000008177  raft[3] becoming candidate
2022-04-09 09:33:24.967 [INFO] raft P0000008170 T0000000000000008175  raft[3] sending requestVote to 1, currentTerm: 472, last_index: 31490, last_term: 471
2022-04-09 09:33:24.968 [INFO] raft P0000008170 T0000000000000008176  raft[3] sending requestVote to 2, currentTerm: 472, last_index: 31490, last_term: 471
2022-04-09 09:33:24.990 [INFO] raft P0000008170 T0000000000000008175  xlnk_ep2ap_conn_create success, site(35089882808321) site_type(0) conn_type(0) address(192.168.56.24:8339) guid(940642) n_lnk(1)
2022-04-09 09:33:24.990 [INFO] raft P0000008170 T0000000000000008175  xmal_cache_asite site(0x7fe99a7057e8) site_type(0) address(192.168.56.24:8339) guid(940642)
2022-04-09 09:33:24.992 [INFO] raft P0000008170 T0000000000000008175  raft[3] node[1] responded to requestvote status: not granted
2022-04-09 09:33:24.992 [INFO] raft P0000008170 T0000000000000008176  xlnk_ep2ap_conn_create success, site(35089882808322) site_type(0) conn_type(0) address(192.168.56.8:8340) guid(940642) n_lnk(1)
2022-04-09 09:33:24.992 [INFO] raft P0000008170 T0000000000000008176  xmal_cache_asite site(0x7fe99a6047e8) site_type(0) address(192.168.56.8:8340) guid(940642)
2022-04-09 09:33:25.000 [INFO] raft P0000008170 T0000000000000008176  raft[3] node[2] responded to requestvote status: not granted
2022-04-09 09:33:25.012 [INFO] raft P0000008170 T0000000000000008186  raft[3] becoming follower
2022-04-09 09:33:25.012 [ERROR] raft P0000008170 T0000000000000008186  raft[3] AppendEntries no log at prev_idx 31491

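This looks almost identical to node 2's flow. The only difference is that all three nodes can now reach each other, so node 3 receives vote results from both node 1 and node 2. Node 1 returns not granted, node 2 returns not granted, and node 3 ends up as a FOLLOWER

Node 3 port status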
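For reference, the vote-granting rule from the Raft paper can be sketched as follows. Whether DM8 applies exactly this rule is not confirmed by anything in the logs, so the voter states below are hypothetical values chosen for illustration; only the candidate's numbers (term 472, last_index 31490, last_term 471) come from node 3's log.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VoterState:
    current_term: int
    voted_for: Optional[int]
    last_log_term: int
    last_log_index: int


def grant_vote(voter: VoterState, cand_id: int, cand_term: int,
               cand_last_term: int, cand_last_index: int) -> bool:
    """Vote rule as described in the Raft paper (not verified against DM8)."""
    if cand_term < voter.current_term:
        return False  # stale candidate
    # Within the same term a voter may only vote once.
    if cand_term == voter.current_term and voter.voted_for not in (None, cand_id):
        return False
    # The candidate's log must be at least as up to date as the voter's:
    # compare last term first, then last index.
    return (cand_last_term, cand_last_index) >= (
        voter.last_log_term, voter.last_log_index)


# Hypothetical voter that is one entry ahead of node 3's last_index.
ahead = VoterState(current_term=471, voted_for=1,
                   last_log_term=471, last_log_index=31491)
print(grant_vote(ahead, cand_id=3, cand_term=472,
                 cand_last_term=471, cand_last_index=31490))

# Hypothetical freshly started voter with an empty log.
fresh = VoterState(current_term=0, voted_for=None,
                   last_log_term=0, last_log_index=0)
print(grant_vote(fresh, cand_id=1, cand_term=471,
                 cand_last_term=305, cand_last_index=20253))
```

Under this rule a voter whose log is ahead of the candidate's refuses the vote even when the candidate's term is higher, which would be one possible reading of the two not granted replies. The logs alone do not establish the actual reason.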

[root@dmdsc0 log]# netstat -an|grep 52142
tcp        0      0 10.30.5.17:39666        10.30.5.18:52142        ESTABLISHED
[root@dmdsc0 log]# netstat -an|grep 52141
tcp6       0      0 :::52141                :::*                    LISTEN     
tcp6       0      0 192.168.56.7:52141      192.168.56.24:37838     ESTABLISHED

This again shows that a non-LEADER node does not connect to the DW

Likewise, capture packets on the MONITOR port configured for node 3

[root@dmdsc0 log]# tcpdump -s0 -e -nn -vvv -i enp0s3 port 8341 -X -xx -w 1
tcpdump: listening on enp0s3, link-type EN10MB (Ethernet), capture size 262144 bytes
176 packets captured
204 packets received by filter
0 packets dropped by kernel

08:46:00.959030 08:00:27:77:7d:83 > 08:00:27:0d:d8:6b, ethertype IPv4 (0x0800), length 2032: (tos 0x0, ttl 64, id 42008, offset 0, flags [DF], proto TCP (6), length 2018)
    192.168.56.24.58430 > 192.168.56.7.8341: Flags [P.], cksum 0xf944 (incorrect -> 0x9a4f), seq 9524:11490, ack 1, win 58, options [nop,nop,TS val 30763267 ecr 35247702], length 1966
	0x0000:  4500 07e2 a418 4000 4006 9d8d c0a8 3818  E.....@.@.....8.
	0x0010:  c0a8 3807 e43e 2095 81dc 3260 2e5b ea6f  ..8..>....2`.[.o
	0x0020:  8018 003a f944 0000 0101 080a 01d5 6903  ...:.D........i.
	0x0030:  0219 d656 ae07 0000 6f00 0100 0000 0000  ...V....o.......
	0x0040:  b200 0000 0000 8a76 7781 2eb2 0000 d701  .......vw.......
	0x0050:  0000 0000 0000 0100 0000 5a6c 0000 0000  ..........Zl....
	0x0060:  0000 d701 0000 0000 0000 0100 0000 5a6c  ..............Zl
	0x0070:  0000 0000 0000 5b6c 0000 0000 0000 d701  ......[l........
	0x0080:  0000 0000 0000 0000 0000 5407 0000 4752  ..........T...GR
	0x0090:  5031 0000 0000 2d73 7d56 e27f 0000 0000  P1....-s}V......
	0x00a0:  0000 0000 0000 0000 0000 3807 0000 902d  ..........8....-
	0x00b0:  00d3 ea06 0047 5250 3100 0000 0000 0000  .....GRP1.......
	0x00c0:  0000 0000 0000 3bd4 0132 0700 0000 0000  ......;..2......
	0x00d0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
	0x00e0:  0000 0000 0000 0000 0000 0a00 4752 5031  ............GRP1
	0x00f0:  5f52 545f 3031 3230 3232 2d30 342d 3039  _RT_012022-04-09
	0x0100:  2030 383a 3436 3a30 3020 0000 0000 0100  .08:46:00.......
	0x0110:  0000 0000 0000 0000 0000 0000 6400 0000  ............d...
	0x0120:  0100 0102 0a00 3c00 0000 0a00 0100 1800  ......<.........
	0x0130:  2f6f 7074 2f72 745f 3031 2f44 414d 454e  /opt/rt_01/DAMEN
	0x0140:  472f 646d 2e69 6e69 0000 1c00 2f6f 7074  G/dm.ini..../opt
	0x0150:  2f64 7363 2f64 6d64 626d 732f 6269 6e2f  /dsc/dmdbms/bin/
	0x0160:  646d 7365 7276 6572 0000 0000 0000 0000  dmserver........
	0x0170:  0100 0000 0000 0000 0000 0100 0000 32c4  ..............2.
	0x0180:  5062 0000 0000 0000 0000 0300 0000 ffff  Pb..............
	0x0190:  ffff 7d00 5072 696d 6172 7920 696e 7374  ..}.Primary.inst
	0x01a0:  616e 6365 2847 5250 315f 5254 5f30 3129  ance(GRP1_RT_01)
	0x01b0:  2061 7263 6820 7374 6174 7573 2074 6f20  .arch.status.to.
	0x01c0:  696e 7374 616e 6365 2847 5250 315f 5254  instance(GRP1_RT
	0x01d0:  5f30 3229 2069 7320 5641 4c49 442c 2072  _02).is.VALID,.r
	0x01e0:  6563 6f76 6572 7920 6f66 2069 6e73 7461  ecovery.of.insta
	0x01f0:  6e63 6528 4752 5031 5f52 545f 3032 2920  nce(GRP1_RT_02).
	0x0200:  6973 206e 6f74 206e 6563 6573 7361 7279  is.not.necessary
	0x0210:  2101 000a 0047 5250 315f 5254 5f30 3101  !....GRP1_RT_01.
	0x0220:  0400 8d7d 0000 0000 00e6 2200 0000 0000  ...}......".....
	0x0230:  005e cb00 0000 0000 00e6 2200 0000 0000  .^........".....
	0x0240:  005f cb00 0000 0000 0000 0000 00ff ffff  ._..............
	0x0250:  ffff ffff ff0c 0031 3932 2e31 3638 2e35  .......192.168.5
	0x0260:  362e 3702 0001 0000 0147 5250 315f 5254  6.7......GRP1_RT
	0x0270:  5f30 3200 0000 0000 0047 5250 315f 5254  _02......GRP1_RT
	0x0280:  5f30 3200 0000 0000 0001 0000 0000 0000  _02.............
	0x0290:  0000 0000 0000 0000 4500 7365 6e64 2061  ........E.send.a
	0x02a0:  7263 6820 746f 2073 6974 6528 4752 5031  rch.to.site(GRP1
	0x02b0:  5f52 545f 3032 2920 7375 6363 6573 732c  _RT_02).success,
	0x02c0:  2062 6567 696e 206c 736e 3a35 3230 3632  .begin.lsn:52062
	0x02d0:  2c20 656e 6420 6c73 6e3a 3532 3036 3240  ,.end.lsn:52062@
	0x02e0:  0000 0058 0600 0000 0000 0060 af0f 0000  ...X.......`....
	0x02f0:  0000 00b2 0b00 0000 0000 0092 281b 0000  ............(...
	0x0300:  0000 00e6 1600 0000 0000 0007 d450 6200  .............Pb.
	0x0310:  0000 004e 0500 0004 8802 005e cb00 0000  ...N.......^....
	0x0320:  0000 0000 8100 0000 0000 0040 0000 0000  ...........@....
	0x0330:  0000 002f c300 0000 0000 0004 0200 0001  .../............
	0x0340:  0000 00ce 0300 0000 0000 005e cb00 0000  ...........^....
	0x0350:  0000 005e cb00 0000 0000 0045 d750 6200  ...^.......E.Pb.
	0x0360:  0000 0045 d750 6200 0000 0000 0040 002a  ...E.Pb......@.*
	0x0370:  b795 3dfd 6851 00bb 62c6 2b92 7b4e ec20  ..=.hQ..b.+.{N..
	0x0380:  dc67 094d 0469 fdb7 02e6 896c 23b5 c146  .g.M.i.....l#..F
	0x0390:  45ab 35fc 16e9 8926 f4ac c137 6323 63bb  E.5....&...7c#c.
	0x03a0:  ccc9 3055 9818 9460 7027 a8b8 4f49 0000  ..0U...`p'..OI..
	0x03b0:  0001 0100 0001 0001 08ef 9626 fdf8 403f  ...........&..@?
	0x03c0:  0100 fdf8 403f 0000 e622 0000 0000 0000  ....@?..."......
	0x03d0:  5ecb 0000 0000 0000 0000 4a03 0000 0000  ^.........J.....
	0x03e0:  0009 000c 0047 5250 315f 5254 5f30 315f  .....GRP1_RT_01_
	0x03f0:  3101 0000 0000 0000 00e6 0704 0600 0000  1...............
	0x0400:  0000 00e8 0300 4752 5031 5f52 545f 3031  ......GRP1_RT_01
	0x0410:  0000 0000 0000 4752 5031 5f52 545f 3031  ......GRP1_RT_01
	0x0420:  0000 0000 0000 fdf8 403f fdf8 403f 0100  ........@?..@?..
	0x0430:  ab0e 0000 0000 0000 9d53 0000 0000 0000  .........S......
	0x0440:  0c00 4752 5031 5f52 545f 3031 5f32 0200  ..GRP1_RT_01_2..
	0x0450:  0000 0000 0000 e607 0406 0932 2200 0000  ...........2"...
	0x0460:  e803 0047 5250 315f 5254 5f30 3100 0000  ...GRP1_RT_01...
	0x0470:  0000 0047 5250 315f 5254 5f30 3100 0000  ...GRP1_RT_01...
	0x0480:  0000 00fd f840 3ffd f840 3f01 0098 1100  .....@?..@?.....
	0x0490:  0000 0000 00b7 8e00 0000 0000 000c 0047  ...............G
	0x04a0:  5250 315f 5254 5f30 315f 3303 0000 0000  RP1_RT_01_3.....
	0x04b0:  0000 00e6 0704 0611 0315 0000 00e8 0301  ................
	0x04c0:  4752 5031 5f52 545f 3031 0000 0000 0000  GRP1_RT_01......
	0x04d0:  4752 5031 5f52 545f 3031 0000 0000 0000  GRP1_RT_01......
	0x04e0:  fdf8 403f fdf8 403f 0100 c311 0000 0000  ..@?..@?........
	0x04f0:  0000 af94 0000 0000 0000 0c00 4752 5031  ............GRP1
	0x0500:  5f52 545f 3032 5f34 0400 0000 0000 0000  _RT_02_4........
	0x0510:  e607 0406 1108 3200 0000 e803 0147 5250  ......2......GRP
	0x0520:  315f 5254 5f30 3100 0000 0000 0047 5250  1_RT_01......GRP
	0x0530:  315f 5254 5f30 3200 0000 0000 00fd f840  1_RT_02........@
	0x0540:  3fb4 9d53 7d01 0033 1200 0000 0000 0005  ?..S}..3........
	0x0550:  9a00 0000 0000 000c 0047 5250 315f 5254  .........GRP1_RT
	0x0560:  5f30 315f 3505 0000 0000 0000 00e6 0704  _01_5...........
	0x0570:  0611 0b1b 0000 00e8 0301 4752 5031 5f52  ..........GRP1_R
	0x0580:  545f 3032 0000 0000 0000 4752 5031 5f52  T_02......GRP1_R
	0x0590:  545f 3031 0000 0000 0000 b49d 537d fdf8  T_01........S}..
	0x05a0:  403f 0100 6412 0000 0000 0000 839f 0000  @?..d...........
	0x05b0:  0000 0000 0c00 4752 5031 5f52 545f 3031  ......GRP1_RT_01
	0x05c0:  5f36 0600 0000 0000 0000 e607 0406 1120  _6..............
	0x05d0:  0d00 0000 e803 0147 5250 315f 5254 5f30  .......GRP1_RT_0
	0x05e0:  3100 0000 0000 0047 5250 315f 5254 5f30  1......GRP1_RT_0
	0x05f0:  3100 0000 0000 00fd f840 3ffd f840 3f01  1........@?..@?.
	0x0600:  0002 1400 0000 0000 004e a700 0000 0000  .........N......
	0x0610:  000c 0047 5250 315f 5254 5f30 325f 3707  ...GRP1_RT_02_7.
	0x0620:  0000 0000 0000 00e6 0704 0615 3310 0000  ............3...
	0x0630:  00e8 0301 4752 5031 5f52 545f 3031 0000  ....GRP1_RT_01..
	0x0640:  0000 0000 4752 5031 5f52 545f 3032 0000  ....GRP1_RT_02..
	0x0650:  0000 0000 fdf8 403f b49d 537d 0100 d516  ......@?..S}....
	0x0660:  0000 0000 0000 03af 0000 0000 0000 0c00  ................
	0x0670:  4752 5031 5f52 545f 3031 5f38 0800 0000  GRP1_RT_01_8....
	0x0680:  0000 0000 e607 0407 081a 2100 0000 e803  ..........!.....
	0x0690:  0147 5250 315f 5254 5f30 3200 0000 0000  .GRP1_RT_02.....
	0x06a0:  0047 5250 315f 5254 5f30 3100 0000 0000  .GRP1_RT_01.....
	0x06b0:  00b4 9d53 7dfd f840 3f01 0060 1a00 0000  ...S}..@?..`....
	0x06c0:  0000 00b5 b800 0000 0000 000c 0047 5250  .............GRP
	0x06d0:  315f 5254 5f30 315f 3909 0000 0000 0000  1_RT_01_9.......
	0x06e0:  00e6 0704 0907 181d 0000 00e8 0301 4752  ..............GR
	0x06f0:  5031 5f52 545f 3031 0000 0000 0000 4752  P1_RT_01......GR
	0x0700:  5031 5f52 545f 3031 0000 0000 0000 fdf8  P1_RT_01........
	0x0710:  403f fdf8 403f 0100 961c 0000 0000 0000  @?..@?..........
	0x0720:  c9bf 0000 0000 0000 0000 0200 0101 7700  ..............w.
	0x0730:  0100 012e b200 0000 0000 0032 3032 322d  ...........2022-
	0x0740:  3034 2d30 3920 3037 3a34 323a 3531 2000  04-09.07:42:51..
	0x0750:  0000 003a 3a66 6666 663a 3139 322e 3136  ...::ffff:192.16
	0x0760:  382e 3536 2e32 3400 0000 0000 0000 0000  8.56.24.........
	0x0770:  0000 0000 0000 0000 0000 0000 0000 0000  ................
	0x0780:  0000 0000 0000 0000 0000 0000 0000 0000  ................
	0x0790:  0000 0012 0044 4d4d 4f4e 4954 4f52 5b34  .....DMMONITOR[4
	0x07a0:  2e30 5d20 5638 0a00 0000 002f 0049 6e73  .0].V8...../.Ins
	0x07b0:  7461 6e63 6528 4752 5031 5f52 545f 3031  tance(GRP1_RT_01
	0x07c0:  2920 6973 2061 6c72 6561 6479 2069 6e20  ).is.already.in.
	0x07d0:  4f70 656e 2073 7461 7475 7321 0000 0000  Open.status!....
	0x07e0:  0000                                     ..

This node also receives the status information sent by the LEADER. It is the same as before, so it is not repeated here.
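
To read a capture like the one above without scanning the ASCII column by eye, the hex columns can be reassembled into bytes and searched for printable runs. The following is a minimal sketch, assuming the capture was printed in `tcpdump -X` style (offset, hex groups, ASCII column). The helper names `dump_to_bytes` and `printable_strings` are made up for illustration, and the sample is the tail of the dump above.

```python
import re

# One dump line: offset, then 2- or 4-digit hex groups, each followed by one space.
# The ASCII column is separated by two spaces, so the match stops before it.
HEX_LINE = re.compile(r"^\s*0x[0-9a-fA-F]+:\s+((?:(?:[0-9a-fA-F]{2}){1,2}(?: |$))+)")


def dump_to_bytes(dump: str) -> bytes:
    """Concatenate the hex columns of every dump line into raw bytes."""
    out = bytearray()
    for line in dump.splitlines():
        m = HEX_LINE.match(line)
        if m:
            out += bytes.fromhex(m.group(1).replace(" ", ""))
    return bytes(out)


def printable_strings(data: bytes, min_len: int = 6) -> list:
    """Return runs of at least `min_len` printable ASCII bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


SAMPLE = """
    0x0790:  0000 0012 0044 4d4d 4f4e 4954 4f52 5b34  .....DMMONITOR[4
    0x07a0:  2e30 5d20 5638 0a00 0000 002f 0049 6e73  .0].V8...../.Ins
    0x07b0:  7461 6e63 6528 4752 5031 5f52 545f 3031  tance(GRP1_RT_01
    0x07c0:  2920 6973 2061 6c72 6561 6479 2069 6e20  ).is.already.in.
    0x07d0:  4f70 656e 2073 7461 7475 7321 0000 0000  Open.status!....
    0x07e0:  0000                                     ..
"""

for text in printable_strings(dump_to_bytes(SAMPLE)):
    print(text)
```

On this sample the sketch prints `DMMONITOR[4.0] V8` and `Instance(GRP1_RT_01) is already in Open status!`. In the sample, the two bytes before each string (`12 00` and `2f 00`) match the string lengths 18 (counting the trailing newline) and 47 when read as little-endian integers, which hints at length-prefixed fields. That reading is an observation on this capture only, not a documented wire format.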

Overall state

At this point, each of the three nodes shows different state information.

Node 1

show state    
2022-04-10 06:33:16 
#--------------------------------------------------------------------------------#
GET MONITOR STATE FROM MONITOR SYSTEM, THE FIRST LINE IS SELF INFO.
MON_BRO_INTERVAL: 99 ms, MON_VOTE_INTERVAL: 2623 ms

MON_NAME       MON_STATE      ID             MON_ROLE       MON_IP                   MON_PORT       
MON1           Active         1              LEADER         192.168.56.24            8339           
MON2           Active         2              NOT LEADER     192.168.56.8             8340           
MON3           Active         3              NOT LEADER     192.168.56.7             8341           
#--------------------------------------------------------------------------------#

Node 2

show state
2022-04-10 06:33:27 
#--------------------------------------------------------------------------------#
GET MONITOR STATE FROM MONITOR SYSTEM, THE FIRST LINE IS SELF INFO.
MON_BRO_INTERVAL: 99 ms, MON_VOTE_INTERVAL: 2419 ms

MON_NAME       MON_STATE      ID             MON_ROLE       MON_IP                   MON_PORT       
MON2           Active         2              FOLLOWER       192.168.56.8             8340           
MON1           Active         1              LEADER         192.168.56.24            8339           
MON3           Active         3              NOT LEADER     192.168.56.7             8341           
#--------------------------------------------------------------------------------#

Node 3

show state
2022-04-10 06:33:32 
#--------------------------------------------------------------------------------#
GET MONITOR STATE FROM MONITOR SYSTEM, THE FIRST LINE IS SELF INFO.
MON_BRO_INTERVAL: 99 ms, MON_VOTE_INTERVAL: 1674 ms

MON_NAME       MON_STATE      ID             MON_ROLE       MON_IP                   MON_PORT       
MON3           Active         3              FOLLOWER       192.168.56.7             8341           
MON1           Active         1              LEADER         192.168.56.24            8339           
MON2           Active         2              NOT LEADER     192.168.56.8             8340           
#--------------------------------------------------------------------------------#

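Here we do not see the two FOLLOWERs that ordinary raft would show. Instead, a NOT LEADER role is always present. The three nodes agree on who the LEADER is, but they disagree on which nodes are FOLLOWER and which are NOT LEADER: apart from the LEADER, every node marks itself as FOLLOWER, as if each had voted for itself. Packet captures on the MONITOR ports show that FOLLOWER and NOT LEADER nodes both receive messages from the LEADER, so in terms of the standard raft roles both appear to be treated as FOLLOWERs.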
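
The disagreement between the three listings is easier to see when they are compared programmatically. The sketch below is an illustration only: `parse_show_state` and `summarize` are hypothetical helpers, not part of any DM8 tool, and the rows are copied from the three `show state` outputs above with the banner lines dropped.

```python
import re

# name, state, id, role (may contain a space, e.g. "NOT LEADER"), ip, port
ROW = re.compile(
    r"^(MON\d+)\s+(\S+)\s+(\d+)\s+(.+?)\s+(\d+\.\d+\.\d+\.\d+)\s+(\d+)\s*$"
)


def parse_show_state(text):
    """Return [(name, role), ...]; per the banner, the first row is the reporter itself."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if m:
            rows.append((m.group(1), m.group(4)))
    return rows


VIEWS = {
    "MON1": """
        MON1  Active  1  LEADER      192.168.56.24  8339
        MON2  Active  2  NOT LEADER  192.168.56.8   8340
        MON3  Active  3  NOT LEADER  192.168.56.7   8341
    """,
    "MON2": """
        MON2  Active  2  FOLLOWER    192.168.56.8   8340
        MON1  Active  1  LEADER      192.168.56.24  8339
        MON3  Active  3  NOT LEADER  192.168.56.7   8341
    """,
    "MON3": """
        MON3  Active  3  FOLLOWER    192.168.56.7   8341
        MON1  Active  1  LEADER      192.168.56.24  8339
        MON2  Active  2  NOT LEADER  192.168.56.8   8340
    """,
}


def summarize(views):
    """Collect who is called LEADER, how each node labels itself, and how it labels others."""
    leaders, self_roles, remote_roles = set(), {}, set()
    for text in views.values():
        rows = parse_show_state(text)
        self_name, self_role = rows[0]
        self_roles[self_name] = self_role
        leaders.update(name for name, role in rows if role == "LEADER")
        remote_roles.update(role for _, role in rows[1:])
    return leaders, self_roles, remote_roles


leaders, self_roles, remote_roles = summarize(VIEWS)
print(leaders, self_roles, sorted(remote_roles))
```

In these three samples every node names MON1 as LEADER, each non-leader labels itself FOLLOWER, and FOLLOWER never appears in a row describing another node. Whether NOT LEADER is simply how a monitor displays a remote node that is not the LEADER is not confirmed by anything in this test.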

Shutting down node 1

Node 2 changes from NOT LEADER (as the other nodes had listed it) to FOLLOWER

show state
2022-04-10 06:43:40 
#--------------------------------------------------------------------------------#
GET MONITOR STATE FROM MONITOR SYSTEM, THE FIRST LINE IS SELF INFO.
MON_BRO_INTERVAL: 99 ms, MON_VOTE_INTERVAL: 2356 ms

MON_NAME       MON_STATE      ID             MON_ROLE       MON_IP                   MON_PORT       
MON2           Active         2              FOLLOWER       192.168.56.8             8340           
MON1           Active         1              NOT LEADER     192.168.56.24            8339           
MON3           Active         3              LEADER         192.168.56.7             8341           
#--------------------------------------------------------------------------------#

Node 3 changes from FOLLOWER to LEADER

show state
2022-04-10 06:43:56 
#--------------------------------------------------------------------------------#
GET MONITOR STATE FROM MONITOR SYSTEM, THE FIRST LINE IS SELF INFO.
MON_BRO_INTERVAL: 99 ms, MON_VOTE_INTERVAL: 2652 ms

MON_NAME       MON_STATE      ID             MON_ROLE       MON_IP                   MON_PORT       
MON3           Active         3              LEADER         192.168.56.7             8341           
MON1           Active         1              NOT LEADER     192.168.56.24            8339           
MON2           Active         2              NOT LEADER     192.168.56.8             8340           
#--------------------------------------------------------------------------------#

Restoring node 1

Node 1 becomes the new FOLLOWER

show state
2022-04-10 06:54:04 
#--------------------------------------------------------------------------------#
GET MONITOR STATE FROM MONITOR SYSTEM, THE FIRST LINE IS SELF INFO.
MON_BRO_INTERVAL: 99 ms, MON_VOTE_INTERVAL: 2395 ms

MON_NAME       MON_STATE      ID             MON_ROLE       MON_IP                   MON_PORT       
MON1           Active         1              FOLLOWER       192.168.56.24            8339           
MON2           Active         2              NOT LEADER     192.168.56.8             8340           
MON3           Active         3              LEADER         192.168.56.7             8341           
#--------------------------------------------------------------------------------#

Node 2 becomes NOT LEADER in node 1's listing, although its own output below still shows it as FOLLOWER

show state
2022-04-10 06:54:11 
#--------------------------------------------------------------------------------#
GET MONITOR STATE FROM MONITOR SYSTEM, THE FIRST LINE IS SELF INFO.
MON_BRO_INTERVAL: 99 ms, MON_VOTE_INTERVAL: 2356 ms

MON_NAME       MON_STATE      ID             MON_ROLE       MON_IP                   MON_PORT       
MON2           Active         2              FOLLOWER       192.168.56.8             8340           
MON1           Active         1              NOT LEADER     192.168.56.24            8339           
MON3           Active         3              LEADER         192.168.56.7             8341           
#--------------------------------------------------------------------------------#

Node 3 remains LEADER

show state
2022-04-10 06:54:15 
#--------------------------------------------------------------------------------#
GET MONITOR STATE FROM MONITOR SYSTEM, THE FIRST LINE IS SELF INFO.
MON_BRO_INTERVAL: 99 ms, MON_VOTE_INTERVAL: 2652 ms

MON_NAME       MON_STATE      ID             MON_ROLE       MON_IP                   MON_PORT       
MON3           Active         3              LEADER         192.168.56.7             8341           
MON1           Active         1              NOT LEADER     192.168.56.24            8339           
MON2           Active         2              NOT LEADER     192.168.56.8             8340           
#--------------------------------------------------------------------------------#

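Some inferences

From the behavior observed above, we can infer a few characteristics of the raft protocol in the DM8 monitor:

  • It follows the standard raft protocol's implementation and basic algorithm.
  • In addition to the three standard raft roles, it adds a role named NOT LEADER, which behaves no differently from FOLLOWER: it passively receives the log information sent by the LEADER and takes part in voting on equal terms.
  • When a NOT LEADER is promoted, it seems it can only become a FOLLOWER and never a LEADER (as the name suggests?).
  • The most recently joined valid node becomes a FOLLOWER, and the previous FOLLOWER is demoted to NOT LEADER (reason unknown).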

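Summary

Based on the tests in this article, the workflow of the DM8 monitor raft protocol can be roughly summarized as follows:
Without changing the original working mode in which a single MONITOR connects to the DW, a raft election process is added at the MONITOR layer. Through the LOG APPEND flow of the standard raft protocol, log entries flow one way from the LEADER to the FOLLOWER/NOT LEADER nodes. On the one hand, this lets the FOLLOWERs monitor whether the LEADER is alive, so that they can start a new round of voting when the term times out with no LEADER. On the other hand, when a MONITOR takes over after a failure, the last LOG can serve as the previous state and be compared against the information currently obtained from the DW, yielding a sound switchover decision.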
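
The liveness half of that summary (no heartbeat from the LEADER, timeout, new vote) can be sketched with textbook raft rules. This is not DM8's implementation: the `Monitor` class and both functions are hypothetical, log comparison is omitted, and the timeouts are the `MON_VOTE_INTERVAL` values that `show state` reported in the first set of listings (2623, 2419 and 1674 ms). Treating those values as randomized election timeouts is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Monitor:
    mon_id: int
    timeout_ms: int                 # election timeout (assumed meaning of MON_VOTE_INTERVAL)
    term: int = 1
    role: str = "FOLLOWER"
    last_heartbeat_ms: int = 0
    votes: dict = field(default_factory=dict)   # term -> id this monitor voted for

    def on_heartbeat(self, term: int, now_ms: int) -> None:
        """A heartbeat (empty log append) from a current LEADER resets the timer."""
        if term >= self.term:
            self.term, self.role, self.last_heartbeat_ms = term, "FOLLOWER", now_ms

    def timed_out(self, now_ms: int) -> bool:
        return self.role != "LEADER" and now_ms - self.last_heartbeat_ms >= self.timeout_ms

    def request_vote(self, term: int, candidate_id: int) -> bool:
        """Grant at most one vote per term (log up-to-date check omitted)."""
        if term < self.term:
            return False
        if term > self.term:
            self.term, self.role = term, "FOLLOWER"
        return self.votes.setdefault(term, candidate_id) == candidate_id


def run_election(candidate: Monitor, live_peers: list, cluster_size: int) -> bool:
    """The candidate bumps its term, votes for itself, and needs a majority of the whole cluster."""
    candidate.term += 1
    candidate.role = "CANDIDATE"
    candidate.votes[candidate.term] = candidate.mon_id
    granted = 1 + sum(p.request_vote(candidate.term, candidate.mon_id) for p in live_peers)
    if granted > cluster_size // 2:
        candidate.role = "LEADER"
    return candidate.role == "LEADER"


def simulate_leader_loss(timeouts_ms: dict, dead_id: int):
    """The LEADER `dead_id` stops sending heartbeats at t=0; return (candidate, won)."""
    live = [Monitor(i, t) for i, t in timeouts_ms.items() if i != dead_id]
    candidate = min(live, key=lambda m: m.timeout_ms)   # the first timer to expire
    assert candidate.timed_out(candidate.timeout_ms)
    peers = [m for m in live if m is not candidate]
    return candidate, run_election(candidate, peers, cluster_size=len(timeouts_ms))


winner, won = simulate_leader_loss({1: 2623, 2: 2419, 3: 1674}, dead_id=1)
print(winner.mon_id, winner.role, winner.term, won)
```

With these inputs monitor 3 has the shortest timeout, stands for election first, and wins with two votes out of three. That coincides with MON3 becoming LEADER in the test, which is suggestive rather than proof of how DM8 picks the candidate. The majority is counted against the full cluster size, so a single surviving monitor could never elect itself, which is the property that prevents split-brain.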

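Open questions

The cause and definition of the NOT LEADER role are still unclear. It is probably related to some RANK mechanism that the raft protocol defines among the nodes. If I get the chance to work this out, I will add more later.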

Dameng Cloud Adaptation Technology Community
https://eco.dameng.com/
