Redis碎碎念

最新推荐文章于 2022-04-29 08:06:39 发布

weixin_34314962

最新推荐文章于 2022-04-29 08:06:39 发布

阅读量107

点赞数

文章标签：数据库运维

Redis Sentinel是Redis的高可用方案。是Redis 2.8中正式引入的。

在之前的主从复制方案中，如果主节点出现问题，需要手动将一个从节点升级为主节点，然后将其它从节点指向新的主节点，并且需要修改应用方主节点的地址。整个过程都需要人工干预。

Sentinel的相关参数

# bind 127.0.0.1 192.168.1.1
# protected-mode no
port 26379
# sentinel announce-ip <ip>
# sentinel announce-port <port>
dir /tmp sentinel monitor mymaster 127.0.0.1 6379 2 # sentinel auth-pass <master-name> <password> sentinel down-after-milliseconds mymaster 30000 sentinel parallel-syncs mymaster 1 sentinel failover-timeout mymaster 180000 # sentinel notification-script mymaster /var/redis/notify.sh # sentinel client-reconfig-script mymaster /var/redis/reconfig.sh sentinel deny-scripts-reconfig yes

其中，

dir：设置Sentinel的工作目录。

sentinel monitor mymaster 127.0.0.1 6379 2：其中2是quorum，即权重，代表至少需要两个Sentinel节点认为主节点主观下线，才可判定主节点为客观下线。一般建议将其设置为Sentinel节点的一半加1。不仅如此，quorum还与Sentinel节点的领导者选举有关。为了选出Sentinel的领导者，至少需要max(quorum, num(sentinels) / 2 + 1)个Sentinel节点参与选举。

sentinel down-after-milliseconds mymaster 30000：每个Sentinel节点都要通过定期发送ping命令来判断Redis节点和其余Sentinel节点是否可达。

如果在指定的时间内，没有收到主节点的有效回复，则判断其为主观下线。需要注意的是，该参数不仅用来判断主节点状态，同样也用来判断该主节点下面的从节点及其它Sentinel的状态。其默认值为30s。

sentinel parallel-syncs mymaster 1：在failover期间，允许多少个slave同时指向新的主节点。如果numslaves设置较大的话，虽然复制操作并不会阻塞主节点，但多个节点同时指向新的主节点，会增加主节点的网络和磁盘IO负载。

sentinel failover-timeout mymaster 180000：定义故障转移超时时间。默认180000，单位秒，即3min。需要注意的是，该时间不是总的故障转移的时间，而是适用于故障转移的多个场景。

# Specifies the failover timeout in milliseconds. It is used in many ways:
#
# - The time needed to re-start a failover after a previous failover was
#   already tried against the same master by a given Sentinel, is two
#   times the failover timeout.
#
# - The time needed for a slave replicating to a wrong master according
#   to a Sentinel current configuration, to be forced to replicate
#   with the right master, is exactly the failover timeout (counting since
#   the moment a Sentinel detected the misconfiguration).
#
# - The time needed to cancel a failover that is already in progress but # did not produced any configuration change (SLAVEOF NO ONE yet not # acknowledged by the promoted slave). # # - The maximum time a failover in progress waits for all the slaves to be # reconfigured as slaves of the new master. However even after this time # the slaves will be reconfigured by the Sentinels anyway, but not with # the exact parallel-syncs progression as specified.

第一种场景：

sentinel notification-script：定义通知脚本，当Sentinel出现WARNING级别的事件时，会调用该脚本，其会传入两个参数：事件类型，事件描述。

sentinel client-reconfig-script：当主节点发生切换时，会调用该参数定义的脚本，其会传入以下参数：<master-name> <role> <state> <from-ip> <from-port> <to-ip> <to-port>

关于脚本，其必须遵循一定的规则。

# SCRIPTS EXECUTION
#
# sentinel notification-script and sentinel reconfig-script are used in order
# to configure scripts that are called to notify the system administrator
# or to reconfigure clients after a failover. The scripts are executed
# with the following rules for error handling:
#
# If script exits with "1" the execution is retried later (up to a maximum
# number of times currently set to 10).
#
# If script exits with "2" (or an higher value) the script execution is # not retried. # # If script terminates because it receives a signal the behavior is the same # as exit code 1. # # A script has a maximum running time of 60 seconds. After this limit is # reached the script is terminated with a SIGKILL and the execution retried.

sentinel deny-scripts-reconfig：不允许使用SENTINEL SET设置notification-script和client-reconfig-script。

Sentinel的常见操作

PING This command simply returns PONG.
SENTINEL masters Show a list of monitored masters and their state.
SENTINEL master <master name> Show the state and info of the specified master.
SENTINEL slaves <master name> Show a list of slaves for this master, and their state.
SENTINEL sentinels <master name> Show a list of sentinel instances for this master, and their state.
SENTINEL get-master-addr-by-name <master name> Return the ip and port number of the master with that name. If a failover is in progress or terminated successfully for this master it returns the address and port of the promoted slave.
SENTINEL reset <pattern> This command will reset all the masters with matching name. The pattern argument is a glob-style pattern. The reset process clears any previous state in a master (including a failover in progress), and removes every slave and sentinel already discovered and associated with the master.
SENTINEL failover <master name> Force a failover as if the master was not reachable, and without asking for agreement to other Sentinels (however a new version of the configuration will be published so that the other Sentinels will update their configurations).
SENTINEL ckquorum <master name> Check if the current Sentinel configuration is able to reach the quorum needed to failover a master, and the majority needed to authorize the failover. This command should be used in monitoring systems to check if a Sentinel deployment is ok.
SENTINEL flushconfig Force Sentinel to rewrite its configuration on disk, including the current Sentinel state. Normally Sentinel rewrites the configuration every time something changes in its state (in the context of the subset of the state which is persisted on disk across restart). However sometimes it is possible that the configuration file is lost because of operation errors, disk failures, package upgrade scripts or configuration managers. In those cases a way to to force Sentinel to rewrite the configuration file is handy. This command works even if the previous configuration file is completely missing.

sentinel masters

输出被监控的主节点的状态信息

127.0.0.1:26379> sentinel masters
1)  1) "name"
    2) "mymaster"
    3) "ip"
    4) "127.0.0.1"
    5) "port"
    6) "6379"
    7) "runid"
    8) "6ab2be5db3a37c10f2473c8fb9daed147a32df3e"
    9) "flags"
   10) "master"
   11) "link-pending-commands"
   12) "0"
   13) "link-refcount"
   14) "1"
   15) "last-ping-sent"
   16) "0"
   17) "last-ok-ping-reply"
   18) "639"
   19) "last-ping-reply"
   20) "639"
   21) "down-after-milliseconds"
   22) "30000"
   23) "info-refresh"
   24) "2075"
   25) "role-reported"
   26) "master"
   27) "role-reported-time"
   28) "759682"
   29) "config-epoch"
   30) "0"
   31) "num-slaves"
   32) "2"
   33) "num-other-sentinels"
   34) "2"
   35) "quorum"
   36) "2"
   37) "failover-timeout"
   38) "180000"
   39) "parallel-syncs"
   40) "1"

View Code

也可单独查看某个主节点的状态

sentinel master mymaster

sentinel slaves mymaster

查看某个主节点slave的状态

127.0.0.1:26379> sentinel slaves mymaster
1)  1) "name"
    2) "127.0.0.1:6380"
    3) "ip"
    4) "127.0.0.1"
    5) "port"
    6) "6380"
    7) "runid"
    8) "983b87fd070c7f052b26f5135bbb30fdeb170a54"
    9) "flags"
   10) "slave"
   11) "link-pending-commands"
   12) "0"
   13) "link-refcount"
   14) "1"
   15) "last-ping-sent"
   16) "0"
   17) "last-ok-ping-reply"
   18) "178"
   19) "last-ping-reply"
   20) "178"
   21) "down-after-milliseconds"
   22) "30000"
   23) "info-refresh"
   24) "6160"
   25) "role-reported"
   26) "slave"
   27) "role-reported-time"
   28) "489019"
   29) "master-link-down-time"
   30) "0"
   31) "master-link-status"
   32) "ok"
   33) "master-host"
   34) "127.0.0.1"
   35) "master-port"
   36) "6379"
   37) "slave-priority"
   38) "100"
   39) "slave-repl-offset"
   40) "70375"
2)  1) "name"
    2) "127.0.0.1:6381"
    3) "ip"
    4) "127.0.0.1"
    5) "port"
    6) "6381"
    7) "runid"
    8) "b88059cce9104dd4e0366afd6ad07a163dae8b15"
    9) "flags"
   10) "slave"
   11) "link-pending-commands"
   12) "0"
   13) "link-refcount"
   14) "1"
   15) "last-ping-sent"
   16) "0"
   17) "last-ok-ping-reply"
   18) "178"
   19) "last-ping-reply"
   20) "178"
   21) "down-after-milliseconds"
   22) "30000"
   23) "info-refresh"
   24) "2918"
   25) "role-reported"
   26) "slave"
   27) "role-reported-time"
   28) "489019"
   29) "master-link-down-time"
   30) "0"
   31) "master-link-status"
   32) "ok"
   33) "master-host"
   34) "127.0.0.1"
   35) "master-port"
   36) "6379"
   37) "slave-priority"
   38) "100"
   39) "slave-repl-offset"
   40) "71040"

View Code

sentinel sentinels mymaster

查看其它Sentinel的状态

127.0.0.1:26379> sentinel sentinels mymaster
1)  1) "name"
    2) "738ccbddaa0d4379d89a147613d9aecfec765bcb"
    3) "ip"
    4) "127.0.0.1"
    5) "port"
    6) "26381"
    7) "runid"
    8) "738ccbddaa0d4379d89a147613d9aecfec765bcb"
    9) "flags"
   10) "sentinel"
   11) "link-pending-commands"
   12) "0"
   13) "link-refcount"
   14) "1"
   15) "last-ping-sent"
   16) "0"
   17) "last-ok-ping-reply"
   18) "475"
   19) "last-ping-reply"
   20) "475"
   21) "down-after-milliseconds"
   22) "30000"
   23) "last-hello-message"
   24) "79"
   25) "voted-leader"
   26) "?"
   27) "voted-leader-epoch"
   28) "0"
2)  1) "name"
    2) "7251bb129ca373ad0d8c7baf3b6577ae2593079f"
    3) "ip"
    4) "127.0.0.1"
    5) "port"
    6) "26380"
    7) "runid"
    8) "7251bb129ca373ad0d8c7baf3b6577ae2593079f"
    9) "flags"
   10) "sentinel"
   11) "link-pending-commands"
   12) "0"
   13) "link-refcount"
   14) "1"
   15) "last-ping-sent"
   16) "0"
   17) "last-ok-ping-reply"
   18) "475"
   19) "last-ping-reply"
   20) "475"
   21) "down-after-milliseconds"
   22) "30000"
   23) "last-hello-message"
   24) "985"
   25) "voted-leader"
   26) "?"
   27) "voted-leader-epoch"
   28) "0"

View Code

sentinel get-master-addr-by-name <master name>

返回指定<master name>主节点的IP地址和端口。如果在进行故障转移，则显示的是新主的信息。

127.0.0.1:26379> sentinel get-master-addr-by-name mymaster
1) "127.0.0.1"
2) "6379"

sentinel reset <pattern>

对符合<pattern>（通配符风格）主节点的配置进行重置。

如果某个slave宕机了，其依然处于sentinel的管理中，所以，在其恢复正常后，其依然会加入到之前的复制环境中，即使配置文件中没有指定slaveof选项。不仅如此，如果主节点宕机了，在其重启后，其默认会作为从节点接入到之前的复制环境中。

但很多时候，我们可能就是想移除old master，slave，这个时候，sentinel reset就派上用场了。其会基于当前主节点的状态，重置其配置（they'll refresh the list of slaves within the next 10 seconds, only adding the ones listed as correctly replicating from the current master INFO output）。关键的是，对于非正常状态的slave，会从当前的配置中剔除。这样，被剔除节点在恢复正常后（注意此时的配置文件，需剔除slaveof的配置），也不会自动加入到之前的复制环境中。

需要注意的是，该命令仅对当前sentinel节点有效，如果要剔除某个节点，需要在所有的sentinel节点上执行reset操作。

sentinel failover <master name>

对指定 <master name> 主节点进行强制故障转移。相对于常规的故障转移，其无需

sentinel ckquorum <master name>

检测当前可达的Sentinel节点总数是否达到<quorum>的个数

127.0.0.1:26379> sentinel ckquorum mymaster
OK 3 usable Sentinels. Quorum and failover authorization can be reached

sentinel flushconfig

将Sentinel节点的配置信息强制刷到磁盘上，这个命令Sentinel节点自身用得比较多，对于开发和运维人员只有当外部原因（例如磁盘损坏）造成配置文件损坏或者丢失时，才会用上。

sentinel remove <master name>

取消当前Sentinel节点对于指定<master name>主节点的监控。

[root@slowtech redis-4.0.11]# grep -Ev "^#|^$" sentinel_26379.conf 
port 26379
dir "/tmp"
sentinel myid 2467530fa249dbbc435c50fbb0dc2a4e766146f8
sentinel deny-scripts-reconfig yes
sentinel monitor mymaster 127.0.0.1 6381 2
sentinel config-epoch mymaster 12
sentinel leader-epoch mymaster 0
sentinel known-slave mymaster 127.0.0.1 6380
sentinel known-slave mymaster 127.0.0.1 6379
sentinel known-sentinel mymaster 127.0.0.1 26381 738ccbddaa0d4379d89a147613d9aecfec765bcb
sentinel known-sentinel mymaster 127.0.0.1 26380 7251bb129ca373ad0d8c7baf3b6577ae2593079f
sentinel current-epoch 12

[root@slowtech redis-4.0.11]# redis-cli -p 26379
127.0.0.1:26379> sentinel remove mymaster
OK
127.0.0.1:26379> quit

[root@slowtech redis-4.0.11]# grep -Ev "^#|^$" sentinel_26379.conf 
port 26379
dir "/tmp"
sentinel myid 2467530fa249dbbc435c50fbb0dc2a4e766146f8
sentinel deny-scripts-reconfig yes
sentinel current-epoch 12

View Code

weixin_34314962

关注

0
点赞
踩
0

收藏

觉得还不错? 一键收藏
0
评论
Redis碎碎念

Redis Sentinel是Redis的高可用方案。是Redis 2.8中正式引入的。在之前的主从复制方案中，如果主节点出现问题，需要手动将一个从节点升级为主节点，然后将其它从节点指向新的主节点，并且需要修改应用方主节点的地址。整个过程都需要人工干预。 Sentinel的相关参数# bind 127.0.0.1 192.168.1.1# protected-mode no...
复制链接

扫一扫