Redis Series, Part 5 | Master-Replica Replication and Sentinel Mode

Author: 小鲸鱼大梦想

Last modified: 2023-02-05 00:38:23

First published: 2023-02-05 00:22:32

Article link: https://blog.csdn.net/whale0306/article/details/128887453

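2.0 Redis Master-Replica Replication

2.0.1 Concepts

Master-replica replication

Master-replica replication means copying the data of one Redis server to other Redis servers. The former is called the master node and the latter the replica (slave) node. Replication is one-way: data flows only from the master to its replicas. The master mainly handles writes and the replicas mainly handle reads.

By default, every Redis server is a master. A master can have several replicas (or none), but a replica can have only one master.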

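What replication is for

Data redundancy: replication provides a hot backup of the data, a form of redundancy in addition to persistence.
Failure recovery: when the master has a problem, a replica can take over serving requests, which allows fast recovery; this is in effect redundancy of the service itself.
Load balancing: combined with read/write splitting, the master serves writes and the replicas serve reads (the application connects to the master to write and to a replica to read), which spreads the server load. In read-heavy, write-light workloads in particular, sharing reads across several replicas can greatly increase the concurrency Redis can handle.
Foundation of high availability: beyond the points above, replication is what Sentinel and Cluster are built on, so it is the basis of Redis high availability.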
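The read/write splitting idea above can be sketched in a few lines of Python. This is only an illustration of the routing decision; it does not connect to Redis. The class name ReadWriteRouter, the small set of write commands, and the round-robin policy are assumptions made for this example and are not part of Redis or of any client library.

```python
from itertools import cycle

# A deliberately small, illustrative subset of write commands (assumption).
WRITE_COMMANDS = {"SET", "DEL", "INCR", "EXPIRE", "LPUSH", "HSET"}


class ReadWriteRouter:
    """Pick a node address for a command: writes -> master, reads -> replicas."""

    def __init__(self, master, replicas):
        self.master = master
        # Fall back to the master for reads when there are no replicas.
        self._read_nodes = cycle(replicas or [master])

    def node_for(self, command):
        name = command.split()[0].upper()
        if name in WRITE_COMMANDS:
            return self.master
        # Reads rotate across the replicas in round-robin order.
        return next(self._read_nodes)


router = ReadWriteRouter(("127.0.0.1", 6379),
                         [("127.0.0.1", 6380), ("127.0.0.1", 6381)])
print(router.node_for("SET key1 yunmx"))  # goes to the master
print(router.node_for("GET key1"))        # goes to the first replica
print(router.node_for("GET key1"))        # goes to the second replica
```

A real application would hand the chosen address to its Redis client; the point here is only that writes have a single destination while reads can be spread out.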

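Generally speaking, a single Redis instance is never enough for a production project:

Structurally, a single Redis server is a single point of failure, and one machine handling every request ends up overloaded.
In terms of capacity, the memory of one machine is limited; as a rule of thumb, a single Redis instance should not use more than about 20 GB of memory.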

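2.0.2 Configuring replication

Master-replica replication with read/write splitting! About 80% of operations are reads, so this eases the pressure on the server and is used all the time in real architectures. A typical layout is one master and two replicas.

Test scenario

Environment setup

Configure only the replicas, not the master!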

# Start a Redis instance and check its replication info
127.0.0.1:6379> info replication
# Replication
role:master														# role
connected_slaves:0												# number of connected replicas
master_replid:fa0e795a3e369ed7e46a40ee8818a51255ab6df3
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:0
second_repl_offset:-1
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0

Copy the configuration file, change the port, the log file name, the RDB file name, and the pid file, then start two more Redis instances.

[root@yunmx bin]# redis-server redis-conf/redis.conf1			# 6380
6038:C 12 Dec 2021 13:06:38.166 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
6038:C 12 Dec 2021 13:06:38.166 # Redis version=6.0.6, bits=64, commit=00000000, modified=0, pid=6038, just started
6038:C 12 Dec 2021 13:06:38.166 # Configuration loaded
[root@yunmx bin]# redis-server redis-conf/redis.conf2			# 6381
6045:C 12 Dec 2021 13:06:39.640 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
6045:C 12 Dec 2021 13:06:39.641 # Redis version=6.0.6, bits=64, commit=00000000, modified=0, pid=6045, just started
6045:C 12 Dec 2021 13:06:39.641 # Configuration loaded

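Before replication is configured, all three nodes are masters.

Choosing the master

One master (6379), two replicas (6380, 6381)

# Configure 6380
127.0.0.1:6380> SLAVEOF 127.0.0.1 6379							# make the Redis on local port 6379 this instance's master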
OK
127.0.0.1:6380> info replication
# Replication
role:slave														# now a replica
master_host:127.0.0.1											# information about the master
master_port:6379
master_link_status:up
master_last_io_seconds_ago:8
master_sync_in_progress:0
slave_repl_offset:0
slave_priority:100
slave_read_only:1
connected_slaves:0
master_replid:55c3146a93d180ad5aa89e4d64ff894a451e77e5
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:0
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:0
# Check the master's info
127.0.0.1:6379> info replication
# Replication
role:master
connected_slaves:2
slave0:ip=127.0.0.1,port=6380,state=online,offset=168,lag=0			# information about each replica
slave1:ip=127.0.0.1,port=6381,state=online,offset=168,lag=0
master_replid:55c3146a93d180ad5aa89e4d64ff894a451e77e5
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:168
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:168
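When checking many instances, it can help to turn the INFO replication text into a data structure instead of reading it by eye. The following is a minimal sketch that parses the output format shown above; the function name parse_info is an assumption for this example, and the sample text is copied from the master's output.

```python
def parse_info(text):
    """Parse the text returned by INFO into a dict of field -> value."""
    info = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and section headers such as "# Replication"
        key, _, value = line.partition(":")
        if key.startswith("slave") and key[5:].isdigit():
            # slaveN lines hold comma-separated key=value pairs.
            value = dict(item.split("=", 1) for item in value.split(","))
        info[key] = value
    return info


sample = """# Replication
role:master
connected_slaves:2
slave0:ip=127.0.0.1,port=6380,state=online,offset=168,lag=0
slave1:ip=127.0.0.1,port=6381,state=online,offset=168,lag=0
master_repl_offset:168
"""

info = parse_info(sample)
replica_ports = [v["port"] for v in info.values() if isinstance(v, dict)]
print(info["role"], replica_ports)  # master ['6380', '6381']
```

All values stay strings, exactly as Redis prints them; convert offsets or ports with int() where numbers are needed.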

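In a real deployment, replication should be set in the configuration file so that it is permanent. The SLAVEOF command used above is only temporary: the setting is lost when the instance restarts. (Since Redis 5, REPLICAOF is the preferred name of the command; SLAVEOF still works as an alias.)

# IP address and port of the master
replicaof <masterip> <masterport>
# Password of the master, if one is set
masterauth <master-password>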
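Putting the pieces together (a separate port, pid file, log file, and RDB file per instance, plus the replicaof line), a per-replica configuration can be generated rather than edited by hand. The sketch below is one way to do it. The function name replica_config and the file naming scheme (for example /var/run/redis_6380.pid) are assumptions for illustration; the directive names themselves are standard redis.conf directives.

```python
def replica_config(port, master_host, master_port, master_password=None):
    """Return redis.conf text for a replica listening on `port`."""
    lines = [
        f"port {port}",
        f"pidfile /var/run/redis_{port}.pid",  # assumed path, adjust as needed
        f'logfile "{port}.log"',               # assumed file name
        f"dbfilename dump{port}.rdb",          # assumed file name
        f"replicaof {master_host} {master_port}",
    ]
    if master_password is not None:
        # Only needed when the master requires a password.
        lines.append(f"masterauth {master_password}")
    return "\n".join(lines) + "\n"


print(replica_config(6380, "127.0.0.1", 6379))
```

Writing the returned text to a file and starting redis-server with that file gives a replica that rejoins its master after every restart, unlike one configured with the command.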

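2.0.3 Verifying the behavior

The master accepts writes and the replicas are read-only. Everything stored on the master is also stored on the replicas.

# Set a key on the master
127.0.0.1:6379> set key1 yunmx
OK
127.0.0.1:6379>
# The key is also on the replica, but the replica cannot set keys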
127.0.0.1:6380> keys *
1) "key1"
127.0.0.1:6380> get key1
"yunmx"
127.0.0.1:6380> set key2 yunmx2
(error) READONLY You can't write against a read only replica.
127.0.0.1:6380>

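When the master goes down, the replicas remain replicas; they only report that the link to the master is down. Once the master recovers, the replicas can again read whatever the master writes, which gives a certain degree of high availability.

If replication was configured from the command line and a replica goes down, it drops out of the replication group when it restarts. It has to be made a replica again with the command, after which it can read the keys again.

# Stop the 6379 master and check the replication status on a replica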
127.0.0.1:6380> info replication
# Replication
role:slave
master_host:127.0.0.1
master_port:6379
master_link_status:down
master_last_io_seconds_ago:-1
master_sync_in_progress:0
slave_repl_offset:1065
master_link_down_since_seconds:10
slave_priority:100
slave_read_only:1
connected_slaves:0
master_replid:55c3146a93d180ad5aa89e4d64ff894a451e77e5
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:1065
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:1065

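# Restart the 6379 master and verify that data still syncs
[root@yunmx ~]# redis-cli -p 6379
127.0.0.1:6379> set key2 yunmx2													# the master is back; set a key
OK
127.0.0.1:6379>
# Read, from a replica, the key that was set after the master recovered
127.0.0.1:6380> get key2														# reads normally
"yunmx2"

# Stop a replica, set a key on the master, then restart the replica and verify
127.0.0.1:6379> set key3 yunmx3
OK

127.0.0.1:6380> get key3														# the restarted replica cannot get key3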
(nil)

127.0.0.1:6381> get key3
"yunmx3"

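2.0.4 How replication works

After a replica starts and connects to the master, it sends a synchronization command (SYNC in old versions, PSYNC since Redis 2.8).

On receiving the command, the master starts a background save process and, in the meantime, collects every command it receives that modifies the dataset. When the background process finishes, the master transfers the whole data file to the replica, which completes one full synchronization.

Full replication: the replica receives the database file, saves it to disk, and loads it into memory.

Incremental replication: the master keeps passing the newly collected write commands to the replica, in order, to keep it in sync.

Whenever a replica reconnects to its master, a synchronization runs automatically: a partial one if the master's replication backlog still covers what the replica missed, otherwise a full one. Either way, the data written on the master will show up on the replica.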
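The full and incremental phases can be illustrated with a toy in-memory model. This is not how Redis is implemented (Redis ships an RDB snapshot and tracks byte offsets and replication IDs in a bounded backlog); the classes Master and Replica below, and the unbounded backlog counted in commands, are simplifying assumptions made only to show the order of events.

```python
class Master:
    def __init__(self):
        self.data = {}
        self.backlog = []    # every write command, in order (unbounded in this toy)
        self.replicas = []

    def set(self, key, value):
        command = ("SET", key, value)
        self.data[key] = value
        self.backlog.append(command)
        # Incremental replication: forward the write to connected replicas.
        for replica in self.replicas:
            if replica.online:
                replica.apply(command)


class Replica:
    def __init__(self):
        self.data = {}
        self.offset = 0          # number of master commands already applied
        self.online = False
        self.synced_once = False

    def apply(self, command):
        _, key, value = command
        self.data[key] = value
        self.offset += 1

    def connect(self, master):
        if self.synced_once:
            # Catch up on only the writes missed while disconnected.
            for command in master.backlog[self.offset:]:
                self.apply(command)
        else:
            # Full synchronization: copy the master's snapshot.
            self.data = dict(master.data)
            self.offset = len(master.backlog)
            self.synced_once = True
        self.online = True
        if self not in master.replicas:
            master.replicas.append(self)

    def disconnect(self):
        self.online = False


master, replica = Master(), Replica()
master.set("key1", "yunmx")
replica.connect(master)          # full sync brings key1 over
master.set("key2", "yunmx2")     # streamed to the replica as it happens
replica.disconnect()
master.set("key3", "yunmx3")     # missed while the replica is offline
replica.connect(master)          # reconnect: the replica catches up
print(replica.data)
```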

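2.0.5 Chained replication

Each master links to the next replica in a chain: a replica can itself act as the master of another replica.

Replication still works in this layout.

If 6379 is gone, can a new master be chosen? At this point it has to be configured by hand.

Usurping the throne: running SLAVEOF NO ONE on an instance turns it into a master. If the old master comes back, it also has to be reconfigured by hand.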

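2.1 Redis Sentinel Mode

The automatic version of choosing a master

2.1.1 Overview

Manual master switching works like this: when the master goes down, someone has to promote one of the servers to master by hand. This needs human intervention, costs time and effort, and leaves the service unavailable for a while. It is not a recommended approach, and Sentinel mode is the usual choice instead. Redis has officially provided the Sentinel architecture since version 2.8 to solve this problem.
Sentinel monitors in the background whether the master has failed and, if it has, automatically promotes a replica to master based on a vote.
Sentinel is a special mode. Redis provides a sentinel command, and a sentinel runs as its own independent process. It works by sending commands to the Redis servers and waiting for their replies, which lets it monitor several running Redis instances.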

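2.1.2 Basic architecture

What Sentinel does:

It sends commands to the Redis servers, masters and replicas alike, and uses their replies to monitor their running state.
When a sentinel detects that the master is down, it automatically promotes a replica to master, then uses publish/subscribe to notify the other replicas, update their configuration, and have them follow the new master.

A single sentinel process monitoring the Redis servers can itself run into problems. For this reason several sentinels can be used, and the sentinels also monitor one another. This is multi-sentinel mode.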

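With Sentinel mode, at least six processes are typically running (for example, one master, two replicas, and three sentinels).

Suppose the master goes down and sentinel 1 notices first. The system does not start a failover right away; sentinel 1 alone merely believes the master is unavailable, which is called subjectively down (SDOWN). When other sentinels also find the master unavailable and their number reaches a configured threshold, the master is considered objectively down (ODOWN). The sentinels then hold a vote, and the sentinel that wins it carries out the failover. Once the switch succeeds, publish/subscribe is used so that every sentinel points the replicas it monitors at the new master.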
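The two thresholds in this process are easy to mix up, so here is a minimal sketch of both checks. The function names are assumptions for this example. The rules themselves follow the sentinel.conf comments quoted later in this article: the quorum decides when a master counts as objectively down, while a sentinel must be elected by a majority of the known sentinels before it may start a failover.

```python
def is_objectively_down(sdown_votes, quorum):
    """ODOWN: at least `quorum` sentinels consider the master down."""
    return sum(sdown_votes) >= quorum


def can_lead_failover(votes_for_leader, total_sentinels):
    """A sentinel may start a failover only with a majority of all sentinels."""
    return votes_for_leader > total_sentinels // 2


# Three sentinels, quorum 2: one sentinel alone only reaches SDOWN.
print(is_objectively_down([True, False, False], quorum=2))  # False
print(is_objectively_down([True, True, False], quorum=2))   # True

# With three sentinels, two votes are a majority; one is not.
print(can_lead_failover(2, total_sentinels=3))  # True
print(can_lead_failover(1, total_sentinels=3))  # False
```

With a single sentinel and a quorum of 1, as in the test below, that sentinel alone satisfies both conditions, which is why one process is enough to trigger the failover there.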

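2.1.3 Test scenario

The architecture under test is one master and two replicas.

Create the Sentinel configuration file

# Create a new configuration file with the following content
# Syntax: sentinel monitor <name for the monitored master> <host> <port> <quorum>
# The trailing 1 is the quorum: how many sentinels must agree that the master is down before it is treated as objectively down and a failover can start
sentinel monitor myredis 127.0.0.1 6379 1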

Start the sentinel

[root@yunmx bin]# redis-sentinel redis-conf/sentinel.conf				# start one sentinel
12680:X 12 Dec 2021 15:13:42.570 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
12680:X 12 Dec 2021 15:13:42.570 # Redis version=6.0.6, bits=64, commit=00000000, modified=0, pid=12680, just started
12680:X 12 Dec 2021 15:13:42.570 # Configuration loaded
                _._
           _.-``__ ''-._
      _.-``    `.  `_.  ''-._           Redis 6.0.6 (00000000/0) 64 bit
  .-`` .-```.  ```\/    _.,_ ''-._
 (    '      ,       .-`  | `,    )     Running in sentinel mode
 |`-._`-...-` __...-.``-._|'` _.-'|     Port: 26379
 |    `-._   `._    /     _.-'    |     PID: 12680
  `-._    `-._  `-./  _.-'    _.-'
 |`-._`-._    `-.__.-'    _.-'_.-'|
 |    `-._`-._        _.-'_.-'    |           http://redis.io
  `-._    `-._`-.__.-'_.-'    _.-'
 |`-._`-._    `-.__.-'    _.-'_.-'|
 |    `-._`-._        _.-'_.-'    |
  `-._    `-._`-.__.-'_.-'    _.-'
      `-._    `-.__.-'    _.-'
          `-._        _.-'
              `-.__.-'

12680:X 12 Dec 2021 15:13:42.571 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
12680:X 12 Dec 2021 15:13:42.575 # Sentinel ID is ea7ccf0119a4cf2873cf3bb108da5c7af86d36bd
12680:X 12 Dec 2021 15:13:42.575 # +monitor master myredis 127.0.0.1 6379 quorum 1
12680:X 12 Dec 2021 15:13:42.575 * +slave slave 127.0.0.1:6380 127.0.0.1 6380 @ myredis 127.0.0.1 6379
12680:X 12 Dec 2021 15:13:42.580 * +slave slave 127.0.0.1:6381 127.0.0.1 6381 @ myredis 127.0.0.1 6379

Manual failure test

# Shut down the master
127.0.0.1:6379> SHUTDOWN
not connected>
# Events logged by the sentinel
12680:X 12 Dec 2021 15:16:49.174 # +failover-state-select-slave master myredis 127.0.0.1 6379
12680:X 12 Dec 2021 15:16:49.241 # +selected-slave slave 127.0.0.1:6381 127.0.0.1 6381 @ myredis 127.0.0.1 6379
12680:X 12 Dec 2021 15:16:49.241 * +failover-state-send-slaveof-noone slave 127.0.0.1:6381 127.0.0.1 6381 @ myredis 127.0.0.1 6379
12680:X 12 Dec 2021 15:16:49.324 * +failover-state-wait-promotion slave 127.0.0.1:6381 127.0.0.1 6381 @ myredis 127.0.0.1 6379
12680:X 12 Dec 2021 15:16:50.181 # +promoted-slave slave 127.0.0.1:6381 127.0.0.1 6381 @ myredis 127.0.0.1 6379
12680:X 12 Dec 2021 15:16:50.181 # +failover-state-reconf-slaves master myredis 127.0.0.1 6379
12680:X 12 Dec 2021 15:16:50.237 * +slave-reconf-sent slave 127.0.0.1:6380 127.0.0.1 6380 @ myredis 127.0.0.1 6379
12680:X 12 Dec 2021 15:16:51.182 * +slave-reconf-inprog slave 127.0.0.1:6380 127.0.0.1 6380 @ myredis 127.0.0.1 6379
12680:X 12 Dec 2021 15:16:51.182 * +slave-reconf-done slave 127.0.0.1:6380 127.0.0.1 6380 @ myredis 127.0.0.1 6379
12680:X 12 Dec 2021 15:16:51.234 # +failover-end master myredis 127.0.0.1 6379
12680:X 12 Dec 2021 15:16:51.234 # +switch-master myredis 127.0.0.1 6379 127.0.0.1 6381		# the sentinel reports that the master switched to 6381
12680:X 12 Dec 2021 15:16:51.234 * +slave slave 127.0.0.1:6380 127.0.0.1 6380 @ myredis 127.0.0.1 6381
12680:X 12 Dec 2021 15:16:51.234 * +slave slave 127.0.0.1:6379 127.0.0.1 6379 @ myredis 127.0.0.1 6381
12680:X 12 Dec 2021 15:17:21.244 # +sdown slave 127.0.0.1:6379 127.0.0.1 6379 @ myredis 127.0.0.1 6381
# Check the info of 6381
127.0.0.1:6381> info replication
# Replication
role:master								# now the master
connected_slaves:1
slave0:ip=127.0.0.1,port=6380,state=online,offset=21944,lag=0
master_replid:a4719b795f6c088ed1a11408a2bc52cc48ece215
master_replid2:367d9493cb151b433bd535ad9e49603d1fa35013
master_repl_offset:21944
second_repl_offset:11812
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:21944
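The +switch-master event in the sentinel log above is the line that tells you where the master moved: its arguments are the master name, the old address, and the new address. A small parser for that line might look like the sketch below; the names MasterSwitch and parse_switch_master are assumptions for this example.

```python
from typing import NamedTuple, Optional


class MasterSwitch(NamedTuple):
    name: str
    old_host: str
    old_port: int
    new_host: str
    new_port: int


def parse_switch_master(line: str) -> Optional[MasterSwitch]:
    """Extract the old and new master address from a +switch-master log line."""
    fields = line.split()
    if "+switch-master" not in fields:
        return None
    start = fields.index("+switch-master") + 1
    args = fields[start:start + 5]
    if len(args) != 5:
        return None  # malformed or truncated line
    name, old_host, old_port, new_host, new_port = args
    return MasterSwitch(name, old_host, int(old_port), new_host, int(new_port))


log_line = ("12680:X 12 Dec 2021 15:16:51.234 # +switch-master "
            "myredis 127.0.0.1 6379 127.0.0.1 6381")
event = parse_switch_master(log_line)
print(event.name, event.old_port, "->", event.new_port)  # myredis 6379 -> 6381
```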

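When the master goes down, Sentinel picks one of the replicas to become the new master. The choice is not random: Sentinel ranks the candidates with its own selection algorithm (replica priority first, then how much replication data each replica has received, then run ID).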
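As a rough illustration of how a replica gets chosen, the sketch below ranks candidates in the order the Redis Sentinel documentation describes: lower replica priority wins, then the larger replication offset, then the lexicographically smaller run ID, and a replica with priority 0 is never promoted. It leaves out the health checks Sentinel applies first (such as how long a replica has been disconnected). The function name pick_replica and the sample offsets and run IDs are made up for the example.

```python
def pick_replica(replicas):
    """Return the replica that would rank first, or None if none is eligible."""
    # A priority of 0 marks a replica as never eligible for promotion.
    candidates = [r for r in replicas if r["priority"] != 0]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda r: (r["priority"], -r["offset"], r["run_id"]),
    )


# Hypothetical state of the two replicas at failover time.
replicas = [
    {"port": 6380, "priority": 100, "offset": 11800, "run_id": "b1"},
    {"port": 6381, "priority": 100, "offset": 11812, "run_id": "a7"},
]
print(pick_replica(replicas)["port"])  # 6381: same priority, larger offset
```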

Master recovery

# Restart the master that went down and watch how the sentinel reacts
12680:X 12 Dec 2021 15:23:41.461 * +convert-to-slave slave 127.0.0.1:6379 127.0.0.1 6379 @ myredis 127.0.0.1 6381
# Check the info of the former master
127.0.0.1:6379> info replication
# Replication
role:slave								# now a replica
master_host:127.0.0.1
master_port:6381
master_link_status:up
master_last_io_seconds_ago:1
master_sync_in_progress:0
slave_repl_offset:41645
slave_priority:100
slave_read_only:1
connected_slaves:0
master_replid:a4719b795f6c088ed1a11408a2bc52cc48ece215
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:41645
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:39205
repl_backlog_histlen:2441
# It can only rejoin under the new master, as a replica. That is the rule of Sentinel mode!

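Pros and cons

Advantages:
A Sentinel cluster is built on master-replica replication, so it has every advantage of a replication setup.
The master can be switched and failures can be transferred, so the system is more available.
It is an upgrade of plain replication, from manual to automatic, and more robust!

Disadvantages:
Online scaling is hard: once the cluster reaches its capacity limit, expanding it online is very troublesome!
Configuring Sentinel mode is actually quite involved, with many options to choose from!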

The full Sentinel configuration

<to be studied in detail later>

# Example sentinel.conf
 
# port <sentinel-port>
# The port that this sentinel instance will run on
port 26379
 
# sentinel announce-ip <ip>
# sentinel announce-port <port>
#
# The above two configuration directives are useful in environments where,
# because of NAT, Sentinel is reachable from outside via a non-local address.
#
# When announce-ip is provided, the Sentinel will claim the specified IP address
# in HELLO messages used to gossip its presence, instead of auto-detecting the
# local address as it usually does.
#
# Similarly when announce-port is provided and is valid and non-zero, Sentinel
# will announce the specified TCP port.
#
# The two options don't need to be used together, if only announce-ip is
# provided, the Sentinel will announce the specified IP and the server port
# as specified by the "port" option. If only announce-port is provided, the
# Sentinel will announce the auto-detected local IP and the specified port.
#
# Example:
#
# sentinel announce-ip 1.2.3.4
 
# dir <working-directory>
# Every long running process should have a well-defined working directory.
# For Redis Sentinel to chdir to /tmp at startup is the simplest thing
# for the process to don't interferer with administrative tasks such as
# unmounting filesystems.
dir /tmp
 
# sentinel monitor <master-name> <ip> <redis-port> <quorum>
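# master-name : name of the master Redis server
# ip : IP address of the master Redis server
# redis-port : port of the master Redis server
# quorum : at least <quorum> Sentinel processes must agree before the master is judged to have failed; if too few sentinels agree, automatic failover does not run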
#
# Tells Sentinel to monitor this master, and to consider it in O_DOWN
# (Objectively Down) state only if at least <quorum> sentinels agree.
#
# Note that whatever is the ODOWN quorum, a Sentinel will require to
# be elected by the majority of the known Sentinels in order to
# start a failover, so no failover can be performed in minority.
#
# Slaves are auto-discovered, so you don't need to specify slaves in
# any way. Sentinel itself will rewrite this configuration file adding
# the slaves using additional configuration options.
# Also note that the configuration file is rewritten when a
# slave is promoted to master.
#
# Note: master name should not include special characters or spaces.
# The valid charset is A-z 0-9 and the three characters ".-_".
#
sentinel monitor mymaster 127.0.0.1 6379 2
 
# sentinel auth-pass <master-name> <password>
#
# Set the password to use to authenticate with the master and slaves.
# Useful if there is a password set in the Redis instances to monitor.
#
# Note that the master password is also used for slaves, so it is not
# possible to set a different password in masters and slaves instances
# if you want to be able to monitor these instances with Sentinel.
#
# However you can have Redis instances without the authentication enabled
# mixed with Redis instances requiring the authentication (as long as the
# password set is the same for all the instances requiring the password) as
# the AUTH command will have no effect in Redis instances with authentication
# switched off.
#
# Example:
#
# sentinel auth-pass mymaster MySUPER--secret-0123passw0rd
 
# sentinel down-after-milliseconds <master-name> <milliseconds>
#
# Number of milliseconds the master (or any attached slave or sentinel) should
# be unreachable (as in, not acceptable reply to PING, continuously, for the
# specified period) in order to consider it in S_DOWN state (Subjectively
# Down).
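# This option sets how many milliseconds must pass before Sentinel considers a Redis instance to have failed. If the instance does not answer PING, or answers with an error, for longer than this period, Sentinel marks it as subjectively down (SDOWN).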
#
# Default is 30 seconds.
sentinel down-after-milliseconds mymaster 30000
 
# sentinel parallel-syncs <master-name> <numslaves>
#
# How many slaves we can reconfigure to point to the new slave simultaneously
# during the failover. Use a low number if you use the slaves to serve query
# to avoid that all the slaves will be unreachable at about the same
# time while performing the synchronization with the master.
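# This option sets the maximum number of replicas that may synchronize with the new master at the same time during a failover. With many replicas, the smaller this number, the longer synchronization takes and the longer the failover needs to complete.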
sentinel parallel-syncs mymaster 1
 
# sentinel failover-timeout <master-name> <milliseconds>
#
# Specifies the failover timeout in milliseconds. It is used in many ways:
#
# - The time needed to re-start a failover after a previous failover was
# already tried against the same master by a given Sentinel, is two
# times the failover timeout.
#
# - The time needed for a slave replicating to a wrong master according
# to a Sentinel current configuration, to be forced to replicate
# with the right master, is exactly the failover timeout (counting since
# the moment a Sentinel detected the misconfiguration).
#
# - The time needed to cancel a failover that is already in progress but
# did not produced any configuration change (SLAVEOF NO ONE yet not
# acknowledged by the promoted slave).
#
# - The maximum time a failover in progress waits for all the slaves to be
# reconfigured as slaves of the new master. However even after this time
# the slaves will be reconfigured by the Sentinels anyway, but not with
# the exact parallel-syncs progression as specified.
# If the failover does not complete within this time (in ms), it is considered failed.
#
# Default is 3 minutes.
sentinel failover-timeout mymaster 180000
 
# SCRIPTS EXECUTION
#
# sentinel notification-script and sentinel reconfig-script are used in order
# to configure scripts that are called to notify the system administrator
# or to reconfigure clients after a failover. The scripts are executed
# with the following rules for error handling:
#
# If script exits with "1" the execution is retried later (up to a maximum
# number of times currently set to 10).
#
# If script exits with "2" (or an higher value) the script execution is
# not retried.
#
# If script terminates because it receives a signal the behavior is the same
# as exit code 1.
#
# A script has a maximum running time of 60 seconds. After this limit is
# reached the script is terminated with a SIGKILL and the execution retried.
 
# NOTIFICATION SCRIPT
#
# sentinel notification-script <master-name> <script-path>
#
# Call the specified notification script for any sentinel event that is
# generated in the WARNING level (for instance -sdown, -odown, and so forth).
# This script should notify the system administrator via email, SMS, or any
# other messaging system, that there is something wrong with the monitored
# Redis systems.
#
# The script is called with just two arguments: the first is the event type
# and the second the event description.
#
# The script must exist and be executable in order for sentinel to start if
# this option is provided.
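# Specifies the alert script that Sentinel calls when it detects that a monitored Redis instance is abnormal. This option is optional but very commonly used.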
#
# Example:
#
# sentinel notification-script mymaster /var/redis/notify.sh
 
# CLIENTS RECONFIGURATION SCRIPT
#
# sentinel client-reconfig-script <master-name> <script-path>
#
# When the master changed because of a failover a script can be called in
# order to perform application-specific tasks to notify the clients that the
# configuration has changed and the master is at a different address.
#
# The following arguments are passed to the script:
#
# <master-name> <role> <state> <from-ip> <from-port> <to-ip> <to-port>
#
# <state> is currently always "failover"
# <role> is either "leader" or "observer"
#
# The arguments from-ip, from-port, to-ip, to-port are used to communicate
# the old address of the master and the new address of the elected slave
# (now a master).
#
# This script should be resistant to multiple invocations.
#
# Example:
#
# sentinel client-reconfig-script mymaster /var/redis/reconfig.sh
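To show what a client-reconfig-script receives, here is a minimal sketch of such a script written in Python. It only parses the seven arguments listed in the comments above and prints the change; what a real script does with the new address (updating a proxy, a service registry, and so on) is application-specific. The names Reconfig, parse_reconfig_args, and main are assumptions for this example. The return values follow the exit-code rules quoted above: 1 means Sentinel retries later, 2 or higher means it does not.

```python
import sys
from typing import NamedTuple


class Reconfig(NamedTuple):
    master_name: str
    role: str        # "leader" or "observer"
    state: str       # currently always "failover"
    from_ip: str
    from_port: int
    to_ip: str
    to_port: int


def parse_reconfig_args(argv):
    """Turn the 7 positional script arguments into a Reconfig record."""
    if len(argv) != 7:
        raise ValueError(f"expected 7 arguments, got {len(argv)}")
    name, role, state, from_ip, from_port, to_ip, to_port = argv
    return Reconfig(name, role, state, from_ip, int(from_port), to_ip, int(to_port))


def main(argv):
    try:
        event = parse_reconfig_args(argv)
    except ValueError as exc:
        print(exc, file=sys.stderr)
        return 2  # bad input will not fix itself, so ask Sentinel not to retry
    print(f"{event.master_name}: master moved from "
          f"{event.from_ip}:{event.from_port} to {event.to_ip}:{event.to_port}")
    return 0


# Demo call with the addresses from the failover test in this article.
# A real script file would instead end with: sys.exit(main(sys.argv[1:]))
exit_code = main(["myredis", "leader", "failover",
                  "127.0.0.1", "6379", "127.0.0.1", "6381"])
```

Because Sentinel may call the script more than once for the same failover, whatever replaces the print call should be safe to repeat.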
