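1. Environment check
Check whether Redis is already installed:
whereis redis-cli //check whether the Redis client exists
On a default installation the output looks like the following; if a path is printed, Redis is already installed: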
redis-cli: /usr/local/bin/redis-cli
If it is installed, log in with the Redis client and check that the version meets the requirement:
[root@m112 ~]# /usr/local/bin/redis-cli -c -h m112 -p 10091 -a 'dyQwe123'
m112:10091> info
# Server
redis_version:4.0.9
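The version check above can also be scripted. The following is a minimal sketch, not part of the original procedure: it reads the redis_version field from INFO output and compares it with a minimum version. The helper names and the minimum 4.0.0 are assumptions chosen for illustration, and the comparison relies on GNU `sort -V`.

```shell
# Extract redis_version from INFO output read on stdin.
redis_version_from_info() {
    # INFO lines end with \r\n, so strip the carriage returns first.
    tr -d '\r' | sed -n 's/^redis_version://p'
}

# Succeeds if version $1 >= version $2 (relies on GNU `sort -V`).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Sample input copied from the output above; on a live node, pipe
# `redis-cli -h m112 -p 10091 -a '<password>' info server` instead.
sample='# Server
redis_version:4.0.9'

current=$(printf '%s\n' "$sample" | redis_version_from_info)
required=4.0.0   # hypothetical minimum, not stated in the source

if version_ge "$current" "$required"; then
    echo "Redis $current meets the minimum $required"
else
    echo "Redis $current is older than $required"
fi
```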
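2. Tool preparation
The packages prepared here are:
1.redis-4.0.9.tar.gz //standalone Redis; must be installed on every Redis node server
2.ruby-2.5.1.tar.gz //Ruby is used to build and manage the Redis cluster
3. Standalone installation steps
First, extract the Redis package: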
tar -xzf redis-4.0.9.tar.gz
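The build needs several tools, so run the following commands beforehand:
yum install -y tcl //install tcl (needed by make test)
yum install -y gcc make //install gcc (the C compiler) and make
With these tools installed, enter the redis directory and compile:
*Note: a plain make may fail with: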
zmalloc.h:50:31: error: jemalloc/jemalloc.h: No such file or directory
zmalloc.h:55:2: error: #error "Newer version of jemalloc required"
make[1]: *** [adlist.o] Error 1
make[1]: Leaving directory `/data0/src/redis-2.6.2/src'
make: *** [all] Error 2
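The cause is that Redis on Linux builds against jemalloc by default, which replaces the libc malloc and free functions, and the jemalloc header was not found. The fix is to pass MALLOC=libc to make so that the libc allocator is used instead.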
make MALLOC=libc && make test && make install
//The line above is really three commands:
make compiles, with the parameter MALLOC=libc;
make test runs the test suite;
make install installs the compiled binaries
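4. Redis configuration
This section covers the cluster-related configuration:
4.1 Create directories:
mkdir -p /data/redis_log //Redis cluster log directory
mkdir -p /etc/redis //directory for each node's configuration file
mkdir -p /etc/redis-cluster //directory for the cluster node state files (nothing needs to be placed here; the files are generated automatically once the cluster starts)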
mkdir -p /var/redis/10091
mkdir -p /var/redis/10092
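4.2 Configuration files
First enter the installation directory and copy the configuration file (redis.conf) to /etc/redis
4.2.1 Master/replica configuration
Master node configuration
bind 172.16.1.82 //IP the Redis node binds to; use the server's own IP, not 127.0.0.1
Replica node configuration
The key setting tells the replica which master to sync data from
Newer versions (Redis 5.0 and later)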
replicaof <masterip> <masterport>
#replicas are read-only by default
replica-read-only yes
Older versions (before Redis 5.0, including the 4.0.9 used here)
#specify the master node
slaveof <masterip> <masterport>
slave-read-only yes
Notes:
Viewing master/replica information
#1. directly from the command line
bin/redis-cli info replication
#2. from inside the Redis client
info replication
The result looks like this:
# Replication
role:master
connected_slaves:0
master_replid:e5a18b7515af36da03e8e2c97f0b84b91ea54617
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:0
second_repl_offset:-1
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
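To read this output in a script rather than by eye, individual fields can be pulled out of it. This is a rough sketch under assumptions: info_field is a hypothetical helper, and the sample is a shortened copy of the output above standing in for a live `redis-cli ... info replication` call.

```shell
# Read one field from INFO output on stdin, e.g. `info_field role`.
info_field() {
    # INFO uses \r\n line endings, so drop the carriage returns first.
    tr -d '\r' | sed -n "s/^$1://p"
}

# Shortened sample of the output above; on a live node, replace the printf with
# `redis-cli -h <host> -p <port> -a '<password>' info replication`.
sample='# Replication
role:master
connected_slaves:0
master_repl_offset:0'

role=$(printf '%s\n' "$sample" | info_field role)
replicas=$(printf '%s\n' "$sample" | info_field connected_slaves)

if [ "$role" = "master" ] && [ "$replicas" -eq 0 ]; then
    echo "master has no connected replicas yet"
else
    echo "role=$role connected_slaves=$replicas"
fi
```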
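4.2.2 Common configuration
Settings shared by the master and replica configuration files:
port 10091 //node port
requirepass dyQwe123 //password
cluster-enabled yes //enable cluster mode
//path of the cluster state file; Redis writes this file itself, so nothing needs to be placed there by hand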
cluster-config-file /etc/redis-cluster/node-10091.conf
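cluster-node-timeout 15000 //cluster node timeout, in milliseconds
//yes runs Redis as a daemon: it runs in the background and writes its pid to the file set by the pidfile option, and keeps running until it is shut down or killed
//no runs Redis in the foreground of the terminal; pressing Ctrl+C or closing the connection tool (putty, xshell, etc.) terminates the Redis process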
daemonize yes
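pidfile /var/run/redis_10091.pid //file that holds the node's process id
logfile /data/redis_log/10091.log //log file
dir /var/redis/10091 //Redis working directory; it must be a directory, not a file name. Redis stores its data here as appendonly.aof and dump.rdb
//persistence: yes enables AOF mode; appendfilename sets the file name, stored under the dir above, so that data survives a sudden failure such as a power loss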
appendonly yes
Note: after the node has been started, the dir working directory looks like this:
[root@m111 ~]# ll /var/redis/10091
total 92
-rw-r--r--. 1 root root 75259 Jul 10 03:26 appendonly.aof
-rw-r--r--. 1 root root 13419 Jul 10 03:29 dump.rdb
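The `//` annotations in 4.2.2 are explanations for the reader. redis.conf only treats lines that begin with `#` as comments, so in the real file they have to go on their own lines. A minimal sketch of writing the shared settings that way is shown below; OUT_DIR is a scratch directory assumed for the example, where the document's real target would be /etc/redis.

```shell
# Write the shared settings for one node into a redis.conf-style file.
# OUT_DIR is a scratch directory for this sketch (assumption); the real
# location used in this document is /etc/redis.
OUT_DIR=$(mktemp -d)
PORT=10091

cat > "$OUT_DIR/redis-$PORT.conf" <<EOF
# node port
port $PORT
# password that clients must send with AUTH
requirepass dyQwe123
# run as a cluster node; Redis maintains the cluster-config-file itself
cluster-enabled yes
cluster-config-file /etc/redis-cluster/node-$PORT.conf
cluster-node-timeout 15000
# run in the background and record the pid
daemonize yes
pidfile /var/run/redis_$PORT.pid
logfile /data/redis_log/$PORT.log
# working directory that receives appendonly.aof and dump.rdb
dir /var/redis/$PORT
appendonly yes
EOF

# Show the effective settings: every line is "directive value", nothing trailing.
grep -v '^#' "$OUT_DIR/redis-$PORT.conf"
```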
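When building the cluster, each node needs its own configuration file. For stability a master and a standby are usually set up, so the file is copied into the following two (one master, one standby):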
redis-10091.conf
redis-10092.conf
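Note: redis-10091 is the master and redis-10092 the standby; the slave server needs one extra parameter
#the master's login password (it must match the master's requirepass)
masterauth 123456
With multiple servers, copy the file to the target directory on each server, then change the port and IP: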
sed -i "s/10091/10092/g" redis-10092.conf
sed -i "s/172\.16\.1\.82/172\.16\.1\.237/g" redis-10093.conf
The file redis.txt lists the target server IPs, one per line, for example:
192.168.19.103
192.168.19.104
Command to send the file:
for line in `cat /root/redis.txt`;do scp -r redis-10092.conf root@$line:/etc/redis ; done
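The copy-then-substitute step for a second node can be wrapped in a small helper. This is only a sketch: clone_conf is a hypothetical name, and WORK_DIR with its two-line sample file stands in for /etc/redis and the real redis-10091.conf.

```shell
# Clone a node configuration for another port, using the same sed
# substitution as above. WORK_DIR and the sample file are stand-ins
# (assumptions) for /etc/redis and the real redis-10091.conf.
WORK_DIR=$(mktemp -d)
printf 'port 10091\npidfile /var/run/redis_10091.pid\n' > "$WORK_DIR/redis-10091.conf"

clone_conf() {
    # $1 = source port, $2 = new port
    cp "$WORK_DIR/redis-$1.conf" "$WORK_DIR/redis-$2.conf"
    # Replace every occurrence of the old port (port, pidfile, logfile, dir, ...).
    sed -i "s/$1/$2/g" "$WORK_DIR/redis-$2.conf"
}

clone_conf 10091 10092
cat "$WORK_DIR/redis-10092.conf"
```

Because the substitution is a plain text replace, it is worth checking the result if the old port number could also appear inside an IP address or password.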
4.3 Start on boot
4.3.1 Create the service script
Create the node's init script: redis_10091
#!/bin/sh
#
# Simple Redis init.d script conceived to work on Linux systems
# as it does use of the /proc filesystem.
# chkconfig: 2345 90 10
# description: Redis is a persistent key-value database
REDISPORT=10091
EXEC=/usr/local/bin/redis-server
CLIEXEC=/usr/local/bin/redis-cli
PIDFILE=/var/run/redis_${REDISPORT}.pid
CONF="/etc/redis/redis-${REDISPORT}.conf"
case "$1" in
start)
if [ -f $PIDFILE ]
then
echo "$PIDFILE exists, process is already running or crashed"
else
echo "Starting Redis server..."
$EXEC $CONF
fi
;;
stop)
if [ ! -f $PIDFILE ]
then
echo "$PIDFILE does not exist, process is not running"
else
PID=$(cat $PIDFILE)
echo "Stopping ..."
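# bind and requirepass are set in the config, so connect to the bound IP with the password (adjust both to this node's values)
$CLIEXEC -h 172.16.1.82 -p $REDISPORT -a 'dyQwe123' shutdown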
while [ -x /proc/${PID} ]
do
echo "Waiting for Redis to shutdown ..."
sleep 1
done
echo "Redis stopped"
fi
;;
*)
echo "Please use start or stop as first argument"
;;
esac
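The script above must be placed in the init directory: /etc/init.d/
Each node needs its own script, so after creating redis_10091, copy it with cp to create the others,
then change the port with sed (the scripts differ only in the port):
sed -i "s/10091/10092/g" redis_10092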
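4.3.2 Enable start on boot
The new files may not be executable, so add execute permission and then enable start on boot
#add the x (execute) permission to the new init scripts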
chmod +x redis*
#then enable start on boot with chkconfig
chkconfig redis_10091 on
chkconfig redis_10092 on
Afterwards, run chkconfig --list to confirm; on success the output looks like this:
netconsole 0:off 1:off 2:off 3:off 4:off 5:off 6:off
network 0:off 1:off 2:on 3:on 4:on 5:on 6:off
redis_10091 0:off 1:off 2:on 3:on 4:on 5:on 6:off
redis_10092 0:off 1:off 2:on 3:on 4:on 5:on 6:off
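5 Cluster setup
The Redis cluster is built with redis-trib.rb, a Ruby script, so Ruby must be installed:
5.1 Ruby installation
Minimal Ruby installation steps:
#1. extract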
tar -xvf ruby-2.5.1.tar.gz
#2. configure
cd ruby-2.5.1/
./configure --prefix=/usr/local/ruby
#3. compile and install
make && make install
##install supporting packages
yum install rubygems -y
yum install zlib-devel -y
yum install openssl-devel -y
#4. install the redis gem
gem install redis
Finally, copy the cluster script:
cp /opt/redis-4.0.9/src/redis-trib.rb /usr/local/bin/
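Depending on the environment, different problems come up during installation. Most are caused by tools that should have been installed beforehand, so to avoid failures install them up front: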
gem install redis -y
Note: the installs above must be run before step 3 or 4.
#copy the binaries into place; run this after step 3
cd /usr/local/ruby
cp bin/ruby /usr/bin
cp bin/gem /usr/bin
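5.2 Installation notes
Different system environments can produce different errors. This was a minimal install, and what actually happened was:
Steps 1, 2, 3 ran without problems: Ruby built with no errors and make && make install succeeded
Step 4 (installing the redis gem) failed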
gem install redis
Error: -bash: gem: command not found
Install gem:
yum install rubygems -y
Error:
Fetching: redis-4.2.1.gem (100%)
ERROR: Error installing redis:
redis requires Ruby version >= 2.3.0.
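The error asks for a newer Ruby. One explanation is:
the Ruby in the CentOS 7 yum repository is 2.0.0, but the redis gem requires at least 2.3.0
Options:
1. install rvm and use it to upgrade Ruby
2. install manually
Here the manually built gem (from the Ruby 2.5.1 compiled above) was used: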
cd /usr/local/ruby
cp bin/ruby /usr/bin
cp bin/gem /usr/bin
bin/gem install redis
Error:
ERROR: Loading command: install (LoadError)
cannot load such file -- zlib
ERROR: While executing gem ... (NoMethodError)
undefined method `invoke_with_build_args' for nil:NilClass
Fix:
cd ruby-2.5.1/ext/zlib
ruby ./extconf.rb
Error:
checking for deflateReset() in -lz... no
checking for deflateReset() in -llibz... no
checking for deflateReset() in -lzlib1... no
checking for deflateReset() in -lzlib... no
checking for deflateReset() in -lzdll... no
checking for deflateReset() in -lzlibwapi... no
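The cause is that zlib-devel is missing:
#install zlib-devel
yum install zlib-devel -y
#compile (after re-running ruby ./extconf.rb so that the Makefile is generated)
make && make install
Error: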
make: *** No rule to make target `/include/ruby.h', needed by `zlib.o'. Stop.
Fix:
sed -i "s#\$(top_srcdir)#\.\.\/\.\.#g" Makefile
Run make && make install
Then continue with gem install redis:
Error:
ERROR: While executing gem ... (Gem::Exception)
Unable to require openssl, install OpenSSL and rebuild Ruby (preferred) or use non-HTTPS sources
cd ruby-2.5.1/ext/openssl/
ruby ./extconf.rb
Error:
checking for t_open() in -lnsl... no
checking for socket() in -lsocket... no
checking for openssl/ssl.h... no
Traceback (most recent call last):
./extconf.rb:94:in `<main>': OpenSSL library could not be found. You might want to use --with-openssl-dir=<dir> option to specify the prefix where OpenSSL is installed. (RuntimeError)
#install openssl-devel
yum install openssl-devel -y
#then continue compiling:
make && make install
Error:
make: *** No rule to make target `/include/ruby.h', needed by `ossl_x509attr.o'. Stop.
The generated Makefile needs the same substitution:
sed -i "s#\$(top_srcdir)#\.\.\/\.\.#g" Makefile
Then compile again:
make && make install
Finally: gem install redis
The installation completes successfully
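6. Cluster startup and configuration
6.1 Adjusting the cluster node configuration
Redis Cluster does not need master and replica roles to be configured by hand; they are assigned from the arguments given when the cluster is created. Every node must therefore use the master configuration, with its IP and port set and, above all, the password settings:
cluster-enabled yes //enable cluster mode
requirepass 123456 //password required to access this node
masterauth 123456 //password the nodes use to access one another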
6.2 Starting the cluster nodes
The Redis service on every node must be running before the cluster can be created.
Start the Redis service on each server:
service redis_10091 start
service redis_10092 start
To avoid starting the services by hand later, enable start on boot:
# enable start on boot:
chkconfig redis_10091 on
chkconfig redis_10092 on
# check that it took effect:
chkconfig --list
netconsole 0:off 1:off 2:off 3:off 4:off 5:off 6:off
network 0:off 1:off 2:on 3:on 4:on 5:on 6:off
redis_10091 0:off 1:off 2:on 3:on 4:on 5:on 6:off
redis_10092 0:off 1:off 2:on 3:on 4:on 5:on 6:off
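Cluster nodes must be able to reach one another, so add the matching firewall rules:
firewall-cmd --zone=public --add-port=10091-10092/tcp --permanent
firewall-cmd --zone=public --add-port=20091-20092/tcp --permanent
(--permanent makes the rule persistent; without it the rule is lost after a restart)
Reload: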
firewall-cmd --reload
Query:
firewall-cmd --zone=public --query-port=10091-10092/tcp
Remove:
firewall-cmd --zone=public --remove-port=10091-10092/tcp --permanent
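The second port range above (20091-20092) is the cluster bus: in this Redis version each node also listens on its client port plus 10000 for node-to-node traffic. A small sketch that derives both rules from a list of client ports follows; firewall_rules is a hypothetical helper and it only prints the commands rather than running them.

```shell
# Print the firewall-cmd calls needed for a list of Redis client ports.
# The cluster bus port is the client port plus 10000.
firewall_rules() {
    for port in "$@"; do
        bus_port=$((port + 10000))
        echo "firewall-cmd --zone=public --add-port=${port}/tcp --permanent"
        echo "firewall-cmd --zone=public --add-port=${bus_port}/tcp --permanent"
    done
    echo "firewall-cmd --reload"
}

# Dry run: this only prints the commands. Review them, then run them as root.
firewall_rules 10091 10092
```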
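6.2 Cluster service configuration
After opening the node ports in the firewall, configure the password that the cluster tool uses to connect to the nodes: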
[root@m111 ~]# find /usr/ -name "client.rb"
/usr/share/ruby/xmlrpc/client.rb
/usr/local/ruby/lib/ruby/gems/2.5.0/gems/xmlrpc-0.3.0/lib/xmlrpc/client.rb
/usr/local/ruby/lib/ruby/gems/2.5.0/gems/redis-4.2.1/lib/redis/client.rb
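Edit the client.rb under the redis gem directory (the last path above) and set the password there; it must match the password configured in Redis, otherwise the login fails
password: "123456", //must match the password set in Redis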
6.3 Starting the cluster
6.3.1 Creating the cluster
Once configuration is finished, the Redis cluster can be created:
redis-trib.rb create --replicas 1 192.168.19.112:10091 192.168.19.112:10092 192.168.19.113:10091 192.168.19.113:10092 192.168.19.111:10091 192.168.19.111:10092
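The full output is shown below.
Partway through, the tool asks you to confirm the proposed layout; type yes (the listing shows every node: M marks a master, S a replica)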
Can I set the above configuration? (type 'yes' to accept): yes
[root@ecs-zujian03 opt]# redis-trib.rb create --replicas 1 10.153.128.2:10091 10.153.128.2:10092 10.153.128.3:10091 10.153.128.3:10092 10.153.128.4:10091 10.153.128.4:10092
>>> Creating cluster
>>> Performing hash slots allocation on 6 nodes...
Using 3 masters:
10.153.128.2:10091
10.153.128.3:10091
10.153.128.4:10091
Adding replica 10.153.128.3:10092 to 10.153.128.2:10091
Adding replica 10.153.128.4:10092 to 10.153.128.3:10091
Adding replica 10.153.128.2:10092 to 10.153.128.4:10091
M: 909124b309a62f4915ba4d23ccbc180c8d2a6e7c 10.153.128.2:10091
slots:0-5460 (5461 slots) master
S: c0930ac9697fc61c4169a17cfe478c5ed7abcdad 10.153.128.2:10092
replicates ea9e17f267fd57917effdb2d4dc8c036193fc7b4
M: da1a998071762c27c11f9868baa83aa264b2cb94 10.153.128.3:10091
slots:5461-10922 (5462 slots) master
S: 52c93cfd56a7505af121edadf56817010da939ac 10.153.128.3:10092
replicates 909124b309a62f4915ba4d23ccbc180c8d2a6e7c
M: ea9e17f267fd57917effdb2d4dc8c036193fc7b4 10.153.128.4:10091
slots:10923-16383 (5461 slots) master
S: 4b0c0db92167d311414adf8a91345fadda9801fd 10.153.128.4:10092
replicates da1a998071762c27c11f9868baa83aa264b2cb94
Can I set the above configuration? (type 'yes' to accept): yes
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join....
>>> Performing Cluster Check (using node 10.153.128.2:10091)
M: 909124b309a62f4915ba4d23ccbc180c8d2a6e7c 10.153.128.2:10091
slots:0-5460 (5461 slots) master
1 additional replica(s)
M: da1a998071762c27c11f9868baa83aa264b2cb94 10.153.128.3:10091
slots:5461-10922 (5462 slots) master
1 additional replica(s)
S: c0930ac9697fc61c4169a17cfe478c5ed7abcdad 10.153.128.2:10092
slots: (0 slots) slave
replicates ea9e17f267fd57917effdb2d4dc8c036193fc7b4
S: 52c93cfd56a7505af121edadf56817010da939ac 10.153.128.3:10092
slots: (0 slots) slave
replicates 909124b309a62f4915ba4d23ccbc180c8d2a6e7c
M: ea9e17f267fd57917effdb2d4dc8c036193fc7b4 10.153.128.4:10091
slots:10923-16383 (5461 slots) master
1 additional replica(s)
S: 4b0c0db92167d311414adf8a91345fadda9801fd 10.153.128.4:10092
slots: (0 slots) slave
replicates da1a998071762c27c11f9868baa83aa264b2cb94
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
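The layout in the output above follows from simple arithmetic: with --replicas 1 every master gets one replica, so six nodes yield three masters, and the 16384 hash slots are divided among them. A minimal sketch of that calculation, with variable names chosen here for illustration:

```shell
# Given the node count and the --replicas value, work out how many masters
# result and how the 16384 hash slots divide among them.
nodes=6
replicas=1
masters=$((nodes / (replicas + 1)))
slots_per_master=$((16384 / masters))
leftover=$((16384 % masters))

echo "masters: $masters"
echo "slots per master: $slots_per_master, with $leftover left over for one of them"

# redis-trib.rb refuses to create a cluster with fewer than 3 masters.
if [ "$masters" -lt 3 ]; then
    echo "not enough nodes for --replicas $replicas"
fi
```

This matches the listing: two masters hold 5461 slots and one holds 5462.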
6.3.2 Checking the cluster state
Check the cluster state with a command:
[root@m111 ~]# redis-trib.rb check 192.168.19.112:10091
>>> Performing Cluster Check (using node 192.168.19.112:10091)
M: 87d1c783257939fcf1fa457eb54a65e00faf900e 192.168.19.112:10091
slots:0-5460 (5461 slots) master
1 additional replica(s)
M: 5d6511053863f6174403014d60dee4050f6eb083 192.168.19.113:10091
slots:5461-10922 (5462 slots) master
1 additional replica(s)
M: 95532a328abbff8c127cb0c695a5070b2f43b196 192.168.19.111:10091
slots:10923-16383 (5461 slots) master
1 additional replica(s)
S: b66a59745e963bd2c012b152466dc30408794ae9 192.168.19.112:10092
slots: (0 slots) slave
replicates 95532a328abbff8c127cb0c695a5070b2f43b196
S: 37501419bd01aa96f45136bf7d5f9df28eeb2dbb 192.168.19.111:10092
slots: (0 slots) slave
replicates 5d6511053863f6174403014d60dee4050f6eb083
S: efb3021000ddbcb9356f5d0a0d7602eadc30b0cc 192.168.19.113:10092
slots: (0 slots) slave
replicates 87d1c783257939fcf1fa457eb54a65e00faf900e
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
The output above means the cluster started successfully; it also shows which nodes are masters and which are replicas
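For a scripted health check, the two [OK] lines are the ones to look for. The sketch below is an assumption-laden illustration: cluster_check_ok is a hypothetical helper, and the sample text is copied from the output above in place of a live `redis-trib.rb check` call.

```shell
# Succeeds only if the check output on stdin contains both [OK] lines.
# The patterns are not anchored to the start of the line because
# redis-trib.rb may wrap lines in terminal color codes.
cluster_check_ok() {
    out=$(cat)
    printf '%s\n' "$out" | grep -q '\[OK\] All nodes agree about slots configuration\.' &&
    printf '%s\n' "$out" | grep -q '\[OK\] All 16384 slots covered\.'
}

# Sample lines copied from the output above; on a live cluster, pipe
# `redis-trib.rb check <ip>:<port>` into cluster_check_ok instead.
sample='[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.'

if printf '%s\n' "$sample" | cluster_check_ok; then
    echo "cluster healthy"
else
    echo "cluster check failed"
fi
```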
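Manual cluster check:
Pick three nodes A, B, and C (connect with redis-cli -c so that requests are redirected to the node that owns the key):
Node A: set testA 123
Node B: get testA
Node C: set testA 12C
Node A: get testA
If node B returns 123 and node A then returns 12C, the nodes are serving the same data set.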
6.4 Cluster error log
1. Using host aliases instead of IPs can make cluster creation fail:
[root@localhost ~]# redis-trib.rb create --replicas 1 m111:10091 m111:10092 m112:10091 m112:10092 m113:10091 m113:10092
>>> Creating cluster
>>> Performing hash slots allocation on 6 nodes...
Using 3 masters:
m111:10091
m112:10091
m113:10091
Adding replica m112:10092 to m111:10091
Adding replica m113:10092 to m112:10091
Adding replica m111:10092 to m113:10091
>>> Trying to optimize slaves allocation for anti-affinity
[WARNING] Some slaves are in the same host as their master
M: b57035cb84f4a9560ecb05fe80ea744c928b9b5d m111:10091
slots:0-5460 (5461 slots) master
S: 4e5ee66a192f8042373f6ec53307d921cc29e915 m111:10092
replicates b57035cb84f4a9560ecb05fe80ea744c928b9b5d
M: b57035cb84f4a9560ecb05fe80ea744c928b9b5d m112:10091
slots:5461-10922 (5462 slots) master
S: 4e5ee66a192f8042373f6ec53307d921cc29e915 m112:10092
replicates b57035cb84f4a9560ecb05fe80ea744c928b9b5d
M: b57035cb84f4a9560ecb05fe80ea744c928b9b5d m113:10091
slots:10923-16383 (5461 slots) master
S: 4e5ee66a192f8042373f6ec53307d921cc29e915 m113:10092
replicates b57035cb84f4a9560ecb05fe80ea744c928b9b5d
Can I set the above configuration? (type 'yes' to accept): yes
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Traceback (most recent call last):
10: from /usr/local/bin/redis-trib.rb:1830:in `<main>'
9: from /usr/local/bin/redis-trib.rb:1431:in `create_cluster_cmd'
8: from /usr/local/bin/redis-trib.rb:939:in `join_cluster'
7: from /usr/local/bin/redis-trib.rb:939:in `each'
6: from /usr/local/bin/redis-trib.rb:941:in `block in join_cluster'
5: from /usr/local/ruby/lib/ruby/gems/2.5.0/gems/redis-4.2.1/lib/redis.rb:3310:in `cluster'
4: from /usr/local/ruby/lib/ruby/gems/2.5.0/gems/redis-4.2.1/lib/redis.rb:69:in `synchronize'
3: from /usr/local/ruby/lib/ruby/2.5.0/monitor.rb:226:in `mon_synchronize'
2: from /usr/local/ruby/lib/ruby/gems/2.5.0/gems/redis-4.2.1/lib/redis.rb:69:in `block in synchronize'
1: from /usr/local/ruby/lib/ruby/gems/2.5.0/gems/redis-4.2.1/lib/redis.rb:3311:in `block in cluster'
/usr/local/ruby/lib/ruby/gems/2.5.0/gems/redis-4.2.1/lib/redis/client.rb:127:in `call': ERR Invalid node address specified: m111:10091 (Redis::CommandError)
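The error comes from CLUSTER MEET, which accepts only an IP address, so aliases have to be converted before calling redis-trib.rb. A rough sketch of doing that from a hosts-format file follows. The alias-to-IP mapping in the sample file is an assumption made for illustration, and lookup_ip and to_ip_port are hypothetical helpers; on a real machine HOSTS_FILE would point at /etc/hosts.

```shell
# Sample hosts-format file. The mapping below is assumed for this sketch;
# on a real server set HOSTS_FILE=/etc/hosts instead.
HOSTS_FILE=$(mktemp)
cat > "$HOSTS_FILE" <<'EOF'
192.168.19.111 m111
192.168.19.112 m112
192.168.19.113 m113
EOF

# Look up the first address listed for an alias in the hosts file.
lookup_ip() {
    awk -v name="$1" '
        /^[ \t]*#/ { next }
        { for (i = 2; i <= NF; i++) if ($i == name) { print $1; exit } }
    ' "$HOSTS_FILE"
}

# Convert "alias:port" to "ip:port"; fail if the alias is unknown.
to_ip_port() {
    ip=$(lookup_ip "${1%:*}")
    [ -n "$ip" ] || { echo "cannot resolve ${1%:*}" >&2; return 1; }
    echo "$ip:${1##*:}"
}

# Build the redis-trib.rb command line with IP addresses only (printed, not run).
args=""
for node in m111:10091 m111:10092 m112:10091 m112:10092 m113:10091 m113:10092; do
    args="$args $(to_ip_port "$node")"
done
echo "redis-trib.rb create --replicas 1$args"
```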
2. After a first failed attempt, a second attempt with the correct command also fails
Traceback (most recent call last):
11: from /usr/local/bin/redis-trib.rb:1830:in `<main>'
10: from /usr/local/bin/redis-trib.rb:1426:in `create_cluster_cmd'
9: from /usr/local/bin/redis-trib.rb:905:in `flush_nodes_config'
8: from /usr/local/bin/redis-trib.rb:905:in `each'
7: from /usr/local/bin/redis-trib.rb:906:in `block in flush_nodes_config'
6: from /usr/local/bin/redis-trib.rb:212:in `flush_node_config'
5: from /usr/local/ruby/lib/ruby/gems/2.5.0/gems/redis-4.2.1/lib/redis.rb:3310:in `cluster'
4: from /usr/local/ruby/lib/ruby/gems/2.5.0/gems/redis-4.2.1/lib/redis.rb:69:in `synchronize'
3: from /usr/local/ruby/lib/ruby/2.5.0/monitor.rb:226:in `mon_synchronize'
2: from /usr/local/ruby/lib/ruby/gems/2.5.0/gems/redis-4.2.1/lib/redis.rb:69:in `block in synchronize'
1: from /usr/local/ruby/lib/ruby/gems/2.5.0/gems/redis-4.2.1/lib/redis.rb:3311:in `block in cluster'
/usr/local/ruby/lib/ruby/gems/2.5.0/gems/redis-4.2.1/lib/redis/client.rb:127:in `call': ERR Slot 5461 is already busy (Redis::CommandError)
This error is left over from the previous failed attempt, so the fix is to clear
the cluster state file that the Redis configuration file points to
cluster-config-file /etc/redis-cluster/node-10091.conf
#with several nodes on one server, simply remove every file in the directory:
rm -rf /etc/redis-cluster/*
Then restart the service on every node
3. Port connection failure:
Creating cluster
[ERR] Sorry, can't connect to node 192.168.19.112:10092
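The error states plainly that port 10092 on server 112 cannot be reached.
Possible causes:
- port 10092 is not open: the firewall may be blocking it
- the password of the 10092 node differs, so the login fails
- the 10092 node service is not running, so the connection is refused
Start the service and open the port in the firewall as needed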
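A quick TCP probe helps separate the firewall and service causes from the password cause. The sketch below assumes bash (for its /dev/tcp pseudo-device) and coreutils `timeout` are available; port_open is a hypothetical helper, and the node addresses are the ones from the error above.

```shell
# Succeeds if a TCP connection to host:port can be opened within 2 seconds.
# Relies on bash's /dev/tcp pseudo-device and coreutils `timeout` (assumed present).
port_open() {
    timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

for node in 192.168.19.112:10091 192.168.19.112:10092; do
    if port_open "${node%:*}" "${node##*:}"; then
        echo "$node reachable"
    else
        echo "$node NOT reachable: check that the service is running and the firewall allows the port"
    fi
done
```

If the port is reachable but redis-trib.rb still cannot connect, the password mismatch is the more likely cause.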
4. Node login failure:
NOAUTH Authentication required
# This error means the password was wrong or not given, so the session is not authenticated; enter the password:
auth password
5. Creation fails after the cluster was shut down abnormally
[ERR] Node 192.168.0.102:7001 is not empty. Either the node already knows other nodes (check with CLUSTER NODES) or contains some key in database 0.
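In this case, clear the affected node with the flushall command and restart it; if the node still knows other nodes, also remove its cluster state file as described in item 2
6. Other errors
The following error can appear when a node was not shut down cleanly or the server restarted abnormally. If no redis-server process is running for that port, delete the stale pid file, then start the service again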
/var/redis/run/redis_10091.pid exists, process is already running or crashed