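1. Problem description
After installing RabbitMQ with the default configuration, its file descriptor and socket descriptor limits turned out to be very low: 924 and 829 respectively. When clients (consumers) hold long-lived connections, the socket limit is exhausted quickly.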
[root@hadoop1 ~]# rabbitmqctl status
Status of node rabbit@hadoop1 ...
[{pid,18946},
{running_applications,
[{rabbitmq_management,"RabbitMQ Management Console","3.7.7"},
..........................................................................
..........................................................................
..........................................................................
{disk_free_limit,50000000},
{disk_free,45836664832},
{file_descriptors,
[{total_limit,924},
{total_used,503},
{sockets_limit,829},
{sockets_used,501}]},
{processes,[{limit,1048576},{used,3890}]},
{run_queue,0},
{uptime,6841},
{kernel,{net_ticktime,60}}]
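The figures of interest in the output above can be pulled out with a small filter. This is a minimal sketch that assumes the Erlang-term status format shown in this post (RabbitMQ 3.7.x); other releases may print status differently. fd_usage is a hypothetical helper name, not part of rabbitmqctl.

```shell
# Reads `rabbitmqctl status` output (Erlang-term format) on stdin and prints
# "<used> <limit>" for the given key prefix: "total" or "sockets".
fd_usage() {
  prefix="$1"
  sed -n \
    -e "s/.*{${prefix}_limit,\([0-9][0-9]*\)}.*/limit \1/p" \
    -e "s/.*{${prefix}_used,\([0-9][0-9]*\)}.*/used \1/p" |
  awk '$1 == "limit" { l = $2 } $1 == "used" { u = $2 } END { print u, l }'
}
```

Running `rabbitmqctl status | fd_usage sockets` against the node above would print 501 829, i.e. roughly 60% of the socket limit is already in use.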
Investigation showed the cause: the system's current ulimit -n is only 1024, and at startup RabbitMQ derives its limits from that value as follows:
file_limit = 1024 - 100; // 924
sockets_limit = trunc((1024 - 100) * 0.9 - 2); //829
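The conversion above can be reproduced for any ulimit -n value, which helps when deciding how high the limit needs to be. The sketch below uses integer arithmetic only; rabbit_fd_limits is a hypothetical helper name, and the formula is the one quoted in this post rather than something read from RabbitMQ at runtime.

```shell
# Prints "<file_limit> <sockets_limit>" for a given `ulimit -n` value,
# following the conversion quoted above.
rabbit_fd_limits() {
  nofile="$1"
  file_limit=$((nofile - 100))
  # Integer form of trunc(file_limit * 0.9 - 2)
  sockets_limit=$((file_limit * 9 / 10 - 2))
  echo "$file_limit $sockets_limit"
}
```

`rabbit_fd_limits 1024` prints 924 829, matching the status output above, and `rabbit_fd_limits 65536` prints 65436 58890.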
2. Fix
The number of file descriptors available to RabbitMQ is constrained by three settings:
1. System level, exposed at /proc/sys/fs/file-max. Raise it if it is too small.
[root@hadoop1 ~]# cat /proc/sys/fs/file-max
778230
[root@hadoop1 ~]# cat /proc/sys/fs/file-nr
2432 0 778230
2. User level, configured in /etc/security/limits.conf:
* - nofile 65536
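3. Process level, i.e. ulimit -n. This applies only to the current shell and to processes started from it:
# Raise the limit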
[root@hadoop1 ~]# ulimit -n 65536
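The system-level and process-level values can also be checked with two small filters. This is a sketch under stated assumptions: it relies on the usual Linux layout of /proc/sys/fs/file-nr (allocated, allocated but unused, maximum) and of the "Max open files" row in /proc/<pid>/limits, and both function names are hypothetical.

```shell
# Both helpers read their input on stdin so they can be tried on saved output.

# /proc/sys/fs/file-nr: allocated handles, allocated-but-unused handles, maximum.
# Prints how many more handles the system can still hand out.
file_nr_headroom() {
  awk '{ print $3 - ($1 - $2) }'
}

# /proc/<pid>/limits has a "Max open files" row: soft limit, hard limit, units.
# Prints the soft limit, which is the one a running process is held to.
soft_nofile() {
  awk '/^Max open files/ { print $4 }'
}
```

For example, `file_nr_headroom < /proc/sys/fs/file-nr` prints 775798 for the values shown above, and `soft_nofile < /proc/<pid>/limits`, with the pid reported by rabbitmqctl status, shows the limit the running Erlang VM actually received.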
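Once all three values are confirmed to be >= the number of connections we need, restart RabbitMQ. Note that the Erlang VM has to be restarted along with it:
# Shut RabbitMQ down completely, including the Erlang process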
[root@hadoop1 ~]# rabbitmqctl stop
Stopping and halting node rabbit@hadoop1 ...
# Start the RabbitMQ service in the background
[root@hadoop1 ~]# rabbitmq-server -detached
Warning: PID file not written; -detached was passed.
# Check the file_descriptors values after the restart
[root@hadoop1 ~]# rabbitmqctl status
Status of node rabbit@hadoop1 ...
[{pid,5381},
.........................................
{file_descriptors,
[{total_limit,65436},
{total_used,2},
{sockets_limit,58890},
{sockets_used,0}]},
{processes,[{limit,1048576},{used,243}]},
{run_queue,0},
{uptime,3},
{kernel,{net_ticktime,60}}]
The new limits are in effect once the node has been restarted.
3. Rejoining a failed node to a RabbitMQ cluster
1. rabbitmqctl stop_app
2. Kill the Erlang process:
ps aux | grep mq
kill -9 <pid>
3. Raise the limits:
[root@hadoop1 ~]# vim /etc/security/limits.conf
* - nofile 65536
[root@hadoop1 ~]# ulimit -n 65536
4. Restart:
rabbitmq-server -detached