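A customer's database environment is a three-node RAC. Every week the three nodes took turns rebooting at irregular times, and each time the alert log showed the node being evicted from the cluster because of a heartbeat failure. After the host rebooted it could rejoin the cluster. I searched Baidu a lot and finally followed one reference; more than a month has passed since then and the reboot problem has not recurred. Recording it here.
Checking netstat -s showed the "packet reassembles failed" counter increasing by a large amount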
netstat -s|grep "packet reassembles failed"
netstat -s | fgrep reassembles
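A single reading of the counter only shows the total since boot, so it can help to sample it twice and look at the difference. The following is a minimal sketch, not part of the original fix: the function names reasm_fails and reasm_delta are made up for illustration, and it assumes the netstat -s line has the form "<number> packet reassembles failed".

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not from the referenced notes).

# Read `netstat -s` style text on stdin and print the
# "packet reassembles failed" counter; prints 0 if the line is absent.
reasm_fails() {
    awk '/packet reassembles failed/ { n = $1 } END { print n + 0 }'
}

# Print how much the counter grew over an interval (seconds, default 60).
reasm_delta() {
    interval=${1:-60}
    before=$(netstat -s | reasm_fails)
    sleep "$interval"
    after=$(netstat -s | reasm_fails)
    echo $((after - before))
}

# Example: reasm_delta 60
```

Running reasm_delta 60 on each node prints the number of new reassembly failures in one minute; a value that keeps growing on the interconnect hosts is the symptom described above.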
Add the following settings to /etc/sysctl.conf and apply them
echo 'net.ipv4.ipfrag_high_thresh = 16777216' >> /etc/sysctl.conf
echo 'net.ipv4.ipfrag_low_thresh = 15728640' >> /etc/sysctl.conf
echo 'net.ipv4.ipfrag_time = 120' >> /etc/sysctl.conf
echo 'net.ipv4.ipfrag_secret_interval = 600' >> /etc/sysctl.conf
echo 'net.ipv4.ipfrag_max_dist = 1024' >> /etc/sysctl.conf
sysctl -p
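After sysctl -p it is worth confirming that the running kernel actually picked up the values. The sketch below is one possible check, not part of the referenced notes: check_sysctl and the GETTER variable are hypothetical names, and the expected values are simply copied from the settings above.

```shell
#!/bin/sh
# Expected values, copied from the settings written to /etc/sysctl.conf.
expected='net.ipv4.ipfrag_high_thresh=16777216
net.ipv4.ipfrag_low_thresh=15728640
net.ipv4.ipfrag_time=120
net.ipv4.ipfrag_secret_interval=600
net.ipv4.ipfrag_max_dist=1024'

# Hypothetical helper: read key=value lines on stdin and compare each
# with the running value. GETTER is the command used to read a key
# (defaults to `sysctl -n`); overriding it allows a dry run.
# Returns non-zero if any key differs or cannot be read.
check_sysctl() {
    getter=${GETTER:-sysctl -n}
    rc=0
    while IFS='=' read -r key want; do
        [ -n "$key" ] || continue
        have=$($getter "$key" 2>/dev/null)
        if [ "$have" = "$want" ]; then
            echo "OK   $key = $have"
        else
            echo "DIFF $key: want $want, have ${have:-<unset>}"
            rc=1
        fi
    done
    return $rc
}

# Example: printf '%s\n' "$expected" | check_sysctl
```

A DIFF line with "<unset>" means the key could not be read on that host, which would suggest the parameter does not exist in that kernel version.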
Reference notes on My Oracle Support:
RHEL 6.6: IPC Send timeout/node eviction etc with high packet reassembles failure (Doc ID 2008933.1)
Troubleshooting gc block lost and Poor Network Performance in a RAC Environment (Doc ID 563566.1)