RHEL 7.x, 12c and later troubleshooting | IPC Send timeout/node eviction etc. with high packet reassembles failure

db_murphy

Published 2021-07-13 09:54:30

Article link: https://blog.csdn.net/db_murphy/article/details/118693138

Applies to: RHEL 7.x, Oracle 12c and later

Official document:
Theme: RHEL 6.6: IPC Send timeout/node eviction etc with high packet reassembles failure (Doc ID 2008933.1)

APPLIES TO:
Oracle Database - Enterprise Edition - Version 11.2.0.1 and later
Oracle Database Cloud Schema Service - Version N/A and later
Oracle Database Exadata Cloud Machine - Version N/A and later
Oracle Cloud Infrastructure - Database Service - Version N/A and later
Oracle Database Backup Service - Version N/A and later
Generic Linux

SYMPTOMS

Fri May 01 03:05:48 2015
IPC Send timeout detected. Receiver ospid 28660 [oracle@xxxxx (LMS0)]
Fri May 01 03:05:48 2015
Errors in file <ORACLE_BASE>/diag/rdbms/xrcovd/<dbname>/trace/<sid>_lms0_28660.trc:
IPC Send timeout detected. Receiver ospid 28670 [oracle@xxxxx (LMS1)]
Fri May 01 03:05:53 2015
Errors in file <ORACLE_BASE>/diag/rdbms/xrcovd/<dbname>/trace/<sid>_lms1_28670.trc:
Fri May 01 03:06:00 2015
IPC Send timeout detected. Receiver ospid 31414 [oracle@xxxxx (PZ98)]
Fri May 01 03:06:00 2015
Errors in file <ORACLE_BASE>/diag/rdbms/xrcovd/<dbname>/trace/<sid>_pz98_31414.trc:
Fri May 01 03:06:13 2015
IPC Send timeout detected. Receiver ospid 1835 [oracle@xxxxx (PZ97)]
Fri May 01 03:06:13 2015
Errors in file <ORACLE_BASE>/diag/rdbms/xrcovd/<dbname>/trace/<sid>_pz97_1835.trc:
Fri May 01 03:06:43 2015
Fri May 01 03:06:43 2015
Received an instance abort message from instance 1Received an instance abort message from instance 1
Please check instance 1 alert and LMON trace files for detail.Please check instance 1 alert and LMON trace files for detail.

LMS0 (ospid: 28660): terminating the instance due to error 481

Fri May 01 03:06:43 2015

System state dump requested by (instance=3, osid=28660 (LMS0)), summary=[abnormal instance termination].
System State dumped to trace file <ORACLE_BASE>/diag/rdbms/xrcovd/<dbname>/trace/<sid>_diag_28625.trc

Other symptoms could be:

node eviction
instance/node won’t join the cluster after instance/node eviction without rebooting the node where “packet reassembles failed” is happening
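
The "packet reassembles failed" figure reported by `netstat -s` corresponds to the `ReasmFails` field of the `Ip:` lines in /proc/net/snmp. A minimal sketch for reading that counter follows; the function name `reasm_fails` and its optional file argument are illustrative and not part of the Oracle note.

```shell
# Print the IP ReasmFails counter from a file in /proc/net/snmp format.
# The first "Ip:" line holds the column names, the second holds the values.
reasm_fails() {
    rf_file="${1:-/proc/net/snmp}"
    awk '
        $1 == "Ip:" && !have_header {
            for (i = 2; i <= NF; i++) col[$i] = i   # map column name -> position
            have_header = 1
            next
        }
        $1 == "Ip:" && ("ReasmFails" in col) {
            print $(col["ReasmFails"])
            exit
        }
    ' "$rf_file"
}
```

Calling `reasm_fails` twice a few seconds apart and comparing the two numbers shows whether the counter is still climbing on the private interconnect node.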

CAUSE
RHEL 6.6 includes a few ipfrag fixes and increased the default ipfrag_*_thresh values:

cat /proc/sys/net/ipv4/ipfrag_low_thresh
3145728
cat /proc/sys/net/ipv4/ipfrag_high_thresh
4194304

However, the issue still happens. For Oracle Linux running the Red Hat compatible kernel, the issue was tracked in the bug below, which was later closed as 'Not a bug':

BUG 21036841 - LCOV5/7/17 SERVER CRASHED AFTER PATCH UPGRADE AND KERNEL UPGRADE

SOLUTION
A workaround is to enable jumbo frames.

Increase the values of the kernel parameters below:

net.ipv4.ipfrag_high_thresh = 16777216
net.ipv4.ipfrag_low_thresh = 15728640

These are 16 MB and 15 MB. sysctl expects the values in bytes, so write them as byte counts rather than as 16M and 15M.
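
As a quick check of the arithmetic, the sketch below converts a size in MB to the byte counts that appear in the `sysctl -p` output in this post (16777216 and 15728640). `mib_to_bytes` is a hypothetical helper name.

```shell
# Convert a size in MB (MiB) to bytes using plain shell arithmetic.
mib_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

mib_to_bytes 16   # 16777216 -> net.ipv4.ipfrag_high_thresh
mib_to_bytes 15   # 15728640 -> net.ipv4.ipfrag_low_thresh
```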

=================================

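When installing and deploying Oracle, the OS kernel parameters must be modified as required. The following briefly describes the kernel parameters that need to be changed for an Oracle deployment.

Note: most OS kernel parameters live under /proc/sys and can be changed while the system is running, but such changes are lost on reboot. Putting the settings in /etc/sysctl.conf makes them persistent.
sysctl -p
This command applies the kernel parameters configured in sysctl.conf immediately.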
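
One way to keep /etc/sysctl.conf tidy when changing a parameter is to replace the existing line for a key, or append one if the key is missing. The sketch below assumes simple `key = value` lines; the function name `set_sysctl_conf` and its arguments are hypothetical.

```shell
# Set "key = value" in a sysctl-style file: rewrite the first line for the
# key, drop later duplicates, and append the line if the key is absent.
set_sysctl_conf() {
    ssc_key="$1"; ssc_value="$2"; ssc_file="$3"
    [ -f "$ssc_file" ] || : > "$ssc_file"          # create the file if missing
    ssc_tmp=$(mktemp) || return 1
    awk -v k="$ssc_key" -v v="$ssc_value" '
        BEGIN { FS = "=" }
        {
            name = $1
            gsub(/[ \t]/, "", name)                # strip blanks around the key
            if (name == k) { if (!done) print k " = " v; done = 1 }
            else print                             # keep all other lines as-is
        }
        END { if (!done) print k " = " v }
    ' "$ssc_file" > "$ssc_tmp" && cat "$ssc_tmp" > "$ssc_file"
    ssc_rc=$?
    rm -f "$ssc_tmp"
    return $ssc_rc
}

# Example (as root):
#   set_sysctl_conf net.ipv4.ipfrag_high_thresh 16777216 /etc/sysctl.conf
#   sysctl -p
```

For a change that should last only until the next reboot, `sysctl -w key=value` sets the running value without touching the file.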

Official defaults:

The kernel parameters to configure when installing and deploying Oracle are as follows:

[root@ethanDB-rac1 ~]# cat /etc/sysctl.conf
kernel.shmall = 67377299456
kernel.shmmax = 269509197824
vm.nr_hugepages = 61686
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128
fs.file-max = 6815744
net.ipv4.ip_local_port_range = 9000 65500
net.core.rmem_default = 262144
net.core.rmem_max = 4194304
net.core.wmem_default = 262144
net.core.wmem_max = 1048576
fs.aio-max-nr = 1048576
vm.swappiness = 1
vm.min_free_kbytes = 51200
net.ipv4.ipfrag_high_thresh = 16777216
net.ipv4.ipfrag_low_thresh = 15728640

[root@ethanDB-rac1 ~]# sysctl -p
kernel.shmall = 67377299456
kernel.shmmax = 269509197824
vm.nr_hugepages = 61686
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128
fs.file-max = 6815744
net.ipv4.ip_local_port_range = 9000 65500
net.core.rmem_default = 262144
net.core.rmem_max = 4194304
net.core.wmem_default = 262144
net.core.wmem_max = 1048576
fs.aio-max-nr = 1048576
vm.swappiness = 1
vm.min_free_kbytes = 51200
net.ipv4.ipfrag_high_thresh = 16777216
net.ipv4.ipfrag_low_thresh = 15728640

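Parameter descriptions:
kernel.shmall
shmall is the total amount of shared memory, measured in pages. The default of 4294967296 is already large enough and normally does not need adjusting. shmall (in pages, multiplied by the page size) must not be smaller than the SGA size; if it is, instance startup fails with ORA-27123: unable to attach to shared memory segment.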

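kernel.shmmax
shmmax defines the maximum size of a single shared memory segment. Set it large enough to hold the entire SGA in one segment; if it is too small, multiple shared memory segments have to be created, which may degrade system performance.
For the value configured above: 269509197824/1024/1024/1024 = 251 GB.
The unit of shmmax is bytes.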
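
Because shmmax is in bytes and shmall is in pages, it helps to convert between the two when sizing them. The sketch below assumes a 4096-byte page (check with `getconf PAGE_SIZE`); the helper names are hypothetical.

```shell
# Bytes -> whole GB (integer division by 1024 three times).
bytes_to_gib() {
    echo $(( $1 / 1024 / 1024 / 1024 ))
}

# Smallest number of pages that covers a byte count.
# The page size defaults to 4096 bytes, which is an assumption.
pages_for_bytes() {
    pfb_page="${2:-4096}"
    echo $(( ($1 + pfb_page - 1) / pfb_page ))
}

bytes_to_gib 269509197824      # 251
pages_for_bytes 269509197824   # 65798144
```

The second figure is the smallest shmall that would cover a segment of that size; the kernel.shmall value in the listing above is well beyond it.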

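vm.min_free_kbytes
This parameter sets the minimum amount of free memory that the Linux VM keeps in reserve. When free memory drops below this value, the system reclaims cache memory to free up space.
The unit is KB; for the value configured above, 51200/1024 = 50 MB.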

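kernel.sem
sem is short for semaphores, an inter-process communication (IPC) mechanism. The four values of kernel.sem correspond to SEMMSL SEMMNS SEMOPM SEMMNI.

SEMMSL: the maximum number of semaphores per semaphore set. It is usually set to the largest PROCESSES parameter value in the database plus 10. Oracle recommends that SEMMSL be at least 100.

SEMMNS: the maximum number of semaphores in the entire system. Use the following formula to determine the value: SEMMNS = SEMMSL * SEMMNI. For example, 4096 * 128 = 524288; for the configuration above, 250 * 128 = 32000.

SEMOPM: the maximum number of operations allowed in a single semop call. Because a semaphore set can hold up to SEMMSL semaphores, it is recommended to set SEMOPM equal to SEMMSL. Oracle recommends that SEMOPM be at least 100.

SEMMNI: the maximum number of semaphore sets in the entire system. Oracle recommends that SEMMNI be at least 100.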
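
The relationship SEMMNS = SEMMSL * SEMMNI can be checked with a few lines of shell. This is a minimal sketch; `check_sem` is a hypothetical helper that takes the four kernel.sem fields in their usual order.

```shell
# Check whether SEMMNS equals SEMMSL * SEMMNI.
# Arguments: SEMMSL SEMMNS SEMOPM SEMMNI (the order used by kernel.sem).
check_sem() {
    cs_semmsl=$1; cs_semmns=$2; cs_semopm=$3; cs_semmni=$4   # SEMOPM is not used in the check
    cs_expected=$(( cs_semmsl * cs_semmni ))
    if [ "$cs_semmns" -eq "$cs_expected" ]; then
        echo "OK: SEMMNS=$cs_semmns equals SEMMSL*SEMMNI"
    else
        echo "MISMATCH: SEMMNS=$cs_semmns, SEMMSL*SEMMNI=$cs_expected"
    fi
}

check_sem 250 32000 100 128   # OK: SEMMNS=32000 equals SEMMSL*SEMMNI
```

On a live system the four fields can be passed straight from the kernel with `check_sem $(cat /proc/sys/kernel/sem)`.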

fs.file-max
This parameter is the system-wide maximum number of file handles that can be open, that is, how many files can be open on the system at once.

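net.ipv4.ip_local_port_range
This parameter sets the range of local ports used for outbound connections. On RHEL 6/7 the default is typically 32768 to 61000 (1024 to 4999 applied only to very old kernels on low-memory systems).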

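net.ipv4.ipfrag_*
net.ipv4.ipfrag_low_thresh
net.ipv4.ipfrag_high_thresh
When an IP packet is larger than the MTU, it is split into fragments that the receiver must reassemble in memory. ipfrag_high_thresh is the maximum memory, in bytes, used for reassembling IP fragments; once it is reached, fragments are dropped until usage falls back to ipfrag_low_thresh.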

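net.core.rmem_* and net.core.wmem_*
net.core.rmem_default: default socket receive buffer size.
net.core.rmem_max: maximum socket receive buffer size.
net.core.wmem_default: default socket send buffer size.
net.core.wmem_max: maximum socket send buffer size.
All values are in bytes.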

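fs.aio-max-nr
This parameter is the maximum number of concurrent asynchronous I/O requests. Under very heavy I/O, if it is set too low, the database may report ORA-27090 - Unable to Reserve Kernel Resources for Asynchronous Disk I/O. In that case, raise fs.aio-max-nr to the Oracle-recommended value of 3145728.
Note: Doc ID 579108.1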

kernel.shmmni
This parameter is the maximum number of shared memory segments. The default is 4096 and normally does not need adjusting.

vm.nr_hugepages
This parameter enables huge pages and sets how many huge pages to reserve. The unit is a page count.
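
Since vm.nr_hugepages is a count, the memory it reserves depends on the huge page size. The sketch below assumes 2048 KB huge pages (see Hugepagesize in /proc/meminfo for the actual value); `hugepages_mib` is a hypothetical helper name.

```shell
# Memory reserved by a given number of huge pages, in MB.
# The huge page size in KB defaults to 2048, which is an assumption.
hugepages_mib() {
    hp_kb="${2:-2048}"
    echo $(( $1 * hp_kb / 1024 ))
}

hugepages_mib 61686   # 123372
```

For the value configured above, that is 123372 MB, or roughly 120 GB, provided the 2048 KB assumption holds.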
