Linux AS 5.3, Oracle 10.2.0.4, 64-bit, RAC, 3 nodes
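On 2/11, in the period leading up to 16:00, the load on node 1 dropped from 11 to about 1.0 at around 16:00. Node 3 showed the same load trend, and the load on node 4 stayed below 2.3 the whole time.
Why was "IPC Send timeout detected. Receiver ospid 25822" reported? The session that belonged to ospid = 25822 can probably no longer be found by a query now.
What can cause this error? A network problem on the heartbeat (interconnect) link? I have asked the network administrator to check the switch logs, but there is no result yet. I do not know whether there are other possible causes.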
---------------------------------------------------------------------------------------------------------
NODE1 alert log message:
Thu Feb 11 16:12:16 2010
Thread 1 advanced to log sequence 8254 (LGWR switch)
Current log# 28 seq# 8254 mem# 0: /ocfs_ctrl_redo/mxdell/redo28.log
Current log# 28 seq# 8254 mem# 1: /ocfs_data/mxdell/redo28.log
Thu Feb 11 16:30:02 2010
IPC Send timeout detected. Receiver ospid 25822
Thu Feb 11 16:30:02 2010
Errors in file /u01/product/admin/mxdell/udump/mxdell1_ora_25822.trc:
IPC Send timeout detected. Receiver ospid 25822
Thu Feb 11 16:30:03 2010
Errors in file /u01/product/admin/mxdell/udump/mxdell1_ora_25822.trc:
Thu Feb 11 16:34:05 2010
Thread 1 advanced to log sequence 8255 (LGWR switch)
Current log# 30 seq# 8255 mem# 0: /ocfs_ctrl_redo/mxdell/redo30.log
Current log# 30 seq# 8255 mem# 1: /ocfs_data/mxdell/redo30.log
----------------------------------------------------------------------------------------------------------
NODE3 alert log message:
Thu Feb 11 16:22:11 2010
Thread 3 advanced to log sequence 7402 (LGWR switch)
Current log# 22 seq# 7402 mem# 0: /ocfs_ctrl_redo/mxdell/redo22.log
Current log# 22 seq# 7402 mem# 1: /ocfs_data/mxdell/redo22.log
Thu Feb 11 16:30:02 2010
IPC Send timeout detected.Sender: ospid 15764
Receiver: inst 1 binc 6 ospid 25822
Thu Feb 11 16:30:02 2010
IPC Send timeout detected.Sender: ospid 15766
Receiver: inst 1 binc 6 ospid 25822
Thu Feb 11 16:34:41 2010
Thread 3 advanced to log sequence 7403 (LGWR switch)
Current log# 23 seq# 7403 mem# 0: /ocfs_ctrl_redo/mxdell/redo23.log
Current log# 23 seq# 7403 mem# 1: /ocfs_data/mxdell/redo23.log
----------------------------------------------------------------------------------------------------------
NODE4 alert log message:
Thu Feb 11 16:12:16 2010
Thread 4 advanced to log sequence 3194 (LGWR switch)
Current log# 9 seq# 3194 mem# 0: /ocfs_ctrl_redo/mxdell/redo09_1.log
Current log# 9 seq# 3194 mem# 1: /ocfs_data/mxdell/redo09_2.log
Thu Feb 11 16:30:02 2010
IPC Send timeout detected.Sender: ospid 6046
Receiver: inst 1 binc 6 ospid 25822
Thu Feb 11 16:30:02 2010
IPC Send timeout detected.Sender: ospid 6071
Receiver: inst 1 binc 6 ospid 25822
Thu Feb 11 16:30:02 2010
IPC Send timeout detected.Sender: ospid 6003
Receiver: inst 1 binc 6 ospid 25822
Thu Feb 11 16:30:02 2010
IPC Send timeout detected.Sender: ospid 6035
Receiver: inst 1 binc 6 ospid 25822
Thu Feb 11 16:30:02 2010
IPC Send timeout detected.Sender: ospid 6016
Receiver: inst 1 binc 6 ospid 25822
Thu Feb 11 16:30:03 2010
IPC Send timeout detected.Sender: ospid 6067
Receiver: inst 1 binc 6 ospid 25822
Thu Feb 11 16:42:08 2010
Thread 4 advanced to log sequence 3195 (LGWR switch)
Current log# 10 seq# 3195 mem# 0: /ocfs_ctrl_redo/mxdell/redo10_1.log
Current log# 10 seq# 3195 mem# 1: /ocfs_data/mxdell/redo10_2.log
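The three alert logs above all point at the same receiver process on instance 1. A minimal Python sketch for collecting those entries from an alert log is shown here. It assumes only the two message layouts visible in the logs above (the one-line "Receiver ospid" form and the two-line "Sender / Receiver: inst ... binc ... ospid" form); the function names parse_ipc_timeouts and senders_by_receiver are made up for this illustration and are not part of any Oracle tool.

```python
import re
from collections import defaultdict

# Alert-log timestamp lines, e.g. "Thu Feb 11 16:30:02 2010".
TS_RE = re.compile(r"^[A-Z][a-z]{2} [A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2} \d{4}$")
# Remote form, first line: "IPC Send timeout detected.Sender: ospid 6046"
SENDER_RE = re.compile(r"IPC Send timeout detected\.\s*Sender: ospid (\d+)")
# Receiver part: "Receiver ospid 25822" (local form) or
# "Receiver: inst 1 binc 6 ospid 25822" (remote form, second line).
RECEIVER_RE = re.compile(r"Receiver:? (?:inst (\d+) binc \d+ )?ospid (\d+)")


def parse_ipc_timeouts(text):
    """Return one dict per 'IPC Send timeout detected' entry in an alert log."""
    events = []
    timestamp = None
    sender = None  # set while waiting for the following "Receiver:" line
    for raw in text.splitlines():
        line = raw.strip()
        if TS_RE.match(line):
            timestamp, sender = line, None
            continue
        m = SENDER_RE.search(line)
        if m:
            sender = m.group(1)
            continue
        if sender is not None or "IPC Send timeout detected" in line:
            m = RECEIVER_RE.search(line)
            if m:
                events.append({
                    "time": timestamp,
                    "sender_ospid": sender,        # None in the local form
                    "receiver_inst": m.group(1),   # None in the local form
                    "receiver_ospid": m.group(2),
                })
        sender = None
    return events


def senders_by_receiver(events):
    """Group sender ospids under each receiver ospid."""
    grouped = defaultdict(list)
    for e in events:
        grouped[e["receiver_ospid"]].append(e["sender_ospid"])
    return dict(grouped)
```

Running parse_ipc_timeouts over each node's alert log and merging the results with senders_by_receiver makes it easy to see whether every timeout targets one receiver ospid, as it does here, or whether several receivers are involved.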
----------------------------------------------------------------------------------------------------------
NODE1 trace file message:
/u01/product/admin/mxdell/udump/mxdell1_ora_25822.trc
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options
ORACLE_HOME = /u01/product/oracle
System name: Linux
Node name: mxrac01
Release: 2.6.18-128.el5
Version: #1 SMP Wed Dec 17 11:41:38 EST 2008
Machine: x86_64
Instance name: mxdell1
Redo thread mounted by this instance: 1
Oracle process number: 583
Unix process pid: 25822, image: oracle@mxrac01
*** 2010-02-11 16:24:05.672
*** ACTION NAME:() 2010-02-11 16:24:05.672
*** MODULE NAME:(TOAD 9.0.1.8) 2010-02-11 16:24:05.672
*** SERVICE NAME:(mxdell) 2010-02-11 16:24:05.672
*** SESSION ID:(2015.6536) 2010-02-11 16:24:05.672
SKGXPSEGRCV: MESSAGE TRUNCATED user data 48 bytes payload 2104 bytes
SKGXPSEGRCV: trucated message buffer data skgxpmsg meta. data header 0x0x7fff7775a678 len 48 bytes
SKGXPLOSTACK: message truncation expected
SKGXPLOSTACK: data sent to port with no buffers queued from
SKGXPGPID 0x7fff7775a7c8 Internet address 192.168.1.14 UDP port number 17506
SKGXPLOSTACK: sent seq 32787 expecting 32788
SKGXPLOSTACK: lost ack detected retransmit ack
SKGXPSEGRCV: MESSAGE TRUNCATED user data 48 bytes payload 2104 bytes
SKGXPSEGRCV: trucated message buffer data skgxpmsg meta. data header 0x0x7fff7775a678 len 48 bytes
SKGXPLOSTACK: message truncation expected
SKGXPLOSTACK: data sent to port with no buffers queued from
SKGXPGPID 0x7fff7775a7c8 Internet address 192.168.1.14 UDP port number 13121
SKGXPLOSTACK: sent seq 32787 expecting 32788
SKGXPLOSTACK: lost ack detected retransmit ack
SKGXPSEGRCV: MESSAGE TRUNCATED user data 48 bytes payload 2104 bytes
SKGXPSEGRCV: trucated message buffer data skgxpmsg meta. data header 0x0x7fff7775a678 len 48 bytes
SKGXPLOSTACK: message truncation expected
SKGXPLOSTACK: data sent to port with no buffers queued from
SKGXPGPID 0x7fff7775a7c8 Internet address 192.168.1.14 UDP port number 30817
SKGXPLOSTACK: sent seq 32787 expecting 32788
SKGXPLOSTACK: lost ack detected retransmit ack
SKGXPSEGRCV: MESSAGE TRUNCATED user data 48 bytes payload 2104 bytes
SKGXPSEGRCV: trucated message buffer data skgxpmsg meta. data header 0x0x7fff7775a678 len 48 bytes
SKGXPLOSTACK: message truncation expected
SKGXPLOSTACK: data sent to port with no buffers queued from
SKGXPGPID 0x7fff7775a7c8 Internet address 192.168.1.14 UDP port number 47606
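The trace excerpt above repeats the same pattern many times, so it can help to tally it before reading it line by line. The following Python sketch only counts strings that appear in the trace as pasted (MESSAGE TRUNCATED, lost ack detected, the SKGXPGPID address and port, and the sent/expecting sequence numbers). It does not interpret Oracle internals, and the function name summarize_skgxp is hypothetical.

```python
import re
from collections import defaultdict

# "SKGXPGPID 0x... Internet address 192.168.1.14 UDP port number 17506"
PEER_RE = re.compile(
    r"SKGXPGPID \S+ Internet address (\d{1,3}(?:\.\d{1,3}){3}) UDP port number (\d+)"
)
# "SKGXPLOSTACK: sent seq 32787 expecting 32788"
SEQ_RE = re.compile(r"SKGXPLOSTACK: sent seq (\d+) expecting (\d+)")


def summarize_skgxp(text):
    """Tally the SKGXP messages found in a trace file excerpt."""
    truncated = 0
    lost_acks = 0
    ports_by_peer = defaultdict(set)
    seq_pairs = set()
    for line in text.splitlines():
        if "MESSAGE TRUNCATED" in line:
            truncated += 1
        if "lost ack detected" in line:
            lost_acks += 1
        m = PEER_RE.search(line)
        if m:
            ports_by_peer[m.group(1)].add(int(m.group(2)))
        m = SEQ_RE.search(line)
        if m:
            seq_pairs.add((int(m.group(1)), int(m.group(2))))
    return {
        "truncated": truncated,
        "lost_acks": lost_acks,
        "peers": {ip: sorted(ports) for ip, ports in ports_by_peer.items()},
        "seq_pairs": seq_pairs,
    }
```

On the excerpt above, such a summary would show a single peer address, 192.168.1.14, with a different UDP port in each block and the same sent/expecting sequence pair each time, which is a useful detail to pass on when the interconnect is being checked.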
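From eygle's blog.
Link: http://www.eygle.com/archives/20 ... stance_evicted.html
IPC Send timeout is a very troublesome problem in Oracle 10g RAC. When resources are tight or the network is congested, an IPC timeout can occur, after which RAC evicts the problem node and triggers a round of reconfiguration.
The good news is that Metalink has a patch for 10.2.0.3 that fixes this, and the problem is completely fixed in 10.2.0.4.
A typical error message looks like this: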
Thu Nov 27 11:32:05 2008
IPC Send timeout detected. Receiver ospid 4001974
Thu Nov 27 11:33:08 2008
Trace dumping is performing id=[cdmp_20081127113236]
Thu Nov 27 11:34:37 2008
Errors in file /oracle/app/product/admin/srs/bdump/srs1_lms1_4001974.trc:
Thu Nov 27 11:34:38 2008
Errors in file /oracle/app/product/admin/srs/bdump/srs1_lmon_3977348.trc:
ORA-29740: evicted by member 1, group incarnation 32
Thu Nov 27 11:34:38 2008
LMON: terminating instance due to error 29740
The bug number is Bug 5190596.
In my experience 10.2.0.3 did hit this problem often, while it is rarely seen on 10.2.0.4.
---------------------------------------------------------------------------
Bug on Windows
Bug 6782276 Win: ORA-27508 from RAC IPC
This note gives a brief overview of bug 6782276.
The content was last updated on: 03-APR-2009
Click here for details of each of the sections below.
Affects:
Product (Component)
Oracle Server (Rdbms)
Range of versions believed to be affected
Versions < 11.2
Versions confirmed as being affected
11.1.0.6
Platforms affected
Windows/NT/XP
Fixed:
This issue is fixed in
10.2.0.4 Patch 9 on Windows Platforms
11.1.0.7 (Server Patch Set)
11.2 (Future Release)
Symptoms:
Related To:
(None Specified)
ORA-27508 / ORA-27300
RAC (Real Application Clusters) / OPS
Description
RAC instance on Windows may get failures such as:
ORA-27508: IPC error sending a message
ORA-27300: OS system dependent operation:IPCSOCK_Send failed with status: 10055
ORA-27301: OS failure message: An operation on a socket could not be performed ...
ORA-27302: failure occurred at: send_3
From the ITPUB blog, link: http://blog.itpub.net/35489/viewspace-627249/. If you repost this, please credit the source; otherwise legal action may be taken.