ORA-07445: exception encountered: core dump [kssdch()+2188] [SIGSEGV] [Address not mapped to object] [0x00008239D] [] []
Related errors: this could be a data dictionary bug triggered by PL/SQL Developer, but that problem no longer occurs from V5 onward, and our PL/SQL Developer is V8.
(k2g table)
error 602 detected in background process
ORA-00602: internal programming exception
ORA-07445: exception encountered: core dump [kssdch()+2188] [SIGSEGV] [Address not mapped to object] [0x00008239D] [] []
This problem really can bring the instance down; the bug was only fixed in 11g.
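To compare these errors across logs, it helps to pull the fields out of the ORA-07445 line. The following is a minimal sketch that assumes the argument layout seen in the messages quoted in this post (function and offset, signal, description, address); the names `ORA_7445` and `parse_ora_7445` are my own and not part of any Oracle tool.

```python
import re

# Assumed layout, taken from the messages quoted in this post:
# [function()+offset] [signal] [description] [address] ...
ORA_7445 = re.compile(
    r"ORA-0*7445: exception encountered: core dump "
    r"\[(?P<func>\w+)\(\)\+(?P<offset>\d+)\] "
    r"\[(?P<signal>\w+)\]"
    r"(?:\s*\[(?P<detail>[^\]]*)\])?"
    r"(?:\s*\[(?P<address>[^\]]*)\])?"
)


def parse_ora_7445(line):
    """Return the parsed fields as a dict, or None if the line does not match."""
    match = ORA_7445.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["offset"] = int(fields["offset"])
    return fields


line = ("ORA-07445: exception encountered: core dump [kssdch()+2188] "
        "[SIGSEGV] [Address not mapped to object] [0x00008239D] [] []")
print(parse_ora_7445(line))
```

The description and address groups are optional because the alert log sometimes wraps the message onto a second line; in that case they come back as None.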
SIGSEGV
Typically, the signals seen are SIGBUS (signal 10, bus error) and SIGSEGV (signal 11, segmentation violation). There are other UNIX signals and exceptions that may happen, however, they are likely caused by OS problems rather than an Oracle problem. Examples of other signals are: SIGINT, SIGKILL, SIGSYS. A complete list is available in Note:1038055.6.
Error explanation
SIGSEGV
Segmentation violation. This signal can also result from an illegal
pointer reference or an array bound error.
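The signal numbers quoted above are worth checking on the actual host, because they are not the same everywhere: SIGSEGV is 11 on the common Unix platforms, but as far as I know SIGBUS is 10 only on some of them and 7 on Linux x86. A small sketch using Python's standard `signal` module (the helper name `signal_number` is mine, and this assumes a Unix-like system):

```python
import signal


def signal_number(name):
    """Return the number this platform assigns to a signal name such as 'SIGSEGV'."""
    return int(getattr(signal, name))


for name in ("SIGSEGV", "SIGBUS"):
    print(name, signal_number(name))
```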
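So it still looks like a software error, even though the note points to the OS for some signals. A forum post, however, suggests flushing the shared pool (ALTER SYSTEM FLUSH SHARED_POOL) as a way to resolve it.
Below is a bug report; I have picked out the key parts.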
When attempting to cleanup after a SQL*Net connection is terminated, the following error occurs:
ORA-07445: exception encountered: core dump [kssdct()+94] [SIGSEGV] [Address not mapped to object] [0x00000240E] [] []
and then the instance is terminated, due to PMON reporting the below errors:
ORA-00602: internal programming exception
ORA-07445: exception encountered: core dump [kssdch()+2188] [SIGSEGV] [Address not mapped to object] [0x00000241E] [] []
Oracle 10.2.0.5 on Linux x86-64.
8 node RAC database
Intermittent instance failures on one node. So far, two failures.
This sequence of events looks somewhat like our own crash:
ORA-602: internal programming exception
ORA-7445: exception encountered: core dump [kssdch()+2188] [SIGSEGV]
[Address not mapped to object] [0x0000708FB] [] []
Thu Nov 18 01:02:04 GMT 2010
PMON: terminating instance due to error 602
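To find every occurrence of this pattern in a long alert log, one can scan for the PMON termination line and remember the most recent timestamp line before it. This is only a sketch based on the excerpt above; the two regular expressions and the function name `find_pmon_terminations` are my own assumptions about the log layout.

```python
import re

# Timestamp lines look like "Thu Nov 18 01:02:04 GMT 2010" in the excerpt above.
TIMESTAMP = re.compile(r"^\w{3} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2} ")
PMON_TERM = re.compile(r"PMON: terminating instance due to error (\d+)")


def find_pmon_terminations(lines):
    """Return (timestamp_line, error_number) for each PMON termination found."""
    events = []
    last_timestamp = None
    for raw in lines:
        line = raw.strip()
        if TIMESTAMP.match(line):
            last_timestamp = line
            continue
        match = PMON_TERM.search(line)
        if match:
            events.append((last_timestamp, int(match.group(1))))
    return events


excerpt = """\
ORA-602: internal programming exception
ORA-7445: exception encountered: core dump [kssdch()+2188] [SIGSEGV]
[Address not mapped to object] [0x0000708FB] [] []
Thu Nov 18 01:02:04 GMT 2010
PMON: terminating instance due to error 602
""".splitlines()

print(find_pmon_terminations(excerpt))
```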
This problem is tied to unpublished bug 9184754.
I was not able to look up its contents.
==================
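In any case, the problem has been fixed. Below is Oracle's recommendation.
Download and apply the one-off patch number Patch:9184754 on top of your version/platform combination, if available.
Comparing the call stacks, ours matches exactly. Be sure the call stacks really are identical; otherwise do not jump to conclusions.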
Call stack : kssdct()
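A quick first check can be automated by comparing the function names reported in the ORA-7445 arguments of our log against those in the bug note. This is a rough sketch only: it looks at the failing function named in the error message, not the full call stack from the trace file, which still has to be compared by hand. The names `failing_functions` and `same_signature` are hypothetical.

```python
import re

# Matches the first ORA-7445 argument, e.g. "core dump [kssdch()+2188]".
FRAME = re.compile(r"core dump \[(\w+)\(\)\+\d+\]")


def failing_functions(log_text):
    """Collect the function names reported in ORA-7445 'core dump' arguments."""
    return set(FRAME.findall(log_text))


def same_signature(our_log, bug_note):
    """True if every failing function in our log also appears in the bug note."""
    ours = failing_functions(our_log)
    return bool(ours) and ours <= failing_functions(bug_note)


our_log = "ORA-7445: exception encountered: core dump [kssdch()+2188] [SIGSEGV]"
bug_note = (
    "ORA-07445: exception encountered: core dump [kssdct()+94] [SIGSEGV]\n"
    "ORA-07445: exception encountered: core dump [kssdch()+2188] [SIGSEGV]\n"
)
print(same_signature(our_log, bug_note))
```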
After the patch was applied, the problem was resolved.
For details, see Doc ID 1281101.1
Below is some other material I looked up myself. Consider it study notes.
==============================
Disable RAC
3. Change the working directory to $ORACLE_HOME/lib:
cd $ORACLE_HOME/lib
4. Run the following make command to relink the Oracle binaries without the RAC option:
make -f ins_rdbms.mk rac_off
make -f ins_rdbms.mk ioracle
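Steps 3 and 4 can be wrapped in a small script that defaults to a dry run, so the commands can be reviewed before anything is relinked. This is a sketch only: the function names and the example ORACLE_HOME path are hypothetical, the make targets are copied from the steps above, and the earlier steps not shown here are not covered.

```python
import os
import subprocess


def relink_commands(oracle_home):
    """Build (working_dir, command) pairs for the two make targets listed above."""
    lib_dir = os.path.join(oracle_home, "lib")
    targets = ["rac_off", "ioracle"]
    return [(lib_dir, ["make", "-f", "ins_rdbms.mk", target]) for target in targets]


def relink_without_rac(oracle_home, dry_run=True):
    """Print each command; run it only when dry_run is False."""
    for cwd, cmd in relink_commands(oracle_home):
        print(f"(cd {cwd} && {' '.join(cmd)})")
        if not dry_run:
            subprocess.run(cmd, cwd=cwd, check=True)


# Hypothetical ORACLE_HOME, used only to show the generated commands.
relink_without_rac("/u01/app/oracle/product/10.2.0/db_1")
```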
==========================
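A RAC node can fail for three reasons: first, the node leaves normally; second, the node's heartbeat dies (the heartbeat is recorded in the controlfile); third, communication with the node is interrupted.
RAC uses UDP for interconnect traffic by default. UDP is connectionless, so it carries less protocol overhead than TCP and needs no three-way handshake, and the private interconnect rarely drops packets.
Causes of communication failure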
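The "no handshake" point is easy to see with plain sockets: a UDP sender can fire a datagram at an address without establishing a connection first. The sketch below is a generic loopback illustration in Python, not anything RAC-specific; `udp_roundtrip` is a name I made up.

```python
import socket


def udp_roundtrip(payload, timeout=2.0):
    """Send one datagram over loopback and return what the receiver got."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))   # let the OS pick a free port
        receiver.settimeout(timeout)      # UDP gives no delivery guarantee
        # No connect/accept step: the datagram is simply sent to the address.
        sender.sendto(payload, receiver.getsockname())
        data, _addr = receiver.recvfrom(4096)
        return data
    finally:
        sender.close()
        receiver.close()


print(udp_roundtrip(b"ping"))
```

The timeout on the receiving side is there because a lost datagram is never retransmitted by UDP itself; the application has to notice the silence.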
If a message is not received for a timeout period, then a “communication failure” is assumed. This
is more relevant for UDP, as Reliable Shared Memory (RSM), Reliable DataGram protocol (RDG),
and Hyper Messaging Protocol (HMP) do not need it, since the acknowledgment mechanisms are
built into the cluster communication and protocol itself.
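The timeout rule in this passage can be sketched in a few lines: remember when each node was last heard from, and assume a communication failure once the silence exceeds the timeout. This is only an illustration of the idea, not how Oracle implements it; the class name, the 3-second timeout, and the node names are all hypothetical.

```python
import time


class HeartbeatMonitor:
    """Track the last message time per node and flag nodes that went silent."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout = timeout_seconds
        self.clock = clock          # injectable so the logic can be tested
        self.last_seen = {}

    def record(self, node):
        """Call this whenever a message from `node` arrives."""
        self.last_seen[node] = self.clock()

    def failed_nodes(self):
        """Nodes whose last message is older than the timeout."""
        now = self.clock()
        return sorted(n for n, t in self.last_seen.items()
                      if now - t > self.timeout)


# Simulated clock so the example runs instantly.
fake_now = [0.0]
monitor = HeartbeatMonitor(timeout_seconds=3.0, clock=lambda: fake_now[0])
monitor.record("node1")
monitor.record("node2")
fake_now[0] = 2.0
monitor.record("node1")      # node1 keeps talking, node2 goes quiet
fake_now[0] = 4.0
print(monitor.failed_nodes())
```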
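Most UDP-based protocols are unreliable. If packets are being dropped, the protocol can be bypassed, for example by setting _reliable_block_sends=TRUE. That might switch the traffic to TCP; I do not know yet.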
(user-mode IPC protocols
such as RDG (on HP Tru64 UNIX TruCluster) or HP HMP are used,)
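_lgwr_async_broadcasts = true: this parameter controls whether asynchronous broadcasts are allowed.
In 9i, every COMMIT required all nodes to write redo. Source: ITPUB blog, link: http://blog.itpub.net/21818314/viewspace-693195/. If you repost this, please credit the source; otherwise legal liability will be pursued.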