Environment
OS version: Red Hat Enterprise Linux Server release 5.8
DB version: Oracle Database 10g Enterprise Edition Release 10.2.0.1.0
Symptoms
A normal shutdown of the test database hung:
SQL> shutdown immediate;
The alert log contained many occurrences of the following message:
PMON failed to acquire latch,see PMON dump
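To gauge how often the message appears, the alert log can be scanned with a short script. The following is a minimal sketch in Python, not part of the original troubleshooting. The function name `count_pmon_latch_failures` is hypothetical, and the pattern assumes the message text is exactly as shown above, with or without a space after the comma.

```python
import re

# Matches the alert-log message with or without a space after the comma.
PMON_LATCH_MSG = re.compile(r"PMON failed to acquire latch,\s*see PMON dump")


def count_pmon_latch_failures(alert_log_text: str) -> int:
    """Count the lines of alert-log text that contain the PMON latch message."""
    return sum(
        1 for line in alert_log_text.splitlines() if PMON_LATCH_MSG.search(line)
    )


if __name__ == "__main__":
    # Sample text only; in practice, read the instance's alert log file instead.
    sample = "\n".join(
        [
            "PMON failed to acquire latch,see PMON dump",
            "Fri Oct 5 10:33:00 2007",
            "PMON failed to acquire latch, see PMON dump",
        ]
    )
    print(count_pmon_latch_failures(sample))
```

A count that keeps growing while the shutdown is stuck suggests PMON is still retrying the latch rather than making progress.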
The PMON dump file contained:
PMON unable to acquire latch 60007498 process allocation level=1
Location from where latch is held: ksukia:
Context saved from call: 0
state=busy, wlstate=free
gotten 149 times wait, failed first 0 sleeps 0
gotten 80 times nowait, failed: 0
possible holder pid = 21 ospid=3587
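The key fields in this dump are the latch name and the suspected holder. The sketch below pulls them out of the trace text in Python. It is an illustration only: `parse_pmon_dump` is a hypothetical helper, and the regular expressions assume the trace lines have exactly the layout shown above, which may differ between Oracle versions.

```python
import re


def parse_pmon_dump(dump_text: str) -> dict:
    """Extract latch and holder details from PMON trace text (assumed layout)."""
    info = {}

    # e.g. "PMON unable to acquire latch 60007498 process allocation level=1"
    latch = re.search(
        r"PMON unable to acquire latch (\w+) (.+?) level=(\d+)", dump_text
    )
    if latch:
        info["latch_address"] = latch.group(1)
        info["latch_name"] = latch.group(2)
        info["level"] = int(latch.group(3))

    # e.g. "possible holder pid = 21 ospid=3587"
    holder = re.search(r"possible holder pid = (\d+) ospid=(\d+)", dump_text)
    if holder:
        info["holder_pid"] = int(holder.group(1))
        info["holder_ospid"] = int(holder.group(2))

    return info


if __name__ == "__main__":
    sample = (
        "PMON unable to acquire latch 60007498 process allocation level=1\n"
        "possible holder pid = 21 ospid=3587\n"
    )
    print(parse_pmon_dump(sample))
```

The `ospid` value can then be looked up at the OS level to see which process is suspected of holding the process allocation latch.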
Cause
This is an Oracle bug.
Solution
Apply the patch, or try the following steps as a workaround:
1. Add the following to $ORACLE_HOME/network/admin/listener.ora:
INBOUND_CONNECT_TIMEOUT_LISTENER=0
2. Add the following to the sqlnet.ora file on the Oracle 10g server:
SQLNET.INBOUND_CONNECT_TIMEOUT=0
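3. Restart the database and the listener so that the changes take effect.
Note: the steps above are not guaranteed to work; applying the patch remains the best fix.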
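Before restarting, it can help to confirm that both files really contain the new settings. The following Python sketch is one way to check. It is not from the original post: `get_parameter` is a hypothetical helper, and it assumes each parameter sits on a single `NAME = value` line. Note that the `_LISTENER` suffix in the listener.ora parameter is the listener name, so a listener with a different name would need a different suffix.

```python
import re


def get_parameter(config_text: str, name: str):
    """Return the value of a NAME = value line, or None if it is absent.

    Comment lines starting with '#' are skipped and the parameter name is
    matched case-insensitively.
    """
    pattern = re.compile(
        r"^\s*" + re.escape(name) + r"\s*=\s*(\S+)", re.IGNORECASE
    )
    for line in config_text.splitlines():
        if line.lstrip().startswith("#"):
            continue
        match = pattern.match(line)
        if match:
            return match.group(1)
    return None


if __name__ == "__main__":
    # Sample file contents; in practice, read listener.ora and sqlnet.ora
    # from $ORACLE_HOME/network/admin.
    listener_ora = "INBOUND_CONNECT_TIMEOUT_LISTENER=0\n"
    sqlnet_ora = "SQLNET.INBOUND_CONNECT_TIMEOUT=0\n"
    print(get_parameter(listener_ora, "INBOUND_CONNECT_TIMEOUT_LISTENER"))
    print(get_parameter(sqlnet_ora, "SQLNET.INBOUND_CONNECT_TIMEOUT"))
```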
Reference
metalink:
Applies to:
Oracle Server - Enterprise Edition - Version: 10.2.0.1.0 to 10.2.0.3.0
This problem can occur on any platform.
Symptoms
Database Instance hangs and connections to database using 'sqlplus' are also not possible.
Checking alert.log, we see the following messages:
PMON failed to acquire latch, see PMON dump
Fri Oct 5 10:33:00 2007
PMON failed to acquire latch, see PMON dump
Fri Oct 5 10:34:05 2007
PMON failed to acquire latch, see PMON dump
Errors in file / dwrac/BDUMP/dwhp_pmon_1912834.trc:
This also produces a systemstate dump, and its location is mentioned in alert.log.
At the OS level, we also see that MMAN is consuming a lot of CPU.
Cause
Development is currently working on this issue in Bug 6488694 - DATABSE HUNG WITH PMON FAILED TO ACQUIRE LATCH MESSAGE
Solution
As of now, the only workaround is to disable Automatic Shared Memory Management (ASMM), i.e. by setting
SGA_TARGET = 0
Also, as per the bug, you can set the following event and restart the instance:
EVENT = "10235 trace name context forever, level 2"
Development suspects memory corruption in this case, so with the above event set, the database might hit ORA-600 before the spin. The ORA-600 trace file would help investigate the issue; these trace files need to be sent to Oracle Support for investigation.
Note: Event 10235 at level 2 or higher can increase latch contention, though the impact may not be critical. If you see latch contention, unset the event.
References
Bug 6488694 - DATABSE HUNG WITH PMON FAILED TO ACQUIRE LATCH MESSAGE