MMMNL absent错误 ,数据库无法连接

MMMNL absent错误 ,数据库无法连接
2011-8-24
Kevin Zou
大运结束后回公司上班,美国同事告诉我,一个DB 异常无法连接上,只好shutdown abort,重启数据库。 需要去找下root cause.
我登录到系统,检查日志查看原因。

这个系统一般每个小时都会有归档日志的产生,但是在当地时间14-16之间是没有日志的。
SQL> select to_char(FIRST_TIME,'YYYY-MM-DD HH24') , count(*)
 from v$archived_log
 group by  to_char(FIRST_TIME,'YYYY-MM-DD HH24')
我截取了部分的记录
2011-08-23 00          4
2011-08-23 01          2
2011-08-23 02          3
2011-08-23 03          2
2011-08-23 04          2
2011-08-23 05          2
2011-08-23 06          2
2011-08-23 07          2
2011-08-23 08          2
2011-08-23 09          3
2011-08-23 10          2
2011-08-23 11          1
2011-08-23 12          3
2011-08-23 13          2
2011-08-23 14          4
2011-08-23 16          2
2011-08-23 17          5
2011-08-23 18          2
2011-08-23 19          2
2011-08-23 20          2
2011-08-23 21          2
2011-08-23 22          1


2011-08-23 14          4
2011-08-23 16          2
之间是没有2011-08-23 15 的记录。说明在这段时间内数据库处于HUNG 状态,用户无法登陆,系统也不能做任何事情。
Tue Aug 23 14:12:25 2011
Thread 1 advanced to log sequence 26547 (LGWR switch)
  Current log# 1 seq# 26547 mem# 0: /diskfdcprodb03/oradata1/oracle/timsdb/redo01/redo1a.log
  Current log# 1 seq# 26547 mem# 1: /diskfdcprodb04/oradata2/oracle/timsdb/redo02/redo1b.log
Tue Aug 23 16:29:31 2011
Shutting down instance: further logons disabled
Tue Aug 23 16:29:31 2011
MMNL absent for 7795 secs; Foregrounds taking over
Tue Aug 23 16:29:31 2011
WARNING: inbound connection timed out (ORA-3136)
Tue Aug 23 16:29:33 2011
ksvcreate: Process(q002) creation failed
Tue Aug 23 16:29:33 2011
ksvcreate: Process(m000) creation failed
Tue Aug 23 16:29:34 2011
Thread 1 advanced to log sequence 26548 (LGWR switch)
  Current log# 2 seq# 26548 mem# 0: /diskfdcprodb03/oradata1/oracle/timsdb/redo01/redo2a.log
  Current log# 2 seq# 26548 mem# 1: /diskfdcprodb04/oradata2/oracle/timsdb/redo02/redo2b.log
Tue Aug 23 16:30:33 2011
ksvcreate: Process(m000) creation failed
ksvcreate: Process(m000) creation failed
Tue Aug 23 16:29:34 2011
Thread 1 advanced to log sequence 26548 (LGWR switch)
  Current log# 2 seq# 26548 mem# 0: /diskfdcprodb03/oradata1/oracle/timsdb/redo01/redo2a.log
  Current log# 2 seq# 26548 mem# 1: /diskfdcprodb04/oradata2/oracle/timsdb/redo02/redo2b.log
Tue Aug 23 16:30:33 2011
ksvcreate: Process(m000) creation failed
Tue Aug 23 16:31:33 2011
ksvcreate: Process(m000) creation failed
Tue Aug 23 16:31:36 2011
Stopping background process QMNC
Tue Aug 23 16:31:38 2011
Stopping background process MMNL
Tue Aug 23 16:31:39 2011
Stopping background process MMON
Tue Aug 23 16:31:40 2011
Shutting down instance (immediate)
License high water mark = 164
Tue Aug 23 16:36:43 2011
SHUTDOWN: Active processes prevent shutdown operation
Tue Aug 23 16:38:37 2011
ALTER DATABASE CLOSE NORMAL
Tue Aug 23 16:38:37 2011
SMON: disabling tx recovery
SMON: disabling cache recovery
Tue Aug 23 16:38:43 2011
Shutting down archive processes
Archiving is disabled
Tue Aug 23 16:38:48 2011
ARCH shutting down
ARC3: Archival stopped


可能的原因:
1) The Memory Monitor Light (MMNL) process is a new process in 10g which works with the Automatic Workload Repository new features (AWR) to write out full statistics buffers to disk as needed. These messages can be generated if you have the database in restricted mode. 

2) These messages are for informational purposes only to indicate a hang situation which lasted a long period of time and MMNL is unable to run its job at that moment. When the hang lasts for several hours before the instance is completely stopped/killed, the tail of the alert.log may flood with these messages. Although the message text contains MMNL process, in this case MMNL is actually the victim of the hang issue rather than the cause. 


我觉的这个数据库应该是第二种原因,数据库由于HANG住了,MMNL进程无法在后台运行;当这种HANG的状态一直到系统完全被停止前有好几个小时,Oracle会在alert.log的尾端加上MMNL ABSENT 的一句话。 虽然这句话包括了MMNL进程,实际上是ORACLE进程HANG住的证明,而不仅仅是MMNL进程。
MMNL的这段话只证明了ORACLE 进程确实是被HANG啦,但到底是啥原因导致的呢?
我找到了一个BUG 的描述:
Bug 5470031: INCORRECT MAXSIZE ON UNDO DATAFILE LEADS TO "MMNL ABSENT FOR XX SECS; FOREGROUND。 这个BUG还是处于开发/解决状态。
就是当设置了AUTOEXTEND 为 ON,MAXSIZE小于真实的SIZE大小时,系统就会报这个错误。

An incorrect MAXSIZE for the undo-datafile(s) leads to MMNL messages in the 
alert.log

I think there are 2 issues :
1. Why are you able to set a ALTER DATABASE DATAFILE ...AUTOEXTEND MAXSIZE
   smaller than the actual filesize.

2. MMNL is reporting messages in the alert.log which do not have a clear
   root-cause. At least there is no direct link to the message and the 
   strange MAXSIZE (MAXBLOCKS) value.

解决方案为关闭AUOEXNTEND,或者MAXSIZE值大于SIZE的值。

我的环境:
SQL> select bytes   , maxbytes  ,file_name , AUTOEXTENSIBLE
  2  from dba_data_files
  3  where file_name like '%undo%';

          BYTES        MAXBYTES FILE_NAME                                          AUT
--------------- --------------- -------------------------------------------------- ---
     1073741824      1073741824 /diskfdcprodb03/oradata1/oracle/timsdb/data00/dbf/ YES
                                undotbs_01.dbf

     1073741824               0 /diskfdcprodb06/oradata1/oracle/timsdb/idx01/dbf/u NO
                                ndotbs_02.dbf


一个文件没有打开AUTOEXEND,另外一个文件是AUTOEXEND 为ON,难道 MAXSIZE等于SIZE的大小也会出错?
取消自动增长选项,观察是否再次出现类似情况
-END-

来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/40239/viewspace-705720/,如需转载,请注明出处,否则将追究法律责任。

转载于:http://blog.itpub.net/40239/viewspace-705720/

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值