ORA-00494: enqueue [CF] held for too long (more than 900 seconds)

近日某客户数据库凌晨5点左右crash,报如下错误:
Fri Nov 11 03:17:51 2016
Thread 1 advanced to log sequence 583682 (LGWR switch)
  Current log# 16 seq# 583682 mem# 0: /jzh/jzh_u38/jzh/redo16a.log
Fri Nov 11 05:23:03 2016
ALTER SYSTEM ARCHIVE LOG
Fri Nov 11 05:23:03 2016
Thread 1 cannot allocate new log, sequence 583683
Private strand flush not complete
  Current log# 16 seq# 583682 mem# 0: /jzh/jzh_u38/jzh/redo16a.log
Fri Nov 11 05:23:03 2016
Thread 1 advanced to log sequence 583683 (LGWR switch)
  Current log# 6 seq# 583683 mem# 0: /jzh/jzh_u38/jzh/redo06a.log
Fri Nov 11 05:24:13 2016
ALTER SYSTEM ARCHIVE LOG
Fri Nov 11 05:26:58 2016
Thread 1 advanced to log sequence 583684 (LGWR switch)
  Current log# 7 seq# 583684 mem# 0: /jzh/jzh_u38/jzh/redo07a.log
Fri Nov 11 05:27:27 2016
Starting control autobackup
Fri Nov 11 05:27:54 2016
Control autobackup written to SBT_TAPE device
        comment 'API Version 2.0,MMS Version 5.0.0.0',
        media '@aaaag'
        handle 'c-1734772114-20161111-03'
Fri Nov 11 05:48:24 2016
Errors in file /u01/oracle/admin/jzh/bdump/jzh_arc1_1940.trc:
ORA-00494: enqueue [CF] held for too long (more than 900 seconds) by 'inst 1, osid 1921'
Fri Nov 11 05:48:25 2016
System State dumped to trace file /u01/oracle/admin/jzh/bdump/jzh_arc1_1940.trc
Fri Nov 11 05:48:35 2016
Killing enqueue blocker (pid=1921) on resource CF-00000000-00000000
 by killing session 824.1
Killing enqueue blocker (pid=1921) on resource CF-00000000-00000000
 by terminating the process
ARC1: terminating instance due to error 2103
Termination issued to instance processes. Waiting for the processes to exit
Fri Nov 11 05:48:46 2016
Instance termination failed to kill one or more processes
Fri Nov 11 05:55:48 2016
Instance terminated by ARC1, pid = 1940
Fri Nov 11 07:05:18 2016
Starting ORACLE instance (normal)
Fri Nov 11 07:05:18 2016
Specified value of sga_max_size is too small, bumping to 12247367680
说明:1、pid 1921/sid 824进程持有enqueue [CF]超过900s,R持有资源CF-00000000-00000000
          2、pid 1940 ARC1进程kill pid 1921失败,并关闭实例。
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining Scoring Engine
and Real Application Testing options
ORACLE_HOME = /u01/oracle/product/db10gr2
System name:    HP-UX
Node name:      JZHDB2
Release:        B.11.31
Version:        U
Machine:        9000/800
Instance name: jzh
Redo thread mounted by this instance: 1
Oracle process number: 9
Unix process pid: 1940, image: oracle@JZHDB2 (ARC1)
*** SERVICE NAME:(SYS$BACKGROUND) 2016-11-11 05:48:24.968
*** SESSION ID:(1651.1) 2016-11-11 05:48:24.968
*** 2016-11-11 05:48:24.968
Unable to get enqueue on resource CF-00000000-00000000 (ges mode req=3 held=6)
Possible local blocker ospid=1921 sid=824 sser=1 time_held=910 secs (ges mode req=6 held=4)
DUMP LOCAL BLOCKER: initiate state dump for KILL BLOCKER
  possible owner[10.1921] on resource CF-00000000-00000000
Dumping process info of pid[10.1921] requested by pid[9.1940]
ORA-00494: enqueue [CF] held for too long (more than 900 seconds) by 'inst 1, osid 1921'
-------------------------------------------------------------------------------
ENQUEUE [CF] HELD FOR TOO LONG
enqueue holder: 'inst 1, osid 1921'
Process 'inst 1, osid 1921' is holding an enqueue for maximum allowed time.
The process will be terminated. 
Oracle Support Services triaging information: to find the root-cause, look
at the call stack of process 'inst 1, osid 1921' located below. Ask the
developer that owns the first NON-service layer in the stack to investigate.
Common service layers are enqueues (ksq), latches (ksl), library cache
pins and locks (kgl), and row cache locks (kqr). 
Dumping process 10.1921 info:
*** 2016-11-11 05:48:24.982
Dumping diagnostic information for ospid 1921:
OS pid = 1921
loadavg : 0.17 0.18 0.22
Swapinfo : 
        Avail = 111554.56Mb Used = 37997.61Mb
        Swap free = 73556.95Mb Kernel rsvd = 10888.46Mb
        Free Mem  = 22945.59Mb 
  F S      UID   PID  PPID  C PRI NI             ADDR   SZ            WCHAN    STIME TTY       TIME COMD
11001 R   oracle  1921     1  0 152 20        41489b600 29229                -  Apr 15  ?        8080:31 ora_ckpt_jzh
11001 S   oracle 19215     1  0 154 20        6dc245080 28763        3821d9b68  Nov  6  ?         0:13 oraclejzh (LOCAL=NO)
11001 S   oracle 19213     1  0 154 20        e9d85dd00 28827         dcd59268  Nov  6  ?         0:15 oraclejzh (LOCAL=NO)
11001 S   oracle 19211     1  0 154 20        29d10aa80 28826        cb19bce68  Nov  6  ?         0:15 oraclejzh (LOCAL=NO)
12401 S   patrol 15982     1  0 153 20        48f97f080 7320        a5c32a0a0 05:39:35 ?         0:00 /opt/langtools/bin/gdb64 -quie
t /u01/oracle/product/db10gr2/bin/oracle 1921
*** 2016-11-11 05:48:25.581
----------------------------------------
SO: c000000301002038, type: 2, owner: 0000000000000000, flag: INIT/-/-/0x00
  (process) Oracle pid=10, calls cur/top: c0000003015c8548/c0000003015c8548, flag: (6) SYSTEM
            int error: 0, call error: 0, sess error: 0, txn error 0
  (post info) last post received: 0 0 24
              last post received-location: ksasnd
              last process to post me: c000000301009f38 159 0
              last post sent: 0 0 25
              last post sent-location: ksasnr
              last process posted by me: c000000301009f38 159 0
  (latch info) wait_event=0 bits=0
    Process Group: DEFAULT, pseudo proc: c00000038c9cf2b8
    O/S info: user: oracle, term: UNKNOWN, ospid: 1921 
    OSD pid info: Unix process pid: 1921, image: oracle@JZHDB2 (CKPT)
Dump of memory from 0xC00000038B934648 to 0xC00000038B934850
C00000038B934640                   00000010 00000000          [........]
C00000038B934650 C0000003 04851500 00000007 000313A7  [................]
C00000038B934660 C0000003 048515E0 00000007 000313A7  [................]
C00000038B934670 C0000003 048517B8 00000007 000313A7  [................]
C00000038B934680 C0000003 0484A998 00000007 000313A7  [................]
C00000038B934690 C0000003 048ECBB8 00000007 000313A7  [................]
C00000038B9346A0 C0000003 048F6960 00000007 000313A7  [......i`........]
C00000038B9346B0 C0000003 048F8210 00000007 000313A7  [................]
C00000038B9346C0 C0000003 048F8E68 00000007 000313A7  [.......h........]
C00000038B9346D0 C0000003 04901360 00000007 000313A7  [.......`........]
C00000038B9346E0 C0000003 04901A78 00000007 000313A7  [.......x........]
C00000038B9346F0 C0000003 04901ED8 00000007 000313A7  [................]
C00000038B934700 C0000003 04902178 00000007 000313A7  [......!x........]
C00000038B934710 C0000003 0647F1B8 00000007 000313A7  [.....G..........]
C00000038B934720 C0000003 0647F298 00000007 000313A7  [.....G..........]
C00000038B934730 C0000003 048386E0 00000007 000313A7  [................]
C00000038B934740 C0000003 048387C0 00000007 000313A7  [................]
C00000038B934750 C0000003 0484D840 00000007 000313A7  [.......@........]
C00000038B934760 C0000003 0484D920 00000007 000313A7  [....... ........]
C00000038B934770 C0000003 0484F8D0 00000007 000313A7  [................]
C00000038B934780 C0000003 0484F9B0 00000007 000313A7  [................]
C00000038B934790 C0000003 0484FAA8 00000007 000313A7  [................]
C00000038B9347A0 C0000003 0484FB88 00000007 000313A7  [................]
C00000038B9347B0 C0000003 0484FC68 00000007 000313A7  [.......h........]
C00000038B9347C0 C0000003 0484FE28 00000007 000313A7  [.......(........]
C00000038B9347D0 C0000003 0484FF08 00000007 000313A7  [................]
C00000038B9347E0 C0000003 0484FFE8 00000007 000313A7  [................]
C00000038B9347F0 C0000003 049266D0 00000007 000313A7  [......f.........]
C00000038B934800 C0000003 049267B0 00000007 000313A7  [......g.........]
C00000038B934810 C0000003 04926890 00000007 000313A7  [......h.........]
C00000038B934820 C0000003 04926988 00000007 000313A7  [......i.........]
C00000038B934830 C0000003 0492B9A0 00000007 000313A7  [................]
C00000038B934840 C0000003 04851420 00000007 000313A7  [....... ........]
----------------------------------------
SO: c0000003015c0b00, type: 4, owner: c000000301002038, flag: INIT/-/-/0x00
  (session) sid: 824 trans: 0000000000000000, creator: c000000301002038, flag: (51) USR/- BSY/-/-/-/-/-
            DID: 0001-000A-00000004, short-term DID: 0000-0000-00000000
            txn branch: 0000000000000000
            oct: 0, prv: 0, sql: 0000000000000000, psql: 0000000000000000, user: 0/SYS
  waiting for 'db file sequential read' blocking sess=0x0000000000000000 seq=52963 wait_time=0 seconds since wait started=910
          file#=7c, block#=1, blocks=1
  Dumping Session Wait History
   for 'db file single write' count=1 wait_time=488
          file#=7b, block#=1, blocks=1
   for 'db file sequential read' count=1 wait_time=4382
          file#=7b, block#=1, blocks=1
   for 'db file single write' count=1 wait_time=333
          file#=7a, block#=1, blocks=1
   for 'db file sequential read' count=1 wait_time=4759
          file#=7a, block#=1, blocks=1
   for 'db file single write' count=1 wait_time=587
          file#=79, block#=1, blocks=1
   for 'db file sequential read' count=1 wait_time=7210
          file#=79, block#=1, blocks=1
   for 'db file single write' count=1 wait_time=554
          file#=78, block#=1, blocks=1
   for 'db file sequential read' count=1 wait_time=8361
          file#=78, block#=1, blocks=1
   for 'db file single write' count=1 wait_time=330
          file#=77, block#=1, blocks=1
   for 'db file sequential read' count=1 wait_time=2886
          file#=77, block#=1, blocks=1
  temporary object counter: 0
KTU Session Commit Cache Dump for IDLs: 
KTU Session Commit Cache Dump for Non-IDLs: 
----------------------------------------
UOL used : 0 locks(used=0, free=2)
KGX Atomic Operation Log c000000388ff8820
 Mutex 0000000000000000(0, 0) idn 0 oper NONE
 Cursor Parent uid 824 efd 13 whr 22 slp 0
 oper=NONE pt1=c000000383fb0c50 pt2=c000000383fb0e38 pt3=c000000383fb1358
 pt4=0000000000000000 u41=0 stt=0
KGX Atomic Operation Log c000000388ff8868
 Mutex c000000382b23650(1073741824, 13490240) idn c0000003 oper NONE
 Cursor Stat uid 824 efd 13 whr 2 slp 0
 oper=NONE pt1=0000000000000000 pt2=0000000000000000 pt3=0000000000000000
 pt4=0000000000000000 u41=0 stt=8
KGX Atomic Operation Log c000000388ff88b0
 Mutex 0000000000000000(0, 0) idn 0 oper NONE
 Library Cache uid 824 efd 0 whr 0 slp 0
Session 824: no row
===================================================
SYSTEM STATE
------------
System global information:
     processes: base c00000038b930d40, size 1500, cleanup c000000301000078
     allocation: free sessions c00000038c9d7738, free calls 0000000000000000
     control alloc errors: 0 (process), 0 (session), 0 (call)
     PMON latch cleanup depth: 0
     seconds since PMON's last scan for dead processes: 18
说明:1、ARCH1进程请求的资源正是CF-00000000-00000000
          2、pid 1921为oracle后台进程ckpt,阻塞了ARCH1进程
          3、pid 1921 ckpt进程持有enqueue [CF] on resource CF-00000000-00000000为910s.
        ORA-00494错误通常是由于系统I/O原因,结合故障时,本地磁盘I/O较高,MOS上找到文档ORA-00494: Enqueue [CF] Error when        using rman (文档 ID 1598848.1)
CAUSE
The snapshot controlfile setting for rman was using a slow filesystem.
SOLUTION
The snapshot controlfile setting for rman was changed to a different filesystem that performs better, for instance:
CONFIGURE SNAPSHOT CONTROLFILE NAME TO '/new_mount/snapcf.ctl';
         说明: 1、当前系统已开启control file自动备份
                   2、snapshot controlfile备份存在本地磁盘,即$ORACLE_HOME/dbs目录下
                   3、snapshot controlfile大小约74M
-rw-r-----   1 oracle     oinstall   74252288 Nov 23  2015 c-1734772114-20151123-14
-rw-r-----   1 oracle     oinstall   74252288 Jan  7  2016 c-1734772114-20160107-11
-rw-r-----   1 oracle     oinstall   74252288 Jan  7  2016 c-1734772114-20160107-12
-rw-r-----   1 oracle     oinstall   74252288 Jan 13  2016 c-1734772114-20160113-12
-rw-r-----   1 oracle     oinstall   74252288 Feb 17  2016 c-1734772114-20160217-12
-rw-r-----   1 oracle     oinstall   74252288 Feb 17  2016 c-1734772114-20160217-13
-rw-r-----   1 oracle     oinstall   74252288 Mar 21  2016 c-1734772114-20160321-13
-rw-r-----   1 oracle     oinstall      5236 Apr 11  2016 alert_jzh.log
-rw-r-----   1 oracle     oinstall   74252288 Apr 22  2016 c-1734772114-20160422-13
-rw-r-----   1 oracle     oinstall   74252288 Apr 22  2016 c-1734772114-20160422-14
-rw-r-----   1 oracle     oinstall   74252288 Apr 22  2016 c-1734772114-20160422-16
-rw-r-----   1 oracle     oinstall   74252288 May  3  2016 c-1734772114-20160503-07
-rw-r-----   1 oracle     oinstall      3072 Nov  5 21:02 orapwjzh
-rw-rw----   1 oracle     oinstall      1544 Nov 11 05:48 hc_jzh.dat
-rw-r-----   1 oracle     oinstall   74170368 Nov 11 16:25 snapcf_jzh.f
总结:由于本地磁盘I/O性能较差,导致snapshot controlfile备份很慢,导致数据库crash
建议:将snapshot存放于I/O性能较好的磁盘或存储上,另外archivelog条目超过25000个,建议备份时crosscheck,并delete obsolete/epired archivelog






来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/10271187/viewspace-2128566/,如需转载,请注明出处,否则将追究法律责任。

转载于:http://blog.itpub.net/10271187/viewspace-2128566/

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值