Oracle 11g disk heartbeat: the difference between serial-port heartbeat and disk heartbeat


How do you analyze a problem like this? Start with the system log. This case is HP-UX, so the system log is /var/adm/syslog/syslog.log; on AIX, use errpt.

In the system log I see:

Nov 11 18:43:57 rx8640c syslog: Oracle CSS family monitor shutting down. 3

Nov 11 18:43:59 rx8640c su: + tty?? root-oracle

Nov 11 18:43:59 rx8640c syslog: Cluster Ready Services completed waiting on dependencies.
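To pull lines like these out of a large syslog, a small filter helps. Below is a minimal sketch, assuming the classic "Mon DD HH:MM:SS host message" syslog layout seen in the excerpt above; the keyword list and the function name clusterware_events are my own choices for illustration, not anything Oracle ships.

```python
import re

# Keywords taken from the syslog lines quoted above; extend as needed.
CLUSTERWARE_KEYWORDS = ("Oracle CSS", "Cluster Ready Services")

# Assumed layout: "Nov 11 18:43:57 rx8640c <message>"
SYSLOG_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) (?P<host>\S+) (?P<msg>.*)$"
)


def clusterware_events(lines):
    """Return (timestamp, host, message) for lines that mention clusterware."""
    events = []
    for line in lines:
        match = SYSLOG_RE.match(line.strip())
        if match and any(k in match.group("msg") for k in CLUSTERWARE_KEYWORDS):
            events.append((match.group("ts"), match.group("host"), match.group("msg")))
    return events


sample_syslog = [
    "Nov 11 18:43:57 rx8640c syslog: Oracle CSS family monitor shutting down. 3",
    "Nov 11 18:43:59 rx8640c su: + tty?? root-oracle",
    "Nov 11 18:43:59 rx8640c syslog: Cluster Ready Services completed waiting on dependencies.",
]

events = clusterware_events(sample_syslog)
for ts, host, msg in events:
    print(ts, host, msg)
```

The su line is dropped because it matches none of the keywords; only the two clusterware messages remain.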

Comparing this with the alert log shows that the system rebooted at about this time:

Wed Nov 11 18:43:28 2009

Trace dumping is performing id=[cdmp_20091111184328]

Wed Nov 11 18:57:17 2009

Starting ORACLE instance (normal)

LICENSE_MAX_SESSION = 0

LICENSE_SESSIONS_WARNING = 0
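The outage window can be read off the alert log by looking for the largest gap between consecutive timestamp lines. This is a rough sketch, assuming the "Wed Nov 11 18:43:28 2009" timestamp format shown above and an English (C) locale; the helper names are made up for illustration.

```python
from datetime import datetime

# Format of the timestamp lines in the alert log excerpt above.
ALERT_TS_FORMAT = "%a %b %d %H:%M:%S %Y"


def parse_alert_timestamps(lines):
    """Collect the lines that parse as alert-log timestamps; skip the rest."""
    stamps = []
    for line in lines:
        try:
            stamps.append(datetime.strptime(line.strip(), ALERT_TS_FORMAT))
        except ValueError:
            continue  # an ordinary message line, not a timestamp
    return stamps


def largest_gap(stamps):
    """Return (earlier, later, gap) for the widest gap between neighbours."""
    best = None
    for earlier, later in zip(stamps, stamps[1:]):
        gap = later - earlier
        if best is None or gap > best[2]:
            best = (earlier, later, gap)
    return best


sample_alert = [
    "Wed Nov 11 18:43:28 2009",
    "Trace dumping is performing id=[cdmp_20091111184328]",
    "Wed Nov 11 18:57:17 2009",
    "Starting ORACLE instance (normal)",
]

earlier, later, gap = largest_gap(parse_alert_timestamps(sample_alert))
print("silent from", earlier, "to", later, "gap", gap)
```

On the excerpt above the gap runs from 18:43:28 to 18:57:17, which brackets the CSS shutdown message that syslog recorded at 18:43:57.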

On AIX you can check with last shutdown; I don't know whether HP-UX has the same command.

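Here, syslog.log shows the CSS process shutting down (that is my guess at what the message means). When CSS shuts down or fails, the host is rebooted automatically, which matches what happened.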

Next, analyze the ocssd log under ORA_CRS_HOME:

[ CSSD]2009-11-11 18:39:18.460 [13] >WARNING: clssgmAssignMemberNo(): grock(#CSS_CLSSOMON) memberNo(1) already assigned

[ CSSD]2009-11-11 18:39:34.313 [14] >WARNING: clssnmPollingThread: node rx8640c (1) at 50% heartbeat fatal, eviction in 14.807 seconds

[ CSSD]2009-11-11 18:39:35.313 [14] >WARNING: clssnmPollingThread: node rx8640c (1) at 50% heartbeat fatal, eviction in 13.807 seconds

[ CSSD]2009-11-11 18:39:42.313 [14] >WARNING: clssnmPollingThread: node rx8640c (1) at 75% heartbeat fatal, eviction in 6.807 seconds

[ CSSD]2009-11-11 18:39:45.313 [14] >TRACE: clssnmPollingThread: node rx8640c (1) is impending reconfig

[ CSSD]2009-11-11 18:39:45.314 [14] >TRACE: clssnmPollingThread: diskTimeout set to (27000)ms impending reconfig status(1)

[ CSSD]2009-11-11 18:39:46.313 [14] >TRACE: clssnmPollingThread: node rx8640c (1) is impending reconfig

[ CSSD]2009-11-11 18:39:46.314 [14] >WARNING: clssnmPollingThread: node rx8640c (1) at 90% heartbeat fatal, eviction in 2.807 seconds

[ CSSD]2009-11-11 18:39:47.313 [14] >TRACE: clssnmPollingThread: node rx8640c (1) is impending reconfig

[ CSSD]2009-11-11 18:39:47.314 [14] >WARNING: clssnmPollingThread: node rx8640c (1) at 90% heartbeat fatal, eviction in 1.807 seconds

[ CSSD]2009-11-11 18:39:48.313 [14] >TRACE: clssnmPollingThread: node rx8640c (1) is impending reconfig

[ CSSD]2009-11-11 18:39:48.314 [14] >WARNING: clssnmPollingThread: node rx8640c (1) at 90% heartbeat fatal, eviction in 0.807 seconds

[ CSSD]2009-11-11 18:39:49.133 [14] >TRACE: clssnmPollingThread: node rx8640c (1) is impending reconfig

[ CSSD]2009-11-11 18:39:49.134 [14] >TRACE: clssnmPollingThread: Eviction started for node rx8640c (1), flags 0x000f, state 3,

The log makes it obvious: the private network heartbeat was lost and the node was evicted.
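When the ocssd log is long, it is easier to extract just the heartbeat countdown than to read it by eye. The following is a minimal sketch, assuming each warning sits on one line in the "[ CSSD]timestamp ... at N% heartbeat fatal, eviction in S seconds" shape seen above; the regular expression and the function name heartbeat_warnings are my own, and other Clusterware versions may word the message differently.

```python
import re
from datetime import datetime

# Matches the clssnmPollingThread warnings quoted above (assumed one per line).
HEARTBEAT_RE = re.compile(
    r"\[\s*CSSD\](?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})"
    r".*?node (?P<node>\S+) \((?P<num>\d+)\) at (?P<pct>\d+)% heartbeat fatal, "
    r"eviction in (?P<secs>\d+\.\d+) seconds"
)


def heartbeat_warnings(lines):
    """Return (time, node, percent, seconds_to_eviction) for each warning."""
    found = []
    for line in lines:
        match = HEARTBEAT_RE.search(line)
        if match:
            when = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S.%f")
            found.append(
                (when, match.group("node"), int(match.group("pct")), float(match.group("secs")))
            )
    return found


sample_log = [
    "[ CSSD]2009-11-11 18:39:34.313 [14] >WARNING: clssnmPollingThread: "
    "node rx8640c (1) at 50% heartbeat fatal, eviction in 14.807 seconds",
    "[ CSSD]2009-11-11 18:39:45.313 [14] >TRACE: clssnmPollingThread: "
    "node rx8640c (1) is impending reconfig",
    "[ CSSD]2009-11-11 18:39:48.314 [14] >WARNING: clssnmPollingThread: "
    "node rx8640c (1) at 90% heartbeat fatal, eviction in 0.807 seconds",
]

found = heartbeat_warnings(sample_log)
for when, node, pct, secs in found:
    print(when.time(), node, f"{pct}%", f"{secs}s to eviction")
```

A countdown that runs steadily toward zero, as it does here, means no heartbeat arrived from the node during the whole window, rather than a single delayed packet.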

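As for why the private network failed and the heartbeat was lost, I don't think that is something a DBA can resolve. Write up a report and hand it to the network team.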

Incidentally, three processes can cause a node reboot: OCSSD, OPROCD, and OCLSOMON.

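In general, OCSSD reboots are caused by heartbeat loss (a problem with the network heartbeat or the voting disk), by the CSS process being unable to get CPU time, or by bugs. OPROCD and OCLSOMON reboots are caused by the process being unable to get CPU time, or by bugs.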
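That rule of thumb can be kept as a small lookup table for triage notes. This is only an illustrative sketch: the causes are exactly the ones listed above, not an exhaustive Oracle reference, and the names REBOOT_CAUSES and candidate_causes are hypothetical.

```python
# Causes as listed in the text above; not an exhaustive Oracle reference.
REBOOT_CAUSES = {
    "OCSSD": [
        "network heartbeat lost",
        "voting disk problem",
        "CSS process could not get CPU time",
        "bug",
    ],
    "OPROCD": ["process could not get CPU time", "bug"],
    "OCLSOMON": ["process could not get CPU time", "bug"],
}


def candidate_causes(daemon):
    """Return the usual suspects for a reboot triggered by the given daemon."""
    return REBOOT_CAUSES.get(daemon.upper(), [])


for cause in candidate_causes("ocssd"):
    print("check:", cause)
```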

In this case, just before the node rebooted, it also raised an ORA-600 error:

Wed Nov 11 18:43:27 2009

Errors in file /oracle/app/oracle/admin/ora10g/udump/ora10g1_ora_24884.trc:

ORA-00600: internal error code, arguments: [keltnfy-ldmInit], [46], [1], [], [], [], [], []

This is confirmed to be Bug 5486074:

ORA-600 [keltnfy-ldminit] can occur in the Server Generated Alert subsystem when it cannot determine the Host Name or Network Address. This can be caused by DNS server being unavailable.

I looked it up: nothing says this error can kill CSS or reboot the host, and the error should have been raised on the client side...

At the very least, it confirms that the network did have a problem.
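Since the bug note ties the error to the host name or network address not being resolvable, a quick resolution check on the node is a cheap sanity test. This is a minimal sketch using only Python's standard socket module; the function name resolve_host is made up for illustration, and it tests general name resolution on the machine, not whatever lookup Oracle performs internally.

```python
import socket


def resolve_host(hostname=None):
    """Return (hostname, IPv4 address), or (hostname, None) if lookup fails."""
    name = hostname or socket.gethostname()
    try:
        return name, socket.gethostbyname(name)
    except OSError:
        # socket.gaierror and socket.herror are both subclasses of OSError.
        return name, None


if __name__ == "__main__":
    name, address = resolve_host()
    if address is None:
        print(f"cannot resolve {name}: check /etc/hosts and DNS")
    else:
        print(f"{name} resolves to {address}")
```

Running it on each cluster node, and again with the other nodes' names as the argument, shows whether resolution fails anywhere.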

On startup, more errors were reported:

Wed Nov 11 18:58:06 2009

Errors in file /oracle/app/oracle/admin/ora10g/udump/ora10g1_ora_7203.trc:

ORA-00600: internal error code, arguments: [ksprlspeeq3], [65536], [], [], [], [], [], []

Wed Nov 11 18:58:07 2009

Errors in file /oracle/app/oracle/admin/ora10g/udump/ora10g1_ora_7203.trc:

ORA-07445: exception encountered: core dump [kgscDump()+801] [SIGSEGV] [Address not mapped to object] [0x000001004] [] []

ORA-00600: internal error code, arguments: [ksprlspeeq3], [65536], [], [], [], [], [], []

Wed Nov 11 18:58:08 2009

Errors in file /oracle/app/oracle/admin/ora10g/udump/ora10g1_ora_7203.trc:

ORA-07445: exception encountered: core dump [kgscDump()+801] [SIGSEGV] [Address not mapped to object] [0x000001004] [] []

ORA-07445: exception encountered: core dump [kgscDump()+801] [SIGSEGV] [Address not mapped to object] [0x000001004] [] []

ORA-00600: internal error code, arguments: [ksprlspeeq3], [65536], [], [], [], [], [], []

ORA-07445 [kgscDump] corresponds to Bug 5508574 - OERI[504] / OERI[99999] / Dump [kgscdump] with > 31 CPUs, but this system has only 15 CPUs, 30 cores.

For ORA-00600 [ksprlspeeq3] I found no bug related to 10.2.0.3, so I am leaving it alone for now.

One recommendation: Metalink note:4.1. This is the former Knowledge section; it holds many articles grouped by category and a list of tools.
