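Editor's note (墨墨导读): A customer's monitoring system kept raising the alert "system xx database deadlock growth count above the current threshold_current value 1.00". This article walks through the fault analysis and diagnosis in detail and then describes the solution.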
This article has three parts:
1. Background
2. Fault analysis
3. Root-cause solution and recommendations
1. Background
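The customer's monitoring system frequently reported the alert "system xx database deadlock growth count above the current threshold_current value 1.00". The sections below cover the diagnosis process in detail, followed by the solution.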
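2. Fault analysis
2.1 Symptoms
After logging in to the system and checking the database alert log, we found that it did contain many ORA-60 entries. An excerpt follows: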
2020-04-23T19:32:00.644961+08:00
XXXDB(4):ORA-00060: Global Enqueue Services Deadlock detected. See Note 60.1 at My Oracle for Troubleshooting ORA-60 Errors. More info in file /oracle/app/oracle/diag/rdbms/z1d1v19c/xxxxxxxx2/trace/xxxxxxxx2_ora_127408.trc.
2020-04-23T19:32:01.000382+08:00
Dumping diagnostic data in directory=[cdmp_20200423193200], requested by (instance=2, osid=127408), summary=[abnormal process termination].
2020-04-23T19:32:54.093147+08:00
XXXDB(4):ORA-00060: Global Enqueue Services Deadlock detected. See Note 60.1 at My Oracle for Troubleshooting ORA-60 Errors. More info in file /oracle/app/oracle/diag/rdbms/z1d1v19c/xxxxxxxx2/trace/xxxxxxxx2_ora_127383.trc.
2020-04-23T19:32:54.289460+08:00
Dumping diagnostic data in directory=[cdmp_20200423193254], requested by (instance=2, osid=127383), summary=[abnormal process termination].
2020-04-23T19:32:57.576079+08:00
XXXDB(4):ORA-00060: Global Enqueue Services Deadlock detected. See Note 60.1 at My Oracle for Troubleshooting ORA-60 Errors. More info in file /oracle/app/oracle/diag/rdbms/z1d1v19c/xxxxxxxx2/trace/xxxxxxxx2_ora_124482.trc.
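When the alert log is large, it can help to pull out the ORA-00060 entries and their trace file paths with a script rather than by eye. The following is a minimal sketch, not part of the original diagnosis. It assumes the on-disk alert log places each timestamp on its own line, followed by the message line, and that the trace path follows the phrase "More info in file" on the same line as the ORA-00060 message. The names `find_deadlocks`, `per_minute`, and `SAMPLE` are hypothetical, chosen only for illustration.

```python
import re
from collections import Counter

# Assumed layout: a timestamp line such as 2020-04-23T19:32:00.644961+08:00,
# then one or more message lines that belong to that timestamp.
TS_RE = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+[+-]\d{2}:\d{2}$")
# Trace path after "More info in file"; the sentence-ending "." is not captured.
TRACE_RE = re.compile(r"More info in file (\S+\.trc)")


def find_deadlocks(lines):
    """Return a list of (timestamp, trace_path) for each ORA-00060 line."""
    hits = []
    last_ts = None
    for raw in lines:
        line = raw.strip()
        if TS_RE.match(line):
            last_ts = line  # remember the most recent timestamp line
        elif "ORA-00060" in line:
            m = TRACE_RE.search(line)
            hits.append((last_ts, m.group(1) if m else None))
    return hits


def per_minute(hits):
    """Count deadlock entries per minute (timestamp cut to YYYY-MM-DDTHH:MM)."""
    return Counter(ts[:16] for ts, _ in hits if ts)


# Two entries taken from the excerpt above, used as sample input.
SAMPLE = """\
2020-04-23T19:32:00.644961+08:00
XXXDB(4):ORA-00060: Global Enqueue Services Deadlock detected. More info in file /oracle/app/oracle/diag/rdbms/z1d1v19c/xxxxxxxx2/trace/xxxxxxxx2_ora_127408.trc.
2020-04-23T19:32:01.000382+08:00
Dumping diagnostic data in directory=[cdmp_20200423193200], requested by (instance=2, osid=127408), summary=[abnormal process termination].
2020-04-23T19:32:54.093147+08:00
XXXDB(4):ORA-00060: Global Enqueue Services Deadlock detected. More info in file /oracle/app/oracle/diag/rdbms/z1d1v19c/xxxxxxxx2/trace/xxxxxxxx2_ora_127383.trc.
"""

if __name__ == "__main__":
    found = find_deadlocks(SAMPLE.splitlines())
    for ts, trace in found:
        print(ts, trace)
    print(dict(per_minute(found)))
```

To run it against a real log, pass an open file object to `find_deadlocks` instead of `SAMPLE.splitlines()`. The per-minute count gives a rough view of how often the deadlocks occur, and the trace paths point to the files to inspect next.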
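2.2 Root cause
The trace files show that the blocking is self-inflicted: the blocked sessions are blocking one another. The trace files are all similar, so the key content of just one of them is excerpted below: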
*** 2020-04-23T19:32:00.644695+08:00 (XXXDB(4))
*** SESSION ID:(7989.26294) 2020-04-23T19:32:00.644756+08:00
*** CLIENT ID:() 2020-04-23T19:32:00.644762+08:00
*** SERVICE NAME:(XXXDB) 2020-04-23T19:32:00.644767+08:00
*** MODULE NAME:(oracle@xxxxxxxxdb2) 2020-04-23T