上图:多个update语句一直持续运行,给数据造成了很大压力,导致数据库运行缓慢。
查看日志发现了下面的错误:
LNS: Standby redo
logfile selected for thread 1 sequence 15058 for destination
LOG_ARCHIVE_DEST_4
Fri
Jan 04 13:46:04 2013
Archived Log entry
74894 added for thread 1 sequence 15057 ID 0xe9b400d dest
1:
Fri
Jan 04 13:48:00 2013
WARNING:
Heavy swapping observed on system in last 5 mins.
pct
of memory swapped in [2.17%] pct of memory swapped out
[1.30%].
Please make sure there
is no memory pressure and the SGA and PGA
are
configured correctly. Look at DBRM trace file for more
details.
Fri Jan 04
13:51:54 2013
LGWR: Standby redo
logfile selected to archive thread 1 sequence 15059
LGWR: Standby redo
logfile selected for thread 1 sequence 15059 for destination
LOG_ARCHIVE_DEST_2
......
ORA-00020: 超出最大进程数 (600)
ORA-20 errors will not be written to the alert log for
the next minute. Please look at trace files to see all
the ORA-20 errors.
分析:
在11gR2中DBRM(database resource manager,11gR2中新的后台进程,见《Learning 11g New Background
Processes》)会在Alert.log告警日志中反映OS操作系统最近5分钟是否有剧烈的swap活动了
可以观察DBRM后台进程的trace:
oracle
503 21966 0
16:30 pts/4
00:00:00 grep dbrm
oracle
6866
1
0 2012
?
00:01:01 ora_dbrm_nticket1
[oracle@nticket1
trace]$ pwd
/u01/app/ora11g/diag/rdbms/nticket1/nticket1/trace
[oracle@nticket1
trace]$ ls -ltr *dbrm*
-rw-r----- 1 oracle
asmadmin
100447 01-04 13:58 nticket1_dbrm_6866.trm
-rw-r----- 1 oracle
asmadmin 1586637 01-04 13:58 nticket1_dbrm_6866.trc
[oracle@nticket1 trace]$ vi
nticket1_dbrm_6866.trc
oracle
6866
1
0 2012
?
00:01:01 ora_dbrm_nticket1
oracle
28938 21966
0 15:44 pts/4
00:00:00 grep dbrm
***
2013-01-04 01:57:26.942
1
12571
1
0.20
0.00
26838
57
37
3
29
0
0
***
2013-01-04 01:58:26.961
1
11716
1
0.15
0.00
31681
58
31
8
11
0
0
***
2013-01-04 01:59:26.981
1
11607
0
0.25
0.00
38195
66
45
8
4
0
0
***
2013-01-04 02:00:00.014
kgsksysstop: blocking
mode (2) timestamp: 1357236000014692
kgsksysstop:
successful
kgsksysresume:
successful
Heavy swapping observed
in last 5 mins:
[pct of total memory][bytes]
***
2013-01-04 13:48:00.385
total swpin [
2.17%][1078326K], total swpout [ 1.30%][648770K]
vm
stats captured every 30 secs for last 5 mins:
swpin:
swpout:
[
0.19%][
96052K] [
0.06%][
34155K]
[
0.23%][
117853K] [
0.10%][
54080K]
[
0.23%][
117220K] [
0.20%][
101830K]
[
0.23%][
116923K] [
0.11%][
58593K]
[
0.19%][
97252K] [
0.14%][
73923K]
[
0.18%][
90623K] [
0.11%][
56906K]
[
0.18%][
93366K] [
0.10%][
52694K]
[
0.21%][
108634K] [
0.19%][
94206K]
[
0.24%][
122591K] [
0.11%][
58344K]
[
0.23%][
117812K] [
0.12%][
64039K]
Heavy swapping observed
in last 5 mins:
[pct of total memory][bytes]
***
2013-01-04 13:58:00.731
total swpin [
2.81%][1393938K], total swpout [ 1.10%][547194K]
vm
stats captured every 30 secs for last 5 mins:
swpin:
swpout:
[
0.17%][
86056K] [
0.16%][
83332K]
[
0.21%][
106479K] [
0.08%][
40506K]
[
0.30%][
148646K] [
0.13%][
65545K]
[
0.30%][
152775K] [
0.11%][
58557K]
[
0.31%][
154886K] [
0.09%][
47802K]
[
0.29%][
146032K] [
0.07%][
37422K]
[
0.30%][
152861K] [
0.06%][
31298K]
[
0.29%][
147241K] [
0.10%][
49991K]
[
0.30%][
151591K] [
0.10%][
53233K]
[
0.29%][
147371K] [
0.16%][
79508K]
可以看到dbrm收集到了短期内的swapin和swapout数据,这样便于我们诊断由swap造成的性能或者hang问题。
解决:
1.适当增加processes参数
alter system set
processes=800 scope=spfile;
2.修改应用程序写法,优化程序,减少并发。本文转自cuug官网