这个问题来的比较急,来邮件时候是下午4点多,但是数据库运行是正常的,从告警日志上看是进程无法获取内存了,导致客户端进程宕掉。
27102错误一般是oracle用户的系统资源限制设置过小导致,但是这里的情况却是不一样。
具体先看如下告警日志:
Fri Jul 25 12:51:08 CST 2014
Thread 1 advanced to log sequence 5862 (LGWR switch)
Current log# 3 seq# 5862 mem# 0: +DG1/wasudb/onlinelog/group_3.263.842714473
Fri Jul 25 12:51:10 CST 2014
ARC1: Standby redo logfile selected for thread 1 sequence 5861 for destination LOG_ARCHIVE_DEST_2
Fri Jul 25 12:51:57 CST 2014
Thread 1 cannot allocate new log, sequence 5863
Checkpoint not complete
Current log# 3 seq# 5862 mem# 0: +DG1/wasudb/onlinelog/group_3.263.842714473
Fri Jul 25 12:51:59 CST 2014
Thread 1 advanced to log sequence 5863 (LGWR switch)
Current log# 1 seq# 5863 mem# 0: +DG1/wasudb/onlinelog/group_1.261.842714471
Fri Jul 25 12:52:02 CST 2014
ARC1: Standby redo logfile selected for thread 1 sequence 5862 for destination LOG_ARCHIVE_DEST_2
Fri Jul 25 13:01:45 CST 2014
Errors in file /oracle/admin/wasudb/udump/wasudb_ora_13870.trc:
ORA-27102: out of memory
Linux-x86_64 Error: 12: Cannot allocate memory
Additional information: 108
Additional information: 3080192
Fri Jul 25 13:01:48 CST 2014
Errors in file /oracle/admin/wasudb/udump/wasudb_ora_13870.trc:
ORA-00603: ORACLE server session terminated by fatal error
ORA-27102: out of memory
Linux-x86_64 Error: 12: Cannot allocate memory
Additional information: 108
Additional information: 3080192
告警的分析在第一段话就已经带出,在此时我们通过ulimitd -a查看oracle用户的分配信息,都是非常大的,不存在过小的设置。
在这种情况下非常有必要看下这个ora_13870.trc文件,此文件部分信息如下:
/oracle/admin/wasudb/udump/wasudb_ora_13870.trc
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORACLE_HOME = /oracle/product/10.2.0/db_1
System name:Linux
Node name:localhost
Release:3.8.13-16.2.1.el6uek.x86_64
Version:#1 SMP Thu Nov 7 17:01:44 PST 2013
Machine:x86_64
Instance name: wasudb
Redo thread mounted by this instance: 1
Oracle process number: 51
Unix process pid: 13870, image: oracle@localhost
*** 2014-07-25 13:01:44.015
*** SERVICE NAME:(SYS$USERS) 2014-07-25 13:01:43.995
*** SESSION ID:(6209.59205) 2014-07-25 13:01:43.995
=======================================
TOP 10 MEMORY USES FOR THIS PROCESS
---------------------------------------
81% 3328 MB, 213622 chunks: "pmuccst: adt/record "
koh-kghu call ds=0x7f8e128632f8 dsprt=0x6ae9740
13% 544 MB, 8868 chunks: "permanent memory " SQL
sort subheap ds=0x7f8e121f7a00 dsprt=0x7f8e12891058
5% 209 MB, 13443 chunks: "pmucalm coll "
koh-kghu call ds=0x7f8e128632f8 dsprt=0x6ae9740
0% 7467 KB, 8948 chunks: "free memory "
session heap ds=0x7f8e12866290 dsprt=0x6aed2c0
0% 1284 KB, 22 chunks: "permanent memory "
pga heap ds=0x6ae9740 dsprt=(nil)
0% 424 KB, 6 chunks: "free memory "
top call heap ds=0x6aed0a0 dsprt=(nil)
0% 331 KB, 82 chunks: "kxsFrame4kPage "
session heap ds=0x7f8e12866290 dsprt=0x6aed2c0
0% 274 KB, 10 chunks: "static frame of inst "
koh-kghu sessi ds=0x7f8e11fabe48 dsprt=0x7f8e12866290
0% 224 KB, 7 chunks: "kxsFrame32kPage "
session heap ds=0x7f8e12866290 dsprt=0x6aed2c0
0% 128 KB, 1 chunk : "free memory "
KSFD vec heap ds=0x6aeed80 dsprt=0x6ae9740
=======================================
PRIVATE MEMORY SUMMARY FOR THIS PROCESS
---------------------------------------
******************************************************
PRIVATE HEAP SUMMARY DUMP
4103 MB total:
4101 MB commented, 1286 KB permanent
537 KB free (0 KB in empty extents),
3546 MB, 3 heaps: "koh-kghu call " 51 KB free held
554 MB, 1 heap: "session heap "
------------------------------------------------------
Summary of subheaps at depth 1
4091 MB total:
4084 MB commented, 87 KB permanent
7603 KB free (132 KB in empty extents),
2 KB uncommented freeable with mark, 3328 MB, 213622 chunks: "pmuccst: adt/record "
552 MB, 1 heap: "kxs-heap-w " 7408 KB free held
=========================================
REAL-FREE ALLOCATOR DUMP FOR THIS PROCESS
-----------------------------------------
Dump of Real-Free Memory Allocator Heap [0x7f8e1283e000]
mag=0xfefe0001 flg=0x5000003 fds=0x7 blksz=65536
blkdstbl=0x7f8e1283e010, iniblk=68608 maxblk=262144 numsegs=115
In-use num=65638 siz=8978432, Freeable num=0 siz=0, Free num=1 siz=6619136
==========================================
INSTANCE-WIDE PRIVATE MEMORY USAGE SUMMARY
------------------------------------------
Dumping Work Area Table (level=1)
=====================================
Global SGA Info
---------------
global target: 40000 MB
auto target: 31163 MB
max pga: 2048 MB
pga limit: 4096 MB
pga limit known: 0
pga limit errors: 0
pga inuse: 6534 MB
pga alloc: 7384 MB
pga freeable: 707 MB
pga freed: 5776451 MB
pga to free: 0 %
pga auto: 1160 MB
pga manual: 0 MB
pga alloc (max): 15183 MB
pga auto (max): 4383 MB
pga manual (max): 10 MB
# workareas : 0
# workareas(max): 115
================================
PER-PROCESS PRIVATE MEMORY USAGE
--------------------------------
Private memory usage per Oracle process
-------------------------
Top 10 processes:
-------------------------
(percentage is of 8065 MB total allocated memory)
51% pid 51: 4102 MB used of 4105 MB allocated < = CURRENT PROC
7% pid 101: 536 MB used of 537 MB allocated
3% pid 127: 52 MB used of 211 MB allocated (158 MB freeable)
2% pid 94: 37 MB used of 138 MB allocated (101 MB freeable)
1% pid 215: 101 MB used of 120 MB allocated (18 MB freeable)
1% pid 216: 101 MB used of 119 MB allocated (18 MB freeable)
1% pid 254: 101 MB used of 119 MB allocated (18 MB freeable)
1% pid 278: 101 MB used of 119 MB allocated (18 MB freeable)
1% pid 155: 100 MB used of 116 MB allocated (15 MB freeable)
1% pid 156: 111 MB used of 111 MB allocated
-------------------------
All processes:
-------------------------
(session detail when over 403 MB allocated)
pid 2: 295 KB used of 374 KB allocated
pid 3: 294 KB used of 374 KB allocated
pid 4: 388 KB used of 502 KB allocated
pid 5: 8205 KB used of 14 MB allocated (3136 KB freeable)
pid 6: 8409 KB used of 13 MB allocated (2112 KB freeable)
pid 7: 8356 KB used of 15 MB allocated (4160 KB freeable)
....................(省略很多)
通过trace中的信息我们可以看到PER-PROCESS PRIVATE MEMORY USAGE的TOP 10 process以及all process的信息,其中PID为51的进程(TOP 1)马上就入了大头的法眼~进程开销PGA已经4个G了,基本可以联想到肯定是超大DML事务带有排序等没有批量提交导致把内存撑爆!经过检查发现是一个存储过程在跑大量数据的处理,验证了我的判断,协助客户发给应用开发处理,并建议缩小事务,尽量采用批量的提交方式。
从以上也可以看出,硬件资源是有限的,即使给oracle用户分配的资源再多,只要应用开发设计得不合理,你永远都不会满足这应用的需求!