关于shared pool的深入探讨:latch free/library cache pin/library cache lock

关于shared pool的深入探讨(一)

关于shared pool的设置一直是一个争议较多的内容。很多文章上说,shared pool设置过大会带来额外的管理上的负担,从而在某些条件下会导致性能的下降。那么这个管理上的负担指的是什么内容呢? 本文对这个内容作一定的深入探讨。本文只涉及一个方面,后续的文章将从其他方面继续讨论。

 

基础知识:

我们可以通过如下命令转储shared pool共享内存的内容:

 

SQL> alter session set events 'immediate trace name heapdump level 2';

Session altered.

本测试中引用的两个trace文件:

9i:

SQL> @gettrcname

TRACE_FILE_NAME
--------------------------------------------------------------------------------
/opt/oracle/admin/hsjf/udump/hsjf_ora_24983.trc

8i:

SQL> @gettrcname

TRACE_FILE_NAME
--------------------------------------------------------------------------------
/usr/oracle8/admin/guess/udump/guess_ora_22038.trc

 

Shared Pool通过free list管理free块,Free List按不同size划分Bucket.

在Oracle8i中,不同bucket的size范围如下所示(size显示的是下边界):

 

oracle:/usr/oracle8/admin/guess/udump>cat guess_ora_22038.trc|grep Bucket
Bucket 0 size=44
Bucket 1 size=76
Bucket 2 size=140
Bucket 3 size=268
Bucket 4 size=524
Bucket 5 size=1036
Bucket 6 size=2060
Bucket 7 size=4108
Bucket 8 size=8204
Bucket 9 size=16396
Bucket 10 size=32780

 

我们注意,在这里,小于76的块都位于Bucket 0上;大于32780的块,都在Bucket 10上初始的,数据库启动以后,shared pool多数是连续内存块当空间分配使用以后,内存块开始被分割,碎片开始出现,Bucket列表开始变长。

Oracle请求shared pool空间时,首先进入相应的Bucket进行查找,如果找不到,则转向下一个非空的bucket,获取第一个chunk。分割这个chunk,剩余部分会进入相应的Bucket,进一步增加碎片。

最终的结果是,Bucket 0上的内存块会越来越多,越来越碎小(在我这个测试的小型的数据库上,Bucket 0上的碎片已经达到9030个而shared_pool_size设置仅为150M),通常如果每个Bucket上的chunk多余2000个,就被认为是share pool碎片过多。

而在大多数情况下,我们请求的都是相对小的chunk,这样搜索Bucket 0往往消耗了大量的时间以及资源,这可能导致share pool Latch被长时间的持有,导致更多的share pool竞争。

所以在Oracle9i之前,如果盲目的增大shared_pool_size或设置过大的shared_pool_size,往往会适得其反。

我们看一下Oracle9i中的处理方式:

 [oracle@jumper oracle]$ sqlplus "/ as sysdba"

SQL*Plus: Release 9.2.0.3.0 - Production on Wed Aug 18 22:13:07 2004

Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.

Connected to:
Oracle9i Enterprise Edition Release 9.2.0.3.0 - Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.3.0 - Production

SQL> alter session set events 'immediate trace name heapdump level 2';

Session altered.

SQL> @gettrcname

TRACE_FILE_NAME
--------------------------------------------------------------------------------
/opt/oracle/admin/hsjf/udump/hsjf_ora_24983.trc

SQL>
SQL> !


[oracle@jumper oracle]$ cd $admin
[oracle@jumper udump]$ cat hsjf_ora_24983.trc|grep Bucket

(编者注:由于篇幅限制,编者将此处结果分栏显示。)


Bucket 0 size=16
Bucket 1 size=20
Bucket 2 size=24
Bucket 3 size=28
Bucket 4 size=32
Bucket 5 size=36
Bucket 6 size=40
Bucket 7 size=44
Bucket 8 size=48
Bucket 9 size=52
Bucket 10 size=56
Bucket 11 size=60
Bucket 12 size=64
Bucket 13 size=68
Bucket 14 size=72
Bucket 15 size=76
Bucket 16 size=80
Bucket 17 size=84
Bucket 18 size=88
Bucket 19 size=92
Bucket 20 size=96
Bucket 21 size=100
Bucket 22 size=104
Bucket 23 size=108
Bucket 24 size=112
Bucket 25 size=116
Bucket 26 size=120
Bucket 27 size=124
Bucket 28 size=128
Bucket 29 size=132
Bucket 30 size=136
Bucket 31 size=140
Bucket 32 size=144
Bucket 33 size=148
Bucket 34 size=152
Bucket 35 size=156
Bucket 36 size=160
Bucket 37 size=164
Bucket 38 size=168
Bucket 39 size=172
Bucket 40 size=176
Bucket 41 size=180
Bucket 42 size=184
Bucket 43 size=188
Bucket 44 size=192
Bucket 45 size=196
Bucket 46 size=200
Bucket 47 size=204
Bucket 48 size=208
Bucket 49 size=212
Bucket 50 size=216
Bucket 51 size=220
Bucket 52 size=224
Bucket 53 size=228
Bucket 54 size=232
Bucket 55 size=236
Bucket 56 size=240
Bucket 57 size=244
Bucket 58 size=248
Bucket 59 size=252
Bucket 60 size=256
Bucket 61 size=260
Bucket 62 size=264
Bucket 63 size=268
Bucket 64 size=272
Bucket 65 size=276
Bucket 66 size=280
Bucket 67 size=284
Bucket 68 size=288
Bucket 69 size=292
Bucket 70 size=296
Bucket 71 size=300
Bucket 72 size=304
Bucket 73 size=308
Bucket 74 size=312
Bucket 75 size=316
Bucket 76 size=320
Bucket 77 size=324
Bucket 78 size=328
Bucket 79 size=332
Bucket 80 size=336
Bucket 81 size=340
Bucket 82 size=344
Bucket 83 size=348
Bucket 84 size=352
Bucket 85 size=356
Bucket 86 size=360
Bucket 87 size=364
Bucket 88 size=368
Bucket 89 size=372
Bucket 90 size=376
Bucket 91 size=380
Bucket 92 size=384
Bucket 93 size=388
Bucket 94 size=392
Bucket 95 size=396
Bucket 96 size=400
Bucket 97 size=404
Bucket 98 size=408
Bucket 99 size=412
Bucket 100 size=416
Bucket 101 size=420
Bucket 102 size=424
Bucket 103 size=428
Bucket 104 size=432
Bucket 105 size=436
Bucket 106 size=440
Bucket 107 size=444
Bucket 108 size=448
Bucket 109 size=452
Bucket 110 size=456
Bucket 111 size=460
Bucket 112 size=464
Bucket 113 size=468
Bucket 114 size=472
Bucket 115 size=476
Bucket 116 size=480
Bucket 117 size=484
Bucket 118 size=488
Bucket 119 size=492
Bucket 120 size=496
Bucket 121 size=500
Bucket 122 size=504
Bucket 123 size=508
Bucket 124 size=512
Bucket 125 size=516
Bucket 126 size=520
Bucket 127 size=524
Bucket 128 size=528
Bucket 129 size=532
Bucket 130 size=536
Bucket 131 size=540
Bucket 132 size=544
Bucket 133 size=548
Bucket 134 size=552
Bucket 135 size=556
Bucket 136 size=560
Bucket 137 size=564
Bucket 138 size=568
Bucket 139 size=572
Bucket 140 size=576
Bucket 141 size=580
Bucket 142 size=584
Bucket 143 size=588
Bucket 144 size=592
Bucket 145 size=596
Bucket 146 size=600
Bucket 147 size=604
Bucket 148 size=608
Bucket 149 size=612
Bucket 150 size=616
Bucket 151 size=620
Bucket 152 size=624
Bucket 153 size=628
Bucket 154 size=632
Bucket 155 size=636
Bucket 156 size=640
Bucket 157 size=644
Bucket 158 size=648
Bucket 159 size=652
Bucket 160 size=656
Bucket 161 size=660
Bucket 162 size=664
Bucket 163 size=668
Bucket 164 size=672
Bucket 165 size=676
Bucket 166 size=680
Bucket 167 size=684
Bucket 168 size=688
Bucket 169 size=692
Bucket 170 size=696
Bucket 171 size=700
Bucket 172 size=704
Bucket 173 size=708
Bucket 174 size=712
Bucket 175 size=716
Bucket 176 size=720
Bucket 177 size=724
Bucket 178 size=728
Bucket 179 size=732
Bucket 180 size=736
Bucket 181 size=740
Bucket 182 size=744
Bucket 183 size=748
Bucket 184 size=752
Bucket 185 size=756
Bucket 186 size=760
Bucket 187 size=764
Bucket 188 size=768
Bucket 189 size=772
Bucket 190 size=776
Bucket 191 size=780
Bucket 192 size=784
Bucket 193 size=788
Bucket 194 size=792
Bucket 195 size=796
Bucket 196 size=800
Bucket 197 size=804
Bucket 198 size=808
Bucket 199 size=812
Bucket 200 size=876
Bucket 201 size=940
Bucket 202 size=1004
Bucket 203 size=1068
Bucket 204 size=1132
Bucket 205 size=1196
Bucket 206 size=1260
Bucket 207 size=1324
Bucket 208 size=1388
Bucket 209 size=1452
Bucket 210 size=1516
Bucket 211 size=1580
Bucket 212 size=1644
Bucket 213 size=1708
Bucket 214 size=1772
Bucket 215 size=1836
Bucket 216 size=1900
Bucket 217 size=1964
Bucket 218 size=2028
Bucket 219 size=2092
Bucket 220 size=2156
Bucket 221 size=2220
Bucket 222 size=2284
Bucket 223 size=2348
Bucket 224 size=2412
Bucket 225 size=2476
Bucket 226 size=2540
Bucket 227 size=2604
Bucket 228 size=2668
Bucket 229 size=2732
Bucket 230 size=2796
Bucket 231 size=2860
Bucket 232 size=2924
Bucket 233 size=2988
Bucket 234 size=3052
Bucket 235 size=3116
Bucket 236 size=3180
Bucket 237 size=3244
Bucket 238 size=3308
Bucket 239 size=3372
Bucket 240 size=3436
Bucket 241 size=3500
Bucket 242 size=3564
Bucket 243 size=3628
Bucket 244 size=3692
Bucket 245 size=3756
Bucket 246 size=3820
Bucket 247 size=3884
Bucket 248 size=3948
Bucket 249 size=4012
Bucket 250 size=4108
Bucket 251 size=8204
Bucket 252 size=16396
Bucket 253 size=32780
Bucket 254 size=65548

 


我们看到,在Oracle9i中,Free Lists被划分为0~254,共255个Bucket。
每个Bucket容纳的size范围
Bucket 0~199 容纳size以 4 递增
Bucket 200~249 容纳size以 64 递增

从Bucket 249开始,Oracle各Bucket步长进一步增加:

Bucket 249: 4012 ~4107 = 96
Bucket 250: 4108 ~8203 = 4096
Bucket 251: 8204 ~16395 = 8192
Bucket 252: 16396~32779 = 16384
Bucket 253: 32780~65547 = 32768
Bucket 254: >=65548

在Oracle9i中,对于小的chunk,Oracle增加了更多的Bucket来管理。0~199共200个Bucket,size以4为步长递增;200~249共50个Bucket,size以64递增。这样每个Bucket中容纳的chunk数量大大减少,查找的效率得以提高。

这就是Oracle9i中shared pool管理的增强,通过这个算法的改进。Oracle8i中,过大shared pool带来的栓锁争用等性能问题在某种程度上得以解决。

 

关于shared pool的深入探讨(二)

我们继续把前面的问题展开一下。

其实我们可以从数据库内部监控shared pool的空间碎片情况。
这涉及到一个内部视图x$ksmsp

X$KSMSP的名称含义为: [K]ernal [S]torage [M]emory Management [S]GA Hea[P]
其中每一行都代表着shared pool中的一个chunk.

首先记录一下测试环境: 

SQL> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle9i Enterprise Edition Release 9.2.0.3.0 - Production
PL/SQL Release 9.2.0.3.0 - Production
CORE 9.2.0.3.0 Production
TNS for Linux: Version 9.2.0.3.0 - Production
NLSRTL Version 9.2.0.3.0 – Production

我们看一下x$ksmsp的结构:

SQL> desc x$ksmsp

 Name                                      Null?    Type

 ----------------------------------------- -------- ----------------------------

 ADDR                                                   RAW(4)

 INDX                                                    NUMBER

 INST_ID                                               NUMBER

 KSMCHIDX                                           NUMBER

 KSMCHDUR                                           NUMBER

 KSMCHCOM                                          VARCHAR2(16)

 KSMCHPTR                                           RAW(4)

 KSMCHSIZ                                            NUMBER

 KSMCHCLS                                           VARCHAR2(8)

 KSMCHTYP                                            NUMBER

 KSMCHPAR                                           RAW(4)                                                                   

                                                                             

我们关注以下几个字段:

KSMCHCOM是注释字段,每个内存块被分配以后,注释会添加在该字段中.
x$ksmsp.ksmchsiz代表块大小

x$ksmsp.ksmchcls列代表类型,主要有四类,说明如下:

free
Free chunks--不包含任何对象的chunk,可以不受限制的被分配.

recr
Recreatable chunks--包含可以被临时移出内存的对象,在需要的时候,这个对象可以被重新创建.例如,许多存储共享sql代码的内存都是可以重建的.

freeabl
Freeable chunks--包含session周期或调用的对象,随后可以被释放.这部分内存有时候可以全部或部分提前释放.但是注意,由于某些对象是中间过程产生的,这些对象不能临时被移出内存(因为不可重建).

perm
Permanent memory chunks--包含永久对象.通常不能独立释放.

我们可以通过查询x$ksmsp视图来考察shared pool中存在的内存片的数量。

不过注意:Oracle的某些版本(如:10.1.0.2)在某些平台上(如:HP-UX PA-RISC 64-bit)查
询该视图可能导致过度的CPU耗用,这是由于bug引起的。

我们看一下测试:

初始启动数据库,x$ksmsp中存在2259个chunk

SQL> select count(*) from x$ksmsp;

 

  COUNT(*)

----------

      2259

 

执行查询:

SQL> select count(*) from dba_objects;

 

  COUNT(*)

----------

     10491

 

此时shared pool中的chunk数量增加

SQL> select count(*) from x$ksmsp;

 

  COUNT(*)

----------

      2358

 

这就是由于shared pool中进行sql解析,请求空间,进而导致请求free空间,分配、分割。从而产生了更多,更细碎的内存chunk。由此我们可以看出,如果数据库系统中存在大量的硬解析,不停请求分配free的shred pool内存。除了必须的shared pool latch等竞争外,还不可避免的会导致shared pool中产生更多的内存碎片(当然,在内存回收时,你可能看到chunk数量减少的情况)。

我们看以下测试:

首先重新启动数据库:

SQL> startup force;

ORACLE instance started.

 

Total System Global Area   47256168 bytes

Fixed Size                          451176 bytes

Variable Size                      29360128 bytes

Database Buffers               16777216 bytes

Redo Buffers                      667648 bytes

Database mounted.

Database opened.

 

创建一张临时表用以保存之前x$ksmsp的状态:

 

SQL> CREATE GLOBAL TEMPORARY TABLE e$ksmsp ON COMMIT PRESERVE ROWS AS

  2  SELECT      a.ksmchcom,

  3           SUM (a.CHUNK) CHUNK,

  4           SUM (a.recr) recr,

  5           SUM (a.freeabl) freeabl,

  6           SUM (a.SUM) SUM

  7      FROM (SELECT   ksmchcom, COUNT (ksmchcom) CHUNK,

  8                     DECODE (ksmchcls, 'recr', SUM (ksmchsiz), NULL) recr,

  9                     DECODE (ksmchcls, 'freeabl', SUM (ksmchsiz), NULL) freeabl,

 10                     SUM (ksmchsiz) SUM

 11                FROM x$ksmsp GROUP BY ksmchcom, ksmchcls) a

 12  where 1 = 0

 13  GROUP BY a.ksmchcom;

 

Table created.

 

保存当前shared pool状态:

 

SQL> INSERT INTO E$KSMSP

  2  SELECT      a.ksmchcom,

  3           SUM (a.CHUNK) CHUNK,

  4           SUM (a.recr) recr,

  5           SUM (a.freeabl) freeabl,

  6           SUM (a.SUM) SUM

  7      FROM (SELECT   ksmchcom, COUNT (ksmchcom) CHUNK,

  8                     DECODE (ksmchcls, 'recr', SUM (ksmchsiz), NULL) recr,

  9                     DECODE (ksmchcls, 'freeabl', SUM (ksmchsiz), NULL) freeabl,

 10                     SUM (ksmchsiz) SUM

 11                FROM x$ksmsp

 12            GROUP BY ksmchcom, ksmchcls) a

 13  GROUP BY a.ksmchcom

 14  /

 

41 rows created.

 

执行查询:

 

SQL> select count(*) from dba_objects;

 

  COUNT(*)

----------

     10492

    

比较前后shared pool内存分配的变化:

 

SQL> select a.ksmchcom,a.chunk,a.sum,b.chunk,b.sum,(a.chunk - b.chunk) c_diff,(a.sum -b.sum) s_diff

  2  from

  3  (SELECT   a.ksmchcom,

  4           SUM (a.CHUNK) CHUNK,

  5           SUM (a.recr) recr,

  6           SUM (a.freeabl) freeabl,

  7           SUM (a.SUM) SUM

  8      FROM (SELECT   ksmchcom, COUNT (ksmchcom) CHUNK,

  9                     DECODE (ksmchcls, 'recr', SUM (ksmchsiz), NULL) recr,

 10                     DECODE (ksmchcls, 'freeabl', SUM (ksmchsiz), NULL) freeabl,

 11                     SUM (ksmchsiz) SUM

 12                FROM x$ksmsp

 13            GROUP BY ksmchcom, ksmchcls) a

 14  GROUP BY a.ksmchcom) a,e$ksmsp b

 15  where a.ksmchcom = b.ksmchcom and (a.chunk - b.chunk) <>0

 16  /

 

KSMCHCOM        CHUNK        SUM      CHUNK        SUM     C_DIFF     S_DIFF

---------------- ---------- ---------- ---------- ---------- ---------- ----------

KGL handles              313     102080        302      98416         11       3664

KGLS heap               274     365752        270     360424          4       5328

KQR PO                  389     198548        377     192580         12       5968

free memory               93    2292076         90    2381304          3     -89228

library cache             1005     398284        965     381416         40      16868

sql area                  287     547452        269     490052         18      57400

 

6 rows selected.                                                                   

                                                                             

 

我们简单分析一下以上结果:

首先free memory的大小减少了89228(增加到另外五个组件中),这说明sql解析存储占用了一定的内存空间。而chunk从90增加为93,这说明内存碎片增加了。

在下面的部分中,我会着手介绍一下KGL handles, KGLS heap这两个非常重要的shared pool中的内存结构。

关于shared pool的深入探讨(三)

基本命令:

ALTER SESSION SET EVENTS 'immediate trace name LIBRARY_CACHE level LL';

其中LL代表Level级别,对于9.2.0及以后版本,不同Level含义如下:
Level =1 ,转储Library cache统计信息
Level =2 ,转储hash table概要
Level =4 ,转储Library cache对象,只包含基本信息
Level =8 ,转储Library cache对象,包含详细信息(包括child references,pin waiters等)
Level =16,增加heap sizes信息
Level =32,增加heap信息

Library cache由一个hash表组成,而hash表是一个由hash buckets组成的数组。

每个hash bucket都是包含library cache handle的一个双向链表。
Library Cache Handle指向Library Cache Object和一个引用列表。
library cache对象进一步分为:依赖表、子表和授权表等

我们看一下library cache的结构:

通过
ALTER SESSION SET EVENTS 'immediate trace name LIBRARY_CACHE level 4'
获得以下输出(这部分信息来自Oracle8i,Trace文件可以从www.eygle.com上找到)

第一部分(等价于Level 1):

 

LIBRARY CACHE STATISTICS:
     
gets hit ratio       pins hit ratio    reloads   invalids namespace
---------- --------- ---------- --------- ---------- ---------- ---------
 619658171 0.9999160 2193292112 0.9999511       9404        380 CRSR
  79698558 0.9998832  424614847 0.9999108      13589          0 TABL/PRCD/TYPE
    163399 0.9979926     163402 0.9978948         16          0 BODY/TYBD
         0 0.0000000          0 0.0000000          0          0 TRGR
        34 0.0294118         35 0.0571429          0          0 INDX
     18948 0.9968862      24488 0.9953855          0          0 CLST
         0 0.0000000          0 0.0000000          0          0 OBJE
         0 0.0000000          0 0.0000000          0          0 PIPE
         0 0.0000000          0 0.0000000          0          0 LOB
         0 0.0000000          0 0.0000000          0          0 DIR
         0 0.0000000          0 0.0000000          0          0 QUEU
         0 0.0000000          0 0.0000000          0          0 OBJG
         0 0.0000000          0 0.0000000          0          0 PROP
         0 0.0000000          0 0.0000000          0          0 JVSC
         0 0.0000000          0 0.0000000          0          0 JVRE
         0 0.0000000          0 0.0000000          0          0 ROBJ
         0 0.0000000          0 0.0000000          0          0 REIP
         0 0.0000000          0 0.0000000          0          0 CPOB
    115071 0.9992179     115071 0.9930999        704          0 EVNT
         0 0.0000000          0 0.0000000          0          0 SUMM
         0 0.0000000          0 0.0000000          0          0 DIMN
         0 0.0000000          0 0.0000000          0          0 CTX
         0 0.0000000          0 0.0000000          0          0 OUTL
         0 0.0000000          0 0.0000000          0          0 RULS
         0 0.0000000          0 0.0000000          0          0 RMGR
         0 0.0000000          0 0.0000000          0          0 UNUSED
         0 0.0000000          0 0.0000000          0          0 PPLN
         0 0.0000000          0 0.0000000          0          0 PCLS
         0 0.0000000          0 0.0000000          0          0 SUBS
         0 0.0000000          0 0.0000000          0          0 LOCS
         0 0.0000000          0 0.0000000          0          0 RMOB
         0 0.0000000          0 0.0000000          0          0 RSMD
 699654181 0.9999117 2618209955 0.9999440      23713        380 CUMULATIVE

这部分信息也就是v$librarycache中显示的.

第二部分(等价于Level 2中的输出): 

 

LIBRARY CACHE HASH TABLE: size=509 count=354
BUCKET    0:
BUCKET    1:
BUCKET    2: *
BUCKET    3:
BUCKET    4:
BUCKET    5: *
BUCKET    6: *
BUCKET    7:
BUCKET    8: **
BUCKET    9: ***
BUCKET   10: *
BUCKET   11: *
BUCKET   12: ***
……..

BUCKET  103:
BUCKET  104: *
BUCKET  105:
BUCKET  106:
BUCKET  107: ****
BUCKET  108:
BUCKET  109:
BUCKET  110:
BUCKET  111: *
BUCKET  112: **
BUCKET  113:
BUCKET  114:
BUCKET  115:
BUCKET  116: *
BUCKET  117:
BUCKET  118: *****
BUCKET  119:
BUCKET  120: *
BUCKET  121:
BUCKET  122:
BUCKET  123:
BUCKET  124:
BUCKET  125: *
BUCKET  126:
BUCKET  127:
BUCKET  128: *
BUCKET  129:
BUCKET  130: *
BUCKET  131: *
BUCKET  132:
BUCKET  133:
BUCKET  134:
BUCKET  135: *
BUCKET  136:
BUCKET  137:
BUCKET  138:
BUCKET  139: *
BUCKET  140: *
BUCKET  141: *
BUCKET  142:
BUCKET  143: *
BUCKET  144:
BUCKET  145: ***
BUCKET  146:
BUCKET  147: *
BUCKET  148:
BUCKET  149:
BUCKET  150: **
BUCKET  151:
BUCKET  152:
BUCKET  153: *
BUCKET  154:
BUCKET  155:
BUCKET  156:
BUCKET  157:
BUCKET  158:
BUCKET  159:
BUCKET  160:
BUCKET  161:
BUCKET  162:
BUCKET  163:
BUCKET  164: *
BUCKET  165: *
BUCKET  166:
BUCKET  167:
BUCKET  168:
BUCKET  169:
BUCKET  170: **
BUCKET  171:
BUCKET  172: *
BUCKET  173:
BUCKET  174:
BUCKET  175: *
BUCKET  176: *
BUCKET  177:
BUCKET  178:
BUCKET  179:
BUCKET  180:
BUCKET  181: *
BUCKET  182:
BUCKET  183:
BUCKET  184:
BUCKET  185: *
BUCKET  186:
BUCKET  187:
BUCKET  188: **
BUCKET  189:
BUCKET  190: *
BUCKET  191: *
BUCKET  192:
BUCKET  193:
BUCKET  194: *
BUCKET  195: **
BUCKET  196: *
BUCKET  197: **
BUCKET  198: ****
BUCKET  199: *
BUCKET  200: *
……….
BUCKET  509:
BUCKET  510:
BUCKET  511:

在Oracle8i中,Oracle以一个很长的LIBRARY CACHE HASH TABLE来记录Library Cache的使用情况。"*"代表该Bucket中包含的对象的个数。

在以上输出中我们看到Bucket 198中包含四个对象.

我们在第三部分中可以找到bucket 198: 

 BUCKET 198:
  LIBRARY OBJECT HANDLE: handle=2c2b4ac4
  name=
SELECT a.statement_id, a.timestamp, a.remarks, a.operation, a.options,
       a.object_node, a.object_owner, a.object_name, a.object_instance,
       a.object_type, a.optimizer, a.search_columns, a.id, a.parent_id,
       a.position, a.cost, a.cardinality, a.bytes, a.other_tag,
       a.partition_start, a.partition_stop, a.partition_id, a.other,
       a.distribution
, ROWID
  FROM plan_table a

  hash=60dd47a1 timestamp=08-27-2004 10:19:28
  namespace=CRSR flags=RON/TIM/PN0/LRG/[10010001]
  kkkk-dddd-llll=0000-0001-0001 lock=0 pin=0 latch=0
  lwt=2c2b4adc[2c2b4adc,2c2b4adc] ltm=2c2b4ae4[2c2b4ae4,2c2b4ae4]
  pwt=2c2b4af4[2c2b4af4,2c2b4af4] ptm=2c2b4b4c[2c2b4b4c,2c2b4b4c]
  ref=2c2b4acc[2c2b4acc,2c2b4acc]
    LIBRARY OBJECT: object=2c0b1430
    type=CRSR flags=EXS[0001] pflags= [00] status=VALD load=0
    CHILDREN: size=16
    child#    table reference   handle
    ------ -------- --------- --------
         0 2c0b15ec  2c0b15b4 2c2c0d50
    DATA BLOCKS:
    data#     heap  pointer status pins change
    ----- -------- -------- ------ ---- ------
        0 2c362290 2c0b14b4 I/-/A     0 NONE 
  LIBRARY OBJECT HANDLE: handle=2c3675d4
  name=SYS.DBMS_STANDARD
  hash=50748ddb timestamp=NULL
  namespace=BODY/TYBD flags=TIM/SML/[02000000]
  kkkk-dddd-llll=0000-0011-0011 lock=0 pin=0 latch=0
  lwt=2c3675ec[2c3675ec,2c3675ec] ltm=2c3675f4[2c3675f4,2c3675f4]
  pwt=2c367604[2c367604,2c367604] ptm=2c36765c[2c36765c,2c36765c]
  ref=2c3675dc[2c3675dc,2c3675dc]
    LIBRARY OBJECT: object=2c1528e8
    flags=NEX[0002] pflags= [00] status=VALD load=0
    DATA BLOCKS:
    data#     heap  pointer status pins change
    ----- -------- -------- ------ ---- ------
        0 2c367564 2c1529cc I/-/A     0 NONE 
        4 2c15297c        0 -/P/-     0 NONE 
  LIBRARY OBJECT HANDLE: handle=2c347dd8
  name=select pos#,intcol#,col#,spare1 from icol$ where obj#=:1
  hash=fa15ebe3 timestamp=07-28-2004 18:04:43
  namespace=CRSR flags=RON/TIM/PN0/SML/[12010000]
  kkkk-dddd-llll=0000-0001-0001 lock=0 pin=0 latch=0
  lwt=2c347df0[2c347df0,2c347df0] ltm=2c347df8[2c347df8,2c347df8]
  pwt=2c347e08[2c347e08,2c347e08] ptm=2c347e60[2c347e60,2c347e60]
  ref=2c347de0[2c347de0,2c347de0]
    LIBRARY OBJECT: object=2c1cd1a0
    type=CRSR flags=EXS[0001] pflags= [00] status=VALD load=0
    CHILDREN: size=16
    child#    table reference   handle
    ------ -------- --------- --------
         0 2c1cd35c  2c1cd324 2c281678
         1 2c1cd35c  2c352c50 2c0eeb8c
         2 2c1cd35c  2c352c6c 2c2bb05c
    DATA BLOCKS:
    data#     heap  pointer status pins change
    ----- -------- -------- ------ ---- ------
        0 2c2e8c58 2c1cd224 I/-/A     0 NONE 
  LIBRARY OBJECT HANDLE: handle=2c3a6484
  name=SYS.TS$
  hash=bb42852e timestamp=04-24-2002 00:04:15
  namespace=TABL/PRCD/TYPE flags=PKP/TIM/KEP/SML/[02900000]
  kkkk-dddd-llll=0111-0111-0119 lock=0 pin=0 latch=0
  lwt=2c3a649c[2c3a649c,2c3a649c] ltm=2c3a64a4[2c3a64a4,2c3a64a4]
  pwt=2c3a64b4[2c3a64b4,2c3a64b4] ptm=2c3a650c[2c3a650c,2c3a650c]
  ref=2c3a648c[2c0d4b14,2c09353c]
    LIBRARY OBJECT: object=2c3a626c
    type=TABL flags=EXS/LOC[0005] pflags= [00] status=VALD load=0
    DATA BLOCKS:
    data#     heap  pointer status pins change
    ----- -------- -------- ------ ---- ------
        0 2c3a8ea4 2c3a63b0 I/P/A     0 NONE 
        3 2c3a5828        0 -/P/-     0 NONE 
        4 2c3a6300 2c3a5960 I/P/A     0 NONE 
        8 2c3a6360 2c3a4f00 I/P/A     0 NONE 

 

我们看到这里包含了四个对象.

我们再来看看Oracle9i中的情况:

参考文件: hsjf_ora_15800.trc

LIBRARY CACHE HASH TABLE: size=131072 count=217
Buckets with more than 20 objects:
NONE
Hash Chain Size     Number of Buckets
---------------     -----------------
              0                130855
              1                   217
              2                     0
              3                     0
              4                     0
              5                     0
              6                     0
              7                     0
              8                     0
              9                     0
             10                     0
             11                     0
             12                     0
             13                     0
             14                     0
             15                     0
             16                     0
             17                     0
             18                     0
             19                     0
             20                     0
            >20                     0

 

Oracle9i中通过新的方式记录Library Cache的使用状况.

按不同的Hash Chain Size代表Library Cache中包含不同对象的个数.0表示Free的Bucket,>20表示包含超过20个对象的Bucket的个数.

从以上列表中我们看到,包含一个对象的Buckets有217个,包含0个对象的Buckets有130855个.

我们来验证一下:

  [oracle@jumper udump]$ cat hsjf_ora_15800.trc |grep BUCKET|more
  BUCKET 12:
  BUCKET 12 total object count=1
  BUCKET 385:
  BUCKET 385 total object count=1
  BUCKET 865:
  BUCKET 865 total object count=1
  ...

  [oracle@jumper udump]$ cat hsjf_ora_15800.trc |grep BUCKET|wc -l

  434

  [oracle@jumper udump]$

 

434/2 = 217,证实了我们的猜想.

通过HASH TABLE算法的改进,Oracle Library Cache管理的效率大大提高.

 

关于shared pool的深入探讨(四)

我们进一步来讨论一下shared pool的处理:

先进行相应查询,获得测试数据: 

 

[oracle@jumper udump]$ sqlplus "/ as sysdba"

 

SQL*Plus: Release 9.2.0.3.0 - Production on Thu Aug 26 10:21:54 2004

 

Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.

 

 

Connected to:

Oracle9i Enterprise Edition Release 9.2.0.3.0 - Production

With the Partitioning, OLAP and Oracle Data Mining options

JServer Release 9.2.0.3.0 - Production

 

SQL> connect eygle/eygle

Connected.

 

SQL> create table emp as select * from scott.emp;

 

Table created.

 

SQL>

SQL> connect / as sysdba

Connected.

SQL> startup force;

ORACLE instance started.

 

Total System Global Area   47256168 bytes

Fixed Size                   451176 bytes

Variable Size              29360128 bytes

Database Buffers           16777216 bytes

Redo Buffers                 667648 bytes

Database mounted.

Database opened.

SQL> set linesize 120

SQL> connect scott/tiger

Connected.

SQL> select * from emp;

 

EMPNO ENAME      JOB       MGR HIREDATE       SAL       COMM     DEPTNO

---------- ---------- --------- ---------- --------- ---------- ---------- ----------

7369 SMITH      CLERK           7902 17-DEC-80        800                    20

8888 EYGLE      MANAGER              11-AUG-04       9999         10       10

7566 JONES      MANAGER         7839 02-APR-81       2975                   20

7698 BLAKE      MANAGER         7839 01-MAY-81       2850                   30

7782 CLARK      MANAGER         7839 09-JUN-81       2450                   10

7788 SCOTT      ANALYST         7566 19-APR-87       3000                    20

7839 KING       PRESIDENT            17-NOV-81       5000                    10

7876 ADAMS      CLERK           7788 23-MAY-87       1100                    20

7900 JAMES      CLERK           7698 03-DEC-81        950                     30

7902 FORD       ANALYST         7566 03-DEC-81       3000                    20

7934 MILLER     CLERK           7782 23-JAN-82       1300                     10

 

11 rows selected.

 

SQL> connect eygle/eygle

Connected.

SQL> select * from emp;

 

EMPNO ENAME      JOB          MGR HIREDATE     SAL      COMM     DEPTNO

---------- ---------- --------- ---------- --------- ---------- ---------- ----------

7369 SMITH      CLERK           7902 17-DEC-80        800                    20

8888 EYGLE      MANAGER              11-AUG-04       9999         10       10

7566 JONES      MANAGER         7839 02-APR-81       2975                  20

7698 BLAKE      MANAGER         7839 01-MAY-81       2850                  30

7782 CLARK      MANAGER         7839 09-JUN-81       2450                  10

7788 SCOTT      ANALYST         7566 19-APR-87       3000                   20

7839 KING       PRESIDENT            17-NOV-81       5000                   10

7876 ADAMS      CLERK           7788 23-MAY-87       1100                   20

7900 JAMES      CLERK           7698 03-DEC-81        950                    30

7902 FORD       ANALYST         7566 03-DEC-81       3000                   20

7934 MILLER     CLERK           7782 23-JAN-82       1300                    10

 

11 rows selected.

 

SQL> select SQL_TEXT,VERSION_COUNT,HASH_VALUE,to_char(HASH_VALUE,'xxxxxxxxxx') HEX,ADDRESS

  2  from v$sqlarea where sql_text like 'select * from emp%';

 

SQL_TEXT             VERSION_COUNT HASH_VALUE HEX         ADDRESS

-------------------- ------------- ---------- ----------- --------

select * from emp                2 2648707557    9de011e5 52D9EA28

 

 

SQL> select sql_text,username,ADDRESS,HASH_VALUE,to_char(HASH_VALUE,'xxxxxxxxxx') HEX_HASH_VALUE,CHILD_NUMBER,CHILD_LATCH

  2  from v$sql a,dba_users b where a.PARSING_USER_ID = b.user_id and  sql_text like 'select * from emp%';

 

SQL_TEXT             USERNAME                       ADDRESS  HASH_VALUE HEX_HASH_VA CHILD_NUMBER CHILD_LATCH

-------------------- ------------------------------ -------- ---------- ----------- ------------ -----------

select * from emp    SCOTT                          52D9EA28 2648707557    9de011e5            0           1

select * from emp    EYGLE                          52D9EA28 2648707557    9de011e5            1           1

 

注意:这里我们可以看出v$sqlareav$sql两个视图的不同之处

v$sql中为每一条sql保留一个条目,而v$sqlarea中根据sql_text进行group by,通过version_count计算子指针的个数.

 

我们注意到,这两条sql语句因为其代码完全相同,所以其ADDRESS,HASH_VALUE也完全相同.

这就意味着,这两条sql语句在shared pool中的存储位置是相同的(尽管其执行计划可能不同),代码得以共享.

 

在此过程中Oracle完成sql解析的第一个步骤:语法解析

Oracle进行语法检查时遵循自右向左,自下向上的原则,如果发现语法错误就马上返回错误.

 

语法检查通过以后,Oraclesql文本转换为相应的ASCII数值,然后根据数值通过Hash函数计算其HASH_VALUE

shared pool中寻找是否存在相同的sql语句,如果存在,则进入下一步骤;如果不存在则尝试获取shared pool latch

请求内存,存储该sql代码

 

在这里有一个问题需要说明一下:

因为大小写字母的ascii值是不同的,所以Oracle会把大小写不同的代码作为不同的sql来处理,我们看一下测试:

 

SQL> select * from scott.dept;

 

    DEPTNO DNAME          LOC

---------- -------------- -------------

        10 ACCOUNTING     NEW YORK

        20 RESEARCH       DALLAS

        30 SALES          CHICAGO

        40 OPERATIONS     BOSTON

 

SQL> select * from scott.DEPT;

 

    DEPTNO DNAME          LOC

---------- -------------- -------------

        10 ACCOUNTING     NEW YORK

        20 RESEARCH       DALLAS

        30 SALES          CHICAGO

        40 OPERATIONS     BOSTON

 

SQL> col sql_text for a30

SQL> select sql_text,hash_value from v$sql where sql_text like 'select * from scott%';

 

SQL_TEXT                       HASH_VALUE

------------------------------ ----------

select * from scott.DEPT       4096614922

select * from scott.dept       2089404358

 

我们注意到,仅只大小写的不同使得原本相同的sql语句变成了两条"不同的代码"

所以从这里我们可以看出,sql的规范编写非常重要.

 

完成这一个步骤以后,Oracle需要进行的是语义分析:

在此步骤中,Oracle需要验证对象是否存在,相关用户是否具有权限,引用的是否是相同的对象...

对于我们第一个查询,实际上emp表来自不同的用户,那么sql的执行计划也就不同了

当然影响sql执行计划的因素还有很多,包括优化器模式等

 

SQL> select a.*,to_char(to_hash,'xxxxxxxxxx') Hex_HASH_VALUE

  2  from V$OBJECT_DEPENDENCY  a where to_name='EMP';

 

FROM_ADD  FROM_HASH TO_OWNER             TO_NAME              TO_ADDRE    TO_HASH    TO_TYPE HEX_HASH_VA

-------- ---------- -------------------- -------------------- -------- ---------- ---------- -----------

52D9EA28 2648707557 SCOTT               EMP                  52D9DEBC  828228010          2    315dc1aa

52D9EA28 2648707557 EYGLE               EMP                  52D82E58 1930491453          2    7310f63d

 

SQL> ALTER SESSION SET EVENTS 'immediate trace name LIBRARY_CACHE level 1';

 

Session altered.

 

SQL> ALTER SESSION SET EVENTS 'immediate trace name LIBRARY_CACHE level 2';

 

Session altered.

 

SQL> ALTER SESSION SET EVENTS 'immediate trace name LIBRARY_CACHE level 4';

 

Session altered.

 

SQL> ALTER SESSION SET EVENTS 'immediate trace name LIBRARY_CACHE level 8';

 

Session altered.

 

SQL> ALTER SESSION SET EVENTS 'immediate trace name LIBRARY_CACHE level 16';

 

Session altered.

 

SQL> ALTER SESSION SET EVENTS 'immediate trace name LIBRARY_CACHE level 32';

 

Session altered.

 

SQL> ALTER SESSION SET EVENTS 'immediate trace name LIBRARY_CACHE level 100';

 

Session altered.                     

此处在不同级别对shared pool进行转储.

在转向trace文件之前,我们通过下图来看一下Library CACHE的结构:

 hash.jpg

见上传图片

Library Cache Handle指向Library Cache Object,含对象名,namespace,时间戳,引用列表,锁定对象及pin对象列表等.

我们从dump文件中看看具体的信息,由以上v$sql视图我们得到以上查询的hash_value为9de011e5,ADDRESS为52D9EA28

在bucket 4851中,我们找到了select * from emp 这条sql语句. 

 

BUCKET 4581:

  LIBRARY OBJECT HANDLE: handle=52d9ea28

  name=select * from emp

  hash=9de011e5 timestamp=08-26-2004 10:24:43

 

==>这个hash正是v$sql中该sql语句的hash_value值   

 

  namespace=CRSR flags=RON/TIM/PN0/SML/[12010000]

  kkkk-dddd-llll=0000-0001-0001 lock=0 pin=0 latch#=1

  lwt=0x52d9ea40[0x52d9ea40,0x52d9ea40] ltm=0x52d9ea48[0x52d9ea48,0x52d9ea48]

  pwt=0x52d9ea58[0x52d9ea58,0x52d9ea58] ptm=0x52d9eab0[0x52d9eab0,0x52d9eab0]

  ref=0x52d9ea30[0x52d9ea30, 0x52d9ea30] lnd=0x52d9eabc[0x52d9eabc,0x52d9eabc]

    LIBRARY OBJECT: object=52d9e7b0

    type=CRSR flags=EXS[0001] pflags= [00] status=VALD load=0

    CHILDREN: size=16

    child#    table reference   handle

    ------ -------- --------- --------

         0 52d9e96c  52d9e6cc 52d9e4ac

         1 52d9e96c  52d9e70c 52d885cc

 

==>这就是我们前边提到过的子指针,每个都指向了一个不同的handle   

        

    DATA BLOCKS:

    data#     heap  pointer status pins change    alloc(K)  size(K)

    ----- -------- -------- ------ ---- ------     -------- --------

        0 52d9e9b8 52d9e838 I/-/A     0 NONE       0.86     1.09

       

==>此处的heap就是指内存地址.  

 

==>这里存放的就是SQL代码及用户连接信息

       

    HEAP DUMP OF DATA BLOCK 0:

******************************************************

HEAP DUMP heap name="library cache"  desc=0x52d9e9b8

 extent sz=0x224 alt=32767 het=16 rec=9 flg=2 opc=0

 parent=0x5000002c owner=0x52d9e7b0 nex=(nil) xsz=0x224

EXTENT 0 addr=0x52d9e558

  Chunk 52d9e560 sz=      540    perm      "perm           "  alo=448

52D9E560 5000021D 00000000 52D9E7A0 000001C0  [...P.......R....]

52D9E570 52D9E704 52D9E660 00020002 52D9E57C  [...R`..R....|..R]

52D9E580 52D9E57C 00000000 52D9E588 52D9E588  [|..R.......R...R]

52D9E590 00000000 52D9E594 52D9E594 00000000  [.......R...R....]

52D9E5A0 52D9E5A0 52D9E5A0 00000000 52D9E5AC  [...R...R.......R]

52D9E5B0 52D9E5AC 00000000 52D9E5B8 52D9E5B8  [...R.......R...R]

…….

52D9E8D0 52D9E630 00000000 52D9E570 00000000  [0..R....p..R....]

52D9E8E0 00000000 00000000 00000000 00000000  [................]

52D9E8F0 00000000 00000000 00000002 0000000D  [................]

52D9E900 00000001 00000000                    [........]       

Permanent space    =      900

******************************************************

  BUCKET 4581 total object count=1

 

 

我们以handle:52d885cc为例看一下Library Cache Object的结构:

******************************************************

  LIBRARY OBJECT HANDLE: handle=52d885cc

  namespace=CRSR flags=RON/KGHP/PN0/[10010000]

  kkkk-dddd-llll=0000-0041-0041 lock=0 pin=0 latch#=1

  lwt=0x52d885e4[0x52d885e4,0x52d885e4] ltm=0x52d885ec[0x52d885ec,0x52d885ec]

  pwt=0x52d885fc[0x52d885fc,0x52d885fc] ptm=0x52d88654[0x52d88654,0x52d88654]

  ref=0x52d885d4[0x52d9e70c, 0x52d9e70c] lnd=0x52d88660[0x52d88660,0x52d88660]

    LIBRARY OBJECT: object=52d82a24

    type=CRSR flags=EXS[0001] pflags= [00] status=VALD load=0

    DEPENDENCIES: count=1 size=16

    dependency#    table reference   handle position flags

    ----------- -------- --------- -------- -------- -------------------

              0 52d82be0  52d82b20 52d82e58       14 DEP[01]

             

==>在dependency部分我们看到,这个cursor依赖的对象handle:  52d82e58

==>这个handle指向的就是EYGLE.EMP表,如果以上两个CRSR访问的是同一个对象,

==>那么这两个SQL才会是真的共享.这里我们的sql虽然是相同的,访问的却是不同用户

==>的数据表, 子指针的概念就出来了.

==>在这里我们看到52d82e58指向的是EYGLE.EMP这个对象,也就是EYGLE所查询的数据表.

    ACCESSES: count=1 size=16

    dependency# types

    ----------- -----

              0 0009

    TRANSLATIONS: count=1 size=16

    original    final

    -------- --------

    52d82e58 52d82e58

    DATA BLOCKS:

    data#     heap  pointer status pins change    alloc(K)  size(K)

    ----- -------- -------- ------ ---- ------     -------- --------

        0 52d8c244 52d827e4 I/-/A     0 NONE       1.09     1.64

        6 52d82ac0 52d817c4 I/-/A     0 NONE       3.70     3.73

       

==>这里的Data Blocks是个重要的部分

==>每个控制块包含一个heap descriptor,指向相应的heap memory,这个heap memory ==>包含的就是Diana Tree,P-Code,Source Code,Shared Cursor Context area等重要==>数据,也就是我们通常。所说的,解析过的SQL及执行计划树,真正到这里以后,sql才得==>以共享.也就真正的避免了硬解析

       

    HEAP DUMP OF DATA BLOCK 0:

******************************************************   

HEAP DUMP heap name="library cache"  desc=0x52d8c244

 extent sz=0x224 alt=32767 het=16 rec=9 flg=2 opc=0

 parent=0x5000002c owner=0x52d82a24 nex=(nil) xsz=0x224

EXTENT 0 addr=0x52d80ff0

  Chunk 52d80ff8 sz=      464    free      "               "

52D80FF0                   C00001D1 00000000          [........]

52D81000 52D8C28C 52D8C28C 00000000 00000000  [...R...R........]

52D81010 00000000 00000000 00000000 00000000  [................]

        Repeat 26 times

52D811C0 00000000 00000000                    [........]       

  Chunk 52d811c8 sz=       76    freeable  "kgltbtab       "

52D811C0                   1000004D 52D80FF8          [M......R]

52D811D0 0A857928 52D82B68 00000000 00000000  [(y..h+.R........]

52D811E0 00000000 00000000 00000000 00000000  [................]

        Repeat 2 times

52D81210 00000000                             [....]           

EXTENT 1 addr=0x52d827cc

  Chunk 52d827d4 sz=      540    perm      "perm           "  alo=532

52D827D0          5000021D 00000000 52D82A14      [...P.....*.R]

52D827E0 00000214 00000000 0000001A 00000069  [............i...]

…………….

52D825D0 00000000 00000000 00000000 52D81ACF  [...............R]

52D825E0 00000000 00000000 00000000 00000000  [................]

        Repeat 3 times

52D82620 52D8263C 00000010 000006D4 00000010  [<&.R............]

52D82630 000006F4 00000010 000006F8 00000010  [................]

52D82640 000006FC 00000010 00000714 00000010  [................]

52D82650 00000720 00000010 0000072C 00000010  [ .......,.......]

52D82660 00000744 00000010 0000074C 00000010  [D.......L.......]

52D82670 00000764 00000010 0000077C           [d.......|...]   

Permanent space    =     3784

MARKS:

  Mark 0x52d8237c

******************************************************

这里的handle=52d82e58就是sql依赖的对象信息:

BUCKET 63037:

  LIBRARY OBJECT HANDLE: handle=52d82e58

  name=EYGLE.EMP

  hash=7310f63d timestamp=08-26-2004 10:23:40

  namespace=TABL/PRCD/TYPE flags=KGHP/TIM/SML/[02000000]

  kkkk-dddd-llll=0000-0501-0501 lock=0 pin=0 latch#=1

  lwt=0x52d82e70[0x52d82e70,0x52d82e70] ltm=0x52d82e78[0x52d82e78,0x52d82e78]

  pwt=0x52d82e88[0x52d82e88,0x52d82e88] ptm=0x52d82ee0[0x52d82ee0,0x52d82ee0]

  ref=0x52d82e60[0x52d82e60, 0x52d82e60] lnd=0x52d82eec[0x52d7dcf0,0x52d89fc8]

    LIBRARY OBJECT: object=52d81594

    type=TABL flags=EXS/LOC[0005] pflags= [00] status=VALD load=0

 

==>Type:对象类型,这里是一张表   

==>flags:代表对象状态

 

    DATA BLOCKS:

    data#     heap  pointer status pins change    alloc(K)  size(K)

    ----- -------- -------- ------ ---- ------     -------- --------

        0 52d8c1e4 52d8161c I/-/A     0 NONE       0.66     1.09

        8 52d81238 52d80a18 I/-/A     0 NONE       1.10     1.13

       10 52d8129c 52d80ea0 I/-/A     0 NONE       0.12     0.37

      

    HEAP DUMP OF DATA BLOCK 0:

******************************************************

HEAP DUMP heap name="library cache"  desc=0x52d8c1e4

 extent sz=0x224 alt=32767 het=16 rec=9 flg=2 opc=0

 parent=0x5000002c owner=0x52d81594 nex=(nil) xsz=0x224

 

==>每个heap descriptor 都包含一个owner部分,指向所有者,这里的 52d81594 也就==>是EYGLE.EMP指向的Library对象:  LIBRARY OBJECT: object=52d81594

 

EXTENT 0 addr=0x52d81220

  Chunk 52d81228 sz=      540    perm      "perm           "  alo=196

52D81220                   5000021D 00000000          [...P....]

52D81230 52D81584 000000C4 5000002C 00000824  [...R....,..P$...]

52D81240 52D81594 52D80A00 52D80A08 00000000  [...R...R...R....]

52D81250 00000000 05010200 00000000 00000000  [................]

52D81260 534C474B 61656820 00000070 00000000  [KGLS heap.......]

52D81270 00107FFF 7FFF7FFF 00000401 00000000  [................]

52D81280 52D81280 52D81280 52D8129C 00000000  [...R...R...R....]

52D81290 52D80EA0 00040000 52FF5C14 5000002C  [...R.....\.R,..P]

……..

52D80EE0 00000000 00000000 00000000 00000000  [................]

        Repeat 15 times

52D80FE0 00000000                             [....]           

Total heap size    =      340

FREE LISTS:

 Bucket 0 size=0

Total free space   =        0

UNPINNED RECREATABLE CHUNKS (lru first):

PERMANENT CHUNKS:

  Chunk 52d80e90 sz=      340    perm      "perm           "  alo=120

52D80E90 50000155 00000000 00000000 00000078  [U..P........x...]

52D80EA0 00000000 00000002 00000068 00000004  [........h.......]

52D80EB0 52D80EB0 52D80EB0 52D80EB8 52D80EB8  [...R...R...R...R]

52D80EC0 00000000 00000000 00000005 52D80ECC  [...............R]

52D80ED0 52D80ECC 52D80ED4 52D80ED4 00000000  [...R...R...R....]

52D80EE0 00000000 00000000 00000000 00000000  [................]

        Repeat 15 times

52D80FE0 00000000                             [....]           

Permanent space    =      340

******************************************************

  BUCKET 63037 total object count=1

                                                              

 

关于shared pool的深入探讨(五)

Oracle使用两种数据结构来进行shared pool的并发控制:lock 和 pin.
Lock比pin具有更高的级别.

Lock在handle上获得,在pin一个对象之前,必须首先获得该handle的锁定.
锁定主要有三种模式: Null,share,Exclusive.
在读取访问对象时,通常需要获取Null(空)模式以及share(共享)模式的锁定.
在修改对象时,需要获得Exclusive(排他)锁定.

在锁定了Library Cache对象以后,一个进程在访问之前必须pin该对象.
同样pin有三种模式,Null,shared和exclusive.

只读模式时获得共享pin,修改模式获得排他pin.

通常我们访问、执行过程、Package时获得的都是共享pin,如果排他pin被持有,那么数据库此时就要产生等待.

在很多statspack的report中,我们可能看到以下等待事件:

Top 5 Wait Events

~~~~~~~~~~~~~~~~~                                         Wait     % Total

Event                                              Waits  Time (cs)   Wt Time

-------------------------------------------- ------------ ------------ -------

library cache lock                                75,884    1,409,500   48.44

latch free                                     34,297,906    1,205,636   41.43

library cache pin                                    563      142,491    4.90

db file scattered read                           146,283       75,871    2.61

enqueue                                         2,211       13,003     .45 -------------------------------------------------------------                

 

这里的library cache lock和library cache pin都是我们关心的.接下来我们就研究一下这几个等待事件.

(一).LIBRARY CACHE PIN等待事件

Oracle文档上这样介绍这个等待事件:

"library cache pin" 是用来管理library cache的并发访问的,pin一个object会引起相应的heap被载入内存中(如果此前没有被加载),Pins可以在三个模式下获得:NULL,SHARE,EXCLUSIVE,可以认为pin是一种特定形式的锁.

当Library Cache Pin等待事件出现时,通常说明该Pin被其他用户已非兼容模式持有.

"library cache pin"的等待时间为3秒钟,其中有1秒钟用于PMON后台进程,即在取得pin之前最多等待3秒钟,否则就超时.

"library cache pin"的参数如下,有用的主要是P1和P2:

                P1 - KGL Handle address.

                P2 - Pin address

                P3 - Encoded Mode & Namespace

"LIBRARY CACHE PIN"通常是发生在编译或重新编译PL/SQL,VIEW,TYPES等object时.编译通常都是显性的,如安装应用程序,升级,安装补丁程序等,另外,"ALTER","GRANT","REVOKE"等操作也会使object变得无效, 可以通过object的"LAST_DDL"观察这些变化.

当object变得无效时,Oracle 会在第一次访问此object时试图去重新编译它,如果此时其他session已经把此object pin到library cache中,就会出现问题,特别时当有大量的活动session并且存在较复杂的dependence时.在某种情况下,重新编译object可能会花几个小时时间,从而阻塞其它试图去访问此object的进程.

下面让我们通过一个例子来模拟及解释这个等待:

1.创建测试用存储过程

[oracle@jumper udump]$ sqlplus "/ as sysdba"

 

SQL*Plus: Release 9.2.0.3.0 - Production on Mon Sep 6 14:16:57 2004

 

Copyright (c) 1982, 2002, Oracle Corporation.  All rights reserved.

 

Connected to an idle instance.

 

SQL> startup

ORACLE instance started.

 

Total System Global Area   47256168 bytes

Fixed Size                   451176 bytes

Variable Size              29360128 bytes

Database Buffers           16777216 bytes

Redo Buffers                 667648 bytes

Database mounted.

Database opened.

SQL> create or replace PROCEDURE pining

  2  IS

  3  BEGIN

  4          NULL;

  5  END;

  6  /

 

Procedure created.

 

SQL>

SQL> create or replace procedure calling

  2  is

  3  begin

  4          pining;

  5          dbms_lock.sleep(3000);

  6  end;

  7  /

 

Procedure created.

 

SQL>                                                                     

                                                                             

2.模拟

首先执行calling过程,在calling过程中调用pining过程。此时pining过程上获得共享Pin,如果此时尝试对pining进行授权或重新编译,将产生Library Cache Pin等待,直到calling执行完毕.

session 1:

 

[oracle@jumper oracle]$ sqlplus "/ as sysdba"

SQL*Plus: Release 9.2.0.3.0 - Production on Mon Sep 6 16:13:43 2004

Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.

Connected to:
Oracle9i Enterprise Edition Release 9.2.0.3.0 - Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.3.0 - Production

SQL> exec calling

 

此时calling开始执行

session 2:

 

[oracle@jumper udump]$ sqlplus "/ as sysdba"

SQL*Plus: Release 9.2.0.3.0 - Production on Mon Sep 6 16:14:16 2004

Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.

Connected to:
Oracle9i Enterprise Edition Release 9.2.0.3.0 - Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.3.0 - Production

SQL> grant execute on pining to eygle;

 

此时session 2挂起

ok,我们开始我们的研究:

从v$session_wait入手,我们可以得到哪些session正在经历library cache pin的等待

 

 

SQL> select sid,seq#,event,p1,p1raw,p2,p2raw,p3,p3raw,state

  2  from v$session_wait where event like 'library%';

 

 

 SID       SEQ# EVENT                       P1 P1RAW            P2 P2RAW            P3  WAIT_TIME SECONDS_IN_WAIT STATE

---- ---------- ------------------- ---------- -------- ---------- -------- ---------- ---------- --------------- -------------------

   8        268 library cache pin   1389785868 52D6730C 1387439312 52B2A4D0        301          0               2 WAITING

 

等待3秒就超时,seq#会发生变化

 

SQL>

 

 SID       SEQ# EVENT                       P1 P1RAW            P2 P2RAW            P3  WAIT_TIME SECONDS_IN_WAIT STATE

---- ---------- ------------------- ---------- -------- ---------- -------- ---------- ---------- --------------- -------------------

   8        269 library cache pin   1389785868 52D6730C 1387439312 52B2A4D0        301          0               2 WAITING

 

SQL>

 

 SID       SEQ# EVENT                       P1 P1RAW            P2 P2RAW            P3  WAIT_TIME SECONDS_IN_WAIT STATE

---- ---------- ------------------- ---------- -------- ---------- -------- ---------- ---------- --------------- -------------------

   8        270 library cache pin   1389785868 52D6730C 1387439312 52B2A4D0        301          0               0 WAITING

                                                                                           

在这个输出中,P1 列是Library Cache Handle Address,Pn字段是10进制表示,PnRaw字段是16进制表示

我们看到,library cache pin等待的对象的handle地址为:52D6730C
通过这个地址,我们查询X$KGLOB视图就可以得到对象的具体信息:

Note: X$KGLOB--[K]ernel [G]eneric [L]ibrary Cache Manager [OB]ject

 

col KGLNAOWN for a10

col KGLNAOBJ for a20

select ADDR,KGLHDADR,KGLHDPAR,KGLNAOWN,KGLNAOBJ,KGLNAHSH,KGLHDOBJ

from X$KGLOB

where KGLHDADR ='52D6730C'

/

 

ADDR     KGLHDADR KGLHDPAR KGLNAOWN   KGLNAOBJ               KGLNAHSH KGLHDOBJ

-------- -------- -------- ---------- -------------------- ---------- --------

404F9FF0 52D6730C 52D6730C SYS        PINING               2300250318 52D65BA4                                                                              

                                                                           

这里KGLNAHSH代表该对象的Hash Value

由此我们知道,在PINING对象上正经历library cache pin的等待.

然后我们引入另外一个内部视图X$KGLPN:

Note:X$KGLPN--[K]ernel [G]eneric [L]ibrary Cache Manager object [P]i[N]s

 select a.sid,a.username,a.program,b.addr,b.KGLPNADR,b.KGLPNUSE,b.KGLPNSES,b.KGLPNHDL,

b.kGLPNLCK, b.KGLPNMOD, b.KGLPNREQ

from v$session a,x$kglpn b

where a.saddr=b.kglpnuse and b.kglpnhdl = '52D6730C' and b.KGLPNMOD<>0

/

 

  SID USERNAME    PROGRAM                                  ADDR     KGLPNADR KGLPNUSE KGLPNSES KGLPNHDL KGLPNLCK   KGLPNMOD   KGLPNREQ

----- ----------- ---------------------------------------- -------- -------- -------- -------- -------- -------- ---------- ----------

   13 SYS         sqlplus@jumper.hurray.com.cn (TNS V1-V3) 404FA034 52B2A518 51E2013C 51E2013C 52D6730C 52B294C8          2          0                                                                            

                                                                                           

通过联合v$session,可以获得当前持有该handle的用户信息.对于我们的测试sid=13的用户正持有该handle

那么这个用户正在等什么呢?

 SQL> select * from v$session_wait where sid=13;

 

       SID       SEQ# EVENT               P1TEXT            P1 P1RAW    P2TEXT          P2 P2RAW    P3TEXT          P3 P3RAW     WAIT_TIME SECONDS_IN_WAIT STATE

---------- ---------- ------------------- --------- ---------- -------- ------- ---------- -------- ------- ---------- -------- ---------- --------------- -------

        13         25 PL/SQL lock timer   duration      120000 0001D4C0                  0 00                        0 00                0            1200 WAITING

 

Ok,这个用户正在等待一次PL/SQL lock timer计时.得到了sid,我们就可以通过v$session.SQL_HASH_VALUE,v$session.SQL_ADDRESS等字段关联v$sqltext,v$sqlarea等视图获得当前session正在执行的操作.

 SQL> select sql_text from v$sqlarea where v$sqlarea.hash_value='3045375777';

 

SQL_TEXT

--------------------------------------------------------------------------------

BEGIN calling; END;

 

这里我们得到这个用户正在执行calling这个存储过程,接下来的工作就应该去检查calling在作什么了.

我们这个calling作的工作是dbms_lock.sleep(3000)也就是PL/SQL lock timer正在等待的原因。

至此就找到了Library Cache Pin的原因.

简化一下以上查询:

1.获得Library Cache Pin等待的对象

 SELECT addr, kglhdadr, kglhdpar, kglnaown, kglnaobj, kglnahsh, kglhdobj

  FROM x$kglob

 WHERE kglhdadr IN (SELECT p1raw

                      FROM v$session_wait

                     WHERE event LIKE 'library%')

/

 

ADDR     KGLHDADR KGLHDPAR KGLNAOWN   KGLNAOBJ    KGLNAHSH KGLHDOBJ

-------- -------- -------- ---------- -------------------- ---------- --------

404F2178 52D6730C 52D6730C SYS        PINING               2300250318 52D65BA4

                                                                                           

2.获得持有等待对象的session信息

 

 

SELECT a.SID, a.username, a.program, b.addr, b.kglpnadr, b.kglpnuse,

       b.kglpnses, b.kglpnhdl, b.kglpnlck, b.kglpnmod, b.kglpnreq

  FROM v$session a, x$kglpn b

 WHERE a.saddr = b.kglpnuse

   AND b.kglpnmod <> 0

   AND b.kglpnhdl IN (SELECT p1raw

                        FROM v$session_wait

                       WHERE event LIKE 'library%')

/

SQL>

 

       SID USERNAME   PROGRAM                                          ADDR     KGLPNADR KGLPNUSE KGLPNSES KGLPNHDL KGLPNLCK   KGLPNMOD   KGLPNREQ

---------- ---------- ------------------------------------------------ -------- -------- -------- -------- -------- -------- ---------- ----------

        13 SYS        sqlplus@jumper.hurray.com.cn (TNS V1-V3)         404F6CA4 52B2A518 51E2013C 51E2013C 52D6730C 52B294C8          2          0

                                                                                           

3.获得持有对象用户执行的代码

 SELECT sql_text

  FROM v$sqlarea

 WHERE (v$sqlarea.address, v$sqlarea.hash_value) IN (

          SELECT sql_address, sql_hash_value

            FROM v$session

           WHERE SID IN (

                    SELECT SID

                      FROM v$session a, x$kglpn b

                     WHERE a.saddr = b.kglpnuse

                       AND b.kglpnmod <> 0

                       AND b.kglpnhdl IN (SELECT p1raw

                                            FROM v$session_wait

                                           WHERE event LIKE 'library%')))

/

 

SQL_TEXT

--------------------------------------------------------------------------------

BEGIN calling; END;

                                                                                           

在grant之前和之后我们可以转储一下shared pool的内容观察比较一下:

 SQL> ALTER SESSION SET EVENTS 'immediate trace name LIBRARY_CACHE level 32';

Session altered.

在grant之前:

从前面的查询获得pining的Handle是52D6730C:

 *****************************************************

BUCKET 67790:

  LIBRARY OBJECT HANDLE: handle=52d6730c

  name=SYS.PINING

  hash=891b08ce timestamp=09-06-2004 16:43:51

  namespace=TABL/PRCD/TYPE flags=KGHP/TIM/SML/[02000000]

  kkkk-dddd-llll=0000-0011-0011 lock=N pin=S latch#=1

--在Object上存在共享pin

--在handle上存在Null模式锁定,此模式允许其他用户继续以Null/shared模式锁定该对象

  lwt=0x52d67324[0x52d67324,0x52d67324] ltm=0x52d6732c[0x52d6732c,0x52d6732c]

  pwt=0x52d6733c[0x52b2a4e8,0x52b2a4e8] ptm=0x52d67394[0x52d67394,0x52d67394]

  ref=0x52d67314[0x52d67314, 0x52d67314] lnd=0x52d673a0[0x52d67040,0x52d6afcc]

    LIBRARY OBJECT: object=52d65ba4

    type=PRCD flags=EXS/LOC[0005] pflags=NST [01] status=VALD load=0

    DATA BLOCKS:

    data#     heap  pointer status pins change    alloc(K)  size(K)

    ----- -------- -------- ------ ---- ------     -------- --------

        0 52d65dac 52d65c90 I/P/A     0 NONE       0.30     0.55

        4 52d65c40 52d67c08 I/P/A     1 NONE       0.44     0.48

                                                                                           

在发出grant命令后:

 ******************************************************

BUCKET 67790:

  LIBRARY OBJECT HANDLE: handle=52d6730c

  name=SYS.PINING

  hash=891b08ce timestamp=09-06-2004 16:43:51

  namespace=TABL/PRCD/TYPE flags=KGHP/TIM/SML/[02000000]

  kkkk-dddd-llll=0000-0011-0011 lock=X pin=S latch#=1

--由于calling执行未完成,在object上仍让保持共享pin

--由于grant会导致重新编译该对象,所以在handle上的排他锁已经被持有

--进一步的需要获得object上的Exclusive pin,由于shared pin被calling持有,所以library cache pin等待出现.

  lwt=0x52d67324[0x52d67324,0x52d67324] ltm=0x52d6732c[0x52d6732c,0x52d6732c]

  pwt=0x52d6733c[0x52b2a4e8,0x52b2a4e8] ptm=0x52d67394[0x52d67394,0x52d67394]

  ref=0x52d67314[0x52d67314, 0x52d67314] lnd=0x52d673a0[0x52d67040,0x52d6afcc]

    LIBRARY OBJECT: object=52d65ba4

    type=PRCD flags=EXS/LOC[0005] pflags=NST [01] status=VALD load=0

    DATA BLOCKS:

    data#     heap  pointer status pins change    alloc(K)  size(K)

    ----- -------- -------- ------ ---- ------     -------- --------

        0 52d65dac 52d65c90 I/P/A     0 NONE       0.30     0.55

        4 52d65c40 52d67c08 I/P/A     1 NONE       0.44     0.48                                                                                    

实际上recompile过程包含以下步骤,我们看一下lock和pin是如何交替发挥作用的:

1.存储过程的library cache object以排他模式被锁定,这个锁定是在handle上获得的
exclusive锁定可以防止其他用户执行同样的操作,同时防止其他用户创建新的引用此过程的对象.

2.以shared模式pin该对象,以执行安全和错误检查.

3.共享pin被释放,重新以排他模式pin该对象,执行重编译.

4.使所有依赖该过程的对象失效

5.释放exclusive lock和exclusive pin

 

(二).LIBRARY CACHE LOCK等待事件

如果此时我们再发出一条grant或compile的命令,那么library cache lock等待事件将会出现:

session 3:

 [oracle@jumper oracle]$ sqlplus "/ as sysdba"

SQL*Plus: Release 9.2.0.3.0 - Production on Tue Sep 7 17:05:25 2004

Copyright (c) 1982, 2002, Oracle Corporation.  All rights reserved.

Connected to:

Oracle9i Enterprise Edition Release 9.2.0.3.0 - Production

With the Partitioning, OLAP and Oracle Data Mining options

JServer Release 9.2.0.3.0 - Production

SQL> alter procedure pining compile;

                                                                           

此进程挂起,我们查询v$session_wait视图可以获得以下信息:

 SQL> select * from v$session_wait;

 

 SID SEQ# EVENT               P1TEXT                  P1 P1RAW    P2TEXT               P2 P2RAW    P3TEXT              P3 P3RAW     WAIT_TIME SECONDS STATE 

---- ---- ------------------- --------------- ---------- -------- ------------ ---------- -------- ---------------- -------------- ---------- ------  ---

  11  143 library cache pin   handle address  1390239716 52DD5FE4 pin address  1387617456 52B55CB0 100*mode+namespace 301 0000012D          0       6 WAITING

  13   18 library cache lock  handle address  1390239716 52DD5FE4 lock address 1387433984 52B29000 100*mode+namespace 301 0000012D          0       3 WAITING

   8  415 PL/SQL lock timer   duration        120000     0001D4C0                       0 00                            0 00                0      63 WAITING

....

 

13 rows selected

                                                                           

由于handle上的lock已经被session 2以exclusive模式持有,所以session 3产生了等待。我们可以看到,在生产数据库中权限的授予、对象的重新编译都可能会导致library cache pin等待的出现。所以应该尽量避免在高峰期进行以上操作.

另外我们测试的案例本身就说明:如果Package或过程中存在复杂的、交互的依赖以来关系极易导致library cache pin的出现。所以在应用开发的过程中,我们也应该注意这方面的内容。

关于shared pool的深入探讨(六)

研究了几天shared pool,没想到忽然就撞到问题上来了。作为一个案例写出来给大家参考一下吧。

问题起因是公司做短信群发,就是那个18万买的4000字的短信小说(嘘,小声点,我也没看过...)。群发的时候每隔一段时间就会发生一次消息队列拥堵的情况,在数据库内部实际上是向一个数据表中记录发送日志。

我们介入来检查数据库的问题,在一个拥堵时段我开始诊断:

SQL> select sid,event,p1,p1raw from v$session_wait;  

 

       SID EVENT                           P1    P1RAW

---------- ---------------------------------------------------------------- ---------- --------

        76 latch free                 2147535824   8000CBD0

        83 latch free                 2147535824   8000CBD0

       148 latch free                 3415346832   CB920E90

       288 latch free                 2147535824   8000CBD0

       285 latch free                 2147535824   8000CBD0

       196 latch free                 2147535824   8000CBD0

       317 latch free                 2147535824   8000CBD0

         2 pmon timer                       300   0000012C

         1 rdbms ipc message                300   0000012C

         4 rdbms ipc message                300   0000012C

         6 rdbms ipc message             180000   0002BF20

 

       SID EVENT                            P1   P1RAW

---------- ---------------------------------------------------------------- ---------- --------

        18 rdbms ipc message               6000   00001770

       102 rdbms ipc message               6000   00001770

       311 rdbms ipc message               6000   00001770

       194 rdbms ipc message               6000   00001770

       178 rdbms ipc message               6000   00001770

         3 log file parallel write                   1   00000001

        13 log file sync                      2705   00000A91

        16 log file sync                      2699   00000A8B

       104 log file sync                      2699   00000A8B

       308 log file sync                      2694   00000A86

       262 log file sync                      2705   00000A91

 

       SID EVENT                            P1   P1RAW

---------- ---------------------------------------------------------------- ---------- --------

       172 log file sync                      2689   00000A81

       169 log file sync                       2705   00000A91

       108 log file sync                       2694   00000A86

        38 log file sync                       2707   00000A93

        34 db file scattered read                 63   0000003F

         5 smon timer                         300   0000012C

        27 SQL*Net message to client   1413697536   54435000

        60 SQL*Net message to client   1413697536   54435000

       239 SQL*Net message to client   1413697536   54435000

...ignore some idle waiting here...

        11 SQL*Net message from client  675562835   28444553

        12 SQL*Net message from client 1413697536   54435000

 

170 rows selected.                  

在这次查询中,我发现大量的latch free等待,再次查询时这些等待消失,应用也恢复了正常。

 SQL> select sid,event,p1,p1raw from v$session_wait where event not like 'SQL*Net%';

       SID EVENT                   P1     P1RAW

---------- ---------------------------------------------------------------- ---------- --------

         2 pmon timer               300     0000012C

         1 rdbms ipc message        300     0000012C

         4 rdbms ipc message        300     0000012C

         6 rdbms ipc message     180000     0002BF20

        18 rdbms ipc message        6000    00001770

       102 rdbms ipc message        6000    00001770

       178 rdbms ipc message        6000    00001770

       194 rdbms ipc message        6000    00001770

       311 rdbms ipc message        6000    00001770

         3 log file parallel write           1    00000001

       148 log file sync               2547   000009F3

 

       SID EVENT                     P1   P1RAW

---------- ---------------------------------------------------------------- ---------- --------

       273 log file sync               2544   000009F0

       190 log file sync               2545   000009F1

         5 smon timer                 300   0000012C

 

14 rows selected.

接下来我们来看这些latch free等待的是哪些latch。

SQL> select addr,latch#,name,gets,spin_gets from v$latch order by spin_gets;

 

ADDR         LATCH# NAME                                 GETS  SPIN_GETS

-------- ---------- ------------------------------------------------                ----------

80001398          3 session switching                       111937          0

80002010          6 longop free list                           37214          0

800023A0          7 cached attr list                               0          0

80002628         10 event group latch                      2391668          0

.....

80003F3C         28 message pool operations parent latch           3          0

.....

80006030         60 mostly latch-free SCN                         19          0

80005F8C         59 file number translation table                   68          0

80005F14         58 dlm cr bast queue latch                        0          0

80005E8C         57 name-service request                         0          0

80005E14         56 name-service memory objects                  0          0

80005DA0         55 name-service namespace bucket               0          0

 

ADDR         LATCH# NAME                                 GETS  SPIN_GETS

-------- ---------- ------------------------------------------------                  ----------

80005D2C         54 name-service pending queue                   0          0

80005CB4         53 name-service request queue                    0          0

80004E08         52 name-service entry                            0          0

80008AB0         76 KCL lock element parent latch                  0          0

80008A48         75 KCL instance latch                            0          0

80007F18         73 redo copy                                  816          0

80007BBC         71 archive process latch                          0          0

80007B54         70 archive control                                1          0

80006A10         68 Active checkpoint queue latch            2003308          0

800064B0         66 large memory latch                            0          0

80006448         65 cache protection latch                          0          0

 

ADDR         LATCH# NAME                                  GETS  SPIN_GETS

-------- ---------- ------------------------------------------------                ----------

800060EC         61 batching SCNs                                0          0

8000CAB0         96 global transaction                       6833807          0

8000CA48         95 global tx free list                           58258          0

8000C238         93 cost function                                   0          0

80009FCC         91 temp lob duration state obj allocation             0          0

8000995C         87 ktm global data                             8118          0

80009228         84 transaction branch allocation               282388          0

80008EC4         80 begin backup scn array                     6968          0

80008D54         79 loader state object freelist                  42712          0

80008B80         78 KCL freelist latch                              0          0

80008B18         77 KCL name table latch                           0          0

 

ADDR         LATCH# NAME                                  GETS  SPIN_GETS

-------- ---------- ------------------------------------------------                   ----------

8000D484        118 presentation list                                0          0

8000D41C        117 session timer                             855944          0

.....

8000E9D0        129 process queue                                44          0

8000E900        127 query server freelists                           66          0

8000FC84        140 AQ Propagation Scheduling System Load          0          0

8000E898        126 query server process                           10          0

8000E27C        125 job_queue_processes parameter latch       111937          0

8000DA1C        124 NLS data objects                               2          0

 

ADDR         LATCH# NAME                                   GETS  SPIN_GETS

-------- ---------- ------------------------------------------------                    ----------

8000D95C        123 ncodef allocation latch                   111937          0

8000D674        122 virtual circuits                                0           0

8000D60C        121 virtual circuit queues                    159877           0

8000D5A4        120 virtual circuit buffers                          0           0

8000D4EC        119 address list                                 2           0

.....

8000CD70        102 Direct I/O Adaptor                            2          0

.....

80002408          8 GDS latch                                   30          0

800092E4         85 sort extent pool                           69834          1

8000EC38        132 parallel query alloc buffer                     80          1

8000E968        128 error message lists                           22          1

80001400          4 process group creation                   2615542          2

8000EAA0        131 parallel query stats                           14          2

 

ADDR         LATCH# NAME                                   GETS  SPIN_GETS

-------- ---------- ------------------------------------------------                  ----------

8000CD08        101 Token Manager                         1151107          2

8000CB18         97 global tx hash mapping                    507846          2

80006378         63 cache buffer handles                      315924          4

8000EA38        130 process queue reference                  190993          5

80003E3C         26 channel handle pool latch                2391680         18

80003EAC         27 channel operations parent latch           4783425         24

80009B90         89 intra txn parallel recovery                   32654         33

8000FCF8        141 fixed table rows for x$hs_session          161368         41

800012C8          1 process allocation                       2391688        154

80009B28         88 parallel txn reco latch                     174519        271

8000CCA0        100 library cache load lock                 14947545       5958

 

ADDR         LATCH# NAME                                  GETS  SPIN_GETS

-------- ---------- ------------------------------------------------                  ----------

8000C8D0         94 user lock                               13086412       6078

8000914C         82 list of block allocation                   120650357      12024

80006A78         69 Checkpoint queue latch                 154361751      17686

80009D34         90 sequence cache                        64611720       32027

80009090         81 dml lock allocation                      234465024      45351

800091C0         83 transaction allocation                   214227648      48345

800096AC         86 undo global data                       188271244      49641

800028A0         13 enqueue hash chains                   373244264     131322

80007E04         72 redo allocation                         439389808     201498

80001468          5 session idle bit                        2039097976     204969

80002838         12 enqueues                             471338482      273695

 

ADDR         LATCH# NAME                                  GETS  SPIN_GETS

-------- ---------- ------------------------------------------------                  ----------

80001330          2 session allocation                     261826230      428312

800063E0         64 multiblock read objects               1380614923     1366278

800026B8         11 messages                           207935758     1372606

80001218          0 latch wait list                         203479569     1445342

80006310         62 cache buffers chains                 3.8472E+10     2521699

8000A17C         92 row cache objects                   1257586714     2555872

80007F80         74 redo writing                          264722932     4458044

80006700         67 cache buffers lru chain               5664313769    30046921

8000CBD0         98 shared pool                         122433688    59070585

8000CC38         99 library cache                       4414533796  1037032730

 

142 rows selected.

 

SQL> select startup_time from v$instance;

 

STARTUP_T

---------

13-AUG-04

 

检查数据库启动时间                                                            

我们注意到,在当前数据库中竞争最严重的两个latch是shared pool和library cache.
显然这极有可能是SQL的过度解析造成的。进一步我们检查v$sqlarea发现:

SQL> select sql_text,VERSION_COUNT,INVALIDATIONS,PARSE_CALLS,OPTIMIZER_MODE,

PARSING_USER_ID,PARSING_SCHEMA_ID,ADDRESS,HASH_VALUE

from v$sqlarea where version_count >1000;

  2 

 

SQL_TEXT

------------------------------------------------------------------------------------------------------------------------

VERSION_COUNT INVALIDATIONS PARSE_CALLS OPTIMIZER_MODE     PARSING_USER_ID

PARSING_SCHEMA_ID ADDRESS  HASH_VALUE

------------- ------------- ----------- ------------------------- --------------- ----------------- -------- ----------

insert into sms_log (MSGDATE,MSGTIME,MSGID,MSGKIND,MSGTYPE,MSGTYPE_MOMT,

MSGLEN,MSGSTATUS,AREAID,IFIDDEST,IFIDSRC,ADDRSRC,ADDRDEST,ADDRFEE,

ADDRUSER,SERVICECODE,PLANID,FEETYPE,FEEVALUE,DATACODING,FLAGS,SMLEN,

SMCONT) values (:b0,:b1,:b2,:b3,:b4,:b5,:b6,:b7,:b8,:b9,:b10,:b11,:b12,:b13,:b14,:b15,:b16,:b17,:b18,:b19,:b20,:b21,:b22)

         7023             0        1596 MULTIPLE CHILDREN PRESENT              36                36 C82AF1C8 3974744754                                                                              

这就是写日志记录的代码,这段代码使用了绑定变量,但是version_count却有7023个。也就是这个sql有7023个子指针.这是不可想象的。

通过前面几节的研究我们知道,如果这个sql有7023个子指针,那么意味着这些子指针都将存在于同一个Bucket的链表上。那么这也就意味着,如果同样SQL再次执行,Oracle将不得不搜索这个链表以寻找可以共享的SQL。这将导致大量的library cache latch的竞争。

这时候我开始猜测原因:

1.可能代码存在问题,在每次执行之前程序修改某些session参数,导致sql不能共性

2.可能是8.1.5的v$sqlarea记录存在问题,我们看到的结果是假象:)

3.Bug

 

Ok,我们的诊断不能停。最直接的我dump内存来看:

SQL> ALTER SESSION SET EVENTS 'immediate trace name LIBRARY_CACHE level 4';

察看trace文件得到如下结果(摘录包含该段代码的片断):

 BUCKET 21049:

  LIBRARY OBJECT HANDLE: handle=c82af1c8

  name=

insert into sms_log (MSGDATE,MSGTIME,MSGID,MSGKIND,MSGTYPE,MSGTYPE_MOMT,MSGLEN,

MSGSTATUS,AREAID,IFIDDEST,IFIDSRC,ADDRSRC,ADDRDEST,ADDRFEE,ADDRUSER,

SERVICECODE,PLANID,FEETYPE,FEEVALUE,DATACODING,FLAGS,SMLEN,SMCONT) values

(:b0,:b1,:b2,:b3,:b4,:b5,:b6,:b7,:b8,:b9,:b10,:b11,:b12,:b13,:b14,:b15,:b16,:b17,:b18,:b19,:b20,:b21,:b22)

  hash=ece9cab2 timestamp=09-09-2004 12:51:29

  namespace=CRSR flags=RON/TIM/PN0/LRG/[10010001]

  kkkk-dddd-llll=0000-0001-0001 lock=N pin=S latch=5

  lwt=c82af1e0[c82af1e0,c82af1e0] ltm=c82af1e8[c82af1e8,c82af1e8]

  pwt=c82af1f8[c82af1f8,c82af1f8] ptm=c82af250[c82af250,c82af250]

  ref=c82af1d0[c82af1d0,c82af1d0]

    LIBRARY OBJECT: object=c1588e84

    type=CRSR flags=EXS[0001] pflags= [00] status=VALD load=0

    CHILDREN: size=7024

    child#    table reference   handle

    ------ -------- --------- --------

         0 c1589040  c1589008 c668c2bc

         1 c1589040  bfd179c4 c6ec9ee8

         2 c1589040  bfd179e0 c2dd9b3c

         3 c1589040  bfd179fc c5a46614

         4 c1589040  bfd17a18 c35f1388

         5 c1589040  bfd17a34 c77401bc

         6 c1589040  bfd17a50 c4092838

         7 c1589040  bfddb310 c6cd5258

         8 c1589040  bfddb32c c63c6650

         9 c1589040  bfddb348 c7e4e3d0

        10 c1589040  bfddb364 c4c4b110

        11 c1589040  bfddb380 c5950348

        12 c1589040  bfddb39c c6c33aa4

        13 c1589040  bfddb3b8 c672b0bc

...........................................       

.....ignore losts of child cursor here.....

...........................................

      7001 bf595bc8  c641fba0 c6467890

      7002 bf595bc8  c641fbbc c3417168

      7003 bf595bc8  c641fbd8 c3417bb0

      7004 bf595bc8  c641fbf4 c2fdccbc

      7005 bf595bc8  c641fc10 c7f7ca50

      7006 bf595bc8  c641fc2c c7f508ec

      7007 bf595bc8  c641fc48 c268d8d8

      7008 c641fcb8  c641fc64 bec61ed8

      7009 c641fcb8  c641fc80 c4a6cc5c

      7010 c641fcb8  c641fc9c c1a8aa34

      7011 c641fcb8  c0ae4ea0 c0ae4ddc

      7012 c641fcb8  c0ae4ebc bd55fe60

      7013 c641fcb8  c0ae4ed8 c226914c

      7014 c641fcb8  c0ae4ef4 c51dd2e0

      7015 c641fcb8  c0ae4f10 c480c468

      7016 c641fcb8  c0ae4f2c c60196d0

      7017 c641fcb8  c0ae4f48 c4675d2c

      7018 c641fcb8  c0ae4f64 bd5e2750

      7019 c641fcb8  c0ae4f80 c09b1bb0

      7020 c641fcb8  c0ae4f9c bf2d6044

      7021 c641fcb8  c0ae4fb8 c332c1c4

      7022 c641fcb8  c0ae4fd4 cbdde0f8

    DATA BLOCKS:

    data#     heap  pointer status pins change

    ----- -------- -------- ------ ---- ------

        0 c3ef2c50 c1588f08 I/P/A     0 NONE 

 

这里确实存在7023个子指针

查询v$sql得到相同的结果:

SQL> select CHILD_NUMBER,EXECUTIONS,OPTIMIZER_MODE,OPTIMIZER_COST,PARSING_USER_ID,

   2 PARSING_SCHEMA_ID,ADDRESS,HASH_VALUE 

   3 from v$sql where HASH_VALUE='3974744754';   

 

CHILD_NUMBER EXECUTIONS OPTIMIZER_ OPTIMIZER_COST PARSING_USER_ID PARSING_SCHEMA_ID ADDRESS  HASH_VALUE

 ------------ ---------- ------ ---------- ------------- ------------ -------- ----------

    0      12966 CHOOSE       238150     36       36        C82AF1C8 3974744754

    1       7111 CHOOSE       238150     36       36        C82AF1C8 3974744754

    2       9160 CHOOSE       238150     36       36        C82AF1C8 3974744754

    3       9127 CHOOSE       238150     36       36        C82AF1C8 3974744754

    4       8109 CHOOSE       238150     36       36        C82AF1C8 3974744754

    5       4386 CHOOSE       238150     36       36        C82AF1C8 3974744754

    6       4913 CHOOSE       238150     36       36        C82AF1C8 3974744754

    7       3764 CHOOSE       238150     36       36        C82AF1C8 3974744754

    8       3287 CHOOSE       238150     36       36        C82AF1C8 3974744754

    9       3156 CHOOSE       238150     36       36        C82AF1C8 3974744754

.....                                                  

 7015          1 CHOOSE       238150     36       36        C82AF1C8 3974744754

 7016          1 CHOOSE       238150     36       36        C82AF1C8 3974744754

 7017          0 CHOOSE       238150     36       36        C82AF1C8 3974744754

 7018       9396 NONE                     0               0    C82AF1C8 3974744754

 7019       5008 CHOOSE       237913     36           36    C82AF1C8 3974744754

 7020        625 CHOOSE       237913     36           36    C82AF1C8 3974744754

 7021      10101 CHOOSE       237913     36           36    C82AF1C8 3974744754

 7022       7859 CHOOSE       237913     36           36    C82AF1C8 3974744754

 

7023 rows selected.

这里确实存在7023个子指针,第二种猜测被否定了,同时研发发过来的代码也不存在第一种情况。那么只能是第三种情况了,Oracle的Bug,Ok,那我们需要找到解决办法。

搜索Metalink,发现Bug:1210242,该Bug描述为:
On certain SQL statements cursors are not shared when TIMED_STATISTICS is enabled.

碰巧我这个数据库的TIMED_STATISTICS设置为True,修改TIMED_STATISTICS为False以后,观察v$sql,发现有效子指针很快下降到2个。

 SQL> select CHILD_NUMBER,OPTIMIZER_COST,OPTIMIZER_MODE,EXECUTIONS,ADDRESS from v$sql where hash_value=3974744754 and OPTIMIZER_MODE='CHOOSE';

CHILD_NUMBER OPTIMIZER_COST OPTIMIZER_ EXECUTIONS ADDRESS

------------ -------------- ---------- ---------- --------

           0         238167 CHOOSE          63943 C82AF1C8

           1         238300 CHOOSE          28915 C82AF1C8

 

第二天下降到只有一个.

SQL> select CHILD_NUMBER,OPTIMIZER_COST,OPTIMIZER_MODE,EXECUTIONS,ADDRESS from v$sql where hash_value=3974744754 and OPTIMIZER_MODE='CHOOSE';

 

CHILD_NUMBER OPTIMIZER_COST OPTIMIZER_ EXECUTIONS ADDRESS

------------ -------------- ---------- ---------- --------

           0         238702 CHOOSE         578124 C82AF1C8

 

短信群发从此正常.

对于这个问题,另外一个可选的方法是设置一个隐含参数:
_sqlexec_progression_cost = 0

这个参数的具体含义为:
SQL execution progression monitoring cost threshold
即:SQL执行进度监控成本阀值

这个参数根据COST来决定需要监控的SQL.执行进度监控会引入额外的函数调用和Row Sources这可能导致SQL的执行计划或成本发生改变,从而产生不同的子指针.
_sqlexec_progression_cost 的缺省值为1000,成本大于1000的所有SQL都会被监控
如果该参数设置为0,那么SQL的执行进度将不会被跟踪.

执行进度监控信息会被记录到V$SESSION_LONGOPS视图中,如果Time_statistics参数设置为False,那么这个信息就不会被记录.

所以,Time_statistics参数和_sqlexec_progression_cost是解决问题的两个途径。

通过查询我们也可以看到,在这个数据库中,OPTIMIZER_COST >1000的SQL主要有以下五个:

 SQL> select distinct(sql_text) from v$sql where OPTIMIZER_COST >1000;

 

SQL_TEXT

--------------------------------------------------------------------------------

insert into sms_detail_error (msgdate,addruser,msgid,areaid,reason,spnumber,msgt

ime,ifiddest,msqkey,servicecode,planid,feetype,feevalue,smcont,submittimes,submi

tdate,submittime,msgstate_resp,errorcode_resp,msgstate_rept,errorcode_rept) valu

es (:b0,:b1,:b2,:b3,:b4,:b5,:b6,:b7,:b8,:b9,:b10,:b11,:b12,:b13,:b14,:b15,:b16,:

b17,:b18,:b19,:b20)

 

insert into sms_detail_success (msgdate,addruser,msgid,areaid,spnumber,msgtime,i

fiddest,servicecode,planid,feetype,feevalue,smcont,submittimes,submitdate,submit

time,respdate,resptime,reptdate,repttime,msqkey) values (:b0,:b1,:b2,:b3,:b4,:b5

,:b6,:b7,:b8,:b9,:b10,:b11,:b12,:b13,:b14,:b15,:b16,:b17,:b18,:b19)

 

insert into sms_log (MSGDATE,MSGTIME,MSGID,MSGKIND,MSGTYPE,MSGTYPE_MOMT,

MSGLEN,MSGSTATUS,AREAID,IFIDDEST,IFIDSRC,ADDRSRC,ADDRDEST,ADDRFEE,ADDRUSER,

SERVICECODE,PLANID,FEETYPE,FEEVALUE,DATACODING,FLAGS,SMLEN,SMCONT) values (:b0,:b1,:b2,:b3,:b

4,:b5,:b6,:b7,:b8,:b9,:b10,:b11,:b12,:b13,:b14,:b15,:b16,:b17,:b18,:b19,:b20,:b2

1,:b22)

insert into sms_resprept_error (msgdate,areaid,addruser,msgid,submittimes,submit

date,submittime,msgid_gw,msgstate_resp,errorcode_resp,msgstate_rept,errorcode_re

pt,servicecode) values (:b0,:b1,:b2,:b3,:b4,:b5,:b6,:b7,:b8,:b9,:b10,:b11,:b12)

 

insert into sms_statusrept (reptdate,addruser,msgid_gw,repttime,statustype,msgid

_stus,msgstate,errorcode) values (:b0,:b1,:b2,:b3,:b4,:b5,:b6,:b7)

              

而这五个SQL中,在v$sqlarea中,有四个version_count都在10以上:

 SQL>  select sql_text,version_count from v$sqlarea where version_count>10;

 

SQL_TEXT                    VERSION_COUNT

--------------------------------------------------------------------------------

insert into sms_detail_error (msgdate,addruser,msgid,areaid,reason,spnumber,msgt

ime,ifiddest,msqkey,servicecode,planid,feetype,feevalue,smcont,submittimes,submi

tdate,submittime,msgstate_resp,errorcode_resp,msgstate_rept,errorcode_rept) valu

es (:b0,:b1,:b2,:b3,:b4,:b5,:b6,:b7,:b8,:b9,:b10,:b11,:b12,:b13,:b14,:b15,:b16,:

b17,:b18,:b19,:b20)

           42

insert into sms_log (MSGDATE,MSGTIME,MSGID,MSGKIND,MSGTYPE,MSGTYPE_MOMT,MSGLEN,

MSGSTATUS,AREAID,IFIDDEST,IFIDSRC,ADDRSRC,ADDRDEST,ADDRFEE,ADDRUSER,

SERVICECODE,PLANID,FEETYPE,FEEVALUE,DATACODING,FLAGS,SMLEN,SMCONT) values (:b0,:b1,:b2,:b3,:b4,:b5,:b6,:b7,:b8,:b9,:b10,:b11,:b12,:b13,:b14,:b15,:b16,:b17,:b18,:b19,:b20,:b21,:b22)

         7026

insert into sms_resprept_error (msgdate,areaid,addruser,msgid,submittimes,submit

date,submittime,msgid_gw,msgstate_resp,errorcode_resp,msgstate_rept,errorcode_re

pt,servicecode) values (:b0,:b1,:b2,:b3,:b4,:b5,:b6,:b7,:b8,:b9,:b10,:b11,:b12)

          301

insert into sms_statusrept (reptdate,addruser,msgid_gw,repttime,statustype,msgid

_stus,msgstate,errorcode) values (:b0,:b1,:b2,:b3,:b4,:b5,:b6,:b7)

           41

具体可以参考Metalink: Note 62143

至此这个关于shared pool的问题找到了原因,并得以及时解决。(End)

 

编者注:您可以通过以下链接参与关于本文的讨论,直接和作者对话:

http://www.eygle.com/index-internal.htm 

http://www.itpub.net/255975.html 关于shared pool的深入探讨()

http://www.itpub.net/257107.html  关于shared pool的深入探讨(

http://www.itpub.net/260363.html 关于shared pool的深入探讨(

http://www.itpub.net/261764.html 关于shared pool的深入探讨(

http://www.itpub.net/263051.html  关于shared pool的深入探讨()

http://www.itpub.net/265161.html 关于shared pool的深入探讨()

 

http://epub.itpub.net/9/1.htm

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值